vignettes/B_multiphylo_treedata.table.Rmd
treedata.table further allows the matching of multiple phylogenies (multiPhylo) against a single dataset (data.frame). Below, we modify the anole dataset to explain the extended functionality of treedata.table with multiPhylo objects. Note that all the trees in the multiPhylo object must have exactly the same taxa.
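The same-taxa requirement can be checked before any matching is attempted. The following is only a minimal sketch: same_taxa is a hypothetical helper (it is not part of treedata.table), and the four-tip trees are toy examples built with ape::read.tree.

```r
library(ape)

# Hypothetical helper (not a treedata.table function): returns TRUE when
# every tree in a multiPhylo has the same set of tip labels as the first.
same_taxa <- function(trees) {
  ref <- sort(trees[[1]]$tip.label)
  for (i in seq_along(trees)) {
    if (!identical(sort(trees[[i]]$tip.label), ref)) return(FALSE)
  }
  TRUE
}

# Toy trees: same four taxa, different topologies
t1 <- read.tree(text = "((a,b),(c,d));")
t2 <- read.tree(text = "((a,c),(b,d));")
# Toy tree with a different taxon set (e replaces d)
t3 <- read.tree(text = "((a,b),(c,e));")

same_taxa(c(t1, t2))  # TRUE
same_taxa(c(t1, t3))  # FALSE
```

Combining phylo objects with c() returns a multiPhylo, so the helper can be applied to the same kind of object that is later passed to as.treedata.table.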
We first load the sample dataset.
## Thank you for using the {treedata.table} R package!
##
## 🙂Happy coding!!🙂
# Load example data
data(anolis)
# Create a treedata.table object with as.treedata.table
td <- as.treedata.table(tree = anolis$phy, data = anolis$dat)
## Tip labels detected in column: X
## Phylo object detected
## All tips from original tree/dataset were preserved
We then create a multiPhylo object including only two phylo objects. Users can provide any number of phylo objects within the multiPhylo object. However, trees can only differ in their topology. In other words, all trees must have the same tip labels. We also note that the provided multiPhylo and data.frame should overlap at least partially in their taxa.
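The chunk that builds the multiPhylo object is not shown here. One way to do it is sketched below, under the assumption that the two trees are simply two copies of anolis$phy; any set of trees sharing the same tip labels could be used instead.

```r
library(treedata.table)

# Load example data
data(anolis)

# Assumption for illustration: the multiPhylo holds two copies of the
# same anole tree, so both trees trivially share their tip labels.
trees <- list(anolis$phy, anolis$phy)
class(trees) <- "multiPhylo"
trees
```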
## 2 phylogenetic trees
Now, we create our treedata.table object by combining the trait data (data.frame) and the newly generated multiPhylo object. Note that there is only a single character matrix.
td <- as.treedata.table(tree=trees, data=anolis$dat)
## Tip labels detected in column: X
## Multiphylo object detected
## All tips from original tree/dataset were preserved
The resulting td object now returns a multiPhylo object under phy. This object contains only the taxa shared between the multiPhylo object and the input dataset.
class(td$phy);td$phy
## [1] "multiPhylo"
## 2 phylogenetic trees
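A quick way to confirm that every tree in td$phy was matched to the data is to compare tip labels against the tip.label column of td$dat. This is a sketch only, assuming for illustration that trees holds two copies of anolis$phy; tips_match is a name introduced here.

```r
library(treedata.table)

data(anolis)

# Assumption for illustration: two copies of the same anole tree
trees <- list(anolis$phy, anolis$phy)
class(trees) <- "multiPhylo"
td <- as.treedata.table(tree = trees, data = anolis$dat)

# For each tree, check that its tips are exactly the taxa kept in td$dat
tips_match <- vapply(
  seq_along(td$phy),
  function(i) setequal(td$phy[[i]]$tip.label, td$dat$tip.label),
  logical(1)
)
tips_match
```

If any element were FALSE, the corresponding tree would contain taxa that are absent from the data, or the reverse.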
Please note that all the basic treedata.table functions highlighted above for phylo objects are still functional when treedata.table objects include multiPhylo objects.
td[, head(.SD, 1), by = "ecomorph"]
## $phy
## 2 phylogenetic trees
##
## $dat
## ecomorph tip.label SVL PCI_limbs PCII_head PCIII_padwidth_vs_tail
## 1: TG ahli 4.039125 -3.2482860 0.3722519 -1.0422187
## 2: GB ophiolepis 3.637962 0.7915117 1.4585760 -1.3152005
## 3: CG garmani 4.769473 -0.7735264 0.9371249 0.2594994
## 4: TC opalinus 3.838376 -1.7794371 -0.3245381 1.5569939
## 5: TW valencienni 4.321524 2.9424139 -0.8846007 1.8543308
## 6: U reconditus 4.482607 -2.7270416 -0.2104066 -2.3534242
## PCIV_lamella_num awesomeness hostility attitude island
## 1: -2.4147423 -0.24165170 -0.17347691 0.64437708 Cuba
## 2: -2.2377514 0.35441877 0.05366142 -0.09389530 Cuba
## 3: 0.1051149 0.16779131 0.67675600 -0.69460080 Puerto Rico
## 4: 0.9366501 1.48302162 -0.90826653 0.72613483 Jamaica
## 5: 0.1288233 -0.08837008 0.46528679 -0.56754896 Jamaica
## 6: -0.7992905 0.26096544 -0.27169792 0.01367143 Jamaica
Functions can also be run on any treedata.table object with multiPhylo data. For instance, the following line fits a Brownian motion (BM) model of continuous trait evolution to SVL on each of the trees we provided in the multiPhylo object.
tdt(td, geiger::fitContinuous(phy, extractVector(td, 'SVL'), model="BM", ncores=1))
## Multiphylo object detected. Expect a list of function outputs
## [[1]]
## GEIGER-fitted comparative model of continuous data
## fitted 'BM' model parameters:
## sigsq = 0.136160
## z0 = 4.065918
##
## model summary:
## log-likelihood = -4.700404
## AIC = 13.400807
## AICc = 13.524519
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## number of iterations with same best fit = 100
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
##
## [[2]]
## GEIGER-fitted comparative model of continuous data
## fitted 'BM' model parameters:
## sigsq = 0.136160
## z0 = 4.065918
##
## model summary:
## log-likelihood = -4.700404
## AIC = 13.400807
## AICc = 13.524519
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## number of iterations with same best fit = 100
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
The output is an object of class list, with each element holding the output of the function for the corresponding tree in the provided multiPhylo object.
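Because the result is a plain list, per-tree estimates can be collected with the usual apply-style functions. The sketch below assumes, for illustration, that trees holds two copies of anolis$phy, and it reads the rate estimate from the opt element that geiger lists in its object summary; fits and sigsq are names introduced here.

```r
library(treedata.table)

data(anolis)

# Assumption for illustration: two copies of the same anole tree
trees <- list(anolis$phy, anolis$phy)
class(trees) <- "multiPhylo"
td <- as.treedata.table(tree = trees, data = anolis$dat)

# One fitted BM model per tree, returned as a list
fits <- tdt(td, geiger::fitContinuous(phy, extractVector(td, 'SVL'),
                                      model = "BM", ncores = 1))

# Collect the estimated BM rate (sigsq) from each fit
sigsq <- vapply(fits, function(f) f$opt$sigsq, numeric(1))
sigsq
```

The same pattern works for other entries of opt, such as the log-likelihood or AIC, which makes it easy to summarize how sensitive an estimate is to the choice of tree.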