Published December 8, 2022 | Version v1
Dataset Open

Supplementary material and supplementary data files for: Handling logical character dependency in phylogenetic inference: Extensive performance testing of assumptions and solutions using simulated and empirical data

  • 1. Harvard University
  • 2. University of Kentucky
  • 3. Smithsonian Tropical Research Institute
  • 4. Southeastern Louisiana University

Description

Logical character dependency is a major conceptual and methodological problem in phylogenetic inference of morphological datasets, as it violates the assumption of character independence that is common to all phylogenetic methods. It is more frequently observed in higher-level phylogenies or in datasets characterizing major evolutionary transitions, as these represent parts of the tree of life where (primary) anatomical characters either originate or disappear entirely. As a result, secondary traits related to these primary characters become "inapplicable" across all sampled taxa in which that character is absent. Various solutions have been explored over the last three decades to handle character dependency, such as alternative character coding schemes and, more recently, new algorithmic implementations. However, the accuracy of the proposed solutions, or the impact of character dependency across distinct optimality criteria, has never been directly tested using standard performance measures. Here, we utilize simple and complex simulated morphological datasets analyzed under different maximum parsimony optimization procedures and Bayesian inference to test the accuracy of various coding and algorithmic solutions to character dependency. This is complemented by empirical analyses using a recoded dataset on palaeognathid birds. We find that in small, simulated datasets, absent coding performs better than other popular coding strategies available (contingent and multistate), whereas in more complex simulations (larger datasets controlled for different tree structure and character distribution models) contingent coding is favored more frequently. Under contingent coding, a recently proposed weighting algorithm produces the most accurate results for maximum parsimony. 
However, Bayesian inference outperforms all parsimony-based solutions for handling character dependency because of fundamental differences in their optimization procedures, making it a simple alternative that has long been overlooked. Yet we show that the more primary characters bearing secondary (dependent) traits a dataset contains, the harder it is to estimate the true phylogenetic tree, regardless of the optimality criterion, owing to a considerable expansion of the tree parameter space.

Notes

All data and code needed to produce the simulated and empirical datasets, run the analyses, produce the output trees, and generate the tables, graphs, and plots.

See the README file for more details.

Funding provided by: Natural Sciences and Engineering Research Council of Canada
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000038
Award Number: Postdoctoral fellowship

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB‐2113425

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB‐2045842

Funding provided by: National Institutes of Health
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000002
Award Number: P20 GM103424‐20

Funding provided by: Smithsonian Institution
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000014
Award Number: Postdoctoral fellowship

Files

README_Simoes_etal_2022.txt

Files (340.1 MB)

Checksum                                Size
md5:70fa822fd4355dd11c6c08ab5d7c2ed9    28.7 MB
md5:01d33264d15c905cdb88fbe8cf9f4b0f    7.7 kB
md5:d400b00ed4fd57c1579592052163818c    7.8 MB
md5:330e51a598a349d8d4f25443cd83f956    3.2 MB
md5:2b24c83ea2841e269dc11be56333ca0a    13.6 MB
md5:faa5d6bcbaa446f98942d2ac4ffa18d4    14.6 MB
md5:0d329aa83615181df1aa1856e914a67d    272.3 MB
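The MD5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library, not part of the deposited material. The function names `md5_of_file` and `matches_listed_checksum` are hypothetical helpers introduced here, and the local path passed to them is whatever name the file was saved under, since the listing above does not pair checksums with file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large archives (the biggest file
    in this record is 272.3 MB) do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a listed value such as 'md5:70fa...'.

    The optional 'md5:' prefix is stripped and the comparison is
    case-insensitive.
    """
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file.zip", "md5:0d329aa83615181df1aa1856e914a67d")` would return `True` only if the local file (a placeholder name here) is byte-for-byte identical to the 272.3 MB file in this record. MD5 is adequate for detecting transfer corruption, though it is not a safeguard against deliberate tampering.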

Additional details

Related works

Is derived from
10.5281/zenodo.6422125 (DOI)
Is source of
10.5281/zenodo.7415683 (DOI)