Published May 15, 2023 | Version v1
Resource type: Other | Access: Open

Data from: Redefining possible: Combining phylogenomic and supersparse data in frogs

  • 1. California Academy of Sciences
  • 2. Natural History Museum
  • 3. University of Florida
  • 4. Oklahoma State University
  • 5. Louisiana State University
  • 6. University of Arizona

Description

The data available for reconstructing molecular phylogenies have become wildly disparate. Phylogenomic studies can generate data for thousands of genetic markers for dozens of species, but for hundreds of other taxa, data may be available from only a few genes. Can these two types of data be integrated to combine the advantages of both, addressing the relationships of hundreds of species with thousands of genes? Here we show that this is possible, using data from frogs. We generated a phylogenomic dataset for 138 ingroup species and 3,784 nuclear markers (ultraconserved elements, UCEs), including new UCE data from 70 species. We also assembled a supermatrix dataset, including data from 97% of frog genera (441 total), with 1–307 genes per taxon. We then produced a combined phylogenomic-supermatrix dataset (a "gigamatrix") containing 441 ingroup taxa and 4,091 markers, but with 86% missing data overall. Likelihood analysis of the gigamatrix yielded a generally well-supported tree among families, largely consistent with trees from the phylogenomic data alone. All terminal taxa were placed in the expected families, even though 42.5% of these taxa each had >99.5% missing data, and 70.2% had >90% missing data. Our results show that missing data need not be an impediment to successfully combining very large phylogenomic and supermatrix datasets, and they open the door to new studies that simultaneously maximize sampling of genes and taxa.
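The missing-data percentages quoted above (86% overall, and the shares of taxa with >90% or >99.5% missing data) are the kind of summary that can be computed directly from a concatenated alignment. The following is a minimal sketch of that calculation, not the authors' pipeline. It assumes the alignment is already loaded as a dictionary of equal-length sequences, and that gaps (`-`), `?`, and `N` all count as missing; the function names and the toy alignment are hypothetical.

```python
# Sketch only: summarize missing data in a concatenated alignment.
# Assumption: "-", "?" and "N" are all treated as missing cells.
MISSING_CHARS = frozenset("-?N")


def missing_count(seq):
    """Number of cells in one aligned sequence that are missing."""
    return sum(1 for ch in seq.upper() if ch in MISSING_CHARS)


def per_taxon_missing(alignment):
    """Map each taxon name to its fraction of missing cells."""
    return {
        name: (missing_count(seq) / len(seq) if seq else 1.0)
        for name, seq in alignment.items()
    }


def overall_missing(alignment):
    """Fraction of missing cells across the whole matrix."""
    total = sum(len(seq) for seq in alignment.values())
    missing = sum(missing_count(seq) for seq in alignment.values())
    return missing / total if total else 1.0


def share_above(fractions, threshold):
    """Share of taxa whose missing fraction is strictly above threshold."""
    if not fractions:
        return 0.0
    hits = sum(1 for frac in fractions.values() if frac > threshold)
    return hits / len(fractions)


# Hypothetical toy alignment: 4 taxa, 10 sites each.
toy = {
    "taxon_A": "ACGTACGTAC",  # 0 of 10 missing
    "taxon_B": "ACGT------",  # 6 of 10 missing
    "taxon_C": "A?????????",  # 9 of 10 missing
    "taxon_D": "----------",  # 10 of 10 missing
}

fractions = per_taxon_missing(toy)
print(overall_missing(toy))          # 25 of 40 cells missing
print(share_above(fractions, 0.5))   # taxa B, C and D exceed 50%
```

For a real matrix, the same functions could be applied after reading the alignment with any sequence parser, with thresholds of 0.90 and 0.995 to mirror the figures in the description.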

Notes

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100008982
Award Number: DEB-1655690

Files

Supplementary_File_S1.pdf

Files (6.2 MB)

Checksum (MD5) | Size
md5:368f80d31284d39ab20b5fd50dd2ae3e | 2.7 MB
md5:4d3d598c8d832745ac5fce42f3864cd3 | 30.6 kB
md5:7e5566cbc63e985f606bf3991f38f4f3 | 27.7 kB
md5:db9b914f0e0b03d72b94988158642ecc | 25.9 kB
md5:43f6f7f978eb6372af283f1404168351 | 7.0 kB
md5:96c1990148fe21a45eff3df58d594998 | 5.4 kB
md5:e7ad4f5b9af8bea8bee0475441678632 | 15.8 kB
md5:45ebd03c650b6a445451d327acc8d9ea | 16.8 kB
md5:eb4c9c2d9f12f62341fbd4a6e1e42af3 | 15.7 kB
md5:13abd84edbfb27ccb9a46be510a435e8 | 15.9 kB
md5:4e2ad1be4a799c6c957e0def178cced6 | 14.9 kB
md5:5bacc285a9c97e7715ca959453998da7 | 104.0 kB
md5:6671e7335e6e8797cc338bef1634e20d | 100.7 kB
md5:6a21c4bdd3843a786d103722f82b52c4 | 3.0 MB
md5:a934133b92c7614763709cc8f6e3d3a1 | 29.9 kB
md5:2b8d71faec058f6fbac7ca331eb482a8 | 30.1 kB
md5:efeefd9e73ec22ec897f5f905989b668 | 29.6 kB
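
Each file in the listing above carries an MD5 checksum, which can be used to confirm that a download is intact. A minimal sketch of that check follows, using only Python's standard `hashlib` module. The helper names are hypothetical, and the listing does not show which file name belongs to which checksum, so pairing a local path with its checksum is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listing(local_path, "md5:368f80d31284d39ab20b5fd50dd2ae3e")` returns `True` only if the local copy is byte-for-byte identical to the file that produced that checksum. MD5 is adequate for detecting accidental corruption, but it is not a safeguard against deliberate tampering.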

Additional details

Related works

Is cited by
10.1093/molbev/msad109 (DOI)
Is derived from
10.5061/dryad.f7m0cfz0n (DOI)