Published September 15, 2018 | Version v1
Dataset Open

Data from: The unexpected depths of genome-skimming data: a case study examining Goodeniaceae floral symmetry genes

  • 1. St. John's University
  • 2. University of Florida
  • 3. Department of Parks and Wildlife
  • 4. United States Department of Agriculture
  • 5. Rhodes College

Description

Premise of the study: The use of genome skimming allows systematists to quickly generate large data sets, particularly of sequences in high abundance (e.g., plastomes); however, researchers may be overlooking data in low abundance that could be used for phylogenetic or evo-devo studies. Here, we present a bioinformatics approach that explores the low-abundance portion of genome-skimming next-generation sequencing libraries in the fan-flowered Goodeniaceae.

Methods: Twenty-four previously constructed Goodeniaceae genome-skimming Illumina libraries were examined for their utility in mining low-copy nuclear genes involved in floral symmetry, specifically the CYCLOIDEA (CYC)-like genes. De novo assemblies were generated using multiple assemblers, and BLAST searches were performed for CYC1, CYC2, and CYC3 genes.

Results: Overall, Trinity, SOAPdenovo-Trans, and SOAPdenovo implementing lower k-mer values uncovered the most data, although no assembler consistently outperformed the others. Using SOAPdenovo-Trans across all 24 data sets, we recovered four CYC-like gene groups (CYC1, CYC2, CYC3A, and CYC3B) from a majority of the species. Alignments of the fragments included the entire coding sequence as well as upstream and downstream regions.

Discussion: Genome-skimming data sets can provide a significant source of low-copy nuclear gene sequence data that may be used for multiple downstream applications.
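The BLAST step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that the CYC reference genes were used as queries against the assembled contigs, that hits were written in the standard 12-column BLAST tabular format (`-outfmt 6`), and that an e-value cutoff of 1e-5 is acceptable; none of these details are stated in the record. The function name `contigs_per_gene` and the example rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Standard 12-column layout of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def contigs_per_gene(handle, max_evalue=1e-5):
    """Group contig IDs (subjects) by the query gene they matched.

    max_evalue is an assumed cutoff, not one taken from the study.
    """
    hits = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(FIELDS, row))
        if float(rec["evalue"]) <= max_evalue:
            hits[rec["qseqid"]].add(rec["sseqid"])
    return {gene: sorted(ids) for gene, ids in hits.items()}


# Hypothetical example rows; these are not results from the data set.
example = (
    "CYC1\tcontig_12\t88.5\t240\t25\t2\t1\t240\t310\t549\t3e-60\t230\n"
    "CYC2\tcontig_7\t91.0\t300\t27\t0\t1\t300\t15\t314\t1e-90\t330\n"
    "CYC2\tcontig_40\t70.1\t60\t18\t0\t5\t64\t900\t959\t0.5\t28.1\n"
)
result = contigs_per_gene(io.StringIO(example))
print(result)
```

In this sketch the weak CYC2 hit to `contig_40` is dropped by the e-value filter, so each gene maps only to contigs with strong matches. A real analysis would pass an open file handle instead of the in-memory example.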

Notes

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB-1256963, DEB-1256946

Files

Files (217.2 kB)

Checksum                                 Size
md5:5f73043e211fe35814dc4d32a2e1bc13    85.5 kB
md5:1fd50752d7b0b2b3b1b38f30bb064e1b    29.0 kB
md5:e39226747ffc577f845f5b1e4f84b567    41.0 kB
md5:441ae454015b88f572ec9fe34f379863    61.8 kB
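The MD5 values listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module. The file names are not shown in this record, so any path passed to these functions is a placeholder, and the helper names `md5_of_file` and `matches_listing` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:5f73...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("downloaded_file", "md5:5f73043e211fe35814dc4d32a2e1bc13")` would return `True` only if the placeholder path `downloaded_file` holds the 85.5 kB file from the listing, unmodified.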

Additional details

Related works

Is cited by
10.3732/apps.1700042 (DOI)