Published January 2, 2021 | Version v1

A two-tier bioinformatic pipeline to develop probes for target capture of nuclear loci with applications in Melastomataceae

  • 1. University of Florida
  • 2. Mississippi State University
  • 3. Universidade Federal do Rio Grande do Sul
  • 4. New York Botanical Garden
  • 5. Florida Museum of Natural History

Description

Premise of the study: Putatively single-copy nuclear (SCN) loci, identified using genomic resources of closely related species, are ideal for phylogenomic inference. However, suitable genomic resources are not available for many clades, including Melastomataceae. We introduce a versatile approach to identify SCN loci for clades with few genomic resources and use it to develop probes for target enrichment in the distantly related Memecylon and Tibouchina (Melastomataceae).

Methods: We present a two-tiered pipeline. First, we identified putatively SCN loci using MarkerMiner and transcriptomes from distantly related species in Melastomataceae. Published loci and genes of functional significance were added (384 total loci). Second, using HybPiper, we retrieved 689 homologous template sequences for these loci using genome-skimming data from within the focal clades.

Results: We sequenced 193 loci from both Memecylon and Tibouchina, with probes designed from 56 template sequences successfully targeting sequences in both clades. Probes designed from genome-skimming data within a focal clade were more successful than probes designed from other sources.

Discussion: Our pipeline successfully identified and targeted SCN loci in Memecylon and Tibouchina, enabling phylogenomic studies in both clades and potentially across Melastomataceae. This pipeline could easily be applied to other clades with few genomic resources.

Notes

This dataset includes the template sequences recovered by the second tier of the pipeline for the loci identified by the first tier, as well as the probe sequences used for target enrichment. Custom scripts for analyzing these data are available on GitHub at https://github.com/jjantzen/Probe_design. Cleaned reads are deposited in the NCBI SRA (PRJNA592250, PRJNA573947, PRJNA576018).
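As a rough illustration of how the template sequences might be summarized per locus, the sketch below tallies sequence lengths from FASTA-formatted text. It assumes the template files are plain FASTA; the function name `fasta_lengths` and the example headers are hypothetical and are not taken from the dataset or from the authors' scripts.

```python
import io


def fasta_lengths(handle):
    """Return {sequence name: length} for a FASTA-formatted text stream.

    Assumption: records are plain FASTA, with the sequence name being the
    first whitespace-delimited token after '>'.
    """
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequences may wrap over several lines
    return lengths


# Hypothetical two-record example; real headers in the dataset may differ.
example = io.StringIO(">locus1_templateA\nACGTAC\nGGT\n>locus2_templateB\nTTGA\n")
lengths = fasta_lengths(example)
```

To run this on a downloaded file, pass an open text handle in place of the in-memory example, for instance `fasta_lengths(open(path))` with `path` pointing at a local copy.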

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB-1343612

Funding provided by: American Society of Plant Taxonomists
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010402
Award Number:

Funding provided by: Botanical Society of America
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010426
Award Number:

Funding provided by: Society for the Study of Evolution
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100012454
Award Number:

Funding provided by: Society of Systematic Biologists
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100006069
Award Number:

Funding provided by: University of Florida
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100007698
Award Number:

Funding provided by: Florida Museum of Natural History
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100012310
Award Number:

Files

Locus_lengths.csv

Files (3.2 MB)

md5:3035c522375f8cfdf866ff024a56936c (10.6 kB)
md5:51c847296e21de1ac186c09eecbde464 (8.3 kB)
md5:986571e7bc561b05d8707d10cbf0a503 (2.9 MB)
md5:9b45c1cae6df8ed5a5fc9e1f7d2b6a19 (1.2 kB)
md5:bdeca3c24c87d2f112ab2e167350dc2f (6.8 kB)
md5:3b0eefa082e030cbd388930018206ef1 (27.7 kB)
md5:d3b8a0a2a938e6c28c5321cda588325a (260.2 kB)
