Other Open Access
Stuart, Katarina;
Edwards, Richard;
Cheng, Yuanyuan;
Warren, Wes;
Burt, Dave;
Sherwin, William;
Hofmeister, Natalie;
Werner, Scott;
Ball, Gregory;
Bateson, Melissa;
Brandley, Matthew;
Buchanan, Katherine;
Cassey, Phillip;
Clayton, David;
De Meyer, Tim;
Meddle, Simone;
Rollins, Lee
The European starling, Sturnus vulgaris, is an ecologically significant, globally invasive avian species that is also suffering from a major decline in its native range. Here, we present the genome assembly and long-read transcriptome of an Australian-sourced European starling (S. vulgaris vAU), and a second North American genome (S. vulgaris vNA), as complementary reference genomes for population genetic and evolutionary characterisation. S. vulgaris vAU combined 10x Genomics linked-reads, low-coverage Nanopore sequencing, and PacBio Iso-Seq full-length transcript scaffolding to generate a 1050 Mb assembly on 1,628 scaffolds (72.5 Mb scaffold N50). Species-specific transcript mapping and gene annotation revealed high structural and functional completeness (94.6% BUSCO completeness). Further scaffolding against the high-quality zebra finch (Taeniopygia guttata) genome assigned 98.6% of the assembly to 32 putative nuclear chromosome scaffolds. Rapid, recent advances in sequencing technologies and bioinformatics software have highlighted the need for evidence-based assessment of assembly decisions on a case-by-case basis. Using S. vulgaris vAU, we demonstrate how the multifunctional use of PacBio Iso-Seq transcript data and complementary homology-based annotation of sequential assembly steps (assessed using a new tool, SAAGA) can be used to assess, inform, and validate assembly workflow decisions. We also highlight some counter-intuitive behaviour in traditional BUSCO metrics, and present BUSCOMP, a complementary tool for assembly comparison designed to be robust to differences in assembly size and base-calling quality. Finally, we present a second starling assembly, S. vulgaris vNA, to facilitate comparative analysis and global genomic research on this ecologically important species.
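For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of one common definition: the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the starling assemblies or from the authors' tools.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and return the
    length at which the cumulative sum first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not real scaffold sizes):
# total = 300, so the half-way point of 150 is reached at the second scaffold.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70
```

Because N50 depends only on the length distribution, it says nothing about base-level accuracy or gene content, which is why it is reported alongside completeness measures such as BUSCO.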
Funding provided by: Australian Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000923
Award Number: LP160100610
Funding provided by: Australian Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000923
Award Number: LP18010072
Funding provided by: Human Sciences Frontier Programme*
Award Number: RGP0030/2015
Funding provided by: Roslin Institute Strategic Grant*
Award Number: BB/P013759/1
Funding provided by: UNSW Scientia Fellowship*
Name | MD5 | Size |
---|---|---|
Sv3.10_Supplementary_File_1_starling10xV3.N3L20ID0U.full.html | dbde3f08ab11433045e4231cb868ce45 | 65.7 MB |
Sv3.10_Supplementary_File_2_starling10xV5.N3L20ID0U.full.html | 174807d111a46d6820e413fbcdeb7e67 | 70.3 MB |
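The MD5 checksums listed for the two supplementary files can be used to confirm that a download completed without corruption. The sketch below assumes the files were saved under their listed names in the current working directory; the helper name `md5_of_file` and the chunk size are illustrative choices, not part of the record.

```python
import hashlib
import os

# File names and checksums as listed in this record.
EXPECTED_MD5 = {
    "Sv3.10_Supplementary_File_1_starling10xV3.N3L20ID0U.full.html":
        "dbde3f08ab11433045e4231cb868ce45",
    "Sv3.10_Supplementary_File_2_starling10xV5.N3L20ID0U.full.html":
        "174807d111a46d6820e413fbcdeb7e67",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    for name, expected in EXPECTED_MD5.items():
        if not os.path.exists(name):
            # Assumption: files live in the current directory under these names.
            print(f"{name}: not found in current directory")
            continue
        status = "OK" if md5_of_file(name) == expected else "MISMATCH"
        print(f"{name}: {status}")
```

MD5 is used here only as an integrity check against accidental corruption, matching the checksum type the record provides.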