Other Open Access

Transcript- and annotation-guided genome assembly of the European starling

Stuart, Katarina; Edwards, Richard; Cheng, Yuanyuan; Warren, Wes; Burt, Dave; Sherwin, William; Hofmeister, Natalie; Werner, Scott; Ball, Gregory; Bateson, Melissa; Brandley, Matthew; Buchanan, Katherine; Cassey, Phillip; Clayton, David; De Meyer, Tim; Meddle, Simone; Rollins, Lee

The European starling, Sturnus vulgaris, is an ecologically significant, globally invasive avian species that is also suffering from a major decline in its native range. Here, we present the genome assembly and long-read transcriptome of an Australian-sourced European starling (S. vulgaris vAU), and a second North American genome (S. vulgaris vNA), as complementary reference genomes for population genetic and evolutionary characterisation. S. vulgaris vAU combined 10x Genomics linked-reads, low-coverage Nanopore sequencing, and PacBio Iso-Seq full-length transcript scaffolding to generate a 1050 Mb assembly on 1,628 scaffolds (72.5 Mb scaffold N50). Species-specific transcript mapping and gene annotation revealed high structural and functional completeness (94.6% BUSCO completeness). Further scaffolding against the high-quality zebra finch (Taeniopygia guttata) genome assigned 98.6% of the assembly to 32 putative nuclear chromosome scaffolds. Rapid, recent advances in sequencing technologies and bioinformatics software have highlighted the need for evidence-based assessment of assembly decisions on a case-by-case basis. Using S. vulgaris vAU, we demonstrate how the multifunctional use of PacBio Iso-Seq transcript data and complementary homology-based annotation of sequential assembly steps (assessed using a new tool, SAAGA) can be used to assess, inform, and validate assembly workflow decisions. We also highlight some counter-intuitive behaviour in traditional BUSCO metrics, and present BUSCOMP, a complementary tool for assembly comparison designed to be robust to differences in assembly size and base-calling quality. Finally, we present a second starling assembly, S. vulgaris vNA, to facilitate comparative analysis and global genomic research on this ecologically important species.
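The scaffold N50 quoted above is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `scaffold_n50` and the toy lengths are assumptions made for illustration only.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    Hypothetical helper for illustration; not part of SAAGA or BUSCOMP.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold where the cumulative length
        # reaches half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy lengths (arbitrary units), not taken from the starling assembly.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = scaffold_n50(toy_lengths)
print(toy_n50)
```

For a real assembly, the lengths would be read from the scaffold FASTA file rather than typed in by hand.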

Funding provided by: Australian Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000923
Award Number: LP160100610

Funding provided by: Australian Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000923
Award Number: LP18010072

Funding provided by: Human Frontier Science Program*
Crossref Funder Registry ID:
Award Number: RGP0030/2015

Funding provided by: Roslin Institute Strategic Grant*
Crossref Funder Registry ID:
Award Number: BB/P013759/1

Funding provided by: UNSW Scientia Fellowship*
Crossref Funder Registry ID:
Award Number:

Files (135.9 MB)
Size
65.7 MB
70.3 MB