Genome assembly and annotation files for Corylus americana accessions 'Rush' and 'Winkler'
Description
The native shrub American hazelnut (Corylus americana) is currently used in breeding programs that aim to develop commercially viable hazelnut varieties for the Upper Midwestern U.S. This species provides significant ecological benefits as a perennial crop and is well adapted to this region. Breeding cycles for perennial species are long and may benefit from predictive methods such as genomic selection to reduce cycle time and increase the efficiency of field trials.
High-quality reference genome assemblies are very useful for the implementation of marker-assisted selection and genomic prediction, and we therefore developed the first chromosome-scale reference assemblies for C. americana, using the accessions 'Rush' and 'Winkler'. Initial draft assemblies were created using HiFi PacBio reads and Arima Hi-C sequencing to assemble genomes into 11 pseudomolecules. We then used Oxford Nanopore reads and a high-density genetic map to perform error correction. N50 values were calculated to be 31.9 Mb and 35.3 Mb for 'Rush' and 'Winkler', respectively, while 90.2% (for 'Rush') and 97.1% (for 'Winkler') of the total genome was assembled into the 11 pseudomolecules. Gene prediction was performed using both RNAseq libraries and protein homology data. 'Rush' had a BUSCO score of 99.0 for its assembly and 99.0 for its annotation, while 'Winkler' had corresponding scores of 96.9 and 96.5, indicating extremely high-quality assemblies.
These two independent, de novo assemblies enable unbiased assessment of structural variation across the genome, as well as patterns of syntenic relationships within C. americana and the Corylus genus. These assemblies are also an important first step in providing a resource for using next-generation sequencing data in the improvement of C. americana. We demonstrate this utility by generating high-density SNP marker sets from genotyping-by-sequencing data for 1,343 C. americana, C. avellana, and C. americana x C. avellana hybrids, in order to assess population structure in natural and breeding populations. Finally, the transcriptomes of these assemblies, as well as those of several other recently published Corylus genomes, were used to perform phylogenetic analysis of sporophytic self-incompatibility (SSI) in hazelnut, providing further evidence of unique molecular pathways governing self-incompatibility in Corylus that are not exhibited in other well-studied SSI systems. We hope these assemblies will aid in the application of modern breeding methods to the development of commercially viable hazelnut varieties for the U.S. Upper Midwest.
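For readers unfamiliar with the N50 statistic quoted in the description, the following is a minimal sketch of how it is conventionally computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the assemblies and this is not necessarily the tool the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Hypothetical toy lengths (not from 'Rush' or 'Winkler').
toy_lengths = [40, 30, 20, 10]
print(n50(toy_lengths))  # prints 30
```

In practice the lengths would be read from an assembly FASTA file (for example from a `.fai` index), which is left out here to keep the sketch self-contained.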
Files
Camericanavar_rush_835_v1.1.annotation_info.txt
(2.5 GB)
MD5 checksum | Size |
---|---|
md5:422147597db12b3ce6d87cc545a4b5d8 | 354.6 MB |
md5:11084ebf0f9427e43ace91bf90ba9172 | 354.6 MB |
md5:4b94a7115b17226006e14dbbe7397107 | 114.4 MB |
md5:fd49ce888c9388a1c524c0ab3ffe8291 | 331.9 MB |
md5:277294b06c1c96f4103d15f8ca7cce88 | 331.9 MB |
md5:692b02bac31425f80044f85ec14ea027 | 107.7 MB |
md5:937a1e3bf1ef23d151c9cb8a5fc56d2d | 62.1 MB |
md5:011dd16bfec082e9691333133f1a3394 | 61.1 MB |
md5:e7c4fd47bd6385eb3382add8351ecd2e | 7.7 MB |
md5:a1e4e709e861b23c7a7f3127ea76a055 | 7.5 MB |
md5:ac28a10888f25a4a4672232f03d44aa8 | 49.7 MB |
md5:31ba4e735b435cb00fa42c73505160e9 | 19.9 MB |
md5:81fca8402356697e805ca0b097b22c35 | 15.2 MB |
md5:66681e45e4b608fee0b4b25b9dbadf6c | 10.7 MB |
md5:30dda5685a9126a6a001d79ffde4d322 | 35.6 MB |
md5:c1d598c82d022736504c4404b82d04c6 | 38.0 MB |
md5:ad5f71b818152c5f1868481c1d8c3918 | 45.8 MB |
md5:37f8c18f265f034bce76196801efd665 | 27.1 MB |
md5:f765581ec4985fe0ed8d11e265a53eae | 15.4 MB |
md5:0c0d15622b51a2d5a60d57e7847b3fde | 46.6 MB |
md5:249339eac4090ce55592133e57704ca8 | 19.0 MB |
md5:855a773a147b815e97e11cc7153919c3 | 13.8 MB |
md5:7b12520fdba3486a72184b99b2d0cba3 | 10.1 MB |
md5:6cd90a660c41e9b169008d40c6451129 | 35.2 MB |
md5:24d81db5c730367846f561fc7686ecce | 36.1 MB |
md5:5a1678b29b4a9fca9af8e6fba1f16841 | 43.4 MB |
md5:70769b212bb1e56667f9ab3bb9515e3e | 27.1 MB |
md5:9a2155cc6795ac4ef2b98677e6bf9a2e | 14.8 MB |
md5:1de3b84d32bc4c64ff7dc33e7d66d5d8 | 244.4 MB |
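The MD5 values listed for each file can be used to confirm that a download is intact. The sketch below shows one way to do this with Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` and the example path are hypothetical and are not part of this record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed checksum such as 'md5:0123...'.

    The optional 'md5:' prefix used in the file listing is stripped
    before comparison.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example usage (hypothetical local path; substitute the checksum
# listed for the file you actually downloaded):
# ok = matches_listing("downloads/some_file.gz", "md5:<listed checksum>")
```

A mismatch usually indicates a truncated or corrupted download, in which case fetching the file again is the simplest remedy.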
Additional details
Related works
- Is published in
- Preprint: 10.1101/2023.04.27.537858 (DOI)