Published May 8, 2023 | Version 0.5
Dataset Open

Genome assembly and annotation files for Corylus americana accessions 'Rush' and 'Winkler'

  • 1. University of Wisconsin-Madison

Contributors

Project leader:

  • 1. University of Wisconsin-Madison

Description

The native shrub American hazelnut (Corylus americana) is currently used in breeding programs that aim to develop commercially viable hazelnut varieties for the U.S. Upper Midwest. This species provides significant ecological benefits because it is a perennial crop and well adapted to this region. Breeding cycles for perennial species are long and may benefit from predictive methods such as genomic selection to reduce cycle time and increase the efficiency of field trials.

High-quality reference genome assemblies are very useful for the implementation of marker-assisted selection and genomic prediction, and we therefore developed the first chromosome-scale reference assemblies for C. americana, using the accessions 'Rush' and 'Winkler'. Initial draft assemblies were created using PacBio HiFi reads and Arima Hi-C sequencing to assemble the genomes into 11 pseudomolecules. We then used Oxford Nanopore reads and a high-density genetic map to perform error correction. N50 values were 31.9 Mb for 'Rush' and 35.3 Mb for 'Winkler', while 90.2% ('Rush') and 97.1% ('Winkler') of the total genome was assembled into the 11 pseudomolecules. Gene prediction was performed using both RNAseq libraries and protein homology data. 'Rush' had a BUSCO score of 99.0 for its assembly and 99.0 for its annotation, while 'Winkler' had corresponding scores of 96.9 and 96.5, indicating high-quality assemblies.
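As an illustration of the two contiguity statistics quoted above, the following minimal Python sketch computes an N50 and the fraction of total length held by the k longest sequences. It is not the authors' pipeline: the function names `n50` and `fraction_in_longest` are hypothetical, the input lengths are toy values, and treating the pseudomolecules as the k longest sequences in an assembly is an assumption made only for this example.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


def fraction_in_longest(lengths, k):
    """Return the fraction of total length contained in the k longest sequences.

    Assumption for illustration: the pseudomolecules are the k longest
    sequences in the assembly (k = 11 for the assemblies described here).
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("total length is zero")
    return sum(ordered[:k]) / total


# Toy lengths (arbitrary units), not values from the real assemblies.
toy_lengths = [40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
toy_fraction = fraction_in_longest(toy_lengths, 2)
```

In practice the lengths would be read from an assembly FASTA file or its index, with k set to 11.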

These two independent, de novo assemblies enable unbiased assessment of structural variation across the genome, as well as patterns of syntenic relationships within C. americana and the Corylus genus. These assemblies are also an important first step in providing a resource for using next-generation sequencing data in the improvement of C. americana. We demonstrate this utility by generating high-density SNP marker sets from genotyping-by-sequencing data for 1,343 C. americana, C. avellana, and C. americana x C. avellana hybrids, in order to assess population structure in natural and breeding populations. Finally, the transcriptomes of these assemblies, as well as those of several other recently published Corylus genomes, were used to perform phylogenetic analysis of sporophytic self-incompatibility (SSI) in hazelnut, providing further evidence of unique molecular pathways governing self-incompatibility in Corylus that are not exhibited in other well-studied SSI systems. We hope these assemblies will aid in the application of modern breeding methods to the development of commercially viable hazelnut varieties for the U.S. Upper Midwest.

Files

Camericanavar_rush_835_v1.1.annotation_info.txt

Files (2.5 GB)

Checksum                              Size
md5:422147597db12b3ce6d87cc545a4b5d8  354.6 MB
md5:11084ebf0f9427e43ace91bf90ba9172  354.6 MB
md5:4b94a7115b17226006e14dbbe7397107  114.4 MB
md5:fd49ce888c9388a1c524c0ab3ffe8291  331.9 MB
md5:277294b06c1c96f4103d15f8ca7cce88  331.9 MB
md5:692b02bac31425f80044f85ec14ea027  107.7 MB
md5:937a1e3bf1ef23d151c9cb8a5fc56d2d  62.1 MB
md5:011dd16bfec082e9691333133f1a3394  61.1 MB
md5:e7c4fd47bd6385eb3382add8351ecd2e  7.7 MB
md5:a1e4e709e861b23c7a7f3127ea76a055  7.5 MB
md5:ac28a10888f25a4a4672232f03d44aa8  49.7 MB
md5:31ba4e735b435cb00fa42c73505160e9  19.9 MB
md5:81fca8402356697e805ca0b097b22c35  15.2 MB
md5:66681e45e4b608fee0b4b25b9dbadf6c  10.7 MB
md5:30dda5685a9126a6a001d79ffde4d322  35.6 MB
md5:c1d598c82d022736504c4404b82d04c6  38.0 MB
md5:ad5f71b818152c5f1868481c1d8c3918  45.8 MB
md5:37f8c18f265f034bce76196801efd665  27.1 MB
md5:f765581ec4985fe0ed8d11e265a53eae  15.4 MB
md5:0c0d15622b51a2d5a60d57e7847b3fde  46.6 MB
md5:249339eac4090ce55592133e57704ca8  19.0 MB
md5:855a773a147b815e97e11cc7153919c3  13.8 MB
md5:7b12520fdba3486a72184b99b2d0cba3  10.1 MB
md5:6cd90a660c41e9b169008d40c6451129  35.2 MB
md5:24d81db5c730367846f561fc7686ecce  36.1 MB
md5:5a1678b29b4a9fca9af8e6fba1f16841  43.4 MB
md5:70769b212bb1e56667f9ab3bb9515e3e  27.1 MB
md5:9a2155cc6795ac4ef2b98677e6bf9a2e  14.8 MB
md5:1de3b84d32bc4c64ff7dc33e7d66d5d8  244.4 MB
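
The md5 values listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal Python sketch of such a check, using only the standard library. The function names `md5_of` and `matches_record` are hypothetical, and because the listing above does not pair checksums with file names, the local path and the checksum to compare against must be supplied by the user.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks
    so that large assembly files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected):
    """Compare a local file against a checksum copied from the record.

    The expected value may be given with or without the 'md5:' prefix.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of(path) == expected


# Example call (hypothetical local path; substitute the checksum that the
# record shows for the file actually downloaded):
# matches_record("downloaded_file", "md5:...")
```

A result of `False` suggests an incomplete or corrupted download, in which case the file should be fetched again.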

Additional details

Related works

Is published in
Preprint: 10.1101/2023.04.27.537858 (DOI)