Published 2024 | Version v2
Dataset Open

Chromosome-level genome assembly of Triticum turgidum var 'Kronos'

  • 1. University of California, Berkeley

Description

 

This data is made available under the Toronto Agreement. 

All of the data listed here is available under the prepublication data sharing principle of the Toronto Agreement (1). By using this data, you agree to:

  • respect the rights of the data producers and contributors to analyze and publish the first global analyses and certain other reserved analyses of this data set in a peer-reviewed publication.
  • not redistribute, release, or otherwise provide access to the data to anyone outside of the group until the data has been published and submitted to the public data repositories.
  • contact the authors to discuss any plans to publish data or analyses that utilize this data to avoid the overlap of any planned analyses.
  • fully cite the prepublication data along with any applicable versioning details.
  • understand that this data as accessed is precompetitive and is not patentable in its present state.

This agreement does not expire with time; it ends only upon publication of the first global analysis by the data producers and contributors.
(1) Toronto International Data Release Workshop Authors. Prepublication data sharing. Nature 461, 168–170 (2009). https://doi.org/10.1038/461168a

 

  • If you have questions about the use of this dataset, please contact Ksenia Krasileva: kseniak [at] berkeley.edu

 

Summary of the datasets

We produced 526 Gbp of high-fidelity (HiFi) reads for Kronos. Because Kronos typically self-pollinates in the field and its residual heterozygosity is low, these reads were assembled with hifiasm v0.19.5-r587 (-l0) to produce a haplotype-collapsed assembly. Primary and associated contigs were concatenated into a single file. These contigs are in the files with the prefix 'Kronos.contigs'.
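The concatenation step can be illustrated with a minimal sketch. It assumes the assembler output is in GFA format, where each segment line starts with `S` followed by the contig name and sequence. The contig names and toy sequences below are hypothetical, and this is not necessarily how the authors produced the 'Kronos.contigs' files.

```python
from typing import Iterable, Iterator, Tuple


def gfa_segments(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for every segment ('S') line of a GFA file."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # GFA segment lines look like: S <name> <sequence> [optional tags]
        if len(fields) >= 3 and fields[0] == "S":
            yield fields[1], fields[2]


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as FASTA text with wrapped sequence lines."""
    out = []
    for name, seq in records:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return ("\n".join(out) + "\n") if out else ""


# Toy stand-ins for the primary and associated contig graphs (hypothetical names).
primary = [
    "S\tptg000001l\tACGTACGT\tLN:i:8\n",
    "L\tptg000001l\t+\tptg000002l\t+\t0M\n",  # link lines are skipped
]
associated = ["S\tatg000001l\tGGCC\tLN:i:4\n"]

# Primary contigs first, then associated contigs, in one FASTA string.
combined = to_fasta(list(gfa_segments(primary)) + list(gfa_segments(associated)))
print(combined, end="")
```

For genome-scale files the records would be streamed to disk rather than held in a single string.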

The concatenated primary and associated contigs were then scaffolded with chromosome conformation capture (Hi-C) sequencing data using YaHS v1.2a.2. The 14 largest resulting scaffolds were each greater than 600 Mbp in size and represent the 14 chromosomes (7 x AB). These scaffolds were renamed based on their similarity to the bread wheat reference genome from the IWGSC. After plasmid genomes were separated, the remaining contigs and scaffolds, all smaller than 4 Mbp, were concatenated into a single sequence named 'Un' (for unplaced). These sequences can be found in the files with the prefix 'Kronos.collapsed'.
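The split between chromosome-scale scaffolds and the unplaced 'Un' sequence can be sketched as follows. The 600 Mbp threshold comes from the description above. The spacer of N bases between unplaced pieces, the scaffold names, and the function name `split_scaffolds` are assumptions made for illustration, since the record does not state how the pieces were joined.

```python
from typing import Dict, Tuple


def split_scaffolds(
    scaffolds: Dict[str, str],
    chrom_min_bp: int = 600_000_000,
    spacer: str = "N" * 100,  # assumed gap between unplaced pieces
) -> Tuple[Dict[str, str], str]:
    """Separate chromosome-scale scaffolds from the rest.

    Returns the chromosome-scale scaffolds and one concatenated
    sequence built from everything at or below the size threshold.
    """
    chromosomes = {}
    leftovers = []
    for name, seq in scaffolds.items():
        if len(seq) > chrom_min_bp:
            chromosomes[name] = seq
        else:
            leftovers.append(seq)
    return chromosomes, spacer.join(leftovers)


# Toy example with a tiny threshold so it runs instantly.
toy = {
    "scaffold_1": "A" * 12,
    "scaffold_2": "C" * 10,
    "scaffold_3": "G" * 3,
    "scaffold_4": "T" * 2,
}
chromosomes, un = split_scaffolds(toy, chrom_min_bp=5, spacer="NN")
print(sorted(chromosomes))
print(un)
```

This sketch does not model the separate removal of organellar or other non-nuclear sequences that the description mentions.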

 

Updates in Zenodo v2

In genome version 1.1, the following chromosomes were reverse-complemented: 1B, 2A, 2B, 3A, 3B, 5A, 6A and 6B. This adjustment keeps the orientation of the chromosome arms consistent with that of the bread wheat reference genome.
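Anyone holding coordinates from before this change will need to flip them on the affected chromosomes. Below is a minimal sketch, assuming 1-based inclusive coordinates (as in GFF3) and assuming that sequence names match the chromosome labels listed above; the actual FASTA headers may differ, and the function names are illustrative.

```python
from typing import Tuple

# Chromosomes listed as reverse-complemented in genome version 1.1.
REVERSED_IN_V1_1 = {"1B", "2A", "2B", "3A", "3B", "5A", "6A", "6B"}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def lift_interval(
    chrom: str, start: int, end: int, strand: str, chrom_length: int
) -> Tuple[int, int, str]:
    """Map a 1-based inclusive interval onto the flipped orientation.

    Intervals on chromosomes that were not flipped are returned unchanged.
    """
    if chrom not in REVERSED_IN_V1_1:
        return start, end, strand
    new_start = chrom_length - end + 1
    new_end = chrom_length - start + 1
    new_strand = {"+": "-", "-": "+"}.get(strand, strand)
    return new_start, new_end, new_strand


print(reverse_complement("AACGT"))
print(lift_interval("1B", 1, 3, "+", 10))
print(lift_interval("1A", 1, 3, "+", 10))
```

The subsequence extracted at the lifted interval equals the reverse complement of the original subsequence, which is a convenient sanity check on real data.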

The gene models were initially generated using BRAKER, GINGER, and Funannotate, all of which used protein evidence, transcript evidence generated from paired-end RNA-seq data, and independently trained ab initio predictors. Consensus annotations were derived using EVidenceModeler by merging the gene models from the three predictions, transcripts assembled with PASA, and protein sequences from closely related species aligned with miniprot. PASA was also employed to update alternative transcripts and untranslated regions.

The high-confidence set (Kronos.v1.0.high) comprises 69,808 genes. These gene models have start and stop codons and have homologs in public databases with 97% or more bidirectional coverage. The low-confidence set (Kronos.v1.0.low) has 44,381 genes, including putative pseudogenes and gene fragments. Some of the genes are partially annotated, and we are in the process of improving the annotations. Please use genome version v1.1 with this annotation set.
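The two high-confidence criteria can be expressed as a small filter. This is a sketch under stated assumptions, not the authors' pipeline: "bidirectional coverage" is read here as the alignment covering at least 97% of both the gene model and its homolog, the start codon is taken to be ATG, and the function names and coverage inputs are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_start_and_stop(cds: str) -> bool:
    """Check that a coding sequence begins with ATG and ends with a stop codon."""
    cds = cds.upper()
    return len(cds) >= 6 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


def is_high_confidence(
    cds: str,
    query_coverage: float,    # fraction of the gene model covered by the alignment
    subject_coverage: float,  # fraction of the database homolog covered
    min_coverage: float = 0.97,
) -> bool:
    """Apply both criteria: complete codons and coverage in both directions."""
    return (
        has_start_and_stop(cds)
        and query_coverage >= min_coverage
        and subject_coverage >= min_coverage
    )


print(is_high_confidence("ATGAAATAA", 0.98, 0.99))  # complete, well covered
print(is_high_confidence("ATGAAATAA", 0.98, 0.90))  # homolog only partly covered
print(is_high_confidence("AAGAAATAA", 1.00, 1.00))  # no start codon
```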

 

Acknowledgement

This work has been funded by the United States Department of Agriculture - National Institute of Food and Agriculture Award (2021-67013-35726).

Files

Files (4.4 GB)

Size        Checksum
3.0 GB      md5:9ac04da33b9ac0f58f79a615b42bcf39
198.7 MB    md5:0685ab716ffec81dde81021a66d5e11f
121.2 MB    md5:78c5180b2bf05174b6a9d4561d3d7337
209.3 MB    md5:38c473adec18303b9e9121f2ca66099d
115.4 MB    md5:b11efa003fbebd89469816d44ec84786
68.7 MB     md5:a553e5460b54b8df2166d79f3de72590
42.1 MB     md5:eec47d67a07ca19f29c60fb8cb1acb9b
162.5 MB    md5:22e1a9a205b1cb66bcf6fdfe2f9ca6bb
169.2 MB    md5:f83461cdd53205aba0a38068a40e7ce0
90.7 MB     md5:0ba96fa07f722be846263b4684c43d1d
83.9 MB     md5:d01c56a2a1cafe4634c24c29c892510c
31.2 MB     md5:e3c234d6e8ce48c2a7e37a90b59d4dc5
55.9 MB     md5:8cabfa5246414817ffd0600720de8cd2
36.2 MB     md5:0ec718b4e8df7ed8f3b351c69caea187
40.1 MB     md5:88277de0754b120cf212a59d72b53dc3
12.8 MB     md5:dcb70a26fe6f44ec2a56606b9448a3e2
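
The MD5 checksums listed above can be used to confirm that a download completed without corruption. A minimal sketch using Python's standard `hashlib` module follows; the helper names `md5sum` and `matches_listing` are illustrative, and the local file path must be supplied by the user.

```python
import hashlib


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:9ac0...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5sum(path) == expected
```

For example, `matches_listing(local_path, "md5:9ac04da33b9ac0f58f79a615b42bcf39")` should return `True` for an intact copy of the 3.0 GB file, where `local_path` is wherever that file was saved.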