Published July 3, 2023 | Version v1
Journal article Open

Supporting data for "Near telomere-to-telomere level genome assembly for marigold (Tagetes erecta)"

Authors/Creators

  • 1. Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences

Description

Here, we generated a near telomere-to-telomere level genome assembly of marigold based on highly accurate high-fidelity (HiFi) long reads and Hi-C sequencing data. Compared to the previously reported marigold genome, the current assembly has markedly higher contiguity and a more complete gene set. The current genome assembly has a 27-fold increase in contig N50 size, a 12.1% increase in chromosome anchoring rate, and a 9.0% increase in the BUSCO complete rate for the gene set. In addition, the current assembly has far fewer assembly errors. Based on this high-quality genome assembly, we found that 170-bp repeats are the most abundant centromeric unit and that centromeric regions are distributed along the entire length of each of the 12 chromosomes, indicating the existence of holocentromeres in marigold. We also analyzed the structure and phylogenetic relationships of the four PSY genes and found that these genes have diversified and possibly perform different functions in various tissues.
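For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how a contig N50 and a fold increase between two assemblies could be computed. It uses the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length). The function name `n50` and the contig lengths are hypothetical illustrations, not values from the marigold assemblies, and this is not necessarily how the authors computed their statistics.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the
    length of the contig that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), NOT marigold data.
old_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
new_contigs = [80, 70, 50, 20]

# Fold increase in contig N50 between the two toy assemblies.
fold_increase = n50(new_contigs) / n50(old_contigs)
print(n50(old_contigs), n50(new_contigs), fold_increase)
```

A fold increase is a ratio of the two N50 values, whereas the anchoring-rate and BUSCO figures in the description are reported as percentage increases.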

Files

md5.txt

Files (1.1 GB)

MD5 checksum                          Size
md5:3100a176b26621d1d8296707789cf7c1  419 Bytes
md5:e0d674607e82d577dfa93266e2fc6d24  1.7 kB
md5:003c43a7740ef6f238716c5e554596b7  53.4 MB
md5:743b542ee4f44b79d457a52228d719e4  75.5 MB
md5:fe59d47ca1d41907fb97c07e2c4464ee  791.9 MB
md5:d3d2c30431a9371311fd996627dca58a  19.8 MB
md5:c8af6921a02bd6133e3701e9c28d993f  116.4 MB
md5:4e6d3be6d30ca72f464daeb697577c2a  52.6 MB
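
The MD5 values listed for each file can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_hex` and `matches_listing` and the local path `downloaded_file` are hypothetical, since the file names are not shown in this listing, and the layout of `md5.txt` is not assumed here.

```python
import hashlib


def md5_hex(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a checksum copied from the listing.

    `listed` may carry the "md5:" prefix used on this page.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    with open(path, "rb") as handle:
        return md5_hex(handle) == expected


# Example usage (hypothetical local path; checksum copied from the listing):
# ok = matches_listing("downloaded_file", "md5:fe59d47ca1d41907fb97c07e2c4464ee")
```

A mismatch usually means the download was truncated or altered and the file should be fetched again.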