Genome assembly and gene annotations for the fiber flax
Description
Flax (Linum usitatissimum) is a valuable crop whose fiber and seed oil are widely used. Several genome assemblies for flax have been published. Because the flax genome has undergone a recent whole-genome duplication event and contains a high proportion of repetitive sequences, these published genomes were poorly assembled, even when Oxford Nanopore long reads were used. Here, we report a high-quality genome assembly of fiber flax based on HiFi and Hi-C sequencing data. A total of 21.80 Gb of HiFi reads with an N50 of 12.19 Kb were generated and assembled into a 454.95 Mb assembly of 336 contigs with an N50 of 9.61 Mb. Using a Hi-C contact map, 93.0% of the contigs were anchored to 15 chromosomes. Our assembly captured more repeat elements (251.86 Mb, 55.36% of the assembly) than the CDC Bethune genome assembly. A total of 49,616 protein-coding genes and 52,207 transcripts were predicted, covering 95.3% of complete BUSCOs. The specific and rapidly evolving gene families in flax may relate to oil metabolism, fiber biosynthesis, and resistance to biotic stress. These results indicate that HiFi sequencing is a promising strategy for assembling complex genomes such as that of flax, which has undergone a very recent whole-genome duplication event and is rich in repeat elements. This high-quality reference will promote genetic research and accelerate genetic breeding in flax.
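The N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are illustrative assumptions. The last lines re-derive the repeat fraction from the two figures given in the description.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The result is in the same unit as the input (bp, Kb, Mb, ...).
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum past half is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical contig lengths, not taken from the assembly).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))

# Repeat fraction from the figures in the description:
# 251.86 Mb of repeats in a 454.95 Mb assembly.
repeat_pct = 251.86 / 454.95 * 100
print(round(repeat_pct, 2))
```

To apply this to the real assembly, the contig lengths would have to be read from the deposited FASTA file, whose name is not preserved in this record.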
Files
(202.7 MB)
MD5 checksum | Size |
---|---|
md5:3d7e9d99d89058adea8e9c05c6fe762d | 87.6 MB |
md5:d8892cf9bd40c3b605f95f6b8ff3d0ce | 1.4 MB |
md5:9af9fca9f28b0a64c39bb60af1a7334a | 107.8 MB |
md5:20b24c3c41027591b6427f2fd328a081 | 5.9 MB |
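After downloading, each file can be checked against its listed MD5 checksum. The sketch below uses only Python's standard `hashlib` module; the helper names `file_md5` and `matches_listed_md5` are illustrative, and the path in the usage comment is hypothetical because the file names are not preserved in this listing.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.removeprefix("md5:").strip().lower()
    return file_md5(path) == expected


# Usage (hypothetical local path; substitute the actual downloaded file):
# matches_listed_md5("downloaded_file", "md5:3d7e9d99d89058adea8e9c05c6fe762d")
```

Reading in chunks keeps memory use flat for the larger files (up to 107.8 MB here). `str.removeprefix` requires Python 3.9 or later.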