Published May 30, 2021 | Version v1
Journal article Open

Genome assembly and gene annotations for the fiber flax

Creators

  • 1. Inner Mongolia Agricultural University

Description

Flax (Linum usitatissimum) is a valuable crop whose fiber and seed oil are widely used. Several genome assemblies for flax have been published. Because the flax genome has undergone a recent whole-genome duplication event and contains a high proportion of repetitive sequences, these published genomes were poorly assembled, even when Oxford Nanopore long reads were used. Here, we report a high-quality genome assembly of fiber flax built from HiFi and Hi-C sequencing data. A total of 21.80 Gb of HiFi reads with an N50 of 12.19 kb were generated and assembled into a 454.95 Mb assembly comprising 336 contigs with a contig N50 of 9.61 Mb. Using a Hi-C contact map, 93.0% of the contigs were anchored to 15 chromosomes. Our assembly contains more repeat elements (251.86 Mb, 55.36%) than the CDC Bethune genome assembly. A total of 49,616 protein-coding genes and 52,207 transcripts were predicted, covering 95.3% of complete BUSCOs. The specific and rapidly evolving gene families in flax may relate to oil metabolism, fiber biosynthesis, and resistance to biotic stress. These results show that HiFi sequencing is a promising strategy for assembling complex genomes such as that of flax, which has undergone a very recent whole-genome duplication event and is rich in repeat elements. This high-quality reference will promote genetic research and accelerate genetic breeding in flax.
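
The N50 values quoted above (for reads and for contigs) follow the usual definition: the length L such that sequences of length L or longer together make up at least half of the total bases. A minimal sketch of that calculation is given below. The function name `n50` and the toy lengths are illustrative assumptions, not part of this record, and the sketch is not the pipeline used to produce the reported figures.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all bases.
    """
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


# Toy contig lengths (hypothetical, arbitrary units): total = 290.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 80 + 70 = 150 >= 145, so this prints 70
```

Applied to the contig lengths of a real assembly FASTA file, the same function would be expected to reproduce a contig N50 such as the 9.61 Mb reported here, provided the lengths are parsed from the same file.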

Files

Files (202.7 MB)

  • md5:3d7e9d99d89058adea8e9c05c6fe762d (87.6 MB)
  • md5:d8892cf9bd40c3b605f95f6b8ff3d0ce (1.4 MB)
  • md5:9af9fca9f28b0a64c39bb60af1a7334a (107.8 MB)
  • md5:20b24c3c41027591b6427f2fd328a081 (5.9 MB)
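
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` and the file name `downloaded_file` are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:3d7e...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Hypothetical usage; replace 'downloaded_file' with the real local path:
# matches_checksum("downloaded_file", "md5:3d7e9d99d89058adea8e9c05c6fe762d")
```

A result of `False` would suggest an incomplete or corrupted download, in which case downloading the file again is the usual remedy.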