Journal article Open Access

The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads

Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo; Zhu, Shilin; Shi, Daihu; McDill, Joshua; Yang, Linfeng; Hawkins, Simon; Neutelings, Godfrey; Datla, Raju; Lambert, Georgina; Galbraith, David W.; Grassa, Christopher J.; Geraldes, Armando; Cronk, Quentin C.; Cullis, Christopher; Dash, Prasanta K.; Kumar, Polumetla A.; Cloutier, Sylvie; Sharpe, Andrew G.; Wong, Gane K.-S.; Wang, Jun; Deyholos, Michael K.

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Wang, Zhiwen</dc:creator>
  <dc:creator>Hobson, Neil</dc:creator>
  <dc:creator>Galindo, Leonardo</dc:creator>
  <dc:creator>Zhu, Shilin</dc:creator>
  <dc:creator>Shi, Daihu</dc:creator>
  <dc:creator>McDill, Joshua</dc:creator>
  <dc:creator>Yang, Linfeng</dc:creator>
  <dc:creator>Hawkins, Simon</dc:creator>
  <dc:creator>Neutelings, Godfrey</dc:creator>
  <dc:creator>Datla, Raju</dc:creator>
  <dc:creator>Lambert, Georgina</dc:creator>
  <dc:creator>Galbraith, David W.</dc:creator>
  <dc:creator>Grassa, Christopher J.</dc:creator>
  <dc:creator>Geraldes, Armando</dc:creator>
  <dc:creator>Cronk, Quentin C.</dc:creator>
  <dc:creator>Cullis, Christopher</dc:creator>
  <dc:creator>Dash, Prasanta K.</dc:creator>
  <dc:creator>Kumar, Polumetla A.</dc:creator>
  <dc:creator>Cloutier, Sylvie</dc:creator>
  <dc:creator>Sharpe, Andrew G.</dc:creator>
  <dc:creator>Wong, Gane K.-S.</dc:creator>
  <dc:creator>Wang, Jun</dc:creator>
  <dc:creator>Deyholos, Michael K.</dc:creator>
  <dc:description>Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43 384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.</dc:description>
  <dc:title>The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads</dc:title>
</oai_dc:dc>