Dataset Open Access

Supplementary dataset to "Draft genome assembly of the biofuel grass crop Miscanthus sacchariflorus"

De Vega, JJ

Miscanthus sacchariflorus (Maxim.) Hack. is a C4 perennial rhizomatous grass grown as a biofuel crop. M. sacchariflorus is among the most widely distributed species in the genus, particularly at cold northern latitudes, and is one of the progenitor species of the main commercial biomass crop, M. × giganteus. We generated a 2.54 Gbp whole-genome assembly of the diploid M. sacchariflorus “Robustus 297” genotype, which represents ~59% of the expected genome size. We later anchored this assembly to the chromosome-scale M. sinensis genome to improve its contiguity. We annotated 86,767 and 69,049 protein-coding genes in the unanchored and anchored assemblies, respectively. We estimate that our assemblies include ~85% of the M. sacchariflorus genes, based on homology, core markers and RNA-seq alignment statistics. Raw data and further metadata are available under BioProject PRJNA435476.

  • Msac_v2.fasta.gz: Unanchored whole-genome assembly (WGA) of M. sacchariflorus in FASTA format.
  • Msac_v3.fasta.gz: The previous WGA, re-scaffolded against the public M. sinensis reference.
  • Msac_v3.agp: Chromosomal positions in the M. sinensis reference of the scaffolds from the previous WGA that make up Msac_v3.fasta.
  • Msac_v2.gff3.gz: Gene annotation of the unanchored WGA in GFF3 format, containing 86,767 coding genes.
  • Msac_v3.gff3.gz: Gene annotation of the anchored WGA in GFF3 format, containing 69,049 coding genes.
  • Msac_v2.func_annot.tsv: Tab-separated text table with the functional annotation of the 86,767 coding genes in Msac_v2.gff3.
  • Msac_v2.repeats_annotation.gff3.gz: Repeat annotation (RepeatMasker) of the unanchored reference.
  • Msac_v2.masked.fasta.gz: Repeat-masked version (RepeatMasker) of Msac_v2.fasta.
  • all.satsuma.blocks_Msac_v2-vs-Msin.gz: All alignments of the scaffolds in Msac_v3.fasta to the M. sinensis reference.
  • Msac_v2.orthology_Msin.tsv: Orthologous genes between Msac_v2 and M. sinensis.
  • Msac_v3-vs-Msin.tsv: Orthologous genes between Msac_v3 and M. sinensis.
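As a quick sanity check on the files listed above, the following minimal Python sketch counts features by type in a GFF3 annotation and sums the sequence length of a FASTA assembly. It is an illustration only, not part of the dataset: the function names are hypothetical, and it assumes that coding genes are recorded with the feature type `gene` in column 3 of the GFF3 files, which this record does not state.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_gff3_features(path):
    """Count features by type (column 3) in a GFF3 file."""
    counts = Counter()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # optional embedded sequence section; stop here
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:  # GFF3 feature lines have nine columns
                counts[fields[2]] += 1
    return counts


def fasta_total_length(path):
    """Return (number of sequences, total bases) for a FASTA file."""
    n_seqs = 0
    n_bases = 0
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                n_seqs += 1
            else:
                n_bases += len(line.strip())
    return n_seqs, n_bases
```

For example, `count_gff3_features("Msac_v2.gff3.gz").get("gene", 0)` could be compared with the 86,767 coding genes reported above (if the feature-type assumption holds), and `fasta_total_length("Msac_v2.fasta.gz")` with the reported 2.54 Gbp assembly size.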
Files (1.3 GB)
Name, size, MD5 checksum:
  • all.satsuma.blocks_Msac_v2-vs-Msin.gz, 78.1 MB, md5:b493a76ed8924b4a082980c3911532ae
  • Msac_v2.fasta.gz, 491.8 MB, md5:e456585fa4e4237be9bdd6da207e2388
  • Msac_v2.func_annot.tsv, 15.9 MB, md5:38c3e140e85770fad9c3568da4d1c648
  • Msac_v2.gff3.gz, 13.0 MB, md5:df9e133a543e50996e7bbbf9b2233478
  • Msac_v2.masked.fasta.gz, 252.7 MB, md5:feb3f65b33c3ac8f1ebcebd9bbd72530
  • Msac_v2.orthology_Msin.tsv, 2.5 MB, md5:ba7464cf36bef8f2bd9354c753f770d2
  • Msac_v2.repeats_annotation.gff3.gz, 58.5 MB, md5:67ee8454829335c92b894c228f1f5c01
  • Msac_v3-vs-Msin.tsv, 1.5 MB, md5:5c821a80a2093ee10acab2be1a8d7241
  • Msac_v3.agp, 3.6 MB, md5:785ed643fe645f250895d98e3cce0668
  • Msac_v3.fasta.gz, 407.8 MB, md5:f3378c92b42155ebd2d43adaa6be1e2d
  • Msac_v3.gff3.gz, 7.9 MB, md5:e47be8ab3527580e8cdc241b6f75902b
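The MD5 checksums listed above can be used to confirm that a download is complete and uncorrupted. The sketch below is one possible way to do this with the Python standard library. The function names and the `EXPECTED_MD5` table are illustrative (only two of the listed files are included for brevity), and it assumes the files were saved under their listed names in a single directory.

```python
import hashlib
import os

# Checksums copied from the file listing above (subset shown for brevity).
EXPECTED_MD5 = {
    "Msac_v3.agp": "785ed643fe645f250895d98e3cce0668",
    "Msac_v2.orthology_Msin.tsv": "ba7464cf36bef8f2bd9354c753f770d2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Return {file name: 'ok' | 'mismatch' | 'missing'} for each expected file."""
    status = {}
    for name, checksum in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif md5_of_file(path) == checksum.lower():
            status[name] = "ok"
        else:
            status[name] = "mismatch"
    return status
```

Calling `verify_downloads(".")` in the download directory reports each file as `ok`, `mismatch` or `missing`. Reading in chunks keeps memory use low for the larger FASTA archives.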