Dataset Open Access

Supplementary dataset to "Draft genome assembly of the biofuel grass crop Miscanthus sacchariflorus"

De Vega, JJ


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.4270235", 
  "author": [
    {
      "family": "De Vega, JJ"
    }
  ], 
  "issued": {
    "date-parts": [
      [
        2020, 
        11, 
        12
      ]
    ]
  }, 
  "abstract": "<p><em>Miscanthus sacchariflorus</em> (Maxim.) Hack. is a C4 perennial rhizomatous biofuel grass crop. <em>M. sacchariflorus</em> is among the most widely distributed species within the genus, particularly at cold northern latitudes, and is one of the progenitor species of the main commercial biomass crop <em>M.&nbsp;&times;&nbsp;giganteus</em>. We generated a 2.54 Gbp whole-genome assembly of the diploid <em>M. sacchariflorus</em> &ldquo;Robustus 297&rdquo; genotype, which represented ~59% of the expected genome size. We later anchored this assembly to the chromosome-scale <em>M. sinensis</em> genome to improve its contiguity. We annotated 86,767 and 69,049 protein-coding genes in the unanchored and anchored assemblies, respectively. We estimated that our assemblies include ~85% of the <em>M. sacchariflorus</em> genes based on homology, core markers and RNA-seq alignment statistics. Raw data and further metadata are available under BioProject PRJNA435476.</p>\n\n<ul>\n\t<li>Msac_v2.fasta: Unanchored whole-genome assembly (WGA) of M. sacchariflorus in FASTA format.</li>\n\t<li>Msac_v3.fasta: The previous WGA re-scaffolded with the M. sinensis public reference.</li>\n\t<li>Msac_v3.agp: Chromosomal position in the M. sinensis reference of the previous scaffolds in Msac_v3.fasta</li>\n\t<li>Msac_v2.gff3: Gene annotation of the unanchored WGA in GFF3 format, which contains 86,767 coding genes</li>\n\t<li>Msac_v3.gff3: Gene annotation of the anchored WGA in GFF3 format, which contains 69,049 coding genes</li>\n\t<li>Msac_v2.func_annot.tsv: Text table containing the functional annotation of the 86,767 coding genes in Msac_v2.gff3</li>\n\t<li>Msac_v2.repeats_annotation.gff3: Repeat annotation (RepeatMasker) of the unanchored reference.</li>\n\t<li>Msac_v2.masked.fasta.gz: Repeat-masked version (RepeatMasker) of Msac_v2.fasta</li>\n\t<li>all.satsuma.blocks_Msac_v2-vs-Msin.gz: Every alignment from scaffolds in Msac_v3.fasta to the M. sinensis reference</li>\n\t<li>Msac_v2.orthology_Msin.tsv: Orthologs between Msac_v2 and M. sinensis</li>\n\t<li>Msac_v3-vs-Msin.tsv: Orthologs between Msac_v3 and M. sinensis</li>\n</ul>", 
  "title": "Supplementary dataset to \"Draft genome assembly of the biofuel grass crop Miscanthus sacchariflorus\"", 
  "type": "dataset", 
  "id": "4270235"
}
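
The export above is plain CSL-JSON, so it can be read with any JSON parser. The following is a minimal sketch, not part of the dataset or of any Zenodo tooling, of how the file list embedded in the "abstract" field might be pulled out and how a short citation line might be assembled. The helper names list_files and short_citation, the citation layout, and the trimmed two-item abstract are all assumptions made for illustration; only the field names and values come from the record.

```python
import html
import json
import re

# Trimmed copy of the record above. The abstract is shortened to two list
# items purely for illustration; the full record lists eleven files.
record_json = r'''
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.4270235",
  "author": [{"family": "De Vega, JJ"}],
  "issued": {"date-parts": [[2020, 11, 12]]},
  "abstract": "<ul>\n\t<li>Msac_v2.fasta: Unanchored whole-genome assembly (WGA) of M. sacchariflorus in FASTA format.</li>\n\t<li>Msac_v3.agp: Chromosomal position in the M. sinensis reference of the previous scaffolds in Msac_v3.fasta</li>\n</ul>",
  "title": "Supplementary dataset to \"Draft genome assembly of the biofuel grass crop Miscanthus sacchariflorus\"",
  "type": "dataset",
  "id": "4270235"
}
'''


def list_files(record):
    """Return {filename: description} parsed from the <li> items of the abstract.

    Hypothetical helper: assumes each item has the form "name: description".
    """
    items = re.findall(r"<li>(.*?)</li>", record["abstract"], flags=re.DOTALL)
    files = {}
    for item in items:
        # Drop any inline tags, then decode HTML entities such as &nbsp;.
        text = html.unescape(re.sub(r"<[^>]+>", "", item))
        name, _, description = text.partition(": ")
        # Collapse stray whitespace and line breaks inside the description.
        files[name.strip()] = " ".join(description.split())
    return files


def short_citation(record):
    """Build a one-line citation (hypothetical layout, not an official style)."""
    year = record["issued"]["date-parts"][0][0]
    authors = "; ".join(author["family"] for author in record["author"])
    return (
        f'{authors} ({year}). {record["title"]}. '
        f'{record["publisher"]}. https://doi.org/{record["DOI"]}'
    )


record = json.loads(record_json)
for name, description in list_files(record).items():
    print(f"{name}\t{description}")
print(short_citation(record))
```

Running the same two helpers on the full export should yield one entry per file named in the abstract, provided the JSON string is first stored on a single line so that it parses.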