Published March 19, 2025 | Version v2
Dataset Open

Data From: Oatk - a de novo assembly tool for complex plant organelle genomes

  • 1. University of Cambridge
  • 2. Wellcome Sanger Institute
  • 3. Anglia Ruskin University

Description

This repository hosts the data for 195 plant organelle genome assemblies generated in the manuscript "Oatk: a de novo assembly tool for complex plant organelle genomes". The sequence data were produced by the Tree of Life programme at the Sanger Institute, mostly from the Darwin Tree of Life (DToL) project, and cover 24 monocots, 154 eudicots, 16 mosses and one liverwort. See the SAMPLE_LIST file for descriptions of these species.

Each species subfolder contains the following files.

  1. PLTD.fasta                   Plastome assembly file in FASTA format
  2. PLTD.annot.bed          Plastome assembly annotation file in BED format
  3. MITO.fasta                   Mitogenome assembly file in FASTA format
  4. MITO.annot.bed          Mitogenome assembly annotation file in BED format
  5. MBG.gfa                         Genome assembly file in GFA format generated with MBG
  6. OATK.gfa                       Genome assembly file in GFA format generated with Oatk
  7. PMAT.gfa                       Genome assembly file in GFA format generated with PMAT (may not exist)
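
The per-species layout listed above can be checked after download with a short script. This is a minimal sketch, assuming DATA.zip has been extracted into a directory whose immediate subdirectories are the species subfolders; that layout and the function name `missing_files` are assumptions for illustration and are not confirmed by the record.

```python
from pathlib import Path

# File names taken from the list above; PMAT.gfa may legitimately be absent.
EXPECTED_FILES = [
    "PLTD.fasta",
    "PLTD.annot.bed",
    "MITO.fasta",
    "MITO.annot.bed",
    "MBG.gfa",
    "OATK.gfa",
    "PMAT.gfa",
]


def missing_files(data_root):
    """Map each species subfolder name to the expected files it lacks."""
    report = {}
    for species_dir in sorted(Path(data_root).iterdir()):
        if not species_dir.is_dir():
            continue  # skip loose files that sit next to the species subfolders
        report[species_dir.name] = [
            name for name in EXPECTED_FILES
            if not (species_dir / name).is_file()
        ]
    return report
```

Calling `missing_files("DATA")` on a hypothetical extraction directory returns a dictionary in which an empty list means the subfolder is complete, which makes it easy to spot species without a PMAT assembly.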

 

Updates in the New Version:

In the previous version, our raw PacBio HiFi read pre-processing pipeline screened out some reads that it erroneously identified as containing HiFi adapter sequence, which led to the gaps in the Hibiscus plastomes. We have now fixed this and rerun all the assemblies that had produced any linear organelle components (37 species). All plastomes remain unchanged except for those of the three Hibiscus species, which are now also circular. Thirteen mitogenomes changed, six of which are now circular.

Files

DATA.zip

Files (260.7 MB)

Name Size
md5:6ac20f1eacd5ce57b5331340c14010fd
260.7 MB
md5:20730dc20ad04aa4e813abbcdcfa890c
13.1 kB
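
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are illustrative, and the caller is assumed to pass the checksum that the record lists for the file actually downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch usually indicates a truncated or corrupted download, in which case fetching the file again is the simplest remedy.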

Additional details

Software