Data From: Oatk - a de novo assembly tool for complex plant organelle genomes
Description
This repository hosts the data for 195 plant organelle genome assemblies generated in the manuscript "Oatk: a de novo assembly tool for complex plant organelle genomes". The sequence data were produced by the Tree of Life programme at the Sanger Institute, mostly from the Darwin Tree of Life (DToL) project, and cover 24 monocots, 154 eudicots, 16 mosses and one liverwort. See the SAMPLE_LIST file for descriptions of these species.
Each species subfolder contains the following files.

| File | Description |
|---|---|
| PLTD.fasta | Plastome assembly file in FASTA format |
| PLTD.annot.bed | Plastome assembly annotation file in BED format |
| MITO.fasta | Mitogenome assembly file in FASTA format |
| MITO.annot.bed | Mitogenome assembly annotation file in BED format |
| MBG.gfa | Genome assembly file in GFA format generated with MBG |
| PMAT.gfa | Genome assembly file in GFA format generated with PMAT (may not exist) |
| OATK.gfa | Genome assembly file in GFA format generated with OATK |
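The per-species layout described above can be checked with a short script. The following is a minimal sketch, not part of the dataset: it assumes the archive has been extracted so that the species subfolders sit directly under a single root directory (the root path is hypothetical), and it uses only the file names listed in the description. Because one of the GFA files may be absent for some species, missing files are reported rather than treated as errors.

```python
from pathlib import Path

# File names as listed in the dataset description.
EXPECTED_FILES = [
    "PLTD.fasta", "PLTD.annot.bed",
    "MITO.fasta", "MITO.annot.bed",
    "MBG.gfa", "PMAT.gfa", "OATK.gfa",
]


def missing_files(root):
    """Map each species subfolder under `root` to the expected files it lacks.

    `root` is an assumed extraction directory; regular files at the top
    level (for example a sample list) are skipped.
    """
    report = {}
    for species_dir in sorted(Path(root).iterdir()):
        if not species_dir.is_dir():
            continue
        report[species_dir.name] = [
            name for name in EXPECTED_FILES
            if not (species_dir / name).is_file()
        ]
    return report
```

Calling `missing_files("DATA")`, where `DATA` stands for wherever the archive was extracted, returns a dictionary keyed by subfolder name; an empty list means every expected file is present for that species.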
Updates in the New Version:
In the previous version, our pre-processing pipeline for raw PacBio HiFi reads screened out some reads that it erroneously identified as containing HiFi adapter sequence, which led to gaps in the Hibiscus plastomes. We have now fixed this and rerun all assemblies that had produced any linear organelle components (37 species). All plastomes remain unchanged except for those of the three Hibiscus species, which are now also circular. Thirteen mitogenomes changed, six of which are now circular.
Files
| Name | Size | MD5 |
|---|---|---|
| DATA.zip | 260.7 MB | 6ac20f1eacd5ce57b5331340c14010fd |
| | 13.1 kB | 20730dc20ad04aa4e813abbcdcfa890c |
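The MD5 values listed for the files can be used to confirm that a download is intact. The sketch below is one way to do this with Python's standard `hashlib` module. The function names and the local file path are illustrative assumptions; only the checksum strings come from the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming the 260.7 MB entry is `DATA.zip` and it was saved to the current directory, `matches_checksum("DATA.zip", "6ac20f1eacd5ce57b5331340c14010fd")` should return `True` for an uncorrupted download.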
Additional details
Software
- Repository URL: https://github.com/c-zhou/oatk