There is a newer version of the record available.

Published April 18, 2019 | Version v1.1.0
Dataset Open

Simulated Arabidopsis thaliana sequencing datasets for chloroplast assembler benchmarking

  • 1. Center for Computational and Theoretical Biology, University of Würzburg, Germany
  • 2. Fraunhofer Institute for Molecular Biology and Applied Ecology IME: Gießen

Description

## Changes

Fixed an off-by-one error in the reverse reads that was present in version 1.0.0.

## Purpose and Documentation

See: https://github.com/chloroExtractorTeam/benchmark

## Original data
The original *Arabidopsis thaliana* sequences were downloaded from TAIR:

The Arabidopsis Information Resource (TAIR), ftp://ftp.arabidopsis.org/home/tair/Sequences/whole_chromosomes/ on www.arabidopsis.org, Mar 22, 2019, available under the [TAIR Terms of Use](http://www.arabidopsis.org/doc/about/tair_terms_of_use/417)

Tanya Z. Berardini, Leonore Reiser, Donghui Li, Yarik Mezheritsky, Robert Muller, Emily Strait and Eva Huala
    The Arabidopsis Information Resource: Making and mining the "gold standard" annotated reference plant genome.
    genesis 2015 doi: 10.1002/dvg.22877

## Programs used to generate this data
 - [seqkit](https://github.com/shenwei356/seqkit) (v0.10.1): Shen W, Le S, Li Y, Hu F (2016) SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLOS ONE 11(10): e0163962. [https://doi.org/10.1371/journal.pone.0163962](https://doi.org/10.1371/journal.pone.0163962)

 

Notes

Full documentation: https://github.com/chloroExtractorTeam/benchmark/blob/master/03_representative_datasets.md

Files

Files (8.3 GB)

md5:0bf68954797b9ea10a9e89a3a591feab
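
Because the archive is large, it may be worth checking the download against the MD5 checksum listed for this record before using it. The following is a minimal sketch in Python, not part of the original record: the function names `md5_of_file` and `verify_download` are hypothetical, and the path of the downloaded file has to be supplied by the user, since the file name is not shown here.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "0bf68954797b9ea10a9e89a3a591feab"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the MD5 digest of the file matches the expected value.
    (Hypothetical helper; the comparison is case-insensitive.)"""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download()` with the local path of the downloaded archive returns `True` only if the file matches the published checksum. A mismatch usually points to an incomplete or corrupted download, or to a file from a different version of the record.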

Additional details

References

  • Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E (2015) The Arabidopsis Information Resource: Making and mining the "gold standard" annotated reference plant genome. genesis doi:10.1002/dvg.22877
  • Shen W, Le S, Li Y, Hu F (2016) SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLOS ONE 11(10): e0163962. doi:10.1371/journal.pone.0163962