Chloroplast genome sequencing reads from snow gum
Description
Tutorial data for chloroplast genome assembly: FASTQ reads from Illumina and Nanopore sequencing of the snow gum, Eucalyptus pauciflora.
Data from: Wang, W., Schalamun, M., Morales-Suarez, A. et al. Assembly of chloroplast genomes with long- and short-read data: a comparison of approaches using Eucalyptus pauciflora as a test case. BMC Genomics 19, 977 (2018) doi:10.1186/s12864-018-5348-8
The original data are hosted at NCBI under accession numbers SRR7153063 (Illumina) and SRR7153095 (Nanopore). An additional Illumina file, SRR7153071, is not used here.
The files were derived from the original datasets on the Galaxy platform (usegalaxy.org) as follows:
- Each dataset was mapped separately to the NCBI Reference Sequence for the Eucalyptus pauciflora chloroplast, NC_039597.1, using BWA-MEM.
- Unmapped reads were filtered out using a SAMtools flag.
- BAM files were converted to FASTQ files.
- Each FASTQ file was then reduced in size:
  - snow-gum-illumina-cp-reduced: contains only the first 62,500 reads. The original read pairing has not been preserved, so treat these as unpaired reads for this tutorial.
  - snow-gum-nanopore-cp-reduced: contains only reads longer than 90,000 bp.
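As a rough illustration of the filtering and size-reduction steps above, the following Python sketch shows the same ideas outside Galaxy. It is not the Galaxy workflow that was used to build these files. It assumes that the SAMtools flag in question is the standard SAM "segment unmapped" bit (0x4) and that the FASTQ files use exactly four lines per record. All function names are hypothetical.

```python
import io
from itertools import islice

# Assumption: the filter relied on the SAM FLAG bit 0x4 ("segment unmapped").
UNMAPPED_FLAG = 0x4


def is_mapped(flag: int) -> bool:
    """Return True if the SAM FLAG value does not have the unmapped bit set."""
    return (flag & UNMAPPED_FLAG) == 0


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) tuples.

    Assumes four lines per record (no line-wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        yield header, sequence, separator, quality


def first_n_reads(records, n):
    """Keep only the first n records (62,500 for the Illumina file)."""
    return islice(records, n)


def longer_than(records, min_length):
    """Keep only records whose sequence is longer than min_length bp
    (90,000 for the Nanopore file)."""
    return (rec for rec in records if len(rec[1]) > min_length)


# Tiny in-memory demonstration with made-up reads.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nAC\n+\nII\n"
)
records = list(read_fastq(demo))
print([rec[0] for rec in first_n_reads(records, 2)])  # ['@r1', '@r2']
print([rec[0] for rec in longer_than(records, 4)])    # ['@r2']
```

On real data, the same generators could be applied to an open file handle, for example `first_n_reads(read_fastq(fh), 62_500)`, so that the whole file never has to be held in memory.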
Files
(64.1 MB)
MD5 checksum | Size |
---|---|
md5:07f1e4d07d6f2dbd31a1507ac8222beb | 22.1 MB |
md5:b422993cc7545ae252637d1106fd9d4e | 42.0 MB |
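After downloading, the MD5 checksums listed above can be used to confirm that a file arrived intact. The sketch below is one way to do this in Python, using only the standard library. Because the listing does not show which file name belongs to which checksum, the hypothetical helper `is_listed` only checks whether a file's digest matches either of the two values.

```python
import hashlib

# The two MD5 digests listed for this record.
RECORD_MD5S = {
    "07f1e4d07d6f2dbd31a1507ac8222beb",
    "b422993cc7545ae252637d1106fd9d4e",
}


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path, checksums=RECORD_MD5S):
    """Return True if the file's MD5 digest is one of the listed values."""
    return md5sum(path) in checksums
```

Calling `is_listed` with the path of a downloaded file returns `False` if the download was truncated or corrupted.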