Published November 11, 2019 | Version v1
Dataset Open

Toy dataset for metaGEM documentation (Gut v1)

  • 1. EMBL

Description

This dataset was generated to test and benchmark the metaGEM workflow.

The 100 bp Illumina WGS reads are ~10% subsets of three paired-end read sets from the following publication:

Karlsson, Fredrik H., et al. “Gut Metagenome in European Women with Normal, Impaired and Diabetic Glucose Control.” Nature, vol. 498, no. 7452, 2013, pp. 99–103, doi:10.1038/nature12198.

SRA Accession code ERP002469.

The subsets were generated using the command line tool seqtk:

seqtk sample -s100 sample_X.fastq.gz 3000000 > subset_X.fastq.gz
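Because the reads are paired-end, each sample has two FASTQ files, and mates only stay in sync if both files are subsampled with the same seed and the same read count. The sketch below illustrates that idea under stated assumptions: the function name subsample_pair, the output prefix, and the file names in the example call are hypothetical, and the gzip step is added here because seqtk writes uncompressed FASTQ to standard output. It is not necessarily the exact procedure used to build this dataset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: subsample both mates of one paired-end library.
# Arguments: R1 file, R2 file, number of reads, seed (default 100), output prefix.
subsample_pair() (
    set -o pipefail   # fail if seqtk fails, not only if gzip fails
    r1="$1"; r2="$2"; n="$3"; seed="${4:-100}"; prefix="${5:-subset}"

    # Same seed and same read count for both mates keeps read i of R1
    # paired with read i of R2 in the output.
    seqtk sample -s"$seed" "$r1" "$n" | gzip > "${prefix}_1.fastq.gz" || exit 1
    seqtk sample -s"$seed" "$r2" "$n" | gzip > "${prefix}_2.fastq.gz" || exit 1
)

# Example call (input paths are placeholders, not files from this record):
# subsample_pair sample1_1.fastq.gz sample1_2.fastq.gz 3000000 100 subset1
```

The function body runs in a subshell, so the pipefail setting does not leak into the calling shell.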

The run accessions from the original publication used to generate the toy dataset are ERR260162 (sample1), ERR260173 (sample2), and ERR260184 (sample3).

Files

Files (1.8 GB)

Checksum                                Size
md5:cb9adfb94c9bfba4903daa6561969550    313.2 MB
md5:d825dd7fa1eeb8d8a2c4cded49bd0546    309.2 MB
md5:756660059eb6e40d86e2a29e6eac7989    297.9 MB
md5:30e3c1c9f956149c1ba9f01beb600609    302.4 MB
md5:d4e155c92bd7d9ccd6583ddf420383c3    309.4 MB
md5:6b8316f0741a7f795090777693417233    308.2 MB
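
The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. A minimal sketch, assuming GNU coreutils md5sum is available (on macOS the equivalent tool is md5); the function name verify_md5 and the file name in the example call are hypothetical, since the file names are not shown in this listing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify_md5 FILE EXPECTED_MD5
# Returns 0 only if the file's MD5 digest equals the expected value.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" 2>/dev/null | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example call (the file name is a placeholder; the digest is the first one listed):
# verify_md5 downloaded_file.fastq.gz cb9adfb94c9bfba4903daa6561969550 && echo "OK"
```

A missing or unreadable file yields an empty digest, so the check fails rather than passing silently.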