Published November 6, 2019 | Version v.1.0
Dataset Open

MACREL software benchmark data set: Simulated metagenomes with sequencing quality, errors profile and abundance distributions derived from real samples

Description

These metagenomes were used to benchmark the FACS pipeline and were modeled on the NGLess benchmark dataset (doi.org/10.5281/zenodo.2560288). Metagenomes were simulated with ART-bin-MountRainier-2016.06.05 using real abundance profiles (.abund files) available elsewhere, and proGenomes' representative contigs as reference genomes. Metagenomes with 40, 60 and 80 million reads are available, based on the reference genomes and abundances of the following samples:

SAMEA2466916
SAMEA2466953
SAMEA2466965
SAMEA2621107
SAMEA2621229
SAMEA2621247

To convert them from the CRAM format back to fastq files:


## 1. converting from cram to bam format:
samtools view -b -T refgenome.fa -o file.bam file.cram

## 2. sorting the bam file:
samtools sort -n file.bam -o input_sorted.bam   # sort reads by identifier-name (-n)

## 3. converting from bam to fastq format:
bedtools bamtofastq -i input_sorted.bam -fq output_r1.fastq -fq2 output_r2.fastq
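
When several CRAM files need converting, the three steps above could be wrapped in a small shell function. This is only a sketch: the function name `cram_to_fastq`, its argument order, and the intermediate and output file names are assumptions made for illustration and are not part of the dataset. It uses only the samtools and bedtools options already shown above.

```shell
# Hypothetical helper (not part of the dataset): runs the three conversion
# steps for a single CRAM file.
# Usage: cram_to_fastq <reference.fa> <input.cram> <output_prefix>
cram_to_fastq() {
    ref="$1"      # reference genome in FASTA format
    cram="$2"     # input CRAM file
    prefix="$3"   # prefix for intermediate and output files (assumed naming)

    # 1. convert from CRAM to BAM, using the reference given with -T
    samtools view -b -T "$ref" -o "${prefix}.bam" "$cram" || return 1

    # 2. sort reads by identifier name (-n) so that mates are adjacent
    samtools sort -n "${prefix}.bam" -o "${prefix}.sorted.bam" || return 1

    # 3. convert from BAM to paired FASTQ files
    bedtools bamtofastq -i "${prefix}.sorted.bam" \
        -fq "${prefix}_r1.fastq" -fq2 "${prefix}_r2.fastq" || return 1
}
```

For example, `cram_to_fastq refgenome.fa file.cram output` would be expected to leave `output_r1.fastq` and `output_r2.fastq` in the current directory; the function stops at the first step that fails.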

 

Files (47.0 GB)

Checksum                                Size
md5:366fe8cc20b48ea555f97b7a7039676d    1.8 GB
md5:b17828703b51cc8366c8259a3fac7a46    1.7 GB
md5:fd1041dd9dc6805697aa365e699ec304    1.7 GB
md5:5691a41fb6ea92c956cfec917744c94b    1.7 GB
md5:d0a49282dc1de838527fc070fd0373df    1.7 GB
md5:58e47f8865d36af069e771b7200befe3    1.7 GB
md5:268a5c6afd961d59287dfc3fec3a0107    2.6 GB
md5:f078e0659cb88ffd2c8ac8b12f60c08b    2.6 GB
md5:ea23bb3bf54e615034de42c39ce9cbc4    2.6 GB
md5:ade80da0f512f6bbf2d3063dc04ecd6e    2.6 GB
md5:abdc3cd139a190dca1a7067e8a5b6adf    2.6 GB
md5:61b7a50b92b6eb15ea6f144b9e78ade0    2.6 GB
md5:462689bede06d059c1fe41eb38add115    3.5 GB
md5:b873f4c0571327092ea0fa211c54abe4    3.5 GB
md5:f960d8bc3151cd1f0554f82df3f59a48    3.5 GB
md5:4f9336a4110581644169de0198c42b4f    3.5 GB
md5:a48a549984800d75612592c5f785e292    3.5 GB
md5:4dd5f79b38be04f5f2b9a22355aa2c76    3.4 GB
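
The md5 checksums listed above can be used to check that a download completed intact. A minimal sketch follows, assuming GNU coreutils `md5sum` is installed (on macOS, `md5 -q` is the usual equivalent); the function name `verify_md5` is hypothetical and not part of the dataset.

```shell
# Hypothetical helper (not part of the dataset): compares the MD5 digest of a
# file with an expected value. Assumes GNU coreutils `md5sum` is available.
# Usage: verify_md5 <file> <expected_md5>
verify_md5() {
    # md5sum prints "<digest>  <filename>"; keep only the digest
    actual=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (expected $2, got $actual)" >&2
        return 1
    fi
}
```

For example, `verify_md5 downloaded_file.cram 366fe8cc20b48ea555f97b7a7039676d` checks a file against the first checksum in the list; `downloaded_file.cram` is a placeholder, since the file names are not reproduced in the list.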

Additional details

References

  • Alves, Renato, Coelho, Luis Pedro, Huerta-Cepas, Jaime, & Bork, Peer. (2019). Simulated metagenomes with quality and abundance distributions derived from real samples (Version 1.0.1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2560288