Training Data for "Taxonomic Profiling of Metagenomic Data" tutorial

Batut, Bérénice; Hampe, Sophia

doi:10.5281/zenodo.7871630

Published April 27, 2023 | Version v1

Journal article Open

Metagenomics involves the extraction, sequencing and analysis of combined genomic DNA from entire microbiome samples. The data therefore contain DNA from many different organisms with different taxonomic backgrounds.

The investigation of the microorganisms present at a specific site and their relative abundance is also called "microbial community profiling". The basis for this is to find out which microorganisms are present in the sample. This is possible for all known microbes for which a species-specific DNA sequence is known. To do so, we try to identify the taxon to which each individual read belongs. Several approaches exist to profile a community.
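To make the idea of relative abundance concrete, here is a minimal sketch that turns per-read taxon assignments into a community profile. The function name `relative_abundance`, the taxon labels and the toy read assignments are all invented for illustration; they are not taken from the dataset, and real profiling tools use more elaborate methods.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Return {taxon: fraction of reads assigned to that taxon}."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    # An empty input yields an empty profile (no division takes place).
    return {taxon: n / total for taxon, n in counts.items()}


# Toy input (hypothetical): one taxon label per read.
toy_assignments = ["taxon_A", "taxon_A", "taxon_B", "taxon_A", "unclassified"]
profile = relative_abundance(toy_assignments)

for taxon, fraction in sorted(profile.items()):
    print(f"{taxon}\t{fraction:.2f}")
```

Reads that cannot be assigned are kept here as their own "unclassified" category, so the fractions always sum to one; whether to include them is a design choice.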

In this tutorial, we will learn some theory of taxonomic profiling, how to run taxonomic profiling tools, and how to visualize their outputs. The dataset we will use for this tutorial comes from an oasis in the Mexican desert called Cuatro Ciénegas (Okie et al. 2020). The researchers were interested in genomic traits that affect the rates and costs of biochemical information processing within cells. They performed a whole-ecosystem experiment, fertilizing the pond to achieve nutrient-enriched conditions.

Here we use 2 datasets:

JP4D: a microbiome sample collected from the Lagunita Fertilized Pond
JC1A: a control sample from a control mesocosm

The data files are named after their sample: the first four characters of each filename give the sample ID. The data are paired-end, with R1 being the forward reads and R2 being the reverse reads. Additionally, the reads have been trimmed using cutadapt.
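The naming convention can be used to pair forward and reverse files per sample. The following is a minimal sketch, assuming the filenames from this record; the helper `group_paired_reads` is a hypothetical name introduced here, not part of the tutorial.

```python
from collections import defaultdict


def group_paired_reads(filenames):
    """Group read files into {sample_id: {"R1": name, "R2": name}}."""
    samples = defaultdict(dict)
    for name in filenames:
        sample_id = name[:4]  # the first four characters identify the sample
        if "_R1" in name:
            samples[sample_id]["R1"] = name  # forward reads
        elif "_R2" in name:
            samples[sample_id]["R2"] = name  # reverse reads
        # Files without an R1/R2 tag (e.g. metadata.tabular) are skipped.
    return dict(samples)


files = [
    "JC1A_R1.fastqsanger.gz",
    "JC1A_R2.fastqsanger.gz",
    "JP4D_R1.fastqsanger.gz",
    "JP4D_R2.fastqsanger.gz",
    "metadata.tabular",
]
pairs = group_paired_reads(files)

for sample_id, mates in sorted(pairs.items()):
    print(sample_id, mates["R1"], mates["R2"])
```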

Files (424.0 MB)

Name	MD5	Size
JC1A_R1.fastqsanger.gz	d81ab0af18820753e1d57eb3cd793485	21.5 MB
JC1A_R2.fastqsanger.gz	8d0606ff0f73044eb9786a18c4272d99	20.5 MB
JP4D_R1.fastqsanger.gz	cad03d2dbdcc721c82362eb8fa812d17	182.7 MB
JP4D_R2.fastqsanger.gz	ed76606b2a30c4221f1a3c5265c7a2b8	199.3 MB
metadata.tabular	ae982e901a0958d2abf1886d3a0ecc5e	620 Bytes
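The MD5 checksums listed with each file can be used to check that a download is intact. The sketch below assumes the files were saved under their original names in the current working directory; the helper names `md5_of_file` and `check_downloads` are hypothetical, and the expected values are copied from the file listing.

```python
import hashlib
from pathlib import Path

# Checksums as published in the file listing of this record.
EXPECTED_MD5 = {
    "JC1A_R1.fastqsanger.gz": "d81ab0af18820753e1d57eb3cd793485",
    "JC1A_R2.fastqsanger.gz": "8d0606ff0f73044eb9786a18c4272d99",
    "JP4D_R1.fastqsanger.gz": "cad03d2dbdcc721c82362eb8fa812d17",
    "JP4D_R2.fastqsanger.gz": "ed76606b2a30c4221f1a3c5265c7a2b8",
    "metadata.tabular": "ae982e901a0958d2abf1886d3a0ecc5e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Return {filename: "ok" | "MISMATCH" | "missing"} for each listed file."""
    report = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.exists():
            report[name] = "missing"
        elif md5_of_file(path) == expected:
            report[name] = "ok"
        else:
            report[name] = "MISMATCH"
    return report


for name, status in check_downloads().items():
    print(f"{name}\t{status}")
```

Reading in chunks keeps memory use low for the larger JP4D files.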

	All versions	This version
Views	1,477	1,462
Downloads	3,443	3,430
Data volume	494.2 GB	491.4 GB
