Training Data for "Binning of metagenomic sequencing data" tutorial
Description
Metagenomics is the study of genetic material recovered directly from environmental samples, such as soil, water, or gut contents, without the need to isolate or cultivate individual organisms. Metagenomic binning is the process of classifying DNA sequences obtained from metagenomic sequencing into discrete groups, or bins, based on their similarity to each other. The goal is to assign the DNA sequences to the organisms or taxonomic groups they originate from, allowing for a better understanding of the diversity and functions of the microbial communities present in the sample. This is typically achieved through computational methods that use sequence similarity, composition, and other features to group the sequences into bins.
There are two main types of metagenomic binning: reference-based and de novo.
- Reference-based binning involves aligning the sequences to a database of known genomes or reference sequences.
- De novo binning involves clustering the sequences based on similarity, without prior knowledge of the organisms or reference sequences present in the sample.
Both methods have their strengths and limitations, and researchers often combine them to improve the accuracy of their binning results. Metagenomic binning is an important tool for understanding the functional potential of microbial communities in various environments and has applications in fields such as biotechnology, environmental science, and human health.
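To make the idea of composition-based de novo binning more concrete, the following minimal Python sketch computes a tetranucleotide (4-mer) frequency profile for each contig and compares profiles with a Euclidean distance. It is an illustration only, not the MetaBAT2 implementation: the function names, the toy contigs, and the choice of distance are assumptions made for this example, and real binners also use additional signals such as read coverage.

```python
from itertools import product
from math import sqrt

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_profile(seq):
    """Return the normalized 4-mer frequencies (256 values) of one sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def profile_distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical toy contigs: two AT-rich and one GC-rich.
contigs = {
    "contig_a": "ATATATATATATATATATAT",
    "contig_b": "TATATATATATATATATATA",
    "contig_c": "GCGCGCGCGGCCGCGCGCGC",
}
profiles = {name: tetranucleotide_profile(seq) for name, seq in contigs.items()}

d_ab = profile_distance(profiles["contig_a"], profiles["contig_b"])
d_ac = profile_distance(profiles["contig_a"], profiles["contig_c"])
print(f"a vs b: {d_ab:.3f}, a vs c: {d_ac:.3f}")
```

Contigs with similar composition (here the two AT-rich ones) end up close together, so a clustering step applied to such profiles would tend to place them in the same bin.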
In this tutorial, we will learn how to run metagenomic binning tools and evaluate the quality of the results. To do that, we will use the MetaBAT2 algorithm together with data from the study "Temporal shotgun metagenomic dissection of the coffee fermentation ecosystem". For an in-depth analysis of the structure and functions of the coffee microbiome, that study performed a temporal shotgun metagenomic analysis across six time points. The six samples were sequenced on an Illumina MiSeq using whole-genome sequencing.
Based on the six original datasets from the coffee fermentation system, we generated mock datasets for this tutorial.
Files
26_ MetaBAT2 on data ERR2231567_ Bins.zip
Total size: 29.4 MB
MD5 checksum | Size |
---|---|
md5:7b055fafe7e02b042dc5eb6325a6ff19 | 4.9 MB |
md5:510bcfdcaea28e3ed6b1779e72412604 | 5.1 MB |
md5:71a039eeacd6e7d316c10e14b99c25e6 | 4.5 MB |
md5:9f52b35dcd54abf972947a1853fc3466 | 4.8 MB |
md5:7a43d09654d597fe333c7e5aadbd9e71 | 4.9 MB |
md5:b7cf7e3df0a6dfd39075cef903cdfab4 | 5.3 MB |
md5:4cd96649fd7c8ce677fdcbcbbbacebeb | 577 Bytes |
md5:b7c5df483804a429b988b08f0ba502ce | 389 Bytes |
md5:4b2ba8fa1481ff627f09754d1a090643 | 449 Bytes |
md5:9a35dcc9e5f78136330125c87f9b120c | 295 Bytes |
md5:68eff5bd537d691fd72a681e1a57e9f8 | 371 Bytes |
md5:5e375df6cba7c8c873ba5306742ed9c5 | 642 Bytes |
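The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The sketch below shows one way to do this with Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, introduced only for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected_hex
```

For a downloaded file, a call such as `matches_checksum(path_to_download, "md5:<value from the listing>")` returns `True` when the local copy matches the listed checksum. Which checksum belongs to which file has to be taken from the record page itself.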