3631711
doi
10.5281/zenodo.3631711
oai:zenodo.org:3631711
Strowig, Till
Helmholtz Centre for Infection Research
iMGMC - integrated Mouse Gut Metagenomic Catalog
Lesker, Till Robin
Helmholtz Centre for Infection Research
doi:10.1101/528737
doi:10.1016/j.celrep.2020.02.036
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
mouse gut
metagenome
gene catalog
Metagenome-assembled genomes (MAGs)
<p><em>Creation of a new mouse gut gene catalog with special features:</em></p>
<ul>
<li>more diverse samples from different studies (12 vendors, including wild mice and various gut locations)</li>
<li>clustering-free approach: all-in-one assembly that tracks each ORF to its contig and each contig to its bin</li>
<li>higher taxonomic resolution and greater accuracy by using contigs for annotation</li>
<li>16S rRNA gene integration via linkage to bins</li>
<li>expansion by 20,927 MAGs from sample-wise assembly of 871 mouse gut metagenomic samples, representing 1,296 species</li>
</ul>
<p>Code used: <a href="https://github.com/tillrobin/iMGMC">https://github.com/tillrobin/iMGMC</a></p>
<p>The vast complexity of host-associated microbial ecosystems requires host-specific reference catalogs to survey the functions and diversity of these communities. We generated a comprehensive resource, the integrated mouse gut metagenome catalog (iMGMC), comprising 4.6 million unique genes and 660 metagenome-assembled genomes (MAGs) with many of them (485 MAGs, 73%) linked to reconstructed full-length 16S rRNA gene sequences. iMGMC enables unprecedented coverage and taxonomic resolution of the mouse gut microbiota, i.e. more than 92% of MAGs lack species-level representatives in public repositories (<95% ANI match). The integration of MAGs and 16S rRNA gene data allows a more accurate prediction of functional profiles of communities than based on 16S rRNA amplicons alone. Accompanying iMGMC we provide a set of MAGs representing 1,296 gut bacteria obtained through complementary assembly strategies. We envision that integrated resources such as iMGMC together with MAG collections will enhance the resolution of numerous existing and future sequencing-based studies.</p>
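<p>Gene catalog:</p>
<table>
<tr><th>Description</th><th>Size</th><th>Filename</th></tr>
<tr><td>Catalog ORF sequences</td><td>1 GB</td><td>iMGMC-GeneID.fasta.gz</td></tr>
<tr><td>Full assembly contigs</td><td>1.3 GB</td><td>iMGMC-ConigID.fasta.gz</td></tr>
<tr><td>Mapping file (GeneID->ContigID->BinID)</td><td>30 MB</td><td>iMGMC-map-Gene-Contig-Bin.tab.gz</td></tr>
<tr><td>Taxonomic annotations</td><td>40 MB</td><td>iMGMC_map_taxonomy.tar.gz</td></tr>
<tr><td>Functional annotations</td><td>36 MB</td><td>iMGMC_map_functionality.tar.gz</td></tr>
<tr><td>16S rRNA sequences</td><td>2 MB</td><td>iMGMC-16SrRNAgenes.fasta</td></tr>
</table>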
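<p>As an illustration of how the GeneID->ContigID->BinID mapping file might be used, the following is a minimal Python sketch. It assumes the file is tab-separated with three columns in that order and no header row; this layout is inferred from the description above and has not been checked against the file itself. The function names and the example identifiers are hypothetical.</p>

```python
import csv
import gzip
import io
from collections import defaultdict


def load_gene_map(handle):
    """Read gene/contig/bin rows from a tab-separated text handle.

    Assumed layout (not confirmed by the record): GeneID, ContigID, BinID,
    no header row. Rows with fewer than two columns are skipped, and a
    missing or empty third column is treated as "contig not binned".
    """
    gene_to_contig = {}
    bin_to_genes = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue
        gene, contig = row[0], row[1]
        bin_id = row[2] if len(row) > 2 and row[2] else None
        gene_to_contig[gene] = contig
        if bin_id is not None:
            bin_to_genes[bin_id].append(gene)
    return gene_to_contig, dict(bin_to_genes)


def load_gene_map_gz(path):
    """Same as load_gene_map, but reads a gzip-compressed file from disk."""
    with gzip.open(path, "rt", newline="") as handle:
        return load_gene_map(handle)


# Tiny in-memory example with made-up identifiers (not real iMGMC IDs).
example = io.StringIO(
    "gene1\tcontigA\tbin1\n"
    "gene2\tcontigA\tbin1\n"
    "gene3\tcontigB\t\n"
)
gene_to_contig, bin_to_genes = load_gene_map(example)
print(gene_to_contig["gene3"])
print(bin_to_genes["bin1"])
```

<p>For the downloaded file, <code>load_gene_map_gz("iMGMC-map-Gene-Contig-Bin.tab.gz")</code> would be the entry point. With several million genes, holding both dictionaries in memory may be costly, so streaming the rows may be preferable for large-scale use.</p>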
<p>Metagenome-assembled genomes (MAGs):</p>
<table>
<tr><th>Description</th><th>Size</th><th>Filename</th></tr>
<tr><td>Integrated MAGs</td><td>0.5 GB</td><td>iMGMC-MAGs.tar.gz</td></tr>
<tr><td>Representative mMAGs (n=1,296)</td><td>1 GB</td><td>iMGMC-mMAGs-dereplicated_genomes.tar.gz</td></tr>
<tr><td>Representative hqMAGs (n=830)</td><td>0.7 GB</td><td>iMGMC-hqMAGs-dereplicated_genomes.tar.gz</td></tr>
<tr><td>All mMAGs (n=20,927)</td><td>15 GB</td><td>iMGMC-mMAGs.tar.gz</td></tr>
<tr><td>Annotations by CheckM, dRep clustering, GTDB-Tk</td><td>2 MB</td><td>MAG-annotation_CheckM_dRep_GTDB-Tk.tar.gz</td></tr>
<tr><td>Functional annotations (hqMAGs by eggNOG-mapper v2)</td><td>187 MB</td><td>hqMAGs.emapper.annotations.gz</td></tr>
</table>
Acknowledgements: TS was funded by the Helmholtz Association (VH-NG-933), by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, STR-1343/1 and STR-1343/2) and the European Union (StG337251). JFB was funded by the DFG under Germany's Excellence Strategy – EXC 22167-390884018 and by the DFG Collaborative Research Center (CRC) 1182 "Origin and Function of Metaorganisms". TC received funding from the DFG (project CL481/2-1 and grants within Collaborative Research Center 1382).
Zenodo
2020-01-31
info:eu-repo/semantics/other
3631710
1
1583328459.006026
196548995
md5:d4afcb3987c47098850cf3a69447796e
https://zenodo.org/records/3631711/files/hqMAGs.emapper.annotations.gz
15406016962
md5:9eedd41d115127701e0cb2b6ac3eed29
https://zenodo.org/records/3631711/files/iMGMC-mMAGs.tar.gz
579691655
md5:71ab991698b358ac37729477b6d05615
https://zenodo.org/records/3631711/files/iMGMC-MAGs.tar.gz
1860949
md5:b011dd70e8c8005c252aedba537a9178
https://zenodo.org/records/3631711/files/MAG-annotation_CheckM_dRep_GTDB-Tk.tar.gz
744144471
md5:58d2aabd73ec9b580d2aeade4b949057
https://zenodo.org/records/3631711/files/iMGMC-hqMAGs-dereplicated_genomes.tar.gz
1112788489
md5:293bb3922237dea87df9cac8d9e7600f
https://zenodo.org/records/3631711/files/iMGMC-GeneID.fasta.gz
1045663202
md5:434b0a269af6bc11bb1dd22acc17eb33
https://zenodo.org/records/3631711/files/iMGMC-mMAGs-dereplicated_genomes.tar.gz
1393901680
md5:32a6219d4323cdcc10b275b60a314179
https://zenodo.org/records/3631711/files/iMGMC-ConigID.fasta.gz
38181359
md5:d1b64aecd3d397d009f036a0a313e736
https://zenodo.org/records/3631711/files/iMGMC_map_functionality.tar.gz
2025119
md5:56a4c3e4a3adfe027cb29f888ece3d1b
https://zenodo.org/records/3631711/files/iMGMC-16SrRNAgenes.fasta
41750604
md5:2cc20450e5bf5f783ccc1894b19aaf8d
https://zenodo.org/records/3631711/files/iMGMC_map_taxonomy.tar.gz
32715456
md5:c65b1e54f9d856ca745809b8f4d6092b
https://zenodo.org/records/3631711/files/iMGMC-map-Gene-Contig-Bin.tab.gz
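Each file in this record is listed with its size in bytes and an MD5 checksum, so a download can be checked for corruption before use. The following Python sketch shows one way to do that, assuming the file has already been saved locally under its original name. The function names are hypothetical; the checksum string is copied from the entry for iMGMC-map-Gene-Contig-Bin.tab.gz above.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected):
    """Compare a local file with a checksum written as 'md5:<hex>'."""
    algorithm, _, hex_digest = expected.partition(":")
    if algorithm != "md5" or not hex_digest:
        raise ValueError("expected a checksum of the form 'md5:<hex>'")
    return md5_of_file(path) == hex_digest.lower()


# Checksum as listed in this record for iMGMC-map-Gene-Contig-Bin.tab.gz.
EXPECTED_MD5 = "md5:c65b1e54f9d856ca745809b8f4d6092b"

# Self-check on a throwaway file so the sketch runs without any download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"iMGMC")
    tmp_path = tmp.name
try:
    self_check = matches_record(tmp_path, "md5:" + hashlib.md5(b"iMGMC").hexdigest())
finally:
    os.remove(tmp_path)
print(self_check)
```

After downloading, `matches_record("iMGMC-map-Gene-Contig-Bin.tab.gz", EXPECTED_MD5)` should return `True` for an intact file. MD5 is adequate here for detecting transfer errors, although it is not a safeguard against deliberate tampering.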
public
10.1101/528737
Is part of
doi
10.1016/j.celrep.2020.02.036
Is part of
doi
10.5281/zenodo.3631710
isVersionOf
doi
Cell Reports
30
9
2909-2922
2020-01-31