Published February 28, 2024
| Version v3
Dataset
Open
A catalog of genes and species of the brown rat (Rattus norvegicus) gut microbiota
Creators
- 1. Université Paris-Saclay, INRAE
- 2. Université de Nantes, UMR_A 1280 Physiologie des Adaptations Nutritionnelles
Description
Dataset overview
We built a catalog of 5.9 million genes found in the brown rat gut microbiota. Co-abundant genes were binned into 1627 Metagenomic Species, for which we provide taxonomic labels.
This dataset can be used to analyze shotgun sequencing data of the brown rat gut microbiota.
Data sources
1. Rat fecal (and milk) samples characterized by shotgun metagenomic sequencing during the Mamiprooffi project. Sequencing data will be submitted soon to the European Nucleotide Archive (Bioproject PRJEB57230).
2. The gene catalog of the Sprague-Dawley rat gut metagenome published by Pan et al.
Metagenomic assembly
Metagenomic assembly was performed on the Mamiprooffi samples (Data Source 1) with SPAdes (parameters: --iontorrent --careful). Contigs shorter than 1500 bp, or that aligned successfully to the rat genome (Rnor_6.0), were removed.
Non-redundant gene catalog
Genes were predicted on all contigs with Prodigal (parameters: -m -p meta). Genes with a missing start codon or shorter than 99 bp were discarded.
Then, partial and complete genes were clustered separately with cd-hit-est (parameters: -c 0.95 -aS 0.90 -G 0 -d 0 -M 0 -T 0). Finally, these two non-redundant gene sets were merged with the previously published catalog (Data Source 2) using cd-hit-est-2d, considering complete genes first (contact us for further details).
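The gene filtering step described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used to build the catalog: it assumes the standard Prodigal FASTA header layout (`>id # start # end # strand # key=value;...`) and treats `start_type=Edge` as the marker of a missing start codon. The function names `parse_prodigal_header`, `keep_gene` and `is_complete` are hypothetical.

```python
def parse_prodigal_header(header):
    """Split a Prodigal FASTA header into (gene_id, start, end, attributes)."""
    fields = [f.strip() for f in header.lstrip(">").split(" # ")]
    gene_id, start, end = fields[0], int(fields[1]), int(fields[2])
    # The last field holds semicolon-separated key=value pairs.
    attrs = dict(item.split("=", 1) for item in fields[4].split(";") if "=" in item)
    return gene_id, start, end, attrs


def keep_gene(header, min_length=99):
    """Keep genes that have a start codon and are at least min_length bp long.

    Assumption: a gene whose start runs off the contig edge is reported by
    Prodigal with start_type=Edge, which is used here as "missing start codon".
    """
    _, start, end, attrs = parse_prodigal_header(header)
    length = end - start + 1
    has_start = attrs.get("start_type") != "Edge"
    return has_start and length >= min_length


def is_complete(header):
    """A gene is complete when neither edge is flagged as partial (partial=00)."""
    _, _, _, attrs = parse_prodigal_header(header)
    return attrs.get("partial") == "00"
```

Genes passing `keep_gene` could then be split with `is_complete` into the complete and partial sets that are clustered separately.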
Functional annotation
KEGG Orthologs (KOs) were assigned to genes of the final catalog with KofamScan (version 1.3.0, KEGG 107 database).
Metagenomic Species
Using the Meteor software suite, reads from samples in Bioprojects PRJEB57230 and PRJEB22973 were mapped against the final non-redundant catalog to build a raw gene abundance table (5.9 million genes quantified in 370 samples). This table was submitted to MSPminer and Canopy. A total of 1627 clusters of co-abundant genes, or MetaGenomic Species (MGS), were discovered.
Quality control of each MGS was manually performed by visualizing heatmaps representative of the normalized gene abundance profiles.
Taxonomic annotation of Metagenomic Species
MGS taxonomic annotation was performed by aligning all core and accessory genes against the GTDB r214 representative genomes using blastn [4] (version 2.10.1, task = megablast, word_size = 16). The 20 best hits for each gene were kept. A species-level assignment was given if > 50% of the genes matched a GTDB representative genome with a mean identity ≥ 95% and a mean gene length coverage ≥ 90%. The remaining MGS were assigned to a higher taxonomic level (genus to superkingdom) if more than 50% of their genes had the same annotation.
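The species-level rule above can be illustrated with a short sketch. It is one possible reading of the criteria, not the authors' implementation: hits are assumed to be already parsed into `(gene_id, genome_id, identity_pct, coverage_pct)` tuples, the mean identity and coverage are computed over the genes that hit a given genome, and the function name `assign_species` is hypothetical.

```python
from collections import defaultdict


def assign_species(n_genes, hits, min_fraction=0.5,
                   min_identity=95.0, min_coverage=90.0):
    """Return the GTDB genome meeting the species-level criteria, or None.

    n_genes: total number of genes in the MGS.
    hits: iterable of (gene_id, genome_id, identity_pct, coverage_pct).
    """
    # Keep the best-identity hit of each gene against each genome.
    best = defaultdict(dict)
    for gene, genome, identity, coverage in hits:
        previous = best[genome].get(gene)
        if previous is None or identity > previous[0]:
            best[genome][gene] = (identity, coverage)

    candidates = []
    for genome, genes in best.items():
        fraction = len(genes) / n_genes
        mean_identity = sum(i for i, _ in genes.values()) / len(genes)
        mean_coverage = sum(c for _, c in genes.values()) / len(genes)
        # Strictly more than 50% of genes, identity >= 95%, coverage >= 90%.
        if (fraction > min_fraction
                and mean_identity >= min_identity
                and mean_coverage >= min_coverage):
            candidates.append((fraction, mean_identity, genome))

    # If several genomes qualify, prefer the one matched by the most genes.
    return max(candidates)[2] if candidates else None
```

An MGS for which this returns `None` would fall through to the higher-rank assignment, which is not covered by this sketch.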
Files
(10.4 GB)
MD5 checksum | Size |
---|---|
md5:ec15c379263488645bbfad45c601aadb | 8.4 GB |
md5:1f7471656b76ca45354d1ff8b131d77e | 2.0 GB |
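After downloading, the MD5 checksums listed above can be used to verify file integrity. The sketch below uses only the Python standard library; the helper names `md5sum` and `matches_checksum` are hypothetical, and the path passed to them is whatever local name the downloaded file has.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:ec15c3...'."""
    # Accept the checksum with or without the 'md5:' prefix.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5sum(path) == expected_hex
```

Reading in chunks keeps memory use constant, which matters for multi-gigabyte files like the ones in this record.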