There is a newer version of the record available.

Published February 28, 2024 | Version v3
Dataset Open

A catalog of genes and species of the brown rat (Rattus norvegicus) gut microbiota

  • 1. Université Paris-Saclay, INRAE
  • 2. Université de Nantes, UMR_A 1280 Physiologie des Adaptations Nutritionnelles

Description

Dataset overview


We built a catalog of 5.9M genes found in the brown rat gut microbiota. Co-abundant genes were binned into 1627 Metagenomic Species (MGS), for which we provide taxonomic labels.

This dataset can be used to analyze shotgun sequencing data of the brown rat gut microbiota.

Data sources


1. Rat fecal (and milk) samples characterized by shotgun metagenomic sequencing during the Mamiprooffi project. Sequencing data will soon be submitted to the European Nucleotide Archive (Bioproject PRJEB57230).
2. The gene catalog of the Sprague-Dawley rat gut metagenome published by Pan et al.

Metagenomic assembly


Metagenomic assembly was performed on the Mamiprooffi samples (Data Source 1) with SPAdes (parameters: --iontorrent --careful). Contigs shorter than 1500 bp or that aligned successfully to the rat genome (Rnor_6.0) were removed.
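
The contig filtering step above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the FASTA parsing, and the `host_aligned_ids` set (contig IDs that an external aligner reported as matching the rat genome) are assumptions made for illustration. Only the 1500 bp threshold comes from the description.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

# Threshold taken from the description: contigs shorter than 1500 bp are removed.
MIN_CONTIG_LENGTH = 1500


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(
    handle: TextIO,
    host_aligned_ids: Iterable[str] = (),
    min_length: int = MIN_CONTIG_LENGTH,
) -> Iterator[Tuple[str, str]]:
    """Keep contigs that are long enough and were not aligned to the host.

    host_aligned_ids is a hypothetical input: the IDs of contigs that a
    separate alignment against Rnor_6.0 flagged as host-derived.
    """
    host = set(host_aligned_ids)
    for header, sequence in read_fasta(handle):
        contig_id = header.split()[0]
        if len(sequence) >= min_length and contig_id not in host:
            yield header, sequence


# Toy input: one short contig, one contig at the threshold, one host contig.
example = io.StringIO(
    ">c1\n" + "A" * 1499 + "\n>c2\n" + "C" * 1500 + "\n>c3\n" + "G" * 2000 + "\n"
)
kept = [header for header, _ in filter_contigs(example, host_aligned_ids={"c3"})]
```

In this toy input only `c2` passes both criteria: `c1` is one base too short and `c3` is listed as host-aligned.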

Non-redundant gene catalog

Genes were predicted on all contigs with Prodigal (parameters: -m -p meta). Genes with a missing start codon or shorter than 99 bp were discarded.
Then, partial and complete genes were clustered separately with cd-hit-est (parameters: -c 0.95 -aS 0.90 -G 0 -d 0 -M 0 -T 0). Finally, these two non-redundant gene sets were merged with the previously published catalog (Data Source 2) using cd-hit-est-2d, considering complete genes first (contact us for further details).
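
The gene filtering that precedes clustering can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes the standard Prodigal FASTA header layout (`id # begin # end # strand # key=value;...`), treats `start_type=Edge` as "missing start codon", and treats `partial=00` as a complete gene. The function names are hypothetical.

```python
from typing import Dict, Optional

# Threshold taken from the description: genes shorter than 99 bp are discarded.
MIN_GENE_LENGTH = 99


def parse_prodigal_header(header: str) -> Dict[str, object]:
    """Parse a Prodigal FASTA header into its coordinates and attributes."""
    fields = [field.strip() for field in header.lstrip(">").split(" # ")]
    if len(fields) != 5:
        raise ValueError(f"unexpected Prodigal header: {header!r}")
    gene_id, begin, end, strand, raw_attributes = fields
    attributes = dict(
        item.split("=", 1) for item in raw_attributes.split(";") if "=" in item
    )
    return {
        "gene_id": gene_id,
        "length": int(end) - int(begin) + 1,
        "strand": int(strand),
        "partial": attributes.get("partial"),
        "start_type": attributes.get("start_type"),
    }


def classify_gene(header: str, min_length: int = MIN_GENE_LENGTH) -> Optional[str]:
    """Return 'complete', 'partial', or None if the gene should be discarded."""
    gene = parse_prodigal_header(header)
    # Assumption: Prodigal reports start_type=Edge when no start codon was found.
    if gene["start_type"] == "Edge" or gene["length"] < min_length:
        return None
    return "complete" if gene["partial"] == "00" else "partial"


headers = [
    "c1_1 # 2 # 181 # 1 # ID=1_1;partial=10;start_type=Edge;rbs_motif=None",
    "c1_2 # 200 # 499 # -1 # ID=1_2;partial=00;start_type=ATG;rbs_motif=None",
    "c1_3 # 600 # 689 # 1 # ID=1_3;partial=00;start_type=ATG;rbs_motif=None",
    "c1_4 # 700 # 999 # 1 # ID=1_4;partial=01;start_type=GTG;rbs_motif=None",
]
classes = [classify_gene(header) for header in headers]
```

Splitting the surviving genes into "complete" and "partial" mirrors the description, where the two sets are clustered separately before being merged.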

Functional annotation

KEGG Orthologs (KOs) were assigned to genes of the final catalog with KofamScan (version 1.3.0, KEGG 107 database).

Metagenomic Species


Using the Meteor software suite, reads from samples in Bioprojects PRJEB57230 and PRJEB22973 were mapped against the final non-redundant catalog to build a raw gene abundance table (5.9 million genes quantified in 370 samples). This table was submitted to MSPminer and Canopy. A total of 1627 clusters of co-abundant genes, or Metagenomic Species (MGS), were discovered.
Each MGS was quality-controlled manually by inspecting heatmaps of its normalized gene abundance profiles.

Taxonomic annotation of Metagenomic Species


MGS taxonomic annotation was performed by aligning all core and accessory genes against the GTDB r214 representative genomes using blastn [4] (version 2.10.1, task = megablast, word_size = 16). The 20 best hits for each gene were kept. A species-level assignment was given if more than 50% of the genes matched a GTDB representative genome with a mean identity ≥ 95% and a mean gene length coverage ≥ 90%. The remaining MGS were assigned to a higher taxonomic level (genus to superkingdom) if more than 50% of their genes had the same annotation.
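
The decision rule above can be sketched in Python. This is one possible reading of the rule, not the authors' implementation: it assumes a single best hit per gene (the description keeps 20), computes the mean identity and coverage over the genes supporting the winning species, and uses hypothetical names (`GeneHit`, `assign_taxonomy`, `RANKS`).

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence, Tuple

RANKS = ("superkingdom", "phylum", "class", "order", "family", "genus", "species")


@dataclass(frozen=True)
class GeneHit:
    lineage: Tuple[str, ...]  # one label per entry in RANKS
    identity: float           # percent identity of the alignment
    coverage: float           # percent of the gene length covered


def assign_taxonomy(hits: Sequence[Optional[GeneHit]]) -> Tuple[str, str]:
    """Return (rank, label) for one MGS; hits holds one entry per gene (None = no hit)."""
    n_genes = len(hits)
    matched = [hit for hit in hits if hit is not None]
    if n_genes == 0 or not matched:
        return ("unclassified", "")

    # Species level: majority of genes on one species, with high mean identity/coverage.
    species, count = Counter(hit.lineage[-1] for hit in matched).most_common(1)[0]
    supporting = [hit for hit in matched if hit.lineage[-1] == species]
    if (
        count / n_genes > 0.5
        and mean(hit.identity for hit in supporting) >= 95
        and mean(hit.coverage for hit in supporting) >= 90
    ):
        return ("species", species)

    # Otherwise walk up from genus to superkingdom and take the first majority label.
    for index in range(len(RANKS) - 2, -1, -1):
        label, count = Counter(hit.lineage[index] for hit in matched).most_common(1)[0]
        if count / n_genes > 0.5:
            return (RANKS[index], label)
    return ("unclassified", "")


# Toy lineage with placeholder labels (not real GTDB names).
lineage = ("d", "p", "c", "o", "f", "genus_G", "species_A")
close = [GeneHit(lineage, 97.0, 95.0)] * 6 + [None] * 4
distant = [GeneHit(lineage, 90.0, 95.0)] * 6 + [None] * 4
sparse = [GeneHit(lineage, 97.0, 95.0)] * 3 + [None] * 7
results = [assign_taxonomy(close), assign_taxonomy(distant), assign_taxonomy(sparse)]
```

In the toy data, the MGS with close hits is labeled at species level, the one with more distant hits falls back to its genus, and the one with hits for only 3 of 10 genes stays unclassified.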

Files

Files (10.4 GB)

md5:ec15c379263488645bbfad45c601aadb (8.4 GB)
md5:1f7471656b76ca45354d1ff8b131d77e (2.0 GB)
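
Given the size of the files, it may be worth checking a download against the MD5 checksums listed above. The sketch below is a generic example using Python's standard `hashlib` module; the function names are hypothetical, and the file path you pass in is whatever name the archive has on your disk (file names are not shown in this listing).

```python
import hashlib
from pathlib import Path
from typing import Union


def file_md5(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so multi-gigabyte files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Union[str, Path], expected: str) -> bool:
    """Compare a file against an expected checksum such as 'md5:ec15...'."""
    expected_hex = expected.lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return file_md5(path) == expected_hex
```

For example, `matches_checksum(downloaded_path, "md5:ec15c379263488645bbfad45c601aadb")` would return `True` only if the 8.4 GB file was downloaded intact, where `downloaded_path` is a placeholder for the local file.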