Published September 10, 2025
| Version v10
Dataset
Open
MicroReset: characterization of the rabbit (Oryctolagus cuniculus) fecal metagenome and resistome by deep shotgun sequencing
Creators
- 1. Université Paris-Saclay, INRAE
- 2. INRAE, Université Paris-Saclay
- 3. INRAE, Equipe NED, UMR1388 GENPHYSE
- 4. Université Paris-cité, Institut Pasteur
- 5. INSERM, IAME
- 6. LABOVET Conseil, Réseau Cristal
- 7. Clinique Vétérinaire des Marchés de Bretagne
Description
Data source
The dataset was generated from 30 rabbit fecal samples subjected to deep shotgun metagenomic sequencing. The sequencing data are available under BioProject PRJEB50625.
Metagenomic Assembly
Raw sequencing reads were first pre-processed using fastp for adapter removal and quality trimming. Host-derived reads were filtered out by mapping to the rabbit reference genome (GCF_000001635.27) using Bowtie2 and removing mapped reads with Samtools. Each sample was individually assembled using metaSPAdes. Contigs shorter than 1,500 bp were excluded from downstream analysis.
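The contig length cutoff described above can be illustrated with a minimal sketch. It is not the authors' script: the function names (`read_fasta`, `filter_contigs`) are hypothetical, and it assumes only that the assembly is a plain FASTA file and that contigs of exactly 1,500 bp are kept.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_CONTIG_LEN = 1500  # cutoff stated in the description


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines: Iterable[str],
                   min_len: int = MIN_CONTIG_LEN) -> List[Tuple[str, str]]:
    """Keep contigs whose length is at least min_len (hypothetical helper)."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) >= min_len]
```

For a real assembly, an open file handle can be passed directly, for example `filter_contigs(open("contigs.fasta"))`, where the file name is a placeholder.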
MAG Recovery
Reads from each sample were mapped to all 30 assemblies (30×30 mappings) using Bowtie2. The resulting alignments were sorted and indexed with Samtools. Contig coverage across all samples was computed using `jgi_summarize_bam_contig_depths`. Binning was performed with MetaBAT 2 and SemiBin v1.3. MAG quality was assessed with CheckM. Only high-quality MAGs (≥70% completeness, ≤5% contamination, N50 ≥ 8 kb) were retained.
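As an illustration of the retention criteria above, the following sketch applies the three thresholds to a MAG record. The `Mag` class and function names are hypothetical, and the completeness and contamination values are assumed to be percentages as reported by a quality tool such as CheckM; how the authors applied the filter is not stated in the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mag:
    """Hypothetical record for one bin; not a structure from the dataset."""
    name: str
    completeness: float    # percent
    contamination: float   # percent
    contig_lengths: List[int]


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def passes_quality(mag: Mag, min_completeness: float = 70.0,
                   max_contamination: float = 5.0, min_n50: int = 8000) -> bool:
    """Apply the thresholds quoted in the description."""
    return (mag.completeness >= min_completeness
            and mag.contamination <= max_contamination
            and n50(mag.contig_lengths) >= min_n50)
```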
Non-Redundant Gene Catalog
Gene prediction was carried out using Prodigal on all contigs from the current study (with `-m -p meta`). Genes shorter than 90 bp or lacking start/stop codons were discarded. The remaining genes from both sources were pooled and clustered using CD-HIT-EST (parameters: `-c 0.95 -aS 0.90 -G 0 -d 0 -M 0 -T 0`). The longest contigs were used to select representative genes.
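The gene filter described above could look roughly like the sketch below. It assumes the standard Prodigal FASTA header layout (`>id # start # end # strand # ID=...;partial=00;...`), where `partial=00` marks a gene with both boundaries inside the contig. The function names are hypothetical, and treating `partial=00` as "has start and stop codons" is an interpretation rather than something the description spells out.

```python
import re
from typing import Optional, Tuple

MIN_GENE_LEN = 90  # cutoff stated in the description
_PARTIAL = re.compile(r"partial=(\d\d)")


def parse_prodigal_header(header: str) -> Tuple[str, int, int, Optional[str]]:
    """Return (gene_id, start, end, partial_flag) from a Prodigal FASTA header."""
    fields = [f.strip() for f in header.lstrip(">").split(" # ")]
    gene_id, start, end = fields[0], int(fields[1]), int(fields[2])
    match = _PARTIAL.search(fields[4])
    return gene_id, start, end, match.group(1) if match else None


def keep_gene(header: str, min_len: int = MIN_GENE_LEN) -> bool:
    """Keep genes that are at least min_len bp long and complete at both ends."""
    _, start, end, partial = parse_prodigal_header(header)
    length = end - start + 1  # coordinates are 1-based and inclusive
    return length >= min_len and partial == "00"
```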
MSP Recovery
Shotgun reads from the 30 samples were aligned to the non-redundant gene catalog using the Meteor suite, generating a gene abundance matrix (5.7 million genes × 30 samples). Co-abundant genes were grouped into 1,053 Metagenomic Species Pan-genomes (MSPs) using MSPminer.
Taxonomic Annotation of MSPs
MAGs representing each species were taxonomically annotated using GTDB-Tk with GTDB release r214. The resulting taxonomy was propagated to the corresponding MSPs.
Phylogenetic Tree Construction
A set of 39 universal phylogenetic marker genes was extracted from the 1,053 MSPs (or their corresponding MAGs, when available) using fetchMGs. Each marker was independently aligned using MUSCLE, and the alignments were concatenated and trimmed using trimAl (parameter: `-automated1`). A maximum-likelihood phylogenetic tree was constructed with FastTreeMP (parameters: `-gamma -pseudo -spr -mlacc 3 -slownni`).
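The concatenation step can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the function name is hypothetical, and padding a genome with gaps when a marker is missing is an assumption, since the description does not say how absent markers were handled.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]],
                           taxa: List[str]) -> Dict[str, str]:
    """Join per-marker alignments into one sequence per taxon.

    Each element of `alignments` maps taxon name -> aligned sequence for one
    marker. Taxa missing from a marker are padded with gaps (an assumption).
    """
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

Every output sequence has the same length, which is what a tree-building program expects from a concatenated alignment.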
Mapping rate distribution across public cohorts
We generated mapping rate distribution plots using Meteor2 (default parameters) for PRJEB50625 (the cohort used in catalogue assembly).
Files
catalogue_mapping_rate_oc_5_7_gut.pdf
Files
(8.7 GB)
Checksum | Size |
---|---|
md5:d303711190cff0d5f37087b32a8a886e | 4.7 kB |
md5:738b371d7a7c5f678bfd8774ade6646d | 8.1 GB |
md5:16dd45d4b5ebbc09f41bdc3273ce9848 | 516.7 MB |
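
Since each file is listed with an MD5 checksum, a download can be checked against it. The sketch below is one way to do this with the Python standard library; the function names are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as in the listing ('md5:<hex>')."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for the multi-gigabyte file in this record.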