Published April 15, 2025 | Version v3
Dataset Open

Data and code from: Host-microbiome associations of native and invasive small mammals across a tropical urban-rural ecotone

  • 1. Swansea University, Department of Biosciences
  • 2. Sabah Parks
  • 3. Swansea University, Swansea University Medical School
  • 4. Instituto de Investigaciones Marinas, Laboratorio de Biotecnología Acuática, Vigo, ES

Description

16Sv4 rRNA gene sequencing data

Checklist for the 16Sv4 rRNA gene sequencing data submitted to ENA (project PRJEB81284) for the 245 samples analysed in the manuscript

This TSV file contains: the sample ID ("sample"), the study ID in ENA ("study"), information about the type of sequencing ("instrument_model", "library_name", "library_source", "library_selection", "library_strategy", "library_layout"), the file name of the forward read ("forward_file_name"), the MD5 checksum of the forward read ("forward_file_md5"), the file name of the reverse read ("reverse_file_name"), and the MD5 checksum of the reverse read ("reverse_file_md5").

The sequencing data of 354 samples from various small mammal species are stored in project PRJEB81284, but only the 245 samples from Sundamys muelleri, Rattus rattus, Rattus norvegicus and Suncus murinus were analysed in the paper. FASTQ files of the paired-end reads (R1: forward read; R2: reverse read) can be downloaded from ENA under this project accession.

File name: file_checklist_project_PRJEB81284.tsv
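As a hedged illustration of how the checklist could be used, the Python sketch below compares the MD5 checksums listed in the TSV with FASTQ files that have already been downloaded to a local folder. The function names and the local folder are hypothetical, and the column names are assumed to be "forward_file_md5" and "reverse_file_md5" without internal spaces; this is not part of the deposited scripts.

```python
import csv
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist_path, fastq_dir):
    """Compare local FASTQ files with the MD5 sums listed in the checklist.

    Returns a list of (file_name, status) tuples, where status is
    "ok", "mismatch" or "missing".
    """
    results = []
    with open(checklist_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Assumed column names (hypothetical if the real header differs).
            for name_col, md5_col in (
                ("forward_file_name", "forward_file_md5"),
                ("reverse_file_name", "reverse_file_md5"),
            ):
                fastq = Path(fastq_dir) / row[name_col]
                if not fastq.exists():
                    status = "missing"
                elif md5_of_file(fastq) == row[md5_col].strip().lower():
                    status = "ok"
                else:
                    status = "mismatch"
                results.append((row[name_col], status))
    return results
```

A call such as verify_checklist("file_checklist_project_PRJEB81284.tsv", "fastq/") would then list any file that is absent or whose checksum differs from the submitted one; the "fastq/" folder name is only an example.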

 

Bioinformatics

Bash script used to process the 16Sv4 sequences into ASVs (Amplicon Sequence Variants) using the QIIME 2 package

This shell script (.sh) contains the bash commands used to process the 16Sv4 sequences into ASVs with the QIIME 2 package.

File name: script_qiime2_Borneo_sm_microbiota.sh

Metadata required to run script_qiime2_Borneo_sm_microbiota

This TSV file contains: the sample ID, the ID of the individual small mammal, the small mammal species, the town or village in which the individual was captured, the geographic coordinates in the UTM system, and the sex and age of the individual.

File name: metadata_qiime2_Borneo_sm_microbiota

 

Statistical analysis

Data --> R_data_Borneo_sm_microbiota

Individual information of small mammal faecal samples analysed in the study

This CSV file contains data on the 245 small mammal individuals that were captured in Borneo between March 2012 and May 2013 and whose faecal content was sequenced for microbial analysis. The table contains 245 rows and 9 columns. The variables include:

  1. "Sample_ID": unique identifier for each faecal sample.
  2. "ID_individual": unique identifier for each individual (one Sample_ID corresponds to one ID_individual).
  3. "Species": small mammal species information, including genus and species.
  4. "District": district in which the individual was captured.
  5. "Town_village": town or village in which the individual was captured.
  6. "Coord_UTM_x": x coordinate (easting, analogous to longitude) of the capture location in the UTM system.
  7. "Coord_UTM_y": y coordinate (northing, analogous to latitude) of the capture location in the UTM system.
  8. "Sex": sex of the individual (female, male, unknown).
  9. "Age": stage of maturity of the individual (adult, immature, juvenile, subadult, unknown).

After filtering in R ("data_preparation.R"), 236 samples remained. --> df_samples

File name: dataset_samples.csv
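The structure described above (9 named columns, one Sample_ID per ID_individual) can be checked before analysis. The following Python sketch is one possible sanity check, not the filtering performed in "data_preparation.R"; the function name is hypothetical and a comma delimiter is assumed.

```python
import csv

EXPECTED_COLUMNS = [
    "Sample_ID", "ID_individual", "Species", "District", "Town_village",
    "Coord_UTM_x", "Coord_UTM_y", "Sex", "Age",
]


def check_samples(path, delimiter=","):
    """Return (number_of_rows, list_of_problems) for the sample table."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        header = reader.fieldnames or []
        rows = list(reader)

    problems = []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        # Without the expected columns the remaining checks cannot run.
        problems.append("missing columns: " + ", ".join(missing))
        return len(rows), problems

    sample_ids = [row["Sample_ID"] for row in rows]
    if len(set(sample_ids)) != len(sample_ids):
        problems.append("duplicated Sample_ID values")

    individuals = [row["ID_individual"] for row in rows]
    if len(set(individuals)) != len(individuals):
        problems.append("an ID_individual is linked to more than one Sample_ID")

    return len(rows), problems
```

For the deposited file, the description implies that the first returned value should be 245 and the list of problems should be empty.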

 

Trapping effort with information about environment of the trapping locations and presence-absence data for the four analysed small mammal species

This CSV file contains 3541 rows, corresponding to unique trapping locations, and 20 columns describing the environment and providing information about the presence-absence of the four studied species.

  1. "coord_id": unique identifier for each trap location, formatted as longitude_latitude.
  2. "LC20m_xxx": columns describing the landcover types within a 20 m radius around each trapping location: housing (compound and soil), sealed, soil, agriculture (grass and tree), garden (grass and tree), fallow (grass and tree), forest edge, forest, water, others. Each column contains numeric values ranging from 0 to 100, where 0 means the landcover type is absent and 100 means it covers the whole area.
  3. "PA_smue": presence-absence data for Sundamys muelleri (0 = absence, 1 = presence).
  4. "PA_rr": presence-absence data for Rattus rattus (0 = absence, 1 = presence).
  5. "PA_rn": presence-absence data for Rattus norvegicus (0 = absence, 1 = presence).
  6. "PA_smur": presence-absence data for Suncus murinus (0 = absence, 1 = presence).
  7. "ID_individual": unique identifier for each individual.

File name: dataset_environment.csv
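To illustrate how the presence-absence columns can be read, the Python sketch below counts, for each of the four species, the number and proportion of rows in which it was recorded as present. The function name is hypothetical; a comma delimiter and complete 0/1 values are assumed, and this summary is not the Generalised Additive Model analysis run in "relative_occurrence_probability.R".

```python
import csv

# Presence-absence columns and the species they refer to (from the list above).
PA_COLUMNS = {
    "PA_smue": "Sundamys muelleri",
    "PA_rr": "Rattus rattus",
    "PA_rn": "Rattus norvegicus",
    "PA_smur": "Suncus murinus",
}


def species_prevalence(path, delimiter=","):
    """Return {species: (n_present, proportion_of_rows)} for the four species."""
    counts = {col: 0 for col in PA_COLUMNS}
    n_rows = 0
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            n_rows += 1
            for col in PA_COLUMNS:
                # Values are assumed to be 0 or 1 with no missing entries.
                counts[col] += int(float(row[col]))
    return {
        species: (counts[col], counts[col] / n_rows if n_rows else 0.0)
        for col, species in PA_COLUMNS.items()
    }
```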

 

ASV abundance table for analysed small mammal faecal samples

This text file contains the number of sequences of each bacterial ASV found in each analysed sample. The table contains 6109 rows corresponding to the ASVs and 247 columns corresponding to 1) the unique code assigned to the ASV by QIIME 2 ("ASV_code"), 2-246) the 245 analysed samples, 247) an empty column that is removed during the filtering ("taxonomy").

After filtering in R ("data_preparation.R"), 1864 unique ASVs and 236 samples remained. --> asv_table

File name: dataset_asv.txt
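The layout described above (one "ASV_code" column, 245 sample columns and an empty "taxonomy" column) can be read as in the following Python sketch, which drops the "taxonomy" column and computes per-sample read totals and relative abundances. The function names are hypothetical, and a tab-delimited file with the header on its first line is assumed. The filtering that reduces the table to 1864 ASVs and 236 samples is done in "data_preparation.R" and is not reproduced here.

```python
import csv


def read_asv_table(path, delimiter="\t"):
    """Read the ASV table as (samples, {asv_code: {sample: count}})."""
    table = {}
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # Keep only the sample columns; the empty "taxonomy" column is dropped.
        samples = [
            col for col in (reader.fieldnames or [])
            if col not in ("ASV_code", "taxonomy")
        ]
        for row in reader:
            table[row["ASV_code"]] = {s: int(float(row[s])) for s in samples}
    return samples, table


def sample_totals(samples, table):
    """Total number of sequences per sample, summed over all ASVs."""
    return {s: sum(counts[s] for counts in table.values()) for s in samples}


def relative_abundance(samples, table):
    """Convert counts to per-sample proportions (0.0 for empty samples)."""
    totals = sample_totals(samples, table)
    return {
        asv: {s: (counts[s] / totals[s] if totals[s] else 0.0) for s in samples}
        for asv, counts in table.items()
    }
```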

 

ASV taxonomy table

This TSV file contains the taxonomic classification of the ASVs. Each ASV was taxonomically assigned using the SILVA database. The table contains 6109 rows corresponding to the ASVs found in the 245 analysed samples and 3 columns corresponding to 1) the unique code assigned to the ASV by QIIME 2 ("ASV_code"), 2) the concatenated taxonomic classification, from domain to species ("taxon"), 3) the confidence of the taxonomic classification ("confidence").

After filtering in R ("data_preparation.R"), 1864 unique ASVs remained. --> taxo_table

File name: dataset_taxonomy.tsv
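Because the "taxon" column concatenates all ranks into one string, it usually has to be split before summarising by family or genus. The Python sketch below shows one way to do this. It assumes a semicolon-separated string with optional rank prefixes such as "d__" or "f__", a format commonly produced by QIIME 2 with SILVA but not confirmed here for this file; the function name is hypothetical.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def split_taxon(taxon, separator=";"):
    """Split a concatenated taxonomy string into a {rank: name} dictionary.

    Ranks that are absent from the string are returned as None.
    """
    parts = [part.strip() for part in taxon.split(separator)]
    parsed = {}
    for rank, part in zip(RANKS, parts):
        # Remove an optional rank prefix such as "d__" or "f__".
        name = part.split("__", 1)[1] if "__" in part else part
        parsed[rank] = name or None
    for rank in RANKS[len(parts):]:
        parsed[rank] = None
    return parsed


# Example with a made-up, partially resolved classification string.
example = split_taxon("d__Bacteria; p__Firmicutes; c__Clostridia")
print(example["phylum"], example["family"])
```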

 

ASV phylogenetic tree

Phylogenetic tree of the ASVs, in Newick format.

File name: ASV_phylogenetic_tree.nwk

 

Phylogenetic trees (from vertlife.org project) of all small mammal species captured in Borneo during the trapping effort made between March 2012 and May 2013 (Wells et al. 2014)

These trees are used to create a final consensus tree.

File name: host_species_trees.nex

 

Colour palette for Figure 5 (ANCOMBC results)

Colour palette used in the plot displaying ANCOMBC results (Figure 5).

File name: ANCOMBC_palette_families.csv

 

Colour legend for bacterial families from the microbiome composition plot (Figure 2 and Figure S2)

Legend extracted from the microbiome composition plot, used as legend for Figure 5.

File name: legend_bacterial_families.rds

 

Scripts --> R_scripts_Borneo_sm_microbiota

The zip file contains various scripts corresponding to each step of the statistical analysis run in R.

  • data_preparation.R: data filtering, preparation and inspection.
  • relative_occurrence_probability.R: computation of relative occurrence probability for each host species using Generalised Additive Models (script to create Figure 1).
  • density_function.R: function to calculate mode and 95% CI.
  • legend_extraction_function.R: function to extract the legend from a plot, modification of the individual_legend function from the "microshades" package.
  • microbiome_composition.R: analysis of microbiome composition at host species and individual level (script to create Figure 2).
  • alpha_diversity.R: alpha diversity metrics calculation and Generalised Linear Models (GLM) testing the effect of Land Use Intensity (LUI) on alpha diversity metrics.
  • computation_beta_diversity_metrics.R: beta diversity metrics computation.
  • beta_diversity_GDMs.R: Generalised Dissimilarity Modelling testing the effect of host phylogenetic relatedness, LUI and spatial proximity on beta diversity metrics (script to create Figure 3).
  • beta_diversity_NMDS_beta_dispersion_ANOSIM.R: beta diversity visualisation and analysis using beta dispersion, Analysis of Similarity, and GLM testing the effect of LUI on beta dispersion (script to create Figure 4).
  • ANCOM-BC.R: analysis to find differentially abundant ASVs in relation to LUI for each host species (script to create Figure 5).

Abstract (English)

Global change and urbanisation profoundly alter wildlife habitats, driving native animals into novel habitats while increasing the co-occurrence between native and invasive species. Host-microbiome associations are shaped by host traits and environmental features, but little is known about their plasticity in co-occurring native and invasive species across urban-rural gradients.

Here, we explored gut microbiomes of four sympatric small mammal species along an urban-rural ecotone in Borneo, one of the planet's oldest rainforest regions experiencing recent urban expansion.

Host species identity was the strongest determinant of microbiome composition, while land use and spatial proximity shaped microbiome similarity within and among the three rat species. The urban-dwelling rat Rattus rattus had a microbiome composition more similar to that of the native, urban-adapted rat Sundamys muelleri (R. rattus’ strongest environmental niche overlap), than to the closely related urban-dwelling R. norvegicus. The urban-dwelling shrew Suncus murinus presented the most distinct microbiome. The microbiome of R. norvegicus was the most sensitive to land use intensity, exhibiting significant alterations in composition and bacterial abundance across the ecotone.

Our findings suggest that environmental niche overlap among native and invasive species promotes similar gut microbiomes. Even for omnivorous urban-dwellers with a worldwide distribution like R. norvegicus, gut microbiomes may change across fine-scale environmental gradients. Future research needs to confirm whether land use intensity can be a strong selective force on mammalian gut microbiomes, influencing the way in which native and invasive species are able to exploit novel environments.

Files

R_data_Borneo_sm_microbiota.zip

Files (1.3 MB)

MD5 checksum and size of each file:
  • md5:b548232cd4d86ea21774bf0f4ad4b66e (48.1 kB)
  • md5:5b24bd52554b2bfc8f8581bf1dd1e16b (15.0 kB)
  • md5:9c8cae3e8b10e614541efbea9ce04d28 (1.2 MB)
  • md5:24d7a7c95c66ca24a6404f6f8375687f (24.6 kB)
  • md5:4a8c34c758801e996f5f9e7482cbc630 (3.3 kB)

Additional details

Related works

Is supplement to
Preprint: 10.22541/au.173720389.93870617/v1 (DOI)