Data and code from: Host-microbiome associations of native and invasive small mammals across a tropical urban-rural ecotone
Description
16Sv4 rRNA gene sequencing data
Checklist for 16Sv4 rRNA gene sequencing data submitted to ENA for the 245 samples analysed in the manuscript - project PRJEB81284
This TSV file contains: the sample ID ("sample"), the study ID in ENA ("study"), information about the type of sequencing ("instrument_model", "library_name", "library_source", "library_selection", "library_strategy", "library_layout"), the file name of the forward read ("forward_file_name"), the MD5 checksum of the forward read ("forward_file_md5"), the file name of the reverse read ("reverse_file_name") and the MD5 checksum of the reverse read ("reverse_file_md5").
The project PRJEB81284 holds the sequencing data of 354 samples from various species of small mammals, but only the 245 samples from Sundamys muelleri, Rattus rattus, Rattus norvegicus and Suncus murinus were analysed in the paper. FASTQ files of the paired-end reads (R1: forward read; R2: reverse read) can be downloaded from ENA under that project accession.
File name: file_checklist_project_PRJEB81284.tsv
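The checksum columns make it possible to confirm that downloaded FASTQ files are intact. The following is a minimal Python sketch of such a check, not part of the deposited scripts. It assumes the FASTQ files sit in one local directory under the names listed in the checklist, and that header names with a stray space (such as "forward _file_md5") refer to the same columns as their space-free forms. The function names `md5_of_file` and `check_fastq_files` are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fastq_files(checklist_path, fastq_dir):
    """Compare local FASTQ files against the MD5 sums in the checklist.

    Returns a list of (file_name, status) tuples, where status is
    "ok", "mismatch" or "missing".
    """
    results = []
    with open(checklist_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            # Assumption: spaces inside header names are typos, so remove them.
            row = {
                key.replace(" ", ""): value
                for key, value in row.items()
                if key is not None
            }
            for direction in ("forward", "reverse"):
                name = row[f"{direction}_file_name"]
                expected = row[f"{direction}_file_md5"].strip().lower()
                path = Path(fastq_dir) / name
                if not path.exists():
                    status = "missing"
                elif md5_of_file(path) == expected:
                    status = "ok"
                else:
                    status = "mismatch"
                results.append((name, status))
    return results
```

A call such as `check_fastq_files("file_checklist_project_PRJEB81284.tsv", "fastq/")` would then list every forward and reverse file with its status; the directory name `fastq/` is only an example.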
Bioinformatics
Bash script used to process the 16Sv4 sequences into ASVs (Amplicon Sequence Variants) using the QIIME 2 package
This SH file contains the bash commands used to process the 16Sv4 sequences into ASVs with the QIIME 2 package.
File name: script_qiime2_Borneo_sm_microbiota.sh
Metadata required to run script_qiime2_Borneo_sm_microbiota
This TSV file contains: the sample ID, the ID of the individual small mammal, the small mammal species, the town or village in which the individual was captured, the geographic coordinates in the UTM system, and the sex and age of the individual.
File name: metadata_qiime2_Borneo_sm_microbiota
Statistical analysis
Data --> R_data_Borneo_sm_microbiota
Individual information of small mammal faecal samples analysed in the study
This CSV file contains data on the 245 small mammal individuals that were captured in Borneo between March 2012 and May 2013 and whose faecal content was sequenced for microbial analysis. The table contains 245 rows and 9 columns. The variables include:
- "Sample_ID": unique identifier for each faecal sample.
- "ID_individual": unique identifier for each individual (one Sample_ID corresponds to one ID_individual).
- "Species": small mammal species information, including genus and species.
- "District": district in which the individual was captured.
- "Town_village": town or village in which the individual was captured.
- "Coord_UTM_x": x coordinate of the capture location in the UTM system (corresponding to longitude).
- "Coord_UTM_y": y coordinate of the capture location in the UTM system (corresponding to latitude).
- "Sex": sex of the individual (female, male, unknown).
- "Age": stage of maturity of the individual (adult, immature, juvenile, subadult, unknown).
After filtering in R ("data_preparation.R"), 236 samples remained. --> df_samples
File name: dataset_samples.csv
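As a quick sanity check on this file, the header and the number of samples per host species can be inspected before running the R scripts. The sketch below is illustrative Python rather than the authors' code. It assumes a comma-delimited file whose header uses exactly the nine column names listed above; the function name `summarise_samples` is hypothetical.

```python
import csv
from collections import Counter

# Column names as listed in the description of dataset_samples.csv.
EXPECTED_COLUMNS = [
    "Sample_ID", "ID_individual", "Species", "District", "Town_village",
    "Coord_UTM_x", "Coord_UTM_y", "Sex", "Age",
]


def summarise_samples(handle):
    """Check the header and count samples per host species.

    `handle` is an open text file (or file-like object) holding the CSV.
    Returns (number_of_rows, Counter of species names).
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    sample_ids = [row["Sample_ID"] for row in rows]
    if len(set(sample_ids)) != len(sample_ids):
        raise ValueError("Sample_ID values are not unique")
    return len(rows), Counter(row["Species"] for row in rows)
```

Opened with `open("dataset_samples.csv", newline="")`, the unfiltered file should yield 245 rows according to the description above.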
Trapping effort with information about environment of the trapping locations and presence-absence data for the four analysed small mammal species
This CSV file contains 3541 rows, corresponding to unique trapping locations, and 20 columns describing the environment and providing information about the presence-absence of the four studied species.
- "coord_id": unique identifier for each trap location, formatted as longitude_latitude.
- "LC20m_xxx": columns describing the landcover types within a 20 m radius around each trapping location: housing (compound and soil), sealed, soil, agriculture (grass and tree), garden (grass and tree), fallow (grass and tree), forest edge, forest, water, others. The columns contain numeric values ranging from 0 to 100, where 0 means the landcover type is absent and 100 means it covers the whole area.
- "PA_smue": presence-absence data for Sundamys muelleri (0 = absence, 1 = presence).
- "PA_rr": presence-absence data for Rattus rattus (0 = absence, 1 = presence).
- "PA_rn": presence-absence data for Rattus norvegicus (0 = absence, 1 = presence).
- "PA_smur": presence-absence data for Suncus murinus (0 = absence, 1 = presence).
- "ID_individual": unique identifier for each individual.
File name: dataset_environment.csv
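The presence-absence columns can be summarised as the fraction of trapping locations at which each species was recorded. The Python sketch below illustrates this under stated assumptions: a comma-delimited file, presence-absence values stored as 0 or 1, and numeric parts in "coord_id". The helper names `split_coord_id` and `species_prevalence` are hypothetical and do not come from the deposited R scripts.

```python
import csv

# Presence-absence columns and the species they describe (from the list above).
PA_COLUMNS = {
    "PA_smue": "Sundamys muelleri",
    "PA_rr": "Rattus rattus",
    "PA_rn": "Rattus norvegicus",
    "PA_smur": "Suncus murinus",
}


def split_coord_id(coord_id):
    """Split a "longitude_latitude" identifier into two floats."""
    x, y = coord_id.split("_", 1)
    return float(x), float(y)


def species_prevalence(handle):
    """Return the fraction of trapping locations where each species is present."""
    reader = csv.DictReader(handle)
    n_locations = 0
    n_present = {col: 0 for col in PA_COLUMNS}
    for row in reader:
        n_locations += 1
        for col in PA_COLUMNS:
            # float() first so that values written as "1.0" are also accepted.
            n_present[col] += int(float(row[col]))
    if n_locations == 0:
        return {}
    return {PA_COLUMNS[col]: n_present[col] / n_locations for col in PA_COLUMNS}
```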
ASV abundance table for analysed small mammal faecal samples
This text file contains the number of sequences of each bacterial ASV found in each analysed sample. The table contains 6109 rows, one per ASV, and 247 columns: 1) the unique code assigned to the ASV by QIIME 2 ("ASV_code"), 2-246) the 245 analysed samples, 247) an empty column ("taxonomy") that is removed during filtering.
After filtering in R ("data_preparation.R"), 1864 unique ASVs and 236 samples remained. --> asv_table
File name: dataset_asv.txt
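To show how this layout (ASVs in rows, samples in columns) can be read, here is a minimal Python sketch that loads the counts, drops the empty "taxonomy" column and converts one sample to relative abundances. It is not the filtering performed in "data_preparation.R". The tab delimiter is an assumption, since the description only says "text file", and the function names are hypothetical.

```python
import csv


def read_asv_table(handle, delimiter="\t"):
    """Read the ASV table into {sample: {asv_code: count}}.

    The "ASV_code" column supplies the row labels and the empty
    "taxonomy" column is skipped.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    samples = [
        col for col in (reader.fieldnames or [])
        if col not in ("ASV_code", "taxonomy")
    ]
    counts = {sample: {} for sample in samples}
    for row in reader:
        for sample in samples:
            counts[sample][row["ASV_code"]] = int(float(row[sample]))
    return counts


def relative_abundance(sample_counts):
    """Convert the counts of one sample into proportions that sum to 1."""
    total = sum(sample_counts.values())
    if total == 0:
        # A sample without reads has no defined composition; return zeros.
        return {asv: 0.0 for asv in sample_counts}
    return {asv: n / total for asv, n in sample_counts.items()}
```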
ASV taxonomy table
This TSV file contains the taxonomic classification of the ASVs. Each ASV was classified against the SILVA database. The table contains 6109 rows, one per ASV found in the 245 analysed samples, and 3 columns: 1) the unique code assigned to the ASV by QIIME 2 ("ASV_code"), 2) the concatenated taxonomic classification, from domain to species ("taxon"), 3) the confidence of the taxonomic classification ("confidence").
After filtering in R ("data_preparation.R"), 1864 unique ASVs remained. --> taxo_table
File name: dataset_taxonomy.tsv
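The concatenated "taxon" string usually needs to be split into ranks before ASVs can be grouped, for example by family. The Python sketch below assumes the string follows the common QIIME 2 SILVA style, with ranks separated by semicolons and prefixed as in "d__Bacteria; p__Firmicutes". That format is an assumption and should be checked against the file itself; the function name `split_taxon` is hypothetical.

```python
# Assumed rank prefixes in QIIME 2 SILVA-style taxonomy strings.
RANK_PREFIXES = {
    "d": "domain",
    "p": "phylum",
    "c": "class",
    "o": "order",
    "f": "family",
    "g": "genus",
    "s": "species",
}


def split_taxon(taxon):
    """Split a concatenated taxonomy string into a {rank: name} dictionary.

    Ranks that are absent or have an empty name are left out.
    """
    ranks = {}
    for part in taxon.split(";"):
        prefix, separator, name = part.strip().partition("__")
        if separator and prefix in RANK_PREFIXES and name:
            ranks[RANK_PREFIXES[prefix]] = name
    return ranks
```

For instance, `split_taxon("d__Bacteria; p__Firmicutes")` gives a dictionary with the keys "domain" and "phylum", while a string without any recognised prefix gives an empty dictionary.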
ASV phylogenetic tree
This Newick file contains the phylogenetic tree of the ASVs.
File name: ASV_phylogenetic_tree.nwk
Phylogenetic trees (from vertlife.org project) of all small mammal species captured in Borneo during the trapping effort made between March 2012 and May 2013 (Wells et al. 2014)
These trees are used to create a final consensus tree.
File name: host_species_trees.nex
Colour palette for Figure 5 = ANCOMBC results
Colour palette used in the plot displaying ANCOMBC results (Figure 5).
File name: ANCOMBC_palette_families.csv
Colour legend for bacterial families from the microbiome composition plot (Figure 2 and Figure S2)
Legend extracted from the microbiome composition plot, used as legend for Figure 5.
File name: legend_bacterial_families.rds
Scripts --> R_scripts_Borneo_sm_microbiota
The zip file contains various scripts corresponding to each step of the statistical analysis run in R.
- data_preparation.R: data filtering, preparation and inspection.
- relative_occurrence_probability.R: computation of relative occurrence probability for each host species using Generalised Additive Models (script to create Figure 1).
- density_function.R: function to calculate mode and 95% CI.
- legend_extraction_function.R: function to extract the legend from a plot, modification of the individual_legend function from the "microshades" package.
- microbiome_composition.R: analysis of microbiome composition at host species and individual level (script to create Figure 2).
- alpha_diversity.R: alpha diversity metrics calculation and Generalised Linear Models (GLM) testing the effect of Land Use Intensity (LUI) on alpha diversity metrics.
- computation_beta_diversity_metrics.R: beta diversity metrics computation.
- beta_diversity_GDMs.R: Generalised Dissimilarity Modelling testing the effect of host phylogenetic relatedness, LUI and spatial proximity on beta diversity metrics (script to create Figure 3).
- beta_diversity_NMDS_beta_dispersion_ANOSIM.R: beta diversity visualisation and analysis using beta dispersion, Analysis of Similarity, and GLM testing the effect of LUI on beta dispersion (script to create Figure 4).
- ANCOM-BC.R: analysis to find differentially abundant ASVs in relation to LUI for each host species (script to create Figure 5).
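The list above does not say which alpha diversity metrics "alpha_diversity.R" computes. Purely as an illustration of what such a metric looks like, the Python sketch below computes the Shannon index, H = -sum(p_i * ln(p_i)), from the ASV counts of one sample. Choosing the Shannon index here is an assumption made for the example, and the function name `shannon_index` is hypothetical.

```python
import math


def shannon_index(counts):
    """Return the Shannon diversity index (natural log) for one sample.

    `counts` is an iterable of non-negative ASV read counts; ASVs with
    a count of zero contribute nothing to the index.
    """
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)
```

A sample dominated by a single ASV scores 0, and the index increases as reads are spread more evenly over more ASVs.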
Abstract (English)
Global change and urbanisation profoundly alter wildlife habitats, driving native animals into novel habitats while increasing the co-occurrence between native and invasive species. Host-microbiome associations are shaped by host traits and environmental features, but little is known about their plasticity in co-occurring native and invasive species across urban-rural gradients.
Here, we explored gut microbiomes of four sympatric small mammal species along an urban-rural ecotone in Borneo, one of the planet's oldest rainforest regions experiencing recent urban expansion.
Host species identity was the strongest determinant of microbiome composition, while land use and spatial proximity shaped microbiome similarity within and among the three rat species. The urban-dwelling rat Rattus rattus had a microbiome composition more similar to that of the native, urban-adapted rat Sundamys muelleri (R. rattus’ strongest environmental niche overlap), than to the closely related urban-dwelling R. norvegicus. The urban-dwelling shrew Suncus murinus presented the most distinct microbiome. The microbiome of R. norvegicus was the most sensitive to land use intensity, exhibiting significant alterations in composition and bacterial abundance across the ecotone.
Our findings suggest that environmental niche overlap among native and invasive species promotes similar gut microbiomes. Even for omnivorous urban-dwellers with a worldwide distribution like R. norvegicus, gut microbiomes may change across fine-scale environmental gradients. Future research needs to confirm whether land use intensity can be a strong selective force on mammalian gut microbiomes, influencing the way in which native and invasive species are able to exploit novel environments.
Files
R_data_Borneo_sm_microbiota.zip
Files (1.3 MB)
MD5 checksum | Size |
---|---|
md5:b548232cd4d86ea21774bf0f4ad4b66e | 48.1 kB |
md5:5b24bd52554b2bfc8f8581bf1dd1e16b | 15.0 kB |
md5:9c8cae3e8b10e614541efbea9ce04d28 | 1.2 MB |
md5:24d7a7c95c66ca24a6404f6f8375687f | 24.6 kB |
md5:4a8c34c758801e996f5f9e7482cbc630 | 3.3 kB |
Additional details
Related works
- Is supplement to
- Preprint: 10.22541/au.173720389.93870617/v1 (DOI)