There is a newer version of the record available.

Published September 26, 2025 | Version 1.0.0

Data from: Local expertise anchors biodiversity documentation, but geopolitical power drives parachute discovery

  • 1. Universidade Federal da Paraíba
  • 2. Universidade de São Paulo
  • 3. Universidade Federal do Ceará
  • 4. Universidade Estadual de Campinas (UNICAMP)

Description

This repository provides the raw data and codebase for analyzing scientific colonialism practices associated with retention, appropriation, and network flow of mammal holotypes for species described between 1990 and early 2025. The code fully reproduces all analyses reported in the manuscript, including the extraction of holotype-based metrics per country, the integration of socioeconomic variables from global datasets, statistical comparisons, and the generation of all figures for the main text and supplementary materials. The workflow ensures transparent and replicable results, from data processing to final visualization.

File content

RawData.xlsx: raw data on mammal holotypes for species descriptions published between 1990 and 2025. It contains 24 fields covering taxonomic ranks, authority, the country where each holotype specimen was collected and the country where it is housed, the geographical coordinates of the species' type locality, and the geopolitical classifications of the countries of origin and destination.

R-code: script to reproduce all analyses and figures.

Shapefiles.zip: the "Shapefiles" folder, which contains four shapefiles. (i) landcover_SIMP, boundaries of land areas worldwide. (ii) world_limit, bounding box of the world extent. (iii) world-administrative-boundaries, country-level geopolitical boundaries (see https://data.ipu.org/content/regional-groupings). (iv) wwf_realms, the biogeographical realm limits extracted from https://ecoregions.appspot.com/.

Datasets.zip: the "Datasets" folder, which contains six CSV files. The file qog_std_ts_jan25.csv is sourced from the Quality of Government dataset (also available here); all other files are produced by the provided R code. We include these files so that users can reproduce the results without running the entire code. Full reproduction requires one additional CSV file, which contains 2.5 GB of mammal specimen records from the Global Biodiversity Information Facility (GBIF) and is available for separate download at https://doi.org/10.15468/dl.32rfs8

RData.zip: the "RData" folder, which contains 10 files with the .RData or .rds extension. All of these files are produced by the provided R code. We include them so that users can reproduce our results without rerunning the complete code.
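The saved objects described above can be inspected without running the full workflow. The following is a minimal sketch in base R, assuming RData.zip has been extracted to a folder named "RData" in the working directory; the helper name load_saved_objects is hypothetical and is not part of the provided R code.

```r
# Hypothetical helper (not part of the repository's script):
# read every .rds / .RData file found in a folder into a named list.
load_saved_objects <- function(dir = "RData") {
  paths <- list.files(dir, pattern = "\\.(rds|RData)$",
                      full.names = TRUE, ignore.case = TRUE)
  out <- lapply(paths, function(p) {
    if (grepl("\\.rds$", p, ignore.case = TRUE)) {
      readRDS(p)               # an .rds file stores exactly one object
    } else {
      env <- new.env()
      load(p, envir = env)     # an .RData file may restore several objects
      as.list(env)             # return them as a named list
    }
  })
  names(out) <- basename(paths)
  out
}

# Example call, assuming the folder name "RData":
# objs <- load_saved_objects("RData")
# names(objs)
```

Loading .RData files into a separate environment avoids overwriting objects that already exist in the current session.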

Correspondence to: mariormoura@gmail.com

Files

Files (61.6 MB)

Checksum (size)
md5:df1df42f48851f07aa4aeaa175522238 (19.5 MB)
md5:cd84d1f096f3d245b36f1bf9b41bbee9 (156.1 kB)
md5:3f2b0a99176885d4ea3a016ade990eab (179.2 kB)
md5:a2410e9748cb8ca46c1181acbbb43eb8 (26.2 MB)
md5:147a8f8c8e6beb27a9aaa3a078614981 (15.6 MB)
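
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using tools::md5sum from base R; the helper name verify_md5 is hypothetical, and because this listing does not show which checksum belongs to which file, the pairing of file and checksum must be taken from the record page itself.

```r
# Hypothetical helper: compare local files against expected MD5 checksums.
# `expected` may carry the "md5:" prefix used in the file listing.
verify_md5 <- function(paths, expected) {
  stopifnot(length(paths) == length(expected))
  actual <- unname(tools::md5sum(paths))        # NA for unreadable or missing files
  expected <- tolower(sub("^md5:", "", expected))
  data.frame(
    file = basename(paths),
    ok = !is.na(actual) & actual == expected,
    stringsAsFactors = FALSE
  )
}

# Example call with placeholder values (replace both with the real pairing):
# verify_md5("path/to/downloaded_file", "md5:<checksum from the record page>")
```

A row with ok = FALSE indicates either a missing file or a checksum mismatch, in which case the file should be downloaded again.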

Additional details

Software

Programming language
R