GloSED - Global standardised soil eukaryome dataset
Global Standardised Soil Eukaryome Dataset (GloSED)
Dataset description
The GloSED dataset is a metabarcoding-based dataset encompassing the entire spectrum of soil eukaryotes collected and analysed using standardized protocols.
Key characteristics
- Sampling sites: 4,147 globally distributed locations across 121 countries
- Taxonomic scope: Complete soil eukaryome including fungi, protists, animals, and plants
- Operational taxonomic units: 988,824 curated OTUs
- Sequencing technology: PacBio long-read sequencing of full-length ITS
Data collection and processing
- Standardized sampling design: 50 × 50 m plots
- Soil cores: 40 cores per plot (5 cm diameter × 5 cm depth), pooled by volume
- DNA extraction: PowerMax Soil DNA Isolation kit (Qiagen) with FavorPrep cleanup
- Primers: universal eukaryotic primers ITS9mun/ITS4ngsUni
- Processing: NextITS v.1.0.0 workflow (DOI: 10.5281/zenodo.15074882)
- Taxonomic annotation: EUKARYOME v.1.9.4 database (DOI: 10.1093/database/baae043)
Data files and formats
Core data
- `GloSED__OTU_sequences.fasta.gz`: Quality-filtered representative sequences for all OTUs (FASTA format)
- `GloSED__OTU_table.tsv.zip`: Sample-by-OTU abundance matrix (TSV format)
- `GloSED__Taxonomy.tsv.zip`: Complete taxonomic annotations with UNITE-based species hypotheses (TSV format)
- `GloSED__OTU_table.parquet`: Columnar format of abundance data for efficient querying (Parquet format)
- `GloSED__Taxonomy.parquet`: Columnar format of taxonomic data (Parquet format)
- `GloSED__phyloseq.RData`: phyloseq object for R-based analyses
- `GloSED__BIOM.biom`: BIOM v.2.1 format compatible with QIIME2
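The compressed text files above can be inspected without third-party packages. The following is a minimal sketch using only the Python standard library; the helper names (`iter_fasta`, `peek_zipped_tsv`, `count_fasta_records`) are illustrative and not part of the dataset, and it assumes that each `.tsv.zip` archive holds a single TSV member and that the files are UTF-8 encoded. Neither assumption is stated in the description.

```python
import csv
import gzip
import io
import zipfile


def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_fasta_records(path):
    """Count records in a gzip-compressed FASTA file without loading it whole."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return sum(1 for _ in iter_fasta(handle))


def peek_zipped_tsv(source, n_rows=5):
    """Return the header and the first n_rows data rows of a zipped TSV.

    Assumption: the archive holds one TSV member, so the first member is read.
    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter="\t")
            header = next(reader)
            # zip() stops after n_rows without reading the rest of the table.
            rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `count_fasta_records("GloSED__OTU_sequences.fasta.gz")` streams the sequence file record by record, and `peek_zipped_tsv("GloSED__OTU_table.tsv.zip")` shows the column layout before committing to a full load. For whole-table analyses, the Parquet, phyloseq, or BIOM versions are likely the more practical starting point.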
Metadata files
- `GloSED__Sample_metadata.xlsx`: Sample metadata
- `DRI.json`: Data Reuse Information tag with ORCID identifiers
- `DRI.csv`: Tabular format mapping accession IDs
- `Contributors.xlsx`: List of contributors
Data reuse information
This dataset includes Data Reuse Information (DRI) tags to support equitable data sharing (Hug et al., 2025). The DRI identifies creators who prefer to be contacted before reuse:
DRI: `{0000-0002-1635-1249, 0000-0003-2786-2690}`
Please contact these individuals prior to reuse of the data.
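As an illustration, the ORCID identifiers can be pulled out of a DRI tag string and sanity-checked before contacting the creators. This is a sketch under stated assumptions: it parses only the inline tag as displayed in this record (the layout of `DRI.json` is not described here, so it is not parsed), the function names are hypothetical, and the check digit follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import re

# Four groups of four characters; the final character may be a digit or "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def parse_dri_tag(tag):
    """Return all ORCID-shaped identifiers found in a DRI tag string."""
    return ORCID_PATTERN.findall(tag)


def orcid_checksum_ok(orcid):
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def dri_contacts(tag):
    """List the identifiers in a tag, keeping only those with a valid checksum."""
    return [orcid for orcid in parse_dri_tag(tag) if orcid_checksum_ok(orcid)]
```

A passing checksum only shows that an identifier is well formed; it does not confirm that the ORCID record exists or who owns it.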
Files (1.9 GB)
| MD5 checksum | Size |
|---|---|
| b1386b6b73df4a6c80e6e9bf715a5edd | 13.3 kB |
| 206987b42366f762bde2a5ae693c74a8 | 558.3 kB |
| 2756e2f149f260cd3071bedb28594ee5 | 269 Bytes |
| 88bbfa018e890ee78749a4b1920a9c12 | 907.3 MB |
| a013f52bc7c16390b25e48b536f659ef | 189.5 MB |
| 9b0f54ae618c449dfa54fa0bdd0d52cf | 63.0 MB |
| c445a7ee47e7c87c87602bf848509586 | 46.1 MB |
| 355b02ee9557f1e73ce557f159d31c81 | 265.9 MB |
| aa70ae9ca5df89e50fdec01cbefbd6e2 | 1.0 MB |
| 66d8f1d918d7c64b47b2843008ba04d6 | 169.2 MB |
| 5b167a8eb8b079d68b68267aa5b26606 | 220.6 MB |
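The MD5 checksums listed for the files can be used to confirm that a download completed intact. The sketch below uses Python's `hashlib` and reads in chunks so that the larger files do not need to fit in memory. The function names are illustrative, and because the listing here does not show which checksum belongs to which file, that pairing has to be taken from the record page itself.

```python
import hashlib
import io


def md5_of_stream(handle, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def matches_listed_md5(actual_hex, listed):
    """Compare a computed digest with a listed value such as 'md5:<hex>'."""
    listed = listed.strip().lower()
    if listed.startswith("md5:"):
        listed = listed[len("md5:"):]
    return actual_hex.lower() == listed
```

A typical check would be `matches_listed_md5(md5_of_file(local_path), listed_value)`, where `local_path` and `listed_value` are supplied by the user. MD5 is adequate for detecting transfer corruption, though it is not a safeguard against deliberate tampering.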
Additional details
Funding
- Estonian Research Council: PRG632
- Estonian Research Council: MOBERC116
- Estonian Research Council: PRG1789
- Ministry of Education and Research: Agroecology and new crops in future climates (TK200)
- European Research Council: 101200758