Global Biotic Interactions: Interpreted Data Products
Global Biotic Interactions (GloBI, https://globalbioticinteractions.org, [1]) aims to facilitate access to existing species interaction records (e.g., predator-prey, plant-pollinator, virus-host). This data publication provides interpreted species interaction data products. These products are the result of a process in which versioned, existing species interaction datasets ([2]) are linked to the so-called GloBI Taxon Graph ([3]) and transformed into various aggregate formats (e.g., tsv, csv, neo4j, rdf/nquad, darwin core-ish archives). In addition, the applied name maps are included to make the applied taxonomic linking explicit.
Citation
--------
GloBI is made possible by researchers, collections, projects and institutions openly sharing their datasets. When using this data, please make sure to attribute these *original data contributors*, including citing the specific datasets in derivative work. Each species interaction record indexed by GloBI contains a reference and dataset citation. Also, a full list of all references can be found in the citations.csv/citations.tsv files in this publication. If you have ideas on how to make it easier to cite original datasets, please open/join a discussion via https://globalbioticinteractions.org or related projects.
To credit GloBI for more easily finding interaction data, please use the following citation to reference GloBI:
Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005.
Bias and Errors
--------
As with any analysis and processing workflow, care should be taken to understand the bias and error propagation of data sources and related data transformation processes. The datasets indexed by GloBI are biased geospatially, temporally and taxonomically ([4], [5]). Also, mapping of verbatim names from datasets to known name concepts may contain errors due to synonym mismatches, outdated name lists, typos or conflicting name authorities. Finally, bugs may introduce bias and errors in the resulting integrated data product.
To help better understand where bias and errors are introduced, only versioned data and code are used as input: the datasets ([2]), name maps ([3]) and integration software ([6]) are versioned so that the integration processes can be reproduced if needed. This way, the steps taken to compile an integrated data record can be traced and the sources of bias and errors can be more easily found.
Contents
--------
README:
this file
citations.csv.gz:
contains data citations in a gzipped comma-separated values format.
citations.tsv.gz:
contains data citations in a gzipped tab-separated values format.
verbatim-interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
verbatim-interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-verbatim-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
refuted-verbatim-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
interactions.nq.gz:
contains species interactions expressed in the resource description framework in a gzipped rdf/quads format.
dwca-by-study.zip:
contains species interactions data as a Darwin Core Archive aggregated by study using a custom, occurrence level, association extension.
dwca.zip:
contains species interactions data as a Darwin Core Archive using a custom, occurrence level, association extension.
neo4j-graphdb.zip:
contains a neo4j v3.5.32 graph database snapshot containing a graph representation of the species interaction data.
taxonCache.tsv.gz:
contains hierarchies and identifiers associated with names from naming schemes in a gzipped tab-separated values format.
taxonMap.tsv.gz:
describes how names in existing datasets were mapped into existing naming schemes in a gzipped tab-separated values format.
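Some of the tabular files listed above are large, so it can help to look at the header and a few rows before loading a file in full. The following Python sketch streams a gzipped file without first decompressing it to disk. The function name peek_rows and its parameters are illustrative only and are not part of GloBI. Column names are read from the file rather than assumed, and the tab-separated files are assumed here to be written without quoting.

```python
import csv
import gzip
import itertools


def peek_rows(path, delimiter="\t", limit=5):
    """Return (header, first `limit` rows) of a gzipped delimited text file."""
    # Assumption: tab-separated files carry no quoting, so quote characters
    # are kept as literal text; comma-separated files use standard CSV quoting.
    quoting = csv.QUOTE_NONE if delimiter == "\t" else csv.QUOTE_MINIMAL
    # Text mode ("rt") decodes while decompressing; newline="" leaves line
    # handling to the csv module.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter, quoting=quoting)
        header = next(reader, [])
        rows = list(itertools.islice(reader, limit))
    return header, rows
```

For example, peek_rows("interactions.tsv.gz") returns the column names and the first five records; pass delimiter="," for the .csv.gz variants.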
References
-----
[1] Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. doi: 10.1016/j.ecoinf.2014.08.005.
[2] Poelen, J. H. (2020) Global Biotic Interactions: Elton Dataset Cache. Zenodo. doi: 10.5281/ZENODO.3950557.
[3] Poelen, J. H. (2021). Global Biotic Interactions: Taxon Graph (Version 0.3.28) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4451472
[4] Hortal, J. et al. (2015) Seven Shortfalls that Beset Large-Scale Knowledge of Biodiversity. Annual Review of Ecology, Evolution, and Systematics, 46(1), pp.523–549. doi: 10.1146/annurev-ecolsys-112414-054400.
[5] Cains, M. et al. (2017) Ivmooc 2017 - Gap Analysis Of Globi: Identifying Research And Data Sharing Opportunities For Species Interactions. Zenodo. doi: 10.5281/ZENODO.814978.
[6] Poelen, J. et al. (2022) globalbioticinteractions/globalbioticinteractions v0.24.6. Zenodo. doi: 10.5281/ZENODO.7327955.
Content References
-----
hash://sha256/2e0158ca0b4341f4fa8ff454cf12bac2879b4e9d2d68e5e29b439af8ab467a30 citations.csv.gz
hash://sha256/42a8595ad8de2a32c52f56d632c2ef42a04e3645cb88b0ad328b1cedd2ac8f1a citations.tsv.gz
hash://sha256/fed873a314d91d09500c896c9831108358c0a47bb17b7ff8aebca5c2e170d508 interactions.csv.gz
hash://sha256/91db1a9fd55ddb584d888f6c6314adcd5a668462d0016be266aa3593f2f60884 interactions.tsv.gz
hash://sha256/b40229414a565ab68971d05754e5040eea4af27e3ac6ef6df383410ea2a64752 verbatim-interactions.csv.gz
hash://sha256/e1485d6b23db9f8989315334c1696d64e5b39c33147e3617b41944d0a5d8581d verbatim-interactions.tsv.gz
hash://sha256/4ed995d3a7d17b291f0a3af0a3fc50b41cb742d228a160f32e09f630a57563b0 refuted-interactions.csv.gz
hash://sha256/10564bdbc054b0e17ce78fe13d8e925c032c8484e599d4c245e70279d0f0e0bb refuted-interactions.tsv.gz
hash://sha256/fc0b9e23d1b026e223a7716c7dde0677d00f816fb356dd0e7238d827f5e051d8 refuted-verbatim-interactions.csv.gz
hash://sha256/b54a93b46bf4583a7a0e090a9ab5e53a8d5a6d9f10a4165934a7ab98ea6d88d9 refuted-verbatim-interactions.tsv.gz
hash://sha256/15cdbd8d6b6aac59500df664d5675e1d614fdcd1c2af165b950a7fe430dcd6a6 interactions.nq.gz
hash://sha256/ee0a810a54bb6c564de4beb9186a3b9d55201cb77697aa605783882c85adf9c8 dwca-by-study.zip
hash://sha256/7b1a034da65d6ecd0941ea93bfc104166a3ddb00e3af5c2d4f806e52ca92e5cc dwca.zip
hash://sha256/b3abdcfc5867ff6d8a5b7327c07bd6c2748d1f09efa06d63bd16a91447a4d97f neo4j-graphdb.zip
hash://sha256/e5b0a7990379d6e69404020ed48db9b0336443ff516a3dd99e3c9708eec74cf6 taxonMap.tsv.gz
hash://sha256/a5f7c0b4b718ebc7725cdac0502e2edee92ed164297880512e551bdf3d43f4ee taxonCache.tsv.gz
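The content references above pair a SHA-256 hash URI with a file name, which allows a downloaded file to be checked against the hash recorded for it. A minimal Python sketch of such a check follows. The helper names parse_hash_line, sha256_of and matches are hypothetical and not part of any GloBI tooling; the sketch only assumes the "hash://sha256/<hex digest> <file name>" line layout shown above.

```python
import hashlib

PREFIX = "hash://sha256/"


def parse_hash_line(line):
    """Split a 'hash://sha256/<hex> <file name>' line into (hex digest, file name)."""
    uri, filename = line.strip().split(maxsplit=1)
    if not uri.startswith(PREFIX):
        raise ValueError(f"unsupported hash URI: {uri}")
    return uri[len(PREFIX):].lower(), filename


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat, even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(line, path):
    """Return True if the file at `path` has the digest recorded in `line`."""
    expected, _ = parse_hash_line(line)
    return sha256_of(path) == expected
```

Calling matches() with one of the lines above and the path of the corresponding downloaded file returns True only when the file's content hash agrees with the recorded one.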
Files (18.8 GB)
Checksum | Size |
---|---|
md5:dfa6e0317ec8c3724b4c24a8472d8a0b | 54.7 MB |
md5:9e12daa5dd4bd730b6cb4223be808ba1 | 54.7 MB |
md5:44008bc8d45a643875ecfbec88563d5c | 365.1 MB |
md5:81fb1af31f87a38c72c53d718680992f | 509.0 MB |
md5:d6aedbbce50932dd5de39a4aea92681d | 1.6 GB |
md5:879e03c6905452281fc828a909bae85d | 7.4 GB |
md5:e2dbe94e3c61570f3f123ebc07ea839d | 1.6 GB |
md5:1b1e9c2a69fa4cc218943a526920ba5a | 6.2 GB |
md5:a64ba5bacf9b07197648a9eed660c176 | 8.5 kB |
md5:092bdf870613ab2e6c9dff4ea4ec8aff | 2.8 MB |
md5:9035cf51f6dbc04245b26d39a1ed997e | 2.8 MB |
md5:b17d40a7e14a9ee030c77631121e27cc | 575.2 kB |
md5:e6916ceba01a282c03beefa820019ce9 | 575.1 kB |
md5:fd53631e37fae04ff6d9a4b7379cf9a5 | 115.6 MB |
md5:3f9545af3a468d52f5fd972112881af1 | 58.7 MB |
md5:882dc7804e2e7d0bfde2fa4dfa612bb6 | 481.6 MB |
md5:c6c3d93515cfec7f6c2ada06996dcdb2 | 479.7 MB |