There is a newer version of the record available.

Published November 22, 2022 | Version 0.5
Dataset Open

Global Biotic Interactions: Interpreted Data Products

Description

Global Biotic Interactions (GloBI, https://globalbioticinteractions.org, [1]) aims to facilitate access to existing species interaction records (e.g., predator-prey, plant-pollinator, virus-host). This data publication provides interpreted species interaction data products. These products are the result of a process in which versioned, existing species interaction datasets ([2]) are linked to the so-called GloBI Taxon Graph ([3]) and transformed into various aggregate formats (e.g., tsv, csv, neo4j, rdf/nquad, darwin core-ish archives). In addition, the applied name maps are included to make the applied taxonomic linking explicit. 

Citation
--------

GloBI is made possible by researchers, collections, projects and institutions openly sharing their datasets. When using this data, please make sure to attribute these *original data contributors*, including citing the specific datasets in derivative work. Each species interaction record indexed by GloBI contains a reference and dataset citation. Also, a full list of all references can be found in the citations.csv/citations.tsv files in this publication. If you have ideas on how to make it easier to cite original datasets, please open/join a discussion via https://globalbioticinteractions.org or related projects.

To credit GloBI for more easily finding interaction data, please use the following citation to reference GloBI:

Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005.

Bias and Errors
--------

As with any analysis and processing workflow, care should be taken to understand the bias and error propagation of data sources and related data transformation processes. The datasets indexed by GloBI are biased geospatially, temporally and taxonomically ([4], [5]). Also, the mapping of verbatim names from datasets to known name concepts may contain errors due to synonym mismatches, outdated name lists, typos or conflicting name authorities. Finally, software bugs may introduce bias and errors in the resulting integrated data product.

To help better understand where bias and errors are introduced, only versioned data and code are used as input: the datasets ([2]), name maps ([3]) and integration software ([6]) are versioned so that the integration processes can be reproduced if needed. This way, the steps taken to compile an integrated data record can be traced and the sources of bias and errors can be more easily found.

Contents
--------

README:
this file

citations.csv.gz:
contains data citations in a gzipped comma-separated values format.

citations.tsv.gz:
contains data citations in a gzipped tab-separated values format.

verbatim-interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.

verbatim-interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.

interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.

interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.

refuted-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.

refuted-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.

refuted-verbatim-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.

refuted-verbatim-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.

interactions.nq.gz:
contains species interactions expressed in the resource description framework in a gzipped rdf/quads format.

dwca-by-study.zip:
contains species interactions data as a Darwin Core Archive aggregated by study using a custom, occurrence level, association extension.

dwca.zip:
contains species interactions data as a Darwin Core Archive using a custom, occurrence level, association extension.

neo4j-graphdb.zip:
contains a neo4j v3.5.32 graph database snapshot containing a graph representation of the species interaction data.

taxonCache.tsv.gz:
contains hierarchies and identifiers associated with names from naming schemes in a gzipped tab-separated values format.

taxonMap.tsv.gz:
describes how names in existing datasets were mapped into existing naming schemes in a gzipped tab-separated values format.
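
The gzipped tab-separated files listed above can be read without unpacking them first. The following is a minimal sketch rather than an official GloBI tool: it streams one file and tallies the values found in a single column, which is one simple way to inspect how records are distributed. It assumes UTF-8 encoding, a header row, and unquoted tab-separated fields; the function name `count_column_values` is hypothetical.

```python
import csv
import gzip
from collections import Counter


def count_column_values(path, column):
    """Stream a gzipped TSV file and tally the values of one column.

    Assumptions (not confirmed by the README): the file is UTF-8 encoded,
    its first row is a header, and fields are tab-separated without quoting.
    """
    counts = Counter()
    # Text mode ("rt") decompresses on the fly, so the file is never
    # loaded into memory as a whole.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        if column not in (reader.fieldnames or []):
            raise KeyError(
                f"column {column!r} not found; available: {reader.fieldnames}"
            )
        for row in reader:
            counts[row[column]] += 1
    return counts
```

A call such as `count_column_values("interactions.tsv.gz", "interactionTypeName").most_common(10)` would list the ten most frequent values. The column name used here is an assumption; check the header row of the downloaded file for the actual column names.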

References
-----

[1] Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. doi: 10.1016/j.ecoinf.2014.08.005.

[2] Poelen, J. H. (2020) Global Biotic Interactions: Elton Dataset Cache. Zenodo. doi: 10.5281/ZENODO.3950557.

[3] Poelen, J. H. (2021) Global Biotic Interactions: Taxon Graph (Version 0.3.28) [Data set]. Zenodo. doi: 10.5281/ZENODO.4451472.

[4] Hortal, J. et al. (2015) Seven Shortfalls that Beset Large-Scale Knowledge of Biodiversity. Annual Review of Ecology, Evolution, and Systematics, 46(1), pp.523–549. doi: 10.1146/annurev-ecolsys-112414-054400.

[5] Cains, M. et al. (2017) Ivmooc 2017 - Gap Analysis Of Globi: Identifying Research And Data Sharing Opportunities For Species Interactions. Zenodo. doi: 10.5281/ZENODO.814978.

[6] Poelen, J. et al. (2022) globalbioticinteractions/globalbioticinteractions v0.24.6. Zenodo. doi: 10.5281/ZENODO.7327955.

Content References
-----

hash://sha256/2e0158ca0b4341f4fa8ff454cf12bac2879b4e9d2d68e5e29b439af8ab467a30  citations.csv.gz
hash://sha256/42a8595ad8de2a32c52f56d632c2ef42a04e3645cb88b0ad328b1cedd2ac8f1a  citations.tsv.gz
hash://sha256/fed873a314d91d09500c896c9831108358c0a47bb17b7ff8aebca5c2e170d508  interactions.csv.gz
hash://sha256/91db1a9fd55ddb584d888f6c6314adcd5a668462d0016be266aa3593f2f60884  interactions.tsv.gz
hash://sha256/b40229414a565ab68971d05754e5040eea4af27e3ac6ef6df383410ea2a64752  verbatim-interactions.csv.gz
hash://sha256/e1485d6b23db9f8989315334c1696d64e5b39c33147e3617b41944d0a5d8581d  verbatim-interactions.tsv.gz
hash://sha256/4ed995d3a7d17b291f0a3af0a3fc50b41cb742d228a160f32e09f630a57563b0  refuted-interactions.csv.gz
hash://sha256/10564bdbc054b0e17ce78fe13d8e925c032c8484e599d4c245e70279d0f0e0bb  refuted-interactions.tsv.gz
hash://sha256/fc0b9e23d1b026e223a7716c7dde0677d00f816fb356dd0e7238d827f5e051d8  refuted-verbatim-interactions.csv.gz
hash://sha256/b54a93b46bf4583a7a0e090a9ab5e53a8d5a6d9f10a4165934a7ab98ea6d88d9  refuted-verbatim-interactions.tsv.gz
hash://sha256/15cdbd8d6b6aac59500df664d5675e1d614fdcd1c2af165b950a7fe430dcd6a6  interactions.nq.gz
hash://sha256/ee0a810a54bb6c564de4beb9186a3b9d55201cb77697aa605783882c85adf9c8  dwca-by-study.zip
hash://sha256/7b1a034da65d6ecd0941ea93bfc104166a3ddb00e3af5c2d4f806e52ca92e5cc  dwca.zip
hash://sha256/b3abdcfc5867ff6d8a5b7327c07bd6c2748d1f09efa06d63bd16a91447a4d97f  neo4j-graphdb.zip
hash://sha256/e5b0a7990379d6e69404020ed48db9b0336443ff516a3dd99e3c9708eec74cf6  taxonMap.tsv.gz
hash://sha256/a5f7c0b4b718ebc7725cdac0502e2edee92ed164297880512e551bdf3d43f4ee  taxonCache.tsv.gz
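
The content references above can be used to check that a downloaded file is intact. The sketch below is one possible approach, not part of GloBI's tooling: it parses lines of the form `hash://sha256/<hex digest>  <file name>` and compares the digest with the SHA-256 of a local file. The function names are hypothetical, and the sketch assumes that file names contain no whitespace.

```python
import hashlib

PREFIX = "hash://sha256/"


def parse_content_references(text):
    """Map each file name to its expected hex digest.

    Only lines of the form 'hash://sha256/<hex>  <name>' are used;
    all other lines are skipped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith(PREFIX):
            expected[parts[1]] = parts[0][len(PREFIX):]
    return expected


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(path, name, expected):
    """Return True if the file at `path` has the digest listed for `name`."""
    return sha256_of_file(path) == expected[name].lower()
```

Reading in chunks keeps memory use constant, which matters for the multi-gigabyte archives in this publication.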
 

Files (18.8 GB)

Checksum                              Size
md5:dfa6e0317ec8c3724b4c24a8472d8a0b  54.7 MB
md5:9e12daa5dd4bd730b6cb4223be808ba1  54.7 MB
md5:44008bc8d45a643875ecfbec88563d5c  365.1 MB
md5:81fb1af31f87a38c72c53d718680992f  509.0 MB
md5:d6aedbbce50932dd5de39a4aea92681d  1.6 GB
md5:879e03c6905452281fc828a909bae85d  7.4 GB
md5:e2dbe94e3c61570f3f123ebc07ea839d  1.6 GB
md5:1b1e9c2a69fa4cc218943a526920ba5a  6.2 GB
md5:a64ba5bacf9b07197648a9eed660c176  8.5 kB
md5:092bdf870613ab2e6c9dff4ea4ec8aff  2.8 MB
md5:9035cf51f6dbc04245b26d39a1ed997e  2.8 MB
md5:b17d40a7e14a9ee030c77631121e27cc  575.2 kB
md5:e6916ceba01a282c03beefa820019ce9  575.1 kB
md5:fd53631e37fae04ff6d9a4b7379cf9a5  115.6 MB
md5:3f9545af3a468d52f5fd972112881af1  58.7 MB
md5:882dc7804e2e7d0bfde2fa4dfa612bb6  481.6 MB
md5:c6c3d93515cfec7f6c2ada06996dcdb2  479.7 MB