There is a newer version of this record available.

Dataset Open Access

Global Biotic Interactions: Interpreted Data Products

Poelen, Jorrit H.

Global Biotic Interactions (GloBI, https://globalbioticinteractions.org, [1]) aims to facilitate access to existing species interaction records (e.g., predator-prey, plant-pollinator, virus-host). This data publication provides interpreted species interaction data products. These products are the result of a process in which versioned, existing species interaction datasets ([2]) are linked to the so-called GloBI Taxon Graph ([3]) and transformed into various aggregate formats (e.g., tsv, csv, neo4j, rdf/nquad, darwin core-ish archives). In addition, the applied name maps are included to make the applied taxonomic linking explicit.

Citation
--------

GloBI is made possible by researchers, collections, projects and institutions openly sharing their datasets. When using this data, please make sure to attribute these *original data contributors*, including citing the specific datasets in derivative work. Each species interaction record indexed by GloBI contains a reference and dataset citation. Also, a full list of all references can be found in the citations.csv/citations.tsv files in this publication. If you have ideas on how to make it easier to cite original datasets, please open/join a discussion via https://globalbioticinteractions.org or related projects.

To credit GloBI for more easily finding interaction data, please use the following citation to reference GloBI:

Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005.

Bias and Errors
--------

As with any analysis and processing workflow, care should be taken to understand the bias and error propagation of data sources and related data transformation processes. The datasets indexed by GloBI are biased geospatially, temporally and taxonomically ([4], [5]). Also, mapping of verbatim names from datasets to known name concepts may contain errors due to synonym mismatches, outdated name lists, typos or conflicting name authorities. Finally, bugs may introduce bias and errors in the resulting integrated data product.

To help better understand where bias and errors are introduced, only versioned data and code are used as an input: the datasets ([2]), name maps ([3]) and integration software ([6]) are versioned so that the integration processes can be reproduced if needed. This way, the steps taken to compile an integrated data record can be traced and the sources of bias and errors can be more easily found.

Contents
--------

README:
this file

citations.csv.gz:
contains data citations in a gzipped comma-separated values format.

interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format.

citations.tsv.gz:
contains data citations in a gzipped tab-separated values format.

interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format.

interactions.nq.gz:
contains species interactions expressed in the resource description framework in a gzipped rdf/quads format.

dwca-by-study.zip:
contains species interactions data as a Darwin Core Archive aggregated by study using a custom, occurrence level, association extension.

dwca.zip:
contains species interactions data as a Darwin Core Archive using a custom, occurrence level, association extension.

neo4j-graphdb.zip:
contains a neo4j v2.3.12 graph database snapshot containing a graph representation of the species interaction data.

taxonCache.tsv.gz:
contains hierarchies and identifiers associated with names from naming schemes in a gzipped tab-separated values format.

taxonMap.tsv.gz:
describes how names in existing datasets were mapped into existing naming schemes in a gzipped tab-separated values format.
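
The gzipped tab-separated files listed above can be inspected without unpacking them to disk. The following is a minimal sketch, not part of the GloBI tooling: the function name read_tsv_gz is hypothetical, UTF-8 encoding and the presence of a header row are assumptions, and quoting is disabled on the assumption that fields do not use CSV-style quotes. Column names are read from the file rather than assumed.

```python
import csv
import gzip
import itertools


def read_tsv_gz(path, limit=5):
    """Return (header, rows) for a gzipped tab-separated file.

    Only the first `limit` data rows are read, so large files such as
    interactions.tsv.gz are not loaded into memory in full.
    Assumptions: UTF-8 text, first line is a header, no CSV quoting.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader, [])
        rows = list(itertools.islice(reader, limit))
    return header, rows
```

For example, read_tsv_gz("taxonMap.tsv.gz", limit=3) would return the column names and the first three rows of a locally downloaded copy of that file.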

Note that each of the data files has a computed content hash in an associated .sha256 file.
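
A content hash check along these lines can be sketched in Python. This is an illustration under stated assumptions, not a documented GloBI procedure: it assumes the .sha256 file holds a hex-encoded SHA-256 digest as its first whitespace-separated token, and the helper names are hypothetical. The README does not state whether the digest covers the compressed file or its decompressed content; the sketch hashes whatever bytes are in the file it is given.

```python
import hashlib


def sha256_hex(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_expected_digest(sha256_path):
    """Read the first token of a .sha256 file (assumed to be a hex digest)."""
    with open(sha256_path, "r", encoding="ascii") as handle:
        tokens = handle.read().split()
    return tokens[0].lower() if tokens else ""


def matches(data_path, sha256_path):
    """Return True if the computed digest equals the recorded one."""
    return sha256_hex(data_path) == read_expected_digest(sha256_path)
```

For example, matches("dwca.zip", "dwca.zip.sha256") would compare a downloaded archive with its recorded hash. A mismatch on a .gz file may only mean that the digest was computed over the decompressed content.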

References
-----

[1] Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. doi: 10.1016/j.ecoinf.2014.08.005.

[2] Poelen, J. H. (2020) Global Biotic Interactions: Elton Dataset Cache. Zenodo. doi: 10.5281/ZENODO.3950557.

[3] Poelen, J. H. (2020) Global Biotic Interactions: Taxon Graph. Zenodo. doi: 10.5281/ZENODO.3905244.

[4] Hortal, J. et al. (2015) Seven Shortfalls that Beset Large-Scale Knowledge of Biodiversity. Annual Review of Ecology, Evolution, and Systematics, 46(1), pp.523–549. doi: 10.1146/annurev-ecolsys-112414-054400.

[5] Cains, M. et al. (2017) Ivmooc 2017 - Gap Analysis Of Globi: Identifying Research And Data Sharing Opportunities For Species Interactions. Zenodo. doi: 10.5281/ZENODO.814978.

[6] Poelen, J. et al. (2020) globalbioticinteractions/globalbioticinteractions v0.19.0. Zenodo. doi: 10.5281/ZENODO.3946991.

Content References
-----

hash://sha256/905c15b2f9e798478deaa7503288d1b8f6e2851cf4823f666ee7672b1cf21a52

hash://sha256/d4801790c55a6e36224770518717c01713e50a0d20dd6cd098c378f4a13093eb

hash://sha256/cc0cbc58136771b64c3f6b54b373d04f0c591cc417df93e68dc3c800bb6d0d6f

hash://sha256/9bfba55dc42aa0ce53a35e72d31cf5e0543d195ae6012f2d46db1afeccf14d3d

hash://sha256/92a70b8d1289dbd01149fc632de016a1d295c93053eef1e42b1b2ed6a9bb5dcf

hash://sha256/446fcb92d720fc7675beefb938a28fdf35cea11efd354e1326bef5cc32bbdae7

hash://sha256/fc45a465c06ff828decb28ea58a83d00534ef517cd7a10d84b346c644c978cab

hash://sha256/041c3d2e185017f94efa4efa378b909782bbbaeb5f2a94e901eb9b2227710566

hash://sha256/122ebcb2906da2a57e019000d215e8ab601a70a17af50943085a002ef8a3a309

hash://sha256/01cdcb2fb164aca78b430c3624996633b73ec124ad3045bce76eb97a5d6b8481
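
The content references above follow the pattern hash://sha256/ followed by 64 hexadecimal characters. As a hedged sketch, the digest part can be extracted for comparison with a locally computed SHA-256 value. The function name parse_content_reference is hypothetical, and which reference belongs to which file is not stated in this README.

```python
import re

# Pattern inferred from the references listed in this README:
# "hash://sha256/" followed by 64 lowercase hexadecimal characters.
HASH_URI = re.compile(r"^hash://sha256/([0-9a-f]{64})$")


def parse_content_reference(uri):
    """Return the hex digest embedded in a hash://sha256/ reference."""
    match = HASH_URI.match(uri.strip())
    if match is None:
        raise ValueError("not a hash://sha256/ reference: " + repr(uri))
    return match.group(1)
```

The returned string can be compared directly with the output of hashlib.sha256(...).hexdigest() for a candidate file.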

Files (4.9 GB)

Name                      Size      MD5
citations.csv.gz          9.7 MB    52a22a32308c62b9f2f1c9b207f3d18e
citations.csv.sha256      65 Bytes  66a75989587547b3106bd6f37e5ff232
citations.tsv.gz          9.7 MB    081120032a1f53ac9609a916fcfb81f6
citations.tsv.sha256      65 Bytes  abb37b31928be08188889023ebbdee38
dwca-by-study.zip         91.4 MB   01feff5c3d29db6198c3ffc6d08ecedb
dwca-by-study.zip.sha256  65 Bytes  49b87f80f3bddda29b66e9dff03cbafd
dwca.zip                  140.5 MB  8028edd72aa7fca7b947757769e21228
dwca.zip.sha256           65 Bytes  35d9ffcb2edbe3dad7906cec10d8421b
interactions.csv.gz       491.0 MB  11455fbabaa2b16c004a0e0e450adeed
interactions.csv.sha256   65 Bytes  6af824b00deda4ef31d6dc97f7ac744c
interactions.nq.gz        1.5 GB    4f3ddbd3fc57c01d53951e2139caa526
interactions.nq.sha256    65 Bytes  ee2ecd7942b93a24ee3298b2a4d7a969
interactions.tsv.gz       490.6 MB  f071ddac62cfea39632905f8820273d4
interactions.tsv.sha256   65 Bytes  87b448b99e0d0ba67de9de5ddd81e131
neo4j-graphdb.zip         2.0 GB    e8bd426254a0cc60a26ccaf25b1031f9
neo4j-graphdb.zip.sha256  65 Bytes  702c966314e83fe5dbd7dbc8de7f870c
README                    5.9 kB    837d4291ad3130a83d1babb854ac20d5
README.sha256             65 Bytes  c93b936cc7bc9760f84f68b83df36693
taxonCache.tsv.gz         80.6 MB   7d75966472b6caac19d3379a065ca9f0
taxonCache.tsv.sha256     65 Bytes  bafc4005d2b8469a6897e116a2eb6eab
taxonMap.tsv.gz           32.8 MB   42b1fa2787bc32bf26b8f1c33620e633
taxonMap.tsv.sha256       65 Bytes  fd40ce591c397e1c3ca8e5f4fab0cf2e