Global Biotic Interactions: Interpreted Data Products hash://md5/e76bf914309ad27dce6ab911d8854590 hash://sha256/ba79836caab5b7ba2d7d659123d27c89f4ad990bd50f97ded935edee9fbe9f87
Global Biotic Interactions: Interpreted Data Products
Global Biotic Interactions (GloBI, https://globalbioticinteractions.org, [1]) aims to facilitate access to existing species interaction records (e.g., predator-prey, plant-pollinator, virus-host). This data publication provides interpreted species interaction data products. These products are the result of a process in which versioned, existing species interaction datasets ([2]) are linked to the so-called GloBI Taxon Graph ([3]) and transformed into various aggregate formats (e.g., tsv, csv, neo4j, rdf/nquad, darwin core-ish archives). In addition, the applied name maps are included to make the applied taxonomic linking explicit.
Citation
--------
GloBI is made possible by researchers, collections, projects and institutions openly sharing their datasets. When using this data, please make sure to attribute these *original data contributors*, including citing the specific datasets in derivative work. Each species interaction record indexed by GloBI contains a reference and dataset citation. Also, a full list of all references can be found in the citations.csv/citations.tsv files in this publication. If you have ideas on how to make it easier to cite original datasets, please open/join a discussion via https://globalbioticinteractions.org or related projects.
To credit GloBI for more easily finding interaction data, please use the following citation to reference GloBI:
Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005.
Bias and Errors
--------
As with any analysis and processing workflow, care should be taken to understand the bias and error propagation of data sources and related data transformation processes. The datasets indexed by GloBI are biased geospatially, temporally and taxonomically ([4], [5]). Also, mapping of verbatim names from datasets to known name concepts may contain errors due to synonym mismatches, outdated name lists, typos or conflicting name authorities. Finally, bugs may introduce bias and errors in the resulting integrated data product.
To help better understand where bias and errors are introduced, only versioned data and code are used as an input: the datasets ([2]), name maps ([3]) and integration software ([6]) are versioned so that the integration processes can be reproduced if needed. This way, steps taken to compile an integrated data record can be traced and the sources of bias and errors can be more easily found.
This version was preceded by [7].
Contents
--------
README:
this file
citations.csv.gz:
contains data citations in a gzipped comma-separated values format.
citations.tsv.gz:
contains data citations in a gzipped tab-separated values format.
datasets.csv.gz:
contains list of indexed datasets in a gzipped comma-separated values format.
datasets.tsv.gz:
contains list of indexed datasets in a gzipped tab-separated values format.
verbatim-interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
verbatim-interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
interactions.csv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
interactions.tsv.gz:
contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources.
refuted-verbatim-interactions.csv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
refuted-verbatim-interactions.tsv.gz:
contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are *not* interpreted, but included as documented in their sources.
interactions.nq.gz:
contains species interactions expressed in the resource description framework in a gzipped rdf/quads format.
dwca-by-study.zip:
contains species interactions data as a Darwin Core Archive aggregated by study using a custom, occurrence level, association extension.
dwca.zip:
contains species interactions data as a Darwin Core Archive using a custom, occurrence level, association extension.
neo4j-graphdb.zip:
contains a neo4j v3.5.32 graph database snapshot containing a graph representation of the species interaction data.
taxonCache.tsv.gz:
contains hierarchies and identifiers associated with names from naming schemes in a gzipped tab-separated values format.
taxonMap.tsv.gz:
describes how names in existing datasets were mapped into existing naming schemes in a gzipped tab-separated values format.
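The gzipped tabular files listed above can be inspected without decompressing them to disk first. The following is a minimal sketch, not part of the GloBI tooling: the function names `iter_rows` and `preview` are hypothetical, and because this README does not list the column headers, the sketch makes no assumption about them and simply keys each record by whatever header row the file provides. It also assumes UTF-8 encoding and the default quoting rules of Python's csv module, which may need adjusting for the actual files.

```python
import csv
import gzip
from itertools import islice


def iter_rows(path, delimiter="\t"):
    """Yield each record of a gzipped delimited file as a dict keyed by its header row."""
    # Text mode lets the csv module decode lines as they are decompressed,
    # so multi-gigabyte files are streamed rather than loaded at once.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)


def preview(path, limit=5, delimiter="\t"):
    """Return the first `limit` records without reading the rest of the file."""
    return list(islice(iter_rows(path, delimiter), limit))
```

For example, `preview("interactions.tsv.gz")` would return the first five records of the interpreted interactions, and `preview("citations.csv.gz", delimiter=",")` would do the same for the comma-separated citations, assuming the files are in the working directory.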
References
-----
[1] Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. doi: 10.1016/j.ecoinf.2014.08.005.
[2] Poelen, J. H. (2020) Global Biotic Interactions: Elton Dataset Cache. Zenodo. doi: 10.5281/ZENODO.3950557.
[3] Poelen, J. H. (2021). Global Biotic Interactions: Taxon Graph (Version 0.3.28) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4451472
[4] Hortal, J. et al. (2015) Seven Shortfalls that Beset Large-Scale Knowledge of Biodiversity. Annual Review of Ecology, Evolution, and Systematics, 46(1), pp.523–549. doi: 10.1146/annurev-ecolsys-112414-054400.
[5] Cains, M. et al. (2017) Ivmooc 2017 - Gap Analysis Of Globi: Identifying Research And Data Sharing Opportunities For Species Interactions. Zenodo. doi: 10.5281/ZENODO.814978.
[6] Poelen, J. et al. (2022) globalbioticinteractions/globalbioticinteractions v0.24.6. Zenodo. doi: 10.5281/ZENODO.7327955.
[7] GloBI Community. (2024). Global Biotic Interactions: Interpreted Data Products hash://md5/946f7666667d60657dc89d9af8ffb909 hash://sha256/4e83d2daee05a4fa91819d58259ee58ffc5a29ec37aa7e84fd5ffbb2f92aa5b8 (0.7) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.11552565
Content References
-----
hash://sha256/5f4906439eba61f936b3dd7455a62c51656a74206f82d3f654e330fda6fbbe45 citations.csv.gz
hash://sha256/c8100368dae39363b241472695c1ae197aaddc6e3d6c0a14f3f5ee704b37f3f6 citations.tsv.gz
hash://sha256/e6f4aa897c5b325e444315e021b246ffed07fef764b0de6c0f1b2688bbdf9d0f datasets.csv.gz
hash://sha256/e6f4aa897c5b325e444315e021b246ffed07fef764b0de6c0f1b2688bbdf9d0f datasets.tsv.gz
hash://sha256/f11dc825609cdb1d4a3e9ba8caca9bf93c90dd6f660c7f6a0c8aa01c035a5e1f dwca-by-study.zip
hash://sha256/7f16aacacae74e8b0cdef04c612ba776f508ff7ffe385abc57583e37aec8fe53 dwca.zip
hash://sha256/b65e4c9a3615f1386bb97e45fb907d053df55476149aa6d71e6f398351218d0d interactions.csv.gz
hash://sha256/0c28032392f82d753690be126805e6334ca46bdc4b5e2102a79b15ce0cc0ba90 interactions.nq.gz
hash://sha256/8a7031250c288ba0da3d5cdbedc19d54c2f16ba3aa70d49826a7369b6edeca04 interactions.tsv.gz
hash://sha256/d0c0fbf536cc63c004d057efc14600ba8cc5874f401b08f51837273b7854f1bb neo4j-graphdb.zip
hash://sha256/50e77636f8b58c040e38b6a70ba7cc8288b190ef252dc0d4eb2f12f4c541e82f README
hash://sha256/a74e2a39cfe133ae9de1eeea94f5dda8cbd58cfe61a8ccf91b7c540757719c74 refuted-interactions.csv.gz
hash://sha256/37b06e274e41ca749399763989816854101238ade9863365f384a2764c639e9d refuted-interactions.tsv.gz
hash://sha256/23315b6cd3fdc91f9c1d5d5bc39fa52cf1cef7a4e97d9d023d452751df13f30e refuted-verbatim-interactions.csv.gz
hash://sha256/ff82e40cee4f8a8852d0c241f5027f66157a2b8a9090ffa3a0a329a206828d96 refuted-verbatim-interactions.tsv.gz
hash://sha256/f072fbc7affb6e29978c7540af6cdccd3a219a23b0a4765b5bae56bd20df0d88 taxonCache.tsv.gz
hash://sha256/cd28c81bb2432646a81ad216bc11818f7568ce81826e0074d9a33579da2c1426 taxonMap.tsv.gz
hash://sha256/a1d14aa47806c624cf7e3a8c8236643dcf19ed1835c79c65958f7317ebfb9566 verbatim-interactions.csv.gz
hash://sha256/2284434219d5fdab1e2152955f04363852c132b76709c330d33e31517817a82e verbatim-interactions.tsv.gz
hash://md5/d6ebf42729d988e15cb30adfa6112234 citations.csv.gz
hash://md5/42877ae68e51871b8eb7116e62f6b268 citations.tsv.gz
hash://md5/3e437580296fdeff3b6f35d1331db9d1 datasets.csv.gz
hash://md5/3e437580296fdeff3b6f35d1331db9d1 datasets.tsv.gz
hash://md5/fe88720fd992771bd64bfa220ad6a7d3 dwca-by-study.zip
hash://md5/cbe132a9288feaef2f3e0c0409b8dc2f dwca.zip
hash://md5/051f6db667c4b84616223c2776464dbf interactions.csv.gz
hash://md5/b66857f8750e56ba9abe484b1f72eac4 interactions.nq.gz
hash://md5/300839c346184b2fedc4e1fb31bcc29c interactions.tsv.gz
hash://md5/e79cf5ffee919672f99ea338f3661566 neo4j-graphdb.zip
hash://md5/898678f47561d7ef53722bc32957dcd9 README
hash://md5/65a185f19df304e53f92a7275f2de291 refuted-interactions.csv.gz
hash://md5/bc37a4354f8a2402e9335ae44f28cbd7 refuted-interactions.tsv.gz
hash://md5/42e817c31e2ca05e582be94e6ec283c5 refuted-verbatim-interactions.csv.gz
hash://md5/93639b70a1d8e47fd194b6384c0287a7 refuted-verbatim-interactions.tsv.gz
hash://md5/e32482b3697aa928a5fcb58a570191df taxonCache.tsv.gz
hash://md5/75251510925875d3fdc1952cc4b98043 taxonMap.tsv.gz
hash://md5/6a0c6224f4a4c3dca9994d70ad0b2fd2 verbatim-interactions.csv.gz
hash://md5/905acb49a700e5b5a292be02c917e710 verbatim-interactions.tsv.gz
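The content references above can be used to check that a downloaded file is intact. The following is a minimal sketch under stated assumptions: it treats each reference line as `hash://<algorithm>/<hex digest> <filename>`, as the lines above appear to be laid out, and the helper names (`parse_content_references`, `file_digest`, `verify`) are hypothetical rather than part of any GloBI tool.

```python
import hashlib
import re

# Matches lines such as "hash://sha256/<hex> <filename>" (layout assumed from this README).
HASH_LINE = re.compile(r"^hash://(sha256|md5)/([0-9a-f]+)\s+(\S+)$")


def parse_content_references(text):
    """Map (algorithm, filename) to the expected hex digest."""
    expected = {}
    for line in text.splitlines():
        match = HASH_LINE.match(line.strip())
        if match:
            algorithm, digest, filename = match.groups()
            expected[(algorithm, filename)] = digest
    return expected


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path, filename, expected, algorithm="sha256"):
    """Return True when the local file matches the digest listed for `filename`."""
    return file_digest(path, algorithm) == expected[(algorithm, filename)]
```

Given the text of this section as `references`, a call such as `verify("taxonMap.tsv.gz", "taxonMap.tsv.gz", parse_content_references(references))` would report whether the local copy matches its listed sha256 digest.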
Files (31.5 GB)
Name | Size | md5 |
---|---|---|
citations.csv.gz | 40.9 MB | d6ebf42729d988e15cb30adfa6112234 |
citations.tsv.gz | 40.8 MB | 42877ae68e51871b8eb7116e62f6b268 |
datasets.csv.gz | 4.4 kB | 3e437580296fdeff3b6f35d1331db9d1 |
datasets.tsv.gz | 4.4 kB | 3e437580296fdeff3b6f35d1331db9d1 |
dwca-by-study.zip | 355.3 MB | fe88720fd992771bd64bfa220ad6a7d3 |
dwca.zip | 721.8 MB | cbe132a9288feaef2f3e0c0409b8dc2f |
interactions.csv.gz | 3.0 GB | 051f6db667c4b84616223c2776464dbf |
interactions.nq.gz | 13.1 GB | b66857f8750e56ba9abe484b1f72eac4 |
interactions.tsv.gz | 3.0 GB | 300839c346184b2fedc4e1fb31bcc29c |
neo4j-graphdb.zip | 9.1 GB | e79cf5ffee919672f99ea338f3661566 |
README | 10.5 kB | e76bf914309ad27dce6ab911d8854590 |
refuted-interactions.csv.gz | 17.5 MB | 65a185f19df304e53f92a7275f2de291 |
refuted-interactions.tsv.gz | 17.5 MB | bc37a4354f8a2402e9335ae44f28cbd7 |
refuted-verbatim-interactions.csv.gz | 2.5 MB | 42e817c31e2ca05e582be94e6ec283c5 |
refuted-verbatim-interactions.tsv.gz | 2.5 MB | 93639b70a1d8e47fd194b6384c0287a7 |
taxonCache.tsv.gz | 310.2 MB | e32482b3697aa928a5fcb58a570191df |
taxonMap.tsv.gz | 111.2 MB | 75251510925875d3fdc1952cc4b98043 |
verbatim-interactions.csv.gz | 827.9 MB | 6a0c6224f4a4c3dca9994d70ad0b2fd2 |
verbatim-interactions.tsv.gz | 825.5 MB | 905acb49a700e5b5a292be02c917e710 |