Dataset Open Access
Poelen, Jorrit
{ "description": "<p>This supplementary data publication contains:</p>\n\n<p><strong>links-globi-wd-ott.tsv.gz:</strong> aggregate list of taxon graphs from Open Tree of Life Taxonomy (OTT), GloBI and Wikidata. This tab-separated, two-column table describes the taxonomic identifiers (e.g., NCBI:9606) that map into OTT, GloBI and Wikidata. For instance, the line \"NCBI:9689{tab}WD:Q140\" indicates that Wikidata links its lion (<em>Panthera leo</em>, https://www.wikidata.org/wiki/Q140) to NCBI's lion (<em>Panthera leo</em>, https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9689).</p>\n\n<p><strong>wikidata-taxon-info20171227.tsv.gz: </strong>a terse, five-column, tab-separated file of taxon objects extracted from: WikiData. (2018). Wikidata dump 2017-12-27 [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1211767 . The columns contain the following:</p>\n\n<ol>\n\t<li>Wikidata taxon item id (e.g., Q140 or https://www.wikidata.org/wiki/Q140)</li>\n\t<li>scientific name of the taxon item (e.g., Panthera leo, Mammalia)</li>\n\t<li>rank id of the taxon item (e.g., Q7432 species or https://www.wikidata.org/wiki/Q7432). To retrieve a full list of Wikidata taxon rank ids and their common names, you can query Wikidata with SPARQL (e.g., <a href=\"https://github.com/globalbioticinteractions/nomer/blob/c3a1f5a2ebfb87ffc67e3bace19b82d96c0d25e8/nomer/src/main/java/org/globalbioticinteractions/nomer/util/WikidataTaxonRankLoader.java\">Nomer's WikidataTaxonRankLoader</a> ). </li>\n\t<li>parent ids of the taxon item, using pipes \"|\" as separators if there are multiple parents. Please note that some taxon items have multiple parents (e.g., https://www.wikidata.org/wiki/Q774014).</li>\n\t<li>external taxonomic identifiers that the taxon item links to (e.g., \"ITIS:162532|EOL:8266|GBIF:2960|WORMS:125440\"). If multiple are present, pipes \"|\" are used to separate the links. 
Only a selection of taxonomic schemes was used, namely: NCBI, GBIF, ITIS, WORMS, FISHBASE, IF (Index Fungorum) and EOL.</li>\n</ol>\n\n<p>The datasets can be recreated using the scripts in https://github.com/bio-guoda/guoda-datasets/tree/master/wikidata or <a href=\"https://doi.org/10.5281/zenodo.1428949\">https://doi.org/10.5281/zenodo.1428949</a> .</p>", "license": "https://creativecommons.org/licenses/by/4.0/legalcode", "creator": [ { "@type": "Person", "name": "Poelen, Jorrit" } ], "url": "https://zenodo.org/record/1213477", "datePublished": "2018-04-06", "version": "0.1", "@context": "https://schema.org/", "distribution": [ { "contentUrl": "https://zenodo.org/api/files/5b7e2a31-01bd-4c04-9e2d-84c8d689798d/links-globi-wd-ott.tsv.gz", "encodingFormat": "gz", "@type": "DataDownload" }, { "contentUrl": "https://zenodo.org/api/files/5b7e2a31-01bd-4c04-9e2d-84c8d689798d/wikidata-taxon-info20171227.tsv.gz", "encodingFormat": "gz", "@type": "DataDownload" } ], "identifier": "https://doi.org/10.5281/zenodo.1213477", "@id": "https://doi.org/10.5281/zenodo.1213477", "@type": "Dataset", "name": "20 GB in 10 minutes: Data linking across major biodiversity databases: Data supplements" }
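The two file layouts described in the metadata above can be read with a few lines of Python. The following is a minimal sketch, not the author's tooling: it assumes the files have no header row, that the five columns appear in the order listed in the description, and that a missing value is an empty field. The function names (`parse_link_line`, `parse_taxon_row`, `read_taxon_info`) and the dictionary keys are hypothetical and chosen only for illustration.

```python
import csv
import gzip
from typing import Dict, Iterator, List, Tuple


def split_pipes(field: str) -> List[str]:
    """Split a pipe-separated field, dropping empty entries."""
    return [part for part in field.split("|") if part]


def parse_link_line(line: str) -> Tuple[str, str]:
    """Parse one line of links-globi-wd-ott.tsv.gz into an identifier pair.

    Assumes exactly two tab-separated columns, e.g. "NCBI:9689<tab>WD:Q140".
    Raises ValueError if the line has fewer than two columns.
    """
    left, right = line.rstrip("\r\n").split("\t")[:2]
    return left, right


def parse_taxon_row(row: List[str]) -> Dict[str, object]:
    """Map one row of wikidata-taxon-info20171227.tsv.gz to a dict.

    Assumed column order (taken from the dataset description):
    item id, scientific name, rank id, parent ids, external ids.
    Short rows are padded with empty strings; extra columns are ignored.
    """
    item_id, name, rank_id, parents, links = (row + [""] * 5)[:5]
    return {
        "item_id": item_id,
        "name": name,
        "rank_id": rank_id,
        "parent_ids": split_pipes(parents),
        "external_ids": split_pipes(links),
    }


def read_taxon_info(path: str) -> Iterator[Dict[str, object]]:
    """Stream parsed rows from the gzipped taxon info file at `path`."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters in names as ordinary text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row:
                yield parse_taxon_row(row)
```

Streaming the rows with a generator avoids loading the whole file into memory. Item and rank ids are returned as found, so a caller that needs a uniform form would still have to normalize values such as `Q140` versus `https://www.wikidata.org/wiki/Q140`.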
Metric | All versions | This version |
---|---|---|
Views | 196 | 196 |
Downloads | 42 | 42 |
Data volume | 2.5 GB | 2.5 GB |
Unique views | 186 | 186 |
Unique downloads | 35 | 35 |