Published April 10, 2019 | Version v1
Dataset Open

MetaLink - Closure and Error Degree of 556M owl:sameAs statements

  • 1. Vrije Universiteit Amsterdam

Description

MetaLink is a dataset that contains metadata for a very large set of owl:sameAs links crawled from the LOD Cloud. MetaLink encodes a previously published error metric for each of these links [Raad et al., 2018]. This error degree ranges from 0.0 (most likely correct) to 1.0 (most likely incorrect). The idea is that the more isolated an owl:sameAs link is in the network of all owl:sameAs links, the higher its error degree. Experiments show that discarding the 1M owl:sameAs links with an error degree > 0.99 can significantly increase the quality of the transitive closure. In addition, keeping only the 400M owl:sameAs links with an error degree <= 0.4 yields a closure that is 100% precise in several manually evaluated cases. The equivalence classes resulting from these different closures are publicly available online.
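
The thresholding described above can be illustrated with a minimal sketch. It assumes the links have already been loaded as (subject, object, error degree) tuples; the function name, the sample IRIs, and the in-memory representation are hypothetical and are not part of MetaLink. The closure is computed with a standard union-find structure, which is one common way to build equivalence classes and not necessarily the method used to produce the published closures.

```python
from collections import defaultdict


def equivalence_classes(links, max_error=0.4):
    """Group terms connected by owl:sameAs links whose error degree is <= max_error.

    `links` is an iterable of (subject, object, error_degree) tuples
    (a hypothetical in-memory representation, assumed for illustration).
    """
    parent = {}

    def find(x):
        # Register unseen terms as their own root, then walk up to the root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for s, o, err in links:
        if err <= max_error:  # discard links that are likely erroneous
            rs, ro = find(s), find(o)
            if rs != ro:
                parent[rs] = ro

    classes = defaultdict(set)
    for term in parent:
        classes[find(term)].add(term)
    # Singletons carry no identity information, so only larger classes are kept.
    return [c for c in classes.values() if len(c) > 1]


# Made-up sample links, for illustration only.
sample_links = [
    ("ex:a", "ex:b", 0.1),
    ("ex:b", "ex:c", 0.3),
    ("ex:c", "ex:d", 0.995),  # isolated link with a high error degree
    ("ex:e", "ex:f", 0.2),
]

strict = equivalence_classes(sample_links, max_error=0.4)
permissive = equivalence_classes(sample_links, max_error=1.0)
```

With the strict threshold, ex:d stays out of the class containing ex:a, ex:b, and ex:c; with max_error=1.0 the suspicious link is followed and ex:d is merged into that class.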

MetaLink is published in combination with LOD-a-lot, a dataset that is based on a very large crawl of a subset of the LOD Cloud. By combining MetaLink and LOD-a-lot, applications are able to make informed decisions about whether or not to follow specific links on the LOD Cloud. This dataset contains 4,352,602,452 unique triples, and is available in HDT (Header Dictionary Triples) format. It can be navigated online using the TriplyDB Linked Data hosting platform: https://krr.triply.cc/krr/metalink.  
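
As a rough illustration of such an informed decision, the sketch below assumes an application has already extracted error degrees from MetaLink into a dictionary keyed by (subject, object) pairs. The function, its parameters, and the sample IRIs are hypothetical; the default threshold of 0.4 is simply the value mentioned in the description above, and other applications may prefer a different cut-off.

```python
def should_follow(link, error_degrees, threshold=0.4, follow_unknown=False):
    """Decide whether to follow an owl:sameAs link.

    `link` is a (subject, object) pair and `error_degrees` maps such pairs to
    an error degree in [0.0, 1.0] (an assumed in-memory lookup, not a MetaLink API).
    Links without metadata are followed only if `follow_unknown` is True.
    """
    err = error_degrees.get(link)
    if err is None:
        return follow_unknown
    return err <= threshold


# Made-up error degrees, for illustration only.
degrees = {
    ("ex:a", "ex:b"): 0.12,
    ("ex:c", "ex:d"): 0.995,
}

follow_ab = should_follow(("ex:a", "ex:b"), degrees)
follow_cd = should_follow(("ex:c", "ex:d"), degrees)
follow_xy = should_follow(("ex:x", "ex:y"), degrees)
```

The explicit follow_unknown flag makes the policy for links that lack metadata a visible choice instead of an accident of the lookup.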

A figure describing the vocabulary of the MetaLink dataset can be found here. Classes are displayed as circles and properties as arcs. MetaLink-specific classes and properties are displayed in red; classes and properties reused from existing vocabularies are displayed in blue.

Files

Files (73.0 GB)

Checksum Size
md5:1546c60c84d57abd448478c0cc89bb86 36.0 GB
md5:39121f59c72b4f3ffcc63649bc5eb3fc 37.0 GB