
Dataset Open Access

Graphs of redirection: an examination of URIs in identity graphs

Wang, Shuai; Nasim, Idries; Raad, Joe; Bloem, Peter; van Harmelen, Frank

This is the dataset for our paper:

What does it mean when your URIs are redirected? Examining identity and redirection in the LOD cloud

Redirection of URIs is widely used in the LOD cloud, and is even part of the best practice guidelines as an approach to the "curation problem" on the semantic web (i.e., how to repair imperfections). When a URI is dereferenced, it may be redirected to another URI. Such a redirection can result from an update of the namespace, a different encoding scheme, or some other reason. In this paper, we study the semantics of redirection and examine whether redirection indicates how entities in the LOD cloud evolve. More specifically, we focus on entities in identity graphs: subgraphs of the semantic web restricted to identity links. The entities we study are from sameAs.cc, an identity graph extracted from a crawl of the semantic web in 2015. Our analytical results include an examination of edges and chains of redirection as well as a statistical analysis of the redirection behavior of sampled entities. Additionally, we present properties of the graphs formed by redirection relations.

The dataset contains the redirect relations of four sets of sampled entities. These sampled files are:

  • ite_uniform...: the edges of the redirection graph corresponding to the uniform sampling
  • cc_sample_2...: the samples drawn from connected components of size 2, 3-10, and 10+, respectively.
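As a rough illustration of how the redirect edge files might be explored, the sketch below reads N-Triples lines and follows a chain of redirections from a starting URI. It is not taken from the authors' scripts. It assumes each line is a plain `<subject> <predicate> <object> .` triple with IRIs in all three positions, it ignores the predicate, and it assumes each URI has at most one outgoing redirect. The `example.org` URIs, the `redirectsTo` predicate, and the function names `parse_edges` and `redirect_chain` are hypothetical.

```python
import re

# Matches a simple N-Triples line whose subject, predicate and object are all IRIs.
# Assumption: the edge files contain only such lines (no literals, no blank nodes).
TRIPLE = re.compile(r'^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def parse_edges(lines):
    """Yield (source, target) pairs from N-Triples lines, ignoring the predicate."""
    for line in lines:
        match = TRIPLE.match(line)
        if match:
            yield match.group(1), match.group(3)


def redirect_chain(start, edges, max_hops=50):
    """Follow redirect edges from start until a dead end, a loop, or max_hops."""
    next_hop = dict(edges)  # assumes at most one outgoing redirect per URI
    chain = [start]
    seen = {start}
    while chain[-1] in next_hop and len(chain) <= max_hops:
        target = next_hop[chain[-1]]
        chain.append(target)
        if target in seen:  # the chain loops back on itself; stop here
            break
        seen.add(target)
    return chain


# Hypothetical sample data standing in for lines of a *_redirect_edges.nt file.
sample = [
    '<http://example.org/a> <http://example.org/redirectsTo> <http://example.org/b> .',
    '<http://example.org/b> <http://example.org/redirectsTo> <http://example.org/c> .',
]
edges = list(parse_edges(sample))
print(redirect_chain('http://example.org/a', edges))
```

To try this on a downloaded file, the `sample` list could be replaced by an open file handle, for example `open("cc_sample_2_redirect_edges.nt", encoding="utf-8")`. For anything beyond a quick look, a full N-Triples parser would be more robust than the regular expression used here.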

The Python scripts are open source and available online at:

https://github.com/shuaiwangvu/redirection

The paper is attached. In case of any questions, please contact Shuai Wang at shuai.wang@vu.nl.

Files (56.1 MB)
Name                                      Size       Checksum
cc_sample_2_redirect_edges.nt             3.5 MB     md5:85d41aac5bc744c815db0d7927473dba
cc_sample_2_redirect_nodes.tsv            3.1 MB     md5:5565ae213bc9f178e1404948bff294a5
cc_samples_10_redirect_edges.nt           3.2 MB     md5:aeb3ddc74e89f87bbc0bbae3be4122ff
cc_samples_10_redirect_nodes.tsv          2.9 MB     md5:09be6b96a377529850cf07e8066d9b4b
cc_samples_3_9_redirect_edges.nt          3.8 MB     md5:38ae62a478f005b1a5588bd13202fdbf
cc_samples_3_9_redirect_nodes.tsv         3.2 MB     md5:8d9722992dded0611b00321b739fd85d
ite_uniform_sampling_redirect_edges.nt    19.8 MB    md5:bb716609000ba4420819478cef4de72f
ite_uniform_sampling_redirect_nodes.tsv   16.0 MB    md5:28e3c4ff929d717c67a65e54b357d84e
new_MEPDaW_redirection (3).pdf            92.4 kB    md5:9164459b4c4d246f600d33e425fa8ae4
sample4000.tsv                            489.4 kB   md5:1ff5cfa78edb31816bbb9a231a6a637e
sampled_chains_100.tsv                    24.2 kB    md5:ac32c71dadefa9bc40c7b065165475b1
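
The MD5 checksums listed with the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify` are hypothetical, and the local path in the usage comment assumes the file was saved under its original name in the working directory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 digest with an expected value such as 'md5:85d4...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()  # drop an optional 'md5:' prefix
    return md5_of_file(path) == expected_hex


# Usage (assumes the file has been downloaded to the current directory):
# verify("cc_sample_2_redirect_edges.nt", "md5:85d41aac5bc744c815db0d7927473dba")
```

A result of `False` would suggest a corrupted or incomplete download, in which case fetching the file again is the simplest remedy.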