Dataset Open Access
Shuai Wang; Idries Nasim; Joe Raad; Peter Bloem; Frank van Harmelen
This is the dataset for our paper "What does it mean when your URIs are redirected? Examining identity and redirection in the LOD cloud".
Redirection of URIs is widely used in the LOD cloud, and is even part of the best practice guidelines as an approach to the "curation problem" on the semantic web (i.e. how to repair imperfections). When dereferencing, one URI is redirected to another URI. Such a redirection could be the result of an update of the namespace, a different encoding scheme, or some other reason. In this paper, we study the semantics of redirection and examine whether redirection indicates how entities in the LOD cloud evolve. More specifically, we focus on entities in identity graphs: subgraphs of the semantic web restricted to identity links. The entities we study are from sameAs.cc, an identity graph extracted from a crawl of the semantic web in 2015. Our analytical results include an examination of edges and chains of redirection as well as a statistical analysis of the redirection behavior of sampled entities. Additionally, we present properties of the graphs formed by redirection relations.
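The dataset contains the redirect relations of four sets of sampled entities. The sampled files are listed in the file table below; each set comes as a pair of files, one with redirect edges (.nt) and one with redirect nodes (.tsv).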
The Python scripts are available as open source at:
https://github.com/shuaiwangvu/redirection
The paper is attached. In case of any questions, please contact Shuai Wang at shuai.wang@vu.nl.
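For readers who want to explore the redirect edges before turning to the scripts in the repository, the following is a minimal sketch of how chains of redirection could be followed in Python. It is not the authors' implementation. It assumes that each line of an edges file (for example cc_sample_2_redirect_edges.nt) is an N-Triples statement whose subject is the redirected URI and whose object is the redirect target, and that each URI has at most one target; the predicate is ignored because it is not documented here. The function names `load_redirects` and `follow_chain` and the `max_hops` parameter are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Matches an N-Triples statement in which subject, predicate, and object are all IRIs.
TRIPLE_RE = re.compile(r'^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def load_redirects(lines: Iterable[str]) -> Dict[str, str]:
    """Build a source -> target map from N-Triples lines.

    Assumption: subject = redirected URI, object = redirect target.
    Lines that are not IRI-only triples (comments, literals, blanks) are skipped.
    """
    redirects: Dict[str, str] = {}
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match is None:
            continue
        source, _predicate, target = match.groups()
        redirects[source] = target  # a later line overwrites an earlier one
    return redirects


def follow_chain(start: str, redirects: Dict[str, str], max_hops: int = 50) -> List[str]:
    """Follow redirects from `start`, stopping at a dead end, a cycle, or max_hops."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            break  # the chain loops back on itself; the repeated URI is kept as the last item
        seen.add(current)
    return chain
```

To try it on a downloaded file, pass an open file handle to `load_redirects`, for instance `load_redirects(open("cc_sample_2_redirect_edges.nt", encoding="utf-8"))`, and then call `follow_chain` on any URI of interest. The regular expression is a deliberate simplification; a full N-Triples parser would be the safer choice if the files contain literals or blank nodes.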
Name | Checksum | Size |
---|---|---|
cc_sample_2_redirect_edges.nt | md5:85d41aac5bc744c815db0d7927473dba | 3.5 MB |
cc_sample_2_redirect_nodes.tsv | md5:5565ae213bc9f178e1404948bff294a5 | 3.1 MB |
cc_samples_10_redirect_edges.nt | md5:aeb3ddc74e89f87bbc0bbae3be4122ff | 3.2 MB |
cc_samples_10_redirect_nodes.tsv | md5:09be6b96a377529850cf07e8066d9b4b | 2.9 MB |
cc_samples_3_9_redirect_edges.nt | md5:38ae62a478f005b1a5588bd13202fdbf | 3.8 MB |
cc_samples_3_9_redirect_nodes.tsv | md5:8d9722992dded0611b00321b739fd85d | 3.2 MB |
ite_uniform_sampling_redirect_edges.nt | md5:bb716609000ba4420819478cef4de72f | 19.8 MB |
ite_uniform_sampling_redirect_nodes.tsv | md5:28e3c4ff929d717c67a65e54b357d84e | 16.0 MB |
new_MEPDaW_redirection (3).pdf | md5:9164459b4c4d246f600d33e425fa8ae4 | 92.4 kB |
sample4000.tsv | md5:1ff5cfa78edb31816bbb9a231a6a637e | 489.4 kB |
sampled_chains_100.tsv | md5:ac32c71dadefa9bc40c7b065165475b1 | 24.2 kB |
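The MD5 checksums listed with the files can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical and not part of the published scripts.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, after downloading sampled_chains_100.tsv into the working directory, `matches_listed_md5("sampled_chains_100.tsv", "md5:ac32c71dadefa9bc40c7b065165475b1")` should return `True` if the file arrived unchanged.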