This deposit consists of the manually annotated gold standard, graph files, and evaluation results for our study of (pseudo-)transitive relations.
The gold-standard.zip file contains the manually evaluated triples.
The files were exported from ANNit with the following columns:
- LEFT* for the source URI of the triple.
- RIGHT* for the target URI of the triple.
- UserChioce for the choice the user made during manual evaluation.
- Decision* for the actual decision made by the rater/annotator. It is one of unknown, remove, or remain.
- Comment, if any.
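As a rough illustration of how the Decision column might be tallied, here is a minimal Python sketch. It assumes the exported files are delimited text with a header row and that the decision column is named exactly "Decision"; the deposit does not state the export format or whether the asterisk is part of the header, so both the column name and the delimiter are parameters. The function name count_decisions is hypothetical.

```python
import csv
from collections import Counter

# The three decision values listed in the column description above.
ALLOWED_DECISIONS = {"unknown", "remove", "remain"}

def count_decisions(handle, decision_column="Decision", delimiter=","):
    """Count decision values in an open text file of exported triples.

    Assumption: the file has a header row containing `decision_column`.
    Adjust `decision_column` and `delimiter` to match the actual export.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    counts = Counter()
    for row in reader:
        # Normalise whitespace and case before comparing.
        decision = (row.get(decision_column) or "").strip().lower()
        if decision not in ALLOWED_DECISIONS:
            raise ValueError(f"unexpected decision value: {decision!r}")
        counts[decision] += 1
    return counts
```

The function takes an already opened file object, so it can be called as count_decisions(open(path, newline="", encoding="utf-8")) once the real file name and format are known.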
The corresponding gold-standard.pdf file provides all the details about the creation of the gold standard.
The graph_file folder contains the unweighted graphs and two sets of weighted graphs: graphs with counted weights and graphs with inferred weights, in the counted_weights and inferred_weights subdirectories respectively.
The files are gzip-compressed (*.gz). Each file consists of two columns of integers, the source and the target of an edge. The integers correspond to URIs; the mapping files are in the mapping directory.
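The two-column layout described above could be read along the following lines. This is a minimal sketch, assuming the columns are separated by whitespace (spaces or tabs), which the deposit does not specify. The function names read_edges and build_adjacency are hypothetical and not part of the published source code.

```python
import gzip
from collections import defaultdict

def read_edges(path):
    """Yield (source, target) integer pairs from a gzip-compressed edge list.

    Assumption: one edge per line, columns separated by whitespace.
    Only the first two columns are read; any further columns are ignored.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or incomplete lines
            yield int(parts[0]), int(parts[1])

def build_adjacency(edges):
    """Group edge targets by source into a dict of sets."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
    return adjacency
```

For example, build_adjacency(read_edges(path)) gives the successors of each node as integers, which can then be translated back to URIs through the mapping files once their format is known.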
The unweighted graphs are also provided in WebGraph format. These files were used when evaluating our algorithm against the existing web-scale feedback-arc-set algorithm.
The source code for refinement is at https://github.com/shuaiwangvu/Refining-Transitive-Relations
Some raw data for Table 2 and Table 3 of our corresponding paper, together with their analysis, are also provided. There were two static settings of the parameter for Table 2; we chose the first setting for the final presentation in the paper.
Should there be any problem with these datasets, please feel free to report to us at the following email address: firstname.lastname@example.org.
A link to the full paper will be published when the paper is accepted.