Knowledge Graph Triple Validation by LLMs and Human-in-the-Loop

Published September 7, 2024 | Version v1

Dataset Open

Supplementary material for the article submitted to the IPM Special Issue on Large Language Models and Data Quality for Knowledge Graphs.

The dataset is an extension of [1] and includes the following columns:

subj the subject/head of the triple
rel the predicate of the triple
obj the object/tail of the triple
support-level indicating the reliability of the triple
gpt-4o-1 [1: valid, 0: invalid], response from 1st GPT prompt
gpt-4o-2 [1: valid, 0: invalid], response from 2nd GPT prompt
gpt-4o-3 [1: valid, 0: invalid], response from 3rd GPT prompt
gpt-4o-majority [1: valid, 0: invalid], GPT annotation, computed as the majority vote of gpt-4o-1, gpt-4o-2, and gpt-4o-3
ann-random [1: valid, 0: invalid], randomly selected annotation from the expert annotations available in [1]
ann-new [1: valid, 0: invalid], junior expert annotation
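
As an illustration of how the gpt-4o-majority column relates to the three prompt columns, the following minimal Python sketch recomputes a majority vote from gpt-4o-1, gpt-4o-2, and gpt-4o-3. It assumes that data.csv is comma-delimited, that its header uses the column names listed above, and that the values are stored as the integers 0 and 1; none of this is confirmed by the record itself. The sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in the dataset description.
VOTE_COLUMNS = ["gpt-4o-1", "gpt-4o-2", "gpt-4o-3"]


def majority_vote(row):
    """Return 1 if at least two of the three prompt responses are 1, else 0."""
    votes = [int(row[column]) for column in VOTE_COLUMNS]
    return 1 if sum(votes) >= 2 else 0


# Hypothetical rows for illustration only (assumed CSV layout).
SAMPLE_CSV = (
    "subj,rel,obj,gpt-4o-1,gpt-4o-2,gpt-4o-3\n"
    "entity_a,uses,entity_b,1,1,0\n"
    "entity_c,includes,entity_d,0,0,1\n"
    "entity_e,produces,entity_f,1,1,1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
recomputed = [majority_vote(row) for row in rows]
print(recomputed)
```

To run the same check on the real file, replace the in-memory sample with `open("data.csv", newline="")` and compare each recomputed value with the stored gpt-4o-majority value.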

[1] https://github.com/danilo-dessi/SKG-pipeline/tree/main/eval

Files

Name	Size
data.csv md5:def339def6505309c76373f56f6508a4	250.7 kB
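
The md5 value listed for data.csv can be used to check that a downloaded copy is intact. The sketch below is one way to do this with Python's standard library; the local path "data.csv" and the helper names are assumptions chosen for illustration.

```python
import hashlib

# Checksum as published in the file listing above.
EXPECTED_MD5 = "def339def6505309c76373f56f6508a4"


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's MD5 equals the published value."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_published_checksum("data.csv")` should return True for an unmodified download, assuming the file was saved under that name in the working directory.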