Published May 1, 2024 | Version v1
Publication Open

Efficient and Reliable Estimation of Knowledge Graph Accuracy

  • 1. Department of Information Engineering, University of Padua

Description

Data accuracy is a central dimension of data quality, especially when dealing with Knowledge Graphs (KGs). Auditing the accuracy of KGs is essential to make informed decisions in entity-oriented services or applications. However, manually evaluating the accuracy of large-scale KGs is prohibitively expensive, so research has focused on developing efficient sampling techniques for estimating KG accuracy. This work addresses the limitations of current KG accuracy estimation methods, which rely on the Wald method to build confidence intervals and therefore suffer from reliability issues such as zero-width and overshooting intervals. Our solution, rooted in the Wilson method and tailored for complex sampling designs, overcomes these limitations and ensures applicability across various evaluation scenarios. We show that the presented methods increase the reliability of accuracy estimates by up to two times compared to the state of the art while preserving or enhancing efficiency. Additionally, these gains hold regardless of KG size or topology.
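
To illustrate the two reliability issues named above, the following minimal sketch compares the textbook Wald and Wilson intervals for a binomial proportion. It assumes simple random sampling of triples and a 95% confidence level. It is not the paper's method, which adapts the Wilson interval to complex sampling designs. The function names and the sample counts are hypothetical and chosen only for illustration.

```python
import math

# Standard normal quantile for a two-sided 95% interval (assumed confidence level).
Z_95 = 1.959963984540054


def wald_interval(correct, n, z=Z_95):
    """Textbook Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = correct / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(correct, n, z=Z_95):
    """Textbook Wilson score interval for a binomial proportion."""
    p_hat = correct / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical annotation outcomes: (correct triples, sampled triples).
    for correct, n in [(30, 30), (29, 30)]:
        print(correct, n, "Wald:", wald_interval(correct, n))
        print(correct, n, "Wilson:", wilson_interval(correct, n))
```

When every sampled triple is judged correct, the Wald interval collapses to a single point (zero width), and when nearly all are correct its upper bound exceeds 1 (overshooting). The Wilson interval keeps a positive width and stays within [0, 1] in both cases.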

Files

Efficient and Reliable Estimation of Knowledge Graph Accuracy.pdf

Files (1.9 MB)

Additional details

Funding

European Commission
HEREDITARY - HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY 101137074