Results of KROWN: Knowledge Graph Construction Benchmark

Van Assche, Dylan; Chaves-Fraga, David; Dimou, Anastasia

doi:10.5281/zenodo.10973892

Published April 17, 2024 | Version 0.9.0

Dataset Open

1. IDLab
2. Ghent University
3. IMEC
4. Universidade de Santiago de Compostela
5. KU Leuven

In this Zenodo repository we present the results of using KROWN to benchmark popular RDF Graph Materialization systems such as RMLMapper, RMLStreamer, Morph-KGC, SDM-RDFizer, and Ontop (in materialization mode).

What is KROWN 👑?

KROWN 👑 is a benchmark for materialization systems to construct Knowledge Graphs from (semi-)heterogeneous data sources using declarative mappings such as RML.

Many benchmarks already exist for virtualization systems, e.g., GTFS-Madrid-Bench, NPD, and BSBM, which focus on complex queries with a single declarative mapping. However, materialization systems are unaffected by complex queries, since their input is the dataset and the mappings used to generate a Knowledge Graph. Some specialized datasets exist to benchmark specific limitations of materialization systems, such as duplicated or empty values in datasets (e.g., GENOMICS), but they do not cover all aspects of materialization systems. It is therefore hard to compare materialization systems with each other in general, which is where KROWN 👑 comes in!

Results

The raw results are available as ZIP archives; the analysis of the results is available in the spreadsheet results.ods.

Evaluation setup

We generated several scenarios using KROWN’s data generator and executed them 5 times with KROWN’s execution framework. All experiments were performed on Ubuntu 22.04 LTS machines (Linux 5.15.0, x86_64), each with an Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 48 GB of RAM, and 2 GB of swap. The output format of each materialization system was set to N-Triples.
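Since every scenario is executed 5 times, the repeated measurements need to be condensed before systems can be compared. The following is a minimal sketch of one way to do that, assuming per-run durations in seconds are already available as a list. The function name `summarize_runs`, the choice of median and standard deviation, and the sample values are illustrative assumptions; they are not taken from KROWN or from results.ods.

```python
import statistics

# Each scenario was executed 5 times (see the evaluation setup above).
EXPECTED_RUNS = 5


def summarize_runs(durations_s, expected_runs=EXPECTED_RUNS):
    """Condense repeated run durations (seconds) into summary statistics.

    Hypothetical helper: the aggregation used in results.ods may differ.
    """
    if len(durations_s) != expected_runs:
        raise ValueError(
            f"expected {expected_runs} runs, got {len(durations_s)}"
        )
    return {
        "median": statistics.median(durations_s),
        "min": min(durations_s),
        "max": max(durations_s),
        "stdev": statistics.stdev(durations_s),
    }


if __name__ == "__main__":
    # Made-up placeholder durations, NOT measurements from this dataset.
    placeholder = [12.4, 11.9, 12.1, 13.0, 12.2]
    print(summarize_runs(placeholder))
```

The median is used here because it is less sensitive to a single slow run than the mean; this is a design choice of the sketch, not a statement about how the published analysis was done.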

Materialization systems

For our experiments with KROWN, we selected the most popular maintained materialization systems for constructing RDF graphs:

RMLMapper
RMLStreamer
Morph-KGC
SDM-RDFizer
OntopM (Ontop in materialization mode)

Note: KROWN is flexible and allows adding any other materialization system; see KROWN’s execution framework documentation for more information.

Scenarios

We consider the following scenarios:

Raw data: number of rows, columns and cell size
Duplicates & empty values: percentage of the data containing duplicates or empty values
Mappings: Triples Maps (TM), Predicate Object Maps (POM), Named Graph Maps (NG)
Joins: relations (1-N, N-1, N-M), conditions, and duplicates during joins

Note: KROWN is flexible and allows adding any other scenario; see KROWN’s data generator documentation for more information.

In the table below we list all parameter values we used to configure our scenarios:

Scenario	Parameter values
Raw data: rows	10K, 100K, 1M, 10M
Raw data: columns	1, 10, 20, 30
Raw data: cell size	500, 1K, 5K, 10K
Duplicates: percentage	0%, 25%, 50%, 75%, 100%
Empty values: percentage	0%, 25%, 50%, 75%, 100%
Mappings: TMs + 5POMs	1, 10, 20, 30 TMs
Mappings: 20TMs + POMs	1, 3, 5, 10 POMs
Mappings: NG in SM	1, 5, 10, 15 NGs
Mappings: NG in POM	1, 5, 10, 15 NGs
Mappings: NG in SM/POM	1/1, 5/5, 10/10, 15/15 NGs
Joins: 1-N relations	1-1, 1-5, 1-10, 1-15
Joins: N-1 relations	1-1, 5-1, 10-1, 15-1
Joins: N-M relations	3-3, 3-5, 5-3, 10-5, 5-10
Joins: join conditions	1, 5, 10, 15
Joins: join duplicates	0, 5, 10, 15
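
The parameter table above can also be written down as a plain data structure, which makes it easy to iterate over scenarios or to count them. The sketch below copies the values from the table; the dictionary layout and the helper names `parse_count` and `count_parameter_values` are assumptions made for illustration and do not reflect KROWN's own configuration format.

```python
# Parameter values copied from the table above. The dictionary layout is an
# illustrative assumption, not the configuration format used by KROWN.
SCENARIOS = {
    "Raw data: rows": ["10K", "100K", "1M", "10M"],
    "Raw data: columns": ["1", "10", "20", "30"],
    "Raw data: cell size": ["500", "1K", "5K", "10K"],
    "Duplicates: percentage": ["0%", "25%", "50%", "75%", "100%"],
    "Empty values: percentage": ["0%", "25%", "50%", "75%", "100%"],
    "Mappings: TMs + 5POMs": ["1", "10", "20", "30"],
    "Mappings: 20TMs + POMs": ["1", "3", "5", "10"],
    "Mappings: NG in SM": ["1", "5", "10", "15"],
    "Mappings: NG in POM": ["1", "5", "10", "15"],
    "Mappings: NG in SM/POM": ["1/1", "5/5", "10/10", "15/15"],
    "Joins: 1-N relations": ["1-1", "1-5", "1-10", "1-15"],
    "Joins: N-1 relations": ["1-1", "5-1", "10-1", "15-1"],
    "Joins: N-M relations": ["3-3", "3-5", "5-3", "10-5", "5-10"],
    "Joins: join conditions": ["1", "5", "10", "15"],
    "Joins: join duplicates": ["0", "5", "10", "15"],
}


def parse_count(value):
    """Convert a count such as '10K' or '1M' to an integer.

    Hypothetical helper; only meaningful for the plain numeric rows.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = value[-1]
    if suffix in multipliers:
        return int(value[:-1]) * multipliers[suffix]
    return int(value)


def count_parameter_values(scenarios):
    """Total number of parameter values listed across all scenario rows."""
    return sum(len(values) for values in scenarios.values())


if __name__ == "__main__":
    print(count_parameter_values(SCENARIOS))
    print([parse_count(v) for v in SCENARIOS["Raw data: rows"]])
```

Counting the entries gives 63 parameter values across the 15 rows of the table. Whether each value corresponds to exactly one executed scenario for every system is an assumption and should be checked against the raw results.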

Files

Files (1.8 GB)

Name	MD5	Size
duplicates.zip	8c34dc6fcd2918c6074773d53ff392e5	24.1 MB
empty-values.zip	43d340a7da9d261fb8919d6eae70b250	10.3 MB
joins.zip	56b3403a2c08c31cbccbec9cb0f5c2d0	1.2 GB
mappings.zip	8e939731b0924625eb7afe8b3af9c41c	276.7 MB
raw-data.zip	e80a08a305ab1c9b145b2d555948d786	244.2 MB
results.ods	2c4d660df2594705e09fc5f98f5a371e	212.7 kB
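
The MD5 checksums listed for each file can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; it assumes the files were saved under their original names in a single directory. The function names `md5_of_file` and `verify_downloads` are illustrative and not part of KROWN.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed in the file table of this record.
EXPECTED_MD5 = {
    "duplicates.zip": "8c34dc6fcd2918c6074773d53ff392e5",
    "empty-values.zip": "43d340a7da9d261fb8919d6eae70b250",
    "joins.zip": "56b3403a2c08c31cbccbec9cb0f5c2d0",
    "mappings.zip": "8e939731b0924625eb7afe8b3af9c41c",
    "raw-data.zip": "e80a08a305ab1c9b145b2d555948d786",
    "results.ods": "2c4d660df2594705e09fc5f98f5a371e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True (match), False (mismatch),
    or None (file not found in the directory)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == expected
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

Reading in chunks keeps memory use low for the larger archives such as joins.zip (1.2 GB).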

Additional details

Related works

Is compiled by: Software: 10.5281/zenodo.10979321 (DOI)

Funding

Ghent University
Special Research Fund BOF20/DOC/132

Dates

Submitted: 2024-04-17
ISWC 2024 Resource Track

Software

Repository URL: https://github.com/kg-construct/KROWN
Programming language: Python
Development Status: Active

	All versions	This version
Views	341	341
Downloads	321	321
Data volume	94.1 GB	94.1 GB
