There is a newer version of the record available.

Published June 4, 2025 | Version v1
Dataset Open

Synthetic RDF Data

Description

This dataset contains synthetic RDF data generated as part of my master's thesis research. The data was produced from SHACL (Shapes Constraint Language) shapes, which define the structure and constraints of the RDF graphs.

Two different generative models were used:

  • GAN (Generative Adversarial Network): Used to model and sample property values by learning the distribution of entities and relationships.

  • VAE (Variational Autoencoder): Used to capture the latent distribution of data features and generate new, realistic RDF instances while preserving SHACL constraints.

The primary objective was to produce high-quality, diverse synthetic knowledge graph data that:

  • Adheres to SHACL constraints

  • Represents realistic distributions

  • Is suitable for testing RDF-based systems and knowledge graph pipelines
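
To make "adheres to SHACL constraints" concrete, the sketch below checks a toy set of triples against the kind of cardinality rule that SHACL expresses with sh:minCount and sh:maxCount. It is a plain-Python illustration only, not a SHACL processor and not the validation procedure used in the thesis. The shape, the `ex:` property names, and the triples are all hypothetical and are not taken from the dataset files.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical shape (not from the dataset): every ex:Person needs exactly
# one ex:name and at most one ex:age.
# Maps property -> (min_count, max_count); max_count None means unbounded.
PERSON_SHAPE = {
    "ex:name": (1, 1),
    "ex:age": (0, 1),
}


def check_conformance(triples, target_class, shape):
    """Return (node, property, count) for every cardinality violation
    among the nodes typed as target_class."""
    values = defaultdict(lambda: defaultdict(list))
    targets = []
    for s, p, o in triples:
        if p == RDF_TYPE and o == target_class:
            if s not in targets:
                targets.append(s)
        else:
            values[s][p].append(o)

    violations = []
    for node in targets:
        for prop, (lo, hi) in shape.items():
            count = len(values[node][prop])
            if count < lo or (hi is not None and count > hi):
                violations.append((node, prop, count))
    return violations


# Hypothetical synthetic triples: ex:p1 conforms, ex:p2 does not
# (no ex:name, two ex:age values).
triples = [
    ("ex:p1", "rdf:type", "ex:Person"),
    ("ex:p1", "ex:name", "Alice"),
    ("ex:p1", "ex:age", "30"),
    ("ex:p2", "rdf:type", "ex:Person"),
    ("ex:p2", "ex:age", "41"),
    ("ex:p2", "ex:age", "42"),
]

violations = check_conformance(triples, "ex:Person", PERSON_SHAPE)
for node, prop, count in violations:
    print(f"{node}: {prop} occurs {count} time(s)")
```

For the actual files, a full SHACL engine run against the original shapes would be the appropriate check; this sketch only shows the idea of a conformance test on generated instances.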

Files

Files (9.3 MB)

Checksum                               Size
md5:6e770cc1f7364e97bff7a6555a2d172e   4.5 MB
md5:67f3e28c432b9839a367236a400724c9   40.6 kB
md5:0d8be880bd190dd6de0182e42f7c77a8   4.8 MB
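
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_record` and the file path in the usage comment are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (the path is a placeholder for wherever the file was saved):
# matches_record("downloaded_file", "md5:6e770cc1f7364e97bff7a6555a2d172e")
```

A mismatch usually means the download was truncated or that the file belongs to a different version of the record.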

Additional details

Dates

Available: 2025-06-04