Synthetic RDF Data
Authors/Creators
Description
This dataset contains synthetic RDF data generated as part of my master's thesis research. The data was produced from SHACL (Shapes Constraint Language) shapes, which define the structure and constraints of RDF graphs.
Two different generative models were used:
- GAN (Generative Adversarial Network): Used to model and sample property values by learning the distribution of entities and relationships.
- VAE (Variational Autoencoder): Used to capture the latent distribution of data features and generate new, realistic RDF instances while preserving SHACL constraints.
The primary objective was to produce high-quality, diverse synthetic knowledge graph data that:
- Adheres to SHACL constraints
- Represents realistic distributions
- Is suitable for testing RDF-based systems and knowledge graph pipelines
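To illustrate what "adheres to SHACL constraints" means in practice, here is a minimal sketch of a cardinality check, loosely modeled on SHACL's sh:minCount and sh:maxCount. It is not the validation procedure used in the thesis: the `SHAPES` dictionary, the `ex:` names, and the `check_conformance` helper are hypothetical, and triples are represented as plain Python tuples rather than parsed from the Turtle file. A real check would more likely run a SHACL engine (for example pySHACL) against the published shapes.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Hypothetical, simplified stand-in for SHACL property shapes:
# target class -> {property: (min_count, max_count or None for unbounded)}
SHAPES = {
    "ex:Protein": {
        "ex:name": (1, 1),        # exactly one name
        "ex:hasGene": (0, None),  # any number of genes
    }
}


def check_conformance(triples, shapes):
    """Return violation messages for the cardinality constraints in `shapes`."""
    types = {}          # subject -> set of classes it is an instance of
    counts = Counter()  # (subject, predicate) -> number of distinct values
    for s, p, o in set(triples):  # an RDF graph is a set, so drop duplicates
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        else:
            counts[(s, p)] += 1

    violations = []
    for subject in sorted(types):
        for cls in sorted(types[subject]):
            for prop, (lo, hi) in shapes.get(cls, {}).items():
                n = counts[(subject, prop)]
                if n < lo:
                    violations.append(
                        f"{subject}: {prop} has {n} value(s), expected at least {lo}"
                    )
                if hi is not None and n > hi:
                    violations.append(
                        f"{subject}: {prop} has {n} value(s), expected at most {hi}"
                    )
    return violations


# Made-up synthetic instances: ex:p1 conforms, ex:p2 is missing its name.
synthetic = [
    ("ex:p1", RDF_TYPE, "ex:Protein"),
    ("ex:p1", "ex:name", "Protein A"),
    ("ex:p2", RDF_TYPE, "ex:Protein"),
]

violations = check_conformance(synthetic, SHAPES)
for message in violations:
    print(message)
```

An empty result means every typed instance satisfied the listed cardinalities. Other SHACL constraint types (datatypes, value ranges, node kinds) are outside the scope of this sketch.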
Files (9.3 MB)
- ProteinOntologyShapes.ttl.txt
Additional details
Dates
- Available: 2025-06-04