Published June 6, 2023 | Version v1
Presentation Open

CLARIAH integrations of Dataverse with SEMAF Semantic framework and SpaCy DANS Machine Learning library

  • 1. DANS-KNAW


DANS-KNAW has developed the "Archive in a box" distribution, which was presented at the Dataverse Community Meeting '23 in Braga, Portugal. The distribution provides fully automatic deployment of a FAIR Dataverse data repository, integrated with third-party networked services and connected to the external controlled vocabularies required to produce Linked Data from dataset metadata descriptions. It also includes data previewers and support for custom metadata schemas such as CESSDA CMM, CLARIN CMDI, and ODISSEI. Institutions worldwide benefit from running a Common Data Infrastructure with community-based maintenance and development, because costs drop substantially as more organizations join the consortium. With its distributed setup, this shared infrastructure is sustainable and better suited to future needs.

To make the distribution more flexible and suitable for different communities, in the CLARIAH project we have developed the SEMAF semantic transformation framework and created the SpaCy DANS Machine Learning library, which together support a semi-automatic workflow for generating FAIR metadata descriptions of datasets.


03_Lightning_Talk - CLARIAH integrations of Dataverse.pdf

Files (2.4 MB)

Additional details


ODISSEI (Open Data Infrastructure for Social Science and Economic Innovations) 31702
Dutch Research Council
CLARIAH - Common Lab Research Infrastructure for the Arts and Humanities 2300171392
Dutch Research Council