D4.19 Mapping of two indicative selected standards to the SSHOCro

Eleni Tsouloucha; Athina Kritsotaki; Chryssoula Bekiari; Maria Theodoridou

This report documents the work undertaken within SSHOC project Task 4.7, Modeling the SSHOC data life cycle. Specifically, SSHOC Deliverable 4.19 concerns the mapping of two selected metadata standards used to document social science research, namely DDI Codebook and CMDI, to the SSHOC Reference Ontology (SSHOCro). The mapping involves the integration and harmonization of the DDI and CMDI metadata schemas, and the transformation to SSHOCro of selected cases from social sciences and humanities repositories that are documented with DDI and CMDI.

SSHOCro is a common meta-level schema based on CIDOC CRM. It aims to provide a semantic interoperability framework for describing the Social Sciences and Humanities data life cycle. To that end it offers a conceptual model that describes, at a generic level, the real-world life cycle of data produced by the generic workflow of collection, connection and interpretation processes, together with the auxiliary activities of storing, publishing and finding data, as this workflow actually takes place in the various domains of Social Sciences and Humanities. Its development has been informed by the data life cycle management practices in use in these disciplines. In practical terms, such a model and schema serves the research community in two ways: (a) it can be applied as a standard when devising and implementing a metadata capture scheme for tracking the data life cycle in individual projects, institutions and disciplines; (b) it is a canonical form, or target schema, that can provide a model for a single knowledge base for cross-domain tools and services (e.g., resource discovery, browsing, and data mining). In the latter case, mappings must be produced that relate DDI or CMDI concepts and relationships (the source schemata) to SSHOCro concepts (the target schema), so that facts described in terms of the source schemata can be translated automatically into descriptions in terms of SSHOCro. This is the mapping definition process, and its output is the mapping, i.e., a collection of mapping rules. The present report describes the mapping definition process between DDI / CMDI and SSHOCro, along with the resulting mapping rules.

This task includes the following activities:

  1. interpreting the conceptualizations expressed in DDI and CMDI, and the concepts necessary to explain the intended meaning of DDI and CMD attributes and relationships, especially those related to the data life cycle, in terms of SSHOCro v.0.1. Any conflicts with the existing version of SSHOCro that arose during harmonization were resolved on the SSHOCro side, producing a new version of SSHOCro (v.1.1.3). The output of this activity is a listing of the entities, relationships and attributes defined in DDI and CMDI that shows how the same information can be expressed using SSHOCro. These listings are the schema-level mappings, which can serve as an intellectual definition of the relationship of DDI and CMD to SSHOCro. They are also in a format that could be turned more or less mechanically into an algorithm that transforms data structured in one form into data in the other form, i.e., they can be used to implement an automatic data translation.
  2. transforming selected cases found in social science repositories into SSHOCro v.1.1.3. The data of the selected cases were transformed with the X3ML (3M) Toolkit, a set of small, open source microservices that follow the Synergy Reference Model of data provision and aggregation (SRM). The SRM defines a consistent set of business processes, user roles, generic software components and open interfaces that form a harmonious whole. It is based on experience and evaluation of national and international information integration projects, and is an initiative of the CIDOC CRM Special Interest Group (CRM SIG), a Working Group of CIDOC, the International Committee for Documentation of the International Council of Museums (ICOM). The X3ML Toolkit allows data experts to transform their internal structured data and other associated contextual knowledge to other schemas. Fields or elements from a source database (Source Nodes) are aligned with one or more entities described in the target schema, so that the data from an entire system can be transformed. The purpose is typically publication on the Web and, in particular, meaningful integration with other data transformed to the same target schema.
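
The idea of a mapping rule that aligns a source node with target entities can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the X3ML Toolkit and not a rule taken from the deliverable: the `MappingRule` structure, the `transform` function and the identifier scheme are hypothetical, the sample record is a simplified DDI Codebook fragment without namespaces, and the target terms are generic CIDOC CRM names rather than SSHOCro classes.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical rule: one source node aligned with a domain-property-range triple."""
    source_path: str    # path to the source node, relative to the record root
    domain_class: str   # target class of the described record
    property_name: str  # target property linking domain to range
    range_class: str    # target class of the source node's value


# Illustrative rule only (assumption): a study title mapped to generic CIDOC CRM terms.
RULES = [
    MappingRule(
        source_path="stdyDscr/citation/titlStmt/titl",
        domain_class="E73_Information_Object",
        property_name="P102_has_title",
        range_class="E35_Title",
    ),
]

# Simplified sample record (assumption): real DDI Codebook files declare an XML namespace.
SAMPLE = (
    "<codeBook><stdyDscr><citation><titlStmt>"
    "<titl>Example Survey 2020</titl>"
    "</titlStmt></citation></stdyDscr></codeBook>"
)


def transform(xml_text, record_id, rules):
    """Apply each rule to the source record and return (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for rule in rules:
        for i, node in enumerate(root.findall(rule.source_path)):
            value = (node.text or "").strip()
            if not value:
                continue  # skip empty source nodes
            target_id = f"{record_id}/{rule.range_class}/{i}"
            triples.append((record_id, "rdf:type", rule.domain_class))
            triples.append((record_id, rule.property_name, target_id))
            triples.append((target_id, "rdf:type", rule.range_class))
            triples.append((target_id, "rdfs:label", value))
    return triples


if __name__ == "__main__":
    for triple in transform(SAMPLE, "study/1", RULES):
        print(triple)
```

Keeping the rules as data, separate from the code that applies them, mirrors the distinction drawn above between the schema-level mapping (the intellectual definition) and the automatic translation that is derived from it.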

