TRIPLE Deliverable: D2.1 Data Acquisition Plan
Creators
- CNRS (Huma-Num)
- CNRS (Open Edition)
Description
This report describes the Data Acquisition Plan (DAP) and the technical specifications to be implemented to collect metadata about research outputs in the Social Sciences and Humanities (SSH) in 9 languages and to make them available through the future TRIPLE platform.
To reach this objective, the DAP, a strategic step of the TRIPLE Core, defines the process that runs from the collection of metadata to their exposure in the TRIPLE database. It follows a two-fold approach: 1) metadata provision by the processing chains of aggregation platforms and 2) semantic enrichment and resource linking by the TRIPLE pipeline. A delivery platform will be the communication interface between the two processes.
In the first phase, metadata are collected by aggregation platforms, some of which are part of the consortium (such as ISIDORE or CESSDA) and others outside it (such as OpenAIRE or NARCIS), and dropped on the delivery platform. To collect and expose their metadata, these platforms use generic processing chains called BUILD. In accordance with the TRIPLE recommendations and with the platforms' agreement, the BUILD chains will deliver selected metadata to the delivery platform, under the monitoring of the OPERAS Scientific Advisory Committee. This requires the TRIPLE project to create a model, called the TRIPLE data model, that the aggregation platforms may align with in order to be compliant with the TRIPLE platform. To start the project, the ISIDORE platform, developed by the coordinator of the TRIPLE project, has been chosen as the first source of metadata, using its processing chain “BUILD-I”, as indicated in the proposal. In the long run, to reach a satisfactory level of coverage, other BUILD chains will be added to cover as many as possible of the resources available in the SSH community worldwide.
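As an illustration of what aligning an aggregator's metadata with a common model could look like, the following is a minimal Python sketch. The field names, the `CommonRecord` class and the mapping are hypothetical and chosen for illustration only; the actual TRIPLE data model and the BUILD chains are specified in the deliverable itself.

```python
from dataclasses import dataclass, asdict

# Hypothetical target fields; NOT the actual TRIPLE data model.
@dataclass
class CommonRecord:
    identifier: str
    title: str
    language: str
    resource_type: str  # e.g. "document", "project" or "profile"
    source: str

# Hypothetical mapping: common field name -> field name used by one aggregator.
EXAMPLE_MAPPING = {
    "identifier": "uri",
    "title": "dc_title",
    "language": "dc_language",
    "resource_type": "kind",
}

def align_record(raw: dict, mapping: dict, source: str) -> CommonRecord:
    """Rename source fields to the common model; fail loudly if one is missing."""
    missing = [src for src in mapping.values() if src not in raw]
    if missing:
        raise ValueError(f"missing source fields: {missing}")
    fields = {target: raw[src] for target, src in mapping.items()}
    return CommonRecord(source=source, **fields)

raw = {
    "uri": "http://example.org/doc/1",
    "dc_title": "Sample title",
    "dc_language": "fr",
    "kind": "document",
}
record = align_record(raw, EXAMPLE_MAPPING, source="example-aggregator")
payload = asdict(record)  # plain dict, ready to be serialised for delivery
```

Keeping the mapping as data rather than code means that, under this assumption, each additional processing chain would only need to supply its own mapping table.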
In the second phase, through a connection to the delivery platform, the TRIPLE pipeline will collect, enrich and link the metadata for the 3 types of resources targeted by the project: 1) research documents (publications and datasets), 2) research projects and 3) researcher profiles. The semantic enrichment will involve the creation of a TRIPLE thesaurus to align the vocabularies in the 9 languages.
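The idea of aligning vocabularies across languages and then linking resources can be sketched as follows. This is only an illustrative example under stated assumptions: the concept identifiers, the labels and the record structure are hypothetical, and exact label matching stands in for whatever enrichment method the TRIPLE pipeline actually uses.

```python
# Hypothetical thesaurus: concept ID -> preferred label per language code.
THESAURUS = {
    "concept:history": {"en": "history", "fr": "histoire", "es": "historia"},
    "concept:sociology": {"en": "sociology", "fr": "sociologie", "es": "sociología"},
}

def build_label_index(thesaurus: dict) -> dict:
    """Index (language, normalised label) -> concept ID for fast lookup."""
    index = {}
    for concept_id, labels in thesaurus.items():
        for lang, label in labels.items():
            index[(lang, label.casefold())] = concept_id
    return index

def enrich(record: dict, index: dict) -> dict:
    """Return a copy of the record with language-independent concept IDs added."""
    concepts = []
    for keyword in record.get("keywords", []):
        concept_id = index.get((record["language"], keyword.strip().casefold()))
        if concept_id is not None and concept_id not in concepts:
            concepts.append(concept_id)
    return {**record, "concepts": concepts}

def link_by_concept(records: list) -> dict:
    """Group record identifiers by the concepts they share."""
    links = {}
    for rec in records:
        for concept_id in rec["concepts"]:
            links.setdefault(concept_id, []).append(rec["identifier"])
    return links

index = build_label_index(THESAURUS)
enriched = [
    enrich({"identifier": "doc-1", "language": "fr", "keywords": ["Histoire"]}, index),
    enrich({"identifier": "doc-2", "language": "en", "keywords": ["history", "unknown"]}, index),
]
links = link_by_concept(enriched)
```

In this sketch a French and an English record end up attached to the same concept, which is the effect a shared thesaurus is meant to produce; keywords with no match are simply left unlinked.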
The enriched and linked metadata will then be both stored in a triplestore and indexed in the TRIPLE database, and made available through REST APIs so that the Innovative Services (IS) can run their tools and data providers can retrieve improved metadata.
Files
D2.1 Data Acquisition Plan_TRIPLE.pdf (3.1 MB)
md5:d0996de9195caa9a4f3b6e4d157b3050