There is a newer version of this record available.

Project deliverable Open Access

TRIPLE Deliverable 8.5: Guidelines on the Research Data in the Humanities

Umerle, Tomasz; Błaszczyńska, Marta; Wnuk, Magdalena; Franczak, Mateusz; Stojanovski, Jadranka; Rosinski, Cezary; Wołczuk, Nikodem; Mikołajczyk-Bareła, Agnieszka; Karlińska, Agnieszka; Ogrodniczuk, Maciej; Pęzik, Piotr; Kramer, Bianca; Peroni, Silvio; De Santis, Luca; Balkan, Lorna; Inkret, Ana; Przysiecka, Karoline; Maryl, Maciej; Tóth-Cifra, Erzsébet

This report focuses on metadata as a specific type of research data in the humanities by analysing key metadata elements: persistent identifiers (PIDs), abstracts, keywords and citations. It defines those elements, outlines the challenges of processing them in the humanities and presents the challenges facing GoTriple as an aggregator of this kind of research data.
The underlying assumption is that GoTriple is itself a specific kind of research dataset that can and will be reused by stakeholders such as other metadata aggregators, indexers, publishers and information services (e.g. providers of scholarly metrics), as well as by scientists interested in data-driven research (cultural analytics, scientometrics, bibliometrics, etc.). This demands a good understanding of the key metadata elements important to GoTriple's aggregation and enrichment processes (abstracts, keywords) and to their development (PIDs, citations).
Chapter 1 defines the aim of the deliverable, the context of its creation and its audience.
Chapter 2 discusses the specificity of research data in the humanities and this report’s position in the rich discussions on the topic.
Chapter 3 – dedicated to PIDs – presents an overview of the topic and the challenges related to the uptake of PIDs in the humanities, such as the role of cultural heritage data for the humanities and the importance of bibliodiversity and multilingualism (subchapter 3.1). It then discusses the processing of PIDs from GoTriple’s data providers, focusing on data dispersion and heterogeneity (subchapter 3.2).
Chapter 4 – dedicated to keywords – begins with a typology of keywords and the standards they are expected to adhere to (subchapter 4.1). Subchapter 4.2 tackles the automated generation of keywords and proposes different approaches applicable in the context of GoTriple. Subchapter 4.3 presents the current approach to keyword organisation in GoTriple, with a focus on the GoTriple vocabulary, which responds to the need for the LOD-ification of keywords and can in the future be reused for automated keyword generation.
Chapter 5 – dedicated to abstracts – starts with a comprehensive presentation of the abstract ecosystem, also offering a specific perspective on SSH. Subchapter 5.2 offers solutions to the issue of “missing abstracts”, aimed at the needs of the GoTriple platform.
Chapter 6 – dedicated to citations – offers an overview of the topic and its relevance to the SSH. Subchapter 6.2 analyses issues related to GoTriple’s expression of citation data, especially the challenges of processing different citation formats and of citation data quality.
Each chapter concludes with a summary of the guidelines for the specific metadata type for the humanities.

The TRIPLE project is financed under the Horizon 2020 framework, under Grant Agreement No. 863420, with approx. 5.6 million Euros for a duration of 42 months (2019-2023). The content of this deliverable reflects only TRIPLE's view and the Commission is not responsible for any use that may be made of the information it contains.
Files (4.1 MB)
D8.5 Guidelines on the research data in the humanities (FINAL).pdf (4.1 MB)