Published July 5, 2023 | Version v1
Project deliverable Open

CLS INFRA D5.2 Case Studies in Data Preparation and Sharing

  • 1. HU Berlin / Institut für Deutsche Sprache und Linguistik
  • 2. Universität Potsdam
  • 3. Österreichische Akademie der Wissenschaften

Description

This deliverable presents three case studies involving digitisation and transformation processes. The studies are presented in order of the complexity of the research question, which is reflected in the difficulty of the corpus compilation task. Transformation processes seem to be inevitable in each case, but, paradoxically, the urgency of digitisation diminishes as the complexity of the task increases. The case studies described in this deliverable are:

1. Creation of an ELTeC-affine corpus of the Slovak novel (chapter 2)

2. Finding the haiku across multilingual corpora (chapter 3)

3. Measuring entropy and surprisal in the prose of the Tsarist Empire devoted to terrorism (Russian and Polish texts) (chapter 4)

The first two case studies have already served as reference cases for the data landscape review (CLS INFRA Deliverable 5.1). This extended version, which conveys the experience of six months of research and is enriched by the third case study, highlights specific aspects of the multidimensional landscape of literary text collections. In Deliverable 5.1, the case studies served merely as illustrations and concretisations of general points; here they are the focus of attention. The third case has been designed with the most complex research questions in mind, to go even further in exploring what is available and what is possible in the digital humanities today.

Files

Deliverable 5.2.pdf (804.8 kB)
md5:a18bc2e2c72655e39d589359a4520441

Additional details

Funding

CLS INFRA – Computational Literary Studies Infrastructure 101004984
European Commission