Presentation Open Access

LIBER 2021 - Workshop: SSHOC'ing drama in the cloud; Encoding theatrical text collections for discovery, exploration, and visualisation; the added value of SSHOC/CLARIN services

Eskevich, Maria; van der Lek, Iulianna; Frontini, Francesca

These are the slides from the LIBER 2021 Workshop: SSHOC’ing drama in the cloud; Encoding theatrical text collections for discovery, exploration, and visualisation; the added value of SSHOC/CLARIN services.

The objective of this virtual workshop is to equip the participating librarians with some general knowledge on how researchers in the field of Social Sciences and Humanities (SSH) can benefit from the resources and services offered by SSH research infrastructures for producing and exploiting highly encoded historical textual data. After the workshop, the participants will be able to successfully guide and advise SSH researchers (with a particular focus on literature studies) in their choice amongst existing resources and tools, based on their research question.

This objective will be achieved by:
1. Familiarising the librarians with the Text Encoding Initiative (TEI) format, which is widely adopted in SSH for the XML-based mark-up of textual documents, and demonstrating its potential benefits;
2. Teaching them how to explore and visualise TEI collections with the help of tools and services offered by CLARIN and SSHOC;
3. Showing them how to optimise research workflows with the help of the SSH Open Marketplace (SSHOC).

The workshop use case will be based on ongoing work carried out within the SSHOC project (WP3) on a corpus of theatrical play texts from the 17th and 18th centuries, covering examples in three languages (English, French, and Spanish). The participants will first learn how to search for encoded documents in the CLARIN Virtual Language Observatory (VLO). Then they will be introduced to the basics of XML-TEI encoding, in particular to the elements that describe theatre plays, characters and their respective lines. Through concrete examples, the participants will be shown how simple scripts can be used to generate separate sub-corpora containing the speech of each character or group of characters. Finally, a contrastive computational text analysis and visualisation will be performed using the Voyant tool to demonstrate how the showcased standards, techniques and tools can be used by literary scholars to answer research questions concerning characterisation.
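The step of generating per-character sub-corpora could look roughly like the Python sketch below. It is not the script used in the workshop. It assumes the plays follow the common TEI drama pattern, in which each speech is an `sp` element whose `who` attribute points to the character, with spoken text in `l` (verse line) or `p` (prose) children. The sample play, the character identifiers and the function name `speeches_by_character` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_URI = "http://www.tei-c.org/ns/1.0"  # the TEI P5 namespace
TEI_NS = {"tei": TEI_URI}

# Invented sample: a real play would be read from a TEI file instead.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="scene">
        <sp who="#alice">
          <speaker>Alice</speaker>
          <l>Good morrow, sir.</l>
          <l>What news from court?</l>
        </sp>
        <sp who="#bob">
          <speaker>Bob</speaker>
          <p>None that I would <emph>gladly</emph> tell.</p>
        </sp>
        <sp who="#alice">
          <speaker>Alice</speaker>
          <stage>Aside</stage>
          <l>Then I shall ask no more.</l>
        </sp>
      </div>
    </body>
  </text>
</TEI>"""


def speeches_by_character(tei_xml: str) -> dict[str, list[str]]:
    """Group spoken lines by the character id(s) in each <sp who="...">.

    Assumption: spoken text sits in <l> or <p> elements inside <sp>.
    Speaker labels and stage directions that are direct children of
    <sp> are skipped; anything nested inside <l> or <p> is kept.
    """
    root = ET.fromstring(tei_xml)
    spoken_tags = {f"{{{TEI_URI}}}l", f"{{{TEI_URI}}}p"}
    corpora = defaultdict(list)
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        # "who" may hold several space-separated pointers such as "#a #b".
        speakers = [w.lstrip("#") for w in sp.get("who", "unknown").split()]
        for element in sp.iter():
            if element.tag not in spoken_tags:
                continue
            # Join nested text and collapse runs of whitespace.
            text = " ".join("".join(element.itertext()).split())
            if text:
                for speaker in speakers:
                    corpora[speaker].append(text)
    return dict(corpora)


if __name__ == "__main__":
    for character, lines in speeches_by_character(SAMPLE_TEI).items():
        print(character, "->", lines)
```

Each resulting list could then be written to its own plain-text file, giving one sub-corpus per character that can be loaded into a text analysis tool for comparison.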

The virtual workshop will consist of the following parts:
● Welcome and introduction
● Scenario of use and motivation: a researcher has SSH research questions and limited knowledge of TEI; what should a librarian advise?
● TEI in detail:
○ What is a TEI document?
○ Where can you find more TEI documents within the SSH research infrastructures?
○ Pointers to tutorials on how to create TEI documents for text types other than play texts
● SSHOC/CLARIN use case step by step:
○ A brief description of the research question that is used as an example during the workshop
○ How to find datasets through the CLARIN Virtual Language Observatory
○ How to process the dataset
○ Visualisation and analysis

Hands-on sessions will allow the participants to test the proposed methods, resources and tools.

This virtual workshop is offered as part of the work carried out by CLARIN ERIC and LIBER within the EU-funded H2020 project “Social Sciences and Humanities Open Cloud (SSHOC)”, which contributes to the creation of the SSH area of the European Open Science Cloud (EOSC).
