OcOr : a Corpus of Occitan Oral Narratives

Marianne Vergez-Couret; Janice Carruthers

doi:10.5281/zenodo.4740659

Published October 8, 2018 | Version 0.1.1

Dataset Open

1. Queen's University

Contributors

Contact person:

Marianne Vergez-Couret¹

Researcher:

Janice Carruthers¹

1. Queen's University

OcOr is a corpus of Occitan oral narratives. It is one of the outputs of the ExpressioNarration project, funded by a Marie Skłodowska-Curie Fellowship (2016-2018, grant no. 655034).

It includes three sub-corpora, constituted as follows:

• OOT (Occitan, oral, traditional): stories drawn from fieldwork among native speakers in the Occitan domain, recorded by the COMDT (Conservatoire Occitan des Musiques et Danses Traditionnelles - http://www.comdt.org/), transcribed and digitised for the project by the researchers.

• OWT (Occitan, written, traditional): published literary stories, digitised for the project by the researchers. These are stories collected from oral sources and produced in a publishable written version.

• OOC (Occitan, oral, contemporary): stories recounted by contemporary artists, taken from existing recordings and from two storytelling events held in Toulouse in 2016, organised by the project in collaboration with the Institut d'Etudes Occitanes (IEO). The stories were recorded during the events and subsequently transcribed and digitised by the researchers.

The overall aim of the ExpressioNarration project was to use contemporary linguistic theory to explore the relationship between language and orality, with a specific focus on key temporal features of oral narrative in Occitan, including ‘tenses’, ‘connectives’ and ‘frame introducers’. These features were thus annotated in the three sub-corpora.

All the sub-corpora are disseminated in XML format (TEI-P5) and PDF. Each story is available as an annotated XML document, an annotated PDF and a stripped PDF document.
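As a rough illustration of how the distributed files might be inventoried, the sketch below groups the members of a zip archive by file suffix, so that XML and PDF documents can be counted separately. The function name `group_members_by_suffix` is hypothetical, and the internal folder layout and file naming of the archive are not described here, so nothing in the code depends on them.

```python
import sys
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_members_by_suffix(archive):
    """Map each lowercase file suffix to the member names of a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no document
            suffix = PurePosixPath(info.filename).suffix.lower()
            groups[suffix].append(info.filename)
    return dict(groups)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python inventory.py OcOr_v1.1.zip
    for suffix, names in sorted(group_members_by_suffix(sys.argv[1]).items()):
        print(f"{suffix or '(no suffix)'}: {len(names)} file(s)")
```

Suffixes are lowercased so that, for example, `.PDF` and `.pdf` are counted together.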

Full metadata appears in the Header of each XML document, with information on speakers (e.g. gender, age, place of origin, education, languages spoken), variety of Occitan (or dialect), authors/editorial information (in the case of OWT) and story-type where relevant (i.e. the Aarne-Thompson category). For each sub-corpus, a user-friendly summary of this metadata is also available in an Excel spreadsheet; these spreadsheets are included in the OcOr zip file.

The annotation system was designed by the researchers and is given in full in the Header of each XML document.
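Since both the metadata and the annotation system are stored in the Header of each XML document, one possible way to inspect them is sketched below using Python's standard library. It assumes that the documents declare the standard TEI P5 namespace (`http://www.tei-c.org/ns/1.0`) and contain a `teiHeader` element; the element names inside the Header are not listed here, so the sketch simply returns every non-empty text node it finds. The function name `header_entries` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the corpus files use the standard TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"
TEI_HEADER_TAG = f"{{{TEI_NS}}}teiHeader"


def header_entries(source):
    """Return (element name, text) pairs for non-empty text in the teiHeader.

    `source` may be a file path or a binary file-like object.
    Returns an empty list when no teiHeader element is present.
    """
    root = ET.parse(source).getroot()
    header = next(root.iter(TEI_HEADER_TAG), None)
    if header is None:
        return []
    entries = []
    for elem in header.iter():
        text = (elem.text or "").strip()
        if text:
            local_name = elem.tag.rsplit("}", 1)[-1]  # drop the namespace part
            entries.append((local_name, text))
    return entries
```

Calling `header_entries("story.xml")` on one of the annotated XML documents (the file name is a placeholder) would list the Header contents in document order, which can then be compared with the Excel summaries.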

For further information on the constitution of the corpus and discussion of the theoretical and methodological issues relating to data collection, digitisation and annotation, please read the following article in the journal Corpus, written by the researchers and entitled ‘Méthodologie pour la constitution d’un corpus comparatif de narration orale en Occitan : objectifs, défis, solutions’, available at: https://journals.openedition.org/corpus/3490.

Files

Files (93.6 MB)

Name	Size	MD5
OcOr.zip	3.0 MB	e28965992fe97f6b8b47df1913ab4270
OcOr_v1.1.zip	90.6 MB	f8d407d3a13d210ad201f9f9503d751f
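The MD5 digests published with the two archives can be used to check that a download is complete. The following is a minimal sketch with Python's `hashlib`; the digests are copied from the file listing, while the helper names `md5_of` and `verify` are hypothetical and the archives are assumed to have been saved under their original names.

```python
import hashlib
import os

# Digests as published in the file listing of this record.
EXPECTED_MD5 = {
    "OcOr.zip": "e28965992fe97f6b8b47df1913ab4270",
    "OcOr_v1.1.zip": "f8d407d3a13d210ad201f9f9503d751f",
}


def md5_of(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the digest listed for its name."""
    name = os.path.basename(path)
    with open(path, "rb") as fh:
        return md5_of(fh) == expected[name]
```

Reading in chunks keeps memory use low for the larger archive. A call such as `verify("OcOr_v1.1.zip")` returns `False` when the file is truncated or corrupted, and raises `KeyError` for a file name that is not in the listing.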

Additional details

Funding

European Commission
EXPRESSIONARRATION – Narration, linguistic expression and discourse structure: explorations of orality in Occitan and French (grant no. 655034)

References

Carruthers J. and Vergez-Couret M. (2018). « Méthodologie pour la constitution d'un corpus comparatif de narration orale en Occitan : objectifs, défis, solutions », Corpus [online], 18, published online 9 July 2018, accessed 8 October 2018. URL: http://journals.openedition.org/corpus/3490
Vergez-Couret M. (2017). « Constitution et annotation d'un corpus écrit de contes et récits en occitan », Analyses et méthodes formelles pour les humanités numériques, ISTE OpenScience, 1-1, published online. URL: https://www.openscience.fr/Constitution-et-annotation-d-un-corpus-ecrit-de-contes-et-recits-en-occitan

	All versions	This version
Views	1,796	251
Downloads	144	56
Data volume	3.3 GB	3.0 GB
