Published December 6, 2021 | Version v1.0.0
Software Open

nikopartanen/castren-komi-wedding-laments: Matthias Alexander Castrén's Komi Wedding Laments, sentence-aligned dataset

Creators

Description

Matthias Alexander Castrén's Komi Wedding Laments, sentence-aligned dataset

Matthias Alexander Castrén (1813–1852) collected seven wedding laments in the Komi language, presumably in 1843 somewhere in the Pechora region. The original manuscripts are archived in the National Library of Finland. They were published in 1873 with Finnish and German translations by T.G. Aminoff in Acta Societatis Scientiarum Fennicae; digitized copies are available in the University of Helsinki Library and in the Internet Archive.

In this dataset, the different versions of the text, especially the Komi and Finnish ones, are aligned with one another. The Komi transcription provided by Castrén, and later edited by Aminoff, is also presented in Standard Zyrian Komi orthography, or in the variety of it used in a recent dialect dictionary and corpora.

The work of Niko Partanen was conducted within the Kone Foundation-funded research project Language Documentation Meets Language Technology: The Next Step in the Description of Komi. The materials used are in the public domain, and the author does not claim new copyright for the rearrangement of the text into XML files or for the creation of the orthographic variants. Citation of the original source in Zenodo is, however, recommended and appreciated.

As the narrator of the texts is not known, and Castrén presumably collected them from several individuals, no exact places or persons other than Castrén are indicated anywhere. We are, however, glad to add new information to the collection should any be found. The author of the dataset, Niko Partanen, can be reached by email at niko.partanen@helsinki.fi.

The data is provided in ELAN XML files so that it is easily compatible with other spoken Komi materials, even though in this case, naturally, no recording exists. Later, when more alignments are created to different manuscript versions, some other format may well be adopted.
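
Because the data is in ELAN XML, the aligned sentences can in principle be read with any XML library. The following is a minimal sketch in Python, not a description of how the dataset was built. It assumes the general ELAN layout, in which a parent tier holds ALIGNABLE_ANNOTATION elements and a dependent tier points back to them through REF_ANNOTATION elements. The tier names "komi" and "finnish", the function name aligned_pairs, and the sample content are hypothetical placeholders, so the actual tier IDs in the files should be checked first.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature ELAN document with placeholder text.
# The tier IDs used in the real files may differ.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1000"/>
  </TIME_ORDER>
  <TIER TIER_ID="komi" LINGUISTIC_TYPE_REF="sentence">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>Komi sentence placeholder</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="finnish" LINGUISTIC_TYPE_REF="translation" PARENT_REF="komi">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a1">
        <ANNOTATION_VALUE>Finnish translation placeholder</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def aligned_pairs(eaf_xml, parent_tier, child_tier):
    """Return (parent value, child value) pairs linked by ANNOTATION_REF."""
    root = ET.fromstring(eaf_xml)

    # Index the parent tier's annotation values by annotation ID.
    parent_values = {}
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") == parent_tier:
            for ann in tier.iter("ALIGNABLE_ANNOTATION"):
                value = ann.findtext("ANNOTATION_VALUE", default="")
                parent_values[ann.get("ANNOTATION_ID")] = value

    # Follow each child annotation's reference back to its parent.
    pairs = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") == child_tier:
            for ann in tier.iter("REF_ANNOTATION"):
                ref = ann.get("ANNOTATION_REF")
                if ref in parent_values:
                    value = ann.findtext("ANNOTATION_VALUE", default="")
                    pairs.append((parent_values[ref], value))
    return pairs


for komi, finnish in aligned_pairs(SAMPLE_EAF, "komi", "finnish"):
    print(komi, "|", finnish)
```

To use this on a real file, the file contents would be read as text and passed in place of SAMPLE_EAF, with the tier names replaced by those found in the file. If the tiers turn out to be aligned only by shared time slots rather than by references, the pairing step would need to compare TIME_SLOT_REF1 and TIME_SLOT_REF2 instead.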

Citation

Please cite this dataset as:

Niko Partanen 2021: Matthias Alexander Castrén's Komi Wedding Laments, sentence-aligned dataset. 

Files

nikopartanen/castren-komi-wedding-laments-v1.0.0.zip

Files (64.8 kB)
