Annotating a historical manuscript as a linguistic resource
Authors/Creators
- 1. Centre for Information Modelling, University of Graz
- 2. University of Tübingen
- 3. Humboldt-Universität zu Berlin
Description
The Bocabulario de lengua sangleya por las letraz de el A.B.C. is a historical Chinese-Spanish dictionary held by the British Library (Add ms. 25.317), probably written in 1617. It consists of 223 double-sided folios with about 1400 alphabetically arranged Hokkien Chinese lemmas in the Roman alphabet.
This contribution presents our considerations on how to extract and annotate linguistic data from the historical manuscript and on how to design a digital scholarly edition (DSE) that can answer research questions in the fields of linguistics, missionary linguistics and migration (Klöter/Döhla 2022).
The DSE will juxtapose (1) digital facsimiles of the original folios, (2) a diplomatic transcript, (3) a narrow English translation, and (4) critical notes. We will endeavor to expand the possibilities of digital editing by adding two further levels: (5) a normalized representation of the original Spanish text and (6) a linguistic analysis. The TEI is designed not only to represent written manuscripts but also to annotate linguistic corpora (Tasovac/Romary et al. 2018). However, interlinear glossing (as mentioned, for instance, in Bowers 2020: 112, and formalized by the Leipzig Glossing Rules (EVA MPG 2015)) is currently underrepresented in the TEI Guidelines, as is the application of TEI to indigenous languages, under-resourced languages, and non-standard varieties (cf. Bowers 2020, Czaykowska-Higgins/Holmes/Kell 2014, Ngué Um 2017). Further concerns are the representation of tones and the additional representation of entries and example sentences in Chinese characters.
Although there are more than 200 documented tone languages in the world, most of which are spoken in Asia and Africa (Yip 2002, Maddieson 2013), the TEI Guidelines still lack a framework for the annotation of tonal features.
One of the project deliverables will therefore be a recommendation for the TEI annotation of tone, which we believe will be a valuable service to the community.
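To make the kind of markup under discussion concrete, the following is a purely illustrative sketch and not the project's forthcoming recommendation. It uses Python's standard library to build a syllable segment that points, via the `ana` attribute, to a TEI feature structure (`fs`, `f`, `symbol`) carrying a tone value. These elements and the attribute are existing TEI mechanisms; the feature name `tone`, the identifier `tone-A`, and the placeholder strings `SYLLABLE` and `PLACEHOLDER-TONE` are assumptions introduced only for this example and are not taken from the manuscript.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("", TEI_NS)  # serialize TEI elements without a prefix


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def tone_feature(fs_id: str, tone_label: str) -> ET.Element:
    """Build <fs xml:id=...><f name="tone"><symbol value=.../></f></fs>.

    The feature name "tone" is a hypothetical choice for this sketch.
    """
    fs = ET.Element(tei("fs"), {f"{{{XML_NS}}}id": fs_id})
    f = ET.SubElement(fs, tei("f"), {"name": "tone"})
    ET.SubElement(f, tei("symbol"), {"value": tone_label})
    return fs


def annotated_syllable(form: str, fs_id: str) -> ET.Element:
    """Build <seg type="syll" ana="#fs_id">form</seg> for one syllable."""
    seg = ET.Element(tei("seg"), {"type": "syll", "ana": f"#{fs_id}"})
    seg.text = form
    return seg


# Placeholder values only; no actual dictionary data is reproduced here.
fs = tone_feature("tone-A", "PLACEHOLDER-TONE")
seg = annotated_syllable("SYLLABLE", "tone-A")

print(ET.tostring(fs, encoding="unicode"))
print(ET.tostring(seg, encoding="unicode"))
```

Keeping the tone value in a separate feature structure, rather than inside the transcribed form, would leave the diplomatic transcript untouched while still allowing tone categories to be queried; whether the eventual recommendation takes this route is not stated in the abstract.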
Bibliography
Bowers, Jack. 2020. Language documentation and standards in Digital Humanities: TEI and the documentation of Mixtepec-Mixtec. Computation and Language [cs.CL]. École Pratique des Hautes Études. Online at https://tel.archives-ouvertes.fr/tel-03131936, last access 17 June 2022.
Czaykowska-Higgins, Ewa, Martin D. Holmes, and Sarah M. Kell. 2014. Using TEI for an Endangered Language Lexical Resource: The Nxaʔamxcín Database-Dictionary Project. Language Documentation & Conservation 8: 1–37.
EVA MPG (= Max Planck Institute for Evolutionary Anthropology, Department of Linguistics). 2015. The Leipzig Glossing Rules: Conventions for interlinear morpheme-by-morpheme glosses. Online at https://www.eva.mpg.de/lingua/pdf/Glossing-Rules.pdf, last access 17 June 2022.
Klöter, Henning and Hans-Jörg Döhla. 2022, forthcoming. Early Spanish-Chinese encounters in the Philippines and the birth of Spanish-Chinese lexicography. In: Michela Bussotti/François Lachaud (eds.), Interpreting empires, mastering languages, taming the world: Dictionaries and multilingual lexicons in East Asia. Paris: EFEO.
Maddieson, Ian. 2013. Tone. In: Dryer, Matthew S. and Haspelmath, Martin (eds.), The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology. Online at http://wals.info/chapter/13, last access 17 June 2022.
Ngué Um, Emanuel. 2017. Issues in digital text representation, online dissemination, sharing and reuse for African tone languages. In: Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, Honolulu, Hawai‘i, 24–32. Online at https://aclanthology.org/W17-0104.pdf, last access 17 June 2022.
Tasovac, Toma and Laurent Romary et al. 2018. TEI Lex-0: A baseline encoding for lexicographic data. Version 0.9.0. DARIAH Working Group on Lexical Resources. Online at https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html, last access 17 June 2022.
Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139164559, last access 16 August
Files
Bocabulario - TEI 2022.pdf