Published September 14, 2022 | Version 1.0
Poster Open

Okinawan Lexicography in TEI: Challenges for Multiple Writing Systems

  • 1. National Institute for Japanese Language and Linguistics (NINJAL), Japan
  • 2. Tokyo University of Foreign Studies / Japan Society for the Promotion of Science / National Institute for Japanese Language and Linguistics
  • 3. SOAS University of London
  • 4. University of Hawai'i at Hilo
  • 5. Kyushu University / Hitotsubashi University

Description

Okinawan is classified as one of the Northern Ryukyuan languages in the Japonic language family. It is spoken primarily in the southern and central parts of Okinawa Island in the Ryukyu Archipelago. It was the official lingua franca of the Ryukyu Kingdom and a literary vehicle (e.g., the Omoro Soshi poetry collection), but it is now an endangered language. Okinawan has been recorded in various written forms: a combination of Kanji logograms and Hiragana syllabary, with either archaic spellings (e.g., Omoro Soshi) or modern spelling variations that approximate actual pronunciation; pure Katakana syllabary (e.g., Bettelheim’s Bible translation); the Latin alphabet (mostly by linguists); and pure Hiragana (in popular use).

 

The Okinawago Jiten (Okinawan Dictionary; OD), published by the National Institute for Japanese Language and Linguistics (NINJAL) in 1963 and revised in 2001[1], uses the Latin alphabet for each lexical entry. We first added the possible writing forms listed above to the data in CSV format, and then converted the CSV into TEI XML using Python. Figure 1 presents a sample TEI encoding of an entry. We handled the multiple writing forms with <orth> elements, each identifying its writing system in the @xml:lang attribute following BCP 47[2] (e.g., xml:lang="ryu-Hira" for Okinawan words written in Hiragana). To make the pronunciation clearer, we added the International Phonetic Alphabet (IPA) transcription and the accent type in <pron> elements.
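The CSV-to-TEI step described above might look roughly like the following. This is a minimal sketch using only Python's standard library, not the authors' actual script: the CSV column names, the sample rows, the `type="lemma"` and `notation="ipa"` attribute values, and the script subtags other than `ryu-Hira` are assumptions made for illustration, and the accent type is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Assumed CSV column names mapped to BCP 47 tags:
# ISO 639-3 "ryu" (Okinawan) plus an ISO 15924 script subtag.
ORTH_COLUMNS = {
    "latin": "ryu-Latn",
    "hiragana": "ryu-Hira",
    "katakana": "ryu-Kana",
    "kanji_kana": "ryu-Jpan",
}

# Invented sample data for illustration; not taken from the OD.
SAMPLE_CSV = (
    "latin,hiragana,katakana,kanji_kana,ipa\n"
    "uchinaa,うちなー,ウチナー,沖縄,ut͡ɕinaː\n"
    "tiida,てぃーだ,,,tiːda\n"
)


def tei(tag):
    """Return a tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def row_to_entry(row):
    """Build one <entry> with an <orth> per recorded writing form."""
    entry = ET.Element(tei("entry"))
    form = ET.SubElement(entry, tei("form"), {"type": "lemma"})
    for column, lang in ORTH_COLUMNS.items():
        value = (row.get(column) or "").strip()
        if value:  # skip writing forms that are not recorded for this word
            orth = ET.SubElement(form, tei("orth"), {XML_LANG: lang})
            orth.text = value
    ipa = (row.get("ipa") or "").strip()
    if ipa:
        pron = ET.SubElement(form, tei("pron"), {"notation": "ipa"})
        pron.text = ipa
    return entry


def csv_to_entries(csv_text):
    """Convert CSV text (with a header row) into a list of <entry> elements."""
    return [row_to_entry(row) for row in csv.DictReader(io.StringIO(csv_text))]


if __name__ == "__main__":
    for entry in csv_to_entries(SAMPLE_CSV):
        ET.indent(entry)  # pretty-print; requires Python 3.9+
        print(ET.tostring(entry, encoding="unicode"))
```

Keeping the column-to-tag mapping in one dictionary means a further writing system can be added without touching the conversion logic, and empty cells simply produce no <orth> element.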

 

Using XSLT, we transformed this TEI file into a static webpage with a user-friendly GUI, as shown in Figure 2. As the largest Okinawan dictionary available online, this digitized OD, published under an open license, is expected to benefit key stakeholders such as Okinawan heritage learners and learners of Okinawan worldwide.
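The authors performed this step with XSLT. Purely to illustrate the kind of mapping involved, the sketch below does a comparable transformation with Python's standard library instead (which has no XSLT processor). The sample entry structure, the function name `entry_to_html`, and the HTML layout are all assumptions and do not reproduce the published stylesheet or webpage.

```python
import html
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical entry for illustration; not the actual encoding from Figure 1.
SAMPLE_ENTRY = """<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form type="lemma">
    <orth xml:lang="ryu-Latn">uchinaa</orth>
    <orth xml:lang="ryu-Hira">うちなー</orth>
  </form>
</entry>"""


def entry_to_html(entry):
    """Render each <orth> of a TEI entry as an HTML list item.

    The BCP 47 tag from @xml:lang is carried over to the HTML lang
    attribute so that browsers can pick suitable fonts per script.
    """
    items = []
    for orth in entry.findall("tei:form/tei:orth", TEI):
        lang = html.escape(orth.get(XML_LANG, ""))
        text = html.escape(orth.text or "")
        items.append(f'<li lang="{lang}">{text}</li>')
    return '<ul class="entry">' + "".join(items) + "</ul>"


if __name__ == "__main__":
    print(entry_to_html(ET.fromstring(SAMPLE_ENTRY)))
```

Because the writing system is already recorded in @xml:lang, a rendering step of this kind can show or hide individual scripts without inspecting the text itself.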

 

 

Fig. 2 Webpage rendition of TEI

 

Bibliography

[1] National Institute for Japanese Language and Linguistics (ed.), Okinawago Jiten, revised edition, Tokyo: Zaimusho Insatsukyoku, 2001.

[2] Text Encoding Initiative, “teidata.language,” P5: Guidelines for Electronic Text Encoding and Interchange, version 4.4.0, last updated on 19th April 2022, revision ff9cc28b0, https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-teidata.language.html (last accessed on 20th June 2022).

Files

TEI2022-Miyagawa_et_al-Okinawan.pdf (1.1 MB)
md5:0fb5abce1ca571df61b02b68cde50b8a