TaiTone v1.0.0
Description
Please cite as: Rikker Dockum and James Kirby. 2026. TaiTone: Tai lexical tone box dataset (version 1.0.0). DOI:10.5281/zenodo.10689845
Contact rdockum@binghamton.edu with questions.
Description
TaiTone v1.0.0 is an open dataset aggregating 926 tone boxes drawn from 112 distinct Thai, Chinese, English, and French sources, augmented with location metadata including geographic coordinates. Sources include scientific articles, theses, dissertations, organizational reports, and manuscripts published between 1970 and 2018, though some sources include fieldwork data gathered many years earlier. Bibliographic information can be found in the accompanying `.bib` file. No attempt has been made to balance the sample for geographic or dialectal coverage; any reference that included or allowed the inference of a tone box for a Tai language was considered for inclusion. This dataset will grow as additional Tai language documentation work becomes available.
On tone boxes
A tone box is a compact mapping between surface tones, historical tones, and historical onset consonants, which frequently condition phonemic tone change. These were conventionalized by William J. Gedney into 5 columns (historical tones) x 4 rows (historical onsets), for a total of 20 cells, each representing a subset of the lexicon. The columns are conventionally labeled A, B, C, DL, and DS. The rows are conventionally labeled 1 to 4, and each cell has a label: A1, A2, and so on, up through DS4. While the surface phonetics of tones frequently vary and change without phonemic restructuring, phonemic tone changes produce mergers and splits between tone box cells (i.e. subsets of the lexicon), which change the mapping of surface tones to historical categories within a given tone box. See Gedney (1972) for details on the tone box method, and Dockum (2019) on the use of tone box data in historical classification and reconstruction. (While this dataset adheres to the traditional 20-cell Gedney tone box, see also Liao (2023) for a more fine-grained division of tone box rows to account for tonal patterns in some Tai languages.)
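The 20-cell layout described above can be sketched as a small data structure. The following is a minimal illustration only: the function names, the tone labels such as `A-upper`, and the example tone box are all hypothetical and are not taken from the TaiTone files, whose actual format is documented in README.md. The example assumes a made-up variety in which every column splits between rows 1-3 and row 4.

```python
from collections import defaultdict

COLUMNS = ["A", "B", "C", "DL", "DS"]  # historical tone categories
ROWS = [1, 2, 3, 4]                    # historical onset classes


def cell_labels():
    """Return the 20 Gedney cell labels in column order: A1, A2, ..., DS4."""
    return [f"{col}{row}" for col in COLUMNS for row in ROWS]


def group_by_tone(tone_box):
    """Invert a cell -> surface-tone mapping into tone -> list of cells.

    Cells that share a surface tone have merged; a column whose cells
    land in different groups has split.
    """
    groups = defaultdict(list)
    for cell in cell_labels():
        groups[tone_box[cell]].append(cell)
    return dict(groups)


def make_example_box():
    """Build a hypothetical tone box (illustrative, not real data).

    Assumption: each column splits into an 'upper' tone for rows 1-3
    and a 'lower' tone for row 4.
    """
    box = {}
    for col in COLUMNS:
        for row in ROWS:
            register = "upper" if row <= 3 else "lower"
            box[f"{col}{row}"] = f"{col}-{register}"
    return box


example_box = make_example_box()
groups = group_by_tone(example_box)
print(groups["A-upper"])  # cells A1, A2, A3 share one surface tone
print(groups["A-lower"])  # cell A4 carries a different tone
```

Grouping cells by surface tone in this way makes the merger and split pattern of a tone box directly comparable across varieties, independent of how each tone is realized phonetically.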
See README.md for complete release notes.
Files (315.9 kB)
| Checksum | Size |
|---|---|
| md5:bd42c310da116bb21f61644e379c8e7c | 8.6 kB |
| md5:331f0931c421ce2b9faf95a575a8c663 | 81.2 kB |
| md5:8d365d45f0dd8f6bf733c34823966324 | 226.1 kB |
Additional details
Software
- Repository URL: https://github.com/rikker/TaiTone