# Tseltal-Spanish multidialectal dictionary

> by Gilles Polian

This repository contains the data underlying the published version of the dictionary
at [Dictionaria](https://dictionaria.clld.org/contributions/tseltal) as [CLDF](https://cldf.clld.org)
[Dictionary](cldf)
[![Build Status](https://travis-ci.org/dictionaria/tseltal.svg?branch=master)](https://travis-ci.org/dictionaria/tseltal)

Releases of this repository are archived with and accessible through
[ZENODO](https://zenodo.org/communities/dictionaria) and the latest release
is published on the [Dictionaria website](https://dictionaria.clld.org).
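
Since the data is distributed as CLDF, the tables are plain CSV files and can be inspected with standard tools. The following is a minimal sketch, not an official loader: the helper `count_by_column` is hypothetical, and the file name `cldf/entries.csv` and column name `Part_Of_Speech` used in the usage note are assumptions based on the default CLDF Dictionary layout, so check them against the metadata file in the `cldf` directory before relying on them.

```python
import csv
from collections import Counter


def count_by_column(lines, column):
    """Count the values found in one column of a CSV table.

    `lines` is any iterable of CSV lines: an open file or a list of strings.
    `column` must be a column name present in the header row.
    """
    reader = csv.DictReader(lines)
    return Counter(row[column] for row in reader)
```

Assuming the layout described above, a call might look like `count_by_column(open("cldf/entries.csv", newline="", encoding="utf-8"), "Part_Of_Speech")`, which would tally entries per grammatical category.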

<h3 id="This dictionary">This dictionary</h3>
<table class="table table-nonfluid">
<tr>
<th>Size</th>
<td>8,109 entries; 321 images</td>
</tr>
<tr>
<th>Content</th>
<td>Lexical items of 20 dialects of Tseltal (Mayan language from Mexico, ISO
      639-3: tzh; Glottolog code: tzel1254), with morphological segmentation,
      descriptions of meanings in Spanish and comparative concepts in English.
      Lemmas consist of uninflected stems, with the exception of phrasemes, which
      are inflected phrases or sentences.
    </td>
</tr>
<tr>
<th>Research assistants</th>
<td>Alberto Gómez Pérez, Alberto Gutiérrez Gómez, Ángela Lorena Cruz Gómez,
      Antonia Sántiz Girón, Catalina López Gómez, Jaime Pérez González, Juan López
      Intzín, Juan Méndez Girón, Manuel Vázquez Castellanos, María de Jesús Gómez
      Sánchez, Miguel Silvano Jiménez, Oscar Gregorio Cruz Méndez, Roberto Sántiz
      Gómez, Sebastián Aguilar Méndez, Tomás Gómez López
    </td>
</tr>
<tr>
<th>Photographs</th>
<td>Archive of the Tseltal Documentation Project</td>
</tr>
<tr>
<th>Purpose</th>
<td>Lexical documentation of Tseltal as a whole language through all its
      dialects.
    </td>
</tr>
<tr>
<th>Research context and funding</th>
<td>This dictionary is the revised electronic version of the paper dictionary <i>Diccionario Multidialectal del Tseltal</i> (<a href="source">R049</a>); both dictionaries are part of the many outcomes of the Tseltal Documentation
      Project, hosted at CIESAS-Sureste, which was funded by ELDP/SOAS, CONACYT
      (Mexican National Council of Science and Technology), the INALI (Mexican
      National Indigenous Languages Institute) and the Max Planck Institute for
      Psycholinguistics.
    </td>
</tr>
<tr>
<th>Project Leader</th>
<td>Gilles Polian</td>
</tr>
</table>
<h3 id="The language and its speakers">The language and its speakers</h3>

Tseltal, previously spelled Tzeltal, is spoken in central and eastern Chiapas, a southeastern state of Mexico, by slightly fewer than half a million speakers. It is a western Mayan language, close to Chol (the language of classic Mayan inscriptions) and closest to Tsotsil (Tzotzil) (<a href="source">R021</a>, <a href="source">R045</a>, <a href="source">R048</a>).

The Tseltal language as a whole is not immediately endangered, thanks to its relatively large number of speakers (compared to most indigenous languages) and to the fact that many children still acquire it as their first language. However, Tseltal is threatened in the medium term. First of all, most speakers are now bilingual in Spanish, and transmission to new generations is declining overall, especially in urbanized places and their surroundings, where more and more children are now socialized primarily in Spanish. In some districts, such as Villa Las Rosas, Tseltal is on the verge of extinction, as only elders still speak it. In addition, the children who do acquire Tseltal learn an increasingly impoverished version of the language, as many native words fall into disuse, along with the traditional knowledge and ways of life that they were used to express. At the same time, Spanish is pervasively infiltrating the lexicon and the grammar, displacing native words and constructions and thus obstructing the genuine creativity of the language. Finally, there is almost no functional literacy, in spite of some progress being made in bilingual schooling, and the Mexican national context is still one of discrimination against indigenous languages and cultures.

Tseltal, like Mayan languages in general, is among the best described Amerindian languages. In addition to a few early colonial documents, in particular a good dictionary from the late 16th century (<a href="source">R013</a>), there has been a constant flow of publications since the mid-20th century. Published works include a reference grammar (<a href="source">R014</a>), dictionaries (<a href="source">R016</a>, <a href="source">R011</a>), grammatical studies (<a href="source">R044</a>, <a href="source">R018</a>, <a href="source">R015</a>, <a href="source">R019</a>, <a href="source">R017</a>), dialectal and diachronic studies (<a href="source">R020</a>, <a href="source">R021</a>, <a href="source">R022</a>, <a href="source">R023</a>, <a href="source">R024</a>, <a href="source">R025</a>, <a href="source">R026</a>), acquisition studies (<a href="source">R027</a>) and studies of semantic typology of space (<a href="source">R028</a>, <a href="source">R029</a>, <a href="source">R030</a>, <a href="source">R031</a>, <a href="source">R032</a>, <a href="source">R033</a>, <a href="source">R034</a>), among others. Nevertheless, most studies focus on just a few dialects (Tenejapa, Oxchuc).

There are three broad dialect areas: North, Center and South, plus a dialectally heterogeneous oriental region, a place of recent migrations, which was not studied. Dialectal variation is only moderate: speakers from different areas can communicate with each other fairly fluidly. This dictionary is multidialectal, as it covers eighteen places from all three areas, as represented in Map 1, along with the abbreviations used in this study. Note that references are also made to entire areas, through the corresponding abbreviation.

<figure class="img">
<img src="https://cdstar.shh.mpg.de/bitstreams/EAEA0-A6BB-0DF7-AD56-0/MAPA_TSELTAL.jpg" title="tseltal"/>
<figcaption>Map 1 [Based on a map designed by Vittorio Dell'Aquila]</figcaption>
</figure>

In the following list, the places where the lexicon was studied more thoroughly appear in boldface. In the other places, the lexicographic work was only partial.

<h4 id="Lexical coverage of the Multidialectal Tseltal Dictionary">Lexical coverage of the
    Multidialectal Tseltal Dictionary</h4>

North:

*   __Petalcingo__ (PE)
*   Yajalón (YA)
*   Chilón (CHI)
*   __Bachajón__ (BA) \[subdialects: San Sebastián (SS), San Jerónimo (SJ)\] 
*   Sitalá (ST)
*   __Guaquitepec__ (GU)
*   Sibakja’ (SB)

Center:

*   Tenango (TG)
*   __Cancuc__ (CA)
*   __Tenejapa__ (TP)
*   Abasolo (AB)
*   __Oxchuc__ (OX)
*   San Pedro Pedernal (SP)
*   Chanal (CHA)
*   Altamirano (AL)

South:

*   Amatenango (AM)
*   Aguacatenango (AG)
*   __Villa Las Rosas__ (VR)

Others (not shown in Map 1):

*   Oriental region (OR)
*   Copanaguastla, extinct dialect from the 16th century (CO)

In the North, microdialectal information was included in the case of Bachajón, which covers two historically and socially well-defined parts: San Sebastián (SS) and San Jerónimo (SJ).

In the Center, the speech varieties of Oxchuc and Chanal are practically identical (Chanal was founded by people from Oxchuc in historical times). Therefore, the dialectal category Oxchuc is meant to cover both Oxchuc and Chanal, unless otherwise indicated (e.g. <a href="entry">TSE11761</a> in its second sense, <a href="entry">TSE46861</a>).

As already mentioned, the oriental region, geographically known as “Cañadas” and “Selva” to the east of Map 1, was outside the lexicographical coverage. That area is dialectally heterogeneous, as it was populated by people from a great diversity of origins, speakers of indigenous languages (Tseltal and others) as well as monolingual Spanish speakers. As a consequence, there is no oriental dialect of Tseltal as such. Nevertheless, some data from villages of that region were included in the dictionary when they seemed relevant; those bear the abbreviation “OR”.

Finally, some data of comparative interest were included from <a href="source">R013</a>, the 16th-century dictionary that describes the Tseltal spoken 500 years ago in Copanaguastla, a town to the south of Villa Las Rosas that disappeared in the 17th century. Those data are indicated by the abbreviation “CO”.

The dialectal information contained in this dictionary should be understood as the best approximation possible with the lexicographic work undertaken. It is not meant to be definitive or fully systematic: this is not a dialectal atlas. 
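
For anyone processing the dialect labels programmatically, the coverage list above can be written down as a small lookup table. This is only an illustrative sketch: the names `DIALECTS`, `BACHAJON_SUBDIALECTS`, `THOROUGHLY_STUDIED` and `places_in` are hypothetical and do not come from the dataset, and the area label `"Other"` is a convenience for the two varieties not shown in Map 1.

```python
# Abbreviation -> (place, dialect area), transcribed from the coverage list.
DIALECTS = {
    "PE": ("Petalcingo", "North"),
    "YA": ("Yajalón", "North"),
    "CHI": ("Chilón", "North"),
    "BA": ("Bachajón", "North"),
    "ST": ("Sitalá", "North"),
    "GU": ("Guaquitepec", "North"),
    "SB": ("Sibakja’", "North"),
    "TG": ("Tenango", "Center"),
    "CA": ("Cancuc", "Center"),
    "TP": ("Tenejapa", "Center"),
    "AB": ("Abasolo", "Center"),
    "OX": ("Oxchuc", "Center"),  # also covers Chanal unless indicated otherwise
    "SP": ("San Pedro Pedernal", "Center"),
    "CHA": ("Chanal", "Center"),
    "AL": ("Altamirano", "Center"),
    "AM": ("Amatenango", "South"),
    "AG": ("Aguacatenango", "South"),
    "VR": ("Villa Las Rosas", "South"),
    "OR": ("Oriental region", "Other"),
    "CO": ("Copanaguastla (16th century, extinct)", "Other"),
}

# The two socially well-defined parts of Bachajón.
BACHAJON_SUBDIALECTS = {"SS": "San Sebastián", "SJ": "San Jerónimo"}

# Places shown in boldface, i.e. those whose lexicon was studied more thoroughly.
THOROUGHLY_STUDIED = {"PE", "BA", "GU", "CA", "TP", "OX", "VR"}


def places_in(area):
    """Return the place names listed under one dialect area."""
    return [place for place, place_area in DIALECTS.values() if place_area == area]
```

For example, `places_in("South")` returns the three southern places in list order.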

<h3 id="Collaborators and source of the data">Collaborators and source of the data</h3>

Sixteen people contributed to the Tseltal-Spanish Multidialectal Dictionary (TSMD) as project collaborators, in addition to the coordinator and other occasional language consultants. Their participation varied from a couple of months to several years, from 2010 to 2017. Their names are listed below, alphabetically by first name in each category.

<h4 id="General lexicography">General lexicography (development, correction and edition of
    the multidialectal database):</h4>

*   Juan López Intzín (“Xuno”)
*   Miguel Silvano Jiménez
*   Oscar Gregorio Cruz Méndez
*   Sebastián Aguilar Méndez
*   Tomás Gómez López

<h4 id="Data collection by dialect">Data collection by dialect:</h4>

*   Amatenango: Catalina López Gómez
*   Bachajón: Alberto Gutiérrez Gómez, Miguel Silvano Jiménez
*   Cancuc: Manuel Vázquez Castellanos
*   Guaquitepec: Sebastián Aguilar Méndez
*   Oxchuc: María de Jesús Gómez Sánchez, Roberto Sántiz Gómez
*   Petalcingo: Alberto Gómez Pérez, Oscar Gregorio Cruz Méndez
*   Tenango: Jaime Pérez González
*   Tenejapa: Antonia Sántiz Girón, Juan López Intzín (Xuno), Juan Méndez Girón
*   Villa Las Rosas: Tomás Gómez López
*   Yajalón: Ángela Lorena Cruz Gómez
*   Bionimy: Luis Malaret (Community College of Rhode Island)

This dictionary was developed as part of a larger project, the Tseltal Documentation Project (TDP), which started in 2006 at CIESAS-Sureste (San Cristóbal de Las Casas, Chiapas, Mexico) under the coordination of Gilles Polian and which was still underway in 2017. The TDP provided a corpus of around 500 hours of transcribed audiovisual recordings in Tseltal for the lexicographic work. Those recordings include narratives, dialogues, spontaneous conversations, ritual speech, public discourse and songs; many of them are fully accessible at AILLA (http://www.ailla.utexas.org/) and ELAR (http://www.elar-archive.org/) under Gilles Polian’s deposits. Fieldwork was conducted in all the dialectal points shown on Map 1, more in some of them, less in others: most of the corpus concerns the boldfaced place names in the list above. This corpus was one of the fundamental bases for the compilation of the dictionary, since it made it possible to search for words, morphemes and phrases, and to study their semantics in context as well as their dialectal distribution. Many examples in the TSMD were extracted from the corpus, either directly when possible or after editing.

Many previous works, including several dictionaries, were carefully examined at various stages of the TSMD project. The most important references, i.e. those that had a direct impact on this dictionary, are mentioned here:

*   Brent Berlin and Terrence Kaufman worked together on a Tenejapa Tseltal-English dictionary, which was not published but has been accessible through various manuscript versions, and is registered in microfilm as <a href="source">R016</a>. This same database was later reworked and broadened as <a href="source">R035</a>. Those two authors kindly shared their dictionary file with the TSMD team, for which we express to them our deep gratitude. 
*   The most complete Tseltal dictionary published up to now is <a href="source">R011</a>, which is a Bachajón Tseltal-Spanish dictionary. It was compiled in a community called Bahtsihbiltik, which belongs to the San Jerónimo sub-region (Bachajón (SJ)). The TSMD team frequently consulted it to confirm data from that dialect. 
*   The Public Education Office of Chiapas started publishing several works on local indigenous languages some twenty years ago. In particular, two lexicographic works were taken into account: the Tenejapa Tseltal-Spanish dictionary (<a href="source">R036</a>) and the multidialectal monolingual dictionary (<a href="source">R037</a>). 
*   <a href="source">R012</a> was also carefully studied, for the large amount of Tseltal data it contains. 
*   From a very different perspective, the previously mentioned dictionary <a href="source">R013</a> of a 16th-century Tseltal dialect was the object of many queries, although the task of linguistically processing all the information it contains is still incipient. 
*   Two linguistics Master’s theses with lexical information on certain Tseltal word classes were very useful: <a href="source">R038</a> on positionals and <a href="source">R039</a> on expressive predicates. Likewise, <a href="source">R047</a> is a PhD dissertation that consists of a dictionary of a particular Tseltal dialect: Villa Las Rosas. It was developed in parallel with the TSMD and both studies fed each other to a great extent. 
*   The last dictionary that was often looked up for the TSMD project is the indispensable work of <a href="source">R040</a> on Zinacantán Tsotsil. Tsotsil and Tseltal are indeed so close to each other that they can be called sister languages, which makes that great dictionary, unique in its depth in Amerindian linguistics, so beneficial for Tseltal lexicography. 
*   In addition to dictionaries, other studies contain significant lexical information on particular semantic fields or word classes of Tseltal. Those works were consulted whenever it was necessary and possible, although no systematic lexical extraction was carried out. The main works consulted were the following: <a href="source">R041</a> on numeral classifiers, <a href="source">R002</a> on ethnobotanics, <a href="source">R003</a> on ethnozoology, <a href="source">R004</a> and <a href="source">R042</a> on ethnomedicine, and other biologists’ studies where Tseltal names for living beings can be found along with their scientific identification; those references are cited in the corresponding entries of the TSMD. 

<h3 id="The orthography used in the dictionary">The orthography used in the
    dictionary</h3>

Tseltal orthography is officially standardized by a document published as <a href="source">R043</a>, which was the result of a series of meetings and workshops with Tseltal writers and bilingual teachers. This agreement differs little from what was already the common practice of most people writing the language. Tseltal orthography is broadly similar to that of other Mayan languages, with a few particularities.

The following table displays the five vowels common to all Tseltal dialects.

  

<table class="table table-bordered">
<caption>Table 1: Underlying vowels</caption>
<thead>
<tr>
<th></th>
<th>Front</th>
<th>Central</th>
<th>Back</th>
</tr>
</thead>
<tbody>
<tr>
<td>High</td>
<td>i</td>
<td></td>
<td>u</td>
</tr>
<tr>
<td>Mid</td>
<td>e</td>
<td></td>
<td>o</td>
</tr>
<tr>
<td>Low</td>
<td></td>
<td>a</td>
<td></td>
</tr>
</tbody>
</table>

  

Table 2 presents the consonants of the phonologically most conservative dialect, Bachajón, using the practical orthography now commonly accepted among speakers and linguists. When this differs from IPA, the corresponding IPA symbol is given between slashes.

  

<table class="table table-bordered">
<caption>Table 2: Underlying consonants</caption>
<thead>
<tr>
<th></th>
<th></th>
<th>Labial</th>
<th>Alveo-dental</th>
<th>Palato-alveolar</th>
<th>Velar</th>
<th>Glottal</th>
</tr>
</thead>
<tbody>
<tr>
<td>Stops</td>
<td>simple</td>
<td>p</td>
<td>t</td>
<td></td>
<td>k</td>
<td></td>
</tr>
<tr>
<td></td>
<td>ejective</td>
<td>p'</td>
<td>t'</td>
<td></td>
<td>k'</td>
<td>' /ʔ/</td>
</tr>
<tr>
<td></td>
<td>voiced</td>
<td>b</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Affricates</td>
<td>simple</td>
<td></td>
<td>ts /t͡s/</td>
<td>ch /t͡ʃ/</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>ejective</td>
<td></td>
<td>ts' /t͡s'/</td>
<td>ch' /t͡ʃ'/</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Fricatives</td>
<td></td>
<td></td>
<td>s</td>
<td>x /ʃ/</td>
<td>j /x/</td>
<td>h</td>
</tr>
<tr>
<td>Nasals</td>
<td></td>
<td>m</td>
<td>n</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Laterals</td>
<td></td>
<td></td>
<td>l</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Flap</td>
<td></td>
<td></td>
<td>r /ɾ/</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Approximants</td>
<td></td>
<td>w</td>
<td></td>
<td>y /j/</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

  

Notes on consonants:

*   Previously, &lt;ts&gt; and &lt;ts'&gt; used to be written &lt;tz&gt; and &lt;tz'&gt; respectively. Some linguists still follow that tradition. 
*   Some dialects (Oxchuc, Altamirano) lack /p'/, which merged with /b/ (cf. 6.2 below). This represents no orthographic issue, because the unique resulting phoneme /b/ is written as &lt;b&gt; (so for instance _p’ij_ ‘wise’ is _bij_ in Oxchuc and Altamirano). 
*   Most other dialects (all but Petalcingo) lack the opposition between /x/ (&lt;j&gt;) and /h/ (&lt;h&gt;), which historically merged. The resulting phoneme varies phonetically between \[x\] and \[h\], but it is uniquely transcribed as &lt;j&gt;. 
*   Some complications exist in the transcription of the glottal stop, because of two regrettable orthographic decisions: on the one hand, the decision to represent it orthographically with the same symbol used for ejective consonants (the apostrophe &lt;'&gt;), leading to potential confusions; and on the other hand, the decision not to write it at the beginning of words (preceding vowels). I’ll comment on these two cases and their consequences. 
*   Sequences of non-ejective stop/affricate + glottal stop are absent from basic roots, but a few of them arise through compounding or reduplication. In those cases, a different symbol must be used for the glottal stop to avoid confusion with the corresponding ejective stop/affricate: the symbol chosen by Tseltal writers has been the hyphen. This is the case in <a href="entry">TSE74871</a> /ʃʔuhtʔuht/ ‘flycatcher (bird)’, where two glottal stops can be observed: the second glottal stop cannot be transcribed with the normal apostrophe, because the orthographic sequence &lt;t'&gt; would be wrongly interpreted as the ejective alveo-dental stop /t'/, so a hyphen is used instead. This problem is absent with the first glottal stop in this word, as no ejective /ʃ'/ exists, so the sequence &lt;x'&gt; is correctly read as /ʃ+ʔ/. 
*   The hyphen is also used instead of the apostrophe after ejective consonants, such as _ok'-on_ /ok’ʔon/ ‘whine’. With the hyphen here, a visually confusing sequence of two apostrophes is avoided, as would be _x’ok’’on_. The same applies to _ach’-ach’tik_ /ʔat͡ʃ’ʔat͡ʃ’tik/ ‘half-new’ and _ihk’-ihk’tik_ /ʔihk’ʔihk’tik/ ‘blackish’. 
*   Concerning the beginning of words, the TSMD also aligns with a relatively bad practice, only because it is already well entrenched in the writing tradition of Mayan languages. It consists of not writing prevocalic initial glottal stops. For example, /ʔiʃim/ ‘corn’ is written _ixim_, not _’ixim_. This orthographic tradition comes from the fact that initial glottal stops were at some point considered purely phonetic, among other reasons because they disappear after possessive/ergative prefixes, e.g. /kiʃim/ _kixim_ ‘my corn’ {k- ‘1POS’}, and because they are systematic (no root begins with a vowel) and thus generally not contrastive. Unfortunately, in Tseltal there are some cases where they are contrastive: the possessive/ergative prefix for second person is _a(w)-_ without initial glottal stop (in most dialects), which creates minimal pairs with words beginning with /ʔa.../. For example, orthographic _abak_ may correspond to /abak/ {a-bak ‘2POS-bone’} ‘your bone’ or to /ʔabak/ ‘soot’, which are phonetically distinguished in speech. Fortunately, this kind of ambiguity is infrequent in practice. 
*   When the phoneme /b/ is preceded by a vowel inside a word, that vowel tends to be laryngealized, which amounts to hearing a glottal stop before the /b/. For example, _abat_ ‘assistant’ may sound like _a’bat_. This phenomenon is related to the fact that /b/ corresponded originally to the implosive /ɓ/, as it still does in other Mayan languages (especially in Guatemala), where it is written &lt;b'&gt;. Present-day Tseltal dialects have lost the implosive feature, but several dialects maintain to some degree the pre-laryngealization associated with the constricted glottis feature. However, this phenomenon is not fully understood yet, as it is quite variable, both inter- and intra-dialectally, so the TSMD follows the INALI’s norm, which consists of not taking this pre-laryngealization into account in the practical orthography. The only sequences written as V’b are those where the glottal stop belongs to the root and the /b/ is the first consonant of a suffix. This is the case for instance in <a href="entry">TSE59171</a> ‘meat’, from _ti’_ (t.v.) ‘eat (meat)’ and the nominalizer _-bal_. 

Apart from those few cases, Tseltal orthography is rather straightforward.
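
The correspondences in Tables 1 and 2, together with the notes on the hyphen and on unwritten initial glottal stops, can be sketched as a simple grapheme-to-phoneme converter. This is a minimal illustration under stated assumptions, not a tool used by the TSMD team: the names `GRAPHEMES` and `to_ipa` are hypothetical, the input is assumed to be a single whole word (not an affix cited with a leading hyphen), and the `initial_glottal` flag exists precisely because forms like _abak_ are ambiguous between a prefixed and a glottal-initial reading.

```python
TIE = "\u0361"  # combining tie bar used in affricate symbols

# Practical orthography -> IPA, following Tables 1 and 2.
GRAPHEMES = {
    "ts'": "t" + TIE + "s'", "ch'": "t" + TIE + "ʃ'",
    "ts": "t" + TIE + "s", "ch": "t" + TIE + "ʃ",
    "p'": "p'", "t'": "t'", "k'": "k'",
    "p": "p", "t": "t", "k": "k", "b": "b",
    "s": "s", "x": "ʃ", "j": "x", "h": "h",
    "m": "m", "n": "n", "l": "l", "r": "ɾ", "w": "w", "y": "j",
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u",
    "'": "ʔ",  # an apostrophe not absorbed by an ejective is a glottal stop
    "-": "ʔ",  # the hyphen stands for a glottal stop after stops and affricates
}
VOWELS = "aeiou"
# Try longer graphemes first so that <ts'> wins over <ts> and <t>.
ORDERED = sorted(GRAPHEMES, key=len, reverse=True)


def to_ipa(word, initial_glottal=True):
    """Convert one orthographic word to a broad phonemic transcription.

    With initial_glottal=True, the unwritten glottal stop is restored before
    a word-initial vowel; pass False for forms with the prefix a(w)-.
    """
    text = word.lower().replace("\u2019", "'")  # normalize curly apostrophes
    out = []
    if initial_glottal and text and text[0] in VOWELS:
        out.append("ʔ")
    i = 0
    while i < len(text):
        for grapheme in ORDERED:
            if text.startswith(grapheme, i):
                out.append(GRAPHEMES[grapheme])
                i += len(grapheme)
                break
        else:
            raise ValueError(f"unexpected character {text[i]!r} in {word!r}")
    return "".join(out)
```

With these assumptions, `to_ipa("ixim")` restores the initial glottal stop while `to_ipa("kixim")` does not, and `to_ipa("abak", initial_glottal=False)` yields the prefixed reading ‘your bone’. The sketch ignores dialectal mergers such as /p'/ with /b/ and the phonetic variation of &lt;j&gt;.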

<h3 id="Grammatical Categories">Grammatical Categories</h3>

In what follows, a very short sketch of each grammatical category used in this dictionary is presented. See <a href="source">R014</a> for further information on Tseltal grammar.

<h4 id="Nouns">Nouns</h4>
<h5 id="Class 1 and 2">Class 1 and 2</h5>

Two basic classes of nouns are distinguished in this dictionary: class 1 and class 2 (abbreviated as <span style="font-variant: small-caps;">n.</span> and <span style="font-variant: small-caps;">n2.</span> respectively): nouns of class 1 can be used without possessor, whereas class 2 nouns require a possessor, at least in their unmarked (non-suffixed) form. Some class 2 nouns can also appear non-possessed when they take an additional suffix, almost always a -Vl suffix, called “non-possession suffix”. The vowel of this suffix is not predictable and subject to dialectal variation and so it is indicated in each entry (e.g. <a href="entry">TSE44621</a>, <a href="entry">TSE24561</a>). Other class 2 nouns never appear non-possessed (e.g. <a href="entry">TSE07371</a>).

Besides the non-possession suffix, two other kinds of morphological information are indicated in some entries. First, some nouns (kinship terms) take a special plural suffix when they are possessed (e.g. <a href="entry">TSE01701</a>). Second, many nouns display a marked possessed form, in which they take a -Vl suffix in addition to the possessor prefix (e.g. <a href="entry">TSE10641</a>, <a href="entry">TSE43021</a>). The marked possessed form often indicates that the possessor is inanimate instead of animate. In other cases, it highlights that the kind of possession involved is non-canonical in some other way.

<h5 id="Action Nouns">Action Nouns (<span style="font-variant: small-caps;">act.n.</span>)
</h5>

Action nouns are a subtype of class 1 nouns. They denote agentive events, like <a href="entry">TSE02671</a> or <a href="entry">TSE29631</a>, and can be used in constructions where a non-finite verb is expected. Most of them are associated with an intransitive verb, although the morphological relation between action noun and verb is irregular. They also appear in a special construction as object of the verb <a href="entry">TSE00981</a>, which emphasizes the agentive involvement of the subject.

A subtype of action nouns is incorporating action nouns (<span style="font-variant: small-caps;">inc.act.n.</span>). They are formally compounds with a transitive root or stem followed by a (notional) object noun (e.g. <a href="entry">TSE33051</a>, <a href="entry">TSE37391</a>).

<h5 id="Relational Nouns">Relational Nouns (<span style="font-variant: small-caps;">rel.n.</span>)</h5>

Relational nouns are a subtype of class 2 nouns: they are formally nouns that are always possessed. They are functionally equivalent to adpositions, as they are basically used as grammatical relators (e.g. <a href="entry">TSE69241</a>, <a href="entry">TSE03501</a>, <a href="entry">TSE60631</a>).

<h5 id="Collective Nouns/Predicates">Collective Nouns/Predicates (<span style="font-variant: small-caps;">coll.</span>)</h5>

Lemmas classified as “collectives” are words derived with a suffix _-tik_, a suffix -Vl (variable vowel) or a combination of both (as _-tik-Vl_ or _-Vl-tik_). They denote the abundance of the thing designated by the base, e.g. _nichim_ ‘flower’ &gt; <a href="entry">TSE44662</a> ‘(place) full of flowers’. Their lexical classification is still problematic: in some of their uses they look like nouns, but at least in some dialects they do not behave like canonical nouns, in particular they cannot function as core verbal arguments, and they rather seem to be (both formally and semantically) diffusive adjectives (cf. 5.3 below). This is a topic for further research.

<h4 id="Verbs">Verbs</h4>

Verbs may be transitive (<span style="font-variant: small-caps;">t.v.</span>) or intransitive (<span style="font-variant: small-caps;">i.v.</span>). No basic ditransitive verbs exist in Tseltal, but all transitive verbs can be made ditransitive with the benefactive applicative _-bey ~ be ~ b_ (e.g. <a href="entry">TSE01391</a>). Verbs may be finite or non-finite. The regular infinitives are derived with the suffix _-el_; they are considered part of the verb forms when they head a non-finite clause, but many of them can also be used as nouns and some head their own entry as such (e.g. <a href="entry">TSE32161</a>, <a href="entry">TSE54741</a>).

Finite verbs inflect for aspect and mood, marked by affixes and preverbal auxiliaries. Only auxiliaries have entries of their own (<span style="font-variant: small-caps;">aux.</span>, e.g. <a href="entry">TSE75331</a>, <a href="entry">TSE35611</a>, <a href="entry">TSE28261</a>). An optional inflection category is pluractionality: there are special iterative and distributive forms for both transitive and intransitive verbs. Voice categories for transitive verbs are passive, antipassive, reflexive/reciprocal and the already mentioned benefactive applicative. Other valency-changing devices are derivational, like causative and anticausative.

Verbal inflection is very regular in Tseltal. The only verbs with some minimal irregularity are <a href="entry">TSE03481</a> ‘go’ and <a href="entry">TSE31381</a> ‘arrive’.

Several subclasses of verbs are identified in the dictionary:

*   Agentive intransitive verbs (<span style="font-variant: small-caps;">agt.i.v.</span>) typically correspond to actions carried out by human beings (e.g. <a href="entry">TSE02621</a>, <a href="entry">TSE29641</a>). Most of them have an irregular non-finite form, instead of the regular infinitive in _-el_. The irregular forms correspond to action nouns (cf. 5.1.3 above). 
*   Some transitive and intransitive verbs are registered as defective (<span style="font-variant: small-caps;">dev.t.v.</span> and <span style="font-variant: small-caps;">dev.i.v.</span> respectively), because they are restricted in terms of the inflection categories (person, aspect-mood) they can combine with (e.g. <a href="entry">TSE58141</a>, <a href="entry">TSE72851</a>, <a href="entry">TSE57291</a>). 
*   Movement and phasal intransitive verbs (<span style="font-variant: small-caps;">mov.i.v.</span> and <span style="font-variant: small-caps;">phas.i.v.</span> respectively) may function either as canonical intransitive verbs or as auxiliaries. In the latter case, they appear devoid of person marking and followed by a dependent form of the main verb, which carries person marking but no aspect. The exact construction is variable depending on the type of auxiliary (movement or phasal) and on the dialect. 
*   Several subclasses of transitive verbs are restricted to some particular pluractional or voice category, meaning that they always occur with that particular category (and its morphology): only distributive (<a href="entry">TSE67771</a>), only iterative (<a href="entry">TSE30971</a>), only reciprocal (<a href="entry">TSE09631</a>), only reflexive (<a href="entry">TSE06761</a>), and only passive (<a href="entry">TSE16621</a>), respectively abbreviated as <span style="font-variant: small-caps;">distr.t.v.</span>, <span style="font-variant: small-caps;">iter.t.v.</span>, <span style="font-variant: small-caps;">recipr.t.v.</span>, <span style="font-variant: small-caps;">refl.t.v.</span>, and <span style="font-variant: small-caps;">pass.t.v.</span>. 

<h4 id="Adjectives">Adjectives (<span style="font-variant: small-caps;">adj.</span>)</h4>

Canonical adjectives (simply classified as <span style="font-variant: small-caps;">adj.</span>, e.g. <a href="entry">TSE31151</a>, <a href="entry">TSE47331</a>) can normally be found in two functions: as non-verbal predicates and as attribute modifiers of a noun. Some adjectives display only one of these functions: they are then classified as <span style="font-variant: small-caps;">attr.adj.</span> (only attributive adjective, e.g. <a href="entry">TSE15211</a>) or <span style="font-variant: small-caps;">pred.adj.</span> (only predicative adjective, e.g. <a href="entry">TSE51571</a>).

Diffusive adjectives (<span style="font-variant: small-caps;">diff.adj.</span>, e.g. <a href="entry">TSE00091</a>) are a class of derived adjectives with a _-tik_ suffix; when they are based on a CVC root, that root is reduplicated. Their semantics is attenuative or distributive (visually plural pattern). They are mainly used as non-verbal predicates.

Positional adjectives (<span style="font-variant: small-caps;">pos.adj.</span>, e.g. <a href="entry">TSE04241</a>) are a class of derived adjectives. They are all based on CVC roots and derived through a -Vl suffix (with vowel harmony). Their semantics deals mainly with position (‘sit’, ‘stand’), disposition (‘lined up’, ‘heaped’) and/or shape (‘long’, ‘hollow’). Most of them have a special distributive plural form CVC-_ajtik_, indicated in each entry.

Morphology associated with adjectives:

*   Some root adjectives take an extra -Vl suffix in attributive function (e.g. <a href="entry">TSE11131</a>, <a href="entry">TSE53211</a>). The exact form of this suffix is indicated in each entry (there may be several variants). When an adjective takes the attributive suffix only optionally, the possibility of the absence of any suffix is indicated by a slashed zero “∅”, followed by the overt form(s) of the suffix (e.g. <a href="entry">TSE50771</a>). 
*   Most adjectives derive an abstract noun with a -Vl suffix, which can be homophonous with the attributive -Vl suffix (e.g. <a href="entry">TSE11131</a>, <a href="entry">TSE45021</a>). With positional adjectives, the abstract noun is often derived directly from the CVC root with an _-il_ suffix, instead of being formed on the CVC-Vl stem (e.g. <a href="entry">TSE04241</a>). 

<h4 id="Numerals and numeral classifiers">Numerals and numeral classifiers</h4>

With the exception of <a href="entry">TSE26301</a> ‘one’, all numerals (<span style="font-variant: small-caps;">num.</span>) are morphologically complex: they consist of a numeral root plus another element, which is either the generic suffix _-eb_ or a specific numeral classifier. In the TSMD, numerals are registered with the suffix _-eb_ (e.g. <a href="entry">TSE10171</a>, <a href="entry">TSE46951</a>). They derive an abstract noun which can be used as ordinal (like ‘second’) or quantifier (like ‘both’).

Numeral classifiers (<span style="font-variant: small-caps;">num.clas.</span>) are registered as bare stems (e.g. <a href="entry">TSE66471</a>, <a href="entry">TSE13171</a>), but they cannot constitute independent words by themselves: they must combine with a preceding numeral root or undergo some derivational process. When they seem to be used alone, it is because they combine with _j-_, the reduced form of _jun_ ‘one’, which is dropped in some dialects (cf. 6.7 below).

Some numeral classifiers are defective (<span style="font-variant: small-caps;">def.num.clas.</span>): they always take the numeral ‘one’ (_j-_), which is then integrated into their lemma. They denote small amounts, such as ‘a bit of...’ (e.g. <a href="entry">TSE24291</a>, <a href="entry">TSE74841</a>).

<h4 id="Expressive predicates">Expressive predicates</h4>

Expressives (<span style="font-variant: small-caps;">expr.</span>), otherwise known as “affect (words/verbs/predicates)”, are a class of derived predicates, intermediate between verbs and non-verbal predicates, that highlight striking sensory properties of events (e.g. <a href="entry">TSE08821</a>, <a href="entry">TSE32661</a>). They are based on CV(h/j)(C) roots, which can belong to any other open lexical category or be properly expressive, often onomatopoeic. Additionally, they obligatorily take one of a series of dedicated suffixes that mainly encode aspect, pluractionality, and degree of emphasis. 

<h4 id="Adverbs">Adverbs (<span style="font-variant: small-caps;">adv.</span>)</h4>

Words classified as adverbs are free words that typically add information of space, time, manner, emphasis or modality, instead of predicating directly or acting as predicate arguments. This classification is only tentative and based on function, not on form, as there is no morphological uniformity among Tseltal adverbs. Many adverbs could probably be alternatively classified as non-verbal predicates or as some kind of adjective. Indeed, some adverbs are associated with an abstract noun suffix (e.g. <a href="entry">TSE45021</a>) just like adjectives are.

Incorporated adverbs (<span style="font-variant: small-caps;">inc.adv.</span>) appear inside the verbal complex before the verbal root, after the personal and/or aspectual prefixes, although most of them are orthographically written separated from the verbal root (e.g. <a href="entry">TSE00781</a>).

<h4 id="Other word classes">Other word classes</h4>

*   Coordinators (<span style="font-variant: small-caps;">coord.</span>): There are three coordinators: <a href="entry">TSE55781</a> ‘and’ and the loanwords <a href="entry">TSE18331</a> ‘and’ and <a href="entry">TSE46111</a> ‘or’. 
*   Definite articles (<span style="font-variant: small-caps;">art.</span>): Three lemmas are classified as definite articles: <a href="entry">TSE58251</a>, <a href="entry">TSE18351</a> and <a href="entry">TSE40951</a>, of which the last two originate as demonstratives (cf. <a href="entry">TSE18341</a> and <a href="entry">TSE40941</a>, respectively). All those articles usually coincide with the suffixed determiners _-e_ or _-i_. 
*   Demonstratives (<span style="font-variant: small-caps;">dem.</span>): This category covers locative and non-locative demonstratives (e.g. <a href="entry">TSE59071</a>, <a href="entry">TSE40941</a>). 
*   Directionals (<span style="font-variant: small-caps;">dir.</span>): Directionals are based on nominalized intransitive movement verbs and one phasal verb (e.g. <a href="entry">TSE03581</a>, <a href="entry">TSE16971</a>). They normally appear after a predicate or a spatio-temporal localizing expression to specify the trajectory or orientation, as well as to add aspectual nuances. 
*   Interjections (<span style="font-variant: small-caps;">interj.</span>): These are mainly greetings and address terms (e.g. <a href="entry">TSE03041</a>, <a href="entry">TSE03211</a>). 
*   Interrogative/indefinite proforms (<span style="font-variant: small-caps;">prof.</span>): Under this label are registered interrogative pronouns, such as <a href="entry">TSE39421</a> ‘who’, and proadverbs, such as <a href="entry">TSE04481</a> ‘where’, etc. Those proforms function either as interrogatives or as indefinite (‘someone’, ‘in some place’, etc.), depending on the syntactic context. 
*   Non-verbal predicates (<span style="font-variant: small-caps;">n.v.p.</span>): This is a residual category for words that mainly function as predicates, but that do not qualify as verbs, nouns or adjectives. It includes for instance the existential/locative predicate <a href="entry">TSE03221</a>. 
*   Onomatopoeias (<span style="font-variant: small-caps;">onom.</span>): Only a few onomatopoeias are registered in the TSMD (e.g. <a href="entry">TSE08941</a>). This lexical field has not been properly researched yet. 
*   Particles (<span style="font-variant: small-caps;">part.</span>): This is a residual category for different invariable elements, whose detailed classification is still pending. It includes second-position clitics and discourse particles, among others. Their functions cover aspectuality, tense, modality, etc. (e.g. <a href="entry">TSE00011</a>, <a href="entry">TSE03001</a>). 
*   Personal pronouns (<span style="font-variant: small-caps;">pro.</span>): Only two groups of items are identified as personal pronouns: on the one hand, <a href="entry">TSE16701</a> (_~ja'_) and its inflected forms; on the other hand, the possessed forms of <a href="entry">TSE66811</a>, which is also classified as a relational noun. 
*   Prepositions (<span style="font-variant: small-caps;">prep.</span>): This group contains only two items: <a href="entry">TSE56851</a> (general locative/instrumental preposition) and <a href="entry">TSE55771</a> ‘with’. 
*   Quantifiers (<span style="font-variant: small-caps;">quant.</span>): In this group are included adverbs and/or non-verbal predicates whose function is to quantify, such as ‘a lot (of)’ or ‘a little bit (of)’ (e.g. <a href="entry">TSE04871</a>, <a href="entry">TSE41791</a>). This is a very preliminary classification not yet supported by a detailed analysis. 
*   Subordinators (<span style="font-variant: small-caps;">sub.</span>): A few subordinators are registered in the TSMD, such as <a href="entry">TSE40961</a> ‘if’ or <a href="entry">TSE58261</a> ‘general subordinator’. 

<h4 id="Coordinate compounds">Coordinate compounds</h4>

The only compounds identified as such in the TSMD are the coordinate compounds (or “co-compounds”), because they tend to be lexically anomalous: they usually lie somewhere between completely fused compounds and the coordination of independent words (this is not uncommon cross-linguistically, cf. <a href="source">R046</a>). This means that their inflection may be variously and unpredictably distributed between both members of the compound. The following kinds of co-compounds are registered: 

*   Nominal co-compounds: <span style="font-variant: small-caps;">n.co.</span> and <span style="font-variant: small-caps;">n2.co.</span>, depending on the noun class, cf. 5.1.1 (e.g. <a href="entry">TSE69521</a>, <a href="entry">TSE41061</a>). 
*   Verbal co-compounds, both transitive and intransitive: <span style="font-variant: small-caps;">t.v.co.</span> and <span style="font-variant: small-caps;">i.v.co.</span> (e.g. <a href="entry">TSE35901</a>, <a href="entry">TSE70381</a>). 
*   Adjectival co-compounds: <span style="font-variant: small-caps;">adj.co.</span> (e.g. <a href="entry">TSE69101</a>) and positional adjectival co-compounds: <span style="font-variant: small-caps;">pos.adj.co.</span> (e.g. <a href="entry">TSE22251</a>). 
*   Adverbial co-compounds: <span style="font-variant: small-caps;">adv.co.</span> (e.g. <a href="entry">TSE57561</a>). 

<h4 id="Phraseology">Phraseology</h4>

Phrasemes have their own entries, with references to the corresponding entries of their constitutive parts. Phrasemes that function as predicates or as whole sentences are just identified as phr. Phrasemes may also be equivalent to a complex noun or adverb; those are abbreviated as n.phr. and adv.phr. respectively. Subsequently, an indication of the internal syntax of each phraseme is given in parentheses, e.g. “t.v.+obj.NP” describes a phraseme consisting of a transitive verb followed by an object NP (cf. <a href="entry">TSE35161</a>).

<h3 id="Predictable dialectal variation">Predictable dialectal variation</h3>

As a dialect dictionary, the TSMD is made up of many entries that subsume several dialect forms. That is, although each entry is headed by a unique lemma, other forms are indicated as dialectal alternative forms and the rest of the entry concerns any of those forms. Whenever it was possible to determine the most conservative form, that form was selected as lemma, as the other dialect forms can be deduced from it through the application of rules. In other cases, an arbitrary decision was made.

The dialectal variation concerning the phonology or morpho-phonology of particular words is partly predictable on the basis of the most conservative dialectal form, which generally coincides with that of Bachajón. For instance, if Bachajón presents a word starting with /h/, one can automatically deduce that, if another dialect like Tenejapa also displays this word, it will have /j/ instead of /h/. This kind of correspondence is defined in the TSMD as a set of seven parameters of predictable variation. These parameters, described below, allow merging together in one entry different forms under the same conservative lemma. Those seven parameters are indicated by abbreviations, which appear as the titles of the following sub-sections.

<h4 id="The H">The "H"</h4>

Proto-Tseltal distinguished a glottal fricative /h/ and a velar fricative /j/ (IPA: /x/). Only Bachajón and Petalcingo maintain this phonological opposition, whereas all other dialects have merged /h/ and /j/ (and the resulting phoneme is written &lt;j&gt;). But the developments of the proto-phoneme /°h/ were complex, as some dialects dropped it in several contexts instead of conserving it as /j/. The outcomes of /°h/ are well documented; the abbreviation “H” indicates that the /h/ present in the lemma gives way to the following phenomena.

<ul><li>In initial position, all dialects but Bachajón have /j/ instead of /h/. Petalcingo is
    a special case in this respect, because it is in the middle of the process of
    replacing /h/ with /j/ in initial position. This process is more advanced
    among younger speakers than among older ones. But in the TSMD only
    conservative forms (i.e., with initial /h/) are given for Petalcingo.
  </li><li>Between vowels, some dialects maintain the outcome of /°h/, as /h/ or /j/;
    others drop it; a third group allows both possibilities, as in Table 3.
    <table class="table table-bordered">
<caption>Table 3: Outcomes of /°h/ in intervocalic position</caption>
<tr>
<th></th>
<th colspan="2">Conservation</th>
<th>Loss</th>
<th>Unstable</th>
</tr>
<tr>
<td></td>
<td>Bachajón, Petalcingo</td>
<td>Villa Las Rosas</td>
<td>Center, Aguacatenango, Amatenango</td>
<td>North (-Bachajón, -Petalcingo)</td>
</tr>
<tr>
<td>‘become bitter’ <a href="entry">TSE08581</a></td>
<td>ch’a<b>h</b>ub</td>
<td>ch’a<b>j</b>ub</td>
<td>ch’aub</td>
<td>ch’a<b>j</b>ub ~ ch’aub</td>
</tr>
<tr>
<td>‘smoke’ <a href="entry">TSE08321</a></td>
<td>ch’a<b>h</b>il</td>
<td>ch’a<b>j</b>il</td>
<td>ch’ail</td>
<td>ch’a<b>j</b>il ~ ch’ail</td>
</tr>
<tr>
<td>‘down’ <a href="entry">TSE31411</a></td>
<td>ko<b>h</b>el</td>
<td>ko<b>j</b>el</td>
<td>koel</td>
<td>ko<b>j</b>el ~ koel</td>
</tr>
</table>
</li><li>Some VhV sequences with identical vowels do not follow the preceding rule, but
    tend to undergo a further reduction to V. This tendency is distributed over
    dialects as illustrated in Table 4. Note that this phenomenon mixes with the
    preceding one: no reduction only means that both vowels stay in place, but the
    aspiration may be present, as /h/ or as /j/, or drop.
    <br/>
<table class="table table-bordered">
<caption>Table 4: Tendency to reduction of homorganic °VhV sequences in
        frequent words
      </caption>
<tr>
<th></th>
<th colspan="3">No reduction</th>
<th>Optional reduction</th>
<th>Reduction</th>
</tr>
<tr>
<td></td>
<td>Bachajón, Petalcingo</td>
<td>North (-Guaquitepec, -Sitalá, -Yajalón)</td>
<td>Center (-Tenejapa)</td>
<td>South</td>
<td>Guaquitepec, Sitalá, Tenejapa, Yajalón</td>
</tr>
<tr>
<td>‘walk’ <a href="entry">TSE05181</a></td>
<td>be<b>h</b>en</td>
<td>be<b>j</b>en</td>
<td>been</td>
<td>be<b>j</b>en ~ ben</td>
<td>ben</td>
</tr>
<tr>
<td>‘name’ <a href="entry">TSE05681</a></td>
<td>bi<b>h</b>il</td>
<td>bi<b>j</b>il</td>
<td>biil</td>
<td>bi<b>j</b>il ~ bil</td>
<td>bil</td>
</tr>
<tr>
<td>‘chasm’ <a href="entry">TSE72161</a></td>
<td>xa<b>h</b>ab</td>
<td>xa<b>j</b>ab</td>
<td>xaab</td>
<td>xa<b>j</b>ab ~ xab</td>
<td>xab</td>
</tr>
</table>
</li><li>In word-final position two groups of dialects emerge: those that keep a reflex
    of /°h/ (either as /h/ or as /j/) and those that do not, as illustrated in
    Table 5.
  </li><br/><table class="table table-bordered">
<caption>Table 5: Outcomes of /°h/ in word-final position</caption>
<tr>
<th></th>
<th colspan="2">Conservation</th>
<th>Loss</th>
</tr>
<tr>
<td></td>
<td>Bachajón, Petalcingo</td>
<td>North (-Bachajón, -Petalcingo), Cancuc, Tenango, Villa Las Rosas</td>
<td>Central (-Cancuc, -Tenango)</td>
</tr>
<tr>
<td>‘go down’ <a href="entry">TSE31371</a></td>
<td>ko<b>h</b></td>
<td>ko<b>j</b></td>
<td>ko</td>
</tr>
<tr>
<td>‘look for’ <a href="entry">TSE36361</a></td>
<td>le<b>h</b></td>
<td>le<b>j</b></td>
<td>le</td>
</tr>
<tr>
<td>‘spicy’ <a href="entry">TSE75471</a></td>
<td>ya<b>h</b></td>
<td>ya<b>j</b></td>
<td>ya</td>
</tr>
</table><br/><li>In Oxchuc, an /°h/ caused the ejectivization of a following non-ejective stop or
    affricate, as shown in Table 6.
  </li><br/><table class="table table-bordered">
<caption>Table 6: Ejectivization of /°hC/ in Oxchuc</caption>
<tr>
<th></th>
<th>With ejectivization</th>
<th>Without ejectivization</th>
</tr>
<tr>
<td></td>
<td>Oxchuc</td>
<td>other dialects</td>
</tr>
<tr>
<td>‘shoulder’ <a href="entry">TSE44311</a></td>
<td>ne<b>jk’</b>el</td>
<td>ne<b>hk</b>el,...</td>
</tr>
<tr>
<td>‘wound’ <a href="entry">TSE16171</a></td>
<td>e<b>jch’</b>en</td>
<td>e<b>hch</b>en,...</td>
</tr>
<tr>
<td>‘go’ <a href="entry">TSE03481</a></td>
<td>ba<b>jt’</b></td>
<td>ba<b>ht</b>,...</td>
</tr>
</table><br/><li>The proto-phoneme /°h/ dropped before sonorants (/m/, /n/, /l/, /w/ and /y/)
    and before the bilabial stop /b/ in all dialects but Bachajón, Petalcingo and Yajalón, and
    optionally in Chilón; see Table 7 (Yajalón is omitted because /h/ further drops in
    non-final syllables, see below).
  </li><br/><table class="table table-bordered">
<caption>Table 7: Outcomes of /°h/ before sonorants and /b/</caption>
<tr>
<th></th>
<th>Conservation</th>
<th>Variable</th>
<th>Loss</th>
</tr>
<tr>
<td></td>
<td>Bachajón, Petalcingo</td>
<td>Chilón </td>
<td>Center, South, Guaquitepec, Sibakja', Sitalá</td>
</tr>
<tr>
<td>‘thunder’ <a href="entry">TSE60321</a></td>
<td>t’o<b>h</b>m</td>
<td>t’o<b>j</b>m</td>
<td>t’om</td>
</tr>
<tr>
<td>‘middle’ <a href="entry">TSE46351</a></td>
<td>o<b>h</b>lil</td>
<td>o<b>j</b>lil</td>
<td>olil</td>
</tr>
<tr>
<td>‘cough’ <a href="entry">TSE46291</a></td>
<td>o<b>h</b>bal</td>
<td>obal</td>
<td>obal</td>
</tr>
</table><br/><li>In Villa Las Rosas, the /°h/ was elided before ejective consonants (both stops and
    affricates), as in Table 8.
  </li><br/><table class="table table-bordered">
<caption>Table 8: Loss of /°h/ before ejective consonants in Villa Las Rosas</caption>
<tr>
<th></th>
<th>Loss</th>
<th colspan="2">Conservation</th>
</tr>
<tr>
<td></td>
<td>Villa Las Rosas</td>
<td>Bachajón, Petalcingo</td>
<td>other dialects</td>
</tr>
<tr>
<td>‘dance’ <a href="entry">TSE00551</a></td>
<td>ak’ot</td>
<td>a<b>h</b>k’ot</td>
<td>a<b>j</b>k’ot</td>
</tr>
<tr>
<td>‘swell’ <a href="entry">TSE54721</a></td>
<td>sit’</td>
<td>si<b>h</b>t’</td>
<td>si<b>j</b>t’</td>
</tr>
<tr>
<td>‘tasty’ <a href="entry">TSE07401</a></td>
<td>buts’an</td>
<td>bu<b>h</b>ts’an</td>
<td>bu<b>j</b>ts’an</td>
</tr>
</table><br/><li>Finally, in Yajalón the reflexes of preconsonantal /°h/ drop everywhere but in the
    last syllable of an intonation phrase. This has two consequences: 1) the /°h/
    of °CVhCVC roots is always lost in Yajalón (e.g. <em>°nehkel</em> ‘shoulder’ gives
    <em>nekel</em>); 2) the reflex of /°h/ in monosyllabic roots disappears when
    the root is followed by any other syllable in the same utterance, for instance
    when that root takes any suffix. This phenomenon is illustrated in Table 9.
    <br/>
<table class="table table-bordered">
<caption>Table 9: Reflexes of /°hC/ in Yajalón </caption>
<tr>
<th></th>
<th>Conservation everywhere</th>
<th>Conservation in utterance-final position</th>
<th>Other dialects</th>
</tr>
<tr>
<td></td>
<td>Bachajón, Petalcingo</td>
<td>Yajalón </td>
<td>Guaquitepec, Cancuc, Amatenango,...</td>
</tr>
<tr>
<td>‘shoulder’ <a href="entry">TSE44311</a></td>
<td>ne<b>h</b>kel</td>
<td>nekel</td>
<td>ne<b>j</b>kel</td>
</tr>
<tr>
<td>‘thunder’ <a href="entry">TSE60321</a></td>
<td>t’o<b>h</b>m</td>
<td>t’o<b>j</b>m</td>
<td>t’om</td>
</tr>
<tr>
<td>‘s/he fell’ <a href="entry">TSE75521</a></td>
<td>ya<b>h</b>l</td>
<td>ya<b>j</b>l</td>
<td>yal</td>
</tr>
<tr>
<td>‘I fell’ (<em>-on</em> ‘SUBJ1SG’)</td>
<td>ya<b>h</b>lon</td>
<td>yalon</td>
<td>yalon</td>
</tr>
<tr>
<td>‘s/he went’ <a href="entry">TSE03481</a></td>
<td>ba<b>h</b>t</td>
<td>ba<b>j</b>t</td>
<td>ba<b>j</b>t</td>
</tr>
<tr>
<td>‘s/he already went’ (<em>-ix</em> ‘already’)</td>
<td>ba<b>h</b>tix</td>
<td>batix</td>
<td>ba<b>j</b>tix</td>
</tr>
</table>
</li></ul>
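
The intervocalic and word-final outcomes summarized in Tables 3 and 5 can be restated as a small rewrite rule. The following Python sketch is only an illustration: the function name and the pattern labels (`"h"`, `"j"`, `"loss"`, `"unstable"`) are hypothetical and not part of the dictionary data, and the sketch deliberately ignores the VhV reduction of Table 4 and all preconsonantal contexts (Tables 6 to 9).

```python
VOWELS = set("aeiou")

def h_outcomes(lemma: str, pattern: str) -> list[str]:
    """Return the reflex(es) of a conservative lemma containing /h/.

    `pattern` is a hypothetical label for a dialect group:
      "h"        - /h/ is kept (e.g. Bachajón, Petalcingo)
      "j"        - /h/ is kept but merged with /j/
      "loss"     - /h/ is dropped
      "unstable" - both the /j/ form and the dropped form occur (Table 3)
    Only /h/ after a vowel and before a vowel or word end is rewritten,
    so the <h> of the digraph <ch> is left untouched.
    """
    def rewrite(replacement: str) -> str:
        out = []
        for i, ch in enumerate(lemma):
            after_vowel = i > 0 and lemma[i - 1] in VOWELS
            before_vowel = i + 1 < len(lemma) and lemma[i + 1] in VOWELS
            word_final = i == len(lemma) - 1
            if ch == "h" and after_vowel and (before_vowel or word_final):
                out.append(replacement)
            else:
                out.append(ch)
        return "".join(out)

    if pattern == "h":
        return [lemma]
    if pattern == "j":
        return [rewrite("j")]
    if pattern == "loss":
        return [rewrite("")]
    if pattern == "unstable":
        return [rewrite("j"), rewrite("")]
    raise ValueError(f"unknown pattern: {pattern}")

print(h_outcomes("ch’ahub", "unstable"))  # ['ch’ajub', 'ch’aub']
print(h_outcomes("koh", "loss"))          # ['ko']
```

Which pattern applies to which dialect depends on the position of /h/, so a real implementation would need one pattern per dialect and per context, as laid out in the tables above.
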
<h4 id="The P">"P’"</h4>

The abbreviation “P’” stands for the phenomenon whereby all instances of /p’/ correspond to /b/ in Oxchuc, as illustrated in Table 10. Furthermore, an /°h/ caused the ejectivization of a following /p/ in Oxchuc (cf. Table 6), which subsequently became /b/, as illustrated in the last row.

  

<table class="table table-bordered">
<caption>Table 10: Neutralization of /p'/ with /b/ in Oxchuc</caption>
<tr>
<th></th>
<th>Without neutralization (all dialects but Oxchuc)</th>
<th>Neutralization (Oxchuc)</th>
</tr>
<tr>
<td>‘wise’ <a href="entry">TSE49991</a></td>
<td>p’ij</td>
<td>bij</td>
</tr>
<tr>
<td>‘pine bark’ <a href="entry">TSE47871</a></td>
<td>p’alax</td>
<td>balax</td>
</tr>
<tr>
<td>‘merchandise’ <a href="entry">TSE51321</a></td>
<td>p’olmal</td>
<td>bolmal</td>
</tr>
<tr>
<td>‘crab’ <a href="entry">TSE44401</a></td>
<td>nep’</td>
<td>xneb</td>
</tr>
<tr>
<td>‘be resolved’ <a href="entry">TSE08501</a></td>
<td>chahpaj~chajpaj~chapaj</td>
<td>chajbaj</td>
</tr>
</table>
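
As a minimal sketch, the segmental correspondence in Table 10 amounts to a single substitution. The function name below is hypothetical, and the sketch covers only the replacement of /p’/ by /b/: it does not add the _x-_ seen in Oxchuc _xneb_ ‘crab’, nor does it model the /°hp/ case in the last row of the table.

```python
def oxchuc_form(form: str) -> str:
    """Replace the ejective /p'/ with /b/, as described for Oxchuc.

    Both the typographic apostrophe (’) and the ASCII one (') are accepted,
    since either may appear in transcriptions.
    """
    for apostrophe in ("’", "'"):
        form = form.replace("p" + apostrophe, "b")
    return form

for lemma in ("p’ij", "p’alax", "p’olmal"):
    print(lemma, "->", oxchuc_form(lemma))
```
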
<h4 id="The -Y">"-Y"</h4>

Several derivative suffixes end in /y/, like the transitivizer suffixes -(C)Vy (_-tay, -liy, -iy, -uy_), the iterative suffix _-Vlay_ and the suffix _-ey_ that derives temporal adverbs. The final /y/ of all these suffixes tends to drop at least in some contexts in all dialects except Villa Las Rosas, where this elision seems absent. Most dialects tend to elide this /y/ before a consonant, i.e. when the word takes another suffix that starts with a consonant; some others also elide it when the suffix in question ends the word (before the final word boundary). Finally, Tenejapa tends to elide it always (i.e. it is close to losing this segment altogether in those suffixes). Note that this is just a rough approximation, as we are dealing here with tendencies on a continuum. 

This phenomenon is illustrated in Table 11 with forms of the verb <a href="entry">TSE32151</a> ‘help’, where the elision of the final /y/ is at stake: before a vowel with suffix _-on_ ‘OBJ1SG’, at the end of the word with a null third person object and before a consonant with suffix _-tik_ ‘plural of a first person subject’.

  

<table class="table table-bordered">
<caption>Table 11: Derivative suffix "-Y"</caption>
<tr>
<th>Elision:</th>
<th>Minimal</th>
<th>Before consonants only</th>
<th>Before consonants and word boundary</th>
<th>Maximal</th>
</tr>
<tr>
<td>Dialects:</td>
<td>Villa Las Rosas</td>
<td>North (-Bachajón, -Petalcingo); Central (-Tenejapa), Amatenango</td>
<td>Aguacatenango, Bachajón, Petalcingo</td>
<td>Tenejapa</td>
</tr>
<tr>
<td>‘s/he helps me’</td>
<td>ya skoltayon</td>
<td>ya skoltayon</td>
<td>ya skoltayon</td>
<td>ya skoltaon</td>
</tr>
<tr>
<td>‘s/he helps her/him’</td>
<td>ya skoltay</td>
<td>ya skoltay</td>
<td>ya skolta</td>
<td>ya skolta</td>
</tr>
<tr>
<td>‘we help her/him’</td>
<td>ya jkoltaytik</td>
<td>ya jkoltatik</td>
<td>ya jkoltatik</td>
<td>ya jkoltatik</td>
</tr>
</table>
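
The four columns of Table 11 can be read as four elision levels, each covering a larger set of contexts than the previous one. The Python sketch below restates that reading; the function name and the level labels are hypothetical, and the sketch treats as categorical what the text describes as tendencies on a continuum.

```python
# Contexts in which the final /y/ is elided, per (hypothetical) level label.
ELISION_CONTEXTS = {
    "minimal": set(),                                   # Villa Las Rosas
    "before_consonants": {"consonant"},
    "before_consonants_and_boundary": {"consonant", "boundary"},
    "maximal": {"consonant", "boundary", "vowel"},      # Tenejapa
}

def apply_y_elision(stem: str, suffix: str, level: str) -> str:
    """Attach `suffix` to a stem ending in /y/, eliding the /y/ if the
    dialect's elision level covers the following context.

    An empty suffix stands for the word boundary (e.g. a null third
    person object).
    """
    if not stem.endswith("y"):
        return stem + suffix
    if suffix == "":
        context = "boundary"
    elif suffix[0] in "aeiou":
        context = "vowel"
    else:
        context = "consonant"
    base = stem[:-1] if context in ELISION_CONTEXTS[level] else stem
    return base + suffix

# The three suffix contexts illustrated in Table 11, for one level:
for suffix in ("on", "", "tik"):
    print(apply_y_elision("koltay", suffix, "before_consonants_and_boundary"))
```

For the level `"before_consonants_and_boundary"` this yields _koltayon_, _kolta_ and _koltatik_, matching the third column of Table 11 once the aspect marker and person prefixes are added.
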
<h4 id="Vj">"Vj"</h4>

Many intransitive verbs are derived with a suffix _-ij / -uj_, which sometimes comes with a preceding consonant, as _-Cij / -Cuj_ (e.g. _-k’ij / -k’uj,_ etc.). In these suffixes, the vowel is phonologically determined by the root vowel: if the root vowel is /o, u/ the suffix is _-(C)ij_, whereas if the root vowel is /a, e, i/ the suffix takes the form _-(C)uj_. Some dialects, however, display other vowels in these suffixes. Namely, Center and Amatenango have _-(C)ej_ instead of _-(C)ij_, and Amatenango, Cancuc, and Tenejapa have _-(C)oj_ in place of _-(C)uj_. Both cases can be analyzed as a lowering of the vowel caused by the final velar fricative /j/. This is summarized in Table 12.

  

<table class="table table-bordered">
<caption>Table 12: Derivative suffix "-Vj"</caption>
<tr>
<th></th>
<th>Basic form</th>
<th>Lowering of /i/</th>
<th>Lowering of both /i/ and /u/</th>
</tr>
<tr>
<td></td>
<td></td>
<td>Abasolo, Oxchuc, Tenango, San Pedro Pedernal</td>
<td>Amatenango, Cancuc, Tenejapa</td>
</tr>
<tr>
<td>‘be scattered’ (<a href="entry">TSE07741</a>)</td>
<td>busk’ij</td>
<td>busk’ej</td>
<td>busk’ej</td>
</tr>
<tr>
<td>‘roll’ (<a href="entry">TSE04321</a>)</td>
<td>balch’uj</td>
<td>balch’uj</td>
<td>balch’oj</td>
</tr>
</table>
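
The vowel selection and the dialectal lowering described above can be combined in a short function. This is a sketch under stated assumptions: the function name and the `lowering` labels are hypothetical, the root is assumed to be a CVC root with a single vowel, and the segmentations _bus_ + _-k’ij_ and _bal_ + _-ch’uj_ are assumed here for illustration only.

```python
def vj_form(root: str, consonant: str = "", lowering: str = "none") -> str:
    """Build a derived intransitive stem with the suffix -(C)Vj.

    The basic suffix vowel is /i/ after a root vowel /o, u/ and /u/ after
    a root vowel /a, e, i/. `lowering` is a hypothetical label:
      "none" - basic form
      "i"    - /i/ lowers to /e/
      "both" - /i/ lowers to /e/ and /u/ lowers to /o/
    """
    if lowering not in ("none", "i", "both"):
        raise ValueError(f"unknown lowering pattern: {lowering}")
    root_vowel = next(ch for ch in root if ch in "aeiou")
    vowel = "i" if root_vowel in "ou" else "u"
    if vowel == "i" and lowering in ("i", "both"):
        vowel = "e"
    elif vowel == "u" and lowering == "both":
        vowel = "o"
    return root + consonant + vowel + "j"

for pattern in ("none", "i", "both"):
    print(pattern, vj_form("bus", "k’", pattern), vj_form("bal", "ch’", pattern))
```
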
<h4 id="O/U">"O/U"</h4>

Several suffixes display a dialectally variable vowel: it may be /o/ or /u/. It is not clear which one is historically older. For instance, a common derivation for expressive predicates (see 5.5 above) consists of a suffix -{C}Vn where {C} is a copy of the root initial consonant and V alternates between /o/ and /u/ depending on the dialect (cf. _<a href="entry">TSE08821</a>~chajchun_ ‘sound repeatedly like steps on dry leaves’).

Some dialects consistently select either /o/ or /u/ in all the concerned suffixes: for example Cancuc has /o/, whereas Amatenango, Petalcingo, and Tenejapa always prefer /u/. But other dialects, such as Bachajón and Oxchuc, display some indeterminacy: there the selection is lexically determined. However, the dialectal distribution of this phenomenon has not been completely documented yet. 

In the dictionary, the forms with /o/ have been chosen in the lemmas, and the other possibility is indicated below with the abbreviation “O/U”. This is an arbitrary decision. Other examples can be observed in <a href="entry">TSE00841</a>, <a href="entry">TSE07431</a>, <a href="entry">TSE16051</a>, and <a href="entry">TSE23521</a>.

<h4 id="-tVmba">“-tVmba”</h4>

The reciprocal nominalizer suffix is _-tamba_ or _-tomba_ depending on the dialect, with a variable /a/~/o/ vowel: in North and Villa Las Rosas it is always /a/ (e.g. <a href="entry">TSE41841</a> ‘mutual killing’, from _mil_ ‘kill’), whereas in Center and Amatenango it is exclusively /o/ (miltomba), with the exception of Tenejapa where /a/ and /o/ alternate (information is lacking for Aguacatenango). This is summarized in Table 13.

<table class="table table-bordered">
<caption>Table 13: Derivative suffix "-tVmba"</caption>
<tr>
<th></th>
<th>North, Villa Las Rosas</th>
<th>Center, Amatenango</th>
<th>Tenejapa</th>
</tr>
<tr>
<td></td>
<td>/a/</td>
<td>/o/</td>
<td>/a/~/o/</td>
</tr>
<tr>
<td>‘mutual killing’ <a href="entry">TSE41841</a></td>
<td>miltamba</td>
<td>miltomba</td>
<td>miltamba~miltomba</td>
</tr>
<tr>
<td>‘fight’ <a href="entry">TSE39871</a></td>
<td>majtamba</td>
<td>majtomba</td>
<td>majtamba~majtomba</td>
</tr>
</table>

  

As _a_-forms of this suffix are more widespread, those were selected for lemmas in the dictionary.
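
As an illustration of how the lemma form with /a/ relates to the dialect forms in Table 13, the following Python sketch expands a root into its reciprocal nominalization(s). The function name and the pattern labels are hypothetical; they simply mirror the three columns of the table.

```python
# Vowels of the reciprocal nominalizer per (hypothetical) dialect pattern.
TVMBA_VOWELS = {
    "a": ["a"],             # North, Villa Las Rosas
    "o": ["o"],             # Center, Amatenango
    "a~o": ["a", "o"],      # Tenejapa
}

def reciprocal_nominal(root: str, pattern: str) -> list[str]:
    """Return the -tVmba form(s) of `root` for the given pattern."""
    return [f"{root}t{vowel}mba" for vowel in TVMBA_VOWELS[pattern]]

print(reciprocal_nominal("mil", "a~o"))  # ['miltamba', 'miltomba']
print(reciprocal_nominal("maj", "o"))    # ['majtomba']
```
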

<h4 id="J-">"J-"</h4>

There are three homophonous prefixes _j-_:

*   The agentive prefix, which derives a person-denoting noun from action nouns, as _elek’_ ‘theft’ &gt; _j’elek’_ ‘thief’. 
*   The masculine nominal class, which appears with proper nouns, as _jPetul_ ‘Peter’ and some names of animals and plants (the feminine counterpart of this prefix is _x-_). 
*   The reduced form of the numeral _jun_ ‘one’ in combination with numeral classifiers (cf. 5.4), as _jch’ix_ ‘one long thing’. 

These prefixes were completely lost in Guaquitepec and Oxchuc, and are only optionally used in Cancuc and San Pedro Pedernal. Therefore, in those dialects _elek’_ means either ‘theft’ or ‘thief’. In the dictionary, the prefixed forms were preferentially registered.

<h3 id="Abbreviations">Abbreviations</h3>
<table class="table table-bordered">
<caption>Table 14: Abbreviations used in the dictionary</caption>
<tbody>
<tr>
<td><span style="font-variant: small-caps;">act.n.</span></td>
<td>action noun</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">adj.</span></td>
<td>adjective</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">adv.</span></td>
<td>adverb</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">agt.</span></td>
<td>agentive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">art.</span></td>
<td>definite article</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">attr.</span></td>
<td>attributive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">clas.</span></td>
<td>classifier</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">co.</span></td>
<td>co-compound</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">coord.</span></td>
<td>coordinator</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">def.</span></td>
<td>defective</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">dem.</span></td>
<td>demonstrative</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">diff.</span></td>
<td>diffusive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">dir.</span></td>
<td>directional</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">expr.</span></td>
<td>expressive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">i.</span></td>
<td>intransitive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">inc.act.n.</span></td>
<td>incorporating action noun</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">inc.adv.</span></td>
<td>incorporated adverb</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">interj.</span></td>
<td>interjection</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">mov.</span></td>
<td>movement</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">n.</span></td>
<td>noun</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">n.v.p.</span></td>
<td>non-verbal predicate</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">num.</span></td>
<td>numeral</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">onom.</span></td>
<td>onomatopoeia</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">part.</span></td>
<td>particle</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">phas.</span></td>
<td>phasal</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">pos.</span></td>
<td>positional</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">pred.</span></td>
<td>predicative</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">prep.</span></td>
<td>preposition</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">pro.</span></td>
<td>personal pronoun</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">prof.</span></td>
<td>interrogative/indefinite proform</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">quant.</span></td>
<td>quantifier</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">rel.</span></td>
<td>relational</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">sub.</span></td>
<td>subordinator</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">t.</span></td>
<td>transitive</td>
</tr>
<tr>
<td><span style="font-variant: small-caps;">v.</span></td>
<td>verb</td>
</tr>
</tbody></table>
<h3 id="Acknowledgements">Acknowledgements</h3>

The following institutions funded the general documentation project of which the TSMD was a part:

*   ELDP/SOAS, through the Field Trip Grant 0114 (2006) and the Major Documentation Project 0164 (2007-2010). 
*   The CONACYT (Mexican National Council of Science and Technology), through the SEP-CONACYT fund for basic research. 
*   The INALI (Mexican National Indigenous Languages Institute).
*   The Max Planck Institute for Psycholinguistics.
*   CIESAS-Sureste, where this project was hosted.

Roberto Sántiz Gómez donated 69 drawings, which he had asked the artist Antun Kojtom to make for his MA research on positional adjectives (cf. <a href="source">R038</a>). 

Gabriela Torres Freyermuth contributed to the collection and selection of photographs as part of her social service, along with Antonia Sántiz Girón.

<h3 id="External Link">External Link</h3>

An updated version of the Tseltal-Spanish database, with added audio recordings and illustrations, is available at [http://ditsel.aldelim.org/](http://ditsel.aldelim.org/).

