Tigrinya Analogy Test for evaluating Word Embeddings
Description
This is a Tigrinya version of the Google Analogy Test set, which is used to evaluate English word-embedding models. The analogy test is a well-established strategy to empirically evaluate the quality of word-embedding models. More information about the English task can be found at the ACL Wiki.
This data was first machine translated and then manually verified by a native speaker to reduce errors.
Some aspects of the original analogy test are specific to English, such as those related to grammar or morphology, and may not transfer well to other languages. We therefore discarded examples that became irrelevant in Tigrinya when adapting the task. The resulting Tigrinya Analogy Test set has 18465 entries, while the source English data has 19544 entries.
An entry is dropped if its translation leads to one of the following conditions:
- The source word pair maps to a single Tigrinya word. For example, lucky and luckiest both correspond to ዕድለኛ.
- A source word translates to a multi-word expression. For example, grandson (ወዲ ጓል / ወዲ ወዲ) and granddaughter (ጓል ጓል / ጓል ወዲ). This is because typical word-embedding approaches such as word2vec are not designed to predict multi-word phrases.
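The two drop conditions can be sketched as a small filter. This is only an illustration of the rules described above, not the script used to build the dataset. The function name `keep_entry` and the assumption that each entry is a list of four translated words (two word pairs) are hypothetical.

```python
def keep_entry(translated):
    """Return True if a translated analogy entry passes both drop conditions.

    `translated` is assumed to hold the four translated words of one entry,
    in the order (a, b, c, d), where (a, b) and (c, d) are the two word pairs.
    """
    a, b, c, d = translated
    # Condition 1: a source word pair collapsed to a single Tigrinya word.
    if a == b or c == d:
        return False
    # Condition 2: a translation is a multi-word expression.
    if any(len(word.split()) != 1 for word in translated):
        return False
    return True


# "w3" and "w4" are placeholder tokens, not real translations.
collapsed_pair = ["ዕድለኛ", "ዕድለኛ", "w3", "w4"]
multi_word = ["ወዲ", "ጓል", "ወዲ ወዲ", "ጓል ጓል"]
valid = ["ሰብኣይ", "ሰበይቲ", "ወዲ", "ጓል"]

print(keep_entry(collapsed_pair), keep_entry(multi_word), keep_entry(valid))
```

Whitespace is used here as a simple proxy for "multi-word expression"; a real pipeline might need a more careful check.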
Test Sections
The test includes a series of semantic and syntactic analogies, divided into sections that cover topics such as world capitals, currencies, family, tense, and plurality. It contains the following sections:
- capital-world
- currency
- city-in-state
- family
- gram1-adjective-to-adverb
- gram2-opposite
- gram3-comparative
- gram4-superlative
- gram5-present-participle
- gram6-nationality-adjective
- gram7-past-tense
- gram8-plural
- gram9-plural-verbs
Examples:
- Semantic analogy from the world capitals section: “ኣስመራ: ኤርትራ as ፓሪስ: ?”. A model that responds correctly returns “ፈረንሳ”.
- Semantic analogy from the family section: “ሰብኣይ: ሰበይቲ as ወዲ: ጓል”.
- Syntactic analogy from the past-tense section, illustrated with the English source words: “Walk: Walked as Run: Ran”.
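The following sketch shows one common way a model can answer such a question: the vector-offset method, which picks the word whose vector is closest (by cosine similarity) to `b - a + c`. This page does not prescribe a particular method, so treat this as an assumption. The function names are hypothetical, and the three-dimensional vectors are made-up toy values, not real Tigrinya embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Answer "a : b as c : ?" with the vector-offset method."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three question words are excluded from the candidate answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (made up for illustration only).
toy_vectors = {
    "ሰብኣይ": [1.0, 0.0, 1.0],
    "ሰበይቲ": [1.0, 1.0, 1.0],
    "ወዲ": [0.0, 0.0, 1.0],
    "ጓል": [0.0, 1.0, 1.0],
    "ፓሪስ": [1.0, 0.0, 0.0],
}

answer = solve_analogy(toy_vectors, "ሰብኣይ", "ሰበይቲ", "ወዲ")
print(answer)  # prints ጓል
```

With real embeddings, the candidate set would be the model's whole vocabulary rather than a handful of words.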
Evaluation
The final accuracy of a model is the proportion of the questions that the model answers correctly.
Generally, a better-quality model would answer more questions correctly than a model of lower quality.
Note, however, that a model with low performance on this analogy test might still contain useful information, although it may not be robust enough for more complex tasks.
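As a minimal sketch of this accuracy computation, the code below scores a model over a list of questions and also reports accuracy per section. It assumes the test file follows the layout of the original Google Analogy Test set (a `: section-name` header line followed by lines of four whitespace-separated words); that layout is an assumption here and should be checked against the actual file. The names `evaluate` and `predict` are hypothetical.

```python
from collections import defaultdict


def evaluate(lines, predict):
    """Return (overall_accuracy, per_section_accuracy).

    `predict(a, b, c)` is any callable that returns the model's answer
    to "a : b as c : ?" (or None if it cannot answer).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    section = "unknown"
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            # Section header, for example ": family".
            section = line[1:].strip()
            continue
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        a, b, c, expected = parts
        total[section] += 1
        if predict(a, b, c) == expected:
            correct[section] += 1
    per_section = {name: correct[name] / total[name] for name in total}
    n_questions = sum(total.values())
    overall = sum(correct.values()) / n_questions if n_questions else 0.0
    return overall, per_section


sample_lines = [
    ": capital-world",
    "ኣስመራ ኤርትራ ፓሪስ ፈረንሳ",
    ": family",
    "ሰብኣይ ሰበይቲ ወዲ ጓል",
]

# A dummy "model" that always answers ጓል, used only to exercise the code.
overall, per_section = evaluate(sample_lines, lambda a, b, c: "ጓል")
print(overall, per_section)
```

In this sketch, a question the model cannot answer counts as incorrect. Some evaluation tools instead skip questions that contain out-of-vocabulary words, which can change the reported score, so it is worth stating which convention was used.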
Limitations
- The analogy test can be a good indicator of the quality of word embeddings, but it should be used with caution when comparing models trained on data from different domains. It should not be expected to generalize equally to all domains.
- The final score can be affected by the size, vocabulary, and domain of the text on which the models are trained. For example, this may not be a good benchmark for comparing models trained on news text with models trained on social media posts.
- Even though a manual sanity check was performed, the semi-automatic construction of the Tigrinya test set might have introduced errors. If you discover any, you are welcome to contribute by opening an issue at the GitHub repo, https://github.com/fgaim/tigrinya-analogy-test.
Citation
If you use this resource in your research, please cite it accordingly.
Files
Name | Size |
---|---|
TigrinyaAnalogyTest.zip (md5:cafaf293227bf6d419738d03a6e9a462) | 82.5 kB |