WordNets for South African Languages
Authors/Creators
- 1. Council for Scientific and Industrial Research
- 2. University of Limpopo
- 3. University of Pretoria
Description
Data statement for the WordNets for South African Languages
Data set name: WordNets for South African Languages
Citation: Sefara, T.J., Mokgonyane, T.B. and Marivate, V., 2021. Practical Approach on Implementation of WordNets for South African Languages. In Proceedings of the Eleventh Global Wordnet Conference.
Data set developer(s): Sefara, T.J. (https://speechtech.co.za), Mokgonyane, T.B. (https://sites.google.com/view/tumisho-mokgonyane) and Marivate, V. (https://vima.co.za)
Data statement authors: Sefara, T.J. (https://speechtech.co.za), Mokgonyane, T.B. (https://sites.google.com/view/tumisho-mokgonyane) and Marivate, V. (https://vima.co.za)
Link to the dataset: zenodo link here
Dataset license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
A. CURATION RATIONALE
The WordNets for South African Languages dataset has been modified to be compatible with the Open Multilingual Wordnet (OMW) in NLTK. It contains wordnets for Setswana, Sepedi, Tshivenda, isiZulu and isiXhosa. The dataset was originally created for WordNet 2.0 and has now been converted to WordNet 3.0 using the sensemap files from Princeton WordNet.
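The conversion step described above amounts to rewriting each synset identifier through an old-to-new lookup table. The following is only a minimal sketch of that idea, not the authors' conversion script: the function name `remap_entries`, the `offset-pos` identifier style, and every identifier and lemma in the sample data are hypothetical placeholders, and the lookup table is assumed to have been built beforehand from the sensemap files.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (synset identifier, lemma)


def remap_entries(
    entries: List[Entry], offset_map: Dict[str, str]
) -> Tuple[List[Entry], List[Entry]]:
    """Rewrite synset identifiers through an old-to-new lookup table.

    Returns the converted entries and, separately, the entries whose
    identifier has no counterpart in the lookup table.
    """
    converted: List[Entry] = []
    unmapped: List[Entry] = []
    for synset_id, lemma in entries:
        new_id = offset_map.get(synset_id)
        if new_id is None:
            # Keep unmapped entries aside instead of silently dropping them.
            unmapped.append((synset_id, lemma))
        else:
            converted.append((new_id, lemma))
    return converted, unmapped


# Placeholder data: these identifiers are invented for illustration only.
old_to_new = {"00000001-n": "00000011-n", "00000002-v": "00000022-v"}
entries = [
    ("00000001-n", "lemma_a"),
    ("00000002-v", "lemma_b"),
    ("00000003-n", "lemma_c"),  # no mapping available
]

converted, unmapped = remap_entries(entries, old_to_new)
print(converted)
print(unmapped)
```

Returning the unmapped entries separately makes it easy to see how many synsets did not survive the version change, which is useful when checking the coverage of a converted wordnet.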
B. LANGUAGE VARIETY/VARIETIES
The languages in the dataset, with their ISO 639-2 codes, are:
- Sepedi (nso)
- Setswana (tsn)
- isiXhosa (xho)
- isiZulu (zul)
- Tshivenda (ven)
C. SPEAKER DEMOGRAPHIC
N/A
D. ANNOTATOR DEMOGRAPHIC
N/A
E. SPEECH SITUATION
N/A
F. TEXT CHARACTERISTICS
N/A
G. RECORDING QUALITY
N/A
H. OTHER
We provide a link to the library that utilises this dataset: https://github.com/JosephSefara/AfricanWordNet
I. PROVENANCE APPENDIX
N/A
About this document
A data statement is a characterisation of a dataset that provides context to allow developers and users to better understand how experimental results might generalise, how software might be appropriately deployed, and what biases might be reflected in systems built on the software.
Files
| Name | Size | Checksum |
|---|---|---|
| africanwordnet.zip | 473.3 kB | md5:d84345dcebfb7be12fb4be78bb25b03f |
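The MD5 checksum listed for africanwordnet.zip can be used to confirm that a download is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are illustrative, and the local path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file table for africanwordnet.zip.
EXPECTED_MD5 = "d84345dcebfb7be12fb4be78bb25b03f"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive saved in the current directory.
    archive = Path("africanwordnet.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum mismatch")
    else:
        print("archive not found:", archive)
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before use.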