Kiel South Asian Typological Database
Description
Legend for interpreting the data of the Kiel South Asian Typological Database
This corpus was originally compiled under the direction of John Peterson by Jessica Katiuscia Ivani, with contributions by Netra Prasad Paudyal, Nikita König, Lennart Chevallier, Anika Besser, Nellia Bleyer, Sarah Anders and Josephine Hennig in the project “Towards a linguistic prehistory of eastern central South Asia (and beyond)”, financed by the German Research Council (DFG, Project Grant 326697274). The database has since been considerably expanded and the data have been re-checked and corrected, where necessary, by John Peterson and Lennart Chevallier.
This database includes information on up to 237 features (described below) for 40 languages from the Indo-Aryan, Munda and Dravidian families, as well as the isolates Kusunda and Nihali. Of these 237 features, 98 derive from the Grambank database of the Glottobank research consortium and were compiled by the members of our project in cooperation with that project. We include here only those 98 features from that database which we felt are of particular relevance for South Asia.
All updates will be documented in detail with respect to the changes made, together with the date of the respective update.
Feature values: The features are encoded as follows for all languages:
1 – the respective feature is found in this language
0 – the respective feature is not found in this language
? – it is not clear from the available data sources whether this feature is found in the respective language or not
NA – this section of the data has not yet been completed for the relevant language
The values of the multistate features (GB024, GB025, GB065 and GB193) state whether the adnominal element precedes the noun (1), follows the noun (2), or whether both orders occur (3).
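The encoding above can be sketched as a small lookup in Python. This is only an illustration: the function name `decode_value` and the wording of the descriptions are hypothetical, and the sketch assumes that "?" and "NA" may also occur with the four multistate features, which the legend does not state explicitly. Any value not documented in the legend is reported as an error rather than guessed at.

```python
# Multistate feature labels as listed in the legend.
MULTISTATE_FEATURES = {"GB024", "GB025", "GB065", "GB193"}

# Values for ordinary (binary) features.
BINARY_MEANINGS = {
    "1": "feature present",
    "0": "feature absent",
}

# Values for the multistate (word order) features.
ORDER_MEANINGS = {
    "1": "adnominal element precedes the noun",
    "2": "adnominal element follows the noun",
    "3": "both orders occur",
}

# Assumption: these two codes can appear with any feature.
SPECIAL_MEANINGS = {
    "?": "unclear from the available sources",
    "NA": "not yet completed",
}


def decode_value(feature, raw):
    """Return a short description of one cell value (hypothetical helper)."""
    value = raw.strip()
    if value in SPECIAL_MEANINGS:
        return SPECIAL_MEANINGS[value]
    if feature in MULTISTATE_FEATURES:
        table = ORDER_MEANINGS
    else:
        table = BINARY_MEANINGS
    if value not in table:
        # The legend documents no other values, so refuse to guess.
        raise ValueError(f"unexpected value {raw!r} for feature {feature}")
    return table[value]
```

For example, `decode_value("GB024", "2")` returns the description for "follows the noun", whereas the same value "2" for a binary feature raises a `ValueError`.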
Features labeled “GB” are features from the original Grambank database compiled by members of our project. As the labeling of features in that database may have changed somewhat since that time, the labels found here may no longer correlate one-to-one with those features. We hope to synchronize these labels in the near future, but until then users of these features will have to check these on their own.
The feature labels “NGB”, “JPP” and “SA” refer only to different stages during the compilation of the data in our own project and are not relevant to the analysis of the data themselves.
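When working with the feature list, it can be convenient to group the labels by their prefix, for instance to separate the Grambank-derived features from the others. The following Python sketch assumes that every label starts with its letter prefix followed by a number, as in GB024; the exact format of the "NGB", "JPP" and "SA" labels is an assumption here, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Prefixes mentioned in the legend.
KNOWN_PREFIXES = ("GB", "NGB", "JPP", "SA")


def label_prefix(label):
    """Return the leading letter prefix of a feature label, or "other"."""
    match = re.match(r"[A-Za-z]+", label.strip())
    prefix = match.group(0).upper() if match else ""
    # Matching the whole letter run keeps "NGB" distinct from "GB".
    return prefix if prefix in KNOWN_PREFIXES else "other"


def count_by_prefix(labels):
    """Count how many labels fall under each prefix."""
    return Counter(label_prefix(label) for label in labels)
```

Applied to the full feature list, the count for "GB" would be expected to match the 98 Grambank-derived features mentioned in the description, which offers a quick sanity check after loading the data.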
The primary areas of grammar covered by the relevant features (e.g., ergativity, classifiers, negation, number, etc.) have been indicated on the right-hand side of the features list for many of these features. This list is not exhaustive and is only intended to serve as an initial orientation.
Use of these data
These data may be freely used in scientific research under the following two conditions:
- That you properly cite this database, including the following information:
Ivani, Jessica Katiuscia, Peterson, John & Chevallier, Lennart. 2022. The Kiel South Asian Typological Database. https://doi.org/10.5281/zenodo.7153825. [Date of last access].
- That you inform us in the event of incorrect data in the table, should you find any, so that we can recheck these ourselves.
We would also be grateful if you would send us a copy of your work using these data.
We expressly welcome input on the data from experts in the languages contained in this database and on further languages of the subcontinent!
Files
Kiel_Corpus_data.csv
Total size: 75.5 kB
Checksum | Size |
---|---|
md5:84d6d6b96d483c5bd862005512aa7fbb | 49.0 kB |
md5:fa1d683b1e65a3f2d3123f2b95be59eb | 26.6 kB |
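The md5 checksums listed above can be used to check that a downloaded file arrived intact. The following Python sketch uses only the standard library; the function names are hypothetical, and because the listing does not show which checksum belongs to which file, the sketch simply tests whether a file's digest matches either published value.

```python
import hashlib

# Checksums as published in the file listing (without the "md5:" prefix).
PUBLISHED_MD5 = {
    "84d6d6b96d483c5bd862005512aa7fbb",
    "fa1d683b1e65a3f2d3123f2b95be59eb",
}


def md5_of_file(path, chunk_size=65536):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's digest equals one of the published values."""
    return md5_of_file(path) in PUBLISHED_MD5
```

A call such as `matches_published_checksum("Kiel_Corpus_data.csv")` would then indicate whether the local copy corresponds to one of the published files, assuming the file has been saved under that name in the working directory.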