There is a newer version of the record available.

Published October 6, 2022 | Version v1
Dataset Open

Kiel South Asian Typological Database

Description

Legend for interpreting the data of the Kiel South Asian Typological Database

This corpus was originally compiled under the direction of John Peterson by Jessica Katiuscia Ivani, with contributions by Netra Prasad Paudyal, Nikita König, Lennart Chevallier, Anika Besser, Nellia Bleyer, Sarah Anders and Josephine Hennig in the project “Towards a linguistic prehistory of eastern central South Asia (and beyond)”, financed by the German Research Council (DFG, Project Grant 326697274). The database has since been considerably expanded and the data have been re-checked and corrected, where necessary, by John Peterson and Lennart Chevallier.

This database includes information on up to 237 features (described below) for 40 languages from the Indo-Aryan, Munda and Dravidian families, as well as the isolates Kusunda and Nihali. Of these 237 features, 98 derive from the Grambank database of the Glottobank research consortium and were compiled by the members of our project in cooperation with that project. We include here only those 98 features from that database that we consider particularly relevant for South Asia.

All updates will be documented in detail, stating the changes made and the date of each update.

Feature values: The features are encoded as follows for all languages:

1 – the respective feature is found in this language

0 – the respective feature is not found in this language

? – it is not clear from the available data sources whether this feature is found in the respective language or not

NA – this section of the data has not yet been completed for the relevant language

The values of the multistate features (GB024, GB025, GB065 and GB193) indicate whether the adnominal element precedes the noun (1), follows the noun (2), or whether both orders occur (3).
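
The encoding described above can be sketched in code. The following Python function is a minimal illustration and not part of the database: the function name, the wording of the returned labels and the assumption that "?" and "NA" may also occur with the multistate features are ours.

```python
# Multistate features named in the legend (order of adnominal element and noun).
MULTISTATE_FEATURES = {"GB024", "GB025", "GB065", "GB193"}

# Illustrative labels (assumption): the wording is chosen here, not by the database.
BINARY_LABELS = {"1": "present", "0": "absent"}
ORDER_LABELS = {"1": "precedes noun", "2": "follows noun", "3": "both orders"}
SHARED_LABELS = {"?": "unclear", "NA": "not yet coded"}


def decode_value(feature: str, raw: str) -> str:
    """Translate one raw cell value into a readable label."""
    value = raw.strip()
    # Assumption: "?" and "NA" can occur with any feature, multistate or not.
    if value in SHARED_LABELS:
        return SHARED_LABELS[value]
    # Multistate features use the word-order states, all others are binary.
    labels = ORDER_LABELS if feature in MULTISTATE_FEATURES else BINARY_LABELS
    if value not in labels:
        raise ValueError(f"unexpected value {raw!r} for feature {feature}")
    return labels[value]
```

For example, `decode_value("GB024", "2")` returns `"follows noun"`, while a value outside the legend raises a `ValueError` rather than being silently accepted.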

Features labeled “GB” are taken from the original Grambank database as compiled by members of our project. As the labeling of features in that database may have changed somewhat since then, the labels found here may no longer correspond one-to-one to the current Grambank labels. We hope to synchronize the labels in the near future; until then, users of these features will have to check the correspondence themselves.

The feature labels “NGB”, “JPP” and “SA” refer only to different stages during the compilation of the data in our own project and are not relevant to the analysis of the data themselves.
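
When working with the feature labels programmatically, it can be convenient to separate the prefix from the rest of the label. The sketch below assumes that every label begins with one of the four prefixes named above, followed by an identifier; the exact form of the NGB, JPP and SA labels is not described here, so the pattern and the function names are illustrative only.

```python
import re
from typing import Optional

# "NGB" is listed before "GB" so that the longer prefix is tried first.
LABEL_PATTERN = re.compile(r"^(NGB|GB|JPP|SA)")


def label_prefix(label: str) -> Optional[str]:
    """Return the prefix of a feature label, or None if it has no known prefix."""
    match = LABEL_PATTERN.match(label.strip())
    return match.group(1) if match else None


def has_gb_prefix(label: str) -> bool:
    """True only for labels with the plain "GB" prefix (not "NGB")."""
    return label_prefix(label) == "GB"
```

Since “NGB”, “JPP” and “SA” reflect compilation stages only, the prefix is mainly useful for picking out the “GB” features whose labels may need to be checked against Grambank.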

The primary areas of grammar covered by the relevant features (e.g., ergativity, classifiers, negation, number) have been indicated on the right-hand side of the features list for many of these features. This list is not exhaustive and is only intended to serve as an initial orientation.

Use of these data

These data may be freely used in scientific research under the following two conditions:

  1. That you properly cite this database, including the following information:

    Ivani, Jessica Katiuscia, Peterson, John & Chevallier, Lennart. 2022. The Kiel South Asian Typological Database. https://doi.org/10.5281/zenodo.7153825. [Date of last access].
     
  2. That you inform us of any incorrect data you find in the table, so that we can recheck them ourselves.

    We would also be grateful if you would send us a copy of your work using these data.


We expressly welcome input on the data from experts on the languages contained in this database and on further languages of the subcontinent!

Files

Kiel_Corpus_data.csv

Files (75.5 kB)

md5:84d6d6b96d483c5bd862005512aa7fbb (49.0 kB)
md5:fa1d683b1e65a3f2d3123f2b95be59eb (26.6 kB)
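
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard library; the function names are ours, and since the listing does not state which checksum belongs to which file, the pairing of file and checksum has to be taken from the record itself.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as "md5:<hex>" or plain hex."""
    # Drop an optional "md5:" prefix and ignore letter case.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum(local_path, "md5:...")` returns `True` only if the local copy is byte-for-byte identical to the file the checksum was computed from.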