There is a newer version of the record available.

Published February 24, 2022 | Version v1.0.1
Dataset Open

The AUTOTYP database

Description

AUTOTYP is a large-scale research program with goals in both quantitative and qualitative typology. This release contains the full raw data from the AUTOTYP databases along with metadata, aggregated datasets and aggregation scripts. 

Please cite as:

Bickel, Balthasar, Nichols, Johanna, Zakharko, Taras, Witzlack-Makarevich, Alena, Hildebrandt, Kristine, Rießler, Michael, Bierkandt, Lennart, Zúñiga, Fernando & Lowe, John B. 2022. The AUTOTYP database (v1.0.1). https://doi.org/10.5281/zenodo.6255206

Release 1.0.1

This is a maintenance release:

  • improved JSON output
  • improved and corrected the metadata for multiple variables of the type value-list
  • improved the bibliography data, added Glottolog language and reference IDs (many thanks to Robert Forkel for doing this work)
  • minor data fixes (duplicate entries in datasets Alienability, Gender and NumeralClassifiers)

Known limitations:

  • The WordDomains dataset has known structural defects and exports only part of the information published in AUTOTYP. We are working on overhauling this module and will publish a redesigned version of this dataset in a future release.
  • Alignment computations are currently defined only for a subset of the available grammatical relation data. Please refer to the documentation. This will be addressed in a future release.
  • Data in CLDF format will be provided in a future release.

Release 1.0.0

This is a completely new release, radically overhauled from the earlier 0.1.x version, and focuses on usability, documentation, and completeness. New features include:

  • Over 260 typological variables that describe 1319 languages across approximately 260,000 data points or, together with the derived (aggregated) data, over 1,700,000 data points.
  • New naming conventions for datasets and variables, focusing on usability and clarity.
  • Language name and Glottolog code now accompany every dataset, so each dataset is a self-standing table of a typological variable (but can also be linked to any and all of the others via the internal language ID).
  • Published data now includes the raw exported database data as well as derived aggregated tables.  All aggregation scripts used to compute derived data are published as well.
  • New R and JSON exports for users who prefer those environments.
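As a rough illustration of linking datasets via the internal language ID, the sketch below joins two small tables on a shared ID column. The column names (`LID`, `Language`, `Glottocode`, `VariableX`, `VariableY`) and all row values are assumptions invented for this example; consult the documentation for the actual names in each exported file.

```python
import csv
import io

# Toy stand-ins for two exported tables. Every value here is an invented
# placeholder, not data from AUTOTYP; the column names are assumptions.
TABLE_A = """LID,Language,Glottocode,VariableX
1,Language A,aaaa1234,yes
2,Language B,bbbb1234,no
"""

TABLE_B = """LID,Language,Glottocode,VariableY
2,Language B,bbbb1234,3
3,Language C,cccc1234,5
"""


def index_by_language(text, key="LID"):
    """Parse CSV text and index its rows by the language ID column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def link_tables(left, right):
    """Inner-join two indexed tables, keeping languages present in both."""
    joined = []
    for lid, left_row in left.items():
        right_row = right.get(lid)
        if right_row is None:
            continue  # language is not covered by the second table
        merged = dict(left_row)
        # Add only the columns the first table does not already have.
        merged.update({k: v for k, v in right_row.items() if k not in merged})
        joined.append(merged)
    return joined


if __name__ == "__main__":
    for row in link_tables(index_by_language(TABLE_A), index_by_language(TABLE_B)):
        print(row)
```

An inner join is used here so that only languages coded in both tables appear in the result; a different join type may be more appropriate depending on the analysis.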

For a complete list of major new features see:

   https://github.com/autotyp/autotyp-data/blob/v1.0.1/CHANGES-1.0.0.md

For general information about the database:

   https://github.com/autotyp/autotyp-data/blob/v1.0.1/readme.md

 

Files

Files (9.5 MB)

autotyp-data-v1.0.1.zip (9.5 MB)
md5:9ddf9782c812eb977799fe1a27514270
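
A downloaded copy of the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and it assumes the archive has been saved locally as autotyp-data-v1.0.1.zip.

```python
import hashlib

# Checksum as listed on this record for autotyp-data-v1.0.1.zip.
EXPECTED_MD5 = "9ddf9782c812eb977799fe1a27514270"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Check whether the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    print(matches_published_checksum("autotyp-data-v1.0.1.zip"))
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.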

Additional details

Related works