There is a newer version of the record available.

Published November 28, 2022 | Version r20221127
Dataset | Open

A Global Lexical Database (GLED) with cognate annotation and phonological alignments

Description

This work presents a lexical database encompassing most natural languages, with cognate annotation and phonological alignment, along with per-family and global phylogenetic resources. The lexical data is organized in a single, easy-to-use tabular file, and all resources are built following best practices and using state-of-the-art algorithms for historical linguistics. The database was developed as a resource for prototyping studies, developing new methods, and bootstrapping analyses, and to enable the community to engage in research in computational historical linguistics. The data is expected to be updated regularly, with additions and improvements. All resources are freely available for download to all interested researchers.

Notes

The software pipeline for generating and releasing the database is available at https://github.com/tresoldi/gled

Files

20221127.zip (197.8 MB)
md5:80a5f8f9a6eaa48542ca8e2c605fe74e
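
Since the record publishes an MD5 checksum for 20221127.zip, a download can be checked against it before use. The following is a minimal sketch, not part of the GLED pipeline: the function names `file_md5` and `archive_is_intact` are hypothetical, and the local path passed to them is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum as published on the record page for 20221127.zip.
EXPECTED_MD5 = "80a5f8f9a6eaa48542ca8e2c605fe74e"


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so the whole archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the computed digest with the expected one (case-insensitive)."""
    return file_md5(path) == expected.lower()
```

Calling `archive_is_intact("20221127.zip")` returns `True` only when the local copy matches the published checksum. A mismatch usually indicates an incomplete or corrupted download.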

Additional details

Related works