Dataset Open Access
Syrjänen, Kaj; Lehtinen, Jyri; Vesakoski, Outi; de Heer, Mervi; Suutari, Toni; Dunn, Michael; Määttä, Urho; Leino, Unni-Päivä
The UraLex basic vocabulary dataset originates from the basic vocabulary cognacy dataset collected by the research initiative BEDLAN (Biological Evolution and the Diversification of Languages), which was funded by the Kone Foundation from 2009 to 2013. The data has since been revised and expanded in follow-up research projects, including SumuraSyyni (2014-2016), UraLex (2014-2016) and AikaSyyni (2017-2020). The dataset has been compiled especially for quantitative language classification and historical linguistics, for example Bayesian inference of phylogeny.