Published June 5, 2014 | Version v1
Dataset Open

Benchmark Database for Phonetic Alignments

  • 1. Philipps-Universität Marburg

Description

In the last two decades, alignment analyses have become an important technique in quantitative historical linguistics and dialectology. Phonetic alignment plays a crucial role in identifying regular sound correspondences and deeper genealogical relations between and within languages and language families. Surprisingly, there are to date no easily accessible benchmark data sets for phonetic alignment analyses. Here we present a publicly available database of manually edited phonetic alignments that can serve as a platform for testing and improving the performance of automatic alignment algorithms. The database consists of a wide variety of alignments drawn from a large number of different sources. The data is arranged in such a way that typical problems encountered in phonetic alignment analyses (metathesis, diversity of phonetic sequences) are represented and can be tested directly.
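
As an illustration of how a manually edited alignment could be used to test an automatic one, the following is a minimal sketch of a column score: the share of columns in the reference alignment that the test alignment reproduces exactly. The description above does not prescribe an evaluation measure, so the choice of metric, the function names, the gap symbol "-", and the toy rows are all assumptions made for this example and are not taken from the database.

```python
def indexed_columns(alignment, gap="-"):
    """Convert an alignment (list of equal-length rows) into a list of columns.

    Each column is a tuple holding, per row, the position of the segment in
    the unaligned sequence, or None where the row has a gap.
    """
    counters = [0] * len(alignment)
    result = []
    for column in zip(*alignment):
        indexed = []
        for row, segment in enumerate(column):
            if segment == gap:
                indexed.append(None)
            else:
                indexed.append(counters[row])
                counters[row] += 1
        result.append(tuple(indexed))
    return result


def column_score(reference, test, gap="-"):
    """Fraction of reference columns that also occur in the test alignment."""
    ref_cols = indexed_columns(reference, gap)
    test_cols = set(indexed_columns(test, gap))
    if not ref_cols:
        return 0.0
    return sum(col in test_cols for col in ref_cols) / len(ref_cols)


# Toy data for illustration only (hypothetical, not from the database).
reference = [
    ["t", "o", "x", "t", "e", "r"],
    ["d", "o", "-", "t", "e", "r"],
]
test = [
    ["t", "o", "x", "t", "e", "r"],
    ["d", "o", "t", "-", "e", "r"],
]

score = column_score(reference, test)
print(round(score, 3))
```

Working with segment positions rather than the symbols themselves keeps the comparison unambiguous when the same sound occurs more than once in a word.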

Files

andean.zip

Files (73.4 MB)

Checksum Size
md5:771dfd8fd5b4fb8a0b55f130958b1119 101.1 kB
md5:c0cb444b50a828417c160f241f0a6cde 169.7 kB
md5:1c2dbd111a18c4d91ad7d9f928b4e3eb 777.7 kB
md5:8cf415045a60f91aec70b4ab4c406a2b 32.1 MB
md5:03121f2760f51e2133043f72af397aeb 937.9 kB
md5:9e88909bfe332f11be0168b01a2fe195 1.1 MB
md5:357fa06c4604cabca7591963c407a51d 1.1 MB
md5:a372ef2dcb46d695943c05efa659efe0 24.3 kB
md5:bb5eb87ad1b8fe085c94c8938c5959f8 36.3 MB
md5:682864e0633717ab2ee7350d57de6daa 470.0 kB
md5:c3ad74de986d40a2c0e247528008efe1 83.5 kB
md5:efc06dad93953b616e6d8d0976f49744 114.6 kB
md5:04f3e7beb5a72f5448d10b6a8e4b67c4 28.9 kB
md5:5851b21f7f0d47f5c4284ba2d7002d34 53.0 kB
md5:ae4e33fb0e151b2affb1df266eb7e051 14.6 kB
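
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listing` are hypothetical, and which local file corresponds to which checksum has to be taken from the record itself, since the listing here does not pair file names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call (the local path is an assumption about where the file was saved):
# matches_listing("downloads/some_archive.zip", "md5:771dfd8fd5b4fb8a0b55f130958b1119")
```

Reading in chunks keeps memory use low for the larger archives in the listing.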

Additional details

Funding

QUANTHISTLING – Quantitative modeling of historical-comparative linguistics: Unraveling the phylogeny of native South American languages (grant no. 240816)
European Commission