Dataset Open Access

Cross Lingual Word Embeddings for Turkic Languages

Elmurod Kuriyozov

Pre-trained word embeddings for five Turkic languages using skip-gram model of FastText trained on large corpora of Turkic languages.  

Turkish, Uzbek, Azeri, Kazakh and Kyrgyz languages are covered.

 

For full info, please check this repo of GitHub: https://github.com/elmurod1202/crosLingWordEmbTurk 

Files (7.7 GB)
Name Size
az.sg.300.vec.tar.gz
md5:2b6e16c53575c2356ac6d732171c3b70
545.3 MB Download
kk.sg.300.vec.tar.gz
md5:a22a4317b9ee4f7fa69445cf3bddd93e
792.3 MB Download
ky.sg.300.vec.tar.gz
md5:255eaafea23a21a0f1a30bf334293d63
225.1 MB Download
tr.sg.300.vec.tar.gz
md5:220a7a397f8908d17e67d6a9c4c0035b
6.0 GB Download
uz.sg.300.vec.tar.gz
md5:9b423dbab2bb22e0a6b58f6510f45cde
196.2 MB Download
23
17
views
downloads
All versions This version
Views 2323
Downloads 1717
Data volume 15.3 GB15.3 GB
Unique views 1919
Unique downloads 99

Share

Cite as