Cross Lingual Word Embeddings for Turkic Languages

Published February 13, 2020 | Version v1

Dataset Open

Pre-trained word embeddings for five Turkic languages, trained with the skip-gram model of FastText on large Turkic-language corpora.

The languages covered are Turkish, Uzbek, Azeri, Kazakh, and Kyrgyz.
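
As a minimal sketch of how such embeddings might be read once an archive is unpacked: the code below assumes each archive contains a text file in the usual FastText `.vec` layout (a header line with the word count and dimension, then one word and its vector per line). That layout, the function names `load_vec` and `cosine`, and the `limit` parameter are assumptions made for illustration and are not stated on this page.

```python
import math


def load_vec(path, limit=None):
    """Read a FastText-style .vec text file into a dict of word -> list of floats.

    Assumes the first line is "<word_count> <dimension>" and every other line is
    "<word> <v1> ... <vN>". `limit` (hypothetical helper parameter) caps how many
    rows are read, which is useful for the larger files.
    """
    vectors = {}
    with open(path, encoding="utf-8", errors="replace") as fh:
        header = fh.readline().split()
        dim = int(header[1])
        for i, line in enumerate(fh):
            if limit is not None and i >= limit:
                break
            # Split from the right so the last `dim` fields are the vector.
            parts = line.rstrip().rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip malformed rows
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Reading only the first rows with `limit` keeps memory use modest; for full-vocabulary work a NumPy-backed loader would likely be a better fit.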

Files

Name	MD5	Size
az.sg.300.vec.tar.gz	2b6e16c53575c2356ac6d732171c3b70	545.3 MB
kk.sg.300.vec.tar.gz	a22a4317b9ee4f7fa69445cf3bddd93e	792.3 MB
ky.sg.300.vec.tar.gz	255eaafea23a21a0f1a30bf334293d63	225.1 MB
tr.sg.300.vec.tar.gz	220a7a397f8908d17e67d6a9c4c0035b	6.0 GB
uz.sg.300.vec.tar.gz	9b423dbab2bb22e0a6b58f6510f45cde	196.2 MB

Resource type: Dataset
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.