Published February 13, 2020 | Version v1
Dataset Open

Cross Lingual Word Embeddings for Turkic Languages

  • 1. Department of Computational Linguistics, University of A Coruña

Description

Pre-trained word embeddings for five Turkic languages, trained with the FastText skip-gram model on large corpora of those languages.

The languages covered are Turkish, Uzbek, Azeri, Kazakh, and Kyrgyz.

 

For full details, see the GitHub repository: https://github.com/elmurod1202/crosLingWordEmbTurk
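As a rough illustration of how embeddings like these might be used, the sketch below parses vectors in the plain-text word2vec/FastText layout (a header line with the word count and dimension, then one word and its values per line) and compares words by cosine similarity. This record does not state the file format, so the text layout is an assumption; the helper names `load_vec` and `cosine`, and the three toy vectors, are invented for the example and are not taken from the dataset.

```python
import io
import math


def load_vec(stream, limit=None):
    """Parse the text vector format: header 'count dim', then 'word v1 ... vd'.

    ASSUMPTION: the files use this text layout; the record does not say so.
    """
    header = stream.readline().split()
    dim = int(header[1])
    vectors = {}
    for i, line in enumerate(stream):
        if limit is not None and i >= limit:
            break
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed or empty lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data with made-up values, standing in for a real downloaded file.
sample = io.StringIO("3 2\nkitob 1.0 0.0\nkitap 0.9 0.1\nsuv 0.0 1.0\n")
dim, vecs = load_vec(sample)
similar = cosine(vecs["kitob"], vecs["kitap"])
different = cosine(vecs["kitob"], vecs["suv"])
```

For a real file, the `io.StringIO` object would be replaced by `open(path, encoding="utf-8")`, and `limit` can cap how many words are read so that a multi-gigabyte file does not have to fit in memory.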

Files

Files (7.7 GB)

MD5 checksum                        Size
2b6e16c53575c2356ac6d732171c3b70    545.3 MB
a22a4317b9ee4f7fa69445cf3bddd93e    792.3 MB
255eaafea23a21a0f1a30bf334293d63    225.1 MB
220a7a397f8908d17e67d6a9c4c0035b    6.0 GB
9b423dbab2bb22e0a6b58f6510f45cde    196.2 MB
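Because the largest file is several gigabytes, it may be worth checking each download against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `matches` are illustrative, and the path passed in is whatever local name the downloaded file was saved under (file names are not given in this listing).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected):
    """Compare a file's MD5 with an expected value, with or without 'md5:' prefix."""
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

Reading in chunks keeps memory use constant regardless of file size. A call such as `matches(local_path, "md5:2b6e16c53575c2356ac6d732171c3b70")` would return `True` only if the file at `local_path` is a bit-for-bit copy of the corresponding upload.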