Published October 20, 2025 | Version v1.0.0
Dataset Open

Esperanto word2vec embeddings trained on OpenSubtitles

  • 1. Harrisburg University of Science and Technology

Description

This dataset contains the subs2vec embeddings for Esperanto, as presented in https://zenodo.org/records/17243814. The embeddings were trained on the OpenSubtitles 2018 subtitle corpus (https://opus.nlpl.eu/OpenSubtitles/corpus/version/OpenSubtitles) and represent semantic vector spaces derived from naturalistic language use in films and television.

For this language, we provide all embedding variants explored in the study. Specifically, the dataset includes vectors generated under different combinations of:

  • Dimensionality: multiple vector sizes (e.g., 100, 200, 300, …)
  • Window size: varying context windows (e.g., 2, 5, 10, …)

Each file corresponds to one unique configuration (dimension × window size). Within a file, column 1 holds the vocabulary for the language and columns 2 through d + 1 hold the embedding values, where d is the dimensionality of that configuration.
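
The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' loader: the record does not state the delimiter or whether a header row is present, so whitespace separation and the `delimiter` and `skip_header` parameters are assumptions, and the function name `load_embeddings` and the toy rows are hypothetical.

```python
import io


def load_embeddings(handle, delimiter=None, skip_header=False):
    """Read rows of `word v1 ... vd` into a word list and a list of vectors.

    Assumptions (not stated in the record): one word per row, fields
    separated by whitespace when `delimiter` is None, and an optional
    header row that can be skipped with `skip_header=True`.
    """
    words, vectors = [], []
    if skip_header:
        next(handle, None)  # discard the first row
    for line in handle:
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) < 2:
            continue  # skip blank or malformed rows
        words.append(parts[0])                          # column 1: word
        vectors.append([float(x) for x in parts[1:]])   # columns 2..d+1
    # every row should carry the same number of embedding values
    dims = {len(v) for v in vectors}
    if len(dims) > 1:
        raise ValueError(f"inconsistent dimensionality: {sorted(dims)}")
    return words, vectors


# Toy stand-in for a real file, with d = 3 (values are made up).
sample = io.StringIO("hundo 0.1 0.2 0.3\nkato 0.4 0.5 0.6\n")
words, vectors = load_embeddings(sample)
dimension = len(vectors[0])
```

For a downloaded file, the same function can be called on `open(path, encoding="utf-8")`; if the first row turns out not to be a word vector, pass `skip_header=True`.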

If you use this dataset, please cite:

Files (19.7 GB)

Checksum                              Size
md5:f7c4ef0b03d103594e780eebf6e9f1d2  143.5 MB
md5:838c23295e88991b6883a2ae6ea94f42  143.4 MB
md5:b209b8c724055df7f9f86b0cd3619a0c  143.4 MB
md5:3f7073f84160b7f476bb23870b87873e  143.4 MB
md5:bc2a8899be6efd98307703b01e1d89cd  143.3 MB
md5:27b4bd1b8257e0e545c1018604324efb  143.5 MB
md5:14766a003889cca600a79b7a7808b673  143.3 MB
md5:40ee4a2f66d13fc26ad8273adeac9678  143.6 MB
md5:f43e797a49a2f44e57c079a3fa353e75  143.4 MB
md5:a6de6460c152b8543a8b4c7f9691758a  143.5 MB
md5:5a32b7be18f53ed86f71f0297a5eb04a  143.3 MB
md5:931138e426a9e64c66d8ad83b899f422  143.6 MB
md5:757bae3fdfce644836008cee8170c474  285.5 MB
md5:df4ce3b41bfb1751cec588730327b736  285.9 MB
md5:63ba5134544cc90626c5a1b8330322da  285.3 MB
md5:51d4a6b79739e759b442d5bd98d43596  285.9 MB
md5:65e709bf7add19eec2fe699608c10fd7  285.2 MB
md5:a6562a63a6c7192706786e12eab7d513  286.0 MB
md5:6e317ccb113ad2e905967288cd928021  285.1 MB
md5:f0444049a11128bcdacb5dc6989d198a  286.0 MB
md5:38cf39d04a18564789a86a3949d15942  285.0 MB
md5:e6b83c29f941293d975d9ee6ca2f642b  286.0 MB
md5:7461e27b197d644a56def10917f4ca26  285.0 MB
md5:46d8fe53dd570d9bd12683275e0189af  286.0 MB
md5:f148cb437028d8be0f6a525781333e60  427.7 MB
md5:688c782dde68b18eb096140bd8fc9b00  428.7 MB
md5:17d7b4a72299fc2d25926342606b16da  427.4 MB
md5:fb847d3a36d27c4554b47fd600b7e9c6  428.7 MB
md5:a38f6e85b4580aeb1a1c6dda49cc3de5  427.1 MB
md5:3c860289de30c2e47ff4e807dafb0c61  428.7 MB
md5:dc04cf59c25f7e33f587bd25f97d4c6d  427.1 MB
md5:1d76717ece71ca99bf83daf0d0887bfe  428.9 MB
md5:f52676096ce8c7b7284aac65f3e5a53d  426.9 MB
md5:4e2ec02b564a3aaa4a0e8e9fcdbbd793  428.9 MB
md5:04285985b040a9a56d3db196fb18b04f  426.9 MB
md5:ea13b6889b8442a09127d4b4ff9c1c12  428.8 MB
md5:e643e910e68c21a1d8d25fecc7fb3b2c  713.6 MB
md5:d9ad55d0a82e150b235266f7836dda2b  715.2 MB
md5:d7f721148fd0df5649ed403bcc1a75c0  712.3 MB
md5:fa65105a073b2fd01fb35cd47813638c  715.8 MB
md5:8a9943cdaaa99d2308cbae954690c41c  711.7 MB
md5:91f597340cede69c907322731f7f5bc8  715.3 MB
md5:3c5648b1e1038163cce20b8612d8d250  711.4 MB
md5:87265754488a707661c9617ece7001a2  715.2 MB
md5:fc4658b1a3d556312e9af5c0aa6fdce8  711.1 MB
md5:e7076b19f1871bcdecfffa17b0d9c4f3  715.4 MB
md5:61b6e10db853283681916da35414abc8  711.0 MB
md5:fc3aa1d9382ba1293ee8fb81a27feaf6  715.1 MB
md5:ab00b892cb6ee521dec3d35b0768065b  72.6 MB
md5:9e8047f7a91a484d1230f07f40ff90e4  72.7 MB
md5:7c4c7a481907d21d6be752886f29ceb8  72.6 MB
md5:1f7b1df9a2270ebf19792e069d44e03e  72.6 MB
md5:f381420f447cfb93a50565c05ca12a2e  72.7 MB
md5:a8c0911099cfac8c80fb6abcfa9960ea  72.7 MB
md5:dd0952ae5e848328e387d260f42405b7  72.6 MB
md5:8af3b3ee467e39c1d5f3b1c49e55d808  72.6 MB
md5:41fe39555b67dd1043141f07d37be2df  72.5 MB
md5:54ca3a4d7eadcb767d9cce2bbb12470c  72.6 MB
md5:cba5a12ab4030ce00d01afbdadd5eb33  72.5 MB
md5:901ff0577b880ab70ccb423387cacd7f  72.7 MB
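
The MD5 checksums listed above can be used to confirm that a download arrived intact. The sketch below shows one way to do that with Python's standard `hashlib` module; the helper names `md5_of_stream`, `md5_of_file`, and `matches_listing` are hypothetical, and the in-memory bytes stand in for a real embedding file.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def matches_listing(actual_hex, listed):
    """Compare a computed digest with a listed value such as 'md5:ab12...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return actual_hex.lower() == expected


# Stand-in data; a real check would call md5_of_file on a downloaded file.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
demo_ok = matches_listing(demo_digest, "md5:5d41402abc4b2a76b9719d911017c592")
```

Reading in chunks keeps memory use flat, which matters for the larger files in this record (up to roughly 715 MB each).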

Additional details

Related works

Is supplement to
Standard: 10.5281/zenodo.17243812 (DOI)