Published April 26, 2023 | Version v1
Dataset Open

Sentiment analysis data and word embeddings for Erzya, Komi-Zyrian, Moksha and Udmurt

  • 1. Rootroo Ltd
  • 2. University of Helsinki

Description

The aligned, sentiment-annotated data is in setiment_eval_data.json. vectors.zip contains the word embeddings in a textual Gensim format, code.zip contains the code, and models.zip contains the sentiment analysis model.
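As a rough illustration of working with text-format embeddings, the sketch below parses a word2vec-style text layout using only the standard library. It assumes that the "textual Gensim format" means a header line with the vocabulary size and dimensionality, followed by one `word v1 ... vN` row per line. That layout, the helper names `load_text_vectors` and `cosine`, and the toy tokens `w1` to `w3` are assumptions for illustration and are not taken from the dataset.

```python
import io
import math


def load_text_vectors(handle):
    """Parse word2vec-style text: a 'count dim' header, then 'word v1 ... vN' rows.

    Assumption: the files in vectors.zip follow this layout; check one by hand first.
    """
    header = handle.readline().split()
    _count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed rows
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy stand-in for a real vector file; the tokens are hypothetical placeholders.
sample = io.StringIO("3 2\nw1 1.0 0.0\nw2 0.0 1.0\nw3 2.0 0.0\n")
toy = load_text_vectors(sample)
same_direction = cosine(toy["w1"], toy["w3"])
orthogonal = cosine(toy["w1"], toy["w2"])
```

For a real file, pass an open handle such as `open(path, encoding="utf-8")` instead of the `StringIO` sample. If Gensim is installed, `KeyedVectors.load_word2vec_format(path, binary=False)` reads the same layout and is likely the more convenient route.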

Please cite the following paper:

Alnajjar, K., Hämäläinen, M., & Rueter, J. (2023). Sentiment Analysis Using Aligned Word Embeddings for Uralic Languages. In Proceedings of the Second Workshop on Resources and Representations for Under-resourced Languages and Domains (RESOURCEFUL-2023).

Files

Files (97.9 MB)

Checksum                              Size
md5:8d3dc9a6d850bc2baae4681305ea5b19  9.9 kB
md5:54bc63cc5a09c35af79423484e575c8a  85.0 MB
md5:4971c0b322ee8c3e2dcc7106b89976ef  37.7 kB
md5:7df254d1edd17044b3b91374e4b83d90  12.9 MB