Dataset and Code for: Tuning of Language Models in Eastern European Languages on Twitter/X
Description
This dataset contains the data and experimental code used in the study:
Filip, T., Pavlíček, M., & Sosík, P. (2025).
"Tuning of language models in Eastern European languages on Twitter/X."
In Proceedings of the Workshop on Artificial Intelligence and Language Technologies (Vol. 4092). CEUR-WS.
The dataset includes:
– text data in several Eastern European languages of the Visegrád Four (V4), collected from Twitter/X,
– preprocessing and cleaning scripts,
– experimental code used for training and evaluation,
– evaluation outputs (metrics, tables, plots),
– all resources required to reproduce the results presented in the article.
This dataset serves as supplementary material to the publication referenced above.
Funding acknowledgment:
This work has been produced with the financial support of the European Union under the
"Biography of Fake News with a Touch of AI: Dangerous Phenomenon through the Prism of Modern Human Sciences"
project no. CZ.02.01.01/00/23_025/0008724 via the Operational Programme Jan Ámos Komenský (OP JAK).
Files
| Name | Size | Checksum |
|---|---|---|
| zrec-paper-a-study-on-eastern-european-v4-languages-main.zip | 2.2 MB | md5:ad655868060c9a4b448fbef7cb545f19 |
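The MD5 checksum listed for the archive can be used to confirm that a download is complete and unmodified. The following is a minimal sketch in Python using only the standard library; it assumes the archive was saved to the current working directory under its original file name, and the helper names `md5_of_file` and `verify` are illustrative and not part of the published code.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing of this record.
EXPECTED_MD5 = "ad655868060c9a4b448fbef7cb545f19"
# Assumption: the archive sits in the current working directory.
ARCHIVE = Path("zrec-paper-a-study-on-eastern-european-v4-languages-main.zip")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path=ARCHIVE, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if not ARCHIVE.exists():
        print(f"Archive not found: {ARCHIVE}")
    elif verify():
        print("Checksum OK")
    else:
        print("Checksum mismatch; consider downloading the archive again")
```

MD5 is suitable here only as an integrity check against transfer errors; it is not a safeguard against deliberate tampering.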
Additional details
Related works
- Is referenced by
- Publication: https://ceur-ws.org/Vol-4092/paper19.pdf (URL)
- Is source of
- Software: https://github.com/zrecorg/zrec-paper-a-study-on-eastern-european-v4-languages (URL)
- Is supplement to
- Dataset: 10.5281/zenodo.17723755 (DOI)
Funding
- Ministry of Education, Youth and Sports
- Biography of Fake News with a Touch of AI: Dangerous Phenomenon through the Prism of Modern Human Sciences CZ.02.01.01/00/23_025/0008724
- Ministry of Education, Youth and Sports
- REFRESH – Research Excellence For REgion Sustainability and High-tech Industries CZ.10.03.01/00/22_003/0000048
- Silesian University in Opava
- SGS/9/2024
Dates
- Issued
- 2025-12-09
References
- Filip, T., Pavlíček, M., & Sosík, P. (2025). Tuning of language models in Eastern European languages on Twitter/X. In Proceedings of the Workshop on Artificial Intelligence and Language Technologies (Vol. 4092). CEUR-WS. https://doi.org/10.5281/zenodo.17723755