Published October 4, 2021
| Version v1
Conference paper
Open
SloBERTa: Slovene monolingual large pretrained masked language model
Description
Large pretrained language models, based on the transformer architecture, show excellent results in solving many natural language processing tasks. The research is mostly focused on the English language; however, many monolingual models for other languages have recently been trained. We trained the first such monolingual model for Slovene, based on the RoBERTa model. We evaluated the newly trained SloBERTa model on several classification tasks. The results show an improvement over existing multilingual and monolingual models and represent the current state-of-the-art for Slovene.
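Since SloBERTa is a RoBERTa-style masked language model, it can be queried for masked-token predictions. Below is a minimal sketch using the Hugging Face `transformers` library; the checkpoint identifier `EMBEDDIA/sloberta` is an assumption about where the published model is hosted, so substitute the actual path if it differs.

```python
# Minimal sketch: masked-token prediction with SloBERTa via the
# Hugging Face `transformers` fill-mask pipeline.
# NOTE: the model identifier "EMBEDDIA/sloberta" is an assumption,
# not confirmed by the record above; adjust it to the real checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="EMBEDDIA/sloberta")

# RoBERTa-style models use the <mask> token for the blank position.
# Slovene example: "Ljubljana je glavno <mask> Slovenije."
# ("Ljubljana is the capital <mask> of Slovenia.")
for prediction in fill_mask("Ljubljana je glavno <mask> Slovenije."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

For the classification tasks mentioned in the abstract, the same checkpoint would typically be fine-tuned with a task-specific head (e.g. `AutoModelForSequenceClassification`) rather than used through the fill-mask pipeline.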
Files
Name | Size | Checksum
---|---|---
Ulcar+Robnik.pdf | 396.9 kB | md5:d63f3fb1a5c4c82b1a8f3574de07cbb7