RoBERTaSense-FACIL: A Technical Report and Model Selection Study for Meaning Preservation in Easy-to-Read Spanish Texts
Authors/Creators
Description
RoBERTaSense-FACIL is a Spanish Transformer-based model fine-tuned to evaluate meaning preservation in Easy-to-Read (E2R) text adaptations. The model builds on RoBERTa-base-bne and is fine-tuned on a balanced dataset of expert-validated E2R adaptations together with automatically generated hard negatives that introduce structural, semantic, and cross-textual distortions.
This technical report describes the full methodology: dataset construction, the hard-negative generation framework, the fine-tuning process, and a comparative evaluation of three models: MeaningBERT, RoBERTa-base-bne, and a BERTScore-based regression variant. Results show that the fine-tuned RoBERTa-base-bne, referred to as RoBERTaSense-FACIL, achieves the most robust and reliable performance for binary meaning-preservation classification on Spanish E2R texts.
Data availability:
The datasets and intermediate scripts used in this work cannot be made publicly available due to privacy and copyright restrictions; however, access may be granted upon reasonable request for academic research purposes.
Model availability:
The RoBERTaSense-FACIL model is publicly available on Hugging Face:
https://huggingface.co/oeg/RoBERTaSense-FACIL