Published December 11, 2025 | Version 1.0
Dataset Open

Los101

  • 1. Universidad Complutense de Madrid
  • 2. National University of Distance Education
  • 3. Universidad Pública de Navarra
  • 4. Universidad Autónoma de Madrid

Description

Subcorpus selected from the PastReader 2025 corpus and manually revised on the basis of a qualitative error analysis, as described in the arXiv publication "Transcribing Spanish Texts from the Past: Experiments with Transkribus, Tesseract and Granite".

Files

Los101-Gresel.zip

Files (1.3 GB)

Name Size
Los101-Gresel.zip (md5:e1f00e664d6dbbc0467d87e38d14f70b) 1.3 GB
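Because the archive is large, it may be worth checking a downloaded copy against the MD5 checksum listed above. The following is a minimal Python sketch, assuming the file was saved locally as Los101-Gresel.zip in the current directory; the function names md5_of_file and matches_published_checksum are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "e1f00e664d6dbbc0467d87e38d14f70b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("Los101-Gresel.zip")
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif matches_published_checksum(archive):
        print("checksum OK")
    else:
        print("checksum MISMATCH")
```

Reading in chunks keeps memory use flat for a 1.3 GB file. A mismatch usually points to an incomplete or corrupted download rather than a problem with the dataset itself.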

Additional details

Additional titles

Subtitle (Spanish)
Subcorpus de PastReader modificado manualmente (English: PastReader subcorpus, manually modified)

Related works

Is supplemented by
Publication: arXiv:2507.04878 (arXiv)

Funding

Ministerio de Ciencia, Innovación y Universidades
PID2023-151280OB

Dates

Submitted
2025-12-11