PRIMA HTR

Crespi, Serena Carlamaria

doi:10.5281/zenodo.18220238

Published January 12, 2026 | Version 1.0

Resource type: Model | Access: Open

Crespi, Serena Carlamaria (Producer)¹ ²

1. Université de Tours
2. Università degli Studi di Firenze

Contributors

Annotator (2):

Project leader:

Pierazzo, Elena¹

1. Université de Tours
2. Università degli Studi di Padova
3. Centre d'Études Supérieures de la Renaissance
4. University of Florence

PRIMA HTR Model — Italian Early Modern Manuscripts (late 16th–18th c.)
Description
The PRIMA HTR model was developed within the framework of the ERC project PRIMA — Manuscripts in the Age of Print, hosted at the Centre d’Études Supérieures de la Renaissance (CESR – UMR 7323), Université de Tours, with the support of the LIFAT computer science laboratory (Université de Tours), which provided the high-performance computing infrastructure used for model training.
The model is designed to support large-scale transcription and analysis of Italian manuscript heritage from the late sixteenth to the eighteenth century, with a particular focus on literary, satirical and poetic texts.

We invite scholars and institutions working on early modern Italian manuscripts to use this model and, whenever possible, to publish the resulting transcriptions in open repositories, in order to contribute to the continuous improvement of both the model and the associated training datasets.

Training data and methodology
The model was obtained by fine-tuning a base model, itself trained on a wide range of Latin-script handwritten documents, on a heterogeneous corpus of Italian handwritten sources from the late sixteenth to the eighteenth century. This corpus includes poetic, satirical, narrative and documentary texts. To increase the diversity of writing styles, it also incorporates a selection of early modern printed calligraphy manuals, which introduce additional variation in letterforms and writing practices.

Performance was further improved by adding synthetic training data generated from the manuscript material under study, in order to increase robustness to scribal variation and layout heterogeneity.
The complete data augmentation workflow is documented in the project GitLab repository: https://scm.univ-tours.fr/cesr/prima/data_augmentation

Source collections
A representative portion of the training corpus is derived from digitized manuscripts preserved in the following institutions (non-exhaustive list):

- Biblioteca Nazionale Centrale di Firenze
- Biblioteca Marucelliana, Firenze
- Biblioteca dell’Archiginnasio, Bologna
- Fondo Joppi, Udine
- Biblioteca Bertoliana, Vicenza
- Biblioteca Angelica, Roma
- Biblioteca Nazionale Centrale di Roma
- Bibliothèque universitaire Droit-Lettres, Université Grenoble Alpes

The authors gratefully acknowledge these institutions for making their collections available for scholarly research.

Transcription and normalization
Transcription and normalization strictly follow the PRIMA transcription guidelines, available in the project GitLab repository: https://scm.univ-tours.fr/cesr/prima/htr

Funding
The PRIMA project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, Grant agreement No. 101142242.
Funded by the European Union. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Files

Files (16.2 MB)

Name	MD5	Size
PRIMA HTR.mlmodel	628f3cce7dbe9542f8d87408e3f39277	16.2 MB
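
The MD5 checksum listed for the model file can be used to confirm that a download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library. The function names and the local path `PRIMA HTR.mlmodel` are illustrative assumptions, not part of the record; only the checksum value is taken from the file listing.

```python
import hashlib
from pathlib import Path

# MD5 checksum published for the model file in this record.
EXPECTED_MD5 = "628f3cce7dbe9542f8d87408e3f39277"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path: adjust to wherever the file was saved.
    model_path = Path("PRIMA HTR.mlmodel")
    if model_path.exists():
        ok = matches_published_checksum(model_path)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{model_path} not found")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.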

Additional details

Created: 2026-01-12

Repository URL: https://scm.univ-tours.fr/cesr/prima/htr
