Published November 25, 2025 | Version v2
Dataset Open

Dataset of Manuscripts Signed and Attributed to Raoulet d'Orléans and Henri de Trévou

  • 1. Institut de Recherche et d'Histoire des Textes
  • 2. Laboratoire d'Informatique Gaspard-Monge

Description

This repository hosts the revised version of the dataset used for the paper “Reconciling traditional and computational methods for the analysis of scribal hands; the case of Raoulet d’Orléans and Henri de Trévou (XIVe c.)”, presented at the 23rd Colloquium of Latin Palaeography (CIPL), 17-19 September 2025, Vienna. The paper is expected to be published by Brepols in 2026 as part of the colloquium proceedings.

Revisions for v2: Eleven folios that were damaged or affected by digitization artefacts (distortions on verso folios, lighting effects) have been replaced with new images. No new manuscripts were added in this version. Further details are provided in the file RaouletOrleans_HenriTrevou_dataset_metadata_v2.csv.

Contents

The archive RaouletOrleans_HenriTrevou_dataset_v2.zip contains the following structure:

├── images/
├── annotation.json
└── RaouletOrleans_HenriTrevou_dataset_metadata_v2.csv

 

images/

Contains one subfolder per manuscript ID, each holding line images (.png) extracted with eScriptorium from selected folios. The lines are extracted as polygons, include alpha transparency, and are deslanted.

Image naming convention: DocID_f<number>

Image rights: 

  • Reproductions from open-access and public collections:

    • Paris, Bibliothèque nationale de France — Gallica

    • Vienna, Austrian National Library

    • Oxford, Saint John’s College Library: By permission of the President and Fellows of St John’s College Oxford.

    • Paris, Bibliothèque Sainte-Geneviève

    • Cambridge, Massachusetts Library

    • IRHT library — microfilm reproduction of London, BL - ADD 15420 (Thanks to Gilles Kagan for providing the file)

     

  • Purchased or licensed reproductions:

    • KB (Det Kongelige Bibliotek, Denmark): KB Thott 6, f.1 and 228v; KB Thott 6, f.229r and 472v (purchased high-resolution images)

    • KBR (Royal Library of Belgium): KBR – Cabinet des Estampes et des Dessins – S.V10319 f.7r, 90r, 175r; KBR – Cabinet des Estampes et des Dessins – S.V9507 f.4r and 125r; KBR – Cabinet des Estampes et des Dessins – S.V9505-6 f.1v and 222r; KBR – Cabinet des Estampes et des Dessins – S.V11201 (purchased high-resolution images)

     

  • Special thanks:

    • Koninklijke Bibliotheek, Netherlands: thanks to Ed Van der Vlist for providing high-quality reproductions of KW 78 D 41 at no cost.

     

annotations.json

A JSON file containing line-level transcriptions and metadata. The dataset is CATMuS-compliant and uses a graphemic transcription approach. 

Structure example:

"<image_id>": {
  "split": "train",
  "label": "A beautiful calico cat.",  // Transcription text of the line
  "script": "RaouletOrleans",         // Scribal hand identifier
  "folio": "1r",
  "doc": "RO1"
}
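
As a minimal sketch of how this structure might be consumed, the Python snippet below counts annotated lines per split and scribal hand. It assumes the file is a single JSON object keyed by image ID, as in the example above, and that the real file is plain JSON without the `//` comments shown for illustration. The helper names `load_annotations` and `count_lines`, the sample image IDs, and the `HenriTrevou` / `HT1` values are hypothetical and are not taken from the dataset.

```python
import json
from collections import Counter
from pathlib import Path


def load_annotations(path):
    """Read the annotation file into a dict keyed by image ID (assumed layout)."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def count_lines(annotations):
    """Count annotated lines per (split, script) pair."""
    return Counter(
        (entry["split"], entry["script"]) for entry in annotations.values()
    )


# Tiny in-memory sample mirroring the structure example.
# All IDs and values here are invented for illustration only.
sample = {
    "line_a": {"split": "train", "label": "...", "script": "RaouletOrleans",
               "folio": "1r", "doc": "RO1"},
    "line_b": {"split": "train", "label": "...", "script": "RaouletOrleans",
               "folio": "1v", "doc": "RO1"},
    "line_c": {"split": "val", "label": "...", "script": "HenriTrevou",
               "folio": "2r", "doc": "HT1"},
}

for (split, script), total in sorted(count_lines(sample).items()):
    print(split, script, total)
```

With the extracted archive at hand, `count_lines(load_annotations(path_to_json))` would give the same kind of summary for the full dataset, provided the assumptions above hold.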

 

RaouletOrleans_HenriTrevou_dataset_metadata_v2.csv

A CSV file accompanies the dataset with folio-level metadata, for example:

 

Shelfmark: Paris, BnF NAF, 27401, f.159-194v et 255-266v
Colophon: No
DocID: RO1_1
FileID: btv1b10532600x_f321
Split: val
Disposition: columns
TextualCategory: Narratives
TextType: Historiography
Text: Pierre Bersuire, Histoire Romaine de Tite-Live
NotBefore: 1358
NotAfter: 1361
Scribe: RaouletOrleans
NbPages: 1
Folios: 159r
NbLines: 85
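
The folio-level metadata can be filtered with the standard library alone. The sketch below is illustrative: it assumes a comma-delimited, UTF-8 file whose header uses the column names listed above, and it parses the example row from an in-memory string rather than the real file. `read_metadata` and `folios_for` are hypothetical helper names.

```python
import csv
import io


def read_metadata(handle):
    """Parse folio-level metadata rows into dicts keyed by column name."""
    return list(csv.DictReader(handle))


def folios_for(rows, scribe, split=None):
    """Return (Shelfmark, Folios) pairs for one scribe, optionally for one split."""
    return [
        (row["Shelfmark"], row["Folios"])
        for row in rows
        if row["Scribe"] == scribe and (split is None or row["Split"] == split)
    ]


# The example row from this description, written as comma-separated text.
# The delimiter and quoting are assumptions about the real file.
sample_csv = io.StringIO(
    "Shelfmark,Colophon,DocID,FileID,Split,Disposition,TextualCategory,TextType,"
    "Text,NotBefore,NotAfter,Scribe,NbPages,Folios,NbLines\n"
    '"Paris, BnF NAF, 27401, f.159-194v et 255-266v",No,RO1_1,btv1b10532600x_f321,'
    "val,columns,Narratives,Historiography,"
    '"Pierre Bersuire, Histoire Romaine de Tite-Live",1358,1361,RaouletOrleans,1,159r,85\n'
)

rows = read_metadata(sample_csv)
print(folios_for(rows, "RaouletOrleans", split="val"))
```

For the real file, the same function could be given a handle opened with `open(path, newline="", encoding="utf-8")`. All values are read as strings, so numeric columns such as NbLines need an explicit `int(...)` before arithmetic.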

Files

RaouletOrleans_HenriTrevou_dataset_v2.zip (1.4 GB)
md5:de11781838979e68b79fc8ddb7c9cce8
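
After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module; `md5_of_file` and `archive_is_intact` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# Checksum listed on this page for the v2 archive.
EXPECTED_MD5 = "de11781838979e68b79fc8ddb7c9cce8"


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a large archive is never loaded whole into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

A call such as `archive_is_intact("RaouletOrleans_HenriTrevou_dataset_v2.zip")` would then return `True` for an uncorrupted download. MD5 is adequate here for detecting transfer errors, not for security purposes.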

Additional details

Funding

European Research Council
DISCOVER 101076028
Centre National de la Recherche Scientifique
CreMe