LECTAUREP Contemporary French Model (Administration)

Chagué, Alix

doi:10.5281/zenodo.6542744

Published May 12, 2022 | Version 1.0.0

Resource type: Other | Access: Open

Chagué, Alix¹

1. ALMAnaCH, Inria

Contributors

Contact person: Chagué, Alix

Data collectors (3): Limon-Bonnet, Marie-Françoise; Denis, Nathalie; Durand, Marc

Data curator: Rostaing, Aurélia

Description

The model was trained on the ground truth produced by the LECTAUREP Project (Inria & Archives Nationales) between 2019 and 2022. The training dataset contains many handwriting samples taken from French administrative documents produced between 1742 and 1928.

Training and Testing datasets

The data was collected from LECTAUREP's ground truth repositories:
- lectaurep-bronod v0.0.1
- lectaurep-mariages-et-divorces v.1.0
- lectaurep-repertoires v2.0

Twelve pages were set aside to create the test set.

The training dataset contained:
- 308 files
- 19 364 lines
- 329 270 characters

The test dataset contained:
- 12 files
- 962 lines
- 15 243 characters
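
The figures above can be turned into a few derived statistics, such as the average line length and the share of data held out for testing. The following is a minimal sketch; the `Split` class and the `held_out_share` helper are hypothetical names introduced for illustration and are not part of the LECTAUREP tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    """Counts reported for one dataset split (hypothetical helper)."""
    files: int
    lines: int
    characters: int

    @property
    def chars_per_line(self) -> float:
        # Mean number of characters per transcribed line.
        return self.characters / self.lines


# Counts copied from the lists above.
train_split = Split(files=308, lines=19_364, characters=329_270)
test_split = Split(files=12, lines=962, characters=15_243)


def held_out_share(train_count: int, test_count: int) -> float:
    """Fraction of the combined data that belongs to the test split."""
    return test_count / (train_count + test_count)


if __name__ == "__main__":
    for name, split in (("train", train_split), ("test", test_split)):
        print(f"{name}: {split.chars_per_line:.2f} characters per line")
    for unit in ("files", "lines", "characters"):
        share = held_out_share(getattr(train_split, unit), getattr(test_split, unit))
        print(f"test share of {unit}: {share:.2%}")
```

With these counts, lines average roughly 17 characters in the training set and roughly 16 in the test set, and the test set accounts for a little under 5% of all lines.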

Transcription standards

The transcriptions were created with eScriptorium. They respect what is written (abbreviations are not expanded, capitalization follows 19th-century practices). Superscripted portions of text are signaled by `^` and many signatures are transcribed as ¥.

Training

The model was trained using NFD Unicode normalization.
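
NFD (canonical decomposition) splits precomposed characters such as "é" into a base letter followed by a combining mark. The sketch below uses Python's standard `unicodedata` module to show the effect; the `to_nfd` helper is a hypothetical name, and applying it to reference text before comparing against the model's output is an assumption about typical usage, not a documented requirement of the model.

```python
import unicodedata


def to_nfd(text: str) -> str:
    """Return the canonical decomposition (NFD) of a string."""
    return unicodedata.normalize("NFD", text)


# "Chagué" written with the precomposed character U+00E9.
composed = "Chagu\u00e9"
decomposed = to_nfd(composed)

# After NFD, "é" becomes "e" followed by U+0301 (combining acute accent),
# so the decomposed string is one code point longer.
print(len(composed), len(decomposed))
print([f"U+{ord(ch):04X}" for ch in decomposed])
```

Two strings that look identical on screen can therefore differ in length and compare as unequal unless both are normalized the same way.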

Credits

The model was trained by Alix Chagué using data created by Aurélia Rostaing, Françoise Limon-Bonnet, Nathalie Denis and Marc Durand.

Additional information

- more information on the LECTAUREP Project can be found at https://lectaurep.hypotheses.org/
- more information on the model can be found at https://github.com/lectaurep/lectaurep_base_model

Files (16.1 MB)

Name	MD5	Size
lectaurep_base.mlmodel	f6c7f613931dce656a163756eb2b56de	16.1 MB
metadata.json	3e37fc80c0dfd089024c8173e9528a99	2.3 kB
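
The MD5 checksums listed above can be used to check that a download is intact. The following is a minimal sketch using only the standard library; `md5_of` and `verify` are hypothetical helper names, and the sketch assumes both files were saved under their original names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing above.
EXPECTED_MD5 = {
    "lectaurep_base.mlmodel": "f6c7f613931dce656a163756eb2b56de",
    "metadata.json": "3e37fc80c0dfd089024c8173e9528a99",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

A missing file and a corrupted file are both reported as failures here; a stricter script might distinguish the two cases.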

Metric	All versions	This version
Views	1,276	1,269
Downloads	5,859	5,819
Data volume	9.7 GB	9.6 GB
