Published June 15, 2023 | Version v1
Dataset Open

The Belfort dataset: Handwritten Text Recognition from Crowdsourced Annotations

  • 1. TEKLIA
  • 2. TEKLIA, Nantes Université
  • 3. Nantes Université

Description

This dataset contains minutes of the Belfort municipal council drawn up between 1790 and 1946. The documents include deliberations, lists of councillors, convocations, and agendas.

The dataset includes 24,105 text-line images that were automatically detected on the pages. Up to four transcriptions are available for each line image: two from humans and two from automatic models.

We would like to thank the Archives municipales de la ville de Belfort, France, for giving us access to these documents.

Files

belfort.zip

Files (1.4 GB)

Checksum Size
md5:4066324e9e065320bf8452f36ca76b59 1.4 GB
md5:84cfa28195231fade466aa076665d4f0 2.5 kB
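
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python, not part of the dataset itself. It assumes that the 1.4 GB file is the archive saved locally as belfort.zip; the local path and the helper names md5_of_file and verify_download are illustrative choices, not names from the record.

```python
import hashlib
from pathlib import Path

# Checksum published on the record page for the 1.4 GB file.
# Assumption: that file is the archive saved locally as belfort.zip.
EXPECTED_MD5 = "4066324e9e065320bf8452f36ca76b59"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for a multi-gigabyte archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    archive = Path("belfort.zip")  # hypothetical local path
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually means the transfer was interrupted or the file was modified, in which case downloading the archive again is the simplest remedy. MD5 is used here only because it is the digest the record publishes; it detects accidental corruption and is not a guarantee against deliberate tampering.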