Published January 17, 2017 | Version v1
Dataset Open

ICDAR 2015 Competition HTRtS: Handwritten Text Recognition on the tranScriptorium Dataset

  • 1. Pattern Recognition and Human Language Technologies, Universitat Politècnica de València

Description

This is the dataset used for the ICDAR 2015 Competition on Handwritten Text Recognition on the tranScriptorium Dataset (HTRtS). The handwritten images for this contest were drawn from the English “Bentham collection” used in the tranScriptorium project. The selected pages were written by several hands and vary considerably in image quality and writing style, and they contain crossed-out text. This contest is more difficult than the first edition, both for training and for testing. A portion of the training dataset and the full test dataset were provided as carefully segmented line images, along with the corresponding transcripts. Another portion of the training dataset was provided as raw images with their corresponding transcripts at region level.

ICDAR 2015 competition HTRtS: handwritten text recognition on the tranScriptorium dataset
JA Sánchez, AH Toselli, V Romero, E Vidal. In International Conference on Document Analysis and Recognition (ICDAR), pp. 1166-1170, 2015.

Files

ICDAR-HTR-Competition-2015.zip (6.8 GB)
md5:ad928dd1fa667a754c12b85f503ec524

Additional details

Funding

TRANSCRIPTORIUM – tranScriptorium 600707
European Commission