Dataset Open Access

Dataset for ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset (ICDAR2017 HTR)

Sánchez, Joan Andreu; Romero, Verónica; Toselli, Alejandro H.; Villegas, Mauricio; Vidal, Enrique

Train-A: Dataset of pages with manually revised baselines and the corresponding transcripts associated with them. This batch is small: 50 pages. Please keep in mind that only the baselines have been manually corrected. The polygons associated with each line have not been manually reviewed.
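As a rough illustration of how the baselines, line polygons and transcripts could be read, here is a minimal sketch. It assumes the Train-A annotations are PAGE XML files in which each TextLine carries a Baseline and a Coords element with a `points` attribute and a TextEquiv/Unicode transcript; the function names `parse_points` and `read_lines` are hypothetical, and the namespace wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def parse_points(points_attr):
    """Turn a PAGE 'points' string such as '10,20 30,40' into [(10, 20), (30, 40)]."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points_attr.split()]


def read_lines(page_xml):
    """Return one dict per TextLine with its baseline, polygon and transcript.

    Assumption: the XML follows the common PAGE layout (TextLine > Baseline,
    Coords, TextEquiv/Unicode). "{*}" matches any namespace version.
    """
    root = ET.fromstring(page_xml)
    lines = []
    for line in root.findall(".//{*}TextLine"):
        baseline = line.find("{*}Baseline")
        coords = line.find("{*}Coords")
        lines.append({
            "id": line.get("id"),
            # Manually revised in Train-A according to the description above.
            "baseline": parse_points(baseline.get("points", "")) if baseline is not None else [],
            # Not manually reviewed, so treat as approximate.
            "polygon": parse_points(coords.get("points", "")) if coords is not None else [],
            "text": line.findtext("{*}TextEquiv/{*}Unicode", default=""),
        })
    return lines
```

Since only the baselines were corrected by hand, a pipeline built on this sketch would probably rely on the `baseline` field and treat `polygon` as a hint.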

Train-B: Dataset of pages without any layout or text line information. The corresponding transcripts are provided at page level with line breaks. It has 10k pages, though for convenience it is divided into two batches of 5k pages each. This information is provided in PAGE format.
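The page-level transcripts could be split back into text lines along these lines. This is only a sketch: it assumes the transcript is stored in one or more Unicode elements of the PAGE XML file, which the description does not spell out, and `page_transcript_lines` is a hypothetical helper name. The namespace wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def page_transcript_lines(page_xml):
    """Return the page-level transcript as a list of non-empty text lines.

    Assumption: the transcript lives in <Unicode> elements and uses plain
    line breaks to separate text lines. "{*}" matches any PAGE namespace.
    """
    root = ET.fromstring(page_xml)
    lines = []
    for unicode_el in root.findall(".//{*}Unicode"):
        if unicode_el.text:
            lines.extend(part.strip() for part in unicode_el.text.splitlines())
    # Drop blank lines left over from indentation or trailing breaks.
    return [line for line in lines if line]
```

Because Train-B has no line geometry, the resulting list gives the reading order of the lines but not their position on the page.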

Test-A: Dataset of pages with manually revised baselines. This batch has 65 pages. The polygons associated with each line have not been manually reviewed.

Test-B1: The same pages as Test-A, but annotated only with the geometry of regions. Text line information is not provided.

Test-B2: Dataset of page images annotated with the geometry of the regions in which text lines must be detected and recognized. It has 57 pages.

Baseline.tgz: Baseline system trained using the first 40 pages of Train-A. The system is based on Laia, a deep learning toolkit for transcribing handwritten text images.

More information at:

https://scriptnet.iit.demokritos.gr/competitions/~icdar2017htr/

 

Files (4.0 GB)
Name                  Size      MD5
Baseline.tgz          22.1 MB   md5:5ef6d6d9a1be6785686559d6f8c9b67a
Test-A.tgz            70.9 MB   md5:f989a3f056d1b830564594a576b4dc75
Test-B1.tgz           70.8 MB   md5:6bea580c2fdcae850041738bc03d8c1c
Test-B2.tgz           48.0 MB   md5:0bea41d3beab30431fdb3ad01f5929ab
Train-A.tbz2          21.4 MB   md5:e46c7019f8ac639b796ecb8d872fd481
Train-B_batch1.tbz2   1.9 GB    md5:e11b9d0cb97169d64069268a23e90ef2
Train-B_batch2.tbz2   1.9 GB    md5:93ea0b7285f65c8438155e9490c691ed
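
The MD5 checksums listed above can be used to check that the archives were downloaded intact. The following is a minimal sketch using only the Python standard library; the checksums are copied from the listing, while the helper names `md5_of` and `verify` and the assumption that the archives sit in one local directory are illustrative.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "Baseline.tgz": "5ef6d6d9a1be6785686559d6f8c9b67a",
    "Test-A.tgz": "f989a3f056d1b830564594a576b4dc75",
    "Test-B1.tgz": "6bea580c2fdcae850041738bc03d8c1c",
    "Test-B2.tgz": "0bea41d3beab30431fdb3ad01f5929ab",
    "Train-A.tbz2": "e46c7019f8ac639b796ecb8d872fd481",
    "Train-B_batch1.tbz2": "e11b9d0cb97169d64069268a23e90ef2",
    "Train-B_batch2.tbz2": "93ea0b7285f65c8438155e9490c691ed",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each archive found in `directory` to True (match) or False (mismatch).

    Archives that have not been downloaded are simply left out of the result.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of(path) == expected
    return results
```

Calling `verify("downloads")` on a hypothetical `downloads` directory returns a dictionary such as `{"Train-A.tbz2": True}`; any `False` entry suggests the file should be downloaded again. Reading in chunks keeps memory use low for the two 1.9 GB Train-B batches.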