Dataset Open Access

ImageCLEF 2016 Bentham Handwritten Retrieval Dataset

Villegas, Mauricio; Puigcerver, Joan; Toselli, Alejandro H.

Dataset compiled for the ImageCLEF 2016 Handwritten Scanned Document Retrieval challenge. It is derived from a subset of pages of unpublished manuscripts written by the philosopher and reformer Jeremy Bentham, which have been digitised and transcribed under the Transcribe Bentham project [Causer 2012]. More details about the dataset and the challenge can be found in the overview paper at http://ceur-ws.org/Vol-1609/16090233.pdf, the slides of the overview presentation at http://imageclef.org/system/files/Villegas16_CLEF_Handwritten-Overview_presentation.pdf, and the evaluation web page at http://imageclef.org/2016/handwritten.

[Causer 2012] T. Causer and V. Wallace, Building a Volunteer Community: Results and Findings from Transcribe Bentham, Digital Humanities Quarterly, Vol. 6 (2012), http://www.digitalhumanities.org/dhq/vol/6/2/000125/000125.html

Files (7.9 GB)
Name                                      MD5                               Size
assessment.py                             b3e48372e9e208fb03e2bc9d7a30d447  28.4 kB
baseline.txt.zip                          7fec53cfe5d417d6e30f905069726554  295.7 kB
bboxs_train_for_query-by-example.txt.zip  1b37630a4f3b961dd0ec23620855f506  863.3 kB
groundtruth_devel.txt.zip                 522d825d8972a482b4a46c9044895762  87.1 kB
line2page_devel.txt.zip                   c0f1eb0da4c84c4bc1a6e06fe0c11665  20.2 kB
line2page_test.txt.zip                    f468a8f85ef9e9aa214cdaf95572f24c  33.4 kB
line2page_train.txt.zip                   157e38d0e76b313811a0f1108ad7cd42  18.0 kB
line_images_devel.zip                     59d41c433b2293d649aa09fa9097ed37  1.2 GB
line_images_test.zip                      aae6562bb5ce5dcef1aa636175d67f40  668.9 MB
line_images_train.zip                     e967bf10b285d599c4792c033f188a3d  1.0 GB
nbest-baseline.py                         1657f251e1140cb648cf6775e5987364  8.1 kB
nbest_devel.zip                           b558875f39453dd476845a905e60037d  131.5 MB
nbest_test.zip                            69114fdb887b6551ed73d3870cf9ec5b  64.5 MB
pages_devel_jpg_1.zip                     b8fe6a4b5d20999f8afcc3817ca4e4d0  1.0 GB
pages_devel_jpg_2.zip                     a95f6100a319c97be4a5626339c10031  1.1 GB
pages_devel_xml.zip                       93d8e1ee16502778859869615cb31804  7.8 MB
pages_test_jpg.zip                        a7068aabfec16fdc0047ed7188284243  947.6 MB
pages_test_xml.zip                        871408b735be885dbf09780fabb076b0  521.8 kB
pages_train_jpg.zip                       4671a0881b6dd913047a74e41b6e8263  1.8 GB
pages_train_xml.zip                       ea5752fad7546b9d1f94a4b8f22121e4  3.4 MB
queries_devel.txt.zip                     b8d3c0b609bc1a84975fc42fb4d84572  5.6 kB
queries_test.txt.zip                      612747560d43d38348c29b1ba3f8a322  9.9 kB
run_baseline.sh                           7283f431486ba3e800def4a63a1cd942  2.7 kB
segments_devel.txt.zip                    a192d6f80df9da3681bba6ed04ef0466  35.4 kB
segments_test.txt.zip                     2849287e4ac641225075370c977dbc02  54.7 kB
transcript_devel.txt.zip                  4bfb26944e67f7a2cad363960af71b8a  200.5 kB
transcript_train.txt.zip                  cf24baf1ce6ab7474a2eb43545142cea  170.7 kB
validation_pages.txt.zip                  06c3996764cd6ce74fe71f9212d81cde  341 Bytes
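
Each file in the list is published with an MD5 checksum, so a download can be checked for corruption before unpacking. The following is a minimal sketch of such a check, not part of the dataset's own tooling: the function names md5_of_file and verify_downloads and the EXPECTED_MD5 dictionary are illustrative assumptions, and only two entries from the list are filled in.

```python
import hashlib
from pathlib import Path

# Two entries copied from the file list above; extend with the others as needed.
# The dictionary name and layout are illustrative, not part of the dataset.
EXPECTED_MD5 = {
    "assessment.py": "b3e48372e9e208fb03e2bc9d7a30d447",
    "validation_pages.txt.zip": "06c3996764cd6ce74fe71f9212d81cde",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for the multi-GB archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, status in verify_downloads(".").items():
        print(f"{name}: {status}")
```

A file reported as "mismatch" is most likely a truncated or corrupted download and should be fetched again before unzipping.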
