Dataset Open Access

CVL Ruling Database

Diem, Markus; Kleber, Florian; Sablatnig, Robert

The CVL ruling dataset is a synthetic dataset for comparing ruling removal methods. It is based on the ICDAR 2013 Handwriting Segmentation database [1]: four different ruling images were synthetically added to its images, resulting in a total of 600 test images. The pixel values are:

  • 255 background
  • 155 ruling
  • 100 text
  • 0 ruling and text (overlapping)
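
As a minimal sketch of how this encoding might be decoded, the following Python/NumPy snippet builds one boolean mask per pixel class. The helper name `class_masks` and the dictionary keys are illustrative assumptions, not part of the dataset's tooling, and the image is assumed to be already loaded as a 2-D array of 8-bit values.

```python
import numpy as np

# Pixel values as listed in the dataset description.
BACKGROUND, RULING, TEXT, OVERLAP = 255, 155, 100, 0

def class_masks(img):
    """Return one boolean mask per pixel class (hypothetical helper)."""
    img = np.asarray(img)
    return {
        "background": img == BACKGROUND,
        "ruling": img == RULING,
        "text": img == TEXT,
        "overlap": img == OVERLAP,
    }

# Tiny synthetic example, not taken from the dataset.
demo = np.array([[255, 155, 155],
                 [100,   0, 255]], dtype=np.uint8)
masks = class_masks(demo)
counts = {name: int(m.sum()) for name, m in masks.items()}
```

Here `counts` holds the number of pixels per class, which can serve as a quick sanity check that a loaded image contains only the four expected values.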

For processing, a binary image must be generated by setting all pixels that are not 255 to 0. When evaluating, the line ground truth (GT) image is obtained by selecting all pixels with value 155 (e.g. linImg = img == 155). The text GT image is obtained by selecting all pixels with values below 155 (e.g. txtImg = img < 155), which covers both text and overlapping pixels. Then, true positives (tp), false positives (fp) and false negatives (fn) are defined as:

  • tp = result & linImg & !txtImg
  • fp = result & !txtImg
  • fn = !result & linImg & !txtImg
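
The binarization step and the definitions above can be sketched in Python/NumPy as follows. This is not the evaluation script shipped with the database. The function names `binarize` and `evaluate` are assumptions made for illustration, and `result` is assumed to be a boolean mask in which True marks a pixel that a method detected as ruling. The fp term is implemented literally as written in the description, so it also includes the pixels counted in tp.

```python
import numpy as np

def binarize(img):
    """Input for a ruling removal method: every non-255 pixel becomes 0."""
    img = np.asarray(img)
    return np.where(img == 255, 255, 0).astype(np.uint8)

def evaluate(img, result):
    """Count tp, fp, fn for a boolean `result` mask (True = detected ruling)."""
    img = np.asarray(img)
    result = np.asarray(result, dtype=bool)
    lin_img = img == 155   # line GT: ruling-only pixels
    txt_img = img < 155    # text GT: values 100 (text) and 0 (overlap)
    tp = result & lin_img & ~txt_img
    fp = result & ~txt_img            # as stated in the description
    fn = ~result & lin_img & ~txt_img
    return int(tp.sum()), int(fp.sum()), int(fn.sum())

# Tiny synthetic example, not taken from the dataset.
img = np.array([[255, 155, 155, 255],
                [100,   0, 155, 255]], dtype=np.uint8)
result = np.array([[True, True, False, False],
                   [True, False, True, False]])
binary = binarize(img)
tp, fp, fn = evaluate(img, result)
```

Text pixels are masked out of all three counts, so a detection that falls on text (such as the first pixel of the second row here) affects none of them.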

The database ships with a Matlab script that reports evaluation results once all images have been processed.

Files (83.9 MB)
Name: cvl-ruling-database.zip
Size: 83.9 MB
md5: 8a1599d56f59d188f9a9f73525df805c
  • Markus Diem, Florian Kleber and Robert Sablatnig. Ruling Analysis and Classification of Torn Documents. In ACM Symposium on Document Engineering, Colorado, USA, pages 63-72, 2014.

  • N. Stamatopoulos, B. Gatos, G. Louloudis, U. Pal and A. Alaei. ICDAR 2013 Handwriting Segmentation Contest. In Proceedings of the 12th International Conference on Document Analysis and Recognition, pages 1402-1406, 2013.

