ICPR 2020 Competition on Text Block Segmentation on a NewsEye Dataset
Institute of Mathematics, CITlab, University of Rostock
Description
We present a competition on text block segmentation held within the framework of the International Conference on Pattern Recognition (ICPR) 2020. The main goal of this competition is to automatically analyse the structure of historical newspaper pages and to evaluate the performance of the participants’ algorithms. In contrast to many existing segmentation methods, which work on pixels, the competition focuses on clustering baselines/text lines into text blocks. We therefore introduce a new measure based on a baseline detection evaluation scheme. Common pixel-based approaches could nevertheless participate without restrictions. Working at the baseline level directly addresses the application scenario in which the text contained in a given image is to be extracted in blocks for further investigation. We present the results of three submissions. The experiments show that text blocks can be reliably detected on pages with simple layouts as well as on pages with complex layouts.
Files
| Name | Size | Checksum |
|---|---|---|
| Michael2021_Chapter_ICPR2020CompetitionOnTextBlock.pdf | 15.1 MB | md5:f280800fe3647b2a2db601b3149f7472 |