Dataset Open Access
This dataset contains the training, evaluation, and test sets for the ICDAR 2019 Competition on Baseline Detection (cBAD).
cBAD is based on a newly created, freely available real-world dataset of 3021 annotated document page images collected from seven European archives. The baselines in all images were manually annotated. The training and evaluation sets contain PAGE XML files with annotated text regions and baselines. The ground truth for the test set will be published after the competition deadline (May 2019).
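As a rough illustration of how the baseline annotations might be read, the sketch below extracts baseline polylines from a PAGE XML document using only the Python standard library. It assumes that each `Baseline` element carries a `points` attribute of the form `x1,y1 x2,y2 ...` with integer coordinates, as in recent PAGE schema versions; files written against older schema versions may store points differently. The function name `read_baselines`, the `SAMPLE` document, and its namespace URI are illustrative assumptions and are not taken from the dataset itself.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_baselines(xml_text):
    """Return one list of (x, y) tuples per Baseline element, in document order.

    Matching on the local tag name keeps the function independent of the
    PAGE schema version declared in the file's namespace.
    """
    root = ET.fromstring(xml_text)
    baselines = []
    for elem in root.iter():
        if local_name(elem.tag) != "Baseline":
            continue
        points = []
        # Assumption: points are stored as "x1,y1 x2,y2 ..." with integer values.
        for pair in elem.get("points", "").split():
            x, y = pair.split(",")
            points.append((int(x), int(y)))
        if points:
            baselines.append(points)
    return baselines


# Hypothetical minimal PAGE XML document, written for this example only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="200" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Baseline points="10,20 110,22"/>
      </TextLine>
      <TextLine id="l2">
        <Baseline points="10,60 60,62 120,61"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for line_number, baseline in enumerate(read_baselines(SAMPLE), start=1):
        print(line_number, baseline)
```

To apply this to a file from the training or evaluation set, read the file's contents into a string and pass it to `read_baselines`.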
Competition Website: https://scriptnet.iit.demokritos.gr/competitions/11/