Published April 23, 2018 | Version 1.1
Dataset Open

READ ABP Table datasets

  • 1. Naver Labs Europe
  • 2. Bistum Passau Archiv
  • 3. CVL TU Wien

Description

Datasets used in the publication: Hervé Déjean, Jean-Luc Meunier, Stéphane Clinchant, Eva Maria Lang and Florian Kleber, "Comparing Machine Learning Approaches for Table Recognition in Historical Register Books", 13th IAPR International Workshop on Document Analysis Systems (DAS 2018), Vienna, Austria.

CHANGES:

07/05/2018: second version; the dataset150 images, which were missing from the first version, have been added

dataset111
    img/  images
    xml/ READ pagexml with BIESO annotation

dataset150
    img/  images
    GT_xml: READ pagexml with BIESO annotation (ground truth)
    WK_xml: workflow version: pagexml corresponding to the workflow outputs (text lines and columns are recognised automatically)
    ROWREF: ground truth (GT) for the row regions
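
To work with the layout above, each image usually has to be matched with its annotation file. The following is a minimal sketch only: it assumes that an image and its pagexml file share the same base name, which this description does not state, and the function name pair_by_stem is hypothetical.

```python
from pathlib import Path


def pair_by_stem(img_dir, xml_dir):
    """Pair image files with annotation files that have the same base name.

    ASSUMPTION: an image and its pagexml file differ only in their
    extension (e.g. page1.jpg / page1.xml). Check this against the
    unpacked archive before relying on it.
    """
    images = {p.stem: p for p in Path(img_dir).iterdir() if p.is_file()}
    xmls = {p.stem: p for p in Path(xml_dir).iterdir() if p.is_file()}
    # Keep only the base names present in both folders, in sorted order.
    shared = sorted(images.keys() & xmls.keys())
    return {stem: (images[stem], xmls[stem]) for stem in shared}
```

For dataset111 this could be called as pair_by_stem("dataset111/img", "dataset111/xml"); files without a counterpart are silently left out, so comparing the result size with the folder sizes is a quick sanity check.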

Tagset (attribute of the TextLine element)

Type: deprecated

DU_row:
    B: first element of a cell
    I: inside a cell
    E: last element of a cell
    S: single element of a cell
    O: outside the table
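
The tagset above can be read back from a pagexml file with the Python standard library. This is a sketch under assumptions: it takes DU_row to be a plain XML attribute on each TextLine element, as the description suggests, matches elements by local name so that no particular pagexml namespace has to be assumed, and treats the lines as listed in reading order. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_du_row_labels(xml_source):
    """Return (line id, DU_row value) for every TextLine, in document order.

    ASSUMPTION: DU_row is a direct attribute of TextLine. Lines without
    the attribute are returned with the value None.
    """
    root = ET.parse(xml_source).getroot()
    labels = []
    for elem in root.iter():
        # Strip a possible "{namespace}" prefix from the tag name.
        if isinstance(elem.tag, str) and elem.tag.rsplit("}", 1)[-1] == "TextLine":
            labels.append((elem.get("id"), elem.get("DU_row")))
    return labels


def group_lines(labels):
    """Group consecutive lines into B..E runs and single S lines.

    Lines tagged O (or untagged) are skipped and close any open group.
    """
    groups, current = [], []
    for line_id, tag in labels:
        if tag == "B":
            if current:
                groups.append(current)
            current = [line_id]
        elif tag == "I":
            current.append(line_id)
        elif tag == "E":
            current.append(line_id)
            groups.append(current)
            current = []
        else:
            if current:
                groups.append(current)
                current = []
            if tag == "S":
                groups.append([line_id])
    if current:
        groups.append(current)
    return groups
```

Grouping is done over the whole file here for brevity; in practice it is probably safer to apply group_lines separately to the lines of each column or region, since the B/I/E sequence is only meaningful within one run of neighbouring lines.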

Files

READ_ABP_TABLE.zip (590.6 MB)
md5:2bfa34d909adf0300c0527818d164dc2
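
The md5 checksum listed for READ_ABP_TABLE.zip can be used to confirm that a download is complete. A minimal sketch with Python's hashlib follows; the function names and the local file path are assumptions, and the archive is read in chunks so that the whole file does not need to fit in memory.

```python
import hashlib

# Checksum published for READ_ABP_TABLE.zip on this page.
EXPECTED_MD5 = "2bfa34d909adf0300c0527818d164dc2"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling archive_is_intact("READ_ABP_TABLE.zip") on the downloaded file (the path is whatever location it was saved to) should return True for an uncorrupted copy.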

Additional details

Funding

READ – Recognition and Enrichment of Archival Documents (grant 674943)
European Commission