There is a newer version of the record available.

Published April 23, 2018 | Version 10
Dataset Open

READ ABP Table datasets

  • 1. Naver Labs Europe
  • 2. Bistum Passau Archiv
  • 3. CVL TU Wien

Description

Datasets used in the publication: Comparing Machine Learning Approaches for Table Recognition in Historical Register Books, Hervé Déjean, Jean-Luc Meunier, Stéphane Clinchant, Eva Maria Lang and Florian Kleber, 13th IAPR International Workshop on Document Analysis Systems (DAS 2018), Vienna, Austria.

 

dataset111
    img/: images
    xml/: READ PageXML with BIESO annotation

dataset150
    img/: images
    GT_xml: READ PageXML with BIESO annotation
    WK_xml: workflow version, i.e. PageXML corresponding to the workflow outputs (text lines and columns are both recognised automatically)
    ROWREF: ground truth (GT) for the row regions
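
As an illustration of how the annotated PageXML files might be read, here is a minimal Python sketch. It assumes that the row label is stored as an XML attribute named DU_row on each TextLine element and that each TextLine carries an id attribute; neither detail is confirmed beyond the tagset description in this record. The helper names are hypothetical, and element names are matched without their namespace so that no particular PageXML schema version has to be assumed.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix: '{ns}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def row_labels(root, attr="DU_row"):
    """Return (TextLine id, label) pairs in document order.

    `attr` is an assumed attribute name; lines that lack it get None.
    """
    return [
        (el.get("id"), el.get(attr))
        for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == "TextLine"
    ]


# Usage with a file from the archive (the path is a placeholder):
# root = ET.parse("dataset111/xml/some_page.xml").getroot()
# for line_id, label in row_labels(root):
#     print(line_id, label)
```

Matching on the local element name keeps the sketch independent of the namespace URI declared in the files.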

Tagset (attribute of the TextLine element)

Type: deprecated

DU_row:
    B: first element of a cell
    I: inside a cell
    E: last element of a cell
    S: single element of a cell
    O: outside the table
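
The BIESO scheme above can be decoded into groups of text lines with a short loop. The following Python sketch is one possible reading, not the authors' implementation: it assumes the labels are given in reading order within a single column, and it tolerates malformed sequences (for example an I without a preceding B) by opening a new group. The function name group_bieso is hypothetical.

```python
def group_bieso(labels):
    """Group a sequence of B/I/E/S/O labels into lists of line indices.

    B opens a group, I extends it, E closes it, S is a one-line group,
    and O (or any unknown label) belongs to no group.
    """
    groups, current = [], []
    for i, label in enumerate(labels):
        if label == "B":
            if current:  # unterminated group: close it first
                groups.append(current)
            current = [i]
        elif label == "I":
            current.append(i)
        elif label == "E":
            current.append(i)
            groups.append(current)
            current = []
        else:
            if current:  # S, O or unknown ends any open group
                groups.append(current)
                current = []
            if label == "S":
                groups.append([i])
    if current:  # sequence ended inside a group
        groups.append(current)
    return groups


print(group_bieso(["B", "I", "E", "S", "O", "B", "E"]))
# prints [[0, 1, 2], [3], [5, 6]]
```

Returning indices rather than the lines themselves makes it easy to map each group back to the TextLine elements it came from.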

Files

READ_ABP_TABLE.zip (311.1 MB)
md5:ddbbe2c3a0d480a0c74af5c47da52351
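
The MD5 checksum listed for the archive can be used to check a download. A minimal Python sketch follows, assuming the archive has been saved locally under its listed name; the helper name md5_of_file is hypothetical.

```python
import hashlib

# Checksum as published in this record.
EXPECTED_MD5 = "ddbbe2c3a0d480a0c74af5c47da52351"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (the local path is an assumption):
# if md5_of_file("READ_ABP_TABLE.zip") != EXPECTED_MD5:
#     raise SystemExit("checksum mismatch, download may be corrupted")
```

Reading in chunks keeps memory use constant regardless of the archive size.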

Additional details

Funding

European Commission
READ - Recognition and Enrichment of Archival Documents (grant 674943)