There is a newer version of the record available.

Published April 23, 2019 | Version v1
Dataset Open

ICDAR 2019 Competition on Table Detection and Recognition (cTDaR)

  • 1. Naver Labs Europe
  • 2. Institute of Computer Science & Technology, Peking University, China
  • 3. Institute of Computer Science & Technology, Peking University, China
  • 4. State Key Laboratory of Digital Publishing Technology, Founder Group Co. LTD., China
  • 5. Computer Vision Lab, TU Wien
  • 6. Archiv des Bistums Passau

Description

The aim of this competition is to evaluate the performance of state-of-the-art methods for table detection (TRACK A) and table recognition (TRACK B). For TRACK A, document images containing one or several tables are provided. TRACK B has two subtracks. The first subtrack (B.1) provides the table region, so only table structure recognition must be performed. The second subtrack (B.2) provides no a priori information, so both table region detection and table structure recognition have to be performed. The ground truth is provided in a format similar to that of the ICDAR 2013 competition (see [2]):

<?xml version="1.0" encoding="UTF-8"?>
<document filename='filename.jpg'>
    <table id='Table_1540517170416_3'>
        <Coords points="180,160 4354,160 4354,3287 180,3287"/>
        <cell id='TableCell_1540517477147_58' start-row='0' start-col='0' end-row='1' end-col='2'>
            <Coords points="180,160 177,456 614,456 615,163"/>
        </cell>
        ...
    </table>
    ...
</document>

 

The difference from Göbel et al. [2] is the Coords tag, which defines a table or cell as a polygon specified by a list of coordinates. For B.1, the table and its coordinates are given together with the input image.
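As an illustration of the format described above, the following minimal Python sketch reads one ground-truth file into plain dictionaries. It is not part of the competition tooling. The function names `parse_points` and `load_tables` and the dictionary keys are hypothetical choices made for this example, and the sample document simply repeats the excerpt shown above without the elided parts.

```python
import xml.etree.ElementTree as ET

# Sample taken from the format excerpt in the description (elided parts omitted).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<document filename='filename.jpg'>
    <table id='Table_1540517170416_3'>
        <Coords points="180,160 4354,160 4354,3287 180,3287"/>
        <cell id='TableCell_1540517477147_58' start-row='0' start-col='0' end-row='1' end-col='2'>
            <Coords points="180,160 177,456 614,456 615,163"/>
        </cell>
    </table>
</document>
"""


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def load_tables(xml_bytes):
    """Return (image filename, list of tables) for one ground-truth document."""
    root = ET.fromstring(xml_bytes)
    tables = []
    for table in root.findall("table"):
        cells = []
        for cell in table.findall("cell"):
            cells.append({
                "id": cell.get("id"),
                "start_row": int(cell.get("start-row")),
                "start_col": int(cell.get("start-col")),
                "end_row": int(cell.get("end-row")),
                "end_col": int(cell.get("end-col")),
                # Each cell carries its own Coords polygon.
                "polygon": parse_points(cell.find("Coords").get("points")),
            })
        tables.append({
            "id": table.get("id"),
            # find() only looks at direct children, so this is the table polygon.
            "polygon": parse_points(table.find("Coords").get("points")),
            "cells": cells,
        })
    return root.get("filename"), tables


filename, tables = load_tables(SAMPLE.encode("utf-8"))
print(filename, len(tables), tables[0]["cells"][0]["polygon"])
```

In subtrack B.1 only the table polygon would be present in the input, so a reader along these lines would return tables with an empty cell list.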

Important Note:

For the modern dataset, a cell region is defined as the convex hull of the cell's content. For the historical dataset, the output region of a cell must instead be the cell boundary. This is necessary because handwritten text often extends into other cells.
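For the modern dataset, one way to obtain such a region is sketched below. It assumes, purely for illustration, that the content of a cell is available as axis-aligned boxes (x0, y0, x1, y1), for example word boxes from OCR. That input format and the function names are assumptions of this example and are not prescribed by the competition. The hull is computed with the standard monotone chain method.

```python
def convex_hull(points):
    """Convex hull of 2D points (Andrew's monotone chain), collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Z component of the cross product of vectors o->a and o->b.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def boxes_to_points(boxes):
    """Corner points of hypothetical (x0, y0, x1, y1) content boxes."""
    pts = []
    for x0, y0, x1, y1 in boxes:
        pts.extend([(x0, y0), (x1, y0), (x1, y1), (x0, y1)])
    return pts


def to_points_attr(polygon):
    """Format a polygon as the value of the 'points' attribute of a Coords tag."""
    return " ".join(f"{x},{y}" for x, y in polygon)


# Two made-up text-line boxes inside one cell.
content_boxes = [(10, 10, 50, 20), (10, 25, 80, 35)]
hull = convex_hull(boxes_to_points(content_boxes))
print(to_points_attr(hull))  # 10,10 50,10 80,25 80,35 10,35
```

For the historical dataset this step would not apply, since the cell boundary itself is the requested region.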

See also: http://sac.founderit.com/tasks.html

The evaluation tool is available on GitHub: https://github.com/cndplab-founder/ctdar_measurement_tool

Notes

http://sac.founderit.com/

Files

TRACKA_test.zip

Files (6.0 GB)

Checksum Size
md5:9d919af14a8a59d0ab68e55a0f45e48f 791.5 MB
md5:979ee6a89fea6fb52b9981f614b17db1 2.1 GB
md5:a4f787914c39c86de8579a94d56ee7fa 531.1 MB
md5:43e32c9670a54cdf17ec9ed485874c37 548.0 MB
md5:d0814b29043071ae7019bbb910050909 2.0 GB
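A downloaded archive can be checked against the checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names are hypothetical, and the listing does not state which checksum belongs to which archive, so the expected value for a given file has to be taken from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:9d91...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listing("some_archive.zip", "md5:...")` returns True when the download is intact. The path here is a placeholder for whichever archive was downloaded.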

Additional details

Funding

READ – Recognition and Enrichment of Archival Documents 674943
European Commission