There is a newer version of the record available.

Published January 30, 2023 | Version 1.0.5
Dataset Open

DUDE competition train and validation splits ground truth

  • KU Leuven

Description

This JSON file contains the ground truth annotations for the train and validation sets of the DUDE competition (https://rrc.cvc.uab.es/?ch=23&com=tasks) at ICDAR 2023 (https://icdar2023.org/).

1.0.5 release candidate: 30102 annotations for 3702 documents (train and validation splits combined)

DatasetDict({
    train: Dataset({
        features: ['docId', 'questionId', 'question', 'answers', 'answers_page_bounding_boxes', 'answers_variants', 'answer_type', 'data_split', 'document', 'OCR'],
        num_rows: 23774
    })
    val: Dataset({
        features: ['docId', 'questionId', 'question', 'answers', 'answers_page_bounding_boxes', 'answers_variants', 'answer_type', 'data_split', 'document', 'OCR'],
        num_rows: 6328
    })
})
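The split sizes above can be cross-checked once the annotations are in memory. The following is a minimal sketch, not the loader used by the competition: it assumes the annotations are available as a list of dictionaries that each carry the `data_split` field named in the feature list, and the helper names `count_by_split` and `check_split_sizes` are hypothetical. The layout of the JSON file itself is not described in this record, so how that list is obtained from the file is left open.

```python
from collections import Counter

# Row counts as stated in this record for version 1.0.5.
EXPECTED_ROWS = {"train": 23774, "val": 6328}
EXPECTED_TOTAL = 30102


def count_by_split(records):
    """Count annotation records per value of their 'data_split' field.

    Assumption: `records` is an iterable of dicts, one per annotation,
    each with a 'data_split' key (as in the feature list above).
    """
    return Counter(record["data_split"] for record in records)


def check_split_sizes(records, expected=EXPECTED_ROWS):
    """Return a dict of splits whose observed count differs from `expected`.

    Each entry maps split name -> (observed, expected). An empty dict
    means every expected split has the stated number of rows.
    """
    observed = count_by_split(records)
    return {
        split: (observed.get(split, 0), wanted)
        for split, wanted in expected.items()
        if observed.get(split, 0) != wanted
    }
```

The two stated split sizes add up to the stated total (23774 + 6328 = 30102), so a mismatch reported by `check_split_sizes` would point to a different version of the file or to a different record layout than assumed here.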

Changes in this version:

  • update on answer_type
  • formatting change to answers_variants

Notes

Binaries are hosted elsewhere for now; see https://huggingface.co/datasets/jordyvl/DUDE_loader/tree/main/data

Files

DUDE_gt_release-candidate_trainval.json

Size: 18.3 MB
md5:4d4f949e170b65b8a062acb597c2125c
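The published MD5 checksum can be used to confirm that a download is intact. The sketch below uses only Python's standard `hashlib` module. The function names `file_md5` and `verify_download` are hypothetical, and the local path passed to them is whatever location the file was saved to.

```python
import hashlib

# Checksum as published in this record for
# DUDE_gt_release-candidate_trainval.json (version 1.0.5).
EXPECTED_MD5 = "4d4f949e170b65b8a062acb597c2125c"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so it never has to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected`."""
    return file_md5(path) == expected.lower()
```

For example, `verify_download("DUDE_gt_release-candidate_trainval.json")` should return `True` for an unmodified copy of this version. A `False` result suggests a truncated download or a different version of the record.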