Published October 8, 2023 | Version 1.0
Dataset Open

TempTabQA: Temporal Question Answering for Semi-Structured Tables

  • 1. University of Pennsylvania
  • 2. Bloomberg

Description

This repository contains TempTabQA, the dataset developed for the paper: Gupta, V., Kandoi, P., Vora, M., Zhang, S., He, Y., Reinanda, R., Srikumar, V., TempTabQA: Temporal Question Answering for Semi-Structured Tables. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023.

TempTabQA comprises 11,454 question-answer pairs extracted from Wikipedia Infobox tables and annotated by human annotators. We provide two test sets instead of one: the Head set, covering popular, frequent domains, and the Tail set, covering rarer domains.

The annotation files are organized as follows:

Maindata

  • qapairs: split into train, dev, head, and tail sets, in both csv and json formats
  • Tables: Wikipedia category and table metadata in csv, json, and html formats
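
The layout described above can be checked before loading anything. The following is a minimal sketch that counts archive members per folder and file extension. The exact file names inside the archive are not listed here, so the code makes no assumption about them; `summarize_archive` is a hypothetical helper and not part of the dataset tooling.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count files in a zip archive, keyed by (parent folder, extension)."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            path = PurePosixPath(info.filename)
            counts[(str(path.parent), path.suffix.lower())] += 1
    return counts


# Example usage, assuming the archive was downloaded to the working directory:
# for (folder, ext), n in sorted(summarize_archive("maindata.zip").items()):
#     print(folder, ext, n)
```

If the archive matches the description, the summary should show csv and json files under the qapairs folder, and csv, json, and html files under the tables folder.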

Read the LICENCE carefully before any non-academic use.

Note: Wherever a reference date is required, treat 2022 as the build year of the dataset.
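
One way to read this note: when a table entry describes a span that is still ongoing, the span is closed at 2022 instead of the current year. The sketch below illustrates that arithmetic. `years_elapsed` and `BUILD_YEAR` are hypothetical names introduced for illustration, and the handling of open-ended spans is an assumption rather than the authors' evaluation code.

```python
BUILD_YEAR = 2022  # build year of the dataset, per the note above


def years_elapsed(start_year, end_year=None, build_year=BUILD_YEAR):
    """Return the number of years from start_year to end_year.

    An open-ended span (end_year is None) is assumed to end at build_year.
    """
    if end_year is None:
        end_year = build_year
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    return end_year - start_year


# A span that began in 2010 and is still ongoing counts as 12 years.
print(years_elapsed(2010))        # 12
# A closed span is unaffected by the build year.
print(years_elapsed(1990, 2000))  # 10
```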


Files

maindata.zip (4.6 MB)
md5:a1f58402b91b0a1c5121866481d6a900
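
The published checksum can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The local path `maindata.zip` is an assumption about where the file was saved, and the helper names are hypothetical.

```python
import hashlib

# MD5 checksum listed for maindata.zip on this record.
EXPECTED_MD5 = "a1f58402b91b0a1c5121866481d6a900"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()


# Example usage, assuming the archive is in the working directory:
# print(matches_published_checksum("maindata.zip"))
```

MD5 is adequate here for detecting a corrupted or truncated download, but it is not a defence against deliberate tampering.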

Additional details

Dates

Available
2023-10-20