QAWiki v1: Knowledge Graph Question Answering (KGQA) / SPARQL Query Generation Dataset for Wikidata
Authors/Creators
- 1. DCC, Universidad de Chile
- 2. Instituto Milenio Fundamentos de los Datos
Description
This is a snapshot of QAWiki from 2025-09-09: a dataset for knowledge graph question answering (KGQA) and/or SPARQL query generation over Wikidata.
The dataset is presented in two formats:
- The simple format is a TSV file containing language-tagged questions and paraphrased questions, each paired with a SPARQL query.
- The full format is a TTL (Turtle) file containing a full RDF dump of QAWiki, which also includes entity mentions, relation mentions, question relations, quality tags, etc.
The dataset contains 518 questions in English and Spanish, each paired with a SPARQL query, plus 8 additional ambiguous questions without queries. Some questions also have Italian and Danish translations provided by the community.
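As a rough illustration of working with the simple format, the sketch below groups question/query pairs by language tag. The column names (`lang`, `question`, `sparql`) and the sample rows are assumptions made for illustration only; the actual header and layout of the TSV file should be checked before adapting this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names: check the header of the actual TSV file.
LANG_COL, QUESTION_COL, QUERY_COL = "lang", "question", "sparql"


def load_pairs(handle):
    """Group (question, query) pairs by language tag from a TSV file handle."""
    pairs = defaultdict(list)
    # QUOTE_NONE keeps any quote characters inside SPARQL queries untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        pairs[row[LANG_COL]].append((row[QUESTION_COL], row[QUERY_COL]))
    return dict(pairs)


# Made-up placeholder rows in the assumed layout, not taken from QAWiki.
sample = io.StringIO(
    "lang\tquestion\tsparql\n"
    "en\texample question\tSELECT * WHERE { ?s ?p ?o } LIMIT 1\n"
    "es\tpregunta de ejemplo\tSELECT * WHERE { ?s ?p ?o } LIMIT 1\n"
)
pairs = load_pairs(sample)
print(sorted(pairs))
```

To read a downloaded file instead of the in-memory sample, pass a handle opened with `open(path, encoding="utf-8", newline="")`.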
Files
Total size: 2.9 MB

| MD5 checksum | Size |
|---|---|
| 9f2d8ecae766576b3e3c23c9565debff | 2.6 MB |
| f2be2d6c795665e2523e7df82575c532 | 226.2 kB |
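The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of` and `is_published_file` are illustrative, and since the file names are not given here, the check only tests whether a local file matches either of the two published checksums.

```python
import hashlib

# Checksums as listed for the two dataset files.
PUBLISHED_MD5 = {
    "9f2d8ecae766576b3e3c23c9565debff",
    "f2be2d6c795665e2523e7df82575c532",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_published_file(path):
    """Return True if the file's MD5 matches one of the published checksums."""
    return md5_of(path) in PUBLISHED_MD5
```

Calling `is_published_file` on the local path of a downloaded file returns `False` if the file was truncated or altered in transit.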