QAWiki v2026.1: Knowledge Graph Question Answering (KGQA) / SPARQL Query Generation Dataset for Wikidata
Description
This is a snapshot of QAWiki from 2026-03-19: a dataset for knowledge graph question answering (KGQA) and/or SPARQL query generation over Wikidata. This snapshot is published for use in the WikiKGQA 2026 challenge.
For the WikiKGQA challenge, please use the updated version instead: v2026.2.
The dataset contains questions in English and Spanish, each paired with a SPARQL query. Some questions also have Italian and Danish translations provided by the community.
The dataset is presented in two formats:
- The full format is a Turtle (TTL) file containing a full RDF dump of QAWiki, which also includes entity mentions, relation mentions, question relations, quality tags, etc.
- The WikiKGQA format is a JSON file that follows an extended form of the QALD schema and is extracted from the full format. It provides questions, queries, mentions and expected solutions, but omits some content: questions whose queries return empty results on the static Wikidata dump used for the challenge, alias queries, questions in languages other than English or Spanish, ambiguous questions, question relations, etc.
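As a rough illustration of how question/query pairs might be read from a QALD-style JSON file, here is a minimal Python sketch. It assumes the common QALD layout (a top-level `questions` list, per-language `question` entries with `language` and `string`, and a `query.sparql` field). The WikiKGQA file uses an extended form of this schema, so the actual field names may differ; the sample record and the helper `iter_pairs` are hypothetical and not taken from the dataset.

```python
import json

# Hypothetical record in the common QALD layout. The real WikiKGQA file
# extends this schema, so its field names may differ.
SAMPLE = """
{
  "questions": [
    {
      "id": "1",
      "question": [
        {"language": "en", "string": "Where was Douglas Adams born?"},
        {"language": "es", "string": "¿Dónde nació Douglas Adams?"}
      ],
      "query": {"sparql": "SELECT ?place WHERE { wd:Q42 wdt:P19 ?place }"}
    }
  ]
}
"""


def iter_pairs(data, language="en"):
    """Yield (question_text, sparql) for entries that have text in `language`."""
    for entry in data.get("questions", []):
        # Map language code -> question text for this entry.
        texts = {q["language"]: q["string"] for q in entry.get("question", [])}
        sparql = entry.get("query", {}).get("sparql")
        if language in texts and sparql:
            yield texts[language], sparql


data = json.loads(SAMPLE)
pairs_en = list(iter_pairs(data, "en"))
pairs_es = list(iter_pairs(data, "es"))

for text, sparql in pairs_en + pairs_es:
    print(text, "=>", sparql)
```

To try this on the real file, replace `json.loads(SAMPLE)` with `json.load(...)` on the extracted JSON file and adjust the field names to whatever the extended schema actually uses.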
Files
qawiki-complete.ttl.zip
Total size: 4.5 MB

| MD5 checksum | Size |
|---|---|
| 9d8c78ee021501d1448123a6c7d00383 | 425.5 kB |
| dbd6cf5a64bf7b2b916874859f7e4e85 | 4.1 MB |
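The MD5 checksums listed above can be used to check that a download is intact. The following is a minimal Python sketch using only the standard library; the helper names `md5_of_file` and `matches_checksum` are illustrative. The demonstration runs on a small temporary file rather than on the dataset archives.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Demonstration on a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_ok = matches_checksum(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
demo_bad = matches_checksum(demo_path, "md5:00000000000000000000000000000000")
os.remove(demo_path)

print(demo_ok, demo_bad)
```

For a downloaded archive, pass its local path together with the checksum shown for that file on the record page.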