Published May 20, 2020 | Version v1
Dataset Open

RuBQ

  • 1. JetBrains Research

Description

We present RuBQ (pronounced [`rubik]) -- Russian Knowledge Base Questions, a KBQA dataset of 1,500 Russian questions of varying complexity. Each question comes with its English machine translation, a corresponding SPARQL query, and answers. The dataset also includes a subset of Wikidata covering entities with Russian labels. To the best of our knowledge, this is the first Russian KBQA and semantic parsing dataset.

The dataset is intended to be used as development and test sets in cross-lingual transfer, few-shot learning, or learning with synthetic data scenarios. Detailed information about RuBQ can be found on the GitHub page.

Files

specification.md

Files (173.5 kB)

Name Size
md5:ca5b0625052c5e1d59be996dca69e3b3 170.9 kB
md5:63e6dbf8d53d7da1be5a8da0296dbb2c 2.6 kB
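
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of the RuBQ distribution, and the local file path is whatever name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or as bare hex."""
    # Drop an optional 'md5:' prefix, as used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:ca5b0625052c5e1d59be996dca69e3b3")` should return True for an intact copy of the 170.9 kB file, where `local_path` is a placeholder for the saved file's location.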