XQuAD-ca
Description
Professional translation of XQuAD into Catalan
XQuAD (Cross-lingual Question Answering Dataset) is a benchmark dataset for evaluating cross-lingual question answering performance. The dataset consists of a subset of 240 paragraphs and 1190 question-answer pairs from the development set of SQuAD v1.1 (Rajpurkar et al., 2016), together with their professional translations into ten languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi. Romanian was added later. We added Catalan as the 13th language of the corpus, again using native, professional Catalan translators.
For more information on how XQuAD was created, refer to the paper, On the Cross-lingual Transferability of Monolingual Representations (https://arxiv.org/abs/1910.11856), or visit the webpage https://github.com/deepmind/xquad
Translation into Catalan was commissioned by BSC TeMU (https://temu.bsc.es/) within the AINA project.
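The paragraph and question counts given above can be checked with a short script. The following is a minimal sketch that assumes the Catalan file uses the SQuAD v1.1 JSON layout (a top-level "data" list of articles, each with "paragraphs" that hold a "context" and a "qas" list); this record does not state the layout, so treat it as an assumption. The helper names count_squad and load_squad are hypothetical.

```python
import json


def load_squad(path):
    """Load a SQuAD-style JSON file from disk (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_squad(dataset):
    """Return (n_paragraphs, n_questions) for a SQuAD v1.1-style dict.

    Assumes the layout data -> paragraphs -> qas; this is an assumption
    about the Catalan file, not something stated in the record.
    """
    n_paragraphs = 0
    n_questions = 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            n_paragraphs += 1
            n_questions += len(paragraph["qas"])
    return n_paragraphs, n_questions


# Tiny in-memory example in the assumed layout (not real dataset content).
example = {
    "data": [
        {
            "title": "example",
            "paragraphs": [
                {
                    "context": "Barcelona is a city.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is Barcelona?",
                            "answers": [{"text": "a city", "answer_start": 13}],
                        }
                    ],
                }
            ],
        }
    ]
}
print(count_squad(example))
```

If the assumption holds, calling count_squad(load_squad(path)) on the JSON file extracted from the archive should report 240 paragraphs and 1190 questions, matching the description.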
Files
XQuAD-ca.zip (137.8 kB)
md5:8c0727616a378e95377b5d0cd2d80087
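The MD5 checksum listed for the archive can be used to confirm that a download is intact. A minimal sketch using Python's standard hashlib module follows; the function names and the local path passed to them are illustrative assumptions, and only the checksum value comes from this record.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "8c0727616a378e95377b5d0cd2d80087"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file at path has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("XQuAD-ca.zip") should return True for an uncorrupted copy saved under that name in the working directory.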