XQuAD-ca

Carlos Gerardo Rodriguez-Penagos; Carme Armentano-Oller

doi:10.5281/zenodo.4526224

There is a newer version of the record available.

Published February 9, 2021 | Version v1

Dataset Open

1. BSC

Professional translation of XQuAD into Catalan

XQuAD (Cross-lingual Question Answering Dataset) is a benchmark dataset for evaluating cross-lingual question answering performance. It consists of a subset of 240 paragraphs and 1190 question-answer pairs from the development set of SQuAD v1.1 (Rajpurkar et al., 2016), together with their professional translations into ten languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi. Romanian was added later. This record adds Catalan as the 13th language of the corpus, again translated by native, professional Catalan translators.

For more information on how XQuAD was created, refer to the paper "On the Cross-lingual Transferability of Monolingual Representations" (https://arxiv.org/abs/1910.11856) or visit the webpage https://github.com/deepmind/xquad

Translation into Catalan was commissioned by BSC TeMU (https://temu.bsc.es/) within the AINA project.
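
The counts quoted in the description (240 paragraphs, 1190 question-answer pairs) can be checked after unpacking the archive. The sketch below assumes the Catalan file follows the SQuAD v1.1 JSON layout (`data` > `paragraphs` > `qas`), as the original XQuAD files do; the record itself does not state the layout or the file name inside the archive, and `count_squad_items` and `load_squad_file` are illustrative names, not part of any published tooling.

```python
import json
from typing import Any, Dict, Tuple


def count_squad_items(dataset: Dict[str, Any]) -> Tuple[int, int]:
    """Return (paragraph count, question count) for a SQuAD v1.1-style dict."""
    n_paragraphs = 0
    n_questions = 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            n_paragraphs += 1
            n_questions += len(paragraph["qas"])
    return n_paragraphs, n_questions


def load_squad_file(path: str) -> Dict[str, Any]:
    """Read a SQuAD-style JSON file; the path is supplied by the caller."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Tiny in-memory sample in the assumed layout (not real dataset content).
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "Barcelona is the capital of Catalonia.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is the capital of Catalonia?",
                            "answers": [{"text": "Barcelona", "answer_start": 0}],
                        },
                        {
                            "id": "q2",
                            "question": "Barcelona is the capital of what?",
                            "answers": [{"text": "Catalonia", "answer_start": 28}],
                        },
                    ],
                },
                {
                    "context": "Zenodo hosts open datasets.",
                    "qas": [
                        {
                            "id": "q3",
                            "question": "What does Zenodo host?",
                            "answers": [{"text": "open datasets", "answer_start": 13}],
                        }
                    ],
                },
            ],
        }
    ],
}

print(count_squad_items(sample))
```

If the assumption about the layout holds, calling `count_squad_items(load_squad_file(path))` on the unpacked Catalan file should report the paragraph and question counts given in the description.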

Files

Name	Size
XQuAD-ca.zip md5:8c0727616a378e95377b5d0cd2d80087	137.8 kB
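
The MD5 value in the file listing can be used to confirm that a downloaded copy of the archive is intact. A minimal sketch using only the Python standard library follows; the helper names `md5_of_file` and `matches_expected` are hypothetical, and the local path is whatever location the archive was saved to.

```python
import hashlib

# Checksum as published in the file listing of this record.
EXPECTED_MD5 = "8c0727616a378e95377b5d0cd2d80087"


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True when the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("XQuAD-ca.zip")` returns `True` only if the local file hashes to the listed value. MD5 is adequate for detecting a corrupted download, though it is not a safeguard against deliberate tampering.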

	All versions	This version
Views	956	331
Downloads	86	23
Data volume	11.9 MB	3.2 MB

DOI: 10.5281/zenodo.4526224
Resource type: Dataset
Publisher: Zenodo
Languages: Catalan

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: February 9, 2021
Modified: June 20, 2022
