Published February 29, 2024 | Version v1
Dataset
Open
Multi30k_train-ca
Description
The Multi30k_train-ca dataset is a professional translation of the train.en.multi30k dataset into Catalan, commissioned by the BSC LangTech Unit.
Flickr30k, the dataset on which Multi30k is based, is a dataset for sentence-based image description. It includes 31,000 images collected from Flickr, each with five reference captions provided by human annotators (https://paperswithcode.com/dataset/flickr30k).
This work was funded by the Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya within the framework of Projecte AINA.
Files
Name | Size | Checksum |
---|---|---|
train.ca.multi30k.txt | 2.4 MB | md5:f66e61881719ed30f95e37548935b67d |
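
The MD5 checksum listed for train.ca.multi30k.txt can be used to confirm that a download is intact. The following is a minimal sketch, not an official tool of this record: it assumes the file has been saved locally under its original name in the current directory, and the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, introduced only for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as listed for train.ca.multi30k.txt in this record.
EXPECTED_MD5 = "f66e61881719ed30f95e37548935b67d"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the file was downloaded into the current working directory.
    local_path = Path("train.ca.multi30k.txt")
    if not local_path.exists():
        print(f"{local_path} not found; download it first")
    elif matches_listed_checksum(local_path):
        print("checksum OK")
    else:
        print("checksum MISMATCH")
```

A mismatch usually points to an incomplete or altered download, in which case fetching the file again is the simplest remedy.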