Published June 8, 2020
Version 1.0.0
Dataset
Open
Wiki-MLM: Multiple Languages and Modalities
- 1. University of Bonn
- 2. Technical Information Library (TIB)
- 3. Jožef Stefan Institute (JSI)
Description
International organizations and companies encounter web data in a range of modalities and languages. At the same time, the applications these users develop are built from pipelines that perform multiple tasks. Systems that handle diverse inputs and multiple objectives hold the promise of limiting complexity in the application workflow and improving generalization. We present a Wikidata-generated resource designed to train and evaluate multitask systems on samples in four modalities and three languages.
Files
MLM_v1_eu.zip
(14.0 GB)
Name | Size |
---|---|
md5:a000a8f2f6c8b0b860f7e06cd6fb1d07 | 1.3 GB |
md5:a5c41775e451fdba3177be8f1c1be72c | 12.7 GB |
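The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the listing does not state which archive each checksum belongs to, so the pairing of file and checksum is left to the reader.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("MLM_v1_eu.zip", "md5:a000a8f2f6c8b0b860f7e06cd6fb1d07")` would return `True` only if that archive happens to be the one the checksum was published for, which is an assumption here. Chunked reading matters because the archives are several gigabytes in size.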