Published June 8, 2020
| Version 1.0.0
Dataset
Open
Wiki-MLM: Multiple Languages and Modalities
- 1. University of Bonn
- 2. Technical Information Library (TIB)
- 3. Jožef Stefan Institute (JSI)
Description
International organizations and companies encounter web data in a range of modalities and languages. At the same time, the applications these users develop are built from pipelines that perform multiple tasks. Systems that handle diverse inputs and multiple objectives hold the promise of limiting complexity in the application workflow and improving generalization. We present a Wikidata-generated resource designed to train and evaluate multitask systems on samples in four modalities and three languages.
Files (14.0 GB)
data-description.txt
MD5 checksum | Size |
---|---|
md5:e0370476d992138e21926f664ecbcaf5 | 330 Bytes |
md5:f9f2f3a63b1ffffd9e872953781b427f | 1.2 kB |
md5:3db2764b0d2fda1feee893c5e198f0c6 | 1.8 kB |
md5:ecf66b26ab5959feb025d499adde1fa5 | 12.7 GB |
md5:a000a8f2f6c8b0b860f7e06cd6fb1d07 | 1.3 GB |
md5:860ab7404e4c9069ce348c89d319a182 | 417 Bytes |
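Each file in the listing carries an MD5 checksum, which can be used to confirm that a download (in particular the 12.7 GB and 1.3 GB archives) arrived intact. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_listing` are illustrative and not part of the dataset, and the local file name in the usage note is hypothetical, since the listing above does not show which checksum belongs to which file name.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's digest with a listed value such as 'md5:e037...'.

    The optional 'md5:' prefix is stripped before comparison.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("downloaded-archive", "md5:ecf66b26ab5959feb025d499adde1fa5")` would return `True` only if the hypothetical local file `downloaded-archive` has the checksum listed for the 12.7 GB entry.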