GLAMI-1M: A Multilingual Image-Text Fashion Dataset - 800px
Description
We introduce GLAMI-1M: the largest multilingual image-text classification dataset and benchmark. The dataset contains images of fashion products with item descriptions, each written in one of 13 languages. The categorization into 191 classes has high-quality annotations: all 100k images in the test set and 75% of the 1M training set were human-labeled. The paper presents baselines for image-text classification showing that the dataset poses a challenging fine-grained classification problem: the best-scoring EmbraceNet model, which uses both visual and textual features, achieves 69.7% accuracy. Experiments with a modified Imagen model show that the dataset is also suitable for image generation conditioned on text. The dataset, source code, and model checkpoints are published at: https://github.com/glami/glami-1m
Files
Previewed archive: GLAMI-1M-dataset-800px--0.zip
Total size of all files: 64.0 GB
Checksum | Size |
---|---|
md5:9d7aa43dc315a5d567389e68e4e50ccb | 6.8 GB |
md5:36d1ffbef560b707f815faad98b039f1 | 5.2 GB |
md5:cec58af85ff2cc9eefd9ba20ebdd334e | 4.4 GB |
md5:47598ccfddcfb3ed9a00e851aa1165b9 | 6.4 GB |
md5:d92c0a41e093bd07fd59a6a80c65cf25 | 5.0 GB |
md5:0134506e9d47ddf2e4c7b286b68e3c09 | 5.2 GB |
md5:19dcb70eb67c4ecb560d1ba6580d0b3c | 6.0 GB |
md5:33cddce2f4c3df3be3ae8c0e7fb841ed | 6.1 GB |
md5:2ed26dcc539e1e57af4305e9b1f22686 | 6.0 GB |
md5:6895f21aa72a790fb8ce9556637f42d7 | 6.3 GB |
md5:235f5003edaf7b45a7d99b38d814311e | 6.5 GB |
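Because each archive is several gigabytes, it may be worth checking a finished download against the MD5 checksums listed above. The following is a minimal sketch that uses only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative, not part of the GLAMI-1M tooling, and the listing does not state which checksum belongs to which file name, so that pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as in the listing,
    for example "md5:9d7aa43dc315a5d567389e68e4e50ccb".
    A bare hex digest without the "md5:" prefix is accepted as well."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_checksum(local_path, "md5:9d7aa43dc315a5d567389e68e4e50ccb")` returns `True` only when the file at `local_path` (a hypothetical local download location) hashes to that digest. A `False` result usually indicates a truncated or corrupted download, or a checksum paired with the wrong file.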