Published April 14, 2024 | Version v2
Dataset Open

Dollar street 10 - 64x64x3

  • 1. Netherlands eScience Center

Description

The MLCommons Dollar Street Dataset is a collection of images of everyday household items from homes around the world that visually captures the socioeconomic diversity of traditionally underrepresented populations. It consists of public domain data, licensed for academic, commercial and non-commercial usage, under CC-BY and CC-BY-SA 4.0. The dataset was developed because similar datasets lack socioeconomic metadata and are not representative of global diversity.

This is a subset of the original dataset that can be used for multiclass classification with 10 categories. It is designed for teaching, similar to the widely used but unlicensed CIFAR-10 dataset.

These are the preprocessing steps that were performed:

  1. Keep only examples with exactly one imagenet_synonym label
  2. Keep only examples with the 10 most frequently occurring labels
  3. Downscale images to 64 x 64 pixels
  4. Split the data into train and test sets
  5. Store as NumPy arrays
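
The steps above can be sketched roughly as follows. This is an illustrative sketch, not the code from the notebook: the input format (a list of image/label-list pairs), the function names `downscale_nearest` and `build_subset`, the 0.25 test fraction, and the nearest-neighbour resizing in plain NumPy are all assumptions made for the example. The alphabetical ordering of labels is also an assumption, chosen because it agrees with the label mapping listed on this page.

```python
from collections import Counter

import numpy as np


def downscale_nearest(image, size=64):
    """Downscale an (H, W, 3) array to (size, size, 3) by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols[None, :]]


def build_subset(examples, n_classes=10, size=64, test_fraction=0.25, seed=0):
    """Build train/test arrays from (image_array, list_of_labels) pairs.

    The input format and all defaults are hypothetical, chosen for illustration.
    """
    # Step 1: keep only examples that carry exactly one label.
    single = [(img, labels[0]) for img, labels in examples if len(labels) == 1]

    # Step 2: keep only the n_classes most frequent labels,
    # numbered in alphabetical order (an assumption).
    counts = Counter(label for _, label in single)
    top = sorted(label for label, _ in counts.most_common(n_classes))
    label_to_index = {label: i for i, label in enumerate(top)}
    kept = [(img, label_to_index[lab]) for img, lab in single if lab in label_to_index]

    # Step 3: downscale every image to size x size pixels.
    images = np.stack([downscale_nearest(img, size) for img, _ in kept]).astype(np.uint8)
    labels = np.array([idx for _, idx in kept], dtype=np.int64)

    # Step 4: shuffle once, then split into train and test.
    order = np.random.default_rng(seed).permutation(len(images))
    n_test = int(round(test_fraction * len(images)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    # Step 5: the returned NumPy arrays could be written to disk with np.save.
    train = (images[train_idx], labels[train_idx])
    test = (images[test_idx], labels[test_idx])
    return train, test, label_to_index
```

A real pipeline would more likely resize with an image library that applies anti-aliasing; nearest-neighbour sampling is used here only to keep the sketch dependency-free.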

This is the label mapping:

Category          Label
day bed           0
dishrag           1
plate             2
running shoe      3
soap dispenser    4
street sign       5
table lamp        6
tile roof         7
toilet seat       8
washing machine   9
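
For convenience, the mapping can be written down as a small lookup in Python. The names `LABEL_NAMES`, `NAME_TO_LABEL`, and `label_to_name` are hypothetical helpers introduced for this example; only the category/label pairs come from the table above.

```python
# Category names indexed by their integer label, as listed in the table above.
LABEL_NAMES = [
    "day bed",          # 0
    "dishrag",          # 1
    "plate",            # 2
    "running shoe",     # 3
    "soap dispenser",   # 4
    "street sign",      # 5
    "table lamp",       # 6
    "tile roof",        # 7
    "toilet seat",      # 8
    "washing machine",  # 9
]

# Reverse lookup from category name to integer label.
NAME_TO_LABEL = {name: index for index, name in enumerate(LABEL_NAMES)}


def label_to_name(label):
    """Return the category name for an integer label (plain int or NumPy integer)."""
    return LABEL_NAMES[int(label)]
```

For example, `label_to_name(3)` gives `"running shoe"`, which is handy when titling plots of individual images.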

Check out this notebook to see how the subset was created.

The original dataset was downloaded from https://www.kaggle.com/datasets/mlcommons/the-dollar-street-dataset. See https://mlcommons.org/datasets/dollar-street/ for more information.

Files

Files (57.6 MB)

Checksum                                Size
md5:666c2c5964c543e66c121f1288c65941    14.4 MB
md5:1ece5eb015a3e4c42098c4353346ef90    2.5 kB
md5:bfab25210d8eddedb893b75ce5686ad2    43.2 MB
md5:f3adc3502b7e592b09ccbedd6addb849    7.2 kB
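
After downloading, a file can be checked against the MD5 checksums listed here. The snippet below is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the path passed in is whatever local file name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or as plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum(local_path, "md5:666c2c5964c543e66c121f1288c65941")` should return `True` for an intact copy of the corresponding file, where `local_path` stands for the downloaded file's location.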

Additional details

Related works

Is derived from
Dataset: 10.34740/kaggle/dsv/4478812 (DOI)

Software

Programming language
Python

References

  • @misc{william_gaviria_rojas_sudnya_diamos_keertan_ranjan_kini_david_kanter_vijay_janapa_reddi_cody_coleman_2022, title={The Dollar Street Dataset}, url={https://www.kaggle.com/dsv/4478812}, DOI={10.34740/KAGGLE/DSV/4478812}, publisher={Kaggle}, author={William Gaviria Rojas and Sudnya Diamos and Keertan Ranjan Kini and David Kanter and Vijay Janapa Reddi and Cody Coleman}, year={2022} }