There is a newer version of the record available.

Published March 19, 2024 | Version 1.0.0
Dataset Open

Dollar street 10

  • 1. Netherlands eScience Center

Description

The MLCommons Dollar Street Dataset is a collection of images of everyday household items from homes around the world that visually captures the socioeconomic diversity of traditionally underrepresented populations. It consists of public domain data, licensed for academic, commercial and non-commercial usage, under CC-BY and CC-BY-SA 4.0. The dataset was developed because similar datasets lack socioeconomic metadata and are not representative of global diversity.

This is a subset of the original dataset that can be used for multiclass classification with 10 categories. It is designed for teaching, similar to the widely used but unlicensed CIFAR-10 dataset.

These are the preprocessing steps that were performed:

  1. Keep only examples with exactly one imagenet_synonym label
  2. Keep only examples with one of the 10 most frequently occurring labels
  3. Downscale images to 128 x 128 pixels
  4. Split the data into train and test sets
  5. Store as NumPy arrays
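
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code that produced the subset (see the notebook for that). The function names, the nearest-neighbour resizing, the test fraction, the random seed, and the output file names are all assumptions made for the example; assigning class ids in alphabetical order is also an assumption, although it is consistent with the label mapping listed on this page.

```python
import os
import tempfile
from collections import Counter

import numpy as np


def select_top_labels(label_lists, n_classes=10):
    """Steps 1-2: keep single-label examples, then the n most frequent labels.

    Returns the kept example indices, their integer class ids, and the
    label-to-id mapping (ids assigned alphabetically, an assumption).
    """
    single = [(i, labels[0]) for i, labels in enumerate(label_lists) if len(labels) == 1]
    counts = Counter(label for _, label in single)
    top = sorted(label for label, _ in counts.most_common(n_classes))
    class_id = {label: k for k, label in enumerate(top)}
    kept = [(i, class_id[label]) for i, label in single if label in class_id]
    return [i for i, _ in kept], [t for _, t in kept], class_id


def downscale_nearest(image, size=128):
    """Step 3: resize an (H, W, C) array to (size, size, C) by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]


def split_train_test(images, targets, test_fraction=0.2, seed=0):
    """Step 4: shuffle once, then cut into train and test parts."""
    images = np.asarray(images)
    targets = np.asarray(targets)
    order = np.random.default_rng(seed).permutation(len(images))
    n_test = int(round(test_fraction * len(images)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return images[train_idx], targets[train_idx], images[test_idx], targets[test_idx]


if __name__ == "__main__":
    # Synthetic stand-ins for the real images and their imagenet_synonym labels.
    rng = np.random.default_rng(0)
    label_lists = [["plate"], ["plate", "cup"], ["dishrag"], ["plate"], ["dishrag"]]
    raw_images = [rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8) for _ in label_lists]

    indices, targets, class_id = select_top_labels(label_lists, n_classes=2)
    images = np.stack([downscale_nearest(raw_images[i]) for i in indices])
    x_train, y_train, x_test, y_test = split_train_test(images, targets, test_fraction=0.25)

    # Step 5: store as NumPy arrays (file names here are hypothetical).
    with tempfile.TemporaryDirectory() as out_dir:
        np.save(os.path.join(out_dir, "train_images.npy"), x_train)
        np.save(os.path.join(out_dir, "train_labels.npy"), y_train)
        np.save(os.path.join(out_dir, "test_images.npy"), x_test)
        np.save(os.path.join(out_dir, "test_labels.npy"), y_test)
    print(x_train.shape, x_test.shape, class_id)
```

Nearest-neighbour sampling keeps the sketch dependent on NumPy only; an image library with proper interpolation would normally give better-looking thumbnails.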

This is the label mapping:

Category          Label
day bed           0
dishrag           1
plate             2
running shoe      3
soap dispenser    4
street sign       5
table lamp        6
tile roof         7
toilet seat       8
washing machine   9
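
For use in code, the same mapping can be written as a Python dictionary. This is a small convenience sketch; the names `LABEL_TO_CATEGORY` and `CATEGORY_TO_LABEL` are made up for the example and are not part of the dataset files.

```python
# Integer class label -> category name, copied from the mapping above.
LABEL_TO_CATEGORY = {
    0: "day bed",
    1: "dishrag",
    2: "plate",
    3: "running shoe",
    4: "soap dispenser",
    5: "street sign",
    6: "table lamp",
    7: "tile roof",
    8: "toilet seat",
    9: "washing machine",
}

# Reverse lookup: category name -> integer class label.
CATEGORY_TO_LABEL = {name: label for label, name in LABEL_TO_CATEGORY.items()}


def decode_labels(labels):
    """Translate a sequence of integer labels into category names."""
    return [LABEL_TO_CATEGORY[int(label)] for label in labels]
```

For example, `decode_labels([2, 9])` turns a pair of predicted class ids into their category names.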

Check out this notebook to see how the subset was created.

The original dataset was downloaded from https://www.kaggle.com/datasets/mlcommons/the-dollar-street-dataset. See https://mlcommons.org/datasets/dollar-street/ for more information.

Files

Files (230.2 MB)

Checksum                               Size
md5:a37a36b80c9769618e295f185e1df174   57.6 MB
md5:afae4b8fb9021393ba090af8eb94beec   2.5 kB
md5:e29a292ba7bfe101acb7382a523ca46c   172.6 MB
md5:21ce094f851a42ef38ebdfdd3a35dd2d   7.2 kB
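
The MD5 checksums listed for the files can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are made up for the example, and the path passed in is whatever local file name the download was saved under (the file names are not shown in this listing).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:a37a36b80c9769618e295f185e1df174")` would return `True` only if the file at `local_path` is byte-for-byte identical to the published one.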

Additional details

Related works

Is derived from
Dataset: 10.34740/kaggle/dsv/4478812 (DOI)

Software

Programming language
Python

References

  • @misc{william_gaviria_rojas_sudnya_diamos_keertan_ranjan_kini_david_kanter_vijay_janapa_reddi_cody_coleman_2022, title={The Dollar Street Dataset}, url={https://www.kaggle.com/dsv/4478812}, DOI={10.34740/KAGGLE/DSV/4478812}, publisher={Kaggle}, author={William Gaviria Rojas and Sudnya Diamos and Keertan Ranjan Kini and David Kanter and Vijay Janapa Reddi and Cody Coleman}, year={2022} }