Published September 20, 2019 | Version public 1.0
Dataset Open

MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection

Description

This dataset is a sound dataset for malfunctioning industrial machine investigation and inspection (MIMII dataset). It contains sounds generated by four types of industrial machines: valves, pumps, fans, and slide rails. Each machine type includes seven individual product models*1, and the data for each model contain normal sounds (from 5000 to 10000 seconds) and anomalous sounds (about 1000 seconds). To resemble real-life scenarios, various anomalous sounds were recorded (e.g., contamination, leakage, rotating unbalance, and rail damage). In addition, background noise recorded in multiple real factories was mixed with the machine sounds. The sounds were recorded with an eight-channel microphone array at a 16 kHz sampling rate and 16 bits per sample. The MIMII dataset is intended to support benchmarking of sound-based machine fault diagnosis. Users can evaluate performance on specific tasks such as unsupervised anomaly detection, transfer learning, and noise robustness. Details of the dataset are described in [1][2].
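As a quick sanity check after extracting the archives, the recording format stated above (eight channels, 16 kHz, 16 bits per sample) can be compared against a file's header. The following is a minimal sketch only: it assumes the recordings are stored as uncompressed PCM WAV files, which this page does not state explicitly, and the function names `describe_wav` and `matches_described_format` are hypothetical helpers introduced for illustration.

```python
import sys
import wave

# Format taken from the dataset description above.
EXPECTED_CHANNELS = 8
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate_hz": rate,
        "sample_width_bytes": width,
        "duration_s": frames / rate,
    }


def matches_described_format(info):
    """Return True if the header agrees with the format in the description."""
    return (
        info["channels"] == EXPECTED_CHANNELS
        and info["sample_rate_hz"] == EXPECTED_RATE_HZ
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES
    )


if __name__ == "__main__":
    # Usage: python check_wav.py path/to/recording.wav
    info = describe_wav(sys.argv[1])
    print(info, "OK" if matches_described_format(info) else "MISMATCH")
```

Only the standard-library `wave` module is used, so the check runs without installing any audio packages. If the files turn out to use a different container, a different reader would be needed.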

This dataset is made available by Hitachi, Ltd. under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Baseline sample code for anomaly detection is available on GitHub: https://github.com/MIMII-hitachi/mimii_baseline/

*1: This version, "public 1.0", contains four models (model IDs 00, 02, 04, and 06). The remaining three models will be released in a future edition.

[1] Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” arXiv preprint arXiv:1909.09347, 2019.

[2] Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019.

Files

-6_dB_fan.zip

Files (100.2 GB)

MD5 checksum                              Size
md5:f02ae808a58d84b6815b7ec38ff30879      10.9 GB
md5:d20b783a0ff9c93d58f452f98c37b112      8.2 GB
md5:49913eda7d37f182cbf8ed5c984140e0      8.0 GB
md5:fdfaf185fea61b21e11952a070a4ada7      8.0 GB
md5:6354d1cc2165c52168f9ef1bcd9c7c52      10.4 GB
md5:488748295c3f60b25de07b58fe75b049      7.9 GB
md5:4d674c21474f0646ecd75546db6c0c4e      7.5 GB
md5:178478eb0d11c79080a35562bfdeee71      7.5 GB
md5:0890f7d3c2fd8448634e69ff1d66dd47      10.2 GB
md5:a09ba6060c10fc09cd4c8770213b0b9f      7.7 GB
md5:838c2b3441858359c4704ef13a1b27ff      7.1 GB
md5:fe5fb7c337cd701b1d31dc641e621892      6.9 GB
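
Because the archives are several gigabytes each, it may be worth verifying a download against its listed MD5 checksum before extracting it. The sketch below is one way to do this with Python's standard library. The helper names `md5_of_file` and `verify_md5` are hypothetical, and the expected checksum must be taken from the entry for the file actually downloaded.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 with an expected value.

    `expected` may be given with or without the "md5:" prefix
    used in the file listing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    # Usage: python verify_md5.py <downloaded archive> <md5 from the listing>
    archive_path, listed_checksum = sys.argv[1], sys.argv[2]
    print("OK" if verify_md5(archive_path, listed_checksum) else "MISMATCH")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.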
