Published June 1, 2020 | Version 1.0
Dataset Open

DCASE 2020 Challenge Task 2 Evaluation Dataset

Description

This dataset is the "evaluation dataset" for the DCASE 2020 Challenge Task 2, "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" [task description].

Three datasets have been released for the task: the "development dataset", the "additional training dataset", and the "evaluation dataset". This evaluation dataset was the last of the three to be released. It includes around 400 samples for each combination of Machine Type and Machine ID, none of which has a condition label (i.e., normal or anomaly).

The recording procedure and data format are the same as those of the development dataset and the additional training dataset. The Machine IDs in this dataset are the same as those in the additional training dataset. For more information, please see the pages of the development dataset and the task description.

After the DCASE 2020 Challenge, we released the ground truth for this evaluation dataset.

 

Directory structure

Once you unzip the files downloaded from Zenodo, you will see the following directory structure. The Machine Type is given by the directory name and the Machine ID by the file name; the file names carry no condition information:

/eval_data

  • /ToyCar
    • /test  (Normal and anomaly data for all Machine IDs are included, but they do not have a condition label.)
      • /id_05_00000000.wav
      • ...
      • /id_05_00000514.wav
      • /id_06_00000000.wav
      • ...
      • /id_07_00000514.wav
  • /ToyConveyor (The other Machine Types have the same directory structure as ToyCar.)
  • /fan
  • /pump
  • /slider
  • /valve

 

The paths of audio files are:

  • "/eval_data/<Machine_Type>/test/id_<Machine_ID>_[0-9]+.wav"

For example, the Machine Type and Machine ID of "/ToyCar/test/id_05_00000000.wav" are "ToyCar" and "05", respectively. Unlike the development dataset and additional training dataset, its condition label is hidden. 
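The path convention above can be read programmatically. The following is a minimal Python sketch, not part of the official baseline; the function names `parse_eval_path` and `group_by_machine` are hypothetical, and the regular expression simply assumes that every file follows the "<Machine_Type>/test/id_<Machine_ID>_[0-9]+.wav" pattern stated above.

```python
import re
from collections import defaultdict

# Mirrors the documented pattern: <Machine_Type>/test/id_<Machine_ID>_[0-9]+.wav
# (assumption: Machine IDs are purely numeric, as in the example "05").
_PATH_RE = re.compile(
    r"(?P<machine_type>[^/]+)/test/id_(?P<machine_id>[0-9]+)_(?P<index>[0-9]+)\.wav$"
)


def parse_eval_path(path):
    """Return (machine_type, machine_id, clip_index) as strings.

    Raises ValueError if the path does not follow the documented pattern.
    """
    normalized = str(path).replace("\\", "/")  # tolerate Windows separators
    match = _PATH_RE.search(normalized)
    if match is None:
        raise ValueError(f"unexpected path layout: {path!r}")
    return match["machine_type"], match["machine_id"], match["index"]


def group_by_machine(paths):
    """Group paths by (machine_type, machine_id), preserving input order."""
    groups = defaultdict(list)
    for p in paths:
        machine_type, machine_id, _ = parse_eval_path(p)
        groups[(machine_type, machine_id)].append(p)
    return dict(groups)


if __name__ == "__main__":
    print(parse_eval_path("/ToyCar/test/id_05_00000000.wav"))
    # ('ToyCar', '05', '00000000')
```

Because the file names carry no condition label, the parsed tuple is all the metadata a path provides; grouping by (Machine Type, Machine ID) is a convenient way to score each machine separately.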

 

Baseline system

A simple baseline system is available in the GitHub repository [URL]. It provides an entry-level approach that gives reasonable performance on the Task 2 dataset, and it is a good starting point, especially for researchers who are new to the anomalous-sound-detection task.

 

Conditions of use

This dataset was created jointly by NTT Corporation and Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

 

Publication

If you use this dataset, please cite all three of the following papers:

Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. [pdf]

Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, "MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection," in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019. [pdf]

Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]


Feedback

If there is any problem, please contact us:

Files

eval_data_test_fan.zip

Files (1.9 GB)

MD5 checksum and size of each file:

  • md5:1eb9356a768cadfd0f2e59a5c57e578b (328.0 MB)
  • md5:23a8f8f924be218c69df67fe07360348 (191.4 MB)
  • md5:0193c769073840332ce3aad84b1ccaa2 (204.3 MB)
  • md5:bea2bdd612f616be8f2b8eb087b32c7c (442.9 MB)
  • md5:ccfcc9847c7d3404d7ee9d6d9e0b2ba2 (491.8 MB)
  • md5:f69c551d2088691050d20cccca9d631c (227.9 MB)
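The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal Python sketch using only the standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical. The listing here does not show which checksum belongs to which archive, so the pairing of file and checksum must be taken from the Zenodo page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's digest with a listed value such as 'md5:<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(archive_path, listed_checksum)` returns `True` when `archive_path` is a downloaded zip file and `listed_checksum` is the value shown beside that same file on the download page; a `False` result suggests a truncated or corrupted download.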

Additional details

References

  • Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019.
  • Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, "MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection," in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019.
  • Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in arXiv e-prints: 2006.05822, 2020.