Published June 1, 2021 | Version 1.0
Dataset Open

DCASE 2021 Challenge Task 2 Evaluation Dataset

  • 1. Hitachi, Ltd.
  • 2. Doshisha University
  • 3. Google LLC
  • 4. NTT Corporation

Description

This dataset is the "evaluation dataset" for the DCASE 2021 Challenge Task 2, "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions".

Three datasets have been released for the task: the "development dataset", the "additional training dataset", and the "evaluation dataset". This evaluation dataset was the last of the three to be released. It includes around 200 samples for each machine type, section index, and domain, none of which have a condition label (i.e., normal or anomalous).

The recording procedure and data format are the same as those of the development dataset and the additional training dataset. The section indices in this dataset are the same as those in the additional training dataset. For more information, please see the pages of the development dataset and the task description.

After the DCASE 2021 Challenge, we released the ground truth for this evaluation dataset.

 

Directory structure

After you unzip the files downloaded from Zenodo, you will see the following directory structure. The machine type is given by the directory name, and the section index and domain are given by the file name (the file names do not include a condition label), as follows:

  •  /eval_data
    • /fan
      • /source_test (Normal and anomaly data are included, but they do not have a condition label.)
        • /section_03_source_test_0000.wav
        • ... 
        • /section_03_source_test_0199.wav
        • /section_04_source_test_0000.wav
        • ...
        • /section_05_source_test_0199.wav
      • /target_test (Normal and anomaly data are included, but they do not have a condition label.)
        • /section_03_target_test_0000.wav
        • ... 
        • /section_03_target_test_0199.wav
        • /section_04_target_test_0000.wav
        • ...
        • /section_05_target_test_0199.wav
    • /gearbox (The other machine types have the same directory structure as fan.)
    • /pump
    • /slider
    • /ToyCar
    • /ToyTrain
    • /valve  
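As a quick sanity check after unzipping, the layout above can be walked to count the clips per machine type and domain directory. The following is only a minimal sketch: the function name `count_clips` and the `eval_root` argument are hypothetical, and it assumes the archives were extracted so that the `eval_data/<machine_type>/<domain>_test/` nesting shown above is preserved.

```python
from collections import Counter
from pathlib import Path


def count_clips(eval_root):
    """Count .wav files per (machine_type, domain_dir) under eval_root.

    eval_root is assumed to be the extracted "eval_data" directory.
    """
    counts = Counter()
    # Two directory levels below the root: <machine_type>/<domain>_test/*.wav
    for wav in Path(eval_root).glob("*/*/*.wav"):
        domain_dir = wav.parent.name           # e.g. "source_test"
        machine_type = wav.parent.parent.name  # e.g. "fan"
        counts[(machine_type, domain_dir)] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(count_clips("eval_data").items()):
        print(key, n)
```

Because the description gives only an approximate figure of around 200 samples per machine type, section, and domain, the counts are best read as a rough check rather than compared against an exact number.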

The paths of audio files are:

  • "/eval_data/<machine_type>/source_test/section_[0-9]+_source_test_[0-9]+.wav"
  • "/eval_data/<machine_type>/target_test/section_[0-9]+_target_test_[0-9]+.wav"

For example, the machine type, section, and domain of "/fan/source_test/section_03_source_test_0018.wav" are "fan", "section 03", and "source", respectively.
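The naming convention can be turned into a small parser. This is an illustrative sketch, not part of the dataset or the baseline systems: `ClipInfo` and `parse_clip_path` are hypothetical names, and the regular expression is adapted from the path patterns above, with the added assumption that the domain in the directory name and in the file name always agree.

```python
import re
from typing import NamedTuple, Optional


class ClipInfo(NamedTuple):
    machine_type: str  # directory name, e.g. "fan"
    section: str       # zero-padded section index, e.g. "03"
    domain: str        # "source" or "target"
    index: int         # clip number within the section and domain


# <machine_type>/<domain>_test/section_<NN>_<domain>_test_<NNNN>.wav
_CLIP_RE = re.compile(
    r"(?:^|/)(?P<machine_type>[^/]+)/(?P<domain>source|target)_test/"
    r"section_(?P<section>[0-9]+)_(?P=domain)_test_(?P<index>[0-9]+)\.wav$"
)


def parse_clip_path(path: str) -> Optional[ClipInfo]:
    """Return the fields encoded in a clip path, or None if it does not match."""
    m = _CLIP_RE.search(path.replace("\\", "/"))  # tolerate Windows separators
    if m is None:
        return None
    return ClipInfo(m["machine_type"], m["section"], m["domain"], int(m["index"]))


if __name__ == "__main__":
    print(parse_clip_path("/fan/source_test/section_03_source_test_0018.wav"))
```

The section index is kept as a string so that the zero padding used in the file names ("03") is preserved.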

 

Baseline system

Two simple baseline systems are available in the GitHub repositories [URL] and [URL]. They provide a simple entry-level approach that gives reasonable performance on the Task 2 dataset. They are good starting points, especially for entry-level researchers who want to become familiar with the anomalous-sound-detection task.

 

Conditions of use

This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

 

Publication

If you use this dataset, please cite all three of the following papers:

  • Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo, "Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions," in arXiv e-prints: 2106.04492, 2021. [URL]
  • Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, Shoichiro Saito, "ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions," in arXiv e-prints: 2106.02369, 2021. [URL]
  • Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in arXiv e-prints: 2105.02702, 2021. [URL]


Feedback

If there is any problem, please contact us:

Files

eval_data_fan_test.zip

Files (2.2 GB)

MD5 checksum (size)

  • md5:933cae49ac0fe9c3cb759c386b21cd11 (303.5 MB)
  • md5:ee1cd888f44d0d3f865c5701f3b8f002 (389.4 MB)
  • md5:40d9dce5f150afebb01de7123d63130d (304.6 MB)
  • md5:90eb36feb333c854459f2f3e0ef147d7 (318.1 MB)
  • md5:16be66b9c7a51a3fe73356e8f1691dc5 (356.1 MB)
  • md5:513a219b708f86c21e163877fac9ec04 (273.8 MB)
  • md5:9f9ad80365b20fae46540502afad1d34 (302.5 MB)
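The MD5 values listed above can be used to check that a downloaded archive is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical. Which checksum belongs to which archive has to be read from the Zenodo record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as "md5:933c...".

    The optional "md5:" prefix is stripped and the comparison ignores case.
    """
    expected = listed[4:] if listed.startswith("md5:") else listed
    return md5_of_file(path) == expected.strip().lower()
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.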
