Published April 1, 2021 | Version 1.0
Dataset Open

DCASE 2021 Challenge Task 2 Additional Training Dataset

  • 1. Hitachi, Ltd.
  • 2. Doshisha University
  • 3. Google LLC
  • 4. NTT Corporation

Description

This dataset is the "additional training dataset" for the DCASE 2021 Challenge Task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions".

In the task, three datasets have been or will be released: "development dataset", "additional training dataset", and "evaluation dataset". This additional training dataset was released before the "evaluation dataset". This dataset includes around 1,000 normal samples for each machine type and section index used in the evaluation dataset and can be used for model training in advance.

The recording procedure and data format are the same as those of the development dataset. The section indices in this dataset are different from those in the development dataset. For more information, please see the pages of the development dataset and the task description.

 

Directory structure

Once you unzip the downloaded files from Zenodo, you can see the following directory structure. The machine type information is given by directory name, and the section index, domain, and condition information are given by file name, as:

  •  /eval_data
    • /fan
      • /train (only normal clips)
        • /section_03_source_train_normal_0000_<attribute>.wav
        • ... 
        • /section_03_source_train_normal_0999_<attribute>.wav
        • /section_03_target_train_normal_0000_<attribute>.wav
        • /section_03_target_train_normal_0001_<attribute>.wav
        • /section_03_target_train_normal_0002_<attribute>.wav
        • /section_04_source_train_normal_0000_<attribute>.wav
        • ...
        • /section_05_target_train_normal_0999_<attribute>.wav
    • /gearbox (The other machine types have the same directory structure as fan.)
    • /pump
    • /slider
    • /ToyCar
    • /ToyTrain
    • /valve  

The paths of audio files are:

  • "/eval_data/<machine_type>/train/section_[0-9]+_<domain>_train_normal_[0-9]+_<attribute>.wav"

For example, the machine type, section, and domain of "/fan/train/section_03_source_train_normal_0108_strenght_1_ambient.wav" are "fan", "section 03", and "source", respectively, and its condition is normal.
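The naming rule above can be turned into a small parser. The following is a minimal sketch, assuming that the domain is always "source" or "target" (as in the directory listing) and that everything after the clip number is the attribute string. The names `ClipInfo` and `parse_clip_path` are hypothetical and are not part of the dataset or the baseline systems.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Pattern derived from the path template quoted above.
# Assumption: the domain is either "source" or "target", and the attribute
# is whatever follows the clip number up to the ".wav" extension.
_CLIP_RE = re.compile(
    r"^section_(?P<section>[0-9]+)_(?P<domain>source|target)"
    r"_train_normal_(?P<clip_id>[0-9]+)_(?P<attribute>.+)\.wav$"
)


class ClipInfo(NamedTuple):
    """Hypothetical container for the fields encoded in a clip path."""
    machine_type: str
    section: str
    domain: str
    clip_id: str
    attribute: str


def parse_clip_path(path: str) -> Optional[ClipInfo]:
    """Return the fields encoded in a clip path, or None if it does not match."""
    p = PurePosixPath(path)
    m = _CLIP_RE.match(p.name)
    # The file must sit in ".../<machine_type>/train/".
    if m is None or p.parent.name != "train":
        return None
    machine_type = p.parent.parent.name
    return ClipInfo(machine_type, m["section"], m["domain"],
                    m["clip_id"], m["attribute"])


# The example path from the text above, copied verbatim.
info = parse_clip_path(
    "/fan/train/section_03_source_train_normal_0108_strenght_1_ambient.wav"
)
print(info)
```

The condition is not returned because every clip in this dataset is a normal training clip. Section and clip numbers are kept as strings so that the zero padding used in the file names is preserved.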

 

Baseline system

Two simple baseline systems are available in the GitHub repositories [URL] and [URL]. They provide a simple entry-level approach that gives reasonable performance on the Task 2 dataset. They are good starting points, especially for entry-level researchers who want to get familiar with the anomalous-sound-detection task.

 

Conditions of use

This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

 

Publication

If you use this dataset, please cite all three of the following papers:

  • Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo, "Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions," in arXiv e-prints: 2106.04492, 2021. [URL]
  • Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito, "ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions," in arXiv e-prints: 2106.02369, 2021. [URL]
  • Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in arXiv e-prints: 2105.02702, 2021. [URL]


Feedback

If there is any problem, please contact us:

Files

eval_data_fan_train.zip

Files (5.5 GB)

MD5 checksum and size of each file:

  • md5:57afe982988072fa2c96faefd5516a73 (759.8 MB)
  • md5:a3ce10fd1d8a7ce668dd0171ddf3d21f (847.6 MB)
  • md5:a1fad9e61ba0d1fde48aefd02aa2c1fa (762.9 MB)
  • md5:3b1ff6887aceb50d51f901704dc63e33 (792.0 MB)
  • md5:2cf0eb8518b85c7cefb052a8dc204629 (894.5 MB)
  • md5:277e6f2aa857e3152c4bea5f1eb4bb6d (684.2 MB)
  • md5:ddf7513b752946acd07c75ad8fce7dc4 (758.3 MB)
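
The MD5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `matches_listed_md5` are hypothetical, and since the listing above does not say which checksum belongs to which archive, the expected value has to be taken from the record page for the file you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large archives are never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:57af...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("eval_data_fan_train.zip", "md5:<value listed for that file>")` returns `True` only if the local archive is byte-for-byte identical to the published one.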
