DCASE 2021 Challenge Task 2 Additional Training Dataset
Creators
- 1. Hitachi, Ltd.
- 2. Doshisha University
- 3. Google LLC
- 4. NTT Corporation
Description
This dataset is the "additional training dataset" for the DCASE 2021 Challenge Task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions".
In the task, three datasets have been or will be released: "development dataset", "additional training dataset", and "evaluation dataset". This additional training dataset was released before the "evaluation dataset". This dataset includes around 1,000 normal samples for each machine type and section index used in the evaluation dataset and can be used for model training in advance.
The recording procedure and data format are the same as those of the development dataset, but the section indices in this dataset differ from those in the development dataset. For more information, please see the pages of the development dataset and the task description.
Directory structure
Once you unzip the downloaded files from Zenodo, you can see the following directory structure. The machine type information is given by directory name, and the section index, domain, and condition information are given by file name, as:
- /eval_data
    - /fan
        - /train (only normal clips)
            - /section_03_source_train_normal_0000_<attribute>.wav
            - ...
            - /section_03_source_train_normal_0999_<attribute>.wav
            - /section_03_target_train_normal_0000_<attribute>.wav
            - /section_03_target_train_normal_0001_<attribute>.wav
            - /section_03_target_train_normal_0002_<attribute>.wav
            - /section_04_source_train_normal_0000_<attribute>.wav
            - ...
            - /section_05_target_train_normal_0999_<attribute>.wav
    - /gearbox (The other machine types have the same directory structure as fan.)
    - /pump
    - /slider
    - /ToyCar
    - /ToyTrain
    - /valve
The paths of audio files are:
- "/eval_data/<machine_type>/train/section_[0-9]+_<domain>_train_normal_[0-9]+_<attribute>.wav"
For example, the machine type, section, and domain of "/fan/train/section_03_source_train_normal_0108_strenght_1_ambient.wav" are "fan", "section 03", and "source", respectively, and its condition is normal.
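The naming convention above can be decoded programmatically. The following is a minimal sketch, not part of the official tooling: the names `FILENAME_RE`, `ClipInfo`, and `parse_clip_path` are hypothetical, and the pattern assumes that the domain is always either "source" or "target" and that the attribute part is free-form text.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Mirrors the file-name convention described above.
# Assumptions: the domain is "source" or "target"; the attribute is free-form.
FILENAME_RE = re.compile(
    r"^section_(?P<section>[0-9]+)_(?P<domain>source|target)_train_normal_"
    r"(?P<clip>[0-9]+)_(?P<attribute>.+)\.wav$"
)


class ClipInfo(NamedTuple):
    machine_type: str  # taken from the directory name, e.g. "fan"
    section: str       # zero-padded section index, e.g. "03"
    domain: str        # "source" or "target"
    clip_index: int    # running clip number within the section and domain
    attribute: str     # remaining attribute text from the file name


def parse_clip_path(path: str) -> Optional[ClipInfo]:
    """Return the fields encoded in a clip path, or None if it does not match."""
    p = PurePosixPath(path)
    match = FILENAME_RE.match(p.name)
    # The machine type is the directory that contains "train".
    if match is None or p.parent.name != "train":
        return None
    return ClipInfo(
        machine_type=p.parent.parent.name,
        section=match["section"],
        domain=match["domain"],
        clip_index=int(match["clip"]),
        attribute=match["attribute"],
    )


info = parse_clip_path(
    "/fan/train/section_03_source_train_normal_0108_strenght_1_ambient.wav"
)
print(info)
```

For the example path, the parsed fields are machine type "fan", section "03", and domain "source". All clips in this dataset are normal, so the condition is not stored as a separate field.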
Baseline system
Two simple baseline systems are available in the GitHub repositories [URL] and [URL]. They provide a simple entry-level approach that gives reasonable performance on the Task 2 dataset. They are good starting points, especially for entry-level researchers who want to get familiar with the anomalous-sound-detection task.
Conditions of use
This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Publication
If you use this dataset, please cite all three of the following papers:
- Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo, "Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions," in arXiv e-prints: 2106.04492, 2021. [URL]
- Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito, "ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions," in arXiv e-prints: 2106.02369, 2021. [URL]
- Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in arXiv e-prints: 2105.02702, 2021. [URL]
Feedback
If there is any problem, please contact us:
- Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com
- Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
- Keisuke Imoto, keisuke.imoto@ieee.org
Files
(5.5 GB in total)
MD5 checksum | Size |
---|---|
md5:57afe982988072fa2c96faefd5516a73 | 759.8 MB |
md5:a3ce10fd1d8a7ce668dd0171ddf3d21f | 847.6 MB |
md5:a1fad9e61ba0d1fde48aefd02aa2c1fa | 762.9 MB |
md5:3b1ff6887aceb50d51f901704dc63e33 | 792.0 MB |
md5:2cf0eb8518b85c7cefb052a8dc204629 | 894.5 MB |
md5:277e6f2aa857e3152c4bea5f1eb4bb6d | 684.2 MB |
md5:ddf7513b752946acd07c75ad8fce7dc4 | 758.3 MB |
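The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and because the listing does not show which archive each checksum belongs to, the pairing of file and checksum is left to the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:57afe9...'."""
    # Drop the optional "md5:" prefix and normalize case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_md5("<downloaded archive>.zip", "md5:<listed value>")` returns `True` when the file on disk matches the listed checksum. Reading in chunks keeps memory use small even for archives of several hundred megabytes.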