Published July 20, 2020 | Version 1.0
Dataset Open

Ground Truth for DCASE 2020 Challenge Task 2 Evaluation Dataset

Description

Description

This data is the ground truth for the "evaluation dataset" for the DCASE 2020 Challenge Task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" [task description]

In the task, three datasets have been released: "development dataset", "additional training dataset", and "evaluation dataset". The evaluation dataset was the last of the three released and includes around 400 samples for each Machine Type and Machine ID used in the evaluation dataset, none of which have any condition label (i.e., normal or anomaly). This ground truth data contains the condition labels.

 

Data format

The ground truth data is a CSV file like the following:

---------------------------------

fan
id_01_00000000.wav,normal_id_01_00000098.wav,0
id_01_00000001.wav,anomaly_id_01_00000064.wav,1
...

id_05_00000456.wav,anomaly_id_05_00000033.wav,1
id_05_00000457.wav,normal_id_05_00000049.wav,0
pump
id_01_00000000.wav,anomaly_id_01_00000049.wav,1
id_01_00000001.wav,anomaly_id_01_00000039.wav,1
...

id_05_00000346.wav,anomaly_id_05_00000052.wav,1
id_05_00000347.wav,anomaly_id_05_00000080.wav,1
slider
id_01_00000000.wav,anomaly_id_01_00000035.wav,1
id_01_00000001.wav,anomaly_id_01_00000176.wav,1
...

---------------------------------

"Fan", "pump", "slider", etc mean "Machine Type" names. The lines following a Machine Type correspond to pairs of a wave file in the Machine Type and a condition label. The first column shows the name of a wave file. The second column shows the original name of the wave file, but this can be ignored by users. The third column shows the condition label (i.e., 0: normal or 1: anomaly).

 

How to use

A system for calculating AUC and pAUC scores for the "evaluation dataset" is available on the Github repository [URL]. The ground truth data is used by this system. For more information, please see the Github repository.

 

Conditions of use

This dataset was created jointly by NTT Corporation and Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

 

Publication

If you use this dataset, please cite all the following three papers:

Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. [pdf]

Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019. [pdf]

Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring,"  in Proc. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]


Feedback

If there is any problem, please contact us:

Files

eval_data_list.csv

Files (338.4 kB)

Name Size Download all
md5:f16aaa185df9906e588864aaf7cb58ea
338.4 kB Preview Download

Additional details

References

  • Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019.
  • Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, "MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection," in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019.
  • Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring,"  in arXiv e-prints: 2006.05822, 2020.