DCASE 2023 Challenge Task 2 Evaluation Dataset
Creators
- 1. Hitachi Ltd.
- 2. Doshisha University
- 3. NTT Corporation
- 4. Google, Inc.
Description
This dataset is the "evaluation dataset" for the DCASE 2023 Challenge Task 2 "First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring".
The data consists of the normal/anomalous operating sounds of seven types of real/toy machines. Each recording is a single-channel audio that includes both a machine's operating sound and environmental noise. The duration of recordings varies from 6 to 18 sec, depending on the machine type. The following seven types of real/toy machines are used:
- Vacuum
- ToyTank
- ToyNscale
- ToyDrone
- bandsaw
- grinder
- shaker
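As a quick sanity check on the stated 6 to 18 sec range, the duration of a clip can be read from its WAV header. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the clips are uncompressed PCM WAV files that the `wave` module can open.

```python
import wave


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_expected_duration(seconds, low=6.0, high=18.0):
    """Check a duration against the 6 to 18 sec range given in the description."""
    return low <= seconds <= high
```

For example, `is_expected_duration(clip_duration_seconds("section_00_0001.wav"))` would flag a clip whose length falls outside the documented range.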
Definition
We first define key terms in this task: "machine type," "section," "source domain," "target domain," and "attributes."
- "Machine type" indicates the type of machine, which in the development dataset is one of seven: fan, gearbox, bearing, slide rail, valve, ToyCar, and ToyTrain.
- A section is defined as a subset of the dataset for calculating performance metrics.
- The source domain is the domain under which most of the training data and some of the test data were recorded, and the target domain is a different set of domains under which some of the training data and some of the test data were recorded. There are differences between the source and target domains in terms of operating speed, machine load, viscosity, heating temperature, type of environmental noise, signal-to-noise ratio, etc.
- Attributes are parameters that define states of machines or types of noise.
Dataset
This dataset consists of seven machine types. For each machine type, one section is provided, and the section is a complete set of training and test data. For each section, this dataset provides (i) 990 clips of normal sounds in the source domain for training and (ii) ten clips of normal sounds in the target domain for training. The source/target domain of each sample is provided. Additionally, the attributes of each sample in the training and test data are provided in the file names and attribute CSV files.
Recording procedure
Normal/anomalous operating sounds of machines and their related equipment were recorded. Anomalous sounds were collected by deliberately damaging target machines. To simplify the task, we use only the first channel of multi-channel recordings; all recordings are regarded as single-channel recordings of a fixed microphone. We mixed a target machine sound with environmental noise, and only noisy recordings are provided as training/test data. The environmental noise samples were recorded in several real factory environments. We will publish papers on the dataset to explain the details of the recording procedure by the submission deadline.
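The two processing steps described above (keeping the first channel, then mixing with environmental noise) can be illustrated as follows. This is only a sketch under stated assumptions: the text does not say how the mixing level was chosen, so scaling the noise to a target signal-to-noise ratio is an assumption, as are the function names, the `(num_samples, num_channels)` array layout, and the use of NumPy.

```python
import numpy as np


def first_channel(recording):
    """Keep only channel 0 of an array shaped (num_samples, num_channels)."""
    recording = np.asarray(recording, dtype=np.float64)
    return recording if recording.ndim == 1 else recording[:, 0]


def mix_at_snr(machine, noise, snr_db):
    """Add noise to a machine sound, scaled to a target SNR in dB.

    Assumption: the noise is at least as long as the machine sound and is
    not silent. The actual mixing rule used for the dataset is not stated.
    """
    noise = noise[: len(machine)]
    machine_power = np.mean(machine ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that machine_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(machine_power / (noise_power * 10 ** (snr_db / 10)))
    return machine + gain * noise
```

A lower `snr_db` yields a noisier clip, which is one of the ways the source and target domains are said to differ.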
Directory structure
- /dev_data
  - /raw
    - /Vacuum
      - /test
        - /section_00_0001.wav
        - ...
        - /section_00_0200.wav
    - /ToyTank (The other machine types have the same directory structure as Vacuum.)
    - /ToyNscale
    - /ToyDrone
    - /bandsaw
    - /grinder
    - /shaker
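Given this layout, the test clips of each machine type can be collected with a short helper. This is a minimal sketch: the function name is hypothetical, `root` is assumed to point at the top-level directory shown above, and the `section_*.wav` pattern is inferred from the two file names in the listing.

```python
from pathlib import Path

# Machine types as listed in the description.
MACHINE_TYPES = [
    "Vacuum", "ToyTank", "ToyNscale", "ToyDrone",
    "bandsaw", "grinder", "shaker",
]


def list_test_clips(root):
    """Map each machine type to its sorted test clips under <root>/raw/<type>/test."""
    root = Path(root)
    return {
        machine: sorted((root / "raw" / machine / "test").glob("section_*.wav"))
        for machine in MACHINE_TYPES
    }
```

Calling `list_test_clips(...)` on the extracted top-level directory and printing `len(clips)` per machine type is a simple way to confirm that every archive was extracted completely.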
Condition of use
This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Citation
If you use this dataset, please cite all of the following papers. We will publish a paper describing DCASE 2023 Task 2, so please make sure to cite that paper, too.
- Noboru Harada, Daisuke Niizumi, Yasunori Ohishi, Daiki Takeuchi, and Masahiro Yasuda. First-shot anomaly detection for machine condition monitoring: A domain generalization baseline. In arXiv e-prints: 2303.00455, 2023. [URL]
- Kota Dohi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yuki Nikaido, and Yohei Kawaguchi. MIMII DG: sound dataset for malfunctioning industrial machine investigation and inspection for domain generalization task. In Proceedings of the 7th Detection and Classification of Acoustic Scenes and Events 2022 Workshop (DCASE2022), 31–35. Nancy, France, November 2022. [URL]
- Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito. ToyADMOS2: another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions. In Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 1–5. Barcelona, Spain, November 2021. [URL]
Contact
If there is any problem, please contact us:
- Kota Dohi, kota.dohi.gr@hitachi.com
- Keisuke Imoto, keisuke.imoto@ieee.org
- Noboru Harada, noboru@ieee.org
- Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
- Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com
Files
eval_data_bandsaw_test.zip
Total size: 397.6 MB
Checksum | Size |
---|---|
md5:2a8e8f39f6584ab366a8f4da52d4d7a6 | 54.2 MB |
md5:631b3e1608b6077772829a6e68c82c77 | 54.0 MB |
md5:ba98c98caa96051ec80e24e44b8fca56 | 53.7 MB |
md5:fdae7b8d1f4cadb2bea88bc93e2367db | 93.6 MB |
md5:62f5f5043d8fb3a305b1c2e1025872de | 29.1 MB |
md5:f5639bf58c47169c622751f19c6fc321 | 41.1 MB |
md5:a32524fd8c45b574a560685b38acc4e1 | 71.9 MB |
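The MD5 values listed above can be used to confirm that a downloaded archive is intact. The following is a minimal sketch with Python's standard `hashlib`; the function names are hypothetical, and which checksum belongs to which archive must be taken from the download page, since that pairing is not shown here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before extraction.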