DCASE 2026 Challenge Task 2 Evaluation Dataset

Nishida, Tomoya; Harada, Noboru; Takeuchi, Daiki; Niizumi, Daisuke; Imoto, Keisuke; Dohi, Kota; Purohit, Harsh; Endo, Takashi; Kawaguchi, Yohei

doi:10.5281/zenodo.20437238

Published June 1, 2026 | Version v1

Dataset Open

1. Hitachi Ltd.
2. NTT, Inc.
3. SB Intuitions Corp.
4. Kyoto University
5. Hitachi Australia PTY Ltd.

Description

This dataset is the "evaluation dataset" for the DCASE 2026 Challenge Task 2.

The data consist of normal and anomalous operating sounds of five types of real and toy machines. Each recording is a two-channel audio clip of 6, 10, or 16 seconds that includes both a machine's operating sound and environmental noise. The following five types of real/toy machines are used in this task:

ToyDrone
ToothBrush
SewingMachine
BlowerDustCollector
Sander

Overview of the task

Anomalous sound detection (ASD) is the task of identifying whether the sound emitted from a target machine is normal or anomalous. Automatic detection of mechanical failure is an essential technology in the fourth industrial revolution, which involves artificial-intelligence-based factory automation. Prompt detection of machine anomalies by observing sounds is useful for monitoring the condition of machines.

This task is the follow-up to the Task 2 series that ran from DCASE 2020 Task 2 to DCASE 2025 Task 2. The task this year is to develop an ASD system that meets the following five requirements.

1. Train a model using only normal sound (unsupervised learning scenario)
Because anomalies rarely occur and are highly diverse in real-world factories, it can be difficult to collect exhaustive patterns of anomalous sounds. Therefore, the system must detect unknown types of anomalous sounds that are not provided in the training data, which is called UASD (unsupervised ASD). This is the same requirement as in the previous tasks.
2. Detect anomalies regardless of domain shifts (domain generalization task)
In real-world cases, the operational states of a machine or the environmental noise can change and cause domain shifts. Domain-generalization techniques can be useful for handling domain shifts that occur frequently or are hard to notice. In this task, the system is required to use domain-generalization techniques for handling these domain shifts. This requirement has been in place since DCASE 2022 Task 2.
3. Train a model for a machine type unseen in the development phase
For unseen machine types, hyperparameters of the trained model cannot be tuned. Therefore, the system should be able to train models without additional hyperparameter tuning. This requirement has been in place since DCASE 2023 Task 2.
4. Train a model both with and without attribute information
While additional attribute information can help enhance the detection performance, we cannot always obtain such information. Therefore, the system must work well both when attribute information is available and when it is not.
5. Training and inference with two-channel audio recorded at different distances from the target machine
In practical cases, the target machine sounds may be recorded using multiple microphones. Participants may leverage synchronized recordings captured by microphones placed both near to and far from the target machine to help develop systems that are robust to background noise.

The last requirement, which is optional, is newly introduced in DCASE 2026 Task 2.

Noise-aware UASD: Focus of the task

The focus of this year's task is on building noise-robust UASD systems by exploiting recordings captured simultaneously by multiple microphones placed at different distances from the target machine. Improving ASD performance in noisy conditions is essential for real-world deployment, since factory environments are acoustically complex with multiple machines operating simultaneously. However, achieving strong performance on noisy recordings remains challenging.

To support research in this direction, this challenge provides synchronized multi-microphone recordings taken from different locations relative to the target machine. Differences in microphone distance tend to produce differences in SNR and spectral characteristics. These differences can serve as cues for distinguishing the target machine components from background noise, and thus can potentially be leveraged to improve robustness against environmental noise.
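As a rough illustration of the cue described above, the following sketch compares the mean power of the near and far channels in decibels. It is not part of the baseline system or the challenge tooling. The function name level_difference_db and the representation of each channel as a plain sequence of samples are assumptions made for this example.

```python
import math


def level_difference_db(near, far, eps=1e-12):
    """Return the near-minus-far mean-power difference in dB.

    `near` and `far` are sequences of samples from the same recording
    (hypothetical inputs; any numeric sequence works).
    """
    if len(near) == 0 or len(far) == 0:
        raise ValueError("both channels must contain samples")
    p_near = sum(x * x for x in near) / len(near)
    p_far = sum(x * x for x in far) / len(far)
    # eps avoids log(0) and division by zero for silent channels.
    return 10.0 * math.log10((p_near + eps) / (p_far + eps))
```

A positive value means the near channel carries more energy than the far channel. Computing the same quantity per short frame or per frequency band, rather than over the whole clip, is one possible way to look for components dominated by the target machine.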

Definition

We first define key terms in this task: "machine type," "section," "source domain," "target domain," and "attributes."

"Machine type" indicates the type of machine, which in this dataset is one of five: ToyDrone, ToothBrush, SewingMachine, BlowerDustCollector, and Sander.
A section is defined as a subset of the dataset for calculating performance metrics.
The source domain is the domain under which most of the training data and some of the test data were recorded, and the target domain is a different set of domains under which some of the training data and some of the test data were recorded. There are differences between the source and target domains in terms of operating speed, machine load, viscosity, heating temperature, type of environmental noise, signal-to-noise ratio, etc.
Attributes are parameters that define states of machines or types of noise. For several machine types, the attributes are hidden.

Dataset

This dataset consists of five machine types. For each machine type, one section is provided, and for each section, this dataset provides 200 clips of test data. The training data corresponding to this test data is provided on a separate Zenodo page as the "additional training dataset" for DCASE 2026 Challenge Task 2 (DCASE 2026 Challenge Task 2 Additional Training Dataset).

Each recording is a two-channel, 6-, 10-, or 16-second audio clip containing both target-machine and environmental sounds, captured at different distances from the target machine. In each recording, channel 1 is captured near the target machine, and channel 2 is captured farther away.
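The channel layout described above can be handled as in the following minimal sketch, which splits a recording into its near and far channels using only the Python standard library. It assumes the clips are stored as 16-bit PCM WAV files, which this page does not state, and the function name read_near_far is hypothetical.

```python
import array
import sys
import wave


def read_near_far(path):
    """Read a two-channel WAV file and return (near, far, sample_rate).

    Assumption: samples are 16-bit PCM. Other sample widths are rejected.
    """
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2:
            raise ValueError("expected a two-channel recording")
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    near = samples[0::2]  # channel 1: microphone near the target machine
    far = samples[1::2]   # channel 2: microphone farther away
    return near, far, rate
```

Interleaved stereo data stores the first channel at even sample positions and the second at odd positions, which is why a stride of two separates them.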

File names and attribute csv files

File names and attribute csv files provide reference labels for each clip. The reference labels for each training clip include the machine type, the section index, normal/anomaly information, and attributes regarding conditions other than normal/anomaly. The machine type is given by the directory name. The section index is given by the file name. For the datasets other than the evaluation dataset, the normal/anomaly information and the attributes are also given by the file names. Note that for machine types whose attribute information is hidden, the attribute field of each file name is labeled only as "noAttributes". Attribute csv files provide easy access to the attributes that cause domain shifts. Each file lists the file names, the names of the parameters that cause domain shifts (domain shift parameter, dp), and the values or types of these parameters (domain shift value, dv). Each row takes the following format:

[filename (string)], [d1p (string)], [d1v (int | float | string)], [d2p], [d2v]...

For machine types that have their attribute information hidden, all columns except the filename column are left blank for each row.
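The row format above could be read as in the following sketch, which maps each file name to a dictionary of domain shift parameters. It assumes the csv files have no header row and keeps all values as strings. The function name parse_attribute_rows and the sample values in the usage note are hypothetical.

```python
import csv
import io


def parse_attribute_rows(text):
    """Map each file name to a {parameter: value} dict.

    Assumption: every row is [filename, d1p, d1v, d2p, d2v, ...] and the
    file has no header row. Blank parameter columns are skipped, so
    machine types with hidden attributes yield an empty dict.
    """
    table = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # ignore empty lines
        filename = row[0].strip()
        cells = [cell.strip() for cell in row[1:]]
        attrs = {}
        # Cells alternate between parameter name and parameter value.
        for name, value in zip(cells[0::2], cells[1::2]):
            if name:
                attrs[name] = value
        table[filename] = attrs
    return table
```

For example, a made-up row such as `a.wav,speed,3,noise,A` would yield `{"speed": "3", "noise": "A"}` for `a.wav`, and a row with only the file name filled in would yield an empty dictionary.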

Recording procedure

Normal/anomalous operating sounds of machines and their related equipment were recorded. Anomalous sounds were collected by deliberately damaging target machines. Each sample is a two-channel recording captured at two microphone positions, one close to the target machine and the other farther away. To add environmental noise, pre-recorded real factory noise was played back through loudspeakers placed at the four corners of the recording room, and the resulting signals were mixed.
We will publish papers on the dataset to explain the details of the recording procedure by the submission deadline.

Directory structure

- /eval_data
    - /raw
        - /ToyDrone
            - /test
                - /section_00_0001.wav
                - ...
                - /section_00_0199.wav
        - /ToothBrush (The other machine types have the same directory structure as ToyDrone.)
        - /SewingMachine
        - /BlowerDustCollector
        - /Sander
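The file names in the listing above appear to follow the pattern section_<section index>_<clip index>.wav. Under that assumption, which is inferred from the listing rather than stated as a specification, the section index could be extracted as in this sketch. The function name parse_clip_name is hypothetical.

```python
import posixpath
import re

# Assumed pattern, inferred from names such as section_00_0199.wav.
_CLIP_NAME = re.compile(r"^section_(\d+)_(\d+)\.wav$")


def parse_clip_name(path):
    """Return (section_index, clip_index) for an evaluation clip path."""
    name = posixpath.basename(path)
    match = _CLIP_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return int(match.group(1)), int(match.group(2))
```

Raising an error on names that do not match, instead of silently skipping them, makes it easier to notice if the actual naming differs from this assumed pattern.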

Baseline system

The baseline system is available in the GitHub repository https://github.com/nttcslab/dcase2023_task2_baseline_ae. It provides a simple entry-level approach that gives reasonable performance on the Task 2 dataset, and it is a good starting point, especially for entry-level researchers who want to become familiar with the anomalous-sound-detection task.

Condition of use

This dataset was created jointly by Hitachi, Ltd. and NTT, Inc. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

Citation

<TBD>

Contact

If there is any problem, please contact us:

Tomoya Nishida, tomoya.nishida.ax@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Noboru Harada, noboru@ieee.org
Daisuke Niizumi, daisukelab.cs@gmail.com
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com

Files

Files (558.7 MB)

Name	MD5	Size
eval_data_BlowerDustCollector_test.zip	ec90d56f189e84e6430fb2878c59cb84	118.2 MB
eval_data_Sander_test.zip	05f81c9a80e91c7b3cea15f8e8559b44	114.3 MB
eval_data_SewingMachine_test.zip	f0a1a96b48006c301e8cec25e9214046	116.6 MB
eval_data_ToothBrush_test.zip	987395506fbb48c570d5d26babcab05e	72.6 MB
eval_data_ToyDrone_test.zip	b36102371c188b630d201d443a00bab2	137.0 MB
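The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's hashlib. The helper names md5_of_file and is_intact are hypothetical, and the archive paths passed to them are whatever locations the files were saved to.

```python
import hashlib

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "eval_data_BlowerDustCollector_test.zip": "ec90d56f189e84e6430fb2878c59cb84",
    "eval_data_Sander_test.zip": "05f81c9a80e91c7b3cea15f8e8559b44",
    "eval_data_SewingMachine_test.zip": "f0a1a96b48006c301e8cec25e9214046",
    "eval_data_ToothBrush_test.zip": "987395506fbb48c570d5d26babcab05e",
    "eval_data_ToyDrone_test.zip": "b36102371c188b630d201d443a00bab2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, archive_name):
    """Compare a downloaded archive against its listed checksum."""
    return md5_of_file(path) == EXPECTED_MD5[archive_name]
```

For instance, `is_intact("downloads/eval_data_Sander_test.zip", "eval_data_Sander_test.zip")` would return True only if the local copy matches the listed checksum, where the `downloads/` directory is just an example location.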

