There is a newer version of the record available.

Published May 13, 2025 | Version v2
Video/Audio Open

DataSEC - Dataset for Sound Event Classification of environmental noise

  • 1. Institute for Chemical-Physical Processes of the Italian Research Council
  • 2. University of Pisa - Physics Department
  • 3. University of Campania "Luigi Vanvitelli"

Contributors

Data curator:

Project manager:

  • 1. University of Campania "Luigi Vanvitelli"
  • 2. Istituto per i Processi Chimico-Fisici Consiglio Nazionale delle Ricerche Unità Organizzativa di Supporto di Pisa

Description

Sound Event Classification (SEC) has attracted considerable attention in recent years, with applications in a variety of fields, including environmental acoustics. Interest has grown in particular in outdoor measurements, where the goal is to distinguish the contribution of a particular source from the background noise. Machine learning tools depend on the availability of substantial datasets for training and validation. DataSEC is an open-access dataset specifically designed for Sound Event Classification of environmental noise that can be heard in outdoor environments.
The collection consists of 23 hours and 44 minutes of authentic, non-synthesized audio recordings gathered from two sources: sound level measurements and online repositories. The authors analysed all of the sound samples, which cover a diverse range of environments, from urban to rural settings.
DataSEC comprises a total of 5,051 mono-channel .wav audio samples, with a sampling rate of 44.1 kHz. Each sample represents a single event that has been classified into one of the following 22 sound classes and 28 subclasses. The symbol "\" denotes a subclass. The classes and subclasses are: Bells; Birds; Cat fights and moans; Chicken coop; Cicadas and crickets \Cicadas \Crickets; Crows seagulls and magpies \Crows \Seagulls \Magpies; Dog barkings and howlings; Glass breaking; Horn; Jet aircrafts; Lawn mower brush cutter and olive shaker \Lawn mower \Brush cutter \Olive shaker; Music; Propeller aircrafts \Airplanes \Helicopters; Sirens and alarms \Sirens \Alarms; Thunder fireworks and gunshot \Thunder \Fireworks \Gunshot; Train; Vacuum cleaner fan and hairdryer \Vacuum cleaner \Fan \Hairdryer; Vehicle idling \Car-truck idling \Motorbike idling; Vehicle pass-by \Car pass-by \Motorbike pass-by \Truck pass-by; Voices; Wind turbine; Workshop \Air compressor \Drill \Grinder \Jackhammer \Saw.
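For readers who want to use the labels programmatically, the class list above can be transcribed into a simple mapping. The following is a minimal sketch, not part of the dataset: the names `TAXONOMY` and `taxonomy_counts` are hypothetical, and the archive's actual folder or file naming may differ from the class names as written here.

```python
# Class -> list of subclasses, transcribed from the description above.
# An empty list means the class has no subclasses.
TAXONOMY = {
    "Bells": [],
    "Birds": [],
    "Cat fights and moans": [],
    "Chicken coop": [],
    "Cicadas and crickets": ["Cicadas", "Crickets"],
    "Crows seagulls and magpies": ["Crows", "Seagulls", "Magpies"],
    "Dog barkings and howlings": [],
    "Glass breaking": [],
    "Horn": [],
    "Jet aircrafts": [],
    "Lawn mower brush cutter and olive shaker": [
        "Lawn mower", "Brush cutter", "Olive shaker",
    ],
    "Music": [],
    "Propeller aircrafts": ["Airplanes", "Helicopters"],
    "Sirens and alarms": ["Sirens", "Alarms"],
    "Thunder fireworks and gunshot": ["Thunder", "Fireworks", "Gunshot"],
    "Train": [],
    "Vacuum cleaner fan and hairdryer": ["Vacuum cleaner", "Fan", "Hairdryer"],
    "Vehicle idling": ["Car-truck idling", "Motorbike idling"],
    "Vehicle pass-by": ["Car pass-by", "Motorbike pass-by", "Truck pass-by"],
    "Voices": [],
    "Wind turbine": [],
    "Workshop": ["Air compressor", "Drill", "Grinder", "Jackhammer", "Saw"],
}


def taxonomy_counts(taxonomy):
    """Return (number of classes, total number of subclasses)."""
    n_classes = len(taxonomy)
    n_subclasses = sum(len(subs) for subs in taxonomy.values())
    return n_classes, n_subclasses
```

Calling `taxonomy_counts(TAXONOMY)` gives a quick consistency check of the transcription against the 22 classes and 28 subclasses stated in the description.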
A minimum of 50 entries was required for each class and a minimum of 20 for each subclass. The authors examined all samples and preprocessed every file to remove silences and superfluous parts, including irrelevant background activity and overlapping sounds.
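The minimum counts above can be checked against any list of labels a user has assembled from the archive. The sketch below is illustrative only: the function name `under_represented` and the `(class, subclass)` label format are assumptions, and how labels are extracted from the files depends on the archive layout, which is not described here.

```python
from collections import Counter

MIN_PER_CLASS = 50      # stated minimum entries per class
MIN_PER_SUBCLASS = 20   # stated minimum entries per subclass


def under_represented(labels, min_class=MIN_PER_CLASS,
                      min_subclass=MIN_PER_SUBCLASS):
    """Find classes and subclasses with fewer samples than the minimum.

    `labels` is an iterable of (class_name, subclass_name) pairs, one per
    sample, with subclass_name set to None when the class has no subclasses.
    Returns two dicts mapping the offending class, or (class, subclass)
    pair, to its sample count.
    """
    class_counts = Counter()
    subclass_counts = Counter()
    for cls, sub in labels:
        class_counts[cls] += 1
        if sub is not None:
            subclass_counts[(cls, sub)] += 1
    small_classes = {c: n for c, n in class_counts.items() if n < min_class}
    small_subclasses = {
        key: n for key, n in subclass_counts.items() if n < min_subclass
    }
    return small_classes, small_subclasses
```

Note that this only reports labels that occur at least once; a class that is missing entirely from `labels` would have to be detected by comparing against the full class list.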
The authors expect that the dataset will contribute to future research in real-world sound event analysis and automated acoustic evaluation by means of machine learning.
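The format stated in the description (mono .wav files at 44.1 kHz, 5,051 samples totalling 23 hours and 44 minutes) can be checked on downloaded files with Python's standard `wave` module. This is a minimal sketch under assumptions: the helper names are hypothetical, and `wave` reads only PCM-encoded files, so it will raise `wave.Error` if a file uses another encoding.

```python
import wave

EXPECTED_RATE = 44_100   # Hz, as stated in the description
EXPECTED_CHANNELS = 1    # mono, as stated in the description


def wav_info(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM .wav file."""
    with wave.open(str(path), "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        duration = wf.getnframes() / rate
    return channels, rate, duration


def matches_description(path):
    """True if the file is mono and sampled at 44.1 kHz."""
    channels, rate, _ = wav_info(path)
    return channels == EXPECTED_CHANNELS and rate == EXPECTED_RATE


# Mean clip length implied by the stated totals.
TOTAL_SECONDS = 23 * 3600 + 44 * 60              # 23 h 44 min = 85,440 s
N_SAMPLES = 5051
MEAN_CLIP_SECONDS = TOTAL_SECONDS / N_SAMPLES    # about 16.9 s per sample
```

The mean of roughly 16.9 seconds is only an average derived from the stated totals; individual clip lengths are not given in the description and may vary widely.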

Files

DATASEC.zip

Files (6.4 GB)

Name: DATASEC.zip
Size: 6.4 GB
md5:5d537b1318d8243ab01126028b33b2e8
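
The md5 checksum listed above can be used to confirm that the 6.4 GB archive downloaded intact. A minimal sketch using Python's standard `hashlib` follows; the function names and the local file path are hypothetical, and the file is read in chunks so the whole archive never has to fit in memory.

```python
import hashlib

# Checksum as listed for DATASEC.zip in this record.
EXPECTED_MD5 = "5d537b1318d8243ab01126028b33b2e8"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """True if the file's md5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("DATASEC.zip")` would return `True` for an uncorrupted download saved under that name in the current directory.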

Additional details

Funding

Ministero dell'università e della ricerca
Sustainable condition monitoring of wind turbines using acoustic signals and machine learning techniques (20223LMSZN)