STeLiN-US: A Spatio-Temporally Linked Neighborhood Urban Sound Database

Snehit; Bo-Hao Su; Chi-Chun Lee

doi:10.5281/zenodo.10560740

Published September 2023 | Version v2

Dataset Open

1. NTHU, Taiwan

Update:

Baseline: GitHub: STeLiN-US
Bug Fix: a minor bug on missing brackets in some clip labels in the MetaData.xlsx file has been corrected.
Paper: DCASE-2023: Proceedings

Introduction:

In this work, we present a novel dataset, the Spatio-temporally Linked Neighborhood Urban Sound (STeLiN-US) database. The dataset is semi-synthesized: each sample is generated by combining diverse sets of real urban sounds with crawled information on real-world user behavior over time.

Aim:

We propose this dataset to equip researchers with varying ambient sound in an environment that closely resembles realistic patterns. The STeLiN-US dataset simulates the acoustic appearance of closely interconnected neighborhoods in urban areas.

The dataset has potential not only for identifying scenes but also for predicting acoustic scenarios.
This supports user-centered applications. For example, when combined with automatic speech recognition (ASR), ASR performance can be analyzed by location and time; moreover, likely performance can be predicted beforehand from the predicted busyness of the scene.
Scene-specific events are incorporated to replicate real surrounding environments, which helps researchers test new event detection systems.

Dataset Specification:

The STeLiN-US dataset consists of 5-minute audio segments representing 5 acoustic scenes, or microphone locations:

Street
Metro-Station
Park
School-Playground
Café

The audio segment at each scene is synthesized for 15 discrete hours of the day, from 7am to 9pm, for each day of the week from Monday to Sunday. With 5 locations, 7 days, and 15 discrete timestamps, this gives 525 audio segments in total, or 43 hours 45 minutes of audio. We use 14 acoustic sound classes, divided into events and backgrounds as below:

Events	Backgrounds
Vehicle, Children Playing, Street Music, Phone Ring, School Bell, Car Horn, Bird, and Dog Bark	Train, Pedestrian, Cafe Crowd, Urban Park, River, and Fountain
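
The segment count and total duration stated above follow directly from the scene, day, and hour counts. A minimal sketch of that arithmetic is given here; the constant names are illustrative and not part of the dataset.

```python
# Counts taken from the dataset specification above.
NUM_SCENES = 5        # Street, Metro-Station, Park, School-Playground, Café
NUM_DAYS = 7          # Monday to Sunday
NUM_HOURS = 15        # 7am to 9pm, inclusive
SEGMENT_MINUTES = 5   # length of each audio segment

total_segments = NUM_SCENES * NUM_DAYS * NUM_HOURS
total_minutes = total_segments * SEGMENT_MINUTES
hours, minutes = divmod(total_minutes, 60)

print(total_segments)               # 525
print(f"{hours} h {minutes} min")   # 43 h 45 min
```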

Sound Classes and Datasets Used for the Synthesis:

Sound Class	Source Dataset
Vehicle	IDMT Traffic
Train, Cafe Crowd, Urban Park	TUT Rare Sound Events 2017
Pedestrian	TAU Urban Acoustic Scenes 2020 Mobile
Children Playing, Street Music	UrbanSound
Phone Ring	NIGENS
School Bell, River, Fountain	Freesound
Car Horn, Dog Bark	UrbanSound8K
Bird	ESC-50

Naming Convention:

[Day]_[Microphone Location/Scene]_[Time].wav, e.g., “Mon_Park_3pm.wav” represents the Park scene on Monday synthesized at 3pm.
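
The naming convention above can be parsed mechanically. The following is a minimal sketch, not an official loader: the names `ClipInfo` and `parse_clip_name` are hypothetical, and the code assumes the time field is always a 12-hour value followed by `am` or `pm`, as in the examples shown in this description.

```python
import re
from typing import NamedTuple


class ClipInfo(NamedTuple):
    day: str       # e.g. "Mon"
    scene: str     # e.g. "Park" or "Metro-Station"
    hour_24: int   # hour of day on a 24-hour clock


# [Day]_[Scene]_[Time].wav; scene names may contain hyphens but no underscores.
_CLIP_PATTERN = re.compile(
    r"^(?P<day>[A-Za-z]+)_(?P<scene>[^_]+)_(?P<hour>\d{1,2})(?P<ampm>am|pm)\.wav$"
)


def parse_clip_name(filename: str) -> ClipInfo:
    """Split a clip file name into day, scene, and 24-hour timestamp."""
    match = _CLIP_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected clip name: {filename!r}")
    hour = int(match["hour"]) % 12      # maps 12 to 0 before the pm offset
    if match["ampm"] == "pm":
        hour += 12
    return ClipInfo(match["day"], match["scene"], hour)


print(parse_clip_name("Mon_Park_3pm.wav"))
```

Converting the time to a 24-hour integer makes it easy to sort clips chronologically or to group them by hour across scenes.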

File Structure:

STeLiN-US
|    MetaData.xlsx
|    Traffic_Temporal_MetaData.xlsx
|
|____Readme
|    |    README.md
|    |    Map_Paper.png
|
|
|____Audio
|    |____Street
|    |    |    Fri_Street_1pm.wav            file naming convention: [Day]_[Microphone Location/Scene]_[Time].wav
|    |    |    Fri_Street_2pm.wav
|    |    |    …
|
|    |____Metro-Station
|    |    |    Fri_Metro-Station_1pm.wav
|    |    |    Fri_Metro-Station_2pm.wav
|    |    |    …
|
|    |____Park
|    |    |    Fri_Park_1pm.wav
|    |    |    Fri_Park_2pm.wav
|    |    |    …
|
|    |____School-Playground
|    |    |    Fri_School-Playground_1pm.wav
|    |    |    Fri_School-Playground_2pm.wav
|    |    |    …
|
|    |____Cafe
|    |    |    Fri_Cafe_1pm.wav
|    |    |    Fri_Cafe_2pm.wav
|    |    |    …
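
After extracting the archive, the layout above can be used to check that every expected clip is present. The sketch below makes several assumptions that should be checked against the actual files: only the day abbreviations `Mon` and `Fri` appear in this description, so the other five are guesses; noon is assumed to be written `12pm`; and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: three-letter day abbreviations (only Mon and Fri are confirmed above).
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
# Folder names as shown in the file structure (note "Cafe" without the accent).
SCENES = ["Street", "Metro-Station", "Park", "School-Playground", "Cafe"]
# 7am to 9pm inclusive; writing noon as "12pm" is an assumption.
HOURS = [f"{h}am" for h in range(7, 12)] + ["12pm"] + [f"{h}pm" for h in range(1, 10)]


def expected_paths(root):
    """List every clip path implied by the layout Audio/<Scene>/<Day>_<Scene>_<Time>.wav."""
    audio_dir = Path(root) / "Audio"
    return [
        audio_dir / scene / f"{day}_{scene}_{hour}.wav"
        for scene in SCENES
        for day in DAYS
        for hour in HOURS
    ]


def missing_files(root):
    """Return the expected clip paths that do not exist under root."""
    return [path for path in expected_paths(root) if not path.is_file()]


if __name__ == "__main__":
    missing = missing_files("STeLiN-US")
    print(f"{len(missing)} of {len(expected_paths('STeLiN-US'))} expected clips are missing")
```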

Contact person:

This is preliminary work, and we look forward to improving the present version of the dataset. Suggestions are most welcome; please send your feedback to:

Snehit Chunarkar: snehitc@gmail.com

Files

STeLiN-US.zip

Files (4.0 GB)

Name	Size
STeLiN-US.zip md5:c020de1eeccc5ab7a94acaee4d0ac2a6	4.0 GB
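
The MD5 value listed above can be used to confirm that the 4.0 GB download completed without corruption. A minimal sketch using Python's standard `hashlib` follows; the function name `md5_of_file` and the local path `STeLiN-US.zip` are assumptions about where the archive was saved.

```python
import hashlib

# Checksum as listed on the record for STeLiN-US.zip.
EXPECTED_MD5 = "c020de1eeccc5ab7a94acaee4d0ac2a6"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    actual = md5_of_file("STeLiN-US.zip")  # assumed download location
    print("checksum OK" if actual == EXPECTED_MD5 else f"checksum mismatch: {actual}")
```

Reading in chunks keeps memory use small, which matters for an archive of this size.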

Additional details

Accepted: 2023

