TAU Urban Acoustic Scenes 2021 Mobile, Evaluation dataset

Audio Research Group / Tampere University

Authors

Recording and annotation

  • Henri Laakso
  • Ronal Bejarano Rodriguez
  • Toni Heittola

1. Dataset

The TAU Urban Acoustic Scenes 2021 Mobile evaluation dataset consists of 10-second audio segments from 10 acoustic scenes:

  • Airport - airport
  • Indoor shopping mall - shopping_mall
  • Metro station - metro_station
  • Pedestrian street - street_pedestrian
  • Public square - public_square
  • Street with medium level of traffic - street_traffic
  • Travelling by a tram - tram
  • Travelling by a bus - bus
  • Travelling by an underground metro - metro
  • Urban park - park

The dataset contains 22 hours of audio in total.
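
The stated total can be cross-checked against the segment count given under File structure. This is a minimal sketch, assuming that every one of the 7920 segments is exactly 10 seconds long:

```python
# Cross-check of the stated total duration.
# Assumption: all 7920 segments are exactly 10 seconds long.
NUM_SEGMENTS = 7920
SEGMENT_SECONDS = 10

total_seconds = NUM_SEGMENTS * SEGMENT_SECONDS
total_hours = total_seconds / 3600

print(total_hours)  # 22.0
```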

The dataset was collected by Tampere University of Technology between 05/2018 and 11/2018. The data collection has received funding from the European Research Council under the ERC Grant Agreement 637422 EVERYSOUND.

Preparation of the dataset

The dataset was recorded in 12 large European cities: Amsterdam, Barcelona, Helsinki, Lisbon, London, Lyon, Madrid, Milan, Prague, Paris, Stockholm, and Vienna. For all acoustic scenes, audio was captured in multiple locations: different streets, different parks, different shopping malls. In each location, multiple 2-3 minute long audio recordings were captured in a few (2-4) slightly different positions within the selected location. The collected audio material was cut into 10-second segments.
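
The cutting step can be illustrated with a short sketch. The README does not state how segment boundaries were chosen, so the non-overlapping layout and the dropping of a trailing partial segment are assumptions, and `segment_bounds` is a hypothetical helper rather than the tool used by the authors:

```python
from typing import Iterator, Tuple


def segment_bounds(
    num_samples: int, sample_rate: int, segment_seconds: float = 10.0
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices of consecutive full-length segments.

    Assumptions: segments do not overlap, and a trailing remainder shorter
    than one segment is discarded.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    if seg_len <= 0:
        raise ValueError("segment length must be positive")
    for start in range(0, num_samples - seg_len + 1, seg_len):
        yield start, start + seg_len


# A 2.5-minute recording at 48 kHz yields 15 segments of 10 seconds each.
bounds = list(segment_bounds(150 * 48000, 48000))
print(len(bounds))  # 15
```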

The equipment used for recording consists of a binaural Soundman OKM II Klassik/studio A3 electret in-ear microphone and a Zoom F8 audio recorder using a 48 kHz sampling rate and 24-bit resolution. During the recording, the microphones were worn by the recording person in the ears, and head movement was kept to a minimum.

This dataset contains data recorded with device A (the recording equipment described above), three mobile devices (referred to as devices B, C and D) and 11 simulated devices (S1-S11). Devices B, C and D are commonly available consumer devices (e.g. smartphones, cameras) and were handled in typical ways (e.g. hand held).

Post-processing of the recorded audio addressed the privacy of recorded individuals and possible errors in the recording process. The material was screened for content, and segments containing close-microphone conversation were eliminated. Some interference from mobile phones is audible, but it is considered part of the real-world recording process.

File structure

dataset root
│   README.md               this file, markdown-format
│   README.html             this file, html-format
│
├───audio                   7920 audio segments, 24-bit 44.1 kHz mono
│   │   0.wav
│   │   1.wav
│   │   ...
│   │   7919.wav
│
└───evaluation_setup        cross-validation setup, 1 fold
    │   test.txt            testing file list, csv-format, [audio file (string)]
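
After download, the audio format stated in the tree above can be spot-checked. The following is a minimal sketch using Python's standard `wave` module; it assumes the files carry a WAV header that `wave` can parse, and `describe_wav`, `matches_expected` and `EXPECTED` are hypothetical names introduced here for illustration:

```python
import wave

# Format stated in this README: 24-bit, 44.1 kHz, mono.
EXPECTED = {"channels": 1, "sample_rate": 44100, "bit_depth": 24}


def describe_wav(path):
    """Read the header of one WAV file and return its basic parameters."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }


def matches_expected(info):
    """Return True if channels, sample rate and bit depth match EXPECTED."""
    return all(info[key] == value for key, value in EXPECTED.items())
```

For example, `matches_expected(describe_wav("audio/0.wav"))` would be expected to return `True` if the file follows the stated format.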

2. Usage

The partitioning of the data was done based on the location of the original recordings. All segments recorded at the same location were included in a single subset, either the development dataset or the evaluation dataset. The locations in this dataset are therefore different from the ones used in TAU-urban-acoustic-scenes-2020-mobile-development.
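
The idea of location-based partitioning can be sketched as follows. This is an illustration only, not the procedure used by the authors: the evaluation dataset does not list recording locations, so the file names and location identifiers below are hypothetical.

```python
def split_by_location(segments, evaluation_locations):
    """Assign each segment to a subset according to its recording location.

    segments: iterable of (filename, location_id) pairs.
    evaluation_locations: set of location ids reserved for evaluation.
    Every segment from one location lands in the same subset.
    """
    development, evaluation = [], []
    for filename, location in segments:
        if location in evaluation_locations:
            evaluation.append(filename)
        else:
            development.append(filename)
    return development, evaluation


# Hypothetical segments and location ids, for illustration only.
example_segments = [
    ("a.wav", "park-helsinki-01"),
    ("b.wav", "park-helsinki-01"),
    ("c.wav", "tram-vienna-07"),
]
dev_files, eval_files = split_by_location(example_segments, {"tram-vienna-07"})
print(dev_files, eval_files)
```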

Testing

evaluation_setup\test.txt
testing file list (in csv-format)

Format: [audio file (string)]
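
A minimal sketch of reading the file list is given below. The README only states that the file is in csv-format with one string column, so the tab delimiter and the handling of a possible header row are assumptions, and `read_file_list` is a hypothetical helper:

```python
import csv


def read_file_list(path, delimiter="\t"):
    """Return the audio file names listed in the first column of the file.

    Assumptions: one file name per row, and a first row that does not
    name a .wav file is treated as a header and skipped.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        rows = csv.reader(handle, delimiter=delimiter)
        files = [row[0].strip() for row in rows if row and row[0].strip()]
    if files and not files[0].lower().endswith(".wav"):
        files = files[1:]
    return files
```

Calling `read_file_list("evaluation_setup/test.txt")` from the dataset root would then return the list of audio files to process.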

3. Changelog

v1.0 / 2021-05-11

  • Initial commit

4. License

The license permits free academic usage. Any commercial use is strictly prohibited. For commercial use, contact the dataset authors.

Copyright (c) 2021 Tampere University and its licensors
All rights reserved.
Permission is hereby granted, without written agreement and without license or royalty
fees, to use and copy the TAU Urban Acoustic Scenes 2021 Mobile (“Work”) described in this document
and composed of audio and metadata. This grant is only for experimental and non-commercial
purposes, provided that the copyright notice in its entirety appear in all copies of this Work,
and the original source of this Work, (Audio Research Group at Tampere University),
is acknowledged in any publication that reports research using this Work.
Any commercial use of the Work or any part thereof is strictly prohibited.
Commercial use include, but is not limited to:
- selling or reproducing the Work
- selling or distributing the results or content achieved by use of the Work
- providing services by using the Work.

IN NO EVENT SHALL TAMPERE UNIVERSITY OR ITS LICENSORS BE LIABLE TO ANY PARTY
FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE
OF THIS WORK AND ITS DOCUMENTATION, EVEN IF TAMPERE UNIVERSITY OR ITS
LICENSORS HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

TAMPERE UNIVERSITY AND ALL ITS LICENSORS SPECIFICALLY DISCLAIMS ANY
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE. THE WORK PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND
THE TAMPERE UNIVERSITY HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.