Published February 28, 2023 | Version v1
Dataset Open

MAESTRO Real - Multi-Annotator Estimated Strong Labels

  • 1. Tampere University

Description

The dataset was created for studying estimation of strong labels using crowdsourcing.

It contains 49 real-life audio files from 5 different acoustic scenes, together with the annotation outcome. Annotation was performed using Amazon Mechanical Turk. The total duration of the dataset is 189 minutes and 52 seconds.
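As a quick sanity check on these figures, the mean recording length can be worked out from the stated total duration and file count. The sketch below uses only the numbers given in this description; the variable names are illustrative.

```python
# Figures taken from the dataset description.
NUM_FILES = 49
TOTAL_MINUTES = 189
TOTAL_EXTRA_SECONDS = 52

# Convert the total duration to seconds, then average over the files.
total_seconds = TOTAL_MINUTES * 60 + TOTAL_EXTRA_SECONDS
mean_seconds = total_seconds / NUM_FILES

# Split the mean into whole minutes and remaining seconds.
minutes, seconds = divmod(mean_seconds, 60)
print(f"total: {total_seconds} s, mean per file: {int(minutes)} min {seconds:.1f} s")
```

The mean comes out at a little under 4 minutes per file, which agrees with the stated range of 3 to 5 minutes per recording.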

The audio files are a subset of the TUT Acoustic Scenes 2016 dataset and belong to five acoustic scenes: cafe/restaurant, city center, grocery store, metro station and residential area. Each scene has 6 classes; some of them are common to all the scenes, resulting in 17 classes in total.


The dataset contains:

  • audio: the 49 real-life recordings, each 3 to 5 min long.
  • soft labels: strong labels estimated from the crowdsourced data; values between 0 and 1 indicate the uncertainty of the annotators.
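Soft labels of this kind are often converted to binary activity before training or evaluating a conventional sound event detector. The following is a minimal sketch of that step, not the authors' procedure. It assumes that a higher value means stronger annotator agreement that the class is active, which this description does not state explicitly. The function name `binarize_soft_labels`, the 0.5 threshold and the example values are all hypothetical, and the on-disk format of the annotation files is not assumed here.

```python
import numpy as np

def binarize_soft_labels(soft, threshold=0.5):
    """Map soft labels in [0, 1] to 0/1 activity (hypothetical helper).

    The threshold is an illustrative choice, not a value from the dataset.
    """
    soft = np.asarray(soft, dtype=float)
    if soft.size and (soft.min() < 0.0 or soft.max() > 1.0):
        raise ValueError("soft labels are expected to lie in [0, 1]")
    # A segment counts as active when its soft label reaches the threshold.
    return (soft >= threshold).astype(int)

# Made-up soft labels for one class over six consecutive time segments.
example = [0.0, 0.2, 0.5, 0.9, 1.0, 0.4]
hard = binarize_soft_labels(example)
print(hard.tolist())
```

Thresholding discards the uncertainty information, so a model that can be trained directly on values in [0, 1] may be preferable when that information matters.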

For more details about the real-life recordings, please see the following paper:

A. Mesaros, T. Heittola and T. Virtanen, "TUT database for acoustic scene classification and sound event detection," 2016 24th European Signal Processing Conference (EUSIPCO), 2016, pp. 1128-1132.

Files

development_annotation.zip

Files (2.6 GB)

Size      Checksum
877.2 kB  md5:e6a3f84f8020725d559b38ccb494ef3d
2.6 GB    md5:3de7cb4f92a115a6f5cc077a41ca07b3
1.6 kB    md5:d1111f68bf579834e148db1788ae4cf9
4.9 kB    md5:a8f77695c5933dcfcaea5018bc43c154
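
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative, and the local file path is whatever name the file was saved under, since file names are not shown in this listing.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, listed):
    """Compare a file against a listed checksum such as 'md5:abc123...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum(local_path, "md5:3de7cb4f92a115a6f5cc077a41ca07b3")` should return `True` for an intact copy of the 2.6 GB file, where `local_path` is a placeholder for the downloaded file's location.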

Additional details

Funding

Teaching machines to listen 332063
Research Council of Finland