Published August 27, 2021 | Version v1
Dataset Open

MAESTRO Synthetic - Multi-Annotator Estimated Strong Labels

  • 1. Tampere University


The dataset was created for studying estimation of strong labels using crowdsourcing.

It contains 20 synthetic audio files created using Scaper, the reference annotation created with Scaper, and the annotation outcome. Annotation was performed using Amazon Mechanical Turk.

Audio files contain excerpts of recordings from the UrbanSound8K dataset. Please see FREESOUNDCREDITS.txt for an attribution list.

The dataset contains: 

  • audio: the 20 synthetic soundscapes, each 3 min long
  • ground truth: the "true" reference annotation created using Scaper
  • estimated strong labels: the annotation outcome, estimated from the crowdsourced data
  • audio tags: the weak labels corresponding to each 10 s segment of the soundscapes, as annotated
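
To illustrate how strong labels relate to the segment-level audio tags listed above, the following minimal sketch derives weak labels for consecutive 10 s segments of a 3 min soundscape from a list of strong-label events. It is an assumption-laden illustration, not the procedure used to build the dataset: the function name `weak_labels_from_strong`, the `(onset, offset, label)` tuple layout, the non-overlapping segmentation (`hop` equal to the segment length), and the example events are all hypothetical, and the annotation file format is not described here.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical in-memory representation of one strong label:
# (onset in seconds, offset in seconds, class label).
Event = Tuple[float, float, str]


def weak_labels_from_strong(
    events: Iterable[Event],
    total_duration: float = 180.0,  # each soundscape is 3 min long
    segment_length: float = 10.0,   # tags are given per 10 s segment
    hop: float = 10.0,              # assumption: non-overlapping segments
) -> List[Tuple[float, float, Set[str]]]:
    """Return (start, end, tags) for each segment.

    A label is active in a segment if its event overlaps the segment
    for any non-zero duration.
    """
    events = list(events)
    n_segments = int((total_duration - segment_length) / hop + 1e-9) + 1
    segments = []
    for i in range(n_segments):
        start = i * hop
        end = start + segment_length
        tags = {
            label
            for onset, offset, label in events
            if onset < end and offset > start
        }
        segments.append((start, end, tags))
    return segments


# Illustrative events only; these are not taken from the dataset.
example_events = [
    (2.5, 7.0, "dog_bark"),
    (8.0, 14.0, "siren"),
    (25.0, 31.5, "car_horn"),
]

segments = weak_labels_from_strong(example_events)
for start, end, tags in segments[:4]:
    print(f"{start:5.1f}-{end:5.1f} s: {sorted(tags)}")
```

Setting `hop` smaller than `segment_length` would produce overlapping segments; whether the annotated segments overlap should be checked against the paper cited below.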

For details on the annotation procedure and label processing methodology, see the following paper:

Irene Martin Morato, Manu Harju, and Annamaria Mesaros. Crowdsourcing strong labels for sound event detection. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2021). New Paltz, NY, Oct 2021.




Files (590.8 MB)


Additional details


Funding: Academy of Finland, "Teaching machines to listen", grant 332063