MACS - Multi-Annotator Captioned Soundscapes
Description
This dataset contains audio captions and corresponding audio tags for 3930 audio files of the TAU Urban Acoustic Scenes 2019 development dataset (acoustic scenes airport, public square, and park). The files were annotated using a web-based tool.
Each file was annotated by multiple annotators, each of whom provided tags and a one-sentence description of the audio content.
The data also includes annotator competence estimated using MACE (Multi-Annotator Competence Estimation).
The annotation procedure, processing and analysis of the data are presented in the following papers:
- Irene Martin-Morato, Annamaria Mesaros. What is the ground truth? Reliability of multi-annotator data for audio tagging. 29th European Signal Processing Conference (EUSIPCO 2021).
- Irene Martin-Morato, Annamaria Mesaros. Diversity and bias in audio captioning datasets. Submitted to the DCASE 2021 Workshop (to be updated with arXiv link).
Data is provided as two files (a short Python loading sketch follows the list):
- MACS.yaml - containing the complete annotations in the following format:
    - filename: file1.wav
      annotations:
        - annotator_id: ann_1
          sentence: caption text
          tags:
            - tag1
            - tag2
        - annotator_id: ann_2
          sentence: caption text
          tags:
            - tag1
- MACS_competence.csv - containing the estimated annotator competence; for each annotator_id in the yaml file, the competence is a value between 0 (equivalent to annotating at random) and 1. Each line has the format:
    id [tab] competence
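A minimal loading sketch in Python, assuming PyYAML is available, that the top level of MACS.yaml is the list shown above, and that MACS_competence.csv may begin with an "id [tab] competence" header line:

import csv
import yaml  # PyYAML, assumed to be installed

# Load the annotations; following the example above, the top level is
# assumed to be a list with one entry per audio file.
with open("MACS.yaml", "r", encoding="utf-8") as f:
    files = yaml.safe_load(f)

# Load the estimated annotator competence (tab-separated: id, competence).
competence = {}
with open("MACS_competence.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t"):
        try:
            competence[row[0]] = float(row[1])
        except ValueError:
            continue  # skip a header line such as "id<TAB>competence"

# Example use: print each caption with its annotator's estimated competence.
for entry in files:
    for annotation in entry["annotations"]:
        print(entry["filename"],
              annotation["annotator_id"],
              competence.get(annotation["annotator_id"]),
              annotation["sentence"])

The key names and file layout follow the description above; if the released files differ (for example, if the YAML uses a top-level key above the list), the sketch needs a corresponding adjustment.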
The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.
Additional details
Funding
- Teaching machines to listen (grant 332063), Research Council of Finland