MACS - Multi-Annotator Captioned Soundscapes

Irene Martin Morato; Annamaria Mesaros

doi:10.5281/zenodo.5114771

Published July 22, 2021 | Version v1

Dataset Open

1. Tampere University

This dataset contains audio captions and corresponding audio tags for 3930 audio files from the TAU Urban Acoustic Scenes 2019 development dataset, drawn from three acoustic scenes (airport, public square, and park). The files were annotated using a web-based tool.

Each file was annotated by multiple annotators, each of whom provided tags and a one-sentence description of the audio content.

The data also includes annotator competence estimated using MACE (Multi-Annotator Competence Estimation).

The annotation procedure, processing and analysis of the data are presented in the following papers:

Irene Martin-Morato, Annamaria Mesaros. What is the ground truth? Reliability of multi-annotator data for audio tagging, 29th European Signal Processing Conference, EUSIPCO 2021
Irene Martin-Morato, Annamaria Mesaros. Diversity and bias in audio captioning datasets, submitted to DCASE 2021 Workshop (to be updated with arxiv link)

Data is provided as two files:

MACS.yaml - containing the complete annotations in the following format:

- filename: file1.wav
  annotations:
    - annotator_id: ann_1
      sentence: caption text
      tags:
        - tag1
        - tag2
    - annotator_id: ann_2
      sentence: caption text
      tags:
        - tag1
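
The nested layout above can be flattened into one row per annotation, which is often easier to work with. The following is a minimal sketch, not an official loader: it assumes the top level of MACS.yaml is a list of file entries exactly as in the format shown, and it relies on the third-party PyYAML package for parsing. The function names `load_macs` and `flatten_annotations` are hypothetical.

```python
def load_macs(path):
    """Parse the YAML file at `path` (assumed to be MACS.yaml)."""
    # PyYAML is a third-party dependency; it is imported lazily so that
    # flatten_annotations below can be used without it.
    import yaml
    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


def flatten_annotations(entries):
    """Yield (filename, annotator_id, sentence, tags) for every annotation.

    Assumes each entry has a `filename` and a list of `annotations`,
    as in the format described above.
    """
    for entry in entries:
        for ann in entry.get("annotations", []):
            yield (
                entry["filename"],
                ann["annotator_id"],
                ann["sentence"],
                list(ann.get("tags") or []),
            )


# Usage (the path is an assumption about where the file was saved):
# rows = list(flatten_annotations(load_macs("MACS.yaml")))
```

If the actual file wraps the entries under a top-level key, the list would need to be extracted before calling `flatten_annotations`.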

MACS_competence.csv - containing the estimated annotator competence; for each annotator_id in the yaml file, competence is a number between 0 (considered as annotating at random) and 1

id [tab] competence
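
The competence file can be read with the standard `csv` module using a tab delimiter. This is a sketch under stated assumptions: the description does not say whether the file starts with a header row, so rows whose second field is not a number are skipped. The function name `read_competence` is hypothetical.

```python
import csv


def read_competence(lines):
    """Parse tab-separated `id<TAB>competence` rows into a dict.

    `lines` is any iterable of text lines, such as an open file.
    """
    competence = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        try:
            value = float(row[1])
        except ValueError:
            continue  # assumed to be a header row, if one is present
        competence[row[0]] = value
    return competence


# Usage (the path is an assumption about where the file was saved):
# with open("MACS_competence.csv", newline="", encoding="utf-8") as f:
#     competence = read_competence(f)
```

The resulting dictionary maps each annotator_id from the yaml file to its competence, so it could be used, for example, to weight or filter annotations by annotator.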

The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.

Files

Files (2.8 MB)

Name	Size	MD5
LICENSE.txt	1.5 kB	d3086f4517cccc32c1bb3a081b07cfa1
MACS.yaml	2.8 MB	23fcb2ebd0b109094034ef9e87972256
MACS_competence.csv	1.4 kB	4dfe9f951f0af9f29cb7952ec030370a
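
The MD5 checksums listed for the three files can be used to confirm that a download is intact. The following sketch uses only the standard library. The checksum values are copied from the listing above, while the helper names `md5_of` and `verify` and the local file paths are assumptions.

```python
import hashlib

# Checksums as listed on the dataset page.
EXPECTED_MD5 = {
    "LICENSE.txt": "d3086f4517cccc32c1bb3a081b07cfa1",
    "MACS.yaml": "23fcb2ebd0b109094034ef9e87972256",
    "MACS_competence.csv": "4dfe9f951f0af9f29cb7952ec030370a",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Compare the file at `path` against the listed checksum for `name`."""
    return md5_of(path) == EXPECTED_MD5[name]


# Usage (assumes the file was saved under its original name):
# print(verify("MACS.yaml", "MACS.yaml"))
```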

Additional details

Funding: Research Council of Finland, Teaching machines to listen (332063)
