Audio Commons Ground Truth Data for deliverables D4.4, D4.10 and D4.12

doi:10.5281/zenodo.2545728

Published January 21, 2019 | Version 1.0

Dataset Open

Font, Frederic¹

1. Music Technology Group, Universitat Pompeu Fabra

This dataset contains the ground truth data used to evaluate the musical pitch, tempo and key estimation algorithms developed during the AudioCommons H2020 EU project, which are part of the Audio Commons Audio Extractor tool. It also includes ground truth information for the single-eventness audio descriptor developed for the same tool.

This ground truth data has been used to generate the following documents:

Deliverable D4.4: Evaluation report on the first prototype tool for the automatic semantic description of music samples
Deliverable D4.10: Evaluation report on the second prototype tool for the automatic semantic description of music samples
Deliverable D4.12: Release of tool for the automatic semantic description of music samples

All these documents are available in the materials section of the AudioCommons website.

All ground truth data in this repository is provided in the form of CSV files. Each CSV file corresponds to one of the individual datasets used in one or more evaluation tasks of the aforementioned deliverables. This repository does not include the audio files of the individual datasets, only references to them. The following paragraphs describe the structure of the CSV files and give some notes about how to obtain the audio files should they be needed.

Structure of the CSV files

All CSV files in this repository (with the sole exception of SINGLE EVENT - Ground Truth.csv) feature the following 5 columns:

Audio reference: reference to the corresponding audio file. This will be either a string with the filename or the Freesound ID (for one dataset based on Freesound content). See below for details about how to obtain those files.
Audio reference type: will be one of Filename or Freesound ID, and specifies how the previous column should be interpreted.
Key annotation: tonality information as a string with the form "RootNote minor/major". Audio files with no ground truth annotation for tonality are left blank. Ground truth annotations are parsed from the original data source as described in the text of deliverables D4.4 and D4.10.
Tempo annotation: tempo information as an integer representing beats per minute. Audio files with no ground truth annotation for tempo are left blank. Ground truth annotations are parsed from the original data source as described in the text of deliverables D4.4 and D4.10. Note that integer values are used here because we only have tempo annotations for music loops which typically only feature integer tempo values.
Pitch annotation: pitch information as an integer representing the MIDI note number corresponding to the annotated pitch's frequency. Audio files with no ground truth annotation for pitch are left blank. Ground truth annotations are parsed from the original data source as described in the text of deliverables D4.4 and D4.10.
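As an illustration of the column layout described above, the following is a minimal Python sketch for reading one of these CSV files into typed records. It assumes that the header row uses exactly the column names listed above and that the files are comma-separated; neither detail is confirmed here, so check a real file first. The names Annotation, parse_key, parse_int, midi_to_hz and read_annotations are hypothetical helpers, the sample rows are made up, and the MIDI-to-frequency conversion assumes standard equal temperament with A4 = 440 Hz.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Annotation:
    audio_reference: str
    reference_type: str               # "Filename" or "Freesound ID"
    key: Optional[Tuple[str, str]]    # (root note, "major" or "minor")
    tempo_bpm: Optional[int]
    pitch_midi: Optional[int]


def parse_key(text):
    """Split 'RootNote minor/major' into (root, mode); blank means no annotation."""
    text = text.strip()
    if not text:
        return None
    root, mode = text.rsplit(" ", 1)
    return root, mode.lower()


def parse_int(text):
    """Return an int, or None when the field is left blank."""
    text = text.strip()
    # int(float(...)) also tolerates values written as "120.0".
    return int(float(text)) if text else None


def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (assumes A4 = note 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def read_annotations(handle):
    """Read all rows from an open CSV file (header names are an assumption)."""
    return [
        Annotation(
            audio_reference=row["Audio reference"],
            reference_type=row["Audio reference type"],
            key=parse_key(row["Key annotation"]),
            tempo_bpm=parse_int(row["Tempo annotation"]),
            pitch_midi=parse_int(row["Pitch annotation"]),
        )
        for row in csv.DictReader(handle)
    ]


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = (
    "Audio reference,Audio reference type,Key annotation,"
    "Tempo annotation,Pitch annotation\n"
    "loop_a.wav,Filename,A minor,120,\n"
    "12345,Freesound ID,,,69\n"
)
rows = read_annotations(io.StringIO(SAMPLE))
```

With a real file, the same function could be called as read_annotations(open("FSL4 - Ground Truth.csv", newline="", encoding="utf-8")); the encoding is also an assumption.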

The remaining CSV file, SINGLE EVENT - Ground Truth.csv, has only the following 2 columns:

Freesound ID: sound ID used in Freesound to identify the audio clip.
Single Event: boolean indicating whether the corresponding sound is considered to be a single event or not. Single event annotations were collected by the authors of the deliverables as described in deliverable D4.10.
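The two-column file can be read in a similar way. The sketch below is only an illustration: it assumes the header names match the column names above, and because the textual encoding of the boolean is not specified here, the hypothetical parse_bool helper accepts a few common spellings and fails loudly on anything else. The sample rows are made up.

```python
import csv
import io

# Accepted spellings are an assumption; adjust after inspecting the real file.
TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def parse_bool(text):
    """Convert a textual flag to bool, rejecting unknown spellings."""
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognised boolean value: {text!r}")


def read_single_events(handle):
    """Map each Freesound ID to its single-event flag."""
    return {
        int(row["Freesound ID"]): parse_bool(row["Single Event"])
        for row in csv.DictReader(handle)
    }


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = "Freesound ID,Single Event\n1001,True\n1002,False\n"
flags = read_single_events(io.StringIO(SAMPLE))
```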

How to get the audio data

In this section we provide some notes about how to obtain the audio files corresponding to the ground truth annotations provided here. Note that due to licensing restrictions we are not allowed to re-distribute the audio data corresponding to most of these ground truth annotations.

Apple Loops (APPL): This dataset includes some of the music loops included in Apple's music software such as Logic or GarageBand. Access to these loops requires owning a license for the software. Detailed instructions about how to set up this dataset are provided here.
Carlos Vaquero Instruments Dataset (CVAQ): This dataset includes single instrument recordings carried out by Carlos Vaquero as part of his master's thesis. Sounds are available as Freesound packs and can be downloaded at this page: https://freesound.org/people/Carlos_Vaquero/packs
Freesound Loops 4k (FSL4): This dataset includes a selection of music loops taken from Freesound. Detailed instructions about how to set up this dataset are provided here.
Giant Steps Key Dataset (GSKY): This dataset includes a selection of previews from Beatport annotated by key. Audio and original annotations are available here.
Good-sounds Dataset (GSND): This dataset contains monophonic recordings of instrument samples. Full description, original annotations and audio are available here.
University of Iowa Musical Instrument Samples (IOWA): This dataset was created by the Electronic Music Studios of the University of Iowa and contains recordings of instrument samples. The dataset is available upon request by visiting this website.
Mixcraft Loops (MIXL): This dataset includes some of the music loops included in Acoustica's Mixcraft music software. Access to these loops requires owning a license for the software. Detailed instructions about how to set up this dataset are provided here.
NSynth Dataset Test and Validation sets (NSYT and NSYV): NSynth is a large-scale and high-quality dataset of annotated musical notes built with synthesized sounds by Google's Magenta team. Full dataset description including original annotations and audio files is available here.
Philharmonia Orchestra Sound Samples Dataset (PHIL): This includes thousands of free, downloadable sound samples specially recorded by Philharmonia Orchestra players. Audio files are freely downloadable from the Philharmonia Orchestra website.
Freesound Single Events Dataset (SINGLE EVENT): This includes a selection of Freesound audio clips, each containing either a single audio event or multiple ones. Original audio files can be retrieved by downloading individual audio clips from Freesound using the Freesound ID provided in the CSV file. A similar procedure to that described here could be followed.
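For entries referenced by Freesound ID, one possible starting point is the Freesound APIv2 sound resource. The sketch below only builds the request and does not send it. It is not the procedure from the set-up instructions referenced above; it assumes you have registered for your own API token, and sound_info_request and the placeholder ID and token are hypothetical. Note that downloading original files through the API requires OAuth2 authentication, which is not covered here.

```python
import urllib.request

API_ROOT = "https://freesound.org/apiv2"


def sound_info_request(sound_id, api_token):
    """Build (but do not send) a request for one sound's metadata."""
    url = f"{API_ROOT}/sounds/{int(sound_id)}/"
    # Token authentication is passed in the Authorization header.
    return urllib.request.Request(
        url, headers={"Authorization": f"Token {api_token}"}
    )


# Placeholder values for illustration; replace with a real ID and your token.
request = sound_info_request("12345", "YOUR_API_TOKEN")
```

The request could then be passed to urllib.request.urlopen, and the JSON response inspected for the fields needed to locate the audio.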

Files (2.1 MB)

Name	MD5	Size
APPL - Ground Truth.csv	3a5f079339f4521640fcbf4082cdb543	200.6 kB
CVAQ - Ground Truth.csv	ece959759d8d8ff2d88470ce635fcbb3	20.9 kB
FSL4 - Ground Truth.csv	0d56320668b9695571a18058b2996f8a	99.9 kB
GSKY - Ground Truth.csv	fbc42a800e17ee5183fba9030015413e	21.9 kB
GSND - Ground Truth.csv	6c55bc355cff68f5c009d0b8cd3faae0	438.3 kB
IOWA - Ground Truth.csv	9c1e14105ae6addbae6ffa821b1e7817	20.2 kB
MIXL - Ground Truth.csv	b3fd5bb396088423f2177022c90d4158	201.0 kB
NSYT - Ground Truth.csv	efcf4f85e45a9f6d68a2a6b1d127ba17	194.9 kB
NSYV - Ground Truth.csv	d344eb4685cee403d03e505d559f5943	603.4 kB
PHIL - Ground Truth.csv	f5c2fc39f69b231ee73b138a73e59e06	289.4 kB
SINGLE EVENT - Ground Truth.csv	9c096b04cdb4789dd2a9b3d6bafb21de	3.1 kB
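The MD5 checksums listed for each file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard hashlib module; md5_of_file and matches_checksum are hypothetical helper names, and the file is assumed to sit in the current working directory.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the value listed for it."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, matches_checksum("APPL - Ground Truth.csv", "3a5f079339f4521640fcbf4082cdb543") should return True for an unmodified copy of that file.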

Additional details

Is supplement to: 10.5281/zenodo.2546643 (DOI)

Funding: European Commission, grant 688382, AudioCommons (Audio Commons: An Ecosystem for Creative Reuse of Audio Content)

