Published July 20, 2022 | Version 1.0
Dataset Restricted

BAF: an audio fingerprinting dataset for broadcast monitoring

  • 1. BMAT Licensing S.L.; Music Technology Group, Universitat Pompeu Fabra
  • 2. BMAT Licensing S.L.
  • 3. Music Technology Group, Universitat Pompeu Fabra
  • 4. Epidemic Sound
  • 5. IPEM, Ghent University

Description

Overview

The Broadcast Audio Fingerprinting (BAF) dataset is an open, annotated dataset, available upon request, for the task of music monitoring in broadcast. As references, it contains 2,000 tracks from Epidemic Sound's private catalogue, amounting to 74 hours of audio. As queries, it contains over 57 hours of TV broadcast audio from 23 countries and 203 channels, distributed as 3,425 one-minute audio excerpts.

It has been annotated by six annotators in total, and each query has been cross-annotated by three of them. The resulting inter-annotator agreement percentages are high, which validates the annotation methodology and supports the reliability of the annotations.

Purpose of the dataset

This dataset aims to become the standard dataset for evaluating audio fingerprinting algorithms, since it is built on real data without the use of any data-augmentation techniques. It is also the first dataset to address background music fingerprinting, which is a real problem in royalties distribution.

Dataset use

This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis.

About the data

All audio files are monophonic, 8 kHz, 128 kb/s .wav files encoded as pcm_s16le. For each query, the annotations mark which tracks sound in it, if any (either in the foreground or in the background), and the specific times at which each track starts and stops sounding in the query.
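The stated format can be checked with Python's standard `wave` module. The following is a minimal sketch, not part of the dataset tooling: the helper names (`describe_wav`, `matches_expected`, `bitrate_kbps`) are hypothetical, and the demo runs on a synthetic one-second silent clip rather than on a real BAF file. It also shows where the 128 kb/s figure comes from: 1 channel × 16 bits × 8,000 samples per second.

```python
import io
import wave

# Format stated in the dataset description: mono, 16-bit PCM, 8 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def describe_wav(source):
    """Read basic format information from a WAV path or file-like object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_expected(info):
    """True if channels, sample width and sample rate match EXPECTED."""
    return all(info[key] == value for key, value in EXPECTED.items())


def bitrate_kbps(info):
    """Uncompressed PCM bitrate: channels * bits per sample * sample rate."""
    bits_per_sample = info["sample_width_bytes"] * 8
    return info["channels"] * bits_per_sample * info["sample_rate_hz"] / 1000


# Demo on a synthetic clip (an assumption for illustration, not dataset audio).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 8000)  # one second of silence
buffer.seek(0)

demo_info = describe_wav(buffer)
print(demo_info, matches_expected(demo_info), bitrate_kbps(demo_info))
```

With access to the files, `describe_wav` could be called on a path such as `queries/query_0001.wav` instead of the in-memory buffer.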

Note that there are 88 queries that do not have any matches.
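One way to list such queries is a set difference between all query filenames and those that appear in the annotations. This is a sketch under an assumption: a query is taken to be unmatched when its filename never appears in the `query` column of the annotations. The function name and the toy filenames below are illustrative and do not reproduce the real list of 88.

```python
def unmatched_queries(all_queries, annotated_queries):
    """Return the query filenames that never appear among the annotated ones."""
    return sorted(set(all_queries) - set(annotated_queries))


# Toy data for illustration only (five queries, two of them annotated).
all_queries = [f"query_{index:04d}.wav" for index in range(1, 6)]
annotated = ["query_0001.wav", "query_0003.wav", "query_0003.wav"]

missing = unmatched_queries(all_queries, annotated)
print(missing)
```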

For more information, check the dedicated GitHub repository (https://github.com/guillemcortes/baf-dataset) and the dataset datasheet included in the files.

 

Dataset contents

The dataset is structured as follows:

baf-dataset/
├── baf_datasheet.pdf
├── annotations.csv
├── changelog.md
├── cross_annotations.csv
├── queries_info.csv
├── queries
│   ├── query_0001.wav
│   ├── query_0002.wav
│   ├── …
│   └── query_3425.wav
└── references
    ├── ref_0001.wav
    ├── ref_0002.wav
    ├── …
    └── ref_2000.wav

There are two folders named queries and references containing the wav files of TV broadcast recordings and the reference tracks, respectively.

The annotations.csv file contains the annotations made by the six annotators, with the following information:

annotations.csv content summary
query           reference     query_start  query_end  annotator
query_0692.wav  ref_1235.wav  0.0          59.904     annotator_6
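The annotations can be loaded with Python's standard `csv` module. The sketch below assumes the file is comma-separated with a header row using the column names shown above; the actual delimiter is not stated here, so adjust it if needed. The sample text reuses the single row from the summary, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed layout: comma-separated, header row as in the summary above.
SAMPLE = """query,reference,query_start,query_end,annotator
query_0692.wav,ref_1235.wav,0.0,59.904,annotator_6
"""


def load_annotations(handle):
    """Parse annotation rows, converting the time columns to floats (seconds)."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "query": row["query"],
            "reference": row["reference"],
            "query_start": float(row["query_start"]),
            "query_end": float(row["query_end"]),
            "annotator": row["annotator"],
        })
    return rows


def references_per_query(rows):
    """Map each query filename to the set of references annotated in it."""
    matches = defaultdict(set)
    for row in rows:
        matches[row["query"]].add(row["reference"])
    return dict(matches)


annotations = load_annotations(io.StringIO(SAMPLE))
matches = references_per_query(annotations)
print(matches)
```

To read the real file, replace `io.StringIO(SAMPLE)` with `open("annotations.csv", newline="")`.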

cross_annotations.csv contains the annotations that result from merging the overlapping annotations in annotations.csv. The x_tag column takes one of three values:

  • single: the segment has only been annotated by one annotator.

  • majority: the segment has been annotated by two annotators.

  • unanimity: the segment has been annotated by the three annotators.

cross_annotations.csv content summary
query           reference     query_start  query_end  annotators                                     x_tag
query_0693.wav  ref_1834.wav  37.53        38.07      ['annotator_3']                                single
query_0693.wav  ref_1834.wav  18.18        37.48      ['annotator_3', 'annotator_5', 'annotator_3']  unanimity
query_0693.wav  ref_1834.wav  37.48        37.53      ['annotator_5', 'annotator_3']                 majority
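The tagging can be illustrated with a small interval-splitting sketch. This is not the authors' merging procedure, which is not described here; it is one plausible way to derive segments and tags, assuming each segment is tagged by the number of distinct annotators covering it. The input intervals are hypothetical values chosen to reproduce the segment boundaries above, and `annotator_x` is a placeholder for the third annotator.

```python
TAGS = {1: "single", 2: "majority", 3: "unanimity"}


def cross_annotate(annotations):
    """Split overlapping annotations of one (query, reference) pair into segments.

    `annotations` is a list of (annotator, start, end) tuples. Each returned
    segment is (start, end, annotators, tag), where the tag depends on how
    many distinct annotators cover the segment (at most three are assumed).
    """
    boundaries = sorted({t for _, start, end in annotations for t in (start, end)})
    segments = []
    for seg_start, seg_end in zip(boundaries, boundaries[1:]):
        covering = sorted({
            name for name, start, end in annotations
            if start <= seg_start and seg_end <= end
        })
        if covering:  # skip gaps that nobody annotated
            segments.append((seg_start, seg_end, covering, TAGS[len(covering)]))
    return segments


# Hypothetical per-annotator intervals for query_0693.wav / ref_1834.wav.
demo = [
    ("annotator_3", 18.18, 38.07),
    ("annotator_5", 18.18, 37.53),
    ("annotator_x", 18.18, 37.48),  # placeholder name, not from the dataset
]

for segment in cross_annotate(demo):
    print(segment)
```

Under these assumed inputs, the three segments carry the tags unanimity, majority and single, in chronological order.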

queries_info.csv contains information about the queries, provided as a citation reference: the country, the channel, and the date and time of the broadcast.

queries_info.csv content summary
filename        country  channel            datetime
query_0001.wav  Norway   Discovery Channel  2021-02-26 14:45:26
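The query metadata can be parsed in the same way. The sketch below again assumes a comma-separated file with the header shown above, and that the datetime column follows the `YYYY-MM-DD HH:MM:SS` pattern of the sample row; time zone handling is not specified here, so the parsed values are naive datetimes. The function name is hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed layout: comma-separated, header row as in the summary above.
SAMPLE = """filename,country,channel,datetime
query_0001.wav,Norway,Discovery Channel,2021-02-26 14:45:26
"""


def load_queries_info(handle):
    """Parse query metadata, converting the datetime column to datetime objects."""
    records = []
    for row in csv.DictReader(handle):
        row["datetime"] = datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S")
        records.append(row)
    return records


queries = load_queries_info(io.StringIO(SAMPLE))

# Example aggregation: number of query excerpts per country.
per_country = Counter(record["country"] for record in queries)
print(per_country)
```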

changelog.md contains a curated, chronologically ordered list of notable changes for each version of the dataset.

baf_datasheet.pdf contains the dataset datasheet, a standardized form of documentation for datasets.

 

Ownership of the data

Next, we specify the ownership of all the data included in BAF: Broadcast Audio Fingerprinting dataset. For licensing information, please refer to the “License” section.

Reference tracks

The reference tracks are owned by Epidemic Sound AB, which has given a worldwide, revocable, non-exclusive, royalty-free licence to use and reproduce this data collection consisting of 2,000 low-quality monophonic 8kHz downsampled audio recordings.

Query tracks

The query tracks come from publicly available TV broadcasts, so the ownership of each recording belongs to the channel that broadcast the content. We publish them under the right of quotation provided by the Berne Convention.

Annotations

Guillem Cortès, together with Alex Ciurana and Emilio Molina from BMAT Music Licensing S.L., managed the annotation; therefore, the annotations belong to BMAT.

 

Accessing the dataset

The dataset is available upon request. Please include, in the justification field, your academic affiliation (if you have one) and a brief description of your research topics and why you would like to use this dataset. Bear in mind that this information is important for the evaluation of every access request.

 

License

This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis. Given the different ownership of the elements of the dataset, the dataset is licensed under the following conditions:

  1. User’s access request

  2. Research only, non-commercial purposes

  3. No adaptations nor derivative works

  4. Attribution to Epidemic Sound and the authors as indicated in the “Citation” section.


 

Acknowledgments

This work was supported by the Ministerio de Ciencia, Innovación y Universidades through the Retos-Colaboración call (reference: RTC2019-007248-7), and by the Industrial Doctorates Plan of the Secretariat of Universities and Research of the Department of Business and Knowledge of the Generalitat de Catalunya (reference: DI46-2020).

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.
