Da-TACOS: A Dataset for Cover Song Identification and Understanding

Furkan Yesiler; Chris Tralie; Albin Correya; Diego Furtado Silva; Philip Tovstogan; Emilia Gomez; Xavier Serra

doi:10.5281/zenodo.3527810

Published November 4, 2019 | Version v1

Conference paper | Open access

This paper focuses on Cover Song Identification (CSI), an important research challenge in content-based Music Information Retrieval (MIR). Although the task is interesting and challenging in both academic and industrial settings, several limitations hinder the advancement of current approaches. We specifically address two of them in the present study. First, the number of publicly available datasets for this task is limited, and there is no publicly available benchmark set that is widely used among researchers for comparative algorithm evaluation. Second, most of the algorithms are not publicly shared and reproducible, which limits the comparison of approaches. To overcome these limitations, we propose Da-TACOS, a DaTAset for COver Song Identification and Understanding, and two frameworks for feature extraction and benchmarking to facilitate reproducibility. Da-TACOS contains 25K songs represented by unique editorial metadata plus nine low- and mid-level features pre-computed with open source libraries, and is divided into two subsets. The Cover Analysis subset contains audio features (e.g. key, tempo) that can serve to study how musical characteristics vary for cover songs. The Benchmark subset contains the set of features that have been frequently used in CSI research, e.g. chroma, MFCC, and beat onsets. Moreover, we provide initial benchmarking results for a selection of state-of-the-art CSI algorithms using our dataset, and for reproducibility, we share a GitHub repository containing the feature extraction and benchmarking frameworks.

Files

Files (700.6 kB)

Name	Size	MD5 checksum
ismir2019_paper_000038.pdf	700.6 kB	22835eaf7d4b39b01ec18589cf861545

	All versions	This version
Views	294	293
Downloads	212	212
Data volume	156.2 MB	156.2 MB
