Published November 7, 2021 | Version v1
Dataset | Open Access

Multi-modal dataset for music genre recognition based on six different modalities for LMD-aligned and SLAC datasets

  • 1. TU Dortmund University, Department of Computer Science, Germany
  • 2. Marianopolis College, Department of Liberal and Creative Arts, Canada

Description

A multi-modal dataset for music genre recognition, based on six different modalities, for the LMD-aligned [1] and SLAC [2] datasets. Further details are provided in [3].

Descriptions of files

  • LMD-aligned_Filelist.arff: File list with 1575 music tracks selected from the LMD-aligned dataset [1] with tagtraum genre annotations [4]. Only a subset of LMD-aligned is used: pieces for which all six modalities were available and which belong to well-represented genres.
  • LMD-aligned_ExtractedFeatures.tar.gz: Raw audio signal and model-based features extracted with AMUSE [5].
  • LMD-aligned_ProcessedFeatures.tar.gz: Audio signal and model-based features aggregated over 4 s time frames with a 2 s step size (a minimal aggregation sketch follows this list); all other features (see the table below) keep the same values for all time frames.
  • LMD-aligned_Datasets.tar.gz: Training, optimization, and test datasets for the 3 splits used for the recognition of 5 genres in [3].
  • SLAC_Filelist.arff: File list with 250 music tracks from the SLAC dataset [2]; genres and sub-genres are encoded in the folder structure.
  • SLAC_ExtractedFeatures.tar.gz: Raw audio signal and model-based features extracted with AMUSE [5].
  • SLAC_ProcessedFeatures.tar.gz: Audio signal and model-based features aggregated over 4 s time frames with a 2 s step size; all other features (see the table below) keep the same values for all time frames.
  • SLAC_Datasets.tar.gz: Training, optimization, and test datasets for the 3 splits used for the recognition of 5 genres and 10 sub-genres in [3].
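
For orientation, the following is a minimal sketch of the 4 s / 2 s time-frame aggregation described above. The array names and the use of the per-window mean as the aggregation function are assumptions for illustration only; the published processed features were generated with AMUSE [5].

    import numpy as np

    def aggregate_frames(values, frame_times, window_s=4.0, step_s=2.0):
        # values:      (n_frames, n_dims) NumPy array of raw frame-level feature values
        # frame_times: (n_frames,) NumPy array of frame start times in seconds
        # Returns one aggregated row per 4 s window, taken every 2 s. Using the mean of
        # each dimension inside the window is an assumption for illustration.
        rows = []
        t = 0.0
        while t < frame_times[-1]:
            in_window = (frame_times >= t) & (frame_times < t + window_s)
            if in_window.any():
                rows.append(values[in_window].mean(axis=0))
            t += step_s
        return np.vstack(rows)

    # Hypothetical usage with arrays parsed from the extracted feature files:
    # processed = aggregate_frames(raw_values, raw_frame_times)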

Modalities and feature sub-groups

Modality          Sub-group                Dimensions in processed      Dimensions in processed
                                           features of LMD-aligned      features of SLAC
Audio signal      Low-level                1-524                        1-524
Audio signal      Semantic                 525-810                      525-810
Audio signal      Structural complexity    811-908                      811-908
Model-based       Instruments              909-1018                     909-1018
Model-based       Moods                    1019-1146                    1019-1146
Model-based       Various                  1147-1402                    1147-1402
Playlists         Genres                   1403-1973                    1403-1973
Playlists         Styles                   1974-1695                    1974-1695
Symbolic          Pitch                    1696-1757                    1696-1757
Symbolic          Melodic                  1758-1781                    1758-1781
Symbolic          Chords                   1782-1836                    1782-1836
Symbolic          Rhythm                   1837-1935                    1837-1935
Symbolic          Tempo                    1936-1963                    1936-1963
Symbolic          Instrument presence      1964-2441                    1964-2441
Symbolic          Instruments              2442-2456                    2442-2456
Symbolic          Texture                  2457-2480                    2457-2480
Symbolic          Dynamics                 2481-2484                    2481-2484
Album covers      SIFT                     2485-2584                    2485-2584
Lyrics            jLyrics descriptors      2585-2603                    2585-2671
Lyrics            Bag-of-Words             2604-2703                    -
Lyrics            Doc2Vec                  2704-2803                    -
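
As a usage sketch, the dimension ranges above can be used to cut a loaded processed-feature matrix into per-modality blocks. The dictionary below copies a few of the ranges from the table (LMD-aligned layout); the matrix variable and the loading step are left out and assumed, since they depend on how the processed feature files are parsed.

    # 1-based, inclusive column ranges copied from the table above (LMD-aligned layout;
    # SLAC differs only in the lyrics sub-groups). Only a subset of sub-groups is listed.
    SUBGROUP_RANGES = {
        "audio_low_level":   (1, 524),
        "audio_semantic":    (525, 810),
        "model_instruments": (909, 1018),
        "symbolic_pitch":    (1696, 1757),
        "album_covers_sift": (2485, 2584),
        "lyrics_doc2vec":    (2704, 2803),
    }

    def subgroup_columns(features, name):
        # features: (n_tracks, n_dims) matrix of processed features.
        start, end = SUBGROUP_RANGES[name]   # 1-based and inclusive, as in the table
        return features[:, start - 1:end]    # shift to 0-based; slice end stays inclusive

    # Hypothetical usage:
    # X = ...  # full processed-feature matrix for LMD-aligned, loaded by your own parser
    # pitch = subgroup_columns(X, "symbolic_pitch")   # shape (n_tracks, 62)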

Files

Total size: 11.6 GB

Size        MD5 checksum
339.0 kB    b12218bee9b89f59b3a5a618af15a2de
5.5 GB      64605dfb088f07e84fcdb0a58471f64c
162.3 kB    9302ccd86292c893c187df89c637484a
148.3 MB    8757b54912eb795ead87bc90a468cc41
85.4 kB     faa8cbf325129f52bdf31904938b12ff
5.7 GB      b7991f385d2c4284b22f9c707f01ff7c
15.8 kB     4c06a526e11bd93bddfe436b49e3e7ef
247.8 MB    e3d716aa197904dec27f179bd8130693

Additional details

References

  • [1] Raffel, C. (2016). Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. PhD thesis, Graduate School of Arts and Sciences, Columbia University.
  • [2] McKay, C., Burgoyne, J. A., Hockman, J., Smith, J. B. L., Vigliensoni, G., and Fujinaga, I. (2010). Evaluating the genre classification performance of lyrical features relative to audio, symbolic and cultural features. In Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR, pp. 213–218.
  • [3] Vatolkin, I. and McKay, C. (2022). Multi-Objective Investigation of Six Feature Source Types for Multi-Modal Music Classification. Transactions of the International Society for Music Information Retrieval, 5(1), pp. 1–19.
  • [4] Schreiber, H. (2015). Improving genre annotations for the Million Song Dataset. In Proceedings of the 16th International Society for Music Information Retrieval Conference, ISMIR, pp. 241–247.
  • [5] Vatolkin, I., Theimer, W. M., and Botteck, M. (2010). AMUSE (Advanced Music Explorer): A multitool framework for music data analysis. In Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR, pp. 33–38.