Dataset Open Access
Multi-modal dataset for music genre recognition based on six different modalities for the LMD-aligned [1] and SLAC [2] datasets. Further details are provided in [3].
Descriptions of files
Link | Description |
---|---|
LMD-aligned_Filelist.arff | File list with 1575 music tracks selected from the LMD-aligned dataset [1] with tagtraum genre annotations [4] (only a subset of LMD-aligned is used, which includes only pieces for which all six modalities were accessible, and which includes only well-represented genres) |
LMD-aligned_ExtractedFeatures.tar.gz | Raw audio signal and model-based features extracted with AMUSE [5] |
LMD-aligned_ProcessedFeatures.tar.gz | Processed features: audio signal and model-based features are aggregated over 4 s time frames with a 2 s step size; all other features (see the table below) have the same values for all time frames of a track |
LMD-aligned_Datasets.tar.gz | Training, optimization, and test datasets for 3 splits for the recognition of 5 genres in [3] |
SLAC_Filelist.arff | File list with 250 music tracks from the SLAC dataset [2] (genres and sub-genres are provided in the folder structure) |
SLAC_ExtractedFeatures.tar.gz | Raw audio signal and model-based features extracted with AMUSE [5] |
SLAC_ProcessedFeatures.tar.gz | Processed features: audio signal and model-based features are aggregated over 4 s time frames with a 2 s step size; all other features (see the table below) have the same values for all time frames of a track |
SLAC_Datasets.tar.gz | Training, optimization, and test datasets for 3 splits for the recognition of 5 genres and 10 sub-genres in [3] |
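The 4 s time frames with a 2 s step size used for the processed audio signal and model-based features overlap by half a frame. The sketch below only illustrates the resulting frame boundaries. It assumes that the first frame starts at 0 s and that a trailing partial frame is dropped; neither detail is stated in the description, and `frame_windows` is a hypothetical helper, not part of AMUSE.

```python
def frame_windows(duration_s, frame_s=4.0, step_s=2.0):
    """Return (start, end) times in seconds of all full frames in a track.

    Assumptions (not confirmed by the dataset description): the first
    frame starts at 0 s and a trailing partial frame is discarded.
    """
    windows = []
    index = 0
    while index * step_s + frame_s <= duration_s:
        start = index * step_s
        windows.append((start, start + frame_s))
        index += 1
    return windows


# A 10 s excerpt yields four overlapping frames.
print(frame_windows(10.0))
# [(0.0, 4.0), (2.0, 6.0), (4.0, 8.0), (6.0, 10.0)]
```

Features from the other modalities would simply be repeated for every frame of the same track.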
Modalities and feature sub-groups
Modality | Sub-group | Dimensions in processed features of LMD-aligned | Dimensions in processed features of SLAC |
---|---|---|---|
Audio signal | Low-level | 1-524 | 1-524 |
Audio signal | Semantic | 525-810 | 525-810 |
Audio signal | Structural complexity | 811-908 | 811-908 |
Model-based | Instruments | 909-1018 | 909-1018 |
Model-based | Moods | 1019-1146 | 1019-1146 |
Model-based | Various | 1147-1402 | 1147-1402 |
Playlists | Genres | 1403-1473 | 1403-1473 |
Playlists | Styles | 1474-1695 | 1474-1695 |
Symbolic | Pitch | 1696-1757 | 1696-1757 |
Symbolic | Melodic | 1758-1781 | 1758-1781 |
Symbolic | Chords | 1782-1836 | 1782-1836 |
Symbolic | Rhythm | 1837-1935 | 1837-1935 |
Symbolic | Tempo | 1936-1963 | 1936-1963 |
Symbolic | Instrument presence | 1964-2441 | 1964-2441 |
Symbolic | Instruments | 2442-2456 | 2442-2456 |
Symbolic | Texture | 2457-2480 | 2457-2480 |
Symbolic | Dynamics | 2481-2484 | 2481-2484 |
Album covers | SIFT | 2485-2584 | 2485-2584 |
Lyrics | jLyrics descriptors | 2585-2603 | 2585-2671 |
Lyrics | Bag-of-Words | 2604-2703 | |
Lyrics | Doc2Vec | 2704-2803 | |
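The dimension ranges in the table are 1-based and inclusive, so they need to be shifted before indexing a feature vector in a 0-based language. The following sketch shows one way to do this for the LMD-aligned modalities. The per-modality ranges are taken from the first and last sub-group of each modality in the table; the assumption that a processed feature vector stores its dimensions in exactly this order, and the helper names, are illustrative only.

```python
# 1-based, inclusive dimension ranges per modality for LMD-aligned,
# spanning the first to the last sub-group of each modality in the table.
LMD_ALIGNED_MODALITIES = {
    "Audio signal": (1, 908),
    "Model-based": (909, 1402),
    "Playlists": (1403, 1695),
    "Symbolic": (1696, 2484),
    "Album covers": (2485, 2584),
    "Lyrics": (2585, 2803),
}


def to_slice(first, last):
    """Convert a 1-based inclusive range to a 0-based Python slice."""
    return slice(first - 1, last)


def select_modality(feature_vector, modality):
    """Return the part of one feature vector that belongs to a modality.

    Assumes (hypothetically) that the vector holds all 2803 dimensions
    in the order given in the table.
    """
    first, last = LMD_ALIGNED_MODALITIES[modality]
    return feature_vector[to_slice(first, last)]


# Dummy vector whose values equal their own 1-based dimension numbers.
vector = list(range(1, 2804))
covers = select_modality(vector, "Album covers")
print(len(covers), covers[0], covers[-1])  # 100 2485 2584
```

For SLAC the lyrics range differs (2585-2671), so a separate mapping would be needed there.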
Name | MD5 checksum | Size |
---|---|---|
LMD-aligned_Datasets.tar.gz | b12218bee9b89f59b3a5a618af15a2de | 339.0 kB |
LMD-aligned_ExtractedFeatures.tar.gz | 64605dfb088f07e84fcdb0a58471f64c | 5.5 GB |
LMD-aligned_Filelist.arff | 9302ccd86292c893c187df89c637484a | 162.3 kB |
LMD-aligned_ProcessedFeatures.tar.gz | 8757b54912eb795ead87bc90a468cc41 | 148.3 MB |
SLAC_Datasets.tar.gz | faa8cbf325129f52bdf31904938b12ff | 85.4 kB |
SLAC_ExtractedFeatures.tar.gz | b7991f385d2c4284b22f9c707f01ff7c | 5.7 GB |
SLAC_Filelist.arff | 4c06a526e11bd93bddfe436b49e3e7ef | 15.8 kB |
SLAC_ProcessedFeatures.tar.gz | e3d716aa197904dec27f179bd8130693 | 247.8 MB |
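The MD5 checksums listed with the files above can be used to check that a download is complete, which is mainly useful for the multi-gigabyte feature archives. A minimal sketch using Python's standard `hashlib` module follows; the helper names and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected_md5.lower()


# Usage, assuming the file was downloaded into the current directory:
# matches_checksum("SLAC_Filelist.arff", "4c06a526e11bd93bddfe436b49e3e7ef")
```

Reading in chunks keeps memory use constant regardless of archive size.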
[1] Raffel, C. (2016). Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. PhD thesis, Graduate School of Arts and Sciences, Columbia University
[2] McKay, C., Burgoyne, J. A., Hockman, J., Smith, J. B. L., Vigliensoni, G., and Fujinaga, I. (2010). Evaluating the genre classification performance of lyrical features relative to audio, symbolic and cultural features. In Proc. of the 11th International Society for Music Information Retrieval Conference, ISMIR, pp. 213–218
[3] Vatolkin, I. and McKay, C. (2022). Multi-Objective Investigation of Six Feature Source Types for Multi-Modal Music Classification. Transactions of the International Society for Music Information Retrieval, 5(1), pp. 1–19
[4] Schreiber, H. (2015). Improving genre annotations for the million song dataset. In Proc. of the 16th International Society for Music Information Retrieval Conference, ISMIR, pp. 241–247
[5] Vatolkin, I., Theimer, W. M., and Botteck, M. (2010). AMUSE (advanced music explorer) - A multitool framework for music data analysis. In Proc. of the 11th International Society for Music Information Retrieval Conference, ISMIR, pp. 33–38