10.5281/zenodo.5651429
https://zenodo.org/records/5651429
oai:zenodo.org:5651429
Igor Vatolkin
0000-0002-9454-9402
TU Dortmund University, Department of Computer Science, Germany
Cory McKay
0000-0003-3214-8862
Marianopolis College, Department of Liberal and Creative Arts, Canada
Multi-modal dataset for music genre recognition based on six different modalities for LMD-aligned and SLAC datasets
Zenodo
2021
Multi-modal music data
2021-11-07
10.5281/zenodo.5651428
Creative Commons Attribution 4.0 International
Multi-modal dataset for music genre recognition based on six different modalities for the LMD-aligned [1] and SLAC [2] datasets. Further details are provided in [3].
Descriptions of files
| File | Description |
| --- | --- |
| LMD-aligned_Filelist.arff | File list of 1575 music tracks selected from the LMD-aligned dataset [1] with tagtraum genre annotations [4]. Only a subset of LMD-aligned is used: pieces for which all six modalities were accessible and which belong to well-represented genres. |
| LMD-aligned_ExtractedFeatures.tar.gz | Raw audio signal and model-based features extracted with AMUSE [5] |
| LMD-aligned_ProcessedFeatures.tar.gz | Processed features: audio signal and model-based features are aggregated over 4 s time frames with a 2 s step size; all other features (see the table below) have the same values for all time frames |
| LMD-aligned_Datasets.tar.gz | Training, optimization, and test datasets for 3 splits for the recognition of 5 genres in [3] |
| SLAC_Filelist.arff | File list of 250 music tracks from the SLAC dataset [2] (genres and sub-genres are given by the folder structure) |
| SLAC_ExtractedFeatures.tar.gz | Raw audio signal and model-based features extracted with AMUSE [5] |
| SLAC_ProcessedFeatures.tar.gz | Processed features: audio signal and model-based features are aggregated over 4 s time frames with a 2 s step size; all other features (see the table below) have the same values for all time frames |
| SLAC_Datasets.tar.gz | Training, optimization, and test datasets for 3 splits for the recognition of 5 genres and 10 sub-genres in [3] |
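The framing used for the processed audio signal and model-based features (4 s frames, 2 s step) can be illustrated with a minimal sketch. The function name `frame_bounds` is hypothetical, and two points are assumptions not stated in this record: that the first frame starts at 0 s and that a trailing partial frame is dropped. The aggregation applied within each frame is not described here, so only the frame boundaries are computed.

```python
def frame_bounds(duration_s, frame_s=4.0, step_s=2.0):
    """Return (start, end) times in seconds of overlapping time frames.

    Assumptions (not confirmed by the dataset description): frames start
    at 0 s, and a final frame that would exceed the track is discarded.
    """
    bounds = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while i * step_s + frame_s <= duration_s:
        start = i * step_s
        bounds.append((start, start + frame_s))
        i += 1
    return bounds


# A 10 s excerpt yields four frames, each overlapping its neighbour by 2 s.
print(frame_bounds(10.0))
# [(0.0, 4.0), (2.0, 6.0), (4.0, 8.0), (6.0, 10.0)]
```

Under this framing, features that are constant per track (playlists, symbolic, album covers, lyrics) would simply repeat the same value in every frame, which matches the description of the processed features.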
Modalities and feature sub-groups
| Modality | Sub-group | Dimensions in processed features of LMD-aligned | Dimensions in processed features of SLAC |
| --- | --- | --- | --- |
| Audio signal | Low-level | 1-524 | 1-524 |
| Audio signal | Semantic | 525-810 | 525-810 |
| Audio signal | Structural complexity | 811-908 | 811-908 |
| Model-based | Instruments | 909-1018 | 909-1018 |
| Model-based | Moods | 1019-1146 | 1019-1146 |
| Model-based | Various | 1147-1402 | 1147-1402 |
| Playlists | Genres | 1403-1973 | 1403-1973 |
| Playlists | Styles | 1974-1695 | 1974-1695 |
| Symbolic | Pitch | 1696-1757 | 1696-1757 |
| Symbolic | Melodic | 1758-1781 | 1758-1781 |
| Symbolic | Chords | 1782-1836 | 1782-1836 |
| Symbolic | Rhythm | 1837-1935 | 1837-1935 |
| Symbolic | Tempo | 1936-1963 | 1936-1963 |
| Symbolic | Instrument presence | 1964-2441 | 1964-2441 |
| Symbolic | Instruments | 2442-2456 | 2442-2456 |
| Symbolic | Texture | 2457-2480 | 2457-2480 |
| Symbolic | Dynamics | 2481-2484 | 2481-2484 |
| Album covers | SIFT | 2485-2584 | 2485-2584 |
| Lyrics | jLyrics descriptors | 2585-2603 | 2585-2671 |
| Lyrics | Bag-of-Words | 2604-2703 | |
| Lyrics | Doc2Vec | 2704-2803 | |
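The dimension ranges above are 1-based and inclusive, so they need converting before they can index a 0-based array. The following is a minimal sketch of that conversion for the LMD-aligned processed features. It uses only modality-level ranges, taken from the first and last index listed for each modality. The names `MODALITY_RANGES`, `modality_slice`, and `select_modality` are hypothetical, and it is an assumption that these indices address the positions of a per-frame feature vector once it has been loaded from the processed feature files.

```python
# 1-based, inclusive dimension ranges per modality (LMD-aligned column).
# For SLAC the Lyrics range differs (2585-2671 in the table).
MODALITY_RANGES = {
    "Audio signal": (1, 908),
    "Model-based": (909, 1402),
    "Playlists": (1403, 1695),
    "Symbolic": (1696, 2484),
    "Album covers": (2485, 2584),
    "Lyrics": (2585, 2803),
}


def modality_slice(name):
    """Convert a 1-based inclusive range to a 0-based Python slice."""
    first, last = MODALITY_RANGES[name]
    return slice(first - 1, last)


def select_modality(feature_vector, name):
    """Return the part of a feature vector that belongs to one modality."""
    return feature_vector[modality_slice(name)]


# Dummy vector whose values equal their own 1-based dimension index.
dummy = list(range(1, 2804))
covers = select_modality(dummy, "Album covers")
print(len(covers), covers[0], covers[-1])
# 100 2485 2584
```

The same slices work unchanged on a NumPy array or any other sequence that supports slicing, which makes it straightforward to train on a single modality or on a chosen combination.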