MAP-MUSIC2VEC: A SIMPLE AND EFFECTIVE BASELINE FOR SELF-SUPERVISED MUSIC AUDIO REPRESENTATION LEARNING

Yizhi Li; Ruibin Yuan; Ge Zhang; Yinghao Ma; Chenghua Lin; Xingran Chen; Anton Ragni; Hanzhi Yin; Zhijie Hu; Haoyu He; Emmanouil Benetos; Norbert Gyenge; Ruibo Liu; Jie Fu

doi:10.5281/zenodo.7403084

Published December 6, 2022 | Version v1

Preprint Open

1. University of Sheffield
2. Beijing Academy of Artificial Intelligence, Carnegie Mellon University
3. Beijing Academy of Artificial Intelligence, University of Michigan, Ann Arbor
4. Centre for Digital Music, Queen Mary University of London
5. Department of Computer Science, University of Sheffield
6. University of Michigan Ann Arbor, USA
7. Department of Computer Science, University of Sheffield
8. School of Music, Carnegie Mellon University
9. HSBC Business School, Peking University, China
10. University of Tübingen & MPI-IS, Germany
11. Centre for Digital Music, Queen Mary University of London, UK
12. Department of Computer Science, University of Sheffield, UK
13. Dartmouth College, NH, USA
14. Beijing Academy of Artificial Intelligence, China

The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL). However, how to build a framework for learning useful representations of raw music waveforms in a self-supervised manner remains unexplored. In this work, we design Music2Vec, a framework that explores different SSL algorithmic components and tricks for music audio recordings. Our model achieves results comparable to the state-of-the-art (SOTA) music SSL model Jukebox, despite being significantly smaller, with fewer than 2% of Jukebox's parameters. The model will be released on Hugging Face (https://huggingface.co/m-a-p/music2vec-v1).

The paper has been published at ISMIR LBD 2022. We used only 1k of 130k hours of data to train the ISMIR LBD demo and will scale up further to improve performance.

Files

Files (400.6 kB)

Name	MD5	Size
mhkcfwvzfptvydrkzjgdvsyhgzsdgnrb.zip	c5c8c6748ea8133aab38e5ddbf89decd	201.0 kB
Music2Vec_ISMIR (7).pdf	40ac856587dc7ec535610b844974575d	199.6 kB
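
The MD5 digests listed for the two files can be used to check a download for corruption. The following is a minimal sketch, assuming the files were saved locally under the names shown in the file list. The helper names `md5_of_file` and `verify_download` are hypothetical and are not part of this record or of any tool it mentions.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file list of this record.
EXPECTED_MD5 = {
    "mhkcfwvzfptvydrkzjgdvsyhgzsdgnrb.zip": "c5c8c6748ea8133aab38e5ddbf89decd",
    "Music2Vec_ISMIR (7).pdf": "40ac856587dc7ec535610b844974575d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the digest listed for its name.

    Hypothetical helper: looks the file up by its base name and raises
    KeyError if no digest is listed for that name.
    """
    path = Path(path)
    listed = expected.get(path.name)
    if listed is None:
        raise KeyError(f"no checksum listed for {path.name!r}")
    return md5_of_file(path) == listed.lower()
```

For example, `verify_download("Music2Vec_ISMIR (7).pdf")` would return `True` only if the local copy is byte-identical to the file the digest was computed from. MD5 is adequate for detecting accidental corruption but should not be relied on as protection against deliberate tampering.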

	All versions	This version
Views	720	718
Downloads	427	425
Data volume	91.0 MB	90.6 MB
