Published October 5, 2020 | Version v1
Conference paper | Open Access

Transfer learning from speech to music: towards language-sensitive emotion recognition models

  • 1. Universitat Pompeu Fabra
  • 2. Social and Cognitive Computing Department, A*STAR, Singapore
  • 3. European Commission, Joint Research Centre; Universitat Pompeu Fabra

Description

In this study, we address emotion recognition using unsupervised feature learning from speech data and test its transferability to music. Our approach is to pre-train models on speech in English and Mandarin, and then fine-tune them on excerpts of music labeled with emotion categories.
Our initial hypothesis is that features automatically learned from speech should be transferable to music. Specifically, we expect the intra-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in English) to outperform the cross-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in Mandarin). Our results confirm previous research on cross-domain transferability and encourage further research towards language-sensitive Music Emotion Recognition (MER) models.
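As a concrete illustration of this two-stage setup, here is a minimal PyTorch sketch: an autoencoder is pre-trained to reconstruct speech spectrograms (unsupervised feature learning), and its encoder is then reused as the front end of an emotion classifier fine-tuned on labeled music. The architecture, input shapes, learning rates, and the four-category label set are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

# --- Stage 1: unsupervised pre-training on speech ---------------------------
# Encoder/decoder over (1, 128, 128) mel-spectrogram patches (shapes assumed).
encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # -> (16, 64, 64)
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # -> (32, 32, 32)
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # -> (16, 64, 64)
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),               # -> (1, 128, 128)
)
autoencoder = nn.Sequential(encoder, decoder)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
recon_loss = nn.MSELoss()

speech_batch = torch.randn(8, 1, 128, 128)  # stand-in for unlabeled speech spectrograms
for _ in range(10):  # training loop sketch
    opt.zero_grad()
    loss = recon_loss(autoencoder(speech_batch), speech_batch)
    loss.backward()
    opt.step()

# --- Stage 2: supervised fine-tuning on emotion-labeled music ---------------
NUM_EMOTIONS = 4  # assumed number of emotion categories
classifier = nn.Sequential(
    encoder,                                  # pre-trained speech features, now fine-tuned
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),    # (B, 32)
    nn.Linear(32, NUM_EMOTIONS),
)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-4)  # smaller LR for fine-tuning
ce_loss = nn.CrossEntropyLoss()

music_batch = torch.randn(8, 1, 128, 128)            # stand-in for music spectrograms
music_labels = torch.randint(0, NUM_EMOTIONS, (8,))  # stand-in emotion labels
for _ in range(10):
    opt.zero_grad()
    loss = ce_loss(classifier(music_batch), music_labels)
    loss.backward()
    opt.step()
```

The intra- vs. cross-linguistic comparison then amounts to swapping which language's speech feeds stage 1 and which language's music feeds stage 2, holding everything else fixed.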

Files

EUSIPCO2020_JSGC_Transfer_Learning.pdf (189.3 kB)
md5:3d1d8da98f8fc134c70c62cf936e179f

Additional details

Funding

TROMPA – Towards Richer Online Music Public-domain Archives (grant 770376)
European Commission