Published November 4, 2019 | Version v1
Conference paper | Open Access

Towards Explainable Music Emotion Recognition: The Route via Mid-level Features

Description

Emotion plays an important part in our interaction with music. However, modeling this aspect in MIR systems has been notoriously challenging, since emotion is an inherently abstract and subjective experience, making it difficult to quantify or predict in the first place, and to make sense of the predictions afterwards. In an attempt to create a model that can give a musically meaningful and intuitive explanation for its predictions, we propose a VGG-style deep neural network that learns to predict the emotional characteristics of a musical piece together with (and based on) human-interpretable mid-level perceptual features. We compare this to predicting emotion directly with an identical network that does not take the mid-level features into account, and observe that the cost of going through the mid-level features is, on average, surprisingly low. The design of our network allows us to visualize the effect of each perceptual feature on individual emotion predictions, and we argue that the small loss in performance incurred by going through the mid-level features is justified by the gain in explainability of the predictions.
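The key architectural idea in the abstract is that the emotion outputs are computed from (and trained together with) a small set of interpretable mid-level perceptual features, rather than directly from the audio representation. The following is a minimal PyTorch sketch of such a two-stage network under stated assumptions: the class name, the depth of the VGG-style backbone, and the choice of 7 mid-level features and 8 emotion dimensions are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MidLevelEmotionNet(nn.Module):
    """Illustrative two-stage model: a VGG-style convolutional backbone
    regresses human-interpretable mid-level perceptual features from a
    spectrogram, and the emotion predictions are a linear function of
    those mid-level features only."""

    def __init__(self, n_midlevel=7, n_emotions=8):
        super().__init__()
        # VGG-style stack: repeated 3x3 convolutions with ReLU and pooling.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Head that regresses the mid-level perceptual features.
        self.midlevel_head = nn.Linear(256, n_midlevel)
        # Emotion is predicted *from* the mid-level features via a single
        # linear layer, so its weight matrix shows how each perceptual
        # feature contributes to each emotion output.
        self.emotion_head = nn.Linear(n_midlevel, n_emotions)

    def forward(self, spectrogram):
        h = self.backbone(spectrogram).flatten(1)
        midlevel = self.midlevel_head(h)
        emotion = self.emotion_head(midlevel)
        return midlevel, emotion


# Usage: a batch of 2 spectrogram excerpts (1 channel, 128 bins, 256 frames).
model = MidLevelEmotionNet()
mid, emo = model(torch.randn(2, 1, 128, 256))
print(mid.shape, emo.shape)  # torch.Size([2, 7]) torch.Size([2, 8])
```

In a design like this, the final linear layer directly exposes the contribution of each mid-level feature to each emotion prediction, which is what makes per-prediction explanations possible; training would typically regress both the mid-level and the emotion targets jointly, as the abstract's "together with (and based on)" phrasing suggests.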

Files

ismir2019_paper_000027.pdf (646.1 kB)
md5:261dab75d3d34d0261e46f5f804f4e9c