On Loss Functions for Music Source Separation

Gusó, Enric

doi:10.5281/zenodo.4091379

Published September 15, 2020 | Version v1

Thesis Open

Gusó, Enric¹

1. Universitat Pompeu Fabra

Contributors

Supervisor:

Pons, Jordi¹

1. Universitat Pompeu Fabra

Although L1 and L2 loss functions capture no perceptually related information beyond waveform matching, they achieve remarkable results when used to train music source separation models. Our work extends the existing literature on loss functions for training deep learning audio models, with the aim of better understanding the pros and cons of several loss functions (including L1, L2, and perceptually motivated losses) within a standardized evaluation framework.

In this work we focus on defining an evaluation framework for a fair comparison among losses, because we found it difficult to draw conclusions from the existing body of literature. Generally, loss improvements are presented along with additional model modifications (e.g., different data augmentation or a different model topology), making it difficult to assess the contribution of the loss to the results. This study focuses on standardizing the evaluation process by employing the same dataset, the same data augmentation strategy, and the same model topology while varying only the loss. The alternative losses we consider are based on cross-entropy, scale-invariant SDR, multi-resolution STFT, and phase-sensitive losses, among others.
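As a rough illustration of three of the waveform-domain losses named above, the following sketch computes L1, L2, and a negated scale-invariant SDR on two equal-length mono waveforms. It is not taken from the thesis: the function names are hypothetical, the scale-invariant SDR follows the commonly used projection-based definition, and plain Python lists are used only to keep the example self-contained (a training setup would normally use a tensor library).

```python
import math


def l1_loss(estimate, target):
    """Mean absolute error between two equal-length waveforms."""
    return sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)


def l2_loss(estimate, target):
    """Mean squared error between two equal-length waveforms."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB (assumed definition, not the thesis code).

    The estimate is split into its projection onto the target and a
    residual; the ratio of their energies is reported in decibels.
    """
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    scale = dot / (target_energy + eps)
    projection = [scale * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    projection_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((projection_energy + eps) / (residual_energy + eps))


def si_sdr_loss(estimate, target):
    """Negated SI-SDR, so that minimizing the loss maximizes SI-SDR."""
    return -si_sdr(estimate, target)
```

Unlike L1 and L2, the scale-invariant SDR is unchanged when the estimate is multiplied by a constant gain, which is one reason it behaves differently as a training objective.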

Files

Files (2.0 MB)

Name	Size
2020-Enric-Guso.pdf (md5:5beeb7734d344ec5ae1acef946910c32)	2.0 MB

	All versions	This version
Views	764	750
Downloads	1,473	1,467
Data volume	3.1 GB	3.1 GB
