Published September 19, 2020 | Version v2
Preprint Open

Learnable Loss Mixup for Speech Enhancement

Description

Mixup is a recently proposed learning paradigm that improves the generalization of deep neural networks by training them on virtual data sampled from linear interpolations of examples and their labels. However, applying it to speech enhancement is challenging, because mixup was not designed for non-classification tasks and its success is contingent on the shape of the mixing distribution. We propose a generalization of mixup that mixes the losses instead of the labels, and automatically learns a non-linear mixing function by conditioning on the mixed data. On the VCTK benchmark, our proposal significantly outperforms standard training, learnable label mixup, and linear loss mixup. It achieves 3.26 PESQ, surpassing the previous state-of-the-art by 6 points.
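
As a rough illustration of the distinction drawn above, the following Python sketch contrasts label mixup with loss mixup for a regression-style objective. It is not the authors' implementation: the mean-squared-error loss, the helper names (`mix_inputs`, `label_mixup_loss`, `loss_mixup_loss`), and the two-parameter `toy_weight_fn` standing in for the learned non-linear mixing function are all assumptions made for illustration. In the paper the mixing function is learned by conditioning on the mixed data, and its exact form is not given in this description.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two arrays (assumed loss, for illustration)."""
    return float(np.mean((pred - target) ** 2))

def mix_inputs(x1, x2, lam):
    """Linear interpolation of two inputs with mixing coefficient lam in (0, 1)."""
    return lam * x1 + (1.0 - lam) * x2

def label_mixup_loss(model, x1, y1, x2, y2, lam):
    """Standard mixup: interpolate the inputs AND the labels, then take one loss."""
    x_mix = mix_inputs(x1, x2, lam)
    y_mix = lam * y1 + (1.0 - lam) * y2
    return mse(model(x_mix), y_mix)

def loss_mixup_loss(model, x1, y1, x2, y2, lam, weight_fn=None):
    """Loss mixup: interpolate the inputs, then mix the two per-target losses.

    With weight_fn=None the loss weight equals lam (linear loss mixup).
    Otherwise weight_fn(x_mix, lam) supplies a weight that may depend on
    the mixed data (a stand-in for a learnable mixing function).
    """
    x_mix = mix_inputs(x1, x2, lam)
    pred = model(x_mix)
    w = lam if weight_fn is None else weight_fn(x_mix, lam)
    return w * mse(pred, y1) + (1.0 - w) * mse(pred, y2)

def toy_weight_fn(x_mix, lam, theta=(1.0, 0.0)):
    """Hypothetical non-linear mixing function with parameters theta.

    A sigmoid over the logit of lam plus a term conditioned on the mixed
    data; theta=(1, 0) recovers the linear weight w = lam.
    """
    z = theta[0] * np.log(lam / (1.0 - lam)) + theta[1] * float(np.mean(x_mix))
    return float(1.0 / (1.0 + np.exp(-z)))

# Toy usage: an identity "model" on constant signals.
model = lambda x: x
x1 = y1 = np.ones(4)
x2 = y2 = np.zeros(4)
linear = loss_mixup_loss(model, x1, y1, x2, y2, lam=0.3)
learned = loss_mixup_loss(model, x1, y1, x2, y2, lam=0.3,
                          weight_fn=lambda x, l: toy_weight_fn(x, l, (2.0, 0.5)))
```

In a real training loop the parameters of the mixing function would be optimized jointly with the enhancement model; here `theta` is fixed by hand only to show where such a function would plug in.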

Files

ICASSP 2021 - Learnable Loss Mixup for Speech Enhancement.pdf

Files (736.4 kB)