Learnable Loss Mixup for Speech Enhancement

Oscar Chang; Dung N. Tran; Kazuhito Koishida

doi:10.5281/zenodo.4053951

Published September 19, 2020 | Version v2

Preprint Open

Mixup is a recently proposed learning paradigm that improves the generalization of deep neural networks by training them on virtual data sampled from linear interpolations of examples and their labels. However, applying it to speech enhancement is challenging, because mixup was not designed for non-classification tasks and its success is contingent on the shape of the mixing distribution. We propose a generalization of mixup that mixes the losses instead of the labels, and automatically learns a non-linear mixing function by conditioning on the mixed data. On the VCTK benchmark, our proposal significantly outperforms standard training, learnable label mixup, and linear loss mixup. It achieves 3.26 PESQ, surpassing the previous state-of-the-art by 0.06 points.
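
The difference between mixing labels and mixing losses can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of mean squared error, and the `sigmoid_weight` mixing function are all assumptions made for illustration, since the abstract does not specify the loss or how the mixing function is parameterized.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mix(a, b, lam):
    """Element-wise linear interpolation lam * a + (1 - lam) * b."""
    return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]


def label_mixup_loss(model, x1, y1, x2, y2, lam, loss=mse):
    """Label mixup: interpolate the inputs and the targets, then take one loss."""
    pred = model(mix(x1, x2, lam))
    return loss(pred, mix(y1, y2, lam))


def loss_mixup_loss(model, x1, y1, x2, y2, lam, loss=mse, weight_fn=None):
    """Loss mixup: interpolate the inputs, then mix the two per-target losses.

    weight_fn is a hypothetical hook that maps the mixed input to a weight
    in (0, 1). When it is None, the linear weight lam is used.
    """
    x_mix = mix(x1, x2, lam)
    pred = model(x_mix)
    w = lam if weight_fn is None else weight_fn(x_mix)
    return w * loss(pred, y1) + (1.0 - w) * loss(pred, y2)


def sigmoid_weight(x_mix, scale=1.0, bias=0.0):
    """Hypothetical stand-in for a learned non-linear mixing function.

    scale and bias play the role of trainable parameters; here they are fixed.
    """
    mean = sum(x_mix) / len(x_mix)
    return 1.0 / (1.0 + math.exp(-(scale * mean + bias)))


def identity(x):
    """Toy model that returns its input unchanged."""
    return x


x1, y1 = [1.0, 0.0], [1.0, 0.0]
x2, y2 = [0.0, 1.0], [0.0, 1.0]
label_loss = label_mixup_loss(identity, x1, y1, x2, y2, lam=0.5)
linear_loss = loss_mixup_loss(identity, x1, y1, x2, y2, lam=0.5)
learned_w = sigmoid_weight(mix(x1, x2, 0.5))
```

With the identity model and these toy inputs, `label_loss` is 0.0 while `linear_loss` is 0.25: under a squared-error loss the two formulations no longer coincide, which is one reason the choice between them matters for regression-style tasks such as enhancement. Passing `weight_fn=sigmoid_weight` replaces the fixed linear weight with one computed from the mixed input.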

Files

Name	Checksum	Size
ICASSP 2021 - Learnable Loss Mixup for Speech Enhancement.pdf	md5:c3fb99f84c38b1cce0eb8e0c6e4f601f	736.4 kB

	All versions	This version
Views	985	821
Downloads	805	566
Data volume	641.8 MB	459.5 MB

Resource type: Preprint

Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: September 27, 2020
Modified: July 19, 2024
