Learnable Loss Mixup for Speech Enhancement

Oscar Chang; Dung N. Tran; Kazuhito Koishida

doi:10.5281/zenodo.4039110

There is a newer version of the record available.

Published September 19, 2020 | Version v1

Preprint Open

Mixup is a recently proposed learning paradigm that improves the generalization of deep neural networks by training them on virtual data sampled from linear interpolations of examples and their labels. However, applying it to speech enhancement is challenging, because mixup was not designed for non-classification tasks and its success is contingent on the shape of the mixing distribution. We propose a generalization of mixup that mixes the losses instead of the labels, and automatically learns a non-linear mixing function by conditioning on the mixed data. On the VCTK benchmark, our proposal significantly outperforms standard training, learnable label mixup, and linear loss mixup. It achieves 3.26 PESQ, surpassing the previous state-of-the-art by 6 points.
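The difference between mixing labels and mixing losses can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-squared-error loss, the list-based toy "signals", and the sigmoid form of `sigmoid_mixing_fn` are all assumptions made for illustration. The abstract states only that the mixing function is non-linear, learned, and conditioned on the mixed data.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mix(a, b, lam):
    """Element-wise linear interpolation lam * a + (1 - lam) * b."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

def label_mixup_loss(model, x1, y1, x2, y2, lam):
    """Standard mixup: interpolate inputs and labels, then take one loss."""
    x_mix = mix(x1, x2, lam)
    y_mix = mix(y1, y2, lam)
    return mse(model(x_mix), y_mix)

def loss_mixup_loss(model, x1, y1, x2, y2, lam, mixing_fn=None):
    """Loss mixup: interpolate inputs, then mix the two per-target losses.

    mixing_fn is a hypothetical hook mapping (lam, x_mix) to a weight in
    [0, 1]; when it is None the weight is lam itself (linear loss mixup).
    """
    x_mix = mix(x1, x2, lam)
    pred = model(x_mix)
    w = lam if mixing_fn is None else mixing_fn(lam, x_mix)
    return w * mse(pred, y1) + (1.0 - w) * mse(pred, y2)

def sigmoid_mixing_fn(lam, x_mix, scale=4.0, bias=0.0):
    """Toy non-linear weight conditioned on the mixed input (assumption).

    In a learnable variant, scale and bias would be trained parameters.
    """
    energy = sum(v * v for v in x_mix) / len(x_mix)
    return 1.0 / (1.0 + math.exp(-(scale * (lam - 0.5) + bias * energy)))

# Toy usage with an identity "model" standing in for an enhancement network.
identity = lambda x: list(x)
x1, y1 = [1.0, 2.0], [0.0, 0.0]
x2, y2 = [3.0, 4.0], [1.0, 1.0]
linear = loss_mixup_loss(identity, x1, y1, x2, y2, lam=0.5)
labels = label_mixup_loss(identity, x1, y1, x2, y2, lam=0.5)
```

With a squared-error loss the two objectives generally differ, which is one reason mixing losses is not simply a restatement of mixing labels for regression-style tasks such as speech enhancement.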

Files (729.3 kB)

Name	Size
Learnable Loss Mixup for Speech Enhancement.pdf md5:c15117e16a706f74d9c6bce7274eecf5	729.3 kB

	All versions	This version
Views	982	160
Downloads	798	233
Data volume	636.7 MB	177.9 MB

Resource type: Preprint
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: September 19, 2020
Modified: July 19, 2024
