Learnable Loss Mixup for Speech Enhancement
Authors/Creators
Description
Mixup is a recently proposed learning paradigm that improves the generalization of deep neural networks by training them on virtual data sampled from linear interpolations of examples and their labels. However, applying it to speech enhancement is challenging, because mixup was not designed for non-classification tasks and its success is contingent on the shape of the mixing distribution. We propose a generalization of mixup that mixes the losses instead of the labels, and automatically learns a non-linear mixing function by conditioning on the mixed data. On the VCTK benchmark, our proposal significantly outperforms standard training, learnable label mixup, and linear loss mixup. It achieves 3.26 PESQ, surpassing the previous state-of-the-art by 6 points.
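The difference between mixing labels and mixing losses can be made concrete with a small sketch. The NumPy code below is illustrative only and is not the authors' implementation: the mean-squared-error loss, the names `label_mixup_loss`, `loss_mixup_loss`, and `toy_weight_fn`, and the sigmoid form of the learnable weight are all assumptions chosen to keep the example short.

```python
import numpy as np

def mse(pred, target):
    # Mean squared error, used here as a stand-in enhancement loss (assumption).
    return float(np.mean((pred - target) ** 2))

def mix_inputs(x_i, x_j, lam):
    # Standard mixup: linear interpolation of two inputs.
    return lam * x_i + (1.0 - lam) * x_j

def label_mixup_loss(model, x_i, y_i, x_j, y_j, lam):
    # Label mixup: one loss against the interpolated target.
    x_mix = mix_inputs(x_i, x_j, lam)
    y_mix = lam * y_i + (1.0 - lam) * y_j
    return mse(model(x_mix), y_mix)

def loss_mixup_loss(model, x_i, y_i, x_j, y_j, lam, weight_fn=None):
    # Loss mixup: two losses against the original targets, then mixed.
    # With weight_fn=None the weight is lam itself (linear loss mixup).
    x_mix = mix_inputs(x_i, x_j, lam)
    pred = model(x_mix)
    w = lam if weight_fn is None else weight_fn(x_mix, lam)
    return w * mse(pred, y_i) + (1.0 - w) * mse(pred, y_j)

def toy_weight_fn(x_mix, lam, theta=(1.0, 0.0)):
    # Hypothetical non-linear mixing function conditioned on the mixed data.
    # theta would be trained jointly with the model; theta=(1, 0) recovers lam.
    # Requires 0 < lam < 1 because of the logit.
    z = theta[0] * np.log(lam / (1.0 - lam)) + theta[1] * np.mean(np.abs(x_mix))
    return float(1.0 / (1.0 + np.exp(-z)))
```

In training, `lam` would typically be drawn from a Beta distribution for each pair of examples, and `model` would be the enhancement network. Under these assumptions, with a squared-error loss and the fixed weight `lam`, the two objectives differ only by the term `lam * (1 - lam) * mean((y_i - y_j) ** 2)`, which does not depend on the model, so any practical difference in this sketch comes from the learned weight.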
Files
| Name | Size | Checksum |
|---|---|---|
| Learnable Loss Mixup for Speech Enhancement.pdf | 729.3 kB | md5:c15117e16a706f74d9c6bce7274eecf5 |