Published September 15, 2021 | Version v1
Thesis Open

Quality Enhancement of Overdub Singing Voice Recordings

  • 1. Universitat Pompeu Fabra

Contributors

  • 1. Universitat Pompeu Fabra

Description

Singing enhancement aims to improve the perceived quality of a singing voice recording in various respects. When the focus is on removing degradations such as background noise or room reverberation, singing enhancement is closely related to speech enhancement. In this work, two neural network architectures for speech denoising, FullSubNet and Wave-U-Net, were trained and evaluated specifically on denoising user singing voice recordings. While both models perform about as well as they do on speech denoising, FullSubNet outperforms Wave-U-Net on this task. Furthermore, the removal of sound leakage (i.e., the reference signal or accompaniment for overdubbing that becomes audible in the background of a recording) was performed with a novel modification of FullSubNet. The proposed architecture performs leakage removal by taking the signal that causes the leakage as an additional input. For choir music, this modified FullSubNet architecture was compared to the original FullSubNet architecture on leakage removal. Evaluation results show its overall efficacy on leakage removal as well as significant benefits from the additional input.

Files

2021-Benedikt-Wimmer.pdf (2.5 MB)
md5:c9aadd3917561a07287c26507e71faf8