Deep Noise Suppression for Real Time Speech Enhancement in a Single Channel Wide Band Scenario

Esteban Gómez

doi:10.5281/zenodo.5554193

Published September 15, 2021 | Version v1

Thesis Open

Contributors

Supervisor (2):

1. Universitat Pompeu Fabra

Speech enhancement can be regarded as a dual task that addresses two important issues of degraded speech: speech quality and speech intelligibility. Improved speech quality can reduce listener fatigue, whereas improved speech intelligibility can reduce the listener’s effort to understand and extract meaning from speech. This work focuses on speech quality in a real-time context. Algorithms that improve speech quality are sometimes referred to as noise suppression algorithms, since they enhance quality by suppressing the background noise of the degraded speech. Improving state-of-the-art noise suppression algorithms could bring significant benefits to applications such as video conferencing systems, phone calls, and speech recognition systems. Real-time-capable algorithms are especially important for devices with limited processing power and physical constraints that cannot accommodate large architectures, such as hearing aids or wearables. This work uses a deep-learning-based approach to expand on two previously proposed architectures in the context of the Deep Noise Suppression Challenge carried out by Microsoft. The challenge has provided datasets and resources to teams of researchers with the common goal of fostering research on this topic. The outcome of this thesis can be divided into three main contributions: first, an extended comparison of six variants of the two selected models, covering performance, computational complexity, and real-time efficiency analyses; second, an open-source implementation of one of the proposed architectures, together with a framework translation of an existing implementation; and finally, proposed variants that outperform the previously defined models in terms of denoising performance, complexity, and real-time efficiency.

Files

Name	Size
2021-Esteban-Gómez.pdf md5:1839523207e32314226ecd2f0cf3ec6e	2.5 MB

