Thesis Open Access
Severe hearing loss can be treated with a surgically implanted electrical device called a cochlear implant (CI). These devices perform well for speech intelligibility but still struggle to represent more complex audio signals such as music. However, previous studies show that CI recipients find music more enjoyable when the vocals are enhanced relative to the background music. In this thesis, source separation (SS) algorithms are used to remix music multi-tracks by applying gain to the lead vocal.
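The remixing step itself can be sketched as a simple gain applied to the separated vocal stem before summing it back with the accompaniment. The function name, the mono float-array representation, and the 6 dB default below are illustrative assumptions, not details taken from the thesis:

```python
import numpy as np

def remix(vocals, accompaniment, vocal_gain_db=6.0):
    """Remix separated stems, boosting the lead vocal by a given gain in dB.

    Assumes `vocals` and `accompaniment` are mono float arrays of equal
    length; the 6 dB default is purely illustrative.
    """
    gain = 10.0 ** (vocal_gain_db / 20.0)  # dB to linear amplitude
    return gain * vocals + accompaniment

# Toy stems: a vocal "signal" and a quieter background
vocals = np.array([0.1, -0.1, 0.1, -0.1])
accomp = np.array([0.05, 0.05, -0.05, -0.05])
mix = remix(vocals, accomp, vocal_gain_db=6.0)
```

In practice the stems come from the output of an SS algorithm rather than from clean multi-track recordings, so any separation artifacts are boosted along with the vocal; this is why the perceptual relevance of those artifacts matters.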
This work evaluates deep convolutional auto-encoders (DCAEs), a deep recurrent neural network (DRNN), a multilayer perceptron (MLP), and non-negative matrix factorization (NMF), both objectively and subjectively, through two perceptual experiments involving normal-hearing (NH) subjects and CI recipients. The evaluation assesses the perceptual relevance of the artifacts introduced by the SS algorithms in relation to their degree of complexity, since this study aims to propose one of the algorithms for real-time implementation. Moreover, this work presents a benchmark that relates the measured distortions to the preference ratings observed in CI subjects. Objective results based on the source-to-distortion ratio (SDR) and source-to-artifacts ratio (SAR) show that the DCAEs outperform the other methods only when presented with data similar to their training data; the MLP, on the other hand, performs consistently across the tested data, matching the DRNN's performance while reducing algorithmic complexity.
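As a point of reference for the SDR metric mentioned above, a minimal sketch of its basic energy-ratio form is shown below. Note this is a simplification: the full BSS Eval framework used for SDR/SAR additionally decomposes the estimation error into interference and artifact components, which this sketch does not do.

```python
import numpy as np

def sdr(reference, estimate):
    """Simplified source-to-distortion ratio in dB.

    Basic energy-ratio form: 10 * log10(||s||^2 / ||s - s_hat||^2).
    The full BSS Eval SDR further splits the error term into
    interference and artifact components (from which SAR is derived).
    """
    err = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))

ref = np.array([1.0, 0.0, -1.0, 0.0])
est = 0.9 * ref  # a slightly attenuated estimate
score = sdr(ref, est)  # error energy is 1% of signal energy -> 20 dB
```

Higher values indicate less distortion; a perfect estimate drives the denominator toward zero and the ratio toward infinity.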
Based on this benchmark and a MUSHRA listening test, we propose the MLP for real-time audio SS.