Voice Processing and Synthesis by Performance Sampling and Spectral Models

doi:10.5281/zenodo.3662128

Published May 20, 2014 | Version 1.1

Thesis Open

Jordi Bonada¹

1. Music Technology Group, Universitat Pompeu Fabra

The singing voice is one of the most challenging musical instruments to model and imitate. Over several decades, much research has been carried out to understand the mechanisms involved in singing voice production. In addition, since the earliest days of sound synthesis, singing has been one of the main targets to imitate and synthesize, and a large number of synthesizers have been created with that aim.

The goal of this thesis is to build a singing voice synthesizer capable of reproducing the voice of a given singer, both in terms of expression and timbre, sounding natural and realistic, and whose inputs would be just the score and the lyrics of a song. This is a very difficult goal, and in this dissertation we discuss the key aspects of our proposed approach and identify the open issues that still need to be tackled.

This dissertation substantially contributes to the field of singing voice synthesis: a) it critically discusses spectral processing techniques in the context of singing voice modeling, and provides significant improvements to the current state of the art; b) it applies the proposed techniques to other application contexts such as real-time voice transformations, museum installations and video games; c) it develops the concept of synthesis based on performance sampling as a way to model the sonic space produced by a performer with an instrument, focusing on the specific case of the singing voice; d) it proposes and implements a complete framework for singing voice synthesis; e) it explores the sonic space of the singing voice and proposes a procedure to model it; f) it discusses the issues involved in the creation of the synthesizer’s database and provides tools to automate its generation; g) it performs a qualitative evaluation of the synthesis results, comparing them with the state of the art and with real singer performances; h) it implements all the research results in an optimized software application for singing voice analysis, modeling, transformation and synthesis, including tools for database creation; i) a significant part of this research has been incorporated into commercial singing voice software by Yamaha Corp.

Files

Files (11.5 MB)

Name	Size
PhD_jbonada_v1.1.pdf (md5:b07f6fea8d22ee0144ab392d51b26062)	11.5 MB

