Non-Parallel Voice Conversion Using Weighted Generative Adversarial Networks
- 1. Department of Computer Science, University of Crete, Greece
- 2. Institute of Applied and Computational Mathematics, FORTH, Greece
Description
In this paper, we propose a novel way to train a Generative Adversarial Network (GAN) for non-parallel, many-to-many voice conversion. The goal of voice conversion (VC) is to transform speech from a source speaker to that of a target speaker without changing the phonetic content. Based on ideas from game theory, we propose multiplying the gradient of the Generator by suitable weights. The weights are calculated so that they increase the power of fake samples that fool the Discriminator, resulting in a stronger Generator. Motivated by a recently presented GAN-based approach for VC, StarGAN-VC, we propose a variation of StarGAN, referred to as Weighted StarGAN (WeStarGAN). The experiments are conducted on the standard CMU ARCTIC database. The WeStarGAN-VC approach achieves significantly better relative performance and is clearly preferred over the recently proposed StarGAN-VC method in terms of subjective speech quality and speaker similarity, with 75% and 65% preference scores, respectively.
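The weighting idea in the description can be illustrated with a minimal sketch. The abstract does not give the weight formula, so everything specific here is an assumption made for illustration: the softmax form of the weights, the hypothetical step-size parameter `eta`, the function names, and the use of a non-saturating generator loss. It only shows the general pattern, in which fake samples that score higher with the Discriminator contribute more to the Generator's loss and hence to its gradient.

```python
import numpy as np


def fooling_weights(d_scores, eta=1.0):
    """Hypothetical weights: larger when the Discriminator score D(G(z_i)) is larger.

    The softmax form and `eta` are illustrative assumptions, not the paper's formula.
    """
    d_scores = np.asarray(d_scores, dtype=float)
    logits = eta * d_scores
    logits = logits - logits.max()  # subtract the max for numerical stability
    w = np.exp(logits)
    return w / w.sum()  # normalize so the weights sum to 1


def weighted_generator_loss(d_scores, eta=1.0, eps=1e-12):
    """Weighted non-saturating generator loss: sum_i w_i * (-log D(G(z_i))).

    If the weights are treated as constants during backpropagation, the gradient
    of this loss is the per-sample gradients multiplied by the weights.
    """
    d_scores = np.asarray(d_scores, dtype=float)
    w = fooling_weights(d_scores, eta)
    per_sample = -np.log(d_scores + eps)
    return float(np.sum(w * per_sample))


# Example: three fake samples with increasing Discriminator scores.
scores = np.array([0.1, 0.5, 0.9])
weights = fooling_weights(scores, eta=2.0)
loss = weighted_generator_loss(scores, eta=2.0)
```

With `eta=0` the weights become uniform and the loss reduces to the usual batch average, so under these assumptions `eta` controls how strongly the samples that fool the Discriminator are emphasized.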
Files
Name | Size |
---|---|
ESR10_Interspeech2019_2869.pdf (md5:b3e8e116eda265478ab4258aceca49e3) | 351.9 kB |