Non-Parallel Voice Conversion Using Weighted Generative Adversarial Networks

doi:10.21437/Interspeech.2019-2869

Published September 19, 2019 | Version v1

Conference paper | Open access

1. Department of Computer Science, University of Crete, Greece
2. Institute of Applied and Computational Mathematics, FORTH, Greece

In this paper, we propose a novel way to train a Generative Adversarial Network (GAN) for non-parallel, many-to-many voice conversion. The goal of voice conversion (VC) is to transform speech from a source speaker to that of a target speaker without changing the phonetic content. Based on ideas from game theory, we propose multiplying the gradient of the Generator by suitable weights. The weights are calculated so that they increase the influence of fake samples that fool the Discriminator, resulting in a stronger Generator. Motivated by a recently presented GAN-based approach for VC, StarGAN-VC, we propose a variation of StarGAN, referred to as Weighted StarGAN (WeStarGAN). The experiments are conducted on the standard CMU ARCTIC database. The WeStarGAN-VC approach achieves significantly better relative performance and is clearly preferred over the recently proposed StarGAN-VC method in terms of subjective speech quality and speaker similarity, with 75% and 65% preference scores, respectively.
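The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight formula, so the softmax-style weights, the `temperature` parameter, and the function names below are assumptions made purely for illustration and should not be read as the paper's definition. The sketch only shows the general mechanism: fake samples that the Discriminator scores as more realistic receive larger weights in the Generator's loss.

```python
import numpy as np

def sample_weights(d_fake_logits, temperature=1.0):
    """Hypothetical weights: softmax over the Discriminator's logits on
    fake samples, rescaled so the weights average to 1 over the batch.
    (Assumed form; the abstract does not specify the actual formula.)"""
    z = np.asarray(d_fake_logits, dtype=float) / temperature
    z = z - z.max()                 # shift for numerical stability
    w = np.exp(z)
    return w * (len(w) / w.sum())   # mean weight equals 1

def weighted_generator_loss(d_fake_logits, temperature=1.0):
    """Weighted non-saturating Generator loss over one batch."""
    logits = np.asarray(d_fake_logits, dtype=float)
    w = sample_weights(logits, temperature)
    # Per-sample loss -log(sigmoid(logit)), written as softplus(-logit).
    per_sample = np.logaddexp(0.0, -logits)
    # In an autodiff framework the weights would be treated as constants,
    # so each sample's gradient is simply multiplied by its weight.
    return float(np.mean(w * per_sample))

if __name__ == "__main__":
    logits = [2.0, 0.0, -2.0]       # made-up Discriminator outputs
    print(sample_weights(logits))
    print(weighted_generator_loss(logits))
```

With equal Discriminator scores across the batch, all weights are 1 and the loss reduces to the usual unweighted Generator loss, which is one reason for normalizing the weights to a mean of 1 in this sketch.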

Files (351.9 kB)

Name	Size
ESR10_Interspeech2019_2869.pdf (md5:b3e8e116eda265478ab4258aceca49e3)	351.9 kB

Additional details

Funding: European Commission, ENRICH – Enriched communication across the lifespan (675324)

