Published June 7, 2022 | Version v2
Conference paper Open

StyleWaveGAN: Style-Based Synthesis of Drum Sounds With Extensive Controls Using Generative Adversarial Networks

  • 1. Sorbonne Université
  • 2. Ministère de la Culture
  • 3. Apeira Technologies

Description

In this paper, we introduce StyleWaveGAN, a style-based drum sound generator that is a variation of StyleGAN, a state-of-the-art image generator. By conditioning StyleWaveGAN on both the type of drum and several audio descriptors, we are able to synthesize waveforms faster than real time on a GPU, directly in CD quality and up to a duration of 1.5 s, while retaining a considerable amount of control over the generation. We also introduce an alternative to the progressive growing of GANs and experiment with the effect of dataset balancing for generative tasks. The experiments are carried out on an augmented subset of a publicly available dataset comprising different drums and cymbals. We evaluate against two recent drum generators, WaveGAN and NeuroDrum, demonstrating significantly improved generation quality (measured with the Fréchet Audio Distance) and interesting results with perceptual features.

Files

47.pdf

Files (611.5 kB)

md5:b0d0afa4b0dec7ab042faf98e68aa9d9