From Words to Sound: Neural Audio Synthesis of Guitar Sounds with Timbral Descriptors
Interest in neural audio synthesis has been growing lately both in academia and industry. Deep Learning (DL) synthesisers enable musicians to generate fresh, often completely unconventional sounds. However, most of these applications present a drawback. It is difficult for musicians to generate sounds which reflect the timbral properties they have in mind, because of the nature of the latent spaces of such systems. These spaces generally have large dimensionality and cannot easily be mapped to semantically meaningful timbral properties. Navigation of such timbral spaces is therefore impractical. In this paper, we introduce a DL-powered instrument that generates guitar sounds from vocal commands. The system analyses vocal instructions to extract timbral descriptors which condition the sound generation.