Published January 29, 2023 | Version v4
Preprint Open

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

  • 1. University of Surrey
  • 2. Imperial College London

Description

This space hosts the pre-trained models of AudioLDM.

 

audioldm-m-text-ft (**recommended**, default): the medium-sized AudioLDM fine-tuned on AudioCaps and MusicCaps audio-text pairs *(added 2023-04-10)*.

audioldm-s-text-ft (**recommended**): the small AudioLDM fine-tuned on AudioCaps and MusicCaps audio-text pairs *(added 2023-04-10)*.

audioldm-m-full: the medium-sized AudioLDM without fine-tuning, trained with audio embeddings as the condition *(added 2023-04-10)*.

audioldm-s-full-v2: trained for more steps than audioldm-s-full *(added 2023-03-04)*.

audioldm-l-full: a larger model than audioldm-s-full *(added 2023-03-04)*.

audioldm-s-full: the original open-sourced version *(added 2023-02-01)*.

Files

hifigan-vocoder-config.json

Files (25.0 GB)

| MD5 checksum | Size |
|---|---|
| md5:8c62081daf5a1b2b9fa014e5a56ff03f | 7.0 GB |
| md5:fca6e5a37c7bf47a701a662d48aeded8 | 2.6 GB |
| md5:46bad9f176651404b3cf1484942749b9 | 4.6 GB |
| md5:036bc9b547a50f78b960ef8f14d0e1fb | 4.6 GB |
| md5:5451d3628b3a2b535f87044685fe102c | 2.6 GB |
| md5:70a662bf2d200cdd1913c5533b126007 | 2.6 GB |
| md5:8d2ad944f74cf71cdf140dac9f46728c | 767 Bytes |
| md5:77a0f15a947821b999dbc0e22fea0a33 | 221.2 MB |
| md5:77fcff0cdf9073dc8cd1bd1db268b228 | 978.0 MB |
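
The `md5:` values listed above can be used to check that a multi-gigabyte checkpoint downloaded intact. The following is a minimal sketch using only Python's standard `hashlib`. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of AudioLDM, and the file path in the usage note is a hypothetical placeholder, since this listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that large checkpoints never need to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value written either as
    'md5:<hex>' (the form used on this page) or as bare '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_checkpoint.ckpt", "md5:8c62081daf5a1b2b9fa014e5a56ff03f")` would return `True` only if the (hypothetically named) local file is byte-identical to the 7.0 GB file in the listing.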