AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

Published January 29, 2023 | Version v4

Preprint Open

This space hosts the pre-trained models of AudioLDM.

audioldm-m-text-ft (**recommended**, default): the medium-sized AudioLDM, fine-tuned on AudioCaps and MusicCaps audio-text pairs *(added 2023-04-10)*.

audioldm-s-text-ft (**recommended**): the small AudioLDM, fine-tuned on AudioCaps and MusicCaps audio-text pairs *(added 2023-04-10)*.

audioldm-m-full: the medium AudioLDM without fine-tuning, trained with audio embeddings as the condition *(added 2023-04-10)*.

audioldm-s-full-v2: trained for more steps than audioldm-s-full *(added 2023-03-04)*.

audioldm-l-full: a larger model than audioldm-s-full *(added 2023-03-04)*.

audioldm-s-full: the original open-sourced version *(added 2023-02-01)*.

Files

Name	MD5	Size
audioldm-full-l.ckpt	8c62081daf5a1b2b9fa014e5a56ff03f	7.0 GB
audioldm-full-s-v2.ckpt	fca6e5a37c7bf47a701a662d48aeded8	2.6 GB
audioldm-m-full.ckpt	46bad9f176651404b3cf1484942749b9	4.6 GB
audioldm-m-text-ft.ckpt	036bc9b547a50f78b960ef8f14d0e1fb	4.6 GB
audioldm-s-full	5451d3628b3a2b535f87044685fe102c	2.6 GB
audioldm-s-text-ft.ckpt	70a662bf2d200cdd1913c5533b126007	2.6 GB
hifigan-vocoder-config.json	8d2ad944f74cf71cdf140dac9f46728c	767 Bytes
hifigan_vocoder.ckpt	77a0f15a947821b999dbc0e22fea0a33	221.2 MB
VAE.ckpt	77fcff0cdf9073dc8cd1bd1db268b228	978.0 MB
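
Since each file above is listed with an MD5 checksum, a downloaded checkpoint can be checked for corruption before loading it. The following is a minimal sketch, not part of the AudioLDM release: the helper names `md5_of_file` and `verify_download` and the `EXPECTED_MD5` dictionary are hypothetical, and it assumes the file was saved locally under the same name shown in the list.

```python
import hashlib
from pathlib import Path

# Digests copied from the file list above (subset shown for brevity).
EXPECTED_MD5 = {
    "audioldm-m-text-ft.ckpt": "036bc9b547a50f78b960ef8f14d0e1fb",
    "audioldm-s-text-ft.ckpt": "70a662bf2d200cdd1913c5533b126007",
    "hifigan_vocoder.ckpt": "77a0f15a947821b999dbc0e22fea0a33",
    "VAE.ckpt": "77fcff0cdf9073dc8cd1bd1db268b228",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so multi-gigabyte checkpoints are never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the digest listed for its name.

    Hypothetical helper: raises KeyError if the file name is not in `expected`.
    """
    name = Path(path).name
    if name not in expected:
        raise KeyError(f"no expected checksum recorded for {name!r}")
    return md5_of_file(path) == expected[name]
```

For example, `verify_download("VAE.ckpt")` would return `True` only if the local copy matches the listed digest; a `False` result suggests an incomplete or corrupted download.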
