Published August 6, 2023 | Version v2
Software Open

BLAT Pre-trained Checkpoints and Data

Creators

Description

Checkpoints of the BLAT (Bootstrapping Language-Audio pre-training based on Tag-guided synthetic data) pre-trained models. The models are first pre-trained on synthetic AudioSet tag-guided audio-text data, then fine-tuned on human-annotated audio-text data (AudioCaps + Clotho + MACS).

Caption data generated by the AudioSet tag-guided captioning model is also provided.

Files

cap.json

Files (578.8 MB)

Checksum                              Size
md5:904ebd9acf956871049c69e24074a2ba  495.0 MB
md5:532992f4220d5853154c933221d153fa  83.9 MB
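
The MD5 digests in the file listing can be used to check that a download completed without corruption. The following is a minimal sketch, not part of the record itself: the helper names `md5_of_file` and `matches_record` are hypothetical. Because the listing does not show which digest belongs to which file name, the sketch only checks whether a local file's digest matches either of the two published values.

```python
import hashlib
import sys
from pathlib import Path

# MD5 digests copied from the file listing on this record.
# The listing does not pair them with file names, so they are kept as a set.
EXPECTED_MD5 = {
    "904ebd9acf956871049c69e24074a2ba",  # listed as 495.0 MB
    "532992f4220d5853154c933221d153fa",  # listed as 83.9 MB
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """True if the file's MD5 equals one of the published digests."""
    return md5_of_file(path) in EXPECTED_MD5


if __name__ == "__main__":
    # Paths are supplied by the user on the command line (assumption:
    # these are wherever the downloaded files were saved locally).
    for name in sys.argv[1:]:
        if not Path(name).is_file():
            print(f"{name}: not a file, skipped")
            continue
        status = "OK" if matches_record(name) else "MISMATCH"
        print(f"{name}: {status}")
```

Saved under a name of your choosing, the script takes the downloaded file paths as arguments and prints `OK` or `MISMATCH` for each. MD5 is suitable here for detecting transfer errors, not for protecting against deliberate tampering.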