Published July 4, 2024 | Version 1.0
Dataset
Open
Sound-VECaps
Description
This release contains Sound-VECaps, a large-scale audio dataset with visual-enhanced captions.
We also release AudioCaps-Enhanced, a visual-enhanced version of the AudioCaps test set, as a new benchmark.