Published November 11, 2022
| Version v0
CLAP: Learning Audio Concepts From Natural Language Supervision (Pretrained Model)
Description
CLAP (Contrastive Language-Audio Pretraining) is a neural network model that learns acoustic concepts from natural language supervision. It achieves state-of-the-art (SoTA) performance in zero-shot classification and in audio-to-text and text-to-audio retrieval, and also on some datasets when fine-tuned.
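Zero-shot classification in this setup works by embedding an audio clip and a set of candidate text labels into a shared space, then ranking the labels by cosine similarity. A minimal NumPy sketch of that ranking step, using toy vectors (the embeddings and labels below are illustrative placeholders, not actual CLAP outputs):

```python
import numpy as np

def zero_shot_classify(audio_emb, text_embs, temperature=1.0):
    """Rank candidate text labels for one audio clip by cosine
    similarity, CLAP-style (illustrative sketch with toy vectors)."""
    a = audio_emb / np.linalg.norm(audio_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = (t @ a) / temperature          # cosine similarities
    probs = np.exp(logits - logits.max())   # stable softmax
    return probs / probs.sum()              # probability per label

# Toy 4-d embeddings standing in for the joint audio-text space
audio = np.array([0.9, 0.1, 0.0, 0.2])
labels = np.array([
    [0.8, 0.2, 0.1, 0.1],   # e.g. "dog barking"
    [0.0, 0.9, 0.3, 0.0],   # e.g. "rain"
])
probs = zero_shot_classify(audio, labels)
print(probs.argmax())  # → 0 (first label matches the audio best)
```

In the real model the embeddings come from CLAP's audio and text encoders; this only shows how the similarity-and-softmax ranking turns shared embeddings into a classifier without any task-specific training.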
These are the weights for the Microsoft CLAP model published in 2022. The code is available in the GitHub repository: microsoft/CLAP: Learning audio concepts from natural language supervision (github.com).
Files (2.3 GB)

| Name | Size |
|---|---|
| md5:0731ffb09d8567ba5610be34aa577a62 | 2.3 GB |