There is a newer version of the record available.

Published November 11, 2022 | Version v0
Other | Open

CLAP: Learning Audio Concepts From Natural Language Supervision (Pretrained Model)

Description

CLAP (Contrastive Language-Audio Pretraining) is a neural network model that learns acoustic concepts from natural language supervision. It achieved state-of-the-art (SoTA) results in zero-shot classification and in audio-to-text and text-to-audio retrieval, and, on some datasets, when fine-tuned.
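To illustrate what zero-shot classification means for a contrastive audio-text model, the sketch below scores one audio embedding against the embeddings of candidate text labels and picks the most similar label. It is a minimal illustration and not the CLAP API: the three-dimensional vectors are toy stand-ins for encoder outputs, and the function names and the `temperature` value are assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_probs(audio_emb, text_embs, temperature=0.07):
    """Softmax over temperature-scaled similarities (temperature is an assumed value)."""
    logits = [cosine(audio_emb, t) / temperature for t in text_embs]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Toy embeddings: hypothetical stand-ins for text-encoder and audio-encoder outputs.
labels = ["dog barking", "rain", "siren"]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
audio_emb = [0.1, 0.9, 0.2]

probs = zero_shot_probs(audio_emb, text_embs)
predicted = labels[probs.index(max(probs))]
```

No label-specific training is involved: the candidate classes are supplied as text at inference time, which is why this setup is called zero-shot.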

This record contains the weights for the Microsoft CLAP model published in 2022. Refer to the GitHub repository for the code.

microsoft/CLAP: Learning audio concepts from natural language supervision (github.com)

Files

Files (2.3 GB)

Checksum: md5:0731ffb09d8567ba5610be34aa577a62
Size: 2.3 GB
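Since the record lists an MD5 checksum for the weights file, a download can be checked against it before use. The following is a minimal sketch using only Python's standard library. The helper names `md5_of_file` and `verify_download` are made up for this example, and the file is read in chunks because it is about 2.3 GB.

```python
import hashlib

# Checksum as listed on this record (without the "md5:" prefix).
EXPECTED_MD5 = "0731ffb09d8567ba5610be34aa577a62"

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download(path)` with the local path of the downloaded file returns `True` only when the digest matches the value listed above. A mismatch usually indicates an incomplete or corrupted download.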