Published November 11, 2022
Version v0
                  CLAP: Learning Audio Concepts From Natural Language Supervision (Pretrained Model)
Description
CLAP (Contrastive Language-Audio Pretraining) is a neural network model that learns acoustic concepts from natural language supervision. It achieved state-of-the-art results in zero-shot audio classification and in audio-to-text and text-to-audio retrieval, and surpassed prior results on several datasets when fine-tuned.
These are the weights for the Microsoft CLAP model published in 2022. Refer to the GitHub repository for the code:
microsoft/CLAP: Learning audio concepts from natural language supervision (github.com)
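In a contrastive audio-text model like CLAP, zero-shot classification reduces to ranking candidate label prompts by cosine similarity between the audio embedding and each prompt's text embedding. Below is a minimal sketch of that ranking step using placeholder embeddings; the actual CLAP encoders and loading API live in the GitHub repository above, and the toy vectors here are purely illustrative:

```python
import numpy as np

def zero_shot_classify(audio_emb, text_embs, labels):
    """Rank candidate labels by cosine similarity to an audio embedding."""
    a = audio_emb / np.linalg.norm(audio_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = t @ a  # cosine similarity of each prompt to the audio clip
    order = np.argsort(-sims)
    return [(labels[i], float(sims[i])) for i in order]

# Toy embeddings standing in for CLAP encoder outputs (hypothetical values):
# the "correct" prompt embedding is correlated with the audio embedding.
rng = np.random.default_rng(0)
audio_emb = rng.normal(size=64)
labels = ["a dog barking", "rain falling", "a siren"]
text_embs = np.stack([
    audio_emb + 0.1 * rng.normal(size=64),  # matching prompt
    rng.normal(size=64),                    # unrelated prompt
    rng.normal(size=64),                    # unrelated prompt
])
ranked = zero_shot_classify(audio_emb, text_embs, labels)
print(ranked[0][0])  # the matching prompt ranks first
```

With real CLAP embeddings the same ranking logic applies; only the source of `audio_emb` and `text_embs` changes.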
Files
| Name | Size |
|---|---|
| md5:0731ffb09d8567ba5610be34aa577a62 | 2.3 GB |