AudioSet Strong Ensemble Logits
Description
This upload contains a single HDF5 file that stores ensemble predictions for the audio clips of AudioSet Strong. It is supplementary material for the ICASSP'25 paper Effective Pre-Training of Audio Transformers for Sound Event Detection. The corresponding code can be found in the GitHub repository at https://github.com/fschmid56/PretrainedSED.
The HDF5 file contains filenames (YouTube IDs) matched with the ensembled logits of multiple transformer models. The corresponding keys are "filenames" and "strong_logits". The ensemble logits for one file have shape 447 x 250 (number of classes x number of time frames, at 40 ms resolution). They are stored in float16 format to save space. See the GitHub repository for information on how to use the ensemble logits.
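As a rough illustration of the layout described above, the following Python sketch reads one entry from the file and converts frame indices to timestamps. The keys "filenames" and "strong_logits", the 447 x 250 shape, the 40 ms resolution, and the float16 storage come from the description. Everything else is an assumption: the file path is a placeholder, the helper names are hypothetical, and the sketch assumes that both keys are datasets indexed along their first axis so that entry i of "filenames" belongs to entry i of "strong_logits". The repository remains the authoritative reference for how the logits are actually used.

```python
# Values taken from the dataset description.
NUM_CLASSES = 447          # rows of one logit matrix
NUM_FRAMES = 250           # columns of one logit matrix
FRAME_MS = 40              # temporal resolution of one frame
BYTES_PER_VALUE = 2        # float16

# Uncompressed size of one logit matrix in bytes.
BYTES_PER_CLIP = NUM_CLASSES * NUM_FRAMES * BYTES_PER_VALUE


def frame_to_seconds(frame_index):
    """Start time in seconds of a frame, given the 40 ms resolution."""
    return frame_index * FRAME_MS / 1000.0


def load_entry(path, index):
    """Hypothetical helper: read one filename and its logit matrix.

    Assumes "filenames" and "strong_logits" are aligned along axis 0.
    Requires the third-party h5py package (imported lazily here).
    """
    import h5py

    with h5py.File(path, "r") as f:
        name = f["filenames"][index]
        logits = f["strong_logits"][index]  # expected shape: (447, 250)
    # h5py commonly returns stored strings as bytes.
    if isinstance(name, bytes):
        name = name.decode("utf-8")
    return name, logits


if __name__ == "__main__":
    # 250 frames at 40 ms each span a 10-second clip.
    print(frame_to_seconds(NUM_FRAMES))
    # Placeholder path; replace with the location of the downloaded file:
    # name, logits = load_entry("path/to/ensemble_logits.hdf5", 0)
    # print(name, logits.shape, logits.dtype)
```

Reading a single index at a time keeps memory use small, since only one 447 x 250 matrix is loaded rather than the whole 22.6 GB file.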
Files
(22.6 GB)
File (MD5 checksum) | Size |
---|---|
md5:2e34dd1fc30a084bff9234e4cbd89b53 | 22.6 GB |
Additional details
Software
- Repository URL: https://github.com/fschmid56/PretrainedSED
- Development Status: Active