Published January 10, 2025 | Version v1
Dataset Open

AudioSet Strong Ensemble Logits

  • 1. Johannes Kepler University of Linz

Description

This upload contains one HDF5 file that stores ensemble predictions for the AudioSet Strong audio files. It is supplementary material for the ICASSP'25 paper "Effective Pre-Training of Audio Transformers for Sound Event Detection". The corresponding code can be found in the GitHub repository at https://github.com/fschmid56/PretrainedSED.

The HDF5 file contains filenames (YouTube IDs) matched with the ensembled logits of multiple transformer models. The corresponding keys are "filenames" and "strong_logits". The ensemble logits for one file have shape 447 x 250 (number of classes x time frames at 40 ms resolution) and are stored in float16 format to save space. See the GitHub repository for information on how to use the ensemble logits.
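As a rough illustration of the layout described above, the following Python sketch reads one entry from the file and converts frame indices to timestamps. It is not taken from the repository. The function names (load_logits, frame_to_seconds) and the path argument are hypothetical, and the sketch assumes that "filenames" and "strong_logits" are aligned along their first axis, which the description implies but does not state. It relies on the third-party h5py package for reading.

```python
# Values stated in the dataset description.
NUM_CLASSES = 447   # rows of each logits matrix
NUM_FRAMES = 250    # columns of each logits matrix
FRAME_MS = 40       # time resolution of one frame, in milliseconds


def frame_to_seconds(frame_index: int) -> float:
    """Start time in seconds of a frame at 40 ms resolution."""
    return frame_index * FRAME_MS / 1000.0


def load_logits(h5_path: str, index: int):
    """Return (filename, logits) for one entry of the HDF5 file.

    Assumption: entry i of "filenames" belongs to entry i of "strong_logits".
    """
    import h5py  # third-party; imported here so the helper above works without it

    with h5py.File(h5_path, "r") as f:
        name = f["filenames"][index]
        logits = f["strong_logits"][index]  # expected shape (447, 250), float16

    if isinstance(name, bytes):  # HDF5 strings are often returned as bytes
        name = name.decode("utf-8")

    # float16 saves disk space; upcast before doing arithmetic on the values.
    return name, logits.astype("float32")
```

With these numbers, 250 frames of 40 ms cover a 10 s clip, so frame_to_seconds(NUM_FRAMES) gives the clip end time. A call such as load_logits("path/to/file.h5", 0) reads only the requested entry rather than the full 22.6 GB file; the path here is a placeholder.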

Files

Files (22.6 GB)

md5:2e34dd1fc30a084bff9234e4cbd89b53 (22.6 GB)

Additional details

Software

Repository URL
https://github.com/fschmid56/PretrainedSED
Development Status
Active