Visualization of the output of Sound Event Detection algorithms in Freesound
Authors/Creators
Description
This thesis presents a postprocessing and visualization pipeline for the output of sound event detection (SED) models, with a specific focus on the FSD-SINet model
and its integration within the Freesound platform.
The first part of the work addresses the refinement of SED outputs by proposing several postprocessing methodologies to improve the quality and usability of detection
results. These include methods for deriving adaptive per-class parameters for the postprocessing pipeline, tag recommendation strategies applied to SED systems, hierarchical filtering based on the dataset vocabulary, and an exploration of optimization techniques for future work.
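As a rough illustration of what per-class postprocessing parameters can look like, the sketch below turns frame-wise class probabilities into timed events using a threshold and a minimum duration that differ per class. It is not the method described in the thesis: the function names, the `(threshold, min_frames)` parameter pair, the hop size, and the input layout are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# (class label, onset in seconds, offset in seconds)
Event = Tuple[str, float, float]


def frames_to_events(
    probs: Sequence[float],
    label: str,
    threshold: float,
    min_frames: int,
    hop_s: float,
) -> List[Event]:
    """Group consecutive frames at or above `threshold` into events.

    Runs shorter than `min_frames` are discarded (hypothetical rule).
    """
    events: List[Event] = []
    start = None
    # A sentinel below-threshold frame closes a run that reaches the end.
    for i, p in enumerate(list(probs) + [float("-inf")]):
        active = p >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            if i - start >= min_frames:
                events.append((label, start * hop_s, i * hop_s))
            start = None
    return events


def postprocess(
    frame_probs: Dict[str, Sequence[float]],
    params: Dict[str, Tuple[float, int]],
    hop_s: float = 1.0,
    default: Tuple[float, int] = (0.5, 1),
) -> List[Event]:
    """Apply per-class (threshold, min_frames) pairs; assumed parameterization."""
    events: List[Event] = []
    for label, probs in frame_probs.items():
        threshold, min_frames = params.get(label, default)
        events.extend(frames_to_events(probs, label, threshold, min_frames, hop_s))
    # Sort by onset time, then label, for a stable display order.
    return sorted(events, key=lambda e: (e[1], e[0]))


# Illustrative input: made-up probabilities for two classes, 0.5 s per frame.
example_probs = {
    "Dog": [0.1, 0.8, 0.9, 0.2, 0.7],
    "Speech": [0.4, 0.4, 0.6, 0.6, 0.6],
}
example_params = {"Dog": (0.5, 2), "Speech": (0.55, 3)}
example_events = postprocess(example_probs, example_params, hop_s=0.5)
```

In this toy input, the single active "Dog" frame at the end is dropped because it is shorter than that class's minimum duration, while the three-frame "Speech" run is kept. Keeping the parameters in a per-class dictionary is one simple way to let each class be tuned independently.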
The second part of the project focuses on the design and evaluation of visualization approaches to display SED outputs in a user-friendly manner directly within
Freesound. Several interface displays are proposed and analyzed in terms of clarity, space efficiency and user interaction. A user satisfaction experiment is designed to
evaluate these approaches based on feedback from real users.
By connecting these two parts, this work aims to take meaningful steps toward making SED technology more accessible to the broader Freesound community, while contributing
to the platform’s development and to an improved user experience.
Files
| Name | Size | Checksum |
|---|---|---|
| Quim_Marce_SMC_2025_Master_Thesis.pdf | 4.6 MB | md5:dadf9e4922b8adff0191b6e35b6bb23a |
Additional details
Dates
- Accepted: 2025-10-09