Speech Removal Framework for Privacy-preserving Audio Recordings

Bibbó, Gabriel; Singh, Arshdeep; Plumbley, Mark D.

doi:10.5281/zenodo.17050321

Published October 15, 2025 | Version v2

Poster Open

1. University of Surrey

WASPAA 2025 Demo

Public datasets such as The Sounds of Home are recorded in people's homes to capture everyday soundscapes. Such audio recordings from home environments provide valuable information for recognizing daily activities, monitoring health and wellbeing, and enabling smart home applications. They also support the development of sound event detection systems that are robust under real-world conditions. However, in-home recordings contain sensitive personal information in the form of speech, which must be removed before the recorded datasets are shared publicly. This demonstration showcases real-time identification of personal information, in our case speech, using several AI models: convolutional neural networks (PANNs, E-PANNs), a Transformer model (AST), and voice activity detection (VAD) models (Silero, WebRTC). Our focus is twofold: (1) to design a speech removal system that identifies and removes speech from recorded audio in real time; and (2) to assess how well AI models can distinguish speech from non-speech audio. The demonstration is a simple, easy-to-use, software-based GUI.
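
The removal step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a detector (for example one of the models named above) has already produced one speech probability per fixed-length frame, and that "removal" means replacing flagged frames with silence. The names `speech_mask` and `remove_speech`, the `threshold` and `hangover` parameters, and their default values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def speech_mask(frame_probs: Sequence[float], threshold: float = 0.5,
                hangover: int = 1) -> List[bool]:
    """Flag frames as speech (assumed rule: probability >= threshold).

    Each detection is extended by `hangover` frames on both sides so that
    word onsets and tails next to a detected frame are also covered.
    """
    raw = [p >= threshold for p in frame_probs]
    mask = list(raw)
    for i, is_speech in enumerate(raw):
        if is_speech:
            lo = max(0, i - hangover)
            hi = min(len(raw), i + hangover + 1)
            for j in range(lo, hi):
                mask[j] = True
    return mask


def remove_speech(samples: Sequence[float], frame_len: int,
                  frame_probs: Sequence[float], threshold: float = 0.5,
                  hangover: int = 1) -> List[float]:
    """Return a copy of `samples` with every flagged frame set to silence."""
    mask = speech_mask(frame_probs, threshold, hangover)
    out = list(samples)
    for k, is_speech in enumerate(mask):
        if is_speech:
            start = k * frame_len
            end = min(len(out), start + frame_len)
            for n in range(start, end):
                out[n] = 0.0  # assumption: muting is used as the removal method
    return out


# Toy example: 6 frames of 2 samples each; only the third frame is speech.
audio = [0.1] * 12
probs = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0]
cleaned = remove_speech(audio, frame_len=2, frame_probs=probs, hangover=0)
```

In this toy example only the two samples of the third frame are zeroed. A larger `hangover` trades some loss of non-speech audio for a lower risk of leaving speech fragments in the shared recording.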

Files

Name	Size
WASPAA2025-Demo.pdf (md5:0b936050cef064ac5ddbf7f5e9078df3)	525.5 kB

Additional details

Repository URL: https://github.com/gbibbo/vad_demo
Programming language: Python
Development Status: Active

	All versions	This version
Views	160	133
Downloads	143	131
Data volume	83.5 MB	76.2 MB
