Impact of Multi-Classifier Fusion on Target Speaker Detection in Audio Streams

Kenai, Ouassila

doi:10.5281/zenodo.17618334

Published November 15, 2025 | Version v1

Publication Open

Kenai, Ouassila (Contact person)

This article presents a robust multi-classifier fusion approach for improving the performance of target Speaker Detection (SD) systems. A single classifier used on its own can cause significant performance degradation. To overcome this problem, we propose fusing three classifiers, Hierarchical Ascending Clustering (HAC), Gaussian Mixture Model (GMM), and Support Vector Machine (SVM), on an architecture based on Voice Activity Detection (VAD) in order to reduce speaker detection errors. A comparative study was conducted between the individual classifiers and their fusion; for evaluation, the three classifiers and their fusion were tested on telephone conversations extracted from the NIST-2005 corpus. The experimental results show that multi-classifier fusion on this architecture considerably improves the performance of the target SD system compared with each classifier applied individually: the fusion approach achieves a Speaker Detection Rate (SDR) of 99.18%, compared with 85.98% for HAC, 86.68% for GMM, and 97.67% for SVM.
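
The abstract does not state which fusion rule is used, so the following is only a minimal sketch of one common option, decision-level majority voting, to illustrate how combining three classifiers can correct their individual errors. The function names `fuse_majority` and `detection_rate`, the segment labels, and the toy decisions are all hypothetical and are not taken from the paper; the rate computed here is simply the percentage of segments labeled correctly, which may differ from the paper's SDR definition.

```python
from collections import Counter


def fuse_majority(decisions):
    """Fuse per-segment labels from several classifiers by majority vote.

    decisions: list of equal-length label sequences, one per classifier.
    On a tie, the label voted by the earliest-listed classifier wins,
    because Counter.most_common keeps first-encountered order for ties.
    """
    fused = []
    for votes in zip(*decisions):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


def detection_rate(predicted, reference):
    """Percentage of segments whose predicted label matches the reference."""
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Toy, made-up decisions for four speech segments (not data from the paper).
reference = ["target", "target", "target", "other"]
hac = ["target", "other", "target", "other"]
gmm = ["target", "target", "other", "other"]
svm = ["other", "target", "target", "other"]

fused = fuse_majority([hac, gmm, svm])
for name, labels in [("HAC", hac), ("GMM", gmm), ("SVM", svm), ("Fusion", fused)]:
    print(f"{name}: {detection_rate(labels, reference):.2f}%")
```

In this toy case each classifier makes one error on a different segment, so the vote recovers the reference labels. Real systems often fuse scores rather than hard decisions; that choice is not specified in the abstract.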

Files

25_Kenai.pdf

Files (750.8 kB)

Name	Size
25_Kenai.pdf (md5:a99a679de2ce18072651b14c2529b571)	750.8 kB

	All versions	This version
Views	44	44
Downloads	14	14
Data volume	12.8 MB	12.8 MB

Resource type

Publication

Publisher

HDSKD journal

Published in

International Journal of Hidden Data Mining and Scientific Knowledge Discovery, 10(1), ISSN: 2437-069X, 2025.

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 15, 2025
Modified: November 15, 2025
