Published June 17, 2024 | Version v1
Conference paper | Open access

Calibrating neural networks for synthetic speech detection: A likelihood-ratio-based approach

  • 1. Fraunhofer Institute for Digital Media Technology

Description

In this paper, we introduce a calibration procedure that converts the uncalibrated output scores of neural networks for synthetic speech detection into calibrated, interpretable likelihood ratios. The procedure assumes that the networks to be calibrated are deterministic and have been trained to convergence. When these conditions are satisfied, their output values can be transformed into likelihood ratios using a minimal set of validation and calibration data, without retraining the models. We successfully tested the entire workflow on a state-of-the-art example network, demonstrating both its effectiveness in calibration and its ability to improve fault tolerance against inadequate inputs.
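
To make the idea of mapping raw detector scores to likelihood ratios concrete, the following is a minimal sketch of one common generic technique: affine logistic-regression calibration fitted on held-out data. It is not taken from the paper and may differ from the procedure the paper describes. The function names fit_affine_llr and to_llr, the label convention (1 = synthetic, 0 = bona fide), and the learning-rate and iteration settings are all assumptions made for illustration.

```python
import math


def fit_affine_llr(scores, labels, iters=500, lr=0.5):
    """Fit LLR(s) = a * s + b on held-out calibration data (illustrative only).

    scores: raw detector outputs, assumed to be roughly unit-scale floats.
    labels: 1 for synthetic speech, 0 for bona fide speech (assumed convention).
    """
    n = len(scores)
    n_pos = sum(labels)
    # Log prior odds of the calibration set. Keeping this term fixed inside
    # the logistic model means a * s + b estimates a log-likelihood ratio
    # rather than posterior log-odds.
    prior_logit = math.log(n_pos / (n - n_pos))
    a, b = 1.0, 0.0
    for _ in range(iters):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            z = a * s + b + prior_logit
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            grad_a += (p - y) * s
            grad_b += p - y
        # Plain gradient descent on the mean cross-entropy loss.
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def to_llr(score, a, b):
    """Map one raw score to a natural-log likelihood ratio."""
    return a * score + b
```

Under these assumptions, a positive value from to_llr means the score is more likely under the synthetic class than under the bona fide class, and the network itself is never retrained: only the two parameters a and b are fitted.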

Files

AES_2024__LLR-Based-Calibration-for-Synthesis-Detection.pdf (473.6 kB)

Additional details