Towards Reliable Objective Evaluation Metrics for Generative Singing Voice Separation Models

Bereuter, Paul Armin; Stahl, Benjamin; Plumbley, Mark; Sontacchi, Alois

doi:10.5281/zenodo.15911723

Published July 15, 2025 | Version 1.0.0

Dataset Open

1. University of Music and Performing Arts Graz
2. University of Surrey

This upload accompanies the WASPAA 2025 paper, 'Towards Reliable Objective Evaluation Metrics for Generative Singing Voice Separation Models'.

It contains the audio used to compute the objective evaluation metrics, as well as the loudness-normalised test stimuli used in the degradation category rating (DCR) listening test.

A CSV file is also included. It contains all the metric values and the degradation mean opinion scores (DMOS) used to calculate Spearman's rank correlation coefficients (SRCCs) for both discriminative and generative models.
A demonstration Python script shows how the SRCCs between the DMOS and the evaluation metrics are calculated for each type of model.
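
As a rough illustration of the calculation described above, the following sketch computes one SRCC per model type from rows of a CSV file. It is not the demonstration script shipped with the dataset. The column names `model_type`, `DMOS` and `metric_a` and the toy values are assumptions made for this example, so the real CSV layout should be taken from the Readme.md file. The sketch uses only the standard library; in practice `scipy.stats.spearmanr` would typically be used to compute the same coefficient.

```python
import csv
import io
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def srcc_by_model_type(rows, metric, dmos="DMOS", group="model_type"):
    """Compute one SRCC between `metric` and `dmos` per value of `group`.

    The default column names are hypothetical, not the dataset's own.
    """
    grouped = defaultdict(lambda: ([], []))
    for row in rows:
        metric_values, dmos_values = grouped[row[group]]
        metric_values.append(float(row[metric]))
        dmos_values.append(float(row[dmos]))
    return {name: spearman(m, d) for name, (m, d) in grouped.items()}


# Toy data standing in for the real CSV file (values are made up).
TOY_CSV = """model_type,DMOS,metric_a
discriminative,1.0,0.10
discriminative,2.0,0.20
discriminative,3.0,0.40
discriminative,4.0,0.30
generative,1.0,0.90
generative,2.0,0.70
generative,3.0,0.50
generative,4.0,0.20
"""

rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
result = srcc_by_model_type(rows, "metric_a")
print(result)
```

Grouping by model type before ranking keeps the discriminative and generative correlations separate, which matches the per-type comparison the description mentions. A metric missing from the analysis could be added as another column and passed as `metric`.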

The Readme.md file provides step-by-step instructions on how to benchmark metrics not included in the analysis.

Files

Name	Size	MD5 checksum
gensvs_eval_data.zip	810.4 MB	3790b0f47f56eae10fc9604b3f7a1011
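
The MD5 checksum listed for the archive can be used to confirm that a download completed without corruption. The sketch below is one way to do that with Python's standard `hashlib` module. The helper names and the assumption that the archive sits in the current working directory are illustrative only.

```python
import hashlib

# Checksum as published in the file listing for gensvs_eval_data.zip.
EXPECTED_MD5 = "3790b0f47f56eae10fc9604b3f7a1011"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected


# Example call (assumes the archive was saved to the working directory):
# print(matches_published_checksum("gensvs_eval_data.zip"))
```

Reading in chunks avoids loading the full 810.4 MB archive into memory at once.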

Additional details

Subtitle (English): Evaluation Data: Audio, Embeddings, Metrics and DMOS

Available: 2025-07-15

Repository URL: https://github.com/pablebe/gensvs_eval
Programming language: Python
Development Status: Active
