Synthetic Pathology Dataset

Togootogtokh, Enkhtogtokh; Klasen, Christian

doi:10.5281/zenodo.14974650

Published March 5, 2025 | Version 1.0.0

Dataset Open


1. Voizzr Technology Germany

Contributors

Researchers:

1. Voizzr Technology Germany

A synthetic dataset was generated to mimic realistic distributions of voice-related parameters (e.g., pitch, jitter,
shimmer, harmonic-to-noise ratio, age, and a continuous disease severity score). The pathology
labels were derived from domain-inspired thresholds, with the aim of producing a challenging classification task.

We assess the thresholds applied to generate the synthetic pathology labels and evaluate
their alignment with clinical contexts.
• Jitter (> 0.05): Jitter measures frequency variation in voice signals. Healthy voices typically
exhibit jitter below 1–2%. The 0.05 (5%) threshold is well above that range, so it may detect
only pronounced pathology, assuming jitter is stored as a fraction rather than a percentage.
• Shimmer (> 0.08): Shimmer reflects amplitude variation, normally below 3–5% in healthy
voices. The 0.08 (8%) threshold is above typical ranges, suitable for severe cases but
potentially missing subtle issues.
• HNR (< 15): Harmonic-to-Noise Ratio (HNR) indicates harmonic versus noise balance.
Healthy voices often exceed 20 dB, while <15 dB aligns with pathological noisiness, making
this threshold clinically plausible.
• Age (> 70): Age is a risk factor for voice decline, but >70 as a pathology marker is overly
simplistic. It may act as a proxy in synthetic data, though it is not diagnostic in practice.
• Disease Severity (> 0.7): This synthetic parameter, likely on a 0–1 scale, uses a 0.7 cutoff
to denote severity. While arbitrary, it is reasonable for synthetic data but lacks direct clinical
grounding.
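
The thresholds above can be sketched as a small labeling function. This is an illustration only, not the authors' generation code: the description does not say how the five criteria are combined, so the sketch assumes a sample is labeled pathological when any single criterion is met (logical OR). The field names (jitter, shimmer, hnr, age, disease_severity) and the helper names are hypothetical and may not match the columns in the published spreadsheet.

```python
from dataclasses import dataclass

# Threshold values as listed in the description above.
# The constant names are illustrative, not taken from the dataset.
JITTER_MAX = 0.05      # fraction, i.e. 5%
SHIMMER_MAX = 0.08     # fraction, i.e. 8%
HNR_MIN = 15.0         # dB
AGE_MAX = 70           # years
SEVERITY_MAX = 0.7     # assumed 0-1 scale


@dataclass
class VoiceSample:
    """One record; field names are hypothetical stand-ins for the real columns."""
    jitter: float
    shimmer: float
    hnr: float
    age: float
    disease_severity: float


def triggered_criteria(sample: VoiceSample) -> list[str]:
    """Return the names of the threshold criteria that this sample meets."""
    checks = {
        "jitter": sample.jitter > JITTER_MAX,
        "shimmer": sample.shimmer > SHIMMER_MAX,
        "hnr": sample.hnr < HNR_MIN,
        "age": sample.age > AGE_MAX,
        "disease_severity": sample.disease_severity > SEVERITY_MAX,
    }
    return [name for name, hit in checks.items() if hit]


def is_pathological(sample: VoiceSample) -> bool:
    # ASSUMPTION: any single criterion is enough (logical OR).
    # The source does not state the actual combination rule.
    return len(triggered_criteria(sample)) > 0


if __name__ == "__main__":
    healthy = VoiceSample(jitter=0.01, shimmer=0.03, hnr=22.0, age=45, disease_severity=0.2)
    noisy = VoiceSample(jitter=0.01, shimmer=0.03, hnr=12.0, age=45, disease_severity=0.2)
    print(is_pathological(healthy), triggered_criteria(healthy))
    print(is_pathological(noisy), triggered_criteria(noisy))
```

Returning the list of triggered criteria, rather than only a boolean, makes it easy to check which thresholds drive the labels, for example whether the age criterion alone accounts for a large share of positives.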

Files (1.5 MB)

Name	Size
synthetic_pathology_dataset.V.1.0.xlsx (md5:0e07637b37970bb607196cad64b8e592)	1.5 MB

