Dataset Restricted Access

Kassel State of Fluency - Challenge

Sebastian P. Bayerl; Alexander Wolff von Gudenberg; Florian Hönig; Elmar Nöth; Korbinian Riedhammer

Korbinian Riedhammer

The Kassel State of Fluency Challenge corpus (KSoF-C) provided here is
derived from the Kassel State of Fluency (KSoF) corpus. The original corpus
features some 5500 typical and nontypical (stuttering) 3-second segments
from 37 German speakers with an overall duration of 4.6 hours. The
segments contain speech of persons who stutter (PWS). The recordings
from which the segments were extracted were made before, during, and
after the PWS underwent stuttering therapy.

KSoF-C features only the 4601 non-ambiguously labeled segments.
The task proposed in this challenge is the classification of speech
segments into one of eight classes: seven stuttering-related classes
and an eighth “garbage” class, denoting unintelligible segments, segments
containing no speech, or segments that are negatively affected by loud
background noise. The dataset is split by speaker: train (23 speakers),
devel (6 speakers), and test (8 speakers).
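
A split by speaker means that no speaker contributes segments to more than one partition. The following is a minimal sketch of how such a split could be checked, not the tooling used by the dataset authors. The metadata layout (tuples of segment ID, speaker ID, partition), the function names, and the speaker IDs are all hypothetical and only reproduce the speaker counts stated above.

```python
def speakers_per_partition(segments):
    """Group speaker IDs by partition and verify that no speaker is shared.

    `segments` is an iterable of (segment_id, speaker_id, partition) tuples.
    This layout is an assumption, not the dataset's actual file format.
    """
    by_partition = {}
    for _segment_id, speaker_id, partition in segments:
        by_partition.setdefault(partition, set()).add(speaker_id)

    # Compare every pair of partitions and reject any overlap in speakers.
    names = sorted(by_partition)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = by_partition[first] & by_partition[second]
            if shared:
                raise ValueError(
                    f"speakers {sorted(shared)} occur in both {first} and {second}"
                )
    return by_partition


def make_toy_segments():
    """Build placeholder metadata with the speaker counts stated above."""
    sizes = {"train": 23, "devel": 6, "test": 8}
    segments = []
    next_speaker = 0
    for partition, n_speakers in sizes.items():
        for _ in range(n_speakers):
            speaker_id = f"spk{next_speaker:02d}"  # hypothetical ID scheme
            # Two placeholder segments per speaker; real counts differ.
            for k in range(2):
                segments.append((f"{speaker_id}_seg{k}", speaker_id, partition))
            next_speaker += 1
    return segments


if __name__ == "__main__":
    split = speakers_per_partition(make_toy_segments())
    for partition, speakers in split.items():
        print(partition, len(speakers))
```

With real metadata, the same check would raise an error as soon as one speaker appeared in two partitions, which is the property that keeps the test speakers unseen during training.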

Restricted Access

You may request access to the files in this upload, provided that you fulfil the conditions below. The decision whether to grant/deny access is solely under the responsibility of the record owner.

This is the full release of the KSoF-C dataset, including the previously withheld test set labels.
The previous version was used in the ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE).


The EULA for the KSoF-C dataset can be obtained from the dataset website.


Access to the dataset will be granted upon receiving the signed EULA.


  • Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Stephen Roberts: "The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitos", Proceedings ACM Multimedia 2022, ACM, Lisbon, Portugal, 2022.

  • Sebastian P. Bayerl, Alexander Wolff von Gudenberg, Florian Hönig, Elmar Nöth, Korbinian Riedhammer: "KSoF: The Kassel State of Fluency Dataset - A Therapy Centered Dataset of Stuttering", Proceedings LREC 2022, Marseille, France, 2022.
