Speech sound disorder classification based on time-aligned dissimilarity profiles
Description
Speech sound disorders (SSD) are characterized by a person’s difficulty producing specific sounds or pronouncing certain words correctly, or an inability to do so. In this project we deal with SSD that appear during speech development; phonologists diagnose these using specific protocols, comparing a child’s utterance of a given word with a reference pronunciation. To help them detect SSD and speed up diagnosis, we propose a classifier based on dissimilarity profiles built from MFCCgrams aligned by dynamic time warping (DTW). Unlike the usual classifiers based on statistical audio features, this method preserves the temporal sequence of the audio recordings, which usually differ in duration. We compare the proposed method with two SSD classifiers previously used for the same task: one based on the Earth Mover’s Distance and another that uses a relative DTW embedding (minDTW). Our results show that the proposed method compares favorably with both on a dataset used for SSD diagnosis in children speaking Brazilian Portuguese.
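As a rough illustration of the idea, the sketch below aligns two feature sequences with a textbook DTW and reads off one dissimilarity value per reference frame. It is not the paper's implementation: the function names `dtw_path` and `dissimilarity_profile`, the Euclidean frame distance, and the averaging of distances per reference frame are all assumptions made for this example, and random matrices stand in for real MFCCgrams.

```python
import numpy as np

def dtw_path(ref, test):
    """Textbook DTW between two (frames x coefficients) arrays.

    Returns the matrix of frame-to-frame distances and the optimal
    warping path as a list of (ref_index, test_index) pairs.
    """
    n, m = len(ref), len(test)
    # Euclidean distance between every reference frame and every test frame
    # (assumed metric; the paper may use a different one).
    local = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first along cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    return local, path[::-1]

def dissimilarity_profile(ref, test):
    """One value per reference frame: the mean distance to the test
    frames that DTW aligned with it (hypothetical profile definition)."""
    local, path = dtw_path(ref, test)
    sums = np.zeros(len(ref))
    counts = np.zeros(len(ref))
    for i, j in path:
        sums[i] += local[i, j]
        counts[i] += 1
    return sums / counts

# Stand-in data: 5 reference frames of 3 coefficients each, and a test
# utterance that is the same sequence spoken twice as slowly.
rng = np.random.default_rng(0)
ref = rng.normal(size=(5, 3))
test = np.repeat(ref, 2, axis=0)
profile = dissimilarity_profile(ref, test)
print(profile.shape, profile)
```

Because the profile is indexed by reference frame, its length depends only on the reference pronunciation, so recordings of different durations yield vectors of the same size while keeping their temporal order. Such vectors could then be passed to any standard classifier.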
Files
SMCCIM_2020_paper_130.pdf (2.6 MB, md5:231f4ffe18206b881ca22f88f644e0e9)