Published June 17, 2020 | Version v1
Conference paper Open

Speech sound disorder classification based on time-aligned dissimilarity profiles

Description

Speech sound disorders (SSD) are characterized by a person’s difficulty in producing specific sounds or pronouncing certain words correctly, or by an inability to do so. In this project we deal with SSD that appear during speech development; phonologists diagnose these using specific protocols, comparing a child’s utterance of a given word with a reference pronunciation. To help them detect SSD and speed up diagnosis, we propose a classifier based on dissimilarity profiles built from DTW-aligned MFCCgrams. Unlike usual classifiers based on statistical audio features, this method preserves the temporal sequence of the audio recordings, which usually have different durations. We compare the proposed method with two other SSD classifiers previously used for the same task: one based on the Earth Mover’s Distance, and another that uses a relative DTW embedding (minDTW). We present results showing that the proposed method compares favorably with both competitors on a dataset used for SSD diagnosis in children speaking Brazilian Portuguese.
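
The description does not say how the dissimilarity profiles are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It aligns a test utterance to a reference with classic dynamic time warping (DTW) and then averages the frame-to-frame distances along the warping path for each reference frame, giving a profile whose length depends only on the reference. The function names `dtw_path` and `dissimilarity_profile` are hypothetical, the Euclidean frame distance is an assumption, and random arrays stand in for real MFCCgrams.

```python
import numpy as np


def dtw_path(ref, test):
    """Classic DTW between two (frames x coefficients) arrays.

    Returns the warping path as (ref_frame, test_frame) pairs and the
    frame-to-frame cost matrix. Euclidean distance is an assumption.
    """
    n, m = len(ref), len(test)
    cost = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, padded by one
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    # Backtrack from the last cell to the first along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1], cost


def dissimilarity_profile(ref, test):
    """Mean aligned distance per reference frame (hypothetical definition)."""
    path, cost = dtw_path(ref, test)
    sums = np.zeros(len(ref))
    counts = np.zeros(len(ref))
    for i, j in path:
        sums[i] += cost[i, j]
        counts[i] += 1
    # Every reference frame appears on the path at least once, so counts > 0.
    return sums / counts


# Stand-in data: 20 reference frames of 13 coefficients, and a test utterance
# that is a slowed-down, noisy copy of the reference (40 frames).
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 13))
utterance = np.repeat(reference, 2, axis=0) + 0.1 * rng.normal(size=(40, 13))
profile = dissimilarity_profile(reference, utterance)
print(profile.shape)
```

Because the profile is indexed by reference frame, utterances of different durations map to vectors of the same length while keeping their temporal order, which is the property the description emphasizes. Such vectors could then be fed to any standard classifier; which classifier the paper uses is not stated here.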

Files

SMCCIM_2020_paper_130.pdf (2.6 MB)
md5:231f4ffe18206b881ca22f88f644e0e9