Broad Phoneme Classification Using Signal Based Features

doi:10.5281/zenodo.5771643

Published November 28, 2014 | Version v1

Journal article Open

Speech is the most efficient and popular means of human communication, and it is produced as a sequence of phonemes. Phoneme recognition is the first step performed by an automatic speech recognition system. State-of-the-art recognizers use mel-frequency cepstral coefficient (MFCC) features derived through short-time analysis, for which the recognition accuracy is limited. Instead, broad phoneme classification is achieved here using features derived directly from the speech signal itself. The broad phoneme classes are vowels, nasals, fricatives, stops, approximants and silence. The features found useful for broad phoneme classification are the voiced/unvoiced decision, zero crossing rate (ZCR), short-time energy, most dominant frequency, energy in the most dominant frequency, spectral flatness measure and the first three formants. Features derived from short-time frames of training speech are used to train a multilayer feedforward neural network classifier, with manually marked class labels as the target output, and the classification accuracy is then tested. This broad phoneme classifier is later used for broad syllable structure prediction, which is useful for applications such as automatic speech recognition and automatic language identification.
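To make some of the signal-level features named in the abstract concrete, here is a minimal sketch of how four of them (ZCR, short-time energy, most dominant frequency with its energy, and spectral flatness) could be computed for a single frame. It is an illustration under stated assumptions, not the authors' implementation. The function name `frame_features`, the dictionary keys, the naive DFT, the exclusion of the DC bin and the small constant `eps` are all choices made for this example. The voiced/unvoiced decision and formant estimation are not covered.

```python
import cmath
import math


def frame_features(frame, sample_rate):
    """Compute a few signal-level features for one short-time frame.

    Hypothetical helper for illustration; names and details are assumptions.
    """
    n = len(frame)

    # Zero crossing rate: fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1)

    # Short-time energy: mean squared amplitude over the frame.
    energy = sum(x * x for x in frame) / n

    # Power spectrum from a naive DFT over bins 1..n/2 (DC bin excluded).
    power = []
    for k in range(1, n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame))
        power.append(abs(s) ** 2)

    # Most dominant frequency: the bin with the largest power, in Hz.
    peak = max(range(len(power)), key=power.__getitem__)
    dominant_freq = (peak + 1) * sample_rate / n
    dominant_energy = power[peak]

    # Spectral flatness: geometric mean over arithmetic mean of the power
    # spectrum. Near 1 for a flat spectrum, near 0 for a single tone.
    eps = 1e-12  # assumed guard against log(0) and division by zero
    geo_mean = math.exp(sum(math.log(p + eps) for p in power) / len(power))
    arith_mean = sum(power) / len(power) + eps
    flatness = geo_mean / arith_mean

    return {
        "zcr": zcr,
        "energy": energy,
        "dominant_freq": dominant_freq,
        "dominant_energy": dominant_energy,
        "flatness": flatness,
    }


if __name__ == "__main__":
    # A 1 kHz tone sampled at 8 kHz, 64 samples (8 ms), with a small phase offset.
    tone = [math.sin(2 * math.pi * 1000 * i / 8000 + 0.3) for i in range(64)]
    print(frame_features(tone, 8000))
```

In a pipeline like the one the abstract describes, one such feature vector per frame, together with the remaining features, would form the classifier input. A tonal frame gives low flatness and a clear dominant frequency, while a noise-like frame gives flatness closer to 1 and typically a higher ZCR.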

Files

Name	Size
5314ijsc01.pdf (md5:cfc3d93b7c1dbb78ca62da8b24ac8908)	408.3 kB
