Echo State Networks for Arabic Phoneme Recognition
Description
This paper presents an Echo State Network (ESN) based Arabic phoneme
recognition system trained with supervised, forced, and combined
supervised/forced supervised learning algorithms. Mel-Frequency
Cepstral Coefficients (MFCCs) and Linear Predictive Coding (LPC)
are used and compared as input feature extraction techniques.
The system is evaluated using 6 speakers from the King
Abdulaziz Arabic Phonetics Database (KAPD) for the Saudi Arabian
dialect and 34 speakers from the Center for Spoken Language
Understanding (CSLU2002) database, which covers speakers with different
dialects from 12 Arabic countries. Results for the KAPD and
CSLU2002 Arabic databases show phoneme recognition
performances of 72.31% and 38.20%, respectively.
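For readers unfamiliar with the approach, the following is a minimal sketch of a generic echo state network used as a frame-level classifier. It is not the system described in the paper: the reservoir size, spectral radius, ridge penalty, and every function name below are illustrative assumptions. It covers only a plain supervised readout trained by ridge regression, with no output feedback or forced training, and it uses random vectors in place of real MFCC or LPC frames.

```python
import numpy as np


def make_reservoir(n_inputs, n_reservoir, spectral_radius=0.9, seed=0):
    """Create fixed random input and recurrent weights (assumed settings)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def run_reservoir(frames, w_in, w):
    """Drive the reservoir with one feature vector per time step."""
    state = np.zeros(w.shape[0])
    states = np.empty((len(frames), w.shape[0]))
    for t, u in enumerate(frames):
        state = np.tanh(w_in @ u + w @ state)
        states[t] = state
    return states


def train_readout(states, targets, ridge=1e-2):
    """Fit the linear readout by ridge regression; only these weights are trained."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


def classify_frames(states, w_out):
    """Pick the class with the largest readout activation for each frame."""
    return np.argmax(states @ w_out, axis=1)


if __name__ == "__main__":
    # Synthetic stand-in data: 200 frames of 13 features, 3 hypothetical classes.
    rng = np.random.default_rng(1)
    frames = rng.normal(size=(200, 13))
    labels = rng.integers(0, 3, size=200)
    targets = np.eye(3)[labels]  # one-hot targets

    w_in, w = make_reservoir(n_inputs=13, n_reservoir=100)
    states = run_reservoir(frames, w_in, w)
    w_out = train_readout(states, targets)
    predictions = classify_frames(states, w_out)
    print("training accuracy on synthetic data:", np.mean(predictions == labels))
```

The input and recurrent weights stay fixed after initialization, and only the readout matrix is fitted, which is what makes training an echo state network cheap compared with fully trained recurrent networks.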
Files
16322.pdf (364.0 kB)
md5:74e0c861f34b10ade9bf12fe243ebe29