Performance of Support Vector Machine in Classifying EEG Signal of Dyslexic Children using RBF Kernel

ABSTRACT


INTRODUCTION
Dyslexia is a neurobiological inefficiency in some parts of the brain that makes it difficult for affected people to acquire fluent reading skills, even though they have received appropriate academic education at the same level as normal children [1]. Despite this learning disability, dyslexic children possess the same or a higher IQ level compared with normal children [2].
Several studies have been conducted to identify the cognitive strengths and weaknesses of children using computer model analysis of the Gibson test [3]. The Malaysia Ministry of Education, meanwhile, uses the Dyslexia checklist as the instrument to identify the probability of children having a learning disability specific to dyslexia by measuring their capability in spelling, reading, and writing.
Besides visual, auditory, processing and word tests to examine the etiology of dyslexia, further studies have been carried out using imaging techniques such as functional Magnetic Resonance Imaging (fMRI) [4], Positron Emission Tomography (PET) [5] and Magnetoencephalogram (MEG) [6], which examine the cognitive processes associated with learning disabilities. However, EEG analysis is the subject of interest in this study because it is practical and cost-effective and offers high temporal resolution.
Electrical activity of the brain can be recorded and monitored noninvasively using EEG electrodes attached to the scalp. This signal shows the activity of brain regions while a task such as decoding, reading or writing is executed. The EEG signal consists of several frequency bands: Delta waves δ (1-4 Hz), Theta waves θ (4-7 Hz), Alpha waves α (8-12 Hz), Beta waves β (13-30 Hz) and Gamma waves γ (31 Hz and above), which indicate different activities and levels of awareness in the brain.
ISSN: 2502-4752. Indonesian J Elec Eng & Comp Sci, Vol. 9, No. 2, February 2018: 403-409
Various classification techniques have been investigated to identify dyslexia accurately. One of them is the SVM, which is reported to perform well compared with other classifiers. The SVM is a supervised binary classification algorithm that finds the optimal separating hyperplane by maximising the margin between the training data of two classes. The SVM copes well with high-dimensional and nonlinear features. However, the performance of the SVM in classifying dyslexia using the optimum value obtained by varying the scale of the kernel parameter has not been reported.
It is anticipated that by tuning the kernel parameter of the SVM, the classifier can produce high accuracy in classifying dyslexia and perform better than other classifiers. This paper describes the classification of EEG signals of normal, poor dyslexic and capable dyslexic children using a multiclass SVM built from binary learners with a one-versus-one coding design. The SVM box constraint and the RBF kernel parameter are varied to find the optimum parameters.

RESEARCH METHOD
In this work, the examination of the SVM performance in classifying dyslexia was carried out through several stages which include subject identification, EEG signal acquisition, notch and high pass filtering, power feature extraction, kernel parameter scale tuning, cross validation and classification as shown in Figure 1.

Subject Identification and Task Procedure
The wireless biosignal acquisition system g.nautilus was used to capture the EEG signal from the scalp of the children. A head cap with 8 electrode channels that complies with the international 10-20 electrode placement system was used during the recording. The electrodes were positioned at C3, P3, T7 and FC5 on the left side of the brain and at C4, P4, T8 and FC6 on the right side, as shown in Figure 2. The system acquired the EEG signal, amplified it and sampled it at a sampling frequency of 256 Hz before transmitting it wirelessly to a personal computer for recording and analysis. In this study, the EEG data were recorded from 33 subjects aged 7 to 12 years. Of these subjects, 8 were normal, 17 were poor dyslexics and 8 were capable dyslexics. The data were acquired with the assistance of the Dyslexia Association of Malaysia and the Rakan Dyslexia Malaysia group.
Two categories of words were prepared for the subjects. The first category is known words: words that are familiar to the subject, can be visualized in the mind or have a specific meaning. The second category is non-words: words that the subject has not seen before, or that have no specific meaning and do not refer to anything. Three sets of words and non-words were prepared according to the age and academic level of the subjects. Set A was for subjects aged 7 to 8, set B for subjects aged 9 to 10 and set C for subjects aged 11 to 12. Table 1 shows the five tasks performed by the subjects while their brain activity was recorded.
Task 1: The subject was asked to relax and try to think of nothing in particular for 40 seconds.
Task 2: Simple Word. Three simple words were shown one by one and the subject was asked to write each word on a piece of paper, then relax.
Task 3: Complex Word. Another three complex words were shown one by one and the subject was asked to write each word they saw on a piece of paper, then relax.
Task 4: Simple Non-Word. Three simple non-words were shown one by one and the subject was asked to write each word they saw on a piece of paper, then relax.
Task 5: Complex Non-Word. Three complex non-words were shown one by one and the subject was asked to write each word they saw on a piece of paper, then relax.
Altogether 170 datasets were collected, each containing an 8-electrode recording. Hence, the total number of recordings was 1360. Of these, sixty-five percent (65%) was used as training data and the remaining thirty-five percent (35%) as testing data.
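The 65/35 hold-out split described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab program used in the study; the function name `holdout_split`, the shuffling, the seed and the rounding rule are all assumptions, since the text does not say how the datasets were assigned to the two partitions.

```python
import random


def holdout_split(items, train_fraction=0.65, seed=0):
    """Shuffle items and split them into (train, test) lists.

    Assumption: a plain random split; the source does not state whether
    the split was random, stratified by class, or grouped by subject.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# 170 datasets, each holding an 8-electrode recording (170 * 8 = 1360).
dataset_ids = list(range(170))
train_ids, test_ids = holdout_split(dataset_ids)
total_recordings = len(dataset_ids) * 8
```

Because several datasets come from the same child, a split grouped by subject might be preferable in practice; whether the study did this is not stated.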

EEG Signal Pre-processing and Features Extraction
The recorded EEG signals were filtered using a notch filter to eliminate power line noise at 50 Hz and a high-pass filter with a cutoff frequency of 0.5 Hz to remove the DC offset. The data were analyzed using a program written in Matlab. Since the EEG signal is non-stationary, time-scale analysis is more suitable for extracting the underlying information than other methods. The EEG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), as shown in Figure 3. In this work, input features were not normalized because the output variation was small. Of the several wavelet families, Daubechies of order 2 (db2) was employed to provide a time-frequency scale representation of the EEG signal because of its ability to localize features and to smooth the EEG signal [17]. The detail coefficient D5 corresponds to the theta band, which indicates drowsiness, and the detail coefficient D3 corresponds to the beta band, which reflects active attention and was the subject of interest in this study. When a subject performs a task, the brain waves shift towards the beta band, so beta activity increases while activity in the other bands is reduced.
The theta-beta ratio indicates the relationship between internal (slow) activity and sequential (fast) activity [18,19]. The theta band represents the subconscious mind and the beta band represents the conscious mind. Brain activation was examined through the theta-beta ratio to analyze the brain state at a particular site between logical and spontaneous processing. A higher ratio indicates that theta is dominant, while a lower ratio indicates that beta is dominant.
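The link between the detail coefficients and the EEG bands follows from the dyadic structure of the DWT: at a sampling frequency of 256 Hz, detail level j nominally covers the range from fs/2^(j+1) to fs/2^j. The short Python sketch below works this out. The function name `dwt_detail_band` is hypothetical, and the ranges are the ideal ones; real wavelet filters such as db2 overlap at the band edges.

```python
def dwt_detail_band(level, fs=256.0):
    """Return the nominal (low, high) range in Hz of detail level `level`.

    Each DWT level halves the remaining bandwidth, so detail level j
    spans fs / 2**(j + 1) up to fs / 2**j.
    """
    if level < 1:
        raise ValueError("level must be >= 1")
    return fs / 2 ** (level + 1), fs / 2 ** level


# Nominal bands for a 5-level decomposition at fs = 256 Hz.
bands = {f"D{j}": dwt_detail_band(j) for j in range(1, 6)}
# D3 -> (16.0, 32.0) Hz, inside the beta range
# D5 -> (4.0, 8.0) Hz, matching the theta range
```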
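As a rough illustration of the feature computation, the sketch below derives a band power from a list of wavelet coefficients and forms the theta-beta ratio. Defining band power as the mean squared coefficient is an assumption (the source does not give its formula), the function names are hypothetical, and the coefficient values are made up for the example.

```python
def band_power(coefficients):
    """Mean squared value of the coefficients (assumed power definition)."""
    if not coefficients:
        raise ValueError("coefficients must not be empty")
    return sum(c * c for c in coefficients) / len(coefficients)


def theta_beta_ratio(theta_coeffs, beta_coeffs):
    """Ratio of theta (D5) power to beta (D3) power."""
    beta = band_power(beta_coeffs)
    if beta == 0:
        raise ValueError("beta power is zero; ratio undefined")
    return band_power(theta_coeffs) / beta


# Made-up coefficients, not data from the study.
d5_theta = [2.0, -2.0, 2.0, -2.0]   # power = 4.0
d3_beta = [1.0, -1.0, 1.0, -1.0]    # power = 1.0
ratio = theta_beta_ratio(d5_theta, d3_beta)  # 4.0, so theta is dominant
```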

Classification
In this stage, multiclass classification with a one-versus-one design was employed to classify normal, poor dyslexic and capable dyslexic subjects. An SVM with an RBF kernel was then applied to the extracted band power features, namely Beta and the Theta-Beta ratio. SVM classification is based on finding the maximum-margin separation boundary between two classes. In the linear case the separation is straightforward, but in the nonlinear case the data must first be mapped into a feature space, where the separation is performed by a hyperplane. The kernel function is used to map the data from the input space into this new space. Three types of kernel function can be used: Linear, Polynomial and RBF. The Polynomial and RBF kernels are used for mapping nonlinear data into the feature space. The SVM classifier can be written as in Equation (1) and the RBF kernel function is shown in Equation (2). Equation (3) shows the kernel width, a positive number specifying the kernel scale factor, which determines whether the "peak" of the kernel is a broader or a more pointed bump. The SVM classifier with RBF kernel is given by Equation (4).
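With three classes, a one-versus-one design trains one binary learner per pair of classes (three learners here) and assigns the class that wins the most pairwise votes. The sketch below shows only the pairing and voting logic in Python. The `binary_predict` callable stands in for a trained binary SVM and is hypothetical, as are the toy prototypes and the tie-breaking rule.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["normal", "poor dyslexic", "capable dyslexic"]


def one_vs_one_pairs(classes):
    """All unordered class pairs; k classes give k * (k - 1) / 2 pairs."""
    return list(combinations(classes, 2))


def one_vs_one_predict(x, classes, binary_predict):
    """Majority vote over pairwise learners.

    binary_predict(x, a, b) must return either a or b. Ties are broken
    by the order of `classes` (an assumption made for this sketch).
    """
    votes = Counter(binary_predict(x, a, b) for a, b in one_vs_one_pairs(classes))
    return max(classes, key=lambda c: votes[c])


# Toy stand-in for trained binary SVMs: pick the class whose (made-up)
# prototype value is closer to the scalar feature x.
PROTOTYPES = {"normal": 0.0, "poor dyslexic": 1.0, "capable dyslexic": 2.0}


def toy_binary_predict(x, a, b):
    return a if abs(x - PROTOTYPES[a]) <= abs(x - PROTOTYPES[b]) else b


pairs = one_vs_one_pairs(CLASSES)
label = one_vs_one_predict(1.1, CLASSES, toy_binary_predict)
```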

 
The SVM classifier with RBF kernel has two parameters: the kernel scale and the box constraint (C). The box constraint is a regularization parameter that controls the trade-off between margin maximization and errors on the training data. The SVM with C is shown in Equation (5).
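The effect of the kernel scale on the shape of the RBF "peak" can be seen numerically. The sketch below assumes the Gaussian form K(x, z) = exp(-||x - z||^2 / s^2), where s is the kernel scale. The equations of the paper are not reproduced in this text, so the exact scaling used there (for example 2s^2 in the denominator) may differ; the function name is hypothetical.

```python
import math


def rbf_kernel(x, z, kernel_scale=1.0):
    """Gaussian RBF kernel between two equal-length feature vectors.

    Assumed form: exp(-||x - z||^2 / kernel_scale^2).
    """
    if kernel_scale <= 0:
        raise ValueError("kernel_scale must be positive")
    if len(x) != len(z):
        raise ValueError("x and z must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / kernel_scale ** 2)


# Similarity of two points one unit apart for several kernel scales.
x, z = [0.0, 0.0], [1.0, 0.0]
similarity = {s: rbf_kernel(x, z, s) for s in (0.1, 1.0, 10.0)}
# A small scale gives a pointed peak (similarity near 0 at distance 1);
# a large scale gives a broad peak (similarity near 1).
```

Under this assumed form, a very small scale makes almost every pair of distinct points look dissimilar, while a very large scale makes them look nearly identical, which is one plausible reason classification performance is sensitive to this parameter.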

To obtain the optimal parameters, the parameter scales of the SVM with RBF kernel were varied. In the first analysis, the box constraint was varied from 0.001 to 1000 in increasing factors of 10 while the kernel scale was set to 1. In the second analysis, the kernel scale was varied from 0.001 to 1000 in increasing factors of 10 while the box constraint was fixed at 1. Ten-fold cross-validation was applied to the training data to estimate the classification accuracy and to identify the setting with the lowest error.
A multiclass confusion matrix was then employed to verify the performance of the classification model. The sensitivity, specificity and accuracy were determined using Equations (6), (7) and (8), respectively.
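The two one-at-a-time sweeps and the ten-fold partition can be sketched as follows. This Python outline only generates the parameter settings and the fold indices. The helper names are hypothetical, and contiguous, unshuffled folds are an assumption because the source does not describe how the folds were formed.

```python
def decade_scale(low_exp=-3, high_exp=3):
    """Values from 10**low_exp to 10**high_exp in factors of 10."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def one_at_a_time_settings(values, fixed=1.0):
    """Settings for the two analyses as (box_constraint, kernel_scale) pairs."""
    vary_c = [(c, fixed) for c in values]        # first analysis
    vary_scale = [(fixed, s) for s in values]    # second analysis
    return vary_c, vary_scale


def kfold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


scales = decade_scale()
vary_c, vary_scale = one_at_a_time_settings(scales)
folds = kfold_indices(110, k=10)  # 110 is an illustrative training-set size
```

Each sweep yields seven settings, and the setting (1, 1) appears in both. In a full run, each fold would serve once as the validation set while the classifier is trained on the remaining nine.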
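For a multiclass confusion matrix, per-class metrics are usually obtained by treating each class in turn as "positive" and the other classes as "negative". The sketch below uses the standard definitions, sensitivity = TP / (TP + FN), specificity = TN / (TN + FP) and accuracy = (TP + TN) / (TP + TN + FP + FN). It is assumed, not confirmed by this text, that Equations (6) to (8) take this form. The function name is hypothetical and the matrix values are invented for illustration.

```python
def per_class_metrics(matrix, class_index):
    """One-vs-rest metrics from a square confusion matrix.

    matrix[i][j] counts samples of true class i predicted as class j.
    Assumes every class has at least one positive and one negative sample.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[class_index][class_index]
    fn = sum(matrix[class_index]) - tp
    fp = sum(matrix[i][class_index] for i in range(n)) - tp
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


# Invented counts, not results from the study.
# Rows/columns: normal, poor dyslexic, capable dyslexic.
confusion = [
    [5, 0, 0],
    [1, 8, 1],
    [0, 1, 4],
]
metrics = [per_class_metrics(confusion, i) for i in range(3)]
```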

RESULTS AND ANALYSIS
Table 2 shows the k-fold cross-validation error for various values of C and the kernel scale. A value of 1 for both C and the kernel scale gives the lowest error, which is 23%. The sensitivity of the multiclass SVM classifier when C is varied from 0.001 to 1000 is plotted in Figure 4(a). As can be seen, increasing C above 0.1 decreases the classifier sensitivity from 100% to 92% for poor dyslexic subjects, while for capable dyslexic subjects the sensitivity rises rapidly from 25% to 75%. In contrast, the sensitivity for normal subjects does not change and stays at 100%. Furthermore, increasing C above 1 does not change the classifier sensitivity for any class. In Figure 4(b), the specificity for capable dyslexic and normal subjects decreases from 100% to 98% and 95% respectively, while for poor dyslexic subjects the specificity increases from 63% to 88% when C is set at 1. The result remains unchanged when C is above 1.
Although C in the range of 0.001 to 0.1 gives better specificity for normal and capable dyslexic subjects, it does not perform well for poor dyslexic subjects. Thus, it can be concluded that C equal to 1 is the optimal setting, giving the best overall sensitivity and specificity for classifying normal, poor dyslexic and capable dyslexic subjects.
Figure 5(a) and (b) show the sensitivity and specificity for normal, poor dyslexic and capable dyslexic subjects obtained from SVM classification when the kernel scale is varied from 0.001 to 1000. When the kernel scale is set from 0.001 to 0.1, the SVM sensitivity for poor and capable dyslexic subjects fluctuates between 0% and 100%, while for normal subjects the classifier is not sensitive at all. However, when the kernel scale is set to 1, the sensitivity increases to 100% for normal, 92% for poor dyslexic and 75% for capable dyslexic subjects. Above a scale of 10, the sensitivity drops sharply when classifying normal and poor dyslexic subjects.
The same trend is observed in the specificity for kernel scales in the range of 0.001 to 0.1. At a scale equal to 1, the specificity for classifying normal subjects is 95%, while for poor dyslexic and capable dyslexic subjects it is 88% and 98% respectively. The best sensitivity and specificity for all groups are obtained when the kernel scale is set to 1. Figure 6 shows that the classifier accuracy when C is varied is high, in the range of 89% to 94%. However, the classifier accuracy is not stable when the kernel scale is varied, rising and falling between 9% and 91%. When both the kernel scale and C are 1, the SVM accuracy is 91%. The accuracy decreases when both parameters are set above 1. Thus, the optimal value for C and the kernel scale is 1, since these values give good accuracy.

Extraction of the features, namely Beta and the Theta-Beta ratio, was carried out using wavelet db2, and these features were used as the input to the classifier. The box constraint of the SVM and the RBF kernel parameter were varied to find the optimum results. Cross-validation was also carried out. The results obtained in this study show that the RBF kernel parameter affects the classification performance. Setting the kernel parameter to 1 in the RBF kernel and C to the same value in the SVM yielded the highest accuracy, 91%. With the optimum parameters, the SVM with RBF kernel classifies normal, poor dyslexic and capable dyslexic children accurately, with high sensitivity and specificity.

Figure 1. Flow Chart of EEG Signal Analysis

Figure 2. Electrode Placement in Left and Right Hemisphere of Brain.

Figure 3. DWT Decomposition of EEG Signal
Performance of Support Vector Machine in Classifying EEG Signal of Dyslexic … (AZA.Zainuddin)

Figure 4. Multiclass SVM Classification Performance When C is Varied for Normal, Poor Dyslexic and Capable Dyslexic (a) Sensitivity (b) Specificity

Figure 5. Multiclass SVM Classification Performance When Kernel Scale is Varied for Normal, Poor Dyslexic and Capable Dyslexic (a) Sensitivity (b) Specificity

Figure 6. Multiclass SVM Classification Overall Accuracy for RBF Kernel

Table 1. Tasks That Were Performed During EEG Signal Recording