Image Processing for Rapidly Eye Detection based on Robust Haar Sliding Window

ABSTRACT


INTRODUCTION
Among the many methods used in digital image processing, the Haar cascade classifier is one of the most widely applied in human-computer interaction (HCI). It is frequently used to detect objects such as the face, mouth, and eyes. Ghani et al. [1] performed eye gaze tracking to control a mouse pointer, Gao et al. [2] developed a method based on the Haar classifier, Singh et al. [3] detected eye movement from the EOG signal to control a wheelchair, and Coetzer et al. [4] implemented SVM and ANN classifiers for real-time eye detection.
To detect eyes, a Haar cascade slides a rectangular window over the whole image and checks whether any part of the image is shaped like an eye. Cascade-based models include a weighted connection from the input to each layer and from each layer to the successive layers [5]. Several image processing steps precede eye detection. The RGB image is first converted into a grayscale image, and an integral image is then computed from the grayscale pixels. Haar features are evaluated on the grayscale image inside a rectangular window of a fixed size, such as 24 x 24 pixels. Within this window, a filtering step using selected Haar feature templates searches for the eye region. If the value of a Haar feature in a region is below the threshold, the region is clustered. If a region fails a feature, no object is present in that region and the window moves on to another area. If the region satisfies the feature, the next feature is tested. If the region satisfies every feature, an object is present in that region.
Haar features detect objects with fairly good accuracy but require a long computation time, because the method processes every part of the image sequentially, pixel by pixel. During training, the Haar method uses AdaBoost to classify the object [6], [7]. The accuracy of eye detection in turn affects how accurately a device or application that relies on the eye area can be controlled. In addition, images sometimes contain noise that degrades image quality and therefore reduces the accuracy of eye detection [8]. The earlier method proposed by Viola and Jones [9] moves a window of a given size across every pixel, from the upper left corner to the bottom right.
In this paper, we propose a method that combines a sliding window with Haar features, in which the detection window starts from the center of the image, where the eye area is expected, in order to reduce processing time and errors in eye detection. For classification, we use a simple nearest-distance rule instead of the AdaBoost classifier used with conventional Haar features. We also consider OpenCV with its scanning window region left unchanged.

PROPOSED METHOD
After the facial area has been detected, the proposed method places a sliding window in the middle of the face image rather than at the top left corner, as the conventional Haar cascade does. Beforehand, each image is converted to grayscale, its pixel values are transformed into an integral image, and the eye ROI is determined; the eye feature values are then calculated within this area. We are aware that factors such as lighting can disturb object detection, so the test images were taken under good lighting conditions, in the morning or in the afternoon. The block diagram in Figure 1 gives an overview of the proposed eye detection process.

Scaling
Scaling resizes a digital image so that all processed images have the same size. In this system, scaling is performed with the scaling package provided by Java.

Grayscaling Image
In the preprocessing phase, the color image is converted into a grayscale image, as the Haar method also requires [10]. The conversion uses the average method: the R, G, and B values of each pixel are added up and divided by 3, and the resulting average is taken as the grayscale value.
$$\mathrm{Gray} = \frac{R + G + B}{3} \qquad (1)$$
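The average method can be sketched in a few lines of Python. This is only an illustration, not the authors' Java implementation: the function name `to_grayscale`, the sample pixels, and the use of integer division (rounding down) are assumptions.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to gray values with the average method."""
    # Integer division is an assumption; the paper does not state the rounding rule.
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


# Hypothetical 1 x 3 image: pure red, a mid gray tone, and white.
pixels = [[(255, 0, 0), (120, 130, 140), (255, 255, 255)]]
gray = to_grayscale(pixels)
print(gray)
```

Each output value is a single channel per pixel, which is the input expected by the integral image step.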

Integral Image
The integral image is a technique used in the Haar method to calculate the value of rectangle features quickly. Each pixel of the grayscale image is replaced by a cumulative sum, producing a summed-area table that can be regarded as a new representation of the image. The advantage is faster computation, because the sum over a rectangle is obtained from a few table entries rather than from every pixel value inside it. In the original image in Figure 2(a), the region value is 5+2+1+5+2+2 = 17, which requires five calculation steps. In the integral image in Figure 2(b), the same region value is 17-0-0+0 = 17, which requires only four steps. The integral image therefore needs fewer steps than summing each pixel in a region. The benefit is more pronounced for large images than for small ones.
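A minimal sketch of a summed-area table and the four-lookup region sum follows. The six values 5, 2, 1, 5, 2, 2 come from the example above; their arrangement as a 2 x 3 block at the top left, the remaining pixel values, and the function names are assumptions made for illustration.

```python
def integral_image(gray):
    """Return a summed-area table padded with one leading row and column of zeros."""
    height, width = len(gray), len(gray[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            # Sum of everything above and to the left of (x, y), inclusive.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def region_sum(table, top, left, bottom, right):
    """Sum of gray[top..bottom][left..right] (inclusive) from four table lookups."""
    return (table[bottom + 1][right + 1] - table[top][right + 1]
            - table[bottom + 1][left] + table[top][left])


# Hypothetical 3 x 4 grayscale image; only the top-left 2 x 3 block is from the text.
gray = [[5, 2, 1, 3],
        [5, 2, 2, 4],
        [1, 0, 3, 2]]
table = integral_image(gray)
top_left = region_sum(table, 0, 0, 1, 2)  # 17 - 0 - 0 + 0
print(top_left)
```

The zero padding means a region touching the image border needs no special case, which is why the three subtracted or added terms are 0 for the top-left block.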

Training Phase for Eye Feature
To detect eye features in the facial area, the system must be trained on a dataset of many eye images covering various eye shapes and types [11]. The dataset contains eye images looking straight ahead (frontal) as well as left and right views, collected at random from people around the research site. Training follows the Viola-Jones approach using Haar templates. The window size is 92 x 19 pixels, matching the size of the eye images in the dataset, while the feature box is 4 x 3. Figure 3 shows the eye dataset.

Direction of Sliding Window
In the proposed method, the window region is placed at the top center of the face image, as shown in Figure 4(b). The window has the same size as the one used when training on the eye region data: a 92 x 19 pixel rectangle, matching the area expected to contain the eye region. Because the midpoint is derived from the width of the image, determining the starting point of the scanning window requires the pixel width of the facial area and the size of the window box. Here the image size is 360 x 240 pixels, or the reverse orientation, and the Haar window box is 92 x 19 pixels. Figure 5 illustrates how the starting point of the scanning window is determined.
The absolute differences between the test and training feature values are summed to generate a single feature value, which is compared against a specified threshold, as illustrated in Figure 6. The features are obtained from the black and white squares of the Haar features.
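The starting point of the scanning window can be sketched as follows. The exact formula is not reproduced in this section, so the centering rule below (half of the difference between image width and window width, on the top row) is an assumption consistent with the description, and `start_position` is a hypothetical helper name.

```python
def start_position(image_width, window_width):
    """Top-left corner (x, y) of a window centered horizontally on the top row."""
    if window_width > image_width:
        raise ValueError("window is wider than the image")
    # Assumed rule: equal margins on the left and right of the window.
    return ((image_width - window_width) // 2, 0)


# Sizes taken from the text: 360-pixel-wide face image, 92-pixel-wide window.
start = start_position(360, 92)
print(start)
```

With these sizes the window begins 134 pixels from the left edge, leaving equal margins on both sides.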

$$V = \sum_{i,j} \left| \mathrm{Test}_{ij} - \mathrm{Train}_{ij} \right| \qquad (3)$$
where $\mathrm{Test}_{ij}$ is the pixel value of the test data feature at position (i, j) and $\mathrm{Train}_{ij}$ is the pixel value of the training data feature at position (i, j). The window box is shifted one pixel at a time across the image, and at each position the Haar features are used to check whether the area inside the window box is an eye area. If the value is less than the threshold, the window box continues to run until the scan stops. In this system we have modified the rule for detecting the eye region: if no window yields a difference value below the specified threshold, the window with the smallest difference value is selected and detected as the eye area.
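One possible reading of this rule is sketched below: compute the difference value of equation (3) for every window position, keep the windows below the threshold, and fall back to the single smallest value when none qualifies. The function names, the tiny sample data, and the plain row-by-row scan order are assumptions for illustration; the proposed method starts its scan at the top center instead.

```python
def difference_value(test, train):
    """Equation (3): sum of absolute differences between two equal-sized grids."""
    return sum(abs(t - r)
               for test_row, train_row in zip(test, train)
               for t, r in zip(test_row, train_row))


def crop(image, top, left, height, width):
    """Extract a height x width window whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in image[top:top + height]]


def detect(image, train, threshold):
    """Return (x, y, V) for windows with V below threshold, else the smallest V."""
    win_h, win_w = len(train), len(train[0])
    scores = []
    for y in range(len(image) - win_h + 1):
        for x in range(len(image[0]) - win_w + 1):  # shift one pixel at a time
            v = difference_value(crop(image, y, x, win_h, win_w), train)
            scores.append((x, y, v))
    hits = [s for s in scores if s[2] < threshold]
    # Modified rule: with no value under the threshold, take the minimum.
    return hits if hits else [min(scores, key=lambda s: s[2])]


# Hypothetical 3 x 4 test image and 2 x 2 training template.
image = [[9, 9, 9, 9],
         [9, 1, 2, 9],
         [9, 3, 4, 9]]
train = [[1, 2],
         [3, 5]]
print(detect(image, train, threshold=5))
print(detect(image, train, threshold=1))
```

In the second call no window falls below the threshold, so the fallback returns the position with the smallest difference value, which is the behavior the modified rule is meant to guarantee.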

EXPERIMENTAL RESULT AND DISCUSSION
The test data were collected from people around the university in Malang city. The images were captured in bright lighting conditions only. We collected 30 images using a MacBook Air with a resolution of 900 pixels and an Asus Zenfone Selfie mobile camera with a resolution of 1920 x 1080 pixels. As a limitation of this system, the face must be positioned in the middle of the image. We also added several test image conditions to evaluate the system. In the experiments, images were scaled to approximately 360 x 240 pixels, and the Haar feature template size was 3 x 4 pixels. The experiments were run on an Intel Core i5-5200U processor at 2.7 GHz using Java with NetBeans 8.1. Figure 7 shows the results of the proposed method. The results were computed using the Percentage of Successful Prediction (PSP). Measuring accuracy is important for assessing how well the system recognizes objects. The experiment reached a 71.4% detection rate and an 11.1% false rate, as shown in Table 1, based on formula 4.
(4)
where TP is a facial image in which the eye is successfully recognized, and TN is an image region that is not an eye area but is detected as an eye. As Table 1 shows, changing the direction of the sliding window improves both the detection rate and the false rate, whereas the Haar cascade method achieves a lower detection rate and a higher false rate. Moreover, the proposed method has a shorter processing time than the Haar cascade method.
We also implemented the conventional Haar cascade method manually for comparison with the proposed method. To obtain comparable results, both methods were tested on the same database. The comparison is presented in Table 2. The system was evaluated quantitatively using the accuracy equation [12], [13] given in formula 5. The accuracy of the proposed method is 93.3%, against 86.67% for the conventional Haar cascade method. Figure 8(a) shows that the conventional Haar cascade method missed some eyes, whereas the proposed method improves eye detection.

CONCLUSION
The system proposed in this paper minimizes detection errors and speeds up the detection process. Some test images cannot be detected by the conventional Haar method; for instance, images with a different eye gaze are sometimes missed. Changing the position of the scanning window within the detected face area improves both eye detection and computation time. The drawback of the proposed method is its sensitivity to the lighting conditions of the captured image. Future work will therefore improve performance under different lighting conditions so that the system runs more quickly and accurately. Handling different face angles can also be addressed.