Vibration-Based Damaged Road Classification Using Artificial Neural Network

An automated method for detecting damaged roads is needed because manually monitoring road conditions is not practical. Many previous studies have demonstrated that vibration-based techniques have the potential to detect road damage. This research explores the use of an Artificial Neural Network (ANN) for detecting road anomalies based on vehicle accelerometer data. The vehicle is equipped with a smart-phone that has a 3D accelerometer and geo-location sensors, and is then used to scan a road network containing several road anomalies, such as potholes, speed bumps, and expansion joints. An ANN model consisting of three layers is developed to classify the road anomalies. The first layer is the input layer containing six neurons. The number of neurons in the hidden layer is varied between one and ten, and its optimal number is sought numerically. A prediction accuracy of 84.9% is obtained by using three neurons in conjunction with the maximum acceleration data in the x, y, and z axes. The accuracy increases slightly to 86.5%, 85.2%, and 85.9% when the dominant frequencies in the x, y, and z axes, respectively, are taken into account in addition to the previous data.


ISSN: 1693-6930

... in an Android smart-phone. The accelerometer records the vehicle vibration. Our system can detect a few types of road anomaly. If one is found, our artificial neural network system classifies its type as either pothole or speed bump.

Research Method

Relevant Works
The massive growth of road networks demands an automatic road monitoring system. However, such a system is hard to develop given the complexity of road conditions. Fortunately, roadways and mobile phone networks have grown simultaneously in emerging economies. Mukherjee and Majhi [9] demonstrated the capability of a smart-phone equipped with accelerometers and position sensors, which can be useful for autonomous road monitoring. The ability of the smart-phone to record accelerations reliably was also demonstrated.
Gunawan et al. [8] performed a similar experiment using a smart-phone equipped with a 3D accelerometer sensor and a geo-location sensor. The smart-phone was installed in a vehicle. Their study found that whenever a vehicle crosses a pothole, it vibrates significantly in the z and x directions. Data collected from the pothole case deviated statistically from the normal-road and bump-road cases, which shows potential for classification purposes.

Motion Sensors on Android Phone
Motion sensors on an Android phone return a multi-dimensional array of data measured by each sensor at an instant of time [10]. The accelerometer sensor measures accelerations in three directions, namely the x, y, and z axes. The orientation of those axes on a smart-phone is depicted in Figure 1. The orientation affects the accelerometer responses as follows. If the device is pushed on the left side (device moves to the right), the x acceleration value is positive.
If the device is pushed on the bottom side (device moves away from the user), the y acceleration value is positive. If the device is pushed toward the sky with an acceleration of A m/s^2, the z acceleration value equals A + 9.81 m/s^2, which corresponds to the acceleration of the device (A m/s^2) minus the acceleration of gravity (−9.81 m/s^2).
A stationary device has an acceleration value of +9.81 m/s^2, which corresponds to the acceleration of the device (0 m/s^2) minus the acceleration of gravity (−9.81 m/s^2).
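The sign convention above can be checked with a small worked example. This is only an illustrative sketch: the function name `z_reading` and the constant `GRAVITY` are hypothetical names introduced here, not part of the Android API.

```python
# Gravity as signed in the text above (m/s^2). Hypothetical constant name.
GRAVITY = -9.81


def z_reading(device_acceleration):
    """Return the z accelerometer value for a device accelerating upward.

    The reading is the acceleration of the device minus the acceleration
    of gravity, as described in the text.
    """
    return device_acceleration - GRAVITY


if __name__ == "__main__":
    print(z_reading(0.0))  # stationary device
    print(z_reading(2.0))  # device pushed toward the sky at 2 m/s^2
```

A stationary device therefore reports the magnitude of gravity on the z axis, and any upward push adds to that baseline.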

Vibration-based Method
Most road anomalies can be characterized as high-energy events in the acceleration data, yet not all such events are road anomalies. Road fixtures such as railroad crossings and expansion joints can generate significant acceleration impulses. Passengers slamming a door or the driver braking suddenly can also produce high-energy events.
Eriksson et al. [5] and Gunawan et al. [8] used vehicle acceleration data as the main source. A smart-phone equipped with a 3D accelerometer sensor and a geo-location sensor is installed in the vehicle. Figure 2 shows the pothole detection flowchart used. The detection method is as follows:
1. The vehicle velocity is evaluated. If it is too low, the data stream is ignored and the next data stream is evaluated. This process is repeated until a data stream satisfies the requirement.
2. A high-pass filter is applied to remove acceleration, braking, and turning events.
3. The z-direction acceleration (a_z) is evaluated against a threshold (t_z). The data stream is processed further if the maximum of a_z (a^max_z) exceeds t_z; otherwise a new data stream is evaluated (back to step 1).
4. The largest value of the x-direction acceleration (a_x) is calculated within the time interval centered at the time at which a^max_x occurs. The time interval may vary (32, 64, or 128). This extreme value is checked against a threshold (t_x). As in the previous step, if a^max_x < t_x, the data stream is ignored and a new one is tested (back to step 1).

Last step is to reject any
where t_s is a threshold and v is the vehicle's traveling velocity.
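The first four filtering steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implementation of the cited works: the function names, the minimum speed `v_min`, the smoothing factor `alpha`, the choice of a first-order high-pass filter, and the decision to center the x window on the z peak are all assumptions made for illustration. The final speed-dependent rejection step is not included.

```python
import numpy as np


def high_pass(signal, alpha=0.9):
    """First-order high-pass filter (an assumed stand-in for the filter used)."""
    signal = np.asarray(signal, dtype=float)
    out = np.zeros_like(signal)
    for i in range(1, len(signal)):
        out[i] = alpha * (out[i - 1] + signal[i] - signal[i - 1])
    return out


def is_candidate(a_x, a_z, speed, v_min, t_z, t_x, window=32):
    """Return True if a data stream passes filtering steps 1 to 4."""
    # Step 1: ignore streams recorded at too low a speed.
    if speed < v_min:
        return False

    # Step 2: remove slow events such as accelerating, braking, or turning.
    x_hp = high_pass(a_x)
    z_hp = high_pass(a_z)

    # Step 3: the peak z acceleration must exceed the threshold t_z.
    peak = int(np.argmax(z_hp))
    if z_hp[peak] <= t_z:
        return False

    # Step 4: largest x acceleration inside a window around the peak
    # (assumption: the window is centered on the z peak).
    half = window // 2
    lo = max(0, peak - half)
    hi = min(len(x_hp), peak + half + 1)
    return bool(np.max(x_hp[lo:hi]) >= t_x)


if __name__ == "__main__":
    # Synthetic stream: flat road with one sharp impulse.
    a_z = np.full(200, 9.81)
    a_z[100] += 6.0
    a_x = np.zeros(200)
    a_x[102] = 3.0
    print(is_candidate(a_x, a_z, speed=30.0, v_min=10.0, t_z=3.0, t_x=1.0))
```

The thresholds and the synthetic stream in the usage example are arbitrary values chosen only to exercise each branch.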

Data Collection
A road anomaly can be defined as a deviation of the road condition from what it is supposed to be. There are several kinds of road anomaly, such as damaged road (presence of a pothole), speed bump, railroad crossing, or expansion joint. This research focuses on providing a method to detect road anomalies in real time and to further classify their types. Several things must be prepared before data collection can be performed: the smart-phone, the vehicle, the accelerometer data, and the road anomalies themselves.
Smart-phone and vehicle: In this research, two smart-phone devices are used: Device A and Device B. Device A is placed on the car dashboard, while Device B is placed in the middle of the car floor close to the back passenger.
Accelerometer data: Third-party software is used to record the vehicle acceleration data. This application records the acceleration data and saves it in a .csv file.
Road anomaly: Four kinds of road condition are recorded: normal road, road with a pothole, road with a speed bump, and road with an expansion joint.

Feature Extraction
Raw accelerometer data cannot be used directly. The anomaly data are mixed with noise, such as passengers slamming a door or the driver braking suddenly, which can also produce high-energy events. Several steps are needed before features can be extracted from the raw data:
1. Zero Shift: The purpose of this step is to shift each acceleration series (x, y, and z) to zero. Each acceleration series has its median subtracted.
2. Savitzky-Golay Filter: The purpose of this step is to remove noise from the acceleration data. The polynomial order used in this filter is one, with a frame size of 41.
3. Determine z acceleration peak point: The moment the vehicle wheel hits the damaged road, the z acceleration reaches its peak. This point becomes the midpoint of the cutting window of data. A half-width of 32 points is chosen for the window to cover more points in time, because there is a possibility that the peak could otherwise be missed. Therefore, the data used are the 65 points spanning (z_max − 32) to (z_max + 32).
4. Hamming Window and Fast Fourier Transform: The Fourier Transform implicitly treats the signal as infinitely repeating. Sometimes the start and end of a finite sampled signal do not match, which looks like a discontinuity in the signal. Applying a Hamming window makes sure that the ends match up while keeping everything reasonably smooth. The Hamming window is applied to the 65 points acquired in the previous step.
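The four pre-processing steps above can be sketched as follows. This is a minimal sketch, not the authors' code: the function names and the sampling rate `fs` are hypothetical, the Savitzky-Golay filter is written directly as a least-squares fit with edge padding (an assumption about boundary handling), and skipping the DC bin when picking the dominant frequency is also an assumption.

```python
import numpy as np


def zero_shift(a):
    """Step 1: subtract the median so the series is centered on zero."""
    a = np.asarray(a, dtype=float)
    return a - np.median(a)


def savgol_smooth(a, frame=41, order=1):
    """Step 2: Savitzky-Golay smoothing via a per-frame least-squares fit."""
    half = frame // 2
    k = np.arange(-half, half + 1)
    design = np.vander(k, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at the frame center.
    coeffs = np.linalg.pinv(design)[0]
    padded = np.pad(np.asarray(a, dtype=float), half, mode="edge")
    return np.convolve(padded, coeffs[::-1], mode="valid")


def cut_window(a_x, a_y, a_z, half=32):
    """Step 3: keep the 2 * half + 1 points centered on the z peak."""
    peak = int(np.argmax(a_z))
    lo, hi = peak - half, peak + half + 1
    if lo < 0 or hi > len(a_z):
        raise ValueError("peak is too close to the edge of the record")
    return a_x[lo:hi], a_y[lo:hi], a_z[lo:hi]


def dominant_frequency(segment, fs):
    """Step 4: Hamming window, FFT, then the strongest non-DC frequency."""
    segment = np.asarray(segment, dtype=float)
    windowed = segment * np.hamming(len(segment))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    return float(freqs[1:][np.argmax(spectrum[1:])])


if __name__ == "__main__":
    fs = 65.0  # hypothetical sampling rate in Hz
    t = np.arange(65) / fs
    print(dominant_frequency(np.sin(2 * np.pi * 5.0 * t), fs))
```

With a polynomial order of one and a symmetric frame, the smoothing weights are all equal, so away from the record edges this filter behaves like a 41-point moving average.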

ANN Model for Classification
ANN is used as the classification method because of its capability to learn from examples and to capture functional relationships in data that are hard to describe. The network is a multilayer back-propagation network that uses the sigmoid as its activation function. Network parameters such as the percentage of training data and the number of neurons in the hidden layer are changed and tested several times to achieve the optimal result. Figure 3 shows the ANN model used in this study. After pre-processing, there are five input nodes: maximum x acceleration (a^max_x), maximum z acceleration (a^max_z), dominant frequency of x acceleration (f^dom ...). The output node is chosen from four available classes of road condition: normal, speed-bump, pothole, and expansion joint. Table 1 shows the parameters of the neural network used in this study. If there are 500 data points, and the ANN is set to a 10% training set size and a 50% validation set size, the data composition is: 50 training data (randomly chosen), 225 testing data, and 225 validation data.
There are two separate experiments. The first experiment determines the reliable sample size, that is, the minimum portion of training data needed to achieve the desired result. The training data portion is increased gradually in increments of 10% up to 90%. At every multiple of 10%, the data set is classified a hundred times. The optimal training data portion is used in the second experiment.
The second experiment determines the optimum number of neurons: using the previously obtained optimal parameter, the number of neurons in the hidden layer is varied from 2 up to 9. Each variation is run for 100 classification runs.
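The data composition for a given training portion can be worked out as in the sketch below. It assumes, as the 500-sample example suggests, that the data left after training are divided equally between validation and testing; the function names `split_sizes` and `random_split` are hypothetical.

```python
import random


def split_sizes(n_total, train_fraction):
    """Return (n_train, n_validation, n_test) for a given training portion."""
    n_train = round(n_total * train_fraction)
    n_rest = n_total - n_train
    n_val = n_rest // 2          # half of the remainder for validation
    n_test = n_rest - n_val      # the other half for testing
    return n_train, n_val, n_test


def random_split(n_total, train_fraction, seed=None):
    """Randomly assign sample indices to training, validation, and testing."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    n_train, n_val, _ = split_sizes(n_total, train_fraction)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Training portions from 10% to 90% in steps of 10%.
    for percent in range(10, 100, 10):
        print(percent, split_sizes(500, percent / 100))
```

Calling `random_split` repeatedly with different seeds gives one way to draw the hundred random resamples per training portion.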
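To make the network shape concrete, the sketch below builds the forward pass of a three-layer network with sigmoid activations and a variable number of hidden neurons. It is an illustration only: the weights are random and untrained, back-propagation training is not shown, and the names `init_network`, `forward`, and `predict`, as well as the weight scale, are assumptions. The input size of five follows the description in this section.

```python
import numpy as np

CLASSES = ["normal", "speed-bump", "pothole", "expansion joint"]


def sigmoid(x):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + np.exp(-x))


def init_network(n_inputs, n_hidden, n_outputs, rng):
    """Create random (untrained) weights for one hidden layer."""
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }


def forward(net, features):
    """Propagate one feature vector through the hidden and output layers."""
    hidden = sigmoid(features @ net["W1"] + net["b1"])
    return sigmoid(hidden @ net["W2"] + net["b2"])


def predict(net, features):
    """Pick the class whose output node has the largest activation."""
    return CLASSES[int(np.argmax(forward(net, features)))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=5)  # placeholder feature vector, not real data
    # Vary the hidden layer from 2 to 9 neurons, as in the second experiment.
    for n_hidden in range(2, 10):
        net = init_network(5, n_hidden, len(CLASSES), rng)
        print(n_hidden, predict(net, sample))
```

In the actual experiment each configuration would be trained and evaluated 100 times; here the loop only shows how the hidden-layer size enters the model.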

Result and Analysis

Typical Acceleration Data
This section shows how each road anomaly affects the accelerometer data. Figure 4 shows the acceleration data when a vehicle crosses a normal road. The best indicator is that the z acceleration tends to stay at the gravitational acceleration, +9.81 m/s^2. From this it can be concluded that a vehicle crossing a normal road has a z acceleration that stays relatively close to +9.81 m/s^2. Any rise or fall from this value is an indicator of a road anomaly.
Figure 5 shows the acceleration data when a vehicle crosses a normal road and then hits a pothole. The region between the sixth and eighth seconds is when the vehicle hits the pothole. Starting from the normal value of gravitational acceleration, the z acceleration falls to 5-7 m/s^2. This is when the front wheel hits the base of the pothole. After that, the z acceleration rises significantly to 12-13 m/s^2. This is when the front wheel exits the pothole. The next drop is caused by the rear wheel hitting the pothole base. As before, this drop is followed by another rise when the rear wheel exits the pothole.
Figure 6 shows the acceleration data when a vehicle crosses a normal road and then hits a speed bump. The region between the eleventh and thirteenth seconds is when the vehicle hits the speed bump. When the front wheel hits the speed bump, the z acceleration increases significantly from the gravitational value to about 13-14 m/s^2. After that, the z acceleration starts to fall because the front wheel has passed over the speed bump.
Figure 7 shows the acceleration data when a vehicle crosses a normal road and then passes an expansion joint. The region between the fifth and sixth seconds is the data recorded when the vehicle crosses the expansion joint. When the wheel hits the expansion joint, the z acceleration drops to about 7-8 m/s^2. Then the z acceleration rises significantly to about 15 m/s^2.

Determining the Reliable Sample Size
This study evaluates the effect of the training data size on the accuracy of the ANN prediction. The approach is as follows. First, the training size is fixed at 10% of the total sample size. The remaining data are divided equally between the validation and testing stages. For these fixed sizes, the data are resampled a hundred times using a Monte Carlo simulation. This procedure is repeated for training sizes of 20%, 30%, ..., 80%, and 90%.
The effects of the data sizes on the accuracy are shown in Figure 8. The ANN model trained using 10% of the data is only about 15% accurate; that is, it misclassifies about 85% of the cases. The accuracy increases almost steadily with increasing training data size until the size reaches 50%. Beyond that size, the accuracy still varies slightly with the data size. The highest accuracy is obtained for a training data size of 80%.

Determining the Optimum Number of Neurons
Increasing the number of neurons increases the capability of the model to fit more complex relationships. However, this added complexity may lead to overfitting. A good ANN model should be general and not overfit to a specific case. A minimum number of neurons is usually required to provide a generic model. To find this generic model, the accuracy of the neural model is computed for various numbers of neurons. The results are depicted in Figure 9. For the two-neuron case, the accuracy varies widely from around 57% up to around 91%. However, for the cases where the number of neurons is three and nine, the accuracy is relatively constant from one case to the others. The figure suggests that the optimum number of neurons is three.

Determining the Significant Features
Feature selection is the process of selecting a subset of relevant features for use in the construction of the classifier model. Sometimes the data collected may be redundant or irrelevant. Feature selection may help eliminate this possibility while preventing loss of information. Theoretically, a smaller number of features can decrease the classifier workload, hence decreasing the modelling and training time of the classifier. This also improves the classifier's performance while maintaining its accuracy. To perform feature selection, the condition of the data in each class must first be observed. The distribution of the features used in the classification is shown in Figure 10.
The dominant frequency of x in the normal class is 1.52, which is identical in the other classes. Meanwhile, the dominant frequency of y in the normal and pothole cases is 1.52 in both, while the speed bump and expansion joint cases have 1.21 and 1.49, respectively. Only the dominant frequency of z varies for every class.
Using these facts, further classification is performed by reducing the number of features involved in the classifier. Table 2 shows which features are present in each classifier. Each classifier used 80% training data and three neurons in the hidden layer. This classifier is resampled for a ...
After testing the classifiers, the results are depicted in Figure 11. Classifier A, which used all the features, has an accuracy of 85.2%. Classifier B has 46.1% accuracy, which is the worst among the classifiers. Classifier B did not include a^max_x as one of its features. From this result, it can be inferred that a^max_x is a significant feature. Classifier C has an accuracy of 83.3%, slightly lower than that of Classifier A. This classifier did not have a^max_y as one of its features. Meanwhile, Classifier D has the second lowest accuracy at 75%.

Figure 9. The effect of the number of neurons in the hidden layer on the classification accuracy of the road anomalies.