An Accurate Medium-Term Load Forecasting based on Hybrid Technique

ABSTRACT


INTRODUCTION
Electric load forecasting is important in the planning, operation and regulation of electric power systems. Accurate load forecasting leads to substantial savings in operating and maintenance costs and increases the stability, reliability and security of the system [1]. Underestimating the load demand may cause an insufficient power supply to consumers and may reduce the power quality in the system. Overestimating it, on the other hand, may lead the provider to make unnecessary investments and to miss the optimum economic power dispatch. Electric load forecasting can be divided into three types. The first type is long-term load forecasting, which covers forecasts 5 to 20 years ahead [2]. This type of forecasting also has a non-linear correlation with other factors. The second type is medium-term load forecasting (MTLF), which covers forecasts from one month up to several years ahead [3]. Operators can rely on MTLF when making decisions on unit commitment, system security analysis, dispatch scheduling and load flow analysis. Improving MTLF accuracy is therefore crucial for increasing system efficiency and reducing costs [4].
The third type is short-term load forecasting (STLF). STLF mainly covers a period of one week and refers to the assessment of the hourly load during the day [5]. This type of prediction is more specific in time because it considers hourly values, and the more specific the horizon, the more accurate the prediction can be. STLF is needed for the control and scheduling of the power system, as well as for its maintenance, operation and contingency analysis [6][7]. Conventional methods such as linear regression [8], time-series modelling [9] and the general exponential method [10] have been used for load forecasting. These methods can predict linear load series but cannot capture the non-linear character of the load [11]. In line with the rapid development of artificial intelligence, algorithms with strong self-learning capability, such as simulated annealing, artificial neural networks, back-propagation (BP) neural networks and particle swarm fuzzy inference, have been widely used in load prediction. However, each of these methods has its own advantages and disadvantages. Recently, the support vector machine (SVM) has been found suitable for solving practical problems such as load forecasting [12]. An improved version of SVM, the Least-Square Support Vector Machine (LS-SVM), applies equality constraints instead of inequality constraints to simplify the calculation and improve the training process. In this paper, a hybrid of LS-SVM and the Ant-Lion Optimizer (ALO) is presented for medium-term load forecasting.

RESEARCH METHOD
The data are obtained from the PJM website. PJM is a regional transmission organization (RTO) that coordinates electrical transmission systems in all or parts of Illinois, Delaware, Indiana, Kentucky, Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee, Virginia, West Virginia and the District of Columbia. To verify the effectiveness of the proposed algorithm, historical load data from Duke are selected. Hourly load data for every day of 2010 and 2011 are used to form the input-output pairs of the training set and the testing set, respectively. The hourly data of the first day of January are the input, while the hourly data of the second day of January are the corresponding output. In general, the hourly data of day 1 to day 364 are assigned as inputs, while the hourly data of day 2 to day 365 are the outputs. The data can be downloaded from [13].
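The day-to-next-day pairing described above can be sketched in Python as follows. This is only an illustration: the function name `make_day_ahead_pairs` and the synthetic series are assumptions, and the actual layout of the PJM files is not reproduced here.

```python
def make_day_ahead_pairs(hourly_load, hours_per_day=24):
    """Split a flat hourly series into (input day, output day) pairs."""
    if len(hourly_load) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    days = [hourly_load[i:i + hours_per_day]
            for i in range(0, len(hourly_load), hours_per_day)]
    inputs = days[:-1]   # day 1 .. day N-1
    outputs = days[1:]   # day 2 .. day N
    return inputs, outputs


# Synthetic stand-in for one year of hourly load (365 days x 24 hours).
series = [float(h) for h in range(365 * 24)]
X, Y = make_day_ahead_pairs(series)
```

With 365 days of data this yields 364 pairs, and the output of one pair is the input of the next.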

Least-Square Support Vector Machine (LS-SVM)
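LS-SVM is a reformulation of the principles of SVM that applies equality instead of inequality constraints [14]. The optimization problem in LS-SVM is formulated as:

$$\min_{w,b,e} J(w,e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2 \tag{1}$$

where $w$ is an unknown coefficient vector, $\gamma$ is a regularization constant and $e_i$ is the error term, which is assumed to be a white noise process. Equation (1) is subject to:

$$y_i = w^T \varphi(x_i) + b + e_i, \quad i = 1, \dots, N \tag{2}$$

where $x_i$ is mapped into a high-dimensional feature space by the mapping $\varphi$ and $b$ is the bias term. The problem can be solved using Lagrange multipliers $\alpha_i$, and the solution takes the form:

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \tag{3}$$

where $K(x, x_i)$ is the kernel, defined as the dot product $\varphi(x)^T \varphi(x_i)$. In this paper, the Radial Basis Function (RBF) kernel is used:

$$K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) \tag{4}$$

where $\sigma^2$ is the squared bandwidth of the kernel.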
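To make the formulation concrete, the following is a minimal pure-Python sketch of LS-SVM regression with an RBF kernel. It relies on the standard result that the Lagrangian conditions reduce to one linear system in the bias b and the multipliers alpha. The function names, the small Gaussian-elimination solver, the kernel convention exp(-||x - z||^2 / (2 sigma^2)) and the toy data are assumptions made for illustration, not the implementation used in the paper.

```python
import math


def rbf_kernel(x, z, sigma2):
    """RBF kernel; the 2*sigma2 scaling is an assumed convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma2))


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol


def lssvm_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization on the diagonal
        A.append(row)
    sol = solve_linear(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, multipliers alpha


def lssvm_predict(x, X, alpha, b, sigma2):
    """Evaluate y(x) = sum_i alpha_i K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma2) for a, xi in zip(alpha, X)) + b


# Toy data (hypothetical): y = x^2 sampled at four points.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 4.0, 9.0]
b, alpha = lssvm_fit(X_train, y_train, gamma=1e6, sigma2=1.0)
fitted = [lssvm_predict(x, X_train, alpha, b, sigma2=1.0) for x in X_train]
```

A large gamma penalizes the training errors heavily, so the fitted values follow the training targets closely; a smaller gamma gives a smoother but less exact fit.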
In LS-SVM with the RBF kernel, the selection of the parameters gamma (γ) and sigma (σ²) is essential. These parameters need to be tuned to minimize the training error and improve the prediction performance. This paper uses a 10-fold cross-validation technique for the parameter selection. Mean absolute percentage error (MAPE) is used to quantify the performance of the prediction; the lower the MAPE, the better the prediction. The formula of MAPE is shown in Equation (5):

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{5}$$

where $A_t$ is the actual value, $F_t$ is the forecast value and $n$ is the number of samples.
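Besides MAPE, the quality of the estimation is assessed by the coefficient of determination, R², as shown in Eq. (6):

$$R^2 = 1 - \frac{\sum_{t=1}^{n} (A_t - F_t)^2}{\sum_{t=1}^{n} (A_t - \bar{A})^2} \tag{6}$$

where $A_t$ and $F_t$ are the actual and forecast values, and $\bar{A}$ is the average of the actual values.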
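The two performance measures can be computed as in the short Python sketch below. The function names and the three-point sample are hypothetical and serve only to show the arithmetic.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))


def r_squared(actual, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical load values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
error_pct = mape(actual, forecast)
fit_quality = r_squared(actual, forecast)
```

For this sample the relative errors are 10%, 5% and 0%, so the MAPE is 5%, while R² stays close to 1 because the residuals are small compared with the spread of the actual values.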

Ant Lion Optimizer
The ALO algorithm imitates the interaction between ant-lions and ants in the trap [15]. It is inspired by five main steps of the ant-lion's natural hunting behavior. First, the ant-lion builds a trap by digging in the sand. Second, the ant walks randomly until it becomes trapped in the ant-lion's pit. Third, the ant-lion gets the chance to catch the ant, but the prey usually tries to run away. Fourth, the ant-lion throws sand so that the ant slides toward it. Finally, the ant-lion catches the prey and rebuilds the pit. The random walks of the ants are represented as Equation (7):

$$X_i^t = \frac{(X_i^t - a_i)(d_i^t - c_i^t)}{b_i - a_i} + c_i^t \tag{7}$$

where $a_i$ is the minimum of the random walk of the i-th variable, $b_i$ is the maximum of the random walk of the i-th variable, $c_i^t$ is the minimum of the i-th variable at the t-th iteration, and $d_i^t$ is the maximum of the i-th variable at the t-th iteration. The following equations describe how the random walks of the prey are affected by the ant-lions' traps.
$$c_i^t = \mathrm{Antlion}_j^t + c^t \tag{8}$$

$$d_i^t = \mathrm{Antlion}_j^t + d^t \tag{9}$$

where $c^t$ is the minimum of all variables at the t-th iteration, $d^t$ is the vector including the maximum of all variables at the t-th iteration, $c_i^t$ is the minimum of all variables for the i-th ant, $d_i^t$ is the maximum of all variables for the i-th ant, and $\mathrm{Antlion}_j^t$ is the position of the selected j-th ant-lion at the t-th iteration. The sliding of the ants toward the ant-lion is modelled mathematically by Eq. (10) and Eq. (11), in which the radius of the ants' random walks decreases progressively:

$$c^t = \frac{c^t}{I} \tag{10}$$

$$d^t = \frac{d^t}{I} \tag{11}$$

where $I$ is a ratio that increases with the iteration number.
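The random walk, its min-max normalization in the form of Equation (7), and the shrinking trap bounds around a selected ant-lion can be sketched in Python as follows. The function names, the scalar (one-variable) setting and the numeric values are assumptions for illustration; the shrinking ratio is passed in directly rather than computed from the iteration count.

```python
import random


def random_walk(n_steps, rng):
    """Cumulative sum of +1/-1 steps, starting at 0 (needs n_steps >= 1)."""
    walk = [0.0]
    for _ in range(n_steps):
        step = 1.0 if rng.random() > 0.5 else -1.0
        walk.append(walk[-1] + step)
    return walk


def normalize_walk(walk, c, d):
    """Min-max scale a walk into [c, d], following the form of Eq. (7)."""
    a, b = min(walk), max(walk)
    return [(x - a) * (d - c) / (b - a) + c for x in walk]


def trap_bounds(antlion, c, d, ratio):
    """Shrink the bounds by `ratio`, then shift them to the ant-lion."""
    return antlion + c / ratio, antlion + d / ratio


rng = random.Random(0)
# Hypothetical values: bounds [-10, 10], ant-lion at 3, shrinking ratio 4.
c_i, d_i = trap_bounds(antlion=3.0, c=-10.0, d=10.0, ratio=4.0)
scaled = normalize_walk(random_walk(100, rng), c_i, d_i)
```

Here the trap spans [0.5, 5.5], and every point of the normalized walk stays inside that interval, which is how the ant is kept within the ant-lion's pit.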
At the final step, the ant-lion catches the prey and rebuilds the pit. The step is formulated as:

$$\mathrm{Antlion}_j^t = \mathrm{Ant}_i^t \quad \text{if } \mathrm{Ant}_i^t \text{ is fitter than } \mathrm{Antlion}_j^t \tag{12}$$

Elitism is a crucial characteristic of evolutionary algorithms that allows them to retain the best solution(s) obtained at any stage of the optimization process. The best ant-lion obtained in every iteration is saved and considered as the elite. The elite is the fittest ant-lion and should be able to affect the random walks of all the ants during the iterations. Thus, it is assumed that every ant walks randomly around both an ant-lion selected by the roulette wheel and the elite, as follows:

$$\mathrm{Ant}_i^t = \frac{R_A^t + R_E^t}{2} \tag{13}$$

where $R_A^t$ is the random walk around the ant-lion selected by the roulette wheel at the t-th iteration, $R_E^t$ is the random walk around the elite at the t-th iteration, and $\mathrm{Ant}_i^t$ is the position of the i-th ant at the t-th iteration. Reference [15] showed that the ALO algorithm exhibits high exploration and exploitation when solving mathematical functions. The random walk mechanism and the random selection of ant-lions promote exploration, which helps the ALO algorithm to reach the global optimum and escape local optima stagnation when solving complex problems. Moreover, the adaptive shrinking boundaries of the ant-lions' traps and elitism emphasize exploitation as the iterations increase, which leads to an accurate approximation of the global optimum. All these characteristics give the ALO algorithm the potential to solve real optimization problems while avoiding local optima. Therefore, this paper presents the application of ALO to load forecasting problems.
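The roulette wheel selection and the elitism step can be illustrated with the following Python sketch. Because the objective here is minimized, the wheel is weighted by inverse fitness; this weighting, the function names and the sample values are assumptions for illustration and are not taken from [15].

```python
import random


def roulette_wheel(fitness, rng):
    """Pick an index with probability proportional to 1 / fitness."""
    weights = [1.0 / (f + 1e-12) for f in fitness]
    pick = rng.random() * sum(weights)
    running = 0.0
    for idx, w in enumerate(weights):
        running += w
        if pick <= running:
            return idx
    return len(weights) - 1  # guard against rounding at the wheel's end


def update_ant(walk_around_selected, walk_around_elite):
    """New ant position: average of the two random-walk positions."""
    return [(ra + re) / 2.0
            for ra, re in zip(walk_around_selected, walk_around_elite)]


rng = random.Random(0)
# Hypothetical fitness values (e.g. MAPE in percent) of three ant-lions.
picks = [roulette_wheel([1.0, 10.0, 100.0], rng) for _ in range(1000)]
new_position = update_ant([1.0, 3.0], [3.0, 5.0])
```

The ant-lion with the lowest error is selected most often, yet the weaker ones keep a non-zero chance, which preserves some exploration.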

Development of Hybrid LS-SVM
In this paper, a hybrid Ant-Lion Optimizer Least-Square Support Vector Machine (ALO-LSSVM) is proposed to forecast the 24-hour load demand. As mentioned earlier, in LS-SVM with the RBF kernel, two parameters need to be tuned: gamma (γ) and sigma (σ²). Sigma is the kernel function parameter (squared bandwidth), while gamma is the regularization parameter that determines the trade-off between training error minimization and the smoothness of the estimated function. If the value of sigma is too large, the model underfits the sample data; conversely, if it is too small, the model overfits the sample data [16]. In ALO-LSSVM, ALO is used to enhance the performance of LS-SVM by optimizing the values of gamma and sigma. The objective of the optimization is to minimize the Mean Absolute Percentage Error (MAPE). The overall flowchart of ALO-LSSVM is shown in Figure 1.
First, the ant-lion and ant matrices are initialized randomly. In every iteration, the position of each ant is updated with respect to an ant-lion selected by the roulette wheel operator and to the elite. The boundary of the position update is defined in proportion to the current iteration number. The position update is then carried out by two random walks, one around the selected ant-lion and one around the elite. After all the ants have walked randomly, they are evaluated by the fitness function. If any ant becomes fitter than an ant-lion, its position is taken as the new position of that ant-lion in the next iteration. The best ant-lion is compared with the best ant-lion found so far during the optimization (the elite) and replaces it if necessary. These steps are repeated until the termination criterion is met, that is, until the difference between the maximum and minimum fitness is less than $10^{-7}$. The maximum number of iterations is set to 300 and the number of search agents is set to 20.
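The steps above can be sketched as a simplified ALO-style loop in Python. It is a sketch under several assumptions and not the authors' implementation: the trap is a symmetric box around the ant-lion, clipped to the search bounds; the shrinking ratio `1 + 100 * t / max_iter` is an illustrative schedule, not the one in [15]; each random walk has as many steps as the current iteration; the roulette wheel uses inverse fitness; and `toy_objective` is a hypothetical stand-in for the MAPE of an LS-SVM trained with a given (gamma, sigma) pair.

```python
import random


def alo_minimize(objective, lower, upper, n_agents=20, max_iter=300,
                 tol=1e-7, seed=0):
    rng = random.Random(seed)
    dim = len(lower)

    def roulette(fitness):
        # Lower fitness gets a larger slice of the wheel.
        weights = [1.0 / (f + 1e-12) for f in fitness]
        pick = rng.random() * sum(weights)
        running = 0.0
        for idx, w in enumerate(weights):
            running += w
            if pick <= running:
                return idx
        return len(weights) - 1

    def walk_around(center, ratio, n_steps):
        position = []
        for k in range(dim):
            # Assumed trap: shrunken symmetric box, clipped to the bounds.
            half = (upper[k] - lower[k]) / (2.0 * ratio)
            c = max(lower[k], center[k] - half)
            d = min(upper[k], center[k] + half)
            walk = [0.0]
            for _ in range(n_steps):
                walk.append(walk[-1] + (1.0 if rng.random() > 0.5 else -1.0))
            a, b = min(walk), max(walk)
            # Min-max normalize the last walk point into [c, d].
            position.append((walk[-1] - a) * (d - c) / (b - a) + c)
        return position

    antlions = [[rng.uniform(lower[k], upper[k]) for k in range(dim)]
                for _ in range(n_agents)]
    fitness = [objective(p) for p in antlions]
    best = min(range(n_agents), key=lambda i: fitness[i])
    elite, elite_fit = antlions[best][:], fitness[best]

    for t in range(1, max_iter + 1):
        ratio = 1.0 + 100.0 * t / max_iter  # illustrative shrinking schedule
        ants = []
        for _ in range(n_agents):
            selected = antlions[roulette(fitness)]
            r_a = walk_around(selected, ratio, t)
            r_e = walk_around(elite, ratio, t)
            ants.append([(p + q) / 2.0 for p, q in zip(r_a, r_e)])
        ant_fit = [objective(p) for p in ants]
        # Fitter ants replace ant-lions: keep the n_agents best overall.
        ranked = sorted(zip(fitness + ant_fit, antlions + ants),
                        key=lambda pair: pair[0])[:n_agents]
        fitness = [f for f, _ in ranked]
        antlions = [p for _, p in ranked]
        if fitness[0] < elite_fit:
            elite, elite_fit = antlions[0][:], fitness[0]
        if max(fitness) - min(fitness) < tol:  # termination criterion
            break
    return elite, elite_fit


def toy_objective(params):
    """Hypothetical stand-in for MAPE(gamma, sigma2)."""
    gamma, sigma2 = params
    return (gamma - 3.0) ** 2 + (sigma2 - 2.0) ** 2


best_params, best_value = alo_minimize(
    toy_objective, [0.0, 0.0], [10.0, 10.0], n_agents=10, max_iter=60)
```

In the hybrid scheme, `toy_objective` would be replaced by a function that trains LS-SVM with the candidate gamma and sigma and returns the resulting MAPE; the defaults of 20 search agents, 300 iterations and a tolerance of 1e-7 mirror the settings stated above.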

RESULTS AND ANALYSIS
LS-SVM with the 10-fold cross-validation technique is first used to find the values of gamma (γ) and sigma (σ²). The accuracy of the prediction is determined by calculating the Mean Absolute Percentage Error (MAPE) and the coefficient of determination (R²). LS-SVM is simulated ten times to determine the best prediction performance. The best, average and worst results in terms of MAPE are tabulated in Table 1. The results reveal that the best values of gamma and sigma are 132.3344 and 44.3020, which produce a MAPE of 4.3796%. A lower MAPE is better, while R² should approach 1, which indicates a good regression fit. In order to optimize the RBF parameters, a new algorithm, namely ALO-LSSVM, is proposed as described in Section 4. In ALO-LSSVM, the kernel parameters are optimized using ALO. The performance of ALO-LSSVM on the training data is illustrated in Figure 2. From the results in Figure 2, the optimum value of gamma (γ) is 340.2442, while that of sigma (σ²) is 321.1076. These values produce a MAPE of 4.356%. Table 2 compares the prediction performance of LS-SVM with cross-validation and ALO-LSSVM in terms of MAPE and R². From Table 2, it can be seen that ALO-LSSVM produces better MAPE and R² values. The performance of ALO-LSSVM for medium-term load forecasting is measured through the testing process, as shown in Figure 3. The figure compares the predicted and actual data for the one-year testing set, and it can be observed that the two are quite similar. For a clearer view of the performance of ALO-LSSVM, the testing data for one month (January 2011) and one week (the first week of January 2011) are plotted in Figure 4 and Figure 5, respectively. It can be seen from Figure 3 that electricity usage is highest in the summer season and lowest from February to March, which fall in the spring season.
The high electricity usage in summer is due to the increased use of air conditioners and to increased human activity during the holiday period. Major maintenance work is therefore best carried out from February to March.
Referring to Figure 5, the week starts on Saturday and runs until Friday. From Friday night, the electricity consumption increases until Saturday. This is due to more activity at the weekend, when people start to enjoy themselves after work. Based on the graph, the power provider should increase generation at the weekend compared with weekdays. This analysis helps the electricity provider to determine the optimal unit commitment and plan the schedule. From all the scenarios discussed above, medium-term load forecasting is essential for a power supply company to determine the electricity consumption at a specific time. With the forecast, the company can avoid over-generation and thereby cut operating costs. In addition, electricity collapses or trips can be avoided.

CONCLUSION
This paper has presented medium-term load forecasting using ALO-LSSVM to predict the load demand for every hour in a year. The power industry has a responsibility to make precise predictions in order to maintain a healthy power supply and economic competition between companies. In power planning, it is important not to overestimate demand in order to avoid overspending. Tariff determination also takes the load forecast as an input. The most important consideration is the stability of the electrical distribution, especially at the receiving ends. To avoid an electrical collapse in a particular area, medium-term load forecasting is needed to give a precise prediction, since the load demand varies with time. The results show that accurate prediction of the hourly load demand can be achieved using the ALO-LSSVM algorithm. In future work, it is suggested that ALO-LSSVM also be applied to long-term load forecasting to verify the robustness of this hybrid technique and its handling of nonlinearity.