Time series analysis of electric energy consumption using autoregressive integrated moving average model and Holt Winters model

With the demand for energy increasing, energy production is often insufficient, which makes accurate prediction of energy consumption important for efficient energy management. Appropriate demand-side forecasting therefore has great economic value. The objective of this paper is to present a suitable time series forecasting model for the energy consumption of Ohio/Kentucky using the autoregressive integrated moving average (ARIMA) model and the Holt-Winters model, and to assess forecast accuracy over different periods (daily, weekly, monthly). We apply both models and observe that the Holt-Winters model outperforms the ARIMA model in each case (daily, weekly and monthly observations). We also compare our results with a few existing time series forecasting analyses and find that the mean absolute percentage error (MAPE) of the Holt-Winters model is the lowest for the monthly data.


INTRODUCTION
Energy prediction is vital for electricity traders, who buy and sell electricity, shift loads, schedule maintenance and unit commitment to balance their electricity purchases, and supply their consumers with optimally priced products. The demand for electricity is increasing rapidly, and its impact on the environment is serious and harmful. Electricity use in the USA in 2018 was 16 times greater than in 1950 [1]. According to Energy Information Administration data, electricity consumption could be 79% higher by 2050 [2]. Proper prediction of electrical energy consumption is therefore needed to control excessive use and reduce energy waste, and so to minimize the harmful effect on the environment. Saving electricity does not require a large investment; it requires consuming electricity in the most efficient way [3]. Most of the proposed prediction models use statistical methods as a tool for forecasting future data [4]. Suitable prediction models are identified from factors such as the prediction period, the prediction interval, the duration of the time series and the characteristics of the time series [5].
In this paper, a comparative study is presented of two commonly used linear statistical demand models for energy consumption in Ohio/Kentucky. The models are a popular linear time series model, the autoregressive integrated moving average (ARIMA) model [6], and the Holt-Winters model. These models have exceptional adaptive capacity for dealing with linearity in problem solving [7]. We applied the ARIMA and Holt-Winters models for several reasons. ARIMA is a very strong time series model in which the past values of a series are used as independent variables, and ARIMA forecasts are generally reliable and accurate [8]. The Holt-Winters model is easier to apply, provides accurate forecasts and gives more weight to recent observations [9].
The comparison is made by considering the lowest root mean square error (RMSE), mean absolute percentage error (MAPE) and mean absolute error (MAE), and the highest mean absolute percentage accuracy (MAPA), of the time series predictions. Our main contribution is to identify the more suitable and accurate technique of two powerful time series models, ARIMA and Holt-Winters. Our work covers the daily, weekly and monthly electricity consumption of Duke Energy Ohio/Kentucky from 2012 to 2018 [10]. Section 2 presents existing work related to time series data and other models. Section 3 presents the theoretical background of our work and models. Section 4 explains the methodology. Section 5 analyzes the results and section 6 concludes our work.

RELATED WORKS
Considerable research has been done on time series data. Ma et al. [11] applied support vector machines (SVM) to forecast energy consumption in buildings in China. Their analysis showed that the SVM method can forecast energy consumption with good accuracy. They did not use other models in their paper or show any comparative results. Nie et al. [12] used a hybrid ARIMA and SVM technique for load forecasting to achieve better accuracy, but the hybrid model was not clearly explained mathematically. Fard and Zadeh [13] presented a hybrid technique for short-term load forecasting based on the wavelet transform, an artificial neural network and ARIMA. Such a hybrid model can increase complexity in many cases. In [14] several time series models were used to predict the electricity consumption of University Tun Hussein Onn Malaysia (UTHM). The models used are simple moving average (SMA), weighted moving average (WMA), Holt-Winters (HW), Holt linear trend (HL), simple exponential smoothing (SES) and centered moving average (CMA). Their analysis found that HW gives the smallest MAE and MAPE.
Pan [15] showed for airline passenger data that the Holt-Winters technique predicts more accurately than the ARIMA model. That study could have experimented with different periods by resampling the data. Chujai et al. [16] identified a suitable model and period (daily, weekly, monthly or quarterly) for forecasting household energy consumption. They used the ARMA and ARIMA models for data analysis, selecting the most appropriate model by the lowest AIC value and the appropriate period by the lowest RMSE. They did not use other metrics such as MAPE and MAE to analyze the results.
Bouktif et al. [17] used a long short-term memory (LSTM) based model to forecast electricity consumption. By selecting the best baseline, choosing appropriate features and applying a genetic algorithm, they found an optimal prediction model. In [18] support vector regression was applied to hourly energy consumption data from different buildings. The authors did not use different models or different data periods for a more thorough analysis. Vinagre et al. [19] proposed an energy consumption forecasting method for an office building based on an artificial neural network, using data from October 2014 to April 2015. Their average MAPE for the best network was 13.6%. This error is moderate and could be improved with other models or hybrid models.
Yoo and Myriam [20] proposed a neural network based model to predict residential electricity consumption in Seoul using historical data from January 1996 to July 2016. They identified some interesting characteristics that have a direct impact on the data. The MAPE for the total dataset was 4.85%. Bilal Şişman [21] compared ARIMA and the Grey Model by forecasting electricity consumption in Turkey, finding MAPE values of 4.9% and 5.6% respectively. Both were also found to perform considerably better than the MAED model, which has a MAPE of 14.8%. This study could be extended to other forecasting fields and other models to increase its accuracy and efficiency.
In [22] four different models based on a multi-objective genetic algorithm were applied to predict power consumption using data from 1 September 2010 to 29 February 2012 for a building in Spain. The authors compared these models with an existing perceptron model [23] and a naïve autoregressive baseline (NAB) model [24]. The NAB model had the worst MAPE. An auto-encoder based on deep learning was proposed in [25] for predicting electric energy consumption. The authors used five years of household power consumption data and obtained a mean squared error of 0.384, but did not report any percentage error. This work could also be extended with data from several households or a whole area to make it more applicable. Kim and Cho [26] proposed a CNN-LSTM neural network that can predict housing energy consumption effectively and achieved an RMSE of 61.14%.
In [27] the researchers proposed LSTM and GRU models to predict traffic flow. They showed that the MSE of the two RNNs is smaller than that of ARIMA, and that GRU performs better than LSTM in 84% of the time series. Tso and Yau [28] presented three techniques, regression analysis, neural network and decision tree, for the prediction of energy consumption. They compared the three models based on their RASE and found that the RASE values are quite similar. Other performance metrics could be used besides RASE to obtain a better comparison among the models.

THEORETICAL FRAMEWORK
Time series analysis works on historical data collected over a continuous span at a fixed period, which can be hourly, daily, weekly, monthly, quarterly or yearly depending on the usage and scope of the users. We use the ARIMA and Holt-Winters models to analyze the time series data of energy consumption for Duke Energy Ohio/Kentucky [10]. Figure 1 shows the general flow diagram of our work.
Hourly observations of energy consumption are used and then resampled to daily, weekly and monthly data. The ARIMA and Holt-Winters models are then applied to each of these data sets separately.

ARIMA model
ARIMA is a technique that analyzes autocorrelation in a time series by modeling it directly. It combines an autoregressive (AR) process, a moving average (MA) process and an integrated (I) series [29]. For an integrated (I) series to become stationary, it has to be differenced. The AR terms are lags of the stationary series, and the MA terms are lags of the forecast errors. The model uses the Box and Jenkins approach, which has been applied extensively in studies of time series forecasting [30]. The ARIMA model has three parameters (p, d, q) [31]. Here, p is the order of the autoregressive component, d is the number of differences required to make the series stationary, so that it can be modeled as ARMA(p, q), and q is the order of the moving average component [32].
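For reference, the role of (p, d, q) can be written compactly in the standard backshift form. The notation below is the usual textbook one and is an assumption here, not taken from this paper: B is the backshift operator (B y_t = y_{t-1}), the φ_i are the AR coefficients, the θ_j are the MA coefficients, c is a constant and ε_t is the forecast error at time t.

```latex
\[
\Bigl(1 - \sum_{i=1}^{p} \phi_i B^{i}\Bigr)\,(1 - B)^{d}\, y_t
  \;=\; c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^{j}\Bigr)\,\varepsilon_t
\]
```

With d = 1, for example, the factor (1 - B) y_t = y_t - y_{t-1} is the first difference of the series, and the remaining terms describe an ARMA(p, q) model of that differenced series.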

Holt Winters model
The Holt-Winters model is a popular technique for forecasting time series data. It works with three features of the time series: a typical (mean) value, a slope or trend over time, and a cyclical pattern that repeats seasonally. The model combines the effect of these three aspects to estimate a present or future value. It is called triple exponential smoothing because each of the three features (level, trend and seasonality) is represented by its own exponential smoothing. The model requires three smoothing parameters (α, β, γ), one for each smoothing, together with the duration of a season and the number of periods in a season.
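The three smoothing recursions can be illustrated with a minimal pure-Python sketch. It assumes the additive form of the model (the text does not state whether the additive or multiplicative form was used), the classic Winters seasonal update, and a simple initialisation from the first two seasons, which is only one of several common heuristics. The function name `holt_winters_additive` and its arguments are hypothetical and chosen for illustration.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")

    first_mean = sum(series[:m]) / m
    second_mean = sum(series[m:2 * m]) / m
    # Initial trend: average per-step change between the first two seasons.
    trend = (second_mean - first_mean) / m
    # Initial seasonal terms: first season with its mean and trend removed.
    centre = (m - 1) / 2
    seasonal = [series[i] - (first_mean + (i - centre) * trend) for i in range(m)]
    # Level at the end of the first season (time index m - 1).
    level = first_mean + centre * trend

    for t in range(m, len(series)):
        y = series[t]
        s_prev = seasonal[t % m]
        prev_level = level
        # Level: smooth the deseasonalised observation (alpha).
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + trend)
        # Trend: smooth the change in level (beta).
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonality: smooth the deviation from the new level (gamma).
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s_prev

    n = len(series)
    return [level + h * trend + seasonal[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]


# Synthetic example: linear trend plus a repeating four-period pattern.
pattern = [3, -1, -2, 0]
history = [10 + 2 * t + pattern[t % 4] for t in range(12)]
forecast = holt_winters_additive(history, season_length=4,
                                 alpha=0.5, beta=0.3, gamma=0.2, horizon=4)
```

On this noise-free synthetic series the forecast continues the trend and the seasonal pattern; on real consumption data the smoothing parameters would normally be chosen by minimising the in-sample forecast error.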

METHODOLOGY
Our main goal is to apply the ARIMA and Holt-Winters models to our data set (Duke Energy Ohio/Kentucky) to forecast energy consumption values for daily, weekly and monthly data, and to compare the results to find the more suitable model for forecasting. We worked with Python 3 in Jupyter Notebook. The workflow of our analysis is given in Figure 1.

Dataset
The dataset we used consists of hourly energy consumption data taken from PJM's website [10]. PJM Interconnection LLC is a regional transmission organization (RTO) in the USA. Figure 2

Loading and resampling the data set
The data set contains 57740 hourly observations of energy consumption from 12/31/2012 1:00 AM to 1/2/2018 12:00:00 AM. First, the data set is loaded from CSV format using the pandas library. The original data set is shown in Figure 3, where the hourly observations in megawatts are plotted. We then resample the data set into daily, weekly and monthly observations, which are plotted in Figure 4, Figure 5 and Figure 6 respectively.
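The resampling step can be sketched with pandas as follows. This is an illustration under stated assumptions: the hourly series is synthetic (the values are made up and do not come from the PJM file), the name `load_mw` is hypothetical, and the mean is used as the aggregation function because the text does not state whether the mean or the sum was taken.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the hourly load series: 90 days of made-up values.
index = pd.date_range("2013-01-01", periods=24 * 90, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.linspace(2000.0, 3000.0, len(index)),
                   index=index, name="load_mw")

# Aggregate the hourly observations to coarser periods.
# The mean is an assumption; .sum() would give total energy per period instead.
daily = hourly.resample("D").mean()
weekly = hourly.resample("W").mean()
monthly = hourly.resample("MS").mean()  # "MS" labels each bin by month start
```

For a real file, the same calls would apply after loading it with `pd.read_csv`, parsing the timestamp column as dates and using it as the index.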
We split our data set into a training set and a test set. The models are fitted on the training set and then evaluated on the test set. The training and test sets are split accordingly:
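A minimal sketch of a chronological hold-out split is given below, assuming that the most recent observations form the test set, as is usual for time series. The helper name `chronological_split` and the `test_size` value in the example are hypothetical and are not the split sizes used in this study.

```python
def chronological_split(values, test_size):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0 < test_size < len(values):
        raise ValueError("test_size must be between 1 and len(values) - 1")
    cut = len(values) - test_size
    # Earlier observations train the model; the last `test_size` evaluate it.
    return values[:cut], values[cut:]


# Example with 24 made-up monthly values and the last 6 held out.
monthly_values = list(range(100, 124))
train, test = chronological_split(monthly_values, test_size=6)
```

Keeping the order intact matters here: shuffling before the split would let the model see observations that lie in the future of the test period.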

Predicting values
After applying the models on the test set, the results obtained are illustrated in Figures 7-12. Each figure compares the original and forecasted values for one of the two models (ARIMA and Holt-Winters). The x axis indicates the time period (day, week or month) and the y axis indicates the energy consumption in megawatts.

Analysis using evaluation metrics
After obtaining the results of the two models for daily, weekly and monthly data, we compare them using the MAE, RMSE, MAPE and MAPA values, which are shown in Table 1. We also find that the longer the aggregation period, the more accurate the result. The best accuracy, 95.64%, is obtained for monthly data using the Holt-Winters model.
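The four metrics can be computed as in the sketch below. MAE, RMSE and MAPE follow their standard definitions; MAPA is taken here as 100 minus MAPE, an assumption inferred from the reported figures (95.64% accuracy alongside a 4.36% MAPE for the monthly Holt-Winters forecast). The function name `forecast_metrics` and the sample values are hypothetical.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, MAPE (%) and MAPA (%) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an actual value is zero; load data is positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Assumption: accuracy is reported as the complement of MAPE.
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "MAPA": 100.0 - mape}


# Made-up values for illustration only.
metrics = forecast_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

MAE and RMSE are in the units of the data (megawatts here), whereas MAPE and MAPA are percentages, which is what makes them comparable across the daily, weekly and monthly series.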

Comparison with other existing works
Finally, we compare our two models with a few other models. The comparison is based on the mean absolute percentage error (MAPE). Many other analyses of prediction models for electric energy consumption exist; Table 2 summarizes some of them alongside our analysis. Table 2 shows that the Holt-Winters model has the lowest MAPE (for monthly observations) of all the models compared. Our MAPE is 4.36% for monthly observations using the Holt-Winters model. This model can therefore be used to predict energy consumption with a view to planning electricity supply properly and minimizing energy waste for sustainable development. It can also be used in other time series prediction tasks for proper management and decision making.

CONCLUSION
Appropriate management of energy is an important concern today, and it requires an accurate model for forecasting energy consumption. We applied the ARIMA and Holt-Winters models to the energy consumption data of Ohio/Kentucky from PJM's website [10] and made a comparative analysis of the results. The main aim of our study was to find the more suitable of the two models for daily, weekly and monthly data. We determined the best suited model based on the minimum values of MAE, RMSE and MAPE. Our analysis showed that the Holt-Winters model was more accurate in each case (daily, weekly, monthly).
We then compared a few other existing models with our two models, which also showed that the Holt-Winters model we used has greater accuracy for the monthly observations. We can therefore conclude that, for this kind of long-term forecasting, the Holt-Winters model can work efficiently for proper energy management. Further studies can be done on similar data sets, taking into account other parameters and environmental factors that have a greater impact on the data. Hybrid models, or models such as ANN and genetic algorithms, could also be used to obtain better results in short-term load forecasting.