A Novel Forecasting Based on Automatic-optimized Fuzzy Time Series

In this paper, we propose a new forecasting method based on automatically optimized fuzzy time series and apply it to the Indonesia Inflation Rate (IIR). First, we propose a forecasting model built on two-factor high-order fuzzy-trend logical relationship groups (THFLRGs) for predicting the IIR. Second, we propose interval optimization using automatic clustering and particle swarm optimization (ACPSO) to optimize the intervals of the main factor IIR and the secondary factor SF, where SF = {Consumer Price Index (CPI), Bank of Indonesia (BI) Rate, Indonesian Rupiah/US Dollar (IDR/USD) exchange rate, Money Supply}. The proposed method achieves a lower root mean square error (RMSE) than previous methods.


Introduction
The concept of fuzzy time series (FTS) was first proposed by Song and Chissom, building on fuzzy logic [1], to address forecasting problems [2][3]. Various FTS models have since been used to solve a range of forecasting problems. Chen proposed an FTS forecasting model for enrolment forecasting based on fuzzy logical relationships (FLRs) [5]. Chen's model is still in use because of its high accuracy. In the economic sector, forecasting can be very profitable because it gives a view of future conditions [6][7][8][9], so highly accurate forecasts are needed to maximize profit. An FTS model for forecasting the TAIEX achieved higher accuracy than existing forecasting methods [10]. FTS models continue to be developed to improve accuracy; however, some existing models have not yet reached their best performance, and this paper proposes several improvements to obtain better forecasting results.
In recent years the FLR model has progressed in several ways. The first concerns the intervals of the FLRs. The interval is the variable with the greatest influence on accuracy, because the forecast is obtained from interval midpoints [5]. Obtaining accurate results therefore requires optimizing the interval values, and several methods have been used to do so. Automatic clustering can form the intervals from the historical data [11].
Automatic clustering helps in forming the intervals and thereby improves forecasting accuracy [9], [12]. Setting the right divide value in automatic clustering can further improve the forecasting result [13]. A previous study optimized the interval values of a single main variable using PSO and obtained high accuracy [15]. PSO performs a solution search that focuses on local search; because the search space of interval optimization is small, PSO can reach high accuracy. In addition, multi-variable PSO can also find an optimum solution in interval optimization and give high accuracy [10]. That study represents each particle with segments of equal length for the main factor and the secondary factor when partitioning the historical data.
The combination of automatic clustering and PSO can give better results [13]: automatic clustering forms clusters from the historical data, and PSO optimizes the value of each interval. This paper proposes interval optimization with ACPSO to obtain the best interval values. In addition, the PSO particle representation is made adaptive, so that each segment, which represents one variable, has its own length according to the result of automatic clustering. Inflation is influenced by multiple variables such as the exchange rate, the CPI, inflation at lag 4, the BI rate, and the previous money supply, so these variables must be included in the forecast [6], [16][17]. Inflation forecasts are an important factor for investors deciding where to invest [19]. The concept of FTS with multiple variables was proposed by Lee et al. (2006) [20]; building the current state of a fuzzy logical relationship from the fuzzy values of all variables is commonly known as the two-factor approach.
The two-factor approach forecasts by taking into account several variables that support the forecast [21]. To make use of the values of the variables at earlier times, the high-order concept is applied to the multi-variable problem [21]; high-order models capture the patterns formed over time, and the right value of the order n gives high forecasting accuracy [22]. Furthermore, a fuzzy trend can be derived from the fuzzy logical relationships that are formed, giving the trends "down-trend", "equal-trend", and "up-trend". Once the fuzzy trends are formed, the fuzzy logical relationships are grouped using the K-means method [23], and forecasting is then done per cluster using the fuzzy-trend probabilities [24]. Similarity measures capture the relationship between the fuzzy-set subscripts at the current time and at the previous time, and they can further improve forecasting accuracy [25].
This research proposes a forecasting model based on two-factor high-order fuzzy-trend logical relationships, which forecasts by considering several variables over the n previous time steps. The historical data are grouped using automatic clustering to form intervals, and the interval values are optimized using PSO. In addition, similarity measures are used to maximize the accuracy of the inflation forecast.
Based on these FLRs, a fuzzy trend can be formed to describe the pattern of the historical data as "down-trend", "equal-trend", or "up-trend"; the fuzzy trend is obtained by comparing a with b, c with d, and so on.
In addition, to obtain the best results, similarity measures between the subscripts of fuzzy sets (FS) are used. Given two FS $A_y$ and $A_x$, the similarity between the subscript y of $A_y$ and the subscript x of $A_x$ is given by Equation (1), where max and min are the largest and smallest values in the dataset.

The Proposed ACPSO-Based Optimal-Interval Partition Algorithm
In this section, we propose the ACPSO-based optimal-interval partition algorithm. Highly accurate forecasts depend on proper intervals, so the intervals are optimized as follows:
1. Form the intervals from the historical data using automatic clustering [13], [11] for each variable, whether main or secondary factor, e.g. the data of the main factor IIR.
1.1. First, sort the historical data in ascending order: $d_1$ ... where clus_diff is the average of the current cluster and $c_1, c_2, \ldots, c_n$ are the data in the current cluster.
1.3. Adjust the content of each cluster according to the following principles.
Principle 1: If a cluster contains more than two data, keep the smallest and the largest datum and remove the others.
Principle 2: If a cluster contains exactly two data, keep both.
Principle 3: If a cluster contains only one datum $d_q$, add the values "$d_q$ - avg_diff" and "$d_q$ + avg_diff" to the cluster, subject to the following situations.
Situation 1: If it is the first cluster, delete "$d_q$ - avg_diff" and keep $d_q$.
Situation 2: If it is the last cluster, delete "$d_q$ + avg_diff" and keep $d_q$.
2. Optimize the interval values obtained in step 1 using PSO [14], [26]. First, all PSO particles are initialized randomly. A PSO particle represents the interval boundary values of the main and secondary factors. The particle has 5 segments, which represent the interval values of the IIR, CPI, BI Rate, Exchange Rate, and Money Supply, respectively. The representation is adaptive: the segments have different lengths, each varying with the number of intervals produced by automatic clustering for that variable.
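Two pieces of the steps above lend themselves to a short sketch: the cluster adjustment of step 1.3 and the adaptive-length particle of step 2. The Python code below is a minimal illustration, not the authors' implementation. The function names `adjust_cluster` and `init_particle`, the `(low, high, count)` description of a segment, and the uniform random initialization are assumptions made for the example. It also assumes that, outside Situations 1 and 2, a single datum $d_q$ is replaced by the two added values, which the text implies but does not state.

```python
import random


def adjust_cluster(cluster, avg_diff, is_first=False, is_last=False):
    """Apply Principles 1-3 and Situations 1-2 to one non-empty cluster."""
    data = sorted(cluster)
    if len(data) > 2:
        # Principle 1: keep only the smallest and the largest datum.
        return [data[0], data[-1]]
    if len(data) == 2:
        # Principle 2: keep both data.
        return data
    d_q = data[0]
    if is_first:
        # Situation 1: drop "d_q - avg_diff" and keep d_q.
        return [d_q, d_q + avg_diff]
    if is_last:
        # Situation 2: drop "d_q + avg_diff" and keep d_q.
        return [d_q - avg_diff, d_q]
    # Principle 3, general case (assumption: d_q itself is replaced).
    return [d_q - avg_diff, d_q + avg_diff]


def init_particle(segments, rng):
    """Build one particle with one segment per variable.

    segments: list of (low, high, count) tuples, one per variable, where
    count is the number of boundary values that automatic clustering
    produced for that variable. Segment lengths may therefore differ.
    """
    particle = []
    for low, high, count in segments:
        # Assumption: boundaries are drawn uniformly and kept sorted.
        particle.append(sorted(rng.uniform(low, high) for _ in range(count)))
    return particle


if __name__ == "__main__":
    print(adjust_cluster([3.1, 3.4, 3.9, 4.2], avg_diff=0.5))
    # Hypothetical ranges and counts for two of the five variables.
    print(init_particle([(0.0, 10.0, 3), (100.0, 200.0, 5)], random.Random(0)))
```

Keeping each variable in its own list is one simple way to let segment lengths differ; a flat array with stored offsets would work equally well.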
The position of particle i is $x_i = \{[a_{i1}, a_{i2}, a_{i3}, \ldots, a_{ia}], [b_{i1}, b_{i2}, b_{i3}, \ldots, b_{ib}], [c_{i1}, c_{i2}, c_{i3}, \ldots, c_{ic}], [d_{i1}, d_{i2}, d_{i3}, \ldots, d_{id}], [e_{i1}, e_{i2}, e_{i3}, \ldots, e_{ie}]\}$, where the five segments represent the intervals of the IIR, CPI, BI Rate, Exchange Rate, and Money Supply, with segment lengths a, b, c, d, and e. The velocity $v_i$ has the same number of segments and the same segment lengths as the position $x_i$, and is initialized to zero.
3. After the particle positions are generated, the cost value cs of each particle is computed to measure how closely the forecast matches the actual data. Forecasting uses two-factor high-order fuzzy-trend logical relationship groups, as follows.
3.1. Convert the particle values into intervals for each variable, giving $u_{11}, u_{12}, \ldots, u_{1a}$; $u_{21}, u_{22}, \ldots, u_{2b}$; $u_{31}, u_{32}, \ldots, u_{3c}$; $u_{41}, u_{42}, \ldots, u_{4d}$; and $u_{51}, u_{52}, \ldots, u_{5e}$, which are the intervals of the IIR, CPI, BI Rate, Exchange Rate, and Money Supply, respectively. The midpoint $m_i$ of each interval $u_{ji}$ is also computed.
where n is the number of intervals of variable i, equal to a, b, c, d, or e for the respective variable.
3.3. Fuzzify each datum of each variable using the FS $A_{ji}$ to obtain its fuzzified value.
3.4. Generate the FLRs of the historical data from the fuzzified values by looking back over the previous n time steps.
For example, looking back two periods (n = 2) generates two-factor second-order fuzzy logical relationships (TSFLRs):

$(F_1(t-2), F_2(t-2), F_3(t-2), F_4(t-2), F_5(t-2)), (F_1(t-1), F_2(t-1), F_3(t-1), F_4(t-1), F_5(t-1)) \rightarrow F_1(t)$
$F_i$ is the fuzzified value of variable i.
3.5. Group the TSFLRs by fuzzy trend, i.e. "down-trend", "equal-trend", and "up-trend". The groups are formed first, and their number is $3^{(n \times 2) - 2}$. The fuzzy trend is determined by comparing the fuzzified values of the IIR and CPI variables in the current state of the FLRs [7] for each order n. With the second-order model, $3^{2} = 9$ groups are obtained. The effect of the order n on accuracy is analyzed in a later section.
3.6. Forecast time t from the fuzzified data values at t-1, ..., t-n: find the matching group by comparing the fuzzified values of the IIR and the CPI. The IIR forecast at time t is then derived from the fuzzy-trend probabilities of that group [9], $P_{down}$, $P_{equal}$, and $P_{up}$, which are obtained using Equations 4-6.
In Equations 4-6, $G_{down}$, $G_{equal}$, and $G_{up}$ are the numbers of "down-trend", "equal-trend", and "up-trend" relationships in the group. Beforehand, the similarity is computed using Equation 1 between the FS subscripts of $F_1(t-1)$ and $F_1(t)$, the fuzzified values of the IIR, denoted x and y respectively, giving S(y, x). The values x and y are then compared. If x > y, the similarity is added to the "down-trend" probability, so that $P_{down} = P_{down} + S(y, x)$; if x = y, it is added to the "equal-trend" probability, so that $P_{equal} = P_{equal} + S(y, x)$; and if x < y, it is added to the "up-trend" probability, so that $P_{up} = P_{up} + S(y, x)$. The forecast for time t is then obtained as follows [9].
where l is the subscript of the FS $A_l$, and $F_{1min}$ and $F_{1max}$ are the minimum and maximum values of the historical IIR data. $Q_1$ and $Q_2$ are randomly generated positive integers. The resulting forecast is Y'.
3.7. Compute the cost cs between the forecast Y' and the actual data Y using the RMSE [13], as shown in Equation 7.
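Parts of step 3 can be sketched in Python under stated assumptions. The sketch covers fuzzification (3.3), fuzzy-trend grouping (3.5), the trend counts behind the probabilities (3.6), and the RMSE cost (3.7). It is an illustration, not the authors' implementation. All function names are hypothetical. Fuzzification is assumed to map a datum to the index of the interval that contains it, and the probabilities are assumed to be relative frequencies of the trend counts. The similarity adjustment and the final forecast formula are left out because their equations are not given here.

```python
import math
from bisect import bisect_right
from collections import Counter, defaultdict


def fuzzify(value, boundaries):
    """Return the 1-based index of the interval containing value.

    boundaries: sorted interval edges; values outside the range are
    assigned to the first or last interval (assumption).
    """
    index = bisect_right(boundaries, value)
    return min(max(index, 1), len(boundaries) - 1)


def trend(previous, current):
    """Classify the move between two fuzzy-set subscripts."""
    if current < previous:
        return "down"
    if current > previous:
        return "up"
    return "equal"


def group_key(main, second, t, order):
    """Trends of the main and secondary factor inside the current state.

    Each factor contributes order - 1 trends, so there are
    3 ** (2 * order - 2) possible keys (9 for the second order).
    """
    key = []
    for series in (main, second):
        for k in range(t - order, t - 1):
            key.append(trend(series[k], series[k + 1]))
    return tuple(key)


def count_trend_groups(main, second, order):
    """Count, per group, how often the main factor then moves down/equal/up."""
    groups = defaultdict(Counter)
    for t in range(order, len(main)):
        outcome = trend(main[t - 1], main[t])
        groups[group_key(main, second, t, order)][outcome] += 1
    return groups


def trend_probabilities(counts):
    """Relative frequencies of the three trends (assumed form of P)."""
    total = sum(counts.values())
    return {name: counts[name] / total for name in ("down", "equal", "up")}


def rmse(forecast, actual):
    """Root mean square error between two equally long sequences."""
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    iir = [3, 4, 4, 2, 3, 4]  # hypothetical fuzzified IIR subscripts
    cpi = [1, 2, 2, 3, 3, 4]  # hypothetical fuzzified CPI subscripts
    for key, counts in count_trend_groups(iir, cpi, order=2).items():
        print(key, trend_probabilities(counts))
```

Using the trend tuple as a dictionary key means only the groups that actually occur in the history are stored, which keeps the sketch small for higher orders.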
4. After the cost of every particle has been computed, update $p_{best}$ and $g_{best}$, the best local and best global values. $p_{best}$ is the lowest cost ever achieved by an individual particle, while $g_{best}$ is the lowest cost ever achieved by the entire swarm.
5. The most important part of the PSO stage is computing the velocity $v_t$ of each particle [14].
where $w_{max}$ and $w_{min}$ are the initialized bounds of the inertia weight, and $Iteration_{max}$ and $Iteration$ are the maximum number of iterations and the current iteration. In addition, the velocity is limited to the range [$v_{min}$, $v_{max}$] to obtain the best solution as the PSO particles move [27].
where the value of k lies in the interval (0, 1]; the best value of k is determined by testing.
6. Once the new velocity is obtained, the position x of the particle is updated.
Stages 3-6 are repeated until the iterations are complete. The result of the PSO stage is the most appropriate set of interval values, so that applying THFLRGs to the testing data yields high forecasting accuracy.
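The update in stages 5 and 6 can be sketched as follows. Because the equations themselves are not reproduced here, the sketch assumes the commonly used forms: an inertia weight that decreases linearly from $w_{max}$ to $w_{min}$, the standard velocity update with acceleration coefficients `c1` and `c2`, and a velocity limit of k times half the search range. These forms, the function names, and the default coefficient values are assumptions, not the authors' confirmed formulas.

```python
import random


def inertia_weight(iteration, max_iteration, w_min, w_max):
    """Linearly decreasing inertia weight (assumed form)."""
    return w_max - (w_max - w_min) * iteration / max_iteration


def velocity_limits(x_min, x_max, k):
    """Symmetric velocity bounds scaled by k in (0, 1] (assumed form)."""
    v_max = k * (x_max - x_min) / 2
    return -v_max, v_max


def pso_step(x, v, p_best, g_best, w, v_min, v_max, rng, c1=2.0, c2=2.0):
    """Update one segment of a particle; apply it to each segment in turn.

    x, v, p_best and g_best are equally long lists of floats. The
    coefficients c1 and c2 are hypothetical defaults.
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        velocity = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        # Clamp the velocity to the allowed range before moving.
        velocity = max(v_min, min(v_max, velocity))
        new_v.append(velocity)
        new_x.append(xi + velocity)
    return new_x, new_v


if __name__ == "__main__":
    rng = random.Random(42)
    v_min, v_max = velocity_limits(0.0, 10.0, k=0.5)
    w = inertia_weight(iteration=10, max_iteration=100, w_min=0.4, w_max=0.9)
    x, v = pso_step([2.0, 5.0, 8.0], [0.0, 0.0, 0.0],
                    [2.5, 5.0, 7.0], [3.0, 4.0, 7.5], w, v_min, v_max, rng)
    print(x, v)
```

Because each segment has its own data range, calling `velocity_limits` once per variable keeps the step size proportionate to that variable's scale.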

Result and Analysis

Best Parameters
In this part, the ACPSO parameters were tested: the population size, the number of iterations, the combination of $w_{min}$ and $w_{max}$, and the value of k in PSO, as well as the value of p in automatic clustering. The data used in this study run from January 2005 to June 2017. Each test was run five times and the results averaged, because PSO is a stochastic method. The test results are shown in Figure 1.
The study proposes a high-order approach, so the order n also needs to be tested to obtain the best forecasting results. Tests were carried out for 2 ≤ n ≤ 7 and are shown in Table 1. The second-order model gives the highest forecasting accuracy, with the lowest RMSE of 1.747.

Comparison Works
This section compares the proposed method with previous IIR forecasting models: forecasting with fuzzy logic and a Sugeno fuzzy inference system (FIS), a neural fuzzy system (NFS) based on a backpropagation neural network [28], and a fuzzy neural system (FNS), a hybrid forecasting model combining Sugeno FIS and a backpropagation neural network [29]. The tests use the IIR as the main factor and the secondary factor SF, i.e. CPI, the BI Rate, Exchange Rate, and Money Supply. The data used for this test run from October 2005 to March 2008. The forecasting results and the comparison between methods are shown in Table 2.

Result Forecasting using a New Proposed Method
In this section we test forecasting on the data from January 2015 to June 2017, training ACPSO on the dataset from January 2005 to December 2014. ACPSO training uses the best parameters found earlier to form the interval values, which are then used to forecast January 2015 to June 2017; the results are shown in Table 4. The forecast achieves an RMSE of 0.2142, which indicates that the proposed model performs well. Several things contribute to this result. First, interval optimization helps the forecast pick the right midpoint. Second, the fuzzy trend captures the pattern in the data, from which the trend probabilities are obtained. Third, similarity measures support the final forecasting decision, which is based on THFLRGs and fuzzy-trend probabilities.

Conclusion
In this paper, we propose a new model for forecasting the IIR based on two-factor second-order fuzzy-trend automatic-optimized logical relationship groups and similarity measures between FS subscripts. The main contribution of this paper is a new FTS forecasting method, i.e. automatic-optimized FTS. It is a good combination for obtaining the best interval values: automatic clustering forms the intervals and particle swarm optimization optimizes the interval values [13]. This paper also takes into account several elements that influence forecasting, i.e. two-factor, second-order, and similarity measures between FS subscripts. Further research can develop an adaptive nth order for each variable. In addition, interval optimization using an auto-speed acceleration algorithm will be investigated to obtain better results.