Hybrid Machine Learning Model for Rainfall Forecasting

Abstract: The state of the weather has recently attracted considerable research attention. Weather governs many fields; in agriculture, for example, the types of crops a country can grow depend on the state of the atmosphere. It is therefore essential to know the weather of the coming days so that precautions can be taken. Forecasting rainfall in particular has drawn the attention of many researchers, because it helps to prevent flooding and other rainfall-related risks. This paper presents a robust hybrid technique that forecasts rainfall by combining Particle Swarm Optimization (PSO) with the Multi-Layer Perceptron (MLP), the most popular type of Feed Forward Neural Network (FFNN). The purpose of using PSO with MLP is not only to forecast rainfall but also to improve the performance of the network. This is demonstrated by comparison with Back Propagation (BP) algorithms such as Levenberg-Marquardt (LM) in terms of Root Mean Square Error (RMSE): the RMSE of the PSO-based MLP is 0.14, while that of the LM-based MLP is 0.18.


Introduction
Weather is the state of the atmosphere around us, described by parameters such as temperature, wind, and humidity [1]. It is a major factor in many areas. For humans, it affects daily life, constrains the activities people can perform, and determines what they wear. In agriculture, it determines which crops can be grown and which plants should be planted. Weather forecasting has therefore become an essential field of research, and it is vital for avoiding climate-related hazards [2].
Weather conditions change rapidly and continuously, so forecasting needs a suitable technique such as Artificial Neural Networks (ANN). ANNs, and MLP in particular, are the most commonly used technique in weather forecasting because they can solve nonlinear problems that traditional techniques cannot, extract meaning from complicated or imprecise data, and learn how to perform a task from the data provided for training [3]. MLP is usually trained with BP, a supervised training algorithm based on gradient descent. BP trains the MLP by adjusting the weights of each layer until the error between the desired output and the actual output is reduced [4]. Although BP has been widely used for training MLP, it has the following drawbacks [5]:
o Slow training: many iterations are required to adjust the weights, so the process takes a long time.
o A tendency to fall into local minima: BP adapts the weights to reduce the error between the desired and actual outputs, but the resulting solution is not necessarily the best or optimal one.
o Sensitivity to the choice of initial weights and biases: these must be selected carefully, because unsuitable values make training take a long time and can lead to suboptimal results.
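The local-minimum and initial-weight drawbacks can be illustrated with a minimal sketch. It assumes a one-dimensional toy error surface with two minima; the functions `loss`, `grad`, and `gradient_descent`, the learning rate, and the starting points are all hypothetical and stand in for a real network's error surface.

```python
def loss(w):
    # Toy non-convex "error surface" with two minima (illustrative only).
    return w**4 - 3 * w**2 + w


def grad(w):
    # Analytic derivative of the toy loss.
    return 4 * w**3 - 6 * w + 1


def gradient_descent(w0, lr=0.01, steps=2000):
    # Plain gradient descent: repeatedly step against the gradient.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


# Two different initial weights settle in two different minima.
w_left = gradient_descent(-2.0)
w_right = gradient_descent(2.0)
print(w_left, loss(w_left))
print(w_right, loss(w_right))
```

Both runs converge, but the run started at 2.0 stops in the shallower minimum, so the result depends entirely on where the search began.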
To avoid the drawbacks of BP listed above, an optimization algorithm based on Swarm Intelligence (SI) can be used as the training algorithm for MLP instead of gradient descent BP. SI algorithms, such as Particle Swarm Optimization (PSO), simulate the movement of animals in a swarm through a search space. The aim of this paper is to forecast rainfall by applying a hybrid intelligent technique, MLP based on PSO. The rest of this paper is organized as follows: Section 2 reviews related work, Section 3 introduces the techniques used in our research, Section 4 presents the proposed technique for forecasting rainfall, and Section 5 presents the experimental results and discussion.

RELATED WORK
This section reviews work related to our research topic, namely weather forecasting using ANN. A comparison of techniques for handling massive data (Data Mining, DM) found ANN to be more accurate than other techniques such as Naïve Bayes, Decision Tree (DT), and K-Nearest Neighbor (KNN) [6]. An FFNN was used to predict the minimum temperature in Jordan using 40 years of actual data from Arabia Weather, with the data divided into 60% training, 20% validation, and 20% test [7]. The BP algorithm was used to train an FFNN for forecasting rainfall in the region of Delhi, India [8]. An ANN (MLP) trained with the LM algorithm was developed to forecast air temperature in the area of Meknes, Morocco, from weather parameters such as atmospheric pressure, humidity, visibility, wind speed, and dew point; the observed Mean Square Error (MSE) was 3.65 with 70% of the data used for learning and 30% for testing, the smallest MSE among the data splits that were tried [9]. An ANN with BP was implemented to forecast the next day's weather from the input parameters of the previous day, on the grounds that ANN is a suitable technique for complex and nonlinear systems such as weather forecasting; 70% of the dataset was given as input to the network and 30% was held out as unseen data. The implemented network achieved an MSE of 0.0773 and an accuracy of 90% [10].

Hatem Abdul-Kader et al.; Hybrid Machine Learning Model for Rainfall Forecasting

Artificial Neural Network
An ANN is a mathematical or computational model inspired by biological neural networks. It consists of interconnected artificial neurons and processes information using a connectionist approach to computation [11]. As in the human brain, the network acquires knowledge through learning, and this knowledge is stored in interconnection strengths called synaptic weights [12], [13]. An ANN consists of layers, each containing a set of neurons that are connected to the neurons of the next layer through weights. Each neuron processes its input data mathematically and passes the result to the neurons in the next layer; the neurons in the last layer provide the network's output [14]. An ANN of this kind is called an MLP. It is the most common type of FFNN and is known as a supervised neural network because it requires a desired output in order to learn. An MLP learns how to perform a particular task through a learning algorithm such as gradient descent BP. Although BP is the most important algorithm for training neural networks for weather forecasting, another learning algorithm, LM, has emerged that is faster and more efficient than gradient descent [1].
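The layer-by-layer processing described above can be sketched as a forward pass. This is a minimal illustration and not the network used in this paper: the layer sizes, the weight values, the tanh hidden activation, and the helper names `layer` and `identity` are all assumptions chosen for readability.

```python
import math


def layer(inputs, weights, biases, activation):
    # Each neuron computes a weighted sum of its inputs plus a bias,
    # then applies the activation function.
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def identity(z):
    # Linear output activation, a common choice for regression.
    return z


# Hypothetical tiny network: 3 inputs -> 2 hidden neurons -> 1 output.
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]
output_b = [0.2]

x = [1.0, 2.0, 3.0]
hidden = layer(x, hidden_w, hidden_b, math.tanh)     # hidden layer results
output = layer(hidden, output_w, output_b, identity)  # network output
print(output)
```

Training, whether by BP, LM, or PSO, amounts to choosing the numbers in `hidden_w`, `hidden_b`, `output_w`, and `output_b` so that `output` approaches the desired value.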

Levenberg-Marquardt
LM is one of the most popular algorithms for nonlinear problems. It is another type of BP training algorithm that has also been used for ANN training. Although LM is more powerful than gradient descent techniques, it does not guarantee a global optimum for the problem [11], as our comparison with PSO in the experimental results section demonstrates.
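For reference, the textbook form of the LM weight update is given below. The paper itself does not state it, so the symbols are assumptions: $\mathbf{e}$ is the vector of errors (actual minus desired output) over the training samples, $\mathbf{J}$ is the Jacobian of $\mathbf{e}$ with respect to the weights $\mathbf{w}$, and $\mu$ is the damping factor.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1} \mathbf{J}^{\top}\mathbf{e}
```

A large $\mu$ makes the step behave like a small gradient descent step, while a small $\mu$ approaches the Gauss-Newton step. In both cases the update follows local derivative information, which is why LM can still settle in a local rather than a global optimum.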

Particle Swarm Optimization
PSO, developed by Kennedy and Eberhart in 1995, is a stochastic algorithm that models the movement of animals such as bird flocks and fish schools. Such flocks have no leader; instead, the members follow the one whose position is closest to the food source [15]. In the PSO algorithm, each particle in the population is regarded as a candidate solution, and the algorithm searches for optimal values in the way a bird flock or fish school does. Each particle changes its velocity and its position according to the following equations:

$$v_i(t+1) = v_i(t) + c_1 r_1 \left(p_i - x_i(t)\right) + c_2 r_2 \left(p_g - x_i(t)\right) \qquad (1)$$

$$x_i(t+1) = x_i(t) + v_i(t+1) \qquad (2)$$

where $v_i$ is the velocity of particle $i$, $t$ is the iteration number, $c_1$ and $c_2$ are the learning rates for the individual (local) and group (global) terms, $x_i$ is the position of particle $i$, $p_i$ is the best position found by particle $i$ (personal best), $p_g$ is the best position found by the whole swarm (global best), and $r_1$, $r_2$ are random values between 0 and 1 [15]. On each iteration, the current position of a particle is treated as a solution; if its fitness value is lower than that of the previous best position (a minimization problem), it becomes the new Pbest [16]. An inertia weight $w$ is used to adjust the velocity of the particles, balancing global exploration and local exploitation so that the particles move closer to each other, as shown in equation (3):

$$v_i(t+1) = w\, v_i(t) + c_1 r_1 \left(p_i - x_i(t)\right) + c_2 r_2 \left(p_g - x_i(t)\right) \qquad (3)$$
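The velocity and position updates of a single particle can be sketched as follows. This is a minimal sketch of the standard inertia-weight form of PSO; the function name `pso_step` and the default values of `w`, `c1`, and `c2` are illustrative assumptions, not settings reported in this paper.

```python
import random


def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Return the new (position, velocity) of one particle.

    x, v, pbest and gbest are lists of equal length. The defaults for
    w, c1 and c2 are illustrative, not the paper's settings.
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()  # random factors in [0, 1)
        # Inertia term + pull towards personal best + pull towards global best.
        vi_next = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_next)
        new_x.append(xi + vi_next)  # position moves by the new velocity
    return new_x, new_v


# A particle at rest at 0.0 is pulled towards a best position at 1.0.
new_x, new_v = pso_step([0.0], [0.0], pbest=[1.0], gbest=[1.0])
print(new_x, new_v)
```

A particle that already sits on both its personal best and the global best with zero velocity does not move, which is the fixed point the swarm converges towards.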

PROPOSED TECHNIQUE
Our proposed technique for forecasting rainfall consists of two steps. First step: develop an ANN with the following features, as shown in the figure. The input layer has four nodes (neurons). The single hidden layer has 20 nodes (neurons); although two hidden layers give a more accurate and efficient result than one, they take a long time to train [18].
The output layer has one node (neuron). Second step: PSO generates the weights. The dimension of the search space equals the total number of weights used to train the developed ANN, and the role of PSO is to find the best set of weights (particle position), as detailed in the following steps:

1. Initialize the population size and the maximum number of iterations for PSO.
2. Apply every particle to the constructed ANN and calculate the fitness function (ff), the RMSE, for each particle in the population.
3. Determine the personal best (pbest) and global best (gbest) at each iteration, i.e. the positions with the minimum RMSE.
4. Update the velocity of the other particles based on gbest using equation (3) and their positions using equation (2).
5. Repeat step two and calculate ff for each particle; if current ff of the particle < pbest then-current ff=
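The steps above can be sketched as a short training loop for the 4-20-1 network. This is a sketch under stated assumptions, not the authors' implementation: the flat weight layout, the inclusion of bias terms, the tanh hidden activation, the swarm settings, and the synthetic data are all hypothetical. The personal-best update in step 5 follows the usual PSO rule (keep the position with the lower RMSE).

```python
import math
import random

N_IN, N_HID, N_OUT = 4, 20, 1
# One weight per connection plus one bias per hidden and output neuron
# (including biases is an assumption): 80 + 20 + 20 + 1 = 121 dimensions.
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT


def predict(p, x):
    """Forward pass of the 4-20-1 MLP whose weights are the flat list p."""
    out = p[DIM - 1]  # output bias
    for h in range(N_HID):
        z = p[N_IN * N_HID + h]  # hidden bias
        for i in range(N_IN):
            z += p[h * N_IN + i] * x[i]
        out += p[N_IN * N_HID + N_HID + h] * math.tanh(z)
    return out


def rmse(p, X, y):
    # Fitness function: root mean square error of the network on (X, y).
    return math.sqrt(sum((predict(p, x) - t) ** 2 for x, t in zip(X, y)) / len(y))


def train_pso(X, y, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Step 1: initialise the swarm (each position is one full weight set).
    pos = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n_particles)]
    vel = [[0.0] * DIM for _ in range(n_particles)]
    # Step 2: evaluate the fitness of every particle.
    pbest = [p[:] for p in pos]
    pbest_fit = [rmse(p, X, y) for p in pos]
    # Step 3: the global best is the personal best with the lowest RMSE.
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(n_iter):
        for k in range(n_particles):
            # Step 4: update velocity, then position.
            for d in range(DIM):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            # Step 5: re-evaluate and keep improvements (assumed rule).
            fit = rmse(pos[k], X, y)
            if fit < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[k][:], fit
        history.append(gbest_fit)
    return gbest, history


# Synthetic stand-in data (NOT the paper's weather data).
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(N_IN)] for _ in range(30)]
y = [0.5 * a - 0.2 * b + 0.3 * c * d for a, b, c, d in X]
best_weights, history = train_pso(X, y)
print(DIM, history[0], history[-1])
```

Because gbest is replaced only when a particle finds a lower RMSE, the values in `history` never increase from one iteration to the next.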

Data Set
The constructed hybrid technique was applied to weather data for 2009 from New Capital Management, recorded at the astronomical site (Kottamia dome), where the weather parameters are collected by an Automatic Weather Station (AWS). Low temperature, high temperature, humidity, and wind speed are the input parameters, while rain rate, the quantity our research aims to forecast, is the output parameter. The data are divided into two parts. The first part, 90% of the data, is used in the training phase to train the network to forecast rainfall. The second part, the remaining 10% of unseen data, is used in the testing phase, which starts after training has completed successfully.
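The 90%/10% division can be sketched as below. The paper does not say whether the records were shuffled before splitting, so this sketch assumes a simple split in recording order; the function name `split_train_test` and the sample records are hypothetical.

```python
def split_train_test(rows, train_fraction=0.9):
    """Split rows into a leading training part and a trailing test part."""
    n_train = round(len(rows) * train_fraction)
    return rows[:n_train], rows[n_train:]


# Hypothetical records: (low temp, high temp, humidity, wind speed, rain rate).
records = [(10.0 + i, 20.0 + i, 50.0, 3.0, 0.1 * i) for i in range(20)]
train, test = split_train_test(records)

# Separate the four input parameters from the rain-rate target.
X_train = [r[:4] for r in train]
y_train = [r[4] for r in train]
print(len(train), len(test))
```

Keeping the test rows completely out of training is what makes the testing phase a measure of performance on unseen data.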

Result and Discussion
This section presents the implementation of MLP for forecasting rainfall using two different training algorithms: PSO and LM, a variant of BP.

Training Phase
In this phase, 90% of the weather data is used to train the two techniques, as shown in figure 3. The figure shows that the values obtained with PSO as the training algorithm for MLP are closer to the actual rainfall values than those obtained with LM.

Testing Phase
In this phase, the remaining 10% of unseen weather data is used to test the two techniques and their forecasting ability, as shown in figure 4. As in the training phase, the values obtained from the PSO-based MLP are closer to the actual rainfall values than those obtained from the LM-based MLP.

Performance of the network
The performance of the network is measured statistically through the error of each technique. The measure used is RMSE, as shown in table 1.
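RMSE is the square root of the mean squared difference between the actual and forecast values. A minimal sketch of the calculation follows; the function name `rmse` and the sample values are illustrative and are not the paper's data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only: two forecasts that miss by 3 and 4 units.
error = rmse([0.0, 0.0], [3.0, 4.0])
print(error)
```

A lower RMSE means the forecasts lie closer to the observed values, which is the basis on which the two training algorithms are compared.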

Conclusion
ANN is one of the most suitable techniques used in data mining. Although ANN has proved its efficiency, it suffers from several problems when BP is used as the training algorithm, owing to the drawbacks of BP described earlier. Training the ANN with a stochastic algorithm therefore gave more accurate results than training it with a deterministic one. The RMSE values in the experimental results and discussion section show this: 0.14 for the PSO-based MLP versus 0.18 for the LM-based MLP.
The proposed hybrid technique has two phases. In the first phase, the neural network is developed by determining the number of neurons in the input, hidden, and output layers. In the second phase, PSO is used to generate automatically the optimized weights with which the network from the first phase is trained.