Solar Photovoltaic Power Forecasting in Jordan using Artificial Neural Networks

In this paper, Artificial Neural Networks (ANNs) are used to study the correlations between solar irradiance and solar photovoltaic (PV) output power, which can be used to develop a real-time model that predicts the power produced on the next day. Solar irradiance records were measured by the ASU weather station located on the campus of Applied Science Private University (ASU), Amman, Jordan, and the solar PV power outputs were extracted from the 264 kWp power plant installed at the university. Intensive training experiments were carried out on 19249 records of data to find the optimum NN configurations, and the testing results show excellent overall performance in predicting the output power (in kW) for the next 24 hours, reaching a Root Mean Square Error (RMSE) value of 0.0721. This research shows that machine learning algorithms hold some promise for predicting power production based on various weather conditions and measures, which helps in the management of energy flows and the optimisation of integrating PV plants into power systems.


INTRODUCTION
The importance of solar Photovoltaic (PV) systems is increasing with ongoing industrial growth and rising energy demand in developed and developing countries [1,2]. Energy production by PV systems is becoming one of the main renewable energy sources, as it turns the power of the sun into electricity and can do so repeatedly without causing any damage to the environment.
The term "Photovoltaic" has been used in English since 1849 to denote the process of converting light into electricity [3]. Solar PV power plants are installed in two modes: grid-connected and stand-alone (Off-Grid) [4]. Off-Grid systems are used for isolated or remote areas and are normally on a smaller scale. Grid-connected systems, on the other hand, are widely operated and have proven to be hugely beneficial, but they are known as an uncertain, uncontrollable, and non-schedulable power source [5]. This is because this type of power production depends on the variable weather conditions of the geographical area in which the system is located.
To maintain stable power quality and scheduling and to improve investment feasibility, many studies reported in the literature have suggested different modeling, simulation, and prediction methods for the expected power production of solar PV plants [6,7]. In [8], the accuracy of one-day-ahead prediction of the power produced by a 1 MW PV system is compared for two methods, Support Vector Machines (SVM) and Multilayer Perceptron (MP) Artificial Neural Networks (ANNs). It was found that the two algorithms obtained almost the same accuracy, with 0.07 kWh/m² and 0.11 kWh/m² Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), respectively.
Various forecasting methods for PV power output were reviewed in [9]. It was demonstrated that any model that uses numerically predicted weather data does not take into account the effect of cloud cover and cloud formation at initialization; therefore, sky imaging and satellite data methods are used to predict the PV power output with higher accuracy. The article also outlined some key factors affecting the accuracy of prediction, such as forecast horizon, forecasting interval width, system size, and PV panel mounting method (fixed or tracking). The aim of the work published in [10] was to study the effect of forecast horizon on the accuracy of the method used to predict the PV power production, which was Support Vector Regression (SVR) using numerically predicted weather data. Two forecast horizons were studied: up to 2 and up to 25 hours ahead. As expected, forecasting up to 2 hours ahead was more accurate: RMSE and MAE increased by 13% and 17%, respectively, when the forecast horizon was extended to 25 hours ahead. The authors of [11] developed and validated a model that adapted an ANN with tapped delay lines and was built for one-day-ahead forecasting. The inputs were the irradiation and the sampling hours. The model achieved seasonal MAE ranging from 12.2% to 26% in spring and autumn, respectively.
The research work of [12] compared two short-term forecasting models: the analytical PV power forecasting model (APVF) and the MP PV forecasting model (MPVF), both of which use numerically predicted weather data and past hourly values of PV electric power production. The two models achieved similar results (RMSE varying between 11.95% and 12.10%) with forecast horizons covering all daylight hours of one day ahead, demonstrating their applicability for PV electric power prediction.
A new Physical Hybrid ANN (PHANN) method was proposed in [13] to improve the accuracy of the standard ANN method. The hybrid method is based on an ANN and clear sky curves for a PV plant. The PHANN method reduced the Normalized MAE (NMAE) and the Weighted MAE (WMAE) by almost 50% on many days compared to the standard ANN method. In [14], the PV energy production for the next day at 15-minute intervals was accurately predicted with an SVM model that uses historical data for solar irradiance, ambient temperature, and past energy production. The method demonstrated very good accuracy, with R² correlation coefficients of more than 90%, and the coefficient was strongly dependent on the quality of the weather forecast.
A model using a multilayer perceptron-based ANN was proposed in [5] for one-day-ahead forecasting. The daily solar power output and atmospheric temperature for 70 days were used for training the ANN. Across the different settings of the ANN model (number of hidden layers, activation function, and learning rule), the minimum Mean Absolute Percentage Error (MAPE) achieved was 0.855%.
In this research work, ANNs were optimized to find the best learning configurations and map the available solar irradiance records into the generated solar PV power. The proposed system provides real-time next-day predictions for the output power based on the knowledge extracted from the available historical data. These predictions can be used by many energy management systems [15] and power control systems of grid-tied PV plants [16].

PV SYSTEMS AND DATA
The data used in this research were collected from the existing weather station and solar PV plants at Applied Science Private University (ASU), as depicted in the map of Figure 1. There are four separate PV systems installed on the university campus, with a total generation capacity of 550 kWp: three rooftop-mounted solar systems and one ground-mounted test field. In this work, the power production data extracted from the PV system ASU09 (Faculty of Engineering) [17] are correlated with the solar irradiance measured over the same period by the weather station [18], which is located about 175 m from the engineering building (see Figure 1).
IJECE ISSN: 2088-8708

PV ASU09: Faculty of Engineering
The largest PV system is installed on top of the Faculty of Engineering building, with a capacity of 264 kWp. It consists of 14 SMA Sunny Tripower inverters (17 kW and 10 kW) connected to Yingli Solar (YL 245P-29b-PC) panels that are tilted by 11° and oriented 36° (S to E).
The dataset used in this research was created using all reported solar irradiance and PV power records between 15 May 2015 and 30 September 2017. It consists of 19800 PV power records and 20808 weather station records at a one-hour frequency.

THE PROPOSED PREDICTION SYSTEM

Preprocessing
As shown in Figure 2, the first stage of our system is to make sure that all data entries are consistent and available for both solar irradiance and PV power at each instance of time. A filter was designed to remove any irradiance record for which no PV power value is reported at the same time. In addition, many records were not reported correctly because of network connection disruptions and, in some cases, an inverter failure. After filtering, each irradiance record is associated with a solar PV output power value for the same hour, for a total of 19249 samples as depicted in Figure 3. The dataset is then normalized between 0 and 1 for better machine learning performance.
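The two preprocessing steps described above (dropping irradiance records that have no matching PV power value, then scaling to the range 0 to 1) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `pair_records` and `min_max_normalize`, the use of `None` to mark a missing reading, and the sample values are all assumptions introduced here.

```python
from datetime import datetime

def pair_records(irradiance, power):
    """Keep only the timestamps that have both an irradiance and a power value."""
    paired = []
    for ts in sorted(irradiance):
        p = power.get(ts)
        if p is None:  # no PV power reported for this hour: drop the record
            continue
        paired.append((ts, irradiance[ts], p))
    return paired

def min_max_normalize(values):
    """Scale a sequence linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical hourly readings (illustrative values, not from the ASU dataset).
irradiance = {
    datetime(2015, 5, 15, 10): 800.0,
    datetime(2015, 5, 15, 11): 950.0,
    datetime(2015, 5, 15, 12): 1000.0,
    datetime(2015, 5, 15, 13): 900.0,
}
power = {
    datetime(2015, 5, 15, 10): 150.0,
    datetime(2015, 5, 15, 11): None,  # e.g. lost to a network or inverter fault
    datetime(2015, 5, 15, 12): 200.0,
    datetime(2015, 5, 15, 13): 175.0,
}

paired = pair_records(irradiance, power)
irr_norm = min_max_normalize([irr for _, irr, _ in paired])
pow_norm = min_max_normalize([p for _, _, p in paired])
```

In this toy example the 11:00 record is discarded because its power value is missing, so three of the four hours survive the filter before normalization.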

Artificial Neural Networks
An ANN is a machine learning model that interconnects non-linear elements through adjustable weights. The structure of an ANN consists of three layers: input, hidden, and output layers, as illustrated in Figure 4 [19]. The input layer receives the raw data; these inputs are then processed in the hidden layer, and the computed information is finally sent out from the output layer [5].
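The flow of data through the three layers can be illustrated with a small forward pass. The sketch below is only an example under stated assumptions: the tanh activation, the linear output node, the layer sizes, and the random weights are chosen for illustration and are not taken from the paper, and the function name `forward` is hypothetical.

```python
import math
import random

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input -> hidden layer (tanh) -> single linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

random.seed(0)
n_inputs, n_hidden = 5, 3  # sizes chosen for illustration only

# Adjustable weights, initialised randomly here; training would tune them.
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

x = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical normalized inputs
y = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Learning consists of adjusting `w_hidden`, `b_hidden`, `w_out`, and `b_out` so that the output `y` approaches the target value for each training sample.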
Neural network learning methods provide a robust approach to interpreting real-world sensor data [20], and they have been widely used in the field of solar energy [21]. Artificial intelligence techniques can be used for sizing PV systems: stand-alone PV systems, grid-connected PV systems, and PV-wind hybrid systems [22]. Many learning algorithms could be used in our work [23,24,25], but the literature shows that ANN systems provide excellent prediction and classification results in similar applications such as [26] and [27].

ANN Experiments and Optimisation
In this research work, an ANN model was created with five inputs representing the solar irradiance (Irr) records at the same hour of the previous five days, which are associated with the current solar PV output power (P) as the target function (output node). So, if the mean power value for the hour h on day d is represented by P(d, h), the five inputs are the irradiance values Irr(d-1, h), ..., Irr(d-5, h) recorded at the same hour h on the five preceding days.
All training and testing experiments were carried out using the MATLAB ANNs toolbox with the aid of the back-propagation learning algorithm [28]. To optimize the model performance, the number of hidden layers was incremented from 1 to 30 and, at each value, ten experiments were carried out, each using a different random shuffle of the samples split into 80% (15399 samples) for training, 5% for validation, and 15% for testing. The average RMSE over the ten experiments was calculated to evaluate the performance for each number of hidden layers. A total of 300 training, validation, and testing experiments were run, and the best ANN configuration, with 22 hidden layers, provided an average RMSE of 0.0721 and a best validation MSE of 0.0053397; the testing performance is illustrated in Figure 5 and Figure 6. These results compare favourably with the methods and measures reported in the literature related to the current research.
A two-day prediction of the PV energy production for 23 and 24 May 2015 was simulated using our model (see Figure 7 (left)); the system provided an RMSE of 0.0234 and a correlation coefficient of R = 0.9983, which indicates an almost perfect linear relationship between solar irradiation and the output power generated. In addition, a ten-day simulation for the period from 20 to 30 July 2015 provided an RMSE of 0.0333 and R = 0.9965, as illustrated in Figure 7 (right).
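The data handling behind these experiments (five lagged irradiance inputs per target, a random 80/5/15 split, and the RMSE and R measures) can be sketched in a few lines. This is an illustrative sketch, not the MATLAB code used in the paper: the helper names, the (day, hour) indexing, and the synthetic series are assumptions, and a simple stand-in predictor (the mean of the five lagged inputs) takes the place of the trained network.

```python
import math
import random

def build_samples(irr, power, n_lags=5):
    """Pair P(d, h) with [Irr(d-1, h), ..., Irr(d-n_lags, h)]; skip gaps."""
    samples = []
    for (d, h), p in sorted(power.items()):
        lags = [irr.get((d - k, h)) for k in range(1, n_lags + 1)]
        if all(v is not None for v in lags):
            samples.append((lags, p))
    return samples

def split(samples, train=0.80, val=0.05, seed=0):
    """Shuffle the samples, then cut them into train / validation / test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])  # the remainder (about 15%) is for testing

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(a, b):
    """Pearson correlation coefficient R between two sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic, already-normalized hourly series for 30 days (illustrative only).
rng = random.Random(42)
irr = {(d, h): max(0.0, math.sin(math.pi * (h - 6) / 12)) * rng.uniform(0.8, 1.0)
       for d in range(30) for h in range(24)}
power = {key: 0.9 * value for key, value in irr.items()}

samples = build_samples(irr, power)
train_set, val_set, test_set = split(samples)

# Stand-in predictor (NOT the trained ANN): the mean of the five lagged inputs.
y_true = [p for _, p in test_set]
y_pred = [sum(lags) / len(lags) for lags, _ in test_set]
print(f"test RMSE = {rmse(y_true, y_pred):.4f}, R = {pearson_r(y_true, y_pred):.4f}")
```

Applied to the 19249 samples reported above, the same 80% cut yields 15399 training samples, which is consistent with the figure given in the text. Repeating the split with ten different seeds for each candidate network size and averaging the test RMSE would mirror the selection procedure described above.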

CONCLUSIONS
In this work, a machine learning model is proposed to analyse historical solar PV output power and solar irradiance data and to provide a set of decision rules that represent a proper prediction system. All data records in the period from 16 May 2015 to 30 September 2017 were used in this research work, and the ANN-based system provided promising results.
We believe that this work is the first to predict the next-day solar PV output power using real-time irradiation data measured accurately at a weather station located in the same geographical area as the PV plants.