energies

Article Probabilistic Forecasting Model of Solar Power Outputs Based on the Naïve Bayes Classifier and Kriging Models

Seungbeom Nam and Jin Hur *

Department of Electrical Engineering, Sangmyung University, Seoul 03016, Korea; [email protected] * Correspondence: [email protected]; Tel.: +82-2-781-7576

 Received: 27 September 2018; Accepted: 30 October 2018; Published: 1 November 2018 

Abstract: Solar power’s variability makes managing power system planning and operation difficult. Facilitating a high level of integration of solar power resources into a grid requires maintaining the fundamental power system so that it is stable when interconnected. Accurate and reliable forecasting helps to maintain the system safely given large-scale solar power resources; this paper therefore proposes a probabilistic forecasting approach to solar resources using the R program, applying a hybrid model that considers spatio-temporal peculiarities. Information on how the weather varies at sites of interest is often unavailable, so we use a spatial modeling procedure called kriging to estimate precise data at the solar power plants. The kriging method implements interpolation with geographical property data. In this paper, we perform day-ahead forecasts of solar power based on the probability in one-hour intervals by using a Naïve Bayes Classifier model, which is a classification algorithm. We augment forecasting by taking into account the overall data distribution and applying the Gaussian probability distribution. To validate the proposed hybrid forecasting model, we perform a comparison of the proposed model with a persistence model using the normalized mean absolute error (NMAE). Furthermore, we use empirical data from South Korea’s meteorological towers (MET) to interpolate weather variables at points of interest.

Keywords: hybrid spatio-temporal model; probabilistic forecasting; solar power forecasting

Energies 2018, 11, 2982; doi:10.3390/en11112982 www.mdpi.com/journal/energies

1. Introduction

Renewable energy utilization is rapidly growing all over the world. According to the International Renewable Energy Agency (IRENA), solar power generation capacity was 397 GW at the end of 2017. It took first place again with a capacity increase of 94 GW, accounting for a 32% increase, higher than the 10% growth rate [1]. In the United States, the electric power company PJM will supply 13% of its total load as renewable energy by 2031. In the case of solar power generation, installed capacity will be increased to 8.1 GW by 2027 [2]. South Korea’s solar power generation capacity is 904.1 MW and its cumulative installation capacity is 4519.4 MW as of 2016, which is a small percentage compared with existing generators. As solar power generation capacity increases, many electric utilities are expected to have difficulty managing power system planning and operation. Meteorological variables that change over time and space mean that solar energy is highly intermittent and uncertain. Among the various technologies available, forecasting can play an important role in preventing transmission congestion and maintaining a power balance, thus reducing this difficulty.

Various forecasting techniques are being formulated abroad. UC San Diego uses a Total Sky Imager to predict the movement and location of clouds to forecast solar irradiation levels, determines the sky cover every 30 s, and estimates the position of clouds 5 min in advance [3,4]. San Antonio, Texas, U.S. uses satellites to forecast and assess solar irradiation for use in solar power systems and power system planning and integration [5]. In addition, a review of recent forecasting methods as related to solar generation resources is shown in [6,7]. Reference [6] accentuates the need for accurate forecasting of intermittent resources to achieve power grid balance. Various methods are currently being studied for the forecasting of solar energy resources, such as clear sky models, regressive methods, Artificial Intelligence (AI) techniques, remote sensing models, Numerical Weather Prediction (NWP), local sensing, and hybrid systems.

In NWP-based forecasting, reference [8] used Environment Canada’s Global Environmental Multiscale NWP model to forecast hourly Global Horizontal Irradiance (GHI) and solar power for horizons out to 48 h. They applied spatial averaging and bias removal using a Kalman filter on the NWP forecasts to increase the predictions’ accuracy. Reference [9] used NWP forecasts from the National Weather Service’s (NWS) database as exogenous inputs for Artificial Neural Networks (ANNs) to predict hourly GHI and Direct Normal Irradiance (DNI) out to 6 days ahead of time for Merced, California. In stochastic forecasts, reference [10] constructed three autoregressive integrated moving average (ARIMA) forecasting models for next-hour GHI including cloud cover effects. The main difference in the three models tested concerns the inputs used: GHI in the first model, DNI and Diffuse Horizontal Irradiance (DHI) in the second model, and cloud cover (CC) in the third model. The authors used the third typical meteorological year (TMY3) data from the National Solar Radiation Data Base [11] to estimate the ARIMA models and to validate the forecasting accuracy. In AI forecasting, reference [12] used an ANN with exogenous variables to forecast the hourly solar power for a forecasting horizon of 12 h.
This model shows an improvement in root-mean-square error (RMSE) of about 2.07%. Reference [13] applied several stochastic and AI techniques (ARIMA, k-Nearest Neighbor (k-NN), ANN) to predict the one- and two-hour averaged power output of a 1 MW solar power plant in Merced, California. In hybrid forecasting, hybrid models have recently been used to improve forecast error by combining the benefits of forecasting models. Reference [14] tested hybrid forecasting models that combine information from processed satellite images with ANNs. Many solar forecasting methods use expensive and restricted equipment, such as satellite images and sky imagers, and complex equations. In addition, existing forecasting methods take a deterministic approach that represents a single value for a forecasting target. This has limited ability to express uncertainty in solar energy [15–18]. Some recent works have dealt with probabilistic forecasting for addressing uncertainty in solar energy. References [19,20] assess the performance of three probabilistic models for intra-day solar forecasting. The results demonstrated that the NWP exogenous inputs improve the quality of the intra-day probabilistic forecasts. Reference [21] shows three different methods for ensemble probabilistic forecasting, derived from seven individual machine learning models, to generate 24-h-ahead solar power forecasts. The results have shown that the ensemble models offer even more accurate results than any individual machine learning model like ARIMA. GEFCOM represents a general framework of probabilistic forecasts for renewable energy generation [22,23]. This is demonstrated by an application in probabilistic solar power forecasting. The results from its evaluation show that the RMSE and quantile score are quite low, verifying the precision of the proposed forecasting method. This paper proposes a probabilistic approach for solar power forecasting using spatial interpolation and a naïve Bayes Classifier. 
Section 2 describes the hybrid forecasting model of solar energy resources using kriging and naïve Bayes Classifier models. First, we show the spatial interpolation using what is called the kriging method. This method can spatially estimate the weather factor at different points of interest using a current weather value without historical data. Next, we propose a method for solar power forecasting using a naïve Bayes Classifier. This method can probabilistically forecast solar power in one-hour intervals. Section 3 verifies the proposed method. We apply a hybrid spatio-temporal forecasting model that combines kriging and naïve Bayes Classifier based on empirical NWP data in South Korea. We also perform a comparison of the proposed model with a persistence model using normalized mean absolute error (NMAE) to validate the proposed hybrid forecasting model.

2. A Hybrid Spatio-Temporal Forecasting Model

We introduce a hybrid method to probabilistically forecast solar power output using kriging and a naïve Bayes Classifier. Weather factors and historical data of solar power have a significant impact on forecasting the amount of solar power [24,25]. However, obtaining data on meteorological variables that can change at any point of interest necessitates expensive equipment, and acquiring weather data over time and determining the forecast generation outputs based on that takes a long time. This paper estimates weather variables at points of interest economically and efficiently through a type of spatial modelling called kriging. In addition, solar power varies according to humidity, temperature, cloud cover amount, and wind speed; in particular, irradiation has the greatest influence on determining solar power output [26]. However, since they have high variability and uncertainty, it is essential to use a probabilistic method. Figure 1 below shows the algorithm for the hybrid forecasting model.

Figure 1. Algorithm for the hybrid spatio-temporal forecasting model.

2.1. Spatial Modelling Using the Kriging Technique

Spatial modelling is a geo-statistics technique used in various fields of geoscience and engineering, such as pollution concentration and precipitation analysis, to analyze physical phenomena or data with spatial characteristics [27]. This allows values of interest to be estimated using only current values and location data without historical data. Kriging is one spatial interpolation method that estimates data at target points based on regression against observed $\alpha_i$ values of neighbor points, weighted according to spatial covariance values. The general formula for the ordinary kriging method is as shown in Equation (1) [28]:

$$\alpha^{*} = \sum_{i=1}^{n} \lambda_i \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{n} \lambda_i = 1 \tag{1}$$

where $\alpha^{*}$ is the estimated value of the target point, $\lambda_i$ is a weight regarding the spatial distances between two points, and $n$ is the number of neighbor points. In the ordinary kriging method, the weights must sum to 1 to avoid biased models and thus minimize the error variance between estimated and actual values. This method can be represented as shown in Figure 2.
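As a minimal sketch of Equation (1), the estimate at the target point is a weighted sum of the neighbor values, with the weights constrained to sum to 1. The function name, the tolerance, and the three readings and weights below are illustrative assumptions, not values from the study.

```python
def ordinary_kriging_estimate(weights, values, tol=1e-6):
    """Return the weighted sum of neighbor values, as in Equation (1)."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    # Ordinary kriging constrains the weights to sum to 1 (unbiasedness).
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("ordinary kriging weights must sum to 1")
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical irradiance readings (W/m^2) at three neighbor points.
neighbor_values = [500.0, 620.0, 580.0]
# Hypothetical weights chosen only for illustration.
neighbor_weights = [0.2, 0.5, 0.3]

estimate = ordinary_kriging_estimate(neighbor_weights, neighbor_values)
print(round(estimate, 1))
```

The sum-to-1 check is kept inside the function so that a set of weights that violates the constraint of Equation (1) is rejected rather than silently producing a biased estimate.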

Figure 2. Visualization of the kriging technique.

We minimize the error variance in the ordinary kriging method using the Lagrange function. $L(\lambda_1, \lambda_2, \cdots, \lambda_n; \omega)$ is the Lagrangian objective function and $\omega$ is the Lagrange multiplier, as shown in Equation (2) [29].

$$L(\lambda_1, \lambda_2, \cdots, \lambda_n; \omega) = \sigma^2 - 2\sum_{i=1}^{n} \lambda_i \sigma_{0i}^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j \sigma_{ij}^2 + 2\omega\left(1 - \sum_{i=1}^{n} \lambda_i\right) \tag{2}$$

Performing the kriging method requires the weights of the ambient points; these values are combined linearly to estimate the results of spatial modelling. The error variance is smallest at a stationary point of Equation (2), so we take the partial derivatives with respect to $\lambda$ and $\omega$ and set them to zero, as shown in Equations (3) and (4). Solving these equations yields the weights $\lambda$ for the neighboring points. As noted in Equation (1) [29], the sum of the weights is 1.

$$\frac{\partial L}{\partial \lambda_l} = -2\sigma_{0l}^2 + 2\sum_{i=1}^{n} \lambda_i \sigma_{il}^2 - 2\omega = 0, \quad l = 1, 2, \cdots, n \tag{3}$$

$$\frac{\partial L}{\partial \omega} = 2\left(1 - \sum_{i=1}^{n} \lambda_i\right) = 0 \tag{4}$$
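Setting the derivatives in Equations (3) and (4) to zero gives a linear system in the weights and the Lagrange multiplier. The sketch below solves such a system for three neighbor points in pure Python. The exponential covariance model, its parameters, and the planar coordinates are assumptions made for illustration; the paper does not state which covariance model was used.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Build an augmented copy so the inputs are not modified.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def covariance(p, q, sill=1.0, length=50.0):
    """Hypothetical exponential covariance model (not taken from the paper)."""
    return sill * math.exp(-math.dist(p, q) / length)


def ordinary_kriging_weights(neighbors, target):
    """Solve the stationarity conditions for the weights and the multiplier."""
    n = len(neighbors)
    # Rows 0..n-1: sum_i lambda_i * cov(i, l) - omega = cov(0, l), from Equation (3).
    matrix = [
        [covariance(neighbors[i], neighbors[l]) for i in range(n)] + [-1.0]
        for l in range(n)
    ]
    rhs = [covariance(target, neighbors[l]) for l in range(n)]
    # Last row: sum_i lambda_i = 1, from Equation (4).
    matrix.append([1.0] * n + [0.0])
    rhs.append(1.0)
    solution = solve_linear_system(matrix, rhs)
    return solution[:n], solution[n]


# Hypothetical planar coordinates in km.
neighbors = [(0.0, 0.0), (40.0, 0.0), (0.0, 60.0)]
target = (5.0, 5.0)
weights, omega = ordinary_kriging_weights(neighbors, target)
print([round(w, 3) for w in weights], round(sum(weights), 6))
```

With these assumed inputs the neighbor closest to the target receives the largest weight, and the weights sum to 1 as Equation (4) requires.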

2.2. Probabilistic Forecasting for Solar Power Using a Naïve Bayes Classifier

A Naïve Bayes Classifier is a machine learning technique based on Bayes’ probabilistic theory that represents the relationship between a prior probability and posterior probability using conditional probabilities [30–34]; it deals with decision problems mathematically under uncertainty and creates a simple and efficient model in fields such as document classification and disease prediction [35,36]. This method makes classification rules based on historical data and applies new values to the class that is arranged according to predefined rules. The general formula for a Naïve Bayes Classifier is as shown in Equation (5) [30–34]:

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)} \tag{5}$$

where A and B are different events, A is a random variable denoting the class of an instance, and B is a vector of random variables denoting observed attribute values. This assumes that attribute values are independent, which means that one of several attribute values does not affect the other attribute values. When depicted graphically, a Naïve Bayes Classifier has a form such as that shown in Figure 3.

Figure 3. A Naïve Bayes Classifier Network.

This assumption supports efficient algorithms that are simple to compute for test cases and to estimate from training data. The independence assumption means that the conditional probability is represented by the chain rule as shown in Equation (6) [30–34].

$$P(B_1, B_2, \cdots, B_n \mid A_k) = P(B_1 \mid A_k) \times P(B_2 \mid A_k) \times \cdots \times P(B_n \mid A_k) = \prod_{i=1}^{n} P(B_i \mid A_k) \tag{6}$$

The probability of the attribute value located in the denominator serves as a normalizing constant and does not affect the probability results, so we have omitted it for convenience of calculation. When we assign a new attribute value to a class based on predefined classification criteria, all classes will have a posterior probability, and the class among them with the maximum probability is finally selected as shown in Equation (7) [30–34], where $\hat{A}$ denotes the selected class and $m$ is the number of classes.

$$\hat{A} = \underset{A_k,\; k = 1, \cdots, m}{\arg\max} \left\{ \prod_{i=1}^{n} P(B_i \mid A_k) \times P(A_k) \right\} \tag{7}$$
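The selection rule in Equation (7) can be sketched as follows: each class is scored by its prior multiplied by the product of its per-attribute conditional probabilities, and the class with the largest score is chosen. The class labels, priors, and likelihoods are made-up numbers for illustration, and working in logarithms is an implementation choice rather than something described in the paper.

```python
import math


def log_posterior_score(prior, likelihoods):
    """Log of the prior times the product of per-attribute likelihoods."""
    # Logs turn the product in Equation (6) into a sum and avoid underflow.
    return math.log(prior) + sum(math.log(p) for p in likelihoods)


def classify(priors, likelihoods_by_class):
    """Return the class with the largest unnormalized posterior, as in Equation (7)."""
    return max(
        priors,
        key=lambda c: log_posterior_score(priors[c], likelihoods_by_class[c]),
    )


# Hypothetical output classes with made-up priors and likelihoods for two
# observed attributes (for example irradiance and temperature).
priors = {"1 MW": 0.5, "3 MW": 0.3, "5 MW": 0.2}
likelihoods_by_class = {
    "1 MW": [0.10, 0.40],
    "3 MW": [0.50, 0.40],
    "5 MW": [0.30, 0.20],
}
best = classify(priors, likelihoods_by_class)
print(best)
```

The denominator of Equation (5) is left out because it is the same for every class and does not change which class attains the maximum.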

A Naïve Bayes Classifier consists of prior and posterior probabilities. First, a “prior probability” is the probability of a class A before any attribute B is observed, and it is obtained from the class counts. $P(A_{prior})$ refers to the ratio of the number of instances of a specific class, $N(A_j)$, to the total over all $m$ classes, $\sum_{k=1}^{m} N(A_k)$. The prior probability is represented as shown in Equation (8) below. A larger data set generally yields more distinct classes, and the prior probability can be adjusted depending on the class range setting [30–34].

$$P(A_{prior}) = \frac{N(A_j)}{\sum_{k=1}^{m} N(A_k)} \tag{8}$$

A “conditional probability” is the probability of observing attribute value B given that class A occurs. We calculated it using a probability distribution, because attribute values are generally assumed to follow a normal distribution in the Naïve Bayes Classifier method; this distribution is symmetrical about the mean. The conditional probability is then determined by the new value and the fitted normal distribution. The Naïve Bayes Classifier treats discrete and numeric attributes somewhat differently. For each discrete attribute, the conditional probability is modeled as a single number between 0 and 1. In contrast, each numeric attribute is modeled by some continuous probability distribution over the range of that attribute’s values. We can write continuous attributes as shown in Equation (9), where $B_{new}$ is the new attribute’s value, $\mu$ is its mean, and $\sigma$ is its standard deviation [30,34].

$$P(B_{new} \mid A_k) = \frac{1}{\sqrt{2\pi\sigma_{A_k B_i}^2}} \, e^{-\frac{\left(B_{new} - \mu_{A_k B_i}\right)^2}{2\sigma_{A_k B_i}^2}} \tag{9}$$
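Equations (8) and (9) can be sketched together: the prior is a relative class frequency, and the conditional probability of a numeric attribute is a normal density evaluated at the new value. The short history of output classes and the irradiance mean and standard deviation below are hypothetical.

```python
import math
from collections import Counter


def prior_probabilities(class_labels):
    """Equation (8): relative frequency of each class in the history."""
    counts = Counter(class_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def gaussian_likelihood(b_new, mean, std):
    """Equation (9): normal density of a new attribute value for one class."""
    variance = std ** 2
    exponent = -((b_new - mean) ** 2) / (2.0 * variance)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * variance)


# Hypothetical hourly output classes (MW) from a short history.
history = [0, 0, 0, 0, 2, 2, 4, 4, 4, 6]
priors = prior_probabilities(history)

# Hypothetical irradiance statistics (W/m^2) for one output class.
density_at_mean = gaussian_likelihood(600.0, mean=600.0, std=100.0)
density_off_mean = gaussian_likelihood(800.0, mean=600.0, std=100.0)
print(priors[4], density_at_mean > density_off_mean)
```

In practice the mean and standard deviation would be estimated per class and per attribute from the historical data, which is what the subscripts on $\mu$ and $\sigma$ in Equation (9) indicate.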

3. Experimental Study: Probabilistic Forecasting of Solar Power Outputs in South Korea

3.1. Estimating Weather Data at Solar Farm “A” Using the Kriging Technique

In this case study, we forecast the output of solar power a day ahead in South Korea. We applied the kriging technique using weather information and location data for neighbor points. Figure 4 represents solar farm site “A” and 30 nearby meteorological towers in South Korea using Google Maps. Before applying the kriging method, we collected latitude and longitude data for both the meteorological towers and solar farm “A”. Table 1 shows the coordinate data for applying the kriging method. The weather information was measured at the meteorological towers from 1 January 2015 to December 2016 in hourly intervals.

Figure 4. Location data of both 30 METs and the solar farm.

Table 1. Location data of both 30 METs and the solar farm.

Name        Latitude  Longitude
MET1        37.57     126.96
MET2        37.47     126.92
MET3        37.27     126.98
MET4        37.33     127.94
MET5        37.90     127.73
MET6        37.67     128.71
MET7        37.80     128.85
MET8        37.75     128.89
MET9        36.77     126.49
MET10       36.63     127.44
MET11       36.37     127.37
MET12       34.81     126.38
MET13       35.28     126.47
MET14       35.84     127.11
MET15       35.37     127.12
MET16       35.17     126.89
MET17       34.94     127.69
MET18       34.62     126.76
MET19       34.76     127.21
MET20       35.42     126.69
MET21       35.10     129.03
MET22       36.22     127.99
MET23       36.57     128.70
MET24       36.43     129.04
MET25       35.51     127.74
MET26       35.81     129.20
MET27       35.32     128.28
MET28       35.16     128.04
MET29       35.22     128.67
MET30       35.22     128.89
Solar Farm  35.22     126.31
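Because the kriging weights depend on the spatial separation between points, it can help to look at the distances implied by Table 1. The sketch below computes great-circle distances from the solar farm to a few towers. It reads the first numeric column as latitude and the second as longitude, which is an assumption based on South Korea's geographic position, and the haversine formula is used here only for illustration; the paper does not say how distances were measured.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (latitude, longitude) pairs copied from Table 1 for a few towers.
towers = {
    "MET1": (37.57, 126.96),
    "MET12": (34.81, 126.38),
    "MET13": (35.28, 126.47),
    "MET16": (35.17, 126.89),
    "MET20": (35.42, 126.69),
}
solar_farm = (35.22, 126.31)

distances = {
    name: haversine_km(*solar_farm, *coords) for name, coords in towers.items()
}
nearest = min(distances, key=distances.get)
print(nearest, round(distances[nearest], 1))
```

Among the towers listed in this sketch, MET13 is the closest to the solar farm, which is consistent with it having the greatest impact on the point of interest.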

We estimated the weather data at the solar farm using information relating to neighboring points; we need the weights of 30 meteorological points for 1 site of interest. We calculated these weights based on the spatial correlation, and the sum of the weights must be 1. Figure 5 shows the locations and weights of neighboring points. MET13 has the greatest impact on the point of interest. Table 2 provides detailed weights of METs.

Figure 5. Locations and weights of METs.


Table 2. Weights of 30 METs for solar farm “A”.

Neighbor Point  Weight     Neighbor Point  Weight     Neighbor Point  Weight
MET1            0.016438   MET11           0.018432   MET21           0.017088
MET2            0.018829   MET12           0.123704   MET22           0.024010
MET3            0.017793   MET13           0.379523   MET23           0.020570
MET4            0.023736   MET14           0.021432   MET24           0.020350
MET5            0.023736   MET15           0.020195   MET25           0.021448
MET6            0.015201   MET16           -0.01156   MET26           0.023946
MET7            0.012228   MET17           0.020206   MET27           0.017606
MET8            0.012259   MET18           0.008242   MET28           0.015003
MET9            0.024126   MET19           0.022546   MET29           0.014557
MET10           0.019265   MET20           0.026276   MET30           0.012812
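The weights in Table 2 can be checked against the constraint in Equation (1). The short sketch below sums the 30 tabulated weights and finds the tower with the largest weight; the values are copied from Table 2, and the variable names are illustrative.

```python
# Weights copied from Table 2, ordered MET1 ... MET30.
weights = [
    0.016438, 0.018829, 0.017793, 0.023736, 0.023736,
    0.015201, 0.012228, 0.012259, 0.024126, 0.019265,
    0.018432, 0.123704, 0.379523, 0.021432, 0.020195,
    -0.01156, 0.020206, 0.008242, 0.022546, 0.026276,
    0.017088, 0.024010, 0.020570, 0.020350, 0.021448,
    0.023946, 0.017606, 0.015003, 0.014557, 0.012812,
]

total = sum(weights)
# 1-based index of the tower with the largest weight.
largest = max(range(len(weights)), key=lambda i: weights[i]) + 1
print(round(total, 4), largest)
```

The total agrees with 1 to within the rounding of the tabulated values, and the largest weight belongs to MET13. The negative weight for MET16 is not an error in itself, since ordinary kriging constrains only the sum of the weights, not their signs.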

We applied the kriging method to estimate the weather information at the solar farm in 2016. The estimates of irradiance, humidity, and temperature are linear combinations of the weights and the weather values at neighboring points, following Equation (1). Figure 6 shows the results of the kriging method at solar farm “A” for August 2016.

Figure 6. Weather data estimated at solar farm “A” and measured at three highly weighted neighboring points for August 2016. (a) Irradiance; (b) Humidity; (c) Temperature.

To verify the accuracy of the model, cross validation was performed. Treating the measured meteorological data at one point as unknown, we estimated it from the remaining data and compared the estimate with the actual value. Table 3 shows the cross validation results for irradiance at five points, each representing an administrative district.

Table 3. Cross validation error for irradiance of five METs.

Neighbor Point  Error (%)
MET1            15.2875
MET4            19.6477
MET9            19.2359
MET13           15.0040
MET22           15.1128
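The cross validation procedure can be sketched as a leave-one-out loop: each point is hidden in turn, estimated from the others, and compared with its measurement. In the sketch below a plain average stands in for the kriging estimate, and the relative absolute error in percent is an assumed metric, since the exact error definition behind Table 3 is not given. The readings are hypothetical.

```python
def leave_one_out_errors(values, estimate_from_others):
    """Hide each point in turn and compare its estimate with the measurement."""
    errors = {}
    for name, actual in values.items():
        others = {k: v for k, v in values.items() if k != name}
        predicted = estimate_from_others(name, others)
        # Relative absolute error in percent (an assumed metric).
        errors[name] = 100.0 * abs(predicted - actual) / abs(actual)
    return errors


def mean_of_others(name, others):
    """Placeholder estimator: plain average of the remaining points."""
    return sum(others.values()) / len(others)


# Hypothetical irradiance readings (W/m^2) at one hour.
readings = {"MET1": 500.0, "MET4": 550.0, "MET9": 450.0}
errors = leave_one_out_errors(readings, mean_of_others)
print({name: round(err, 2) for name, err in errors.items()})
```

Replacing `mean_of_others` with a function that recomputes kriging weights for the hidden point would bring the sketch closer to the procedure described in the text.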

Now, we need to make a classifier for solar power output forecast using a Naïve Bayes Classifier based on a probabilistic approach. Therefore, we must map weather data and solar power output data by the hour.

3.2. Probabilistic Forecasting for Solar Power Using a Naïve Bayes Classifier Technique

We forecasted the output of solar farm “A” based on the weather data estimated from the 30 neighboring points. The prior and conditional probability are essential components of forecasting the solar power output. We calculated the prior probability for each solar power output at solar farm “A” using Equation (8); the results are as shown in Figure 7. In this figure, a value of 0 among the solar power output values was excluded because the prior probability corresponding to 0 was significantly higher than the other output values.

We used the prior probability calculated above to forecast the amount of solar power by multiplying it by the conditional probability. In addition, we could optionally utilize Laplacian correction to avoid the problem in which the prior probability is 0 for outputs that did not occur in the past.

Finally, we must compute the conditional probability regarding each weather factor to forecast the solar power output since we treat numeric attributes over the range of an attribute’s values. Figure 8 shows the normal distribution of solar irradiance that could occur when the solar power output is 4.9 MW. In the event of 4.9 MW, it indicates that irradiance occurs in the middle.


Figure 7. Prior probability used in the simulation.

Figure 8. Probability distribution of the irradiance given that the solar output is 4.9 MW.

We calculated the probabilities of the estimated weather values, such as those in Figure 6, at the solar farm by applying them at the forecast point to the predefined classifier. We classified all models according to the solar power output values provided with the estimated weather values and expressed the probability through a continuous normal distribution function. Combining the conditional probability and the prior probability as shown in Equation (6) allows us to determine the posterior probability for each model. We then select the model with the maximum probability, as shown in Equation (7). Based on data from 2015 up to the day before each forecast, we forecast the solar power output for the year 2016 by selecting the models with the highest probability. Figure 9 compares the actual and forecast solar power outputs over August. The red line is the output value predicted by the Naïve Bayes Classifier technique, and the blue line is the actual output value measured at solar farm “A”.

We performed day-ahead forecasting for the year 2016 and used the NMAE (%) to assess the accuracy of the forecast using the Naïve Bayes Classifier method. Table 4 lists the NMAE for the results of each forecasting model, expressed as a percentage of the installed capacity, 11 MW. The NMAE is defined in Equation (10) [37,38].
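The classification step described above, in which each candidate output class is scored by its prior probability and a normal conditional probability and the highest-scoring class is selected, can be illustrated with a minimal sketch. It is not the authors' R implementation: the function names, the two weather features (irradiance and a hypothetical temperature column), and the toy training values are assumptions made for illustration only. Scores are accumulated in log space, where the product of prior and conditional probabilities becomes a sum.

```python
import math


def log_gaussian(x, mean, std):
    """Log-density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def fit(samples):
    """samples: list of (output_class, features).

    Returns {output_class: (prior, [(mean, std) per feature])}.
    """
    grouped = {}
    for label, features in samples:
        grouped.setdefault(label, []).append(features)
    model = {}
    for label, rows in grouped.items():
        stats = []
        for column in zip(*rows):  # iterate over features
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, max(math.sqrt(var), 1e-6)))  # avoid zero spread
        model[label] = (len(rows) / len(samples), stats)
    return model


def predict(model, features):
    """Return the output class with the maximum log-posterior score."""
    def log_posterior(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, m, s) for x, (m, s) in zip(features, stats)
        )
    return max(model, key=log_posterior)


# Hypothetical training data: (output class in MW, (irradiance, temperature)).
samples = [
    (0.0, (0.0, 12.0)), (0.0, (10.0, 13.0)), (0.0, (20.0, 14.0)),
    (2.5, (300.0, 18.0)), (2.5, (350.0, 20.0)), (2.5, (400.0, 22.0)),
    (4.9, (600.0, 24.0)), (4.9, (650.0, 26.0)), (4.9, (700.0, 28.0)),
]
model = fit(samples)
forecast_class = predict(model, (620.0, 25.0))
print(forecast_class)
```

Because the conditional probabilities are continuous densities, a weather value that never occurred in the training data still receives a score, which is the property the continuous normal distribution is used for here.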

$$\mathrm{NMAE}(\%) = \frac{100}{N} \sum_{h=1}^{N} \frac{\left| SP_h - SP_h^{forecast} \right|}{SP_{ins}} \qquad (10)$$

where $SP_h$ and $SP_h^{forecast}$ are the actual and forecasted solar power for period $h$, $N$ is the number of forecast periods, and $SP_{ins}$ refers to the installed solar power capacity.
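As a worked instance of Equation (10), the following sketch computes the NMAE for a short series. The function name and the four hourly values are hypothetical; only the 11 MW installed capacity comes from the text.

```python
def nmae_percent(actual, forecast, installed_capacity):
    """NMAE (%) as in Equation (10): mean absolute error over installed capacity."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    total_abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_abs_error / (len(actual) * installed_capacity)


# Hypothetical hourly outputs in MW; 11 MW is the installed capacity of the farm.
actual = [0.0, 2.0, 5.5, 4.0]
forecast = [0.0, 2.5, 4.9, 4.4]
value = nmae_percent(actual, forecast, 11.0)
print(round(value, 3))
```

The absolute errors sum to 1.5 MW over four periods, so the result is 100 × 1.5 / (4 × 11), roughly 3.41%.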

Figure 9. Comparison of the actual and forecast solar power in August 2016.

Table 4. The normalized mean absolute error (NMAE) for each month of forecasting in 2016.

| Month | NMAE (%) | Month | NMAE (%) |
|---|---|---|---|
| January | 5.159702 | July | 3.017107 |
| February | 4.423764 | August | 2.969330 |
| March | 5.136241 | September | 3.212500 |
| April | 4.444066 | October | 4.488025 |
| May | 3.479961 | November | 4.068056 |
| June | 3.540152 | December | 3.355083 |

Because the output is very likely to be 0 during periods when the sun is not up, we also considered the NMAE of the forecasting model excluding the periods in which both the forecast and measured values were 0. Figure 10 shows the forecast error with and without 0 outputs. The NMAE of the forecasting model for the year 2016 is about 4%, and the accuracy for summer is better than that for winter.

Several North American utilities have been forecasting solar power since the 2000s. Most operating entities use NMAE as a forecasting error assessment; as shown in Table 5, they had levels below 10% in 2013 [37]. In this paper, we identified an improvement in the proposed model compared to the persistence model, as shown in Figure 11 and Table 6.

Table 5. Assessments of forecasting accuracy.

| Operation Entity | 2013 Value |
|---|---|
| CAISO | MAE < 8% |
| Idaho Power | MAE < 6.5% |
| Xcel Energy | MAE < 9.8% |
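The effect of excluding periods in which both the forecast and the measured output are 0 can be seen in a small sketch. The function name and the eight hourly values are hypothetical, and this is one plausible reading of the exclusion rule rather than the authors' exact procedure.

```python
def nmae_excluding_zero_pairs(actual, forecast, installed_capacity):
    """NMAE (%) over periods where actual and forecast are not both zero."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must be equally long")
    pairs = [(a, f) for a, f in zip(actual, forecast) if not (a == 0 and f == 0)]
    if not pairs:
        return 0.0  # nothing to evaluate once zero-zero periods are dropped
    total_abs_error = sum(abs(a - f) for a, f in pairs)
    return 100.0 * total_abs_error / (len(pairs) * installed_capacity)


# Hypothetical outputs in MW, with night-time hours at both ends.
actual = [0.0, 0.0, 0.0, 2.0, 5.5, 4.0, 0.0, 0.0]
forecast = [0.0, 0.0, 0.3, 2.5, 4.9, 4.4, 0.0, 0.0]
capacity = 11.0

all_periods = 100.0 * sum(abs(a - f) for a, f in zip(actual, forecast)) / (
    len(actual) * capacity
)
daylight_only = nmae_excluding_zero_pairs(actual, forecast, capacity)
print(round(all_periods, 3), round(daylight_only, 3))
```

Dropping the zero-zero periods shrinks the denominator without changing the summed error, so the filtered NMAE is never lower than the unfiltered one; in this toy series it doubles because half of the periods are removed.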


Figure 10. Forecast error for solar power output in 2016.

Figure 11. Monthly NMAE of each forecasting method for solar power output in 2016.

Table 6. NMAE for each month of forecasting in 2016.

| Month | Proposed Model | Persistence |
|---|---|---|
| January | 5.159702 | 6.0136 |
| February | 4.423764 | 7.7273 |
| March | 5.136241 | 7.0781 |
| April | 4.444066 | 9.9371 |
| May | 3.479961 | 7.6248 |
| June | 3.540152 | 9.3845 |
| July | 3.017107 | 5.6085 |
| August | 2.969330 | 4.6752 |
| September | 3.212500 | 6.8424 |
| October | 4.488025 | 9.2645 |
| November | 4.068056 | 7.6265 |
| December | 3.355083 | 7.0162 |

4. Conclusions

As solar energy depends on climate phenomena, accurate forecasting technologies for the stable interconnection of solar power generation are becoming increasingly important. In this paper, we proposed a probabilistic solar power output forecast method using a hybrid spatio-temporal forecasting model. We applied two estimating techniques to forecast the output of a solar power farm. Firstly, since numerical weather prediction models are difficult to apply in forecasting, kriging helps to perform spatial modeling for sites of interest using data from nearby points, as noted in Section 2.1. The results of the method give us relatively precise weather information. Secondly, we applied the Naïve Bayes Classifier method based on the probability. Unlike previous studies, in which weather values were difficult to apply at the exact point and discrete values were used, the proposed model uses a continuous probability distribution and can therefore account for meteorological values that were not previously applied to the classifier. Finally, we applied the hybrid spatio-temporal forecasting model using empirical data to a solar farm located in South Korea and evaluated the performance of the model. As a result, it was confirmed that the NMAE of the forecasting model had a value of less than 10%. The proposed forecasting model based on a probabilistic approach shows improved results when compared to the deterministic persistence forecasting model. In future work, we will improve the probabilistic forecasting model by using a different probability distribution that better reflects the characteristics of the data. We will also represent a probabilistic range to ensure better results.
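The comparison summarized above can be checked with simple arithmetic on the monthly values of Table 6. The sketch below only averages the published numbers; the variable names and the grouping of months into summer (June to August) and winter (December to February) are assumptions made for illustration.

```python
# Monthly NMAE (%) copied from Table 6: (proposed model, persistence model).
table6 = {
    "January": (5.159702, 6.0136),
    "February": (4.423764, 7.7273),
    "March": (5.136241, 7.0781),
    "April": (4.444066, 9.9371),
    "May": (3.479961, 7.6248),
    "June": (3.540152, 9.3845),
    "July": (3.017107, 5.6085),
    "August": (2.969330, 4.6752),
    "September": (3.212500, 6.8424),
    "October": (4.488025, 9.2645),
    "November": (4.068056, 7.6265),
    "December": (3.355083, 7.0162),
}

mean_proposed = sum(p for p, _ in table6.values()) / len(table6)
mean_persistence = sum(q for _, q in table6.values()) / len(table6)
months_improved = sum(1 for p, q in table6.values() if p < q)


def seasonal_mean(months):
    """Average proposed-model NMAE over the given months."""
    return sum(table6[m][0] for m in months) / len(months)


# Assumed season definitions (not stated in the paper).
summer = seasonal_mean(["June", "July", "August"])
winter = seasonal_mean(["December", "January", "February"])

print(round(mean_proposed, 2), round(mean_persistence, 2), months_improved)
print(round(summer, 2), round(winter, 2))
```

Under these assumptions the proposed model averages about 3.94% against about 7.40% for persistence, is lower in all 12 months, and has a lower mean in the summer months than in the winter months, which agrees with the statements in the text.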

Author Contributions: J.H. conceived and designed the overall research; S.N. implemented the probabilistic forecasting model and conducted the experimental simulation; J.H. and S.N. wrote the paper; and J.H. guided the research direction and supervised the entire research process.

Acknowledgments: This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (2015-0-00445) supervised by the IITP (Institute for Information.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. International Renewable Energy Agency. Renewable Capacity Highlights 2018. Available online: https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2018/Mar/RE_capacity_highlights_2018.pdf (accessed on 25 September 2018).
2. PJM. PJM’s Evolving Resource Mix and System Reliability. 2017. Available online: https://www.pjm.com/~/media/library/reports-notices/special-reports/20170330-appendix-to-pjms-evolving-resource-mix-and-system-reliability.ashx (accessed on 20 September 2018).
3. Chow, C.W.; Urquhart, B.; Lave, M.; Dominguez, A.; Kleissl, J.; Shields, J.; Washom, B. Intra-hour forecasting with a total sky imager at the UC San Diego solar energy tested. Sol. Energy 2011, 85, 2881–2893.
4. Yang, H.; Kurtz, B.; Nguyen, D.; Urquhart, B.; Chow, C.W.; Ghonima, M.; Kleissl, J. Solar irradiance forecasting using a ground-based sky imager developed at UC San Diego. Sol. Energy 2014, 103, 502–524.
5. Perez, R.; Kivalov, S.; Schlemmer, J.; Hemker, K.; Hoff, T. Parameterization of site-specific short-term irradiance variability. Sol. Energy 2011, 85, 1343–1353.
6. Inman, R.H.; Hugo, T.C.P.; Carlos, F.M.C. Solar forecasting methods for renewable energy integration. Prog. Energy Combust. Sci. 2013, 535–576.
7. NYISO. Solar Impact on Grid Operations—An Initial Assessment. 2016. Available online: http://www.nyiso.com/public/webdocs/markets_operations/services/planning/Documents_and_Resources/Special_Studies/Special_Studies_Documents/Solar%20Integration%20Study%20Report%20Final%20063016.pdf (accessed on 20 September 2018).
8. Pelland, S.; Galanis, G.; Kallos, G. Solar and photovoltaic forecasting through post-processing of the global environmental multiscale numerical weather prediction model. Prog. Photovolt. Res. Appl. 2013, 21, 284–296.
9. Marquez, R.; Coimbra, C.F.M. Forecasting of global and direct solar irradiance using stochastic learning methods, ground experiments and the NWS database. Sol. Energy 2011, 85, 746–756.
10. Yang, D.; Jirutitijaroen, P.; Walsh, W.M. Hourly solar irradiance time series forecasting using cloud cover index. Sol. Energy 2012, 86, 3531–3543.

11. NREL. National Solar Radiation Data Base. Available online: https://rredc.nrel.gov/solar/old_data/nsrdb/ (accessed on 25 September 2018).
12. Mandal, P.; Madhira, S.T.S.; Haque, A.U.; Meng, J.; Pineda, R.L. Forecasting power output of solar photovoltaic system using wavelet transform and artificial intelligence techniques. Procedia Comput. Sci. 2012, 12, 332–337.
13. Pedro, H.T.C.; Coimbra, C.F.M. Assessment of forecasting techniques for solar power production with no exogenous inputs. Sol. Energy 2012, 86, 2017–2028.
14. Marquez, R.; Pedro, H.T.C.; Coimbra, C.F.M. Hybrid solar forecasting method uses satellite imaging and ground telemetry as inputs to ANNs. Sol. Energy 2013, 92, 176–188.
15. Le, D.D.; Berizzi, A.; Bovo, C.; Ciapessoni, E.; Cirio, D.; Pitto, A.; Gross, G. A Probabilistic Approach to Power System Security Assessment under Uncertainty. In Proceedings of the 2013 IREP Symposium-Bulk Power Dynamics and Control, Rethymno, Greece, 25–30 August 2013.
16. Murata, A.; Ohtake, H.; Oozeki, T. Modeling of uncertainty of solar Irradiance forecasts on numerical weather predictions with the estimation of multiple confidence intervals. Renew. Energy 2018, 117, 193–201.
17. Ramakrishna, R.; Scaglione, A.; Vittal, V. A Stochastic Model for Short-Term Probabilistic Forecast of Solar Photo-Voltaic Power. In Proceedings of the Asilomar Conference on Signals, Systems and Computers 2016, Pacific Grove, CA, USA, 6–9 November 2016.
18. Abuella, M.; Chowdhury, B. Hourly probabilistic forecasting of solar power. In Proceedings of the 2017 North American Power Symposium, Morgantown, WV, USA, 17–19 September 2017.
19. Lauret, P.; David, M.; Hugo, T.C.P. Probabilistic Solar Forecasting using Quantile Regression Models. Energies 2017, 10, 1591.
20. Massidda, L.; Marrocu, M. Quantile Regression Post-Processing of Weather Forecast for Short-Term Solar Power Probabilistic Forecasting. Energies 2018, 11, 1763.
21. Mohammed, A.A.; Aung, Z. Ensemble Learning Approach for Probabilistic Forecasting of Solar Power Generation. Energies 2016, 9, 1017.
22. Zhang, Y.; Wang, J. GEFCom2014 Probabilistic Solar Power Forecasting based on k-Nearest Neighbor and Kernel Density Estimator. In Proceedings of the IEEE International Conference on Power & Energy Society General Meeting, Denver, CO, USA, 26–30 July 2015.
23. Hong, T.; Pinson, P.; Zareipour, H.; Troccoli, A.; Rob, J.H. Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond. Int. J. Forecast. 2016, 32, 896–913.
24. Sharma, N.; Sharma, P.; Irwin, D.; Shenoy, P. Predicting Solar Generation from Weather Forecasts Using Machine Learning. Smart Grid Communications. In Proceedings of the 2011 IEEE International Conference, Brussels, Belgium, 17–20 October 2011.
25. Zafarani, R.; Eftekharnejad, S.; Patel, U. Assessing the Utility of Weather Data for Photovoltaic Power Prediction 2018. Available online: https://arxiv.org/pdf/1802.03913.pdf (accessed on 23 September 2018).
26. Nomiyama, F.; Asai, J.; Murakami, T.; Murata, J. A study on global solar radiation forecasting using weather forecast data. Circuits and Systems. In Proceedings of the 2011 IEEE 54th International Midwest Symposium, Seoul, Korea, 7–10 August 2011.
27. Stein, M.L. Interpolation of Spatial Data: Some Theory for Kriging; Springer: New York, NY, USA, 1999.
28. Cressie, N. Spatial Prediction and Ordinary Kriging. Math. Geol. 1988, 20, 405–421.
29. Yamamoto, J.K. An Alternative Measure of the Reliability of Ordinary Kriging Estimates. Math. Geol. 2000, 32, 489–509.
30. Bracale, A.; Caramia, P.; Carpinelli, G.; Fazio, A.R.D.; Ferruzzi, G. A Bayesian Method for Short-Term Probabilistic Forecasting of Photovoltaic Generation in Smart Grid Operation and Control. Energies 2013, 6, 733–747.
31. Visscher, R.D.; Delouille, V.; Dupont, P.; Deledalle, C.A. Supervised classification of solar features using prior information. J. Space Weather Space Clim. 2015, 5, A34.
32. Quek, Y.T.; Woo, W.L.; Logenthiran, T. A naïve Bayes Classification Approach for Short-Term Forecast of Photovoltaic System. In Proceedings of the and Environmental Sciences, Singapore, 6–7 May 2017.

33. Davig, T.; Hall, A.S. Recession Forecasting Using Bayesian Classification. Available online: https://www.kansascityfed.org/publications/research/rwp/articles/2016/recession-forecasting-bayesian-classification (accessed on 25 September 2018).
34. Raschka, S. Naïve Bayes and Text Classification 1–Introduction and Theory. 2014. Available online: https://arxiv.org/abs/1410.5329 (accessed on 10 September 2018).
35. Kim, S.B.; Han, K.S.; Rim, H.C.; Myaeng, S.H. Some Effective Techniques for Naïve Bayes Text Classification. IEEE Trans. Knowl. Data Eng. 2006, 18, 1457–1466.
36. Gayathri, A.; Revathi, M.; Velmurugan, J. A survey on Weather forecasting by Data Mining. IJARCCE 2016, 5, 298–300.
37. Widiss, R.; Porter, K. A Review of Variable Generation Forecasting in the West. Available online: https://www.nrel.gov/docs/fy14osti/61035.pdf (accessed on 20 September 2018).
38. Dolara, A.; Grimaccia, F.; Leva, S.; Mussetta, M. Comparison of Training Approaches for Photovoltaic Forecasts by Means of Machine Learning. Appl. Sci. 2018, 8, 228.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).