Downloaded from orbit.dtu.dk on: Oct 05, 2021

Online short-term solar power forecasting

Bacher, Peder; Madsen, Henrik; Nielsen, Henrik Aalborg

Published in:

Link to article, DOI: 10.1016/j.solener.2009.05.016

Publication date: 2009

Document Version Peer reviewed version

Link back to DTU Orbit

Citation (APA): Bacher, P., Madsen, H., & Nielsen, H. A. (2009). Online short-term solar power forecasting. Solar Energy, 83(10), 1772-1783. https://doi.org/10.1016/j.solener.2009.05.016

General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

 Users may download and print one copy of any publication from the public portal for the purpose of private study or research.  You may not further distribute the material or use it for any profit-making activity or commercial gain  You may freely distribute the URL identifying the publication in the public portal

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Online Short-term Solar Power Forecasting

Peder Bacher

Henrik Madsen

Informatics and Mathematical Modelling, Richard Pedersens Plads, Technical University of Denmark Building 321, DK-2800 Lyngby, Denmark

Henrik Aalborg Nielsen ENFOR A/S, Lyngsø All´e3, DK-2970 Hørsholm, Denmark

Abstract This paper describes a new approach to online forecasting of power production from PV systems. The method is suited to online forecasting in many applications and in this paper it is used to predict hourly values of solar power for horizons of up to 36 hours. The data used is fifteen-minute observations of solar power from 21 PV systems located on rooftops in a small village in Denmark. The suggested method is a two-stage method where first a statistical normalization of the solar power is obtained using a clear sky model. The clear sky model is found using statistical smoothing techniques. Then forecasts of the normalized solar power are calculated using adaptive linear time series models. Both autoregressive (AR) and AR with exogenous input (ARX) models are evaluated, where the latter takes numerical weather predictions (NWPs) as input. The results indicate that for forecasts up to two hours ahead the most important input is the available observations of solar power, while for longer horizons NWPs are the most important input. A root mean square error improvement of around 35 % is achieved by the ARX model compared to a proposed reference model. Key words: Solar power, prediction, forecasting, time series, photovoltaic, numerical weather predictions, clear sky model, quantile regression, recursive least squares

1. Introduction tems on rooftops. The peak power of the installed PV systems is in the range of 1 to Efforts to increase the capacity of solar 4 kWp, which means that the larger sys- power production in Denmark are concen- tems will approximately cover the electric- trating on installing grid connected PV sys- ity consumption (except heating) of a typi- cal family household in Denmark. The PV Email addresses: [email protected] (Peder Bacher) URL: henrikmadsen.org (Henrik Madsen), systems are connected to the main electric- www.enfor.eu (Henrik Aalborg Nielsen) ity grid and thus the output from other p Solar power W

pcs Clear sky solar power W τ Normalized solar power - t Time index - k Forecast horizon index - i, j Miscellaneous indexes -

pt Observation of average solar power W

pˆt+k|t k-step prediction of solar power W cs pˆt Estimated clear sky solar power W 2 gˆi,k i’th update of NWP of global irradiance W/m 00 2 gˆk,t NWP of global irradiance updated at 00:00 W/m 12 2 gˆk,t NWP of global irradiance updated at 12:00 W/m 00 00 pk,t Observation of solar power corresponding tog ˆk,t W 12 12 pk,t Observation of solar power corresponding tog ˆk,t W

τt Normalized solar power -

τˆt+k|t k-step prediction of normalized solar power - nwp τˆt NWPs transformed into normalized solar power -

xt Day of year -

yt Time of day -

et+k k-step prediction error - q Quantile level - h Bandwidth of smoothing kernel - λ Forgetting factor - power production units has to be adjusted Denmark is balanced by the energy mar- in order to balance the total power pro- ket Nord Pool, where electricity power is duction. The cost of these adjustments in- traded on two markets: the main market creases as the horizon of the adjustments Elspot and a regulation market Elbas. On decreases and thus improved forecasting of Nord Pool the producers release their bids solar power will result in an optimized total at 12:00 for production each hour the fol- power production, and in future power pro- lowing day, thus the relevant solar power duction systems where is im- forecasts are updated before 12:00 and con- plemented, power forecasting is an impor- sist of hourly values at horizons of 12 to tant factor in optimizing utilization of stor- 36 hours. The models in this paper focus age facilities (Koeppel and Korpas, 2006). on such forecasts, but with the 1-to-11-hour horizons also included. The total electricity power production in 2 Interest in forecasting solar power has in- hourly values of global irradiance. Differ- creased and several recent studies deal with ent types of meteorological observations are the problem. Many of these consider fore- used as input to the models; among others casts of the global irradiance which is essen- the daily mean global irradiance and daily tially the same problem as forecasting solar mean cloud cover of the day to be fore- power. Two approaches are dominant: casted. This paper describes a new two-stage • a two-stage approach in which the solar method where first the clear sky model ap- power (or global irradiance) is normal- proach is used to normalize the solar power ized with a clear sky model in order to and then adaptive linear time series mod- form a more stationary time series and els are applied for prediction. Such models such that the classical linear time series are linear functions between values with a methods for forecasting can be used. constant time difference, where the model • another approach in which neural net- coefficients are estimated by minimizing a works (NNs) with different types of in- weighted residual sum of squares. The co- put are used to predict the solar power efficients are updated regularly, and newer (or global irradiance) directly. values are weighted higher than old values, hence the models adapt over time to chang- In a study Chowdhury and Rahman (1987) ing conditions. make sub-hourly forecasts by normalizing Normalization of the solar power is ob- with a clear sky model. The solar power is tained by using a clear sky model which divided into a clear sky component, which gives an estimate of the solar power in clear is modelled with a physical parametriza- (non-overcast) sky at any given point in tion of the atmosphere, and a stochastic time. The clear sky model is based on sta- cloud cover component which is predicted tistical smoothing techniques and quantile using ARIMA models. Sfetsos and Coonick regression, and the observed solar power is (2000) use NNs to make one-step predic- the only input. The adaptive linear predic- tions of hourly values of global irradiance tion is obtained using recursive least squares and compare these with linear time series (RLS) with forgetting. It is found that the models that work by predicting clearness adaptivity is necessary, since the character- indexes. Heinemann et al. (2006) use satel- istics of a PV-system are subject to changes lite images for horizons below 6 hours, and due to snow cover, leaves on trees, dirt on in (Lorenz et al., 2007) numerical weather the panel, etc., and this has to be taken into predictions (NWPs) for longer horizons, as account by an online forecasting system. input to NNs to predict global irradiance. The data used in the modelling is de- This is transformed into solar power by a scribed in Section 2. The clear sky model simulation model of the PV system. Ho- used for normalizing the solar power is de- caoglu et al. (2008) investigate feed-forward fined in Section 3 followed by Section 4 NNs for one-step predictions of hourly val- where the adaptive time series models used ues of global irradiance and compare these for prediction are identified. In Section 5 an with seasonal AR models applied on solar approach to modelling of the uncertainty in power directly. Cao and Lin (2008) use NNs the forecasts is outlined. The evaluation of combined with wavelets to predict next day the models and a discussion of the results 3 are found in Section 6 and finally the con- which then covers the forecast horizons up clusions of the study are drawn in Section to 36 hours ahead, and is given in (W/m2). 7. Time series are resampled to lower sample frequencies by mean values and when the 2. Data resampled values are used this is noted in the text. In order to synchronize data with The data used in this study is observa- different sample frequencies, the time point tions of solar power from 21 PV systems for a given mean value is assigned to the located in a small village in Jutland, Den- middle of the period that it covers, e.g. the mark. The data covers the entire year 2006. time point of an hourly value of solar power Forecasts of global irradiance are provided from 10:00 to 11:00 is assigned to 10:30. by the Danish Meteorological Institute us- As an example of the NWPs of global ing the HIRLAM mesoscale NWP model. irradiance Figure 2 shows values at time The PV array in each the 21 PV sys- of day 10:30 of {p } resampled to three tems is composed of “BP 595” PV modules t hour interval values plotted versus the cor- and the inverters are of the type “BP GCI responding {gˆ } values with a 24 hour hori- 1200”. The installed peak power of the PV i,k zon. Clearly the plot indicates a significant arrays is between 1020 Watt peak and 4080 correlation. Hence it is seen that there is Watt peak, and the average is 2769 Watt information in the NWPs, which can be uti- peak. Let p denote the average value of i,t lized to forecast the solar power. solar power (W) over 15 minutes observed for the i’th PV system at time t. These ob- servations are used to form the time series 3. Clear sky model A clear sky model is usually a model which estimates the global irradiance in {pt; t = 1,...,N} (1) clear (non-overcast) sky at any given time. where Chowdhury and Rahman (1987) divide the 21 1 X global irradiance into a clear sky component p = p . (2) t 21 i,t and a cloud cover component by i=1 This time series is used throughout the G = Gcs · τc (4) modelling. The time series covers the pe- where G is the global irradiance (W/m2), riod from 01 January 2006 to 31 December and Gcs is the clear sky global irradiance 2006. The observations are fifteen-minute 2 (W/m ). Finally τc is the transmissivity of values, ie. N = 35040. Plots of {pt} are the clouds which they model as a stochastic shown in Figure 1 for the entire period and process using ARIMA models. The clear for two shorter periods. sky global irradiance is found by The NWPs of global irradiance are given in forecasts of average values for every third Gcs = I0 · τa (5) hour, and the forecasts are updated at 00:00 where I is the extraterrestrial irradiance and 12:00 each day. The i’th update of the 0 (W/m2). τ is the total sky transmissiv- forecasts is the time series a ity in clear sky which is modelled by atmo- {gˆi,k, k = 1,..., 12} (3) spheric dependent parametrization. 4 Solar power (W) 0 2000

Jan Mar May Jul Sep Nov Jan Solar power (W) 0 2000

Apr 22 Apr 27 May 02 Solar power (W) 0 2000

Jul 26 Jul 31 Aug 05

Figure 1: The observations of average solar power used in the study. Upper plot: The solar power over the entire year 2006. Lower plots: The solar power in two selected periods.

In this study the same approach is used, but instead of applying the factor on global irradiance it is applied on solar power, i.e.

p = pcs · τ (6)

where p is the solar power (W) and pcs is the clear sky solar power (W). The fac- tors τ and τc are much alike, but since the clear sky model developed in the present study estimates pcs by statistical smooth- Solar power (W) ing techniques rather than using physics, the method is mainly viewed as a statistical 0 1000 2000 normalization technique and τ is referred to 0 200 400 600 800 NWPs of global irradiance (W M2) as normalized solar power. The motivation behind the proposed nor- Figure 2: All three hour interval values of solar malization of the solar power with a clear power at time of day 10:30 versus the correspond- ing NWPs of global irradiance with 24 hour horizon. sky model is that the normalized solar Hence the plot shows observations and predictions power (the ratio of solar power to clear of values covering identical time intervals. sky solar power) is more stationary than the solar power, so that classical time se- ries models assuming stationarity (Madsen, 2007) can be used for predicting the nor- malized values. The non-stationarity is il- lustrated by Figure 3 where modified box- 5 Solar power (W) 0 500 1000 1500 2000

0 3 6 9 12 15 18 21 24 Figure 5: The solar power as a function of the day Time of day (UTC) of year, and the time of day. Note that only positive values of solar power are plotted. Figure 3: Modified boxplots of the distribution of the solar power as a function of time of day. The boxplots are calculated with all the fifteen-minutes values of solar power, i.e. covering all of 2006. At each time of the day the box represents the center half of the distribution, from the first to the third quantile. The lower and upper limiting values of the distribution are marked with the ends of the vertical dotted lines, and dots beyond these indicate outliers.

Figure 6: The estimated clear sky solar power shown as a surface. The solar power is shown as points.

plots indicate the distribution of solar power pt as a function of time of day. Clearly a change in the distributions over the day is seen and this non-stationarity must be con- Normalized solar power sidered. Figure 4 shows the same type of

0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 plot for the normalized solar power and it 0 3 6 9 12 15 18 21 24 Time of day (UTC) is seen that the distributions over the day are closer to being identical. Thus the effect Figure 4: Modified boxplots of the distribution of of the changes over the day is much lower the normalized solar power as a function of time for the normalized solar power than for the of day. The boxplots are calculated with all fifteen- solar power. minutes values available, i.e. covering all of 2006. The clear sky model is defined as

pcs = fmax(x, y) (7) 6 a Gaussian two dimensional smoothing ker-

) ) nel, defined in Appendix A. The smoothing x y h h , , i i

x y kernel is used to form the weights applied in , , t t x y ( ( the quantile regression. As seen in Figure w w 7, which shows the smoothing kernel used, 0.0 0.1 0.2 0.3 0.4 0.0 0.1 0.2 0.3 0.4 the weights in the day of year dimension −100 −50 0 50 100 −45 −15 0 15 45 − − xt xi (days) yt yi (minutes) w(xt, xi, hx), are decreasing as the absolute time differences are increasing. Similarly for Figure 7: The one dimensional smoothing kernels the weights in the time of day dimension used. Left plot is the kernel in the day of year (x) dimension. Right plot is the kernel in the time of w(yt, yi, hy). The applied weights are finally day (y) dimension. They are multiplied to form the found by multiplying the weigths from the applied two dimensional smoothing kernel. two dimensions. The choice of the quantile level q to be estimated and the bandwidth in each dimension, h and h , is based on a where pcs is the clear sky solar power (W), x y x is the day of year and y is the time of visual inspection of the results. A level of q = 0.85 was used since this gives τ ≈ 1 for day. The function fmax(·, ·) is assumed to t days with clear sky all day, as seen in Fig- be a smooth function and thus fmax(·, ·) can be estimated as a local maximum (Koenker, ure 8. The plot for days with varying cloud 2005). Figure 5 shows the solar power cover in Figure 9 show that estimates where plotted as a function of x and y, and the τt > 1 occur. These are ascribed to reflec- ˆ tions from clouds and varying level of wa- estimated clear sky solar power fmax(·, ·) is shown as a surface in Figure 6. Due ter vapour in the atmosphere. Future work to outliers the weighted quantile regression should elaborate on the inclusion of such ef- method outlined in Appendix A is used to fects in the clear sky model. ˆ For smallp ˆcs values the error of τ is nat- find the local maximum. The fmax(·, ·) is t t then used to form the output of the clear urally increasing and at nighttime the error cs sky model as the time series is infinite. Therefore all values ofp ˆt where

cs {pˆt , t = 1,...,N}, (8) pˆcs t < 0.2 (11) cs max({pˆcs}) wherep ˆt is the estimated clear sky solar t power (W) at time t, and N = 35040. The are removed from {τt}. The function normalized solar power is now defined as cs max({pˆt }) gives the maximum value in cs pt {pˆt }. τt = cs (9) pˆt The estimates of clear sky solar power are best in the summer period. The bad esti- and this is used to form time series of nor- mates in winter periods are caused by the malized solar power sparse number of clear sky observations. It should also be possible to improve the nor- {τt, t = 1,..., 35040}. (10) malization toward dusk and dawn, and thus For each (xt, yt) corresponding to the so- lower the limit where values in {pˆcs} are lar power observation pt, weighted quan- removed, either by refining the modelling tile regression estimates the q quantile by method or by including more explanatory 7 ee omda ons ..a 20.Teuprpo hw h oa oe n h siae la sky clear estimated the and p power solar the shows plot upper The 12:00. at power i.e. solar points, midday to refer 9: Figure solar sky clear estimated the and p power solar the shows plot upper The 12:00. at power i.e. points, midday to 8: Figure Normalized values Power values (W) Normalized values Power values (W) 0 0.5 1 0 1000 2000 0 0.5 1 0 1000 2000 p ˆ cs h oe ltsostenraie oa power solar normalized the shows plot lower The . h euto h omlzto o eetdcersydy vrteya.Tetm-xstcsrefer ticks time-axis The year. the over days sky clear selected for normalization the of result The h euto h omlzto o aseel itiue vrteya.Tetm-xsticks time-axis The year. the over distributed evenly days for normalization the of result The p e 1My4Jn1 u e 4Ot1 Nov3 Oct17 Sep14 Jul2 Jun10 May4 Feb 11 a 0Mr1 a u 2Ag9Sp2 Nov13 Sep26 Aug9 Jun22 May6 Mar19 Jan 30 ˆ cs h oe ltsostenraie oa power solar normalized the shows plot lower The . 8 τ . τ . p p p p ^ ^ cs cs τ τ variables such as e.g. air mass. 4.1. Transformation of NWPs into predic- Finally it is noted that the determinis- tions of normalized solar power tic changes of solar power are really caused In order to use the NWPs of global irra- by the geometric relation between the earth diationg ˆi,k as input to the prediction mod- nwp and the sun, which can be represented in els, these are transformed intoτ ˆt which the current problem by the sun elevation as are meteorological based hourly predictions x and sun azimuth as y. The clear sky so- of τt. This is done by first transformingg ˆi,k lar power was also modelled in the space into solar power predictions and then trans- spanned by these two variables, by apply- forming these by the clear sky model. The ing the same statistical methods as for the time series {gˆi,k}, defined in (3), holds the space spanned by day of year and time of i’th NWP forecast of three hour interval val- day. The result was not satisfactory, i.e. ues, and was updated at the estimated clear sky solar power was less accurate, probably because neighboring val- timei = t0 + (i − 1) · 12h (12) ues in this space are not necessarily close in time and thus changes in the surroundings where t0 = 2006-01-01 00:00. Thus the time to the PV system blurred the estimates. series with sample period of one day

00 {gˆk,t, t = 1,..., 364} = (13)

4. Prediction models {gˆi,k, i = 1, 3,..., 727},

Adaptive linear time series models (Mad- consist of all the NWPs updated at time of sen, 2007) are applied to predict future val- day 00:00 at horizon k, i.e. the superscript ues of the normalized solar power τt. The “00” forms part of the name of the variable. inputs are: lagged observations of τt and Similarly the time series nwp transformed NWPsτ ˆt . Three types of 12 models are identified: {gˆk,t, t = 1,..., 364} = (14) {gˆi,k, i = 2, 4,..., 728}, • a model which has only lagged observa- consist of all the NWPs updated at time of tions of τt as input. This is an autore- gressive (AR) model and it is referred day 12:00. The corresponding time series to as the AR model. of solar power covering the identical time intervals are respectively nwp • a model with onlyτ ˆt as input. This 00 {pk,t, t = 1,..., 364} = (15) is referred to as the LMnwp model. {pt, t = k, (1 · 8 + k),..., • a model with both types of input. This (363 · 8 + k)} is an autoregressive with exogenous in- put (ARX) model and it is referred to and

as the ARX model. 12 {pk,t, t = 1,..., 364} = (16) The best model of each type is identified by {pt, t = k + 4, (1 · 8 + k + 4),..., using the autocorrelation function (ACF). (363 · 8 + k + 4)}, 9 where {pt} has been resampled to three hour interval values. The NWPs are mod- elled into solar power predictions by the adaptive linear model ACF 00 00 pˆk,t = βt + αt gˆk,t + et , (17) note that the hat above the variable indi-

cates that these values are predictions (esti- 0 0.5 1 mates) of the solar power. A similar model 0 24 48 72 96 120 144 168 192 Lag is made for the NWP updates at time of 12 day 12:00 givingp ˆk,t. The interpretion of Figure 10: ACF of the time series of normalized solar power {τ }. the coefficients βt and αt is not further elab- t orated here, but it is noted that they are time dependent in order to account for the constant time difference, the ACF is calcu- effects of changing conditions over time, e.g. lated and plotted in Figure 10. Clearly an the changing geometric relation between the AR(1) component is indicated by the ex- earth and the sun, dirt on the solar panel. ponential decaying pattern of the first few This adaptivity is obtained by fitting the lags and a seasonal diurnal AR component model with k-step recursive least squares by the exponential decaying peaks at lag (RLS) with forgetting, which is described = 24, 48, ... . By considering only first-order in Appendix B. In order to use the RLS al- terms this leads to the 1-step AR model 00 gorithm, pk,t has to be lagged depending on k. Each RLS estimation is optimized by τt+1 = m + a1τt + a2τt−23 + et+1 . (19) choosing the value of the forgetting factor And a reasonable 2-step AR model is λ from 0.9, 0.905,..., 1 that minimizes the root mean square error (RMSE). τt+2 = m + a1τt + a2τt−22 + et+2 . (20) The last steps in the transformation of Note that here the 1-step lag cannot be the NWPs is to normalizep ˆ00 andp ˆ12 with k,t k,t used, since this is τ i.e. a future value, the clear sky model, and resample up to t+1 and thus the latest observed value is in- hourly values by linear interpolation. Fi- cluded instead. Formulated as a k-step AR nally the time series model nwp {τˆt , t = 1,..., 8760} (18) τt+k = m + a1τt + a2τt−s(k) + et+k of the NWPs of global irradiance trans- (21) formed into predictions of normalized solar s(k) = 24 + k mod 24 (22) power is formed, and this is used as input to the ARX prediction models as described where the function s(k) ensures that the lat- in the following. More details can be found est observation of the diurnal component is in (Bacher, 2008). included. This is needed, since for k = 25 the diurnal 24 hour AR component cannot 4.2. AR model identification be used and instead the 48 hour AR com- To investigate the time dependency in ponent is used. This model is referred to as {τt}, i.e. dependency between values with a the AR model. 10 k = 1 k = 2 k = 3 ACF

−0.251 0.5 24 1 48 72 1 24 48 72 1 24 48 72

k = 23 k = 24 k = 25 ACF −0.25 0.5 1 1 24Lag 48 72 1 24Lag 48 72 1 24Lag 48 72

Figure 11: ACF of the time series of errors {et+k} for selected horizons k of the AR model. The vertical bars indicate the lags included in each of the models, and the grayed points show the lags which cannot be included in the model.

Figure 11 shows the ACF of {et+k}, which is the time series of the errors in the model for horizon k, for six selected horizons after fitting the AR model with RLS, which is de- scribed in Appendix B. The vertical black lines indicate which lags are included in the k = 1 model. For k = 1 the correlation of the AR(1) component is removed very well and ACF the diurnal AR component has also been decreased considerably. There is high cor- 0 0.5 1 relation left at lag = 24, 48,.... This can 1 24 48 72 most likely be ascribed to systematic er- k = 24 rors caused by non-stationarity effects left in {τt}, and it indicates that the clear sky model normalization can be further opti- ACF mized. For k = 2 and 3 the grayed points 0 0.5 1 show the lags that cannot be included in the 1 24 48 72 model and the high correlation of these lags Lag indicate that information is not exploited. The AR model was extended with higher Figure 12: ACF of the time series of errors {et+k} order AR and diurnal AR terms without at horizon k = 1 and k = 24 of the LMnwp model. The grayed points show the lags which cannot be any further improvement in performance, included in the model. see (Bacher, 2008).

11 u ti eandsneti lrfista no that clarifies this since out, retained left is be it can fact but component in AR that show diurnal results the The only effect. has small whereas component a needed, AR diurnal clearly the is adding thus term and AR(1) horizons an short for ACF decaying The model. of tions identification model ARX 4.4. ac- well. the very correlation horizons diurnal removes compo- both input NWP for AR(1) tual seen an as but from nent, left For is horizons. correlation two for 12 ure of ACF the and RLS as to referred is LM 4.3. lags model. the the show in points included grayed be the cannot each which in and included models, lags the the of indicate bars vertical The horizons at 13: Figure ACF ACF

h oe sn ohlge observa- lagged both using model The input as NWPs only using model The 0 0.5 1 0 0.5 1 τ 44 72 48 72 24 48 1 24 1 t + k nwp τ = C ftetm eiso errors of series time the of ACF t k n Wsa nu sa ARX an is input as NWPs and m 1 = oe identification model LM + nwp and b LM 1 τ ˆ t nwp eelda exponential an revealed + k nwp Lag { k 24 = | e t t + + ti te using fitted is It . k } e fteAXmodel. ARX the of t + ssoni Fig- in shown is k k k =24 k =1 clearly 1 = { e (23) t + k } 12 ssoe ae.Temodel The this later. it, showed adding is by achieved is improvement anisdpn ntelvlo ˆ of uncer- level the the that on reveal wind depend horizon tainties to given applied a of is for Plots it Møller forecasts. where by power (2008) inspired al. used, quan- et Instead, is regression limits. tile finite case has of distribution errors this the the in since appropriate will be errors not the normal of assuming of distribution way outlined classical The is power approach here. solar simple a the and mod- of forecasts by uncertainties achieved the be elling can their This increases distribution usefulness. forecast) a point predicting (a to value single a predicting modelling Uncertainty 5. model optimal forecasting. an online make for adap- to the needed that is change indicates tivity this estimates and coefficient time over the the Clearly of setting. values current the in horizons the minimizes that value of value a the for estimates efficient estimates coefficient Adaptive 4.5. im- further any performance. without in provements terms di- and AR AR urnal order higher The with well. extended very was the horizons that short correlation seen the the is for It removes component 13. of AR(1) Figure ACF in the plotted and is RLS using fitted is the as to referred is τ xedn h oa oe oeat,from forecasts, power solar the Extending co- online the show 14 Figure in plots The t + k = m λ + 0 = a 1 τ . t 9 sue ic hsi the is this since used is 995 + ARX a 2 τ t − s RMSE ( oe.Temodel The model. k AR ) { + τ t b } oe,where model, 1 k τ ˆ versus t nwp etfrall for best τ + Figure . k | t + { ARX e e (24) t { t + + τ ˆ k t k } } , and k 15: Figure period. burn-in the marks beginning the in period grayed The shown. are horizons 14: Figure Normalized solar power Estimated coefficients Estimated coefficients 24 = 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 0.0 0.2 0.4 0.6 0.8 0 . . . . . 1.0 0.8 0.6 0.4 0.2 0.0 . 95 a 4Mr2 a u 9Ot31Dec 19Oct 31Dec 7Aug 19Oct 26May 7Aug 14Mar 26May 1 Jan 14Mar 1 Jan h rdcin r aewt h R oe.Telnsaeetmtso the of estimates are lines The model. ARX the with made are predictions The . k =1 k=24 k=2 unie of quantiles h nieetmtso h offiinsi h Rmdla ucino ie w selected Two time. of function a as model AR the in coefficients the of estimates online The omlzdslrpwrvru h rdce omlzdslrpwra horizons at power solar normalized predicted the versus power solar Normalized Predicted normalizedsolarpower f τ (ˆ τ ) . 13 0.0 0.2 0.4 0.6 0.8 1.0 1.2 . . . . . 1.0 0.8 0.6 0.4 0.2 0.0 k =24 Predicted normalizedsolarpower 0 . 05 , 0 . 25 k , 0 1 = . 50 b a m , 1 1 0 and . 75 15 shows such plots for horizons k = 1 and The RMSE k is used as the main evaluation k = 24. The lines in the plot are estimates criterion (EC) for the performance of the of the 0.05, 0.25, 0.50, 0.75 and 0.95 quan- models. The Normalized Root Mean Square tiles of the probability distribution function Error is found by of τ as a function ofτ ˆ. The weighted quan- tile regression with a one dimensional kernel RMSE k NRMSE k = (27) smoother, described in Appendix A, is used. pnorm Figure 15 illustrates that the uncertain- ties are lower forτ ˆ close to 0 and 1, than where either for the mid-range values around 0.5. Thus N 1 X forecasts of values toward overcast or clear p =p ¯ = p . (28) norm N t sky have less uncertainty than forecasts of t=1 a partly overcast sky, which agrees with re- sults by Lorenz et al. (2007). Further work or pnorm is the average peak power of the 21 should extend the uncertainty model to in- PV systems. clude NWPs as input. The mean value of the RMSE k for a range of horizons

6. Evaluation k 1 Xe The methods used for evaluating the pre- RMSE ks,ke = RMSE k (29) ke − ks + 1 diction models are inspired by Madsen et al. k=ks (2005) where a framework for evaluation of is used as a summary error measure. When forecasting is suggested. The comparing the performance of two models RLS fitting of the prediction models does the improvement not use any degrees of freedom and the dataset is therefore not divided into a train- EC ref − EC ing set and a test set. It is, however, noted IEC = 100 · (%) (30) EC ref that the clear sky model and the optimiza- tion of λ does use the entire dataset, and is used, where EC is the considered evalua- thus the results can be a little optimistic. tion criterion. The values in the burn-in period are not used in calculating the error measures. In 6.2. Reference model Figure 14 the burn-in periods for the AR model are shown. To compare the performance of prediction models, and especially when making com- 6.1. Error measures parisons between different studies, a com- The k-step prediction error is mon reference model is essential. A refer- ence model for solar power is here proposed e = p − pˆ (25) t+k t+k t+k|t as the best performing naive predictor for The Root Mean Square Error for the k’th the given horizon. Three naive predictors of horizon is solar power are found to be relevant. Per- 1 N ! 2 sistence 1 X 2 RMSE k = et+k . (26) N pt+k = pt + et+k, (31) t=1 14 Diurnal persistence forecasted solar power generally follows the Persistence main level of the solar power, but the fluc- Diurnal mean tuations caused by sudden changes in cloud k cover are not fully described by the model.

RMSE The NRMSE k is plotted for each model in Figure 19. Clearly the performance is in- creasing from the Reference model to the

0 200 400 600 800 1000 AR model and further to the ARX model. 0 5 10 15 20 25 30 35 The differences from using either the solar Horizon k power or the NWPs, or both, as input be-

Figure 16: RMSE k for the three naive predictors come apparent from these results. used in the Reference model. At k = 1 the AR model that only uses so- lar power as input is better than the LMnwp diurnal persistence which only uses NWPs as input, but at k = 2,..., 6 the LMnwp is better, though only pt+k = pt−s(k) + et+k (32) slightly. This indicates that for making fore- casts of horizons shorter than 2 hours, solar s(k) = fspd + k mod fspd (33) power is the most important input, whereas where s(k) ensures that the latest diurnal for 2 to 6 hours horizons, forecasting sys- observation is used and fspd is the sample tems using either solar power or NWPs can frequency in number of samples per day, perform almost equally. The ARX model and diurnal mean using both types of input does have an in-

n creased performance at all k = 1,..., 6 and 1 X p = p + e (34) thus combining the two types of input is t+k n t−s(k,i) t+k i=1 found to be the superior approach. For k = 19,..., 29, which are the next s(k, i) = i · fspd + k mod fspd (35) day horizons, very clearly the LMnwp model which is the mean of solar power of the last and the ARX model perform better than n observations at the time of day of pt+k. the AR model. Since the LMnwp model and The value of n is chosen such that all past the ARX model perform almost equally, samples are included. it is seen that no improvement is achieved Figure 16 shows the RMSE k for each of from adding the solar power as input, and the three naive predictors. It is seen that for thus using only the NWPs as input is found k ≤ 2 the persistence predictor is the best to be adequate for next day horizons. while the best for k > 2 is the diurnal per- A summary of the improvement in per- sistence predictor. This model is referred to formance is calculated using (29) and (30). as the Reference model. The improvements compared to the Ref- erence model are calculated for the four 6.3. Results models by I for short horizons and RMSE 1,6 Examples of solar power forecasts made I for next day horizons. The re- RMSE 19,29 with the ARX model are shown in Figure sults are shown in Table 1. These results 17 for short horizons and in Figure 18 for naturally show the same as stated above, next day horizons. It is found that the though the difference at k = 1 from AR to 15 oiosadtergttenx a oios h etsaeso RMSE show scale left The RMSE horizons. show day scale next right the the right the and horizons 19: Figure NRMSEk by mean solar power Solar power (W) Solar power (W) 0.0 0.5 1.0 1.5 2.0 2.5 0 1000 2000 0 1000 2000 iue18: Figure 6 5 4 3 2 1 218 12 17 6 iue17: Figure Apr 21 h NRMSE The Apr 21 oeat fslrpwra etdyhorizons day next at power solar of Forecasts oeat fslrpwra hr horizons short at power solar of Forecasts 218 12 May 21 17 6 k o aho h he oesadteRfrnemdl h etpo hwteshort the show plot left The model. Reference the and models three the of each for k k Jun 2 omlzdby normalized 218 12 Jun 20 17 6 218 12 2769W ARX LMnwp AR Ref Jul 14 Jul 20 hc stema ekpwro h 1P systems. PV 21 the of power peak mean the is which 16 17 6 218 12 02 42 28 26 24 22 20 Aug 20 k Aug 26 k 1 = 19 = , . . . , , . . . , 218 12 6 Sep 19 17 6 29 k aewt h R model. ARX the with made omlzdby normalized k aewt h R model. ARX the with made Oct 7 218 12 Oct 19 p p p p ^ ^ 17 6 p ¯ 4 W 248 = 218 12 Nov 19 q=0.05 q=0.95 q=0.05 q=0.95 Nov 19 /

h 0 0.04 0.09 0.13 0.18 0.22

and NRMSEk by mean peak power Models I I RMSE 1,6 RMSE 19,29 ies. The best performing prediction model AR over Reference 27% 17% is an ARX model where both solar power LMnwp over Reference 25% 36% observations and NWPs are used as input. ARX over Reference 35% 36% The results indicate that for horizons below LMnwp over AR -2% 23% 2 hours solar power is the most important ARX over AR 12% 23% input, but for next day horizons no consid- ARX over LMnwp 13% 1% erable improvement is achieved from using available values of solar power, so it is ade- Table 1: Summary error measures of improvements quate just to use NWPs as input. Thus, de- compared to the Reference model for short horizons pending on the application of the forecast- k = 1,..., 6 and next day horizons k = 19,..., 29 . ing system using only either of the inputs can be considered, and a lower limit of the LMnwp cannot be seen. These results show latency, at which solar power observations that a RMSE improvement of around 35 % are needed for the forecasting system, can over the Reference model can be achieved be different. Finally it is noted that a com- by using the ARX model. parison to other online solar power forecast- ing methods, e.g. (Lorenz et al., 2007) and 7. Conclusions (Hocaoglu et al., 2008), has not been car- ried out, but that such a study would be in- Inspired by previous studies, the present formative in order to describe strengths and method for solar power forecasting has been accuracy of the different proposed methods. developed from scratch. A new approach to clear sky modelling with statistical smooth- A. Weighted quantile regression ing techniques has been proposed, and an adaptive prediction model based on RLS The solar power time series {pt, t = makes a solid framework allowing for fur- 1,...,N} is the realization of a stochastic ther refinements and model extensions e.g. process {Pt, t = 1,...,N}. The estimated cs by including NWPs of temperature as in- clear sky solar power at time t isp ˆt and it is found as the q quantile of f , the probabil- put. The adaptivity of the method makes Pt it suited to online forecasting and ensures ity distribution function of Pt. The problem cs comprehension of changing conditions of the is reduced to estimatingp ˆt as a local con- PV system and its surroundings. Further- stant for each (xt, yt), where x is the day more the RLS algorithm is not computer in- of year and y the time of day. This is done tensive, which makes updating of forecasts by weighted quantile regression in which the fast. The clear sky model used to normal- loss function is  ize the solar power delivers a useful result, qi , i ≥ 0 ρ(q, i) = (36) but can be improved, especially for the es- (1 − q)i , i < 0 timates toward dawn and dusk, by using cs cs polynomial-based kernel regression. A pro- where i = pi −pˆt . The fitting ofp ˆt is then cedure based on quantile regression is sug- done by gested for calculating the varying intervals N X of the uncertainty of the solar power predic- pˆcs = arg min k(x , y , x , y ) · ρ(q,  ). t cs t t i i i tions and the results agree with other stud- pˆt 17 i=1 (37) Hence the model can be written as

T where Yt = Xt θ + et. (44) w(x , x , h ) · w(y , y , h ) k(x , y , x , y ) = t i x t i y The estimates of the parameters at t are t t i i PN i=1 w(xt, xi, hx) · w(yt, yi, hy) found such that (38) ˆ θt = arg min St(θ), (45) θ is the two dimensional multiplicative ker- nel function which weights the observations where the loss function is locally to (xt, yt), (Hastie and Tibshirani, t 1993). Details of the minimization are X t−s T 2 St(θ) = λ (Ys − Xs θ) . (46) found in (Koenker, 2005). In each dimen- s=1 sion a Gaussian kernel is used   This provides weighted least squares with |xt − xi| exponential forgetting. The solution at time w(xt, xi, hx) = fstd (39) hx t leads to where fstd is the standard normal probabil- θˆ = R−1h , (47) ity density function. A similar kernel func- t t t tion is used in the y dimension, and the final see (Madsen, 2007), where two dimensional kernel is found by multiply- ing the two kernels as shown in (37). Pt t−s T Pt t−s Rt = s=1 λ XsXs , ht = s=1 λ XsYs. (48) B. Recursive least squares The k-step RLS-algorithm with exponential Fitting of the prediction models is done forgetting is then using k-step recursive least squares (RLS) with forgetting, which is described in the R = λR + X XT (49) following using the ARX model t t−1 t−k t−k ˆ ˆ −1 T ˆ θt = θt−1 + Rt Xt−k(Yt − Xt−kθt−1) nwp (50) τt+k = m+a1τt +a2τt−s(k) +b1τˆt+k|t +et+k, (40) and the k-step prediction at t is as an example. The regressor at time t is ˆ T ˆ Yt+k = Xt θt. (51) T nwp Xt = (1, τt, τt−s(k), τˆt+k|t), (41) References the parameter vector is Bacher, P., 2008. Short-term solar power forecast- T ing. Master’s thesis, Technical University of Den- θ = (m, a1, a2, b1), (42) mark, IMM-M.Sc.-2008-13. Cao, J., Lin, X., 2008. Study of hourly and daily so- and the dependent variable lar irradiation forecast using diagonal recurrent wavelet neural networks. Energy Conversion and Yt = τt. (43) Management 49 (6), 1396–1406. 18 Chowdhury, B., Rahman, S., 1987. Forecasting sub- hourly solar irradiance for prediction of photo- voltaic output. In: IEEE Photovoltaic Special- ists Conference, 19th, New Orleans, LA, May 4-8, 1987, Proceedings (A88-34226 13-44). New York, Institute of Electrical and Electronics En- gineers, Inc., 1987, p. 171-176. pp. 171–176. Hastie, T., Tibshirani, R., 1993. Varying-coefficient models. Journal of the Royal Statistical Society. Series B (Methodological) 55 (4), 757–796. Heinemann, D., Lorenz, E., Girodo, M., 2006. Fore- casting of solar radiation. In: Dunlop, E., Wald, L., Suri, M. (Eds.), Solar Resource Manage- ment for Electricity Generation from Local Level to Global Scale. Nova Science Publishers, New York, pp. 83–94. Hocaoglu, F. O., Gerek, O.¨ N., Kurban, M., 2008. Hourly solar radiation forecasting using optimal coefficient 2-d linear filters and feed-forward neu- ral networks. Solar Energy 82 (8), 714–726. Koenker, R., 2005. Quantile Regression. Cambridge University Press. Koeppel, G., Korpas, M., 2006. Using storage de- vices for compensating uncertainties caused by non-dispatchable generators. 2006 International Conference on Probabilistic Methods Applied to Power Systems, 1–8. Lorenz, E., Heinemann, D., Wickramarathne, H., Beyer, H., Bofinger, S., 2007. Forecast of ensem- ble power production by grid-connected pv sys- tems. In: Proc. 20th European PV Conference, September 3-7, 2007, Milano. Madsen, H., 2007. Time Series Analysis. Chapman & Hall. Madsen, H., Pinson, P., Kariniotakis, G., Nielsen, H. A., Nielsen, T. S., 2005. Standardizing the performance evaluation of shortterm wind power prediction models. Wind Engineering 29 (6), 475. Møller, J. J. K., Nielsen, H. A., Madsen, H., 2008. Time-adaptive quantile regression. Com- putational and Data Analysis 52 (3), 1292–1303. Sfetsos, A., Coonick, A., 2000. Univariate and mul- tivariate forecasting of hourly solar radiation with artificial intelligence techniques. Solar En- ergy 68 (2), 169–178.

19