

Article

Application of Long-Short-Term-Memory Recurrent Neural Networks to Forecast Wind Speed

Meftah Elsaraiti * and Adel Merabet

Division of Engineering, Saint Mary’s University, Halifax, NS B3H 3C3, Canada; [email protected] * Correspondence: [email protected]; Tel.: +1-902-420-5101

Abstract: Forecasting wind speed is one of the most important and challenging problems in wind power prediction for electricity generation. Long short-term memory was introduced as a remedy for short-term memory, addressing the vanishing and exploding gradient problems experienced by the recurrent neural network (RNN) during training on time series. In this study, this problem is addressed by proposing a prediction model based on long short-term memory and a deep neural network developed to forecast wind speed values multiple time steps into the future. The weather database in Halifax, Canada was used as a source for two series of hourly wind speeds. Two months, spring (March 2015) and summer (July 2015), were used for training and testing the forecasting model. The results showed that the proposed model can effectively improve the accuracy of wind speed prediction.

Keywords: forecasting; long-short-term memory; multiple time series; RNN; wind speed

Citation: Elsaraiti, M.; Merabet, A. Application of Long-Short-Term-Memory Recurrent Neural Networks to Forecast Wind Speed. Appl. Sci. 2021, 11, 2387. https://doi.org/10.3390/app11052387

Academic Editor: Sergio Montelpare

Received: 16 December 2020; Accepted: 5 March 2021; Published: 8 March 2021

1. Introduction

Forecasting wind speed is a very difficult challenge compared to other variables of the atmosphere because of its chaotic and intermittent nature, which causes difficulty in integrating wind power into the grid. As wind is one of the most developed and cheapest green energy sources, its accurate short-term forecasting has become an important matter with a decisive impact on the electricity grid. Both dynamic and statistical methods, as well as hybrid methods that couple the two, have been applied to forecast short-term wind speed. Running high-resolution numerical weather prediction (NWP) models requires an understanding of many of the basic principles that support them, including data assimilation, knowledge of how to estimate the NWP model in space and time, and how to perform validation and verification of forecasts. This can be costly in terms of computational time. Reliable methods and techniques for forecasting wind speed are becoming increasingly important for characterizing and predicting wind

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

resources [1]. The main purpose of any forecast is to build, identify, tune and validate time series models. Time series forecasting is one of the most important applied problems of artificial intelligence, since the improvement of forecasting methods makes it possible to predict more accurately the behavior of various factors in different areas. Traditionally, such models are based on the methods of statistical analysis and mathematical modeling developed in the 1960s and 1970s [2]. The ARIMA model has been used to forecast wind speed, with common error measures applied to assess model prediction accuracy [3]. Recently, deep learning has gained significant popularity in the machine learning community, as it is considered a general framework that facilitates the training of deep neural networks with many hidden layers [4]. The availability of large datasets, combined with the exponential growth in computing power, has led to an unparalleled surge of interest in machine learning. These methods use only historical data to learn the random dependencies between the past and the future. Among these methods are recurrent neural networks (RNNs), which are designed to learn a sequence of data by traversing a hidden state from one step of the sequence to

Appl. Sci. 2021, 11, 2387. https://doi.org/10.3390/app11052387 https://www.mdpi.com/journal/applsci

the next, combined with the input, and routing it back between the inputs [5]. A long short-term memory based recurrent neural network (LSTM-RNN) has been used to forecast wind power 1 to 24 h ahead [6]. A comparison was made between LSTM, extreme learning machine (ELM) and SVM; the results have shown that deep learning approaches are more effective than traditional machine learning methods in improving prediction accuracy, through the directional loop neural network structure and a special hidden unit [7]. Studies have suggested that coupling NWP models and artificial neural networks could be beneficial and provide better accuracy compared to conventional NWP model downscaling methods [8]. The numerical weather prediction (NWP) method is one of the most widely used, and it is suitable for long-term rather than short- and medium-term forecasts due to the large amount of computation [9]. The wind speed forecasting accuracy of recurrent neural network models has been analyzed, and they presented better results compared to univariate and multivariate ARIMA models [10]. Linear and non-linear autoregressive models with and without external variables were developed to forecast short-term wind speed; three performance metrics, MAE, RMSE and MAPE, were used to measure the accuracy of the models [11]. The LSTM method was used to forecast wind speed and the results were compared to traditional artificial neural network and autoregressive integrated moving average models; the proposed method gave better results [12]. A long short-term memory (LSTM) model was applied to short-term spatio-temporal wind speed forecasting for five locations, using two years of historical wind speed data and auto-regression. The model aimed to improve forecasting accuracy on a shorter time horizon; for example, an LSTM can be used for two- or three-hour horizons, whereas an NWP model, which typically updates itself with a frequency of six hours, can forecast horizons extending up to fifteen days [13].
The training speed of RNNs for predicting multivariate time series is usually relatively slow, especially with large network depth. Recently, a variant of RNN called long short-term memory (LSTM) has been favored due to its superior performance during the training phase, as it better solves the vanishing and exploding gradient problems of the standard RNN architecture [14,15]. Long short-term memory (LSTM) and temporal convolutional network (TCN) models were proposed for data-driven lightweight weather forecasting, and their performance was compared with classic machine learning approaches (standard regression (SR), support vector regression (SVR), and random forest (RF)), statistical machine learning approaches (autoregressive integrated moving average (ARIMA), vector autoregression (VAR), and the vector error correction model (VECM)), and a dynamic ensemble method (arbitrage of forecasting experts (AFE)). The results demonstrate the ability of the proposed models to produce effective and accurate weather predictions [16]. Despite the continuous development of research on the LSTM for long-term prediction of wind speed, the traditional RNN algorithm is still preferred for prediction in most research. In this paper, emphasis is placed on the application of the LSTM algorithm to wind speed forecasting, and the prediction efficiency and accuracy of the algorithm are compared using different wind speed time series for training and testing.

2. Methodology and Data Source

2.1. Data Sources

In this study, the proposed model was implemented only for short-term forecasting of wind speed in order to avoid the high computational time of dynamical downscaling with NWP models such as the Weather Research and Forecasting (WRF) model. Wind speed data were taken from the Halifax Dockyard station in Nova Scotia, located at latitude 44.66° N, longitude 63.58° W. Wind speed was measured at a height of 3.80 m and was used as a source for two different seasons, spring (March 2015) and summer (July 2015), as shown in Figure 1. For both seasons, the data were recorded every hour, with 576 readings (24 days) as observations and 168 readings (7 days), respectively, as the training and testing groups. The proposed LSTM implementation fits well with the time series dataset, which can improve the convergence accuracy of the training process.


Figure 1. Time series of the wind speed in Halifax, Canada.

2.2. Recurrent Neural Networks

Recurrent neural networks (RNNs) are sequential data neural networks whose purpose is to predict the next step in a sequence of observations relative to previous steps in the same sequence. RNNs contain hidden layers distributed across time, which allows them to store information obtained in previous stages of reading serial data. Wind speed depends on both the short and the long term. The simple RNN model is unable to handle long-term time dependencies. One problem that arises from the unfolding of an RNN is that the gradient of some of the weights starts to become too small or too large if the network is unfolded for too many time steps. This phenomenon is called the vanishing gradients problem; the network can store short-term memory only, because it includes only the activations of the hidden layer from the previous time step, and this causes the loss of information in the long term [17,18]. A type of network architecture that solves this problem is the LSTM.
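The vanishing gradients problem described above can be illustrated numerically: unrolling a simple RNN multiplies the gradient by the recurrent weight and the tanh derivative at every step, so the product decays geometrically when its magnitude is below one. The following sketch uses arbitrary illustrative values (the weight 0.5 and pre-activation 0.8 are not from this study):

```python
import math

# Backpropagating through T unrolled steps multiplies the gradient by
# w * tanh'(h) at each step; with |w * tanh'(h)| < 1 the product vanishes.
w = 0.5          # illustrative recurrent weight
h = 0.8          # illustrative pre-activation value
factor = w * (1.0 - math.tanh(h) ** 2)   # d tanh(x)/dx = 1 - tanh(x)^2

grad = 1.0
for step in range(50):   # unfold the network for 50 time steps
    grad *= factor

print(f"per-step factor = {factor:.4f}, gradient after 50 steps = {grad:.3e}")
```

With a per-step factor of about 0.28, the gradient contribution from 50 steps back is vanishingly small, which is exactly the long-term information loss the LSTM is designed to avoid.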
In a typical implementation, the hidden layer is replaced by a complex block of computing units composed of gates that trap the error in the block, forming a so-called “error carrousel” [5]. Figure 2 shows the RNN structure, where the output of the previous hidden layer is input to the current hidden layer. The RNN model is expressed by

h_t = tanh(w_hx x_t + w_hy y_{t−1}),  y_t = σ(w h_t)    (1)

where x_t is the input, h_t is the state value of the hidden layer, and y_t is the value at the output layer at time t; w_hx is the weight from the input layer, w_hy is the weight for the delayed output at time t−1, tanh is the hyperbolic tangent as the activation function at the hidden layer, and σ is the sigmoid function as the activation function at the output layer.
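The recurrence of Equation (1) can be sketched directly in NumPy. The dimensions and random weights below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 1, 4                      # illustrative layer sizes
w_hx = rng.normal(size=(n_hid, n_in))   # input-to-hidden weights
w_hy = rng.normal(size=(n_hid, 1))      # delayed-output-to-hidden weights
w = rng.normal(size=(1, n_hid))         # hidden-to-output weights

def rnn_step(x_t, y_prev):
    """One step of Equation (1): h_t = tanh(w_hx x_t + w_hy y_{t-1}), y_t = sigma(w h_t)."""
    h_t = np.tanh(w_hx @ x_t + w_hy @ y_prev)
    return sigmoid(w @ h_t)

# Run the recurrence over a short sequence of arbitrary wind speed values.
y = np.zeros((1, 1))
for x in [5.2, 6.1, 4.8]:
    y = rnn_step(np.array([[x]]), y)
print(y.shape)   # (1, 1)
```

Note that, as in Equation (1), the delayed output y_{t−1} (not the hidden state) is fed back into the hidden layer at each step.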

2.3. Long Short-Term Memory

Long short-term memory networks are a type of recurrent neural network (RNN) designed to avoid the problem of long-term dependence, where each neuron contains a memory cell capable of storing the previous information used by the RNN or forgetting it if necessary [19]. Currently, it is widely and successfully used in time series prediction problems. The LSTM-RNN was designed around a memory cell that stores long-term dependencies. In addition to the memory cell, the LSTM cell contains an input gate, an output gate and a forget gate. Each gate in the cell receives the current input x_t, the hidden state h_{t−1} at the previous moment and the state information C_{t−1} of the cell's internal memory to perform various operations and determine whether to activate using a logic function. The state h_t of the unit,


the output at time t, and the input hidden state at time t−1 are determined by non-linear activation with tanh(·) and the information of the output gate.


Figure 2. RNN structure.

The processing of a time point inside an LSTM cell can be described as follows:

(1) The unwanted information in the LSTM is identified and thrown away from the cell state through the sigmoid layer called the forget gate layer, defined by

f_t = σ[w_f(h_{t−1}, x_t) + b_f]    (2)

where w_f is the weight, h_{t−1} is the output from the previous time step, x_t is the new input, and b_f is the bias.

(2) The new information that will be stored in the cell state is determined and updated by the sigmoid layer called the input gate layer. Next, a tanh layer creates a vector of new candidate values that could be added to the state:

i_t = σ[w_i(h_{t−1}, x_t) + b_i]    (3)

ĉ_t = tanh[w_c(h_{t−1}, x_t) + b_c]    (4)

(3) The old cell state c_{t−1} is updated into the new cell state c_t as follows:

c_t = f_t ∗ c_{t−1} + i_t ∗ ĉ_t    (5)

(4) A sigmoid layer is run to decide which parts of the cell state go to the output:

o_t = σ[w_o(h_{t−1}, x_t) + b_o]    (6)

h_t = o_t ∗ tanh(c_t)    (7)

The LSTM structure is illustrated in Figure 3.

2.4. Forecast Validation

In order to analyze the accuracy of the models, one of the most commonly used parameters for estimating wind speed predictions is the root mean square error (RMSE) [20]. It is expressed by

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (y_ai − y_fi)^2 )    (8)

where N is the number of data points, y_ai is the actual value, and y_fi is the forecast value.
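Equations (2)–(7) can be collected into a single cell-update function. The sketch below applies each gate's weight matrix to the concatenated vector [h_{t−1}, x_t] and uses randomly initialized illustrative weights; it mirrors the gate algebra only, not the trained network from this study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_x, n_h = 1, 4                                  # illustrative sizes

def make_w():                                    # weight over [h_{t-1}, x_t]
    return rng.normal(scale=0.1, size=(n_h, n_h + n_x))

w_f, w_i, w_c, w_o = make_w(), make_w(), make_w(), make_w()
b_f = b_i = b_c = b_o = np.zeros(n_h)

def lstm_step(h_prev, c_prev, x_t):
    z = np.concatenate([h_prev, x_t])            # (h_{t-1}, x_t)
    f_t = sigmoid(w_f @ z + b_f)                 # forget gate, Eq. (2)
    i_t = sigmoid(w_i @ z + b_i)                 # input gate, Eq. (3)
    c_hat = np.tanh(w_c @ z + b_c)               # candidate state, Eq. (4)
    c_t = f_t * c_prev + i_t * c_hat             # cell state update, Eq. (5)
    o_t = sigmoid(w_o @ z + b_o)                 # output gate, Eq. (6)
    h_t = o_t * np.tanh(c_t)                     # hidden state, Eq. (7)
    return h_t, c_t

h, c = np.zeros(n_h), np.zeros(n_h)
for x in [5.2, 6.1, 4.8]:                        # arbitrary wind speed values
    h, c = lstm_step(h, c, np.array([x]))
print(h.shape)   # (4,)
```

Because the forget gate f_t multiplies the previous cell state c_{t−1} rather than repeatedly squashing it, the gradient can flow through many time steps without vanishing, which is the "error carrousel" behavior described above.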


Figure 3. Description of LSTM cell configuration.

3. Results and Discussion

In this study, the MATLAB software (R2019b) was used for the training process of the LSTM, which is an advanced RNN architecture for predicting the values of future time steps of a sequence. A sequence regression network was trained on the LSTM sequence, where the responses are the training sequences with values shifted by one time step; that is, for each time step of the input sequence, the LSTM network learns to predict the value of the next time step. In this work, to evaluate the effectiveness and applicability of the proposed model in a comprehensive and systematic manner, two series of wind speed data were selected for two different seasons, chosen for their different climatic characteristics: spring (March 2015) and summer (July 2015). Each data series was divided into samples 1–576 (24 days) as observations and 577–744 (7 days), respectively, as the training and test groups. Training data were standardized to have zero mean and unit variance before prediction to prevent the training from diverging. The best training parameters, giving the lowest RMSE, were found using an initial learning rate of 0.005.

Figures 4 and 5 show the comparison of observed values with predicted values of the hourly wind speed series collected in spring (1–31 March 2015) and summer (1–31 July 2015), respectively, for evaluation of the LSTM, which was trained once and reused the value of the previous prediction to make a prediction for each time step, as represented by Equations (2)–(4). This means that no updates are made once the model first fits the training data; the model in this case is called the fixed model because there are no updates. The LSTM network training options were set to 200 hidden units. The initial learning rate is 0.005, and the maximum number of iterations is fixed at 250. The gradient threshold is set to 1 to prevent the gradients from exploding. The learning rate is dropped after 125 epochs by multiplying it by a factor of 0.2.

In both Figures 6 and 7, the LSTM has been updated with new data for the time series forecasting, using the predicted values and the updated state values from the test set, which are made available to the model for the forecast at the next time step. Specifically, the modified LSTM feeds C_{t−1} to the input, forget, and output gates, because every time the LSTM proceeds, C_{t−1} affects the input, forget, and output of the LSTM. All predictions are collected on the test data set and an error score is calculated to summarize the skill of the model. The root mean square error (RMSE) is used because it penalizes large errors and results in a score in the same units as the forecasted data, which is wind speed per hour. Here, the predictions are more accurate when updating the network state with the observed values instead of the predicted values. From the results, it is observed that for both series, spring (March 2015) and summer (July 2015), the value of RMSE dropped by 4.5845 and 4.9392, respectively, when using the LSTM update, owing to the different characteristics of each series. In this work, and based on various previous studies with different models, it is noted that the accuracy of prediction models differs according to the characteristics of the data; therefore, no model has yet been found that achieves the same accuracy on different data. Table 1 shows the errors for the two data series (July 2015) and (March 2015) in the proposed LSTM model using the RMSE error metric.
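The evaluation protocol described above (576 hourly readings for training, 168 for testing, standardization to zero mean and unit variance, and the RMSE of Equation (8)) can be sketched as follows. The synthetic series and the persistence baseline are stand-ins for illustration only; the Halifax data and the trained LSTM are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(42)
series = 5.0 + 2.0 * rng.random(744)          # stand-in for 744 hourly wind speeds

# Split: samples 1-576 (24 days) for training, 577-744 (7 days) for testing.
train, test = series[:576], series[576:]

# Standardize with the training statistics (zero mean, unit variance).
mu, sigma = train.mean(), train.std()
train_std = (train - mu) / sigma

def rmse(actual, forecast):
    """Root mean square error, Equation (8)."""
    return np.sqrt(np.mean((actual - forecast) ** 2))

# Naive persistence forecast as a baseline: predict each hour from the previous one.
forecast = np.concatenate([[train[-1]], test[:-1]])
print(f"test RMSE of persistence baseline: {rmse(test, forecast):.4f}")
```

Computing the same standardization statistics on the training split only, and applying them to the test split, avoids leaking test information into the model, which matches the fixed train/test division used in the study.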


Figure 4. Wind speed prediction of existing LSTM for Spring (March 2015) data: (a) Comparison of observed values and forecast values; (b) Error and RMSE values.


Figure 5. Wind speed prediction of existing LSTM for Summer (July 2015) data: (a) Comparison of observed values and forecast values; (b) Error and RMSE values.

Figure 6. Wind speed prediction of the proposed LSTM for Spring (March 2015) data: (a) Comparison of observed values and forecast values; (b) Error and RMSE values.




Figure 7. Wind speed prediction of proposed LSTM for Summer (July 2015) data: (a) Comparison of observed values and forecast values; (b) Error and RMSE values.

TablTablee 1. 1.EvaluationEvaluation of of two two different different time time series series data data in in the the LSTM LSTM model. model.

Time Series    Spring (1–31 March 2015)    Summer (1–31 July 2015)
RMSE           8.5128                      4.7796

4. Conclusions

An accurate forecasting model of wind speed sources is necessary to provide essential information to empower grid operators and system designers in generating an optimal wind power plant, and to balance the supply and demand in the energy market. In this study, a modified long short-term memory (LSTM) has been proposed to forecast wind speed. Since the actual values of the time steps between predictions can be accessed, the wind speed is predicted by updating the network state in each prediction using the observed value instead of the predicted value. This is because every time the LSTM proceeds, the cell state affects the input, forget, and output of the LSTM. The results of the model demonstrated improved accuracy in predicting wind speed.

Author Contributions: Conceptualization, M.E.; methodology, M.E.; software, M.E.; validation, M.E.; formal analysis, M.E. and A.M.; writing—Original draft preparation, M.E.; writing—Review and editing, A.M.; supervision, A.M.; project administration, A.M. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded in part by Libyan Education Ministry, grant number 3772.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Not applicable.

Conflicts of Interest: The authors declare no conflict of interest.




References

1. Monfared, M.; Rastegar, H.; Kojabadi, H.M. A new strategy for wind speed forecasting using artificial intelligent methods. Renew. Energy 2009, 34, 845–848. [CrossRef]
2. Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2015; pp. 129–171.
3. Elsaraiti, M.; Merabet, A.; Al-Durra, A. Time Series Analysis and Forecasting of Wind Speed Data. In Proceedings of the IEEE Industry Applications Society Annual Meeting, Baltimore, MD, USA, 29 September–3 October 2019; pp. 1–5.
4. Yadav, A.P.; Kumar, A.; Behera, L. RNN based solar radiation forecasting using adaptive learning rate. In The International Conference on Swarm, Evolutionary, and Memetic Computing; Springer: Cham, Switzerland, 2013; pp. 442–452.
5. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef] [PubMed]
6. Kumar, D.; Mathur, H.D.; Bhanot, S.; Bansal, R.C. Forecasting of solar and wind power using LSTM RNN for load frequency control in isolated microgrid. Int. J. Model. Simul. 2020. [CrossRef]
7. Shi, X.; Lei, X.; Huang, Q.; Huang, S.; Ren, K.; Hu, Y. Hourly day-ahead wind power prediction using the hybrid model of variational model decomposition and long short-term memory. Energies 2018, 11, 3227. [CrossRef]
8. Rodrigues, E.R.; Oliveira, I.; Cunha, R.; Netto, M. DeepDownscale: A Deep Learning Strategy for High-Resolution Weather Forecast. In Proceedings of the 2018 IEEE 14th International Conference on e-Science (e-Science), Amsterdam, The Netherlands, 29 October–1 November 2018; IEEE: New York, NY, USA, 2018; pp. 415–422.
9. Chen, N.; Qian, Z.; Nabney, I.T.; Meng, X. Wind power forecasts using Gaussian processes and numerical weather prediction. IEEE Trans. Power Syst. 2013, 29, 656–665. [CrossRef]
10. Cao, Q.; Ewing, B.T.; Thompson, M.A. Forecasting wind speed with recurrent neural networks. Eur. J. Oper. Res. 2012, 221, 148–154. [CrossRef]
11. Lydia, M.; Kumar, S.S.; Selvakumar, A.I.; Kumar, G.E. Linear and non-linear autoregressive models for short-term wind speed forecasting. Energy Convers. Manag. 2016, 112, 115–124. [CrossRef]
12. Ghaderi, A.; Sanandaji, B.M.; Ghaderi, F. Deep Forecast: Deep Learning-Based Spatio-Temporal Forecasting; Cornell University: Ithaca, NY, USA, 2017.
13. Özen, C.; Kaplan, O.; Özcan, C.; Dinç, U. Short Term Wind Speed Forecast by Using Long Short Term Memory. In Proceedings of the 9th International Symposium on Atmospheric Sciences ATMOS 2019, Istanbul, Turkey, 23–26 October 2019.
14. Kim, T.S.; Reiter, A. Interpretable 3D Human Action Analysis with Temporal Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 1623–1631.
15. Lea, C.; Flynn, M.D.; Vidal, R.; Reiter, A.; Hager, G.D. Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 156–165.
16. Sandhu, K.S.; Nair, A.R. A Comparative Study of ARIMA and RNN for Short Term Wind Speed Forecasting. In Proceedings of the 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, 6–8 July 2019; IEEE: New York, NY, USA, 2019; pp. 1–7.
17. Squartini, S.; Hussain, A.; Piazza, F. Preprocessing Based Solution for the Vanishing Gradient Problem in Recurrent Neural Networks. In Proceedings of the 2003 International Symposium on Circuits and Systems (ISCAS), Bangkok, Thailand, 25–28 May 2003; IEEE: New York, NY, USA, 2003; Volume 5, p. 5.

18. Batbaatar, E.; Park, H.W.; Li, D.; Li, M.; Ryu, K.H. DeepEnergy: Prediction of appliances energy with long-short term memory recurrent neural network. In Asian Conference on Intelligent Information and Database Systems; Springer: Cham, Switzerland, 2018; pp. 224–234.
19. Sak, H.; Senior, A.; Beaufays, F. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition; Cornell University: Ithaca, NY, USA, 2014.
20. Ghofrani, M.; Suherli, A. Time series and renewable energy forecasting. Time Ser. Anal. Appl. 2017, 2017, 77–92.