CARRNN: A Continuous Autoregressive Recurrent Neural Network for Deep Representation Learning from Sporadic Temporal Data



Mostafa Mehdipour Ghazi, Lauge Sørensen, Sébastien Ourselin, and Mads Nielsen

arXiv:2104.03739v1 [cs.LG] 8 Apr 2021

Abstract—Learning temporal patterns from multivariate longitudinal data is challenging, especially in cases when data is sporadic, as often seen in, e.g., healthcare applications where the data can suffer from irregularity and asynchronicity as the time between consecutive data points can vary across features and samples, hindering the application of existing deep learning models that are constructed for complete, evenly spaced data with fixed sequence lengths. In this paper, a novel deep learning-based model is developed for modeling multiple temporal features in sporadic data using an integrated deep learning architecture based on a recurrent neural network (RNN) unit and a continuous-time autoregressive (CAR) model. The proposed model, called CARRNN, uses a generalized discrete-time autoregressive model that is trainable end-to-end using neural networks modulated by time lags to describe the changes caused by the irregularity and asynchronicity. It is applied to multivariate time-series regression tasks using data provided for Alzheimer's disease progression modeling and intensive care unit (ICU) mortality rate prediction, where the proposed model based on a gated recurrent unit (GRU) achieves the lowest prediction errors among the proposed RNN-based models and state-of-the-art methods using GRUs and long short-term memory (LSTM) networks in their architecture.

Index Terms—Deep learning, recurrent neural network, long short-term memory network, gated recurrent unit, autoregressive model, multivariate time-series regression, sporadic time series.
©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

The corresponding author M. Mehdipour Ghazi ([email protected]) was with the Department of Medical Physics and Biomedical Engineering, University College London, London, UK, and Biomediq A/S, Copenhagen, DK. He is now with the Department of Computer Science, University of Copenhagen, Copenhagen, DK. L. Sørensen and M. Nielsen are with the Department of Computer Science, University of Copenhagen, Copenhagen, DK. They are also with Biomediq A/S and Cerebriu A/S, Copenhagen, DK. S. Ourselin was with the Department of Medical Physics and Biomedical Engineering, University College London, London, UK. He is now with the School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK.

Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

I. INTRODUCTION

THE rapid development of computational resources in recent years has enabled the processing of large-scale temporal sequences, especially in healthcare, using deep learning architectures. State-of-the-art multivariate sequence learning methods such as recurrent neural networks (RNNs) have been applied to extract high-level, time-dependent patterns from longitudinal data. Moreover, different variants of RNNs with gating mechanisms such as long short-term memory (LSTM) [1] and gated recurrent unit (GRU) [2] were introduced to tackle the vanishing and exploding gradients problem [3], [4] and capture long-term dependencies efficiently.

However, RNNs are modeled as discrete-time dynamical systems with evenly-spaced input and output time points, thereby ill-suited to process sporadic data which is common in many healthcare applications [5], [6]. This real-world data problem, as shown in the left part of Figure 1, can, e.g., arise from different acquisition dates and missing values and is associated with irregularity and asynchronicity of the features where the time between consecutive timestamps or visits can vary across different features and subjects. To address this issue, most of the existing approaches [7] use a two-step process assuming a fixed interval and applying missing data imputation techniques to complete the data before modeling the time-series measurements using RNNs. On the other hand, a majority of methods that can inherently model varied-length data using RNNs without taking notice of the sampling time information [8] fail to handle irregular or asynchronous data.

Recently, there have been efforts to incorporate the time distance attribute into the RNN architectures for dealing with sporadic data. For instance, the PLSTM [9], T-LSTM [10], GRU-D [11], tLSTM [12], and DLSTM [13] have placed exponential or linear time gates in LSTM or GRU architectures heuristically as multiplicative time-modulating functions with learnable parameters and achieved good results when applied to sporadic data. However, none of these studies have analytically investigated the proposed models, nor have they provided motivation for the RNN architecture modifications. In other words, current studies mainly focus on the design of deep architectures or loss functions or apply the proposed solutions to only one specific type of RNN while the capability of the method using different types of RNNs is not examined.

In this paper, a novel model is reported for modeling multiple temporal features in sporadic multivariate longitudinal data using an integrated deep learning architecture as represented in Figure 2 based on an RNN, LSTM, or GRU to learn the long-term dependencies from evenly-spaced data and a continuous-time autoregressive (CAR) model implemented as a neural network layer modulated by time lags to compensate for the existing irregularities. The proposed model, called CARRNN, introduces an analytical solution to the ordinary differential equation (ODE) problem and utilizes a generalized discrete-time autoregressive model which is trainable end-to-end. The asynchronous events are aligned using data binning to reduce the effects of missing values and noise. Finally, the remaining missing values are estimated by using a univariate CAR(1) model applied to the adjacent observations, and in cases when the feature points are fully missing, as shown in the bottom of the left part of Figure 1, the input array and loss function are normalized with the number of available data points [8].

Fig. 1: Illustrative example of time-series data with irregularly-spaced time points and asynchronous events, and the proposed CARRNN structure for modeling the binned time points. The left subfigure shows five feature sequences of a sample subject aligned with a bin width of τ and a time gap of ∆t_k defined between the two consecutive data points at time t_k and t_{k−1}. The right subfigure shows the CARRNN model structure that learns the multivariate temporal dependencies from the binned data. Note that the bottom feature includes only missing values. The annotations in the figure summarize how CARRNN handles each challenge:
• Variable sequence length (across all subjects/samples): normalized loss function
• Long-term dependencies: LSTM/GRU
• Irregularity: CAR(1)
• Asynchronicity: (dynamic) binning and aligning the features
• Missing values, incomplete features: diagonal (independent) CAR(1) applied to the neighboring points of each feature
• Missing values, fully missing features: passing zeros by the normalized input array and loss function
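To make the binning and alignment step described above concrete, the following is a minimal sketch (an illustration under assumed data, not the authors' implementation) of how sporadic, asynchronous feature streams can be averaged into shared bins of width τ, with empty bins left missing for the subsequent CAR(1)-based imputation:

```python
import numpy as np

def bin_features(times, values, tau, t_max):
    """times, values: lists (one per feature) of 1-D arrays of observation
    times and measurements; returns an (n_features, n_bins) array."""
    n_bins = int(np.ceil(t_max / tau))
    out = np.full((len(times), n_bins), np.nan)
    for f, (t, v) in enumerate(zip(times, values)):
        idx = np.minimum((np.asarray(t) // tau).astype(int), n_bins - 1)
        for b in range(n_bins):
            sel = idx == b
            if sel.any():
                out[f, b] = np.asarray(v)[sel].mean()   # average within the bin
    return out

# Two asynchronously sampled features of one subject (illustrative numbers).
x = bin_features(times=[[0.1, 2.3, 5.0], [1.2, 1.9, 6.4]],
                 values=[[1.0, 1.4, 2.0], [0.3, 0.5, 0.9]],
                 tau=2.0, t_max=7.0)
print(x)                                                # NaNs mark fully missing bins
```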
Fig. 2: Illustration of the proposed continuous-time autoregressive recurrent neural network model using an RNN and a CAR(1) model simulated as a built-in linear neural network layer for end-to-end learning.

II. THE PROPOSED METHOD

The proposed continuous-time autoregressive recurrent neural network model, which is trainable end-to-end as shown in Figure 2, applies an RNN to learn the long-term dependencies. For an evenly sampled multivariate sequence, a first-order autoregressive model, AR(1), can be written as

x_t = φ x_{t−1} + c + ε_t ,

where φ ∈ R^{N×N} is the matrix of the slope coefficients of the model, c ∈ R^{N×1} is the vector of the regression constants or intercepts, and ε_t ∈ R^{N×1} is white noise as the prediction error. In general, the time interval between the consecutive observations can vary, making the sequence seem like irregularly-sampled data. Therefore, the aforementioned AR(1) model can be rewritten as

x(t_k) = φ(∆t_k) x(t_{k−1}) + c(∆t_k) + ε(t_k) ,

where ∆t_k = t_k − t_{k−1} is the time gap between the two consecutive data points at time t_k and t_{k−1}, as shown in Figure 1, and the matrix φ(·), modulated by ∆t_k, contains the autoregressive effects in the main diagonal and cross-lagged effects in the off-diagonals. The model can then be expressed by the first-order differential equation [14], [15] using continuous-time autoregressive parameters (drift matrix and bias vector) of Φ ∈ R^{N×N} and ς ∈ R^{N×1} with an exponential solution as

dx(t)/dt = Φ x(t) + ς + Γ dε(t)/dt ,

x(t_k) = e^{Φ∆t_k} x(t_{k−1}) + Φ^{−1} (e^{Φ∆t_k} − I_N) ς + η(t_k) ,

where I_N is the identity matrix of size N × N, Γ ∈ R^{N×N} is the Cholesky triangle of the innovation covariance or diffusion matrix Ψ (Ψ = ΓΓ^T, where T is the transpose operator), and η(t_k) ∈ R^{N×1} is the continuous-time error vector at time t_k.
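As a worked numerical illustration of the exponential solution above, the snippet below propagates a state across irregular gaps ∆t_k using x(t_k) = e^{Φ∆t_k} x(t_{k−1}) + Φ^{−1}(e^{Φ∆t_k} − I_N) ς, with the noise term omitted; the drift matrix Φ, bias ς, and initial state are illustrative values, not parameters from the paper:

```python
import numpy as np
from scipy.linalg import expm

N = 3
Phi = -0.5 * np.eye(N) + 0.05 * np.ones((N, N))   # stand-in drift matrix (stable)
varsigma = np.array([0.1, -0.2, 0.05])            # stand-in bias vector
x_prev = np.array([1.0, 0.5, -0.3])               # x(t_{k-1})

def car1_step(x_prev, dt):
    A = expm(Phi * dt)                            # e^{Phi * dt}
    drift = np.linalg.solve(Phi, (A - np.eye(N)) @ varsigma)  # Phi^{-1}(e^{Phi dt} - I) varsigma
    return A @ x_prev + drift

# The same parameters serve any spacing between visits:
for dt in (0.5, 2.0, 7.3):
    print(dt, car1_step(x_prev, dt))
```

Because the transition is a function of ∆t_k alone, a single set of parameters covers arbitrary spacing between observations, which is the property the CAR(1) layer brings into the recurrent model.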
Recommended publications
  • Backpropagation and Deep Learning in the Brain
Backpropagation and Deep Learning in the Brain. Simons Institute -- Computational Theories of the Brain 2018. Timothy Lillicrap, DeepMind, UCL. With: Sergey Bartunov, Adam Santoro, Jordan Guerguiev, Blake Richards, Luke Marris, Daniel Cownden, Colin Akerman, Douglas Tweed, Geoffrey Hinton. The “credit assignment” problem. The solution in artificial networks: backprop. Credit assignment by backprop works well in practice and shows up in virtually all of the state-of-the-art supervised, unsupervised, and reinforcement learning algorithms. Why Isn’t Backprop “Biologically Plausible”? Neuroscience Evidence for Backprop in the Brain? A spectrum of credit assignment algorithms. How to convince a neuroscientist that the cortex is learning via [something like] backprop: To convince a machine learning researcher, an appeal to variance in gradient estimates might be enough. But this is rarely enough to convince a neuroscientist. So what lines of argument help? What do I mean by “something like backprop”? That learning is achieved across multiple layers by sending information from neurons closer to the output back to “earlier” layers to help compute their synaptic updates. 1. Feedback connections in cortex are ubiquitous and modify the
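As a concrete reference point for "something like backprop", here is a minimal two-layer network trained by backpropagation in plain NumPy; the toy data and layer sizes are illustrative assumptions. The point is the backward pass: the output-layer error is sent back to compute the update for the earlier layer's weights.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 3))                # 64 samples, 3 inputs
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

W1, W2 = rng.standard_normal((3, 8)) * 0.1, rng.standard_normal((8, 1)) * 0.1
lr = 0.5
for _ in range(200):
    h = np.tanh(X @ W1)                         # forward pass, hidden ("earlier") layer
    p = 1 / (1 + np.exp(-(h @ W2)))             # output probability
    # Backward pass: the output-layer error is propagated back so the
    # hidden-layer weights can be updated -- this is credit assignment.
    d_out = p - y                               # error at the output
    d_hid = (d_out @ W2.T) * (1 - h**2)         # error sent back through tanh
    W2 -= lr * h.T @ d_out / len(X)
    W1 -= lr * X.T @ d_hid / len(X)
print("training accuracy:", ((p > 0.5) == y).mean())
```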
  • Implementing Machine Learning & Neural Network Chip
Implementing Machine Learning & Neural Network Chip Architectures Using Network-on-Chip Interconnect IP. Primary author: Ty Garibay, Chief Technology Officer, ArterisIP, [email protected]. Secondary author: Kurt Shuler, Vice President, Marketing, ArterisIP, [email protected]. Abstract: A short tutorial and overview of machine learning and neural network chip architectures, with emphasis on how network-on-chip interconnect IP implements these architectures. Time to read: 10 minutes. Time to present: 30 minutes. What is Machine Learning? “Field of study that gives computers the ability to learn without being explicitly programmed.” Arthur Samuel, IBM, 1959. Machine learning is a subset of artificial intelligence, and explicitly relies upon experiential learning rather than programming to make decisions. The advantage to machine learning for tasks like automated driving or speech translation is that complex tasks like these are nearly impossible to explicitly program using rule-based if…then…else statements because of the large solution space. However, the “answers” given by a machine learning algorithm have a probability associated with them and can be non-deterministic (meaning you can get different “answers” given the same inputs during different runs). Neural networks have become the most common way to implement machine learning.
  • Combining Recurrent Neural Networks and Adversarial Training for Human Motion Modelling, Synthesis and Control
Combining Recurrent Neural Networks and Adversarial Training for Human Motion Modelling, Synthesis and Control. Zhiyong Wang (Institute of Computing Technology, CAS; University of Chinese Academy of Sciences), Jinxiang Chai (Texas A&M University), Shihong Xia (Institute of Computing Technology, CAS). ABSTRACT: This paper introduces a new generative deep learning network for human motion synthesis and control. Our key idea is to combine recurrent neural networks (RNNs) and adversarial training for human motion modeling. We first describe an efficient method for training a RNNs model from prerecorded motion data. We implement recurrent neural networks with long short-term memory (LSTM) cells because they are capable of handling nonlinear dynamics and long term temporal dependencies present in human motions. Next, we train a refiner network using an adversarial loss, similar to Generative Adversarial Networks (GANs), such that the refined motion sequences are indistinguishable from real motion capture data using a discriminative network. … motions from the generator using recurrent neural networks (RNNs) and refines the generated motion using an adversarial neural network which we call the “refiner network”. Fig 2 gives an overview of our method: a motion sequence XRNN is generated with the generator G and is refined using the refiner network R. To add realism, we train our refiner network using an adversarial loss, similar to Generative Adversarial Networks (GANs) [9], such that the refined motion sequences Xrefine are indistinguishable from real motion capture sequences Xreal using a discriminative network D. In addition, we embed contact information into the generative model to further improve the quality of the generated motions. We construct the generator G based on recurrent neural network
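A compact sketch of the generator/refiner/discriminator arrangement described above, written in PyTorch; the module definitions, tensor sizes, and random stand-in data are assumptions for illustration, not the authors' code:

```python
import torch
import torch.nn as nn

T, B, D_pose, H = 30, 8, 54, 128            # sequence length, batch, pose dim, hidden size

class Generator(nn.Module):                  # RNN that synthesizes a motion sequence
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(D_pose, H, batch_first=True)
        self.out = nn.Linear(H, D_pose)
    def forward(self, seed):                 # seed: (B, T, D_pose)
        h, _ = self.lstm(seed)
        return self.out(h)                   # X_RNN: (B, T, D_pose)

refiner = nn.Sequential(nn.Linear(D_pose, H), nn.ReLU(), nn.Linear(H, D_pose))
discrim = nn.Sequential(nn.Flatten(), nn.Linear(T * D_pose, H), nn.ReLU(), nn.Linear(H, 1))

G = Generator()
bce = nn.BCEWithLogitsLoss()

seed = torch.randn(B, T, D_pose)             # stand-in for a seed motion
x_rnn = G(seed)                              # generated motion
x_refined = x_rnn + refiner(x_rnn)           # residual refinement (refiner network R)
x_real = torch.randn(B, T, D_pose)           # stand-in for motion-capture data

# Adversarial losses: D separates real from refined; the refiner tries to fool D.
d_loss = bce(discrim(x_real), torch.ones(B, 1)) + \
         bce(discrim(x_refined.detach()), torch.zeros(B, 1))
g_loss = bce(discrim(x_refined), torch.ones(B, 1))
print(d_loss.item(), g_loss.item())
```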
  • PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs. Yunbo Wang, Mingsheng Long, Jianmin Wang, Zhifeng Gao, Philip S. Yu (School of Software, Tsinghua University). Abstract: The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal variations are two crucial structures. This paper models these structures by presenting a predictive recurrent neural network (PredRNN). This architecture is enlightened by the idea that spatiotemporal predictive learning should memorize both spatial appearances and temporal variations in a unified memory pool. Concretely, memory states are no longer constrained inside each LSTM unit. Instead, they are allowed to zigzag in two directions: across stacked RNN layers vertically and through all RNN states horizontally. The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously. PredRNN achieves the state-of-the-art prediction performance on three video prediction datasets and is a more general framework, that can be easily extended to other predictive learning tasks by integrating with other architectures. 1 Introduction
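The zigzag routing of the memory state can be sketched in a few lines; the cell below is a stand-in (not the actual ST-LSTM equations) and the sizes are illustrative, but it shows the two flows: per-layer hidden states move horizontally through time while a single spatiotemporal memory climbs the layer stack within each step and is handed from the top layer back to the bottom layer at the next step.

```python
import numpy as np

L, T, D = 3, 5, 16                      # layers, time steps, (flattened) state size
rng = np.random.default_rng(0)

def cell(x, h, m):
    """Stand-in for an ST-LSTM cell: mixes input, temporal state h, spatial memory m."""
    h_new = np.tanh(x + h + m)
    m_new = np.tanh(x + m)
    return h_new, m_new

h = [np.zeros(D) for _ in range(L)]     # per-layer hidden states (horizontal flow)
m = np.zeros(D)                         # shared spatiotemporal memory (zigzag flow)
for t in range(T):
    x = rng.standard_normal(D)          # stand-in for the frame at time t
    for l in range(L):                  # memory climbs the stack within a time step
        inp = x if l == 0 else h[l - 1]
        h[l], m = cell(inp, h[l], m)
    # m now leaves the top layer and re-enters layer 0 at the next time step
print(h[-1][:4])
```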
  • Recurrent Neural Network for Text Classification with Multi-Task Learning
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16). Recurrent Neural Network for Text Classification with Multi-Task Learning. Pengfei Liu, Xipeng Qiu, Xuanjing Huang. Shanghai Key Laboratory of Intelligent Information Processing, Fudan University; School of Computer Science, Fudan University, 825 Zhangheng Road, Shanghai, China. {pfliu14,xpqiu,xjhuang}@fudan.edu.cn

Abstract: Neural network based methods have obtained great progress on a variety of natural language processing tasks. However, in most previous works, the models are learned based on single-task supervised objectives, which often suffer from insufficient training data. In this paper, we use the multi-task learning framework to jointly learn across multiple related tasks. Based on recurrent neural network, we propose three different mechanisms of sharing information to model text with task-specific and shared layers. The entire network is trained jointly on all these tasks. Experiments on four benchmark text classification tasks show that our

… are based on unsupervised objectives such as word prediction for training [Collobert et al., 2011; Turian et al., 2010; Mikolov et al., 2013]. This unsupervised pre-training is effective to improve the final performance, but it does not directly optimize the desired task. Multi-task learning utilizes the correlation between related tasks to improve classification by learning tasks in parallel. Motivated by the success of multi-task learning [Caruana, 1997], there are several neural network based NLP models [Collobert and Weston, 2008; Liu et al., 2015b] utilize multi-task learning to jointly learn several tasks with the aim of mutual benefit. The basic multi-task architectures of these models are to share some lower layers to determine common features.
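One of the sharing mechanisms mentioned above, lower layers shared across tasks with task-specific outputs, can be sketched as follows in PyTorch; the vocabulary size, dimensions, and two-task setup are illustrative assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class SharedRNNClassifier(nn.Module):
    def __init__(self, vocab=5000, emb=64, hidden=128, n_classes=(2, 5)):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hidden, batch_first=True)    # shared lower layer
        self.heads = nn.ModuleList([nn.Linear(hidden, c) for c in n_classes])
    def forward(self, tokens, task_id):
        _, h = self.rnn(self.embed(tokens))                  # h: (1, B, hidden)
        return self.heads[task_id](h[-1])                    # task-specific logits

model = SharedRNNClassifier()
batch = torch.randint(0, 5000, (4, 20))                      # 4 sentences, 20 tokens each
print(model(batch, task_id=0).shape, model(batch, task_id=1).shape)
```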
  • Introduction to Machine Learning
Introduction to Machine Learning: Perceptron. Barnabás Póczos. Contents: History of Artificial Neural Networks; Definitions: Perceptron, Multi-Layer Perceptron; Perceptron algorithm. Short History of Artificial Neural Networks. Progression (1943-1960): • First mathematical model of neurons ▪ Pitts & McCulloch (1943) • Beginning of artificial neural networks • Perceptron, Rosenblatt (1958) ▪ A single neuron for classification ▪ Perceptron learning rule ▪ Perceptron convergence theorem. Degression (1960-1980): • Perceptron can’t even learn the XOR function • We don’t know how to train MLP • 1963 Backpropagation… but not much attention… Bryson, A.E.; W.F. Denham; S.E. Dreyfus. Optimal programming problems with inequality constraints. I: Necessary conditions for extremal solutions. AIAA J. 1, 11 (1963) 2544-2550. Progression (1980-): • 1986 Backpropagation reinvented: ▪ Rumelhart, Hinton, Williams: Learning representations by back-propagating errors. Nature, 323, 533—536, 1986 • Successful applications: ▪ Character recognition, autonomous cars,… • Open questions: Overfitting? Network structure? Neuron number? Layer number? Bad local minimum points? When to stop training? • Hopfield nets (1982), Boltzmann machines,… Degression (1993-): • SVM: Vapnik and his co-workers developed the Support Vector Machine (1993). It is a shallow architecture. • SVM and Graphical models almost kill the ANN research. • Training deeper networks consistently yields poor results. • Exception: deep convolutional neural networks, Yann LeCun 1998 (discriminative model). Progression (2006-): Deep Belief Networks (DBN) • Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554. • Generative graphical model • Based on restrictive Boltzmann machines • Can be trained efficiently. Deep Autoencoder based networks: Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H.
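The perceptron learning rule listed in the contents can be stated in a few lines; the sketch below trains a single neuron on the linearly separable AND function (illustrative data), where the convergence theorem guarantees termination, whereas XOR, as noted above, is not learnable by a single perceptron.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])                        # AND labels
w, b, lr = np.zeros(2), 0.0, 1.0

for epoch in range(10):
    errors = 0
    for xi, target in zip(X, y):
        pred = int(w @ xi + b > 0)
        update = lr * (target - pred)             # perceptron rule: w += lr*(y - y_hat)*x
        w += update * xi
        b += update
        errors += int(update != 0)
    if errors == 0:                               # converged on separable data
        break
print(w, b, [int(w @ xi + b > 0) for xi in X])
```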
  • Deep Learning Architectures for Sequence Processing
Speech and Language Processing. Daniel Jurafsky & James H. Martin. Copyright © 2021. All rights reserved. Draft of September 21, 2021. CHAPTER 9: Deep Learning Architectures for Sequence Processing. "Time will explain." (Jane Austen, Persuasion) Language is an inherently temporal phenomenon. Spoken language is a sequence of acoustic events over time, and we comprehend and produce both spoken and written language as a continuous input stream. The temporal nature of language is reflected in the metaphors we use; we talk of the flow of conversations, news feeds, and twitter streams, all of which emphasize that language is a sequence that unfolds in time. This temporal nature is reflected in some of the algorithms we use to process language. For example, the Viterbi algorithm applied to HMM part-of-speech tagging proceeds through the input a word at a time, carrying forward information gleaned along the way. Yet other machine learning approaches, like those we’ve studied for sentiment analysis or other text classification tasks, don’t have this temporal nature – they assume simultaneous access to all aspects of their input. The feedforward networks of Chapter 7 also assumed simultaneous access, although they also had a simple model for time. Recall that we applied feedforward networks to language modeling by having them look only at a fixed-size window of words, and then sliding this window over the input, making independent predictions along the way. Fig. 9.1, reproduced from Chapter 7, shows a neural language model with window size 3 predicting what word follows the input for all the. Subsequent words are predicted by sliding the window forward a word at a time.
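The fixed-window setup described above amounts to sliding a 3-word context along the token stream and predicting the next word at every position; a minimal illustration (with a made-up token stream) of how those training pairs are formed:

```python
# Build (3-word context, next word) pairs the way a window-3 feedforward
# language model consumes text; the token stream here is illustrative.
tokens = "time will explain said jane austen in persuasion".split()
window = 3

pairs = [(tokens[i:i + window], tokens[i + window])
         for i in range(len(tokens) - window)]
for context, target in pairs:
    print(context, "->", target)
```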
  • Unsupervised Speech Representation Learning Using WaveNet Autoencoders
Unsupervised speech representation learning using WaveNet autoencoders. Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord. Abstract—We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms. The goal is to learn a representation able to capture high level semantic content from the signal, e.g. phoneme identities, while being invariant to confounding low level details in the signal such as the underlying pitch contour or background noise. Since the learned representation is tuned to contain only phonetic content, we resort to using a high capacity WaveNet decoder to infer information discarded by the encoder from previous samples. Moreover, the behavior of autoencoder models depends on the kind of constraint that is applied to the latent representation. We compare three variants: a simple dimensionality reduction bottleneck, a Gaussian Variational Autoencoder (VAE), and a discrete Vector Quantized VAE (VQ-VAE). We analyze the quality of learned representations in terms of speaker independence, the ability to predict phonetic content, and the ability to accurately reconstruct

… speaker gender and identity, from phonetic content, properties which are consistent with internal representations learned by speech recognizers [13], [14]. Such representations are desired in several tasks, such as low resource automatic speech recognition (ASR), where only a small amount of labeled training data is available. In such scenario, limited amounts of data may be sufficient to learn an acoustic model on the representation discovered without supervision, but insufficient to learn the acoustic model and a data representation in a fully supervised manner [15], [16]. We focus on representations learned with autoencoders applied to raw waveforms and spectrogram features and investigate the quality of learned representations on LibriSpeech [17].
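Of the three bottlenecks compared above, the VQ-VAE one is the most distinctive: each encoder output is replaced by its nearest codebook vector. A minimal NumPy sketch of that quantization step, with made-up latents and codebook (not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)
z_e = rng.standard_normal((50, 16))          # 50 encoder frames, 16-dim latents
codebook = rng.standard_normal((32, 16))     # 32 learned code vectors

# Nearest-neighbour lookup: pairwise squared distances, then argmin per frame.
d = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
codes = d.argmin(axis=1)                     # discrete token per frame
z_q = codebook[codes]                        # quantized latents fed to the decoder
print(codes[:10], z_q.shape)
```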
  • Machine Learning V/S Deep Learning
International Research Journal of Engineering and Technology (IRJET). e-ISSN: 2395-0056, p-ISSN: 2395-0072. Volume: 06, Issue: 02 | Feb 2019. www.irjet.net. Machine Learning v/s Deep Learning. Sachin Krishan Khanna, Department of Computer Science Engineering, Students of Computer Science Engineering, Chandigarh Group of Colleges, PinCode: 140307, Mohali, India. ABSTRACT - This is research paper on a brief comparison and summary about the machine learning and deep learning. This comparison on these two learning techniques was done as there was lot of confusion about these two learning techniques. Nowadays these techniques are used widely in IT industry to make some projects or to solve problems or to maintain large amount of data. This paper includes the comparison of two techniques and will also tell about the future aspects of the learning techniques. Keywords: ML, DL, AI, Neural Networks, Supervised & Unsupervised learning, Algorithms. INTRODUCTION: As the technology is getting advanced day by day, now we are trying to make a machine to work like a human so that we don’t have to make any effort to solve any problem or to do any heavy stuff. To make a machine work like a human, the machine need to learn how to do work, for this machine learning technique is used and deep learning is used to help a machine to solve a real-time problem. They both have algorithms which work on these issues. With the rapid growth of this IT sector, this industry needs speed, accuracy to meet their targets. With these learning algorithms industry can meet their requirements and these new techniques will provide industry a different way to solve problems.
  • Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning
Lecture 11: Recurrent Neural Networks I. CMSC 35246: Deep Learning. Shubhendu Trivedi & Risi Kondor, University of Chicago, May 01, 2017. Introduction: Sequence Learning with Neural Networks. Some Sequence Tasks (figure credit: Andrej Karpathy). Problems with MLPs for Sequence Tasks: the "API" is too limited. MLPs only accept an input of fixed dimensionality and map it to an output of fixed dimensionality. Great e.g.: Inputs - Images, Output - Categories. Bad e.g.: Inputs - Text in one language, Output - Text in another language. MLPs treat every example independently. How is this problematic? Need to re-learn the rules of language from scratch each time. Another example: Classify events after a fixed number of frames in a movie. Need to reuse knowledge about the previous events to help in classifying the current.
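What the slides contrast with the limited MLP "API" is the recurrent update, in which a hidden state is carried across time steps; a bare simple-RNN step in NumPy with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
D_in, D_h, T = 4, 8, 6
W_xh = rng.standard_normal((D_in, D_h)) * 0.1
W_hh = rng.standard_normal((D_h, D_h)) * 0.1
b_h = np.zeros(D_h)

h = np.zeros(D_h)                             # state reused across time steps
for t in range(T):
    x_t = rng.standard_normal(D_in)           # stand-in for the t-th input
    h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)  # h depends on the whole history
print(h)
```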
  • Comparative Analysis of Recurrent Neural Network Architectures for Reservoir Inflow Forecasting
    water Article Comparative Analysis of Recurrent Neural Network Architectures for Reservoir Inflow Forecasting Halit Apaydin 1 , Hajar Feizi 2 , Mohammad Taghi Sattari 1,2,* , Muslume Sevba Colak 1 , Shahaboddin Shamshirband 3,4,* and Kwok-Wing Chau 5 1 Department of Agricultural Engineering, Faculty of Agriculture, Ankara University, Ankara 06110, Turkey; [email protected] (H.A.); [email protected] (M.S.C.) 2 Department of Water Engineering, Agriculture Faculty, University of Tabriz, Tabriz 51666, Iran; [email protected] 3 Department for Management of Science and Technology Development, Ton Duc Thang University, Ho Chi Minh City, Vietnam 4 Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam 5 Department of Civil and Environmental Engineering, Hong Kong Polytechnic University, Hong Kong, China; [email protected] * Correspondence: [email protected] or [email protected] (M.T.S.); [email protected] (S.S.) Received: 1 April 2020; Accepted: 21 May 2020; Published: 24 May 2020 Abstract: Due to the stochastic nature and complexity of flow, as well as the existence of hydrological uncertainties, predicting streamflow in dam reservoirs, especially in semi-arid and arid areas, is essential for the optimal and timely use of surface water resources. In this research, daily streamflow to the Ermenek hydroelectric dam reservoir located in Turkey is simulated using deep recurrent neural network (RNN) architectures, including bidirectional long short-term memory (Bi-LSTM), gated recurrent unit (GRU), long short-term memory (LSTM), and simple recurrent neural networks (simple RNN). For this purpose, daily observational flow data are used during the period 2012–2018, and all models are coded in Python software programming language.
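As a rough sketch of how one of the compared models can be set up in Python, the snippet below fits a small GRU regressor to a synthetic daily series framed with a 30-day lookback window, assuming a Keras/TensorFlow setup; the data, window length, and layer sizes are illustrative assumptions, not the study's configuration.

```python
import numpy as np
import tensorflow as tf

flow = np.sin(np.linspace(0, 60, 2000)) + 0.1 * np.random.randn(2000)  # stand-in for daily inflow
lookback = 30
X = np.stack([flow[i:i + lookback] for i in range(len(flow) - lookback)])[..., None]
y = flow[lookback:]                                   # next-day target for each window

model = tf.keras.Sequential([
    tf.keras.Input(shape=(lookback, 1)),
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)
print(model.predict(X[:3], verbose=0).ravel())
```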
  • Deep Learning in Bioinformatics
    Deep Learning in Bioinformatics Seonwoo Min1, Byunghan Lee1, and Sungroh Yoon1,2* 1Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea 2Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea Abstract In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e., omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e., deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. *Corresponding author. Mailing address: 301-908, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea. E-mail: [email protected]. Phone: +82-2-880-1401. Keywords Deep learning, neural network, machine learning, bioinformatics, omics, biomedical imaging, biomedical signal processing Key Points As a great deal of biomedical data have been accumulated, various machine algorithms are now being widely applied in bioinformatics to extract knowledge from big data.