Understanding Deep Neural Networks


Bachelor thesis (Computer Science BSc)

Authors: David Kempf, Lino von Burg
Main supervisors: Dr. Thilo Stadelmann, Prof. Dr. Olaf Stern
Date: 08.06.2018

DECLARATION OF ORIGINALITY
Bachelor’s Thesis at the School of Engineering

By submitting this Bachelor’s thesis, the undersigned student confirms that this thesis is his/her own work and was written without the help of a third party. (Group work: the contributions of the other group members are not considered third-party help.) The student declares that all sources in the text (including Internet pages) and appendices have been correctly disclosed. This means that there has been no plagiarism, i.e. no sections of the Bachelor’s thesis have been partially or wholly taken from other texts and represented as the student’s own work or included without being correctly referenced. Any misconduct will be dealt with according to paragraphs 39 and 40 of the General Academic Regulations for Bachelor’s and Master’s Degree courses at the Zurich University of Applied Sciences (Rahmenprüfungsordnung ZHAW (RPO)) and subject to the provisions for disciplinary action stipulated in the University regulations.

City, Date: Signature:

The original signed and dated document (no copies) must be included after the title sheet in the ZHAW version of all Bachelor’s theses submitted.

Zurich University of Applied Sciences

Abstract

Artificial Intelligence (AI) comprises a wealth of distinct problem-solving methods and, through its various subfields, attempts to tackle different classes of challenges. During the last decade, huge advances have been made in the field and, as a consequence, AI has achieved great results in machine translation and computer vision tasks.
In particular, Artificial Neural Networks (ANNs) won several international contests in pattern recognition between 2009 and 2012 and even reached human-competitive performance in recognition applications. Despite these impressive achievements, ANNs are still often perceived as incomprehensible black boxes due to the complexity of the underlying models. As ANNs find their way more and more into everyday processes, the need to understand their inner workings is becoming central. This thesis therefore surveys the state of the art in debugging Convolutional Neural Networks (CNNs) and explores how helpful the explanations provided by such methods are for improving the understanding of these classifiers. In order to investigate this question, an experimental testbed for applying debugging methods to CNNs was built and an experiment to evaluate these methods was carried out. An image recognition task related to the classification of cats and dogs served as the basis for the investigation. The goal was to analyze how the model arrives at its classification scores when it is presented with input images that belong to neither of the classes it was trained to detect. When debugging methods are used to better understand the function of CNNs, it is important that the results obtained from such methods explain not only the reasons for assigning a given image to a specific output class but also the reasons for not assigning it to a different output class. In this way, the results facilitate the understanding of the notion of classes that a CNN has developed from the data used to train it. Hence, part of the conducted experiment was to compare how differently the trained network behaves when it is fed with data that does not contain concepts corresponding to the output classes. Within the context of this work, state-of-the-art methods for debugging CNNs were reviewed and subsumed into a new taxonomy.
A characteristic subset of these methods was applied to a CNN trained on the Kaggle Dogs vs. Cats competition dataset.

Zusammenfassung (Summary)

Artificial Intelligence (AI) comprises a multitude of different problem-solving methods and, with its subfields, attempts to tackle different classes of challenges. During the last decade, very great progress has been made in this field of research and, as a result, AI has achieved very good results in machine translation and computer vision. In particular, Artificial Neural Networks (ANNs) won numerous international pattern recognition competitions between 2009 and 2012 and even reached human-like performance in recognition applications. Despite these impressive successes, ANNs are still often regarded as incomprehensible black boxes because of the complexity of the underlying models. The more ANNs become part of everyday processes, the more necessary it becomes to understand their inner workings. This Bachelor's thesis therefore surveys the state of the art in debugging Convolutional Neural Networks (CNNs) and investigates how helpful the explanations produced by such methods are for gaining a better understanding of these classifiers.

To investigate this question, an experimental test environment for applying debugging methods to CNNs was set up and an experiment to evaluate these methods was carried out. An image recognition task involving the classification of cats and dogs served as the basis for the investigation. The goal was to analyze how the trained model arrives at its classification scores when the network is given input images that belong to none of the output classes.

When debugging methods are used to better understand how CNNs function, it is important that the results obtained from such methods explain not only why a given image was assigned to a specific output class but also why the same image was not assigned to a different output class. In this way, the results improve the understanding of what notion of the classes the CNN has developed from the training data. Therefore, one part of the experiment compared how differently the network behaves when it is given data that does not correspond to the output classes. Within the scope of this work, state-of-the-art methods for debugging CNNs were reviewed and arranged in a new taxonomy. A characteristic selection of these methods was applied to a CNN trained on the Dogs vs. Cats competition dataset.

Preface

Acknowledgement

Our supervisor Dr. Stadelmann provided us with many inputs and practical tips while we were working on this Bachelor thesis. We would like to thank Dr. Stadelmann for all his support throughout the work. We thank our second supervisor Prof. Dr. Stern for reviewing the drafts of our report and giving us feedback on them. We would further like to thank Mrs. Fernando for giving us language-related advice. We would also like to thank Mr. Amirian for his support with the training of the Neural Network.

Contents

1 Introduction
  1.1 Motivation
  1.2 Research Question
  1.3 Limitations of Learning Capability
  1.4 Structure of Thesis
2 Prerequisites
  2.1 Artificial Neural Networks
  2.2 Convolutional Neural Networks
3 Literature Survey
  3.1 Scope
  3.2 Intuitive Interpretability
  3.3 Taxonomy Features
  3.4 Visualization of Feature Representation
  3.5 Visualization of Activation Attribution
    3.5.1 Code Inversion
    3.5.2 Occlusion Sensitivity
  3.6 Visualization of Learned Class-Concepts
    3.6.1 Optimisation
  3.7 Classification of Methods
4 Experiments
  4.1 Methodology
  4.2 Experimental Testbed
  4.3 Implementation of Debugging Methods
5 Results
6 Summary and Future Work
(Listings)
Bibliography
(List of Figures)
(List of Tables)
A Appendix
  A.1 Announcement
  A.2 Visualizations of ‘Unknown’ Classes

1 Introduction

This introductory chapter frames the thesis by stating its motivation and the question it investigates. First, Section 1.1 considers the lack of comprehension regarding Artificial Neural Networks. Section 1.2 then defines the research question of this work. Section 1.3 points out the limitations of a Neural Network’s learning capability. Finally, Section 1.4 describes the structure of the remainder of the thesis.

1.1 Motivation

The field of Artificial Intelligence (AI) consists of a variety of distinct problem-solving methods. Its several subfields aim at providing support in different matters and have already succeeded in doing so in areas like computer vision and natural language processing. Voice recognition and machine translation are examples of narrow tasks which, at least for some languages, AI systems are able to carry out at a human level [RN03]. Although such systems are present in many modern devices, AI applications are still often perceived as little more than toys. Additionally, classifiers such as Neural Networks have frequently been referred to as incomprehensible black boxes because they independently find structures in the processed material. The increasing application of AI technology makes it necessary to improve the understanding of it. Artificial Neural Networks (ANNs) as a branch of Artificial Intelligence have been on researchers’ and pundits’ minds for
Recommended publications
  • Artificial Neural Networks Part 2/3 – Perceptron
    Artificial Neural Networks Part 2/3 – Perceptron. Slides modified from Neural Network Design by Hagan, Demuth and Beale. Berrin Yanikoglu

    Perceptron
    • A single artificial neuron that computes its weighted input and uses a threshold activation function.
    • It effectively separates the input space into two categories by the hyperplane $w^T x + b = 0$.

    Decision Boundary
    The weight vector is orthogonal to the decision boundary. The weight vector points in the direction of the vector which should produce an output of 1
    • so that the vectors with the positive output are on the right side of the decision boundary
      – if w pointed in the opposite direction, the dot products of all input vectors would have the opposite sign
      – this would result in the same classification but with opposite labels
    The bias determines the position of the boundary
    • solve $w^T p + b = 0$ using one point on the decision boundary to find b.

    Two-Input Case
    $a = \mathrm{hardlim}(n)$, $n = [1\ 2]\,p + (-2)$, with $w_{1,1} = 1$, $w_{1,2} = 2$.
    Decision boundary: all points p for which $w^T p + b = 0$. If we have the weights and not the bias, we can take a point on the decision boundary, $p = [2\ 0]^T$, and solving $[1\ 2]\,p + b = 0$, we see that $b = -2$.

    [Figure: input vector p, weight vector w, the angle θ between them, the decision boundary, and the projection of p onto w.]
    $w^T p = \|w\|\,\|p\|\cos\theta$
    $w^T p + b = 0 \iff w^T p = -b$
    projection of p onto w $= \|p\|\cos\theta = w^T p / \|w\|$
    • All points on the decision boundary have the same inner product (= -b) with the weight vector
    • Therefore they have the same projection onto the weight vector; so they must lie on a line orthogonal to the weight vector

    An Illustrative Example: Boolean OR
    $\{p_1 = [0\ 0]^T, t_1 = 0\}$, $\{p_2 = [0\ 1]^T, t_2 = 1\}$, $\{p_3 = [1\ 0]^T, t_3 = 1\}$, $\{p_4 = [1\ 1]^T, t_4 = 1\}$
    Given the above input-output pairs (p, t), can you find (manually) the weights of a perceptron to do the job?

    Boolean OR Solution
    1) Pick an admissible decision boundary
    2) Weight vector should be orthogonal to the decision boundary.
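    The two-input case and the Boolean OR exercise above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own solution: the function names are hypothetical, the convention that hardlim returns 1 for n >= 0 is an assumption, and the OR boundary through (0.5, 0) and (0, 0.5) with w = [1, 1] is just one admissible choice among many.

```python
def hardlim(n):
    """Threshold activation (assumed convention): 1 if n >= 0, else 0."""
    return 1 if n >= 0 else 0


def perceptron(w, b, p):
    """Single neuron: weighted sum of the inputs plus bias, then threshold."""
    n = sum(wi * pi for wi, pi in zip(w, p)) + b
    return hardlim(n)


def bias_from_boundary_point(w, p):
    """Solve w^T p + b = 0 for b, given one point p on the decision boundary."""
    return -sum(wi * pi for wi, pi in zip(w, p))


# Two-input case from the snippet: w = [1, 2], boundary point p = [2, 0].
b_slides = bias_from_boundary_point([1, 2], [2, 0])

# Boolean OR: assumed boundary through (0.5, 0) and (0, 0.5).
# A weight vector orthogonal to that line is w = [1, 1].
w_or = [1, 1]
b_or = bias_from_boundary_point(w_or, [0.5, 0])

or_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
or_outputs = [perceptron(w_or, b_or, p) for p in or_inputs]

print("bias for two-input case:", b_slides)
print("bias for OR:", b_or)
print("OR outputs:", or_outputs)
```

    Any other boundary that separates (0, 0) from the remaining three points would work equally well; only the orthogonality of w to the boundary and the sign convention matter.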
  • Artificial Neuron Using MoS2/Graphene Threshold Switching Memristor
    University of Central Florida, STARS, Electronic Theses and Dissertations, 2004-2019 (2018). Artificial Neuron using MoS2/Graphene Threshold Switching Memristor, by Hirokjyoti Kalita (B.Tech., SRM University, 2015). A thesis submitted in partial fulfilment of the requirements for the degree of Master of Science in the Department of Electrical and Computer Engineering in the College of Engineering and Computer Science at the University of Central Florida, Orlando, Florida. Summer Term 2018. Major Professor: Tania Roy. © 2018 Hirokjyoti Kalita. STARS Citation: Kalita, Hirokjyoti, "Artificial Neuron using MoS2/Graphene Threshold Switching Memristor" (2018). Electronic Theses and Dissertations, 2004-2019. 5988. https://stars.library.ucf.edu/etd/5988 ABSTRACT: With the ever-increasing demand for low power electronics, neuromorphic computing has garnered huge interest in recent times. Implementing neuromorphic computing in hardware will be a severe boost for applications involving complex processes such as pattern recognition. Artificial neurons form a critical part in neuromorphic circuits, and have been realized with complex complementary metal–oxide–semiconductor (CMOS) circuitry in the past. Recently, insulator-to-metal-transition (IMT) materials have been used to realize artificial neurons.
  • Using the Modified Back-Propagation Algorithm to Perform Automated Downlink Analysis by Nancy Y
    Using The Modified Back-propagation Algorithm To Perform Automated Downlink Analysis, by Nancy Y. Xiao. Submitted to the Department of Electrical Engineering and Computer Science on May 28, 1996, in partial fulfillment of the requirements for the degrees of Bachelor of Science and Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, June 1996. Copyright Nancy Y. Xiao, 1996. All rights reserved. The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis document in whole or in part, and to grant others the right to do so. Certified by: [...]rnard C. Lesieutre, Thesis Supervisor. Accepted by: F.R. Morgenthaler, Chairman, Departmental Committee on Graduate Students. Abstract: A multi-layer neural network computer program was developed to perform supervised learning tasks. The weights in the neural network were found using the back-propagation algorithm. Several modifications to this algorithm were also implemented to accelerate error convergence and optimize search procedures. This neural network was used mainly to perform pattern recognition tasks. As an initial test of the system, a controlled classical pattern recognition experiment was conducted using the X and Y coordinates of the data points from two to five possibly overlapping Gaussian distributions, each with a different mean and variance.
  • Deep Multiphysics and Particle–Neuron Duality: a Computational Framework Coupling (Discrete) Multiphysics and Deep Learning
    applied sciences, Article. Deep Multiphysics and Particle–Neuron Duality: A Computational Framework Coupling (Discrete) Multiphysics and Deep Learning. Alessio Alexiadis, School of Chemical Engineering, University of Birmingham, Birmingham B15 2TT, UK; Tel.: +44-(0)-121-414-5305. Received: 11 November 2019; Accepted: 6 December 2019; Published: 9 December 2019. Featured Application: Coupling First-Principle Modelling with Artificial Intelligence. Abstract: There are two common ways of coupling first-principles modelling and machine learning. In one case, data are transferred from the machine-learning algorithm to the first-principles model; in the other, from the first-principles model to the machine-learning algorithm. In both cases, the coupling is in series: the two components remain distinct, and data generated by one model are subsequently fed into the other. Several modelling problems, however, require in-parallel coupling, where the first-principles model and the machine-learning algorithm work together at the same time rather than one after the other. This study introduces deep multiphysics, a computational framework that couples first-principles modelling and machine learning in parallel rather than in series. Deep multiphysics works with particle-based first-principles modelling techniques. It is shown that the mathematical algorithms behind several particle methods and artificial neural networks are similar to the point that they can be unified under the notion of particle–neuron duality. This study explains in detail the particle–neuron duality and how deep multiphysics works both theoretically and in practice. A case study, the design of a microfluidic device for separating cell populations with different levels of stiffness, is discussed to achieve this aim.
  • Neural Network Architectures and Activation Functions: a Gaussian Process Approach
    Fakultät für Informatik. Neural Network Architectures and Activation Functions: A Gaussian Process Approach. Sebastian Urban. Complete reprint of the dissertation approved by the Department of Informatics (Fakultät für Informatik) of the Technische Universität München for the award of the academic degree of Doctor of Natural Sciences (Dr. rer. nat.). Chair: Prof. Dr. rer. nat. Stephan Günnemann. Examiners of the dissertation: 1. Prof. Dr. Patrick van der Smagt 2. Prof. Dr. rer. nat. Daniel Cremers 3. Prof. Dr. Bernd Bischl, Ludwig-Maximilians-Universität München. The dissertation was submitted to the Technische Universität München on 22.11.2017 and accepted by the Department of Informatics on 14.05.2018. Technical University Munich, 2017. Abstract: The success of applying neural networks crucially depends on the network architecture being appropriate for the task. Determining the right architecture is a computationally intensive process, requiring many trials with different candidate architectures. We show that the neural activation function, if allowed to individually change for each neuron, can implicitly control many aspects of the network architecture, such as effective number of layers, effective number of neurons in a layer, skip connections and whether a neuron is additive or multiplicative. Motivated by this observation we propose stochastic, non-parametric activation functions that are fully learnable and individual to each neuron. Complexity and the risk of overfitting are controlled by placing a Gaussian process prior over these functions. The result is the Gaussian process neuron, a probabilistic unit that can be used as the basic building block for probabilistic graphical models that resemble the structure of neural networks.
  • A Connectionist Model Based on Physiological Properties of the Neuron
    Proceedings of the International Joint Conference IBERAMIA/SBIA/SBRN 2006 - 1st Workshop on Computational Intelligence (WCI’2006), Ribeirão Preto, Brazil, October 23–28, 2006. CD-ROM. ISBN 85-87837-11-7. A Connectionist Model based on Physiological Properties of the Neuron. Alberione Braz da Silva and João Luís Garcia Rosa, Ceatec, PUC-Campinas, Campinas, SP, Brazil. Abstract: Recent research on artificial neural network models regards as relevant a new characteristic called biological plausibility or realism. Nevertheless, there is no agreement about this new feature, so some researchers develop their own visions. Two of these are highlighted here: the first is related directly to the cerebral cortex biological structure, and the second focuses on the neural features and the signaling between neurons. The proposed model departs from the previous existing system models, adopting the standpoint that a biologically plausible artificial neural network aims to create a more faithful model concerning the biological structure, properties, and functionalities of the cerebral cortex, not disregarding its computational efficiency. 2 Intraneuron Signaling. 2.1 The biological neuron and the intraneuron signaling. The intraneuron signaling is based on the principle of dynamic polarization, proposed by Ramón y Cajal, which establishes that “electric signals inside a nervous cell flow only in a direction: from neuron reception (often the dendrites and cell body) to the axon trigger zone” [5]. Based on physiological evidence of the principle of dynamic polarization, the signaling inside the neuron is performed by four basic elements: Receptive, responsible for input signals; Trigger, responsible for neuron activation
  • Artificial Neural Networks· T\ Briet In!Toquclion
    GENERAL I ARTICLE Artificial Neural Networks· t\ BrieT In!TOQuclion Jitendra R Raol and Sunilkumar S Mankame Artificial neural networks are 'biologically' inspired net­ works. They have the ability to learn from empirical datal information. They find use in computer science and control engineering fields. In recent years artificial neural networks (ANNs) have fascinated scientists and engineers all over the world. They have the ability J R Raol is a scientist with to learn and recall - the main functions of the (human) brain. A NAL. His research major reason for this fascination is that ANNs are 'biologically' interests are parameter inspired. They have the apparent ability to imitate the brain's estimation, neural networks, fuzzy systems, activity to make decisions and draw conclusions when presented genetic algorithms and with complex and noisy information. However there are vast their applications to differences between biological neural networks (BNNs) of the aerospace problems. He brain and ANN s. writes poems in English. A thorough understanding of biologically derived NNs requires knowledge from other sciences: biology, mathematics and artifi­ cial intelligence. However to understand the basics of ANNs, a knowledge of neurobiology is not necessary. Yet, it is a good idea to understand how ANNs have been derived from real biological neural systems (see Figures 1,2 and the accompanying boxes). The soma of the cell body receives inputs from other neurons via Sunilkumar S Mankame is adaptive synaptic connections to the dendrites and when a neuron a graduate research student from Regional is excited, the nerve impulses from the soma are transmitted along Engineering College, an axon to the synapses of other neurons.
  • Artificial Neuron Using Vertical Mos2/Graphene Threshold
    www.nature.com/scientificreports, OPEN. Artificial Neuron using Vertical MoS2/Graphene Threshold Switching Memristors. Hirokjyoti Kalita, Adithi Krishnaprasad, Nitin Choudhary, Sonali Das, Durjoy Dev, Yi Ding, Laurene Tetard, Hee-Suk Chung, Yeonwoong Jung & Tania Roy. Received: 11 June 2018; Accepted: 6 November 2018; Published: xx xx xxxx. With the ever-increasing demand for low power electronics, neuromorphic computing has garnered huge interest in recent times. Implementing neuromorphic computing in hardware will be a severe boost for applications involving complex processes such as image processing and pattern recognition. Artificial neurons form a critical part in neuromorphic circuits, and have been realized with complex complementary metal–oxide–semiconductor (CMOS) circuitry in the past. Recently, metal-insulator-transition materials have been used to realize artificial neurons. Although memristors have been implemented to realize synaptic behavior, not much work has been reported regarding the neuronal response achieved with these devices. In this work, we use the volatile threshold switching behavior of a vertical-MoS2/graphene van der Waals heterojunction system to produce the integrate-and-fire response of a neuron. We use large area chemical vapor deposited (CVD) graphene and MoS2, enabling large scale realization of these devices. These devices can emulate the most vital properties of a neuron, including the all or nothing spiking, the threshold driven spiking of the action potential, the post-firing refractory period of a neuron and strength modulated frequency response. These results show that the developed artificial neuron can play a crucial role in neuromorphic computing. Understanding the complex and overwhelming functioning of the human brain has always fascinated scientists and researchers for biological fields and beyond.
  • Artificial Neuron-Glia Networks Learning Approach Based on Cooperative Coevolution
    ARTIFICIAL NEURON-GLIA NETWORKS LEARNING APPROACH BASED ON COOPERATIVE COEVOLUTION. PABLO MESEJO, Department of Information Engineering, University of Parma, Parma, 43124, Italy; ISIT-UMR 6284 CNRS, University of Auvergne, Clermont-Ferrand, 63000, France. OSCAR IBÁÑEZ, European Centre for Soft Computing, Mieres, 33600, Spain; Dept. of Computer Science and Artificial Intelligence (DECSAI), University of Granada, Granada, 18071, Spain. ENRIQUE FERNÁNDEZ-BLANCO, FRANCISCO CEDRÓN, ALEJANDRO PAZOS, and ANA B. PORTO-PAZOS, Department of Information and Communications Technologies, University of A Coruña, A Coruña, 15071, Spain. Artificial Neuron-Glia Networks (ANGNs) are a novel bio-inspired machine learning approach. They extend classical Artificial Neural Networks (ANNs) by incorporating recent findings and suppositions about the way information is processed by neural and astrocytic networks in the most evolved living organisms. Although ANGNs are not a consolidated method, their performance against the traditional approach, i.e. without artificial astrocytes, was already demonstrated on classification problems. However, the corresponding learning algorithms developed so far strongly depend on a set of glial parameters which are manually tuned for each specific problem. As a consequence, previous experimental tests have to be done in order to determine an adequate set of values, making such manual parameter configuration time-consuming, error-prone, biased and problem dependent. Thus, in this article, we propose a novel learning approach for ANGNs that fully automates the learning process, and gives the possibility of testing any kind of reasonable parameter configuration for each specific problem.
  • Art in Machine Learning
    Creativity in Machine Learning. Martin Thoma. Abstract: Recent machine learning techniques can be modified to produce creative results. Those results did not exist before; it is not a trivial combination of the data which was fed into the machine learning system. The obtained results come in multiple forms: As images, as text and as audio. This paper gives a high level overview of how they are created and gives some examples. It is meant to be a summary of the current work and give people who are new to machine learning some starting points. [Fig. 1: Neural networks are based on simple units which get combined to complex networks. (a) Example of an artificial neuron unit. $x_i$ are the input signals and $w_i$ are weights which have to get learned. Each input signal gets multiplied with its weight, everything gets summed up and the activation function $\varphi$ is applied. (b) A visualization of a simple feed-forward neural network. The 5 input nodes are red, the 2 bias nodes are gray, the 3 hidden units are green and the single output node is blue.] I. INTRODUCTION. According to [Gad06] creativity is “the ability to use your imagination to produce new ideas, make things etc.” and imagination is “the ability to form pictures or ideas in your [...] This means that machine learning programs adjust internal parameters to fit the data they are given. Those computer programs are still developed by software developers, but the developer writes them in a way which makes it possible to adjust them without having to re-program everything.
  • Hybrid Memristor-CMOS Neurons for in Situ Learning in Fully Hardware Memristive Spiking Neural Networks
    Hybrid memristor-CMOS neurons for in situ learning in fully hardware memristive spiking neural networks. Xumeng Zhang (Fudan University), Jian Lu (Institute of Microelectronics of the Chinese Academy of Science), Rui Wang (Chinese Academy of Sciences), Jinsong Wei (Institute of Microelectronics of the Chinese Academy of Science), Tuo Shi (Chinese Academy of Sciences), Chunmeng Dou (Institute of Microelectronics of the Chinese Academy of Science), Zuheng Wu (Institute of Microelectronics), Dashan Shang (Chinese Academy of Sciences, https://orcid.org/0000-0003-3573-8390), Guozhong Xing (Institute of Microelectronics of the Chinese Academy of Science), Qi Liu (Fudan University), Ming Liu (Key Laboratory of Microelectronics Devices and Integrated Technology, Institute of Microelectronics of the Chinese Academy of Science). Article. Keywords: Spiking neural network, memristor-based neurons, hybrid memristor-CMOS neurons. Posted Date: July 27th, 2020. DOI: https://doi.org/10.21203/rs.3.rs-43977/v1 License: This work is licensed under a Creative Commons Attribution 4.0 International License. Version of Record: A version of this preprint was published at Science Bulletin on April 1st, 2021. See the published version at https://doi.org/10.1016/j.scib.2021.04.014. Author affiliations as given on the title page: 1 Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; 2 Key Laboratory of Microelectronic Devices & Integrated Technology, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China; 3 University of Chinese Academy of Sciences, Beijing 100049, China; 4 Zhejiang Laboratory, Hangzhou 311122, China.
  • Chapter 10: Artificial Neural Networks
    Chapter 10: Artificial Neural Networks. Dr. Xudong Liu, Assistant Professor, School of Computing, University of North Florida. Monday, 9/30/2019

    Overview
    1 Artificial neuron: linear threshold unit (LTU)
    2 Perceptron
    3 Multi-Layer Perceptron (MLP), Deep Neural Networks (DNN)
    4 Learning ANN's: Error Backpropagation

    Some Preliminaries: Differential Calculus
    In this course, I shall stick to Leibniz's notation.
    The derivative of a function at an input point, when it exists, is the slope of the tangent line to the graph of the function. Let $y = f(x) = x^2 + 2x + 1$. Then, $f'(x) = \frac{dy}{dx} = 2x + 2$.
    A partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant. Let $z = f(x, y) = x^2 + xy + y^2$. Then, $\frac{\partial z}{\partial x} = 2x + y$.
    The chain rule is a formula for computing the derivative of the composition of two or more functions. Let $z = f(y)$ and $y = g(x)$. Then, $\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx} = f'(y) \cdot g'(x)$.
    Nice thing about the sigmoid function $f(x) = \frac{1}{1 + e^{-x}}$: $f'(x) = f(x) \cdot (1 - f(x))$.

    Artificial Neuron
    Artificial neuron, also called linear threshold unit (LTU), by McCulloch and Pitts, 1943: with one or more numeric inputs, it produces a weighted sum of them, applies an activation function, and outputs the result. Common activation functions: step function and sigmoid function.

    Linear Threshold Unit (LTU)
    Below is an LTU with the activation function being the step function.
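    The preliminaries above lend themselves to a quick numerical check. The following Python sketch is only an illustration under stated assumptions: the helper names are hypothetical, derivatives are approximated with a central difference, and the step function is assumed to fire when the weighted sum reaches the threshold (the snippet does not state the convention).

```python
import math


def sigmoid(x):
    """Sigmoid f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Closed form from the snippet: f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of df/dx (illustrative helper)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def ltu(weights, inputs, threshold=0.0):
    """Linear threshold unit: weighted sum followed by a step function.

    Assumed convention: output 1 when the sum is >= threshold, else 0.
    """
    z = sum(w * x for w, x in zip(weights, inputs))
    return 1 if z >= threshold else 0


# f(x) = x^2 + 2x + 1 has derivative 2x + 2; compare at x = 3.
poly_gap = abs(numerical_derivative(lambda x: x * x + 2 * x + 1, 3.0) - (2 * 3.0 + 2))

# Largest difference between the closed form and the numerical derivative.
sigmoid_gap = max(
    abs(sigmoid_prime(x) - numerical_derivative(sigmoid, x))
    for x in (-2.0, 0.0, 0.5, 3.0)
)

print("polynomial derivative gap:", poly_gap)
print("sigmoid derivative gap:", sigmoid_gap)
print("LTU outputs:", ltu([1.0, 1.0], [1, 0], threshold=0.5), ltu([1.0, 1.0], [0, 0], threshold=0.5))
```

    Both gaps should be tiny compared with the derivative values themselves, which is what makes the closed form $f'(x) = f(x)(1 - f(x))$ convenient when computing gradients for backpropagation.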