Neural Networks I

Total Page:16

File Type:pdf, Size:1020Kb

Neural Networks I Machine Learning { winter term 2016/17 { Chapter 07: Neural Networks I Prof. Adrian Ulges Masters \Computer Science" DCSM Department University of Applied Sciences RheinMain 1 14. November 2016 Resources for the next Chapters Online Book 1: The nice one I Nielsen: \Neural Networks and Deep Learning" http://neuralnetworksanddeeplearning.com Online Book 2: The tough one I Goodfellow, Bengio, Courville: \Deep learning" https://www.deeplearningbook.org 2 (Artificial) Neural Networks.... image: [1] I ... are one of the core topics of artificial intelligence (AI) I ... are biologically inspired: They are mathematical models simulating the process of \thinking" in living organisms' brains I ... are graphs (or networks) of interlinked atomic processing units (neurons) I ... are not programmed but define their own behavior entirely through the graph's links and weights. These are adjusted by training. I ... are covered by two research fields 1. computational neuroscience (goal: better understanding of biological processes) 2. machine learning / AI (here) (goal: solving practical data analysis problems) 3 Neural Networks.... I ... have a turbulent history 1940 1950 1960 1970 1980 1990 2000 2010 2015 I ... exist in many variants I multi-layer perceptron (in this lecture) I convolutional neural networks (in this lecture) I Boltzmann machines I RBF networks I recurrent neural networks I LSTM networks I ... I ... are the most intensely machine learning model these days (\deep learning") 4 Deep Learning Applications images from [5] [6] [2] [4] [7] 5 Neural Networks: Biological Inspiration Neurons are brain cells that send and receive electrical impulses. Neurons: Terminology I Cell body (soma): soma 1 center of the neuron, with electrical axon a diameter of 5 − 100µm impulse soma 3 synapse I axon: thin, long (up to 1m) axon dendrites nerve fibre, splits and dendrites dendrites transmits impulses to other neurons I dendrites: small, soma 2 branch-like outgrowths that axon synapse receive impulses and axon transmit them to the soma pre-synaptic post-synaptic membrane membrane I synapses: electro-chemical link between two neurons. Transfers signal from axon dentrite to dendrites via transmitter substances. 6 From Biological to Artificial Neural Networks I Biological neural networks are (much) more complex than artificial ones - connectivity: 1011 neurons, each connected with ≈ 7; 000 others - adaptivity: number and connections of neurons change over time I Biological neural networks are asynchronous and compute with rather low frequency (1 KHz) I The circuit layout is unknown Learning I Learning happens at synapses, by changing the transmitter dose - sensitization: more transmitter, enhancing the signal - desensitization: less transmitter, supressing the signal Artificial Neural Networks X1 w1 θ I weighted sum + activation function X y 2 Σ w2 I learning happens by adapting weights wm I memory happens by storing weights Xm 7 Outline 1. Neurons 2. Training Neurons 3. The Capacity of Neurons 4. Neural Networks 8 The McP-Neuron I The historically oldest (1943) neuron by McCulloch & Pitts I strongly simplified model of biological neurons I no learning yet Definition I input d x = (x1; :::; xd ) 2 f0; 1g I output y 2 f0; 1g I weights w = d (w1; :::; wd ) 2 {−1; 1g I wj = 1: stimulating X (="anregend") 1 w1 θ y I wj = −1: inhibitory X 2 w Σ (= \hemmend") 2 X wm I threshold θ 2 R m 9 The McP-Neuron Computational Model I compute the scalar product between input signal x and weights w, namely w · x I apply an activation function f (here, a step function), obtaining the output y y = f (w · x) 1 if w · x ≥ θ = 0 else f(x·w) I The McP-Neuron is a function φw,θ : f0; 1gd ! f0; 1g with parameters w; θ If φw,θ(x) = 1, we say the x·w I θ neuron \is activated", or the neuron “fires” 10 The McP-Neuron x → w McP-Neurons: Graphical Representation 1 1 . I input x . I weights w → w xm m I threshold θ θ ↓ y What can McP-Neurons do? I We can model logic gates using McP-neurons x → x → → 1 -1 1 1 x1 1 -0,5 x → → 2 1 x2 1 ↓ y 2 1 ↓ y ↓ y NOT AND OR 11 From McP-Neurons to the Perceptron McP-Neurons: Limitations I only boolean inputs/weights/outputs I only binary activation functions I no learning Extensions: The Perceptron d I real-valued weights and inputs w; x 2 R I We will introduce a learning algorithm, the Delta-rule 12 Perceptron: Graphical Interpretation I The parameters w; θ define a hyperplane I w · x ≥ θ: x is on the positive side I w · x < θ: x is on the negative side hy pe w rp la ne θ / ||w|| positive side negative side 13 Outline 1. Neurons 2. Training Neurons 3. The Capacity of Neurons 4. Neural Networks 14 Perceptron Training: Basics w,θ I A neuron's function (or behavior) φ is determined by the parameters w and θ I Given labeled training data: How do we find the best parameters / the best hyperplane? hyperplane 2 0 (w , θ ) positive 2 2 0 0 side 0 1 0 1 1 1 negative 0 0 0 0 side 1 1 0 0 1 1 0 0 1 1 0 1 0 1 0 0 1 1 0 0 1 1 0 0 0 1 1 1 0 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 negative 1 1 side hyperplane 1 positive θ (w1, 1) side bad parameters: good parameters: error rate 12 / 30 = 40% error rate 0% 15 Perceptron Training: Vector Augmentation Before learning, we simplify notation using a → w x1 1 simple trick called vector augmentation: . Idea . I omit the additional threshold parameter → w θ and include it in the weight vector xd d θ So far... ↓ y I x := (x1; :::; xd ); w := (w1; :::; wd ) I fire if w · x ≥ θ (or w · x − θ ≥ 0) → w x1 1 Now . I x := (x1; :::; xd ; 1); w := (w1; :::; wd ; −θ) . I fire if w · x ≥ 0 x → w Remark d d 1 → −θ This is just a change of notation (θ is not 0 written separately any more), not of behavior! ↓ y 16 Perceptron Training Learning happens by an iterative optimization. We start with random values for w and θ, and in each iteration we ... I ... classify a training sample x and compare the result φ(x) with the targeted result y I If necessary, we correct the neuron's parameters such that the output φ(x) is adapted towards the desired output y wnew ← wold + ∆w x1 w 1 no 2 φ(x) x θ w Σ =? 2 yes w xd d y 17 Perceptron Training: The Delta Rule How do we compute the 'right' update ∆w? There are different strategies. The most famous one is the Delta Rule: 4w = λ · y − φ(x) · x Remarks I We call λ the learning rate I Note: If xi is classified correctly, ∆w = 0 ! no update 1 function train perceptron(x1; :::; xn; y1; :::; yn; λ): 2 initialize w randomly 3 until convergence: 4 choose a random training sample (x; y) (with y 2 f0; 1g) 5 compute φ(x) // classify x 6 if φ(x) <> y: 7 w := w + λ · y − φ(x) · x | {z } ∆w 8 18 The Delta Rule: Motivation 4w = λ · y − φ(x) · x I Why does the Delta rule work? I We illustrate what happens when a training sample is misclassified ... hy pe rp la ne w w ∆w (=0.2·x) wnew =0.2 x 1 1 19 The Delta Rule: Motivation Formal Motivation 20 The Delta Rule: Motivation Formal Motivation 21 The Delta Rule: Code Example 22 The Delta Rule: Code Example iteration 0 10 8 6 4 2 0 0 2 4 6 8 10 23 The Delta Rule: Code Example iteration 1 10 8 6 4 2 0 0 2 4 6 8 10 24 The Delta Rule: Code Example iteration 2 10 8 6 4 2 0 0 2 4 6 8 10 25 The Delta Rule: Code Example iteration 1000 10 8 6 4 2 0 0 2 4 6 8 10 26 Outline 1. Neurons 2. Training Neurons 3. The Capacity of Neurons 4. Neural Networks 27 An Important Question... d Can a single perceptron learn any Function f : R ! {−1; 1g? 28 Achieving Non-Linearity by connecting Neurons We can realize XOR by connecting multiple neurons! x2 neuron 1 neuron 2 1 0 x1 1 1 n e x2 1 1 u ro n 0,5 1,5 2 0 1 n x φ φ e 1 1(x1,x2) 2(x1,x2) u ro n 1 1 -1 0,5 φ φ φ (x ,x ) 3( 1(x1,x2), 2 1 2 neuron 3 φ (x ,x ) ) 2 1 2 0 φ φ φ φ φ ⊗ x1 x2 (x ,x ) 2(x ,x ) 3( (x ,x ), 2(x ,x )) x x 3 1 1 2 1 2 1 1 2 1 2 1 2 n ro u 0 0 0 0 0 0 e n 1 0 1 1 0 1 1 0 1 1 0 1 0 1 1 φ 1(x1,x2) 1 1 1 1 0 0 29 neuron 1 neuron 2 x 1 1 Achieving Non-Linearity 1 x2 1 1 Remarks 0,5 1,5 (x ,x ) I This is only possible because 1(x1,x2) 1 2 of the neurons' non-linear 1 -1 0,5 ( 1(x1,x2), activation functions! neuron 3 (x1,x2) ) - with activation functions 00 00 0 0 y = f (w1 · f (w1x1 + w2x2) + w2 · f (w1x1 + w2x2)) - without activation functions 00 00 0 0 y = w1 · (w1x1 + w2x2) + w2 · (w1x1 + w2x2) 000 000 = w1 x1 + w2 x2 This is just a linear function (and does not solve XOR) 30 Achieving Non-Linearity by connecting Neurons Theorem (Learning Capacity of Networks of McP-Neurons) We can realize any boolean function f : f0; 1gd ! f0; 1g by connecting not more than 2d + 1 McP-neurons.
Recommended publications
  • ML-Based Interactive Data Visualization System for Diversity and Fairness Issues 1
    Jusub Kim : ML-based Interactive Data Visualization System for Diversity and Fairness Issues 1 https://doi.org/10.5392/IJoC.2019.15.4.001 ML-based Interactive Data Visualization System for Diversity and Fairness Issues Sey Min Department of Art & Technology Sogang University, Seoul, Republic of Korea Jusub Kim Department of Art & Technology Sogang University, Seoul, Republic of Korea ABSTRACT As the recent developments of artificial intelligence, particularly machine-learning, impact every aspect of society, they are also increasingly influencing creative fields manifested as new artistic tools and inspirational sources. However, as more artists integrate the technology into their creative works, the issues of diversity and fairness are also emerging in the AI-based creative practice. The data dependency of machine-learning algorithms can amplify the social injustice existing in the real world. In this paper, we present an interactive visualization system for raising the awareness of the diversity and fairness issues. Rather than resorting to education, campaign, or laws on those issues, we have developed a web & ML-based interactive data visualization system. By providing the interactive visual experience on the issues in interesting ways as the form of web content which anyone can access from anywhere, we strive to raise the public awareness of the issues and alleviate the important ethical problems. In this paper, we present the process of developing the ML-based interactive visualization system and discuss the results of this project. The proposed approach can be applied to other areas requiring attention to the issues. Key words: AI Art, Machine Learning, Data Visualization, Diversity, Fairness, Inclusiveness.
    [Show full text]
  • Building Intelligent Systems with Large Scale Deep Learning Jeff Dean Google Brain Team G.Co/Brain
    Building Intelligent Systems with Large Scale Deep Learning Jeff Dean Google Brain team g.co/brain Presenting the work of many people at Google Google Brain Team Mission: Make Machines Intelligent. Improve People’s Lives. How do we do this? ● Conduct long-term research (>200 papers, see g.co/brain & g.co/brain/papers) ○ Unsupervised learning of cats, Inception, word2vec, seq2seq, DeepDream, image captioning, neural translation, Magenta, ML for robotics control, healthcare, … ● Build and open-source systems like TensorFlow (see tensorflow.org and https://github.com/tensorflow/tensorflow) ● Collaborate with others at Google and Alphabet to get our work into the hands of billions of people (e.g., RankBrain for Google Search, GMail Smart Reply, Google Photos, Google speech recognition, Google Translate, Waymo, …) ● Train new researchers through internships and the Google Brain Residency program Main Research Areas ● General Machine Learning Algorithms and Techniques ● Computer Systems for Machine Learning ● Natural Language Understanding ● Perception ● Healthcare ● Robotics ● Music and Art Generation Main Research Areas ● General Machine Learning Algorithms and Techniques ● Computer Systems for Machine Learning ● Natural Language Understanding ● Perception ● Healthcare ● Robotics ● Music and Art Generation research.googleblog.com/2017/01 /the-google-brain-team-looking-ba ck-on.html 1980s and 1990s Accuracy neural networks other approaches Scale (data size, model size) 1980s and 1990s more Accuracy compute neural networks other approaches Scale
    [Show full text]
  • Audio Event Classification Using Deep Learning in an End-To-End Approach
    Audio Event Classification using Deep Learning in an End-to-End Approach Master thesis Jose Luis Diez Antich Aalborg University Copenhagen A. C. Meyers Vænge 15 2450 Copenhagen SV Denmark Title: Abstract: Audio Event Classification using Deep Learning in an End-to-End Approach The goal of the master thesis is to study the task of Sound Event Classification Participant(s): using Deep Neural Networks in an end- Jose Luis Diez Antich to-end approach. Sound Event Classifi- cation it is a multi-label classification problem of sound sources originated Supervisor(s): from everyday environments. An auto- Hendrik Purwins matic system for it would many applica- tions, for example, it could help users of hearing devices to understand their sur- Page Numbers: 38 roundings or enhance robot navigation systems. The end-to-end approach con- Date of Completion: sists in systems that learn directly from June 16, 2017 data, not from features, and it has been recently applied to audio and its results are remarkable. Even though the re- sults do not show an improvement over standard approaches, the contribution of this thesis is an exploration of deep learning architectures which can be use- ful to understand how networks process audio. The content of this report is freely available, but publication (with reference) may only be pursued due to agreement with the author. Contents 1 Introduction1 1.1 Scope of this work.............................2 2 Deep Learning3 2.1 Overview..................................3 2.2 Multilayer Perceptron...........................4
    [Show full text]
  • Bridging Reinforcement Learning and Creativity: Implementing Reinforcement Learning in Processing
    Bridging Reinforcement Learning and Creativity: Implementing Reinforcement Learning in Processing Jieliang Luo Media Arts & Technology Sam Green Computer Science University of California, Santa Barbara SIGGRAPH Asia 2018 Tokyo, Japan December 6th, 2018 Course Agenda • What is Reinforcement Learning (10 mins) ▪ Introduce the core concepts of reinforcement learning • A Brief Survey of Artworks in Deep Learning (5 mins) • Why Processing Community (5 mins) • A Q-Learning Algorithm (35 mins) ▪ Explain a fundamental reinforcement learning algorithm • Implementing Tabular Q-Learning in P5.js (40 mins) ▪ Discuss how to create a reinforcement learning environment ▪ Show how to implement the tabular q-learning algorithm in P5.js • Questions & Answers (5 mins) Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green Psychology Pictures Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green What is Reinforcement Learning • Branch of machine learning • Learns through trial & error, rewards & punishment • Draws from psychology, neuroscience, computer science, optimization Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green Reinforcement Learning Framework Agent Environment Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green https://www.youtube.com/watch?v=V1eYniJ0Rnk Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green https://www.youtube.com/watch?v=ZhsEKTo7V04 Bridging Reinforcement Learning and Creativity, Jieliang Luo & Sam Green Bridging Reinforcement
    [Show full text]
  • Learning Semantics of Raw Audio
    Learning Semantics of Raw Audio Davis Foote, Daylen Yang, Mostafa Rohaninejad Group name: “Audio Style Transfer” Introduction Variational Inference Numerically modeling semantics has given some useful exciting results in the We would like to have a latent space in which arithmetic operations domains of images [1] and natural language [2]. We would like to be able to correspond to semantic operations in the observed space. To that end, we operate on raw audio signals at a similar level of abstraction. Consider the adopt the following model, adapted from [1]: problem of trying to interpolate two voices into a voice that sounds “in We interpret the hidden variables � as the latent code. between” the two. Averaging the signals will result in the two voices There is a prior over latent codes �� � and a conditional speaking at once. If this operation were performed in a latent space in which distribution �� � � . In order to be able to encode a data dimensions have semantic meaning, then we would see results which match point, we also need to learn an approximation to the human perception of what this interpolation should mean. posterior distribution, �� � � . We can optimize all these We follow two distinct branches in pursuing this goal. First, motivated by parameters at once using the Auto-Encoding Variational aesthetic results in style transfer [3] and generation of novel styles [4], we Bayes algorithm [1]. Some very simple results (ours) of implement a deep classifier for various musical tags in hopes that the various sampling from this model trained on MNIST are shown: layer activations will encode meaningful high-level audio features.
    [Show full text]
  • Mechanisms of Artistic Creativity in Deep Learning Neural Networks
    Mechanisms of Artistic Creativity in Deep Learning Neural Networks Lonce Wyse Communication and New Media Department National University of Singapore Singapore [email protected] Abstract better be able to build systems that can reflect and com- The generative capabilities of deep learning neural net- municate about their behavior. works (DNNs) have been attracting increasing attention “Mechanisms” have a contested status in the field of for both the remarkable artifacts they produce, but also computational creativity. Fore some, there is a reluctance because of the vast conceptual difference between how to make the recognition of creativity depend on any partic- they are programmed and what they do. DNNs are ular mechanism, and focus is directed to artifacts over pro- “black boxes” where high-level behavior is not explicit- cess (Ritchie 2007). This at least avoids the phenomenon ly programed, but emerges from the complex interac- plaguing classically programmed AI systems identified by tions of thousands or millions of simple computational John McCarthy that, “as soon as it works, no one calls it AI elements. Their behavior is often described in anthro- any more.” Others (Colton 2008) argue that understanding pomorphic terms that can be misleading, seem magical, or stoke fears of an imminent singularity in which ma- the process by which an artifact in generated is important chines become “more” than human. for an assessment of creativity. In this paper, we examine 5 distinct behavioral char- With neural networks, complex behaviors emerge out of acteristics associated with creativity, and provide an ex- the simple interaction between potentially millions of units.
    [Show full text]
  • City Research Online
    City Research Online City, University of London Institutional Repository Citation: Ollero, J. and Child, C. H. T. (2018). Performance Enhancement of Deep Reinforcement Learning Networks using Feature Extraction. Lecture Notes in Computer Science, 10878, pp. 208-218. doi: 10.1007/978-3-319-92537-0_25 This is the accepted version of the paper. This version of the publication may differ from the final published version. Permanent repository link: https://openaccess.city.ac.uk/id/eprint/19526/ Link to published version: http://dx.doi.org/10.1007/978-3-319-92537-0_25 Copyright: City Research Online aims to make research outputs of City, University of London available to a wider audience. Copyright and Moral Rights remain with the author(s) and/or copyright holders. URLs from City Research Online may be freely distributed and linked to. Reuse: Copies of full items can be used for personal research or study, educational, or not-for-profit purposes without prior permission or charge. Provided that the authors, title and full bibliographic details are credited, a hyperlink and/or URL is given for the original metadata page and the content is not changed in any way. City Research Online: http://openaccess.city.ac.uk/ [email protected] Performance Enhancement of Deep Reinforcement Learning Networks using Feature Extraction Joaquin Ollero and Christopher Child City, University of London, United Kingdom Abstract. The combination of Deep Learning and Reinforcement Learning, termed Deep Reinforcement Learning Networks (DRLN), offers the possibility of us- ing a Deep Learning Neural Network to produce an approximate Reinforcement Learning value table that allows extraction of features from neurons in the hidden layers of the network.
    [Show full text]
  • Google Deep Dream
    Google Deep Dream Team Number: 12 Course: CSE352 Professor: Anita Wasilewska Presented by: Badr AlKhamissi Sameer Anand Kathryn Blecher Marolyn Liang Diego Santos Campo What Is Google Deep Dream? Deep Dream is a computer vision program created by Google. Uses a convolutional neural network to find and enhance patterns in images with powerful AI algorithms. Creating a dreamlike hallucinogenic appearance in the deliberately over-processed images. Base of Google Deep Dream Inception is fundamental base for Google Deep Dream and is introduced on ILSVRC in 2014. Deep convolutional neural network architecture that achieves the new state of the art for classification and detection. Improved utilization of the computing resources inside the network. Increased the depth and width of the network while keeping the computational budget constant of 1.5 billion multiply-adds at inference time. How Does Deep Dream Work? How Does Deep Dream Work? Deep Dream works on a Neural Network (NN) This is a type of computer system that can learn on its own. Neural networks are modeled after the functionality of the human brain, and tend to be particularly useful for pattern recognition. Biological Inspiration Biological Inspiration Convolutional Neural Network (CNN) Feed forward artificial neural network Inspired by the organization of the animal visual cortex (convolution operation) Designed to use minimal amounts of preprocessing Combine Kernel Convolution and Deep Learning Mostly used in image and video recognition, recommender systems and NLP Why Convolutional Networks? Curse of dimensionality Local connectivity Shared Weights CNN Architecture 5.1% Human Error Rate in Identifying Objects 4.94% Microsoft made it better than humans in 2014 3.46% Google inception-v3 model beat them in 2014 Digging Deeper Into The Neural Network Deep Dream’s Convolutional Neural Network must first be trained.
    [Show full text]
  • Art in Machine Learning
    1 Creativity in Machine Learning w0 x0 w1 x1 w2 Martin Thoma x2 Σ ' w3 E-Mail: [email protected] x3 . wn xn Abstract—Recent machine learning techniques can be modified (a) Example of an artificial neuron unit.(b) A visualization of a simple feed- to produce creative results. Those results did not exist before; it xi are the input signals and wi are forward neural network. The 5 in- is not a trivial combination of the data which was fed into the weights which have to get learned. put nodes are red, the 2 bias nodes machine learning system. The obtained results come in multiple Each input signal gets multiplied are gray, the 3 hidden units are forms: As images, as text and as audio. with its weight, everything gets green and the single output node summed up and the activation func- is blue. This paper gives a high level overview of how they are created tion ' is applied. and gives some examples. It is meant to be a summary of the current work and give people who are new to machine learning Fig. 1: Neural networks are based on simple units which get some starting points. combined to complex networks. I. INTRODUCTION This means that machine learning programs adjust internal parameters to fit the data they are given. Those computer According to [Gad06] creativity is “the ability to use your programs are still developed by software developers, but the imagination to produce new ideas, make things etc.” and developer writes them in a way which makes it possible to imagination is “the ability to form pictures or ideas in your adjust them without having to re-program everything.
    [Show full text]
  • Computer Vision and Computer Hallucinations
    A reprint from American Scientist the magazine of Sigma Xi, The Scientific Research Society This reprint is provided for personal and noncommercial use. For any other use, please send a request Brian Hayes by electronic mail to [email protected]. Computing Science Computer Vision and Computer Hallucinations A peek inside an artificial neural network reveals some pretty freaky images. Brian Hayes eople have an amazing knack images specially designed to fool the network proposes a label. If the choice for image recognition. We networks, much as optical illusions fool is incorrect, an error signal propagates can riffle through a stack of the biological eye and brain. Another backward through the layers, reducing pictures and almost instantly approach runs the neural network in the activation of the wrongly chosen Plabel each one: dog, birthday cake, bi- reverse; instead of giving it an image output neuron. The training process cycle, teapot. What we can’t do is ex- as input and asking for a concept as does not alter the wiring diagram of plain how we perform this feat. When output, we specify a concept and the the network or the internal operations you see a rose, certain neurons in your network generates a corresponding of the individual neurons. Instead, it brain’s visual cortex light up with ac- image. A related technique called deep adjusts the weight, or strength, of the tivity; a tulip stimulates a different set dreaming burst on the scene last spring connections between one neuron and of cells. What distinguishing features following a blog post from Google Re- the next.
    [Show full text]
  • Robust Or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks
    Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks Felipe A. Mejia1, Paul Gamble2, Zigfried Hampel-Arias1, Michael Lomnitz1, Nina Lopatina1, Lucas Tindall1, and Maria Alejandra Barrios2 1Lab41, In-Q-Tel, Menlo Park, USA 2Previously at Lab41, In-Q-Tel, Menlo Park, USA June 18, 2019 Abstract tomer photos. As we increasingly rely on machine learning, it is critical to assess the dangers of the Adversarial training was introduced as a way to im- vulnerabilities in these models. prove the robustness of deep learning models to ad- In a supervised learning environment several com- versarial attacks. This training method improves ro- ponents make up the learning pipeline of a machine bustness against adversarial attacks, but increases the learning model: collection of training data, definition models vulnerability to privacy attacks. In this work of model architecture, model training, and model out- we demonstrate how model inversion attacks, extract- puts { test and evaluation. This pipeline presents a ing training data directly from the model, previously broader attack surface, where adversaries can lever- thought to be intractable become feasible when at- age system vulnerabilities to compromise data privacy tacking a robustly trained model. The input space for and/or model performance. There are four main types a traditionally trained model is dominated by adver- of attacks discussed in literature: sarial examples - data points that strongly activate a certain class but lack semantic meaning - this makes • model fooling { small perturbation to the user it difficult to successfully conduct model inversion at- input leads to large changes in the model output tacks.
    [Show full text]
  • Lecture 9: Visualizing Cnns and Recurrent Neural Networks
    Lecture 9: Visualizing CNNs and Recurrent Neural Networks Tuesday February 28, 2017 * Original slides borrowed from Andrej Karpathy 1 and Li Fei-Fei, Stanford cs231n comp150dl Announcements! - HW #3 is out - Final Project proposals due this Thursday March 2 - Papers to read: Students should read all papers on the Schedule tab, and are encouraged to read as many papers as possible from the Papers tab. - Next paper: March 7 You Only Look Once: Unified, Real-Time Object Detection. If this paper seems too deep or confusing, look at Fast R-CNN, Faster R-CNN comp150dl 2 Python/Numpy of the Day • t-SNE (t-Distributed Stochastic Nearest Neighbor Embedding) • Scikit-Learn t-SNE • Examples of 2D Embedding Visualizations of MNIST dataset • Other Embedding functions in Scikit-Learn comp150dl 3 Visualizing CNN Behavior - How can we see what’s going on in a CNN? - Stuff we’ve already done: - Visualize the weights - Occlusion experiments — ex. Jason and Lisa’s AlexNet Occlusion Tests comp150dl 4 Visualizing CNN Behavior - How can we see what’s going on in a CNN? - Straightforward stuff to try in the future: - Visualize the representation space (e.g. with t-SNE) - Human experiment comparisons comp150dl 5 Visualizing CNN Behavior - How can we see what’s going on in a CNN? - More sophisticated approaches (HW #4) - Visualize patches that maximally activate neurons - Optimization over image approaches (optimization) - Deconv approaches (single backward pass) comp150dl 6 Deconv approaches - projecting backward from one neuron to see what is activating it 1. Feed image into net Q: how can we compute the gradient of any arbitrary neuron in the network w.r.t.
    [Show full text]