Artificial Intelligence. What Machines Can Learn and How to Implement a Machine Learning Project in Your Organization?

Total Pages: 16

File Type: pdf, Size: 1020 KB

Artificial Intelligence. What machines can learn and how to implement a Machine Learning project in your organization?
Antons Mislēvičs, [email protected]

Agenda
1. What can machines do today?
2. How does Machine Learning work?
3. How to implement a Machine Learning project?

What can machines do today?

Chess: IBM Deep Blue 3½ – Garry Kasparov 2½ (1997)
- Kasparov vs. Deep Blue, the rematch, 1997: https://www.research.ibm.com/deepblue/

Jeopardy!: IBM Watson beats champions (2011)
- Final Jeopardy! and the Future of Watson: http://www-03.ibm.com/marketing/br/watson/what-is-watson/the-future-of-watson.html

Go: Google AlphaGo 4 – Lee Sedol 1 (2016)
- AlphaGo: https://deepmind.com/alpha-go

Image recognition: error rates, human vs. machine
1. Traffic sign recognition (IJCNN 2011): human 1.16%, machine 0.54%
2. Handwritten digits (MNIST): human approx. 0.2%, machine 0.23% (2012)
- The German Traffic Sign Recognition Benchmark: http://benchmark.ini.rub.de/?section=gtsrb&subsection=results
- The MNIST database of handwritten digits: http://yann.lecun.com/exdb/mnist/

Large Scale Visual Recognition Challenge (ILSVRC), 2015 challenge:
- Object detection – 200 categories
- Object recognition – 1000 categories
- Object detection from video – 30 categories
- Scene classification – 401 categories
- Large Scale Visual Recognition Challenge 2015 (ILSVRC2015): http://image-net.org/challenges/LSVRC/2015/index#maincomp

Microsoft Research team: "To our knowledge, our result is the first to surpass human-level performance… on this visual recognition challenge"
- Large Scale Visual Recognition Challenge 2015 – Results: http://image-net.org/challenges/LSVRC/2015/results
- Microsoft Researchers' Algorithm Sets ImageNet Challenge Milestone, 2015: https://www.microsoft.com/en-us/research/microsoft-researchers-algorithm-sets-imagenet-challenge-milestone/

Machines can understand the meaning…
- Show and Tell: A Neural Image Caption Generator, O. Vinyals, A. Toshev, S. Bengio, D. Erhan, 2015: http://arxiv.org/abs/1411.4555v2
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning, J. Johnson, A. Karpathy, L. Fei-Fei, 2015: http://arxiv.org/abs/1511.07571
- Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, R. Kiros, R. Salakhutdinov, R. S. Zemel, 2014: http://arxiv.org/abs/1411.2539
- https://clarifai.com/demo

Self-driving cars
- Google Self-Driving Car Project – How it drives: https://www.google.com/selfdrivingcar/how/
- Autopilot Full Self-Driving Hardware (Neighborhood Long), Tesla Motors: https://vimeo.com/192179727

Machines get creative…
1. Reproduce artistic style [1, 2, 3]
2. The Next Rembrandt [4]
3. Complete images [5]
4. Enhance images [6]
References:
1. A Neural Algorithm of Artistic Style, 2015: https://arxiv.org/abs/1508.06576
2. Supercharging Style Transfer, 2016: https://research.googleblog.com/2016/10/supercharging-style-transfer.html
3. Neural Doodle, 2016: https://github.com/alexjc/neural-doodle
4. The Next Rembrandt: https://www.nextrembrandt.com/
5. Image Completion with Deep Learning in TensorFlow, 2016: https://bamos.github.io/2016/08/09/deep-completion/
6. Neural Enhance, 2016: https://github.com/alexjc/neural-enhance

Text to speech and voice recognition…
1. Talk – text to speech (WaveNet) [1, 2]
2. Recognize voice [3, 4]
References:
1. WaveNet: A Generative Model for Raw Audio: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
2. WaveNet: A Generative Model for Raw Audio, 2016: https://arxiv.org/abs/1609.03499
3. Historic Achievement: Microsoft researchers reach human parity in conversational speech recognition: https://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition/
4. Achieving Human Parity in Conversational Speech Recognition, 2016: http://arxiv.org/abs/1610.05256

What else can machines do?
1. Compose music [1]
2. Generate handwriting [2, 3]
3. Translate texts [4, 5]
References:
1. Composing Music With Recurrent Neural Networks: http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
2. Generating Sequences With Recurrent Neural Networks, A. Graves, 2014: http://arxiv.org/abs/1308.0850
3. Alex Graves's RNN handwriting generation demo: http://www.cs.toronto.edu/~graves/handwriting.html
4. University of Montreal, Lisa Lab, Neural Machine Translation demo: http://lisa.iro.umontreal.ca/mt-demo
5. Fully Character-Level Neural Machine Translation without Explicit Segmentation, J. Lee, K. Cho, T. Hofmann, 2016: http://arxiv.org/abs/1610.03017

Playing Atari
- Playing Atari with Deep Reinforcement Learning, 2013: https://arxiv.org/abs/1312.5602
- Google DeepMind's Deep Q-learning playing Atari Breakout: https://www.youtube.com/watch?v=V1eYniJ0Rnk

Machine Learning is transforming businesses…
- Predictive Maintenance for Aircraft Engines
- Rolls-Royce and Microsoft collaborate to create new digital capabilities: https://www.youtube.com/watch?v=B3CZXp-RK0g

How does Machine Learning work?

Machine Learning process
- Introducing Azure Machine Learning, D. Chappell, 2015: http://www.davidchappell.com/writing/white_papers/Introducing-Azure-ML-v1.0--Chappell.pdf

Machine Learning questions
1. How much / how many? – Regression
2. Which category? – Classification
3. Which groups? – Clustering
4. Is it weird? – Anomaly detection

Regression: how much / how many?
Housing prices by square feet (the same data set is reused on the next few slides; the scatter-plot axis values are omitted here):

  Price ($)    Square feet
  125,999        950
  207,190      1,125
  227,555      1,400
  319,010      1,750
  345,846      1,525
  350,000      1,690
  437,301      2,120
  450,999      2,500
  605,000      3,010
  641,370      3,250
  824,280      3,600
  1,092,640    3,700
  1,187,550    4,500

Output variable: price. Input variable / feature: square feet.

Housing prices hypothesis: fit a hypothesis (a model) that maps square feet to price.

Using the model to predict a house price: for example, what price should a 2,700-square-foot house that is not in the training data sell for?

Prediction "errors" & improving models: measure how far the hypothesis is from the training data with a cost function (squared-error function), then adjust the model to reduce it. A minimal sketch of fitting and evaluating such a model follows.
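To make the regression slides concrete, here is a minimal sketch in Python (an assumption – the deck does not prescribe a language or library). It fits a straight-line hypothesis to the housing table above by minimizing the squared-error cost function and predicts the price of the 2,700 sq. ft. house.

```python
import numpy as np

# Housing data from the slides: square feet (feature) and price (target).
sqft  = np.array([950, 1125, 1400, 1750, 1525, 1690, 2120,
                  2500, 3010, 3250, 3600, 3700, 4500], dtype=float)
price = np.array([125_999, 207_190, 227_555, 319_010, 345_846, 350_000,
                  437_301, 450_999, 605_000, 641_370, 824_280,
                  1_092_640, 1_187_550], dtype=float)

# Hypothesis: price ≈ theta0 + theta1 * sqft (a straight line).
# np.polyfit solves the least-squares problem, i.e. it minimizes the
# squared-error cost function mentioned on the slide.
theta1, theta0 = np.polyfit(sqft, price, deg=1)

# Squared-error cost of the fitted hypothesis on the training data.
predictions = theta0 + theta1 * sqft
cost = np.mean((predictions - price) ** 2) / 2.0

# Use the model to predict the price of the 2,700 sq. ft. house (the "???" row).
print("predicted price for 2700 sq ft:", theta0 + theta1 * 2700)
print("training cost (mean squared error / 2):", cost)
```

The same pattern extends to the multi-feature table on the "use more variables" slide by replacing the single square-feet column with a feature matrix.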
Try a different algorithm: the same housing data can also be fitted with a different (for example non-linear) hypothesis.

Get more data / use more variables (features):

  Price ($)    Square feet  # Bedrooms  # Bathrooms  Fireplaces  Garage size  Floors
  125,999        950          1           1            0           0            1
  207,190      1,125          1           1            0           1            1
  227,555      1,400          2           1.5          1           2            1
  319,010      1,750          2           1.5          0           2            2
  345,846      1,525          3           2            1           2            1
  350,000      1,690          3           1.5          1           2            1.5
  437,301      2,120          3           2.5          2           3            2
  450,999      2,500          3           2.5          1           2            1.5
  605,000      3,010          4           2.5          2           3            2
  641,370      3,250          3           3            1           3            2
  824,280      3,600          3           3            2           3            2
  1,092,640    3,700          5           4.5          2           3            2
  1,187,550    4,500          6           6            4           5            2

Classification: which category?

Classification – 2 classes
Input variables / features: years driving, age. Output variable: class.

  Years driving   Age   Class
  5               65    1
  7               70    1
  2               68    1
  25              45    2
  25              55    2
  20              50    2
  5               25    1
  3               22    1
  8               30    1
  15              35    2
  …               …     …
  12              38    ???

A hypothesis / classifier separates the two classes and is used to predict the class of the new example (12 years driving, age 38).
- WALL·E (2008): http://www.imdb.com/title/tt0910970/

Classification – more than 2 classes
1. Input variables / features: years driving, age
2. Output variable: class (Yellow, Green, Blue)
3. One-vs-rest approach: train a classifier for each class (Classifier 1, 2, 3), then select the class whose classifier returns the highest confidence score (a minimal sketch follows at the end of this outline).

- https://quickdraw.withgoogle.com
- http://playground.tensorflow.org/

Why is Machine Learning developing rapidly?
1. Data
2. Algorithms – same approach in different domains (deep neural networks)
3. …
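To make the one-vs-rest approach from the classification slides concrete, here is the minimal sketch referred to above. The choice of logistic regression as the per-class binary classifier is an illustrative assumption (the slides do not name an algorithm); with only the two classes from the table the loop trains two binary classifiers, and the same code handles three or more classes (e.g. Yellow, Green, Blue) unchanged.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data from the slide: [years driving, age] -> class label.
X = np.array([[5, 65], [7, 70], [2, 68], [25, 45], [25, 55],
              [20, 50], [5, 25], [3, 22], [8, 30], [15, 35]])
y = np.array([1, 1, 1, 2, 2, 2, 1, 1, 1, 2])

# One-vs-rest: train one binary classifier per class ("this class" vs. "the rest").
classifiers = {}
for cls in np.unique(y):
    clf = LogisticRegression()
    clf.fit(X, (y == cls).astype(int))
    classifiers[cls] = clf

# Predict the "???" row by picking the class with the highest confidence score.
new_example = np.array([[12, 38]])
scores = {cls: clf.predict_proba(new_example)[0, 1] for cls, clf in classifiers.items()}
predicted = max(scores, key=scores.get)
print(scores, "->", predicted)
```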
Recommended publications
  • Deep Learning Is Robust to Massive Label Noise
Deep Learning is Robust to Massive Label Noise. David Rolnick*, Andreas Veit*, Serge Belongie, Nir Shavit.
Abstract: Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to larger but noisy datasets that are more easily obtained. In this paper, we show that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels. We demonstrate remarkably high test performance after training on corrupted data from MNIST, CIFAR, and ImageNet. For example, on MNIST we obtain test accuracy above 90 percent even after each clean training example has been diluted with 100 randomly-labeled examples.
From the introduction: Thus, annotation can be expensive and, for tasks requiring expert knowledge, may simply be unattainable at scale. To address this limitation, other training paradigms have been investigated to alleviate the need for expensive annotations, such as unsupervised learning (Le, 2013), self-supervised learning (Pinto et al., 2016; Wang & Gupta, 2015) and learning from noisy annotations (Joulin et al., 2016; Natarajan et al., 2013; Veit et al., 2017). Very large datasets (e.g., Krasin et al. (2016); Thomee et al. (2016)) can often be obtained, for example from web sources, with partial or unreliable annotation. This can allow neural networks to be trained on a much wider variety of tasks or classes and with less manual effort. The good performance obtained from these large, noisy datasets indicates that deep learning approaches can tolerate modest amounts of noise in the training set.
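As a small illustration of the kind of label corruption studied above, the sketch below replaces a fraction of MNIST training labels with uniformly random digits before any training takes place. The Keras dataset loader and the 20 percent noise level are assumptions for illustration, not the paper's exact dilution protocol.

```python
import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

rng = np.random.default_rng(0)
noise_fraction = 0.2                      # 20% of labels corrupted (illustrative)
n_noisy = int(noise_fraction * len(y_train))

# Pick training examples at random and overwrite their labels with
# uniformly random digits, i.e. "learning from noisy annotations".
noisy_idx = rng.choice(len(y_train), size=n_noisy, replace=False)
y_noisy = y_train.copy()
y_noisy[noisy_idx] = rng.integers(0, 10, size=n_noisy)

print("corrupted labels:", n_noisy, "of", len(y_train))
# y_noisy can now be used in place of y_train when fitting any classifier.
```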
  • Using Neural Networks for Image Classification Tim Kang San Jose State University
San Jose State University, SJSU ScholarWorks, Master's Projects, Master's Theses and Graduate Research, Spring 5-18-2015.
Using Neural Networks for Image Classification. Tim Kang, San Jose State University.
Recommended Citation: Kang, Tim, "Using Neural Networks for Image Classification" (2015). Master's Projects. 395. DOI: https://doi.org/10.31979/etd.ssss-za3p, https://scholarworks.sjsu.edu/etd_projects/395
Using Neural Networks for Image Classification. San Jose State University, CS298, Spring 2015. Author: Tim Kang ([email protected]), San Jose State University. Advisor: Robert Chun ([email protected]), San Jose State University. Committee: Thomas Austin ([email protected]), San Jose State University. Committee: Thomas Howell ([email protected]), San Jose State University.
Abstract: This paper will focus on applying neural network machine learning methods to images for the purpose of automatic detection and classification. The main advantage of using neural network methods in this project is their adeptness at fitting non-linear data and their ability to work as an unsupervised algorithm. The algorithms will be run on common, publicly available datasets, namely MNIST and CIFAR-10, so that our results will be easily reproducible.
  • The MNIST Database of Handwritten Digit Images for Machine Learning Research
Best of the Web, Li Deng: The MNIST Database of Handwritten Digit Images for Machine Learning Research.
In this issue, "Best of the Web" presents the modified National Institute of Standards and Technology (MNIST) resources, consisting of a collection of handwritten digit images used extensively in optical character recognition and machine learning research. Handwritten digit recognition is an important problem in optical character recognition, and it has been used as a test case for theories of pattern recognition and machine learning algorithms for many years. Historically, to promote machine learning and pattern recognition research, several standard databases have…
The MNIST database was constructed out of the original NIST database; hence, modified NIST or MNIST. There are 60,000 training images (some of these training images can also be used for cross-validation purposes) and 10,000 test images, both drawn from the same distribution. All these black and white digits are size normalized, and centered in a fixed-size image where the center of gravity of the intensity lies at the center of the image with 28 × 28 pixels. Thus, the dimensionality of each image sample vector is 28 * 28 = 784, where each element is binary.
…their results that were obtained using the MNIST database. In most experiments, the existing training data from the database were used in learning the classifiers, where "none" is entered in the "Preprocessing" column of the table on the Web site. In some experiments, the training set was augmented with artificially distorted versions of the original training samples. The distortions include random combinations of jittering, shifts, scaling, deskewing, deslanting, blurring, and compression. The type(s) of these and other distortions are specified in the "Preprocessing" column of…
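A minimal sketch of inspecting the database exactly as described above (60,000 training images, 10,000 test images, 28 × 28 pixels flattened to 784-dimensional sample vectors); the Keras convenience loader is an assumption, not part of the original MNIST distribution.

```python
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)   # (60000, 28, 28) -- 60,000 size-normalized training digits
print(x_test.shape)    # (10000, 28, 28) -- 10,000 test digits from the same distribution
print(y_train[:10])    # integer labels 0..9

# Each 28 x 28 image can be flattened into a 784-dimensional sample vector.
x_train_vectors = x_train.reshape(len(x_train), 28 * 28)
print(x_train_vectors.shape)   # (60000, 784)
```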
  • Data Augmentation for Supervised Learning with Generative Adversarial Networks Manaswi Podduturi Iowa State University
Iowa State University Capstones, Theses and Dissertations: Graduate Theses and Dissertations, 2018.
Data augmentation for supervised learning with generative adversarial networks. Manaswi Podduturi, Iowa State University.
Recommended Citation: Podduturi, Manaswi, "Data augmentation for supervised learning with generative adversarial networks" (2018). Graduate Theses and Dissertations. 16439. https://lib.dr.iastate.edu/etd/16439
Data augmentation for supervised learning with generative adversarial networks, by Manaswi Podduturi. A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE. Major: Computer Science. Program of Study Committee: Chinmay Hegde, Co-major Professor; Jin Tian, Co-major Professor; Pavan Aduri. The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this thesis. The Graduate College will ensure this thesis is globally accessible and will not permit alterations after a degree is conferred. Iowa State University, Ames, Iowa, 2018. Copyright © Manaswi Podduturi, 2018. All rights reserved.
Dedication: I would like to dedicate this thesis to my late father Prasad, who pushed his boundaries to make me what I am today.
  • Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN)
Sensors, Article: Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN). Savita Ahlawat (1), Amit Choudhary (2), Anand Nayyar (3), Saurabh Singh (4) and Byungun Yoon (4,*).
(1) Department of Computer Science and Engineering, Maharaja Surajmal Institute of Technology, New Delhi 110058, India; [email protected]. (2) Department of Computer Science, Maharaja Surajmal Institute, New Delhi 110058, India; [email protected]. (3) Graduate School, Duy Tan University, Da Nang 550000, Vietnam; [email protected]. (4) Department of Industrial & Systems Engineering, Dongguk University, Seoul 04620, Korea; [email protected]. * Correspondence: [email protected]. Received: 25 May 2020; Accepted: 9 June 2020; Published: 12 June 2020.
Abstract: Traditional systems of handwriting recognition have relied on handcrafted features and a large amount of prior knowledge. Training an optical character recognition (OCR) system based on these prerequisites is a challenging task. Research in the handwriting recognition field is focused on deep learning techniques and has achieved breakthrough performance in the last few years. Still, the rapid growth in the amount of handwritten data and the availability of massive processing power demand improvement in recognition accuracy and deserve further investigation. Convolutional neural networks (CNNs) are very effective in perceiving the structure of handwritten characters/words in ways that help in the automatic extraction of distinct features, which makes CNN the most suitable approach for solving handwriting recognition problems. Our aim in the proposed work is to explore the various design options like number of layers, stride size, receptive field, kernel size, padding and dilation for CNN-based handwritten digit recognition. In addition, we aim to evaluate various SGD optimization algorithms in improving the performance of handwritten digit recognition.
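The design options listed in the abstract (number of layers, stride size, receptive field/kernel size, padding, dilation) correspond directly to convolution-layer arguments. Below is a minimal Keras sketch of one such configuration; it is an illustrative assumption, not the authors' actual architecture.

```python
from tensorflow.keras import layers, models

# One of many possible configurations -- the named arguments are the
# design options explored in the paper (kernel size, stride, padding, dilation).
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),                        # MNIST-sized input
    layers.Conv2D(32, kernel_size=3, strides=1,
                  padding="same", activation="relu"),
    layers.Conv2D(64, kernel_size=3, strides=1,
                  padding="same", dilation_rate=2,          # dilated receptive field
                  activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),                 # 10 digit classes
])
model.summary()
```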
  • Beginner's Guide to Neural Networks for the MNIST Dataset Using MATLAB, Bitna Kim and Young Ho Park
Korean J. Math. 26 (2018), No. 2, pp. 337–348. https://doi.org/10.11568/kjm.2018.26.2.337
Beginner's Guide to Neural Networks for the MNIST Dataset Using MATLAB. Bitna Kim and Young Ho Park.
Abstract: The MNIST dataset is a database containing images of handwritten digits, with each image labeled by an integer from 0 to 9. It is used to benchmark the performance of machine learning algorithms. Neural networks for MNIST are regarded as the starting point for studying machine learning algorithms. However, it is not easy to start the actual programming. In this expository article, we will give a step-by-step instruction to build neural networks for the MNIST dataset using MATLAB.
1. Introduction. Machine learning, or more generally artificial intelligence, is a real hot topic these days. We have many applications of these principles to areas including virtual assistants, traffic predictions, video surveillance, social media services, email spam filtering, etc. Many mathematicians want to learn about machine learning, particularly neural networks. There are so many books and internet pages for neural networks scattered around all over the place. However, it is hard to find a right and fast track to actual programming for neural networks for beginners. Python is known to be the best programming language for machine learning. But many mathematicians are more familiar with MATLAB than…
Received June 7, 2018. Revised June 18, 2018. Accepted June 20, 2018. 2010 Mathematics Subject Classification: 68T10, 68T35. Key words and phrases: Neural network, Machine learning, Pattern recognition, MNIST dataset, MATLAB.
  • Recognition of Handwritten Digit Using Convolutional Neural Network (CNN) by Md
Global Journal of Computer Science and Technology: D, Neural & Artificial Intelligence. Volume 19, Issue 2, Version 1.0, Year 2019. Type: Double Blind Peer Reviewed International Research Journal. Publisher: Global Journals. Online ISSN: 0975-4172 & Print ISSN: 0975-4350.
Recognition of Handwritten Digit using Convolutional Neural Network (CNN). By Md. Anwar Hossain & Md. Mohon Ali, Pabna University of Science & Technology.
Abstract: Humans can see and visually sense the world around them by using their eyes and brains. Computer vision works on enabling computers to see and process images in the same way that human vision does. Several algorithms have been developed in the area of computer vision to recognize images. The goal of our work will be to create a model that will be able to identify and determine the handwritten digit from its image with better accuracy. We aim to complete this by using the concepts of Convolutional Neural Networks and the MNIST dataset. We will also show how MatConvNet can be used to implement our model with CPU training as well as less training time. Though the goal is to create a model which can recognize the digits, we can extend it to letters and then to a person's handwriting. Through this work, we aim to learn and practically apply the concepts of Convolutional Neural Networks.
Keywords: convolutional neural network; MNIST dataset; MatConvNet; ReLU; softmax. GJCST-D Classification: I.2.6
  • Facial Recognition Using Deep Learning Neural Network Shrinath Oza, Abdullah Khan, Nilabh Swapnil, Aayush Sharma
ISSN: 2350-0328. International Journal of Advanced Research in Science, Engineering and Technology, Vol. 4, Issue 11, November 2017.
Facial Recognition Using Deep Learning Neural Network. Shrinath Oza, Abdullah Khan, Nilabh Swapnil, Aayush Sharma (Students, Dr. D. Y. Patil School of Engineering, Pune, India); Jayashree Chaudhari (Professor, Dr. D. Y. Patil School of Engineering, Pune, India).
Abstract: By implementing a Restricted Boltzmann Machine with the rules of deep learning and deep belief networks, a facial recognition system algorithm is proposed. The algorithm will take face images as input and store them in a database; after that, a comparison will be made between the given input and the already classified features in the database and dataset. The implementation of the neural network and greedy layer-wise training of the same network will ensure an efficient and improved facial recognition system.
Keywords: Facial recognition, Deep Learning, Neural Network, Deep Belief Network, Greedy Layer Algorithm.
I. Introduction: Facial recognition has become the norm, with almost every device and every security system depending on it one way or another. With the increased application of facial recognition, research on the same has increased exponentially. The main purpose of a facial recognition system is analysing, identifying and verifying a person from a digital image, video source or a database. Despite immense research and development, most facial recognition systems struggle to accurately identify and detect a face.
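As an illustration of the greedy layer-wise idea mentioned above, the sketch below stacks two Restricted Boltzmann Machines, training each layer on the output of the previous one, with a simple classifier on top. The toy random data, layer sizes and use of scikit-learn's BernoulliRBM are assumptions for illustration, not the authors' actual system.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression

# Toy stand-in for flattened, binarized face images in [0, 1] (illustrative data).
rng = np.random.default_rng(0)
X = (rng.random((200, 64)) > 0.5).astype(float)   # 200 "images", 64 pixels each
y = rng.integers(0, 2, size=200)                  # 2 identities (illustrative)

# Greedy layer-wise training: fit one RBM at a time, each on the hidden
# representation produced by the layer below (a deep-belief-network recipe).
layer1 = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
h1 = layer1.fit_transform(X)
layer2 = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0)
h2 = layer2.fit_transform(h1)

# A simple classifier on top compares new inputs against the learned features.
clf = LogisticRegression().fit(h2, y)
print(clf.score(layer2.transform(layer1.transform(X)), y))
```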
  • “A Review on Generative Adversarial Networks for Unsupervised Machine Learning”
Bachelor's Degree in Telecommunication Technologies Engineering, Academic Year 2018/2019. Bachelor Thesis: "A review on Generative Adversarial Networks for unsupervised Machine Learning". Pablo Javier de Paz Carbajo. Supervised by: Pablo Martínez Olmos. Madrid, June 2019. This work is licensed under Creative Commons Attribution – Non Commercial – Non Derivatives.
Abstract: This thesis is conceived to provide an in-depth review of Generative Adversarial Networks (GANs) for unsupervised Machine Learning by introducing the basic concepts of Deep Learning as one of the main current techniques for unsupervised learning with deep neural networks, as well as carefully explaining what a GAN is and how it works. The approach followed during this year-long work consists in learning and internalizing the technical concepts of Deep Learning and, particularly, GANs. After that, an analysis of the most used techniques takes place. For the experimentation, TensorFlow over Python is used to obtain experimental results for different types of public datasets of images (MNIST, Fashion-MNIST and CIFAR-10). GANs are outstandingly interesting in Deep Learning because of their endless capabilities and huge potential in many fields such as the health industry, agriculture, banking and insurance, transportation, security, education… Their capability to generate new data that mimics the input learning data given to them is impressive and never seen before. This method is a very nice approach for Data Science when dealing with a huge amount of (unlabeled) data, as most companies, institutions and governments do today. This thesis is restricted to image applications, but the outcomes can be applied to other datasets or fields such as text or voice applications.
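To make the generator-versus-discriminator idea concrete, here is a heavily simplified TensorFlow sketch of a single adversarial training step on flattened images (for example 784-dimensional MNIST vectors). The layer sizes, noise dimension, losses and optimizers are illustrative assumptions, not the thesis's exact models.

```python
import tensorflow as tf

NOISE_DIM, IMG_DIM = 100, 784   # e.g. flattened 28x28 MNIST images (assumed sizes)

# Generator: maps random noise to a fake image vector.
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(NOISE_DIM,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(IMG_DIM, activation="tanh"),
])

# Discriminator: outputs a real/fake logit for an image vector.
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(IMG_DIM,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1),
])

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], NOISE_DIM])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake_images, training=True)
        # Discriminator: push real -> 1, fake -> 0. Generator: fool it into 1.
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    return g_loss, d_loss
```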
  • Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. Anh Nguyen (University of Wyoming, [email protected]), Jason Yosinski (Cornell University, [email protected]), Jeff Clune (University of Wyoming, [email protected]).
Abstract: Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study [30] revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects, which we call "fooling images" (more generally, fooling examples). Our results shed light on interesting differences between human vision and current DNNs, and raise questions…
Figure 1 (caption): Evolved images that are unrecognizable to humans, but that state-of-the-art DNNs trained on ImageNet believe with ≥ 99.6% certainty to be a familiar object.
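A minimal sketch of the gradient-ascent route described in the abstract: keep a pre-trained classifier fixed and repeatedly nudge an input image, starting from noise, so that the confidence for one chosen class increases. The choice of MobileNetV2, the target class index and the step size are illustrative assumptions, not the authors' setup.

```python
import tensorflow as tf

# A fixed, pre-trained ImageNet classifier (assumed choice of architecture).
model = tf.keras.applications.MobileNetV2(weights="imagenet")

target_class = 291              # an arbitrary ImageNet class index (illustrative)
x = tf.Variable(tf.random.uniform([1, 224, 224, 3]))   # start from noise

for step in range(200):
    with tf.GradientTape() as tape:
        probs = model(x, training=False)
        score = probs[0, target_class]      # confidence for the target class
    grad = tape.gradient(score, x)
    # Gradient ascent on the *input image*, keeping the network weights fixed.
    x.assign(tf.clip_by_value(x + 0.01 * tf.sign(grad), 0.0, 1.0))

print("final confidence:", float(model(x)[0, target_class]))
```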
  • Incorporating Nesterov Momentum Into Adam
Incorporating Nesterov Momentum into Adam. Timothy Dozat.
1 Introduction
When attempting to improve the performance of a deep learning system, there are more or less three approaches one can take: the first is to improve the structure of the model, perhaps adding another layer, switching from simple recurrent units to LSTM cells [4], or – in the realm of NLP – taking advantage of syntactic parses (e.g. as in [13, et seq.]); another approach is to improve the initialization of the model, guaranteeing that the early-stage gradients have certain beneficial properties [3], or building in large amounts of sparsity [6], or taking advantage of principles of linear algebra [15]; the final approach is to try a more powerful learning algorithm, such as including a decaying sum over the previous gradients in the update [12], by dividing each parameter update by the L2 norm of the previous updates for that parameter [2], or even by foregoing first-order algorithms for more powerful but more computationally costly second-order algorithms [9].
Algorithm 1: Gradient Descent
    g_t ← ∇_θ f(θ_{t−1})
    θ_t ← θ_{t−1} − η g_t
…sum (with decay constant μ) of the previous gradients into a momentum vector m, and using that instead of the true gradient. This has the advantage of accelerating gradient descent learning along dimensions where the gradient remains relatively consistent across training steps and slowing it along turbulent dimensions where the gradient is significantly oscillating.
Algorithm 2: Classical Momentum
    g_t ← ∇_θ f(θ_{t−1})
    m_t ← μ m_{t−1} + g_t
    θ_t ← θ_{t−1} − η m_t
[14] show that Nesterov's accelerated gradient…
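A small numeric sketch of the update rules above – plain gradient descent (Algorithm 1), classical momentum (Algorithm 2) and one common formulation of Nesterov's accelerated gradient – on a toy quadratic objective; the objective, learning rate and decay constant are illustrative choices, not the paper's experiments.

```python
import numpy as np

def grad(theta):
    # Gradient of a toy objective f(theta) = 0.5 * ||theta||^2 (illustrative).
    return theta

eta, mu, steps = 0.1, 0.9, 100            # learning rate, momentum decay constant
theta_gd = theta_cm = theta_nag = np.array([5.0, -3.0])
m_cm = m_nag = np.zeros(2)

for _ in range(steps):
    # Algorithm 1: gradient descent.
    theta_gd = theta_gd - eta * grad(theta_gd)

    # Algorithm 2: classical momentum -- a decaying sum of past gradients.
    m_cm = mu * m_cm + grad(theta_cm)
    theta_cm = theta_cm - eta * m_cm

    # Nesterov's accelerated gradient: evaluate the gradient at the
    # "looked-ahead" point theta - eta * mu * m before updating.
    m_nag = mu * m_nag + grad(theta_nag - eta * mu * m_nag)
    theta_nag = theta_nag - eta * m_nag

print(theta_gd, theta_cm, theta_nag)
```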
  • CNN Model for Image Classification on MNIST and Fashion-MNIST Dataset
Volume 64, Issue 2, 2020. Journal of Scientific Research, Institute of Science, Banaras Hindu University, Varanasi, India.
CNN Model for Image Classification on MNIST and Fashion-MNIST Dataset. Shivam S. Kadam, Amol C. Adamuthe*, and Ashwini B. Patil. Dept. of CS & IT, RIT, Rajaramnagar, Sangli, MS, India. [email protected], [email protected]*, [email protected]
Abstract: The paper presents an application of a convolutional neural network to the image classification problem. The MNIST and Fashion-MNIST datasets are used to test the performance of the CNN model. The paper presents five different architectures with varying convolutional layers, filter size and fully connected layers. Experiments were conducted with varying hyper-parameters, namely activation function, optimizer, learning rate, dropout rate and batch size. Results show that the selection of activation function, optimizer and dropout rate has an impact on the accuracy of results. All architectures give accuracy of more than 99% for the MNIST dataset. The Fashion-MNIST dataset is more complex than MNIST. For the Fashion-MNIST dataset, architecture 3 gives better results.
From the paper: …flatten it to pass on to the output layer, which will make the prediction. Due to such architecture, it will take fewer parameters to learn and reduce the amount of data required to train the model. CNNs are inspired by time-delay neural networks, in which the weights are shared in the temporal dimension for a reduction in computation. Successful training of the hierarchical layers led CNN to become the first successful deep learning architecture, as shown by Liu et al. (2017). CNN has shown excellent performance for machine learning applications. Due to the practical benefits of CNN, it gives better accuracy and improves the performance of the system…
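As an illustration of where the hyper-parameters studied above (activation function, optimizer, learning rate, dropout rate, batch size) appear in code, here is a minimal Keras training setup for Fashion-MNIST; the small architecture and the specific values are assumptions, not the paper's architecture 3.

```python
from tensorflow.keras import datasets, layers, models, optimizers

(x_train, y_train), (x_test, y_test) = datasets.fashion_mnist.load_data()
x_train = x_train[..., None] / 255.0      # add channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),               # activation function
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dropout(0.3),                                    # dropout rate under study
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=128, epochs=5,         # batch size
          validation_data=(x_test, y_test))
```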