Artificial Intelligence. What Machines Can Learn and How to Implement a Machine Learning Project in Your Organization?

Total Pages: 16

File Type: pdf, Size: 1020 KB

Artificial Intelligence. What machines can learn and how to implement a Machine Learning project in your organization?
Antons Mislēvičs, [email protected]

Agenda
1. What can machines do today?
2. How does Machine Learning work?
3. How to implement a Machine Learning project?

What can machines do today?

Chess: IBM Deep Blue 3½ – Garry Kasparov 2½ (1997)
- Kasparov vs. Deep Blue, the rematch, 1997: https://www.research.ibm.com/deepblue/

Jeopardy!: IBM Watson beats champions (2011)
- Final Jeopardy! and the Future of Watson: http://www-03.ibm.com/marketing/br/watson/what-is-watson/the-future-of-watson.html

Go: Google AlphaGo 4 – Lee Sedol 1 (2016)
- AlphaGo: https://deepmind.com/alpha-go

Image recognition: error rates, human vs. machine
1. Traffic sign recognition (IJCNN 2011): human 1.16%, machine 0.54%
2. Handwritten digits (MNIST): human approx. 0.2%, machine 0.23% (2012)
- The German Traffic Sign Recognition Benchmark: http://benchmark.ini.rub.de/?section=gtsrb&subsection=results
- The MNIST database of handwritten digits: http://yann.lecun.com/exdb/mnist/

Large Scale Visual Recognition Challenge (ILSVRC), 2015 challenge:
- Object detection – 200 categories
- Object recognition – 1000 categories
- Object detection from video – 30 categories
- Scene classification – 401 categories
- Large Scale Visual Recognition Challenge 2015 (ILSVRC2015): http://image-net.org/challenges/LSVRC/2015/index#maincomp

Microsoft Research team: "To our knowledge, our result is the first to surpass human-level performance… on this visual recognition challenge"
- Large Scale Visual Recognition Challenge 2015 – Results: http://image-net.org/challenges/LSVRC/2015/results
- Microsoft Researchers' Algorithm Sets ImageNet Challenge Milestone, 2015: https://www.microsoft.com/en-us/research/microsoft-researchers-algorithm-sets-imagenet-challenge-milestone/

Machines can understand the meaning…
- Show and Tell: A Neural Image Caption Generator, O. Vinyals, A. Toshev, S. Bengio, D. Erhan, 2015: http://arxiv.org/abs/1411.4555v2
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning, J. Johnson, A. Karpathy, L. Fei-Fei, 2015: http://arxiv.org/abs/1511.07571
- Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, R. Kiros, R. Salakhutdinov, R. S. Zemel, 2014: http://arxiv.org/abs/1411.2539
- https://clarifai.com/demo

Self-driving cars
- Google Self-Driving Car Project – How it drives: https://www.google.com/selfdrivingcar/how/
- Autopilot Full Self-Driving Hardware (Neighborhood Long), Tesla Motors: https://vimeo.com/192179727

Machines get creative…
1. Reproduce artistic style [1, 2, 3]
2. The Next Rembrandt [4]
3. Complete images [5]
4. Enhance images [6]
References:
1. A Neural Algorithm of Artistic Style, 2015: https://arxiv.org/abs/1508.06576
2. Supercharging Style Transfer, 2016: https://research.googleblog.com/2016/10/supercharging-style-transfer.html
3. Neural Doodle, 2016: https://github.com/alexjc/neural-doodle
4. The Next Rembrandt: https://www.nextrembrandt.com/
5. Image Completion with Deep Learning in TensorFlow, 2016: https://bamos.github.io/2016/08/09/deep-completion/
6. Neural Enhance, 2016: https://github.com/alexjc/neural-enhance

Text to speech and voice recognition…
1. Talk – text to speech (WaveNet) [1, 2]
2. Recognize voice [3, 4]
References:
1. WaveNet: A Generative Model for Raw Audio: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
2. WaveNet: A Generative Model for Raw Audio, 2016: https://arxiv.org/abs/1609.03499
3. Historic Achievement: Microsoft researchers reach human parity in conversational speech recognition: https://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition/
4. Achieving Human Parity in Conversational Speech Recognition, 2016: http://arxiv.org/abs/1610.05256

What else can machines do?
1. Compose music [1]
2. Generate handwriting [2, 3]
3. Translate texts [4, 5]
References:
1. Composing Music With Recurrent Neural Networks: http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
2. Generating Sequences With Recurrent Neural Networks, A. Graves, 2014: http://arxiv.org/abs/1308.0850
3. Alex Graves's RNN handwriting generation demo: http://www.cs.toronto.edu/~graves/handwriting.html
4. University of Montreal, Lisa Lab, Neural Machine Translation demo: http://lisa.iro.umontreal.ca/mt-demo
5. Fully Character-Level Neural Machine Translation without Explicit Segmentation, J. Lee, K. Cho, T. Hofmann, 2016: http://arxiv.org/abs/1610.03017

Playing Atari
- Playing Atari with Deep Reinforcement Learning, 2013: https://arxiv.org/abs/1312.5602
- Google DeepMind's Deep Q-learning playing Atari Breakout: https://www.youtube.com/watch?v=V1eYniJ0Rnk

Machine Learning is transforming businesses…
- Predictive Maintenance for Aircraft Engines
- Rolls-Royce and Microsoft collaborate to create new digital capabilities: https://www.youtube.com/watch?v=B3CZXp-RK0g

How does Machine Learning work?

Machine Learning process
- Introducing Azure Machine Learning, D. Chappell, 2015: http://www.davidchappell.com/writing/white_papers/Introducing-Azure-ML-v1.0--Chappell.pdf

Machine Learning questions
1. How much / how many? – Regression
2. Which category? – Classification
3. Which groups? – Clustering
4. Is it weird? – Anomaly detection

Regression: how much / how many?
Housing prices by square feet (the same data set is reused on the next few slides; the scatter-plot axis values are omitted here):

  Price ($)    Square feet
  125,999        950
  207,190      1,125
  227,555      1,400
  319,010      1,750
  345,846      1,525
  350,000      1,690
  437,301      2,120
  450,999      2,500
  605,000      3,010
  641,370      3,250
  824,280      3,600
  1,092,640    3,700
  1,187,550    4,500

Output variable: price. Input variable / feature: square feet.

Housing prices hypothesis: fit a hypothesis (a model) that maps square feet to price.

Using the model to predict a house price: for example, what price should a 2,700-square-foot house that is not in the training data sell for?

Prediction "errors" & improving models: measure how far the hypothesis is from the training data with a cost function (squared-error function), then adjust the model to reduce it. A minimal sketch of fitting and evaluating such a model follows.
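To make the regression slides concrete, here is a minimal sketch in Python (an assumption – the deck does not prescribe a language or library). It fits a straight-line hypothesis to the housing table above by minimizing the squared-error cost function and predicts the price of the 2,700 sq. ft. house.

```python
import numpy as np

# Housing data from the slides: square feet (feature) and price (target).
sqft  = np.array([950, 1125, 1400, 1750, 1525, 1690, 2120,
                  2500, 3010, 3250, 3600, 3700, 4500], dtype=float)
price = np.array([125_999, 207_190, 227_555, 319_010, 345_846, 350_000,
                  437_301, 450_999, 605_000, 641_370, 824_280,
                  1_092_640, 1_187_550], dtype=float)

# Hypothesis: price ≈ theta0 + theta1 * sqft (a straight line).
# np.polyfit solves the least-squares problem, i.e. it minimizes the
# squared-error cost function mentioned on the slide.
theta1, theta0 = np.polyfit(sqft, price, deg=1)

# Squared-error cost of the fitted hypothesis on the training data.
predictions = theta0 + theta1 * sqft
cost = np.mean((predictions - price) ** 2) / 2.0

# Use the model to predict the price of the 2,700 sq. ft. house (the "???" row).
print("predicted price for 2700 sq ft:", theta0 + theta1 * 2700)
print("training cost (mean squared error / 2):", cost)
```

The same pattern extends to the multi-feature table on the "use more variables" slide by replacing the single square-feet column with a feature matrix.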
Try a different algorithm: the same housing data can also be fitted with a different (for example non-linear) hypothesis.

Get more data / use more variables (features):

  Price ($)    Square feet  # Bedrooms  # Bathrooms  Fireplaces  Garage size  Floors
  125,999        950          1           1            0           0            1
  207,190      1,125          1           1            0           1            1
  227,555      1,400          2           1.5          1           2            1
  319,010      1,750          2           1.5          0           2            2
  345,846      1,525          3           2            1           2            1
  350,000      1,690          3           1.5          1           2            1.5
  437,301      2,120          3           2.5          2           3            2
  450,999      2,500          3           2.5          1           2            1.5
  605,000      3,010          4           2.5          2           3            2
  641,370      3,250          3           3            1           3            2
  824,280      3,600          3           3            2           3            2
  1,092,640    3,700          5           4.5          2           3            2
  1,187,550    4,500          6           6            4           5            2

Classification: which category?

Classification – 2 classes
Input variables / features: years driving, age. Output variable: class.

  Years driving   Age   Class
  5               65    1
  7               70    1
  2               68    1
  25              45    2
  25              55    2
  20              50    2
  5               25    1
  3               22    1
  8               30    1
  15              35    2
  …               …     …
  12              38    ???

A hypothesis / classifier separates the two classes and is used to predict the class of the new example (12 years driving, age 38).
- WALL·E (2008): http://www.imdb.com/title/tt0910970/

Classification – more than 2 classes
1. Input variables / features: years driving, age
2. Output variable: class (Yellow, Green, Blue)
3. One-vs-rest approach: train a classifier for each class (Classifier 1, 2, 3), then select the class whose classifier returns the highest confidence score (a minimal sketch follows at the end of this outline).

- https://quickdraw.withgoogle.com
- http://playground.tensorflow.org/

Why is Machine Learning developing rapidly?
1. Data
2. Algorithms – same approach in different domains (deep neural networks)
3. …
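To make the one-vs-rest approach from the classification slides concrete, here is the minimal sketch referred to above. The choice of logistic regression as the per-class binary classifier is an illustrative assumption (the slides do not name an algorithm); with only the two classes from the table the loop trains two binary classifiers, and the same code handles three or more classes (e.g. Yellow, Green, Blue) unchanged.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data from the slide: [years driving, age] -> class label.
X = np.array([[5, 65], [7, 70], [2, 68], [25, 45], [25, 55],
              [20, 50], [5, 25], [3, 22], [8, 30], [15, 35]])
y = np.array([1, 1, 1, 2, 2, 2, 1, 1, 1, 2])

# One-vs-rest: train one binary classifier per class ("this class" vs. "the rest").
classifiers = {}
for cls in np.unique(y):
    clf = LogisticRegression()
    clf.fit(X, (y == cls).astype(int))
    classifiers[cls] = clf

# Predict the "???" row by picking the class with the highest confidence score.
new_example = np.array([[12, 38]])
scores = {cls: clf.predict_proba(new_example)[0, 1] for cls, clf in classifiers.items()}
predicted = max(scores, key=scores.get)
print(scores, "->", predicted)
```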
Recommended publications
  • Deep Learning Is Robust to Massive Label Noise
Deep Learning is Robust to Massive Label Noise. David Rolnick*, Andreas Veit*, Serge Belongie, Nir Shavit.
Abstract: Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to larger but noisy datasets that are more easily obtained. In this paper, we show that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels. We demonstrate remarkably high test performance after training on corrupted data from MNIST, CIFAR, and ImageNet. For example, on MNIST we obtain test accuracy above 90 percent even after each clean training example has been diluted with 100 randomly-labeled examples.
From the introduction: Thus, annotation can be expensive and, for tasks requiring expert knowledge, may simply be unattainable at scale. To address this limitation, other training paradigms have been investigated to alleviate the need for expensive annotations, such as unsupervised learning (Le, 2013), self-supervised learning (Pinto et al., 2016; Wang & Gupta, 2015) and learning from noisy annotations (Joulin et al., 2016; Natarajan et al., 2013; Veit et al., 2017). Very large datasets (e.g., Krasin et al. (2016); Thomee et al. (2016)) can often be obtained, for example from web sources, with partial or unreliable annotation. This can allow neural networks to be trained on a much wider variety of tasks or classes and with less manual effort. The good performance obtained from these large, noisy datasets indicates that deep learning approaches can tolerate modest amounts of noise in the training set.
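As a small illustration of the kind of label corruption studied above, the sketch below replaces a fraction of MNIST training labels with uniformly random digits before any training takes place. The Keras dataset loader and the 20 percent noise level are assumptions for illustration, not the paper's exact dilution protocol.

```python
import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

rng = np.random.default_rng(0)
noise_fraction = 0.2                      # 20% of labels corrupted (illustrative)
n_noisy = int(noise_fraction * len(y_train))

# Pick training examples at random and overwrite their labels with
# uniformly random digits, i.e. "learning from noisy annotations".
noisy_idx = rng.choice(len(y_train), size=n_noisy, replace=False)
y_noisy = y_train.copy()
y_noisy[noisy_idx] = rng.integers(0, 10, size=n_noisy)

print("corrupted labels:", n_noisy, "of", len(y_train))
# y_noisy can now be used in place of y_train when fitting any classifier.
```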
  • Using Neural Networks for Image Classification Tim Kang San Jose State University
San Jose State University, SJSU ScholarWorks, Master's Projects, Master's Theses and Graduate Research, Spring 5-18-2015.
Using Neural Networks for Image Classification. Tim Kang, San Jose State University.
Recommended Citation: Kang, Tim, "Using Neural Networks for Image Classification" (2015). Master's Projects. 395. DOI: https://doi.org/10.31979/etd.ssss-za3p, https://scholarworks.sjsu.edu/etd_projects/395
Using Neural Networks for Image Classification. San Jose State University, CS298, Spring 2015. Author: Tim Kang ([email protected]), San Jose State University. Advisor: Robert Chun ([email protected]), San Jose State University. Committee: Thomas Austin ([email protected]), San Jose State University. Committee: Thomas Howell ([email protected]), San Jose State University.
Abstract: This paper will focus on applying neural network machine learning methods to images for the purpose of automatic detection and classification. The main advantage of using neural network methods in this project is their adeptness at fitting non-linear data and their ability to work as an unsupervised algorithm. The algorithms will be run on common, publicly available datasets, namely MNIST and CIFAR-10, so that our results will be easily reproducible.
  • The MNIST Database of Handwritten Digit Images for Machine Learning Research
Best of the Web, Li Deng: The MNIST Database of Handwritten Digit Images for Machine Learning Research.
In this issue, "Best of the Web" presents the modified National Institute of Standards and Technology (MNIST) resources, consisting of a collection of handwritten digit images used extensively in optical character recognition and machine learning research. Handwritten digit recognition is an important problem in optical character recognition, and it has been used as a test case for theories of pattern recognition and machine learning algorithms for many years. Historically, to promote machine learning and pattern recognition research, several standard databases have…
The MNIST database was constructed out of the original NIST database; hence, modified NIST or MNIST. There are 60,000 training images (some of these training images can also be used for cross-validation purposes) and 10,000 test images, both drawn from the same distribution. All these black and white digits are size normalized, and centered in a fixed-size image where the center of gravity of the intensity lies at the center of the image with 28 × 28 pixels. Thus, the dimensionality of each image sample vector is 28 * 28 = 784, where each element is binary.
…their results that were obtained using the MNIST database. In most experiments, the existing training data from the database were used in learning the classifiers, where "none" is entered in the "Preprocessing" column of the table on the Web site. In some experiments, the training set was augmented with artificially distorted versions of the original training samples. The distortions include random combinations of jittering, shifts, scaling, deskewing, deslanting, blurring, and compression. The type(s) of these and other distortions are specified in the "Preprocessing" column of…
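A minimal sketch of inspecting the database exactly as described above (60,000 training images, 10,000 test images, 28 × 28 pixels flattened to 784-dimensional sample vectors); the Keras convenience loader is an assumption, not part of the original MNIST distribution.

```python
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)   # (60000, 28, 28) -- 60,000 size-normalized training digits
print(x_test.shape)    # (10000, 28, 28) -- 10,000 test digits from the same distribution
print(y_train[:10])    # integer labels 0..9

# Each 28 x 28 image can be flattened into a 784-dimensional sample vector.
x_train_vectors = x_train.reshape(len(x_train), 28 * 28)
print(x_train_vectors.shape)   # (60000, 784)
```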
  • Data Augmentation for Supervised Learning with Generative Adversarial Networks Manaswi Podduturi Iowa State University
Iowa State University Capstones, Theses and Dissertations: Graduate Theses and Dissertations, 2018.
Data augmentation for supervised learning with generative adversarial networks. Manaswi Podduturi, Iowa State University.
Recommended Citation: Podduturi, Manaswi, "Data augmentation for supervised learning with generative adversarial networks" (2018). Graduate Theses and Dissertations. 16439. https://lib.dr.iastate.edu/etd/16439
Data augmentation for supervised learning with generative adversarial networks, by Manaswi Podduturi. A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE. Major: Computer Science. Program of Study Committee: Chinmay Hegde, Co-major Professor; Jin Tian, Co-major Professor; Pavan Aduri. The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this thesis. The Graduate College will ensure this thesis is globally accessible and will not permit alterations after a degree is conferred. Iowa State University, Ames, Iowa, 2018. Copyright © Manaswi Podduturi, 2018. All rights reserved.
Dedication: I would like to dedicate this thesis to my late father Prasad, who pushed his boundaries to make me what I am today.
  • Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN)
Sensors, Article: Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN). Savita Ahlawat (1), Amit Choudhary (2), Anand Nayyar (3), Saurabh Singh (4) and Byungun Yoon (4,*).
(1) Department of Computer Science and Engineering, Maharaja Surajmal Institute of Technology, New Delhi 110058, India; [email protected]. (2) Department of Computer Science, Maharaja Surajmal Institute, New Delhi 110058, India; [email protected]. (3) Graduate School, Duy Tan University, Da Nang 550000, Vietnam; [email protected]. (4) Department of Industrial & Systems Engineering, Dongguk University, Seoul 04620, Korea; [email protected]. * Correspondence: [email protected]. Received: 25 May 2020; Accepted: 9 June 2020; Published: 12 June 2020.
Abstract: Traditional systems of handwriting recognition have relied on handcrafted features and a large amount of prior knowledge. Training an optical character recognition (OCR) system based on these prerequisites is a challenging task. Research in the handwriting recognition field is focused on deep learning techniques and has achieved breakthrough performance in the last few years. Still, the rapid growth in the amount of handwritten data and the availability of massive processing power demand improvement in recognition accuracy and deserve further investigation. Convolutional neural networks (CNNs) are very effective in perceiving the structure of handwritten characters/words in ways that help in the automatic extraction of distinct features, which makes CNN the most suitable approach for solving handwriting recognition problems. Our aim in the proposed work is to explore the various design options like number of layers, stride size, receptive field, kernel size, padding and dilation for CNN-based handwritten digit recognition. In addition, we aim to evaluate various SGD optimization algorithms in improving the performance of handwritten digit recognition.
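The design options listed in the abstract (number of layers, stride size, receptive field/kernel size, padding, dilation) correspond directly to convolution-layer arguments. Below is a minimal Keras sketch of one such configuration; it is an illustrative assumption, not the authors' actual architecture.

```python
from tensorflow.keras import layers, models

# One of many possible configurations -- the named arguments are the
# design options explored in the paper (kernel size, stride, padding, dilation).
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),                        # MNIST-sized input
    layers.Conv2D(32, kernel_size=3, strides=1,
                  padding="same", activation="relu"),
    layers.Conv2D(64, kernel_size=3, strides=1,
                  padding="same", dilation_rate=2,          # dilated receptive field
                  activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),                 # 10 digit classes
])
model.summary()
```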
  • Beginner's Guide to Neural Networks for the MNIST Dataset Using MATLAB, Bitna Kim and Young Ho Park
Korean J. Math. 26 (2018), No. 2, pp. 337–348. https://doi.org/10.11568/kjm.2018.26.2.337
Beginner's Guide to Neural Networks for the MNIST Dataset Using MATLAB. Bitna Kim and Young Ho Park.
Abstract: The MNIST dataset is a database containing images of handwritten digits, with each image labeled by an integer from 0 to 9. It is used to benchmark the performance of machine learning algorithms. Neural networks for MNIST are regarded as the starting point for studying machine learning algorithms. However, it is not easy to start the actual programming. In this expository article, we will give a step-by-step instruction to build neural networks for the MNIST dataset using MATLAB.
1. Introduction. Machine learning, or more generally artificial intelligence, is a real hot topic these days. We have many applications of these principles to areas including virtual assistants, traffic predictions, video surveillance, social media services, email spam filtering, etc. Many mathematicians want to learn about machine learning, particularly neural networks. There are so many books and internet pages for neural networks scattered around all over the place. However, it is hard to find a right and fast track to actual programming for neural networks for beginners. Python is known to be the best programming language for machine learning. But many mathematicians are more familiar with MATLAB than…
Received June 7, 2018. Revised June 18, 2018. Accepted June 20, 2018. 2010 Mathematics Subject Classification: 68T10, 68T35. Key words and phrases: Neural network, Machine learning, Pattern recognition, MNIST dataset, MATLAB.
  • Recognition of Handwritten Digit Using Convolutional Neural Network (CNN) by Md
Global Journal of Computer Science and Technology: D, Neural & Artificial Intelligence. Volume 19, Issue 2, Version 1.0, Year 2019. Type: Double Blind Peer Reviewed International Research Journal. Publisher: Global Journals. Online ISSN: 0975-4172 & Print ISSN: 0975-4350.
Recognition of Handwritten Digit using Convolutional Neural Network (CNN). By Md. Anwar Hossain & Md. Mohon Ali, Pabna University of Science & Technology.
Abstract: Humans can see and visually sense the world around them by using their eyes and brains. Computer vision works on enabling computers to see and process images in the same way that human vision does. Several algorithms have been developed in the area of computer vision to recognize images. The goal of our work will be to create a model that will be able to identify and determine the handwritten digit from its image with better accuracy. We aim to complete this by using the concepts of Convolutional Neural Networks and the MNIST dataset. We will also show how MatConvNet can be used to implement our model with CPU training as well as less training time. Though the goal is to create a model which can recognize the digits, we can extend it to letters and then to a person's handwriting. Through this work, we aim to learn and practically apply the concepts of Convolutional Neural Networks.
Keywords: convolutional neural network; MNIST dataset; MatConvNet; ReLU; softmax. GJCST-D Classification: I.2.6
  • Facial Recognition Using Deep Learning Neural Network Shrinath Oza, Abdullah Khan, Nilabh Swapnil, Aayush Sharma
ISSN: 2350-0328. International Journal of Advanced Research in Science, Engineering and Technology, Vol. 4, Issue 11, November 2017.
Facial Recognition Using Deep Learning Neural Network. Shrinath Oza, Abdullah Khan, Nilabh Swapnil, Aayush Sharma (Students, Dr. D. Y. Patil School of Engineering, Pune, India); Jayashree Chaudhari (Professor, Dr. D. Y. Patil School of Engineering, Pune, India).
Abstract: By implementing a Restricted Boltzmann Machine with the rules of deep learning and deep belief networks, a facial recognition system algorithm is proposed. The algorithm will take face images as input and store them in a database; after that, a comparison will be made between the given input and the already classified features in the database and dataset. The implementation of the neural network and greedy layer-wise training of the same network will ensure an efficient and improved facial recognition system.
Keywords: Facial recognition, Deep Learning, Neural Network, Deep Belief Network, Greedy Layer Algorithm.
I. Introduction: Facial recognition has become the norm, with almost every device and every security system depending on it one way or another. With the increased application of facial recognition, research on the same has increased exponentially. The main purpose of a facial recognition system is analysing, identifying and verifying a person from a digital image, video source or a database. Despite immense research and development, most facial recognition systems struggle to accurately identify and detect a face.
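As an illustration of the greedy layer-wise idea mentioned above, the sketch below stacks two Restricted Boltzmann Machines, training each layer on the output of the previous one, with a simple classifier on top. The toy random data, layer sizes and use of scikit-learn's BernoulliRBM are assumptions for illustration, not the authors' actual system.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression

# Toy stand-in for flattened, binarized face images in [0, 1] (illustrative data).
rng = np.random.default_rng(0)
X = (rng.random((200, 64)) > 0.5).astype(float)   # 200 "images", 64 pixels each
y = rng.integers(0, 2, size=200)                  # 2 identities (illustrative)

# Greedy layer-wise training: fit one RBM at a time, each on the hidden
# representation produced by the layer below (a deep-belief-network recipe).
layer1 = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
h1 = layer1.fit_transform(X)
layer2 = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0)
h2 = layer2.fit_transform(h1)

# A simple classifier on top compares new inputs against the learned features.
clf = LogisticRegression().fit(h2, y)
print(clf.score(layer2.transform(layer1.transform(X)), y))
```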
  • “A Review on Generative Adversarial Networks for Unsupervised Machine Learning”
Bachelor's Degree in Telecommunication Technologies Engineering, Academic Year 2018/2019. Bachelor Thesis: "A review on Generative Adversarial Networks for unsupervised Machine Learning". Pablo Javier de Paz Carbajo. Supervised by: Pablo Martínez Olmos. Madrid, June 2019. This work is licensed under Creative Commons Attribution – Non Commercial – Non Derivatives.
Abstract: This thesis is conceived to provide an in-depth review of Generative Adversarial Networks (GANs) for unsupervised Machine Learning by introducing the basic concepts of Deep Learning as one of the main current techniques for unsupervised learning with deep neural networks, as well as carefully explaining what a GAN is and how it works. The approach followed during this year-long work consists in learning and internalizing the technical concepts of Deep Learning and, particularly, GANs. After that, an analysis of the most used techniques takes place. For the experimentation, TensorFlow over Python is used to obtain experimental results for different types of public datasets of images (MNIST, Fashion-MNIST and CIFAR-10). GANs are outstandingly interesting in Deep Learning because of their endless capabilities and huge potential in many fields such as the health industry, agriculture, banking and insurance, transportation, security, education… Their capability to generate new data that mimics the input learning data given to them is impressive and never seen before. This method is a very nice approach for Data Science when dealing with a huge amount of (unlabeled) data, as most companies, institutions and governments do today. This thesis is restricted to image applications, but the outcomes can be applied to other datasets or fields such as text or voice applications.
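To make the generator-versus-discriminator idea concrete, here is a heavily simplified TensorFlow sketch of a single adversarial training step on flattened images (for example 784-dimensional MNIST vectors). The layer sizes, noise dimension, losses and optimizers are illustrative assumptions, not the thesis's exact models.

```python
import tensorflow as tf

NOISE_DIM, IMG_DIM = 100, 784   # e.g. flattened 28x28 MNIST images (assumed sizes)

# Generator: maps random noise to a fake image vector.
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(NOISE_DIM,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(IMG_DIM, activation="tanh"),
])

# Discriminator: outputs a real/fake logit for an image vector.
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(IMG_DIM,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1),
])

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], NOISE_DIM])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake_images, training=True)
        # Discriminator: push real -> 1, fake -> 0. Generator: fool it into 1.
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    return g_loss, d_loss
```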
  • Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. Anh Nguyen (University of Wyoming, [email protected]), Jason Yosinski (Cornell University, [email protected]), Jeff Clune (University of Wyoming, [email protected]).
Abstract: Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study [30] revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects, which we call "fooling images" (more generally, fooling examples). Our results shed light on interesting differences between human vision and current DNNs, and raise questions…
Figure 1 (caption): Evolved images that are unrecognizable to humans, but that state-of-the-art DNNs trained on ImageNet believe with ≥ 99.6% certainty to be a familiar object.
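A minimal sketch of the gradient-ascent route described in the abstract: keep a pre-trained classifier fixed and repeatedly nudge an input image, starting from noise, so that the confidence for one chosen class increases. The choice of MobileNetV2, the target class index and the step size are illustrative assumptions, not the authors' setup.

```python
import tensorflow as tf

# A fixed, pre-trained ImageNet classifier (assumed choice of architecture).
model = tf.keras.applications.MobileNetV2(weights="imagenet")

target_class = 291              # an arbitrary ImageNet class index (illustrative)
x = tf.Variable(tf.random.uniform([1, 224, 224, 3]))   # start from noise

for step in range(200):
    with tf.GradientTape() as tape:
        probs = model(x, training=False)
        score = probs[0, target_class]      # confidence for the target class
    grad = tape.gradient(score, x)
    # Gradient ascent on the *input image*, keeping the network weights fixed.
    x.assign(tf.clip_by_value(x + 0.01 * tf.sign(grad), 0.0, 1.0))

print("final confidence:", float(model(x)[0, target_class]))
```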
  • Incorporating Nesterov Momentum Into Adam
Incorporating Nesterov Momentum into Adam. Timothy Dozat.
1 Introduction
When attempting to improve the performance of a deep learning system, there are more or less three approaches one can take: the first is to improve the structure of the model, perhaps adding another layer, switching from simple recurrent units to LSTM cells [4], or – in the realm of NLP – taking advantage of syntactic parses (e.g. as in [13, et seq.]); another approach is to improve the initialization of the model, guaranteeing that the early-stage gradients have certain beneficial properties [3], or building in large amounts of sparsity [6], or taking advantage of principles of linear algebra [15]; the final approach is to try a more powerful learning algorithm, such as including a decaying sum over the previous gradients in the update [12], by dividing each parameter update by the L2 norm of the previous updates for that parameter [2], or even by foregoing first-order algorithms for more powerful but more computationally costly second-order algorithms [9].
Algorithm 1: Gradient Descent
    g_t ← ∇_θ f(θ_{t−1})
    θ_t ← θ_{t−1} − η g_t
…sum (with decay constant μ) of the previous gradients into a momentum vector m, and using that instead of the true gradient. This has the advantage of accelerating gradient descent learning along dimensions where the gradient remains relatively consistent across training steps and slowing it along turbulent dimensions where the gradient is significantly oscillating.
Algorithm 2: Classical Momentum
    g_t ← ∇_θ f(θ_{t−1})
    m_t ← μ m_{t−1} + g_t
    θ_t ← θ_{t−1} − η m_t
[14] show that Nesterov's accelerated gradient…
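A small numeric sketch of the update rules above – plain gradient descent (Algorithm 1), classical momentum (Algorithm 2) and one common formulation of Nesterov's accelerated gradient – on a toy quadratic objective; the objective, learning rate and decay constant are illustrative choices, not the paper's experiments.

```python
import numpy as np

def grad(theta):
    # Gradient of a toy objective f(theta) = 0.5 * ||theta||^2 (illustrative).
    return theta

eta, mu, steps = 0.1, 0.9, 100            # learning rate, momentum decay constant
theta_gd = theta_cm = theta_nag = np.array([5.0, -3.0])
m_cm = m_nag = np.zeros(2)

for _ in range(steps):
    # Algorithm 1: gradient descent.
    theta_gd = theta_gd - eta * grad(theta_gd)

    # Algorithm 2: classical momentum -- a decaying sum of past gradients.
    m_cm = mu * m_cm + grad(theta_cm)
    theta_cm = theta_cm - eta * m_cm

    # Nesterov's accelerated gradient: evaluate the gradient at the
    # "looked-ahead" point theta - eta * mu * m before updating.
    m_nag = mu * m_nag + grad(theta_nag - eta * mu * m_nag)
    theta_nag = theta_nag - eta * m_nag

print(theta_gd, theta_cm, theta_nag)
```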
  • CNN Model for Image Classification on MNIST and Fashion-MNIST Dataset
Volume 64, Issue 2, 2020. Journal of Scientific Research, Institute of Science, Banaras Hindu University, Varanasi, India.
CNN Model for Image Classification on MNIST and Fashion-MNIST Dataset. Shivam S. Kadam, Amol C. Adamuthe*, and Ashwini B. Patil. Dept. of CS & IT, RIT, Rajaramnagar, Sangli, MS, India. [email protected], [email protected]*, [email protected]
Abstract: The paper presents an application of a convolutional neural network to the image classification problem. The MNIST and Fashion-MNIST datasets are used to test the performance of the CNN model. The paper presents five different architectures with varying convolutional layers, filter size and fully connected layers. Experiments were conducted with varying hyper-parameters, namely activation function, optimizer, learning rate, dropout rate and batch size. Results show that the selection of activation function, optimizer and dropout rate has an impact on the accuracy of results. All architectures give accuracy of more than 99% for the MNIST dataset. The Fashion-MNIST dataset is more complex than MNIST. For the Fashion-MNIST dataset, architecture 3 gives better results.
From the paper: …flatten it to pass on to the output layer, which will make the prediction. Due to such architecture, it will take fewer parameters to learn and reduce the amount of data required to train the model. CNNs are inspired by time-delay neural networks, in which the weights are shared in the temporal dimension for a reduction in computation. Successful training of the hierarchical layers led CNN to become the first successful deep learning architecture, as shown by Liu et al. (2017). CNN has shown excellent performance for machine learning applications. Due to the practical benefits of CNN, it gives better accuracy and improves the performance of the system…
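As an illustration of where the hyper-parameters studied above (activation function, optimizer, learning rate, dropout rate, batch size) appear in code, here is a minimal Keras training setup for Fashion-MNIST; the small architecture and the specific values are assumptions, not the paper's architecture 3.

```python
from tensorflow.keras import datasets, layers, models, optimizers

(x_train, y_train), (x_test, y_test) = datasets.fashion_mnist.load_data()
x_train = x_train[..., None] / 255.0      # add channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),               # activation function
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dropout(0.3),                                    # dropout rate under study
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=128, epochs=5,         # batch size
          validation_data=(x_test, y_test))
```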