Artificial Intelligence: Separating Hype From Reality

Antons Mislēvičs Head of AI / , CTCo

Agenda

1. What can machines do today?
2. How does AI help in real projects?
3. How can you implement AI?

What can machines do today?

Go: Google AlphaGo 4 – Lee Sedol 1 (2016)

AlphaGo: https://deepmind.com/alpha-go

Poker: Libratus and DeepStack beat top pros (2017)

Carnegie Mellon Beats Top Poker Pros: https://www.cmu.edu/news/stories/archives/2017/january/AI-beats-poker-pros.html
Brains Vs. AI Rematch: Why Poker?: https://www.youtube.com/watch?v=JtyA2aUj4WI
Tough poker player: Brains Vs. AI update: https://www.youtube.com/watch?v=CRiH8yCskAE
Safe and Nested Endgame Solving for Imperfect-Information Games, N. Brown, T. Sandholm, 2017: http://www.cs.cmu.edu/~sandholm/safeAndNested.aaa17WS.pdf
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, 2017: https://arxiv.org/abs/1701.01724

AlphaZero learns chess (2017)

Google's AlphaZero Destroys Stockfish In 100-Game Match, 2017: https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match

Image Recognition
Error rates – human vs machine

1. Traffic Sign Recognition (IJCNN 2011):
   – Human: 1.16%
   – Machine: 0.54%

2. Handwritten Digits (MNIST):
   – Human: approx. 0.2%
   – Machine: 0.23% (2012)
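To put the two comparisons above on the same scale, the error rates can be expressed as a machine-to-human ratio. The snippet below is only an illustrative calculation on the numbers from this slide; the dictionary and function names are made up for the example.

```python
# Error rates in percent, copied from the slide above.
ERROR_RATES = {
    "Traffic signs (IJCNN 2011)": {"human": 1.16, "machine": 0.54},
    "Handwritten digits (MNIST)": {"human": 0.2, "machine": 0.23},
}


def machine_to_human_ratio(rates):
    """Return machine error divided by human error (below 1.0 means the machine errs less)."""
    return rates["machine"] / rates["human"]


for task, rates in ERROR_RATES.items():
    ratio = machine_to_human_ratio(rates)
    print(f"{task}: machine/human error ratio = {ratio:.2f}")
```

On these figures the machine makes a little under half as many errors as humans on traffic signs, and slightly more errors than humans on MNIST.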

The German Traffic Sign Recognition Benchmark: http://benchmark.ini.rub.de/?section=gtsrb&subsection=results
THE MNIST DATABASE of handwritten digits: http://yann.lecun.com/exdb/mnist/

Large Scale Visual Recognition Challenge (ILSVRC)

2015 challenge:
– Object detection – 200 categories
– Object recognition – 1000 categories
– Object detection from video – 30 categories
– Scene classification – 401 categories

Large Scale Visual Recognition Challenge 2015 (ILSVRC2015): http://image-net.org/challenges/LSVRC/2015/index#maincomp

Microsoft Research Team: “To our knowledge, our result is the first to surpass human-level performance…on this visual recognition challenge”

Large Scale Visual Recognition Challenge 2015 – Results: http://image-net.org/challenges/LSVRC/2015/results
Microsoft Researchers’ Algorithm Sets ImageNet Challenge Milestone, 2015: https://www.microsoft.com/en-us/research/microsoft-researchers-algorithm-sets--challenge-milestone/

Pixel Level Segmentation

Mask R-CNN: https://arxiv.org/abs/1703.06870
A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN: https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4

Machines can understand the meaning…

Show and Tell: A Neural Image Caption Generator, O. Vinyals, A. Toshev, S. Bengio, D. Erhan, 2015: http://arxiv.org/abs/1411.4555v2

Generating faces

TL-GAN interface demo: https://www.youtube.com/watch?v=O1by05eX424
TL-GAN: transparent latent-space GAN: https://github.com/SummitKwan/transparent_latent_gan

Text to speech and voice recognition…

1. Talk – text to speech (WaveNet) [1, 2]
2. Recognize voice [3, 4]

1. WaveNet: A Generative Model for Raw Audio: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
2. WaveNet: A Generative Model for Raw Audio, 2016: https://arxiv.org/abs/1609.03499
3. Historic Achievement: Microsoft researchers reach human parity in conversational speech recognition: https://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition/
4. Achieving Human Parity in Conversational Speech Recognition, 2016: http://arxiv.org/abs/1610.05256

What else can machines do?

1. Compose music [1]
2. Generate handwriting [2, 3]
3. Translate texts [4, 5, 6]

1. Composing Music With Recurrent Neural Networks: http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
2. Generating Sequences With Recurrent Neural Networks, A. Graves, 2014: http://arxiv.org/abs/1308.0850
3. Alex Graves’s handwriting generation demo: http://www.cs.toronto.edu/~graves/handwriting.html
4. University of Montreal, Lisa Lab, Neural Machine Translation demo: http://lisa.iro.umontreal.ca/mt-demo
5. Fully Character-Level Neural Machine Translation without Explicit Segmentation, J. Lee, K. Cho, T. Hofmann, 2016: http://arxiv.org/abs/1610.03017
6. Microsoft Research – Achieving Human Parity on Automatic Chinese to English News Translation: https://www.microsoft.com/en-us/research/publication/achieving-human-parity-on-automatic-chinese-to-english-news-translation/

Generating Text: Fake algebraic geometry

The Unreasonable Effectiveness of Recurrent Neural Networks: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Generating Text: Fake cooking recipes

do androids dream of cooking?: https://gist.github.com/nylki/1efbaa36635956d35bcc

And this is where it gets scary… (Dec 2017)

Exploring DeepFakes: https://www.kdnuggets.com/2018/03/exploring-deepfakes.html
Google Duplex: A.I. Assistant Calls Local Businesses To Make Appointments: https://www.youtube.com/watch?v=D5VN56jQMWM

AI writes fake news (2019)

New AI fake text generator may be too dangerous to release, say creators: https://www.theguardian.com/technology/2019/feb/14/elon-musk-backed-ai-writes-convincing-news-fiction
OpenAI built a text generator so good, it’s considered too dangerous to release: https://techcrunch.com/2019/02/17/openai-text-generator-dangerous/
The AI Text Generator That's Too Dangerous to Make Public: https://www.wired.com/story/ai-text-generator-too-dangerous-to-make-public/

Does this mean that machines are already as smart as we are?

Introducing GeForce GTX TITAN Z: Ultimate Power, May 2014: http://www.geforce.com/whats-new/articles/introducing-nvidia-geforce-gtx-titan-z

Deep neural networks can be fooled…

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, 2015: https://arxiv.org/abs/1412.1897
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images: http://www.evolvingai.org/fooling

Machines do not always “understand” images…

Accelerating innovation and powering new experiences with AI: https://code.facebook.com/posts/310100219388873/accelerating-innovation-and-powering-new-experiences-with-ai/
CS231n: Convolutional Neural Networks for Visual Recognition, Lecture 10: Recurrent Neural Networks: http://cs231n.stanford.edu/slides/winter1516_lecture10.pdf

Machines do not “understand” how the world works…

Accelerating innovation and powering new experiences with AI: https://code.facebook.com/posts/310100219388873/accelerating-innovation-and-powering-new-experiences-with-ai/
RI Seminar: Yann LeCun: The Next Frontier in AI: https://www.youtube.com/watch?v=IbjF5VjniVE

Experiment

Stanford Encyclopedia of Philosophy – The Chinese Room Argument: https://plato.stanford.edu/entries/chinese-room/
The “Chinese room” argument: http://cse3521.artifice.cc/chinese-room.html

What are examples of real customer solutions?

Solution examples

Do you need a PhD to implement such solutions?

Implementing Machine Learning models in your project

1. Easy:
   – Use Cognitive Services
   – Integrate with your solution via web services
2. Medium:
   – Use Azure ML GUI
   – Integrate with your solution via web services
3. Hard:
   – Use Azure VM with GPU + ML libraries (, , TensorFlow, , CNTK)

Face Detection

Face Verification

Recognize Emotions

Similar Face Searching

Face Grouping
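As a rough illustration of the "Easy" route (calling Cognitive Services over a web service), face detection could be invoked along the lines of the Python sketch below. This is not taken from the slides: the path `/face/v1.0/detect` and the `Ocp-Apim-Subscription-Key` header follow the publicly documented Face REST API and should be checked against the current documentation, while the endpoint host, key, image URL, and function names are hypothetical placeholders.

```python
import json
import urllib.request


def build_face_detect_request(endpoint, subscription_key, image_url):
    """Build (but do not send) a POST request asking the service to detect faces."""
    # Assumed path of the Face API v1.0 detect operation; verify against current docs.
    url = endpoint.rstrip("/") + "/face/v1.0/detect"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def detect_faces(request):
    """Send the request and return the parsed JSON response (needs network and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical endpoint, key, and image; replace with real values before sending.
    req = build_face_detect_request(
        "https://example.cognitiveservices.azure.com",
        "YOUR_SUBSCRIPTION_KEY",
        "https://example.com/photo.jpg",
    )
    # Only inspect the request here; call detect_faces(req) to actually send it.
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the integration easy to inspect and test without network access or a subscription key.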

Microsoft Cognitive Services: https://www.microsoft.com/cognitive-services

Azure Machine Learning Studio

Microsoft Azure – Machine Learning: https://azure.microsoft.com/en-us/services/machine-learning/
Azure Machine Learning: https://studio.azureml.net/

Slides: https://1drv.ms/f/s!AsXSX3Q3cMlAjsENH4rZ5q6doL7Bew