Artificial Intelligence and Machine Learning

Vijay Gadepally, Jeremy Kepner, Lauren Milechin, Siddharth Samsi

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

© 2020 Massachusetts Institute of Technology. Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Slide contributions from: Siddharth Samsi, Albert Reuther, Jeremy Kepner, David Martinez, Lauren Milechin

Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
• Conclusions/Summary


What is Artificial Intelligence?

Narrow AI: The theory and development of computer systems that perform tasks that augment human intelligence, such as perceiving, classifying, learning, abstracting, reasoning, and/or acting

General AI: Full autonomy

Definition adapted from the Oxford dictionary and inputs from Prof. Patrick Winston (MIT)

AI: Why Now?

Big Data + Compute Power + Machine Learning Algorithms

Source: DARPA / Public domain

The convergence of high performance computing, big data, and algorithms enables widespread AI development


AI Canonical Architecture

© IEEE. Figure 1 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

[Figure: AI canonical architecture. Sensors supply structured and unstructured data to Data Conditioning (data to information), then to Algorithms (information to knowledge) such as knowledge-based methods, supervised and unsupervised learning, transfer learning, and reinforcement learning, then to Human-Machine Teaming (knowledge to insight) spanning a spectrum from human to human-machine complement to machine, producing courses of action (CoA) for users (missions). The pipeline rests on Modern Computing (CPUs, GPUs, TPUs, neuromorphic, custom, quantum) and Robust AI (explainable AI, metrics and bias assessment, verification & validation, security (e.g., counter AI), and policy, ethics, safety, and training).]

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action

Select History of Artificial Intelligence

[Timeline spanning 1950–2017. AI Winters: 1974–1980 and 1987–1993.]

• 1950: "Computing Machinery and Intelligence" (the "Turing Test") published in MIND vol. LIX
• 1955: Western Joint Computer Conference Session on Learning Machines, including "Generalization of Pattern Recognition in a Self-Organizing System" and "Pattern Recognition and Modern Computers" (W.A. Clark, G. Dineen, O. Selfridge; MIT LL Staff)
• 1956: Dartmouth Summer Research Project on AI (J. McCarthy, M. Minsky, N. Rochester, O. Selfridge, C. Shannon, others)
• 1957: Frank Rosenblatt, neural networks ("The Perceptron: A Perceiving and Recognizing Automaton")
• 1957: Memory Test Computer, first computer to simulate the operation of neural networks (MIT LL)
• 1958: National Physical Laboratory (UK) Symposium on the Mechanization of Thought Processes
• 1959: Arthur Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of R&D
• 1960: Recognizing hand-written characters, Robert Larson of the SRI AI Center
• 1961: James Slagle, solving freshman calculus (Minsky student, MIT)
• 1979: "An Assessment of AI from a Lincoln Laboratory Perspective," internal publication (J. Forgie, J. Allen)
• 1982: Expert systems (e.g., the DENDRAL project at Stanford)
• 1984: Hidden Markov models
• 1986–present: The return of neural networks
• 1988: Statistical machine translation
• 1989: Convolutional neural networks
• 1994: Human-level spontaneous speech recognition
• 1997: IBM Deep Blue defeats reigning chess champion (Garry Kasparov)
• 2001–present: Availability of very large data sets
• 2005: Google's Arabic- and Chinese-to-English translation
• 2007: DARPA Grand Challenge ("Urban Challenge")
• 2011: IBM Watson defeats former Jeopardy! champions (Brad Rutter and Ken Jennings)
• 2012: Team from U. of Toronto (Geoff Hinton's lab) wins the ImageNet Large Scale Visual Recognition Challenge with deep-learning software
• 2014: Google's GoogleNet, object classification at near-human performance
• 2015: DeepMind achieves human expert level of play on Atari games (using only raw pixels and scores)
• 2016: DeepMind AlphaGo defeats top human Go player (Lee Sedol)
• 2016: DARPA Cyber Grand Challenge

Adapted from: The Quest for Artificial Intelligence, Nils J. Nilsson, 2010, and MIT Lincoln Laboratory Library and Archives

Artificial Intelligence Evolution

Four Waves of AI

• Wave 1, Reasoning*: Handcrafted Knowledge
• Wave 2, Learning: Statistical Learning (lots of data enabled non-expert systems)
• Wave 3, Context: Contextual Adaptation (adding context to AI systems)
• Wave 4, Abstraction: System Evolution (ability of the system to abstract)

Each wave can be characterized by how well it perceives, learns, abstracts, and reasons.

* Waves adapted from John Launchbury, Director I2O, DARPA


Spectrum of Commercial Organizations in the Machine Intelligence Field

Source: Shivon Zilis, 2016, https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0
© Shivon Zilis and James Cham, designed by Heidi Skinner. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Data is Critical To Breakthroughs in AI

• 1994, human-level read-speech recognition: dataset, spoken Wall Street Journal articles and other texts (1991); algorithm, Hidden Markov Model (1984)
• 1997, IBM Deep Blue defeated Garry Kasparov: dataset, 700,000 Grandmaster chess games, aka "The Extended Book" (1991); algorithm, Negascout planning algorithm (1983)
• 2005, Google's Arabic- and Chinese-to-English translation: dataset, 1.8 trillion tokens from Google Web and News pages (collected in 2005); algorithm, statistical machine translation algorithm (1988)
• 2011, IBM Watson became the world Jeopardy! champion: dataset, 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2010); algorithm, Mixture-of-Experts algorithm (1991)
• 2014, Google's GoogleNet object classification at near-human performance: dataset, ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010); algorithm, convolutional neural network algorithm (1989)
• 2015, Google's DeepMind achieved human parity in playing 29 Atari games by learning general control from video: dataset, Arcade Learning Environment dataset of over 50 Atari games (2013); algorithm, Q-learning algorithm (1992)

Average number of years from first availability to breakthrough: 3 years for datasets, 18 years for algorithms.

Source: Train AI 2017, https://www.crowdflower.com/train-ai/

AI Canonical Architecture

[Figure: AI canonical architecture, repeated from above: sensors supply structured and unstructured data to Data Conditioning, Algorithms, and Human-Machine Teaming, producing courses of action for users (missions), all supported by Modern Computing and Robust AI.]

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action

Unstructured and Structured Data

Data Conditioning/Storage Technologies (Data to Information)

Structured data types (e.g., network logs, metadata) and unstructured data types (e.g., speech, sensors, social media, human behavior, reports, side channel) feed three classes of technologies and the capabilities they provide:

• Infrastructure/Databases: indexing/organization/structure, high performance data access, domain specific languages, declarative interfaces
• Data Curation: unsupervised machine learning, dimensionality reduction, clustering/pattern recognition, outlier detection
• Data Labeling: initial data exploration, highlighting missing or incomplete data, reorienting sensors/recapturing data, looking for errors/biases in collection

Often takes up 80+% of overall AI/ML development work


Machine Learning Algorithms Taxonomy

[Figure: nested view of the field. Artificial Intelligence contains Machine Learning, which contains Neural Nets and, within them, Deep Neural Nets. Overlaid are the "five tribes of machine learning"*: Symbolists (e.g., expert systems), Bayesians (e.g., naive Bayes), Analogizers (e.g., SVM), Connectionists (e.g., DNN), and Evolutionaries (e.g., genetic programming).]

* "The Five Tribes of Machine Learning", Pedro Domingos

DNN = Deep Neural Networks; SVM = Support Vector Machines; Exp. Sys. = Expert Systems
Image adapted from "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Modern AI Computing Engines

What each computing class provides to AI, with selected results:

• CPU: the most popular computing platform; general-purpose compute (selected result: AlexNet forward-backward pass comparison)
• GPU: used by most for training algorithms (well suited to neural network computations)
• TPU: speeds up inference time (domain-specific architecture)
• Neuromorphic: active research area
• Custom: ability to speed up specific computations of interest (e.g., graphs)
• Quantum: benefits unproven until now; recent results on HHL (linear systems of equations)

[Chart: SpGEMM performance, traversed edges per second vs. watts, from embedded to data-center scale, for a projected ASIC graph processor, a measured FPGA graph processor, and measured Cray XK7 Titan and Cray XT4 Franklin systems.]

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; HHL = Harrow-Hassidim-Lloyd quantum algorithm

Neural Network Processing Performance

© IEEE. Figure 2 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

[Chart: peak performance (GOps/s to TeraOps/s) vs. peak power (W) for neural network processors, from embedded parts (TrueNorth, TPU Edge, Movidius X, Jetson TX1/TX2) through CPUs (Skylake SP, Xeon Phi) and GPUs (K80, P100, V100, Turing, Xavier) to data-center accelerators and systems (TPU1-3, GraphCore, Goya, Nervana, Arria, WaveDPU, DGX-Station, DGX-1, DGX-2, WaveSystem). The legend encodes computation precision (Int8 through Float64), form factor (chip, card, system), and computation type (inference vs. training).]

Reuther, Albert, et al. "Survey and Benchmarking of Machine Learning Accelerators." arXiv preprint arXiv:1908.11348 (2019).

Robust AI: Preserving Trust

[Chart: confidence level in the machine making the decision vs. consequence of actions. High confidence with low consequence of actions is best matched to machines; low confidence with high consequence is best matched to humans; the region in between suits machines augmenting humans.]


Importance of Robust AI

Robust AI feature, the issue it addresses, and example solutions:

• Explainable AI. Issue: user unfamiliarity or mistrust leads to lack of adoption. Example solutions: seamless integration, model expansion, transparent uncertainty.
• Metrics. Issue: unknown relationship between arbitrary input and machine output. Example solutions: explainability, dimensionality reduction, feature importance inference.
• Validation & Verification. Issue: algorithms need to meet mission specifications. Example solutions: robust training, "portfolio" methods, regularization.
• Security. Issue: system vulnerable to adversarial action (both cyber and physical). Example solutions: model failure detection, red teaming.
• Policy, Ethics, Safety, and Training. Issue: unwanted actions when controlling heavy or dangerous machinery. Example solutions: risk sensitivity, robust inference, high decision thresholds.


Human-Machine Teaming

[Chart: scale vs. application complexity. At high scale and low application complexity, machines are more effective than humans; at low scale and high complexity, humans are more effective than machines; between the two, machines augment humans. A companion chart repeats confidence level vs. consequence of actions, with the same three regimes.]

Human-machine teaming will consist of intelligent assistants enabled by artificial intelligence.

Critical element of AI: understanding how humans and machines can work together for applications

Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary


What is Machine Learning?

• Machine Learning
  – Study of algorithms that improve their performance at some task with experience (data)
  – Optimize based on a performance criterion using example data or past experience
• Combination of techniques from the statistics and computer science communities
• Getting computers to program themselves
• Common tasks:
  – Classification
  – Regression
  – Prediction
  – Clustering
  – …


Traditional Programming vs. Machine Learning

Traditional Programming: Data + Program → Computer → Output

Machine Learning: Data + Output → Computer → Program

Source: Pedro Domingos, University of Washington
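To make the contrast concrete, here is a minimal sketch in Python with NumPy (illustrative, not from the slides): the traditional route hand-codes the rule, while the learning route recovers the same rule, here a line's two coefficients, from data and outputs.

```python
import numpy as np

# Traditional programming: a human writes the program (the rule).
def fahrenheit_traditional(celsius):
    return celsius * 9.0 / 5.0 + 32.0

# Machine learning: hand the computer data and outputs and let it
# recover the program, here the two coefficients of a line.
celsius = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
fahrenheit = np.array([32.0, 50.0, 68.0, 86.0, 104.0])

A = np.stack([celsius, np.ones_like(celsius)], axis=1)
w, b = np.linalg.lstsq(A, fahrenheit, rcond=None)[0]

print(round(w, 3), round(b, 3))            # 1.8 and 32.0: the learned "program"
print(w * 25.0 + b)                        # applying it: 77.0
```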

Machine Learning Techniques

• Supervised (Labels)
• Unsupervised (No Labels)
• Reinforcement (Reward Information)


Machine Learning Techniques

• Supervised (Labels): classification, regression
• Unsupervised (No Labels): clustering, dimensionality reduction
• Reinforcement (Reward Information)

Machine Learning Techniques

• Supervised (Labels)
  – Classification: Naïve Bayes, Logistic Regression, K Nearest Neighbors, SVM, Ensembles, Tree-based techniques, Neural Nets
  – Regression: (Multiple) Linear Regression, LASSO, Non-Linear Regression
• Unsupervised (No Labels)
  – Clustering: Hierarchical Methods, Spectral Methods, K-Means, GMM, HMM
  – Dimensionality Reduction: PCA, Kernel PCA, NMF, Embedded Methods, Filter Methods, Wrapper Methods
  – Structure Learning
• Reinforcement (Reward Information): Markov Decision Processes, SARSA, Q-learning

Common ML Pitfalls

• Over-fitting vs. under-fitting
• Bad/noisy/missing data
• Model selection
• Lack of success metrics
• Linear vs. non-linear models
• Ignoring outliers
• Training vs. testing data
• Computational complexity, curse of dimensionality
• Correlation vs. causation
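Several of these pitfalls (over-fitting vs. under-fitting, training vs. testing data, model selection) can be seen in a few lines. A minimal sketch, assuming scikit-learn is available; the dataset and polynomial degrees are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)

# Held-out test data guards against judging a model on data it memorized.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 3, 15):                  # under-fit, reasonable, over-fit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree,
          round(model.score(X_train, y_train), 3),  # training R^2 keeps rising
          round(model.score(X_test, y_test), 3))    # test R^2 drops when over-fitting
```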


Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary


Supervised Learning

• Starting with labeled data (ground truth)
• Build a model that predicts labels
• Two general goals:
  – Regression: predict a continuous variable
  – Classification: predict a class or label
• Generally has a training step that forms the model

[Diagram: training data (data + labels) is used to train a model; test data is fed to the trained supervised learning model to produce predicted labels.]
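A minimal sketch of this workflow, assuming scikit-learn; the iris dataset and the k-nearest-neighbors classifier are illustrative stand-ins for any labeled data and model:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)           # data plus ground-truth labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsClassifier(n_neighbors=5) # training step forms the model
model.fit(X_train, y_train)

predicted = model.predict(X_test)           # predicted labels for test data
print(accuracy_score(y_test, predicted))
```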


Artificial Neural Networks

• Computing systems inspired by biological networks
• Systems learn by repetitive training to do tasks based on examples
  – Generally a supervised learning technique (though unsupervised examples exist)
• Components: inputs, layers, outputs, weights
• Deep Neural Network: lots of "hidden layers"
• Popular variants:
  – Convolutional Neural Nets
  – Recursive Neural Nets
  – Deep Belief Networks
• Very popular these days, with many toolboxes and hardware support


Deep Neural Networks

[Figure: a four-layer network mapping input y0 to output y3 through weight matrices W0, W1, W2 and biases b0, b1, b2.]

y_{i+1} = f(W_i · y_i + b_i)

© RSIP. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Image source: http://www.rsipvision.com/exploring-deep-learning/
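The recurrence y_{i+1} = f(W_i · y_i + b_i) can be written directly in NumPy. A minimal sketch with illustrative layer sizes and random weights, using ReLU as the (assumed) activation f:

```python
import numpy as np

def relu(x):                               # assumed activation f
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]                       # y0 has 4 features; y3 has 3 outputs
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

y = rng.normal(size=sizes[0])              # y0: the input
for W, b in zip(Ws, bs):
    y = relu(W @ y + b)                    # y_{i+1} = f(W_i y_i + b_i)
print(y.shape)                             # (3,): the output y3
```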

Components of an Artificial Neural Network

[Figure: a single neuron/node. Inputs (-0.06, 2.5, 1.4) arrive over connections with weights w1, w2, w3; the neuron combines them and applies its function f(x).]

© Drexel University. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Image source: https://www.cs.drexel.edu/~greenie/cs510/CS510-17-08.pdf

Components of an Artificial Neural Network

With weights w1 = 2.7, w2 = 8.6, w3 = 0.002 and inputs -0.06, 2.5, 1.4:

x = (-0.06 × 2.7) + (2.5 × 8.6) + (1.4 × 0.002) = 21.34, so f(x) ≈ 1

© Drexel University. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Image source: https://www.cs.drexel.edu/~greenie/cs510/CS510-17-08.pdf
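The same arithmetic in a few lines of Python. The slides do not name f; a sigmoid, a common choice, reproduces the ≈1 output:

```python
import numpy as np

def sigmoid(x):                            # assumed squashing function f
    return 1.0 / (1.0 + np.exp(-x))

inputs  = np.array([-0.06, 2.5, 1.4])
weights = np.array([2.7, 8.6, 0.002])

x = inputs @ weights                       # weighted sum of the inputs
print(round(float(x), 4))                  # 21.3408, the 21.34 from the slide
print(sigmoid(x))                          # ≈ 1.0, matching f(x) ≈ 1
```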

Common Activation Functions

%, " < % • Step Function: ! " = $ (, " ≥ %

: ! " = ( (*+,"

• Tanh Function: ! " = -./0(")

• Rectified Linear Unit (ReLU): ! " = 3."(%, ")
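The four functions in NumPy, evaluated on a few sample points:

```python
import numpy as np

def step(x):    return np.where(x < 0.0, 0.0, 1.0)
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def tanh(x):    return np.tanh(x)
def relu(x):    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (step, sigmoid, tanh, relu):
    print(f.__name__, np.round(f(x), 3))
```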


Neural Network Landscape

© Fjodor van Veen and Stefan Leijnen. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Image source: http://www.asimovinstitute.org/neural-network-zoo/

Neural Network Training

Key Idea: Adjusting the weights changes the function represented by the neural network (learning = optimization in weight space).

Iteratively adjust weights to reduce error (difference between network output and target output)

Weight update methods:
– training rule
– linear programming
– delta rule
– backpropagation

Real neural network architectures can have thousands of input data points, hundreds of layers, and millions of weight changes per iteration
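A minimal sketch of the core idea, iteratively adjusting weights to reduce error, using the delta rule (gradient descent on mean squared error) for a single linear neuron; the data and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))              # 100 examples, 3 inputs each
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w                             # target outputs

w = np.zeros(3)                            # initial weights
lr = 0.1                                   # learning rate
for _ in range(200):
    y_hat = X @ w                          # network output
    error = y_hat - y                      # difference from target output
    grad = X.T @ error / len(X)            # gradient of mean squared error
    w -= lr * grad                         # adjust weights to reduce error

print(np.round(w, 3))                      # ≈ [1.5, -2.0, 0.5]
```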


Neural Network Inference

• Using the trained model on previously “unlabeled” data


Neural Network Learning: Decision Boundary


Designing a Neural Network

• Designing a neural network can be a complicated task
• Many choices:
  – Inputs (number of inputs)
  – Depth (number of layers)
  – Type of network: Convolutional Neural Network, Deep Feedforward Neural Network, Deep Belief Network, Long/Short Term Memory, …
  – Types of layers: MaxPool, Dropout, Convolutional, Deconvolutional, Softmax, Fully Connected, Skip, …
  – Training algorithm: performance vs. quality, stopping criteria, performance function
  – Metrics: false positives, ROC curve, …


Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary


Unsupervised Learning

• Task of describing hidden structure from unlabeled data

• More formally, we observe features X1, X2, …, Xn and would like to discover patterns among these features
• We are not interested in prediction, because we don't know what an output Y would look like

• Typical tasks and associated algorithms:
  – Clustering
  – Data projection/preprocessing

• Goal is to discover interesting things about the dataset: subgroups, patterns, clusters?


More on Unsupervised Learning

• There is no good, simple goal (such as maximizing a certain probability) for the algorithm
• Very popular because the techniques work on unlabeled data
  – Labeled data can be difficult and expensive to obtain

• Common techniques:
  – Clustering: K-Means, nearest neighbor search, spectral clustering
  – Data projection/preprocessing: principal component analysis, dimensionality reduction, scaling


Clustering

• Group objects or sets of features such that objects in the same cluster are more similar to each other than to objects in another cluster
• Optimal clusters should:
  – Minimize intra-cluster distance
  – Maximize inter-cluster distance
• Example of an intra-cluster measure, the squared error se:

  se = Σ_{i=1}^{k} Σ_{p ∈ c_i} ‖p − m_i‖²

where m_i is the mean of all features in cluster c_i
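A minimal k-means sketch (Lloyd's algorithm) in NumPy that alternates assignment and mean updates and reports exactly this squared error; the toy data and iteration count are illustrative:

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize the cluster means with k randomly chosen points.
    means = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest cluster mean.
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each mean m_i from its cluster's points.
        means = np.array([points[labels == i].mean(axis=0) for i in range(k)])
    # The squared error se from the slide: sum over clusters and points.
    se = sum(((points[labels == i] - means[i]) ** 2).sum() for i in range(k))
    return means, labels, se

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(loc=c, size=(50, 2)) for c in (0.0, 5.0)])
means, labels, se = kmeans(pts, k=2)
print(np.round(means, 2), round(float(se), 2))
```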


Dimensionality Reduction

• Process of reducing the number of random variables under consideration
  – Key idea: reduce a large dataset to a much smaller dataset using only the high-variance dimensions
• Often used to simplify computation or representation of a dataset
• Typical tasks:
  – Feature selection: try to find a subset of the original variables
  – Feature extraction: try to represent data in lower dimensions
• Often key to good performance for other machine learning techniques such as regression, classification, etc.
• Other uses:
  – Compression: reduce dataset to a smaller representation
  – Visualization: easy to see low-dimensional data
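A short feature-extraction sketch, assuming scikit-learn: PCA keeps only the high-variance directions of a dataset that is 10-dimensional on paper but (by construction here) nearly 2-dimensional:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 200 samples in 10 dimensions, but most variance lives in 2 directions.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

pca = PCA(n_components=2)                  # keep the 2 high-variance directions
X_reduced = pca.fit_transform(X)           # feature extraction: 10 -> 2 dims
print(X_reduced.shape)                     # (200, 2)
print(pca.explained_variance_ratio_.sum()) # ≈ 1: two components suffice
```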


Neural Networks and Unsupervised Learning

• Traditional applications of neural networks, such as image classification, fall into the realm of supervised learning:
  – Given example inputs x and target output y, learn the mapping between them
  – A trained network is supposed to give the correct target output for any input stimulus
  – Training is learning the weights

• Largely used to find better representations for data: clustering and dimensionality reduction
• Non-linear capabilities


Example: Autoencoder

• Neural network architecture designed to find a compressed representation for data
• Feedforward, multi-layer perceptron
• Number of features in the input layer equals the number in the output layer
• Similar to dimensionality reduction, but allows for much more complex representations

[Figure: Inputs → Encoder → Compressed Representation → Decoder → Reconstructed Input]

© Leonardo Araujo dos Santos. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
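A minimal sketch of the idea, not the slide's exact network: scikit-learn's MLPRegressor pressed into service as a linear autoencoder by training it to reproduce its own input through an 8-unit bottleneck (all sizes illustrative):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# 20-dimensional data that secretly lies on an 8-dimensional subspace.
X = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 20))

# Input width = output width (20 features); one bottleneck hidden layer (8).
# Training target = the input itself, so the network must learn to
# compress (encode) and then reconstruct (decode) the data.
autoencoder = MLPRegressor(hidden_layer_sizes=(8,), activation="identity",
                           max_iter=5000, random_state=0)
autoencoder.fit(X, X)

X_hat = autoencoder.predict(X)             # reconstructed input
print(np.mean((X - X_hat) ** 2))           # small reconstruction error
```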


Example: Replicator Neural Network

• Conceptually, very similar to autoencoders
• Used extensively for anomaly detection (looking for outliers)
• Example architecture:
• Salient differences from an autoencoder: a step activation function and the inclusion of dropout layers
  – Step activation squeezes the middle-layer outputs into a number of clusters
  – Dropout layers help with overfitting

[Figure: example replicator network architecture and its step activation function]

Hawkins, Simon, et al. "Outlier detection using replicator neural networks." International Conference on Data Warehousing and Knowledge Discovery. Springer, Berlin, Heidelberg, 2002.

Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary


Reinforcement Learning

• What makes reinforcement learning different from other machine learning paradigms?
  – There is no supervisor, only a reward signal
  – Feedback is delayed, not instantaneous
  – Time really matters (sequential, often inter-dependent data)
  – The agent's actions affect the subsequent data it receives
• Example: playing an Atari game
  – At each step the agent sees an observation O_t, takes an action A_t, and receives a reward R_t
  – Rules of the game are unknown
  – Learn directly from interactive game-play
  – Pick actions on the joystick, see pixels and scores

© David Silver. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Image source: "Introduction to Reinforcement Learning" by David Silver (Google DeepMind)

Other Reinforcement Learning Examples

• Fly stunt maneuvers in a helicopter: + reward for following the desired trajectory, − reward for crashing
• Defeat the world champion at Backgammon: +/− reward for winning/losing a game
• Manage an investment portfolio: + reward for each $ in the bank
• Control a power station: + reward for producing power, − reward for exceeding safety thresholds
• Make a humanoid robot walk: + reward for forward motion, − reward for falling over
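The taxonomy earlier names Q-learning; here is a minimal tabular Q-learning sketch on a hypothetical 5-state corridor where the only feedback is a reward at the far end, mirroring the "reward signal, no labels" setting above (all constants illustrative):

```python
import numpy as np

# Toy 1-D corridor: states 0..4, start at state 0, reward only at state 4.
n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.3          # learning rate, discount, exploration
rng = np.random.default_rng(0)

for _ in range(200):                       # episodes of interactive play
    s = 0
    while s != 4:
        # Epsilon-greedy: explore at random (also when estimates are tied).
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == 4 else 0.0    # reward signal only; no labels
        # Q-learning update: nudge Q[s,a] toward r + gamma * max_a' Q[s',a'].
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1)[:4])                # learned policy for states 0..3: all "right"
```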


Outline

• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary


Summary

• Lots of exciting research into AI/ML techniques
  – This course looks at a number of relatively easy strategies to mitigate these challenges
• Key ingredients for AI success:
  – Data availability
  – Computing infrastructure
  – Domain expertise/algorithms
• Large challenges in data availability and readiness for AI
• The MIT SuperCloud platform (next presentation) can be used to perform the heavy computation needed

• Further reading:
  – "AI Enabling Technologies: A Survey," https://arxiv.org/abs/1905.03592


MIT OpenCourseWare
https://ocw.mit.edu/

RES.LL-005 Mathematics of Big Data and Machine Learning IAP 2020

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.