Artificial Intelligence and Machine Learning
Vijay Gadepally, Jeremy Kepner, Lauren Milechin, Siddharth Samsi
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.
© 2020 Massachusetts Institute of Technology. Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.
Slide contributions from: Siddharth Samsi, Albert Reuther, Jeremy Kepner, David Martinez, Lauren Milechin
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
AI and ML - 2 VNG 010720
What is Artificial Intelligence?
Narrow AI: The theory and development of computer systems that perform tasks that augment human intelligence, such as perceiving, classifying, learning, abstracting, reasoning, and/or acting
General AI: Full autonomy
Definition adapted from the Oxford dictionary and inputs from Prof. Patrick Winston (MIT)
AI. Why Now?
Big Data + Compute Power + Machine Learning Algorithms
Source: DARPA / Public domain
Convergence of High Performance Computing, Big Data and Algorithms that enable widespread AI development
AI Canonical Architecture
© IEEE. Figure 1 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
[Figure: the AI canonical architecture. Structured and unstructured data from sources flow through data conditioning into algorithms (knowledge-based; unsupervised, supervised, and transfer learning; reinforcement learning; etc.), turning data into information, knowledge, and insight for human-machine teaming with users (missions): a spectrum running from human-only through human-machine complement to machine-only courses of action (CoA). The stack is supported by modern computing (CPUs, GPUs, TPUs, neuromorphic, custom, quantum) and robust AI (explainable AI, metrics and bias assessment, verification & validation, security (e.g., counter AI), and policy, ethics, safety, and training).]
GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action
Select History of Artificial Intelligence
[Timeline: 1950-2017. AI Winters: 1974–1980 and 1987–1993.]
• 1950: "Computing Machinery and Intelligence" (the "Turing Test") published in MIND vol. LIX
• 1955: Western Joint Computer Conference Session on Learning Machines, with papers "Generalization of Pattern Recognition in a Self-Organizing System" (W.A. Clark), "Programming Pattern Recognition" (G. Dineen), and "Pattern Recognition and Modern Computers" (O. Selfridge) (MIT LL staff)
• 1956: Dartmouth Summer Research Project on AI (J. McCarthy, M. Minsky, N. Rochester, O. Selfridge, C. Shannon, and others)
• 1957: Frank Rosenblatt's perceptron work on neural networks (a perceiving and recognizing automaton)
• 1957: Memory Test Computer, the first computer to simulate the operation of neural networks
• 1958: National Physical Laboratory (UK) Symposium on the Mechanization of Thought Processes
• 1959: Arthur Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of R&D
• 1960: Recognizing hand-written characters, Robert Larson of the SRI AI Center
• 1961: James Slagle (Minsky student, MIT), solving freshman calculus
• 1979: An assessment of AI from a Lincoln Laboratory perspective (internal; J. Forgie, J. Allen)
• 1982: Expert systems; the DENDRAL project at Stanford
• 1984: Hidden Markov models publication
• 1986-present: The return of neural networks
• 1988: Statistical machine translation
• 1989: Convolutional neural networks
• 1994: Human-level spontaneous speech recognition
• 1997: IBM Deep Blue defeats reigning chess champion (Garry Kasparov)
• 2001-present: Availability of very large data sets
• 2005: Google's Arabic-to-English and Chinese-to-English translation
• 2007: DARPA Grand Challenge ("Urban Challenge")
• 2011: IBM Watson defeats former Jeopardy! champions (Brad Rutter and Ken Jennings)
• 2012: Team from the University of Toronto (Geoff Hinton's lab) wins the ImageNet Large Scale Visual Recognition Challenge with deep-learning software
• 2014: Google's GoogleNet: object classification at near-human performance
• 2015: DeepMind achieves human expert level of play on Atari games (using only raw pixels and scores)
• 2016: DeepMind AlphaGo defeats top human Go player (Lee Sedol)
• 2016: DARPA Cyber Grand Challenge
Adapted from: The Quest for Artificial Intelligence, Nils J. Nilsson, 2010, and MIT Lincoln Laboratory Library and Archives
Artificial Intelligence Evolution
Four Waves of AI
Each wave is characterized by its mix of perceiving, learning, abstracting, and reasoning:
• Reasoning*: Handcrafted Knowledge
• Learning: Statistical Learning (lots of data enabled non-expert systems)
• Context: Contextual Adaptation (adding context to AI systems)
• Abstraction: System Evolution (ability of the system to abstract)
* Waves adapted from John Launchbury, Director I2O, DARPA
Spectrum of Commercial Organizations in the Machine Intelligence Field
Source: Shivon Zilis, 2016, https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0
© Shivon Zilis and James Cham, designed by Heidi Skinner. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Data is Critical To Breakthroughs in AI
Year, breakthrough in AI, dataset (first available), and algorithm (first proposed):
• 1994: Human-level read-speech recognition. Dataset: spoken Wall Street Journal articles and other texts (1991). Algorithm: Hidden Markov Model (1984).
• 1997: IBM Deep Blue defeated Garry Kasparov. Dataset: 700,000 Grandmaster chess games, aka "The Extended Book" (1991). Algorithm: Negascout planning algorithm (1983).
• 2005: Google's Arabic- and Chinese-to-English translation. Dataset: 1.8 trillion tokens from Google Web and News pages (collected in 2005). Algorithm: statistical machine translation algorithm (1988).
• 2011: IBM Watson became the world Jeopardy! champion. Dataset: 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2010). Algorithm: Mixture-of-Experts algorithm (1991).
• 2014: Google's GoogleNet object classification at near-human performance. Dataset: ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010). Algorithm: convolutional neural network algorithm (1989).
• 2015: Google's DeepMind achieved human parity in playing 29 Atari games by learning general control from video. Dataset: Arcade Learning Environment dataset of over 50 Atari games (2013). Algorithm: Q-learning algorithm (1992).
Average number of years to breakthrough: 3 years from dataset availability, 18 years from algorithm proposal.
Source: Train AI 2017, https://www.crowdflower.com/train-ai/
AI Canonical Architecture
[Figure: the AI canonical architecture, repeated from the earlier slide: sensors and sources feed structured and unstructured data through data conditioning into algorithms and human-machine teaming for users (missions), supported by modern computing and robust AI.]
Unstructured and Structured Data
[Figure: the data conditioning component of the canonical architecture, "Data to Information."]
Data conditioning/storage technologies and the capabilities they provide:
• Infrastructure/databases (structured data types): indexing/organization/structure, domain specific languages, high performance data access, declarative interfaces
• Data curation (unstructured data types: speech, sensors, network logs, metadata): unsupervised machine learning, dimensionality reduction, clustering/pattern recognition, outlier detection
• Data labeling (social media, human behavior, reports, side channel): initial data exploration, highlighting missing or incomplete data, reorienting sensors/recapturing data, looking for errors/biases in collection
Often takes up 80+% of overall AI/ML development work
Machine Learning Algorithms Taxonomy
[Figure: a nested view of the field: artificial intelligence contains machine learning, which contains neural nets, which contain deep neural nets.]
The five tribes* of machine learning:
• Symbolists (e.g., expert systems)
• Bayesians (e.g., naive Bayes)
• Analogizers (e.g., SVM)
• Connectionists (e.g., DNN)
• Evolutionaries (e.g., genetic programming)
* "The Five Tribes of Machine Learning", Pedro Domingos
DNN = Deep Neural Networks; SVM = Support Vector Machines; Exp. Sys. = Expert Systems
Image adapted from "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Modern AI Computing Engines
What each computing class provides to AI, with selected results:
• CPU: the most popular computing platform; general purpose compute (selected result: AlexNet forward-backward pass comparison)
• GPU: used by most for training algorithms (good for NN backpropagation)
• TPU: speeds up inference time (domain specific architecture)
• Neuromorphic / Custom: active research area; ability to speed up specific computations of interest (e.g., graphs)
• Quantum: benefits unproven until now; recent results on HHL (linear systems of equations)
[Figure: SpGEMM performance using the Graph Processor (G102): traversed edges per second vs. watts, from embedded to data center applications, comparing the projected ASIC and measured FPGA graph processors (8 nodes up to 16k nodes / 64 racks) against the measured Cray XK7 Titan and Cray XT4 Franklin.]
GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; HHL = Harrow-Hassidim-Lloyd quantum algorithm
Neural Network Processing Performance
© IEEE. Figure 2 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
[Figure: peak performance (GigaOps through PetaOps) vs. peak power (W) for neural network processors, including TrueNorth, MIT Eyeriss, TPUEdge, JetsonTX1/TX2, MovidiusX, Xavier, K80, P100, V100, Phi7210F/7290F, 2x SkyLakeSP, TPU1/2/3, Goya, GraphCoreC2/Node, Nervana, Arria, Turing, WaveDPU, DGX-Station, DGX-1, DGX-2, and WaveSystem. Points are annotated by computation precision (Int8 through Float64), form factor (chip, card, system), and computation type (inference vs. training), with guide lines at 100 GOps/W, 1 TeraOps/W, and 10 TeraOps/W.]
Reuther, Albert, et al. "Survey and Benchmarking of Machine Learning Accelerators." arXiv preprint arXiv:1908.11348 (2019).
Robust AI: Preserving Trust
[Figure: confidence level in the machine making the decision vs. consequence of actions. High confidence with low consequence is best matched to machines; low confidence with high consequence is best matched to humans; between the two, machines augment humans.]
Importance of Robust AI
Robust AI features, the issues they address, and example solutions:
• Explainable AI. Issue: user unfamiliarity or mistrust leads to lack of adoption. Example solutions: seamless integration, model expansion, transparent uncertainty.
• Metrics. Issue: unknown relationship between arbitrary input and machine output. Example solutions: explainability, dimensionality reduction, feature importance inference.
• Validation & Verification. Issue: algorithms need to meet mission specifications. Example solutions: robust training, "portfolio" methods, regularization.
• Security. Issue: system vulnerable to adversarial action (both cyber and physical). Example solutions: model failure detection, red teaming.
• Policy, Ethics, Safety, and Training. Issue: unwanted actions when controlling heavy or dangerous machinery. Example solutions: risk sensitivity, robust inference, high decision thresholds.
Human-Machine Teaming
[Figure: scale vs. application complexity. At high scale and low application complexity, machines are more effective than humans; at low scale and high complexity, humans are more effective than machines; in between, machines augment humans.]
[Figure: confidence level vs. consequence of actions, as on the previous slide: high confidence, low consequence decisions are best matched to machines; low confidence, high consequence decisions are best matched to humans.]
Human-machine teaming will consist of intelligent assistants enabled by artificial intelligence.
Critical element of AI: understanding how humans and machines can work together for applications.
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
What is Machine Learning?
• Machine Learning
  – Study of algorithms that improve their performance at some task with experience (data)
  – Optimize based on a performance criterion using example data or past experience
• Combination of techniques from the statistics and computer science communities
• Getting computers to program themselves
• Common tasks:
  – Classification
  – Regression
  – Prediction
  – Clustering
  – …
Traditional Programming vs. Machine Learning
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
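The contrast above can be made concrete with a small sketch (illustrative only, not from the slides): traditional programming hard-codes the rule, while machine learning infers the rule from (data, output) pairs. The heights, labels, and threshold-learning scheme below are all invented for illustration.

```python
# Traditional programming: the programmer supplies the rule.
def is_tall_rule(height_cm):
    return height_cm >= 180

# Machine learning: the rule (here, a threshold) is learned from labeled data.
def learn_threshold(examples):
    """examples: list of (height_cm, is_tall) pairs.
    Returns the midpoint between the two classes as the decision threshold."""
    pos = [h for h, label in examples if label]
    neg = [h for h, label in examples if not label]
    return (min(pos) + max(neg)) / 2

data = [(160, False), (170, False), (185, True), (192, True)]
threshold = learn_threshold(data)
print(threshold)  # 177.5, inferred from the data rather than hand-coded
```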
Source: Pedro Domingos, University of Washington
Machine Learning Techniques
• Supervised (Labels)
• Unsupervised (No Labels)
• Reinforcement (Reward Information)
Machine Learning Techniques
• Supervised (Labels): Classification, Regression
• Unsupervised (No Labels): Clustering, Dimensionality Reduction
• Reinforcement (Reward Information)
Machine Learning Techniques
• Supervised (Labels)
  – Classification: Naïve Bayes, Logistic Regression, K Nearest Neighbors, SVM, Ensembles, Tree-based techniques, Neural Nets
  – Regression: Linear Regression, LASSO, Non-Linear Regression
• Unsupervised (No Labels)
  – Clustering: Hierarchical Methods, Spectral Methods, K-Means
  – Dimensionality Reduction: NMF, PCA, Kernel PCA
• Reinforcement (Reward Information): Markov Decision Processes, SARSA, Q-learning
Machine Learning Techniques
• Supervised (Labels)
  – Classification: Naïve Bayes, Logistic Regression, K Nearest Neighbors, SVM, Ensembles, Tree-based techniques, Neural Nets
  – Regression: (Multiple) Linear Regression, LASSO, Non-Linear Regression
• Unsupervised (No Labels)
  – Clustering: Hierarchical Methods, Spectral Methods, K-Means, GMM, HMM
  – Dimensionality Reduction: NMF, PCA, Kernel PCA, Embedded Methods, Filter Methods, Wrapper Methods
  – Structure Learning
• Reinforcement (Reward Information): Markov Decision Processes, SARSA, Q-learning
Common ML Pitfalls
• Over-fitting vs. under-fitting
• Bad/noisy/missing data
• Model selection
• Lack of success metrics
• Linear vs. non-linear models
• Ignoring outliers
• Training vs. testing data
• Computational complexity, curse of dimensionality
• Correlation vs. causation
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
Supervised Learning
• Starting with labeled data (ground truth)
• Build a model that predicts labels
• Two general goals:
  – Regression: predict a continuous variable
  – Classification: predict a class or label
• Generally has a training step that forms the model
[Diagram: training data (data + labels) → train model; test data → supervised learning model → predicted labels]
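The workflow on this slide can be sketched in a few lines: train a model on labeled data, then use it to predict labels for test data. The model here is a 1-nearest-neighbor classifier and the data points are invented for illustration.

```python
def train(data, labels):
    # For nearest-neighbor, "training" simply stores the labeled examples.
    return list(zip(data, labels))

def predict(model, x):
    # Predict the label of the closest training point (1-D distance here).
    nearest_point, nearest_label = min(model, key=lambda pair: abs(pair[0] - x))
    return nearest_label

train_data = [1.0, 1.5, 8.0, 9.0]              # data
train_labels = ["low", "low", "high", "high"]  # ground-truth labels
model = train(train_data, train_labels)

print(predict(model, 2.0))  # low
print(predict(model, 7.0))  # high
```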
Artificial Neural Networks
• Computing systems inspired by biological networks
• Systems learn by repetitive training to do tasks based on examples
  – Generally a supervised learning technique (though unsupervised examples exist)
• Components: inputs, layers, outputs, weights
• Deep Neural Network: lots of "hidden layers"
• Popular variants:
  – Convolutional Neural Nets
  – Recursive Neural Nets
  – Deep Belief Networks
• Very popular these days, with many toolboxes and hardware support
Deep Neural Networks
[Figure: a deep neural network mapping input y0 through weight matrices W0, W1, W2 and biases b0, b1, b2 to output y3.]
y_{i+1} = f(W_i y_i + b_i)
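The layer recurrence above can be written directly in pure Python. The sigmoid activation and the weight/bias values below are illustrative assumptions, not taken from the slides.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(W, y, b):
    # One dense layer: matrix-vector product plus bias, then the activation f.
    return [sigmoid(sum(w * v for w, v in zip(row, y)) + bias)
            for row, bias in zip(W, b)]

def forward(weights, biases, y0):
    y = y0
    for W, b in zip(weights, biases):  # y_{i+1} = f(W_i y_i + b_i)
        y = layer(W, y, b)
    return y

# Two inputs -> two hidden units -> one output.
weights = [[[0.5, -0.3], [0.8, 0.2]], [[1.0, -1.0]]]
biases = [[0.1, -0.1], [0.0]]
y_out = forward(weights, biases, [1.0, 2.0])
print(y_out)  # a single activation in (0, 1)
```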
© RSIP. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Image source: http://www.rsipvision.com/exploring-deep-learning/
Components of an Artificial Neural Network
[Figure: a single neuron/node. Inputs −0.06, 2.5, and 1.4 arrive over connections with weights w1, w2, w3; the neuron applies an activation function f(x) to the weighted sum.]
Components: inputs, connections/weights, neuron/node, activation function.
© Drexel University. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Image source: https://www.cs.drexel.edu/~greenie/cs510/CS510-17-08.pdf
Components of an Artificial Neural Network
[Figure: the same neuron with concrete weights w1 = 2.7, w2 = 8.6, w3 = 0.002.]
x = (−0.06)(2.7) + (2.5)(8.6) + (1.4)(0.002) = 21.34
f(x) ≈ 1
© Drexel University. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
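The slide's single-neuron computation can be reproduced in code: the weighted sum of the three inputs, followed by an activation that squashes it to approximately 1. Using a sigmoid here is an assumption; the slide only shows that the output is ~1.

```python
import math

inputs = [-0.06, 2.5, 1.4]
weights = [2.7, 8.6, 0.002]

x = sum(i * w for i, w in zip(inputs, weights))  # weighted sum
f_x = 1.0 / (1.0 + math.exp(-x))                 # sigmoid activation

print(round(x, 2))    # 21.34, matching the slide
print(round(f_x, 4))  # 1.0
```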
Image source: https://www.cs.drexel.edu/~greenie/cs510/CS510-17-08.pdf
Common Activation Functions
• Step Function: f(x) = 0 if x < 0; 1 if x ≥ 0
• Sigmoid Function: f(x) = 1 / (1 + e^(−x))
• Tanh Function: f(x) = tanh(x)
• Rectified Linear Unit (ReLU): f(x) = max(0, x)
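The four activation functions above translate directly into code:

```python
import math

def step(x):
    return 0.0 if x < 0 else 1.0       # 0 for x < 0, 1 for x >= 0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # smooth squash into (0, 1)

def tanh(x):
    return math.tanh(x)                # squash into (-1, 1)

def relu(x):
    return max(0.0, x)                 # zero out negative inputs

print(step(-2.0), sigmoid(0.0), tanh(0.0), relu(-3.0), relu(5.0))
```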
Neural Network Landscape
© Fjodor van Veen and Stefan Leijnen. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Image source: http://www.asimovinstitute.org/neural-network-zoo/
Neural Network Training
Key Idea: Adjusting the weights changes the function represented by the neural network (learning = optimization in weight space).
Iteratively adjust weights to reduce error (difference between network output and target output)
Weight update rules:
– Perceptron training rule
– Linear programming
– Delta rule
– Backpropagation
Real neural network architectures can have 1000s of input data points, hundreds of layers and millions of weight changes per iteration
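The perceptron training rule listed above can be sketched at toy scale: for each labeled example, nudge the weights by lr * (target − output) * input until the outputs match the targets. The AND-gate dataset and the learning rate are illustrative choices, not from the slides.

```python
def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(data, lr=0.1, epochs=50):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(w, b, x)  # perceptron training rule
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]: AND learned
```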
Neural Network Inference
• Using the trained model on previously “unlabeled” data
Neural Network Learning: Decision Boundary
Designing a Neural Network
• Designing a neural network can be a complicated task. • Many choices:
• Inputs (number of inputs)
• Depth (number of layers)
• Type of network:
  – Convolutional Neural Network
  – Deep Feedforward Neural Network
  – Deep Belief Network
  – Long/Short Term Memory
  – …
• Types of layers:
  – MaxPool
  – Dropout
  – Convolutional
  – Deconvolutional
  – Softmax
  – Fully Connected
  – Skip Layer
  – …
• Training algorithm:
  – Performance vs. quality
  – Stopping criteria
  – Performance function
• Metrics:
  – False positive
  – ROC curve
  – …
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
Unsupervised Learning
• Task of describing hidden structure from unlabeled data
• More formally, we observe features X1, X2, …, Xn and would like to find patterns among these features. • We are not interested in prediction because we don't know what an output Y would look like.
• Typical tasks and associated algorithms: • Clustering • Data projection/Preprocessing
• Goal is to discover interesting things about the dataset: subgroups, patterns, clusters?
More on Unsupervised Learning
• There is no good simple goal (such as maximizing certain probability) for the algorithm • Very popular because techniques work on unlabeled data – Labeled data can be difficult and expensive
• Common techniques:
  – Clustering
    • K-Means
    • Nearest neighbor search
    • Spectral clustering
  – Data projection/preprocessing
    • Principal component analysis
    • Dimensionality reduction
    • Scaling
Clustering
• Group objects or sets of features such that objects in the same cluster are more similar than those of another cluster
• Optimal clusters should:
  – Minimize intra-cluster distance
  – Maximize inter-cluster distance
• Example of an intra-cluster measure, the squared error se:
  se = Σ_{i=1}^{k} Σ_{p ∈ c_i} |p − m_i|²
where m_i is the mean of all features in cluster c_i
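The squared-error measure above is straightforward to compute for 1-D points. The two example clusterings below are invented to show that well-separated clusters give a smaller se than mixed ones.

```python
def squared_error(clusters):
    # se = sum over clusters i, sum over points p in c_i, of (p - m_i)^2
    se = 0.0
    for points in clusters:
        m = sum(points) / len(points)  # m_i: mean of cluster c_i
        se += sum((p - m) ** 2 for p in points)
    return se

tight = [[1.0, 1.2, 0.8], [9.0, 9.1, 8.9]]  # well-separated clusters
loose = [[1.0, 9.1, 0.8], [9.0, 1.2, 8.9]]  # clusters mixed together
print(squared_error(tight))                          # approximately 0.1
print(squared_error(tight) < squared_error(loose))   # True
```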
Dimensionality Reduction
• Process of reducing the number of random variables under consideration
  – Key idea: reduce a large dataset to a much smaller dataset using only high variance dimensions
• Often used to simplify computation or representation of a dataset
• Typical tasks:
  – Feature selection: try to find a subset of the original variables
  – Feature extraction: try to represent data in lower dimensions
• Often key to good performance for other machine learning techniques such as regression, classification, etc.
• Other uses:
  – Compression: reduce dataset to a smaller representation
  – Visualization: easy to see low dimensional data
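The "keep only high-variance dimensions" idea can be sketched as a toy feature-selection step: drop any column whose variance falls below a threshold. Real pipelines would typically use PCA or similar; the data and threshold here are illustrative assumptions.

```python
def variance(column):
    m = sum(column) / len(column)
    return sum((v - m) ** 2 for v in column) / len(column)

def select_features(rows, threshold=0.01):
    cols = list(zip(*rows))  # transpose rows into columns (features)
    keep = [i for i, col in enumerate(cols) if variance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep

# The middle column is constant, so it carries no information.
data = [[1.0, 5.0, 0.2], [2.0, 5.0, 0.9], [3.0, 5.0, 0.1]]
reduced, kept = select_features(data)
print(kept)     # [0, 2]: the constant middle feature is dropped
print(reduced)  # dataset reduced from 3 features to 2
```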
Neural Networks and Unsupervised Learning
• Traditional applications of neural networks, such as image classification, fall into the realm of supervised learning:
  – Given example inputs x and target output y, learn the mapping between them
  – A trained network is supposed to give the correct target output for any input stimulus
  – Training is learning the weights
• In unsupervised learning, neural networks are largely used to find better representations for data: clustering and dimensionality reduction
• Non-linear capabilities
Example: Autoencoders
• Neural network architecture designed to find a compressed representation for data
• Feedforward, multi-layer perceptron
• Input layer number of features = output layer number of features
• Similar to dimensionality reduction, but allows for much more complex representations
[Figure: inputs → encoder → compressed representation → decoder → reconstructed input]
© Leonardo Araujo dos Santos. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Example: Replicator Neural Network
• Conceptually, very similar to autoencoders
• Used extensively for anomaly detection (looking for outliers)
• Salient differences from an autoencoder: a step activation function and the inclusion of dropout layers
  – The step activation squeezes the middle layer outputs into a number of clusters
  – Dropout layers help with overfitting
[Figure: example architecture, with a step activation function at the middle layer.]
Hawkins, Simon, et al. "Outlier detection using replicator neural networks." International Conference on Data Warehousing and Knowledge Discovery. Springer, Berlin, Heidelberg, 2002.
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
Reinforcement Learning
• What makes reinforcement learning different from other machine learning paradigms?
  – There is no supervisor, only a reward signal
  – Feedback is delayed, not instantaneous
  – Time really matters (sequential, often inter-dependent data)
  – The agent's actions affect the subsequent data it receives
• Example: playing an Atari game
  – Rules of the game are unknown
  – Learn directly from interactive game-play
  – Pick actions on the joystick, see pixels and scores
[Figure: agent-environment loop, in which the agent receives observation O_t and reward R_t and takes action A_t.]
© David Silver. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
Image source: "Introduction to Reinforcement Learning" by David Silver (Google DeepMind)
Other Reinforcement Learning Examples
• Fly stunt maneuvers in a helicopter
  – + reward for following desired trajectory
  – − reward for crashing
• Defeat the world champion at Backgammon
  – +/− reward for winning/losing a game
• Manage an investment portfolio
  – + reward for each $ in bank
• Control a power station
  – + reward for producing power
  – − reward for exceeding safety thresholds
• Make a humanoid robot walk
  – + reward for forward motion
  – − reward for falling over
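The reward-driven examples above can be made concrete with tabular Q-learning (the algorithm named in the taxonomy slides) on a toy 1-D corridor: states 0 through 4, start at 0, and a +1 reward only at the goal state 4. The environment, learning rate, and exploration settings are all invented for illustration.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left or move right

def env_step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward

def q_learn(episodes=500, alpha=0.5, gamma=0.9, eps=0.3):
    random.seed(0)  # fixed seed for reproducibility
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            if random.random() < eps:                      # explore
                a = random.choice(ACTIONS)
            else:                                          # exploit
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            nxt, r = env_step(s, a)
            best_next = max(Q[(nxt, act)] for act in ACTIONS)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = nxt
    return Q

Q = q_learn()
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)  # learned policy: always move right toward the goal
```

Note how the agent learns purely from the delayed reward signal: the value of the goal propagates backward through the Q-table, one state per update, which mirrors the "feedback is delayed" property on the previous slide.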
Outline
• Artificial Intelligence Overview
• Machine Learning Deep Dives
  – Supervised Learning
  – Unsupervised Learning
  – Reinforcement Learning
• Conclusions/Summary
Summary
• Lots of exciting research into AI/ML techniques
  – This course looks at a number of relatively easy strategies to mitigate the associated challenges
• Key ingredients for AI success:
  – Data availability
  – Computing infrastructure
  – Domain expertise/algorithms
• Large challenges in data availability and readiness for AI
• The MIT SuperCloud platform (next presentation) can be used to perform the heavy computation needed
• Further reading: – "AI Enabling Technologies: A Survey.” https://arxiv.org/abs/1905.03592
MIT OpenCourseWare https://ocw.mit.edu/
RES.LL-005 Mathematics of Big Data and Machine Learning IAP 2020
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.