Deep Learning and Workflows in A.I.

Emmanuel Blanchard

© 2015 The MathWorks, Inc.

1 A.I. with MATLAB and Simulink: Atlas Robot

2 Why should you care about Reinforcement Learning?

▪ It enables the use of deep learning for controls and decision-making applications

Robotics, Controls, Autonomous driving, Game Play

3 Why should you care about Reinforcement Learning?

4 What is Reinforcement Learning?

▪ What is Reinforcement Learning? – A type of machine learning that trains an ‘agent’ through repeated interactions with an environment

▪ How does it work? – Through a trial & error process that uses a reward system to maximize success

5 Agenda

Background: Reinforcement Learning vs Machine Learning vs Deep Learning

Deep Learning Workflows and Challenges

Reinforcement Learning (MATLAB + Simulink)

Conclusion

6 What is Machine Learning?

7 Machine Learning vs Deep Learning

Machine Learning

▪ Supervised Learning [Labeled Data]: Classification, Regression

▪ Unsupervised Learning [No Labeled Data]: Clustering

Supervised learning typically involves feature extraction

Deep Learning

▪ Deep learning is a subset of machine learning with automatic feature extraction

▪ Learns features and tasks directly from data

▪ More data = better model

https://www.youtube.com/watch?v=xr5LeWKbVnY

8 Deep Learning Uses a Neural Network Architecture

Input Layer, Hidden Layers (n), Output Layer

9 Deep Learning Datatypes

Image, Signal, Text, Numeric

10 Reinforcement Learning vs Machine Learning vs Deep Learning

Machine Learning

▪ Unsupervised Learning [No Labeled Data]: Clustering

▪ Supervised Learning [Labeled Data]: Classification, Regression

▪ Reinforcement Learning [Interaction Data]: Decision Making, Controls

Deep Learning

Reinforcement learning:

▪ Learning through trial & error [interaction]

▪ Complex problems typically need deep learning [Deep Reinforcement Learning]

▪ It’s about learning a behavior or accomplishing a task

▪ Examples:
o Financial trading, calibration
o Lane-keep assist, adaptive cruise control, robotics, etc.

15 Agenda

Background: Reinforcement Learning vs Machine Learning vs Deep Learning

Deep Learning Workflows and Challenges

Reinforcement Learning (MATLAB + Simulink)

Conclusion

16 Deep Learning Challenges

Not a deep learning expert

Data

▪ Handling large amounts of data

▪ Labeling thousands of images & videos

Training and Testing Deep Neural Networks

▪ Accessing reference models from research

▪ Optimizing hyperparameters

▪ Training takes hours to days

Rapid and Optimized Deployment

▪ Desktop, web, cloud, and embedded hardware

17 Deep Learning Inference in 4 Lines of Code

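>> net = alexnet;                 % load the pretrained AlexNet network
>> I = imread('peacock.jpg');     % read the test image
>> I1 = imresize(I,[227 227]);    % resize to AlexNet's 227x227 input size
>> classify(net,I1)               % predict the class label
ans =
  categorical
     peacock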

18 Labeling for deep learning is repetitive, tedious, and time-consuming…

but necessary

19 Deep Learning Made Easy with Apps

▪ Automate ground-truth labeling using Image Labeler app

▪ Automate ground-truth labeling using Audio Labeler app

▪ Deep Network Designer app

▪ Network Analyzer app

21 Accelerating Code: GPU Coder, Parallel Server, MATLAB Coder

▪ Generate CUDA code (GPU Coder)
– Integrates with external CUDA code

▪ MATLAB Parallel Server
– Dynamic licensing

▪ Generate C/C++ code (MATLAB Coder)
– C/C++ code is royalty-free: deploy to your customers at no charge
– Package generated code as a MEX-function for use in MATLAB

23 Deploy MATLAB Data Analytics into the Cloud

▪ Use algorithms developed in different versions of MATLAB

▪ Deploy encrypted MATLAB code to protect IP

[Diagram: MATLAB Production Server. Clients (web app, enterprise data sources, mobile app, and Java, .NET, C/C++, and Python apps using language-specific client libraries) send REST calls over HTTP(S) on port 9910/9920 to the request broker. A manager starts and stops workers and managers, with up to 24 workers in the pool running different MATLAB runtimes (2015a, 2016b, 2017a). Deployed .ctf archives are picked up by auto scan and hot deploy. Legend: IT developed or deployed resources, MathWorks components.]

24 App Designer: Create Desktop and Web Apps in MATLAB

▪ Try this on your phone: https://deeplearning.mwlab.io/

25 MATLAB supports the Entire Deep Learning Workflow

ACCESS AND EXPLORE DATA: Files, Databases, Sensors

LABEL AND PREPROCESS DATA: Data Augmentation/Transformation, Labeling Automation, Import Reference Models

DEVELOP PREDICTIVE MODELS: Hardware-Accelerated Training, Hyperparameter Tuning, Network Visualization

INTEGRATE MODELS WITH SYSTEMS: Desktop Apps, Enterprise Scale Systems, Embedded Devices and Hardware

26 Interoperability with Deep Learning Frameworks

▪ Import and export models using the Open Neural Network Exchange (ONNX) format

▪ Model importers (Caffe, TensorFlow-)

▪ Access pretrained models with a single line of code – AlexNet, VGG-16, VGG-19, GoogLeNet, ResNet, …

27 Agenda

Background: Reinforcement Learning vs Machine Learning vs Deep Learning

Deep Learning Workflows and Challenges

Reinforcement Learning (MATLAB + Simulink)

Conclusion

28 Glossary of Common Terms in Reinforcement Learning

▪ Agent: Red circle that learns how to navigate the grid to reach the blue square by trial and error

▪ Environment: 5x5 grid that is being navigated

▪ State: The current square the red circle is in

▪ Action: One of the 4 possible actions the red circle can take at each time step

▪ Reward: Points the red circle gets for taking an action
– -1 for any move, except:
– +5 when you land on teleportation square [4,4]
– +10 when you land on [5,1]

The red circle does not know what the possible reward values are

[Figure: 5x5 grid world with +5 and +10 reward squares and the 4 possible actions]

29 Glossary of Common Terms in Reinforcement Learning

▪ Trained Agent: Red circle that has learned how to navigate the grid by taking the best possible actions

▪ Final Reward: +11 points

▪ Policy: The logic that is learned by the red circle to implement the best possible actions. E.g.
– If red circle is in [1,4], move right
– If red circle is in [2,4], move down

▪ Reinforcement Learning Algorithm: The trial-and-error algorithm that developed this policy

The best action to take depends on the state

[Figure: 5x5 grid world with +5 and +10 reward squares and the 4 possible actions]

30 In This Sample Trajectory, We Luckily Receive Two Rewards

[Figure: 5x5 grid world with +5 and +10 reward squares]

31 And Now, the Agent Remembers Which Two Actions Led to the Reward

[Figure: 5x5 grid world with +5 and +10 reward squares]

32 Eventually, We Find the Best Path Possible Based On Our Initial State

[Figure: 5x5 grid world with the +5 reward square]

33 But What If We Had a Different Initial State? Would the Same Path Be the Best Choice?

[Figure: 5x5 grid world with +5 and +10 reward squares]

34 Clearly, the Best Action to Take Depends on the State We Are In

In this case, we only have 21 possible states

In this case, we can run a small and finite number of simulations to find the best possible path irrespective of our initial state

[Figure: 5x5 grid world with +5 and +10 reward squares]
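With a state space this small, the trial-and-error idea can be written out directly. The following is a minimal sketch of tabular Q-learning in base MATLAB. It is not the shipped grid world example, and it rests on simplifying assumptions: a 5x5 grid with no obstacles and no teleportation square (so 25 states rather than 21), a start at [1,1], a terminal goal at [5,5] worth +10, and -1 for every other move. The learning rate, discount factor, and exploration rate are hypothetical values chosen for illustration.

```matlab
% Minimal tabular Q-learning sketch (base MATLAB, no toolboxes).
% Assumed layout: 5x5 grid, start [1,1], terminal goal [5,5] (+10),
% every other move costs -1. No obstacles or teleportation square.
rng(0);                                  % reproducible random numbers
n = 5;  start = [1 1];  goal = [5 5];
moves = [-1 0; 1 0; 0 -1; 0 1];          % four actions as (row, col) offsets
Q = zeros(n, n, 4);                      % Q(row, col, action)
learnRate = 0.5;  discount = 0.95;  epsilon = 0.1;   % hypothetical values

for episode = 1:500
    s = start;
    for step = 1:100
        % Epsilon-greedy action selection
        if rand < epsilon
            a = randi(4);
        else
            [~, a] = max(Q(s(1), s(2), :));
        end
        % Environment step: a move off the grid leaves the state unchanged
        sNext = min(max(s + moves(a, :), 1), n);
        done = isequal(sNext, goal);
        if done
            target = 10;                 % terminal reward, no bootstrapping
        else
            target = -1 + discount * max(Q(sNext(1), sNext(2), :));
        end
        % Q-learning update toward the one-step target
        Q(s(1), s(2), a) = Q(s(1), s(2), a) + ...
            learnRate * (target - Q(s(1), s(2), a));
        s = sNext;
        if done
            break;
        end
    end
end

% Follow the greedy policy from the start state (capped at 50 moves)
s = start;  pathLength = 0;
while ~isequal(s, goal) && pathLength < 50
    [~, a] = max(Q(s(1), s(2), :));
    s = min(max(s + moves(a, :), 1), n);
    pathLength = pathLength + 1;
end
fprintf('Greedy path length: %d moves\n', pathLength);
```

The table Q plays the role of the policy here: in each state, the greedy action is the one with the largest entry. The shortest possible path in this assumed layout is 8 moves. The sketch does not guarantee that the learned policy reaches it.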

https://www.mathworks.com/help/reinforcement-learning/ug/train-q-learning-agent-to-solve-basic-grid-world.html

>> openExample('rl/BasicGridWorldExample')

35 But What If We Have Many States?

36 Applications that Engineers and Scientists Care About Can Have Huge State Spaces

Robot Arm for Grasping Objects

Six servo motors – assume 180 degrees range of motion each

Possible states (at 1-degree resolution): 180 x 180 x 180 x 180 x 180 x 180 = 180^6

About 34 trillion states (3.4 x 10^13)
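The count above can be checked with a couple of lines of MATLAB. This is only a sketch of the arithmetic, assuming each servo is discretized into 180 one-degree positions. The finer 0.1-degree case is a hypothetical extension to show how quickly the count grows.

```matlab
% State count for the robot arm, assuming 1-degree resolution per servo
numServos = 6;
positionsPerServo = 180;
numStates = positionsPerServo ^ numServos;        % 180^6
fprintf('States at 1-degree resolution: %.4g\n', numStates);

% Hypothetical finer discretization: 0.1-degree steps (1800 positions)
numStatesFine = (10 * positionsPerServo) ^ numServos;
fprintf('States at 0.1-degree resolution: %.4g\n', numStatesFine);
```

A lookup table with one row per state is clearly impractical at this scale, which is the motivation for replacing the table with a function approximator.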

37 Deep Networks are commonly found in the agent, because they can model complex problems.

Current State (Image, Radar, Sensor, etc.) → AGENT → Next Action (Turn left, Turn right, Brake, Accelerate)

By representing policies using deep neural networks, we can solve problems for complex, non-linear systems (continuous or discrete) by directly using data that traditional approaches cannot use easily.

38 Teach a robot to follow a straight line using camera data

39 Let’s try to solve this problem the traditional way

Observations (Camera Data, Sensors) → Feature Extraction → State Estimation → Controller → Motor Commands

Controller blocks: Balance, Leg & Trunk Trajectories, Motor Control

40 What is the alternative approach?

Traditional: Observations (Camera Data, Sensors) → Feature Extraction → State Estimation → Controller → Motor Commands

Alternative: Observations (Camera Data, Sensors) → Black Box Controller → Motor Commands
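The "black box" can be pictured as a single function from raw observations to motor commands. Below is a minimal sketch of that idea in base MATLAB, assuming a tiny two-layer network with random, untrained weights. The 8x8 image size, the 16 hidden units, and the two motor outputs are hypothetical choices for illustration, not values from the presentation.

```matlab
% Sketch of a black-box controller: camera frame in, motor commands out.
% All sizes and weights are illustrative assumptions (network is untrained).
rng(1);
obs = rand(8, 8);                        % stand-in for a grayscale camera frame
x = obs(:);                              % flatten to a 64x1 input vector

W1 = 0.1 * randn(16, 64);  b1 = zeros(16, 1);    % hidden layer parameters
W2 = 0.1 * randn(2, 16);   b2 = zeros(2, 1);     % output layer parameters

h = max(W1 * x + b1, 0);                 % ReLU activation
action = tanh(W2 * h + b2);              % two motor commands in [-1, 1]
disp(action');
```

Training replaces the random weights with values that produce useful commands. The structure of the mapping stays the same.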

41 How Does Reinforcement Learning Work?

[Diagram: the AGENT sends an ACTION to the ENVIRONMENT, and the ENVIRONMENT returns a STATE and a REWARD to the AGENT]

42 A Practical Example of Reinforcement Learning: Training a Self-Driving Car

▪ Vehicle’s computer learns how to drive… (agent)

▪ using sensor readings from LIDAR, cameras,… (state)

▪ that represent road conditions, vehicle position,… (environment)

▪ by generating steering, braking, throttle commands,… (action)

▪ based on an internal state-to-action mapping… (policy)

▪ that tries to optimize driver comfort & fuel efficiency… (reward).

[Diagram: the AGENT contains the Policy and the Reinforcement Learning Algorithm, which performs the Policy update; STATE and REWARD flow from the ENVIRONMENT to the AGENT, and ACTION flows from the AGENT to the ENVIRONMENT]
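For training, the comfort and fuel-efficiency goal has to be reduced to one number per time step. A hedged sketch of what such a reward signal might look like follows. The choice of jerk as a comfort measure, the signal names, the weights, and the sample values are all hypothetical and do not come from the presentation.

```matlab
% Hypothetical per-step reward: penalize jerk (comfort) and fuel use.
% Signals, units, and weights are illustrative assumptions only.
jerk     = [0.2 -0.1 0.4 0.0];     % longitudinal jerk at four time steps
fuelRate = [1.1  1.0 1.3 0.9];     % fuel flow at the same time steps
wComfort = 1.0;                    % weight on the comfort penalty
wFuel    = 0.5;                    % weight on the fuel penalty

reward = -(wComfort * jerk.^2 + wFuel * fuelRate);   % one value per step
episodeReturn = sum(reward);       % total the agent tries to maximize
fprintf('Episode return: %.2f\n', episodeReturn);
```

Changing the two weights changes what behavior is learned, which is one reason reward design takes care.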

The policy is updated through repeated trial-and-error by a reinforcement learning algorithm

43 Reinforcement Learning vs Controls

[Diagram: feedback control loop in which the MEASUREMENT is subtracted from the REFERENCE to form the ERROR, the CONTROLLER turns the ERROR into the MANIPULATED VARIABLE, and the MANIPULATED VARIABLE drives the PLANT]

Control system          Reinforcement learning system
Adaptation mechanism    RL Algorithm
Error/Cost function     Reward
Manipulated variable    Action
Measurement             Observation
Plant                   Environment
Controller              Policy

Reinforcement learning has parallels to control system design
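One way to read this correspondence is that a controller and a policy are both mappings from what is measured to what is commanded. The sketch below writes a proportional controller in both vocabularies. The gain and signal values are hypothetical, and a real RL policy would be learned from rewards rather than hand-tuned like this.

```matlab
% Control view: the error drives the manipulated variable
reference = 1.0;  measurement = 0.4;  Kp = 2.0;     % hypothetical values
controlError = reference - measurement;
u = Kp * controlError;                              % manipulated variable

% RL view: the same mapping written as a policy(observation) -> action
policy = @(observation) Kp * (reference - observation);
action = policy(measurement);

fprintf('u = %.2f, action = %.2f\n', u, action);
```

Both lines compute the same command. The difference in practice is how the mapping is obtained: by tuning in the control setting, by trial and error in the reinforcement learning setting.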

44 When would you use Reinforcement Learning?

▪ PID: controller capability Low; computational cost in training/tuning Low; computational cost in deployment Low

▪ Model Predictive Control: controller capability High; computational cost in training/tuning Low; computational cost in deployment High

▪ Reinforcement Learning: controller capability High; computational cost in training/tuning High; computational cost in deployment Medium

Reinforcement learning might be a good fit if

▪ An environment model is available (trial & error on hardware can be expensive), and

▪ Training/tuning time is not critical for the application, and

▪ The environment is uncertain or nonlinear

45 Everyone is excited about reinforcement learning because it appears to be a silver bullet for all problems.

▪ However, there are challenges:

A lot of simulation trials required

Reward signal design, network layer structure & hyperparameter tuning can be challenging

Training may not converge

No performance guarantees

Further training might be necessary after deployment on real hardware

46 Simulation and Virtual Models are a Key Aspect of Reinforcement Learning

▪ Reinforcement learning needs a lot of data (sample inefficient)
– Training on hardware can be prohibitively expensive and dangerous

▪ Virtual models allow you to simulate conditions hard to emulate in the real world
– This can help develop a more robust solution

▪ Many of you have already developed MATLAB and Simulink models that can be reused

47 Why MATLAB and Simulink for Reinforcement Learning?

Virtual models allow you to simulate conditions hard to emulate in the real world.

48 Deep Learning Workflow

Data Preparation: Data access and preprocessing; Ground truth labeling

AI Modeling: Deep learning; Model design, tuning training options; Importing reference models; Model exchange across frameworks; Hardware-accelerated training

Deployment: Multiplatform code generation (CPU, GPU); Edge deployment; Enterprise deployment

49 Reinforcement Learning Workflow

Data Preparation: Scenario modeling; Simulation-based data generation

AI Modeling: Reinforcement learning; Training agent to perform task; Developing reward system to optimize performance

Deployment: Multiplatform code generation (CPU, GPU); Edge deployment; Enterprise deployment

Simulink – generate data for dynamic systems (planes, cars, robots, etc.)

50 Implement reinforcement learning agents to train policies

▪ Train agents using built-in and custom reinforcement learning algorithms

▪ Import deep neural network policies from Keras and the ONNX model format

▪ Train agents directly in Simulink models using the RL Agent block

MATLAB doc - Walking robot

>> openExample('rl/RLWalkingBipedRobotExample')

Webinar - Walking robot

51 Resources

▪ Reference Examples – Design controllers for robots, self-driving cars, and other systems

▪ Documentation written for engineers and domain experts

▪ Tech Talk video series on reinforcement learning concepts for engineers

52 Agenda

Background: Reinforcement Learning vs Machine Learning vs Deep Learning

Deep Learning Workflows and Challenges

Reinforcement Learning (MATLAB + Simulink)

Conclusion

53 Why MATLAB / Simulink for A.I. Tasks?

Increased productivity with interactive tools

Generate simulation data for complex models and systems

Ease of deployment and scaling to various platforms

Full A.I. workflows that cannot be easily replicated by other toolchains

54 MATLAB supports the ENTIRE workflow

▪ Gartner recognizes MathWorks as a Visionary in its January 2019 Magic Quadrant for Data Science and Machine Learning Platforms

55 MathWorks can help you do Deep Learning & RL

Getting Started

▪ Guided evaluations with a MathWorks deep learning engineer

▪ Proof-of-concept projects

▪ Deep learning hands-on workshop (includes Reinforcement learning)

▪ Seminars and technical deep dives

▪ Deep learning onramp course

More options

▪ Consulting services

▪ Training courses

▪ Technical support

▪ Advanced customer support

▪ Installation, enterprise, and cloud deployment

▪ Deep Learning Paid Training

56 Before you leave, please complete our evaluation form. Your feedback is important to us! bit.ly/matlabexpoanz
