Application of Deep Learning & Reinforcement Learning in Control Systems

Hassan Teimoori, PhD.

May 2019

Agenda

• Analytics ecosystem
• Review of deep learning and reinforcement learning
• Applications in control systems

Analytics ecosystem

Artificial intelligence (AI) is no longer on the horizon. It's here now, and it's already having a profound impact on how we live, work, and do business.

Artificial intelligence

Machine learning
Deep learning

Prescriptive analytics (Action)
• What do we need to do?
• Identify measures to improve the outcome
• Automation
• Optimization

Predictive analytics (Decision)
• What is likely to happen?
• Predict patterns and near-future events

Diagnostic analytics (Insights)
• Focus on why it is happening; examine and find the root cause
• Isolate confounding information

Descriptive analytics (Information)
• Focus on what happened; comprehensive, accurate, effective visualization
• Capture product conditions, environments & operations

Enablers (Knowledge)
• Big data: connected products, historical data, enterprise data, external data
• Processing power: standalone (CPU, GPU, TPU), distributed, cloud, ambient computing (IoT)
• Robotics
• New algorithms

Control systems

Deep, broad and strong base of foundational knowledge with major emphasis on decision making under uncertainty.

• Dynamic systems modeling
• Structural properties
• Model reduction
• Identification
• Stability
• Feedback
• Fault tolerance
• Optimality
• Robustness
• Adaptation
• Architecture

• Variety of settings

Linear Nonlinear Stochastic Hybrid Distributed Supervisory

• Open challenges: control of large, complex, distributed dynamical systems under rapid changes in the environment and high levels of uncertainty.

Control systems

Current approaches to control fall into two camps: classic control and optimization-based control.

Classic control
• Less biased toward artificial-intelligence-style decisions
• Requires extensive knowledge from an expert with relevant domain knowledge
• The knowledge is transferred to the controller via the control law and other mathematical derivation
• Challenges:
  • Careful analysis of the process dynamics
  • Requires a model of the process, either derived from first principles or empirical
  • Model maintenance is very difficult or rarely achieved

Optimization-based controllers
• Look ahead into the future and take action considering future errors
• Suffer from the fact that the optimization step takes time to return the optimal control input, especially for complex, high-dimensional systems
• Challenges:
  • Hard to handle uncertainty about the true system dynamics and noise in the observation data
  • Burden of predicting hidden states
  • Requires an accurate model of the system to start with
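To make the "classic control" column concrete, here is a minimal discrete-time PID loop in Python. The first-order plant model, gains, and time step are illustrative assumptions, not taken from the slides.

```python
# A minimal discrete-time PID loop, as a concrete instance of "classic control".
# The first-order plant model and the gains below are illustrative assumptions.
import numpy as np

def simulate_pid(kp=2.0, ki=1.0, kd=0.1, setpoint=1.0, dt=0.01, steps=500):
    y, integral, prev_error = 0.0, 0.0, 0.0
    history = []
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative   # control action
        # Assumed plant: first-order lag dy/dt = (-y + u) / tau, Euler-integrated
        tau = 0.5
        y += dt * (-y + u) / tau
        prev_error = error
        history.append(y)
    return np.array(history)

if __name__ == "__main__":
    response = simulate_pid()
    print(f"final output = {response[-1]:.3f} (setpoint 1.0)")
```

An optimization-based controller would instead solve a small optimization problem over a prediction horizon at every step, which is where the runtime concern quoted above comes from.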

Intelligent control is a class of control techniques that use various artificial intelligence computing approaches such as neural networks, Bayesian probability, fuzzy logic, reinforcement learning, evolutionary computation, and genetic algorithms.

Deep learning review (1 of 3)

Inspired by the human brain, a neural network consists of highly connected networks of neurons that relate the inputs to the desired outputs.

Machine learning: Input → Feature extractor → ML algorithm → Output
Deep learning: Input → Deep learning → Output

Neuron

Each neuron
• receives input from many other neurons,
• changes its internal state (activation) based on the current input,
• sends one output signal to many other neurons.
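A minimal NumPy sketch of that description; the weights, bias, and sigmoid activation are illustrative assumptions.

```python
# Minimal sketch of a single neuron: it receives inputs from other neurons,
# updates its activation from the weighted sum, and emits one output signal.
# Weights, bias, and the sigmoid activation are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    z = np.dot(weights, inputs) + bias   # combine the incoming signals
    return sigmoid(z)                    # internal state -> single output signal

x = np.array([0.5, -1.2, 3.0])           # signals from three upstream neurons
w = np.array([0.4, 0.1, -0.6])
print(neuron(x, w, bias=0.2))
```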

Deep learning review (2 of 3)

Deep learning architectures, algorithms and techniques have created powerful tools to learn representations of large volumes of data in multiple layers of representation.

Criteria:
• Quantity and form of input data
• How must the weights be modified to allow fast and reliable learning?
• Success measure?
• Number of iterations (epochs)
• Stability?
• Order of pattern representation?
• When do we stop learning?
• Etc.

Training cycle:
1. Enter the training data
2. Forward propagation
3. Output assessment and error calculation
4. If error > threshold: the correction of the network is calculated (back-propagation) and applied (neuron activations); repeat
5. If error < threshold: stop

Considerations:
• Large state space
• Subject matter expertise
• Data fidelity
• Regulations
• Cost of failure
• Technology limits
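The training cycle above can be sketched end to end for a tiny network. The XOR data, layer sizes, learning rate, and stopping threshold below are illustrative assumptions.

```python
# Minimal sketch of the training cycle: forward propagation, error calculation,
# back-propagation, repeated until the error drops below a threshold.
# The toy XOR data, layer sizes, and learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr, threshold = 1.0, 0.01

for epoch in range(10000):
    # forward propagation
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    error = np.mean((y_hat - Y) ** 2)
    if error < threshold:                     # stop once the error is small enough
        break
    # back-propagation: the correction of the network is calculated and applied
    d_out = (y_hat - Y) * y_hat * (1 - y_hat)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0)

print(f"stopped after {epoch} epochs, error = {error:.4f}")
```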

Pattern recognition

Inspired by the human brain, a neural network consists of highly connected networks of neurons that relate the inputs to the desired outputs.

Best use cases:
• For modeling a highly nonlinear system
• When the model is supposed to be constantly updated
• When model interpretability is not a key concern

Memory-based architecture

Inspired by the human brain, a neural network consists of highly connected networks of neurons that relate the inputs to the desired outputs.

LSTM cell: the input x_t and the previous state s_{t-1} produce the output y_t and the updated state s_t.
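A minimal PyTorch sketch of that cell in use; the feature sizes, sequence length, and the linear output head are illustrative assumptions.

```python
# Minimal PyTorch sketch of the memory-based (LSTM) cell: the input x_t and the
# previous state produce an output y_t and an updated state.
# Sizes are illustrative assumptions; torch is assumed to be installed.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)                  # maps the hidden state to y_t

x = torch.randn(4, 20, 8)                # batch of 4 sequences, 20 steps, 8 features
outputs, (h_n, c_n) = lstm(x)            # outputs: hidden state at every time step
y_t = head(outputs[:, -1, :])            # prediction from the final time step
print(y_t.shape)                         # torch.Size([4, 1])
```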

Vanilla neural networks | Image captioning | Sentiment classification | Machine translation | Video processing

Deep learning generic pattern

Deep learning frameworks work through computation over dataflow graphs. They provide interfaces that make it simple for developers to construct computation graphs, and runtimes that process the graphs in an optimized way. The graph is conducive to optimization and to translation to run on specific devices (CPU, GPU, TPU, FPGA, etc.).

Deep neural network frameworks
• Caffe / Caffe2
• CNTK
• DL4j
• Lasagne
• mxnet
• PaddlePaddle
• TensorFlow
• Torch / PyTorch
• …
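A small sketch of the dataflow-graph idea using one of the listed frameworks (PyTorch); the toy values are assumptions, and the same pattern applies to TensorFlow and the other frameworks above.

```python
# Sketch of the dataflow-graph pattern: PyTorch records the operations below as
# a graph and back-propagates through it automatically. Values are illustrative.
import torch

w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)
x = torch.tensor(3.0)

y = w * x + b                 # nodes added to the computation graph
loss = (y - 10.0) ** 2
loss.backward()               # the graph is traversed in reverse to get gradients

print(w.grad, b.grad)         # dloss/dw = 2*(y-10)*x, dloss/db = 2*(y-10)
```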

Transfer learning

Transfer learning means starting with a previously trained model that achieved good results and then training it further on a specific image dataset.

Approaches
• Fine-tune an existing model
• Use an existing convolutional model as a feature extractor

When to consider trying transfer learning
• The training dataset is small
• The training dataset shares visual features with the base dataset

Transfer learning for robotics

Learning in simulation and transferring the knowledge to a real-world robot alleviates a slow and expensive training process.

A. A. Rusu et al., 2018

• Boost in initial performance
• Increased learning speed
• More accurate final performance

Transfer learning

Inspired by the human brain, a neural network consists of highly connected networks of neurons that relate the inputs to the desired outputs.

Pretrained-model strategy by data set size and data similarity:

• Large data set, low data similarity: train the model from scratch. The predictions made using pretrained models would not be effective, so it is best to train the neural network from scratch on your data.

• Large data set, high data similarity: fine-tune the pretrained model (the ideal case). Retain the architecture of the model and its initial weights.

• Small data set, low data similarity: fine-tune the lower layers. Freeze the initial (say k) layers of the pretrained model and train just the remaining (n-k) layers again. The small size of the data set is compensated by the fact that the initial layers are kept pretrained.

• Small data set, high data similarity: fine-tune the output layers, using the pretrained model as a feature extractor. Customize and modify the output layers according to the problem. Example: an ImageNet model with 1000 classes is simplified to two classes (cat & dog).
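As a sketch of the "small data set, high data similarity" case above, the snippet below freezes a pretrained backbone and retrains only a two-class output layer. The choice of torchvision's ResNet-18 and the older pretrained=True API are assumptions, not part of the slides.

```python
# Sketch of the "small data, high similarity" quadrant: keep the pretrained
# backbone frozen and retrain only the output layer for two classes (cat & dog).
# torchvision's ResNet-18 is an assumed example backbone (older torchvision API).
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)        # ImageNet weights

for param in model.parameters():                # freeze the initial, pretrained layers
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 2)   # new, trainable two-class output layer

# Only the new head's parameters would be passed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")
```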

Reinforcement learning (RL)

RL is the subfield of machine learning that studies how to use past data to enhance the future manipulation of a dynamical system.

Agent: created by programming such that it is able to sense the environment, perform actions, receive feedback, and try to maximize rewards.
Environment: the world where the agent resides. It can be real or simulated.
State: the perception or configuration of the environment that the agent senses. State spaces can be finite or infinite.
Rewards: feedback the agent receives after any action it has taken. The goal of the agent is to maximize the overall reward, that is, the immediate and the future reward.
Actions: anything that the agent is capable of doing in the given environment. Action spaces can be finite or infinite.
Episode: one complete run of the whole task.

Reinforcement learning (RL)

RL is the subfield of machine learning that studies how to use past data to enhance the future manipulation of a dynamical system.

Observations
• The agent learns from its own experience.
• The reward may be delayed and/or stochastic.
• There is no clear model of how the world responds to actions.
• Choices improve with experience.
• Problems can have a finite or infinite time horizon.

Constraints
• The outcome of actions may be uncertain.
• The actions change the state.
• The effect of an action cannot be completely predicted.
• The environment may change while we are trying to learn it.

RL algorithm

The agent contains two components: a policy and a learning algorithm.

1. Formulate problem: define the task for the agent in terms of interaction with the environment and the goals the agent must achieve.
2. Create environment: define the environment within which the agent operates, including the interfaces and the environment dynamic model.
3. Define reward: specify the reward signal that the agent uses to measure its performance against the task goals.
4. Create agent: create the agent, which includes defining a policy representation and configuring the agent learning algorithm.
5. Train agent: train the agent policy representation using the defined environment, reward, and agent learning algorithm.
6. Validate agent: evaluate the performance of the trained agent by simulating the agent and environment together.
7. Deploy policy: deploy the trained policy representation using, for example, generated GPU code.

OpenAI Gym example

• NChain-v0 from OpenAI Gym is a simple 5-state environment.
• There are two possible actions in each state: move forward (action 0) and move backwards (action 1).
• There is also a random chance that the agent's action is "flipped" by the environment (i.e. an action 0 is flipped to an action 1 and vice versa).

https://gym.openai.com/envs/NChain-v0/
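A tabular Q-learning sketch for this environment. It assumes an older gym release in which NChain-v0 is still registered, and the hyperparameters are illustrative assumptions.

```python
# Tabular Q-learning sketch for the NChain-v0 environment described above.
# Assumes an older `gym` release in which NChain-v0 is still registered;
# hyperparameters are illustrative assumptions.
import numpy as np
import gym

env = gym.make("NChain-v0")
q = np.zeros((env.observation_space.n, env.action_space.n))   # 5 states x 2 actions
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(500):
    state = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, done, _ = env.step(action)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        q[state, action] += alpha * (reward + gamma * np.max(q[next_state]) - q[state, action])
        state = next_state

print(q)   # learned state-action values
```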

Deep learning to complete reinforcement learning

The decision maker may have no idea what the environment really looks like, but through trial and error the agent learns to make decisions that extract the most cumulative reward from the environment.

Pipeline (diagram): Environment → Sensor → Data collection → Data exploration → Feature extraction → Knowledge → Reasoning → Planning → Action, with machine learning, deep learning, reinforcement learning, and deep reinforcement learning each spanning successively larger portions of this pipeline.

Reinforcement learning

Reinforcement learning (RL) offers powerful algorithms to search for optimal controllers of systems with nonlinear, possibly stochastic dynamics that are unknown or highly uncertain. RL is a model-free framework for solving optimal control problems stated as Markov decision processes (MDPs).

RL algorithms in control context

They have been mainly used to solve:
1) optimal regulation and optimal tracking of single-agent systems
2) optimal coordination of multi-agent systems

Reinforcement learning term → control systems equivalent:

• Policy → Controller
• Environment → Everything that is not the controller, for example the plant, the reference signal, and the calculation of the error. In general, the environment can also include additional elements, such as:
  • Measurement noise
  • Disturbance signals
  • Filters
  • Analog-to-digital and digital-to-analog converters
• Observation → Any measurable value from the environment that is visible to the agent. In the preceding diagram, the controller can see the error signal from the environment. You can also create agents that observe, for example, the reference signal, measurement signal, and measurement signal rate of change.
• Action → Manipulated variables or control actions.
• Reward → A function of the measurement, error signal, or some other performance metric. For example, you can implement reward functions that minimize the steady-state error while minimizing control effort.

• Learning algorithm → Adaptation mechanism of an adaptive controller.
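A sketch of this mapping as code: a gym-style environment wrapping a simple first-order plant, where the observation is the tracking error, the action is the control input, and the reward penalizes both error and control effort. The plant model, actuator limits, and constants are illustrative assumptions.

```python
# Sketch of the mapping above: a gym-style environment that wraps a simple plant.
# Observation = tracking error, action = control input, reward penalizes error
# and control effort. The first-order plant, reference, and constants are
# illustrative assumptions.
import numpy as np

class FirstOrderPlantEnv:
    def __init__(self, dt=0.05, tau=0.5, reference=1.0, horizon=200):
        self.dt, self.tau, self.reference, self.horizon = dt, tau, reference, horizon

    def reset(self):
        self.y, self.t = 0.0, 0
        return np.array([self.reference - self.y])        # observation: error signal

    def step(self, u):
        u = float(np.clip(u, -5.0, 5.0))                   # actuator limits
        self.y += self.dt * (-self.y + u) / self.tau       # plant dynamics
        self.t += 1
        error = self.reference - self.y
        reward = -(error ** 2 + 0.01 * u ** 2)             # penalize error and effort
        done = self.t >= self.horizon
        return np.array([error]), reward, done, {}

# A random "agent" just to exercise the interface; an RL agent would replace it.
env = FirstOrderPlantEnv()
obs, done, total = env.reset(), False, 0.0
while not done:
    obs, reward, done, _ = env.step(np.random.uniform(-1, 1))
    total += reward
print(f"episode return: {total:.2f}")
```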

Control systems that are a good fit for RL

Deep, broad and strong base of foundational knowledge with major emphasis on decision making under uncertainty.

Control systems
• Robotics
• Wind turbine control
• Autonomous vehicles
• Factory automation
• Smart grids
• Machine tuning

Monitor and maintain
• Quality control
• Fault detection and isolation
• Predictive maintenance
• Inventory monitoring
• Supply chain risk management

Optimization
• Process planning
• Job shop scheduling
• Yield management
• Supply chain
• Demand forecasting
• Production coordination
• Network optimization

Challenges in RL for Engineering Applications

Deep, broad and strong base of foundational knowledge with major emphasis on decision making under uncertainty.

• The states and actions are inherently continuous, and the dimensionality of both can be high

• Physical world uncertainty

• Simulation environment

• Reward function design

• Professional knowledge requirements

• Lack of standard benchmarks

Interpretable deep learning systems (IDLS)

It is often challenging to intuitively and quantitatively understand how a deep neural network arrives at a particular decision for a specific input, due to its high non-linearity and nested structure.

The success of this goal is tied to the cognition, knowledge, and biases of the user: for a system to be interpretable, it must produce descriptions that are simple enough for a person to understand.

An explanation can be evaluated in two ways:
• according to its interpretability, and
• according to its completeness.

X. Zhang et al. 2018

Methods for explaining neural networks generally fall within two broad categories:
(1) Saliency methods: which weights are being activated given some inputs.
(2) Feature attribution: attempts to fit structural models on a subset of data in such a way as to find out the explanatory power each variable has on the output variable.
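A minimal gradient-based saliency sketch of the first category; the tiny network and random "image" are placeholders for illustration, not any specific published method.

```python
# Minimal gradient-based saliency sketch: how strongly each input pixel
# influences the chosen output. The tiny network and random input are
# placeholder assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

image = torch.rand(1, 1, 28, 28, requires_grad=True)   # placeholder input
logits = model(image)
target = logits.argmax()                                # class the network picked

# Back-propagate the chosen class score to the input; the gradient magnitude
# serves as the saliency map.
logits[0, target].backward()
saliency = image.grad.abs().squeeze()
print(saliency.shape)   # torch.Size([28, 28])
```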

Thank you.

Deloitte, one of Canada's leading professional services firms, provides audit, tax, consulting, and financial advisory services. Deloitte LLP, an Ontario limited liability partnership, is the Canadian member firm of Deloitte Touche Tohmatsu Limited. Deloitte refers to one or more of Deloitte Touche Tohmatsu Limited, a UK private company limited by guarantee, and its network of member firms, each of which is a legally separate and independent entity. Please see www.deloitte.com/about for a detailed description of the legal structure of Deloitte Touche Tohmatsu Limited and its member firms. The information contained herein is not intended to substitute for competent professional advice.

© Deloitte LLP and affiliated entities.

Appendix

Learning methodologies

Policy-based
• Directly search for and learn a policy function (maximizing future reward)
• Learn the stochastic policy function that maps the state to the action

Model-based
• Dynamic programming (memory intensive)
• Update the model and re-plan often
• Example: chess
• Extremely efficient

Value-based
• Learn the state or state-action value
• Estimate the optimal value function Q*(s,a), i.e. the maximum value achievable under any policy
• Act by choosing the best action in each state
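As an illustration of the policy-based approach, here is a minimal REINFORCE sketch on CartPole-v1. It assumes the pre-0.26 gym step/reset API, and the network and hyperparameters are illustrative assumptions.

```python
# Minimal REINFORCE sketch (policy-based): the policy network is learned
# directly by weighting action log-probabilities with the discounted return.
# Assumes the pre-0.26 gym API; hyperparameters are illustrative assumptions.
import gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(300):
    obs = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        obs, reward, done, _ = env.step(int(action))
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)

    # discounted returns, computed backwards from the end of the episode
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    loss = -(torch.stack(log_probs) * returns).sum()   # policy-gradient objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```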

CRISP-DM Laws

1. Business objectives are the origin of every algorithm.
2. Business knowledge is central to every step of the data mining process.
3. Data preparation is more than half of every data mining process.
4. The right model for a given application can only be discovered by experiment, or "There is no free lunch for the data miner."
5. There are always patterns.
6. Data mining amplifies perception in the business domain.
7. Prediction increases information locally by generalization.
8. The value of data mining results is not determined by the accuracy or stability of predictive models.
9. All patterns are subject to change.

http://khabaza.codimension.net/index_files/9laws.htm
