Artificial Intelligence – A Driving Force in Industry 4.0
Shaibal Barua, PhD Researcher, Artificial Intelligence and Intelligent Systems
[email protected]

29 May, 2020

Outline

● Artificial Intelligence – what’s the deal?

● Industrial Artificial Intelligence

● The applied AI workflow
  • Data cleaning and preparation
  • Data representation
  • AI problems and methods
  • Validation

● Use cases

● What’s next?

Part 1: Artificial Intelligence

Poll 1

Go to: www.menti.com Use the code: 16 54 26

Intelligence

“The ability to learn, understand and think in a logical way about things; the ability to do this well” - Oxford dictionary

Capability to understand complex ideas, to reason, to learn from experience, to adapt to the environment, to plan and to solve problems.

Sources: "Mainstream Science on Intelligence", 1994, the Wall Street Journal; ”Intelligence: Knowns and Unknowns”, 1995, the Board of Scientific Affairs of the American Psychological Association

Poll 2

Go to: www.menti.com Use the code: 49 25 59

Artificial Intelligence

Artificial Intelligence (AI) is usually defined as the science of making computers do things that require intelligence when done by humans.

AI is the study of programmed systems that can simulate, to some extent, human activities such as perceiving, thinking, learning and acting.

Behavior by a machine that, if performed by a human being, would be called intelligent

Artificial Intelligence

Figure: Artificial Intelligence at the centre, surrounded by natural language, social intelligence, reasoning, knowledge representation, learning, planning and perception.

Three types of AI

● Narrow AI
  • Singular task
  • Successfully realized to date
  • Operate under a narrow set of constraints and limitations
● General AI
  • Machine intelligence
  • Carry out any cognitive function that a human can
  • Knowledge transfer between domains
  • Fujitsu’s ”K”
● Superintelligence
  • Hypothetical agent
  • Machines become self-aware
  • Surpass the capacity of human intelligence

Poll 3

Go to: www.menti.com Use the code: 60 72 51

How old is the idea of AI?

● Sixth-fifth century BC
  • Aristotle lays out the epistemological basis; introduces syllogistic logic
  • The Iliad – assorted automata from the workshops of the Greek god Hephaestus

● Late first century
  • Fabled automata built by Heron of Alexandria

● Fifteenth-sixteenth century
  • Mechanical clocks; Paracelsus introduces a recipe for a homunculus, an intelligent “little man”

● Eighteenth century
  • Philosophers try to formulate the laws of thought

● Nineteenth century
  • Artificial beings proliferate in literature: Hoffmann’s The Sandman, Goethe’s Faust, Mary Shelley’s Frankenstein

● Twentieth century
  • Alan Turing proposes an abstract universal computing machine

AI: Past, Present and Future

● The Turing test
  • "I propose to consider the question, 'Can machines think?'" (A. Turing, 1950)
  • An interrogator asks questions to an (unseen) person A. If A is replaced by an AI, can the interrogator detect this or not?
● Golden years: 1957-1974
  • Symbolic AI, search algorithms, neural nets, industrial robots, etc.
● 1st AI Winter: 1974-1980
● Expert systems boom: 1980-1987
  • Rule-based, logical systems
  • Selection of components based on customer requirements
  • 5th gen project (Japan)
  • Neural networks, backprop.
● 2nd AI Winter: 1987-1993
● Goals fulfilled: 1993-2011
  • Faster computers
  • Deep Blue (1997)
  • Victory of the “neats” (2003)
  • DARPA Grand Challenge (2005)
  • AI’s untold successes in robotics, logistics, speech recognition and search engines
● Big data and general AI: 2011-present
  • Access to large amounts of data
  • Deep learning drives progress in image and video processing, text analysis and speech recognition
  • DeepMind defeats world champion in Go (2016)
  • Widespread discussions around Strong AI: superhuman intelligence
● 2017 AlphaGo: Google’s AI beats world champion Ke Jie. Notable because of the vast number of possible positions in Go (roughly 2 × 10^170).

Poll 4

Go to: www.menti.com Use the code: 88 11 55

Responsible AI

● Ethical Reasoning
● Accountability, Responsibility, Transparency
● Responsible AI is concerned with the fact that decisions and actions taken by intelligent autonomous systems have consequences that can be seen as being of an ethical nature.

Figure: Trolley problem dilemma

Figure: Interrelationship of the seven requirements: all are of equal importance, support each other, and should be implemented and evaluated throughout the AI system’s lifecycle

Source: EU Ethics Guidelines for Trustworthy AI, https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines/1

Poll 5

Go to: www.menti.com Use the code: 71 62 93

Part 2: Industrial Artificial Intelligence

The four industrial revolutions

● 4th Industrial Revolution (2011)
  • Connected machines
  • Complex human-machine interaction
  • Artificial intelligence

● 3rd Industrial Revolution (1969)
  • Nuclear energy
  • Electronics, telecommunication, computers
  • Automation - PLCs, control theory, PID regulators, etc.
  • Industrial robots

● 2nd Industrial Revolution (1870)
  • Electricity, gas and oil
  • Combustion engine, steel industry, chemical industry
  • Telegraph, telephone
  • Division of Labour (Taylorism), Mass production (Ford)

● 1st Industrial Revolution (1765)
  • Mechanical production
  • Industry instead of agriculture as basis of economy
  • Water power
  • Steam engine

Industrial Artificial Intelligence

A systematic discipline, which focuses on developing, validating and deploying various algorithms for industrial applications with sustainable performance.

Source: Jay Lee, Hossein Davari, Jaskaran Singh, Vibhor Pandhare, Industrial Artificial Intelligence for industry 4.0-based manufacturing systems, Manufacturing Letters, Volume 18, 2018, Pages 20-23

AI and Industry 4.0

Figure (layers from top to bottom):
● Decision making and applications: Product, Company, Manufacturer, Supplier
● Deep insights, Knowledge
● AI/ML Enabled Advanced Analytics
  • Descriptive (What happened): Capture products’ condition, environment and operation
  • Diagnostics (Why it happened): Examine the causes of reduced product performance or detect failure
  • Predictive (What will happen): Predict quality and patterns that signal impending events
  • Prescriptive (What action to take): Identify measures to improve outcomes or correct problems
● Pattern
● Data Processing, Data Aggregation
  • Smart, connected products (Location, condition, use, etc.)
  • Enterprise Data (Service histories, warranty status, etc.)
  • External Data (Price, weather, supplier inventory, etc.)
● Smart Connected Process

Adapted from: Jinjiang Wang, Yulin Ma, Laibin Zhang, Robert X. Gao, Dazhong Wu, Deep learning for smart manufacturing: Methods and applications, Journal of Manufacturing Systems, Volume 48, Part C, 2018, Pages 144-156

Key elements in Industrial AI

● Analytics technology (A),

● Big data technology (B),

● Cloud or Cyber technology (C),

● Domain knowhow (D) and

● Evidence (E)

Industrial AI

Figure: Comparison of Industrial AI with other learning systems

Source: Jay Lee, Hossein Davari, Jaskaran Singh, Vibhor Pandhare, Industrial Artificial Intelligence for industry 4.0-based manufacturing systems, Manufacturing Letters, Volume 18, 2018, Pages 20-23

Challenges of Industrial AI

● Machine-to-machine interactions

● Machine-to-human interactions

● Data quality

● Cyber security

Poll 6

Go to: www.menti.com Use the code: 92 63 4

Part 3: The applied AI workflow

The Industrial AI stack

Business Understanding

Data collection and processing

Representation

“Solving the problem”

Validation

Deployment, maintenance and support

Planning, scheduling, etc.

Common in industrial problems everywhere:
● How should we schedule a workforce?
● How to order manufacturing steps in a product variant?
● How to order individual manufacturing orders/items?
● On what units should which maintenance be performed and when?
● … etc.

Typically, a deep understanding of the business is needed.

Data cleaning and preparation

Data from real applications is dirty:
● Duplicates and missing data
● Values with special meaning (ID 9999 means ”missing”)
● Invalid data
● Logically inconsistent data
● Mystery data (railway cars which are 600 meters long)
● Spiking data (temperature is 10e+10 for 1 millisecond)
● Sensor drift, ”almost” values (0.6% really means 0.0%; 100.6% means 100%)
● Multiple data files which are not in sync
● Misspellings
● Different wordings

Data preparation and cleaning takes a long time!
Validity threat: data cleaning removes realistic details
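A few of the problems listed above can be handled with simple rules. The following is a minimal sketch, not a general cleaning pipeline: it assumes percentage readings stored as (timestamp, value) pairs, and the names `clean_readings`, `SENTINEL` and the `tolerance` threshold are hypothetical choices made for illustration.

```python
import math

SENTINEL = 9999  # assumed marker for "missing", as in the ID 9999 example


def clean_readings(readings, low=0.0, high=100.0, tolerance=1.0):
    """Clean (timestamp, value) percentage readings with simple rules.

    Drops exact duplicates, marks missing values as None, discards spikes
    far outside [low, high], and snaps "almost" values onto the limits.
    """
    seen = set()
    cleaned = []
    for timestamp, value in readings:
        if (timestamp, value) in seen:           # exact duplicate row
            continue
        seen.add((timestamp, value))
        if value is None or value == SENTINEL or math.isnan(value):
            cleaned.append((timestamp, None))    # keep the gap visible
            continue
        if value < low - tolerance or value > high + tolerance:
            continue                             # spike: physically impossible
        if abs(value - low) <= tolerance:        # e.g. 0.6% really means 0.0%
            value = low
        elif abs(value - high) <= tolerance:     # e.g. 100.6% means 100%
            value = high
        cleaned.append((timestamp, value))
    return cleaned


raw = [(0, 0.6), (1, 50.0), (1, 50.0), (2, 9999), (3, 1e11), (4, 100.6)]
print(clean_readings(raw))
# [(0, 0.0), (1, 50.0), (2, None), (4, 100.0)]
```

Every rule here throws information away (the spike at timestamp 3 disappears entirely), which is exactly the validity threat named above; the thresholds should be agreed with someone who knows the sensors.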

Example

Poll 7

Go to: www.menti.com Use the code: 70 24 27

Representation

● Before a method is chosen, the representation should be considered
● For machine learning – what should be the input?
  • E.g. vibration/noise analysis – representation in time/space or frequency domain?
● For planning, scheduling, simulation – what model abstraction should be used?
  • E.g. microscopic model of robot movements, mesoscopic model of discrete manufacturing steps, or macroscopic model of completion time distribution for product variants.

● In both cases, the representation of the problem can impact performance substantially.

● Finding the right representation requires in-depth understanding of the application!

● Stakeholders must agree to modeling assumptions!
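To make the time-versus-frequency question concrete, here is a small sketch that takes a made-up vibration signal and re-represents it in the frequency domain with a naive discrete Fourier transform. The signal, the sample rate and the helper name `dft_magnitudes` are assumptions for illustration; a real application would normally use an FFT library.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin (0 .. n/2) of a real signal, naive O(n^2)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


sample_rate = 64  # samples per second (assumed)

# One second of a hypothetical vibration signal:
# a strong 5 Hz component plus a weaker 12 Hz component.
time_domain = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]

# With exactly one second of data, bin k corresponds to k Hz.
freq_domain = dft_magnitudes(time_domain)
dominant_hz = max(range(1, len(freq_domain)), key=freq_domain.__getitem__)
print(dominant_hz)  # 5
```

In the time domain the two components are mixed in all 64 samples; in the frequency domain they show up as two isolated peaks, which is often a far easier input for a learning method.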

Types of ML

● ML tasks are typically classified into the following categories, depending on the nature of the learning "signal" or "feedback" available:

● Supervised learning – uses inputs and their desired outputs
  • The program is “trained” on a pre-defined set of “training examples”, which then help it reach an accurate conclusion when given new data.

● Unsupervised learning – no labels are given to the learning algorithm
  • The program is given a set of data and must find patterns and relationships therein.

Some problems in Supervised Machine Learning

Supervised Machine Learning

Problems in supervised ML

In classification, inputs are divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one or more of these classes.

In regression, the outputs are continuous rather than discrete. - An example of a regression problem would be the prediction of the length of a salmon as a function of its age and weight.
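The difference between the two problem types can be sketched with two very small learners. All numbers below are made up, the regression uses age only (not weight) to stay short, and the helper names `fit_line` and `nearest_neighbour` are illustrative, not taken from any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbour(train, query):
    """Classify `query` with the label of the closest training point (1-NN)."""
    _, label = min(train, key=lambda item: abs(item[0] - query))
    return label


# Regression: made-up salmon ages (years) and lengths (cm).
ages = [1, 2, 3, 4]
lengths = [30.0, 40.0, 50.0, 60.0]
slope, intercept = fit_line(ages, lengths)
predicted_length = slope * 5 + intercept      # continuous output, about 70 cm

# Classification: made-up lengths (cm) with discrete class labels.
labelled = [(30.0, "small"), (40.0, "small"), (60.0, "large"), (70.0, "large")]
predicted_class = nearest_neighbour(labelled, 58.0)   # discrete output

print(predicted_length, predicted_class)
```

Both learners are supervised: each training input comes with its desired output, a number in the first case and a class label in the second.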

Machine Learning Algorithms

Figure: Five families of ML algorithms, each described by its representation (REPR), evaluation (EVAL) and optimization (OPT):
● Symbolists – REPR: Logic; EVAL: Accuracy; OPT: Inverse Deduction
● Analogizers – REPR: Support vectors; EVAL: Margin; OPT: Constrained Optimization
● Connectionists – REPR: Neural Networks; EVAL: Squared Error; OPT: Gradient Descent
● Bayesians – REPR: Graphical model; EVAL: Posterior Probability; OPT: Probabilistic Inference
● Evolutionaries – REPR: Genetic Programs; EVAL: Fitness; OPT: Genetic Search

Some problems in Unsupervised Machine Learning

Unsupervised Machine learning

Problems in unsupervised ML

In clustering, a set of inputs is to be divided into groups. - Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.

Density estimation finds the distribution of inputs in some space.

Dimensionality reduction simplifies inputs by mapping them into a lower-dimensional space.
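As one concrete instance of clustering, the sketch below runs k-means (Lloyd's algorithm) on one-dimensional data. The sensor values and the initial centroid guesses are invented for the example, and `k_means_1d` is a hypothetical helper rather than a library function.

```python
def k_means_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data; `centroids` are the initial guesses."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up temperature readings from two operating modes; no labels are given.
readings = [20.1, 19.8, 20.3, 80.2, 79.9, 80.5]
centroids, clusters = k_means_1d(readings, centroids=[0.0, 100.0])
print(centroids)
print(clusters)
```

The algorithm is only told how many groups to look for; which readings belong together is discovered from the data, which is what distinguishes this from classification.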

Model Validation

● During model building, it is easy to find random patterns in the sample that might not be present in the whole population.

● To justify the performance of the built predictive model, the validation should be done with data points that were never used while building the model.

● For a set of ML variants, optimize parameter selection (learn) on the training set.

● Find the ML variant which performs best on the cross-validation set (e.g. polynomial degree)

● Estimate the generalization error using the test set

Overtraining

● You’ve trained your model, now what?
● Overtraining – model doesn’t generalize to (perform well on) new data.
● “Validation” or evaluation is used to estimate the performance on new data, i.e. how the model would perform when actually used
● Validation results will always be too optimistic!

Image by Chabacano / CC BY

✕ Few data samples
✕ Complicated model
✕ Similar training, test and validation sets
✕ Fine-tuning parameters
✕ Evaluating several models with the same validation set

Data Splitting

● Training Set:
  • This set is used for training the predictive models.

● Validation Set:
  • The values of the different parameters of the built model are fixed using this set.

● Test Set:
  • The accuracy of the built model is determined using this set.

● In common practice, the training set is made from the larger part of the data and contains data points of all possible outcomes. The rest of the dataset is split into two sets for validation and testing.

● A widely used ratio of data splitting is 60:20:20 for training, validation and testing.
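A 60:20:20 split can be sketched in a few lines. The function name `split_dataset`, the fixed seed and the use of plain integers as stand-in samples are assumptions for illustration; the sketch also does not stratify, so it does not by itself guarantee that every outcome appears in the training set.

```python
import random


def split_dataset(samples, train=0.6, validation=0.2, seed=0):
    """Shuffle the samples and split them into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # fixed seed: reproducible split
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],          # whatever remains is the test set
    )


train_set, validation_set, test_set = split_dataset(range(100))
print(len(train_set), len(validation_set), len(test_set))  # 60 20 20
```

Shuffling before splitting matters when the data is ordered (for example by time or by class); for time series, a chronological split is often the safer choice.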

K-fold Cross Validation

Figure: Dataset splitting in 5-fold cross validation. The original data (n = 20) is divided into five parts; in each round (CV #1 to CV #5) a different part serves as the validation set and the rest as the training set.
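The splitting in the figure can be reproduced with index arithmetic. This is a minimal sketch assuming contiguous, unshuffled folds and a sample count divisible by k; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    fold_size = n_samples // k      # assumes n_samples is divisible by k
    indices = list(range(n_samples))
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        # The current fold validates; everything else trains.
        yield indices[:start] + indices[stop:], indices[start:stop]


folds = list(k_fold_indices(n_samples=20, k=5))
for number, (training, validation) in enumerate(folds, start=1):
    print(f"CV #{number}: train on {len(training)}, validate on {validation}")
```

Each of the 20 data points is used for validation exactly once and for training four times, so the averaged validation score uses all the data without ever testing a model on points it was trained on.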

Performance Measures

Confusion Matrix

Case 1 (Class 1 = 10 samples, Class 2 = 10 samples):

    9  1
    2  8

Accuracy = 85%

Case 2 (Class 1 = 9 samples, Class 2 = 1 sample):

    9  0
    1  0

Accuracy = 90%
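The two accuracy figures can be checked directly from the matrices. The sketch assumes that rows are actual classes and columns are predicted classes, which is consistent with the row sums matching the stated class sizes.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Assumed layout: rows = actual class, columns = predicted class.
case_1 = [[9, 1], [2, 8]]
case_2 = [[9, 0], [1, 0]]

print(accuracy(case_1))  # 0.85
print(accuracy(case_2))  # 0.9
```

Case 2 scores higher even though the model never predicts Class 2 at all: with imbalanced classes, accuracy alone can hide a model that ignores the rare class.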

F1 Score

Confusion matrix layout:

    TP  FP
    FN  TN

● F1 score tells us how precise and robust a model is.

● It is the harmonic mean of the Precision and Recall values:

$$F_1 = 2 \cdot \frac{1}{\dfrac{1}{\text{Precision}} + \dfrac{1}{\text{Recall}}}$$

● Useful when the False Negatives and False Positives are crucial

● Useful when there are imbalanced classes

● Greater F1 Score indicates better performance for prediction models.
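The sketch below computes precision, recall and F1 from raw counts. Treating Class 2 as the positive class and reporting a ratio with a zero denominator as 0.0 are both assumptions made here for illustration; the counts are read off the Case 1 and Case 2 confusion matrices shown earlier, with rows taken as actual classes.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1; a ratio with a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean: 2 / (1/precision + 1/recall), rewritten to avoid dividing by zero.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Case 1, Class 2 as positive: 8 found, 2 missed, 1 false alarm.
_, _, f1_balanced = precision_recall_f1(tp=8, fp=1, fn=2)

# Case 2, Class 2 as positive: the single positive sample is missed.
_, _, f1_imbalanced = precision_recall_f1(tp=0, fp=0, fn=1)

print(round(f1_balanced, 4), f1_imbalanced)  # 0.8421 0.0
```

Under these assumptions the Case 2 model, which reached 90% accuracy, gets an F1 of 0 for the rare class, which is why F1 is preferred when classes are imbalanced.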

Performance Measures

● Receiver Operating Characteristics (ROC)
  • ROC is a widely used tool for validating binary classification models.

● Two basic terms for ROC and the area under the curve (AUC):
  • Sensitivity: also called the True Positive Rate (TPR). It is calculated from the values of the confusion matrix:

    $$\text{Sensitivity} = \text{TPR} = \frac{TP}{TP + FN}$$

  • Specificity: also called the True Negative Rate (TNR); the False Positive Rate (FPR) is 1 − Specificity. It is calculated with the formula:

    $$\text{Specificity} = \text{TNR} = \frac{TN}{TN + FP}$$

● Example: Consider a test for Covid-19
  • The test has 90% sensitivity. That means the test will correctly return a positive result for 90% of people who have the disease, but will return a negative result (a false negative) for 10% of the people who have the disease and should have tested positive.

● What about specificity?
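Both rates follow directly from the four confusion-matrix counts. The counts below are invented: a hypothetical test applied to 100 people with the disease and 1000 without it.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that test negative."""
    return tn / (tn + fp)


# Hypothetical counts: 100 people with the disease, 1000 without.
tp, fn = 90, 10      # 90 detected, 10 false negatives
tn, fp = 950, 50     # 950 correctly cleared, 50 false positives

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
false_positive_rate = 1 - spec

print(sens, spec)
```

With these numbers the test has 90% sensitivity and 95% specificity, so 5% of healthy people would still receive a positive result. An ROC curve plots the true positive rate against the false positive rate as the decision threshold varies.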

Poll 8

Go to: www.menti.com Use the code: 23 58 72

The Industrial AI stack

Business Understanding

Data collection and cleaning

Representation

“Solving the problem”

Validation

Deployment, maintenance and support (out of scope of this lecture)

The Industrial AI stack in reality

Business Understanding

Data collection and cleaning (often 80% of total effort)

Representation

“Solving the problem” (20% of effort)

Validation

Deployment, maintenance and support (end-user value)

Figure annotation: “Foundations of value creation”, placed alongside the early stages of the stack.

Part 4: Use cases

Example: Machine Health Monitoring

Figure: Three kinds of machine health monitoring system (MHMS), each leading from the monitored machine through data acquisition to a solution:
● Physical-based MHMS (hand-designed physical model)
● Conventional data-driven MHMS
● Deep learning based MHMS

Source: Rui Zhao, Ruqiang Yan, Zhenghua Chen, Kezhi Mao, Peng Wang, Robert X. Gao, Deep learning and its applications to machine health monitoring, Mechanical Systems and Signal Processing, Volume 115, 2019, Pages 213-237, ISSN 0888-3270

Example: Data Analytics in Industry 4.0

A Case Study in Power Transfer Unit

Project: AUTOMAD
Project Leader: Dr. Mobyen Uddin Ahmed, Docent
Contact: [email protected]

Example: Monitoring and Quality Control

Project Leader: Prof. Peter Funk, MDH
Contact: [email protected]

Prototype running at Volvo and Chalmers; cloud-based solution, implemented by Ivan Tomašić, MDH

Example: Pulp and paper

● Detection of Digester Faults
  • Screen clogging
  • Hang-ups
  • Channelling
● Detection of Anomalies
  • Sensor faults
  • Something is wrong
● Prediction
  • Kappa value

Figure labels: Operating cost, Quality, Production Rate

Contact: [email protected] and [email protected]

Poll 9

Go to: www.menti.com Use the code: 40 40 42

Production engineering course autumn 2020

● Big Data and Cloud Computing for Industrial Applications
● Study period 2020-11-09 - 2021-01-17

● Visit mdh.se/premium

Interesting Reading

Thank you