Semantic Analysis of Ladder Logic

A Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By

Soumyashree Gad, M.S.

Graduate Program in Computer Science and Engineering

The Ohio State University

2017

Master’s Examination Committee:

Dr. Srinivasan Parthasarathy, Advisor

Dr. P. Sadayappan

Copyright by

Soumyashree Gad

2017

Abstract

Purpose: A careful examination of a Ladder Logic program reveals its hierarchical nature and its components, which make it interpretable. Using data mining techniques to interpret the top level and sub-level components can therefore be very useful.

This is naturally a classification problem. Applying machine learning algorithms to features extracted from Ladder Logic can give insight into the whole program. The components and their interactions are intuitive, which improves the likelihood of good results. Learning PLC programming can be eased by converting existing PLC code to commonly used formats such as JSON or XML. The goal is to consume a Ladder Logic sample, break it down into minor components, identify the components and the interactions between them, and finally write them to JSON. As Ladder Logic is the language most commonly used for PLCs, we decided to start the experiment with Ladder Logic program samples. Feature engineering combined with machine learning techniques should provide accurate results on the Ladder Logic data.

Methods: The data set contains 6623 records for top level classification, with 1421 ALARM, 150 STEP SEQUENCE, 96 SOLENOID and 5304 UNCLASSIFIED samples, and 84472 records for sub-level classification, covering the sub-level components of all the ALARM and STEP SEQUENCE samples from the top level data set. We extract the initial top level and sub-level features from GX Works. Advanced features such as Sequence, LATCH and comments are extracted by parsing the output of GX Works. The final set of features for top level classification consists of basic features, advanced features, and comments. The data set for sub-level classification has a few more features on top of those from the top level: previous-instruction/next-instruction features (3-window), bi-gram features of the instructions, and the top level class (the result of top level classification). The results of top level and sub-level classification are filled into a JSON object which is later written to a JSON file.

Results: We have classification results for the top level and the sub-level. Since the features are discrete, we tried Decision Trees, Naive Bayes and Support Vector Machines; Decision Trees work best for both levels. Top level performance:

Decision Tree: Accuracy 0.91, F1-macro 0.90, F1-micro 0.90
Naive Bayes: Accuracy 0.85, F1-macro 0.80, F1-micro 0.85
LinearSVC: Accuracy 0.88, F1-macro 0.88, F1-micro 0.88

For sub-level classification, Decision Trees outperform the other classifiers:

Decision Tree: Accuracy 0.90, F1-macro 0.91, F1-micro 0.90
Naive Bayes: Accuracy 0.80, F1-macro 0.80, F1-micro 0.81
LinearSVC: Accuracy 0.75, F1-macro 0.78, F1-micro 0.79

Conclusions: A decision tree is the classifier best suited to this task. Tuning the Decision Tree classifier played an important role in improving performance: using entropy as the splitting criterion and restricting the depth of the tree to 6 improved the performance of the classifier by 6%.

I dedicate this to my parents, who have always had my back.

Acknowledgments

I would like to thank Prof. Srinivasan Parthasarathy for accepting me as his student and giving me the opportunity to be part of this project. I would like to thank Derrick Cobb from Honda, who came up with the unique concept of using machine learning algorithms with PLCs and was always excited to work with us. I would like to thank Albert (Jiongqian Liang), who mentored me throughout the project and gave valuable inputs.

I would like to thank my friend Mr. Siddharth Saurav, who kept me motivated throughout.

Finally and most importantly I would like to thank God for blessing me with a family which has always believed in me and supported my dream.

Vita

April 3, 1991 ...... Born - Sangli, India

2013 ...... B.E. Computer Science

2013 - 2015 ...... Software Engineer, Bosch India

2015 - present ...... Graduate Student, The Ohio State University.

Fields of Study

Major Field: Computer Science and Engineering

Contents

Page

Abstract ...... ii

Dedication ...... iv

Acknowledgments ...... v

Vita...... vi

List of Figures ...... ix

1. Introduction ...... 1

1.1 Challenges ...... 2
1.2 Thesis Statement ...... 3
1.3 Contributions ...... 3
1.4 Organization ...... 5

2. Background ...... 7

2.1 Ladder Logic ...... 7
2.2 Classification Algorithms ...... 9
2.2.1 Decision Tree ...... 9
2.2.2 Bi-gram ordering ...... 10
2.3 Related work ...... 10
2.3.1 Mining Software Repositories ...... 11
2.3.2 Machine Learning Automotive Industry ...... 13
2.3.3 Aid in learning PLC programming ...... 15

3. Implementation ...... 16

3.1 Pre-Processing Steps ...... 16
3.2 Top Level Classification ...... 19
3.3 Sub-level classification ...... 21
3.4 Performance Improvement ...... 25
3.4.1 Parameter tuning ...... 25
3.4.2 Majority Voting ...... 26

4. Data Set and Results ...... 27

4.1 Data set ...... 27
4.2 Results ...... 31

5. Conclusion and Future Work ...... 36

5.1 Future Work ...... 39

Bibliography ...... 42

List of Figures

Figure Page

1.1 ALARM sub-level example ...... 4

2.1 ladder logic example ...... 8

3.1 Window of size 3 ...... 17

3.2 Sub-Level Data Null padded ...... 18

3.3 Flow Chart for Top level Classification...... 20

3.4 Step Sequence sub-level example ...... 22

3.5 Flow Chart of Sub-level Classification...... 24

4.1 Step Sequence examples with Count ...... 29

4.2 Alarm examples with Count ...... 30

4.3 Decision tree Visualization ...... 31

4.4 Confusion Matrix for TS3 ...... 33

4.5 Heat Map for the important features for the classes ...... 34

4.6 Input and Output GUI ...... 35

5.1 Accuracy vs Max depth of the Decision tree classifier ...... 37

Chapter 1: Introduction

The relationship between hardware and software is long-standing. Software that operates very close to the hardware is usually low level and, in most cases, far from user-friendly. Ladder Logic, one of the PLC programming languages, is a quintessential example of such a low-level language. Ladder Logic is used to develop software for programmable logic controllers (PLCs) used in industrial control applications. The name is based on the observation that programs in this language resemble ladders, with two vertical rails and a series of horizontal rungs between them. The unique features of every class make the identification possible.

Every component in the Ladder Logic can be considered as a separate class and different sub components of the particular component make the labels of the sub- level classification. The sub-level components are unique to their respective top level classes.

There are several data mining techniques which could be used to solve this problem; however, classification seems the most appropriate. Classification is a method of assigning labels to unknown records by learning from previously labeled records (the training set). Labels are assigned based on the maximum similarity between a particular unknown record and the known records. A prediction probability can also be output to show how likely the unknown record is to belong to a certain class. The classifier is said to have learned well when it classifies the unlabeled data with good accuracy. For the classifier to work well, the records in the training set have to be similar to the test data. For example, you cannot have oranges and grapes in the training set and give apples as the test data: all the apples will be predicted as oranges or grapes.

1.1 Challenges

Getting good training data whose records are similar to the test data is very difficult, as there can be numerous possibilities. It is a Herculean task to find every existing pattern to include in the training data, and even after including all the existing patterns it is not certain that we would encounter only similar records. The Ladder Logic samples differ across automotive platforms, so we need a standard if the same tool is to be applied across all the platforms.

The classifier works very intuitively: it classifies records just as a human would, because the extracted features are the very ones humans would look at when identifying the component. However, there are cases where the classifier outputs the wrong result, when the features resemble another class more than the actual class.

This project is in its initial phase, so getting the training set 100 percent error-free is difficult; to err is human. There have been times when the classifier identified wrongly labeled data in the training set. Still, we have to check each record multiple times before inserting it into the training set.

Thus, the challenges involved can be divided into two types:

1. Choosing right training set for the classifier

2. Tuning the parameters of the classifier to improve performance.

We address both challenges, and how to deal with them, in detail in the chapters ahead.

1.2 Thesis Statement

The statement of this thesis is that it is possible to leverage custom feature engineering and a novel hierarchical classification methodology to translate Ladder Logic programs into an understandable format (JSON) that can help both train engineers working on such systems and explain the underlying logic of those systems to them. In terms of impact, the proposed work has the potential to transform the automotive industry.

Given a set of features extracted from GX Works for top level records and their corresponding sub-level records, we describe a method of classifying the unlabeled top level and sub-level data with acceptable accuracy, validating the sequences, and writing the valid sequences to JSON, which can be fed to an HMI for visualization.
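As a rough illustration of the final output step, the sketch below writes one classified chunk to a JSON file. The schema, field names, and values here are hypothetical placeholders, not the exact format produced in this work.

```python
import json

# Hypothetical structure for one classified ladder chunk; every field
# name and value below is illustrative, not the thesis's actual schema.
result = {
    "chunk_id": 42,
    "top_level_class": "ALARM",
    "sub_levels": [
        {"instruction": "LD X0", "label": "T"},   # triggering condition
        {"instruction": "OUT M10", "label": "A"}, # alarm latch contact
    ],
}

# Serialize the chunk so a downstream HMI tool could consume it.
with open("ladder_chunk.json", "w") as f:
    json.dump(result, f, indent=2)
```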

1.3 Contributions

Firstly, we present a novel method of visualizing Ladder Logic code as classes and features. Ladder Logic is like any other language in that it has different functions to carry out different tasks; however, because it works closely with hardware, it involves circuits, I/O, and device arrangements, unlike programming languages such as Java or Python.

Figure 1.1: ALARM sub-level example

Secondly, we present a novel way, never tried before, of extracting features from Ladder Logic code. The features are extracted from the instruction set, the devices, and the identified patterns. For top level classification, the features are divided into:

1. Basic features: features from the instruction set and device names.

Examples:

Instruction features: LD, LDI, MOV, AND, OR, etc.

Device features: X, Y, M, etc.

2. Advanced features: the Sequence and Latch features.

The Sequence feature identifies the presence of a sequence pattern in a particular component (the record to be classified into one of the classes). It generally occurs in the STEP SEQUENCE class.

The LATCH feature identifies the presence of a latch, i.e., input and output from the same device. LATCH occurs prominently in ALARM and SOLENOID.

3. Comments: We extract the comments written for every device and convert them to binary features, making them suitable for the decision tree. The dictionary holds the 13 comment terms with the highest TF-IDF.
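To make the comment-feature idea concrete, here is a minimal sketch of scoring comment terms by TF-IDF and keeping a small binary vocabulary. The toy comment strings and the vocabulary size (3 instead of 13) are illustrative only, and the exact TF-IDF weighting used in this work may differ.

```python
from collections import Counter
import math

# Toy corpus: one string of device comments per ladder chunk (illustrative).
comments = [
    "alarm latch fault reset",
    "step sequence start condition",
    "alarm fault output",
]

# Aggregate term frequency and document frequency over the corpus.
tf = Counter(t for doc in comments for t in doc.split())
df = Counter(t for doc in comments for t in set(doc.split()))
n = len(comments)

# One common TF-IDF variant: tf * log(n / df); ubiquitous terms score 0.
tfidf = {t: tf[t] * math.log(n / df[t]) for t in tf}

# Keep the k highest-scoring terms as the feature dictionary.
k = 3
vocab = sorted(tfidf, key=tfidf.get, reverse=True)[:k]

# Binary feature vector per chunk: 1 if the term appears in its comments.
features = [[1 if t in doc.split() else 0 for t in vocab] for doc in comments]
```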

Thirdly, we present a novel way of applying a hierarchical multi-class classification algorithm. We start by classifying the top level data set, and the classification result is written to the sub-level data: Top Class is a new feature included in the sub-level data. Based on the top level class, sub-level classification is carried out, making sure the predicted sub-level label belongs to the possible labels of the respective top level class.
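This label constraint can be sketched as follows, assuming the classifier exposes its label order and per-class probabilities (as scikit-learn's `classes_` attribute and `predict_proba` method do). The helper name is illustrative; the label sets come from the class descriptions in this thesis.

```python
# Valid sub-level labels per top-level class, per the thesis
# (SOLENOID sub-levels are out of scope for this project).
VALID = {
    "ALARM": {"T", "A", "R", "O", "F", "AT"},
    "STEP SEQUENCE": {"P", "S", "SE", "C", "O", "L"},
}

def constrained_label(classes, proba, top_class):
    """Pick the highest-probability sub-level label that is legal for
    the given top-level class. `classes` and `proba` would come from a
    fitted classifier's classes_ and one row of predict_proba()."""
    allowed = VALID[top_class]
    best, best_p = None, -1.0
    for c, p in zip(classes, proba):
        if c in allowed and p > best_p:
            best, best_p = c, p
    return best

# For an ALARM chunk, 'P' is illegal, so 'A' wins despite a lower raw score.
classes = ["T", "A", "P", "S"]
proba = [0.1, 0.2, 0.6, 0.1]
pick = constrained_label(classes, proba, "ALARM")
```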

Lastly, we provide a way to improve performance by tuning the parameters of the classifier and including other algorithms to minimize wrong classifications.

1.4 Organization

The rest of the thesis is organized into four parts. Chapter 2 provides detailed background on Ladder Logic and decision trees. Chapter 3 describes the implementation: how exactly we solved the problem by applying different machine learning algorithms. Chapter 4 presents our experimental results and their explanation. Finally, Chapter 5 discusses conclusions and future work in this field.

Chapter 2: Background

2.1 Ladder Logic

A programmable logic controller (PLC) is an industrial computer control system that continuously monitors the state of input devices and makes decisions based upon a custom program to control the state of output devices. Ladder Logic is the most commonly used PLC programming language. It has evolved into a programming language that represents a program by a graphical diagram based on the circuit diagrams of logic hardware. Ladder Logic is widely used to program PLCs where sequential control of a process or manufacturing operation is required. It is useful for simple but critical control systems or for reworking old hard-wired relay circuits; as programmable logic controllers became more sophisticated, it has also been used in very complex automation systems. Often the Ladder Logic program is used in conjunction with an HMI program operating on a computer workstation. The motivation for representing sequential control logic in a ladder diagram was to allow factory engineers and technicians to develop software without additional training in a language such as FORTRAN or another general-purpose computer language. Development and maintenance were simplified because of the resemblance to familiar relay hardware systems [2].

Implementations of ladder logic have characteristics, such as sequential execution and support for control flow features, that make the analogy to hardware somewhat inaccurate. This argument has become less relevant given that most ladder logic programmers have a software background in more conventional programming languages. Ladder logic can be thought of as a rule-based language rather than a procedural language; a "rung" of the ladder represents a rule. When implemented with relays and other electro-mechanical devices, the various rules "execute" simultaneously and immediately. When implemented in a programmable logic controller, the rules are typically executed sequentially by software, in a continuous loop (scan). By executing the loop fast enough, typically many times per second, the effect of simultaneous and immediate execution is achieved, considering intervals greater than the "scan time" required to execute all the rungs of the program. Proper use of programmable controllers requires understanding the limitations of the execution order of rungs.

Figure 2.1: ladder logic example

In this research, we analyze PLC code by dividing the problem into two parts: the top level, where the top class (ALARM, STEP SEQUENCE or SOLENOID) is identified, and the sub-level, where the sub-level classes (T, A, R, F, etc.) for the respective top level classes are identified. This is done by extracting features from the PLC code (Ladder Logic), then performing preprocessing, feature engineering, and classification.

2.2 Classification Algorithms

2.2.1 Decision Tree

Decision tree learning is a method commonly used in data mining. The goal is to create a model that predicts the value of a target variable based on several input variables. Each interior node corresponds to one of the input variables; there are edges to children for each of the possible values of that input variable. Each leaf represents a value of the target variable given the values of the input variables on the path from the root to the leaf. A decision tree is a simple representation for classifying examples. For this section, assume that all of the input features have finite discrete domains, and that there is a single target feature called the classification; each element of the domain of the classification is called a class. A decision tree, or classification tree, is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with an input feature are labeled with each of the possible values of that feature, or the arc leads to a subordinate decision node on a different input feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes.

A tree can be "learned" by splitting the source set into subsets based on an attribute value test. This process is repeated on each derived subset in a recursive manner called recursive partitioning, or recursive binary splitting. The recursion is complete when the subset at a node has all the same value of the target variable, or when splitting no longer adds value to the predictions. This process of top-down induction of decision trees is an example of a greedy algorithm, and it is by far the most common strategy for learning decision trees from data. In data mining, decision trees can also be described as the combination of mathematical and computational techniques to aid the description, categorization, and generalization of a given set of data.

Data comes in records of the form (X, Y) = (x_1, x_2, x_3, ..., x_k, Y). The dependent variable, Y, is the target variable that we are trying to understand, classify or generalize. The vector X is composed of the input variables x_1, x_2, x_3, etc., that are used for that task.

In sub-level classification, the ordering of the mnemonics plays a very important role, so we implemented a bi-gram ordering algorithm during the pre-processing stage.

2.2.2 Bi-gram ordering

Bi-gram ordering [7] of the mnemonic features is done so that the sequences LDI LD ANI and LD LDI ANI are treated differently. The instructions LDI, OR, LD, ANI, ORB, AND, OUT, ANB and ORI are paired in every valid combination (there is no pair starting with OUT). Features are named with the two mnemonics separated by an underscore, e.g. LDI_LD, LD_OR.
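A minimal sketch of this bi-gram feature construction follows. The helper name and sample sequences are illustrative, and whether self-pairs such as LD_LD count as "valid" is an assumption here.

```python
from itertools import product

# Mnemonics considered for pairing; per the text, no pair starts with OUT.
MNEMONICS = ["LDI", "OR", "LD", "ANI", "ORB", "AND", "OUT", "ANB", "ORI"]
PAIRS = [f"{a}_{b}" for a, b in product(MNEMONICS, repeat=2) if a != "OUT"]

def bigram_features(seq):
    """Count each adjacent mnemonic pair in an instruction sequence,
    so that 'LDI LD ANI' and 'LD LDI ANI' yield different vectors."""
    counts = {p: 0 for p in PAIRS}
    for a, b in zip(seq, seq[1:]):
        key = f"{a}_{b}"
        if key in counts:
            counts[key] += 1
    return counts

f1 = bigram_features(["LDI", "LD", "ANI"])
f2 = bigram_features(["LD", "LDI", "ANI"])
```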

2.3 Related work

Analyzing language semantics has been researched for a long time, either to improve a language or to create tools that help with it.

2.3.1 Mining Software Repositories

There has been a lot of research in the field of language analysis, which is related to our goal of analyzing the semantics of Ladder Logic. Mining Software Repositories (MSR) is a field closely related to language analysis and is also our field of interest. It is a technique used in software engineering focused on analyzing and understanding the data repositories related to a software development project; the main goal of MSR is to make intelligent use of these repositories to help in the decision process of the software project. Software development projects produce several types of repositories during their lifetime, described in the following paragraphs. Such repositories are a result of the daily interactions between the stakeholders, as well as the evolutionary changes to the source code, test cases, bug reports, requirements documents, and other documentation. These repositories offer a rich, detailed view of the path taken to realize a software system, and analyzing them provides a wealth of historical information that can directly benefit the future of a project.

CP-Miner [1] shows how to efficiently identify copy-pasted code in large software, including operating systems, and detect copy-paste-related bugs. It is based on frequent subsequence mining, which counts the occurrences of a particular subsequence before declaring it copy-pasted. It is efficient because it is neither plain text matching nor parsing through a traditional parse tree.

HotComments [2] is a work on the comments in a program, which appear first in keyword searches. Program comments have long been used as a common practice for improving inter-programmer communication and code readability, by explicitly specifying programmers' intentions and assumptions. This work stressed how important mining comments is, how well-written comments have helped the development process, and how bad or inconsistent comments have negatively impacted it. Comments are considered an important part of our project as well.

iComment [5] analyzes comments written in natural language to extract implicit program rules and uses these rules to automatically detect inconsistencies between comments and source code, indicating either bugs or bad comments. It is an advanced classification algorithm that classifies the given comments as good or bad.

Goal-Directed Search [3] attempts to reveal only the relevant information needed to establish reachability (or unreachability) of the goal from the initial state of the program. The paper presented a source-to-source transformation on programs that lifts all assertions in the input program to the entry procedure of the output program, thus revealing more information about the assertions close to the entry of the program.

Learn Programs from Examples [8] is work on learning programs from examples: a machine learning approach that departs from previous work by relying upon features that are independent of the program structure, instead relying upon a learned bias over program behaviors, and more generally over program execution traces.

REFAZER [9] is a tool to detect transformations in a program or predict potential transformations. It builds on the observation that code edits performed by developers can be used as input-output examples for learning program transformations.

2.3.2 Machine Learning Automotive Industry


2.3.3 Aid in learning PLC programming

Animations and Intelligent Tutoring Systems [11] were developed to help students learn PLCs better. It is a tool in which animations were designed to let users visualize PLC concepts; that is, they were intended to represent not the physical appearance of PLCs but their theory of operation. The animations were attractive to users and allowed them to manipulate components of the animation to see what would happen.

Chapter 3: Implementation

Since Ladder Logic represents a hierarchical structure, we decided to solve the problem in two stages. Top level classification identifies the top class of a chunk, which can be ALARM, STEP SEQUENCE, SOLENOID or UNCLASSIFIED (everything other than ALARM, STEP SEQUENCE, and SOLENOID), whereas sub-level classification identifies every small bit of a component whose top class is known. Top level classification forms the basis of sub-level classification.

We describe the custom classification algorithm we use in the following sections.

3.1 Pre-Processing Steps

Ladder Logic can be visualized with a tool called GX Works, which also has the option to export the ladder in the form of instructions. The pre-processing step starts with consuming a ladder (a part of the Ladder Logic program). Each ladder segment can be exported to form an instruction list with mnemonics and comments; the mnemonics and the comments are exported to separate files. Comments are linked to particular devices in the ladder.

Once we have the instruction set, the features can be formed by counting the number of instructions present in the ladder segment, identifying the kinds of devices present in the ladder, and reading the comments present in the comment file for those particular devices.
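A minimal sketch of building the basic feature vector by counting mnemonics and device types follows. The sample segment, the instruction list, and the device-type list are toy examples, not the full sets used in this work.

```python
from collections import Counter

# One exported ladder segment as (mnemonic, device) pairs (illustrative).
segment = [("LD", "X0"), ("ANI", "M10"), ("OUT", "Y1"), ("LD", "X1")]

# Subsets of the real instruction and device vocabularies, for illustration.
INSTRUCTIONS = ["LD", "LDI", "ANI", "AND", "OR", "OUT", "MOV"]
DEVICE_TYPES = ["X", "Y", "M"]

# Count each mnemonic, and each device type by its leading letter.
instr_counts = Counter(op for op, _ in segment)
device_counts = Counter(dev[0] for _, dev in segment)

# Fixed-order feature vector: instruction counts then device-type counts.
features = ([instr_counts.get(i, 0) for i in INSTRUCTIONS]
            + [device_counts.get(d, 0) for d in DEVICE_TYPES])
```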

Advanced features like Sequence and Latch are found by detecting, respectively, the presence of a sequence in the ladder, i.e., when the offset between two or more devices is continuous, and of a latch, i.e., when the input device and the output device of a component are the same.
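These two checks can be sketched as follows, assuming device names end in a numeric offset (e.g. M100). The helper names and sample devices are illustrative.

```python
import re

def has_sequence(devices):
    """Sequence: the numeric offsets of two or more devices are consecutive."""
    nums = sorted(int(re.sub(r"\D", "", d)) for d in devices)
    return any(b - a == 1 for a, b in zip(nums, nums[1:]))

def has_latch(inputs, outputs):
    """Latch: some device appears both as an input and as an output."""
    return bool(set(inputs) & set(outputs))
```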

Comments are converted to features by assigning the term frequency value as the feature value for that component. Right now we use the 13 words with the highest term frequency; the full dictionary of Ladder Logic contains a huge number of words, and including all of them would make the feature set highly inefficient.

The pre-processing for sub-level classification includes a few more steps beyond those used to produce the top level classification data set.

Figure 3.1: Window of size 3

The top level data has a set of instructions which together constitute the top level class; however, every instruction in the set carries a sub-level label. We apply a sliding window technique to get the features for the sub-level data set: each record in the sub-level data is made from 3 consecutive instructions, so that the window covers every instruction thrice, with the first two and last two instructions being the exceptions. To avoid this exception case we use padding: two '#'s are added before and after the chunk to make sure every instruction is scanned thrice. This also helps us in post-classification. The sub-level label for '#' is 'NULL'.
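The padding and windowing step above can be sketched as follows (the mnemonic sequence is illustrative):

```python
def windows(instructions, pad="#", size=3):
    """Pad with size-1 pad symbols on each side so every real instruction
    appears in exactly `size` windows, i.e. gets 3 votes for size=3."""
    padded = [pad] * (size - 1) + instructions + [pad] * (size - 1)
    return [padded[i:i + size] for i in range(len(padded) - size + 1)]

w = windows(["LD", "ANI", "OUT"])
# 'LD' appears in the first three windows: three votes.
```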

Figure 3.2: Sub-Level Data Null padded

The pre-processing of sub-level data is done only after we get the top level classification result. The sub-level data has the top level class as one of its features, which is the prediction from top level classification. Apart from this feature, we add the bi-gram features to the feature set. We made a list of possible bi-grams, and only those are added as features, to make sure there is no inconsistency between the training and test data.

Once the feature extraction process is done, we move on to classifying the data set. We discuss the classification algorithm in detail for the top level and the sub-level.

3.2 Top Level Classification

A top level class represents a complete component which performs a particular function. Our goal is to accurately classify the data into the 4 classes ALARM, STEP SEQUENCE, SOLENOID and UNCLASSIFIED. It is very important to get excellent results in top level classification, as this forms the ground truth for sub-level classification.

Figure 3.3: Flow Chart for Top level Classification.

The input data table contains the instruction features, device features, comment features and Sequence-Latch information. The Sequence and Latch features have to be converted to numeric values. The classifier is trained on these features, and each record is classified into the ALARM, STEP SEQUENCE, SOLENOID or UNCLASSIFIED class. The result is stored in the resultant file for sub-level classification and is exported to JSON.

We use the Decision Tree classifier from the open source library sklearn [4] to classify our data. Code:

from sklearn import tree
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
clf.predict_proba([[2., 2.]])

3.3 Sub-level classification

Top level classification is followed by sub-level classification, which is done only for the ALARM and STEP SEQUENCE records of the top level data set. Sub-level classification of the SOLENOID class is not considered part of this project.

Sub labels for ALARM are ‘T’, ‘A’, ‘R’, ‘O’, ‘F’,’AT’. Look at Figure 1.1 for ref- erence. ’T’ is Alarm triggering condition, ’A’ is Alarm latch contact, ’R’ is Abnormal reset condition, ’F’ is Fault output of alarm, ’O’ is the post condition and ’AT’ is the timer output for the Ladder. There can be multiple ’T’s which triggering conditions in the Ladder chunk which could cause the fault. However, there can be only one ’A’ in any Ladder because there will be only one Alarm Latch contact for every ladder.

There can be multiple 'R's as well, since multiple reset conditions are allowed.

We can have multiple 'F's, that is, Fault outputs, but we do not concentrate on examples with multiple 'F' labels in a Ladder segment (chunk), to avoid complicating the labeling process in this initial phase.

Figure 3.4: Step Sequence sub-level example

Sub labels for Step Sequence are 'P', 'S', 'SE', 'C', 'O', and 'L'. 'P' is the Step Sequence pre-condition, 'S' is the Step start, 'SE' is the Step empty condition, 'C' is the Step condition, 'O' is the Step Sequence post-condition, and 'L' is the Step Sequence count-up coil. There is generally just one pre-condition, so 'P' occurs only once per chunk (Ladder segment). However, there can be multiple 'S', 'C', and 'O' labels. 'SE' occurs only once per chunk, as its only task is to check whether the step is empty. 'L' is the output of the STEP ladder chunk. As 'S' is the Step start, it always appears before 'C', the Step condition.

Once the preprocessing of the sub-level data is done, we train the classifier on it. Sub-level classification of the SOLENOID class is not included in the scope of the project.

Figure 3.2 shows a typical ALARM chunk divided into multiple mini chunks (windows). The windows are padded with hash symbols so that every sub label gets 3 votes. The classifier is run to predict labels for each of the columns SUBL1, SUBL2, and SUBL3.
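The 3-wide windowing with '#' padding can be sketched as follows. The helper and the mnemonic sequence are illustrative, not the project's actual code; the key property is that every real mnemonic appears once in each of the three window positions (SUBL1, SUBL2, SUBL3), which is what later enables three votes per sub label:

```python
# Sketch: slide a width-3 window over a chunk's mnemonics, padding both
# ends with '#'. Each real mnemonic then occupies positions 2, 1 and 0 of
# three consecutive windows, giving it three independent predictions.

def windows_of_three(mnemonics, pad="#"):
    padded = [pad, pad] + list(mnemonics) + [pad, pad]
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(windows_of_three(["LD", "AND", "OUT"]))
# [['#', '#', 'LD'], ['#', 'LD', 'AND'], ['LD', 'AND', 'OUT'],
#  ['AND', 'OUT', '#'], ['OUT', '#', '#']]
```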

Figure 3.5: Flow Chart of Sub-level Classification.

We start with an input data set which consists of instructional features, device features, Comments, and Sequence-Latch features. Each sample has 3 mnemonics and 3 sub labels associated with them. We have sub-level data only for the ALARM and STEP SEQUENCE samples from the top level, so we check whether the sub-level chunk ID equals the index of a top level sample, in which case we fill in the top level class for the record. We continue the preprocessing by appending bi-gram features. We then train the classifier for all 3 sub label columns, collect the results from all three, and take the majority vote to assign the final label. If there is no majority, the UNCLASSIFIED 'G' label is assigned.
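The previous-instruction/next-instruction bi-gram features can be sketched as follows; the pairing scheme and the '#' boundary marker are an illustrative assumption about how the ordering information is encoded:

```python
# Sketch: build bi-gram (previous-next instruction) features from a
# chunk's mnemonic sequence, with '#' marking the chunk boundaries.

def bigram_features(mnemonics, pad="#"):
    padded = [pad] + list(mnemonics) + [pad]
    return [f"{padded[i]}_{padded[i + 1]}" for i in range(len(padded) - 1)]

print(bigram_features(["LD", "AND", "OUT"]))
# ['#_LD', 'LD_AND', 'AND_OUT', 'OUT_#']
```

Each bi-gram string can then be treated as a categorical feature alongside the instruction and device features.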

3.4 Performance Improvement

3.4.1 Parameter tuning

The decision tree classifier that we have used is not a naïve classifier; we have tuned its parameters to get better performance.

Criterion: the function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. We have used "entropy" as the criterion, as it works best when exploring different patterns.

Max depth: it is easy to overfit the data when the number of features is large, or to underfit otherwise. By restricting the depth we can make sure the classifier is trained properly. Max depth is set to 5, a value obtained by trial and error.

Presort: presort the data to speed up the finding of the best splits during fitting. With either a smaller data set or a restricted depth, this may speed up training. As we are using a restricted depth, we can use this option to speed up training.
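Putting the tuned parameters together gives a classifier along these lines. Note that the `presort` option existed only in older scikit-learn releases (it was deprecated and later removed in 0.24), so it is shown commented out:

```python
# Sketch of the tuned classifier described above.
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(
    criterion="entropy",  # information gain, chosen over "gini"
    max_depth=5,          # restricted depth, found by trial and error
    # presort=True,       # only valid on scikit-learn < 0.24
)
```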

3.4.2 Majority Voting

The 3-window approach gives us an opportunity to verify the prediction of the classifier by also getting the predicted label from the next sample and the previous sample. We train and classify the sub-level examples three times, predicting one column each time. Every sub label of interest (i.e. every label except '#') occurs once in every column, so we certainly have a prediction for every label in every column. Training the classifier to predict each of the 3 columns therefore gives us 3 candidate labels for every sub label, and a majority vote is taken before finalizing any of them. To finalize the sub label 'T' of window 1, the classifier's predictions from window 1, window 2, and window 3 are taken. Only if 2 out of 3 predict the same label is it assigned as the final label; otherwise, the label 'G' is assigned, which signifies that the record is unclassified and human supervision is needed.
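The 2-out-of-3 vote can be sketched as a small helper; the function name is illustrative:

```python
# Sketch: majority vote over the three window predictions. 'G' marks a
# record with no 2-out-of-3 agreement, to be sent for human review.

def majority_vote(p1, p2, p3, fallback="G"):
    votes = [p1, p2, p3]
    for label in votes:
        if votes.count(label) >= 2:
            return label
    return fallback  # no two predictions agree: exception log

print(majority_vote("T", "T", "A"))  # T
print(majority_vote("T", "A", "R"))  # G
```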

Chapter 4: Data Set and Results

4.1 Data Set

We carried out various experiments during the implementation of the project.

There were 3 revisions of the data set, TS2, TS3, and TS4, with each revision adding new samples, features, and classes; the preprocessing and classification steps were adapted accordingly. Initially, we started with TS2, which had ALARM, STEP SEQUENCE, and SERVO as the top level classes. There was a total of 168 labeled samples used for training the classifier. This revision of the data set did not have sub-level data. The top level data was classified with a simple decision tree, which gave us an accuracy of 0.93. The next revision, TS3, added a new class, SOLENOID, and removed the SERVO class from the data set; the reason for removing the SERVO class was the unavailability of examples for the training set. With the addition of the new class, new examples were added to the data set: TS3 had 506 labeled samples available for training. TS3 had a sub-level data set as well. Once TS3 was out, we added Comments to it, which increased the performance by 0.2. We started the sub-level classification with a decision tree as well. It gave us a poor performance of 80.2% (poor compared to the top level). We then started refining the features for the sub-level: we added Comments, implemented bi-gram ordering, and added the Sequence and Latch features, which increased the performance of sub-level classification to 0.97.

We also tried a 2-window approach for the sub-level classification, in which every record had features of two labels and the sliding window worked as before. We did not get a performance improvement; instead, the accuracy decreased to 0.83, whereas the accuracy for the 3-window approach was 0.90.

We tried to incorporate the sub-level features into the top level features, as the sub-level classification result was better than the top level one. We included the bi-gram ordering and the Sequence and Latch features, but it did not have any effect on the top level classification. Classification of TS4 was really difficult, as most of the records were UNCLASSIFIED; we initially started with the naïve decision tree, which gave us an accuracy of 0.80.

Figure 4.1: Step Sequence examples with Count

As we can see from the above figure, the number of possible combinations of a step sequence chunk is huge, and the count per combination is fairly constant, so we cannot concentrate on getting only certain kinds of sequences right.

Figure 4.2: Alarm examples with Count

As we can see from the above figure, unlike the count curve for Step Sequence, the Alarm count curve is descending. Here we can concentrate on getting the most popular alarm sequences right and consider the exceptional cases at a later point in time.

4.2 Results

Figure 4.3: Decision tree Visualization

The above figure is a visualization of the decision tree classifier after all the performance improvements. We tuned the parameters of the decision tree, such as max depth = 5 and criterion = "entropy", which increased the accuracy to 0.90 for both top level and sub-level classification, which was our target. We observed that the accuracy for sub-level classification increased to 0.95 when the depth of the tree was not restricted.

From the decision tree, we can see that the Comments and the bi-grams play an important role in determining the class of the records.

Top Level Classification

Classifier      Accuracy  F1-macro  F1-macro
Decision Tree   0.91      0.90      0.90
Naive Bayes     0.82      0.81      0.82
LinearSVC       0.88      0.87      0.88

For sub-level classification, decision trees outperform the other classifiers:

Classifier      Accuracy  F1-macro  F1-macro
Decision Tree   0.91      0.90      0.90
Naive Bayes     0.75      0.76      0.75
LinearSVC       0.80      0.82      0.82

Figure 4.4: Confusion Matrix for TS3

Most of the misclassified records are due to errors in the training set, or because the training set is not similar to the test data. For example, a Sequence generally exists only in STEP SEQUENCE, but there are ALARM examples which have a Sequence in them, which confuses the classifier. To address this issue, we have to produce examples of our own and train the classifier on them.

There have also been misclassifications when the device name used is different, as in the SW and ZR alarms.

Figure 4.5: Heat Map for the important features for the classes

As we can infer from the heat map, not all features are equally important for all classes; different classes have different features which decide their classification result. Comments seem to be important for ALARM, while bi-grams seem to be important for the SOLENOID class.

Figure 4.6: Input and Output GUI

The left window shows the ladder logic files which are input to our pre-processor. The pre-processor extracts the ladder in the form of instructions with device names, offsets, and Comments. The pre-processed data tables go through the feature engineering process to form the top level and sub-level data sets for classification. The right window shows the final output of the classification and conversion to JSON. The output right now covers only the ALARM data.
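The JSON export step can be sketched as follows. The field names here are illustrative assumptions, not the project's actual output schema:

```python
# Sketch: serialize classified ALARM records to JSON. Field names
# ("chunk_id", "top_level", "sub_labels") are hypothetical.
import json

records = [
    {"chunk_id": 12, "top_level": "ALARM",
     "sub_labels": ["T", "T", "A", "R", "F", "O"]},
]
out = json.dumps(records, indent=2)
print(out)
```

Writing `out` to a file with `json.dump` would produce the artifact shown in the right window.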

Chapter 5: Conclusion and Future Work

It is difficult to assess the quality of the results we have obtained so far, as there is no other work in this field to compare against. But we have met our goal of 90% accuracy. Our first objective was to get the features from the PLC code, most of which was achieved with the GX Works tool; the rest was generated through pattern recognition and ordering techniques.

Our second objective was to classify the top level data. We carried out experiments with different classifiers, and with different parameters for the Decision Tree classifier, on both the top level and sub-level data sets. A decision tree with entropy as the splitting criterion and a restricted depth (max depth = 5 to max depth = 7) worked best for the given data set. As the number of features is huge, we have to limit the depth of the tree to make sure the classifier does not overfit; however, limiting the number of leaf nodes did not help. None of the other classifiers worked well, as the data is discrete. We have to work towards making the training set stronger by collecting as many examples as possible, and we have to make sure the training set is correct, as it is the ground truth for classification.

Figure 5.1: Accuracy vs Max depth of the Decision tree classifier

From the above figure we can confirm that the best max depth depends on the number of classes in the data set. In the top level classification, accuracy increases initially until max depth = 5, after which it stays constant for a while, until max depth = 7; past that it decreases to 0.80 and stays constant thereafter. However, it worked differently for sub-level classification: the accuracy kept increasing until max depth = 12 and remained constant at 0.90 thereafter.
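The depth sweep behind a figure like the one above can be sketched as follows. This is a minimal sketch assuming scikit-learn; the iris data set is used only as a stand-in, since the thesis data is not available here:

```python
# Sketch: train a tree at each max_depth and record held-out accuracy,
# reproducing the shape of an accuracy-vs-depth curve.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def accuracy_by_depth(X, y, depths=range(1, 15)):
    """Return {depth: held-out accuracy} for a tuned entropy tree."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    scores = {}
    for d in depths:
        clf = DecisionTreeClassifier(criterion="entropy", max_depth=d)
        scores[d] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    return scores

# Illustrative run on a stand-in data set, not the thesis data
X, y = load_iris(return_X_y=True)
scores = accuracy_by_depth(X, y, depths=range(1, 8))
```

Plotting `scores` against depth gives the curve from which the plateau (and any subsequent drop) can be read off.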

Tuning the parameters of the decision tree and assigning weights to the features can help to improve the performance. There is a fine line between getting good performance and overfitting the data; an accuracy of 0.90 seems to work best with the unlabelled data. We observed that applying the bi-gram features to top level classification did not help performance the way it did in sub-level classification; nevertheless, the top features in the classification are the bi-gram instructions. From the decision tree, we can see that Comments play an important role in the classification. Thus, a few more Comments can be added to enhance the classification; the relevant Comments should guide towards the right class.

Our third objective was sub-level classification, which was achieved with a decent result of 90% accuracy. Sub-level classification greatly depends on the top level classification; we made sure the predicted labels belong to their respective top level class labels. Bi-gram ordering dramatically increased the performance of sub-level classification. Majority voting makes the classifier less prone to classifying sub labels incorrectly. There are pros and cons to majority voting, as we might end up classifying quite a few records as 'G' even though one of the classifiers had predicted them right. Classifying a record as 'G' sends it to the exception log, where it is given human attention later. This is a trade-off, but we choose to classify labels as 'G' rather than assign a wrong label.

One limitation of this work is the handling of logical instructions: the end goal of the project is to translate Ladder Logic to JSON, and we have not yet found a way to include the logical expressions in the JSON.

5.1 Future Work

We have developed a classification method for ladder logic that isolates and extracts the hierarchy of classes existing in the logic itself. This achieves the goal and the recognized need from the original background problem statement. We can now see part of what content exists in our ladder logic, and we can press further.

While we are able to compute on the 'what' of the logic (the top level and sub-level classifications), we are not able to compute on the 'how' of the logic. The 'how' of the logic involves the Boolean logic relationships between the provided and derived labels. This seems like extremely useful information to generate.

Expression Number  Instruction  Sub Label

N1   "LD"   P
N2   "LD"   S
N3   "AND"  C
N4   "LD"   S
N5   "ANI"  C
N6   "ANI"  C
N7   "ORB"
N8   "OR"   S
N9   "LD"   S
N10  "AND"  C
N11  "ORB"
N12  "LD"   S
N13  "AND"  C
N14  "ORB"
N15  "LD"   S
N16  "AND"  C
N17  "ORB"
N18  "OR"   SE
N19  "OR"   SE
N20  "ANB"
N21  "ANI"  O
N22  "OUT"  L

N22 = N1*((N2*N3)+(N4*N5*N6)+(N8)+(N9*N10)+(N12*N13)+(N15*N16))*N21

The above expression gives the logical relationship between the sub labels.

We can see that the expression can be derived using a stack. Every encounter of an 'LD' instruction pushes a new block; 'ORB' pops the top two blocks and combines them with an 'OR' operation, while 'ANB' combines them with an 'AND' operation. The first element that starts the expression is special: it is combined by 'AND' with the whole of the next part.
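Under the conventional instruction-list stack semantics (LD pushes a new block; AND/ANI extend the top block; OR adds a parallel contact to it; ORB/ANB combine the top two blocks), the derivation can be sketched as follows. The helper is illustrative, and the negation implied by ANI is omitted here, matching the simplified expression above:

```python
# Sketch: derive the Boolean expression from mnemonics with a stack,
# using the conventional instruction-list semantics. ANI/ORI negation
# is ignored to match the simplified expression in the text.

def to_expression(program):
    stack = []
    for op, arg in program:
        if op == "LD":
            stack.append(arg)                               # new block
        elif op in ("AND", "ANI"):
            stack.append(f"({stack.pop()}*{arg})")          # extend block
        elif op in ("OR", "ORI"):
            stack.append(f"({stack.pop()}+{arg})")          # parallel contact
        elif op == "ORB":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a}+{b})")                      # OR two blocks
        elif op == "ANB":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a}*{b})")                      # AND two blocks
        elif op == "OUT":
            return f"{arg} = {stack.pop()}"                 # assign coil

# Shortened fragment of the listing above, for illustration
prog = [("LD", "N1"), ("LD", "N2"), ("AND", "N3"), ("LD", "N4"),
        ("ANI", "N5"), ("ANI", "N6"), ("ORB", None), ("OR", "N8"),
        ("ANB", None), ("ANI", "N21"), ("OUT", "N22")]
print(to_expression(prog))
# N22 = ((N1*(((N2*N3)+((N4*N5)*N6))+N8))*N21)
```

Note that under this reading the 'SE'-labeled OR terms (N18, N19) would also appear in the expression; the text's simplified form omits them.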

This project focused only on Ladder Logic on Mitsubishi platforms, but the process can be extended to other platforms as well. If we want to revolutionize the software in the automotive industry, we have to target most of the automotive companies and get our product tested. Work should be done towards establishing a standard throughout the automotive industry, so that one classifier could deal with all kinds of Ladders.

We have classified only the ALARM, STEP SEQUENCE, and SOLENOID examples from the Ladder Logic. There are various other classes which come under the UNCLASSIFIED umbrella; a next step can include those classes in the classification data set. With the inclusion of platforms other than Mitsubishi, the introduction of new classes is in any case inevitable.

Performance improvement methods like boosting, bagging, or ensembles of classifiers can be used at a later point to reach the goal. As the data and its features grow, it will be difficult to classify all of the data with the rules created by a single decision tree. We can use the probability output of the decision tree to gauge confidence, and other classifiers can contribute to the final result.

Bibliography

[1] Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code.

[2] Lin Tan, Ding Yuan, and Yuanyuan Zhou. HotComments: How to Make Program Comments More Useful?

[3] Akash Lal and Shaz Qadeer. A Program Transformation for Faster Goal-Directed Search.

[4] Lin Tan, Ding Yuan, Gopal Krishna, and Yuanyuan Zhou. /* iComment: Bugs or Bad Comments? */. University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.

[5] https://en.wikipedia.org/wiki/Decision_tree_learning

[6] https://en.wikipedia.org/wiki/N-gram

[7] Kevin Ellis and Sumit Gulwani. Learning to Learn Programs from Examples: Going Beyond Program Structure. In IJCAI 2017, May 1, 2017.

[8] Reudismam Rolim, Gustavo Soares, Loris D'Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, and Björn Hartmann. Learning Syntactic Program Transformations from Examples. In ICSE 2017, April 28, 2017.

[9] Andre Luckow, Matthew Cook, Nathan Ashcraft, Edwin Weill, Emil Djerekarov, and Bennie Vorster. Deep Learning in the Automotive Industry: Applications and Tools.

[10] Sheng-Jen ('Tony') Hsieh and Patricia Yee Hsieh. Animations and Intelligent Tutoring Systems for Programmable Logic Controller Education.
