Plant Model Generator from Digital Twin for Purpose of Formal Verification
Johannes Håkansson
Computer Science and Engineering, master's level 2021
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering

Abstract
English
This master's thesis covers a way to automatically generate a formal model for plant verification from plant traces. The solution is developed from trace data stemming from a model of a digital twin of a physical plant. The final goal is to automatically generate a formal model of the plant that can be used in model checking to verify the safety and functional properties of the actual plant. The solution for this specific setup is generalized, and a general approach for other systems is discussed. Furthermore, state machine generation is introduced. This includes generating state machine data from traces, which in the future is planned to be used as an intermediate step between the trace data and model generation. The digital twin solution used in this project is a joint setup in Visual Components and nxtSTUDIO. The symbolic model checker NuSMV is utilized to verify the functional properties of the plant.
Svenska (English translation)
This thesis explores a way to generate formal models of a process from recordings of its behavior. The solution is developed from data on the process's behavior, captured by a digital twin. The final goal is to use the digital twin to automatically generate a model that can be used to verify the safety and functions of the real process. The solution is then generalized so that it can be applied to other processes in the future. A way to generate state machines is presented. It generates data for the state machines from the digital twin's behavior and is in the future planned to be used as an intermediate step towards generating the final models. The digital twin used in this project was implemented by Aalto University, across several programs. The visual part, which also records the twin's behavior, is implemented in Visual Components. A controller for the digital twin is made in nxtSTUDIO. The tool for verifying the model's safety and functions is NuSMV.
Contents
Chapter 1 – Thesis Introduction
  1.1 Background
  1.2 Problem Definition
  1.3 Methodology
  1.4 Related Work
Chapter 2 – Theory
  2.1 Industrial Automation
    2.1.1 Introduction
    2.1.2 Model Driven Engineering
  2.2 Programmable Logic Controller
  2.3 Simulation Techniques
  2.4 Digital Twin
    2.4.1 Drawbacks and Advantages of Digital Twins
    2.4.2 Traces from Digital Models
  2.5 IEC Standards
    2.5.1 IEC 61131-3 and 61499
    2.5.2 History of IEC 61499
    2.5.3 Software implementation
  2.6 Design and Verification for Industrial Control Systems
    2.6.1 Verification of Logic
    2.6.2 Model Checking With the Help of a State Machine
    2.6.3 Partial-Order Reduction
    2.6.4 Kripke Structure
    2.6.5 Abstraction
Chapter 3 – Setup and Material
  3.1 Complete Plant Setup
    3.1.1 Physical Plant
    3.1.2 Digital Twin
  3.2 Other Components
    3.2.1 Python
    3.2.2 NuSMV
Chapter 4 – Results
  4.1 Implementation
    4.1.1 Digital Twin Component Isolation
    4.1.2 Visual Components Requirement Example
    4.1.3 Models of a System
    4.1.4 Trace Manipulation Method
    4.1.5 Generalized Model from a Generated Trace
    4.1.6 Implementation
    4.1.7 NuSMV Simulation
    4.1.8 Behavior
    4.1.9 Connections Between State Machine and NuSMV Model
    4.1.10 Solution
    4.1.11 Pseudo Code
  4.2 General Solution
    4.2.1 Structure of the Generalized Solution
    4.2.2 New Ideas to Complement Existing Ones
Chapter 5 – Analysis
  5.1 Test Cases
    5.1.1 CTL Test Cases
    5.1.2 Model Generated from a Different Trace
Chapter 6 – Discussion
  6.1 Conclusion and Future Work
References
Chapter 7 – Appendix
Chapter 1 Thesis Introduction
1.1 Background
Industrial automation has been around for many decades. The field is vastly explored, and today's systems are intricate and complex. With the increased complexity of systems, it is only natural that the frequency and magnitude of issues increase as well. These issues can cause problems, so it is important to identify them as early as possible. Model checking offers a solution to this by evaluating a model of a plant against the functional properties of the plant. An interesting question is whether we can generate these plant models automatically. There are a lot of existing plants in industrial settings, and replacing them would be far too costly, so replacing entire plants is not a suitable approach. Instead, existing plants need to be updated, and it is required that they can be updated in a reasonable manner. Since every system has been created in a different way, one challenge is to devise a general solution that is relevant to the majority of them: models for any given plant should be generated and result in a suitable solution. This thesis work builds upon previous work from [1]. The key differences are related to state machine generation. Their work concerns state machine generation for a controller using traces from a real controller, whereas this master's thesis looks at state machine and formal model generation of a plant model using traces from a digital twin model of a plant.
1.2 Problem Definition
Mass customization is more and more sought after in industrial automation. Effective resource management as well as quick reconfiguration are byproducts of this advancement. Control systems that are reconfigured need to be verified and tested in order to make sure they function properly. The main target of a possible solution is existing
Figure 1.1: Workflow during the project.

systems. Can we use the functionality of an existing plant to extract a model based on the current behavior of the plant? An approach to automatically generate plant models of control systems will be explored. These models will be used to evaluate performance. By verifying the functional properties of the plant, the correctness and accuracy of the generated model is determined. There needs to be a way for desired plant behavior to be entered into the model, to verify that it has been generated correctly. There also needs to be a plan for answering whether these models can be extracted from earlier behavior and, in theory, save time and effort for a reconfigured system. A digital twin will be used as a base for creating the models used in the verification process. The digital twin field will be thoroughly explored, and the knowledge will be used within the confines of this thesis. By creating a solution with the help of a digital twin, can a good general solution be derived from it? In order to make a general solution that works for most systems, knowledge about the inner workings of a digital twin is paramount. It is crucial for drawing conclusions about common traits of any system. Is a general understanding of the digital twin field, coupled with the specifics of an actual digital twin, enough to derive a general solution?
1.3 Methodology
This master's thesis project workflow was divided into different stages, visualized in figure 1.1. The introduction stage comprised the literature review; report structuring and initial writing; and the architecture, method, framework, and tool decisions. In the literature review stage, relevant research was acquired and studied thoroughly. Research articles were mostly sought out by me, the student, and further supplemented by my supervisor Sandeep Patil. With this research taken to heart, report writing commenced with relevant data and a solid core was formed. In the architecture, method, framework, and tool decision stage, the architecture, methods, frameworks, and tools were chosen that would allow me to reach my goals. The implementation and analysis stages are closely related and were alternated between in an iterative process. The implementation stage starts off with an idea. The idea is made
a reality before being put through the analysis stage. Here, issues with the approach are surfaced, and the implementation stage is revisited with this new information in mind. Improvements are made every iteration until a satisfactory result is reached. In the conclusion stage, the solution produced by the implementation and analysis stages is discussed in regard to the questions presented in section 1.2. How well did the solution live up to the goals set beforehand? How well does the solution answer the questions presented in 1.2?
1.4 Related Work
The authors of [1] proved that they could identify a state machine controller from noisy PLC (Programmable Logic Controller, a computer specifically created for industrial applications) traces. They found that their controller had the same behavior, judging from the traces, as the actual PLC controller. What they could not guarantee was whether the generated controller was identical in every sense to the original. Their goal is to find a method of upgrading current PLCs to include present technology. In their paper, they raise concerns about whether the generated state machine is truly equivalent to the original controller. One solution that they proposed is the use of model checking/formal verification, and it is this approach that is considered in this project. Their full experiment and conclusions can be read in their paper. In relation to my work, the focus lies mainly on section IV (Monolithic state machine controller synthesis from noisy behavior traces) of their paper. It contains important theory behind translating noisy execution traces into something we can work with. The following SAT-solver (evaluator of Boolean expressions) method is used in the search for an automaton from noisy traces:
T = Noisy traces.
N = Maximum number of states.
K = Maximum number of states reachable from each state.
R = Maximum number of transitions.
ξ = Maximum number of errors in T.
S = Valid solutions.

The algorithm contains several steps. First, we find a model with the fewest states (N). Starting from this model, we find a model with the fewest states reachable from each state (K). From this model, we find the fewest transitions (R). Then, starting from no errors (ξ = 0), we count upwards to the number of alternative edges present between nodes. Every model that satisfies all constraints (N, K, R, ξ) is added to the set of valid solutions (S). In addition, if we formulate operators and connectives, we can write statements in a formula-like manner with their help:

∀ = For all.
∃ = There exists.
∈ = Belongs to.
¬ = Not.
∧ = And.
∨ = Or.
⇒ = Implies.
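The staged minimization above can be read as a lexicographic search over (N, K, R, ξ). A minimal sketch follows; the function `fits` is a hypothetical stand-in for the SAT query of [1] (whether an automaton within the given bounds is consistent with the traces), not their actual encoding:

```python
def staged_search(fits, n_max, k_max, r_max, xi_max):
    """Find the lexicographically smallest (N, K, R, xi) accepted by `fits`.

    `fits(n, k, r, xi)` stands in for the SAT query: does an automaton with
    at most n states, k reachable successors per state, r transitions, and
    xi errors in the traces exist?
    """
    for n in range(1, n_max + 1):                # fewest states first
        for k in range(1, k_max + 1):            # then fewest successors
            for r in range(1, r_max + 1):        # then fewest transitions
                for xi in range(0, xi_max + 1):  # then fewest errors
                    if fits(n, k, r, xi):
                        return n, k, r, xi
    return None  # no automaton within the given bounds
```

With a toy oracle such as `fits = lambda n, k, r, xi: n >= 3 and k >= 2 and r >= 4 and xi >= 1`, the search returns `(3, 2, 4, 1)`: the smallest N first, then the smallest K for that N, and so on.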
The focus of their work was mainly directed at synthesizing a controller for an already existing PLC, by analyzing and applying algorithms to an already working system. Another main aim was improving existing systems in order to upgrade them to modern technology. When creating a system from scratch, there is no data to analyze to create a controller from. If testing is to be done before real-world application commences, it is important to create a digital representation as close as possible to what you want to achieve. With their work in mind, a short summary of what my work aims at is exploring a way to verify and model check models generated from traces. This includes changing the way traces are utilized, creating a solution that converts the trace into an executable model, and implementing specific checks for formal verification. From these steps, a generalized solution is derived that would work if the solution were to be applied to other systems.
Chapter 2 Theory
2.1 Industrial Automation
2.1.1 Introduction

Nikolakopoulos in [2] says that automation broadly contains the following components:
- Sensors.
- Control logic.
- Actuators.
- Environmental inputs and outputs.
The way to use these components to create automation has long been well understood and applied. The future of automation lies in using resources in a better way to ensure the sustainability of the process, as well as using time effectively. In order not to fall behind, continuous production is a property that we want to explore and expand upon. In flexible automation, the number of possible combinations of setups and settings is almost infinite. The issue with infinite possibilities is that it is almost impossible to predict or assume what will happen under different settings. Knowing what can happen in a system is a requirement for being able to take actions that benefit the system. Extensive testing and simulations are performed before any real implementation is done, to ensure that the system is as ready as possible. Unknown issues from previous settings might reveal themselves and hinder production. The hindrance of production will negatively affect production rates and, by extension, company profits. Even though rigorous testing is performed, there is still the possibility that the plant behaves in an unpredictable manner. For example, unpredictable movement of a manipulator may damage nearby equipment, or even injure people in the vicinity. Any of these occurrences is unacceptable.
2.1.2 Model Driven Engineering

Model Driven Engineering (MDE) is an approach to implementing control software. Control software is written when there is a physical process to control. Current applications of the MDE approach take place in an open-loop environment [4], with focus mainly on the controller. The plant in question is disregarded, and valuable properties are lost. The difference between considering the controller or the plant is that the properties enhanced by considering the controller are safety-related, whereas the properties enhanced by focusing on the plant are functionality-related [4]. Proper validation requires extensive treatment of both plant and controller. Exactly like a chain, whose strength is dictated by its weakest link, both the controller and plant models are required to hold to a high standard. Manual implementation of these models for the purpose of simulation takes a lot of time, which is a reason it is overlooked when configuring a real-world industrial plant. The main obstacle to making this time-efficient lies in the amount of manual work currently required. If one could discover a way to automatically generate models of plants and controllers, this approach would be much more time-efficient and therefore more attractive to use when configuration of a plant is needed. With the help of a digital twin, models are easier to obtain, and with them, implementation is easier. It is this fact that enables us to explore the implementation of digital twins. A digital twin mimics a real-world entity digitally and provides a way to test functionality and limits with little to no real-world repercussion. This is an important factor in why a digital twin is desirable to use. We want to ensure that our implementation is working and that we are well prepared before applying the solution in a real system or plant.
2.2 Programmable Logic Controller
Programmable Logic Controllers, from now on referred to as PLCs, are computers designed for a specific purpose within industrial applications. Regardless of the size of the system considered, a PLC is suitable to use for implementation. Hardwiring and hard-coding an implementation might work for the specific task it is made for, but if there is a need to change even slightly, a massive amount of work might be required. A system that is designed with the intention to last should always be approached modularly, so that future changes are easier to make. The higher initial cost of a PLC, compared to a direct implementation, pays for itself in the long run. In a system made to last, the industry-accepted PLC approach is the most suitable. A PLC itself is modular in its design [5], enabling the user to tailor its functionality to their needs. The fact that a PLC can be adjusted to only use relevant resources greatly increases efficiency. The common parts of all PLCs are [6] a processor, a power unit, and input/output. In addition to these, the user may add any component they desire depending on the application needs. There are several companies that manufacture and supply PLCs, some of which are ABB [7], Siemens [8], Mitsubishi [9] and Schneider Electric [10]. There are plenty of different
kinds of PLCs, and depending on the requirements of the system you want to create, you choose the hardware accordingly. Although PLC technology is not exactly new, there are ways to upgrade current systems to fit present technological advancements. In [11], the authors state that there is no need to redo entire systems to bring them up to modern technology. It is desirable for companies to be able to adapt to new systems without changing too much of the old system. A cost- and resource-efficient way to do this is porting existing PLC applications into IEC 61499 compliant function blocks. Function blocks are contained sections of a system, where each container usually performs a simple task. These blocks are then connected to complete the system. The task of porting to this kind of structure requires extensive data collection and observation. The authors of paper [1] say that the goal of data collection is to record input and output values. To apply this idea to an existing system, a separate PLC is connected to the system. This new PLC collects data from the system. The relevant information is the data values and the state in which they occurred. IEC 61499 systems are event-driven, so it is possible to record data the instant an event occurs. Data collection between events is not interesting, since the system is event-driven and no changes in data will occur in between.
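The event-driven recording described above can be sketched as follows. This is an illustrative outline only; the event and signal names are invented for the example, not taken from the actual setup:

```python
class EventTraceRecorder:
    """Record input/output values only at the instant an event fires.

    Since an IEC 61499 system is event-driven, nothing changes between
    events, so sampling at event occurrences captures the full behavior.
    """

    def __init__(self):
        self.trace = []

    def on_event(self, event, inputs, outputs):
        # Snapshot the I/O values together with the triggering event.
        self.trace.append({
            "event": event,
            "inputs": dict(inputs),
            "outputs": dict(outputs),
        })
```

A monitoring PLC would call `on_event` each time an event occurs, e.g. `rec.on_event("EXTEND_REQ", {"at_end": False}, {"valve": True})`, building up a trace that can later be replayed or mined for a state machine.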
2.3 Simulation Techniques
In plant development, software implementation is as important as the physical implementation. Without instructions, the plant would not work. In addition to developing a software solution for the plant, there is also a need to ensure that the software actually works as intended. Computer simulations are good in their own right, but they lack the effectiveness to explore all possible scenarios [12]. This area has a lot of potential to be expanded upon, and there are several approaches to consider, such as [13]:
- Discrete event simulation systems.
- Block oriented simulation.
- Declarative Modelica modeling language.
- FEM-based simulation.
- Game engines.
- Generic mechatronic systems.
When creating an industrial system with a PLC approach, software is frequently tested directly on the factory floor [14]. This takes important time and resources away from production, time that could be spent more efficiently. If simulation, testing, and modification could be done separately from the actual production, resources could be used more efficiently. Testing in a simulated environment saves time and money, something that is important for any company. As always when humans are involved, safety is of utmost importance. Since no injuries are possible on a virtual plant, this is also a big reason to simulate before applying anything to a real-life system. No system can be
100% safe and predictable, but the use of simulations and testing helps ensure that we have covered potential issues that might otherwise have been overlooked.
2.4 Digital Twin
As mentioned previously, we want to explore an approach to simulate and test the limits of a plant before applying changes in the real world. For this, we can use a digital twin. Digital twins are representations of a real-world object, e.g. a plant. The digital twin is a model of its structure and functionality, including processing of data, communication assets, and behaviour [15]. What a digital twin offers that simulations do not is that it can be used through all development phases. Because digital twins are digital [16], we can test, simulate, and otherwise try different approaches without regard to the impact they might have caused if tried in the real world. This is due to digital entities' ability to collect and process data in a specified manner [18]. In some cases, a digital twin can take parameters of the real world through the plant sensors themselves [17]. However, there may also be standalone features that simulate the sensor readings. Reality is more complicated than any system you can use to simulate the real world. So the question is, can we digitally represent reality closely enough? A digital twin can work closely together with the physical object, even during normal operation. The data gathered is the most accurate that can be acquired. This representation, called a digital twin, is very versatile: from it, you gather data and analyze it. An additional feature of a digital twin is that it supports analysis very well. If you encounter an unexpected result in the real world, the digital twin can be used for troubleshooting, since it should behave like its real-world counterpart. You can find out what went wrong, as well as use the result to decide how to rectify the issue for future iterations.
2.4.1 Drawbacks and Advantages of Digital Twins

Consider the approach where the digital twin is a copy of the physical plant [19]. In addition, the digital twin also holds gathered data, created models, and expected functionality. The combination of these aspects enables the model to be as close to real life as possible, and it can be used as if it were a real entity. From [20], we can say that some advantageous aspects that a digital twin helps in achieving are:
- Analysis. The digital representation of a real-world physical system can be run in many different ways in a shorter period of time and with less resource usage compared to a real-world system. With the extensive data gathering that can be performed digitally, you can analyse different settings and environments and their influence on the system.
- Control. The digital twin may cover an entire system. This means that you can control every part of the system.
- Customization. With the large amount of data gathered, you can use the same base when approaching different ways of implementing. For example, different customers might have different approaches to how they want to solve a given problem.
- Decision making. Simulations without a digital twin are not as precise, so with the enormous amount of data supplied by the digital twin, it is easy to make a decision that is backed by data.
- Efficiency. Remote access to digital entities enables us humans to work from anywhere in the world. New settings are tested digitally before being applied to the real plant.
- Maintenance. Maintenance is easier because you know how much different parts are strained and can predict wear.
- Monitoring. The state of a real world system may be gathered and presented in an intuitive way to a user. This will make it easily accessed, and therefore monitored through the digital twin.
- Safety. Digital testing means that no issues can arise in the physical environment.
The authors of [21] mention some drawbacks with the implementation of cyber-physical systems (CPS). Digital twins are in essence CPS. We can apply their thoughts regarding CPS in general to digital twins through the following issues:
- Interoperability. The concern that not every part of the system will work together. If different parts of the system do not work together, you will not be able to get a full picture, and the accuracy of the digital version of the real-world system will decrease. If you cannot trust your model, its use cases are greatly reduced.
- Security. Hostile intent toward the system is a threat. You need to protect your data and prevent unauthorized access to your system.
- Dependability. You need to assure that the system operates in the intended way. There should not be failures or delays in delivery from any part of the system.
- Sustainability. Using resources in smart ways, adapted to the evolution of the system, so as not to waste potential.
- Predictability. It must be assured that the system will behave predictably. The system should not give contradictory information in order to stay predictable.
- Reliability. Make sure that the system works the way it was intended to work.
2.4.2 Traces from Digital Models

A digital twin can generate traces, and these traces are very valuable to us. In this section it is explained what traces are and how they can be used. Traces are records of the behavior of a system: we gather data on what happens during execution. The data is sufficient to replay the sequence of events, enabling an easier approach to analysis. In turn, analysis makes it easier to make improvements. This is properly explained in [22]. In short, a trace is created when the model-based execution gathers data from the program determined by the model. A higher level of abstraction is desirable, such as state charts and sequence diagrams. In an example carried out by [22], they have different traces for different use cases. Their example is based on the game PacMan. One such trace is specific to a scenario, for example PacMan eats a power-up, and contains entries such as event occurrence, binding, cut change, and finalization. There are also traces specific to states. These traces contain state charts and information about those state charts for a specific execution: events, guard conditions, and entering/exiting states. There are different uses depending on what kind of trace you want to generate; scenario-based and state-based traces are two examples.
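One simple use of a scenario-based trace is to ask whether a given sequence of events occurs, in order, somewhere in the recorded behavior. A sketch, with event names invented for illustration in the spirit of the PacMan example:

```python
def scenario_occurs(trace_events, scenario):
    """True if the scenario's events appear in the trace in the given
    order (other events may occur in between)."""
    remaining = iter(trace_events)
    # `ev in remaining` advances the iterator past the first match,
    # so order is enforced across the whole scenario.
    return all(ev in remaining for ev in scenario)
```

For example, `scenario_occurs(["spawn", "eat_powerup", "eat_ghost"], ["eat_powerup", "eat_ghost"])` is `True`, while the reversed scenario is not, since the events never occur in that order.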
2.5 IEC Standards
Standards are an important aspect of many sectors. Reinventing the wheel takes time, and standards enable us to take in what others in the same field have learned before us. This is very true for engineering. Standards enable previous knowledge and research to be easily condensed and accessed in a common place. Being able to follow a standard can greatly reduce the time and effort required for a specific task, and the finished product will be recognized by a greater audience. IEC, the International Electrotechnical Commission, establishes standards regarding characteristics and technical interpretation [23]. The standards are employed in areas of electrical and electronic technology and areas related to those.
2.5.1 IEC 61131-3 and 61499

The authors of [24] compare the IEC 61131-3 standard and the IEC 61499 standard. IEC 61131-3 covers programmable controllers and the languages used for cyclic PLC systems. A cyclic system is a system that repeats some of its actions periodically and eventually returns to its initial state. IEC 61499 covers function blocks for distributed, event-driven automation systems. Their full conclusion can be read in their paper. For this project, sequential, or event-driven, automation systems are what we are looking for. Therefore, IEC 61499 is relevant and is used throughout the project. Function blocks are always executed in a specific order [25]. First, an event must be triggered. Then, the function block inputs are read. Next comes the internal execution of the function block, which includes updating variables and running algorithms. Lastly, the output variables are updated, and the sequence is finished.
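The four-step execution order can be sketched as a small class. This is a conceptual sketch only, not the IEC 61499 runtime model in full; the algorithm passed in is a placeholder for the block's internal logic:

```python
class BasicFunctionBlock:
    """Sketch of the IEC 61499 function block execution order:
    1. an event is triggered, 2. inputs are read, 3. the internal
    algorithm runs, 4. output variables are updated."""

    def __init__(self, algorithm):
        self.algorithm = algorithm  # stand-in for the block's internal logic
        self.outputs = {}

    def trigger(self, event, inputs):             # step 1: event triggered
        sampled = dict(inputs)                    # step 2: read inputs
        result = self.algorithm(event, sampled)   # step 3: internal execution
        self.outputs.update(result)               # step 4: update outputs
        return self.outputs
```

For instance, a block whose algorithm adds its two inputs, `BasicFunctionBlock(lambda ev, ins: {"sum": ins["a"] + ins["b"]})`, returns `{"sum": 3}` when triggered with inputs `{"a": 1, "b": 2}`.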
2.5.2 History of IEC 61499

The groundwork for the IEC 61499 architecture started in the 1990s. Back then, the goal was to specify a standard model for control encapsulation and distribution [26]. Development was finished and the standard was released in 2005. The standard is split into four parts: architecture, software tool requirements, tutorial information, and rules for compliance profiles. The result is an intelligent automation system that can easily be reconfigured. The IEC 61499 standard is widely adopted when creating flexible distributed automation systems.
2.5.3 Software implementation

The software program nxtSTUDIO was selected for this project. It adheres well to the rules of the IEC 61499 standard and is praised by the author of [27], who states that nxtSTUDIO "is an industrial grade engineering environment which supports the design of control applications and visualization together in one tool.", and who also praises its online monitoring, which enables remote debugging of complete distributed systems. In nxtSTUDIO, we can realize the vision of a digital twin, as well as a cyclic approach to a sequential, or event-driven, model. From [28], the nxtSTUDIO software contains:
- Control logic. It can be applied to the desired equipment.
- Visualization tool. Used to create your own HMI (Human-Machine Interface) and SCADA (Supervisory Control And Data Acquisition).
- Input and output connections. Used for communication with the outside world.
- Simulation. With a soft PLC in the software, testing and simulation can be done within the confines of the program; no real-world deployment is necessary.
- Automatic communication paths. Controller-to-controller and controller-to-visualisation client communication paths are built automatically.
For a system that contains multiple mechatronic components, such as conveyor belts, manipulators, sensors, etc., we have a library of function blocks [29] designed for basic automation services. Within nxtSTUDIO, we represent each device in the design space as a function block. Moreover, function blocks for product description, scheduling, and ordering are added. The desired operation sequence is visualized as a state machine. Then we connect all these services together into a complete application. Once the application is completed in software, we decide which physical component should contain which function blocks. Function blocks come in many different variations. Firstly, we have the basic function block. This block is defined with a simple state machine. Secondly, there is the composite function block. It contains a network of basic function blocks. Thirdly, there are service interface function blocks. These are often supplied by a vendor and imported from a library, without the user really knowing what is inside. Lastly, we have sub-applications. They are networks of function blocks whose content may be distributed across several devices.
11 2.6 Design and Verification for Industrial Control Systems
When designing an industrial control system, there are certain factors that should be taken into account. From a mechatronic standpoint [30], the design of real-time modules in a system should be event- and signal-driven. Software and hardware must work closely together in a mechatronic system to achieve a coherent system. It is desirable that software is modular and can be reapplied in the future. As stated by [32], the broader design process for a mechatronic system is:
1. Requirements of the system.
2. Concept design.
3. Rough model.
4. Determining physical sensors and actuators.
5. Detailed model.
6. Design of control system.
7. Optimization.
If there already is a plant and a model should be created based on the real-world environment, there should be some alterations to the design workflow. The requirements are still essential, though this time around they might be considered limitations rather than requirements. The question is no longer what we want our system to do, but rather what our system can do and what its limitations are. Instead of a concept design and a rough model, it is just a matter of creating a detailed model that reflects reality as closely as possible. There is no longer a need to determine which sensors and actuators you need, but rather how to use what you are supplied with. A plant model is represented with a finite state machine (FSM) [33]. FSMs consist of states and transitions. The machine moves between states through transitions when certain conditions are met. Let us use a simple piston example to illustrate this; see figure 2.1 for reference. This is the Kripke structure of the transition system. The piston has three different states in reality, and will thus have the same number in the FSM: Retracted, Extended, and Moving. To move between these states, we need to define transitions between them. Transitions are conditions that must be true for the model to change state. In our situation, we can apply two commands, Extend and Retract, to move the piston. Let us review what can happen from each state in this representation of the piston. In the Retracted state, where the piston is retracted, if we apply the command Retract, we follow the arrow and return to the same state, so nothing changes; we are still in the Retracted state. The other option we have is Extend. If we apply that command, the machine will change state to the Moving state. Eventually, by continuing to apply the Extend command, we will reach the Extended state. Continued use of the Extend command in this state will not trigger a state change.
We will still be in the Extended state until we issue the Retract command.

Figure 2.1: Kripke Structure of a Finite State Machine.

The retraction of the piston, from Extended back to Retracted, is implemented in exactly the same way as the extension from Retracted to Extended. Note that real systems are often quite a bit larger, with extra components. For example, we can have sensor readings as transition conditions to achieve a different result.

Once constructed, an FSM can be used in a few ways. One option is to create traces. A trace is a generated path from the start state that the FSM has the possibility to execute. In this simple example, it is easy to see all possible outcomes by yourself; in a complicated system it is not. Therefore, traces can be reviewed to find different routes between different states. For example, to answer the question "If we start from the retracted state, can we move to the extended state and back to the retracted state again?", we review the traces and look for any that comply with our restrictions. Traces have additional uses. For example, instead of verifying that certain states can be reached from other states in a specified configuration, we can verify that some states can not be reached. We call these negative traces.
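The piston FSM and its traces can be sketched in a few lines of Python. This is an illustration only: the state and command names are assumptions taken from figure 2.1, not identifiers from the actual plant model.

```python
# Piston FSM from figure 2.1 as a transition table.
# (state, command) -> next state; names are illustrative assumptions.
TRANSITIONS = {
    ("Retracted", "Retract"): "Retracted",  # self-loop: nothing changes
    ("Retracted", "Extend"): "Moving",
    ("Moving", "Extend"): "Extended",
    ("Moving", "Retract"): "Retracted",
    ("Extended", "Extend"): "Extended",     # self-loop: already extended
    ("Extended", "Retract"): "Moving",
}

def run(start, commands):
    """Apply a command sequence and return the visited states (a trace)."""
    trace = [start]
    for cmd in commands:
        trace.append(TRANSITIONS[(trace[-1], cmd)])
    return trace

# Reviewing a trace answers questions such as "can we go out and back?":
run("Retracted", ["Extend", "Extend", "Retract", "Retract"])
# -> ['Retracted', 'Moving', 'Extended', 'Moving', 'Retracted']
```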
2.6.1 Verification of Logic

A specific way to verify software is to formulate CTL (Computation-Tree Logic) statements and make sure they hold [31]. A CTL statement includes a path quantifier and a temporal operator, followed by the piece of logic of the software it checks. The different options and their functions are:

Path quantifiers:
• E. For some path. . .
• A. For every path. . .

Temporal operators:
• F. . . . Eventually. . .
• G. . . . Always. . .
• X. . . . Next time. . .
• U. . . . Until. . .

Figure 2.2: Piston Visualization.
We illustrate this using the same example as before. Let us say we want to ensure that the piston never extends beyond a certain point, or it will somehow damage the system. We introduce a sensor that detects if the piston has moved too far; see figure 2.2. To make a CTL statement for this requirement, we first need to choose a suitable path quantifier. Since we are writing a statement for a scenario that should never happen, we want our statement to hold for every path: we choose A. We do not just want it to hold for every possible path, we also need it to hold at all times: we choose G. The combination of path quantifier, temporal operator, and piston sensor code will therefore be something similar to this:
AG(limit_sensor = FALSE).
To put this in words: for every path (that the system can take), the limit sensor will always be false. Rearranged into plainer language: for every path, the limit sensor will never be triggered. If the CTL statement evaluates to true, we know that there is nowhere in our software that enables the piston to move beyond the point of the sensor.

Now let us pretend that our requirements have changed. We want to add different modes to the system, and in one of them we are required to extend the piston beyond the limit sensor, to its physical limit. Since we do not know whether we really can extend the piston to its physical limit, we need a CTL statement to ensure that we can do this at least once. As the path quantifier we choose E; as the temporal operator we choose F. The combination will look like this:
EF(extended_fully = TRUE).
For some path, the full extension of the piston will eventually be true. If this statement evaluates to true, we know that there is a path that eventually leads to the piston being fully extended. This example shows how powerful the tool is. We had some concerns regarding the implementation of a new mode into our system. Instead of testing the system in real life, wasting time and resources on something that might not even work, we first tested the limits of what we wanted to achieve. Once we have found out what works through logic, we can implement it, and when we get to the real-world application we are confident that we have prepared for many possible issues and that the system should work from the start.
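For a finite model like the piston, the two statement forms used above (AG p and EF p) reduce to simple reachability over the state graph: AG p holds iff p is true in every reachable state, and EF p holds iff some reachable state satisfies p. The sketch below, with assumed state names, illustrates this; a real model checker such as NuSMV evaluates the full CTL logic symbolically, but the intuition is the same.

```python
# Successor relation of the piston FSM (command choice is nondeterministic).
TRANSITIONS = {
    "Retracted": {"Retracted", "Moving"},   # Retract / Extend
    "Moving": {"Retracted", "Extended"},
    "Extended": {"Extended", "Moving"},
}

def reachable(start):
    """All states reachable from `start` (breadth-first search)."""
    seen, frontier = {start}, [start]
    while frontier:
        for nxt in TRANSITIONS[frontier.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def AG(start, prop):
    """AG prop: prop holds in every reachable state (every path, always)."""
    return all(prop(s) for s in reachable(start))

def EF(start, prop):
    """EF prop: some reachable state satisfies prop (some path, eventually)."""
    return any(prop(s) for s in reachable(start))
```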
2.6.2 Model Checking With the Help of a State Machine

The first step when verifying a system is modeling. Models are created for different reasons, depending on their purpose. In general, for formal verification and model checking, the formalism and language are chosen based on the following factors [35]:

• System type. Types include discrete-time systems, continuous-time systems, scenario-oriented systems, and hybrid systems.
  – Discrete-time systems require declarative formalisms, finite state machines and push-down automata, as well as discrete variables and hierarchical extension.
  – Continuous-time systems require formalisms related to differential equations.
  – Scenario-oriented systems are approached with formalisms for sequence charts, both message- and live-sequence charts.
  – Hybrid systems require a mix of discrete and continuous formalisms: real-time temporal logic, or timed and hybrid automata.
• The property to be verified. Example properties include boolean properties as well as real-time properties.
• Environmental properties. Often the most extensive part of a system, and hard to handle because it is hard to understand fully, both due to its size and its complexity.
• Abstraction level. Abstraction is how we decide the level of detail of our model. Highly detailed models are close to reality, but may be considerably harder to create and implement. Abstraction is covered extensively later, in section 2.6.5.
• Clarity. In order to be understood, a model should be made intuitive and from small building blocks. A modular approach, where every part is easy to understand, can grow into a big model once put together; since all parts are fundamentally simple, the entire system remains easy to follow as well.
• Composition. Systems are almost always built by combining components into a whole. For a system built from similar modules, a synchronous composition might be suitable. For a distributed system with vastly different modules, an asynchronous composition might be suitable. A combination of the two is suitable if the system as a whole requires asynchronous composition, but subsystems work well with synchronous composition.
• Computational engines. Examples of computational engines are Binary Decision Diagrams (BDD), Boolean Satisfiability solvers (SAT), and Satisfiability Modulo Theories solvers (SMT). The verification tool is powered by these, hence the name engine. BDDs and SATs cover finite-state model checking, whereas SMTs cover high-level models of hardware.
• Expressiveness. Languages have different functionality and therefore different ways to use them. The language should be chosen depending on which one performs the task most easily.
When doing model checking, the number of states is an important aspect to consider. The more states there are, the more combinations of states and transitions there are. This is not just a linear increase: the size and complexity of the state space grow exponentially [36]. It therefore becomes increasingly difficult to analyse systems even of fairly modest size, and nigh on impossible if the system is large. Real systems pose another challenge as well, namely that the state space may be infinite, for example due to memory constraints or an undefined number of parallel running processes. We call this the state space explosion problem. It forces the system to be abstracted properly. The issue is that important features might be lost in translation: it is difficult for the user to know what information can be disregarded while keeping the correctness at a suitably high level. The model is only useful when it is as close to reality as possible, so too much abstraction renders it unusable. It is also easy to make the same mistake in the opposite direction: a model with enough detail to ensure that nothing is lost after abstraction might take too much time and too many resources to realize.
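The exponential growth is easy to see with a back-of-the-envelope calculation: for independent components, the global state space is the product of the local state counts, so adding components multiplies the total rather than adding to it. The component counts below are made up for illustration.

```python
from math import prod  # Python 3.8+

def global_states(component_sizes):
    """Size of the combined state space of independent components."""
    return prod(component_sizes)

small = global_states([3, 3, 3])   # 3 components of 3 states -> 27 global states
large = global_states([3] * 10)    # 10 components of 3 states -> 59049 global states
```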
2.6.3 Partial-Order Reduction

One method to reduce time and space for automatic verification of concurrent transitions is partial-order reduction [37]. Many events in the system are equivalent, and thus do not need to be checked every time. A related method, called compositional reasoning, consists of local reasoning rather than reasoning about the whole system: each component is treated by itself, but with assumptions about the other relevant components [38]. The difficult part is to make the right assumptions.

Let us again use the example from figure 2.1. If we tabulate every possible combination of states and transitions, the list looks like figure 2.4: for just three states and some transitions we already get six possibilities. If we add just a few more states and transitions (figure 2.3), we go from six possibilities with three states to sixteen possibilities with five. Increasing the number of states by only a few yields a very large increase in possibilities; for large systems, this becomes intractable.
2.6.4 Kripke Structure

Kripke structures are graphs that represent the possible states and transitions of a discrete system. The formal representation [39]: a Kripke structure over a set A of atomic propositions is a triple K = ⟨S, R, L⟩, where S is a finite set of states (the state space), R ⊆ S × S is a set of transitions (the transition relation), and the labeling function L : S → 2^A associates each state with a set of atomic propositions. For a state s ∈ S, the set L(s) represents the set of atomic propositions
Figure 2.3: Kripke Structure of a Larger State Machine.
Figure 2.4: Possible States and Transitions for the smaller system.
Figure 2.5: Possible States and Transitions for the larger system.
that are true when the system is in state s; the propositions in the complement A \ L(s) are those that are false in state s.
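The definition translates directly into code. The sketch below renders the triple K = ⟨S, R, L⟩ for the piston example; the atomic proposition names are assumptions made for illustration.

```python
# Kripke structure K = <S, R, L> over atomic propositions A.
S = {"Retracted", "Moving", "Extended"}                    # states
R = {("Retracted", "Retracted"), ("Retracted", "Moving"),  # transition relation
     ("Moving", "Extended"), ("Moving", "Retracted"),
     ("Extended", "Extended"), ("Extended", "Moving")}
L = {"Retracted": {"at_home"},                             # labeling function L(s)
     "Moving": set(),
     "Extended": {"fully_out"}}

def holds(state, proposition):
    """A proposition is true in state s iff it belongs to L(s); otherwise false."""
    return proposition in L[state]
```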
2.6.5 Abstraction

Since a fully fledged model is often too large to handle, it needs to be reduced. The reduction must be done in a way that keeps all relevant features intact, which is often achieved by abstracting some parts away. The abstraction is usually derived from a high-level description of the system, not from a full model [40]. Fully analog data can never be represented exactly in an abstraction, but there is always a way to abstract so that the model stays close enough. A piston that can extend and retract a set distance has a near-infinite number of positions it can be in, yet not every position is relevant. Abstraction helps to split the positions into segments that make more sense to use.
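As a concrete sketch of this idea, a continuous piston position can be abstracted into the three discrete segments used earlier. The stroke length is an assumed value; only the segment boundaries matter for the model.

```python
def abstract_position(pos_mm, stroke_mm=100.0):
    """Map a continuous piston position onto the discrete states of the model."""
    if pos_mm <= 0.0:
        return "Retracted"
    if pos_mm >= stroke_mm:
        return "Extended"
    return "Moving"  # every intermediate position collapses into one state
```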
Chapter 3 Setup and Material
3.1 Complete Plant Setup
The setup of the plant is a two-part system, with the physical plant located at Aalto University. The second part of the system is the digital twin, which is considered a perfect representation of the physical plant. The digital twin was developed at the same university where the plant resides.
3.1.1 Physical Plant

The physical plant was created by the Factory of the Future section of Aalto University. Since the plant is not in the same location as where this thesis is being conducted, close collaboration between the two sites is basically impossible. However, since the digital twin is a faithful representation of the real plant, anything we accomplish for the digital twin should also work in reality. The existence of this digital representation makes direct access to the real plant unnecessary. This section is mostly here to make sure that you, the reader, understand that a real physical plant exists. For the purpose of this work, the physical plant is not relevant, since the digital twin satisfies the needs of the project. Therefore, every reference to the plant refers to the digital twin version, unless specifically stated otherwise.
3.1.2 Digital Twin

The digital twin, created by the Aalto University Factory of the Future, exists as a project spanning several software applications. Figure 3.1 shows the visual part of the digital twin. The major components are the main plant, the Automated Guided Vehicle (AGV), and the IRB. The following sections briefly explain the major software applications used.
Figure 3.1: Visual part of the digital twin. Main plant, AGV, and IRB.
Visual Components

The visual representation of the plant is created using Visual Components, a tool designed for making 3D simulations of manufacturing processes. Even though the user may interact with the plant and its components directly through Visual Components, this is not the main tool for interaction.

nxtSTUDIO

This is the main tool for interacting with the plant in Visual Components. The controller for the Visual Components plant is created using nxtSTUDIO, an automation software for distributed systems. The controller in nxtSTUDIO is connected to the plant in Visual Components, so that the controller can receive status updates from the plant as well as control the plant through various commands.
3.2 Other Components
3.2.1 Python

From the data collected by the plant, there is a need for a tool that can interpret the data and create a model from it. Due to my previous experience and expertise, a Python approach was chosen. The main reason is that it is easy to handle and manipulate sets, lists, tuples, and dictionaries in Python, and the structure of the trace file was expected to require a lot of such manipulation to generate a model from it. With the help of Python, we can handle the trace file and restructure the data into the desired model.
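As a rough sketch of the kind of manipulation meant here, trace lines can be parsed into tuples and sorted by timestamp. The semicolon-separated line format below is an assumption for illustration, not the actual trace layout.

```python
def parse_trace(lines):
    """Turn raw trace lines into time-sorted (timestamp, subsystem, component, action) tuples."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ts, subsystem, component, action = line.split(";")
        entries.append((float(ts), subsystem, component, action))
    return sorted(entries)  # tuples sort by their first field: the timestamp

sample = ["2.5;C3;conveyor;start", "1.0;IRB;gripper;close"]
parse_trace(sample)[0]  # earliest entry: (1.0, 'IRB', 'gripper', 'close')
```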
3.2.2 NuSMV

Once the model is created, we need a tool that can evaluate it. NuSMV is a symbolic model checker: it can be used to verify and check certain properties of a system. This is particularly useful when trying out a large system without having to produce it in the real world. In order to verify a model, you first need to know what kind of properties you want to check. This makes the process very reliant on human interaction: the user needs to know the system well in order to produce the test cases necessary to yield a useful result. The tests are built upon what was covered in section 2.6.1.
Chapter 4 Results
4.1 Implementation
The solution works as follows (see figure 4.1 for a visual representation). The digital twin application records the behavior of the plant. The output from the application is then put into a trace file containing raw text that describes the behavior of the plant. We do two things with this trace file. First, we execute the model generator, which outputs a model file based on the recorded actions in the trace file. After a few manual additions to the model file, such as the last few constraints and the logic the user wants to check, the model is fed to the model checker, which returns the results. The results will either verify that our functional properties are correct, or return counterexamples showing where and how our properties do not hold. The other thing we use the trace file for is to generate state machine data, which in turn is used to generate state machine diagrams with the help of a diagram generator. The diagrams are a good tool for getting a better understanding of how the system works.

The checks to be carried out cover a range of properties. A few examples are "The AGV robot should remain in the starting position until we have created a new object and want to insert it into the rest of the system", "A subsection can not run unless a detail has passed a certain point", and "You can not create a new object unless the previous object has been cleared from the system". The complete set of properties, how they are tested, and their syntax are covered in section 5.1.1.
4.1.1 Digital Twin Component Isolation

Isolating components from the holistic view of the system enables us to explore parts of the system independently. The idea comes from partial-order reduction, covered in section 2.6.3. With our knowledge of the system, we can deduce plenty of information and remove irrelevant data.
Figure 4.1: Structure of the solution approach.
Figure 4.2: The entire plant, with IRB-subsystem in the upper left and main plant with AGV to the right.
The entire plant as presented in Visual Components is shown in figure 4.2. Let us continue by showing how the complete plant was split into smaller subsystems. The first subsystem is the mobile robot IRB (officially ABB IRB 14000) in the top-left corner, which can be viewed as an isolated system in figure 4.3 for further clarity. A simplistic view of this part of the system contains the following behavior: the robot may move away from its initial position, fetch an object from a specified position on the main plant, return to the initial position, and put the object on the conveyor belt, which in turn feeds the objects away from the process (in this case into the void). For this subsystem, the only interesting thing from a global view is whether or not the position on the main line is vacant. Anything else that goes on in the main system is irrelevant to this subsystem. Similarly, figure 4.4 shows its own subsystem, named AGV, which includes a moving robot.
Figure 4.3: Isolated IRB-subsystem of the plant.
Figure 4.4: Isolated AGV-subsystem of the plant.
Figure 4.5: Main section of the plant with a conveyor belt on top.

The task for this subsystem is to move from the initial position towards the main plant and connect to a set point of the main conveyor line. It may then feed an object into the system along the conveyor belt. The only relevant information regarding other parts of the system is whether or not the conveyor belt it connects to is running. All other information is irrelevant.

A full view of the main section of the plant can be seen in figure 4.5. As we dive deeper into the details of the main plant, we have split it into six parts, each with different components and functionality. They are depicted in their isolated form in figures 4.6 through 4.11; pieced together, they form the whole system from figure 4.5.

Each subsystem of the plant is kept as small as possible, because it is easier to work with a smaller state space. If we consider the entire system at once, the state space will be huge and the complexity will skyrocket. Therefore, by splitting up the
Figure 4.6: Isolated C3-subsystem of the plant.
Figure 4.7: Isolated C4-subsystem of the plant.
Figure 4.8: Isolated C5-subsystem of the plant.
Figure 4.9: Isolated C6-subsystem of the plant.

system into smaller sections, we only need to consider a few parameters, and the complexity decreases substantially. The following bullet points briefly explain what you see in each of the subsystems.
• The first subsystem, seen in figure 4.6, contains a black conveyor belt that runs from right to left, a gripper that grips objects from above, and two object sensors: one in the gripper and one on the conveyor belt below the gripper.
• The subsystem in figure 4.7 has a conveyor belt that runs from right to left, a sensor over the belt to detect objects, and an overhead camera.
• Figure 4.8 shows a somewhat more complex subsystem: a conveyor belt that moves from bottom to top, a sensor across the belt to detect objects, a sledge that may hold details, and a jack that can move details between the sledge and the conveyor belt.
Figure 4.10: Isolated C1-subsystem of the plant.
Figure 4.11: Isolated C2-subsystem of the plant.
• Subsystem 4.9 has a black conveyor belt that runs from left to right, a gripper that grips objects from above, and two object sensors: one in the gripper and one on the conveyor belt below the gripper.
• The subsystem in figure 4.10 has a conveyor belt that runs from left to right, a sensor over the belt to detect objects, and an overhead camera.
• The section in figure 4.11 has a conveyor belt that moves from top to bottom, a sensor across the belt to detect objects, a sledge that may hold details, and a jack that can move details between the sledge and the conveyor belt. Once an object leaves the bottom part of figure 4.11, it rejoins at the start of subsystem 4.6.
4.1.2 Visual Components Requirement Example

While exploring the functionality of the digital twin manually, the user comes across a potential issue: if commands are sent to the plant in a specific sequence, one of the stations moves through a physical barrier. The user therefore wants to check whether this sequence of commands can actually be encountered. Moving through objects is not physically possible, so the user wants to ensure that the jack is retracted vertically before it moves horizontally. After the model is completed, and before it is fed to the model checker, logic is added for checking that the jack cannot be extended or retracted horizontally if it is already extended vertically.

The IRB is part of the system, but it operates completely independently from the main part. This means that we can isolate the moving robot from most of the main system; only the places where the robot is directly connected to the main system need to be taken into account. At one point in time, the robot may move forward to a specific part of the conveyor belt, and this part, and only this part, of the conveyor section is relevant for the behavior of the moving robot.
4.1.3 Models of a System

There are a few approaches one may take when modeling a system. In the end, two approaches were explored for this project. One is to take the trace generated by the digital twin and rework it directly into a usable model. The other is to first create state machine models of the system, and then create the usable model from that data. The results differ in that you gain a greater underlying understanding of the whole system if you create the state machines first, but it takes longer. The finished approach in this project is the one where the model is created directly from the trace data.
4.1.4 Trace Manipulation Method

Through Visual Components, we can execute Python scripts whenever an action occurs. This feature was used to write traces, generated from simulations, into a file. The structure of the trace was made in two ways. One was done by the Factory of the Future team at Aalto University, and seems to have been designed to be easy for a human reading the trace to interpret. The other was developed during this project, with the intent that a different trace layout would be more suitable for my purposes: automatic handling and repurposing into executable NuSMV models are easier to accomplish with the new trace structure.
4.1.5 Generalized Model from a Generated Trace

The following code is a generalized version of how a model will look once the trace is run through the model generator. For simplicity, the IRB subsystem and its connection to the C3 subsystem (station 3 along the conveyor loop) are used.

1  | MODULE main
2  | VAR
3  |   C3_name_cmd : boolean;
4  |   C3_state_name : {state_value_1, state_value_2};
5  |   IRB_name_cmd : boolean;
6  |   IRB_state_name : {state_value_1, state_value_2, state_value_3, motion_to_state_3};
7  | ASSIGN
8  |   init(C3_name_cmd) := cmd_value_1;
9  |   init(C3_state_name) := state_value_1;
10 |   init(IRB_name_cmd) := cmd_value_1;
11 |   init(IRB_state_name) := state_value_1;
12 |   next(C3_name_cmd) := case
13 |     C3_name_cmd = cmd_value_1 & C3_state_name = state_value_1 : cmd_value_2;
14 |     C3_name_cmd = cmd_value_2 & C3_state_name = state_value_1 : cmd_value_1;
15 |     C3_name_cmd = cmd_value_1 & C3_state_name = state_value_2 : cmd_value_2;
16 |     C3_name_cmd = cmd_value_2 & C3_state_name = state_value_2 : cmd_value_1;
17 |     TRUE : C3_name_cmd;
18 |   esac;
19 | ...
20 |   next(IRB_state_name) := case
21 |     IRB_name_cmd = cmd_value_1 & IRB_state_name = state_value_1 & C3_name_cmd = cmd_value_1 : state_value_2;
22 | ...
23 |     IRB_state_name = motion_to_state_3 : state_value_3;
24 |     TRUE : IRB_state_name;
25 |   esac;
26 | ------

Lines 13-16 of the example show the overall structure of the transitions for each subsystem. Line 21 shows the transition from C3 to IRB: it contains constraints on every variable in IRB, but also one trigger from C3. This "kick-starts" the IRB process without the need for individual models for every module. Line 23 contains an example of a transition for motion_to conditions. It is put in place because, if the state of a component is that it is moving towards something, it will reach its destination regardless of the other states. The variables and components are named so that it is easy to distinguish the different subsystems, even though they are part of the same model.
4.1.6 Implementation

The functionality for taking the traces and turning them into a model for NuSMV was implemented as a Python script. The trace file is generated from the simulation data in Visual Components and is saved as a .txt log file. Each line in the text file is a specific event, containing a timestamp, subsystem, component, and action. The Python script sifts through the file and generates the model based on the events in the trace. For example, if only certain parts of the system were used, only traces corresponding to those actions will be saved in the log, and only those aspects will be generated into the model. A representation of how the traces are sorted can be seen in figure 4.12, where each thick black line represents an entry in the trace.

The desire to only use specific parts of the system is addressed to some extent. You may specify two points in time between which the trace should be used to generate the model. This means that you can record a complete trace of the entire system, but if you know you tested a certain part of the system at some point, you can supply the Python script with time slots to create the model for that specific section only.
Figure 4.12: How the trace is split into different subsystems.
Note that you must know the times when the desired section was executed, so this might not be usable in every instance.

The process to reach the goal was split into three big parts: trace generation from the digital twin, model generation from the generated trace, and model checking of the generated model. Through every iteration of these three steps, more aspects were considered and issues were rectified, until a final, working model was generated.
Model Generation From Trace File
The overall approach is presented here. A more in-depth explanation, as well as pseudo code, is presented in section 4.2:
1. In a text file, store all variables of the plant and their initial values.
2. The trace file is generated with the help of Visual Components and NuSMV.
3. The Python script sorts the original trace by time of occurrence and filters out all entries outside a specified time frame.
4. The script then sifts through the sorted trace and generates the variable section of the NuSMV model, where every possible value for every variable is declared.
Figure 4.13: Automatic generation of plant variables and their values.
Figure 4.14: Reordered plant variables, to get a better overview of system state.
5. The next thing generated and put into the NuSMV model is the initial value of every variable. The values are decided at step 1, but they are taken into consideration at this stage.
6. Then, guard conditions and transfer actions for each variable are generated for the NuSMV model. The script is now done with its execution.
7. Manually add guard conditions and transfer actions between subsystems.
8. Supply the completed model with the desired CTL-specifications and run the NuSMV model to check if the logic is true.
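Steps 4 and 5 above can be condensed into a sketch: collect every value each variable takes in the time-sorted trace, then emit the VAR and init parts of the NuSMV model. The entry layout and names below are assumptions for illustration; the real generator also emits the case/esac transition blocks of step 6.

```python
from collections import OrderedDict

def collect_values(entries):
    """entries: (timestamp, variable, value) tuples, already time-sorted.
    Returns each variable mapped to the sequence of values it takes."""
    values = OrderedDict()
    for _, var, val in entries:
        values.setdefault(var, []).append(val)
    return values

def emit_var_section(values):
    """Emit the VAR declarations and initial assignments of an SMV model."""
    lines = ["MODULE main", "VAR"]
    for var, vals in values.items():
        unique = sorted(set(vals))  # every value the variable can take
        lines.append(f"    {var} : {{{', '.join(unique)}}};")
    lines.append("ASSIGN")
    for var, vals in values.items():
        lines.append(f"    init({var}) := {vals[0]};")  # first observed value
    return "\n".join(lines)
```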
The specific variables of the system, as well as their initial values, can be seen through the NuSMV application, and are depicted in figure 4.13. This is the order in which the variables are sorted, grouped with variables from the same subsystem. The variables were reordered manually after the model was completed, as shown in figure 4.14. This serves no functional purpose; it was done to more quickly identify what was going on when stepping through the model manually.
Subsystem Structure

Every component has its subsystem name combined with its component name. We first generate a list of all the components. Together with this list, we supply the method with a specific section of the trace file. This section contains all the entries that relate to each subsection, sorted by time of occurrence. For each subsystem, the only interesting thing is the changes directly within the subsystem. Therefore, the trace is split in the way previously presented in figure 4.12. For every subsystem, the state is recorded at the preceding entry in the trace. The changes made in the following entry are the conditions of transition within the subsystem, and the state machine is created according to this structure. The trigger to start each subsystem needs to be entered manually: there is no way to automatically detect which entry in the complete trace is the trigger that initiates the state machine, since we cannot decide which of the preceding entries is the trigger for each of the different subsystems.
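The per-subsystem split of figure 4.12 amounts to grouping the time-sorted entries by their subsystem field, roughly as follows (the entry layout is an assumed simplification):

```python
from collections import defaultdict

def split_by_subsystem(entries):
    """Group (timestamp, subsystem, component, action) entries per subsystem,
    preserving the time order within each group."""
    per_subsystem = defaultdict(list)
    for entry in entries:
        per_subsystem[entry[1]].append(entry)  # entry[1] is the subsystem name
    return dict(per_subsystem)
```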
4.1.7 NuSMV Simulation

Simulation of the models is done with the NuSMV software. The first example shows how to run a simulation for 100 steps and manually watch component state changes. The commands are written in the command prompt, in the folder where the model file is located. The horizontal lines (---) indicate that there is feedback from the terminal at that point before continuing with the next command.

1 | nusmv -int filename.smv
2 | ---
3 | go
4 | pick_state
5 | ---
6 | simulate -r -k 100
7 | ---
8 | show_traces
9 | ---

CTL and LTL specifications can be written in two ways: either manually in the file where the model you want to test is located, or directly in the command prompt. With just a few lines, you let the NuSMV software test your requirements. The next code example shows how to check specific requirements; it works the same for both CTL and LTL requirements (here we use CTL statements). Line 4 is the command for evaluating every specification put in the file. Line 6 is the command for writing specifications directly into the command prompt. For checking many specifications, it is recommended to put them at the bottom of the model file, after the last line of the generated model, so that you do not have to rewrite them every time you want to test them. For a one-off test, a manual entry on the command line might suffice.
1 | nusmv -int filename.smv
2 | ---
3 | go
4 | check_ctlspec
5 | ---
6 | check_ctlspec -p "EG (create_cup_cmd = TRUE -> EX AGV_position = in_motion_FWD)"
7 | ---

The check_ctlspec command will look through the model file and test all specifications that have been put in the file beforehand, for example CTLSPEC AG !((C5_cup_sensor = detected) & (C6_cup_sensor = detected)). This CTL specification means that at no point in time can the cup sensors of both C5 and C6 detect an object simultaneously.
4.1.8 Behavior NuSMV Model Structure and Behavior The functionality of NuSMV did not enable a good way to divide the system into subsystems and check them separately. The final layout of the model is therefore the entire system, but every component has its subsystem embedded in its name. Even though the model covers the entire system, subsystems can easily be identified through these names. To that end, figure 4.15 serves as a visual aid of how I wanted my modules to behave.
Visualization of Behavior Since a modular approach makes the result easier to interpret and handle, the aim was to divide the system into several subsystems and make state machines for each part. For each subsystem, every change that could take place within the specific region was detected automatically. The conditions from other subsystems that determine when a state change may occur were added to the state machines manually. The state machines in figure 4.15 are manual interpretations of how the system behaves. A way to instead generate state machines automatically was explored. These lack the functionality to include outside connections, but in theory they are well suited for seeing how each subsystem of the model works. The automatically generated state machines are presented in their entirety in appendix A, but as of now they are not further used in the project.
4.1.9 Connections Between State Machine and NuSMV Model Since the model is created directly from the trace, with no intermediate step involving state machines, it is interesting to see whether the generated model differs from the generated state machines. If we explore the code generated for subsystem C1 in the model, we have figure 4.16. Connections to other subsystems require manual additions. Those additions
Figure 4.15: Manually constructed state machine/flow chart hybrids.
Figure 4.16: Generated code for the C1 subsystem. are from C6 and C2 and are added on line 99 and line 100. The resulting state machine for this code is shown in figure 4.17. A side-by-side comparison between this state machine and the automatically generated one (figure 4.18) tells us that the two are not the same, but they share similarities. A few things can be easily explained. We know that the transitions between states 1-2 of figure 4.17 and states 33-34 of figure 4.18 rely on outside connections, so the difference here is due to the fact that automatic detection of connections between subsystems is not implemented. The state machines agree between states 2-3 and 34-35. State 36 is where the automatically generated state machine differs the most. According to it, the digital twin's conveyor belt cannot stay on when the sensor detects an object, which would mean that the process stops every time an object passes the sensor. Figure 4.17 shows that the actual plant avoids this behavior. This is why the current form of the automatically generated state machine cannot be used to derive the plant model. It also shows that the state machine generator needs further development before it can reliably create proper state machines.
4.1.10 Solution The solution handles how to take apart a trace from a digital twin (or physical plant) and build a model from it. In this way, we know what the system can and cannot do. In order to do so, certain prerequisites need to be fulfilled. The knowledge about the system needs to be comprehensive: the way an entry is generated in the trace needs to be well understood, so that the trace can easily be interpreted automatically later on. In the approach I used, an entry was created in the trace whenever a component changed its current state, for example a conveyor changing from off to on or a sensor changing from not detected to detected. This works well in a purely event-driven setting. When every state change is recorded, it is possible to automatically generate a state machine from these changes. The issue with a system that is not purely event-driven is that it is hard to identify which of the previous entries is responsible for the activation of a certain entry. The solution was to note the few instances where time-driven activation was used and add them manually. The analysis stage of the implementation starts with line 1 of the trace and runs
Figure 4.17: State machine from the code for the C1 subsystem. Figure 4.18: State machine resulting from the automatic generation. through every line sequentially until the end of the trace. For each trace entry, the information we can extract is:
1. What time the entry occurred. This is essential so that the correct order in which interactions happen is preserved. If you want to isolate a trace to a specific time window, you can use these timestamps to generate a state machine that corresponds only to the desired time frame.
2. What subsystem and component the entry belongs to. This is important so that we know what constraints need to be taken into consideration. With our knowledge of the system, we know which other components directly affect the component in the subsystem, and we can use their current states as transition conditions for the state machine. This means we do not need the current state of every component, since components from other subsystems practically do not affect each other. Therefore, we only want to present the relevant components' state changes as transition conditions. One added benefit of this is that the size of the text file decreases drastically if you only write between 1 and 5 conditions for each state change instead of more than 40.
3. What state the component in the entry changed to. Once we know our transition conditions, we need to know what state the component is supposed to change to.
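A minimal sketch of extracting these three pieces of information from a raw trace line might look as follows; the colon-separated layout and the example entry are assumptions made for illustration:

```python
def parse_entry(line):
    """Parse one raw trace line into (time, subsystem, component, action).

    The 'time:subsystem:component:action' layout is a hypothetical
    rendering of the trace format, used only to illustrate the three
    pieces of information listed above.
    """
    time_str, subsystem, component, action = line.strip().split(":")
    return float(time_str), subsystem, component, action

entry = parse_entry("12.5:C1:C1_conveyor:on")
```

The float timestamp gives the ordering, the subsystem/component pair tells us which constraints apply, and the action is the target state of the change.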
From the complete trace we know all the different components and subsystems that are included in the system. We save the initial values for each component in every subsystem in a dictionary. When an entry in the trace is evaluated, its value in the dictionary is updated, and the values relevant to that component are presented as transition conditions. This is what keeps the state machine moving forward: every state change updates a value in the dictionary and subsequently causes a new transition condition to become true. By only including relevant information in the transition condition for a specific component, while treating other components' statuses as irrelevant, we can evaluate components in isolation from the entire system. However, this introduces a new issue. Due to the limited information available in the trace, it is not possible to automatically detect which of the other components affect the current one. The transition conditions directly impacting the component will always be there, but the extra conditions from other components need to be added manually.
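The dictionary mechanism described above can be sketched as follows; the component names and the manually supplied list of relevant components are hypothetical:

```python
def transition_conditions(current_values, component, relevant):
    """Return the transition condition for a state change of `component`,
    built only from the components listed as relevant to it.

    `current_values` maps every component to its latest known state and
    is updated as each trace entry is replayed; `relevant` is the
    manually supplied list of components whose state matters here.
    """
    return {name: current_values[name] for name in relevant}

# Hypothetical initial values for one subsystem.
values = {"C1_conveyor": "off", "C1_cup_sensor": "not_detected"}

# An entry turns the conveyor on: record the condition, then apply it.
cond = transition_conditions(values, "C1_conveyor", ["C1_cup_sensor"])
values["C1_conveyor"] = "on"
```

Only the sensor's state appears in the condition; the rest of the system's components are deliberately left out, which is what keeps the per-component view small.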
4.1.11 Pseudo Code This is pseudo code for the implementation of the model generator in Python, complemented by a script for the state machine generation. Many of the methods are deliberately kept as brief as possible, and the pseudo code is considered low-level detail. Section 4.2 explains the solution in more detail.
Model Generator
Set the initial values by running the method addInitialValues.
Set lower time limit.
Set upper time limit. (Default window is 0-3600 seconds.)
Create a list with every component.
Sort the desired trace by running the method sortTrace(filename, lower_time, upper_time).
Add "MODULE main" to model file.
Add "VAR" to model file.
Set current stage to variable.
For every component:
    Execute addModules with the sorted trace data and the current component.
Add "ASSIGN" to model file.
Set current stage to initialization.
For every component:
    Execute addModules with the sorted trace data and the current component.
Set current stage to transition.
For every component:
    Execute addModules with the sorted trace data and the current component.
Method getTime: Return a string with the current time.
Method readFile(filename of raw trace):
    For every line/entry in the file:
        Split at every ":" and put the resulting list in a list.
    Return the list.
Method appendToFile(text):
    Add the supplied text to the bottom of the desired file.
Method sortTrace(list with trace, lower time limit, upper time limit):
    Sort list by time (ascending).
    For every entry in the list:
        If the time of the entry is within the time limits:
            Keep the entry.
        If the time of the entry is outside the time limits:
            Remove the entry.
    Return the sorted list.
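A Python version of sortTrace could look like this sketch, assuming entries are (time, subsystem, component, action) tuples:

```python
def sort_trace(entries, lower_time, upper_time):
    """Sort trace entries by time (ascending) and keep only those whose
    timestamp falls within [lower_time, upper_time]."""
    in_window = [e for e in entries if lower_time <= e[0] <= upper_time]
    return sorted(in_window, key=lambda e: e[0])

# Hypothetical unsorted trace; the last entry falls outside the window.
trace = [(7.0, "C2", "C2_conveyor", "off"),
         (1.0, "C1", "C1_conveyor", "on"),
         (4000.0, "C1", "C1_conveyor", "off")]
window = sort_trace(trace, 0, 3600)  # default window from the pseudo code
```

Filtering before sorting keeps the sort cheap when the time window is narrow.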
Method addModules(sorted trace, module):
    For every entry in sorted trace:
        If the module in the entry is the same as module:
            Save the entry in a list, moduleTrace.
        Else:
            Do nothing.
    Run the list through the method createModel(moduleTrace).
Method createModel(moduleTrace):
    Create a list, comp_list.
    For every entry in the moduleTrace:
        If the component in the entry is not in comp_list:
            Add the component to comp_list.
    Create a dictionary comp_dict by calling the method findAllValues(comp_list, moduleTrace).
    If current stage is variable:
        Run comp_dict through the method appendAllVariables(comp_dict).
    Else if current stage is initialization:
        For every component in comp_dict:
            Add the first value in the dictionary that has the component as key to the model file.
    Else if current stage is transition:
        Find all transitions for every component in comp_dict by running findAndPrintTransitions(comp_dict, moduleTrace).
Method appendAllVariables(comp_dict):
    For every component in comp_dict:
        Add the component and all its corresponding values in comp_dict to the model file.
Method findAllValues(comp_list, moduleTrace):
    For every component in comp_list:
        For every entry in the moduleTrace:
            If the entry covers the component and its action is unique:
                Add the action to a list of values.
        Add the component and the list of values to comp_dict.
    Return the dictionary comp_dict.
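A possible Python rendering of findAllValues, under the same (time, subsystem, component, action) tuple assumption as before:

```python
def find_all_values(comp_list, module_trace):
    """Collect, per component, the list of unique actions (states) seen
    in the module's slice of the trace, in order of first appearance."""
    comp_dict = {}
    for component in comp_list:
        values = []
        for _time, _subsystem, comp, action in module_trace:
            if comp == component and action not in values:
                values.append(action)
        comp_dict[component] = values
    return comp_dict

# Hypothetical slice of the C1 subsystem's trace.
module_trace = [(1.0, "C1", "C1_conveyor", "on"),
                (2.0, "C1", "C1_cup_sensor", "detected"),
                (3.0, "C1", "C1_conveyor", "off"),
                (4.0, "C1", "C1_conveyor", "on")]
comp_dict = find_all_values(["C1_conveyor", "C1_cup_sensor"], module_trace)
```

The per-component value lists are exactly what the variable stage needs to declare each NuSMV variable's enumeration of possible states.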
Method findAndPrintTransitions(comp_dict, moduleTrace):
    For every component in comp_dict:
        Add "next(component) := case" to the model file.
        For every entry in the moduleTrace:
            For every component in initial_values:
                If the component in the entry is the same as the component in comp_dict:
                    Add the action to a list of constraints.
                    Overwrite the value for the component in initial_values with the action in the current trace entry.
                    If 'in_motion' is the value in initial_values for the component:
                        Add "in_motion_state): at_state;" to the model file.
                    Else if the same constraints have already been detected, or the constraint is the same as the action it yields:
                        Do nothing.
                    Else:
                        Add the constraints as conditions and the action of the entry as action to the model file.
        Add "TRUE: component;" to the model file.
        Add "esac;" to the model file.
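The core of findAndPrintTransitions is the emission of a next(...) := case ... esac; block per component. A simplified sketch, where the conditions are assumed to be pre-rendered NuSMV expressions and the component name is hypothetical:

```python
def emit_transitions(component, transitions):
    """Render a NuSMV 'next(...) := case ... esac;' block for one
    component.

    `transitions` is a list of (condition, target_state) pairs, where
    each condition is already a NuSMV boolean expression string. The
    final 'TRUE: component;' line keeps the current state when no
    condition fires, matching the pseudo code above.
    """
    lines = [f"next({component}) := case"]
    for condition, target in transitions:
        lines.append(f"    {condition}: {target};")
    lines.append(f"    TRUE: {component};")
    lines.append("esac;")
    return "\n".join(lines)

block = emit_transitions(
    "C1_conveyor",
    [("C1_cup_sensor = detected", "off")],
)
```

The deduplication and in_motion special case from the pseudo code are omitted here to keep the emission logic visible.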
Method addInitialValues:
    Open desired file.
    Create dictionary initial_values.
    For every line in the file:
        Split line at ":".
        Add the two halves of the split line to the dictionary.
    Return initial_values.
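addInitialValues can be sketched as follows; here the 'component:value' lines are passed in as a list instead of being read from a file, to keep the sketch self-contained, and the example lines are hypothetical:

```python
def add_initial_values(lines):
    """Build the initial_values dictionary from 'component:value' lines.

    The pseudo code reads these lines from a file; a list of strings is
    used here so the sketch runs on its own.
    """
    initial_values = {}
    for line in lines:
        component, value = line.strip().split(":")
        initial_values[component] = value
    return initial_values

initial = add_initial_values(["C1_conveyor:off", "C1_cup_sensor:not_detected"])
```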
State Machine Generator
Method writeToUmlfile:
    Open desired file.
    Write to file.
Method createSMdata:
    writeToUmlfile "@startuml".
    writeToUmlfile "[*] --> first_state".
    For every subsystem:
        Go through the trace data and extract relevant entries.
        Run generate_data for the current subsystem.
        writeToUmlfile "State" + latest_state + " --> first_state: completed".
    writeToUmlfile "@enduml".
Method generate_data:
    For every entry in the subsystem:
        writeToUmlfile all the current values.
        writeToUmlfile the transition condition.
        Change the value of the variable in the transition condition.
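The state machine generator's output stage can be sketched like this; the subsystem data and state numbers are illustrative, and the per-entry output of generate_data is stubbed out:

```python
def create_sm_data(subsystems):
    """Emit a minimal PlantUML state diagram skeleton, mirroring the
    createSMdata pseudo code. `subsystems` is a hypothetical list of
    (name, latest_state) pairs; per-subsystem entries from generate_data
    are omitted for brevity.
    """
    lines = ["@startuml", "[*] --> first_state"]
    for _name, latest_state in subsystems:
        lines.append(f"State{latest_state} --> first_state : completed")
    lines.append("@enduml")
    return "\n".join(lines)

uml = create_sm_data([("C1", 35)])
```

Rendering the resulting text with PlantUML produces diagrams of the kind shown in appendix A.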
4.2 General Solution
4.2.1 Structure of the Generalized Solution The solution is made for a specific system. Other systems will not be exactly the same, so an implementation for those systems would differ. Generalizing the approach of automatically creating a model from a trace gives the following steps. 1. Acquire a digital twin or plant, capable of generating a trace, T, that records the actions of the plant during runtime as different entries (equation 4.1).
T = E1, E2, ..., En, (4.1) where E is an entry and n is the number of entries. 2. Make sure that every entry takes in relevant information related to actions conducted by sensors and actuators. 3. Split the whole system into subsystems. These should be sections of the system that can internally work practically independently from the rest of the system, although they might rely on other subsystems to execute. Their membership is decided by the variable Mp in the next point; this means that every Mpk will be closely related to Mp. 4. Create every record so that it follows the same structure,
Ei = (ti, Mp, Mpk, ai). (4.2)
[time occurred (ti), module that is affected (Mp), component that is affected (Mpk ∈ Mp), action that was performed (ai)]. 5. Once the trace is completed, we can present it in an expanded form (equation 4.3): T = (t1, M1, M11, a1), (t2, M1, M12, a2), ..., (ta, M1, M1x, aa),