Constraint Modelling and Data Validation Using Formal Speciﬁcation Languages

Constraint Modelling and Data Validation Using Formal Specification Languages Inaugural-Dissertation zur Erlangung des Doktorgrades der Mathematisch-Naturwissenschaftlichen Fakultät der Heinrich-Heine-Universität Düsseldorf vorgelegt von David Schneider aus Frankfurt am Main Düsseldorf, Dezember 2017 Aus dem Institut für Informatik der Heinrich-Heine-Universität Düsseldorf Gedruckt mit der Genehmigung der Mathematisch-Naturwissenschaftlichen Fakultät der Heinrich-Heine-Universität Düsseldorf Referent: Prof. Dr. Michael Leuschel Heinrich-Heine-Universität Düsseldorf 1. Koreferent: Prof. Dr. Yves Ledru Université Grenoble Alpes Tag der mündlichen Prüfung: 12. Juli 2017 Abstract Formal methods provide rich and expressive specification languages to reason about and to describe systems at a high abstraction level. These methods are usually supported by powerful tools to verify the correctness of the specifications by different means such as proof or model checking. But is it possible to express non-trivial constraint satisfaction problems in these specification languages and to use such a formal model at runtime for problem solving and data validation? Are languages and the tools powerful enough to enable this usage scenario? These are the central questions we explore in this thesis. We look at these questions with a particular focus on the B Method – a state based formal method for software development – and the ProB tool – an animator and model checker for the B Method. We begin by studying the use of the B language, a part of the B method, not only as a specifications language but also as a modelling language for a wide range of challenging constraint based problems. We show on several puzzles and case studies that it is possible to formalize these problems elegantly using B. We also show that for many problems it is possible to solve these formalizations using ProB. We use our results on a larger case study about performing data validation of university curricula using a formal specification. In this case study we use the B language to model and validate the feasibility of university curricula from a students’ perspective and show that ProB can efficiently solve this validation problem. In particular, we show that it is possible to embed ProB in an application and solve this validation problem at runtime using our B model. Afterwards, we will present a general structure of a data validation project in B and outline common challenges along with various solutions. This discussion is rooted in the results and experiences gathered on our case study and on a second independent one. We also discuss possible evolutions of the B language to make it (even) more suitable for such projects. In the course of this thesis we discuss several alternative modelling approaches and how they relate to B and ProB. To conclude, we perform an in-depth evaluation where we compare our B and ProB based approach to several other tools and languages that can be used for this kind of validation task. We conclude that our approach of using B not only as a formal specification language but also as a constraint modelling language can be applied successfully in this scenario. Nevertheless, there are areas where this approach could be improved or extended to better suit this kind of application. We also conclude that ProB produces very good results for the high abstraction level of the language, that it is in many cases faster than brute force solutions and that it is comparable to dedicated constraint solving approaches. Zusammenfassung Formale Methoden bieten ausdrucksstarke Spezifikationssprachen um Probleme systema- tisch zu analysieren und diese mit einem hohen Abstraktionsgrad zu beschreiben. Mit einer Spezifikation werden Eigenschaften eines Systems beschrieben. Um deren Korrektheit zu überprüfen sind Softwarewerkzeuge zentral. Diese erlauben es durch unterschiedliche Techniken, wie Beweise oder Model-Checking, solche Spezifikationen zu verifizieren. Ist es darüber hinaus möglich nicht-triviale Constraint-Satisfaction Probleme in diesen Spezifikationssprachen auszudrücken und ein solches Model zur Laufzeit zu verwenden um Probleme zu lösen und Datenvalidierung durchzuführen? Sind die existierenden Sprachen und Werkzeuge ausdrucksstark und mächtig genug für diesen Einsatzzweck? Dies sind die zentralen Fragen, die in dieser Arbeit erörtert werden. Sie werden mit einem besonderem Augenmerk auf die B-Methode – eine zustandsbasierte formale Methode zur Softwareentwicklung – und dem ProB Werkzeug – ein Animator und Model-Checker für die B-Methode – untersucht. Zunächst wird die Verwendung der B-Sprache, ein Teil der B-Methode, nicht nur als Spezifikationssprache sondern auch als Sprache um Constraints zu beschreiben diskutiert. Anhand einer Reihe unterschiedlicher Puzzles und Fallstudien wird gezeigt, dass es möglich ist, diese mit der B-Sprache auf elegante Art und Weise zu formalisieren und dass es möglich ist, diese Problembeschreibungen mit Hilfe vom ProB auszuwerten. Diese Ergebnisse bilden die Grundlage einer umfangreichen Fallstudie über die Validierung von Hochschulstudienplänen mit Hilfe einer formalen B-Spezifikation. Insbesondere wird gezeigt, dass es möglich ist ProB in eine Anwendung einzubetten um dieses Validierungsproblem zur Laufzeit zu lösen. Aufbauend auf den Ergebnissen und Erfahrungen, die bei der zuvor erwähnten Fall- studie und bei einer zweiten unabhängigen Fallstudie gesammelt wurden, wird eine allgemeine Struktur für B-basierte Datenvalidierungsprojekte vorgestellt. Außerdem werden Probleme, die bei diesen Projekten auftreten können, sowie mögliche Lösungsansätze diskutiert. Im Verlauf dieser Arbeit werden verschiedene alternative Modellierungsansätze diskutiert, sowie mit B und ProB verglichen. Den Abschluss dieser Arbeit bildet ein systematischer Vergleich dieses auf B und ProB basierenden Ansatzes zu alternativen Methoden, die für die untersuchten Validierungsprobleme eingesetzt werden können. Aus diesem Vergleich lässt sich schließen, dass B nicht nur als formale Spezifikationssprache sondern in diesem Kontext auch als Costraint-Modellierungssprache eingesetzt werden kann. Eines der Ergebnisse dieser Arbeit ist, dass ProB sehr gute Ergebnisse bei der untersuchten Art von Validierungsproblemen, insbesondere unter Berücksichtigung des Abstraktionsgrades der Sprache, liefert. Ferner ist ProB in vielen Fällen schneller als Brute-Force-Lösungen und liefert Ergebnisse, die mit dedizierten Constraint-Solving-Ansätzen vergleichbar sind. vi Acknowledgments During the time I have worked on this thesis I have received help, support and encour- agement from many people, to whom I am deeply thankful. Firstly I would like to thank my advisor Michael Leuschel for his guidance, support and the chance to explore different ideas, some of which finally lead to this thesis. I am also very grateful for the opportunity to work with him and be a part of his team. I’d like to thank all off my current and former colleagues and students at the chair of software engineering and programming languages with whom I had the pleasure to work during my time as a student and later as a member of the research group. Additionally a honourable mention goes to our foosball table. A central part of the work discussed in this thesis is the timetable validation case study discussed in Chapter 4 and Chapter 6. In this context I would like to particularly thank Frank Meier for initially bringing up this project and his continuous involvement in the project. Thanks also go to Svenja Wrede, Michael Heim, Nicolas Carsten, Nico Steffen, Niklas Bertram, Sarah Hönerbach and Jürgen Rauter from the deans’ offices at the faculties of Arts & Humanities and Business Administration & Economics at Heinrich Heine University Düsseldorf for the very fruitful collaboration and their helpful feedback during the project. Particular thanks go to Tobias Witt, Philip Höfges and Joshua Schmidt who put very much of effort into the implementation of the timetable validation application and without their work it would have been impossible to create the application that was the result of this case study. My deep gratitude also goes (alphabetically) to Anna Michaelis, Carl Friedrich Bolz, Dominik Hansen, Frank Meier, Michael Leuschel, Nicole Grothe and Stephan Zalewski for proof-reading and commenting on different draft versions of this thesis as well as for the inspiring and insightful discussions to which their feedback led. Regarding the articles that have been used in this thesis, I would like to thank the anonymous reviewers of ABZ 2014, FM 2015 and ABZ 2016 for their work and valuable feedback. Finally, I would like to extend a special thank you to my family and friends, in particular to my wife Anna, for their patience, curiosity about my work and most importantly of all for their support. Contents I Introduction and Background 1 1 Introduction 3 2 Background 7 2.1 Formal Methods & Specification Languages . 7 2.1.1 The B Method . 9 2.1.2 Formal Data Validation . 11 2.2 Constraint Programming . 13 2.3 ProB - Constraint Solving for the B Method . 14 II B for Constraint Modelling and Data Validation 19 3 Using B as a Constraint Modelling Language 21 3.1 On the Expressiveness of B - The Jobs Puzzle . 23 3.1.1 Shapiro’s Challenge . 24 3.1.2 A Solution to the Jobs Puzzle Using B . 25 3.1.3 Related Work . 29 3.2 Solving Constraint Problems with B and ProB . 30 3.2.1 Subset Sum . 31 3.2.2 n-Queens . 33 3.2.3 Peaceable Armies of Queens . 34 3.2.4 Extended Graph Isomorphism . 37 3.3 Industrial Applications . 38 3.4 Conclusion and Future Work . 40 4 Case Study: Timetable Validation and Improvement 43 4.1 Background . 43 4.2 A Domain Theory of Timetables and Curricula . 45 4.2.1 Background . 45 4.2.2 Timetabling Data . 46 4.2.3 Module Combinations . 49 4.2.4 Curricula/Timetables . 53 4.3 Modelling Curricula and Timetables in B . 59 4.3.1 Data Import & Representation . 60 4.3.2 Validation . 69 ix Contents 4.3.3 Assisting the Solving Process . 72 4.3.4 Computing Conflict Sources . 73 4.3.5 Improvement – Finding Alternative Time Slots . 75 4.3.6 Model Structure . 78 4.3.7 Modelling Stages . 78 4.4 From Model to Application .

Load more