Information Flow Security in Tree-Manipulating Processes

TECHNISCHE UNIVERSITAT¨ MUNCHEN¨ Lehrstuhl fürInformatik II Information Flow Security in Tree-Manipulating Processes MátéAmadéKovács VollständigerAbdruck der von der FakultätfürInformatik der Technischen UniversitätMünchen zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften (Dr. rer. nat.) genehmigten Dissertation. Vorsitzende: Univ.-Prof. Dr. Claudia Eckert Prüferder Dissertation: 1. Univ.-Prof. Dr. Helmut Seidl 2. Univ.-Prof. Dr. Markus Müller-Olm Westfälische Wilhelms-UniversitätMünster Die Dissertation wurde am 30.09.2013 bei der Technischen UniversitätMünchen einge- reicht und durch die FakultätfürInformatik am 05.03.2014 angenommen. Abstract This work describes methods to verify information flow properties of processes manipulating tree-structured data. The developed techniques can be applied, e.g., to enterprise workflows and web service technologies, where data is frequently represented in the form of XML documents. These systems are highly security critical, because they may be in control of important processes of organizations, while communicating with external partners over the network. The first solution is a runtime monitor. It applies generalized constant prop- agation to overapproximate the results of secret-dependent branching constructs in order to prove their equality. The second method is a static analysis that ver- ifies information flow properties at compilation time. It relies on relational abstract interpretation applied to a statically determined alignment of two copies of the program and regular overapproximations of sets of pairs of program states. Both methods allow to enforce end-to-end information flow policies only, which are composed in terms of the initial and final states of computations. In the third part of the thesis tree-manipulating reactive systems are con- sidered, where information flow policies may change over time. The positive fragment of the Linear Temporal Logic is extended with a modal operator, the so-called hide operator, in order to express that the observable behavior of the system is independent of specific input values until a certain point in time. A model checking algorithm is provided to verify temporal information flow properties, which combines methods of abstract interpretation with model checking. In order to foster semantic clarity, the algorithms and techniques are pre- sented for a small \assembly" language for tree-manipulation. ii Zusammenfassung Diese Arbeit beschreibt Methoden, die Informationsfluss-Eigenschaften von Pro- grammen sicherstellen können,deren Schwerpunkt die Verarbeitung von baum- strukturierten Daten ist. Die entwickelten Techniken könnenz.B. bei Enterprise- Workflow-Systemen und Web-Service-Technologien eingesetzt werden, bei denen Daten oft durch XML Dokumente repräsentiert werden. Diese Systeme sind sicherheitskritisch, weil sie wichtige Prozessabläufeeiner Organisation kontrol- lieren können,währendsie mit externen Partnern über das Netzwerk kommu- nizieren. Die erste Methode nutzt einen Laufzeit-Monitor. Ein verallgemeinerter Konstantenfaltungs-Algorithmus wird verwendet, um das Ergebnis von Verzwei- gungen, die von geheimzuhaltenden Informationen abhängen,zu überapproxi- mieren, damit deren Aquivalenz¨ bewiesen werden kann. Die zweite Methode verifiziert Informationsfluss-Eigenschaften zur Ubersetzungszeit.¨ Sie basiert auf relationaler abstrakter Interpretation, angewandt auf die statische Abgleichung zweier Kopien des Programms, und regulärer Uberapproximation¨ von Mengen von Paaren von Programmzuständen. Beide Methoden könnenlediglich soge- nannte \end-to-end" Informationsfluss-Eigenschaften sicherstellen, die in Bezug auf die ersten und letzten Zuständeder Ausführungdefiniert sind. Im dritten Teil dieser Dissertation werden Baummanipulierende reaktive Systeme betrachtet, deren Sicherheitseigenschaften sich zur Laufzeit ändern können. Sicherheitseigenschaften werden in einer Erweiterung des positiven Teils der linearen temporalen Logik erfasst, die es ermöglicht, mithilfe des hide Operators die Unabhängigkeit des beobachtbaren Verhaltens des Systems von bestimmten Werten bis zu einem bestimmten Zeitpunkt zu spezifizieren. Im Rahmen dieser Arbeit wurde ein Model-Checking-Algorithmus fürdie Analyse derartiger Eigenschaften entwickelt, der Methoden aus der abstrakten Interpre- tation mit Model-Checking kombiniert. Um die semantische Klarheit zu fördern,werden die Algorithmen und Metho- den mithilfe einer kleinen Assemblersprache fürBaummanipulationen darge- stellt. Contents 1 Introduction 1 2 Runtime Monitor 7 2.1 Preliminaries . .8 2.1.1 Binary Trees . .8 2.1.2 Assembly Language for Tree Manipulation . .8 2.1.3 Information Flow Policies . 13 2.2 The Runtime Monitor through an Example . 14 2.3 Formal Treatment of the Monitor . 16 2.4 Guarantees . 22 2.5 Related Work . 23 3 Relational Abstract Interpretation 25 3.1 Merge over all Twin Computations . 27 3.2 Self-Compositions of Control Flow Graphs . 30 3.3 Proving Noninterference . 34 3.3.1 Case Study . 40 3.4 Practical Experiments . 43 3.5 Combining the Results of Multiple Analyses . 45 3.6 Related Work . 46 4 Model Checking 49 4.1 Transition Systems . 50 4.2 Temporal Information Flow Policies . 52 4.3 Model Checking Systems with Finite State Space . 54 4.4 Model Checking Systems with Infinite State Space . 58 4.4.1 Constructing the Abstract Transition System . 60 4.4.2 Constructing an Abstract State Machine Having Finite State Space . 62 4.4.3 Computing the Result . 66 4.5 Implementation . 67 4.5.1 Transforming Formulae into Büchi Automata . 67 4.5.2 The Verification Procedure . 68 4.6 Case Studies . 70 4.7 Related Work . 77 5 Conclusion 79 iii iv CONTENTS 6 Proofs 81 6.1 Proofs for Chapter 2 . 81 6.2 Proofs for Chapter 3 . 90 6.3 Proofs for Chapter 4 . 115 Chapter 1 Introduction Today companies and organizations frequently use computer systems to store data and to execute business logic. These workflow systems bear the risk of revealing critical information through software bugs, attacks, or simple miscon- figuration. Since these systems are frequently used by several principals possibly having conflicting interests, the conscious design and enforcement of information flow policies is of paramount importance. Conference Management System Reviewers Program Committee Chair Authors Figure 1.1: An imaginary conference management system and its cooperating partners. As an example, Figure 1.1 illustrates the users of an imaginary conference management system like EasyChair. The users cooperate in order to success- fully execute the workflow of submitting, reviewing and deciding about the acceptance of papers. The conference management system itself maintains a document base describing the state of the submission and review process. The document base stores among others the uploaded documents, the comments and scores given by the reviewers, and a value describing whether the submissions have been accepted. Some examples for conflicting interests in the context of 1 2 CHAPTER 1. INTRODUCTION conference management systems may be the following: • Authors might be interested in knowing the identity of their reviewers. • Authors might wish to know about their competitors previous to the dis- closure of accepted papers. There have already been security breaches in conference management systems. For instance HotCRP [40] version 2.47 had a bug which exposed comments of reviewers to the authors, which were exclusively meant for the program committee. There are well established programming languages and technologies for the implementation of process management systems that are responsible for the coordination of workflows of organizations. The family of standards for web services (e.g., [37, 20]) enables the platform independent communication of computer programs on a network using messages in XML [16] format. Based on this technology, high level workflows can be composed from the functionalities of in- dividual web services using the Web Services Business Process Execution Lan- guage (BPEL) [6]. Accordingly, BPEL is designed to implement the autonomous business logic of companies and organizations that can also communicate with external, independent entities. Therefore, the information flow security of these processes may be crucial for organizations to fulfill their missions. Another cen- tral aspect of BPEL workflows is that the values of variables are document trees. Even though the goal of BPEL programs is not to carry out complex computations, still the language is Turing complete. Data manipulation can be carried out using the XML Path Language [14] (XPath), and XSL Transformations [46] (XSLT). As motivated above, the goal of this work is to give methods for the verification and enforcement of information flow properties of programs manipulating tree-structured data. We will illustrate the developed solutions using examples that implement fragments of the imaginary conference management system sketched in Figure 1.1. We suppose that the workflow of organizing a conference consists of a series of phases. In one phase authors are allowed to upload papers, an other phase is e.g., when the submission deadline is passed, and papers are reviewed. 1 <if name="If1"> 2 <condition> 3 <![CDATA[$phase = "notify"]]> 4 </condition> 5 <assign> <copy> <from>$subDB </from> 6 <to> $toAuthors </to> </copy> 7 </assign> 8 <else> 9 <sequence> 10 <if name="If2"> 11 <condition> <![CDATA[$averageScore < 1.5]]> 12 </condition> 13 <assign name="EvalReject"> 14 <copy>

Information Flow Security in Tree-Manipulating Processes

Web Services Security Using SOAP, \Rysdl and UDDI

Second Year Report

Mobile Three-Dimensional City Maps

Addressing Threats and Security Issues in World Wide Web Technology∗†

Reflections on the REST Architectural Style and “Principled Design of the Modern Web Architecture”

Pay-As-You-Go Data Cleaning and Integration

Requests for Comments (RFC)

CV January 2001 Geoffrey Charles

Cgcopyright 2013 Alexei Czeskis

Principled Design of the Modernweb Architecture

Harvesting Commonsense and Hidden Knowledge from Web Services Julien Romero

Kapanfangrechts