162581776.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by The IT University of Copenhagen's Repository Verification of High-Level Transformations with Inductive Refinement Types Ahmad Salim Al-Sibahi Thomas P. Jensen DIKU/Skanned.com/ITU INRIA Rennes Denmark France Aleksandar S. Dimovski Andrzej Wąsowski ITU/Mother Teresa University, Skopje ITU Denmark/Macedonia Denmark Abstract 1 Introduction High-level transformation languages like Rascal include ex- Transformations play a central role in software development. pressive features for manipulating large abstract syntax trees: They are used, amongst others, for desugaring, model trans- first-class traversals, expressive pattern matching, backtrack- formations, refactoring, and code generation. The artifacts ing and generalized iterators. We present the design and involved in transformations—e.g., structured data, domain- implementation of an abstract interpretation tool, Rabit, for specific models, and code—often have large abstract syn- verifying inductive type and shape properties for transfor- tax, spanning hundreds of syntactic elements, and a corre- mations written in such languages. We describe how to per- spondingly rich semantics. Thus, writing transformations form abstract interpretation based on operational semantics, is a tedious and error-prone process. Specialized languages specifically focusing on the challenges arising when analyz- and frameworks with high-level features have been devel- ing the expressive traversals and pattern matching. Finally, oped to address this challenge of writing and maintain- we evaluate Rabit on a series of transformations (normaliza- ing transformations. These languages include Rascal [28], tion, desugaring, refactoring, code generators, type inference, Stratego/XT [12], TXL [16], Uniplate [31] for Haskell, and etc.) showing that we can effectively verify stated properties. Kiama [43] for Scala. For example, Rascal combines a func- tional core language supporting state and exceptions, with CCS Concepts • Theory of computation → Program constructs for processing of large structures. verification; Program analysis; Abstraction; Functional constructs; Program schemes; Operational semantics; Control primitives; • Software and its engineering → Translator 1 PUBLIC Script flattenBlocks(Script s) { writing systems and compiler generators; Semantics; 2 SOLVE(s) { 3 S = bottom-up VISIT(s) { Keywords transformation languages, abstract interpreta- 4 CASE stmtList: [ xs,block(ys), zs] => tion, static analysis * * 5 XS + YS + ZS ACM Reference Format: 6 } Ahmad Salim Al-Sibahi, Thomas P. Jensen, Aleksandar S. Dimovski, 7 } and Andrzej Wąsowski. 2018. Verification of High-Level Trans- 8 RETURN s; Proceedings of formations with Inductive Refinement Types. In 9 } the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE ’18), November 5– 6, 2018, Boston, MA, USA. ACM, New York, NY, USA, 14 pages. Figure 1. Transformation in Rascal that flattens all nested https://doi.org/10.1145/3278122.3278125 blocks in a statement Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies Figure1 shows an example Rascal transformation program 1 are not made or distributed for profit or commercial advantage and that taken from a PHP analyzer. This transformation program copies bear this notice and the full citation on the first page. Copyrights recursively flattens all blocks in a list of statements. The for components of this work owned by others than the author(s) must program uses the following core Rascal features: be honored. Abstracting with credit is permitted. To copy otherwise, or • republish, to post on servers or to redistribute to lists, requires prior specific A visitor (VISIT) to traverse and rewrite all statement permission and/or a fee. Request permissions from [email protected]. lists containing a block to a flat list of statements. Vis- GPCE ’18, November 5–6, 2018, Boston, MA, USA itors support various strategies, like the bottom-up © 2018 Copyright held by the owner/author(s). Publication rights licensed strategy that traverses the abstract syntax tree starting to ACM. from leaves toward the root. ACM ISBN 978-1-4503-6045-6/18/11...$15.00 https://doi.org/10.1145/3278122.3278125 1https://github.com/cwi-swat/php-analysis GPCE ’18, November 5–6, 2018, Boston, MA, USA A. S. Al-Sibahi, T. P. Jensen, A. S. Dimovski and A. Wąsowski • An expressive pattern matching language is used to non- 1 DATA Nat = zero() | suc(Nat pred); deterministically find blocks inside a list of statements. 2 DATA Expr = var(STR nm) | cst(Nat vl) The starred variable patterns *XS and *ZS match arbi- 3 | mult(Expr el, Expr er); trary number of elements in the list, respectively be- 4 fore and after the block(ys) element. Rascal supports 5 Expr simplify(Expr expr) = non-linear matching, negative matching and specify- 6 bottom-up VISIT (expr) { ing patterns that match deeply nested values. 7 CASE mult(cst(zero()), y) => cst(zero()) • The solve-loop (SOLVE) performing the rewrite until a 8 CASE mult(x, cst(zero())) => cst(zero()) fixed point is reached (the value of s stops changing). 9 }; To rule out errors in transformations, we propose a static Figure 2. The running example: eliminating multiplications analysis for enforcing type and shape properties, so that by zero from expressions target transformations produce output adhering to particular shape constraints. For our PHP example, this would include: We proceed by presenting a running example in Sect.2. • The transformation preserves the constructors used in We introduce the key constructs of Rascal in Sect.3. Sec- the input: does not add or remove new types of PHP tion4 describes the modular construction of abstract do- statements. mains. Sections 5 to8 describe abstract semantics. We evalu- • The transformation produces flat statement lists, i.e., ate the analyzer on realistic transformations, reporting re- lists that do not recursively contain any block. sults in Sect.9. Sections 10 and 11 discuss related papers and conclude. To ensure such properties, a verification technique must rea- son about shapes of inductive data—also inside collections 2 Motivation and Overview such as sets and maps—while still maintaining soundness Verifying types and state properties such as the ones stated and precision. It must also track other important aspects, for the program of Fig.1 poses the following key challenges: like cardinality of collections, which interact with target lan- • guage operations including pattern matching and iteration. The programs use heterogeneous inductive data types, In this paper, we address the problem of verifying type and contain collections such as lists, maps and sets, and and shape properties for high-level transformations written basic data such as integers and strings. This compli- in Rascal and similar languages. We show how to design and cates construction of the abstract domains, since one implement a static analysis based on abstract interpretation. shall model interaction between these different types Concretely, our contributions are: while maintaining precision. • The traversal of syntax trees depends heavily on the type and shape of input, on a complex program state, 1. An abstract interpretation-based static analyzer—Rascal and involves unbounded recursion. This challenges the ABstract Interpretation Tool (Rabit)—that supports in- inference of approximate invariants in a procedure ferring types and inductive shapes for a large subset that both terminates and provides useful results. of Rascal. • Backtracking and exceptions in large programs intro- 2. An evaluation of Rabit on several program transfor- duce the possibility of state-dependent non-local jumps. mations: refactoring, desugaring, normalization algo- This makes it difficult to statically calculate the con- rithm, code generator, and language implementation trol flow of target programs and have a compositional of an expression language. denotational semantics, instead of an operational one. 3. A modular design for abstract shape domains, that al- lows extending and replacing abstractions for concrete Figure2 presents a small pedagogical example using visitors. element types, e.g. extending the abstraction for lists The program performs expression simplification by travers- to include length in addition to shape of contents. ing a syntax tree bottom-up and reducing multiplications by 4. Schmidt-style abstract operational semantics [40] for a constant zero. We now survey the analysis techniques con- significant subset of Rascal adapting the idea of trace tributed in this paper, explaining them using this example. memoization to support arbitrary recursive calls with Inductive Refinement Types. Rabit works by inferring an input from infinite domains. inductive refinement type representing the shape of possible output of a transformation given the shape of its input. It does Together, these contributions show feasibility of applying ab- this by interpreting the simplification program abstractly, stract interpretation for constructing analyses for expressive considering all possible paths the program can take for values transformation languages and properties. satisfying the input shape (any expression of type Expr in Verification of High-Level Transformations ... GPCE ’18, November 5–6, 2018, Boston, MA, USA mult ¹cst ¹Natº ; cst ¹Natºº this case). The result of running