Verification of High-Level Transformations with Inductive Refinement Types Ahmad Salim Al-Sibahi, Aleksandar Dimovski, Thomas Jensen, Andrzej Wasowski
Total Page:16
File Type:pdf, Size:1020Kb
Verification of High-Level Transformations with Inductive Refinement Types Ahmad Salim Al-Sibahi, Aleksandar Dimovski, Thomas Jensen, Andrzej Wasowski To cite this version: Ahmad Salim Al-Sibahi, Aleksandar Dimovski, Thomas Jensen, Andrzej Wasowski. Verification of High-Level Transformations with Inductive Refinement Types. GPCE 2018 - 17th International Con- ference on Generative Programming: Concepts & Experience, Nov 2018, Boston, United States. pp.147-160, 10.1145/3278122.3278125. hal-01898058 HAL Id: hal-01898058 https://hal.inria.fr/hal-01898058 Submitted on 19 Oct 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Verification of High-Level Transformations with Inductive Refinement Types Ahmad Salim Al-Sibahi Thomas P. Jensen DIKU/Skanned.com/ITU INRIA Rennes Denmark France Aleksandar S. Dimovski Andrzej Wąsowski ITU/Mother Teresa University, Skopje ITU Denmark/Macedonia Denmark Abstract 1 Introduction High-level transformation languages like Rascal include ex- Transformations play a central role in software development. pressive features for manipulating large abstract syntax trees: They are used, amongst others, for desugaring, model trans- first-class traversals, expressive pattern matching, backtrack- formations, refactoring, and code generation. The artifacts ing and generalized iterators. We present the design and involved in transformations—e.g., structured data, domain- implementation of an abstract interpretation tool, Rabit, for specific models, and code—often have large abstract syn- verifying inductive type and shape properties for transfor- tax, spanning hundreds of syntactic elements, and a corre- mations written in such languages. We describe how to per- spondingly rich semantics. Thus, writing transformations form abstract interpretation based on operational semantics, is a tedious and error-prone process. Specialized languages specifically focusing on the challenges arising when analyz- and frameworks with high-level features have been devel- ing the expressive traversals and pattern matching. Finally, oped to address this challenge of writing and maintain- we evaluate Rabit on a series of transformations (normaliza- ing transformations. These languages include Rascal [29], tion, desugaring, refactoring, code generators, type inference, Stratego/XT [12], TXL [16], Uniplate [32] for Haskell, and etc.) showing that we can effectively verify stated properties. Kiama [44] for Scala. For example, Rascal combines a func- tional core language supporting state and exceptions, with CCS Concepts • Theory of computation → Program constructs for processing of large structures. verification; Program analysis; Abstraction; Functional constructs; Program schemes; Operational semantics; Control primitives; • Software and its engineering → Translator 1 PUBLIC Script flattenBlocks(Script s) { writing systems and compiler generators; Semantics; 2 SOLVE(s) { 3 S = bottom-up VISIT(s) { Keywords transformation languages, abstract interpreta- 4 CASE stmtList: [ xs,block(ys), zs] => tion, static analysis * * 5 XS + YS + ZS ACM Reference Format: 6 } Ahmad Salim Al-Sibahi, Thomas P. Jensen, Aleksandar S. Dimovski, 7 } and Andrzej Wąsowski. 2018. Verification of High-Level Trans- 8 RETURN s; Proceedings of formations with Inductive Refinement Types. In 9 } the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE ’18), November 5– 6, 2018, Boston, MA, USA. ACM, New York, NY, USA, 14 pages. Figure 1. Transformation in Rascal that flattens all nested https://doi.org/10.1145/3278122.3278125 blocks in a statement Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies Figure1 shows an example Rascal transformation program 1 are not made or distributed for profit or commercial advantage and that taken from a PHP analyzer. This transformation program copies bear this notice and the full citation on the first page. Copyrights recursively flattens all blocks in a list of statements. The for components of this work owned by others than the author(s) must program uses the following core Rascal features: be honored. Abstracting with credit is permitted. To copy otherwise, or • republish, to post on servers or to redistribute to lists, requires prior specific A visitor (VISIT) to traverse and rewrite all statement permission and/or a fee. Request permissions from [email protected]. lists containing a block to a flat list of statements. Vis- GPCE ’18, November 5–6, 2018, Boston, MA, USA itors support various strategies, like the bottom-up © 2018 Copyright held by the owner/author(s). Publication rights licensed strategy that traverses the abstract syntax tree starting to ACM. from leaves toward the root. ACM ISBN 978-1-4503-6045-6/18/11...$15.00 https://doi.org/10.1145/3278122.3278125 1https://github.com/cwi-swat/php-analysis 147 GPCE ’18, November 5–6, 2018, Boston, MA, USA A. S. Al-Sibahi, T. P. Jensen, A. S. Dimovski and A. Wąsowski • An expressive pattern matching language is used to non- 1 DATA Nat = zero() | suc(Nat pred); deterministically find blocks inside a list of statements. 2 DATA Expr = var(STR nm) | cst(Nat vl) The starred variable patterns *XS and *ZS match arbi- 3 | mult(Expr el, Expr er); trary number of elements in the list, respectively be- 4 fore and after the block(ys) element. Rascal supports 5 Expr simplify(Expr expr) = non-linear matching, negative matching and specify- 6 bottom-up VISIT (expr) { ing patterns that match deeply nested values. 7 CASE mult(cst(zero()), y) => cst(zero()) • The solve-loop (SOLVE) performing the rewrite until a 8 CASE mult(x, cst(zero())) => cst(zero()) fixed point is reached (the value of s stops changing). 9 }; To rule out errors in transformations, we propose a static Figure 2. The running example: eliminating multiplications analysis for enforcing type and shape properties, so that by zero from expressions target transformations produce output adhering to particular shape constraints. For our PHP example, this would include: We proceed by presenting a running example in Sect.2. • The transformation preserves the constructors used in We introduce the key constructs of Rascal in Sect.3. Sec- the input: does not add or remove new types of PHP tion4 describes the modular construction of abstract do- statements. mains. Sections 5 to8 describe abstract semantics. We evalu- • The transformation produces flat statement lists, i.e., ate the analyzer on realistic transformations, reporting re- lists that do not recursively contain any block. sults in Sect.9. Sections 10 and 11 discuss related papers and conclude. To ensure such properties, a verification technique must rea- son about shapes of inductive data—also inside collections 2 Motivation and Overview such as sets and maps—while still maintaining soundness Verifying types and state properties such as the ones stated and precision. It must also track other important aspects, for the program of Fig.1 poses the following key challenges: like cardinality of collections, which interact with target lan- • guage operations including pattern matching and iteration. The programs use heterogeneous inductive data types, In this paper, we address the problem of verifying type and contain collections such as lists, maps and sets, and and shape properties for high-level transformations written basic data such as integers and strings. This compli- in Rascal and similar languages. We show how to design and cates construction of the abstract domains, since one implement a static analysis based on abstract interpretation. shall model interaction between these different types Concretely, our contributions are: while maintaining precision. • The traversal of syntax trees depends heavily on the type and shape of input, on a complex program state, 1. An abstract interpretation-based static analyzer—Rascal and involves unbounded recursion. This challenges the ABstract Interpretation Tool (Rabit)—that supports in- inference of approximate invariants in a procedure ferring types and inductive shapes for a large subset that both terminates and provides useful results. of Rascal. • Backtracking and exceptions in large programs intro- 2. An evaluation of Rabit on several program transfor- duce the possibility of state-dependent non-local jumps. mations: refactoring, desugaring, normalization algo- This makes it difficult to statically calculate the con- rithm, code generator, and language implementation trol flow of target programs and have a compositional of an expression language. denotational semantics, instead of an operational one. 3. A modular design for abstract shape domains, that al- lows extending and replacing abstractions for concrete Figure2 presents a small pedagogical example using visitors. element types, e.g. extending the abstraction for lists The program performs expression simplification by travers- to include length in addition to shape of contents. ing a syntax tree bottom-up and reducing multiplications by 4. Schmidt-style