<<

Numeric Domains Meet Algebraic Data Types Santiago Bautista, Thomas Jensen, Benoît Montagu

To cite this version:

Santiago Bautista, Thomas Jensen, Benoît Montagu. Numeric Domains Meet Algebraic Data Types. NSAD 2020 - 9th International Workshop on Numerical and Symbolic Abstract Domains, Nov 2020, Virtual, United States. pp.12-16, ￿10.1145/3427762.3430178￿. ￿hal-03028476￿

HAL Id: hal-03028476 https://hal.inria.fr/hal-03028476 Submitted on 27 Nov 2020

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Numeric Domains Meet Algebraic Data Types

Santiago Bautista Thomas Jensen Benoît Montagu ENS Rennes Inria Inria Bruz, France Rennes, France Rennes, France Inria [email protected] [email protected] Rennes, France [email protected]

Abstract properties on scalars. Systems based on refinement types We report on the design and formalization of a novel abstract use SMT solvers to discharge verification constraints. domain, called numeric path relations (NPRs), that combines The topic of abstract domains for the relational analysis numeric relational domains with algebraic data types. This of algebraic data types, however, has remained largely un- domain expresses relations between algebraic values that explored by the static analysis community. The correlation can contain scalar data. The construction of the domain is abstract domain [1] is a recent progress in that area, focused parameterized by the choice of a relational domain on scalar on extracting equality relations. The current work devises values. The construction employs projection paths on alge- a relational abstract domain for algebraic data types, that braic values, and in particular projections on variant cases, expresses relations richer than equality over scalar values. whose sound treatment is subtle due to mutual exclusiveness. We introduce the novel abstract domain of Numeric Path Relations (NPRs) that leverages the expressiveness of known CCS Concepts: • Theory of computation → Program numeric relational domains to denote expressive relations analysis; • Software and its engineering → Automated over structured data that contain scalar values. This is a static analysis. first and necessary step to analyze languages that expose ♯ Keywords: numeric relations, algebraic types, NPR abstract algebraic data types and (OCaml, F , Swift, domain, static analysis, abstract interpretation Rust...). Such languages have been successfully used to spec- ify and reason on high level models of complex programs, ACM Reference Format: such as operating systems [8]. Our long term goal is to ex- Santiago Bautista, Thomas Jensen, and Benoît Montagu. 2020. Nu- ploit results of static analyses to reduce the manual proof meric Domains Meet Algebraic Data Types. In Proceedings of the effort required to verify such complex programs. 9th ACM SIGPLAN International Workshop on Numerical and Sym- bolic Abstract Domains (NSAD ’20), November 17, 2020, Virtual, USA. 2 Technical Setting ACM, New York, NY, USA,5 pages. https://doi.org/10.1145/3427762. 3430178 This section introduces notations for algebraic data types and numeric abstract domains, used in the paper. 1 Introduction 2.1 Algebraic Data Types The field of relational abstract domains on scalar values is Algebraic types are arbitrary nestings of sum types and prod- a vibrant research area, that has produced a rich variety of uct types. Here, we only deal with non-recursive algebraic expressive domains [4, 6, 9, 10, 12]. They were successfully types, which excludes lists or trees. Algebraic types are given used in the implementation of static analyzers to automati- by the following grammar (N is the type of scalar values): cally prove the safety of large code bases [3,5]. In the community, types are ex- 푖 ∈퐼 푖 ∈퐼 휏 ::= N | {푓푖 → 휏푖 } | [퐴푖 → 휏푖 ] erted as a means to express properties of structured data— 푖 ∈퐼 also known as algebraic data types—that a program manipu- The type {푓푖 → 휏푖 } is the where each field lates. More recently, refinement types15 [ –17] were devel- 푖 ∈퐼 푓푖 is of type 휏푖 , and [퐴푖 → 휏푖 ] stands for the sum type oped, that augment types with rich properties, including where each constructor 퐴푖 takes an argument of type 휏푖 . For example, playing cards can be represented by the type ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such,  Hearts → {}, Spades → {}  suit → , rank → N the Government retains a nonexclusive, royalty-free right to publish or Diamonds → {}, Clubs → {} reproduce this article, or to allow others to do so, for Government purposes only. Values, written 푣, are standard for a functional language. NSAD ’20, November 17, 2020, Virtual, USA From base scalar values 푛, compound values are built to form © 2020 Association for Computing Machinery. 푖 ∈퐼 ACM ISBN 978-1-4503-8187-1/20/11...$15.00 records {푓푖 → 푣푖 } or variants 퐴(푣). The different construc- https://doi.org/10.1145/3427762.3430178 tors of a sum type are mutually exclusive: each variant uses

12 NSAD ’20, November 17, 2020, Virtual, USA Santiago Bautista, Thomas Jensen, and Benoît Montagu only one constructor. The syntax for values is given by: • Soundness of pre-order: if (푐,푉 ) ⊑퐷 (V) (푐 ′,푉 ), then 훾 D(V) 푐,푉 훾 D(V) 푐 ′,푉 푖 ∈퐼 (( )) ⊆ (( )). 푣 ::= 푛 | {푓푖 → 푣푖 } | 퐴(푣) • Soundness of union:   For example, let us call hearts_queen the value {suit → 훾 D(V) (푐,푉 ) ∪ 훾 D(V) (푐 ′,푉 ) ⊆ 훾 D(V) (푐,푉 ) ∪퐷 (V) (푐 ′,푉 ) Hearts({}), rank → 12} that represents the queen of hearts. We may write algebraic values to refer to values from an • Soundness of intersection: algebraic . D(V) D(V) ′ D(V)  퐷 (V) ′  We use projection paths to refer to scalar data contained 훾 (푐,푉 ) ∩ 훾 (푐 ,푉 ) ⊆ 훾 (푐,푉 ) ∩ (푐 ,푉 ) in an algebraic value. The of paths, written Paths, is inductively defined by the following grammar: These are unsurprising requirements for an abstract domain. 푝 ::= 휀 | .푓 푝 | @퐴푝 2.3 Variable Manipulation where 푓 is a field, and 퐴 is a constructor. The concretization, pre-order, abstract union and abstract The projection of an algebraic value 푣 on a path 푝, written intersection operations are fundamental to the abstract in- 푣 ⇓val 푝, computes the scalar obtained by following path 푝 in terpretation framework. Additionally, we assume that the 퐷 푣. It is defined as the following partial function: abstract domain also provides the operations Add, Remove and Rename, for respectively adding new variables to an 푣 ⇓val 휀 = 푣 if ⊢ 푣 :N abstract value, removing some variables, and renaming vari- ∈ 푖 퐼 val val {푓푖 → 푣푖 } ⇓ .푓푗 푝 = 푣 푗 ⇓ 푝 if 푗 ∈ 퐼 ables. Those operations are available in Apron [7] and will 퐴(푣) ⇓val @퐴푝 = 푣 ⇓val 푝 prove useful in sections3 and5. E.g., by projecting hearts_queen on its rank field, we get 2.3.1 Adding Variables to an Abstract Value. The ab- val 푐,푉 푉 12: hearts_queen ⇓ .rank = 12. Its projection on the path stract value Add푉2 ( 1) adds the variables 2 ⊆ V to the 푐,푉 푐,푉 푉 푉 .suit@Clubs is not defined, however, because the field suit of abstract value ( 1). The variables of Add푉2 ( 1) are 1∪ 2. 푐,푉 hearts_queen is not set to the variant Clubs. The projection Add푉2 ( 1) should satisfy the following property: of hearts_queen over .suit@Hearts is not defined either, as 훾 D(V) Add (푐,푉 ) = {} 푉2 1 it would lead to the value , which is not a scalar.  푓 훾 D(V) 푐,푉 ,  푔 푉 푉 ∃ ∈ (( 1)) : ( 1 ∪ 2) → R 푥 푉 , 푓 푥 푔 푥 2.2 Numeric Abstract Domains ∀ ∈ 1 ( ) = ( )

Numeric abstract domains are parameterized by a set V of Said otherwise, the newly added variables, i.e. those in 푉2 \푉1 푐,푉 variable names. The abstract values of a numeric abstract do- are unconstrained in Add푉2 ( 1), whereas the variables in main 퐷(V) are pairs (푐,푉 ) where푉 ⊆ V is a set of variables 푉1 are constrained in the same way as they were in (푐,푉1). and 푐 is a set of constraints over the variables of 푉 . For exam- (푐,푉 ) ple, the abstract value ({푥 ≤ 푦, 푥 ≤ 5}, {푥,푦, 푧}) talks about 2.3.2 Removal of Variables. Remove푉2 1 is obtained 푉 푐,푉 three variables (푥, 푦 and 푧) and contains two constraints by removing the variables in 2 ⊆ V from ( 1). Thus, (푐,푉 ) 푉 \ 푉 (푥 ≤ 푦 and 푥 ≤ 5). The variable 푧 is left unconstrained. We Remove푉2 1 talks about variables in 1 2. In addition, (푐,푉 ) call Constraints (퐷(V)) the set of constraints expressible Remove푉2 1 should satisfy the following property: in the domain 퐷(V). Then, the abstract values of 퐷(V)  D(V)  ∃푓 ∈ 훾 ((푐,푉1)) , are elements of P(Constraints (퐷(V))) × P(V) (where P 푔 : (푉1 \ 푉2) → R ∀푥 ∈ 푉1 \ 푉2, 푓 (푥) = 푔(푥) stands for the powerset). When writing an abstract value   D(V) (푐,푉 ) in examples, we omit 푉 if it corresponds exactly to the ⊆ 훾 Remove(푐,푉1) 푉 variables appearing in the constraints of 푐. For example, we 2 write {푥 ≤ 푦, 푥 ≤ 5} instead of ({푥 ≤ 푦, 푥 ≤ 5}, {푥,푦}). In other words, for any environment belonging to the con- A numeric abstract domain 퐷(V) is a cretization of (푐,푉1), its restriction to 푉1 \ 푉2 belongs to the concretization of Remove푉 (푉1). We only require inclusion,  퐷 (V) D(V) 퐷 (V) 퐷 (V) 퐷 (V)  2 A ,훾 , ⊑ , ∪ , ∩ not equality. Indeed, removing variables from an abstract value might cause a loss of information in some domains. A퐷 (V) is the set of abstract values of 퐷(V). 훾 D(V) is the concretization function, that maps each abstract value (푐,푉 ) 2.3.3 Variable Renaming. Given a bijective function 푟 : 푉 푉 푉 ,푉 푐,푉 to a set of environments, i.e. an element of P(푉 → R). The 1 → 2, where 1 2 ⊆ V; Rename푟 ( 1) is obtained by (V) (V) relation ⊑퐷 is a pre-order on abstract values. ∪퐷 and renaming the variables of (푐,푉1) into variables of 푉2 accord- ∩퐷 (V) are operators on abstract values over-approximating ing to the bijection 푟. The following property must hold: set union and set intersection, respectively. A numeric ab- D(V)   n D(V) o 훾 Rename(푐,푉1) = 푔 | 푔 ◦ 푟 ∈ 훾 ((푐,푉1)) stract domain must satisfy the following properties: 푟

13 Numeric Domains Meet Algebraic Data Types NSAD ’20, November 17, 2020, Virtual, USA

푉 푥,푦 푉 푦, 푧 For example, for 1 = { }, 2 = { } and the bijection drift (x : t1) = match x with 푟 푉 푉 푟 푥 푦 푟 푦 푧 : 1 → 2 such that ( ) = and ( ) = , we have: A( n ) → r : = A( n +1) (V)   (V) → 훾 D Rename{푥 = 2푦,푦 ≥ 0} = 훾 D ({푦 = 2푧, 푧 ≥ 0}) | B ( n ) r : = B ( n −1) 푟 Figure 1. The drift function When the domain of 푟 can be deduced from the context and some elements of the domain are left unchanged, we only write the elements modified by 푟. For example: Definition 3.3 (Concretization of NPRs). We define the con-   cretization function on NPRs by: 훾 D(V) {푥 = 푦,푦 ≥ } = 훾 D(V) ({푧 = 푦,푦 ≥ })  퐸 env 푃  Rename 2 0 2 0 NPR ⇓ is well defined ∧ [푥↦→푧] 훾 ((푐, 푃)) = 퐸 (S) 퐸 ⇓env 푃 ∈ 훾 D ((푐, 푃)) 3 Numeric Path Relations Notice that for any inconsistent 푃 (e.g. {(푥, @퐴), (푥, @퐵)}), We propose a way to extend any numeric abstract domain 퐷 we have 훾 NPR ((푐, 푃)) = ∅, no matter the constraint set 푐. from numeric variables to variables of algebraic data types. Consider for example the drift function (Figure1), that The key intuition is to add more structure to variable names manipulates values of a type 휏1 = [퐴 → N; 퐵 → N]. The using paths: projecting structured data along paths allows first branch of drift is exactly abstracted by constraint set to get numeric values that can be abstracted using 퐷. The 푐 = {(푟, @퐴) = (푥, @퐴) + 1, (푛, 휀) = (푥, @퐴)} and extended main challenge is to treat sum types in a sound way. Indeed, paths 푃 = {(푟, @퐴), (푥, @퐴), (푛, 휀)}, for which we have projecting a value of a sum type conveys implicit information 훾 NPR ((푐, 푃)) = {[푥 ↦→ 퐴(푘);푟 ↦→ 퐴(푘 + 1);푛 ↦→ 푘] | 푘 ∈ R}. about which constructor was used to build that value. For numeric abstract domains, it may not necessarily make sense to intersect or unite abstract values that talk about Definition 3.1 (Extended variable). An extended variable is different sets of variables. Nevertheless, for NPRs, it might S = V × a pair of a variable and a path. We call Paths the be the case that two abstract values talk about the same set of set of all extended variables. variables, but projected on different paths. We hence need to Definition 3.2 (Numeric Path Relation). We call numeric give precise meaning to comparing, intersecting or uniting path relation (NPR) any abstract value from a numeric do- NPRs with different sets of extended variables. 퐷 푐, 푃 main (S) on extended variables, i.e., any pair ( ) of Definition 3.4 (Inclusion for NPRs). We define the inclu- 퐷 (S) A = P(Constraints (퐷(S))) × P(S). NPR NPR sion ⊑ between two NPRs such that (푐1, 푃1) ⊑ (푐2, 푃2) (S) NPR NPR NPR (푐 , 푃 ) ⊑퐷 ((푐 , 푃 )) 푃 ⊆ 푃 In the rest of this section, we define 훾 , ⊑ , ∪ and holds iff 1 1 Add푃1 2 2 and 2 1. NPR ∩ , that confer to NPRs the structure of relational abstract Definition 3.5 (Intersection and union for NPRs). domain over values of algebraic data types. (푐, 푃) ∩NPR (푐 ′, 푃 ′) = Add(푐, 푃) ∩퐷 (S) Add(푐 ′, 푃 ′) To design a static analysis, we need to abstractly repre- 푃′ 푃 sent sets of states, denoting the possible environments from (푐, 푃) ∪NPR (푐 ′, 푃 ′) = Remove(푐, 푃) ∪퐷 (S) Remove(푐 ′, 푃 ′) ′ ′ variables to values, that are reachable at a given program 푃\푃 푃 \푃 point. To do this, we exploit projection on paths, that allow NPRs can inherit a widening from the underlying numeric to go from algebraic values to scalar values. abstract domain. Nevertheless, handling recursive algebraic Given an environment 퐸 that maps variables from a set 푉 types will require to significantly adapt this widening. We to algebraic values, and given a set 푃 ∈ S of variable-path defer the presentation of a widening for NPRs to their adap- pairs, we define the projection of 퐸 over 푃 as the function tation to recursive types, in future work. A major difficulty when defining an abstract domain for 퐸 ⇓env 푃 : 푃 → R algebraic values is the fact that constructors of a sum type (푥, 푝) ↦→ 퐸(푥) ⇓val 푝 are mutually exclusive. NPRs handle this through the ‘well- The function 퐸 ⇓env 푃 can be seen as a numeric environ- defined’ condition on environment projection when defining ment on extended variables. Notice that 퐸 ⇓env 푃 is defined concretization. Thus, a single NPR refers to at most one con- only if 퐸(푥) ⇓val 푝 is defined for every (푥, 푝) ∈ 푃. For ex- structor for each value of a sum type. For a more precise ample, for the value hearts_queen from Section 2.1, [푥 ↦→ analysis of programs with pattern matching, we use finite hearts_queen] ⇓env{(푥, .suit@Clubs)} is not well defined. In sets of NPRs, denoting the disjunction of their concretiza- particular, if there are paths in 푃 that try to project the same tions. For example, the final state of the drift function (Fig- value over different constructors of a sum type, then 퐸 ⇓env 푃 ure1) is exactly abstracted by the set of NPRs {{(푟, @퐴) = is never well-defined, independently of the environment 퐸. (푥, @퐴) + 1}, {(푟, @퐵) = (푥, @퐵) − 1}}. This use of disjunc- E.g., for 푃 = {(푥, .suit@Clubs), (푥, .suit@Hearts)}, there is tive completion is guided by the program’s pattern-matches no 퐸 such that 퐸 ⇓env 푃 is well-defined. In such a case we say and limited by the types of variables. We defer the precise that 푃 is inconsistent. presentation of finite disjunctions to a future venue.

14 NSAD ’20, November 17, 2020, Virtual, USA Santiago Bautista, Thomas Jensen, and Benoît Montagu

4 Soundness and Optimality Results it in 푦. A sound abstraction of this operation is We have proven the required soundness properties for the S# ⟦y ← { f1 = x1 ; ... ; fn = xn }⟧(푐, 푃) = operations of the NPR abstract domain, as long as the opera- 푐, 푃 NPR ÑNPR 푦, .푓 푝 푥 , 푝 RemoveP|y ( ) ∩ (푥 ,푝) ∈푃,푖 ∈퐼,푥 ≠푦 {( 푖 ) = ( 푖 )} tions on the chosen numerical domain 퐷 are already sound. 푖 푖 푐, 푃 The proof of theorems 4.1 to 4.4 can be found in [2]. In this formula, RemoveP|y ( ) forgets all information about 푦, and intersecting with the values {(푦, .푓푖푝) = (푥푖, 푝)} Theorem 4.1 (Pre-order). ⊑NPR The relation is a pre-order: copies knowledge about 푥푖푝 (for 푥푖 ≠ 푦) into knowledge it is reflexive and transitive. about 푦.푓푖푝. For simplicity, we do not give a more precise 푥 푦 Theorem 4.2 (Soundness of the pre-order). The relation abstraction, that keeps information about the 푖 equal to . ⊑NPR is sound, i.e. the concretization is monotonic for ⊑NPR. For NPR NPR NPR 5.3 Sum Type Creation any NPRs 푎1 and 푎2, if 푎1 ⊑ 푎2, then 훾 (푎1) ⊆ 훾 (푎2). We write y ← A(x) for the operation that stores the variant Theorem 4.3 (Soundness of union and intersection). The 퐴(푥) in the variable 푦. Again, the instruction x ← A(x) re- union and intersection over NPRs over-approximate their coun- quires a different abstraction but poses no major difficulty, 푎 푎 terpart operations on sets. For any NPRs 1 and 2, so we concentrate on the case 푥 ≠ 푦. A sound abstraction is • 훾 NPR (푎 ) ∪ 훾 NPR (푎 ) ⊆ 훾 NPR 푎 ∪NPR 푎 , and 1 2 1 2 S# ⟦ ← ⟧(푐, 푃) = • 훾 NPR (푎 ) ∩ 훾 NPR (푎 ) ⊆ 훾 NPR 푎 ∩NPR 푎  y A(x) 1 2 1 2 . 푐, 푃 NPR ÑNPR 푦, 퐴푝 푥, 푝 RemoveP|y ( ) ∩ (푥,푝) ∈푃 {( @ ) = ( )} Moreover, we have proved that our definition of abstract 푦 intersection does not cause any loss of information: it is I.e., we forget what we knew about , then we copy what we 푥 푦 퐴 optimal, provided that ∩퐷 (S) is already optimal. knew about into knowledge about @ . Theorem 4.4 (Optimality of intersection). If ∩퐷 (S) is op- 5.4 Pattern Matching 푑 푑 timal (i.e., for every two numeric abstract values 1 and 2 We write match x with A1(y1) → b1| ... |An(yn) → bn 훾 D(S) 푑 훾 D(S) 푑 훾 D(S) 푑 퐷 (S) 푑  we have ( 1) ∩ ( 2) = 1 ∩ 2 ) then for pattern matching on the variable 푥 and executing the NPR 푎 푎 ∩ is optimal as well: for any two NPRs 1 and 2, we have branch 푏푖 when 푥 = 퐴푖 (푦푖 ), i.e. destructing a sum type. We NPR NPR NPR NPR  훾 (푎1) ∩ 훾 (푎2) = 훾 푎1 ∩ 푎2 . call (푎, 푃) the abstract value before the match statement. Again, we only consider the case where 푥 ≠ 푦푖 . A sound 5 Transfer Functions for Algebraic Types abstraction at the beginning of the 푖-th branch is given by We now give an overview of transfer functions for operations NPR ÑNPR RemoveP|y (푐, 푃) ∩ {(푦푖, 푝) = (푥, @퐴푖푝)} over algebraic data types. They all involve extended paths. i (푥,@퐴푖 푝) ∈푃 Following the notation of Miné [13], for any operation 푠 The abstract value {(푦푖, 푝) = (푥, @퐴푖푝)} copies any knowl- # of the analyzed language we denote its abstraction by S ⟦푠⟧. edge about 푥@퐴푖 to 푦푖 and simultaneously asserts that the As explained at the end of Section3, we work with finite dis- variable 푥 can only be in the case 퐴푖 . junctions of NPRs. For simplicity, however, we only present Once the abstract values 푎푖 for each branch 푖 of the pat- the different S# ⟦푠⟧ for a single NPR. We consider a language tern have been computed, the abstract union of the values 푎 with no mutable objects, hence no aliasing problems. Remove푦푖 푖 , is a sound abstraction of the states that are accessible at the end of the pattern matching construct. 5.1 Field Access We write y ← x. f for the operation that accesses field f of 6 Related Work variable 푥 and stores it in variable 푦. We assume that 푥 ≠ 푦. Our approach is completely parametric with respect to the The abstraction for x ← x. f is easier and is elided due to numeric domain 퐷 that one chooses to extend. This is possi- space constraints. A sound abstraction for this operation is ble, as soon as operations on 퐷 for name management from S# ⟦y ← x. f⟧(푐, 푃) = Section 2.3 are available. Those operations are present in 푐, 푃 NPR ÑNPR 푦, 푝 푥, .푓 푝 the Apron library [7]. Two such domains are the polyhedra RemoveP|y ( ) ∩ ( ) ∈ {( ) = ( )} 푥,.푓 푝 푃 [4] and the octagons [12] domains. We presented relational 푣, 푝 푃 푣 푦 where P|y = {( ) ∈ | = } is the subset of P of ex- domains as subsets of 푉 → R, following Miné [12]. This 푦 푐, 푃 tended paths starting with . In this formula, RemoveP|y ( ) presentation is also exposed as the “level 1” of the 푦 forgets about the previous value of and intersecting with Apron library [7]. Relational domains can also be presented {(푦, 푝) (푥, .푓 푝)} the values = copies what we know about as subsets of R푛, i.e. using dimensions instead of variable 푥.푓 푦 into knowledge about . names, as done in the “level 0” interface of the Apron library. Modern static analyzers, such as Astrée [3], handle C 5.2 Product Type Creation records by flattening and C unions following the method- We write y ← { f1 = x1 ; ... ; fn = xn } for the opera- ology of Miné [11]. Analyzing algebraic types is a different 푖 ∈{1,...,푛} tion that creates a new record {푓푖 → 푥푖 } and stores problem. Union types à la C often include a tag field so that

15 Numeric Domains Meet Algebraic Data Types NSAD ’20, November 17, 2020, Virtual, USA a programmer can discriminate between the cases of a union. [2] Santiago Bautista, Thomas Jensen, and Benoît Montagu. 2020. Com- The cases of a sum type, in contrast, are pairwise disjoint bining static correlation analysis with relational numeric domains. In- by design. Moreover, C allows type casts in arbitrary places. ternship report. ENS Rennes. https://sbautista.fr/internship_report_ 2020.pdf This is required for unions to work: to transition from a [3] Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, to one of its cases, a cast must be performed. Antoine Miné, David Monniaux, and Xavier Rival. 2005. The ASTREÉ Languages with algebraic types do not need to support arbi- Analyzer. In Programming Languages and Systems, 14th European Sym- trary casts, thanks to the pattern matching construct. With posium on Programming,ESOP 2005, Held as Part of the Joint European their cast-free semantics, algebraic types are thus easier to Conferences on Theory and Practice of Software, ETAPS 2005, Edinburgh, UK, April 4-8, 2005, Proceedings. 21–30. analyze than unions. Finally, algebraic types are immutable [4] Patrick Cousot and Nicolas Halbwachs. 1978. Automatic Discovery of structures, so keeping track of aliases is not necessary. Linear Restraints among Variables of a Program. In Proceedings of the Previous work on relational analysis for algebraic data 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming types has been done by Andreescu et al. [1] who introduced Languages (Tucson, Arizona) (POPL ’78). Association for Computing a domain called correlations. Correlations can express equal- Machinery, New York, NY, USA, 84–96. [5] Alain Deutsch. 2003. Static verification of dynamic properties. In between parts of different algebraic values, whether ACM ities SIGAda 2003 Conference. those parts are numeric or not. In contrast, the NPR abstract [6] Jacob M. Howe and Andy King. 2009. Logahedra: A New Weakly domain can express numeric relations that go beyond equal- Relational Domain. In Automated Technology for Verification and Anal- ity, but only for the numeric parts of different algebraic values. ysis, 7th International Symposium, ATVA 2009, Macao, China, October Taking the reduced product of correlations and NPRs would 14-16, 2009. Proceedings (Lecture Notes in , Vol. 5799), Zhiming Liu and Anders P. Ravn (Eds.). Springer, 306–320. allow to combine the expressiveness of both approaches. [7] Bertrand Jeannet and Antoine Miné. 2009. Apron: A Library of Nu- In a different research community, refinement types [15– merical Abstract Domains for Static Analysis. In Computer Aided Ver- 17] have been introduced as a way to make types more ex- ification, Ahmed Bouajjani and Oded Maler (Eds.). Springer Berlin pressive, by embedding logical expressions in the syntax of Heidelberg, Berlin, Heidelberg, 661–667. types. In particular, they can express relational properties [8] Gerwin Klein, June Andronick, Kevin Elphinstone, Toby Murray, Thomas Sewell, Rafal Kolanski, and Gernot Heiser. 2014. Comprehen- on scalars. This approach is not based on abstract domains, sive Formal Verification of an OS Microkernel. ACM Trans. Comput. however: analyzers with refinement types generate logical Syst. 32, 1, Article 2 (Feb. 2014), 70 pages. constraints that are discharged by SMT solvers. [9] Francesco Logozzo and Manuel Fähndrich. 2010. Pentagons: A weakly relational abstract domain for the efficient validation of array accesses. 7 Conclusion and Future Work Sci. Comput. Program. 75, 9 (2010), 796–807. [10] Antoine Miné. 2001. A New Numerical Abstract Domain Based on We have proposed the abstract domain of numeric path re- Difference-Bound Matrices. In Programs as Data Objects, Second Sym- lations, that denotes relations over structured values that posium, PADO 2001, Aarhus, Denmark, May 21-23, 2001, Proceedings embed scalars. The construction is parameterized over an (Lecture Notes in Computer Science, Vol. 2053), Olivier Danvy and An- arbitrary relational domain on scalars, hence the expressive- drzej Filinski (Eds.). Springer, 155–172. [11] Antoine Miné. 2006. Field-Sensitive Value Analysis of Embedded C ness of the resulting domain can be adapted. A key ingredient Programs with Union Types and Pointer Arithmetics. In Proceedings is the notion of projection path that points inside some part of the 2006 ACM SIGPLAN/SIGBED Conference on Language, Compilers, of a structured value. Handling paths becomes subtle in the and Tool Support for Embedded Systems (Ottawa, Ontario, Canada) presence of sum types, because their cases are mutually exlu- (LCTES ’06). ACM, New York, NY, USA, 54–63. sive. We have proved the correctness of the operations for [12] Antoine Miné. 2006. The octagon abstract domain. High. Order Symb. Comput. 19, 1 (2006), 31–100. the resulting domain, and proved some relative optimality [13] Antoine Miné. 2017. Tutorial on Static Inference of Numeric Invariants results. We also define transfer functions of operations on by Abstract Interpretation. Found. Trends Program. Lang. 4, 3-4 (2017), algebraic values, showing that our domain can be used to 120–372. analyze programs handling algebraic data types and scalars. [14] Benoît Montagu and Thomas Jensen. 2020. Stable Relations and Ab- In future work, we want to implement NPRs to assess the stract Interpretation of Higher-Order Programs. PACMPL 4, ICFP, Article 119 (Aug. 2020), 30 pages. expressiveness of the domain and appreciate its practical [15] Patrick M. Rondon, Ming Kawaguci, and Ranjit Jhala. 2008. Liquid scalability. Moreover, we will extend NPRs to handle recur- Types. In Proceedings of the 29th ACM SIGPLAN Conference on Pro- sive algebraic types—such as lists or trees. This challenging gramming Language Design and Implementation (PLDI ’08). ACM, New extension will require to define an adapted widening, and York, NY, USA, 159–169. will greatly expand the range of programs we can analyze. [16] Nikhil Swamy, Joel Weinberger, Cole Schlesinger, Juan Chen, and Benjamin Livshits. 2013. Verifying higher-order programs with the Finally, combining NPRs with the recent extension of corre- Dijkstra monad. In Proceedings of the 34th ACM SIGPLAN conference lations to higher-order functions [14] is yet to be studied. on design and implementation - PLDI '13. ACM Press. References [17] Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones. 2014. Refinement Types for Haskell. In Proceedings of [1] Oana Fabiana Andreescu, Thomas Jensen, Stéphane Lescuyer, and the 19th ACM SIGPLAN International Conference on Functional Program- Benoît Montagu. 2019. Inferring frame conditions with static correla- ming - ICFP '14. ACM Press. tion analysis. PACMPL 3, POPL (2019), 47:1–47:29.

16