Soundness and Completeness 15-150: Principles of Functional Programming – Lecture 17

Giselle Reis

A regular expression is a finite way to describe potentially infinite sets of strings in such a way that it is possible to decide whether a string belongs to the set or not. In other words, it defines a pattern for strings. If $r$ is a regular expression, the set of strings which match its pattern is called $r$'s language and is denoted by $L(r)$. Regular expressions can be built in many ways; we will use five basic constructors for our examples:

1. a single character ($c$);
2. the empty string ($1$);
3. concatenation of regular expressions ($r_1 r_2$);
4. alternation of regular expressions ($r_1 + r_2$);
5. the Kleene star ($r^*$).

The language of a regular expression constructed from each of those operators can be inductively defined as follows:

$$
\begin{aligned}
L(c) &= \{c\} \\
L(1) &= \{\varepsilon\} \\
L(r_1 r_2) &= \{ s_1 s_2 \mid s_1 \in L(r_1) \text{ and } s_2 \in L(r_2) \} \\
L(r_1 + r_2) &= \{ s \mid s \in L(r_1) \text{ or } s \in L(r_2) \} \\
L(r^*) &= \{\varepsilon\} \cup \{ s_1 s_2 \mid s_1 \in L(r) \text{ and } s_2 \in L(r^*) \}
\end{aligned}
$$

Using these definitions (and continuations) we have defined a function that decides whether a string belongs to the language of a regular expression:

    datatype regex =
        Char of char
      | One
      | Times of regex * regex
      | Plus of regex * regex
      | Star of regex;

    fun match (Char c) s k =
          (case s of
              c' :: l => c = c' andalso k l
            | [] => false)
      | match One s k = k s
      | match (Times (r1, r2)) s k =
          match r1 s (fn rest => match r2 rest k)
      | match (Plus (r1, r2)) s k =
          match r1 s k orelse match r2 s k
      | match (Star r) s k =
          k s orelse
          match r s (fn rest => rest <> s andalso match (Star r) rest k);

    fun accept r s = match r (String.explode s) (fn rest => rest = []);

But how do we know this function is correct? First of all we need to define what it means to be correct. We want our implementation to give no false positives and no false negatives. In other words, accept r s evaluates to true iff $s \in L(r)$. This can be split into two desiderata:

1. If $s \in L(r)$ then accept r s = true.
2. If $s \notin L(r)$ then accept r s = false.

Desideratum 1 is called completeness.
Intuitively, it states that the function accept is complete, in the sense that it will accept every $s$ in the language $L(r)$. Desideratum 2 is called soundness. Intuitively, it states that accept is sound, i.e., it will not return true for some $s$ that is not in the language. Take a while to think about these definitions. What happens if our function is sound but not complete? What if it is complete but not sound?

Showing the correctness of accept amounts to showing it is sound and complete. For our proofs, it will be easier if we state everything in terms of positive results. So by using the contrapositive¹ of 2, we have:

1. If $s \in L(r)$ then accept r s = true.
2. If accept r s = true then $s \in L(r)$.

If we want to be really precise, modifying the statements this way requires a proof that the function accept is total: the contrapositive of 2 literally speaks about accept r s not being false, and this is the same as being true only if accept r s always evaluates to a value. We will simply assume this for now and overlook this detail (but it is good to know!).

Our function accept calls match with the same r and s but an additional parameter, a continuation k. Let's state the properties we want to prove in terms of match then. For simplicity we are seamlessly converting char lists into strings and vice versa.

Theorem 1 (Completeness). For every r : regex, s : char list and k : char list -> bool, if there exist p : char list and u : char list such that $s = pu$, $p \in L(r)$ and k u = true, then match r s k = true.

Initially one could think that a simple structural induction on $r$ would do the job (given the way the function is defined). Although this works for most cases, it will fail for $r^*$, since one of the recursive calls is performed on the same regex. For this reason we need to consider also the string $s$ (the thing that gets shorter when the recursive call with $r^*$ is made). This is achieved by performing a proof by lexicographic induction on the pair $(r, s)$. This means that we can apply the induction hypothesis either on $r' < r$ (a proper subexpression of $r$), or on $r' = r$ and $s' < s$ (a proper suffix of $s$).

About quantifiers

The interpretation of quantifiers depends on whether the statement is an assumption or a proof goal.
If we are using a fact $\forall x. P(x)$, then we can safely replace $x$ by any concrete object we want, since we know (by assumption) that it holds for any $x$. On the other hand, if we are proving a fact $\forall x. P(x)$, then we need to keep this $x$ generic if we want to actually prove it. The treatment of existential quantifiers is dual. In the proof below, quantified variables therefore fall into two groups: those we can specialize (the variables of an induction hypothesis) and those that need to remain generic (the variables of the statement being proved).

¹ The contrapositive of $A \to B$ is $\neg B \to \neg A$.

Proof. The proof proceeds by lexicographic induction on the pair $(r, s)$. There will be two base cases ($1$, $c$) and three inductive cases ($r_1 r_2$, $r_1 + r_2$, $r^*$).

• Base cases:

1. $r = 1$

   To show: if $s = pu$, $p \in L(1)$ and k u = true, then match One s k = true.

   Assumptions:
   H1  $s = pu$
   H2  $p \in L(1) = \{\varepsilon\}$
   H3  k u = true

   According to the definition of match:

       | match One s k = k s

   From H2 and H1, $p = \varepsilon$ and $s = u$. Since k u = true from H3, then k s = true. Therefore match One s k = true.

2. $r = c$

   To show: if $s = pu$, $p \in L(c)$ and k u = true, then match (Char c) s k = true.

   Assumptions:
   H1  $s = pu$
   H2  $p \in L(c) = \{c\}$
   H3  k u = true

   According to the definition of match:

       fun match (Char c) s k =
             (case s of
                 c' :: l => c = c' andalso k l
               | [] => false)

   From H2 and H1, $p = c$ and s = c :: u. As s is not empty, the evaluation of the Char case falls on the first branch c' :: l, where c = c' and l = u. This reduces to c = c andalso k u. The first conjunct is trivially true and the second one holds from H3. Therefore match (Char c) s k = true.

• Inductive cases:

1. $r = r_1 r_2$

   IH1: if $s = pu$, $p \in L(r_1)$ and k u = true, then match r1 s k = true.
   IH2: if $s = pu$, $p \in L(r_2)$ and k u = true, then match r2 s k = true.

   To show: if $s = pu$, $p \in L(r_1 r_2)$ and k u = true, then match (Times (r1, r2)) s k = true.
   Assumptions:
   H1  $s = pu$
   H2  $p \in L(r_1 r_2)$
   H3  k u = true

   From the definition of $L(r_1 r_2)$ and H2, we derive:
   H4  $p = p_1 p_2$
   H5  $p_1 \in L(r_1)$
   H6  $p_2 \in L(r_2)$

   Instantiating IH2 with the string $p_2 u$ in place of $s$, H6 and H3 give us the necessary assumptions, and we obtain:
   H7  match r2 (p2 @ u) k = true

   To apply IH1, we can choose $s = p_1 (p_2 u)$, and from H5 we have the second assumption $p_1 \in L(r_1)$. But we still need a continuation k' such that k' (p2 @ u) = true. We do have such a function from H7! So we can simply define k' = (fn rest => match r2 rest k). Now we can apply IH1 and we obtain:
   H8  match r1 (p1 @ (p2 @ u)) (fn rest => match r2 rest k) = true

   This is precisely the body of match for the Times case:

       | match (Times (r1, r2)) s k =
           match r1 s (fn rest => match r2 rest k)

   Since $s = p_1 p_2 u$ by H1 and H4, we conclude match (Times (r1, r2)) s k = true.

2. $r = r_1 + r_2$

   IH1: if $s = pu$, $p \in L(r_1)$ and k u = true, then match r1 s k = true.
   IH2: if $s = pu$, $p \in L(r_2)$ and k u = true, then match r2 s k = true.

   To show: if $s = pu$, $p \in L(r_1 + r_2)$ and k u = true, then match (Plus (r1, r2)) s k = true.

   Assumptions:
   H1  $s = pu$
   H2  $p \in L(r_1 + r_2)$
   H3  k u = true

   According to the definition of match:

       | match (Plus (r1, r2)) s k =
           match r1 s k orelse match r2 s k

   From the definition of $L(r_1 + r_2)$ and H2 we derive that $p \in L(r_1)$ or $p \in L(r_2)$. We have two sub-cases:

   (a) H4  $p \in L(r_1)$
       We can use H1, H4 and H3 to satisfy the assumptions of IH1 and thus obtain:
       H5  match r1 s k = true
       Therefore match (Plus (r1, r2)) s k = true because of the first disjunct in the body.

   (b) H4  $p \in L(r_2)$
       We can use H1, H4 and H3 to satisfy the assumptions of IH2 and thus obtain:
       H5  match r2 s k = true
       Therefore match (Plus (r1, r2)) s k = true because of the second disjunct in the body.

3. $r = r'^*$

   IH1: if $s = pu$, $p \in L(r')$ and k u = true, then match r' s k = true.
   IH2: for all $s' < s$, if $s' = pu$, $p \in L(r'^*)$ and k u = true, then match (Star r') s' k = true.

   To show: if $s = pu$, $p \in L(r'^*)$ and k u = true, then match (Star r') s k = true.

   Assumptions:
   H1  $s = pu$
   H2  $p \in L(r'^*)$
   H3  k u = true

   From H2 we know that either $p = \varepsilon$ or $p = p_1 p_2$ such that $p_1 \in L(r')$ and $p_2 \in L(r'^*)$.
   We will consider the two cases:

   (a) $p = \varepsilon$: In this case $s = u$ and k s = k u = true (from H3), therefore the first disjunct in the body of match for the Star case holds, and match (Star r') s k = true.
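Sub-case (a) can be checked on a concrete instance. The sketch below restates the matcher from the beginning of these notes so that it is self-contained, and then evaluates it on two small examples. The names a, u, k, ab, kc, starEmpty and timesSplit, and the sample strings, are hypothetical and chosen only for illustration; they do not come from the lecture. Running examples does not replace the proof; it only shows one witness $(p, u)$ for each case.

```sml
(* Matcher restated from the notes so that this sketch is self-contained. *)
datatype regex =
    Char of char
  | One
  | Times of regex * regex
  | Plus of regex * regex
  | Star of regex

fun match (Char c) s k =
      (case s of
          c' :: l => c = c' andalso k l
        | [] => false)
  | match One s k = k s
  | match (Times (r1, r2)) s k =
      match r1 s (fn rest => match r2 rest k)
  | match (Plus (r1, r2)) s k =
      match r1 s k orelse match r2 s k
  | match (Star r) s k =
      k s orelse
      match r s (fn rest => rest <> s andalso match (Star r) rest k)

fun accept r s = match r (String.explode s) (fn rest => rest = [])

(* Hypothetical instance of Star sub-case (a): r' = a, p = "", s = u = "b". *)
val a = Char #"a"
val u = String.explode "b"
val k = fn rest => rest = u             (* k u = true by construction *)
val starEmpty = match (Star a) u k      (* true: the first disjunct k s holds *)

(* Hypothetical instance of the Times case: p1 = "a", p2 = "b", u = "c". *)
val ab = Times (Char #"a", Char #"b")
val kc = fn rest => rest = String.explode "c"
val timesSplit = match ab (String.explode "abc") kc   (* true *)
```

With these definitions, accept (Star a) "aaa" evaluates to true and accept (Star a) "ab" evaluates to false, which agrees with the two desiderata for this particular regular expression.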
