Functional Data Structures

Tobias Nipkow

February 20, 2021

Abstract A collection of verified functional data structures. The emphasis is on conciseness of algorithms and succinctness of proofs, more in the style of a textbook than a library of efficient algorithms. For more details see [13].

Contents

1 Sorting 4

2 Creating Almost Complete Trees 13

3 Three-Way Comparison 19

4 Lists Sorted wrt < 20

5 List Insertion and Deletion 21

6 Specifications of ADT 24

7 Unbalanced Implementation of Set 26

8 Update and Deletion 29

9 Specifications of Map ADT 32

10 Unbalanced Tree Implementation of Map 34

11 Augmented Tree (Tree2) 36

12 Function isin for Tree2 37

13 Interval Trees 37

14 AVL Tree Implementation of Sets 44

1 15 Function lookup for Tree2 54

16 AVL Tree Implementation of Maps 55

17 AVL Tree with Balance Factors (1) 59

18 AVL Tree with Balance Factors (2) 63

19 Height-Balanced Trees 68

20 Red-Black Trees 77

21 Red-Black Tree Implementation of Sets 78

22 Alternative Deletion in Red-Black Trees 85

23 Red-Black Tree Implementation of Maps 89

24 2-3 Trees 91

25 2-3 Tree Implementation of Sets 92

26 2-3 Tree Implementation of Maps 101

27 2-3 Tree from List 105

28 2-3-4 Trees 108

29 2-3-4 Tree Implementation of Sets 109

30 2-3-4 Tree Implementation of Maps 122

31 1-2 Brother Tree Implementation of Sets 126

32 1-2 Brother Tree Implementation of Maps 139

33 AA Tree Implementation of Sets 144

34 AA Tree Implementation of Maps 156

35 Join-Based Implementation of Sets 161

36 Join-Based Implementation of Sets via RBTs 169

37 Braun Trees 176

38 Arrays via Braun Trees 180

2 39 via Functions 195

40 Tries via Search Trees 197

41 Binary Tries and Patricia Tries 200

42 Queue Specification 206

43 Queue Implementation via 2 Lists 207

44 Specifications 209

45 Heaps 210

46 Leftist 212

47 217

48 Time functions for various operations 232

49 The Median-of-Medians Selection Algorithm 233

50 Bibliographic Notes 258

3 1 Sorting theory Sorting imports Complex Main HOL−Library.Multiset begin hide const List.insort declare Let def [simp]

1.1 Insertion Sort fun insort :: 0a::linorder ⇒ 0a list ⇒ 0a list where insort x [] = [x] | insort x (y#ys) = (if x ≤ y then x#y#ys else y#(insort x ys)) fun isort :: 0a::linorder list ⇒ 0a list where isort [] = [] | isort (x#xs) = insort x (isort xs)

1.1.1 Functional Correctness lemma mset insort: mset (insort x xs) = {#x#} + mset xs apply(induction xs) apply auto done lemma mset isort: mset (isort xs) = mset xs apply(induction xs) apply simp apply (simp add: mset insort) done lemma set insort: set (insort x xs) = {x} ∪ set xs by(simp add: mset insort flip: set mset mset) lemma sorted insort: sorted (insort a xs) = sorted xs apply(induction xs) apply(auto simp add: set insort) done lemma sorted isort: sorted (isort xs)

4 apply(induction xs) apply(auto simp: sorted insort) done

1.1.2 We count the number of function calls. insort x [] = [x] insort x (y#ys) = (if x ≤ y then x#y#ys else y#(insort x ys)) fun T insort :: 0a::linorder ⇒ 0a list ⇒ nat where T insort x [] = 1 | T insort x (y#ys) = (if x ≤ y then 0 else T insort x ys) + 1 isort [] = [] isort (x#xs) = insort x (isort xs) fun T isort :: 0a::linorder list ⇒ nat where T isort [] = 1 | T isort (x#xs) = T isort xs + T insort x (isort xs) + 1 lemma T insort length: T insort x xs ≤ length xs + 1 apply(induction xs) apply auto done lemma length insort: length (insort x xs) = length xs + 1 apply(induction xs) apply auto done lemma length isort: length (isort xs) = length xs apply(induction xs) apply (auto simp: length insort) done lemma T isort length: T isort xs ≤ (length xs + 1 ) ˆ 2 proof(induction xs) case Nil show ?case by simp next case (Cons x xs) have T isort (x#xs) = T isort xs + T insort x (isort xs) + 1 by simp also have ... ≤ (length xs + 1 ) ˆ 2 + T insort x (isort xs) + 1 using Cons.IH by simp

5 also have ... ≤ (length xs + 1 ) ˆ 2 + length xs + 1 + 1 using T insort length[of x isort xs] by (simp add: length isort) also have ... ≤ (length(x#xs) + 1 ) ˆ 2 by (simp add: power2 eq square) finally show ?case . qed

1.2 Merge Sort fun merge :: 0a::linorder list ⇒ 0a list ⇒ 0a list where merge [] ys = ys | merge xs [] = xs | merge (x#xs)(y#ys) = (if x ≤ y then x # merge xs (y#ys) else y # merge (x#xs) ys) fun msort :: 0a::linorder list ⇒ 0a list where msort xs = (let n = length xs in if n ≤ 1 then xs else merge (msort (take (n div 2 ) xs)) (msort (drop (n div 2 ) xs))) declare msort.simps [simp del]

1.2.1 Functional Correctness lemma mset merge: mset(merge xs ys) = mset xs + mset ys by(induction xs ys rule: merge.induct) auto lemma mset msort: mset (msort xs) = mset xs proof(induction xs rule: msort.induct) case (1 xs) let ?n = length xs let ?ys = take (?n div 2 ) xs let ?zs = drop (?n div 2 ) xs show ?case proof cases assume ?n ≤ 1 thus ?thesis by(simp add: msort.simps[of xs]) next assume ¬ ?n ≤ 1 hence mset (msort xs) = mset (msort ?ys) + mset (msort ?zs) by(simp add: msort.simps[of xs] mset merge) also have ... = mset ?ys + mset ?zs using h¬ ?n ≤ 1 i by(simp add: 1 .IH ) also have ... = mset (?ys @ ?zs) by (simp del: append take drop id)

6 also have ... = mset xs by simp finally show ?thesis . qed qed Via the previous lemma or directly: lemma set merge: set(merge xs ys) = set xs ∪ set ys by (metis mset merge set mset mset set mset union) lemma set(merge xs ys) = set xs ∪ set ys by(induction xs ys rule: merge.induct)(auto) lemma sorted merge: sorted (merge xs ys) ←→ (sorted xs ∧ sorted ys) by(induction xs ys rule: merge.induct)(auto simp: set merge) lemma sorted msort: sorted (msort xs) proof(induction xs rule: msort.induct) case (1 xs) let ?n = length xs show ?case proof cases assume ?n ≤ 1 thus ?thesis by(simp add: msort.simps[of xs] sorted01 ) next assume ¬ ?n ≤ 1 thus ?thesis using 1 .IH by(simp add: sorted merge msort.simps[of xs]) qed qed

1.2.2 Time Complexity We only count the number of comparisons between list elements. fun C merge :: 0a::linorder list ⇒ 0a list ⇒ nat where C merge [] ys = 0 | C merge xs [] = 0 | C merge (x#xs)(y#ys) = 1 + (if x ≤ y then C merge xs (y#ys) else C merge (x#xs) ys) lemma C merge ub: C merge xs ys ≤ length xs + length ys by (induction xs ys rule: C merge.induct) auto fun C msort :: 0a::linorder list ⇒ nat where C msort xs =

7 (let n = length xs; ys = take (n div 2 ) xs; zs = drop (n div 2 ) xs in if n ≤ 1 then 0 else C msort ys + C msort zs + C merge (msort ys)(msort zs)) declare C msort.simps [simp del] lemma length merge: length(merge xs ys) = length xs + length ys apply (induction xs ys rule: merge.induct) apply auto done lemma length msort: length(msort xs) = length xs proof (induction xs rule: msort.induct) case (1 xs) show ?case by (auto simp: msort.simps [of xs] 1 length merge) qed Why structured proof? To have the name ”xs” to specialize msort.simps with xs to ensure that msort.simps cannot be used recursively. Also works without this precaution, but that is just luck. lemma C msort le: length xs = 2ˆk =⇒ C msort xs ≤ k ∗ 2ˆk proof(induction k arbitrary: xs) case 0 thus ?case by (simp add: C msort.simps) next case (Suc k) let ?n = length xs let ?ys = take (?n div 2 ) xs let ?zs = drop (?n div 2 ) xs show ?case proof (cases ?n ≤ 1 ) case True thus ?thesis by(simp add: C msort.simps) next case False have C msort(xs) = C msort ?ys + C msort ?zs + C merge (msort ?ys)(msort ?zs) by (simp add: C msort.simps msort.simps) also have ... ≤ C msort ?ys + C msort ?zs + length ?ys + length ?zs using C merge ub[of msort ?ys msort ?zs] length msort[of ?ys] length msort[of ?zs] by arith

8 also have ... ≤ k ∗ 2ˆk + C msort ?zs + length ?ys + length ?zs using Suc.IH [of ?ys] Suc.prems by simp also have ... ≤ k ∗ 2ˆk + k ∗ 2ˆk + length ?ys + length ?zs using Suc.IH [of ?zs] Suc.prems by simp also have ... = 2 ∗ k ∗ 2ˆk + 2 ∗ 2 ˆ k using Suc.prems by simp finally show ?thesis by simp qed qed lemma C msort log: length xs = 2ˆk =⇒ C msort xs ≤ length xs ∗ log 2 (length xs) using C msort le[of xs k] apply (simp add: log nat power algebra simps) by (metis (mono tags) numeral power eq of nat cancel iff of nat le iff of nat mult)

1.3 Bottom-Up Merge Sort fun merge adj :: ( 0a::linorder) list list ⇒ 0a list list where merge adj [] = [] | merge adj [xs] = [xs] | merge adj (xs # ys # zss) = merge xs ys # merge adj zss For the termination proof of merge all below. lemma length merge adjacent[simp]: length (merge adj xs) = (length xs + 1 ) div 2 by (induction xs rule: merge adj .induct) auto fun merge all :: ( 0a::linorder) list list ⇒ 0a list where merge all [] = [] | merge all [xs] = xs | merge all xss = merge all (merge adj xss) definition msort bu :: ( 0a::linorder) list ⇒ 0a list where msort bu xs = merge all (map (λx. [x]) xs)

1.3.1 Functional Correctness abbreviation mset mset :: 0a list list ⇒ 0a multiset where P mset mset xss ≡ # (image mset mset (mset xss)) lemma mset merge adj : mset mset (merge adj xss) = mset mset xss by(induction xss rule: merge adj .induct)(auto simp: mset merge)

9 lemma mset merge all: mset (merge all xss) = mset mset xss by(induction xss rule: merge all.induct)(auto simp: mset merge mset merge adj ) lemma mset msort bu: mset (msort bu xs) = mset xs by(simp add: msort bu def mset merge all multiset.map comp comp def ) lemma sorted merge adj : ∀ xs ∈ set xss. sorted xs =⇒ ∀ xs ∈ set (merge adj xss). sorted xs by(induction xss rule: merge adj .induct)(auto simp: sorted merge) lemma sorted merge all: ∀ xs ∈ set xss. sorted xs =⇒ sorted (merge all xss) apply(induction xss rule: merge all.induct) using [[simp depth limit=3 ]] by (auto simp add: sorted merge adj ) lemma sorted msort bu: sorted (msort bu xs) by(simp add: msort bu def sorted merge all)

1.3.2 Time Complexity fun C merge adj :: ( 0a::linorder) list list ⇒ nat where C merge adj [] = 0 | C merge adj [xs] = 0 | C merge adj (xs # ys # zss) = C merge xs ys + C merge adj zss fun C merge all :: ( 0a::linorder) list list ⇒ nat where C merge all [] = 0 | C merge all [xs] = 0 | C merge all xss = C merge adj xss + C merge all (merge adj xss) definition C msort bu :: ( 0a::linorder) list ⇒ nat where C msort bu xs = C merge all (map (λx. [x]) xs) lemma length merge adj : [[ even(length xss); ∀ xs ∈ set xss. length xs = m ]] =⇒ ∀ xs ∈ set (merge adj xss). length xs = 2 ∗m by(induction xss rule: merge adj .induct)(auto simp: length merge) lemma C merge adj : ∀ xs ∈ set xss. length xs = m =⇒ C merge adj xss ≤ m ∗ length xss proof(induction xss rule: C merge adj .induct) case 1 thus ?case by simp next

10 case 2 thus ?case by simp next case (3 x y) thus ?case using C merge ub[of x y] by (simp add: alge- bra simps) qed lemma C merge all: [[ ∀ xs ∈ set xss. length xs = m; length xss = 2ˆk ]] =⇒ C merge all xss ≤ m ∗ k ∗ 2ˆk proof (induction xss arbitrary: k m rule: C merge all.induct) case 1 thus ?case by simp next case 2 thus ?case by simp next case (3 xs ys xss) let ?xss = xs # ys # xss let ?xss2 = merge adj ?xss obtain k 0 where k 0: k = Suc k 0 using 3 .prems(2 ) by (metis length Cons nat.inject nat power eq Suc 0 iff nat.exhaust) have even (length ?xss) using 3 .prems(2 ) k 0 by auto from length merge adj [OF this 3 .prems(1 )] have ∗: ∀ x ∈ set(merge adj ?xss). length x = 2 ∗m . have ∗∗: length ?xss2 = 2 ˆ k 0 using 3 .prems(2 ) k 0 by auto have C merge all ?xss = C merge adj ?xss + C merge all ?xss2 by simp also have ... ≤ m ∗ 2ˆk + C merge all ?xss2 using 3 .prems(2 ) C merge adj [OF 3 .prems(1 )] by (auto simp: alge- bra simps) also have ... ≤ m ∗ 2ˆk + (2 ∗m) ∗ k 0 ∗ 2ˆk 0 using 3 .IH [OF ∗ ∗∗] by simp also have ... = m ∗ k ∗ 2ˆk using k 0 by (simp add: algebra simps) finally show ?case . qed corollary C msort bu: length xs = 2 ˆ k =⇒ C msort bu xs ≤ k ∗ 2 ˆ k using C merge all[of map (λx. [x]) xs 1 ] by (simp add: C msort bu def )

1.4 Quicksort fun quicksort :: ( 0a::linorder) list ⇒ 0a list where quicksort [] = [] | quicksort (x#xs) = quicksort (filter (λy. y < x) xs)@[x]@ quicksort (filter (λy. x ≤ y) xs) lemma mset quicksort: mset (quicksort xs) = mset xs

11 apply (induction xs rule: quicksort.induct) apply (auto simp: not le) done lemma set quicksort: set (quicksort xs) = set xs by(rule mset eq setD[OF mset quicksort]) lemma sorted quicksort: sorted (quicksort xs) apply (induction xs rule: quicksort.induct) apply (auto simp add: sorted append set quicksort) done

1.5 Insertion Sort w.r.t. Keys and Stability Note that insort key is already defined in theory HOL.List. Thus some of the lemmas are already present as well. fun isort key :: ( 0a ⇒ 0k::linorder) ⇒ 0a list ⇒ 0a list where isort key f [] = [] | isort key f (x # xs) = insort key f x (isort key f xs)

1.5.1 Standard functional correctness lemma mset insort key: mset (insort key f x xs) = {#x#} + mset xs by(induction xs) simp all lemma mset isort key: mset (isort key f xs) = mset xs by(induction xs)(simp all add: mset insort key) lemma set isort key: set (isort key f xs) = set xs by (rule mset eq setD[OF mset isort key]) lemma sorted insort key: sorted (map f (insort key f a xs)) = sorted (map f xs) by(induction xs)(auto simp: set insort key) lemma sorted isort key: sorted (map f (isort key f xs)) by(induction xs)(simp all add: sorted insort key)

1.5.2 Stability lemma insort is Cons: ∀ x∈set xs. f a ≤ f x =⇒ insort key f a xs = a # xs by (cases xs) auto

12 lemma filter insort key neg: ¬ P x =⇒ filter P (insort key f x xs) = filter P xs by (induction xs) simp all lemma filter insort key pos: sorted (map f xs) =⇒ P x =⇒ filter P (insort key f x xs) = insort key f x (filter P xs) by (induction xs)(auto, subst insort is Cons, auto) lemma sort key stable: filter (λy. f y = k)(isort key f xs) = filter (λy. f y = k) xs proof (induction xs) case Nil thus ?case by simp next case (Cons a xs) thus ?case proof (cases f a = k) case False thus ?thesis by (simp add: Cons.IH filter insort key neg) next case True have filter (λy. f y = k)(isort key f (a # xs)) = filter (λy. f y = k)(insort key f a (isort key f xs)) by simp also have ... = insort key f a (filter (λy. f y = k)(isort key f xs)) by (simp add: True filter insort key pos sorted isort key) also have ... = insort key f a (filter (λy. f y = k) xs) by (simp add: Cons.IH ) also have ... = a #(filter (λy. f y = k) xs) by(simp add: True insort is Cons) also have ... = filter (λy. f y = k)(a # xs) by (simp add: True) finally show ?thesis . qed qed end

2 Creating Almost Complete Trees theory Balance imports HOL−Library.Tree Real begin fun bal :: nat ⇒ 0a list ⇒ 0a tree ∗ 0a list where

13 bal n xs = (if n=0 then (Leaf ,xs) else (let m = n div 2 ; (l, ys) = bal m xs; (r, zs) = bal (n−1 −m)(tl ys) in (Node l (hd ys) r, zs))) declare bal.simps[simp del] declare Let def [simp] definition bal list :: nat ⇒ 0a list ⇒ 0a tree where bal list n xs = fst (bal n xs) definition balance list :: 0a list ⇒ 0a tree where balance list xs = bal list (length xs) xs definition bal tree :: nat ⇒ 0a tree ⇒ 0a tree where bal tree n t = bal list n (inorder t) definition balance tree :: 0a tree ⇒ 0a tree where balance tree t = bal tree (size t) t lemma bal simps: bal 0 xs = (Leaf , xs) n > 0 =⇒ bal n xs = (let m = n div 2 ; (l, ys) = bal m xs; (r, zs) = bal (n−1 −m)(tl ys) in (Node l (hd ys) r, zs)) by(simp all add: bal.simps) lemma bal inorder: [[ n ≤ length xs; bal n xs = (t,zs) ]] =⇒ xs = inorder t @ zs ∧ size t = n proof(induction n arbitrary: xs t zs rule: less induct) case (less n) show ?case proof cases assume n = 0 thus ?thesis using less.prems by (simp add: bal simps) next assume [arith]: n 6= 0 let ?m = n div 2 let ?m 0 = n − 1 − ?m from less.prems(2 ) obtain l r ys where b1 : bal ?m xs = (l,ys) and b2 : bal ?m 0 (tl ys) = (r,zs) and

14 t: t = hl, hd ys, ri by(auto simp: bal simps split: prod.splits) have IH1 : xs = inorder l @ ys ∧ size l = ?m using b1 less.prems(1 ) by(intro less.IH ) auto have IH2 : tl ys = inorder r @ zs ∧ size r = ?m 0 using b2 IH1 less.prems(1 ) by(intro less.IH ) auto show ?thesis using t IH1 IH2 less.prems(1 ) hd Cons tl[of ys] by fastforce qed qed corollary inorder bal list[simp]: n ≤ length xs =⇒ inorder(bal list n xs) = take n xs unfolding bal list def by (metis (mono tags) prod.collapse[of bal n xs] append eq conv conj bal inorder length inorder) corollary inorder balance list[simp]: inorder(balance list xs) = xs by(simp add: balance list def ) corollary inorder bal tree: n ≤ size t =⇒ inorder(bal tree n t) = take n (inorder t) by(simp add: bal tree def ) corollary inorder balance tree[simp]: inorder(balance tree t) = inorder t by(simp add: balance tree def inorder bal tree) The length/size lemmas below do not require the precondition n ≤ length xs (or n ≤ size t) that they come with. They could take advantage of the fact that bal xs n yields a result even if length xs < n. In that case the result will contain one or more occurrences of hd []. However, this is counter-intuitive and does not reflect the execution in an eager functional language. lemma bal length: [[ n ≤ length xs; bal n xs = (t,zs) ]] =⇒ length zs = length xs − n using bal inorder by fastforce corollary size bal list[simp]: n ≤ length xs =⇒ size(bal list n xs) = n unfolding bal list def using bal inorder prod.exhaust sel by blast corollary size balance list[simp]: size(balance list xs) = length xs by (simp add: balance list def ) corollary size bal tree[simp]: n ≤ size t =⇒ size(bal tree n t) = n by(simp add: bal tree def )

15 corollary size balance tree[simp]: size(balance tree t) = size t by(simp add: balance tree def ) lemma min height bal: [[ n ≤ length xs; bal n xs = (t,zs) ]] =⇒ min height t = nat(blog 2 (n + 1 )c) proof(induction n arbitrary: xs t zs rule: less induct) case (less n) show ?case proof cases assume n = 0 thus ?thesis using less.prems(2 ) by (simp add: bal simps) next assume [arith]: n 6= 0 let ?m = n div 2 let ?m 0 = n − 1 − ?m from less.prems obtain l r ys where b1 : bal ?m xs = (l,ys) and b2 : bal ?m 0 (tl ys) = (r,zs) and t: t = hl, hd ys, ri by(auto simp: bal simps split: prod.splits) let ?hl = nat (floor(log 2 (?m + 1 ))) let ?hr = nat (floor(log 2 (?m 0 + 1 ))) have IH1 : min height l = ?hl using less.IH [OF b1 ] less.prems(1 ) by simp have IH2 : min height r = ?hr using less.prems(1 ) bal length[OF b1 ] b2 by(intro less.IH ) auto have (n+1 ) div 2 ≥ 1 by arith hence 0 : log 2 ((n+1 ) div 2 ) ≥ 0 by simp have ?m 0 ≤ ?m by arith hence le: ?hr ≤ ?hl by(simp add: nat mono floor mono) have min height t = min ?hl ?hr + 1 by (simp add: t IH1 IH2 ) also have ... = ?hr + 1 using le by (simp add: min absorb2 ) also have ?m 0 + 1 = (n+1 ) div 2 by linarith also have nat (floor(log 2 ((n+1 ) div 2 ))) + 1 = nat (floor(log 2 ((n+1 ) div 2 ) + 1 )) using 0 by linarith also have ... = nat (floor(log 2 (n + 1 ))) using floor log2 div2 [of n+1 ] by (simp add: log mult) finally show ?thesis . qed qed lemma height bal: [[ n ≤ length xs; bal n xs = (t,zs) ]] =⇒ height t = nat dlog 2 (n + 1 )e

16 proof(induction n arbitrary: xs t zs rule: less induct) case (less n) show ?case proof cases assume n = 0 thus ?thesis using less.prems by (simp add: bal simps) next assume [arith]: n 6= 0 let ?m = n div 2 let ?m 0 = n − 1 − ?m from less.prems obtain l r ys where b1 : bal ?m xs = (l,ys) and b2 : bal ?m 0 (tl ys) = (r,zs) and t: t = hl, hd ys, ri by(auto simp: bal simps split: prod.splits) let ?hl = nat dlog 2 (?m + 1 )e let ?hr = nat dlog 2 (?m 0 + 1 )e have IH1 : height l = ?hl using less.IH [OF b1 ] less.prems(1 ) by simp have IH2 : height r = ?hr using b2 bal length[OF b1 ] less.prems(1 ) by(intro less.IH ) auto have 0 : log 2 (?m + 1 ) ≥ 0 by simp have ?m 0 ≤ ?m by arith hence le: ?hr ≤ ?hl by(simp add: nat mono ceiling mono del: nat ceiling le eq) have height t = max ?hl ?hr + 1 by (simp add: t IH1 IH2 ) also have ... = ?hl + 1 using le by (simp add: max absorb1 ) also have ... = nat dlog 2 (?m + 1 ) + 1 e using 0 by linarith also have ... = nat dlog 2 (n + 1 )e using ceiling log2 div2 [of n+1 ] by (simp) finally show ?thesis . qed qed lemma acomplete bal: assumes n ≤ length xs bal n xs = (t,ys) shows acomplete t unfolding acomplete def using height bal[OF assms] min height bal[OF assms] by linarith lemma height bal list: n ≤ length xs =⇒ height (bal list n xs) = nat dlog 2 (n + 1 )e unfolding bal list def by (metis height bal prod.collapse) lemma height balance list: height (balance list xs) = nat dlog 2 (length xs + 1 )e

17 by (simp add: balance list def height bal list) corollary height bal tree: n ≤ size t =⇒ height (bal tree n t) = natdlog 2 (n + 1 )e unfolding bal list def bal tree def by (metis bal list def height bal list length inorder) corollary height balance tree: height (balance tree t) = natdlog 2 (size t + 1 )e by (simp add: bal tree def balance tree def height bal list) corollary acomplete bal list[simp]: n ≤ length xs =⇒ acomplete (bal list n xs) unfolding bal list def by (metis acomplete bal prod.collapse) corollary acomplete balance list[simp]: acomplete (balance list xs) by (simp add: balance list def ) corollary acomplete bal tree[simp]: n ≤ size t =⇒ acomplete (bal tree n t) by (simp add: bal tree def ) corollary acomplete balance tree[simp]: acomplete (balance tree t) by (simp add: balance tree def ) lemma wbalanced bal: [[ n ≤ length xs; bal n xs = (t,ys) ]] =⇒ wbalanced t proof(induction n arbitrary: xs t ys rule: less induct) case (less n) show ?case proof cases assume n = 0 thus ?thesis using less.prems(2 ) by(simp add: bal simps) next assume [arith]: n 6= 0 with less.prems obtain l ys r zs where rec1 : bal (n div 2 ) xs = (l, ys) and rec2 : bal (n − 1 − n div 2 )(tl ys) = (r, zs) and t: t = hl, hd ys, ri by(auto simp add: bal simps split: prod.splits) have l: wbalanced l using less.IH [OF rec1 ] less.prems(1 ) by linarith have wbalanced r using rec1 rec2 bal length[OF rec1 ] less.prems(1 ) by(intro less.IH ) auto with l t bal length[OF rec1 ] less.prems(1 ) bal inorder[OF rec1 ] bal inorder[OF rec2 ]

18 show ?thesis by auto qed qed An alternative proof via wbalanced ?t =⇒ acomplete ?t: lemma [[ n ≤ length xs; bal n xs = (t,ys) ]] =⇒ acomplete t by(rule acomplete if wbalanced[OF wbalanced bal]) lemma wbalanced bal list[simp]: n ≤ length xs =⇒ wbalanced (bal list n xs) by(simp add: bal list def )(metis prod.collapse wbalanced bal) lemma wbalanced balance list[simp]: wbalanced (balance list xs) by(simp add: balance list def ) lemma wbalanced bal tree[simp]: n ≤ size t =⇒ wbalanced (bal tree n t) by(simp add: bal tree def ) lemma wbalanced balance tree: wbalanced (balance tree t) by (simp add: balance tree def ) hide const (open) bal end

3 Three-Way Comparison theory Cmp imports Main begin datatype cmp val = LT | EQ | GT definition cmp :: 0a:: linorder ⇒ 0a ⇒ cmp val where cmp x y = (if x < y then LT else if x=y then EQ else GT ) lemma LT [simp]: cmp x y = LT ←→ x < y and EQ[simp]: cmp x y = EQ ←→ x = y and GT [simp]: cmp x y = GT ←→ x > y by (auto simp: cmp def ) lemma case cmp if [simp]: (case c of EQ ⇒ e | LT ⇒ l | GT ⇒ g) = (if c = LT then l else if c = GT then g else e) by(simp split: cmp val.split)

19 end

4 Lists Sorted wrt < theory Sorted Less imports Less False begin hide const sorted Is a list sorted without duplicates, i.e., wrt

20 lemma sorted mid iff 0: NO MATCH [] ys =⇒ sorted(xs @ y # ys) = (sorted(xs @[y]) ∧ sorted(y # ys)) by(rule sorted mid iff ) lemmas sorted lems = sorted mid iff 0 sorted mid iff2 sorted cons 0 sorted snoc 0 Splay trees need two additional sorted lemmas: lemma sorted snoc le: ASSUMPTION (sorted(xs @[x])) =⇒ x ≤ y =⇒ sorted (xs @[y]) by (auto simp add: sorted wrt append ASSUMPTION def ) lemma sorted Cons le: ASSUMPTION (sorted(x # xs)) =⇒ y ≤ x =⇒ sorted (y # xs) by (auto simp add: sorted wrt Cons ASSUMPTION def ) end

5 List Insertion and Deletion theory List Ins Del imports Sorted Less begin

5.1 Elements in a list lemma sorted Cons iff : sorted(x # xs) = ((∀ y ∈ set xs. x < y) ∧ sorted xs) by(simp add: sorted wrt Cons) lemma sorted snoc iff : sorted(xs @[x]) = (sorted xs ∧ (∀ y ∈ set xs. y < x)) by(simp add: sorted wrt append) lemmas isin simps = sorted mid iff 0 sorted Cons iff sorted snoc iff

5.2 Inserting into an ordered list without duplicates: fun ins list :: 0a::linorder ⇒ 0a list ⇒ 0a list where ins list x [] = [x] | ins list x (a#xs) = (if x < a then x#a#xs else if x=a then a#xs else a # ins list x xs)

21 lemma set ins list: set (ins list x xs) = set xs ∪ {x} by(induction xs) auto lemma sorted ins list: sorted xs =⇒ sorted(ins list x xs) by(induction xs rule: induct list012 ) auto lemma ins list sorted: sorted (xs @[a]) =⇒ ins list x (xs @ a # ys) = (if x < a then ins list x xs @(a#ys) else xs @ ins list x (a#ys)) by(induction xs)(auto simp: sorted lems) In principle, sorted (?xs @[?a]) =⇒ ins list ?x (?xs @ ?a # ?ys) = (if ?x < ?a then ins list ?x ?xs @ ?a # ?ys else ?xs @ ins list ?x (?a # ?ys)) suffices, but the following two corollaries speed up proofs. corollary ins list sorted1 : sorted (xs @[a]) =⇒ a ≤ x =⇒ ins list x (xs @ a # ys) = xs @ ins list x (a#ys) by(auto simp add: ins list sorted) corollary ins list sorted2 : sorted (xs @[a]) =⇒ x < a =⇒ ins list x (xs @ a # ys) = ins list x xs @(a#ys) by(auto simp: ins list sorted) lemmas ins list simps = sorted lems ins list sorted1 ins list sorted2 Splay trees need two additional ins list lemmas: lemma ins list Cons: sorted (x # xs) =⇒ ins list x xs = x # xs by (induction xs) auto lemma ins list snoc: sorted (xs @[x]) =⇒ ins list x xs = xs @[x] by(induction xs)(auto simp add: sorted mid iff2 )

5.3 Delete one occurrence of an element from a list: fun del list :: 0a ⇒ 0a list ⇒ 0a list where del list x [] = [] | del list x (a#xs) = (if x=a then xs else a # del list x xs) lemma del list idem: x ∈/ set xs =⇒ del list x xs = xs by (induct xs) simp all lemma set del list: sorted xs =⇒ set (del list x xs) = set xs − {x} by(induct xs)(auto simp: sorted Cons iff )

22 lemma sorted del list: sorted xs =⇒ sorted(del list x xs) apply(induction xs rule: induct list012 ) apply auto by (meson order.strict trans sorted Cons iff ) lemma del list sorted: sorted (xs @ a # ys) =⇒ del list x (xs @ a # ys) = (if x < a then del list x xs @ a # ys else xs @ del list x (a # ys)) by(induction xs) (fastforce simp: sorted lems sorted Cons iff intro!: del list idem)+ In principle, sorted (?xs @ ?a # ?ys) =⇒ del list ?x (?xs @ ?a # ?ys) = (if ?x < ?a then del list ?x ?xs @ ?a # ?ys else ?xs @ del list ?x (?a # ?ys)) suffices, but the following corollaries speed up proofs. corollary del list sorted1 : sorted (xs @ a # ys) =⇒ a ≤ x =⇒ del list x (xs @ a # ys) = xs @ del list x (a # ys) by (auto simp: del list sorted) corollary del list sorted2 : sorted (xs @ a # ys) =⇒ x < a =⇒ del list x (xs @ a # ys) = del list x xs @ a # ys by (auto simp: del list sorted) corollary del list sorted3 : sorted (xs @ a # ys @ b # zs) =⇒ x < b =⇒ del list x (xs @ a # ys @ b # zs) = del list x (xs @ a # ys)@ b # zs by (auto simp: del list sorted sorted lems) corollary del list sorted4 : sorted (xs @ a # ys @ b # zs @ c # us) =⇒ x < c =⇒ del list x (xs @ a # ys @ b # zs @ c # us) = del list x (xs @ a # ys @ b # zs)@ c # us by (auto simp: del list sorted sorted lems) corollary del list sorted5 : sorted (xs @ a # ys @ b # zs @ c # us @ d # vs) =⇒ x < d =⇒ del list x (xs @ a # ys @ b # zs @ c # us @ d # vs) = del list x (xs @ a # ys @ b # zs @ c # us)@ d # vs by (auto simp: del list sorted sorted lems) lemmas del list simps = sorted lems del list sorted1 del list sorted2 del list sorted3 del list sorted4

23 del list sorted5 Splay trees need two additional del list lemmas: lemma del list notin Cons: sorted (x # xs) =⇒ del list x xs = xs by(induction xs)(fastforce simp: sorted Cons iff )+ lemma del list sorted app: sorted(xs @[x]) =⇒ del list x (xs @ ys) = xs @ del list x ys by (induction xs)(auto simp: sorted mid iff2 ) end

6 Specifications of Set ADT theory Set Specs imports List Ins Del begin The basic set interface with traditional set-based specification: locale Set = fixes empty :: 0s fixes insert :: 0a ⇒ 0s ⇒ 0s fixes delete :: 0a ⇒ 0s ⇒ 0s fixes isin :: 0s ⇒ 0a ⇒ bool fixes set :: 0s ⇒ 0a set fixes invar :: 0s ⇒ bool assumes set empty: set empty = {} assumes set isin: invar s =⇒ isin s x = (x ∈ set s) assumes set insert: invar s =⇒ set(insert x s) = set s ∪ {x} assumes set delete: invar s =⇒ set(delete x s) = set s − {x} assumes invar empty: invar empty assumes invar insert: invar s =⇒ invar(insert x s) assumes invar delete: invar s =⇒ invar(delete x s) lemmas (in Set) set specs = set empty set isin set insert set delete invar empty invar insert invar delete The basic set interface with inorder-based specification: locale Set by Ordered = fixes empty :: 0t fixes insert :: 0a::linorder ⇒ 0t ⇒ 0t fixes delete :: 0a ⇒ 0t ⇒ 0t fixes isin :: 0t ⇒ 0a ⇒ bool fixes inorder :: 0t ⇒ 0a list

24 fixes inv :: 0t ⇒ bool assumes inorder empty: inorder empty = [] assumes isin: inv t ∧ sorted(inorder t) =⇒ isin t x = (x ∈ set (inorder t)) assumes inorder insert: inv t ∧ sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) assumes inorder delete: inv t ∧ sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) assumes inorder inv empty: inv empty assumes inorder inv insert: inv t ∧ sorted(inorder t) =⇒ inv(insert x t) assumes inorder inv delete: inv t ∧ sorted(inorder t) =⇒ inv(delete x t) begin It implements the traditional specification: definition set :: 0t ⇒ 0a set where set = List.set o inorder definition invar :: 0t ⇒ bool where invar t = (inv t ∧ sorted (inorder t)) sublocale Set empty insert delete isin set invar proof(standard, goal cases) case 1 show ?case by (auto simp: inorder empty set def ) next case 2 thus ?case by(simp add: isin invar def set def ) next case 3 thus ?case by(simp add: inorder insert set ins list set def in- var def ) next case (4 s x) thus ?case by (auto simp: inorder delete set del list invar def set def ) next case 5 thus ?case by(simp add: inorder empty inorder inv empty in- var def ) next case 6 thus ?case by(simp add: inorder insert inorder inv insert sorted ins list invar def ) next case 7 thus ?case by (auto simp: inorder delete inorder inv delete sorted del list invar def ) qed

25 end Set2 = Set with binary operations: locale Set2 = Set where insert = insert for insert :: 0a ⇒ 0s ⇒ 0s + fixes union :: 0s ⇒ 0s ⇒ 0s fixes inter :: 0s ⇒ 0s ⇒ 0s fixes diff :: 0s ⇒ 0s ⇒ 0s assumes set union: [[ invar s1 ; invar s2 ]] =⇒ set(union s1 s2 ) = set s1 ∪ set s2 assumes set inter: [[ invar s1 ; invar s2 ]] =⇒ set(inter s1 s2 ) = set s1 ∩ set s2 assumes set diff : [[ invar s1 ; invar s2 ]] =⇒ set(diff s1 s2 ) = set s1 − set s2 assumes invar union: [[ invar s1 ; invar s2 ]] =⇒ invar(union s1 s2 ) assumes invar inter: [[ invar s1 ; invar s2 ]] =⇒ invar(inter s1 s2 ) assumes invar diff : [[ invar s1 ; invar s2 ]] =⇒ invar(diff s1 s2 ) end

7 Unbalanced Tree Implementation of Set theory Tree Set imports HOL−Library.Tree Cmp Set Specs begin definition empty :: 0a tree where empty = Leaf fun isin :: 0a::linorder tree ⇒ 0a ⇒ bool where isin Leaf x = False | isin (Node l a r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ isin r x) hide const (open) insert fun insert :: 0a::linorder ⇒ 0a tree ⇒ 0a tree where insert x Leaf = Node Leaf x Leaf |

26 insert x (Node l a r) = (case cmp x a of LT ⇒ Node (insert x l) a r | EQ ⇒ Node l a r | GT ⇒ Node l a (insert x r)) Deletion by replacing: fun split min :: 0a tree ⇒ 0a ∗ 0a tree where split min (Node l a r) = (if l = Leaf then (a,r) else let (x,l 0) = split min l in (x, Node l 0 a r)) fun delete :: 0a::linorder ⇒ 0a tree ⇒ 0a tree where delete x Leaf = Leaf | delete x (Node l a r) = (case cmp x a of LT ⇒ Node (delete x l) a r | GT ⇒ Node l a (delete x r) | EQ ⇒ if r = Leaf then l else let (a 0,r 0) = split min r in Node l a 0 r 0) Deletion by joining: fun join :: ( 0a::linorder)tree ⇒ 0a tree ⇒ 0a tree where join t Leaf = t | join Leaf t = t | join (Node t1 a t2 )(Node t3 b t4 ) = (case join t2 t3 of Leaf ⇒ Node t1 a (Node Leaf b t4 ) | Node u2 x u3 ⇒ Node (Node t1 a u2 ) x (Node u3 b t4 )) fun delete2 :: 0a::linorder ⇒ 0a tree ⇒ 0a tree where delete2 x Leaf = Leaf | delete2 x (Node l a r) = (case cmp x a of LT ⇒ Node (delete2 x l) a r | GT ⇒ Node l a (delete2 x r) | EQ ⇒ join l r)

7.1 Functional Correctness Proofs lemma isin set: sorted(inorder t) =⇒ isin t x = (x ∈ set (inorder t)) by (induction t)(auto simp: isin simps) lemma inorder insert: sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) by(induction t)(auto simp: ins list simps)

27 lemma split minD: split min t = (x,t 0) =⇒ t 6= Leaf =⇒ x # inorder t 0 = inorder t by(induction t arbitrary: t 0 rule: split min.induct) (auto simp: sorted lems split: prod.splits if splits) lemma inorder delete: sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(induction t)(auto simp: del list simps split minD split: prod.splits) interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = λ . True proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) qed (rule TrueI )+ lemma inorder join: inorder(join l r) = inorder l @ inorder r by(induction l r rule: join.induct)(auto split: tree.split) lemma inorder delete2 : sorted(inorder t) =⇒ inorder(delete2 x t) = del list x (inorder t) by(induction t)(auto simp: inorder join del list simps) interpretation S2 : Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete2 and inorder = inorder and inv = λ . True proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set) next case 3 thus ?case by(simp add: inorder insert) next

28 case 4 thus ?case by(simp add: inorder delete2 ) qed (rule TrueI )+ end

8 Association List Update and Deletion theory AList Upd Del imports Sorted Less begin abbreviation sorted1 ps ≡ sorted(map fst ps) Define own map of function to avoid pulling in an unknown amount of lemmas implicitly (via the simpset). hide const (open) map of fun map of :: ( 0a∗ 0b)list ⇒ 0a ⇒ 0b option where map of [] = (λx. None) | map of ((a,b)#ps) = (λx. if x=a then Some b else map of ps x) Updating an association list: fun upd list :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) list ⇒ ( 0a∗ 0b) list where upd list x y [] = [(x,y)] | upd list x y ((a,b)#ps) = (if x < a then (x,y)#(a,b)#ps else if x = a then (x,y)#ps else (a,b)# upd list x y ps) fun del list :: 0a::linorder ⇒ ( 0a∗ 0b)list ⇒ ( 0a∗ 0b)list where del list x [] = [] | del list x ((a,b)#ps) = (if x = a then ps else (a,b)# del list x ps)

8.1 Lemmas for map of lemma map of ins list: map of (upd list x y ps) = (map of ps)(x := Some y) by(induction ps) auto lemma map of append: map of (ps @ qs) x = (case map of ps x of None ⇒ map of qs x | Some y ⇒ Some y) by(induction ps)(auto) lemma map of None: sorted (x # map fst ps) =⇒ map of ps x = None by (induction ps)(fastforce simp: sorted lems sorted wrt Cons)+

29 lemma map of None2 : sorted (map fst ps @[x]) =⇒ map of ps x = None by (induction ps)(auto simp: sorted lems) lemma map of del list: sorted1 ps =⇒ map of (del list x ps) = (map of ps)(x := None) by(induction ps)(auto simp: map of None sorted lems fun eq iff ) lemma map of sorted Cons: sorted (a # map fst ps) =⇒ x < a =⇒ map of ps x = None by (simp add: map of None sorted Cons le) lemma map of sorted snoc: sorted (map fst ps @[a]) =⇒ a ≤ x =⇒ map of ps x = None by (simp add: map of None2 sorted snoc le) lemmas map of sorteds = map of sorted Cons map of sorted snoc lemmas map of simps = sorted lems map of append map of sorteds

8.2 Lemmas for upd list lemma sorted upd list: sorted1 ps =⇒ sorted1 (upd list x y ps) apply(induction ps) apply simp apply(case tac ps) apply auto done lemma upd list sorted: sorted1 (ps @ [(a,b)]) =⇒ upd list x y (ps @(a,b)# qs) = (if x < a then upd list x y ps @(a,b)# qs else ps @ upd list x y ((a,b)# qs)) by(induction ps)(auto simp: sorted lems) In principle, sorted1 (?ps @ [(?a, ?b)]) =⇒ upd list ?x ?y (?ps @(?a, ?b)# ?qs) = (if ?x < ?a then upd list ?x ?y ?ps @(?a, ?b)# ?qs else ?ps @ upd list ?x ?y ((?a, ?b)# ?qs)) suffices, but the following two corollaries speed up proofs. corollary upd list sorted1 : [[ sorted (map fst ps @[a]); x < a ]] =⇒ upd list x y (ps @(a,b)# qs) = upd list x y ps @(a,b)# qs by (auto simp: upd list sorted) corollary upd list sorted2 : [[ sorted (map fst ps @[a]); a ≤ x ]] =⇒ upd list x y (ps @(a,b)# qs) = ps @ upd list x y ((a,b)# qs)

30 by (auto simp: upd list sorted) lemmas upd list simps = sorted lems upd list sorted1 upd list sorted2 Splay trees need two additional upd list lemmas: lemma upd list Cons: sorted1 ((x,y)# xs) =⇒ upd list x y xs = (x,y)# xs by (induction xs) auto lemma upd list snoc: sorted1 (xs @ [(x,y)]) =⇒ upd list x y xs = xs @ [(x,y)] by(induction xs)(auto simp add: sorted mid iff2 )

8.3 Lemmas for del list lemma sorted del list: sorted1 ps =⇒ sorted1 (del list x ps) apply(induction ps) apply simp apply(case tac ps) apply (auto simp: sorted Cons le) done lemma del list idem: x ∈/ set(map fst xs) =⇒ del list x xs = xs by (induct xs) auto lemma del list sorted: sorted1 (ps @(a,b)# qs) =⇒ del list x (ps @(a,b)# qs) = (if x < a then del list x ps @(a,b)# qs else ps @ del list x ((a,b)# qs)) by(induction ps) (fastforce simp: sorted lems sorted wrt Cons intro!: del list idem)+ In principle, sorted1 (?ps @(?a, ?b)# ?qs) =⇒ del list ?x (?ps @(?a, ?b)# ?qs) = (if ?x < ?a then del list ?x ?ps @(?a, ?b)# ?qs else ?ps @ del list ?x ((?a, ?b)# ?qs)) suffices, but the following corollaries speed up proofs. corollary del list sorted1 : sorted1 (xs @(a,b)# ys) =⇒ a ≤ x =⇒ del list x (xs @(a,b)# ys) = xs @ del list x ((a,b)# ys) by (auto simp: del list sorted) lemma del list sorted2 : sorted1 (xs @(a,b)# ys) =⇒ x < a =⇒ del list x (xs @(a,b)# ys) = del list x xs @(a,b)# ys by (auto simp: del list sorted)

31 lemma del list sorted3 : sorted1 (xs @(a,a 0)# ys @(b,b 0)# zs) =⇒ x < b =⇒ del list x (xs @(a,a 0)# ys @(b,b 0)# zs) = del list x (xs @(a,a 0)# ys) @(b,b 0)# zs by (auto simp: del list sorted sorted lems) lemma del list sorted4 : sorted1 (xs @(a,a 0)# ys @(b,b 0)# zs @(c,c 0)# us) =⇒ x < c =⇒ del list x (xs @(a,a 0)# ys @(b,b 0)# zs @(c,c 0)# us) = del list x (xs @(a,a 0)# ys @(b,b 0)# zs)@(c,c 0)# us by (auto simp: del list sorted sorted lems) lemma del list sorted5 : sorted1 (xs @(a,a 0)# ys @(b,b 0)# zs @(c,c 0)# us @(d,d 0)# vs) =⇒ x < d =⇒ del list x (xs @(a,a 0)# ys @(b,b 0)# zs @(c,c 0)# us @(d,d 0)# vs) = del list x (xs @(a,a 0)# ys @(b,b 0)# zs @(c,c 0)# us)@(d,d 0)# vs by (auto simp: del list sorted sorted lems) lemmas del list simps = sorted lems del list sorted1 del list sorted2 del list sorted3 del list sorted4 del list sorted5 Splay trees need two additional del list lemmas: lemma del list notin Cons: sorted (x # map fst xs) =⇒ del list x xs = xs by(induction xs)(fastforce simp: sorted wrt Cons)+ lemma del list sorted app: sorted(map fst xs @[x]) =⇒ del list x (xs @ ys) = xs @ del list x ys by (induction xs)(auto simp: sorted mid iff2 ) end

9 Specifications of Map ADT theory Map Specs imports AList Upd Del begin The basic map interface with traditional set-based specification:

32 locale Map = fixes empty :: 0m fixes update :: 0a ⇒ 0b ⇒ 0m ⇒ 0m fixes delete :: 0a ⇒ 0m ⇒ 0m fixes lookup :: 0m ⇒ 0a ⇒ 0b option fixes invar :: 0m ⇒ bool assumes map empty: lookup empty = (λ . None) and map update: invar m =⇒ lookup(update a b m) = (lookup m)(a := Some b) and map delete: invar m =⇒ lookup(delete a m) = (lookup m)(a := None) and invar empty: invar empty and invar update: invar m =⇒ invar(update a b m) and invar delete: invar m =⇒ invar(delete a m) lemmas (in Map) map specs = map empty map update map delete invar empty invar update invar delete The basic map interface with inorder-based specification: locale Map by Ordered = fixes empty :: 0t fixes update :: 0a::linorder ⇒ 0b ⇒ 0t ⇒ 0t fixes delete :: 0a ⇒ 0t ⇒ 0t fixes lookup :: 0t ⇒ 0a ⇒ 0b option fixes inorder :: 0t ⇒ ( 0a ∗ 0b) list fixes inv :: 0t ⇒ bool assumes inorder empty: inorder empty = [] and inorder lookup: inv t ∧ sorted1 (inorder t) =⇒ lookup t a = map of (inorder t) a and inorder update: inv t ∧ sorted1 (inorder t) =⇒ inorder(update a b t) = upd list a b (inorder t) and inorder delete: inv t ∧ sorted1 (inorder t) =⇒ inorder(delete a t) = del list a (inorder t) and inorder inv empty: inv empty and inorder inv update: inv t ∧ sorted1 (inorder t) =⇒ inv(update a b t) and inorder inv delete: inv t ∧ sorted1 (inorder t) =⇒ inv(delete a t) begin It implements the traditional specification: definition invar :: 0t ⇒ bool where invar t == inv t ∧ sorted1 (inorder t) sublocale Map empty update delete lookup invar

33 proof(standard, goal cases) case 1 show ?case by (auto simp: inorder lookup inorder empty in- order inv empty) next case 2 thus ?case by(simp add: fun eq iff inorder update inorder inv update map of ins list inorder lookup sorted upd list invar def ) next case 3 thus ?case by(simp add: fun eq iff inorder delete inorder inv delete map of del list inorder lookup sorted del list invar def ) next case 4 thus ?case by(simp add: inorder empty inorder inv empty in- var def ) next case 5 thus ?case by(simp add: inorder update inorder inv update sorted upd list invar def ) next case 6 thus ?case by (auto simp: inorder delete inorder inv delete sorted del list invar def ) qed end end

10 Unbalanced Tree Implementation of Map theory Tree Map imports Tree Set Map Specs begin fun lookup :: ( 0a::linorder∗ 0b) tree ⇒ 0a ⇒ 0b option where lookup Leaf x = None | lookup (Node l (a,b) r) x = (case cmp x a of LT ⇒ lookup l x | GT ⇒ lookup r x | EQ ⇒ Some b) fun update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree where update x y Leaf = Node Leaf (x,y) Leaf |

34 update x y (Node l (a,b) r) = (case cmp x a of LT ⇒ Node (update x y l)(a,b) r | EQ ⇒ Node l (x,y) r | GT ⇒ Node l (a,b)(update x y r)) fun delete :: 0a::linorder ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree where delete x Leaf = Leaf | delete x (Node l (a,b) r) = (case cmp x a of LT ⇒ Node (delete x l)(a,b) r | GT ⇒ Node l (a,b)(delete x r) | EQ ⇒ if r = Leaf then l else let (ab 0,r 0) = split min r in Node l ab 0 r 0)

10.1 Functional Correctness Proofs lemma lookup map of : sorted1 (inorder t) =⇒ lookup t x = map of (inorder t) x by (induction t)(auto simp: map of simps split: option.split) lemma inorder update: sorted1 (inorder t) =⇒ inorder(update a b t) = upd list a b (inorder t) by(induction t)(auto simp: upd list simps) lemma inorder delete: sorted1 (inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(induction t)(auto simp: del list simps split minD split: prod.splits) interpretation M : Map by Ordered where empty = empty and lookup = lookup and update = update and delete = delete and inorder = inorder and inv = λ . True proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: lookup map of ) next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) qed auto end

35 11 Augmented Tree (Tree2) theory Tree2 imports HOL−Library.Tree begin This theory provides the basic infrastructure for the type ( 0a × 0b) tree of augmented trees where 0a is the key and 0b some additional information. IMPORTANT: Inductions and cases analyses on augmented trees need to use the following two rules explicitly. They generate nodes of the form hl, (a, b), ri rather than hl, a, ri for trees of type 0a tree. lemmas tree2 induct = tree.induct[where 0a = 0a ∗ 0b, split format(complete)] lemmas tree2 cases = tree.exhaust[where 0a = 0a ∗ 0b, split format(complete)] fun inorder :: ( 0a∗ 0b)tree ⇒ 0a list where inorder Leaf = [] | inorder (Node l (a, ) r) = inorder l @ a # inorder r fun set tree :: ( 0a∗ 0b) tree ⇒ 0a set where set tree Leaf = {} | set tree (Node l (a, ) r) = {a} ∪ set tree l ∪ set tree r fun bst :: ( 0a::linorder∗ 0b) tree ⇒ bool where bst Leaf = True | bst (Node l (a, ) r) = ((∀ x ∈ set tree l. x < a) ∧ (∀ x ∈ set tree r. a < x) ∧ bst l ∧ bst r) lemma finite set tree[simp]: finite(set tree t) by(induction t) auto lemma eq set tree empty[simp]: set tree t = {} ←→ t = Leaf by (cases t) auto lemma set inorder[simp]: set (inorder t) = set tree t by (induction t) auto lemma length inorder[simp]: length (inorder t) = size t by (induction t) auto end

36 12 Function isin for Tree2 theory Isin2 imports Tree2 Cmp Set Specs begin fun isin :: ( 0a::linorder∗ 0b) tree ⇒ 0a ⇒ bool where isin Leaf x = False | isin (Node l (a, ) r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ isin r x) lemma isin set inorder: sorted(inorder t) =⇒ isin t x = (x ∈ set(inorder t)) by (induction t rule: tree2 induct)(auto simp: isin simps) lemma isin set tree: bst t =⇒ isin t x ←→ x ∈ set tree t by(induction t rule: tree2 induct) auto end

13 Interval Trees theory imports HOL−Data Structures.Cmp HOL−Data Structures.List Ins Del HOL−Data Structures.Isin2 HOL−Data Structures.Set Specs begin

13.1 Intervals The following definition of intervals uses the typedef command to define the type of non-empty intervals as a subset of the type of pairs p where fst p ≤ snd p: typedef (overloaded) 0a::linorder ivl = {p :: 0a × 0a. fst p ≤ snd p} by auto

37 More precisely, 0a ivl is isomorphic with that subset via the function Rep ivl. Hence the basic interval properties are not immediate but need simple proofs: definition low :: 0a::linorder ivl ⇒ 0a where low p = fst (Rep ivl p) definition high :: 0a::linorder ivl ⇒ 0a where high p = snd (Rep ivl p) lemma ivl is interval: low p ≤ high p by (metis Rep ivl high def low def mem Collect eq) lemma ivl inj : low p = low q =⇒ high p = high q =⇒ p = q by (metis Rep ivl inverse high def low def prod eqI ) Now we can forget how exactly intervals were defined. instantiation ivl :: (linorder) linorder begin definition ivl less:(x < y) = (low x < low y | (low x = low y ∧ high x < high y)) definition ivl less eq:(x ≤ y) = (low x < low y | (low x = low y ∧ high x ≤ high y)) instance proof fix x y z :: 0a ivl show a:(x < y) = (x ≤ y ∧ ¬ y ≤ x) using ivl less ivl less eq by force show b: x ≤ x by (simp add: ivl less eq) show c: x ≤ y =⇒ y ≤ z =⇒ x ≤ z by (smt ivl less eq dual order.trans less trans) show d: x ≤ y =⇒ y ≤ x =⇒ x = y using ivl less eq a ivl inj ivl less by fastforce show e: x ≤ y ∨ y ≤ x by (meson ivl less eq leI not less iff gr or eq) qed end definition overlap :: ( 0a::linorder) ivl ⇒ 0a ivl ⇒ bool where overlap x y ←→ (high x ≥ low y ∧ high y ≥ low x) definition has overlap :: ( 0a::linorder) ivl set ⇒ 0a ivl ⇒ bool where has overlap S y ←→ (∃ x∈S. overlap x y)

38 13.2 Interval Trees type synonym 0a ivl tree = ( 0a ivl ∗ 0a) tree fun max hi :: ( 0a::order bot) ivl tree ⇒ 0a where max hi Leaf = bot | max hi (Node ( ,m) ) = m definition max3 :: ( 0a::linorder) ivl ⇒ 0a ⇒ 0a ⇒ 0a where max3 a m n = max (high a)(max m n) fun inv max hi :: ( 0a::{linorder,order bot}) ivl tree ⇒ bool where inv max hi Leaf ←→ True | inv max hi (Node l (a, m) r) ←→ (m = max3 a (max hi l)(max hi r) ∧ inv max hi l ∧ inv max hi r) lemma max hi is max: inv max hi t =⇒ a ∈ set tree t =⇒ high a ≤ max hi t by (induct t, auto simp add: max3 def max def ) lemma max hi exists: inv max hi t =⇒ t 6= Leaf =⇒ ∃ a∈set tree t. high a = max hi t proof (induction t rule: tree2 induct) case Leaf then show ?case by auto next case N :(Node l v m r) then show ?case proof (cases l rule: tree2 cases) case Leaf then show ?thesis using N .prems(1 ) N .IH (2 ) by (cases r, auto simp add: max3 def max def le bot) next case Nl: Node then show ?thesis proof (cases r rule: tree2 cases) case Leaf then show ?thesis using N .prems(1 ) N .IH (1 ) Nl by (auto simp add: max3 def max def le bot) next case Nr: Node obtain p1 where p1 : p1 ∈ set tree l high p1 = max hi l

39 using N .IH (1 ) N .prems(1 ) Nl by auto obtain p2 where p2 : p2 ∈ set tree r high p2 = max hi r using N .IH (2 ) N .prems(1 ) Nr by auto then show ?thesis using p1 p2 N .prems(1 ) by (auto simp add: max3 def max def ) qed qed qed

13.3 Insertion and Deletion definition node where [simp]: node l a r = Node l (a, max3 a (max hi l)(max hi r)) r fun insert :: 0a::{linorder,order bot} ivl ⇒ 0a ivl tree ⇒ 0a ivl tree where insert x Leaf = Node Leaf (x, high x) Leaf | insert x (Node l (a, m) r) = (case cmp x a of EQ ⇒ Node l (a, m) r | LT ⇒ node (insert x l) a r | GT ⇒ node l a (insert x r)) lemma inorder insert: sorted (inorder t) =⇒ inorder (insert x t) = ins list x (inorder t) by (induct t rule: tree2 induct)(auto simp: ins list simps) lemma inv max hi insert: inv max hi t =⇒ inv max hi (insert x t) by (induct t rule: tree2 induct)(auto simp add: max3 def ) fun split min :: 0a::{linorder,order bot} ivl tree ⇒ 0a ivl × 0a ivl tree where split min (Node l (a, m) r) = (if l = Leaf then (a, r) else let (x,l 0) = split min l in (x, node l 0 a r)) fun delete :: 0a::{linorder,order bot} ivl ⇒ 0a ivl tree ⇒ 0a ivl tree where delete x Leaf = Leaf | delete x (Node l (a, m) r) = (case cmp x a of LT ⇒ node (delete x l) a r | GT ⇒ node l a (delete x r) | EQ ⇒ if r = Leaf then l else let (a 0, r 0) = split min r in node l a 0 r 0)

40 lemma split minD: split min t = (x,t 0) =⇒ t 6= Leaf =⇒ x # inorder t 0 = inorder t by (induct t arbitrary: t 0 rule: split min.induct) (auto simp: sorted lems split: prod.splits if splits) lemma inorder delete: sorted (inorder t) =⇒ inorder (delete x t) = del list x (inorder t) by (induct t) (auto simp: del list simps split minD Let def split: prod.splits) lemma inv max hi split min: [[ t 6= Leaf ; inv max hi t ]] =⇒ inv max hi (snd (split min t)) by (induct t)(auto split: prod.splits) lemma inv max hi delete: inv max hi t =⇒ inv max hi (delete x t) apply (induct t) apply simp using inv max hi split min by (fastforce simp add: Let def split: prod.splits)

13.4 Search Does interval x overlap with any interval in the tree? fun search :: 0a::{linorder,order bot} ivl tree ⇒ 0a ivl ⇒ bool where search Leaf x = False | search (Node l (a, m) r) x = (if overlap x a then True else if l 6= Leaf ∧ max hi l ≥ low x then search l x else search r x) lemma search correct: inv max hi t =⇒ sorted (inorder t) =⇒ search t x = has overlap (set tree t) x proof (induction t rule: tree2 induct) case Leaf then show ?case by (auto simp add: has overlap def ) next case (Node l a m r) have search l: search l x = has overlap (set tree l) x using Node.IH (1 ) Node.prems by (auto simp: sorted wrt append) have search r: search r x = has overlap (set tree r) x using Node.IH (2 ) Node.prems by (auto simp: sorted wrt append) show ?case proof (cases overlap a x)

41 case True thus ?thesis by (auto simp: overlap def has overlap def ) next case a disjoint: False then show ?thesis proof cases assume [simp]: l = Leaf have search eval: search (Node l (a, m) r) x = search r x using a disjoint overlap def by auto show ?thesis unfolding search eval search r by (auto simp add: has overlap def a disjoint) next assume l 6= Leaf then show ?thesis proof (cases max hi l ≥ low x) case max hi l ge: True have inv max hi l using Node.prems(1 ) by auto then obtain p where p: p ∈ set tree l high p = max hi l using hl 6= Leaf i max hi exists by auto have search eval: search (Node l (a, m) r) x = search l x using a disjoint hl 6= Leaf i max hi l ge by (auto simp: overlap def ) show ?thesis proof (cases low p ≤ high x) case True have overlap p x unfolding overlap def using True p(2 ) max hi l ge by auto then show ?thesis unfolding search eval search l using p(1 ) by(auto simp: has overlap def overlap def ) next case False have ¬overlap x rp if asm: rp ∈ set tree r for rp proof − have low p ≤ low rp using asm p(1 ) Node(4 ) by(fastforce simp: sorted wrt append ivl less) then show ?thesis using False by (auto simp: overlap def ) qed then show ?thesis unfolding search eval search l using a disjoint by (auto simp: has overlap def overlap def )

42 qed next case False have search eval: search (Node l (a, m) r) x = search r x using a disjoint False by (auto simp: overlap def ) have ¬overlap x lp if asm: lp ∈ set tree l for lp using asm False Node.prems(1 ) max hi is max by (fastforce simp: overlap def ) then show ?thesis unfolding search eval search r using a disjoint by (auto simp: has overlap def overlap def ) qed qed qed qed definition empty :: 0a ivl tree where empty = Leaf

13.5 Specification locale Interval Set = Set + fixes has overlap :: 0t ⇒ 0a::linorder ivl ⇒ bool assumes set overlap: invar s =⇒ has overlap s x = Interval Tree.has overlap (set s) x fun invar :: ( 0a::{linorder,order bot}) ivl tree ⇒ bool where invar t = (inv max hi t ∧ sorted(inorder t)) interpretation S: Interval Set where empty = Leaf and insert = insert and delete = delete and has overlap = search and isin = isin and set = set tree and invar = invar proof (standard, goal cases) case 1 then show ?case by auto next case 2 then show ?case by (simp add: isin set inorder) next case 3 then show ?case by(simp add: inorder insert set ins list flip: set inorder) next case 4

43 then show ?case by(simp add: inorder delete set del list flip: set inorder) next case 5 then show ?case by auto next case 6 then show ?case by (simp add: inorder insert inv max hi insert sorted ins list) next case 7 then show ?case by (simp add: inorder delete inv max hi delete sorted del list) next case 8 then show ?case by (simp add: search correct) qed end

14 AVL Tree Implementation of Sets theory AVL Set Code imports Cmp Isin2 begin

14.1 Code type synonym 0a tree ht = ( 0a∗nat) tree definition empty :: 0a tree ht where empty = Leaf fun ht :: 0a tree ht ⇒ nat where ht Leaf = 0 | ht (Node l (a,n) r) = n definition node :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where node l a r = Node l (a, max (ht l)(ht r) + 1 ) r definition balL :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where balL AB c C = (if ht AB = ht C + 2 then case AB of Node A (a, ) B ⇒

44 if ht A ≥ ht B then node A a (node B c C ) else case B of Node B 1 (b, ) B 2 ⇒ node (node A a B 1) b (node B 2 c C ) else node AB c C ) definition balR :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where balR A a BC = (if ht BC = ht A + 2 then case BC of Node B (c, ) C ⇒ if ht B ≤ ht C then node (node A a B) c C else case B of Node B 1 (b, ) B 2 ⇒ node (node A a B 1) b (node B 2 c C ) else node A a BC ) fun insert :: 0a::linorder ⇒ 0a tree ht ⇒ 0a tree ht where insert x Leaf = Node Leaf (x, 1 ) Leaf | insert x (Node l (a, n) r) = (case cmp x a of EQ ⇒ Node l (a, n) r | LT ⇒ balL (insert x l) a r | GT ⇒ balR l a (insert x r)) fun split max :: 0a tree ht ⇒ 0a tree ht ∗ 0a where split max (Node l (a, ) r) = (if r = Leaf then (l,a) else let (r 0,a 0) = split max r in (balL l a r 0, a 0)) lemmas split max induct = split max.induct[case names Node Leaf ] fun delete :: 0a::linorder ⇒ 0a tree ht ⇒ 0a tree ht where delete Leaf = Leaf | delete x (Node l (a, n) r) = (case cmp x a of EQ ⇒ if l = Leaf then r else let (l 0, a 0) = split max l in balR l 0 a 0 r | LT ⇒ balR (delete x l) a r | GT ⇒ balL l a (delete x r))

14.2 Functional Correctness Proofs Very different from the AFP/AVL proofs

45 14.2.1 Proofs for insert lemma inorder balL: inorder (balL l a r) = inorder l @ a # inorder r by (auto simp: node def balL def split:tree.splits) lemma inorder balR: inorder (balR l a r) = inorder l @ a # inorder r by (auto simp: node def balR def split:tree.splits) theorem inorder insert: sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) by (induct t) (auto simp: ins list simps inorder balL inorder balR)

14.2.2 Proofs for delete lemma inorder split maxD: [[ split max t = (t 0,a); t 6= Leaf ]] =⇒ inorder t 0 @[a] = inorder t by(induction t arbitrary: t 0 rule: split max.induct) (auto simp: inorder balL split: if splits prod.splits tree.split) theorem inorder delete: sorted(inorder t) =⇒ inorder (delete x t) = del list x (inorder t) by(induction t) (auto simp: del list simps inorder balL inorder balR inorder split maxD split: prod.splits) end

14.3 Invariant theory AVL Set imports AVL Set Code HOL−Number Theory.Fib begin fun avl :: 0a tree ht ⇒ bool where avl Leaf = True | avl (Node l (a,n) r) = (abs(int(height l) − int(height r)) ≤ 1 ∧ n = max (height l)(height r) + 1 ∧ avl l ∧ avl r)

46 14.3.1 Insertion maintains AVL balance declare Let def [simp] lemma ht height[simp]: avl t =⇒ ht t = height t by (cases t rule: tree2 cases) simp all First, a fast but relatively manual proof with many lemmas: lemma height balL: [[ avl l; avl r; height l = height r + 2 ]] =⇒ height (balL l a r) ∈ {height r + 2 , height r + 3 } by (auto simp:node def balL def split:tree.split) lemma height balR: [[ avl l; avl r; height r = height l + 2 ]] =⇒ height (balR l a r): {height l + 2 , height l + 3 } by(auto simp add:node def balR def split:tree.split) lemma height node[simp]: height(node l a r) = max (height l)(height r) + 1 by (simp add: node def ) lemma height balL2 : [[ avl l; avl r; height l 6= height r + 2 ]] =⇒ height (balL l a r) = 1 + max (height l)(height r) by (simp all add: balL def ) lemma height balR2 : [[ avl l; avl r; height r 6= height l + 2 ]] =⇒ height (balR l a r) = 1 + max (height l)(height r) by (simp all add: balR def ) lemma avl balL: [[ avl l; avl r; height r − 1 ≤ height l ∧ height l ≤ height r + 2 ]] =⇒ avl(balL l a r) by(auto simp: balL def node def split!: if split tree.split) lemma avl balR: [[ avl l; avl r; height l − 1 ≤ height r ∧ height r ≤ height l + 2 ]] =⇒ avl(balR l a r) by(auto simp: balR def node def split!: if split tree.split) Insertion maintains the AVL property. Requires simultaneous proof. theorem avl insert: avl t =⇒ avl(insert x t)

47 avl t =⇒ height (insert x t) ∈ {height t, height t + 1 } proof (induction t rule: tree2 induct) case (Node l a r) case 1 show ?case proof(cases x = a) case True with 1 show ?thesis by simp next case False show ?thesis proof(cases x

48 case False show ?thesis proof(cases height (insert x r) = height l + 2 ) case False with 2 Node(3 ,4 ) h¬x < a i show ?thesis by (auto simp: height balR2 ) next case True hence (height (balR l a (insert x r)) = height l + 2 ) ∨ (height (balR l a (insert x r)) = height l + 3 )(is ?A ∨ ?B) using 2 Node(3 ) height balR[OF True] by simp thus ?thesis proof assume ?A with 2 h¬x < a i show ?thesis by (auto) next assume ?B with 2 Node(4 ) True h¬x < a i show ?thesis by (simp) arith qed qed qed qed qed simp all Now an automatic proof without lemmas: theorem avl insert auto: avl t =⇒ avl(insert x t) ∧ height (insert x t) ∈ {height t, height t + 1 } apply (induction t rule: tree2 induct)

apply (auto simp: balL def balR def node def max absorb2 split!: if split tree.split) done

14.3.2 Deletion maintains AVL balance lemma avl split max: [[ avl t; t 6= Leaf ]] =⇒ avl (fst (split max t)) ∧ height t ∈ {height(fst (split max t)), height(fst (split max t)) + 1 } by(induct t rule: split max induct) (auto simp: balL def node def max absorb2 split!: prod.split if split tree.split) Deletion maintains the AVL property: theorem avl delete: avl t =⇒ avl(delete x t) avl t =⇒ height t ∈ {height (delete x t), height (delete x t) + 1 }

49 proof (induct t rule: tree2 induct) case (Node l a n r) case 1 show ?case proof(cases x = a) case True thus ?thesis using 1 avl split max[of l] by (auto intro!: avl balR split: prod.split) next case False thus ?thesis using Node 1 by (auto intro!: avl balL avl balR) qed case 2 show ?case proof(cases x = a) case True thus ?thesis using 2 avl split max[of l] by(auto simp: balR def max absorb2 split!: if splits prod.split tree.split) next case False show ?thesis proof(cases x

50 qed qed simp all A more automatic proof. Complete automation as for insertion seems hard due to resource requirements. theorem avl delete auto: avl t =⇒ avl(delete x t) avl t =⇒ height t ∈ {height (delete x t), height (delete x t) + 1 } proof (induct t rule: tree2 induct) case (Node l a n r) case 1 thus ?case using Node avl split max[of l] by (auto intro!: avl balL avl balR split: prod.split) case 2 show ?case using 2 Node avl split max[of l] by auto (auto simp: balL def balR def max absorb1 max absorb2 split!: tree.splits prod.splits if splits) qed simp all

14.4 Overall correctness interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = avl proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp add: empty def ) next case 6 thus ?case by (simp add: avl insert(1 )) next case 7 thus ?case by (simp add: avl delete(1 )) qed

51 14.5 Height-Size Relation Any AVL tree of height n has at least fib (n+2 ) leaves: theorem avl fib bound: avl t =⇒ fib(height t + 2 ) ≤ size1 t proof (induction rule: tree2 induct) case (Node l a h r) have 1 : height l + 1 ≤ height r + 2 and 2 : height r + 1 ≤ height l + 2 using Node.prems by auto have fib (max (height l)(height r) + 3 ) ≤ size1 l + size1 r proof cases assume height l ≥ height r hence fib (max (height l)(height r) + 3 ) = fib (height l + 3 ) by(simp add: max absorb1 ) also have ... = fib (height l + 2 ) + fib (height l + 1 ) by (simp add: numeral eq Suc) also have ... ≤ size1 l + fib (height l + 1 ) using Node by (simp) also have ... ≤ size1 r + size1 l using Node fib mono[OF 1 ] by auto also have ... = size1 (Node l (a,h) r) by simp finally show ?thesis by (simp) next assume ¬ height l ≥ height r hence fib (max (height l)(height r) + 3 ) = fib (height r + 3 ) by(simp add: max absorb1 ) also have ... = fib (height r + 2 ) + fib (height r + 1 ) by (simp add: numeral eq Suc) also have ... ≤ size1 r + fib (height r + 1 ) using Node by (simp) also have ... ≤ size1 r + size1 l using Node fib mono[OF 2 ] by auto also have ... = size1 (Node l (a,h) r) by simp finally show ?thesis by (simp) qed also have ... = size1 (Node l (a,h) r) by simp finally show ?case by (simp del: fib.simps add: numeral eq Suc) qed auto

52 lemma avl fib bound auto: avl t =⇒ fib (height t + 2 ) ≤ size1 t proof (induction t rule: tree2 induct) case Leaf thus ?case by (simp) next case (Node l a h r) have 1 : height l + 1 ≤ height r + 2 and 2 : height r + 1 ≤ height l + 2 using Node.prems by auto have left: height l ≥ height r =⇒ ?case (is ?asm =⇒ ) using Node fib mono[OF 1 ] by (simp add: max.absorb1 ) have right: height l ≤ height r =⇒ ?case using Node fib mono[OF 2 ] by (simp add: max.absorb2 ) show ?case using left right using Node.prems by simp linarith qed An exponential lower bound for fib: lemma fib lowerbound: defines ϕ ≡ (1 + sqrt 5 ) / 2 shows real (fib(n+2 )) ≥ ϕ ˆ n proof (induction n rule: fib.induct) case 1 then show ?case by simp next case 2 then show ?case by (simp add: ϕ def real le lsqrt) next case (3 n) have ϕ ˆ Suc (Suc n) = ϕ ˆ 2 ∗ ϕ ˆ n by (simp add: field simps power2 eq square) also have ... = (ϕ + 1 ) ∗ ϕ ˆ n by (simp all add: ϕ def power2 eq square field simps) also have ... = ϕ ˆ Suc n + ϕ ˆ n by (simp add: field simps) also have ... ≤ real (fib (Suc n + 2 )) + real (fib (n + 2 )) by (intro add mono 3 .IH ) finally show ?case by simp qed The size of an AVL tree is (at least) exponential in its height: lemma avl size lowerbound: defines ϕ ≡ (1 + sqrt 5 ) / 2 assumes avl t shows ϕ ˆ (height t) ≤ size1 t proof − have ϕ ˆ height t ≤ fib (height t + 2 )

53 unfolding ϕ def by(rule fib lowerbound) also have ... ≤ size1 t using avl fib bound[of t] assms by simp finally show ?thesis . qed The height of an AVL tree is most 1 / log 2 ϕ ≈ 1 .44 times worse than log 2 (real (size1 t)): lemma avl height upperbound: defines ϕ ≡ (1 + sqrt 5 ) / 2 assumes avl t shows height t ≤ (1 /log 2 ϕ) ∗ log 2 (size1 t) proof − have ϕ > 0 ϕ > 1 by(auto simp: ϕ def pos add strict) hence height t = log ϕ (ϕ ˆ height t) by(simp add: log nat power) also have ... ≤ log ϕ (size1 t) using avl size lowerbound[OF assms(2 ), folded ϕ def ] h1 < ϕi by (simp add: le log of power) also have ... = (1 /log 2 ϕ) ∗ log 2 (size1 t) by(simp add: log base change[of 2 ϕ]) finally show ?thesis . qed end

15 Function lookup for Tree2 theory Lookup2 imports Tree2 Cmp Map Specs begin fun lookup :: (( 0a::linorder ∗ 0b) ∗ 0c) tree ⇒ 0a ⇒ 0b option where lookup Leaf x = None | lookup (Node l ((a,b), ) r) x = (case cmp x a of LT ⇒ lookup l x | GT ⇒ lookup r x | EQ ⇒ Some b) lemma lookup map of : sorted1 (inorder t) =⇒ lookup t x = map of (inorder t) x by(induction t rule: tree2 induct)(auto simp: map of simps split: option.split) end

54 16 AVL Tree Implementation of Maps theory AVL Map imports AVL Set Lookup2 begin fun update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) tree ht ⇒ ( 0a∗ 0b) tree ht where update x y Leaf = Node Leaf ((x,y), 1 ) Leaf | update x y (Node l ((a,b), h) r) = (case cmp x a of EQ ⇒ Node l ((x,y), h) r | LT ⇒ balL (update x y l)(a,b) r | GT ⇒ balR l (a,b)(update x y r)) fun delete :: 0a::linorder ⇒ ( 0a∗ 0b) tree ht ⇒ ( 0a∗ 0b) tree ht where delete Leaf = Leaf | delete x (Node l ((a,b), h) r) = (case cmp x a of EQ ⇒ if l = Leaf then r else let (l 0, ab 0) = split max l in balR l 0 ab 0 r | LT ⇒ balR (delete x l)(a,b) r | GT ⇒ balL l (a,b)(delete x r))

16.1 Functional Correctness theorem inorder update: sorted1 (inorder t) =⇒ inorder(update x y t) = upd list x y (inorder t) by (induct t)(auto simp: upd list simps inorder balL inorder balR) theorem inorder delete: sorted1 (inorder t) =⇒ inorder (delete x t) = del list x (inorder t) by(induction t) (auto simp: del list simps inorder balL inorder balR inorder split maxD split: prod.splits)

16.2 AVL invariants 16.2.1 Insertion maintains AVL balance theorem avl update: assumes avl t shows avl(update x y t) (height (update x y t) = height t ∨ height (update x y t) = height t + 1 )

55 using assms proof (induction x y t rule: update.induct) case eq2 :(2 x y l a b h r) case 1 show ?case proof(cases x = a) case True with eq2 1 show ?thesis by simp next case False with eq2 1 show ?thesis proof(cases x

56 show ?thesis proof(cases height (update x y r) = height l + 2 ) case False with eq2 2 h¬x < a i show ?thesis by (auto simp: height balR2 ) next case True hence (height (balR l (a,b)(update x y r)) = height l + 2 ) ∨ (height (balR l (a,b)(update x y r)) = height l + 3 )(is ?A ∨ ?B) using eq2 2 h¬x < a i hx 6= a i height balR[OF True] by simp thus ?thesis proof assume ?A with 2 h¬x < a i show ?thesis by (auto) next assume ?B with True 1 eq2 (4 ) h¬x < a i show ?thesis by (simp) arith qed qed qed qed qed simp all

16.2.2 Deletion maintains AVL balance theorem avl delete: assumes avl t shows avl(delete x t) and height t = (height (delete x t)) ∨ height t = height (delete x t) + 1 using assms proof (induct t rule: tree2 induct) case (Node l ab h r) obtain a b where [simp]: ab = (a,b) by fastforce case 1 show ?case proof(cases x = a) case True with Node 1 show ?thesis using avl split max[of l] by (auto intro!: avl balR split: prod.split) next case False show ?thesis proof(cases x

57 qed case 2 show ?case proof(cases x = a) case True then show ?thesis using 1 avl split max[of l] by(auto simp: balR def max absorb2 split!: if splits prod.split tree.split) next case False show ?thesis proof(cases x

58 next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) next case 5 show ?case by (simp add: empty def ) next case 6 thus ?case by(simp add: avl update(1 )) next case 7 thus ?case by(simp add: avl delete(1 )) qed end

17 AVL Tree with Balance Factors (1) theory AVL Bal Set imports Cmp Isin2 begin This version detects height increase/decrease from above via the change in balance factors. datatype bal = Lh | Bal | Rh type synonym 0a tree bal = ( 0a ∗ bal) tree Invariant: fun avl :: 0a tree bal ⇒ bool where avl Leaf = True | avl (Node l (a,b) r) = ((case b of Bal ⇒ height r = height l | Lh ⇒ height l = height r + 1 | Rh ⇒ height r = height l + 1 ) ∧ avl l ∧ avl r)

17.1 Code fun is bal where is bal (Node l (a,b) r) = (b = Bal) fun incr where

59 incr t t 0 = (t = Leaf ∨ is bal t ∧ ¬ is bal t 0) fun rot2 where rot2 A a B c C = (case B of (Node B 1 (b, bb) B 2) ⇒ let b1 = if bb = Rh then Lh else Bal; b2 = if bb = Lh then Rh else Bal in Node (Node A (a,b1) B 1)(b,Bal)(Node B 2 (c,b2) C )) fun balL :: 0a tree bal ⇒ 0a ⇒ bal ⇒ 0a tree bal ⇒ 0a tree bal where balL AB c bc C = (case bc of Bal ⇒ Node AB (c,Lh) C | Rh ⇒ Node AB (c,Bal) C | Lh ⇒ (case AB of Node A (a,Lh) B ⇒ Node A (a,Bal)(Node B (c,Bal) C ) | Node A (a,Bal) B ⇒ Node A (a,Rh)(Node B (c,Lh) C ) | Node A (a,Rh) B ⇒ rot2 A a B c C )) fun balR :: 0a tree bal ⇒ 0a ⇒ bal ⇒ 0a tree bal ⇒ 0a tree bal where balR A a ba BC = (case ba of Bal ⇒ Node A (a,Rh) BC | Lh ⇒ Node A (a,Bal) BC | Rh ⇒ (case BC of Node B (c,Rh) C ⇒ Node (Node A (a,Bal) B)(c,Bal) C | Node B (c,Bal) C ⇒ Node (Node A (a,Rh) B)(c,Lh) C | Node B (c,Lh) C ⇒ rot2 A a B c C )) fun insert :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal where insert x Leaf = Node Leaf (x, Bal) Leaf | insert x (Node l (a, b) r) = (case cmp x a of EQ ⇒ Node l (a, b) r | LT ⇒ let l 0 = insert x l in if incr l l 0 then balL l 0 a b r else Node l 0 (a,b) r | GT ⇒ let r 0 = insert x r in if incr r r 0 then balR l a b r 0 else Node l (a,b) r 0) fun decr where decr t t 0 = (t 6= Leaf ∧ (t 0 = Leaf ∨ ¬ is bal t ∧ is bal t 0)) fun split max :: 0a tree bal ⇒ 0a tree bal ∗ 0a where split max (Node l (a, ba) r) = (if r = Leaf then (l,a) else let (r 0,a 0) = split max r; t 0 = if decr r r 0 then balL l a ba r 0 else Node l (a,ba) r 0

60 in (t 0, a 0)) fun delete :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal where delete Leaf = Leaf | delete x (Node l (a, ba) r) = (case cmp x a of EQ ⇒ if l = Leaf then r else let (l 0, a 0) = split max l in if decr l l 0 then balR l 0 a 0 ba r else Node l 0 (a 0,ba) r | LT ⇒ let l 0 = delete x l in if decr l l 0 then balR l 0 a ba r else Node l 0 (a,ba) r | GT ⇒ let r 0 = delete x r in if decr r r 0 then balL l a ba r 0 else Node l (a,ba) r 0)

17.2 Proofs lemmas split max induct = split max.induct[case names Node Leaf ] lemmas splits = if splits tree.splits bal.splits declare Let def [simp]

17.2.1 Proofs about insertion lemma avl insert: avl t =⇒ avl(insert x t) ∧ height(insert x t) = height t + (if incr t (insert x t) then 1 else 0 ) apply(induction x t rule: insert.induct) apply(auto split!: splits) done The following two auxiliary lemma merely simplify the proof of in- order insert. lemma [simp]: [] 6= ins list x xs by(cases xs) auto lemma [simp]: avl t =⇒ insert x t 6= hl, (a, Rh), hii ∧ insert x t 6= hhi, (a, Lh), ri by(drule avl insert[of x]) (auto split: splits) theorem inorder insert: [[ avl t; sorted(inorder t) ]] =⇒ inorder(insert x t) = ins list x (inorder t) apply(induction t) apply (auto simp: ins list simps split!: splits)

61 done

17.2.2 Proofs about deletion lemma inorder balR: [[ ba = Rh −→ r 6= Leaf ; avl r ]] =⇒ inorder (balR l a ba r) = inorder l @ a # inorder r by (auto split: splits) lemma inorder balL: [[ ba = Lh −→ l 6= Leaf ; avl l ]] =⇒ inorder (balL l a ba r) = inorder l @ a # inorder r by (auto split: splits) lemma height 1 iff : avl t =⇒ height t = Suc 0 ←→ (∃ x. t = Node Leaf (x,Bal) Leaf ) by(cases t)(auto split: splits prod.splits) lemma avl split max: [[ split max t = (t 0,a); avl t; t 6= Leaf ]] =⇒ avl t 0 ∧ height t = height t 0 + (if decr t t 0 then 1 else 0 ) apply(induction t arbitrary: t 0 a rule: split max induct) apply(auto simp: max absorb1 max absorb2 height 1 iff split!: splits prod.splits) done lemma avl delete: avl t =⇒ avl (delete x t) ∧ height t = height (delete x t) + (if decr t (delete x t) then 1 else 0 ) apply(induction x t rule: delete.induct) apply(auto simp: max absorb1 max absorb2 height 1 iff dest: avl split max split!: splits prod.splits) done lemma inorder split maxD: [[ split max t = (t 0,a); t 6= Leaf ; avl t ]] =⇒ inorder t 0 @[a] = inorder t apply(induction t arbitrary: t 0 rule: split max.induct) apply(fastforce split!: splits prod.splits) apply simp done lemma neq Leaf if height neq 0 : height t 6= 0 =⇒ t 6= Leaf by auto

62 lemma split max Leaf : [[ t 6= Leaf ; avl t ]] =⇒ split max t = (hi, x) ←→ t = Node Leaf (x,Bal) Leaf by(cases t)(auto split: splits prod.splits) theorem inorder delete: [[ avl t; sorted(inorder t) ]] =⇒ inorder (delete x t) = del list x (inorder t) apply(induction t rule: tree2 induct) apply(auto simp: del list simps inorder balR inorder balL avl delete inorder split maxD split max Leaf neq Leaf if height neq 0 simp del: balL.simps balR.simps split!: splits prod.splits) done

17.2.3 Set Implementation interpretation S: Set by Ordered where empty = Leaf and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = avl proof (standard, goal cases) case 1 show ?case by (simp) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp) next case 6 thus ?case by (simp add: avl insert) next case 7 thus ?case by (simp add: avl delete) qed end

18 AVL Tree with Balance Factors (2) theory AVL Bal2 Set imports Cmp Isin2

63 begin This version passes a flag (Same/Diff ) back up to signal if the height changed. datatype bal = Lh | Bal | Rh type synonym 0a tree bal = ( 0a ∗ bal) tree Invariant: fun avl :: 0a tree bal ⇒ bool where avl Leaf = True | avl (Node l (a,b) r) = ((case b of Bal ⇒ height r = height l | Lh ⇒ height l = height r + 1 | Rh ⇒ height r = height l + 1 ) ∧ avl l ∧ avl r)

18.1 Code datatype 0a alt = Same 0a | Diff 0a type synonym 0a tree bal2 = 0a tree bal alt fun tree :: 0a alt ⇒ 0a where tree(Same t) = t | tree(Diff t) = t fun rot2 where rot2 A a B c C = (case B of (Node B 1 (b, bb) B 2) ⇒ let b1 = if bb = Rh then Lh else Bal; b2 = if bb = Lh then Rh else Bal in Node (Node A (a,b1) B 1)(b,Bal)(Node B 2 (c,b2) C )) fun balL :: 0a tree bal2 ⇒ 0a ⇒ bal ⇒ 0a tree bal ⇒ 0a tree bal2 where balL AB 0 c bc C = (case AB 0 of Same AB ⇒ Same (Node AB (c,bc) C ) | Diff AB ⇒ (case bc of Bal ⇒ Diff (Node AB (c,Lh) C ) | Rh ⇒ Same (Node AB (c,Bal) C ) | Lh ⇒ (case AB of Node A (a,Lh) B ⇒ Same(Node A (a,Bal)(Node B (c,Bal) C )) | Node A (a,Rh) B ⇒ Same(rot2 A a B c C ))))

64 fun balR :: 0a tree bal ⇒ 0a ⇒ bal ⇒ 0a tree bal2 ⇒ 0a tree bal2 where balR A a ba BC 0 = (case BC 0 of Same BC ⇒ Same (Node A (a,ba) BC ) | Diff BC ⇒ (case ba of Bal ⇒ Diff (Node A (a,Rh) BC ) | Lh ⇒ Same (Node A (a,Bal) BC ) | Rh ⇒ (case BC of Node B (c,Rh) C ⇒ Same(Node (Node A (a,Bal) B)(c,Bal) C ) | Node B (c,Lh) C ⇒ Same(rot2 A a B c C )))) fun ins :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal2 where ins x Leaf = Diff (Node Leaf (x, Bal) Leaf ) | ins x (Node l (a, b) r) = (case cmp x a of EQ ⇒ Same(Node l (a, b) r) | LT ⇒ balL (ins x l) a b r | GT ⇒ balR l a b (ins x r)) definition insert :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal where insert x t = tree(ins x t) fun baldR :: 0a tree bal ⇒ 0a ⇒ bal ⇒ 0a tree bal2 ⇒ 0a tree bal2 where baldR AB c bc C 0 = (case C 0 of Same C ⇒ Same (Node AB (c,bc) C ) | Diff C ⇒ (case bc of Bal ⇒ Same (Node AB (c,Lh) C ) | Rh ⇒ Diff (Node AB (c,Bal) C ) | Lh ⇒ (case AB of Node A (a,Lh) B ⇒ Diff (Node A (a,Bal)(Node B (c,Bal) C )) | Node A (a,Bal) B ⇒ Same(Node A (a,Rh)(Node B (c,Lh) C )) | Node A (a,Rh) B ⇒ Diff (rot2 A a B c C )))) fun baldL :: 0a tree bal2 ⇒ 0a ⇒ bal ⇒ 0a tree bal ⇒ 0a tree bal2 where baldL A 0 a ba BC = (case A 0 of Same A ⇒ Same (Node A (a,ba) BC ) | Diff A ⇒ (case ba of Bal ⇒ Same (Node A (a,Rh) BC ) | Lh ⇒ Diff (Node A (a,Bal) BC ) | Rh ⇒ (case BC of Node B (c,Rh) C ⇒ Diff (Node (Node A (a,Bal) B)(c,Bal) C ) | Node B (c,Bal) C ⇒ Same(Node (Node A (a,Rh) B)(c,Lh) C ) | Node B (c,Lh) C ⇒ Diff (rot2 A a B c C )))) fun split max :: 0a tree bal ⇒ 0a tree bal2 ∗ 0a where

65 split max (Node l (a, ba) r) = (if r = Leaf then (Diff l,a) else let (r 0,a 0) = split max r in (baldR l a ba r 0, a 0)) fun del :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal2 where del Leaf = Same Leaf | del x (Node l (a, ba) r) = (case cmp x a of EQ ⇒ if l = Leaf then Diff r else let (l 0, a 0) = split max l in baldL l 0 a 0 ba r | LT ⇒ baldL (del x l) a ba r | GT ⇒ baldR l a ba (del x r)) definition delete :: 0a::linorder ⇒ 0a tree bal ⇒ 0a tree bal where delete x t = tree(del x t) lemmas split max induct = split max.induct[case names Node Leaf ] lemmas splits = if splits tree.splits alt.splits bal.splits

18.2 Proofs 18.2.1 Proofs about insertion lemma avl ins case: avl t =⇒ case ins x t of Same t 0 ⇒ avl t 0 ∧ height t 0 = height t | Diff t 0 ⇒ avl t 0 ∧ height t 0 = height t + 1 ∧ (∀ l a r. t 0 = Node l (a,Bal) r −→ a = x ∧ l = Leaf ∧ r = Leaf ) apply(induction x t rule: ins.induct) apply(auto simp: max absorb1 split!: splits) done corollary avl insert: avl t =⇒ avl(insert x t) using avl ins case[of t x] by (simp add: insert def split: splits)

lemma ins Diff [simp]: avl t =⇒ ins x t 6= Diff Leaf ∧ (ins x t = Diff (Node l (a,Bal) r) ←→ t = Leaf ∧ a = x ∧ l=Leaf ∧ r=Leaf ) ∧ ins x t 6= Diff (Node l (a,Rh) Leaf ) ∧ ins x t 6= Diff (Node Leaf (a,Lh) r) by(drule avl ins case[of x]) (auto split: splits)

66 theorem inorder ins: [[ avl t; sorted(inorder t) ]] =⇒ inorder(tree(ins x t)) = ins list x (inorder t) apply(induction t) apply (auto simp: ins list simps split!: splits) done

18.2.2 Proofs about deletion lemma inorder baldL: [[ ba = Rh −→ r 6= Leaf ; avl r ]] =⇒ inorder (tree(baldL l a ba r)) = inorder (tree l)@ a # inorder r by (auto split: splits) lemma inorder baldR: [[ ba = Lh −→ l 6= Leaf ; avl l ]] =⇒ inorder (tree(baldR l a ba r)) = inorder l @ a # inorder (tree r) by (auto split: splits) lemma avl split max: [[ split max t = (t 0,a); avl t; t 6= Leaf ]] =⇒ case t 0 of Same t 0 ⇒ avl t 0 ∧ height t = height t 0 | Diff t 0 ⇒ avl t 0 ∧ height t = height t 0 + 1 apply(induction t arbitrary: t 0 a rule: split max induct) apply(fastforce simp: max absorb1 max absorb2 split!: splits prod.splits) apply simp done lemma avl del case: avl t =⇒ case del x t of Same t 0 ⇒ avl t 0 ∧ height t = height t 0 | Diff t 0 ⇒ avl t 0 ∧ height t = height t 0 + 1 apply(induction x t rule: del.induct) apply(auto simp: max absorb1 max absorb2 dest: avl split max split!: splits prod.splits) done corollary avl delete: avl t =⇒ avl(delete x t) using avl del case[of t x] by(simp add: delete def split: splits) lemma inorder split maxD: [[ split max t = (t 0,a); t 6= Leaf ; avl t ]] =⇒ inorder (tree t 0)@[a] = inorder t apply(induction t arbitrary: t 0 rule: split max.induct)

67 apply(fastforce split!: splits prod.splits) apply simp done lemma neq Leaf if height neq 0 [simp]: height t 6= 0 =⇒ t 6= Leaf by auto theorem inorder del: [[ avl t; sorted(inorder t) ]] =⇒ inorder (tree(del x t)) = del list x (inorder t) apply(induction t rule: tree2 induct) apply(auto simp: del list simps inorder baldL inorder baldR avl delete in- order split maxD simp del: baldR.simps baldL.simps split!: splits prod.splits) done

18.2.3 Set Implementation interpretation S: Set by Ordered where empty = Leaf and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = avl proof (standard, goal cases) case 1 show ?case by (simp) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder ins insert def ) next case 4 thus ?case by(simp add: inorder del delete def ) next case 5 thus ?case by (simp) next case 6 thus ?case by (simp add: avl insert) next case 7 thus ?case by (simp add: avl delete) qed end

19 Height-Balanced Trees theory Height Balanced Tree

68 imports Cmp Isin2 begin Height-balanced trees (HBTs) can be seen as a generalization of AVL trees. The code and the proofs were obtained by small modifications of the AVL theories. This is an implementation of sets via HBTs. type synonym 0a tree ht = ( 0a∗nat) tree definition empty :: 0a tree ht where empty = Leaf The maximal amount by which the height of two siblings may differ: locale HBT = fixes m :: nat assumes [arith]: m > 0 begin Invariant: fun hbt :: 0a tree ht ⇒ bool where hbt Leaf = True | hbt (Node l (a,n) r) = (abs(int(height l) − int(height r)) ≤ int(m) ∧ n = max (height l)(height r) + 1 ∧ hbt l ∧ hbt r) fun ht :: 0a tree ht ⇒ nat where ht Leaf = 0 | ht (Node l (a,n) r) = n definition node :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where node l a r = Node l (a, max (ht l)(ht r) + 1 ) r definition balL :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where balL AB b C = (if ht AB = ht C + m + 1 then case AB of Node A (a, ) B ⇒ if ht A ≥ ht B then node A a (node B b C ) else case B of Node B 1 (ab, ) B 2 ⇒ node (node A a B 1) ab (node B 2 b C ) else node AB b C )

69 definition balR :: 0a tree ht ⇒ 0a ⇒ 0a tree ht ⇒ 0a tree ht where balR A a BC = (if ht BC = ht A + m + 1 then case BC of Node B (b, ) C ⇒ if ht B ≤ ht C then node (node A a B) b C else case B of Node B 1 (ab, ) B 2 ⇒ node (node A a B 1) ab (node B 2 b C ) else node A a BC ) fun insert :: 0a::linorder ⇒ 0a tree ht ⇒ 0a tree ht where insert x Leaf = Node Leaf (x, 1 ) Leaf | insert x (Node l (a, n) r) = (case cmp x a of EQ ⇒ Node l (a, n) r | LT ⇒ balL (insert x l) a r | GT ⇒ balR l a (insert x r)) fun split max :: 0a tree ht ⇒ 0a tree ht ∗ 0a where split max (Node l (a, ) r) = (if r = Leaf then (l,a) else let (r 0,a 0) = split max r in (balL l a r 0, a 0)) lemmas split max induct = split max.induct[case names Node Leaf ] fun delete :: 0a::linorder ⇒ 0a tree ht ⇒ 0a tree ht where delete Leaf = Leaf | delete x (Node l (a, n) r) = (case cmp x a of EQ ⇒ if l = Leaf then r else let (l 0, a 0) = split max l in balR l 0 a 0 r | LT ⇒ balR (delete x l) a r | GT ⇒ balL l a (delete x r))

19.1 Functional Correctness Proofs 19.1.1 Proofs for insert lemma inorder balL: inorder (balL l a r) = inorder l @ a # inorder r by (auto simp: node def balL def split:tree.splits) lemma inorder balR: inorder (balR l a r) = inorder l @ a # inorder r by (auto simp: node def balR def split:tree.splits)

70 theorem inorder insert: sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) by (induct t) (auto simp: ins list simps inorder balL inorder balR)

19.1.2 Proofs for delete lemma inorder split maxD: [[ split max t = (t 0,a); t 6= Leaf ]] =⇒ inorder t 0 @[a] = inorder t by(induction t arbitrary: t 0 rule: split max.induct) (auto simp: inorder balL split: if splits prod.splits tree.split) theorem inorder delete: sorted(inorder t) =⇒ inorder (delete x t) = del list x (inorder t) by(induction t) (auto simp: del list simps inorder balL inorder balR inorder split maxD split: prod.splits)

19.2 Invariant preservation 19.2.1 Insertion maintains balance declare Let def [simp] lemma ht height[simp]: hbt t =⇒ ht t = height t by (cases t rule: tree2 cases) simp all First, a fast but relatively manual proof with many lemmas: lemma height balL: [[ hbt l; hbt r; height l = height r + m + 1 ]] =⇒ height (balL l a r) ∈ {height r + m + 1 , height r + m + 2 } by (auto simp:node def balL def split:tree.split) lemma height balR: [[ hbt l; hbt r; height r = height l + m + 1 ]] =⇒ height (balR l a r) ∈ {height l + m + 1 , height l + m + 2 } by(auto simp add:node def balR def split:tree.split) lemma height node[simp]: height(node l a r) = max (height l)(height r) + 1 by (simp add: node def ) lemma height balL2 :

71 [[ hbt l; hbt r; height l 6= height r + m + 1 ]] =⇒ height (balL l a r) = 1 + max (height l)(height r) by (simp all add: balL def ) lemma height balR2 : [[ hbt l; hbt r; height r 6= height l + m + 1 ]] =⇒ height (balR l a r) = 1 + max (height l)(height r) by (simp all add: balR def ) lemma hbt balL: [[ hbt l; hbt r; height r − m ≤ height l ∧ height l ≤ height r + m + 1 ]] =⇒ hbt(balL l a r) by(auto simp: balL def node def max def split!: if splits tree.split) lemma hbt balR: [[ hbt l; hbt r; height l − m ≤ height r ∧ height r ≤ height l + m + 1 ]] =⇒ hbt(balR l a r) by(auto simp: balR def node def max def split!: if splits tree.split) Insertion maintains hbt. Requires simultaneous proof. theorem hbt insert: hbt t =⇒ hbt(insert x t) hbt t =⇒ height (insert x t) ∈ {height t, height t + 1 } proof (induction t rule: tree2 induct) case (Node l a r) case 1 show ?case proof(cases x = a) case True with Node 1 show ?thesis by simp next case False show ?thesis proof(cases x

72 case False show ?thesis proof(cases x

73 Now an automatic proof without lemmas: theorem hbt insert auto: hbt t =⇒ hbt(insert x t) ∧ height (insert x t) ∈ {height t, height t + 1 } apply (induction t rule: tree2 induct)

apply (auto simp: balL def balR def node def max absorb1 max absorb2 split!: if split tree.split) done

19.2.2 Deletion maintains balance lemma hbt split max: [[ hbt t; t 6= Leaf ]] =⇒ hbt (fst (split max t)) ∧ height t ∈ {height(fst (split max t)), height(fst (split max t)) + 1 } by(induct t rule: split max induct) (auto simp: balL def node def max absorb2 split!: prod.split if split tree.split) Deletion maintains hbt: theorem hbt delete: hbt t =⇒ hbt(delete x t) hbt t =⇒ height t ∈ {height (delete x t), height (delete x t) + 1 } proof (induct t rule: tree2 induct) case (Node l a n r) case 1 thus ?case using Node hbt split max[of l] by (auto intro!: hbt balL hbt balR split: prod.split) case 2 show ?case proof(cases x = a) case True then show ?thesis using 1 hbt split max[of l] by(auto simp: balR def max absorb2 split!: if splits prod.split tree.split) next case False show ?thesis proof(cases x

74 hence (height (balR (delete x l) a r) = height (delete x l) + m + 1 ) ∨ height (balR (delete x l) a r) = height (delete x l) + m + 2 (is ?A ∨ ?B) using Node 2height balR[OF True] by simp thus ?thesis proof assume ?A with hx < a i Node 2 show ?thesis by(auto simp: balR def split!: if splits) next assume ?B with hx < a i Node 2 show ?thesis by(auto simp: balR def split!: if splits) qed qed next case False show ?thesis proof(cases height l = height (delete x r) + m + 1 ) case False with Node 1 h¬x < a i hx 6= a i show ?thesis by(auto simp: balL def ) next case True hence (height (balL l a (delete x r)) = height (delete x r) + m + 1 ) ∨ height (balL l a (delete x r)) = height (delete x r) + m + 2 (is ?A ∨ ?B) using Node 2 height balL[OF True] by simp thus ?thesis proof assume ?A with h¬x < a i hx 6= a i Node 2 show ?thesis by(auto simp: balL def split: if splits) next assume ?B with h¬x < a i hx 6= a i Node 2 show ?thesis by(auto simp: balL def split: if splits) qed qed qed qed qed simp all A more automatic proof. Complete automation as for insertion seems hard due to resource requirements. theorem hbt delete auto: hbt t =⇒ hbt(delete x t)

75 hbt t =⇒ height t ∈ {height (delete x t), height (delete x t) + 1 } proof (induct t rule: tree2 induct) case (Node l a n r) case 1 thus ?case using Node hbt split max[of l] by (auto intro!: hbt balL hbt balR split: prod.split) case 2 show ?case proof(cases x = a) case True thus ?thesis using 2 hbt split max[of l] by(auto simp: balR def max absorb2 split!: if splits prod.split tree.split) next case False thus ?thesis using height balL[of l delete x r a] height balR[of delete x l r a] 2 Node by(auto simp: balL def balR def split!: if split) qed qed simp all

19.3 Overall correctness interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = hbt proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp add: empty def ) next case 6 thus ?case by (simp add: hbt insert(1 )) next case 7 thus ?case by (simp add: hbt delete(1 )) qed end

76 end

20 Red-Black Trees theory RBT imports Tree2 begin datatype color = Red | Black type synonym 0a rbt = ( 0a∗color)tree abbreviation R where R l a r ≡ Node l (a, Red) r abbreviation B where B l a r ≡ Node l (a, Black) r fun baliL :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where baliL (R (R t1 a t2 ) b t3 ) c t4 = R (B t1 a t2 ) b (B t3 c t4 ) | baliL (R t1 a (R t2 b t3 )) c t4 = R (B t1 a t2 ) b (B t3 c t4 ) | baliL t1 a t2 = B t1 a t2 fun baliR :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where baliR t1 a (R t2 b (R t3 c t4 )) = R (B t1 a t2 ) b (B t3 c t4 ) | baliR t1 a (R (R t2 b t3 ) c t4 ) = R (B t1 a t2 ) b (B t3 c t4 ) | baliR t1 a t2 = B t1 a t2 fun paint :: color ⇒ 0a rbt ⇒ 0a rbt where paint c Leaf = Leaf | paint c (Node l (a, ) r) = Node l (a,c) r fun baldL :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where baldL (R t1 a t2 ) b t3 = R (B t1 a t2 ) b t3 | baldL t1 a (B t2 b t3 ) = baliR t1 a (R t2 b t3 ) | baldL t1 a (R (B t2 b t3 ) c t4 ) = R (B t1 a t2 ) b (baliR t3 c (paint Red t4 )) | baldL t1 a t2 = R t1 a t2 fun baldR :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where baldR t1 a (R t2 b t3 ) = R t1 a (B t2 b t3 ) | baldR (B t1 a t2 ) b t3 = baliL (R t1 a t2 ) b t3 | baldR (R t1 a (B t2 b t3 )) c t4 = R (baliL (paint Red t1 ) a t2 ) b (B t3 c t4 ) | baldR t1 a t2 = R t1 a t2

77 fun join :: 0a rbt ⇒ 0a rbt ⇒ 0a rbt where join Leaf t = t | join t Leaf = t | join (R t1 a t2 )(R t3 c t4 ) = (case join t2 t3 of R u2 b u3 ⇒ (R (R t1 a u2 ) b (R u3 c t4 )) | t23 ⇒ R t1 a (R t23 c t4 )) | join (B t1 a t2 )(B t3 c t4 ) = (case join t2 t3 of R u2 b u3 ⇒ R (B t1 a u2 ) b (B u3 c t4 ) | t23 ⇒ baldL t1 a (B t23 c t4 )) | join t1 (R t2 a t3 ) = R (join t1 t2 ) a t3 | join (R t1 a t2 ) t3 = R t1 a (join t2 t3 ) end

21 Red-Black Tree Implementation of Sets theory RBT Set imports Complex Main RBT Cmp Isin2 begin definition empty :: 0a rbt where empty = Leaf fun ins :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where ins x Leaf = R Leaf x Leaf | ins x (B l a r) = (case cmp x a of LT ⇒ baliL (ins x l) a r | GT ⇒ baliR l a (ins x r) | EQ ⇒ B l a r) | ins x (R l a r) = (case cmp x a of LT ⇒ R (ins x l) a r | GT ⇒ R l a (ins x r) | EQ ⇒ R l a r) definition insert :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where

78 insert x t = paint Black (ins x t) fun color :: 0a rbt ⇒ color where color Leaf = Black | color (Node ( , c) ) = c fun del :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where del x Leaf = Leaf | del x (Node l (a, ) r) = (case cmp x a of LT ⇒ if l 6= Leaf ∧ color l = Black then baldL (del x l) a r else R (del x l) a r | GT ⇒ if r 6= Leaf ∧ color r = Black then baldR l a (del x r) else R l a (del x r) | EQ ⇒ join l r) definition delete :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where delete x t = paint Black (del x t)

21.1 Functional Correctness Proofs lemma inorder paint: inorder(paint c t) = inorder t by(cases t)(auto) lemma inorder baliL: inorder(baliL l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: baliL.cases)(auto) lemma inorder baliR: inorder(baliR l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: baliR.cases)(auto) lemma inorder ins: sorted(inorder t) =⇒ inorder(ins x t) = ins list x (inorder t) by(induction x t rule: ins.induct) (auto simp: ins list simps inorder baliL inorder baliR) lemma inorder insert: sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) by (simp add: insert def inorder ins inorder paint) lemma inorder baldL: inorder(baldL l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: baldL.cases)

79 (auto simp: inorder baliL inorder baliR inorder paint) lemma inorder baldR: inorder(baldR l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: baldR.cases) (auto simp: inorder baliL inorder baliR inorder paint) lemma inorder join: inorder(join l r) = inorder l @ inorder r by(induction l r rule: join.induct) (auto simp: inorder baldL inorder baldR split: tree.split color.split) lemma inorder del: sorted(inorder t) =⇒ inorder(del x t) = del list x (inorder t) by(induction x t rule: del.induct) (auto simp: del list simps inorder join inorder baldL inorder baldR) lemma inorder delete: sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by (auto simp: delete def inorder del inorder paint)

21.2 Structural invariants lemma neq Black[simp]: (c 6= Black) = (c = Red) by (cases c) auto The proofs are due to Markus Reiter and Alexander Krauss. fun bheight :: 0a rbt ⇒ nat where bheight Leaf = 0 | bheight (Node l (x, c) r) = (if c = Black then bheight l + 1 else bheight l) fun invc :: 0a rbt ⇒ bool where invc Leaf = True | invc (Node l (a,c) r) = ((c = Red −→ color l = Black ∧ color r = Black) ∧ invc l ∧ invc r) Weaker version: abbreviation invc2 :: 0a rbt ⇒ bool where invc2 t ≡ invc(paint Black t) fun invh :: 0a rbt ⇒ bool where invh Leaf = True | invh (Node l (x, c) r) = (bheight l = bheight r ∧ invh l ∧ invh r)

80 lemma invc2I : invc t =⇒ invc2 t by (cases t rule: tree2 cases) simp+ definition rbt :: 0a rbt ⇒ bool where rbt t = (invc t ∧ invh t ∧ color t = Black) lemma color paint Black: color (paint Black t) = Black by (cases t) auto lemma paint2 : paint c2 (paint c1 t) = paint c2 t by (cases t) auto lemma invh paint: invh t =⇒ invh (paint c t) by (cases t) auto lemma invc baliL: [[invc2 l; invc r]] =⇒ invc (baliL l a r) by (induct l a r rule: baliL.induct) auto lemma invc baliR: [[invc l; invc2 r]] =⇒ invc (baliR l a r) by (induct l a r rule: baliR.induct) auto lemma bheight baliL: bheight l = bheight r =⇒ bheight (baliL l a r) = Suc (bheight l) by (induct l a r rule: baliL.induct) auto lemma bheight baliR: bheight l = bheight r =⇒ bheight (baliR l a r) = Suc (bheight l) by (induct l a r rule: baliR.induct) auto lemma invh baliL: [[ invh l; invh r; bheight l = bheight r ]] =⇒ invh (baliL l a r) by (induct l a r rule: baliL.induct) auto lemma invh baliR: [[ invh l; invh r; bheight l = bheight r ]] =⇒ invh (baliR l a r) by (induct l a r rule: baliR.induct) auto All in one: lemma inv baliR: [[ invh l; invh r; invc l; invc2 r; bheight l = bheight r ]] =⇒ invc (baliR l a r) ∧ invh (baliR l a r) ∧ bheight (baliR l a r) = Suc (bheight l) by (induct l a r rule: baliR.induct) auto

81 lemma inv baliL: [[ invh l; invh r; invc2 l; invc r; bheight l = bheight r ]] =⇒ invc (baliL l a r) ∧ invh (baliL l a r) ∧ bheight (baliL l a r) = Suc (bheight l) by (induct l a r rule: baliL.induct) auto

21.2.1 Insertion lemma invc ins: invc t −→ invc2 (ins x t) ∧ (color t = Black −→ invc (ins x t)) by (induct x t rule: ins.induct)(auto simp: invc baliL invc baliR invc2I ) lemma invh ins: invh t =⇒ invh (ins x t) ∧ bheight (ins x t) = bheight t by(induct x t rule: ins.induct) (auto simp: invh baliL invh baliR bheight baliL bheight baliR) theorem rbt insert: rbt t =⇒ rbt (insert x t) by (simp add: invc ins invh ins color paint Black invh paint rbt def insert def ) All in one: lemma inv ins: [[ invc t; invh t ]] =⇒ invc2 (ins x t) ∧ (color t = Black −→ invc (ins x t)) ∧ invh(ins x t) ∧ bheight (ins x t) = bheight t by (induct x t rule: ins.induct)(auto simp: inv baliL inv baliR invc2I ) theorem rbt insert2 : rbt t =⇒ rbt (insert x t) by (simp add: inv ins color paint Black invh paint rbt def insert def )

21.2.2 Deletion lemma bheight paint Red: color t = Black =⇒ bheight (paint Red t) = bheight t − 1 by (cases t) auto lemma invh baldL invc: [[ invh l; invh r; bheight l + 1 = bheight r; invc r ]] =⇒ invh (baldL l a r) ∧ bheight (baldL l a r) = bheight r by (induct l a r rule: baldL.induct) (auto simp: invh baliR invh paint bheight baliR bheight paint Red) lemma invh baldL Black: [[ invh l; invh r; bheight l + 1 = bheight r; color r = Black ]] =⇒ invh (baldL l a r) ∧ bheight (baldL l a r) = bheight r by (induct l a r rule: baldL.induct)(auto simp add: invh baliR bheight baliR)

82 lemma invc baldL: [[invc2 l; invc r; color r = Black]] =⇒ invc (baldL l a r) by (induct l a r rule: baldL.induct)(simp all add: invc baliR) lemma invc2 baldL: [[ invc2 l; invc r ]] =⇒ invc2 (baldL l a r) by (induct l a r rule: baldL.induct)(auto simp: invc baliR paint2 invc2I ) lemma invh baldR invc: [[ invh l; invh r; bheight l = bheight r + 1 ; invc l ]] =⇒ invh (baldR l a r) ∧ bheight (baldR l a r) = bheight l by(induct l a r rule: baldR.induct) (auto simp: invh baliL bheight baliL invh paint bheight paint Red) lemma invc baldR: [[invc l; invc2 r; color l = Black]] =⇒ invc (baldR l a r) by (induct l a r rule: baldR.induct)(simp all add: invc baliL) lemma invc2 baldR: [[ invc l; invc2 r ]] =⇒invc2 (baldR l a r) by (induct l a r rule: baldR.induct)(auto simp: invc baliL paint2 invc2I ) lemma invh join: [[ invh l; invh r; bheight l = bheight r ]] =⇒ invh (join l r) ∧ bheight (join l r) = bheight l by (induct l r rule: join.induct) (auto simp: invh baldL Black split: tree.splits color.splits) lemma invc join: [[ invc l; invc r ]] =⇒ (color l = Black ∧ color r = Black −→ invc (join l r)) ∧ invc2 (join l r) by (induct l r rule: join.induct) (auto simp: invc baldL invc2I split: tree.splits color.splits) All in one: lemma inv baldL: [[ invh l; invh r; bheight l + 1 = bheight r; invc2 l; invc r ]] =⇒ invh (baldL l a r) ∧ bheight (baldL l a r) = bheight r ∧ invc2 (baldL l a r) ∧ (color r = Black −→ invc (baldL l a r)) by (induct l a r rule: baldL.induct) (auto simp: inv baliR invh paint bheight baliR bheight paint Red paint2 invc2I ) lemma inv baldR: [[ invh l; invh r; bheight l = bheight r + 1 ; invc l; invc2 r ]]

83 =⇒ invh (baldR l a r) ∧ bheight (baldR l a r) = bheight l ∧ invc2 (baldR l a r) ∧ (color l = Black −→ invc (baldR l a r)) by (induct l a r rule: baldR.induct) (auto simp: inv baliL invh paint bheight baliL bheight paint Red paint2 invc2I ) lemma inv join: [[ invh l; invh r; bheight l = bheight r; invc l; invc r ]] =⇒ invh (join l r) ∧ bheight (join l r) = bheight l ∧ invc2 (join l r) ∧ (color l = Black ∧ color r = Black −→ invc (join l r)) by (induct l r rule: join.induct) (auto simp: invh baldL Black inv baldL invc2I split: tree.splits color.splits) lemma neq LeafD: t 6= Leaf =⇒ ∃ l x c r. t = Node l (x,c) r by(cases t rule: tree2 cases) auto lemma inv del: [[ invh t; invc t ]] =⇒ invh (del x t) ∧ (color t = Red −→ bheight (del x t) = bheight t ∧ invc (del x t)) ∧ (color t = Black −→ bheight (del x t) = bheight t − 1 ∧ invc2 (del x t)) by(induct x t rule: del.induct) (auto simp: inv baldL inv baldR inv join dest!: neq LeafD) theorem rbt delete: rbt t =⇒ rbt (delete x t) by (metis delete def rbt def color paint Black inv del invh paint) Overall correctness: interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = rbt proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp add: rbt def empty def ) next case 6 thus ?case by (simp add: rbt insert)

84 next case 7 thus ?case by (simp add: rbt delete) qed

21.3 Height-Size Relation lemma rbt height bheight if : invc t =⇒ invh t =⇒ height t ≤ 2 ∗ bheight t + (if color t = Black then 0 else 1 ) by(induction t)(auto split: if split asm) lemma rbt height bheight: rbt t =⇒ height t / 2 ≤ bheight t by(auto simp: rbt def dest: rbt height bheight if ) lemma bheight size bound: invc t =⇒ invh t =⇒ 2 ˆ (bheight t) ≤ size1 t by (induction t) auto lemma rbt height le: assumes rbt t shows height t ≤ 2 ∗ log 2 (size1 t) proof − have 2 powr (height t / 2 ) ≤ 2 powr bheight t using rbt height bheight[OF assms] by (simp) also have ... ≤ size1 t using assms by (simp add: powr realpow bheight size bound rbt def ) finally have 2 powr (height t / 2 ) ≤ size1 t . hence height t / 2 ≤ log 2 (size1 t) by (simp add: le log iff size1 size del: divide le eq numeral1 (1 )) thus ?thesis by simp qed end

22 Alternative Deletion in Red-Black Trees theory RBT Set2 imports RBT Set begin This is a conceptually simpler version of deletion. Instead of the tricky join function this version follows the standard approach of replacing the deleted element (in function del) by the minimal element in its right subtree. fun split min :: 0a rbt ⇒ 0a × 0a rbt where split min (Node l (a, ) r) = (if l = Leaf then (a,r) else let (x,l 0) = split min l in (x, if color l = Black then baldL l 0 a r else R l 0 a r))

85 fun del :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where del x Leaf = Leaf | del x (Node l (a, ) r) = (case cmp x a of LT ⇒ let l 0 = del x l in if l 6= Leaf ∧ color l = Black then baldL l 0 a r else R l 0 a r | GT ⇒ let r 0 = del x r in if r 6= Leaf ∧ color r = Black then baldR l a r 0 else R l a r 0 | EQ ⇒ if r = Leaf then l else let (a 0,r 0) = split min r in if color r = Black then baldR l a 0 r 0 else R l a 0 r 0) The first two lets speed up the automatic proof of inv del below. definition delete :: 0a::linorder ⇒ 0a rbt ⇒ 0a rbt where delete x t = paint Black (del x t)

22.1 Functional Correctness Proofs declare Let def [simp] lemma split minD: split min t = (x,t 0) =⇒ t 6= Leaf =⇒ x # inorder t 0 = inorder t by(induction t arbitrary: t 0 rule: split min.induct) (auto simp: inorder baldL sorted lems split: prod.splits if splits) lemma inorder del: sorted(inorder t) =⇒ inorder(del x t) = del list x (inorder t) by(induction x t rule: del.induct) (auto simp: del list simps inorder baldL inorder baldR split minD split: prod.splits) lemma inorder delete: sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by (auto simp: delete def inorder del inorder paint)

22.2 Structural invariants lemma neq Red[simp]: (c 6= Red) = (c = Black) by (cases c) auto

22.2.1 Deletion lemma inv split min: [[ split min t = (x,t 0); t 6= Leaf ; invh t; invc t ]] =⇒ invh t 0 ∧ (color t = Red −→ bheight t 0 = bheight t ∧ invc t 0) ∧

86 (color t = Black −→ bheight t 0 = bheight t − 1 ∧ invc2 t 0) apply(induction t arbitrary: x t 0 rule: split min.induct) apply(auto simp: inv baldR inv baldL invc2I dest!: neq LeafD split: if splits prod.splits) done An automatic proof. It is quite brittle, e.g. inlining the lets in RBT Set2 .del breaks it. lemma inv del: [[ invh t; invc t ]] =⇒ invh (del x t) ∧ (color t = Red −→ bheight (del x t) = bheight t ∧ invc (del x t)) ∧ (color t = Black −→ bheight (del x t) = bheight t − 1 ∧ invc2 (del x t)) apply(induction x t rule: del.induct) apply(auto simp: inv baldR inv baldL invc2I dest!: inv split min dest: neq LeafD split!: prod.splits if splits) done A structured proof where one can see what is used in each case. lemma inv del2 : [[ invh t; invc t ]] =⇒ invh (del x t) ∧ (color t = Red −→ bheight (del x t) = bheight t ∧ invc (del x t)) ∧ (color t = Black −→ bheight (del x t) = bheight t − 1 ∧ invc2 (del x t)) proof(induction x t rule: del.induct) case (1 x) then show ?case by simp next case (2 x l a c r) note if split[split del] show ?case proof cases assume x < a show ?thesis proof cases assume ∗: l 6= Leaf ∧ color l = Black hence bheight l > 0 using neq LeafD[of l] by auto thus ?thesis using hx < a i 2 .IH (1 ) 2 .prems inv baldL[of del x l] ∗ by(auto) next assume ¬(l 6= Leaf ∧ color l = Black) thus ?thesis using hx < a i 2 .prems 2 .IH (1 ) by(auto) qed next assume ¬ x < a show ?thesis

87 proof cases assume x > a show ?thesis using ha < x i 2 .IH (2 ) 2 .prems neq LeafD[of r] inv baldR[of del x r] by(auto split: if split)

next assume ¬ x > a show ?thesis using 2 .prems h¬ x < a i h¬ x > a i by(auto simp: inv baldR invc2I dest!: inv split min dest: neq LeafD split: prod.split if split) qed qed qed theorem rbt delete: rbt t =⇒ rbt (delete x t) by (metis delete def rbt def color paint Black inv del invh paint) Overall correctness: interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = rbt proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp add: rbt def empty def ) next case 6 thus ?case by (simp add: rbt insert) next case 7 thus ?case by (simp add: rbt delete) qed end

88 23 Red-Black Tree Implementation of Maps theory RBT Map imports RBT Set Lookup2 begin fun upd :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) rbt ⇒ ( 0a∗ 0b) rbt where upd x y Leaf = R Leaf (x,y) Leaf | upd x y (B l (a,b) r) = (case cmp x a of LT ⇒ baliL (upd x y l)(a,b) r | GT ⇒ baliR l (a,b)(upd x y r) | EQ ⇒ B l (x,y) r) | upd x y (R l (a,b) r) = (case cmp x a of LT ⇒ R (upd x y l)(a,b) r | GT ⇒ R l (a,b)(upd x y r) | EQ ⇒ R l (x,y) r) definition update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) rbt ⇒ ( 0a∗ 0b) rbt where update x y t = paint Black (upd x y t) fun del :: 0a::linorder ⇒ ( 0a∗ 0b)rbt ⇒ ( 0a∗ 0b)rbt where del x Leaf = Leaf | del x (Node l ((a,b), c) r) = (case cmp x a of LT ⇒ if l 6= Leaf ∧ color l = Black then baldL (del x l)(a,b) r else R (del x l)(a,b) r | GT ⇒ if r 6= Leaf ∧ color r = Black then baldR l (a,b)(del x r) else R l (a,b)(del x r) | EQ ⇒ join l r) definition delete :: 0a::linorder ⇒ ( 0a∗ 0b) rbt ⇒ ( 0a∗ 0b) rbt where delete x t = paint Black (del x t)

23.1 Functional Correctness Proofs lemma inorder upd: sorted1 (inorder t) =⇒ inorder(upd x y t) = upd list x y (inorder t) by(induction x y t rule: upd.induct) (auto simp: upd list simps inorder baliL inorder baliR) lemma inorder update: sorted1 (inorder t) =⇒ inorder(update x y t) = upd list x y (inorder t) by(simp add: update def inorder upd inorder paint)

89 lemma inorder del: sorted1 (inorder t) =⇒ inorder(del x t) = del list x (inorder t) by(induction x t rule: del.induct) (auto simp: del list simps inorder join inorder baldL inorder baldR) lemma inorder delete: sorted1 (inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del inorder paint)

23.2 Structural invariants 23.2.1 Update lemma invc upd: assumes invc t shows color t = Black =⇒ invc (upd x y t) invc2 (upd x y t) using assms by (induct x y t rule: upd.induct)(auto simp: invc baliL invc baliR invc2I ) lemma invh upd: assumes invh t shows invh (upd x y t) bheight (upd x y t) = bheight t using assms by(induct x y t rule: upd.induct) (auto simp: invh baliL invh baliR bheight baliL bheight baliR) theorem rbt update: rbt t =⇒ rbt (update x y t) by (simp add: invc upd(2 ) invh upd(1 ) color paint Black invh paint rbt def update def )

23.2.2 Deletion lemma del invc invh: invh t =⇒ invc t =⇒ invh (del x t) ∧ (color t = Red ∧ bheight (del x t) = bheight t ∧ invc (del x t) ∨ color t = Black ∧ bheight (del x t) = bheight t − 1 ∧ invc2 (del x t)) proof (induct x t rule: del.induct) case (2 x y c) have x = y ∨ x < y ∨ x > y by auto thus ?case proof (elim disjE) assume x = y with 2 show ?thesis by (cases c)(simp all add: invh join invc join) next assume x < y with 2 show ?thesis by(cases c)

90 (auto simp: invh baldL invc invc baldL invc2 baldL dest: neq LeafD) next assume y < x with 2 show ?thesis by(cases c) (auto simp: invh baldR invc invc baldR invc2 baldR dest: neq LeafD) qed qed auto theorem rbt delete: rbt t =⇒ rbt (delete k t) by (metis delete def rbt def color paint Black del invc invh invc2I invh paint) interpretation M : Map by Ordered where empty = empty and lookup = lookup and update = update and delete = delete and inorder = inorder and inv = rbt proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: lookup map of ) next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by (simp add: rbt def empty def ) next case 6 thus ?case by (simp add: rbt update) next case 7 thus ?case by (simp add: rbt delete) qed end

24 2-3 Trees theory Tree23 imports Main begin class height = fixes height :: 0a ⇒ nat

91 datatype 0a tree23 = Leaf (hi) | Node2 0a tree23 0a 0a tree23 (h , , i) | Node3 0a tree23 0a 0a tree23 0a 0a tree23 (h , , , , i) fun inorder :: 0a tree23 ⇒ 0a list where inorder Leaf = [] | inorder(Node2 l a r) = inorder l @ a # inorder r | inorder(Node3 l a m b r) = inorder l @ a # inorder m @ b # inorder r instantiation tree23 :: (type)height begin fun height tree23 :: 0a tree23 ⇒ nat where height Leaf = 0 | height (Node2 l r) = Suc(max (height l)(height r)) | height (Node3 l m r) = Suc(max (height l)(max (height m)(height r))) instance .. end Completeness: fun complete :: 0a tree23 ⇒ bool where complete Leaf = True | complete (Node2 l r) = (height l = height r ∧ complete l & complete r) | complete (Node3 l m r) = (height l = height m & height m = height r & complete l & complete m & complete r) lemma ht sz if complete: complete t =⇒ 2 ˆ height t ≤ size t + 1 by (induction t) auto end

25 2-3 Tree Implementation of Sets theory Tree23 Set imports Tree23 Cmp Set Specs begin

92 declare sorted wrt.simps(2 )[simp del] definition empty :: 0a tree23 where empty = Leaf fun isin :: 0a::linorder tree23 ⇒ 0a ⇒ bool where isin Leaf x = False | isin (Node2 l a r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ isin r x) | isin (Node3 l a m b r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ (case cmp x b of LT ⇒ isin m x | EQ ⇒ True | GT ⇒ isin r x)) datatype 0a upI = TI 0a tree23 | OF 0a tree23 0a 0a tree23 fun treeI :: 0a upI ⇒ 0a tree23 where treeI (TI t) = t | treeI (OF l a r) = Node2 l a r fun ins :: 0a::linorder ⇒ 0a tree23 ⇒ 0a upI where ins x Leaf = OF Leaf x Leaf | ins x (Node2 l a r) = (case cmp x a of LT ⇒ (case ins x l of TI l 0 => TI (Node2 l 0 a r) | OF l1 b l2 => TI (Node3 l1 b l2 a r)) | EQ ⇒ TI (Node2 l a r) | GT ⇒ (case ins x r of TI r 0 => TI (Node2 l a r 0) | OF r1 b r2 => TI (Node3 l a r1 b r2 ))) | ins x (Node3 l a m b r) = (case cmp x a of

93 LT ⇒ (case ins x l of TI l 0 => TI (Node3 l 0 a m b r) | OF l1 c l2 => OF (Node2 l1 c l2 ) a (Node2 m b r)) | EQ ⇒ TI (Node3 l a m b r) | GT ⇒ (case cmp x b of GT ⇒ (case ins x r of TI r 0 => TI (Node3 l a m b r 0) | OF r1 c r2 => OF (Node2 l a m) b (Node2 r1 c r2 )) | EQ ⇒ TI (Node3 l a m b r) | LT ⇒ (case ins x m of TI m 0 => TI (Node3 l a m 0 b r) | OF m1 c m2 => OF (Node2 l a m1 ) c (Node2 m2 b r)))) hide const insert definition insert :: 0a::linorder ⇒ 0a tree23 ⇒ 0a tree23 where insert x t = treeI (ins x t) datatype 0a upD = TD 0a tree23 | UF 0a tree23 fun treeD :: 0a upD ⇒ 0a tree23 where treeD (TD t) = t | treeD (UF t) = t

fun node21 :: 0a upD ⇒ 0a ⇒ 0a tree23 ⇒ 0a upD where node21 (TD t1 ) a t2 = TD(Node2 t1 a t2 ) | node21 (UF t1 ) a (Node2 t2 b t3 ) = UF (Node3 t1 a t2 b t3 ) | node21 (UF t1 ) a (Node3 t2 b t3 c t4 ) = TD(Node2 (Node2 t1 a t2 ) b (Node2 t3 c t4 )) fun node22 :: 0a tree23 ⇒ 0a ⇒ 0a upD ⇒ 0a upD where node22 t1 a (TD t2 ) = TD(Node2 t1 a t2 ) | node22 (Node2 t1 b t2 ) a (UF t3 ) = UF (Node3 t1 b t2 a t3 ) | node22 (Node3 t1 b t2 c t3 ) a (UF t4 ) = TD(Node2 (Node2 t1 b t2 ) c (Node2 t3 a t4 )) fun node31 :: 0a upD ⇒ 0a ⇒ 0a tree23 ⇒ 0a ⇒ 0a tree23 ⇒ 0a upD where node31 (TD t1 ) a t2 b t3 = TD(Node3 t1 a t2 b t3 ) |

94 node31 (UF t1 ) a (Node2 t2 b t3 ) c t4 = TD(Node2 (Node3 t1 a t2 b t3 ) c t4 ) | node31 (UF t1 ) a (Node3 t2 b t3 c t4 ) d t5 = TD(Node3 (Node2 t1 a t2 ) b (Node2 t3 c t4 ) d t5 ) fun node32 :: 0a tree23 ⇒ 0a ⇒ 0a upD ⇒ 0a ⇒ 0a tree23 ⇒ 0a upD where node32 t1 a (TD t2 ) b t3 = TD(Node3 t1 a t2 b t3 ) | node32 t1 a (UF t2 ) b (Node2 t3 c t4 ) = TD(Node2 t1 a (Node3 t2 b t3 c t4 )) | node32 t1 a (UF t2 ) b (Node3 t3 c t4 d t5 ) = TD(Node3 t1 a (Node2 t2 b t3 ) c (Node2 t4 d t5 )) fun node33 :: 0a tree23 ⇒ 0a ⇒ 0a tree23 ⇒ 0a ⇒ 0a upD ⇒ 0a upD where node33 l a m b (TD r) = TD(Node3 l a m b r) | node33 t1 a (Node2 t2 b t3 ) c (UF t4 ) = TD(Node2 t1 a (Node3 t2 b t3 c t4 )) | node33 t1 a (Node3 t2 b t3 c t4 ) d (UF t5 ) = TD(Node3 t1 a (Node2 t2 b t3 ) c (Node2 t4 d t5 )) fun split min :: 0a tree23 ⇒ 0a ∗ 0a upD where split min (Node2 Leaf a Leaf ) = (a, UF Leaf ) | split min (Node3 Leaf a Leaf b Leaf ) = (a, TD(Node2 Leaf b Leaf )) | split min (Node2 l a r) = (let (x,l 0) = split min l in (x, node21 l 0 a r)) | split min (Node3 l a m b r) = (let (x,l 0) = split min l in (x, node31 l 0 a m b r)) In the base cases of split min and del it is enough to check if one subtree is a Leaf, in which case completeness implies that so are the others. Exercise. fun del :: 0a::linorder ⇒ 0a tree23 ⇒ 0a upD where del x Leaf = TD Leaf | del x (Node2 Leaf a Leaf ) = (if x = a then UF Leaf else TD(Node2 Leaf a Leaf )) | del x (Node3 Leaf a Leaf b Leaf ) = TD(if x = a then Node2 Leaf b Leaf else if x = b then Node2 Leaf a Leaf else Node3 Leaf a Leaf b Leaf ) | del x (Node2 l a r) = (case cmp x a of LT ⇒ node21 (del x l) a r | GT ⇒ node22 l a (del x r) | EQ ⇒ let (a 0,r 0) = split min r in node22 l a 0 r 0) | del x (Node3 l a m b r) = (case cmp x a of LT ⇒ node31 (del x l) a m b r |

95 EQ ⇒ let (a 0,m 0) = split min m in node32 l a 0 m 0 b r | GT ⇒ (case cmp x b of LT ⇒ node32 l a (del x m) b r | EQ ⇒ let (b 0,r 0) = split min r in node33 l a m b 0 r 0 | GT ⇒ node33 l a m b (del x r))) definition delete :: 0a::linorder ⇒ 0a tree23 ⇒ 0a tree23 where delete x t = treeD(del x t)

25.1 Functional Correctness 25.1.1 Proofs for isin lemma isin set: sorted(inorder t) =⇒ isin t x = (x ∈ set (inorder t)) by (induction t)(auto simp: isin simps)

25.1.2 Proofs for insert lemma inorder ins: sorted(inorder t) =⇒ inorder(treeI (ins x t)) = ins list x (inorder t) by(induction t)(auto simp: ins list simps split: upI .splits) lemma inorder insert: sorted(inorder t) =⇒ inorder(insert a t) = ins list a (inorder t) by(simp add: insert def inorder ins)

25.1.3 Proofs for delete lemma inorder node21 : height r > 0 =⇒ inorder (treeD (node21 l 0 a r)) = inorder (treeD l 0)@ a # inorder r by(induct l 0 a r rule: node21 .induct) auto lemma inorder node22 : height l > 0 =⇒ inorder (treeD (node22 l a r 0)) = inorder l @ a # inorder (treeD r 0) by(induct l a r 0 rule: node22 .induct) auto lemma inorder node31 : height m > 0 =⇒ inorder (treeD (node31 l 0 a m b r)) = inorder (treeD l 0)@ a # inorder m @ b # inorder r by(induct l 0 a m b r rule: node31 .induct) auto lemma inorder node32 : height r > 0 =⇒ inorder (treeD (node32 l a m 0 b r)) = inorder l @ a # inorder (treeD m 0) @ b # inorder r

96 by(induct l a m 0 b r rule: node32 .induct) auto lemma inorder node33 : height m > 0 =⇒ inorder (treeD (node33 l a m b r 0)) = inorder l @ a # inorder m @ b # inorder (treeD r 0) by(induct l a m b r 0 rule: node33 .induct) auto lemmas inorder nodes = inorder node21 inorder node22 inorder node31 inorder node32 inorder node33 lemma split minD: split min t = (x,t 0) =⇒ complete t =⇒ height t > 0 =⇒ x # inorder(treeD t 0) = inorder t by(induction t arbitrary: t 0 rule: split min.induct) (auto simp: inorder nodes split: prod.splits) lemma inorder del: [[ complete t ; sorted(inorder t) ]] =⇒ inorder(treeD (del x t)) = del list x (inorder t) by(induction t rule: del.induct) (auto simp: del list simps inorder nodes split minD split!: if split prod.splits) lemma inorder delete: [[ complete t ; sorted(inorder t) ]] =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del)

25.2 Completeness 25.2.1 Proofs for insert First a standard proof that ins preserves complete. fun hI :: 0a upI ⇒ nat where hI (TI t) = height t | hI (OF l a r) = height l lemma complete ins: complete t =⇒ complete (treeI (ins a t)) ∧ hI (ins a t) = height t by (induct t)(auto split!: if split upI .split) Now an alternative proof (by Brian Huffman) that runs faster because two properties (completeness and height) are combined in one predicate. inductive full :: nat ⇒ 0a tree23 ⇒ bool where full 0 Leaf | [[full n l; full n r]] =⇒ full (Suc n)(Node2 l p r) | [[full n l; full n m; full n r]] =⇒ full (Suc n)(Node3 l p m q r)

97 inductive cases full elims: full n Leaf full n (Node2 l p r) full n (Node3 l p m q r) inductive cases full 0 elim: full 0 t inductive cases full Suc elim: full (Suc n) t lemma full 0 iff [simp]: full 0 t ←→ t = Leaf by (auto elim: full 0 elim intro: full.intros) lemma full Leaf iff [simp]: full n Leaf ←→ n = 0 by (auto elim: full elims intro: full.intros) lemma full Suc Node2 iff [simp]: full (Suc n)(Node2 l p r) ←→ full n l ∧ full n r by (auto elim: full elims intro: full.intros) lemma full Suc Node3 iff [simp]: full (Suc n)(Node3 l p m q r) ←→ full n l ∧ full n m ∧ full n r by (auto elim: full elims intro: full.intros) lemma full imp height: full n t =⇒ height t = n by (induct set: full, simp all) lemma full imp complete: full n t =⇒ complete t by (induct set: full, auto dest: full imp height) lemma complete imp full: complete t =⇒ full (height t) t by (induct t, simp all) lemma complete iff full: complete t ←→ (∃ n. full n t) by (auto elim!: complete imp full full imp complete) The insert function either preserves the height of the tree, or increases it by one. The constructor returned by the insert function determines which: A return value of the form TI t indicates that the height will be the same. A value of the form OF l p r indicates an increase in height. 0 fun full i :: nat ⇒ a upI ⇒ bool where full i n (TI t) ←→ full n t | full i n (OF l p r) ←→ full n l ∧ full n r lemma full i ins: full n t =⇒ full i n (ins a t)

98 by (induct rule: full.induct)(auto split: upI .split) The insert operation preserves completeance. lemma complete insert: complete t =⇒ complete (insert a t) unfolding complete iff full insert def apply (erule exE) apply (drule full i ins [of a]) apply (cases ins a t) apply (auto intro: full.intros) done

25.3 Proofs for delete fun hD :: 0a upD ⇒ nat where hD (TD t) = height t | hD (UF t) = height t + 1 lemma complete treeD node21 : [[complete r; complete (treeD l 0); height r = hD l 0 ]] =⇒ complete (treeD (node21 l 0 a r)) by(induct l 0 a r rule: node21 .induct) auto lemma complete treeD node22 : [[complete(treeD r 0); complete l; hD r 0 = height l ]] =⇒ complete (treeD (node22 l a r 0)) by(induct l a r 0 rule: node22 .induct) auto lemma complete treeD node31 : [[ complete (treeD l 0); complete m; complete r; hD l 0 = height r; height m = height r ]] =⇒ complete (treeD (node31 l 0 a m b r)) by(induct l 0 a m b r rule: node31 .induct) auto lemma complete treeD node32 : [[ complete l; complete (treeD m 0); complete r; height l = height r; hD m 0 = height r ]] =⇒ complete (treeD (node32 l a m 0 b r)) by(induct l a m 0 b r rule: node32 .induct) auto lemma complete treeD node33 : [[ complete l; complete m; complete(treeD r 0); height l = hD r 0; height m = hD r 0 ]] =⇒ complete (treeD (node33 l a m b r 0)) by(induct l a m b r 0 rule: node33 .induct) auto

99 lemmas completes = complete treeD node21 complete treeD node22 complete treeD node31 complete treeD node32 complete treeD node33 lemma height 0 node21 : height r > 0 =⇒ hD(node21 l 0 a r) = max (hD l 0)(height r) + 1 by(induct l 0 a r rule: node21 .induct)(simp all) lemma height 0 node22 : height l > 0 =⇒ hD(node22 l a r 0) = max (height l)(hD r 0) + 1 by(induct l a r 0 rule: node22 .induct)(simp all) lemma height 0 node31 : height m > 0 =⇒ hD(node31 l a m b r) = max (hD l)(max (height m)(height r)) + 1 by(induct l a m b r rule: node31 .induct)(simp all add: max def ) lemma height 0 node32 : height r > 0 =⇒ hD(node32 l a m b r) = max (height l)(max (hD m)(height r)) + 1 by(induct l a m b r rule: node32 .induct)(simp all add: max def ) lemma height 0 node33 : height m > 0 =⇒ hD(node33 l a m b r) = max (height l)(max (height m)(hD r)) + 1 by(induct l a m b r rule: node33 .induct)(simp all add: max def ) lemmas heights = height 0 node21 height 0 node22 height 0 node31 height 0 node32 height 0 node33 lemma height split min: split min t = (x, t 0) =⇒ height t > 0 =⇒ complete t =⇒ hD t 0 = height t by(induct t arbitrary: x t 0 rule: split min.induct) (auto simp: heights split: prod.splits) lemma height del: complete t =⇒ hD(del x t) = height t by(induction x t rule: del.induct) (auto simp: heights max def height split min split: prod.splits) lemma complete split min: [[ split min t = (x, t 0); complete t; height t > 0 ]] =⇒ complete (treeD t 0) by(induct t arbitrary: x t 0 rule: split min.induct) (auto simp: heights height split min completes split: prod.splits)

100 lemma complete treeD del: complete t =⇒ complete(treeD(del x t)) by(induction x t rule: del.induct) (auto simp: completes complete split min height del height split min split: prod.splits) corollary complete delete: complete t =⇒ complete(delete x t) by(simp add: delete def complete treeD del)

25.4 Overall Correctness interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = complete proof (standard, goal cases) case 2 thus ?case by(simp add: isin set) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 6 thus ?case by(simp add: complete insert) next case 7 thus ?case by(simp add: complete delete) qed (simp add: empty def )+ end

26 2-3 Tree Implementation of Maps theory Tree23 Map imports Tree23 Set Map Specs begin fun lookup :: ( 0a::linorder ∗ 0b) tree23 ⇒ 0a ⇒ 0b option where lookup Leaf x = None | lookup (Node2 l (a,b) r) x = (case cmp x a of LT ⇒ lookup l x | GT ⇒ lookup r x | EQ ⇒ Some b) | lookup (Node3 l (a1 ,b1 ) m (a2 ,b2 ) r) x = (case cmp x a1 of

101 LT ⇒ lookup l x | EQ ⇒ Some b1 | GT ⇒ (case cmp x a2 of LT ⇒ lookup m x | EQ ⇒ Some b2 | GT ⇒ lookup r x)) fun upd :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) tree23 ⇒ ( 0a∗ 0b) upI where upd x y Leaf = OF Leaf (x,y) Leaf | upd x y (Node2 l ab r) = (case cmp x (fst ab) of LT ⇒ (case upd x y l of TI l 0 => TI (Node2 l 0 ab r) | OF l1 ab 0 l2 => TI (Node3 l1 ab 0 l2 ab r)) | EQ ⇒ TI (Node2 l (x,y) r) | GT ⇒ (case upd x y r of TI r 0 => TI (Node2 l ab r 0) | OF r1 ab 0 r2 => TI (Node3 l ab r1 ab 0 r2 ))) | upd x y (Node3 l ab1 m ab2 r) = (case cmp x (fst ab1 ) of LT ⇒ (case upd x y l of TI l 0 => TI (Node3 l 0 ab1 m ab2 r) | OF l1 ab 0 l2 => OF (Node2 l1 ab 0 l2 ) ab1 (Node2 m ab2 r)) | EQ ⇒ TI (Node3 l (x,y) m ab2 r) | GT ⇒ (case cmp x (fst ab2 ) of LT ⇒ (case upd x y m of TI m 0 => TI (Node3 l ab1 m 0 ab2 r) | OF m1 ab 0 m2 => OF (Node2 l ab1 m1 ) ab 0 (Node2 m2 ab2 r)) | EQ ⇒ TI (Node3 l ab1 m (x,y) r) | GT ⇒ (case upd x y r of TI r 0 => TI (Node3 l ab1 m ab2 r 0) | OF r1 ab 0 r2 => OF (Node2 l ab1 m) ab2 (Node2 r1 ab 0 r2 )))) definition update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) tree23 ⇒ ( 0a∗ 0b) tree23 where update a b t = treeI (upd a b t) fun del :: 0a::linorder ⇒ ( 0a∗ 0b) tree23 ⇒ ( 0a∗ 0b) upD where del x Leaf = TD Leaf | del x (Node2 Leaf ab1 Leaf ) = (if x=fst ab1 then UF Leaf else TD(Node2 Leaf ab1 Leaf )) | del x (Node3 Leaf ab1 Leaf ab2 Leaf ) = TD(if x=fst ab1 then Node2 Leaf ab2 Leaf else if x=fst ab2 then Node2 Leaf ab1 Leaf else Node3 Leaf ab1 Leaf ab2

102 Leaf ) | del x (Node2 l ab1 r) = (case cmp x (fst ab1 ) of LT ⇒ node21 (del x l) ab1 r | GT ⇒ node22 l ab1 (del x r) | EQ ⇒ let (ab1 0,t) = split min r in node22 l ab1 0 t) | del x (Node3 l ab1 m ab2 r) = (case cmp x (fst ab1 ) of LT ⇒ node31 (del x l) ab1 m ab2 r | EQ ⇒ let (ab1 0,m 0) = split min m in node32 l ab1 0 m 0 ab2 r | GT ⇒ (case cmp x (fst ab2 ) of LT ⇒ node32 l ab1 (del x m) ab2 r | EQ ⇒ let (ab2 0,r 0) = split min r in node33 l ab1 m ab2 0 r 0 | GT ⇒ node33 l ab1 m ab2 (del x r))) definition delete :: 0a::linorder ⇒ ( 0a∗ 0b) tree23 ⇒ ( 0a∗ 0b) tree23 where delete x t = treeD(del x t)

26.1 Functional Correctness lemma lookup map of : sorted1 (inorder t) =⇒ lookup t x = map of (inorder t) x by (induction t)(auto simp: map of simps split: option.split) lemma inorder upd: sorted1 (inorder t) =⇒ inorder(treeI (upd x y t)) = upd list x y (inorder t) by(induction t)(auto simp: upd list simps split: upI .splits) corollary inorder update: sorted1 (inorder t) =⇒ inorder(update x y t) = upd list x y (inorder t) by(simp add: update def inorder upd) lemma inorder del: [[ complete t ; sorted1 (inorder t) ]] =⇒ inorder(treeD (del x t)) = del list x (inorder t) by(induction t rule: del.induct) (auto simp: del list simps inorder nodes split minD split!: if split prod.splits) corollary inorder delete: [[ complete t ; sorted1 (inorder t) ]] =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del)

103 26.2 Balancedness lemma complete upd: complete t =⇒ complete (treeI (upd x y t)) ∧ hI (upd x y t) = height t by (induct t)(auto split!: if split upI .split) corollary complete update: complete t =⇒ complete (update x y t) by (simp add: update def complete upd) lemma height del: complete t =⇒ hD(del x t) = height t by(induction x t rule: del.induct) (auto simp add: heights max def height split min split: prod.split) lemma complete treeD del: complete t =⇒ complete(treeD(del x t)) by(induction x t rule: del.induct) (auto simp: completes complete split min height del height split min split: prod.split) corollary complete delete: complete t =⇒ complete(delete x t) by(simp add: delete def complete treeD del)

26.3 Overall Correctness interpretation M : Map by Ordered where empty = empty and lookup = lookup and update = update and delete = delete and inorder = inorder and inv = complete proof (standard, goal cases) case 1 thus ?case by(simp add: empty def ) next case 2 thus ?case by(simp add: lookup map of ) next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by(simp add: empty def ) next case 6 thus ?case by(simp add: complete update) next case 7 thus ?case by(simp add: complete delete) qed

104 end

27 2-3 Tree from List theory Tree23 of List imports Tree23 begin Linear-time bottom up conversion of a list of items into a complete 2-3 tree whose inorder traversal yields the list of items.

27.1 Code Nonempty lists of 2-3 trees alternating with items, starting and ending with a 2-3 tree: datatype 0a tree23s = T 0a tree23 | TTs 0a tree23 0a 0a tree23s abbreviation not T ts == (∀ t. ts 6= T t) fun len :: 0a tree23s ⇒ nat where len (T ) = 1 | len (TTs ts) = len ts + 1 fun trees :: 0a tree23s ⇒ 0a tree23 set where trees (T t) = {t} | trees (TTs t a ts) = {t} ∪ trees ts Join pairs of adjacent trees: fun join adj :: 0a tree23s ⇒ 0a tree23s where join adj (TTs t1 a (T t2 )) = T (Node2 t1 a t2 ) | join adj (TTs t1 a (TTs t2 b (T t3 ))) = T (Node3 t1 a t2 b t3 ) | join adj (TTs t1 a (TTs t2 b ts)) = TTs (Node2 t1 a t2 ) b (join adj ts) Towards termination of join all: lemma len ge2 : not T ts =⇒ len ts ≥ 2 by(cases ts rule: join adj .cases) auto lemma [measure function]: is measure len by(rule is measure trivial) lemma len join adj div2 : not T ts =⇒ len(join adj ts) ≤ len ts div 2 by(induction ts rule: join adj .induct) auto

105 lemma len join adj1 : not T ts =⇒ len(join adj ts) < len ts using len join adj div2 [of ts] len ge2 [of ts] by simp corollary len join adj2 [termination simp]: len(join adj (TTs t a ts)) ≤ len ts using len join adj1 [of TTs t a ts] by simp fun join all :: 0a tree23s ⇒ 0a tree23 where join all (T t) = t | join all ts = join all (join adj ts) fun leaves :: 0a list ⇒ 0a tree23s where leaves [] = T Leaf | leaves (a # as) = TTs Leaf a (leaves as) definition tree23 of list :: 0a list ⇒ 0a tree23 where tree23 of list as = join all(leaves as)

27.2 Functional correctness 27.2.1 inorder: fun inorder2 :: 0a tree23s ⇒ 0a list where inorder2 (T t) = inorder t | inorder2 (TTs t a ts) = inorder t @ a # inorder2 ts lemma inorder2 join adj : not T ts =⇒ inorder2 (join adj ts) = inorder2 ts by (induction ts rule: join adj .induct) auto lemma inorder join all: inorder (join all ts) = inorder2 ts proof (induction ts rule: join all.induct) case 1 thus ?case by simp next case (2 t a ts) thus ?case using inorder2 join adj [of TTs t a ts] by (simp add: le imp less Suc) qed lemma inorder2 leaves: inorder2 (leaves as) = as by(induction as) auto lemma inorder: inorder(tree23 of list as) = as by(simp add: tree23 of list def inorder join all inorder2 leaves)

106 27.2.2 Completeness: lemma complete join adj : ∀ t ∈ trees ts. complete t ∧ height t = n =⇒ not T ts =⇒ ∀ t ∈ trees (join adj ts). complete t ∧ height t = Suc n by (induction ts rule: join adj .induct) auto lemma complete join all: ∀ t ∈ trees ts. complete t ∧ height t = n =⇒ complete (join all ts) proof (induction ts arbitrary: n rule: join all.induct) case 1 thus ?case by simp next case (2 t a ts) thus ?case apply simp using complete join adj [of TTs t a ts n, simplified] by blast qed lemma complete leaves: t ∈ trees (leaves as) =⇒ complete t ∧ height t = 0 by (induction as) auto corollary complete: complete(tree23 of list as) by(simp add: tree23 of list def complete leaves complete join all[of 0 ])

27.3 Linear running time fun T join adj :: 0a tree23s ⇒ nat where T join adj (TTs t1 a (T t2 )) = 1 | T join adj (TTs t1 a (TTs t2 b (T t3 ))) = 1 | T join adj (TTs t1 a (TTs t2 b ts)) = T join adj ts + 1 fun T join all :: 0a tree23s ⇒ nat where T join all (T t) = 1 | T join all ts = T join adj ts + T join all (join adj ts) + 1 fun T leaves :: 0a list ⇒ nat where T leaves [] = 1 | T leaves (a # as) = T leaves as + 1 definition T tree23 of list :: 0a list ⇒ nat where T tree23 of list as = T leaves as + T join all(leaves as) + 1 lemma T join adj : not T ts =⇒ T join adj ts ≤ len ts div 2 by(induction ts rule: T join adj .induct) auto

107 lemma len ge 1 : len ts ≥ 1 by(cases ts) auto lemma T join all: T join all ts ≤ 2 ∗ len ts proof(induction ts rule: join all.induct) case 1 thus ?case by simp next case (2 t a ts) let ?ts = TTs t a ts have T join all ?ts = T join adj ?ts + T join all (join adj ?ts) + 1 by simp also have ... ≤ len ?ts div 2 + T join all (join adj ?ts) + 1 using T join adj [of ?ts] by simp also have ... ≤ len ?ts div 2 + 2 ∗ len (join adj ?ts) + 1 using 2 .IH by simp also have ... ≤ len ?ts div 2 + 2 ∗ (len ?ts div 2 ) + 1 using len join adj div2 [of ?ts] by simp also have ... ≤ 2 ∗ len ?ts using len ge 1 [of ?ts] by linarith finally show ?case . qed lemma T leaves: T leaves as = length as + 1 by(induction as) auto lemma len leaves: len(leaves as) = length as + 1 by(induction as) auto lemma T tree23 of list: T tree23 of list as ≤ 3 ∗(length as) + 4 using T join all[of leaves as] by(simp add: T tree23 of list def T leaves len leaves) end

28 2-3-4 Trees theory Tree234 imports Main begin class height = fixes height :: 0a ⇒ nat

108 datatype 0a tree234 = Leaf (hi) | Node2 0a tree234 0a 0a tree234 (h , , i) | Node3 0a tree234 0a 0a tree234 0a 0a tree234 (h , , , , i) | Node4 0a tree234 0a 0a tree234 0a 0a tree234 0a 0a tree234 (h , , , , , , i) fun inorder :: 0a tree234 ⇒ 0a list where inorder Leaf = [] | inorder(Node2 l a r) = inorder l @ a # inorder r | inorder(Node3 l a m b r) = inorder l @ a # inorder m @ b # inorder r | inorder(Node4 l a m b n c r) = inorder l @ a # inorder m @ b # inorder n @ c # inorder r instantiation tree234 :: (type)height begin fun height tree234 :: 0a tree234 ⇒ nat where height Leaf = 0 | height (Node2 l r) = Suc(max (height l)(height r)) | height (Node3 l m r) = Suc(max (height l)(max (height m)(height r))) | height (Node4 l m n r) = Suc(max (height l)(max (height m)(max (height n)(height r)))) instance .. end Balanced: fun bal :: 0a tree234 ⇒ bool where bal Leaf = True | bal (Node2 l r) = (bal l & bal r & height l = height r) | bal (Node3 l m r) = (bal l & bal m & bal r & height l = height m & height m = height r) | bal (Node4 l m n r) = (bal l & bal m & bal n & bal r & height l = height m & height m = height n & height n = height r) end

29 2-3-4 Tree Implementation of Sets theory Tree234 Set

109 imports Tree234 Cmp Set Specs begin declare sorted wrt.simps(2 )[simp del]

29.1 Set operations on 2-3-4 trees definition empty :: 0a tree234 where empty = Leaf fun isin :: 0a::linorder tree234 ⇒ 0a ⇒ bool where isin Leaf x = False | isin (Node2 l a r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ isin r x) | isin (Node3 l a m b r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ (case cmp x b of LT ⇒ isin m x | EQ ⇒ True | GT ⇒ isin r x)) | isin (Node4 t1 a t2 b t3 c t4 ) x = (case cmp x b of LT ⇒ (case cmp x a of LT ⇒ isin t1 x | EQ ⇒ True | GT ⇒ isin t2 x) | EQ ⇒ True | GT ⇒ (case cmp x c of LT ⇒ isin t3 x | EQ ⇒ True | GT ⇒ isin t4 x))

0 0 0 0 0 datatype a upi = T i a tree234 | Upi a tree234 a a tree234

0 0 fun treei :: a upi ⇒ a tree234 where treei (T i t) = t | treei (Upi l a r) = Node2 l a r

0 0 0 fun ins :: a::linorder ⇒ a tree234 ⇒ a upi where ins x Leaf = Upi Leaf x Leaf | ins x (Node2 l a r) = (case cmp x a of

110 LT ⇒ (case ins x l of 0 0 T i l => T i (Node2 l a r) | Upi l1 b l2 => T i (Node3 l1 b l2 a r)) | EQ ⇒ T i (Node2 l x r) | GT ⇒ (case ins x r of 0 0 T i r => T i (Node2 l a r ) | Upi r1 b r2 => T i (Node3 l a r1 b r2 ))) | ins x (Node3 l a m b r) = (case cmp x a of LT ⇒ (case ins x l of 0 0 T i l => T i (Node3 l a m b r) | Upi l1 c l2 => Upi (Node2 l1 c l2 ) a (Node2 m b r)) | EQ ⇒ T i (Node3 l a m b r) | GT ⇒ (case cmp x b of GT ⇒ (case ins x r of 0 0 T i r => T i (Node3 l a m b r ) | Upi r1 c r2 => Upi (Node2 l a m) b (Node2 r1 c r2 )) | EQ ⇒ T i (Node3 l a m b r) | LT ⇒ (case ins x m of 0 0 T i m => T i (Node3 l a m b r) | Upi m1 c m2 => Upi (Node2 l a m1 ) c (Node2 m2 b r)))) | ins x (Node4 t1 a t2 b t3 c t4 ) = (case cmp x b of LT ⇒ (case cmp x a of LT ⇒ (case ins x t1 of T i t => T i (Node4 t a t2 b t3 c t4 ) | Upi l y r => Upi (Node2 l y r) a (Node3 t2 b t3 c t4 )) | EQ ⇒ T i (Node4 t1 a t2 b t3 c t4 ) | GT ⇒ (case ins x t2 of T i t => T i (Node4 t1 a t b t3 c t4 ) | Upi l y r => Upi (Node2 t1 a l) y (Node3 r b t3 c t4 ))) | EQ ⇒ T i (Node4 t1 a t2 b t3 c t4 ) | GT ⇒ (case cmp x c of LT ⇒ (case ins x t3 of T i t => T i (Node4 t1 a t2 b t c t4 ) | Upi l y r => Upi (Node2 t1 a t2 ) b (Node3 l y r c t4 )) | EQ ⇒ T i (Node4 t1 a t2 b t3 c t4 ) | GT ⇒

111 (case ins x t4 of T i t => T i (Node4 t1 a t2 b t3 c t) | Upi l y r => Upi (Node2 t1 a t2 ) b (Node3 t3 c l y r)))) hide const insert definition insert :: 0a::linorder ⇒ 0a tree234 ⇒ 0a tree234 where insert x t = treei(ins x t)

0 0 0 datatype a upd = T d a tree234 | Upd a tree234

0 0 fun treed :: a upd ⇒ a tree234 where treed (T d t) = t | treed (Upd t) = t

0 0 0 0 fun node21 :: a upd ⇒ a ⇒ a tree234 ⇒ a upd where node21 (T d l) a r = T d(Node2 l a r) | node21 (Upd l) a (Node2 lr b rr) = Upd(Node3 l a lr b rr) | node21 (Upd l) a (Node3 lr b mr c rr) = T d(Node2 (Node2 l a lr) b (Node2 mr c rr)) | node21 (Upd t1 ) a (Node4 t2 b t3 c t4 d t5 ) = T d(Node2 (Node2 t1 a t2 ) b (Node3 t3 c t4 d t5 ))

0 0 0 0 fun node22 :: a tree234 ⇒ a ⇒ a upd ⇒ a upd where node22 l a (T d r) = T d(Node2 l a r) | node22 (Node2 ll b rl) a (Upd r) = Upd(Node3 ll b rl a r) | node22 (Node3 ll b ml c rl) a (Upd r) = T d(Node2 (Node2 ll b ml) c (Node2 rl a r)) | node22 (Node4 t1 a t2 b t3 c t4 ) d (Upd t5 ) = T d(Node2 (Node2 t1 a t2 ) b (Node3 t3 c t4 d t5 ))

0 0 0 0 0 0 fun node31 :: a upd ⇒ a ⇒ a tree234 ⇒ a ⇒ a tree234 ⇒ a upd where node31 (T d t1 ) a t2 b t3 = T d(Node3 t1 a t2 b t3 ) | node31 (Upd t1 ) a (Node2 t2 b t3 ) c t4 = T d(Node2 (Node3 t1 a t2 b t3 ) c t4 ) | node31 (Upd t1 ) a (Node3 t2 b t3 c t4 ) d t5 = T d(Node3 (Node2 t1 a t2 ) b (Node2 t3 c t4 ) d t5 ) | node31 (Upd t1 ) a (Node4 t2 b t3 c t4 d t5 ) e t6 = T d(Node3 (Node2 t1 a t2 ) b (Node3 t3 c t4 d t5 ) e t6 )

0 0 0 0 0 0 fun node32 :: a tree234 ⇒ a ⇒ a upd ⇒ a ⇒ a tree234 ⇒ a upd where node32 t1 a (T d t2 ) b t3 = T d(Node3 t1 a t2 b t3 ) | node32 t1 a (Upd t2 ) b (Node2 t3 c t4 ) = T d(Node2 t1 a (Node3 t2 b t3 c t4 )) |

112 node32 t1 a (Upd t2 ) b (Node3 t3 c t4 d t5 ) = T d(Node3 t1 a (Node2 t2 b t3 ) c (Node2 t4 d t5 )) | node32 t1 a (Upd t2 ) b (Node4 t3 c t4 d t5 e t6 ) = T d(Node3 t1 a (Node2 t2 b t3 ) c (Node3 t4 d t5 e t6 ))

0 0 0 0 0 0 fun node33 :: a tree234 ⇒ a ⇒ a tree234 ⇒ a ⇒ a upd ⇒ a upd where node33 l a m b (T d r) = T d(Node3 l a m b r) | node33 t1 a (Node2 t2 b t3 ) c (Upd t4 ) = T d(Node2 t1 a (Node3 t2 b t3 c t4 )) | node33 t1 a (Node3 t2 b t3 c t4 ) d (Upd t5 ) = T d(Node3 t1 a (Node2 t2 b t3 ) c (Node2 t4 d t5 )) | node33 t1 a (Node4 t2 b t3 c t4 d t5 ) e (Upd t6 ) = T d(Node3 t1 a (Node2 t2 b t3 ) c (Node3 t4 d t5 e t6 ))

0 0 0 0 0 0 0 fun node41 :: a upd ⇒ a ⇒ a tree234 ⇒ a ⇒ a tree234 ⇒ a ⇒ a 0 tree234 ⇒ a upd where node41 (T d t1 ) a t2 b t3 c t4 = T d(Node4 t1 a t2 b t3 c t4 ) | node41 (Upd t1 ) a (Node2 t2 b t3 ) c t4 d t5 = T d(Node3 (Node3 t1 a t2 b t3 ) c t4 d t5 ) | node41 (Upd t1 ) a (Node3 t2 b t3 c t4 ) d t5 e t6 = T d(Node4 (Node2 t1 a t2 ) b (Node2 t3 c t4 ) d t5 e t6 ) | node41 (Upd t1 ) a (Node4 t2 b t3 c t4 d t5 ) e t6 f t7 = T d(Node4 (Node2 t1 a t2 ) b (Node3 t3 c t4 d t5 ) e t6 f t7 )

0 0 0 0 0 0 0 fun node42 :: a tree234 ⇒ a ⇒ a upd ⇒ a ⇒ a tree234 ⇒ a ⇒ a 0 tree234 ⇒ a upd where node42 t1 a (T d t2 ) b t3 c t4 = T d(Node4 t1 a t2 b t3 c t4 ) | node42 (Node2 t1 a t2 ) b (Upd t3 ) c t4 d t5 = T d(Node3 (Node3 t1 a t2 b t3 ) c t4 d t5 ) | node42 (Node3 t1 a t2 b t3 ) c (Upd t4 ) d t5 e t6 = T d(Node4 (Node2 t1 a t2 ) b (Node2 t3 c t4 ) d t5 e t6 ) | node42 (Node4 t1 a t2 b t3 c t4 ) d (Upd t5 ) e t6 f t7 = T d(Node4 (Node2 t1 a t2 ) b (Node3 t3 c t4 d t5 ) e t6 f t7 )

0 0 0 0 0 0 0 fun node43 :: a tree234 ⇒ a ⇒ a tree234 ⇒ a ⇒ a upd ⇒ a ⇒ a 0 tree234 ⇒ a upd where node43 t1 a t2 b (T d t3 ) c t4 = T d(Node4 t1 a t2 b t3 c t4 ) | node43 t1 a (Node2 t2 b t3 ) c (Upd t4 ) d t5 = T d(Node3 t1 a (Node3 t2 b t3 c t4 ) d t5 ) | node43 t1 a (Node3 t2 b t3 c t4 ) d (Upd t5 ) e t6 = T d(Node4 t1 a (Node2 t2 b t3 ) c (Node2 t4 d t5 ) e t6 ) | node43 t1 a (Node4 t2 b t3 c t4 d t5 ) e (Upd t6 ) f t7 = T d(Node4 t1 a (Node2 t2 b t3 ) c (Node3 t4 d t5 e t6 ) f t7 )

113 fun node44 :: 0a tree234 ⇒ 0a ⇒ 0a tree234 ⇒ 0a ⇒ 0a tree234 ⇒ 0a ⇒ 0a 0 upd ⇒ a upd where node44 t1 a t2 b t3 c (T d t4 ) = T d(Node4 t1 a t2 b t3 c t4 ) | node44 t1 a t2 b (Node2 t3 c t4 ) d (Upd t5 ) = T d(Node3 t1 a t2 b (Node3 t3 c t4 d t5 )) | node44 t1 a t2 b (Node3 t3 c t4 d t5 ) e (Upd t6 ) = T d(Node4 t1 a t2 b (Node2 t3 c t4 ) d (Node2 t5 e t6 )) | node44 t1 a t2 b (Node4 t3 c t4 d t5 e t6 ) f (Upd t7 ) = T d(Node4 t1 a t2 b (Node2 t3 c t4 ) d (Node3 t5 e t6 f t7 ))

0 0 0 fun split min :: a tree234 ⇒ a ∗ a upd where split min (Node2 Leaf a Leaf ) = (a, Upd Leaf ) | split min (Node3 Leaf a Leaf b Leaf ) = (a, T d(Node2 Leaf b Leaf )) | split min (Node4 Leaf a Leaf b Leaf c Leaf ) = (a, T d(Node3 Leaf b Leaf c Leaf )) | split min (Node2 l a r) = (let (x,l 0) = split min l in (x, node21 l 0 a r)) | split min (Node3 l a m b r) = (let (x,l 0) = split min l in (x, node31 l 0 a m b r)) | split min (Node4 l a m b n c r) = (let (x,l 0) = split min l in (x, node41 l 0 a m b n c r))

0 0 0 fun del :: a::linorder ⇒ a tree234 ⇒ a upd where del k Leaf = T d Leaf | del k (Node2 Leaf p Leaf ) = (if k=p then Upd Leaf else T d(Node2 Leaf p Leaf )) | del k (Node3 Leaf p Leaf q Leaf ) = T d(if k=p then Node2 Leaf q Leaf else if k=q then Node2 Leaf p Leaf else Node3 Leaf p Leaf q Leaf ) | del k (Node4 Leaf a Leaf b Leaf c Leaf ) = T d(if k=a then Node3 Leaf b Leaf c Leaf else if k=b then Node3 Leaf a Leaf c Leaf else if k=c then Node3 Leaf a Leaf b Leaf else Node4 Leaf a Leaf b Leaf c Leaf ) | del k (Node2 l a r) = (case cmp k a of LT ⇒ node21 (del k l) a r | GT ⇒ node22 l a (del k r) | EQ ⇒ let (a 0,t) = split min r in node22 l a 0 t) | del k (Node3 l a m b r) = (case cmp k a of LT ⇒ node31 (del k l) a m b r | EQ ⇒ let (a 0,m 0) = split min m in node32 l a 0 m 0 b r | GT ⇒ (case cmp k b of LT ⇒ node32 l a (del k m) b r | EQ ⇒ let (b 0,r 0) = split min r in node33 l a m b 0 r 0 | GT ⇒ node33 l a m b (del k r))) | del k (Node4 l a m b n c r) = (case cmp k b of

114 LT ⇒ (case cmp k a of LT ⇒ node41 (del k l) a m b n c r | EQ ⇒ let (a 0,m 0) = split min m in node42 l a 0 m 0 b n c r | GT ⇒ node42 l a (del k m) b n c r) | EQ ⇒ let (b 0,n 0) = split min n in node43 l a m b 0 n 0 c r | GT ⇒ (case cmp k c of LT ⇒ node43 l a m b (del k n) c r | EQ ⇒ let (c 0,r 0) = split min r in node44 l a m b n c 0 r 0 | GT ⇒ node44 l a m b n c (del k r))) definition delete :: 0a::linorder ⇒ 0a tree234 ⇒ 0a tree234 where delete x t = treed(del x t)

29.2 Functional correctness 29.2.1 Functional correctness of isin: lemma isin set: sorted(inorder t) =⇒ isin t x = (x ∈ set (inorder t)) by (induction t)(auto simp: isin simps)

29.2.2 Functional correctness of insert: lemma inorder ins: sorted(inorder t) =⇒ inorder(treei(ins x t)) = ins list x (inorder t) by(induction t)(auto, auto simp: ins list simps split!: if splits upi.splits) lemma inorder insert: sorted(inorder t) =⇒ inorder(insert a t) = ins list a (inorder t) by(simp add: insert def inorder ins)

29.2.3 Functional correctness of delete lemma inorder node21 : height r > 0 =⇒ 0 0 inorder (treed (node21 l a r)) = inorder (treed l )@ a # inorder r by(induct l 0 a r rule: node21 .induct) auto lemma inorder node22 : height l > 0 =⇒ 0 0 inorder (treed (node22 l a r )) = inorder l @ a # inorder (treed r ) by(induct l a r 0 rule: node22 .induct) auto lemma inorder node31 : height m > 0 =⇒ 0 0 inorder (treed (node31 l a m b r)) = inorder (treed l )@ a # inorder m @ b # inorder r by(induct l 0 a m b r rule: node31 .induct) auto

115 lemma inorder node32 : height r > 0 =⇒ 0 0 inorder (treed (node32 l a m b r)) = inorder l @ a # inorder (treed m ) @ b # inorder r by(induct l a m 0 b r rule: node32 .induct) auto lemma inorder node33 : height m > 0 =⇒ 0 inorder (treed (node33 l a m b r )) = inorder l @ a # inorder m @ b # 0 inorder (treed r ) by(induct l a m b r 0 rule: node33 .induct) auto lemma inorder node41 : height m > 0 =⇒ 0 0 inorder (treed (node41 l a m b n c r)) = inorder (treed l )@ a # inorder m @ b # inorder n @ c # inorder r by(induct l 0 a m b n c r rule: node41 .induct) auto lemma inorder node42 : height l > 0 =⇒ inorder (treed (node42 l a m b n c r)) = inorder l @ a # inorder (treed m)@ b # inorder n @ c # inorder r by(induct l a m b n c r rule: node42 .induct) auto lemma inorder node43 : height m > 0 =⇒ inorder (treed (node43 l a m b n c r)) = inorder l @ a # inorder m @ b # inorder(treed n)@ c # inorder r by(induct l a m b n c r rule: node43 .induct) auto lemma inorder node44 : height n > 0 =⇒ inorder (treed (node44 l a m b n c r)) = inorder l @ a # inorder m @ b # inorder n @ c # inorder (treed r) by(induct l a m b n c r rule: node44 .induct) auto lemmas inorder nodes = inorder node21 inorder node22 inorder node31 inorder node32 inorder node33 inorder node41 inorder node42 inorder node43 inorder node44 lemma split minD: split min t = (x,t 0) =⇒ bal t =⇒ height t > 0 =⇒ 0 x # inorder(treed t ) = inorder t by(induction t arbitrary: t 0 rule: split min.induct) (auto simp: inorder nodes split: prod.splits) lemma inorder del: [[ bal t ; sorted(inorder t) ]] =⇒ inorder(treed (del x t)) = del list x (inorder t) by(induction t rule: del.induct) (auto simp: inorder nodes del list simps split minD split!: if split prod.splits)

116 lemma inorder delete: [[ bal t ; sorted(inorder t) ]] =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del)

29.3 Balancedness 29.3.1 Proofs for insert First a standard proof that ins preserves bal. instantiation upi :: (type)height begin

0 fun height upi :: a upi ⇒ nat where height (T i t) = height t | height (Upi l a r) = height l instance .. end lemma bal ins: bal t =⇒ bal (treei(ins a t)) ∧ height(ins a t) = height t by (induct t)(auto split!: if split upi.split) Now an alternative proof (by Brian Huffman) that runs faster because two properties (balance and height) are combined in one predicate. inductive full :: nat ⇒ 0a tree234 ⇒ bool where full 0 Leaf | [[full n l; full n r]] =⇒ full (Suc n)(Node2 l p r) | [[full n l; full n m; full n r]] =⇒ full (Suc n)(Node3 l p m q r) | [[full n l; full n m; full n m 0; full n r]] =⇒ full (Suc n)(Node4 l p m q m 0 q 0 r) inductive cases full elims: full n Leaf full n (Node2 l p r) full n (Node3 l p m q r) full n (Node4 l p m q m 0 q 0 r) inductive cases full 0 elim: full 0 t inductive cases full Suc elim: full (Suc n) t lemma full 0 iff [simp]: full 0 t ←→ t = Leaf

117 by (auto elim: full 0 elim intro: full.intros) lemma full Leaf iff [simp]: full n Leaf ←→ n = 0 by (auto elim: full elims intro: full.intros) lemma full Suc Node2 iff [simp]: full (Suc n)(Node2 l p r) ←→ full n l ∧ full n r by (auto elim: full elims intro: full.intros) lemma full Suc Node3 iff [simp]: full (Suc n)(Node3 l p m q r) ←→ full n l ∧ full n m ∧ full n r by (auto elim: full elims intro: full.intros) lemma full Suc Node4 iff [simp]: full (Suc n)(Node4 l p m q m 0 q 0 r) ←→ full n l ∧ full n m ∧ full n m 0 ∧ full n r by (auto elim: full elims intro: full.intros) lemma full imp height: full n t =⇒ height t = n by (induct set: full, simp all) lemma full imp bal: full n t =⇒ bal t by (induct set: full, auto dest: full imp height) lemma bal imp full: bal t =⇒ full (height t) t by (induct t, simp all) lemma bal iff full: bal t ←→ (∃ n. full n t) by (auto elim!: bal imp full full imp bal) The insert function either preserves the height of the tree, or increases it by one. The constructor returned by the insert function determines which: A return value of the form T i t indicates that the height will be the same. A value of the form Upi l p r indicates an increase in height. 0 primrec full i :: nat ⇒ a upi ⇒ bool where full i n (T i t) ←→ full n t | full i n (Upi l p r) ←→ full n l ∧ full n r lemma full i ins: full n t =⇒ full i n (ins a t) by (induct rule: full.induct)(auto, auto split: upi.split) The insert operation preserves balance. lemma bal insert: bal t =⇒ bal (insert a t) unfolding bal iff full insert def

118 apply (erule exE) apply (drule full i ins [of a]) apply (cases ins a t) apply (auto intro: full.intros) done

29.3.2 Proofs for delete instantiation upd :: (type)height begin

0 fun height upd :: a upd ⇒ nat where height (T d t) = height t | height (Upd t) = height t + 1 instance .. end lemma bal treed node21 : [[bal r; bal (treed l); height r = height l ]] =⇒ bal (treed (node21 l a r)) by(induct l a r rule: node21 .induct) auto lemma bal treed node22 : [[bal(treed r); bal l; height r = height l ]] =⇒ bal (treed (node22 l a r)) by(induct l a r rule: node22 .induct) auto lemma bal treed node31 : [[ bal (treed l); bal m; bal r; height l = height r; height m = height r ]] =⇒ bal (treed (node31 l a m b r)) by(induct l a m b r rule: node31 .induct) auto lemma bal treed node32 : [[ bal l; bal (treed m); bal r; height l = height r; height m = height r ]] =⇒ bal (treed (node32 l a m b r)) by(induct l a m b r rule: node32 .induct) auto lemma bal treed node33 : [[ bal l; bal m; bal(treed r); height l = height r; height m = height r ]] =⇒ bal (treed (node33 l a m b r)) by(induct l a m b r rule: node33 .induct) auto lemma bal treed node41 : [[ bal (treed l); bal m; bal n; bal r; height l = height r; height m = height

119 r; height n = height r ]] =⇒ bal (treed (node41 l a m b n c r)) by(induct l a m b n c r rule: node41 .induct) auto lemma bal treed node42 : [[ bal l; bal (treed m); bal n; bal r; height l = height r; height m = height r; height n = height r ]] =⇒ bal (treed (node42 l a m b n c r)) by(induct l a m b n c r rule: node42 .induct) auto lemma bal treed node43 : [[ bal l; bal m; bal (treed n); bal r; height l = height r; height m = height r; height n = height r ]] =⇒ bal (treed (node43 l a m b n c r)) by(induct l a m b n c r rule: node43 .induct) auto lemma bal treed node44 : [[ bal l; bal m; bal n; bal (treed r); height l = height r; height m = height r; height n = height r ]] =⇒ bal (treed (node44 l a m b n c r)) by(induct l a m b n c r rule: node44 .induct) auto lemmas bals = bal treed node21 bal treed node22 bal treed node31 bal treed node32 bal treed node33 bal treed node41 bal treed node42 bal treed node43 bal treed node44 lemma height node21 : height r > 0 =⇒ height(node21 l a r) = max (height l)(height r) + 1 by(induct l a r rule: node21 .induct)(simp all add: max.assoc) lemma height node22 : height l > 0 =⇒ height(node22 l a r) = max (height l)(height r) + 1 by(induct l a r rule: node22 .induct)(simp all add: max.assoc) lemma height node31 : height m > 0 =⇒ height(node31 l a m b r) = max (height l)(max (height m)(height r)) + 1 by(induct l a m b r rule: node31 .induct)(simp all add: max def ) lemma height node32 : height r > 0 =⇒ height(node32 l a m b r) = max (height l)(max (height m)(height r)) + 1 by(induct l a m b r rule: node32 .induct)(simp all add: max def )

120 lemma height node33 : height m > 0 =⇒ height(node33 l a m b r) = max (height l)(max (height m)(height r)) + 1 by(induct l a m b r rule: node33 .induct)(simp all add: max def ) lemma height node41 : height m > 0 =⇒ height(node41 l a m b n c r) = max (height l)(max (height m)(max (height n)(height r))) + 1 by(induct l a m b n c r rule: node41 .induct)(simp all add: max def ) lemma height node42 : height l > 0 =⇒ height(node42 l a m b n c r) = max (height l)(max (height m)(max (height n)(height r))) + 1 by(induct l a m b n c r rule: node42 .induct)(simp all add: max def ) lemma height node43 : height m > 0 =⇒ height(node43 l a m b n c r) = max (height l)(max (height m)(max (height n)(height r))) + 1 by(induct l a m b n c r rule: node43 .induct)(simp all add: max def ) lemma height node44 : height n > 0 =⇒ height(node44 l a m b n c r) = max (height l)(max (height m)(max (height n)(height r))) + 1 by(induct l a m b n c r rule: node44 .induct)(simp all add: max def ) lemmas heights = height node21 height node22 height node31 height node32 height node33 height node41 height node42 height node43 height node44 lemma height split min: split min t = (x, t 0) =⇒ height t > 0 =⇒ bal t =⇒ height t 0 = height t by(induct t arbitrary: x t 0 rule: split min.induct) (auto simp: heights split: prod.splits) lemma height del: bal t =⇒ height(del x t) = height t by(induction x t rule: del.induct) (auto simp add: heights height split min split!: if split prod.split) lemma bal split min: 0 0 [[ split min t = (x, t ); bal t; height t > 0 ]] =⇒ bal (treed t ) by(induct t arbitrary: x t 0 rule: split min.induct) (auto simp: heights height split min bals split: prod.splits) lemma bal treed del: bal t =⇒ bal(treed(del x t))

121 by(induction x t rule: del.induct) (auto simp: bals bal split min height del height split min split!: if split prod.split) corollary bal delete: bal t =⇒ bal(delete x t) by(simp add: delete def bal treed del)

29.4 Overall Correctness interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = bal proof (standard, goal cases) case 2 thus ?case by(simp add: isin set) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 6 thus ?case by(simp add: bal insert) next case 7 thus ?case by(simp add: bal delete) qed (simp add: empty def )+ end

30 2-3-4 Tree Implementation of Maps theory Tree234 Map imports Tree234 Set Map Specs begin

30.1 Map operations on 2-3-4 trees fun lookup :: ( 0a::linorder ∗ 0b) tree234 ⇒ 0a ⇒ 0b option where lookup Leaf x = None | lookup (Node2 l (a,b) r) x = (case cmp x a of LT ⇒ lookup l x | GT ⇒ lookup r x | EQ ⇒ Some b) | lookup (Node3 l (a1 ,b1 ) m (a2 ,b2 ) r) x = (case cmp x a1 of

122 LT ⇒ lookup l x | EQ ⇒ Some b1 | GT ⇒ (case cmp x a2 of LT ⇒ lookup m x | EQ ⇒ Some b2 | GT ⇒ lookup r x)) | lookup (Node4 t1 (a1 ,b1 ) t2 (a2 ,b2 ) t3 (a3 ,b3 ) t4 ) x = (case cmp x a2 of LT ⇒ (case cmp x a1 of LT ⇒ lookup t1 x | EQ ⇒ Some b1 | GT ⇒ lookup t2 x) | EQ ⇒ Some b2 | GT ⇒ (case cmp x a3 of LT ⇒ lookup t3 x | EQ ⇒ Some b3 | GT ⇒ lookup t4 x))

0 0 0 0 0 0 fun upd :: a::linorder ⇒ b ⇒ ( a∗ b) tree234 ⇒ ( a∗ b) upi where upd x y Leaf = Upi Leaf (x,y) Leaf | upd x y (Node2 l ab r) = (case cmp x (fst ab) of LT ⇒ (case upd x y l of 0 0 T i l => T i (Node2 l ab r) 0 0 | Upi l1 ab l2 => T i (Node3 l1 ab l2 ab r)) | EQ ⇒ T i (Node2 l (x,y) r) | GT ⇒ (case upd x y r of 0 0 T i r => T i (Node2 l ab r ) 0 0 | Upi r1 ab r2 => T i (Node3 l ab r1 ab r2 ))) | upd x y (Node3 l ab1 m ab2 r) = (case cmp x (fst ab1 ) of LT ⇒ (case upd x y l of 0 0 T i l => T i (Node3 l ab1 m ab2 r) 0 0 | Upi l1 ab l2 => Upi (Node2 l1 ab l2 ) ab1 (Node2 m ab2 r)) | EQ ⇒ T i (Node3 l (x,y) m ab2 r) | GT ⇒ (case cmp x (fst ab2 ) of LT ⇒ (case upd x y m of 0 0 T i m => T i (Node3 l ab1 m ab2 r) 0 0 | Upi m1 ab m2 => Upi (Node2 l ab1 m1 ) ab (Node2 m2 ab2 r)) | EQ ⇒ T i (Node3 l ab1 m (x,y) r) | GT ⇒ (case upd x y r of 0 0 T i r => T i (Node3 l ab1 m ab2 r ) 0 0 | Upi r1 ab r2 => Upi (Node2 l ab1 m) ab2 (Node2 r1 ab r2 )))) | upd x y (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) = (case cmp x (fst ab2 ) of LT ⇒ (case cmp x (fst ab1 ) of LT ⇒ (case upd x y t1 of 0 0 T i t1 => T i (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) | Upi t11 q t12 => Upi (Node2 t11 q t12 ) ab1 (Node3 t2 ab2 t3 ab3 t4 )) |

123 EQ ⇒ T i (Node4 t1 (x,y) t2 ab2 t3 ab3 t4 ) | GT ⇒ (case upd x y t2 of 0 0 T i t2 => T i (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) | Upi t21 q t22 => Upi (Node2 t1 ab1 t21 ) q (Node3 t22 ab2 t3 ab3 t4 ))) | EQ ⇒ T i (Node4 t1 ab1 t2 (x,y) t3 ab3 t4 ) | GT ⇒ (case cmp x (fst ab3 ) of LT ⇒ (case upd x y t3 of 0 0 T i t3 ⇒ T i (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) | Upi t31 q t32 => Upi (Node2 t1 ab1 t2 ) ab2/q (Node3 t31 q t32 ab3 t4 )) | EQ ⇒ T i (Node4 t1 ab1 t2 ab2 t3 (x,y) t4 ) | GT ⇒ (case upd x y t4 of 0 0 T i t4 => T i (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) | Upi t41 q t42 => Upi (Node2 t1 ab1 t2 ) ab2 (Node3 t3 ab3 t41 q t42 )))) definition update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) tree234 ⇒ ( 0a∗ 0b) tree234 where update x y t = treei(upd x y t)

0 0 0 0 0 fun del :: a::linorder ⇒ ( a∗ b) tree234 ⇒ ( a∗ b) upd where del x Leaf = T d Leaf | del x (Node2 Leaf ab1 Leaf ) = (if x=fst ab1 then Upd Leaf else T d(Node2 Leaf ab1 Leaf )) | del x (Node3 Leaf ab1 Leaf ab2 Leaf ) = T d(if x=fst ab1 then Node2 Leaf ab2 Leaf else if x=fst ab2 then Node2 Leaf ab1 Leaf else Node3 Leaf ab1 Leaf ab2 Leaf ) | del x (Node4 Leaf ab1 Leaf ab2 Leaf ab3 Leaf ) = T d(if x = fst ab1 then Node3 Leaf ab2 Leaf ab3 Leaf else if x = fst ab2 then Node3 Leaf ab1 Leaf ab3 Leaf else if x = fst ab3 then Node3 Leaf ab1 Leaf ab2 Leaf else Node4 Leaf ab1 Leaf ab2 Leaf ab3 Leaf ) | del x (Node2 l ab1 r) = (case cmp x (fst ab1 ) of LT ⇒ node21 (del x l) ab1 r | GT ⇒ node22 l ab1 (del x r) | EQ ⇒ let (ab1 0,t) = split min r in node22 l ab1 0 t) | del x (Node3 l ab1 m ab2 r) = (case cmp x (fst ab1 ) of LT ⇒ node31 (del x l) ab1 m ab2 r | EQ ⇒ let (ab1 0,m 0) = split min m in node32 l ab1 0 m 0 ab2 r | GT ⇒ (case cmp x (fst ab2 ) of LT ⇒ node32 l ab1 (del x m) ab2 r | EQ ⇒ let (ab2 0,r 0) = split min r in node33 l ab1 m ab2 0 r 0 |

124 GT ⇒ node33 l ab1 m ab2 (del x r))) | del x (Node4 t1 ab1 t2 ab2 t3 ab3 t4 ) = (case cmp x (fst ab2 ) of LT ⇒ (case cmp x (fst ab1 ) of LT ⇒ node41 (del x t1 ) ab1 t2 ab2 t3 ab3 t4 | EQ ⇒ let (ab 0,t2 0) = split min t2 in node42 t1 ab 0 t2 0 ab2 t3 ab3 t4 | GT ⇒ node42 t1 ab1 (del x t2 ) ab2 t3 ab3 t4 ) | EQ ⇒ let (ab 0,t3 0) = split min t3 in node43 t1 ab1 t2 ab 0 t3 0 ab3 t4 | GT ⇒ (case cmp x (fst ab3 ) of LT ⇒ node43 t1 ab1 t2 ab2 (del x t3 ) ab3 t4 | EQ ⇒ let (ab 0,t4 0) = split min t4 in node44 t1 ab1 t2 ab2 t3 ab 0 t4 0 | GT ⇒ node44 t1 ab1 t2 ab2 t3 ab3 (del x t4 ))) definition delete :: 0a::linorder ⇒ ( 0a∗ 0b) tree234 ⇒ ( 0a∗ 0b) tree234 where delete x t = treed(del x t)

30.2 Functional correctness lemma lookup map of : sorted1 (inorder t) =⇒ lookup t x = map of (inorder t) x by (induction t)(auto simp: map of simps split: option.split) lemma inorder upd: sorted1 (inorder t) =⇒ inorder(treei(upd a b t)) = upd list a b (inorder t) by(induction t) (auto simp: upd list simps, auto simp: upd list simps split: upi.splits) lemma inorder update: sorted1 (inorder t) =⇒ inorder(update a b t) = upd list a b (inorder t) by(simp add: update def inorder upd) lemma inorder del: [[ bal t ; sorted1 (inorder t) ]] =⇒ inorder(treed (del x t)) = del list x (inorder t) by(induction t rule: del.induct) (auto simp: del list simps inorder nodes split minD split!: if splits prod.splits) lemma inorder delete: [[ bal t ; sorted1 (inorder t) ]] =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del)

125 30.3 Balancedness lemma bal upd: bal t =⇒ bal (treei(upd x y t)) ∧ height(upd x y t) = height t by (induct t)(auto, auto split!: if split upi.split) lemma bal update: bal t =⇒ bal (update x y t) by (simp add: update def bal upd) lemma height del: bal t =⇒ height(del x t) = height t by(induction x t rule: del.induct) (auto simp add: heights height split min split!: if split prod.split) lemma bal treed del: bal t =⇒ bal(treed(del x t)) by(induction x t rule: del.induct) (auto simp: bals bal split min height del height split min split!: if split prod.split) corollary bal delete: bal t =⇒ bal(delete x t) by(simp add: delete def bal treed del)

30.4 Overall Correctness interpretation M : Map by Ordered where empty = empty and lookup = lookup and update = update and delete = delete and inorder = inorder and inv = bal proof (standard, goal cases) case 2 thus ?case by(simp add: lookup map of ) next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) next case 6 thus ?case by(simp add: bal update) next case 7 thus ?case by(simp add: bal delete) qed (simp add: empty def )+ end

31 1-2 Brother Tree Implementation of Sets theory Brother12 Set

126 imports Cmp Set Specs HOL−Number Theory.Fib begin

31.1 Data Type and Operations datatype 0a bro = N0 | N1 0a bro | N2 0a bro 0a 0a bro |

L2 0a | N3 0a bro 0a 0a bro 0a 0a bro definition empty :: 0a bro where empty = N0 fun inorder :: 0a bro ⇒ 0a list where inorder N0 = [] | inorder (N1 t) = inorder t | inorder (N2 l a r) = inorder l @ a # inorder r | inorder (L2 a) = [a] | inorder (N3 t1 a1 t2 a2 t3 ) = inorder t1 @ a1 # inorder t2 @ a2 # inorder t3 fun isin :: 0a bro ⇒ 0a::linorder ⇒ bool where isin N0 x = False | isin (N1 t) x = isin t x | isin (N2 l a r) x = (case cmp x a of LT ⇒ isin l x | EQ ⇒ True | GT ⇒ isin r x) fun n1 :: 0a bro ⇒ 0a bro where n1 (L2 a) = N2 N0 a N0 | n1 (N3 t1 a1 t2 a2 t3 ) = N2 (N2 t1 a1 t2 ) a2 (N1 t3 ) | n1 t = N1 t hide const (open) insert locale insert

127 begin fun n2 :: 0a bro ⇒ 0a ⇒ 0a bro ⇒ 0a bro where n2 (L2 a1 ) a2 t = N3 N0 a1 N0 a2 t | n2 (N3 t1 a1 t2 a2 t3 ) a3 (N1 t4 ) = N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 ) | n2 (N3 t1 a1 t2 a2 t3 ) a3 t4 = N3 (N2 t1 a1 t2 ) a2 (N1 t3 ) a3 t4 | n2 t1 a1 (L2 a2 ) = N3 t1 a1 N0 a2 N0 | n2 (N1 t1 ) a1 (N3 t2 a2 t3 a3 t4 ) = N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 ) | n2 t1 a1 (N3 t2 a2 t3 a3 t4 ) = N3 t1 a1 (N1 t2 ) a2 (N2 t3 a3 t4 ) | n2 t1 a t2 = N2 t1 a t2 fun ins :: 0a::linorder ⇒ 0a bro ⇒ 0a bro where ins x N0 = L2 x | ins x (N1 t) = n1 (ins x t) | ins x (N2 l a r) = (case cmp x a of LT ⇒ n2 (ins x l) a r | EQ ⇒ N2 l a r | GT ⇒ n2 l a (ins x r)) fun tree :: 0a bro ⇒ 0a bro where tree (L2 a) = N2 N0 a N0 | tree (N3 t1 a1 t2 a2 t3 ) = N2 (N2 t1 a1 t2 ) a2 (N1 t3 ) | tree t = t definition insert :: 0a::linorder ⇒ 0a bro ⇒ 0a bro where insert x t = tree(ins x t) end locale delete begin fun n2 :: 0a bro ⇒ 0a ⇒ 0a bro ⇒ 0a bro where n2 (N1 t1 ) a1 (N1 t2 ) = N1 (N2 t1 a1 t2 ) | n2 (N1 (N1 t1 )) a1 (N2 (N1 t2 ) a2 (N2 t3 a3 t4 )) = N1 (N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 )) | n2 (N1 (N1 t1 )) a1 (N2 (N2 t2 a2 t3 ) a3 (N1 t4 )) = N1 (N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 )) | n2 (N1 (N1 t1 )) a1 (N2 (N2 t2 a2 t3 ) a3 (N2 t4 a4 t5 )) = N2 (N2 (N1 t1 ) a1 (N2 t2 a2 t3 )) a3 (N1 (N2 t4 a4 t5 )) | n2 (N2 (N1 t1 ) a1 (N2 t2 a2 t3 )) a3 (N1 (N1 t4 )) = N1 (N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 )) | n2 (N2 (N2 t1 a1 t2 ) a2 (N1 t3 )) a3 (N1 (N1 t4 )) =

128 N1 (N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 )) | n2 (N2 (N2 t1 a1 t2 ) a2 (N2 t3 a3 t4 )) a5 (N1 (N1 t5 )) = N2 (N1 (N2 t1 a1 t2 )) a2 (N2 (N2 t3 a3 t4 ) a5 (N1 t5 )) | n2 t1 a1 t2 = N2 t1 a1 t2 fun split min :: 0a bro ⇒ ( 0a × 0a bro) option where split min N0 = None | split min (N1 t) = (case split min t of None ⇒ None | Some (a, t 0) ⇒ Some (a, N1 t 0)) | split min (N2 t1 a t2 ) = (case split min t1 of None ⇒ Some (a, N1 t2 ) | Some (b, t1 0) ⇒ Some (b, n2 t1 0 a t2 )) fun del :: 0a::linorder ⇒ 0a bro ⇒ 0a bro where del N0 = N0 | del x (N1 t) = N1 (del x t) | del x (N2 l a r) = (case cmp x a of LT ⇒ n2 (del x l) a r | GT ⇒ n2 l a (del x r) | EQ ⇒ (case split min r of None ⇒ N1 l | Some (b, r 0) ⇒ n2 l b r 0)) fun tree :: 0a bro ⇒ 0a bro where tree (N1 t) = t | tree t = t definition delete :: 0a::linorder ⇒ 0a bro ⇒ 0a bro where delete a t = tree (del a t) end

31.2 Invariants fun B :: nat ⇒ 0a bro set and U :: nat ⇒ 0a bro set where B 0 = {N0 } | B (Suc h) = { N2 t1 a t2 | t1 a t2 . t1 ∈ B h ∪ U h ∧ t2 ∈ B h ∨ t1 ∈ B h ∧ t2 ∈ B h ∪ U h} | U 0 = {} |

129 U (Suc h) = N1 ‘ B h abbreviation T h ≡ B h ∪ U h fun Bp :: nat ⇒ 0a bro set where Bp 0 = B 0 ∪ L2 ‘ UNIV | Bp (Suc 0 ) = B (Suc 0 ) ∪ {N3 N0 a N0 b N0 |a b. True} | Bp (Suc(Suc h)) = B (Suc(Suc h)) ∪ {N3 t1 a t2 b t3 | t1 a t2 b t3 . t1 ∈ B (Suc h) ∧ t2 ∈ U (Suc h) ∧ t3 ∈ B (Suc h)} fun Um :: nat ⇒ 0a bro set where Um 0 = {} | Um (Suc h) = N1 ‘ T h

31.3 Functional Correctness Proofs 31.3.1 Proofs for isin lemma isin set: t ∈ T h =⇒ sorted(inorder t) =⇒ isin t x = (x ∈ set(inorder t)) by(induction h arbitrary: t)(fastforce simp: isin simps split: if splits)+

31.3.2 Proofs for insertion lemma inorder n1 : inorder(n1 t) = inorder t by(cases t rule: n1 .cases)(auto simp: sorted lems) context insert begin lemma inorder n2 : inorder(n2 l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: n2 .cases)(auto simp: sorted lems) lemma inorder tree: inorder(tree t) = inorder t by(cases t) auto lemma inorder ins: t ∈ T h =⇒ sorted(inorder t) =⇒ inorder(ins a t) = ins list a (inorder t) by(induction h arbitrary: t)(auto simp: ins list simps inorder n1 inorder n2 ) lemma inorder insert: t ∈ T h =⇒ sorted(inorder t) =⇒ inorder(insert a t) = ins list a (inorder t) by(simp add: insert def inorder ins inorder tree)

130 end

31.3.3 Proofs for deletion context delete begin lemma inorder tree: inorder(tree t) = inorder t by(cases t) auto lemma inorder n2 : inorder(n2 l a r) = inorder l @ a # inorder r by(cases (l,a,r) rule: n2 .cases)(auto) lemma inorder split min: t ∈ T h =⇒ (split min t = None ←→ inorder t = []) ∧ (split min t = Some(a,t 0) −→ inorder t = a # inorder t 0) by(induction h arbitrary: t a t 0)(auto simp: inorder n2 split: option.splits) lemma inorder del: t ∈ T h =⇒ sorted(inorder t) =⇒ inorder(del x t) = del list x (inorder t) by(induction h arbitrary: t)(auto simp: del list simps inorder n2 inorder split min[OF UnI1 ] inorder split min[OF UnI2 ] split: option.splits) lemma inorder delete: t ∈ T h =⇒ sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del inorder tree) end

31.4 Invariant Proofs 31.4.1 Proofs for insertion lemma n1 type: t ∈ Bp h =⇒ n1 t ∈ T (Suc h) by(cases h rule: Bp.cases) auto context insert begin lemma tree type: t ∈ Bp h =⇒ tree t ∈ B h ∪ B (Suc h) by(cases h rule: Bp.cases) auto lemma n2 type:

131 (t1 ∈ Bp h ∧ t2 ∈ T h −→ n2 t1 a t2 ∈ Bp (Suc h)) ∧ (t1 ∈ T h ∧ t2 ∈ Bp h −→ n2 t1 a t2 ∈ Bp (Suc h)) apply(cases h rule: Bp.cases) apply (auto)[2 ] apply(rule conjI impI | erule conjE exE imageE | simp | erule disjE)+ done lemma Bp if B: t ∈ B h =⇒ t ∈ Bp h by (cases h rule: Bp.cases) simp all An automatic proof: lemma (t ∈ B h −→ ins x t ∈ Bp h) ∧ (t ∈ U h −→ ins x t ∈ T h) apply(induction h arbitrary: t) apply (simp) apply (fastforce simp: Bp if B n2 type dest: n1 type) done A detailed proof: lemma ins type: shows t ∈ B h =⇒ ins x t ∈ Bp h and t ∈ U h =⇒ ins x t ∈ T h proof(induction h arbitrary: t) case 0 { case 1 thus ?case by simp next case 2 thus ?case by simp } next case (Suc h) { case 1 then obtain t1 a t2 where [simp]: t = N2 t1 a t2 and t1 : t1 ∈ T h and t2 : t2 ∈ T h and t12 : t1 ∈ B h ∨ t2 ∈ B h by auto have ?case if x < a proof − have n2 (ins x t1 ) a t2 ∈ Bp (Suc h) proof cases assume t1 ∈ B h with t2 show ?thesis by (simp add: Suc.IH (1 ) n2 type) next assume t1 ∈/ B h hence 1 : t1 ∈ U h and 2 : t2 ∈ B h using t1 t12 by auto show ?thesis by (metis Suc.IH (2 )[OF 1 ] Bp if B[OF 2 ] n2 type) qed with hx < a i show ?case by simp

132 qed moreover have ?case if a < x proof − have n2 t1 a (ins x t2 ) ∈ Bp (Suc h) proof cases assume t2 ∈ B h with t1 show ?thesis by (simp add: Suc.IH (1 ) n2 type) next assume t2 ∈/ B h hence 1 : t1 ∈ B h and 2 : t2 ∈ U h using t2 t12 by auto show ?thesis by (metis Bp if B[OF 1 ] Suc.IH (2 )[OF 2 ] n2 type) qed with ha < x i show ?case by simp qed moreover have ?case if x = a proof − from 1 have t ∈ Bp (Suc h) by(rule Bp if B) thus ?case using hx = a i by simp qed ultimately show ?case by auto next case 2 thus ?case using Suc(1 ) n1 type by fastforce } qed lemma insert type: t ∈ B h =⇒ insert x t ∈ B h ∪ B (Suc h) unfolding insert def by (metis ins type(1 ) tree type) end

31.4.2 Proofs for deletion lemma B simps[simp]: N1 t ∈ B h = False L2 y ∈ B h = False (N3 t1 a1 t2 a2 t3 ) ∈ B h = False N0 ∈ B h ←→ h = 0 by (cases h, auto)+ context delete begin

133 lemma n2 type1 : [[t1 ∈ Um h; t2 ∈ B h]] =⇒ n2 t1 a t2 ∈ T (Suc h) apply(cases h rule: Bp.cases) apply auto[2 ] apply(erule exE bexE conjE imageE | simp | erule disjE)+ done lemma n2 type2 : [[t1 ∈ B h ; t2 ∈ Um h ]] =⇒ n2 t1 a t2 ∈ T (Suc h) apply(cases h rule: Bp.cases) apply auto[2 ] apply(erule exE bexE conjE imageE | simp | erule disjE)+ done lemma n2 type3 : [[t1 ∈ T h ; t2 ∈ T h ]] =⇒ n2 t1 a t2 ∈ T (Suc h) apply(cases h rule: Bp.cases) apply auto[2 ] apply(erule exE bexE conjE imageE | simp | erule disjE)+ done lemma split minNoneN0 : [[t ∈ B h; split min t = None]] =⇒ t = N0 by (cases t)(auto split: option.splits) lemma split minNoneN1 : [[t ∈ U h; split min t = None]] =⇒ t = N1 N0 by (cases h)(auto simp: split minNoneN0 split: option.splits) lemma split min type: t ∈ B h =⇒ split min t = Some (a, t 0) =⇒ t 0 ∈ T h t ∈ U h =⇒ split min t = Some (a, t 0) =⇒ t 0 ∈ Um h proof (induction h arbitrary: t a t 0) case (Suc h) { case 1 then obtain t1 a t2 where [simp]: t = N2 t1 a t2 and t12 : t1 ∈ T h t2 ∈ T h t1 ∈ B h ∨ t2 ∈ B h by auto show ?case proof (cases split min t1 ) case None show ?thesis proof cases assume t1 ∈ B h with split minNoneN0 [OF this None] 1 show ?thesis by(auto) next

134 assume t1 ∈/ B h thus ?thesis using 1 None by (auto) qed next case [simp]: (Some bt 0) obtain b t1 0 where [simp]: bt 0 = (b,t1 0) by fastforce show ?thesis proof cases assume t1 ∈ B h from Suc.IH (1 )[OF this] 1 have t1 0 ∈ T h by simp from n2 type3 [OF this t12 (2 )] 1 show ?thesis by auto next assume t1 ∈/ B h hence t1 : t1 ∈ U h and t2 : t2 ∈ B h using t12 by auto from Suc.IH (2 )[OF t1 ] have t1 0 ∈ Um h by simp from n2 type1 [OF this t2 ] 1 show ?thesis by auto qed qed } { case 2 then obtain t1 where [simp]: t = N1 t1 and t1 : t1 ∈ B h by auto show ?case proof (cases split min t1 ) case None with split minNoneN0 [OF t1 None] 2 show ?thesis by(auto) next case [simp]: (Some bt 0) obtain b t1 0 where [simp]: bt 0 = (b,t1 0) by fastforce from Suc.IH (1 )[OF t1 ] have t1 0 ∈ T h by simp thus ?thesis using 2 by auto qed } qed auto lemma del type: t ∈ B h =⇒ del x t ∈ T h t ∈ U h =⇒ del x t ∈ Um h proof (induction h arbitrary: x t) case (Suc h) { case 1 then obtain l a r where [simp]: t = N2 l a r and lr: l ∈ T h r ∈ T h l ∈ B h ∨ r ∈ B h by auto have ?case if x < a proof cases

135 assume l ∈ B h from n2 type3 [OF Suc.IH (1 )[OF this] lr(2 )] show ?thesis using hx a proof cases assume r ∈ B h from n2 type3 [OF lr(1 ) Suc.IH (1 )[OF this]] show ?thesis using hx>a i by(simp) next assume r ∈/ B h hence l ∈ B h r ∈ U h using lr by auto from n2 type2 [OF this(1 ) Suc.IH (2 )[OF this(2 )]] show ?thesis using hx>a i by(simp) qed moreover have ?case if [simp]: x=a proof (cases split min r) case None show ?thesis proof cases assume r ∈ B h with split minNoneN0 [OF this None] lr show ?thesis by(simp) next assume r ∈/ B h hence r ∈ U h using lr by auto with split minNoneN1 [OF this None] lr(3 ) show ?thesis by (simp) qed next case [simp]: (Some br 0) obtain b r 0 where [simp]: br 0 = (b,r 0) by fastforce show ?thesis proof cases assume r ∈ B h from split min type(1 )[OF this] n2 type3 [OF lr(1 )] show ?thesis by simp next assume r ∈/ B h

136 hence l ∈ B h and r ∈ U h using lr by auto from split min type(2 )[OF this(2 )] n2 type2 [OF this(1 )] show ?thesis by simp qed qed ultimately show ?case by auto } { case 2 with Suc.IH (1 ) show ?case by auto } qed auto lemma tree type: t ∈ T (h+1 ) =⇒ tree t ∈ B (h+1 ) ∪ B h by(auto) lemma delete type: t ∈ B h =⇒ delete x t ∈ B h ∪ B(h−1 ) unfolding delete def by (cases h)(simp, metis del type(1 ) tree type Suc eq plus1 diff Suc 1 ) end

31.5 Overall correctness interpretation Set by Ordered where empty = empty and isin = isin and insert = insert.insert and delete = delete.delete and inorder = inorder and inv = λt. ∃ h. t ∈ B h proof (standard, goal cases) case 2 thus ?case by(auto intro!: isin set) next case 3 thus ?case by(auto intro!: insert.inorder insert) next case 4 thus ?case by(auto intro!: delete.inorder delete) next case 6 thus ?case using insert.insert type by blast next case 7 thus ?case using delete.delete type by blast qed (auto simp: empty def )

31.6 Height-Size Relation By Daniel St¨uwe fun fib tree :: nat ⇒ unit bro where fib tree 0 = N0 | fib tree (Suc 0 ) = N2 N0 () N0 | fib tree (Suc(Suc h)) = N2 (fib tree (h+1 )) () (N1 (fib tree h))

137 fun fib 0 :: nat ⇒ nat where fib 0 0 = 0 | fib 0 (Suc 0 ) = 1 | fib 0 (Suc(Suc h)) = 1 + fib 0 (Suc h) + fib 0 h fun size :: 0a bro ⇒ nat where size N0 = 0 | size (N1 t) = size t | size (N2 t1 t2 ) = 1 + size t1 + size t2 lemma fib tree B: fib tree h ∈ B h by (induction h rule: fib tree.induct) auto declare [[names short]] lemma size fib 0: size (fib tree h) = fib 0 h by (induction h rule: fib tree.induct) auto lemma fibfib: fib 0 h + 1 = fib (Suc(Suc h)) by (induction h rule: fib tree.induct) auto lemma B N2 cases[consumes 1 ]: assumes N2 t1 a t2 ∈ B (Suc n) obtains (BB) t1 ∈ B n and t2 ∈ B n | (UB) t1 ∈ U n and t2 ∈ B n | (BU ) t1 ∈ B n and t2 ∈ U n using assms by auto lemma size bounded: t ∈ B h =⇒ size t ≥ size (fib tree h) unfolding size fib 0 proof (induction h arbitrary: t rule: fib 0.induct) case (3 h t 0) note main = 3 then obtain t1 a t2 where t 0: t 0 = N2 t1 a t2 by auto with main have N2 t1 a t2 ∈ B (Suc (Suc h)) by auto thus ?case proof (cases rule: B N2 cases) case BB then obtain x y z where t2 : t2 = N2 x y z ∨ t2 = N2 z y x x ∈ B h by auto show ?thesis unfolding t 0 using main(1 )[OF BB(1 )] main(2 )[OF t2 (2 )] t2 (1 ) by auto next case UB

138 then obtain t11 where t1 : t1 = N1 t11 t11 ∈ B h by auto show ?thesis unfolding t 0 t1 (1 ) using main(2 )[OF t1 (2 )] main(1 )[OF UB(2 )] by simp next case BU then obtain t22 where t2 : t2 = N1 t22 t22 ∈ B h by auto show ?thesis unfolding t 0 t2 (1 ) using main(2 )[OF t2 (2 )] main(1 )[OF BU (1 )] by simp qed qed auto theorem t ∈ B h =⇒ fib (h + 2 ) ≤ size t + 1 using size bounded by (simp add: size fib 0 fibfib[symmetric] del: fib.simps) end

32 1-2 Brother Tree Implementation of Maps theory Brother12 Map imports Brother12 Set Map Specs begin fun lookup :: ( 0a × 0b) bro ⇒ 0a::linorder ⇒ 0b option where lookup N0 x = None | lookup (N1 t) x = lookup t x | lookup (N2 l (a,b) r) x = (case cmp x a of LT ⇒ lookup l x | EQ ⇒ Some b | GT ⇒ lookup r x) locale update = insert begin fun upd :: 0a::linorder ⇒ 0b ⇒ ( 0a× 0b) bro ⇒ ( 0a× 0b) bro where upd x y N0 = L2 (x,y) | upd x y (N1 t) = n1 (upd x y t) | upd x y (N2 l (a,b) r) = (case cmp x a of LT ⇒ n2 (upd x y l)(a,b) r |

139 EQ ⇒ N2 l (a,y) r | GT ⇒ n2 l (a,b)(upd x y r)) definition update :: 0a::linorder ⇒ 0b ⇒ ( 0a× 0b) bro ⇒ ( 0a× 0b) bro where update x y t = tree(upd x y t) end context delete begin fun del :: 0a::linorder ⇒ ( 0a× 0b) bro ⇒ ( 0a× 0b) bro where del N0 = N0 | del x (N1 t) = N1 (del x t) | del x (N2 l (a,b) r) = (case cmp x a of LT ⇒ n2 (del x l)(a,b) r | GT ⇒ n2 l (a,b)(del x r) | EQ ⇒ (case split min r of None ⇒ N1 l | Some (ab, r 0) ⇒ n2 l ab r 0)) definition delete :: 0a::linorder ⇒ ( 0a× 0b) bro ⇒ ( 0a× 0b) bro where delete a t = tree (del a t) end

32.1 Functional Correctness Proofs 32.1.1 Proofs for lookup lemma lookup map of : t ∈ T h =⇒ sorted1 (inorder t) =⇒ lookup t x = map of (inorder t) x by(induction h arbitrary: t)(auto simp: map of simps split: option.splits)

32.1.2 Proofs for update context update begin lemma inorder upd: t ∈ T h =⇒ sorted1 (inorder t) =⇒ inorder(upd x y t) = upd list x y (inorder t) by(induction h arbitrary: t)(auto simp: upd list simps inorder n1 inorder n2 ) lemma inorder update: t ∈ T h =⇒

140 sorted1 (inorder t) =⇒ inorder(update x y t) = upd list x y (inorder t) by(simp add: update def inorder upd inorder tree) end

32.1.3 Proofs for deletion context delete begin lemma inorder del: t ∈ T h =⇒ sorted1 (inorder t) =⇒ inorder(del x t) = del list x (inorder t) by(induction h arbitrary: t)(auto simp: del list simps inorder n2 inorder split min[OF UnI1 ] inorder split min[OF UnI2 ] split: option.splits) lemma inorder delete: t ∈ T h =⇒ sorted1 (inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(simp add: delete def inorder del inorder tree) end

32.2 Invariant Proofs 32.2.1 Proofs for update context update begin lemma upd type: (t ∈ B h −→ upd x y t ∈ Bp h) ∧ (t ∈ U h −→ upd x y t ∈ T h) apply(induction h arbitrary: t) apply (simp) apply (fastforce simp: Bp if B n2 type dest: n1 type) done lemma update type: t ∈ B h =⇒ update x y t ∈ B h ∪ B (Suc h) unfolding update def by (metis upd type tree type) end

141 32.2.2 Proofs for deletion context delete begin lemma del type: t ∈ B h =⇒ del x t ∈ T h t ∈ U h =⇒ del x t ∈ Um h proof (induction h arbitrary: x t) case (Suc h) { case 1 then obtain l a b r where [simp]: t = N2 l (a,b) r and lr: l ∈ T h r ∈ T h l ∈ B h ∨ r ∈ B h by auto have ?case if x < a proof cases assume l ∈ B h from n2 type3 [OF Suc.IH (1 )[OF this] lr(2 )] show ?thesis using hx a proof cases assume r ∈ B h from n2 type3 [OF lr(1 ) Suc.IH (1 )[OF this]] show ?thesis using hx>a i by(simp) next assume r ∈/ B h hence l ∈ B h r ∈ U h using lr by auto from n2 type2 [OF this(1 ) Suc.IH (2 )[OF this(2 )]] show ?thesis using hx>a i by(simp) qed moreover have ?case if [simp]: x=a proof (cases split min r) case None show ?thesis proof cases assume r ∈ B h with split minNoneN0 [OF this None] lr show ?thesis by(simp)

142 next assume r ∈/ B h hence r ∈ U h using lr by auto with split minNoneN1 [OF this None] lr(3 ) show ?thesis by (simp) qed next case [simp]: (Some br 0) obtain b r 0 where [simp]: br 0 = (b,r 0) by fastforce show ?thesis proof cases assume r ∈ B h from split min type(1 )[OF this] n2 type3 [OF lr(1 )] show ?thesis by simp next assume r ∈/ B h hence l ∈ B h and r ∈ U h using lr by auto from split min type(2 )[OF this(2 )] n2 type2 [OF this(1 )] show ?thesis by simp qed qed ultimately show ?case by auto } { case 2 with Suc.IH (1 ) show ?case by auto } qed auto lemma delete type: t ∈ B h =⇒ delete x t ∈ B h ∪ B(h−1 ) unfolding delete def by (cases h)(simp, metis del type(1 ) tree type Suc eq plus1 diff Suc 1 ) end

32.3 Overall correctness interpretation Map by Ordered where empty = empty and lookup = lookup and update = update.update and delete = delete.delete and inorder = inorder and inv = λt. ∃ h. t ∈ B h proof (standard, goal cases) case 2 thus ?case by(auto intro!: lookup map of ) next case 3 thus ?case by(auto intro!: update.inorder update) next case 4 thus ?case by(auto intro!: delete.inorder delete)

143 next case 6 thus ?case using update.update type by (metis Un iff ) next case 7 thus ?case using delete.delete type by blast qed (auto simp: empty def ) end

33 AA Tree Implementation of Sets theory AA Set imports Isin2 Cmp begin type synonym 0a = ( 0a∗nat) tree definition empty :: 0a aa tree where empty = Leaf fun lvl :: 0a aa tree ⇒ nat where lvl Leaf = 0 | lvl (Node ( , lv) ) = lv fun invar :: 0a aa tree ⇒ bool where invar Leaf = True | invar (Node l (a, h) r) = (invar l ∧ invar r ∧ h = lvl l + 1 ∧ (h = lvl r + 1 ∨ (∃ lr b rr. r = Node lr (b,h) rr ∧ h = lvl rr + 1 ))) fun skew :: 0a aa tree ⇒ 0a aa tree where skew (Node (Node t1 (b, lvb) t2 )(a, lva) t3 ) = (if lva = lvb then Node t1 (b, lvb)(Node t2 (a, lva) t3 ) else Node (Node t1 (b, lvb) t2 )(a, lva) t3 ) | skew t = t fun split :: 0a aa tree ⇒ 0a aa tree where split (Node t1 (a, lva)(Node t2 (b, lvb)(Node t3 (c, lvc) t4 ))) = (if lva = lvb ∧ lvb = lvc — lva = lvc suffices then Node (Node t1 (a,lva) t2 )(b,lva+1 )(Node t3 (c, lva) t4 ) else Node t1 (a,lva)(Node t2 (b,lvb)(Node t3 (c,lvc) t4 ))) |

144 split t = t hide const (open) insert fun insert :: 0a::linorder ⇒ 0a aa tree ⇒ 0a aa tree where insert x Leaf = Node Leaf (x, 1 ) Leaf | insert x (Node t1 (a,lv) t2 ) = (case cmp x a of LT ⇒ split (skew (Node (insert x t1 )(a,lv) t2 )) | GT ⇒ split (skew (Node t1 (a,lv)(insert x t2 ))) | EQ ⇒ Node t1 (x, lv) t2 ) fun sngl :: 0a aa tree ⇒ bool where sngl Leaf = False | sngl (Node Leaf ) = True | sngl (Node ( , lva)(Node ( , lvb) )) = (lva > lvb) definition adjust :: 0a aa tree ⇒ 0a aa tree where adjust t = (case t of Node l (x,lv) r ⇒ (if lvl l >= lv−1 ∧ lvl r >= lv−1 then t else if lvl r < lv−1 ∧ sngl l then skew (Node l (x,lv−1 ) r) else if lvl r < lv−1 then case l of Node t1 (a,lva)(Node t2 (b,lvb) t3 ) ⇒ Node (Node t1 (a,lva) t2 )(b,lvb+1 )(Node t3 (x,lv−1 ) r) else if lvl r < lv then split (Node l (x,lv−1 ) r) else case r of Node t1 (b,lvb) t4 ⇒ (case t1 of Node t2 (a,lva) t3 ⇒ Node (Node l (x,lv−1 ) t2 )(a,lva+1 ) (split (Node t3 (b, if sngl t1 then lva else lva+1 ) t4 ))))) In the paper, the last case of adjust is expressed with the help of an incorrect auxiliary function nlvl. Function split max below is called dellrg in the paper. The latter is incorrect for two reasons: dellrg is meant to delete the largest element but recurses on the left instead of the right subtree; the invariant is not restored. fun split max :: 0a aa tree ⇒ 0a aa tree ∗ 0a where split max (Node l (a,lv) Leaf ) = (l,a) |

145 split max (Node l (a,lv) r) = (let (r 0,b) = split max r in (adjust(Node l (a,lv) r 0), b)) fun delete :: 0a::linorder ⇒ 0a aa tree ⇒ 0a aa tree where delete Leaf = Leaf | delete x (Node l (a,lv) r) = (case cmp x a of LT ⇒ adjust (Node (delete x l)(a,lv) r) | GT ⇒ adjust (Node l (a,lv)(delete x r)) | EQ ⇒ (if l = Leaf then r else let (l 0,b) = split max l in adjust (Node l 0 (b,lv) r))) fun pre adjust where pre adjust (Node l (a,lv) r) = (invar l ∧ invar r ∧ ((lv = lvl l + 1 ∧ (lv = lvl r + 1 ∨ lv = lvl r + 2 ∨ lv = lvl r ∧ sngl r)) ∨ (lv = lvl l + 2 ∧ (lv = lvl r + 1 ∨ lv = lvl r ∧ sngl r)))) declare pre adjust.simps [simp del]

33.1 Auxiliary Proofs lemma split case: split t = (case t of Node t1 (x,lvx)(Node t2 (y,lvy)(Node t3 (z,lvz) t4 )) ⇒ (if lvx = lvy ∧ lvy = lvz then Node (Node t1 (x,lvx) t2 )(y,lvx+1 )(Node t3 (z,lvx) t4 ) else t) | t ⇒ t) by(auto split: tree.split) lemma skew case: skew t = (case t of Node (Node t1 (y,lvy) t2 )(x,lvx) t3 ⇒ (if lvx = lvy then Node t1 (y, lvx)(Node t2 (x,lvx) t3 ) else t) | t ⇒ t) by(auto split: tree.split) lemma lvl 0 iff : invar t =⇒ lvl t = 0 ←→ t = Leaf by(cases t) auto lemma lvl Suc iff : lvl t = Suc n ←→ (∃ l a r. t = Node l (a,Suc n) r) by(cases t) auto lemma lvl skew: lvl (skew t) = lvl t by(cases t rule: skew.cases) auto

146 lemma lvl split: lvl (split t) = lvl t ∨ lvl (split t) = lvl t + 1 ∧ sngl (split t) by(cases t rule: split.cases) auto lemma invar 2Nodes:invar (Node l (x,lv)(Node rl (rx, rlv) rr)) = (invar l ∧ invar hrl, (rx, rlv), rri ∧ lv = Suc (lvl l) ∧ (lv = Suc rlv ∨ rlv = lv ∧ lv = Suc (lvl rr))) by simp lemma invar NodeLeaf [simp]: invar (Node l (x,lv) Leaf ) = (invar l ∧ lv = Suc (lvl l) ∧ lv = Suc 0 ) by simp lemma sngl if invar: invar (Node l (a, n) r) =⇒ n = lvl r =⇒ sngl r by(cases r rule: sngl.cases) clarsimp+

33.2 Invariance 33.2.1 Proofs for insert lemma lvl insert aux: lvl (insert x t) = lvl t ∨ lvl (insert x t) = lvl t + 1 ∧ sngl (insert x t) apply(induction t) apply (auto simp: lvl skew) apply (metis Suc eq plus1 lvl.simps(2 ) lvl split lvl skew)+ done lemma lvl insert: obtains (Same) lvl (insert x t) = lvl t | (Incr) lvl (insert x t) = lvl t + 1 sngl (insert x t) using lvl insert aux by blast lemma lvl insert sngl: invar t =⇒ sngl t =⇒ lvl(insert x t) = lvl t proof (induction t rule: insert.induct) case (2 x t1 a lv t2 ) consider (LT ) x < a | (GT ) x > a | (EQ) x = a using less linear by blast thus ?case proof cases case LT thus ?thesis using 2 by (auto simp add: skew case split case split: tree.splits) next case GT

147 thus ?thesis using 2 proof (cases t1 rule: tree2 cases) case Node thus ?thesis using 2 GT apply (auto simp add: skew case split case split: tree.splits) by (metis less not refl2 lvl.simps(2 ) lvl insert aux n not Suc n sngl.simps(3 ))+ qed (auto simp add: lvl 0 iff ) qed simp qed simp lemma skew invar: invar t =⇒ skew t = t by(cases t rule: skew.cases) auto lemma split invar: invar t =⇒ split t = t by(cases t rule: split.cases) clarsimp+ lemma invar NodeL: [[ invar(Node l (x, n) r); invar l 0; lvl l 0 = lvl l ]] =⇒ invar(Node l 0 (x, n) r) by(auto) lemma invar NodeR: [[ invar(Node l (x, n) r); n = lvl r + 1 ; invar r 0; lvl r 0 = lvl r ]] =⇒ invar(Node l (x, n) r 0) by(auto) lemma invar NodeR2 : [[ invar(Node l (x, n) r); sngl r 0; n = lvl r + 1 ; invar r 0; lvl r 0 = n ]] =⇒ invar(Node l (x, n) r 0) by(cases r 0 rule: sngl.cases) clarsimp+ lemma lvl insert incr iff :(lvl(insert a t) = lvl t + 1 ) ←→ (∃ l x r. insert a t = Node l (x, lvl t + 1 ) r ∧ lvl l = lvl r) apply(cases t rule: tree2 cases) apply(auto simp add: skew case split case split: if splits) apply(auto split: tree.splits if splits) done lemma invar insert: invar t =⇒ invar(insert a t) proof(induction t rule: tree2 induct) case N :(Node l x n r) hence il: invar l and ir: invar r by auto

148 note iil = N .IH (1 )[OF il] note iir = N .IH (2 )[OF ir] let ?t = Node l (x, n) r have a < x ∨ a = x ∨ x < a by auto moreover have ?case if a < x proof (cases rule: lvl insert[of a l]) case (Same) thus ?thesis using ha

149 have insert a ?t = split(skew(Node l (x, n)(insert a r))) using ha>x i by(auto) also have skew(Node l (x,n)(insert a r)) = Node l (x,n)(insert a r) using N .prems by(simp add: skew case split: tree.split) also have invar(split ... ) proof − from lvl insert sngl[OF ir sngl if invar[OF hinvar ?t i 0 ], of a] obtain t1 y t2 where iar: insert a r = Node t1 (y,n) t2 using N .prems 0 by (auto simp: lvl Suc iff ) from N .prems iar 0 iir show ?thesis by (auto simp: split case split: tree.splits) qed finally show ?thesis . next assume 1 : n = lvl r + 1 hence sngl ?t by(cases r) auto show ?thesis proof (cases rule: lvl insert[of a r]) case (Same) show ?thesis using hx

33.2.2 Proofs for delete lemma invarL: ASSUMPTION (invar hl, (a, lv), ri) =⇒ invar l by(simp add: ASSUMPTION def ) lemma invarR: ASSUMPTION (invar hl, (a,lv), ri) =⇒ invar r by(simp add: ASSUMPTION def ) lemma sngl NodeI : sngl (Node l (a,lv) r) =⇒ sngl (Node l 0 (a 0, lv) r)

150 by(cases r rule: tree2 cases)(simp all) declare invarL[simp] invarR[simp] lemma pre cases: assumes pre adjust (Node l (x,lv) r) obtains (tSngl) invar l ∧ invar r ∧ lv = Suc (lvl r) ∧ lvl l = lvl r | (tDouble) invar l ∧ invar r ∧ lv = lvl r ∧ Suc (lvl l) = lvl r ∧ sngl r | (rDown) invar l ∧ invar r ∧ lv = Suc (Suc (lvl r)) ∧ lv = Suc (lvl l) | (lDown tSngl) invar l ∧ invar r ∧ lv = Suc (lvl r) ∧ lv = Suc (Suc (lvl l)) | (lDown tDouble) invar l ∧ invar r ∧ lv = lvl r ∧ lv = Suc (Suc (lvl l)) ∧ sngl r using assms unfolding pre adjust.simps by auto declare invar.simps(2 )[simp del] invar 2Nodes[simp add] lemma invar adjust: assumes pre: pre adjust (Node l (a,lv) r) shows invar(adjust (Node l (a,lv) r)) using pre proof (cases rule: pre cases) case (tDouble) thus ?thesis unfolding adjust def by (cases r)(auto simp: invar.simps(2 )) next case (rDown) from rDown obtain llv ll la lr where l: l = Node ll (la, llv) lr by (cases l) auto from rDown show ?thesis unfolding adjust def by (auto simp: l in- var.simps(2 ) split: tree.splits) next case (lDown tDouble) from lDown tDouble obtain rlv rr ra rl where r: r = Node rl (ra, rlv) rr by (cases r) auto from lDown tDouble and r obtain rrlv rrr rra rrl where rr :rr = Node rrr (rra, rrlv) rrl by (cases rr) auto from lDown tDouble show ?thesis unfolding adjust def r rr apply (cases rl rule: tree2 cases) apply (auto simp add: invar.simps(2 ) split!: if split)

151 using lDown tDouble by (auto simp: split case lvl 0 iff elim:lvl.elims split: tree.split) qed (auto simp: split case invar.simps(2 ) adjust def split: tree.splits) lemma lvl adjust: assumes pre adjust (Node l (a,lv) r) shows lv = lvl (adjust(Node l (a,lv) r)) ∨ lv = lvl (adjust(Node l (a,lv) r)) + 1 using assms(1 ) proof(cases rule: pre cases) case lDown tSngl thus ?thesis using lvl split[of hl, (a, lvl r), ri] by (auto simp: adjust def ) next case lDown tDouble thus ?thesis by (auto simp: adjust def invar.simps(2 ) split: tree.split) qed (auto simp: adjust def split: tree.splits) lemma sngl adjust: assumes pre adjust (Node l (a,lv) r) sngl hl, (a, lv), ri lv = lvl (adjust hl, (a, lv), ri) shows sngl (adjust hl, (a, lv), ri) using assms proof (cases rule: pre cases) case rDown thus ?thesis using assms(2 ,3 ) unfolding adjust def by (auto simp add: skew case)(auto split: tree.split) qed (auto simp: adjust def skew case split case split: tree.split) definition post del t t 0 == invar t 0 ∧ (lvl t 0 = lvl t ∨ lvl t 0 + 1 = lvl t) ∧ (lvl t 0 = lvl t ∧ sngl t −→ sngl t 0) lemma pre adj if postR: invarhlv, (l, a), ri =⇒ post del r r 0 =⇒ pre adjust hlv, (l, a), r 0i by(cases sngl r) (auto simp: pre adjust.simps post del def invar.simps(2 ) elim: sngl.elims) lemma pre adj if postL: invarhl, (a, lv), ri =⇒ post del l l 0 =⇒ pre adjust hl 0, (b, lv), ri by(cases sngl r) (auto simp: pre adjust.simps post del def invar.simps(2 ) elim: sngl.elims) lemma post del adjL: [[ invarhl, (a, lv), ri; pre adjust hl 0, (b, lv), ri ]] =⇒ post del hl, (a, lv), ri (adjust hl 0, (b, lv), ri)

152 unfolding post del def by (metis invar adjust lvl adjust sngl NodeI sngl adjust lvl.simps(2 )) lemma post del adjR: assumes invarhl, (a,lv), ri pre adjust hl, (a,lv), r 0i post del r r 0 shows post del hl, (a,lv), ri (adjust hl, (a,lv), r 0i) proof(unfold post del def , safe del: disjCI ) let ?t = hl, (a,lv), ri let ?t 0 = adjust hl, (a,lv), r 0i show invar ?t 0 by(rule invar adjust[OF assms(2 )]) show lvl ?t 0 = lvl ?t ∨ lvl ?t 0 + 1 = lvl ?t using lvl adjust[OF assms(2 )] by auto show sngl ?t 0 if as: lvl ?t 0 = lvl ?t sngl ?t proof − have s: sngl hl, (a,lv), r 0i proof(cases r 0 rule: tree2 cases) case Leaf thus ?thesis by simp next case Node thus ?thesis using as(2 ) assms(1 ,3 ) by (cases r rule: tree2 cases)(auto simp: post del def ) qed show ?thesis using as(1 ) sngl adjust[OF assms(2 ) s] by simp qed qed declare prod.splits[split] theorem post split max: [[ invar t;(t 0, x) = split max t; t 6= Leaf ]] =⇒ post del t t 0 proof (induction t arbitrary: t 0 rule: split max.induct) case (2 l a lv rl bl rr) let ?r = hrl, bl, rri let ?t = hl, (a, lv), ?ri from 2 .prems(2 ) obtain r 0 where r 0:(r 0, x) = split max ?r and [simp]: t 0 = adjust hl, (a, lv), r 0i by auto 0 0 from 2 .IH [OF r ] hinvar ?t i have post: post del ?r r by simp note preR = pre adj if postR[OF hinvar ?t i post] show ?case by (simp add: post del adjR[OF 2 .prems(1 ) preR post]) qed (auto simp: post del def ) theorem post delete: invar t =⇒ post del t (delete x t) proof (induction t rule: tree2 induct) case (Node l a lv r)

153 let ?l 0 = delete x l and ?r 0 = delete x r let ?t = Node l (a,lv) r let ?t 0 = delete x ?t

from Node.prems have inv l: invar l and inv r: invar r by (auto)

note post l 0 = Node.IH (1 )[OF inv l] note preL = pre adj if postL[OF Node.prems post l 0]

note post r 0 = Node.IH (2 )[OF inv r] note preR = pre adj if postR[OF Node.prems post r 0]

show ?case proof (cases rule: linorder cases[of x a]) case less thus ?thesis using Node.prems by (simp add: post del adjL preL) next case greater thus ?thesis using Node.prems by (simp add: post del adjR preR post r 0) next case equal show ?thesis proof cases assume l = Leaf thus ?thesis using equal Node.prems by(auto simp: post del def invar.simps(2 )) next assume l 6= Leaf thus ?thesis using equal by simp (metis Node.prems inv l post del adjL post split max pre adj if postL) qed qed qed (simp add: post del def ) declare invar 2Nodes[simp del]

33.3 Functional Correctness 33.3.1 Proofs for insert lemma inorder split: inorder(split t) = inorder t by(cases t rule: split.cases)(auto) lemma inorder skew: inorder(skew t) = inorder t by(cases t rule: skew.cases)(auto)

154 lemma inorder insert: sorted(inorder t) =⇒ inorder(insert x t) = ins list x (inorder t) by(induction t)(auto simp: ins list simps inorder split inorder skew)

33.3.2 Proofs for delete lemma inorder adjust: t 6= Leaf =⇒ pre adjust t =⇒ inorder(adjust t) = inorder t by(cases t) (auto simp: adjust def inorder skew inorder split invar.simps(2 ) pre adjust.simps split: tree.splits) lemma split maxD: [[ split max t = (t 0,x); t 6= Leaf ; invar t ]] =⇒ inorder t 0 @[x] = inorder t by(induction t arbitrary: t 0 rule: split max.induct) (auto simp: sorted lems inorder adjust pre adj if postR post split max split: prod.splits) lemma inorder delete: invar t =⇒ sorted(inorder t) =⇒ inorder(delete x t) = del list x (inorder t) by(induction t) (auto simp: del list simps inorder adjust pre adj if postL pre adj if postR post split max post delete split maxD split: prod.splits) interpretation S: Set by Ordered where empty = empty and isin = isin and insert = insert and delete = delete and inorder = inorder and inv = invar proof (standard, goal cases) case 1 show ?case by (simp add: empty def ) next case 2 thus ?case by(simp add: isin set inorder) next case 3 thus ?case by(simp add: inorder insert) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by(simp add: empty def ) next case 6 thus ?case by(simp add: invar insert) next case 7 thus ?case using post delete by(auto simp: post del def ) qed

155 end

34 AA Tree Implementation of Maps theory AA Map imports AA Set Lookup2 begin fun update :: 0a::linorder ⇒ 0b ⇒ ( 0a∗ 0b) aa tree ⇒ ( 0a∗ 0b) aa tree where update x y Leaf = Node Leaf ((x,y), 1 ) Leaf | update x y (Node t1 ((a,b), lv) t2 ) = (case cmp x a of LT ⇒ split (skew (Node (update x y t1 ) ((a,b), lv) t2 )) | GT ⇒ split (skew (Node t1 ((a,b), lv)(update x y t2 ))) | EQ ⇒ Node t1 ((x,y), lv) t2 ) fun delete :: 0a::linorder ⇒ ( 0a∗ 0b) aa tree ⇒ ( 0a∗ 0b) aa tree where delete Leaf = Leaf | delete x (Node l ((a,b), lv) r) = (case cmp x a of LT ⇒ adjust (Node (delete x l) ((a,b), lv) r) | GT ⇒ adjust (Node l ((a,b), lv)(delete x r)) | EQ ⇒ (if l = Leaf then r else let (l 0,ab 0) = split max l in adjust (Node l 0 (ab 0, lv) r)))

34.1 Invariance 34.1.1 Proofs for insert lemma lvl update aux: lvl (update x y t) = lvl t ∨ lvl (update x y t) = lvl t + 1 ∧ sngl (update x y t) apply(induction t) apply (auto simp: lvl skew) apply (metis Suc eq plus1 lvl.simps(2 ) lvl split lvl skew)+ done lemma lvl update: obtains (Same) lvl (update x y t) = lvl t | (Incr) lvl (update x y t) = lvl t + 1 sngl (update x y t) using lvl update aux by fastforce

156 declare invar.simps(2 )[simp] lemma lvl update sngl: invar t =⇒ sngl t =⇒ lvl(update x y t) = lvl t proof (induction t rule: update.induct) case (2 x y t1 a b lv t2 ) consider (LT ) x < a | (GT ) x > a | (EQ) x = a using less linear by blast thus ?case proof cases case LT thus ?thesis using 2 by (auto simp add: skew case split case split: tree.splits) next case GT thus ?thesis using 2 proof (cases t1 ) case Node thus ?thesis using 2 GT apply (auto simp add: skew case split case split: tree.splits) by (metis less not refl2 lvl.simps(2 ) lvl update aux n not Suc n sngl.simps(3 ))+ qed (auto simp add: lvl 0 iff ) qed simp qed simp lemma lvl update incr iff :(lvl(update a b t) = lvl t + 1 ) ←→ (∃ l x r. update a b t = Node l (x,lvl t + 1 ) r ∧ lvl l = lvl r) apply(cases t) apply(auto simp add: skew case split case split: if splits) apply(auto split: tree.splits if splits) done lemma invar update: invar t =⇒ invar(update a b t) proof(induction t rule: tree2 induct) case N :(Node l xy n r) hence il: invar l and ir: invar r by auto note iil = N .IH (1 )[OF il] note iir = N .IH (2 )[OF ir] obtain x y where [simp]: xy = (x,y) by fastforce let ?t = Node l (xy, n) r have a < x ∨ a = x ∨ x < a by auto moreover have ?case if a < x proof (cases rule: lvl update[of a b l]) case (Same) thus ?thesis

157 using hax i by(auto) also have skew(Node l (xy, n)(update a b r)) = Node l (xy, n)(update a b r) using N .prems by(simp add: skew case split: tree.split) also have invar(split ... ) proof − from lvl update sngl[OF ir sngl if invar[OF hinvar ?t i 0 ], of a b]

158 obtain t1 p t2 where iar: update a b r = Node t1 (p, n) t2 using N .prems 0 by (auto simp: lvl Suc iff ) from N .prems iar 0 iir show ?thesis by (auto simp: split case split: tree.splits) qed finally show ?thesis . next assume 1 : n = lvl r + 1 hence sngl ?t by(cases r) auto show ?thesis proof (cases rule: lvl update[of a b r]) case (Same) show ?thesis using hx

34.1.2 Proofs for delete declare invar.simps(2 )[simp del] theorem post delete: invar t =⇒ post del t (delete x t) proof (induction t rule: tree2 induct) case (Node l ab lv r)

obtain a b where [simp]: ab = (a,b) by fastforce

let ?l 0 = delete x l and ?r 0 = delete x r let ?t = Node l (ab, lv) r let ?t 0 = delete x ?t

from Node.prems have inv l: invar l and inv r: invar r by (auto)

note post l 0 = Node.IH (1 )[OF inv l] note preL = pre adj if postL[OF Node.prems post l 0]

159 note post r 0 = Node.IH (2 )[OF inv r] note preR = pre adj if postR[OF Node.prems post r 0]

show ?case proof (cases rule: linorder cases[of x a]) case less thus ?thesis using Node.prems by (simp add: post del adjL preL) next case greater thus ?thesis using Node.prems preR by (simp add: post del adjR post r 0) next case equal show ?thesis proof cases assume l = Leaf thus ?thesis using equal Node.prems by(auto simp: post del def invar.simps(2 )) next assume l 6= Leaf thus ?thesis using equal Node.prems by simp (metis inv l post del adjL post split max pre adj if postL) qed qed qed (simp add: post del def )

34.2 Functional Correctness Proofs theorem inorder update: sorted1 (inorder t) =⇒ inorder(update x y t) = upd list x y (inorder t) by (induct t)(auto simp: upd list simps inorder split inorder skew) theorem inorder delete: [[invar t; sorted1 (inorder t)]] =⇒ inorder (delete x t) = del list x (inorder t) by(induction t) (auto simp: del list simps inorder adjust pre adj if postL pre adj if postR post split max post delete split maxD split: prod.splits) interpretation I : Map by Ordered where empty = empty and lookup = lookup and update = update and delete = delete and inorder = inorder and inv = invar proof (standard, goal cases) case 1 show ?case by (simp add: empty def )

160 next case 2 thus ?case by(simp add: lookup map of ) next case 3 thus ?case by(simp add: inorder update) next case 4 thus ?case by(simp add: inorder delete) next case 5 thus ?case by(simp add: empty def ) next case 6 thus ?case by(simp add: invar update) next case 7 thus ?case using post delete by(auto simp: post del def ) qed end

35 Join-Based Implementation of Sets theory Set2 Join imports Isin2 begin This theory implements the set operations insert, delete, union, intersection and diff erence. The implementation is based on binary search trees. All op- erations are reduced to a single operation join l x r that joins two BSTs l and r and an element x such that l < x < r. The theory is based on theory HOL−Data Structures.Tree2 where nodes have an additional field. This field is ignored here but it means that this the- ory can be instantiated with red-black trees (see theory Set2_Join_RBT.thy) and other balanced trees. This approach is very concrete and fixes the type of trees. Alternatively, one could assume some abstract type 0t of trees with suitable decomposition and recursion operators on it. locale Set2 Join = fixes join :: ( 0a::linorder∗ 0b) tree ⇒ 0a ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree fixes inv :: ( 0a∗ 0b) tree ⇒ bool assumes set join: set tree (join l a r) = set tree l ∪ {a} ∪ set tree r assumes bst join: bst (Node l (a, b) r) =⇒ bst (join l a r) assumes inv Leaf : inv hi assumes inv join: [[ inv l; inv r ]] =⇒ inv (join l a r) assumes inv Node: [[ inv (Node l (a,b) r) ]] =⇒ inv l ∧ inv r begin declare set join [simp] Let def [simp]

161 35.1 split min fun split min :: ( 0a∗ 0b) tree ⇒ 0a × ( 0a∗ 0b) tree where split min (Node l (a, ) r) = (if l = Leaf then (a,r) else let (m,l 0) = split min l in (m, join l 0 a r)) lemma split min set: [[ split min t = (m,t 0); t 6= Leaf ]] =⇒ m ∈ set tree t ∧ set tree t = {m} ∪ set tree t 0 proof(induction t arbitrary: t 0 rule: tree2 induct) case Node thus ?case by(auto split: prod.splits if splits dest: inv Node) next case Leaf thus ?case by simp qed lemma split min bst: [[ split min t = (m,t 0); bst t; t 6= Leaf ]] =⇒ bst t 0 ∧ (∀ x ∈ set tree t 0. m < x) proof(induction t arbitrary: t 0 rule: tree2 induct) case Node thus ?case by(fastforce simp: split min set bst join split: prod.splits if splits) next case Leaf thus ?case by simp qed lemma split min inv: [[ split min t = (m,t 0); inv t; t 6= Leaf ]] =⇒ inv t 0 proof(induction t arbitrary: t 0 rule: tree2 induct) case Node thus ?case by(auto simp: inv join split: prod.splits if splits dest: inv Node) next case Leaf thus ?case by simp qed

35.2 join2 definition join2 :: ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree where join2 l r = (if r = Leaf then l else let (m,r 0) = split min r in join l m r 0) lemma set join2 [simp]: set tree (join2 l r) = set tree l ∪ set tree r by(simp add: join2 def split min set split: prod.split) lemma bst join2 : [[ bst l; bst r; ∀ x ∈ set tree l. ∀ y ∈ set tree r. x < y ]] =⇒ bst (join2 l r)

162 by(simp add: join2 def bst join split min set split min bst split: prod.split) lemma inv join2 : [[ inv l; inv r ]] =⇒ inv (join2 l r) by(simp add: join2 def inv join split min set split min inv split: prod.split)

35.3 split fun split :: ( 0a∗ 0b)tree ⇒ 0a ⇒ ( 0a∗ 0b)tree × bool × ( 0a∗ 0b)tree where split Leaf k = (Leaf , False, Leaf ) | split (Node l (a, ) r) x = (case cmp x a of LT ⇒ let (l1 ,b,l2 ) = split l x in (l1 , b, join l2 a r) | GT ⇒ let (r1 ,b,r2 ) = split r x in (join l a r1 , b, r2 ) | EQ ⇒ (l, True, r)) lemma split: split t x = (l,b,r) =⇒ bst t =⇒ set tree l = {a ∈ set tree t. a < x} ∧ set tree r = {a ∈ set tree t. x < a} ∧ (b = (x ∈ set tree t)) ∧ bst l ∧ bst r proof(induction t arbitrary: l b r rule: tree2 induct) case Leaf thus ?case by simp next case Node thus ?case by(force split!: prod.splits if splits intro!: bst join) qed lemma split inv: split t x = (l,b,r) =⇒ inv t =⇒ inv l ∧ inv r proof(induction t arbitrary: l b r rule: tree2 induct) case Leaf thus ?case by simp next case Node thus ?case by(force simp: inv join split!: prod.splits if splits dest!: inv Node) qed declare split.simps[simp del]

35.4 insert definition insert :: 0a ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree where insert x t = (let (l, ,r) = split t x in join l x r) lemma set tree insert: bst t =⇒ set tree (insert x t) = {x} ∪ set tree t by(auto simp add: insert def split split: prod.split) lemma bst insert: bst t =⇒ bst (insert x t) by(auto simp add: insert def bst join dest: split split: prod.split)

163 lemma inv insert: inv t =⇒ inv (insert x t) by(force simp: insert def inv join dest: split inv split: prod.split)

35.5 delete definition delete :: 0a ⇒ ( 0a∗ 0b) tree ⇒ ( 0a∗ 0b) tree where delete x t = (let (l, ,r) = split t x in join2 l r) lemma set tree delete: bst t =⇒ set tree (delete x t) = set tree t − {x} by(auto simp: delete def split split: prod.split) lemma bst delete: bst t =⇒ bst (delete x t) by(force simp add: delete def intro: bst join2 dest: split split: prod.split) lemma inv delete: inv t =⇒ inv (delete x t) by(force simp: delete def inv join2 dest: split inv split: prod.split)

35.6 union fun union :: ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree where union t1 t2 = (if t1 = Leaf then t2 else if t2 = Leaf then t1 else case t1 of Node l1 (a, ) r1 ⇒ let (l2 , ,r2 ) = split t2 a; l 0 = union l1 l2 ; r 0 = union r1 r2 in join l 0 a r 0) declare union.simps [simp del] lemma set tree union: bst t2 =⇒ set tree (union t1 t2 ) = set tree t1 ∪ set tree t2 proof(induction t1 t2 rule: union.induct) case (1 t1 t2 ) then show ?case by (auto simp: union.simps[of t1 t2 ] split split: tree.split prod.split) qed lemma bst union: [[ bst t1 ; bst t2 ]] =⇒ bst (union t1 t2 ) proof(induction t1 t2 rule: union.induct) case (1 t1 t2 ) thus ?case by(fastforce simp: union.simps[of t1 t2 ] set tree union split intro!: bst join

164 split: tree.split prod.split) qed lemma inv union: [[ inv t1 ; inv t2 ]] =⇒ inv (union t1 t2 ) proof(induction t1 t2 rule: union.induct) case (1 t1 t2 ) thus ?case by(auto simp:union.simps[of t1 t2 ] inv join split inv split!: tree.split prod.split dest: inv Node) qed

35.7 inter fun inter :: ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree where inter t1 t2 = (if t1 = Leaf then Leaf else if t2 = Leaf then Leaf else case t1 of Node l1 (a, ) r1 ⇒ let (l2 ,b,r2 ) = split t2 a; l 0 = inter l1 l2 ; r 0 = inter r1 r2 in if b then join l 0 a r 0 else join2 l 0 r 0) declare inter.simps [simp del] lemma set tree inter: [[ bst t1 ; bst t2 ]] =⇒ set tree (inter t1 t2 ) = set tree t1 ∩ set tree t2 proof(induction t1 t2 rule: inter.induct) case (1 t1 t2 ) show ?case proof (cases t1 rule: tree2 cases) case Leaf thus ?thesis by (simp add: inter.simps) next case [simp]: (Node l1 a r1 ) show ?thesis proof (cases t2 = Leaf ) case True thus ?thesis by (simp add: inter.simps) next case False let ?L1 = set tree l1 let ?R1 = set tree r1 have ∗: a ∈/ ?L1 ∪ ?R1 using hbst t1 i by (fastforce) obtain l2 b r2 where sp: split t2 a = (l2 ,b,r2 ) using prod cases3 by blast let ?L2 = set tree l2 let ?R2 = set tree r2 let ?A = if b then {a}

165 else {} have t2 : set tree t2 = ?L2 ∪ ?R2 ∪ ?A and ∗∗: ?L2 ∩ ?R2 = {} a ∈/ ?L2 ∪ ?R2 ?L1 ∩ ?R2 = {} ?L2 ∩ ?R1 = {} using split[OF sp] hbst t1 i hbst t2 i by (force, force, force, force, force) have IHl: set tree (inter l1 l2 ) = set tree l1 ∩ set tree l2 using 1 .IH (1 )[OF False sp[symmetric]] 1 .prems(1 ,2 ) split[OF sp] by simp have IHr: set tree (inter r1 r2 ) = set tree r1 ∩ set tree r2 using 1 .IH (2 )[OF False sp[symmetric]] 1 .prems(1 ,2 ) split[OF sp] by simp have set tree t1 ∩ set tree t2 = (?L1 ∪ ?R1 ∪ {a}) ∩ (?L2 ∪ ?R2 ∪ ?A) by(simp add: t2 ) also have ... = (?L1 ∩ ?L2 ) ∪ (?R1 ∩ ?R2 ) ∪ ?A using ∗ ∗∗ by auto also have ... = set tree (inter t1 t2 ) using IHl IHr sp inter.simps[of t1 t2 ] False by(simp) finally show ?thesis by simp qed qed qed lemma bst inter: [[ bst t1 ; bst t2 ]] =⇒ bst (inter t1 t2 ) proof(induction t1 t2 rule: inter.induct) case (1 t1 t2 ) thus ?case by(fastforce simp: inter.simps[of t1 t2 ] set tree inter split intro!: bst join bst join2 split: tree.split prod.split) qed lemma inv inter: [[ inv t1 ; inv t2 ]] =⇒ inv (inter t1 t2 ) proof(induction t1 t2 rule: inter.induct) case (1 t1 t2 ) thus ?case by(auto simp: inter.simps[of t1 t2 ] inv join inv join2 split inv split!: tree.split prod.split dest: inv Node) qed

35.8 diff fun diff :: ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree ⇒ ( 0a∗ 0b)tree where diff t1 t2 = (if t1 = Leaf then Leaf else

166 if t2 = Leaf then t1 else case t2 of Node l2 (a, ) r2 ⇒ let (l1 , ,r1 ) = split t1 a; l 0 = diff l1 l2 ; r 0 = diff r1 r2 in join2 l 0 r 0) declare diff .simps [simp del] lemma set tree diff : [[ bst t1 ; bst t2 ]] =⇒ set tree (diff t1 t2 ) = set tree t1 − set tree t2 proof(induction t1 t2 rule: diff .induct) case (1 t1 t2 ) show ?case proof (cases t2 rule: tree2 cases) case Leaf thus ?thesis by (simp add: diff .simps) next case [simp]: (Node l2 a r2 ) show ?thesis proof (cases t1 = Leaf ) case True thus ?thesis by (simp add: diff .simps) next case False let ?L2 = set tree l2 let ?R2 = set tree r2 obtain l1 b r1 where sp: split t1 a = (l1 ,b,r1 ) using prod cases3 by blast let ?L1 = set tree l1 let ?R1 = set tree r1 let ?A = if b then {a} else {} have t1 : set tree t1 = ?L1 ∪ ?R1 ∪ ?A and ∗∗: a ∈/ ?L1 ∪ ?R1 ?L1 ∩ ?R2 = {} ?L2 ∩ ?R1 = {} using split[OF sp] hbst t1 i hbst t2 i by (force, force, force, force) have IHl: set tree (diff l1 l2 ) = set tree l1 − set tree l2 using 1 .IH (1 )[OF False sp[symmetric]] 1 .prems(1 ,2 ) split[OF sp] by simp have IHr: set tree (diff r1 r2 ) = set tree r1 − set tree r2 using 1 .IH (2 )[OF False sp[symmetric]] 1 .prems(1 ,2 ) split[OF sp] by simp have set tree t1 − set tree t2 = (?L1 ∪ ?R1 ) − (?L2 ∪ ?R2 ∪ {a}) by(simp add: t1 ) also have ... = (?L1 − ?L2 ) ∪ (?R1 − ?R2 ) using ∗∗ by auto also have ... = set tree (diff t1 t2 ) using IHl IHr sp diff .simps[of t1 t2 ] False by(simp) finally show ?thesis by simp qed

167 qed qed lemma bst diff : [[ bst t1 ; bst t2 ]] =⇒ bst (diff t1 t2 ) proof(induction t1 t2 rule: diff .induct) case (1 t1 t2 ) thus ?case by(fastforce simp: diff .simps[of t1 t2 ] set tree diff split intro!: bst join bst join2 split: tree.split prod.split) qed lemma inv diff : [[ inv t1 ; inv t2 ]] =⇒ inv (diff t1 t2 ) proof(induction t1 t2 rule: diff .induct) case (1 t1 t2 ) thus ?case by(auto simp: diff .simps[of t1 t2 ] inv join inv join2 split inv split!: tree.split prod.split dest: inv Node) qed Locale Set2 Join implements locale Set2 : sublocale Set2 where empty = Leaf and insert = insert and delete = delete and isin = isin and union = union and inter = inter and diff = diff and set = set tree and invar = λt. inv t ∧ bst t proof (standard, goal cases) case 1 show ?case by (simp) next case 2 thus ?case by(simp add: isin set tree) next case 3 thus ?case by (simp add: set tree insert) next case 4 thus ?case by (simp add: set tree delete) next case 5 thus ?case by (simp add: inv Leaf ) next case 6 thus ?case by (simp add: bst insert inv insert) next case 7 thus ?case by (simp add: bst delete inv delete) next case 8 thus ?case by(simp add: set tree union) next case 9 thus ?case by(simp add: set tree inter) next

168 case 10 thus ?case by(simp add: set tree diff ) next case 11 thus ?case by (simp add: bst union inv union) next case 12 thus ?case by (simp add: bst inter inv inter) next case 13 thus ?case by (simp add: bst diff inv diff ) qed end interpretation unbal: Set2 Join where join = λl x r. Node l (x, ()) r and inv = λt. True proof (standard, goal cases) case 1 show ?case by simp next case 2 thus ?case by simp next case 3 thus ?case by simp next case 4 thus ?case by simp next case 5 thus ?case by simp qed end

36 Join-Based Implementation of Sets via RBTs theory Set2 Join RBT imports Set2 Join RBT Set begin

36.1 Code Function joinL joins two trees (and an element). Precondition: bheight l ≤ bheight r. Method: Descend along the left spine of r until you find a subtree with the same bheight as l, then combine them into a new red node. fun joinL :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where joinL l x r = (if bheight l ≥ bheight r then R l x r

169 else case r of B l 0 x 0 r 0 ⇒ baliL (joinL l x l 0) x 0 r 0 | R l 0 x 0 r 0 ⇒ R (joinL l x l 0) x 0 r 0) fun joinR :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where joinR l x r = (if bheight l ≤ bheight r then R l x r else case l of B l 0 x 0 r 0 ⇒ baliR l 0 x 0 (joinR r 0 x r) | R l 0 x 0 r 0 ⇒ R l 0 x 0 (joinR r 0 x r)) definition join :: 0a rbt ⇒ 0a ⇒ 0a rbt ⇒ 0a rbt where join l x r = (if bheight l > bheight r then paint Black (joinR l x r) else if bheight l < bheight r then paint Black (joinL l x r) else B l x r) declare joinL.simps[simp del] declare joinR.simps[simp del]

36.2 Properties 36.2.1 Color and height invariants lemma invc2 joinL: [[ invc l; invc r; bheight l ≤ bheight r ]] =⇒ invc2 (joinL l x r) ∧ (bheight l 6= bheight r ∧ color r = Black −→ invc(joinL l x r)) proof (induct l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: invc baliL invc2I joinL.simps[of l x r] split!: tree.splits if splits) qed lemma invc2 joinR: [[ invc l; invh l; invc r; invh r; bheight l ≥ bheight r ]] =⇒ invc2 (joinR l x r) ∧ (bheight l 6= bheight r ∧ color l = Black −→ invc(joinR l x r)) proof (induct l x r rule: joinR.induct) case (1 l x r) thus ?case by(fastforce simp: invc baliR invc2I joinR.simps[of l x r] split!: tree.splits if splits)

170 qed lemma bheight joinL: [[ invh l; invh r; bheight l ≤ bheight r ]] =⇒ bheight (joinL l x r) = bheight r proof (induct l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: bheight baliL joinL.simps[of l x r] split!: tree.split) qed lemma invh joinL: [[ invh l; invh r; bheight l ≤ bheight r ]] =⇒ invh (joinL l x r) proof (induct l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: invh baliL bheight joinL joinL.simps[of l x r] split!: tree.split color.split) qed lemma bheight joinR: [[ invh l; invh r; bheight l ≥ bheight r ]] =⇒ bheight (joinR l x r) = bheight l proof (induct l x r rule: joinR.induct) case (1 l x r) thus ?case by(fastforce simp: bheight baliR joinR.simps[of l x r] split!: tree.split) qed lemma invh joinR: [[ invh l; invh r; bheight l ≥ bheight r ]] =⇒ invh (joinR l x r) proof (induct l x r rule: joinR.induct) case (1 l x r) thus ?case by(fastforce simp: invh baliR bheight joinR joinR.simps[of l x r] split!: tree.split color.split) qed All invariants in one: lemma inv joinL: [[ invc l; invc r; invh l; invh r; bheight l ≤ bheight r ]] =⇒ invc2 (joinL l x r) ∧ (bheight l 6= bheight r ∧ color r = Black −→ invc (joinL l x r)) ∧ invh (joinL l x r) ∧ bheight (joinL l x r) = bheight r proof (induct l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: inv baliL invc2I joinL.simps[of l x r] split!: tree.splits if splits) qed

171 lemma inv joinR: [[ invc l; invc r; invh l; invh r; bheight l ≥ bheight r ]] =⇒ invc2 (joinR l x r) ∧ (bheight l 6= bheight r ∧ color l = Black −→ invc (joinR l x r)) ∧ invh (joinR l x r) ∧ bheight (joinR l x r) = bheight l proof (induct l x r rule: joinR.induct) case (1 l x r) thus ?case by(auto simp: inv baliR invc2I joinR.simps[of l x r] split!: tree.splits if splits) qed lemma rbt join: [[ invc l; invh l; invc r; invh r ]] =⇒ rbt(join l x r) by(simp add: inv joinL inv joinR invh paint rbt def color paint Black join def ) To make sure the the black height is not increased unnecessarily: lemma bheight paint Black: bheight(paint Black t) ≤ bheight t + 1 by(cases t) auto lemma [[ rbt l; rbt r ]] =⇒ bheight(join l x r) ≤ max (bheight l)(bheight r) + 1 using bheight paint Black[of joinL l x r] bheight paint Black[of joinR l x r] bheight joinL[of l r x] bheight joinR[of l r x] by(auto simp: max def rbt def join def )

36.2.2 Inorder properties Currently unused. Instead Tree2 .set tree and Tree2 .bst properties are proved directly. lemma inorder joinL: bheight l ≤ bheight r =⇒ inorder(joinL l x r) = inorder l @ x # inorder r proof(induction l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: inorder baliL joinL.simps[of l x r] split!: tree.splits color.splits) qed lemma inorder joinR: inorder(joinR l x r) = inorder l @ x # inorder r proof(induction l x r rule: joinR.induct) case (1 l x r) thus ?case by (force simp: inorder baliR joinR.simps[of l x r] split!: tree.splits color.splits) qed

172 lemma inorder(join l x r) = inorder l @ x # inorder r by(auto simp: inorder joinL inorder joinR inorder paint join def split!: tree.splits color.splits if splits dest!: arg cong[where f = inorder])

36.2.3 Set and bst properties lemma set baliL: set tree(baliL l a r) = set tree l ∪ {a} ∪ set tree r by(cases (l,a,r) rule: baliL.cases)(auto) lemma set joinL: bheight l ≤ bheight r =⇒ set tree (joinL l x r) = set tree l ∪ {x} ∪ set tree r proof(induction l x r rule: joinL.induct) case (1 l x r) thus ?case by(auto simp: set baliL joinL.simps[of l x r] split!: tree.splits color.splits) qed lemma set baliR: set tree(baliR l a r) = set tree l ∪ {a} ∪ set tree r by(cases (l,a,r) rule: baliR.cases)(auto) lemma set joinR: set tree (joinR l x r) = set tree l ∪ {x} ∪ set tree r proof(induction l x r rule: joinR.induct) case (1 l x r) thus ?case by(force simp: set baliR joinR.simps[of l x r] split!: tree.splits color.splits) qed lemma set paint: set tree (paint c t) = set tree t by (cases t) auto lemma set join: set tree (join l x r) = set tree l ∪ {x} ∪ set tree r by(simp add: set joinL set joinR set paint join def ) lemma bst baliL: [[bst l; bst r; ∀ x∈set tree l. x < a; ∀ x∈set tree r. a < x]] =⇒ bst (baliL l a r) by(cases (l,a,r) rule: baliL.cases)(auto simp: ball Un)

173 lemma bst baliR: [[bst l; bst r; ∀ x∈set tree l. x < a; ∀ x∈set tree r. a < x]] =⇒ bst (baliR l a r) by(cases (l,a,r) rule: baliR.cases)(auto simp: ball Un) lemma bst joinL: [[bst (Node l (a, n) r); bheight l ≤ bheight r]] =⇒ bst (joinL l a r) proof(induction l a r rule: joinL.induct) case (1 l a r) thus ?case by(auto simp: set baliL joinL.simps[of l a r] set joinL ball Un intro!: bst baliL split!: tree.splits color.splits) qed lemma bst joinR: [[bst l; bst r; ∀ x∈set tree l. x < a; ∀ y∈set tree r. a < y ]] =⇒ bst (joinR l a r) proof(induction l a r rule: joinR.induct) case (1 l a r) thus ?case by(auto simp: set baliR joinR.simps[of l a r] set joinR ball Un intro!: bst baliR split!: tree.splits color.splits) qed lemma bst paint: bst (paint c t) = bst t by(cases t) auto lemma bst join: bst (Node l (a, n) r) =⇒ bst (join l a r) by(auto simp: bst paint bst joinL bst joinR join def ) lemma inv join: [[ invc l; invh l; invc r; invh r ]] =⇒ invc(join l x r) ∧ invh(join l x r) by (simp add: inv joinL inv joinR invh paint join def )

36.2.4 Interpretation of Set2 Join with Red-Black Tree global interpretation RBT : Set2 Join where join = join and inv = λt. invc t ∧ invh t defines insert rbt = RBT .insert and delete rbt = RBT .delete and split rbt = RBT .split

174 and join2 rbt = RBT .join2 and split min rbt = RBT .split min proof (standard, goal cases) case 1 show ?case by (rule set join) next case 2 thus ?case by (simp add: bst join) next case 3 show ?case by simp next case 4 thus ?case by (simp add: inv join) next case 5 thus ?case by simp qed The invariant does not guarantee that the root node is black. This is not required to guarantee that the height is logarithmic in the size — Exercise. end theory Array Specs imports Main begin Array Specifications locale Array = fixes lookup :: 0ar ⇒ nat ⇒ 0a fixes update :: nat ⇒ 0a ⇒ 0ar ⇒ 0ar fixes len :: 0ar ⇒ nat fixes array :: 0a list ⇒ 0ar

fixes list :: 0ar ⇒ 0a list fixes invar :: 0ar ⇒ bool assumes lookup: invar ar =⇒ n < len ar =⇒ lookup ar n = list ar ! n assumes update: invar ar =⇒ n < len ar =⇒ list(update n x ar) = (list ar)[n:=x] assumes len array: invar ar =⇒ len ar = length (list ar) assumes array: list (array xs) = xs assumes invar update: invar ar =⇒ n < len ar =⇒ invar(update n x ar) assumes invar array: invar(array xs) locale Array Flex = Array + fixes add lo :: 0a ⇒ 0ar ⇒ 0ar fixes del lo :: 0ar ⇒ 0ar fixes add hi :: 0a ⇒ 0ar ⇒ 0ar fixes del hi :: 0ar ⇒ 0ar

175 assumes add lo: invar ar =⇒ list(add lo a ar) = a # list ar assumes del lo: invar ar =⇒ list(del lo ar) = tl (list ar) assumes add hi: invar ar =⇒ list(add hi a ar) = list ar @[a] assumes del hi: invar ar =⇒ list(del hi ar) = butlast (list ar) assumes invar add lo: invar ar =⇒ invar (add lo a ar) assumes invar del lo: invar ar =⇒ invar (del lo ar) assumes invar add hi: invar ar =⇒ invar (add hi a ar) assumes invar del hi: invar ar =⇒ invar (del hi ar) end

37 Braun Trees theory Braun Tree imports HOL−Library.Tree Real begin Braun Trees were studied by Braun and Rem [5] and later Hoogerwo- ord [10]. fun braun :: 0a tree ⇒ bool where braun Leaf = True | braun (Node l x r) = ((size l = size r ∨ size l = size r + 1 ) ∧ braun l ∧ braun r) lemma braun Node 0: braun (Node l x r) = (size r ≤ size l ∧ size l ≤ size r + 1 ∧ braun l ∧ braun r) by auto The shape of a Braun-tree is uniquely determined by its size: lemma braun unique: [[ braun (t1 ::unit tree); braun t2 ; size t1 = size t2 ]] =⇒ t1 = t2 proof (induction t1 arbitrary: t2 ) case Leaf thus ?case by simp next case (Node l1 r1 ) from Node.prems(3 ) have t2 6= Leaf by auto then obtain l2 x2 r2 where [simp]: t2 = Node l2 x2 r2 by (meson neq Leaf iff ) with Node.prems have size l1 = size l2 ∧ size r1 = size r2 by auto thus ?case using Node.prems(1 ,2 ) Node.IH by auto qed

176 Braun trees are almost complete: lemma acomplete if braun: braun t =⇒ acomplete t proof(induction t) case Leaf show ?case by (simp add: acomplete def ) next case (Node l x r) thus ?case using acomplete Node if wbal2 by force qed

37.1 Numbering Nodes We show that a tree is a Braun tree iff a parity-based numbering (braun indices) of nodes yields an interval of numbers. fun braun indices :: 0a tree ⇒ nat set where braun indices Leaf = {} | braun indices (Node l r) = {1 } ∪ (∗) 2 ‘ braun indices l ∪ Suc ‘ (∗) 2 ‘ braun indices r lemma braun indices1 : 0 ∈/ braun indices t by (induction t) auto lemma finite braun indices: finite(braun indices t) by (induction t) auto One direction: lemma braun indices if braun: braun t =⇒ braun indices t = {1 ..size t} proof(induction t) case Leaf thus ?case by simp next have ∗:(∗) 2 ‘ {a..b} ∪ Suc ‘ (∗) 2 ‘ {a..b} = {2 ∗a..2 ∗b+1 } (is ?l = ?r) for a b proof show ?l ⊆ ?r by auto next have ∃ x2 ∈{a..b}. x ∈ {Suc (2 ∗x2 ), 2 ∗x2 } if ∗: x ∈ {2 ∗a .. 2 ∗b+1 } for x proof − have x div 2 ∈ {a..b} using ∗ by auto moreover have x ∈ {2 ∗ (x div 2 ), Suc(2 ∗ (x div 2 ))} by auto ultimately show ?thesis by blast qed thus ?r ⊆ ?l by fastforce qed case (Node l x r)

177 hence size l = size r ∨ size l = size r + 1 (is ?A ∨ ?B) by auto thus ?case proof assume ?A with Node show ?thesis by (auto simp: ∗) next assume ?B with Node show ?thesis by (auto simp: ∗ atLeastAtMostSuc conv) qed qed The other direction is more complicated. The following proof is due to Thomas Sewell. lemma disj evens odds:(∗) 2 ‘ A ∩ Suc ‘ (∗) 2 ‘ B = {} using double not eq Suc double by auto lemma card braun indices: card (braun indices t) = size t proof (induction t) case Leaf thus ?case by simp next case Node thus ?case by(auto simp: UNION singleton eq range finite braun indices card Un disjoint card insert if disj evens odds card image inj on def braun indices1 ) qed lemma braun indices intvl base 1 : assumes bi: braun indices t = {m..n} shows {m..n} = {1 ..size t} proof (cases t = Leaf ) case True then show ?thesis using bi by simp next case False note eqs = eqset imp iff [OF bi] from eqs[of 0 ] have 0 : 0 < m by (simp add: braun indices1 ) from eqs[of 1 ] have 1 : m ≤ 1 by (cases t; simp add: False) from 0 1 have eq1 : m = 1 by simp from card braun indices[of t] show ?thesis by (simp add: bi eq1 ) qed lemma even of intvl intvl:

178 fixes S :: nat set assumes S = {m..n} ∩ {i. even i} shows ∃ m 0 n 0. S = (λi. i ∗ 2 ) ‘ {m 0..n 0} apply (rule exI [where x=Suc m div 2 ], rule exI [where x=n div 2 ]) apply (fastforce simp add: assms mult.commute) done lemma odd of intvl intvl: fixes S :: nat set assumes S = {m..n} ∩ {i. odd i} shows ∃ m 0 n 0. S = Suc ‘ (λi. i ∗ 2 ) ‘ {m 0..n 0} proof − have step1 : ∃ m 0. S = Suc ‘ ({m 0..n − 1 } ∩ {i. even i}) apply (rule tac x=if n = 0 then 1 else m − 1 in exI ) apply (auto simp: assms image def elim!: oddE) done thus ?thesis by (metis even of intvl intvl) qed lemma image int eq image: (∀ i ∈ S. f i ∈ T ) =⇒ (f ‘ S) ∩ T = f ‘ S (∀ i ∈ S. f i ∈/ T ) =⇒ (f ‘ S) ∩ T = {} by auto lemma braun indices1 le: i ∈ braun indices t =⇒ Suc 0 ≤ i using braun indices1 not less eq eq by blast lemma braun if braun indices: braun indices t = {1 ..size t} =⇒ braun t proof(induction t) case Leaf then show ?case by simp next case (Node l x r) obtain t where t: t = Node l x r by simp from Node.prems have eq: {2 .. size t} = (λi. i ∗ 2 ) ‘ braun indices l ∪ Suc ‘ (λi. i ∗ 2 ) ‘ braun indices r (is ?R = ?S ∪ ?T ) apply clarsimp apply (drule tac f =λS. S ∩ {2 ..} in arg cong) apply (simp add: t mult.commute Int Un distrib2 image int eq image braun indices1 le) done

179 then have ST : ?S = ?R ∩ {i. even i} ?T = ?R ∩ {i. odd i} by (simp all add: Int Un distrib2 image int eq image) from ST have l: braun indices l = {1 .. size l} by (fastforce dest: braun indices intvl base 1 dest!: even of intvl intvl simp: mult.commute inj image eq iff [OF inj onI ]) from ST have r: braun indices r = {1 .. size r} by (fastforce dest: braun indices intvl base 1 dest!: odd of intvl intvl simp: mult.commute inj image eq iff [OF inj onI ]) note STa = ST [THEN eqset imp iff , THEN iffD2 ] note STb = STa[of size t] STa[of size t − 1 ] then have sizes: size l = size r ∨ size l = size r + 1 apply (clarsimp simp: t l r inj image mem iff [OF inj onI ]) apply (cases even (size l); cases even (size r); clarsimp elim!: oddE; fastforce) done from l r sizes show ?case by (clarsimp simp: Node.IH ) qed lemma braun iff braun indices: braun t ←→ braun indices t = {1 ..size t} using braun if braun indices braun indices if braun by blast

end

38 Arrays via Braun Trees theory Array Braun imports Array Specs Braun Tree begin

38.1 Array fun lookup1 :: 0a tree ⇒ nat ⇒ 0a where lookup1 (Node l x r) n = (if n=1 then x else lookup1 (if even n then l else r)(n div 2 )) fun update1 :: nat ⇒ 0a ⇒ 0a tree ⇒ 0a tree where update1 n x Leaf = Node Leaf x Leaf | update1 n x (Node l a r) = (if n=1 then Node l x r else

180 if even n then Node (update1 (n div 2 ) x l) a r else Node l a (update1 (n div 2 ) x r)) fun adds :: 0a list ⇒ nat ⇒ 0a tree ⇒ 0a tree where adds [] n t = t | adds (x#xs) n t = adds xs (n+1 )(update1 (n+1 ) x t) fun list :: 0a tree ⇒ 0a list where list Leaf = [] | list (Node l x r) = x # splice (list l)(list r)

38.1.1 Functional Correctness lemma size list: size(list t) = size t by(induction t)(auto) lemma minus1 div2 :(n − Suc 0 ) div 2 = (if odd n then n div 2 else n div 2 − 1 ) by auto arith lemma nth splice: [[ n < size xs + size ys; size ys ≤ size xs; size xs ≤ size ys + 1 ]] =⇒ splice xs ys ! n = (if even n then xs else ys)!(n div 2 ) apply(induction xs ys arbitrary: n rule: splice.induct) apply (auto simp: nth Cons 0 minus1 div2 ) done lemma div2 in bounds: [[ braun (Node l x r); n ∈ {1 ..size(Node l x r)}; n > 1 ]] =⇒ (odd n −→ n div 2 ∈ {1 ..size r}) ∧ (even n −→ n div 2 ∈ {1 ..size l}) by auto arith declare upt Suc[simp del] lookup1 lemma nth list lookup1 : [[braun t; i < size t]] =⇒ list t ! i = lookup1 t (i+1 ) proof(induction t arbitrary: i) case Leaf thus ?case by simp next case Node thus ?case using div2 in bounds[OF Node.prems(1 ), of i+1 ] by (auto simp: nth splice minus1 div2 size list) qed

181 lemma list eq map lookup1 : braun t =⇒ list t = map (lookup1 t)[1 .. 0 =⇒ n − Suc 0 = m ←→ n = m+1 by arith

182 lemma list update splice: [[ n < size xs + size ys; size ys ≤ size xs; size xs ≤ size ys + 1 ]] =⇒ (splice xs ys)[n := x] = (if even n then splice (xs[n div 2 := x]) ys else splice xs (ys[n div 2 := x])) by(induction xs ys arbitrary: n rule: splice.induct)(auto split: nat.split) lemma list update2 : [[ braun t; n ∈ {1 .. size t} ]] =⇒ list(update1 n x t) = (list t)[n−1 := x] proof(induction t arbitrary: n) case Leaf thus ?case by simp next case (Node l a r) thus ?case using div2 in bounds[OF Node.prems] by(auto simp: list update splice diff1 eq iff size list split: nat.split) qed adds lemma splice last: shows size ys ≤ size xs =⇒ splice (xs @[x]) ys = splice xs ys @[x] and size ys+1 ≥ size xs =⇒ splice xs (ys @[y]) = splice xs ys @[y] by(induction xs ys arbitrary: x y rule: splice.induct)(auto) lemma list add hi: braun t =⇒ list(update1 (Suc(size t)) x t) = list t @ [x] by(induction t)(auto simp: splice last size list) lemma size add hi: braun t =⇒ m = size t =⇒ size(update1 (Suc m) x t) = size t + 1 by(induction t arbitrary: m)(auto) lemma braun add hi: braun t =⇒ braun(update1 (Suc(size t)) x t) by(induction t)(auto simp: size add hi) lemma size braun adds: [[ braun t; size t = n ]] =⇒ size(adds xs n t) = size t + length xs ∧ braun (adds xs n t) by(induction xs arbitrary: t n)(auto simp: braun add hi size add hi) lemma list adds: [[ braun t; size t = n ]] =⇒ list(adds xs n t) = list t @ xs by(induction xs arbitrary: t n)(auto simp: size braun adds list add hi size add hi braun add hi)

38.1.2 Array Implementation interpretation A: Array

183 where lookup = λ(t,l) n. lookup1 t (n+1 ) and update = λn x (t,l). (update1 (n+1 ) x t, l) and len = λ(t,l). l and array = λxs. (adds xs 0 Leaf , length xs) and invar = λ(t,l). braun t ∧ l = size t and list = λ(t,l). list t proof (standard, goal cases) case 1 thus ?case by (simp add: nth list lookup1 split: prod.splits) next case 2 thus ?case by (simp add: list update1 split: prod.splits) next case 3 thus ?case by (simp add: size list split: prod.splits) next case 4 thus ?case by (simp add: list adds) next case 5 thus ?case by (simp add: braun update1 size update1 split: prod.splits) next case 6 thus ?case by (simp add: size braun adds split: prod.splits) qed

38.2 Flexible Array fun add lo where add lo x Leaf = Node Leaf x Leaf | add lo x (Node l a r) = Node (add lo a r) x l fun merge where merge Leaf r = r | merge (Node l a r) rr = Node rr a (merge l r) fun del lo where del lo Leaf = Leaf | del lo (Node l a r) = merge l r fun del hi :: nat ⇒ 0a tree ⇒ 0a tree where del hi n Leaf = Leaf | del hi n (Node l x r) = (if n = 1 then Leaf else if even n then Node (del hi (n div 2 ) l) x r else Node l x (del hi (n div 2 ) r))

184 38.2.1 Functional Correctness add lo lemma list add lo: braun t =⇒ list (add lo a t) = a # list t by(induction t arbitrary: a) auto lemma braun add lo: braun t =⇒ braun(add lo x t) by(induction t arbitrary: x)(auto simp add: list add lo simp flip: size list) del lo lemma list merge: braun (Node l x r) =⇒ list(merge l r) = splice (list l)(list r) by (induction l r rule: merge.induct) auto lemma braun merge: braun (Node l x r) =⇒ braun(merge l r) by (induction l r rule: merge.induct)(auto simp add: list merge simp flip: size list) lemma list del lo: braun t =⇒ list(del lo t) = tl (list t) by (cases t)(simp all add: list merge) lemma braun del lo: braun t =⇒ braun(del lo t) by (cases t)(simp all add: braun merge) del hi lemma list Nil iff : list t = [] ←→ t = Leaf by(cases t) simp all lemma butlast splice: butlast (splice xs ys) = (if size xs > size ys then splice (butlast xs) ys else splice xs (butlast ys)) by(induction xs ys rule: splice.induct)(auto) lemma list del hi: braun t =⇒ size t = st =⇒ list(del hi st t) = butlast(list t) apply(induction t arbitrary: st) by(auto simp: list Nil iff size list butlast splice) lemma braun del hi: braun t =⇒ size t = st =⇒ braun(del hi st t) apply(induction t arbitrary: st) by(auto simp: list del hi simp flip: size list)

38.2.2 Flexible Array Implementation interpretation AF : Array Flex where lookup = λ(t,l) n. lookup1 t (n+1 ) and update = λn x (t,l). (update1 (n+1 ) x t, l) and len = λ(t,l). l

185 and array = λxs. (adds xs 0 Leaf , length xs) and invar = λ(t,l). braun t ∧ l = size t and list = λ(t,l). list t and add lo = λx (t,l). (add lo x t, l+1 ) and del lo = λ(t,l). (del lo t, l−1 ) and add hi = λx (t,l). (update1 (Suc l) x t, l+1 ) and del hi = λ(t,l). (del hi l t, l−1 ) proof (standard, goal cases) case 1 thus ?case by (simp add: list add lo split: prod.splits) next case 2 thus ?case by (simp add: list del lo split: prod.splits) next case 3 thus ?case by (simp add: list add hi braun add hi split: prod.splits) next case 4 thus ?case by (simp add: list del hi split: prod.splits) next case 5 thus ?case by (simp add: braun add lo list add lo flip: size list split: prod.splits) next case 6 thus ?case by (simp add: braun del lo list del lo flip: size list split: prod.splits) next case 7 thus ?case by (simp add: size add hi braun add hi split: prod.splits) next case 8 thus ?case by (simp add: braun del hi list del hi flip: size list split: prod.splits) qed

38.3 Faster 38.3.1 Size fun diff :: 0a tree ⇒ nat ⇒ nat where diff Leaf = 0 | diff (Node l x r) n = (if n=0 then 1 else if even n then diff r (n div 2 − 1 ) else diff l (n div 2 )) fun size fast :: 0a tree ⇒ nat where size fast Leaf = 0 | size fast (Node l x r) = (let n = size fast r in 1 + 2 ∗n + diff l n) declare Let def [simp] lemma diff : braun t =⇒ size t : {n, n + 1 } =⇒ diff t n = size t − n

186 by(induction t arbitrary: n) auto lemma size fast: braun t =⇒ size fast t = size t by(induction t)(auto simp add: diff )

38.3.2 Initialization with 1 element fun braun of naive :: 0a ⇒ nat ⇒ 0a tree where braun of naive x n = (if n=0 then Leaf else let m = (n−1 ) div 2 in if odd n then Node (braun of naive x m) x (braun of naive x m) else Node (braun of naive x (m + 1 )) x (braun of naive x m)) fun braun2 of :: 0a ⇒ nat ⇒ 0a tree ∗ 0a tree where braun2 of x n = (if n = 0 then (Leaf , Node Leaf x Leaf ) else let (s,t) = braun2 of x ((n−1 ) div 2 ) in if odd n then (Node s x s, Node t x s) else (Node t x s, Node t x t)) definition braun of :: 0a ⇒ nat ⇒ 0a tree where braun of x n = fst (braun2 of x n) declare braun2 of .simps [simp del] lemma braun2 of size braun: braun2 of x n = (s,t) =⇒ size s = n ∧ size t = n+1 ∧ braun s ∧ braun t proof(induction x n arbitrary: s t rule: braun2 of .induct) case (1 x n) then show ?case by (auto simp: braun2 of .simps[of x n] split: prod.splits if splits) pres- burger+ qed lemma braun2 of replicate: braun2 of x n = (s,t) =⇒ list s = replicate n x ∧ list t = replicate (n+1 ) x proof(induction x n arbitrary: s t rule: braun2 of .induct) case (1 x n) have x # replicate m x = replicate (m+1 ) x for m by simp with 1 show ?case apply (auto simp: braun2 of .simps[of x n] replicate.simps(2 )[of 0 x] simp del: replicate.simps(2 ) split: prod.splits if splits) by presburger+ qed

187 corollary braun braun of : braun(braun of x n) unfolding braun of def by (metis eq fst iff braun2 of size braun) corollary list braun of : list(braun of x n) = replicate n x unfolding braun of def by (metis eq fst iff braun2 of replicate)

38.3.3 Proof Infrastructure Originally due to Thomas Sewell. take nths fun take nths :: nat ⇒ nat ⇒ 0a list ⇒ 0a list where take nths i k [] = [] | take nths i k (x # xs) = (if i = 0 then x # take nths (2ˆk − 1 ) k xs else take nths (i − 1 ) k xs) This is the more concise definition but seems to complicate the proofs: lemma take nths eq nths: take nths i k xs = nths xs (S n. {n∗2ˆk + i}) proof(induction xs arbitrary: i) case Nil then show ?case by simp next case (Cons x xs) show ?case proof cases assume [simp]: i = 0 have (S n. {(n+1 ) ∗ 2 ˆ k − 1 }) = {m. ∃ n. Suc m = n ∗ 2 ˆ k} apply (auto simp del: mult Suc) by (metis diff Suc Suc diff zero mult eq 0 iff not0 implies Suc) thus ?thesis by (simp add: Cons.IH ac simps nths Cons) next assume [arith]: i 6= 0 have (S n. {n ∗ 2 ˆ k + i − 1 }) = {m. ∃ n. Suc m = n ∗ 2 ˆ k + i} apply auto by (metis diff Suc Suc diff zero) thus ?thesis by (simp add: Cons.IH nths Cons) qed qed lemma take nths drop: take nths i k (drop j xs) = take nths (i + j ) k xs by (induct xs arbitrary: i j ; simp add: drop Cons split: nat.split) lemma take nths 00 : take nths 0 0 xs = xs

188 by (induct xs; simp) lemma splice take nths: splice (take nths 0 (Suc 0 ) xs)(take nths (Suc 0 )(Suc 0 ) xs) = xs by (induct xs; simp) lemma take nths take nths: take nths i m (take nths j n xs) = take nths ((i ∗ 2ˆn) + j )(m + n) xs by (induct xs arbitrary: i j ; simp add: algebra simps power add) lemma take nths empty: (take nths i k xs = []) = (length xs ≤ i) by (induction xs arbitrary: i k) auto lemma hd take nths: i < length xs =⇒ hd(take nths i k xs) = xs ! i by (induction xs arbitrary: i k) auto lemma take nths 01 splice: [[ length xs = length ys ∨ length xs = length ys + 1 ]] =⇒ take nths 0 (Suc 0 )(splice xs ys) = xs ∧ take nths (Suc 0 )(Suc 0 )(splice xs ys) = ys by (induct xs arbitrary: ys; case tac ys; simp) lemma length take nths 00 : length (take nths 0 (Suc 0 ) xs) = length (take nths (Suc 0 )(Suc 0 ) xs) ∨ length (take nths 0 (Suc 0 ) xs) = length (take nths (Suc 0 )(Suc 0 ) xs) + 1 by (induct xs) auto braun list fun braun list :: 0a tree ⇒ 0a list ⇒ bool where braun list Leaf xs = (xs = []) | braun list (Node l x r) xs = (xs 6= [] ∧ x = hd xs ∧ braun list l (take nths 1 1 xs) ∧ braun list r (take nths 2 1 xs)) lemma braun list eq: braun list t xs = (braun t ∧ xs = list t) proof (induct t arbitrary: xs) case Leaf show ?case by simp next

189 case Node show ?case using length take nths 00 [of xs] splice take nths[of xs] by (auto simp: neq Nil conv Node.hyps size list[symmetric] take nths 01 splice) qed

38.3.4 Converting a list of elements into a Braun tree fun nodes :: 0a tree list ⇒ 0a list ⇒ 0a tree list ⇒ 0a tree list where nodes (l#ls)(x#xs)(r#rs) = Node l x r # nodes ls xs rs | nodes (l#ls)(x#xs) [] = Node l x Leaf # nodes ls xs [] | nodes [] (x#xs)(r#rs) = Node Leaf x r # nodes [] xs rs | nodes [] (x#xs) [] = Node Leaf x Leaf # nodes [] xs [] | nodes ls [] rs = [] fun brauns :: nat ⇒ 0a list ⇒ 0a tree list where brauns k xs = (if xs = [] then [] else let ys = take (2ˆk) xs; zs = drop (2ˆk) xs; ts = brauns (k+1 ) zs in nodes ts ys (drop (2ˆk) ts)) declare brauns.simps[simp del] definition brauns1 :: 0a list ⇒ 0a tree where brauns1 xs = (if xs = [] then Leaf else brauns 0 xs ! 0 ) fun T brauns :: nat ⇒ 0a list ⇒ nat where T brauns k xs = (if xs = [] then 0 else let ys = take (2ˆk) xs; zs = drop (2ˆk) xs; ts = brauns (k+1 ) zs in 4 ∗ min (2ˆk)(length xs) + T brauns (k+1 ) zs)

Functional correctness The proof is originally due to Thomas Sewell. lemma length nodes: length (nodes ls xs rs) = length xs by (induct ls xs rs rule: nodes.induct; simp) lemma nth nodes: i < length xs =⇒ nodes ls xs rs ! i = Node (if i < length ls then ls ! i else Leaf )(xs ! i) (if i < length rs then rs ! i else Leaf ) by (induct ls xs rs arbitrary: i rule: nodes.induct;

190 simp add: nth Cons split: nat.split) theorem length brauns: length (brauns k xs) = min (length xs)(2 ˆ k) proof (induct xs arbitrary: k rule: measure induct rule[where f =length]) case (less xs) thus ?case by (simp add: brauns.simps[of k xs] length nodes) qed theorem brauns correct: i < min (length xs)(2 ˆ k) =⇒ braun list (brauns k xs ! i)(take nths i k xs) proof (induct xs arbitrary: i k rule: measure induct rule[where f =length]) case (less xs) have xs 6= [] using less.prems by auto let ?zs = drop (2ˆk) xs let ?ts = brauns (Suc k) ?zs from less.hyps[of ?zs Suc k] have IH : [[ j = i + 2 ˆ k; i < min (length ?zs)(2 ˆ (k+1 )) ]] =⇒ braun list (?ts ! i)(take nths j (Suc k) xs) for i j using hxs 6= []i by (simp add: take nths drop) show ?case using less.prems by (auto simp: brauns.simps[of k xs] nth nodes take nths take nths IH take nths empty hd take nths length brauns) qed corollary brauns1 correct: braun (brauns1 xs) ∧ list (brauns1 xs) = xs using brauns correct[of 0 xs 0 ] by (simp add: brauns1 def braun list eq take nths 00 )

Running Time Analysis theorem T brauns: T brauns k xs = 4 ∗ length xs proof (induction xs arbitrary: k rule: measure induct rule[where f = length]) case (less xs) show ?case proof cases assume xs = [] thus ?thesis by(simp) next assume xs 6= [] let ?zs = drop (2ˆk) xs have T brauns k xs = T brauns (k+1 ) ?zs + 4 ∗ min (2ˆk)(length xs)

191 using hxs 6= []i by(simp) also have ... = 4 ∗ length ?zs + 4 ∗ min (2ˆk)(length xs) using less[of ?zs k+1 ] hxs 6= []i by (simp) also have ... = 4 ∗ length xs by(simp) finally show ?case . qed qed

38.3.5 Converting a Braun Tree into a List of Elements The code and the proof are originally due to Thomas Sewell (except running time). function list fast rec :: 0a tree list ⇒ 0a list where list fast rec ts = (let us = filter (λt. t 6= Leaf ) ts in if us = [] then [] else map value us @ list fast rec (map left us @ map right us)) by (pat completeness, auto) lemma list fast rec term1 : ts 6= [] =⇒ Leaf ∈/ set ts =⇒ sum list (map (size o left) ts) + sum list (map (size o right) ts) < sum list (map size ts) apply (clarsimp simp: sum list addf [symmetric] sum list map filter 0) apply (rule sum list strict mono; clarsimp?) apply (case tac x; simp) done lemma list fast rec term: us 6= [] =⇒ us = filter (λt. t 6= hi) ts =⇒ sum list (map (size o left) us) + sum list (map (size o right) us) < sum list (map size ts) apply (rule order less le trans, rule list fast rec term1 , simp all) apply (rule sum list filter le nat) done termination apply (relation measure (sum list o map size)) apply simp apply (simp add: list fast rec term) done declare list fast rec.simps[simp del] definition list fast :: 0a tree ⇒ 0a list where

192 list fast t = list fast rec [t] function T list fast rec :: 0a tree list ⇒ nat where T list fast rec ts = (let us = filter (λt. t 6= Leaf ) ts in length ts + (if us = [] then 0 else 5 ∗ length us + T list fast rec (map left us @ map right us))) by (pat completeness, auto) termination apply (relation measure (sum list o map size)) apply simp apply (simp add: list fast rec term) done declare T list fast rec.simps[simp del]

Functional Correctness lemma list fast rec all Leaf : ∀ t ∈ set ts. t = Leaf =⇒ list fast rec ts = [] by (simp add: filter empty conv list fast rec.simps) lemma take nths eq single: length xs − i < 2ˆn =⇒ take nths i n xs = take 1 (drop i xs) by (induction xs arbitrary: i n; simp add: drop Cons 0) lemma braun list Nil: braun list t [] = (t = Leaf ) by (cases t; simp) lemma braun list not Nil: xs 6= [] =⇒ braun list t xs = (∃ l x r. t = Node l x r ∧ x = hd xs ∧ braun list l (take nths 1 1 xs) ∧ braun list r (take nths 2 1 xs)) by(cases t; simp) theorem list fast rec correct: [[ length ts = 2 ˆ k; ∀ i < 2 ˆ k. braun list (ts ! i)(take nths i k xs) ]] =⇒ list fast rec ts = xs proof (induct xs arbitrary: k ts rule: measure induct rule[where f =length]) case (less xs) show ?case proof (cases length xs < 2 ˆ k) case True

193 from less.prems True have filter: ∃ n. ts = map (λx. Node Leaf x Leaf ) xs @ replicate n Leaf apply (rule tac x=length ts − length xs in exI ) apply (clarsimp simp: list eq iff nth eq) apply(auto simp: nth append braun list not Nil take nths eq single braun list Nil hd drop conv nth) done thus ?thesis by (clarsimp simp: list fast rec.simps[of ts] o def list fast rec all Leaf ) next case False with less.prems(2 ) have ∗: ∀ i < 2 ˆ k. ts ! i 6= Leaf ∧ value (ts ! i) = xs ! i ∧ braun list (left (ts ! i)) (take nths (i + 2 ˆ k)(Suc k) xs) ∧ (∀ ys. ys = take nths (i + 2 ∗ 2 ˆ k)(Suc k) xs −→ braun list (right (ts ! i)) ys) by (auto simp: take nths empty hd take nths braun list not Nil take nths take nths algebra simps) have 1 : map value ts = take (2 ˆ k) xs using less.prems(1 ) False by (simp add: list eq iff nth eq ∗) have 2 : list fast rec (map left ts @ map right ts) = drop (2 ˆ k) xs using less.prems(1 ) False by (auto intro!: Nat.diff less less.hyps[where k= Suc k] simp: nth append ∗ take nths drop algebra simps) from less.prems(1 ) False show ?thesis by (auto simp: list fast rec.simps[of ts] 1 2 ∗ all set conv all nth) qed qed corollary list fast correct: braun t =⇒ list fast t = list t by (simp add: list fast def take nths 00 braun list eq list fast rec correct[where k=0 ])

Running Time Analysis lemma sum tree list children: ∀ t ∈ set ts. t 6= Leaf =⇒ (P t←ts. k ∗ size t) = (P t ← map left ts @ map right ts. k ∗ size t) + k ∗ length ts by(induction ts)(auto simp add: neq Leaf iff algebra simps) theorem T list fast rec ub: T list fast rec ts ≤ sum list (map (λt. 7 ∗size t + 1 ) ts)

194 proof (induction ts rule: measure induct rule[where f =sum list o map size]) case (less ts) let ?us = filter (λt. t 6= Leaf ) ts show ?case proof cases assume ?us = [] thus ?thesis using T list fast rec.simps[of ts] by(simp add: sum list Suc) next assume ?us 6= [] let ?children = map left ?us @ map right ?us have T list fast rec ts = T list fast rec ?children + 5 ∗ length ?us + length ts using h?us 6= []i T list fast rec.simps[of ts] by(simp) also have ... ≤ (P t←?children. 7 ∗ size t + 1 ) + 5 ∗ length ?us + length ts using less[of ?children] list fast rec term[of ?us] h?us 6= []i by (simp) also have ... = (P t←?children. 7 ∗size t) + 7 ∗ length ?us + length ts by(simp add: sum list Suc o def ) also have ... = (P t←?us. 7 ∗size t) + length ts by(simp add: sum tree list children) also have ... ≤ (P t←ts. 7 ∗size t) + length ts by(simp add: sum list filter le nat) also have ... = (P t←ts. 7 ∗ size t + 1 ) by(simp add: sum list Suc) finally show ?case . qed qed end

39 Tries via Functions theory Fun imports Set Specs begin A trie where each node maps a key to sub-tries via a function. Nice abstract model. Not efficient because of the function space. datatype 0a trie = Nd bool 0a ⇒ 0a trie option

195 definition empty :: 0a trie where [simp]: empty = Nd False (λ . None) fun isin :: 0a trie ⇒ 0a list ⇒ bool where isin (Nd b m) [] = b | isin (Nd b m)(k # xs) = (case m k of None ⇒ False | Some t ⇒ isin t xs) fun insert :: 0a list ⇒ 0a trie ⇒ 0a trie where insert [] (Nd b m) = Nd True m | insert (x#xs)(Nd b m) = (let s = (case m x of None ⇒ empty | Some t ⇒ t) in Nd b (m(x := Some(insert xs s)))) fun delete :: 0a list ⇒ 0a trie ⇒ 0a trie where delete [] (Nd b m) = Nd False m | delete (x#xs)(Nd b m) = Nd b (case m x of None ⇒ m | Some t ⇒ m(x := Some(delete xs t))) Use (a tuned version of) isin as an abstraction function: lemma isin case: isin (Nd b m) xs = (case xs of [] ⇒ b | x # ys ⇒ (case m x of None ⇒ False | Some t ⇒ isin t ys)) by(cases xs)auto definition set :: 0a trie ⇒ 0a list set where [simp]: set t = {xs. isin t xs} lemma isin set: isin t xs = (xs ∈ set t) by simp lemma set insert: set (insert xs t) = set t ∪ {xs} by (induction xs t rule: insert.induct) (auto simp: isin case split!: if splits option.splits list.splits) lemma set delete: set (delete xs t) = set t − {xs} by (induction xs t rule: delete.induct) (auto simp: isin case split!: if splits option.splits list.splits) interpretation S: Set

196 where empty = empty and isin = isin and insert = insert and delete = delete and set = set and invar = λ . True proof (standard, goal cases) case 1 show ?case by (simp add: isin case split: list.split) next case 2 show ?case by(rule isin set) next case 3 show ?case by(rule set insert) next case 4 show ?case by(rule set delete) qed (rule TrueI )+ end

40 Tries via Search Trees theory Trie Map imports RBT Map Trie Fun begin An implementation of tries based on maps implemented by red-black trees. Works for any kind of . Implementation of map: type synonym 0a mapi = 0a rbt datatype 0a trie map = Nd bool ( 0a ∗ 0a trie map) mapi In principle one should be able to given an implementation of tries once and for all for any map implementation and not just for a specific one (RBT) as done here. But because the map ( 0a rbt) is used in a datatype, the HOL type system does not support this. However, the development below works verbatim for any map imple- mentation, eg Tree Map, and not just RBT Map, except for the termination lemma lookup size. term size tree lemma lookup size[termination simp]: fixes t :: ( 0a::linorder ∗ 0a trie map) rbt shows lookup t a = Some b =⇒ size b < Suc (size tree (λab. Suc(Suc (size (snd(fst ab))))) t) apply(induction t a rule: lookup.induct)

197 apply(auto split: if splits) done definition empty :: 0a trie map where [simp]: empty = Nd False Leaf fun isin :: ( 0a::linorder) trie map ⇒ 0a list ⇒ bool where isin (Nd b m) [] = b | isin (Nd b m)(x # xs) = (case lookup m x of None ⇒ False | Some t ⇒ isin t xs) fun insert :: ( 0a::linorder) list ⇒ 0a trie map ⇒ 0a trie map where insert [] (Nd b m) = Nd True m | insert (x#xs)(Nd b m) = Nd b (update x (insert xs (case lookup m x of None ⇒ empty | Some t ⇒ t)) m) fun delete :: ( 0a::linorder) list ⇒ 0a trie map ⇒ 0a trie map where delete [] (Nd b m) = Nd False m | delete (x#xs)(Nd b m) = Nd b (case lookup m x of None ⇒ m | Some t ⇒ update x (delete xs t) m)

40.1 Correctness Proof by stepwise refinement. First abstract to type 0a trie. fun abs :: 0a::linorder trie map ⇒ 0a trie where abs (Nd b t) = Trie Fun.Nd b (λa. map option abs (lookup t a)) fun invar :: ( 0a::linorder)trie map ⇒ bool where invar (Nd b m) = (M .invar m ∧ (∀ a t. lookup m a = Some t −→ invar t)) lemma isin abs: isin t xs = Trie Fun.isin (abs t) xs apply(induction t xs rule: isin.induct) apply(auto split: option.split) done lemma abs insert: invar t =⇒ abs(insert xs t) = Trie Fun.insert xs (abs t) apply(induction xs t rule: insert.induct)

198 apply(auto simp: M .map specs RBT Set.empty def [symmetric] split: op- tion.split) done lemma abs delete: invar t =⇒ abs(delete xs t) = Trie Fun.delete xs (abs t) apply(induction xs t rule: delete.induct) apply(auto simp: M .map specs split: option.split) done lemma invar insert: invar t =⇒ invar (insert xs t) apply(induction xs t rule: insert.induct) apply(auto simp: M .map specs RBT Set.empty def [symmetric] split: op- tion.split) done lemma invar delete: invar t =⇒ invar (delete xs t) apply(induction xs t rule: delete.induct) apply(auto simp: M .map specs split: option.split) done Overall correctness w.r.t. the Set ADT: interpretation S2 : Set where empty = empty and isin = isin and insert = insert and delete = delete and set = set o abs and invar = invar proof (standard, goal cases) case 1 show ?case by (simp add: isin case split: list.split) next case 2 thus ?case by (simp add: isin abs) next case 3 thus ?case by (simp add: set insert abs insert del: set def ) next case 4 thus ?case by (simp add: set delete abs delete del: set def ) next case 5 thus ?case by (simp add: M .map specs RBT Set.empty def [symmetric]) next case 6 thus ?case by (simp add: invar insert) next case 7 thus ?case by (simp add: invar delete) qed end

199 41 Binary Tries and Patricia Tries theory Tries Binary imports Set Specs begin hide const (open) insert declare Let def [simp] fun sel2 :: bool ⇒ 0a ∗ 0a ⇒ 0a where sel2 b (a1 ,a2 ) = (if b then a2 else a1 ) fun mod2 :: ( 0a ⇒ 0a) ⇒ bool ⇒ 0a ∗ 0a ⇒ 0a ∗ 0a where mod2 f b (a1 ,a2 ) = (if b then (a1 ,f a2 ) else (f a1 ,a2 ))

41.1 Trie datatype trie = Lf | Nd bool trie ∗ trie definition empty :: trie where [simp]: empty = Lf fun isin :: trie ⇒ bool list ⇒ bool where isin Lf ks = False | isin (Nd b lr) ks = (case ks of [] ⇒ b | k#ks ⇒ isin (sel2 k lr) ks) fun insert :: bool list ⇒ trie ⇒ trie where insert [] Lf = Nd True (Lf ,Lf ) | insert [] (Nd b lr) = Nd True lr | insert (k#ks) Lf = Nd False (mod2 (insert ks) k (Lf ,Lf )) | insert (k#ks)(Nd b lr) = Nd b (mod2 (insert ks) k lr) lemma isin insert: isin (insert xs t) ys = (xs = ys ∨ isin t ys) apply(induction xs t arbitrary: ys rule: insert.induct) apply (auto split: list.splits if splits) done A simple implementation of delete; does not shrink the trie! fun delete0 :: bool list ⇒ trie ⇒ trie where delete0 ks Lf = Lf |

200 delete0 ks (Nd b lr) = (case ks of [] ⇒ Nd False lr | k#ks 0 ⇒ Nd b (mod2 (delete0 ks 0) k lr)) lemma isin delete0 : isin (delete0 as t) bs = (as 6= bs ∧ isin t bs) apply(induction as t arbitrary: bs rule: delete0 .induct) apply (auto split: list.splits if splits) done Now deletion with shrinking: fun node :: bool ⇒ trie ∗ trie ⇒ trie where node b lr = (if ¬ b ∧ lr = (Lf ,Lf ) then Lf else Nd b lr) fun delete :: bool list ⇒ trie ⇒ trie where delete ks Lf = Lf | delete ks (Nd b lr) = (case ks of [] ⇒ node False lr | k#ks 0 ⇒ node b (mod2 (delete ks 0) k lr)) lemma isin delete: isin (delete xs t) ys = (xs 6= ys ∧ isin t ys) apply(induction xs t arbitrary: ys rule: delete.induct) apply simp apply (auto split: list.splits if splits) apply (metis isin.simps(1 )) apply (metis isin.simps(1 )) done definition set trie :: trie ⇒ bool list set where set trie t = {xs. isin t xs} lemma set trie empty: set trie empty = {} by(simp add: set trie def ) lemma set trie isin: isin t xs = (xs ∈ set trie t) by(simp add: set trie def ) lemma set trie insert: set trie(insert xs t) = set trie t ∪ {xs} by(auto simp add: isin insert set trie def ) lemma set trie delete: set trie(delete xs t) = set trie t − {xs} by(auto simp add: isin delete set trie def )

201 interpretation S: Set where empty = empty and isin = isin and insert = insert and delete = delete and set = set trie and invar = λt. True proof (standard, goal cases) case 1 show ?case by (rule set trie empty) next case 2 show ?case by(rule set trie isin) next case 3 thus ?case by(auto simp: set trie insert) next case 4 show ?case by(rule set trie delete) qed (rule TrueI )+

41.2 Patricia Trie datatype trieP = LfP | NdP bool list bool trieP ∗ trieP fun isinP :: trieP ⇒ bool list ⇒ bool where isinP LfP ks = False | isinP (NdP ps b lr) ks = (let n = length ps in if ps = take n ks then case drop n ks of [] ⇒ b | k#ks 0 ⇒ isinP (sel2 k lr) ks 0 else False) definition emptyP :: trieP where [simp]: emptyP = LfP fun split where split [] ys = ([],[],ys) | split xs [] = ([],xs,[]) | split (x#xs)(y#ys) = (if x6=y then ([],x#xs,y#ys) else let (ps,xs 0,ys 0) = split xs ys in (x#ps,xs 0,ys 0)) lemma mod2 cong[fundef cong]: [[ lr = lr 0; k = k 0; Va b. lr 0=(a,b) =⇒ f (a) = f 0 (a); Va b. lr 0=(a,b) =⇒ f (b) = f 0 (b) ]] =⇒ mod2 f k lr= mod2 f 0 k 0 lr 0 by(cases lr, cases lr 0, auto)

202 fun insertP :: bool list ⇒ trieP ⇒ trieP where insertP ks LfP = NdP ks True (LfP,LfP) | insertP ks (NdP ps b lr) = (case split ks ps of (qs,k#ks 0,p#ps 0) ⇒ let tp = NdP ps 0 b lr; tk = NdP ks 0 True (LfP,LfP) in NdP qs False (if k then (tp,tk) else (tk,tp)) | (qs,k#ks 0,[]) ⇒ NdP ps b (mod2 (insertP ks 0) k lr) | (qs,[],p#ps 0) ⇒ let t = NdP ps 0 b lr in NdP qs True (if p then (LfP,t) else (t,LfP)) | (qs,[],[]) ⇒ NdP ps True lr) fun nodeP :: bool list ⇒ bool ⇒ trieP ∗ trieP ⇒ trieP where nodeP ps b lr = (if ¬ b ∧ lr = (LfP,LfP) then LfP else NdP ps b lr) fun deleteP :: bool list ⇒ trieP ⇒ trieP where deleteP ks LfP = LfP | deleteP ks (NdP ps b lr) = (case split ks ps of (qs,ks 0,p#ps 0) ⇒ NdP ps b lr | (qs,k#ks 0,[]) ⇒ nodeP ps b (mod2 (deleteP ks 0) k lr) | (qs,[],[]) ⇒ nodeP ps False lr)

41.2.1 Functional Correctness First step: trieP implements trie via the abstraction function abs trieP: fun prefix trie :: bool list ⇒ trie ⇒ trie where prefix trie [] t = t | prefix trie (k#ks) t = (let t 0 = prefix trie ks t in Nd False (if k then (Lf ,t 0) else (t 0,Lf ))) fun abs trieP :: trieP ⇒ trie where abs trieP LfP = Lf | abs trieP (NdP ps b (l,r)) = prefix trie ps (Nd b (abs trieP l, abs trieP r)) Correctness of isinP: lemma isin prefix trie: isin (prefix trie ps t) ks = (ps = take (length ps) ks ∧ isin t (drop (length ps) ks)) apply(induction ps arbitrary: ks) apply(auto split: list.split)

203 done lemma abs trieP isinP: isinP t ks = isin (abs trieP t) ks apply(induction t arbitrary: ks rule: abs trieP.induct) apply(auto simp: isin prefix trie split: list.split) done Correctness of insertP: lemma prefix trie Lfs: prefix trie ks (Nd True (Lf ,Lf )) = insert ks Lf apply(induction ks) apply auto done lemma insert prefix trie same: insert ps (prefix trie ps (Nd b lr)) = prefix trie ps (Nd True lr) apply(induction ps) apply auto done lemma insert append: insert (ks @ ks 0)(prefix trie ks t) = prefix trie ks (insert ks 0 t) apply(induction ks) apply auto done lemma prefix trie append: prefix trie (ps @ qs) t = prefix trie ps (prefix trie qs t) apply(induction ps) apply auto done lemma split if : split ks ps = (qs, ks 0, ps 0) =⇒ ks = qs @ ks 0 ∧ ps = qs @ ps 0 ∧ (ks 0 6= [] ∧ ps 0 6= [] −→ hd ks 0 6= hd ps 0) apply(induction ks ps arbitrary: qs ks 0 ps 0 rule: split.induct) apply(auto split: prod.splits if splits) done lemma abs trieP insertP: abs trieP (insertP ks t) = insert ks (abs trieP t) apply(induction t arbitrary: ks) apply(auto simp: prefix trie Lfs insert prefix trie same insert append pre- fix trie append dest!: split if split: list.split prod.split if splits)

204 done Correctness of deleteP: lemma prefix trie Lf : prefix trie xs t = Lf ←→ xs = [] ∧ t = Lf by(cases xs)(auto) lemma abs trieP Lf : abs trieP t = Lf ←→ t = LfP by(cases t)(auto simp: prefix trie Lf ) lemma delete prefix trie: delete xs (prefix trie xs (Nd b (l,r))) = (if (l,r) = (Lf ,Lf ) then Lf else prefix trie xs (Nd False (l,r))) by(induction xs)(auto simp: prefix trie Lf ) lemma delete append prefix trie: delete (xs @ ys)(prefix trie xs t) = (if delete ys t = Lf then Lf else prefix trie xs (delete ys t)) by(induction xs)(auto simp: prefix trie Lf ) lemma delete abs trieP: delete ks (abs trieP t) = abs trieP (deleteP ks t) apply(induction t arbitrary: ks) apply(auto simp: delete prefix trie delete append prefix trie prefix trie append prefix trie Lf abs trieP Lf dest!: split if split: if splits list.split prod.split) done The overall correctness proof. Simply composes correctness lemmas. definition set trieP :: trieP ⇒ bool list set where set trieP = set trie o abs trieP lemma isinP set trieP: isinP t xs = (xs ∈ set trieP t) by(simp add: abs trieP isinP set trie isin set trieP def ) lemma set trieP insertP: set trieP (insertP xs t) = set trieP t ∪ {xs} by(simp add: abs trieP insertP set trie insert set trieP def ) lemma set trieP deleteP: set trieP (deleteP xs t) = set trieP t − {xs} by(auto simp: set trie delete set trieP def simp flip: delete abs trieP) interpretation SP: Set where empty = emptyP and isin = isinP and insert = insertP and delete = deleteP and set = set trieP and invar = λt. True

205 proof (standard, goal cases) case 1 show ?case by (simp add: set trieP def set trie def ) next case 2 show ?case by(rule isinP set trieP) next case 3 thus ?case by (auto simp: set trieP insertP) next case 4 thus ?case by(auto simp: set trieP deleteP) qed (rule TrueI )+ end

42 Queue Specification theory Queue Spec imports Main begin The basic queue interface with list-based specification: locale Queue = fixes empty :: 0q fixes enq :: 0a ⇒ 0q ⇒ 0q fixes first :: 0q ⇒ 0a fixes deq :: 0q ⇒ 0q fixes is empty :: 0q ⇒ bool fixes list :: 0q ⇒ 0a list fixes invar :: 0q ⇒ bool assumes list empty: list empty = [] assumes list enq: invar q =⇒ list(enq x q) = list q @[x] assumes list deq: invar q =⇒ list(deq q) = tl(list q) assumes list first: invar q =⇒ ¬ list q = [] =⇒ first q = hd(list q) assumes list is empty: invar q =⇒ is empty q = (list q = []) assumes invar empty: invar empty assumes invar enq: invar q =⇒ invar(enq x q) assumes invar deq: invar q =⇒ invar(deq q) end theory Reverse imports Main begin fun T append :: 0a list ⇒ 0a list ⇒ nat where T append [] ys = 1 | T append (x#xs) ys = T append xs ys + 1

206 fun T rev :: 0a list ⇒ nat where T rev [] = 1 | T rev (x#xs) = T rev xs + T append (rev xs)[x] + 1 lemma T append: T append xs ys = length xs + 1 by(induction xs) auto lemma T rev: T rev xs ≤ (length xs + 1 )ˆ2 by(induction xs)(auto simp: T append power2 eq square) fun itrev :: 0a list ⇒ 0a list ⇒ 0a list where itrev [] ys = ys | itrev (x#xs) ys = itrev xs (x # ys) lemma itrev: itrev xs ys = rev xs @ ys by(induction xs arbitrary: ys) auto lemma itrev Nil: itrev xs [] = rev xs by(simp add: itrev) fun T itrev :: 0a list ⇒ 0a list ⇒ nat where T itrev [] ys = 1 | T itrev (x#xs) ys = T itrev xs (x # ys) + 1 lemma T itrev: T itrev xs ys = length xs + 1 by(induction xs arbitrary: ys) auto end

43 Queue Implementation via 2 Lists theory Queue 2Lists imports Queue Spec Reverse begin Definitions: type synonym 0a queue = 0a list × 0a list fun norm :: 0a queue ⇒ 0a queue where norm (fs,rs) = (if fs = [] then (itrev rs [], []) else (fs,rs))

207 fun enq :: 0a ⇒ 0a queue ⇒ 0a queue where enq a (fs,rs) = norm(fs, a # rs) fun deq :: 0a queue ⇒ 0a queue where deq (fs,rs) = (if fs = [] then (fs,rs) else norm(tl fs,rs)) fun first :: 0a queue ⇒ 0a where first (a # fs,rs) = a fun is empty :: 0a queue ⇒ bool where is empty (fs,rs) = (fs = []) fun list :: 0a queue ⇒ 0a list where list (fs,rs) = fs @ rev rs fun invar :: 0a queue ⇒ bool where invar (fs,rs) = (fs = [] −→ rs = []) Implementation correctness: interpretation Queue where empty = ([],[]) and enq = enq and deq = deq and first = first and is empty = is empty and list = list and invar = invar proof (standard, goal cases) case 1 show ?case by (simp) next case (2 q) thus ?case by(cases q)(simp) next case (3 q) thus ?case by(cases q)(simp add: itrev Nil) next case (4 q) thus ?case by(cases q)(auto simp: neq Nil conv) next case (5 q) thus ?case by(cases q)(auto) next case 6 show ?case by(simp) next case (7 q) thus ?case by(cases q)(simp) next case (8 q) thus ?case by(cases q)(simp) qed Running times: fun T norm :: 0a queue ⇒ nat where T norm (fs,rs) = (if fs = [] then T itrev rs [] else 0 ) + 1

208 fun T enq :: 0a ⇒ 0a queue ⇒ nat where T enq a (fs,rs) = T norm(fs, a # rs) + 1 fun T deq :: 0a queue ⇒ nat where T deq (fs,rs) = (if fs = [] then 0 else T norm(tl fs,rs)) + 1 fun T first :: 0a queue ⇒ nat where T first (a # fs,rs) = 1 fun T is empty :: 0a queue ⇒ nat where T is empty (fs,rs) = 1 Amortized running times: fun Φ :: 0a queue ⇒ nat where Φ(fs,rs) = length rs lemma a enq: T enq a (fs,rs) + Φ(enq a (fs,rs)) − Φ(fs,rs) ≤ 4 by(auto simp: T itrev) lemma a deq: T deq (fs,rs) + Φ(deq (fs,rs)) − Φ(fs,rs) ≤ 3 by(auto simp: T itrev) end

44 Priority Queue Specifications theory Priority Queue Specs imports HOL−Library.Multiset begin Priority queue interface + specification: locale Priority Queue = fixes empty :: 0q and is empty :: 0q ⇒ bool and insert :: 0a::linorder ⇒ 0q ⇒ 0q and get min :: 0q ⇒ 0a and del min :: 0q ⇒ 0q and invar :: 0q ⇒ bool and mset :: 0q ⇒ 0a multiset assumes mset empty: mset empty = {#} and is empty: invar q =⇒ is empty q = (mset q = {#}) and mset insert: invar q =⇒ mset (insert x q) = mset q + {#x#} and mset del min: invar q =⇒ mset q 6= {#} =⇒ mset (del min q) = mset q − {# get min q #}

209 and mset get min: invar q =⇒ mset q 6= {#} =⇒ get min q = Min mset (mset q) and invar empty: invar empty and invar insert: invar q =⇒ invar (insert x q) and invar del min: invar q =⇒ mset q 6= {#} =⇒ invar (del min q) Extend locale with merge. Need to enforce that 0q is the same in both locales. locale Priority Queue Merge = Priority Queue where empty = empty for empty :: 0q + fixes merge :: 0q ⇒ 0q ⇒ 0q assumes mset merge: [[ invar q1 ; invar q2 ]] =⇒ mset (merge q1 q2 ) = mset q1 + mset q2 and invar merge: [[ invar q1 ; invar q2 ]] =⇒ invar (merge q1 q2 ) end

45 Heaps theory Heaps imports HOL−Library.Tree Multiset Priority Queue Specs begin Heap = priority queue on trees: locale Heap = fixes insert :: ( 0a::linorder) ⇒ 0a tree ⇒ 0a tree and del min :: 0a tree ⇒ 0a tree assumes mset insert: heap q =⇒ mset tree (insert x q) = {#x#} + mset tree q and mset del min: [[ heap q; q 6= Leaf ]] =⇒ mset tree (del min q) = mset tree q − {#value q#} and heap insert: heap q =⇒ heap(insert x q) and heap del min: heap q =⇒ heap(del min q) begin definition empty :: 0a tree where empty = Leaf fun is empty :: 0a tree ⇒ bool where is empty t = (t = Leaf )

210 sublocale Priority Queue where empty = empty and is empty = is empty and insert = insert and get min = value and del min = del min and invar = heap and mset = mset tree proof (standard, goal cases) case 1 thus ?case by (simp add: empty def ) next case 2 thus ?case by(auto) next case 3 thus ?case by(simp add: mset insert) next case 4 thus ?case by(simp add: mset del min) next case 5 thus ?case by(auto simp: neq Leaf iff Min insert2 simp del: Un iff ) next case 6 thus ?case by(simp add: empty def ) next case 7 thus ?case by(simp add: heap insert) next case 8 thus ?case by(simp add: heap del min) qed end Once you have merge, insert and del min are easy: locale Heap Merge = fixes merge :: 0a::linorder tree ⇒ 0a tree ⇒ 0a tree assumes mset merge: [[ heap q1 ; heap q2 ]] =⇒ mset tree (merge q1 q2 ) = mset tree q1 + mset tree q2 and invar merge: [[ heap q1 ; heap q2 ]] =⇒ heap (merge q1 q2 ) begin fun insert :: 0a ⇒ 0a tree ⇒ 0a tree where insert x t = merge (Node Leaf x Leaf ) t fun del min :: 0a tree ⇒ 0a tree where del min Leaf = Leaf | del min (Node l x r) = merge l r interpretation Heap insert del min proof(standard, goal cases) case 1 thus ?case by(simp add:mset merge) next case (2 q) thus ?case by(cases q)(auto simp: mset merge)

211 next case 3 thus ?case by (simp add: invar merge) next case (4 q) thus ?case by (cases q)(auto simp: invar merge) qed sublocale PQM : Priority Queue Merge where empty = empty and is empty = is empty and insert = insert and get min = value and del min = del min and invar = heap and mset = mset tree and merge = merge proof(standard, goal cases) case 1 thus ?case by (simp add: mset merge) next case 2 thus ?case by (simp add: invar merge) qed end end

46 Leftist Heap theory Leftist Heap imports HOL−Library.Pattern Aliases Tree2 Priority Queue Specs Complex Main begin fun mset tree :: ( 0a∗ 0b) tree ⇒ 0a multiset where mset tree Leaf = {#} | mset tree (Node l (a, ) r) = {#a#} + mset tree l + mset tree r type synonym 0a lheap = ( 0a∗nat)tree fun mht :: 0a lheap ⇒ nat where mht Leaf = 0 | mht (Node ( , n) ) = n The invariants: fun (in linorder) heap :: ( 0a∗ 0b) tree ⇒ bool where heap Leaf = True | heap (Node l (m, ) r) =

212 ((∀ x ∈ set tree l ∪ set tree r. m ≤ x) ∧ heap l ∧ heap r) fun ltree :: 0a lheap ⇒ bool where ltree Leaf = True | ltree (Node l (a, n) r) = (min height l ≥ min height r ∧ n = min height r + 1 ∧ ltree l & ltree r) definition empty :: 0a lheap where empty = Leaf definition node :: 0a lheap ⇒ 0a ⇒ 0a lheap ⇒ 0a lheap where node l a r = (let rl = mht l; rr = mht r in if rl ≥ rr then Node l (a,rr+1 ) r else Node r (a,rl+1 ) l) fun get min :: 0a lheap ⇒ 0a where get min(Node l (a, n) r) = a For function merge: unbundle pattern aliases fun merge :: 0a::ord lheap ⇒ 0a lheap ⇒ 0a lheap where merge Leaf t = t | merge t Leaf = t | merge (Node l1 (a1 , n1 ) r1 =: t1 )(Node l2 (a2 , n2 ) r2 =: t2 ) = (if a1 ≤ a2 then node l1 a1 (merge r1 t2 ) else node l2 a2 (merge t1 r2 )) Termination of merge: by sum or lexicographic product of the sizes of the two arguments. Isabelle uses a lexicographic product. lemma merge code: merge t1 t2 = (case (t1 ,t2 ) of (Leaf , ) ⇒ t2 | ( , Leaf ) ⇒ t1 | (Node l1 (a1 , n1 ) r1 , Node l2 (a2 , n2 ) r2 ) ⇒ if a1 ≤ a2 then node l1 a1 (merge r1 t2 ) else node l2 a2 (merge t1 r2 )) by(induction t1 t2 rule: merge.induct)(simp all split: tree.split) hide const (open) insert definition insert :: 0a::ord ⇒ 0a lheap ⇒ 0a lheap where insert x t = merge (Node Leaf (x,1 ) Leaf ) t fun del min :: 0a::ord lheap ⇒ 0a lheap where del min Leaf = Leaf |

213 del min (Node l r) = merge l r

46.1 Lemmas lemma mset tree empty: mset tree t = {#} ←→ t = Leaf by(cases t) auto lemma mht eq min height: ltree t =⇒ mht t = min height t by(cases t) auto lemma ltree node: ltree (node l a r) ←→ ltree l ∧ ltree r by(auto simp add: node def mht eq min height) lemma heap node: heap (node l a r) ←→ heap l ∧ heap r ∧ (∀ x ∈ set tree l ∪ set tree r. a ≤ x) by(auto simp add: node def ) lemma set tree mset: set tree t = set mset(mset tree t) by(induction t) auto

46.2 Functional Correctness lemma mset merge: mset tree (merge t1 t2 ) = mset tree t1 + mset tree t2 by (induction t1 t2 rule: merge.induct)(auto simp add: node def ac simps) lemma mset insert: mset tree (insert x t) = mset tree t + {#x#} by (auto simp add: insert def mset merge) lemma get min: [[ heap t; t 6= Leaf ]] =⇒ get min t = Min(set tree t) by (cases t)(auto simp add: eq Min iff ) lemma mset del min: mset tree (del min t) = mset tree t − {# get min t #} by (cases t)(auto simp: mset merge) lemma ltree merge: [[ ltree l; ltree r ]] =⇒ ltree (merge l r) by(induction l r rule: merge.induct)(auto simp: ltree node) lemma heap merge: [[ heap l; heap r ]] =⇒ heap (merge l r) proof(induction l r rule: merge.induct) case 3 thus ?case by(auto simp: heap node mset merge ball Un set tree mset) qed simp all lemma ltree insert: ltree t =⇒ ltree(insert x t)

214 by(simp add: insert def ltree merge del: merge.simps split: tree.split) lemma heap insert: heap t =⇒ heap(insert x t) by(simp add: insert def heap merge del: merge.simps split: tree.split) lemma ltree del min: ltree t =⇒ ltree(del min t) by(cases t)(auto simp add: ltree merge simp del: merge.simps) lemma heap del min: heap t =⇒ heap(del min t) by(cases t)(auto simp add: heap merge simp del: merge.simps) Last step of functional correctness proof: combine all the above lemmas to show that leftist heaps satisfy the specification of priority queues with merge. interpretation lheap: Priority Queue Merge where empty = empty and is empty = λt. t = Leaf and insert = insert and del min = del min and get min = get min and merge = merge and invar = λt. heap t ∧ ltree t and mset = mset tree proof(standard, goal cases) case 1 show ?case by (simp add: empty def ) next case (2 q) show ?case by (cases q) auto next case 3 show ?case by(rule mset insert) next case 4 show ?case by(rule mset del min) next case 5 thus ?case by(simp add: get min mset tree empty set tree mset) next case 6 thus ?case by(simp add: empty def ) next case 7 thus ?case by(simp add: heap insert ltree insert) next case 8 thus ?case by(simp add: heap del min ltree del min) next case 9 thus ?case by (simp add: mset merge) next case 10 thus ?case by (simp add: heap merge ltree merge) qed

46.3 Complexity Explicit termination argument: sum of sizes

215 fun T merge :: 0a::ord lheap ⇒ 0a lheap ⇒ nat where T merge Leaf t = 1 | T merge t Leaf = 1 | T merge (Node l1 (a1 , n1 ) r1 =: t1 )(Node l2 (a2 , n2 ) r2 =: t2 ) = (if a1 ≤ a2 then T merge r1 t2 else T merge t1 r2 ) + 1 definition T insert :: 0a::ord ⇒ 0a lheap ⇒ nat where T insert x t = T merge (Node Leaf (x, 1 ) Leaf ) t + 1 fun T del min :: 0a::ord lheap ⇒ nat where T del min Leaf = 1 | T del min (Node l r) = T merge l r + 1 lemma T merge min height: ltree l =⇒ ltree r =⇒ T merge l r ≤ min height l + min height r + 1 proof(induction l r rule: merge.induct) case 3 thus ?case by(auto) qed simp all corollary T merge log: assumes ltree l ltree r shows T merge l r ≤ log 2 (size1 l) + log 2 (size1 r) + 1 using le log2 of power[OF min height size1 [of l]] le log2 of power[OF min height size1 [of r]] T merge min height[of l r] assms by linarith corollary T insert log: ltree t =⇒ T insert x t ≤ log 2 (size1 t) + 3 using T merge log[of Node Leaf (x, 1 ) Leaf t] by(simp add: T insert def split: tree.split) lemma ld ld 1 less: assumes x > 0 y > 0 shows log 2 x + log 2 y + 1 < 2 ∗ log 2 (x+y) proof − have 2 powr (log 2 x + log 2 y + 1 ) = 2 ∗x∗y using assms by(simp add: powr add) also have . . . < (x+y)ˆ2 using assms by(simp add: numeral eq Suc algebra simps add pos pos) also have ... = 2 powr (2 ∗ log 2 (x+y)) using assms by(simp add: powr add log powr[symmetric]) finally show ?thesis by simp qed corollary T del min log: assumes ltree t

216 shows T del min t ≤ 2 ∗ log 2 (size1 t) + 1 proof(cases t rule: tree2 cases) case Leaf thus ?thesis using assms by simp next case [simp]: (Node l r) have T del min t = T merge l r + 1 by simp also have ... ≤ log 2 (size1 l) + log 2 (size1 r) + 2 using hltree t i T merge log[of l r] by (auto simp del: T merge.simps) also have ... ≤ 2 ∗ log 2 (size1 t) + 1 using ld ld 1 less[of size1 l size1 r] by (simp) finally show ?thesis . qed end

47 Binomial Heap theory Binomial Heap imports HOL−Library.Pattern Aliases Complex Main Priority Queue Specs begin We formalize the binomial heap presentation from Okasaki’s book. We show the functional correctness and complexity of all operations. The presentation is engineered for simplicity, and most proofs are straight- forward and automatic.

47.1 Binomial Tree and Heap Datatype datatype 0a tree = Node (rank: nat)(root: 0a)(children: 0a tree list) type synonym 0a heap = 0a tree list

47.1.1 Multiset of elements fun mset tree :: 0a::linorder tree ⇒ 0a multiset where mset tree (Node a ts) = {#a#} + (P t∈#mset ts. mset tree t) definition mset heap :: 0a::linorder heap ⇒ 0a multiset where mset heap ts = (P t∈#mset ts. mset tree t) lemma mset tree simp alt[simp]:

217 mset tree (Node r a ts) = {#a#} + mset heap ts unfolding mset heap def by auto declare mset tree.simps[simp del] lemma mset tree nonempty[simp]: mset tree t 6= {#} by (cases t) auto lemma mset heap Nil[simp]: mset heap [] = {#} by (auto simp: mset heap def ) lemma mset heap Cons[simp]: mset heap (t#ts) = mset tree t + mset heap ts by (auto simp: mset heap def ) lemma mset heap empty iff [simp]: mset heap ts = {#} ←→ ts=[] by (auto simp: mset heap def ) lemma root in mset[simp]: root t ∈# mset tree t by (cases t) auto lemma mset heap rev eq[simp]: mset heap (rev ts) = mset heap ts by (auto simp: mset heap def )

47.1.2 Invariants Binomial tree fun invar btree :: 0a::linorder tree ⇒ bool where invar btree (Node r x ts) ←→ (∀ t∈set ts. invar btree t) ∧ map rank ts = rev [0 ..

218 47.2 Operations and Their Functional Correctness 47.2.1 link context includes pattern aliases begin fun link :: ( 0a::linorder) tree ⇒ 0a tree ⇒ 0a tree where 0 link (Node r x 1 ts1 =: t1)(Node r x 2 ts2 =: t2) = (if x 1≤x 2 then Node (r+1 ) x 1 (t2#ts1) else Node (r+1 ) x 2 (t1#ts2)) end lemma invar link: assumes invar tree t 1 assumes invar tree t 2 assumes rank t 1 = rank t 2 shows invar tree (link t 1 t2) using assms unfolding invar tree def by (cases (t1, t2) rule: link.cases) auto lemma rank link[simp]: rank (link t 1 t2) = rank t 1 + 1 by (cases (t1, t2) rule: link.cases) simp lemma mset link[simp]: mset tree (link t 1 t2) = mset tree t1 + mset tree t2 by (cases (t1, t2) rule: link.cases) simp

47.2.2 ins tree fun ins tree :: 0a::linorder tree ⇒ 0a heap ⇒ 0a heap where ins tree t [] = [t] | ins tree t 1 (t2#ts) = (if rank t 1 < rank t 2 then t 1#t2#ts else ins tree (link t 1 t2) ts) lemma invar tree0 [simp]: invar tree (Node 0 x []) unfolding invar tree def by auto lemma invar Cons[simp]: invar (t#ts) ←→ invar tree t ∧ invar ts ∧ (∀ t 0∈set ts. rank t < rank t 0) by (auto simp: invar def ) lemma invar ins tree:

219 assumes invar tree t assumes invar ts assumes ∀ t 0∈set ts. rank t ≤ rank t 0 shows invar (ins tree t ts) using assms by (induction t ts rule: ins tree.induct)(auto simp: invar link less eq Suc le[symmetric]) lemma mset heap ins tree[simp]: mset heap (ins tree t ts) = mset tree t + mset heap ts by (induction t ts rule: ins tree.induct) auto lemma ins tree rank bound: assumes t 0 ∈ set (ins tree t ts) 0 0 assumes ∀ t ∈set ts. rank t 0 < rank t assumes rank t 0 < rank t 0 shows rank t 0 < rank t using assms by (induction t ts rule: ins tree.induct)(auto split: if splits)

47.2.3 insert hide const (open) insert definition insert :: 0a::linorder ⇒ 0a heap ⇒ 0a heap where insert x ts = ins tree (Node 0 x []) ts lemma invar insert[simp]: invar t =⇒ invar (insert x t) by (auto intro!: invar ins tree simp: insert def ) lemma mset heap insert[simp]: mset heap (insert x t) = {#x#} + mset heap t by(auto simp: insert def )

47.2.4 merge context includes pattern aliases begin fun merge :: 0a::linorder heap ⇒ 0a heap ⇒ 0a heap where merge ts1 [] = ts1 | merge [] ts2 = ts2 | merge (t1#ts1 =: h1)(t2#ts2 =: h2) = ( if rank t 1 < rank t 2 then t 1 # merge ts1 h2 else

220 if rank t 2 < rank t 1 then t 2 # merge h1 ts2 else ins tree (link t 1 t2)(merge ts1 ts2) ) end lemma merge simp2 [simp]: merge [] ts2 = ts2 by (cases ts2) auto lemma merge rank bound: 0 assumes t ∈ set (merge ts1 ts2) assumes ∀ t1∈set ts1. rank t < rank t 1 assumes ∀ t2∈set ts2. rank t < rank t 2 shows rank t < rank t 0 using assms 0 by (induction ts1 ts2 arbitrary: t rule: merge.induct) (auto split: if splits simp: ins tree rank bound) lemma invar merge[simp]: assumes invar ts1 assumes invar ts2 shows invar (merge ts1 ts2) using assms by (induction ts1 ts2 rule: merge.induct) (auto 0 3 simp: Suc le eq intro!: invar ins tree invar link elim!: merge rank bound) Longer, more explicit proof of invar merge, to illustrate the application of the merge rank bound lemma. lemma assumes invar ts1 assumes invar ts2 shows invar (merge ts1 ts2) using assms proof (induction ts1 ts2 rule: merge.induct) case (3 t 1 ts1 t2 ts2) — Invariants of the parts can be shown automatically from 3 .prems have [simp]: invar tree t 1 invar tree t 2

by auto

— These are the three cases of the merge function consider (LT ) rank t 1 < rank t 2 | (GT ) rank t 1 > rank t 2

221 | (EQ) rank t 1 = rank t 2 using antisym conv3 by blast then show ?case proof cases case LT — merge takes the first tree from the left heap then have merge (t1 # ts1)(t2 # ts2) = t1 # merge ts1 (t2 # ts2) by simp also have invar ... proof (simp, intro conjI ) — Invariant follows from induction hypothesis show invar (merge ts1 (t2 # ts2)) using LT 3 .IH 3 .prems by simp

— It remains to show that t1 has smallest rank. 0 0 show ∀ t ∈set (merge ts1 (t2 # ts2)). rank t 1 < rank t — Which is done by auxiliary lemma merge rank bound using LT 3 .prems by (force elim!: merge rank bound) qed finally show ?thesis . next — merge takes the first tree from the right heap case GT — The proof is anaologous to the LT case then show ?thesis using 3 .prems 3 .IH by (force elim!: merge rank bound) next case [simp]: EQ — merge links both first trees, and inserts them into the merged remaining heaps have merge (t1 # ts1)(t2 # ts2) = ins tree (link t 1 t2)(merge ts1 ts2) by simp also have invar ... proof (intro invar ins tree invar link) — Invariant of merged remaining heaps follows by IH show invar (merge ts1 ts2) using EQ 3 .prems 3 .IH by auto

— For insertion, we have to show that the rank of the linked tree is ≤ the ranks in the merged remaining heaps 0 0 show ∀ t ∈set (merge ts1 ts2). rank (link t 1 t2) ≤ rank t proof − — Which is, again, done with the help of merge rank bound have rank (link t 1 t2) = Suc (rank t 2) by simp thus ?thesis using 3 .prems by (auto simp: Suc le eq elim!: merge rank bound) qed qed simp all finally show ?thesis .

222 qed qed auto lemma mset heap merge[simp]: mset heap (merge ts1 ts2) = mset heap ts1 + mset heap ts2 by (induction ts1 ts2 rule: merge.induct) auto

47.2.5 get min fun get min :: 0a::linorder heap ⇒ 0a where get min [t] = root t | get min (t#ts) = min (root t)(get min ts) lemma invar tree root min: assumes invar tree t assumes x ∈# mset tree t shows root t ≤ x using assms unfolding invar tree def by (induction t arbitrary: x rule: mset tree.induct)(fastforce simp: mset heap def ) lemma get min mset: assumes ts6=[] assumes invar ts assumes x ∈# mset heap ts shows get min ts ≤ x using assms apply (induction ts arbitrary: x rule: get min.induct) apply (auto simp: invar tree root min min def intro: order trans; meson linear order trans invar tree root min )+ done lemma get min member: ts6=[] =⇒ get min ts ∈# mset heap ts by (induction ts rule: get min.induct)(auto simp: min def ) lemma get min: assumes mset heap ts 6= {#} assumes invar ts shows get min ts = Min mset (mset heap ts) using assms get min member get min mset by (auto simp: eq Min iff )

223 47.2.6 get min rest fun get min rest :: 0a::linorder heap ⇒ 0a tree × 0a heap where get min rest [t] = (t,[]) | get min rest (t#ts) = (let (t 0,ts 0) = get min rest ts in if root t ≤ root t 0 then (t,ts) else (t 0,t#ts 0)) lemma get min rest get min same root: assumes ts6=[] assumes get min rest ts = (t 0,ts 0) shows root t 0 = get min ts using assms by (induction ts arbitrary: t 0 ts 0 rule: get min.induct)(auto simp: min def split: prod.splits) lemma mset get min rest: assumes get min rest ts = (t 0,ts 0) assumes ts6=[] shows mset ts = {#t 0#} + mset ts 0 using assms by (induction ts arbitrary: t 0 ts 0 rule: get min.induct)(auto split: prod.splits if splits) lemma set get min rest: assumes get min rest ts = (t 0, ts 0) assumes ts6=[] shows set ts = Set.insert t 0 (set ts 0) using mset get min rest[OF assms, THEN arg cong[where f =set mset]] by auto lemma invar get min rest: assumes get min rest ts = (t 0,ts 0) assumes ts6=[] assumes invar ts shows invar tree t 0 and invar ts 0 proof − have invar tree t 0 ∧ invar ts 0 using assms proof (induction ts arbitrary: t 0 ts 0 rule: get min.induct) case (2 t v va) then show ?case apply (clarsimp split: prod.splits if splits) apply (drule set get min rest; fastforce) done

224 qed auto thus invar tree t 0 and invar ts 0 by auto qed

47.2.7 del min definition del min :: 0a::linorder heap ⇒ 0a::linorder heap where del min ts = (case get min rest ts of (Node r x ts1, ts2) ⇒ merge (rev ts1) ts2) lemma invar del min[simp]: assumes ts 6= [] assumes invar ts shows invar (del min ts) using assms unfolding del min def by (auto split: prod.split tree.split intro!: invar merge invar children dest: invar get min rest ) lemma mset heap del min: assumes ts 6= [] shows mset heap ts = mset heap (del min ts) + {# get min ts #} using assms unfolding del min def apply (clarsimp split: tree.split prod.split) apply (frule (1 ) get min rest get min same root) apply (frule (1 ) mset get min rest) apply (auto simp: mset heap def ) done

47.2.8 Instantiating the Priority Queue Locale Last step of functional correctness proof: combine all the above lemmas to show that binomial heaps satisfy the specification of priority queues with merge. interpretation binheap: Priority Queue Merge where empty = [] and is empty = (=) [] and insert = insert and get min = get min and del min = del min and merge = merge and invar = invar and mset = mset heap proof (unfold locales, goal cases) case 1 thus ?case by simp

225 next case 2 thus ?case by auto next case 3 thus ?case by auto next case (4 q) thus ?case using mset heap del min[of q] get min[OF hinvar q i] by (auto simp: union single eq diff ) next case (5 q) thus ?case using get min[of q] by auto next case 6 thus ?case by (auto simp add: invar def ) next case 7 thus ?case by simp next case 8 thus ?case by simp next case 9 thus ?case by simp next case 10 thus ?case by simp qed

47.3 Complexity The size of a binomial tree is determined by its rank lemma size mset btree: assumes invar btree t shows size (mset tree t) = 2ˆrank t using assms proof (induction t) case (Node r v ts) hence IH : size (mset tree t) = 2ˆrank t if t ∈ set ts for t using that by auto

from Node have COMPL: map rank ts = rev [0 ..

have size (mset heap ts) = (P t←ts. size (mset tree t)) by (induction ts) auto also have ... = (P t←ts. 2ˆrank t) using IH by (auto cong: map cong) also have ... = (P r←map rank ts. 2ˆr) by (induction ts) auto also have ... = (P i∈{0 ..

226 by (auto simp: rev map[symmetric] interv sum list conv sum set nat) also have ... = 2ˆr − 1 by (induction r) auto finally show ?case by (simp) qed lemma size mset tree: assumes invar tree t shows size (mset tree t) = 2ˆrank t using assms unfolding invar tree def by (simp add: size mset btree) The length of a binomial heap is bounded by the number of its elements lemma size mset heap: assumes invar ts shows length ts ≤ log 2 (size (mset heap ts) + 1 ) proof − from hinvar ts i have ASC : sorted wrt (<)(map rank ts) and TINV : ∀ t∈set ts. invar tree t unfolding invar def by auto

have (2 ::nat)ˆlength ts = (P i∈{0 ..

47.3.1 Timing Functions We define timing functions for each operation, and provide estimations of their complexity. definition T link :: 0a::linorder tree ⇒ 0a tree ⇒ nat where [simp]: T link = 1

227 This function is non-canonical: we omitted a +1 in the else-part, to keep the following analysis simpler and more to the point. fun T ins tree :: 0a::linorder tree ⇒ 0a heap ⇒ nat where T ins tree t [] = 1 | T ins tree t 1 (t2 # ts) = ( (if rank t 1 < rank t 2 then 1 else T link t 1 t2 + T ins tree (link t 1 t2) ts) ) definition T insert :: 0a::linorder ⇒ 0a heap ⇒ nat where T insert x ts = T ins tree (Node 0 x []) ts + 1 lemma T ins tree simple bound: T ins tree t ts ≤ length ts + 1 by (induction t ts rule: T ins tree.induct) auto

47.3.2 T insert lemma T insert bound: assumes invar ts shows T insert x ts ≤ log 2 (size (mset heap ts) + 1 ) + 2 proof − have real (T insert x ts) ≤ real (length ts) + 2 unfolding T insert def using T ins tree simple bound using of nat mono by fastforce also note size mset heap[OF hinvar ts i] finally show ?thesis by simp qed

47.3.3 T merge context includes pattern aliases begin fun T merge :: 0a::linorder heap ⇒ 0a heap ⇒ nat where T merge ts1 [] = 1 | T merge [] ts2 = 1 | T merge (t1#ts1 =: h1)(t2#ts2 =: h2) = 1 + ( if rank t 1 < rank t 2 then T merge ts1 h2 else if rank t2 < rank t 1 then T merge h1 ts2 else T ins tree (link t 1 t2)(merge ts1 ts2) + T merge ts1 ts2 ) end

228 A crucial idea is to estimate the time in correlation with the result length, as each carry reduces the length of the result. lemma T ins tree length: T ins tree t ts + length (ins tree t ts) = 2 + length ts by (induction t ts rule: ins tree.induct) auto lemma T merge length: length (merge ts1 ts2) + T merge ts1 ts2 ≤ 2 ∗ (length ts1 + length ts2) + 1 by (induction ts1 ts2 rule: T merge.induct) (auto simp: T ins tree length algebra simps) Finally, we get the desired logarithmic bound lemma T merge bound: fixes ts1 ts2 defines n1 ≡ size (mset heap ts1) defines n2 ≡ size (mset heap ts2) assumes invar ts1 invar ts2 shows T merge ts1 ts2 ≤ 4 ∗log 2 (n1 + n2 + 1 ) + 1 proof − note n defs = assms(1 ,2 )

have T merge ts1 ts2 ≤ 2 ∗ real (length ts1) + 2 ∗ real (length ts2) + 1 using T merge length[of ts1 ts2] by simp also note size mset heap[OF hinvar ts1i] also note size mset heap[OF hinvar ts2i] finally have T merge ts1 ts2 ≤ 2 ∗ log 2 (n1 + 1 ) + 2 ∗ log 2 (n2 + 1 ) + 1 unfolding n defs by (simp add: algebra simps) also have log 2 (n1 + 1 ) ≤ log 2 (n1 + n2 + 1 ) unfolding n defs by (simp add: algebra simps) also have log 2 (n2 + 1 ) ≤ log 2 (n1 + n2 + 1 ) unfolding n defs by (simp add: algebra simps) finally show ?thesis by (simp add: algebra simps) qed

47.3.4 T get min fun T get min :: 0a::linorder heap ⇒ nat where T get min [t] = 1 | T get min (t#ts) = 1 + T get min ts lemma T get min estimate: ts6=[] =⇒ T get min ts = length ts by (induction ts rule: T get min.induct) auto

229 lemma T get min bound: assumes invar ts assumes ts6=[] shows T get min ts ≤ log 2 (size (mset heap ts) + 1 ) proof − have 1 : T get min ts = length ts using assms T get min estimate by auto also note size mset heap[OF hinvar ts i] finally show ?thesis . qed

47.3.5 T del min fun T get min rest :: 0a::linorder heap ⇒ nat where T get min rest [t] = 1 | T get min rest (t#ts) = 1 + T get min rest ts lemma T get min rest estimate: ts6=[] =⇒ T get min rest ts = length ts by (induction ts rule: T get min rest.induct) auto lemma T get min rest bound: assumes invar ts assumes ts6=[] shows T get min rest ts ≤ log 2 (size (mset heap ts) + 1 ) proof − have 1 : T get min rest ts = length ts using assms T get min rest estimate by auto also note size mset heap[OF hinvar ts i] finally show ?thesis . qed Note that although the definition of function rev has quadratic complex- ity, it can and is implemented (via suitable code lemmas) as a linear time function. Thus the following definition is justified: definition T rev xs = length xs + 1 definition T del min :: 0a::linorder heap ⇒ nat where T del min ts = T get min rest ts + (case get min rest ts of (Node x ts1, ts2) ⇒ T rev ts1 + T merge (rev ts1) ts2 ) + 1 lemma T del min bound:

230 fixes ts defines n ≡ size (mset heap ts) assumes invar ts and ts6=[] shows T del min ts ≤ 6 ∗ log 2 (n+1 ) + 3 proof − obtain r x ts1 ts2 where GM : get min rest ts = (Node r x ts1, ts2) by (metis surj pair tree.exhaust sel)

have I1 : invar (rev ts1) and I2 : invar ts2 using invar get min rest[OF GM hts6=[]i hinvar ts i] invar children by auto

define n1 where n1 = size (mset heap ts1) define n2 where n2 = size (mset heap ts2)

have n1 ≤ n n1 + n2 ≤ n unfolding n def n1 def n2 def using mset get min rest[OF GM hts6=[]i] by (auto simp: mset heap def )

have T del min ts = real (T get min rest ts) + real (T rev ts1) + real (T merge (rev ts1) ts2) + 1 unfolding T del min def GM by simp also have T get min rest ts ≤ log 2 (n+1 ) using T get min rest bound[OF hinvar ts i hts6=[]i] unfolding n def by simp also have T rev ts1 ≤ 1 + log 2 (n1 + 1 ) unfolding T rev def n1 def using size mset heap[OF I1 ] by simp also have T merge (rev ts1) ts2 ≤ 4 ∗log 2 (n1 + n2 + 1 ) + 1 unfolding n1 def n2 def using T merge bound[OF I1 I2 ] by (simp add: algebra simps) finally have T del min ts ≤ log 2 (n+1 ) + log 2 (n1 + 1 ) + 4 ∗log 2 (real (n1 + n2) + 1 ) + 3 by (simp add: algebra simps) also note hn1 + n2 ≤ n i also note hn1 ≤ n i finally show ?thesis by (simp add: algebra simps) qed end

231 48 Time functions for various standard library op- erations theory Time Funs imports Main begin fun T length :: 0a list ⇒ nat where T length [] = 1 | T length (x # xs) = T length xs + 1 lemma T length eq: T length xs = length xs + 1 by (induction xs) auto lemmas [simp del] = T length.simps fun T map :: ( 0a ⇒ nat) ⇒ 0a list ⇒ nat where T map T f [] = 1 | T map T f (x # xs) = T f x + T map T f xs + 1 lemma T map eq: T map T f xs = (P x←xs. T f x) + length xs + 1 by (induction xs) auto lemmas [simp del] = T map.simps fun T filter :: ( 0a ⇒ nat) ⇒ 0a list ⇒ nat where T filter T p [] = 1 | T filter T p (x # xs) = T p x + T filter T p xs + 1 lemma T filter eq: T filter T p xs = (P x←xs. T p x) + length xs + 1 by (induction xs) auto lemmas [simp del] = T filter.simps fun T nth :: 0a list ⇒ nat ⇒ nat where T nth [] n = 1 | T nth (x # xs) n = (case n of 0 ⇒ 1 | Suc n 0 ⇒ T nth xs n 0 + 1 ) lemma T nth eq: T nth xs n = min n (length xs) + 1 by (induction xs n rule: T nth.induct)(auto split: nat.splits)

232 lemmas [simp del] = T nth.simps fun T take :: nat ⇒ 0a list ⇒ nat where T take n [] = 1 | T take n (x # xs) = (case n of 0 ⇒ 1 | Suc n 0 ⇒ T take n 0 xs + 1 ) lemma T take eq: T take n xs = min n (length xs) + 1 by (induction xs arbitrary: n)(auto split: nat.splits) fun T drop :: nat ⇒ 0a list ⇒ nat where T drop n [] = 1 | T drop n (x # xs) = (case n of 0 ⇒ 1 | Suc n 0 ⇒ T drop n 0 xs + 1 ) lemma T drop eq: T drop n xs = min n (length xs) + 1 by (induction xs arbitrary: n)(auto split: nat.splits) end

49 The Median-of-Medians Selection Algorithm theory Selection imports Complex Main Sorting Time Funs begin Note that there is significant overlap between this theory (which is in- tended mostly for the Functional Data Structures book) and the Median-of- Medians AFP entry.

49.1 Auxiliary material lemma replicate numeral: replicate (numeral n) x = x # replicate (pred numeral n) x by (simp add: numeral eq Suc) lemma isort correct: isort xs = sort xs using sorted isort mset isort by (metis properties for sort) lemma sum list replicate [simp]: sum list (replicate n x) = n ∗ x by (induction n) auto lemma mset concat: mset (concat xss) = sum list (map mset xss)

233 by (induction xss) simp all lemma set mset sum list [simp]: set mset (sum list xs) = (S x∈set xs. set mset x) by (induction xs) auto lemma filter mset image mset: filter mset P (image mset f A) = image mset f (filter mset (λx. P (f x)) A) by (induction A) auto lemma filter mset sum list: filter mset P (sum list xs) = sum list (map (filter mset P) xs) by (induction xs) simp all lemma sum mset mset mono: assumes (Vx. x ∈# A =⇒ f x ⊆# g x) shows (P x∈#A. f x) ⊆#(P x∈#A. g x) using assms by (induction A)(auto intro!: subset mset.add mono) lemma mset filter mono: assumes A ⊆# B Vx. x ∈# A =⇒ P x =⇒ Q x shows filter mset P A ⊆# filter mset Q B by (rule mset subset eqI )(insert assms, auto simp: mset subset eq count count eq zero iff ) lemma size mset sum mset distrib: size (sum mset A :: 0a multiset) = sum mset (image mset size A) by (induction A) auto lemma sum mset mono: assumes Vx. x ∈# A =⇒ f x ≤ (g x :: 0a :: {ordered ab semigroup add,comm monoid add}) shows (P x∈#A. f x) ≤ (P x∈#A. g x) using assms by (induction A)(auto intro!: add mono) lemma filter mset is empty iff : filter mset P A = {#} ←→ (∀ x. x ∈# A −→ ¬P x) by (auto simp: multiset eq iff count eq zero iff ) lemma sort eq Nil iff [simp]: sort xs = [] ←→ xs = [] by (metis set empty set sort) lemma sort mset cong: mset xs = mset ys =⇒ sort xs = sort ys by (metis sorted list of multiset mset)

234 lemma Min set sorted: sorted xs =⇒ xs 6= [] =⇒ Min (set xs) = hd xs by (cases xs; force intro: Min insert2 ) lemma hd sort: fixes xs :: 0a :: linorder list shows xs 6= [] =⇒ hd (sort xs) = Min (set xs) by (subst Min set sorted [symmetric]) auto lemma length filter conv size filter mset: length (filter P xs) = size (filter mset P (mset xs)) by (induction xs) auto lemma sorted filter less subset take: assumes sorted xs and i < length xs shows {#x ∈# mset xs. x < xs ! i#} ⊆# mset (take i xs) using assms proof (induction xs arbitrary: i rule: list.induct) case (Cons x xs i) show ?case proof (cases i) case 0 thus ?thesis using Cons.prems by (auto simp: filter mset is empty iff ) next case (Suc i 0) have {#y ∈# mset (x # xs). y < (x # xs)! i#} ⊆# add mset x {#y ∈# mset xs. y < xs ! i 0#} using Suc Cons.prems by (auto) also have ... ⊆# add mset x (mset (take i 0 xs)) unfolding mset subset eq add mset cancel using Cons.prems Suc by (intro Cons.IH )(auto) also have ... = mset (take i (x # xs)) by (simp add: Suc) finally show ?thesis . qed qed auto lemma sorted filter greater subset drop: assumes sorted xs and i < length xs shows {#x ∈# mset xs. x > xs ! i#} ⊆# mset (drop (Suc i) xs) using assms proof (induction xs arbitrary: i rule: list.induct) case (Cons x xs i) show ?case proof (cases i)

235 case 0 thus ?thesis by (auto simp: sorted append filter mset is empty iff ) next case (Suc i 0) have {#y ∈# mset (x # xs). y > (x # xs)! i#} ⊆# {#y ∈# mset xs. y > xs ! i 0#} using Suc Cons.prems by (auto simp: set conv nth) also have ... ⊆# mset (drop (Suc i 0) xs) using Cons.prems Suc by (intro Cons.IH )(auto) also have ... = mset (drop (Suc i)(x # xs)) by (simp add: Suc) finally show ?thesis . qed qed auto

49.2 Chopping a list into equally-sized bits fun chop :: nat ⇒ 0a list ⇒ 0a list list where chop 0 = [] | chop [] = [] | chop n xs = take n xs # chop n (drop n xs) lemmas [simp del] = chop.simps This is an alternative induction rule for chop, which is often nicer to use. lemma chop induct 0 [case names trivial reduce]: assumes Vn xs. n = 0 ∨ xs = [] =⇒ P n xs assumes Vn xs. n > 0 =⇒ xs 6= [] =⇒ P n (drop n xs) =⇒ P n xs shows P n xs using assms proof induction schema show wf (measure (length ◦ snd)) by auto qed (blast | simp)+ lemma chop eq Nil iff [simp]: chop n xs = [] ←→ n = 0 ∨ xs = [] by (induction n xs rule: chop.induct; subst chop.simps) auto lemma chop 0 [simp]: chop 0 xs = [] by (simp add: chop.simps) lemma chop Nil [simp]: chop n [] = [] by (cases n)(auto simp: chop.simps) lemma chop reduce: n > 0 =⇒ xs 6= [] =⇒ chop n xs = take n xs # chop

236 n (drop n xs) by (cases n; cases xs)(auto simp: chop.simps) lemma concat chop [simp]: n > 0 =⇒ concat (chop n xs) = xs by (induction n xs rule: chop.induct; subst chop.simps) auto lemma chop elem not Nil [dest]: ys ∈ set (chop n xs) =⇒ ys 6= [] by (induction n xs rule: chop.induct; subst (asm) chop.simps) (auto simp: eq commute[of []] split: if splits) lemma length chop part le: ys ∈ set (chop n xs) =⇒ length ys ≤ n by (induction n xs rule: chop.induct; subst (asm) chop.simps)(auto split: if splits) lemma length chop: assumes n > 0 shows length (chop n xs) = nat dlength xs / ne proof − from hn > 0 i have real n ∗ length (chop n xs) ≥ length xs by (induction n xs rule: chop.induct; subst chop.simps)(auto simp: field simps) moreover from hn > 0 i have real n ∗ length (chop n xs) < length xs + n by (induction n xs rule: chop.induct; subst chop.simps) (auto simp: field simps split: nat diff split asm)+ ultimately have length (chop n xs) ≥ length xs / n and length (chop n xs) < length xs / n + 1 using assms by (auto simp: field simps) thus ?thesis by linarith qed lemma sum msets chop: n > 0 =⇒ (P ys←chop n xs. mset ys) = mset xs by (subst mset concat [symmetric]) simp all lemma UN sets chop: n > 0 =⇒ (S ys∈set (chop n xs). set ys) = set xs by (simp only: set concat [symmetric] concat chop) lemma chop append: d dvd length xs =⇒ chop d (xs @ ys) = chop d xs @ chop d ys by (induction d xs rule: chop induct 0)(auto simp: chop reduce dvd imp le) lemma chop replicate [simp]: d > 0 =⇒ chop d (replicate d xs) = [replicate d xs] by (subst chop reduce) auto

237 lemma chop replicate dvd [simp]: assumes d dvd n shows chop d (replicate n x) = replicate (n div d)(replicate d x) proof (cases d = 0 ) case False from assms obtain k where k: n = d ∗ k by blast have chop d (replicate (d ∗ k) x) = replicate k (replicate d x) using False by (induction k)(auto simp: replicate add chop append) thus ?thesis using False by (simp add: k) qed auto lemma chop concat: assumes ∀ xs∈set xss. length xs = d and d > 0 shows chop d (concat xss) = xss using assms proof (induction xss) case (Cons xs xss) have chop d (concat (xs # xss)) = chop d (xs @ concat xss) by simp also have ... = chop d xs @ chop d (concat xss) using Cons.prems by (intro chop append) auto also have chop d xs = [xs] using Cons.prems by (subst chop reduce) auto also have chop d (concat xss) = xss using Cons.prems by (intro Cons.IH ) auto finally show ?case by simp qed auto

49.3 Selection definition select :: nat ⇒ ( 0a :: linorder) list ⇒ 0a where select k xs = sort xs ! k lemma select 0 : xs 6= [] =⇒ select 0 xs = Min (set xs) by (simp add: hd sort select def flip: hd conv nth) lemma select mset cong: mset xs = mset ys =⇒ select k xs = select k ys using sort mset cong[of xs ys] unfolding select def by auto lemma select in set [intro,simp]: assumes k < length xs shows select k xs ∈ set xs

238 proof − from assms have sort xs ! k ∈ set (sort xs) by (intro nth mem) auto also have set (sort xs) = set xs by simp finally show ?thesis by (simp add: select def ) qed lemma assumes n < length xs shows size less than select: size {#y ∈# mset xs. y < select n xs#} ≤ n and size greater than select: size {#y ∈# mset xs. y > select n xs#} < length xs − n proof − have size {#y ∈# mset (sort xs). y < select n xs#} ≤ size (mset (take n (sort xs))) unfolding select def using assms by (intro size mset mono sorted filter less subset take) auto thus size {#y ∈# mset xs. y < select n xs#} ≤ n by simp have size {#y ∈# mset (sort xs). y > select n xs#} ≤ size (mset (drop (Suc n)(sort xs))) unfolding select def using assms by (intro size mset mono sorted filter greater subset drop) auto thus size {#y ∈# mset xs. y > select n xs#} < length xs − n using assms by simp qed

49.4 The designated median of a list definition median where median xs = select ((length xs − 1 ) div 2 ) xs lemma median in set [intro, simp]: assumes xs 6= [] shows median xs ∈ set xs proof − from assms have length xs > 0 by auto hence (length xs − 1 ) div 2 < length xs by linarith thus ?thesis by (simp add: median def ) qed lemma size less than median: size {#y ∈# mset xs. y < median xs#} ≤ (length xs − 1 ) div 2 proof (cases xs = []) case False

239 hence length xs > 0 by auto hence less:(length xs − 1 ) div 2 < length xs by linarith show size {#y ∈# mset xs. y < median xs#} ≤ (length xs − 1 ) div 2 using size less than select[OF less] by (simp add: median def ) qed auto lemma size greater than median: size {#y ∈# mset xs. y > median xs#} ≤ length xs div 2 proof (cases xs = []) case False hence length xs > 0 by auto hence less:(length xs − 1 ) div 2 < length xs by linarith have size {#y ∈# mset xs. y > median xs#} < length xs − (length xs − 1 ) div 2 using size greater than select[OF less] by (simp add: median def ) also have ... = length xs div 2 + 1 using hlength xs > 0 i by linarith finally show size {#y ∈# mset xs. y > median xs#} ≤ length xs div 2 by simp qed auto lemmas median props = size less than median size greater than median

49.5 A recurrence for selection definition partition3 :: 0a ⇒ 0a :: linorder list ⇒ 0a list × 0a list × 0a list where partition3 x xs = (filter (λy. y < x) xs, filter (λy. y = x) xs, filter (λy. y > x) xs) lemma partition3 code [code]: partition3 x [] = ([], [], []) partition3 x (y # ys) = (case partition3 x ys of (ls, es, gs) ⇒ if y < x then (y # ls, es, gs) else if x = y then (ls, y # es, gs) else (ls, es, y # gs)) by (auto simp: partition3 def ) lemma sort append: assumes ∀ x∈set xs. ∀ y∈set ys. x ≤ y

240 shows sort (xs @ ys) = sort xs @ sort ys using assms by (intro properties for sort)(auto simp: sorted append) lemma select append: assumes ∀ y∈set ys. ∀ z∈set zs. y ≤ z shows k < length ys =⇒ select k (ys @ zs) = select k ys and k ∈ {length ys.. x) xs define nl ne where [simp]: nl = length ls ne = length es have mset eq: mset xs = mset ls + mset es + mset gs unfolding ls def es def gs def by (induction xs) auto have length eq: length xs = length ls + length es + length gs unfolding ls def es def gs def by (induction xs) auto have [simp]: select i es = x if i < length es for i proof − have select i es ∈ set (sort es) unfolding select def using that by (intro nth mem) auto thus ?thesis by (auto simp: es def ) qed

have select k xs = select k (ls @(es @ gs)) by (intro select mset cong)(simp all add: mset eq)

241 also have ... = (if k < nl then select k ls else select (k − nl)(es @ gs)) unfolding nl ne def using assms by (intro select append 0)(auto simp: ls def es def gs def length eq) also have ... = (if k < nl then select k ls else if k < nl + ne then x else select (k − nl − ne) gs) proof (rule if cong) assume ¬k < nl have select (k − nl)(es @ gs) = (if k − nl < ne then select (k − nl) es else select (k − nl − ne) gs) unfolding nl ne def using assms h¬k < nl i by (intro select append 0)(auto simp: ls def es def gs def length eq) also have ... = (if k < nl + ne then x else select (k − nl − ne) gs) using h¬k < nl i by auto finally show select (k − nl)(es @ gs) = ... . qed simp all also have ... = ?rhs by (simp add: partition3 def ls def es def gs def ) finally show ?thesis . qed

49.6 The size of the lists in the recursive calls We now derive an upper bound for the number of elements of a list that are smaller (resp. bigger) than the median of medians with chopping size 5. To avoid having to do the same proof twice, we do it generically for an operation ≺ that we will later instantiate with either < or >. context fixes xs :: 0a :: linorder list fixes M defines M ≡ median (map median (chop 5 xs)) begin lemma size median of medians aux: fixes R :: 0a :: linorder ⇒ 0a ⇒ bool (infix ≺ 50 ) assumes R: R ∈ {(<), (>)} shows size {#y ∈# mset xs. y ≺ M #} ≤ nat d0 .7 ∗ length xs + 3 e proof − define n and m where [simp]: n = length xs and m = length (chop 5 xs) We define an abbreviation for the multiset of all the chopped-up groups: We then split that multiset into those groups whose medians is less than M and the rest.

242 define Y small (Y ≺) where Y ≺ = filter mset (λys. median ys ≺ M ) (mset (chop 5 xs)) define Y big (Y ) where Y  = filter mset (λys. ¬(median ys ≺ M )) (mset (chop 5 xs)) have m = size (mset (chop 5 xs)) by (simp add: m def ) also have mset (chop 5 xs) = Y ≺ + Y  unfolding Y small def Y big def by (rule multiset partition) finally have m eq: m = size Y ≺ + size Y  by simp At most half of the lists have a median that is smaller than the median of medians:

have size Y ≺ = size (image mset median Y ≺) by simp also have image mset median Y ≺ = {#y ∈# mset (map median (chop 5 xs)). y ≺ M #} unfolding Y small def by (subst filter mset image mset [symmetric]) simp all also have size ... ≤ (length (map median (chop 5 xs))) div 2 unfolding M def using median props[of map median (chop 5 xs)] R by auto also have ... = m div 2 by (simp add: m def ) finally have size Y small: size Y ≺ ≤ m div 2 . We estimate the number of elements less than M by grouping them into elements coming from Y ≺ and elements coming from Y : have {#y ∈# mset xs. y ≺ M #} = {#y ∈#(P ys←chop 5 xs. mset ys). y ≺ M #} by (subst sum msets chop) simp all also have ... = (P ys←chop 5 xs. {#y ∈# mset ys. y ≺ M #}) by (subst filter mset sum list)(simp add: o def ) also have ... = (P ys∈#mset (chop 5 xs). {#y ∈# mset ys. y ≺ M #}) by (subst sum mset sum list [symmetric]) simp all also have mset (chop 5 xs) = Y ≺ + Y  by (simp add: Y small def Y big def not le) also have (P ys∈#.... {#y ∈# mset ys. y ≺ M #}) = P P ( ys∈#Y ≺. {#y ∈# mset ys. y ≺ M #}) + ( ys∈#Y . {#y ∈# mset ys. y ≺ M #}) by simp

Next, we overapproximate the elements contributed by Y ≺: instead of those elements that are smaller than the median, we take all the elements of each group. For the elements contributed by Y , we overapproximate by taking all those that are less than their median instead of only those that are less than M. P P also have ... ⊆#( ys∈#Y ≺. mset ys) + ( ys∈#Y . {#y ∈# mset ys. y ≺ median ys#})

243 using R by (intro subset mset.add mono sum mset mset mono mset filter mono) (auto simp: Y big def ) finally have size {# y ∈# mset xs. y ≺ M #} ≤ size ... by (rule size mset mono) hence size {# y ∈# mset xs. y ≺ M #} ≤ P P ( ys∈#Y ≺. length ys) + ( ys∈#Y . size {#y ∈# mset ys. y ≺ median ys#}) by (simp add: size mset sum mset distrib multiset.map comp o def ) Next, we further overapproximate the first sum by noting that each group has at most size 5. P P also have ( ys∈#Y ≺. length ys) ≤ ( ys∈#Y ≺. 5 ) by (intro sum mset mono)(auto simp: Y small def length chop part le) also have ... = 5 ∗ size Y ≺ by simp

Next, we note that each group in Y  can have at most 2 elements that are smaller than its median. P also have ( ys∈#Y . size {#y ∈# mset ys. y ≺ median ys#}) ≤ P ( ys∈#Y . length ys div 2 ) proof (intro sum mset mono, goal cases) fix ys assume ys ∈# Y  hence ys 6= [] by (auto simp: Y big def ) thus size {#y ∈# mset ys. y ≺ median ys#} ≤ length ys div 2 using R median props[of ys] by auto qed P also have ... ≤ ( ys∈#Y . 2 ) by (intro sum mset mono div le mono diff le mono) (auto simp: Y big def dest: length chop part le) also have ... = 2 ∗ size Y  by simp Simplifying gives us the main result.

also have 5 ∗ size Y ≺ + 2 ∗ size Y  = 2 ∗ m + 3 ∗ size Y ≺ by (simp add: m eq) also have ... ≤ 3 .5 ∗ m using hsize Y ≺ ≤ m div 2 i by linarith also have ... = 3 .5 ∗ dn / 5 e by (simp add: m def length chop) also have ... ≤ 0 .7 ∗ n + 3 .5 by linarith finally have size {#y ∈# mset xs. y ≺ M #} ≤ 0 .7 ∗ n + 3 .5 by simp thus size {#y ∈# mset xs. y ≺ M #} ≤ nat d0 .7 ∗ n + 3 e

244 by linarith qed lemma size less than median of medians: size {#y ∈# mset xs. y < M #} ≤ nat d0 .7 ∗ length xs + 3 e using size median of medians aux[of (<)] by simp lemma size greater than median of medians: size {#y ∈# mset xs. y > M #} ≤ nat d0 .7 ∗ length xs + 3 e using size median of medians aux[of (>)] by simp end

49.7 Efficient algorithm We handle the base cases and computing the median for the chopped-up sublists of size 5 using the naive selection algorithm where we sort the list using insertion sort. definition slow select where slow select k xs = isort xs ! k definition slow median where slow median xs = slow select ((length xs − 1 ) div 2 ) xs lemma slow select correct: slow select k xs = select k xs by (simp add: slow select def select def isort correct) lemma slow median correct: slow median xs = median xs by (simp add: median def slow median def slow select correct) The definition of the selection algorithm is complicated somewhat by the fact that its termination is contingent on its correctness: if the first recursive call were to return an element for x that is e.g. smaller than all list elements, the algorithm would not terminate. Therefore, we first prove partial correctness, then termination, and then combine the two to obtain total correctness. function mom select where mom select k xs = ( if length xs ≤ 20 then slow select k xs else let M = mom select (((length xs + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)); (ls, es, gs) = partition3 M xs

245 in if k < length ls then mom select k ls else if k < length ls + length es then M else mom select (k − length ls − length es) gs ) by auto If mom select terminates, it agrees with select: lemma mom select correct aux: assumes mom select dom (k, xs) and k < length xs shows mom select k xs = select k xs using assms proof (induction rule: mom select.pinduct) case (1 k xs) show mom select k xs = select k xs proof (cases length xs ≤ 20 ) case True thus mom select k xs = select k xs using 1 .prems 1 .hyps by (subst mom select.psimps)(auto simp: select def slow select correct) next case False define x where x = mom select (((length xs + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)) define ls es gs where ls = filter (λy. y < x) xs and es = filter (λy. y = x) xs and gs = filter (λy. y > x) xs define nl ne where nl = length ls and ne = length es note defs = nl def ne def x def ls def es def gs def have tw:(ls, es, gs) = partition3 x xs unfolding partition3 def defs One nat def .. have length eq: length xs = nl + ne + length gs unfolding nl def ne def ls def es def gs def by (induction xs) auto note IH = 1 .IH (2 ,3 )[OF False x def tw refl refl]

have mom select k xs = (if k < nl then mom select k ls else if k < nl + ne then x else mom select (k − nl − ne) gs) using 1 .hyps False by (subst mom select.psimps)(simp all add: partition3 def flip: defs One nat def ) also have ... = (if k < nl then select k ls else if k < nl + ne then x else select (k − nl − ne) gs) using IH length eq 1 .prems by (simp add: ls def es def gs def nl def

246 ne def ) also have ... = select k xs using hk < length xs i by (subst (3 ) select rec partition[of x]) (simp all add: nl def ne def flip: tw) finally show mom select k xs = select k xs . qed qed mom select indeed terminates for all inputs: lemma mom select termination: All mom select dom proof (relation measure (length ◦ snd); (safe)?) fix k :: nat and xs :: 0a list assume ¬ length xs ≤ 20 thus ((((length xs + 4 ) div 5 − 1 ) div 2 , map slow median (chop 5 xs)), k, xs) ∈ measure (length ◦ snd) by (auto simp: length chop nat less iff ceiling less iff ) next fix k :: nat and xs ls es gs :: 0a list define x where x = mom select (((length xs + 4 ) div 5 − 1 ) div 2 ) (map slow median (chop 5 xs)) assume A: ¬ length xs ≤ 20 (ls, es, gs) = partition3 x xs mom select dom (((length xs + 4 ) div 5 − 1 ) div 2 , map slow median (chop 5 xs)) have less: ((length xs + 4 ) div 5 − 1 ) div 2 < nat dlength xs / 5 e using A(1 ) by linarith For termination, it suffices to prove that x is in the list. have x = select (((length xs + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)) using less unfolding x def by (intro mom select correct aux A)(auto simp: length chop) also have ... ∈ set (map slow median (chop 5 xs)) using less by (intro select in set)(simp all add: length chop) also have ... ⊆ set xs unfolding set map proof safe fix ys assume ys: ys ∈ set (chop 5 xs) hence median ys ∈ set ys by auto also have set ys ⊆ S (set ‘ set (chop 5 xs)) using ys by blast also have ... = set xs

247 by (rule UN sets chop) simp all finally show slow median ys ∈ set xs by (simp add: slow median correct) qed finally have x ∈ set xs . thus ((k, ls), k, xs) ∈ measure (length ◦ snd) and ((k − length ls − length es, gs), k, xs) ∈ measure (length ◦ snd) using A(1 ,2 ) by (auto simp: partition3 def intro!: length filter less[of x]) qed termination mom select by (rule mom select termination) lemmas [simp del] = mom select.simps lemma mom select correct: k < length xs =⇒ mom select k xs = select k xs using mom select correct aux and mom select termination by blast

49.8 Running time analysis fun T partition3 :: 0a ⇒ 0a list ⇒ nat where T partition3 x [] = 1 | T partition3 x (y # ys) = T partition3 x ys + 1 lemma T partition3 eq: T partition3 x xs = length xs + 1 by (induction x xs rule: T partition3 .induct) auto definition T slow select :: nat ⇒ 0a :: linorder list ⇒ nat where T slow select k xs = T isort xs + T nth (isort xs) k + 1 definition T slow median :: 0a :: linorder list ⇒ nat where T slow median xs = T slow select ((length xs − 1 ) div 2 ) xs + 1 lemma T slow select le: T slow select k xs ≤ length xs ˆ 2 + 3 ∗ length xs + 3 proof − have T slow select k xs ≤ (length xs + 1 )2 + (length (isort xs) + 1 ) + 1 unfolding T slow select def by (intro add mono T isort length)(auto simp: T nth eq) also have ... = length xs ˆ 2 + 3 ∗ length xs + 3 by (simp add: isort correct algebra simps power2 eq square) finally show ?thesis . qed

248 lemma T slow median le: T slow median xs ≤ length xs ˆ 2 + 3 ∗ length xs + 4 unfolding T slow median def using T slow select le[of (length xs − 1 ) div 2 xs] by simp fun T chop :: nat ⇒ 0a list ⇒ nat where T chop 0 = 1 | T chop [] = 1 | T chop n xs = T take n xs + T drop n xs + T chop n (drop n xs) lemmas [simp del] = T chop.simps lemma T chop Nil [simp]: T chop d [] = 1 by (cases d)(auto simp: T chop.simps) lemma T chop 0 [simp]: T chop 0 xs = 1 by (auto simp: T chop.simps) lemma T chop reduce: n > 0 =⇒ xs 6= [] =⇒ T chop n xs = T take n xs + T drop n xs + T chop n (drop n xs) by (cases n; cases xs)(auto simp: T chop.simps) lemma T chop le: T chop d xs ≤ 5 ∗ length xs + 1 by (induction d xs rule: T chop.induct)(auto simp: T chop reduce T take eq T drop eq) The option domintros here allows us to explicitly reason about where the function does and does not terminate. With this, we can skip the ter- mination proof this time because we can reuse the one for mom select. function (domintros) T mom select :: nat ⇒ 0a :: linorder list ⇒ nat where T mom select k xs = ( if length xs ≤ 20 then T slow select k xs else let xss = chop 5 xs; ms = map slow median xss; idx = (((length xs + 4 ) div 5 − 1 ) div 2 ); x = mom select idx ms; (ls, es, gs) = partition3 x xs; nl = length ls;

249 ne = length es in (if k < nl then T mom select k ls else if k < nl + ne then 0 else T mom select (k − nl − ne) gs) + T mom select idx ms + T chop 5 xs + T map T slow median xss + T partition3 x xs + T length ls + T length es + 1 ) by auto termination T mom select proof (rule allI , safe) fix k :: nat and xs :: 0a :: linorder list have mom select dom (k, xs) using mom select termination by blast thus T mom select dom (k, xs) by (induction k xs rule: mom select.pinduct) (rule T mom select.domintros, simp all) qed lemmas [simp del] = T mom select.simps function T 0 mom select :: nat ⇒ nat where T 0 mom select n = (if n ≤ 20 then 463 else T 0 mom select (nat d0 .2 ∗ne) + T 0 mom select (nat d0 .7 ∗n+3 e) + 17 ∗ n + 50 ) by force+ termination by (relation measure id; simp; linarith) lemmas [simp del] = T 0 mom select.simps lemma T 0 mom select ge: T 0 mom select n ≥ 463 by (induction n rule: T 0 mom select.induct; subst T 0 mom select.simps) auto lemma T 0 mom select mono: m ≤ n =⇒ T 0 mom select m ≤ T 0 mom select n proof (induction n arbitrary: m rule: less induct) case (less n m)

250 show ?case proof (cases m ≤ 20 ) case True hence T 0 mom select m = 463 by (subst T 0 mom select.simps) auto also have ... ≤ T 0 mom select n by (rule T 0 mom select ge) finally show ?thesis . next case False hence T 0 mom select m = T 0 mom select (nat d0 .2 ∗me) + T 0 mom select (nat d0 .7 ∗m + 3 e) + 17 ∗ m + 50 by (subst T 0 mom select.simps) auto also have ... ≤ T 0 mom select (nat d0 .2 ∗ne) + T 0 mom select (nat d0 .7 ∗n + 3 e) + 17 ∗ n + 50 using hm ≤ n i and False by (intro add mono less.IH ; linarith) also have ... = T 0 mom select n 0 using hm ≤ n i and False by (subst T mom select.simps) auto finally show ?thesis . qed qed lemma T mom select le aux: T mom select k xs ≤ T 0 mom select (length xs) proof (induction k xs rule: T mom select.induct) case (1 k xs) define n where [simp]: n = length xs define x where x = mom select (((length xs + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)) define ls es gs where ls = filter (λy. y < x) xs and es = filter (λy. y = x) xs and gs = filter (λy. y > x) xs define nl ne where nl = length ls and ne = length es note defs = nl def ne def x def ls def es def gs def have tw:(ls, es, gs) = partition3 x xs unfolding partition3 def defs One nat def .. note IH = 1 .IH (1 ,2 ,3 )[OF refl refl refl x def tw refl refl refl refl]

show ?case proof (cases length xs ≤ 20 ) case True — base case hence T mom select k xs ≤ (length xs)2 + 3 ∗ length xs + 3

251 using T slow select le[of k xs] by (subst T mom select.simps) auto also have ... ≤ 20 2 + 3 ∗ 20 + 3 using True by (intro add mono power mono) auto also have ... ≤ 463 by simp also have ... = T 0 mom select (length xs) using True by (simp add: T 0 mom select.simps) finally show ?thesis by simp next case False — recursive case have ((n + 4 ) div 5 − 1 ) div 2 < nat dn / 5 e using False unfolding n def by linarith hence x = select (((n + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)) unfolding x def n def by (intro mom select correct)(auto simp: length chop) also have ((n + 4 ) div 5 − 1 ) div 2 = (nat dn / 5 e − 1 ) div 2 by linarith also have select ... (map slow median (chop 5 xs)) = median (map slow median (chop 5 xs)) by (auto simp: median def length chop) finally have x eq: x = median (map slow median (chop 5 xs)) . The cost of computing the medians of all the subgroups: define T ms where T ms = T map T slow median (chop 5 xs) have T ms ≤ 9 ∗ n + 45 proof − have T ms = (P ys←chop 5 xs. T slow median ys) + length (chop 5 xs) + 1 by (simp add: T ms def T map eq) also have (P ys←chop 5 xs. T slow median ys) ≤ (P ys←chop 5 xs. 44 ) proof (intro sum list mono) fix ys assume ys ∈ set (chop 5 xs) hence length ys ≤ 5 using length chop part le by blast have T slow median ys ≤ (length ys) ˆ 2 + 3 ∗ length ys + 4 by (rule T slow median le) also have ... ≤ 5 ˆ 2 + 3 ∗ 5 + 4 using hlength ys ≤ 5 i by (intro add mono power mono) auto finally show T slow median ys ≤ 44 by simp qed also have (P ys←chop 5 xs. 44 ) + length (chop 5 xs) + 1 = 45 ∗ nat dreal n / 5 e + 1

252 by (simp add: map replicate const length chop) also have ... ≤ 9 ∗ n + 45 by linarith finally show T ms ≤ 9 ∗ n + 45 by simp qed The cost of the first recursive call (to compute the median of medians): define T rec1 where T rec1 = T mom select (((length xs + 4 ) div 5 − 1 ) div 2 )(map slow median (chop 5 xs)) have T rec1 ≤ T 0 mom select (length (map slow median (chop 5 xs))) using False unfolding T rec1 def by (intro IH (3 )) auto hence T rec1 ≤ T 0 mom select (nat d0 .2 ∗ ne) by (simp add: length chop) The cost of the second recursive call (to compute the final result): define T rec2 where T rec2 = (if k < nl then T mom select k ls else if k < nl + ne then 0 else T mom select (k − nl − ne) gs) consider k < nl | k ∈ {nl..

253 hence T rec2 = T mom select (k − nl − ne) gs by (simp add: T rec2 def ) also have ... ≤ T 0 mom select (length gs) unfolding nl def ne def by (rule IH (2 )) (use hk ≥ nl + ne i False in hauto simp: defs i) also have length gs ≤ nat d0 .7 ∗ n + 3 e unfolding gs def using size greater than median of medians[of xs] by (auto simp: length filter conv size filter mset slow median correct[abs def ] x eq) hence T 0 mom select (length gs) ≤ T 0 mom select (nat d0 .7 ∗ n + 3 e) by (rule T 0 mom select mono) finally show ?thesis . qed Now for the final inequality chain: have T mom select k xs = T rec2 + T rec1 + T ms + n + nl + ne + T chop 5 xs + 4 using False by (subst T mom select.simps, unfold Let def tw [symmetric] defs [symmetric]) (simp all add: nl def ne def T rec1 def T rec2 def T partition3 eq T length eq T ms def ) also have nl ≤ n by (simp add: nl def ls def ) also have ne ≤ n by (simp add: ne def es def ) also note hT ms ≤ 9 ∗ n + 45 i also have T chop 5 xs ≤ 5 ∗ n + 1 using T chop le[of 5 xs] by simp 0 also note hT rec1 ≤ T mom select (nat d0 .2 ∗ne)i 0 also note hT rec2 ≤ T mom select (nat d0 .7 ∗n + 3 e)i finally have T mom select k xs ≤ T 0 mom select (nat d0 .7 ∗n + 3 e) + T 0 mom select (nat d0 .2 ∗ne) + 17 ∗ n + 50 by simp also have ... = T 0 mom select n using False by (subst T 0 mom select.simps) auto finally show ?thesis by simp qed qed

49.9 Akra–Bazzi Light lemma akra bazzi light aux1 : fixes a b :: real and n n0 :: nat assumes ab: a > 0 a < 1 n > n0

254 assumes n0 ≥ (max 0 b + 1 ) / (1 − a) shows nat da∗n+be < n proof − have a ∗ real n + max 0 b ≥ 0 using ab by simp hence real (nat da∗n+be) ≤ a ∗ n + max 0 b + 1 by linarith also { have n0 ≥ (max 0 b + 1 ) / (1 − a) by fact also have . . . < real n using assms by simp finally have a ∗ real n + max 0 b + 1 < real n using ab by (simp add: field simps) } finally show nat da∗n+be < n using hn > n0 i by linarith qed lemma akra bazzi light aux2 : fixes f :: nat ⇒ real fixes n0 :: nat and a b c d :: real and C1 C2 C 1 C 2 :: real assumes bounds: a > 0 c > 0 a + c < 1 C 1 ≥ 0 assumes rec: ∀ n>n0. f n = f (nat da∗n+be) + f (nat dc∗n+de) + C 1 ∗ n + C 2 assumes ineqs: n0 > (max 0 b + max 0 d + 2 ) / (1 − a − c) C 3 ≥ C 1 / (1 − a − c) C 3 ≥ (C 1 ∗ n0 + C 2 + C 4) / ((1 − a − c) ∗ n0 − max 0 b − max 0 d − 2 ) ∀ n≤n0. f n ≤ C 4 shows f n ≤ C 3 ∗ n + C 4 proof (induction n rule: less induct) case (less n) have 0 ≤ C 1 / (1 − a − c) using bounds by auto also have ... ≤ C 3 by fact finally have C 3 ≥ 0 .

show ?case proof (cases n > n0) case False hence f n ≤ C 4 using ineqs(4 ) by auto

255 also have ... ≤ C 3 ∗ real n + C 4 using bounds hC 3 ≥ 0 i by auto finally show ?thesis . next case True have nonneg: a ∗ n ≥ 0 c ∗ n ≥ 0 using bounds by simp all

have (max 0 b + 1 ) / (1 − a) ≤ (max 0 b + max 0 d + 2 ) / (1 − a − c) using bounds by (intro frac le) auto hence n0 ≥ (max 0 b + 1 ) / (1 − a) using ineqs(1 ) by linarith hence rec less1 : nat da∗n+be < n using bounds hn > n0i by (intro akra bazzi light aux1 [of n0]) auto

have (max 0 d + 1 ) / (1 − c) ≤ (max 0 b + max 0 d + 2 ) / (1 − a − c) using bounds by (intro frac le) auto hence n0 ≥ (max 0 d + 1 ) / (1 − c) using ineqs(1 ) by linarith hence rec less2 : nat dc∗n+de < n using bounds hn > n0i by (intro akra bazzi light aux1 [of n0]) auto

have f n = f (nat da∗n+be) + f (nat dc∗n+de) + C 1 ∗ n + C 2 using hn > n0i by (subst rec) auto also have ... ≤ (C 3 ∗ nat da∗n+be + C 4) + (C 3 ∗ nat dc∗n+de + C 4) + C 1 ∗ n + C 2 using rec less1 rec less2 by (intro add mono less.IH ) auto also have ... ≤ (C 3 ∗ (a∗n+max 0 b+1 ) + C 4) + (C 3 ∗ (c∗n+max 0 d+1 ) + C 4) + C 1 ∗ n + C 2 using bounds hC 3 ≥ 0 i nonneg by (intro add mono mult left mono order.refl; linarith) also have ... = C 3 ∗ n + ((C 3 ∗ (max 0 b + max 0 d + 2 ) + 2 ∗ C 4 + C 2) − (C 3 ∗ (1 − a − c) − C 1) ∗ n) by (simp add: algebra simps) also have ... ≤ C 3 ∗ n + ((C 3 ∗ (max 0 b + max 0 d + 2 ) + 2 ∗ C 4 + C 2) − (C 3 ∗ (1 − a − c) − C 1) ∗ n0) using hn > n0i ineqs(2 ) bounds by (intro add mono diff mono order.refl mult left mono)(auto simp: field simps) also have (C 3 ∗ (max 0 b + max 0 d + 2 ) + 2 ∗ C 4 + C 2) − (C 3 ∗

256 (1 − a − c) − C 1) ∗ n0 ≤ C 4 using ineqs bounds by (simp add: field simps) finally show f n ≤ C 3 ∗ real n + C 4 by (simp add: mult right mono) qed qed lemma akra bazzi light: fixes f :: nat ⇒ real fixes n0 :: nat and a b c d C 1 C 2 :: real assumes bounds: a > 0 c > 0 a + c < 1 C 1 ≥ 0 assumes rec: ∀ n>n0. f n = f (nat da∗n+be) + f (nat dc∗n+de) + C 1 ∗ n + C 2 shows ∃ C 3 C 4. ∀ n. f n ≤ C 3 ∗ real n + C 4 proof − 0 0 define n0 where n0 = max n0 (nat d(max 0 b + max 0 d + 2 ) / (1 − a − c) + 1 e) 0 define C 4 where C 4 = Max (f ‘ {..n0 }) define C 3 where C 3 = max (C 1 / (1 − a − c)) 0 0 ((C 1 ∗ n0 + C 2 + C 4) / ((1 − a − c) ∗ n0 − max 0 b − max 0 d − 2 ))

have f n ≤ C 3 ∗ n + C 4 for n proof (rule akra bazzi light aux2 [OF bounds ]) 0 show ∀ n>n0 . f n = f (nat da∗n+be) + f (nat dc∗n+de) + C 1 ∗ n + C 2 0 using rec by (auto simp: n0 def ) next show C 3 ≥ C 1 / (1 − a − c) 0 0 and C 3 ≥ (C 1 ∗ n0 + C 2 + C 4) / ((1 − a − c) ∗ n0 − max 0 b − max 0 d − 2 ) by (simp all add: C 3 def ) next have (max 0 b + max 0 d + 2 ) / (1 − a − c) < nat d(max 0 b + max 0 d + 2 ) / (1 − a − c) + 1 e by linarith 0 also have ... ≤ n0 0 by (simp add: n0 def ) 0 finally show (max 0 b + max 0 d + 2 ) / (1 − a − c) < real n0 . next 0 show ∀ n≤n0 . f n ≤ C 4 by (auto simp: C 4 def ) qed thus ?thesis by blast

257 qed lemma akra bazzi light nat: fixes f :: nat ⇒ nat fixes n0 :: nat and a b c d :: real and C 1 C 2 :: nat assumes bounds: a > 0 c > 0 a + c < 1 C 1 ≥ 0 assumes rec: ∀ n>n0. f n = f (nat da∗n+be) + f (nat dc∗n+de) + C 1 ∗ n + C 2 shows ∃ C 3 C 4. ∀ n. f n ≤ C 3 ∗ n + C 4 proof − have ∃ C 3 C 4. ∀ n. real (f n) ≤ C 3 ∗ real n + C 4 using assms by (intro akra bazzi light[of a c C 1 n0 f b d C 2]) auto then obtain C 3 C 4 where le: ∀ n. real (f n) ≤ C 3 ∗ real n + C 4 by blast have f n ≤ nat dC 3e ∗ n + nat dC 4e for n proof − have real (f n) ≤ C 3 ∗ real n + C 4 using le by blast also have ... ≤ real (nat dC 3e) ∗ real n + real (nat dC 4e) by (intro add mono mult right mono; linarith) also have ... = real (nat dC 3e ∗ n + nat dC 4e) by simp finally show ?thesis by linarith qed thus ?thesis by blast qed

0 0 0 lemma T mom select le : ∃ C 1 C 2. ∀ n. T mom select n ≤ C 1 ∗ n + C 2 proof (rule akra bazzi light nat) show ∀ n>20 . T 0 mom select n = T 0 mom select (nat d0 .2 ∗ n + 0 e) + T 0 mom select (nat d0 .7 ∗ n + 3 e) + 17 ∗ n + 50 using T 0 mom select.simps by auto qed auto end

50 Bibliographic Notes

Red-black trees The insert function follows Okasaki [15]. The delete function in theory RBT Set follows Kahrs [11, 12], an alternative delete function is given in theory RBT Set2.

258 2-3 trees Equational definitions were given by Hoffmann and O’Don- nell [9] (only insertion) and Reade [19]. Our formalisation is based on the teaching material by Turbak [22] and the article by Hinze [8].

1-2 brother trees They were invented by Ottmann and Six [16, 17]. The functional version is due to Hinze [7].

AA trees They were invented by Arne Anderson [3]. Our formalisation follows Ragde [18] but fixes a number of mistakes.

Splay trees They were invented by Sleator and Tarjan [21]. Our formal- isation follows Schoenmakers [20].

Join-based BSTs They were invented by Adams [1, 2] and analyzed by Blelloch et al. [4].

Leftist heaps They were invented by Crane [6]. A first functional imple- mentation is due to N´u˜nez et al. [14].

References

[1] S. Adams. Implementing sets efficiently in a functional language. Tech- nical Report CSTR 92-10, University of Southampton, Department of Electronics and Computer Science, 1992.

[2] S. Adams. Efficient sets - A balancing act. J. Funct. Program., 3(4):553–561, 1993.

[3] A. Andersson. Balanced search trees made simple. In Algorithms and Data Structures (WADS ’93), volume 709 of LNCS, pages 60–71. Springer, 1993.

[4] G. E. Blelloch, D. Ferizovic, and Y. Sun. Just join for parallel ordered sets. In SPAA, pages 253–264. ACM, 2016.

[5] W. Braun and M. Rem. A logarithmic implementation of flexible arrays. Memorandum MR83/4. Eindhoven University of Techology, 1983.

[6] C. A. Crane. Linear Lists and Prorty Queues as Balanced Binary Trees. PhD thesis, Computer Science Department, Stanford University, 1972.

[7] R. Hinze. Purely functional 1-2 brother trees. J. Functional Program- ming, 19(6):633–644, 2009.

[8] R. Hinze. On constructing 2-3 trees. J. Funct. Program., 28:e19, 2018.

259 [9] C. M. Hoffmann and M. J. O’Donnell. Programming with equations. ACM Trans. Program. Lang. Syst., 4(1):83–112, 1982.

[10] R. R. Hoogerwoord. A logarithmic implementation of flexible arrays. In R. Bird, C. Morgan, and J. Woodcock, editors, Mathematics of Program Construction, Second International Conference, volume 669 of LNCS, pages 191–207. Springer, 1992.

[11] S. Kahrs. Red black trees. http://www.cs.ukc.ac.uk/people/staff/smk/ redblack/rb.html.

[12] S. Kahrs. Red-black trees with types. J. Functional Programming, 11(4):425–432, 2001.

[13] T. Nipkow. Automatic functional correctness proofs for functional search trees. http://www.in.tum.de/∼nipkow/pubs/trees.html, Feb. 2016.

[14] M. N´u˜nez,P. Palao, and R. Pena. A second year course on data struc- tures based on functional programming. In P. H. Hartel and M. J. Plasmeijer, editors, Functional Programming Languages in Education, volume 1022 of LNCS, pages 65–84. Springer, 1995.

[15] C. Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[16] T. Ottmann and H.-W. Six. Eine neue Klasse von ausgeglichenen Bin¨arb¨aumen. Angewandte Informatik, 18(9):395–400, 1976.

[17] T. Ottmann and D. Wood. 1-2 brother trees or AVL trees revisited. Comput. J., 23(3):248–255, 1980.

[18] P. Ragde. Simple balanced binary search trees. In Caldwell, H¨olzenspies,and Achten, editors, Trends in Functional Programming in Education, volume 170 of EPTCS, pages 78–87, 2014.

[19] C. Reade. Balanced trees with removals: An exercise in rewriting and proof. Sci. Comput. Program., 18(2):181–204, 1992.

[20] B. Schoenmakers. A systematic analysis of splaying. Information Pro- cessing Letters, 45:41–50, 1993.

[21] D. D. Sleator and R. E. Tarjan. Self-adjusting binary search trees. J. ACM, 32(3):652–686, 1985.

[22] F. Turbak. CS230 Handouts — Spring 2007, 2007. http://cs.wellesley. edu/∼cs230/spring07/handouts.html.

260