Functional Data Structures with Isabelle/HOL

Tobias Nipkow
Fakultät für Informatik
Technische Universität München
2021-6-14

Part II Functional Data Structures

Chapter 6 Sorting

1 Correctness
2 Insertion Sort
3 Time
4 Merge Sort

1 Correctness

sorted :: ('a::linorder) list ⇒ bool
sorted [] = True
sorted (x # ys) = ((∀y∈set ys. x ≤ y) ∧ sorted ys)

Correctness of sorting

Specification of sort :: ('a::linorder) list ⇒ 'a list:
  sorted (sort xs)
Is that it? How about
  set (sort xs) = set xs
Better: every x occurs as often in sort xs as in xs.
More succinctly:
  mset (sort xs) = mset xs
where mset :: 'a list ⇒ 'a multiset

What are multisets?

Sets with (possibly) repeated elements
Some operations:
  {#} :: 'a multiset
  add_mset :: 'a ⇒ 'a multiset ⇒ 'a multiset
  + :: 'a multiset ⇒ 'a multiset ⇒ 'a multiset
  mset :: 'a list ⇒ 'a multiset
  set_mset :: 'a multiset ⇒ 'a set
Import HOL-Library.Multiset

2 Insertion Sort

HOL/Data_Structures/Sorting.thy
Insertion Sort Correctness

3 Time

Principle: Count function calls

For every function f :: τ1 ⇒ ... ⇒ τn ⇒ τ
define a timing function T_f :: τ1 ⇒ ... ⇒ τn ⇒ nat:
Translation of defining equations:
  E⟦f p1 ... pn = e⟧ = (T_f p1 ... pn = T⟦e⟧ + 1)
Translation of expressions:
  T⟦g e1 ... ek⟧ = T⟦e1⟧ + ... + T⟦ek⟧ + T_g e1 ... ek
All other operations (variable access, constants, constructors, primitive operations on bool and numbers) cost 1 time unit

Example: @

E⟦[] @ ys = ys⟧
  = (T_@ [] ys = T⟦ys⟧ + 1)
  = (T_@ [] ys = 2)
E⟦(x # xs) @ ys = x # (xs @ ys)⟧
  = (T_@ (x # xs) ys = T⟦x # (xs @ ys)⟧ + 1)
  = (T_@ (x # xs) ys = T_@ xs ys + 5)
T⟦x # (xs @ ys)⟧
  = T⟦x⟧ + T⟦xs @ ys⟧ + T_# x (xs @ ys)
  = 1 + (T⟦xs⟧ + T⟦ys⟧ + T_@ xs ys) + 1
  = 1 + (1 + 1 + T_@ xs ys) + 1

if and case

So far we model a call-by-value semantics
Conditionals and case expressions are evaluated lazily.
  T⟦if b then e1 else e2⟧ = T⟦b⟧ + (if b then T⟦e1⟧ else T⟦e2⟧)
  T⟦case e of p1 ⇒ e1 | ... | pk ⇒ ek⟧ = T⟦e⟧ + (case e of p1 ⇒ T⟦e1⟧ | ... | pk ⇒ T⟦ek⟧)
Also special: let x = t1 in t2

O(.) is enough

⟹ Reduce all additive constants to 1
Example:
  T_@ (x # xs) ys = T_@ xs ys + 5
becomes
  T_@ (x # xs) ys = T_@ xs ys + 1
This means we count only
• the defined functions via T_f and
• +1 for the function call itself.
All other operations (variables etc) cost 0, not 1.

Discussion

• The definition of T_f from f can be automated.
• The correctness of T_f could be proved w.r.t. a semantics that counts computation steps.
• Precise complexity bounds (as opposed to O(.)) would require a formal model of (at least) the compiler and the hardware.

HOL/Data_Structures/Sorting.thy
Insertion sort complexity

4 Merge Sort
Top-Down

merge :: 'a list ⇒ 'a list ⇒ 'a list
merge [] ys = ys
merge xs [] = xs
merge (x # xs) (y # ys) = (if x ≤ y then x # merge xs (y # ys) else y # merge (x # xs) ys)

msort :: 'a list ⇒ 'a list
msort xs =
  (let n = length xs in
   if n ≤ 1 then xs
   else merge (msort (take (n div 2) xs)) (msort (drop (n div 2) xs)))

Number of comparisons

C_merge :: 'a list ⇒ 'a list ⇒ nat
C_msort :: 'a list ⇒ nat
Lemma C_merge xs ys ≤ length xs + length ys
Theorem length xs = 2^k ⟹ C_msort xs ≤ k * 2^k

HOL/Data_Structures/Sorting.thy
Merge Sort

4 Merge Sort
Bottom-Up

msort_bu :: 'a list ⇒ 'a list
msort_bu xs = merge_all (map (λx. [x]) xs)

merge_all :: 'a list list ⇒ 'a list
merge_all [] = []
merge_all [xs] = xs
merge_all xss = merge_all (merge_adj xss)

merge_adj :: 'a list list ⇒ 'a list list
merge_adj [] = []
merge_adj [xs] = [xs]
merge_adj (xs # ys # zss) = merge xs ys # merge_adj zss

Number of comparisons

C_merge_adj :: 'a list list ⇒ nat
C_merge_all :: 'a list list ⇒ nat
C_msort_bu :: 'a list ⇒ nat
Theorem length xs = 2^k ⟹ C_msort_bu xs ≤ k * 2^k

HOL/Data_Structures/Sorting.thy
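The merge sort definitions and comparison counts above can be tried out in a small Python sketch. This is a transliteration for illustration, not the Isabelle theory: Python lists stand in for Isabelle lists, and returning the comparison count together with the result (instead of defining separate C_ functions) is an assumption made here for brevity.

```python
def merge(xs, ys):
    """Merge two sorted lists. Returns (merged list, number of comparisons)."""
    out = []
    i = j = comparisons = 0
    # One comparison x <= y per step, as long as both lists are non-empty.
    while i < len(xs) and j < len(ys):
        comparisons += 1
        if xs[i] <= ys[j]:
            out.append(xs[i])
            i += 1
        else:
            out.append(ys[j])
            j += 1
    out.extend(xs[i:])
    out.extend(ys[j:])
    return out, comparisons


def msort(xs):
    """Top-down merge sort: split at n div 2, sort both halves, merge."""
    n = len(xs)
    if n <= 1:
        return list(xs), 0
    left, c_left = msort(xs[: n // 2])
    right, c_right = msort(xs[n // 2:])
    merged, c_merge = merge(left, right)
    return merged, c_left + c_right + c_merge


def merge_adj(xss):
    """Merge adjacent pairs of lists; an unpaired last list is kept as is."""
    out = []
    comparisons = 0
    for i in range(0, len(xss) - 1, 2):
        merged, c = merge(xss[i], xss[i + 1])
        out.append(merged)
        comparisons += c
    if len(xss) % 2 == 1:
        out.append(xss[-1])
    return out, comparisons


def merge_all(xss):
    """Apply merge_adj repeatedly until a single list remains."""
    if not xss:
        return [], 0
    comparisons = 0
    while len(xss) > 1:
        xss, c = merge_adj(xss)
        comparisons += c
    return xss[0], comparisons


def msort_bu(xs):
    """Bottom-up merge sort: start from singleton lists."""
    return merge_all([[x] for x in xs])
```

For a list of length 2^k, the count returned by msort or msort_bu can be compared against the bound k * 2^k from the theorems; a test on sample inputs only illustrates the bound and does not replace the proof.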
Bottom-Up Merge Sort

Even better

Make use of already sorted subsequences
Example: Sorting [7, 3, 1, 2, 5]:
do not start with [[7], [3], [1], [2], [5]]
but with [[1, 3, 7], [2, 5]]

Archive of Formal Proofs
https://www.isa-afp.org/entries/Efficient-Mergesort.shtml

Chapter 7 Binary Trees

5 Binary Trees
6 Basic Functions
7 (Almost) Complete Trees

5 Binary Trees

HOL/Library/Tree.thy

Binary trees

datatype 'a tree = Leaf | Node ('a tree) 'a ('a tree)
Abbreviations:
  ⟨⟩ ≡ Leaf
  ⟨l, a, r⟩ ≡ Node l a r
Most of the time: tree = binary tree

6 Basic Functions

Tree traversal

inorder :: 'a tree ⇒ 'a list
inorder ⟨⟩ = []
inorder ⟨l, x, r⟩ = inorder l @ [x] @ inorder r

preorder :: 'a tree ⇒ 'a list
preorder ⟨⟩ = []
preorder ⟨l, x, r⟩ = x # preorder l @ preorder r

postorder :: 'a tree ⇒ 'a list
postorder ⟨⟩ = []
postorder ⟨l, x, r⟩ = postorder l @ postorder r @ [x]

Size

size :: 'a tree ⇒ nat
|⟨⟩| = 0
|⟨l, _, r⟩| = |l| + |r| + 1

size1 :: 'a tree ⇒ nat
|⟨⟩|₁ = 1
|⟨l, _, r⟩|₁ = |l|₁ + |r|₁

Lemma |t|₁ = |t| + 1
Warning: |.| and |.|₁ only on slides

Height

height :: 'a tree ⇒ nat
h(⟨⟩) = 0
h(⟨l, _, r⟩) = max (h(l)) (h(r)) + 1
Warning: h(.) only on slides
Lemma h(t) ≤ |t|
Lemma |t|₁ ≤ 2^h(t)

Minimal height

min_height :: 'a tree ⇒ nat
mh(⟨⟩) = 0
mh(⟨l, _, r⟩) = min (mh(l)) (mh(r)) + 1
Warning: mh(.) only on slides
Lemma mh(t) ≤ h(t)
Lemma 2^mh(t) ≤ |t|₁

7 (Almost) Complete Trees

Complete tree

complete :: 'a tree ⇒ bool
complete ⟨⟩ = True
complete ⟨l, _, r⟩ = (h(l) = h(r) ∧ complete l ∧ complete r)
Lemma complete t = (mh(t) = h(t))
Lemma complete t ⟹ |t|₁ = 2^h(t)
Lemma ¬ complete t ⟹ |t|₁ < 2^h(t)
Lemma ¬ complete t ⟹ 2^mh(t) < |t|₁
Corollary |t|₁ = 2^h(t) ⟹ complete t
Corollary |t|₁ = 2^mh(t) ⟹ complete t

Almost complete tree

acomplete :: 'a tree ⇒ bool
acomplete t = (h(t) - mh(t) ≤ 1)
Almost complete trees have optimal height:
Lemma If acomplete t and |t| ≤ |t'| then h(t) ≤ h(t').

Warning

• The terms complete and almost complete are not defined uniquely in the literature.
• For example, Knuth calls complete what we call almost complete.

Chapter 8 Search Trees

8 Unbalanced BST
9 Abstract Data Types
10 2-3 Trees
11 Red-Black Trees
12 More Search Trees
13 Union, Intersection, Difference on BSTs
14 Tries and Patricia Tries

Most of the material focuses on BSTs = binary search trees

BSTs represent sets

Any tree represents a set:
set_tree :: 'a tree ⇒ 'a set
set_tree ⟨⟩ = {}
set_tree ⟨l, x, r⟩ = set_tree l ∪ {x} ∪ set_tree r
A BST represents a set that can be searched in time O(h(t))
Function set_tree is called an abstraction function because it maps the implementation to the abstract mathematical object

bst

bst :: 'a tree ⇒ bool
bst ⟨⟩ = True
bst ⟨l, a, r⟩ = ((∀x∈set_tree l. x < a) ∧ (∀x∈set_tree r. a < x) ∧ bst l ∧ bst r)
Type 'a must be in class linorder ('a :: linorder) where linorder are linear orders (also called total orders).
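The abstraction function set_tree and the invariant bst can be mirrored in a short Python sketch. The tree encoding is an assumption made for this illustration only: Leaf is represented by None and Node l a r by the tuple (l, a, r); it is not how Tree.thy represents trees.

```python
# Assumed encoding for this sketch: Leaf is None, Node l a r is (l, a, r).

def set_tree(t):
    """Abstraction function: the set of elements stored in tree t."""
    if t is None:
        return set()
    l, x, r = t
    return set_tree(l) | {x} | set_tree(r)


def bst(t):
    """Invariant: everything in the left subtree is smaller than the root,
    everything in the right subtree is larger, and both subtrees are BSTs."""
    if t is None:
        return True
    l, a, r = t
    return (all(x < a for x in set_tree(l))
            and all(a < x for x in set_tree(r))
            and bst(l)
            and bst(r))


ordered = ((None, 1, None), 2, (None, 3, None))
unordered = ((None, 3, None), 2, None)
print(set_tree(ordered), bst(ordered), bst(unordered))
```

Like the definition on the slide, this bst recomputes set_tree at every node, so it is a direct reading of the specification rather than an efficient check.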
Note: nat, int and real are in class linorder

Set interface

An implementation of sets of elements of type 'a must provide
• An implementation type 's
• empty :: 's
• insert :: 'a ⇒ 's ⇒ 's
• delete :: 'a ⇒ 's ⇒ 's
• isin :: 's ⇒ 'a ⇒ bool

Map interface

Instead of a set, a search tree can also implement a map from 'a to 'b:
• An implementation type 'm
• empty :: 'm
• update :: 'a ⇒ 'b ⇒ 'm ⇒ 'm
• delete :: 'a ⇒ 'm ⇒ 'm
• lookup :: 'm ⇒ 'a ⇒ 'b option
Sets are a special case of maps

Comparison of elements

We assume that the element type 'a is a linear order
Instead of using < and ≤ directly:
datatype cmp_val = LT | EQ | GT
cmp x y = (if x < y then LT else if x = y then EQ else GT)

8 Unbalanced BST
  Implementation
  Correctness
  Correctness Proof Method Based on Sorted Lists

Implementation

Implementation type: 'a tree

empty = Leaf

insert x ⟨⟩ = ⟨⟨⟩, x, ⟨⟩⟩
insert x ⟨l, a, r⟩ =
  (case cmp x a of
     LT ⇒ ⟨insert x l, a, r⟩
   | EQ ⇒ ⟨l, a, r⟩
   | GT ⇒ ⟨l, a, insert x r⟩)

isin ⟨⟩ x = False
isin ⟨l, a, r⟩ x =
  (case cmp x a of
     LT ⇒ isin l x
   | EQ ⇒ True
   | GT ⇒ isin r x)

delete x ⟨⟩ = ⟨⟩
delete x ⟨l, a, r⟩ =
  (case cmp x a of
     LT ⇒ ⟨delete x l, a, r⟩
   | EQ ⇒ if r = ⟨⟩ then l else let (a', r') = split_min r in ⟨l, a', r'⟩
   | GT ⇒ ⟨l, a, delete x r⟩)

split_min ⟨l, a, r⟩ =
  (if l = ⟨⟩ then (a, r)
   else let (x, l') = split_min l in (x, ⟨l', a, r⟩))

Correctness

Why is this implementation correct?
Because
  empty  insert  delete  isin
simulate
  {}     ∪ {.}   − {.}   ∈

set_tree empty = {}
set_tree (insert x t) = set_tree t ∪ {x}
set_tree (delete x t) = set_tree t − {x}
isin t x = (x ∈ set_tree t)
Under the assumption bst t

Also: bst must be invariant

bst empty
bst t ⟹ bst (insert x t)
bst t ⟹ bst (delete x t)

Correctness Proof Method Based on Sorted Lists

Key idea

Local definition: sorted means sorted w.r.t.
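The set operations on unbalanced BSTs and their correctness properties (set_tree simulation and bst invariance) can be spot-checked with a Python sketch. The encoding is an assumption of this illustration: Leaf is None, Node l a r is the tuple (l, a, r), and cmp returns the strings "LT", "EQ", "GT" in place of the cmp_val datatype. Running checks on sample inputs only illustrates the properties; the proofs are in the Isabelle theories.

```python
# Assumed encoding for this sketch: Leaf is None, Node l a r is (l, a, r).

def cmp(x, y):
    return "LT" if x < y else "EQ" if x == y else "GT"


def set_tree(t):
    if t is None:
        return set()
    l, x, r = t
    return set_tree(l) | {x} | set_tree(r)


def bst(t):
    if t is None:
        return True
    l, a, r = t
    return (all(x < a for x in set_tree(l))
            and all(a < x for x in set_tree(r))
            and bst(l) and bst(r))


def insert(x, t):
    if t is None:
        return (None, x, None)
    l, a, r = t
    c = cmp(x, a)
    if c == "LT":
        return (insert(x, l), a, r)
    if c == "EQ":
        return t
    return (l, a, insert(x, r))


def isin(t, x):
    if t is None:
        return False
    l, a, r = t
    c = cmp(x, a)
    if c == "LT":
        return isin(l, x)
    if c == "EQ":
        return True
    return isin(r, x)


def split_min(t):
    """Remove the minimum of a non-empty tree: returns (minimum, rest)."""
    l, a, r = t
    if l is None:
        return a, r
    x, l2 = split_min(l)
    return x, (l2, a, r)


def delete(x, t):
    if t is None:
        return None
    l, a, r = t
    c = cmp(x, a)
    if c == "LT":
        return (delete(x, l), a, r)
    if c == "GT":
        return (l, a, delete(x, r))
    # EQ: replace the root by the minimum of the right subtree, if any.
    if r is None:
        return l
    a2, r2 = split_min(r)
    return (l, a2, r2)


# Build a tree by repeated insertion and check the stated properties.
t = None
for v in [5, 2, 8, 1, 9, 3]:
    t = insert(v, t)
assert bst(t) and set_tree(t) == {1, 2, 3, 5, 8, 9}
assert set_tree(delete(5, t)) == set_tree(t) - {5} and bst(delete(5, t))
```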
