Average-Case Analysis and Lower Bounds by the Incompressibility Method
Total Page:16
File Type:pdf, Size:1020Kb
Average-case Analysis and Lower Bounds by the Incompressibility Method Tao Jiang Dept of Computer Science Univ. of California - Riverside Joint work with Ming Li (Waterloo) and Paul Vitanyi (CWI) 1 Outline 1. Overview of Kolmogorov complexity 2. The incompressibility method 3. Trivial example: lower bound for sorting 4. Average complexity of Shellsort 5. Average complexity of Heapsort 6. Average complexity of sorting with networks of stacks and queues 7. Average complexity of boolean matrix multiplication 8. Average complexity of majority finding 9. Expected length of longest common subsequence 10. Expected size of Heilbronn’s triangles 11. String matching and one-way pointers (heads) 2 Kolmogorov Complexity 1. Introduced by R. Solomonoff (1960), A.N. Kolmogorov (1965), P. Martin-Lof (1966), and G. Chaitin (1969). 2. Also known as descriptional complexity, algorith- mic information complexity, Solomonoff-Kolmogorov-Chaitin complexity. 3. It is concerned with the a priori probability, infor- mation content and randomness of an individual object. 010101010101... 011001010001... A good introduction is: M. Li and P. Vitanyi, An Introduction to Kolmogorov Complexity and Its Application, Springer-Verlag. 3 Kolmogorov Complexity of an Object ¡£¢ ¥¤ Def. The Kolmogorov complexity of object , denoted , is the length of the shortest program that prints . What kind of programs? Pascal, C, Java, or even some pseudo-language? Invariance Theorem. (Solomonoff’60) ¦ © For any two languages ¦¨§ and , ¡ ¢ ¡ ¢ ¤ ¥¤ for any , where is some constant depending only on ¦¨§ and ¦ © . I.e., the language does not matter as long as it’s fixed. Def. The conditional Kolmogorov complexity of relative to , ¡£¢ ¤ denoted , is the length of the shortest program that prints when given as input. Program description/encoding 4 ¡£¢ An Example — ¤ Program 1: 787:9<; !#"%$'&)(+*-, ./..0, 1 243 5 6 9 (A@ > ? The length of description is = bits. Program 2: @ , = For BDCFE to Print(“ ”); 9 (A@ = > ? The length is GIH¨J bits. Are there shorter programs? E KML N Yes if, e.g., = for some integer . 9 (QP In general, O is uncomputable. 5 Incompressibility of Objects S R Let be a finite set of objects and R . ¡£¢ ¤UT VXWY R Def. Object is -incompressible if R . -incompressible objects are also said to be Kolmogorov ran- dom. Incompressibility Lemma. There are at least Z `_ ¢[Z Z ¤ R \^] -incompressible elements in R . 6 a VbWY Proof. Let R . There are at most ] dc c § \ e ] dc Z Z Z R \ \ a a a \^] \^] egfih programs of lengths less than 6j . Each program prints at most one element of R . Q.E.D. 1. At least one element of R is -incompressible. 2. More than half are Z -incompressible. \ 3. More than k<l[m are -incompressible. ¢ Z 6 ¤ 6 VXWYn6 4. More than l are -incompressible. 6 The Incompressibility Method To prove a (lower or upper) bound: ¢ 6p¤ 1. Take an o -incompressible object to construct a “typi- cal” instance/input. 2. Establish the bound for this fixed instance . Show that if the bound does not hold for then we can compress by giving a clever, short encoding of . Remarks: (a) Such a typical instance q possesses all statistical properties; q makes a proof easy; q cannot be recursively constructed. (b) The result usually holds in the average/expected case since most objects are incompressible. 7 Success Stories of the Incompressibility Method 1. Solution of many longstanding open problems con- cerning the complexity of various models of com- putation. 2. Simpler proofs for many results in combinatorics, parallel computation, VLSI, formal language/automata theory, etc. 3. Average-case analysis of classical algorithms such as Heapsort, Shellsort, Boolean matrix multiplica- tion, etc. 8 Average-Case (Time) Complexity Analysis Assume all instances of size = occur with equal prob- ability. 926 Complexity 300 229 1 2 3 Size n Election 2000: The Florida recount 9 A Trivial Example: Lower Bound for Sorting Theorem. Any comparison based sorting algorithm requires r ¢ 6 6sVbWYt6p¤ comparisons to sort elements on the average. Proof. Let u be any comparison based sorting algorithm. Fix a Z ZAxzy{y{y{x 6}| w -incompressible permutation v of such that ¡£¢ x Z 6p¤UT VXWYn6-~ v u v v Suppose u sorts in comparisons. We can describe by listing L ¢Z q ¤ a description of this encoding scheme in bits, q the binary outcomes of the comparisons in bits. L L _ ¢Z ¡£¢ x ¤UT 6p¤ v u Since , L ¡£¢ x ¢[Z ¢Z r ¢ T 6p¤- ¤T VXWYt6-~ ¤a 6sVbWY 6¤ v u L Since more than half of the permutations are Z -incompressible, the average number of comparisons required by u is r ¢ r ¢ \ a 6DVXWYn6p¤ 6£VXWYn6p¤ l 10 Average Complexity of Shellsort Algorithm p-pass Shellsort with increments y{y{y Z 0 a §n ; x{y{y{y{x a § 1. Input: list ; Z a 2. For to 3. Divide list into equally spaced sublists 6 (or chains), each of length l ; 4. Perform Insertion Sort within each sublist; h1 = 6 h2 = 3 h3 = 2 h4 = 1 11 Previous Results on Shellsort Worst case: ¢ © 6 ¤ VXWYn6 1. time in passes (Shell’59) ¢ © 64 ¤ 2. time (Papernov-Stasevitch’65) ¢ © © 6DVbW<Y 6p¤ VbWY 6 3. time in passes (Pratt’72) ¢ ¢ §p © 6 ¤ ¤ VbWY 6 4. time in l passes (Incerpi and Sedgewick’85 and Chazelle’??) r ¢ §p 6 ¤ 5. time for passes (Plaxton, Poonen and Suel’92) Average case: ¢ \ 6¢ ¤ £ 1. ¡ time for -pass (Knuth’73) 2. Complicated analysis for k -pass (Yao’80) ¢ © § 6 ¢¤ ` k 3. for -pass (Janson and Knuth’96) Open: 1. Average complexity of general -pass Shellsort (Plaxton, Poonen and Suel’92; Sedgewick’96,97) ¢ 6£VXWYn6p¤ 2. Can Shellsort achieve on the average? (Sedgewick’97) 12 An Average Lower Bound for Shellsort VXWYt6 Theorem. For any , the average-case running time of r ¢ §¤ § 6 ¤ a -pass Shellsort is under uniform distribution of input permutations. Z r ¢ © a 6 ¤ 1. : is tight for Insertion Sort. r ¢ ¢ ¦{§¨ \ © ¢ 64 ¤ a 6 :©¤ ¥ 2. : vs of Knuth. r ¢ ¢ © § ¢ a 6)ª ¤ 6 ¤ 4 £ k 3. : vs of Janson and Knuth. ® ¢ r ¢Q¢ ¢ \¬«z a VXWYn6 6p¤ 6sVXWYt6p¤ 6¤Q¤ l0o l¯o 4. : . r ¢ VXWYn6 6p¤ 5. : is a trivial lower bound. ¢ 6DVbW<Yn6p¤ Hence, in order for a -pass Shellsort to achieve time ¢ a VbW<Yt6¤ ¡ on the average, . 13 Proving the Average Lower Bound ¢ r ¢ x{y{y{y{x §+§ ¤ 6 ¤ u Fix any § Shellsort . We prove is a lower bound on the average number of inversions. ° Z Z 6 Def. For any and , consider the -chain ° containing element at the beginning of pass . Let be the ° ° ±³² number of elements that are to the left of and largerL than . 1 2 3 4 5 6 7 8 9 10 11 12 5 9 1 6 10 2 7 11 3 8 12 4 h3 e.g. m8,3 = 2 f ´ Fact. The Insertion Sort in pass makes precisely § _ L ±³² Z ± f ´ inversions (or § comparisons). L ±³² ± Let µ denote the total number of inversions: a µ ±³² L f f § § ± 14 Proving the Average Lower Bound – II ZAxzy{y{y{x 6}| VXWYn6 w Fix a -incompressible permutation of with ¡£¢ x 6 ¤jT VXWYt6-~ VXWYn6 u Fact. Given all the numbers ’s for the input permutation , L ±³² we can uniquely reconstruct . Proof. Reconstruct the initial permutation of each pass back- ward. Hence, ¡£¢ x{y{y{y{x x ¡£¢ x 6 ¤jT 6 ¤jT VXWYn6-~ VXWYn6 § § u u ² ² L L c § 6 ¶F· µ c Since there are possible divisions of into non- § ¸ negative integral summands ’s, L ±¹² _ Z 6 _ xzy{y{y{x ¡£¢ x µ VbWY VbWY º T 6 ¤jT VXWYn6-~¼½VXWYt6 § § µ u Z » 6 ² ² L L ¢ VbW<Y a VXWYn6p¤ Noting µ , a careful calculation shows r ¢ § a 6 ¤ ¾ µ The average number of inversions required is thus: Z Z 6 _ r ¢ r ¢ §¤ §¤ 6 ¤ a 6 ¤ ¾ ¾ 6 ¿ 6 ¿ 15 The Average Complexity of Heapsort @ÃÂF Á =¨Ä Algorithm Heapsort(var À ); 1. Heapify À ; @ = 2. For BÅCFE to ÂF Á¹BÄÆC%E À ÁÇB =ÈÄ À DELETEMIN( ); 9 ( = Fact. Heapify requires ? time. 9 ( = GIH¨J = Fact. Heapsort runs in ? time in the worst case. 16 Two Implementations of DELETEMIN ZAyÊy 6)Ë̤ Williams’ DELETEMIN( uDÉ ; Z a 6)Ë a uDÉ 1. ; ; _ ¢ x Z \ \ Ë Ë̤ Í Î8Ï uDÉ 2. While uDÉ _ ¢ x Z \ \ Ë a Ë Ë̤ Í Î8Ï uDÉ uDÉ 3. uDÉÐ ; //shift up// _ Z \ \ a a 4. or ; //move down// Ë a 5. uDÉÐ ; ZAyÊy 6<Ë̤ Floyd’s DELETEMIN( uDÉ : 1. Find the path leading to a leaf while shifting all elements up; 2. Climb up the path and insert at its correct location. a1 a2 a2 a3 DELETEMIN a l-1 al al x x Fact. Williams requires \Ñ comparisons and Floyd \ Ñ requires VbW<Yn6 comparisons. 17 Avg Analysis of Williams and Floyd Heapsorts \ Theorem. Both Williams and Floyd Heapsorts require 6DVbWY 6 comparisons in the worst case. Theorem. (Schaffer and Sedgewick’92) ¢ \ 6DVbW<Yn6 6p¤ On the average, Williams Heapsort requires _ ¢ 6¤ 6£VXWYn6 comparisons and Floyd Heapsort requires com- parisons. The following is a simple incompressibility proof due to Ian Munro. ZAx{y{y{y{x VXWYn6 6}| w Fix a -incompressible input permutation of with ¡£¢ 6¤T VbWY 6+~ VXWYt6 Denote the resulting heap of Heapify as Ò . ¡£¢ x ¢ 6 ¤na 6p¤ Ò Fact. x ¢ ¡£¢ ¡£¢ ¡£¢ 6 ¤jT VXWYt6-~ 6p¤ 6p¤- 6p¤jT Ò Hence, Ò . 18 Insertion Depths are the Key! 1 DELETEMIN 2 DELETEMIN 3 DELETEMIN ...... n-1 DELETEMIN x1 x2 x1 x2 x3 n n H1 = H H2 H3 Hn-1 Hn d(x1 )=l1 d(x 2 )=l2 d(xn-2 )=ln-2 ln-1 =0 dc \ÓÑ § f ´ Claim.