
Probability and Caltech CS150a, 2020 Lecture Notes

Leonard J. Schulman

Glossary

a.s. = almost surely = with probability 1.

1 = the constant one, and when used in linear algebra, the all-ones vector.

⟦A⟧ = 1A = the indicator function for occurrence of event A. (An event is a measurable subset of the sample space. The indicator then is the function which is 1 on that subset, and 0 elsewhere.) We will often conflate an event and its indicator function, so the notations A ⊆ B and ⟦A⟧ ≤ ⟦B⟧ will occur interchangeably.

N+ = the positive integers

[n] = the set {1, . . . , n}, provided n ∈ N+.

Let B be a collection of countably many events Bi. The event that infinitely many Bi occur (also called "B infinitely often") is variously written lim sup B or ⟦B i.o.⟧.

⊂ always indicates strict containment.

Ac = the complement of event A.

Caltech CS150 2020.

Contents

1 Some basic probability 7
1.1 Lecture 1 (30/Sep): Appetizers ...... 7
1.2 Lecture 2 (2/Oct) Some basics ...... 8
1.2.1 Measure ...... 8
1.2.2 Measurable functions, random variables and events ...... 9
1.3 Lecture 3 (5/Oct): Linearity of expectation, union bound, existence theorems ...... 12
1.3.1 Countable additivity ...... 13
1.3.2 Coupon collector ...... 13
1.3.3 Application: the probabilistic method ...... 14
1.4 Lecture 4 (7/Oct): Upper and lower bounds ...... 15
1.4.1 Union bound ...... 15
1.4.2 Using the union bound in the probabilistic method: Ramsey theory ...... 15
1.4.3 Bonferroni inequalities ...... 16
1.5 Lecture 5 (9/Oct): Tail events. Borel-Cantelli, Kolmogorov 0-1, percolation ...... 19
1.5.1 Borel-Cantelli ...... 19
1.5.2 B-C II: a partial converse to B-C I ...... 19
1.5.3 Kolmogorov 0-1, percolation ...... 20
1.6 Lecture 6 (12/Oct): , gambler's ruin ...... 22
1.7 Lecture 7 (14/Oct): Percolation on trees. Basic inequalities ...... 24
1.7.1 A simple model for (among other things) epidemiology: percolation on the regular d-ary tree, d > 1 ...... 24
1.7.2 Markov inequality (the simplest tail bound) ...... 25
1.7.3 Variance and the Chebyshev inequality: a second tail bound ...... 25
1.7.4 Omitted material 2020: Cont. basic inequalities, probabilistic method ...... 26
1.8 Lecture 8 (16/Oct): FKG inequality ...... 28
1.8.1 Omitted 2020: Achieving expectation in MAX-3SAT ...... 31


2 Algebraic Fingerprinting 33
2.1 Lecture 9 (19/Oct): Fingerprinting with ...... 33
2.1.1 Polytime Complexity Classes Allowing Randomization ...... 33
2.1.2 Verifying Multiplication ...... 34
2.1.3 Verifying Associativity ...... 35
2.2 Lecture 10 (21/Oct): Cont. associativity; perfect matchings, polynomial identity testing ...... 38
2.2.1 Matchings ...... 38
2.2.2 Bipartite perfect matching: deciding existence ...... 38
2.3 Lecture 11 (23/Oct): Cont. perfect matchings; polynomial identity testing ...... 41
2.3.1 Polynomial identity testing ...... 41
2.3.2 Deciding existence of a perfect matching in a graph ...... 42
2.4 Lecture 12 (26/Oct): Parallel computation: finding perfect matchings in general graphs. Isolating lemma ...... 44
2.4.1 Parallel computation ...... 44
2.4.2 Sequential and parallel linear algebra ...... 44
2.4.3 Finding perfect matchings in general graphs, in parallel. The Isolating Lemma ...... 44
2.5 Lecture 13 (28/Oct): Finding a perfect matching, in RNC ...... 47

3 Concentration of Measure 49
3.1 Lecture 14 (30/Oct): Independent rvs: data processing, Chernoff bound, applications ...... 49
3.1.1 Two facts about independent rvs ...... 49
3.1.2 Chernoff bound for uniform Bernoulli rvs (symmetric random walk) ...... 50
3.1.3 Application: set discrepancy ...... 51
3.1.4 Entropy and Kullback-Leibler divergence ...... 52
3.2 Lecture 15 (2/Nov): CLT. Stronger Chernoff bound and applications. Start Shannon coding ...... 54
3.2.1 ...... 54
3.2.2 Chernoff bound using divergence; robustness of BPP ...... 54
3.2.3 Balls and bins ...... 56
3.2.4 Preview of Shannon's coding theorem ...... 56
3.3 Lecture 16 (4/Nov): Application of large deviation bounds: Shannon's coding theorem ...... 58
3.4 Lecture 17 (6/Nov): Application of CLT to Gale-Berlekamp. Khintchine-Kahane. Moment generating functions ...... 60
3.4.1 Gale-Berlekamp game ...... 60
3.4.2 Moment generating functions, Chernoff bound for general distributions ...... 62
3.5 Lecture 18 (9/Nov): Metric spaces ...... 64
3.5.1 Metric space examples ...... 64
3.5.2 Embedding for n points in L2 ...... 65
3.5.3 Normed spaces ...... 66
3.5.4 Exponential savings in dimension for any fixed distortion ...... 66


3.6 Lecture 19 (11/Nov): Johnson-Lindenstrauss embedding ...... 67
3.6.1 The original method ...... 67
3.6.2 JL: a similar, and easier to analyze, method ...... 69

3.7 Lecture 20 (13/Nov): Bourgain embedding X → Lp, p ≥ 1...... 73

3.7.1 Embedding into L1: Weighted Fréchet embeddings ...... 73
3.7.2 Good things can happen ...... 74
3.7.3 Aside: Hölder's inequality ...... 76

4 Limited independence 77
4.1 Lecture 21 (16/Nov): , improved proof of coding theorem using linear codes ...... 77
4.2 Lecture 22 (18/Nov): Pairwise independence, second moment inequality, G(n, p) thresholds ...... 80
4.2.1 Threshold for H as a subgraph in G(n, p) ...... 80

4.2.2 Most pairs independent: threshold for K4 in G(n, p) ...... 81
4.3 Lecture 23 (20/Nov): Limited independence: near-pairwise for primes, 4-wise for Khintchine-Kahane ...... 83
4.3.1 Turán's proof of a theorem of Hardy and Ramanujan ...... 83
4.3.2 4-wise independent random walk ...... 85
4.4 Lecture 24 (23/Nov): Khintchine-Kahane for 4-wise independence; begin MIS in NC ...... 86
4.4.1 Log concavity of moments and Berger's bound ...... 86
4.4.2 Khintchine-Kahane for 4-wise independent rvs ...... 87
4.4.3 Khintchine-Kahane from Paley-Zygmund (omitted in class) ...... 87
4.4.4 Maximal Independent Set in NC ...... 88
4.5 Lecture 25 (25/Nov): Luby's parallel algorithm for maximal independent set ...... 90
4.5.1 Descent Processes ...... 91
4.6 Lecture 26 (30/Nov): Limited linear independence, limited statistical independence, error correcting codes ...... 93
4.6.1 Begin derandomization from small sample spaces ...... 93
4.6.2 Generator matrix and parity check matrix ...... 93
4.7 Lecture 27 (2/Dec): Limited linear independence, limited statistical independence, error correcting codes ...... 96
4.7.1 Constructing C from M ...... 96
4.7.2 Proof of Thm (93) Part (1): Upper bound on the size of k-wise independent sample spaces ...... 96
4.7.3 Back to Gale-Berlekamp ...... 98
4.7.4 Back to MIS ...... 98

5 Special topic 99
5.1 Lecture 28 (4/Dec): factored ...... 99


A Material omitted in lecture 102
A.1 Paley-Zygmund in-probability bound, applied to the 4-wise indep. Khintchine-Kahane ...... 102

Bibliography 104

Chapter 1

Some basic probability theory

1.1 Lecture 1 (30/Sep): Appetizers

1. N gentlemen check their hats in the lobby of the opera, but after the performance the hats are handed back at random. How many men, on average, get their own hat back?

2. Measure the length of a long string coiled under a rectangular glass tabletop. You have an ordinary rigid ruler (longer than the sides of the table).

3. On the table before us are 10 dots, and in our pocket are 10 nickels. Prove the coins can be placed on the table (no two overlapping) in such a way that all the dots are covered.

4. The envelope swap paradox: You're on a TV game show and the host offers you two identical-looking envelopes, each of which contains a check in your name from the TV network. You pick whichever envelope you like and take it, still unopened. Then the host explains: one of the checks is written for a sum of $N (N > 0), and the other is for $10N. Now, he says, it's 50-50 whether you selected the small check or the big one. He'll give you a chance, if you like, to swap envelopes. It's a good idea for you to swap, he explains, because your expected net gain is (with $m representing the sum currently in hand):

E(gain) = (1/2)(10m − m) + (1/2)(m/10 − m) = (81/20)m

How can this be? 5. Unbalancing lights (Gale-Berlekamp): You’re given an n × n grid of lightbulbs. For each bulb, at position (i, j), there is a switch bij; there is also a switch ri on each row and a switch cj on each column. The (i, j) bulb is lit if bij + ri + cj is odd.

What is the greatest f(n) such that for any setting to the bij's, you can set the row and column switches to light at least n²/2 + f(n) bulbs?


1.2 Lecture 2 (2/Oct) Some basics

1.2.1 Measure

Frequently one can "get by" with a naïve treatment of probability theory: you can treat random variables quite intuitively so long as you maintain Bayes' law for conditional probabilities of events:

Pr(A1|A2) = Pr(A1 ∩ A2) / Pr(A2)

However, that’s not good enough for all situations, so we’re going to be more careful, and me- thodically answer the question, “what is a random ?” (For a philosophical and historical discussion of this question see Mumford in [75].) First we need measure spaces. Let’s start with some standard examples.

1. Z with the counting measure.

2. A finite set with the uniform measure. In particular, a "fair coin" is the set {H, T}, each having probability 1/2.

3. (a) R with the Lebesgue measure, i.e., the measure (general definition momentarily) in which intervals have measure proportional to their length: µ([a, b]) = b − a for b ≥ a. (b) Likewise, [0, 1] with the Lebesgue measure.

4. Radial measure: the measure on R2 induced by, for infinitesimal intervals (r, r + dr) and (θ, θ + dθ), µ((r, r + dr) × (θ, θ + dθ)) = r dr dθ.

As we see, a measure µ assigns a nonnegative value to (some) subsets of a universe. Let's see what are the formal properties we want from these examples. As we just hinted, we don't necessarily assign a measure to all subsets of the universe; only to the measurable sets. In order to make sense of this, we need to define the notion of a σ-algebra (also known as a σ-field).

Definition 1. A σ-algebra (M, M̃) is a set M along with a collection M̃ of subsets of M (called the measurable sets) which satisfy: (1) ∅ ∈ M̃, and (2) M̃ is closed under complement and countable intersection.

It follows also that M ∈ M̃ and M̃ is closed under countable union (de Morgan). By induction this gives a closure property: we can take any finite nesting of the form, a countable union of countable intersections of . . . of countable unions of measurable sets, and the result will be a measurable set.

Definition 2. A measure µ on a σ-algebra (M, M˜ ) is a function

µ : M̃ → [0, ∞] (1.1)

that is countably additive, that is, for any pairwise disjoint S1, S2, ... ∈ M̃,

µ(∪ Si) = ∑ µ(Si). (1.2)

(M, M̃, µ) is called a measure space. If we also assume µ(M) = 1 then it is called a probability space.


Let us see some properties of measure spaces:

I. µ(∅) = 0 since µ(∅) + µ(∅) = µ(∅ ∪ ∅) = µ(∅).

II. The modular identity µ(S) + µ(T) = µ(S ∩ T) + µ(S ∪ T) holds because necessarily S − T, T − S and S ∩ T are measurable, and both sides of the equation may be decomposed into the same linear combination of the measures of these sets. (The set S − T is S ∩ (¬T).) This identity is sometimes also called the lattice or valuation property.

III. From the modular identity and nonnegativity, S ⊆ T ⇒ µ(S) ≤ µ(T).

Example. I mentioned above the example of Lebesgue measure on R. So now, more formally: the σ-algebra it uses is called the Borel σ-algebra, which is the (smallest) σ-field that contains all the open intervals (a, b), a < b. That is, a set is measurable if you can write it by starting from open intervals, and finitely-many-times applying the operations of complement and countable union. Finally the Lebesgue measure on this σ-algebra is the measure induced by setting µ((a, b)) = b − a.
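The modular identity is easy to check exhaustively in the very simplest setting, the counting measure µ(S) = |S| on a finite universe. A short sketch (the 5-point universe is an arbitrary choice):

```python
from itertools import combinations

# Counting measure on a finite universe: mu(S) = |S|.
# Check the modular identity mu(S) + mu(T) = mu(S ∩ T) + mu(S ∪ T)
# over all pairs of subsets of a 5-point universe.
universe = range(5)
subsets = [frozenset(c) for r in range(6) for c in combinations(universe, r)]

for S in subsets:
    for T in subsets:
        assert len(S) + len(T) == len(S & T) + len(S | T)
print("checked", len(subsets) ** 2, "pairs")  # 32^2 = 1024 pairs
```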

1.2.2 Measurable functions, random variables and events

A measurable function is a mapping X from one σ-algebra, say (M1, M̃1), into another, say (M2, M̃2), such that pre-images of measurable sets are measurable, that is to say, if T ∈ M̃2, then X^{−1}(T) ∈ M̃1.

We will be entirely concerned with the situation that the domain M1 is also equipped with a measure, µ1. In this case µ1 induces a push-forward measure for sets T ∈ M̃2:

µ2(T) := µ1(X^{−1}(T)). (1.3)

Definition 3. A random variable is a measurable function X whose domain M1 is a probability space. In this case M1 is called the sample space of the random variable X.

The range of the random variable, M2, can be many things, for example:

• M2 = R, with the σ-field consisting of rays (a, ∞), rays [a, ∞), and any set formed out of these by closing under the operations of complement and countable union. (In CS language, any other measurable set is formed by a finite-depth formula whose leaves are rays of the afore- mentioned type, and each internal node is either a complementation or a countable union.) Sometimes it is convenient to use the “extended real line,” the real line with ∞ and −∞ adjoined, as the base set.

• M2 = names of people eligible for a draft which is going to be implemented by a lottery. The σ-field here is 2^{M2}, namely the power set of M2.

• M2 = deterministic algorithms for a certain computational problem. (On a countably infinite set M2, just as on a finite set, we can use the power set as the σ-field.)

Sanity check Let’s check that the push-forward definition 1.3 does actually give us a measure. Taking pairwise disjoint S1, S2,... ∈ M˜ 2,

[ −1 [ [ −1 µ2( Si) = µ1(X ( Si)) = µ1( X (Si)) −1 −1 = ∑ µ1(X (Si)) Since the Si are disjoint, so are the X (Si) = ∑ µ2(Si).

Events With any measurable subset T of M2 we associate the event that X lies in T; if X is understood, we


simply call this the event T. This event has the probability Pr(X ∈ T) (or if X is understood, Pr(T)) dictated by

Pr(X ∈ T) = µ1(X^{−1}(T)). (1.4)

The indicator of this event is the function ⟦T⟧ or 1T or ⟦X ∈ T⟧:

1T : M1 → {0, 1} ⊆ R
1T(y) = 1 if y ∈ X^{−1}(T), and 0 otherwise.

The basic but key property is that

Pr(X ∈ T) = ∫ 1T dµ = E(1T). (1.5)

It follows that probabilities of events satisfy:

1. Pr(∅) = 0 ("the experiment has an outcome")

2. Pr(M2) = 1 ("the experiment has only one outcome")

3. Pr(A) ≥ 0

4. Pr(A) + Pr(B) = Pr(A ∩ B) + Pr(A ∪ B)

Note that events can themselves be thought of as random variables taking values in {0, 1}; indeed we will sometimes define an event directly, rather than creating it out of some other random variable X and subset T of the image of X.

For the most part we will sidestep measure theory—one needs it to cure pathologies but we will be studying healthy patients. However I recommend Adams and Guillemin [2] or Billingsley [18].

Often when studying probability one may suppress any mention of the sample space in favor of abstract axioms of probability. For us the situation will be quite different. While starting out as a formality, explicit sample spaces will soon play a significant role.

Joint distributions

Given two random variables X1 : M → M1, X2 : M → M2 (where each Mi has associated with it a σ-field (Mi, M̃i)), we can form the "product" random variable (X1, X2) : M → M1 × M2. The same goes for any countable collection of rvs on M, and it is important that we can do this for countable collections; for instance we want to be able to discuss unbounded sequences of coin tosses. Given a product rv

X⃗ = (X1, X2, ...) : M → M1 × M2 × ...,

its marginals are probability distributions on each of the measure spaces Mi. These distributions are defined by, for A ∈ M̃i,

Pr(Xi ∈ A) = Pr(X⃗ ∈ M1 × M2 × ... × Mi−1 × A × Mi+1 × ...)

That is, you simply ignore what happens to the other rvs, and assign to set A ∈ M̃i the "push-forward" probability µ(Xi^{−1}(A)).

X1, X2, . . . are independent if for any finite S = {s1, ..., sn} and any A_{s1} ∈ M̃_{s1}, ..., A_{sn} ∈ M̃_{sn}, we have

Pr((X_{s1}, ..., X_{sn}) ∈ A_{s1} × · · · × A_{sn}) = Pr(X_{s1} ∈ A_{s1}) · · · Pr(X_{sn} ∈ A_{sn}).


(Note that Pr((X1, X2) ∈ A1 × A2) is just another way of writing Pr((X1 ∈ A1) ∧ (X2 ∈ A2)).)

Example: a pair of fair dice. Let M be the set of 36 ways in which two dice can roll, each outcome having probability 1/36. On this sample space we can define various useful functions: e.g., Xi = the value of die i (i = 1, 2); Y = X1 + X2. X1 and X2 are independent; X1 and Y are not independent.
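The 36-point sample space is small enough to verify both claims exhaustively. A short sketch (the helper pr is an ad hoc name; it evaluates an event given as a predicate on outcomes):

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely outcomes of a pair of fair dice.
omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

def pr(event):
    """Probability of the event {w : event(w)} on the uniform space."""
    return sum(p for w in omega if event(w))

# X1 and X2 are independent: Pr(X1 = a and X2 = b) factors, for all a, b.
for a in range(1, 7):
    for b in range(1, 7):
        assert pr(lambda w: w[0] == a and w[1] == b) == \
               pr(lambda w: w[0] == a) * pr(lambda w: w[1] == b)

# X1 and Y = X1 + X2 are not independent: Pr(X1 = 1 and Y = 12) = 0,
# while Pr(X1 = 1) * Pr(Y = 12) = (1/6)(1/36) = 1/216.
lhs = pr(lambda w: w[0] == 1 and w[0] + w[1] == 12)
rhs = pr(lambda w: w[0] == 1) * pr(lambda w: w[0] + w[1] == 12)
print(lhs, rhs)  # 0 1/216
```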

X1, ... : M → T are independent and identically distributed (iid) if they are independent and all marginals are identical. If T is finite and the marginals are the uniform distribution, we say that the rvs are uniform iid. We use the same terminology in case T is infinite but of finite measure (e.g., Lebesgue measure on a compact set), and the marginal on T is proportional to this measure.

Conditional Probabilities are defined by

Pr(X ∈ A | X ∈ B) = Pr(X ∈ A ∩ B) / Pr(X ∈ B) (1.6)

provided the denominator is positive.

An old example. You meet Mr. Smith and find out that he has exactly two children, at least one of whom is a girl. What is the probability that both are girls? Answer¹: 1/3.

¹As usual in such examples we suppose that the sexes of the children are uniform iid. Some facts from general knowledge should be enough for you to doubt uniformity, and perhaps even independence. See e.g., a Stanford genetics post about people, something about parrots, and something less definite about mammals.
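Under the footnote's uniform-iid assumption, the answer can be checked by brute force over the four equally likely orderings:

```python
from itertools import product

# The four equally likely sex orderings of two children.
outcomes = list(product("GB", repeat=2))                  # GG, GB, BG, BB
at_least_one_girl = [w for w in outcomes if "G" in w]     # GG, GB, BG
both_girls = [w for w in at_least_one_girl if w == ("G", "G")]

# Conditional probability per (1.6): Pr(both | at least one girl).
print(len(both_girls), "/", len(at_least_one_girl))  # 1 / 3
```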


1.3 Lecture 3 (5/Oct): Linearity of expectation, union bound, existence theorems

Taking (1.6) and applying induction, we have that if Pr(A1 ∩ ... ∩ An−1) > 0, then: Chain rule for conditional probabilities

Pr(A1 ∩ ... ∩ An) = Pr(An|A1 ∩ ... ∩ An−1) · Pr(An−1|A1 ∩ ... ∩ An−2) ··· Pr(A2|A1) · Pr(A1).

(What happens if Pr(A1 ∩ ... ∩ An) = 0? Then we can't necessarily write this, because some denominator in the chain might be 0. But we can focus on the smallest i at which Pr(A1 ∩ ... ∩ Ai) = 0, and there say that either Pr(A1) = 0 or, for i > 1, Pr(Ai|A1 ∩ ... ∩ Ai−1) = 0.)

Real-valued random variables; expectations

If X is a real-valued rv on a sample space with measure µ, its expectation (a.k.a. average, or first moment) is given by the following

E(X) = ∫ X dµ (1.7)

which is defined in the Lebesgue manner by²

∫ X dµ = lim_{h→0} ∑_{j integer} jh · Pr(jh ≤ X < (j + 1)h) (1.8)

provided the corresponding sum of absolute values converges:

∑_{j integer} |jh| · Pr(jh ≤ X < (j + 1)h) < ∞. (1.9)

It is not hard to innocently encounter cases where the integral is not defined. Stand a meter from an infinite wall, holding a laser pointer. Spin so you're pointing at a uniformly random orientation. If the laser pointer is not shining at a point on the wall (which happens with probability 1/2), repeat until it does. The displacement of the point you're pointing at, relative to the point closest to you on the wall, is tan α meters for α uniformly distributed in (−π/2, π/2). You could be forgiven for thinking the average displacement "ought" to be 0, but the integral does not converge absolutely, because, using the substitution x = cos α (so that dx = − sin α dα):

∫_0^{π/2} tan α dα = ∫_{cos(0)}^{cos(π/2)} (sin α(x) / cos α(x)) · (−1 / sin α(x)) dx = ∫_0^1 (1/x) dx = [log x]_0^1 = +∞

Nonnegative integrands. As we see in (1.9), it is essential to be able to characterize whether an integral of a nonnegative function converges. (That equation is a discretization of ∫_{−∞}^{∞} |X| dµ.) It is worth pointing out that for probability measure µ supported on the nonnegative integers, ∫ x dµ(x) = ∑_{n≥1} n µ({n}) = ∑_{n≥1} µ({n, n+1, ...}). So the integral converges iff the sequence µ({n, n+1, ...}) has a finite sum. Exercise: State and verify the analogous statement when µ is supported on the nonnegative reals.
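As a concrete instance of the tail-sum identity, take µ({n}) = 2^{−n}, for which µ({n, n+1, ...}) = 2^{1−n} and both sums equal 2. A truncated numerical check (the cutoff N is an arbitrary choice; the discarded tails are negligible):

```python
from fractions import Fraction

# mu on the positive integers with mu({n}) = 2^(-n).
# Tail-sum identity: sum_n n*mu({n}) = sum_n mu({n, n+1, ...}) = 2.
N = 200
mean = sum(n * Fraction(1, 2 ** n) for n in range(1, N + 1))
tail_sums = sum(Fraction(1, 2 ** (n - 1)) for n in range(1, N + 1))
print(float(mean), float(tail_sums))  # both ≈ 2.0
```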

²One can be, and sometimes must be, more scrupulous about the measure theory. When in doubt consult the texts of Adams & Guillemin, Billingsley, or Williams. But we will try to stay in the company of Benjamin Franklin:

But he knew little out of his way, and was not a pleasing companion; as, like most great mathematicians I have met with, he expected universal precision in everything said, or was for ever denying or distinguishing upon trifles, to the disturbance of all conversation. He soon left us. — The Autobiography of Benjamin Franklin, chapter 5


1.3.1 Countable additivity

Back to the theory. Naturally, being defined by an integral (1.7), expectation of real-valued rvs is linear. That is, for c ∈ R, E(cX) = cE(X), and if we have two real-valued rvs X, Y on the same sample space, we can form their sum rv X + Y. No matter the joint distribution of X and Y, we have, providing their expectations on the RHS are well defined:

E(X + Y) = E(X) + E(Y) linearity of expectation

To believe this you have only to verify: Exercise: Absolute convergence of ∫ X dµ and ∫ Y dµ implies absolute convergence of ∫ (X + Y) dµ. Because ∫ |X + Y| dµ ≤ ∫ (|X| + |Y|) dµ < ∞. In the nonnegative case we have also countable additivity:

Exercise: Let X1, . . . be nonnegative real-valued with expectations E(Xi). Then

E(∑ Xi) = ∑ E(Xi).

Let’s return to one of our appetizers, the coins-on-dots problem (3): I don’t want to give this away entirely, but here’s a hint: what is the fraction of the plane covered by unit disks packed in a hexagonal pattern?

1.3.2 Coupon collector

There are n distinct types of coupons and you want to own the whole set. Each draw is uniformly distributed, no matter what has happened earlier. What is the expected time to elapse until you own the set?

Think of the coupons being sampled at times 1, 2, . . .. Let Yi = the first time at which we are in state Si, which is when we have seen exactly i different kinds of coupons (i = 0, . . . , n). So Y0 = 0, Y1 = 1. Let Xi = Yi − Yi−1. In state Si−1, in each round there is probability (n − i + 1)/n that we see a new kind of coupon, until that finally happens. That is to say, Xi is geometrically distributed with pi = (n − i + 1)/n. We can work out E(Xi) from the geometric sum, but there’s a slicker way.

If we’re in state Si−1, then with probability (n − i + 1)/n we’re in Si in one more time step, else we’re back in the same situation.

(Diagram (1.10): the Markov chain S0 → S1 → · · · → Sn−1 → Sn. The transition S_{i−1} → S_i is taken with probability (n − i + 1)/n, so the forward probabilities are 1, 1 − 1/n, ..., 2/n, 1/n; otherwise the chain stays put, so the self-loop probabilities at S1, ..., Sn−1 are 1/n, 2/n, ..., 1 − 1/n.)

So

E(Xi) = 1 + ((n − i + 1)/n) · 0 + ((i − 1)/n) · E(Xi) (1.11)

((n − i + 1)/n) E(Xi) = 1 (1.12)

E(Xi) = n/(n − i + 1) (1.13)


Now we have:

E(Yn) = ∑_{i=1}^n E(Xi) = ∑_{i=1}^n n/(n − i + 1) = n ∑_{i=1}^n 1/i = n Hn (here Hn are the "harmonic sums") = n(log n + O(1))
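A quick Monte Carlo experiment agrees with E(Yn) = nHn (the choices n = 50 and 2000 trials are arbitrary):

```python
import random

# Simulate the coupon collector and compare the empirical mean to n*H_n.
def collect(n, rng):
    """Draw uniform coupons until all n types are seen; return the time."""
    seen, t = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        t += 1
    return t

n, trials = 50, 2000
rng = random.Random(0)
avg = sum(collect(n, rng) for _ in range(trials)) / trials
H = sum(1 / i for i in range(1, n + 1))
print(avg, n * H)  # empirical mean vs n*H_50 ≈ 224.96
```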

1.3.3 Application: the probabilistic method

A tournament of size n is a directed complete graph. We may think of a tournament T equivalently as a skew-symmetric mapping T : [n] × [n] → {1, 0, −1} that is 0 only on the diagonal. A Hamilton path in a tournament (or a digraph more generally) is a directed simple path through all the vertices.

Lemma 4. There exists a tournament with at least n! 2^{−n+1} Hamilton paths.

This certainly isn’t true for all tournaments—as an extreme case, the totally ordered tournament has only one Hamilton path.

Proof. This is an opportunity to consider a nice random variable: the random tournament. You simply fix n vertices, and direct each edge between them uniformly iid. Any particular permutation of the vertices has probability 2^{−n+1} of being a Hamilton path, so the expectation of the indicator rv for this event is 2^{−n+1}. The indicator rvs are far from independent, but anyway, by linearity of expectation, the expected number of Hamilton paths is n! 2^{−n+1}. So some tournament has at least this many Hamilton paths. □

Exercise: explicit construction. Describe a specific tournament with n!(2 + o(1))^{−n} Hamilton paths.
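For small n the expectation n! 2^{−n+1} can be checked directly by sampling random tournaments and counting Hamilton paths by brute force. A sketch with n = 6, where n! 2^{−n+1} = 22.5 (the sample size and helper names are arbitrary choices):

```python
import random
from itertools import permutations
from math import factorial

def hamilton_paths(n, rng):
    """Count Hamilton paths in one uniformly random tournament on n vertices."""
    # For i < j, edge[(i, j)] is True iff the edge is oriented i -> j.
    edge = {(i, j): rng.random() < 0.5 for i in range(n) for j in range(i + 1, n)}
    def arrow(a, b):  # True iff a -> b in the tournament
        return edge[(a, b)] if a < b else not edge[(b, a)]
    return sum(all(arrow(p[i], p[i + 1]) for i in range(n - 1))
               for p in permutations(range(n)))

n, trials = 6, 400
rng = random.Random(1)
avg = sum(hamilton_paths(n, rng) for _ in range(trials)) / trials
print(avg, factorial(n) / 2 ** (n - 1))  # empirical mean vs 22.5
```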


1.4 Lecture 4 (7/Oct): Upper and lower bounds

1.4.1 Union bound

Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B) ≤ Pr(A) + Pr(B)

The bound applies also to countable unions:

Lemma 5 (countable subadditivity). Pr(∪_{i=1}^∞ Ai) ≤ ∑_{i=1}^∞ Pr(Ai).

Proof. First note that by induction the bound applies to any finite union. Now, if the right-hand side is at least 1, the result is immediate. If not, consider any counterexample; since the sequences Pr(∪_{i=1}^k Ai) and ∑_{i=1}^k Pr(Ai) each monotonically converge to their respective limit, then there is a finite k for which Pr(∪_{i=1}^k Ai) > ∑_{i=1}^k Pr(Ai). Contradiction. □

Later in the lecture we'll use the following which, while trivial, has the whiff of assigning a value to ∞/∞:

Corollary 6. If a countable list of events A1, ... all satisfy Pr(Ai) = 0, then Pr(∪ Ai) = 0. Likewise if for all i, Pr(Ai) = 1, then Pr(∩ Ai) = 1.

Birthday paradox: skipped in class.³

1.4.2 Using the union bound in the probabilistic method: Ramsey theory

Theorem 7 (Ramsey [80]). Fix any nonnegative integers k, ℓ. There is a finite "Ramsey number" R(k, ℓ) such that every graph on R(k, ℓ) vertices contains either a clique of size k or an independent set of size ℓ. Specifically, R(k, ℓ) ≤ (k+ℓ−2 choose ℓ−1).

(The finiteness is due to Ramsey [80] and the bound to Erdős and Szekeres [34].) Numerous generalizations of Ramsey's argument have since been developed—see the book [46].

3In case you’re not famliar with this one: a class of 23 students has better than even of some common birthday. (Supposing birthdates are uniform on 365 possibilities.) The exact calculation is 365 ··· 343 Pr(some common birthday) = 1 − =∼ 0.507297 36523 r but a better way to understand this is that the number of ways this can happen is (2) for r students; so long as these events 1 don’t start heavily overlapping, we can almost add their probabilities (which are each just 365 ). For a year of n days and a class of r students, let the rv B = the number of pairs of students who share a birthday.

r 1 E(B) = 2 n √ which suggests that the probability of some joint birthday may be a constant once r is large enough that r ∼ n. We can easily verify the correctness of one side of this claim. The event of there being some common birthday is B > 0 . With Bij S J K being the event students i, j share a birthday, B > 0 = i 0) ≤ . 2 n √ This shows that r ∈ o( n) ⇒ Pr(B > 0) ∈ o(1). r The converse holds, too; fundamentally this is because there is not much overlap in the sample space between the (2) different events. We postpone this for now but will show below how to carry out this argument.
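The exact calculation, and the heuristic E(B) = (r choose 2)/n next to it, in a few lines:

```python
from fractions import Fraction

# Exact probability of some common birthday among r students, n days.
def pr_common(r, n=365):
    no_collision = Fraction(1)
    for i in range(r):
        no_collision *= Fraction(n - i, n)
    return 1 - no_collision

print(float(pr_common(23)))   # ≈ 0.507297
print(23 * 22 / 2 / 365)      # E(B) = (23 choose 2)/365 ≈ 0.693
```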


Proof. (of Theorem (7)) This is outside our main line of development but we include it for completeness. First, R(k, 1) = R(1, k) = 1 = (k−1 choose 0). Now if k, ℓ > 1, consider a graph with R(k, ℓ − 1) + R(k − 1, ℓ) vertices and pick any vertex v. Let VY denote the vertices connected to v by an edge, and let VN denote the remaining vertices. Either |VN| ≥ R(k, ℓ − 1) or |VY| ≥ R(k − 1, ℓ).

If |VN| ≥ R(k, ℓ − 1) then (by induction on k + ℓ) either the graph spanned by VN contains a k-clique or the graph spanned by VN ∪ {v} contains an independent set of size ℓ.

On the other hand if |VY| ≥ R(k − 1, ℓ) then (by induction on k + ℓ) either the graph spanned by VY ∪ {v} contains a k-clique or the graph spanned by VY contains an independent set of size ℓ.

So we have: R(k, ℓ) ≤ R(k, ℓ − 1) + R(k − 1, ℓ) ≤ (k+ℓ−3 choose ℓ−2) + (k+ℓ−3 choose ℓ−1) = (k+ℓ−2 choose ℓ−1). (The final equality counts subsets of [k + ℓ − 2] of size ℓ − 1 according to whether the first item is selected.)

If you apply Stirling's approximation, this gives the bound R(k, k) ≤ (2k−2 choose k−1) ∈ O(4^k/√k). In the intervening nine decades there have been some improvements on this bound, first by Rödl [45], then by Thomason [91], then by Conlon [24], and most recently by Sah [82] to (for k ≥ 3) R(k, k) ≤ (2k−2 choose k−1) exp(−c log² k).

What we use the union bound for is to show a converse:

Theorem 8 (Erdős [32]). If (n choose k) < 2^{(k choose 2)−1} then R(k, k) > n. Thus R(k, k) ≥ (1 − o(1)) (k/(e√2)) 2^{k/2}.

This leaves an exponential gap. Actually this gap is small by the standards of Ramsey theory. The gap has been slightly tightened since Erdős's work, as we will show later in the course, but remains exponential, and is a major open problem in combinatorics.

Proof. (of Theorem (8)) This is an opportunity to introduce one of the most-studied random variables in combinatorics, the random graph G(n, p), in which each edge is present, independently, with probability p. Among other things, people use this model to study threshold phenomena for many properties such as connectivity, appearance of a Hamilton cycle, etc. For the lower bound on R(k, k) we use G(n, 1/2). Any particular subset of k vertices has probability 2^{1−(k choose 2)} of forming either a clique or an independent set. Take a union bound over all subgraphs. □
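The condition in Theorem 8 is easy to evaluate mechanically. This sketch (the helper name erdos_lower is mine) finds, for small k, the largest n the theorem certifies, so that R(k, k) > n:

```python
from math import comb

# Largest n certified by Theorem 8: R(k,k) > n whenever C(n,k) < 2^(C(k,2)-1).
def erdos_lower(k):
    n = k
    while comb(n + 1, k) < 2 ** (comb(k, 2) - 1):
        n += 1
    return n  # every larger n fails the condition

for k in (4, 5, 6):
    print(k, erdos_lower(k))
```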

1.4.3 Bonferroni inequalities

The union bound is a special case of the Bonferroni inequalities:

Let A1, ..., An be events in some probability space, and Ai^c their complements. For S ⊆ [n] let AS = ∩_{i∈S} Ai. For 0 ≤ j ≤ n let ([n] choose j) denote the subsets of [n] of cardinality j.

Lemma 9 (Bonferroni). For j ≥ 1 let (see Fig. 1.1):

mj = ∑_{S ∈ ([n] choose j)} Pr(AS)

Mk = ∑_{j=1}^k (−1)^{j+1} mj = ∑_{j=1}^k (−1)^{j+1} ∑_{J⊆[n], |J|=j} Pr(AJ)

Then:

M2, M4, ... ≤ Pr(∪ Ai) ≤ M1, M3, ...

Moreover, Pr(∪ Ai) = Mn; this is known as the inclusion-exclusion principle. It is a special case of what is known in combinatorics as Möbius inversion.

Caltech CS150 2020. 16 1.4. Lecture 4 (7/Oct): Upper and lower bounds

[Figure 1.1: m1 (left), m2 (middle), M2 (right)]

Comment: Often, but not always, larger values of k give improved bounds. See the problem set.

Proof. The sample space is partitioned into 2^n measurable sets

BS = (∩_{i∈S} Ai) ∩ (∩_{i∉S} Ai^c)

Note that AS = ∪_{S⊆T} BT, which, since the BT's are disjoint, gives Pr(AS) = ∑_{S⊆T} Pr(BT).

mj = ∑_{S ∈ ([n] choose j)} Pr(AS) = ∑_{S ∈ ([n] choose j)} ∑_{S⊆T} Pr(BT) = ∑_T (|T| choose j) Pr(BT)

Let’s start with the cases M1, M2. S Observe Pr( Ai) = 1 − Pr(B∅) = ∑T6=∅ Pr(BT). Now   |T| [ M1 = m1 = ∑ Pr(BT) = ∑ |T| Pr(BT) ≥ ∑ Pr(BT) = Pr( Ai). T 1 T T6=∅

Let t = |T|.

M2 = m1 − m2 = ∑_T [(t choose 1) − (t choose 2)] Pr(BT) = ∑_T (t(3 − t)/2) Pr(BT)

The quadratic t(3 − t)/2 is ≤ 1 at integer t, and 0 at t = 0, therefore

M2 ≤ ∑_{T≠∅} Pr(BT) = Pr(∪ Ai).

Now consider general k. By the above observation,

Mk − Pr(∪ Ai) = ∑_{T≠∅} Pr(BT) ∑_{j=0}^k (−1)^{j+1} (|T| choose j)

where we have inserted the needed − Pr(BT) for T ≠ ∅ by starting the internal summation from j = 0. Exercise: for t ≥ 1, k ≥ 0, ∑_{j=0}^k (−1)^{j+1} (t choose j) = (−1)^{k+1} (t−1 choose k). The sign here completes the argument. □
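Lemma 9 can be verified exactly, with rational arithmetic, on a small random instance; the 12-point uniform sample space and the 4 random events below are arbitrary choices:

```python
import random
from fractions import Fraction
from itertools import combinations

# A uniform 12-point sample space and 4 random events.
rng = random.Random(7)
omega = range(12)
events = [set(w for w in omega if rng.random() < 0.5) for _ in range(4)]
p = Fraction(1, 12)

union = len(set().union(*events)) * p
n = len(events)
# m[j] = sum over |S| = j of Pr(A_S)
m = {j: sum(len(set.intersection(*[events[i] for i in S])) * p
            for S in combinations(range(n), j)) for j in range(1, n + 1)}
# M[k] = alternating partial sums of the m[j]
M = {k: sum((-1) ** (j + 1) * m[j] for j in range(1, k + 1))
     for k in range(1, n + 1)}

for k in range(1, n + 1):
    assert (union <= M[k]) if k % 2 == 1 else (M[k] <= union)
assert M[n] == union  # inclusion-exclusion, exactly
print("Bonferroni inequalities verified; Pr(union) =", union)
```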

The inequalities hold also for countable collections of events, with a proviso.


Lemma 10 (Bonferroni for countable collections). Let A1, ... be a countable collection of events, let mj = ∑_{S ∈ (N+ choose j)} Pr(AS), define Mk as previously, and suppose mj < ∞ for all 1 ≤ j ≤ k. Then (−1)^{k+1}(Mk − Pr(∪ Ai)) ≥ 0.

By the way, it is not sufficient to require that m1 < ∞, as m2 may still be infinite. For X ∈U [0, 1] let Ai = ⟦X < 1/i²⟧. Then m1 < ∞ but m2 ≥ ∑_{i<j} 1/j² = ∑_j (j − 1)/j² = ∞.

Proof. Suppose to the contrary that (−1)^{k+1}(Mk − Pr(∪ Ai)) = −ε < 0.

Let mj(n) and Mk(n) be as defined earlier but for the list of events A1, ..., An. Fix n sufficiently large that Pr(∪_{i=1}^n Ai) ≥ Pr(∪ Ai) − ε/(2(k + 1)) and, for each 1 ≤ j ≤ k, mj(n) ≥ mj − ε/(2(k + 1)). Then (−1)^{k+1}(Mk(n) − Pr(∪_{i=1}^n Ai)) ≤ (−1)^{k+1}(Mk − Pr(∪ Ai)) + ε/2 ≤ −ε/2 < 0, contradicting Lemma 9. □


1.5 Lecture 5 (9/Oct): Tail events. Borel-Cantelli, Kolmogorov 0-1, percolation

1.5.1 Borel-Cantelli

Here is a very fundamental application of the union bound.

Definition 11. Let B = {B1,...} be a countable collection of events. lim sup B is the event that infinitely many of the events Bi occur.

Lemma 12 (Borel-Cantelli I). Let ∑_{i≥1} Pr(Bi) < ∞. Then Pr(lim sup B) = 0.

lim sup B is what is called a tail event: a function of infinitely many other events (in this case the B1, ...) that is unaffected by the outcomes of any finite subset of them.

Proof. It is helpful to write lim sup B as

lim sup B = ∩_{i≥0} ∪_{j≥i} Bj.

For every i, lim sup B ⊆ ∪_{j≥i} Bj, so Pr(lim sup B) ≤ inf_i Pr(∪_{j≥i} Bj). By the union bound, the latter is ≤ inf_i ∑_{j≥i} Pr(Bj) = 0. □

1.5.2 B-C II: a partial converse to B-C I

Lemma 12 does not have a “full” converse.

To show a counterexample, we need to come up with events Bi for which ∑i≥1 Pr(Bi) = ∞ but Pr(lim sup B) = 0. Here is an example. Pick a point x uniformly from the unit interval. Let Bi be the event x < 1/i. You will notice that in this example the events are not independent. That is crucial, for B-C I does have the partial converse:
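A quick numerical illustration of this counterexample (a sketch, not a proof; the function name and sample size are ours): for x drawn uniformly from (0, 1], the event $B_i = ⟦x < 1/i⟧$ occurs exactly for the finitely many i with i < 1/x, even though $\sum \Pr(B_i) = \sum 1/i$ diverges.

```python
import math
import random

random.seed(0)

def num_occurring(x):
    """Number of events B_i = [x < 1/i] that occur for this x in (0,1]:
    exactly the indices i with i < 1/x, i.e. ceil(1/x) - 1 of them."""
    return max(0, math.ceil(1.0 / x) - 1)

# 1 - random.random() lies in (0, 1], avoiding division by zero.
counts = [num_occurring(1.0 - random.random()) for _ in range(10000)]
# Only finitely many B_i ever occur, despite sum Pr(B_i) being infinite.
assert all(c >= 0 for c in counts)
print(max(counts))
```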

Lemma 13 (Borel Cantelli II). Suppose that B1,... are independent events and that ∑i≥1 Pr(Bi) = ∞. Then Pr(lim sup B) = 1.

Proof. We'll show that $(\limsup B)^c$, the event that only finitely many $B_i$ occur, occurs with probability 0. Write

$$(\limsup B)^c = \bigcup_{i \geq 0} \bigcap_{j \geq i} B_j^c.$$

By the union bound (Cor. 6), it is enough to show that $\Pr(\bigcap_{j \geq i} B_j^c) = 0$ for all i. Of course, for any $I \geq i$, $\Pr(\bigcap_{j \geq i} B_j^c) \leq \Pr(\bigcap_{j=i}^{I} B_j^c)$. By independence, $\Pr(\bigcap_{j=i}^{I} B_j^c) = \prod_{j=i}^{I} \Pr(B_j^c)$, so what remains to show is that

For any i, $\lim_{I \to \infty} \prod_{j=i}^{I} \Pr(B_j^c) = 0$. (1.14)

(Note the LHS is decreasing in I.) There’s a classic inequality we often use:

1 + x ≤ ex (1.15)

which follows because the RHS is convex and the two sides agree in value and first derivative at a point (namely at x = 0).

Consequently if a finite sequence xi satisfies ∑ xi ≥ 1 then ∏(1 − xi) ≤ 1/e.

Supposing (1.14) is false, fix i for which it fails, and let $q_i > 0$ be the limit of the LHS. Let I be sufficiently large that $\prod_{j=i}^{I} \Pr(B_j^c) \leq 2q_i$, and let $I'$ be sufficiently large that $\sum_{j=I+1}^{I'} \Pr(B_j) \geq 1$. Then $\prod_{j=i}^{I'} \Pr(B_j^c) \leq 2q_i/e < q_i$. Contradiction. 2

1.5.3 Kolmogorov 0-1, percolation

A beautiful fact about tail events is Kolmogorov’s famous 0-1 law.

Theorem 14 (Kolmogorov). If Bi is a sequence of independent events and C is a tail event of the sequence, then Pr(C) ∈ {0, 1}.

We won’t be using this theorem. It takes a bit of work to prove, either using measure theory (though not much more than I’ve already shown you), or through (to me a more informative argument, but requiring more background). In any case, to keep us on our track, I’ll only offer here a few examples of its application.

Bond percolation

Fix a parameter 0 ≤ p ≤ 1. Start with a fixed infinite, connected, locally finite graph H, for instance the grid graph $\mathbb{Z}^2$ (nodes $(i,j)$ and $(i',j')$ are connected if $|i-i'| + |j-j'| = 1$), and form the graph G by including each edge of H in G independently with probability p. ("Locally finite" means the degree of every vertex is finite.) The graph is said to "percolate" if there is an infinite connected component. Percolation is a tail event (with respect to the events indicating whether each edge is present): consider the effect of adding or removing just one edge, which cannot create or destroy an infinite component, and then induct on the number of edges added or removed. It is easy to see by a coupling argument that Pr(percolation) is monotone nondecreasing in p, as follows: instead of choosing just a single bit at each edge e, choose a real number $X_e \in [0,1]$ uniformly, and include the edge if $X_e < p$. Now, if $p < p'$, this defines two random graphs $G_p$, $G_{p'}$ on a common probability space, each a percolation process with the respective parameter value, and $G_p \subseteq G_{p'}$.
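The coupling can be sketched directly; the code below runs it on a finite n × n grid (the grid size and the parameter values 0.4 and 0.6 are arbitrary choices for illustration). A single uniform draw $X_e$ per edge serves both parameters at once, so $G_p \subseteq G_{p'}$ holds pointwise, not merely in distribution; consequently every increasing event has nondecreasing probability in p.

```python
import random

random.seed(1)
n = 20

def grid_edges(n):
    """Edges of the n x n grid graph."""
    edges = []
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                edges.append(((i, j), (i + 1, j)))
            if j + 1 < n:
                edges.append(((i, j), (i, j + 1)))
    return edges

edges = grid_edges(n)
X = {e: random.random() for e in edges}   # one uniform number per edge

def included(p):
    """The edge set of G_p under the common coupling."""
    return {e for e in edges if X[e] < p}

Gp, Gp2 = included(0.4), included(0.6)
assert Gp <= Gp2   # G_p is literally a subgraph of G_{p'}
```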

Due to the 0-1 law, there exists a "critical" $p_H$ such that Pr(percolation) = 0 for $p < p_H$ and Pr(percolation) = 1 for $p > p_H$. (See Fig. 1.2.) A lot of work in probability theory has gone into determining values of $p_H$ for various graphs, and also into figuring out whether Pr(percolation) is 0 or 1 at $p_H$ itself. Another example of a tail event for bond percolation, this one not monotone, is the event that there are infinitely many infinite components. No matter what the underlying graph is, the probability of this event is 0 at p ∈ {0, 1}.

Site percolation

A closely related process is that starting from a fixed infinite, connected, locally finite graph H, we retain vertices independently with probability p. (And of course we retain an edge if both its vertices are retained.)


[Figure 1.2: Bond percolation in the 2D grid (Pr(percolate) plotted against p).]

Let N be the random variable representing the number of infinite components in the random graph. Here "number" can be any nonnegative integer or ∞. The events ⟦N = 0⟧ and ⟦N = ∞⟧ are tail events, but for 0 < r < ∞, ⟦N = r⟧ is not a tail event.

It is worth noting that N is not a monotone function of the percolation graph. The event ⟦N = 0⟧ is monotone decreasing (just as in the bond percolation discussion earlier). The event ⟦N = ∞⟧ is not monotone increasing.

It is known under fairly general conditions (particularly, if H is connected and vertex-transitive) that for any p, one (and therefore only one) of the following three events has probability 1: N = 0; N = 1; N = ∞. You can see that this is not implied by Kolmogorov's law, but it does follow by an extension of the argument. See [76] for the beginning of this story, and [11] for a survey. (It happens that in the square grid, in any dimension and for any p, Pr(N = ∞) = 0; as p increases, we go from Pr(N = 0) = 1 to Pr(N = 1) = 1 [5], and stay there. However, in more "expanding" graphs such as d-regular trees, d > 1, and also other non-amenable graphs, there can be a phase in the middle with Pr(N = ∞) = 1. See [69, 49].)


1.6 Lecture 6 (12/Oct): Random walk, gambler’s ruin

Random walk on Z, gambler’s ruin

Here is another example of a tail event, but this one we can work out without relying on the 0-1 law, and also see which of {0, 1} is the value. Consider random walk on Z that starts at 0 and in every step goes left with probability p and right with probability 1 − p. Let L = the event that the walk visits every x ≤ 0. Let R = the event that the walk visits every x ≥ 0. Exercise: each of L and R is a tail event. So by Theorem 14 (Kolmogorov), for any p, Pr(L) and Pr(R) lie in {0, 1}. In fact, we will show, without relying on Theorem 14 but relying on Lemma 12 (Borel-Cantelli I), that:

Theorem 15.

• For p < 1/2, Pr(L) = 0 and Pr(R) = 1.

• For p > 1/2, Pr(L) = 1 and Pr(R) = 0. (Obviously this is symmetric to the preceding.)

• For p = 1/2, Pr(L) = Pr(R) = 1.

Note that if L ∩ R occurs, then the walk must actually visit every point infinitely often. (Suppose not, and let t be the last time that some site y was visited. Then on one side of y, the point t + 1 steps away cannot have been visited yet, and will never be visited.) Thus in the third case in the theorem, since Pr(L ∩ R) = 1 by union bound,

Pr(every point in Z is visited infinitely often) = 1.

The term for this property of unbiased random walk on Z is that it is recurrent. Unbiased random walk is still recurrent in Z², but not in Z^d, d > 2. The 3-dimensional case is particularly interesting, as this roughly describes how photons make their way out of the interior of the sun. This glosses over quantum properties of photons, so don't take it as too serious a model. Nonetheless, it gives a feel for why photons should escape the sun at a positive rate, rather than accumulating within it.
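A Monte Carlo sketch of the contrast between the biased and unbiased cases (horizon, trial count, and seed are arbitrary choices of ours): within a fixed number of steps, the fair walk returns to 0 many times, while a biased walk's total number of returns has a small finite expectation.

```python
import random

random.seed(2)

def returns_to_zero(p_left, steps):
    """Number of times a walk on Z starting at 0 revisits 0 within `steps` steps."""
    pos, count = 0, 0
    for _ in range(steps):
        pos += -1 if random.random() < p_left else 1
        if pos == 0:
            count += 1
    return count

trials = 200
fair = sum(returns_to_zero(0.5, 2000) for _ in range(trials)) / trials
biased = sum(returns_to_zero(0.7, 2000) for _ in range(trials)) / trials
print(fair, biased)  # the fair walk revisits 0 far more often
```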

Proof. First, no matter what p is, let $q_y$ be the probability that the walk ever visits the point y.

Part 1: p ≠ 1/2. The first step of the argument doesn't depend on the sign of p − 1/2:

Lemma 16. Let p ≠ 1/2. Then with probability 1, every y is visited only finitely many times.

Proof. Consider any y and let $B_{y,t}$ = the event that the walk is at y at time t. The following calculation shows that for any y, $\sum_t \Pr(B_{y,t}) < \infty$. For t such that t ≡ y (mod 2), we have

$$\Pr(B_{y,t}) = \binom{t}{\frac{t-|y|}{2}} p^{\frac{t-y}{2}} (1-p)^{\frac{t+y}{2}} = \binom{t}{\frac{t-|y|}{2}} (p(1-p))^{t/2} \left(\frac{1-p}{p}\right)^{y/2} \leq 2^t (p(1-p))^{t/2} \left(\frac{1-p}{p}\right)^{y/2} = (4p(1-p))^{t/2} \left(\frac{1-p}{p}\right)^{y/2}$$


Therefore

$$\sum_t \Pr(B_{y,t}) \leq \left(\frac{1-p}{p}\right)^{y/2} \frac{1}{1 - \sqrt{4p(1-p)}} < \infty$$

where the final inequality uses p ≠ 1/2. So by Borel-Cantelli I (Lemma 12), with probability 1, y is visited only finitely many times. Then by the union bound, with probability 1 every y is visited only finitely many times. 2

Now let's suppose further that p > 1/2 (i.e., the walk drifts left). Then for any x ∈ Z,

$$\sum_{y \geq x} \sum_t \Pr(B_{y,t}) \leq \left(\frac{1-p}{p}\right)^{x/2} \cdot \frac{1}{1 - \sqrt{(1-p)/p}} \cdot \frac{1}{1 - \sqrt{4p(1-p)}} < \infty$$

So we get the even stronger conclusion, again by BC-I, that with probability 1 the walk spends only finite time in the interval [x, ∞). Plugging in x = 0 gives Pr(R) = 0. And since this holds for all x, we get Pr(L) = 1. By symmetry, we've covered part 1 (the first two cases) of the theorem.

Part 2: p = 1/2. Here the claims Pr(L) = 1 and Pr(R) = 1 are equivalent, so let's focus on the first. The claim Pr(L) = 1 is equivalent to saying that for any integer x ≥ 0, with probability 1 the walk reaches the point −x. This is often referred to as the phenomenon of gambler's ruin:

Suppose a gambler with a finite integer initial stake $x, bets in each round (so long as he is solvent, i.e., he has a positive amount of money) $1 on the outcome of a fair coin. Then the probability that he goes broke is 1.

Let's show this. For x ≥ 0, $q_{-x}$ is, in the notation above, the probability that the gambler goes broke from initial stake x. We claim that the function x ↦ $q_{-x}$ is harmonic on N with boundary condition $q_0 = 1$. The harmonic condition means that at all interior points of the nonnegative integer axis, which means all x ∈ N₊, the function value is the average of its neighbors:

$$q_{-x} = (q_{-(x-1)} + q_{-(x+1)})/2.$$

That this is so is obvious from the description of the gambler's process (condition on the outcome of the first bet). This equation implies that $q_{-x}$ is affine linear in x, because for x ≥ 1 the "discrete second derivative" is 0:

$$(q_{-(x+1)} - q_{-x}) - (q_{-x} - q_{-(x-1)}) = q_{-(x+1)} - 2q_{-x} + q_{-(x-1)} = 0.$$

However, the function $q_{-x}$ is also bounded in [0, 1]. A bounded affine linear function on N is constant, so $q_{-x}$ agrees with its boundary value $q_0 = 1$ for every x. 2
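The same harmonic-function reasoning can be checked concretely in the finite version of the problem, where the gambler also stops upon reaching a target fortune N. Writing r(x) for the ruin probability from stake x (our notation), the unique solution of the boundary-value problem r(0) = 1, r(N) = 0 with the averaging property is r(x) = 1 − x/N, which tends to 1 as N → ∞. A small sketch:

```python
def ruin_prob_fair(x, N):
    """Ruin probability r(x) for a fair gambler with stake x who stops
    at 0 (broke) or at target N: the unique harmonic function with
    boundary values r(0) = 1, r(N) = 0 is r(x) = 1 - x/N."""
    return 1.0 - x / N

N = 50
# the averaging (harmonic) property holds at every interior point
for x in range(1, N):
    avg = 0.5 * (ruin_prob_fair(x - 1, N) + ruin_prob_fair(x + 1, N))
    assert abs(ruin_prob_fair(x, N) - avg) < 1e-12

# letting the upper stopping boundary N go to infinity makes ruin certain
for x in (1, 10, 100):
    assert ruin_prob_fair(x, 10 ** 9) > 0.9999
```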


1.7 Lecture 7 (14/Oct): Percolation on trees. Basic inequalities.

1.7.1 A simple model for (among other things) epidemiology: percolation on the regular d-ary tree, d > 1

In epidemiology there are various mathematical models for the spread of an infectious disease. A primary workhorse is the "SIR" (susceptible/infected/recovered) model, and from there it escalates in complexity. Today we'll talk about a model that is even simpler than SIR. Yet this model is still good enough to capture the most fundamental quantity epidemiologists work with, which is R₀, the "basic reproductive number": the expected number of individuals who contract the infection due to contact with one infected individual. The rule of thumb is that a disease spreads if R₀ > 1, and dies out if R₀ < 1. Today we'll give some justification for this rule of thumb.

Consider $T_d$, the infinite rooted d-ary tree (which means every vertex has d children; as a graph, all vertices but the root have degree d + 1). Perform percolation with parameter 0 < p < 1, i.e., each edge is present independently with probability p. Observe that the expected number of children a vertex is connected to is pd. We have in mind that the root is "patient 0," the first individual carrying the disease; presence of an edge from "parent" X to "child" Y indicates infection of Y by X. (Of course, in actual epidemiology there is no fixed underlying tree.) So, in our modeling: R₀ = pd. Let C be the connected component of the root. Think of C finite as representing the disease dying out; C infinite is the limiting concept of a pandemic.

Theorem 17. If pd < 1 then C is a.s. finite. If pd > 1 then Pr(C is infinite) > 0.

Proof. If p < 1/d then E(|C|), the expected number of nodes connected to the root, is $\sum_{\ell \geq 0} (pd)^{\ell} < \infty$. So BC-I (applied to the events ⟦v ∈ C⟧, one per vertex v) implies that C is a.s. finite.

If p > 1/d, define $Z_\ell$ = ⟦the root is connected to some vertex at level ℓ⟧. Then ⟦C is infinite⟧ = $\bigcap_{\ell} Z_\ell$. Observe that

1. Pr(Z0) = 1,

2. ∀ℓ, $Z_\ell \supseteq Z_{\ell+1}$, so

(a) Pr($Z_\ell$) is nonincreasing in ℓ, and
(b) Pr($\bigcap_\ell Z_\ell$) = $\inf_\ell$ Pr($Z_\ell$).

3. Pr(Z`) is increasing in p (by the coupling argument for percolation we mentioned in an earlier lecture).

Let 0 < ε = log pd. At each child of the root of a tree of depth ℓ + 1, consider the event that that child is connected to the root and that, within its own subtree, the event $Z_\ell$ occurs. Applying Bonferroni level 2 to these d events, we have:

$$\Pr(Z_{\ell+1}) \geq pd \Pr(Z_\ell) - \binom{d}{2} p^2 \Pr(Z_\ell)^2$$

$$\geq \Pr(Z_\ell)\, pd\, \big(1 - \Pr(Z_\ell)\, p(d-1)/2\big) \geq \Pr(Z_\ell)\, e^{\varepsilon} \left(1 - \Pr(Z_\ell) \frac{e^{\varepsilon} - p}{2}\right)$$

If $\Pr(Z_\ell)$ were very small, specifically if $1 - \Pr(Z_\ell) \frac{e^{\varepsilon}-p}{2} > e^{-\varepsilon}$ (equivalently $\Pr(Z_\ell) < 2\frac{1-e^{-\varepsilon}}{e^{\varepsilon}-p}$), this would show that $\Pr(Z_{\ell+1}) > \Pr(Z_\ell)$, which would contradict 2a. Therefore, $\Pr(Z_\ell) \geq 2\frac{1-e^{-\varepsilon}}{e^{\varepsilon}-p} > 0$ for all ℓ, which by 2b completes the proof. 2
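Separately from the proof above, the survival probability admits an exact branching-process recursion: conditioning on the d edges at the root, $z_{\ell+1} = 1 - (1 - p z_\ell)^d$ with $z_0 = 1$, where $z_\ell = \Pr(Z_\ell)$. A sketch iterating this recursion (iteration count and parameter values are arbitrary choices of ours) exhibits the phase transition at pd = 1:

```python
def survival(p, d, iters=10000):
    """Approximate inf_l Pr(Z_l) by iterating z_{l+1} = 1 - (1 - p*z_l)^d
    from z_0 = 1; the limit is positive exactly when pd > 1."""
    z = 1.0
    for _ in range(iters):
        z = 1.0 - (1.0 - p * z) ** d
    return z

d = 3
assert survival(0.2, d) < 1e-6   # pd = 0.6 < 1: the component dies out
assert survival(0.5, d) > 0.1    # pd = 1.5 > 1: survives with positive probability
```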


1.7.2 Markov inequality (the simplest tail bound)

Lemma 18. Let A be a non-negative random variable with finite expectation µ₁.⁴ Then for any λ ≥ 1, Pr(A > λµ₁) < 1/λ. In particular, for µ₁ = 0, Pr(A > µ₁) = 0.

(Of course the lemma holds trivially also for 0 < λ < 1.)

Proof. If Pr(A ≥ λµ₁) > 1/λ then E(A) > µ₁, a contradiction. So Pr(A ≥ λµ₁) ≤ 1/λ, and therefore, if the lemma fails, it must be that Pr(A > λµ₁) = 1/λ. In particular for some ε > 0 there is a δ > 0 s.t. Pr(A ≥ λµ₁ + ε) ≥ δ. Then E(A) ≥ δ · (λµ₁ + ε) + (1/λ − δ) · λµ₁ = µ₁ + δε > µ₁, a contradiction. 2

For a more visual argument (but proving the slightly weaker Pr(A ≥ λµ₁) ≤ 1/λ), note that the step function ⟦x ≥ λµ₁⟧ satisfies the inequality ⟦x ≥ λµ₁⟧ ≤ x/(λµ₁) for all nonnegative x. If µ is the probability distribution of the rv A, then

$$\Pr(A \geq \lambda\mu_1) = \int ⟦x \geq \lambda\mu_1⟧\, d\mu \leq \int \frac{x}{\lambda\mu_1}\, d\mu = \frac{\mu_1}{\lambda\mu_1} = \frac{1}{\lambda}.$$

1.7.3 Variance and the Chebyshev inequality: a second tail bound

Let X be a real-valued rv. If E(X) and E(X²) are both well-defined and finite, let Var(X) = E(X²) − E(X)². We can also see that E((X − E(X))²) = Var(X) by expanding the LHS and applying linearity of expectation. In particular, the variance is nonnegative. If c ∈ R then, since the variance is homogeneous of degree 2, Var(cX) = c² Var(X).

Lemma 19 (Chebyshev). If E(X) = θ, then Pr(|X − θ| > λ√(Var(X))) < 1/λ².

Proof.

$$\Pr\big(|X - \theta| > \lambda\sqrt{\mathrm{Var}(X)}\big) = \Pr\big((X - \theta)^2 > \lambda^2 \mathrm{Var}(X)\big)$$

< 1/λ² by the Markov inequality (Lemma 18). 2

A frequently useful corollary of the Chebyshev inequality (Lemma 19) is:

Corollary 20. If X is a real rv with mean 0 < µ < ∞ and variance σ² < ∞, then Pr(X ≤ 0) ≤ σ²/µ².

On a homework I’ll ask you to show:

Lemma 21. If X is a nonnegative rv with mean µ < ∞ and variance σ² < ∞, then Pr(X = 0) ≤ σ²/(µ² + σ²).

⁴For a nonnegative rv there can be no problems with absolute convergence of the expectation; however, it may be infinite.
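These tail bounds are easy to sanity-check on a small explicit distribution; the values and probabilities below are invented for illustration:

```python
# A small explicit discrete rv (values and probabilities chosen arbitrarily).
dist = {0: 0.2, 1: 0.3, 2: 0.3, 5: 0.2}
assert abs(sum(dist.values()) - 1.0) < 1e-12

mean = sum(v * q for v, q in dist.items())                 # mu_1 = 1.9
var = sum(v * v * q for v, q in dist.items()) - mean ** 2  # sigma^2 = 2.89

def pr(pred):
    """Probability of the event {v : pred(v)} under dist."""
    return sum(q for v, q in dist.items() if pred(v))

lam = 2.0
assert pr(lambda v: v > lam * mean) <= 1 / lam                         # Markov (Lemma 18)
assert pr(lambda v: abs(v - mean) > lam * var ** 0.5) <= 1 / lam ** 2  # Chebyshev (Lemma 19)
assert pr(lambda v: v <= 0) <= var / mean ** 2                         # Corollary 20
assert pr(lambda v: v == 0) <= var / (mean ** 2 + var)                 # Lemma 21
```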


1.7.4 Omitted material 2020: Cont. basic inequalities, probabilistic method

Power mean inequality

Nonnegativity of the variance is merely a special case of monotonicity of the power means. (In this context, though, we will assume the random variable X is positive-valued. For the variance we don’t need this constraint.)

Lemma 22 (Power means inequality). For a positive-real-valued rv X, and for reals s < t,

$$(E(X^s))^{1/s} \leq (E(X^t))^{1/t}.$$

Proof. Let µ be the probability measure. Recall that for r ≥ 1, the function $x^r$ is convex ("cup") in x; equivalently, $x^{1/r}$ is concave. For a concave function f, $\int f(x)\, d\mu(x) \leq f(\int x\, d\mu(x))$. (This is sometimes called Jensen's inequality.) Applying this with r = t/s, i.e., to the concave function $x^{s/t}$ (take 0 < s < t; the other cases are similar), we have

$$\int X^s\, d\mu \leq \left(\int X^t\, d\mu\right)^{s/t}.$$

Raising both sides to the power 1/s gives the lemma.

2

Using the concave function f(x) = log(x) gives us

$$\exp\left(\int \log x\, d\mu\right) \leq \int x\, d\mu \qquad (1.16)$$

which is the arithmetic-geometric mean inequality: in the case of a uniform distribution on n positive values of X, it reads $(\prod X_i)^{1/n} \leq \frac{1}{n} \sum X_i$. That (1.16) is a special case of the power means inequality can be seen by fixing t = 1 and taking the limit s → 0 (approximating $x^s$ by 1 + s log x).
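A numerical check of the monotonicity (the sample values and the set of orders r are arbitrary choices of ours):

```python
import math

def power_mean(xs, r):
    """Power mean of order r with uniform weights; r = 0 is the geometric mean."""
    n = len(xs)
    if r == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** r for x in xs) / n) ** (1.0 / r)

xs = [0.5, 1.0, 2.0, 7.0, 3.5]   # arbitrary positive sample values
orders = [-2, -1, 0, 1, 2, 3]
means = [power_mean(xs, r) for r in orders]
# monotone nondecreasing in r; in particular GM (r = 0) <= AM (r = 1)
assert all(a <= b + 1e-12 for a, b in zip(means, means[1:]))
```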

Large girth and large chromatic number; the deletion method

Earlier we saw our first example of the probabilistic method, the proof of the existence of graphs with no small clique or independent set. In that case, just picking an element of a set at random was already enough in order to produce an object that is hard to construct “explicitly”. However, the probabilistic method in that form can construct only an object with properties that are shared by a large fraction of objects. Now we will see an example that enables the probabilistic method to construct something that is quite rare. It is maybe even a bit surprising that this kind of object exists. We consider graphs here to be undirected and without loops or multiple edges. The chromatic number χ of a graph is the least number of colors with which the vertices can be colored, so that no two neighbors share a color. Clearly, as you add edges to a graph, its chromatic number goes up. The girth γ of a graph is the length of a shortest simple cycle. (“Simple” = no edges repeat. The girth of a forest is infinite.) Clearly, as you add edges to a graph, its girth goes down. These numbers are both monotone in the inclusion partial order on graphs. Chromatic number is monotone increasing, while girth is monotone decreasing. An important theorem we’ll cover shortly is the FKG Inequality, which implies in this setting that for any k, g > 0, if you pick a graph


u.a.r., and condition on the event that its chromatic number is above k, that reduces the probability that its girth will be above g. In symbols, for the G(n, p) ensemble,

Pr((χ(G) > k) ∩ (γ(G) > g)) ≤ Pr(χ(G) > k) Pr(γ(G) > g).

So in this precise sense, chromatic number and girth are anticorrelated. Indeed, having large girth means that the graph is a tree in large neighborhoods around each vertex. A tree has chromatic number 2. If you just allow yourself 3 colors, you gain huge flexibility in how to color a tree. Surely, with large girth, you might be able to color the local trees so that when they finally meet up in cycles, you can meet the coloring requirement? No! Here is a remarkable theorem.

Theorem 23 (Erdős [33]). For any k, g there is a graph with chromatic number χ ≥ k and girth γ ≥ g.

Proof. Pick a graph G from G(n, p), where $p = n^{-1+1/g}$. This is likely to be a fairly sparse graph; the expected degree is just under $n^{1/g}$. Let the rv X be the number of cycles in G of length < g. Then $E(X) = \sum_{m=3}^{g-1} p^m\, n(n-1)\cdots(n-m+1)/(2m)$. (Pick the cycle sequentially and forget the starting point and orientation.) Then

$$E(X) < \sum_{m=3}^{g-1} p^m n^m/(2m) = \sum_{m=3}^{g-1} n^{m/g}/(2m) \leq \sum_{m=3}^{g-1} n^{m/g}/6.$$

For sufficiently large n, specifically $n > 2^g$, the successive terms in this sum at least double, so $E(X) \leq n^{1-1/g}/3$. By Markov's inequality, $\Pr(X > n^{1-1/g}) < 1/3$. For the chromatic number we use a simple lower bound. Let I be the size of a largest independent set in G. Since every color class of a coloring must be an independent set,

I · χ ≥ n. (1.17)

Now $\Pr(I \geq i) \leq \binom{n}{i}(1-p)^{\binom{i}{2}}$, and recalling (1.15), the simple inequality for the exponential function, we have $\Pr(I \geq i) \leq \binom{n}{i} e^{-p\binom{i}{2}} = \binom{n}{i} e^{-\binom{i}{2} n^{-1+1/g}}$. Using the wasteful bound $\binom{n}{i} \leq n^i$ we have $\Pr(I \geq i) \leq e^{i \log n - \binom{i}{2} n^{-1+1/g}} = e^{i \log n + (i/2 - i^2/2) n^{-1+1/g}}$. Finally we apply this at $i = 3n^{1-1/g} \log n$.

$$\Pr(I \geq i) \leq e^{3n^{1-1/g} \log^2 n + \frac{1}{2}(3n^{1-1/g} \log n) n^{-1+1/g} - \frac{1}{2}(3n^{1-1/g} \log n)^2 n^{-1+1/g}} = e^{\frac{3}{2}(\log n - n^{1-1/g} \log^2 n)}$$

which for sufficiently large n is < 1/3. Thus, for sufficiently large n, there is probability at least 1/3 that G has both $I < 3n^{1-1/g} \log n$ and at most $n^{1-1/g} \leq n/2$ cycles of length strictly less than g. Removing vertices from G can only reduce I, because any set that is independent after the removal was also independent before. (By contrast, removing edges can only increase I.) So, by removing one vertex from each such cycle, we obtain a graph with ≥ n/2 vertices, girth ≥ g, and $I \leq 3n^{1-1/g} \log n$. Applying (1.17) (to the graph now of size ≥ n/2), we have $\chi \geq n^{1/g}/(6 \log n)$, which for sufficiently large n is ≥ k. 2


1.8 Lecture 8 (16/Oct): FKG inequality

Another appetizer: pub crawl. Pubs labelled 0, . . . , k in clockwise order are situated all around a city block. A drunk starts his evening in pub 0 and has a beer. When he’s finished his beer he steps out, randomly walks one storefront left or right, steps into that pub, and has a beer there. He repeats this all evening, and his festivities end only when he has visited all pubs. Question: What is the probability distribution on the pub where he finishes his evening?

Correlations among monotone events

Consider again the random graph model G(n, p). Suppose someone peeks at the graph and tells you that it has a Hamilton cycle. How does that affect the probability that the graph is planar? Or that its girth is less than 10? Or consider the percolation process on the n × n square grid. Suppose you check and find that there is a path from (0, 0) to (1, 0). What does that tell you about the chance that the graph has an isolated vertex? These questions fall into the general framework of correlation inequalities. History: Harris (1960) [47], Kleitman (1966) [61], Fortuin, Kasteleyn and Ginibre (1971) [37], Holley (1974) [51], Ahlswede and Daykin “Four Functions Theorem” (1978) [4].

We are concerned here with the probability space Ω of n independent random bits $b_1, \ldots, b_n$. It doesn't matter whether they are identically distributed. Let $p_i = \Pr(b_i = 1)$. We consider the boolean lattice B on these bits: $b \geq b'$ if for all i, $b_i \geq b'_i$. So, Ω is the distribution on B for which

$$\Pr(b) = \left(\prod_{i: b_i = 1} p_i\right)\left(\prod_{i: b_i = 0} (1 - p_i)\right) = \prod_i \left(\frac{1}{2} + (-1)^{b_i}\left(\frac{1}{2} - p_i\right)\right)$$

Definition 24. A real-valued function f on Ω is increasing if $b \geq b' \Rightarrow f(b) \geq f(b')$. It is decreasing if −f is increasing. Likewise, an event on Ω (or in other words a subset of B) is increasing if its indicator function is increasing, and decreasing if its indicator function is decreasing.

Theorem 25 (FKG [37]). If f and g are increasing functions on Ω then

E( f g) ≥ E( f )E(g)

Corollary 26. 1. If A and B are increasing events on Ω then Pr(A ∩ B) ≥ Pr(A) Pr(B).

2. If f is an increasing function and g is a decreasing function, then E(fg) ≤ E(f)E(g).

3. If A is an increasing event and B is a decreasing event, then Pr(A ∩ B) ≤ Pr(A) Pr(B).

Before we begin the proof we should introduce an important concept: Suppose X and Y are random variables defined over a common probability measure µ. Let T be a measurable subset of the range of Y with Pr(Y ∈ T) > 0. Then

$$E(X \mid Y \in T) = \frac{1}{\Pr(Y \in T)} \int X \cdot ⟦Y \in T⟧\, d\mu$$

Caltech CS150 2020. 28 1.8. Lecture 8 (16/Oct): FKG inequality

The conditional expectation E(X|Y) is a random variable; specifically, it is a function of the random variable Y. This has the following natural consequence, which is called the tower property of conditional expectations: E(X) = E(E(X|Y)) (1.18) Notice that on the RHS, the outer expectation is over the distribution of Y; on the inside we have the rv which is a real number that is, as we have said, a function of Y. We're not going to have time to do the measure theory here properly, but to see (1.18) in the case of discrete rvs, one need only note that both sides equal $\sum_y \Pr(Y = y) E(X \mid Y = y)$. Before we prove the FKG theorem let's just reinterpret it. Suppose g is the indicator function of some increasing event. Then

E(fg) = Pr(g = 1)E(f | g = 1) + Pr(g = 0) · 0 = Pr(g = 1)E(f | g = 1) = E(g)E(f | g = 1)

so

$$E(f \mid g = 1) = \frac{E(fg)}{E(g)} \geq \frac{E(f)E(g)}{E(g)} = E(f).$$

The interpretation is that conditioning on an increasing event only increases the expectation of any increasing function.

Proof. By induction on n. Case n = 1 (with p = Pr(b = 1)):

$$E(fg) - E(f)E(g) = p f(1)g(1) + (1-p) f(0)g(0) - \big(p f(1) + (1-p) f(0)\big)\big(p g(1) + (1-p) g(0)\big)$$
$$= p(1-p)\big(f(1)g(1) + f(0)g(0) - f(1)g(0) - f(0)g(1)\big) = p(1-p)\big(f(1) - f(0)\big)\big(g(1) - g(0)\big) \geq 0$$

by the monotonicity of both functions.

Now for the induction. Observe that for any assignment $(b_2 \ldots b_n) \in \{0,1\}^{n-1}$, f becomes a monotone function of the single bit $b_1$. For convenience, in the expectations to follow the subscript indicates explicitly which subset of bits the expectation is taken with respect to. So for instance in the second line, fg has the role of X, above, and $(b_2 \ldots b_n)$ has the role of Y. These subscripts are extraneous and I'm just including them for clarity.

$$E(fg) = E_{1\ldots n}(fg) = E_{2\ldots n}\big(E_1(fg \mid b_2 \ldots b_n)\big) \geq E_{2\ldots n}\big(E_1(f \mid b_2 \ldots b_n) \cdot E_1(g \mid b_2 \ldots b_n)\big),$$

applying the base case.

Observe again that E1( f |b2 ... bn) is a function of b2 ... bn. By monotonicity of f , it is an increasing function. Likewise for E1(g|b2 ... bn). Since by induction we may assume the theorem for the case n − 1, we have

$$\ldots \geq E_{2\ldots n}\big(E_1(f \mid b_2 \ldots b_n)\big) \cdot E_{2\ldots n}\big(E_1(g \mid b_2 \ldots b_n)\big) = E_{1\ldots n}(f) \cdot E_{1\ldots n}(g) = E(f)E(g),$$

using (1.18) in the middle equality.

2
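The theorem can also be verified by brute force on a small cube. The sketch below generates random increasing functions as monotone envelopes (f(b) = max over b' ≤ b of a random h(b')), draws random independent bit biases, and checks E(fg) ≥ E(f)E(g); all names and parameter choices are ours, and this is a check, not a proof:

```python
import itertools
import random

random.seed(3)
n = 4
cube = list(itertools.product([0, 1], repeat=n))
p = [random.uniform(0.1, 0.9) for _ in range(n)]   # arbitrary independent biases

def prob(b):
    out = 1.0
    for i, bit in enumerate(b):
        out *= p[i] if bit else 1.0 - p[i]
    return out

def leq(a, b):
    return all(x <= y for x, y in zip(a, b))

def random_increasing():
    """Monotone envelope of a random function: f(b) = max_{b' <= b} h(b')."""
    h = {b: random.random() for b in cube}
    return {b: max(h[c] for c in cube if leq(c, b)) for b in cube}

def E(f):
    return sum(f[b] * prob(b) for b in cube)

for _ in range(200):
    f, g = random_increasing(), random_increasing()
    fg = {b: f[b] * g[b] for b in cube}
    assert E(fg) >= E(f) * E(g) - 1e-12   # FKG holds in every trial
```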


Easy Application: In the random graph G(2k, 1/2), there is probability at least $2^{-2k}$ that all degrees are ≤ k − 1. (Call this event A; it is an intersection of 2k decreasing degree events, each of probability ≥ 1/2, so the FKG inequality gives the lower bound. One can also ask for an upper bound on Pr(A). A is contained in the event that vertex 1 has degree ≤ k − 1, so Pr(A) ≤ 1/2. Here is a simple improvement, showing that Pr(A) tends to 0. Fix a set L of the vertices, of size ℓ. For v ∈ L, if it has at most k − 1 neighbors, then it has at most k − 1 neighbors in $L^c$. So we'll just upper bound the probability that every vertex in L has at most k − 1 neighbors in $L^c$. These events (ranging over v ∈ L) are independent. So we can use the upper bound $\big(2^{-(2k-\ell)} \binom{2k-\ell}{\leq k-1}\big)^{\ell}$, where $\binom{m}{\leq k-1} = \sum_{j \leq k-1} \binom{m}{j}$. Fixing ℓ proportional to $\sqrt{k}$ sets the base of this exponential to a constant < 1 (a deviation bound at a constant number of standard deviations from the mean), and therefore yields a bound of the form $\Pr(A) \leq \exp(-\Omega(\sqrt{k}))$. It might be an interesting exercise to improve this bound.)

Application: The FKG inequality provides a very efficient proof of an inequality of Daykin and Lovász [28]:

Theorem 27. Let H be a family of subsets of [n] such that for all A, B ∈ H, ∅ ⊂ A ∩ B and A ∪ B ⊂ [n] (strict containments). Then $|H| \leq 2^{n-2}$.

Proof. Let F be the “upward order ideal” generated by H:

F = {S : ∃T ∈ H, T ⊆ S}.

Similarly let G be the "downward order ideal" generated by H: G = {S : ∃T ∈ H, S ⊆ T}. Then H ⊆ F ∩ G. $|F| \leq 2^{n-1}$ because F cannot contain any set and its complement. Likewise, $|G| \leq 2^{n-1}$ because G too cannot contain any set and its complement. Interpreting this in terms of the bits being distributed uniformly iid, we have that Pr(F) ≤ 1/2 and Pr(G) ≤ 1/2. Since F is an increasing event and G a decreasing event, Pr(F ∩ G) ≤ Pr(F) Pr(G) ≤ 1/4, i.e., $|H| \leq |F \cap G| \leq 2^{n-2}$. 2

Application: We won't show the argument here, but the FKG inequality was used in a very clever way by Shepp to prove the "XYZ inequality" conjectured by Rival and Sands. Let Γ be a finite poset. A linear extension of Γ is any total order on its elements that is consistent with Γ. Consider the uniform distribution on linear extensions of Γ. The XYZ inequality says:

Theorem 28 (Shepp [86]). For any three elements x, y, z of Γ,

Pr((x ≤ y) ∧ (x ≤ z)) ≥ Pr(x ≤ y) · Pr(x ≤ z).


1.8.1 Omitted 2020: Achieving expectation in MAX-3SAT.

MAX-3SAT

Let's start looking at some computational problems. A 3CNF formula on variables $x_1, \ldots, x_n$ is the conjunction of clauses, each of which is a disjunction of at most three literals. (A literal is an $x_i$ or $x_i^c$, where $x_i^c$ is the negation of $x_i$.) You will recall that it is NP-complete to decide whether a 3CNF formula is satisfiable, that is, whether there is an assignment to the $x_i$'s s.t. all clauses are satisfied. Let's take a little different focus: think about the maximization problem of satisfying as many clauses as possible. Of course this is NP-hard, since it includes satisfiability as a special case. But, being an optimization problem, we can still ask how well we can do.

Theorem 29. For any 3CNF formula there is an assignment satisfying ≥ 7/8 of the clauses. Moreover such an assignment can be found in randomized time O(m2), where m is the number of clauses (and we suppose that every variable occurs in some clause).

Proof. The existence assertion is due to linearity of expectation: a uniformly random assignment satisfies each clause on three distinct literals with probability 1 − 2⁻³ = 7/8 (as usual for this problem we take each clause to have exactly three distinct literals), so the expected number of satisfied clauses is 7m/8, and some assignment must achieve this. The algorithm might be attributed to the English educator Hickson [50]: 'Tis a lesson you should heed: / Try, try, try again. / If at first you don't succeed, / Try, try, try again. Now that we've been suitably educated, let's ask, how long does this process take? In a single trial we check one assignment, which takes time O(m). How many trials do we need? Let the rv M be the number of satisfied clauses of a random assignment. m − M is a nonnegative rv with expectation m/8, and Markov's inequality tells us that Pr(M ≤ (7/8 − ε)m) = Pr(m − M ≥ (1 + 8ε)m/8) ≤ 1/(1 + 8ε). This says we have a good chance of getting close to the desired number of satisfied clauses; however, we asked to achieve 7/8, not 7/8 − ε. We can get this by noting that M is integer-valued, so for ε < 1/m, an assignment satisfying 7/8 − ε of the clauses satisfies 7/8 of them.

1 8ε 4 1 − = = ∈ Ω(1/m) 1 + 8ε 1 + 8ε m + 4

Trials succeed or fail independently so the expected number of trials to success is the expectation of a geometric random variable with parameter Ω(1/m), which is O(m). 2
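A sketch of the try-try-again algorithm on a toy formula. The clause list, in DIMACS-style signed-literal form, is a hypothetical example of ours; for the O(m) expected trial count, the clauses should each have exactly three distinct literals:

```python
import math
import random

random.seed(4)

def satisfied(clause, assign):
    """A clause (list of signed variable indices) is satisfied by assign."""
    return any((lit > 0) == assign[abs(lit)] for lit in clause)

def count_sat(clauses, assign):
    return sum(satisfied(c, assign) for c in clauses)

def max3sat_retry(clauses, n_vars):
    """Try, try again: resample uniform assignments until at least
    ceil(7m/8) clauses are satisfied."""
    target = math.ceil(7 * len(clauses) / 8)
    while True:
        assign = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        if count_sat(clauses, assign) >= target:
            return assign

# toy 3CNF: -1 denotes the negation of x_1, etc. (hypothetical example)
clauses = [[1, 2, 3], [-1, 2, 4], [1, -3, -4], [-2, -3, 4], [2, 3, -4]]
a = max3sat_retry(clauses, 4)
assert count_sat(clauses, a) >= math.ceil(7 * len(clauses) / 8)
```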

Derandomization by the method of conditional expectations

How can we improve on this simple-minded method? We do not have a way forward on increasing the fraction of satisfied clauses, because of:

Theorem 30 (Håstad [48]). For all ε > 0 it is NP-hard to approximate Max-3SAT within factor 7/8 + ε.

But we might hope to reduce the runtime, and also perhaps the dependence on random bits. As it turns out we can accomplish both of these objectives.

Theorem 31. There is an O(m)-time deterministic algorithm to find an assignment satisfying 7/8 of the clauses of any 3CNF formula on m clauses.


Proof. This algorithm illustrates the method of conditional expectations. The point is that we can derandomize the randomized algorithm by not picking all the variables at once. Instead, we consider the alternative choices for just one of the variables, and choose the branch on which the conditional expected number of satisfied clauses is greater. This very general method works in situations in which one can actually quickly calculate (or at least approximate) said conditional expectations. We use the tower property of conditional expectations, (1.18): letting Y = the number of satisfied clauses for a uniformly random setting of the rvs,

E(Y) = E(E(Y|x1))

or explicitly

$$E(Y) = \frac{1}{2} E(Y \mid x_1 = 0) + \frac{1}{2} E(Y \mid x_1 = 1)$$

and the strategy is to pursue the value of $x_1$ which does better. In the present example computing the conditional expectations is easy. The probability that a clause of size i is satisfied is $1 - 2^{-i}$. If a formula has $m_i$ clauses of size i, the expected number of satisfied clauses is $\sum_i m_i(1 - 2^{-i})$. Now, partition the clauses of size i into $m_i^1$ that contain the literal $x_1$, $m_i^0$ that contain the literal $x_1^c$, and those that contain neither. The expected number of satisfied clauses conditional on setting $x_1 = 1$ is

$$\sum_i m_i^1 + \sum_i m_i^0 (1 - 2^{-i+1}) + \sum_i (m_i - m_i^1 - m_i^0)(1 - 2^{-i}). \qquad (1.19)$$

Similarly the expected number of satisfied clauses conditional on setting x1 = 0 is

$$\sum_i m_i^1 (1 - 2^{-i+1}) + \sum_i m_i^0 + \sum_i (m_i - m_i^1 - m_i^0)(1 - 2^{-i}). \qquad (1.20)$$

Each of these quantities can be computed in time O(m). (Actually, since these quantities average to the current expectation, which we already know, we only have to calculate one of them.) This simple process runs in time O(m²). However, we can actually do the process in time O(m). We don't even really need to calculate the quantities (1.19), (1.20). We start with variable $x_1$ and scan all the clauses it participates in (see Fig. 1.3). For each clause $C_i$ (which say currently has $|C_i|$ literals), the effect of setting $x_1 = 1$ changes the contribution of the clause to the expectation from $1 - 2^{-|C_i|}$ to either 1 (if the variable satisfies the clause) or to $1 - 2^{-(|C_i|-1)}$ (otherwise); i.e., the expectation either increases or decreases by $2^{-|C_i|}$, while the effect of setting $x_1 = 0$ is exactly the negative of this. We add up these contributions of $\pm 2^{-|C_i|}$, conditional on $x_1 = 1$, as we scan the clauses containing $x_1$; if the total is nonnegative we fix $x_1 = 1$, otherwise we fix $x_1 = 0$. Having done that, we edit the relevant clauses to eliminate $x_1$ from them. Then we continue with $x_2$, etc. The work spent per variable is proportional to its degree in this bipartite graph (the number of clauses containing it), and the sum of these degrees is ≤ 3m. So the total time spent is O(m). 2

[Figure 1.3: the bipartite incidence graph between clauses $C_1, \ldots, C_m$ of size ≤ 3 and variables $x_1, \ldots, x_n$.]
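The method of conditional expectations can be sketched as follows. For simplicity this version recomputes the full conditional expectation for each choice, i.e., it is the O(m²)-style variant rather than the incremental O(m) one described above; the function names are ours:

```python
def expected_sat(clauses, partial):
    """E[# satisfied clauses] when the variables not in `partial`
    are set to independent uniform random bits."""
    total = 0.0
    for clause in clauses:
        unset, sat = 0, False
        for lit in clause:
            v = abs(lit)
            if v in partial:
                sat = sat or ((lit > 0) == partial[v])
            else:
                unset += 1
        total += 1.0 if sat else 1.0 - 2.0 ** -unset
    return total

def conditional_expectations(clauses, n_vars):
    """Fix variables one at a time, never letting the conditional
    expectation decrease; the final integer count is >= the initial E."""
    partial = {}
    for v in range(1, n_vars + 1):
        e1 = expected_sat(clauses, {**partial, v: True})
        e0 = expected_sat(clauses, {**partial, v: False})
        partial[v] = e1 >= e0
    return partial

# toy 3CNF in signed-literal form (hypothetical example)
clauses = [[1, 2, 3], [-1, 2, 4], [1, -3, -4], [-2, -3, 4], [2, 3, -4]]
assign = conditional_expectations(clauses, 4)
sat = sum(any((l > 0) == assign[abs(l)] for l in c) for c in clauses)
assert sat >= 7 * len(clauses) // 8   # deterministic 7/8 guarantee
```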

Caltech CS150 2020.

Chapter 2

Algebraic Fingerprinting

There are several key ways in which algebra is used in algorithms. One is to "push apart" things that are different even if they are similar. We'll study a few examples of this phenomenon.

2.1 Lecture 9 (19/Oct): Fingerprinting with Linear Algebra

2.1.1 Polytime Complexity Classes Allowing Randomization

Some definitions of one-sided and two-sided error in randomized computation are useful. Definition 32. BPP, RP, co-RP, ZPP: These are the four main classes of randomized polynomial-time computation. All are decision classes. A language L is:

• In BPP if the algorithm errs with probability ≤ 1/3.

• In RP if for x ∈ L the algorithm errs with probability ≤ 1/3, and for x ∉ L the algorithm errs with probability 0. RP: no false positives.

(Note, RP is like NP in that it provides short proofs of membership.) The subsidiary definitions are:

• L ∈ co-RP if L^c ∈ RP, that is to say, if for x ∈ L the algorithm errs with probability 0, and if for x ∉ L the algorithm errs with probability ≤ 1/3. co-RP: no false negatives

• ZPP = RP ∩ co-RP. ZPP: no errors

It is a routine exercise that none of these constants matter and that they can be replaced by any 1/poly, although completing that exercise relies on the Chernoff bound which we'll see in a later lecture.
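As an aside on why the constant 1/3 is harmless, here is a small simulation of amplification by independent repetition and majority vote; the toy decider, its error rate, and the repetition count are illustrative choices of mine, not from the notes.

```python
import random

random.seed(0)

def noisy_decider(x):
    """Stand-in for a BPP algorithm: answers whether x is even,
    correctly with probability 2/3."""
    truth = (x % 2 == 0)
    return truth if random.random() < 2 / 3 else not truth

def amplified(x, k):
    """Run the decider k times (k odd) and take a majority vote;
    by Chernoff, the error decays exponentially in k."""
    votes = sum(noisy_decider(x) for _ in range(k))
    return votes > k - votes

trials = 2000
base_err = sum(noisy_decider(4) is not True for _ in range(trials)) / trials
amp_err = sum(amplified(4, 31) is not True for _ in range(trials)) / trials
```

Empirically `base_err` stays near 1/3 while `amp_err`, with only 31 repetitions, drops well below it.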

Exercise: Show that the following are two equivalent characterizations of ZPP: (a) there is a poly-time randomized algorithm that with probability ≥ 1/3 outputs the correct answer, and with the remaining probability halts and outputs "don't know;" (b) there is an expected-poly-time algorithm that always outputs the correct answer.

We have the inclusions shown in Figure 2.1. What is the difference between ZPP and BPP? In BPP, we can never get a definitive answer, no matter how many independent runs of the algorithm we execute. In ZPP, we can, and the expected


Figure 2.1: Some inclusions among complexity classes: P ⊆ ZPP; ZPP ⊆ RP ⊆ NP; ZPP ⊆ co-RP ⊆ co-NP; RP ⊆ BPP and co-RP ⊆ BPP.

time until we get a definitive answer is polynomial; but we cannot be sure of getting the definitive answer in any fixed time bound. Here are the possible outcomes for any single run of each of the basic types of algorithm:

          x ∈ L    x ∉ L
RP        ∈, ∉    ∉
co-RP     ∈       ∈, ∉
BPP       ∈, ∉    ∈, ∉

If L ∈ ZPP, then we can run simultaneously an RP algorithm A and a co-RP algorithm B for L. Ideally, this will soon give us a definitive answer: if both algorithms say "x ∈ L", then A cannot have been wrong, and we are sure that x ∈ L; if both algorithms say "x ∉ L", then B cannot have been wrong, and we are sure that x ∉ L. The expected number of iterations until one of these events happens (whichever is viable) is constant. But, in any particular iteration, we can (whether x ∈ L or x ∉ L) get the indefinite outcome that A says "x ∉ L" and B says "x ∈ L". This might continue for arbitrarily many rounds, which is why we can't make any guarantee about what we'll be able to prove in bounded time.

An algorithm with "BPP"-style two-sided error is often referred to as "Monte Carlo," while a "ZPP"-style algorithm is often referred to as "Las Vegas."

2.1.2 Verifying Matrix Multiplication

It is a familiar theme that verifying a fact may be easier than computing it. Most famously, it is widely conjectured that P ≠ NP. Now we shall see a more down-to-earth example of this phenomenon. In what follows, all matrices are n × n. In order to eliminate some technical issues (mainly numerical precision, also the design of a substitute for uniform sampling), we suppose that the entries of the matrices lie in Z/p, p prime, and that scalar arithmetic can be performed in unit time. (The same method will work for any finite field, and a similar method will work if the entries are integers less than poly(n) in absolute value, so that we can again reasonably sweep the runtime for scalar arithmetic under the rug.) Here are two closely related questions:

1. Given matrices A, B, compute A · B.


2. Given matrices A, B, C, verify whether C = A · B.

The best known algorithm, as of 2014, for the first of these problems runs in time O(n^{2.3728639}) [42]. Resolving the correct lim inf exponent (usually called ω) is a major question in complexity theory. Clearly the second problem is no harder, and a lower bound of Ω(n^2) even for that is obvious since one must read the whole input. Randomness is not known to help with problem (1), but the situation for problem (2) is quite different. I'll use the term co-RP-type in the following theorems only to indicate the acceptance conditions given in the definition, but not the runtime, which we will be giving explicitly.

Theorem 33 (Freivalds [39]). There is a co-RP-type algorithm for the language "C = A · B," running in time O(n^2).

Proof. Note that the obvious procedure for matrix-vector multiplication runs in time O(n^2). The verification algorithm is simple. Select uniformly a vector x ∈ (Z/p)^n. Check whether ABx = Cx without ever multiplying AB: applying associativity, (AB)x = A(Bx), this can be done in just three matrix-vector multiplications. Output "Yes" if the equality holds; output "No" if it fails. Clearly if AB = C, the output will be correct. In order to get the co-RP-type result, it remains to show that Pr(ABx = Cx | AB ≠ C) ≤ 1/2. The event ABx = Cx is equivalently stated as the event that x lies in the right kernel of AB − C. Given that AB ≠ C, that kernel is a strict subspace of (Z/p)^n and therefore of at most half the cardinality of the larger space. Since we select x uniformly, the probability that it is in the kernel is at most 1/2. 2
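A minimal sketch of Freivalds' check follows; the list-of-lists matrix representation and the trial count are my choices. Over Z/p a single trial accepts a wrong C with probability at most 1/p (the kernel has dimension ≤ n − 1), even better than the 1/2 bound in the proof, and independent repetitions multiply the error.

```python
import random

def matvec(M, x, p):
    """Matrix-vector product over Z/p, O(n^2) time."""
    return [sum(a * b for a, b in zip(row, x)) % p for row in M]

def freivalds(A, B, C, p, trials=20):
    """co-RP-type check of C == A*B: a 'False' answer is always
    correct; a 'True' answer is wrong w.p. at most (1/p)^trials."""
    n = len(A)
    for _ in range(trials):
        x = [random.randrange(p) for _ in range(n)]
        # three matrix-vector products instead of one matrix product
        if matvec(A, matvec(B, x, p), p) != matvec(C, x, p):
            return False          # found a witness: C != A*B
    return True
```

Each trial costs O(n^2), versus n^{ω+o(1)} for computing A·B outright.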

2.1.3 Verifying Associativity

Let a set S of size n be given, along with a binary operation ◦ : S × S → S. Thus the input is a table of size n^2; we call the input (S, ◦). The problem we consider is testing whether the operation is associative, that is, whether for all a, b, c ∈ S,

(a ◦ b) ◦ c = a ◦ (b ◦ c)   (2.1)

A triple for which (2.1) fails is said to be a nonassociative triple. No sub-cubic-time deterministic algorithm is known for this problem. However,

Theorem 34 (Rajagopalan & Schulman [79]). There is an O(n^2)-time co-RP-type algorithm for associativity.

Proof. An obvious idea is to replace the O(n^3)-time exhaustive search for a nonassociative triple by randomly sampling triples and checking them. The runtime required is inversely proportional to the fraction of nonassociative triples, so this method would improve on exhaustive search if we were guaranteed that a nonassociative operation had a super-constant number of nonassociative triples. However, for every n ≥ 3 there exist nonassociative operations with only a single nonassociative triple. So we'll have to do something more interesting. Let's define a binary operation (S̄, ◦) on a much bigger set S̄. Define S̄ to be the vector space with basis S over the field Z/2, that is to say, an element x ∈ S̄ is a formal sum

x = ∑_{a∈S} x_a a,   x_a ∈ Z/2


The product of two such elements x, y is

x ◦ y = ∑_{a∈S} ∑_{b∈S} (a ◦ b) x_a y_b = ∑_{c∈S} c ( ⊕_{a,b : a◦b=c} x_a y_b ),

where of course ⊕ denotes sum mod 2. On (S̄, ◦) we have an operation that we do not have on (S, ◦), namely, addition:

x + y = ∑_{a∈S} a (x_a + y_a)

(Those who have seen such constructions before will recognize (S̄, ◦) as an "algebra" of (S, ◦) over Z/2.) The algorithm is now simple: check the associative identity for three random elements of S̄. That is, select x, y, z u.a.r. in S̄. If (x ◦ y) ◦ z = x ◦ (y ◦ z), report that (S, ◦) is associative; otherwise report that it is not associative. The runtime for this process is clearly O(n^2). If (S, ◦) is associative then so is (S̄, ◦), because then (x ◦ y) ◦ z and x ◦ (y ◦ z) have identical expansions as sums. Also, nonassociativity of (S, ◦) implies nonassociativity of (S̄, ◦), by simply considering "basis" vectors within the latter. But this equivalence is not enough. The crux of the argument is the following:

Lemma 35. If (S, ◦) is nonassociative then at least one eighth of the triples (x, y, z) in S̄ are nonassociative triples.

Proof. The proof relies on a variation on the inclusion-exclusion principle. For any triple a, b, c ∈ S, let g(a, b, c) = (a ◦ b) ◦ c − a ◦ (b ◦ c). Note that g is a mapping g : S^3 → S̄. It is somewhere nonzero, because it is nonzero exactly on the nonassociative triples of (S, ◦). Fix (a^0, b^0, c^0) to be such a nonassociative triple. Now extend g to ḡ : S̄^3 → S̄ by:

ḡ(x, y, z) = ∑_{a,b,c} g(a, b, c) x_a y_b z_c

If you imagine the n × n × n cube indexed by S^3, with each position (a, b, c) filled with the entry g(a, b, c), then ḡ(x, y, z) is the sum of the entries in the combinatorial subcube of positions where x_a = 1, y_b = 1, z_c = 1. (We say "combinatorial" only to emphasize that unlike a physical cube, here the slices that participate in the subcube are not in any sense adjacent.) Partition S̄^3 into blocks of eight triples apiece, as follows. Each of these blocks is indexed by a triple x, y, z s.t. x_{a^0} = 0, y_{b^0} = 0, z_{c^0} = 0. The eight triples are (x + ε_1 a^0, y + ε_2 b^0, z + ε_3 c^0) where ε_i ∈ {0, 1}. Now we claim that

∑_{ε_1,ε_2,ε_3} ḡ(x + ε_1 a^0, y + ε_2 b^0, z + ε_3 c^0) = g(a^0, b^0, c^0)   (2.2)

To see this, note that each of the eight terms on the LHS is, as described above, a sum of the entries in a "subcube" of the "S^3 cube." These subcubes are closely related: there is a core subcube whose indicator function is x × y × z, and all entries of this subcube are summed within all eight terms. Then there are additional width-1 pieces: the entries in the region a^0 × y × z occur in four terms, as do the regions x × b^0 × z and x × y × c^0. The entries in the regions a^0 × b^0 × z, a^0 × y × c^0 and x × b^0 × c^0 occur in two terms, and the entry in the region a^0 × b^0 × c^0 occurs in one term. Since all coefficients lie in Z/2, the entries counted an even number of times cancel, which proves (2.2).


Since the RHS of (2.2) is nonzero, so is at least one of the eight terms on the LHS. 2

The algorithmic conclusion is now a corollary: in time O(n^2) we can sample x, y, z u.a.r. in S̄ and determine whether (x ◦ y) ◦ z = x ◦ (y ◦ z). If (S, ◦) is associative, then we get =; if (S, ◦) is nonassociative, we get ≠ with probability ≥ 1/8. 2
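The whole test can be sketched in Python; representing an element of S̄ by the set of basis elements with coefficient 1 is my own convention, as are the helper names and the trial count.

```python
import random

def bar_product(X, Y, table):
    """Multiply two elements of the Z/2 algebra S-bar, represented as
    sets: the coefficient of c in X∘Y is the parity of
    #{(a, b) in X×Y : a◦b = c}.  Runs in O(n^2) time."""
    counts = {}
    for a in X:
        for b in Y:
            c = table[a][b]
            counts[c] = counts.get(c, 0) + 1
    return frozenset(c for c, k in counts.items() if k % 2)

def probably_associative(table, trials=100):
    """Each failing trial certifies nonassociativity; if (S,◦) is
    nonassociative, a single trial misses with probability <= 7/8,
    so `trials` repetitions miss with probability <= (7/8)^trials."""
    n = len(table)
    for _ in range(trials):
        X, Y, Z = (frozenset(a for a in range(n) if random.random() < 0.5)
                   for _ in range(3))
        if bar_product(bar_product(X, Y, table), Z, table) != \
           bar_product(X, bar_product(Y, Z, table), table):
            return False
    return True
```

Note that each trial is O(n^2), while exhaustive search over triples is O(n^3).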


2.2 Lecture 10 (21/Oct): Cont. associativity; perfect matchings, polynomial identity testing

2.2.1 Matchings

A matching in a graph G = (V, E) is a set of vertex-disjoint edges; the size of the matching is the number of edges. Let n = |V| and m = |E|. A perfect matching is one of size n/2. A maximal matching is one to which no edges can be added. A maximum matching is one of largest size. How hard are the problems of finding such objects? It is of course easy to find a maximal matching—sequentially. On the other hand, finding one on a parallel computer is a much more interesting problem, which I hope to return to later in the course. Returning to sequential computation: finding a maximum matching, or deciding whether a perfect matching exists, are interesting problems. In bipartite graphs, Hall's theorem and the augmenting path method give very nice and accessible deterministic algorithms for maximum matching. In general graphs the problem is harder, but there are deterministic algorithms running in time O(m√n) [72, 41].

2.2.2 Bipartite perfect matching: deciding existence

The first problem we focus on here is to decide whether a bipartite graph has a perfect matching. As noted, there are nice deterministic algorithms for this problem, but the randomized one is even simpler. Write G = (V_1, V_2, E) with E ⊆ V_1 × V_2. Form the V_1 × V_2 "variable" matrix A which has A_{ij} = x_{ij} if {i, j} ∈ E, and otherwise A_{ij} = 0.

Let q be some prime power and consider the x_{ij} as variables over GF(q). The determinant of A, then, is a polynomial in the variables x_{ij}. Before launching into this, a word on a subtle point: what does it mean for a (multivariate) polynomial p to be nonzero? Consider a polynomial over any field κ, which is to say, the coefficients of all the monomials in the polynomial lie in κ.

Definition 36. We consider a polynomial nonzero if some monomial has a nonzero coefficient.

A stronger condition, which is not the definition we adopt, is that p(x) ≠ 0 for some x ∈ κ^m. Of course this implies the condition in the definition; but it is strictly stronger, as we can see from the example of the polynomial x^2 + x over the field Z/2. However, the conditions are equivalent in the following two cases:

1. Over infinite fields, e.g., R, Q, or any algebraically closed field. This will follow from Lemma 42.

2. For multilinear polynomials. (This applies in particular to Det(A) which we are considering now.)

Specifically for the multilinear case we have:

Lemma 37. Let p(x_1, ..., x_m) be a nonzero multilinear polynomial over field κ. Then there is a setting of the x_i to values c_i in κ s.t. p(c_1, ..., c_m) ≠ 0.


Proof. Every monomial is associated with a set of variables; choose a monomial whose set is minimal under inclusion. (E.g., if there is a constant term, then the empty set.) Assign the value 1 to all variables in this set, and 0 to all variables outside it. Only the chosen monomial can be nonzero, so p ≠ 0 for this assignment. 2
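The proof is constructive, and can be sketched as follows; the dict-of-monomials input format is my own convention, not from the notes.

```python
def nonzero_witness(monomials):
    """Given a nonzero multilinear polynomial as a dict mapping
    frozensets of variables to nonzero coefficients, return a 0/1
    assignment on which it is nonzero, plus the value attained.
    Strategy of Lemma 37: pick a monomial whose variable set is
    minimal under inclusion, set those variables to 1, the rest to 0."""
    sets = list(monomials)
    minimal = next(S for S in sets if not any(T < S for T in sets))
    assignment = {v: 1 for v in minimal}
    # Only monomials T ⊆ minimal survive the assignment; by minimality
    # that is the chosen monomial alone, so the value is its coefficient.
    value = sum(c for T, c in monomials.items() if T <= minimal)
    return assignment, value

# p(x1, x2, x3) = x1*x2 - x3, a nonzero multilinear polynomial
p = {frozenset({1, 2}): 1, frozenset({3}): -1}
assignment, value = nonzero_witness(p)
```

Here the monomial x1*x2 is inclusion-minimal, so the witness sets x1 = x2 = 1, x3 = 0.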

Lemma 38. G has a perfect matching iff Det(A) ≠ 0.

(This is the “baby” version of a result of Tutte that we will see later in Theorem 43.)

Proof. Every monomial in the expansion of the determinant corresponds to a permutation. A permutation is simply a pattern of 1's hitting each row and column exactly once, namely, a perfect matching in the bipartite graph. Conversely, if some perfect matching is present, it puts a monomial in the determinant with coefficient either 1 or −1. 2

Corollary 39. Fix any field κ. G has a perfect matching iff there is an assignment of the variables in A such that the determinant is nonzero.

Proof. Apply Lemma 37 to Lemma 38. 2

This suggests the following exceptionally simple algorithm: compute the polynomial and see if it is nonzero. There's a problem with this idea! The determinant has exponentially many monomials. This is not a problem for computing determinants over a ring such as the integers, because even the sum of exponentially many integers only has polynomially more bits than the largest of those integers has. However, in this ring of multivariate polynomials (i.e., the ring κ[{x_{ij}}] where κ = Q or κ = GF(q); for the moment it doesn't matter), there are exponentially many distinct terms to keep track of if you want to write the polynomial out as a sum of monomials. Of course the determinant has a more concise representation (namely, as "Det(A)"), but we do not know how to efficiently convert that to any representation that displays transparently whether the polynomial is the 0 polynomial. So we modify the original suggestion. Since we do know how to efficiently compute determinants of scalar matrices, let's substitute scalar values for the x_{ij}'s. What values should we use? Random ones.

Revised Algorithm: Sample the x_{ij}'s u.a.r. in GF(q); call the sampled matrix A_R. Compute Det(A_R); report "G has/hasn't a perfect matching" according to whether Det(A_R) ≠ 0 or = 0.

Figure 2.2: A commutative square: substituting scalars into Det(variables) and then evaluating gives the same value of Det as first expanding Det(variables) into monomials and then substituting. This diagram commutes, but for a fast commute, go clockwise.

Clearly the algorithm answers correctly if there is no perfect matching, and it is fast (see Fig. 2.2). What needs to be shown is that the probability of error is small if there is a perfect matching (and q is large enough). So this is an RP-type algorithm for "G has a perfect matching."

Theorem 40. The algorithm is error-free on bipartite graphs lacking a perfect matching, and the probability of error of the algorithm on bipartite graphs which have a perfect matching is at most n/q. The runtime of the algorithm is n^{ω+o(1)}, where ω is the matrix multiplication exponent.


All we have to do, then, is use a prime power q ≥ 2n in order to have error probability ≤ 1/2. Incidentally, there is always a prime 2n ≤ q < 4n; this is called "Bertrand's postulate." This fact alone isn't quite strong enough if we want to find a prime in the right size range efficiently, but that too can be done, in a slightly larger range. (The density of primes of this size is about 1/log n, so we don't have to try many values before we should get lucky; and note, primality testing has efficient algorithms in ZPP and even somewhat less efficient algorithms in P [3].) However, we don't have to work this hard, since we're satisfied with prime powers rather than primes. We can simply use the first power of 2 after 2n. We will prove Theorem 40 after introducing a generally useful tool about polynomial identity testing.
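A sketch of the revised algorithm follows. I use a fixed prime modulus and plain O(n^3) Gaussian elimination as a stand-in for the n^{ω+o(1)} determinant computation of Theorem 40; all names are mine.

```python
import random

def det_mod_p(M, p):
    """Determinant over Z/p by Gaussian elimination, O(n^3) time."""
    M = [[x % p for x in row] for row in M]
    n = len(M)
    det = 1
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return 0
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            det = (-det) % p
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)      # Fermat inverse; p prime
        for r in range(col + 1, n):
            f = M[r][col] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[col])]
    return det

def bipartite_matching_test(n, edges, p=4099, trials=20):
    """RP-type test: substitute u.a.r. scalars for the x_ij in the
    variable matrix.  A 'True' answer is always right; a 'False'
    answer is wrong with probability <= (n/p) per trial."""
    for _ in range(trials):
        A = [[random.randrange(p) if (i, j) in edges else 0
              for j in range(n)] for i in range(n)]
        if det_mod_p(A, p) != 0:
            return True
    return False
```

Here 4099 is a prime comfortably larger than 2n for small test graphs.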


2.3 Lecture 11 (23/Oct): Cont. perfect matchings; polynomial iden- tity testing

2.3.1 Polynomial identity testing

In the previous section we saw that testing for existence of a perfect matching in a bipartite graph can be cast as a special case of the following problem. We are given a polynomial p(x), of total degree n, in variables x = (x_1, ..., x_m), m ≥ 1. (The total degree of a monomial is the sum of the degrees of the variables in it; the total degree of a polynomial is the greatest total degree of its monomials.) We are agnostic as to how we are "given" the polynomial, and demand only that we be able to quickly evaluate it at any scalar assignment to the variables. We wish to test whether the polynomial is identically 0, and our procedure for doing so is to evaluate it at a random point and report "yes" if the value there is 0. We rely on the following lemma. Let z(p) be the set of roots (zeros) of a polynomial p.

Lemma 41. Let p be a nonzero polynomial over GF(q), of total degree n in m variables. Then |z(p)| ≤ n q^{m−1}.

As a fraction, this is saying that |z(p)|/q^m ≤ n/q, and in this form the lemma immediately implies Theorem 40. The univariate case of the lemma is probably familiar to you. The lemma is a special case of the following more general statement, which holds for any, even infinite, field κ.

Lemma 42. Let p be a nonzero polynomial over a field κ, of total degree n in variables x_1, ..., x_m. Let S_1, ..., S_m be subsets of κ with |S_i| ≤ s for all i. Then |z(p) ∩ (S_1 × ... × S_m)| ≤ s^{m−1} n.

This is usually known as the Schwartz-Zippel lemma [84, 97], although the results in these two publications were not precisely equivalent, and there was another discovery around the same period by DeMillo and Lipton [29], and all these were preceded by Ore [66]. A generalization beyond polynomials is due to Gonnet [44]. (Recall the two candidate definitions of what it means for a polynomial to be nonzero. Since in Definition 36 we chose the weaker condition, Lemma 42 is stronger than it would have been had we chosen the stronger condition.)

Proof of Lemma 42: The lemma is trivial if n ≥ s, so suppose n < s. First consider the univariate case, m = 1. (In this case the two lemmas are identical since any set S_1 is a product set.) This follows by induction on n, because if n ≥ 1 and p(α) = 0, then p can be factored as p(x) = (x − α) · q(x) for some q of degree n − 1. (To see this, change variables to y = x − α. In the new variable the polynomial has no constant term, so the factor y = x − α can be pulled out.)

Next we handle m > 1 by induction. If x_1 does not appear in p then the conclusion follows from the case m − 1. Otherwise, write p in the form p(x) = ∑_{i=0}^{n} x_1^i p_i(x_2, ..., x_m), and let 0 < i ≤ n be largest such that p_i ≠ 0. The degree of p_i is at most n − i, so by induction,

|z(p_i) ∩ (S_2 × ... × S_m)| / s^{m−1} ≤ (n − i)/s

Let r be the LHS, i.e., the fraction of S_2 × ... × S_m that are roots of p_i.

For (x_2, ..., x_m) ∈ z(p_i) we allow as a worst case that all choices of x_1 ∈ S_1 yield a zero of p.


For (x_2, ..., x_m) ∉ z(p_i), p restricts to a nonzero polynomial of degree i in the variable x_1, so by the case m = 1,

|z(p) ∩ (S_1 × x_2 × ... × x_m)| / s ≤ i/s

Since i/s ≤ n/s < 1, the weighted average of our two bounds (on the fraction of roots in sets of the form S_1 × x_2 × ... × x_m) is worst when r is as large as possible. Thus

|z(p) ∩ (S_1 × ... × S_m)| / s^m ≤ r · 1 + (1 − r) · i/s
    ≤ (n − i)/s · 1 + (1 − (n − i)/s) · i/s
    = n/s − i(n − i)/s^2
    ≤ n/s. 2

Comment: This lemma gives us an efficient randomized way of testing whether a polynomial is identically zero, and naturally, people have wondered whether there might be an efficient deterministic algorithm for the same task. So far, no such algorithm has been found, and it is known that any such algorithm would have hardness implications in complexity theory that are currently out of reach [54]¹.
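As a black box, the test looks like this; the toy polynomials f, g, h, the prime p, and the trial count are illustrative choices of mine, not from the notes.

```python
import random

def pit_agree(f, g, m, p, trials=10):
    """Schwartz-Zippel identity test over Z/p: if f - g is a nonzero
    polynomial of total degree n, each random evaluation catches the
    difference with probability >= 1 - n/p."""
    for _ in range(trials):
        x = [random.randrange(p) for _ in range(m)]
        if f(x, p) != g(x, p):
            return False          # provably different polynomials
    return True

p = 10007
f = lambda v, q: pow(v[0] + v[1], 2, q)                        # (x+y)^2
g = lambda v, q: (v[0] * v[0] + 2 * v[0] * v[1] + v[1] * v[1]) % q
h = lambda v, q: (v[0] * v[0] + v[0] * v[1] + v[1] * v[1]) % q  # off by xy
```

Note that f and g are the same polynomial presented differently, while f − h = xy is nonzero of total degree 2, so a random point separates them with probability ≥ 1 − 2/p.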

2.3.2 Deciding existence of a perfect matching in a graph

Bipartite graphs were handled last time. Now we consider general graphs. Deterministically, deciding the existence of a perfect matching in a general graph is harder than the same problem in a bipartite graph. (As noted, we have poly-time algorithms, but not nearly so simple ones.) With randomization, however, we can adapt the same approach to work with almost equal efficiency. We must define the Tutte matrix of a graph G = (V, E). Order the vertices arbitrarily from 1, . . . , n and set

T_{ij} = 0 if {i, j} ∉ E;  T_{ij} = x_{ij} if {i, j} ∈ E and i < j;  T_{ij} = −x_{ji} if {i, j} ∈ E and i > j.

Theorem 43 (Tutte [92]). Det(T) ≠ 0 iff G has a perfect matching.

This determinant is not multilinear in the variables, so the lemma from last time does not apply.

Proof. If G has a perfect matching, assign x_{ij} = 1 for edges in the matching, and 0 otherwise. Each matching edge {i, j} describes a transposition of the vertices i, j. With this assignment every row and column has a single nonzero entry corresponding to the matching edge it is part of, so the matrix is the permutation matrix (with some signs) of the involution that transposes the vertices on each edge. Since a transposition has sign −1 and there is a single −1 in each pair of nonzero entries, the contribution of each transposition to the determinant is 1, and overall we have Det(T) = 1. Conversely, suppose Det(T) ≠ 0 as a polynomial. Consider the determinant as a signed sum over permutations. The net contribution to the determinant from all permutations having an odd cycle

¹Specifically: If one can test in polynomial time whether a given arithmetic circuit over the integers computes the zero polynomial, then either (i) NEXP ⊈ P/poly or (ii) the Permanent is not computable by polynomial-size arithmetic circuits.


is 0, for the following reason. In each such permutation identify the "least" odd cycle by some criterion, e.g., ordering the cycles by their least-indexed vertex. Then flip the direction of the least odd cycle. This map is an involution on the set of permutations. It carries the permutation to another, which contributes the opposite sign to the determinant, since the signs of all edges in the cycle flipped. (Figure 2.3.)

 ......   ......   ......   ......       ...... x ...  vs.  ...... x  (2.3)  34   35   ...... x45   ...... −x34 ...  ...... −x35 ...... −x45

Figure 2.3: Flipping a cycle among vertices 3, 4, 5. Preserves permutation sign; reverses signs of cycle variables.

Therefore there are permutations of the vertices, supported by T (i.e., each vertex is mapped to a destination along one of the edges incident to it, that is, π(i) = j ⇒ T_{ij} ≠ 0), having only even cycles. The even cycles of length 2 are matching edges, and in any even cycle of length greater than 2, we can use every alternate edge; altogether we obtain a perfect matching. See Figure 2.4 for a graph having perfect matchings, and two permutations from which we can read off perfect matchings. 2

The Tutte matrix of the example graph (edges {1,2}, {1,3}, {1,4}, {2,3}, {3,4}):

T = [ 0, x_{12}, x_{13}, x_{14} ; −x_{12}, 0, x_{23}, 0 ; −x_{13}, −x_{23}, 0, x_{34} ; −x_{14}, 0, −x_{34}, 0 ]   (2.4)

The two permutation patterns (∗ marking the entries used): the involution (12)(34) uses positions (1,2), (2,1), (3,4), (4,3); the 4-cycle (1234) uses positions (1,2), (2,3), (3,4), (4,1).   (2.5)

Figure 2.4: A graph and its Tutte matrix with two different permutations π from which we can read off perfect matchings: the involution (12)(34) and the 4-cycle (1234).

In exactly the same way as for the bipartite case, this yields:

Theorem 44. The algorithm to determine existence of a perfect matching in a graph on n vertices is error-free on graphs lacking a perfect matching, and the probability of error of the algorithm on graphs which have a perfect matching is at most n/q. The runtime of the algorithm is n^{ω+o(1)}, where ω is the matrix multiplication exponent.
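The same random substitution works verbatim on the Tutte matrix. A sketch, again using a prime modulus and cubic-time elimination as a stand-in for fast linear algebra; all names are mine.

```python
import random

def det_mod_p(M, p):
    """Determinant over Z/p via Gaussian elimination (O(n^3))."""
    M = [[x % p for x in row] for row in M]
    n = len(M)
    det = 1
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c]), None)
        if piv is None:
            return 0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = (-det) % p
        det = det * M[c][c] % p
        inv = pow(M[c][c], p - 2, p)          # Fermat inverse; p prime
        for r in range(c + 1, n):
            f = M[r][c] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[c])]
    return det

def tutte_matching_test(n, edges, p=4099, trials=20):
    """Substitute random scalars into the Tutte matrix: T[i][j] = x,
    T[j][i] = -x for each edge {i,j} with i < j.  A 'True' answer is
    always right; a 'False' answer errs w.p. <= n/p per trial."""
    for _ in range(trials):
        T = [[0] * n for _ in range(n)]
        for (i, j) in edges:                  # edges given with i < j
            x = random.randrange(1, p)
            T[i][j], T[j][i] = x, p - x
        if det_mod_p(T, p) != 0:
            return True
    return False
```

A triangle (odd number of vertices) can never have a perfect matching, and indeed its skew-symmetric matrix has determinant identically 0.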


2.4 Lecture 12 (26/Oct): Parallel computation: finding perfect matchings in general graphs. Isolating lemma.

2.4.1 Parallel computation

By self-reducibility the decision algorithm above immediately yields an Õ(n^{ω+2})-time algorithm for finding a perfect matching. (Remove an edge, see if there is still a perfect matching without it, . . . ) There are two major processes at work in the above algorithm: determinant computations, and sequential branching used in the self-reducibility argument. As we discuss in a moment, the linear algebra can be parallelized. But the branching is inherently sequential. Nevertheless, we will shortly see a completely different algorithm that avoids this sequential branching. In this way we'll put the problem of finding a perfect matching in RNC.

NC^i = problems solvable deterministically by poly(n) processors in log^i n time. Equivalently, by poly(n)-size, log^i n-depth circuits. (We add that the circuits must be "log-space uniform," that is, there must be a Turing machine which, on input string 1^n presented on a read-only tape, uses only an O(log n)-length read-write tape as its workspace, and outputs on a write-only tape the circuit in full. This "uniformity" requirement keeps us from artificially hiding complex behavior in the sequence of circuits for different values of n. For example, without this kind of requirement, your complexity class would include some uncomputable functions.)

NC = ∪_i NC^i. RNC^i and RNC are defined the same way, except that the processors (gates) may use random bits. (We are glossing over the error model, i.e., Las Vegas, Monte Carlo etc., and will just mention that explicitly when relevant.)

2.4.2 Sequential and parallel linear algebra

In sequential computation, there are reductions showing that matrix inversion and multiplication have the same time complexity (up to a factor of log n), and that determinant is no harder than these. In parallel computation, the picture is actually a little simpler. Matrix multiplication is in NC^1 (right from the product definition, since we can use a tree of depth log n to sum the n terms of a row-column inner product). Matrix inversion and determinant are in NC^2, due to Csanky [27] (over fields of characteristic 0) (and using fewer processors in RNC^2 by Pan [77]); the problem is also in NC over general fields [13, 23]. Csanky's algorithm builds on the result of Valiant, Skyum, Berkowitz and Rackoff [93] that any deterministic sequential algorithm computing a polynomial of degree d in time m can be converted to a deterministic parallel algorithm computing the same polynomial in time O((log d)(log d + log m)) using O(d^6 m^3) processors. For a good explanation of Csanky's algorithm see [63] §31, and for more on parallel algorithms see [65].

2.4.3 Finding perfect matchings in general graphs, in parallel. The Isolating Lemma

We now develop a randomized method of Mulmuley, U. Vazirani and V. Vazirani [74] to find a perfect matching if one exists. A polynomial time algorithm is implied by the previous testing


method along with self-reducibility of the perfect matching decision problem. However, with the following method we can solve the same problem in parallel, that is to say, in polylog depth on polynomially many processors. This is not actually the first RNC algorithm for this task—that is an RNC^3 method [58]—but it is the "most parallel" since it solves the problem in RNC^2. A slight variant of the method yields a minimum weight perfect matching in a weighted graph that has "small" weights, that is, integer weights represented in unary; and there is a fairly standard reduction from the problem of finding a maximum matching to finding a minimum weight perfect matching in a graph with weights in {0, 1}. So through this method we can actually find a maximum matching in a general graph, with a similar total amount of work.

There are really two key ingredients to this algorithm. The first, which we have already noted, is that all basic linear algebra problems can be solved in NC^2. The second ingredient, which will be our focus, is the following lemma. First some notation. Let A = {a_1, a_2, ..., a_m} be a finite set. If a_1, ..., a_m are assigned weights w_1, ..., w_m, the weight of a set S ⊆ A is defined to be w(S) = ∑_{a_i ∈ S} w_i. Let S = {S_1, ..., S_k} be a collection of subsets of A. Let

min(S, w) = {S ∈ S : ∀T ∈ S, w(S) ≤ w(T)}

be the collection of those S ∈ S of least weight. We are interested in the event that the least weight is uniquely attained, i.e., the event that |min(S, w)| = 1.

Lemma 45 ([74] Isolating Lemma, based on improved version in [90]). Let A = [m] and let the weights w_1, ..., w_m be independent random variables, each w_i being sampled uniformly among values u_i(1) < ... < u_i(r_i), u_i(j) ∈ R, r_i ≥ r ≥ 2. Then

Pr(|min(S, w)| = 1) ≥ (1 − 1/r)^m.   (2.6)

This lemma is remarkable because of the absence of a dependence on k, the size of the family, in the conclusion.

Proof. To simplify notation we give the proof only for the "hardest" case that all r_i = r. Think of u as the mapping (∏ u_i) : [r]^m → R^m. Let V = [r]^A and V' = {2, . . . , r}^A. If v ∈ V then the composition u ◦ v : A → R is a weight function on A, and if v ∈ V' then this weight function avoids using the weights u_i(1). Note |V'|/|V| = ∏_1^m (1 − 1/r) = (1 − 1/r)^m.

Given v ∈ V', fix any set T ∈ min(S, u ◦ v) of largest cardinality. Define

φ : V' → V,  v ↦ φv,  where

φv(i) = v(i) − 1 if i ∈ T;  φv(i) = v(i) otherwise.

We claim (a) that min(S, u ◦ φv) = {T} and (b) that φ is a bijection onto its image. It will follow that the set of weight functions {u ◦ φv : v ∈ V'} is at least a fraction (1 − 1/r)^m of all weight functions, and moreover that for each weight function in this set, the minimum weight is uniquely achieved.

(a) Observe that for any S ∈ S,

(u ◦ v)(S) − (u ◦ φv)(S) = ∑_{i ∈ S∩T} ( u_i(v(i)) − u_i(v(i) − 1) ),


with every summand on the RHS being positive. In particular (u ◦ v)(T) − (u ◦ φv)(T) is the largest change in weight possible for any S, and is achieved by S only if T ⊆ S. Therefore every set S ∈ S other than T either started out with the same weight but had its weight reduced by strictly less than T did (since T was chosen of largest cardinality among the minimizers, no other minimizer contains T); or started out with strictly greater weight and had its weight reduced by at most as much as T was. So min(S, u ◦ φv) = {T} as desired.

(b) Consequently T can be identified as the unique element of min(S, u ◦ φv). So φ can be inverted. That is, for any w ∈ Image φ there is a unique v ∈ V' such that φv = w. Thus φ is a bijection. 2
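The lemma can be checked empirically; in this sketch the family of subsets, the weight range, and the trial count are arbitrary choices of mine, not from the notes.

```python
import random

def num_minimizers(family, w):
    """Number of sets in `family` attaining the minimum weight."""
    weights = [sum(w[a] for a in S) for S in family]
    return weights.count(min(weights))

random.seed(3)
m, r = 6, 12          # ground set [m]; weights u.a.r. in {1, ..., r}
family = [frozenset(S) for S in
          [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {0, 5},
           {0, 2, 4}, {1, 3, 5}]]
trials = 2000
unique = sum(num_minimizers(family, [random.randrange(1, r + 1)
                                     for _ in range(m)]) == 1
             for _ in range(trials))
freq = unique / trials   # the lemma guarantees >= (1 - 1/r)^m ~ 0.593
```

Note the guarantee does not depend on the number of sets in the family, only on the ground set size m and the weight range r.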


2.5 Lecture 13 (28/Oct): Finding a perfect matching, in RNC

Now we describe the algorithm to find a perfect matching (or report that probably none exists) in a graph G = (V, E) with n = |V|, m = |E|.

For every {i, j} ∈ E pick an integer weight w_{ij} iid uniformly in {1, . . . , 2m}. By the isolating lemma, if G has any perfect matchings, then with probability at least (1 − 1/(2m))^m ≥ 1/2 it obtains a unique minimum weight perfect matching. Define the matrix T by:

T_{ij} = 0 if {i, j} ∉ E;  T_{ij} = 2^{w_{ij}} if {i, j} ∈ E, i < j;  T_{ij} = −2^{w_{ji}} if {i, j} ∈ E, i > j.   (2.7)

This is an instantiation of the Tutte matrix, with x_ij = 2^{w_ij}.

Lemma 46. If G has a unique minimum weight perfect matching (call it M, and its weight w(M)) then Det(T) ≠ 0; moreover, Det(T)/2^{2w(M)} is an odd integer.

Proof of Lemma: As before we look at the contributions to Det(T) of all the permutations π that are supported by edges of the graph. The contributions from permutations having odd cycles cancel out, because this is a special case of a Tutte matrix. It remains to consider permutations π that have only even cycles.

• If π consists of transpositions along the edges of M then it contributes 2^{2w(M)}.

• If π has only even cycles, but does not correspond to M, then:

– If π is some other perfect matching M′ of weight w(M′) > w(M) then it contributes 2^{2w(M′)}.
– If π has only even cycles and at least one of them is of length ≥ 4, then by separating each cycle into a pair of matchings on the vertices of that cycle, π is decomposed into two matchings M_1 ≠ M_2 of weights w(M_1), w(M_2), so π contributes ±2^{w(M_1)+w(M_2)}. Because of the uniqueness of M, not both of M_1 and M_2 can achieve weight w(M), so w(M_1) + w(M_2) > 2w(M). □

Now let T̂_ij be the (i, j)-deleted minor of T (the matrix obtained by removing the i'th row and j'th column from T), and set

m_ij = ∑_{π: π(i)=j} sign(π) ∏_{k=1}^n T_{k,π(k)}   (2.8)
     = ±2^{w_ij} Det(T̂_ij).

Lemma 47. For every {i, j} ∈ E:

1. The total contribution to m_ij of permutations π having odd cycles is 0.
2. If there is a unique minimum weight perfect matching M, then:
   (a) If {i, j} ∈ M then m_ij/2^{2w(M)} is odd.
   (b) If {i, j} ∉ M then m_ij/2^{2w(M)} is even.

Proof of Lemma: This is much like our argument for Det(T), but localized.


1. If π has an odd cycle then it has an even number of odd cycles, and hence an odd cycle not containing vertex i. If π(i) = j, i.e., π contributes to m_ij, then an odd cycle not containing i also does not contain j. Pick the "first" odd cycle that does not contain vertex i and flip it to obtain a permutation π^r. Note that (π^r)^r = π. The contribution of π^r to m_ij is the negation of the contribution of π to m_ij, because we have replaced an odd number of entries of the Tutte matrix by the same entries with flipped signs.

2. By the preceding argument, whether or not {i, j} ∈ M, we need only consider permutations containing solely even cycles. Just as argued for Lemma 46, the contribution of every such permutation π can be written as ±2^{w(M_1)+w(M_2)}, where M_1 and M_2 are two perfect matchings obtained as follows: each transposition (i′, j′) in π puts the edge {i′, j′} into both of the matchings; each even cycle of length ≥ 4 is broken alternately into two matchings, one of which (arbitrarily) is put into M_1 and the other into M_2.

The only case in which there is a term for which w(M_1) + w(M_2) = 2w(M) is the single case that {i, j} ∈ M and π consists entirely of transpositions along the edges of M. In every other case, at least one of M_1 or M_2 is distinct from M, and therefore w(M_1) + w(M_2) > 2w(M). The lemma follows. □

Finally we collect all the elements necessary to describe the algorithm:

1. Generate the weights w_ij uniformly in {1, . . . , 2m}.

2. Define T as in Eqn (2.7), compute its determinant (over Z) and, if T is nonsingular, invert it. (Otherwise, start over.) This determinant computation and the inversion can be done (deterministically) in depth O(log² n) as discussed earlier.

3. Determine w(M) by factoring the greatest power of 2 out of Det(T).

4. Obtain the values ±m_ij from the equations m_ij = ±2^{w_ij} Det(T̂_ij) and

   Det(T̂_ij) = (−1)^{i+j} (T^{−1})_{ji} Det(T)   (Cramer's rule).

   If m_ij/2^{2w(M)} is odd then place {i, j} in the matching.

5. Check whether this defines a perfect matching. This is guaranteed if the minimum weight perfect matching is unique. If a perfect matching was not obtained (which will occur for sure if there is no perfect matching, but with probability ≤ 1/2 if there is one), generate new weights and repeat the process.

Of course, if the graph has a perfect matching, the probability of incurring k repetitions without success is bounded by 2^{−k}, and the expected number of repetitions until success is at most 2.

The simultaneous computation of all the m_ij's, via the single matrix inversion in step 2, is key to the efficiency of this procedure. The entries of T are integers bounded by ±2^{2m}. Pan's RNC² matrix inversion algorithm will compute T^{−1} using O(n^{3.5} m) processors.

For the maximum matching problem, we use a simple reduction: assign weights to each of the non-edges too, but sample those weights uniformly from {2mn + 1, . . . , 2mn + 2m} (rather than {1, . . . , 2m} like the graph edges). Then any minimum weight perfect matching must use the minimum possible number of originally-non-edges. The cost of this reduction is that the integers in the matrix now use O(mn) rather than O(m) bits, so the number of processors used by the maximum matching algorithm is O(n^{4.5} m).
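The whole pipeline above can be sketched in code. The following toy implementation is a sequential sketch, not the parallel RNC version: the function names are ours, and exact integer determinants are computed by fraction-free (Bareiss) elimination instead of Pan's parallel inversion. It follows steps 1–5, using the parity test m_ij/2^{2w(M)} odd to select the matching edges.

```python
import random

def det_bareiss(M):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    M = [row[:] for row in M]
    n = len(M)
    sign, prev = 1, 1
    for k in range(n - 1):
        if M[k][k] == 0:  # pivot: find a row below with a nonzero entry
            for r in range(k + 1, n):
                if M[r][k] != 0:
                    M[k], M[r] = M[r], M[k]
                    sign = -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # division is exact: each entry is a minor of the input
                M[i][j] = (M[i][j] * M[k][k] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = M[k][k]
    return sign * M[n - 1][n - 1]

def find_perfect_matching(n, edge_list, tries=50):
    """Sketch of the isolating-lemma matching algorithm (steps 1-5)."""
    edges = [tuple(sorted(e)) for e in set(map(frozenset, edge_list))]
    m = len(edges)
    for _ in range(tries):
        w = {e: random.randint(1, 2 * m) for e in edges}  # step 1
        T = [[0] * n for _ in range(n)]                   # step 2, Eqn (2.7)
        for (i, j) in edges:
            T[i][j] = 2 ** w[(i, j)]
            T[j][i] = -(2 ** w[(i, j)])
        d = det_bareiss(T)
        if d == 0:
            continue  # no PM, or unlucky weights; resample
        two_wM = (d & -d).bit_length() - 1  # step 3: 2w(M) = exponent of 2
        M = []
        for (i, j) in edges:                # step 4, via minors directly
            minor = [[T[r][c] for c in range(n) if c != j]
                     for r in range(n) if r != i]
            mij = 2 ** w[(i, j)] * det_bareiss(minor)  # = +-m_ij
            if (abs(mij) >> two_wM) & 1:    # m_ij / 2^{2w(M)} odd
                M.append((i, j))
        covered = [v for e in M for v in e]  # step 5: verify
        if len(M) == n // 2 and len(set(covered)) == n:
            return M
    return None  # probably no perfect matching
```

On the 4-cycle (edges {0,1}, {1,2}, {2,3}, {3,0}) this returns one of the two perfect matchings; on a graph with an odd number of vertices Det(T) = 0 always (odd-order skew-symmetric matrix), so it returns None.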

Chapter 3

Concentration of Measure

3.1 Lecture 14 (30/Oct): Independent rvs: data processing, Chernoff bound, applications

3.1.1 Two facts about independent rvs

Lemma 48. If X_1, . . . , X_n are independent real rvs with finite expectations (recall this assumption requires that the defining integrals converge absolutely), then

E(∏ Xi) = ∏ E(Xi).

This is a consequence of the fact that the probabilities of independent events multiply; one only has to be careful about the measure theory. It is enough to consider the case n = 2 and proceed by induction. The calculation is as follows, with µ being the underlying probability measure; justifying the moves between double and single integrals is the content of the Fubini Theorem, which we skip here.

E(XY) = ∫ XY dµ
      = ∫_R ∫_R xy Pr(X = x ∧ Y = y) dy dx
      = ∫_R x Pr(X = x) ∫_R y Pr(Y = y | X = x) dy dx
      = ∫_R x Pr(X = x) ∫_R y Pr(Y = y) dy dx
      = ∫_R x Pr(X = x) E(Y) dx
      = E(Y) ∫_R x Pr(X = x) dx
      = E(X) E(Y).

The following lemma is an immediate consequence of the definitions in Sec. 1.2:

Lemma 49. If f_1, . . . are measurable functions and X_1, . . . are independent rvs then f_1(X_1), . . . are independent rvs.


This lemma is a special case of what is known in information theory as the data processing inequality, which says that for any rvs X, Y and any measurable function f, I(f(X); Y) ≤ I(X; Y), where I is "mutual information." We will not define that quantity right now though.

3.1.2 Chernoff bound for uniform Bernoulli rvs (symmetric random walk)

The Chernoff bound¹ will be one of two ways in which we'll display the concentration of measure phenomenon, the other being the central limit theorem. In the types of problems we'll be looking at, the Chernoff bound is the more frequently useful of the two, but they're closely related.

Let’s begin with the special case of iid fair coins, aka iid uniform Bernoulli rvs: P(Xi = 1) = 1/2, P(Xi = 0) = 1/2. Put another way, we have n independent events, each of which occurs with probability 1/2. We want an exponential tail bound on the probability that significantly more than half the events occur. This very short argument is the seed of more general or stronger bounds that we will see later.

It will be convenient to use the rvs Y_i = 2X_i − 1, where X_i is the indicator rv of the ith event. This shift lets us work with mean-0 rvs, and it leaves the Y_i independent; that follows from Lemma 49.

Theorem 50. Let Y_1, . . . , Y_n be iid rvs, with Pr(Y_i = −1) = Pr(Y_i = 1) = 1/2. Let Y = ∑_1^n Y_i. Then for any λ > 0,

Pr(Y > λ√n) < e^{−λ²/2}.

The significance of √n here is that it is the standard deviation of Y (i.e., √Var(Y)), because (a) Var(Y_i) = E(Y_i²) = 1 (easy), and (b):

Exercise:² If Z_1, . . . , Z_n are independent real rvs with well defined first and second moments, then

Var(∑ Z_i) = ∑ Var(Z_i).   (3.1)

Consequently, the Chebyshev bound, Lemma 19, suggests that √n is about where we should start to get a meaningful deviation bound.

Proof. Fix any α > 0. Exercise:³

E(e^{αY_i}) = cosh α ≤ e^{α²/2}.

By independence of the rvs e^{αY_i},

E(e^{αY}) = ∏ E(e^{αY_i}) ≤ e^{nα²/2}.

Pr(Y > λ√n) = Pr(e^{αY} > e^{αλ√n})
 ≤ E(e^{αY}) / e^{αλ√n}   (Markov ineq.)
 ≤ e^{nα²/2 − αλ√n}

We now optimize this bound by making the choice α = λ/√n, and obtain:

Pr(Y > λ√n) ≤ e^{−λ²/2}.  □

¹First due to Bernstein [16, 17, 14], but we follow the standard naming convention in Computer Science.
²Pairwise independence is enough; we'll soon get to this.
³For k ≥ 0, (2k)! = ∏_1^k i(k + i) ≥ 2^k k!, so for any real x, e^{x²/2} = ∑_{k≥0} x^{2k}/(2^k k!) ≥ ∑_{k≥0} x^{2k}/(2k)! = cosh x.
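A quick Monte Carlo sanity check of Theorem 50 (illustrative only; the particular n, λ and trial count are arbitrary choices of ours):

```python
import math
import random

def tail_estimate(n, lam, trials=20000, rng=random.Random(1)):
    """Estimate Pr(Y > lam*sqrt(n)) for Y a sum of n uniform +-1 steps."""
    thresh = lam * math.sqrt(n)
    hits = sum(sum(rng.choice((-1, 1)) for _ in range(n)) > thresh
               for _ in range(trials))
    return hits / trials

n, lam = 100, 1.5
emp = tail_estimate(n, lam)       # empirical tail probability
bound = math.exp(-lam ** 2 / 2)   # Theorem 50's bound, about 0.325
assert emp <= bound               # the bound holds, with plenty of slack
```

The slack is expected: the theorem bounds the tail by e^{−λ²/2} ≈ 0.325 at λ = 1.5, while the true tail here is closer to the Gaussian value Pr(Z > 1.5) ≈ 0.07.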


Figure 3.1: Integrating a probability mass against two different nonnegative kernels (threshold kernel vs. exponential kernel).

Here’s another way to think about this calculation: Let s_x(y) be the step function s_x(y) = 1 for y > x, s_x(y) = 0 for y ≤ x. Note, for any α > 0, s_x(y) ≤ exp(α(y − x)), which is to say, the “threshold kernel” is less than the “exponential kernel.” (See Fig. 3.1.)

Pr(Y > λ√n) = E(s_{λ√n}(Y))
 ≤ E(exp(α(Y − λ√n)))
 = ∏_1^n E(exp(α(Y_i − λ/√n)))
 = (e^{−αλ/√n} cosh α)^n

We get the best upper bound by minimizing the base b of this exponential. If we pick α = λ/√n, which doesn’t exactly optimize the bound but comes close, we get b = e^{−λ²/n} cosh(λ/√n) ≤ e^{−λ²/n} e^{λ²/2n} = e^{−λ²/2n}. Then substituting back we get

Pr(Y > λ√n) ≤ e^{−λ²/2}.

3.1.3 Application: set discrepancy

For a function χ : {1, . . . , n} → {1, −1} and a subset S of {1, . . . , n}, let χ(S) = ∑_{i∈S} χ(i). Define the discrepancy of χ on S to be |χ(S)|, and the discrepancy of χ on a collection of sets S = {S_1, . . . , S_n} to be Disc(S, χ) = max_j |χ(S_j)|.

Theorem 51 (Spencer [89]). With the definitions above, ∃χ with Disc(S, χ) ≤ 6√n.

There is a coincidence here between the number of sets and the size of the universe; actually all that matters for the upper bound is the number of sets. There has been an enormous amount of attention to discrepancy minimization, in a variety of formulations. See for example the books [70, 22], or the article [10].


Here is a slightly more general version of the theorem (which follows by basically the same proof if you keep s = n). Let ‖·‖_∞ be the L_∞ vector norm, i.e., for v ∈ R^n, ‖v‖_∞ = max_i |v_i|. Take any s vectors v¹, . . . , v^s in R^n, each with ‖v^j‖_∞ ≤ 1, that is, max_i |(v^j)_i| ≤ 1. You should think of each v^j as “the indicator function for the sets containing element j of the universe [s].” Form these vectors into a matrix V_ij = (v^j)_i. Think of its rows as indicator functions for the given sets. Then ∃χ ∈ {1, −1}^s s.t. ‖Vχ‖_∞ ≤ 6√n, where χ is regarded as a column vector.

In this formulation, Theorem 51 is seen to be a cousin of a famous and still open conjecture of J. Komlós regarding discrepancy in ‖·‖₂, the Euclidean norm. The conjecture says that there is a universal constant K such that for any s and n, and for any v¹, . . . , v^s ∈ R^n with ‖v^j‖₂ ≤ 1, and with V formed as above, ∃χ ∈ {1, −1}^s s.t. ‖Vχ‖₂ ≤ K.

We won’t prove Theorem 51, but the starting point for it is the proof of the following weaker bound.

Theorem 52. With the definitions above, a function χ selected u.a.r. has Disc(S, χ) ∈ O(√(n log n)) with positive probability.

Proof. By Theorem 50, for any particular set S_j (noting that |S_j| ≤ n),

Pr(|χ(S_j)| > c√(n log n)) = Pr(|χ(S_j)| > (c√(n log n)/√|S_j|) · √|S_j|)

 ≤ 2 e^{−c²n log n/(2|S_j|)}
 ≤ 2 e^{−(c²/2) log n}
 = 2 n^{−c²/2}.

Now take a union bound over the sets.

Pr(∃j : |χ(S_j)| > c√(n log n)) ≤ n Pr(|χ(S_j)| > c√(n log n)) < 2 n^{1−c²/2}.

Plug in any c > √2 to show the theorem for sufficiently large values of n. □
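The bound of Theorem 52 is easy to observe experimentally; a sketch (the random set system and the constant c = 2 are our arbitrary choices):

```python
import math
import random

def discrepancy(sets, chi):
    """Disc(S, chi) = max_j |sum_{i in S_j} chi(i)|."""
    return max(abs(sum(chi[i] for i in S)) for S in sets)

rng = random.Random(7)
n = 200
# n random sets over the universe {0,...,n-1}, each element kept w.p. 1/2
sets = [{i for i in range(n) if rng.random() < 0.5} for _ in range(n)]
# a coloring chosen uniformly at random
chi = [rng.choice((-1, 1)) for _ in range(n)]
bound = 2 * math.sqrt(n * math.log(n))   # c = 2 > sqrt(2)
assert discrepancy(sets, chi) <= bound
```

Typical values land far below the bound (for n = 200, around 30 versus a bound of about 65), matching the slack built into the union-bound argument.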

3.1.4 Entropy and Kullback-Leibler divergence

When we introduced BPP we specified that at the end of the poly-time computation, strings in the language should be accepted with probability ≥ 2/3, and strings not in the language should be accepted with probability ≤ 1/3. We also noted that these values were immaterial and did not even need to be constants—we need only that they be separated by some 1/poly. We’ll shortly see why. We start by defining two important functions.

Definition 53. The entropy (base 2) of a probability distribution p = (p_1, . . . , p_n) is h_2(p) = ∑ p_i lg(1/p_i).

In natural units we use h(p) = ∑ p_i log(1/p_i).

Definition 54. Let r = (r_1, . . . , r_n) and s = (s_1, . . . , s_n) be two probability distributions and suppose s_i > 0 ∀i. The (base 2) Kullback-Leibler divergence D_2(r‖s), “from s to r,” or “of r w.r.t. s,” is defined by

D_2(r‖s) = ∑_i r_i lg(r_i/s_i).


This is also known as information divergence, directed divergence or relative entropy⁴. In natural units the divergence is D(r‖s) = ∑_i r_i log(r_i/s_i), and we also use this notation when the base doesn’t matter. D(r‖s) is not a metric (it isn’t symmetric and doesn’t satisfy the triangle inequality) but it is nonnegative, and zero only if the distributions are the same. Exercise:

(a) D(r‖s) ≥ 0 ∀r, s
(b) D(r‖s) = 0 ⇒ r = s
(c) D(s + ε‖s) = ∑_i (ε_i²/(2s_i) + O(ε_i³))

(d) for n = 2, D((s_1 + ε, 1 − s_1 − ε) ‖ (s_1, 1 − s_1)) is increasing in |ε|

The “‖” notation is strange but is the convention.

From (c) and (d) we have that for n = 2, D((s_1 + ε, 1 − s_1 − ε) ‖ (s_1, 1 − s_1)) ∈ Ω(ε²) (with the constant depending on s_1). When s is the uniform distribution, we have:

D(r‖uniform) = ∑ r_i log(n r_i) = log n + ∑ r_i log r_i = log n − h(r).

So D(r‖uniform) can be thought of as the entropy deficit of r, compared to the uniform distribution. In the case n = 2 we will write p rather than (p, 1 − p); thus h_2(p) = p lg(1/p) + (1 − p) lg(1/(1 − p)) and D_2(p‖q) = p lg(p/q) + (1 − p) lg((1 − p)/(1 − q)).

⁴D is useful throughout information theory and statistics (and is closely related to “Fisher information”). See [26].
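The binary-case formulas translate directly into code; a minimal sketch (our own helper names), including the n = 2 instance of the identity D(r‖uniform) = log n − h(r):

```python
import math

def h2(p):
    """Binary entropy (base 2) of the distribution (p, 1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def D2(p, q):
    """Binary Kullback-Leibler divergence D_2(p || q); requires 0 < q < 1."""
    total = 0.0
    if p > 0:
        total += p * math.log2(p / q)
    if p < 1:
        total += (1 - p) * math.log2((1 - p) / (1 - q))
    return total

assert h2(0.5) == 1.0            # uniform coin has 1 bit of entropy
assert D2(0.5, 0.5) == 0.0       # divergence vanishes iff r = s
# entropy deficit: D_2(p || 1/2) = 1 - h_2(p)
assert abs(D2(0.3, 0.5) - (1 - h2(0.3))) < 1e-12
```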


3.2 Lecture 15 (2/Nov): CLT. Stronger Chernoff bound and applications. Start Shannon coding theorem

3.2.1 Central limit theorem

As I mentioned earlier in the course, there are two basic ways in which we express concentration of measure: large deviation bounds, and the central limit theorem. Roughly speaking the former is a weaker conclusion (only upper tail bounds) from weaker assumptions (we don’t need full independence—we’ll talk about this soon). The proof of the basic CLT is not hard but relies on a little Fourier analysis and would take us too far out of our way this lecture, so I will just quote it.

Let µ be a probability distribution on R, i.e., for X distributed as µ and measurable S ⊆ R, Pr(X ∈ S) = µ(S). For X_1, . . . , X_n sampled independently from µ set X̄ = (1/n) ∑_{i=1}^n X_i.

Theorem 55. Suppose that µ possesses both first and second moments:

θ = E[X] = ∫ x dµ   (mean)

σ² = E[(X − θ)²] = ∫ (x − θ)² dµ   (variance)

Then for all a < b,

lim_{n→∞} Pr(aσ/√n < X̄ − θ < bσ/√n) = (1/√(2π)) ∫_a^b e^{−t²/2} dt.   (3.2)

The form of convergence to the Gaussian in (3.2) is called convergence in distribution or convergence in law. For a proof of the CLT see [18] Sec. 27, or, for a more accessible proof for the case that the X_i are bounded, see [2] Sec. 3.8.

3.2.2 Chernoff bound using divergence; robustness of BPP

Let’s extend and improve the previous large deviation bound for symmetric random walk. The new bound is almost the same for relatively mild deviations (just a few standard deviations) but is much stronger at many (especially, Ω(√n)) standard deviations. It also does not depend on the coins being fair.

Theorem 56. If X_1, . . . , X_n are iid Bernoulli rvs with Pr(X_i = 1) = q, and X = ∑ X_i, then Pr(X > pn) (for p ≥ q) or Pr(X < pn) (for p ≤ q) is < 2^{−nD_2(p‖q)} = exp(−nD(p‖q)).

Exercise: Derive from the above one side of Stirling’s approximation for the binomial coefficient (n choose pn).

Note 1: this improves on Thm 50 even at q = 1/2, because the inequality cosh α ≤ exp(α²/2) that we used before, though convenient, was wasteful. (But the two bounds have the same leading quadratic term for p in the neighborhood of q.) Specifically we have (see Figure 3.2):

D(p‖1/2) ≥ (2p − 1)²/2.   (3.3)

Note 2: The divergence is the correct constant in the above inequality; and this remains the case even when we “reasonably” extend this inequality to alphabets larger than 2—that is, dice rather than coins; see Sanov’s Theorem [26, 83]. There are of course lower-order terms that are not captured by the inequality.


Figure 3.2: Comparing the two Chernoff bounds at q = 1/2: x²/2 vs. D((1 + x)/2 ‖ 1/2).

Note 3: Let’s see what we mean by “concentration of measure.” Clearly, the Chernoff bound is telling us that something, namely the rv X, is very tightly concentrated about a particular value. On the other hand, if you look at the full underlying rv, namely the vector (X_1, . . . , X_n), that is not concentrated at all; if say q = 1/2, then it is actually as smoothly distributed as it could be, being uniform on the hypercube! The concentration of measure phenomenon, then, is a statement about low-dimensional representations of high-dimensional objects. In fact the “representation” does not have to be a nice linear function like X = ∑ X_i. It is sufficient that f(X_1, . . . , X_n) be a Lipschitz function, namely that there be some c < ∞ s.t. flipping any one of the X_i’s changes the function value by no more than c. From this simple information you can already get a large deviation bound on f for independent inputs X_i. We won’t prove that here.

Proof. Consider the case p ≥ q; the other case is similar. Set Y_i = X_i − q and Y = ∑ Y_i. Now for α > 0,

Pr(Y > n(p − q)) = Pr(e^{αY} > e^{αn(p−q)})
 < E(e^{αY}) / e^{αn(p−q)}   (Markov)
 = (((1 − q)e^{−αq} + q e^{α(1−q)}) / e^{α(p−q)})^n   (independence)

Set α = log(p(1−q)/((1−p)q)). Continuing,

 = ((q/p)^p ((1−q)/(1−p))^{1−p})^n = e^{−nD(p‖q)}.  □

This is saying that the probability of a coin of bias q empirically “masquerading” as one of bias at least p > q drops off exponentially, with the coefficient in the exponent being the divergence.

Back to BPP

Suppose we start with a randomized polynomial-time decision algorithm for a language L which, for x ∈ L, reports “Yes” with probability at least p, and for x ∉ L, reports “Yes” with probability at most q, for p = q + 1/f(n) for some f(n) ∈ n^{O(1)}.


Also, D(q + ε‖q) is monotone in each of the regions ε > 0, ε < 0. So if we perform O(n f²(n)) repetitions of the original BPP algorithm, and accept x iff the fraction of “Yes” votes is above (p + q)/2, then by Theorem 56 the probability of error on any input is bounded by exp(−n).
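The amplification can be checked with exact binomial sums. A toy computation with hypothetical acceptance probabilities p = 0.55, q = 0.45 (so the gap 1/f(n) = 0.1), comparing the exact majority-vote error against the exp(−tD) decay that Theorem 56 predicts:

```python
import math

def majority_error(t, bias, thresh):
    """Exact Pr(at most floor(thresh*t) successes among t coins of this bias),
    i.e. the probability the vote lands on the wrong side of the threshold."""
    cutoff = math.floor(thresh * t)
    return sum(math.comb(t, s) * bias ** s * (1 - bias) ** (t - s)
               for s in range(cutoff + 1))

p, q = 0.55, 0.45        # hypothetical acceptance probabilities
thresh = (p + q) / 2     # accept iff the "Yes" fraction exceeds this
# natural-units divergence D(thresh || p) governs the decay (Theorem 56)
D = (thresh * math.log(thresh / p)
     + (1 - thresh) * math.log((1 - thresh) / (1 - p)))
for t in (25, 101, 401):
    err = majority_error(t, p, thresh)   # error prob. for x in L
    assert err < math.exp(-t * D)        # Chernoff: err < exp(-t*D)
```

With a 1/poly gap, D ∈ Ω(1/f²(n)), so O(n f²(n)) repetitions drive the error below exp(−n), as stated.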

3.2.3 Balls and bins

Suppose you throw n balls, uniformly iid, into n bins. What is the highest bin occupancy? Let Ai = # balls in bin i.

Theorem 57. ∀c > 1, Pr(max Ai > c log n/ log log n) ∈ o(1).

Proof. To avoid a morass of iterated logarithms, write L = log n, L_2 = log log n, L_3 = log log log n. So we wish to show Pr(max A_i > cL/L_2) ∈ o(1). By the union bound,

Pr(max A_i > cL/L_2) ≤ n Pr(A_1 > cL/L_2)
 ≤ n exp(−n D(cL/(nL_2) ‖ 1/n))
 = n (L_2/(cL))^{cL/L_2} ((1 − 1/n)/(1 − cL/(nL_2)))^{(1 − cL/(nL_2))n}
 ≤ n (L_2/(cL))^{cL/L_2} (1/(1 − cL/(nL_2)))^{(1 − cL/(nL_2))n}

Expand the first term and apply the inequality⁵ (1/(1 − p))^{1−p} ≤ e^p (0 ≤ p < 1), with p = cL/(nL_2), to the second term:

 . . . ≤ exp(L + (cL/L_2)(L_3 − L_2 − log c) + cL/L_2)
 = exp((1 − c)L + cL(L_3 − log c + 1)/L_2)
 ≤ exp((1 − c)L + o(L)) = n^{1−c+o(1)}.

□

Omitted: Show that a constant fraction of bins are unoccupied. A much more precise analysis of this balls-in-bins process is available, see [43].
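A simulation of the balls-in-bins process (a sketch; the choices c = 3 and n = 10⁵ are ours), which also exhibits the omitted claim that about a 1/e fraction of bins stay empty:

```python
import math
import random
from collections import Counter

def throw(n, rng):
    """Throw n balls u.a.r. into n bins; return the occupancy counter."""
    return Counter(rng.randrange(n) for _ in range(n))

rng = random.Random(3)
n = 100_000
counts = throw(n, rng)
# Theorem 57: the max load stays below c*log(n)/log(log(n)) (c > 1, n large)
assert max(counts.values()) <= 3 * math.log(n) / math.log(math.log(n))
# and a constant fraction (about 1/e, roughly 0.368) of bins remain empty
empty_frac = 1 - len(counts) / n
assert 0.35 < empty_frac < 0.39
```

The empty-bin fraction follows from Pr(bin i empty) = (1 − 1/n)^n → e^{−1}.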

3.2.4 Preview of Shannon’s coding theorem

This is an exceptionally important application of large deviation bounds. Consider one party (Alice) who can send a bit per second to another party (Bob). She wants to send him a k-bit message. How- ever, the channel between them is noisy, and each transmitted bit may be flipped, independently,

⁵In fact we have the stronger (1/(1−p))^{1−p} ≤ 1 + p (see Fig. 3.3), although we don’t need this. Let α = log(1/(1−p)), so α ≥ 0. Then p = 1 − e^{−α} and we are to show that 2 ≥ e^{−α} + e^{αe^{−α}} =: f(α). Now f(0) = 2 and f′ = e^{−α}((1 − α)e^{αe^{−α}} − 1), so it suffices to show for α ≥ 0 that g(α) := e^{αe^{−α}} ≤ 1 + α. At α = 0 this is satisfied with equality, so it suffices to show that 1 ≥ g′ = (1 − α)e^{−α} g. Since 1 − α ≤ e^{−α}, it suffices to show that 1 ≥ e^{−2α} g = e^{α(e^{−α} − 2)}, which holds (with room to spare) because e^{−α} ≤ 1.


Figure 3.3: (1/(1 − p))^{1−p} vs. 1 + p vs. e^p.

with probability p < 1/2. What can Alice and Bob do? You can’t expect them to communicate reliably at 1 bit/second anymore, but can they achieve reliable communication at all? If so, how many bits/second can they achieve? This question turns out to have a beautiful answer that is the starting point of modern communication theory.

Before Shannon came along, the only answer to this question was, basically, the following naïve strategy: Alice repeats each bit some ℓ times; Bob takes the majority of his ℓ receptions as his best guess for the value of the bit. We’ve already learned how to evaluate the quality of this method: Bob’s error probability on each bit is bounded above by, and roughly equal to, exp(−ℓD(1/2‖p)). In order for all bits to arrive correctly, then, Alice must use ℓ proportional to log k. This means the rate of the communication, the number of message bits divided by elapsed time, tends to 0 in the length of the message (scaling as 1/log k). And if Alice and Bob want to have exponentially small probability of error exp(−k), she would have to employ ℓ ∼ k, so the rate would be even worse, scaling as 1/k.

Shannon showed that in actual fact one does not need to sacrifice rate for reliability. This was a great insight, and we will see next time how he did it. Roughly speaking—but not exactly—his argument uses a randomly chosen code. He achieves error probability exp(−Ω(k)) at a constant communication rate. What is more, the rate he achieves is arbitrarily close to the theoretical limit.
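The naïve repetition strategy can be evaluated exactly; a sketch comparing Bob’s per-bit majority error with the exp(−ℓ D(1/2‖p)) bound (ℓ odd to avoid ties; p = 0.1 is an arbitrary choice of ours):

```python
import math

def rep_error(ell, p):
    """Exact Pr(majority of ell receptions is wrong) when each bit is
    flipped independently with probability p (ell odd, so no ties)."""
    return sum(math.comb(ell, s) * p ** s * (1 - p) ** (ell - s)
               for s in range(ell // 2 + 1, ell + 1))

p = 0.1
# D(1/2 || p) in natural units
D = 0.5 * math.log(0.5 / p) + 0.5 * math.log(0.5 / (1 - p))
for ell in (5, 15, 45):
    assert rep_error(ell, p) < math.exp(-ell * D)  # Theorem 56's bound
```

Tripling ℓ multiplies the exponent by 3: reliability improves geometrically, but only by spending rate, which is exactly the tradeoff Shannon's theorem removes.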


3.3 Lecture 16 (4/Nov): Application of large deviation bounds: Shannon’s coding theorem

In order to communicate reliably, Alice and Bob are going to agree in advance on a codebook, a set of codewords that are fairly distant from each other (in Hamming distance), with the idea that when a corrupted codeword is received, it will still be closer to the correct codeword than to all others. In this discussion we completely ignore a key computational issue: how are the encoding and decoding maps computed efficiently? In fact it will be enough for us, for a positive result, to demonstrate existence of an encoding map E : {0,1}^k → {0,1}^n and a decoding map D : {0,1}^n → {0,1}^k (we’ll call this an (n, k) code) with the desired properties; we won’t even explicitly describe what the maps are, let alone specify how to efficiently compute them. We will call k/n the rate of such a code. Shannon’s achievement was to realize (and show) that you can simultaneously have positive rate and error probability tending to 0—in fact, exponentially fast. This was one of the first applications of the probabilistic method.

Theorem 58 (Shannon [85]). Let p < 1/2. For any ε > 0, for all k sufficiently large, there is an (n, k) code with rate ≥ D_2(p‖1/2) − ε and error probability e^{−Ω(k)} on every message. (The constant in the Ω depends on p and ε.)

In this theorem statement, “Error” means that Bob decodes to anything different from X, and error probabilities are taken only with respect to the random bit-flips introduced by the channel.

Proof. Let

n = k / (D_2(p‖1/2) − ε)   (3.4)

(ignoring rounding). Let R ∈ {0,1}^n denote the error string. So, with Y denoting the received message, Y = E(X) + R, with X uniform in {0,1}^k and R consisting of iid Bernoulli rvs which are 1 with probability p. The error event is that D(E(X) + R) ≠ X. As a first try, let’s design E simply (this won’t be good enough but we’ll later modify it):

E maps each X ∈ {0,1}^k to a uniformly, independently chosen string in {0,1}^n.

So (for now) when we speak of error probability, we have two sources of randomness: channel noise R, and code design E. To describe the decoding procedure we start with the notion of Hamming distance H. The Hamming distance H(x, y) between two same-length strings over a common alphabet Σ is the number of indices in which the strings disagree: H(x, y) = |{i : x_i ≠ y_i}| for x, y ∈ Σ^n. Define the decoding D to map Y to a closest codeword in Hamming distance. For most of the remainder of the proof (in particular until after the lemma), we fix a particular message X, and analyze the probability that it is decoded incorrectly.

In order to speak separately about the two sources of error, we define a rv M_X which is a function of the rv E: M_X = Pr_R(Error on X | E). So for any E, M_X is a number in the range [0, 1]. In order to analyze how well this works, we pick δ sufficiently small that

p + δ < 1/2   (3.5)

and

D_2(p + δ‖1/2) > D_2(p‖1/2) − ε/2.   (3.6)


Note that if both

1. H(E(X) + R, E(X)) < (p + δ)n (“channel noise is low”), and
2. ∀X′ ≠ X : H(E(X) + R, E(X′)) > (p + δ)n (“code design is good for X, R”)

then Bob will decode correctly. The contrapositive is that if Bob decodes X incorrectly then at least one of the following events has to have occurred:

Bad₁: H(E(X) + R, E(X)) ≥ (p + δ)n
Bad₂: ∃X′ ≠ X : H(E(X) + R, E(X′)) ≤ (p + δ)n

Lemma 59. ∃c > 0 s.t. E_E(M_X) < 2^{1−cn}.

Proof. Specifically we show this for c = min{D_2(p + δ‖p), ε/2}. In what follows, when we write a bound on Pr_W(. . .) we mean that “conditional on any setting of the other random variables, the randomness in W ensures the bound.”

E_E(M_X) ≤ Pr_R(Bad₁) + Pr_E(Bad₂)
 ≤ Pr_R(H(0⃗, R) ≥ (p + δ)n) + ∑_{X′≠X} Pr_{E(X′),E(X)}(H(R, E(X′) − E(X)) ≤ (p + δ)n)
 ≤ 2^{−nD_2(p+δ‖p)} + 2^{k−nD_2(p+δ‖1/2)}
 = 2^{−nD_2(p+δ‖p)} + 2^{n(D_2(p‖1/2)−ε−D_2(p+δ‖1/2))}   (substituting the value of k)
 ≤ 2^{−nD_2(p+δ‖p)} + 2^{−εn/2}   (using inequality (3.6))
 ≤ 2^{1−cn}   (using the value of c)

□

All of the above analysis treated an arbitrary but fixed message X. We showed that, picking the code at random, the expectation of M_X = Pr_R(Error on X | E) is small.

Let Z be the rv which is the fraction of X’s for which M_X ≤ 2E(M_X). By the Markov inequality, ∃E s.t. Z ≥ 1/2. Let E* be a specific such code.

E* works well for most messages X, but this isn’t quite what we want—we want M_X to be small for all messages X. There is a simple solution. Choose a code E* as above for k + 1 bits, then map the k-bit messages to the good half of the (k + 1)-bit messages. Note that removal of some codewords from E* can only decrease any M_X. (Assuming we still use closest-codeword decoding.)

So now the bound Pr_R(Error on X) ≤ 2E(M_X) ≤ 2^{2−cn} applies to all X. The asymptotic rate is unaffected by this trick; the error exponent is also unaffected. To be explicit, using E* designed for k + 1 bits and with n = (k + 1)/(D_2(p‖1/2) − ε), we have for all X ∈ {0,1}^k

Pr_R(Error on X) ≤ 2^{2−cn}.

Thus no matter what message Alice sends, Bob’s probability of error is exponentially small. □
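A miniature of the random-code construction (hypothetical toy parameters k = 4, n = 24, p = 0.05, so brute-force nearest-codeword decoding over all 2^k codewords is feasible):

```python
import random

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

rng = random.Random(11)
k, n, p = 4, 24, 0.05   # hypothetical toy parameters
# E: each of the 2^k messages gets an independent uniform n-bit codeword
code = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(2 ** k)]

def transmit(word):
    """The channel: flip each bit independently with probability p."""
    return tuple(bit ^ (rng.random() < p) for bit in word)

def decode(received):
    """D: nearest codeword in Hamming distance (brute force)."""
    return min(range(2 ** k), key=lambda x: hamming(code[x], received))

trials = 2000
errors = sum(decode(transmit(code[x])) != x
             for x in (rng.randrange(2 ** k) for _ in range(trials)))
print(errors / trials)  # small for a typical random code at this rate
```

The rate here, 4/24 ≈ 0.17, sits far below the limit D_2(0.05‖1/2) ≈ 0.71, which is why even this tiny unoptimized random code decodes reliably most of the time.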


3.4 Lecture 17 (6/Nov): Application of CLT to Gale-Berlekamp. Khintchine-Kahane. Moment generating functions

3.4.1 Gale-Berlekamp game

Let’s remember a problem we saw in the first lecture (slightly retold):

• You are given an n × n grid of lightbulbs. For each bulb, at position (i, j), there is a switch bij; there is also a switch ri on each row and a switch cj on each column. The (i, j) bulb is lit if bij + ri + cj is even. For a setting b, r, c of the switches, let

F(b, r, c) = (number of lit bulbs) − (number of unlit bulbs)

Then F(b, r, c) = ∑_{ij} (−1)^{b_ij + r_i + c_j}.

Let F(b) = max_{r,c} F(b, r, c). What is the greatest f(n) such that for all b, F(b) ≥ f(n)?

This is called the Gale-Berlekamp game after David Gale and Elwyn Berlekamp, who viewed it as a competitive game: the first player chooses b and then the second chooses r and c to maximize the number of lit bulbs. So f(n) is the outcome of the game for perfect players. In the 1960s, at Bell Labs, Berlekamp even built a physical 10 × 10 grid of lightbulbs with b_ij, r_i and c_j switches. People have labored to determine the exact value of f(n) for small n—see [36]. But the key issue is the asymptotics.

Theorem 60. f(n) ∈ Θ(n^{3/2}).

Proof. First, the upper bound f (n) ∈ O(n3/2): We have to find a setting b that is favorable for the “minimizing f ” player, who goes first. That is, we have to find a b with small F(b). Fix any r, c. Then for b selected u.a.r.,

Pr(F(b, r, c) > kn^{3/2}) ≤ 2^{−n² D_2(1/2 + k/(2√n) ‖ 1/2)}   (we’ll choose a value for k shortly)
 ≤ 2^{−k²n/(2 log 2)}   (using D(p‖1/2) ≥ (2p − 1)²/2)

Now take a union bound over all r, c.

Pr(F(b) > kn^{3/2}) ≤ 2^{2n − k²n/(2 log 2)}

For k > 2√(log 2) this is < 1. So ∃b s.t. ∀r, c, F(b, r, c) ≤ 2√(log 2) · n^{3/2}.

Next we show the lower bound. Here we must consider any setting b and show how to choose r, c favorably. Initially, set all r_i = 0 and pick c_j u.a.r. Then for any fixed i, the row sum

X_i := ∑_j (−1)^{b_ij + c_j}

is binomially distributed, being an unbiased random walk of length n. Now, unlike the Chernoff bound, we’d like to see not an upper but a lower tail bound on the random walk. Let’s derive this from the CLT:

Corollary 61. For X the sum of m uniform iid ±1 rvs, E(|X|) = (1 + o(1)) √(2m/π).


(Proof sketch: for X distributed as the unit-variance Gaussian N(0, 1), this value is exact; see [94]. The CLT shows this is a good enough approximation to our rv.) Comment: Instead of using Corollary 61, we could alternatively have used the following result, which allows also for step sizes of different lengths:

Theorem 62 (Khintchine-Kahane). Let a = (a_1, . . . , a_n), a_i ∈ R. Let s_i ∈_U ±1 and set S = |∑ s_i a_i|. Then (1/√2) ‖a‖₂ ≤ E(S) ≤ ‖a‖₂.

The original result of this form is [60]; Kahane generalized to normed linear spaces [55]; the above constant and generality are found in [64]; for an elegant one-page proof see [35]. Not coincidentally, the best constant in the fully independent case is, like the CLT, proven through Fourier analysis.

Comment: Since we haven’t provided proofs of either of these, and we are about to use them, let me mention that later in the course (Sec. 4.3.2) we’ll come back and finish the proof (with a weaker constant) through a more elementary argument, and with the added benefit that we will be able to give the player a deterministic poly-time strategy for choosing the row and column bits. (Here we gave the player only a randomized poly-time strategy.)

In any case we now continue, using the conclusion (with the largest constant, coming from the CLT): for every i, E(|X_i|) = (1 + o(1)) √(2n/π).

Now for each row, flip r_i if the row sum is negative. So E(∑_i (−1)^{r_i} X_i) = E(∑_i |X_i|) = ∑_i E(|X_i|) = (1 + o(1)) √(2/π) n^{3/2}.

This shows (assuming the CLT) that for any b, E_c max_r F(b, r, c) is (1 + o(1)) √(2/π) n^{3/2}. Consequently, for all b, F(b) ≥ (1 + o(1)) √(2/π) n^{3/2}, which proves the theorem. □

Comment: It was convenient in this problem that the system of switches at your disposal is bipartite, in that there are no interactions amongst the effects of the row switches, and likewise amongst the effects of the column switches. However, even when such effects are present it is possible to attain similar theorems. See [57].
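The randomized lower-bound strategy (pick the columns u.a.r., then flip each row switch whose row sum is negative) runs in a few lines; a sketch with an arbitrary n of our choosing:

```python
import math
import random

def gb_randomized(b, rng):
    """Random column switches, then flip each row switch whose row sum is
    negative; returns the achieved value of F(b, r, c)."""
    n = len(b)
    c = [rng.choice((0, 1)) for _ in range(n)]
    total = 0
    for i in range(n):
        row_sum = sum((-1) ** (b[i][j] + c[j]) for j in range(n))
        total += abs(row_sum)  # choosing r_i optimally yields |row sum|
    return total

rng = random.Random(5)
n = 60
b = [[rng.choice((0, 1)) for _ in range(n)] for _ in range(n)]
best = max(gb_randomized(b, rng) for _ in range(20))
# lower bound of Theorem 60: about sqrt(2/pi) * n^{3/2} in expectation
assert best >= 0.5 * math.sqrt(2 / math.pi) * n ** 1.5
```

The achieved value concentrates near √(2/π) n^{3/2} ≈ 371 for n = 60, so the factor-1/2 slack in the assertion is generous.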


3.4.2 Moment generating functions, Chernoff bound for general distributions

Now for a version of the Chernoff bound which we can apply to sums of independent real rvs with very general probability distributions. After presenting the bound we’ll see an application of it, with broad computational applications, in the theory of metric spaces. Let X be a real-valued random variable with distribution µ: for measurable S ⊆ R, Pr(X ∈ S) = µ(S).

Definition 63. The moment generating function (mgf) of X (or, more precisely, of µ) is defined for β ∈ R by

gµ(β) = E[e^{βX}]   (provided this converges in an open neighborhood of 0)
      = ∑_{k=0}^∞ (β^k / k!) E(X^k)

Incidentally note that (a) if instead of taking β to be real we take it to be imaginary, this gives the characteristic function (the Fourier transform of µ), (b) both are "slices" of the complex function E[e^{zX}] (a two-sided Laplace transform). For any probability measure µ,

gµ(0) = E [1] = 1. (3.7)
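To make the two forms of the definition concrete, here is a small numerical check (a sketch, not part of the notes) for a Bernoulli(p) variable, for which E(X^k) = p for all k ≥ 1 and the mgf has the closed form 1 − p + p e^β:

```python
# Check that the series form of the mgf, g(beta) = sum_k beta^k E(X^k) / k!,
# matches the closed form g(beta) = E[e^{beta X}] = 1 - p + p*e^beta
# for a Bernoulli(p) random variable (E(X^k) = p for all k >= 1).
import math

p, beta = 0.3, 0.7

closed_form = 1 - p + p * math.exp(beta)
# k = 0 term is E(X^0) = 1; truncate the series at k = 30 (remainder is negligible)
series = 1 + sum(beta ** k * p / math.factorial(k) for k in range(1, 30))
assert abs(closed_form - series) < 1e-12
```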

We are interested in large deviation bounds for random walk with steps from µ. That is, if we sample X_1, ..., X_n iid from µ and take X = (1/n) ∑_{i=1}^n X_i, we want to know if the distribution of X is concentrated around E[X]. It will be convenient to re-center µ, if necessary, so that E[X] = 0; clearly this just subtracts a known constant off each step of the rw, so it does not affect any probabilistic calculations. So without loss of generality we now take E[X] = 0.

Perhaps not surprisingly, the quality of the large deviation bound that is possible depends on how heavy the tails of µ are. What is interesting is that this is nicely measured by the smoothness of gµ at the origin. Specifically, a moment-generating function that is differentiable at the origin guarantees exponential tails. One way to think about this intuitively is to examine the Fourier transform (the imaginary axis), rather than the mgf (the real axis), near the origin. If µ has light tails—as an extreme case suppose µ has bounded support—then near the origin, the Fourier coefficients are picking up only very long-wavelength information, and seeing almost no "cancellations"—negative contributions can come only from very far away and therefore be very small. So the Fourier coefficients near 0 are vanishingly different from the Fourier coefficient at 0, and so gµ is differentiable at 0. This goes both ways—if µ has heavy tails, then even at very long wavelengths, the Fourier integral picks up substantial cancellation, and so the Fourier coefficients change a lot moving away from 0.

Theorem 64 (Chernoff). If the mgf gµ(β) is differentiable at 0, then ∀ε ≠ 0 ∃cε < 1 such that

Pr(X/ε > 1) < cε^n.

Specifically,

cε ≤ inf_β e^{−βε} gµ(β) < 1. (3.8)


Proof. Let N be a neighborhood of 0 in which the mgf converges. Start with the case ε > 0.

Pr(X > ε) = Pr(e^{β ∑_i X_i} > e^{βnε})      for any β > 0          (3.9)
          < e^{−βnε} E[e^{β ∑_i X_i}]        Markov bound, for β ∈ N
          = e^{−βnε} (E[e^{βX_1}])^n         X_i are independent
          = (e^{−βε} gµ(β))^n                                        (3.10)

We now need to show that there is a β > 0 such that e^{−βε} gµ(β) < 1. At β = 0, e^0 gµ(0) = 1, so let's find the derivative of e^{−βε} gµ(β) at 0. Since gµ is differentiable at 0 we have:

∂gµ(β)/∂β |_0 = ∂E[e^{βX}]/∂β |_0
              = E[∂e^{βX}/∂β] |_0
              = E[X e^{βX}] |_0
              = E[X] = 0        (3.11)

So, because we have shifted the mean to 0, the moment-generating function is flat at 0. Now we can differentiate the whole function:

∂(e^{−βε} gµ(β))/∂β |_0 = e^{−ε·0} g′µ(0) − ε e^{−ε·0} gµ(0)    product rule, at β = 0
                        = 1 · 0 − ε · 1 · 1
                        = −ε                                     (3.12)

We have determined that ∃β > 0 such that e^{−βε} gµ(β) < 1, and thus there is a cε < 1 as stated in the theorem. The case ε < 0 is similar. All that changes is that for line 3.9 we substitute

Pr(X < ε) = Pr(e^{β ∑_i X_i} > e^{βnε})    for any β < 0    (3.13)

The rest of the derivation is identical up to and including line 3.12, which in this case shows that ∃β < 0 such that e^{−βε} gµ(β) < 1, and thus there is a cε < 1 as stated in the theorem. 2

This method also allows us, in some cases, to find the value of cε which gives the tightest Chernoff bound. (For general µ and ε this can be a complicated task and we may have to settle for bounds on the best cε.)
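As an illustration of how (3.8) is used in practice, here is a small numerical sketch (not from the notes; to avoid spoiling the exercise below it uses steps uniform on {−1, 0, +1}, whose mgf is (1 + 2 cosh β)/3, rather than ±1 steps): grid-search the infimum to approximate cε, and compare the resulting bound cε^n against the exact tail probability.

```python
# Approximating c_eps = inf_beta e^{-beta*eps} g(beta) for a toy centered
# distribution: X uniform on {-1, 0, +1}, mgf g(beta) = (1 + 2*cosh(beta)) / 3.
import math

def g(beta):
    return (1 + 2 * math.cosh(beta)) / 3

def chernoff_base(eps):
    """Grid-search approximation of inf over beta > 0 of e^{-beta*eps} g(beta)."""
    betas = [i / 1000 for i in range(1, 5001)]  # beta in (0, 5]
    return min(math.exp(-b * eps) * g(b) for b in betas)

def exact_tail(n, eps):
    """Exact Pr(sum of n iid uniform{-1,0,1} steps > eps*n), by convolution."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for s, prob in dist.items():
            for step in (-1, 0, 1):
                new[s + step] = new.get(s + step, 0.0) + prob / 3
        dist = new
    return sum(prob for s, prob in dist.items() if s > eps * n)

eps, n = 0.5, 20
c = chernoff_base(eps)
assert c < 1                       # a nontrivial exponential bound exists
assert exact_tail(n, eps) <= c**n  # the Chernoff bound indeed holds
```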

Exercise: What is the mgf of the uniform distribution on ±1? What is the best cε for it?


3.5 Lecture 18 (9/Nov): Metric spaces

Today we’ll start to see a geometric application of the Chernoff bound. At first glance the question we solve, which originates in analysis, appears to have nothing to do with probability. But actually it illustrates a shared geometric core between analysis and probability.

Definition 65. A metric space (M, dM) is a set M and a function dM : M × M → (R ∪ {∞}) that is symmetric; 0 on the diagonal; and obeys the triangle inequality, dM(x, y) ≤ dM(x, z) + dM(z, y).

3.5.1 Metric space examples

1. A Euclidean space is a vector space R^n equipped with the metric d(x, y) = √(∑_{i=1}^n (x_i − y_i)²).

2. The same vector space can be equipped with a different metric, for instance the ℓ∞ metric, max_i |x_i − y_i|, or the ℓ1 metric, ∑_i |x_i − y_i|. Actually in real vector spaces the metrics we use, like these, are usually derived from norms (see Sec. 3.5.3).

3. Sometimes we get important metrics as restrictions of another metric. For instance let ∆_n denote the probability simplex, ∆_n = {x ∈ R^n : ∑_i x_i = 1, x_i ≥ 0}. In this space (half of) the ℓ1 distance is referred to as "total variation distance", dTV. It has another characterization, dTV(p, q) = max_{A⊆[n]} p(A) − q(A). Exercise: Usually a metric arises through a "min" definition (shortest path from one place to another), and in Example 5 we will see that dTV does have that kind of definition. Why does it coincide with a "max" definition?

4. Many metric spaces have nothing to do with vector spaces. An important class of metrics are the shortest path metrics, derived from undirected graphs: If G = (V, E) is a graph and x, y ∈ V, let d(x, y) denote the length of (number of edges on) a shortest path between them.

5. If you start with a metric d on a measurable space M you can "lift" it to the transportation metric dtrans. This is much bigger: the points of this new metric space are probability distributions on M, and the transportation distance is how far you have to shift probability mass in order to transform one distribution to the other. Here is the formal definition for the case of a finite space M. Let µ, ν be two probability distributions. π will range over probability distributions on the direct product space M². We sometimes call π a coupling of µ and ν.

dtrans(µ, ν) = min_π { ∑_{x,y} d(x, y) π(x, y)  |  ∀x : ∑_y π(x, y) = µ(x),  ∀y : ∑_x π(x, y) = ν(y) }

Sometimes this is called "earthmover distance" (imagine bulldozers pushing the probability mass around).

For example, if M is the graph metric on a clique of size k (as in Example 4) then dtrans = dTV = variation distance among probability distributions on the vertices (i.e., the metric space of Example 3). Of course, even if M is not finite, just a measure space with a given σ-algebra, we need only for µ and ν to be measures on this σ-algebra, and the minimization ranges over measures π on the product σ-algebra on M². So, with S and T ranging over measurable sets:

dtrans(µ, ν) = inf_π { ∫∫ d(x, y) π(x, y) dx dy  |  ∀S : ∫∫ 1_{x∈S} π(x, y) dx dy = µ(S),  ∀T : ∫∫ 1_{y∈T} π(x, y) dx dy = ν(T) }.
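Returning to the exercise in Example 3: before proving that the "min" and "max" characterizations of dTV coincide, it can help to check them numerically. Here is a brute-force sketch (not from the notes) on a three-point space:

```python
# Check the two characterizations of total variation distance on a finite space:
# half the l1 distance equals the max over subsets A of p(A) - q(A).
from itertools import combinations

def tv_l1(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q)) / 2

def tv_max(p, q):
    n = len(p)
    best = 0.0
    for r in range(n + 1):
        for A in combinations(range(n), r):
            best = max(best, sum(p[i] - q[i] for i in A))
    return best

p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
assert abs(tv_l1(p, q) - tv_max(p, q)) < 1e-9
```

(The "max" is attained by taking A to be exactly the coordinates where p exceeds q, which is the key observation in the exercise.)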


Definition 66. An embedding f : M → M′ is a mapping of a metric space (M, dM) into another metric space (M′, dM′). The distortion of the embedding is

sup_{a,b,c,d∈M} [dM′(f(a), f(b)) / dM(a, b)] · [dM(c, d) / dM′(f(c), f(d))].

The mapping is called isometric if it has distortion 1.

(This is slightly more generous than the usual definition of isometry because it allows for uniform nonzero scaling of all distances.)

A finite metric space is one in which the underlying set is finite. A finite ℓ2 space is one that can be embedded isometrically into a Euclidean space of some dimension.

3.5.2 Embedding dimension for n points in L2

Exercise 1: The dimension need not be greater than n − 1. (n points span at most an (n − 1)-dimensional affine subspace.)

Exercise 2: Generically, the dimension must be n − 1.

Let me suggest two ways to think about this. One is to show that the distances between points in Euclidean space determine their coordinates up to a rotation, reflection and translation. (I.e., up to an action of the orthogonal group.) Then consider the volume of the convex hull of the points.

A second way, maybe a little more direct, is this. Suppose you have embedded the n points into n vectors v_i in R^d. Subtract v_n off each of the vectors; this does not change any distances, and puts v_n = 0. Let V be the (n − 1) × d matrix whose rows are the vectors v_1, ..., v_{n−1}. The singular value decomposition theorem tells us that we can write V = PDQ where P is an (n − 1) × (n − 1) orthogonal matrix, Q is a d × d orthogonal matrix, and D is a diagonal matrix with nonnegative diagonal entries (these are the singular values) σ_1 ≥ σ_2 ≥ ..., with σ_ℓ = 0 for all ℓ > min{d, n − 1}. This being the case, Exercise 1 follows by forming P′ by chopping off all columns of P beyond min{d, n − 1}, and likewise forming D′ by chopping off all rows of D beyond min{d, n − 1}. Then V = P′D′Q. Our new embedding of the i'th point (i < n) is the i'th row of P′D′; the n'th point is still taken to 0. The distances between these points are unchanged—think about the Gram matrix VV* of the original embedding, where * denotes transpose.

For 1 ≤ i < n let e_i be the i'th standard basis vector (as a row vector). Just to make the notation go through, let e_n be the zero vector. Then for any i, j, d_ij² = ‖v_i − v_j‖² = (e_i − e_j) VV* (e_i − e_j)*. But this quantity is unchanged when we use the Gram matrix of the new embedding, because VV* = PDQQ*D*P* = P′D′Q Q^{−1} (P′D′)* = (P′D′)(P′D′)*. So all the distances are unchanged.

Now instead of showing that "generically" we need n − 1 dimensions, let's just consider a concrete case, and leave the general situation for you to think about. The concrete case I have in mind is the clique metric, or if you prefer, the v_i are the n vertices of the probability simplex. That naturally sits in n − 1 dimensions, but why is that necessary? Well, suppose the points sit in some dimension d, possibly less than n − 1; and form V as above. Now, I claim we know what the Gram matrix VV* is. Its diagonal entries are all 1 because for i < n, 1 = d_in² = (e_i − e_n) VV* (e_i − e_n)* = e_i VV* e_i* = (VV*)_ii. Then its off-diagonal entries are all 1/2 because for distinct i, j < n, 1 = d_ij² = (e_i − e_j) VV* (e_i − e_j)* = (VV*)_ii + (VV*)_jj − 2(VV*)_ij = 2 − 2(VV*)_ij. Let I be the identity matrix, and J the matrix of all 1's; note that J has a single nonzero eigenvalue of n − 1. Then VV* = (I + J)/2 has one eigenvalue of n/2 and another n − 2 eigenvalues of 1/2; consequently it is nonsingular. But then, since the nonzero eigenvalues of VV* are the same as the nonzero eigenvalues of the d × d matrix V*V, we must have d ≥ n − 1.


3.5.3 Normed spaces

A real normed space is a vector space V equipped with a nonnegative real-valued "norm" ‖·‖ satisfying ‖cv‖ = |c| ‖v‖ for c ∈ R, ‖v‖ ≠ 0 for v ≠ 0, and ‖v + w‖ ≤ ‖v‖ + ‖w‖. Norms always induce metrics, as in examples 1, 2, by taking the distance between v and w to be ‖v − w‖.

Let S = (S, µ) be any measure space. For p ≥ 1, the Lp normed space w.r.t. the measure µ, Lp(S), is defined to be the vector space of functions

f : S → R of finite “Lp norm,” defined by

‖f‖_p = ( ∫_S |f(x)|^p dµ(x) )^{1/p}

Exercise: ‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p

So (like any normed space), Lp(S) is also automatically a metric space.

This framework allows us to discuss the collection of all L2 (Euclidean) spaces, all L1 spaces, etc. The most commonly encountered cases are indeed L1, L2 and L∞, which is defined to be the sup norm (so µ doesn’t matter). Today we discuss embeddings L2 → L2. Time permitting we may also discuss embeddings of general metrics into L1.

We will use the shorthand Lp(k) to refer to an Lp space on a set S of cardinality k, with the counting measure.

3.5.4 Exponential savings in dimension for any fixed distortion

In view of the lower bound in Section 3.5.2, our next result may be surprising. We will see a method of embedding any n-point ℓ2 metric into a very low-dimensional Euclidean space with only slight distortion. This is useful in the design of algorithms because many algorithms for geometric problems have complexity that scales exponentially in the dimension of the input space. We'll have to skip giving example applications, but there are quite a few by now, and because of these, a variety of improvements and extensions of the embedding method have also been developed. Our goal is to prove this:

Theorem 67 (Johnson and Lindenstrauss [53]). Given a set A of n points in a Euclidean space L2(d), there exists a map h : A → L2(k) with k = O(ε^{−2} log n) that is of distortion e^ε on A. Moreover, the map h can be taken to be linear and can be found with a simple randomized algorithm in expected time polynomial in n.

Although the points of A may (and generically will) span a d = (n − 1)-dimensional affine space, and the map is linear, nonetheless observe that we are not embedding all of R^{n−1} with low distortion—that is impossible, as the map is many-one—we care only about the distances among our n input points.


3.6 Lecture 19 (11/Nov): Johnson-Lindenstrauss embedding

By a small sample we may judge of the whole piece. Cervantes, Don Quixote de la Mancha §I-1-4

3.6.1 The original method

Returning to the statement of the Johnson-Lindenstrauss Theorem (67), how do we find such a map h? Here is the original construction: pick an orthogonal projection, W̃, onto R^k uniformly at random, and let h(x) = W̃x for x ∈ A. For k as specified, this is satisfactory with high (constant) probability (which depends on the constant in k = O(ε^{−2} log n)).

An equivalent description of picking a projection W̃ at random is as follows: choose U uniformly (i.e., using the Haar measure) from O_d (the orthogonal group). Let Q̃ be the d × d matrix which is a projection map, sending column vectors onto their coordinates in the first k basis vectors:

 1 0 0 ··· 0 0   0 1 0 ··· 0 0     .  ˜  ..  Q =   .  0 0 ··· 1 0 0     0 0 0 0 0 0  0 0 0 0 0 0

Then set W̃ = U^{−1} Q̃ U. I.e., a point x ∈ A is mapped to U^{−1} Q̃ U x. Let's start simplifying this. The final multiplication by U^{−1} doesn't change the length of any vector so it is equivalent to use the mapping x → Q̃Ux and ask what this does to the lengths of vectors between points of A. Having simplified the mapping in this way, we can now discard the all-0 rows of Q̃, and use just Q:

 1 0 0 ··· 0 0   0 1 0 ··· 0 0    Q =  .  .  ..  0 0 ··· 1 0 0 So JL’s final mapping is f (x) = QUx.

In order to analyze this map, we will consider a vector v, the difference between two points in A, i.e. v = x − y for some x, y ∈ A. Since the question of distortion of the length of v is scale invariant, we can simplify by supposing that ‖v‖ = 1. Moreover, the process described above has the same distribution for all rotations of v. That is to say, for any v ∈ R^d, measurable S ⊆ R^d, and orthogonal matrix A,

Pr_U(QUv ∈ S) = Pr_U(QUAv ∈ S).

(This is precisely the content of saying that the Haar measure is invariant under actions of the orthogonal group.) So we may as well consider that v is the vector v = e_1 = (1, 0, 0, ..., 0)*.


In that case, ‖QUv‖ equals ‖(QU)_{*1}‖ where (QU)_{*1} is the first column of QU. But (QU)_{*1} = (U_{1,1}, U_{2,1}, ..., U_{k,1})*, i.e., the top k entries of the first column of U. Since U is a random orthogonal matrix, the distribution of its first column (or indeed of any other single column) is simply that of a random unit vector in R^d. So the whole question boils down to showing concentration for the length of the projection of a random unit vector onto the subspace spanned by the first k standard basis vectors. This distribution is deceptive in low dimensions. For d = 2, k = 1 the density looks like Figure (3.4). The projection is not at all tightly concentrated.

Figure 3.4: Density of projection of a unit vector in 2D onto a random unit vector

However, in higher dimensions, this density looks more like Figure (3.5). The phenomenon we are encountering is truly a feature of high dimension.

Figure 3.5: Density of projection of a unit vector in high dimension onto a random unit vector

Remarks:

1. In the one-dimensional projection density (Fig. 3.5) some constant fraction of the probability is contained in the interval [−1/√d, 1/√d].

2. The squares of the projection-lengths onto each of the k dimensions are "nearly independent" random variables, so long as k is small relative to d.

Johnson and Lindenstrauss pushed this argument through but there is an easier way to get there, by just slightly changing the construction.


3.6.2 JL: a similar, and easier to analyze, method

d Pick k vectors w1, w2,..., wk ∈ R independently from the spherically symmetric Gaussian density with standard deviation 1, i.e., from the probability density on x = (x1,..., xd):

η(x) = (1 / (2π)^{d/2}) exp( −(1/2) ∑_{i=1}^d x_i² )

A few notes:

1. the projection of this density on any line through the origin is the 1D Gaussian with standard deviation 1, i.e., the density (1/√(2π)) exp(−x²/2). (Follows immediately from the formula, by factoring out the one dimension against an entire "conditioned" Gaussian on the remaining d − 1 dimensions.)

2. The distribution is invariant under the orthogonal group. (Follows immediately from the formula.)

3. The coordinates x1, x2 etc. are independent rvs. (Follows immediately from the formula.)

Set

W = [ ··· w_1 ··· ]
    [ ··· w_2 ··· ]
    [      ⋮      ]
    [ ··· w_k ··· ]

(The rows of W are the vectors w_i.) Then, for v ∈ R^d set h(v) = Wv.

By Notes 1 and 3, each entry of W is an iid random variable with density (1/√(2π)) exp(−x²/2). Informally, this process is very similar to that of JL, although it is certainly not identical. Individual entries of W can (rarely) be very large, and rows are not going to be exactly orthogonal, although they will usually be quite close to orthogonal.

Because of Note 2, analysis of this method boils down, just as for the original JL construction, to showing a concentration result for the length of the first column of W, which we denote w^1. Because of Note 3, the expression ‖w^1‖² = ∑_{i=1}^k w_i1² gives the LHS as the sum of independent, and by Note 1 iid, rvs. This will enable us to show concentration through a Chernoff bound.

So now our situation is that our projection of (any particular) unit vector in the original space has the distribution of a vector whose coordinates w_11, ..., w_k1 are iid normally distributed with E(w_i1²) = 1. So E(∑ w_i1²) = k. We want a deviation bound on ∑ w_i1².

There is a name for these rvs: each w_i1² is a "Chi Squared" rv with parameter 1, and their sum is a Chi Squared rv with parameter k.

Set random variables y_i = w_i1² − 1 so that E(y_i) = 0. With this change of variables we now want a bound on the deviation from 0 of the rv ȳ = (1/k) ∑_{i=1}^k y_i. To get a Chernoff bound, we need the mgf, g(β), for y_i, in order to use Eqn. 3.8 to write:

P(ȳ/ε > 1) < [ inf_{β>0} e^{−εβ} g(β) ]^k    for ε ≠ 0. (3.14)


Figure 3.6: Probability density of (1/k) ∑ w_i1², shown for k = 1, 2, 3, 4, 10

So what is g(β)?

g(β) = E(e^{βy}) = E(e^{β(w²−1)})
     = e^{−β} ∫_{−∞}^{∞} (1/√(2π)) e^{w²(β−1/2)} dw
     = (e^{−β}/√(1−2β)) ∫_{−∞}^{∞} (√(1−2β)/√(2π)) e^{−(1/2) w² (1−2β)} dw
     = e^{−β}/√(1−2β)

The last equality follows as the integrand is the density of a normal random variable with standard deviation 1/√(1−2β).

Thus, g(β) is well defined and differentiable in (−∞, 1/2), with (necessarily) g(0) = 1 (which recall from (3.7) holds for the mgf of any probability measure), and g′(0) = 0 (because g′(0) = the first moment of the probability measure, that's why it's called the moment generating function, recall (3.11); and we have centered the distribution at 0).

For a given ε what β should be used in the Chernoff bound (Eqn. 3.14)? After some calculus, we find that β = ε/(2(1+ε)) is the best value (for both signs of ε). Figure (3.7) shows the dependence of β on ε. Plugging this value of β above into the bound, we get
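The claim that β = ε/(2(1 + ε)) minimizes e^{−εβ} g(β) is easy to verify numerically (a sketch, not part of the notes); the minimum value works out to (1 + ε)^{1/2} e^{−ε/2}, the base in Eqn. (3.15).

```python
# Verify numerically that beta = eps/(2(1+eps)) minimizes e^{-eps*beta} g(beta)
# for the centered chi-squared mgf g(beta) = e^{-beta} / sqrt(1 - 2*beta),
# and that the minimum equals (1+eps)^{1/2} e^{-eps/2}.
import math

def g(beta):                      # valid for beta < 1/2
    return math.exp(-beta) / math.sqrt(1 - 2 * beta)

def objective(beta, eps):
    return math.exp(-eps * beta) * g(beta)

eps = 0.3
betas = [i / 100000 for i in range(-40000, 49999)]   # grid in (-0.4, 0.5)
best_beta = min(betas, key=lambda b: objective(b, eps))
assert abs(best_beta - eps / (2 * (1 + eps))) < 1e-3
assert abs(objective(best_beta, eps)
           - math.sqrt(1 + eps) * math.exp(-eps / 2)) < 1e-8
```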

P(ȳ/ε > 1) < ( (1 + ε)^{1/2} e^{−ε/2} )^k (3.15)

which we incidentally note is (1 − ε²/4 + O(ε³))^k. The function (1 + ε)^{1/2} e^{−ε/2} is shown in Fig. 3.8.

Now let's apply this bound to the modified JL construction. We will ensure distortion (Defn. 66) e^δ (with positive probability) by showing that for each of our (n choose 2) vectors v, with probability > 1 − 1/(2 (n choose 2)),

‖v‖ e^{−δ/2} ≤ (1/√k) ‖Wv‖ ≤ ‖v‖ e^{δ/2}.


Figure 3.7: Best choice of β as a function of ε for the χ² distribution: β = ε/(2(1 + ε))

Figure 3.8: Base of the Chernoff bound for the χ² distribution: c_ε = (1 + ε)^{1/2} e^{−ε/2}

We already argued, by the invariance of our construction under the orthogonal group, that for any v this has the same probability as the event

e^{−δ/2} ≤ √( (1/k) ∑ w_i1² ) ≤ e^{δ/2},

i.e.,

e^{−δ} ≤ (1/k) ∑ w_i1² ≤ e^{δ},

or equivalently

e^{−δ} − 1 ≤ ȳ ≤ e^{δ} − 1. (3.16)

Applying the Chernoff bound (3.15) first on the right of (3.16), we have

Pr(ȳ > e^δ − 1) < e^{k(δ/2 − (e^δ − 1)/2)} = e^{(k/2)(1 + δ − e^δ)} < e^{−kδ²/4}

Next applying the Chernoff bound (3.15) on the left of (3.16), we have

Pr(ȳ < e^{−δ} − 1) < e^{k(−δ/2 − (e^{−δ} − 1)/2)} = e^{(k/2)(1 − δ − e^{−δ})} < e^{−k(δ²/4 + O(δ³))}


In all, taking k = 8(1 + O(δ)) δ^{−2} log n suffices so that Pr( (1/k) ∑ w_i1² ∉ [e^{−δ}, e^{δ}] ) < 1/n², and therefore the mapping with probability at least 1/2 has distortion bounded by e^δ.

Finally, for the computational aspect: to get a randomized "Las Vegas" algorithm simply try matrices W at random and examine each to test whether the distortion is satisfactory.
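The whole construction is short enough to sketch in code (an illustration, not the notes' implementation): multiply the points by a k × d matrix of iid standard Gaussians, rescale by 1/√k, and measure the worst pairwise distortion. The specific n, d, k below are illustrative choices.

```python
# Sketch of the modified JL map: project n points in R^d through a k x d
# matrix of iid standard Gaussians, scale by 1/sqrt(k), and measure the
# worst distortion of pairwise distances.
import math
import random

random.seed(0)

def jl_map(points, k):
    d = len(points[0])
    W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    scale = 1 / math.sqrt(k)
    return [[scale * sum(W[i][j] * p[j] for j in range(d)) for i in range(k)]
            for p in points]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def max_distortion(points, images):
    worst = 1.0
    n = len(points)
    for a in range(n):
        for b in range(a + 1, n):
            ratio = dist(images[a], images[b]) / dist(points[a], points[b])
            worst = max(worst, ratio, 1 / ratio)
    return worst

# 30 random points in dimension d = 200, projected down to k = 100 dimensions.
pts = [[random.gauss(0, 1) for _ in range(200)] for _ in range(30)]
imgs = jl_map(pts, 100)
assert max_distortion(pts, imgs) < 2  # modest distortion despite halving the dimension
```

A Las Vegas algorithm, as in the text, would simply redraw W whenever the distortion test fails.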

Note: About another embedding question: finite ℓ2 metric spaces can be embedded in ℓ1 isometrically. There's also an algorithm—deterministic, in fact—to find such an embedding, but it takes exponential time in the number of points in the space.

Comment: There are deterministic poly-time algorithms producing an embedding up to the standards of the Johnson-Lindenstrauss theorem, see Engebretsen, Indyk and O'Donnell [31], Sivakumar [88].


3.7 Lecture 20 (13/Nov): Bourgain embedding X → Lp, p ≥ 1

In the previous result, we saw how an already "rigid" metric, namely an L2 metric, could be embedded in reduced dimension. Now we will see how a relatively "loose" object, just a metric space, can be embedded in a more rigid object, namely a vector space with an Lp norm. There will be a price in distortion to pay for this.

Theorem 68 (Bourgain [21]). Any metric (X, d) with n = |X| can be embedded in Lp(O(log² n)), p ≥ 1, with distortion O(log n). There is a randomized poly-time algorithm to find such an embedding.

Some comments are in order.

Dimension: The dimension bound here is actually due not to Bourgain but to Linial, London and Rabinovich [67]. Also, Bourgain showed embedding into L2; after we prove the L1 result we'll show how it also implies all p ≥ 1. A later variation of the Bourgain proof that achieves dimension O(log n) is due to Abraham, Bartal and Neiman [1].

Derandomization: It will follow from ideas we see soon, that there is a deterministic algorithm to construct a Bourgain embedding into dimension poly(n). This will be on a problem set. It is also possible, by the method of conditional probabilities, to reduce the dimension to O(log² n); we probably won't have time to discuss this.

Distortion: The distortion in the theorem is best possible: expander graphs require it. However, there are open questions for restricted classes of metrics: for example whether the distortion can be improved, possibly to a constant, for shortest path metrics in planar graphs. See [52] for a survey on metric embeddings from 2004.

3.7.1 Embedding into L1: Weighted Fréchet embeddings

Proof. Since the domain of our mapping is merely a metric space rather than a normed space, we cannot apply anything like the JL technique, and something quite different is called for. Bourgain's proof employs a type of embedding introduced much earlier by Fréchet [38]. The one-dimensional Fréchet embedding imposed by a set ∅ ⊂ T ⊆ X is the mapping

τ : X → R+

τ(x) = d(x, T) := min_{t∈T} d(x, t)

Observe that by the triangle inequality for d, |τ(x) − τ(y)| ≤ d(x, y). So τ is a contraction of X into the metric on the line.

We can also combine several such Ti’s in separate coordinates. If we label the respective mappings τi and give each a nonnegative weight wi, with the weights summing to 1—that is to say, the weights form a probability distribution:

τ(x) = (w_1 τ_1(x), ..., w_k τ_k(x))

then we can consider the composite mapping τ as an embedding into L1(k) and it too is contractive, namely, ‖τ(x) − τ(y)‖_1 ≤ d(x, y).
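The contraction property of a weighted Fréchet embedding is easy to check mechanically. Here is a small sketch (not from the notes; the 5-cycle graph metric and the random choice of the T_i are illustrative assumptions):

```python
# Weighted Frechet embedding of a toy graph metric; check that it is
# contractive in the l1 norm.
import random

random.seed(1)

# shortest-path metric of a 5-cycle on vertices 0..4
n = 5
def d(x, y):
    diff = abs(x - y)
    return min(diff, n - diff)

def frechet_coord(x, T):         # one-dimensional Frechet map: d(x, T)
    return min(d(x, t) for t in T)

# a few random nonempty subsets T_i, with uniform weights summing to 1
Ts = [random.sample(range(n), random.randint(1, n)) for _ in range(8)]
w = 1 / len(Ts)

def tau(x):
    return [w * frechet_coord(x, T) for T in Ts]

def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

for x in range(n):
    for y in range(n):
        assert l1(tau(x), tau(y)) <= d(x, y) + 1e-12  # contraction
```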

So the key part of the proof is the lower bound.

Let s = ⌈lg n⌉. For 1 ≤ t ≤ s and 1 ≤ j ≤ s′, with s′ ∈ Θ(s), choose set T_tj by selecting each point of X independently with probability 2^{−t}. Let all the weights be uniform, i.e., 1/ss′. This defines an


embedding τ = (..., τ_tj, ...)/ss′ of the desired dimension. (If any T_tj is empty, let τ_tj map all x ∈ X to 0.) We need to show that with positive probability

∀x, y ∈ X : ‖τ(x) − τ(y)‖_1 ≥ Ω(d(x, y)/s).

Just as in JL, in the proof we focus on a single pair x 6= y and show that with probability greater than 1 − 1/n2 (enabling a union bound) it is embedded with the desired distortion O(s).

3.7.2 Good things can happen

We use this notation for open balls:

Br(x) = {z : d(x, z) < r}

and for closed balls, B¯r(x) = {z : d(x, z) ≤ r}. Recall that we are now analyzing the embedding for any fixed pair of points x, y.

Let ρ_0 = 0 and, for t > 0 define

ρ_t = sup{ r : |B_r(x)| < 2^t or |B_r(y)| < 2^t } (3.17)

up to t̂ = max{t : RHS < d(x, y)/2}. It is possible to have t̂ = 0 (for instance if no other points are near x and y).

Observe that for the closed balls B̄ we have that for all t ≤ t̂, |B̄_{ρ_t}(x)| ≥ 2^t and |B̄_{ρ_t}(y)| ≥ 2^t. This means in particular that (due to the radius cap at d(x, y)/2, which means that y is excluded from these balls around x and vice versa), t̂ < s.

Set ρ_{t̂+1} = d(x, y)/2, which means that it still holds for t = t̂ + 1 that |B_{ρ_t}(x)| < 2^t or |B_{ρ_t}(y)| < 2^t, although (in contrast to t ≤ t̂), ρ_{t̂+1} is not the largest radius for which this holds.

Note t̂ + 1 ≥ 1. Also, ρ_{t̂+1} > ρ_{t̂} (because the latter was defined to be less than d(x, y)/2). But for t ≤ t̂ it is possible to have ρ_t = ρ_{t−1}.

t̂ + 1 will be the number of scales used in the analysis of the lower bound for the pair x, y. I.e., we use the sets T_tj for 0 ≤ t ≤ t̂ + 1. Any contribution from higher-t (smaller expected cardinality) sets is "bonus." Consider any 1 ≤ t ≤ t̂ + 1.

Lemma 69. With positive probability (specifically ≥ (1 − 1/√e)/4), |τ_t1(x) − τ_t1(y)| > ρ_t − ρ_{t−1}.

Proof. Suppose wlog that

|B_{ρ_t}(x)| < 2^t. (3.18)

By Eqn (3.17) (with t − 1), |B̄_{ρ_{t−1}}(y)| ≥ 2^{t−1} (and the same for x but we don't need that). Therefore

|B̄_{ρ_{t−1}}(y)| > |B_{ρ_t}(x)|/2 (3.19)

Now observe that a sufficient condition for

|τ_t1(x) − τ_t1(y)| > ρ_t − ρ_{t−1}

is that the following two events both hold:

T_t1 ∩ B_{ρ_t}(x) = ∅ (3.20)


Figure 3.9: Balls B_{ρ_{t−1}}(x), B_{ρ_t}(x), B_{ρ_{t−1}}(y) depicted. Events 3.20 and 3.21 have occurred, because no point has been selected for T_t1 in the larger-radius (ρ_t) region around x, while some point (marked in red) has been selected for T_t1 in the smaller-radius (ρ_{t−1}) region around y.

and

T_t1 ∩ B̄_{ρ_{t−1}}(y) ≠ ∅ (3.21)

We claim that this conjunction (see Fig. 3.9) happens with constant probability, specifically at least (1 − 1/√e)/4. Instead of proving this here, a version of it will be on the homework, where I'll ask you to show:

Lemma 70. Suppose A, B are disjoint sets with A ∪ B = [m], m > 0 and |A| ≥ cm. Let R_i be pairwise independent binary rvs for i = 1, ..., m, with Pr(R_i = 1) = p (for a value of p to be determined). Let R = {i : R_i = 1}. Then ∀c > 0 ∃d > 0 s.t. ∀m ∃p such that Pr((A ∩ R ≠ ∅) ∧ (B ∩ R = ∅)) ≥ d.

(Note that this is considerably stronger than we need for the lemma since we're assuming only pairwise independence. In exchange we allow some sacrifice in the constant d.)

Thanks to (3.19), Lemma 69 follows by applying Lemma 70 with c = 1/3. 2

Now, let G_{x,y,t} be the "good" event that at least a (1 − 1/√e)/8 fraction of the coordinates at level t, namely {τ_tj}_{j=1}^{s′}, have |τ_tj(x) − τ_tj(y)| > ρ_t − ρ_{t−1}.

If the good event occurs for all t, then for all x, y,

‖τ(x) − τ(y)‖_1 ≥ (1/s) · ((1 − 1/√e)/8) · (d(x, y)/2).

Here the first factor is from the normalization, the second from the definition of good events, and the third from the cap on the ρ_t's.

We can upper bound the probability that a good event Gx,y,t fails to happen using Chernoff:

Pr(¬G_{x,y,t}) ≤ e^{−Ω(s′)}.

To be specific we can use the following version of the Chernoff bound which does not rely on exact knowledge of the mgf (see problem set 4):

Lemma 71. Let F_1, ..., F_{s′} be independent Bernoulli rvs, each with expectation ≥ µ. Then Pr(∑ F_i < (1 − ε)µs′) ≤ e^{−ε²µs′/2}.


This permits us (plugging in ε = 1/2) to take s′ = (32√e/(√e − 1)) log(n² lg n). Now taking a union bound over all x, y, t,

Pr(∪_{x,y,t} ¬G_{x,y,t}) ≤ e^{−Ω(s′)} · n² lg n < 1/2

for a suitable s′ ∈ Θ(log n). 2

Exercise: Form a Fréchet embedding X → R^n by using as T_i's all singleton sets. Argue that this is an isometry of X into L∞(n). Consequently L∞ is universal for finite metrics. (This, I believe, was Fréchet's original result [38].)

3.7.3 Aside: H¨older’s inequality

Although we already proved the PMI directly in Lemma 22, it is worth seeing how the PMI fits into a broader framework of inequalities. The PMI is a comparison between two integrals over a probability space (i.e., the total measure of the space is 1). The power mean inequality follows immediately from an important inequality that holds for any measure space (and indeed generalizes the Cauchy-Schwarz inequality), Hölder's inequality:

Lemma 72 (Hölder). For norms with respect to any fixed measure space, and for 1/p + 1/q = 1 (p and q are "conjugate exponents"), ‖f‖_p · ‖g‖_q ≥ ‖fg‖_1.

To see the PMI from this, note that over a probability space, ‖f‖_p is simply a p'th mean. Now plug in the function g = 1 and Hölder gives you the PMI.
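A quick numerical sanity check of Hölder's inequality over a finite probability space (a sketch, not part of the notes):

```python
# Check Holder's inequality ||f||_p * ||g||_q >= ||f g||_1 over a finite
# probability space, for several pairs of conjugate exponents.
import random

random.seed(2)

def mean(vals, weights):
    return sum(w * v for v, w in zip(vals, weights))

def p_norm(f, weights, p):
    return mean([abs(x) ** p for x in f], weights) ** (1 / p)

n = 10
weights = [1 / n] * n                      # uniform probability space
f = [random.uniform(0, 5) for _ in range(n)]
g = [random.uniform(0, 5) for _ in range(n)]

for p, q in [(2, 2), (3, 1.5), (4, 4 / 3)]:   # 1/p + 1/q = 1
    lhs = p_norm(f, weights, p) * p_norm(g, weights, q)
    rhs = mean([abs(a * b) for a, b in zip(f, g)], weights)
    assert lhs >= rhs - 1e-12

# The PMI special case: with g = 1, ||f||_p >= ||f||_1, i.e. p'th mean >= mean.
assert p_norm(f, weights, 3) >= mean(f, weights) - 1e-12
```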

Chapter 4

Limited independence

4.1 Lecture 21 (16/Nov): Pairwise independence, improved proof of coding theorem using linear codes

Very commonly, in Algorithms, we have a tradeoff between how much randomness we use, and efficiency. But sometimes we can actually improve our efficiency by carefully eliminating some of the randomness we're using. Roughly, the intuition is that some of the randomness is going not toward circumventing a barrier (especially, leaving the adversary in the dark about what we are going to do), but just into noise, and costs.1

A case in point is the proof of Shannon's Coding Theorem, for error correction over channels which flip bits independently with probability p < 1/2. In a previous lecture we proved the theorem as follows: we first built an encoding map E : {0, 1}^k → {0, 1}^n by sampling a uniformly random function; then, we had to delete up to half the codewords to eliminate all kinds of fluctuations in which codewords fell too close to one another.

It turns out that this messy solution can be avoided. The key observation is that our analysis depended only on pairwise data about the code—basically, pairwise distances between codewords. "Higher level" structure (mutual distances among triples, etc.) didn't feature in the analysis. So the argument will still go through with a pairwise-independently constructed code. So we'll do this now, and in the process we'll see how this helps.

Sample E from the following pairwise independent family of functions {0, 1}^k → {0, 1}^n. Select k vectors v_1, ..., v_k iid ∈_U {0, 1}^n. Now map the vector (x_1, ..., x_k) to ∑_{i=1}^k x_i v_i. This is, of course, a linear map, consisting of multiplication by the generator matrix G whose rows are the v_i:

                [ − − − v_1 − − − ]
(message x) ·   [ − − − v_2 − − − ]   = (codeword)
                [ − − − ... − − − ]
                [ − − − v_k − − − ]

The message $\bar 0 \in \{0,1\}^k$ is always mapped to the codeword $\bar 0 \in \{0,1\}^n$, and every other codeword is uniformly distributed in $\{0,1\}^n$. It is not hard to see that the images of the messages are pairwise independent. (Including even the image of the $\bar 0$ message.)
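As an added sanity check (not part of the notes), the pairwise independence of the images under the random linear map can be verified exhaustively for tiny parameters; here $k = n = 2$, and the helper names are ours:

```python
# Exhaustive check, for tiny parameters (k = n = 2), that the random linear map
# x -> xG over GF(2) gives pairwise-independent codewords: for two distinct
# nonzero messages x, x', the pair (xG, x'G) is uniform over all 2^n * 2^n
# values when G is a uniformly random k x n binary matrix.
from itertools import product
from collections import Counter

k, n = 2, 2

def encode(x, G):
    # xG over GF(2): XOR together the rows v_i selected by the bits of x.
    word = (0,) * n
    for xi, row in zip(x, G):
        if xi:
            word = tuple(a ^ b for a, b in zip(word, row))
    return word

x, xp = (1, 0), (0, 1)  # two distinct messages
matrices = list(product(product((0, 1), repeat=n), repeat=k))  # all 2^(kn) G's
counts = Counter()
for G in matrices:
    counts[(encode(x, G), encode(xp, G))] += 1

# Uniform over the 2^n * 2^n possible codeword pairs:
assert len(counts) == 2**n * 2**n
assert all(c == len(matrices) // (2**n * 2**n) for c in counts.values())
```

For these two messages the images are exactly the two rows of $G$, so uniformity of the pair is immediate; the check confirms the bookkeeping.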

1If you pack a tent, you’ll spend the night on the mountain. – a climbing instructor of mine


Let's see why: say the two messages are $x \neq x'$. W.l.o.g. $x' \neq 0$. Now, we want to show that $\forall y, y'$, $\Pr(x'G = y' \mid xG = y) = 2^{-n}$. Since $x' \neq x$ (and we are over GF(2)), this means that $x' \notin \mathrm{span}(x)$. Consequently, there exists a $G$ s.t. $xG = y$, $x'G = y'$. But since there exists such a $G$, call it $G_0$, the number of $G$'s satisfying this pair of equations does not depend upon $y'$; the set of such $G$'s is simply equal to all $G_0 + G'$ where $x, x' \in \ker G'$. The number of such $G'$ depends only upon $\dim \mathrm{span}(x, x')$. (If you want a more concrete argument, you can change basis to where $x$, if nonzero, is a singleton vector, and $x' - x$ is another singleton vector. Then the row of $G$ corresponding to $x$ is $y$, the row corresponding to $x' - x$ is $y' - y$, and the rest of the matrix can be anything.)

Now let's remember some of the settings we used for this theorem in Section (??):

(1) The code rate is (3.4): $n = \dfrac{k}{D_2(p\|1/2) - \varepsilon}$;

(2) First upper bound on $\delta$ is (3.5): $p + \delta < 1/2$;

(3) Second upper bound on $\delta$ is (3.6): $D_2(p+\delta\|1/2) > D_2(p\|1/2) - \varepsilon/2$;

And finally we make $\delta$ as large as we can subject to these constraints, and set $c = \min\{D_2(p+\delta\|p), \varepsilon/2\} > 0$. Looking back at the analysis of the error probability on message $X$ in Section (??), it had two parts, in each of which we bounded the probability of one of the following two sources of error:

Bad$_1$: $H(E(X)+R,\,E(X)) \ge (p+\delta)n$. That is to say, the error vector $R$ has weight (number of 1's) at least $(p+\delta)n$. This analysis is of course unchanged, and doesn't depend at all on the choice of code. As before, the bound is

$$\Pr_R(\mathrm{Bad}_1) = \Pr_R\big(H(\vec 0, R) \ge (p+\delta)n\big) \le 2^{-D_2(p+\delta\|p)\,n} \le 2^{-cn}.$$

Bad$_2$: $\exists X' \neq X: H(E(X)+R,\,E(X')) \le (p+\delta)n$. For this, pairwise independence is enough to obtain an analysis similar to before. Specifically, for any pair $X \neq X'$ and any $R$, the rv (which now depends only on the choice of code) $R + E(X) - E(X')$ is uniformly distributed in $\{0,1\}^n$ (because $X - X'$ is not the zero string, so $E(X - X')$ is uniform); so the choice of $R$ does not affect $\Pr_R(\mathrm{Bad}_2)$, and we can bound it as

$$\Pr_R(\mathrm{Bad}_2) \le 2^{k - nD_2(p+\delta\|1/2)} = 2^{n(D_2(p\|1/2) - \varepsilon - D_2(p+\delta\|1/2))} \quad \text{from (3.4)}$$
$$\le 2^{-n\varepsilon/2} \quad \text{from (3.6)} \qquad \le 2^{-cn}$$

So, we get the same as before: $\Pr_{E,R}(\text{Error on } X) \le 2^{1-cn}$ for the same $c > 0$ that depends only on $p, \varepsilon$. That is, for every $X$, with $M_X = \Pr_R(\text{Error on } X \mid E)$, we have

$$E_E(M_X) \le 2^{1-cn} \qquad (4.1)$$

Next, just as before, we wish to remove E from the randomization in the analysis. In order to do this it helps to consider the uniform distribution over messages X and derive from Eqn. 4.1 the weaker

$$E_{X,E}(M_X) \le 2^{1-cn} \qquad (4.2)$$

The reason is that this weaker guarantee is maintained even if we now modify the decoding algorithm so that it commutes with translation by codewords. Specifically, no matter what the decoder did before, set it now so that $D(Y)$ is uniformly sampled among "max-likelihood" decodings of $Y$,


which is equivalent (thanks to the uniformity over $X$ and to the noise $R$ being independent of $X$) to those $X$ which minimize $H(E(X), Y)$. For the uniform distribution, max-likelihood decoding minimizes the average probability of error, so this new decoder $D$ also satisfies (4.2). The new decoder has the commutation advantage that we promised: for any $E$,

$$D(E(X) + R) = D(E(X)) + D(R) \quad \text{(commutes with translation by code)}$$
$$= X + D(R) \quad \text{(decoding correct on codewords)} \qquad (4.3)$$

As a consequence,

$$\text{For all } E, X_1, X_2: \quad \Pr_R(\text{Error on } X_1 \mid E) = \Pr_R(\text{Error on } X_2 \mid E).$$

So we can define a variable M which is a function of E,

$$M = \Pr_R(\text{Error on } \bar 0 \mid E) = \Pr_R(\text{Error on } X \mid E) \ \text{ for all } X$$
and we have
$$E_E(M) \le 2^{1-cn}.$$
Since $M \ge 0$, $\Pr_E(M > 2^{2-cn}) < 1/2$, and so if we just pick linear $E$ at random, there is probability $\ge 1/2$ that (using the already-described decoder $D$ for it), for all $X$ the decoding-error probability is $\le 2^{2-cn}$.

What is much more elegant about this construction than about the preceding fully-random-$E$ construction is that no $X$'s with high error probabilities need to be thrown away. The set of codewords is always just a linear subspace of $\{0,1\}^n$. The code also has a very concise description, $O(k^2)$ bits (recall $n \in \Theta(k)$); whereas the previous full-independence approach gave a code with description size exponential in $k$.

One comment: although picking a code at random is easy, checking whether it indeed satisfies the desired condition is slow. One can either do this exactly, in time exponential in $n$, by exhaustively considering $R$'s; or one can try to estimate the probability of error by sampling $R$, but even this requires time inverse in the decoding-error probability before we see error events and can get a good estimate of the error probability. In particular we cannot certify a good code this way in time less than $2^{cn}$.


4.2 Lecture 22 (18/Nov): Pairwise independence, second moment inequality, G(n, p) thresholds

Recall that Chebyshev's inequality (19) says: If $E(X) = \theta$, then $\Pr(|X - \theta| > \lambda\sqrt{\mathrm{Var}(X)}) < 1/\lambda^2$. A common situation in which we use this is when we have many variables which are not fully independent, but are pairwise independent (or nearly so).

Definition 73 (Pairwise and k-wise independence). A set of rvs are pairwise independent if every pair of them are independent; this is a weaker requirement than that all be independent. Likewise, the variables are k-wise independent if every subset of size $k$ is independent.

Definition 74 (Covariance). The covariance of two real-valued rvs $X, Y$ is $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$.

(Here and in what follows we assume that all expectations we use are well-defined, i.e., the underlying integral converges absolutely.)

Exercise: Show that if $X$ and $Y$ are independent then $\mathrm{Cov}(X, Y) = 0$, but that the converse need not be true.

Exercise: If $X = \sum_1^n X_i$, then $\mathrm{Var}\,X = \sum_i \mathrm{Var}(X_i) + \sum_{i\neq j} \mathrm{Cov}(X_i, X_j)$.

Corollary 75. If $X_1, \ldots, X_n$ are pairwise independent real rvs, then $\mathrm{Var}(\sum X_i) = \sum \mathrm{Var}(X_i)$. (We already mentioned this in (3.1).) If in addition they are identically distributed and $\overline X = \frac1n \sum X_i$, then $E(\overline X) = E(X_1)$ and $\mathrm{Var}(\overline X) = \frac1n \mathrm{Var}(X_1)$.
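A tiny illustration (an addition, not from the notes): three pairwise-independent but not mutually independent bits, built from two fair coins, for which the variance of the sum nevertheless splits as in Corollary 75.

```python
# Classic construction: X, Y fair bits, Z = X xor Y. The triple (X, Y, Z) is
# pairwise independent but not mutually independent. Verify, by exact
# enumeration over the 4 equally likely outcomes, that all pairwise covariances
# vanish, hence Var(X+Y+Z) = Var(X)+Var(Y)+Var(Z).
from itertools import product

outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def E(f):
    return sum(f(*o) for o in outcomes) / len(outcomes)

def cov(i, j):
    return E(lambda *o: o[i] * o[j]) - E(lambda *o: o[i]) * E(lambda *o: o[j])

assert all(abs(cov(i, j)) < 1e-12 for i in range(3) for j in range(3) if i != j)

var_sum = E(lambda x, y, z: (x + y + z) ** 2) - E(lambda x, y, z: x + y + z) ** 2
assert abs(var_sum - (cov(0, 0) + cov(1, 1) + cov(2, 2))) < 1e-12  # = 3/4

# But not 3-wise independent: Z is determined by X and Y.
assert all(z == x ^ y for x, y, z in outcomes)
```

Each bit has variance $1/4$, so $\mathrm{Var}(X+Y+Z) = 3/4$ despite the full dependence of $Z$ on $(X, Y)$.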

The Chebyshev inequality and Corollary 75 give us:

Lemma 76 (2nd moment inequality). If $X_1, \ldots, X_n$ are identically distributed, pairwise-independent real rvs with finite 1st and 2nd moments, then $\Pr\big(|\overline X - E(\overline X)| > \lambda\sqrt{\mathrm{Var}(X_1)/n}\big) < 1/\lambda^2$.

Corollary 77 (Weak Law). Pairwise independent rvs obey the weak law of large numbers. Specifically, if $X_1, \ldots, X_n$ are identically distributed, pairwise-independent real rvs with finite variance, then for any $\varepsilon > 0$, $\lim_{n\to\infty}\Pr(|\overline X - E(\overline X)| > \varepsilon) = 0$.

So we see that the weak law holds under a much weaker condition than full independence. When we talk about the cardinality of sample spaces, we'll see why pairwise (or small k-wise) independence has a huge advantage over full independence, so that it is often desirable in computational settings to make do with limited independence.

4.2.1 Threshold for H as a subgraph in G(n, p)

Working with low moments of random variables can be incredibly effective, even when we are not specifically looking for limited-independence sample spaces. Here is a prototypical example. "When" does a specific, constant-sized graph $H$ show up as a subgraph of a random graph selected from the distribution $G(n, p)$? We have in mind that we are "turning the knob" on $p$. If $H$ has any edges then when $p = 0$, with probability 1 there is no subgraph isomorphic to $H$. When $p = 1$, with probability 1 such subgraphs are everywhere.² In between, for any finite $n$, the probability is some increasing function of $p$. But we won't take $n$ finite, we will take it tending to $\infty$.

2 Today we focus on H = K4, the 4-clique, but more generally this method will establish the probability of any fixed graph H occurring as a subgraph in G, that is, ∃ injection of V(H) into V(G) carrying edges to edges. This is different from asking that H occur as an induced subgraph of G, which requires also that non-edges be carried to non-edges. That question is different in an essential way: the event is not monotone in G.


So the question is,³ can we identify a function $\pi(n)$ such that in the model $G(n, p(n))$, with $⟦H⟧$ denoting the event that there is an $H$ in the random graph $G$,

(a) If $p(n) \in o(\pi(n))$, then $\lim_n \Pr(⟦H⟧) = 0$.

(b) If $p(n) \in \omega(\pi(n))$, then $\lim_n \Pr(⟦H⟧) = 1$.

It is usual to focus only on $\pi \le 1/2$ since otherwise one may simply consider the complementary property, with threshold $1 - \pi$. Such a function $\pi(n)$ is known as the threshold for appearance of $H$. It follows from work of B. Bollobás and A. G. Thomason that monotone properties (this extends even beyond properties of graphs to general set systems, see [20] or [19] §6 for the statement) always have a threshold function.

Comments: (1) We usually refer to $\pi$ as "the" threshold for our monotone event, but note, the definition only specifies $p$ to within a constant. Sometimes we can determine the transition more sharply. (2) In particular for a monotone graph property $\mathcal E$, i.e., a monotone property invariant under vertex permutations, for any $\varepsilon > 0$ there is a $p(n)$ such that $\Pr_{p(n)}(\mathcal E) \le \varepsilon$ and $\Pr_{p(n)+O(1/\log n)}(\mathcal E) \ge 1 - \varepsilon$. See [40]. This improves (for monotone graph properties) on the general definition if $\pi \in \omega(1/\log n)$.

4.2.2 Most pairs independent: threshold for K4 in G(n, p)

Let S ⊆ {1, . . . , n}, |S| = 4. Let XS be the event that K4 occurs as a subgraph of G at S—that is, when you look at those four vertices, all the edges between them are present. Conflating XS with its indicator function and letting X be the number of K4’s in G, we have

$$X = \sum_S X_S$$

and
$$E(X) = \binom{n}{4} p^6.$$

We are interested in Pr(X > 0). Let π(n) = n−2/3.

(a) For $p(n) \in o(\pi(n))$, $E(X) \in o(1)$, so $\Pr(⟦K_4⟧) \in o(1)$ and therefore $\lim_n \Pr(⟦K_4⟧) = 0$.

(b) For $1 > p(n) \in \omega(\pi(n))$, $E(X) \in \omega(1)$. We'd like to conclude that likely $X > 0$, but we do not have enough information to justify this, as it could be that $X$ is usually 0 and occasionally very large.⁴ We will exclude that possibility for $K_4$ by studying the next moment of the distribution.

Before carrying out this calculation, though, we have to make one important note. Since the event $⟦K_4⟧$ is monotone, $[p \le p'] \Rightarrow [\Pr_{G(n,p)}⟦K_4⟧ \le \Pr_{G(n,p')}⟦K_4⟧]$. (An easy way to see this is by choosing reals iid uniformly in $[0,1]$ at each edge, and placing the edge in the graph if the rv is below the threshold $p$ (respectively $p'$).) This means that it is enough to show that $K_4$ "shows up" slightly above $\pi$. This is useful because some of our calculations break down far above $\pi$: not because there is anything wrong with the underlying statement, but because the inequalities we use are not strong enough to be useful there, and a direct calculation would need to take account of further moments. To simplify our remaining calculations, then, let

$$p = n^{-2/3}g(n), \quad\text{so}\quad n^4 p^6 = g^6$$
for any sufficiently small $g(n) \in \omega(1)$; we'll see how this is helpful in the calculations.

³Recall $p(n) \in o(\pi(n))$ means that $\limsup p(n)/\pi(n) = 0$, and $p(n) \in \omega(\pi(n))$ means that $\limsup \pi(n)/p(n) = 0$.
⁴When we study not $K_4$-subgraphs, but other subgraphs, this can really happen. We'll discuss this below.


By an earlier exercise,
$$\mathrm{Var}(X) = \sum_S \mathrm{Var}(X_S) + \sum_{S \neq T} \mathrm{Cov}(X_S, X_T)$$

$X_S$ is a coin (or Bernoulli rv) with $\Pr(X_S = 1) = p^6$. The variance of such an rv is $p^6(1-p^6)$. The covariance terms are more interesting.

1. If $|S \cap T| \le 1$, no edges are shared, so the events are independent and $\mathrm{Cov}(X_S, X_T) = 0$.
2. If $|S \cap T| = 2$, one edge is shared, and a total of 11 specific edges must be present for both cliques to be present. A simple way to bound the covariance is (since $E(X_S), E(X_T) \ge 0$) that $\mathrm{Cov}(X_S, X_T) = E(X_S X_T) - E(X_S)E(X_T) \le E(X_S X_T) = p^{11}$.
3. If $|S \cap T| = 3$, three edges are shared, and a specific 9 edges must be present for both cliques to be present. Similarly to the previous case, $\mathrm{Cov}(X_S, X_T) \le p^9$.

$$\mathrm{Var}(X) \le \binom{n}{4}p^6(1-p^6) + \binom{n}{2,2,2}p^{11} + \binom{n}{3,1,1}p^9$$
$$\in O(n^4 p^6 + n^6 p^{11} + n^5 p^9)$$
$$= O(g^6 n^{4-4} + g^{11} n^{6-22/3} + g^9 n^{5-6}) \qquad \text{from } p = n^{-2/3}g(n)$$
$$= O(g^6 + g^{11} n^{-4/3} + g^9 n^{-1}) \qquad (4.4)$$
$$= O(g^6) \qquad \text{provided } g^5 \in O(n^{4/3}) \text{ and } g^3 \in O(n)$$

This gives us the key piece of information. For g ∈ ω(1) but not too large, we have

$$\frac{\mathrm{Var}(X)}{(E(X))^2} \in \frac{O(g^6)}{\Theta((n^4 p^6)^2)} = \frac{O(g^6)}{\Theta(g^{12})} = O(g^{-6}) \subseteq o(1)$$

and we have only to apply the Chebyshev inequality (Cor. 20) (or better yet Paley-Zygmund, Lemma 84 which we haven’t proven yet) to conclude that Pr(X = 0) ∈ o(1) and so

$$\lim_n \Pr(⟦K_4⟧) = 1. \qquad (4.5)$$

Since $⟦K_4⟧$ is a monotone event, (4.5) holds even for $g$ above the range we needed for the calculation to hold. (Note, though, that since there is so much "room" in the calculation, we could even have used the upper bound $O(g^{11})$ in (4.4), and not resorted to this monotonicity argument.)

Exercise: Show that the threshold for appearance, as a subgraph, of the graph with 5 edges and 4 vertices is $n^{-4/5}$.

Comment: For a general $H$ the threshold for appearance of $H$ in $G(n,p)$ as a subgraph is determined not by the ratio $\rho_H$ of edges to vertices, but by the maximum of this ratio over induced subgraphs of $H$, call it $\rho_{\max H}$. We'll see this on a problem set (and see [7]). If these numbers are different then above $n^{-1/\rho_H}$ the expected number of $H$'s starts tending to $\infty$ but almost certainly we have none; once we cross the higher threshold $n^{-1/\rho_{\max H}}$, there is an "explosion" of many of these subgraphs appearing. (They show up highly intersecting in the fewer copies of the critical induced subgraph.)
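A quick Monte Carlo illustration (our own addition; `has_k4` and the parameters are ours, and $n$ is far too small for the asymptotics to bite): $K_4$'s are essentially surely absent well below the threshold $\pi(n) = n^{-2/3}$ and essentially surely present well above it.

```python
# Sample G(n, p) and look for a K4, for p far below / far above n^(-2/3).
import random
from itertools import combinations

def has_k4(n, p, rng):
    adj = [[False] * n for _ in range(n)]
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u][v] = adj[v][u] = True
    # Check all 4-subsets for a complete clique.
    return any(all(adj[a][b] for a, b in combinations(S, 2))
               for S in combinations(range(n), 4))

rng = random.Random(0)
n = 30                          # pi(30) = 30^(-2/3) ~ 0.10
assert not has_k4(n, 0.01, rng)  # far below threshold: E(#K4) ~ 3e-8
assert has_k4(n, 0.5, rng)       # far above threshold: E(#K4) ~ 430
```

At $p = 0.5$ the expected number of $K_4$'s is $\binom{30}{4}/2^6 \approx 430$ and the second-moment argument says the count concentrates, so presence is overwhelmingly likely.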


4.3 Lecture 23 (20/Nov): Limited independence: near-pairwise for primes, 4-wise for Khintchine-Kahane

4.3.1 Turán's proof of a theorem of Hardy and Ramanujan

(Adapted from Alon & Spencer.) Let m(k) be the number of primes dividing k. Always m(k) ≤ lg k. But usually this is a vast overestimate, because m(k) is almost always close to log log k.

Theorem 78 (Hardy & Ramanujan, 1920). Let 1 ≤ λ, and sample k ∈U [n]. Then

$$\Pr\big(|m(k) - \log\log n| > \lambda\sqrt{\log\log n}\big) < \frac{1+o(1)}{\lambda^2}. \qquad (4.6)$$

Proof. We show an elegant proof due to Turán, 1934. The theorem is trivial for $\lambda > \lg n$, so we consider only $\lambda \le \lg n$. We're now going to do a minor technical trick that will be useful toward the end of the argument. Note that $k$ can have at most one prime factor greater than $\sqrt n$, so if we define $\bar m(k)$ to be the number of prime factors of $k$ that are $\le \sqrt n$, it is enough to show (4.6) for $\bar m(k)$. Although we said that most $k$ have about $\log\log k$ prime factors, the bound in the theorem is stated in terms of $n$, and so naturally, we cannot expect the bound to hold in the rare case that we sample $k$ much smaller than $n$. Separating out these cases by thresholding $k$ at $\sqrt n$ (there is no connection between this threshold and that for the prime factors; indeed either could have been chosen at any fixed polynomial in $n$) is our first step.

Let $B$ be the event that $|\bar m(k) - \log\log n| > \lambda\sqrt{\log\log n}$.
$$\Pr(B) \le \Pr(k \le \sqrt n) + \Pr(B \mid k > \sqrt n)\Pr(k > \sqrt n) \le \frac{1}{\sqrt n} + \Pr(B \mid k > \sqrt n)$$

Since $\lambda \le \lg n$, we are guaranteed that $\frac{1}{\sqrt n} \in o(\lambda^{-2})$, so we need only show that

$$\Pr(B \mid k > \sqrt n) < \frac{1+o(1)}{\lambda^2}.$$
This will follow from showing
$$E(\bar m(k)) = \log\log n + o(\sqrt{\log\log n}) \qquad (4.7)$$

$$\mathrm{Var}(\bar m(k)) \le (1+o(1))\log\log n \qquad (4.8)$$
and applying the Chebyshev inequality.

For fixed $a$ and $k \in_U [n]$, let $⟦a|k⟧$ be the indicator rv for $a$ dividing $k$. Throughout the following expressions, $p$ and $q$ are always prime. Note

$$\bar m(k) = \sum_{p \le \sqrt n} ⟦p|k⟧. \qquad (4.9)$$

$$E(⟦p|k⟧) = \lfloor n/p \rfloor / n \qquad (4.10)$$


So
$$\frac1p - \frac1n \ \le\ E(⟦p|k⟧) \ \le\ \frac1p \qquad (4.11)$$

From (4.9) and (4.11) we have

$$-\frac{1}{\sqrt n} + \sum_{p\le\sqrt n}\frac1p \ \le\ E(\bar m(k)) \ \le\ \sum_{p\le\sqrt n}\frac1p \qquad (4.12)$$

Now we use a well-known bound in :

$$\sum_{p\le r}\frac1p = \log\log r + O(1) \qquad (4.13)$$
(where the "$+O(1)$" might be negative). Observe $\log\log\sqrt n = \log(\tfrac12\log n) = \log\log n - \log 2$. So, applying both sides of (4.12),
$$E(\bar m(k)) = \log\log n + O(1)$$

which shows (4.7). It remains to show (4.8).

As always we can write

$$\mathrm{Var}(\bar m(k)) = \sum_{p\le\sqrt n}\mathrm{Var}(⟦p|k⟧) + \sum_{p\neq q\le\sqrt n}\mathrm{Cov}(⟦p|k⟧, ⟦q|k⟧) \qquad (4.14)$$

What we will discover is that the sum of covariances is very small, and so the bound on $\mathrm{Var}(\bar m(k))$ is almost as if we had pairwise independence between the events $⟦p|k⟧$.

A simple bound we've used before is that for a $\{0,1\}$-valued rv $Y$, $\mathrm{Var}(Y) = E(Y)(1-E(Y)) \le E(Y)$. Applying this,

$$\sum_{p\le\sqrt n}\mathrm{Var}(⟦p|k⟧) \ \le\ \sum_{p\le\sqrt n}E(⟦p|k⟧) = E(\bar m(k)) = \log\log n + O(1) \qquad (4.15)$$

Now to handle the covariances. For primes $p \neq q$, $⟦p|k⟧\,⟦q|k⟧ = ⟦pq|k⟧$. Just as we noted for primes in (4.10), we have
$$\frac{1}{pq} - \frac1n \ \le\ E(⟦pq|k⟧) = \lfloor n/pq \rfloor / n \ \le\ \frac{1}{pq}$$

So (and using the lower bound in (4.11)):

$$\mathrm{Cov}(⟦p|k⟧, ⟦q|k⟧) = E(⟦pq|k⟧) - E(⟦p|k⟧)E(⟦q|k⟧) \ \le\ \frac{1}{pq} - \Big(\frac1p - \frac1n\Big)\Big(\frac1q - \frac1n\Big) \ \le\ \frac1n\Big(\frac1p + \frac1q\Big)$$


This is a very low covariance, which is crucial to the theorem.
$$\sum_{p\neq q\le\sqrt n}\mathrm{Cov}(⟦p|k⟧, ⟦q|k⟧) \ \le\ \frac1n\sum_{p\neq q\le\sqrt n}\Big(\frac1p+\frac1q\Big) \ \le\ \frac1n\sum_{p,q\le\sqrt n}\Big(\frac1p+\frac1q\Big) = \frac2n\sum_{p,q\le\sqrt n}\frac1p$$
$$\le\ \frac{2}{\sqrt n}\sum_{p\le\sqrt n}\frac1p \qquad \text{(this is why we switched to } \bar m\text{)}$$
$$=\ \frac{2}{\sqrt n}(\log\log n + O(1)) \in o(1)$$
Combining this with (4.15) in (4.14) gives us (4.8). 2
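An empirical check (our own addition, and with $n$ far too small for the asymptotics to be sharp) that the number of distinct prime factors concentrates near $\log\log n$:

```python
# Sieve the number of distinct prime factors omega(k) for k <= n, and compare
# the average over [n] to log log n. By Mertens' theorem the average is
# log log n + O(1), and Turan's argument says the distribution concentrates.
import math

n = 10**5
omega = [0] * (n + 1)
for p in range(2, n + 1):
    if omega[p] == 0:                      # p is prime (untouched so far)
        for multiple in range(p, n + 1, p):
            omega[multiple] += 1

avg = sum(omega[1:]) / n
loglog = math.log(math.log(n))             # ~ 2.44 for n = 10^5
assert abs(avg - loglog) < 1.0             # average is log log n + O(1)

# Concentration: few k deviate by more than 2*sqrt(log log n).
dev = 2 * math.sqrt(loglog)
frac_far = sum(1 for k in range(1, n + 1)
               if abs(omega[k] - loglog) > dev) / n
assert frac_far < 0.25
```

The deviating fraction is in fact far below the asserted 0.25; the loose constant just keeps the check robust at this small $n$.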

4.3.2 4-wise independent random walk

Earlier we quoted the CLT or Khintchine-Kahane to conclude that the value of the Gale-Berlekamp game is $\Omega(n^{3/2})$. Specifically we used this to show that for a symmetric random walk of length $n$, $X = \sum_1^n X_i$ with $X_i \in_U \{1, -1\}$, $E(|X|) \in \Omega(n^{1/2})$. Now we will show this from first principles—and more importantly, using only information about the 2nd and 4th moments.

This is not only of methodological interest. It makes the conclusion more robust; specifically, the conclusion holds for any 4-wise independent space, and therefore implies a poly-time deterministic algorithm to find a Gale-Berlekamp solution of value $\Omega(n^{3/2})$, because there exist k-wise independent sample spaces of size $O(n^{\lfloor k/2\rfloor})$, as we will show in a later lecture.

Theorem 79. Let $X = \sum_1^n X_i$ where the $X_i$ are 4-wise independent and $X_i \in_U \{1, -1\}$. Then $E(|X|) \in \Omega(n^{1/2})$.

We'll prove a more general version of this next time, but we start with two calculations. These calculations are made easy by the fact that for any product of the form $X_{i_1}^{b_1}\cdots X_{i_4}^{b_4}$, with $i_1, \ldots, i_4$ distinct and $b_i \ge 0$ integer,
$$E(X_{i_1}^{b_1}\cdots X_{i_4}^{b_4}) = \begin{cases} 0 & \text{if any } b_i \text{ is odd} \\ 1 & \text{otherwise} \end{cases}$$
So now
$$E(X^2) = \sum_{i,j} E(X_iX_j) = \sum_i E(X_i^2) = n$$
$$E(X^4) = 3\sum_{i,j} E(X_i^2X_j^2) - 2\sum_i E(X_i^4) = 3n^2 - 2n.$$
One is tempted to apply Chebyshev's inequality (in the form of Cor. 20) to the rv $X^2$, because we know both its expectation and its variance. Unfortunately, the numbers are not favorable! $E(X^2) = n$, $\mathrm{Var}(X^2) = 3n^2 - 2n - n^2 = 2n^2 - 2n$, so all we get is $\Pr(X^2 = 0) \le \mathrm{Var}(X^2)/(E(X^2))^2 \cong \frac{2n^2}{n^2} = 2$, which is useless. (Let alone that we actually need to bound the larger quantity $\Pr(X^2 < cn)$ for some $c > 0$.) Chebyshev's inequality isn't the right tool for this problem. There are two standard tools that work; we'll see one of them next time (and cover the second in an appendix to these lecture notes).
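The two moment computations can be confirmed by exact enumeration over all sign vectors for a small $n$ (a check added here, not in the notes; full independence is in particular 4-wise independence):

```python
# Exact verification of E(X^2) = n and E(X^4) = 3n^2 - 2n for the fully
# independent +-1 walk, by enumerating all 2^n sign vectors.
from itertools import product

n = 6
signs = list(product((1, -1), repeat=n))
E2 = sum(sum(s) ** 2 for s in signs) / len(signs)
E4 = sum(sum(s) ** 4 for s in signs) / len(signs)

assert E2 == n                    # = 6
assert E4 == 3 * n**2 - 2 * n     # = 96
```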


4.4 Lecture 24 (23/Nov): Khintchine-Kahane for 4-wise indepen- dence; begin MIS in NC

4.4.1 Log concavity of moments and Berger’s bound

Lemma 80 (Berger [12]). Let $B$ be a nonnegative rv with $0 < \mu_4(B) < \infty$. Then $\mu_1(B) \ge \dfrac{\mu_2(B)^{3/2}}{\mu_4(B)^{1/2}}$.

For a nonnegative rv $B$, for any $x > 0$ let $\mu_x = E(B^x)$. Lemma 80 is a special case of the following, with $p = 1$, $q = 2$, $r = 4$:

Lemma 81. Let 0 < p < q < r and let B be a nonnegative rv with 0 < µr(B) < ∞. Then

$$\mu_p(B) \ \ge\ \mu_q(B)^{\frac{r-p}{r-q}}\,\mu_r(B)^{-\frac{q-p}{r-q}}.$$

Proof. A more usual way to write this is

$$\mu_q \ \le\ \mu_p^{\frac{r-q}{r-p}}\,\mu_r^{\frac{q-p}{r-p}} \qquad (4.16)$$

Note the exponents sum to 1, and that the average of $p$ and $r$ weighted by the exponents is $q$, i.e.,
$$p\cdot\frac{r-q}{r-p} + r\cdot\frac{q-p}{r-p} = q$$

so (4.16) is a consequence of an important fact, the log-concavity of the moments (i.e., of $\mu_q$ as a function of $q$). We'll show this next. 2

Lemma 82 (Log concavity of the moments). If $\theta$ is a probability distribution on the nonnegative reals then, for all $q$ at which $\int y^q\,d\theta$ converges absolutely, $\frac{\partial^2}{\partial q^2}\log\mu_q \ge 0$.

Proof.

$$\frac{\partial^2}{\partial q^2}\log\mu_q = \frac{\partial^2}{\partial q^2}\log\int y^q\,d\theta = \frac{\partial}{\partial q}\,\frac{\int y^q\log y\,d\theta}{\mu_q}$$
$$= \frac{1}{\mu_q^2}\left(\mu_q\int y^q\log^2 y\,d\theta - \Big(\int y^q\log y\,d\theta\Big)^2\right)$$
$$= \frac{1}{2\mu_q^2}\iint x^q y^q(\log y - \log x)^2\,d\theta(x)\,d\theta(y)$$
$$\ge 0$$

2

Worth noting is that the power-means inequality is another corollary of Lemma 81; if we take $p = 0$, then we have $\mu_p = 1$, and this gives $\mu_r^q \ge \mu_q^r$.
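A numeric spot-check of Lemma 81 (our addition; the distribution is an arbitrary choice):

```python
# Check the moment inequality mu_q^(r-p) <= mu_p^(r-q) * mu_r^(q-p)
# (log-convexity of q -> mu_q) for B uniform on {1, 2, 3, 4}, with
# (p, q, r) = (1, 2, 4).
samples = [1.0, 2.0, 3.0, 4.0]

def mu(x):
    return sum(b ** x for b in samples) / len(samples)

p, q, r = 1, 2, 4
assert mu(q) ** (r - p) <= mu(p) ** (r - q) * mu(r) ** (q - p)

# Equivalently, Berger's form (Lemma 80): mu_1 >= mu_2^(3/2) / mu_4^(1/2).
assert mu(1) >= mu(2) ** 1.5 / mu(4) ** 0.5
```

Here $\mu_1 = 2.5$, $\mu_2 = 7.5$, $\mu_4 = 88.5$, so the inequality reads $7.5^3 = 421.875 \le 2.5^2 \cdot 88.5 = 553.125$.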


4.4.2 Khintchine-Kahane for 4-wise independent rvs

Here is our more general version of Theorem 79.

Theorem 83 (Khintchine-Kahane, 4-wise). Let $a = (a_1,\ldots,a_n)$, $a_i \in \mathbb R$. Let $R_i \in_U \{\pm1\}$, $X_i = R_i a_i$, and set $X = |\sum X_i|$. Then:

1. If the $R_i$ are pairwise independent, $E(X) \le \|a\|_2$.
2. If the $R_i$ are 4-wise independent, $\frac{1}{\sqrt 3}\|a\|_2 \le E(X)$.
3. If the $R_i$ are independent, $\frac{1}{\sqrt 2}\|a\|_2 \le E(X)$.

We’ll only prove the first two parts here. For the third I recommend nice notes of Filmus [35].

Proof. The upper bound is implied by nonnegativity of the variance of |X| (special case of the power-means inequality) and by the calculation

$$E(X^2) = \sum_{1\le i,j\le n} E(X_iX_j) = \sum_{i=1}^n E(X_i^2) = \sum_i a_i^2.$$

The main point of interest is the lower bound. We apply Lemma 80. In order to compute the moments we just slightly generalize the calculation we saw in the previous lecture. For any product of the form $X_{i_1}^{b_1}\cdots X_{i_4}^{b_4}$, with $i_1,\ldots,i_4$ distinct and $b_i \ge 0$ integer,
$$E(X_{i_1}^{b_1}\cdots X_{i_4}^{b_4}) = \begin{cases} 0 & \text{if any } b_i \text{ is odd} \\ a_{i_1}^2 a_{i_2}^2 & \text{if } b_1 = b_2 = 2 \\ a_{i_1}^4 & \text{if } b_1 = 4 \end{cases}$$
So now
$$E(X^4) = 3\sum_{1\le i,j\le n} E(X_i^2X_j^2) - 2\sum_{i=1}^n E(X_i^4) = 3\sum_{i,j} a_i^2a_j^2 - 2\sum_i a_i^4 = 3\Big(\sum a_i^2\Big)^2 - 2\sum a_i^4$$

$$E|X| \ \ge\ \frac{(E(X^2))^{3/2}}{(E(X^4))^{1/2}} = \frac{(\sum a_i^2)^{3/2}}{\big(3(\sum a_i^2)^2 - 2\sum a_i^4\big)^{1/2}} \ \ge\ \frac{(\sum a_i^2)^{3/2}}{\sqrt 3\,\sum a_i^2} = \frac{1}{\sqrt 3}\sqrt{\sum a_i^2}$$
Observe that we lost only a small constant factor here compared with the precise $1/\sqrt 2$ value obtained for a fully-independent sample space from the CLT.
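For intuition, the two bounds of Theorem 83 can be checked by exact enumeration under full independence (our own check, with an arbitrary vector $a$; full independence is in particular 4-wise independent):

```python
# Verify (1/sqrt(3)) ||a||_2 <= E|X| <= ||a||_2 for X = sum R_i a_i with
# independent uniform signs R_i, by exact enumeration of all sign patterns.
from itertools import product
import math

a = [1.0, 2.0, 0.5, 3.0]
norm = math.sqrt(sum(x * x for x in a))

signs = list(product((1, -1), repeat=len(a)))
EX = sum(abs(sum(r * x for r, x in zip(s, a))) for s in signs) / len(signs)

assert norm / math.sqrt(3) <= EX <= norm
```

For this $a$, $E|X| = 3.125$ sits between $\|a\|_2/\sqrt 3 \approx 2.18$ and $\|a\|_2 \approx 3.77$.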

4.4.3 Khintchine-Kahane from Paley-Zygmund (omitted in class)

There is another way to prove the 4-wise independent Khintchine-Kahane. It is slightly weaker in the constant in $E(|X|) \ge \Omega(\sqrt n)$ (we focus here just on the case of all $a_i = 1$) than the strategy we gave above, but it gives an in-probability bound. It relies on the Paley-Zygmund inequality. We'll just write the statement here, and defer proofs to Appendix A.1.

Paley-Zygmund is usually stated as an alternative to the Chebyshev (Cor. 20) lower-tail bound for nonnegative rvs; i.e., it gives a way to say that a nonnegative rv $A$ is "often large."


Let $\mu_i$ be the $i$th moment of $A$. Knowing only the first moment $\mu_1$ of $A$ is not enough, because for any value—even infinite—of the first moment, we can arrange, for any $\delta > 0$, a nonnegative rv $A$ which equals 0 with probability $1 - \delta$, yet has first moment $\mu_1$. We just have to move $\delta$ of the probability mass out to the point $\mu_1/\delta$, or, in the infinite-$\mu_1$ case, spread $\delta$ probability mass out in a measure whose first moment diverges.

However, a finite second moment µ2 is enough to justify such a “usually large” statement. Actually Paley-Zygmund holds for rvs which are not necessarily nonnegative.

Lemma 84 (Paley-Zygmund). Let A be a real rv with positive µ1 and finite µ2. For any 0 < λ ≤ 1,

$$\Pr(A > (1-\lambda)\mu_1) \ \ge\ \frac{\lambda^2\mu_1^2}{\mu_2}.$$

Comment: This gives $\Pr(A \le 0) \le \frac{\mu_2 - \mu_1^2}{\mu_2}$, which improves on the upper bound $\frac{\mu_2 - \mu_1^2}{\mu_1^2}$ of Cor. 20. It should be said though that PZ does not dominate Cor. 20 in all ranges (e.g., if the variance $\mu_2 - \mu_1^2$ is very small compared to $\mu_1^2$, and $\lambda$ is small).
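A small numeric comparison (our addition; the toy distribution is arbitrary): for a rv that is usually 0, Paley-Zygmund is tight exactly where the Chebyshev-style bound says nothing.

```python
# A = 0 with prob 0.9, A = 10 with prob 0.1: mu_1 = 1, mu_2 = 10.
# Paley-Zygmund (lambda = 1) bounds Pr(A <= 0) by (mu2 - mu1^2)/mu2 = 0.9,
# tight here, while the bound (mu2 - mu1^2)/mu1^2 = 9 is vacuous.
dist = {0.0: 0.9, 10.0: 0.1}

mu1 = sum(v * p for v, p in dist.items())
mu2 = sum(v * v * p for v, p in dist.items())

pz_bound = (mu2 - mu1**2) / mu2        # = 0.9
cheb_bound = (mu2 - mu1**2) / mu1**2   # = 9.0, useless as a probability bound

actual = dist[0.0]                     # Pr(A <= 0) = 0.9
assert abs(pz_bound - actual) < 1e-12  # PZ is exactly tight for this rv
assert cheb_bound > 1                  # the Chebyshev-style bound exceeds 1
```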

4.4.4 Maximal Independent Set in NC

Recall from section 2.4.1: parallel complexity classes

L = log-space = problems decidable by a Turing Machine having a read-only input tape and a read-write work tape of size (for inputs of length $n$) $O(\log n)$.

NC = $\bigcup_k$ NC$^k$, where NC$^k$ = languages s.t. $\exists c < \infty$ s.t. membership can be computed, for inputs of size $n$, by $n^c$ processors running for time $\log^k n$.

RNC = same, but the processors are also allowed to use random bits. For $x \in L$, $\Pr(\text{error}) \le 1/2$; for $x \notin L$, $\Pr(\text{error}) = 0$.

L ⊆ NC¹ ⊆ ... ⊆ NC ⊆ RNC ⊆ RP.

P-Complete = problems that are in P, and that are complete for P w.r.t. reductions from a lower complexity class (usually, log-space).

Maximal Independent Set

MIS is the problem of finding a Maximal Independent Set: an independent set that is not strictly contained in any other. This does not mean it needs to be big, let alone of maximum cardinality. (It is NP-complete to find an independent set of maximum size. This is more commonly known as the problem of finding a maximum clique, in the complement graph.)

There is an obvious sequential greedy algorithm for MIS: list the vertices {1, . . . , n}. Use vertex 1. Remove it and its neighbors. Use the smallest-index vertex which remains. Remove it and its neighbors, etc.

The independent set you get this way is called the Lexicographically First MIS. Finding it is P-complete w.r.t. log-space reductions [25]. So it is interesting that if we don't insist on getting this particular MIS, but are happy with any MIS, then we can solve the problem in parallel; specifically, in NC². We'll start with an RNC² algorithm of Luby [68] for MIS. Then, we'll see how to derandomize the algorithm. (Some of the ideas we'll see also come from the papers [59, 6].)


Notation: $D_v$ is the neighborhood of $v$, not including $v$ itself; $d_v = |D_v|$.

Luby's MIS algorithm: Given: a graph $G = (V, E)$ with $n$ vertices. Start with $I = \emptyset$. Repeat until the graph is empty:

1. Mark each vertex $v$ pairwise independently with probability $\frac{1}{2d_v+1}$.
2. For each doubly-marked edge, unmark the vertex of lower degree (break ties arbitrarily).

3. For each marked vertex v, append v to I and remove the vertices v ∪ Dv (and of course all incident edges) from the graph.

An iteration can be implemented in parallel in time O(log n), using a processor per edge. We’ll show that an expected constant fraction of edges is removed in each iteration (and then we’ll show that this is enough to ensure expected logarithmically many iterations).
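As an aside (not from the notes), the loop is easy to prototype sequentially; this sketch uses fully independent marks for simplicity, whereas the analysis only requires pairwise independence, and all names are ours:

```python
# A sketch of Luby's MIS algorithm. Randomness affects only the number of
# rounds; the output is always a maximal independent set.
import random

def luby_mis(n, edges, rng):
    """Return a maximal independent set of the graph ([n], edges)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    I = set()
    while alive:
        deg = {v: len(adj[v] & alive) for v in alive}
        # Step 1: mark v with probability 1/(2 d_v + 1).
        marked = {v for v in alive if rng.random() < 1.0 / (2 * deg[v] + 1)}
        # Step 2: for each doubly-marked edge, unmark the lower-degree
        # endpoint (ties broken by vertex id).
        doubly = [(u, v) for u in marked for v in adj[u] & marked if u < v]
        for u, v in doubly:
            marked.discard(u if (deg[u], u) < (deg[v], v) else v)
        # Step 3: add surviving marks to I; delete them and their neighborhoods.
        I |= marked
        for v in marked:
            alive.discard(v)
            alive -= adj[v]
    return I

rng = random.Random(1)
n = 50
edges = [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < 0.1]
I = luby_mis(n, edges, rng)

adj = {v: set() for v in range(n)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
# Independent: no edge inside I.  Maximal: every vertex is in or adjacent to I.
assert all(not (adj[u] & I) for u in I)
assert all(v in I or (adj[v] & I) for v in range(n))
```

Note that after step 2 no edge has both endpoints marked, so $I$ stays independent; and every removed vertex is in $I$ or adjacent to it, so the final set is maximal.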


4.5 Lecture 25 (25/Nov): Luby’s parallel algorithm for maximal independent set

Today we analyze the algorithm given last time.

Definition 85. A vertex v is good if it has ≥ dv/3 neighbors of degree ≤ dv.

Let G be the set of good vertices, and B the remaining ones which we call bad. So

A bad vertex has > 2dv/3 neighbors with degree > dv. (4.17)

Definition 86. An edge is good if it contains a good vertex.

Lemma 87. If $d_v > 0$ and $v$ is good, then $\Pr(\exists\text{ marked } w \in D_v \text{ after step (1)}) \ge \frac{1}{18}$.

This is implied by the following, using $\sum_{w\in D_v}\Pr(w \text{ marked}) \ge \frac{d_v}{3}\cdot\frac{1}{2d_v+1} \ge \frac19$.

Lemma 88. If $X_i$ are pairwise independent events s.t. $\Pr(X_i) = p_i$, then $\Pr(\bigcup X_i) \ge \frac12\min\big(\frac12, \sum p_i\big)$.

Compare with the pairwise-independent version of the second Borel-Cantelli lemma. Of course, that is about guaranteeing that infinitely many events occur, here we’re just trying to get one to occur, but the lemmas are nonetheless quite analogous.

Proof. If there is any single event $i$ with $p_i \ge 1/2$ then the Lemma is immediate. Otherwise, if $\sum p_i < 1/2$ then in the unions and sums below consider all events $i$, while if $\sum p_i \ge 1/2$ then in the unions and sums below consider a prefix of the events s.t. $1/2 \le \sum p_i \le 1$.
$$\Pr\Big(\bigcup X_i\Big) \ \ge\ \sum p_i - \sum_{i<j}\Pr(X_i \cap X_j) \qquad \text{(Bonferroni level 2)}$$

$$= \sum p_i - \sum_{i<j} p_i p_j \ \ge\ \sum p_i - \frac12\Big(\sum p_i\Big)^2 = \Big(\sum p_i\Big)\Big(1 - \frac12\sum p_i\Big) \ \ge\ \frac12\min\Big(\frac12, \sum p_i\Big),$$
where the last inequality uses that the relevant $\sum p_i$ lies in $(0, 1]$. 2

So we can run the algorithm using a pairwise independent space, with the bits having various biases $\frac{1}{2d_v+1}$.

Lemma 89. If $v$ is marked then the probability it is unmarked in step (2) is $\le 1/2$.

Proof. It is unmarked only if a weakly-higher-degree neighbor is marked. By pairwise independence, each of these events happens, conditional on $v$ being marked, with probability at most $\frac{1}{2d_v+1}$. Apply a union bound. 2

Corollary 90. For any good vertex v, the probability that it is removed in step 3 is at least 1/36.


Proof. If dv = 0, it is certainly removed. Otherwise, we rely just on neighbors of v being marked and remaining so after Step 2. Apply Lemma 87 and Lemma 89. 2

Now for our measure of progress. Lemma 91. At least half the edges in a graph (V, E) are good.

Proof. Sort the vertices in order of increasing degree, so that $d_u < d_v \Rightarrow u < v$ (ties arbitrarily). Direct each edge from the lower-indexed to the higher-indexed vertex; now we have in-degrees $d_v^{\mathrm{in}}$ and out-degrees $d_v^{\mathrm{out}}$, with $d_v^{\mathrm{in}} + d_v^{\mathrm{out}} = d_v$.

For two sets of vertices $V_1, V_2$ let $\vec E(V_1, V_2)$ be the edges directed from $V_1$ to $V_2$. So $\vec E(V_1, V_1)$ is the set of directed edges within $V_1$ and is in 1-1 correspondence with $E(V_1, V_1)$, the set of undirected edges within the set. In particular $E(V, V) = E$.

We use lower-case e,~e to indicate cardinality. So ~e(V1, V1) = e(V1, V1). Note that E(B, B) ∪ E(B, G) ∪ E(G, B) ∪ E(G, G) is a partition of E, and

E(B, B) is the set of bad edges. E(B, G) ∪ E(G, B) ∪ E(G, G) is the set of good edges.

From (4.17), every v ∈ B has at least twice as many outgoing edges as incoming edges, so

$$2\vec e(V, B) \le \vec e(B, V)$$
$$2(\vec e(B, B) + \vec e(G, B)) \le \vec e(V, V)$$
$$2\vec e(B, B) \le \vec e(V, V)$$
$$e(B, B) \le e(V, V)/2$$

2

Due to Corollary 90, each good edge is removed with probability at least 1/36. Since at least half the edges are good, the expected fraction of edges removed is at least 1/72. (Of course the edge-removals are correlated but that doesn't matter.) In the next section we analyze how long it takes this process to terminate.

First, a comment: the analysis above was not sensitive to the precise probability $\frac{1}{2d_v+1}$ with which vertices were marked. For instance, it would be fine if each vertex were marked with some probability $p_v$, $\frac{1}{4d_v} \le p_v \le \frac{1}{2d_v+1}$; the only effect would be to change the "1/72" to some other constant. We will indeed modify each $\frac{1}{2d_v+1}$ when we derandomize the algorithm, but only by a factor $1 + o_n(1)$.

4.5.1 Descent Processes

(This is not widespread terminology but things like this come up often. The coupon collector problem is another example.) In a descent process we start from a positive integer m, and in general the state of the process is a nonnegative integer n; the process terminates when n = 0. At n > 0, you sample a random variable X from a distribution pn on {0, . . . , n}, and transition to state n − X. The question is, how many iterations does it take you to hit 0? Let Tn be this random variable, when you start from state n. (So E(T0) = 0.) If pm(0) = 1, the process is uninteresting as then Tm = ∞ a.s. In fact if there is


any 0 < n ≤ m, reachable by descents from m, such that pn(0) = 1, then Tm = ∞ has positive probability; this too we exclude, and simply insist that pn(0) < 1 for all 0 < n ≤ m.

In this case all $E_{p_n}(X) > 0$. Now let $\theta_n$ ($n \ge 1$) be any nondecreasing sequence of positive reals such that $\theta_n \le E_{p_n}(X) = \sum_{x=0}^n x\,p_n(x)$. (In most cases, including the current application, the expectations, or at least our bounds on them, are monotone, and we can simply use those bounds to define $\theta_n$.)

Lemma 92. $E(T_n) \le \sum_{j=1}^n \frac{1}{\theta_j}$.

Proof. The lemma is trivial for n = 0 (the LHS is 0 and the RHS is an empty summation). For n > 0 proceed by induction.

n E(Tn) = 1 + pn(0)E(Tn) + ∑ pn(x)E(Tn−x) x=1 n (1 − pn(0))E(Tn) = 1 + ∑ pn(x)E(Tn−x) x=1 n n−x 1 ≤ 1 + ∑ pn(x) ∑ (induction) x=1 j=1 θj ! n n 1 n 1 = 1 + ∑ pn(x) ∑ − ∑ x=1 j=1 θj j=n−x+1 θj n 1 n n 1 = 1 + (1 − pn(0)) ∑ − ∑ pn(x) ∑ j=1 θj x=1 j=n−x+1 θj n 1 1 n ≤ 1 + (1 − pn(0)) ∑ − ∑ pn(x)x (θj nondecreasing) j=1 θj θn x=1 n 1 = ( − ( )) ( ≤ ( )) 1 pn 0 ∑ θn Epn X j=1 θj 2

As a consequence, the expected number of iterations until the algorithm terminates is ≤ ∑_{j=1}^{|E|} 72/j ∈ O(log |E|) ∈ O(log n). Each iteration alone takes time O(log n) to do the local marking and unmarking, and removing vertices and edges. This is an RNC² algorithm, using O(|E|) processors, for MIS. In Section 4.7.4 we'll see how we can derandomize this using a factor of O(n²) more processors, and thereby put MIS in NC².

Question: Here is a different parallel algorithm for MIS. At each vertex v choose uniformly a real number r_v in [0, 1]. Put a vertex in I if r_v > r_u for every neighbor u of v. Remove I and all its neighbors. Repeat. (We don't really need to pick random real numbers; we can just pick multiples of 1/n³, and we're unlikely to have any ties to deal with.) This process is a bit simpler than Luby's algorithm because there is no "unmarking." Question: If the r_v's are chosen independently, is it the case that the expected number of edges removed is a constant fraction of |E|? If so, is this still true if the r_v's are pairwise independent?


4.6 Lecture 26 (30/Nov): Limited linear independence, limited sta- tistical independence, error correcting codes.

4.6.1 Begin derandomization from small sample spaces

We discussed in an earlier lecture the notion of linear error-correcting codes. We worked over the base field GF(2), also known as Z/2. (Which is to say, we added bit-vectors using XOR.) Encoding of messages in such a code is simply multiplication of the message, as a vector v ∈ (GF(2))^m, by the generator matrix C of the code; the result, if C is m × n, is an n-bit codeword.

(message v) · (generator matrix C) = (codeword vC)

The set of codewords is exactly Rowspace(C). The minimum weight of a linear code is the least number of nonzero entries in a nonzero codeword. Over GF(2), this is the same as the least number of 1's in a nonzero codeword. If the minimum weight is k + 1 then the code ensures

1. Error detection up to k errors

2. Error correction up to bk/2c errors.

This property is not possessed by codes achieving near-optimal rate in Shannon's coding theorem. That theorem protects only against random noise, and if that is what you want, then the minimum weight property is too strong to allow optimally efficient codes. The minimum weight property protects against adversarial noise.
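As a concrete instance of these definitions, here is the [7, 4] Hamming code over GF(2) (the particular generator matrix below is our illustrative choice, not one fixed by the notes). Its minimum weight is 3, i.e., k + 1 with k = 2, so it detects up to 2 errors and corrects 1:

```python
from itertools import product

# Generator matrix C of the [7,4] Hamming code over GF(2): 4-bit messages,
# 7-bit codewords (message bits followed by three parity bits).
C = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

def encode(v):
    """The codeword vC over GF(2)."""
    return tuple(sum(vi * cij for vi, cij in zip(v, col)) % 2
                 for col in zip(*C))

# Minimum weight = least number of 1's in a nonzero codeword.
min_wt = min(sum(encode(v)) for v in product((0, 1), repeat=4) if any(v))
assert min_wt == 3
```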

4.6.2 Generator matrix and parity check matrix

Error detection can be performed with the aid of a parity check matrix M:

Left Nullspace(M) = Rowspace(C)

(generator matrix C) · (parity check matrix M) = 0

wM = 0 ⇐⇒ w ∈ Rowspace(C) ⇐⇒ w is a codeword

Every vector in Rowspace(C) has weight ≥ k + 1 ⇐⇒ every k rows of M are linearly independent


In coding-theory terms, this is an (n, m, k + 1) code over GF(2). (Unfortunately, coding theorists conventionally use the letters (n, k, d), but we have k + 1 reserved for the least weight, because we're following the conventional terminology from "k-wise independence".) For any fixed values of n and k, the code is most efficient when the message length m, which is the number of rows of C, is as large as possible; equivalently, when the number of columns of M, ℓ = n − m, is as small as possible. So we'll want to design a matrix M with few columns in which every k rows are linearly independent.

But first, let's see a connection between linear and statistical independence. Let B be a k × ℓ matrix over GF(2), with full row rank. (So k ≤ ℓ.) If x ∈_U (GF(2))^ℓ then y = Bx ∈_U (GF(2))^k, because the pre-image of any y is an affine subspace (a translate of the right nullspace of B), and these pre-images all have the same cardinality. (We already made this observation in the context of Freivalds' verification algorithm, Theorem 33.) Now, if we have a matrix M with n rows, of which every k are linearly independent, then every k bits of z = Mx are uniformly distributed in (GF(2))^k.

z = M x,  where M is n × ℓ, x ∈ (GF(2))^ℓ, and z ∈ (GF(2))^n.

We’ve exhibited dual applications of the parity check matrix:

• Action on row vectors: checking validity of a received word w as a codeword. (s = wM is called the “syndrome” of w; in the case of non-codewords, i.e., s 6= 0, one of the ways to decode is to maintain a table containing for every s, the least-weight vector η s.t. ηM = s. Then w − η is the closest codeword to w. This table-lookup method is only practical for very high rate codes, where there are not many possible vectors s.)

• Action on column vectors: converting few uniform iid bits into many k-wise independent uniform bits.
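Both actions can be exercised on a tiny example. Take M to be the parity check matrix of the [7, 4] Hamming code: its rows are the 7 nonzero vectors of (GF(2))^3, so every 2 rows are linearly independent, and 3 uniform seed bits stretch into 7 pairwise independent bits (a brute-force check of ours):

```python
from collections import Counter
from itertools import product

# Rows of M: the 7 nonzero vectors of (GF(2))^3.  Any 2 distinct nonzero
# vectors are linearly independent over GF(2), so k = 2 here.
M = [[(i >> s) & 1 for s in (2, 1, 0)] for i in range(1, 8)]

# The sample space: z = Mx for every seed x in (GF(2))^3.
samples = [tuple(sum(m * xi for m, xi in zip(row, x)) % 2 for row in M)
           for x in product((0, 1), repeat=3)]

# Every pair of the 7 bits takes each of the 4 patterns equally often.
for i in range(7):
    for j in range(i + 1, 7):
        counts = Counter((z[i], z[j]) for z in samples)
        assert all(counts[p] == 2 for p in product((0, 1), repeat=2))
```

Eight sample points suffice for 7 pairwise independent uniform bits, versus 2^7 points for fully independent ones.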

Now we can see an entire sample space on n bits that are uniform and k-wise-independent. At the right end we place the uniform distribution on all 2^ℓ vectors of the vector space (GF(2))^ℓ.

          0 0 . . . 1 1      Ω  =  M   . . . unif. dist. on cols          0 1 . . . 0 1

Ω is the uniform distribution on the columns on the LHS.


Maximizing the transmission rate m/n = (n − ℓ)/n of a binary, k-error-detecting code is equivalent to minimizing the size |Ω| = 2^ℓ of a linear k-wise independent binary sample space. So how big does |Ω| have to be?

Theorem 93.

1. For all n there is a sample space of size O(n^⌊k/2⌋) with n uniform k-wise independent bits. If instead of bits one wants rvs uniformly distributed on [2^m]: for all n there is a sample space of size O(2^{k·max{m, ⌈lg n⌉}}) with n k-wise independent rvs, each uniformly distributed on [2^m].

2. For all n, any sample space on n k-wise independent rvs, none of which is a.s. constant, has size Ω(n^⌊k/2⌋).

We show Part 1 in Sec. 4.7.2; Part 2 will be on the problem set.


4.7 Lecture 27 (2/Dec): Limited linear independence, limited sta- tistical independence, error correcting codes.

Returning to the subject of codes, there is a question worth asking even though we don’t need it for our current purpose:

4.7.1 Constructing C from M

Suppose we have constructed a parity check matrix M. How do we then get a generator matrix C? One should note that over a finite field, Gram-Schmidt does not work. Gram-Schmidt would have allowed us to produce new vectors which are both orthogonal to the columns of M and linearly independent of them. But this is generally not possible: the row space of C and the column space of M do not necessarily span the n-dimensional space. For example over GF(2) we may have

C = (1 1),   M = (1 1)^T.

(Here CM = 0, but the row space of C coincides with the column space of M, so together they span only a one-dimensional space.)

However, Gaussian elimination does work over finite fields, and that is what is essential. Specifically, given an n × a matrix M and a b × n matrix C, b < n − a, with C of full row rank (i.e., rank b), we show how to construct a vector c′ s.t. c′M = 0 and c′ is linearly independent of the rows of C. (Then adjoin c′ to C and repeat.)

Perform Gaussian elimination on the rows of C so that it is upper triangular, with a nonzero diagonal. That is, the allowed operations are adding rows to one another, and permuting columns. When permuting columns of C, permute rows of M to match. Obviously the row operations do not change the row space of C; and the column permutations do not affect the rank of either matrix, or the inner products of rows of C with columns of M. Schematically (Key: # means nonzero, ∗ arbitrary):

C = ( # ∗ ∗ ∗ ∗ ∗ )        M = ( ∗ ∗ ∗ )
    ( 0 # ∗ ∗ ∗ ∗ )            ( ∗ ∗ ∗ )
                               ( ∗ ∗ ∗ )
                               ( ∗ ∗ ∗ )
                               ( ∗ ∗ ∗ )
                               ( ∗ ∗ ∗ )

Now take the submatrix of M consisting of the a + 1 rows b + 1, . . . , b + a + 1. By Gaussian elimination on these rows we can find a linear dependence among them. Extending that dependence to the n-dimensional space, with 0 coefficients in the first b coordinates, yields a vector c′ s.t. c′M = 0 and s.t. the support of c′ is disjoint from the first b coordinates. Then c′ is linearly independent of the row space of C because the restriction of C to its first b columns is nonsingular, so any nontrivial linear combination of the rows of C has a nonzero value somewhere among its first b entries.
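Over GF(2), the whole adjoin-and-repeat loop can be collapsed into one elimination: row-reduce the augmented rows [M_i | e_i]; rows whose M-part vanishes record coefficient vectors c with cM = 0, and together they form a generator matrix. (A sketch of ours, not the procedure of the notes verbatim; the helper name is hypothetical.)

```python
def generator_from_parity_check(M, n):
    """Given the n rows of a parity check matrix M over GF(2), return a basis
    of the left nullspace, i.e., rows c with cM = 0: a generator matrix."""
    a = len(M[0]) if M else 0
    # Augment each row of M with the corresponding standard basis vector.
    rows = [list(M[i]) + [1 if j == i else 0 for j in range(n)]
            for i in range(n)]
    pivots = 0
    for col in range(a):
        for r in range(pivots, n):
            if rows[r][col]:
                rows[pivots], rows[r] = rows[r], rows[pivots]
                for r2 in range(n):
                    if r2 != pivots and rows[r2][col]:
                        rows[r2] = [x ^ y for x, y in zip(rows[r2], rows[pivots])]
                pivots += 1
                break
    # Rows whose M-part was eliminated to zero record left-nullspace vectors.
    return [r[a:] for r in rows[pivots:]]

# Example: the Hamming parity check matrix (7 rows, 3 columns) gives a
# 4-row generator matrix C with CM = 0.
M = [[(i >> s) & 1 for s in (2, 1, 0)] for i in range(1, 8)]
C = generator_from_parity_check(M, 7)
assert len(C) == 4
for c in C:
    assert all(sum(c[i] * M[i][j] for i in range(7)) % 2 == 0 for j in range(3))
```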

4.7.2 Proof of Thm (93) Part (1): Upper bound on the size of k-wise independent sample spaces

(We’ll do this carefully for producing binary rvs and only mention at the end what should be done for larger ranges.) This construction uses the finite fields whose cardinalities are powers of 2. These are called exten- sion fields of GF(2). If you are not familiar with this, just keep in mind that for each integer r ≥ 1 there is a (unique) field with 2r elements. We can add, subtract, multiply and divide these without


leaving the set; in particular, in the usual way of representing the elements of the field as bit strings of length r, addition is simply XOR addition.5 Specifically, we can think of the elements of GF(2^r) as the polynomials of degree ≤ r − 1 over GF(2), taken modulo some fixed irreducible polynomial p of degree r. That is, a field element c has the form c = c_{r−1}x^{r−1} + . . . + c_1 x + c_0 (mod p(x)), and our usual way of representing this element is through the mapping vec : GF(2^r) → (GF(2))^r given by vec(c) = (c_{r−1}, . . . , c_0). (I.e., the list of coefficients.) But all we really need today are three things: (a) Like GF(2), GF(2^r) is a field of characteristic 2, i.e., 2x = 0. (b) For matrices over GF(2^r) the usual concepts of linear independence and rank apply. (c) vec is injective, linear (namely vec(c) + vec(c′) = vec(c + c′)), and vec(1) = 0 . . . 01.

Now, round n up to the nearest n = 2^r − 1, and let a_1, . . . , a_n denote the nonzero elements of the field. Let M_1 be the following Vandermonde matrix over the field GF(2^r):

M_1 = ( 1  a_1  a_1^2  . . .  a_1^{k−1} )
      ( 1  a_2  a_2^2  . . .  a_2^{k−1} )
      ( .   .    .     . . .     .      )
      ( 1  a_n  a_n^2  . . .  a_n^{k−1} )

Exercise: Every k rows of M_1 are linearly independent over GF(2^r). (Form any such submatrix B, with rows indexed by distinct field elements b_1, . . . , b_k. Verify that Det(B) = ∏_{i<j} (b_j − b_i) ≠ 0.)

Now form M_2 by replacing each entry c of M_1 by the row vector vec(c) ∈ (GF(2))^r:

M_2 = ( vec(1)  vec(a_1)  . . .  vec(a_1^{k−1}) )
      ( vec(1)  vec(a_2)  . . .  vec(a_2^{k−1}) )
      (   .        .      . . .        .        )
      ( vec(1)  vec(a_n)  . . .  vec(a_n^{k−1}) )

(Illustrated with r = 3: vec(1) = 001, vec(a_2) = 010, vec(a_n) = 111, etc.)

Corollary: Every k rows of M_2 are linearly independent over GF(2). Actually it is possible to reduce the number of columns even further while retaining the corollary. First, we can drop the leading 0's in the first entry. Second, we can strike out all batches of columns generated by positive even powers.

 3  1 vec(a1) = 001 vec(a1) = 001 ......  1 vec(a ) = 010 vec(a3) = ......  M =  2 2  3  ......  3 1 vec(an) = 111 vec(an) = ...... r Lemma 94. Every set of rows that is linearly independent (over GF(2 )) in M1 is also linearly independent (over GF(2)) in M3. Hence every k rows of M3 are linearly independent.

Proof. Let a set of rows R be independent in M1; we show the same is true in M3. Since M3 is over GF(2), this is equivalent to saying that for every ∅ ⊂ S ⊆ R, the sum of the rows S in M3 is nonzero.

So we are to show that S independent in M1 has nonzero sum in M3. Independence in M1 implies in particular that the sum of the rows of S in M2 is nonzero.

If |S| is odd then the same sum in M3 has a nonzero first entry and we are done.

5See any introduction to Algebra, for instance Artin [8].


Otherwise, let t > 0 be the smallest value such that ∑_{i∈S} a_i^t ≠ 0; it is enough to show that t is odd. Suppose not, so t = 2t′. Then, since Characteristic(GF(2^r)) = 2,

∑_{i∈S} a_i^{2t′} = ( ∑_{i∈S} a_i^{t′} )^2

so ∑_{i∈S} a_i^{t′} ≠ 0, contradicting minimality of t. □

Finally for the binary construction, recalling that n = 2^r − 1, we have |Ω| = 2^{1 + r⌊k/2⌋} ∈ O(n^⌊k/2⌋).

Uniform on larger ranges: this is actually simpler because we're not achieving the factor-2 savings in the exponent. Let r, as in the statement, be r = max{m, ⌈lg n⌉}. Just form the matrix M_1.

Comment: If you want n k-wise independent bits with nonuniform marginals, then this construction doesn't work. The best general construction, due to Koller and Megiddo [62], is of size O(n^k).
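The whole pipeline M_1 → M_3 can be verified by brute force for k = 4, r = 4 (a check of ours; the modulus x^4 + x + 1 is one standard choice of irreducible polynomial): n = 15 rows of the form [1 | vec(a) | vec(a^3)], 1 + 2r = 9 columns, so |Ω| = 2^9 = 512, and every 4 of the 15 output bits should be exactly uniform.

```python
from collections import Counter
from itertools import combinations, product

def gf16_mul(a, b, poly=0b10011):
    """Multiply in GF(2^4) with modulus p(x) = x^4 + x + 1 (elements are
    4-bit integers, bit i = coefficient of x^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= poly
        b >>= 1
    return r

# Rows of M3: [1 | vec(a) | vec(a^3)] for the 15 nonzero field elements a.
rows = []
for a in range(1, 16):
    a3 = gf16_mul(a, gf16_mul(a, a))
    rows.append([1] + [(a >> s) & 1 for s in range(3, -1, -1)]
                    + [(a3 >> s) & 1 for s in range(3, -1, -1)])

# The sample space: z = M3 x over GF(2), for all 2^9 seeds x.
samples = [tuple(sum(r * xi for r, xi in zip(row, x)) % 2 for row in rows)
           for x in product((0, 1), repeat=9)]

# 4-wise independence: every 4 of the 15 bits take each of the 16 patterns
# equally often (512 / 16 = 32 times).
for idx in combinations(range(15), 4):
    counts = Counter(tuple(z[i] for i in idx) for z in samples)
    assert all(counts[p] == 32 for p in product((0, 1), repeat=4))
```

So 512 sample points provide 15 four-wise independent uniform bits, against 2^15 for full independence.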

Question: Suppose you want n pairwise independent bits Xi, for each of which Pr(Xi = 1) = p. Is there for all p a subquadratic size sample space?

Proof of Thm (93) Part (2): Lower bound on the size of k-wise independent sample spaces. See PS5.1.

4.7.3 Back to Gale-Berlekamp

We now see a deterministic polynomial-time algorithm for playing the Gale-Berlekamp game. As we demonstrated last time, it is enough to use a 4-wise independent sample space in order to achieve Ω(n^{3/2}) expected performance. The above construction gives us a 4-wise independent sample space of size O(n²). All we have to do is exhaustively list the points of the sample space until we find one with performance Ω(n^{3/2}).

4.7.4 Back to MIS

For MIS we need only pairwise independence, but we want the marking probabilities p_v to be more varied (approximately 1/(2d_v + 1)). This, however, is easy to achieve: use the matrix M_1, with k = 2, without modifying to M_2 and M_3. This generates for each v an element in the field GF(2^r); these elements are pairwise independent; and one can designate for each v a fraction of approximately 1/(2d_v + 1) of the field elements which cause v to be marked. The deterministic algorithm is therefore as described in Sec. 4.4.4, with a sample space of size O(n²).

Chapter 5

Special topic

5.1 Lecture 28 (4/Dec): Sampling factored numbers

As you are probably aware, we have the following contrast.

1. The “primality testing” problem:

Given an n-bit number m, is m prime?

is computable in polynomial time. Already in the 1970’s people put this problem in ZPP. Then, in 2004 it was even found that primality testing is in P [3].

2. The “factoring” problem:

Given an n-bit number m, give its prime factorization.

has no known sub-exponential-time algorithms1 (where by exponential time we mean ⋃_{c>0} 2^{n^c}), despite enormous efforts, driven largely by cryptographic applications [81]. The best known runtime (and even this is only a heuristic estimate, not a proven bound2) is around 2^{n^{1/3}}. (We are, of course, discussing complexity for classical computers. Quantum computers will be able to factor in poly-time if they are built [15, 87].)

If you want to slot factoring into a complexity class, you have to deal with the fact that it isn't a decision problem (every n has a factorization) but still, it is "essentially" in NP; what it really is, is a TFNP search problem. TFNP is the set of binary functions f(x, y) such that (a) |y| = |x|^c for some constant c, (b) f ∈ P, (c) ∀x ∃y : f(x, y) = 1. The corresponding search problem is, given x, to find a y s.t. f(x, y) = 1. If P=NP then every TFNP search problem is solvable in P, by self-reducibility. Specifically, with an NP oracle we can figure out whether there is a witness y that starts with y_1 = 0, then depending on that answer find out what's an acceptable second bit of y, etc. The converse is not known to hold, i.e., it could be that TFNP search problems are all solvable in P, yet P≠NP. The distinction is that in TFNP we have a "promise problem," namely we are assured of the existence of a witness.

1 Sometimes people reserve this term for the more restrictive ⋃_c 2^{cn}.
2 Because the runtime estimate relies on a conjecture that numbers generated by the algorithm are about as likely to be smooth as a random number of the same size would be. A number is b-smooth if all its prime factors are bounded by b.


Now, in view of this contrast between primality testing and factoring, the following problem is interesting:

Given an n-bit number m, sample a uniform 1 ≤ r ≤ m along with its factorization.

The naïve approach is to pick r at random and then factor it, but in view of the contrast given above, that won't work. We can make procedure calls to a primality testing algorithm, so we can discover when r is not prime, but then we don't know how to go ahead and factor it. By the way, you might object that you'll just focus your sampling on numbers that have only small factors, and that will be "good enough." Specifically, define:

Definition 95. A number is b-smooth if all its prime factors are bounded by b.

If r is (log r)-smooth then the Sieve of Eratosthenes is good enough to factor it in polylog r time. If most numbers were (log r)-smooth, then you might have suggested settling for sampling just from such r's as a reasonable substitute for the original goal. However, there are two reasons this objection doesn't hold up. The first is merely that it isn't what we asked for: we want to sample exactly from the distribution, not from a distribution that is close to it. The second, more substantial response is that even if you say you're willing to settle for the lesser goal, most numbers simply aren't smooth enough. About 69% of numbers r (a fraction ln 2 ≈ 0.693; see the footnote) have a prime factor larger than √r, i.e., are not even √r-smooth, let alone polylogarithmically smooth. So, about 69% of numbers r have a prime factor whose number of digits is at least half of the number of digits of r itself. Going further down, over 99% of numbers have a prime factor with at least one quarter as many digits.3

So, we must do something clever. The first to solve this was Eric Bach [9]. We'll show an elegant (albeit slightly slower) method of Adam Kalai [56]. In what follows we assume m > 1. Algorithm:

(0) Set s0 = m.

(1) For i = 1, 2, . . ., iteratively pick s_i ∈_U [s_{i−1}], stopping when the value 1 is reached; say s_L > 1 and s_{L+1} = 1.

Test each s_1, . . . , s_L for primality. Compute ε_i = ⟦s_i is prime⟧.

(2) Let r = ∏_{i=1}^{L} s_i^{ε_i}.

(3) If r ≤ m, then with probability r/m output r. Else restart at line (1).

Clearly the strategy here is to start with the factors of r, rather than starting with r and looking for its factors. The question is, why does this procedure give us a uniform distribution on r, and why does it terminate in expected polynomial time? (Ok that's two questions.) Obviously for the latter point we just need to ensure that the expected number of calls to a primality-testing algorithm is polynomial.

Let q(r) be the probability we generate candidate r in line (2). In view of the r/m "rejection sampling" in line (3), in order to show a uniform distribution on the output, we need to show that for r ≤ m, q(r) is proportional to 1/r.
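The algorithm transcribes almost directly into code (a sketch of ours: trial division stands in for a real primality test, which is fine for small m, and names like `kalai_sample` are hypothetical):

```python
import random

def is_prime(k):
    """Trial division; adequate for the small demo values used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def kalai_sample(m, rng=random):
    """Return a uniform r in [1, m] together with its prime factorization
    (with multiplicity), by Kalai's algorithm."""
    while True:
        seq, s = [], m
        while s > 1:                            # (1) s_i uniform in [s_{i-1}]
            s = rng.randint(1, s)
            if s > 1:
                seq.append(s)
        primes = [s for s in seq if is_prime(s)]
        r = 1
        for p in primes:                        # (2) r = product of the prime s_i
            r *= p
        if r <= m and rng.randint(1, m) <= r:   # (3) accept with probability r/m
            return r, primes
```

Note that repeated prime values among the s_i are what produce prime powers in r.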

For 1 < k ≤ m write H(k) = |{j : sj = k}|. Let 1 < p1 < ... < pv ≤ m be all the primes ≤ m.

3 Let Ψ(m, B) := fraction of numbers r ≤ m that are B-smooth. The numbers above come from the following two theorems (the second subsuming the first) (see [30], [78], [73], and [96] for plots): (a) For fixed 1 ≤ u ≤ 2 and m → ∞, we have that Ψ(m, m^{1/u}) ∼ 1 − log u. (b) For u ≥ 2 this function still has a limit but it's a little more complicated. Theorem (Dickman): For all u ≥ 1, Ψ(m, m^{1/u}) ∼ ρ(u). The notation ∼ allows for a multiplicative factor 1 + o(1). Here ρ(u) is defined as the solution of the delay differential equation uρ′(u) = −ρ(u − 1) on u ≥ 1, with boundary condition ρ(u) = 1 for u ∈ [0, 1]. For 1 ≤ u ≤ 2, ρ(u) = 1 − log u; and it is known that ρ(u) is positive for all u ≥ 0, and tends to 0 roughly as u^{−u}.
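These densities are easy to see empirically (our own check, via a largest-prime-factor sieve; the cutoff N = 10^5 is arbitrary): the fraction of r ≤ N with a prime factor exceeding √r comes out near ln 2 ≈ 0.693.

```python
N = 100_000
lpf = list(range(N + 1))     # lpf[r] will end up the largest prime factor of r
for p in range(2, N + 1):
    if lpf[p] == p:          # p is prime; it overwrites lpf at its multiples,
        for mult in range(2 * p, N + 1, p):
            lpf[mult] = p    # and since primes are visited in increasing
                             # order, the largest prime factor is written last
frac = sum(1 for r in range(2, N + 1) if lpf[r] ** 2 > r) / (N - 1)
assert 0.6 < frac < 0.8      # near ln 2, up to finite-size effects
```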


Lemma 96. Let α_1, . . . , α_v be any nonnegative integers. Then

Pr(H(p_1) = α_1 ∧ . . . ∧ H(p_v) = α_v) = ∏_{i=1}^{v} (1 − 1/p_i) p_i^{−α_i} = M(m) ∏_{i=1}^{v} p_i^{−α_i}

where M(m) = ∏_{i=1}^{v} (1 − 1/p_i) is called Mertens' function.

Proof. Here is an equivalent description of the sampling process. At each 1 < k ≤ m, pick a geometrically distributed rv h(k), with parameter 1/k. That means

Pr(h(k) = 0) = 1 − 1/k,  Pr(h(k) = 1) = (1/k)(1 − 1/k),  Pr(h(k) = 2) = (1/k²)(1 − 1/k),  etc.

h(k) represents the number of times that the descent process hits site k; note that the geometric distribution has the "memoryless" property that ∀b, Pr(h(k) ≥ b + 1 | h(k) ≥ b) = 1/k. Set

s_i = max{k : ∑_{ℓ=k}^{m} h(ℓ) ≥ i}

Observe that conditional on s_{i−1} = k,

Pr(s_i = k′) = ((k−1)/k) · ((k−2)/(k−1)) · · · (k′/(k′+1)) · (1/k′) = 1/k

namely, s_i is uniform between 1 and k, which is why this distribution of the s_i's is the same as that described in the algorithm. From this we conclude that the sequence H(2), . . . , H(m) is equidistributed with the sequence h(2), . . . , h(m). Really, the algorithm's process of generating the H(k)'s was simply a (much) more efficient process, made possible because most of them are 0. The lemma follows because the h(k)'s for various k are independent geometric rvs with the stated parameters. □

Corollary 97. Let r be an m-smooth number, r = ∏_{1}^{v} p_i^{α_i}. Then q(r) = M(m) ∏_{i=1}^{v} p_i^{−α_i} = M(m)/r.

This already verifies correctness of the process, since q(r) is proportional to 1/r. Now we deal with the runtime. First, let's bound how many primality tests we run per "restart" of the algorithm; we can upper bound this by the number of jumps performed by the algorithm until it terminates at 1, namely, ∑_{2}^{m} H(k). We could work this out exactly, but it's simpler to apply the general tool we developed for Descent Processes in Lemma 92, which shows that E(∑_{2}^{m} H(k)) ≤ ∑_{2}^{m} 2/k ∈ O(log m).

Next we bound the expectation of the number of rounds of the algorithm. The number of rounds is one plus the number of "restarts"; the number of restarts is a geometric rv. The probability that any particular round is a success, i.e., is the last round, is

∑_{1≤r≤m} (r/m) q(r) = ∑_{1≤r≤m} (r/m)(M(m)/r) = M(m).    (5.1)

By Mertens' "third theorem" [71, 95] this limits to e^{−γ}/log m, for the Euler–Mascheroni constant γ ≈ 0.577. Thus the expected number of rounds is O(log m). Combining these two bounds, we have that the expected total number of primality tests is O(log² m).

Appendix A

Material omitted in lecture

A.1 Paley-Zygmund in-probability bound, applied to the 4-wise indep. Khintchine-Kahane

Paley-Zygmund is usually stated as an alternative (to Cor. 20) lower-tail bound for nonnegative rvs; i.e., it gives a way to say that a nonnegative rv A is “often large.”

Let µ_i be the ith moment of A. Knowing only the first moment µ_1 of A is not enough, because for any value, even infinite, of the first moment, we can arrange, for any δ > 0, a nonnegative rv A which equals 0 with probability 1 − δ, yet has first moment µ_1. We just have to move δ of the probability mass out to the point µ_1/δ, or, in the infinite-µ_1 case, spread δ probability mass out in a measure whose first moment diverges.

However, a finite second moment µ2 is enough to justify such a “usually large” statement. Actually Paley-Zygmund holds for rvs which are not necessarily nonnegative.

Lemma 98 (Paley-Zygmund). Let A be a real rv with positive µ_1 and finite µ_2. For any 0 < λ ≤ 1,

Pr(A > (1 − λ)µ_1) > λ²µ_1²/µ_2.

Proof. Let ν be the distribution of A. Let p = Pr(A > (1 − λ)µ_1). (This is what we want to lower bound.) Decompose µ_1 = ∫_{[−∞,(1−λ)µ_1]} x dν(x) + ∫_{((1−λ)µ_1,∞]} x dν(x). Now examine each of these terms.

∫_{[−∞,(1−λ)µ_1]} x dν(x) ≤ (1 − p)(1 − λ)µ_1 ≤ (1 − λ)µ_1    (A.1)

Apply Cauchy-Schwarz to the functions x and the indicator ⟦x > (1 − λ)µ_1⟧. These are not effectively proportional to each other w.r.t. ν (unless ν is supported on a single point, in which case µ_1² = µ_2 and the Lemma1

Lemma 99 (Cauchy-Schwarz). If functions f, g are square-integrable w.r.t. measure ν then ∫ f(x)g(x) dν(x) ≤ √( ∫ f²(x) dν(x) · ∫ g²(x) dν(x) ).

Proof. Squaring and subtracting sides, it suffices to show: 0 ≤ ∫∫ f²(x)g²(y) dν(x)dν(y) − ∫∫ f(x)g(x)f(y)g(y) dν(x)dν(y). This is equivalent (by swapping the dummy variables) to showing 0 ≤ ∫∫ (f²(x)g²(y) + f²(y)g²(x) − 2f(x)g(x)f(y)g(y)) dν(x)dν(y) = ∫∫ (f(x)g(y) − f(y)g(x))² dν(x)dν(y), which is an integral of squares. □

Say that f and g are effectively proportional to each other w.r.t. ν if this last integral is 0; this is the condition for equality in Cauchy-Schwarz.


is immediate), so we get a strict inequality,

∫_{((1−λ)µ_1,∞]} x dν(x) < p^{1/2} ( ∫_{−∞}^{∞} x² dν(x) )^{1/2} = p^{1/2} µ_2^{1/2}    (A.2)

Putting (A.1), (A.2) together, µ_1 < (1 − λ)µ_1 + p^{1/2} µ_2^{1/2}, so

λµ_1 < p^{1/2} µ_2^{1/2}, as desired. (There's not normally much to be gained by preserving the "1 − p" factor in (A.1), but it's at least another reason for writing strict inequality in the Lemma.) □

Comment: Taking λ = 1, this gives Pr(A ≤ 0) ≤ (µ_2 − µ_1²)/µ_2, which improves on the upper bound (µ_2 − µ_1²)/µ_1² of Cor. 20. It should be said though that PZ does not dominate Cor. 20 in all ranges (e.g., if the variance µ_2 − µ_1² is very small compared to µ_1², and λ is small).

Returning to Gale-Berlekamp: Lemma 84 is not directly usable for our purpose, i.e., we cannot plug in the rv A = |X|, because all it will tell us is that µ_1 ≥ (1 − λ)λ²µ_1³/µ_2, i.e., µ_2 ≥ (1 − λ)λ²µ_1², which follows already from Cauchy-Schwarz (with the better constant 1). Note, this shows how Paley-Zygmund serves as a more flexible, albeit slightly weaker, version of Cauchy-Schwarz. Instead, we set B = |X| and A = B², and then apply Paley-Zygmund to A. This is not a technicality. It means that we are relying on 4-wise independence, not just 2-wise independence, of the X_i's. And indeed, Exercise: There are, for arbitrarily large n, collections of n pairwise independent X_i's, uniform in ±1, s.t. Pr(|X| = 0) = 1 − 1/n, Pr(|X| = n) = 1/n.
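Paley-Zygmund can be checked exactly on any small discrete rv (the three-point distribution below is an arbitrary illustration of ours):

```python
# A takes value x with probability w.
dist = [(0.0, 0.2), (1.0, 0.5), (4.0, 0.3)]
mu1 = sum(x * w for x, w in dist)        # first moment, = 1.7
mu2 = sum(x * x * w for x, w in dist)    # second moment, = 5.3

for lam in (0.25, 0.5, 0.75, 1.0):
    lhs = sum(w for x, w in dist if x > (1 - lam) * mu1)  # Pr(A > (1-lam) mu1)
    assert lhs > lam ** 2 * mu1 ** 2 / mu2                # Lemma 98
```

The inequality is strict here because A is not concentrated on a single point.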

Corollary 100. Let B be a nonnegative rv with Pr(B = 0) < 1 and fourth moment µ_4(B) < ∞. Then E(B) ≥ (16/(25√5)) µ_2^{5/2}/µ_4.

Proof. For any θ, E(B) ≥ θ Pr(B ≥ θ) = θ Pr(B² ≥ θ²), so, applying Lemma 84 to A = B², with θ = √(µ_2(B)/5) and λ = 4/5,

E(B) ≥ √(µ_2(B)/5) · Pr(B² ≥ µ_2(B)/5) ≥ ((4/5)² / √5) · µ_2(B)^{5/2}/µ_4(B). □

Lemma 80 is stronger than Cor. 100 for two reasons: the constant, and perhaps more importantly, because µ_2/µ_4^{1/2} ≤ 1 (power mean inequality). Of course, Lemma 80 does not give an in-probability bound, so it is incomparable with Lemma 84.

Bibliography

[1] I. Abraham, Y. Bartal, and O. Neiman. Advances in metric embedding theory. In Proc. 38th STOC, 2006.

[2] M. Adams and V. Guillemin. Measure Theory and Probability. Birkhäuser, 1996.

[3] M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. Ann. of Math., 160:781–793, 2004. doi:10.4007/annals.2004.160.781.

[4] R. Ahlswede and D. E. Daykin. An inequality for the weights of two families of sets, their unions and intersections. Z. Wahrscheinlichkeitstheorie verw. Geb., 43:183–185, 1978.

[5] M. Aizenman, H. Kesten, and C. Newman. Uniqueness of the infinite cluster and continuity of connectivity functions for short and long range percolation. Comm. Math. Phys., 111:505–531, 1987.

[6] N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. J. Algorithms, 7:567–583, 1986.

[7] N. Alon and J. Spencer. The Probabilistic Method. Wiley, 3rd edition, 2008.

[8] M. Artin. Algebra. Prentice-Hall, 1991.

[9] E. Bach. How to generate factored random numbers. SIAM J. Comput., 17:179–193, 1988.

[10] N. Bansal and J. Spencer. Deterministic discrepancy minimization. Algorithmica, 67:451–471, 2013. doi:10.1007/s00453-012-9728-1.

[11] I. Benjamini and O. Schramm. Percolation beyond Z^d, many questions and a few answers. Electron. Commun. Probab., 1:71–82, 1996. doi:10.1214/ECP.v1-978.

[12] B. Berger. The fourth moment method. SIAM J. Comput., 26(4):1188–1207, 1997.

[13] S. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Information Processing Letters, 18:147–150, 1984.

[14] Bernstein inequality. In Encyclopedia of Mathematics. Springer and Europ. Math. Soc. URL: http://www.encyclopediaofmath.org.

[15] E. Bernstein and U. Vazirani. Quantum complexity theory. SIAM J. Comput., 26(5):1411–1473, 1997. (STOC 1993).

[16] S. N. Bernstein. On a modification of Chebyshev's inequality and of the error formula of Laplace. Ann. Sci. Inst. Sav. Ukraine, Sect. Math. 1, 1924.

[17] S. N. Bernstein. On certain modifications of Chebyshev's inequality. Doklady Akademii Nauk SSSR, 17(6):275–277, 1937.


[18] P. Billingsley. Probability and Measure. Wiley, third edition, 1995.

[19] B. Bollobás. Combinatorics. Cambridge U. Press, 1986.

[20] B. Bollobás and A. G. Thomason. Threshold functions. Combinatorica, 7(1):35–38, 1987. doi:10.1007/BF02579198.

[21] J. Bourgain. On Lipschitz embedding of finite metric spaces in Hilbert space. Israel J. Math., 52:46–52, 1985.

[22] B. Chazelle. The Discrepancy Method: Randomness and Complexity. Cambridge University Press, 2001.

[23] A. L. Chistov. Fast parallel calculation of the rank of matrices over a field of arbitrary charac- teristic. In Proc. Conf. Foundations of Computation Theory, pages 63–69. Springer-Verlag, 1985.

[24] D. Conlon. A new upper bound for diagonal Ramsey numbers. Ann. of Math., 170:941–960, 2009.

[25] S. A. Cook. A taxonomy of problems with fast parallel algorithms. Information and Control, 64:2–22, 1985.

[26] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 1991.

[27] L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Computing, 5:618–623, 1976.

[28] D. E. Daykin and L. Lovasz. The number of values of Boolean functions. J. London Math. Soc., 2(12):225–230, 1976.

[29] R. A. DeMillo and R. J. Lipton. A probabilistic remark on algebraic program testing. Informa- tion Processing Letters, 7(4):193 – 195, 1978. URL: http://www.sciencedirect.com/science/ article/pii/0020019078900674.

[30] K. Dickman. On the frequency of numbers containing prime factors of a certain relative mag- nitude. Ark. Mat. Astr. Fys., 22:1–14, 1930.

[31] L. Engebretsen, P. Indyk, and R. O’Donnell. Derandomized dimensionality reduction with applications. In SODA, 2002.

[32] P. Erdős. Some remarks on the theory of graphs. Bull. Amer. Math. Soc., 53:292–294, 1947.

[33] P. Erdős. Graph theory and probability. Canad. J. Math., 11:34–38, 1959.

[34] P. Erdős and G. Szekeres. A combinatorial problem in geometry. Compositio Math., 2:463–470, 1935.

[35] Y. Filmus. Khintchine-Kahane using Fourier Analysis, 2011. URL: https://yuvalfilmus.cs.technion.ac.il/Manuscripts/KK.pdf.

[36] P. C. Fishburn and N. J. A. Sloane. The solution to Berlekamp's switching game. Discrete Mathematics, 74:263–290, 1989.

[37] C. M. Fortuin, P. W. Kasteleyn, and J. Ginibre. Correlation inequalities on some partially ordered sets. Commun. Math. Phys., 22:89–103, 1971.

[38] M. Fréchet. Sur quelques points du calcul fonctionnel. Rend. Circ. Matem. Palermo, 22:1–72, 1906. doi:10.1007/BF03018603.


[39] R. Freivalds. Probabilistic machines can use less running time. In IFIP Congress, pages 839–842, 1977.

[40] E. Friedgut and G. Kalai. Every monotone graph property has a sharp threshold. Proc. Amer. Math. Soc., 124:2993–3002, 1996.

[41] H. N. Gabow and R. E. Tarjan. Faster scaling algorithms for general graph-matching problems. J. ACM, 38(4):815–853, 1991.

[42] F. Le Gall. Powers of tensors and fast matrix multiplication. In International Symposium on Symbolic and Algebraic Computation (ISSAC), pages 296–303, 2014. arXiv:1401.7714.

[43] G. H. Gonnet. Expected length of the longest probe sequence in hash code searching. J. ACM, 28:289–304, 1981.

[44] G. H. Gonnet. Determining equivalence of expressions in random polynomial time. In Proc. 16th STOC, pages 334–341. ACM, 1984. URL: http://doi.acm.org/10.1145/800057.808698.

[45] R. L. Graham and V. Rödl. Numbers in Ramsey theory. In Surveys in Combinatorics, London Math. Soc. Lecture Note Ser. Vol. 123, pages 111–153. Cambridge University Press, 1987.

[46] R. L. Graham, B. L. Rothschild, and J. H. Spencer. Ramsey Theory. Wiley, 2nd edition, 1990.

[47] T. E. Harris. Lower bound for the critical probability in a certain percolation process. Math. Proc. Cambridge Philos. Soc., 56:13–20, 1960.

[48] J. Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001.

[49] M. Heydenreich and R. van der Hofstad. Progress in high-dimensional percolation and random graphs. Springer, 2017.

[50] W. E. Hickson. Try, try. . . . In Oxford Dictionary of Quotations, page 251. Oxford University Press, 3rd edition, 1979.

[51] R. Holley. Remarks on the FKG inequalities. Communications in Mathematical Physics, 36:227–231, 1974. URL: http://dx.doi.org/10.1007/BF01645980.

[52] P. Indyk and J. Matoušek. Low-distortion embeddings of finite metric spaces. In Handbook of Discrete and Computational Geometry, pages 177–196. CRC Press, 2004.

[53] W. B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math., 26:189–206, 1984.

[54] V. Kabanets and R. Impagliazzo. Derandomizing polynomial identity tests means proving circuit lower bounds. Comput. Complex., 13:1–46, 2004.

[55] J.-P. Kahane. Sur les sommes vectorielles ∑ ±un. C. R. Acad. Sci. Paris, 259:2577–2580, 1964.

[56] A. Kalai. Generating random factored numbers, easily. J. Cryptology, 16:287–289, 2003.

[57] G. Kalai and L. J. Schulman. Quasi-random multilinear polynomials. Isr. J. Math., 230(1):195– 211, 2019. (arXiv:1804.04828).

[58] R. M. Karp, E. Upfal, and A. Wigderson. Constructing a Maximum Matching is in Random NC. Combinatorica, 6(1):35–48, 1986.

[59] R. M. Karp and A. Wigderson. A fast parallel algorithm for the maximal independent set problem. In Proc. 16th ACM STOC, pages 266–272, 1984.

[60] A. Khintchine. Über dyadische Brüche. Math. Z., 18:109–116, 1923.

[61] D. J. Kleitman. Families of non-disjoint subsets. J. Combin. Theory, 1:153–155, 1966.

[62] D. Koller and N. Megiddo. Constructing small sample spaces satisfying given constraints. SIAM J. Discret. Math., 7:260–274, 1994. Previously in Proc. 25th STOC, pp. 268–277, 1993. URL: http://portal.acm.org/citation.cfm?id=178422.178455.

[63] D. C. Kozen. The design and analysis of algorithms. Springer, 1992.

[64] R. Latała and K. Oleszkiewicz. On the best constant in the Khinchin-Kahane inequality. Studia Mathematica, 109(1):101–104, 1994. URL: http://eudml.org/doc/216056.

[65] F. T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, 1992.

[66] R. Lidl and H. Niederreiter. Finite Fields. Cambridge U. Press, 2nd edition, 1997. (Theorem 6.13).

[67] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15(2):215–245, 1995.

[68] M. Luby. A simple parallel algorithm for the maximal independent set problem. In Proc. 17th ACM STOC, pages 1–10, 1985.

[69] R. Lyons and Y. Peres. Probability on Trees and Networks. Cambridge University Press, 2017. URL: http://pages.iu.edu/~rdlyons/.

[70] J. Matoušek. Geometric Discrepancy: An Illustrated Guide. Springer, 2010.

[71] F. Mertens. Ein Beitrag zur analytischen Zahlentheorie. Journal für die reine und angewandte Mathematik, 78:46–62, 1874. URL: http://resolver.sub.uni-goettingen.de/purl?PPN243919689_0078.

[72] S. Micali and V. V. Vazirani. An O(√|V| · |E|) algorithm for finding maximum matching in general graphs. In Proc. 21st FOCS, pages 17–27. IEEE, 1980.

[73] H. L. Montgomery and R. C. Vaughan. Multiplicative Number Theory I. Classical Theory. Cambridge Studies in Advanced Mathematics, Vol. 97. Cambridge U. Press, 2006. Theorem 7.2.

[74] K. Mulmuley, U. V. Vazirani, and V. V. Vazirani. Matching is as easy as matrix inversion. Combinatorica, 7:105–113, 1987.

[75] D. Mumford. The dawning of the age of stochasticity. In V. Arnold, M. Atiyah, P. Lax, and B. Mazur, editors, Mathematics: Frontiers and Perspectives. AMS, 2000.

[76] C. M. Newman and L. S. Schulman. Infinite clusters in percolation models. Journal of Statistical Physics, 26(3):613–628, 1981. doi:10.1007/BF01011437.

[77] V. Pan. Fast and efficient parallel algorithms for the exact inversion of integer matrices. In S. N. Maheshwari, editor, Foundations of Software Technology and Theoretical Computer Science, volume 206 of Lecture Notes in Computer Science, pages 504–521. Springer, 1985. doi:10.1007/3-540-16042-6_29.

[78] C. Pomerance. Smooth numbers and the quadratic sieve. In J. P. Buhler and P. Stevenhagen, editors, Algorithmic number theory, pages 69–81. Cambridge U. Press, 2008.

[79] S. Rajagopalan and L. J. Schulman. Verification of identities. SIAM J. Comput., 29(4):1155–1163, 2000.

[80] F. P. Ramsey. On a problem of formal logic. Proc. London Math. Soc., 48:264–286, 1930.

[81] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Comm. ACM, 21:120–126, 1978.

[82] A. Sah. Diagonal Ramsey via effective quasirandomness, 2020. arXiv:2005.09251.

[83] I. N. Sanov. On the probability of large deviations of random variables. Mat. Sbornik, 42:11–44, 1957.

[84] J. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27:701–717, 1980.

[85] C. E. Shannon. A mathematical theory of communication. Bell System Tech. J., 27:379–423; 623–656, 1948.

[86] L. A. Shepp. The XYZ-conjecture and the FKG-inequality. Ann. Probab., 10(3):824–827, 1982.

[87] P. W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Computing, 26:1484–1509, 1997. (FOCS 1994).

[88] D. Sivakumar. Algorithmic derandomization via complexity theory. In STOC, 2002.

[89] J. Spencer. Six standard deviations suffice. Trans. Amer. Math. Soc., 289:679–706, 1985. doi:10.1090/S0002-9947-1985-0784009-0.

[90] N. Ta-Shma. A simple proof of the isolation lemma. ECCC TR15-080, 2015. URL: http://eccc.hpi-web.de/report/2015/080/.

[91] A. Thomason. An upper bound for some Ramsey numbers. J. Graph Theory, 12, 1988.

[92] W. T. Tutte. The factorization of linear graphs. J. London Math. Soc., s1-22(2):107–111, 1947. URL: http://jlms.oxfordjournals.org/content/s1-22/2/107.short.

[93] L. Valiant, S. Skyum, S. Berkowitz, and C. Rackoff. Fast parallel computation of polynomials using few processors. SIAM J. Comput., 12(4):641–644, 1983.

[94] Wikipedia. Folded normal distribution. [Online; accessed 6-November-2016]. URL: https://en.wikipedia.org/w/index.php?title=Folded_normal_distribution&oldid=748178170.

[95] Wikipedia contributors. Mertens’ theorems, 2018. [Online; accessed 1-August-2019]. URL: https://en.wikipedia.org/w/index.php?title=Mertens%27_theorems&oldid=825228172.

[96] Wikipedia contributors. Dickman function, 2019. [Online; accessed 2-October-2019]. URL: https://en.wikipedia.org/w/index.php?title=Dickman_function&oldid=901888243.

[97] R. Zippel. Probabilistic algorithms for sparse polynomials. In E. W. Ng, editor, Symbolic and Algebraic Computation, volume 72 of Lecture Notes in Computer Science, pages 216–226. Springer, 1979. doi:10.1007/3-540-09519-5_73.