COEN279 - Design and Analysis of

Probabilistic Analysis and Randomized

Dr. Tunghwa Wang Fall 2016

09/22/2016 Dr. Tunghwa Wang

Announcement

Always read the documents online unless you need to download them:
- Files could be updated.

For the first week, not everything is fully prepared yet:
- Linux account: use PuTTY to work remotely.
- E-mail: automatically forwarded to [email protected], in case you receive mail from this account instead of [email protected].
- Bonus assignments: the deadline for #2 is extended to 09/25. Future ones are due on Thursday of the week.

Probabilistic Analysis

Probabilistic analysis is the use of probability in the analysis of problems, and thus of the algorithms designed to solve them.

We compute the average-case running time by taking the average over all possible inputs.

The distribution of the inputs is the key to the analysis:
- Either we know the distribution of the inputs,
- or we assume a model for the distribution of the inputs.
- Otherwise, we cannot do probabilistic analysis.

Randomized Algorithms

Tools and techniques widely used in many applications.

Benefits:
- Simplicity
- Speed
- Robustness

Inputs: generated using a random number generator
- Pseudorandom numbers
- Distribution functions
- Queueing models
- More

Initial values:
- Randomized initial values of internal variables do not make an algorithm a randomized algorithm. By nature, such variables can be initialized with any valid values, say all set to 0.

Probabilistic or Randomized Algorithm

At least once during the algorithm, a random number is used to make a decision instead of spending time to work out which alternative is best.
- The worst-case running time of a randomized algorithm is almost always the same as that of the non-randomized algorithm.
- A good randomized algorithm has no bad inputs, only bad random numbers.
- The random numbers are important: we get an expected running time, where we now average over all possible random numbers instead of over all possible inputs. Equivalently, it is the mean time it would take to solve the same instance over and over again.

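Random Number Generators

True randomness is virtually impossible to achieve on a computer, so we use pseudorandom numbers. What is really needed is a sequence of numbers that appear to be independent.

- The linear congruential generator: $x_{i+1} = A x_i \bmod M$, where $x_0$ is the seed and $1 \le x_0 < M$. If $M$ is prime, $x_i$ is never 0. For a good choice of $A$, the sequence repeats after $M-1$ numbers (a period of $M-1$); some choices of $A$ give a shorter period than $M-1$.
- If $M$ is chosen to be a large 31-bit prime, the period is long enough for most applications: $M = 2^{31} - 1 = 2{,}147{,}483{,}647$ and $A = 48{,}271$.
- Use a fixed seed for easy debugging (the same sequence occurs every time), and an input seed (e.g., the system clock) for real runs.
- Usually we want a random real number in the open interval $(0,1)$, which is obtained by dividing by $M$.
- Multiplication overflow prevention: let $Q = \lfloor M / A \rfloor = 44{,}488$ and $R = M \bmod A = 3{,}399$. Then

$$x_{i+1} = A x_i \bmod M = A(x_i \bmod Q) - R\lfloor x_i / Q \rfloor + M\,\delta(x_i),$$

where $\delta(x_i) = \lfloor x_i / Q \rfloor - \lfloor A x_i / M \rfloor$ is 1 if and only if the remaining terms evaluate to less than zero, and 0 otherwise.

Derivation:

$$\begin{aligned}
x_{i+1} = A x_i \bmod M
&= A x_i - M\lfloor A x_i / M \rfloor \\
&= A x_i - M\lfloor x_i / Q \rfloor + M\lfloor x_i / Q \rfloor - M\lfloor A x_i / M \rfloor \\
&= A x_i - M\lfloor x_i / Q \rfloor + M\left(\lfloor x_i / Q \rfloor - \lfloor A x_i / M \rfloor\right) \\
&= A\left(Q\lfloor x_i / Q \rfloor + x_i \bmod Q\right) - M\lfloor x_i / Q \rfloor + M\left(\lfloor x_i / Q \rfloor - \lfloor A x_i / M \rfloor\right) \\
&= (AQ - M)\lfloor x_i / Q \rfloor + A(x_i \bmod Q) + M\left(\lfloor x_i / Q \rfloor - \lfloor A x_i / M \rfloor\right) \\
&= -R\lfloor x_i / Q \rfloor + A(x_i \bmod Q) + M\left(\lfloor x_i / Q \rfloor - \lfloor A x_i / M \rfloor\right) \\
&= A(x_i \bmod Q) - R\lfloor x_i / Q \rfloor + M\,\delta(x_i)
\end{aligned}$$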
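The overflow-free update can be sketched in Python as follows. This is only an illustration under stated assumptions: Python integers never overflow, so the direct product is kept as a reference to compare against, and the function names (`next_direct`, `next_schrage`, `uniform01`) are hypothetical, not from the slides.

```python
M = 2**31 - 1      # 2,147,483,647, a 31-bit prime
A = 48_271
Q = M // A         # 44,488
R = M % A          # 3,399


def next_direct(x):
    """Reference step: forms the full product A*x (safe in Python only)."""
    return (A * x) % M


def next_schrage(x):
    """Same step without forming A*x; both terms stay below M."""
    t = A * (x % Q) - R * (x // Q)
    # The correction term M*delta(x) is added exactly when t is negative.
    return t + M if t < 0 else t


def uniform01(x):
    """Map a state in 1..M-1 to a real number in the open interval (0, 1)."""
    return x / M


x = 1  # fixed seed, so every debugging run sees the same sequence
for _ in range(5):
    x = next_schrage(x)
    print(x, uniform01(x))
```

In a language with 32-bit integers only `next_schrage` would be usable; for real runs the seed could be taken from the system clock instead of the constant 1.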

Complexity

Randomized Polynomial Time (RP) is the class of problems for which a probabilistic Turing machine exists with these properties:
- It always runs in polynomial time in the input size.
- If the correct answer is NO, it always returns NO.
- If the correct answer is YES, it returns YES with probability at least 1/2 (otherwise, it returns NO).

NP ⊇ RP ⊇ P
- Unsolved problem in computer science: RP = P?

co-RP:
- When an RP algorithm returns YES, it is always right.
- The complexity class co-RP is defined similarly, except that NO is always right and YES might be wrong.

co-NP ⊇ co-RP ⊇ P

Complexity

Bounded-error Probabilistic Polynomial Time (BPP) is the class of decision problems solvable by a probabilistic Turing machine in polynomial time with an error probability of at most 1/3 for all instances:
- It is allowed to flip coins and make random decisions.
- It is guaranteed to run in polynomial time.
- On any given run of the algorithm, it has a probability of at most 1/3 of giving the wrong answer, whether the answer is YES or NO.

BPP ⊇ P, BPP ⊇ RP, BPP ⊇ co-RP

Zero-error Probabilistic Polynomial Time (ZPP) is the complexity class of problems for which a probabilistic Turing machine exists with these properties:
- The running time is polynomial in expectation for every input.
- It always returns the correct YES or NO answer.

ZPP = RP ∩ co-RP

Numerical Probabilistic Algorithms

For certain real-life problems, computing an exact solution is not possible even in principle, e.g., because of uncertainties in the experimental data or because digital computers handle only binary values. For other problems, a precise answer exists but it would take too long to compute exactly.

Numerical algorithms yield a confidence interval, and the expected precision improves as the time available to the algorithm increases. The error is usually inversely proportional to the square root of the amount of work performed.
- Buffon’s needle
- Quasi Monte Carlo integration
- Probabilistic counting

09/22/2016 Dr. Tunghwa Wang 9 Always fast but probably correct:  Monte Carlo algorithms give exact answer with high probability whatever the instance considered, although sometimes they provide a wrong answer. Generally you cannot tell if the answer is correct, but you can reduce the error probability arbitrarily by allowing the algorithm more time (amplifying the stochastic). A Monte Carlo algorithm is p-correct if it returns a correct answer with probability at least p (0 < p < 1), whatever the instance considered. p depends on the instance size but not on the instance itself. Verifying matrix multiplication Primality testing Skip list

Las Vegas Algorithm

Always correct but probably fast:
- Las Vegas algorithms make probabilistic choices to help guide them more quickly to a correct solution; they never return a wrong answer.
- Two main categories of Las Vegas algorithms: those that take longer to solve a problem when unfortunate choices are made (e.g., Quicksort), and those that allow themselves to reach a dead end and admit that they cannot find a solution in this run of the algorithm.
- A Las Vegas algorithm has the Robin Hood effect: with high probability, instances that took a long time deterministically are now solved much faster, but instances on which the deterministic algorithm was particularly good are slowed down to average.
- Let p(x) be the probability of success of one run of the algorithm LV on instance x; then the expected number of runs until success is 1/p(x). However, a correct analysis of the expected time t(x) must consider separately the expected time taken by LV(x) in case of success, s(x), and in case of failure, f(x): t(x) = s(x) + ((1 - p(x))/p(x)) f(x).

Examples:
- The eight queens problem
- Probabilistic quick-select and quick-sort
- Universal hashing
- Factoring large integers

Atlantic City Algorithm

Probably fast and probably correct:
- BPP complexity
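The Las Vegas expected-time formula t(x) = s(x) + ((1 - p(x))/p(x)) f(x) can be obtained by conditioning on the outcome of the first run. The following is a brief sketch using the slide's symbols, assuming that runs are independent and that a failed run is simply followed by a fresh attempt:

```latex
\begin{aligned}
t(x) &= p(x)\,s(x) + \bigl(1 - p(x)\bigr)\bigl(f(x) + t(x)\bigr) \\
p(x)\,t(x) &= p(x)\,s(x) + \bigl(1 - p(x)\bigr)\,f(x) \\
t(x) &= s(x) + \frac{1 - p(x)}{p(x)}\,f(x)
\end{aligned}
```

The factor (1 - p(x))/p(x) is the expected number of failed runs before the first success, which is why a cheap failure f(x) can make a low success probability acceptable.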

Backup Slides

Probability

Reference

Randomized algorithms:
- http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-856j-randomized-algorithms-fall-2002/lecture-notes/

Buffon’s Needle

Throw a needle at random on a floor made of planks of constant width. If the needle is exactly half as long as the planks are wide, and the width of the cracks between the planks is zero, the probability that the needle will fall across a crack is $1/\pi$.
- General case: the probability that a randomly thrown needle falls across a crack is $2\lambda/(\omega\pi)$, where $\lambda$ is the needle length and $\omega$ is the plank width ($\lambda \le \omega$).

Estimating $\pi$:
- $\pi \approx 2\lambda/(\omega P)$, where $P$ is the probability estimated by throwing the needle at random: $P = h/n$, where $h$ is the number of successes over $n$ throws.
- The resulting estimate will be between $\pi - \varepsilon$ and $\pi + \varepsilon$ with probability at least the desired reliability.
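The estimate described above can be simulated with a short Python sketch. It assumes a needle no longer than the plank width and models each throw by the distance from the needle's centre to the nearest crack and the needle's angle; the function name `estimate_pi` and its parameters are illustrative. Note that the simulation itself uses `math.pi` to draw the angle, so it demonstrates the statistics rather than a practical way to compute $\pi$.

```python
import math
import random


def estimate_pi(n, needle=0.5, plank=1.0, rng=random):
    """Estimate pi from n simulated throws; requires needle <= plank."""
    hits = 0
    for _ in range(n):
        # Distance from the needle's centre to the nearest crack.
        center = rng.uniform(0.0, plank / 2)
        # Acute angle between the needle and the cracks.
        angle = rng.uniform(0.0, math.pi / 2)
        if center <= (needle / 2) * math.sin(angle):
            hits += 1
    if hits == 0:
        raise ValueError("no needle crossed a crack; use more throws")
    p = hits / n                        # P = h / n
    return 2 * needle / (plank * p)     # pi ~ 2*lambda / (omega * P)


print(estimate_pi(100_000))
```

Because the error shrinks only like the inverse square root of the number of throws, each extra decimal digit costs about 100 times more work.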


Probabilistic Counting

With an n-bit counter, we can only count up to $2^n - 1$. With probabilistic counting, we can count up to a larger value at the expense of some loss of accuracy.
- Counting twice as far, up to $2^{n+1} - 2$: initialize the register to 0. Each time tick is called, flip a fair coin; if it comes up heads, add 1 to the register, otherwise do nothing. When count is called, return twice the value stored in the register.
- Counting exponentially farther, from 0 to $2^{2^n - 1} - 1$: keep in the register an estimate of the logarithm of the actual number of ticks, so that count(k) returns $2^k - 1$. This keeps the relative error, rather than the absolute error, under control.
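Both counters can be sketched in a few lines of Python. The class names are hypothetical. The slide does not say how the logarithmic register is updated, so the sketch assumes the usual rule (increment with probability $2^{-c}$ when the register holds $c$), under which $2^c - 1$ is an unbiased estimate of the number of ticks.

```python
import random


class HalfCounter:
    """Counts about twice as far: increment the register on heads only."""

    def __init__(self, rng=random):
        self.c = 0
        self.rng = rng

    def tick(self):
        if self.rng.random() < 0.5:   # fair coin
            self.c += 1

    def count(self):
        return 2 * self.c


class LogCounter:
    """Register holds roughly lg(ticks); count() returns 2**c - 1."""

    def __init__(self, rng=random):
        self.c = 0
        self.rng = rng

    def tick(self):
        # Assumed update rule (not on the slide): increment with probability 2**-c.
        if self.rng.random() < 2.0 ** -self.c:
            self.c += 1

    def count(self):
        return 2 ** self.c - 1


counter = LogCounter()
for _ in range(1000):
    counter.tick()
print(counter.c, counter.count())
```

A single `LogCounter` can be off by a large factor; it is the average over many runs that matches the true count, which is the sense in which the relative error is kept under control.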


Verifying Matrix Multiplication

The straightforward matrix multiplication algorithm takes $\Theta(n^3)$ time; Strassen’s algorithm takes $O(n^{2.81})$, and the asymptotically fastest known algorithms take about $O(n^{2.37})$.

Let $D = AB - C$, let $S \subseteq \{1, 2, \ldots, n\}$, and let $\Sigma_S(D)$ denote the vector of length $n$ obtained by adding pointwise the rows of $D$ indexed by the elements of $S$.
- $\Sigma_S(D)$ is always 0 if $AB = C$. Otherwise, let $i$ be an integer such that the $i$th row of $D$ contains at least one nonzero element. For a randomly chosen $S$, the probability that $\Sigma_S(D) \ne 0$ is at least one-half.
- Let $X$ be a binary vector of length $n$ such that $X_j = 1$ if $j \in S$ and $X_j = 0$ otherwise. Then $\Sigma_S(D) = XD$, and we want to verify whether $XAB = XC$, where computing $(XA)B$ takes $\Theta(n^2)$ time.
- Getting the answer false just once allows you to conclude that $AB \ne C$. The probability that $k$ successive calls each return the wrong answer is at most $2^{-k}$, so the algorithm is $(1 - 2^{-k})$-correct.
- Alternatively, the Monte Carlo algorithm can be given an explicit upper bound $\varepsilon$ on the tolerable error probability, and then runs in $\Theta(n^2 \lg \varepsilon^{-1})$ time.
- http://link.springer.com/chapter/10.1007/978-3-319-04298-5_33#page-1
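The verification procedure above can be sketched in Python as follows, assuming square integer matrices stored as lists of rows. The helper names `vec_mat` and `probably_equal` are illustrative; the random binary vector plays the role of $X$.

```python
import random


def vec_mat(x, m):
    """Row vector times matrix in Theta(n^2) time: returns x * m."""
    return [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]


def probably_equal(a, b, c, k=20, rng=random):
    """Monte Carlo check of AB == C.

    False is always right; True is wrong with probability at most 2**-k.
    """
    n = len(a)
    for _ in range(k):
        x = [rng.randint(0, 1) for _ in range(n)]   # indicator vector of a random S
        # Compare (XA)B with XC; AB itself is never formed.
        if vec_mat(vec_mat(x, a), b) != vec_mat(x, c):
            return False        # one mismatch proves AB != C
    return True


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(probably_equal(A, B, [[19, 22], [43, 50]]))
print(probably_equal(A, B, [[19, 22], [43, 51]]))
```

Choosing k as about $\lg \varepsilon^{-1}$ gives the explicit error bound $\varepsilon$ mentioned above.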


Primality Testing

Testing whether a d-digit number is prime by trial division takes $O(2^{d/2})$ time.

Randomized polynomial-time algorithm: if the algorithm declares that the number is not prime, then it is certainly not prime. If it declares that the number is prime, then the number is prime with high probability, but not with certainty.
- Fermat's Lesser Theorem: if $P$ is prime and $0 < A < P$, then $A^{P-1} \equiv 1 \pmod{P}$.
- Pick $1 < A < N-1$ at random. If $A^{N-1} \equiv 1 \pmod{N}$, declare that $N$ is probably prime; otherwise declare that $N$ is definitely not prime.
- False witnesses of primality: Carmichael numbers are not prime but satisfy $A^{N-1} \equiv 1 \pmod{N}$ for all $0 < A < N$ that are relatively prime to $N$.
- If $P$ is prime and $0 < X < P$, the only solutions to $X^2 \equiv 1 \pmod{P}$ are $X = 1$ and $X = P-1$.
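The two facts above (Fermat's test and the square roots of 1 modulo a prime) can be combined into one test that also catches Carmichael numbers. The sketch below uses the standard Miller-Rabin formulation of that combination, which is one common way to do it and not necessarily the exact routine the slides have in mind; the function name and the default of 20 trials are assumptions.

```python
import random


def is_probably_prime(n, trials=20, rng=random):
    """Return False only if n is certainly composite; True means probably prime."""
    if n < 2:
        return False
    if n < 4:
        return True                 # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(trials):
        a = rng.randrange(2, n - 1)     # 1 < a < n - 1
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            # Either a**(n-1) != 1 (mod n), or a square root of 1 other
            # than 1 and n-1 exists; both are impossible for a prime n.
            return False
    return True


print(is_probably_prime(2**31 - 1))   # a known prime
print(is_probably_prime(561))         # 561 = 3 * 11 * 17, a Carmichael number
```

A composite answer is always right, so only the "probably prime" verdict carries an error probability, and it shrinks geometrically with the number of trials.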


Skip List

Data structure:
- In the idealized version, every $2^i$th node has a pointer to the node $2^i$ ahead of it. The total number of pointers has only doubled, but now at most $\lg N$ nodes are examined during a search. The search consists of either advancing to a new node or dropping to a lower pointer in the same node.
- A level $k$ node is a node that has $k$ pointers; the $i$th pointer in any level $k$ node ($k \ge i$) points to the next node with at least $i$ levels. Roughly half the nodes are level 1 nodes, roughly a quarter are level 2, and, in general, approximately $1/2^i$ of the nodes are level $i$. We choose the level of each new node randomly.

Skip List

Operations:
- Find: start at the highest pointer of the header and traverse along this level until the next node is larger than the one we are looking for (or is nil). When this occurs, go to the next lower level and continue the same strategy. When progress stops at level 1, either we are in front of the node we are looking for, or it is not in the list.
- Insert: proceed as in a Find, and keep track of each point where we switch to a lower level. The new node, whose level is determined randomly, is then spliced into the list.
- Delete: after the node has been located, remove it from all the linked lists it belongs to.

Analysis:
- O(lg N) expected cost: a time-space trade-off.
- Skip lists need an estimate of the number of elements that will be in the list to determine the number of levels. Nodes of different levels need different type declarations.
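The three operations can be sketched in Python as below. This is a minimal illustration under stated assumptions: keys are unique and comparable, the maximum level is fixed at 16, each extra level is granted with probability 1/2, and a Python list of forward pointers stands in for the per-level type declarations. The class and method names are hypothetical.

```python
import random


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # forward[i]: next node with at least i+1 levels


class SkipList:
    def __init__(self, max_level=16, rng=random):
        self.max_level = max_level
        self.rng = rng
        self.head = Node(None, max_level)   # header has a pointer at every level

    def _random_level(self):
        level = 1
        while level < self.max_level and self.rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        """For each level, the last node whose key is smaller than key."""
        update = [self.head] * self.max_level
        node = self.head
        for i in reversed(range(self.max_level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node            # where the search drops to a lower level
        return update

    def find(self, key):
        nxt = self._predecessors(key)[0].forward[0]
        return nxt is not None and nxt.key == key

    def insert(self, key):
        update = self._predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return                      # already present
        node = Node(key, self._random_level())
        for i in range(len(node.forward)):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def delete(self, key):
        update = self._predecessors(key)
        target = update[0].forward[0]
        if target is None or target.key != key:
            return False
        for i in range(len(target.forward)):
            update[i].forward[i] = target.forward[i]
        return True
```

A short usage example: `s = SkipList(); s.insert(3); s.insert(1); s.find(3)` walks down from the top level of the header exactly as described for Find.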

The Eight Queens Problem

Combine backtracking with a probabilistic algorithm: first place a number of queens on the board at random, then use backtracking to try to add the remaining queens without reconsidering the positions of the queens that were placed randomly.

The more queens we place randomly, the smaller the average time needed by the subsequent backtracking stage, whether it fails or succeeds, but the greater the probability of failure. The number of randomly placed queens is the fine-tuning knob.
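The hybrid strategy can be sketched in Python as follows. The sketch assumes one queen per row, with `cols[r]` giving the column of the queen in row `r`, and picks each random queen uniformly among the columns that are still safe; the parameter `k` is the tuning knob described above. All function names are illustrative.

```python
import random


def safe(cols, col):
    """Can a queen go in the next row at column col, given the earlier rows?"""
    row = len(cols)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(cols))


def backtrack(cols, n):
    """Extend cols to n queens; only queens added here are ever removed."""
    if len(cols) == n:
        return True
    for col in range(n):
        if safe(cols, col):
            cols.append(col)
            if backtrack(cols, n):
                return True
            cols.pop()
    return False


def queens_lv(n=8, k=3, rng=random):
    """One Las Vegas attempt: k random queens, then backtracking. None on failure."""
    cols = []
    for _ in range(k):
        choices = [c for c in range(n) if safe(cols, c)]
        if not choices:
            return None             # dead end during the random phase
        cols.append(rng.choice(choices))
    return cols if backtrack(cols, n) else None


def solve(n=8, k=3, rng=random):
    """Repeat independent attempts until one succeeds."""
    while True:
        result = queens_lv(n, k, rng)
        if result is not None:
            return result


print(solve())
```

Setting `k=0` gives plain backtracking and `k=n` gives a purely random placement; intermediate values trade a cheaper backtracking stage against a higher failure rate per attempt.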

Universal Hashing

Las Vegas hashing allows us to retain the efficiency of hashing on the average, without arbitrarily favoring some programs at the expense of others. Choosing the hash function randomly at the beginning of each compilation, and again whenever rehashing becomes necessary, ensures that collision lists remain reasonably well balanced with high probability.

Universal hashing: let $U = \{0, 1, \ldots, a-1\}$ be the universe of potential indexes for the associative table, and let $B = \{0, 1, \ldots, N-1\}$ be the set of indexes in the hash table. Let $H$ be a set of functions from $U$ to B, and let $h: U \to B$ be a function chosen randomly from $H$. $H$ is a universal class of hash functions if, for any two distinct $x$ and $y$ in $U$, the probability that $h(x) = h(y)$ is at most $1/N$.

Let $p$ be a prime number at least as large as $a$, and let $i$, $j$ be two integers ($1 \le i < p$ and $0 \le j < p$). Define $h_{ij}(x) = ((ix + j) \bmod p) \bmod N$. Then the set $H$ of all such $h_{ij}$ is universal.
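The family $h_{ij}$ can be sketched in Python as follows. The helper names `make_hash` and `collision_rate` are hypothetical; `collision_rate` enumerates every function in the family for a small prime, which is only practical for illustration.

```python
import random


def make_hash(p, n, rng=random):
    """Draw h_ij(x) = ((i*x + j) % p) % n at random; p is a prime >= universe size."""
    i = rng.randrange(1, p)     # 1 <= i < p
    j = rng.randrange(0, p)     # 0 <= j < p
    return lambda x: ((i * x + j) % p) % n


def collision_rate(x, y, p, n):
    """Exact fraction of the p*(p-1) functions in H that map x and y to the same slot."""
    hits = sum(((i * x + j) % p) % n == ((i * y + j) % p) % n
               for i in range(1, p) for j in range(p))
    return hits / (p * (p - 1))


h = make_hash(p=17, n=5)
print([h(x) for x in range(10)])
print(collision_rate(3, 11, p=17, n=5), "<=", 1 / 5)
```

Calling `make_hash` again whenever rehashing becomes necessary gives a fresh function, so no fixed set of keys is bad for every run.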

Factorizing Large Integers

The factorization problem consists of finding the unique decomposition of $n$ into a product of prime factors. The splitting problem consists of finding one nontrivial divisor of $n$, provided $n$ is composite. Factorizing reduces to splitting and primality testing.

An integer is k-smooth if all its prime divisors are among the $k$ smallest prime numbers. k-smooth integers can be factorized efficiently by trial division if $k$ is small. A hard composite number is the product of two primes of roughly equal size.

Let $n$ be a composite integer, and let $a$ and $b$ be distinct integers between 1 and $n-1$ such that $a + b \ne n$. If $a^2 \bmod n = b^2 \bmod n$, then $\gcd(a+b, n)$ is a nontrivial divisor of $n$.

A good presentation: http://www.mi.fu-berlin.de/wiki/pub/Main/GunnarKlauP1winter0708/discMath_klau_hash_II.pdf
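The splitting rule above can be checked on a small case with a short Python sketch. The function name `split` is hypothetical and the numbers are chosen only for illustration: with n = 77, a = 9 and b = 2, both squares are congruent to 4 modulo 77.

```python
from math import gcd


def split(n, a, b):
    """Return a nontrivial divisor of n from a, b with a*a % n == b*b % n."""
    if not (1 <= a < n and 1 <= b < n and a != b and a + b != n):
        raise ValueError("need distinct a, b in 1..n-1 with a + b != n")
    if (a * a - b * b) % n != 0:
        raise ValueError("a^2 and b^2 are not congruent modulo n")
    # n divides (a - b)(a + b) but neither factor alone, so the gcd is proper.
    return gcd(a + b, n)


print(split(77, 9, 2))   # 9 + 2 = 11, and 77 = 7 * 11
```

The hard part of practical factoring methods is finding such a pair a, b; once it is found, the divisor costs only one gcd.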

Factorizing Large Integers

Pollard’s rho heuristic:
- Trial division with a sieve up to $N$ is guaranteed to factor completely any number up to $N^2$.
- POLLARD-RHO can factor numbers up to $N^4$; however, neither its running time nor its success is guaranteed.
- Example: $n = 1387 = 19 \cdot 73$, $x_1 = 2$, $y = 2$, $k = 2$

$x_2 = (2^2 - 1) \bmod 1387 = 3$; $\gcd(2 - 3, 1387) = 1$; $y = 3$; $k = 4$
$x_3 = (3^2 - 1) \bmod 1387 = 8$; $\gcd(3 - 8, 1387) = 1$
$x_4 = (8^2 - 1) \bmod 1387 = 63$; $\gcd(3 - 63, 1387) = 1$; $y = 63$; $k = 8$
$x_5 = (63^2 - 1) \bmod 1387 = 1194$; $\gcd(63 - 1194, 1387) = 1$
$x_6 = (1194^2 - 1) \bmod 1387 = 1186$; $\gcd(63 - 1186, 1387) = 1$
$x_7 = (1186^2 - 1) \bmod 1387 = 177$; $\gcd(63 - 177, 1387) = 19$
$x_8 = (177^2 - 1) \bmod 1387 = 814$; $\gcd(63 - 814, 1387) = 1$; $y = 814$; $k = 16$
$x_9 = (814^2 - 1) \bmod 1387 = 996$; $\gcd(814 - 996, 1387) = 1$
$x_{10} = (996^2 - 1) \bmod 1387 = 310$; $\gcd(814 - 310, 1387) = 1$
$x_{11} = (310^2 - 1) \bmod 1387 = 396$; $\gcd(814 - 396, 1387) = 19$
$x_{12} = (396^2 - 1) \bmod 1387 = 84$; $\gcd(814 - 84, 1387) = 73$
$x_{13} = (84^2 - 1) \bmod 1387 = 120$; $\gcd(814 - 120, 1387) = 1$
$x_{14} = (120^2 - 1) \bmod 1387 = 529$; $\gcd(814 - 529, 1387) = 19$
$x_{15} = (529^2 - 1) \bmod 1387 = 1053$; $\gcd(814 - 1053, 1387) = 1$
$x_{16} = (1053^2 - 1) \bmod 1387 = 595$; $\gcd(814 - 595, 1387) = 73$; $y = 595$; $k = 32$
$x_{17} = (595^2 - 1) \bmod 1387 = 339$; $\gcd(595 - 339, 1387) = 1$
$x_{18} = (339^2 - 1) \bmod 1387 = 1186 = x_6$; $\gcd(595 - 1186, 1387) = 1$
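The trace above can be reproduced with a short Python sketch of the heuristic. It follows the example's conventions: the iteration is $x \leftarrow (x^2 - 1) \bmod n$, and $y$ is refreshed whenever the step index reaches $k$, which then doubles. The default start value 2 matches the example rather than a random choice, and the `max_steps` cutoff is an added assumption because the heuristic has no guaranteed running time.

```python
from math import gcd


def pollard_rho(n, x1=2, max_steps=10_000):
    """Return the first nontrivial divisor found, or None within max_steps."""
    x = y = x1
    k = 2
    for i in range(2, max_steps + 2):
        x = (x * x - 1) % n
        d = gcd(y - x, n)
        if d != 1 and d != n:
            return d
        if i == k:              # remember x at steps 2, 4, 8, 16, ...
            y = x
            k *= 2
    return None


print(pollard_rho(1387))   # 19, found at x_7 in the trace above
```

Dividing n by the returned factor and repeating, with a primality test deciding when to stop, turns this splitting step into a full factorization.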