Randomized Algorithms

RAJEEV MOTWANI, Department of Computer Science, Stanford University, Stanford, California
PRABHAKAR RAGHAVAN, IBM Almaden Research Center, San Jose, California

The work of R. Motwani was supported by an Alfred P. Sloan Research Fellowship, an IBM Faculty Partnership Award, and NSF Young Investigator Award CCR-9357849, with matching funds from IBM, Schlumberger Foundation, Shell Foundation, and Xerox Corporation. Copyright © 1996, CRC Press. ACM Computing Surveys, Vol. 28, No. 1, March 1996.


Randomized algorithms, once viewed as a tool in computational number theory, have by now found widespread application. Growth has been fueled by the two major benefits of randomization: simplicity and speed. For many applications a randomized algorithm is the fastest algorithm available, or the simplest, or both.

A randomized algorithm is an algorithm that uses random numbers to influence the choices it makes in the course of its computation. Thus its behavior (typically quantified as running time or quality of output) varies from one execution to another even with a fixed input. In the analysis of a randomized algorithm we establish bounds on the expected value of a performance measure (e.g., the running time of the algorithm) that are valid for every input; the distribution of the performance measure is over the random choices made by the algorithm based on the random bits provided to it.

Caveat: The analysis of randomized algorithms should be distinguished from the probabilistic or average-case analysis of an algorithm, in which it is assumed that the input is chosen from a probability distribution. In the latter case, the analysis would typically imply only that the algorithm is good for most inputs but not for all.

HISTORY AND SOURCES

The roots of randomized algorithms can be traced back to Monte Carlo methods used in numerical analysis, statistical physics, and simulation. In complexity theory, the notion of a probabilistic Turing machine was proposed by de Leeuw et al. [1955] and was further explored in the pioneering work of Rabin [1963] and Gill [1977]. The earliest examples of concrete randomized algorithms appeared in the work of Berlekamp [1970], Rabin [1976], and Solovay and Strassen [1977]. Rabin [1976] explicitly proposed randomization as an algorithmic tool, using as examples problems in computational geometry and in number theory. At about the same time, Solovay and Strassen [1977] presented a randomized primality-testing algorithm; these were predated by the randomized polynomial-factoring algorithm presented by Berlekamp [1970].

Since then, an impressive array of techniques for devising and analyzing randomized algorithms has been developed. The reader may refer to Karp [1991], Maffioli et al. [1985], and Welsh [1983] for recent surveys of the research into randomized algorithms. The probabilistic (or "average-case") analysis of algorithms (sometimes also called "distributional complexity") is surveyed by Johnson [1984a], and is contrasted with randomized algorithms in his following bulletin [1984b]. The recent book by the authors [Motwani and Raghavan 1995] gives a comprehensive introduction to randomized algorithms.

PARADIGMS FOR RANDOMIZED ALGORITHMS

In spite of the multitude of areas in which randomized algorithms find application, a handful of general principles underlie almost all of them. Following the summary in Karp [1991], we present these principles in what follows.

Foiling an Adversary. In the classical worst-case analysis of deterministic algorithms, a lower bound is established on the running time of algorithms by postulating an "adversary" that constructs an input on which the algorithm fares poorly. The input thus constructed may be different for each deterministic algorithm. With a game-theoretic interpretation of the relationship between an algorithm and an adversary, we can view a randomized algorithm as a probability distribution on a set of deterministic algorithms. (This observation underlies Yao's [1977] adaptation of von Neumann's Mini-Max Theorem in game theory into a technique for establishing limits on the performance improvements possible via the use of a randomized algorithm.) Although the adversary may be able to construct an input that foils one (or a small fraction) of the deterministic algorithms in the set, it may be impossible to devise a single input that is likely to defeat a randomly chosen algorithm. For example, consider a uniform binary AND-OR tree with n leaves. Any deterministic algorithm that evaluates such a tree can be forced to read the Boolean values at every one of the n leaves. However, there is a simple randomized algorithm [Snir 1985] for which the expected number of leaves read on any input is O(n^0.794).
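To make the tree-evaluation example concrete, here is a minimal sketch of the random-order, short-circuit strategy behind Snir's bound. The representation (an alternating AND-OR tree as nested Python pairs with Boolean leaves) and the function name are illustrative assumptions, not part of the survey.

```python
import random

def evaluate(node, is_and):
    """Evaluate an alternating AND-OR tree given as nested pairs with
    Boolean leaves; is_and tells whether the current node is an AND."""
    if isinstance(node, bool):
        return node                     # reading a leaf is the costly step
    # Inspect the two subtrees in random order and stop as soon as the
    # node's value is forced: a False child settles an AND node, a True
    # child settles an OR node.
    first, second = random.sample(node, 2)
    value = evaluate(first, not is_and)
    if value != is_and:                 # short-circuit
        return value
    return evaluate(second, not is_and)

# (True OR False) AND (False OR True) is True; depending on the random
# choices, the algorithm reads between two and four of the four leaves.
tree = ((True, False), (False, True))
print(evaluate(tree, is_and=True))
```

Randomizing the order in which the children are probed is exactly what denies the adversary a fixed pathological input: no single assignment of leaf values is bad for all of the algorithm's random choices.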
Random Sampling. A pervasive theme in randomized algorithms is the idea that a small random sample from a population is representative of the population as a whole. Because computations involving small samples are inexpensive, their properties can be used to guide the computations of an algorithm attempting to determine some feature of the entire population. For instance, a simple randomized algorithm [Floyd and Rivest 1975] based on sampling finds the kth largest of n elements in 1.5n + o(n) comparison steps, with high probability. In contrast, it is known that any deterministic algorithm must make at least 2n comparisons in the worst case.

Abundance of Witnesses. Often a computational problem can be reduced to finding a witness or a certificate that would efficiently verify a hypothesis. (For example, to show that a number is not prime, it suffices to exhibit any nontrivial factor of that number.) For many problems, the witness lies in a search space that is too large to be searched exhaustively. However, if the search space were to contain a relatively large number of witnesses, a randomly chosen element is likely to be a witness. Further, independent repetitions of the sampling reduce the probability that a witness is not found on any of the repetitions. The most striking examples of this phenomenon occur in number theory. Indeed, the problem of testing a given integer for primality has no known deterministic polynomial-time algorithm. There are, however, several randomized polynomial-time algorithms [Solovay and Strassen 1978; Rabin 1980] that will, on any input, correctly perform this test with high probability.
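A minimal sketch of a witness-based primality test follows, in the spirit of Rabin's [1980] test (the formulation below is the common Miller-Rabin variant; the function name, the small-prime screen, and the default of 20 trials are illustrative assumptions). For composite n, at least three quarters of the candidate bases are witnesses of compositeness, so the chance that t independent trials all miss a witness is at most 4^-t.

```python
import random

def is_probable_prime(n, trials=20):
    """Monte Carlo primality test: a composite n is declared prime with
    probability at most 4**-trials; a prime is never rejected."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):      # screen out tiny factors first
        if n % p == 0:
            return n == p
    d, s = n - 1, 0                     # write n - 1 = 2**s * d with d odd
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(trials):
        a = random.randrange(2, n - 1)  # a random candidate base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                    # a is not a witness; draw again
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break                   # not a witness after all
        else:
            return False                # a witnesses that n is composite
    return True                         # no witness found in any trial

print(is_probable_prime(2**61 - 1))     # a Mersenne prime: True
print(is_probable_prime(2**61 + 1))     # composite (divisible by 3): False
```

Each iteration is an independent draw from the space of candidate bases; the abundance of witnesses is what lets so few draws achieve so small an error probability.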
Fingerprinting and Hashing. A fingerprint is the image of an element from a (large) universe under a mapping into another (smaller) universe. Fingerprints obtained via random mappings have many useful properties. For example, in pattern-matching applications [Karp and Rabin 1987] it can be shown that two strings are likely to be identical if their fingerprints are identical; comparing the short fingerprints is considerably faster than comparing the strings themselves. Another example is hashing [Carter and Wegman 1979], where the elements of a set S (drawn from a universe U) are stored in a table of size linear in |S| (even though |U| ≫ |S|) with the guarantee that the expected number of elements of S mapped to a given location in the table is O(1). This leads to efficient schemes for deciding membership in S. Random fingerprints have found a variety of applications in generating pseudorandom numbers and in complexity theory (for instance, the verification of algebraic identities [Freivalds 1977]).

Random Reordering. A large class of problems has the property that a relatively naive algorithm A can be shown to perform extremely well provided the input data is presented in a random order. Although A may have poor worst-case performance, randomly reordering the input data ensures that the input is unlikely to be in one of the orderings that is pathological for A. The earliest instance of this phenomenon can be found in the behavior of the Quicksort algorithm [Hoare 1962]. The random reordering approach has been particularly successful in tackling problems in data structures and computational geometry. For instance, there are simple algorithms for computing the convex hull of n points in the plane that process the input one point at a time. Such algorithms are doomed to take Ω(n²) steps if the input points are presented in an adversarial order; randomly reordering the points first yields an expected running time of O(n log n).

Load Balancing. When an algorithm must choose among a number of resources, randomization can be used to spread the load evenly among the resources. This paradigm has found many interesting applications in parallel and distributed computing, where resource utilization decisions have to be made locally, without global knowledge. Consider packet routing in an n-node butterfly network. It is known that Ω(√n) steps are required by any deterministic oblivious algorithm (a class of simple routing algorithms in which the route followed by a packet is independent of the routes of other packets). In contrast, based on ideas of Valiant [1982], it has been shown [Aleliunas 1982; Upfal 1984] that there is a randomized algorithm that terminates in O(log n) steps with high probability.

Rapidly Mixing Markov Chains. In counting problems, the goal is to determine the number of combinatorial objects with a specified property. When the space of objects is large, an appealing solution is the use of the Monte Carlo approach of determining the number of desired objects in a random sample of the entire space. In a number of cases, it can be shown that picking a uniform random sample is as difficult as the counting problem itself. A particularly successful technique for dealing with such problems is to generate near-uniform random samples by defining a Markov chain on the elements of the population, and showing that a short random walk using this Markov chain is likely to sample the population uniformly. This method is at the core of a number of algorithms used in statistical physics [Sinclair 1992].
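As a schematic illustration, the sketch below takes a short lazy random walk over a population whose permissible local moves are supplied as a neighbors map; the interface and the Metropolis-style acceptance rule are assumptions made for the example, not a construction from the survey. The acceptance rule makes the uniform distribution stationary; the paradigm's real work, per application, is proving that such a chain mixes rapidly, so that a short walk already yields a near-uniform sample.

```python
import random

def near_uniform_sample(neighbors, start, steps):
    """Walk `steps` steps from `start`; neighbors[x] lists the elements
    reachable from x by one local move.  Staying put half the time avoids
    periodicity, and accepting a proposed move from u to v with probability
    min(1, deg(u)/deg(v)) makes the uniform distribution stationary."""
    u = start
    for _ in range(steps):
        if random.random() < 0.5:       # lazy step: stay put half the time
            continue
        v = random.choice(neighbors[u])
        if random.random() < min(1.0, len(neighbors[u]) / len(neighbors[v])):
            u = v
    return u

# Toy population: binary strings of length 3, with single-bit flips as the
# local moves; every element has degree 3, so every proposal is accepted.
strings = [format(i, "03b") for i in range(8)]
flips = {s: [s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]
             for i in range(3)] for s in strings}
print(near_uniform_sample(flips, "000", steps=200))
```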