Randomized Algorithms

RAJEEV MOTWANI

Department of Computer Science, Stanford University, Stanford, California

PRABHAKAR RAGHAVAN

IBM Almaden Research Center, San Jose, California

Randomized algorithms, once viewed as a tool in computational number theory, have by now found widespread application. Growth has been fueled by the two major benefits of randomization: simplicity and speed. For many applications a randomized algorithm is the fastest algorithm available, or the simplest, or both.

A randomized algorithm is an algorithm that uses random numbers to influence the choices it makes in the course of its computation. Thus its behavior (typically quantified as running time or quality of output) varies from one execution to another even with a fixed input. In the analysis of a randomized algorithm we establish bounds on the expected value of a performance measure (e.g., the running time of the algorithm) that are valid for every input; the distribution of the performance measure is over the random choices made by the algorithm based on the random bits provided to it.

Caveat: The analysis of randomized algorithms should be distinguished from the probabilistic or average-case analysis of an algorithm, in which it is assumed that the input is chosen from a probability distribution. In the latter case, the analysis would typically imply only that the algorithm is good for most inputs but not for all.

HISTORY AND SOURCES

The roots of randomized algorithms can be traced back to Monte Carlo methods used in numerical analysis, statistical physics, and simulation. In complexity theory, the notion of a probabilistic Turing machine was proposed by de Leeuw et al. [1955] and was further explored in the pioneering work of Rabin [1963] and Gill [1977]. The earliest examples of concrete randomized algorithms appeared in the work of Berlekamp [1970], Rabin [1976], and Solovay and Strassen [1977]. Rabin [1976] explicitly proposed randomization as an algorithmic tool, using as examples problems in computational geometry and in number theory. At about the same time, Solovay and Strassen [1977] presented a randomized primality-testing algorithm; these were predated by the randomized polynomial-factoring algorithm presented by Berlekamp [1970].

Since then, an impressive array of techniques for devising and analyzing randomized algorithms has been developed. The reader may refer to Karp [1991], Maffioli et al. [1985], and Welsh [1983] for recent surveys of the research into randomized algorithms. The probabilistic (or "average-case") analysis of algorithms (sometimes also called "distributional complexity") is surveyed by Johnson [1984a], and this is contrasted with randomized algorithms in his following bulletin [1984b].
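To make the definition above concrete, here is a small sketch of ours (not an algorithm from this survey): selection of the kth smallest element with a uniformly random pivot. On a fixed input, the answer never changes, but the number of comparisons performed is a random variable; its expectation is O(n) for every input, which is exactly the kind of guarantee the analysis described above establishes.

```python
import random

def quickselect(items, k):
    """Return the kth smallest element of items (k is 1-based).

    The pivot is chosen uniformly at random, so the amount of work done
    varies from one execution to another even on a fixed input, but the
    expected number of comparisons is O(n) for every input.
    """
    assert 1 <= k <= len(items)
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    if k <= len(smaller):
        return quickselect(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    larger = [x for x in items if x > pivot]
    return quickselect(larger, k - len(smaller) - len(equal))
```

Running this repeatedly on the same list always yields the same answer; only the random pivot choices, and hence the running time, differ between runs.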

The work of R. Motwani was supported by an Alfred P. Sloan Research Fellowship, an IBM Faculty Partnership Award, and NSF Young Investigator Award CCR-9357849, with matching funds from IBM, Schlumberger Foundation, Shell Foundation, and Xerox Corporation. Copyright © 1996, CRC Press.

ACM Computing Surveys, Vol. 28, No. 1, March 1996

The recent book by the authors [Motwani and Raghavan 1995] gives a comprehensive introduction to randomized algorithms.

PARADIGMS FOR RANDOMIZED ALGORITHMS

In spite of the multitude of areas in which randomized algorithms find application, a handful of general principles underlie almost all of them. Using the summary in Karp [1991], we present these principles in what follows.

Foiling an Adversary. In the classical worst-case analysis of deterministic algorithms, a lower bound is established on the running time of algorithms by postulating an "adversary" that constructs an input on which the algorithm fares poorly. The input thus constructed may be different for each deterministic algorithm. With a game-theoretic interpretation of the relationship between an algorithm and an adversary, we can view a randomized algorithm as a probability distribution on a set of deterministic algorithms. (This observation underlies Yao's [1977] adaptation of von Neumann's Mini-Max Theorem in game theory into a technique for establishing limits on the performance improvements possible via the use of a randomized algorithm.) Although the adversary may be able to construct an input that foils one (or a small fraction) of the deterministic algorithms in the set, it may be impossible to devise a single input that is likely to defeat a randomly chosen algorithm. For example, consider a uniform binary AND-OR tree with n leaves. Any deterministic algorithm that evaluates such a tree can be forced to read the Boolean values at every one of the n leaves. However, there is a simple randomized algorithm [Snir 1985] for which the expected number of leaves read on any input is O(n^0.794).

Random Sampling. A pervasive theme in randomized algorithms is the idea that a small random sample from a population is representative of the population as a whole. Because computations involving small samples are inexpensive, their properties can be used to guide the computations of an algorithm attempting to determine some feature of the entire population. For instance, a simple randomized algorithm [Floyd and Rivest 1975] based on sampling finds the kth largest of n elements in 1.5n + o(n) comparison steps, with high probability. In contrast, it is known that any deterministic algorithm must make at least 2n comparisons in the worst case.

Abundance of Witnesses. Often a computational problem can be reduced to finding a witness or a certificate that would efficiently verify a hypothesis. (For example, to show that a number is not prime, it suffices to exhibit any nontrivial factor of that number.) For many problems, the witness lies in a search space that is too large to be searched exhaustively. However, if the search space contains a relatively large number of witnesses, a randomly chosen element is likely to be a witness. Further, independent repetitions of the sampling reduce the probability that a witness is not found on any of the repetitions. The most striking examples of this phenomenon occur in number theory. Indeed, the problem of testing a given integer for primality has no known deterministic polynomial-time algorithm. There are, however, several randomized polynomial-time algorithms [Solovay and Strassen 1977; Rabin 1980] that will, on any input, correctly perform this test with high probability.

Fingerprinting and Hashing. A fingerprint is the image of an element from a (large) universe under a mapping into another (smaller) universe. Fingerprints obtained via random mappings have many useful properties. For example, in pattern-matching applications [Karp and Rabin 1987] it can be shown that two strings are likely to be identical if their fingerprints are identical.
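The fingerprinting idea can be sketched as follows. This is a simplified polynomial fingerprint in the spirit of Karp and Rabin's scheme, not their actual algorithm; the Mersenne-prime modulus and the string representation are our own choices for illustration.

```python
import random

P = (1 << 61) - 1  # a fixed large prime modulus (2^61 - 1)

def fingerprint(s, r):
    """View the string as a polynomial over GF(P) and evaluate it at r."""
    h = 0
    for ch in s:
        h = (h * r + ord(ch)) % P
    return h

def probably_equal(s, t):
    """Compare short fingerprints instead of the strings themselves.

    Equal strings always receive equal fingerprints.  Two distinct
    strings of length n agree on at most n - 1 of the P possible
    evaluation points, so a random point exposes the difference
    with probability at least 1 - (n - 1)/P.
    """
    r = random.randrange(1, P)
    return fingerprint(s, r) == fingerprint(t, r)
```

The comparison touches only two machine words once the fingerprints are computed, which is the source of the speedup described in the text.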

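The AND-OR tree evaluation mentioned under Foiling an Adversary above admits a very short sketch. The nested-pair representation of trees is our own choice, and this illustrates only the random-order short-circuiting idea behind Snir's [1985] algorithm, not a full analysis.

```python
import random

def eval_and_or(node, is_and=True):
    """Evaluate an alternating AND-OR tree given as nested pairs of Booleans.

    The two children are evaluated in random order; whenever the first
    child's value already decides the node (False under AND, True under
    OR), the sibling subtree is skipped entirely.  Against every input,
    this random short-circuiting reads only O(n^0.794) leaves in
    expectation, whereas a deterministic order can be forced to read all n.
    """
    if isinstance(node, bool):
        return node
    first, second = random.sample(node, 2)
    v = eval_and_or(first, not is_and)
    if is_and and not v:      # False short-circuits an AND node
        return False
    if not is_and and v:      # True short-circuits an OR node
        return True
    return eval_and_or(second, not is_and)
```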
Comparing the short fingerprints is considerably faster than comparing the strings themselves. Another example is hashing [Carter and Wegman 1979], where the elements of a set S (drawn from a universe U) are stored in a table of size linear in |S| (even though |U| >> |S|) with the guarantee that the expected number of elements of S mapped to a given location in the table is O(1). This leads to efficient schemes for deciding membership in S. Random fingerprints have found a variety of applications in the generation of pseudorandom numbers and in complexity theory (for instance, the verification of algebraic identities [Freivalds 1977]).

Random Reordering. A large class of problems has the property that a relatively naive algorithm A can be shown to perform extremely well provided the input data is presented in a random order. Although A may have poor worst-case performance, randomly reordering the input data ensures that the input is unlikely to be in one of the orderings that is pathological for A. The earliest instance of this phenomenon can be found in the behavior of the Quicksort algorithm [Hoare 1962]. The random reordering approach has been particularly successful in tackling problems in data structures and computational geometry. For instance, there are simple algorithms for computing the convex hull of n points in the plane that process the input one point at a time. Such algorithms are doomed to take Ω(n^2) steps if the input points are presented in an order determined by an adversary; however, if the points are processed in random order, the running time drops to O(n log n). The book by Mulmuley [1993] gives an excellent overview of randomized geometric algorithms.

Load Balancing. When we must choose between different resources (such as links in a communication network, or when assigning tasks to parallel processors), randomization can be used to "spread" the load evenly among the resources. This paradigm has found many interesting applications in parallel and distributed computing, where resource-utilization decisions have to be made locally, without global knowledge. Consider packet routing in an n-node butterfly network. It is known that Ω(√n) steps are required by any deterministic oblivious algorithm (a class of simple routing algorithms in which the route followed by a packet is independent of the routes of other packets). In contrast, based on ideas of Valiant [1982], it has been shown [Aleliunas 1982; Upfal 1984] that there is a randomized algorithm that terminates in O(log n) steps with high probability.

Rapidly Mixing Markov Chains. In counting problems, the goal is to determine the number of combinatorial objects with a specified property. When the space of objects is large, an appealing solution is the Monte Carlo approach of determining the number of desired objects in a random sample of the entire space. In a number of cases, it can be shown that picking a uniform random sample is as difficult as the counting problem itself. A particularly successful technique for dealing with such problems is to generate near-uniform random samples by defining a Markov chain on the elements of the population, and showing that a short random walk using this Markov chain is likely to sample the population uniformly. This method is at the core of a number of algorithms used in statistical physics [Sinclair 1992]. Examples include algorithms for estimating the number of perfect matchings in a graph [Jerrum and Sinclair 1989].

Isolation and Symmetry Breaking. In computing on asynchronous distributed processors, it is often necessary for a collection of processors to break a deadlock or a symmetry and make a common choice. Randomization is a powerful tool in such deadlock avoidance. For example, see the protocol for choice coordination due to Rabin [1982].
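One elementary form of randomized symmetry breaking can be sketched as follows. This is a generic leader-election toy of our own, not Rabin's choice-coordination protocol: identical processes draw random ranks and repeat until a unique minimum emerges, which then serves as the common choice.

```python
import random

def elect_leader(n, max_rank=1_000_000):
    """Break symmetry among n identical processes by drawing random ranks.

    Each process draws a rank uniformly at random; if exactly one process
    holds the minimum rank, it becomes the common choice.  Otherwise all
    processes redraw.  With a rank space much larger than n, a tie at the
    minimum is unlikely, so the expected number of rounds is close to 1.
    Returns the index of the elected process and the number of rounds used.
    """
    rounds = 0
    while True:
        rounds += 1
        ranks = [random.randrange(max_rank) for _ in range(n)]
        m = min(ranks)
        if ranks.count(m) == 1:
            return ranks.index(m), rounds
```

No deterministic protocol can accomplish this if the processes are truly identical: randomness is what breaks the symmetry.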

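The load-balancing paradigm above can be illustrated with the classic balls-into-bins experiment (a standard textbook sketch, not one of the routing algorithms cited in the text): each task picks a server uniformly at random, with no coordination at all.

```python
import random

def random_assignment(tasks, servers):
    """Assign each task to a uniformly random server; return per-server loads.

    Each decision is purely local, yet with n tasks and n servers the
    maximum load is only Theta(log n / log log n) with high probability,
    far below the worst case of n that an adversarial assignment could force.
    """
    loads = [0] * servers
    for _ in range(tasks):
        loads[random.randrange(servers)] += 1
    return loads
```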
Similarly, in parallel computation, a problem often has many feasible solutions, and it becomes important to ensure that the different processors are working towards finding the same solution. This involves isolating a specific solution out of the space of all feasible solutions without actually knowing any single element of the solution space. One clever randomized strategy for isolation chooses a random ordering on the feasible solutions and then requires the processors to focus on finding the solution of lowest rank. This idea has proved to be critical in devising efficient parallel algorithms for finding a perfect matching in a graph [Mulmuley et al. 1987].

Probabilistic Methods and Existence Proofs. The probabilistic method attempts to establish the existence of a specific type of combinatorial object by arguing that a random object from a suitably defined universe has the desired property with nonzero probability. Usually this method gives no clue about actually finding such an object. The method is sometimes used to guarantee the existence of an algorithm for solving a problem; we thus know that the algorithm exists, but have no idea what it looks like or how to construct it. The book by Alon and Spencer [1992] gives an excellent overview of this subject.

REFERENCES

ALELIUNAS, R. 1982. Randomized parallel communication. In ACM-SIGOPS Symposium on Principles of Distributed Systems, 60–72.

ALON, N. AND SPENCER, J. 1992. The Probabilistic Method. Wiley, New York.

BERLEKAMP, E. R. 1970. Factoring polynomials over large finite fields. Math. Comput. 24, 713–735.

CARTER, J. L. AND WEGMAN, M. N. 1979. Universal classes of hash functions. J. Comput. Syst. Sci. 18, 2, 143–154.

DE LEEUW, K., MOORE, E. F., SHANNON, C. E., AND SHAPIRO, N. 1955. Computability by probabilistic machines. In Automata Studies, C. E. Shannon and J. McCarthy, Eds., Princeton University Press, Princeton, NJ, 183–212.

FLOYD, R. W. AND RIVEST, R. L. 1975. Expected time bounds for selection. Commun. ACM 18, 165–172.

FREIVALDS, R. 1977. Probabilistic machines can use less running time. In Information Processing 77, Proceedings of IFIP Congress 77, B. Gilchrist, Ed. (Aug.), North-Holland, Amsterdam, 839–842.

GILL, J. 1977. Computational complexity of probabilistic Turing machines. SIAM J. Comput. 6, 4 (Dec.), 675–695.

HOARE, C. A. R. 1962. Quicksort. Comput. J. 5, 10–15.

JERRUM, M. R. AND SINCLAIR, A. 1989. Approximating the permanent. SIAM J. Comput. 18, 6 (Dec.), 1149–1178.

JOHNSON, D. S. 1984a. The NP-completeness column: An ongoing guide. J. Algorithms 5, 284–299.

JOHNSON, D. S. 1984b. The NP-completeness column: An ongoing guide. J. Algorithms 5, 433–447.

KARP, R. M. 1991. An introduction to randomized algorithms. Discrete Appl. Math. 34, 165–201.

KARP, R. M. AND RABIN, M. O. 1987. Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 31 (March), 249–260.

MAFFIOLI, F., SPERANZA, M. G., AND VERCELLIS, C. 1985. Randomized algorithms. In Combinatorial Optimization: Annotated Bibliographies, M. O'hEigertaigh, J. K. Lenstra, and A. H. G. Rinnooy Kan, Eds., Wiley, New York, 89–105.

MOTWANI, R. AND RAGHAVAN, P. 1995. Randomized Algorithms. Cambridge University Press, New York. World-Wide Web information at http://www.cup.org/Reviews&blurbs/RanAlg/RanAlg.html.

MULMULEY, K. 1993. Computational Geometry: An Introduction Through Randomized Algorithms. Prentice Hall, New York.

MULMULEY, K., VAZIRANI, U. V., AND VAZIRANI, V. V. 1987. Matching is as easy as matrix inversion. Combinatorica 7, 105–113.

RABIN, M. O. 1982. The choice coordination problem. Acta Inf. 17, 121–134.

RABIN, M. O. 1980. Probabilistic algorithm for testing primality. J. Number Theory 12, 128–138.

RABIN, M. O. 1976. Probabilistic algorithms. In Algorithms and Complexity, Recent Results and New Directions, J. F. Traub, Ed., Academic Press, New York, 21–39.

RABIN, M. O. 1963. Probabilistic automata. Inf. Control 6, 230–245.

SINCLAIR, A. 1992. Algorithms for Random Generation and Counting: A Markov Chain Approach. Progress in Theoretical Computer Science. Birkhauser, Boston.

SNIR, M. 1985. Lower bounds on probabilistic linear decision trees. Theor. Comput. Sci. 38, 69–82.

SOLOVAY, R. AND STRASSEN, V. 1977. A fast Monte-Carlo test for primality. SIAM J. Comput. 6, 1 (March), 84–85. See also SIAM J. Comput. 7, 1 (Feb.), 1978, 118.

UPFAL, E. 1984. Efficient schemes for parallel communication. J. ACM 31, 507–517.

VALIANT, L. G. 1982. A scheme for fast parallel communication. SIAM J. Comput. 11, 350–361.

WELSH, D. J. A. 1983. Randomised algorithms. Discrete Appl. Math. 5, 133–145.

YAO, A. C-C. 1977. Probabilistic computations: Towards a unified measure of complexity. In Proceedings of the 17th Annual Symposium on Foundations of Computer Science, 222–227.