fea-goldreich.qxp 9/23/99 3:56 PM Page 1209
Pseudorandomness Oded Goldreich
Oded Goldreich is professor of computer science at the Weizmann Institute of Science, Israel. His e-mail address is [email protected]. The author is grateful to Anthony Knapp and Susan Landau for their many useful comments.

NOVEMBER 1999 NOTICES OF THE AMS, VOLUME 46, NUMBER 10

This essay considers finite objects, encoded by binary finite sequences called strings. When we talk of distributions we mean discrete probability distributions having a finite support that is a set of strings. Of special interest is the uniform distribution, which for a length parameter $n$ (explicit or implicit in the discussion), assigns each $n$-bit string $x \in \{0,1\}^n$ equal probability (i.e., probability $2^{-n}$). We will colloquially speak of "perfectly random strings", meaning strings selected according to such a uniform distribution.

The second half of this century has witnessed the development of three theories of randomness, a notion that has been puzzling thinkers over the ages. The first theory (cf. [3]), initiated by Shannon, is rooted in probability theory and is focused on distributions that are not perfectly random. Shannon's information theory characterizes perfect randomness as the extreme case in which the information content is maximized (and there is no redundancy at all).¹ Thus, perfect randomness is associated with a unique distribution: the uniform one. In particular, by definition, one cannot generate such perfect random strings from shorter random strings.

¹ In general, the amount of information in a distribution $D$ is defined as $-\sum_x D(x) \log_2 D(x)$. Thus, the uniform distribution over strings of length $n$ has information measure $n$, and any other distribution over $n$-bit strings has lower information measure. Also, for any function $f : \{0,1\}^n \to \{0,1\}^m$ with $n$ …

The second theory (cf. [11, 12]), due to Solomonov, Kolmogorov, and Chaitin, is rooted in computability theory and specifically in the notion of a universal language (equivalently, universal machine or computing device). It measures the complexity of objects in terms of the shortest program (for a fixed universal machine) that generates the object.² Like Shannon's theory, Kolmogorov complexity is quantitative, and perfect random objects appear as an extreme case. Interestingly, in this approach one may say that a single object, rather than a distribution over objects, is perfectly random. Still, Kolmogorov's approach is inherently intractable (i.e., Kolmogorov complexity is uncomputable), and, by definition, one cannot generate strings of high Kolmogorov complexity from short random strings.

² For example, the string $1^n$ has Kolmogorov complexity $O(1) + \log_2 n$ (by virtue of the program "print $n$ ones", …

The third theory, initiated by Blum, Goldwasser, Micali, and Yao [8, 2, 13], is rooted in complexity theory and is the focus of this essay. This approach is explicitly aimed at providing a theory of perfect randomness that nevertheless allows for the efficient generation of perfect random strings from shorter random strings. The heart of this approach is the suggestion to view objects as equal if they cannot be told apart by any efficient procedure. Consequently a distribution that cannot be efficiently distinguished from the uniform distribution will be considered as being random (or rather "random for all practical purposes", which we call "pseudorandom"). Thus, randomness is not an "inherent" property of objects (or distributions) but rather is relative to an observer (and its computational abilities). To demonstrate this approach, let us consider the following mental experiment.

Alice and Bob play HEAD OR TAIL in one of the following four ways. In all of them Alice flips a coin high in the air, and Bob is asked to guess its outcome before the coin hits the floor. The alternative ways differ by the knowledge Bob has before making his guess. In the first alternative, Bob has to announce his guess before Alice flips the coin. Clearly, in this case Bob wins with probability 1/2. In the second alternative, Bob has to announce his guess while the coin is spinning in the air. Although the outcome is determined in principle by the motion of the coin, Bob does not have accurate information on the motion and thus we believe that also in this case Bob wins with probability 1/2. The third alternative is similar to the second, except that Bob has at his disposal sophisticated equipment capable of providing accurate information on the coin's motion as well as on the environment affecting the outcome. However, Bob cannot process this information in time to improve his guess. In the fourth alternative, Bob's recording equipment is directly connected to a powerful computer programmed to solve the motion equations and output a prediction. It is conceivable that in such a case Bob can improve substantially his guess of the outcome of the coin.

We conclude that the randomness of an event is relative to the information and computing resources at our disposal. Thus, a natural concept of pseudorandomness arises: a distribution is pseudorandom if no efficient procedure can distinguish it from the uniform distribution, where efficient procedures are associated with (probabilistic) polynomial-time algorithms.

An algorithm is called polynomial-time if there exists a polynomial $p$ so that for any possible input $x$, the algorithm runs in time bounded by $p(|x|)$, where $|x|$ denotes the length of the string $x$. Thus, the running time of such an algorithm grows moderately as a function of the length of its input. A probabilistic algorithm is one that can take random steps, where, without loss of generality, a random step consists of selecting which of two predetermined steps to take next so that each possible step is taken with probability 1/2. These choices are called the algorithm's internal coin tosses.

The Definition of Pseudorandom Generators

Loosely speaking, a pseudorandom generator is an efficient program (or algorithm) that stretches short random strings into long pseudorandom sequences. We emphasize three fundamental aspects in the notion of a pseudorandom generator:

1. Efficiency. The generator has to be efficient. As we associate efficient computations with polynomial-time ones, we postulate that the generator has to be implementable by a deterministic polynomial-time algorithm. This algorithm takes as input a string, called its seed. The seed captures a bounded amount of randomness used by a device that "generates pseudorandom sequences". The formulation views any such device as consisting of a deterministic procedure applied to a random seed.

2. Stretching. The generator is required to stretch its input seed to a longer output sequence. Specifically, it stretches $n$-bit long seeds into $\ell(n)$-bit long outputs, where $\ell(n) > n$. The function $\ell$ is called the stretching measure (or stretching function) of the generator.

3. Pseudorandomness. The generator's output has to look random to any efficient observer. That is, any efficient procedure should fail to distinguish the output of a generator (on a random seed) from a truly random sequence of the same length. The formulation of the last sentence refers to a general notion of computational indistinguishability that is the heart of the entire approach.

To demonstrate the above, consider the following suggestion for a pseudorandom generator. The seed consists of a pair of 32-bit integers, $x$ and $N$, and the 100,000-bit output is obtained by repeatedly squaring the current $x$ modulo $N$ and emitting the least significant bit of each intermediate result (i.e., let $x_i \leftarrow x_{i-1}^2 \bmod N$, for $i = 1, \ldots, 10^5$, and output $b_1, b_2, \ldots, b_{10^5}$, where $x_0 \stackrel{\mathrm{def}}{=} x$ and $b_i$ is the least significant bit of $x_i$). This process may be generalized to seeds of length $n$ (here we used $n = 64$) and outputs of length $\ell(n)$ (here $\ell(n) = 10^5$). Such a process certainly satisfies items (1) and (2) above, whereas the question whether item (3) holds is debatable (once a rigorous definition is provided). Anticipating some of the discussion below, we mention that, under the assumption that it is difficult to factor large integers, a slight variant of the above process is indeed a pseudorandom generator.

Computational Indistinguishability

Intuitively, two objects are called computationally indistinguishable if no efficient procedure can tell them apart. As usual in complexity theory, an elegant formulation requires asymptotic analysis (or rather a functional treatment of the running time of algorithms in terms of the length of their input).³ Thus, the objects in question are infinite sequences of distributions, where each distribution has a finite support. Such a sequence will be called a distribution ensemble. Typically, we consider distribution ensembles of the form $\{D_n\}_{n \in \mathbb{N}}$, where for some function $\ell : \mathbb{N} \to \mathbb{N}$, the support of each $D_n$ is a subset of $\{0,1\}^{\ell(n)}$. Furthermore, typically $\ell$ will be a positive polynomial. For such $D_n$, we denote by $e \sim D_n$ the process of selecting $e$ according to distribution $D_n$. Consequently, for a predicate $P$, we denote by $\Pr_{e \sim D_n}[P(e)]$ the probability that $P(e)$ holds when $e$ is distributed (or selected) according to $D_n$.

³ The asymptotic (or functional) treatment is not essential to this approach. One may develop the entire approach in terms of inputs of fixed lengths and an adequate notion of complexity of algorithms. However, such an alternative treatment is more cumbersome.

Definition 1. (computational indistinguishability [8, 13]). Two probability ensembles, $\{X_n\}_{n \in \mathbb{N}}$ and $\{Y_n\}_{n \in \mathbb{N}}$, are called computationally indistinguishable if for any probabilistic polynomial-time algorithm $A$, any positive polynomial $p$, and all sufficiently large $n$

$$\left| \Pr_{x \sim X_n}[A(x) = 1] - \Pr_{y \sim Y_n}[A(y) = 1] \right| < \frac{1}{p(n)}.$$

The probability is taken over $X_n$ (resp., $Y_n$) as well as over the coin tosses of algorithm $A$.

A few comments are in order. First, we have allowed algorithm $A$, which is called a distinguisher, to be probabilistic. This makes the requirement only stronger, and seems essential to several important aspects of our approach. Second, we view events occurring with probability that is upper bounded by the reciprocal of polynomials as negligible. This is well coupled with our notion of efficiency (i.e., polynomial-time computations): an event that occurs with negligible probability (as a function of a parameter $n$) will also occur with negligible probability when the experiment is repeated for poly($n$)-many times. Third, when allowing $A$ in the above definition to be an arbitrary function (rather than a probabilistic polynomial-time algorithm), one obtains a notion of statistical indistinguishability. The latter is equivalent to requiring that the variation distance between $X_n$ and $Y_n$ (i.e., $\sum_z |X_n(z) - Y_n(z)|$) is negligible (in $n$).

We note that computational indistinguishability is a strictly more liberal notion than statistical indistinguishability (cf. [13]). An important case is the one of distributions generated by a pseudorandom generator (as defined next): such distributions are computationally indistinguishable from uniform but are not statistically indistinguishable from uniform.

Definition 2. (pseudorandom generators [2, 13]). A deterministic polynomial-time algorithm $G$ is called a pseudorandom generator if there exists a stretching function, $\ell : \mathbb{N} \to \mathbb{N}$, so that the following two probability ensembles, denoted $\{G_n\}_{n \in \mathbb{N}}$ and $\{R_n\}_{n \in \mathbb{N}}$, are computationally indistinguishable:

1. Distribution $G_n$ is defined as the output of $G$ on a uniformly selected seed in $\{0,1\}^n$.

2. Distribution $R_n$ is defined as the uniform distribution on $\{0,1\}^{\ell(n)}$.

That is, letting $U_m$ denote the uniform distribution over $\{0,1\}^m$, we require that for any probabilistic polynomial-time algorithm $A$, any positive polynomial $p$, and all sufficiently large $n$

$$\left| \Pr_{s \sim U_n}[A(G(s)) = 1] - \Pr_{r \sim U_{\ell(n)}}[A(r) = 1] \right| < \frac{1}{p(n)}.$$

Thus, pseudorandom generators are efficient (i.e., polynomial-time) deterministic programs that expand short randomly selected seeds into longer pseudorandom bit sequences, where the latter are defined as computationally indistinguishable from truly random sequences by efficient (i.e., polynomial-time) algorithms. It follows that any efficient randomized algorithm maintains its performance when its internal coin tosses are substituted by a sequence generated by a pseudorandom generator. That is,

Construction 3. (typical application of pseudorandom generators). Let $A$ be a probabilistic polynomial-time algorithm, and let $\rho(n)$ denote an upper bound on its randomness complexity (i.e., number of coins that $A$ tosses on $n$-bit inputs). Let $A(x, r)$ denote the output of $A$ on input $x$ and coin tossing sequence $r \in \{0,1\}^{\rho(|x|)}$. Let $G$ be a pseudorandom generator with stretching function $\ell : \mathbb{N} \to \mathbb{N}$. Then $A_G$ is a randomized algorithm that on input $x$ proceeds as follows. It sets $k = k(|x|)$ to be the smallest integer such that $\ell(k) \geq \rho(|x|)$, uniformly selects $s \in \{0,1\}^k$, and outputs $A(x, r)$, where $r$ is the $\rho(|x|)$-bit long prefix of $G(s)$.

It can be shown that it is infeasible to find long $x$'s on which the input-output behavior of $A_G$ is noticeably different from the one of $A$, although $A_G$ may use much fewer coin tosses than $A$. This is formulated in the proposition below, where $F$ represents an algorithm trying to find $x$'s so that $A(x)$ and $A_G(x)$ are distinguishable (by an algorithm $D$).

Proposition 4. Let $A$ and $G$ be as above. For any algorithm $D$, let $\Delta_{A,D}(x)$ denote the discrepancy, as judged by $D$, in the behavior of $A$ and $A_G$ on input $x$, i.e., $\Delta_{A,D}(x)$ is defined as