Pseudorandomness Oded Goldreich

fea-goldreich.qxp 9/23/99 3:56 PM Page 1209 Pseudorandomness Oded Goldreich his essay considers finite objects, en- associated with a unique distribution—the uni- coded by binary finite sequences called form one. In particular, by definition, one cannot strings. When we talk of distributions generate such perfect random strings from shorter we mean discrete probability distribu- random strings. tions having a finite support that is a set The second theory (cf. [11, 12]), due to Tof strings. Of special interest is the uniform dis- Solomonov, Kolmogorov, and Chaitin, is rooted in tribution, which for a length parameter n (explicit computability theory and specifically in the notion or implicit in the discussion), assigns each n-bit of a universal language (equivalently, universal string x ∈{0, 1}n equal probability (i.e., probabil- machine or computing device). It measures the ity 2n). We will colloquially speak of “perfectly ran- complexity of objects in terms of the shortest pro- dom strings”, meaning strings selected according gram (for a fixed universal machine) that generates to such a uniform distribution. the object.2 Like Shannon’s theory, Kolmogorov The second half of this century has witnessed complexity is quantitative, and perfect random the development of three theories of randomness, objects appear as an extreme case. Interestingly, a notion that has been puzzling thinkers over the in this approach one may say that a single object, ages. The first theory (cf. [3]), initiated by Shannon, rather than a distribution over objects, is perfectly is rooted in probability theory and is focused on random. Still, Kolmogorov’s approach is inher- distributions that are not perfectly random. Shan- ently intractable (i.e., Kolmogorov complexity is non’s information theory characterizes perfect uncomputable), and, by definition, one cannot randomness as the extreme case in which the in- generate strings of high Kolmogorov complexity formation content is maximized (and there is no from short random strings. redundancy at all).1 Thus, perfect randomness is The third theory, initiated by Blum, Goldwasser, Micali, and Yao [8, 2, 13], is rooted in complexity Oded Goldreich is professor of computer science at the theory and is the focus of this essay. This ap- Weizmann Institute of Science, Israel. His e-mail address proach is explicitly aimed at providing a theory of is [email protected]. perfect randomness that nevertheless allows for The author is grateful to Anthony Knapp and Susan the efficient generation of perfect random strings Landau for their many useful comments. from shorter random strings. The heart of this ap- 1 proach is the suggestion to view objects as equal In general, the amountP of information in a distribution if they cannot be told apart by any efficient D is defined as x D(x) log2 D(x). Thus, the uniform distribution over strings of length n has information measure n, and any other distribution over n-bit strings has 2For example, the string 1n has Kolmogorov complexity lower information measure. Also, for any function O(1) + log2 n (by virtue of the program “print n ones”, f : {0, 1}n →{0, 1}m with n<m, the distribution obtained which has length dominated by the encoding of n (say, by applying f to a uniformly distributed n-bit string has in binary)). In contrast, a simple counting argument information measure at most n, which is strictly lower than shows that most n-bit strings have Kolmogorov com- the length of the output. plexity at least n. NOVEMBER 1999 NOTICES OF THE AMS 1209 fea-goldreich.qxp 9/23/99 3:56 PM Page 1210 procedure. Consequently a distribution that can- take random steps, where, without loss of gener- not be efficiently distinguished from the uniform ality, a random step consists of selecting which of distribution will be considered as being random (or two predetermined steps to take next so that each rather “random for all practical purposes”, which possible step is taken with probability 1/2. These we call “pseudorandom”). Thus, randomness is choices are called the algorithm’s internal coin not an “inherent” property of objects (or distrib- tosses. utions) but rather is relative to an observer (and its computational abilities). To demonstrate this ap- The Definition of Pseudorandom proach, let us consider the following mental ex- Generators periment. Loosely speaking, a pseudorandom generator is an efficient program (or algorithm) that stretches short Alice and Bob play HEAD OR TAIL in one random strings into long pseudorandom sequences. of the following four ways. In all of We emphasize three fundamental aspects in the no- them Alice flips a coin high in the air, tion of a pseudorandom generator: and Bob is asked to guess its outcome 1. Efficiency. The generator has to be efficient. before the coin hits the floor. The al- As we associate efficient computations with ternative ways differ by the knowledge polynomial-time ones, we postulate that the Bob has before making his guess. In generator has to be implementable by a de- the first alternative, Bob has to an- terministic polynomial-time algorithm. nounce his guess before Alice flips the This algorithm takes as input a string, called coin. Clearly, in this case Bob wins with its seed. The seed captures a bounded amount probability 1/2. In the second alterna- of randomness used by a device that “gener- tive, Bob has to announce his guess ates pseudorandom sequences”. The formu- while the coin is spinning in the air. Al- lation views any such device as consisting of though the outcome is determined in a deterministic procedure applied to a ran- principle by the motion of the coin, Bob dom seed. does not have accurate information on 2. Stretching. The generator is required to stretch the motion and thus we believe that its input seed to a longer output sequence. also in this case Bob wins with proba- Specifically, it stretches n-bit long seeds into bility 1/2. The third alternative is `(n) -bit long outputs, where `(n) >n. The similar to the second, except that Bob function ` is called the stretching measure has at his disposal sophisticated equip- (or stretching function) of the generator. ment capable of providing accurate in- 3. Pseudorandomness. The generator’s output formation on the coin’s motion as well has to look random to any efficient observer. as on the environment affecting the That is, any efficient procedure should fail to outcome. However, Bob cannot process distinguish the output of a generator (on a this information in time to improve his random seed) from a truly random sequence guess. In the fourth alternative, Bob’s of the same length. The formulation of the recording equipment is directly con- last sentence refers to a general notion of com- nected to a powerful computer pro- putational indistinguishability that is the heart grammed to solve the motion equa- of the entire approach. tions and output a prediction. It is To demonstrate the above, consider the fol- conceivable that in such a case Bob can lowing suggestion for a pseudorandom generator. improve substantially his guess of the The seed consists of a pair of 32-bit integers, x and outcome of the coin. N, and the 100,000-bit output is obtained by re- We conclude that the randomness of an event is peatedly squaring the current x modulo N and relative to the information and computing re- emitting the least significant bit of each interme- 2 sources at our disposal. Thus, a natural concept diate result (i.e., let xi xi1 mod N , for 5 of pseudorandomness arises: a distribution is i =1,...,10 , and output b1,b2,...,b105 , where def pseudorandom if no efficient procedure can dis- x0 = x and bi is the least significant bit of xi). This tinguish it from the uniform distribution, where process may be generalized to seeds of length n efficient procedures are associated with (proba- (here we used n =64) and outputs of length `(n) bilistic) polynomial-time algorithms. (here l(n)=105). Such a process certainly satisfies An algorithm is called polynomial-time if there items (1) and (2) above, whereas the question exists a polynomial p so that for any possible whether item (3) holds is debatable (once a rigor- input x, the algorithm runs in time bounded by ous definition is provided). Anticipating some of p(|x|), where |x| denotes the length of the string the discussion below, we mention that, under the x. Thus, the running time of such an algorithm assumption that it is difficult to factor large inte- grows moderately as a function of the length of its gers, a slight variant of the above process is indeed input. A probabilistic algorithm is one that can a pseudorandom generator. 1210 NOTICES OF THE AMS VOLUME 46, NUMBER 10 fea-goldreich.qxp 9/23/99 3:56 PM Page 1211 Computational Indistinguishability random generator (as defined next): such distrib- Intuitively, two objects are called computationally utions are computationally indistinguishable from indistinguishable if no efficient procedure can tell uniform but are not statistically indistinguishable them apart. As usual in complexity theory, an el- from uniform. egant formulation requires asymptotic analysis Definition 2. (pseudorandom generators [2, 13]). (or rather a functional treatment of the running A deterministic polynomial-time algorithm G is time of algorithms in terms of the length of their called a pseudorandom generator if there exists a input).3 Thus, the objects in question are infinite stretching function, `:N → N, so that the following sequences of distributions, where each distribution two probability ensembles, denoted {G } ∈N and has a finite support.

Pseudorandomness Oded Goldreich

On the Randomness Complexity of Interactive Proofs and Statistical Zero-Knowledge Proofs*

LINEAR ALGEBRA METHODS in COMBINATORICS László Babai

Lecture 8: Key Derivation, Protecting Passwords, Slow Hashes, Merkle Trees

Foundations of Cryptography – a Primer Oded Goldreich

Pseudorandomness and Computational Difficulty

A Note on Perfect Correctness by Derandomization⋆

Counting T-Cliques: Worst-Case to Average-Case Reductions and Direct Interactive Proof Systems

A Mathematical Formulation of the Monte Carlo Method∗

Measures of Pseudorandomness

Pseudorandomness

Computational Complexity

Complexity Theory and Cryptography Understanding the Foundations of Cryptography Is Understanding the Foundations of Hardness