
600.664: Randomized Algorithms. Professor: Rao Kosaraju. Johns Hopkins University. Scribe: Your Name

Randomized Algorithms Week 1: Concepts

Rao Kosaraju

1.1 Motivation for Randomized Algorithms

In a randomized algorithm you can toss a coin as a step of computation. Alternatively, a bit taking values 0 and 1 with equal probability can be chosen in a single step. More generally, in a single step an element out of n elements can be chosen with equal probabilities (uniformly at random). In such a setting, our goal can be to design an algorithm that minimizes run time. Randomized algorithms are often advantageous for several reasons. They may be faster than deterministic algorithms. They may also be simpler. In addition, there are problems that can be solved with randomization for which we cannot design efficient deterministic algorithms directly. In some of those situations, we can design attractive deterministic algorithms by first designing randomized algorithms and then converting them into deterministic algorithms by applying standard derandomization techniques.

Deterministic Algorithm: Worst case running time, i.e., the number of steps the algorithm takes on the worst input of length n.

Randomized Algorithm: 1) Expected number of steps the algorithm takes on the worst input of length n. 2) w.h.p. (with high probability) bounds.

As an example of a randomized algorithm, we now present the classic quicksort algorithm and derive its performance later on in the chapter. We are given a set S of n distinct elements and we want to sort them. Below is the randomized quicksort algorithm.

Algorithm RandQuickSort(S = {a1, a2, ..., an})
    If |S| ≤ 1 then output S;
    else:
        Choose a pivot element ai uniformly at random (u.a.r.) from S;
        Split the set S into S1 = {aj | aj < ai} and S2 = {aj | aj > ai} by comparing each aj with the chosen ai;
        Recurse on sets S1 and S2;
        Output the sorted set S1, then ai, and then the sorted set S2.
end Algorithm

For this algorithm we will establish that the expected number of steps on any input of length n is no more than 2n ln n. In addition, we will establish that the probability that the algorithm takes more than


$12n \ln n$ steps is no more than $\frac{1}{n^2}$. Hence if $n = 1000$ and we run the algorithm a million ($1000^2$) times, it will run for more than $12000 \ln 1000$ steps at most once. Before we can design and analyze randomized algorithms, however, it is important to review the basic concepts in probability.
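To make the pseudocode concrete, here is a minimal Python sketch of RandQuickSort; the function name and the use of Python lists are our own choices, not part of the original notes:

```python
import random

def rand_quick_sort(s):
    """Sort a list of distinct elements by recursing around a u.a.r. pivot."""
    if len(s) <= 1:
        return s
    pivot = random.choice(s)                 # choose a_i u.a.r. from S
    s1 = [x for x in s if x < pivot]         # elements smaller than the pivot
    s2 = [x for x in s if x > pivot]         # elements larger than the pivot
    return rand_quick_sort(s1) + [pivot] + rand_quick_sort(s2)

print(rand_quick_sort([5, 3, 8, 1, 9, 2]))   # [1, 2, 3, 5, 8, 9]
```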

1.2 Probability Spaces and Events

When we conduct a random experiment several possible outcomes may occur.

Definition 1 (Probability Space). A probability space consists of a universe $\Omega$, a collection of subsets of $\Omega$ known as events, and a probability function $P$ over the events that satisfies the following properties:

1. $\Omega$ is an event.
2. If $E$ is an event then $\bar{E}$ is an event.
3. If $E_1, E_2, \ldots, E_k$ are events then $\cup_i E_i$ is an event. (The union can be over a countable collection of events.)
4. For each event $E$, $P(E) \ge 0$.
5. $P(\Omega) = 1$.
6. If $E_1, E_2, \ldots, E_k$ are disjoint events then $P(\cup_i E_i) = \sum_i P(E_i)$.

Note that if $E_1, E_2$ are events then $E_1 \cap E_2$ is also an event, since $E_1 \cap E_2 = \overline{\overline{E_1 \cap E_2}} = \overline{\overline{E_1} \cup \overline{E_2}}$.

Definition 2 (Conditional Probability). The conditional probability of event $E_1$ given that event $E_2$ has occurred, written $P(E_1|E_2)$, is defined as

$$P(E_1|E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} \qquad (1)$$

At times, we write $P(E_1 \cap E_2)$ as $P(E_1, E_2)$. More generally, we write $P(E_1 \cap E_2 \cap \cdots \cap E_n)$ as $P(E_1, E_2, \cdots, E_n)$. Observation:

$$P(E_1 \cap E_2 \cap \cdots \cap E_n) = P(E_1|E_2 \cap \cdots \cap E_n)\, P(E_2|E_3 \cap \cdots \cap E_n) \cdots P(E_{n-1}|E_n)\, P(E_n).$$
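The observation can be checked numerically. Below is a small Python sketch (our own) that verifies the chain rule on a toy space of three fair coin flips; the dictionary representation of the space is an arbitrary choice:

```python
from itertools import product

omega = {o: 1 / 8 for o in product("HT", repeat=3)}  # 8 equally likely outcomes

def pr(pred):
    """Probability of the event {o : pred(o)}."""
    return sum(p for o, p in omega.items() if pred(o))

def cond(pred1, pred2):
    """Conditional probability P(E1 | E2), per Definition 2."""
    return pr(lambda o: pred1(o) and pred2(o)) / pr(pred2)

E = [lambda o, i=i: o[i] == "H" for i in range(3)]   # E_i: flip i is heads

lhs = pr(lambda o: all(e(o) for e in E))             # P(E1 ∩ E2 ∩ E3)
rhs = (cond(E[0], lambda o: E[1](o) and E[2](o))     # P(E1 | E2 ∩ E3)
       * cond(E[1], E[2])                            # P(E2 | E3)
       * pr(E[2]))                                   # P(E3)
print(lhs, rhs)                                      # both 0.125
```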

Definition 3 (Independence). Events E1 and E2 are independent if

$$P(E_1 \cap E_2) = P(E_1)\, P(E_2)$$

This can also be stated as:

$$P(E_1|E_2) = P(E_1) \quad \text{or} \quad P(E_2|E_1) = P(E_2)$$


Definition 4 (Pairwise Independence). Events $E_1, E_2, \ldots, E_n$ are pairwise independent if $(\forall i, j,\ i \ne j)$ ($E_i$ and $E_j$ are independent).

Definition 5 (k-wise independence). Events $E_1, E_2, \ldots, E_n$ are k-wise independent, $2 \le k \le n$, if for every $2 \le k_1 \le k$ and distinct indices $i_1, \ldots, i_{k_1}$

$$P\left(E_{i_1} \cap E_{i_2} \cap \cdots \cap E_{i_{k_1}}\right) = \prod_{j=1}^{k_1} P\left(E_{i_j}\right);$$

Or equivalently

$$P\left(E_{i_1} \mid E_{i_2} \cap \cdots \cap E_{i_{k_1}}\right) = P\left(E_{i_1}\right).$$
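A standard construction illustrating the gap between pairwise and full independence (our own example, not from the notes): flip two fair bits and let the third be their XOR. The sketch below checks that the events "bit i equals 1" are pairwise independent but not 3-wise independent:

```python
from itertools import product

outcomes = [(b1, b2, b1 ^ b2) for b1, b2 in product([0, 1], repeat=2)]
p = 1 / len(outcomes)                 # each of the 4 outcomes has probability 1/4

def pr(pred):
    """Probability of the event {outcome : pred(outcome)}."""
    return sum(p for o in outcomes if pred(o))

# Every pair of events "bit i is 1", "bit j is 1" is independent: 1/4 = 1/2 * 1/2.
for i in range(3):
    for j in range(i + 1, 3):
        assert pr(lambda o: o[i] == 1 and o[j] == 1) == \
               pr(lambda o: o[i] == 1) * pr(lambda o: o[j] == 1)

# But the three events together are not independent: P(all three) = 0, not 1/8.
print(pr(lambda o: o[0] == 1 and o[1] == 1 and o[2] == 1))   # 0.0
```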

1.3 Discrete Random Variables and Expectation

Definition 6 (Discrete Random Variable). A discrete random variable is a function $X : \Omega \to$ a countable subset of $\mathbb{R}$ such that for each $a$ in that subset, $\{i \mid X(i) = a\}$ is an event. We write $P(\{i \mid X(i) = a\})$ in the shorthand form $P(X = a)$. The probability function $P$ is known as the probability mass function.

Definition 7 (Expectation). The expectation, E (X), of the random variable X is defined as

$$E(X) = \sum_a a\, P(X = a) \qquad (2)$$

$E(X)$ is usually denoted by $\mu_X$, or simply $\mu$ when $X$ is understood.

Definition 8 (Joint Mass Function). If $X_1, \ldots, X_n$ are random variables, then the joint mass function $P(X_1 = a_1, \ldots, X_n = a_n)$ is defined as the probability of the event $(X_1 = a_1) \cap (X_2 = a_2) \cap \ldots \cap (X_n = a_n)$.

The probability expression can also be written as $p_{X_1, X_2, \ldots, X_n}(a_1, a_2, \ldots, a_n)$. We may find $P(X_1 = a_1)$ as follows:¹

$$P(X_1 = a_1) = \sum_{a_2, \ldots, a_n} P(X_1 = a_1, X_2 = a_2, \ldots, X_n = a_n) \qquad (3)$$

Definition 9 (Independence of Random Variables). Random variables X1,...,Xn are said to be independent if for every distinct i1, . . . , im and for every a1, . . . , am:

$$P(X_{i_1} = a_1, X_{i_2} = a_2, \ldots, X_{i_m} = a_m) = \prod_{j=1}^{m} P\left(X_{i_j} = a_j\right)$$

¹Of course this generalizes to any $X_i$ and $a_i$, but we write it as $X_1$ and $a_1$ to simplify the notation.


Definition 10 (Pairwise Independence). Random variables $X_1, X_2, \ldots, X_n$ are pairwise independent if for every distinct $i$ and $j$, $X_i$ and $X_j$ are independent.

Definition 11 (k-wise Independence). Random variables $X_1, X_2, \ldots, X_n$ are k-wise independent, $2 \le k \le n$, if for every distinct $i_1, i_2, \ldots, i_{k_1}$, $2 \le k_1 \le k$, the variables $X_{i_1}, X_{i_2}, \ldots, X_{i_{k_1}}$ are independent.

Example 1. 4 balls are thrown independently and u.a.r. into 5 bins. What is the probability that exactly 2 balls fall into bin 1?

We define our probability space to be $\Omega = \{(i_1, i_2, i_3, i_4) \mid i_j \in \{1, 2, 3, 4, 5\}\}$. The value $i_j$ specifies the bin into which ball $j$ falls. Since each ball is thrown independently and u.a.r., for every $(i_1, i_2, i_3, i_4)$, $P(i_1, i_2, i_3, i_4) = \frac{1}{5^4}$. Define a r.v. $X : \Omega \to \{0, 1, 2, 3, 4\}$ s.t. for every $(i_1, i_2, i_3, i_4)$, $X((i_1, i_2, i_3, i_4)) = \sum_j f(i_j)$, in which $f(k) = 1$ if $k = 1$ and $0$ otherwise. Note that the r.v. $X$ stands for the number of balls that fall into bin 1. We are interested in $P(X = 2)$. For any choice of 2 $j$'s s.t. $i_j = 1$, each of the other positions can take any value from $\{2, 3, 4, 5\}$. Hence the number of $(i_1, i_2, i_3, i_4)$'s s.t. exactly 2 of the $i_j$'s are 1's is $\binom{4}{2} 4^2$. Hence $P(X = 2)$ is $\binom{4}{2} 4^2 \frac{1}{5^4}$, which is given by $\binom{4}{2} \left(\frac{1}{5}\right)^2 \left(\frac{4}{5}\right)^2$.
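Since the space is small, the answer can be checked by brute force. A Python sketch (ours; it assumes Python 3.8+ for math.comb) enumerating all $5^4$ equally likely outcomes:

```python
from itertools import product
from math import comb

outcomes = list(product(range(1, 6), repeat=4))   # (i1, i2, i3, i4), ij = bin of ball j
p_exact = sum(1 for o in outcomes if o.count(1) == 2) / len(outcomes)
print(p_exact)                                    # 0.1536
print(comb(4, 2) * (1 / 5) ** 2 * (4 / 5) ** 2)   # C(4,2)(1/5)^2(4/5)^2 ≈ 0.1536
```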

Example 2. Let $X$ be the number of balls that fall into bin 1. What is the value of $E(X)$?

Using the same method as above, we have

$$P(X = 0) = \left(\tfrac{4}{5}\right)^4$$
$$P(X = 1) = \binom{4}{1} \tfrac{1}{5} \left(\tfrac{4}{5}\right)^3$$
$$P(X = 2) = \binom{4}{2} \left(\tfrac{1}{5}\right)^2 \left(\tfrac{4}{5}\right)^2$$
$$P(X = 3) = \binom{4}{3} \left(\tfrac{1}{5}\right)^3 \tfrac{4}{5}$$
$$P(X = 4) = \left(\tfrac{1}{5}\right)^4$$

Then $E(X) = 0 \left(\frac{4}{5}\right)^4 + 1 \binom{4}{1} \frac{1}{5} \left(\frac{4}{5}\right)^3 + 2 \binom{4}{2} \left(\frac{1}{5}\right)^2 \left(\frac{4}{5}\right)^2 + 3 \binom{4}{3} \left(\frac{1}{5}\right)^3 \frac{4}{5} + 4 \left(\frac{1}{5}\right)^4 = \frac{4}{5}$. This result can also be justified by appealing to our intuition: the expected number of balls that fall into any of the 5 bins should be the same. Since there are a total of 4 balls, for any bin the expected number of balls should be one fifth of 4.

1.4 Functions of Random Variables

Definition 12 (Function of a Random Variable). If $X_1, \ldots, X_k$ are random variables and $f$ is a function, then $f(X_1, \ldots, X_k)$ is a random variable such that

$$P(f(X_1, X_2, \ldots, X_k) = a) = \sum_{a_1, \ldots, a_k \,:\, f(a_1, \ldots, a_k) = a} P(X_1 = a_1, \ldots, X_k = a_k) \qquad (4)$$


Theorem 1.
$$E(f(X_1, \ldots, X_k)) = \sum_{a_1, \ldots, a_k} f(a_1, \ldots, a_k)\, P(X_1 = a_1, \ldots, X_k = a_k)$$

Example 3. Compute $E(X_1 + X_2)$ for the joint mass function given by

          X2 = 1   X2 = 2   X2 = 3
X1 = 1      .1       .1       .2
X1 = 2      .3       .1       .2

We compute $E(X_1 + X_2)$ by two different methods: by applying the definition and by applying the above theorem.

Direct Computation:

P (X1 + X2 = 2) = P (X1 = 1,X2 = 1) = .1

P (X1 + X2 = 3) = P (X1 = 1,X2 = 2) + P (X1 = 2,X2 = 1) = .1 + .3 = .4

P (X1 + X2 = 4) = P (X1 = 1,X2 = 3) + P (X1 = 2,X2 = 2) = .2 + .1 = .3

P (X1 + X2 = 5) = P (X1 = 2,X2 = 3) = .2

Hence, E(X1 + X2) = 2(.1) + 3(.4) + 4(.3) + 5(.2) = 3.6.

By Theorem:

E(X1 + X2) = (1 + 1).1 + (1 + 2).1 + (1 + 3).2 + (2 + 1).3 + (2 + 2).1 + (2 + 3).2 = 3.6
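Both computations are easy to mirror in code. A short sketch (our own) that evaluates $E(X_1 + X_2)$ both ways from the joint mass function:

```python
joint = {(1, 1): .1, (1, 2): .1, (1, 3): .2,
         (2, 1): .3, (2, 2): .1, (2, 3): .2}

# By Theorem 1: sum f(a1, a2) P(X1 = a1, X2 = a2) with f(a1, a2) = a1 + a2.
by_theorem = sum((a1 + a2) * p for (a1, a2), p in joint.items())

# Direct computation: build the mass function of X1 + X2, then apply Definition 7.
mass = {}
for (a1, a2), p in joint.items():
    mass[a1 + a2] = mass.get(a1 + a2, 0.0) + p
direct = sum(a * p for a, p in mass.items())

print(by_theorem, direct)   # both ≈ 3.6
```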

Theorem 2. If $X_1, X_2, \cdots, X_k, X_{k+1}, \cdots, X_n$ are independent r.v.'s and $f$ and $g$ are any functions, then $f(X_1, \cdots, X_k)$ and $g(X_{k+1}, \cdots, X_n)$ are independent.

1.5 Linearity of Expectation

Theorem 3 (Linearity of Expectation). Let $X_1, \ldots, X_n$ be random variables, $c_1, \ldots, c_n$ be reals, and let $X = \sum_{i=1}^{n} c_i X_i$. Then

$$E(X) = \sum_{i=1}^{n} c_i E(X_i). \qquad (5)$$


Proof.

$$E\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{a_1, \ldots, a_n} \left(\sum_{i=1}^{n} c_i a_i\right) P(X_1 = a_1, \ldots, X_n = a_n)$$
$$= \sum_{i=1}^{n} \sum_{a_1, \ldots, a_n} c_i a_i\, P(X_1 = a_1, \ldots, X_n = a_n)$$
$$= \sum_{i=1}^{n} \sum_{a_i} c_i a_i\, P(X_i = a_i)$$
$$= \sum_{i=1}^{n} c_i \sum_{a_i} a_i\, P(X_i = a_i)$$
$$= \sum_{i=1}^{n} c_i E(X_i)$$

We now apply this theorem to the 4 balls 5 bins problem of Example 2, and recompute E(X):

We express the number of balls that fall into bin 1 by the random variable $X$ given by $X = \sum_{i=1}^{4} X_i$ where

$$X_i = \begin{cases} 1 & \text{if the } i\text{th ball falls into bin 1} \\ 0 & \text{otherwise} \end{cases}$$

Clearly, $P(X_i = 1) = \frac{1}{5}$ and $P(X_i = 0) = \frac{4}{5}$. Hence $E(X_i) = \frac{1}{5}$. By linearity of expectation

$$E(X) = \sum_{i=1}^{4} E(X_i) = \frac{4}{5},$$

matching the result obtained earlier. Now let us generalize the problem to throwing n balls into n bins. We want to compute the expected number of balls that fall into bin 1.

Example 4. $X = \sum_{i=1}^{n} X_i$ as before. What is $E(X)$?

Note that $P(X_i = 1) = \frac{1}{n}$, and hence $E(X_i) = \frac{1}{n}$. Then

$$E(X) = E\left(\sum_{i=1}^{n} X_i\right) = n \cdot \frac{1}{n} = 1$$

An important and helpful property of linearity of expectation is that it doesn’t require the independence of the random variables. In the following example, we demonstrate the power of the linearity of expectation when the random variables are not independent.


Example 5. Throw n balls into n bins such that the first ball is thrown u.a.r. among the n bins, and the ith ball, $i \ge 2$, cannot fall into the bin of the $(i-1)$th ball but falls u.a.r. among the other $n - 1$ bins. Let $X = \sum_i X_i$ be the number of balls that fall into bin 1. What is $E(X)$?

Note that

$$P(X_i = 1 | X_{i-1} = 1) = 0$$
$$P(X_i = 1 | X_{i-1} = 0) = \frac{1}{n-1}$$

Claim 1. $P(X_i = 1) = \frac{1}{n}$

Proof. By induction on i.

Base Case: $P(X_1 = 1) = \frac{1}{n}$, since the first ball is thrown u.a.r. among the n bins.

Inductive Step: Assume the claim holds for $i < k$. Then $P(X_{k-1} = 1) = \frac{1}{n}$ and $P(X_{k-1} = 0) = 1 - \frac{1}{n}$. We have

$$P(X_k = 1, X_{k-1} = 1) = 0$$
$$P(X_k = 1, X_{k-1} = 0) = P(X_k = 1 | X_{k-1} = 0)\, P(X_{k-1} = 0) = \frac{1}{n-1}\left(1 - \frac{1}{n}\right) = \frac{1}{n}$$
$$P(X_k = 1) = P(X_k = 1, X_{k-1} = 1) + P(X_k = 1, X_{k-1} = 0) = 0 + \frac{1}{n} = \frac{1}{n}$$

Hence the inductive step holds.

Therefore $E(X_i) = \frac{1}{n}$, and $E(X) = E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = n \cdot \frac{1}{n} = 1$.

Although our throws are not independent this time, we are still able to compute the expec- tation of X by applying the linearity of expectation.
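A quick Monte Carlo sketch (our own; the values of n and the trial count are arbitrary) supports this: even though consecutive throws are dependent, the empirical mean of X stays near 1:

```python
import random

def balls_in_bin_one(n):
    """Throw n balls, each avoiding the previous ball's bin; count bin 1."""
    prev = random.randint(1, n)                  # first ball: u.a.r. among n bins
    count = 1 if prev == 1 else 0
    for _ in range(n - 1):
        cur = random.choice([b for b in range(1, n + 1) if b != prev])
        count += 1 if cur == 1 else 0
        prev = cur
    return count

n, trials = 10, 100_000
print(sum(balls_in_bin_one(n) for _ in range(trials)) / trials)   # ≈ 1.0
```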

1.6 Variance and Standard Deviation

Definition 13 (Variance). Let $X$ be a random variable with $E(X) = \mu_X$. The variance of $X$, denoted $Var(X)$, is defined as $Var(X) = E[(X - \mu_X)^2]$. Note that

$$Var(X) = E\left[(X - \mu_X)^2\right] = E\left(X^2 - 2X\mu_X + \mu_X^2\right) = E(X^2) - 2\mu_X^2 + \mu_X^2 = E(X^2) - \mu_X^2$$


Definition 14 (Standard Deviation). The standard deviation of $X$, denoted $\sigma_X$, is defined as $\sigma_X = \sqrt{Var(X)}$.

Lemma 1. If $X$ and $Y$ are independent random variables then $E(XY) = E(X)\,E(Y)$.²

Proof.
$$E(XY) = \sum_{a,b} ab\, P(X = a, Y = b)$$
$$= \sum_a \sum_b ab\, P(X = a)\, P(Y = b) \quad \text{by independence of } X \text{ and } Y$$
$$= \sum_a a\, P(X = a) \left(\sum_b b\, P(Y = b)\right)$$
$$= \sum_a a\, P(X = a)\, E(Y)$$
$$= E(Y) \sum_a a\, P(X = a)$$
$$= E(Y)\, E(X)$$

Lemma 2. If $X$ and $Y$ are independent random variables then $Var(X + Y) = Var(X) + Var(Y)$.

Proof.

$$Var(X + Y) = E\left[(X + Y - (\mu_X + \mu_Y))^2\right]$$
$$= E\left[(X - \mu_X)^2 + (Y - \mu_Y)^2 + 2(X - \mu_X)(Y - \mu_Y)\right]$$
$$= E\left[(X - \mu_X)^2\right] + E\left[(Y - \mu_Y)^2\right] + 2\,E(X - \mu_X)\,E(Y - \mu_Y) \quad \text{by Lemma 1}$$
$$= Var(X) + Var(Y) + 0$$
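As a toy check of Lemma 2 (our own example, not from the notes), take two independent fair dice and compute the variances exactly over all 36 equally likely pairs:

```python
from itertools import product

pairs = list(product(range(1, 7), repeat=2))

def variance(values):
    """Variance over a list of equally likely values."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

vx = variance([x for x, _ in pairs])                  # Var(X) = 35/12
vy = variance([y for _, y in pairs])                  # Var(Y) = 35/12
print(variance([x + y for x, y in pairs]), vx + vy)   # both 35/6 ≈ 5.833
```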

1.7 Moment Generating Function

Definition 15. The moment generating function (mgf) of a random variable $X$, denoted $M_X(t)$, is $E(e^{tX})$. When $X$ is understood, we simply write it as $M(t)$.

Observation: $E(X^k) = \frac{d^k}{dt^k} M(0)$

²This generalizes to n random variables.


Proof.

$$M(t) = E\left(1 + \frac{tX}{1!} + \frac{t^2 X^2}{2!} + \cdots + \frac{t^k X^k}{k!} + \frac{t^{k+1} X^{k+1}}{(k+1)!} + \cdots\right)$$
$$= 1 + \frac{t}{1!} E(X) + \frac{t^2}{2!} E(X^2) + \cdots + \frac{t^k}{k!} E(X^k) + \frac{t^{k+1}}{(k+1)!} E(X^{k+1}) + \cdots$$

Hence
$$\frac{d^k}{dt^k} M(t) = E(X^k) + \frac{t}{1!} E(X^{k+1}) + \frac{t^2}{2!} E(X^{k+2}) + \cdots$$
$$\frac{d^k}{dt^k} M(0) = E(X^k).$$

In fact this observation is the basis for calling $M(t)$ the function that generates moments.

Example 6. $X = X_1 + \cdots + X_n$, in which the $X_i$'s are independent and identically distributed with $P(X_1 = 1) = p$.

We first calculate $E(X^2)$ by making use of the additivity of variance (Lemma 2) when the r.v.'s are independent.

$$E(X^2) = Var(X) + \mu_X^2 = \sum_{i=1}^{n} Var(X_i) + (np)^2 = n(p - p^2) + n^2 p^2.$$

We now verify the result by recomputing $E(X^2)$ via the mgf.

$$M(t) = E(e^{tX}) = E\left(e^{t(X_1 + \cdots + X_n)}\right) = E(e^{tX_1})\, E(e^{tX_2}) \cdots E(e^{tX_n}) = (pe^t + 1 - p)^n$$

Now we differentiate M(t) twice, and then set t = 0.

$$M'(t) = n(pe^t + 1 - p)^{n-1} p e^t = np(pe^t + 1 - p)^{n-1} e^t$$
$$M''(t) = np(pe^t + 1 - p)^{n-2}(n-1) p e^t e^t + np(pe^t + 1 - p)^{n-1} e^t$$
$$M''(0) = np(n-1)p + np = n(p - p^2) + n^2 p^2.$$

This value matches the value derived earlier.
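The differentiation can also be done symbolically. A sketch using sympy (assuming the sympy package is available; it is not part of the notes):

```python
import sympy as sp

t, p, n = sp.symbols("t p n", positive=True)
M = (p * sp.exp(t) + 1 - p) ** n                     # the mgf computed above

second_moment = sp.diff(M, t, 2).subs(t, 0)          # M''(0) = E(X^2)
print(sp.simplify(second_moment - (n * (p - p**2) + n**2 * p**2)))   # 0
```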


1.8 Conditional Expectation

Definition 16 (Conditional Expectation). For any r.v.'s $X$ and $Y$ the expectation of $X$ conditioned on $Y = b$ is given by
$$E(X|Y = b) = \sum_a a\, P(X = a | Y = b).$$

For example, for the joint distribution of Example 3,
$$E(X_1 | X_2 = 1) = 1 \cdot \frac{.1}{.1 + .3} + 2 \cdot \frac{.3}{.1 + .3} = 1.75$$

Definition 17. E(X|Y ) is a r.v. and it is a function of Y, f(Y ), given by f(b) = E (X|Y = b). For the joint distribution of Example 3, E (X1|X2) = f(X2) is given by:

$$f(1) = 1.75 \text{ as computed above}$$
$$f(2) = E(X_1 | X_2 = 2) = 1 \cdot \frac{.1}{.1 + .1} + 2 \cdot \frac{.1}{.1 + .1} = 1.5$$
$$f(3) = E(X_1 | X_2 = 3) = 1 \cdot \frac{.2}{.2 + .2} + 2 \cdot \frac{.2}{.2 + .2} = 1.5$$

Theorem 4. $E(X) = E(E(X|Y))$. That is, $E(X) = \sum_b E(X|Y = b)\, P(Y = b)$.

For the above example, note that E (E (X1|X2)) = 1.75(.4) + 1.5(.2) + 1.5(.4) = 1.6.

Direct computation of $E(X_1)$ will also yield 1.6. Now we illustrate an application of the above theorem for a more complicated problem.

Example 7. Choose $N$ u.a.r. from $\{1, 2, \ldots, n\}$. Let $X = \sum_{i=1}^{N} X_i$, in which each $X_i$ is chosen u.a.r. from $\{0, 1, \ldots, i\}$. Compute $E(X)$.

We cannot directly apply the linearity of expectation principle since the number of variables itself is a random variable. It would be incorrect if we first compute $E(N) = \frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2}$ and then compute $E(X_1 + X_2 + \cdots + X_{(n+1)/2})$. We attack the problem by applying the above theorem: $E(X) = E(E(X|N))$.


For any $i$, $P(N = i) = \frac{1}{n}$, and

$$E(X | N = k) = E\left(\sum_{i=1}^{k} X_i\right) = \sum_{i=1}^{k} E(X_i) = \sum_{i=1}^{k} \frac{0 + 1 + \cdots + i}{i + 1} = \sum_{i=1}^{k} \frac{i}{2} = \frac{k(k+1)}{4}$$

Hence $E(X) = E(E(X|N)) = \sum_{k=1}^{n} \frac{1}{n} \cdot \frac{k(k+1)}{4} = \frac{n^2 + 3n + 2}{12}$.
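A Monte Carlo sketch (ours; n and the trial count are arbitrary choices) comparing the empirical mean against the closed form:

```python
import random

def sample_x(n):
    N = random.randint(1, n)                                   # N u.a.r. from {1, ..., n}
    return sum(random.randint(0, i) for i in range(1, N + 1))  # X_i u.a.r. from {0, ..., i}

n, trials = 10, 200_000
print(sum(sample_x(n) for _ in range(trials)) / trials)        # ≈ 11.0
print((n * n + 3 * n + 2) / 12)                                # 11.0 for n = 10
```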

2 Probabilistic Recurrences

In this section, we examine two problems and present randomized algorithms for them. The number of steps a randomized algorithm executes on a specific input of length n is a random variable. Here we are interested in the expected number of steps. The expectation can vary for different inputs of length n. Throughout this course, we are interested in the expected number of steps for the worst input. For each randomized algorithm, we first express the expected runtime by a recurrence relation.

2.1 Finding the minimum of n numbers

We will begin by finding the minimum of a set of n numbers. Although the minimum can be computed by a simple optimal deterministic algorithm, we design a randomized algorithm to gain insights into solving recurrence relations involving random variables. The randomized algorithm is presented below.

Algorithm FindMin(a1, a2, ..., an)
    choose an i u.a.r. in {1, 2, ..., n};
    compare ai with every other aj;
    if ai is the minimum element, then output ai;
    else recurse on the set of elements less than ai;
end Algorithm
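Before the analysis, here is a Python sketch of FindMin (names are our own) that also counts comparisons, so the empirical count can be compared with the $2n - 2H(n)$ bound derived below:

```python
import random

def find_min(a):
    """Return (minimum of a, number of comparisons performed)."""
    if len(a) == 1:
        return a[0], 0
    pivot = random.choice(a)                 # choose a_i u.a.r.
    smaller = [x for x in a if x < pivot]    # len(a) - 1 comparisons with a_i
    if not smaller:
        return pivot, len(a) - 1             # a_i is the minimum
    m, c = find_min(smaller)                 # recurse on elements less than a_i
    return m, c + len(a) - 1

n, trials = 100, 20_000
avg = sum(find_min(list(range(n)))[1] for _ in range(trials)) / trials
bound = 2 * n - 2 * sum(1 / i for i in range(1, n + 1))
print(avg, bound)                            # both ≈ 189.6
```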


Analysis

Let $T(n)$ be the random variable for the number of comparisons performed by the algorithm FindMin for computing the minimum of n numbers. The probability of choosing any $i$, $1 \le i \le n$, is $1/n$. Equivalently, the probability of choosing the $i$th smallest element, $1 \le i \le n$, is also $1/n$. Now we define a r.v. $X$ which takes the value $i$, $0 \le i \le n - 1$, when the size of the remaining set is $i$. Hence $P(X = i) = P(\text{choosing the } (i+1)\text{th smallest element}) = 1/n$. Hence $T(n)$ can be expressed by the recurrence equation

$$T(1) = 0, \qquad T(n) = n - 1 + T(X)$$

Note that

$$E(T(X)) = E(E(T(X)|X)) = \frac{1}{n} \sum_{i=0}^{n-1} E(T(i))$$

Hence $E(T(n)) = n - 1 + \frac{1}{n} \sum_{i=0}^{n-1} E(T(i))$. Let $E(T(j)) = f(j)$. Then, since $f(0) = f(1) = 0$, $f(n) = n - 1 + \frac{1}{n}[f(1) + f(2) + \cdots + f(n-1)]$. Multiplying both sides by n, we have

$$n f(n) = n(n-1) + [f(1) + f(2) + \cdots + f(n-1)] \qquad (6)$$

Replacing n by n − 1 in (6) we have,

$$(n-1) f(n-1) = (n-1)(n-2) + [f(1) + f(2) + \cdots + f(n-2)] \qquad (7)$$

Subtracting equation (7) from equation (6),

$$n f(n) - (n-1) f(n-1) = 2n - 2 + f(n-1) \qquad (8)$$
Hence,
$$f(n) = f(n-1) + 2 - \frac{2}{n} \qquad (9)$$

Thus by successive substitutions,
$$f(n) = f(n-1) + \left(2 - \frac{2}{n}\right)$$
$$= f(n-2) + \left(2 - \frac{2}{n-1}\right) + \left(2 - \frac{2}{n}\right)$$
$$= f(n-3) + \left(2 - \frac{2}{n-2}\right) + \left(2 - \frac{2}{n-1}\right) + \left(2 - \frac{2}{n}\right)$$
$$\vdots$$
$$= 2n - 2H(n)$$


where $H(n)$ is the $n$th Harmonic number defined as $H(n) = \sum_{i=1}^{n} \frac{1}{i}$. Therefore, we have proved that the expected number of steps taken by the algorithm FindMin is $2n - O(\ln n)$. We could also have solved the above recurrence by guessing the solution and verifying it using induction, as shown below.

Alternative Proof by Induction

We claim the solution to the recurrence relation (9) satisfies $f(n) \le 2n$, which we prove by mathematical induction.

• Base Case: $i = 1$, $f(1) = 0 \le 2$.

• Inductive Step: Let $f(i) \le 2i$ hold for all values of $i$ such that $1 \le i \le n - 1$. For $i = n$, we have
$$f(n) = f(n-1) + 2 - \frac{2}{n}$$
$$\le 2(n-1) + 2 - \frac{2}{n} \quad \text{by the inductive hypothesis}$$
$$= 2n - \frac{2}{n} \le 2n,$$
establishing the inductive step.

2.2 Randomized Quicksort

Now we derive the expected run time of the RandQuickSort algorithm presented earlier.

Analysis

Let $T(n)$ be the number of steps taken by the RandQuickSort algorithm on a set of size n. Note that the maximum value of $T(n)$ occurs when the pivot element $a_i$ is the largest/smallest element of the remaining set during each recursive call of the algorithm. In this case, $T(n) = n + (n-1) + \cdots + 1 = O(n^2)$. This value of $T(n)$ is reached with a very low probability of $\frac{2}{n} \cdot \frac{2}{n-1} \cdots \frac{2}{2} = \frac{2^{n-1}}{n!}$. Also, the best case occurs when the pivot element splits the set S into two equal sized subsets at each step, and then $T(n) = O(n \ln n)$. This implies that $T(n)$ has a distribution between $O(n \ln n)$ and $O(n^2)$. Now we derive the expected value of $T(n)$. Note that if the $i$th smallest element is chosen as the pivot element then $S_1$ and $S_2$ will be of sizes $i - 1$ and $n - i$ respectively, and this choice has a probability of $\frac{1}{n}$. The recurrence relation for $T(n)$ is:

$$T(n) = n - 1 + T(X) + T(n - 1 - X) \qquad (10)$$

where $P[X = i] = \frac{1}{n}$ for $0 \le i \le n - 1$. Taking expectations on both sides of (10),

$$E[T(n)] = n - 1 + \frac{1}{n} \sum_{i=1}^{n-1} E[T(i)] + \frac{1}{n} \sum_{j=1}^{n-1} E[T(j)] = n - 1 + \frac{2}{n} \sum_{i=1}^{n-1} E[T(i)].$$


Let $f(i) = E[T(i)]$. Then, $f(n) = n - 1 + \frac{2}{n} \sum_{i=1}^{n-1} f(i)$. Simplifying,

$$n f(n) = n(n-1) + 2(f(1) + f(2) + \cdots + f(n-1)) \qquad (11)$$

Substituting n − 1 for n in (11),

$$(n-1) f(n-1) = (n-1)(n-2) + 2(f(1) + f(2) + \cdots + f(n-2)) \qquad (12)$$

Subtracting (12) from (11), we get $n f(n) - (n-1) f(n-1) = (2n - 2) + 2 f(n-1)$, or
$$f(n) = \frac{n+1}{n} f(n-1) + \frac{2n-2}{n}.$$
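Before proving the claim below by induction, the bound can be sanity-checked numerically by iterating this recurrence directly (a sketch of ours):

```python
from math import log

f = 0.0                                      # f(1) = 0
for n in range(2, 10_001):
    f = (n + 1) / n * f + (2 * n - 2) / n    # the recurrence derived above
    assert f <= 2 * n * log(n)               # the claimed bound holds
print(f, 2 * 10_000 * log(10_000))           # ≈ 156000 vs ≈ 184207
```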

2.2.1 Claim: f(n) ≤ 2n ln n

Proof: We prove this by induction on n. Base Case: When $n = 1$, $f(1) = 0 \le 2 \cdot 1 \cdot \ln 1 = 0$ holds. Inductive Step: Let the claim hold for all values up to $n - 1$. Then,

$$f(n) = \frac{n+1}{n} f(n-1) + \frac{2n-2}{n}$$
$$\le \frac{n+1}{n} \cdot 2(n-1)\ln(n-1) + \frac{2n-2}{n} \quad \text{by the inductive hypothesis}$$
$$= \frac{2(n^2-1)}{n} \ln(n-1) + \frac{2n-2}{n}$$
$$= \frac{2(n^2-1)}{n} \left(\ln n + \ln\left(1 - \frac{1}{n}\right)\right) + \frac{2n-2}{n}$$

We make use of the standard inequality stated below.

Fact 1. $1 + x \le e^x$ for any $x \in \mathbb{R}$; hence $\ln(1 + x) \le x$ for any $x > -1$. For example, $1 + 0.2 \le e^{0.2}$ and $\ln(1 - 0.2) \le -0.2$.

Hence
$$f(n) \le \frac{2(n^2-1)}{n} \left(\ln n - \frac{1}{n}\right) + \frac{2n-2}{n}$$
$$= 2n \ln n - \frac{2}{n} \ln n - 2 + \frac{2}{n^2} + 2 - \frac{2}{n} \le 2n \ln n,$$
establishing the inductive step.

Hence the expected run time of the RandQuickSort algorithm is $O(n \ln n)$. Nevertheless, one of the limitations of using recurrence relations is that we do not know how the runtime of the algorithm is distributed around its expected value. Can this analysis be extended to answer questions such as, "With what probability does the algorithm RandQuickSort need more than $24n \ln n$ time steps?" Later on, we will apply a different technique and establish that this probability is very small. Similarly, for the case of the FindMin algorithm, by solving the recurrence relation for the expected running time, we will be unable to answer questions like the following: "What is the probability that the runtime is greater than 3n?" The answer to these and other similar queries lies in the study of Tail Inequalities, discussed in the next section.
