
1 Inclusion-Exclusion

Definition (Boolean polynomials). The definition is recursive: a Boolean polynomial is anything that can be built by iterating the procedure below.

• Variable names like A1,A2, ... are Boolean polynomials.

• If B1 and B2 are Boolean polynomials, then so are (B1) ∩ (B2), (B1) ∪ (B2), and the complement B̄1.

For example, A1, A1 ∩ A2, (A1 ∩ A2) ∪ A1, etc. are all Boolean polynomials. Given sets (or events in a probability space) S1, S2, ..., it makes sense to evaluate a Boolean polynomial by substituting Ai ← Si; completing the operations, we obtain a set (or event) as the value of the polynomial under this substitution. We say that a Boolean polynomial is n-variable if it uses n distinct variable names. A variable name is a 1-variable Boolean polynomial.

Definition (Indicator). Given a probability space (Ω, A, p) (for events) or a universe U (for sets), the indicator function of an A ∈ A or A ⊆ U is the function χA : Ω → R (or χA : U → R) defined by

\chi_A(a) = \begin{cases} 1, & \text{if } a \in A, \\ 0, & \text{if } a \notin A. \end{cases}
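The indicator function is easy to experiment with on a small finite universe. The following Python sketch (all names here are illustrative, not from the notes) implements χ_A in the set-based setting:

```python
# Indicator function chi_A over a finite universe U, as in the definition above.
# `indicator`, `U`, `A` are our own illustrative names.

def indicator(A):
    """Return chi_A: the function that is 1 on A and 0 elsewhere."""
    return lambda a: 1 if a in A else 0

U = {1, 2, 3, 4, 5}
A = {2, 4}
chi_A = indicator(A)
values = [chi_A(a) for a in sorted(U)]  # [0, 1, 0, 1, 0]
```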

(The definition for sets can be considered a special case of the definition for probability spaces, if we associate with a universe U the probability space (Ω, A, p) with Ω = U, A = 2^U, and p(A) = |A|/|Ω| for A ∈ A.)

Lemma 1 For any n-variable Boolean polynomial B(A1, ..., An) there exists an n-variable real function fB : R^n → R such that for any substitution of events S1, ..., Sn ∈ A from any probability space into the variables A1, ..., An, the following Ω → R functions are equal:

χB(S1,...,Sn) = fB(χS1 , ..., χSn ).

Proof. As Boolean polynomials were defined recursively, we prove the statement by an induction that mimics the recursive definition. If A is a variable name, then we define fA : R → R by fA(x) = x. Clearly for any S ∈ A and any ω ∈ Ω, χS(ω) = (fA ∘ χS)(ω), as required. If B(A1, ..., An) was defined as B = (B1) ∩ (B2) from already defined Boolean polynomials, then the variable sets of B1 and B2 are subsets of the variable set A1, ..., An, and some real functions fB1 and fB2 satisfy the requirements of the Lemma for them. Rewrite fB1 and fB2 in a form with n variables whose variables correspond in order to the variables A1, ..., An, possibly not using some variables at all. Define fB(x1, ..., xn) := fB1(x1, ..., xn) fB2(x1, ..., xn); this clearly suffices, since a product of 0-1 values is 1 exactly when both factors are 1. Similarly, if B = B̄1, set fB = 1 − fB1; and if B = (B1) ∪ (B2), set fB(x1, ..., xn) := fB1(x1, ..., xn) + fB2(x1, ..., xn) − fB1(x1, ..., xn) fB2(x1, ..., xn), after rewriting fB1 and fB2 as in the case of intersection.
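The three translations in the proof (intersection → product, complement → 1 − x, union → x + y − xy) can be verified pointwise on a finite universe. A minimal Python check, with all names our own:

```python
# Verify the indicator identities used in the proof of Lemma 1:
#   chi_{S1 ∩ S2} = chi_{S1} * chi_{S2}
#   chi_{complement of S1} = 1 - chi_{S1}
#   chi_{S1 ∪ S2} = chi_{S1} + chi_{S2} - chi_{S1} * chi_{S2}
# Illustrative sketch; `U`, `S1`, `S2` are arbitrary test sets.

def indicator(A):
    return lambda a: 1 if a in A else 0

U = set(range(8))
S1 = {0, 1, 2, 3}
S2 = {2, 3, 4, 5}
chi1, chi2 = indicator(S1), indicator(S2)

for a in U:
    assert chi1(a) * chi2(a) == indicator(S1 & S2)(a)                      # intersection
    assert 1 - chi1(a) == indicator(U - S1)(a)                             # complement
    assert chi1(a) + chi2(a) - chi1(a) * chi2(a) == indicator(S1 | S2)(a)  # union
ok = True
```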

Let us consider the following two statements, for a fixed probability space (Ω, A, p), Boolean polynomials Bj(A1, ..., An) and constants cj ∈ R (j = 1, 2, ..., m), and events S1, ..., Sn ∈ A:

• (a) for all ω ∈ Ω, \sum_{j=1}^{m} c_j \chi_{B_j(S_1,\dots,S_n)}(\omega) \ge 0;

• (b) \sum_{j=1}^{m} c_j p(B_j(S_1,\dots,S_n)) \ge 0.

It is clear that (a) implies (b): compute the expectation of both sides of (a), use that the expectation of a non-negative random variable is non-negative, and use the linearity of expectation on the left-hand side.

Theorem 2 Let us be given a probability space (Ω, A, p), Boolean polynomials Bj(A1, ..., An), and constants cj ∈ R (j = 1, 2, ..., m). The following statements are equivalent:

• (i) for all ω ∈ Ω and all substitutions Ai ← Si (Si ∈ A), we have \sum_{j=1}^{m} c_j \chi_{B_j(S_1,\dots,S_n)}(\omega) \ge 0;

• (ii) for all substitutions Ai ← Si (Si ∈ A), we have \sum_{j=1}^{m} c_j p(B_j(S_1,\dots,S_n)) \ge 0;

• (iii) for all substitutions Ai ← Si with Si = ∅ or Si = Ω, we have \sum_{j=1}^{m} c_j p(B_j(S_1,\dots,S_n)) \ge 0.

Proof. It is obvious that (i) implies (ii) and that (ii) implies (iii). We show that (iii) implies (i) by proving the contrapositive. Assume that for some events S1, ..., Sn and an ω ∈ Ω, we have \sum_{j=1}^{m} c_j \chi_{B_j(S_1,\dots,S_n)}(\omega) < 0. Define the following new events:

S_i^* = \begin{cases} \Omega, & \text{if } \omega \in S_i, \\ \emptyset, & \text{if } \omega \notin S_i. \end{cases}

Next, using the lemma for the first equation,

\chi_{B_j(S_1,\dots,S_n)}(\omega) = f_{B_j}(\chi_{S_1}(\omega), \dots, \chi_{S_n}(\omega)) = f_{B_j}(\chi_{S_1^*}(\omega), \dots, \chi_{S_n^*}(\omega)) = f_{B_j}(\chi_{S_1^*}(\omega^*), \dots, \chi_{S_n^*}(\omega^*)),

and the last equation holds for every ω* ∈ Ω, since each χ_{S_i^*} is a constant function. Observe that

p(B_j(S_1^*, \dots, S_n^*)) = E(\chi_{B_j(S_1^*,\dots,S_n^*)}) = E(f_{B_j}(\chi_{S_1^*}, \dots, \chi_{S_n^*})) = \chi_{B_j(S_1,\dots,S_n)}(\omega).

Repeating this for j = 1, 2, ..., m, we obtain the expected conclusion:

\sum_{j=1}^{m} c_j p(B_j(S_1^*, \dots, S_n^*)) = \sum_{j=1}^{m} c_j \chi_{B_j(S_1,\dots,S_n)}(\omega) < 0.
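Theorem 2 is useful in practice: to prove an inequality among probabilities of Boolean combinations, it suffices to check the 2^n substitutions Si ∈ {∅, Ω}. As a sketch (our own example, not from the notes), here is the union bound p(S1 ∪ S2) ≤ p(S1) + p(S2) verified this way; since each χ_{S_i} is then constant, a one-point Ω suffices:

```python
# Theorem 2 reduces "sum c_j p(B_j(S_1,...,S_n)) >= 0 for all events" to the
# 2^n substitutions S_i ∈ {∅, Ω}. Example: p(S1) + p(S2) - p(S1 ∪ S2) >= 0.
from itertools import product

Omega = {0}                          # a one-point space suffices for ∅/Ω checks
p = lambda A: len(A) / len(Omega)

ok = all(p(S1) + p(S2) - p(S1 | S2) >= 0
         for S1, S2 in product([set(), Omega], repeat=2))
# ok being True certifies, via Theorem 2, that the union bound holds
# for arbitrary events S1, S2 in any probability space.
```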

As every equality a = b can be written as simultaneous inequalities a ≤ b and b ≤ a, we obtain:

Corollary 3 The Theorem applies if ≥ 0 is changed to = 0 in all three clauses.

One easily obtains the inclusion-exclusion formula as an application of the corollary. First, for events Si (i ∈ {1, 2, ..., n}) and I ⊆ {1, 2, ..., n}, define S_I := \bigcap_{i \in I} S_i. The inclusion-exclusion formula says:

p\left(\bigcup_{i=1}^{n} S_i\right) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{\substack{I \subseteq \{1,2,\dots,n\} \\ |I| = i}} p(S_I). \tag{1.1}
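The formula (1.1) can be sanity-checked numerically. The following Python sketch (our own test harness) compares both sides on random subsets of a small universe, with the uniform measure p(A) = |A|/|U|:

```python
# Brute-force check of inclusion-exclusion (1.1) on random finite sets.
import random
from itertools import combinations

random.seed(1)
U = list(range(10))
n = 4
S = [set(random.sample(U, random.randint(0, len(U)))) for _ in range(n)]

p = lambda A: len(A) / len(U)        # uniform measure on U

lhs = p(set().union(*S))             # p(S_1 ∪ ... ∪ S_n)
rhs = sum((-1) ** (i - 1)
          * sum(p(set(U).intersection(*(S[j] for j in I)))
                for I in combinations(range(n), i))
          for i in range(1, n + 1))
assert abs(lhs - rhs) < 1e-9
```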

The proof, according to the corollary and the theorem, only requires checking the identity for events where, for every i, Si = ∅ or Si = Ω. If every Si = ∅, both sides of (1.1) are equal to 0. So assume that ℓ ≥ 1 of the events are = Ω and the other n − ℓ events are = ∅. Now the left-hand side of (1.1) is 1, while the right-hand side is

\binom{\ell}{1} - \binom{\ell}{2} + \binom{\ell}{3} - \dots + (-1)^{\ell-1} \binom{\ell}{\ell}.

These are equal, as the expansion of (1 − 1)^ℓ = 0 according to the binomial theorem shows.

Theorem 4 (Bonferroni Inequalities)

\sum_{i=1}^{2t} (-1)^{i-1} \sum_{\substack{I \subseteq \{1,2,\dots,n\} \\ |I| = i}} p(S_I) \le p\left(\bigcup_{i=1}^{n} S_i\right) \le \sum_{i=1}^{2t+1} (-1)^{i-1} \sum_{\substack{I \subseteq \{1,2,\dots,n\} \\ |I| = i}} p(S_I).

Proof. The result is trivial when all Si = ∅. Otherwise, assume that exactly ℓ ≥ 1 of them are = Ω. With this substitution, the theorem boils down to

\sum_{i=1}^{2t} (-1)^{i-1} \binom{\ell}{i} \le 1 \le \sum_{i=1}^{2t+1} (-1)^{i-1} \binom{\ell}{i}.

This follows from the identity

\sum_{i=0}^{k} (-1)^i \binom{\ell}{i} = 1 + \sum_{i=1}^{k} (-1)^i \left( \binom{\ell-1}{i-1} + \binom{\ell-1}{i} \right) = (-1)^k \binom{\ell-1}{k},

where the middle sum telescopes. Taking k = 2t makes the right-hand side non-negative, which gives the first inequality; taking k = 2t + 1 makes it non-positive, which gives the second.
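The Bonferroni inequalities can also be checked numerically: truncating the inclusion-exclusion sum after an odd number of terms gives an upper bound, after an even number a lower bound. A small Python sketch with our own test data:

```python
# Check the Bonferroni inequalities on random finite sets: odd truncations of
# inclusion-exclusion bound p(union) from above, even truncations from below.
import random
from itertools import combinations

random.seed(0)
U = list(range(12))
n = 5
S = [set(random.sample(U, random.randint(1, len(U)))) for _ in range(n)]
p = lambda A: len(A) / len(U)

def partial(k):
    """Inclusion-exclusion sum truncated after the first k terms."""
    return sum((-1) ** (i - 1)
               * sum(p(set.intersection(*(S[j] for j in I)))
                     for I in combinations(range(n), i))
               for i in range(1, k + 1))

target = p(set().union(*S))
for k in range(1, n + 1):
    if k % 2 == 1:
        assert partial(k) >= target - 1e-9   # odd truncation: upper bound
    else:
        assert partial(k) <= target + 1e-9   # even truncation: lower bound
```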
