BIBD's, finite geometries and marching schoolgirls

Jeff Dinitz

University of Vermont, USA

A balanced incomplete block design (BIBD) is a pair (V, B) satisfying the following properties:

1. V is a v-set (the points)
2. B is a collection of b k-subsets of V (the blocks)
3. each element of V is contained in exactly r blocks
4. each 2-subset of V is contained in exactly λ blocks.

The numbers v,b,r,k, and λ are the parameters of the BIBD.

We use the notation (v,b,r,k,λ) – BIBD for such a design.

Example 1: A (7,7,3,3,1) – BIBD

V = {1,2,3,4,5,6,7}

B = {156, 137, 124, 267, 235, 346, 457} (to save space we write abc rather than {a,b,c})
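The four defining properties can be checked mechanically. The following is a minimal sketch; the helper name `design_params` is an illustrative choice and not part of the lecture. It counts block sizes, point replications and pair coverage, and reports the parameters when all three are constant.

```python
from itertools import combinations

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}                      # block sizes
    r = {sum(x in blk for blk in blocks) for x in points}  # replication numbers
    lam = {sum(x in blk and y in blk for blk in blocks)    # pair coverage
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

# Example 1, each block written as a string of digits as in the text
example1 = [{int(ch) for ch in word}
            for word in "156 137 124 267 235 346 457".split()]
print(design_params(range(1, 8), example1))   # (7, 7, 3, 3, 1)
```

Removing or altering any single block makes the function return None, since the pair counts are then no longer constant.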

This BIBD has a nice diagram (the Fano plane).

Example 2: A (9,12,4,3,1) – BIBD

V = {1,2,3,4,5,6,7,8,9}

B = {123, 456, 789, 147, 258, 369, 159, 267, 348, 168, 249, 357}

This also has a nice geometric representation:

Example 3: A (16,16,6,6,2) – BIBD

V = the 16 cells of a 4 × 4 array (i.e. V = {1,2,3,4} × {1,2,3,4})

For each cell (i,j) of the array construct a block that contains all the cells in row i and column j except the cell (i,j).

B = the set of all blocks so constructed.

Some connections between the parameters.

Theorem: In any (v,b,r,k,λ) – BIBD,

1. v · r = b · k
2. r (k – 1) = λ (v – 1)

Parameter sets that satisfy (1) and (2) are admissible.

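Proof.

1. Count in two ways the pairs (x, B) where x is a point, B is a block and x ∈ B. Each of the v points lies in r blocks, giving v · r pairs; each of the b blocks contains k points, giving b · k pairs.

2. Fix a point x and count in two ways the pairs (y, B) with y ≠ x and {x, y} ⊆ B. Each of the v – 1 points y ≠ x lies with x in λ blocks, giving λ (v – 1) pairs; each of the r blocks containing x contains k – 1 further points, giving r (k – 1) pairs.

So the three parameters v, k, and λ determine the remaining two as r = λ (v – 1) / (k – 1) and b = vr / k.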

Hence we write (v,k,λ) – BIBD (or (v,k,λ) – design) to denote a (v,b,r,k,λ) – BIBD.

Corollary: If a (v,k,λ) – BIBD exists, then λ (v – 1) ≡ 0 (mod k – 1) and λ v (v – 1) ≡ 0 (mod k (k – 1)).

These are necessary conditions (not sufficient).
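The divisibility conditions are easy to test numerically. A small sketch (the function name `remaining_parameters` is hypothetical) that returns r and b for an admissible triple and None otherwise:

```python
def remaining_parameters(v, k, lam):
    """Return (r, b) if (v, k, lam) is admissible, otherwise None."""
    if (lam * (v - 1)) % (k - 1) != 0:
        return None                      # r would not be an integer
    r = lam * (v - 1) // (k - 1)
    if (v * r) % k != 0:
        return None                      # b would not be an integer
    return r, v * r // k

print(remaining_parameters(7, 3, 1))    # (3, 7)    Example 1
print(remaining_parameters(9, 3, 1))    # (4, 12)   Example 2
print(remaining_parameters(16, 6, 2))   # (6, 16)   Example 3
print(remaining_parameters(8, 3, 1))    # None: 7 is not divisible by 2
print(remaining_parameters(43, 7, 1))   # (7, 43)
```

The last triple illustrates "necessary, not sufficient": (43,7,1) is admissible, yet a design with these parameters would be a projective plane of order 6, which does not exist.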

A bound on the parameters.

Theorem: (Fisher's Inequality, 1940) If a (v,b,r,k,λ) – BIBD exists with 2 ≤ k < v, then b ≥ v.

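Proof: Assume (V, B) is a (v,k,λ) – BIBD and construct the v × b incidence matrix A = (a_ij) of the design, where a_ij = 1 if point i lies in block j and a_ij = 0 otherwise.

Each point lies in r blocks and each pair of distinct points lies in λ blocks, so AA^T has r in every diagonal position and λ everywhere else:

$$AA^T = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} = (r - \lambda) I + \lambda J.$$

We compute the determinant of AA^T. Subtracting the first row from every other row and then adding every other column to the first column leaves an upper triangular matrix with diagonal entries r + (v – 1)λ, r – λ, …, r – λ. Hence

det(AA^T) = (r + (v – 1)λ) (r – λ)^(v – 1) = rk (r – λ)^(v – 1) ≠ 0,

where r + (v – 1)λ = r + r (k – 1) = rk, and r > λ because k < v.

So AA^T is nonsingular, and thus AA^T has rank v.

Since rank(A) ≥ rank(AA^T) = v, and since b ≥ rank(A) (A has only b columns),

we get that b ≥ v.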
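As a numerical sanity check of the two facts used in the proof, the sketch below builds the incidence matrix of Example 1, forms AA^T, and compares its exact determinant with rk(r – λ)^(v – 1). The `determinant` helper is an illustrative implementation (Gaussian elimination over the rationals), not something taken from the lecture.

```python
from fractions import Fraction

blocks = [{1, 5, 6}, {1, 3, 7}, {1, 2, 4}, {2, 6, 7}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}]
v, r, k, lam = 7, 3, 3, 1

# v x b incidence matrix: rows are points, columns are blocks
A = [[1 if x in blk else 0 for blk in blocks] for x in range(1, v + 1)]
# Gram matrix A A^T: entry (i, j) is the inner product of rows i and j
gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((i for i in range(col, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for i in range(col + 1, len(m)):
            factor = m[i][col] / m[col][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return det

expected = [[r if i == j else lam for j in range(v)] for i in range(v)]
print(gram == expected)                                   # True
print(determinant(gram), r * k * (r - lam) ** (v - 1))    # 576 576
```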

In the extremal case of b = v, the design is called a symmetric design.

In a symmetric (v,k,λ) – BIBD every pair of blocks intersects in λ points. So the dual of a symmetric design (exchanging the roles of blocks and points) is also a design (but not necessarily the same as the original).

Theorem: If (V, B) is a (v,k,λ) – BIBD, then the set of subsets obtained by taking the complement of each block is a (v, v – k, b – 2r + λ) – BIBD.

This design is called the complement of the original design.
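Both statements can be illustrated on Example 3, which is symmetric (v = b = 16). The sketch below rebuilds that design from its row-and-column description, checks that any two blocks meet in λ = 2 points, and checks that the complements form a (16, 10, 6) design, since b – 2r + λ = 16 – 12 + 2 = 6. The helper `design_params` is an illustrative name.

```python
from itertools import combinations, product

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}
    r = {sum(x in blk for blk in blocks) for x in points}
    lam = {sum(x in blk and y in blk for blk in blocks)
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

cells = list(product(range(1, 5), repeat=2))
# block of cell (i, j): cells in row i or column j, but not (i, j) itself
blocks = [frozenset(c for c in cells if (c[0] == i) != (c[1] == j))
          for i, j in cells]

intersections = {len(b1 & b2) for b1, b2 in combinations(blocks, 2)}
complements = [frozenset(cells) - blk for blk in blocks]

print(design_params(cells, blocks))        # (16, 16, 6, 6, 2)
print(intersections)                       # {2}
print(design_params(cells, complements))   # (16, 16, 10, 10, 6)
```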

For what admissible parameters (v,k,λ) do there exist (v,k,λ) – BIBD?

The BIBD Table:

Some special classes of designs:

• Symmetric designs (v = b)
• Steiner systems (λ = 1)
• Steiner triple systems (k = 3 and λ = 1)
• Hadamard designs ((4m – 1, 2m – 1, m – 1) – designs)
• Resolvable designs (the blocks are resolvable into parallel classes)

A parallel class in a design is a set of blocks that partition the point set.

A resolvable balanced incomplete block design is a (v,k,λ) – BIBD whose blocks can be partitioned into parallel classes.

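Example: the (9,3,1) – BIBD in Example 2 is resolvable. Its four parallel classes are {123, 456, 789}, {147, 258, 369}, {159, 267, 348} and {168, 249, 357}.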

In 1850, the Reverend Thomas P. Kirkman posed the question:

Fifteen young ladies in a school walk out three abreast for seven days in succession: it is required to arrange them daily, so that no two walk twice abreast.

This is now referred to as Kirkman's 15 schoolgirl problem. We see that it requires the construction of a resolvable (15,3,1)- BIBD. Here is a solution to this problem:

One Solution to the Kirkman Schoolgirl Problem

Monday   Tuesday   Wednesday   Thursday   Friday   Saturday   Sunday
abc      ahi       ajk         ade        afg      alm        ano
djn      beg       bmo         bln        bhj      bik        bdf
ehm      cmn       cef         cij        clo      cdg        chk
fio      dko       dhl         fkm        dim      ejo        eil
gkl      fjl       gin         gho        ekn      fhn        gjm
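The solution above can be checked directly. In the sketch below the schedule is transcribed day by day, reading each day as five triples on the letters a to o (the transcription is an assumption about how the table is laid out). It verifies that each day is a parallel class and that no pair of girls walks together twice.

```python
from itertools import combinations

schedule = {
    "Monday":    "abc djn ehm fio gkl",
    "Tuesday":   "ahi beg cmn dko fjl",
    "Wednesday": "ajk bmo cef dhl gin",
    "Thursday":  "ade bln cij fkm gho",
    "Friday":    "afg bhj clo dim ekn",
    "Saturday":  "alm bik cdg ejo fhn",
    "Sunday":    "ano bdf chk eil gjm",
}
girls = set("abcdefghijklmno")
days = {day: [frozenset(t) for t in row.split()] for day, row in schedule.items()}

# each day must be a parallel class: five triples that together cover all 15 girls
each_day_partitions = all(
    len(triples) == 5 and set().union(*triples) == girls
    for triples in days.values()
)

# 7 days * 5 triples * 3 pairs = 105 = C(15, 2), so "no pair twice" means
# every pair of girls walks together exactly once
pairs = [frozenset(p) for triples in days.values()
         for t in triples for p in combinations(sorted(t), 2)]
every_pair_once = len(pairs) == len(set(pairs)) == 105

print(each_day_partitions, every_pair_once)   # True True
```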

Let's look at this problem in general.

We see that a necessary condition for the existence of a resolvable (v,3,1) – BIBD is that v ≡ 3 (mod 6), since clearly 3 divides v and also v must be odd (remember λ (v – 1) ≡ 0 (mod k – 1)).

The existence of resolvable (v,3,1) – designs was a celebrated open problem throughout the period 1850–1970, until the first published solution was given by Ray-Chaudhuri and Wilson.

Theorem: A resolvable (v,3,1) – design exists if and only if v ≡ 3 (mod 6).

Techniques for making designs are generally either direct (from an algebraic construction) or recursive (building bigger designs from smaller ones).

We will look at this in more detail in the next lecture where we will concentrate on constructions for Steiner triple systems or (v,3,1) – designs.

Now we look at a connection between finite geometries and designs.

An affine plane consists of a set P of points and a set B of lines satisfying the following properties:

1. Any two points of P are contained on a unique line.
2. (Parallel postulate) Given a line l and a point p not on l, there is exactly one line of B containing p which does not intersect l (say that this line is parallel to l).
3. P contains at least one subset of 4 points no 3 of which are collinear.

Example (again)

Now assume that there exists a line that contains n points. Then from these axioms we can prove

1. One point belongs to exactly n + 1 lines.

2. Every line contains exactly n points. (We call the number n the order of the affine plane.)

3. Every point is on exactly n + 1 lines.

4. There are exactly n² points in P.

5. There are exactly n² + n lines in B. (homework)

We have shown just from these axioms that if there exists an affine plane of order n, then there exists an (n², n, 1) – design.
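A concrete instance: the sketch below builds an affine plane of prime order p from coordinates over the integers mod p (lines y = mx + c and x = c) and confirms that its lines form a (p², p, 1) design. The coordinate construction is a standard one that is not spelled out in these notes, and the names `affine_plane` and `design_params` are illustrative.

```python
from itertools import combinations

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}
    r = {sum(x in blk for blk in blocks) for x in points}
    lam = {sum(x in blk and y in blk for blk in blocks)
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

def affine_plane(p):
    """Points and lines of the affine plane over the integers mod p (p assumed prime)."""
    points = [(x, y) for x in range(p) for y in range(p)]
    # non-vertical lines y = m x + c, one parallel class per slope m
    lines = [frozenset((x, (m * x + c) % p) for x in range(p))
             for m in range(p) for c in range(p)]
    # vertical lines x = c form the remaining parallel class
    lines += [frozenset((c, y) for y in range(p)) for c in range(p)]
    return points, lines

for p in (3, 5):
    pts, lns = affine_plane(p)
    print(p, design_params(pts, lns))
# 3 (9, 12, 4, 3, 1)
# 5 (25, 30, 6, 5, 1)
```

The output matches the counts listed above: n² points, n² + n lines, n points per line and n + 1 lines per point.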

We now look at the converse: Assume that (V, B) is an (n², n, 1) – design. Axiom 1 and axiom 3 are satisfied (if n ≥ 3).

We need to prove the parallel postulate just from the parameters of this design.

Proof. In an (n², n, 1) – design, it is easy to compute that the number of blocks containing a point (remember this is called r) is n + 1.

So given a block b and a point p not on it, there are n + 1 blocks on p. Each of the n points of b lies on exactly one block with p, and these n blocks are distinct (if two coincided, that block would meet b in two points). So n of the blocks on p intersect b, and there is a unique block on p which does not intersect b. ∎

Theorem: There exists an affine plane of order n if and only if there exists an (n², n, 1) – design.

A projective plane consists of a set P of points and a set B of lines satisfying the following properties:

1. Any two points of P are contained on a unique line.
2. Every pair of lines intersect in exactly one point (so any two lines are on exactly one point).
3. P contains at least one subset of 4 points no 3 of which are collinear.

Example (again)

Again assuming the axioms above and that one line contains n + 1 points we can show the following properties of the projective plane:

1. One point is on exactly n + 1 lines.
2. Every line contains n + 1 points.
3. Every point is on exactly n + 1 lines.
4. There are exactly n² + n + 1 points in P.
5. There are exactly n² + n + 1 lines in B.

Again n will be called the order of the projective plane (with n + 1 points on a line). So if there exists a projective plane of order n, there exists an (n² + n + 1, n + 1, 1) – design.

Now consider the converse:

Assume that (V, B) is an (n² + n + 1, n + 1, 1) – design. Axiom 1 and axiom 3 are satisfied (if n ≥ 3).

We need to prove that any two lines intersect (i.e. there are no parallel lines).

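This is easy, since in this case each point is on n + 1 blocks. Suppose there were two blocks b1 and b2 that do not intersect, and let p be a point of b1. Each of the n + 1 points of b2 lies on a block with p; these n + 1 blocks are distinct and none of them is b1. Then p would be on at least n + 2 blocks, a contradiction.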

So we have the following theorem:

Theorem: There exists a projective plane of order n if and only if there exists an (n² + n + 1, n + 1, 1) – design.

Now we will discuss the connection between affine and projective planes.

Example: Affine plane of order 2 ↔ projective plane of order 2.

Example: Affine plane of order 3 ↔ projective plane of order 3.

Definition: In an affine plane, a collection of mutually parallel lines that partitions the points is called a parallel class of lines.

Theorem: There exists an affine plane of order n if and only if there exists a projective plane of order n.

Proof. (⇒) Assume there is an affine plane of order n.

Prove the following properties of this plane.

1. If any line intersects one of two parallel lines, it intersects the other.
2. There are exactly n + 1 parallel classes, each containing n lines.

Construct the projective plane by adding n + 1 new points {1, 2, …, n + 1} and adjoining the point i to each line in the i-th parallel class. Finally add the line at infinity, {1, 2, …, n + 1}.

This is a projective plane of order n.

(⇐) Now assume that there exists a projective plane of order n.

Pick any line and delete it together with all of its points. Note that any two lines that went through a deleted point are now parallel. The result is an affine plane of order n.
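Both directions of the proof can be traced on the affine plane of order 3 from Example 2. In this sketch the new points are labelled `inf1` to `inf4` (hypothetical labels, chosen so they cannot be confused with the affine points 1 to 9), and `design_params` is an illustrative checker.

```python
from itertools import combinations

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}
    r = {sum(x in blk for blk in blocks) for x in points}
    lam = {sum(x in blk and y in blk for blk in blocks)
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

# the four parallel classes of the (9,3,1) design of Example 2
classes = ["123 456 789", "147 258 369", "159 267 348", "168 249 357"]
affine_points = list(range(1, 10))
infinity = [f"inf{i}" for i in range(1, 5)]      # one new point per parallel class

# (=>) adjoin inf_i to every line of the i-th class, then add the line at infinity
lines = [frozenset(int(ch) for ch in word) | {infinity[i]}
         for i, cls in enumerate(classes) for word in cls.split()]
line_at_infinity = frozenset(infinity)
lines.append(line_at_infinity)
projective = design_params(affine_points + infinity, lines)

# (<=) delete one line together with its points to recover an affine plane
restored = [ln - line_at_infinity for ln in lines if ln != line_at_infinity]
affine = design_params(affine_points, restored)

print(projective)   # (13, 13, 4, 4, 1)
print(affine)       # (9, 12, 4, 3, 1)
```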

One final connection:

Theorem: An affine plane of order n is equivalent to a complete set of MOLS(n) (i.e. n – 1 MOLS of order n).

Proof: (homework) Basically, given the latin squares, the points of the affine plane correspond to the n² cells, and the cells containing the i-th symbol in the j-th latin square correspond to the i-th line in the j-th parallel class of the affine plane. This construction can be reversed also.

Example:

From 2 MOLS(3)

1 2 3      1 2 3
2 3 1      3 1 2
3 1 2      2 3 1

We obtain the affine plane of order 3 (the (9, 3, 1) – design).

So we have that the existence of the following are all equivalent:

1. An affine plane of order n.
2. An (n², n, 1) – design.
3. A projective plane of order n.
4. An (n² + n + 1, n + 1, 1) – design.
5. A set of n – 1 MOLS of side n.

We saw that when q is a prime power there exists a set of q – 1 MOLS of side q, hence a plane of order q.

But what about other orders??

Theorem (Bruck, Ryser, 1949): Let n ≡ 1 or 2 (mod 4) and let the square-free part of n contain at least one prime factor p ≡ 3 (mod 4). Then there does not exist an affine plane of order n.

This rules out orders 6, 14, 21, 22, 30, …
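The hypothesis of the theorem is easy to evaluate by machine. A minimal sketch, with hypothetical function names, that lists the orders up to 30 excluded by the stated condition:

```python
def squarefree_part_primes(n):
    """Primes dividing n to an odd power, i.e. the prime factors of its square-free part."""
    primes, p = [], 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            primes.append(p)
        p += 1
    if n > 1:                 # whatever is left is a prime to the first power
        primes.append(n)
    return primes

def excluded_by_bruck_ryser(n):
    """True if the condition of the theorem rules out a plane of order n."""
    return n % 4 in (1, 2) and any(p % 4 == 3 for p in squarefree_part_primes(n))

print([n for n in range(2, 31) if excluded_by_bruck_ryser(n)])
# [6, 14, 21, 22, 30]
```

Orders 10 and 12 are not excluded by this test, which is why they need separate treatment.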

A massive computer search by Lam et al., ending in 1989, proved that there is no affine plane of order 10.

So the first unsettled case is n = 12. (We need 11 MOLS of order 12; the most that have been constructed is 5 MOLS(12).)

The only orders for which affine planes are known to exist are prime powers!!

(This has been termed the next Fermat's problem by some.)

Hadamard designs and matrices

A square n × n matrix H all of whose entries are ±1 is called a Hadamard matrix of order n if HH^T = nI.

This says that the inner product of any two distinct rows is 0, i.e. any two distinct rows are orthogonal as n-dimensional vectors.

Examples:

n = 2
 1  1
 1 –1

n = 4
 1  1  1  1
 1 –1 –1  1
 1  1 –1 –1
 1 –1  1 –1

Some properties of Hadamard matrices:

• The columns are also pairwise orthogonal.

• Of all n × n matrices with entries |aij| ≤ 1, a Hadamard matrix of order n has the maximum determinant.

• If a Hadamard matrix of order n exists then n = 2, or n = 4m for some m. (homework)

The existence of a Hadamard matrix of order 4m is equivalent to the existence of a (symmetric) design with parameters (4m – 1, 2m – 1, m – 1). This design is called a Hadamard design .

Construction: Take a normalized Hadamard matrix of order 4m (first row and first column all +1), delete the first row and the first column, and change all –1's to 0's. The result is the incidence matrix of a (4m – 1, 2m – 1, m – 1) design. (This process can be reversed to get the matrix from the design.)
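A small sketch of this construction for m = 2. It starts from a normalized Hadamard matrix of order 8 built by repeated doubling (an assumption made only to have a concrete matrix to hand), deletes the first row and first column so that the result is square, replaces –1 by 0, and checks that a (7, 3, 1) design comes out. Function names are illustrative.

```python
from itertools import combinations

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}
    r = {sum(x in blk for blk in blocks) for x in points}
    lam = {sum(x in blk and y in blk for blk in blocks)
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

def doubled_hadamard(order):
    """Normalized Hadamard matrix of order 2^t via [[H, H], [H, -H]] (order a power of 2)."""
    H = [[1]]
    while len(H) < order:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

H = doubled_hadamard(8)
n = len(H)
is_hadamard = all(
    sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
    for i in range(n) for j in range(n)
)

# drop first row and first column, then map +1 -> 1 and -1 -> 0
incidence = [[(x + 1) // 2 for x in row[1:]] for row in H[1:]]
# rows of the incidence matrix are points, columns are blocks
blocks = [frozenset(i for i in range(n - 1) if incidence[i][j]) for j in range(n - 1)]

print(is_hadamard)                            # True
print(design_params(range(n - 1), blocks))    # (7, 7, 3, 3, 1)
```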

Product construction for Hadamard matrices:

If there exist Hadamard matrices of order m and n, then there exists a Hadamard matrix of order mn.

Proof: Use the Kronecker product of the two matrices.
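The product construction can be tried out on matrices of orders 2 and 4. In this sketch `kron(A, B)` is a hand-written helper (not a library call) that returns the block matrix whose (i, j) block is A[i][j] times B, and `is_hadamard` checks the defining identity HH^T = nI entry by entry.

```python
def kron(A, B):
    """Kronecker product: the block matrix whose (i, j) block is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def is_hadamard(H):
    """True if H is square with entries +1/-1 and H H^T = n I."""
    n = len(H)
    if any(len(row) != n or any(x not in (1, -1) for x in row) for row in H):
        return False
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

M = [[1, 1],
     [1, -1]]
N = [[1, 1, 1, 1],
     [1, -1, -1, 1],
     [1, 1, -1, -1],
     [1, -1, 1, -1]]

product_matrix = kron(N, M)   # blocks are +M or -M, arranged like the entries of N
print(is_hadamard(M), is_hadamard(N), is_hadamard(product_matrix))   # True True True
print(len(product_matrix))                                           # 8
```

With this helper's convention, `kron(N, M)` gives the layout in which copies of ±M are arranged according to the entries of N; `kron(M, N)` is a different matrix of order 8 that is also Hadamard.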

Example:

M =
 1  1
 1 –1

and N =
 1  1  1  1
 1 –1 –1  1
 1  1 –1 –1
 1 –1  1 –1

Then M ⊗ N =
 M  M  M  M
 M –M –M  M
 M  M –M –M
 M –M  M –M

is a Hadamard matrix of order 8.

A direct construction for Hadamard designs:

Let q = 4m – 1 be a prime power. Let b be the set of all non-zero squares in GF(q). A Hadamard design with parameters (4m – 1, 2m – 1, m – 1) can be constructed from this block plus all its translates in GF(q).

Example:

Let q = 11 (so m = 3). The non-zero squares mod 11 are b = {1, 4, 9, 5, 3}. Then b, b + 1 = {2, 5, 10, 6, 4}, b + 2, b + 3, …, b + 10 are the blocks of a Hadamard design on 11 points (an (11,5,2) – BIBD).

Why??
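One way to explore the question is to count differences. The sketch below (restricted to prime q, so that arithmetic mod q can stand in for GF(q)) builds the translates of the set of squares, checks the design parameters, and counts how often each non-zero residue occurs as a difference of two squares. If every non-zero d arises exactly λ times, then any two points x and y = x + d lie together in exactly λ translates. `design_params` is an illustrative checker.

```python
from collections import Counter
from itertools import combinations

def design_params(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    blocks = [frozenset(blk) for blk in blocks]
    k = {len(blk) for blk in blocks}
    r = {sum(x in blk for blk in blocks) for x in points}
    lam = {sum(x in blk and y in blk for blk in blocks)
           for x, y in combinations(points, 2)}
    if len(k) == len(r) == len(lam) == 1:
        return len(points), len(blocks), r.pop(), k.pop(), lam.pop()
    return None

q = 11
squares = {x * x % q for x in range(1, q)}          # {1, 3, 4, 5, 9}
blocks = [frozenset((s + t) % q for s in squares) for t in range(q)]

# how often does each non-zero d occur as s - s' with s, s' distinct squares?
differences = Counter((s - t) % q for s in squares for t in squares if s != t)

print(sorted(squares))                       # [1, 3, 4, 5, 9]
print(design_params(range(q), blocks))       # (11, 11, 5, 5, 2)
print(len(differences), set(differences.values()))   # 10 {2}
```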

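The Hadamard conjecture: for every n ≥ 1, there exists a Hadamard matrix of order 4n.

State of the art: For many years the smallest unknown case was order 4 × 107 = 428. A Hadamard matrix of order 428 was constructed by Kharaghani and Tayfeh-Rezaie in 2005, which leaves 668 as the smallest open order.

The Hadamard table: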

Some applications of BIBD's

1. Resolvable BIBD's with block size 2 are equivalent to round-robin tournaments. (Lecture 4)

2. BIBD's are intimately connected to error correcting codes, erasure codes and optical orthogonal codes used in the transmission of digital information.

3. BIBD's are used in the design of experiments.

4. BIBD's have applications to parts of cryptography such as visual cryptography and to the construction of threshold schemes.

5. BIBD's can be used in the construction of quorum systems, which are used for redundancy in distributed disk storage systems.

6. BIBD's are used in bioinformatics to construct so-called oligo arrays. These are micro-arrays of DNA strands used in gene sequencing.

7. BIBD's are used to test for defective items in a population by testing groups of items at a time. (Called group testing)

8. BIBD's and their generalizations are used in constructing lottery wheels (systems for buying multiple tickets for lotteries with the aim of increasing the expected yield).

Lotteries

We desire to buy the minimum number of lottery tickets necessary to ensure that, whatever numbers are chosen by the lottery, at least one of our tickets contains k of these numbers.

Example: A lottery wheel for Lotto 6/44 that guarantees at least a 3-match.

There are firms that sell solutions to this problem using 355 tickets; we give a solution using 154 tickets.

Partition the set of 44 numbers into two sets A and B, where A = {1, …, 22} and B = {23, …, 44}.

On each of A and B, purchase 77 tickets corresponding to the blocks of the block design with parameters 3 – (22,6,1).

Holding these 154 tickets, one must obtain a 3-match at least, unless the six winning numbers contain at most two from A, and at most two from B. Clearly this can't happen.

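So whichever six numbers are drawn, at least three of them lie in A or at least three lie in B, and every 3-subset of A (and of B) is contained in one of our 77 tickets on that set. Hence these 154 tickets guarantee a 3-match.

Error correcting codes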

The aim is to encode signals so that if they are sent through a noisy channel, the receiver is able to correct any errors that may be introduced in transmission.

Example: (used for photos from the Mariner and Voyager space probes that visited Mars and Venus in the 1970's).

Photos are made by using three black-and-white pictures taken, in turn, through red, green and blue filters. Each picture is then considered as a 1000 × 1000 matrix of black and white pixels. Each pixel is graded on a scale of 1 to 16, according to its greyness (so white is 1, black is 16). These grades are then used to choose a codeword in an eight error correcting code based on a Hadamard matrix of order 32. The codewords are transmitted to earth and error corrected, the three black-and-white pictures are reconstructed, and these are then combined to obtain the colored picture.

Example: The matrix

G =
 1 1 0 1 0 0 0
 0 1 1 0 1 0 0
 0 0 1 1 0 1 0
 0 0 0 1 1 0 1

has rank 4.

So there are 2^4 = 16 vectors that are linear combinations of the rows of G (all arithmetic done modulo 2).

It can be checked that any two of these 16 vectors must differ in at least three positions. Assume we have 16 messages written as 4-tuples of 0's and 1's. As the codeword corresponding to a message we send the 7-tuple (from the row span of G) that begins with our 4-tuple message.

So for example if we wish to send the message 0110, we send the codeword 0110100. To send 1001 we send 1001011.

If one error occurs in transmission, it can be corrected by decoding to the nearest codeword.

G is called the generator matrix of a binary [7,4,3] code.

(7 is the length of the codewords, 2^4 is the number of messages and 3 is the minimum distance between any two codewords.)

Codes and designs are closely related. In this case the codewords of weight 3 are the rows of an incidence matrix of the (7,3,1) – design (the Fano plane again).
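The claims about this code are small enough to verify exhaustively. The sketch below enumerates the row span of G over GF(2), computes the minimum distance, encodes by prefix as described in the text, decodes to the nearest codeword, and lists the supports of the weight-3 codewords. The helper names are illustrative.

```python
from itertools import product

G = [
    [1, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 0],
    [0, 0, 0, 1, 1, 0, 1],
]

def combine(coeffs):
    """Linear combination of the rows of G, arithmetic modulo 2."""
    return tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % 2 for j in range(7))

def distance(u, w):
    """Hamming distance: number of positions where u and w differ."""
    return sum(a != b for a, b in zip(u, w))

codewords = {combine(c) for c in product((0, 1), repeat=4)}
min_distance = min(distance(u, w) for u in codewords for w in codewords if u != w)

# encoding rule from the text: the codeword that begins with the 4-bit message
encode = {w[:4]: w for w in codewords}

def decode(received):
    """Nearest-codeword decoding; a single error is always corrected."""
    return min(codewords, key=lambda w: distance(w, received))

# supports (positions numbered 1..7) of the codewords of weight 3
weight3 = sorted(sorted(i + 1 for i, bit in enumerate(w) if bit)
                 for w in codewords if sum(w) == 3)

print(len(codewords), min_distance)          # 16 3
print(encode[(0, 1, 1, 0)])                  # (0, 1, 1, 0, 1, 0, 0)
print(encode[(1, 0, 0, 1)])                  # (1, 0, 0, 1, 0, 1, 1)
print(decode((1, 0, 1, 1, 0, 1, 1)))         # (1, 0, 0, 1, 0, 1, 1)
print(weight3)
# [[1, 2, 4], [1, 3, 7], [1, 5, 6], [2, 3, 5], [2, 6, 7], [3, 4, 6], [4, 5, 7]]
```

The seven supports are exactly the blocks 124, 137, 156, 235, 267, 346, 457 of Example 1.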

Lots more …..

Group testing

When testing a large number of samples for a rare attribute, it is efficient (faster and less expensive) to test large groups of these samples together for the attribute. It is desirable to then use the results of these pooled tests to deduce which of the samples had the attribute. Note that a negative result ensures that none of the samples were positive, while a positive result reveals only that at least one of the samples was positive.

The theory of group testing arose via testing millions of World War II military draftees for syphilis and it is very relevant to schemes for large-scale blood testing for viruses such as HIV.

Group testing also arises in connection with the mapping of genomes. Here, we have a long list of molecular sequences, form a library of subsequences (clones), and test whether or not a particular sequence (a probe) appears in the library by testing to see which clones it appears in. Because clone libraries can be huge, this is done by pooling the clones into groups.

Group testing is also relevant to the identification of defective products and has found application in satellite communications.

When the sequence of tests is predetermined this is called a nonadaptive group testing algorithm.

We can model a nonadaptive group testing algorithm as a design.

Let X be a set of m elements called samples. Let A = {b1, b2, …, bn} be a set of n subsets of X called tests. Let U ⊆ X be the set of positive samples. Now construct a binary vector of length n called the result vector of U, or R(U), as follows:

The i-th component of R(U) is 1 if U ∩ bi ≠ ∅, and 0 otherwise.

Then (X, A) is an (m,n)-NAGTA (nonadaptive group testing algorithm) with threshold s if R(U) ≠ R(V) whenever U, V ⊆ X, with |U| ≤ s, |V| ≤ s, and U ≠ V.

Example: Let X = {1,2,3,4,5,6} and A = {{1,2,3}, {1,4,5}, {2,4,6}, {3,5,6}}. The result vectors of the (6,4)-NAGTA (X, A) for all possible positive subsets U with |U| ≤ 2:

U       R(U)      U       R(U)
∅       0000      {1,6}   1111
{1}     1100      {2,3}   1011
{2}     1010      {2,4}   1110
{3}     1001      {2,5}   1111
{4}     0110      {2,6}   1011
{5}     0101      {3,4}   1111
{6}     0011      {3,5}   1101
{1,2}   1110      {3,6}   1011
{1,3}   1101      {4,5}   0111
{1,4}   1110      {4,6}   0111
{1,5}   1101      {5,6}   0111

We see that (X, A) has threshold s ≥ 1 since the seven vectors R(U), where |U| ≤ 1, are distinct. However, for sets of cardinality 2, the result vectors are not always different (for example, R({1,3}) = R({1,5}) = 1101). So the maximum threshold is s = 1.

So we can use this scheme to test 6 items using 4 tests, provided at most one item is positive.
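The result vectors and the threshold of this example can be recomputed with a few lines. This is a minimal sketch; `result_vector` and `has_threshold` are illustrative names that follow the definitions above.

```python
from itertools import combinations

def result_vector(tests, positives):
    """R(U) as a 0/1 string: component i is 1 when test i meets the positive set."""
    return "".join("1" if test & positives else "0" for test in tests)

def has_threshold(samples, tests, s):
    """True if all positive sets of size at most s have distinct result vectors."""
    subsets = [frozenset(c) for size in range(s + 1)
               for c in combinations(samples, size)]
    vectors = [result_vector(tests, u) for u in subsets]
    return len(set(vectors)) == len(vectors)

X = range(1, 7)
A = [frozenset(t) for t in ({1, 2, 3}, {1, 4, 5}, {2, 4, 6}, {3, 5, 6})]

print(result_vector(A, frozenset({1})))       # 1100
print(result_vector(A, frozenset({1, 3})))    # 1101
print(result_vector(A, frozenset({1, 5})))    # 1101
print(has_threshold(X, A, 1))                 # True
print(has_threshold(X, A, 2))                 # False
```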

The connection to designs:

Theorem: If there exists a (v,b,r,k,1) – BIBD, then there exists a (b, v)-NAGTA with threshold k – 1.
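A sketch of how such an algorithm can be read off from a design, under the assumption (consistent with the parameters b and v in the theorem) that the samples are the blocks and that there is one test per point, consisting of the blocks through that point. Applied to the (4,6,3,2,1) design of all pairs from {1,2,3,4} this reproduces the (6,4)-NAGTA of the example; applied to the (9,12,4,3,1) design of Example 2 it gives a (12,9)-NAGTA with threshold 2. The function names are illustrative.

```python
from itertools import combinations

def nagta_from_bibd(points, blocks):
    """Samples are block numbers 1..b; the test for point x holds the blocks through x."""
    return [frozenset(i for i, blk in enumerate(blocks, start=1) if x in blk)
            for x in points]

def has_threshold(samples, tests, s):
    """True if all positive sets of size at most s have distinct result vectors."""
    subsets = [frozenset(c) for size in range(s + 1)
               for c in combinations(samples, size)]
    vectors = [tuple(bool(test & u) for test in tests) for u in subsets]
    return len(set(vectors)) == len(vectors)

# (4,6,3,2,1) design: all 2-subsets of {1,2,3,4}; k - 1 = 1
pairs = list(combinations(range(1, 5), 2))
small_tests = nagta_from_bibd(range(1, 5), pairs)
print([sorted(t) for t in small_tests])
# [[1, 2, 3], [1, 4, 5], [2, 4, 6], [3, 5, 6]]
print(has_threshold(range(1, 7), small_tests, 1))    # True

# (9,12,4,3,1) design of Example 2; k - 1 = 2
example2 = [{int(ch) for ch in word} for word in
            "123 456 789 147 258 369 159 267 348 168 249 357".split()]
tests9 = nagta_from_bibd(range(1, 10), example2)
print(has_threshold(range(1, 13), tests9, 2))        # True
```

The reason the threshold is k – 1 when λ = 1: a block outside a set V of at most k – 1 blocks meets each of them in at most one point, so the blocks of V cannot cover all k of its points.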