Using Graphs to Find Optimal Block Designs
R. A. Bailey
University of St Andrews / QMUL (emerita)

Special session for Ching-Shui Cheng, 20 June 2013

Abstract

Ching-Shui Cheng was one of the pioneers of using graph theory to prove results about optimal incomplete-block designs. There are actually two graphs associated with an incomplete-block design, and either can be used.

A block design is D-optimal if it maximizes the number of spanning trees; it is A-optimal if it minimizes the total of the pairwise resistances when the graph is thought of as an electrical network.

I shall report on some surprising results about optimal designs when replication is very low.

Conference on Design of Experiments, Tianjin, June 2006

Problem 1: Factorial design

Problem
There are several treatment factors, with various numbers of levels. There may not be room to include all combinations of these levels.
There are several inherent factors (also called block factors) on the experimental units. The inherent factors may pose some constraints on how treatment factors can be applied.
So how do we set about designing the experiment?

Solution
Use Desmond Patterson's design key.

Problem 2: Optimal incomplete-block design

Problem
The design must be in blocks, but each block is too small to contain every treatment.
We want variance to be small, as measured by the D-criterion (minimizing the volume of the ellipsoid of confidence).
How do we set about finding a D-optimal block design?

Solution
Represent the incomplete-block design as a graph (with vertices and edges), and maximize the number of spanning trees in the graph.

Two papers about the second problem

- Ch'ing Shui Chêng: Maximizing the total number of spanning trees in a graph: two related problems in graph theory and optimum design theory. Journal of Combinatorial Theory, Series B 31 (1981), 240–248.
- Ching-Shui Cheng: Graph and optimum design theories - some connections and examples. Bulletin of the International Statistical Institute 49 (1981), part 1, 580–590.

Some early papers

Year: RAB | CSC
1977: factorial (design key)
1978: factorial (design key) | optimal IBDs
1979: optimal IBDs
1980: optimal IBDs; factorial
1981: optimal IBDs and graphs
1982: factorial
1984: IBDs
1985: factorial; IBDs | IBDs and graphs; factorial with trend
1986: IBDs | IBDs
1988: factorial with trend
1989: factorial

Two memorial conferences for R. C. Bose, India, 1988

Calcutta meeting on Combinatorial Mathematics and Applications:
CSC spoke on "On the optimality of (M.S)-optimal designs in large systems".

Delhi meeting on Probability, Statistics and Design of Experiments:
RAB spoke on "Cyclic designs and factorial designs".

Two joint papers

- C.-S. Cheng and R. A. Bailey: Optimality of some two-associate-class partially balanced incomplete-block designs. Annals of Statistics 19 (1991), 1667–1671.
- R. A. Bailey, Ching-Shui Cheng and Patricia Kipnis: Construction of trend-resistant factorial designs. Statistica Sinica 2 (1992), 393–411.

Some later papers and books

Year: RAB | CSC
1993: factorial; optimal IBDs
1995: optimal IBDs
1998: factorial; BIBDs
1999: factorial
2000: factorial
2001: factorial
2002: factorial
2003: factorial
2004: IBDs | factorial
2005: IBDs | factorial
2006: factorial
2007: optimal IBDs and graphs
2009: optimal IBDs and graphs | factorial
2011: factorial | factorial
2013: optimal IBDs and graphs | factorial (design key)

With Jo Kunert, Dortmund, April 2009

What makes a block design good for experiments?

I have v treatments that I want to compare.
I have b blocks.
Each block has space for k treatments (not necessarily distinct).
How should I choose a block design?

Two designs with v = 5, b = 7, k = 3: which is better? Two designs with v = 15, b = 7, k = 3: which is better?
Conventions: columns are blocks; order of treatments within each block is irrelevant; order of blocks is irrelevant.

For v = 5, b = 7, k = 3:

Binary design:
1 1 1 1 2 2 2
2 3 3 4 3 3 4
3 4 5 5 4 5 5

Non-binary design:
1 1 1 1 2 2 2
1 3 3 4 3 3 4
2 4 5 5 4 5 5

A design is binary if no treatment occurs more than once in any block.

For v = 15, b = 7, k = 3:

Design in which replications differ by ≤ 1:
1 1 2 3  4  5  6
2 4 5 6 10 11 12
3 7 8 9 13 14 15

Queen-bee design:
1 1 1 1  1  1  1
2 4 6 8 10 12 14
3 5 7 9 11 13 15

The replication of a treatment is its number of occurrences.
A design is a queen-bee design if there is a treatment that occurs in every block.

Experimental units and incidence matrix

There are bk experimental units.
If $\omega$ is an experimental unit, put
$f(\omega)$ = treatment on $\omega$,
$g(\omega)$ = block containing $\omega$.

For $i = 1, \ldots, v$ and $j = 1, \ldots, b$, let

$$n_{ij} = |\{\omega : f(\omega) = i \text{ and } g(\omega) = j\}|$$

= number of experimental units in block $j$ which have treatment $i$.

The $v \times b$ incidence matrix $N$ has entries $n_{ij}$.

Levi graph

The Levi graph $\tilde{G}$ of a block design $\Delta$ has
- one vertex for each treatment,
- one vertex for each block,
- one edge for each experimental unit, with edge $\omega$ joining vertex $f(\omega)$ to vertex $g(\omega)$.

It is a bipartite graph, with $n_{ij}$ edges between treatment-vertex $i$ and block-vertex $j$.

Example 1: v = 4, b = k = 3

1 2 1
3 3 2
4 4 2

[Figure: Levi graph of Example 1]

Example 2: v = 8, b = 4, k = 3

1 2 3 4
2 3 4 1
5 6 7 8

[Figure: Levi graph of Example 2]

Concurrence graph

The concurrence graph $G$ of a block design $\Delta$ has
- one vertex for each treatment,
- one edge for each unordered pair $\alpha$, $\omega$, with $\alpha \neq \omega$, $g(\alpha) = g(\omega)$ and $f(\alpha) \neq f(\omega)$: this edge joins vertices $f(\alpha)$ and $f(\omega)$.

There are no loops.
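The definitions of the incidence matrix, the Levi graph and the concurrence graph can be sketched in a few lines for Example 1. This is only an illustrative sketch: the function names `incidence_matrix`, `levi_edges` and `concurrence_edges` are hypothetical, and the design is assumed to be stored as a list of blocks, each a list of treatment labels 1, ..., v.

```python
from itertools import combinations

import numpy as np

# Example 1 from the slides (v = 4, b = k = 3); each inner list is one block (one column).
blocks = [[1, 3, 4], [2, 3, 4], [1, 2, 2]]
v = 4


def incidence_matrix(blocks, v):
    """Return the v x b matrix N; N[i-1, j] counts units in block j+1 with treatment i."""
    N = np.zeros((v, len(blocks)), dtype=int)
    for j, block in enumerate(blocks):
        for treatment in block:
            N[treatment - 1, j] += 1
    return N


def levi_edges(blocks):
    """One edge per experimental unit, joining its treatment vertex to its block vertex."""
    return [(("treatment", t), ("block", j + 1))
            for j, block in enumerate(blocks) for t in block]


def concurrence_edges(blocks):
    """One edge per unordered pair of units in the same block with different treatments."""
    return [tuple(sorted((s, t)))
            for block in blocks
            for s, t in combinations(block, 2) if s != t]


N = incidence_matrix(blocks, v)
print(N)                               # rows are treatments, columns are blocks
print(len(levi_edges(blocks)))         # equals bk, the number of experimental units
print(len(concurrence_edges(blocks)))  # pairs with equal treatments give no loop
```

Because the third block contains treatment 2 twice, the pair of units carrying treatment 2 contributes no edge, while treatment 1 is joined to treatment 2 by two edges.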
In the concurrence graph, if $i \neq j$ then the number of edges between vertices $i$ and $j$ is

$$\lambda_{ij} = \sum_{s=1}^{b} n_{is} n_{js};$$

this is called the concurrence of $i$ and $j$, and is the $(i, j)$-entry of $\Lambda = NN^\top$.

Example 1: v = 4, b = k = 3

[Figure: Levi graph and concurrence graph of Example 1]

Levi graph: can recover design; more vertices; more edges if k = 2.
Concurrence graph: may have more symmetry; more edges if k ≥ 4.

Example 2: v = 8, b = 4, k = 3

1 2 3 4
2 3 4 1
5 6 7 8

[Figure: Levi graph and concurrence graph of Example 2]

Laplacian matrices

The Laplacian matrix $L$ of the concurrence graph $G$ is a $v \times v$ matrix with $(i, j)$-entry as follows:
- if $i \neq j$ then $L_{ij} = -(\text{number of edges between } i \text{ and } j) = -\lambda_{ij}$;
- $L_{ii} = \text{valency of } i = \sum_{j \neq i} \lambda_{ij}$.

The Laplacian matrix $\tilde{L}$ of the Levi graph $\tilde{G}$ is a $(v + b) \times (v + b)$ matrix with $(i, j)$-entry as follows:
- $\tilde{L}_{ii} = \text{valency of } i = \begin{cases} k & \text{if } i \text{ is a block} \\ \text{replication } r_i \text{ of } i & \text{if } i \text{ is a treatment;} \end{cases}$
- if $i \neq j$ then $\tilde{L}_{ij} = -(\text{number of edges between } i \text{ and } j) = \begin{cases} 0 & \text{if } i \text{ and } j \text{ are both treatments} \\ 0 & \text{if } i \text{ and } j \text{ are both blocks} \\ -n_{ij} & \text{if } i \text{ is a treatment and } j \text{ is a block, or vice versa.} \end{cases}$

Connectivity

All row-sums of $L$ and of $\tilde{L}$ are zero, so both matrices have 0 as eigenvalue on the appropriate all-1 vector.

Theorem
The following are equivalent.
1. 0 is a simple eigenvalue of $L$;
2. $G$ is a connected graph;
3. $\tilde{G}$ is a connected graph;
4. 0 is a simple eigenvalue of $\tilde{L}$;
5. the design $\Delta$ is connected in the sense that all differences between treatments can be estimated.

From now on, assume connectivity.
Call the remaining eigenvalues non-trivial. They are all non-negative.

Generalized inverse

Under the assumption of connectivity, the Moore–Penrose generalized inverse $L^-$ of $L$ is defined by

$$L^- = \left(L + \frac{1}{v} J_v\right)^{-1} - \frac{1}{v} J_v,$$

where $J_v$ is the $v \times v$ all-1 matrix.
(The matrix $\frac{1}{v} J_v$ is the orthogonal projector onto the null space of $L$.)

The Moore–Penrose generalized inverse $\tilde{L}^-$ of $\tilde{L}$ is defined similarly.
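As a rough numerical check of the definitions above, the following sketch builds both Laplacian matrices for Example 1, counts zero eigenvalues to test connectivity, and compares the stated formula for $L^-$ with NumPy's general-purpose pseudoinverse. The variable names are hypothetical, and ordering the Levi-graph vertices as treatments first, then blocks, is an assumption made only for this illustration.

```python
import numpy as np

# Example 1 (v = 4, b = k = 3); each inner list is one block.
blocks = [[1, 3, 4], [2, 3, 4], [1, 2, 2]]
v, b, k = 4, 3, 3

# Incidence matrix N: N[i-1, j] = number of units in block j+1 with treatment i.
N = np.zeros((v, b))
for j, block in enumerate(blocks):
    for t in block:
        N[t - 1, j] += 1

# Concurrences lambda_ij are the off-diagonal entries of Lambda = N N^T.
Lam = N @ N.T
off_diag = Lam - np.diag(np.diag(Lam))

# Laplacian of the concurrence graph: L_ij = -lambda_ij, L_ii = sum over j != i of lambda_ij.
L = np.diag(off_diag.sum(axis=1)) - off_diag

# Laplacian of the Levi graph, vertices ordered as treatments then blocks (an assumption).
r = N.sum(axis=1)  # replications r_i
L_levi = np.block([[np.diag(r), -N], [-N.T, k * np.eye(b)]])

# Connectivity: 0 should be a simple eigenvalue of both Laplacians.
zero_count = int(np.sum(np.isclose(np.linalg.eigvalsh(L), 0.0, atol=1e-9)))
zero_count_levi = int(np.sum(np.isclose(np.linalg.eigvalsh(L_levi), 0.0, atol=1e-9)))

# Generalized inverse from the formula L^- = (L + J/v)^(-1) - J/v.
J = np.ones((v, v))
L_minus = np.linalg.inv(L + J / v) - J / v

print(L)
print(zero_count, zero_count_levi)
print(np.allclose(L_minus, np.linalg.pinv(L)))
```

The formula is only valid for a connected design; for a disconnected one, $L + \frac{1}{v} J_v$ is singular and the call to `np.linalg.inv` would fail or return meaningless values.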
Estimation and variance

We measure the response $Y_\omega$ on each experimental unit $\omega$.

If experimental unit $\omega$ has treatment $i$ and is in block $m$ ($f(\omega) = i$ and $g(\omega) = m$), then we assume that

$$Y_\omega = \tau_i + \beta_m + \text{random noise}.$$

We want to estimate contrasts $\sum_i x_i \tau_i$ with $\sum_i x_i = 0$.

In particular, we want to estimate all the simple differences $\tau_i - \tau_j$.

How do we calculate variance?

Theorem (Standard linear model theory)
Assume that all the noise is independent, with variance $\sigma^2$. If $\sum_i x_i = 0$, then the variance of the best linear unbiased estimator of $\sum_i x_i \tau_i$ is equal to

$$(x^\top L^- x) k \sigma^2.$$

In particular, the variance of the best linear unbiased estimator of the simple difference $\tau_i - \tau_j$ is