Appendix A Auxiliary Results

A.1 Equivalence Relations; Groups

A relation x ∼ y among the points of a space X is an equivalence relation if it is reflexive, symmetric, and transitive, that is, if (i) x ∼ x for all x ∈ X; (ii) x ∼ y implies y ∼ x; (iii) x ∼ y and y ∼ z imply x ∼ z.

Example A.1.1 Consider a class of statistical decision procedures as a space, of which the individual procedures are the points. Then the relation defined by δ ∼ δ′ if the procedures δ and δ′ have the same risk function is an equivalence relation. As another example, consider all real-valued functions defined over the real line as points of a space. Then f ∼ g if f(x) = g(x) a.e. is an equivalence relation.

Given an equivalence relation, let $D_x$ denote the set of points of the space that are equivalent to $x$. Then $D_x = D_y$ if $x \sim y$, and $D_x \cap D_y = \emptyset$ otherwise. Since by (i) each point of the space lies in at least one of the sets $D_x$, it follows that these sets, the equivalence classes defined by the relation $\sim$, constitute a partition of the space.

A set $G$ of elements is called a group if it satisfies the following conditions.

(i) There is defined an operation, group multiplication, which with any two elements $a, b \in G$ associates an element $c$ of $G$. The element $c$ is called the product of $a$ and $b$ and is denoted by $ab$.

(ii) Group multiplication obeys the associative law

(ab)c = a(bc).

(iii) There exists an element e ∈ G, called the identity, such that

ae = ea = a for all a ∈ G.

(iv) For each element $a \in G$, there exists an element $a^{-1} \in G$, its inverse, such that

$aa^{-1} = a^{-1}a = e$.

Both the identity element and the inverse $a^{-1}$ of any element $a$ can be shown to be unique.

Example A.1.2 The set of all n × n orthogonal matrices constitutes a group if matrix multiplication and inverse are taken as group multiplication and inverse respectively, and if the identity matrix is taken as the identity element of the group. With the same specification of the group operations, the class of all nonsingular n × n matrices also forms a group. On the other hand, the class of all n × n matrices fails to satisfy condition (iv).
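The claims of Example A.1.2 can be checked on small cases. The following is only an illustrative sketch for 2 × 2 matrices with exact rational entries; the helper names (`matmul`, `is_orthogonal`, `det2`) and the three sample matrices are assumptions made for this illustration and do not come from the text.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def is_orthogonal(a):
    """A is orthogonal when A A^T equals the identity matrix."""
    return matmul(a, transpose(a)) == identity(len(a))

def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical sample matrices chosen for the illustration.
rotation = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]   # exact rational rotation
reflection = [[F(0), F(1)], [F(1), F(0)]]               # swaps the two coordinates
singular = [[F(1), F(2)], [F(2), F(4)]]                 # second row = 2 * first row

# Closure: the product of two orthogonal matrices is again orthogonal.
closed = is_orthogonal(matmul(rotation, reflection))
# Condition (iv): the transpose acts as the inverse of an orthogonal matrix.
inverse_is_transpose = matmul(rotation, transpose(rotation)) == identity(2)
# A matrix with zero determinant has no inverse, so condition (iv) fails for it.
singular_det = det2(singular)
print(closed, inverse_is_transpose, singular_det)
```

A finite check of this kind does not prove the group property; it only illustrates closure, the existence of inverses, and the reason a singular matrix violates condition (iv).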

If the elements of G are transformations of some space onto itself, with the group product ba defined as the result of applying first transformation a and following it by b, then G is called a transformation group. Assumption (ii) is then satisfied automatically. For any transformation group defined over a space X the relation between points of X given by

\[ x \sim y \quad \text{if there exists } a \in G \text{ such that } y = ax \]

is an equivalence relation. That it satisfies conditions (i), (ii), and (iii) required of an equivalence follows respectively from the defining properties (iii), (iv), and (i) of a group.

Let $C$ be any class of 1 : 1 transformations of a space, and let $G$ be the class of all finite products $a_1^{\pm 1} a_2^{\pm 1} \cdots a_m^{\pm 1}$, with $a_1, \ldots, a_m \in C$, $m = 1, 2, \ldots$, where each of the exponents can be $+1$ or $-1$ and where the elements $a_1, a_2, \ldots$ need not be distinct. Then it is easily checked that $G$ is a group, and is in fact the smallest group containing $C$.
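For a finite space, both constructions can be carried out explicitly. The sketch below assumes the transformations are permutations of {0, ..., n − 1} stored as tuples; the function names and the generating class `C` are hypothetical choices for illustration. It builds the group generated by `C` from products of the generators and their inverses, then lists the equivalence classes (orbits) of the relation y = ax.

```python
def compose(b, a):
    """Group product ba: apply a first, then b. p[i] is the image of point i."""
    return tuple(b[a[i]] for i in range(len(a)))

def inverse(a):
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def generated_group(generators):
    """All finite products of the generators and their inverses."""
    n = len(generators[0])
    identity = tuple(range(n))
    letters = set(generators) | {inverse(g) for g in generators}
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for a in letters:
            h = compose(a, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def orbits(group, n):
    """Equivalence classes of x ~ y iff y = ax for some a in the group."""
    classes = {frozenset(a[x] for a in group) for x in range(n)}
    return sorted(sorted(c) for c in classes)

# Hypothetical class C: a 3-cycle on {0, 1, 2} and a swap of 3 and 4; point 5 is fixed.
C = [(1, 2, 0, 3, 4, 5), (0, 1, 2, 4, 3, 5)]
G = generated_group(C)
print(len(G))         # 6
print(orbits(G, 6))   # [[0, 1, 2], [3, 4], [5]]
```

The orbits partition the six points, as the general statement about equivalence classes requires.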

A.2 Convergence of Functions; Metric Spaces

When studying convergence properties of functions it is frequently convenient to consider a class of functions as a realization of an abstract space $\mathcal{F}$ of points $f$ in which convergence of a sequence $f_n$ to a limit $f$, denoted by $f_n \to f$, has been defined.

Example A.2.1 Let $\mu$ be a measure over a measurable space $(\mathcal{X}, \mathcal{A})$.

(i) Let $\mathcal{F}$ be the class of integrable functions. Then $f_n$ converges to $f$ in the mean¹ if

\[ \int |f_n - f| \, d\mu \to 0. \tag{A.1} \]

(ii) Let $\mathcal{F}$ be a uniformly bounded class of measurable functions. The sequence $f_n$ is said to converge to $f$ weakly if

\[ \int f_n p \, d\mu \to \int f p \, d\mu \tag{A.2} \]

for all functions $p$ that are integrable $\mu$.

(iii) Let $\mathcal{F}$ be the class of measurable functions. Then $f_n$ converges to $f$ pointwise if

\[ f_n(x) \to f(x) \quad \text{a.e. } \mu. \tag{A.3} \]

A subset $\mathcal{F}_0$ of $\mathcal{F}$ is dense in $\mathcal{F}$ if, given any $f \in \mathcal{F}$, there exists a sequence in $\mathcal{F}_0$ having $f$ as its limit point. A space $\mathcal{F}$ is separable if there exists a countable dense subset of $\mathcal{F}$. A space $\mathcal{F}$ such that every sequence has a convergent subsequence whose limit point is in $\mathcal{F}$ is compact.²

A space $\mathcal{F}$ is a metric space if for every pair of points $f$, $g$ in $\mathcal{F}$ there is defined a metric (or distance) $d(f, g) \ge 0$ such that (i) $d(f, g) = 0$ if and only if $f = g$; (ii) $d(f, g) = d(g, f)$; (iii) $d(f, g) + d(g, h) \ge d(f, h)$ for all $f$, $g$, $h$. The space is a pseudometric space if (i) is replaced by (i′) $d(f, f) = 0$ for all $f \in \mathcal{F}$. A pseudometric space can be converted into a metric space by introducing the equivalence relation $f \sim g$ if $d(f, g) = 0$. The equivalence classes $F$, $G$, ... then constitute a metric space with respect to the metric $D(F, G) = d(f, g)$ where $f \in F$, $g \in G$; by the triangle inequality, this value does not depend on the choice of the representatives $f$ and $g$. In any pseudometric space a natural convergence definition is obtained by putting $f_n \to f$ if $d(f_n, f) \to 0$.

Example A.2.2 The space of integrable functions of Example A.2.1(i) becomes a pseudometric space if we put

\[ d(f, g) = \int |f - g| \, d\mu, \]

and the induced convergence definition is that given by (A.1).
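To see why $d$ is only a pseudometric, it helps to look at a finite space in which one point has measure zero. The following sketch assumes a three-point space with hypothetical point masses `mu`; the integral then reduces to a weighted sum.

```python
def d(f, g, mu):
    """d(f, g) = integral of |f - g| d(mu) on a finite space; mu[x] is the mass of point x."""
    return sum(abs(f[x] - g[x]) * mu[x] for x in range(len(mu)))

# Hypothetical measure: the third point has mu-measure zero.
mu = [0.5, 0.5, 0.0]
f = [1.0, 2.0, 3.0]
g = [1.0, 2.0, 7.0]   # differs from f only on the null point
h = [0.0, 2.0, 3.0]

print(d(f, g, mu))                                # 0.0, although f != g
print(d(f, h, mu))                                # 0.5
print(d(f, h, mu) <= d(f, g, mu) + d(g, h, mu))   # True (triangle inequality)
```

Here `f` and `g` are distinct functions at distance zero, so they fall into the same equivalence class once functions equal a.e. are identified.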

¹ Here and in the examples that follow, the limit $f$ is not unique. More specifically, if $f_n \to f$, then $f_n \to g$ if and only if $f = g$ (a.e. $\mu$). Putting $f \sim g$ when $f = g$ (a.e. $\mu$), uniqueness can be obtained by working with the resulting equivalence classes of functions rather than with the functions themselves.

² The term compactness is more commonly used for an alternative concept, which coincides with the one given here in metric spaces. The distinguishing term sequential compactness is then sometimes given to the notion defined here.

Example A.2.3 Let $\mathcal{P}$ be a family of probability distributions over $(\mathcal{X}, \mathcal{A})$. Then $\mathcal{P}$ is a metric space with respect to the metric

\[ d(P, Q) = \sup_{A \in \mathcal{A}} |P(A) - Q(A)|. \tag{A.4} \]
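On a finite sample space the supremum in (A.4) can be found by brute force over all subsets. The sketch below does this for three hypothetical distributions and compares the result with half the sum of absolute differences of the point masses, a standard identity for probability distributions that is not stated in the text itself.

```python
from fractions import Fraction
from itertools import combinations

def sup_distance(P, Q):
    """sup over all subsets A of |P(A) - Q(A)|, by exhaustive enumeration."""
    points = range(len(P))
    best = Fraction(0)
    for r in range(len(P) + 1):
        for A in combinations(points, r):
            gap = abs(sum(P[x] for x in A) - sum(Q[x] for x in A))
            best = max(best, gap)
    return best

def half_l1(P, Q):
    """Half the sum of |P({x}) - Q({x})| over the points x."""
    return sum(abs(p - q) for p, q in zip(P, Q)) / 2

# Hypothetical distributions on a three-point space.
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
Q = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
R = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]

print(sup_distance(P, Q), half_l1(P, Q))   # 1/4 1/4
print(sup_distance(P, Q) <= sup_distance(P, R) + sup_distance(R, Q))   # True
```

The enumeration is exponential in the number of points, so it is meant only for checking the definition on very small spaces.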

Lemma A.2.1 If F is a separable pseudometric space, then every subset of F is also separable.

Proof. By assumption there exists a dense countable subset $\{f_n\}$ of $\mathcal{F}$. Let

\[ S_{m,n} = \left\{ f : d(f, f_n) < \frac{1}{m} \right\}, \]

and let $A$ be any subset of $\mathcal{F}$. Select one element from each of the intersections $A \cap S_{m,n}$ that is nonempty, and denote this countable collection of elements by $A_0$. If $a$ is any element of $A$ and $m$ any positive integer, there exists an element $f_{n_m}$ such that $d(a, f_{n_m}) < 1/m$. Therefore $a$ belongs to $S_{m,n_m}$, the intersection $A \cap S_{m,n_m}$ is nonempty, and there exists therefore an element of $A_0$ whose distance to $a$ is $< 2/m$. This shows that $A_0$ is dense in $A$, and hence that $A$ is separable.

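Lemma A.2.2 A sequence $f_n$ of integrable functions converges to $f$ in the mean if and only if

\[ \int_A f_n \, d\mu \to \int_A f \, d\mu \quad \text{uniformly for } A \in \mathcal{A}. \tag{A.5} \]

Proof. That (A.1) implies (A.5) is obvious, since for all $A \in \mathcal{A}$

\[ \left| \int_A f_n \, d\mu - \int_A f \, d\mu \right| \le \int |f_n - f| \, d\mu. \]

Conversely, suppose that (A.5) holds, and denote by $A_n$ and $A_n'$ the sets of points $x$ for which $f_n(x) > f(x)$ and $f_n(x) < f(x)$, respectively. Because the convergence in (A.5) is uniform in $A$, it applies along the varying sets $A_n$ and $A_n'$, so that

\[ \int |f_n - f| \, d\mu = \int_{A_n} (f_n - f) \, d\mu - \int_{A_n'} (f_n - f) \, d\mu \to 0. \]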

Lemma A.2.3 A sequence $f_n$ of uniformly bounded functions converges to a bounded function $f$ weakly if and only if

\[ \int_A f_n \, d\mu \to \int_A f \, d\mu \quad \text{for all } A \text{ with } \mu(A) < \infty. \tag{A.6} \]

Proof. That weak convergence implies (A.6) is seen by taking for $p$ in (A.2) the indicator function of a set $A$, which is integrable if $\mu(A) < \infty$. Conversely, (A.6) implies that (A.2) holds if $p$ is any simple function $s = \sum a_i I_{A_i}$ with all the $\mu(A_i) < \infty$. Given any integrable function $p$ and any $\epsilon > 0$, there exists, by the definition of the integral, such a simple function $s$ for which $\int |p - s| \, d\mu < \epsilon/(3M)$, where $M$ is a bound on the $|f_n|$ and on $|f|$. We then have

\[ \left| \int (f_n - f) p \, d\mu \right| \le \left| \int f_n (p - s) \, d\mu \right| + \left| \int f (s - p) \, d\mu \right| + \left| \int (f_n - f) s \, d\mu \right|. \]

The first two terms on the right-hand side are $< \epsilon/3$, and the third term tends to zero as $n$ tends to infinity. Thus the left-hand side is $< \epsilon$ for $n$ sufficiently large, as was to be proved.

Lemma A.2.4³ Let $f$ and $f_n$, $n = 1, 2, \ldots$, be nonnegative integrable functions with

\[ \int f \, d\mu = \int f_n \, d\mu = 1. \]

Then pointwise convergence of $f_n$ to $f$ implies that $f_n \to f$ in the mean.

Proof. If $g_n = f_n - f$, then $g_n \ge -f$, and the negative part $g_n^- = \max(-g_n, 0)$ satisfies $|g_n^-| \le f$. Since $g_n(x) \to 0$ (a.e. $\mu$), it follows from Theorem 2.2.2(ii) of Chapter 2 that $\int g_n^- \, d\mu \to 0$, and $\int g_n^+ \, d\mu$ then also tends to zero, since $\int g_n \, d\mu = 0$. Therefore $\int |g_n| \, d\mu = \int (g_n^+ + g_n^-) \, d\mu \to 0$, as was to be proved.

Let $P$ and $P_n$, $n = 1, 2, \ldots$, be probability distributions over $(\mathcal{X}, \mathcal{A})$ with densities $p$ and $p_n$ with respect to $\mu$. Consider the convergence definitions

(a) $p_n \to p$ (a.e. $\mu$);

(b) $\int |p_n - p| \, d\mu \to 0$;

(c) $\int g p_n \, d\mu \to \int g p \, d\mu$ for all bounded measurable $g$;

and

(b′) $P_n(A) \to P(A)$ uniformly for all $A \in \mathcal{A}$;

(c′) $P_n(A) \to P(A)$ for all $A \in \mathcal{A}$.

Then Lemmas A.2.2 and A.2.4 together with a slight modification of Lemma A.2.3 show that (a) implies (b) and (b) implies (c), and that (b) is equivalent to (b′) and (c) to (c′). It can further be shown that neither (a) and (b) nor (b) and (c) are equivalent.⁴

A.3 Banach and Hilbert Spaces

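A set $V$ is called a vector space (or linear space) over the reals if there exists a function $+$ on $V \times V$ to $V$ and a function $\cdot$ on $\mathbb{R} \times V$ to $V$ which satisfy, for $x, y, z \in V$,

(i) $x + y = y + x$.

(ii) $(x + y) + z = x + (y + z)$.

(iii) There is a vector $0 \in V$: $x + 0 = x$ for all $x \in V$.

(iv) $\lambda(x + y) = \lambda x + \lambda y$ for any $\lambda \in \mathbb{R}$.

(v) $(\lambda_1 + \lambda_2) x = \lambda_1 x + \lambda_2 x$ for $\lambda_i \in \mathbb{R}$.

(vi) $\lambda_1(\lambda_2 x) = (\lambda_1 \lambda_2) x$ for $\lambda_i \in \mathbb{R}$.

(vii) $0 \cdot x = 0$, $1 \cdot x = x$.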

³ Scheffé (1947).

⁴ Robbins (1948).

The operation $+$ is called addition and $\cdot$ is multiplication by scalars. A nonnegative real-valued function $\| \cdot \|$ defined on a vector space is called a norm if

(i) $\|x\| = 0$ if and only if $x = 0$.

(ii) $\|x + y\| \le \|x\| + \|y\|$.

(iii) $\|\lambda x\| = |\lambda| \, \|x\|$.

A vector space with norm is then a metric space if we define the metric $d$ to be $d(x, y) = \|x - y\|$. A sequence $\{x_n\}$ of elements in a normed vector space $V$ is called a Cauchy sequence if, given $\epsilon > 0$, there is an $N$ such that for all $m, n \ge N$, we have $\|x_n - x_m\| < \epsilon$. A Banach space is a normed vector space that is complete in the sense that every Cauchy sequence $\{x_n\}$ satisfies $\|x_n - x\| \to 0$ for some $x \in V$.

Example A.3.1 ($L^p$ spaces.) Let $\mu$ be a measure over a measurable space $(\mathcal{X}, \mathcal{A})$. Fix $p > 0$ and let $L^p[\mathcal{X}, \mu]$ denote the measurable functions $f$ such that $\int |f|^p \, d\mu < \infty$. If we identify functions that are equal almost everywhere $\mu$ and work with the resulting equivalence classes, then, for $p \ge 1$, this vector space becomes a normed vector space by defining

\[ \|f\| = \|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p}. \]

In this case, the triangle inequality

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p \]

is known as Minkowski’s inequality. Moreover, this space is a Banach space.⁵
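The role of the restriction $p \ge 1$ can be seen numerically on a finite space, where the integral is a weighted sum. The sketch below uses hypothetical point masses and functions; it checks Minkowski’s inequality for a few values of $p \ge 1$ and shows that the same formula violates the triangle inequality for $p = 1/2$.

```python
def lp_norm(f, p, mu):
    """(sum_x |f(x)|^p mu(x))^(1/p): the L^p norm when mu puts mass mu[x] on point x."""
    return sum(abs(v) ** p * m for v, m in zip(f, mu)) ** (1.0 / p)

def triangle_holds(f, g, p, mu, tol=1e-12):
    """Check ||f + g||_p <= ||f||_p + ||g||_p up to a small rounding tolerance."""
    s = [a + b for a, b in zip(f, g)]
    return lp_norm(s, p, mu) <= lp_norm(f, p, mu) + lp_norm(g, p, mu) + tol

# Hypothetical measure and functions on a three-point space.
mu = [0.2, 0.3, 0.5]
f = [1.0, -2.0, 3.0]
g = [4.0, 0.5, -1.0]
checks = {p: triangle_holds(f, g, p, mu) for p in (1, 2, 3.5)}
print(checks)   # every value is True

# For p = 1/2 the same formula is not a norm: the triangle inequality can fail.
counting = [1.0, 1.0]
fails_below_one = not triangle_holds([1.0, 0.0], [0.0, 1.0], 0.5, counting)
print(fails_below_one)   # True: the left side is 4.0, the right side is 2.0
```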

A Hilbert space $H$ is a Banach space for which there is defined a function $\langle x, y \rangle$ on $H \times H$ to $\mathbb{R}$, called the inner product of $x$ and $y$, satisfying, for $x_i, y \in H$, $\lambda_i \in \mathbb{R}$,

(i) $\langle \lambda_1 x_1 + \lambda_2 x_2, y \rangle = \lambda_1 \langle x_1, y \rangle + \lambda_2 \langle x_2, y \rangle$.

(ii) $\langle x, y \rangle = \langle y, x \rangle$.

(iii) $\langle x, x \rangle = \|x\|^2$.

Two vectors $x$ and $y$ of $H$ are called orthogonal if $\langle x, y \rangle = 0$. A collection $H_0 \subset H$ of vectors is called an orthogonal system if any two elements in $H_0$ are orthogonal. An orthogonal system is orthonormal if each vector in it has norm 1. An orthonormal system $H_0$ is called complete if $\langle x, h \rangle = 0$ for all $h \in H_0$ implies $x = 0$. In a separable Hilbert space, every orthonormal system is countable and there exists a complete orthonormal system. Letting $\{h_1, h_2, \ldots\}$ denote a complete orthonormal system, Parseval’s identity says that, for any $x \in H$,

\[ \|x\|^2 = \sum_{j=1}^{\infty} [\langle x, h_j \rangle]^2. \tag{A.7} \]

⁵ For proofs of the results in this section, see Chapter 5 of Dudley (1989).

Example A.3.2 ($L^2$ spaces.) In Example A.3.1 with $p = 2$, the equivalence classes of square integrable functions form a Hilbert space with inner product given by

\[ \langle f_1, f_2 \rangle = \int f_1 f_2 \, d\mu. \]

If $\mathcal{X}$ is $[0, 1]$ and $\mu$ is Lebesgue measure, then a complete orthonormal system is given by the functions $f_j(u) = \sqrt{2} \sin(\pi j u)$, $j = 1, 2, \ldots$. Therefore, for any square integrable function $f$, Parseval’s identity yields

\[ \int_0^1 f^2(u) \, du = 2 \sum_{j=1}^{\infty} \left[ \int_0^1 f(u) \sin(\pi j u) \, du \right]^2. \]
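The last identity can be checked numerically for a specific function. The sketch below takes the hypothetical choice $f(u) = u$, for which the left-hand side equals 1/3, approximates the integrals by a midpoint rule, and sums the first 200 terms of the series; the grid size and the number of terms are arbitrary illustrative settings.

```python
import math

def integrate(func, n=2000):
    """Midpoint rule on [0, 1] with n subintervals."""
    h = 1.0 / n
    return h * sum(func((k + 0.5) * h) for k in range(n))

def parseval_partial_sum(f, terms):
    """2 * sum over j <= terms of (integral of f(u) sin(pi j u) du)^2."""
    total = 0.0
    for j in range(1, terms + 1):
        coeff = integrate(lambda u, j=j: f(u) * math.sin(math.pi * j * u))
        total += coeff ** 2
    return 2.0 * total

def f(u):
    # Hypothetical test function; any square integrable function could be used.
    return u

lhs = integrate(lambda u: f(u) ** 2)    # integral of f^2, close to 1/3
rhs = parseval_partial_sum(f, 200)      # partial sum of the series
print(lhs, rhs)
```

Because all terms of the series are nonnegative, the partial sum approaches the left-hand side from below; with 200 terms the gap is of the order of one thousandth.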

A.4 Dominated Families of Distributions

Let $\mathcal{M}$ be a family of measures defined over a measurable space $(\mathcal{X}, \mathcal{A})$. Then $\mathcal{M}$ is said to be dominated by a σ-finite measure $\mu$ defined over $(\mathcal{X}, \mathcal{A})$ if each member of $\mathcal{M}$ is absolutely continuous with respect to $\mu$. The family $\mathcal{M}$ is said to be dominated if there exists a σ-finite measure dominating it. Actually, if $\mathcal{M}$ is dominated there always exists a finite dominating measure. For suppose that $\mathcal{M}$ is dominated by $\mu$ and that $\mathcal{X} = \cup_i A_i$, with $\mu(A_i)$ finite for all $i$. If the sets $A_i$ are taken to be mutually exclusive, the measure $\nu(A) = \sum_i \mu(A \cap A_i) / [2^i \mu(A_i)]$ also dominates $\mathcal{M}$ and is finite.
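The construction of the finite measure ν can be illustrated with counting measure, which is σ-finite but not finite on an infinite set. The sketch below is a truncated illustration only: it assumes a hypothetical partition into 20 blocks of 10 integers each, so the sums are finite and exact.

```python
from fractions import Fraction

def nu(A, blocks, mu):
    """nu(A) = sum_i mu(A & A_i) / (2^i mu(A_i)) for disjoint blocks A_1, A_2, ...

    Only finitely many blocks are used here, which truncates the infinite sum.
    """
    return sum(Fraction(mu(A & Ai), 2 ** i * mu(Ai))
               for i, Ai in enumerate(blocks, start=1))

mu = len  # counting measure on finite sets of integers
# Hypothetical partition: A_1 = {0..9}, A_2 = {10..19}, ..., A_20 = {190..199}.
blocks = [frozenset(range(10 * i, 10 * (i + 1))) for i in range(20)]
whole = frozenset().union(*blocks)

print(nu(whole, blocks, mu))                 # 1048575/1048576, that is 1 - 2**-20
print(nu(frozenset({3, 15}), blocks, mu))    # 3/40
print(nu(frozenset(), blocks, mu))           # 0
```

Each block contributes total mass $2^{-i}$, so ν stays below 1 however many blocks are added, while a set has ν-measure zero exactly when it has μ-measure zero.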

Theorem A.4.1⁶ A family $\mathcal{P}$ of probability measures over a Euclidean space $(\mathcal{X}, \mathcal{A})$ is dominated if and only if it is separable with respect to the metric (A.4) or equivalently with respect to the convergence definition

\[ P_n \to P \quad \text{if } P_n(A) \to P(A) \text{ uniformly for } A \in \mathcal{A}. \]

Proof. Suppose first that $\mathcal{P}$ is separable and that the sequence $\{P_n\}$ is dense in $\mathcal{P}$, and let $\mu = \sum_n P_n / 2^n$. Then $\mu(A) = 0$ implies $P_n(A) = 0$ for all $n$, and hence $P(A) = 0$ for all $P \in \mathcal{P}$.

Conversely suppose that $\mathcal{P}$ is dominated by a measure $\mu$, which without loss of generality can be assumed to be finite. Then we must show that the set of integrable functions $dP/d\mu$ is separable with respect to the convergence definition (A.5) or, because of Lemma A.2.2, with respect to convergence in the mean. It follows from Lemma A.2.1 that it suffices to prove this separability for the class $\mathcal{F}$ of all functions $f$ that are integrable $\mu$. Since by the definition of the integral every integrable function can be approximated in the mean by simple functions, it is enough to prove this for the case that $\mathcal{F}$ is the class of all simple integrable functions. Any simple function can be approximated in the mean by simple functions taking on only rational values, so that it is sufficient to prove separability of the class of functions $\sum r_i I_{A_i}$ where the $r$’s are rational and the $A$’s are Borel sets, with finite $\mu$-measure since the $f$’s are integrable. It is therefore finally enough to take for $\mathcal{F}$ the class of functions $I_A$, which are indicator functions of Borel sets with finite measure. However, any such set can be approximated by finite unions of disjoint rectangles with rational end points. The class of all such unions is denumerable, and the associated indicator functions will therefore serve as the required countable dense subset of $\mathcal{F}$.

⁶ Berger (1951b).

An examination of the proof shows that the Euclidean nature of the space $(\mathcal{X}, \mathcal{A})$ was used only to establish the existence of a countable number of sets $A_i \in \mathcal{A}$ such that for any $A \in \mathcal{A}$ with finite measure there exists a subsequence $A_{i_j}$ with $\mu(A_{i_j}) \to \mu(A)$. This property holds quite generally for any σ-field $\mathcal{A}$ which has a countable number of generators, that is, for which there exists a countable number of sets $B_i$ such that $\mathcal{A}$ is the smallest σ-field containing the $B_i$.⁷ It follows that Theorem A.4.1 holds for any σ-field with this property. Statistical applications of such σ-fields occur in sequential analysis, where the sample space $\mathcal{X}$ is the union $\mathcal{X} = \cup_i \mathcal{X}_i$ of Borel subsets $\mathcal{X}_i$ of $i$-dimensional Euclidean space. In these problems, $\mathcal{X}_i$ is the set of points $(x_1, \ldots, x_i)$ for which exactly $i$ observations are taken. If $\mathcal{A}_i$ is the σ-field of Borel subsets of $\mathcal{X}_i$, one can take for $\mathcal{A}$ the σ-field generated by the $\mathcal{A}_i$, and since each $\mathcal{A}_i$ possesses a countable number of generators, so does $\mathcal{A}$.

If $\mathcal{A}$ does not possess a countable number of generators, a somewhat weaker conclusion can be asserted. Two families of measures $\mathcal{M}$ and $\mathcal{N}$ are equivalent if $\mu(A) = 0$ for all $\mu \in \mathcal{M}$ implies $\nu(A) = 0$ for all $\nu \in \mathcal{N}$ and vice versa.

Theorem A.4.2⁸ A family $\mathcal{P}$ of probability measures is dominated by a σ-finite measure if and only if $\mathcal{P}$ has a countable equivalent subset.

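Proof. Suppose first that $\mathcal{P}$ has a countable equivalent subset $\{P_1, P_2, \ldots\}$. Then $\mathcal{P}$ is dominated by $\mu = \sum P_n / 2^n$.

Conversely, let $\mathcal{P}$ be dominated by a σ-finite measure $\mu$, which without loss of generality can be assumed to be finite. Let $\mathcal{Q}$ be the class of all probability measures $Q$ of the form $\sum c_i P_i$, where $P_i \in \mathcal{P}$, the $c$’s are positive, and $\sum c_i = 1$. The class $\mathcal{Q}$ is also dominated by $\mu$, and we denote by $q$ a fixed version of the density $dQ/d\mu$. We shall prove the fact, equivalent to the theorem, that there exists $Q_0$ in $\mathcal{Q}$ such that $Q_0(A) = 0$ implies $Q(A) = 0$ for all $Q \in \mathcal{Q}$.

Consider the class $\mathcal{C}$ of sets $C$ in $\mathcal{A}$ for which there exists $Q \in \mathcal{Q}$ such that $q(x) > 0$ a.e. $\mu$ on $C$ and $Q(C) > 0$. Let $\mu(C_i)$ tend to $\sup_{\mathcal{C}} \mu(C)$, let $q_i(x) > 0$ a.e. on $C_i$, and denote the union of the $C_i$ by $C_0$. Then $q_0^*(x) = \sum c_i q_i(x)$ agrees a.e. with the density of $Q_0 = \sum c_i Q_i$ and is positive a.e. on $C_0$, so that $C_0 \in \mathcal{C}$. Suppose now that $Q_0(A) = 0$, let $Q$ be any other member of $\mathcal{Q}$, and let $C = \{x : q(x) > 0\}$. Writing $\tilde{C}$ for the complement of a set $C$, the set $A$ is the union of the three disjoint sets $A \cap C_0$, $A \cap \tilde{C}_0 \cap \tilde{C}$, and $A \cap \tilde{C}_0 \cap C$. First, $Q_0(A \cap C_0) = 0$, and therefore $\mu(A \cap C_0) = 0$ and $Q(A \cap C_0) = 0$. Also $Q(A \cap \tilde{C}_0 \cap \tilde{C}) = 0$. Finally, $Q(A \cap \tilde{C}_0 \cap C) > 0$ would lead to $\mu(C_0 \cup [A \cap \tilde{C}_0 \cap C]) > \mu(C_0)$ and hence to a contradiction of the relation $\mu(C_0) = \sup_{\mathcal{C}} \mu(C)$, since $A \cap \tilde{C}_0 \cap C$ and therefore $C_0 \cup [A \cap \tilde{C}_0 \cap C]$ belongs to $\mathcal{C}$.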

⁷ A proof of this is given for example by Halmos (1974, Theorem B of Section 40).

⁸ Halmos and Savage (1949).

A.5 The Weak Compactness Theorem

The following theorem forms the basis for proving the existence of most powerful tests, most stringent tests, and so on.

Theorem A.5.1⁹ (Weak compactness theorem). Let $\mu$ be a σ-finite measure over a Euclidean space, or more generally over any measurable space $(\mathcal{X}, \mathcal{A})$ for which $\mathcal{A}$ has a countable number of generators. Then the set of measurable functions $\phi$ with $0 \le \phi \le 1$ is compact with respect to the weak convergence (A.2).

Proof. Given any sequence $\{\phi_n\}$, we must prove the existence of a subsequence $\{\phi_{n_i}\}$ and a function $\phi$ such that

\[ \lim \int \phi_{n_i} p \, d\mu = \int \phi p \, d\mu \]

for all integrable $p$. If $\mu^*$ is a finite measure equivalent to $\mu$, then $p^*$ is integrable $\mu^*$ if and only if $p = (d\mu^*/d\mu) p^*$ is integrable $\mu$, and $\int \phi p \, d\mu = \int \phi p^* \, d\mu^*$ for all $\phi$. We may therefore assume without loss of generality that $\mu$ is finite. Let $\{p_n\}$ be a sequence of $p$’s which is dense in the $p$’s with respect to convergence in the mean. The existence of such a sequence is guaranteed by Theorem A.4.1 and the remark following it. If

\[ \Phi_n(p) = \int \phi_n p \, d\mu, \]

the sequence $\Phi_n(p)$ is bounded for each $p$. A subsequence $\Phi_{n_k}$ can be extracted such that $\Phi_{n_k}(p_m)$ converges for each $p_m$ by the following diagonal process. Consider first the sequence of numbers $\{\Phi_n(p_1)\}$, which possesses a convergent subsequence $\Phi_{n_1'}(p_1), \Phi_{n_2'}(p_1), \ldots$. Next the sequence $\Phi_{n_1'}(p_2), \Phi_{n_2'}(p_2), \ldots$ has a convergent subsequence $\Phi_{n_1''}(p_2), \Phi_{n_2''}(p_2), \ldots$. Continuing in this way, let $n_1 = n_1'$, $n_2 = n_2''$, $n_3 = n_3'''$, .... Then $n_1 < n_2 < \cdots$, and the sequence $\{\Phi_{n_i}\}$ converges for each $p_m$. It follows from the inequality

\[ |\Phi_{n_j}(p) - \Phi_{n_i}(p)| \le |\Phi_{n_j}(p_m) - \Phi_{n_i}(p_m)| + 2 \int |p - p_m| \, d\mu \]

that $\Phi_{n_i}(p)$ converges for all $p$. Denote its limit by $\Phi(p)$, and define a set function $\Phi^*$ over $\mathcal{A}$ by putting

\[ \Phi^*(A) = \Phi(I_A). \]

Then $\Phi^*$ is nonnegative and bounded, since for all $A$, $\Phi^*(A) \le \mu(A)$. To see that it is also countably additive let $A = \cup A_k$, where the $A_k$ are disjoint. Then $\Phi^*(A) = \lim \Phi_{n_i}^*(\cup A_k)$, where $\Phi_{n_i}^*(A) = \Phi_{n_i}(I_A) = \int_A \phi_{n_i} \, d\mu$, and

\[ \left| \int_{\cup A_k} \phi_{n_i} \, d\mu - \sum \Phi^*(A_k) \right| \le \left| \int_{\cup_{k=1}^m A_k} \phi_{n_i} \, d\mu - \sum_{k=1}^m \Phi^*(A_k) \right| + \left| \int_{\cup_{k=m+1}^\infty A_k} \phi_{n_i} \, d\mu - \sum_{k=m+1}^\infty \Phi^*(A_k) \right|. \]

Here the second term is to be taken as zero in the case of a finite sum $A = \cup_{k=1}^m A_k$, and otherwise does not exceed $2\mu(\cup_{k=m+1}^\infty A_k)$, which can be made arbitrarily small by taking $m$ sufficiently large. For any fixed $m$ the first term tends to zero as $i$ tends to infinity. Thus $\Phi^*$ is a finite measure over $(\mathcal{X}, \mathcal{A})$. It is furthermore absolutely continuous with respect to $\mu$, since $\mu(A) = 0$ implies $\Phi_{n_i}(I_A) = 0$ for all $i$, and therefore $\Phi(I_A) = \Phi^*(A) = 0$. We can now apply the Radon–Nikodym theorem to get

\[ \Phi^*(A) = \int_A \phi \, d\mu \quad \text{for all } A, \]

with $0 \le \phi \le 1$. We then have

\[ \int_A \phi_{n_i} \, d\mu \to \int_A \phi \, d\mu \quad \text{for all } A, \]

and weak convergence of the $\phi_{n_i}$ to $\phi$ follows from Lemma A.2.3.

⁹ Banach (1932). The theorem is valid even without the assumption of a countable number of generators; see Nölle and Plachky (1967) and Alaoglu’s theorem, given for example in Royden (1988).

References

Agresti, A. (1992). A survey of exact inference for contingency tables (with discussion). Statistical Science 7, 131–177.
Agresti, A. (2002). Categorical Data Analysis, 2nd edition. John Wiley, New York.
Agresti, A. and Coull, B. (1998). Approximate is better than “exact” for interval estimation of binomial proportions. American Statistician 52, 119–126.
Aiyar, R. J., Guillier, C. L., and Albers, W. (1979). Asymptotic relative efficiencies of rank tests for trend alternatives. Journal of the American Statistical Association 74, 226–231.
Akritas, M., Arnold, S. and Brunner, E. (1997). Nonparametric hypotheses and rank statistics for unbalanced factorial designs. Journal of the American Statistical Association 92, 258–265.
Albers, W. (1978). Testing the mean of a normal population under dependence. Annals of Statistics 6, 1337–1344.
Albers, W., Bickel, P. and van Zwet, W. (1976). Asymptotic expansion for the power of distribution free tests in the one-sample problem. Annals of Statistics 4, 108–156.
Albert, A. (1976). When is a sum of squares an analysis of variance? Annals of Statistics 4, 775–778.
Anderson, T. W. (1967). Confidence limits for the expected of an arbitrary bounded random variable with a continuous distribution function. Bull. ISI 43, 249–251.
Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd edition. John Wiley, Hoboken, NJ. [Problem 6.19.]

Andersson, S. (1982). Distributions of maximal invariants using quotient measures. Annals of Statistics 10, 955–961.
Anscombe, F. (1948). Transformations of Poisson, binomial and negative binomial data. Biometrika 35, 246–254.
Antille, A., Kersting, G., and Zucchini, W. (1982). Testing symmetry. Journal of the American Statistical Association 77, 639–651.
Arbuthnot, J. (1710). An argument for Divine Providence, taken from the constant regularity observ’d in the births of both sexes. Phil. Trans. 27, 186–190.
Arcones, M. (1991). On the asymptotic theory of the bootstrap. Ph.D. thesis, The City University of New York.
Arcones, M. and Giné, E. (1989). The bootstrap of the mean with arbitrary bootstrap sample size. Annals of the Institute Henri Poincaré 25, 457–481.
Arcones, M. and Giné, E. (1991). Additions and correction to “the bootstrap of the mean with arbitrary bootstrap sample size”. Annals of the Institute Henri Poincaré 27, 583–595.
Armsen, P. (1955). Tables for significance tests of 2 × 2 contingency tables. Biometrika 42, 494–511.
Arnold, S. (1981). The Theory of Linear Models and Multivariate Analysis. John Wiley, New York.
Arnold, S. (1984). Pivotal quantities and invariant confidence regions. Statistics and Decisions 2, 257–280.
Arrow, K. (1960). and the choice of a level of significance for the t-test. In Contributions to Probability and Statistics (Olkin et al., eds.) Stanford University Press, Stanford, California.
Arvesen, J. N. and Layard, M. W. J. (1975). Asymptotically robust tests in unbalanced variance component models. Annals of Statistics 3, 1122–1134.
Athreya, K. (1985). Bootstrap of the mean in the infinite variance case, II. Technical Report 86-21, Department of Statistics, Iowa State University.
Athreya, K. (1987). Bootstrap of the mean in the infinite variance case. Annals of Statistics 15, 724–731.
Atkinson, A. and Donev, A. (1992). Optimum Experimental Design. Clarendon Press, .
Atkinson, A. and Riani, M. (2000). Robust Regression Analysis. Springer-Verlag, New York.
Babu, G. (1984). Bootstrapping statistics with linear combinations of chi-squares as weak limit. Sankhyā Series A 56, 85–93.
Babu, G. and Singh, K. (1983). Inference on means using the bootstrap. Annals of Statistics 11, 999–1003.
Bahadur, R. (1955). A characterization of sufficiency. Annals of Mathematical Statistics 26, 286–293.
Bahadur, R. (1960). Stochastic comparison of tests. Annals of Mathematical Statistics 31, 279–295.

Bahadur, R. (1965). An optimal property of the likelihood ratio statistic. In Proc. 5th Berkeley Symposium on Probab. Theory and Math. Statist. 1, Le Cam, L. and Neyman, J. (eds.), University of California Press, 13–26.
Bahadur, R. (1979). A note on UMV estimates and ancillary statistics. In Contributions to Statistics, J. Hájek Memorial Volume, Edited by Jureckova, Academia, Prague.
Bahadur, R. and Lehmann, E. L. (1955). Two comments on ‘sufficiency and statistical decision functions’. Annals of Mathematical Statistics 26, 139–142. [Problem 2.5.]
Bahadur, R. and Savage, L. J. (1956). The nonexistence of certain statistical procedures in nonparametric problems. Annals of Mathematical Statistics 27, 1115–1122.
Bain, L. J. and Engelhardt, M. E. (1975). A two-moment chi-square approximation for the statistic log(X̄/X̃). Journal of the American Statistical Association 70, 948–950.
Baker, R. (1995). Two permutation tests of equality of variances. Statistics and Computing 5, 351–361.
Banach, S. (1932). Théorie des Opérations Linéaires. Funduszu Kultury Narodowej, Warszawa.
Bar-Lev, S. and Plachky, D. (1989). Boundedly complete families which are not complete. Metrika 36, 331–336.
Bar-Lev, S. and Reiser, B. (1982). An exponential subfamily which admits UMPU tests based on a single test statistic. Annals of Statistics 10, 979–989.
Barankin, E. W. and Maitra, A. P. (1963). Generalizations of the Fisher–Darmois–Koopman–Pitman theorem on sufficient statistics. Sankhyā Series A 25, 217–244.
Barlow, R. E., Bartholomew, D. J., Bremner, J. M., and Brunk, H. D. (1972). Statistical Inference under Order Restrictions, John Wiley, New York.
Barnard, G. A. (1976). Conditional inference is not inefficient. Scandinavian Journal of Statistics 3, 132–134. [Problem 10.27.]
Barnard, G. A. (1995). Pivotal models and the fiducial argument. International Statistical Review 63, 309–323.
Barnard, G. A. (1996). Rejoinder, Pivotal models and structural models. International Statistical Review 64, 235–236.
Barndorff-Nielsen, O. (1978). Information and Exponential Families in Statistical Theory. John Wiley, New York. [Provides a systematic discussion of various concepts of ancillarity with many examples.]
Barndorff-Nielsen, O. (1983). On a formula for the distribution of the maximum likelihood estimator. Biometrika 70, 343–365.
Barndorff-Nielsen, O., Cox, D. and Reid, N. (1986). Differential geometry in statistical theory. International Statistical Review 54, 83–96.
Barndorff-Nielsen, O. and Hall, P. (1988). On the level-error after Bartlett adjustment of the likelihood ratio statistic. Biometrika 75, 374–378.

Barndorff-Nielsen, O. and Pedersen, K. (1968). Sufficient data reduction and exponential families. Math. Scand. 2, 197–202.
Barnett, V. (1999). Comparative Statistical Inference, 3rd edition. John Wiley, New York.
Barron, A. (1989). Uniformly powerful goodness of fit tests. Annals of Statistics 17, 107–124.
Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. Proc. Roy. Soc. London, Ser. A 160, 268–282. [Points out that exact (that is, similar) tests can be obtained by combining the conditional tests given the different values of a sufficient statistic. Applications.]
Bartlett, M. S. (1957). A comment on D. V. Lindley’s statistical paradox. Biometrika 44, 533–534.
Basu, D. (1955). On statistics independent of a complete sufficient statistic. Sankhyā 15, 377–380.
Basu, D. (1958). On statistics independent of a sufficient statistic. Sankhyā 20, 223–226.
Basu, D. (1959). The family of ancillary statistics. Sankhyā (A) 21, 247–256. [Problem 10.7.]
Basu, D. (1964). Recovery of ancillary information. Sankhyā (A) 26, 3–16. [Problems 10.9, 10.11.]
Basu, D. (1978). On partial sufficiency: A review. Journal of Statistical Planning and Inference 2, 1–13.
Basu, D. (1982). Basu theorems. In Encycl. Statist. Sci. 1, 193–196.
Basu, S. (1999). Conservatism of the z confidence interval under symmetric and asymmetric departures from normality. Annals of the Institute of Statistical Mathematics 51, 217–230.
Basu, S. and DasGupta, A. (1995). Robustness of standard confidence intervals for location parameters under departure from normality. Annals of Statistics 23, 1433–1442.
Bayarri, M. and Berger, J. (2000). P-values for composite null hypotheses. Journal of the American Statistical Association 95, 1127–1142.
Bayarri, M. and Berger, J. (2004). The interplay of Bayesian and frequentist analysis. Statistical Science 19, 58–80.
Becker, B. (1997). Combination of p-values. Encycl. Statist. update 1, 448–453.
Becker, N. and Gordon, I. (1983). On Cox’s criterion for discriminating between alternative ancillary statistics. International Statistical Review 51, 89–92.
Bednarski, T. (1984). Minimax testing between Prohorov neighbourhoods. Statistics and Decisions 2, 281–292.
Behnen, K. and Neuhaus, G. (1989). Rank Tests With Estimated Scores and Their Applications. (Teubner Skripten zur Mathematischen Stochastik) B. G. Teubner, Stuttgart.
Bell, C. B., Blackwell, D. and Breiman, L. (1960). On the completeness of order statistics. Annals of Mathematical Statistics 31, 794–797.

Bell, C. B. (1964). A characterization of multisample distribution-free statistics. Annals of Mathematical Statistics 35, 735–738.
Bell, C. D. and Sen, P. (1984). Randomization procedures. In Handbook of Statistics 4 (Krishnaiah and Sen, eds.), Elsevier.
Benichou, J., Fears, T. and Gail, M. (1996). A reminder of the fallibility of the Wald statistic. American Statistician 50, 226–227.
Bening, V. (2000). Asymptotic Theory of Testing Statistical Hypotheses: Efficient Statistics, Optimality, Power Loss, and Deficiency. VSP Publishing, The Netherlands.
Benjamini, Y. (1983). Is the t-test really conservative when the parent distribution is long-tailed? Journal of the American Statistical Association 78, 645–654.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289–300.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29, 1165–1189.
Bennett, B. (1957). On the performance characteristic of certain methods of determining confidence limits. Sankhyā 18, 1–12.
Bentkus, V. (2003). On the dependence of the Berry-Esseen bound on dimension. Journal of Statistical Planning and Inference 113, 385–402.
Beran, R. (1974). Asymptotically efficient adaptive rank estimates in location models. Annals of Statistics 2, 63–74.
Beran, R. (1977). Minimum Hellinger distance estimates for parametric models. Annals of Statistics 5, 445–463.
Beran, R. (1984). Bootstrap methods in statistics. Jahresberichte des Deutschen Mathematischen Vereins 86, 14–30.
Beran, R. (1986). Simulated power function. Annals of Statistics 14, 151–173.
Beran, R. (1987). Prepivoting to reduce level error of confidence sets. Biometrika 74, 151–173.
Beran, R. (1988a). Balanced simultaneous confidence sets. Journal of the American Statistical Association 83, 679–686.
Beran, J. (1988b). Prepivoting test statistics: a bootstrap view of asymptotic refinements. Journal of the American Statistical Association 83, 687–697.
Beran, J. (1999). Hájek-Inagaki convolution theorem. In Encyclopedia of Statistical Sciences, Update 3, 294–297. John Wiley, New York.
Beran, J. and Ducharme, G. (1991). Asymptotic Theory for Bootstrap Methods in Statistics. Centre de recherches mathématiques, University of Montreal, Quebec.
Beran, R. and Millar, W. (1986). Confidence sets for a multivariate distribution. Annals of Statistics 14, 431–443.
Beran, R. and Millar, W. (1988). A stochastic minimum distance test for multivariate parametric models. Annals of Statistics 17, 125–140.

Beran, R. and Srivastava, M. S. (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. Annals of Statistics 13, 95–115.
Berger, A. (1951a). On uniformly consistent tests. Annals of Mathematical Statistics 22, 289–293.
Berger, A. (1951b). Remark on separable spaces of probability measures. Annals of Mathematical Statistics 22, 119–120.
Berger, J. (1985a). Statistical Decision Theory and Bayesian Analysis, 2nd edition. Springer, New York.
Berger, J. (1985b). The frequentist viewpoint of conditioning. In Proc. Berkeley Conf. in Honor of J. Neyman and J. Kiefer (Le Cam and Olshen, eds.), Wadsworth, Belmont, Calif.
Berger, J. (2003). Could Fisher, Jeffreys and Neyman have agreed on testing? (with discussion). Statistical Science 18, 1–32.
Berger, J., Boukai, B. and Wang, Y. (1997). United frequentist and Bayesian testing of a precise hypothesis (with discussion). Statistical Science 12, 133–160.
Berger, J., Brown, L. D. and Wolpert, R. (1994). A unified conditional frequentist and Bayesian test for fixed and sequential simple hypothesis testing. Annals of Statistics 22, 1787–1807.
Berger, J., Liseo, B. and Wolpert, R. (1999). Integrated likelihood methods for eliminating nuisance parameters (with discussion). Statistical Science 14, 1–28.
Berger, J. and Sellke, T. (1987). Testing a point null-hypothesis: The irreconcilability of significance levels and evidence. Journal of the American Statistical Association 82, 112–122.
Berger, J. and Wolpert, R. (1988). The Likelihood Principle, 2nd edition, IMS Lecture Notes–Monograph Series, Hayward, CA.
Berger, R. (1982). Multiparameter hypothesis testing and acceptance sampling. Technometrics 24, 295–300.
Berger, R. and Boos, D. (1994). p-values maximized over a confidence set for the nuisance parameter. Journal of the American Statistical Association 89, 1012–1016.
Berger, R. and Hsu, J. (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets (with discussion). Statistical Science 11, 283–319.
Berk, R. (1970). A remark on almost invariance. Annals of Mathematical Statistics 41, 733–735.
Berk, R. and Bickel, P. (1968). On invariance and almost invariance. Annals of Mathematical Statistics 39, 1573–1576.
Berk, R. and Cohen, A. (1979). Asymptotically optimal methods of combining tests. Journal of the American Statistical Association 74, 812–814.
Berk, R., Nogales, A. and Oyola, J. (1996). Some counterexamples concerning sufficiency and invariance. Annals of Statistics 24, 902–905.
Bernardo, J. and Smith, A. (1994). Bayesian Theory. John Wiley, New York.

Bernoulli, D. (1734). Quelle est la cause physique de l’inclinaison des plans des orbites des planetes par rapport au plan de l’équateur de la revolution du soleil autour de son axe; Et d’où vient que les inclinaisons de ces orbites sont differentes entre elles. Recueil des Pièces qui ont Remporté le Prix de l’Académie Royale des Sciences 3, 93–122.

Bhat, U. and Miller, G. (2002). Elements of Applied Stochastic Processes, 3rd edition, John Wiley, New York.

Bhattacharya, P. K., Gastwirth, J. L., and Wright, A. L. (1982). Two modified Wilcoxon tests for symmetry about an unknown location parameter. Biometrika 69, 377–382.

Bhattacharya, R. and Ghosh, J. (1978). On the validity of the formal Edgeworth expansion. Annals of Statistics 6, 434–451.

Bhattacharya, R. and Rao, R. (1976). Normal Approximation and Asymptotic Expansions. John Wiley, New York.

Bickel, P. (1974). Edgeworth expansions in nonparametric statistics. Annals of Statistics 2, 1–20.

Bickel, P. (1982). On adaptive estimation. Annals of Statistics 10, 647–671.

Bickel, P. (1984). Parametric robustness: small biases can be worthwhile. Annals of Statistics 12, 864–879.

Bickel, P. and Doksum, K. A. (1981). An analysis of transformations revisited. Journal of the American Statistical Association 76, 296–311.

Bickel, P. and Doksum, K. A. (2001). Mathematical Statistics, volume I, 2nd edition. Prentice Hall, Upper Saddle River, New Jersey.

Bickel, P. and Freedman, D. (1981). Some asymptotic theory for the bootstrap. Annals of Statistics 9, 1196–1217.

Bickel, P. and Ghosh, J. (1990). A decomposition for the likelihood ratio statistic and the Bartlett correction – a Bayesian argument. Annals of Statistics 18, 1070–1090.

Bickel, P., Götze, F. and van Zwet, W. R. (1997). Resampling fewer than n observations: Gains, losses, and remedies for losses. Statistica Sinica 7, 1–31.

Bickel, P., Klaassen, C., Ritov, Y., and Wellner, J. (1993). Efficient and Adaptive Estimation for Semiparametric Models. The Johns Hopkins University Press, Baltimore, MD.

Bickel, P. and Van Zwet, W. R. (1978). Asymptotic expansions for the power of distribution free tests in the two-sample problem. Annals of Statistics 6, 937–1004.

Billingsley, P. (1961). Statistical methods in Markov chains. Annals of Mathematical Statistics 32, 12–40.

Billingsley, P. (1968). Convergence of Probability Measures. John Wiley, New York.

Billingsley, P. (1995). Probability and Measure, 3rd edition. John Wiley, New York.

Birch, M. W. (1964). The detection of partial association, I: The 2 × 2 case. Journal of the Royal Statistical Society Series B 26, 313–324.

Birnbaum, A. (1954a). Statistical methods for Poisson processes and exponential populations. Journal of the American Statistical Association 49, 254–266.

Birnbaum, A. (1954b). Admissible test for the mean of a rectangular distribution. Annals of Mathematical Statistics 25, 157–161.

Birnbaum, A. (1955). Characterization of complete classes of tests of some multiparameter hypotheses, with applications to likelihood ratio tests. Annals of Mathematical Statistics 26, 21–36.

Birnbaum, A. (1962). On the foundations of statistical inference (with discussion). Journal of the American Statistical Association 57, 269–326.

Birnbaum, Z. W. (1952). Numerical tabulation of the distribution of Kolmogorov’s statistic for finite sample size. Journal of the American Statistical Association 47, 431.

Birnbaum, Z. W. and Chapman, D. G. (1950). On optimum selections from multinormal populations. Annals of Mathematical Statistics 21, 433–447. [Problem 3.46]

Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice, MIT Press, Cambridge, Mass.

Blackwell, D. (1951). On a theorem of Lyapunov. Annals of Mathematical Statistics 22, 112–114.

Blackwell, D. and Dubins, L. E. (1975). On existence and non-existence of proper, regular conditional distributions. Annals of Probability 3, 741–752.

Blackwell, D. and Girshick, M. A. (1954). Theory of Games and Statistical Decisions. John Wiley, New York.

Blackwell, D. and Ramamoorthi, R. V. (1982). A Bayes but not classically sufficient statistic. Annals of Statistics 10, 1025–1026.

Blair, R. C. and Higgins, J. J. (1980). A comparison of the power of Wilcoxon’s rank-sum statistic to that of Student’s t-statistic under various nonnormal distributions. Journal of Educational Statistics 5, 309–335.

Blyth, C. R. (1970). On the inference and decision models of statistics (with discussion). Annals of Mathematical Statistics 41, 1034–1058.

Blyth, C. R. (1984). Approximate binomial confidence limits. Queen’s Math. Preprint 1984–6, Queen’s Univ., Kingston, Ontario.

Blyth, C. R. and Hutchinson, D. W. (1960). Tables of Neyman-shortest confidence intervals for the binomial parameter. Biometrika 47, 481–491.

Blyth, C. R. and Staudte, R. (1995). Estimating statistical hypotheses. Statistics and Probability Letters 23, 45–52.

Blyth, C. R. and Staudte, R. (1997). Hypothesis estimates and acceptability profiles for 2 × 2 contingency tables. Journal of the American Statistical Association 92, 694–699.

Blyth, C. R. and Still, H. A. (1983). Binomial confidence intervals. Journal of the American Statistical Association 78, 108–116.

Bohrer, R. (1973). An optimality property of Scheffé bounds. Annals of Statistics 1, 766–772.

Bondar, J. V. (1977). A conditional confidence principle. Annals of Statistics 5, 881–891.

Bondar, J. V. and Milnes, P. (1981). Amenability: A survey for statistical applications of Hunt–Stein and related conditions on groups. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57, 103–128.

Bondar, J. V. and Milnes, P. (1982). A converse to the Hunt-Stein theorem. Unpublished.

Bondessen, L. (1983). Equivariant estimators. In Encyclopedia of Statistical Sciences, Vol. 2. John Wiley, New York.

Boos, D. (1982). A test for asymmetry associated with the Hodges–Lehmann estimator. Journal of the American Statistical Association 77, 647–651.

Boos, D. and Brownie, C. (1989). Bootstrap methods for testing homogeneity of variances. Technometrics 31, 69–82.

Boos, D. and Hughes-Oliver, J. (1998). Applications of Basu’s theorem. American Statistician 52, 218–221.

Boschloo, R. D. (1970). Raised conditional level of significance for the 2 × 2 table when testing the equality of two probabilities. Statistica Neerlandica 24, 1–35.

Bowker, A. H. (1948). A test for symmetry in contingency tables. Journal of the American Statistical Association 43, 572–574.

Box, G. E. P. (1953). Non-normality and tests for variances. Biometrika 40, 318–335.

Box, G. E. P. and Andersen, S. L. (1955). Permutation theory in the derivation of robust criteria and the study of departures from assumptions. Journal of the Royal Statistical Society Series B 17, 1–34.

Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society Series B 26, 211–252.

Box, G. E. P. and Cox, D. R. (1982). An analysis of transformations revisited, rebutted. Journal of the American Statistical Association 77, 209–210.

Box, G. E. P., Hunter, W. G., and Hunter, J. S. (1978). Statistics for Experimenters. John Wiley, New York.

Box, G. E. P. and Tiao, G. C. (1964). A note on criterion robustness and inference robustness. Biometrika 51, 169–173.

Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Addison–Wesley, Reading, Mass.

Box, J. F. (1978). R. A. Fisher: The Life of a Scientist. John Wiley, New York.

Brain, C. W. and Shapiro, S. S. (1983). A regression test for exponentiality: Censored and complete samples. Technometrics 25, 69–76.

Braun, H. (Ed.) (1994). The Collected Works of John W. Tukey: Vol. VIII Multiple Comparisons: 1948–1983. Chapman & Hall, New York.

Bretagnolle, J. (1983). Limites du bootstrap de certaines fonctionnelles. Annals of the Institute Henri Poincaré 3, 281–296.

Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, 2nd edition. Springer, New York.

Broemeling, L. D. (1985). Bayesian Analysis of Linear Models. Marcel Dekker, New York.

Bross, I. D. J. and Kasten, E. L. (1957). Rapid analysis of 2 × 2 tables. Journal of the American Statistical Association 52, 18–28.

Brown, K. G. (1984). On analysis of variance in the mixed model. Annals of Statistics 12, 1488–1499.

Brown, L. D. (1964). Sufficient statistics in the case of independent random variables. Annals of Mathematical Statistics 35, 1456–1474.

Brown, L. D. (1966). On the admissibility of invariant estimators of one or more location parameters. Annals of Mathematical Statistics 37, 1087–1136.

Brown, L. D. (1967). The conditional level of Student’s t-test. Annals of Mathematical Statistics 38, 1068–1071.

Brown, L. D. (1978). An extension of Kiefer’s theory of conditional confidence procedures. Annals of Statistics 6, 59–71.

Brown, L. D. (1986). Fundamentals of Statistical Exponential Families (With Application to Statistical Decision Theory). Institute of Statistical Mathematics Lecture Notes Monograph Series, 9, Hayward, CA.

Brown, L. D. (1990). An ancillarity paradox which appears in multiple linear regression (with discussion). Annals of Statistics 18, 471–538.

Brown, L. D. (1994). Minimaxity, more or less. In Statistical Decision Theory and Related Topics V, Gupta and Berger (eds.), 1–18. Springer-Verlag, New York.

Brown, L. D. (2000). Statistical decision theory. Journal of the American Statistical Association 95, 1277–1281.

Brown, L. D., Cai, T. and DasGupta, A. (2001). Interval estimation for a binomial proportion. Statistical Science 16, 101–133.

Brown, L. D., Cai, T. and DasGupta, A. (2002). Confidence intervals for a binomial proportion and asymptotic expansions. Annals of Statistics 30, 160–201.

Brown, L. D., Casella, G. and Hwang, J. (1995). Optimal confidence sets, bioequivalence, and the limaçon of Pascal. Journal of the American Statistical Association 90, 880–889.

Brown, L. D., Cohen, A., and Strawderman, W. E. (1976). A complete class theorem for strict monotone likelihood ratio with applications. Annals of Statistics 4, 712–722.

Brown, L. D., Johnstone, I. M. and MacGibbon, K. G. (1981). Variation diminishing transformations: A direct approach to total positivity and its statistical applications. Journal of the American Statistical Association 76, 824–832.

Brown, L. D., Hwang, J. and Munk, A. (1997). An unbiased test for the bioequivalence problem. Annals of Statistics 25, 2345–2367.

Brown, L. D. and Marden, J. (1989). Complete class results for hypothesis testing problems with simple null hypotheses. Annals of Statistics 17, 209–235.

Brown, L. D. and Sackrowitz, H. (1984). An alternative to Student’s t-test for problems with indifference zones. Annals of Statistics 12, 451–469.

Brown, M. B. and Forsythe, A. (1974a). The small sample behavior of some statistics which test the equality of several means. Technometrics 16, 129–132.

Brown, M. B. and Forsythe, A. (1974b). Robust tests for the equality of variances. Journal of the American Statistical Association 69, 364–367.

Brownie, C. and Kiefer, J. (1977). The ideas of conditional confidence in the simplest setting. Comm. Statist. A6(10.8), 691–751.

Buehler, R. (1959). Some validity criteria for statistical inferences. Annals of Mathematical Statistics 30, 845–863. [The first systematic treatment of relevant subsets, including Example 10.4.1.]

Buehler, R. (1982). Some ancillary statistics and their properties. Journal of the American Statistical Association 77, 581–589. [A review of the principal examples of ancillaries.]

Buehler, R. (1983). Fiducial inference. In Encyclopedia of Statistical Sciences, Vol. 3, John Wiley, New York.

Buehler, R. and Feddersen, A. P. (1963). Note on a conditional property of Student’s t. Annals of Mathematical Statistics 34, 1098–1100.

Burkholder, D. L. (1961). Sufficiency in the undominated case. Annals of Mathematical Statistics 32, 1191–1200.

Cabaña, A. and Cabaña, E. (1997). Transformed empirical processes and modified Kolmogorov-Smirnov tests for multivariate distributions. Annals of Statistics 25, 2388–2409.

Casella, G. (1987). Conditionally acceptable recentered set estimators. Annals of Statistics 15, 1363–1371.

Casella, G. (1988). Conditionally acceptable frequentist solutions (with discussion). In Statistical Decision Theory and Related Topics IV 1, 73–117.

Castillo, J. and Puig, P. (1999). The best test of exponentiality against singly truncated normal alternatives. Journal of the American Statistical Association 94, 529–532.

Chambers, E. A. and Cox, D. R. (1967). Discrimination between alternative binary response models. Biometrika 54, 573–578.

Chatterjee, S., Hadi, A. and Price, B. (2000). Regression Analysis By Example, 3rd edition. John Wiley, New York.

Chebychev, P. (1890). Sur deux théorèmes relatifs aux probabilités. Acta. Math. 14, 305–315.

Chen, L. (1995). Testing the mean of skewed distributions. Journal of the American Statistical Association 90, 767–772.

Chernoff, H. (1949). Asymptotic studentization in testing of hypotheses. Annals of Mathematical Statistics 20, 268–278.

Chernoff, H. (1954). On the distribution of the likelihood ratio statistic. Annals of Mathematical Statistics 25, 579–586.

Chernoff, H. and Lehmann, E. L. (1954). The use of maximum likelihood estimates in χ² goodness of fit. Annals of Mathematical Statistics 25, 579–586.

Chhikara, R. S. (1975). Optimum tests for the comparison of two inverse Gaussian distribution means. Australian Journal of Statistics 17, 77–83.

Chhikara, R. S. and Folks, J. L. (1976). Optimum test procedures for the mean of first passage time distribution in Brownian motion with positive drift. Technometrics 18, 189–193.

Chmielewski, M. A. (1981). Elliptically symmetric distributions: A review and bibliography. International Statistical Review 49, 67–74.

Choi, K. and Marden, J. (1997). An approach to multivariate rank tests in multivariate analysis of variance. Journal of the American Statistical Association 92, 1581–1590.

Choi, S., Hall, W. and Schick, A. (1996). Asymptotically uniformly most powerful tests in parametric and semiparametric models. Annals of Statistics 24, 841–861.

Chou, Y. M., Arthur, K. H., Rosenstein, R. B., and Owen, D. B. (1984). New representations of the noncentral chi-square density and cumulative. Communications in Statistics – Theory and Methods 13, 2673–2678.

Christensen, R. (1989). Lack-of-fit tests based on near or exact replicates. Annals of Statistics 17, 673–683.

Christensen, R. (2000). Linear and loglinear models. Journal of the American Statistical Association 95, 1290–1293.

Cima, J. A. and Hochberg, Y. (1976). On optimality criteria in simultaneous interval estimation. Communications in Statistics – Theory and Methods A5, 875–882.

Clinch, J. C. and Kesselman, H. J. (1982). Parametric alternatives to the analysis of variance. Journal of Educational Statistics 7, 207–214.

Cochran, W. G. (1968). Errors of measurement in statistics. Technometrics 10, 637–666.

Cohen, A. (1972). Improved confidence intervals for the variance of a normal distribution. Journal of the American Statistical Association 67, 382–387.

Cohen, A., Gatsonis, C., and Marden, J. (1983). Hypothesis tests and optimality properties in discrete multivariate analysis. In Studies in , Time Series, and Multivariate Statistics (Karlin et al., eds.), 379–405. Academic Press, New York.

Cohen, A., Kemperman, J. and Sackrowitz, H. (1994). Unbiased testing in exponential family regression. Annals of Statistics 22, 1931–1946.

Cohen, A. and Marden, J. (1989). On the admissibility and consistency of tests for homogeneity of variances. Annals of Statistics 17, 236–251.

Cohen, A. and Miller, J. (1976). Some remarks on Scheffé’s two-way mixed model. American Statistician 30, 36–37.

Cohen, A. and Sackrowitz, H. (1975). Unbiasedness of the chi-square, likelihood ratio and other goodness of fit tests for the equal cell case. Annals of Statistics 3, 959–964.

Cohen, A. and Sackrowitz, H. (1992). Improved tests for comparing treatments against a control and other one-sided problems. Journal of the American Statistical Association 87, 1137–1144.

Cohen, A. and Strawderman, W. E. (1973). Admissibility implications for different criteria in confidence estimation. Annals of Statistics 1, 363–366.

Cohen, J. (1962). The statistical power of abnormal-social psychological research: A review. J. Abnormal and Soc. Psychology 65, 145–153.

Cohen, J. (1977). Statistical Power Analysis for the Behavioral Sciences, revised edition. Academic Press, New York. [Advocates the consideration of power attainable against the alternatives of , and provides the tables needed for this purpose for some of the most common tests.]

Cohen, L. (1958). On mixed single sample experiments. Annals of Mathematical Statistics 29, 947–971.

Conover, W. J., Johnson, M. E. and Johnson, M. M. (1981). A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics 23, 351–361.

Cox, D. R. (1958). Some problems connected with statistical inference. Annals of Mathematical Statistics 29, 357–372.

Cox, D. R. (1959). Planning of Experiments. John Wiley, New York.

Cox, D. R. (1961). Tests of separate families of hypotheses. In Proc. 4th Berkeley Symp., Vol. 1, 105–123.

Cox, D. R. (1962). Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society Series B 24, 406–423.

Cox, D. R. (1966). A simple example of a comparison involving quantal data. Biometrika 53, 215–220.

Cox, D. R. (1970). The Analysis of Binary Data, Methuen, London. [An introduction to the problems treated in Sections 4.6-4.7 and some of their extensions.]

Cox, D. R. (1971). The choice between ancillary statistics. Journal of the Royal Statistical Society (B) 33, 251–255.

Cox, D. R. (1977). The role of significance tests. Scandinavian Journal of Statistics 4, 49–62.

Cramér, H. (1928). On the composition of elementary errors. Skand. Aktuarietidskr. 11, 13–74, 141–186.

Cramér, H. (1937). Random Variables and Probability Distributions. Cambridge University Press, Cambridge.

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.

Cressie, N. (1980). Relaxing assumptions in the one-sample t-test. Australian Journal of Statistics 22, 143–153.

Csörgő, S. and Mason, D. (1989). Bootstrap empirical functions. Annals of Statistics 17, 1447–1471.

Cvitanic, J. and Karatzas, I. (2001). Generalized Neyman-Pearson lemma via convex duality. Bernoulli 7, 79–97.

Cyr, J. L. and Manoukian, E. B. (1982). Approximate critical values with error bounds for Bartlett’s test of homogeneity of variances for unequal sample sizes. Communications in Statistics – Theory and Methods 11, 1671–1680.

D’Agostino, R. (1982). Departures from normality, tests for. In Encycl. Statist. Sci. Vol. 2. John Wiley, New York.

D’Agostino, R. and Stephens, M. A. (1986). Goodness-of-Fit Techniques. Marcel Dekker, New York.

Dantzig, G. B. and Wald, A. (1951). On the fundamental lemma of Neyman and Pearson. Annals of Mathematical Statistics 22, 87–93.

Darmois, G. (1935). Sur les lois de probabilite a estimation exhaustive. C.R. Acad. Sci. Paris 260, 1265–1266.

DasGupta, A. (1991). Diameter and volume minimizing confidence sets in Bayes and classical problems. Annals of Statistics 19, 1225–1243.

Davenport, J. M. and Webster, J. T. (1975). The Behrens-Fisher problem. An old solution revisited. Metrika 22, 47–54.

David, H. A. (1981). Order Statistics, 2nd edition. John Wiley, New York.

Davison, A. and Hinkley, D. (1997). Bootstrap Methods and their Application. Cambridge University Press, Cambridge.

Dawid, A. P. (1975). On the concepts of sufficiency and ancillarity in the presence of nuisance parameters. Journal of the Royal Statistical Society Series B 37, 248–258.

Dawid, A. P. (1977). Discussion of Wilkinson: On resolving the controversy in statistical inference. Journal of the Royal Statistical Society 39, 151–152. [Problem 10.12.]

Dayton, C. (2003). Information criteria for pairwise comparisons. Psychological Methods 8, 61–71.

de Leeuw, J. (1992). Introduction to Akaike’s (1973) paper “Information theory and an extension of the maximum likelihood principle”. Appeared in Breakthroughs in Statistics, volume I, Kotz, S. and Johnson, N. L. eds., Springer-Verlag, New York.

de Moivre, A. (1733). The Doctrine of Chances, 3rd edition (1756) has been reprinted by Chelsea, New York (1967).

Dempster, A. P. (1958). A high dimensional two-sample significance test. Annals of Mathematical Statistics 29, 995–1010.

Deshpande, J. V. (1983). A class of tests for exponentiality against increasing failure rate average alternatives. Biometrika 70, 514–518.

Deuchler, G. (1914). Ueber die Methoden der Korrelationsrechnung in der Paedagogik und Psychologie. Z. Pädag. Psychol. 15, 114–131, 145–159, 229–242.

Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer-Verlag, New York.

de Wet, T. and Randles, R. (1987). On the effect of substituting parameter estimators in limiting χ² U and V statistics. Annals of Statistics 15, 398–412.

Diaconis, P. (1988). Group representations in probability and statistics. IMS Lecture Notes, 11, Institute of Statistical Mathematics, Hayward, CA.

Diaconis, P. and Efron, B. (1985). Testing for independence in a two-way table. New interpretations of the chi-square statistic (with discussion). Annals of Statistics 13, 845–913.

Diaconis, P. and Holmes, S. (1994). Gray codes for randomization procedures. Statistics and Computing 4, 287–302.

DiCiccio, T., Hall, P., and Romano, J. P. (1991). Empirical likelihood is Bartlett-correctable. Annals of Statistics 19, 1053–1061.

DiCiccio, T. and Romano, J. P. (1989). The automatic percentile method: accurate confidence limits in parametric models. Canadian Journal of Statistics 17, 155–169.

DiCiccio, T. and Romano, J. (1990). Nonparametric confidence limits by resampling and least favorable distributions. International Statistical Review 58, 59–76.

DiCiccio, T. and Stern, S. (1994). Frequentist and Bayesian Bartlett correction of test statistics based on adjusted profile likelihoods. Journal of the Royal Statistical Society Series B 56, 397–408.

Dobson, A. (1990). An Introduction to Generalized Linear Models. Chapman & Hall, London.

Doksum, K. A. and Yandell, B. S. (1984). Tests for exponentiality. In Handbook of Statistics (Krishnaiah and Sen, editors), Vol. 4, 579–611.

Donoghue, J. (2004). Implementing Shaffer’s multiple comparison procedure for a large number of groups. To appear in Recent Developments in Multiple Comparison Procedures, IMS Lecture Notes Monograph Series.

Donoho, D. (1988). One-sided inference about functionals of a density. Annals of Statistics 16, 1390–1420.

Draper, D. (1981). Rank-Based Robust Analysis of Linear Models, Ph.D. Thesis, Dept. of Statistics, University of California, Berkeley.

Draper, D. (1983). Rank-Based Robust Analysis of Linear Models. I. Exposition and Background, Tech. Report No. 17, Dept. of Statistics, University of California, Berkeley.

Drost, F. (1988). Asymptotics for Generalized Chi-Square Goodness-of-Fit Tests. Centrum voor Wiskunde en Informatica 48, Amsterdam.

Drost, F. (1989). Generalized chi-square goodness-of-fit tests for location-scale models when the number of classes tends to infinity. Annals of Statistics 17, 1285–1300.

Dudley, R. (1989). Real Analysis and Probability. Wadsworth, Belmont.

Dudoit, S., Shaffer, J. P. and Boldrick, J. (2003). Multiple hypothesis testing in microarray experiments. Statistical Science 18, 71–103.

Dümbgen, L. (1998). New goodness-of-fit tests and their application to nonparametric confidence sets. Annals of Statistics 26, 288–314.

Duncan, D. B. (1955). Multiple range and multiple F-tests. Biometrics 11, 1–42.

Durbin, J. (1970). On Birnbaum’s theorem on the relation between sufficiency, conditionality, and likelihood. Journal of the American Statistical Association 65, 395–398.

Durbin, J. (1973). Distribution theory for tests based on the sample distribution function. SIAM, Philadelphia, PA.

Durbin, J. and Knott, M. (1972). Components of Cramér-von Mises statistics. Part I. Journal of the Royal Statistical Society B 34, 290–307.

Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1953). Sequential decision problems for processes with continuous time parameter. Testing hypotheses. Annals of Mathematical Statistics 24, 254–264.

Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and the classical multinomial estimator. Annals of Mathematical Statistics 27, 642–669.

Dvoretzky, A., Wald, A. and Wolfowitz, J. (1951). Elimination of randomization in certain statistical decision procedures and zero-sum two-person games. Annals of Mathematical Statistics 22, 1–21.

Eaton, M. (1983). Multivariate Statistics. John Wiley, New York.

Eaton, M. (1989). Group Invariance Applications in Statistics. Institute of Statistical Mathematics, Hayward, CA.

Edelman, D. (1990). An inequality of optimal order for the tail probabilities of the T statistic under symmetry. Journal of the American Statistical Association 85, 120–122.

Edgeworth, F. Y. (1885). Methods of Statistics, Jubilee volume of the Statist. Soc., E. Stanford, London.

Edgeworth, F. Y. (1905). The law of error. Proc. Camb. Philos. Soc. 20, 36–45.

Edgeworth, F. Y. (1908–09). On the probable errors of frequency constants. J. Roy. Statist. Soc. 71, 381–397, 499–512, 651–678; 72, 81–90. [Edgeworth’s work on maximum-likelihood estimation and its relation to the results of Fisher in the same area is reviewed by Pratt (1976). Stigler (1978) provides a systematic account of Edgeworth’s many other important contributions to statistics.]

Edgington, E. S. (1995). Randomization Tests, 3rd edition. Marcel Dekker, New York.

Edwards, A. W. F. (1963). The measure of association in a 2 × 2 table. Journal of the Royal Statistical Society Series B 126, 109–114.

Edwards, A. W. F. (1983). Fiducial distributions. In Encycl. of Statist. Sci., Vol. 3. John Wiley, New York.

Efron, B. (1969). Student’s t-test under symmetry conditions. Journal of the American Statistical Association 64, 1278–1302.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics 7, 1–26.

Efron, B. (1981). Nonparametric standard errors and confidence intervals (with discussion). Canadian Journal of Statistics 9, 139–172.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. SIAM, Philadelphia.

Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York.

Elfving, G. (1952). Sufficiency and completeness. Ann. Acad. Sci. Fennicae (A), No. 135.

Engelhardt, M. and Bain, L. J. (1977). Uniformly most powerful unbiased tests on the scale parameter of a gamma distribution with a nuisance shape parameter. Technometrics 19, 77–81.

Engelhardt, M. and Bain, L. J. (1978). Construction of optimal unbiased inference procedures for the parameters of the gamma distribution. Technometrics 20, 485–489.

Eubank, R. (1997). Testing goodness of fit with multinomial data. Journal of the American Statistical Association 92, 1084–1093.

Eubank, R. and LaRiccia, V. (1992). Asymptotic comparison of Cramér-von Mises and nonparametric function estimation techniques for testing goodness-of-fit. Annals of Statistics 20, 2071–2086.

Falk, M. and Kohne, W. (1984). A robustification of the sign test under mixing conditions. Annals of Statistics 12, 716–729.

Fan, J. (1996). Test of significance based on wavelet thresholding and Neyman’s truncation. Journal of the American Statistical Association 91, 674–688.

Fan, J. and Lin, S. (1998). Test of significance when data are curves. Journal of the American Statistical Association 93, 1007–1021.

Fan, J., Zhang, C. and Zhang, J. (2001). Generalized likelihood ratio statistics and Wilks phenomenon. Annals of Statistics 29, 153–193.

Faraway, J. and Sun, J. (1995). Simultaneous confidence bands for linear regression with heteroscedastic errors. Journal of the American Statistical Association 90, 1094–1098.

Farrell, R. (1985a). Multivariate Calculation: Use of the Continuous Groups. Springer, Berlin.

Farrell, R. (1985b). Techniques of Multivariate Calculation. Springer, Berlin.

Fears, T., Benichou, J. and Gail, M. (1996). A reminder of the fallibility of the Wald statistic. American Statistician 50, 226–227.

Feller, W. (1948). On the Kolmogorov–Smirnov limit theorems for empirical distributions. Annals of Mathematical Statistics 19, 177–189.

Feller, W. (1968). An Introduction to Probability Theory and its Applications, 3rd edition, Vol. 1. John Wiley, New York.

Feller, W. (1971). An Introduction to Probability Theory and its Applications, Vol. 2, 2nd edition. John Wiley, New York.

Fenstad, G. U. (1983). A comparison between the U and V tests in the Behrens-Fisher problem. Biometrika 70, 300–302.

Ferguson, T. S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York.

Ferguson, T. S. (1996). A Course in Large Sample Theory. Chapman & Hall, New York.

Fienberg, S. (1980). The Analysis of Cross-Classified Categorical Data, 2nd edition. MIT Press, Cambridge, Massachusetts.

Fienberg, S. and Tanur, J. (1996). Reconsidering the fundamental contributions of Fisher and Neyman in experimentation and sampling. International Statistical Review 64, 237–253.

Finch, P. D. (1979). Description and analogy in the practice of statistics (with discussion). Biometrika 66, 195–208.

Finner, H. (1994). Two-sided tests and one-sided confidence bounds. Annals of Statistics 22, 1502–1516.

Finner, H. (1999). Stepwise multiple test procedures and control of directional errors. Annals of Statistics 27, 274–289.

Finner, H. and Roters, M. (1998). Asymptotic comparison of step-down and step-up multiple test procedures based on exchangeable test statistics. Annals of Statistics 26, 505–524.

Finner, H. and Roters, M. (2001). On the false discovery rate and expected type I errors. Biometrical Journal 43, 995–1005.

Finney, D. J. (1948). The Fisher–Yates test of significance in 2 × 2 contingency tables. Biometrika 35, 145–156.

Finney, D. J., Latscha, R., Bennett, B., Hsu, P. and Horst, C. (1963, 1966). Tables for Testing Significance in a 2 × 2 Contingency Table, Cambridge U.P.

Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Phil. Trans. Roy. Soc. London Series A 222, 309–368.

Fisher, R. A. (1924). The conditions under which chi square measures the discrepancy between observation and hypothesis. Journal of the Royal Statistical Society 87, 442–450.

Fisher, R. A. (1925a). Theory of statistical estimation. Proc. Cambridge Phil. Soc. 22, 700–725. [These papers develop a theory of point estimation (based on the maximum likelihood principle) and the concept of sufficiency. The factorization theorem is given in a form which is formally weaker but essentially equivalent to (1.20). First use of term ancillary.]

Fisher, R. A. (1925b). Statistical Methods for Research Workers, 1st edition (14th edition, 1970), Oliver and Boyd, Edinburgh.

Fisher, R. A. (1928a). The general sampling distribution of the multiple correlation coefficient. Proc. Roy. Soc. Series A 121, 654–673. [Derives the noncentral χ²- and noncentral beta-distributions and the distribution of the sample multiple correlation coefficient for arbitrary values of the population multiple correlation coefficient.]

Fisher, R. A. (1928b). On a property connecting the χ² measure of discrepancy with the method of maximum likelihood. Atti de Congresso Internazionale dei Mathematici, Bologna 6, 94–100.
Fisher, R. A. (1930). Inverse probability. Proc. Cambridge Philos. Soc. 26, 528–535.
Fisher, R. A. (1934a). Statistical Methods for Research Workers, 5th and subsequent eds., Oliver and Boyd, Edinburgh, Section 21.02. [Proposes the conditional tests for the hypothesis of independence in a 2 × 2 table.]
Fisher, R. A. (1934b). Two new properties of mathematical likelihood. Proc. Roy. Soc. (A) 144, 285–307. [Introduces the idea of conditioning on ancillary statistics and applies it to the estimation of location parameters.]
Fisher, R. A. (1935a). The Design of Experiments, 1st edition (8th edition, 1966). Oliver and Boyd, Edinburgh. [Contains the basic ideas concerning permutation tests. In particular, points out how randomization provides a basis for inference and proposes the permutation version of the t-test as not requiring the assumption of normality.]
Fisher, R. A. (1935b). The logic of inductive inference (with discussion). Journal of the Royal Statistical Society 98, 39–82.
Fisher, R. A. (1936). Uncertain inference. Proc. Amer. Acad. Arts and Sci. 71, 245–258.
Fisher, R. A. (1956a). On a test of significance in Pearson’s Biometrika tables (No. 11). Journal of the Royal Statistical Society (B) 18, 56–60. (See also the discussion of this paper by Neyman, Bartlett, and Welch in the same volume, pp. 288–302.) [Exhibits a negatively biased relevant subset for the Welch–Aspin solution of the Behrens–Fisher problem.]
Fisher, R. A. (1956b, 1959, 1973). Statistical Methods and Scientific Inference. Oliver and Boyd, Edinburgh (1956, 1959); Hafner, New York (1973). [In Chapter IV the author gives his views on hypothesis testing and in particular discusses his ideas on the Behrens-Fisher problem. Contains Fisher’s last comprehensive statement of his views on many topics, including ancillarity and the Behrens–Fisher problem.]
Fisher, R. A. (1971–1973). Collected Papers (J. H. Bennett, ed.), University of Adelaide.
Fisher, R. A. (1973). Statistical Methods and Scientific Inference, 3rd edition, Hafner, New York.
Folks, J. L. and Chhikara, R. S. (1978). The inverse Gaussian distribution and its statistical applications—a review (with discussion). Journal of the Royal Statistical Society Series B 40, 263–289.
Forsythe, A. and Hartigan, J. A. (1970). Efficiency of confidence intervals generated by repeated subsample calculations. Biometrika 57, 629–639.
Fourier, J. B. J. (1826). Recherches Statistiques sur la Ville de Paris et le Département de la Seine, Vol. 3.
Franck, W. E. (1981). The most powerful invariant test of normal versus Cauchy with applications to stable alternatives. Journal of the American Statistical Association 76, 1002–1005.

Fraser, D. A. S. (1953). Canadian Journal of Mathematics 6, 42–45.
Fraser, D. A. S. (1956). Sufficient statistics with nuisance parameters. Annals of Mathematical Statistics 27, 838–842.
Fraser, D. (1996). Comment on “Pivotal inference and the fiducial argument.” International Statistical Review 64, 231–235.
Freedman, D. and Lane, D. (1982). Significance testing in a nonstochastic setting. In Festschrift for Erich L. Lehmann (Bickel, Doksum, and Hodges, eds.), Wadsworth, Belmont, Calif.
Freeman, M. F. and Tukey, J. W. (1950). Transformations related to the angular and the square root. Annals of Mathematical Statistics 21, 607–611.
Freiman, J. A., Chalmers, T. C., Smith, H. and Kuebler, R. R. (1978). The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. New England Journal of Medicine 299, 690–694.
Frisén, M. (1980). Consequences of the use of conditional inference in the analysis of a correlated contingency table. Biometrika 67, 23–30.
Fuller, W. (1996). Introduction to Statistical Time Series, 2nd Edition, John Wiley, New York.
Gabriel, K. R. (1964). A procedure for testing the homogeneity of all sets of means in analysis of variance. Biometrics 20, 459–477.
Gabriel, K. R. and Hall, W. J. (1983). Rerandomization inference on regression and shift effects: Computationally feasible methods. Journal of the American Statistical Association 78, 827–836.
Gabriel, K. R. and Hsu, C. F. (1983). Evaluation of the power of rerandomization tests, with application to weather modification experiments. Journal of the American Statistical Association 78, 766–775.
Galambos, J. (1982). Exponential distribution. In Encycl. Statist. Sci., Vol. 2, John Wiley, New York.
Gan, L. and Jiang, J. (1999). A test for global maximum. Journal of the American Statistical Association 94, 847–854.
Garside, G. R. and Mack, C. (1976). Actual type 1 error probabilities for various tests in the homogeneity case of the 2 × 2 contingency table. American Statistician 30, 18–21.
Gart, J. J. (1970). Point and interval estimation of the common odds ratio in the combination of 2 × 2 tables with fixed marginals. Biometrika 57, 471–475.
Garthwaite, P. (1996). Confidence intervals from randomization tests. Biometrics 52, 1387–1393.
Gastwirth, J. L. and Rubin, H. (1971). Effect of dependence on the level of some one-sample tests. Journal of the American Statistical Association 66, 816–820.
Gauss, C. F. (1809). Theoria motus corporum coelestium in sectionibus conicis solem ambientium. Hamburg.
Gauss, C. F. (1816). Bestimmung der Genauigkeit der Beobachtungen. Z. Astron. and Verw. Wiss 1. (Reprinted in Gauss’ collected works, Vol. 4, pp. 109–119.)

Gavarret, J. (1840). Principes Généraux de Statistique Médicale, Paris.
George, E. I. and Casella, G. (1994). An empirical Bayes confidence report. Statistica Sinica 4, 617–638.
Ghosh, J. (1961). On the relation among shortest confidence intervals of different types. Calcutta Statist. Assoc. Bull. 147–152.
Ghosh, J., Morimoto, H. and Yamada, S. (1981). Neyman factorization and minimality of pairwise sufficient subfields. Annals of Statistics 9, 514–530.
Ghosh, M. (1948). On the problem of similar regions. Sankhyā 8, 329–338.
Gibbons, J. (1986). Ranking procedures. In Encycl. Statist. Sci. 7, 588–592.
Gibbons, J. (1988). Selection procedures. In Encycl. Statist. Sci. 8, 337–345.
Gibbons, J. and Chakraborti, S. (1992). Nonparametric statistical inference, 3rd edition. Marcel Dekker, New York.
Giesbrecht, F. and Gumpertz, M. (2004). Planning, Construction, and Statistical Analysis of Comparative Experiments. John Wiley, New York.
Giné, E. (1997). Lectures on Some Aspects of the Bootstrap. École d’Été de Calcul de Probabilités de Saint-Flour.
Giné, E. and Zinn, J. (1989). Necessary conditions for the bootstrap of the mean. Annals of Statistics 17, 684–691.
Giri, N. and Kiefer, J. (1964). Local and asymptotic minimax properties of multivariate tests. Annals of Mathematical Statistics 35, 21–35.
Giri, N., Kiefer, J. and Stein, C. M. (1963). Minimax character of Hotelling’s T² test in the simplest case. Annals of Mathematical Statistics 34, 1524–1535.
Girshick, M. A., Mosteller, F. and Savage, L. J. (1946). Unbiased estimates for certain binomial sampling problems with applications. Annals of Mathematical Statistics 17, 13–23. [Problem 4.12.]
Glaser, R. E. (1976). The ratio of the geometric mean to the arithmetic mean for a random sample from a gamma distribution. Journal of the American Statistical Association 71, 481–487.
Glaser, R. E. (1982). Bartlett’s test of homogeneity of variances. Encycl. Statist. Sci. 1, 189–191.
Gleser, L. J. (1985). Exact power of goodness-of-fit tests of Kolmogorov type for discontinuous distributions. Journal of the American Statistical Association 80, 954–958.
Gleser, L. J. and Hwang, J. (1987). The nonexistence of 100(1 − α)% confidence sets of finite expected diameter in errors-in-variables and related models. Annals of Statistics 15, 1351–1362.
Gokhale, D. V. and Johnson, N. S. (1978). A class of alternatives to independence in contingency tables. Journal of the American Statistical Association 73, 800–804.
Good, P. (1994). Permutation Tests, A Practical Guide to Resampling Methods for Testing Hypotheses. Springer-Verlag, New York.

Goodman, L. A. and Kruskal, W. (1954, 1959). Measures of association for cross classification. Journal of the American Statistical Association 49, 732–764; 54, 123–163.
Goutis, C. and Casella, G. (1991). Improved invariant confidence intervals for a normal variance. Annals of Statistics 19, 2015–2031.
Goutis, C. and Casella, G. (1992). Increasing the confidence in Student’s t interval. Annals of Statistics 20, 1501–1513.
Graybill, F. A. (1976). Theory and Application of the Linear Model. Duxbury Press, North Scituate, Mass.
Green, B. F. (1977). A practical interactive program for randomization tests of location. American Statistician 31, 37–39.
Greenwood, P. and Nikulin, M. (1996). A Guide to Chi-Squared Testing. John Wiley, New York.
Grenander, U. (1981). Abstract Inference. John Wiley, New York.
Groeneboom, P. (1980). Large Deviations and Asymptotic Efficiencies. Mathematisch Centrum, Amsterdam, The Netherlands.
Groeneboom, P. and Oosterhoff, J. (1981). Bahadur efficiency and small sample efficiency. International Statistical Review 49, 127–141.
Guenther, W. C. (1978). Some remarks on the runs tests and the use of the hypergeometric distribution. American Statistician 32, 71–73.
Gupta, A. and Vermeire, L. (1986). Locally optimal tests for multiparameter hypotheses. Journal of the American Statistical Association 81, 819–825.
Haberman, S. J. (1974). The Analysis of Frequency Data. University of Chicago Press.
Haberman, S. J. (1982). Association, Measures of. In Encycl. Statist. Sci., Vol. 1. John Wiley, New York, 130–136.
Hájek, J. (1962). Asymptotically most powerful rank order tests. Annals of Mathematical Statistics 33, 1124–1147.
Hájek, J. (1967). On basic concepts of statistics. In Proc. Fifth Berkeley Symp. Math. Statist. and Probab., Univ. of Calif. Press, Berkeley.
Hájek, J. (1970). A characterization of limiting distributions of regular estimates. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 14, 323–330.
Hájek, J. (1972). Local asymptotic minimax and admissibility in estimation. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability 1, 175–194.
Hájek, J. and Šidák, Z. (1967). Theory of Rank Tests. C.S.A.V. Prague and Academic Press.
Hájek, J., Šidák, Z. and Sen, P. (1999). Theory of Rank Tests, 2nd edition. Academic Press, San Diego.
Hald, A. (1990). A History of Probability and Statistics (and Their Applications Before 1750). John Wiley, New York.

Hald, A. (1998). A History of Mathematical Statistics (from 1750 to 1930). John Wiley, New York.
Hall, P. (1982). Improving the normal approximation when constructing one-sided confidence intervals for binomial or Poisson parameters. Biometrika 69, 647–652.
Hall, P. (1986). On the bootstrap and confidence intervals. Annals of Statistics 14, 1431–1452.
Hall, P. (1990). Asymptotic properties of the bootstrap for heavy-tailed distributions. Annals of Probability 18, 1342–1360.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.
Hall, P. and Jing, B. (1995). Uniform coverage bounds for confidence intervals and Berry-Esseen theorems for Edgeworth expansions. Annals of Statistics 23, 363–375.
Hall, P. and Martin, M. (1988). On bootstrap resampling and iteration. Biometrika 75, 661–671.
Hall, P. and Padmanabhan, A. (1997). Adaptive inference and the two sample scale problem. Technometrics 39, 412–422.
Hall, P. and Welsh, A. H. (1983). A test for normality based on the empirical characteristic function. Biometrika 70, 485–489.
Hall, W. and Mathiason, D. (1990). On large-sample estimation and testing in parametric models. International Statistical Review 58, 77–97.
Hall, W., Wijsman, R. and Ghosh, J. (1965). The relationship between sufficiency and invariance with applications in sequential analysis. Annals of Mathematical Statistics 36, 575–614.
Hallin, M., Taniguchi, M., Serroukh, A. and Choy, K. (1999). Local asymptotic normality for regression models with long memory disturbance. Annals of Statistics 27, 2054–2080.
Halmos, P. R. (1974). Measure Theory. Springer, New York.
Halmos, P. R. and Savage, L. J. (1949). Application of the Radon–Nikodym theorem to the theory of sufficient statistics. Annals of Mathematical Statistics 20, 225–241. [First abstract treatment of sufficient statistics; the factorization theorem. Problem 10.]
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, Princeton.
Hartigan, J. A. (1969). Using subsample values as typical values. Journal of the American Statistical Association 64, 1303–1317.
Harville, D. A. (1978). Alternative formulations and procedures for the two-way mixed model. Biometrics 34, 441–454.
Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman & Hall, London.
Hastie, T. and Tibshirani, R. (1997). Generalized additive models. In Encycl. Statist. Sci. update 1, 261–269.

Hayter, A. and Hsu, J. (1994). On the relationship between stepwise decision procedures and confidence sets. Journal of the American Statistical Association 89, 128–136.
Hedges, L. and Olkin, I. (1985). Statistical methods for meta-analysis. Academic Press, Orlando.
Hegazy, Y. A. S. and Green, J. R. (1975). Some new goodness-of-fit tests using order statistics. Applied Statistics 24, 299–308.
Hegemann, V. and Johnson, D. E. (1976). The power of two tests for nonadditivity. Journal of the American Statistical Association 71, 945–948.
Heritier, S. and Ronchetti, E. (1994). Robust bounded-influence tests in general parametric models. Journal of the American Statistical Association 89, 897–904.
Hettmansperger, T. P. (1984). Statistical Inference Based on Ranks. John Wiley, New York.
Hettmansperger, T. and McKean, J. (1998). Robust Nonparametric Statistical Methods. Arnold, London.
Hettmansperger, T., McKean, J. and Sheather, S. (2000). Robust nonparametric methods. Journal of the American Statistical Association 95, 1308–1312.
Hettmansperger, T., Möttönen, J. and Oja, H. (1997). Affine-invariant multivariate one-sample signed-rank tests. Journal of the American Statistical Association 92, 1591–1600.
Hinkley, D. (1977). Conditional inference about a normal mean with known coefficient of variation. Biometrika 64, 105–108.
Hinkley, D. and Runger, G. (1984). The analysis of transformed data (with discussion). Journal of the American Statistical Association 79, 302–320.
Hipp, C. (1974). Sufficient statistics and exponential families. Annals of Statistics 2, 1283–1292.
Hobson, E. W. (1927). Theory of Functions of a Real Variable, 3rd edition, Vol. 1. Cambridge University Press, p. 194.
Hochberg, Y. and Tamhane, A. (1987). Multiple Comparison Procedures. John Wiley, New York.
Hocking, R. R. (1973). A discussion of the two-way mixed model. American Statistician 27, 148–152.
Hocking, R. R. (2003). Methods and Applications of Linear Models, 2nd edition. John Wiley, New York.
Hocking, R. R. and Speed, F. M. (1975). A full rank analysis of some linear model problems. Journal of the American Statistical Association 70, 706–712.
Hodges, J. L., Jr. (1957). The significance probability of the Smirnov two-sample test. Arkiv für Matematik 3, 469–486.
Hodges, J. L., Jr. and Lehmann, E. L. (1954). Testing the approximate validity of statistical hypotheses. Journal of the Royal Statistical Society Series B 16, 261–268.

Hodges, J. L., Jr. and Lehmann, E. L. (1956). The efficiency of some nonparametric competitors of the t-test. Annals of Mathematical Statistics 27, 324–335.
Hodges, J. L., Jr. and Lehmann, E. L. (1970). Deficiency. Annals of Mathematical Statistics 41, 783–801.
Hoeffding, W. (1951). ‘Optimum’ nonparametric tests. In Proc. 2nd Berkeley Symposium on Mathematical Statistics and Probability, Univ. of Calif. Press, Berkeley, 83–92.
Hoeffding, W. (1952). The large-sample power of tests based on permutations of observations. Annals of Mathematical Statistics 23, 169–192.
Hoeffding, W. (1956). The role of assumptions in statistical decisions. In Proc. Third Berkeley Symposium on Mathematical Statistics and Probability, edited by Neyman, University of California Press, Berkeley, CA.
Hoeffding, W. (1965). Asymptotically optimal tests for multinomial distributions (with discussion). Annals of Mathematical Statistics 36, 369–408.
Hoeffding, W. (1977). Some incomplete and boundedly complete families of distributions. Annals of Statistics 5, 278–291.
Hoel, P. G. (1948). On the uniqueness of similar regions. Annals of Mathematical Statistics 19, 66–71. [Theorem 4.3.1 under regularity assumptions.]
Hogg, R. V. (1972). More light on the kurtosis and related statistics. Journal of the American Statistical Association 67, 422–424.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6, 65–70.
Holm, S. (1999). Multiple confidence sets based on stagewise tests. Journal of the American Statistical Association 94, 489–495.
Hooper, P. M. (1982a). Sufficiency and invariance in confidence set estimation. Annals of Statistics 10, 549–555.
Hooper, P. M. (1982b). Invariant confidence sets with smallest expected measure. Annals of Statistics 10, 1283–1294.
Hotelling, H. (1931). The generalization of Student’s ratio. Annals of Mathematical Statistics 2, 360–378.
Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321–377. [One of the early papers making explicit use of invariance considerations.]
Hotelling, H. (1953). New light on the correlation coefficient and its transforms. Journal of the Royal Statistical Society Series B 15, 193–224.
Hotelling, H. (1961). The behavior of some standard statistical tests under non-standard conditions. Proceedings of the Fourth Berkeley Symposium of Mathematical Statistics Prob. 1, 319–360.
Hsu, C. T. (1940). On samples from a normal bivariate population. Annals of Mathematical Statistics 11, 410–426.
Hsu, J. (1996). Multiple Comparisons: Theory and Methods. Chapman & Hall, London.

Hsu, P. (1941). Analysis of variance from the power function standpoint. Biometrika 32, 62–69. [Shows that the test (7.7) is UMP among all tests whose power function depends only on the noncentrality parameter.]
Hsu, P. (1945). On the power function of the E²-test and the T²-test. Annals of Mathematical Statistics 16, 278–286. [Obtains a result on best average power for the T²-test analogous to that of Chapter 7, Problem 7.5.]
Huang, J. S. and Ghosh, M. (1982). A note on strong unimodality of order statistics. Journal of the American Statistical Association 77, 929–930.
Huber, P. J. (1965). A robust version of the probability ratio test. Annals of Mathematical Statistics 36, 1753–1758.
Huber, P. J. (1973). Robust regression: Asymptotics, conjectures and Monte Carlo. Annals of Statistics 1, 799–821. [Obtains the robustness conditions (11.55) and (11.57); related results are given by Eicker (1963).]
Huber, P. J. (1981). Robust Statistics. John Wiley, New York.
Huber, P. J. and Strassen, V. (1973, 1974). Minimax tests and the Neyman–Pearson lemma for capacities. Annals of Statistics 1, 251–263; 2, 223–224.
Hunt, G. and Stein, C. M. (1946). Most stringent tests of statistical hypotheses. [In this paper, which unfortunately was never published, a general theory of invariance is developed for hypothesis testing.]
Hwang, J. and Brown, L. D. (1991). Estimated confidence under the validity constraint. Annals of Statistics 19, 1964–1977.
Hwang, J. and Casella, G. (1982). Minimax confidence sets for the mean of a multivariate normal distribution. Annals of Statistics 10, 868–881.
Hwang, J., Casella, G., Robert, C., Wells, M. and Farrell, R. (1992). Estimation of accuracy in testing. Annals of Statistics 20, 490–509.
Ibragimov, I. and Has’minskii, R. (1981). Statistical Estimation. Springer-Verlag, New York.
Ibragimov, J. A. (1956). On the composition of unimodal distributions (Russian). Teoriya Veroyatnostey 1, 283–288; Engl. transl., Theor. Probab. Appl. 1 (1956), 255–260.
Inglot, T., Kallenberg, W. and Ledwina, T. (1997). Data driven smooth tests for composite hypotheses. Annals of Statistics 25, 1222–1250.
Inglot, T., Kallenberg, W. and Ledwina, T. (2000). Vanishing shortcoming and asymptotic relative efficiency. Annals of Statistics 28, 215–238.
Inglot, T. and Ledwina, T. (1996). Asymptotic optimality of data-driven Neyman’s tests for uniformity. Annals of Statistics 24, 1982–2019.
Ingster, Y. (1993). Asymptotically minimax hypothesis tests for nonparametric alternatives I, II, III. Math. Methods Statist. 2, 85–114, 171–189, 249–268.
Ingster, Y. and Suslina, I. (2003). Nonparametric Goodness-of-Fit Testing Under Gaussian Models. Springer Lecture Notes in Statistics 169, Springer-Verlag, New York.
Isaacson, S. L. (1951). On the theory of unbiased tests of simple statistical hypotheses specifying the values of two or more parameters. Annals of Mathematical Statistics 22, 217–234. [Introduces type D and E tests.]

Jagers, P. (1980). Invariance in the linear model—an argument for χ² and F in nonnormal situations. Statistics 11, 455–464.
James, A. T. (1954). Normal multivariate analysis and the orthogonal group. Annals of Mathematical Statistics 25, 40–75.
James, G. S. (1951). The comparison of several groups of observations when the ratios of the population variances are unknown. Biometrika 38, 324–329.
James, G. S. (1954). Tests of linear hypotheses in univariate and multivariate analysis when the ratios of the population variances are unknown. Biometrika 41, 19–43.
Janssen, A. (1995). Principal component decomposition of non-parametric tests. Probability Theory and Related Fields 101, 193–209.
Janssen, A. (1997). Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem. Statistics and Probability Letters 36, 9–21.
Janssen, A. (1999). Testing nonparametric statistical functionals with applications to rank tests. Journal of Statistical Planning and Inference 81, 71–93, Erratum 92.
Janssen, A. (2000a). Global power functions of goodness of fit tests. Annals of Statistics 28, 239–253.
Janssen, A. (2000b). Nonparametric bioequivalence for tests for statistical functionals and their efficient power functions. Statistics and Decisions 18, 49–78.
Janssen, A. (2003). Which power of goodness of fit tests can really be expected: intermediate versus contiguous alternatives. Statistics and Decisions 21, 301–325.
Janssen, A. and Pauls, T. (2003). How do bootstrap and permutation tests work? Annals of Statistics 31, 768–806.
Jensen, J. (1993). A historical sketch and some new results on the improved log likelihood ratio statistic. Scandinavian Journal of Statistics 20, 1–15.
Jockel, K. (1986). Finite sample properties and asymptotic efficiency of Monte Carlo tests. Annals of Statistics 14, 336–347.
Johansen, S. (1979). Introduction to the Theory of Regular Exponential Families, Lecture Notes, No. 3, Inst. of Math. Statist., University of Copenhagen.
Johansen, S. (1980). The Welch–James approximation to the distribution of the residual sum of squares in a weighted linear regression. Biometrika 67, 85–92.
John, R. D. and Robinson, J. (1983a). Edgeworth expansions for the power of permutation tests. Annals of Statistics 11, 625–631.
John, R. D. and Robinson, J. (1983b). Significance levels and confidence intervals for permutation tests. Journal of Statistical Computation and Simulation 16, 161–173.
Johnson, N. L. and Kotz, S. (1969). Distributions in Statistics: Discrete Distributions. Houghton Mifflin, New York.

Johnson, N. L. and Kotz, S. (1970). Distributions in Statistics: Continuous Univariate Distributions (2 vols.). Houghton Mifflin, New York.
Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995). Continuous Univariate Distributions 2, 2nd edition. John Wiley, New York.
Johnson, N. L., Kotz, S. and Kemp, A. (1992). Univariate Discrete Distributions, 2nd edition. John Wiley, New York.
Joshi, V. (1982). Admissibility. In Encycl. Statist. Sci. 1, 25–29.
Kabe, D. G. and Laurent, A. G. (1981). On some nuisance parameter free uniformly most powerful tests. Biometrics Journal 23, 245–250.
Kakutani, S. (1948). On the equivalence of infinite product measures. Annals of Mathematical Statistics 49, 214–224.
Kalbfleisch, J. D. (1975). Sufficiency and conditionality (with discussion). Biometrika 62, 251–259.
Kallenberg, W. (1982). Chernoff efficiency and deficiency. Annals of Statistics 10, 583–594.
Kallenberg, W. (1983). Intermediate efficiency, theory and examples. Annals of Statistics 11, 170–182.
Kallenberg, W. C. M. et al. (1984). Testing Statistical Hypotheses: Worked Solutions, CWI Syllabus No. 3, Centrum voor Wiskunde en Informatica, Amsterdam.
Kallenberg, W. and Ledwina, T. (1995). Consistency and Monte Carlo simulation of a data driven version of smooth goodness-of-fit tests. Annals of Statistics 23, 1594–1608.
Kallenberg, W. and Ledwina, T. (1999). Data-driven rank tests for independence. Journal of the American Statistical Association 94, 285–301.
Kallenberg, W. C. M., Oosterhoff, J. and Schriever, B. F. (1985). The number of classes in chi-squared goodness-of-fit tests. Journal of the American Statistical Association 80, 959–968.
Kanoh, S. and Kusunoki, U. (1984). One sided simultaneous bounds in linear regression. Journal of the American Statistical Association 79, 715–719.
Kappenman, R. F. (1975). Conditional confidence intervals for the double exponential distribution parameters. Technometrics 17, 233–235.
Kariya, T. (1981). Robustness of multivariate tests. Annals of Statistics 9, 1267–1275.
Kariya, T. (1985). Testing in the Multivariate Linear Model. Kinokuniya, Tokyo.
Kariya, T. and Sinha, B. (1985). Nonnull and optimality robustness of some tests. Annals of Statistics 13, 1182–1197.
Karlin, S. (1957). Pólya type distributions. II. Annals of Mathematical Statistics 28, 281–308.
Karlin, S. (1968). Total Positivity, Vol. I, Stanford U.P., Stanford, Calif. [Properties of TP distributions, including Problems 3.50–3.53.]

Karlin, S. and Rubin, H. (1956). The theory of decision procedures for distributions with monotone likelihood ratio. Annals of Mathematical Statistics 27, 272–299. [General theory of families with monotone likelihood ratio, including Theorem 3.4.2. For further developments of this theory, see Brown, Cohen, and Strawderman (1976).]
Karlin, S. and Taylor, H. (1975). A First Course in Stochastic Processes, 2nd ed., Academic Press, San Diego, CA.
Kempthorne, O. (1955). The randomization theory of experimental inference. Journal of the American Statistical Association 50, 946–967.
Kempthorne, P. (1988). Controlling risks under different loss functions: The compromise decision problem. Annals of Statistics 16, 1594–1608.
Kendall, M. G. (1970). Rank Correlation Methods, 4th edition. Griffin, London.
Kendall, M. G. and Stuart, A. (1979). The Advanced Theory of Statistics, 4th edition, Vol. 2. MacMillan, New York.
Kent, J. and Quesenberry, C. P. (1982). Selecting among probability distributions used in reliability. Technometrics 24, 59–65.
Khmaladze, E. (1993). Goodness of fit problem and scanning innovation martingales. Annals of Statistics 21, 798–829.
Kiefer, J. (1958). On the nonrandomized optimality and randomized nonoptimality of symmetrical designs. Annals of Mathematical Statistics 29, 675–699. [Problem 8.6(ii).]
Kiefer, J. (1977a). Conditional confidence statements and confidence estimators (with discussion). Journal of the American Statistical Association 72, 789–827. [The key paper in Kiefer’s proposed conditional confidence approach.]
Kiefer, J. (1977b). Conditional confidence and estimated confidence in multidecision problems (with applications to selections and ranking). Multivariate Analysis IV, 143–158.
Kiefer, J. and Schwartz, R. (1965). Admissible Bayes character of T²-, R²-, and other fully invariant tests for classical multivariate normal problems. Annals of Mathematical Statistics 36, 747–770.
King, M. L. and Hillier, G. H. (1985). Locally best invariance tests of the error covariance matrix of the linear regression model. Journal of the Royal Statistical Society Series B 47, 98–102.
Knight, K. (1989). On the bootstrap of the sample mean in the infinite variance case. Annals of Statistics 17, 1168–1175.
Koehn, U. and Thomas, D. L. (1975). On statistics independent of a sufficient statistic: Basu’s Lemma. American Statistician 29, 40–41.
Kolassa, J. and McCullagh, P. (1990). Edgeworth series for lattice distributions. Annals of Statistics 18, 981–985.
Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. Giorn. Inst. Ital. Attuari 4, 83–91.
Kolmogorov, A. (1942). Sur l’estimation statistique des paramètres de la loi de Gauss. Bull. Acad. Sci. URSS Ser. Math. 6, 3–32. (Russian–French summary.) [Definition of sufficiency in terms of distributions for the parameters.]

Kolodziejczyk, S. (1935). An important class of statistical hypotheses. Biometrika 37, 161–190. [Discussion of the general linear univariate hypothesis from the likelihood-ratio point of view.]
Koopman, B. (1936). On distributions admitting a sufficient statistic. Trans. Amer. Math. Soc. 39, 399–409.
Korn, E., Troendle, J., McShane, L. and Simon, R. (2004). Controlling the number of false discoveries: Applications to high-dimensional genomic data. Journal of Statistical Planning and Inference 124, 379–398.
Koschat, M. (1987). A characterization of the Fieller solution. Annals of Statistics 15, 462–468.
Koshevnik, Y. and Levit, B. (1976). On a non-parametric analogue of the information matrix. Theory of Probability and its Applications 21, 738–753.
Kotz, S., Wang, Q., and Hung, K. (1990). Interrelations among various definitions of bivariate positive dependence. In Topics in Statistical Dependence. Block, Sampson and Savits, eds. (1990). IMS Lecture Notes 16, Hayward, CA.
Kowalski, J. (1995). Complete classes of tests for regularly varying distributions. Annals of the Institute of Statistical Mathematics 47, 321–350.
Koziol, J. A. (1983). Tests for symmetry about an unknown value based on the empirical distribution function. Communications in Statistics 12, 2823–2846.
Krafft, O. and Witting, H. (1967). Optimale tests under ungünstigsten Verteilungen. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 7, 289–302.
Kraft, C. (1955). Some conditions for consistency and uniform consistency of statistical procedures. Univ. of Calif. Publ. in Statist. 2, 125–142.
Kruskal, W. (1954). The monotonicity of the ratio of two non-central t density functions. Annals of Mathematical Statistics 25, 162–165.
Kruskal, W. H. (1957). Historical notes on the Wilcoxon unpaired two-sample test. Journal of the American Statistical Association 52, 356–360.
Kruskal, W. H. (1978). Significance, Tests of. In International Encyclopedia of Statistics, Free Press and Macmillan, New York and London.
Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. Annals of Statistics 17, 1217–1241.
Lahiri, S. N. (2003). Resampling Methods for Dependent Data. Springer, New York.
Lambert, D. (1985). Robust two-sample permutation tests. Annals of Statistics 13, 606–625.
Lambert, D. and Hall, W. (1982). Asymptotic lognormality of p-values. Annals of Statistics 10, 44–64.
Laplace, P. S. (1773). Mémoire sur l’inclinaison moyenne des orbites des comètes. Mem. Acad. Roy. Sci. Paris 7 (1776), 503–524.
Laplace, P. S. (1812). Théorie Analytique des Probabilités, Paris. (The 3rd edition of 1820 is reprinted as Vol. 7 of Laplace’s collected works.)

Lawless, J. F. (1972). Conditional confidence interval procedures for the location and scale parameters of the Cauchy and logistic distributions. Biometrika 59, 377–386.
Lawless, J. F. (1973). Conditional versus unconditional confidence intervals for the parameters of the Weibull distribution. Journal of the American Statistical Association 68, 655–669.
Lawless, J. F. (1978). Confidence interval estimation for the Weibull and extreme value distributions. Technometrics 20, 355–368.
Le Cam, L. (1953). On some asymptotic properties of maximum likelihood estimates and related Bayes estimates. In Univ. Calif. Publs. Statistics, Vol. 1, pp. 277–329, Univ. of California Press, Berkeley and Los Angeles.
Le Cam, L. (1956). On the asymptotic theory of estimation and testing hypotheses. Proc. 3rd Berkeley Symposium I, 129–156.
Le Cam, L. (1958). Les propriétés asymptotiques des solutions de Bayes. Publ. Inst. Statist. Univ. Paris VII (3–4), 17–35.
Le Cam, L. (1960). Locally asymptotically normal families of distributions. Univ. California Publ. Statist. 3, 37–98.
Le Cam, L. (1964). Sufficiency and approximate sufficiency. Annals of Mathematical Statistics 35, 1419–1455.
Le Cam, L. (1969). Theorie Asymptotique de la Decision Statistique. Presses de l’Université de Montreal.
Le Cam, L. (1970). On the assumptions used to prove asymptotic normality of maximum likelihood estimators. Annals of Mathematical Statistics 41, 802–828.
Le Cam, L. (1972). Limits of experiments. Proc. 6th Berkeley Symp. on Math. Stat. and Prob. I, 245–261.
Le Cam, L. (1979). On a theorem of J. Hájek. In Contributions to Statistics: J. Hájek Memorial Volume (Jureckova, ed.), Academia, Prague. [Rigorous and very general treatment of the large-sample theory of maximum-likelihood estimates, with a survey of the large previous literature on the subject.]
Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag, New York.
Le Cam, L. (1990). On the standard asymptotic confidence ellipsoids of Wald. International Statistical Review 58, 129–152.
Le Cam, L. and Yang, G. (2000). Asymptotics in Statistics, Some Basic Concepts, 2nd edition. Springer-Verlag, New York.
Ledwina, T. (1994). Data-driven version of Neyman’s smooth test of fit. Journal of the American Statistical Association 89, 1000–1005.
Léger, C. and Romano, J. P. (1990a). Bootstrap adaptive estimation: the trimmed-mean example. Canadian Journal of Statistics 18, 297–314.
Léger, C. and Romano, J. P. (1990b). Bootstrap choice of tuning parameters. Annals of the Institute of Statistical Mathematics 42, 709–735.

Lehmann, E. L. (1949). Some comments on large sample tests. In Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley. [Problem 13.26.]
Lehmann, E. L. (1950). Some principles of the theory of testing hypotheses. Annals of Mathematical Statistics 21, 1–26.
Lehmann, E. L. (1951a). A general concept of unbiasedness. Annals of Mathematical Statistics 22, 587–597. [Definition (1.8); Problems 1.2, 1.3, 1.4, 1.6, 1.7, and 1.14.]
Lehmann, E. L. (1951b). Consistency and unbiasedness of certain nonparametric tests. Annals of Mathematical Statistics 22, 165–179.
Lehmann, E. L. (1952a). Testing multiparameter hypotheses. Annals of Mathematical Statistics 23, 541–552.
Lehmann, E. L. (1952b). On the existence of least favorable distributions. Annals of Mathematical Statistics 23, 408–416.
Lehmann, E. L. (1955). Ordered families of distributions. Annals of Mathematical Statistics 26, 399–419. [Lemma 8.2.1; Problems 8.2, 8.9 (This problem is a corrected version of Theorem 8.5.1 of the paper in question. Thanks to R. Blumenthal for pointing out an error in the statement of this theorem in the paper.) and 8.10.]
Lehmann, E. L. (1958). Significance level and power. Annals of Mathematical Statistics 29, 1167–1176.
Lehmann, E. L. (1961). Some model I problems of selection. Annals of Mathematical Statistics 32, 990–1012.
Lehmann, E. L. (1980). An interpretation of completeness and Basu’s theorem. Journal of the American Statistical Association 76, 335–340. [Problem 5.70.]
Lehmann, E. L. (1985a). The Neyman-Pearson theory after 50 years. In Proc. Neyman–Kiefer Conference (Le Cam and Olshen, eds.), Wadsworth, Belmont, CA.
Lehmann, E. L. (1985b). The Neyman-Pearson Lemma. In Encycl. Stat. Sci. 6, 224–230.
Lehmann, E. L. (1993). The Fisher, Neyman-Pearson theories of testing hypotheses: one theory or two? Journal of the American Statistical Association 78, 1242–1249.
Lehmann, E. L. (1997). Testing statistical hypotheses: the story of a book. Statistical Science 12, 48–52.
Lehmann, E. L. (1998). Nonparametrics: Statistical Methods Based on Ranks, revised first edition. Prentice Hall, Upper Saddle River, New Jersey. [Previous edition by Holden-Day (1975).]
Lehmann, E. L. (1999). Elements of Large-Sample Theory. Springer-Verlag, New York.
Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation, Second Edition, Springer-Verlag, New York.

Lehmann, E. L. and Loh, W.-Y. (1990). Pointwise versus uniform robustness in some large-sample tests and confidence intervals. Scandinavian Journal of Statistics 17, 177–187.
Lehmann, E. L. and Rojo, J. (1992). Invariant directional orderings. Annals of Statistics 20, 2100–2110.
Lehmann, E. L. and Romano, J. P. (2005). Generalizations of the familywise error rate. Technical Report 2003-37, Department of Statistics, Stanford University, to appear in Annals of Statistics.
Lehmann, E. L., Romano, J. P., and Shaffer, J. P. (2003). On optimality of stepdown and stepup procedures. Technical Report 2003-12, Department of Statistics, Stanford University.
Lehmann, E. L. and Scheffé, H. (1950, 1955). Completeness, similar regions, and unbiased estimation. Sankhyā 10, 305–340; 15, 219–236. [Introduces the concept of completeness. Theorem 4.4.1 and applications.]
Lehmann, E. L. and Shaffer, J. P. (1979). Optimal significance levels for multistage comparison procedures. Annals of Statistics 7, 27–45.
Lehmann, E. L. and Stein, C. M. (1948). Most powerful tests of composite hypotheses. Annals of Mathematical Statistics 19, 495–516. [Theorem 3.8.1 and applications.]
Lehmann, E. L. and Stein, C. M. (1949). On the theory of some non-parametric hypotheses. Annals of Mathematical Statistics 20, 28–45. [Develops the theory of optimum permutation tests, Problem 8.33.]
Lehmann, E. L. and Stein, C. M. (1953). The admissibility of certain invariant statistical tests involving a translation parameter. Annals of Mathematical Statistics 24, 473–479.
Lentner, M. M. and Buehler, R. (1963). Some inferences about gamma parameters with an application to a reliability problem. Journal of the American Statistical Association 58, 670–677.
Levy, K. J. and Narula, S. C. (1974). Shortest confidence intervals for the ratio of two normal variances. Canadian Journal of Statistics 2, 83–87.
Lexis, W. (1875). Einleitung in die Theorie der Bevölkerungsstatistik, Strassburg.
Lexis, W. (1877). Zur Theorie der Massenerscheinungen in der Menschlichen Gesellschaft, Freiburg.
Liang, K. Y. (1984). The asymptotic efficiency of conditional likelihood methods. Biometrika 71, 305–313.
Liang, K. Y. and Self, S. G. (1985). Tests for homogeneity of odds ratio when the data are sparse. Biometrika 72, 353–358.
Lieberman, G. J. and Owen, D. B. (1961). Tables of the Hypergeometric Probability Distribution, Stanford University Press.
Lindley, D. V. (1957). A statistical paradox. Biometrika 44, 187–192.
Linnik, Y. V., Pliss, V. A. and Salaevskii, O. V. (1968). On the theory of Hotelling’s test (Russian). Dok. AN SSSR 168, 743–746.

Littell, R. C. and Louv, W. C. (1981). Confidence regions based on methods of combining test statistics. Journal of the American Statistical Association 76, 125–130.
Liu, H. and Berger, R. (1995). Uniformly more powerful one-sided tests for hypotheses about linear inequalities. Annals of Statistics 23, 55–72.
Liu, R. Y. and Singh, K. (1987). On a partial correction by the bootstrap. Annals of Statistics 15, 1713–1718.
Liu, R. Y. and Singh, K. (1992). Moving blocks jackknife and bootstrap capture weak dependence. In Exploring the Limits of Bootstrap, 225–248. Edited by LePage, R. and Billard, L., John Wiley, New York.
Liu, R. Y. and Singh, K. (1997). Notions of limiting P-values based on data depth and bootstrap. Journal of the American Statistical Association 92, 266–277.
Loh, W.-Y. (1984a). Strong unimodality and scale mixtures. Annals of the Institute of Statistical Mathematics 36, 441–450.
Loh, W.-Y. (1984b). Bounds on ARE’s for restricted classes of distributions defined via tail-orderings. Annals of Statistics 12, 685–701.
Loh, W.-Y. (1985). A new method for testing separate families of hypotheses. Journal of the American Statistical Association 80, 362–368.
Loh, W.-Y. (1987). Calibrating confidence coefficients. Journal of the American Statistical Association 82, 155–162.
Loh, W.-Y. (1989). Bounds on the size of the χ²-test of independence in a contingency table. Annals of Statistics 17, 1709–1722.
Loh, W.-Y. (1991). Bootstrap calibration for confidence interval construction and selection. Statistica Sinica 1, 479–495.
Loomis, L. H. (1953). An Introduction to Abstract Harmonic Analysis. Van Nostrand, New York.
Lorenzen, T. J. (1984). Randomization and blocking in the design of experiments. Communications in Statistics – Theory and Methods 13, 2601–2623.
Lou, W. (1996). On runs and longest run tests: a method of finite Markov chain embedding. Journal of the American Statistical Association 91, 1595–1601.
Low, M. (1997). On nonparametric confidence intervals. Annals of Statistics 25, 2547–2554.
Lyapounov, A. M. (1940). Sur les fonctions-vecteurs complètement additives, Izv. Akad. Nauk SSSR Ser. Mat. 4, 465–478.
Maatta, J. and Casella, G. (1987). Conditional properties of interval estimators of the normal variance. Annals of Statistics 15, 1372–1388.
Mack, G. A. and Skillings, J. H. (1980). A Friedman type rank test for main effects in a two-factor ANOVA. Journal of the American Statistical Association 75, 947–951.
Madansky, A. (1962). More on length of confidence intervals. Journal of the American Statistical Association 57, 586–599.
Mandelbaum, A. and Rüschendorf, L. (1987). Complete and symmetrically complete families of distributions. Annals of Statistics 15, 1229–1244.

Mann, H. and Wald, A. (1942). On the choice of the number of intervals in the application of the chi-square test. Annals of Mathematical Statistics 13, 306–317.
Mantel, N. (1987). Understanding Wald’s test for exponential families. American Statistician 41, 147–149.
Marasinghe, M. C. and Johnson, D. E. (1981). Testing subhypotheses in the multiplicative interaction model. Technometrics 23, 385–393.
Marcus, R., Peritz, E. and Gabriel, K. R. (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika 63, 655–660.
Marden, J. (1982a). Minimal complete classes of tests of hypotheses with multivariate one-sided alternatives. Annals of Statistics 10, 962–970.
Marden, J. (1982b). Combining independent noncentral chi-squared or F-tests. Annals of Statistics 10, 266–270.
Marden, J. (1985). Combining independent one-sided noncentral t or normal mean tests. Annals of Statistics 13, 1535–1553.
Marden, J. (1991). Sensitive and sturdy p-values. Annals of Statistics 19, 918–934.
Marden, J. (2000). Hypothesis testing: from p-values to Bayes factors. Journal of the American Statistical Association 95, 1316–1320.
Marden, J. and Muyot, M. (1995). Rank tests for main and interaction effects in analysis of variance. Journal of the American Statistical Association 90, 1388–1398.
Marden, J. and Perlman, M. (1980). Invariant tests for means with covariates. Annals of Statistics 8, 25–63.
Mardia, K. V. and Zemroch, P. J. (1978). Tables of the F- and Related Distributions with Algorithms. Academic Press, London. [Extensive tables of critical values for the central F- and related distributions.]
Maritz, J. S. (1979). A note on exact robust confidence intervals for location. Biometrika 66, 163–166. [Problem 5.46(ii).]
Marshall, A. W. and Olkin, I. (1979). Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.
Martín, A. and Tapia, J. (1998). On determining the p-value in 2 × 2 multinomial trials. Journal of Statistical Planning and Inference 69, 33–49.
Massart, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Annals of Probability 18, 1269–1283.
Massey, F. J. (1950). A note on the power of a non-parametric test. Annals of Mathematical Statistics 21, 440–443.
Mathew, M. and Sinha, B. (1988a). Optimum tests for fixed effects and variance components in balanced models. Journal of the American Statistical Association 83, 133–135.
Mathew, M. and Sinha, B. (1988b). Optimum tests in unbalanced two-way models without interaction. Annals of Statistics 16, 1727–1740.
Mattner, L. (1993). Some incomplete but boundedly complete location families. Annals of Statistics 21, 2158–2162.

Mattner, L. (1996). Complete order statistics in parametric models. Annals of Statistics 24, 1265–1282.
McCullagh, P. (1985). On the asymptotic distribution of Pearson’s statistic in linear exponential-family models. International Statistical Review 53, 61–67.
McCullagh, P. (1986). The conditional distribution of goodness-of-fit statistics for discrete data. Journal of the American Statistical Association 81, 104–107.
McCullagh, P. and Nelder, J. (1989). Generalized Linear Models, 2nd edition. Chapman & Hall, London.
McCulloch, C. and Searle, S. (2001). Generalized, Linear, and Mixed Models. John Wiley, New York.
McDonald, L. L., Davis, B. M. and Milliken, G. A. (1977). A nonrandomized unconditional test for comparing two proportions in 2 × 2 contingency tables. Technometrics 19, 145–158.
McKean, J. and Schrader, R. M. (1982). The use and interpretation of robust analysis of variance. In Modern Data Analysis (Launer and Siegel, eds.). Academic Press, New York.
Mee, R. (1990). Confidence intervals for probabilities and tolerance regions based on a generalization of the Mann-Whitney statistic. Journal of the American Statistical Association 85, 793–800.
Meeks, S. L. and D’Agostino, R. (1983). A note on the use of confidence limits following rejection of a null hypothesis. American Statistician 37, 134–136.
Meng, X. (1994). Posterior predictive p-values. Annals of Statistics 22, 1142–1160.
Michel, R. (1979). On the asymptotic efficiency of conditional tests for exponential families. Annals of Statistics 7, 1256–1263.
Milbrodt, H. and Strasser, H. (1990). On the asymptotic power of the two-sided Kolmogorov-Smirnov test. Journal of Statistical Planning and Inference 26, 1–23.
Millar, W. (1983). The minimax principle in asymptotic statistical theory. In Ecole d’Eté de Probabilités de Saint Flour XI 1981 (P. L. Hennequin, ed.), 75–266. Lecture Notes in Mathematics 976, Springer-Verlag, Berlin.
Millar, W. (1985). Nonparametric applications of an infinite dimensional convolution theorem. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 68, 545–556.
Miller, F. R., Neill, J. and Sherfey, B. (1998). Maximin clusters for near-replicate regression lack of fit tests. Annals of Statistics 26, 1411–1433.
Miller, F. L. and Quesenberry, C. (1979). Power studies of tests for uniformity II. Communications in Statist. Simulation. Comput. 8, 271–290.
Miller, J. (1977a). Asymptotic properties of maximum likelihood estimates in the mixed model of the analysis of variance. Annals of Statistics 5, 746–762.
Miller, R. G. (1977b). Developments in multiple comparisons 1966–1976. Journal of the American Statistical Association 72, 779–788.

Miller, R. G. (1981). Simultaneous Statistical Inference, 2nd edition. Springer, New York.
Miller, R. G. (1986). Beyond Anova. John Wiley, New York.
Miwa, T. and Hayter, T. (1999). Combining the advantages of one-sided and two-sided test procedures for comparing several treatment effects. Journal of the American Statistical Association 94, 302–307.
Montgomery, D. (2001). Design and Analysis of Experiments, 5th edition. John Wiley, New York.
Morgan, W. A. (1939). A test for the significance of the difference between the two variances in a sample from a normal bivariate population. Biometrika 31, 13–19.
Morgenstern, D. (1956). Einfache Beispiele zweidimensionaler Verteilungen, Mitteil. Math. Statistik 8, 234–235.
Mosteller, F. and Tukey, J. W. (1977). Data Analysis and Regression: A Second Course in Statistics, Addison-Wesley, MA.
Möttönen, J., Oja, H. and Tienari, J. (1997). On the efficiency of multivariate spatial sign and rank tests. Annals of Statistics 25, 542–552.
Mudholkar, G. S. (1983). Fisher’s z-transformation. Encyclopedia of Statistical Science 3, 130–135. (S. Kotz, N. L. Johnson, C. B. Read, eds.)
Müller, C. (1998). Optimum robust testing in linear models. Annals of Statistics 26, 1126–1146.
Murphy, S. and van der Vaart, A. (1997). Semiparametric likelihood ratio inference. Annals of Statistics 25, 1471–1509.
Nachbin, L. (1965). The Haar Integral. Van Nostrand, New York.
Naiman, D. Q. (1984a). Average width optimality of simultaneous confidence bounds. Annals of Statistics 12, 1199–1214.
Naiman, D. Q. (1984b). Optimal simultaneous confidence bounds. Annals of Statistics 12, 702–715.

Nandi, H. K. (1951). On type B1 and type B regions. Sankhyā 11, 13–22. [One of the cases of Theorem 4.4.1, under regularity assumptions.]
Neuhaus, G. (1979). Asymptotic theory of goodness of fit tests when parameters are present: A survey. Statistics 10, 479–494.
Neyman, J. (1923). On the application of probability theory to agricultural experiments. Essay on Principles. Section 9. Translated and edited by D. Dabrowska and T. Speed in (1990), Statistical Science 5, 465–480, with comments by D. Rubin. The Polish original appeared in Roczniki Nauk Rolniczych Tom X (1923), 1–51 (Annals of Agricultural Sciences).
Neyman, J. (1935a). Su un teorema concernente le cosiddette statistiche sufficienti. Giorn. Ist. Ital. Att. 6, 320–334.
Neyman, J. (1935b). Sur la vérification des hypothèses statistiques composées. Bull. Soc. Math. France 63, 246–266. [Defines, and shows how to derive, tests of type B, that is, tests which are LMP among locally unbiased tests in the presence of nuisance parameters.]

Neyman, J. (1937a). Outline of a theory of statistical estimation based on the classical theory of probability. Phil. Trans. Roy. Soc. Ser. A 236, 333–380.
Neyman, J. (1937b). Smooth test for goodness of fit. Skand. Aktuarietidskr. 20, 150–199.
Neyman, J. (1938a). L’estimation statistique traitée comme un problème classique de probabilité. Actualités Sci. et Ind. 739, 25–57.
Neyman, J. (1938b). Lectures and Conferences on Mathematical Statistics and Probability, 1st edition (2nd edition, 1952), Graduate School, U.S. Dept. of Agriculture, Washington.
Neyman, J. (1939). On statistics the distribution of which is independent of the parameters involved in the original probability law of the observed variables, Statist. Res. Mem. 2, 59–89. [Essentially Theorem 5.1.2 under regularity assumptions.]
Neyman, J. (1941a). On a statistical problem arising in routine analyses and in sampling inspection of mass distributions. Annals of Mathematical Statistics 12, 46–76. [Theory of tests of composite hypotheses that are locally unbiased and locally most powerful.]
Neyman, J. (1941b). Fiducial argument and the theory of confidence intervals. Biometrika 32, 128–150.
Neyman, J. (1949). Contribution to the theory of the χ² test. In Proc. Berkeley Symposium on Mathematical Statistics and Probability, Univ. of California Press, Berkeley, 239–273. [Gives a theory of χ² tests with restricted alternatives.]
Neyman, J. (1952). Lectures and Conferences on Mathematical Statistics, 2nd edition, Washington Graduate School, U.S. Dept. of Agriculture, 43–66. [An account of various approaches to the problem of hypothesis testing.]
Neyman, J. (1967). A Selection of Early Statistical Papers of J. Neyman, Univ. of California Press, Berkeley. [Puts forth the point of view that statistics is primarily concerned with how to behave under uncertainty rather than with determining the values of unknown parameters, with inductive behavior rather than with inductive inference.]
Neyman, J. and Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika 20A, 175–240, 263–295.
Neyman, J. and Pearson, E. S. (1933a). On the testing of statistical hypotheses in relation to probability a priori. Proc. Cambridge Phil. Soc. 29, 492–510.
Neyman, J. and Pearson, E. S. (1933b). On the problem of the most efficient tests of statistical hypotheses. Phil. Trans. Roy. Soc. Ser. A 231, 289–337.
Neyman, J. and Pearson, E. S. (1936a). Contributions to the theory of testing statistical hypotheses. I. Unbiased critical regions of type A and type A1. Statist. Res. Mem. 1, 1–37.
Neyman, J. and Pearson, E. S. (1936b). Sufficient statistics and uniformly most powerful tests of statistical hypotheses. Statist. Res. Mem. 1, 113–137. [Problem 3.4(ii).]

Neyman, J. and Pearson, E. S. (1936, 1938). Contributions to the theory of testing statistical hypotheses. Statist. Res. Mem. 1, 1–37; 2, 25–57. [Defines unbiasedness and determines both locally and UMP unbiased tests of certain classes of simple hypotheses. Discusses tests of types A, that is, tests which are LMP among locally unbiased tests when no nuisance parameters are present.]
Neyman, J. and Pearson, E. S. (1967). Joint Statistical Papers of J. Neyman and E. S. Pearson, Univ. of California Press, Berkeley. [In connection with the problem of hypothesis testing, suggests assigning weights for the various possible wrong decisions and the use of the minimax principle.]
Nicolaou, A. (1993). Bayesian intervals with good frequentist behaviour in the presence of nuisance parameters. Journal of the Royal Statistical Society Series B 55, 377–390.
Niederhausen, H. (1981). Sheffer polynomials for computing exact Kolmogorov–Smirnov and Renyi type distributions. Annals of Statistics 9, 923–944.
Nikitin, Y. (1995). Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press.
Noether, G. (1955). On a theorem of Pitman. Annals of Mathematical Statistics 26, 64–68.
Nogales, A. and Oyola, J. (1996). Some remarks on sufficiency, invariance and conditional independence. Annals of Statistics 24, 906–909.
Nogales, A., Oyola, J. and Pérez, P. (2000). Invariance, almost invariance and sufficiency. Statistica, LX, 277–286.
Nölle, G. and Plachky, D. (1967). Zur schwachen Folgenkompaktheit von Testfunktionen. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 8, 182–184.
Odén, A. and Wedel, H. (1975). Arguments for Fisher’s permutation test. Annals of Statistics 3, 518–520.
Olshen, R. A. (1973). The conditional level of the F-test. Journal of the American Statistical Association 68, 692–698.
Oosterhoff, J. and van Zwet, W. (1979). A note on contiguity and Hellinger distance. Contributions to Statistics, Reidel, Dordrecht-Boston, Mass. London, 157–166.
Owen, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 72, 45–58.
Owen, A. (1995). Nonparametric likelihood confidence bounds for a distribution function. Journal of the American Statistical Association 90, 516–521.
Owen, A. (2001). Empirical Likelihood. Chapman & Hall, New York.
Owen, D. B. (1985). Noncentral t-distribution. Encycl. Statist. Sci. 6, 286–290.
Pace, L. and Salvan, A. (1990). Best conditional tests for separate families of hypotheses. Journal of the Royal Statistical Society Series B 52, 125–134.
Pachares, J. (1961). Tables for unbiased tests on the variance of a normal population. Annals of Mathematical Statistics 32, 84–87.

Patel, J. K. and Read, C. B. (1982). Handbook of the Normal Distribution. Marcel Dekker, New York.
Paulson, E. (1941). On certain likelihood ratio tests associated with the exponential distribution. Annals of Mathematical Statistics 12, 301–306. [Discusses the power of the tests of Problem 5.15.]
Pawitan, Y. (2000). A reminder of the fallibility of the Wald statistic: likelihood explanation. American Statistician 54, 54–56.
Pearson, E. S. (1929). Some notes on sampling tests with two variables. Biometrika 21, 337–360.
Pearson, E. S. (1966). The Neyman–Pearson story: 1926–1934. In Research Papers in Statistics: Festschrift for J. Neyman (F. N. David, ed.), John Wiley, New York.
Pearson, E. S. and Hartley, H. O. (1972). Biometrika Tables for Statisticians. Cambridge University Press, Cambridge.
Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, Series 5 50, 157–175. (Reprinted in: Karl Pearson’s Early Statistical Papers, Cambridge University Press, 1956). [The χ²-test is proposed for testing a simple multinomial hypothesis, and the limiting distribution of the test criterion is obtained under the hypothesis. The test is extended to composite hypotheses but contains an error in the degrees of freedom of the limiting distribution; a correct solution for the general case was found by Fisher (1924a). Applications.]
Peisakoff, M. (1951). Transformation of Parameters, unpublished thesis. Princeton Univ. [Extends the Hunt-Stein theory of invariance to more general classes of decision problems; see Problem 1.11(ii). The theory is generalized further in Kiefer (1957, 1966) and Kudo (1955).]
Pena, E. (1998). Smooth goodness-of-fit tests for composite hypothesis in hazard based models. Annals of Statistics 26, 1935–1971.
Pereira, B. (1977). Discriminating among separate models: A bibliography. International Statistical Review 45, 163–172.
Peritz, E. (1965). On inferring order relations in analysis of variance. Biometrics 21, 337–344.
Perlman, M. (1969). One-sided testing problems in multivariate analysis. Annals of Mathematical Statistics 40, 549–567. [Correction: Annals of Mathematical Statistics 42 (1971), 1777.]
Perlman, M. (1972). On the strong consistency of approximate maximum likelihood estimators. Proceedings of the Sixth Berkeley Symposium in Mathematical Statistics 1, University of California Press, 263–281.
Perlman, M. and Wu, L. (1999). The Emperor’s new tests. Statistical Science 14, 355–381.
Pesarin, F. (2001). Multivariate Permutation Tests With Applications in Biostatistics. John Wiley, Chichester, England.

Peters, D. and Randles, R. (1991). A bivariate signed rank test for the two-sample location problem. Journal of the Royal Statistical Society Series B 53, 493–504.
Pfanzagl, J. (1967). A technical lemma for monotone likelihood ratio families. Annals of Mathematical Statistics 38, 611–613.
Pfanzagl, J. (1968). A characterization of the one parameter exponential family by existence of uniformly most powerful tests. Sankhyā Series A 30, 147–156.
Pfanzagl, J. (1974). On the Behrens-Fisher problem. Biometrika 61, 39–47.
Pfanzagl, J. (1979). On optimal median unbiased estimators in the presence of nuisance parameters. Annals of Statistics 7, 187–193.
Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1982). Contributions to a General Asymptotic Theory. Springer-Verlag, New York.
Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1985). Asymptotic Expansions for General Statistical Models. Springer-Verlag, New York, NY.
Piegorsch, W. W. (1985a). Admissible and optimal confidence bounds in simple linear regression. Annals of Statistics 13, 801–817.
Piegorsch, W. W. (1985b). Average width optimality for confidence bands in simple linear regression. Journal of the American Statistical Association 80, 692–697.
Pierce, D. A. (1973). On some difficulties in a frequency theory of inference. Annals of Statistics 1, 241–250.
Pitman, E. J. G. (1937, 1938a). Significance tests which may be applied to samples from any population, J. Roy. Statist. Soc. Suppl. 4, 119–130, 225–232; Biometrika 29, 322–335.
Pitman, E. J. G. (1938b). The estimation of the location and scale parameters of a continuous population of any given form. Biometrika 30, 391–421.
Pitman, E. J. G. (1939a). A note on normal correlation. Biometrika 31, 9–12. [Problem 5.39(i).]
Pitman, E. J. G. (1939b). Tests of hypotheses concerning location and scale parameters. Biometrika 31, 200–215. [In these papers the restriction to invariant procedures is introduced for estimation and testing problems involving location and scale parameters.]
Pitman, E. J. G. (1949). Lecture notes on nonparametric statistical inference, unpublished. [Develops the concept of relative asymptotic efficiency and applies it to several examples including the Wilcoxon test.]
Plackett, R. L. (1977). The marginal totals of a 2 × 2 table. Biometrika 64, 37–42. [Discusses the fact that the marginals of a 2 × 2 table supply some, but only little, information concerning the odds ratio. See also Barndorff-Nielsen (1978), Example 10.8.]
Plackett, R. L. (1981). The Analysis of Categorical Data, 2nd edition. MacMillan, New York.
Politis, D. N. and Romano, J. P. (1994a). The stationary bootstrap. Journal of the American Statistical Association 89, 1303–1313.

Politis, D. N. and Romano, J. P. (1994b). Large sample confidence regions based on subsamples under minimal assumptions. Annals of Statistics 22, 2031–2050.
Politis, D. N., Romano, J. P. and Wolf, M. (1999). Subsampling. Springer, New York.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer-Verlag, New York.
Pollard, D. (1997). Another look at differentiability in quadratic mean. In Festschrift for Lucien Le Cam, 305–314. Springer-Verlag, New York.
Polonik, W. (1999). Concentration and goodness-of-fit in higher dimensions: (Asymptotically) distribution-free methods. Annals of Statistics 27, 1210–1229.
Posten, H. O., Yeh, H. C. and Owen, D. B. (1982). Robustness of the two-sample t-test under violations of the homogeneity of variance assumption. Communications in Statistics 11, 109–126.
Pratt, J. W. (1958). Admissible one-sided tests for the mean of a rectangular distribution. Annals of Mathematical Statistics 29, 1268–1271.
Pratt, J. W. (1961a). Length of confidence intervals. Journal of the American Statistical Association 56, 549–567.
Pratt, J. W. (1961b). Review of Testing Statistical Hypotheses by E. L. Lehmann. Journal of the American Statistical Association 56, 163–167. [Problems 10.27, 10.28.]
Pratt, J. W. (1962). A note on unbiased tests. Annals of Mathematical Statistics 33, 292–294.
Pratt, J. W. (1964). Robustness of some procedures for the two-sample location problem. Journal of the American Statistical Association 59, 665–680. [Proposes and illustrates approach (ii) of Section 1.]
Prescott, P. (1975). A simple alternative to Student’s t. Applied Statistics 24, 210–217.
Przyborowski, J. and Wilenski, H. (1939). Homogeneity of results in testing samples from Poisson series. Biometrika 31, 313–323. [Derives the UMP similar test for the equality of two Poisson parameters.]
Pukelsheim, F. (1993). Optimal Design of Experiments. John Wiley, New York.
Quenouille, M. (1949). Approximate tests of correlation in time series. Journal of the Royal Statistical Society Series B 11, 68–84.
Quesenberry, C. P. and Starbuck, R. R. (1976). On optimal tests for separate hypotheses and conditional probability integral transformations. Communications in Statistics (A) 1, 507–524.
Quine, M. P. and Robinson, J. (1985). Efficiencies of chi-square and likelihood ratio goodness-of-fit tests. Annals of Statistics 13, 727–742.
Radlow, R. and Alf, E. (1975). An alternative multinomial assessment of the accuracy of the chi-squared test of goodness of fit. Journal of the American Statistical Association 70, 811–813.

Ramachandran, K. V. (1958). A test of variances. Journal of the American Statistical Association 53, 741–747. Ramsey, P. H. (1980). Journal of Educational Statistics 5, 337–349. Randles, R. and Wolfe, D. A. (1979). Introduction to the Theory of Nonparametric Statistics. John Wiley, New York. Rao, C. R. (1947). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proc. Camb. Phil. Soc. 44, 50-57. Rao, C. R. (1963). Criteria of estimation in large samples. Sankhy¯a 25, 189–206. Rao, C. R. and Wu, Y. (2001). On model selection (with discussion). In Model Selection, Lahiri, P. ed., IMS Lecture Notes–Monograph Series, volume 38. Rao, P. (1968). Estimation of the location of the cusp of a continuous density. Annals of Mathematical Statistics 39, 76–87. Rayner, J. and Best, D. (1989). Smooth Tests of Goodness of Fit. Oxford University Press, Oxford. Read, T. and Cressie, N. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer-Verlag, New York. Reid, C. (1982). Neyman from Life. Springer, New York. Reinhardt, H. E. (1961). The use of least favorable distributions in testing composite hypotheses. Annals of Mathematical Statistics 32, 1034–1041. Richmond, J. (1982). A general method for constructing simultaneous confidence intervals. Journal of the American Statistical Association 77, 455–460. Rieder, H. (1977). Least favorable pairs for special capacities. Annals of Statistics 5, 909–921. Rieder, H. (1994). Robust Asymptotic Statistics. Springer-Verlag, New York. Ripley, B. (1987). Stochastic Simulation. Wiley, New York. Robbins, H. (1948). Convergence of distributions. Annals of Mathematical Statistics 19, 72–76. Robert, C. (1993). A note on Jeffreys-Lindley paradox. Statistica Sinica 3, 603– 605. Robert, C. (1994). The Bayesian Choice. Springer-Verlag, New York. Robertson, T., Wright, F. and Dykstra, R. (1988). Order Restricted Statistical Inference. John Wiley, New York. 
Robins, J., van der Vaart, A. and Ventura, V. (2000). Asymptotic distribution of p-values in composite null models. Journal of the American Statistical Association 95, 1143–1172.
Robinson, G. (1976). Properties of Student’s t and of the Behrens–Fisher solution to the two means problem. Annals of Statistics 4, 963–971. [Correction (1982). Ann. Statist. 10, 321.]
Robinson, G. (1979a). Conditional properties of statistical procedures. Annals of Statistics 7, 742–755.

Robinson, G. (1979b). Conditional properties of statistical procedures for location and scale parameters. Annals of Statistics 7, 756–771. [Basic results concerning the existence of relevant and semirelevant subsets for location and scale parameters, including Example 10.4.1.]
Robinson, G. (1982). Behrens-Fisher problem. In Encycl. Statist. Sci. 1, 205–209.
Robinson, J. (1973). The large-sample power of permutation tests for randomization models. Annals of Statistics 1, 291–296.
Robinson, J. (1983). Approximations to some test statistics for permutation tests in a completely randomized design. Australian Journal of Statistics 25, 358–369. [Discusses the asymptotic performance of the permutation version of the F-test in randomized block experiments.]
Rojo, J. (1983). On Lehmann’s General Concept of Unbiasedness and Some of Its Applications, Ph.D. Thesis. University of California, Berkeley.
Romano, J. P. (1988). A bootstrap revival of some nonparametric distance tests. Journal of the American Statistical Association 83, 698–708.
Romano, J. P. (1989a). Do bootstrap confidence procedures behave well uniformly in P? Canadian Journal of Statistics 17, 75–80.
Romano, J. P. (1989b). Bootstrap and randomization tests of some nonparametric hypotheses. Annals of Statistics 17, 141–159.
Romano, J. P. (1990). On the behavior of randomization tests without a group invariance assumption. Journal of the American Statistical Association 85, 686–692.
Romano, J. P. (2004). On nonparametric testing, the uniform behavior of the t-test, and related problems. Scandinavian Journal of Statistics, to appear.
Romano, J. P. (2005). Optimal testing of equivalence hypothesis. Annals of Statistics, to appear.
Romano, J. P. and Shaikh, A. M. (2004). On control of the false discovery proportion. Department of Statistics Technical Report 2004-31, Stanford University, Stanford, CA.
Romano, J. P. and Siegel, A. F. (1986). Counterexamples in Probability and Statistics. Wadsworth, Belmont.
Romano, J. P. and Thombs, L. A. (1996). Inference for autocorrelations under weak assumptions. Journal of the American Statistical Association 91, 590–600.
Romano, J. P. and Wolf, M. (2000). Finite sample nonparametric inference and large sample efficiency. Annals of Statistics 28, 756–778.
Romano, J. P. and Wolf, M. (2004). Exact and approximate stepdown methods for multiple testing. Journal of the American Statistical Association, to appear.
Ronchetti, E. (1982). Robust alternatives to the F-test for the linear model. In Probability and Statistical Inference (Grossman, Pflug, and Wertz, eds.), D. Reidel, Dordrecht.
Rosenthal, R. and Rubin, D. B. (1985). Statistical analysis: summarizing evidence versus establishing facts. Psych. Bull. 97, 527–529.

Ross, S. (1996). Stochastic Processes, 2nd edition. John Wiley, New York.
Rothenberg, T. J. (1984). Hypothesis testing in linear models when the error covariance matrix is nonscalar. Econometrica 52, 827–842.
Roussas, G. (1972). Contiguous Probability Measures: Some Applications in Statistics, Cambridge University Press.
Roy, K. K. and Ramamoorthi, R. V. (1979). Relationship between Bayes, classical and decision theoretic sufficiency. Sankhyā 41, 48–58.
Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. Annals of Mathematical Statistics 24, 513–536.
Royden, H. L. (1988). Real Analysis. 3rd ed., Macmillan, New York.
Ruist, E. (1954). Comparison of tests for non-parametric hypotheses. Arkiv Mat. 3, 133–136. [Problem 8.7.]
Rukhin, A. (1993). Bahadur efficiency of tests of separate hypotheses and adaptive test statistics. Journal of the American Statistical Association 88, 161–165.
Runger, G. and Eaton, M. (1992). Most powerful invariant permutation tests. Journal of Multivariate Analysis 42, 202–209.
Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric Regression. Cambridge University Press.
Sackrowitz, H. and Samuel-Cahn, E. (1999). P-values as random variables – expected p-values. American Statistician 53, 326–331.
Sahai, H. and Khurshid, A. (1995). Statistics in Epidemiology: Methods, Techniques and Applications. CRC Press, Boca Raton, Florida.
Sahai, H. and Ojeda, M. (2004). Analysis of Variance for Random Models. Birkhäuser, Boston.
Salaevskii, Y. (1971). Essay in Investigations in Classical Problems of Probability Theory and Mathematical Statistics (V. M. Kalinin and O. V. Salaevskii, eds.) (Russian), Leningrad Seminars in Math., Vol. 13, Steklov Math. Inst.; Engl. transl., Consultants Bureau, New York.
Sanathanan, L. (1974). Critical power function and decision making. Journal of the American Statistical Association 69, 398–402.
Sarkar, S. K. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Annals of Statistics 30, 239–257.
Savage, L. J. (1962). The Foundations of Statistical Inference. Methuen, London.
Savage, L. J. (1972). The Foundations of Statistics, 2nd edition. Dover, New York.
Savage, L. J. (1976). On rereading R. A. Fisher (with discussion). Annals of Statistics 4, 441–500.
Schafer, G. (1982). Lindley’s paradox (with discussion). Journal of the American Statistical Association 77, 325–351.
Schafer, G. (1988). Sharp null hypotheses. In Encycl. Statist. Sci. 8, 433–436.
Scheffé, H. (1942). On the ratio of the variances of two normal populations. Annals of Mathematical Statistics 13, 371–388.

Scheffé, H. (1943). On a measure problem arising in the theory of non-parametric tests. Annals of Mathematical Statistics 14, 227–233. [Proves the completeness of order statistics.]
Scheffé, H. (1947). A useful convergence theorem for probability distribution functions. Annals of Mathematical Statistics 18, 434–438.
Scheffé, H. (1956). A ‘mixed model’ for the analysis of variance. Annals of Mathematical Statistics 27, 23–36 and 251–271.
Scheffé, H. (1959). Analysis of Variance. John Wiley, New York.
Scheffé, H. (1970). Practical solutions of the Behrens-Fisher problem. Journal of the American Statistical Association 65, 1501–1504. [Introduces the idea of logarithmically shortest confidence intervals for ratios of scale parameters.]
Scheffé, H. (1977). A note on a reformulation of the S-method of multiple comparison (with discussion). Journal of the American Statistical Association 72, 143–146. [Problem 7.18.]
Schervish, M. (1995). Theory of Statistics. Springer-Verlag, New York.
Schoenberg, I. J. (1951). On Pólya frequency functions. I. J. Analyse Math. 1, 331–374. [Example 8.2.1.]
Scholz, F. W. (1982). Combining independent P-values. In A Festschrift for Erich L. Lehmann (Bickel, Doksum, and Hodges, eds.), Wadsworth, Belmont, Calif.
Schuirmann, D. (1981). On hypothesis testing to determine if the mean of a normal distribution is contained in a known interval. Biometrics 37, 617.
Schwartz, R. E. (1967a). Locally minimax tests. Annals of Mathematical Statistics 38, 340–360.
Schwartz, R. (1967b). Admissible tests in multivariate analysis of variance. Annals of Mathematical Statistics 38, 698–710.
Schwartz, R. (1969). Invariant proper Bayes tests for exponential families. Ann. Math. Statist. 40, 270–283.
Schweder, T. (1988). A significance version of the basic Neyman-Pearson theory for scientific hypothesis testing. Scandinavian Journal of Statistics 15, 225–242.
Schweder, T. and Spjøtvoll, E. (1982). Plots of P-values to evaluate many tests simultaneously. Biometrika 69, 493–502.
Seal, H. L. (1967). Studies in the history of probability and statistics XV. The historical development of the Gauss linear model. Biometrika 54, 1–24.
Searle, S. (1987). Linear Models and Unbalanced Data. John Wiley, New York.
Seber, G. A. F. (1977). Linear Regression Analysis. John Wiley, New York.
Seber, G. A. F. (1984). Multivariate Observations. John Wiley, New York.
Seidenfeld, T. (1992). R. A. Fisher’s fiducial argument and Bayes’ theorem. Statistical Science 7, 358–368.
Sellke, T., Bayarri, J. and Berger, J. (2001). Calibration of p-values for testing precise null hypotheses. American Statistician 55, 62–71.
Serfling, R. H. (1980). Approximation Theorems of Mathematical Statistics. John Wiley, New York.

Severini, T. (1993). Bayesian interval estimates which are also confidence intervals. Journal of the Royal Statistical Society Series B 55, 533–540.
Shaffer, J. P. (1973). Defining and testing hypotheses in multi-dimensional contingency tables. Psych. Bull. 79, 127–141.
Shaffer, J. P. (1977a). Multiple comparisons emphasizing selected contrasts: An extension and generalization of Dunnett’s procedure. Biometrics 33, 293–303.
Shaffer, J. P. (1977b). Reorganization of variables in analysis of variance and multidimensional contingency tables. Psych. Bull. 84, 220–228.
Shaffer, J. P. (1980). Control of directional errors with stagewise multiple test procedures. Annals of Statistics 8, 1342–1347.
Shaffer, J. P. (1981). Complexity: an interpretability criterion for multiple comparisons. Journal of the American Statistical Association 76, 395–401.
Shaffer, J. P. (1984). Issues arising in multiple comparisons among populations. In Proc. Seventh Conference on Probab. Theory (Iosifescu, ed.). Edit. Acad. Republ. Soc. Romania. Bucharest.
Shaffer, J. P. (1986). Modified sequentially rejective multiple test procedures. Journal of the American Statistical Association 81, 826–831.
Shaffer, J. P. (1995). Multiple hypothesis testing: A review. Annual Review of Psychology 46, 561–584.
Shaffer, J. P. (2002). Optimality results in multiple hypothesis testing. In The First Erich L. Lehmann Symposium – Optimality, Rojo and Pérez-Abreu (eds.), IMS Lecture Notes 44, Beachwood, Ohio.
Shao, J. (1999). Mathematical Statistics. Springer, New York.
Shao, J. and Tu, D. (1995). The Jackknife and the Bootstrap. Springer, New York.
Shapiro, S. S., Wilk, M. B. and Chen, H. J. (1968). A comparative study of various tests of normality. Journal of the American Statistical Association 63, 1343–1372.
Shewhart, W. and Winters, F. (1928). Small samples – new experimental results. Journal of the American Statistical Association 23, 144–153.
Shorack, G. (1972). The best test of exponentiality against gamma alternatives. Journal of the American Statistical Association 67, 213–214.
Shorack, G. and Wellner, J. (1986). Empirical Processes with Applications to Statistics. John Wiley, New York.
Shorrock, G. (1990). Improved confidence intervals for a normal variance. Annals of Statistics 18, 972–980.
Shuster, J. (1968). On the inverse Gaussian distribution function. Journal of the American Statistical Association 63, 1514–1516.
Siegmund, D. (1985). Sequential Analysis: Tests and Confidence Intervals. Springer-Verlag, New York.
Siegmund, D. (1986). Boundary crossing probabilities and statistical applications. Annals of Statistics 14, 361–404.

Sierpinski, W. (1920). Sur les fonctions convexes mesurables. Fundamenta Math. 1, 125–129.
Silvapulle, M. and Silvapulle, P. (1995). A score test against one-sided alternatives. Journal of the American Statistical Association 90, 342–349.
Silvey, S. D. (1980). Optimal Design: An Introduction to the Theory of Parameter Estimation, Chapman & Hall, London.
Simpson, D. (1989). Hellinger deviance tests: efficiency, breakdown points, and examples. Journal of the American Statistical Association 84, 107–113.
Singh, K. (1981). On the asymptotic accuracy of Efron’s bootstrap. Annals of Statistics 9, 1187–1195.
Small, C., Wang, J. and Yang, Z. (2000). Eliminating multiple root problems in estimation (with discussion). Statistical Science 15, 313–341.
Smirnov, N. V. (1948). Tables for estimating the goodness of fit of empirical distributions. Annals of Mathematical Statistics 19, 279–281.
Smith, D. W. and Murray, L. W. (1984). An alternative to Eisenhart’s Model II and mixed model in the case of negative variance estimates. Journal of the American Statistical Association 79, 145–151.
Sophister (G. Story) (1928). Discussion of small samples drawn from an infinite skew population. Biometrika 20A, 389–423.
Speed, F. M., Hocking, R. R. and Hackney, O. P. (1979). Methods of analysis of linear models with unbalanced data. Journal of the American Statistical Association 73, 105–112.
Speed, T. (1987). What is an analysis of variance? (with discussion). Annals of Statistics 15, 885–941.
Speed, T. (1990). Introductory remarks on Neyman (1923). Statistical Science 5, 463–464.
Spiegelhalter, D. J. (1983). Diagnostic tests of distributional shape. Biometrika 70, 401–409.
Spjøtvoll, E. (1967). Optimum invariant tests in unbalanced variance components models. Annals of Mathematical Statistics 38, 422–428.
Spjøtvoll, E. (1972). On the optimality of some multiple comparison procedures. Annals of Mathematical Statistics 43, 398–411.
Spjøtvoll, E. (1974). Multiple testing in analysis of variance. Scandinavian Journal of Statistics 1, 97–114.
Sprott, D. A. (1975). Marginal and conditional sufficiency. Biometrika 62, 599–605.
Spurrier, J. D. (1984). An overview of tests for exponentiality. Communications in Statistics – Theory and Methods 13, 1635–1654.
Spurrier, J. D. (1999). Exact confidence bounds for all contrasts of three or more regression lines. Journal of the American Statistical Association 94, 483–488.
Stein, C. M. (1951). A property of some tests of composite hypotheses. Annals of Mathematical Statistics 22, 475–476. [Problem 3.58.]

Stein, C. M. (1956a). The admissibility of Hotelling’s T²-test. Annals of Mathematical Statistics 27, 616–623.
Stein, C. M. (1956b). Efficient nonparametric testing and estimation. In Proc. 3rd Berkeley Symp. Math. Statist. and Probab. Univ. of Calif. Press, Berkeley.
Stein, C. M. (1962). Confidence sets for the mean of a multivariate normal distribution. Journal of the Royal Statistical Society Series B 24, 265–296.
Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Annals of Statistics 9, 1135–1151.
Stephens, M. (1974). EDF Statistics for goodness-of-fit and some comparisons. Journal of the American Statistical Association 69, 730–737.
Stephens, M. (1976). Asymptotic results for goodness-of-fit statistics with unknown parameters. Annals of Statistics 4, 357–369.
Stigler, S. M. (1977). Eight centuries of sampling inspection: The trial of the Pyx. Journal of the American Statistical Association 72, 493–500.
Stigler, S. M. (1978). Francis Ysidro Edgeworth, Statistician (with discussion). Journal of the Royal Statistical Society Series A 141, 287–322.
Stigler, S. M. (1986). Laplace’s 1774 memoir on inverse probability. Statistical Science 1, 359–378.
Stone, C. J. (1975). Adaptive maximum likelihood estimators of a location parameter. Annals of Statistics 3, 267–294.
Stone, C. J. (1981). Admissible selection of an accurate and parsimonious normal linear regression model. Annals of Statistics 9, 475–485.
Stone, M. and von Randow, R. (1968). Statistically inspired conditions on the group structure of invariant experiments and their relationships with other conditions on locally compact topological groups. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 10, 70–78.
Strasser, H. (1985). Mathematical Theory of Statistics. Walter de Gruyter, Berlin.
Stuart, A. and Ord, J. (1987). Kendall’s Advanced Theory of Statistics, Vol. 1, 5th edition. Oxford University Press, New York.
Stuart, A. and Ord, J. (1991). Kendall’s Advanced Theory of Statistics, Vol. 2, 5th edition. Oxford University Press, New York.
Stuart, A., Ord, J. and Arnold, S. (1999). Kendall’s Advanced Theory of Statistics, Vol. 2A, 6th edition. Oxford University Press, New York.
Student (W. S. Gosset) (1908). On the probable error of the mean. Biometrika 6, 1–25.
Student (W. S. Gosset) (1927). Errors of routine analysis. Biometrika 19, 151–164.
Sugiura, N. (1965). An example of the two-sided Wilcoxon test which is not unbiased. Annals of the Institute of Statistical Mathematics 17, 261–263.
Sutton, C. (1993). Computer-intensive methods for tests about the mean of an asymmetrical distribution. Journal of the American Statistical Association 88, 802–810.

Sverdrup, E. (1953). Similarity, unbiasedness, minimaxibility and admissibility of statistical test procedures. Skand. Aktuar. Tidskrift 36, 64–86. [Theorem 4.3.1 and results of the type of Theorem 4.4.1. Applications including the 2 × 2 table.]
Swed, F. S. and Eisenhart, C. (1943). Tables for testing randomness of grouping in a sequence of alternatives. Annals of Mathematical Statistics 14, 66–87.
Takeuchi, K. (1969). A note on the test for the location parameter of an exponential distribution. Annals of Mathematical Statistics 40, 1838–1839.
Tallis, G. M. (1983). Goodness of fit. In Encycl. Statist. Sci., Vol. 3. John Wiley, New York.
Tan, W. Y. (1982). Sampling distributions and robustness of t, F and variance-ratio in two samples and ANOVA models with respect to departure from normality. Communications in Statistics – Theory and Methods 11, 2485–2511.
Tang, D. (1994). Uniformly more powerful tests in a one-sided multivariate problem. Journal of the American Statistical Association 89, 1006–1011.
Tate, R. F. and Klett, G. W. (1959). Optimal confidence intervals for the variance of a normal distribution. Journal of the American Statistical Association 54, 674–682.
Taylor, H. and Karlin, S. (1998). An Introduction to Stochastic Modeling, 3rd edition. Academic Press, San Diego, CA.
Thompson, W. R. (1936). On confidence ranges for the median and other expectation distributions for populations of unknown distribution form. Annals of Mathematical Statistics 7, 122–128. [Problem 3.57.]
Tiku, M. L. (1967). Tables of the power of the F-test. Journal of the American Statistical Association 62, 525–539.
Tiku, M. L. (1972). More tables of the power of the F-test. Journal of the American Statistical Association 67, 709–710.
Tiku, M. L. (1985a). Noncentral chi-square distribution. Encycl. Statist. Sci. 6, 276–280.
Tiku, M. L. (1985b). Noncentral F-distribution. Encycl. Statist. Sci. 6, 280–294.
Tiku, M. L. and Balakrishnan, N. (1984). Testing equality of population variances the robust way. Communications in Statistics – Theory and Methods 13, 2143–2159.
Tiku, M. L. and Singh, M. (1981). Robust test for means when population variances are unequal. Communications in Statistics – Theory and Methods A10, 2057–2071.
Tocher, K. D. (1950). Extension of Neyman–Pearson theory of tests to discontinuous variates. Biometrika 37, 130–144. [Proves the optimum property of Fisher’s exact test.]
Tong, Y. L. (1980). Probability Inequalities in Multivariate Distributions. Academic Press, New York.
Tritchler, D. (1984). On inverting permutation tests. Journal of the American Statistical Association 79, 200–207.

Troendle, J. (1995). A stepwise resampling method of multiple testing. Journal of the American Statistical Association 90, 370–378.
Tseng, Y. and Brown, L. D. (1997). Good exact confidence sets and minimax estimators for the mean vector of a multivariate normal distribution. Annals of Statistics 25, 2228–2258.
Tukey, J. W. (1949a). One degree of freedom for non-additivity. Biometrics 5, 232–242.
Tukey, J. W. (1949b). Standard confidence points. Unpublished Report 16, Statist. Res. Group, Princeton Univ. (To be published in Tukey’s Collected Works, Wadsworth, Belmont, Calif.)
Tukey, J. W. (1953). The problem of multiple comparisons. Published in The Collected Works of John W. Tukey: Multiple Comparisons, Volume VIII. (1999). Edited by H. Braun, CRC Press, Boca Raton, Florida. [This MS, unpublished until 1999, was widely distributed and exerted a strong influence on the development and acceptance of multiple comparison procedures. It pioneered many of the basic ideas, including the T-method and a first version of Lemma 9.3.1.]
Tukey, J. W. (1958a). Bias and confidence in not quite large samples (abstract). Annals of Mathematical Statistics 29, 614.
Tukey, J. W. (1958b). A smooth invertibility theorem. Annals of Mathematical Statistics 29, 581–584.
Tukey, J. W. (1960). A survey of sampling from contaminated distributions. In Contributions to Probability and Statistics (Olkin, ed.), Stanford University Press.
Tukey, J. W. (1991). The philosophy of multiple comparisons. Statistical Science 6, 100–116.
Tukey, J. W. and McLaughlin, D. H. (1963). Less vulnerable confidence and significance procedures for location based on a single sample: Trimming/Winsorization 1. Sankhyā 25, 331–352.
Turnbull, H. (1952). Theory of Equations, 5th ed., Oliver and Boyd, Edinburgh.
Tweedie, M. C. K. (1957). Statistical properties of inverse Gaussian distributions I, II. Annals of Mathematical Statistics 28, 362–377, 696–705.
Unni, K. (1978). The Theory of Estimation in Algebraic and Analytic Exponential Families with Applications to Variance Components Models, unpublished Ph.D. Thesis, Indian Statistical Institute.
Uthoff, V. A. (1970). An optimum test property of two well-known statistics. Journal of the American Statistical Association 65, 1597–1600.
Uthoff, V. A. (1973). The most powerful scale and location invariant test of normal versus double exponential. Annals of Statistics 1, 170–174.
Vadiveloo, J. (1983). On the theory of modified randomization tests for nonparametric hypotheses. Communications in Statistics A12, 1581–1596.
Vaeth, M. (1985). On the use of Wald’s test in exponential families. International Statistical Review 53, 199–214.
van Beek, P. (1972). An application of Fourier methods to the problem of sharpening the Berry-Esseen inequality. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 23, 187–196.
van der Laan, M., Dudoit, S. and Pollard, K. (2004). Multiple testing. Part II. Step-down procedures for control of the familywise error rate. Statistical Applications in Genetics and Molecular Biology 3, Article 14.
van der Vaart, A. (1988). Statistical Estimation in Large Parameter Spaces. C.W.I. Tract 44, Amsterdam.
van der Vaart, A. (1998). Asymptotic Statistics. Cambridge University Press.
van der Vaart, A. and Wellner, J. (1996). Weak Convergence and Empirical Processes. Springer, New York.
Venable, T. C. and Bhapkar, V. P. (1978). Gart’s test of interaction in a 2 × 2 × 2 contingency table for small samples. Biometrika 65, 669–672.
von Mises, R. (1931). Wahrscheinlichkeitsrechnung. Franz Deuticke, Leipzig.
Vu, H. and Zhou, S. (1997). Generalization of likelihood ratio tests under nonstandard conditions. Annals of Statistics 25, 897–916.
Wacholder, S. and Weinberg, C. R. (1982). Paired versus two-sample design for a clinical trial of treatments with dichotomous outcome: Power considerations. Biometrics 38, 801–812.
Wald, A. (1939). Contributions to the theory of statistical estimation and testing hypotheses. Annals of Mathematical Statistics 10, 299–326. [A general formulation of statistical problems containing estimation and testing problems as special cases. Discussion of Bayes and minimax procedures.]
Wald, A. (1941a). Asymptotically most powerful tests of statistical hypotheses. Annals of Mathematical Statistics 12, 1–19.
Wald, A. (1941b). Some examples of asymptotically most powerful tests. Annals of Mathematical Statistics 12, 396–408.
Wald, A. (1942). On the power function of the analysis of variance test. Annals of Mathematical Statistics 13, 434–439. [Problem 7.5. This problem is also treated by Hsu, “On the power function of the E²-test and the T²-test”, Annals of Mathematical Statistics 16 (1945), 278–286.]
Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Amer. Math. Soc. 54, 426–482. [General asymptotic distribution and optimum theory of likelihood ratio (and asymptotically equivalent) tests.]
Wald, A. (1949). Note on the consistency of the maximum likelihood estimate. Annals of Mathematical Statistics 20, 595–601.
Wald, A. (1950). Statistical Decision Functions. John Wiley, New York. [Definition of most stringent tests.]
Wald, A. (1958). Selected Papers in Statistics and Probability by Abraham Wald. Stanford Univ. Press. [Defines and characterizes complete classes of decision procedures for general decision problems. The ideas of this and the preceding paper were developed further in a series of papers culminating in Wald’s book (1950).]

Wallace, D. (1958). Asymptotic approximations to distributions. Annals of Mathematical Statistics 29, 635–654.
Wallace, D. (1980). The Behrens–Fisher and Fieller–Creasy problems. In R. A. Fisher: An Appreciation (Fienberg and Hinkley, eds.), Springer, New York, pp. 119–147.
Walsh, J. E. (1949). Some significance tests for the median which are valid under very general conditions. Annals of Mathematical Statistics 20, 64–81. [Lemma 6.7.1; proposes the Wilcoxon one-sample test in the form given in Problem 6.48. The equivalence of the two tests was shown by Tukey in an unpublished mimeographed report dated 1949. Contains a result related to Problem 4.13.]
Wang, H. (1999). Brown’s paradox in the estimated confidence approach. Annals of Statistics 27, 610–626.
Wang, Y. Y. (1971). Probabilities of the type I errors of the Welch tests for the Behrens-Fisher problem. Journal of the American Statistical Association 66, 605–608.
Weisberg, S. (1985). Applied Linear Regression, 2nd edition. John Wiley, New York.
Welch, B. L. (1939). On confidence limits and sufficiency with particular reference to parameters of location. Annals of Mathematical Statistics 10, 58–69.
Welch, B. L. (1951). On the comparison of several mean values: An alternative approach. Biometrika 38, 330–336.
Welch, W. (1990). Construction of permutation tests. Journal of the American Statistical Association 85, 693–698.
Wellek, S. (2003). Testing Statistical Hypotheses of Equivalence. Chapman & Hall/CRC.
Wells, M., Jammalamadaka, S. and Tiwari, R. (1993). Large sample theory of spacings statistics for tests of fit for the composite hypothesis. Journal of the Royal Statistical Society Series B 55, 189–203.
Westfall, P. H. (1989). Power comparisons for invariant variance ratio tests in mixed ANOVA models. Annals of Statistics 17, 318–326.
Westfall, P. H. (1997). Multiple testing of general contrasts using logical constraints and correlations. Journal of the American Statistical Association 92, 299–306.
Westfall, P. H. and Young, S. (1993). Resampling-Based Multiple Testing: Examples and Methods for P-Value Adjustment. John Wiley, New York.
Westlake, W. (1981). Response to T. B. L. Kirkwood: bioequivalence testing – a need to rethink. Biometrics 37, 589–594.
Wijsman, R. (1979). Constructing all smallest simultaneous confidence sets in a given class, with applications to manova. Annals of Statistics 7, 1003–1018.
Wijsman, R. (1980). Smallest simultaneous confidence sets with applications in multivariate analysis. Journal of Multivariate Analysis V, 483–498.
Wijsman, R. (1990). Invariant Measures on Groups and Their Use in Statistics. IMS Lecture Notes. Institute of Mathematical Statistics, Hayward, CA.

Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics 1, 80–83. [Proposes the two tests bearing his name. (See also Deuchler, 1914.)]
Wilk, M. B. and Kempthorne, O. (1955). Fixed, mixed, and random models. Journal of the American Statistical Association 50, 1144–1167.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Annals of Mathematical Statistics 9, 60–62. [Derives the asymptotic distribution of the likelihood ratio when the hypothesis is true.]
Williams, D. (1991). Probability With Martingales. Cambridge University Press, Cambridge, England.
Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22, 209–212.
Wolfowitz, J. (1949). The power of the classical tests associated with the normal distribution. Annals of Mathematical Statistics 20, 540–551. [Proves Lemma 6.5.1 for a number of special cases. Proves that the standard tests of the univariate linear hypothesis and for testing the absence of multiple correlation are most stringent among all similar tests and possess certain related optimum properties.]
Wolfowitz, J. (1950). Minimax estimates of the mean of a normal distribution with known variance. Annals of Mathematical Statistics 21, 218–230.
Working, H. and Hotelling, H. (1929). Application of the theory of error to the interpretation of trends. Journal of the American Statistical Association 24, Mar. Suppl., 73–85.
Wu, C. F. (1990). On the asymptotic properties of the jackknife histogram. Annals of Statistics 18, 1438–1452.
Wu, C. F. and Hamada, M. (2000). Experiments: Planning, Analysis and Parameter Design. John Wiley, New York.
Wynn, H. P. (1984). An exact confidence band for one-dimensional polynomial regression. Biometrika 71, 375–379.
Wynn, H. P. and Bloomfield, P. (1971). Simultaneous confidence bands in regression analysis (with discussion). Journal of the Royal Statistical Society Series B 33, 202–217.
Yamada, S. and Morimoto, H. (1992). Sufficiency. In Current Issues in Statistical Inference: Essays in Honor of D. Basu. Ghosh and Pathak (eds.), IMS Lecture Notes 17, Hayward, CA.
Yanagimoto, T. (1990). Dependence ordering in statistical models and other notions. In Topics in Statistical Dependence, Block, Sampson and Savits (eds.) (1990), IMS Lecture Notes 16, Hayward, CA.
Yuen, K. K. (1974). The two-sample trimmed t for unequal population variances. Biometrika 61, 165–170.
Zabell, S. (1992). R. A. Fisher and the fiducial argument. Statistical Science 7, 369–387.
Zhang, J. (2002). Powerful goodness-of-fit tests based on the likelihood ratio. Journal of the Royal Statistical Society Series B 64, 281–294.

Zhang, J. and Boos, D. (1992). Bootstrap critical values for testing homogeneity of covariance matrices. Journal of the American Statistical Association 87, 425–429.

Author Index

Agresti, A., 127, 129, 133, 134, 135, 168, 318
Aiyar, R. J., 272
Akritas, M., 318
Albers, W., 272, 451, 582, 690
Albert, A., 286
Alf, E., 590
Andersen, S. L., 210
Anderson, T. W., 90, 218, 306, 318
Andersson, S., 218
Anscombe, F., 474
Antille, A., 248
Arbuthnot, J., 107, 149
Arcones, M., 655, 678
Armsen, P., 127
Arnold, S., 239, 274, 292–293, 318, 374
Arrow, K., 58
Arthur, K. H., 306
Arvesen, J. N., 300
Athreya, K., 655
Atkinson, A., 169, 293, 318

Babu, G., 655, 690
Bahadur, R., 54, 210, 466, 481, 575, 582
Bain, L. J., 200, 201
Baker, R., 446
Balakrishnan, N., 156, 159, 193, 197, 281, 306, 307, 446
Banach, S., 700
Barankin, E. W., 47
Bar-Lev, S., 118, 201
Barlow, R. E., 287
Barnard, G. A., 175, 413
Barndorff-Nielsen, O., 47, 55, 106, 214, 398, 403, 517
Barnett, V., 6
Barron, A., 630
Bartholomew, D. J., 287
Bartlett, M. S., 95, 517
Basu, D., 106, 210, 395, 397, 398, 410, 411, 412
Basu, S., 462, 481
Bayarri, J., 108
Bayarri, M., 175
Becker, B., 109
Becker, N., 397
Bednarski, T., 328
Behnen, K., 582
Bell, C. B., 118, 241
Bell, C. D., 210
Benichou, J., 526, 549
Bening, V., 582

Benjamini, Y., 354, 374, 445
Bennett, B., 127, 171
Bentkus, V., 604
Beran, R., 481, 526, 539, 582, 589, 629, 657, 658, 668, 671, 672, 673, 679, 689, 690
Berger, A., 320, 698
Berger, J., 15, 16, 18, 27, 95, 108, 173, 175, 331, 400, 414, 415, 526
Berger, R., 108, 287, 561
Berk, R., 220, 226, 241
Bernardo, J., 16
Bernoulli, D., 107
Best, D., 616, 630
Bhapkar, V. P., 135
Bhat, U., 145
Bhattacharya, P. K., 248
Bhattacharya, R., 460, 481, 668
Bickel, P., 11, 27, 226, 241, 474, 481, 488, 517, 539, 571, 582, 654, 677, 678, 679, 690
Billingsley, P., 42, 55, 117, 147, 185, 223, 256, 424, 427, 451, 476, 480, 611
Birch, M. W., 135
Birnbaum, A., 99, 126, 276, 400, 414
Birnbaum, Z. W., 108, 256, 442
Bishop, Y. M. M., 135, 525
Blackwell, D., 16, 21, 27, 40, 95, 118
Blair, R. C., 539
Bloomfield, P., 378, 384
Blyth, C. R., 6, 75, 108, 167, 168
Bohrer, R., 384
Boldrick, J., 109, 391
Bondar, J. V., 334, 415
Bondessen, L., 13
Boos, D., 108, 210, 248, 446, 481
Boschloo, R. D., 127
Bose, R. C., 391
Boukai, B., 175
Bowker, A. H., 524
Box, G. E. P., 210, 293, 304, 421, 474, 480
Box, J. F., 27
Brain, C. W., 629
Braun, H., 391
Breiman, L., 118
Bremner, J. M., 287
Bretagnolle, J., 655, 678
Brockwell, P. J., 451
Bromeling, L. D., 304
Bross, I. D. J., 127
Brown, K. G., 304
Brown, L. D., 18, 18, 47, 55, 69, 71, 108, 115, 141, 157, 237, 308, 336, 347, 408, 409, 414, 415, 435, 561, 647, 668
Brown, M. B., 448, 480
Brownie, C., 409, 446
Brunk, H. D., 287
Brunner, E., 318
Buehler, R., 175, 196, 408, 408, 412, 413, 414, 414
Burkholder, D. L., 54

Cabaña, A., 629
Cabaña, E., 629
Cai, T., 18, 435, 647, 668
Carroll, R. J., 318
Casella, G., vii, 5, 13, 17, 21, 55, 108, 124, 157, 173, 174, 201, 292, 335, 336, 395, 396, 408, 415, 506, 507, 548, 561, 679
Castillo, J., 144
Chakraborti, S., 146, 245, 251, 286, 290, 442
Chalmers, T. C., 57
Chambers, E. A., 134
Chapman, D. G., 108
Chatterjee, S., 169
Chebyshev, P., 481
Chen, H. J., 629
Chen, L., 445
Chernoff, H., 231, 526, 598–599, 630
Chhikara, R. S., 100, 197, 197
Chmielewski, M. A., 314
Choi, K., 318
Choi, S., 582
Chou, Y. M., 306
Choy, K., 582
Christensen, R., 318
Cima, J. A., 384
Clinch, J. C., 448, 480
Cochran, W. G., 448
Cohen, A., 57, 69, 135, 201, 210, 239, 287, 316, 318, 341, 629
Cohen, J., 281
Cohen, L., 95

Conover, W. J., 446, 481
Coull, B., 168
Cox, D., 214
Cox, D. R., 6, 108, 134, 134, 220, 397, 414, 474
Cramér, H., 27, 481, 506, 526, 629
Cressie, N., 445, 629
Csörgö, S., 655
Cvitanic, J., 338
Cyr, J. L., 481

D’Agostino, R., 408, 589, 616, 628, 629
Dantzig, G. B., 78, 108
Darmois, G., 57
DasGupta, A., 18, 276, 435, 462, 481, 647, 668
Davenport, J. M., 231
David, H. A., 243
Davis, B. M., 127
Davis, R. A., 451
Davison, A., 690
Dawid, A. P., 106, 411
Dayton, C., 373
de Leeuw, J., 11
de Moivre, A., 480
Dempster, A. P., 175
Deshpande, J. V., 629
Deuchler, G., 276
Devroye, L., 443
de Wet, T., 589, 630
Diaconis, P., 180, 270, 318, 626, 639, 649
DiCiccio, T., 517, 686, 691
Dobson, A., 318
Doksum, K. A., 27, 474, 629
Donev, A., 293
Donoghue, J., 366
Donoho, D., 481
Draper, D., 286
Drost, F., 630
Dubins, L. E., 40
Ducharme, G., 671
Dudley, R., 55, 424, 472, 480, 486, 571, 697
Dudoit, S., 109, 391, 690
Dümbgen, L., 629
Duncan, D. B., 391
Durbin, J., 245, 414, 442, 616, 629
Dvoretzky, A., 95, 442
Dykstra, R., 287

Eaton, M., 211, 218, 227, 318, 331
Edelman, D., 445, 462
Edgeworth, F. Y., 107, 481
Edgington, E. S., 690
Edwards, A. W. F., 129, 175
Efron, B., 190, 195, 439, 481, 626, 648, 668, 672, 686, 690
Eisenhart, C., 146
Elfving, G., 54
Engelhardt, M. E., 200, 201
Eubank, R., 607, 616, 630

Falk, M., 451
Fan, J., 526, 607, 630
Faraway, J., 391
Farrell, R., 108, 218, 331
Fears, T., 526, 549
Feddersen, A. P., 408
Feinberg, S. E., 135, 525
Feller, W., 5, 256, 459, 464, 604
Fenstad, G. U., 448
Ferguson, T. S., 5, 16, 18, 27
Fienberg, S., 134, 210
Finch, P. D., 132
Finner, H., 140, 354, 373, 391
Finney, D. J., 127
Fisher, R. A., 27, 97, 107, 108, 127, 149, 175, 210, 408, 414, 415, 526, 597, 598, 630, 635
Folks, J. L., 100, 197, 197
Forsythe, A., 190, 207, 448, 480
Fourier, J. B. J., 108
Franck, W. E., 259
Fraser, D. A. S., 106, 141, 175
Freedman, D., 131, 654, 678, 690
Freeman, M. F., 474
Freiman, J. A., 57
Frisén, M., 138
Fuller, W., 451

Gabriel, K. R., 188, 190, 368, 374, 385
Gail, M., 526, 549
Galambos, J., 629
Gan, L., 507
Garside, G. R., 127
Gart, J. J., 134

Garthwaite, P., 190
Gastonis, C., 135
Gastwirth, J. L., 248, 450
Gauss, C. F., 27, 108, 480
Gavarett, J., 107
George, E. I., 415
Ghosh, J., 46, 200, 220, 481, 517
Ghosh, M., 149, 323
Gibbons, J., 108, 146, 245, 251, 286, 290, 442
Giesbrecht, F., 293
Giné, E., 655, 658
Giri, N., 322, 335
Girshick, M. A., 16, 27, 141
Glaser, R. E., 200, 481
Gleser, L. J., 197, 442
Gokhale, D. V., 131
Good, P., 180, 210, 690
Goodman, L. A., 129
Gordon, I., 397
Götze, F., 677, 679
Goutis, C., 201, 408
Green, B. F., 180
Green, J. R., 629
Greenwood, P., 598, 630
Grenander, U., 108
Groeneboom, P., 539, 540, 582
Guenther, W. C., 146
Guillier, C. L., 272
Gumpertz, M., 293
Gupta, A., 339

Haberman, S. J., 129, 134
Hackney, O. P., 292
Hadi, A., 169
Hájek, J., 33, 245, 256, 286, 486, 487, 525, 526, 582, 588, 590
Hald, A., 480
Hall, P., 75, 446, 460, 481, 517, 629, 655, 664, 665, 668, 679, 690, 691
Hall, W., 109, 190, 220, 582
Hallin, M., 582
Halmos, P. R., 95, 331, 499
Hamada, M., 293
Hamilton, J. D., 451
Hartigan, J. A., 190, 206, 207, 211, 691
Hartley, H. O., 281
Harville, D. A., 304
Has’minskii, R., 506
Hastie, T., 318
Hayter, T., 391
Haytner, A., 391
Hedges, L., 109
Hegazy, Y. A. S., 629
Hegemann, V., 290, 291
Heritier, S., 526
Hettmansperger, T., 286, 287, 290, 318, 446, 448
Higgins, J. J., 539
Hillier, G. H., 480
Hinkley, D., 401, 474, 690
Hipp, C., 47
Hochberg, Y., 353, 354, 368, 373, 374, 384, 391
Hocking, R. R., 292, 293, 316
Hodges, J. L., Jr., 157, 157, 424, 582
Hoeffding, W., 118, 210, 276, 481, 629, 637, 690
Hoel, P. G., 149
Hogg, R. V., 159
Holland, P. W., 135, 525
Holm, S., 350, 374, 375, 391
Holmes, S., 180, 639, 649
Hooper, P. M., 220, 239
Horst, C., 127
Hotelling, H., 108, 276, 318, 391, 448, 474, 480
Hsu, C. F., 188
Hsu, C. T., 208
Hsu, J., 391, 561
Hsu, P., 127, 318
Huang, J. S., 323
Huber, P. J., 328, 347, 480
Hughes-Oliver, J., 210
Hung, K., 145
Hunt, G., 276, 347
Hunter, J. S., 293
Hunter, W. G., 293
Hutchinson, D. W., 167
Hwang, J., 108, 157, 197, 336, 415, 561

Ibragimov, I., 506
Ibragimov, J. A., 323
Inglot, T., 582, 607, 630
Ingster, Y., 597, 622, 624

Isaacson, S. L., 341

Jagers, P., 279
James, A. T., 448, 480
James, G. S., 448, 480
Jammalamadaka, S., 630
Janssen, A., 582, 583, 616, 617, 621, 622, 683, 690
Jensen, J., 517
Jiang, J., 507
Jing, B., 481
Jockel, K., 443
Johansen, S., 119, 448, 480
John, R. D., 180, 188
Johnson, D. E., 290, 291
Johnson, M. E., 446, 481
Johnson, M. M., 446, 481
Johnson, N. L., 98, 99, 114, 127, 156, 159, 193, 197, 209, 281, 306, 307, 435
Johnson, N. S., 131
Johnstone, I. M., 71, 115, 308
Joshi, V., 239

Kabe, D. G., 93
Kakutani, S., 582
Kalbfleisch, J. D., 395
Kallenberg, W., 140, 272, 339, 582, 607, 630
Kanoh, S., 391
Kappenman, R. F., 440
Karatzas, I., 338
Kariya, T., 314
Karlin, S., 5, 22, 69, 71, 231, 323
Kasten, E. L., 127
Kemp, A., 114, 127, 435
Kemperman, J., 210
Kempthorne, O., 293
Kempthorne, P., 11
Kendall, M. G., 272, 629, 630
Kent, J., 629
Kersting, G., 248
Kesselman, H. J., 448, 480
Khmaladze, E., 629
Khurshid, A., 127
Kiefer, J., 276, 293, 306, 316, 322, 335, 409, 413, 442
King, M. L., 480
Klaassen, C., 488, 571, 582
Klett, G. W., 165, 201, 252
Knight, K., 655
Knott, M., 616
Koehn, U., 210
Kohne, W., 451
Kolassa, J., 668
Kolmogorov, A., 20, 585, 629
Kolodziejczyk, S., 317
Koopman, B., 47
Korn, E., 374
Koschat, M., 197
Koshevnik, Y., 582
Kotz, S., 98, 99, 114, 127, 145, 156, 159, 193, 197, 209, 281, 306, 307, 435
Kowalski, J., 18
Koziol, J. A., 248
Krafft, O., 86
Kraft, C., 320
Kruskal, W., 95, 129, 276
Kuebler, R. R., 57
Künsch, H. R., 687
Kusunoki, U., 391

Lahiri, S. N., 451, 687, 690
Lambert, D., 109, 328, 447, 690
Lane, D., 131
Laplace, P. S., 27, 107, 108, 480
LaRiccia, V., 607, 616
Latscha, R., 127
Laurent, A. G., 93
Lawless, J. F., 400
Layard, M. W. J., 300
Le Cam, L., 21, 54, 487, 488, 489, 507, 525, 526, 533, 549, 550, 582
Ledwina, T., 272, 582, 607, 630
Léger, C., 653
Lentner, M. M., 196
Levit, B., 582
Levy, K. J., 255
Lexis, W., 107, 108
Liang, K. Y., 403, 525
Lieberman, G. J., 127
Lin, S., 630
Lindley, D. V., 95
Linnik, Y. V., 335
Liseo, B., 526
Littell, R. C., 109
Liu, H., 287

Liu, R. Y., 108, 665, 687
Loh, W.-Y., 220, 323, 481, 626, 667, 668
Loomis, L. H., 331
Lorenzen, T. J., 293
Lou, W., 146
Louv, W. C., 109
Low, M., 582
Lyapounov, A. M., 95

Maatta, J., 415
MacGibbon, K. G., 71, 115, 308
Mack, C., 127
Mack, G. A., 290
Madansky, A., 200
Maitra, A. P., 47
Mandelbaum, A., 118
Mann, H., 597
Manoukian, E. B., 481
Mantel, N., 549
Marasinghe, M. C., 290
Marcus, R., 374, 385
Marden, J., 18, 108, 109, 135, 141, 287, 318, 403, 404
Mardia, K. V., 159
Maritz, J. S., 203
Marshall, A. W., 308, 323
Martín, A., 127
Martin, M., 668
Mason, D., 655
Massart, P., 442
Massey, F. J., 586, 587
Mathew, M., 300
Mattner, L., 118
McCullagh, P., 318, 590, 668
McCulloch, C., 293
McDonald, L. L., 127
McKean, J., 286-287, 287, 446, 448
McLaughlin, D. H., 421
McShane, L., 374
Mee, R., 268
Meeks, S. L., 408
Meng, X., 108
Michel, R., 121
Milbrodt, H., 616, 622
Millar, W., 526, 629, 658
Miller, F. L., 616
Miller, F. R., 318
Miller, G., 145
Miller, J., 304, 316
Miller, R. G., 293, 390
Milliken, G. A., 127
Milnes, P., 334
Miwa, T., 391
Montgomery, D., 293
Morgan, W. A., 208
Morgenstern, D., 210
Morimoto, H., 21, 46
Mosteller, F., 141, 421
Möttönen, J., 318, 582
Mudholkar, G. S., 439
Müller, C., 480
Munk, A., 157
Murphy, S., 526
Murray, L. W., 300
Muyot, M., 318

Nachbin, L., 331
Naiman, D. Q., 384
Narula, S. C., 255
Neill, J., 318
Nelder, J., 318
Neuhaus, G., 582, 629
Neyman, J., 27, 107, 108, 108, 149, 210, 211, 480, 597, 599, 600, 630
Nicolaou, A., 174
Niederhausen, H., 442
Nikitin, Y., 539, 582, 622, 629
Nikulin, M., 598, 630
Noether, G., 582
Nogales, A., 220
Nölle, G., 700

Odén, A., 179
Oja, H., 318, 582
Ojeda, M., 297
Olkin, I., 109, 308, 323
Olshen, R. A., 408
Oosterhoff, J., 534, 540, 582, 630
Ord, J., 27, 316, 439, 597
Owen, A., 442, 526, 673, 690
Owen, D. B., 127, 193, 306, 447
Oyola, J., 220

Pace, L., 220
Pachares, J., 114
Padmanabhan, A., 446

Patel, J. K., 209
Pauls, T., 690
Pawitan, Y., 526, 549
Pearson, E. S., 27, 107, 108, 149, 210, 281, 480
Pearson, K., 107, 526, 590, 629
Pedersen, K., 47
Pena, E., 630
Pereira, B., 220
Perez, P., 220
Peritz, E., 374, 379, 385
Perlman, M., 157, 233, 287, 403, 404, 506, 561
Pesarin, F., 448, 690
Peters, D., 582
Pfanzagl, J., 65, 67, 162, 231, 581, 582
Piegorsch, W. W., 384
Pierce, D. A., 414, 415
Pitman, E. J. G., 47, 210, 276, 414, 582
Plachky, D., 118, 700
Plackett, R. L., 134, 525
Pliss, V. A., 335
Politis, D. N., 658, 674, 676, 679, 680, 687, 690, 691
Pollard, D., 424, 471, 484, 519, 585
Pollard, K., 690
Polonik, W., 629
Posten, H. O., 447
Pratt, J. W., 99, 150, 200, 245, 413, 418
Prescott, P., 287
Price, B., 169
Puig, P., 144
Pukelsheim, F., 293

Quenouille, M., 691
Quesenberry, C. P., 260, 263, 616, 629
Quine, M. P., 630

Radlow, R., 590
Ramachandran, K. V., 166
Ramamoorthi, R. V., 21, 54
Ramsey, P. H., 447
Randles, R., 251, 286, 539, 582, 589, 630
Rao, C. R., 11, 526, 630
Rao, P., 488
Rao, R., 460, 481, 668
Rayner, J., 616, 630
Read, C. B., 209
Read, T., 629
Reid, C., 27, 108
Reid, N., 214
Reinhardt, H. E., 86
Reiser, B., 201
Riani, M., 169, 318
Richmond, J., 384, 390
Rieder, H., 328, 582
Ripley, B., 443
Ritov, Y., 488, 571, 582
Robbins, H., 696
Robert, C., 15, 108, 173, 173, 175
Robertson, T., 287
Robins, J., 109
Robinson, G., 188, 202, 231, 408, 408, 415
Robinson, J., 180, 188, 630, 690
Rojo, J., 23, 101
Ronchetti, E., 287, 526
Rosenstein, R. B., 306
Rosenthal, R., 58
Ross, S., 5
Roters, M., 354, 391
Rothenberg, T. J., 157, 448, 480
Roussas, G., 525, 543, 582
Roy, K. K., 21, 54
Roy, S. N., 391
Royden, H. L., 700
Rubin, D. B., 58
Rubin, H., 69, 248, 450
Rukhin, A., 220
Runger, G., 211, 474
Ruppert, D., 318
Rüschendorf, L., 118

Sackrowitz, H., 108, 210, 237, 287, 341, 629
Sahai, H., 127, 297
Salaevskii, O. V., 335
Salaevskii, Y., 335
Salvan, A., 220
Samuel-Cahn, E., 108
Sanathanan, L., 58
Sarkar, S. K., 354
Savage, L. J., 16, 27, 58, 141, 414, 466, 481, 499, 575

Schafer, G., 95
Scheffé, H., 149, 166, 231, 293, 316, 317, 318, 374, 448, 480, 696
Schervish, M., 27
Schick, A., 582
Scholz, F. W., 109
Schuirmann, D., 561
Schwartz, R., 276, 306, 316
Schweder, T., 109
Seal, H. L., 317
Searle, S., 293
Seber, G. A. F., 318
Seidenfeld, T., 175
Self, S. G., 525
Sellke, T., 95, 108
Sen, P., 33, 210, 245, 256, 286, 525, 582, 588, 590
Serfling, R. H., 245, 539, 574, 582, 648
Serroukh, A., 582
Severini, T., 174
Shaffer, J. P., 109, 124, 311, 360, 365, 371, 373, 374, 390, 391, 525
Shaikh, A. M., 374
Shao, J., 27, 648, 690
Shapiro, S. S., 629
Sheather, S., 287
Sherfey, B., 318
Shewart, W., 480
Shorack, G., 200, 512, 588, 590, 622, 629
Shorrock, G., 201
Shrader, R. M., 286-287
Shriever, B. F., 630
Shuster, J., 100
Sidák, Z., 33, 245, 256, 286, 525, 582, 588, 590
Siegel, A. F., 96, 143, 470
Siegmund, D., 9, 588
Sierpinski, W., 323
Silvapulle, M., 513
Silvapulle, P., 513
Silvey, S. D., 293
Simon, R., 374
Simpson, D., 582
Singh, K., 108, 665, 687, 690
Singh, M., 448
Sinha, B., 300, 314
Skillings, J. H., 290
Small, C., 507
Smirnov, N. V., 256
Smith, A., 16
Smith, D. W., 300
Smith, H., 57
Sophister (G. Story), 480
Speed, F. M., 292
Speed, T., 210, 318
Spiegelhalter, D. J., 629
Spjøtvoll, E., 109, 293, 300, 370, 374
Sprott, D. A., 106
Spurrier, J. D., 391, 639
Srivastava, M. S., 481
Starbuck, R. R., 260, 263
Staudte, R., 108
Stein, C. M., 89, 210, 237, 276, 281, 335, 336, 347, 539, 570, 582
Stephens, M., 589, 616, 628, 629
Stern, S., 517
Stigler, S., 107, 480
Still, H. A., 168
Stone, C. J., 11, 539
Stone, M., 334
Strassen, V., 328
Strasser, H., 27, 264, 616, 622
Strawderman, W. E., 69, 239
Stuart, A., 27, 316, 439, 597, 629, 630
Student (W. S. Gosset), 448
Sugiura, N., 245
Sun, J., 391
Suslina, I., 622, 624
Sutton, C., 443
Swed, F. S., 146

Takeuchi, K., 93
Tallis, G. M., 629
Tamhane, A., 353, 354, 368, 373, 374, 391
Tan, W. Y., 445
Tang, D., 287
Taniguchi, M., 582
Tanur, J., 210
Tapia, J., 127
Tate, R. F., 165, 201, 252
Taylor, H., 5, 22
Thomas, D. L., 210
Thombs, L. A., 451
Tiao, G. C., 304, 421
Tibshirani, R., 318, 439, 668, 686, 690

Tienari, J., 582
Tiku, M. L., 281, 306, 307, 446, 448
Tiwari, R., 630
Tong, Y. L., 145
Tritchler, D., 190
Troendle, J., 374, 391
Tseng, Y., 336
Tu, D., 648, 690
Tukey, J. W., 20, 77, 291, 374, 391, 421, 474, 480, 691
Turnbull, H., 39
Tweedie, M. C. K., 197

Unni, K., 398
Uthoff, V. A., 259, 260

Vadiveloo, J., 190
van Beek, P., 428
van der Laan, M., 690
van der Vaart, A., 90, 109, 518, 526, 573, 574, 582, 585, 612, 629, 658
van Zwet, W. R., 534, 582, 677, 679, 690
Venable, T. C., 135
Ventura, V., 109
Vermeire, L., 339
von Mises, R., 629
von Randow, R., 334
Vu, H., 526

Wacholder, S., 149
Wald, A., 18, 27, 78, 95, 108, 347, 506, 526, 543, 582, 597
Wallace, D., 231, 415
Wand, M. P., 318
Wang, H., 414
Wang, J., 507
Wang, Q., 145
Wang, Y., 175
Wang, Y. Y., 231, 448
Webster, J. T., 231
Wedel, H., 179
Wefelmeyer, W., 581, 582
Weinberg, C. R., 149
Weisberg, S., 169, 318
Welch, B. L., 380, 413
Welch, W., 210, 448
Wellek, S., 582
Wellner, J., 488, 512, 571, 574, 582, 585, 588, 590, 612, 622, 629, 658
Wells, M., 108, 630
Welsh, A. H., 629
Westfall, P. H., 109, 293, 300, 366, 375, 386, 391, 690
Westlake, W., 561
Wijsman, R., 218, 220, 331, 378, 391
Wilk, M. B., 293, 629
Wilks, S. S., 526
Williams, D., 55
Wilson, E. B., 108, 435
Winters, F., 480
Witting, H., 86
Wolf, M., 391, 469, 582, 658, 674, 676, 679, 680, 690, 691
Wolfe, D. A., 251, 286, 539
Wolfowitz, J., 76, 95, 442
Wolpert, R., 108, 400, 414, 415, 526
Working, H., 108, 391
Wright, A. L., 248
Wright, F., 287
Wu, C. F., 293, 677, 679, 691
Wu, L., 157, 233, 287, 561
Wu, Y., 11
Wynn, H. P., 378, 384

Yamada, S., 21, 46
Yanagimoto, T., 145
Yandell, B. S., 629
Yang, G., 487, 488, 525, 533, 582
Yang, Z., 507
Yeh, H. C., 447
Yekutieli, D., 354
Young, S., 109, 293, 375, 386, 391, 690
Yuen, K. K., 448

Zabell, S., 175
Zemroch, P. J., 159
Zhang, C., 526
Zhang, J., 481, 526, 630
Zhou, S., 526
Zinn, J., 655
Zucchini, W., 248

Subject Index

Absolute continuity (of one measure with respect to another), 33, 492. See also Equivalence, of two measures; Radon-Nikodym derivative
Accelerated, bias-corrected, percentile method, 685, 686
Action problem, 6
Adaptive test, 539
Additive linear models, 318
Additivity of effects, 287, 290; in model II, 298; test for, 290, 291
Admissibility, 17; and invariance, 26; Bayes method for proving, 236; of confidence sets, 239, 336; of multiple comparison procedures, 369, 370; of UMP invariant tests, 332; of UMP unbiased tests, 139, 232. See also α-admissibility; d-admissibility; Inadmissibility
Affinity, 530
Aligned ranks, 290
Almost everywhere (a.e.), 33, 115
Almost invariance, 23, 263; of likelihood ratio, 263; of tests, 225, 241; relation to invariance, 230; relation to invariance of power function, 230; relation to maximin tests, 329; relation to unbiasedness, 329. See also Invariance
Almost sure convergence, 440
Almost sure representation theorem, 443
Alaoglu’s theorem, 700
Alpha-admissibility, 233
Alternatives (to a hypothesis), 56
Amenable group, 334
Analysis of covariance, 297
Analysis of variance, 286, 292, 318; different models for, 297; for one-way layout, 286; for two-way layout, 288; history of, 317; robustness of F-tests, 446. See also Components of variance; Linear hypothesis; Linear model
Ancillary statistic, 152, 395, 400; and invariance, 395, 397, 401; and sufficiency, 397; history of, 414; in the presence of missing observations, 410; maximal, 397; paradox for, 414. See also S-ancillary
Anderson Darling statistic, 589, 612
Anderson’s nonparametric confidence interval for a mean, 468-469

Approximate hypotheses: extended Neyman Pearson lemma for, 326, 328
Arcsine transformation for binomial variables, 474
Association, 132; spurious, 133; Yule’s measure of, 129. See also Dependence, positive
Asymptotically linear statistic, 500
Asymptotically maximin tests: for multi-sided hypotheses, 564–567; for nonparametric mean, 567–570; for Chi-squared test, 593, 594
Asymptotically most powerful test sequence, 541
Asymptotically normal experiments, 549–553
Asymptotically perfect test, 432
Asymptotically uniformly most powerful (AUMP) tests: in univariate models, 540–549; in multiparameter models, 553–567; in nonparametric models, 567–574
Asymptotic equivalence of test sequences, 577
Asymptotic pivot, 646
Asymptotic relative efficiency, 534-540, 582; and goodness of fit, 621; of randomization tests, 639
Asymptotic normality: of functions of asymptotically normal variables, 436; of sample mean, 426; of sample median, 429. See also Central limit theorem
Asymptotic optimality, vii, 527
Attributes: paired comparisons by, 169, 291, 510, 526; sample inspection by, 80, 293
Automatic percentile method, 686
Autoregressive process, 450
Average power, maximum, 96, 308, 627

Bahadur efficiency, 539
Bahadur Savage theorem, 466, 467
Banach space, 696–698
Bartlett correction, 517; relationship to bootstrap, 671, 691
Bartlett’s test for variances, 481
Basu’s theorem, 152, 210
Bayesian confidence sets, see Credible regions
Bayesian inference, 15, 172, 173, 175, 304
Bayes risk, 14
Bayes solution, 14, 23; and complete class of decision procedures, 18; restricted, 15; to maximize minimum power, 320; to prove admissibility, 236. See also Credible region; Prior distribution
Bayes sufficiency, 21
Bayes test, 94, 264, 309
Behrens-Fisher distribution, 202
Behrens-Fisher problem, 159, 231, 408, 415, 420; bootstrap solution, 671, 672; LAUMP tests for, 558, 559; many sample, 448; nonparametric, 245; permutation test for, 642; under nonnormality, 447. See also Welch-Aspin test
Berry-Esseen theorem, 428; multivariate, 604
Beta distribution, 159, 280; as distribution of order statistics, 266; noncentral, 280, 307; relation to F-distribution, 159; relation to gamma distribution, 196
Bimeasurable transformation, 214
Binomial distribution b(p, n), 4; in comparing two Poisson distributions, 153; as loglinear model, 134; completeness of, 116; in comparing two Poisson distributions, 125, 398; in sign test, 85; variance stabilizing transformation for, 474. See also Contingency tables; Multinomial distribution; Negative binomial distribution; Two by two table
Binomial probabilities: comparison of two, 106, 126, 145, 149, 399; confidence bounds for, 75; confidence intervals for, 167, 434, 435; credible region for, 172; one-sided test for, 67; two-sided test for, 113. See also Contingency tables; Independence, test

for; Matched pairs; Median; Paired comparisons; Sample inspection; Sign test
Binomial trials, 8, 18, 134; minimal sufficient statistic for, 26. See also Inverse sampling
Bioassay, 147
Bivariate distribution (general): one-parametric family for, 191; testing for independence in, 192, 271. See also Dependence
Bivariate normal correlation coefficient: asymptotic test for, 512; confidence intervals for, 201; test for, 201, 231, 261, 397; confidence bounds for, 273
Bivariate normal distribution, 190, 207; ancillary statistics in, 397; joint distribution of second moments in, 208; test for independence in, 190
Bonferroni procedure, 350, 385
Bootstrap, vii, 648; consistency of, 650; higher order properties, 664; hypothesis testing, 668; in multiple testing, 658
Bootstrap calibration, 667
Bootstrap-t: consistency of, 654; higher order properties, 665–667
Bounded-Lipschitz metric, 471
Borel set, 29
Bounded completeness, 118, 228; example of, without completeness, 141. See also Completeness of family of distributions
Brownian Bridge process, 585, 588

Calibration, 667
Cauchy distribution, 71, 99, 324, 339
Cauchy location model: AUMP and LAUMP tests for, 547, 548; q.m.d. property, 487
Causal influence, 132
CDF, see Cumulative distribution function
Center of symmetry: confidence intervals for, 203, 206. See also Symmetric distribution
Central limit theorems: for dependent variables, 448, 449; for linear combinations, 452; for sample median, 429; Lindeberg, 427; Lyapounov, 427; multivariate, 427; uniform, 463, 465
Characteristic function, 426
Chebyshev inequality, 472
Chi-squared distribution, 47; for testing linear hypothesis with known variance, 310; in testing normal variance, 114, 155; limit for likelihood ratio, 515, 516; non-central, 306, 308, 311; relation to exponential distribution, 54; relation to F-distribution, 158; relation to t-distribution, 156. See also Gamma distribution; Normal one-sample problem, variance; Wishart distribution
Chi-squared test: as a Neyman smooth test, 601; asymptotically maximin property, 593, 594; for simple hypotheses, 420, 514, 515, 590–597; for composite hypotheses, 597–599; in contingency tables, 626; for testing uniformity, 594–597
Closure method for multiple testing, 385
Cluster sampling, 449
Cochran-Mantel-Haenszel test, 135
Coefficient of variation: asymptotic confidence interval for, 509; confidence bounds for, 273; tests for, 157, 222, 230
Comparison of experiments, 136, 204
Complement of a set E, denoted E^c, 28
Completeness of a class of decision procedures, 17, 18, 108; for one-parameter exponential family, 141; of classes of one-sided tests, 69; of class of two-sided tests, 140; relation to sufficiency, 21. See also Admissibility
Completeness of family of distributions, 115; of binomial distributions, 116; of exponential families, 117; of nonparametric family, 118; of normal distributions,

116; of order statistics, 118, 143; relations to bounded completeness, 118, 141; of uniform distributions, 116
Completion of measure, 29
Complexity: of multiple comparison procedure, 373
Components of variance, 303. See also Random effects model
Composite hypothesis, 59; vs. simple alternative, 84
Conditional distribution, 40, 41; example of nonexistence, 40
Conditional expectation, 37, 42; properties of, 39
Conditional independence: test for, 133
Conditional inference, 393, 394, 408; optimal, 400
Conditional power, 123, 138, 188, 398, 400
Conditional probability, 39, 40
Conditional test, 549–553
Confidence bands: for cumulative distribution function, 255, 276; for linear models, 375; for regression line, 384, 391. See also Simultaneous confidence intervals
Confidence bounds, 72; equivariant, 272; impossible, 300, 408; in presence of nuisance parameters, 162; most accurate, 72; relation to median unbiased estimates, 162; relation to one-sided tests, 163; standard, 76; with minimum risk, 102
Confidence coefficient, 72, 162; conditional, 408
Confidence intervals, 6, 76, 162; after rejection of a hypothesis, 140, 408; distribution-free, 189, 203, 251; empty, 300; expected length of, 170; history of, 108, 211; in randomization models, 188; interpretation of, 162; logarithmically shortest, 252; loss functions for, 76; of bounded length, 197, 198; randomized, 166; relation to two-sided tests, 163; uniformly most accurate unbiased, 165. See also Simultaneous confidence intervals
Confidence level, 72
Confidence sets, 72; admissibility of, 239, 335; average smallest, 251; based on multiple tests, 391; derived from a pivotal quantity, 254; equivariant, 248, 336; example of inadmissible, 336; minimax, 335, 336; of smallest expected Lebesgue measure, 200; relation to tests, 171; unbiased, 164; which are not intervals, 225. See also Credible region; Equivariant confidence sets; Relevant and semirelevant subsets; Simultaneous confidence sets
Conjugate distribution, 173
Conservative test, 127
Consumer preferences, 135
Contiguity, 492–494; and limiting distribution of a statistic, 499, 500; characterizations of, 496, 497; examples of, 498–503
Contingency tables: loglinear models for, 134; r × c tables, 127; three factor, 132; 2 × 2 × K, 138, 148; 2 × 2 × 2, 139; 2 × 2 × 2 × L, 148. See also Two by two tables
Continuity correction, 127
Continuity point, 425
Continuity theorem, 426
Continuous Mapping theorem, 435, 436
Consistent estimator, 432
Contrasts, 382, 472
Convergence in distribution (or in law), 425
Convergence in probability, 431
Convergence of moments, 443, 444
Convergence theorem: for densities, 696; dominated, 32; monotone, 32. See also Central limit theorem; Continuity theorem; Continuous mapping theorem; Cramér-Wold theorem; Delta method; Glivenko-Cantelli theorem; Prohorov’s theorem
Cornish-Fisher expansion, 460, 663
Correlation coefficient: in bivariate normal distribution, 190,

548, 549, 557; confidence bounds for, 273; intraclass, 313; multiple tests of, 661; nonparametric bootstrap test of, 670; testing value of, 190, 231, 261. See also Bivariate distribution; Dependence, positive; Multiple correlation coefficient; Rank correlation coefficient; Sample correlation coefficient
Countable additivity, 28
Countable generators of σ-field, 699
Counting measure, 29
Covariance matrix, 89, 305
Coverage error, 662–668
Cramér’s condition, 459
Cramér-von Mises statistic, 459; limiting distribution, 616; as a weighted quadratic statistic, 611, 612
Cramér-Wold device, 426
Credible region, 172, 173; highest probability density, 173, 175, 202
Critical function, 58
Critical region, 56
Cross product ratio, see Odds ratio
Cumulative distribution function (cdf), 30, 52, 424; confidence bands for, 255, 276; empirical, 245, 255; inverse of, 266. See also Kolmogorov test for goodness of fit; Probability integral transformation

d-admissibility, 233, 264. See also Admissibility
Data Snooping, 378
Decision problem: specification of, 4
Decision space, 4, 5
Decision theory, 27, 28; and inference, 6
Deficiency, 157
Delta method, 436–439
Density point, 185
Dependence: measure of, 129; models for, 448–451; positive, 145; positive quadrant, 145; regression, 191, 240. See also Correlation coefficient; Independence
Design of experiments, 8, 9, 130, 204, 293. See also Random assignment; Sample size
Directional error, 139, 140, 373
Direct product (of two sets), 33
Dirichlet distribution, 202
Distribution, see the following families of distributions: Beta, Binomial, Bivariate normal, Cauchy, Chi-squared, Dirichlet, Double exponential, Exponential, F, Gamma, Hypergeometric, Inverse Gaussian, Logistic, Lognormal, Multinomial, Multivariate normal, Negative binomial, Noncentral, Normal, Pareto, Poisson, Polya, Power series, t, Hotelling’s T², Triangular, Uniform, Weibull, Wishart. See also Exponential family; Monotone likelihood ratio; Total positivity; Variation diminishing
Dominated convergence theorem, 32
Dominated family of distributions, 45, 698, 699
Domination: of one procedure over another, 17. See also Admissibility; Inadmissibility
Double exponential distribution, 259, 323, 342; AUMP and LAUMP property, 546, 547; locally most powerful test in, 342; q.m.d. property, 487; UMP conditional test in, 402
Duncan multiple comparison procedure, 368
Dunnett’s multiple comparison method, 390
Dvoretzky, Kiefer, Wolfowitz inequality, 442

EDF, see Empirical distribution function
Edgeworth expansions, 459–462, 481, 662
Efficacy, 536
Efficient likelihood estimation, 504
Elliptically symmetric distribution, 314

Empirical cumulative distribution function, 245, 255, 441; statistics, 589
Empirical likelihood, 673, 690, 691
Empirical measure, 475, 589
Empirical process, 585, 588, 658
Envelope power function, 262, 337. See also Most stringent test
Equi-tailed confidence interval, 649
Equivalence: of family of distributions or measures, 45; of statistics, 26; of two measures, 51
Equivalence classes, 69
Equivalence hypotheses, 81, 90–92; LAUMP tests for, 559–564
Equivalence relation, 692
Equivariance, 13, 396. See also Invariance
Equivariant confidence bands, 255, 376, 384, 390
Equivariant confidence bounds, 272
Equivariant confidence sets, 248, 251, 252, 272, 273, 276; and pivotal quantities, 274. See also Uniformly most accurate confidence sets
Error control: strong, 350; weak, 350
Error of first and second kind, 57, 66; of type 3, 139; familywise error rate, 349; directional, 373
Essentially complete class of decision procedures, 17, 54, 69, 96. See also Completeness of a class of decision procedures
Estimation, see Confidence bands; Confidence bounds; Confidence intervals; Confidence sets; Equivariance; Maximum likelihood; Median; Point estimation; Unbiasedness
Euclidean sample space, 41
Exchangeable, 355
Expectation (of a random variable), 33, 39; conditional, 37, 39, 42
Expected order statistics, 243
Experimental design, see Design of experiments
Exponential distribution, 22, 68, 74; confidence bounds and intervals in, 74; order statistics from, 54; relation to Pareto distribution, 94; relation to Poisson process, 54, 68; sufficient statistics for, 27; testing against gamma distribution, 200; testing against normal or uniform distribution, 260; tests for parameters of, 93, 195; two-sample problem for, 259. See also Chi-squared distribution; Gamma distribution; Life testing
Exponential family, 46, 55; admissibility of tests in, 234; completeness of, 117; differentiability of, 49; equivalent forms of, 123; expansion of loglikelihood, 483, 484; median unbiased estimators in, 162; moments of sufficient statistics, 55; monotone likelihood ratio of, 67; natural parameter space of, 48, 55, 119; q.m.d. property, 488; regression models for, 210; testing in multiparameter, 119, 121, 123, 234; total positivity of, 104. See also One-parameter exponential family
Exponential waiting times, 22, 54, 74. See also Exponential distribution
Extreme order statistic, 678, 679

Factorization criterion for sufficient statistics, 19, 45, 46
False discovery rate, 354, 386
Family of hypotheses, 349, 374
Familywise error rate (FWER), 349, 354, 355, 372, 386; control based on bootstrap, 658–661
Fatou’s Lemma, 32
F-distribution, 158; for simultaneous confidence intervals, 381; in Hotelling’s T²-test, 306; in tests and confidence intervals for ratio of variances, 166, 299; noncentral, 307; relation to beta distribution, 159. See also F-test for linear hypothesis; F-test for ratio of variances
Fiducial, 108; distribution, 175; probability, 108, 175
Fieller’s problem, 197

Finite decision problem, 54
First-order accurate, 666
Fisher’s exact test, 127, 149. See also Two by two tables
Fisher Information, 485, 486
Fisher’s least significant difference method, 368
Fisher linkage model, 598
Fisher’s z-transformation, 439
Fixed effects model, 297. See also Linear model; Model I and II
Free Group, 25
Frequentist point of view, 175
Friedman’s rank test, 290
F-test for linear hypothesis, 280; admissibility of, 281; as Bayes test, 309; for nested classification, 302; has best average power, 308; in Fisher’s least significant difference method, 368; in Gabriel’s simultaneous test procedure, 368; in mixed models, 426; in model II analysis of variance, 299; power of, 281; robustness of, 445, 446, 448, 480, 491. See also F-distribution
F-test for ratio of variances, 106, 107, 220, 238; admissibility of, 239; nonrobustness of, 446. See also F-distribution; Normal two-sample problem, ratio of variances
F-test in multiple comparison procedures, 366
Fubini’s theorem, 34
Fully informative statistics, 96
Functionals, 571
Fundamental lemma, see Neyman-Pearson fundamental lemma

Gabriel’s simultaneous test procedure, 368
Gamma distribution Γ(g, b), 99, 196; relation to Beta distribution, 196; scale parameter of, 201; shape parameter of, 196. See also Beta distribution; Chi-squared distribution; Exponential distribution
Gaussian curvature, 341
Generalized linear models, 318
Ghosh-Pratt identity, 200
Glivenko-Cantelli theorem, 441
Goodness of fit test, vii, 256, 583; bootstrap tests of, 673; in multinomial models, 514–516. See also Chi-squared tests; Kolmogorov-Smirnov; Neyman’s smooth tests; Separate families; Weighted quadratic tests
Group: amenable, 334; free, 25; generated by subgroups, 217; linear, 216, 227, 334; of monotone transformations, 215; orthogonal, 215, 217, 330; permutation, 215; scale, 215; transformation, 212, 213; transitive, 215, 220; translation, 215, 219, 333. See also Equivariance; Invariance
Group, 692, 693; family, 395, 401
Guaranteed power: achieved through sequential procedure, 124, 126, 198, 199

Haar measure, 227, 331
Hazard ordering, 101
Hellinger distance, 530–534, 582
Hierarchical classification, see Nested classification
Higher order asymptotics, 661–668
Highest probability density (HPD) credible region, 173, 175, 202
Hilbert space, 696–698
Hodges-Lehmann efficiency, 539
Hodges’ superefficient estimator, 525
Holm procedure for multiple testing, 350, 351, 363, 385
Homogeneity of means: tests of, 285; against ordered alternatives, 287; multiple comparisons for, 364, 366; for normal means, 285, 287; nonparametric, 286, 290, 458. See also Multiple comparisons
Homomorphism, 12
Hotelling’s T²-test, 306; admissibility of, 317; as Bayes solution, 317; minimaxity of, 335
HPD region, see Highest probability density
Huber condition, 455
Hunt-Stein theorem, 331

Hypergeometric distribution, 66, 134; in testing equality of two binomials, 127; in testing for independence in a 2 × 2 table, 131; relation to distribution of runs, 146. See also Fisher’s exact test; Two by two tables
Hypergeometric function, 209
Hypothesis testing, 5, 56; history of, 107; loss functions for, 59, 69, 222; without stochastic basis, 131, 132
Improper prior distribution, 172
Inadmissibility, 17; of confidence sets for vector means, 335; of likelihood ratio test, 263; of UMP invariant test, 306. See also Admissibility
Independence: conditional, 133; of sample mean from function of differences in normal samples, 152; of statistic from a complete sufficient statistic, 152; of sum and ratio of independent χ² variables, 153; of two random variables, 34
Independence, test for: in bivariate normal distribution, 191; in nonparametric models, 241, 271; in r × c contingency tables, 127; in two by two tables, 127–130
Indicator function of a set, 33
Indifference zone, 320
Inference, statistical, see Statistical inference
Information matrix, 485, 486
Integrable function, 31
Integration, 31
Interaction, 291, 292, 311; as main effects, 311; in random effects and mixed models, 313, 314; test for absence of, 291
Interval estimation, see Confidence intervals
Into, see Transformation
Intraclass correlation coefficient, 313
Invariance: of decision procedure, 12, 13; of likelihood ratio, 341; of measure, 299, 518, 519; and admissibility, 26; and ancillarity, 395, 397, 401; and symmetry, 212; history of, 276; of likelihood ratio, 262; of measure, 227; of power functions, 227–229; of tests, 214, 276; principle of, 214; relation to equivariance, 13; relation to minimax principle, 25, 329; relation to sufficiency, 220; relation to unbiasedness, 23, 229, 230; warning against inappropriate use of, 286. See also Almost invariance; Equivariance
Invariant measure, 227, 230; over orthogonal group, 330; over translation group, 333
Inverse Gaussian distribution, 100, 197
Inverse sampling: for binomial trials, 67; for Poisson variables, 68, 98. See also Negative binomial distribution; Poisson process; Waiting times
Jackknife, 648, 674
Joint confidence rectangles, 657. See also Simultaneous confidence sets
Kendall’s statistic, 272
k-FWER, 374, 386
Kolmogorov-Smirnov: and bootstrap confidence bands, 658; asymptotic behavior of, 441, 442, 584–589; based on a pivot, 645; extensions of, 589–590; statistic, 256; test for goodness of fit, 256. See also Goodness of fit
Kolmogorov-Smirnov distance, 441
Kruskal-Wallis test, 286
Kullback-Leibler information (or divergence), 432; backward, 672
Kurtosis, 459
Large-sample theory, vii, 417
Latin squares design, 293, 312
Lattice distribution, 459
Laws of large numbers: Weak, 431; Strong, 441; Uniform, 463, 464
Least favorable distribution, 18, 84, 85, 86, 321, 361
Least squares estimates, 281

Lebesgue convergence theorems, 39
Lebesgue integral, 31
Lebesgue measure, 29
Legendre polynomials, 599, 600
Level of significance, see Significance level
Lévy distance, 430
Life testing, 54. See also Exponential distribution; Poisson process
Likelihood, 16; function, 503, 504. See also Maximum likelihood
Likelihood ratio, 15, 101, 494; censored, 326; invariance of, 262; large-sample theory of, 494, 503; monotone, 65; preference order based on, 60, 66; sufficiency of, 53. See also Monotone likelihood ratio
Likelihood ratio test, 16; example of inadmissible, 263; large-sample theory of, 513–517; using bootstrap critical values, 670, 671
Lindley’s Paradox, 95
Linear functionals, 571; LAUMP property, 572–574
Linear hypothesis, 277, 333; admissibility of test for, 281; Bayes test for, 309; canonical form for, 278, 317; F-test for, 200; inhomogeneous, 283; more efficient tests for, 287; parametric form of, 284, 309; power of test for, 280; properties of test for, 280, 308, 333, 338, 341; reduction of, through invariance, 279; robustness of tests for, 451–458. See also Analysis of variance; Additive linear model; Generalized linear model
Linear model, 277, 318; confidence intervals in, 309; history of, 317; simultaneous confidence intervals in, 380
Locally asymptotically uniformly most powerful (LAUMP): for equivalence hypotheses, 559–564; for one-sided hypotheses in multiparameter models, 553–559; in nonparametric models, 572; in univariate models, 544–549
Locally most powerful rank test, 244, 275
Locally optimal tests, 322, 339, 340, 403, 511
Locally unbiased, 340
Local power, 433; of t-test, 465, 466
Location families (or models), 70, 100, 396; are stochastically increasing, 70; comparing two, 219; conditional inference for, 414; condition for monotone likelihood ratio, 323, 401; example lacking monotone likelihood ratio, 71; LAUMP tests for, 546–548; strongly unimodal, 401
Location-scale families, 12; confidence intervals based on pivot, 645; comparing two, 258; LAUMP tests in, 557. See also Normality, testing for
Log convexity, 323, 412
Logistic distribution, 134, 323, 402
Logistic response model, 134
Loglikelihood ratio, 483; expansion due to Le Cam, 489–491
Loglinear model, 134, 318
Lognormal family, 488
Loss function, 3, 7; in confidence interval estimation, 23, 72, 76; in hypothesis testing, 69, 141, 222; monotone, 76
L^p-space, 697, 698
Main effects, 287, 292; as interactions, 311; confidence sets for, 289; tests for, 287, 291
Mallows metric, 654
Mantel-Haenszel test, 135
Markov chain, 145
Markov property, 145
Markov’s inequality, 472
Matched pairs, 138, 183, 221, 239, 324; comparison with complete randomization, 149; confidence intervals for, 189; rank tests for, 242, 246
Maximal invariant, 214; ancillarity of, 395; distribution of, 218; method for determining, 216; obtained in steps, 217

Maximin multiple tests, 354, 357, 358, 360
Maximin test, 320; by Hunt-Stein theorem, 333; existence of, 338; local, 322; relation to invariance, 329. See also Least favorable distribution; Minimax principle; Most stringent test
Maximum likelihood, 16, 17, 504–508; in normal model, 504, 505; in exponential family models, 505. See also Likelihood ratio test
Maximum modulus confidence intervals, 379
McNemar’s test, 138, 149
Measurable: function, 30; set, 29; space, 29; transformation, 30, 34
Measure, 29
Median, confidence bounds for, 105
Median unbiasedness, 22; relation to confidence bounds, 162
Meta-analysis, 109
Metric space, 527, 571, 694. See also Hellinger; Kolmogorov-Smirnov; Kullback-Leibler; Lévy; Mallows; Prohorov; Total variation
Minimal complete class of decision procedures, 17. See also Completeness of a class of distributions; Essentially complete class of decision procedures
Minimal sufficient statistic, 21
Minimum Chi-squared estimator, 597
Minimax principle, 15, 347; and least favorable distribution, 18; in confidence estimation, 335; relation to invariance, 25; relation to unbiasedness, 24. See also Maximin test; Restricted Bayes solution
Minkowski’s inequality, 697
Missing observations, 410
Mixed model, 297, 304, 314, 315
Mixtures of experiments, 392, 394, 395, 410, 414
MLR, see Monotone likelihood ratio
Model I and II, 297. See also Mixed model; Random effects model
Model selection, 11
Monotone class of sets, 50
Monotone convergence theorem, 32
Monotone decision rule, 355, 357, 387
Monotone likelihood ratio, 65, 69, 101, 104; mixtures of distributions with, 341, 401, 403; necessary and sufficient condition for, 98; of differences, 402; of distribution of correlation coefficient, 261; of exponential family, 67; of location families, 323, 401, 402; of noncentral χ² and F, 307; of noncentral t, 224; of scale families, 324; relation to total positivity, 103; tests and confidence procedures in the presence of, 65, 69, 73. See also Stochastically increasing
Monotone loss function, 76
Monte Carlo simulation, 442, 443; for bootstrap, 649; for subsampling, 679
Mortality, see Hazard ordering
Most stringent test, 276, 337; existence of, 346
Moving average process, 450
Moving blocks bootstrap, 687
Multinomial distribution, 47, 202; as conditional distribution, 54; Dirichlet prior for, 202; for entries of 2 × 2 table, 128
Multinomial model: maximum likelihood estimation in, 514, 515; testing a composite hypothesis in, 597, 598; testing a simple hypothesis in, 514–516, 590–597; for 2 × 2 table, 128, 130; for three-factor contingency table, 133. See also Chi-squared test; Contingency tables
Multiple comparison procedures, iii, 293, 343; complexity of, 373; history of, 391; interpretability of, 372; significance levels for, 368, 370, 371. See also Duncan and Dunnett multiple comparison methods; Newman-Keuls multiple comparison procedure; Simultaneous

confidence intervals; Stepdown procedures; Stepup procedures; Tukey levels; Tukey’s T-method
Multiple decision procedures, 5. See also Multiple comparisons; Multiple testing; Three-decision problems
Multiple testing, iii, 293, 348; history of, 391; maximin procedures, 354
Multiplicity problem, 349
Multivariate cumulative distribution function, 424
Multivariate linear hypothesis, 306, 318. See also Linear hypothesis
Multivariate mean: nonparametric confidence regions based on bootstrap, 655, 656; multiple testing for, 661
Multivariate normal distribution, 89, 304, 426; testing linear combination of means, 90; tests for, 345, 513, 514. See also Bivariate normal distribution
Multivariate normal one-sample problem, the mean: confidence intervals for, 415; tests for, 305, 335, 353. See also Hotelling’s T²-test; Simultaneous confidence sets
Multivariate t-distribution, 275
Natural parameter space of an exponential family, 48, 55, 119
Negative binomial distribution, 22, 68, 144
Neighborhood model, 326, 328
Nested classification, 301, 313
Nested rejection regions, 63, 96, 105
Newman-Keuls multiple comparison procedure, 368, 370
Newton’s identities, 39
Neyman-Pearson fundamental lemma, 60, 108; approximate version of, 326; generalized, 77, 108
Neyman-Pearson statistic, 503
Neyman’s smooth tests, 599–601; large sample behavior, 601–607
Neyman structure, 115, 118
Noncentral: beta distribution, 280, 307; χ²-distribution, 306, 311; F-distribution, 307; t-distribution, 156, 161, 193, 224
Noninformative prior, 172
Nonparametric: independence problem, 191, 240, 242; many-sample problem, 286; methods for linear hypotheses, 290; one-sample problem, 118; test in two-way layout, 290. See also Permutation test; Rank tests; Sign test
Nonparametric mean, 420, 459; and the Bahadur-Savage result, 466–468; and the bootstrap, 653, 655; and Edgeworth expansions, 459–462; and the t-test, 462–466; asymptotic maximin and LAUMP property, 567–574; confidence intervals for based on a root, 646, 647; resampling-based tests for, 672, 673. See also Multivariate mean
Nonparametric two-sample problem, 130, 176, 242; confidence intervals in, 188, 203, 268; omnibus alternatives, 245; universally unbiased test in, 269. See also Normal scores test; Wilcoxon test
Nonparametric test, 85
Nonparametric variance, LAUMP property, 574
Normal approximation, order of error, 663, 664
Normal distribution N(ξ, σ²), 5, 86; loglikelihood for, 483; testing against Cauchy or double exponential, 259; testing against uniform or exponential, 260. See also Bivariate normal distribution; Multivariate normal distribution
Normality, testing for, 260, 589. See also Normal distribution
Normal many-sample problem: confidence sets for vector means, 252, 336, 366, 375, 378; tests for means, 285, 399. See also Homogeneity of means, tests of
Normal one-sample problem, the coefficient of variation:

confidence intervals for, 273; test for, 157, 224, 294, 303
Normal one-sample problem, the mean: admissibility of test for, 235; AUMP test for, 555, 556; confidence intervals for, 163, 250, 405; credible region for, 172, 174; Edgeworth expansion for t-statistic, 517; LAUMP test of equivalence with unknown variance, 563, 564; likelihood ratio test for, 87; median unbiased estimate of, 164; nonexistence of test with controlled power, 157; nonexistence of UMP test for, 89; optimum test for, 92, 155, 156, 260, 283, 401; test for, based on random sample size, 95; two-stage confidence intervals for, of fixed length, 198, 199; two-stage test for, with controlled power, 199; two-sided test for, 260; sequential confidence intervals for, 163, 199. See also Matched pairs; t-test
Normal one-sample problem, the variance: admissibility of test for, 238; conditional confidence intervals for, 415; confidence intervals for, 165, 201; credible region for, 174; likelihood ratio test for, 87; optimum test for, 87, 92, 154, 220, 325
Normal response model, 134
Normal scores statistic, 269
Normal scores test, 243; optimality of, 243, 244
Normal subgroup, 257
Normal two-sample problem, difference of means: comparison with matched pairs, 204; confidence intervals for, 165; credible region for, 202; optimal tests for (with variances equal), 107, 160, 195, 225, 260, 284. See also Behrens-Fisher problem; Homogeneity of means, tests of; t-distribution; t-test
Normal two-sample problem, ratio of variances, 107, 157, 220, 238; confidence intervals for, 166, 254, 272; credible region for, 202; nonrobustness of test for, 446; test for, 107, 157, 259. See also F-test for ratio of variances; Ratio of variances
Nuisance parameters, 318, 402
Null set, 40
Odds ratio, 126, 399; most accurate unbiased confidence intervals for, 200. See also Binomial probabilities; Contingency table; Two by two tables
One parameter exponential family, 67, 81, 111; complete class for, 141; most stringent test in, 338
One-sided hypotheses, 65, 124
One-way layout, 285, 353; Bayesian inference for, 304; model II for, 297; nonparametric, 286. See also Homogeneity, tests of; Normal many-sample problem
Onto, see Transformation
Optimality, 9, 10
Orbit of transformation group, 214
Ordered alternatives, 287
Order notation O_P(1), o_P(1), 433; an  bn, 498; an ∼ bn, 535
Order statistics, 37, 38; as maximal invariants, 215; as sufficient statistics, 53, 176; completeness of, 118, 141; distribution of, 266; equivalent to sums of powers, 38; expected values of, 243; in permutation tests, 176
Orthogonal group, 215, 217, 330
Orthogonal: transformations, 194, 215; vector, 697
Orthonormal: system, 697; vector, 697
Paired comparisons, see Matched pairs
Pairwise sufficiency, 53
Parameter space, 3
Parameters, unrelated, see Variation independent parameters
Parametric bootstrap, 651–653; in Behrens-Fisher problem, 671, 672

Pareto distribution, 94, 196
Parseval’s identity, 697, 698
Partial ancillarity, 398, 399
Partial sufficiency, 106
Pearson’s Chi-squared test, see Chi-squared test
Percentile method, 685
Permutation group, 215
Permutation test, 130, 177, 187; approximated by standard t-test, 180, 447; as randomization test, 242, 635, 641–643; complete class, 186; computational methods for, 180; confidence intervals based on, 189, 203, 206; for testing independence, 192; history of, 210, 690; most powerful, 178; robustness of, 447, 638–643; most stringent, 346. See also Nonparametric; Randomization model
Pillai-Bartlett trace test, 463; robustness of, 465
Pitman asymptotic relative efficiency, see Asymptotic relative efficiency
Pivotal: method, 644–646; quantity, 253, 274
Plug-in estimate, 648
Point estimation, viii, 5, 7; equivariant, 13; history of, 27; unbiased, 14
Pointwise asymptotically level α: for confidence sets, 423; for tests, 422
Pointwise consistent in power, 423
Poisson distribution, 4, 6, 54; comparison of two, 125, 398; relation to exponential distribution, 27, 68, 98; square root transformation for, 474; sufficient statistics for, 19; sums of, 54. See also Exponential distribution; Poisson parameters; Poisson process
Poisson model: for 2 × 2 table, 130, 132; for 2 × 2 × K table, 133, 148
Poisson parameters: comparing two, 125, 398; confidence intervals for the ratio of two, 168; one-sided test for, 68, 98; one-sided test for sum of, 105
Poisson process, 4, 68, 98; and 2 × 2 tables, 130; confidence bounds for scale parameter, 74; distribution of waiting times in, 22; test for scale parameter in, 68, 98. See also Exponential distribution
Pólya’s theorem, 429
Pólya frequency function, 323
Population models, 132
Portmanteau theorem, 425
Positive dependence, see Dependence, positive
Positive part of a function, 31
Posterior distribution, 172; percentiles of, 175. See also Bayesian inference
Posterior probability, 94
Power function, 57; of invariant test, 228; of one-sided test, 68; of two-sided test, 82
Power of a test, 57, 98; conditional, 124, 399; unbiased estimation of, 123
Power series distribution, 142
Preference ordering of decision procedures, 10, 14
Prepivoting, 657, 668
Prior distribution, 14, 172; improper, 172; noninformative, 172. See also Bayesian inference; Least favorable distribution; Posterior distribution
Probability density (with respect to µ), 33; convergence theorem for, 696
Probability distribution of a random variable, 30. See also Cumulative distribution function (cdf)
Probability integral transformation, 97, 266
Probability measure, 39, 30
Product measure, 34
Prohorov’s theorem, 440
Projection, as maximal invariant, 216, 284
Pseudometric space, 694
P-value, 57, 63, 97, 98, 108; combination of, from independent experiments, 97,

109; for randomization test, 636; for randomized tests, 64; in multiple testing, 350, 364; in stepdown procedures, 360; properties of, 64, 139; versus fixed levels, 65
Quadrant dependence, 145, 210, 371, 372. See also Dependence, positive
Quadratic mean derivative, 484
Quadratic mean differentiable (q.m.d.) families, 484; examples of, 486, 488; loglikelihood expansion for, 489; properties of, 485–487
Quadrinomial distribution, 133
Quality control, 85, 223
Quantiles, 430, 649
Rao’s score tests, see Score tests
Radon-Nikodym derivative, 33, 51
Radon-Nikodym theorem, 33
Random assignment, 131, 182, 247, 293
Random effects model, 297; for nested classifications, 301, 313; for one-way layout, 297; for two-way layout, 313. See also Ratio of variances
Randomization, 8, 293; as basis for inference, 182; possibility of dispensing with, 95; relation to permutation test, 184; tests, 632–643. See also Random assignment; Randomized procedure
Randomization distribution, 637
Randomization hypothesis, 633
Randomization models, 132, 187; confidence intervals in, 188; history of, 210
Randomized procedure, 8; confidence intervals, 167; in conditioning, 414
Randomized test, 58; representation as nonrandomized test, 74
Randomness, hypothesis of, 270
Random sample size, 95, 142, 210
Random variable, 30
Rank correlation coefficient, 272
Ranks, 216; as maximal invariants, 216, 241; distribution under alternative, 265, 266; null distribution of, 242. See also Signed ranks
Rank-sum test, 147. See also Wilcoxon test
Rank tests, 241; as special case of permutation tests, 635, 636; in multivariate problems, 318; surveys of, 286. See also Nonparametric; Nonparametric two-sample problem; Symmetry; Trend
Ratio of variances: confidence intervals for, 166, 254, 272, 299, 558; in model II, 299; tests for, 157, 220, 259, 298, 412. See also F-test for ratio of variances; Homogeneity, tests of; Random effects model
Recognizable subsets, see Relevant subsets
Rectangular distribution, see Uniform distribution
Regression, 169, 318, 395; as linear model, 278, 293; comparing several lines, 295, 312; confidence band for, 384, 391; confidence intervals for coefficients, 223, 295; intercepts and ordinates of line, 170; polynomial, 278; robustness of tests for, 451–458; tests for coefficients, 169, 293; with both variables subject to error, 312. See also Trend
Regression dependence, 191, 240. See also Dependence, positive
Regular (estimator sequence), 508, 526
Relative efficiency, 539. See also Asymptotic relative efficiency
Relevant and semirelevant subsets, 175, 405, 406, 413; history of, 414, 415; randomized version of, 414; relation to Bayesian inference, 415
Restricted Bayes solution, 15
Riemann integral, 31
Risk function, 4, 9, 10
Robustness, 11, 347; against dependence, 448–451, 680; against F-test of means, 445,

446, 448, 480; of efficiency, 421; of general linear models tests, 451–458; of validity, 421; lack of, for F-test of variances, 446; lack of, for Chi-squared test of a normal variance, 445; of test of independence or lack of correlation, 476; for tests in two-way layout, 455; of t-test, 444, 445. See also Adaptive test; Behrens-Fisher problem; Permutation test; Rank tests
Root, 644
Runs test, for testing independence in a Markov chain, 145, 146
Sample, 5; haphazard, 181; stratified, 176, 182, 188
Sample correlation coefficient, 190, 207; distribution of, 209; limiting distribution of, 438; monotone likelihood ratio of distribution, 261; variance stabilizing transformation for, 438, 439. See also Bivariate normal distribution; Rank correlation coefficient
Sample covariance matrix, 305, 316; distribution of, 208
Sample distribution function, see Empirical cumulative distribution function
Sample inspection: by attributes, 66, 223; by variables, 85, 223; for comparing two products, 135, 225
Sample size, 8; required to achieve specified power, 57, 125, 199, 320
Sample median, 429
Sample space, 30
Sample standard deviation, 434
S-ancillary, 398, 399
Scale families, 324; comparing two, 259, 412; conditional inference for, 414; condition for monotone likelihood ratio, 323
Scheffé’s S-method, 375, 380, 384, 388; alternatives to, 384
Score tests, 511–513; asymptotically maximin property, 566, 567; asymptotical relative efficiency of, 536; AUMP and LAUMP property, 545; counterexample to AUMP property, 547
Score vector (or function), 489, 511
Second-order accurate, 666
Selection procedures, 102
Separable: family of distributions, 698; space, 694
Separate families of hypotheses, 220, 258
Sequential procedures, 8, 9, 145, 157, 163
Shift, confidence intervals for: based on permutation tests, 203; based on rank tests, 251, 268. See also Behrens-Fisher problem; Exponential distribution; Nonparametric two-sample problem; Normal two-sample problem, difference of means
Shift model, 134, 250, 578, 579
σ-field, 29; with countable generators, 699
σ-finite, 29
Signed ranks, 242; distribution under alternatives, 270; null distribution of, 246
Significance level, 57; for multiple comparisons, 368, 370; for stepdown procedures, 351, 361; nominal, 387. See also P-value
Significance probability, see P-value
Sign test, 85; asymptotic relative efficiency of, 537, 538; for matched pairs, 138; for testing consumer preferences, 135; for testing symmetry with respect to a given point, 137; history of, 149; in double exponential distribution, 342; limiting behavior, 501, 502; treatment of ties in, 167, 186. See also Binomial probabilities; Median; Sample inspection
Similar test, 110, 115; relation to unbiased test, 111; history of, 149
Simple: class of distributions, 59; hypothesis, 59
Simple function, 31
Simple hypothesis vs. simple alternative, 60, 415; with

large samples, 503. See also Neyman-Pearson lemma
Simpson’s paradox, 132
Simultaneous confidence intervals, 375, 391; bootstrap, 657; for all contrasts, 382. See also Confidence bands; Dunnett’s multiple comparison method; Scheffé’s S-method; Tukey’s T-method
Simultaneous confidence sets for a family of linear functions, 375, 381; smallest, 378; taut, 378
Simultaneous testing, 349. See also Multiple comparisons
Single step procedure for multiple testing, 351
Singly truncated normal distribution (STN), 144
Skewness, 459, 662
Slutsky’s theorem, 433
Small-sample theory, iii
Smirnov test, 245
Smooth function of means, 656
Spherically symmetric distributions, 194, 314
Stagewise tests, 367
Standard confidence bounds, 77, 175
Starshaped, 101
Stationarity, 145
Statistic, 30, 34; and random variables, 31; equivalent representations of, 36; fully informative, 96; subfield induced by, 34
Statistical inference, 3; and decision theory, 6; history of, 27
Stein’s two-stage procedure, 198
Stepdown procedures, 351, 352, 391; canonical form for, 360; large sample bootstrap, 658–661
Stepup procedures, 351, 356
Stochastically increasing, 70, 135
Stochastically larger, 70, 101, 240, 354
Stratified sampling, 176, 182, 188
Strictly unbiased, 112
Strongly unimodal, 323, 401, 412, 546, 547
Studentization, 286, 445
Studentized range, 367, 390
Student’s t-distribution, see t-distribution
Student’s t-test, see t-test
Subfield, 34
Sufficient statistic, 19, 44, 54, 55; Bayes definition of, 21; factorization criterion for, 19, 45; for exponential families, 47; in presence of nuisance parameters, 96; likelihood ratio as, 53; minimal, 21; pairwise, 53; relation to ancillarity, 397; relation to fully informative statistic, 96; relation to invariance, 220; statistics independent of, 151, 152. See also Partial sufficiency
Subsampling, 673–676; comparisons with bootstrap, 677–680; for hypothesis testing, 680, 681
Superefficient estimator, 525; bootstrap of, 679
Symmetric: confidence interval, 649; distribution, 53
Symmetry, 11, 13; and invariance, 12, 212; sufficient statistics for distributions with, 53; testing for, 241, 246, 270; testing, with respect to given point, 137, 246, 248, 270
Tautness, 378
t-distribution, 156, 161, 286; approximation to permutation distribution, 180; as distribution of function of sample correlation coefficient, 207; as posterior distribution, 174; Edgeworth expansion for, 517; in two-stage sampling, 198; monotone likelihood ratio of, 224; multivariate, 275; noncentral, 156, 161, 193, 224
Test (of a hypothesis), 5, 56; almost invariant, 225, 241; conditional, 394, 400, 403; invariant, 214, 276; locally maximin, 322; locally most powerful, 339; maximin, 322; most stringent, 337; of type D and E, 340, 341; randomized, 58, 127; strictly unbiased, 112; unbiased, 110; uniformly most powerful (UMP), 58
Three-decision problems, 81, 124

Three factor contingency table, 132
Ties, 136
Tight sequence, 439
Time series models, 450, 451
Total positivity, 71, 103, 115, 308, 323
Total variation distance, 529
Transformation: into, 30; of integrals, 34; onto, 30; probability integral, 97; variance stabilizing, 439
Transformation group, 12, 212, 213. See also Invariance; Group
Transitive: binary relation, 569; transformation group, 285
Trend: test for absence of, 271
Triangular distribution, 259
Trimmed mean, 647, 648
t-test: admissibility of, 235, 237, 281; as Bayes solution, 237; as likelihood ratio test, 25, 87; comparison with Wilcoxon and sign tests, 537, 538; for matched pairs, 183, 204; for regression coefficients, 169, 294; in linear hypothesis with one constraint, 281; local power of, 465, 466; one-sample, 89, 156, 192, 260; optimality in nonparametric model, 567–574; permutation version of, 180, 635, 638, 639; power of, 156, 192, 193; relevant subsets for, 408; robustness of, 445, 446; two-sample, 161, 176; two-stage, 199; under local alternatives, 501; uniform asymptotic behavior, 465, 466. See also Normal one- and two-sample problem
Tukey levels for multiple comparisons, 368, 387
Tukey’s T-method, 367, 374, 388, 389, 390
Two by two by K tables, 138, 148
Two by two tables: alternative models for, 128, 130, 132; comparison of experiments for, 130; Fisher’s exact test for, 127, 149; for matched pairs, 138, 149; McNemar’s test for, 138, 149; multinomial model for, 128, 130; S-ancillaries for, 399. See also Contingency tables
Two by two by two table, 135
Two-sample problem, see Behrens-Fisher problem; Binomial probabilities; Exponential distribution; Matched pairs; Nonparametric two-sample problem; Normal two-sample problem; Permutation test; Poisson parameters; Shift, confidence intervals for; Two-by-two tables
Two-sided alternatives, 81
Two-way contingency tables, see Contingency tables; Two by two tables
Two-way layout, 287, 290, 304; mixed models for, 314, 315; multiple testing in, 374; rank tests for, 290; reorganization of variables in, 311; robustness in, 455; simultaneous confidence intervals in, 383; with one observation per cell, 287; with m observations per cell, 290. See also Contingency tables; Interactions; Nested classifications; Two by two tables
UMP invariant test, 150, 218, 219; admissibility, 232; conditional, 404; conditions to be UMP almost invariant, 227; example of inadmissibility, 232; examples of nonuniqueness, 231, 232; relation with UMP unbiased test, 230; trivial, 232. See also Invariance; Linear hypothesis
Uniformly most powerful (UMP) test, 58, 108; conditional, 394, 401, 403; examples involving two parameters, 93, 95; for exponential distributions, 93; for monotone likelihood ratio families, 65; for one-parameter exponential families; for uniform distribution, 92, 99; in inverse Gaussian distribution, 100; in normal one-sample problem, 87, 88; in Weibull distribution, 99; nonparametric example of, 85

UMP unbiased test, 111; admissibility of, 139; example of nonexistence of, 140; for multiparameter exponential families, 119, 121, 150; for one-parameter exponential families, 111; for strictly totally positive families, 115; relation to UMP almost invariant tests, 230; via invariance, 150, 230. See also Unbiasedness
Unbiasedness, 13, 27, 110; and admissibility, 26; and invariance, 23, 229, 230; and minimax, 24; and similarity, 111; for confidence intervals, 23, 131; for point estimation, 14, 22, 27; for two-decision procedures, 13; of tests, 110, strict, 112. See also UMP unbiased test; Uniformly most accurate confidence sets
Undetermined multipliers, 80
Uniform confidence bands, 442
Uniform distribution U(a, b), 9, 22; as distribution of probability integral transformation, 97; completeness of, 116, 141; discrete, 142; distribution of order statistics from, 267; not q.m.d., 488, 533; of p-values, 64, 65; one-sample problem for, 92, 99, 413; relation to exponential distribution, 93; sufficient statistics for, 26; testing against exponential or triangular distribution, 260; other tests for, 480, 482
Uniformly asymptotically level α: for confidence sets, 423, 424; for tests, 422
Uniformly integrable, 472
Uniformly most accurate confidence sets, 72, 73; equivariant, 249; minimize expected Lebesgue measure, 251; relation to UMP tests, 73; unbiased, 164. See also Confidence bands; Confidence bounds; Confidence intervals; Confidence sets; Simultaneous confidence intervals; Simultaneous confidence intervals and sets
Unimodal, 412. See also Strongly unimodal
Unrelated parameters, 398
U-statistic, 678
Variance components, see Components of variance
Variance stabilizing transformation, 439
Variation diminishing, 71. See also Total positivity
Variation independent parameters, 398
Vector space, 696–698
Vitali’s theorem, 32
Waiting times, 22, 98
Wald tests and confidence regions, 508–510, 646; efficiency of, 536; AUMP and LAUMP property, 548, 549
Weak compactness theorem, 700, 701
Weak convergence, 425, 694
Weak conditionality principle, 400
Weibull distribution, 99
Weighted quadratic test statistics, 607, 608; examples of, 611, 612; local power calculations, 614, 615
Welch approximate t-test, 231, 447, 448
Welch-Aspin test, 231, 408
Wilcoxon one-sample test, 246
Wilcoxon signed-rank statistic, 269, 493, 502, 503
Wilcoxon signed-rank test, see Wilcoxon one-sample test
Wilcoxon statistic, 268, 269; expectation and variance of, 265
Wilcoxon two-sample test, 243, 245; alternative form of, 265; comparison with t-test, 537, 538; confidence intervals based on, 251; history of, 276; optimality of, 243, 244, 267, 268
Wilson confidence interval for binomial, 435, 647
Yule’s measure of association, 129
