Improving on the James-Stein estimator

Yuzo Maruyama

Abstract In the estimation of a multivariate normal mean for the case where the unknown covariance matrix is proportional to the identity matrix, a class of generalized Bayes estimators dominating the James-Stein rule is obtained. It is noted that a sequence of estimators in our class converges to the positive-part James-Stein estimator.

AMS classification numbers 62C10, 62C20, 62A15, 62H12, 62F10, 62F15

Key words and phrases minimax, multivariate normal mean, generalized Bayes, James-Stein estimator, quadratic loss, improved variance estimator.

1 Introduction

Let $X$ and $S$ be independent random variables, where $X$ has the $p$-variate normal distribution $N_p(\theta, \sigma^2 I_p)$ and $S/\sigma^2$ has the chi-square distribution $\chi^2_n$ with $n$ degrees of freedom. We deal with the problem of estimating the mean vector $\theta$ relative to the quadratic loss $\|\delta - \theta\|^2/\sigma^2$. The usual minimax estimator $X$ is inadmissible for $p \geq 3$. James and Stein [6] constructed the improved estimator
$$\delta^{JS} = \left[1 - \frac{p-2}{n+2}\,\frac{S}{\|X\|^2}\right]X,$$
which is in turn dominated by the James-Stein positive-part estimator
$$\delta^{JS}_+ = \max\left(0,\; 1 - \frac{p-2}{n+2}\,\frac{S}{\|X\|^2}\right)X,$$
as shown in Baranchik [2]. We note that shrinkage estimators such as the James-Stein procedure are derived by using the vague prior information that $\lambda = \|\theta\|^2/\sigma^2$ is close to $0$. It goes without saying that we would like to obtain a significant improvement of risk when this prior information is accurate. Though $\delta^{JS}_+$ improves on $\delta^{JS}$ at $\lambda = 0$, it is not analytic and is therefore inadmissible. Kubokawa [7] showed that one minimax generalized Bayes estimator derived by Lin and Tsai [10] dominates $\delta^{JS}$. This estimator, however, does not improve on $\delta^{JS}$ at $\lambda = 0$. It is therefore desirable to obtain analytic improved estimators dominating $\delta^{JS}$, especially at $\lambda = 0$.

In Section 2, we show that the estimators in the subclass of Lin-Tsai's minimax generalized Bayes estimators $\delta^M_\alpha = (1 - \psi^M_\alpha(Z)/Z)X$, where $Z = \|X\|^2/S$ and
$$\psi^M_\alpha(z) = z\,\frac{\int_0^1 \lambda^{\alpha(p-2)/2}(1+\lambda z)^{-\alpha(n+p)/2-1}\,d\lambda}{\int_0^1 \lambda^{\alpha(p-2)/2-1}(1+\lambda z)^{-\alpha(n+p)/2-1}\,d\lambda},$$
with $\alpha \geq 1$, dominate $\delta^{JS}$. It is noted that $\delta^M_\alpha$ with $\alpha = 1$ coincides with Kubokawa's estimator. Further, we demonstrate that $\delta^M_\alpha$ with $\alpha > 1$ improves on $\delta^{JS}$ especially at $\lambda = 0$, and that $\delta^M_\alpha$ approaches the James-Stein positive-part estimator as $\alpha$ tends to infinity.
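Since all of the estimators above are simple functions of $\|X\|^2$ and $S$, they are easy to evaluate numerically. The following is a minimal sketch of ours (not part of the paper; numpy and scipy are assumed available) that computes $\delta^{JS}$, $\delta^{JS}_+$ and $\delta^M_\alpha$, with $\psi^M_\alpha$ obtained by direct quadrature of its two defining integrals.

```python
# Illustrative sketch (ours, not from the paper): the James-Stein estimator,
# its positive-part version, and delta_alpha^M with psi_alpha^M computed by
# quadrature of the two integrals in its definition.
import numpy as np
from scipy.integrate import quad

def psi_M(z, alpha, p, n):
    """psi_alpha^M(z) = z * (numerator integral) / (denominator integral)."""
    c = alpha * (n + p) / 2 + 1
    num, _ = quad(lambda lam: lam**(alpha*(p - 2)/2) * (1 + lam*z)**(-c), 0, 1)
    den, _ = quad(lambda lam: lam**(alpha*(p - 2)/2 - 1) * (1 + lam*z)**(-c), 0, 1)
    return z * num / den

def james_stein(x, s, n, positive_part=False):
    """delta^JS, or delta^JS_+ when positive_part=True."""
    p = len(x)
    shrink = 1 - (p - 2) / (n + 2) * s / np.dot(x, x)
    return (max(shrink, 0.0) if positive_part else shrink) * x

def delta_M(x, s, alpha, n):
    """The generalized Bayes estimator delta_alpha^M = (1 - psi(Z)/Z) X."""
    p = len(x)
    z = np.dot(x, x) / s
    return (1 - psi_M(z, alpha, p, n) / z) * x

# Example usage with arbitrary simulated data (p = 8, n = 4):
rng = np.random.default_rng(1)
x, s = rng.standard_normal(8) * 0.5, rng.chisquare(4)
print(james_stein(x, s, n=4), delta_M(x, s, alpha=2.0, n=4), sep="\n")
```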

2 Main results

We first represent $\psi^M_\alpha(Z)$ through the hypergeometric function
$$F(a, b, c, x) = 1 + \sum_{n=1}^{\infty} \frac{(a)_n (b)_n}{(c)_n}\,\frac{x^n}{n!}, \qquad (a)_n = a(a+1)\cdots(a+n-1).$$
The following facts about $F(a, b, c, x)$, from Abramowitz and Stegun [1], are needed:

$$\int_0^x t^{a-1}(1-t)^{b-1}\,dt = x^a(1-x)^b F(1, a+b, a+1, x)/a \quad \text{for } a, b > 1, \tag{2.1}$$

$$c(1-x)F(a, b, c, x) - cF(a, b-1, c, x) + (c-a)xF(a, b, c+1, x) = 0, \tag{2.2}$$

$$(c-a-b)F(a, b, c, x) - (c-a)F(a-1, b, c, x) + b(1-x)F(a, b+1, c, x) = 0, \tag{2.3}$$

$$F(a, b, c, 1) = \infty \quad \text{when } c - a - b \leq -1. \tag{2.4}$$
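These identities are classical; as a quick sanity check, the following sketch of ours (parameter values arbitrary) verifies (2.1)-(2.3) numerically with scipy's Gauss hypergeometric function.

```python
# Sketch (ours): numerical spot-check of facts (2.1)-(2.3).
import numpy as np
from scipy.special import hyp2f1
from scipy.integrate import quad

a, b, c, x = 2.5, 3.0, 4.5, 0.3

lhs, _ = quad(lambda t: t**(a - 1)*(1 - t)**(b - 1), 0, x)
rhs = x**a*(1 - x)**b*hyp2f1(1, a + b, a + 1, x)/a
print("(2.1):", np.isclose(lhs, rhs))

res2 = (c*(1 - x)*hyp2f1(a, b, c, x) - c*hyp2f1(a, b - 1, c, x)
        + (c - a)*x*hyp2f1(a, b, c + 1, x))
print("(2.2):", np.isclose(res2, 0))

res3 = ((c - a - b)*hyp2f1(a, b, c, x) - (c - a)*hyp2f1(a - 1, b, c, x)
        + b*(1 - x)*hyp2f1(a, b + 1, c, x))
print("(2.3):", np.isclose(res3, 0))
```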

Making the transformation $t = \lambda z/(1 + \lambda z)$ and using (2.1), we have

$$\frac{\int_0^1 \lambda^{\alpha(p/2-1)}(1+\lambda z)^{-\alpha(n+p)/2-1}\,d\lambda}{\int_0^1 \lambda^{\alpha(p/2-1)-1}(1+\lambda z)^{-\alpha(n+p)/2-1}\,d\lambda} = \frac{\alpha(p/2-1)}{\alpha(p/2-1)+1}\,\frac{F(1, \alpha(n+p)/2+1, \alpha(p/2-1)+2, z/(z+1))}{F(1, \alpha(n+p)/2+1, \alpha(p/2-1)+1, z/(z+1))}.$$
Moreover, by (2.2) and (2.3), we obtain
$$\psi^M_\alpha(z) = \frac{p-2}{n+2}\left[1 - \frac{n+p}{(n+2)\,F(1, \alpha(n+p)/2, \alpha(p-2)/2+1, z/(z+1)) + p - 2}\right]. \tag{2.5}$$
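As a numerical cross-check of this derivation (a sketch of ours, using $p = 8$, $n = 4$ as in Figure 1 below and an arbitrary $\alpha$), the closed form (2.5) can be compared directly with the integral-ratio definition of $\psi^M_\alpha$:

```python
# Sketch (ours): the closed form (2.5) versus the integral-ratio definition
# of psi_alpha^M, here with p = 8, n = 4, alpha = 2.
import numpy as np
from scipy.special import hyp2f1
from scipy.integrate import quad

p, n, alpha = 8, 4, 2.0

def psi_closed(z):
    F = hyp2f1(1, alpha*(n + p)/2, alpha*(p - 2)/2 + 1, z/(z + 1))
    return (p - 2)/(n + 2)*(1 - (n + p)/((n + 2)*F + p - 2))

def psi_integral(z):
    c = alpha*(n + p)/2 + 1
    num, _ = quad(lambda lam: lam**(alpha*(p - 2)/2)*(1 + lam*z)**(-c), 0, 1)
    den, _ = quad(lambda lam: lam**(alpha*(p - 2)/2 - 1)*(1 + lam*z)**(-c), 0, 1)
    return z*num/den

for z in (0.1, 1.0, 10.0):
    print(f"z = {z:5.1f}:  integral {psi_integral(z):.6f}  closed form {psi_closed(z):.6f}")
```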

Making use of (2.5), we can easily prove the following theorem.

Theorem 2.1. The estimator $\delta^M_\alpha$ with $\alpha \geq 1$ dominates $\delta^{JS}$.

Proof. We shall verify that $\psi^M_\alpha(z)$ with $\alpha \geq 1$ satisfies the condition, derived by Kubokawa [9], for $\delta_\psi = (1 - \psi(Z)/Z)X$ to dominate the James-Stein estimator: $\psi(z)$ is nondecreasing, $\lim_{z\to\infty}\psi(z) = (p-2)/(n+2)$, and $\psi(z) \geq \psi^M_1(z)$.

Since $F(1, \alpha(n+p)/2, \alpha(p-2)/2+1, z/(z+1))$ is increasing in $z$, $\psi^M_\alpha(z)$ is increasing in $z$. By (2.4), it is clear that $\lim_{z\to\infty}\psi^M_\alpha(z) = (p-2)/(n+2)$. In order to show that $\psi^M_\alpha(z) \geq \psi^M_1(z)$ for $\alpha \geq 1$, we have only to check that $F(1, \alpha(n+p)/2, \alpha(p-2)/2+1, z/(z+1))$ is increasing in $\alpha$, which is easily verified because the coefficient of each term on the right-hand side of the expansion
$$F\!\left(1, \frac{\alpha(n+p)}{2}, \frac{\alpha(p-2)}{2}+1, \frac{z}{z+1}\right) = 1 + \frac{p+n}{p-2+2/\alpha}\,\frac{z}{1+z} + \frac{(p+n)(p+n+2/\alpha)}{(p-2+2/\alpha)(p-2+4/\alpha)}\left(\frac{z}{1+z}\right)^2 + \cdots \tag{2.6}$$
is increasing in $\alpha$. We have thus proved the theorem.
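The two monotonicity facts that drive the proof, monotonicity of $\psi^M_\alpha(z)$ in $z$ and in $\alpha$, are easy to confirm numerically. Here is a small check of ours using the closed form (2.5), with $p = 8$, $n = 4$ as assumed values:

```python
# Sketch (ours): psi_alpha^M(z) is nondecreasing in z, and psi_2 >= psi_1,
# checked on a grid via the closed form (2.5).
import numpy as np
from scipy.special import hyp2f1

p, n = 8, 4
def psi(z, alpha):
    F = hyp2f1(1, alpha*(n + p)/2, alpha*(p - 2)/2 + 1, z/(z + 1))
    return (p - 2)/(n + 2)*(1 - (n + p)/((n + 2)*F + p - 2))

zs = np.linspace(0.01, 20, 400)
for alpha in (1.0, 2.0, 10.0):
    vals = np.array([psi(z, alpha) for z in zs])
    assert np.all(np.diff(vals) >= 0), "psi should be nondecreasing in z"
assert all(psi(z, 2.0) >= psi(z, 1.0) >= 0 for z in zs), "psi_2 >= psi_1"
print("monotonicity checks passed")
```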

Now we investigate the nature of the risk of $\delta^M_\alpha$ with $\alpha \geq 1$. By using Kubokawa [9]'s method, the risk difference between the James-Stein estimator and $\delta^M_\alpha$ at $\lambda = 0$ is written as

$$R(0, \delta^{JS}) - R(0, \delta^M_\alpha) = 2(n+p)\int_0^\infty \frac{d}{dz}\psi^M_\alpha(z)\left(\frac{\psi^M_\alpha(z) - \psi^M_1(z)}{1 + \psi^M_1(z)}\right)\int_0^z s^{-1}h(s)\,ds\,dz,$$
where $h(s) = \int_0^\infty v f_n(v) f_p(vs)\,dv$ and $f_p(t)$ designates the density of $\chi^2_p$. Therefore we see that Kubokawa [7]'s estimator ($\delta^M_\alpha$ with $\alpha = 1$) does not improve upon the James-Stein estimator at $\lambda = 0$. On the other hand, since $\psi^M_\alpha(z)$ is strictly increasing in $\alpha$, $\delta^M_\alpha$ with $\alpha > 1$ improves on the James-Stein estimator especially at $\lambda = 0$, and the risk $R(\lambda, \delta^M_\alpha)$ at $\lambda = 0$ is decreasing in $\alpha$. Figure 1 compares the respective risks of the James-Stein estimator, its positive-part rule, and $\delta^M_\alpha$ with $\alpha = 1, 2$ and $10$ for $p = 8$ and $n = 4$. The figure reveals that the risk behavior of $\delta^M_\alpha$ with $\alpha = 10$ is similar to that of the James-Stein positive-part estimator. In fact, we have the following result.

Proposition 2.1. $\delta^M_\alpha$ approaches the James-Stein positive-part estimator as $\alpha$ tends to infinity, that is, $\lim_{\alpha\to\infty}\delta^M_\alpha = \delta^{JS}_+$.

Proof. Since $F(1, \alpha(n+p)/2, \alpha(p-2)/2+1, z/(z+1))$ is increasing in $\alpha$, by the monotone convergence theorem this function converges to $\sum_{i=0}^\infty \{((p-2)(z+1))^{-1}(n+p)z\}^i$. Considering the two cases $(n+p)z < (p-2)(z+1)$ and $(n+p)z \geq (p-2)(z+1)$, we obtain $\lim_{\alpha\to\infty}\psi^M_\alpha(z) = z$ if $z < (p-2)/(n+2)$; $= (p-2)/(n+2)$ otherwise. This completes the proof.
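The limit $\min(z, (p-2)/(n+2))$ is exactly the shrinkage function of the positive-part rule, and the convergence is visible numerically. A short illustration of ours (with $p = 8$, $n = 4$, so that $(p-2)/(n+2) = 1$):

```python
# Sketch (ours): psi_alpha^M(z) -> min(z, (p-2)/(n+2)) as alpha grows, i.e.
# the shrinkage function of the James-Stein positive-part estimator.
import numpy as np
from scipy.special import hyp2f1

p, n = 8, 4   # so (p - 2)/(n + 2) = 1
def psi(z, alpha):
    F = hyp2f1(1, alpha*(n + p)/2, alpha*(p - 2)/2 + 1, z/(z + 1))
    return (p - 2)/(n + 2)*(1 - (n + p)/((n + 2)*F + p - 2))

for z in (0.2, 0.5, 0.9, 2.0):
    print(z, [round(psi(z, a), 4) for a in (1, 4, 16, 64)],
          "-> limit", min(z, (p - 2)/(n + 2)))
```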

[Figure 1 here: risk plotted against the noncentrality parameter for the James-Stein, Kubokawa ($\alpha = 1$), M2 ($\alpha = 2$), M10 ($\alpha = 10$) and positive-part estimators; vertical axis: risk, roughly 3 to 8; horizontal axis: noncentrality parameter, 0 to 6.]

Figure 1: Comparison of the risks of the estimators $\delta^{JS}$, $\delta^{JS}_+$ and $\delta^M_\alpha$ with $\alpha = 1, 2, 10$. 'Noncentrality parameter' denotes $\|\theta\|/\sigma$; M$\alpha$ denotes the risk of $\delta^M_\alpha$.
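The figure itself is not reproduced here, but a rough Monte Carlo approximation of the same comparison can be generated with the following sketch of ours (the seed, replication count and grid of $\|\theta\|/\sigma$ values are arbitrary choices):

```python
# Sketch (ours): Monte Carlo approximation of the Figure 1 comparison for
# p = 8, n = 4: estimated risk E||delta - theta||^2 / sigma^2, with sigma = 1.
import numpy as np
from scipy.special import hyp2f1

p, n, reps = 8, 4, 20000
rng = np.random.default_rng(0)

def psi(z, alpha):
    F = hyp2f1(1, alpha*(n + p)/2, alpha*(p - 2)/2 + 1, z/(z + 1))
    return (p - 2)/(n + 2)*(1 - (n + p)/((n + 2)*F + p - 2))

def risk(shrink, X, theta):
    return np.mean(np.sum((shrink[:, None]*X - theta)**2, axis=1))

for t in (0.0, 1.0, 2.0, 4.0):            # noncentrality parameter ||theta||/sigma
    theta = np.r_[t, np.zeros(p - 1)]
    X = theta + rng.standard_normal((reps, p))
    S = rng.chisquare(n, reps)
    Z = np.sum(X**2, axis=1)/S
    js = 1 - (p - 2)/(n + 2)/Z            # James-Stein shrinkage factor
    m2 = 1 - np.array([psi(z, 2.0) for z in Z])/Z
    print(t, round(risk(js, X, theta), 2),
          round(risk(np.maximum(js, 0), X, theta), 2),
          round(risk(m2, X, theta), 2))
```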

We here consider the connection between the estimation of the mean vector and that of the variance. It is noted that the James-Stein estimator is expressed as $(1 - ((p-2)/\|X\|^2)\,\hat\sigma^2_0)X$ for $\hat\sigma^2_0 = (n+2)^{-1}S$, and that $\hat\sigma^2_0$ is the best affine equivariant estimator for the problem of estimating the variance under the quadratic loss function $(\delta/\sigma^2 - 1)^2$. Stein [11] showed that $\hat\sigma^2_0$ is inadmissible. George [5] suggested that it might be possible to use an improved variance estimator to improve on the James-Stein estimator. In the same way as Theorem 2.1, we have the following result, which includes Berry [3]'s and Kubokawa et al. [8]'s.

Theorem 2.2. Assume that $\hat\sigma^2_{IM} = \phi_{IM}(Z)S$ is an improved estimator of the variance which satisfies Brewster-Zidek [4]'s condition, that is, $\phi_{IM}(z)$ is nondecreasing and $\phi^{BZ}(z) \leq \phi_{IM}(z) \leq 1/(n+2)$, where
$$\phi^{BZ}(z) = \frac{1}{n+p+2}\,\frac{\int_0^1 \lambda^{p/2-1}(1+\lambda z)^{-(n+p)/2-1}\,d\lambda}{\int_0^1 \lambda^{p/2-1}(1+\lambda z)^{-(n+p)/2-2}\,d\lambda}.$$
Then $\delta_{IM} = (1 - \{(p-2)/\|X\|^2\}\hat\sigma^2_{IM})X$ dominates $\delta^{JS}$.
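To illustrate, the following sketch of ours evaluates $\phi^{BZ}$ by quadrature and builds $\delta_{IM}$ from one variance estimator satisfying the condition. The particular choice shown, Stein's truncated estimator $\min\{S/(n+2), (S+\|X\|^2)/(n+p+2)\}$, is quoted from general knowledge of [11] rather than from this paper.

```python
# Sketch (ours): the Brewster-Zidek bound phi^BZ by quadrature, and the
# estimator delta_IM of Theorem 2.2 built from Stein's truncated variance
# estimator min{S/(n+2), (S+||X||^2)/(n+p+2)} (form quoted from Stein [11]).
import numpy as np
from scipy.integrate import quad

def phi_BZ(z, p, n):
    num, _ = quad(lambda lam: lam**(p/2 - 1)*(1 + lam*z)**(-(n + p)/2 - 1), 0, 1)
    den, _ = quad(lambda lam: lam**(p/2 - 1)*(1 + lam*z)**(-(n + p)/2 - 2), 0, 1)
    return num/den/(n + p + 2)

def delta_IM(x, s, n):
    p = len(x)
    sigma2_hat = min(s/(n + 2), (s + np.dot(x, x))/(n + p + 2))
    return (1 - (p - 2)/np.dot(x, x)*sigma2_hat)*x

# phi^BZ increases toward 1/(n+2), consistent with the sandwich condition:
p, n = 8, 4
print([round(phi_BZ(z, p, n), 4) for z in (0.1, 1.0, 10.0, 100.0)], 1/(n + 2))
```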

It should be noticed, however, that the estimator derived from Theorem 2.2 is obviously inadmissible, since $1 - \{(p-2)/\|X\|^2\}\hat\sigma^2_{IM}$ sometimes takes a negative value.

References

[1] Abramowitz, M. and Stegun, I.A.: Handbook of Mathematical Functions. New York: Dover Publications (1964)

[2] Baranchik, A.J.: Multiple regression and estimation of the mean of a multivariate normal distribution. Stanford Univ. Technical Report 51 (1964)

[3] Berry, J.C.: Improving the James-Stein estimator using the Stein variance estimator. Statist. Probab. Lett. 20, 241-245 (1994)

[4] Brewster, J.F. and Zidek, J.V.: Improving on equivariant estimators. Ann. Statist. 2, 21-38 (1974)

[5] George, E.I.: Comment on "Developments in decision theoretic variance estimation", by Maatta, J.M. and Casella, G. Statist. Sci. 5, 107-109 (1990)

[6] James, W. and Stein, C.: Estimation with quadratic loss. In Proc. 4th Berkeley Symp. Math. Statist. Prob. Vol.1, 361-379 (1961)

[7] Kubokawa, T.: An approach to improving the James-Stein estimator. J. Multivariate Anal. 36, 121-126 (1991)

[8] Kubokawa, T., Morita, K., Makita, S. and Nagakura, K.: Estimation of the variance and its applications. J. Statist. Plann. Inference 35, 319-333 (1993)

[9] Kubokawa, T.: A unified approach to improving equivariant estimators. Ann. Statist. 22, 290-299 (1994)

[10] Lin, P. and Tsai, H.: Generalized Bayes minimax estimators of the multivariate normal mean with unknown covariance matrix. Ann. Statist. 1, 142-145 (1973)

[11] Stein, C.: Inadmissibility of the usual estimator for the variance of a normal distri- bution with unknown mean. Ann. Inst. Statist. Math. 16, 155-160 (1964)
