AD-A097 110   STANFORD UNIV CA DEPT OF STATISTICS   F/G 12/1
A NEW PROOF OF ADMISSIBILITY OF TESTS IN THE MULTIVARIATE ANALY--ETC(U)
JAN 81   T W ANDERSON, A TAKEMURA   N00014-75-C-0442
UNCLASSIFIED   TR-48   NL
A NEW PROOF OF ADMISSIBILITY OF TESTS IN THE
MULTIVARIATE ANALYSIS OF VARIANCE
BY
T. W. ANDERSON and AKIMICHI TAKEMURA
TECHNICAL REPORT NO. 48
JANUARY 1981
PREPARED UNDER CONTRACT N00014-75-C-0442
(NR-042-034)
OFFICE OF NAVAL RESEARCH
THEODORE W. ANDERSON. PROJECT DIRECTOR
DEPARTMENT OF STATISTICS
STANFORD UNIVERSITY
STANFORD, CALIFORNIA
Reproduction in Whole or in Part is Permitted for any Purpose of the United States Government.
Approved for public release; distribution unlimited.
A NEW PROOF OF ADMISSIBILITY
OF TESTS IN THE MULTIVARIATE ANALYSIS OF VARIANCE
T. W. Anderson and Akimichi Takemura
1. Introduction.
A general theorem on the admissibility of tests of the general multi-
variate linear hypothesis was proved by Schwartz (1967) and Ghosh (1964)
using Stein's theorem (1956). In using Stein's theorem there are two
conditions to prove: (i) convexity of the acceptance region and (ii)
existence of certain alternative hypotheses. In his paper on the admissibility
of Hotelling's $T^2$-test, Stein (1956) proved the convexity of the
acceptance region by showing that it is an intersection of half-spaces.
Schwartz (1967) and Ghosh (1964) followed this approach.
The purpose of this paper is to present a new proof of the admissi-
bility of the tests of the general multivariate linear hypothesis. We show
the convexity of the acceptance region more directly; that is, a convex
combination of two sample points in the acceptance region again belongs to
the region. The separation of conditions (i) and (ii) simplifies consider-
ably the proof of the convexity condition (i) and makes its geometrical
meaning clearer. We shall be explicit also in proving the condition (ii).
In Section 2 we state the admissibility results in several forms and
discuss their relations. In Section 3 Stein's theorem on the admissibility
of tests in the general exponential family framework is stated. The rest
of the paper is devoted to the proof of the theorems in Section 2.
2. The Problem and Main Results.
In this section we set up the problem and state the admissibility
results in several forms. Discussion of the relations of those forms
will be given. For proofs see Section 4.
The problem of testing the general multivariate linear hypothesis can
be written in the following canonical form. Let $X$ ($p \times m$), $Y$ ($p \times r$), and
$Z$ ($p \times n$) be random matrices such that their columns are independently
normally distributed with a common covariance matrix $\Sigma$ and means
(2.1)   $\mathcal{E}X = M, \quad \mathcal{E}Y = N, \quad \mathcal{E}Z = 0,$
respectively. (See Anderson (1958), sec. 8.11.) The null hypothesis is
$H_0$: $M = 0$ and the alternative hypothesis is $M \ne 0$. The parameter space
$\Omega$ is given by
(2.2)   $\Omega = \{(M,N,\Sigma) \mid \Sigma \text{ positive definite}\}.$
The null hypothesis is
(2.3)   $\Omega_0 = \{(M,N,\Sigma) \mid M = 0,\ \Sigma \text{ positive definite}\}.$
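A test $T^*$ of the null hypothesis $H_0$: $\omega \in \Omega_0$ is said to be admissible
if there exists no other test $T$ such that
(2.4)   $\Pr\{\text{Reject } H_0 \mid T,\omega\} \le \Pr\{\text{Reject } H_0 \mid T^*,\omega\}, \quad \omega \in \Omega_0,$
        $\Pr\{\text{Reject } H_0 \mid T,\omega\} \ge \Pr\{\text{Reject } H_0 \mid T^*,\omega\}, \quad \omega \in \Omega - \Omega_0,$
with strict inequality for at least one $\omega$.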
The usual tests of the above null hypothesis can be given in terms of
the (nonzero) roots of the following determinantal equation:
(2.5)   $|XX' - \lambda(ZZ' + XX')| = |XX' - \lambda(U - YY')| = 0,$
where $U = XX' + YY' + ZZ'$.
Except for roots that are identically zero, the roots of (2.5) coincide
with the nonzero characteristic roots of $X'(U - YY')^{-1}X$. Let
(2.6)   $V = (X,Y,U),$
and let
(2.7)   $M(V) = X'(U - YY')^{-1}X.$
The vector of ordered characteristic roots of $M(V)$ is denoted by
(2.8)   $(\lambda_1,\ldots,\lambda_m)' = \lambda(M(V)),$
where $\lambda_1 \ge \cdots \ge \lambda_m \ge 0$. Since the inclusion of zero roots (when $m > p$)
causes no trouble in the sequel we assume that the tests depend on $\lambda(M(V))$.
It is well known that these tests are invariant under certain groups of
transformations. See Anderson (1958), sec. 8.10, or Lehmann (1959), sec. 7.9.
The admissibility of these tests can be stated in terms of the geometric
characteristics of the acceptance regions. Let
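(2.9)   $R^m_{\le} = \{\lambda \in R^m \mid \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0\},$
        $R^m_{+} = \{\lambda \in R^m \mid \lambda_i \ge 0,\ i = 1,\ldots,m\}.$
Definition 2.1. A region $A$ in $R^m_{\le}$ is monotone if $\lambda \in A$,
$\nu \in R^m_{\le}$, $\nu_i \le \lambda_i$, $i = 1,\ldots,m$, imply $\nu \in A$.
[Figure 2.1]
Definition 2.2. For $A \subset R^m_{\le}$ the extended region $A^*$ is defined by
        $A^* = \bigcup_{\pi} \{(x_{\pi(1)},\ldots,x_{\pi(m)})' \mid x \in A\},$
where $\pi$ ranges over all permutations of $(1,\ldots,m)$.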
Now we can state the following theorem:
Theorem 2.1. If the region $A$ in $R^m_{\le}$ is monotone and if the extended
region A* is closed and convex, then A is the acceptance region of an
admissible test.
This is the essential part of the most important result in the main
theorem of Schwartz (1967).
Another characterization of admissible tests is given in terms of
majorization.
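Definition 2.3. A vector $\lambda = (\lambda_1,\ldots,\lambda_m)'$ weakly majorizes a
vector $\nu = (\nu_1,\ldots,\nu_m)'$ if
(2.10)  $\lambda_{[1]} \ge \nu_{[1]}, \quad \lambda_{[1]} + \lambda_{[2]} \ge \nu_{[1]} + \nu_{[2]}, \quad \ldots, \quad \lambda_{[1]} + \cdots + \lambda_{[m]} \ge \nu_{[1]} + \cdots + \nu_{[m]},$
where $\lambda_{[i]}$ and $\nu_{[i]}$, $i = 1,\ldots,m$, are the coordinates rearranged in
nonascending order.
We use the notation
        $\lambda \succ_w \nu$ or $\nu \prec_w \lambda$
if $\lambda$ weakly majorizes $\nu$.
Remark. If $\lambda, \nu \in R^m_{\le}$ then $\lambda \succ_w \nu$ is simply
        $\lambda_1 \ge \nu_1, \quad \lambda_1 + \lambda_2 \ge \nu_1 + \nu_2, \quad \ldots, \quad \lambda_1 + \cdots + \lambda_m \ge \nu_1 + \cdots + \nu_m.$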
If the last inequality in (2.10) is replaced by an equality we say
simply that $\lambda$ majorizes $\nu$ and denote this by
(2.11)  $\lambda \succ \nu$ or $\nu \prec \lambda$.
The theory of majorization and the related inequalities are developed
in detail in Marshall and Olkin (1979).
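The two orderings can be checked mechanically from partial sums of the sorted coordinates. The following is a minimal sketch in Python; the function names and the tolerance `tol` are illustrative choices and do not come from the paper.

```python
def partial_sums_desc(v):
    """Partial sums of the coordinates of v arranged in nonascending order."""
    total, sums = 0.0, []
    for x in sorted(v, reverse=True):
        total += x
        sums.append(total)
    return sums


def weakly_majorizes(lam, nu, tol=1e-12):
    """True if lam weakly majorizes nu, i.e. every inequality in (2.10) holds."""
    if len(lam) != len(nu):
        raise ValueError("vectors must have the same length")
    pairs = zip(partial_sums_desc(lam), partial_sums_desc(nu))
    return all(a >= b - tol for a, b in pairs)


def majorizes(lam, nu, tol=1e-12):
    """True if lam majorizes nu: (2.10) with equality in the last sum, as in (2.11)."""
    return weakly_majorizes(lam, nu, tol) and abs(sum(lam) - sum(nu)) <= tol


if __name__ == "__main__":
    lam = [0.6, 0.3, 0.1]
    print(weakly_majorizes(lam, [0.5, 0.2, 0.2]))  # weak majorization only
    print(majorizes(lam, [0.5, 0.2, 0.2]))         # totals differ
    print(majorizes(lam, [0.4, 0.3, 0.3]))         # totals agree
```

Sorting inside the helper means the inputs need not already lie in $R^m_{\le}$.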
Definition 2.4. A region $A$ in $R^m_{\le}$ is said to be monotone in
majorization if $\lambda \in A$, $\nu \in R^m_{\le}$, $\nu \prec_w \lambda$ imply $\nu \in A$.
[Figure 2.2]
Theorem 2.2. If a region $A$ in $R^m_{\le}$ is closed, convex, and monotone
in majorization, then $A$ is the acceptance region of an admissible test.
Theorem 2.1 and Theorem 2.2 are equivalent. It will be seen in
Section 4 that Theorem 2.2 can be more conveniently proved. Then an
argument about the extreme points of a certain convex set (Lemma 4.12)
establishes the equivalence of the two theorems.
Application of the theory of Schur-convex functions yields several
corollaries to Theorem 2.2.
Corollary 2.1. Let $g$ be continuous, nondecreasing, and convex
in $[0,1)$. Let
(2.12)  $f(\lambda) = f(\lambda_1,\ldots,\lambda_m) = \sum_{i=1}^{m} g(\lambda_i).$
Then a test with the acceptance region
        $A = \{\lambda \mid f(\lambda) \le c\}$
is admissible.
A proof of this is given in Section 4. Setting $g(\lambda) = -\log(1-\lambda)$,
$g(\lambda) = \lambda/(1-\lambda)$, $g(\lambda) = \lambda$, respectively, shows that Wilks' likelihood
ratio test, the Lawley-Hotelling trace test, and the Bartlett-Nanda-Pillai
test are admissible. Admissibility of Roy's maximum root test
        $A$: $\lambda_1 \le c$
can be proved directly from Theorem 2.1 or Theorem 2.2.
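To make the four acceptance regions concrete, the sketch below evaluates each statistic from a given vector of roots in $[0,1)$. The function names and the helper `accepts` are hypothetical, and the roots are assumed to have been computed already from (2.5).

```python
import math


def wilks(roots):
    """Sum of g(l) = -log(1 - l); equals -log of the product of (1 - l_i)."""
    return sum(-math.log(1.0 - l) for l in roots)


def lawley_hotelling(roots):
    """Sum of g(l) = l / (1 - l)."""
    return sum(l / (1.0 - l) for l in roots)


def bartlett_nanda_pillai(roots):
    """Sum of g(l) = l."""
    return sum(roots)


def roy(roots):
    """Largest root; not of the additive form (2.12)."""
    return max(roots)


def accepts(statistic, roots, c):
    """Acceptance region {lambda : statistic(lambda) <= c}."""
    if any(not (0.0 <= l < 1.0) for l in roots):
        raise ValueError("roots must lie in [0, 1)")
    return statistic(roots) <= c


if __name__ == "__main__":
    roots = [0.5, 0.2]
    for stat in (wilks, lawley_hotelling, bartlett_nanda_pillai, roy):
        print(stat.__name__, stat(roots))
```

The cutoff `c` is left to the caller; choosing it to attain a given significance level requires the null distribution of the statistic, which is outside the scope of this sketch.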
We note that (2.12) is a special case of a Schur-convex function. A
function $\phi$ is called Schur-convex if $x \prec y$ implies $\phi(x) \le \phi(y)$. The
following facts are well known in the theory of Schur-convex functions.
(See Marshall and Olkin (1979), Chap. 3.) If $\phi$ is increasing in each
argument and Schur-convex then $x \prec_w y$ implies $\phi(x) \le \phi(y)$. If $\phi$ is
symmetric and quasi-convex (that is, $\{x \mid \phi(x) \le c\}$ is convex for all $c$),
then $\phi$ is Schur-convex. Combining these facts we obtain the following
corollary.
Corollary 2.2. If $f$ is lower semicontinuous, increasing in each argument,
symmetric, and quasi-convex, then a test with the acceptance region
$A = \{\lambda \mid f(\lambda) \le c\}$ is admissible.
Corollary 2.2 is identical to the essential part of Theorem 1 of
Schwartz (1967). It is interesting to see that he arrived at these
conditions on $f$ without the notion of Schur-convexity. From our viewpoint
the meaning of the conditions is clearer here. Note that Corollary
2.2 is equivalent to Theorem 2.1.
The rest of the paper will be devoted to the proof of the above
theorems.
3. Stein's Theorem and the Exponential Family.
In this section we state Stein's theorem and show how our problem
fits its setting.
An exponential family of distributions consists of a finite-dimensional
Euclidean space $\mathcal{Y}$, a measure $m$ on the $\sigma$-algebra $\mathcal{B}$ of all ordinary
Borel sets of $\mathcal{Y}$, a subset $\Omega$ of the adjoint space $\mathcal{Y}'$ (the linear
space of all real-valued linear functions on $\mathcal{Y}$) such that
(3.1)   $\psi(\omega) = \int_{\mathcal{Y}} e^{\omega' y}\, dm(y) < \infty, \quad \omega \in \Omega,$
and $P$, the function on $\Omega$ to the set of probability measures on $\mathcal{B}$
given by
        $P_\omega(A) = \frac{1}{\psi(\omega)} \int_A e^{\omega' y}\, dm(y), \quad A \in \mathcal{B}.$
Theorem 3.1 (Stein). Let $(\mathcal{Y}, m, \Omega, P)$ be an exponential family and
$\Omega_0$ a nonempty proper subset of $\Omega$. Let $A$ be a subset of $\mathcal{Y}$ such that
(i) $A$ is closed and convex and (ii) for every vector $\omega \in \mathcal{Y}'$ and real
$c$ for which $\{y \mid \omega' y > c\}$ and $A$ are disjoint, there exists $\omega_1 \in \Omega$
such that for arbitrarily large $\lambda$, $\omega_1 + \lambda\omega \in \Omega - \Omega_0$. Then the test
with acceptance region $A$ is admissible for testing the hypothesis
$\omega \in \Omega_0$ against the alternative $\omega \in \Omega - \Omega_0$.
Note that $A$ need not be closed if the boundary of $A$ has $m$-measure
zero. (See Proposition 1 in Appendix.)
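We rewrite the distribution of $(X,Y,Z)$ in an exponential form. Let
$U = XX' + YY' + ZZ' = (u_{ij})$ and $\Sigma^{-1} = (\sigma^{ij})$. For a general matrix
$C = (c_1,\ldots,c_k)$ let $\mathrm{vec}(C) = (c_1',\ldots,c_k')'$. The density of $(X,Y,Z)$ can
be written as
(3.2)   $f(X,Y,Z) = K(M,N,\Sigma) \exp\{\operatorname{tr} M'\Sigma^{-1}X + \operatorname{tr} N'\Sigma^{-1}Y - \tfrac{1}{2}\operatorname{tr} \Sigma^{-1}U\}$
        $= K(M,N,\Sigma) \exp\{\omega_{(1)}' y_{(1)} + \omega_{(2)}' y_{(2)} + \omega_{(3)}' y_{(3)}\},$
where
        $K(M,N,\Sigma) = \dfrac{\exp\{-\tfrac{1}{2}\operatorname{tr} \Sigma^{-1}(MM' + NN')\}}{(2\pi)^{p(m+r+n)/2}\, |\Sigma|^{(m+r+n)/2}},$
(3.3)   $\omega_{(1)} = \mathrm{vec}(\Sigma^{-1}M), \quad \omega_{(2)} = \mathrm{vec}(\Sigma^{-1}N),$
        $\omega_{(3)} = -\tfrac{1}{2}(\sigma^{11}, 2\sigma^{12}, \ldots, 2\sigma^{1p}, \sigma^{22}, \ldots, \sigma^{pp})',$
        $y_{(1)} = \mathrm{vec}(X), \quad y_{(2)} = \mathrm{vec}(Y),$
        $y_{(3)} = (u_{11}, u_{12}, \ldots, u_{1p}, u_{22}, \ldots, u_{pp})'.$
If we denote the mapping $(X,Y,Z) \to y = (y_{(1)}', y_{(2)}', y_{(3)}')'$ by $g$,
(3.4)   $y = g(X,Y,Z),$
then the measure of a set $A$ in the space of $y$ is
(3.5)   $m(A) = \mu(g^{-1}(A)),$
where $\mu$ is the ordinary Lebesgue measure on $R^{p(m+r+n)}$. We note that
$(X,Y,U)$ is a sufficient statistic and so is $y = (y_{(1)}', y_{(2)}', y_{(3)}')'$.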
Because a test which is admissible with respect to the class of tests
based on a sufficient statistic is admissible in the whole class of tests,
we consider only tests based on a sufficient statistic. Then the acceptance
regions of these tests are subsets in the space of y. The density of
y given by the right hand side of (3.2) is of the form of the exponential
family and therefore we can apply Stein's theorem. Furthermore, since the
transformation (X,Y,U) ->y is linear, we prove the convexity of an acceptance
region in terms of $(X,Y,U)$. The acceptance region of an invariant test is
given in terms of $\lambda(M(V)) = (\lambda_1,\ldots,\lambda_m)'$. Therefore, in order to prove the
admissibility of these tests we have to check that the inverse image of $A$,
        $\tilde{A} = \{V \mid \lambda(M(V)) \in A\},$
satisfies the conditions of Stein's theorem. We carry this out in the next
section.
For the sequel we use the following notation:
$A \ge B$ if and only if $A - B$ is positive semidefinite,
$A > B$ if and only if $A - B$ is positive definite.
4. Proofs.
We start with lemmas concerning matrix inequalities.
Lemma 4.1. For $0 \le p = 1 - q \le 1$,
(4.1)   $pU_1 + qU_2 - (pY_1 + qY_2)(pY_1 + qY_2)' \ge p(U_1 - Y_1Y_1') + q(U_2 - Y_2Y_2').$
Proof. The left-hand side minus the right-hand side is
(4.2)   $pY_1Y_1' + qY_2Y_2' - p^2Y_1Y_1' - q^2Y_2Y_2' - pq(Y_1Y_2' + Y_2Y_1')$
        $= p(1-p)Y_1Y_1' + q(1-q)Y_2Y_2' - pq(Y_1Y_2' + Y_2Y_1')$
        $= pq(Y_1 - Y_2)(Y_1 - Y_2)' \ge 0.$   Q.E.D.
Lemma 4.2. If $A \ge B > 0$, then $A^{-1} \le B^{-1}$.
Proof. Let $A = FDF'$ and $B = FF'$ where $D$ is diagonal and $F$
is nonsingular. Then $A \ge B$ implies $D \ge I$, and
        $B^{-1} - A^{-1} = (F')^{-1}(I - D^{-1})F^{-1} \ge 0$
because $I - D^{-1} \ge 0$. Q.E.D.
Lemma 4.3. If $A > 0$, then $f(x,A) = x'A^{-1}x$ is convex in $(x,A)$.
Proof. If $A = D$ is diagonal, then
        $f(x,D) = x'D^{-1}x = \sum_i x_i^2/d_{ii}$
is a convex function because it is the sum of convex functions $x_i^2/d_{ii}$.
(The matrix of second-order partial derivatives of $x^2/d$ is positive
semidefinite.) In general, we can write
        $A_1 = FDF', \quad A_2 = FF'.$
Hence
(4.3)   $(px_1 + qx_2)'(pA_1 + qA_2)^{-1}(px_1 + qx_2)$
        $= (pF^{-1}x_1 + qF^{-1}x_2)'(pD + qI)^{-1}(pF^{-1}x_1 + qF^{-1}x_2)$
        $\le p(F^{-1}x_1)'D^{-1}(F^{-1}x_1) + q(F^{-1}x_2)'(F^{-1}x_2)$
        $= px_1'A_1^{-1}x_1 + qx_2'A_2^{-1}x_2.$   Q.E.D.
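The parenthetical claim in the diagonal case can be verified by writing out the Hessian of the scalar function $h(x,d) = x^2/d$ for $d > 0$. The following worked computation is supplied for illustration; the symbol $h$ is introduced here and is not part of the original proof.

```latex
\[
\nabla^2 h(x,d)
= \begin{pmatrix}
    \dfrac{2}{d} & -\dfrac{2x}{d^{2}} \\[2mm]
    -\dfrac{2x}{d^{2}} & \dfrac{2x^{2}}{d^{3}}
  \end{pmatrix}
= \frac{2}{d^{3}}
  \begin{pmatrix} d \\ -x \end{pmatrix}
  \begin{pmatrix} d & -x \end{pmatrix},
\qquad d > 0 .
\]
```

The Hessian is a nonnegative multiple of a rank-one outer product, hence positive semidefinite, so $x^2/d$ is jointly convex in $(x,d)$ on $d > 0$.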
Lemma 4.4. If $A_1 > 0$, $A_2 > 0$, then
(4.4)   $(pB_1 + qB_2)'(pA_1 + qA_2)^{-1}(pB_1 + qB_2) \le pB_1'A_1^{-1}B_1 + qB_2'A_2^{-1}B_2.$
Proof. From Lemma 4.3, we have for all $y$
        $py'B_1'A_1^{-1}B_1y + qy'B_2'A_2^{-1}B_2y - y'(pB_1 + qB_2)'(pA_1 + qA_2)^{-1}(pB_1 + qB_2)y$
        $= p(B_1y)'A_1^{-1}(B_1y) + q(B_2y)'A_2^{-1}(B_2y)$
        $\quad - (pB_1y + qB_2y)'(pA_1 + qA_2)^{-1}(pB_1y + qB_2y)$
        $\ge 0.$
Thus the matrix of the quadratic form in $y$ is positive semidefinite. Q.E.D.
The relation as in (4.4) is sometimes called "matrix convexity". (See
Marshall and Olkin (1979).)
Theorem 4.1.
        $M(pV_1 + qV_2) \le pM(V_1) + qM(V_2),$
where
        $V_1 = (X_1,Y_1,U_1), \quad V_2 = (X_2,Y_2,U_2),$
        $U_1 - Y_1Y_1' > 0, \quad U_2 - Y_2Y_2' > 0,$
        $0 \le p = 1 - q \le 1.$
Proof. Lemma 4.1 and Lemma 4.2 show that
(4.5)   $[pU_1 + qU_2 - (pY_1 + qY_2)(pY_1 + qY_2)']^{-1} \le [p(U_1 - Y_1Y_1') + q(U_2 - Y_2Y_2')]^{-1}.$
This implies
(4.6)   $M(pV_1 + qV_2) \le (pX_1 + qX_2)'[p(U_1 - Y_1Y_1') + q(U_2 - Y_2Y_2')]^{-1}(pX_1 + qX_2).$
Then Lemma 4.4 implies that the right-hand side of (4.6) is less than or
equal to
        $pX_1'(U_1 - Y_1Y_1')^{-1}X_1 + qX_2'(U_2 - Y_2Y_2')^{-1}X_2 = pM(V_1) + qM(V_2).$   Q.E.D.
For the next step we need several lemmas concerning majorization.
Proofs are given here so that the paper is self-contained. For a full
discussion see Marshall and Olkin (1979).
Lemma 4.5.
(4.7)   $\sum_{i=1}^{k} \lambda_i(A) = \max_{R'R = I_k} \operatorname{tr} R'AR.$
Proof.
(4.8)   $\max_{R'R = I_k} \operatorname{tr} R'AR = \max_{Q'Q = I_k} \operatorname{tr} Q'DQ,$
where $D$ is a diagonal matrix with
        $(d_{11},\ldots,d_{mm})' = \lambda(A).$
If $Q$ is the appropriate submatrix of the identity matrix, equality in
(4.7) is achieved. To prove the inequality we augment $Q$ to a full orthogonal
matrix $G$ and let
        $B = G'DG = (b_{ij}).$
Then
        $\operatorname{tr} Q'DQ = b_{11} + \cdots + b_{kk}.$
Note that
        $b_{ii} = \sum_{j=1}^{m} d_{jj}\, g_{ji}^2,$
or
        $(b_{11},\ldots,b_{mm})' = P\lambda,$
where
        $P = (p_{ij}) = (g_{ji}^2).$
$P$ is a doubly stochastic matrix. Then (4.7) follows from $\lambda \succ_w P\lambda$
(Lemma 4.6). Q.E.D.
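Lemma 4.6. If $P$ is an $m \times m$ doubly stochastic matrix, then
        $Py \prec_w y.$
Proof.
(4.9)   $\sum_{i=1}^{k} \sum_{j=1}^{m} p_{ij} y_j = \sum_{j=1}^{m} g_j y_j,$
where
        $g_j = \sum_{i=1}^{k} p_{ij} \quad (0 \le g_j \le 1, \ \sum_{j=1}^{m} g_j = k).$
Then
        $\sum_{j=1}^{m} g_j y_j - \sum_{i=1}^{k} y_i = \sum_{i=1}^{m} g_i y_i - \sum_{i=1}^{k} y_i + y_k\bigl(k - \sum_{i=1}^{m} g_i\bigr)$
        $= \sum_{i=1}^{k} (y_i - y_k)(g_i - 1) + \sum_{i=k+1}^{m} (y_i - y_k) g_i$
        $\le 0.$   Q.E.D.
Remark. Lemma 4.6 holds with $\prec$ replacing $\prec_w$.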
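A small numerical illustration of the lemma and the remark: applying a doubly stochastic matrix averages the coordinates, so the partial sums of the ordered result cannot exceed those of the original vector, while the total is preserved. The sketch below is illustrative only; the matrix `P`, the vector `y`, and all function names are made up for the example.

```python
def partial_sums_desc(v):
    """Partial sums of the coordinates of v arranged in nonascending order."""
    total, sums = 0.0, []
    for x in sorted(v, reverse=True):
        total += x
        sums.append(total)
    return sums


def is_doubly_stochastic(P, tol=1e-12):
    """Nonnegative square matrix whose rows and columns each sum to one."""
    m = len(P)
    rows_ok = all(
        len(row) == m
        and all(x >= -tol for x in row)
        and abs(sum(row) - 1.0) <= tol
        for row in P
    )
    cols_ok = all(abs(sum(P[i][j] for i in range(m)) - 1.0) <= tol for j in range(m))
    return rows_ok and cols_ok


def apply_matrix(P, y):
    """Matrix-vector product P y."""
    return [sum(p * t for p, t in zip(row, y)) for row in P]


def is_majorized_by(z, y, tol=1e-12):
    """True if z is majorized by y: ordered partial sums dominated, totals equal."""
    dominated = all(
        a <= b + tol for a, b in zip(partial_sums_desc(z), partial_sums_desc(y))
    )
    return dominated and abs(sum(z) - sum(y)) <= tol


if __name__ == "__main__":
    P = [[0.5, 0.5, 0.0], [0.25, 0.25, 0.5], [0.25, 0.25, 0.5]]
    y = [3.0, 1.0, 0.0]
    print(is_doubly_stochastic(P), apply_matrix(P, y), is_majorized_by(apply_matrix(P, y), y))
```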
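Lemma 4.7. If $A \le B$, then $\lambda(A) \prec_w \lambda(B)$.
Proof. From Lemma 4.5,
(4.10)  $\sum_{i=1}^{k} \lambda_i(A) = \max_{R'R = I_k} \operatorname{tr} R'AR \le \max_{R'R = I_k} \operatorname{tr} R'BR = \sum_{i=1}^{k} \lambda_i(B).$   Q.E.D.
Remark. Actually $A \le B$ implies $\lambda_i(A) \le \lambda_i(B)$, $i = 1,\ldots,m$,
but this stronger result is not needed.
Lemma 4.8.
        $\lambda(A+B) \prec_w \lambda(A) + \lambda(B).$
Proof.
(4.11)  $\sum_{i=1}^{k} \lambda_i(A+B) = \max_{R'R = I_k} \operatorname{tr} R'(A+B)R$
        $\le \max_{R'R = I_k} \operatorname{tr} R'AR + \max_{R'R = I_k} \operatorname{tr} R'BR$
        $= \sum_{i=1}^{k} \lambda_i(A) + \sum_{i=1}^{k} \lambda_i(B)$
        $= \sum_{i=1}^{k} \{\lambda_i(A) + \lambda_i(B)\}.$   Q.E.D.
Remark. Lemma 4.8 holds with $\prec$ replacing $\prec_w$.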
Now the matrix inequality of Theorem 4.1 translates into majorization
of vectors of characteristic roots.
Theorem 4.2.
        $\lambda(M(pV_1 + qV_2)) \prec_w p\lambda(M(V_1)) + q\lambda(M(V_2)).$
Proof. Theorem 4.1 and Lemma 4.7 imply
(4.12)  $\lambda(M(pV_1 + qV_2)) \prec_w \lambda(pM(V_1) + qM(V_2)).$
Then by Lemma 4.8
(4.13)  $\lambda(pM(V_1) + qM(V_2)) \prec_w p\lambda(M(V_1)) + q\lambda(M(V_2)).$
From (4.12) and (4.13) Theorem 4.2 follows. Q.E.D.
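Theorem 4.2 can be spot-checked numerically in the smallest nontrivial case. The sketch below assumes $p = m = 2$ and $r = 1$ so that every matrix operation can be written out by hand with the standard library; the sampling scheme, the helper names, and the tolerance are illustrative assumptions, and a passing check is not a substitute for the proof.

```python
import math
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def combine(p, a, q, b):
    """Entrywise p*a + q*b."""
    return [[p * x + q * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    (s, t), (u, v) = a
    det = s * v - t * u
    return [[v / det, -t / det], [-u / det, s / det]]


def roots(V):
    """Ordered characteristic roots of M(V) = X'(U - YY')^{-1} X when p = m = 2."""
    X, Y, U = V
    W = combine(1.0, U, -1.0, matmul(Y, transpose(Y)))
    M = matmul(matmul(transpose(X), inv2(W)), X)
    half_trace = (M[0][0] + M[1][1]) / 2.0
    radius = math.hypot((M[0][0] - M[1][1]) / 2.0, (M[0][1] + M[1][0]) / 2.0)
    return [half_trace + radius, half_trace - radius]


def random_point(rng, n=3):
    """Sample V = (X, Y, U) with U = XX' + YY' + ZZ', so that U - YY' > 0 a.s."""
    X = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    Y = [[rng.gauss(0, 1)] for _ in range(2)]
    Z = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(2)]
    YZ = combine(1.0, matmul(Y, transpose(Y)), 1.0, matmul(Z, transpose(Z)))
    U = combine(1.0, matmul(X, transpose(X)), 1.0, YZ)
    return (X, Y, U)


def check_theorem_4_2(V1, V2, p, tol=1e-9):
    """Weak majorization of the roots at pV1 + qV2 by the mixed roots (m = 2)."""
    q = 1.0 - p
    mix = tuple(combine(p, a, q, b) for a, b in zip(V1, V2))
    lhs = roots(mix)
    rhs = [p * a + q * b for a, b in zip(roots(V1), roots(V2))]
    return lhs[0] <= rhs[0] + tol and sum(lhs) <= sum(rhs) + tol


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [
        check_theorem_4_2(random_point(rng), random_point(rng), rng.random())
        for _ in range(200)
    ]
    print(all(trials))
```

For $m = 2$ weak majorization reduces to two inequalities, one on the largest root and one on the trace, which is what `check_theorem_4_2` tests.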
Now let $A$ be a region in the space of roots and let $\tilde{A}$ be the inverse
image of $A$ in $V$ ($= (X,Y,U)$) space,
        $\tilde{A} = \{V \mid \lambda(M(V)) \in A\}.$
We want to show that $\tilde{A}$ is convex for a region $A$ satisfying the condition of
Theorem 2.2. Let $V_i = (X_i,Y_i,U_i) \in \tilde{A}$, $i = 1,2$. Then $\lambda(M(V_i)) \in A$,
$i = 1,2$, and by convexity of $A$ we have $p\lambda(M(V_1)) + q\lambda(M(V_2)) \in A$. Then
by Theorem 4.2 and monotonicity in majorization of $A$,
        $\lambda(M(pV_1 + qV_2)) \in A.$
Hence $pV_1 + qV_2 \in \tilde{A}$ and $\tilde{A}$ is convex. Furthermore the boundary of $\tilde{A}$
has probability 0. (See Propositions 1 and 2 in Appendix.) Therefore,
condition (i) of Stein's theorem is satisfied.
[Figure 4.1]
For the condition (ii) we have the following lemma. The proof was
suggested by Charles Stein.
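Lemma 4.9. For the acceptance region $A$ of Theorem 2.1 or Theorem
2.2 the condition (ii) of Stein's theorem is satisfied.
Proof. Let $\omega$ correspond to $(\Phi,\Psi,\Theta)$; then
(4.14)  $\omega' y = \omega_{(1)}' y_{(1)} + \omega_{(2)}' y_{(2)} + \omega_{(3)}' y_{(3)}$
        $= \operatorname{tr} \Phi'X + \operatorname{tr} \Psi'Y - \tfrac{1}{2}\operatorname{tr} \Theta U,$
where $\Theta$ is symmetric. Suppose that $\{y \mid \omega' y > c\}$ is disjoint from $\tilde{A}$.
We want to show that in this case $\Theta$ is positive semidefinite. If this
were not true, then
        $\Theta = F \begin{pmatrix} I & 0 & 0 \\ 0 & -I & 0 \\ 0 & 0 & 0 \end{pmatrix} F',$
where $F$ is nonsingular and $-I$ is not vacuous. Let
        $X = (1/\gamma)X_0, \quad Y = (1/\gamma)Y_0,$
        $U = (F')^{-1} \begin{pmatrix} I & 0 & 0 \\ 0 & \gamma I & 0 \\ 0 & 0 & I \end{pmatrix} F^{-1},$
        $V = (X,Y,U),$
where $X_0$, $Y_0$ are fixed matrices and $\gamma$ is a positive number. Then
(4.15)  $\omega' y = \frac{1}{\gamma}\operatorname{tr} \Phi'X_0 + \frac{1}{\gamma}\operatorname{tr} \Psi'Y_0 - \frac{1}{2}\operatorname{tr} \begin{pmatrix} I & 0 & 0 \\ 0 & -\gamma I & 0 \\ 0 & 0 & 0 \end{pmatrix} > c$
for sufficiently large $\gamma$. On the other hand,
(4.16)  $\lambda(M(V)) = \lambda\{X'(U - YY')^{-1}X\}$
        $= \frac{1}{\gamma^2}\, \lambda\Bigl\{X_0'\bigl(U - \frac{1}{\gamma^2} Y_0Y_0'\bigr)^{-1} X_0\Bigr\}$
        $\to 0$ as $\gamma \to \infty$.
Therefore, $V \in \tilde{A}$ for sufficiently large $\gamma$. This is a contradiction.
Hence $\Theta$ is positive semidefinite.
Now let $\omega_1$ correspond to $(\Phi_1, 0, I)$, where $\Phi_1 \ne 0$. Then $I + \lambda\Theta$ is
positive definite and $\Phi_1 + \lambda\Phi \ne 0$ for sufficiently large $\lambda$. Hence,
        $\omega_1 + \lambda\omega \in \Omega - \Omega_0$
for sufficiently large $\lambda$. Q.E.D.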
Now we have proved Theorem 2.2.
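Proof of Corollary 2.1. Being a sum of convex functions $f$ is convex
and hence $A$ is convex. $A$ is closed because $f$ is continuous. We want
to show that if $f(x) \le c$ and $y \prec_w x$ ($x, y \in R^m_{\le}$) then $f(y) \le c$. Let
$\bar{x}_k = \sum_{i=1}^{k} x_i$, $\bar{y}_k = \sum_{i=1}^{k} y_i$. Then $y \prec_w x$ if and only if $\bar{x}_k \ge \bar{y}_k$,
$k = 1,\ldots,m$. Let $f(x) = h(\bar{x}_1,\ldots,\bar{x}_m) = g(\bar{x}_1) + \sum_{i=2}^{m} g(\bar{x}_i - \bar{x}_{i-1})$. It
suffices to show that $h(\bar{x}_1,\ldots,\bar{x}_m)$ is increasing in each $\bar{x}_i$. For
$i \le m-1$ the convexity of $g$ implies that
        $h(\bar{x}_1,\ldots,\bar{x}_i + \epsilon,\ldots,\bar{x}_m) - h(\bar{x}_1,\ldots,\bar{x}_i,\ldots,\bar{x}_m)$
        $= g(x_i + \epsilon) - g(x_i) - \{g(x_{i+1}) - g(x_{i+1} - \epsilon)\}$
        $\ge 0.$
For $i = m$ the monotonicity of $g$ implies
        $h(\bar{x}_1,\ldots,\bar{x}_m + \epsilon) - h(\bar{x}_1,\ldots,\bar{x}_m) = g(x_m + \epsilon) - g(x_m) \ge 0.$   Q.E.D.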
Lemma 4.10. $A \subset R^m_{\le}$ is convex and monotone in majorization if
and only if $A$ is monotone and $A^*$ is convex.
Proof.
Necessity. If $A$ is monotone in majorization, then it is obviously
monotone. By Proposition 3 of Appendix $A^*$ is convex.
Sufficiency. For $\lambda \in R^m_{\le}$ let
(4.17)  $C(\lambda) = \{x \mid x \in R^m_{+},\ x \prec_w \lambda\},$
        $D(\lambda) = \{x \mid x \in R^m_{\le},\ x \prec_w \lambda\}.$
It will be proved in Lemma 4.12 and its corollary that monotonicity
of $A$ and convexity of $A^*$ imply $C(\lambda) \subset A^*$. Then $D(\lambda) = C(\lambda) \cap R^m_{\le} \subset
A^* \cap R^m_{\le} = A$. Now suppose $\nu \in R^m_{\le}$ and $\nu \prec_w \lambda$. Then $\nu \in D(\lambda) \subset A$. This
shows that $A$ is monotone in majorization. Furthermore, if $A^*$ is convex,
then $A = R^m_{\le} \cap A^*$ is convex. Q.E.D.
[Figure 4.2: the sets $C(\lambda)$ and $D(\lambda)$, with the extreme points of $C(\lambda)$ marked]
Lemma 4.11. Let $C$ and $D$ be closed and convex. If the extreme points
of $C$ are contained in $D$, then $C \subset D$.
Proof. Obvious.
Now we present the following key lemma.
Lemma 4.12. The extreme points of $C(\lambda)$ are all vectors of the form
(4.18)  $(\delta_{\pi(1)}\lambda_{\pi(1)},\ldots,\delta_{\pi(m)}\lambda_{\pi(m)})',$
where $\pi$ is a permutation of $(1,\ldots,m)$ and $\delta_1 = \cdots = \delta_k = 1$,
$\delta_{k+1} = \cdots = \delta_m = 0$ for some $k$.
Proof. $C(\lambda)$ is convex. (See Proposition 4 of Appendix.) Now
note that $C(\lambda)$ is permutation symmetric, that is, if $(x_1,\ldots,x_m)' \in C(\lambda)$,
then $(x_{\pi(1)},\ldots,x_{\pi(m)})' \in C(\lambda)$ for any permutation $\pi$. Therefore, for
any permutation $\pi$, $\pi(C(\lambda)) = \{(x_{\pi(1)},\ldots,x_{\pi(m)})' \mid x \in C(\lambda)\}$ coincides
with $C(\lambda)$. This implies that if $(x_1,\ldots,x_m)'$ is an extreme point of
$C(\lambda)$, then $(x_{\pi(1)},\ldots,x_{\pi(m)})'$ is also an extreme point. In particular,
$(x_{[1]},\ldots,x_{[m]})' \in R^m_{\le}$ is an extreme point. Conversely if $(x_1,\ldots,x_m)' \in R^m_{\le}$
is an extreme point of $C(\lambda)$, then $(x_{\pi(1)},\ldots,x_{\pi(m)})'$ is an extreme point.
We see that once we enumerate the extreme points of $C(\lambda)$ in $R^m_{\le}$, the
rest of the extreme points can be obtained by permutation.
Suppose $x \in R^m_{\le}$. An extreme point being the intersection of $m$
hyperplanes has to satisfy $m$ of the following $2m$ equations:
(4.19)  $E_1$: $x_1 = 0$,    $F_1$: $x_1 = \lambda_1$,
        $E_2$: $x_2 = 0$,    $F_2$: $x_1 + x_2 = \lambda_1 + \lambda_2$,
        ...
        $E_m$: $x_m = 0$,    $F_m$: $x_1 + \cdots + x_m = \lambda_1 + \cdots + \lambda_m$.
Suppose that $k$ is the first index such that $E_k$ holds (namely, suppose
that $F_1,\ldots,F_{k-1},E_k$ hold). Then $x \in R^m_{\le}$ implies
        $0 = x_k \ge x_{k+1} \ge \cdots \ge x_m \ge 0.$
Therefore, $E_{k+1},\ldots,E_m$ hold. This gives the point $x = (\lambda_1,\ldots,\lambda_{k-1},0,\ldots,0)'$,
which is in $R^m_{\le} \cap C(\lambda)$. Therefore $x$ is an extreme point. Q.E.D.
Remark. If $\lambda_1 > \lambda_2 > \cdots > \lambda_m > 0$ above, the total number of distinct
extreme points is
        $\sum_{k=0}^{m} \frac{m!}{(m-k)!} = m! \sum_{k=0}^{m} \frac{1}{k!} \approx m!\,e.$
This follows from the fact that by permutation of $x = (\lambda_1,\ldots,\lambda_k,0,\ldots,0)'$
we obtain $m!/(m-k)!$ distinct points.
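The count in the remark can be confirmed by brute-force enumeration for small $m$. The sketch below lists the distinct permutations of the truncated vectors for $k = 0,\ldots,m$ and compares the total with the closed form; the function names are illustrative, and the input is assumed to have strictly decreasing positive coordinates.

```python
from itertools import permutations
from math import e, factorial


def extreme_points(lam):
    """Distinct permutations of (lam_1, ..., lam_k, 0, ..., 0) for k = 0, ..., m."""
    m = len(lam)
    points = set()
    for k in range(m + 1):
        base = tuple(lam[:k]) + (0,) * (m - k)
        points.update(permutations(base))
    return points


def expected_count(m):
    """Closed form: sum over k of m! / (m - k)!."""
    return sum(factorial(m) // factorial(m - k) for k in range(m + 1))


if __name__ == "__main__":
    lam = (3, 2, 1)
    print(len(extreme_points(lam)), expected_count(len(lam)), factorial(len(lam)) * e)
```

When some coordinates of the input coincide or vanish, distinct choices of $k$ and $\pi$ can give the same point, so the enumeration returns fewer points than the closed form.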
Mirsky (1959) gave the explicit expression of the extreme points in
the form
(4.20)  $(\delta_1\lambda_{\pi(1)},\ldots,\delta_m\lambda_{\pi(m)})',$
where $\pi$ is a permutation and the $\delta_i$'s are zero or one. Actually this set
of points includes some points that are not extreme points.
Corollary 4.1.
        $C(\lambda) \subset A^*.$
Proof. If $A$ is monotone, then $A^*$ is monotone in the sense that
if $\lambda = (\lambda_1,\ldots,\lambda_m)' \in A^*$, $\nu = (\nu_1,\ldots,\nu_m)'$, $\nu_i \le \lambda_i$, $i = 1,\ldots,m$,
then $\nu \in A^*$. (See Proposition 5 of Appendix.) Now the extreme points
of $C(\lambda)$ given by (4.18) are in $A^*$ because of permutation symmetry
and monotonicity of $A^*$. Hence by Lemma 4.11 $C(\lambda) \subset A^*$. Q.E.D.
Proof of Theorem 2.1. Immediate from Theorem 2.2 and Lemma 4.10. Q.E.D.
Appendix.
We prove here some technical propositions to complete the argument
in Section 4.
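Proposition 1. The condition (i) in Stein's theorem can be replaced
by (i') $A$ is convex and the boundary of $A$ has $m$-measure zero.
Proof. If $A$ is convex, then $\bar{A}$ (= closure of $A$) is convex.
Furthermore,
        $A \cap \{y \mid \omega' y > c\} = \emptyset \ \Rightarrow\ A \subset \{y \mid \omega' y \le c\}$
        $\Rightarrow\ \bar{A} \subset \{y \mid \omega' y \le c\} \ \Rightarrow\ \bar{A} \cap \{y \mid \omega' y > c\} = \emptyset.$
Therefore Stein's theorem holds with $A$ replaced by $\bar{A}$. Finally, $m(\bar{A} - A) = 0$
implies that tests with the acceptance regions $A$ and $\bar{A}$ are equivalent
(for any $\omega \in \Omega$).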
Proposition 2. The boundary of $\tilde{A}$ in the proof of Theorem 2.2
has $m$-measure zero.
Proof. We claim that
        closure of $\tilde{A} \subset \tilde{A} \cup \{V \mid U - YY' \text{ is singular}\} = \tilde{A} \cup C,$
where $C = \{V \mid U - YY' \text{ is singular}\}$. Obviously $m(C) = 0$. This implies
$m(C) \ge m(\text{boundary of } \tilde{A}) = 0$. Now suppose $V = (X,Y,U) \in$ closure of $\tilde{A}$.
Then $V = \lim V_i = \lim (X_i,Y_i,U_i)$, where $V_i \in \tilde{A}$ or $\lambda(M(V_i)) \in A$. If
$U - YY'$ is singular, $V \in C$. If $U - YY'$ is nonsingular,
        $M(V) = X'(U - YY')^{-1}X$
        $= \lim M(V_i) = \lim X_i'(U_i - Y_iY_i')^{-1}X_i,$
        $\lambda(M(V)) = \lim \lambda(M(V_i))$
by continuity. Since $A$ is closed, $\lambda(M(V)) \in A$. Then $V \in \tilde{A}$. Q.E.D.
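Proposition 3. If $A \subset R^m_{\le}$ is convex and monotone in majorization,
then $A^*$ is convex.
Proof. Suppose $x, y \in A^*$. For a vector $z$ let $z_{\downarrow}$ denote
$z_{\downarrow} = (z_{[1]},\ldots,z_{[m]})' \in R^m_{\le}$. Now
        $(px + qy)_{\downarrow} \prec_w px_{\downarrow} + qy_{\downarrow}$
because
        $\max_i (px_i + qy_i) \le p \max_i x_i + q \max_i y_i,$
        $\max_{(i,j)} \{(px_i + qy_i) + (px_j + qy_j)\} \le p \max_{(i,j)} (x_i + x_j) + q \max_{(i,j)} (y_i + y_j),$
etc. Hence, by convexity and monotonicity in majorization of $A$,
$(px + qy)_{\downarrow} \in A$ and $px + qy \in A^*$. Q.E.D.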
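Proposition 4. $C(\lambda)$ is convex.
Proof. Let $x \prec_w \lambda$, $y \prec_w \lambda$. Then $x_{\downarrow} \prec_w \lambda$, $y_{\downarrow} \prec_w \lambda$. As in the proof of
Proposition 3 we have
        $(px + qy)_{\downarrow} \prec_w px_{\downarrow} + qy_{\downarrow}.$
Now $px_{\downarrow} + qy_{\downarrow} \prec_w p\lambda + q\lambda = \lambda$. Hence
        $(px + qy)_{\downarrow} \prec_w \lambda$
and
        $px + qy \prec_w \lambda.$
It is obvious that if $x, y \in R^m_{+}$, then $px + qy \in R^m_{+}$. Q.E.D.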
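Proposition 5. If $A$ is monotone, then $A^*$ is monotone.
Proof. Let $y \in A^*$ and for $x$ assume $x_i \le y_i$, $i = 1,\ldots,m$. The
point of this proposition is that the permutations transforming $x \to x_{\downarrow}$,
$y \to y_{\downarrow}$ might be different. Note the relation
        $x_{[k]} = \max_{(i_1,\ldots,i_k)} \{\min(x_{i_1},\ldots,x_{i_k})\}.$
Then for any $(i_1,\ldots,i_k)$
        $\min(x_{i_1},\ldots,x_{i_k}) \le \min(y_{i_1},\ldots,y_{i_k})$
        $\le \max_{(j_1,\ldots,j_k)} \{\min(y_{j_1},\ldots,y_{j_k})\} = y_{[k]}.$
Hence $x_{[k]} \le y_{[k]}$, $k = 1,\ldots,m$. This with $y_{\downarrow} \in A$ implies $x_{\downarrow} \in A$.
Hence $x \in A^*$.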
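The max-min relation used in the proof says that the $k$-th largest coordinate is the largest value attainable as the minimum of some $k$ coordinates. A short brute-force check, with hypothetical function names, illustrates both the relation and the resulting coordinatewise comparison of the ordered vectors.

```python
from itertools import combinations


def kth_largest_via_maxmin(x, k):
    """Maximum over all k-subsets of the minimum coordinate in the subset."""
    return max(min(subset) for subset in combinations(x, k))


def ordered(x):
    """Coordinates rearranged in nonascending order."""
    return sorted(x, reverse=True)


if __name__ == "__main__":
    x = [0.2, 0.9, 0.5, 0.5]
    print([kth_largest_via_maxmin(x, k) for k in range(1, len(x) + 1)], ordered(x))

    # x <= y coordinatewise, but the sorting permutations differ.
    x2, y2 = [0.1, 0.4, 0.3], [0.8, 0.5, 0.9]
    print(all(a <= b for a, b in zip(ordered(x2), ordered(y2))))
```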
Acknowledgment
We are grateful to Ingram Olkin and Charles Stein for their valuable
suggestions.
References
Anderson, T. W. (1958), An Introduction to Multivariate Statistical Analysis.
Wiley, New York.
Ghosh, M. N. (1964), On the admissibility of some tests of MANOVA, Ann. Math.
Statist., 35, 789-794.
Lehmann, E. L. (1959), Testing Statistical Hypotheses. Wiley, New York.
Marshall, A. W. and Olkin, I. (1979), Inequalities: Theory of Majorization
and Its Applications. Academic Press, New York.
Mirsky, L. (1959), On a convex set of matrices, Arch. Math., 10, 88-92.
Schwartz, R. (1967), Admissible tests in multivariate analysis of variance,
Ann. Math. Statist., 38, 698-710.
Stein, C. (1956), The admissibility of Hotelling's $T^2$-test, Ann. Math.
Statist., 27, 616-623.
TECHNICAL REPORTS
OFFICE OF NAVAL RESEARCH CONTRACT N00014-67-A-0112-0030 (NR-042-034)
1. "Confidence Limits for the Expected Value of an Arbitrary Bounded Random Variable with a Continuous Distribution Function," T. W. Anderson, October 1, 1969.
2. "Efficient Estimation of Regression Coefficients in Time Series," T. W. Anderson, October 1, 1970.
3. "Determining the Appropriate Sample Size for Confidence Limits for a Proportion," T. W. Anderson and H. Burstein, October 15, 1970.
4. "Some General Results on Time-Ordered Classification," D. V. Hinkley, July 30, 1971.
5. "Tests for Randomness of Directions against Equatorial and Bimodal Alternatives," T. W. Anderson and M. A. Stephens, August 30, 1971.
6. "Estimation of Covariance Matrices with Linear Structure and Moving Average Processes of Finite Order," T. W. Anderson, October 29, 1971.
7. "The Stationarity of an Estimated Autoregressive Process," T. W. Anderson, November 15, 1971.
8. "On the Inverse of Some Covariance Matrices of Toeplitz Type," Raul Pedro Mentz, July 12, 1972.
9. "An Asymptotic Expansion of the Distribution of "Studentized" Classification Statistics," T. W. Anderson, September 10, 1972.
10. "Asymptotic Evaluation of the Probabilities of Misclassification by Linear Discriminant Functions," T. W. Anderson, September 28, 1972.
11. "Population Mixing Models and Clustering Algorithms," Stanley L. Sclove, February 1, 1973.
12. "Asymptotic Properties and Computation of Maximum Likelihood Estimates in the Mixed Model of the Analysis of Variance," John James Miller, November 21, 1973.
13. "Maximum Likelihood Estimation in the Birth-and-Death Process," Niels Keiding, November 28, 1973.
14. "Random Orthogonal Set Functions and Stochastic Models for the Gravity Potential of the Earth," Steffen L. Lauritzen, December 27, 1973.
15. "Maximum Likelihood Estimation of Parameters of an Autoregressive Process with Moving Average Residuals and Other Covariance Matrices with Linear Structure," T. W. Anderson, December, 1973.
16. "Note on a Case-Study in Box-Jenkins Seasonal Forecasting of Time Series," Steffen L. Lauritzen, April, 1974.
17. "General Exponential Models for Discrete Observations," Steffen L. Lauritzen, May, 1974.
18. "On the Interrelationships among Sufficiency, Total Sufficiency and Some Related Concepts," Steffen L. Lauritzen, June, 1974.
19. "Statistical Inference for Multiply Truncated Power Series Distributions," T. Cacoullos, September 30, 1974.
Office of Naval Research Contract N00014-75-C-0442 (NR-042-034)
20. "Estimation by Maximum Likelihood in Autoregressive Moving Average Models in the Time and Frequency Domains," T. W. Anderson, June 1975.
21. "Asymptotic Properties of Some Estimators in Moving Average Models," Raul Pedro Mentz, September 8, 1975.
22. "On a Spectral Estimate Obtained by an Autoregressive Model Fitting," Mituaki Huzii, February 1976.
23. "Estimating Means when Some Observations are Classified by Linear Discriminant Function," Chien-Pai Han, April 1976.
24. "Panels and Time Series Analysis: Markov Chains and Autoregressive Processes," T. W. Anderson, July 1976.
25. "Repeated Measurements on Autoregressive Processes," T. W. Anderson, September 1976.
26. "The Recurrence Classification of Risk and Storage Processes," J. Michael Harrison and Sidney I. Resnick, September 1976.
27. "The Generalized Variance of a Stationary Autoregressive Process," T. W. Anderson and Raul P. Mentz, October 1976.
28. "Estimation of the Parameters of Finite Location and Scale Mixtures," Javad Behboodian, October 1976.
29. "Identification of Parameters by the Distribution of a Maximum Random Variable," T. W. Anderson and S. G. Ghurye, November 1976.
30. "Discrimination Between Stationary Gaussian Processes, Large Sample Results," Will Gersch, January 1977.
31. "Principal Components in the Nonnormal Case: The Test for Sphericity," Christine M. Waternaux, October 1977.
32. "Nonnegative Definiteness of the Estimated Dispersion Matrix in a Multivariate Linear Model," F. Pukelsheim and George P. H. Styan, May 1978.
33. "Canonical Correlations with Respect to a Complex Structure," Steen A. Andersson, July 1978.
34. "An Extremal Problem for Positive Definite Matrices," T. W. Anderson and I. Olkin, July 1978.
35. "Maximum Likelihood Estimation for Vector Autoregressive Moving Average Models," T. W. Anderson, July 1978.
36. "Maximum Likelihood Estimation of the Covariances of the Vector Moving Average Models in the Time and Frequency Domains," F. Ahrabi, August 1978.
37. "Efficient Estimation of a Model with an Autoregressive Signal with White Noise," Y. Hosoya, March 1979.
38. "Maximum Likelihood Estimation of the Parameters of a Multivariate Normal Distribution," T. W. Anderson and I. Olkin, July 1979.
39. "Maximum Likelihood Estimation of the Autoregressive Coefficients and Moving Average Covariances of Vector Autoregressive Moving Average Models," Fereydoon Ahrabi, August 1979.
40. "Smoothness Priors and the Distributed Lag Estimator," Hirotugu Akaike, August, 1979.
41. "Approximating Conditional Moments of the Multivariate Normal Distribution," Joseph G. Deken, December 1979.
42. "Methods and Applications of Time Series Analysis - Part I: Regression, Trends, Smoothing, and Differencing," T. W. Anderson and N. D. Singpurwalla, July 1980.
43. "Cochran's Theorem, Rank Additivity, and Tripotent Matrices," T. W. Anderson and George P. H. Styan, August, 1980.
44. "On Generalizations of Cochran's Theorem and Projection Matrices," Akimichi Takemura, August, 1980.
45. "Existence of Maximum Likelihood Estimators in Autoregressive and Moving Average Models," T. W. Anderson and Raul P. Mentz, Oct. 1980.
46. "Generalized Correlations in the Singular Case," Ashis Sen Gupta, November 1980.
47. "Updating a Discriminant Function on the Basis of Unclassified Data," G. J. McLachlan and S. Ganesalingam, November 1980.
48. "A New Proof of Admissibility of Tests in the Multivariate Analysis of Variance," T. W. Anderson and Akimichi Takemura, January, 1981.
REPORT DOCUMENTATION PAGE
Security classification of this page: UNCLASSIFIED
1. Report number: 48
2. Govt accession no.: AD-A097 110
4. Title (and subtitle): A NEW PROOF OF ADMISSIBILITY OF TESTS IN THE MULTIVARIATE ANALYSIS OF VARIANCE
5. Type of report and period covered: Technical Report
7. Author(s): T. W. ANDERSON and AKIMICHI TAKEMURA
8. Contract or grant number(s): N00014-75-C-0442
9. Performing organization name and address: Department of Statistics, Stanford University, Stanford, California
10. Program element, project, task area and work unit numbers: (NR-042-034)
11. Controlling office name and address: Office of Naval Research, Statistics and Probability Program Code 436, Arlington, Virginia 22217
12. Report date: JANUARY 1981
13. Number of pages: 29
15. Security class. (of this report): UNCLASSIFIED
16. Distribution statement (of this report): APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.
19. Key words: MANOVA, admissibility, majorization, Schur-convexity, extreme points, exponential family.
20. Abstract: A new proof of admissibility of tests in MANOVA is given using Stein's theorem (1956). The convexity condition of Stein's theorem is proved directly by means of majorization rather than by the supporting hyperplane approach. This makes the geometrical meaning of the admissibility result clearer.