Computational Statistics & Data Analysis 5 (1987) 185-192
North-Holland

Sample size determination in estimating a covariance matrix

Pushpa L. GUPTA *
Department of Mathematics, University of Maine, Orono, ME 04469, USA

R.D. GUPTA **
Division of Mathematics, Engineering, Computer Science, University of New Brunswick, Saint John, N.B., Canada E2L 4L5

Received 22 July 1986
Revised 17 January 1987

* Supported by a Faculty Summer Research Grant from the University of Maine.
** Supported by NSERC Research Grant #A-4850.

0167-9473/87/$3.50 © 1987, Elsevier Science Publishers B.V. (North-Holland)

Abstract: The sample size requirements for estimating a covariance matrix with a desired precision in a multivariate normal population are investigated. Explicit formulas for the sample size are provided in the univariate case and in the multivariate case when the covariance matrix is diagonal. In these cases tables are also provided for specific values of the relative error $\varepsilon$ and the joint confidence coefficient $1 - \alpha$. For the general case, a method to compute the sample size is developed, resulting in an integral equation involving the covariance matrix. When a prior estimate of the covariance matrix is available, the integral equation can be solved by using the algorithm given by Russell et al. (1985). Examples are used to illustrate the effects of the dimension and of the quality of the prior estimate of the covariance matrix on the sample size.

Keywords: Sample size, Covariance matrix, Multivariate normal distribution.

1. Introduction

This paper deals with determining the sample size for estimating a covariance matrix in a multivariate normal population with a given joint confidence level and precision. The problem originated when the first author was involved in a project at the USAF School of Aerospace Medicine (USAFSAM). The USAFSAM at Brooks AFB has been interested for several years in the use of statistical methods to develop a computerized system to assist the cardiologists, who must examine a large number of EKGs in a single day, in the screening, diagnosis and serial comparison of vectorcardiograms. Past efforts at USAFSAM in the diagnosis of vectorcardiograms have relied on a Karhunen-Loève approximation of the signal




(750 dimensional in a 3-lead system) together with linear and quadratic discrimination in the transformed space, which is 60 dimensional. The crux of this approach is, therefore, the estimate of the $60 \times 60$ covariance matrix of the Karhunen-Loève coefficients. Its quality can, therefore, be a source of concern for the efficacy of the entire procedure. The quality or accuracy of the covariance matrix estimate is a function of the sample size and the unknown entries. It was suggested that a sample of 750 is sufficient to estimate a $60 \times 60$ covariance matrix with reasonable accuracy. This figure is apparently not based on any theoretical considerations and seems to be low, as is evident from the sample size requirement for the sixty dimensional independent case (see Table 2).

The problem of estimating the variance ($\sigma^2$) of a normal density arises in many experimental situations. As an example (Greenwood and Sandomire [4]), a series of radar pulses is to be sent out to a target and the strength of the return signal measured. How many readings under identical conditions shall be taken so that the standard deviation of the return signal strengths shall, with 80% confidence, be within 10% of the true value?

Greenwood and Sandomire [4] presented a graphical approach for obtaining the sample size required to estimate the variance of a normal density within a given percent of its true value. Graybill and Connell [2] instead have given a two-step sampling procedure to estimate the variance within a given number of units. The number of units and the confidence level are specified in advance. Thompson and Endriss [10] have also given a method for estimating the sample size in the univariate case. Their method depends on the large sample distribution of the estimator. Other work dealing with estimating the variance includes Graybill and Morrison [3], Leone, Rutenberg and Topp [5], Tate and Klett [9] and Graybill [1].

For the sake of completeness, Section 2 gives a brief discussion of how to find the sample size $n$ in the univariate case for a given $\varepsilon$ (the relative error) and a given $\alpha$ (where $1 - \alpha$ is the confidence coefficient).

In Section 3, we develop the procedures for determining the sample size in the multivariate situation, where two cases are studied. In case 1, the covariance matrix $\Sigma$ is taken to be diagonal, while in case 2 it is any general matrix. Table 2 is prepared for case 1 when $p = 2, 5, 10, 20, 40, 60$. For case 2 tables cannot be prepared, as the result is in the form of an integral equation involving $\Sigma$. However, if a prior estimate of $\Sigma$ is available, one can use the algorithm given by Russell et al. [7] to solve the integral equation. The quality of the prior estimate has an intimate effect on the sample size, which is illustrated by some examples. Throughout the paper $p$ denotes the dimension, $\varepsilon$ the relative error and $1 - \alpha$ the joint confidence coefficient when $p \geq 2$.

2. Univariate case

Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Let

\[ S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1). \]


Then $(n-1)S^2/\sigma^2$ has a chi-square distribution with $n - 1$ degrees of freedom. It is well known that large samples are necessary if $\sigma$ is to be estimated accurately.

The problem is to find the sample size $n$ such that

\[ P\left[ \left| \frac{S}{\sigma} - 1 \right| < \varepsilon \right] = 1 - \alpha \tag{2.1} \]

for a given value of $(\varepsilon, \alpha)$. That is,

\begin{align*}
1 - \alpha &= P\left[ 1 - \varepsilon < \frac{S}{\sigma} < 1 + \varepsilon \right] \\
&= P\left[ \sqrt{2(n-1)}\,(1-\varepsilon) < \sqrt{\frac{2(n-1)S^2}{\sigma^2}} < \sqrt{2(n-1)}\,(1+\varepsilon) \right] \\
&= P\left[ \sqrt{2(n-1)}\,(1-\varepsilon) - \sqrt{2(n-1)-1} < Z < \sqrt{2(n-1)}\,(1+\varepsilon) - \sqrt{2(n-1)-1} \right] \\
&\approx \Phi\left[ \sqrt{2(n-1)}\,(1+\varepsilon) - \sqrt{2(n-1)-1} \right] - \Phi\left[ \sqrt{2(n-1)}\,(1-\varepsilon) - \sqrt{2(n-1)-1} \right], \tag{2.2}
\end{align*}

where

\[ Z = \sqrt{\frac{2(n-1)S^2}{\sigma^2}} - \sqrt{2(n-1)-1} \]

is approximately $N(0, 1)$ (see Snedecor and Cochran [8]) and $\Phi$ is the cumulative distribution function of $N(0, 1)$.

Since $n$ is large, equation (2.2) can be written as

\[ 1 - \alpha \approx \Phi\left[ \sqrt{2(n-1)}\,\varepsilon \right] - \Phi\left[ -\sqrt{2(n-1)}\,\varepsilon \right] \]

or

\[ \sqrt{2(n-1)}\,\varepsilon \approx z_{\alpha/2} \]

or

\[ n \approx 1 + \frac{1}{2} \left( \frac{z_{\alpha/2}}{\varepsilon} \right)^2, \tag{2.3} \]

where $P[Z > z_{\alpha/2}] = \alpha/2$. Table 1 gives such values of $n$ for some selected values of $\alpha$ and $\varepsilon$.
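Formula (2.3) is straightforward to evaluate numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` for the normal quantile; the function name `univariate_sample_size` and the choice to round up to the next integer are assumptions made for this illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def univariate_sample_size(eps: float, alpha: float) -> int:
    """Approximate n from (2.3): n = 1 + (z_{alpha/2} / eps)**2 / 2."""
    if not (eps > 0 and 0 < alpha < 1):
        raise ValueError("need eps > 0 and 0 < alpha < 1")
    # Upper alpha/2 point of N(0, 1), i.e. P[Z > z] = alpha / 2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Rounding up to an integer is an assumption of this sketch.
    return math.ceil(1 + 0.5 * (z / eps) ** 2)


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        print(alpha, [univariate_sample_size(e, alpha) for e in (0.05, 0.10)])
```

For the radar example in the Introduction (standard deviation within 10% of the true value with 80% confidence, so $\varepsilon = 0.10$ and $\alpha = 0.20$), this approximation gives $n = 84$.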

3. Multivariate case

Let $X_1, X_2, \ldots, X_p$ be $p$ random variables measured for each object or subject. Let us also assume that $X = (X_1, X_2, \ldots, X_p)' \sim N_p(\mu, \Sigma)$.


Table 1
Sample size for the univariate case

                                 $\varepsilon$
$\alpha$   0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.09    0.10
0.01       33180   8295    3688    2075    1329    933     679     411     333
0.05       19210   4804    2136    1202    770     535     394     239     194
0.10       13530   3384    1505    847     543     377     278     169     137

Case 1. Suppose $X_1, X_2, \ldots, X_p$ are independently distributed, i.e. $\Sigma = \mathrm{diag}(\sigma_{11}, \sigma_{22}, \ldots, \sigma_{pp})$. Then the problem is to find $n$ such that

\[ P\left[ \left| \frac{S_{ii}}{\sigma_{ii}} - 1 \right| < \varepsilon, \; i = 1, 2, \ldots, p \right] = 1 - \alpha \tag{3.1} \]

for given $\varepsilon > 0$ and $\alpha > 0$, where

\[ S_{ii} = \frac{1}{n-1} \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2 \]

and $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $X$. Define

\[ \mathrm{vec}\, S = (S_{11}, S_{22}, \ldots, S_{pp})', \qquad \mathrm{vec}\, \Sigma = (\sigma_{11}, \sigma_{22}, \ldots, \sigma_{pp})' \]

and

\[ Y = \sqrt{n-1}\, (\mathrm{vec}\, S - \mathrm{vec}\, \Sigma). \]

Then [6, p. 43], $Y$ is asymptotically $N_p(0, V)$, where all elements of $V$ are given by

\[ \mathrm{cov}(Y_{ij}, Y_{kl}) = \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}. \tag{3.2} \]

In this case $V = \mathrm{diag}(2\sigma_{11}^2, 2\sigma_{22}^2, \ldots, 2\sigma_{pp}^2)$. Now (3.1) can be written as

\[ P\left[ |Y_i| < \varepsilon \sqrt{n-1}\, \sigma_{ii}, \; i = 1, 2, \ldots, p \right] \approx 1 - \alpha. \tag{3.3} \]

Since the $Y_i$ are independently normally distributed with mean zero and variance $2\sigma_{ii}^2$, $i = 1, 2, \ldots, p$, we can write (3.3) as

\[ \prod_{i=1}^{p} P\left[ |Z| < \varepsilon \sqrt{\frac{n-1}{2}} \right] \approx 1 - \alpha \]

or

\[ P\left[ |Z| < \varepsilon \sqrt{\frac{n-1}{2}} \right] \approx (1 - \alpha)^{1/p}, \tag{3.4} \]

where $Z \sim N(0, 1)$. Let $\beta = \frac{1}{2}\left[ 1 - (1 - \alpha)^{1/p} \right]$ and let $z_\beta$ satisfy $P[Z > z_\beta] = \beta$. Then

\[ n \approx 1 + 2 \left( \frac{z_\beta}{\varepsilon} \right)^2. \tag{3.5} \]
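Formula (3.5) can be evaluated the same way as (2.3). The sketch below is an illustration only: it assumes Python's standard-library `statistics.NormalDist`, and the function name `diagonal_sample_size` and the rounding up to an integer are choices made here, not details from the paper.

```python
import math
from statistics import NormalDist


def diagonal_sample_size(eps: float, alpha: float, p: int) -> int:
    """Approximate n from (3.5) for p independent components."""
    if not (eps > 0 and 0 < alpha < 1 and p >= 1):
        raise ValueError("need eps > 0, 0 < alpha < 1 and p >= 1")
    # beta = [1 - (1 - alpha)**(1/p)] / 2 is the tail area per component.
    beta = 0.5 * (1 - (1 - alpha) ** (1 / p))
    # z_beta satisfies P[Z > z_beta] = beta; use the lower tail for accuracy.
    z_beta = -NormalDist().inv_cdf(beta)
    # Rounding up to an integer is an assumption of this sketch.
    return math.ceil(1 + 2 * (z_beta / eps) ** 2)


if __name__ == "__main__":
    for p in (2, 5, 10, 20, 40, 60):
        print(p, diagonal_sample_size(0.10, 0.05, p))
```

For $p = 60$, $\varepsilon = 0.10$ and $\alpha = 0.05$ the result comes out near the Table 2 entry of 2225, well above the sample of 750 mentioned in the Introduction.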


Table 2
Sample size for the multivariate independent case

                                            $\varepsilon$
$\alpha$   p    0.05    0.06    0.07    0.08    0.09    0.10    0.11    0.12    0.13    0.14
0.01       2    6301    4376    3216    2462    1946    1576    1303    1095    933     805
           5    7635    5303    3896    2983    2358    1910    1579    1327    1131    975
           10   8657    6012    4417    3382    2673    2165    1790    1504    1282    1105
           20   9687    6728    4943    3785    2991    2423    2003    1683    1434    1237
           40   10724   7448    5472    4190    3311    2682    2217    1863    1588    1369
           60   11059   7680    5643    4321    3414    2766    2286    1921    1637    1412
0.05       2    4003    2780    2043    1565    1237    1002    828     696     593     512
           5    5280    3667    2695    2064    1631    1321    1092    918     782     675
           10   6272    4356    3201    2451    1937    1569    1297    1090    929     801
           20   7278    5055    3714    2844    2247    1821    1505    1265    1078    930
           40   8297    5762    4234    3242    2562    2075    1715    1442    1229    1060
           60   8897    6179    4540    3476    2747    2225    1839    1546    1317    1136
0.10       2    3040    2111    1552    1188    939     761     629     529     451     389
           5    4273    2968    2181    1670    1320    1069    884     743     633     546
           10   5243    3641    2675    2049    1619    1312    1084    911     777     670
           20   6233    4329    3181    2436    1925    1559    1284    1083    923     796
           40   7239    5028    3694    2829    2235    1811    1497    1258    1072    925
           60   7834    5441    3998    3061    2419    1960    1620    1361    1160    1001

It should be noticed that the sample size formula given by (3.5) is independent of $\Sigma$. When $p = 1$, (3.5) reduces to (2.3) with a different value of $\varepsilon$. One can regard this as an application of the Bonferroni method to several independent variables. Table 2 gives the values of $n$ for some selected values of $\alpha$, $\varepsilon$ and $p$. One may notice that there is a sharp increase in $n$ as $p$ increases.
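The reduction for $p = 1$ can be made explicit with one worked step. The following sketch uses only (2.3) and (3.5); the first-order expansion in the last line is an added remark offered as a heuristic, not part of the original derivation.

```latex
% p = 1 in (3.5) gives beta = alpha / 2, so
\begin{align*}
n &\approx 1 + 2\left(\frac{z_{\alpha/2}}{\varepsilon}\right)^2
   = 1 + \frac{1}{2}\left(\frac{z_{\alpha/2}}{\varepsilon/2}\right)^2 ,
\end{align*}
% which is (2.3) with \varepsilon replaced by \varepsilon / 2.
% Heuristic reason (added remark): for S / \sigma close to 1,
\begin{align*}
\frac{S^2}{\sigma^2} - 1
  &= \left(\frac{S}{\sigma} - 1\right)\left(\frac{S}{\sigma} + 1\right)
  \approx 2\left(\frac{S}{\sigma} - 1\right).
\end{align*}
```

In words, a relative error of $\varepsilon$ in the variance corresponds, to first order, to a relative error of $\varepsilon/2$ in the standard deviation.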

Case 2. Suppose none of the $\sigma_{ij}$ are zero, i.e. all variables are correlated. Then we define $\mathrm{vec}\, S$ and $\mathrm{vec}\, \Sigma$ as follows:

\[ \mathrm{vec}\, S = (S_{11}, S_{12}, \ldots, S_{1p}, S_{22}, \ldots, S_{2p}, \ldots, S_{ii}, S_{i,i+1}, \ldots, S_{ip}, \ldots, S_{pp})', \]

\[ \mathrm{vec}\, \Sigma = (\sigma_{11}, \sigma_{12}, \ldots, \sigma_{1p}, \sigma_{22}, \ldots, \sigma_{2p}, \ldots, \sigma_{ii}, \sigma_{i,i+1}, \ldots, \sigma_{ip}, \ldots, \sigma_{pp})'. \]

As before, let

\[ Y = \sqrt{n-1}\, (\mathrm{vec}\, S - \mathrm{vec}\, \Sigma) = (Y_1, Y_2, \ldots, Y_{p(p+1)/2})'. \]

By [6, p. 43], $Y$ is asymptotically $N_{p(p+1)/2}(0, V)$, where the elements of $V$ are given by (3.2). $V$ thus formed is a positive definite symmetric matrix.

We want to find $n$ such that

\[ P\left[ \left| \frac{S_{ij}}{\sigma_{ij}} - 1 \right| < \varepsilon, \; i = 1, 2, \ldots, p, \; j = 1, 2, \ldots, p \right] = 1 - \alpha, \]

or

\[ P[-e < Y < e] \approx 1 - \alpha, \tag{3.6} \]


where

\[ e = \varepsilon \sqrt{n-1}\, (|\sigma_{11}|, |\sigma_{12}|, \ldots, |\sigma_{pp}|)'. \]

Rewriting in integral form, we have

\[ \int_{-e}^{e} \frac{|V|^{-1/2}}{(2\pi)^{p(p+1)/4}}\, e^{-y'V^{-1}y/2}\, dy \approx 1 - \alpha. \tag{3.7} \]

If a prior estimate of $\Sigma$ is available, the evaluation of the integral in (3.7) can be achieved by an algorithm recently given by Russell, Farrier and Howell [7].
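The Fourier series algorithm of Russell et al. [7] is not reproduced here. As a stand-in, the sketch below solves (3.6) by plain Monte Carlo simulation, which is a different technique: it draws $Y \sim N(0, V)$ with $V$ built from (3.2), takes the empirical $(1 - \alpha)$ quantile $q$ of $\max_{ij} |Y_{ij}| / |\sigma_{ij}|$, and returns $n = 1 + (q/\varepsilon)^2$. The function names, the 0-based `pairs` argument listing the retained $(i, j)$ entries, the number of draws and the seed are all assumptions of this illustration.

```python
import math
import random


def asymptotic_cov(sigma, pairs):
    """V from (3.2): cov(Y_ij, Y_kl) = s_ik * s_jl + s_il * s_jk."""
    return [[sigma[i][k] * sigma[j][l] + sigma[i][l] * sigma[j][k]
             for (k, l) in pairs] for (i, j) in pairs]


def cholesky(a):
    """Lower-triangular L with L L' = a, for a small positive definite a."""
    m = len(a)
    low = [[0.0] * m for _ in range(m)]
    for r in range(m):
        for c in range(r + 1):
            s = a[r][c] - sum(low[r][t] * low[c][t] for t in range(c))
            low[r][c] = math.sqrt(s) if r == c else s / low[c][c]
    return low


def mc_sample_size(sigma, pairs, eps, alpha, draws=20000, seed=0):
    """Monte Carlo estimate of the n solving (3.6) for a prior sigma."""
    scale = [abs(sigma[i][j]) for (i, j) in pairs]
    if min(scale) == 0:
        raise ValueError("drop pairs with sigma_ij = 0 before calling")
    rng = random.Random(seed)
    low = cholesky(asymptotic_cov(sigma, pairs))
    m = len(pairs)
    maxima = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # y = L z has covariance matrix V.
        y = [sum(low[r][t] * z[t] for t in range(r + 1)) for r in range(m)]
        maxima.append(max(abs(y[r]) / scale[r] for r in range(m)))
    maxima.sort()
    # Empirical (1 - alpha) quantile of the largest relative deviation.
    idx = min(draws - 1, max(0, math.ceil((1 - alpha) * draws) - 1))
    return math.ceil(1 + (maxima[idx] / eps) ** 2)


if __name__ == "__main__":
    prior = [[4, 5], [5, 9]]
    print(mc_sample_size(prior, [(0, 0), (0, 1), (1, 1)], eps=0.10, alpha=0.05))
```

The output carries simulation error of roughly a percent at the default number of draws, so it can only be compared loosely with the values of $n$ reported in the Examples. Leaving a pair out of `pairs` corresponds to the treatment of zero covariances described in the Remark.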

Remark. In case some of the $\sigma_{ij}$'s are zero, we will remove those $\sigma_{ij}$'s from $\mathrm{vec}\, \Sigma$ and the corresponding $S_{ij}$'s from $\mathrm{vec}\, S$ and carry out the calculation as before.

Since (3.7) depends on $\Sigma$, a table for the $n$ values cannot be prepared. The situation here is quite similar to the sample size determination in estimating the proportion of a binomial population. The quality of the prior estimate and the dimension of $\Sigma$ have a profound effect on the sample size. The effect of the dimension of $\Sigma$ can be seen from the fact that the dimension of $V$ increases sharply, resulting in a sharp increase in the sample size. The effect of the quality of the estimate of $\Sigma$ can be seen from the following examples.

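Examples. Let us suppose a prior estimate of the variance-covariance matrix $\Sigma$ of a bivariate normal distribution is given as

\[ \begin{pmatrix} 4 & 5 \\ 5 & 9 \end{pmatrix}. \]

Then

\[ V = \begin{pmatrix} 32 & 40 & 50 \\ 40 & 61 & 90 \\ 50 & 90 & 162 \end{pmatrix}, \qquad e = \varepsilon \sqrt{n-1}\, (4, 5, 9)'. \]

Equation (3.7) can be written as

\[ \int_{-9\varepsilon\sqrt{n-1}}^{9\varepsilon\sqrt{n-1}} \int_{-5\varepsilon\sqrt{n-1}}^{5\varepsilon\sqrt{n-1}} \int_{-4\varepsilon\sqrt{n-1}}^{4\varepsilon\sqrt{n-1}} \frac{|V|^{-1/2}}{(2\pi)^{3/2}}\, e^{-y'V^{-1}y/2}\, dy_1\, dy_2\, dy_3 = 1 - \alpha \tag{3.8} \]

or

\[ \int_{-h_3}^{h_3} \int_{-h_2}^{h_2} \int_{-h_1}^{h_1} \frac{|R|^{-1/2}}{(2\pi)^{3/2}}\, e^{-y'R^{-1}y/2}\, dy_1\, dy_2\, dy_3 = 1 - \alpha, \tag{3.9} \]

where

\[ R = \begin{pmatrix} 1 & 0.905357 & 0.694444 \\ 0.905357 & 1 & 0.905357 \\ 0.694444 & 0.905357 & 1 \end{pmatrix}, \]

\[ h_1 = \varepsilon \sqrt{\frac{n-1}{2}}, \qquad h_2 = 5\varepsilon \sqrt{\frac{n-1}{61}}, \qquad h_3 = \varepsilon \sqrt{\frac{n-1}{2}}. \]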
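The passage from (3.8) to (3.9) is a standardization of each coordinate by $\sqrt{V_{ii}}$. A minimal sketch of that step follows; the function name `standardize` and its argument layout are hypothetical, chosen only for this illustration, and the value of $n$ passed in is used just to produce concrete limits.

```python
import math


def standardize(v, sigma_vec, eps, n):
    """Return (R, h): the correlation matrix of V and the limits in (3.9)."""
    m = len(v)
    sd = [math.sqrt(v[i][i]) for i in range(m)]
    # R_ij = V_ij / sqrt(V_ii * V_jj)
    r = [[v[i][j] / (sd[i] * sd[j]) for j in range(m)] for i in range(m)]
    # h_i = eps * sqrt(n - 1) * |sigma_i| / sqrt(V_ii)
    h = [eps * math.sqrt(n - 1) * abs(s) / d for s, d in zip(sigma_vec, sd)]
    return r, h


if __name__ == "__main__":
    V = [[32, 40, 50], [40, 61, 90], [50, 90, 162]]
    R, h = standardize(V, [4, 5, 9], eps=0.10, n=1053)
    for row in R:
        print([round(x, 6) for x in row])
    print([round(x, 4) for x in h])
```

Because $V_{11} = 2\sigma_{11}^2$ and $V_{33} = 2\sigma_{22}^2$, the first and third limits always equal $\varepsilon\sqrt{(n-1)/2}$, whatever the variances are.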


Now, by using the algorithm given in Russell et al. [7], we find that

$n \approx 4209$ for $(\varepsilon, \alpha) = (0.05, 0.05)$,
$n \approx 3107$ for $(\varepsilon, \alpha) = (0.05, 0.10)$,
$n \approx 1053$ for $(\varepsilon, \alpha) = (0.10, 0.05)$,
$n \approx 780$ for $(\varepsilon, \alpha) = (0.10, 0.10)$.

The correlation between the variables plays a very important role, as can be seen from (3.9). If one were to take any other prior estimate of $\Sigma$ with the same correlation as in the above prior, then the resulting $R$ matrix and the integration limits in (3.9) will be the same. In essence, the sample size remains unchanged for all prior estimates of $\Sigma$ with the same correlation.
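This invariance can be checked directly from (3.2). In the short sketch below, the rescaled prior $\tilde\Sigma = D \Sigma D$ with $D = \mathrm{diag}(d_1, \ldots, d_p)$, $d_i > 0$, is notation introduced here for illustration; it has the same correlations as $\Sigma$ but different variances.

```latex
\begin{align*}
\tilde\sigma_{ij} &= d_i d_j \,\sigma_{ij}, \\
\tilde V_{ij,kl} &= \tilde\sigma_{ik}\tilde\sigma_{jl} + \tilde\sigma_{il}\tilde\sigma_{jk}
                  = d_i d_j d_k d_l \,(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})
                  = d_i d_j d_k d_l \, V_{ij,kl}, \\
\frac{\tilde V_{ij,kl}}{\tilde\sigma_{ij}\,\tilde\sigma_{kl}}
                 &= \frac{V_{ij,kl}}{\sigma_{ij}\,\sigma_{kl}} .
\end{align*}
```

The last line says that the covariance matrix of the relative deviations $Y_{ij}/\sigma_{ij}$ is the same under both priors, so the probability in (3.6), and hence $n$, does not change.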

In the above example the correlation between the two variables is 5/6. Suppose the investigator decided that this correlation is too high when in fact it should be very low, and takes a prior estimate of $\Sigma$ to be

\[ \begin{pmatrix} 4 & 1 \\ 1 & 9 \end{pmatrix}. \]

Then

\[ V = \begin{pmatrix} 32 & 8 & 2 \\ 8 & 37 & 18 \\ 2 & 18 & 162 \end{pmatrix}, \]

resulting in

\[ R = \begin{pmatrix} 1 & 0.232495 & 0.027777 \\ 0.232495 & 1 & 0.232495 \\ 0.027777 & 0.232495 & 1 \end{pmatrix} \]

and

$n \approx 14223$ for $(\varepsilon, \alpha) = (0.10, 0.05)$,
$n \approx 10020$ for $(\varepsilon, \alpha) = (0.10, 0.10)$,

a rather sharp increase in the sample size. One can observe that this increase is due to the decrease in the correlation between the variables. In order to confirm this observation further, let us take another prior estimate of $\Sigma$,

\[ \begin{pmatrix} 1 & \tfrac{1}{2} \\ \tfrac{1}{2} & 1 \end{pmatrix}, \]

which results in the correlation matrix

\[ R = \begin{pmatrix} 1 & 0.63245 & 0.25 \\ 0.63245 & 1 & 0.63245 \\ 0.25 & 0.63245 & 1 \end{pmatrix} \]

and

$n \approx 1946$ for $(\varepsilon, \alpha) = (0.10, 0.05)$,
$n \approx 1400$ for $(\varepsilon, \alpha) = (0.10, 0.10)$.

From the examples given above it is clear that as the correlation between the variables increases, the sample size decreases.


Now let us take a case where two variables have small correlation and the third is independent of the first two. Suppose a prior estimate of $\Sigma$ is

\[ \begin{pmatrix} 4 & 1 & 0 \\ 1 & 9 & 0 \\ 0 & 0 & 25 \end{pmatrix}. \]

In this case we obtain the following results:

$n \approx 14300$ for $(\varepsilon, \alpha) = (0.10, 0.05)$,
$n \approx 10055$ for $(\varepsilon, \alpha) = (0.10, 0.10)$.

One can note that there is a very small increase in the sample size by adding an independent variable to the list of variables. If all three variables were correlated, the $V$ matrix would have been $6 \times 6$, which would result in a very large sample size. Therefore, one can safely assume that the sample size obtained under a wrong assumption of independence of variables will be too low.

Acknowledgement

The authors are thankful to the referee for some useful suggestions which improved the manuscript considerably, and also to Mrs. Judith Leonard for assistance in numerical work.

References

[1] F.A. Graybill, Determining sample size for a specified width confidence interval, Ann. Math. Statist. 29 (1958) 282-287.
[2] F.A. Graybill and T.L. Connell, Sample size required for estimating the variance within d units of the true value, Ann. Math. Statist. 35 (1964) 438-440.
[3] F.A. Graybill and R.D. Morrison, Sample size for a specified width confidence interval on the variance of a normal distribution, Biometrics 16 (1960) 636-641.
[4] J.A. Greenwood and M.M. Sandomire, Sample size required for estimating the standard deviation as a percent of its true value, J. Amer. Statist. Assoc. 45 (1950) 257-260.
[5] F.C. Leone, Y.M. Rutenberg and C.W. Topp, The use of sample quasi-ranges in setting confidence intervals for the population standard deviation, J. Amer. Statist. Assoc. 56 (1961) 260-272.
[6] R.J. Muirhead, Aspects of Multivariate Statistical Theory (John Wiley & Sons, New York, 1982).
[7] N.S. Russell, D.R. Farrier and J. Howell, Evaluation of multinormal probabilities using Fourier series expansions, Appl. Statist. 34 (1) (1985) 49-53.
[8] G.W. Snedecor and W.G. Cochran, Statistical Methods (Iowa State University Press, Ames, IA, 1967).
[9] R.F. Tate and G.W. Klett, Optimal confidence intervals for the variance of a normal distribution, J. Amer. Statist. Assoc. 54 (1959) 674-682.
[10] W.A. Thompson and J. Endriss, The required sample size when estimating variances, Amer. Statist. 15(3) (1961) 22-23.