
6 NORMALITY OF OBSERVATION VECTORS

A significant part of the theory of linear and quadratic estimators has been developed on the assumption of a normal distribution of the observation vector. There are two reasons for this assumption.

The first is a consequence of the general experience that in many experiments the conditions for applying central limit theorems are at least partly satisfied, and accordingly the probability distribution of measurement errors is normal. Therefore in many experiments the assumption of normality of the observation vector is close to reality.

The second reason lies in the sphere of mathematical theory. The linear and quadratic estimators for normal observation vectors appear to be natural, since a linear transformation of a random vector preserves its normality and suitably chosen quadratic estimators lead to the known chi-square probability distribution. Moreover, in the case of a normally distributed observation vector, the linear estimators possess remarkable statistical properties; for example, they are efficient among all the unbiased estimators. The assumption of normality allows a more thorough investigation of those statistical properties of estimators which have so far been characterized only by means of the second order statistical moments within the framework of linear estimators, and by means of the third and fourth moments in the case of quadratic estimators.

Definition 6.1. A random vector η_{n,1} is said to possess a normal probability distribution if:

1. there exists a random vector ξ_{r,1}, r ≤ n, such that its components are stochastically independent and their probability density with respect to the Lebesgue measure reads

f(x) = (2π)^(-1/2) exp(-x²/2), x ∈ (-∞, ∞);

2. there exists an n × r matrix J, R(J) = r, and an n-dimensional vector μ such that η = Jξ + μ.

Definition 6.1 immediately implies that E(η) = μ and Σ_η = JJ'. The notation η ~ N_n(μ, Σ_η) expresses the normality of the vector η.

Theorem 6.1. If the matrix Σ_η = JJ' from Definition 6.1 is regular (i.e. if r = n), then the joint probability density of a normal random vector η is

f_η(y) = (2π)^(-n/2) [det(Σ_η)]^(-1/2) exp[-(y - μ)'Σ_η⁻¹(y - μ)/2], y ∈ ℛ^n.

Proof. The joint probability density of the vector ξ from Definition 6.1, which is in our case n-dimensional, is, in accordance with the assumption of the stochastic independence of its components,

f_ξ(x) = (2π)^(-n/2) exp(-x'x/2), x ∈ ℛ^n.

Applying the substitution x = J⁻¹(y - μ), whose Jacobian is

|det(J⁻¹)| = 1/det(J) = 1/√det(Σ_η),

we obtain the joint probability density of the vector η = Jξ + μ in the form

f_η(y) = (2π)^(-n/2) [det(Σ_η)]^(-1/2) exp[-(y - μ)'Σ_η⁻¹(y - μ)/2], y ∈ ℛ^n.

Theorem 6.2. Let η ~ N(μ, Σ_η); then Tη + t ~ N(Tμ + t, TΣ_ηT').

Proof. It suffices to make use of the known theorem on the one-to-one relation between the distribution function F(·) of the random vector η and its characteristic function

φ_η(u) = E[exp(iu'η)] = exp(iu'μ - u'Σ_ηu/2).

In our case

φ_{Tη+t}(u) = E{exp[iu'(Tη + t)]} =
= exp(iu't) E[exp(iu'Tη)] = exp(iu't) φ_η(T'u) =
= exp(iu't) exp[iu'Tμ - u'TΣ_ηT'u/2] =
= exp[iu'(Tμ + t) - u'(TΣ_ηT')u/2] ⇒ Tη + t ~ N(Tμ + t, TΣ_ηT').
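As a quick numerical illustration of Theorem 6.2 (a minimal simulation sketch with arbitrarily chosen μ, Σ_η, T and t, not taken from the text), the sample mean and sample covariance of Tη + t should approach Tμ + t and TΣ_ηT':

    import numpy as np

    # Simulation check of Theorem 6.2: a linear transformation of a normal
    # vector is again normal with mean T*mu + t and covariance T*Sigma*T'.
    rng = np.random.default_rng(0)
    mu = np.array([1.0, -2.0, 0.5])
    Sigma = np.array([[2.0, 0.3, 0.0],
                      [0.3, 1.0, 0.2],
                      [0.0, 0.2, 0.5]])
    T = np.array([[1.0, 0.0, 1.0],
                  [0.0, 2.0, -1.0]])
    t = np.array([0.5, -1.0])

    eta = rng.multivariate_normal(mu, Sigma, size=200_000)
    xi = eta @ T.T + t                               # realizations of T*eta + t

    print(np.allclose(xi.mean(axis=0), T @ mu + t, atol=0.05))
    print(np.allclose(np.cov(xi, rowvar=False), T @ Sigma @ T.T, atol=0.1))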

Remark 6.1. Theorem 6.1 implies that the estimators from Sections 5.1, 5.2 and 5.3 possess, provided the observation vector is normally distributed, a normal probability distribution, and this is fully characterized by their mean values and covariance matrices (or dispersions).

Theorem 6.3. If η ~ N(μ, Σ), ξ₁ = T₁η + t₁ and ξ₂ = T₂η + t₂, then the random vectors ξ₁ and ξ₂ are stochastically independent iff T₁ΣT₂' = 0.

Proof. With respect to the assumptions and according to Theorem 6.2, the normal probability distribution of the random vector (ξ₁', ξ₂')' is characterized by its mean value and covariance matrix of the form

((T₁μ + t₁)', (T₂μ + t₂)')'  and  [ T₁ΣT₁'  T₁ΣT₂' ]
                                  [ T₂ΣT₁'  T₂ΣT₂' ].

The assertion now follows immediately from the fact that two components of a normal random vector are stochastically independent iff their covariance is zero.

The properties of the estimators Θ̂(ξ) from Theorem 5.1.4, in the case of normally distributed observation vectors, are shown in the following theorem.

Theorem 6.4. If ξ ~ N_n(AΘ, σ²H), Θ ∈ ℛ^k, then the estimator Θ̂(ξ) in Model two from Theorem 5.1.4 is the UBUE (Definition 3.2) in the class of all the unbiased estimators (not only in the class of all the linear ones).

Proof. In accordance with Theorem 3.2.2.2, we use the notation g(Θ) = Θ, Θ ∈ ℛ^k; τ_g(ξ) = (A'H⁻¹A)⁻¹A'H⁻¹ξ = Θ̂. The information matrix of the class of the normal probability distributions considered is

F(Θ) = E[-∂² ln f(ξ, Θ)/∂Θ∂Θ' | Θ] = σ⁻²A'H⁻¹A

(in our case independent of Θ ∈ ℛ^k); according to Theorem 5.1.4, the covariance matrix of the estimator τ_g(ξ) has the form Σ_{τ_g} = σ²(A'H⁻¹A)⁻¹, i.e. Σ_{τ_g} = [F(Θ)]⁻¹. The assertion now follows immediately from Definition 3.2 and Theorem 3.2.2.2.

Remark 6.2. Taking into account Theorem 5.1.3, analogous assertions are true for the other fundamental models.

For more detail on normal distributions the reader is referred to Miller [86].

6.1 Fundamental regular models

Theorem 6.1.1. Let the observation vector ξ in the fundamental models from Definition 5.1.1 be normally distributed (ξ ~ N(μ, Σ)); then the estimator Θ̂ and the corresponding residual vector v from Definition 5.4.1 are stochastically independent.

Proof. In each of the models from Definition 5.1.1, Θ̂ = T₁ξ + t₁ and v = T₂ξ + t₂ (e.g. in Model two, T₁ = (A'H⁻¹A)⁻¹A'H⁻¹, t₁ = 0, T₂ = I − A(A'H⁻¹A)⁻¹A'H⁻¹, t₂ = 0). Hence, according to Theorem 6.2, Θ̂ ~ N(T₁μ + t₁, T₁ΣT₁') and v ~ N(0, T₂ΣT₂'). It can easily be verified that T₁HT₂' = 0 is true for each of the models considered and this, according to Theorem 6.3, completes the proof.
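The orthogonality condition T₁HT₂' = 0 can be checked numerically; the sketch below (an arbitrary A and H, not from the text) verifies it for Model two:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((8, 3))              # arbitrary design matrix
    G = rng.standard_normal((8, 8))
    H = G @ G.T                                  # arbitrary positive definite H

    Hi = np.linalg.inv(H)
    T1 = np.linalg.inv(A.T @ Hi @ A) @ A.T @ Hi  # estimator matrix of Model two
    T2 = np.eye(8) - A @ T1                      # residual matrix

    # A zero product implies independence of the estimator and the residual vector.
    print(np.allclose(T1 @ H @ T2.T, 0.0))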

Theorem 6.1.2. Let the observation vector ξ in the fundamental models from Definition 5.1.1 be normally distributed; then the random variable v'H⁻¹v/σ², where v is the residual vector from Definition 5.4.1, possesses the chi-square probability distribution whose degrees of freedom for the individual models are stated in Theorem 5.4.2.

Proof. Consider Model two first. Here

v = [A(A'H⁻¹A)⁻¹A'H⁻¹ − I](ξ − AΘ)

and, in accordance with the assumption, ξ ~ N(AΘ, σ²H). If δ = G⁻¹(ξ − AΘ), where H = GG', then δ ~ N(0, σ²I) and

v'H⁻¹v = δ'[I − G⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹]δ.

The matrix G⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹ is obviously symmetric and idempotent (it is a projector in the Euclidean metric), therefore the matrix

I − G⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹

is also symmetric and idempotent. As a consequence, there exists an orthogonal matrix Q such that

D = QG⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹Q'

is a diagonal matrix. The matrix D is obviously also idempotent,

D² = D ⇒ {D}_{i,i} ∈ {0, 1}, i = 1, ..., n,

and the number of units is equal to the rank of the matrices D and G⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹. Because of its idempotentness the rank of this matrix is given by its trace:

Tr[G⁻¹A(A'G'⁻¹G⁻¹A)⁻¹A'G'⁻¹] = Tr[(A'H⁻¹A)⁻¹A'H⁻¹A] = Tr(I_{k,k}) = k.

Using the transformation Δ = Qδ (Δ ~ N(0, σ²I)), we can write

v'H⁻¹v = Δ'(I − D)Δ = Δ²_{i_1} + ... + Δ²_{i_{n−k}},

where i_1, ..., i_{n−k} are the indices of the zero diagonal elements of the matrix D. In consideration of the definition of the random variable χ², it is obvious that v'H⁻¹v/σ² = χ² with n − k degrees of freedom.

The assertion holds for the other models from Definition 5.1.1 as a consequence of Theorem 5.1.3. However, it is also possible to prove it directly. For Model three, for example,

v = −HB'(BHB')⁻¹B(ξ − θ) ⇒ v'H⁻¹v = (ξ − θ)'B'(BHB')⁻¹B(ξ − θ).

Having applied the transformation δ = G⁻¹(ξ − θ), where GG' = H (i.e. δ ~ N(0, σ²I)), we can write

v'H⁻¹v = δ'G'B'(BGG'B')⁻¹BGδ.

Here G'B'(BGG'B')⁻¹BG is a symmetric and idempotent matrix whose rank is

Tr{G'B'[BGG'B']⁻¹BG} = Tr(I_{q,q}) = q.

The completion of the proof can now be carried out analogously to the preceding case.
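A simulation sketch of this result for Model two (all matrices and σ chosen arbitrarily for illustration): the empirical mean and variance of v'H⁻¹v/σ² should be close to n − k and 2(n − k).

    import numpy as np

    rng = np.random.default_rng(2)
    n, k, sigma = 12, 3, 0.7
    A = rng.standard_normal((n, k))
    G = rng.standard_normal((n, n))
    H = G @ G.T
    theta = rng.standard_normal(k)

    Hi = np.linalg.inv(H)
    P = A @ np.linalg.inv(A.T @ Hi @ A) @ A.T @ Hi   # projector onto M(A) in the H^{-1} metric
    L = np.linalg.cholesky(H)

    vals = []
    for _ in range(20_000):
        xi = A @ theta + sigma * L @ rng.standard_normal(n)
        v = (P - np.eye(n)) @ (xi - A @ theta)       # residual vector of Model two
        vals.append(v @ Hi @ v / sigma**2)

    print(np.mean(vals), n - k)                      # approximately equal
    print(np.var(vals), 2 * (n - k))                 # approximately equal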


Theorem 6.1.3. If the covariance matrix of the random vector ξ ~ N_r(μ, Σ) is regular, then the random variable (ξ − μ)'Σ⁻¹(ξ − μ) has the chi-square distribution with r degrees of freedom.

Proof. Consider, analogously to the proof of the preceding theorem, a matrix G such that GG' = Σ. Then

η = G⁻¹(ξ − μ) ~ N(0, I) ⇒ χ²_r = η'η = (ξ − μ)'G'⁻¹G⁻¹(ξ − μ) = (ξ − μ)'Σ⁻¹(ξ − μ).

Theorem 6.1.4. Let the observation vector ξ within the fundamental linear models be normal, and the factor σ² known. Then the vector ϑ = TΘ is, with probability 1 − α, an element of the random ellipsoid

ℰ = {u: (u − TΘ̂)'[TH_Θ̂T']⁻¹(u − TΘ̂) ≤ σ²χ²_r(1 − α)}.

Here T is an arbitrary matrix such that TH_Θ̂T' forms a regular matrix, χ²_r(1 − α) is the (1 − α)th quantile of the chi-square distribution with r = R(TH_Θ̂T') degrees of freedom, and the matrix H_Θ̂ for the individual models is given by Theorem 5.1.4.

Proof. From Theorem 6.2, TΘ̂ ~ N_r(TΘ, σ²TH_Θ̂T'), and according to Theorem 6.1.3

(TΘ̂ − TΘ)'(TH_Θ̂T')⁻¹(TΘ̂ − TΘ) = σ²χ²_r, thus

P{(ϑ − TΘ̂)'(TH_Θ̂T')⁻¹(ϑ − TΘ̂) ≤ σ²χ²_r(1 − α)} = P{χ²_r ≤ χ²_r(1 − α)} = 1 − α.

Remark 6.1.1. If the first dimension of the matrix T is 1, the matrix T is written in the form of a row vector p', and if p'H_Θ̂p ≠ 0, then r = 1 and

p'Θ̂ ~ N₁(p'Θ, σ²p'H_Θ̂p) ⇒ (p'Θ̂ − p'Θ)/(σ√(p'H_Θ̂p)) ~ N₁(0, 1).

If t(α/2) is the (α/2)th quantile and t(1 − α/2) the (1 − α/2)th quantile of the normal probability distribution, then obviously

P{t(α/2) ≤ (p'Θ̂ − p'Θ)/(σ√(p'H_Θ̂p)) ≤ t(1 − α/2)} =
= P{p'Θ̂ + t(α/2)σ√(p'H_Θ̂p) ≤ p'Θ ≤ p'Θ̂ + t(1 − α/2)σ√(p'H_Θ̂p)} = 1 − α/2 − α/2 = 1 − α.

Thus

[p'Θ̂ − t(1 − α/2)σ√(p'H_Θ̂p), p'Θ̂ + t(1 − α/2)σ√(p'H_Θ̂p)]

is a random interval in which the value p'Θ occurs with probability 1 − α (because t(α/2) = −t(1 − α/2)). This interval is the confidence interval for p'Θ, with the confidence level 1 − α. Analogously, the ellipsoid

{u: (u − TΘ̂)'(TH_Θ̂T')⁻¹(u − TΘ̂) ≤ σ²χ²_r(1 − α)}

is said to be the confidence ellipsoid for ϑ = TΘ. If r = 1, the quantile of the normal probability distribution is used (t(1 − α/2) = √(χ²₁(1 − α))).

Theorem 6.1.5. Within the fundamental linear models, if the observation vector ξ is normal and the factor σ² is unknown a priori, then the vector ϑ = TΘ is, with probability 1 − α, covered by the random ellipsoid

ℰ = {u: (u − TΘ̂)'(TH_Θ̂T')⁻¹(u − TΘ̂) ≤ (v'H⁻¹v/f) r F_{r,f}(1 − α)}.

Here F_{r,f}(1 − α) is the (1 − α)th quantile of the Fisher–Snedecor probability distribution with r and f degrees of freedom, whose number for individual models is stated by Theorem 5.4.2, v is the residual vector from Theorem 5.4.1, and the other symbols have the same meaning as in Theorem 6.1.4.

Proof. In accordance with Theorem 6.1.1, the random variables

(TΘ̂ − TΘ)'(TH_Θ̂T')⁻¹(TΘ̂ − TΘ)/σ² = χ²_r and v'H⁻¹v/σ² = χ²_f

are stochastically independent. Taking into account the definition of the Fisher–Snedecor probability distribution, the assertion becomes obvious.

Remark 6.1.2. Theorem 6.1.5 can be annotated analogously to Theorem 6.1.4 (instead of the quantile of a normal probability distribution, the quantile of the Student probability distribution with f degrees of freedom must be used for r = 1). Applications of both theorems are in practice very frequent and important. They also form a basis of statistical tests by means of which a hypothesis ϑ = ϑ₀ is verified.
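A sketch parallel to the previous one, for the case of unknown σ²: σ is replaced by σ̂ = √(v'H⁻¹v/f) and the normal quantile by the Student quantile; f = n − k is assumed here for Model two (Theorem 5.4.2).

    import numpy as np
    from scipy.stats import t as student_t

    A = np.column_stack([np.ones(6), np.arange(6.0)])
    xi = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])
    alpha = 0.05
    p = np.array([1.0, 2.5])

    n, k = A.shape
    H_theta = np.linalg.inv(A.T @ A)                 # H = I
    theta_hat = H_theta @ A.T @ xi
    v = A @ theta_hat - xi                           # residual vector
    sigma_hat = np.sqrt(v @ v / (n - k))             # estimate of sigma, f = n - k

    half = student_t.ppf(1 - alpha / 2, df=n - k) * sigma_hat * np.sqrt(p @ H_theta @ p)
    print(p @ theta_hat - half, p @ theta_hat + half)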

A similar problem is to determine the probability that the values of a whole class of linear functionals f_t(Θ) = p_t'Θ, Θ ∈ ℛ^k, t = 1, 2, ..., m, of a parameter Θ are simultaneously covered by confidence intervals (the Scheffé confidence region), i.e. to determine the probability

P{∀{t ∈ T} p_t'Θ ∈ I_t(Θ̂)},

where T is the index class (i.e. in our case T = {1, ..., m}) of the functionals and I_t(Θ̂) is the confidence interval of the t-th functional determined by means of the estimator Θ̂.

The importance of the solution of this problem can be illustrated by the following simple example. Consider Model two with the observation vector ξ ~ N(AΘ, σ²I), the i-th row of the matrix A being (1, t_i). The corresponding experiment consists in measuring the ordinates of a straight line x = Θ₁ + Θ₂t, t ∈ (−∞, ∞), at the points t_1, t_2, ..., t_n, if the stochastically independent measurement errors are normally distributed with a dispersion σ². If the class of functionals is chosen in the form

f_t(Θ) = p_t'Θ = Θ₁ + Θ₂t, p_t = (1, t)', t ∈ (−∞, ∞),

the problem consists in determining the class of intervals I_t(Θ̂) covering, at a given t ∈ (−∞, ∞), the ordinate of the straight line x = Θ₁ + Θ₂t, when the probability of a simultaneous covering of all the points of the straight line is

P{∀{t ∈ (−∞, ∞)}: p_t'Θ ∈ I_t(Θ̂)} = 1 − α.

The end points of the intervals I_t(Θ̂) form, in the plane in which the straight line x = Θ₁ + Θ₂t is plotted, the boundaries of the Scheffé confidence region covering the whole straight line with the above-mentioned probability 1 − α.

The solution of this problem, based on some modification of Theorems 6.1.4 and 6.1.5, is the subject of the following theorems.

Theorem 6.1.6.

P{∀{p ∈ ℛ^r}: p'ϑ̂ − σ√(χ²_r(1 − α)) √(p'H₃p) ≤ p'ϑ ≤ p'ϑ̂ + σ√(χ²_r(1 − α)) √(p'H₃p)} = P{ϑ ∈ ℰ} = 1 − α,

where H₃ = TH_Θ̂T', ϑ̂ = TΘ̂ and the other notation is taken from Theorem 6.1.4.

Proof. First, it will be proved that

ℰ = {u: (u − ϑ̂)'H₃⁻¹(u − ϑ̂) ≤ σ²χ²_r(1 − α)}

is a convex set. Let u₁, u₂ ∈ ℰ, 0 ≤ β ≤ 1 and u = (1 − β)u₁ + βu₂. Then

(u − ϑ̂)'H₃⁻¹(u − ϑ̂) = ⟨u − ϑ̂, u − ϑ̂⟩_{H₃⁻¹} = ⟨(1 − β)(u₁ − ϑ̂) + β(u₂ − ϑ̂), (1 − β)(u₁ − ϑ̂) + β(u₂ − ϑ̂)⟩_{H₃⁻¹} =
= (1 − β)²‖u₁ − ϑ̂‖²_{H₃⁻¹} + 2β(1 − β)⟨u₁ − ϑ̂, u₂ − ϑ̂⟩_{H₃⁻¹} + β²‖u₂ − ϑ̂‖²_{H₃⁻¹} ≤
(Schwarz inequality)
≤ (1 − β)²‖u₁ − ϑ̂‖²_{H₃⁻¹} + 2β(1 − β)‖u₁ − ϑ̂‖_{H₃⁻¹}‖u₂ − ϑ̂‖_{H₃⁻¹} + β²‖u₂ − ϑ̂‖²_{H₃⁻¹} ≤
≤ (1 − β)‖u₁ − ϑ̂‖²_{H₃⁻¹} + β‖u₂ − ϑ̂‖²_{H₃⁻¹} ≤ σ²χ²_r(1 − α) ⇒ u ∈ ℰ.

The set ℰ is obviously closed. For an arbitrary point u₁ of the boundary there exists, according to Theorem 3.2.2.7, a supporting hyperplane p'u = c, which can be determined in the following way. We have p₁'u₁ = c₁ and (u₁ − ϑ̂)'H₃⁻¹(u₁ − ϑ̂) = σ²χ²_r(1 − α); the vector p₁' = (u₁ − ϑ̂)'H₃⁻¹ and the number c₁ = σ²χ²_r(1 − α) + p₁'ϑ̂ satisfy the two given conditions. If u ∈ ℰ, then

|p₁'u − p₁'ϑ̂| = |(u₁ − ϑ̂)'H₃⁻¹(u − ϑ̂)| = |⟨u₁ − ϑ̂, u − ϑ̂⟩_{H₃⁻¹}| ≤
≤ √[(u₁ − ϑ̂)'H₃⁻¹(u₁ − ϑ̂)] √[(u − ϑ̂)'H₃⁻¹(u − ϑ̂)] ≤ σ²χ²_r(1 − α) ⇒
⇒ −σ²χ²_r(1 − α) ≤ p₁'u − p₁'ϑ̂ ≤ σ²χ²_r(1 − α) ⇒ p₁'u ≤ c₁ (= σ²χ²_r(1 − α) + p₁'ϑ̂);

thus

(u₁ − ϑ̂)'H₃⁻¹u = σ²χ²_r(1 − α) + (u₁ − ϑ̂)'H₃⁻¹ϑ̂

is the supporting hyperplane at the point u₁. The point u₂ = 2ϑ̂ − u₁ is the boundary point of the ellipsoid ℰ lying on the straight line connecting the points u₁ and ϑ̂, at a distance from the point ϑ̂ equal to the distance of the points u₁ and ϑ̂ but in the opposite direction. The supporting hyperplane at the point u₂ is

−(u₁ − ϑ̂)'H₃⁻¹u = σ²χ²_r(1 − α) − (u₁ − ϑ̂)'H₃⁻¹ϑ̂ ⟺ p₁'u = −σ²χ²_r(1 − α) + p₁'ϑ̂ = c₂,

and u ∈ ℰ ⇒ p₁'u ≥ c₂.

Let p ∈ ℛ^r be an arbitrary vector; because of the regularity of the matrix H₃ there exist two supporting hyperplanes orthogonal (in the Euclidean sense) to it. The first is characterized by the point u₁ such that p' = k(u₁ − ϑ̂)'H₃⁻¹ and

p'H₃H₃⁻¹H₃p/k² = p'H₃p/k² = σ²χ²_r(1 − α),

which implies

k = √[p'H₃p/(σ²χ²_r(1 − α))]. Thus

u₁ − ϑ̂ = √(σ²χ²_r(1 − α)) H₃p/√(p'H₃p),

and the corresponding supporting hyperplane is

p'u = √(σ²χ²_r(1 − α) p'H₃p) + p'ϑ̂;

u ∈ ℰ ⇒ p'u ≤ √(σ²χ²_r(1 − α) p'H₃p) + p'ϑ̂.

For the second supporting hyperplane, we have

p'u ≥ p'ϑ̂ − √(σ²χ²_r(1 − α) p'H₃p). Thus

{u: ∀{p ∈ ℛ^r} p'ϑ̂ − √(σ²χ²_r(1 − α) p'H₃p) ≤ p'u ≤ p'ϑ̂ + √(σ²χ²_r(1 − α) p'H₃p)} =
= {u: (u − ϑ̂)'H₃⁻¹(u − ϑ̂) ≤ σ²χ²_r(1 − α)},

which proves the assertion of the theorem.

Theorem 6.1.7. Using the notation of Theorem 6.1.5,

P{∀{p ∈ ℛ^r}: p'ϑ̂ − σ̂√(rF_{r,f}(1 − α)) √(p'H₃p) ≤ p'ϑ ≤ p'ϑ̂ + σ̂√(rF_{r,f}(1 − α)) √(p'H₃p)} = P{ϑ ∈ ℰ} = 1 − α,

where σ̂ = √(v'H⁻¹v/f).

Proof. The course of the proof of this assertion is analogous to that of Theorem 6.1.6.

Remark 6.1.3. We can formulate the problem concerning the Scheffé confidence region in the following way. Let the class of functionals of the parameter Θ ∈ ℛ^k be characterized by the subspace 𝓜(T') (T is the matrix from Theorem 6.1.4). The problem now consists in determining a confidence interval I_s(Θ̂) for every functional f(Θ) = s'Θ, Θ ∈ ℛ^k, where s ∈ 𝓜(T'), such that

P{∀{s ∈ 𝓜(T')} s'Θ ∈ I_s(Θ̂)} = 1 − α.

The problem is solved by Theorems 6.1.6 and 6.1.7 in the forms

P{∀{s ∈ 𝓜(T')} s'Θ̂ − σ√(χ²_r(1 − α)) √(s'H_Θ̂s) ≤ s'Θ ≤ s'Θ̂ + σ√(χ²_r(1 − α)) √(s'H_Θ̂s)} = 1 − α

and

P{∀{s ∈ 𝓜(T')} s'Θ̂ − σ̂√(rF_{r,f}(1 − α)) √(s'H_Θ̂s) ≤ s'Θ ≤ s'Θ̂ + σ̂√(rF_{r,f}(1 − α)) √(s'H_Θ̂s)} = 1 − α,

respectively. Here r = R(TH_Θ̂T') = dim 𝓜(T'), and f is given by Theorem 5.4.2.
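For the straight-line example above, the Scheffé band of Theorem 6.1.7 (r = 2, σ² unknown) can be computed as in the following sketch; the data are invented for illustration, H = I is assumed, and f = n − r residual degrees of freedom are used.

    import numpy as np
    from scipy.stats import f as fisher_f

    t_obs = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
    xi = np.array([1.2, 2.8, 5.1, 7.0, 9.2, 10.9])
    A = np.column_stack([np.ones_like(t_obs), t_obs])    # rows (1, t_i)
    n, r, alpha = len(xi), 2, 0.05

    H_theta = np.linalg.inv(A.T @ A)
    theta_hat = H_theta @ A.T @ xi
    v = A @ theta_hat - xi
    sigma_hat = np.sqrt(v @ v / (n - r))
    scale = np.sqrt(r * fisher_f.ppf(1 - alpha, r, n - r))

    # Simultaneous (Scheffe) intervals for Theta_1 + Theta_2 * t over a grid of t.
    for t in np.linspace(0.0, 5.0, 6):
        p = np.array([1.0, t])
        centre = p @ theta_hat
        half = sigma_hat * scale * np.sqrt(p @ H_theta @ p)
        print(f"t = {t:3.1f}: [{centre - half:6.3f}, {centre + half:6.3f}]")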

Remark 6.1.4. In the case of regularity of the matrix TH_Θ̂T', the proofs of Theorems 6.1.6 and 6.1.7 can be simplified with the help of the Schwarz inequality formulated in the following way. Let M be an r × r symmetric and positive definite matrix. Then

∀{x, y ∈ ℛ^r} (x'y)² ≤ x'Mx · y'M⁻¹y.

This formulation obviously implies

x₀ ∈ {x: x'M⁻¹x ≤ c²} ⟺ ∀{y ∈ ℛ^r} (x₀'y)² ≤ c²y'My.

For completing the proof, it suffices to choose

M = TH_Θ̂T' and x₀ = T(Θ − Θ̂). Then

x₀ ∈ {x: x'(TH_Θ̂T')⁻¹x ≤ c²} ⟺
⟺ ∀{y ∈ ℛ^r} [(Θ − Θ̂)'T'y]² ≤ c²y'TH_Θ̂T'y ⟺
⟺ ∀{s ∈ 𝓜(T')} [(Θ − Θ̂)'s]² ≤ c²s'H_Θ̂s.

6.2 Quadratic functions of random vectors

Before making use of the assumption of the normality of the random vector in the universal model, several theorems concerning quadratic forms of normally distributed random variables are given.

Theorem 6.2.1. Let the random variables

ξ_i ~ N(μ_i, 1), i = 1, ..., p,

be stochastically independent. Then

T = Σ_{i=1}^{p} λ_iξ_i² + 2Σ_{i=1}^{p} b_iξ_i + c

is a random variable having a χ²(k, δ)-distribution (k degrees of freedom, parameter of non-centrality δ) iff

1. λ_i = 0 or 1 for i = 1, ..., p;
2. λ_i = 0 ⇒ b_i = 0 for i = 1, ..., p;
3. c = Σ_{i=1}^{p} b_i².

If the conditions 1, 2 and 3 are fulfilled, then

k = Σ_{i=1}^{p} λ_i and δ = Σ_{i=1}^{p} λ_i(μ_i + b_i)².

Proof. Compare the characteristic functions of the random variables χ²(k, δ) and T. Equality of these functions implies the validity of the assertion.

If the notation η_i = ξ_i − μ_i is used, then, completing the square in every term with λ_i ≠ 0 and using

E{exp[it(η_j + a_j)²]} = (1 − 2it)^(-1/2) exp[ita_j²/(1 − 2it)],

we obtain for the characteristic function of the random variable T

(*)  φ_T(t) = E[exp(itT)] = exp[it(c − Σ_i b_i²/λ_i)] (∏_i (1 − 2iλ_it)^(-1/2)) exp[it Σ_i λ_i(μ_i + b_i/λ_i)²/(1 − 2iλ_it)],

the sums and the product being taken over the indices with λ_i ≠ 0, while every index with λ_i = 0 contributes the factor E{exp[2itb_i(η_i + μ_i)]}.

The characteristic function φ(t) of the random variable χ²(k, δ) is

φ(t) = E[exp(it Σ_{j=1}^{k} (ζ_j + ν_j)²)],

where the ζ_j ~ N(0, 1) are stochastically independent and δ = Σ_{j=1}^{k} ν_j². The expression for φ(t) can be rearranged as follows:

φ(t) = ∏_{j=1}^{k} E{exp[it(ζ_j + ν_j)²]} = (1 − 2it)^(-k/2) exp[it Σ_{j=1}^{k} ν_j²/(1 − 2it)],

thus

(**)  φ(t) = (1 − 2it)^(-k/2) exp[itδ/(1 − 2it)].

The function (*) is equal to (**) for all t iff the conditions 1, 2 and 3 are fulfilled. It is then obvious how to finish the proof.

Theorem 6.2.2. Let ξ ~ N_p(μ, I). Let A be a symmetric p × p matrix, b a p-dimensional vector and c a real number. Then T = ξ'Aξ + 2b'ξ + c is a random variable having the χ²(k, δ)-probability distribution iff

1. A² = A;  2. b ∈ 𝓜(A);  3. c = b'b.

If the conditions 1, 2 and 3 are fulfilled, then k = R(A) and δ = (b + μ)'A(b + μ).

Proof. Let Q be an orthogonal matrix such that Q'AQ = D is a diagonal matrix. Then η = Q'ξ ~ N(Q'μ, I) and T = η'Dη + 2b'Qη + c. By Theorem 6.2.1, the random variable T has a χ²(k, δ)-distribution iff

1. {D}_{jj} = 0 or 1 for j = 1, ..., p ⟺ D² = D ⟺ Q'A²Q = Q'AQ ⟺ A² = A;
2. [{D}_{jj} = 0 ⇒ {Q'b}_j = 0] ⟺ Q'b ∈ 𝓜(D) ⟺ QQ'b = b ∈ 𝓜(AQ) = 𝓜(A);
3. c = b'QQ'b = b'b.

If the conditions 1, 2 and 3 are fulfilled, then again by Theorem 6.2.1,

k = Σ_{j=1}^{p} {D}_{jj} = Tr(D) = R(QDQ') = R(A)

and

δ = Σ_{j=1}^{p} {D}_{jj}[{Q'b}_j + {Q'μ}_j]² = (b'Q + μ'Q)D(Q'b + Q'μ) = (b + μ)'A(b + μ).

Theorem 6.2.3. Let ξ ~ N_p(μ, Σ). Then the random variable T = ξ'Aξ + 2b'ξ + c has a χ²(k, δ)-distribution iff

1. ΣAΣAΣ = ΣAΣ ⟺ (ΣA)³ = (ΣA)²;
2. Σ(Aμ + b) ∈ 𝓜(ΣAΣ);
3. (Aμ + b)'Σ(Aμ + b) = μ'Aμ + 2b'μ + c.

If the conditions 1, 2 and 3 are fulfilled, then k = Tr(AΣ) and δ = (b + Aμ)'ΣAΣ(b + Aμ).

Proof. Factorize the matrix Σ with the help of a matrix J of type p × r, where R(Σ) = r and Σ = JJ'. Let R be a regular p × p matrix with the properties

R = (F, N)', F'J = I, N'J = 0; then R⁻¹ = (J, K), where F'K = 0.

For the random vector R(ξ − μ) = ((F'(ξ − μ))', (N'(ξ − μ))')' = (η', ζ')' we have η ~ N(0, I_{r,r}) and Var(ζ) = N'ΣN = N'JJ'N = 0;

hence P{ξ = μ + Jη + Kζ} = P{ξ = μ + Jη} = 1.

The random variable T can thus be expressed in the form

T = ξ'Aξ + 2b'ξ + c = (μ + Jη)'A(μ + Jη) + 2b'(μ + Jη) + c =
= η'J'AJη + 2(Aμ + b)'Jη + μ'Aμ + 2b'μ + c

with probability one. From Theorem 6.2.2 we know that the random variable T has a χ²(k, δ)-distribution iff

1. J'AJJ'AJ = J'AJ;  2. J'(Aμ + b) ∈ 𝓜(J'AJ);  3. μ'Aμ + 2b'μ + c = (Aμ + b)'JJ'(Aμ + b).

Furthermore, we have

J'AJJ'AJ = J'AJ ⟺ ΣAΣAΣ = ΣAΣ

(here we use the relation F'J = I). The equivalence

ΣAΣAΣ = ΣAΣ ⟺ (ΣA)³ = (ΣA)²

can be proved as follows:

R(ΣAΣ) = R(J'AJ) = R(J'AJJ'AJ) = R(ΣAΣAJ) ≤ R(ΣAΣA) ≤ R(ΣAΣ) ⇒ ∃{D: ΣAΣ = ΣAΣAD}.

The assumption (ΣA)³ = (ΣA)² implies

ΣAΣAΣAD = ΣAΣAD ⟺ ΣAΣAΣ = ΣAΣ.

The implication

ΣAΣAΣ = ΣAΣ ⇒ (ΣA)³ = (ΣA)²

is obvious. Now the equivalence

J'(Aμ + b) ∈ 𝓜(J'AJ) ⟺ Σ(Aμ + b) ∈ 𝓜(ΣAΣ)

is to be proved. If

J'(Aμ + b) ∈ 𝓜(J'AJ), then

JJ'(Aμ + b) ∈ 𝓜(JJ'AJ) = 𝓜(ΣAΣ).

If Σ(Aμ + b) ∈ 𝓜(ΣAΣ),

then F'Σ(Aμ + b) = J'(Aμ + b) ∈ 𝓜(J'AJJ') = 𝓜(J'AJ).

We apply Theorem 6.2.2 once more in order to complete the proof. From it we obtain k = R(J'AJ). As J'AJ is an idempotent matrix, we have

R(J'AJ) = Tr(J'AJ) = Tr(AJJ') = Tr(AΣ).

The parameter δ can be expressed as

δ = [J'(Aμ + b)]'J'AJJ'(Aμ + b) = (Aμ + b)'ΣAΣ(Aμ + b).

For more detail on quadratic forms the reader is referred to Rao [104].

6.3 The universal model

If the vector ξ in regular models (see Theorems 5.4.1 and 5.4.2) is normally distributed, then by Theorem 6.2.3 the estimator σ̂² is distributed as σ²χ²_f/f (see also Theorem 6.1.2). Theorem 6.2.3 enables us to investigate the probability distribution of an estimator σ̂² in the universal model as well.

First consider Theorem 5.4.5. There

σ̂²[R(V, A) − R(A)] = Λ = (η − AΘ)'(V⁻ − V⁻A[(A')⁻_{m(V)}]' − (A')⁻_{m(V)}A'V⁻ + (A')⁻_{m(V)}A'V⁻A[(A')⁻_{m(V)}]')(η − AΘ);

in Theorem 6.2.3 substitute for A the matrix

σ⁻²{V⁻ − V⁻A[(A')⁻_{m(V)}]' − (A')⁻_{m(V)}A'V⁻ + (A')⁻_{m(V)}A'V⁻A[(A')⁻_{m(V)}]'}

and for Σ the matrix σ²V. It can be simply verified that the conditions 1, 2 and 3 from Theorem 6.2.3 are fulfilled, and that the random variable Λ/σ² has the central chi-square distribution with s = R(V, A) − R(A) degrees of freedom (see the proof of Theorem 5.4.5). Furthermore, it can be seen that the estimator

σ̂² = (η − A[(A')⁻_{m(V)}]'η)'(V + AUA')⁻(η − A[(A')⁻_{m(V)}]'η)/[R(V, A) − R(A)]

is also distributed as σ²χ²_s/s. Next, Theorems 5.5.5, 5.5.6 and 5.5.7 have to be used. Here, for example, the property of the matrix C₁ from Theorem 5.5.6 immediately implies, with respect to Theorem 6.2.3, that the random variable η'C₁η/Tr(VC₁) has the distribution σ²χ²_s/s, etc. The fact that the estimator σ̂² is distributed as σ²χ²_s/s is used in order to determine a confidence interval for σ; this is the subject of the following theorem.

Theorem 6.3.1. Consider the universal model E_Θ(η) = AΘ, Σ_η = σ²V. If the vector η has a normal probability distribution, then a 100 × (1 − α) per cent confidence interval for σ has the form

[√(sσ̂²/χ²_s(1 − α/2)), √(sσ̂²/χ²_s(α/2))],

where σ̂² is the corresponding estimator of the unit dispersion from Theorems 5.4.1, 5.4.2, 5.4.5 and 5.5.7, and χ²_s(α/2) and χ²_s(1 − α/2) are respectively the (α/2)th and the (1 − α/2)th quantiles of a chi-square distribution with s degrees of freedom.

Proof. If χ²_s(α/2) and χ²_s(1 − α/2) are the (α/2)th and the (1 − α/2)th quantiles of the chi-square distribution with s degrees of freedom, then obviously

1 − α = P{χ²_s(α/2) ≤ sσ̂²/σ² ≤ χ²_s(1 − α/2)} = P{sσ̂²/χ²_s(1 − α/2) ≤ σ² ≤ sσ̂²/χ²_s(α/2)}.
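A numerical sketch of this interval (the values of s and σ̂² are invented; `chi2.ppf` supplies the quantiles):

    import numpy as np
    from scipy.stats import chi2

    s, alpha = 9, 0.05
    sigma2_hat = 0.84                            # value of the estimator of the unit dispersion

    lower = np.sqrt(s * sigma2_hat / chi2.ppf(1 - alpha / 2, s))
    upper = np.sqrt(s * sigma2_hat / chi2.ppf(alpha / 2, s))
    print(lower, upper)                          # 95 per cent confidence interval for sigma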

Theorem 6.3.2. Consider an n-dimensional random vector ξ ~ N(μ, σ²H). Let A and B be n × n symmetric matrices. Then

E(ξ'Aξ ξ'Bξ) = σ⁴[2Tr(AHBH) + Tr(AH)Tr(BH)] +
+ σ²[μ'Aμ Tr(BH) + μ'Bμ Tr(AH) + 4μ'AHBμ] + μ'Aμ μ'Bμ.

Proof. First let μ = 0, H = I and A = diag{d_{1,11}, d_{1,22}, ..., d_{1,nn}} = D₁. Then

E[ξ'Aξ ξ'Bξ] = E(Σ_{i=1}^{n} d_{1,ii}ξ_i² Σ_{j=1}^{n} Σ_{k=1}^{n} b_{jk}ξ_jξ_k) = Σ_{i=1}^{n} d_{1,ii}b_{ii}E(ξ_i⁴) + Σ_{i=1}^{n} Σ_{j≠i} d_{1,ii}b_{jj}E(ξ_i²ξ_j²).

Since E(ξ_i⁴) = 3σ⁴, we find

E(ξ'D₁ξ ξ'Bξ) = 3σ⁴ Σ_{i=1}^{n} d_{1,ii}b_{ii} + σ⁴ Σ_{i=1}^{n} Σ_{j≠i} d_{1,ii}b_{jj} = σ⁴[2Tr(D₁B) + Tr(D₁)Tr(B)].

Next, let the matrix A not be a diagonal matrix. Then there exists an orthogonal matrix Q such that

QAQ' = D₁ = diag{d_{1,11}, d_{1,22}, ..., d_{1,nn}}.

The vector η = Qξ has the normal distribution N(0, σ²I), so in this case

E(ξ'Aξ ξ'Bξ) = E(η'QAQ'η η'QBQ'η) = E(η'D₁η η'Sη),

where the matrix S = QBQ' is symmetric;

E(ξ'Aξ ξ'Bξ) = σ⁴[2Tr(D₁S) + Tr(D₁)Tr(S)] =
= σ⁴[2Tr(QAQ'QBQ') + Tr(QAQ')Tr(QBQ')] = σ⁴[2Tr(AB) + Tr(A)Tr(B)].

As the next step it is allowed that the matrix H is not the unit matrix. Then, with respect to its positive semidefiniteness, there exists a matrix J with linearly independent columns such that H = JJ'. The vector η = J⁺ξ (see Definition 2.1.4) has the normal distribution N(0, σ²I_{r,r}), where r = R(H) and P{ξ = Jη = JJ⁺ξ} = 1 (one must realize that P{ξ ∈ 𝓜(H)} = 1 and that JJ⁺ is a projection matrix on 𝓜(H)). Thus in our case

E(ξ'Aξ ξ'Bξ) = E(η'J'AJη η'J'BJη) = σ⁴[2Tr(J'AJJ'BJ) + Tr(J'AJ)Tr(J'BJ)] = σ⁴[2Tr(AHBH) + Tr(AH)Tr(BH)].

The last step, for 0 ≠ E(ξ) = μ, is now obvious.

Remark 6.3.1. In the model E_Θ(η) = AΘ, Σ_η = σ²V, where the vector η has a normal distribution, the random variable σ̂² = η'Qη/Tr(QV), where Q is a symmetric matrix such that VQA = 0 (σ̂² is invariant with respect to Θ ∈ ℛ^k), may be chosen as an unbiased estimator of the unit dispersion σ². By Theorem 6.3.2, the variance of this estimator is

var(σ̂²) = 2σ⁴Tr[(QV)²]/[Tr(QV)]², since

var(η'Qη) = E(η'Qη η'Qη) − [E(η'Qη)]² = 2σ⁴Tr[(QV)²].

If, in addition (cf. Theorem 6.2.3), the conditions

1. VQVQV = VQV;  2. 𝓜(VQA) ⊂ 𝓜(VQV);  3. A'QVQA = A'QA

are fulfilled, then the estimator σ̂² = η'Qη/Tr(QV) is distributed as σ²χ²_{Tr(QV)}/Tr(QV) and therefore

var(σ̂²) = 2σ⁴Tr(QV)/[Tr(QV)]² = 2σ⁴/Tr(QV),

in contrast to the preceding estimator, for which

var(σ̂²) = 2σ⁴Tr[(QV)²]/[Tr(QV)]².

Since for sufficiently large f the random variable χ²_f/f has approximately the normal distribution N(1, 2/f), the uncertainty in determining σ² may be quantified by the variance 2σ⁴/Tr(QV) instead of quantifying it by the confidence interval given in Theorem 6.3.1. If Q is chosen according to Theorem 5.4.1 or Theorem 5.4.2, then Tr(QV) for the separate models is given by the relations (1)–(5) from Theorem 5.4.2. In practice, the estimator σ̂ is often more important than the estimator σ̂². Since σ̂ = √(σ̂²), it can easily be verified that σ̂ is not an unbiased estimator of σ = √(σ²). Applying Theorem 4.1.5, the random variable σ̂ has, for sufficiently large value of Tr(QV), approximately the normal distribution

N(σ, var(σ̂) = 2σ⁴/[4σ²Tr(QV)] = σ²/[2Tr(QV)]).
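A small numerical sketch of the two variance formulas (in units of σ⁴); V is arbitrary and Q is the residual-type choice with VQA = 0, for which the chi-square conditions hold and the two formulas coincide. A second symmetric matrix Q₂ with VQ₂A = 0, but without the chi-square conditions, is also evaluated.

    import numpy as np

    rng = np.random.default_rng(4)
    n, k = 7, 2
    A = rng.standard_normal((n, k))
    G = rng.standard_normal((n, n))
    V = G @ G.T

    Vi = np.linalg.inv(V)
    Q = Vi - Vi @ A @ np.linalg.inv(A.T @ Vi @ A) @ A.T @ Vi   # symmetric, V*Q*A = 0
    QV = Q @ V
    print(np.allclose(V @ Q @ A, 0.0))
    print(2 * np.trace(QV @ QV) / np.trace(QV) ** 2)           # general formula
    print(2 / np.trace(QV))                                    # formula under the chi-square conditions

    M = rng.standard_normal((n, n)); M = M + M.T
    Q2 = Q @ M @ Q                                             # still symmetric with V*Q2*A = 0
    Q2V = Q2 @ V
    print(2 * np.trace(Q2V @ Q2V) / np.trace(Q2V) ** 2)        # general formula for Q2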

In the following, generalizations of Theorems 6.1.4 to 6.1.7 are stated.

Theorem 6.3.3. Consider the universal model E_Θ(η) = AΘ, Σ_η = σ²V. If Θ₀ = P_{A'}Θ, where P_{A'} is the projection matrix on the subspace 𝓜(A') in the Euclidean norm, then the rank of the covariance matrix of the estimator

Θ̂₀ = P_{A'}[(A')⁻_{m(V)}]'η,

i.e. the rank of the matrix

Σ_{Θ̂₀} = σ²P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'},

is

R(Σ_{Θ̂₀}) = R(A) if det(V) ≠ 0, and R(Σ_{Θ̂₀}) = R[V(V + AA')⁻A] otherwise.

Proof. If det(V) ≠ 0, then the assertion is evident. Let V be an arbitrary positive semidefinite matrix. A chain of rank equalities, based on the relation R(MM') = R(M) and on the properties of the g-inverse (A')⁻_{m(V)}, gives

R{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}} = R{V(A')⁻_{m(V)}A'}.

By Theorem 2.1.15 we may replace (A')⁻_{m(V)} by the matrix

(V + AA')⁻A[A'(V + AA')⁻A]⁻,

and we find

R{V(A')⁻_{m(V)}A'} = R{V(V + AA')⁻A[A'(V + AA')⁻A]⁻A'}.

As 𝓜(V) ⊂ 𝓜(V + AA') and 𝓜(A) ⊂ 𝓜(V + AA'), there exist matrices C and D such that

V = (V + AA')C = C'(V + AA'), A = (V + AA')D.

Hence

R{V(A')⁻_{m(V)}A'} = R{C'(V + AA')D[D'(V + AA')D]⁻D'(V + AA')} ≥
≥ R{C'(V + AA')D[D'(V + AA')D]⁻D'(V + AA')D} = R{C'(V + AA')D} ≥
≥ R{C'(V + AA')D[D'(V + AA')D]⁻D'(V + AA')} ⇒
⇒ R{V(A')⁻_{m(V)}A'} = R{C'(V + AA')(V + AA')⁻(V + AA')D} = R{V(V + AA')⁻A}.

Definition 6.3.1. The set L in ℛ^k is said to be a cylinder if in ℛ^k there exist a set Z (a base) and a subspace 𝓝 ⊂ ℛ^k such that L = {x + y: x ∈ Z, y ∈ 𝓝}.

Theorem 6.3.4. Let M be a k × k matrix. Then L = {x: x'Mx ≤ c²} is a cylinder with the base

Z = {u: u'Mu ≤ c², u ∈ 𝓜(M + M')}

and with the subspace 𝓝 = Ker(M + M').

Proof. We have

ℛ^k = 𝓜(M + M') ⊕ Ker(M + M')

and

∀{x ∈ ℛ^k} x'Mx = x'(M + M')x/2 ⇒ ∀{x ∈ ℛ^k} ∃!{(u, v): u ∈ 𝓜(M + M'), v ∈ Ker(M + M')} x = u + v. Let

x ∈ L ⇒ x = u + v, u ∈ 𝓜(M' + M), v ∈ 𝓝 = Ker(M + M') and

(u + v)'(M + M')(u + v)/2 = u'Mu ≤ c² ⇒ u ∈ Z and v ∈ 𝓝.

If u ∈ Z = {u: u'Mu ≤ c², u ∈ 𝓜(M + M')} and v ∈ 𝓝 = Ker(M + M'), then x = u + v satisfies the inequality

x'Mx = (u + v)'(M + M')(u + v)/2 = u'Mu ≤ c²,

and therefore x ∈ L.

Theorem 6.3.5. If the notation of Theorem 6.3.3 is used, then a confidence region 𝒦(Θ₀) for the parameter Θ₀ = P_{A'}Θ having confidence coefficient 1 − α is

𝒦(Θ₀) = {x: (x − P_{A'}[(A')⁻_{m(V)}]'η)'{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}⁻(x − P_{A'}[(A')⁻_{m(V)}]'η) ≤ σ²χ²_{1−α}(r, 0)} ∩
∩ {P_{A'}[(A')⁻_{m(V)}]'η + 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}},

where χ²_{1−α}(r, 0) is the (1 − α)th quantile of a random variable having a chi-square distribution with r = R(V[V + AA']⁻A) degrees of freedom. The manifold

P_{A'}[(A')⁻_{m(V)}]'η + 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}

is the same for all realizations η(ω), ω ∈ Ω − N, of the vector η, where the probability P{N} of the set N is equal to zero (we say that the given manifold is, with probability one, the same for all realizations of the vector η).

Proof. According to Theorem 6.2.3,

P{(P_{A'}Θ − P_{A'}[(A')⁻_{m(V)}]'η)'{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}⁻(P_{A'}Θ − P_{A'}[(A')⁻_{m(V)}]'η) ≤ c²} =
= P{P_{A'}Θ ∈ {x: (x − P_{A'}[(A')⁻_{m(V)}]'η)'{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}⁻(x − P_{A'}[(A')⁻_{m(V)}]'η) ≤ c²}} = P{σ²χ²(r, 0) ≤ c²}.

Furthermore, with probability one,

P_{A'}[(A')⁻_{m(V)}]'η − P_{A'}Θ ∈ 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}} ⇒
⇒ P_{A'}Θ + 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}} = P_{A'}[(A')⁻_{m(V)}]'η + 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}.

Theorem 6.3.6. If η ~ N(AΘ, σ²V), then the random vectors P_{A'}Θ̂ = P_{A'}[(A')⁻_{m(V)}]'η and v = AΘ̂ − η are stochastically independent.

Proof. Theorem 6.3 will be used. Writing the residual vector v in the form

v = {A[(A')⁻_{m(V)}]' − I}η

and noting that A[(A')⁻_{m(V)}]' is an idempotent matrix, i.e.

A[(A')⁻_{m(V)}]'A[(A')⁻_{m(V)}]' = A[(A')⁻_{m(V)}]',

we obtain

P_{A'}[(A')⁻_{m(V)}]'V{A[(A')⁻_{m(V)}]' − I}' = 0,

which is a necessary and sufficient condition for the stochastic independence of the random vectors P_{A'}[(A')⁻_{m(V)}]'η and v.

Corollary. The random variables v'V⁻v and

(P_{A'}[(A')⁻_{m(V)}]'η − P_{A'}Θ)'{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}⁻(P_{A'}[(A')⁻_{m(V)}]'η − P_{A'}Θ)

are stochastically independent.

Theorem 6.3.7. Using the notation of Theorem 6.3.3, a confidence region for P_{A'}Θ in the case of an a priori unknown unit dispersion σ² is

𝒦(Θ₀) = {x: (x − P_{A'}[(A')⁻_{m(V)}]'η)'{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}⁻(x − P_{A'}[(A')⁻_{m(V)}]'η) ≤
≤ {R[V(V + AA')⁻A]/[R(V, A) − R(A)]} v'V⁻v F(1 − α)} ∩ {P_{A'}[(A')⁻_{m(V)}]'η + 𝓜{P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}}},

where F(1 − α) is the (1 − α)th quantile of a random variable having the F-distribution with R(V[V + AA']⁻A) and R(V, A) − R(A) degrees of freedom.

Proof. The assertion follows sequentially from Theorem 6.3.6, from the definition of an F-distributed random variable, and from Theorems 6.3.3 and 6.3.2.

Theorem 6.3.8. If η ~ N(AΘ, σ²V), then ∀{Θ ∈ ℛ^k} E_Θ(Θ̂₁) = 0, where Θ̂₁ = (I − P_{A'})[(A')⁻_{m(V)}]'η (Theorem 5.2.7) is the best l-biased estimator of the vector Θ₁ = (I − P_{A'})Θ.

Proof.

E_Θ(Θ̂₁) = (I − P_{A'})[(A')⁻_{m(V)}]'AΘ = {[(A')⁻_{m(V)}]'A − P_{A'}[(A')⁻_{m(V)}]'A}Θ = 0.

Theorem 6.3.9. A confidence region for the parameter Θ = Θ₀ + Θ₁ (the notation from Theorems 6.3.3 and 6.3.8 is used) having confidence coefficient 1 − α, with the value σ² known a priori, is the cylinder

{x + y: x ∈ 𝒦(Θ₀), y ∈ Ker(A)},

where 𝒦(Θ₀) is the region from Theorem 6.3.5.

Proof. The assertion follows from the fact that

Θ₁ = (I − P_{A'})Θ ∈ Ker(A)

and from Theorems 6.3.4, 6.3.5 and 6.3.8.

Theorem 6.3.10. A confidence region for the parameter Θ = Θ₀ + Θ₁ (in the notation of Theorem 6.3.9) having confidence coefficient 1 − α, with the value σ² a priori unknown, is the cylinder

{x + y: x ∈ 𝒦(Θ₀), y ∈ Ker(A)},

where 𝒦(Θ₀) is the region from Theorem 6.3.7.

Proof. See the proof of Theorem 6.3.9; Theorem 6.3.7 is used instead of Theorem 6.3.5.

Remark 6.3.2. Theorems 6.3.9 and 6.3.10 represent generalizations, for the universal model, of Theorems 6.1.4 and 6.1.5. In the following, an analogous generalization of Theorems 6.1.6 and 6.1.7 is obtained.

Theorem 6.3.11. Let P_{A'} be the projection matrix (in the Euclidean norm) on 𝓜(A'), P_𝓝 the projection matrix (in the Euclidean norm) on 𝓝 = 𝓜{P_{A'}[(A')⁻_{m(V)}]'V}, Θ₀ = P_{A'}Θ and Θ₀₁ = (I − P_𝓝)Θ₀; then

1. all the components of the parameter Θ₀₁ are unbiasedly estimable;
2. the best unbiased linear estimator of Θ₀₁ is

Θ̂₀₁ = (I − P_𝓝)P_{A'}[(A')⁻_{m(V)}]'η;

3. P{Θ̂₀₁ = Θ₀₁} = 1.

Proof. As 𝓝 ⊂ 𝓜(A'), P_{A'} − P_𝓝P_{A'} = P_{A'} − P_𝓝 is again a Euclidean projector, namely on the 𝓜(A')-relative orthogonal complement of the subspace 𝓝, denoted by 𝓝^⊥_{A'}; 𝓜(A') = 𝓝 ⊕ 𝓝^⊥_{A'}. Columns (and therefore rows also) of the matrix P_{A'} − P_𝓝 are elements of 𝓜(A'), which implies the first assertion. The second assertion is a consequence of the first assertion and Theorem 5.2.2. Next, if the statistic Θ̂₀₁ from the second assertion is considered, then the relation P{η ∈ AΘ + 𝓜(V)} = 1 implies

P{Θ̂₀₁ ∈ Θ₀₁ + 𝓜{(I − P_𝓝)P_{A'}[(A')⁻_{m(V)}]'V}} = 1.

Now the third assertion follows from the fact that

𝓜{(I − P_𝓝)P_{A'}[(A')⁻_{m(V)}]'V} = 𝓜{(I − P_𝓝)P_𝓝} = {0}.

Theorem 6.3.12. Let Θ₀₀ = P_𝓝Θ₀; then

1. all the components of the vector Θ₀₀ are unbiasedly estimable;
2. the best unbiased linear estimator of Θ₀₀ is Θ̂₀₀ = P_𝓝P_{A'}[(A')⁻_{m(V)}]'η, with the covariance matrix σ²H_{Θ̂₀₀}, where

H_{Θ̂₀₀} = P_𝓝P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}P_𝓝;

3. R(H_{Θ̂₀₀}) = dim(𝓝) = R(V[V + AA']⁻A).

Proof. The inclusion 𝓝 ⊂ 𝓜(A') and the fact that P_𝓝 and P_{A'} are Euclidean projectors imply the first assertion. The second assertion is a consequence of Theorem 5.2.2 and the first assertion. In order to prove the third assertion we proceed as follows. Let J be a full rank matrix such that V = JJ'; then

R(H_{Θ̂₀₀}) = R{P_𝓝P_{A'}[(A')⁻_{m(V)}]'J (P_𝓝P_{A'}[(A')⁻_{m(V)}]'J)'} = R{P_𝓝P_{A'}[(A')⁻_{m(V)}]'J}

(this is a direct consequence of the relation R(MM') = R(M), valid for an arbitrary matrix M). As

𝓝 = 𝓜{P_{A'}[(A')⁻_{m(V)}]'V} = 𝓜{P_{A'}[(A')⁻_{m(V)}]'J},

then evidently

R(H_{Θ̂₀₀}) = R(P_𝓝) = dim(𝓝) = dim(𝓜{P_{A'}[(A')⁻_{m(V)}]'V}) = R{P_{A'}[(A')⁻_{m(V)}]'V} = R[V(V + AA')⁻A],

which is a consequence of Theorem 6.3.3.

Theorem 6.3.13. Let Θ₀₀ be the parameter from Theorem 6.3.12; then

P{Θ₀₀ ∈ {u: u ∈ 𝓝, (u − Θ̂₀₀)'H⁻_{Θ̂₀₀}(u − Θ̂₀₀) ≤ σ²χ²_{f₁}(1 − α)}} = 1 − α,

where χ²_{f₁}(1 − α) is the (1 − α)th quantile of a random variable having the chi-square distribution with

f₁ = R(H_{Θ̂₀₀}) = R[V(V + AA')⁻A]

degrees of freedom.

Proof. The assertion follows from Theorems 6.2.3 and 6.3.12.

Theorem 6.3.14. Let the notation of the preceding theorems be used. Then

P{∀{s ∈ ℛ^k} s'Θ̂₀₀ − σ√(χ²_{f₁}(1 − α)) √(s'P_𝓝H_{Θ̂₀₀}P_𝓝s) ≤ s'Θ₀₀ ≤ s'Θ̂₀₀ + σ√(χ²_{f₁}(1 − α)) √(s'P_𝓝H_{Θ̂₀₀}P_𝓝s)} = 1 − α.

Proof. If the method of proving Theorem 6.1.6 is applied, then from Theorems 6.3.13 and 6.1.6 it follows that

(*)  P{∀{s ∈ 𝓝} s'Θ̂₀₀ − σ√(χ²_{f₁}(1 − α)) √(s'H_{Θ̂₀₀}s) ≤ s'Θ₀₀ ≤ s'Θ̂₀₀ + σ√(χ²_{f₁}(1 − α)) √(s'H_{Θ̂₀₀}s)} = 1 − α.

As only the case s ∈ 𝓝 = 𝓜(H_{Θ̂₀₀}) is considered, no problems connected with the singularity of the matrix H_{Θ̂₀₀} occur. If this matrix is singular, then a generalized matrix inverse is used instead of the matrix inverse usual for a regular matrix. Next, as

∀{u ∈ 𝓝} ∀{s ∈ ℛ^k} s'u = s'P_𝓝u = (P_𝓝s)'u,

the symbol ∀{s ∈ ℛ^k} may be used in (*) instead of ∀{s ∈ 𝓝}, replacing simultaneously s by P_𝓝s. As P{Θ̂₀₀ ∈ 𝓝} = 1, the relation

∀{s ∈ ℛ^k} s'Θ̂₀₀ = (P_𝓝s)'Θ̂₀₀

is evidently valid with probability one.

Theorem 6.3.15. In the notation from the preceding theorems,

P_𝓝H_{Θ̂₀₀}P_𝓝 = H_{Θ̂₀₀} = P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'} = H_{Θ̂₀}

(here σ²H_{Θ̂₀} is the covariance matrix of the estimator from Theorem 6.3.3).

Proof. Regarding the second assertion of Theorem 6.3.12, we have

H_{Θ̂₀₀} = P_𝓝P_{A'}[(A')⁻_{m(V)}]'V(A')⁻_{m(V)}P_{A'}P_𝓝.

Since

𝓝 = 𝓜{P_{A'}[(A')⁻_{m(V)}]'V} = 𝓜{P_{A'}[(A')⁻_{m(V)}]'J},

where JJ' = V and J is a full rank matrix, we have

P_𝓝P_{A'}[(A')⁻_{m(V)}]'J = P_{A'}[(A')⁻_{m(V)}]'J ⇒ H_{Θ̂₀₀} = P_{A'}[(A')⁻_{m(V)}]'JJ'(A')⁻_{m(V)}P_{A'} = H_{Θ̂₀}.

Theorem 6.3.16. If the notation of the preceding theorems is used, then

P{∀{s ∈ ℛ^k} s'Θ̂₀ − σ√(χ²_{f₁}(1 − α)) √(s'H_{Θ̂₀}s) ≤ s'Θ₀ ≤ s'Θ̂₀ + σ√(χ²_{f₁}(1 − α)) √(s'H_{Θ̂₀}s)} = 1 − α.

Proof. As Θ₀ = Θ₀₀ + Θ₀₁ and Θ̂₀₁ is identical to Θ₀₁ with probability one (Theorem 6.3.11), the assertion follows from Theorems 6.3.14 and 6.3.15.

Theorem 6.3.17. If the value σ² is a priori unknown, then

P{∀{s ∈ ℛ^k} s'Θ̂₀ − σ̂√(f₁F_{f₁,f₂}(1 − α)) √(s'H_{Θ̂₀}s) ≤ s'Θ₀ ≤ s'Θ̂₀ + σ̂√(f₁F_{f₁,f₂}(1 − α)) √(s'H_{Θ̂₀}s)} = 1 − α,

where

f₁ = R[V(V + AA')⁻A], f₂ = R(V, A) − R(A),

σ̂² = v'V⁻v/f₂, v = A[(A')⁻_{m(V)}]'η − η.

Proof. The assertion follows from Theorem 6.1.7, Theorem 6.3.6 and its Corollary, and Theorems 6.3.14 and 6.3.15.

Remark 6.3.3. Theorems 6.3.16 and 6.3.17 generalize Theorems 6.1.6 and 6.1.7. In view of Theorem 6.3.8, it can be seen that a confidence interval for s'Θ₁ (where Θ₁ = (I − P_{A'})Θ) is (−∞, ∞), which therefore evidently covers the value s'Θ₁ with probability one. Hence, if 𝓕 is an arbitrary class of linear functionals characterized by the vectors s ∈ M ⊂ ℛ^k, where f ∈ 𝓕 ⇒ f(Θ) = s'Θ, s ∈ M, then the boundaries of the Scheffé confidence region can be determined by the following procedure:

1. First, the set M is decomposed into the subsets M₁ = {s: s ∈ M, s ∈ 𝓜(A')} and M₂ = {s: s ∈ M, s ∉ 𝓜(A')} such that M = M₁ ∪ M₂, M₁ ∩ M₂ = ∅.

2. If s ∈ M₁, then the interval I_s is defined as

I_s = [s'Θ̂₀ − σ̂√(f₁F_{f₁,f₂}(1 − α)) √(s'H_{Θ̂₀}s), s'Θ̂₀ + σ̂√(f₁F_{f₁,f₂}(1 − α)) √(s'H_{Θ̂₀}s)].

3. If s ∈ M₂, then the interval I_s is (−∞, ∞).

Then, referring to Theorems 6.3.16 and 6.3.17, the probability that all the values s'Θ are simultaneously covered by the corresponding intervals I_s is at least 1 − α.

This Section is based mainly on references [69] and [127].

6.4 Estimation of a non-linear function of first order parameters

Several consequences implied by the assumption of normality of a random vector, related with linear estimators of the first order parameters and quadratic estimators of the second order parameter σ², are given in Sections 6.1, 6.2 and 6.3. The aim of this Section is to discuss the problems of estimation of a non-linear function of the first order parameters, in the case of normality of the random vector in a regression model.

No estimation theory of non-linear functions of parameters of greater than first order has yet been developed, unless we have in mind asymptotic theory (see the introduction to this theory in Chapter 4), which enables us to study the statistical properties of estimators under certain conditions, in particular in the case of a sufficiently large dimension of the random vector.

Throughout Section 6.4 the symbols given in Section 2.2 will be used, and the normality of the random vector η will be assumed: η ~ N_n(AΘ, K), Θ ∈ ℛ^k. The design matrix A and the covariance matrix K are known, and it is supposed that 𝓜(A) ⊂ 𝓜(K) (this assumption may be fulfilled automatically, e.g. in Model two of Section 5.1). Our task is to find an unbiased estimator with minimum variance of the non-linear function f(·): ℛ^k → ℛ¹ of the parameter Θ ∈ ℛ^k.

The following kinds of Hilbert spaces, and relations among them (described in more detail in Section 2.2), will be used to solve the problem.

The space of random variables L₂(η) = 𝓗, generated by the set {u'η: u ∈ ℛ^n} or by the set {m'K⁻η: m ∈ 𝓜(K)}, with the inner product

⟨u₁'η, u₂'η⟩_𝓗 = ∫_Ω u₁'η(ω)η'(ω)u₂ dP₀(ω) = E₀(u₁'ηη'u₂) = u₁'Ku₂,

or ⟨m₁'K⁻η, m₂'K⁻η⟩_𝓗 = m₁'K⁻m₂.

Here P₀ is a probability measure defined on the σ-algebra 𝒮, corresponding to the value 0 of the parameter Θ.

The reproducing kernel Hilbert space (RKHS) 𝓗(K) ⊂ ℛ^n, generated by the matrix K or by the set {Ku: u ∈ ℛ^n}, with the inner product ⟨Ku₁, Ku₂⟩_{𝓗(K)} = u₁'Ku₂. The space 𝓗(K) and the space 𝓜(K) consist of the same points. The inner product of the elements m₁, m₂ ∈ 𝓗(K) is ⟨m₁, m₂⟩_{𝓗(K)} = m₁'K⁻m₂ (this value does not depend on the choice of a g-inverse K⁻ of the matrix K). The spaces 𝓗 = L₂(η) and 𝓗(K) are isomorphic (throughout Section 6.4 the expression isomorphism means isometrical isomorphism); the corresponding elements are u'η, Ku for u ∈ ℛ^n, or m'K⁻η, m for m ∈ 𝓗(K) = 𝓜(K).

The space L₂[𝓑(𝓗)] of all 𝓑(𝓗)-measurable and square P₀-integrable random variables; 𝓑(𝓗) is the minimal subsigma algebra of the σ-algebra 𝒮 such that each random variable ξ ∈ 𝓗 is measurable with respect to it. The space L₂[𝓑(𝓗)] is generated by the class

{exp(ξ − ½E₀(ξ²)): ξ ∈ 𝓗},

or

{exp(m'K⁻η − ½‖m‖²_{𝓗(K)}): m ∈ 𝓗(K)},

and its inner product is given by the relation

⟨exp(ξ − ½E₀(ξ²)), exp(ζ − ½E₀(ζ²))⟩ = exp(⟨ξ, ζ⟩_𝓗),

or

⟨exp(m₁'K⁻η − ½‖m₁‖²_{𝓗(K)}), exp(m₂'K⁻η − ½‖m₂‖²_{𝓗(K)})⟩_{L₂[𝓑(𝓗)]} = exp(m₁'K⁻m₂),

m₁, m₂ ∈ 𝓗(K). The space L₂[𝓑(𝓗)] can be expressed as ⊕_{n∈N} L₂ⁿ[𝓑(𝓗)], where N = {0, 1, 2, ...}, the symbol ⊕ means a direct sum, the Hilbert space L₂ⁿ[𝓑(𝓗)] is generated by the random variables h_n(ξ₁, ..., ξ_n), ξ_i ∈ 𝓗, i = 1, ..., n, and h_n is the polynomial of n variables defined in Lemma 2.2.8 (Hermitean polynomial); the generator of the space L₂ⁿ[𝓑(𝓗)] can also be expressed with the help of the elements m ∈ 𝓗(K) in the following way:

{h_n(m₁'K⁻η, ..., m_n'K⁻η): m_i ∈ 𝓗(K), i = 1, ..., n}.

The inner product of the elements

h_n(m₁'K⁻η, ..., m_n'K⁻η), h_n(k₁'K⁻η, ..., k_n'K⁻η) ∈ L₂ⁿ[𝓑(𝓗)]

is given by the value ∏_{j=1}^{n} (m_j'K⁻k_j).

The space exp ⊙ 𝓗(K), generated by the class {exp ⊙ m: m ∈ 𝓗(K)}; its element exp ⊙ m is given as the direct sum of the symmetric tensor powers of the element m:

exp ⊙ m = ⊕_{n∈N} (1/n!) m^{n⊙}.

The inner product of the elements exp ⊙ m₁, exp ⊙ m₂ is exp(m₁'K⁻m₂). The space exp ⊙ 𝓗(K) can be expressed as the direct sum of the spaces [𝓗(K)]^{n⊙}:

exp ⊙ 𝓗(K) = ⊕_{n∈N} [𝓗(K)]^{n⊙}.

A generator of the space [𝓗(K)]^{n⊙} is the class {m₁ ⊙ ... ⊙ m_n: m_i ∈ 𝓗(K), i = 1, ..., n}. The inner product of the elements m₁ ⊙ ... ⊙ m_n and k₁ ⊙ ... ⊙ k_n, k_i, m_i ∈ 𝓗(K), i = 1, ..., n, is

⟨m₁ ⊙ ... ⊙ m_n, k₁ ⊙ ... ⊙ k_n⟩_{[𝓗(K)]^{n⊙}} = Σ_{σ∈𝔖} ∏_{j=1}^{n} ⟨m_j, k_{σ_j}⟩_{𝓗(K)}.

Here 𝔖 is the class of all permutations σ = {σ₁, ..., σ_n} of the set {1, 2, ..., n}. The space exp ⊙ 𝓗 is defined analogously to the space exp ⊙ 𝓗(K); it is sufficient to substitute the element exp ⊙ (m'K⁻η) for the element exp ⊙ m. The space 𝓗(G), generated by the class

{f_U: f_U(m) = ⟨U, exp(m'K⁻η − ½‖m‖²_{𝓗(K)})⟩, U ∈ L₂[𝓑(𝓗)], m ∈ 𝓗(K)},

is important in what follows. Its elements can be expressed in the form

f(m) = Σ_{n∈N} ⟨g₁^{(n)} ⊗ ... ⊗ g_n^{(n)}, m ⊗ ... ⊗ m⟩_{[𝓗(K)]^{n⊗}}, g_i^{(n)} ∈ 𝓗(K), i = 1, ..., n, m ∈ 𝓗(K),

which will be used frequently. The spaces L₂[𝓑(𝓗)], exp ⊙ 𝓗(K), exp ⊙ 𝓗 and 𝓗(G) are isomorphic. Corresponding elements in the first three spaces with respect to the isomorphism are, sequentially,

exp(m'K⁻η − ½‖m‖²_{𝓗(K)}), exp ⊙ m and exp ⊙ (m'K⁻η).

If the isomorphism between the spaces L₂[𝓑(𝓗)] and 𝓗(G) is considered, then the element U ∈ L₂[𝓑(𝓗)] corresponds to the functional f_U: 𝓗(K) → ℛ¹, where

f_U(m) = ⟨U, exp(m'K⁻η − ½‖m‖²_{𝓗(K)})⟩_{L₂[𝓑(𝓗)]}.

For the sake of completeness, the most important notation and relations from the theory of reproducing kernel Hilbert spaces and their tensor powers are recapitulated:

m_i ∈ 𝓗(K),  m₁ ⊙ ... ⊙ m_n = (1/√(n!)) Σ_{σ∈𝔖} m_{σ₁} ⊗ ... ⊗ m_{σ_n};

‖m₁ ⊙ ... ⊙ m_n‖²_{[𝓗(K)]^{n⊙}} = Σ_{σ∈𝔖} ∏_{j=1}^{n} ⟨m_j, m_{σ_j}⟩_{𝓗(K)};

‖m^{n⊙}‖²_{[𝓗(K)]^{n⊙}} = n! ‖m‖^{2n}_{𝓗(K)};

⟨m₁ ⊙ ... ⊙ m_n, k₁ ⊙ ... ⊙ k_n⟩_{[𝓗(K)]^{n⊙}} = Σ_{σ∈𝔖} ∏_{j=1}^{n} ⟨m_j, k_{σ_j}⟩_{𝓗(K)};

⟨m^{n⊙}, k^{n⊙}⟩_{[𝓗(K)]^{n⊙}} = n! ⟨m, k⟩ⁿ_{𝓗(K)};

⟨m₁ ⊙ ... ⊙ m_n, k^{n⊙}⟩_{[𝓗(K)]^{n⊙}} = Σ_{σ∈𝔖} ∏_{j=1}^{n} ⟨m_j, k⟩_{𝓗(K)} = n! ∏_{j=1}^{n} ⟨m_j, k⟩_{𝓗(K)};

P_{[𝓗(K)]^{n⊙}}(m₁ ⊗ ... ⊗ m_n) = (1/√(n!)) (m₁ ⊙ ... ⊙ m_n);

‖P_{[𝓗(K)]^{n⊙}}(m₁ ⊗ ... ⊗ m_n)‖²_{[𝓗(K)]^{n⊙}} = (1/n!) Σ_{σ∈𝔖} ∏_{j=1}^{n} ⟨m_j, m_{σ_j}⟩_{𝓗(K)};

P_{[𝓗(K)]^{n⊙}}(m^{n⊗}) = (1/√(n!)) m^{n⊙};

‖P_{[𝓗(K)]^{n⊙}}(m^{n⊗})‖_{[𝓗(K)]^{n⊙}} = ‖m^{n⊗}‖_{[𝓗(K)]^{n⊗}}.

Definition 6.4.1. A function f(·): 𝓗(K) → ℛ¹ of the parameter m ∈ 𝓗(K) is called unbiasedly estimable at the point m₀ if there exists a 𝓑(𝓗)-measurable statistic U = f̂(η), where f̂(·): (ℛ^n, 𝓑^n) → (ℛ¹, 𝓑¹), such that

∀{m ∈ 𝓗(K)} E_m(f̂(η)) = ∫_Ω f̂(η(ω)) dP_m(ω) = f(m)  and  𝒟_{m₀}(f̂(η)) < ∞.

The probability measure P_m is defined by the relation

dP_m(ω) = exp(m'K⁻η − ½‖m‖²_{𝓗(K)}) dP₀(ω).

The statistic f̂(η) is called an unbiased estimator of f(·).

Definition 6.4.2. The locally best unbiased estimator of the function f(·) at the point m₀ ∈ 𝓗(K) is an unbiased estimator f̂(η) whose dispersion at the point m₀ does not exceed the dispersion at m₀ of any other unbiased estimator of f(·).

Theorem 6.4.1. A function f(·): 𝓗(K) → ℛ¹ is unbiasedly estimable at the point m = 0 iff f(·) ∈ 𝓗(G), where 𝓗(G) is the reproducing kernel Hilbert space from Lemma 2.2.10.

Proof. If the function f(·): 𝓗(K) → ℛ¹ is unbiasedly estimable at the point m = 0 ∈ 𝓗(K), then there exists a 𝓑(𝓗)-measurable statistic f̂(η) = U (for the definition of the subsigma algebra 𝓑(𝓗) ⊂ 𝒮, refer to Section 2.2) which satisfies the conditions of Definition 6.4.1. The variance 𝒟₀(U) of the random variable U at the point m = 0 has to be finite with respect to the assumption, and thus the relation

‖U‖²_{L₂[𝓑(𝓗)]} = ∫_Ω U²(ω) dP₀(ω) < ∞

has to be valid; hence U ∈ L₂[𝓑(𝓗)]. With respect to the isomorphism between the spaces 𝓗(G) and L₂[𝓑(𝓗)] (Lemma 2.2.10), one and only one functional f_U(·) ∈ 𝓗(G) corresponds to the random variable U, and this functional satisfies the relations

∀{m ∈ 𝓗(K)} f_U(m) = ⟨U, exp(m'K⁻η − ½‖m‖²_{𝓗(K)})⟩_{L₂[𝓑(𝓗)]} = ∫_Ω U(ω) dP_m(ω) = E_m(U).

Hence the functional f_U(·) is unbiasedly estimable at the point m = 0, and the random variable U is its estimator. However, with respect to our assumption, the random variable U is an estimator of the functional f(·). Because

∀{m ∈ 𝓗(K)} f(m) = E_m(U) = f_U(m),

we have f(·) = f_U(·). Thus all the functionals which are estimable at the point m = 0 are just those which belong to the reproducing kernel space 𝓗(G).

Theorem 6.4.2. The functional f(·): 𝓗(K) → ℛ¹ is unbiasedly estimable at the point m₀ ∈ 𝓗(K) iff f(·) ∈ 𝓗(G_{m₀}), where 𝓗(G_{m₀}) is the reproducing space from Lemma 2.2.12.

Proof. Let U be a random variable belonging to the space L₂(Ω, 𝓑(𝓗), P_{m₀}), which in accordance with Lemma 2.2.12 is isomorphic with the space 𝓗(G_{m₀}). Hence there exists the unique functional f_U(·) ∈ 𝓗(G_{m₀}) assigned to U, such that

∀{m ∈ 𝓗(K)} f_U(m) = ⟨U, exp((m − m₀)'K⁻η − ½‖m‖²_{𝓗(K)} + ½‖m₀‖²_{𝓗(K)})⟩_{L₂(Ω, 𝓑(𝓗), P_{m₀})} =
= ∫_Ω U(ω) exp(m'K⁻η(ω) − ½‖m‖²_{𝓗(K)}) dP₀(ω) = ∫_Ω U(ω) dP_m(ω) = E_m(U).

If f(·) is unbiasedly estimable at m₀ with the estimator U, then

∀{m ∈ 𝓗(K)} f(m) = E_m(U) = f_U(m).

The completion of the proof is analogous to that of Theorem 6.4.1.

Theorem 6.4.3. Let the functional f_p(·) be defined on 𝓗(K) by the relation

f_p(m) = ⟨g_p, m^{p⊙}⟩_{[𝓗(K)]^{p⊙}}, where g_p ∈ [𝓗(K)]^{p⊙}, m ∈ 𝓗(K), p ≥ 0;

then ∀{k ∈ 𝓗(K)} f_p(·) ∈ 𝓗(G_k) and

‖f_p‖²_{𝓗(G_k)} = Σ_{i=0}^{p} (p choose i)² i! ‖⟨P_{[𝓗(K)]^{p⊙}} g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}‖²_{[𝓗(K)]^{i⊙}};

thus the functional f_p(·) is estimable at each point k ∈ 𝓗(K). The symbol

⟨P_{[𝓗(K)]^{p⊙}} g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}

denotes the element of [𝓗(K)]^{i⊙} obtained from the projection

P_{[𝓗(K)]^{p⊙}} g_p = g_p ∈ [𝓗(K)]^{p⊙},

written in the form

g_p = (1/√(p!)) Σ_{σ∈𝔖} h_{σ₁} ⊗ ... ⊗ h_{σ_p}, h_i ∈ 𝓗(K), i = 1, ..., p,

by pairing its last p − i tensor factors with the element k^{(p−i)⊗}. (It is necessary to note here that, in general, the element g_p need not be of the given form, i.e. reducible. However, the symbol

⟨P_{[𝓗(K)]^{p⊙}} g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}

can be defined also for the irreducible elements g_p. In the process of completion of finite-dimensional spaces, which is our case, this problem does not occur.)

Proof. By mathematical induction it can be proved first that, for any p ≥ 0 and whatever be the vector k ∈ 𝓗(K),

∀{m ∈ 𝓗(K)} f_p(m) = Σ_{i=0}^{p} (p choose i) ⟨⟨g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, (m − k)^{i⊙}⟩_{[𝓗(K)]^{i⊙}}.

For p = 0 and p = 1 the assertion is obvious. If it is true for p − 1 (p > 1), then it holds for p as well: it suffices to write one factor of m^{p⊙} as m = (m − k) + k, to pair it off, and to apply the induction hypothesis to the remaining (p − 1)st symmetric power. The last equality shows that f_p(·) ∈ 𝓗(G_k); therefore

⟨f_p, f_q⟩_{𝓗(G_k)} = Σ_{i=0}^{min{p,q}} (p choose i)(q choose i) i! ⟨⟨g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, ⟨g_q, k^{(q−i)⊗}⟩_{[𝓗(K)]^{(q−i)⊗}}⟩_{[𝓗(K)]^{i⊙}},

which implies the relation for ‖f_p‖²_{𝓗(G_k)} in the assertion of the theorem.

Corollary 6.4.1. Every polynomial

f(m) = Σ_{p=0}^{n} ⟨g_p, m^{p⊙}⟩_{[𝓗(K)]^{p⊙}}, m ∈ 𝓗(K),

is estimable at each point k ∈ 𝓗(K); its norm ‖f‖_{𝓗(G_k)} is given by the relation

‖f‖²_{𝓗(G_k)} = Σ_{p=0}^{n} Σ_{q=0}^{n} Σ_{i=0}^{min{p,q}} (p choose i)(q choose i) i! ⟨⟨g_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, ⟨g_q, k^{(q−i)⊗}⟩_{[𝓗(K)]^{(q−i)⊗}}⟩_{[𝓗(K)]^{i⊙}}.

Theorem 6.4.4. Let the functional f(·): 𝓗(K) → ℛ¹ be estimable at each point k ∈ 𝓗(K). Then there exists the uniformly best estimator of the functional f(·).

Proof. The assumptions of the theorem imply that, for each point k ∈ 𝓗(K), there exists an unbiased estimator U_k ∈ L²(Ω, 𝓑(𝓗), P_k) (with respect to the isomorphism between L²(Ω, 𝓑(𝓗), P_k) and 𝓗(G_k), this estimator is unique and therefore locally best). The unbiasedness implies

∀{m ∈ 𝓗(K)} f(m) = E_m(U₀) = E_m(U_k) ⇒
⇒ ∀{m ∈ 𝓗(K)} E₀[(U_k − U₀) dP_m/dP₀] = ∫_Ω [U_k(ω) − U₀(ω)] dP_m(ω) = 0 ⇒ U_k = U₀ [P₀].

Since the vector k is arbitrary, it follows that

∀{k ∈ 𝓗(K)} U_k = U₀ [𝓑(𝓗), P₀].

Theorem 6.4.5. Let f_p ∈ [𝓗(K)]^{p⊙} and k ∈ 𝓜, where 𝓜 is a subspace of the reproducing kernel space 𝓗(K). Then

P_{[𝓜]^{i⊙}}⟨P_{[𝓗(K)]^{p⊙}} f_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}} = ⟨P_{[𝓜]^{p⊙}} f_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}.

Proof. Since the elements on both sides of the equality which is to be proved belong to the space [𝓜]^{i⊙}, it suffices to prove that, for any h ∈ [𝓜]^{i⊙}, the inner products of both sides with h coincide. As k^{(p−i)⊗} ⊗ h ∈ [𝓜]^{p⊗}, this follows from

⟨P_{[𝓜]^{i⊙}}⟨P_{[𝓗(K)]^{p⊙}} f_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, h⟩_{[𝓗(K)]^{i⊙}} = ⟨f_p, k^{(p−i)⊗} ⊗ h⟩_{[𝓗(K)]^{p⊗}} = ⟨⟨P_{[𝓜]^{p⊙}} f_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, h⟩_{[𝓗(K)]^{i⊙}}.

Theorem 6.4.6. Let g_p ∈ [𝓜]^{p⊙} and let

f(m) = Σ_{p=0}^{n} ⟨g_p, m^{p⊙}⟩_{[𝓗(K)]^{p⊙}}, m ∈ 𝓜,

be a polynomial of the unknown mean value of a normally distributed random vector η, defined on the subspace 𝓜 of the space 𝓗(K). Denote by U_f the uniformly best estimator of the polynomial f₁(·) defined on the whole space 𝓗(K) by the relation

f₁(m) = Σ_{p=0}^{n} ⟨g_p, m^{p⊙}⟩_{[𝓗(K)]^{p⊙}}, m ∈ 𝓗(K).

Then U_f is the uniformly best unbiased polynomial estimator of the functional f(·).

Proof. Denote by

𝓕_f = {h(·): h(m) = Σ_{p=0}^{n} ⟨l_p, m^{p⊙}⟩_{[𝓗(K)]^{p⊙}}, m ∈ 𝓗(K), l_p ∈ [𝓗(K)]^{p⊙}, P_{[𝓜]^{p⊙}} l_p = g_p, p = 0, ..., n}

the class of polynomials defined on 𝓗(K) and identical on 𝓜 with the functional f(·). Let U_h be the uniformly best unbiased estimator of an arbitrary polynomial h(·) ∈ 𝓕_f. Then, according to Theorems 6.4.3 and 6.4.5, for any polynomial h(·) ∈ 𝓕_f (the index i changes from 1 to min{p, q}, since the dispersion, not a norm, is being evaluated) we have

∀{k ∈ 𝓜} 𝒟_k(U_h) = Σ_{p=0}^{n} Σ_{q=0}^{n} Σ_{i=1}^{min{p,q}} (p choose i)(q choose i) i! ⟨⟨l_p, k^{(p−i)⊗}⟩_{[𝓗(K)]^{(p−i)⊗}}, ⟨l_q, k^{(q−i)⊗}⟩_{[𝓗(K)]^{(q−i)⊗}}⟩_{[𝓗(K)]^{i⊙}}.

Writing l_p = P_{[𝓜]^{p⊙}}l_p + (l_p − P_{[𝓜]^{p⊙}}l_p) = g_p + (l_p − g_p) and applying Theorem 6.4.5 (for k ∈ 𝓜 the mixed terms vanish), we obtain

𝒟_k(U_h) = 𝒟_k(U_f) + 𝒟_k(U_{h−f}) ≥ 𝒟_k(U_f),

where U_{h−f} is the uniformly best polynomial estimator of the functional (h − f)(m) = h(m) − f(m), m ∈ 𝓗(K).

Next, the subspace 𝓜 ⊂ 𝓗(K) will be determined by the matrix A in the model η ~ N(AΘ, K), Θ ∈ ℛ^k, and the estimated functional will be assumed in the form

f(Θ) = Σ_{i=1}^{m} ⟨p₁^{(i)} ⊗ ... ⊗ p_i^{(i)}, Θ^{i⊗}⟩, Θ ∈ ℛ^k,

where p_s^{(i)} ∈ ℛ^k, s = 1, ..., i, i = 1, ..., m.

Remark 6.4.1. If U_i(η) is the locally best estimator of the functional

f_i(Θ) = ⟨p₁^{(i)} ⊗ ... ⊗ p_i^{(i)}, Θ^{i⊗}⟩, i = 1, ..., m,

then, according to Theorem 3.1, Σ_{i=1}^{m} U_i(η) is the locally best estimator of the functional

f(Θ) = Σ_{i=1}^{m} ⟨p₁^{(i)} ⊗ ... ⊗ p_i^{(i)}, Θ^{i⊗}⟩.

Theorem 6.4.7. The functional

/(6>) = < p , ® . . . ® P m , ér®> is unbiasedly estimable at the point 0 = 0iff V{ / = 1, m} p , e ^ ( A ' ) , i.e. iff

Α ® . . . ® Α , ε [ Λ Τ ( Α ' ) Γ β .

Proof. Let $\forall\{i = 1, \ldots, m\}\ p_i \in \mathcal{M}(A')$; then

$$\forall\{i = 1, \ldots, m\}\quad A'(A')^-p_i = p_i$$

and therefore

$$f(\Theta) = \langle(A'(A')^-p_1)\otimes\cdots\otimes(A'(A')^-p_m),\ \Theta^{m\otimes}\rangle = \big\langle\{[(A')^-]'\}^{m\otimes}(p_1\otimes\cdots\otimes p_m),\ (A\Theta)^{m\otimes}\big\rangle =$$

$$= \big\langle[K(A')^-]^{m\otimes}(p_1\otimes\cdots\otimes p_m),\ (A\Theta)^{m\otimes}\big\rangle_{[\mathcal{H}(K)]^{m\otimes}}.$$

According to Theorem 6.4.1 and Lemma 2.2.11, the functional $f(\cdot)$ is estimable at the point $\Theta = 0$.

Let $f(\cdot)$ be an unbiasedly estimable functional. Then in accordance with Theorem 6.4.1 and Lemma 2.1.11, there exists a vector

$$u_1\otimes\cdots\otimes u_m \in [\mathbb{R}^n]^{m\otimes} = \mathbb{R}^{n^m}$$

such that

$$\forall\{\Theta\in\mathbb{R}^k\}\quad f(\Theta) = \big\langle K^{m\otimes}(u_1\otimes\cdots\otimes u_m),\ (A\Theta)^{m\otimes}\big\rangle_{[\mathcal{H}(K)]^{m\otimes}} = \langle(A')^{m\otimes}(u_1\otimes\cdots\otimes u_m),\ \Theta^{m\otimes}\rangle =$$

$$= \langle(A'u_1)\otimes\cdots\otimes(A'u_m),\ \Theta^{m\otimes}\rangle = \langle p_1\otimes\cdots\otimes p_m,\ \Theta^{m\otimes}\rangle\ \Rightarrow$$

$$\Rightarrow\quad p_1\otimes\cdots\otimes p_m = (A'u_1)\otimes\cdots\otimes(A'u_m) \in [\mathcal{M}(A')]^{m\otimes}.$$
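A quick numerical check of this estimability condition can be written down directly. The following is a minimal sketch assuming NumPy; the design matrix and the test vectors are hypothetical. A functional $\langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ is unbiasedly estimable iff every $p_i$ passes the test $A'(A')^-p_i = p_i$, i.e. lies in $\mathcal{M}(A')$.

```python
import numpy as np

def unbiasedly_estimable(A, p, tol=1e-10):
    """p'Theta is unbiasedly estimable iff p lies in M(A'), i.e. A'(A')^- p = p."""
    At = A.T
    return np.allclose(At @ np.linalg.pinv(At) @ p, p, atol=tol)

# hypothetical rank-deficient design: third column = first column + second column
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])

print(unbiasedly_estimable(A, np.array([1., 1., 2.])))  # True  (in the row space of A)
print(unbiasedly_estimable(A, np.array([1., 0., 0.])))  # False (e_1 is not estimable)
```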

Theorem 6.4.8. The functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ is estimable at the point $\Theta_0 \in \mathbb{R}^k$ iff $f(\cdot) \in \mathcal{H}(G_{A\Theta_0})$.

Proof. The assertion is a consequence of Theorem 6.4.2.

Theorem 6.4.9. If the functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ is unbiasedly estimable at the point $\Theta = 0$, then it is unbiasedly estimable at each point $\Theta \in \mathbb{R}^k$.

Proof. The assertion follows from Corollary 6.4.1.

Remark 6.4.2. The functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$, $\Theta \in \mathbb{R}^k$, is assigned the class

$$\mathcal{F} = \Big\{\rho(\cdot):\ \rho(m) = \big\langle K^{m\otimes}\big[((A')^-_1p_1)\otimes\cdots\otimes((A')^-_mp_m)\big],\ m^{m\otimes}\big\rangle_{[\mathcal{H}(K)]^{m\otimes}},\ m \in \mathcal{H}(K)\Big\}$$

of functionals such that $\rho(\cdot) = f(\cdot)$ on $\mathcal{M}(A)$. Here $(A')^-_i \in (A')^-$, $i = 1, \ldots, m$, is an arbitrary choice of the g-inverse of the matrix $A'$. According to Theorem 6.4.6 and Corollary 6.4.1, the dispersion of the locally best estimator $\hat f(\eta)$ of the functional $f(\cdot)$ is

$$\mathcal{D}(\hat f(\eta)) = \|\rho(\cdot)\|^2_{\mathcal{H}(G_{A\Theta_0})} - f^2(\Theta_0),$$

where

$$\|\rho(\cdot)\|^2_{\mathcal{H}(G_{A\Theta_0})} = \sum_{i=0}^{m}\binom{m}{i}^2 i!\,\Big\|\big\langle K^{m\otimes}\big[((A')^-_1p_1)\otimes\cdots\otimes((A')^-_mp_m)\big],\ (A\Theta_0)^{(m-i)\otimes}\big\rangle_{[\mathcal{H}(K)]^{(m-i)\otimes}}\Big\|^2_{[\mathcal{H}(K)]^{i\otimes}}.$$

As shown in what follows, the value $\|\rho(\cdot)\|_{\mathcal{H}(G_{A\Theta_0})}$ is invariant with respect to the choice of the g-inverse $(A')^-$ of the matrix $A'$ in the definition of the functional $\rho(\cdot)$.

Theorem 6.4.10. Within the model $\eta \sim N(A\Theta, K)$, $\mathcal{M}(A) \subset \mathcal{M}(K)$, $\Theta \in \mathbb{R}^k$, the unbiased estimator of the functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ is

$$\widehat{f(\Theta)} = h_m\big(p_1'[(A')^-]'\eta,\ \ldots,\ p_m'[(A')^-]'\eta\big),$$

where $h_m(\cdot)$ is the polynomial of $m$ variables given in Lemma 2.2.8.


Proof.

$$f(\Theta) = \langle p_1\otimes\cdots\otimes p_m,\ \Theta^{m\otimes}\rangle = \big\langle(A')^{m\otimes}\big[((A')^-p_1)\otimes\cdots\otimes((A')^-p_m)\big],\ \Theta^{m\otimes}\big\rangle =$$

$$= \big\langle K^{m\otimes}\big[((A')^-p_1)\otimes\cdots\otimes((A')^-p_m)\big],\ (A\Theta)^{m\otimes}\big\rangle_{[\mathcal{H}(K)]^{m\otimes}} = \prod_{i=1}^{m}\langle K(A')^-p_i,\ A\Theta\rangle_{\mathcal{H}(K)}.$$

We now take into account the vector

$$g = (K(A')^-p_1)\odot\cdots\odot(K(A')^-p_m) \in [\mathcal{H}(K)]^{m\odot} \subset \exp\odot\,\mathcal{H}(K).$$

With respect to the isomorphism between $\exp\odot\,\mathcal{H}(K)$ and $L^2(\mathcal{B}(\mathcal{H}))$ (see Lemma 2.2.10 and Corollary 2.2.2), this element $g$ is assigned the random variable

$$h_m\big([K(A')^-p_1]'K^-\eta,\ \ldots,\ [K(A')^-p_m]'K^-\eta\big) = h_m\big(p_1'[(A')^-]'\eta,\ \ldots,\ p_m'[(A')^-]'\eta\big),$$

whose mean value is

$$E_{A\Theta}\{h_m(p_1'[(A')^-]'\eta, \ldots, p_m'[(A')^-]'\eta)\} = \big\langle h_m(p_1'[(A')^-]'\eta, \ldots, p_m'[(A')^-]'\eta),\ \mathrm{d}P_{A\Theta}/\mathrm{d}P_0\big\rangle_{L^2(\mathcal{B}(\mathcal{H}))} =$$

$$= \langle g,\ \exp\odot(A\Theta)\rangle_{\exp\odot\,\mathcal{H}(K)} = \langle g,\ (A\Theta)^{m\otimes}\rangle_{[\mathcal{H}(K)]^{m\otimes}} = f(\Theta)$$

(here again, the isomorphism between the spaces $L^2(\mathcal{B}(\mathcal{H}))$ and $\exp\odot\,\mathcal{H}(K)$ has been utilized).

Theorem 6.4.11. Within the model $\eta \sim N(A\Theta, K)$, $\mathcal{M}(A) \subset \mathcal{M}(K)$, $\Theta \in \mathbb{R}^k$, the best unbiased estimator of the functional

$$f(\Theta) = \langle p_1\otimes\cdots\otimes p_m,\ \Theta^{m\otimes}\rangle,\qquad \Theta \in \mathbb{R}^k,$$

is of the form

$$\widehat{f(\Theta)} = h_m\big(p_1'[(A')^-_{m(K)}]'\eta,\ \ldots,\ p_m'[(A')^-_{m(K)}]'\eta\big),$$

where $(A')^-_{m(K)}$ is the matrix from Definition 2.1.2.

Proof. According to Theorem 6.4.10, an unbiased estimator $\hat f(\eta)$ of the functional $f(\cdot)$ is

$$\hat f(\eta) = h_m\big(p_1'[(A')^-]'\eta,\ \ldots,\ p_m'[(A')^-]'\eta\big),$$

and its dispersion (see Remark 6.4.2) is

$$\mathcal{D}_{\Theta_0}(\hat f(\eta)) = \|\rho\|^2_{\mathcal{H}(G_{A\Theta_0})} - f^2(\Theta_0).$$

Here $\|\rho\|_{\mathcal{H}(G_{A\Theta_0})}$ is the norm of the functional $\rho(\cdot)$ which is defined on the whole space $\mathcal{H}(K)$ and is identical on the subspace $\mathcal{M}(A)$ with $f(\cdot)$. With respect to Remark 6.4.2 and Lemma 2.1.29, the dispersion $\mathcal{D}_{\Theta_0}(\hat f(\eta))$ reads

$$\mathcal{D}_{\Theta_0}(\hat f(\eta)) = \sum_{i=0}^{m}\binom{m}{i}^2 i!\,\Big\|\Big\langle K^{m\otimes}\frac{1}{m!}\sum_{\sigma}\big[(A')^-p_{\sigma_1}\big]\otimes\cdots\otimes\big[(A')^-p_{\sigma_m}\big],\ (A\Theta_0)^{(m-i)\otimes}\Big\rangle_{[\mathcal{H}(K)]^{(m-i)\otimes}}\Big\|^2_{[\mathcal{H}(K)]^{i\otimes}} - f^2(\Theta_0) =$$

$$= \sum_{i=1}^{m}\binom{m}{i}^2 i!\,\Big\|\,K^{i\otimes}\frac{1}{m!}\sum_{\sigma}\big[(A')^-\otimes\cdots\otimes(A')^-\big]\,\big\langle p_{\sigma_1}\otimes\cdots\otimes p_{\sigma_{m-i}},\ \Theta_0^{(m-i)\otimes}\big\rangle\, p_{\sigma_{m-i+1}}\otimes\cdots\otimes p_{\sigma_m}\Big\|^2_{[\mathcal{H}(K)]^{i\otimes}},$$

where $\sigma$ runs over all permutations of the indices $1, \ldots, m$ (the term with $i = 0$ equals $f^2(\Theta_0)$ and cancels).

To minimize this dispersion means to minimize each of its summands. Consider the $i$-th of them, and seek a g-inverse $(A')^-$ of the matrix $A'$ such that its $K^{i\otimes}$-norm (or seminorm) would be minimal.

In accordance with Lemmas 2.1.24 and 2.1.27, the matrix

$$\frac{1}{m!}\big[(A')^-\otimes\cdots\otimes(A')^-;\ \ldots;\ (A')^-\otimes\cdots\otimes(A')^-\big]$$

is a g-inverse of the matrix $[(A')^{i\otimes};\ (A')^{i\otimes};\ \ldots;\ (A')^{i\otimes}]'$. The $K^{i\otimes}$-norm of the $i$-th summand will be minimal for a g-inverse of the form

$$\begin{pmatrix}(A')^{i\otimes}\\ \vdots\\ (A')^{i\otimes}\end{pmatrix}^-_{m(K^{i\otimes})}$$

(see Definition 2.1.2). In accordance with Lemmas 2.1.12 and 2.1.15, the $K^{i\otimes}$-norm of the $i$-th summand is always the same (minimal) for various choices of this g-inverse. If, respecting Lemmas 2.1.24 and 2.1.27, we choose for each $i = 1, \ldots, m$

$$\begin{pmatrix}(A')^{i\otimes}\\ \vdots\\ (A')^{i\otimes}\end{pmatrix}^-_{m(K^{i\otimes})} = \frac{1}{m!}\Big[\big[(A')^-_{m(K)}\big]^{i\otimes};\ \ldots;\ \big[(A')^-_{m(K)}\big]^{i\otimes}\Big],$$

we obtain the minimal dispersion $\mathcal{D}_{\Theta_0}(\widehat{f(\Theta)})$ of the estimator

$$\widehat{f(\Theta)} = h_m\big(p_1'[(A')^-_{m(K)}]'\eta,\ \ldots,\ p_m'[(A')^-_{m(K)}]'\eta\big).$$

Since the choice of the estimator with minimal dispersion has been independent of the value of the parameter $\Theta_0$, the estimator of the functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ obtained (where $p_i \in \mathcal{M}(A')$, $i = 1, \ldots, m$) is uniformly the best.

Remark 6.4.3. So far, functionals $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$ such that $\forall\{i = 1, \ldots, m\}\ p_i \in \mathcal{M}(A')$ have been taken into account. In accordance with Theorems 6.4.7 and 6.4.9, the functional $f(\cdot)$ is not unbiasedly estimable if there exists an index $i_0 \in \{1, \ldots, m\}$ such that $p_{i_0} \notin \mathcal{M}(A')$. As in Section 5.2 (Remark 5.2.5 and Theorem 5.2.8), it is possible to determine an analogy of the l-minimal biased estimator of the functional $f(\cdot)$. This is the subject of the next theorem.

Theorem 6.4.12. Let, within the model

$$\eta \sim N(A\Theta, K),\qquad \mathcal{M}(A) \subset \mathcal{M}(K),\qquad \Theta \in \mathbb{R}^k,$$

for the functional $f(\Theta) = \langle p_1\otimes\cdots\otimes p_m, \Theta^{m\otimes}\rangle$, $\Theta \in \mathbb{R}^k$, there exist at least one index $i_0 \in \{1, \ldots, m\}$ such that $p_{i_0} \notin \mathcal{M}(A')$. Then the statistic

$$h_m\big(p_1'[(A')^-_{l(I),m(K)}]'\eta,\ \ldots,\ p_m'[(A')^-_{l(I),m(K)}]'\eta\big)$$

minimizes the maximal value of the bias $|E_{A\Theta}[\widehat{f(\Theta)}] - f(\Theta)|$ and the dispersion $\mathcal{D}_{A\Theta}[\widehat{f(\Theta)}]$ provided the norm $\|\Theta\|_I$ of the vector parameter $\Theta$ is fixed.

Proof. As a consequence of the isomorphism between the spaces $L^2[\mathcal{B}(\mathcal{H})]$ and $\exp\odot\,\mathcal{H}(K)$, the mean value of the random variable

$$h_m\big(p_1'[(A')^-_{l(I),m(K)}]'\eta,\ \ldots,\ p_m'[(A')^-_{l(I),m(K)}]'\eta\big)$$

is

$$E_{A\Theta}\{h_m(p_1'[(A')^-_{l(I),m(K)}]'\eta, \ldots, p_m'[(A')^-_{l(I),m(K)}]'\eta)\} = \big\langle h_m(\cdot),\ \mathrm{d}P_{A\Theta}/\mathrm{d}P_0(\cdot)\big\rangle_{L^2(\mathcal{B}(\mathcal{H}))} =$$

$$= \big\langle[K(A')^-_{l(I),m(K)}]^{m\otimes}(p_1\otimes\cdots\otimes p_m),\ (A\Theta)^{m\otimes}\big\rangle_{[\mathcal{H}(K)]^{m\otimes}} = \big\langle[A'(A')^-_{l(I),m(K)}]^{m\otimes}(p_1\otimes\cdots\otimes p_m),\ \Theta^{m\otimes}\big\rangle.$$

Furthermore,

$$\big|E_{A\Theta}\{h_m(p_1'[(A')^-_{l(I),m(K)}]'\eta, \ldots, p_m'[(A')^-_{l(I),m(K)}]'\eta)\} - f(\Theta)\big| =$$

$$= \big|\big\langle[A'(A')^-_{l(I),m(K)}]^{m\otimes}(p_1\otimes\cdots\otimes p_m),\ \Theta^{m\otimes}\big\rangle - \langle p_1\otimes\cdots\otimes p_m,\ \Theta^{m\otimes}\rangle\big| =$$

$$= \big|\big\langle[P_{\mathcal{M}(A')}]^{m\otimes}(p_1\otimes\cdots\otimes p_m) - (p_1\otimes\cdots\otimes p_m),\ \Theta^{m\otimes}\big\rangle\big| =$$

$$= \min\big\{|\langle(q_1\otimes\cdots\otimes q_m) - (p_1\otimes\cdots\otimes p_m),\ \Theta^{m\otimes}\rangle|:\ q_i \in \mathcal{M}(A'),\ i = 1, \ldots, m\big\}.$$

The last equality proves the first part of the assertion, regarding the bias. With respect to the dispersion, the proof is analogous to that of the preceding theorem (see also the l-minimal biased best estimator in Section 5.2).

Example 6.4.1. Let

$$Y = \Theta_1 x + \Theta_2;\qquad Z = \Theta_3 x^2 + \Theta_4 x + \Theta_5.$$

The aim of an experiment is to verify the hypothesis that there exists a value $x_0$ such that

$$0 = \Theta_1 x_0 + \Theta_2 = \Theta_3 x_0^2 + \Theta_4 x_0 + \Theta_5.$$

For this purpose, values of the quantities $Y$ and $Z$ at the points $x_1, x_2, x_3$ were measured in such a way that the measurement results $y_1, y_2, y_3, z_1, z_2, z_3$ can be considered to be a realization of the random vector $\eta \sim N_6(A\Theta, \sigma^2 I)$, where

$$A = \begin{pmatrix} x_1, & 1, & 0, & 0, & 0\\ x_2, & 1, & 0, & 0, & 0\\ x_3, & 1, & 0, & 0, & 0\\ 0, & 0, & x_1^2, & x_1, & 1\\ 0, & 0, & x_2^2, & x_2, & 1\\ 0, & 0, & x_3^2, & x_3, & 1 \end{pmatrix},$$

$\Theta = (\Theta_1, \Theta_2, \ldots, \Theta_5)'$, $\sigma^2$ is a known measurement dispersion and $I$ is an identity matrix. If the hypothesis is true, the resultant of the polynomials mentioned has to be zero, i.e.

$$f(\Theta) = \det\begin{pmatrix}\Theta_1, & \Theta_2, & 0\\ 0, & \Theta_1, & \Theta_2\\ \Theta_3, & \Theta_4, & \Theta_5\end{pmatrix} = \Theta_1^2\Theta_5 + \Theta_2^2\Theta_3 - \Theta_1\Theta_2\Theta_4 = 0.$$

For solving this problem, it is necessary to construct an estimator $\widehat{f(\Theta)}$ of the function $f(\cdot)$ and to determine its dispersion. If the hypothesis really is true, the realization of the estimator must not differ essentially from zero. For evaluating the significance of the difference from zero obtained, the value $\sqrt{\mathcal{D}(\widehat{f(\Theta)})}$ is used (it is necessary to note that so far, from the point of view of routine application, a suitable test for verifying the hypothesis mentioned has not been derived; it requires knowledge of the probability distribution of the estimating statistic used).

In our case, the estimated function is

$$f(\Theta) = \langle p_1\otimes p_2\otimes p_3,\ \Theta^{3\otimes}\rangle + \langle p_7\otimes p_8\otimes p_9,\ \Theta^{3\otimes}\rangle - \langle p_4\otimes p_5\otimes p_6,\ \Theta^{3\otimes}\rangle,$$

where

$$p_1' = p_2' = p_4' = (1, 0, 0, 0, 0),\qquad p_5' = p_7' = p_8' = (0, 1, 0, 0, 0),$$
$$p_3' = (0, 0, 0, 0, 1),\qquad p_6' = (0, 0, 0, 1, 0),\qquad p_9' = (0, 0, 1, 0, 0).$$

The best unbiased estimator of the function $f(\cdot)$, in accordance with Theorem 6.4.11, is

$$\widehat{f(\Theta)} = h_3\big(p_1'[(A')^-_{m(I)}]'\eta,\ p_2'[(A')^-_{m(I)}]'\eta,\ p_3'[(A')^-_{m(I)}]'\eta\big) + h_3\big(p_7'[(A')^-_{m(I)}]'\eta,\ p_8'[(A')^-_{m(I)}]'\eta,\ p_9'[(A')^-_{m(I)}]'\eta\big) -$$
$$-\ h_3\big(p_4'[(A')^-_{m(I)}]'\eta,\ p_5'[(A')^-_{m(I)}]'\eta,\ p_6'[(A')^-_{m(I)}]'\eta\big).$$

After substituting $(A'A)^{-1}A' = CA'$ for $[(A')^-_{m(I)}]'$ and

$$u_1u_2u_3 - u_1\,\mathrm{cov}(u_2, u_3) - u_2\,\mathrm{cov}(u_1, u_3) - u_3\,\mathrm{cov}(u_1, u_2)$$

for the polynomial $h_3(u_1, u_2, u_3)$ (see Lemma 2.2.8), we get

$$\widehat{f(\Theta)} = p_1'CA'\eta\;p_2'CA'\eta\;p_3'CA'\eta - \sigma^2(p_1'CA'\eta\;p_2'Cp_3 + p_2'CA'\eta\;p_1'Cp_3 + p_3'CA'\eta\;p_1'Cp_2) +$$
$$+\ p_7'CA'\eta\;p_8'CA'\eta\;p_9'CA'\eta - \sigma^2(p_7'CA'\eta\;p_8'Cp_9 + p_8'CA'\eta\;p_7'Cp_9 + p_9'CA'\eta\;p_7'Cp_8) -$$
$$-\ p_4'CA'\eta\;p_5'CA'\eta\;p_6'CA'\eta + \sigma^2(p_4'CA'\eta\;p_5'Cp_6 + p_5'CA'\eta\;p_4'Cp_6 + p_6'CA'\eta\;p_4'Cp_5),$$

or

$$\widehat{f(\Theta)} = \hat\Theta_1^2\hat\Theta_5 - \sigma^2(2\hat\Theta_1\{C\}_{15} + \hat\Theta_5\{C\}_{11}) + \hat\Theta_2^2\hat\Theta_3 - \sigma^2(2\hat\Theta_2\{C\}_{23} + \hat\Theta_3\{C\}_{22}) -$$
$$-\ \hat\Theta_1\hat\Theta_2\hat\Theta_4 + \sigma^2(\hat\Theta_1\{C\}_{24} + \hat\Theta_2\{C\}_{14} + \hat\Theta_4\{C\}_{12}).$$

Here the relations

$$\mathrm{cov}(p_i'CA'\eta,\ p_j'CA'\eta) = \sigma^2 p_i'Cp_j,$$
$$p_1'CA'\eta = p_2'CA'\eta = p_4'CA'\eta = \hat\Theta_1,\qquad p_5'CA'\eta = p_7'CA'\eta = p_8'CA'\eta = \hat\Theta_2,$$
$$p_3'CA'\eta = \hat\Theta_5,\qquad p_6'CA'\eta = \hat\Theta_4,\qquad p_9'CA'\eta = \hat\Theta_3\qquad\text{and}\qquad \mathrm{cov}(\hat\Theta_i, \hat\Theta_j) = \sigma^2\{C\}_{ij}$$

were used.

The dispersion $\mathcal{D}(\widehat{f(\Theta)})$ of the statistic $\widehat{f(\Theta)}$ can be determined using Remark 6.4.2.

For further detail see references [122, 123, 128] and also [87, 92, 93]. Another important problem of non-linearity is studied by Pâzman [94,95].

6.5 An estimated covariance matrix in estimating the first order parameter

Within the universal model $(\eta, A\Theta, \Sigma)$, $\Theta \in \mathbb{R}^k$ (Sections 5.2 and 6.3), the BLUE $p'[(A')^-_{m(\Sigma)}]'\eta$ of an unbiasedly estimable linear function $g(\Theta) = p'\Theta$, $\Theta \in \mathbb{R}^k$ ($p \in \mathcal{M}(A')$), in general depends essentially on the covariance matrix $\Sigma$. If $\Sigma$ is a priori unknown, three typical situations occur:

(a) $\Sigma$ is known to belong to the class $\Sigma^*$ from Theorem 5.7.1;
(b) $\Sigma$ is known to be of the form $\Sigma = \lambda I + A\Gamma A' + Z\Delta Z'$, where the number $\lambda$ and the matrices $\Gamma$ and $\Delta$ are unknown (Theorem 5.7.3);
(c) no a priori information on the matrix $\Sigma$ is at our disposal.

If in (a) the function $g(\cdot)$ fulfils the conditions of Theorem 5.7.2, then its UBLUE is $p'[(A')^-_{m(\Sigma_0)}]'\eta$, where $\Sigma_0$ is an arbitrary element of the class $\Sigma^*$. If the condition of Theorem 5.7.2 is not satisfied, see Remark 5.7.1.

In (b) the UBLUE ("uniformly" with respect to the class $\{\Sigma: \Sigma = \lambda I + A\Gamma A' + Z\Delta Z';\ \lambda, \Gamma, \Delta$ arbitrary, $\Sigma$ p.s.d.$\}$) has the form $p'(A'A)^-A'\eta$ (Theorem 5.7.3).

If the matrix $\Sigma$ in (c) is replaced by another symmetric and positive definite matrix $W$, then the estimator of the form $p'[(A')^-_{m(W)}]'\eta$ is unbiased, but its dispersion $p'[(A')^-_{m(W)}]'\Sigma(A')^-_{m(W)}p$ can be large in comparison with the minimal dispersion $p'[(A')^-_{m(\Sigma)}]'\Sigma(A')^-_{m(\Sigma)}p$. Therefore we try, if possible, to estimate the covariance matrix. Most frequently, this can be carried out by means of replicated realizations of the observation vector $\eta$.

Let $\eta_1, \ldots, \eta_m$ be stochastically independent random vectors having the same normal probability distribution, $\eta_i \sim N(A\Theta, W)$, $i = 1, \ldots, m$. Then

$$\bar\eta = (1/m)\sum_{i=1}^{m}\eta_i \sim N_n(A\Theta,\ W/m = \Sigma)$$

and the matrix

$$S = [1/(m-1)]\sum_{i=1}^{m}(\eta_i - \bar\eta)(\eta_i - \bar\eta)'$$

has the Wishart distribution with $m - 1$ degrees of freedom (for more detail of the Wishart probability distribution see references [3, 102, 114, 119]), $S \sim W_n(m-1,\ W/(m-1))$. The unbiased estimator $\hat\Sigma$ of the matrix $\Sigma$ based on the Wishart matrix $S$ is

$$\hat\Sigma = (1/m)S,\qquad (m-1)\hat\Sigma \sim W_n(m-1, \Sigma).$$

In this case the estimator $p'[(A')^-_{m(\hat\Sigma)}]'\bar\eta$ can be used within the universal model $(\bar\eta, A\Theta, \Sigma)$ for estimating the function $g(\Theta) = p'\Theta$, $\Theta \in \mathbb{R}^k$ ($p \in \mathcal{M}(A')$). What are the statistical properties of such an estimator?
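Purely as a numerical illustration of the construction just described, the following sketch forms $\bar\eta$, $S$ and $\hat\Sigma = S/m$ from replicated observations and plugs $\hat\Sigma$ into the estimator of $p'\Theta$. NumPy is assumed; the design matrix, the per-replication covariance $W$, the vector $p$ and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 30, 4, 2
A = rng.normal(size=(n, k))
theta = np.array([1.0, -2.0])
W = np.diag([1.0, 2.0, 0.5, 1.5])                        # per-replication covariance

etas = rng.multivariate_normal(A @ theta, W, size=m)     # m independent realizations
eta_bar = etas.mean(axis=0)
S = np.cov(etas, rowvar=False, ddof=1)                   # [1/(m-1)] sum (eta_i - eta_bar)(...)'
Sigma_hat = S / m                                        # unbiased for Sigma = W/m

p = np.array([1.0, 0.0])
Si = np.linalg.inv(Sigma_hat)
estimate = p @ np.linalg.solve(A.T @ Si @ A, A.T @ Si @ eta_bar)
print(estimate)   # plug-in estimator of p'Theta with the estimated covariance matrix
```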

C. R. Rao [103] solved this problem for Model two. In what follows, a certain generalization of his solution is shown which enables us to answer the above-mentioned question for Models three, four, five and the universal model as well.

First, some necessary statements from multivariate statistical analysis are introduced in Lemmas 6.5.1, 6.5.2 and 6.5.3.

Lemma 6.5.1. If $W \sim W_n(f, \Sigma)$ and $f > R(\Sigma)$, then $\mathcal{M}(W) = \mathcal{M}(\Sigma)$ with probability one. Consequently $R(W) = R(\Sigma)$.


Proof. See Theorem 3.2.1 and Remark 3.2.1 in [114].

Lemma 6.5.2. Let $Z_1, \ldots, Z_m$ be stochastically independent random vectors, $Z_i \sim N_n(Aw_i, \Sigma)$, $i = 1, \ldots, m$, where $A$ is a matrix of the type $n \times t$ and $w_i$, $i = 1, \ldots, m$, is a $t$-dimensional vector. If

$$H = \sum_{i=1}^{m}w_iw_i',\qquad R(H) = r,$$

then

$$\sum_{i=1}^{m}Z_iZ_i' - \Big(\sum_{i=1}^{m}Z_iw_i'\Big)H^-\Big(\sum_{i=1}^{m}Z_iw_i'\Big)' = \sum_{\alpha=1}^{m-r}V_\alpha V_\alpha',$$

where $V_1, \ldots, V_{m-r}$ are stochastically independent random vectors with the same distribution $N_n(0, \Sigma)$, and the matrices $\big(\sum_{i=1}^{m}Z_iw_i'\big)H^-\big(\sum_{i=1}^{m}Z_iw_i'\big)'$ and $(V_1, \ldots, V_{m-r})$ are stochastically independent.

Proof. This follows, with little modification, from the proof of Theorem 4.3.2 in [3].

Lemma 6.5.3. Let $\eta \sim N_n(\mu, T)$, $W \sim W_n(f, T)$ and let $\eta$ and $W$ be stochastically independent. Then the random variable

$$T^2 = (\eta - \mu)'W^-(\eta - \mu)$$

does not depend on the choice of the g-inverse $W^-$ of the matrix $W$ and has the same distribution as

$$\{R(T)/[f - R(T) + 1]\}\,F_{R(T),\,f-R(T)+1},$$

where $F_{R(T),\,f-R(T)+1}$ is the Fisher-Snedecor random variable with $R(T)$ and $f - R(T) + 1$ degrees of freedom.

Proof. See Theorem 1 in Streit [119].
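The statement of the lemma can also be checked by simulation. The following sketch (NumPy and SciPy assumed; the dimension, the matrix $T$ and the number of replications are hypothetical) compares an empirical upper quantile of $T^2$ with the corresponding quantile of the scaled F distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, f, N = 3, 12, 20000
T = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])                  # regular, so R(T) = n
mu = np.zeros(n)

t2 = np.empty(N)
for i in range(N):
    eta = rng.multivariate_normal(mu, T)
    X = rng.multivariate_normal(mu, T, size=f)   # W = X'X ~ W_n(f, T)
    W = X.T @ X
    t2[i] = (eta - mu) @ np.linalg.solve(W, eta - mu)

scale = n / (f - n + 1)                          # R(T)/[f - R(T) + 1]
print(np.quantile(t2, 0.95))
print(scale * stats.f.ppf(0.95, n, f - n + 1))   # the two values should be close
```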

Within the universal model $(\eta, A\Theta, \Sigma)$, $\Theta \in \mathbb{R}^k$, let the symbol $Z$ denote an $n \times s$ matrix such that $s \ge n - R(A)$, $\mathcal{M}(Z) = \mathrm{Ker}(A')$. Then the random vector $T_2 = Z'\eta$ characterizes the class of all the linear unbiased estimators of the function $g_0(\Theta) = 0$, $\Theta \in \mathbb{R}^k$. Let $T_1 = AA^-\eta$ be an arbitrary linear unbiased estimator of the vector function $g(\Theta) = A\Theta$, $\Theta \in \mathbb{R}^k$; then obviously

$$\begin{pmatrix}T_1\\ T_2\end{pmatrix} \sim N_{n+s}\left(\begin{pmatrix}A\Theta\\ 0\end{pmatrix},\ \begin{pmatrix}\Lambda_{11}, & \Lambda_{12}\\ \Lambda_{21}, & \Lambda_{22}\end{pmatrix}\right),$$

$$\Lambda_{11} = AA^-\Sigma(A^-)'A',\qquad \Lambda_{12} = AA^-\Sigma Z,\qquad \Lambda_{21} = Z'\Sigma(A^-)'A',\qquad \Lambda_{22} = Z'\Sigma Z.$$


Lemma 6.5.4. The statistic $\tau = T_1 - \Lambda_{12}\Lambda_{22}^- T_2$ is the BLUE of the function $g(\Theta) = A\Theta$, $\Theta \in \mathbb{R}^k$; its covariance matrix is $\Sigma_\tau = \Lambda_{11.2} = \Lambda_{11} - \Lambda_{12}\Lambda_{22}^-\Lambda_{21}$. In other words, $\tau = A[(A')^-_{m(\Sigma)}]'\eta$ with probability one and $\Lambda_{11.2} = A[(A')^-_{m(\Sigma)}]'\Sigma$.

Proof. $\tau$ and $\Lambda_{11.2}$ are independent of the choice of the g-inverse $\Lambda_{22}^-$ because $P\{T_2 \in \mathcal{M}(\Lambda_{22})\} = 1$ and $\mathcal{M}(\Lambda_{12}') = \mathcal{M}(\Lambda_{21}) \subset \mathcal{M}(\Lambda_{22})$. Obviously also

$$\forall\{\Theta\in\mathbb{R}^k\}\quad E_\Theta(\tau) = A\Theta.$$

By Theorem 3.1, $\tau$ is the BLUE of its mean value since $\mathrm{cov}(\tau, Z'\eta) = 0$. Since $A[(A')^-_{m(\Sigma)}]'\eta$ is also the BLUE of the function $g(\cdot)$, by Theorem 5.2.4 the equality $\tau = A[(A')^-_{m(\Sigma)}]'\eta$ must hold with probability one. The rest of the proof follows immediately from Theorem 5.2.5 and Lemma 2.2.12.

We now use the following notation: a right upper index $(p)$ of a random vector or a matrix means that it is conditioned by the random matrix $(T_2, Z'\hat\Sigma Z)$, where $f\hat\Sigma \sim W_n(f, \Sigma)$, $f > R(\Sigma)$, and the vector $\eta$ and the matrix $\hat\Sigma$ are stochastically independent. Furthermore,

$$\hat\Lambda_{11} = AA^-\hat\Sigma(A^-)'A',\qquad \hat\Lambda_{12} = AA^-\hat\Sigma Z,\qquad \hat\Lambda_{21} = Z'\hat\Sigma(A^-)'A',\qquad \hat\Lambda_{22} = Z'\hat\Sigma Z,$$

$$\hat\Lambda_{11.2} = \hat\Lambda_{11} - \hat\Lambda_{12}\hat\Lambda_{22}^-\hat\Lambda_{21}$$

and

$$\hat\tau = T_1 - \hat\Lambda_{12}\hat\Lambda_{22}^- T_2.$$

Remark 6.5.1. By Lemma 6.5.4,

$$\hat\tau = A[(A')^-_{m(\hat\Sigma)}]'\eta$$

with probability one.

Theorem 6.5.1. The random vector $\hat\tau^* = T_1^{(p)} - \hat\Lambda_{12}\hat\Lambda_{22}^- T_2$ and the random matrix $\hat\Lambda_{11.2} = \hat\Lambda_{11} - \hat\Lambda_{12}\hat\Lambda_{22}^-\hat\Lambda_{21}$ are stochastically independent and

$$(*)\qquad \hat\tau^* \sim N_n\Big(A\Theta,\ \Big(1 + \tfrac{1}{f}\,T_2'(Z'\hat\Sigma Z)^- T_2\Big)\Lambda_{11.2}\Big),$$

$$(**)\qquad f\hat\Lambda_{11.2} \sim W_n\big(f - R(\Lambda_{22}),\ \Lambda_{11.2}\big);$$

these expressions are independent of the choice of the g-inverses used.

Proof. The independence of the choice of g-inverses follows from Lemma 6.5.1 and the fact that $T_2 \in \mathcal{M}(Z'\hat\Sigma Z)$ with probability one.


Let there be

$$\begin{pmatrix}U_\alpha^1\\ U_\alpha^2\end{pmatrix} = \begin{pmatrix}AA^-U_\alpha\\ Z'U_\alpha\end{pmatrix},\qquad \alpha = 1, \ldots, f,$$

where $f\hat\Sigma = \sum_{\alpha=1}^{f}U_\alpha U_\alpha'$, $U_\alpha \sim N_n(0, \Sigma)$, $\alpha = 1, \ldots, f$, and $U_1, \ldots, U_f, \eta$ are stochastically independent. Then $f\hat\Lambda_{ij} = \sum_{\alpha=1}^{f}U_\alpha^iU_\alpha^{j\prime}$, $i, j = 1, 2$, and

$$\hat\tau^* = T_1^{(p)} - \sum_{\alpha=1}^{f}U_\alpha^{1(p)}U_\alpha^{2\prime}\Big(\sum_{\beta=1}^{f}U_\beta^2U_\beta^{2\prime}\Big)^- T_2,\qquad\text{where}$$

$$U_\alpha^{1(p)} \sim N_n\big(\Lambda_{12}\Lambda_{22}^- U_\alpha^2,\ \Lambda_{11.2}\big),\qquad T_1^{(p)} \sim N_n\big(A\Theta + \Lambda_{12}\Lambda_{22}^- T_2,\ \Lambda_{11.2}\big).$$

From this it follows that

$$E(\hat\tau^*) = A\Theta + \Lambda_{12}\Lambda_{22}^- T_2 - \sum_{\alpha=1}^{f}\Lambda_{12}\Lambda_{22}^- U_\alpha^2U_\alpha^{2\prime}\Big(\sum_{\beta=1}^{f}U_\beta^2U_\beta^{2\prime}\Big)^- T_2 = A\Theta,$$

and

$$\mathrm{var}(\hat\tau^*) = \Lambda_{11.2} + \Big(\sum_{\alpha=1}^{f}\Big[U_\alpha^{2\prime}\Big(\sum_{\beta=1}^{f}U_\beta^2U_\beta^{2\prime}\Big)^- T_2\Big]^2\Big)\Lambda_{11.2} = \Big(1 + \frac{1}{f}\,T_2'(Z'\hat\Sigma Z)^- T_2\Big)\Lambda_{11.2},$$

which implies $(*)$. Furthermore,

$$f\hat\Lambda_{11.2} = \sum_{\alpha=1}^{f}U_\alpha^{1(p)}U_\alpha^{1(p)\prime} - \sum_{\alpha=1}^{f}U_\alpha^{1(p)}U_\alpha^{2\prime}\Big(\sum_{\beta=1}^{f}U_\beta^2U_\beta^{2\prime}\Big)^-\sum_{\chi=1}^{f}U_\chi^2U_\chi^{1(p)\prime}.$$

Because $U_\alpha^{1(p)} \sim N_n(\Lambda_{12}\Lambda_{22}^- U_\alpha^2, \Lambda_{11.2})$, by Lemma 6.5.2 we can write

$$f\hat\Lambda_{11.2} = \sum_{\alpha=1}^{f-R(\Lambda_{22})}S_\alpha S_\alpha',$$

where $S_1, \ldots, S_{f-R(\Lambda_{22})}$ are stochastically independent random vectors with the same probability distribution $N_n(0, \Lambda_{11.2})$. This implies $(**)$.

The stochastic independence of $\hat\tau^*$ and $\hat\Lambda_{11.2}$ follows also from Lemma 6.5.2. The expression $\sum_{\alpha=1}^{f-R(\Lambda_{22})}S_\alpha S_\alpha'$ does not depend on the second term of the expression for $\hat\tau^*$, and the independence of the first term is an obvious consequence of our assumption (the independence of $\eta$ and $\hat\Sigma$).

Remark 6.5.2. In the course of the proof, conditioning by the matrix $(T_2, Z'(U_1, \ldots, U_f))$ was used. Since in the resulting conditioned distributions the matrix $(T_2, \hat\Lambda_{22})$ appears, the latter was used in the formulation of the theorem.

Lemma 6.5.5. With the notation used in the preceding part of this Section, the following relations are true:

(a) $\eta'Z(Z'\hat\Sigma Z)^-Z'\eta = \hat v'\hat\Sigma^-\hat v$, where $\hat v = \eta - A\hat\Theta$, $A\hat\Theta = A[(A')^-_{m(\hat\Sigma)}]'\eta$;

(b) $R(Z'\hat\Sigma Z) = R(\Sigma) - R[\Sigma(\Sigma + AA')^-A]$;

(c) the probability distribution of the random variable $\hat v'\hat\Sigma^-\hat v$ is the same as that of the random variable

$$\{f\,R(Z'\Sigma Z)/[f - R(Z'\Sigma Z) + 1]\}\,F_{R(Z'\Sigma Z),\,f-R(Z'\Sigma Z)+1}.$$

Proof. (a) With respect to Lemma 2.1.9, the matrix $Z$ can be expressed in the form $Z = I - (A')^-_{m(\hat\Sigma)}A'$, thus $Z'\eta = \hat v$. Using the identity $[(A')^-_{m(\hat\Sigma)}A']'\hat\Sigma = \hat\Sigma(A')^-_{m(\hat\Sigma)}A'$ (Lemma 2.1.12), we get

$$Z'\hat\Sigma Z = \{I - A[(A')^-_{m(\hat\Sigma)}]'\}\hat\Sigma.$$

It can easily be verified that

$$\big(\{I - A[(A')^-_{m(\hat\Sigma)}]'\}\hat\Sigma\big)^- = \hat\Sigma^-\{I - A[(A')^-_{m(\hat\Sigma)}]'\},$$

hence

$$\eta'Z(Z'\hat\Sigma Z)^-Z'\eta = (\eta - A\hat\Theta)'\hat\Sigma^-\{I - A[(A')^-_{m(\hat\Sigma)}]'\}(\eta - A\hat\Theta).$$

Because $\{I - A[(A')^-_{m(\hat\Sigma)}]'\}(\eta - A\hat\Theta) = \eta - A\hat\Theta$,

the assertion is proved.

(b) The matrix $\Sigma$ is positive semidefinite, and therefore there exists an $n \times R(\Sigma)$ matrix $J$ such that $\Sigma = JJ'$. Since $R(M) = R(MM')$ for every matrix $M$, we have

$$R(\Lambda) = R\left[\begin{pmatrix}AA^-\\ Z'\end{pmatrix}JJ'\begin{pmatrix}AA^-\\ Z'\end{pmatrix}'\right] = R\left[\begin{pmatrix}AA^-\\ Z'\end{pmatrix}J\right].$$

Using Lemma 7.1.2 from [104] (see also the end of the proof of Theorem 5.4.5), we obtain

$$R\begin{pmatrix}AA^-\\ Z'\end{pmatrix} = R[AA^-\,\mathrm{Ker}(Z')] + R(Z') = R(AA^-A) + R(Z) = R(A) + n - R(A) = n,$$

by which the matrix $\begin{pmatrix}AA^-\\ Z'\end{pmatrix}$ has full column rank. This fact implies

$$R\left[\begin{pmatrix}AA^-\\ Z'\end{pmatrix}J\right] = R(J) = R(\Sigma) = R(\Lambda).$$

The identity $R(\Lambda_{11.2}) = R[\Sigma(\Sigma + AA')^-A]$ follows from the identity

$$\Lambda_{11.2} = A[(A')^-_{m(\Sigma)}]'\Sigma(A')^-_{m(\Sigma)}A' = \Sigma(A')^-_{m(\Sigma)}A'$$

(this is a consequence of Lemmas 6.5.4 and 2.1.12), and from Theorem 2.1 in [127] (see also Theorem 6.3.3), which states the identity

$$R[\Sigma(A')^-_{m(\Sigma)}A'] = R[\Sigma(\Sigma + AA')^-A].$$

The identity

$$\begin{pmatrix}I, & -\Lambda_{12}\Lambda_{22}^-\\ 0, & I\end{pmatrix}\begin{pmatrix}\Lambda_{11}, & \Lambda_{12}\\ \Lambda_{21}, & \Lambda_{22}\end{pmatrix}\begin{pmatrix}I, & 0\\ -\Lambda_{22}^-\Lambda_{21}, & I\end{pmatrix} = \begin{pmatrix}\Lambda_{11.2}, & 0\\ 0, & \Lambda_{22}\end{pmatrix}$$

implies

$$R(\Lambda_{22}) = R(\Lambda) - R(\Lambda_{11.2}) = R(\Sigma) - R[\Sigma(\Sigma + AA')^-A].$$

(c) Let $Z$ be an arbitrary matrix such that $\mathrm{Ker}(A') = \mathcal{M}(Z)$; then the random variables

$$Z'\eta \sim N_s(0, \Lambda_{22})\qquad\text{and}\qquad fZ'\hat\Sigma Z \sim W_s(f, \Lambda_{22})$$

are stochastically independent. With respect to Lemma 6.5.3, the probability distribution of the random variable

$$\eta'Z(fZ'\hat\Sigma Z)^-Z'\eta = \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v$$

is the same as that of the random variable

$$\{R(Z'\Sigma Z)/[f - R(Z'\Sigma Z) + 1]\}\,F_{R(Z'\Sigma Z),\,f-R(Z'\Sigma Z)+1}.$$

In accordance with (b), the rank $R(Z'\Sigma Z)$ does not depend on the choice of $Z$, which implies (c).
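Relation (a) can be verified numerically for a regular $\hat\Sigma$. In the sketch below (NumPy and SciPy assumed; the matrices $A$, $\hat\Sigma$ and the vector $\eta$ are hypothetical), $Z$ is taken as an orthonormal basis of $\mathrm{Ker}(A')$ and both sides of (a) are evaluated.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(3)
n, k = 6, 3
A = rng.normal(size=(n, k))
B = rng.normal(size=(n, n))
Sigma_hat = B @ B.T + np.eye(n)                       # a regular positive definite matrix
eta = rng.normal(size=n)

Z = null_space(A.T)                                   # columns span Ker(A')
lhs = eta @ Z @ np.linalg.solve(Z.T @ Sigma_hat @ Z, Z.T @ eta)

Si = np.linalg.inv(Sigma_hat)
theta_hat = np.linalg.solve(A.T @ Si @ A, A.T @ Si @ eta)   # A Theta-hat = A[(A')^-_{m(Sigma_hat)}]' eta
v_hat = eta - A @ theta_hat
rhs = v_hat @ Si @ v_hat

print(lhs, rhs)   # the two numbers coincide up to rounding error
```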

Theorem 6.5.2. The random variable

$$T^2 = (A\hat\Theta - A\Theta)'\{A[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma\}^-(A\hat\Theta - A\Theta)\Big/\Big(1 + \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v\Big)$$

has the same probability distribution as the random variable

$$\{f\,R[\Sigma(\Sigma + AA')^-A]/[f - R(\Sigma) + 1]\}\,F_{R[\Sigma(\Sigma+AA')^-A],\,f-R(\Sigma)+1}.$$

Proof. By Theorem 6.5.1,

(a) $(\hat\tau^* - A\Theta)\Big/\sqrt{1 + \tfrac{1}{f}\,T_2'(Z'\hat\Sigma Z)^- T_2} \sim N_n(0, \Lambda_{11.2})$ and

(b) $f\hat\Lambda_{11.2} \sim W_n(f - R(\Lambda_{22}), \Lambda_{11.2})$ (the matrix $(T_2, Z'\hat\Sigma Z)$ is fixed), and the random variables given by (a) and (b) are stochastically independent. By Lemma 6.5.3, the random variable

$$(\hat\tau^* - A\Theta)'(\hat\Lambda_{11.2})^-(\hat\tau^* - A\Theta)\Big/\Big[f\Big(1 + \frac{1}{f}\,T_2'(Z'\hat\Sigma Z)^- T_2\Big)\Big]$$

has the same probability distribution as the random variable

$$\{R(\Lambda_{11.2})/[f - R(\Lambda_{22}) - R(\Lambda_{11.2}) + 1]\}\,F_{R(\Lambda_{11.2}),\,f - R(\Lambda_{22}) - R(\Lambda_{11.2}) + 1}.$$

The probability distribution of the last random variable does not depend on the matrix $(T_2, \hat\Lambda_{22})$. By application of Lemma 6.5.4, Remark 6.5.1 and Lemma 6.5.5, the proof is completed.

Corollary 6.5.1. If $F_{R[\hat\Sigma(\hat\Sigma + AA')^-A],\,f-R(\hat\Sigma)+1}(1-\alpha)$ is the $(1-\alpha)$th quantile of the F-probability distribution, then the $(1-\alpha)$ confidence ellipsoid of the vector $A\Theta$ is given by the set

$$\Big\{u \in \mathcal{M}\{A[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma\}:\ (u - A\hat\Theta)'\{A[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma\}^-(u - A\hat\Theta) \le$$

$$\le \{f\,R[\hat\Sigma(\hat\Sigma + AA')^-A]/[f - R(\hat\Sigma) + 1]\}\Big(1 + \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v\Big)F_{R[\hat\Sigma(\hat\Sigma + AA')^-A],\,f-R(\hat\Sigma)+1}(1-\alpha)\Big\}.$$

Corollary 6.5.2. If the function $g(\Theta) = p'\Theta$, $\Theta \in \mathbb{R}^k$, is unbiasedly estimable, i.e. $p \in \mathcal{M}(A')$, then the interval

$$\big[\,p'[(A')^-_{m(\hat\Sigma)}]'\eta - \kappa,\ \ p'[(A')^-_{m(\hat\Sigma)}]'\eta + \kappa\,\big],$$

where

$$\kappa = t_\varphi\Big(1 - \frac{\alpha}{2}\Big)\sqrt{\frac{f}{\varphi}\Big(1 + \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v\Big)\,p'[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma(A')^-_{m(\hat\Sigma)}\,p}\,,$$

$$\varphi = f - \big[R(\hat\Sigma) - R(\hat\Sigma(\hat\Sigma + AA')^-A)\big]$$

and $t_\varphi\big(1 - \tfrac{\alpha}{2}\big)$ is the $\big(1 - \tfrac{\alpha}{2}\big)$th quantile of the Student probability distribution with $\varphi$ degrees of freedom, covers the value $p'\Theta$ with probability $1 - \alpha$.

Proof. The relations

$$\hat\Lambda_{11.2} = A[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma = A[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma(A')^-_{m(\hat\Sigma)}A'$$

and

$$p \in \mathcal{M}(A')\ \Leftrightarrow\ \exists\{u \in \mathbb{R}^n\}\ p = A'u$$

imply

$$u'\hat\Lambda_{11.2}u = p'[(A')^-_{m(\hat\Sigma)}]'\hat\Sigma(A')^-_{m(\hat\Sigma)}p.$$

By Theorem 6.5.1, the random variables

$$u'\hat\tau^* \sim N_1\Big(p'\Theta,\ \Big(1 + \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v\Big)u'\Lambda_{11.2}u\Big)$$

and

$$f\,u'\hat\Lambda_{11.2}u \sim W_1\big(f - R(\Lambda_{22}),\ u'\Lambda_{11.2}u\big) = \chi^2_{f-R(\Lambda_{22})}\,u'\Lambda_{11.2}u$$

are stochastically independent. The symbol $\chi^2_{f-R(\Lambda_{22})}$ denotes a random variable possessing the chi-square probability distribution with $f - R(\Lambda_{22})$ degrees of freedom. By an application of the definition of the Student random variable, and its independence of the conditioning matrix $(T_2, \hat\Lambda_{22})$, the proof is completed.

Theorem 6.5.3. The statistic $A\hat\Theta = A[(A')^-_{m(\hat\Sigma)}]'\eta$ represents an unbiased estimator of the vector function $g(\Theta) = A\Theta$, $\Theta \in \mathbb{R}^k$; its covariance matrix is

$$\mathrm{var}(A\hat\Theta) = A[(A')^-_{m(\Sigma)}]'\Sigma\,(f-1)\big/\big(f - \{R(\Sigma) - R[\Sigma(\Sigma + AA')^-A]\} - 1\big).$$

Proof. The unbiasedness of the estimator $A\hat\Theta$ is an obvious consequence of $(*)$ from Theorem 6.5.1. Using this relation and assertion (c) from Lemma 6.5.5, we get

$$\mathrm{var}(A\hat\Theta) = E\{\mathrm{var}[A\hat\Theta\,|\,(T_2, \hat\Lambda_{22})]\} = E\Big(1 + \frac{1}{f}\,\hat v'\hat\Sigma^-\hat v\Big)\Lambda_{11.2} =$$

$$= A[(A')^-_{m(\Sigma)}]'\Sigma\Big[1 + \frac{\{R(\Sigma) - R[\Sigma(\Sigma + AA')^-A]\}}{f - R(\Sigma) + R[\Sigma(\Sigma + AA')^-A] + 1}\,E\big(F_{R(\Sigma)-R[\Sigma(\Sigma+AA')^-A],\,f-R(\Sigma)+R[\Sigma(\Sigma+AA')^-A]+1}\big)\Big];$$

thus the proof is completed because

$$E(F_{f_1,f_2}) = f_2/(f_2 - 2)$$

(see Relation (16.28) in Kendall and Stuart [55]).


Corollary 6.5.3. Taking account of Section 5.2 and Theorem 5.1.4, we get

Model one:

$$\hat\Theta = (1'\hat\Sigma^{-1}1)^{-1}1'\hat\Sigma^{-1}\xi,\qquad E(\hat\Theta\,|\,\Theta) = \Theta,\qquad \mathrm{var}(\hat\Theta\,|\,\Sigma) = (1'\Sigma^{-1}1)^{-1}(f-1)/(f-n);$$

Model two:

$$\hat\Theta = (A'\hat\Sigma^{-1}A)^{-1}A'\hat\Sigma^{-1}\xi,\qquad E(\hat\Theta\,|\,\Theta) = \Theta,\ \Theta \in \mathbb{R}^k,$$

$$\mathrm{var}(\hat\Theta\,|\,\Sigma) = (A'\Sigma^{-1}A)^{-1}(f-1)/[f - (n - k) - 1];$$

Model three:

$$\hat\Theta = [I - \hat\Sigma B'(B\hat\Sigma B')^{-1}B]\xi - \hat\Sigma B'(B\hat\Sigma B')^{-1}b,\qquad E(\hat\Theta\,|\,\Theta) = \Theta,\ \Theta \in \{u: b + Bu = 0\},$$

$$\mathrm{var}(\hat\Theta\,|\,\Sigma) = [\Sigma - \Sigma B'(B\Sigma B')^{-1}B\Sigma](f-1)/(f - q - 1);$$

Model four:

$$\begin{pmatrix}\hat\Theta\\ \hat\kappa\end{pmatrix} = \begin{pmatrix}I - \hat\Sigma B_1'Q_{11}B_1\\ -Q_{21}B_1\end{pmatrix}\xi - \begin{pmatrix}\hat\Sigma B_1'Q_{11}\\ Q_{21}\end{pmatrix}b,\qquad \begin{pmatrix}Q_{11}, & Q_{12}\\ Q_{21}, & Q_{22}\end{pmatrix} = \begin{pmatrix}B_1\hat\Sigma B_1', & B_2\\ B_2', & 0\end{pmatrix}^{-1};$$

Model five:

$$\hat\Theta = \{(A'\hat\Sigma^{-1}A)^{-1} - (A'\hat\Sigma^{-1}A)^{-1}B'[B(A'\hat\Sigma^{-1}A)^{-1}B']^{-1}B(A'\hat\Sigma^{-1}A)^{-1}\}A'\hat\Sigma^{-1}\xi -$$
$$-\ (A'\hat\Sigma^{-1}A)^{-1}B'[B(A'\hat\Sigma^{-1}A)^{-1}B']^{-1}b,$$

$$E(\hat\Theta\,|\,\Theta) = \Theta,\qquad \Theta \in \{u: b + Bu = 0\},$$

$$\mathrm{var}(\hat\Theta\,|\,\Sigma) = \{(A'\Sigma^{-1}A)^{-1} - (A'\Sigma^{-1}A)^{-1}B'[B(A'\Sigma^{-1}A)^{-1}B']^{-1}B(A'\Sigma^{-1}A)^{-1}\}(f-1)/[f - (n - k + q) - 1].$$

Example 6.5.1. Consider a regular replicated Model two $(\eta, (1\otimes A)\Theta, (I\otimes\Sigma))$, $\eta = (\eta_1', \ldots, \eta_m')'$, with normally distributed observation vector $\eta$; here $\eta_1, \ldots, \eta_m$ are $n$-dimensional vectors, $\Theta \in \mathbb{R}^k$, $R(A_{n,k}) = k \le n$, $R(\Sigma) = n$, $m > \max\{n - k + 2, n\}$ (the condition following from Theorem 6.5.3). The problem is to estimate the function $g(\Theta) = p'\Theta$, $\Theta \in \mathbb{R}^k$, when no information on the covariance matrix $\Sigma$ is available.

Denote

$$\bar\eta = (1/m)\sum_{j=1}^{m}\eta_j,\qquad S = [1/(m-1)]\sum_{j=1}^{m}(\eta_j - \bar\eta)(\eta_j - \bar\eta)';$$

then

$$\bar\eta \sim N_n(A\Theta,\ W = \Sigma/m)$$

and

$$(m-1)\hat W = [(m-1)/m]S \sim W_n(m-1, W)$$

are stochastically independent (see Theorem 3.3.2 from [3]). Thus, in accordance with the theorems of this Section, the required estimator and its dispersion are:

$$\widehat{p'\Theta} = p'(A'\hat W^{-1}A)^{-1}A'\hat W^{-1}\bar\eta = p'(A'S^{-1}A)^{-1}A'S^{-1}\bar\eta,$$

$$\mathrm{var}(\widehat{p'\Theta}\,|\,\Sigma) = p'(A'\Sigma^{-1}A)^{-1}p\,(m-2)/\{m[m - 2 - (n-k)]\}.$$

The confidence interval of $g(\cdot)$ is $[\widehat{p'\Theta} - \kappa,\ \widehat{p'\Theta} + \kappa]$, where

$$\kappa = t_{m-1-(n-k)}\Big(1 - \frac{\alpha}{2}\Big)\sqrt{(m-1)/[m-1-(n-k)]}\cdot\sqrt{1 + [m/(m-1)]\,\hat v'S^{-1}\hat v}\cdot\sqrt{(1/m)\,p'(A'S^{-1}A)^{-1}p}$$

and

$$\hat v = \bar\eta - A(A'S^{-1}A)^{-1}A'S^{-1}\bar\eta.$$
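A numerical sketch of this example follows. NumPy and SciPy are assumed; the design matrix $A$, the covariance matrix $\Sigma$, the vector $p$, the number of replications $m$ and the confidence level are all hypothetical. The sketch evaluates the estimator, the residual vector $\hat v$ and the confidence interval using the formulas above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k, m, alpha = 5, 3, 40, 0.05                    # m > max{n - k + 2, n}
A = rng.normal(size=(n, k))
theta = np.array([1.0, 0.5, -2.0])
Sigma = np.diag([1.0, 2.0, 0.5, 1.0, 1.5])
p = np.array([1.0, 0.0, 0.0])

etas = rng.multivariate_normal(A @ theta, Sigma, size=m)
eta_bar = etas.mean(axis=0)
S = np.cov(etas, rowvar=False, ddof=1)

Si = np.linalg.inv(S)
G = np.linalg.solve(A.T @ Si @ A, A.T @ Si)        # (A'S^-1 A)^-1 A'S^-1
g_hat = p @ G @ eta_bar                            # estimator of p'Theta
v_hat = eta_bar - A @ (G @ eta_bar)

phi = m - 1 - (n - k)
kappa = (stats.t.ppf(1 - alpha / 2, phi)
         * np.sqrt((m - 1) / phi)
         * np.sqrt(1 + m / (m - 1) * v_hat @ Si @ v_hat)
         * np.sqrt(p @ np.linalg.solve(A.T @ Si @ A, p) / m))
print(g_hat - kappa, g_hat + kappa)                # confidence interval for p'Theta
```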

For further detail the reader is referred to Kubâcek [70] and Rao [103].

6.6 Estimators of second order parameters — variance components

In this Section, as in Section 5.6, the universal model $(\eta, A\Theta, \Sigma = \sum_{i=1}^{p}\vartheta_iV_i)$ is considered, where $\Theta \in \mathbb{R}^k$, $\vartheta = (\vartheta_1, \ldots, \vartheta_p)' \in \underline\vartheta$, $\underline\vartheta^{\,0} \ne \emptyset$, and where the matrix $A$ and the symmetric matrices $V_i$, $i = 1, \ldots, p$, are known. In addition, $\eta$ is assumed to be normally distributed. Some consequences implied by this assumption are given in the following remarks.


Remark 6.6.1. If $\eta \sim N(A\Theta, \Sigma)$, then it can easily be proved by Theorem 6.3.2 that

$$\mathrm{var}(\eta'T\eta\,|\,\Sigma) = 2\,\mathrm{Tr}(T\Sigma T\Sigma) + 4\,\Theta'A'T\Sigma TA\Theta$$

(where $T = T'$).

If moreover $TA = 0$ (invariance), then

$$\mathrm{var}(\eta'T\eta\,|\,\Sigma) = 2\,\mathrm{Tr}(T\Sigma T\Sigma).$$

The proof of Theorem 5.6.2 thus shows that the Rao MINQUE is in fact the $\Sigma_0$-LMVUIE, where $\Sigma_0 = V_1 + \ldots + V_p$. If in Theorem 5.6.2 the matrix $\Sigma_1 = \sum_{i=1}^{p}\vartheta_i^{(0)}V_i$ is taken into account instead of $\Sigma_0$, the $\Sigma_1$-LMVUIE is obtained. In the case of normality, the dispersion of the $\Sigma_1$-LMVUIE of the form $\eta'T\eta$, for an arbitrary value $\vartheta \in \underline\vartheta$, does not depend on the statistical moments of order higher than the second; obviously

$$\mathrm{var}(\eta'T\eta\,|\,\vartheta) = 2\sum_{i=1}^{p}\sum_{j=1}^{p}\vartheta_i\vartheta_j\,\mathrm{Tr}(TV_iTV_j).$$

If the matrix $K_{(p)}$ from Theorem 5.6.4 is regular, then the matrix $S_{(M\Sigma_1M)^+}$, where $\Sigma_1 = \sum_{i=1}^{p}\vartheta_i^{(0)}V_i$ is a regular matrix, $M = I - A(A'A)^-A'$ and

$$\{S_{(M\Sigma_1M)^+}\}_{ij} = \mathrm{Tr}[(M\Sigma_1M)^+V_i(M\Sigma_1M)^+V_j],\qquad i, j = 1, \ldots, p,$$

is also regular, and the $\Sigma_1$-LMVUIE of the whole vector $\vartheta$ is

$$\hat\vartheta = S^{-1}_{(M\Sigma_1M)^+}\,\gamma.$$

Here $\gamma = (\gamma_1, \ldots, \gamma_p)'$ and $\gamma_i = \eta'(M\Sigma_1M)^+V_i(M\Sigma_1M)^+\eta$, $i = 1, \ldots, p$. If $\vartheta = \vartheta^{(0)}$, which means that $\Sigma = \Sigma_1$, then

$$\mathrm{var}(\gamma\,|\,\vartheta^{(0)}) = 2S_{(M\Sigma_1M)^+}$$

and

$$\mathrm{var}(\hat\vartheta\,|\,\vartheta^{(0)}) = 2S^{-1}_{(M\Sigma_1M)^+}.$$

The problem of determining the probability distribution of the $\Sigma_1$-LMVUIE is obviously complicated, even if the observation vector $\eta$ is normal. For this reason, we usually characterize the quality of variance components estimators by their dispersions.
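The construction just described can be written down directly as code. The following is a minimal sketch assuming NumPy; the design matrix $A$, the matrices $V_i$, the a priori value $\vartheta^{(0)}$ and the simulated data are hypothetical. It computes $M$, the matrix $S_{(M\Sigma_1M)^+}$, the vector $\gamma$ and the estimate $\hat\vartheta$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 12, 2
A = rng.normal(size=(n, k))
V = [np.eye(n), np.diag(np.r_[np.ones(6), np.zeros(6)])]        # two variance components
theta_true = np.array([1.0, 3.0])
theta0 = np.array([1.0, 1.0])                                   # a priori value defining Sigma_1

Sigma = sum(t * Vi for t, Vi in zip(theta_true, V))
eta = rng.multivariate_normal(A @ np.array([0.5, -1.0]), Sigma)

M = np.eye(n) - A @ np.linalg.pinv(A)                           # I - A(A'A)^- A'
W = np.linalg.pinv(M @ sum(t * Vi for t, Vi in zip(theta0, V)) @ M)   # (M Sigma_1 M)^+

p = len(V)
S_mat = np.array([[np.trace(W @ V[i] @ W @ V[j]) for j in range(p)] for i in range(p)])
gamma = np.array([eta @ W @ V[i] @ W @ eta for i in range(p)])
theta_hat = np.linalg.solve(S_mat, gamma)                       # Sigma_1-LMVUIE of the vector theta
print(theta_hat)
```

A single realization gives only a noisy (possibly even negative) estimate of the variance components; averaging over replications, as in the following remarks, improves it.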

Remark 6.6.2. If a sufficiently large number $m$ of replicated realizations of the vector $\eta$ are available, then the MINQUE of the function $f(\vartheta) = f'\vartheta$, $\vartheta \in \underline\vartheta$ (here $f \in \mathcal{M}(K_{(p)})$, see Theorem 5.6.7), is

$$\widehat{f'\vartheta} = (m-1)\,\mathrm{Tr}\Big(S\sum_{i=1}^{p}\lambda_iT_i\Big) + m\,\bar\eta'\Big(\sum_{i=1}^{p}\lambda_iU_i\Big)\bar\eta$$

(Theorem 5.6.8). By Remark 5.6.7,

$$\mathrm{var}(m\,\bar\eta'U\bar\eta\,|\,\Sigma) = \frac{1}{m}\{\mathrm{Tr}[U(U\cdot\psi)] - [\mathrm{Tr}(U\Sigma)]^2\} + \frac{2(m-1)}{m}\,\mathrm{Tr}(U\Sigma U\Sigma)$$

and

$$\mathrm{var}[\mathrm{Tr}(ST)\,|\,\Sigma] = \frac{1}{m}\{\mathrm{Tr}[T(T\cdot\psi)] - [\mathrm{Tr}(T\Sigma)]^2\} + \frac{2}{m(m-1)}\,\mathrm{Tr}(T\Sigma T\Sigma).$$

If $\eta \sim N_{mn}\big(1\otimes A\Theta,\ I\otimes\sum_{i=1}^{p}\vartheta_iV_i\big)$, $\Theta \in \mathbb{R}^k$, is the replicated model from Definition 5.6.2, then

$$\mathrm{Tr}[T(T\cdot\psi)] - [\mathrm{Tr}(T\Sigma)]^2 = 2\,\mathrm{Tr}(T\Sigma T\Sigma).$$

Proof. $\eta \sim N(A\Theta, \Sigma)$, $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)' = \eta - A\Theta$, $\psi = E[(\varepsilon\varepsilon')\otimes(\varepsilon\varepsilon')]$, $\psi_{ij,kl} = E(\varepsilon_i\varepsilon_j\varepsilon_k\varepsilon_l)$; then

$$\psi_{ij,kl} = \sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk},$$

where $\sigma_{ik} = \{\Sigma\}_{ik}$, $i, k = 1, \ldots, n$. The last assertion can be easily proved using the expression

$$\frac{\partial^4}{\partial t_i\,\partial t_j\,\partial t_k\,\partial t_l}\,\varphi(t)\Big|_{t=0},$$

where $\varphi(\cdot)$, $t \in \mathbb{R}^n$, is the characteristic function of the random vector $\varepsilon \sim N_n(0, \Sigma)$, i.e. $\varphi(t) = \exp\big(-\tfrac{1}{2}t'\Sigma t\big)$. Moreover, if $T = T'$, then

$$\mathrm{Tr}[T(T\cdot\psi)] - [\mathrm{Tr}(T\Sigma)]^2 = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}t_{ij}t_{kl}\,\psi_{ij,kl} - \Big(\sum_{i=1}^{n}\sum_{j=1}^{n}t_{ij}\sigma_{ij}\Big)^2 =$$

$$= \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}t_{ij}t_{kl}(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) = 2\,\mathrm{Tr}(T\Sigma T\Sigma).$$

Thus, if the vector $\eta$ is normally distributed, the dispersion of the estimator $\widehat{f'\vartheta}$ is

$$\mathrm{var}\Big[(m-1)\,\mathrm{Tr}\Big(S\sum_{i=1}^{p}\lambda_iT_i\Big) + m\,\bar\eta'\sum_{i=1}^{p}\lambda_iU_i\,\bar\eta\ \Big|\ \Sigma\Big] = 2(m-1)\sum_{i=1}^{p}\sum_{j=1}^{p}\lambda_i\lambda_j\,\mathrm{Tr}(T_i\Sigma T_j\Sigma) + 2\sum_{i=1}^{p}\sum_{j=1}^{p}\lambda_i\lambda_j\,\mathrm{Tr}(U_i\Sigma U_j\Sigma).$$
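The key identity $\mathrm{Tr}[T(T\cdot\psi)] - [\mathrm{Tr}(T\Sigma)]^2 = 2\,\mathrm{Tr}(T\Sigma T\Sigma)$ can be checked numerically from the Gaussian fourth-moment formula. The following small sketch assumes NumPy; the matrices $\Sigma$ and $T$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
B = rng.normal(size=(n, n))
Sigma = B @ B.T + np.eye(n)
T = rng.normal(size=(n, n)); T = (T + T.T) / 2             # symmetric T

# Gaussian fourth moments: psi[i,j,k,l] = s_ij s_kl + s_ik s_jl + s_il s_jk
s = Sigma
psi = (np.einsum('ij,kl->ijkl', s, s)
       + np.einsum('ik,jl->ijkl', s, s)
       + np.einsum('il,jk->ijkl', s, s))

lhs = np.einsum('ij,kl,ijkl->', T, T, psi) - np.trace(T @ Sigma)**2
rhs = 2 * np.trace(T @ Sigma @ T @ Sigma)
print(lhs, rhs)   # equal up to rounding error
```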

Remark 6.6.3. If the observation vector $\eta$ in a replicated model has a normal probability distribution and $f \in \mathcal{M}(K_{(p)})$ (Theorem 5.6.7), then the dispersion of an unbiased estimator of the function $f(\vartheta) = f'\vartheta$, $\vartheta \in \underline\vartheta$, based on the Wishart matrix, i.e. the dispersion of the estimator $\widehat{f'\vartheta} = \mathrm{Tr}(ST_1)$ ($\mathrm{Tr}(T_1V_i) = f_i$, $i = 1, \ldots, p$), is

$$\mathrm{var}[\mathrm{Tr}(ST_1)\,|\,\Sigma] = [2/(m-1)]\,\mathrm{Tr}(T_1\Sigma T_1\Sigma).$$

Analogously, the dispersion of an unbiased estimator of the same function based on $\bar\eta$, i.e. the dispersion of the estimator $\widehat{f'\vartheta} = m\,\bar\eta'T_2\bar\eta$ ($\mathrm{Tr}(T_2V_i) = f_i$, $i = 1, \ldots, p$), is

$$\mathrm{var}(m\,\bar\eta'T_2\bar\eta\,|\,\Sigma) = 2\,\mathrm{Tr}(T_2\Sigma T_2\Sigma).$$

Remark 6.6.4. If the matrix $(m-1)K_0 + K_{(p)}$ (Theorem 5.6.7) is regular, i.e. if all the variance components are unbiasedly and invariantly estimable, then the $\Sigma_0$-LMVUIE of the vector $\vartheta$ is

$$\hat\vartheta = [(m-1)S_{\Sigma_0^{-1}} + S_{(M\Sigma_0M)^+}]^{-1}\hat\gamma,$$

where

$$\{\hat\gamma\}_i = \hat\gamma_i = (m-1)\,\mathrm{Tr}(S\Sigma_0^{-1}V_i\Sigma_0^{-1}) + m\,\bar\eta'(M\Sigma_0M)^+V_i(M\Sigma_0M)^+\bar\eta.$$

The other symbols have the same meaning as in Theorem 5.6.8. The covariance matrix of the estimator $\hat\vartheta$ has the form

$$\mathrm{var}(\hat\vartheta\,|\,\vartheta) = [(m-1)S_{\Sigma_0^{-1}} + S_{(M\Sigma_0M)^+}]^{-1}\,\mathrm{var}(\hat\gamma\,|\,\vartheta)\,[(m-1)S_{\Sigma_0^{-1}} + S_{(M\Sigma_0M)^+}]^{-1},$$

where

$$\{\mathrm{var}(\hat\gamma\,|\,\vartheta)\}_{ij} = \mathrm{cov}(\hat\gamma_i, \hat\gamma_j\,|\,\vartheta) = 2(m-1)\,\mathrm{Tr}(\Sigma_0^{-1}V_i\Sigma_0^{-1}\Sigma\Sigma_0^{-1}V_j\Sigma_0^{-1}\Sigma) +$$

$$+\ 2\,\mathrm{Tr}[(M\Sigma_0M)^+V_i(M\Sigma_0M)^+\Sigma(M\Sigma_0M)^+V_j(M\Sigma_0M)^+\Sigma],\qquad i, j = 1, \ldots, p,$$

and $\Sigma = \sum_{i=1}^{p}\vartheta_iV_i$. If $\vartheta = \vartheta^{(0)}$ (i.e. $\Sigma = \Sigma_0$), then

$$\mathrm{var}(\hat\vartheta\,|\,\vartheta^{(0)}) = 2[(m-1)S_{\Sigma_0^{-1}} + S_{(M\Sigma_0M)^+}]^{-1}$$

(because $(M\Sigma_0M)^+\Sigma_0(M\Sigma_0M)^+ = (M\Sigma_0M)^+$).


Example 6.6.1. In the replicated model from Definition 5.6.2, where $p = 1$, let the symbols $\hat\sigma^2(\eta)$, $\hat\sigma^2(S)$ and $\hat\sigma^2(\eta_i)$, $i = 1, \ldots, m$, denote the estimators of the parameter $\vartheta_1 = \sigma^2$ based on the vector $\eta$, the matrix $S$ and the vector $\eta_i$, $i = 1, \ldots, m$, respectively. The problem is to compare the dispersions

$$\mathcal{D}[\hat\sigma^2(\eta)],\qquad \mathcal{D}[\hat\sigma^2(S)]\qquad\text{and}\qquad \mathcal{D}\Big[\sum_{i=1}^{m}\hat\sigma^2(\eta_i)/m\Big].$$

If the vector $\eta$ is normally distributed, we get on the basis of the preceding

$$\hat\sigma^2(\eta) = [(m-1)\,\mathrm{Tr}(SV^{-1}) + m\,\bar\eta'(MVM)^+\bar\eta]/[(m-1)n + n - k],$$
$$\mathcal{D}[\hat\sigma^2(\eta)] = 2\sigma^4/[(m-1)n + n - k],$$
$$\hat\sigma^2(S) = (1/n)\,\mathrm{Tr}(SV^{-1}),$$
$$\mathcal{D}[\hat\sigma^2(S)] = 2\sigma^4/[n(m-1)],$$
$$\sum_{i=1}^{m}\hat\sigma^2(\eta_i)/m = \{1/[m(n-k)]\}\sum_{i=1}^{m}\eta_i'(MVM)^+\eta_i,$$
$$\mathcal{D}\Big[\sum_{i=1}^{m}\hat\sigma^2(\eta_i)/m\Big] = 2\sigma^4/[m(n-k)].$$

As an illustration, see the following table for $n = 5$ and $k = 3$:

$m$                                                                              11        101
$\mathcal{D}[\hat\sigma^2(\eta)]/\mathcal{D}[\hat\sigma^2(S)]$                    50/52     500/502
$\mathcal{D}[\hat\sigma^2(\eta)]/\mathcal{D}[\sum_i\hat\sigma^2(\eta_i)/m]$       22/52     202/502
$\mathcal{D}[\hat\sigma^2(S)]/\mathcal{D}[\sum_i\hat\sigma^2(\eta_i)/m]$          22/50     202/500
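The table entries follow directly from the three dispersion formulas; a one-line check in Python (with the same values $n = 5$, $k = 3$) is:

```python
# denominators d of the three dispersions 2*sigma^4/d, for n = 5, k = 3
n, k = 5, 3
for m in (11, 101):
    d_eta  = (m - 1) * n + n - k      # D[sigma^2(eta)]          = 2 sigma^4 / d_eta
    d_S    = n * (m - 1)              # D[sigma^2(S)]            = 2 sigma^4 / d_S
    d_mean = m * (n - k)              # D[sum sigma^2(eta_i)/m]  = 2 sigma^4 / d_mean
    print(m, d_S / d_eta, d_mean / d_eta, d_mean / d_S)
# m = 11 : 50/52, 22/52, 22/50 ;  m = 101 : 500/502, 202/502, 202/500
```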

Remark 6.6.5. As in Section 5.7, here also it is necessary to find under what conditions the LMVUE (or the LMVUIE) of the function $f(\vartheta) = f'\vartheta$, $\vartheta \in \underline\vartheta$, becomes the UMVUE (or the UMVUIE). For an observation vector $\eta$ with a normal probability distribution, these problems are solved in Kleffe [57].

For more detail concerning the theory of estimation of variance components and of simultaneous estimation of parameters of the first and the second order within the framework of the universal model, the reader may consult references [56-61, 71-75, 84, 98, 106, 107, 129].
