Design of experiments notes - IIT Delhi
DESCRIPTION
Design and analysis of experiments
TRANSCRIPT
-
Analysis of Variance and Design of Experiments - I
MODULE - I
LECTURE - 1
SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
We need some basic knowledge to understand the topics in analysis of variance.
Vectors

A vector $Y$ is an ordered n-tuple of real numbers. A vector can be expressed as a row vector or as a column vector:
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
is a column vector of order $n \times 1$, and $Y' = (y_1, y_2, \ldots, y_n)$ is a row vector of order $1 \times n$.
If $y_i = 0$ for all $i = 1, 2, \ldots, n$, then $Y' = (0, 0, \ldots, 0)$ is called the null vector.

If
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad Z = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix},$$
then
$$X + Y = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix} \quad \text{and} \quad kY = \begin{pmatrix} k y_1 \\ k y_2 \\ \vdots \\ k y_n \end{pmatrix},$$
where $k$ is a scalar.
-
The following properties hold, where $k$ is a scalar:
- $(X + Y) + Z = X + (Y + Z)$
- $X'(Y + Z) = X'Y + X'Z$
- $(kX)'Y = k(X'Y) = X'(kY)$
- $k(X + Y) = kX + kY$
- $X'Y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$
Orthogonal vectors

Two vectors $X$ and $Y$ are said to be orthogonal if $X'Y = Y'X = 0$. The null vector is orthogonal to every vector $X$ and is the only such vector.

Linear combination

If $x_1, x_2, \ldots, x_m$ are $m$ vectors of the same order and $k_1, k_2, \ldots, k_m$ are scalars, then
$$t = \sum_{i=1}^{m} k_i x_i$$
is called a linear combination of $x_1, x_2, \ldots, x_m$.
-
Linear independence

The $m$ vectors $X_1, X_2, \ldots, X_m$ are said to be linearly independent if
$$\sum_{i=1}^{m} k_i X_i = 0 \implies k_i = 0 \text{ for all } i = 1, 2, \ldots, m.$$
If there exist scalars $k_1, k_2, \ldots, k_m$, with at least one $k_i$ nonzero, such that $\sum_{i=1}^{m} k_i X_i = 0$, then $X_1, X_2, \ldots, X_m$ are said to be linearly dependent.

- Any set of vectors containing the null vector is linearly dependent.
- Any set of non-null pairwise orthogonal vectors is linearly independent.
- If $m > 1$ vectors are linearly dependent, it is always possible to express at least one of them as a linear combination of the others.
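The second fact above can be checked numerically. The sketch below (not part of the original notes; the three vectors are made-up examples, and numpy is assumed available) verifies that non-null pairwise orthogonal vectors have full rank when stacked as columns, i.e., they are linearly independent.

```python
import numpy as np

# Made-up non-null vectors chosen to be mutually orthogonal.
x1 = np.array([1.0, 1.0, 0.0])
x2 = np.array([1.0, -1.0, 0.0])
x3 = np.array([0.0, 0.0, 2.0])

# Pairwise orthogonality: x_i' x_j = 0 for i != j.
assert x1 @ x2 == 0 and x1 @ x3 == 0 and x2 @ x3 == 0

# Stack as columns; full column rank is equivalent to linear independence.
M = np.column_stack([x1, x2, x3])
rank = np.linalg.matrix_rank(M)
print(rank)  # 3: the vectors are linearly independent
```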
-
Linear function

Let $K = (k_1, k_2, \ldots, k_m)'$ be an $m \times 1$ vector of scalars and $X = (x_1, x_2, \ldots, x_m)'$ be an $m \times 1$ vector of variables. Then
$$K'X = \sum_{i=1}^{m} k_i x_i$$
is called a linear function or linear form. The vector $K$ is called the coefficient vector.

For example, the mean of $x_1, x_2, \ldots, x_m$ can be expressed as
$$\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i = \frac{1}{m}(1, 1, \ldots, 1) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} = \frac{1}{m} 1_m' X,$$
where $1_m$ is an $m \times 1$ vector with all elements unity.
Contrast

The linear function $K'X = \sum_{i=1}^{m} k_i x_i$ is called a contrast in $x_1, x_2, \ldots, x_m$ if $\sum_{i=1}^{m} k_i = 0$.

For example, the linear functions
$$x_1 - x_2, \quad x_1 + x_2 - 2x_3, \quad \frac{x_1 + x_2}{2} - x_3$$
are contrasts.

- A linear function $K'X$ is a contrast if and only if it is orthogonal to the linear function $\sum_{i=1}^{m} x_i$, or equivalently to the linear function $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i$.
- The contrasts $x_1 - x_j$ for $j = 2, 3, \ldots, m$ are linearly independent.
- Every contrast in $x_1, x_2, \ldots, x_m$ can be written as a linear combination of the $(m - 1)$ contrasts $x_1 - x_2, x_1 - x_3, \ldots, x_1 - x_m$.
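A small numerical sketch (not from the notes; the coefficient vectors are illustrative choices for $m = 3$, and numpy is assumed available) confirms that the coefficients of each example contrast sum to zero, i.e., each coefficient vector is orthogonal to the vector of ones.

```python
import numpy as np

# Coefficient vectors of the example contrasts
# x1 - x2, x1 + x2 - 2*x3, and (x1 + x2)/2 - x3.
contrasts = np.array([
    [1.0, -1.0, 0.0],
    [1.0, 1.0, -2.0],
    [0.5, 0.5, -1.0],
])

ones = np.ones(3)
# K'X is a contrast iff sum(k_i) = 0, i.e., K is orthogonal to 1_m.
sums = contrasts @ ones
print(sums)  # [0. 0. 0.]
```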
-
Matrix

A matrix is a rectangular array of real numbers. For example,
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
is a matrix of order $m \times n$ with $m$ rows and $n$ columns.

- If $m = n$, then $A$ is called a square matrix.
- If $m = n$ and $a_{ij} = 0$ for all $i \neq j$, then $A$ is a diagonal matrix and is denoted as $A = \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn})$.
- If $m = n$ (square matrix) and $a_{ij} = 0$ for $i > j$, then $A$ is called an upper triangular matrix. On the other hand, if $m = n$ and $a_{ij} = 0$ for $i < j$, then $A$ is called a lower triangular matrix.
- If $A$ is an $m \times n$ matrix, then the matrix obtained by writing the rows of $A$ as columns and the columns of $A$ as rows is called the transpose of $A$ and is denoted as $A'$.
- If $A' = A$, then $A$ is a symmetric matrix.
- If $A' = -A$, then $A$ is a skew-symmetric matrix.
- A matrix whose elements are all equal to zero is called a null matrix.
- An identity matrix is a square matrix of order $p$ whose diagonal elements are unity (ones) and all off-diagonal elements are zero. It is denoted as $I_p$.
-
- If $A$ and $B$ are matrices of order $m \times n$, then $(A + B)' = A' + B'$.
- If $A$ and $B$ are matrices of order $m \times n$ and $n \times p$ respectively, then $(AB)' = B'A'$.
- If $A$ is $m \times n$, $B$ is $n \times p$ and $k$ is any scalar, then $(kA)B = A(kB) = k(AB) = kAB$.
- If $A$ is $m \times n$ and $B$ and $C$ are $n \times p$, then $A(B + C) = AB + AC$.
- If $A$ is $m \times n$, $B$ is $n \times p$ and $C$ is $p \times q$, then $(AB)C = A(BC)$.
- If $A$ is an $m \times n$ matrix, then $I_m A = A I_n = A$.
-
Trace of a matrix

The trace of an $n \times n$ matrix $A$, denoted as $tr(A)$ or $trace(A)$, is defined to be the sum of all the diagonal elements of $A$, i.e.,
$$tr(A) = \sum_{i=1}^{n} a_{ii}.$$

- If $A$ is of order $m \times n$ and $B$ is of order $n \times m$, then $tr(AB) = tr(BA)$.
- If $A$ is an $n \times n$ matrix and $P$ is any nonsingular $n \times n$ matrix, then $tr(A) = tr(P^{-1} A P)$.
- If $P$ is an orthogonal matrix, then $tr(A) = tr(P' A P)$.
- If $A$ and $B$ are $n \times n$ matrices and $a$ and $b$ are scalars, then $tr(aA + bB) = a\, tr(A) + b\, tr(B)$.
- If $A$ is an $m \times n$ matrix, then $tr(A'A) = tr(AA') = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2$, and $tr(A'A) = tr(AA') = 0$ if and only if $A = 0$.
- If $A$ is an $n \times n$ matrix, then $tr(A') = tr(A)$.
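The trace identities above are easy to verify numerically. This sketch (not from the notes; the matrices are random examples with a fixed seed, and numpy is assumed available) checks $tr(AB) = tr(BA)$, similarity invariance, and $tr(A'A) = \sum_{ij} a_{ij}^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))                 # 3 x 4
B = rng.standard_normal((4, 3))                 # 4 x 3
C = rng.standard_normal((3, 3))                 # square
P = rng.standard_normal((3, 3)) + 3 * np.eye(3) # nonsingular for this seed

# tr(AB) = tr(BA)
t1 = np.trace(A @ B)
t2 = np.trace(B @ A)

# tr(C) = tr(P^{-1} C P)
t3 = np.trace(np.linalg.inv(P) @ C @ P)

# tr(A'A) = sum of squared elements of A
t4 = np.trace(A.T @ A)
t5 = (A ** 2).sum()
```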
-
Rank of a matrix

The rank of a matrix $A$ of order $m \times n$ is the number of linearly independent rows in $A$.

- If $B$ is another matrix of order $n \times q$, then $rank(AB) \leq \min(rank(A), rank(B))$.
- $rank(A + B) \leq rank(A) + rank(B)$.
- A square matrix of order $m$ is called non-singular if it has full rank.
- The rank of $A$ is equal to the maximum order among all nonsingular square sub-matrices of $A$.
- $A$ is of full row rank if $rank(A) = m \leq n$; $A$ is of full column rank if $rank(A) = n \leq m$.
- $rank(AA') = rank(A'A) = rank(A) = rank(A')$.
-
Inverse of a matrix

The inverse of a square matrix $A$ of order $m$ is a square matrix of order $m$, denoted as $A^{-1}$, such that $A^{-1} A = A A^{-1} = I_m$.

- The inverse of $A$ exists if and only if $A$ is non-singular.
- If $A$ is non-singular, then $(A^{-1})^{-1} = A$.
- If $A$ and $B$ are non-singular matrices of the same order, then their product, if defined, is also nonsingular and $(AB)^{-1} = B^{-1} A^{-1}$.
- $(A')^{-1} = (A^{-1})'$.

Idempotent matrix

A square matrix $A$ is called idempotent if $A^2 = AA = A$.

- The eigenvalues of an idempotent matrix are 1 or 0.
- If $A$ is an $n \times n$ idempotent matrix with $rank(A) = r \leq n$, then $trace(A) = rank(A) = r$.
- If $A$ is of full rank $n$, then $A = I_n$.
- If $A$ and $B$ are idempotent and $AB = BA$, then $AB$ is also idempotent.
- If $A$ is idempotent, then $(I - A)$ is also idempotent and $A(I - A) = (I - A)A = 0$.
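A standard example of an idempotent matrix, and the one that matters later for least squares, is the projection ("hat") matrix $H = X(X'X)^{-1}X'$. The sketch below (not from the notes; $X$ is a random example with a fixed seed, and numpy is assumed available) checks idempotency, $trace(H) = rank(H)$, and the $(I - H)$ facts.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))          # full column rank for this seed
H = X @ np.linalg.inv(X.T @ X) @ X.T      # projection matrix onto columns of X

# H is idempotent, so trace(H) = rank(H) = 3.
idempotent = np.allclose(H @ H, H)
tr = np.trace(H)
rank = np.linalg.matrix_rank(H)

# I - H is also idempotent and H(I - H) = 0.
M = np.eye(10) - H
complement_ok = np.allclose(M @ M, M) and np.allclose(H @ M, 0)
```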
-
Analysis of Variance and Design of Experiments - I
MODULE - I
LECTURE - 2
SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Quadratic forms

If $A$ is a given matrix of order $m \times n$ and $X$ and $Y$ are two given vectors of order $m \times 1$ and $n \times 1$ respectively, then the form is given by
$$X'AY = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j,$$
where the $a_{ij}$'s are the nonstochastic elements of $A$.

If $A$ is a square matrix of order $m$ and $X = Y$, then
$$X'AX = a_{11} x_1^2 + \cdots + a_{mm} x_m^2 + (a_{12} + a_{21}) x_1 x_2 + \cdots + (a_{m-1,m} + a_{m,m-1}) x_{m-1} x_m.$$
If $A$ is symmetric also, then
$$X'AX = a_{11} x_1^2 + \cdots + a_{mm} x_m^2 + 2 a_{12} x_1 x_2 + \cdots + 2 a_{m-1,m} x_{m-1} x_m = \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} x_i x_j$$
is called a quadratic form in the $m$ variables $x_1, x_2, \ldots, x_m$, or a quadratic form in $X$.

To every quadratic form corresponds a symmetric matrix and vice versa. The matrix $A$ is called the matrix of the quadratic form.

The quadratic form $X'AX$, and the matrix $A$ of the form, is called
- positive definite if $X'AX > 0$ for all $X \neq 0$;
- positive semi-definite if $X'AX \geq 0$ for all $X \neq 0$;
- negative definite if $X'AX < 0$ for all $X \neq 0$;
- negative semi-definite if $X'AX \leq 0$ for all $X \neq 0$.
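In practice, definiteness of a symmetric matrix is checked through its eigenvalues: all positive means positive definite, all nonnegative means positive semi-definite. A small sketch (not from the notes; the two matrices are made-up examples, and numpy is assumed available):

```python
import numpy as np

A_pd = np.array([[2.0, -1.0],
                 [-1.0, 2.0]])   # positive definite (eigenvalues 1 and 3)
A_psd = np.array([[1.0, 1.0],
                  [1.0, 1.0]])   # positive semi-definite (eigenvalues 0 and 2)

# eigvalsh is for symmetric matrices and returns real eigenvalues.
eig_pd = np.linalg.eigvalsh(A_pd)
eig_psd = np.linalg.eigvalsh(A_psd)

is_pd = bool(np.all(eig_pd > 0))
is_psd = bool(np.all(eig_psd >= -1e-12))
```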
-
- If $A$ is a positive semi-definite matrix, then $a_{ii} \geq 0$, and if $a_{ii} = 0$, then $a_{ij} = 0$ and $a_{ji} = 0$ for all $j$.
- If $P$ is any nonsingular matrix and $A$ is any positive definite (or positive semi-definite) matrix, then $P'AP$ is also a positive definite (or positive semi-definite) matrix.
- A matrix $A$ is positive definite if and only if there exists a non-singular matrix $P$ such that $A = P'P$.
- A positive definite matrix is a nonsingular matrix.
- If $A$ is an $m \times n$ matrix and $rank(A) = m < n$, then $AA'$ is positive definite and $A'A$ is positive semi-definite.
- If $A$ is an $m \times n$ matrix and $rank(A) = k < m < n$, then both $A'A$ and $AA'$ are positive semi-definite.
-
Simultaneous linear equations

The set of $m$ linear equations in $n$ unknowns $x_1, x_2, \ldots, x_n$ with scalars $a_{ij}$ and $b_i$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$), of the form
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\ &\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m, \end{aligned}$$
can be formulated as
$$AX = b,$$
where
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
is an $m \times n$ real matrix of known scalars called the coefficient matrix,
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \text{ is an } n \times 1 \text{ vector of variables, and } b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} \text{ is an } m \times 1 \text{ real vector of known scalars.}$$
-
- If $A$ is an $n \times n$ nonsingular matrix, then $AX = b$ has a unique solution $X = A^{-1}b$.
- The linear homogeneous system $AX = 0$ has a solution other than $X = 0$ if and only if $rank(A) < n$.
- Let $B = [A, b]$ be the augmented matrix. A solution to $AX = b$ exists if and only if $rank(A) = rank(B)$.
- If $A$ is an $m \times n$ matrix of rank $m$, then $AX = b$ has a solution.
- If $AX = b$ is consistent, then $AX = b$ has a unique solution if and only if $rank(A) = n$.

Orthogonal matrix

A square matrix $A$ is called an orthogonal matrix if $A'A = AA' = I$, or equivalently if $A^{-1} = A'$.

- An orthogonal matrix is non-singular.
- If $A$ is orthogonal, then $A'$ is also orthogonal.
- If $a_{ii}$ is the $i$th diagonal element of an orthogonal matrix, then $-1 \leq a_{ii} \leq 1$.
- Let the $n \times n$ matrix $A$ be partitioned as $A = [a_1, a_2, \ldots, a_n]$, where $a_i$ is the $n \times 1$ vector of the elements of the $i$th column of $A$. A necessary and sufficient condition that $A$ is an orthogonal matrix is given by the following:
  (i) $a_i' a_i = 1$ for $i = 1, 2, \ldots, n$;
  (ii) $a_i' a_j = 0$ for $i \neq j = 1, 2, \ldots, n$.
- If $A$ is an $n \times n$ matrix and $P$ is an $n \times n$ orthogonal matrix, then the determinants of $A$ and $P'AP$ are the same.
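A rotation matrix is the textbook example of an orthogonal matrix. This sketch (not from the notes; the angle is an arbitrary choice, and numpy is assumed available) checks $A'A = AA' = I$, $A^{-1} = A'$, and the bound on the diagonal elements.

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A'A = AA' = I and the inverse equals the transpose.
ortho = np.allclose(A.T @ A, np.eye(2)) and np.allclose(A @ A.T, np.eye(2))
inv_is_transpose = np.allclose(np.linalg.inv(A), A.T)

# Diagonal elements of an orthogonal matrix lie in [-1, 1].
diag_bounded = bool(np.all(np.abs(np.diag(A)) <= 1))
```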
-
Random vectors

Let $Y_1, Y_2, \ldots, Y_n$ be $n$ random variables. Then $Y = (Y_1, Y_2, \ldots, Y_n)'$ is called a random vector. The mean vector of $Y$ is
$$E(Y) = (E(Y_1), E(Y_2), \ldots, E(Y_n))'.$$
The covariance matrix or dispersion matrix of $Y$ is
$$Var(Y) = \begin{pmatrix} Var(Y_1) & Cov(Y_1, Y_2) & \cdots & Cov(Y_1, Y_n) \\ Cov(Y_2, Y_1) & Var(Y_2) & \cdots & Cov(Y_2, Y_n) \\ \vdots & \vdots & \ddots & \vdots \\ Cov(Y_n, Y_1) & Cov(Y_n, Y_2) & \cdots & Var(Y_n) \end{pmatrix},$$
which is a symmetric matrix.

- If $Y_1, Y_2, \ldots, Y_n$ are pairwise uncorrelated, then the covariance matrix is a diagonal matrix.
- If $Var(Y_i) = \sigma^2$ for all $i = 1, 2, \ldots, n$, then $Var(Y) = \sigma^2 I_n$.
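The last point can be illustrated by simulation. This sketch (not from the notes; the sample size and $\sigma = 2$ are arbitrary choices, and numpy is assumed available) draws uncorrelated components with common variance $\sigma^2 = 4$ and checks that the sample covariance matrix is close to $\sigma^2 I$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
sigma = 2.0

# Y = (Y1, Y2)' with independent components, each with variance sigma^2.
Y = sigma * rng.standard_normal((n, 2))

cov = np.cov(Y, rowvar=False)  # approximately [[4, 0], [0, 4]]
print(cov)
```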
-
Linear function of random variables

If $Y_1, Y_2, \ldots, Y_n$ are $n$ random variables and $k_1, k_2, \ldots, k_n$ are scalars, then $\sum_{i=1}^{n} k_i Y_i$ is called a linear function of the random variables $Y_1, Y_2, \ldots, Y_n$.

If $Y = (Y_1, Y_2, \ldots, Y_n)'$ and $K = (k_1, k_2, \ldots, k_n)'$, then $K'Y = \sum_{i=1}^{n} k_i Y_i$, and
- the mean of $K'Y$ is $E(K'Y) = K'E(Y) = \sum_{i=1}^{n} k_i E(Y_i)$, and
- the variance of $K'Y$ is $Var(K'Y) = K'\, Var(Y)\, K$.

Multivariate normal distribution

A random vector $Y = (Y_1, Y_2, \ldots, Y_n)'$ has a multivariate normal distribution with mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_n)'$ and dispersion matrix $\Sigma$ if its probability density function is
$$f(Y \mid \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (Y - \mu)' \Sigma^{-1} (Y - \mu) \right],$$
assuming $\Sigma$ is a nonsingular matrix.
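The formulas $E(K'Y) = K'\mu$ and $Var(K'Y) = K'\Sigma K$ can be checked by Monte Carlo. This sketch (not from the notes; $\mu$, $\Sigma$ and $K$ are made-up example values, and numpy is assumed available) draws from a multivariate normal and compares sample moments to the theoretical values.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])   # positive definite example
K = np.array([1.0, -1.0, 2.0])

# Theoretical mean and variance of the linear function K'Y.
mean_theory = K @ mu                  # 5.0
var_theory = K @ Sigma @ K            # 6.8

# Monte Carlo check.
Y = rng.multivariate_normal(mu, Sigma, size=200_000)
KY = Y @ K
mean_mc = KY.mean()
var_mc = KY.var()
```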
-
Chi-square distribution

If $Y_1, Y_2, \ldots, Y_k$ are identically and independently distributed random variables following the normal distribution with common mean 0 and common variance 1, then the distribution of $\sum_{i=1}^{k} Y_i^2$ is called the $\chi^2$-distribution with $k$ degrees of freedom.

The probability density function of the $\chi^2$-distribution with $k$ degrees of freedom is given as
$$f(x) = \frac{1}{\Gamma(k/2)\, 2^{k/2}}\, x^{\frac{k}{2} - 1} \exp\left(-\frac{x}{2}\right); \quad 0 < x < \infty.$$

- If $Y_1, Y_2, \ldots, Y_k$ are independently distributed following the normal distribution with common mean 0 and common variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has a $\chi^2$-distribution with $k$ degrees of freedom.
- If the random variables $Y_1, Y_2, \ldots, Y_k$ are normally distributed with non-null means $\mu_1, \mu_2, \ldots, \mu_k$ but common variance 1, then the distribution of $\sum_{i=1}^{k} Y_i^2$ has a non-central $\chi^2$-distribution with $k$ degrees of freedom and non-centrality parameter
$$\lambda = \sum_{i=1}^{k} \mu_i^2.$$
- If $Y_1, Y_2, \ldots, Y_k$ are independently distributed following the normal distribution with means $\mu_1, \mu_2, \ldots, \mu_k$ but common variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has a non-central $\chi^2$-distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{\sigma^2} \sum_{i=1}^{k} \mu_i^2$.
-
- If $U$ has a Chi-square distribution with $k$ degrees of freedom, then $E(U) = k$ and $Var(U) = 2k$.
- If $U$ has a noncentral Chi-square distribution with $k$ degrees of freedom and noncentrality parameter $\lambda$, then $E(U) = k + \lambda$ and $Var(U) = 2k + 4\lambda$.
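These moment formulas can be checked by simulation. The sketch below (not from the notes; $k$, the means, and the sample size are arbitrary choices, and numpy is assumed available) builds central and noncentral chi-square variables directly from normal draws.

```python
import numpy as np

rng = np.random.default_rng(4)
k = 5
n = 400_000

# Central chi-square as a sum of squared N(0, 1) variables:
# E(U) = k, Var(U) = 2k.
Z = rng.standard_normal((n, k))
U = (Z ** 2).sum(axis=1)
mean_c, var_c = U.mean(), U.var()

# Noncentral chi-square: shift the normals by means mu_i, so that
# lambda = sum(mu_i^2); E(U) = k + lambda, Var(U) = 2k + 4*lambda.
mu = np.array([1.0, 0.5, 0.0, -1.0, 2.0])
lam = (mu ** 2).sum()                 # 6.25
Unc = ((Z + mu) ** 2).sum(axis=1)
mean_nc, var_nc = Unc.mean(), Unc.var()
```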
- If $U_1, U_2, \ldots, U_k$ are independently distributed random variables, each $U_i$ having a noncentral Chi-square distribution with $n_i$ degrees of freedom and non-centrality parameter $\lambda_i$, $i = 1, 2, \ldots, k$, then $\sum_{i=1}^{k} U_i$ has a noncentral Chi-square distribution with $\sum_{i=1}^{k} n_i$ degrees of freedom and noncentrality parameter $\sum_{i=1}^{k} \lambda_i$.
- Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Then $X'AX$ is distributed as noncentral $\chi^2$ with $k$ degrees of freedom if and only if $A\Sigma$ is an idempotent matrix of rank $k$.
- Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Let the quadratic form $X'A_1X$ be distributed as $\chi^2$ with $n_1$ degrees of freedom and noncentrality parameter $\mu'A_1\mu$, and let $X'A_2X$ be distributed as $\chi^2$ with $n_2$ degrees of freedom and noncentrality parameter $\mu'A_2\mu$. Then $X'A_1X$ and $X'A_2X$ are independently distributed if $A_1 \Sigma A_2 = 0$.
-
t-distribution

If
- $X$ has a normal distribution with mean 0 and variance 1,
- $Y$ has a $\chi^2$ distribution with $n$ degrees of freedom, and
- $X$ and $Y$ are independent random variables,

then the distribution of the statistic
$$T = \frac{X}{\sqrt{Y/n}}$$
is called the t-distribution with $n$ degrees of freedom. The probability density function of $T$ is
$$f_T(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\, \Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}; \quad -\infty < t < \infty.$$

If the mean of $X$ is non-zero, say $\mu$, then the distribution of $\frac{X}{\sqrt{Y/n}}$ is called the noncentral t-distribution with $n$ degrees of freedom and noncentrality parameter $\mu$.
-
F-distribution

If $X$ and $Y$ are independent random variables with $\chi^2$-distributions with $m$ and $n$ degrees of freedom respectively, then the distribution of the statistic
$$F = \frac{X/m}{Y/n}$$
is called the F-distribution with $m$ and $n$ degrees of freedom. The probability density function of $F$ is
$$f_F(f) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{m/2} f^{\frac{m}{2}-1} \left(1 + \frac{m}{n} f\right)^{-\frac{m+n}{2}}; \quad 0 < f < \infty.$$

If $X$ has a noncentral Chi-square distribution with $m$ degrees of freedom and noncentrality parameter $\lambda$, $Y$ has a $\chi^2$ distribution with $n$ degrees of freedom, and $X$ and $Y$ are independent random variables, then the distribution of
$$F = \frac{X/m}{Y/n}$$
is the noncentral F-distribution with $m$ and $n$ degrees of freedom and noncentrality parameter $\lambda$.
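The construction of $F$ as a ratio of scaled chi-squares can be simulated directly. This sketch (not from the notes; the degrees of freedom and sample size are arbitrary choices, and numpy is assumed available) checks the known mean $E(F) = n/(n-2)$ for $n > 2$.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 4, 12        # numerator and denominator degrees of freedom
N = 300_000         # number of Monte Carlo replicates

# F = (X/m)/(Y/n) with X ~ chi^2_m and Y ~ chi^2_n independent.
X = (rng.standard_normal((N, m)) ** 2).sum(axis=1)
Y = (rng.standard_normal((N, n)) ** 2).sum(axis=1)
F = (X / m) / (Y / n)

# For n > 2, E(F) = n / (n - 2), which is 1.2 here.
mean_mc = F.mean()
```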
-
Analysis of Variance and Design of Experiments - I
MODULE - I
LECTURE - 3
SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Linear model

Suppose there are $n$ observations. In the linear model, we assume that these observations are the values taken by $n$ random variables $Y_1, Y_2, \ldots, Y_n$ satisfying the following conditions:

- $E(Y_i)$ is a linear combination of $p$ unknown parameters $\beta_1, \beta_2, \ldots, \beta_p$ with
$$E(Y_i) = x_{i1} \beta_1 + x_{i2} \beta_2 + \cdots + x_{ip} \beta_p, \quad i = 1, 2, \ldots, n,$$
where the $x_{ij}$'s are known constants.
- $Y_1, Y_2, \ldots, Y_n$ are uncorrelated and normally distributed with variance $Var(Y_i) = \sigma^2$.

The linear model can be rewritten by introducing independent normal random variables $\varepsilon_i$ following $N(0, \sigma^2)$, as
$$Y_i = x_{i1} \beta_1 + x_{i2} \beta_2 + \cdots + x_{ip} \beta_p + \varepsilon_i, \quad i = 1, 2, \ldots, n.$$
These equations can be written using matrix notation as
$$Y = X\beta + \varepsilon,$$
where $Y$ is an $n \times 1$ vector of observations, $X$ is an $n \times p$ matrix $(x_{ij})$ of $n$ observations on each of the $p$ variables $X_1, X_2, \ldots, X_p$, $\beta$ is a $p \times 1$ vector of parameters, and $\varepsilon$ is an $n \times 1$ vector of random error components with $\varepsilon \sim N(0, \sigma^2 I_n)$. Here $Y$ is called the study or dependent variable, $X_1, X_2, \ldots, X_p$ are called explanatory or independent variables, and $\beta_1, \beta_2, \ldots, \beta_p$ are called regression coefficients.
-
Alternatively, since $Y \sim N(X\beta, \sigma^2 I)$, the linear model can also be expressed in the expectation form as a normal random variable $Y$ with
$$E(Y) = X\beta, \quad Var(Y) = \sigma^2 I.$$
Note that $\beta$ and $\sigma^2$ are unknown but $X$ is known.

Estimable function

A linear parametric function $\lambda'\beta$ of the parameter $\beta$ is said to be an estimable parametric function (or estimable) if there exists a linear function $L'Y$ of the random variables $Y = (Y_1, Y_2, \ldots, Y_n)'$ such that
$$E(L'Y) = \lambda'\beta,$$
with $L = (\ell_1, \ell_2, \ldots, \ell_n)'$ and $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)'$ being vectors of known scalars.
-
Best Linear Unbiased Estimate (BLUE)

The unbiased minimum variance linear estimate $L'Y$ of an estimable function $\lambda'\beta$ is called the best linear unbiased estimate of $\lambda'\beta$.

- Suppose $L_1'Y$ and $L_2'Y$ are the BLUEs of $\lambda_1'\beta$ and $\lambda_2'\beta$ respectively. Then $(a_1 L_1 + a_2 L_2)'Y$ is the BLUE of $(a_1 \lambda_1 + a_2 \lambda_2)'\beta$.
- If $\lambda'\beta$ is estimable, its best estimate is $\lambda'\hat{\beta}$, where $\hat{\beta}$ is any solution of the equations $X'X\hat{\beta} = X'Y$.

Least squares estimation

The least squares estimate of $\beta$ in $Y = X\beta + \varepsilon$ is the value $\hat{\beta}$ of $\beta$ which minimizes the error sum of squares $\varepsilon'\varepsilon$. Let
$$S = \varepsilon'\varepsilon = (Y - X\beta)'(Y - X\beta) = Y'Y - 2\beta'X'Y + \beta'X'X\beta.$$
Minimizing $S$ with respect to $\beta$ involves
$$\frac{\partial S}{\partial \beta} = 0 \implies X'X\hat{\beta} = X'Y,$$
which is termed the normal equation.
-
This normal equation has a unique solution given by
$$\hat{\beta} = (X'X)^{-1} X'Y,$$
assuming $rank(X) = p$. Note that $\frac{\partial^2 S}{\partial \beta\, \partial \beta'} = 2X'X$ is a positive definite matrix. So $\hat{\beta} = (X'X)^{-1}X'Y$ is the value of $\beta$ which minimizes $S$ and is termed the ordinary least squares estimator of $\beta$.

In this case, $\beta_1, \beta_2, \ldots, \beta_p$ are estimable, and consequently all linear parametric functions $\lambda'\beta$ are estimable, with
$$E(\hat{\beta}) = (X'X)^{-1}X' E(Y) = (X'X)^{-1}X'X\beta = \beta,$$
$$Var(\hat{\beta}) = (X'X)^{-1}X'\, Var(Y)\, X(X'X)^{-1} = \sigma^2 (X'X)^{-1}.$$
If $\lambda_1'\hat{\beta}$ and $\lambda_2'\hat{\beta}$ are the estimates of $\lambda_1'\beta$ and $\lambda_2'\beta$ respectively, then
$$Var(\lambda'\hat{\beta}) = \sigma^2 [\lambda'(X'X)^{-1}\lambda], \quad Cov(\lambda_1'\hat{\beta}, \lambda_2'\hat{\beta}) = \sigma^2 [\lambda_1'(X'X)^{-1}\lambda_2].$$
$Y - X\hat{\beta}$ is called the residual vector, and $E(Y - X\hat{\beta}) = 0$.
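The normal equation and the resulting least squares estimator can be sketched in a few lines. In this example (not from the notes; the design, true coefficients, and noise level are made-up, and numpy is assumed available), the normal equations are solved directly and the residual vector is checked to be orthogonal to the columns of $X$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 200, 3
X = rng.standard_normal((n, p))              # rank(X) = p for this seed
beta_true = np.array([1.0, -2.0, 0.5])       # hypothetical true parameters
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Solve the normal equations X'X beta_hat = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The residual vector y - X beta_hat satisfies X'(y - X beta_hat) = 0.
resid = y - X @ beta_hat
ortho = np.allclose(X.T @ resid, 0, atol=1e-8)
```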
-
Linear model with correlated observations

In the linear model
$$Y = X\beta + \varepsilon$$
with $E(\varepsilon) = 0$, $Var(\varepsilon) = \Sigma$ and $\varepsilon$ normally distributed, we find
$$E(Y) = X\beta, \quad Var(Y) = \Sigma.$$
Assuming $\Sigma$ to be positive definite, we can write
$$\Sigma^{-1} = P'P,$$
where $P$ is a nonsingular matrix. Premultiplying $Y = X\beta + \varepsilon$ by $P$, we get
$$PY = PX\beta + P\varepsilon$$
or
$$Y^* = X^*\beta + \varepsilon^*,$$
where $Y^* = PY$, $X^* = PX$ and $\varepsilon^* = P\varepsilon$.
Note that $\beta$ and $\Sigma$ are unknown but $X$ is known.
-
Distribution of $\lambda'Y$

In the linear model $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$, consider a linear function $\lambda'Y$, which is normally distributed with
$$E(\lambda'Y) = \lambda'X\beta, \quad Var(\lambda'Y) = \sigma^2 (\lambda'\lambda).$$
Then
$$\frac{\lambda'Y}{\sigma\sqrt{\lambda'\lambda}} \sim N\left(\frac{\lambda'X\beta}{\sigma\sqrt{\lambda'\lambda}},\, 1\right).$$
Further, $\frac{(\lambda'Y)^2}{\sigma^2 \lambda'\lambda}$ has a noncentral Chi-square distribution with one degree of freedom and noncentrality parameter $\frac{(\lambda'X\beta)^2}{\sigma^2 \lambda'\lambda}$.

Degrees of freedom

A linear function $\lambda'Y$ ($\lambda \neq 0$) of the observations is said to carry one degree of freedom. A set of $r$ linear functions $L'Y$, where $L'$ is an $r \times n$ matrix, is said to have $M$ degrees of freedom if there exist $M$ linearly independent functions in the set and no more. Alternatively, the degrees of freedom carried by the set equals $rank(L)$. When the set $L'Y$ are the estimates of $\Lambda'\beta$, the degrees of freedom of the set will also be called the degrees of freedom for the estimates of $\Lambda'\beta$.
-
Sum of squares

If $\lambda'Y$ is a linear function of observations, then the projection of $Y$ on $\lambda$ is the vector $\frac{\lambda'Y}{\lambda'\lambda}\lambda$. The square of the length of this projection is called the sum of squares (SS) due to $\lambda'Y$ and is given by $\frac{(\lambda'Y)^2}{\lambda'\lambda}$. Since $\lambda'Y$ has one degree of freedom, the SS due to $\lambda'Y$ has one degree of freedom.

The sums of squares and the degrees of freedom arising out of mutually orthogonal sets of functions can be added together to give the sum of squares and degrees of freedom for the set of all the functions together, and vice versa.

Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Let the quadratic form $X'A_1X$ be distributed as $\chi^2$ with $n_1$ degrees of freedom and noncentrality parameter $\mu'A_1\mu$, and let $X'A_2X$ be distributed as $\chi^2$ with $n_2$ degrees of freedom and noncentrality parameter $\mu'A_2\mu$. Then $X'A_1X$ and $X'A_2X$ are independently distributed if $A_1 \Sigma A_2 = 0$.
-
Fisher-Cochran theorem

Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$, and let
$$X'\Sigma^{-1}X = Q_1 + Q_2 + \cdots + Q_k,$$
where $Q_i = X'A_iX$ with $rank(A_i) = N_i$, $i = 1, 2, \ldots, k$. Then the $Q_i$'s are independently distributed as noncentral Chi-square distributions with $N_i$ degrees of freedom and noncentrality parameters $\mu'A_i\mu$ if and only if $\sum_{i=1}^{k} N_i = n$, in which case
$$\mu'\Sigma^{-1}\mu = \sum_{i=1}^{k} \mu'A_i\mu.$$
-
Derivatives of quadratic and linear forms

Let $X = (x_1, x_2, \ldots, x_n)'$ and let $f(X)$ be any function of the $n$ independent variables $x_1, x_2, \ldots, x_n$. Then
$$\frac{\partial f(X)}{\partial X} = \begin{pmatrix} \partial f(X)/\partial x_1 \\ \partial f(X)/\partial x_2 \\ \vdots \\ \partial f(X)/\partial x_n \end{pmatrix}.$$

- If $K = (k_1, k_2, \ldots, k_n)'$ is a vector of constants, then $\frac{\partial K'X}{\partial X} = K$.
- If $A$ is an $n \times n$ matrix, then $\frac{\partial X'AX}{\partial X} = (A + A')X$.

Independence of linear and quadratic forms

- Let $Y$ be an $n \times 1$ vector having the multivariate normal distribution $N(\mu, I)$ and let $B$ be an $m \times n$ matrix. Then the linear form $BY$ is independent of the quadratic form $Y'AY$ if $BA = 0$, where $A$ is a symmetric matrix of known elements.
- Let $Y$ be an $n \times 1$ vector having the multivariate normal distribution $N(\mu, \Sigma)$ with $rank(\Sigma) = n$. If $B\Sigma A = 0$, then the quadratic form $Y'AY$ is independent of the linear form $BY$, where $B$ is an $m \times n$ matrix.
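The two derivative rules can be confirmed against a central difference approximation. This sketch (not from the notes; $A$, $K$, and the evaluation point are random examples with a fixed seed, and numpy is assumed available) compares the analytic gradients $K$ and $(A + A')X$ with numerical gradients.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
A = rng.standard_normal((n, n))   # not necessarily symmetric
K = rng.standard_normal(n)
x = rng.standard_normal(n)

# Analytic gradients: d(K'X)/dX = K and d(X'AX)/dX = (A + A')X.
grad_linear = K
grad_quad = (A + A.T) @ x

def num_grad(f, x, h=1e-6):
    """Central-difference approximation to the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

g1 = num_grad(lambda v: K @ v, x)
g2 = num_grad(lambda v: v @ A @ v, x)
```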
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 4
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Regression model for the general linear hypothesis

Let $Y_1, Y_2, \ldots, Y_n$ be a sequence of $n$ independent random variables associated with responses. Then we can write
$$E(Y_i) = \sum_{j=1}^{p} \beta_j x_{ij}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, p,$$
$$Var(Y_i) = \sigma^2.$$
This is the linear model in expectation form, where $\beta_1, \beta_2, \ldots, \beta_p$ are the unknown parameters and the $x_{ij}$'s are the known values of the independent covariates $X_1, X_2, \ldots, X_p$.

Alternatively, the linear model can be expressed as
$$Y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, p,$$
where the $\varepsilon_i$'s are identically and independently distributed random error components with mean 0 and variance $\sigma^2$, i.e.,
$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0 \; (i \neq j).$$
In matrix notation, the linear model can be expressed as
$$Y = X\beta + \varepsilon,$$
where $Y = (Y_1, Y_2, \ldots, Y_n)'$ is an $n \times 1$ vector of observations on the response variable, the matrix
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
is an $n \times p$ matrix of $n$ observations on the $p$ independent covariates $X_1, X_2, \ldots, X_p$,
-
$\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a $p \times 1$ vector of unknown regression parameters (or regression coefficients) associated with $X_1, X_2, \ldots, X_p$ respectively, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is an $n \times 1$ vector of random errors or disturbances.

We assume that $E(\varepsilon) = 0$, the covariance matrix $V(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I_n$, and $rank(X) = p$.

In the context of analysis of variance and design of experiments:

- the matrix $X$ is termed the design matrix;
- the unknown $\beta_1, \beta_2, \ldots, \beta_p$ are termed effects;
- the covariates $X_1, X_2, \ldots, X_p$ are counter variables or indicator variables, where $x_{ij}$ counts the number of times the effect $\beta_j$ occurs in the $i$th observation $x_i$;
- $x_{ij}$ mostly takes the values 1 or 0, but not always;
- the value $x_{ij} = 1$ indicates the presence of effect $\beta_j$ in $x_i$, and $x_{ij} = 0$ indicates the absence of effect $\beta_j$ in $x_i$.

Note that in the linear regression model, the covariates are usually continuous variables. When some of the covariates are counter variables and the rest are continuous variables, then the model is called a mixed model and is used in the analysis of covariance.
-
Relationship between the regression model and the analysis of variance model

The same linear model is used in linear regression analysis as well as in the analysis of variance. So it is important to understand the role of the linear model in the context of linear regression analysis and the analysis of variance.

Consider the multiple linear model
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon.$$
In the case of the analysis of variance model,

- the one-way classification considers only one covariate,
- the two-way classification model considers two covariates,
- the three-way classification model considers three covariates, and so on.

If $\beta$, $\gamma$ and $\delta$ denote the effects associated with the covariates $X$, $Z$ and $W$, which are counter variables, then

- One-way model: $Y = \mu + \beta X + \varepsilon$
- Two-way model: $Y = \mu + \beta X + \gamma Z + \varepsilon$
- Three-way model: $Y = \mu + \beta X + \gamma Z + \delta W + \varepsilon$, and so on.

Consider an example of agricultural yield. The study variable denotes the yield, which depends on various covariates $X_1, X_2, \ldots, X_p$. In the case of regression analysis, the covariates are different variables like temperature, quantity of fertilizer, amount of irrigation, etc.
-
Now consider the case of the one-way model and try to understand its interpretation in terms of the multiple regression model.

The covariate $X$ is now measured at different levels. For example, if $X$ is the quantity of fertilizer, then suppose there are $p$ possible values, say 1 Kg., 2 Kg., ..., p Kg. Then $X_1, X_2, \ldots, X_p$ denote these $p$ values in the following way. The linear model now can be expressed as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
by defining
$$X_1 = \begin{cases} 1 & \text{if effect of 1 Kg. fertilizer is present} \\ 0 & \text{if effect of 1 Kg. fertilizer is absent} \end{cases}$$
$$X_2 = \begin{cases} 1 & \text{if effect of 2 Kg. fertilizer is present} \\ 0 & \text{if effect of 2 Kg. fertilizer is absent} \end{cases}$$
$$\vdots$$
$$X_p = \begin{cases} 1 & \text{if effect of p Kg. fertilizer is present} \\ 0 & \text{if effect of p Kg. fertilizer is absent.} \end{cases}$$

If the effect of 1 Kg. of fertilizer is present, then the other effects will obviously be absent and the linear model is expressible as
$$Y = \beta_0 + \beta_1 (X_1 = 1) + \beta_2 (X_2 = 0) + \cdots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_1 + \varepsilon.$$
If the effect of 2 Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 1) + \cdots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_2 + \varepsilon.$$
-
If the effect of p Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 0) + \cdots + \beta_p (X_p = 1) + \varepsilon = \beta_0 + \beta_p + \varepsilon,$$
and so on.

If the experiment with 1 Kg. of fertilizer is repeated $n_1$ times, then $n_1$ observations on the response variable are recorded, which can be represented as
$$\begin{aligned} Y_{11} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{11} \\ Y_{12} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{12} \\ &\;\vdots \\ Y_{1n_1} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{1n_1}. \end{aligned}$$
If $X_2 = 1$ is repeated $n_2$ times, then on the same lines $n_2$ observations on the response variable are recorded, which can be represented as
$$\begin{aligned} Y_{21} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{21} \\ Y_{22} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{22} \\ &\;\vdots \\ Y_{2n_2} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{2n_2}. \end{aligned}$$
-
The experiment is continued, and if $X_p = 1$ is repeated $n_p$ times, then on the same lines
$$\begin{aligned} Y_{p1} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{p1} \\ Y_{p2} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{p2} \\ &\;\vdots \\ Y_{pn_p} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{pn_p}. \end{aligned}$$
All these $n_1, n_2, \ldots, n_p$ observations can be represented as
$$\begin{pmatrix} y_{11} \\ y_{12} \\ \vdots \\ y_{1n_1} \\ y_{21} \\ y_{22} \\ \vdots \\ y_{2n_2} \\ \vdots \\ y_{p1} \\ y_{p2} \\ \vdots \\ y_{pn_p} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & \cdots & 0 \\ 1 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \vdots \\ \varepsilon_{1n_1} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \vdots \\ \varepsilon_{2n_2} \\ \vdots \\ \varepsilon_{p1} \\ \varepsilon_{p2} \\ \vdots \\ \varepsilon_{pn_p} \end{pmatrix}$$
or $Y = X\beta + \varepsilon$.
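The one-way design matrix above can be built programmatically. The sketch below (not from the notes; the group sizes $n_1 = 2$, $n_2 = 3$, $n_3 = 2$ are hypothetical, and numpy is assumed available) constructs the intercept-plus-indicator matrix and shows that, because the indicator columns sum to the intercept column, $X$ has rank $p$ rather than $p + 1$.

```python
import numpy as np

# Hypothetical group sizes for p = 3 fertilizer levels.
sizes = [2, 3, 2]
p = len(sizes)
n = sum(sizes)

X = np.zeros((n, p + 1))
X[:, 0] = 1.0                        # column of ones for beta_0
row = 0
for j, nj in enumerate(sizes):
    X[row:row + nj, j + 1] = 1.0     # indicator column for level j + 1
    row += nj

print(X)
# The indicator columns sum to the intercept column, so rank(X) = p = 3,
# which is why such models need side conditions or generalized inverses.
rank = np.linalg.matrix_rank(X)
```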
-
In the two-way analysis of variance model, there are two covariates and the linear model is expressible as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \gamma_1 Z_1 + \gamma_2 Z_2 + \cdots + \gamma_q Z_q + \varepsilon,$$
where $X_1, X_2, \ldots, X_p$ denote, e.g., the $p$ levels of quantity of fertilizer, say 1 Kg., 2 Kg., ..., p Kg., and $Z_1, Z_2, \ldots, Z_q$ denote, e.g., the $q$ levels of irrigation, say 10 Cms., 20 Cms., ..., 10q Cms., etc. The levels $X_1, X_2, \ldots, X_p$ and $Z_1, Z_2, \ldots, Z_q$ are defined as counter variables indicating the presence or absence of the effect, as in the earlier case. If the effects of $X_1$ and $Z_1$ are present, i.e., 1 Kg. of fertilizer and 10 Cms. of irrigation are used, then the linear model is written as
$$Y = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \gamma_1 \cdot 1 + \gamma_2 \cdot 0 + \cdots + \gamma_q \cdot 0 + \varepsilon = \beta_0 + \beta_1 + \gamma_1 + \varepsilon.$$
If $X_2 = 1$ and $Z_2 = 1$ are used, then the model is
$$Y = \beta_0 + \beta_2 + \gamma_2 + \varepsilon.$$
The design matrix can be written accordingly, as in the one-way analysis of variance case.

In the three-way analysis of variance model,
$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \gamma_1 Z_1 + \cdots + \gamma_q Z_q + \delta_1 W_1 + \cdots + \delta_r W_r + \varepsilon.$$
-
The regression parameters $\beta_j$'s can be fixed or random.

- If all $\beta_j$'s are unknown constants, they are called the parameters of the model and the model is called a fixed-effects model, or model I. The objective in this case is to make inferences about the parameters and the error variance $\sigma^2$.
- If $x_{ij} = 1$ for some $j$ and for all $i = 1, 2, \ldots, n$, then the corresponding $\beta_j$ is termed an additive constant. In this case, $\beta_j$ occurs with every observation and so it is also called the general mean effect.
- If all $\beta_j$'s are observable random variables except the additive constant, then the linear model is termed a random-effects model, model II, or variance components model. The objective in this case is to make inferences about the variances of the $\beta_j$'s, i.e., $\sigma_{\beta_1}^2, \sigma_{\beta_2}^2, \ldots, \sigma_{\beta_p}^2$, the error variance $\sigma^2$, and/or certain functions of them.
- If some parameters are fixed and some are random variables, then the model is called a mixed-effects model, or model III. In a mixed-effects model, at least one $\beta_j$ is constant and at least one $\beta_j$ is a random variable. The objective is to make inferences about the fixed-effect parameters, the variances of the random effects, and the error variance $\sigma^2$.
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 5
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Analysis of variance

Analysis of variance is a body of statistical methods for analyzing measurements assumed to be structured as

y_i = β_1 x_{i1} + β_2 x_{i2} + ... + β_p x_{ip} + ε_i,  i = 1, 2, ..., n

where the x_{ij} are integers, generally 0 or 1, usually indicating the absence or presence of effects β_j, and the ε_i are assumed to be identically and independently distributed with mean 0 and variance σ². It may be noted that the ε_i can additionally be assumed to follow a normal distribution N(0, σ²). This is needed for the maximum likelihood estimation of parameters from the beginning of the analysis, whereas in least squares estimation it is needed only when conducting tests of hypothesis and constructing confidence intervals for the parameters. The least squares method does not require any distributional assumption, such as normality, up to the stage of estimation of parameters.

We need some basic concepts to develop the tools.

Least squares estimate of β

Let y_1, y_2, ..., y_n be a sample of observations on Y_1, Y_2, ..., Y_n. The least squares estimate of β is the value β̂ of β for which the sum of squares due to errors, i.e.,

S² = Σ_{i=1}^{n} ε_i² = ε'ε = (y − Xβ)'(y − Xβ) = y'y − 2β'X'y + β'X'Xβ
-
is minimum, where y = (y_1, y_2, ..., y_n)'. Differentiating S² with respect to β and setting the derivative to zero, the normal equations are obtained as

dS²/dβ = 2X'Xβ − 2X'y = 0   or   X'Xβ = X'y.

If X has full rank, then X'X has a unique inverse and the unique least squares estimate of β is

β̂ = (X'X)⁻¹X'y

which is the best linear unbiased estimator of β in the sense of having minimum variance in the class of linear and unbiased estimators. If the rank of X is not full, then a generalized inverse is used for finding the inverse of X'X.

If L'β is a linear parametric function, where L = (ℓ_1, ℓ_2, ..., ℓ_p)' is a non-null vector, then the least squares estimate of L'β is L'β̂.

A question arises: what are the conditions under which a linear parametric function L'β admits a unique least squares estimate in the general case? The concept of an estimable function is needed to find such conditions.
-
Estimable functions

A linear function L'β of the parameters, with L known, is said to be an estimable parametric function (or estimable) if there exists a linear function ℓ'Y of Y such that

E(ℓ'Y) = L'β   for all β.

Note that not all parametric functions are estimable. The following results will be useful in understanding the further topics.

Theorem 1
A linear parametric function L'β admits a unique least squares estimate if and only if L'β is estimable.

Theorem 2 (Gauss-Markov theorem)
If the linear parametric function L'β is estimable, then the linear estimator L'β̂, where β̂ is a solution of X'Xβ̂ = X'Y, is the best linear unbiased estimator of L'β in the sense of having minimum variance in the class of all linear and unbiased estimators of L'β.
-
Theorem 3
If the linear parametric functions φ_1 = ℓ_1'β, φ_2 = ℓ_2'β, ..., φ_k = ℓ_k'β are estimable, then any linear combination of φ_1, φ_2, ..., φ_k is also estimable.

Theorem 4
All linear parametric functions in β are estimable if and only if X has full rank.

If X is not of full rank, then some linear parametric functions do not admit unbiased linear estimators and nothing can be inferred about them. The linear parametric functions which are not estimable are said to be confounded. A possible solution to this problem is to add linear restrictions on β so as to reduce the linear model to full rank.

Theorem 5
Let L_1'β and L_2'β be two estimable parametric functions and let L_1'β̂ and L_2'β̂ be their least squares estimators. Then

Var(L_1'β̂) = σ² L_1'(X'X)⁻¹L_1
Cov(L_1'β̂, L_2'β̂) = σ² L_1'(X'X)⁻¹L_2

assuming that X is a full rank matrix. If not, the generalized inverse of X'X can be used in place of the unique inverse.
-
Estimator of σ² based on least squares estimation

Consider an estimator of σ² as

σ̂² = (1/(n − p)) (y − Xβ̂)'(y − Xβ̂)
   = (1/(n − p)) [y − X(X'X)⁻¹X'y]'[y − X(X'X)⁻¹X'y]
   = (1/(n − p)) y'[I − X(X'X)⁻¹X'][I − X(X'X)⁻¹X']y
   = (1/(n − p)) y'[I − X(X'X)⁻¹X']y

where [I − X(X'X)⁻¹X'], the complement of the hat matrix X(X'X)⁻¹X', is an idempotent matrix with trace

tr[I − X(X'X)⁻¹X'] = tr I_n − tr X(X'X)⁻¹X'
                   = n − tr (X'X)⁻¹X'X     (using the result tr(AB) = tr(BA))
                   = n − tr I_p
                   = n − p.

Note that, using E(y'Ay) = μ'Aμ + σ² tr A with μ = E(y) = Xβ, and [I − X(X'X)⁻¹X']X = 0, we have

E(σ̂²) = (σ²/(n − p)) tr[I − X(X'X)⁻¹X'] = σ²

and so σ̂² is an unbiased estimator of σ².
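The two facts used above, idempotency of I − X(X'X)⁻¹X' with trace n − p and the unbiasedness of σ̂², can be checked numerically. The design, the true σ², and the number of replications below are illustrative choices, not from the lectures.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma2 = 15, 3, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([2.0, 1.0, -1.0])

# M = I - X(X'X)^{-1}X' is idempotent with trace n - p.
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H
assert np.allclose(M @ M, M)           # idempotent
assert np.isclose(np.trace(M), n - p)  # trace equals n - p

# Average sigma2_hat = y'My/(n - p) over many simulated samples;
# the average should be close to the true sigma2 = 4.0.
reps = 20000
est = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    est[r] = (y @ M @ y) / (n - p)

print(est.mean())  # close to 4.0
```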
-
Maximum likelihood estimation

The least squares method does not use any distribution of the random variables in the estimation of parameters. We need the distributional assumption, in the case of least squares, only while constructing the tests of hypothesis and the confidence intervals. For maximum likelihood estimation, we need the distributional assumption from the beginning.

Suppose y_1, y_2, ..., y_n are independently distributed following a normal distribution with mean E(y_i) = Σ_{j=1}^{p} β_j x_{ij} and variance Var(y_i) = σ² (i = 1, 2, ..., n). Then the likelihood function of y = (y_1, y_2, ..., y_n)' is

L(y | β, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) (y − Xβ)'(y − Xβ) ].

Then

ln L(y | β, σ²) = −(n/2) ln(2π) − (n/2) ln σ² − (1/(2σ²)) (y − Xβ)'(y − Xβ).

Differentiating the log likelihood with respect to β and σ², we have

∂ln L/∂β = 0   ⇒   X'Xβ = X'y,
∂ln L/∂σ² = 0   ⇒   σ² = (1/n)(y − Xβ)'(y − Xβ).
-
Assuming the full rank of X, the normal equations are solved and the maximum likelihood estimators are obtained as

β̃ = (X'X)⁻¹X'y,
σ̃² = (1/n)(y − Xβ̃)'(y − Xβ̃) = (1/n) y'[I − X(X'X)⁻¹X']y.

The second-order differentiation conditions can be checked, and they are satisfied for β̃ and σ̃² to be the maximum likelihood estimators.

Note that the maximum likelihood estimator β̃ is the same as the least squares estimator β̂, and β̃ is an unbiased estimator of β, i.e., E(β̃) = β, like the least squares estimator; but σ̃² is not an unbiased estimator of σ², i.e., E(σ̃²) = ((n − p)/n) σ² ≠ σ², unlike the least squares estimator.

Now we use the following theorems for developing the tests of hypothesis.
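The relation E(σ̃²) = ((n − p)/n)σ² means that for any single sample the two variance estimators differ by exactly the factor (n − p)/n, since both are the same residual sum of squares divided by n and by n − p respectively. A small sketch with simulated data (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 12, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.ones(p) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
rss = float((y - X @ beta_hat) @ (y - X @ beta_hat))

sigma2_ls = rss / (n - p)   # unbiased least squares estimator
sigma2_ml = rss / n         # maximum likelihood estimator (biased)

# For any sample the ratio is exactly (n - p)/n.
print(sigma2_ml / sigma2_ls)
```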
Theorem 6
Let Y = (Y_1, Y_2, ..., Y_n)' follow a multivariate normal distribution N(μ, Σ) with mean vector μ and positive definite covariance matrix Σ. Then Y'AY follows a noncentral chi-square distribution with p degrees of freedom and noncentrality parameter μ'Aμ, i.e., χ²(p, μ'Aμ), if and only if AΣ is an idempotent matrix of rank p.

Theorem 7
Let Y = (Y_1, Y_2, ..., Y_n)' follow a multivariate normal distribution N(μ, Σ) with mean vector μ and positive definite covariance matrix Σ. Let Y'A_1Y follow χ²(p_1, μ'A_1μ) and Y'A_2Y follow χ²(p_2, μ'A_2μ). Then Y'A_1Y and Y'A_2Y are independently distributed if A_1ΣA_2 = 0.
-
Theorem 8
Let Y = (Y_1, Y_2, ..., Y_n)' follow a multivariate normal distribution N(Xβ, σ²I). Then the maximum likelihood (or least squares) estimator L'β̃ of the estimable linear parametric function L'β is independently distributed of σ̃²; L'β̃ follows N(L'β, σ² L'(X'X)⁻¹L), and nσ̃²/σ² follows χ²(n − p), where rank(X) = p.

Proof: Consider β̃ = (X'X)⁻¹X'Y; then

E(L'β̃) = L'(X'X)⁻¹X'E(Y) = L'(X'X)⁻¹X'Xβ = L'β,
Var(L'β̃) = L' Var(β̃) L = E[L'(β̃ − β)(β̃ − β)'L] = σ² L'(X'X)⁻¹X'X(X'X)⁻¹L = σ² L'(X'X)⁻¹L.

Since β̃ is a linear function of Y and L'β̃ is a linear function of β̃, L'β̃ follows a normal distribution N(L'β, σ² L'(X'X)⁻¹L).

Let A = I − X(X'X)⁻¹X' and B = L'(X'X)⁻¹X'; then

L'β̃ = L'(X'X)⁻¹X'Y = BY

and, since AX = 0,

nσ̃² = (Y − Xβ)'[I − X(X'X)⁻¹X'](Y − Xβ) = Y'AY.

So, using Theorem 6 with rank(A) = n − p, nσ̃²/σ² follows χ²(n − p). Also

BA = L'(X'X)⁻¹X' − L'(X'X)⁻¹X'X(X'X)⁻¹X' = 0.

So, using Theorem 7, Y'AY and BY are independently distributed.
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 6
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Tests of hypothesis in the linear regression model

First we discuss the development of the tests of hypothesis concerning the parameters of a linear regression model. These tests of hypothesis will be used later in the development of tests based on the analysis of variance.

Analysis of variance: The technique in the analysis of variance involves the breaking down of the total variation into orthogonal components. Each orthogonal component represents the variation due to a particular factor contributing to the total variation.

Model: Let Y_1, Y_2, ..., Y_n be independently distributed following a normal distribution with mean E(Y_i) = Σ_{j=1}^{p} β_j x_{ij} and variance σ². Denoting Y = (Y_1, Y_2, ..., Y_n)' as an n × 1 column vector, this assumption can be expressed in the form of a linear regression model

Y = Xβ + ε

where X is an n × p matrix, β is a p × 1 vector and ε is an n × 1 vector of disturbances with

E(ε) = 0,  Cov(ε) = σ²I,

and ε follows a normal distribution. This implies that

E(Y) = Xβ,  E[(Y − Xβ)(Y − Xβ)'] = σ²I.

Now we consider four different types of tests of hypothesis. In the first two cases, we develop the likelihood ratio test for the null hypothesis related to the analysis of variance. Note that later we will also derive the same test on the basis of the least squares principle. An important idea behind the development of this test is to demonstrate that the test used in the analysis of variance can be derived using the least squares principle as well as the likelihood ratio test.
-
Consider the null hypothesis H_0: β = β_0, where β_0 = (β_{10}, β_{20}, ..., β_{p0})' is specified and σ² is unknown.

Case 1: Test of H_0: β = β_0

This null hypothesis is equivalent to

H_0: β_1 = β_{10}, β_2 = β_{20}, ..., β_p = β_{p0}.

Assume that all β_i's are estimable, i.e., rank(X) = p (full column rank). We now develop the likelihood ratio test.

The (p + 1)-dimensional parametric space Ω is the collection of points (β, σ²) such that

Ω = { (β, σ²) : −∞ < β_i < ∞, σ² > 0, i = 1, 2, ..., p }.

Under H_0, all β_i's are known, say β_i = β_{i0}, and Ω reduces to the one-dimensional space given by

ω = { (β_0, σ²) : σ² > 0 }.

The likelihood function of y_1, y_2, ..., y_n is

L(y | β, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) (y − Xβ)'(y − Xβ) ].
-
The likelihood function is maximum over Ω when β and σ² are substituted with their maximum likelihood estimators, i.e.,

β̃ = (X'X)⁻¹X'y,
σ̃² = (1/n)(y − Xβ̃)'(y − Xβ̃).

Substituting β̃ and σ̃² in L(y | β, σ²) gives

Max_Ω L(y | β, σ²) = (1/(2πσ̃²))^{n/2} exp(−n/2)
                   = [ n / (2π (y − Xβ̃)'(y − Xβ̃)) ]^{n/2} exp(−n/2).

Under H_0, the maximum likelihood estimator of σ² is

σ̃_ω² = (1/n)(y − Xβ_0)'(y − Xβ_0).

The maximum value of the likelihood function under H_0 is

Max_ω L(y | β, σ²) = (1/(2πσ̃_ω²))^{n/2} exp(−n/2)
                   = [ n / (2π (y − Xβ_0)'(y − Xβ_0)) ]^{n/2} exp(−n/2).
-
The likelihood ratio test statistic is

λ = Max_ω L(y | β, σ²) / Max_Ω L(y | β, σ²)
  = [ (y − Xβ̃)'(y − Xβ̃) / (y − Xβ_0)'(y − Xβ_0) ]^{n/2}.

Writing

(y − Xβ_0)'(y − Xβ_0) = (y − Xβ̃ + Xβ̃ − Xβ_0)'(y − Xβ̃ + Xβ̃ − Xβ_0)
                      = (y − Xβ̃)'(y − Xβ̃) + (β̃ − β_0)'X'X(β̃ − β_0)

(the cross-product terms vanish since the residual y − Xβ̃ is orthogonal to the columns of X), we get

λ = [ (y − Xβ̃)'(y − Xβ̃) / { (y − Xβ̃)'(y − Xβ̃) + (β̃ − β_0)'X'X(β̃ − β_0) } ]^{n/2}
  = [ 1 / (1 + q_1/q_2) ]^{n/2}

where

q_1 = (β̃ − β_0)'X'X(β̃ − β_0)   and   q_2 = (y − Xβ̃)'(y − Xβ̃).

The expressions for q_1 and q_2 can be further simplified as follows.
-
Consider

q_1 = (β̃ − β_0)'X'X(β̃ − β_0)
    = [(X'X)⁻¹X'y − β_0]'X'X[(X'X)⁻¹X'y − β_0]
    = [(X'X)⁻¹X'(y − Xβ_0)]'X'X[(X'X)⁻¹X'(y − Xβ_0)]
    = (y − Xβ_0)'X(X'X)⁻¹X'X(X'X)⁻¹X'(y − Xβ_0)
    = (y − Xβ_0)'X(X'X)⁻¹X'(y − Xβ_0)

and

q_2 = (y − Xβ̃)'(y − Xβ̃)
    = [y − X(X'X)⁻¹X'y]'[y − X(X'X)⁻¹X'y]
    = y'[I − X(X'X)⁻¹X']y
    = [(y − Xβ_0) + Xβ_0]'[I − X(X'X)⁻¹X'][(y − Xβ_0) + Xβ_0]
    = (y − Xβ_0)'[I − X(X'X)⁻¹X'](y − Xβ_0).

The other two terms become zero using [I − X(X'X)⁻¹X']X = 0.
-
In order to find the decision rule for H_0 based on λ, first we need to find out whether λ is a monotonic increasing or decreasing function of q_1/q_2. We proceed as follows.

Let

g = q_1/q_2,  so that  λ = (1 + q_1/q_2)^{−n/2} = (1 + g)^{−n/2};

then

dλ/dg = −(n/2)(1 + g)^{−(n/2 + 1)} < 0.

So as g increases, λ decreases: λ is a monotonic decreasing function of q_1/q_2.

The decision rule is to reject H_0 if λ ≤ λ_0, where λ_0 is a constant to be determined on the basis of the size α of the test. Let us simplify this in our context:

λ ≤ λ_0
or  (1 + g)^{−n/2} ≤ λ_0
or  1 + g ≥ λ_0^{−2/n}
or  g ≥ λ_0^{−2/n} − 1
or  g ≥ C

where C is a constant to be determined by the size condition of the test.
-
So reject H_0 whenever

q_1/q_2 ≥ C.

Note that the statistic q_1/q_2 can also be obtained by the least squares method, as follows (the least squares methodology will also be discussed in further lectures):

q_1 = Min_{H_0}(y − Xβ)'(y − Xβ) − Min_β(y − Xβ)'(y − Xβ) : sum of squares due to the deviation from H_0,
q_2 = Min_β(y − Xβ)'(y − Xβ) : sum of squares due to error,
q_1 + q_2 = (y − Xβ_0)'(y − Xβ_0) : total sum of squares.
-
Theorem 9

Let

Z = Y − Xβ_0,
Q_1 = Z'X(X'X)⁻¹X'Z,
Q_2 = Z'[I − X(X'X)⁻¹X']Z.

Then Q_1/σ² and Q_2/σ² are independently distributed. Further, when H_0 is true, Q_1/σ² ~ χ²(p) and Q_2/σ² ~ χ²(n − p), where χ²(m) denotes the χ² distribution with m degrees of freedom.

Proof: Under H_0,

E(Z) = Xβ − Xβ_0 = 0,
Var(Z) = Var(Y) = σ²I.

Further, Z is a linear function of Y, and Y follows a normal distribution, so Z ~ N(0, σ²I).

The matrices X(X'X)⁻¹X' and [I − X(X'X)⁻¹X'] are idempotent, so

tr[X(X'X)⁻¹X'] = tr[(X'X)⁻¹X'X] = tr I_p = p,
tr[I − X(X'X)⁻¹X'] = tr I_n − tr[X(X'X)⁻¹X'] = n − p.

So, using Theorem 6, we can write that under H_0, Q_1/σ² ~ χ²(p) and Q_2/σ² ~ χ²(n − p), where the degrees of freedom p and (n − p) are obtained as the traces of X(X'X)⁻¹X' and [I − X(X'X)⁻¹X'], respectively.

Since [I − X(X'X)⁻¹X'] X(X'X)⁻¹X' = 0, using Theorem 7 the quadratic forms Q_1 and Q_2 are independent under H_0. Hence the theorem is proved.
-
Since Q_1 and Q_2 are independently distributed, under H_0

(Q_1/p) / (Q_2/(n − p)) = ((n − p)/p) (Q_1/Q_2) ~ F(p, n − p),

a central F-distribution with p and (n − p) degrees of freedom. Hence the constant C in the likelihood ratio test is given by

C = F_{1−α}(p, n − p),

i.e., H_0 is rejected whenever (q_1/p)/(q_2/(n − p)) ≥ F_{1−α}(p, n − p), where F_{1−α}(n_1, n_2) denotes the upper 100α% point of the F-distribution with n_1 and n_2 degrees of freedom.

The computations of this test of hypothesis can be represented in the form of an analysis of variance table.

ANOVA table for testing H_0: β = β_0

Source of variation | Degrees of freedom | Sum of squares            | Mean squares | F-value
Due to β            | p                  | q_1                       | q_1/p        | ((n−p)/p)(q_1/q_2)
Error               | n − p              | q_2                       | q_2/(n−p)    |
Total               | n                  | (y − Xβ_0)'(y − Xβ_0)     |              |
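The whole Case 1 computation can be sketched end to end. The data below are simulated under H_0 (design, β_0, and sample size are illustrative); the code forms q_1 and q_2, the F-ratio, and its p-value, and checks the decomposition q_1 + q_2 = (y − Xβ_0)'(y − Xβ_0) from the table's Total row.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 30, 3
X = rng.normal(size=(n, p))
beta0 = np.array([1.0, 0.0, 2.0])        # hypothesised value under H0
y = X @ beta0 + rng.normal(size=n)       # data generated under H0

beta_tilde = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_tilde
q2 = float(resid @ resid)                # error sum of squares
d = beta_tilde - beta0
q1 = float(d @ (X.T @ X) @ d)            # sum of squares due to deviation from H0

F = (q1 / p) / (q2 / (n - p))
p_value = stats.f.sf(F, p, n - p)
reject = F >= stats.f.ppf(0.95, p, n - p)
print(F, p_value, reject)
```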
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 7
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Case 2: Test of a subset of parameters when the remaining parameters and σ² are unknown

Partition β = (β_{(1)}', β_{(2)}')', where β_{(1)} = (β_1, β_2, ..., β_r)' and β_{(2)} = (β_{r+1}, ..., β_p)', and partition X = (X_1, X_2) conformably, so that Xβ = X_1β_{(1)} + X_2β_{(2)}. The null hypothesis of interest is

H_0: β_{(1)} = β_{(1)}^0,

with β_{(1)}^0 specified, when β_{(2)} and σ² are unknown.

The likelihood function is

L(y | β, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) (y − Xβ)'(y − Xβ) ].

The maximum value of the likelihood function over the whole parametric space is obtained by substituting the maximum likelihood estimates of β and σ², i.e.,

β̃ = (X'X)⁻¹X'y,
σ̃² = (1/n)(y − Xβ̃)'(y − Xβ̃),

as

Max_Ω L(y | β, σ²) = (1/(2πσ̃²))^{n/2} exp(−n/2)
                   = [ n / (2π (y − Xβ̃)'(y − Xβ̃)) ]^{n/2} exp(−n/2).
-
Now we find the maximum value of the likelihood function under H_0. The model under H_0 becomes

Y = X_1β_{(1)}^0 + X_2β_{(2)} + ε.

The likelihood function under H_0 is

L(y | β_{(2)}, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) (y − X_1β_{(1)}^0 − X_2β_{(2)})'(y − X_1β_{(1)}^0 − X_2β_{(2)}) ]
                  = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) (y* − X_2β_{(2)})'(y* − X_2β_{(2)}) ]

where y* = y − X_1β_{(1)}^0. Note that β_{(2)} and σ² are the unknown parameters. This likelihood function looks as if it were written for y* ~ N(X_2β_{(2)}, σ²I). This helps in writing the maximum likelihood estimators of β_{(2)} and σ² directly as

β̃_{(2)} = (X_2'X_2)⁻¹X_2'y*,
σ̃_ω² = (1/n)(y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}).

Note that X_2'X_2 is a principal minor of X'X. Since X'X is a positive definite matrix, X_2'X_2 is also positive definite. Thus (X_2'X_2)⁻¹ exists and is unique.

Thus the maximum value of the likelihood function under H_0 is obtained as

Max_ω L(y* | β_{(2)}, σ²) = (1/(2πσ̃_ω²))^{n/2} exp(−n/2)
                          = [ n / (2π (y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)})) ]^{n/2} exp(−n/2).
-
The likelihood ratio test statistic for H_0: β_{(1)} = β_{(1)}^0 is

λ = Max_ω L(y | β, σ²) / Max_Ω L(y | β, σ²)
  = [ (y − Xβ̃)'(y − Xβ̃) / (y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}) ]^{n/2}
  = [ (y − Xβ̃)'(y − Xβ̃) / { (y − Xβ̃)'(y − Xβ̃) + (y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}) − (y − Xβ̃)'(y − Xβ̃) } ]^{n/2}
  = [ 1 / (1 + q_1/q_2) ]^{n/2}

where

q_1 = (y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}) − (y − Xβ̃)'(y − Xβ̃)   and   q_2 = (y − Xβ̃)'(y − Xβ̃).

Now we simplify q_1 and q_2. Consider

(y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}) = [y* − X_2(X_2'X_2)⁻¹X_2'y*]'[y* − X_2(X_2'X_2)⁻¹X_2'y*]
                                     = y*'[I − X_2(X_2'X_2)⁻¹X_2']y*
                                     = (y − X_1β_{(1)}^0 − X_2β_{(2)})'[I − X_2(X_2'X_2)⁻¹X_2'](y − X_1β_{(1)}^0 − X_2β_{(2)}).
-
The other terms become zero using the result X_2'[I − X_2(X_2'X_2)⁻¹X_2'] = 0.

Consider

(y − Xβ̃)'(y − Xβ̃) = [y − X(X'X)⁻¹X'y]'[y − X(X'X)⁻¹X'y]
                   = y'[I − X(X'X)⁻¹X']y
                   = (y − X_1β_{(1)} − X_2β_{(2)})'[I − X(X'X)⁻¹X'](y − X_1β_{(1)} − X_2β_{(2)}),

other terms becoming zero using the result X'[I − X(X'X)⁻¹X'] = 0. Note that under H_0, the term X_1β_{(1)} + X_2β_{(2)} can be expressed as X_1β_{(1)}^0 + X_2β_{(2)}. Thus

q_1 = (y* − X_2β̃_{(2)})'(y* − X_2β̃_{(2)}) − (y − Xβ̃)'(y − Xβ̃)
    = y*'[I − X_2(X_2'X_2)⁻¹X_2']y* − y'[I − X(X'X)⁻¹X']y
    = (y − X_1β_{(1)}^0 − X_2β_{(2)})'[X(X'X)⁻¹X' − X_2(X_2'X_2)⁻¹X_2'](y − X_1β_{(1)}^0 − X_2β_{(2)}),

q_2 = (y − Xβ̃)'(y − Xβ̃)
    = y'[I − X(X'X)⁻¹X']y
    = (y − X_1β_{(1)}^0 − X_2β_{(2)})'[I − X(X'X)⁻¹X'](y − X_1β_{(1)}^0 − X_2β_{(2)}),

other terms becoming zero. Note that in simplifying q_1 and q_2, we have written them as quadratic forms in the same variable (y − X_1β_{(1)}^0 − X_2β_{(2)}).
-
Using the same argument as in Case 1, since λ is a monotonic decreasing function of q_1/q_2, the likelihood ratio test rejects H_0 whenever

q_1/q_2 > C

where C is a constant to be determined by the size α of the test.

The likelihood ratio test statistic can also be obtained through the least squares method as follows:

q_1 + q_2 : minimum value of (y − Xβ)'(y − Xβ) when H_0: β_{(1)} = β_{(1)}^0 holds true, i.e., the sum of squares due to H_0;
q_2 : sum of squares due to error;
q_1 : sum of squares due to the deviation from H_0, or the sum of squares due to β_{(1)} adjusted for β_{(2)}.

If β_{(1)}^0 = 0, then

q_1 + q_2 = (y − X_2β̃_{(2)})'(y − X_2β̃_{(2)})
          = y'y − 2β̃_{(2)}'X_2'y + β̃_{(2)}'X_2'X_2β̃_{(2)}
          = y'y − β̃_{(2)}'X_2'y,

where β̃_{(2)}'X_2'y is the reduction in the sum of squares due to β_{(2)}, i.e., the sum of squares due to β_{(2)} ignoring β_{(1)}.
-
Now we have the following theorem, based on Theorems 6 and 7.

Theorem 10

Let

Z = Y − X_1β_{(1)}^0 − X_2β_{(2)},
Q_1 = Z'AZ,  A = X(X'X)⁻¹X' − X_2(X_2'X_2)⁻¹X_2',
Q_2 = Z'BZ,  B = I − X(X'X)⁻¹X'.

Then Q_1/σ² and Q_2/σ² are independently distributed. Further, Q_1/σ² ~ χ²(r) and Q_2/σ² ~ χ²(n − p).

Thus, under H_0,

(Q_1/r) / (Q_2/(n − p)) = ((n − p)/r)(Q_1/Q_2)

follows an F-distribution F(r, n − p). Hence the constant C in q_1/q_2 > C is

C = F_{1−α}(r, n − p)

where F_{1−α}(r, n − p) denotes the upper 100α% point of the F-distribution with r and (n − p) degrees of freedom.
-
The analysis of variance table for this null hypothesis is as follows:

ANOVA for testing H_0: β_{(1)} = β_{(1)}^0

Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-value
Due to β_{(1)}      | r                  | q_1            | q_1/r        | ((n−p)/r)(q_1/q_2)
Error               | n − p              | q_2            | q_2/(n−p)    |
Total               | n − (p − r)        | q_1 + q_2      |              |
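The Case 2 test reduces to comparing the residual sum of squares of the restricted fit (only X_2, after shifting y by X_1β_{(1)}^0) with that of the full fit. A minimal sketch with simulated data (design, split r, and the hypothesised zero value are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, r, p = 40, 2, 5                       # test the first r of p coefficients
X = rng.normal(size=(n, p))
X1, X2 = X[:, :r], X[:, r:]
beta = np.array([0.0, 0.0, 1.0, -1.0, 0.5])
y = X @ beta + rng.normal(size=n)

b10 = np.zeros(r)                        # H0: beta_(1) = 0
y_star = y - X1 @ b10                    # y* = y - X1 beta_(1)^0

def rss(M, v):
    """Residual sum of squares of v regressed on M."""
    bhat = np.linalg.solve(M.T @ M, M.T @ v)
    e = v - M @ bhat
    return float(e @ e)

q2 = rss(X, y)                           # full-model error sum of squares
q1 = rss(X2, y_star) - q2                # deviation from H0, adjusted for beta_(2)
F = (q1 / r) / (q2 / (n - p))
p_value = stats.f.sf(F, r, n - p)
print(F, p_value)
```

q_1 is nonnegative here because the column space of X_2 is contained in that of X, so the restricted residual sum of squares can never fall below the full one.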
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 8
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Case 3: Test of H_0: L'β = δ

Let us consider the test of hypothesis related to a linear parametric function. Assume that the linear parametric function L'β is estimable, where L = (ℓ_1, ℓ_2, ..., ℓ_p)' is a p × 1 vector of known constants and β = (β_1, β_2, ..., β_p)'. The null hypothesis of interest is

H_0: L'β = δ

where δ is some specified constant.

Consider the setup of the linear model Y = Xβ + ε, where Y = (Y_1, Y_2, ..., Y_n)' follows N(Xβ, σ²I). The maximum likelihood estimators of β and σ² are

β̃ = (X'X)⁻¹X'y   and   σ̃² = (1/n)(y − Xβ̃)'(y − Xβ̃),

respectively. The maximum likelihood estimate of the estimable function L'β is L'β̃, with

E(L'β̃) = L'β,
Cov(L'β̃) = σ² L'(X'X)⁻¹L,
L'β̃ ~ N(L'β, σ² L'(X'X)⁻¹L)   and   nσ̃²/σ² ~ χ²(n − p),

assuming X to be a full column rank matrix. Further, L'β̃ and nσ̃²/σ² are also independently distributed.

Under H_0: L'β = δ, the statistic

t = √(n − p) (L'β̃ − δ) / √( nσ̃² L'(X'X)⁻¹L )

follows a t-distribution with (n − p) degrees of freedom. So the test for H_0: L'β = δ against H_1: L'β ≠ δ rejects H_0 whenever

|t| ≥ t_{1−α/2}(n − p)

where t_{1−α}(n_1) denotes the upper 100α% point of the t-distribution with n_1 degrees of freedom.
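The t-statistic above can be sketched for a single contrast. The data, the contrast L, and the hypothesised value δ below are illustrative; note that √(n − p)/√(nσ̃²·...) is the same as using the unbiased s² = q_2/(n − p) in the denominator, which is what the code does.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, p = 25, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, 2.0, 3.0])
y = X @ beta + rng.normal(size=n)

L = np.array([1.0, -1.0, 0.0])          # contrast beta_1 - beta_2
delta = -1.0                            # H0: L'beta = -1 (true for this beta)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = float(resid @ resid) / (n - p)     # unbiased estimator of sigma^2

t = (L @ beta_hat - delta) / np.sqrt(s2 * L @ XtX_inv @ L)
p_value = 2 * stats.t.sf(abs(t), n - p)
print(t, p_value)
```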
-
Case 4: Test of H_0: φ_1 = δ_1, φ_2 = δ_2, ..., φ_k = δ_k

Now we develop the test of hypothesis related to more than one linear parametric function. Let the i-th estimable linear parametric function be φ_i = L_i'β, with k such functions, where L_i and β are both p × 1 vectors as in Case 3. Our interest is to test the hypothesis

H_0: φ_1 = δ_1, φ_2 = δ_2, ..., φ_k = δ_k

where δ_1, δ_2, ..., δ_k are known constants. Let φ = (φ_1, φ_2, ..., φ_k)' and δ = (δ_1, δ_2, ..., δ_k)'. Then H_0 is expressible as

H_0: φ = Lβ = δ

where L is a k × p matrix of constants associated with L_1, L_2, ..., L_k.

The maximum likelihood estimator of φ_i is φ̂_i = L_i'β̃, and φ̂ = (φ̂_1, φ̂_2, ..., φ̂_k)' = Lβ̃. Also

E(φ̂) = φ,  Cov(φ̂) = σ²V,

where V = ((v_{ij})), with v_{ij} = L_i'(X'X)⁻¹L_j as the (i, j)-th element of V. Thus

(φ̂ − φ)'V⁻¹(φ̂ − φ)/σ²

follows a χ² distribution with k degrees of freedom, and nσ̃²/σ² follows a χ² distribution with (n − p) degrees of freedom, where

σ̃² = (1/n)(y − Xβ̃)'(y − Xβ̃)

is the maximum likelihood estimator of σ². Further, (φ̂ − φ)'V⁻¹(φ̂ − φ)/σ² and nσ̃²/σ² are also independently distributed.

Thus, under H_0: φ = δ,

[ (φ̂ − δ)'V⁻¹(φ̂ − δ)/k ] / [ nσ̃²/(n − p) ] = ((n − p)/k) · (φ̂ − δ)'V⁻¹(φ̂ − δ) / (nσ̃²)

follows the F-distribution with k and (n − p) degrees of freedom. So the hypothesis H_0: φ = δ is rejected against H_1: at least one φ_i ≠ δ_i, i = 1, 2, ..., k, whenever

F ≥ F_{1−α}(k, n − p)

where F_{1−α}(k, n − p) denotes the upper 100α% point of the F-distribution with k and (n − p) degrees of freedom.
-
One-way classification with fixed-effect linear models of full rank

The objective in the one-way classification is to test the hypothesis about the equality of means on the basis of several samples which have been drawn from univariate normal populations with different means but the same variance.

Let there be p univariate normal populations, and let samples of different sizes be drawn from each population. Let y_{ij} (j = 1, 2, ..., n_i) be a random sample from the i-th normal population with mean β_i and variance σ², i = 1, 2, ..., p, i.e.,

Y_{ij} ~ N(β_i, σ²),  j = 1, 2, ..., n_i;  i = 1, 2, ..., p.

The random samples from different populations are assumed to be independent of each other. These observations follow the setup of the linear model

Y = Xβ + ε

where

Y = (Y_{11}, Y_{12}, ..., Y_{1n_1}, Y_{21}, ..., Y_{2n_2}, ..., Y_{p1}, ..., Y_{pn_p})',
y = (y_{11}, y_{12}, ..., y_{1n_1}, y_{21}, ..., y_{2n_2}, ..., y_{p1}, ..., y_{pn_p})',
β = (β_1, β_2, ..., β_p)',
ε = (ε_{11}, ε_{12}, ..., ε_{1n_1}, ε_{21}, ..., ε_{2n_2}, ..., ε_{p1}, ..., ε_{pn_p})'.
-
X = [ 1  0  ...  0 ]  (n_1 rows)
    [ 0  1  ...  0 ]  (n_2 rows)
    [ ...           ]
    [ 0  0  ...  1 ]  (n_p rows)

with

x_{ij} = 1 if the effect β_j is present in the i-th observation,
x_{ij} = 0 if the effect β_j is absent.

So X is a matrix of order n × p, where p is fixed and n = Σ_{i=1}^{p} n_i. The first n_1 rows of X are (1, 0, 0, ..., 0), the next n_2 rows are (0, 1, 0, ..., 0), and similarly the last n_p rows are (0, 0, ..., 0, 1).

Obviously, rank(X) = p, E(Y) = Xβ and Cov(Y) = σ²I.

This completes the representation of a fixed-effect linear model of full rank.
-
The null hypothesis of interest is

H_0: β_1 = β_2 = ... = β_p = β (say)

against

H_1: at least one β_i ≠ β_j (i ≠ j),

where β and σ² are unknown.

We develop here the likelihood ratio test. It may be noted that the same test can also be derived through the least squares method; this will be demonstrated in the next module. This way the readers will understand both the methods.

We have already developed the likelihood ratio test for a hypothesis of this form in Case 1. The whole parametric space Ω is a (p + 1)-dimensional space

Ω = { (β, σ²) : −∞ < β_i < ∞, σ² > 0, i = 1, 2, ..., p }.

Note that there are (p + 1) parameters: β_1, β_2, ..., β_p and σ². Under H_0, Ω reduces to the two-dimensional space

ω = { (β, σ²) : −∞ < β < ∞, σ² > 0 }.

The likelihood function under Ω is

L(y | β, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − β_i)² ],

ln L(y | β, σ²) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − β_i)².

Setting ∂ln L/∂β_i = 0 and ∂ln L/∂σ² = 0, the maximum likelihood estimates are obtained as

β̃_i = (1/n_i) Σ_{j=1}^{n_i} y_{ij} = ȳ_{io},
σ̃² = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})².
-
The dot sign in ȳ_{io} indicates that the average has been taken over the second subscript j. The Hessian matrix of second-order partial derivatives of ln L with respect to β_i and σ² is negative definite at β_i = ȳ_{io} and σ² = σ̃², which ensures that the likelihood function is maximized at these values.

Thus the maximum value of L(y | β, σ²) over Ω is

Max_Ω L(y | β, σ²) = (1/(2πσ̃²))^{n/2} exp(−n/2)
                   = [ n / (2π Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})²) ]^{n/2} exp(−n/2).
The likelihood function under ω is

L(y | β, σ²) = (1/(2πσ²))^{n/2} exp[ −(1/(2σ²)) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − β)² ]

and

ln L(y | β, σ²) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − β)².

The normal equations and the estimates are obtained as follows:

∂ln L(y | β, σ²)/∂β = 0   ⇒   β̂ = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} y_{ij} = ȳ_{oo},
∂ln L(y | β, σ²)/∂σ² = 0   ⇒   σ̂² = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{oo})².
-
The maximum value of the likelihood function over ω under H_0 is

Max_ω L(y | β, σ²) = (1/(2πσ̂²))^{n/2} exp(−n/2)
                   = [ n / (2π Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{oo})²) ]^{n/2} exp(−n/2).

The likelihood ratio test statistic is

λ = Max_ω L(y | β, σ²) / Max_Ω L(y | β, σ²)
  = [ Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})² / Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{oo})² ]^{n/2}.

We have

Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{oo})² = Σ_{i=1}^{p} Σ_{j=1}^{n_i} [ (y_{ij} − ȳ_{io}) + (ȳ_{io} − ȳ_{oo}) ]²
                                            = Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})² + Σ_{i=1}^{p} n_i (ȳ_{io} − ȳ_{oo})².
-
Thus

λ^{2/n} = Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})² / [ Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})² + Σ_{i=1}^{p} n_i (ȳ_{io} − ȳ_{oo})² ]
        = 1 / (1 + q_1/q_2)

where

q_1 = Σ_{i=1}^{p} n_i (ȳ_{io} − ȳ_{oo})²   and   q_2 = Σ_{i=1}^{p} Σ_{j=1}^{n_i} (y_{ij} − ȳ_{io})².

Note that if the least squares principle is used, then

q_1 : sum of squares due to deviations from H_0, or the between-population sum of squares,
q_2 : sum of squares due to error, or the within-population sum of squares,
q_1 + q_2 : total sum of squares.

Using Theorems 6 and 7, let

Q_1 = Σ_{i=1}^{p} n_i (Ȳ_{io} − Ȳ_{oo})²,
Q_2 = Σ_{i=1}^{p} S_i²,

where

S_i² = Σ_{j=1}^{n_i} (Y_{ij} − Ȳ_{io})²,  Ȳ_{io} = (1/n_i) Σ_{j=1}^{n_i} Y_{ij},  Ȳ_{oo} = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} Y_{ij};
-
then under H_0,

Q_1/σ² ~ χ²(p − 1),
Q_2/σ² ~ χ²(n − p),

and Q_1/σ² and Q_2/σ² are independently distributed. Thus under H_0,

(Q_1/(p − 1)) / (Q_2/(n − p)) ~ F(p − 1, n − p).

The likelihood ratio test rejects H_0 whenever

q_1/q_2 > C

where the constant C is obtained from F_{1−α}(p − 1, n − p), i.e., H_0 is rejected when (q_1/(p − 1))/(q_2/(n − p)) ≥ F_{1−α}(p − 1, n − p).
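The one-way F-test above can be sketched directly from the q_1 and q_2 formulas. The sample values below are made up for illustration; the result is cross-checked against scipy's built-in one-way ANOVA, which computes the same statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical samples from p = 3 populations (unequal sizes allowed).
samples = [
    np.array([5.1, 4.9, 5.3, 5.0]),
    np.array([5.6, 5.8, 5.5]),
    np.array([4.2, 4.4, 4.1, 4.3, 4.0]),
]
p = len(samples)
n = sum(len(s) for s in samples)
grand_mean = np.concatenate(samples).mean()

q1 = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)  # between
q2 = sum(((s - s.mean()) ** 2).sum() for s in samples)            # within

F = (q1 / (p - 1)) / (q2 / (n - p))
p_value = stats.f.sf(F, p - 1, n - p)
print(F, p_value)

# Cross-check against scipy's one-way ANOVA.
F_sp, p_sp = stats.f_oneway(*samples)
```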
-
The analysis of variance table for the one-way classification in the fixed-effect model is

Source of variation | Degrees of freedom | Sum of squares | Mean squares  | F-value
Between populations | p − 1              | q_1            | q_1/(p − 1)   | ((n−p)/(p−1))(q_1/q_2)
Within populations  | n − p              | q_2            | q_2/(n − p)   |
Total               | n − 1              | q_1 + q_2      |               |

Note that

E[ Q_2/(n − p) ] = σ²,
E[ Q_1/(p − 1) ] = σ² + (1/(p − 1)) Σ_{i=1}^{p} n_i (β_i − β̄)²,  where β̄ = (1/p) Σ_{i=1}^{p} β_i.
-
Analysis of Variance and Design of Experiments - I
MODULE - II
LECTURE - 9
GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
-
Case of rejection of $H_0$

If $F > F_{1-\alpha}(p-1,\, n-p)$, then $H_0: \beta_1 = \beta_2 = \ldots = \beta_p$ is rejected. This means that at least one $\beta_i$ is different from the other effects and is responsible for the rejection. So the objective is to investigate and find such $\beta_i$'s, and to divide the populations into groups such that the means of the populations within a group are the same. This can be done by pairwise testing of the $\beta_i$'s.

Test $H_0: \beta_i = \beta_k\ (i \neq k)$ against $H_1: \beta_i \neq \beta_k$.

This can be tested using the following t-statistic:

$$t = \frac{\bar{Y}_{io} - \bar{Y}_{ko}}{\sqrt{s^2\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}}$$

which follows the t-distribution with $(n-p)$ degrees of freedom under $H_0$, where $s^2 = \frac{q_2}{n-p}$.

Thus the decision rule is to reject $H_0$ at level $\alpha$ if the observed difference satisfies

$$|\bar{y}_{io} - \bar{y}_{ko}| > t_{1-\frac{\alpha}{2},\, n-p}\, \sqrt{s^2\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}.$$

The quantity $t_{1-\frac{\alpha}{2},\, n-p}\, \sqrt{s^2\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}$ is called the critical difference.
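The critical difference can be sketched as a small helper. This assumes the table value $t_{1-\alpha/2,\, n-p}$ is supplied by the caller, since it comes from the t tables (the function name is hypothetical):

```python
import math

def critical_difference(s2, n_i, n_k, t_crit):
    """Critical difference for testing H0: beta_i = beta_k.

    s2     : q2 / (n - p), the within-population mean square
    t_crit : the table value t_{1 - alpha/2, n - p}
    """
    return t_crit * math.sqrt(s2 * (1.0 / n_i + 1.0 / n_k))
```

The pair $(\beta_i, \beta_k)$ is declared different when the observed $|\bar{y}_{io} - \bar{y}_{ko}|$ exceeds this value.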
-
Thus the following steps are followed:

1. Compute all possible critical differences arising out of all possible pairs $(\beta_i, \beta_k)$, $i \neq k = 1, 2, \ldots, p$.
2. Compare them with the corresponding observed differences.
3. Divide the p populations into different groups such that the populations in the same group have the same means.

The computation is simplified if $n_i = n$ for all i. In such a case, the common critical difference (CCD) is

$$CCD = t_{1-\frac{\alpha}{2},\, n-p}\, \sqrt{\frac{2s^2}{n}}$$

and the observed differences $|\bar{y}_{io} - \bar{y}_{ko}|$ $(i \neq k)$ are compared with the CCD.

If $|\bar{y}_{io} - \bar{y}_{ko}| > CCD$, then the corresponding effects/means $\bar{y}_{io}$ and $\bar{y}_{ko}$ come from populations with different means.
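Under equal group sizes the comparison reduces to a single cutoff. A sketch of the CCD and of the pairwise screening step (both helper names are hypothetical):

```python
import math

def ccd(s2, n, t_crit):
    """Common critical difference when every group has the same size n."""
    return t_crit * math.sqrt(2.0 * s2 / n)

def differing_pairs(means, cutoff):
    """Index pairs (i, k) whose observed means differ by more than the cutoff."""
    return [(i, k)
            for i in range(len(means))
            for k in range(i + 1, len(means))
            if abs(means[i] - means[k]) > cutoff]
```

For example, `differing_pairs([6.0, 9.5, 5.0], 2.0)` flags the pairs (0, 1) and (1, 2), whose mean differences 3.5 and 4.5 exceed the cutoff 2.0.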
-
Note: In general, if there are three effects $\beta_1, \beta_2, \beta_3$, then

if $H_{01}: \beta_1 = \beta_2$ is accepted (denote this as event A)
and if $H_{02}: \beta_2 = \beta_3$ is accepted (denote this as event B),
then $H_{03}: \beta_1 = \beta_3$ (denote this as event C) will be accepted.

The question arises here: in what sense do we conclude such a statement about the acceptance of $H_{03}$?

The reason is as follows. Since the event $A \cap B \subset C$,

$$P(A \cap B) \leq P(C).$$

In this sense, the probability of the event C is higher than that of the intersection of the events, i.e., the probability that $H_{03}$ is accepted is higher than the probability of acceptance of both $H_{01}$ and $H_{02}$. So we conclude, in general, that the acceptance of $H_{01}$ and $H_{02}$ implies the acceptance of $H_{03}$.
-
Multiple comparison tests

One interest in the analysis of variance is to decide whether the population means are equal or not. If the hypothesis of equal means is rejected, one would like to divide the populations into subgroups such that all populations with the same means come to the same subgroup. This can be achieved by multiple comparison tests.

A multiple comparison test procedure conducts the test of hypothesis for all pairs of effects and compares them at a significance level $\alpha$, i.e., it works on a per-comparison basis.

This is based mainly on the t-statistic. If we want to ensure that the significance level $\alpha$ holds simultaneously for all group comparisons of interest, the appropriate multiple test procedure is one that controls the error rate on a per-experiment basis.

There are various multiple comparison tests available. We will discuss some of them in the context of one-way classification. In two-way or higher classification, they can be used on similar lines.
-
1. Studentized range test

It is assumed in the Studentized range test that the p samples, each of size n, have been drawn from p normal populations. Let their sample means be $\bar{y}_{1o}, \bar{y}_{2o}, \ldots, \bar{y}_{po}$. These means are ranked and arranged in ascending order as $y_1^*, y_2^*, \ldots, y_p^*$, where $y_1^* = \min_i \bar{y}_{io}$ and $y_p^* = \max_i \bar{y}_{io}$, $i = 1, 2, \ldots, p$.

Find the range as $R = y_p^* - y_1^*$.

The Studentized range is defined as

$$q_{p,\, n-p} = \frac{R\sqrt{n}}{s}$$

where $q_{p,\gamma,\alpha}$ is the upper $100\alpha\%$ point of the Studentized range when $\gamma = n - p$. The tables for $q_{p,\gamma,\alpha}$ are available.

The testing procedure involves the comparison of $q_{p,\, n-p}$ with $q_{p,\, n-p,\, \alpha}$ in the usual way as follows:

- if $q_{p,\, n-p} \leq q_{p,\, n-p,\, \alpha}$, then conclude that $\beta_1 = \beta_2 = \ldots = \beta_p$;
- if $q_{p,\, n-p} > q_{p,\, n-p,\, \alpha}$, then all the $\beta_i$'s in the group are not the same.
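The statistic itself is simple to compute once the sample means and $s^2$ are in hand; a sketch assuming a common sample size n per group (the function name is hypothetical, and the tabulated critical point still has to be looked up separately):

```python
import math

def studentized_range_stat(means, n, s2):
    """q_{p, n-p} = R * sqrt(n) / s, where R is the range of the p sample
    means, each sample having common size n, and s2 = q2 / (n_total - p)."""
    r = max(means) - min(means)   # R = y_p* - y_1*
    return r * math.sqrt(n) / math.sqrt(s2)
```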
-
2. Student-Newman-Keuls test

The Student-Newman-Keuls test is similar to the Studentized range test in the sense that the range is compared with $100\alpha\%$ points on the critical Studentized range $W_p$ given by

$$W_p = q_{p,\, \gamma,\, \alpha}\, \sqrt{\frac{s^2}{n}}.$$

The observed range $R = y_p^* - y_1^*$ is now compared with $W_p$.

If $R < W_p$, then stop the process of comparison and conclude that $\beta_1 = \beta_2 = \ldots = \beta_p$.

If $R > W_p$, then

i. divide the ranked means $y_1^*, y_2^*, \ldots, y_p^*$ into two subgroups containing $(y_1^*, y_2^*, \ldots, y_{p-1}^*)$ and $(y_2^*, y_3^*, \ldots, y_p^*)$;

ii. compute the ranges $R_1 = y_{p-1}^* - y_1^*$ and $R_2 = y_p^* - y_2^*$. Then compare the ranges $R_1$ and $R_2$ with $W_{p-1}$.

If either range is smaller than $W_{p-1}$, then the means (or $\beta_i$'s) in that group are equal.

If $R_1$ and/or $R_2$ are greater than $W_{p-1}$, then the means (or $\beta_i$'s) in the group concerned are divided into two groups of $(p-2)$ means each, and the ranges of these groups are compared with $W_{p-2}$.

Continue with this procedure until a group of i means (or $\beta_i$'s) is found whose range does not exceed $W_i$.

By this method, the difference between any two means under test is significant when the range of the observed means of each and every subgroup containing the two means under test is significant according to the Studentized critical range. This procedure can be easily understood by the following flow chart.
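The recursive subdivision above can be sketched as follows. Here `q_crit` is assumed to map a subgroup size i to the tabulated value $q_{i,\, \gamma,\, \alpha}$; the function name and the illustrative table values are hypothetical:

```python
import math

def snk_groups(sorted_means, s2, n, q_crit):
    """Student-Newman-Keuls subdivision of ranked means.

    sorted_means : sample means in ascending order (y_1*, ..., y_p*)
    q_crit       : dict mapping subgroup size i -> table value q_{i, gamma, alpha}
    Returns the (possibly overlapping) subgroups judged homogeneous.
    """
    p = len(sorted_means)
    if p == 1:
        return [sorted_means]
    w_p = q_crit[p] * math.sqrt(s2 / n)      # W_i = q_{i,gamma,alpha} * sqrt(s2/n)
    if sorted_means[-1] - sorted_means[0] <= w_p:
        return [sorted_means]                # range does not exceed W_p: stop here
    # otherwise drop the largest and the smallest mean in turn and recurse
    return (snk_groups(sorted_means[:-1], s2, n, q_crit)
            + snk_groups(sorted_means[1:], s2, n, q_crit))
```

With illustrative table values `q_crit = {2: 3.0, 3: 3.5}`, `s2 = 1.0` and `n = 4`, calling `snk_groups([5.0, 6.0, 9.5], 1.0, 4, q_crit)` keeps (5.0, 6.0) together and separates 9.5, mirroring the verbal procedure.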
-
[Flow chart: arrange the $\bar{y}_{io}$'s in increasing order $y_1^* \leq y_2^* \leq \ldots \leq y_p^*$; compute $R_p = y_p^* - y_1^*$; compare it with $W_p = q_{p,\, \gamma,\, \alpha}\sqrt{s^2/n}$; if $R_p < W_p$, conclude $\beta_1 = \beta_2 = \ldots$]