TRANSCRIPT

Matrix Approach to Simple Linear Regression
KNNL – Chapter 5
Matrices

• Definition: A matrix is a rectangular array of numbers or symbolic elements.
• In many applications, the rows of a matrix represent individual cases (people, items, plants, animals, ...) and the columns represent attributes or characteristics.
• The dimension of a matrix is its number of rows and columns, often denoted r x c (r rows by c columns).
• A matrix can be represented in full form or abbreviated form:

A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1c} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2c} \\
\vdots &        &        & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{ic} \\
\vdots &        &        & \vdots &        & \vdots \\
a_{r1} & a_{r2} & \cdots & a_{rj} & \cdots & a_{rc}
\end{bmatrix}
= [a_{ij}], \quad i = 1,\dots,r;\; j = 1,\dots,c
Special Types of Matrices

Square Matrix: number of rows = number of columns:

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\qquad
B = \begin{bmatrix} 20 & 32 & 50 \\ 12 & 28 & 42 \\ 28 & 46 & 60 \end{bmatrix}

Vector: a matrix with one column (column vector) or one row (row vector):

C = \begin{bmatrix} 57 \\ 24 \\ 18 \end{bmatrix}
\qquad
D = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{bmatrix}

Transpose: the matrix formed by interchanging the rows and columns of a matrix (use a "prime" to denote the transpose):

E = \begin{bmatrix} 17 \\ 31 \end{bmatrix} \Rightarrow
E' = \begin{bmatrix} 17 & 31 \end{bmatrix}
\qquad
F = \begin{bmatrix} 6 & 15 & 22 \\ 86 & 138 & 25 \end{bmatrix}_{2 \times 3} \Rightarrow
F' = \begin{bmatrix} 6 & 86 \\ 15 & 138 \\ 22 & 25 \end{bmatrix}_{3 \times 2}

In general, for H = [h_{ij}], i = 1,\dots,r; j = 1,\dots,c (an r x c matrix), the transpose is H' = [h_{ji}], j = 1,\dots,c; i = 1,\dots,r (a c x r matrix).

Matrix Equality: matrices of the same dimension whose corresponding elements in the same cells are all equal:

A = \begin{bmatrix} 4 & 6 \\ 12 & 10 \end{bmatrix} =
B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\;\Rightarrow\; b_{11} = 4,\; b_{12} = 6,\; b_{21} = 12,\; b_{22} = 10
Regression Examples - Toluca Data

Response Vector:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
Y' = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix}

Design Matrix:

X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\qquad
X' = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}

Toluca data (n = 25); each row below lists [1  X  Y]:

1  80 399    1  30 121    1  50 221    1  90 376    1  70 361
1  60 224    1 120 546    1  80 352    1 100 353    1  50 157
1  40 160    1  70 252    1  90 389    1  20 113    1 110 435
1 100 420    1  30 212    1  50 268    1  90 377    1 110 421
1  30 273    1  90 468    1  40 244    1  80 342    1  70 323
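As a quick numerical companion (not part of the original slides; the variable names are my own), the response vector and design matrix for the Toluca data can be built directly with NumPy:

```python
import numpy as np

# Toluca data: lot size (X) and work hours (Y), n = 25
x = np.array([80, 30, 50, 90, 70, 60, 120, 80, 100, 50, 40, 70, 90,
              20, 110, 100, 30, 50, 90, 110, 30, 90, 40, 80, 70])
Y = np.array([399, 121, 221, 376, 361, 224, 546, 352, 353, 157, 160, 252, 389,
              113, 435, 420, 212, 268, 377, 421, 273, 468, 244, 342, 323])

# Design matrix: a column of 1s (for the intercept) next to the X column
X = np.column_stack([np.ones_like(x), x])
print(X.shape)  # (25, 2)
```

The column of ones is what carries the intercept term through all the matrix formulas that follow.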
Matrix Addition and Subtraction

Addition and subtraction of two matrices of common dimension (both r x c):

A = [a_{ij}], \; B = [b_{ij}] \quad i = 1,\dots,r;\; j = 1,\dots,c

A + B = [a_{ij} + b_{ij}] \qquad A - B = [a_{ij} - b_{ij}]

Example:

\begin{bmatrix} 4 & 7 \\ 10 & 12 \end{bmatrix} +
\begin{bmatrix} 2 & 0 \\ 14 & 6 \end{bmatrix} =
\begin{bmatrix} 4+2 & 7+0 \\ 10+14 & 12+6 \end{bmatrix} =
\begin{bmatrix} 6 & 7 \\ 24 & 18 \end{bmatrix}

\begin{bmatrix} 4 & 7 \\ 10 & 12 \end{bmatrix} -
\begin{bmatrix} 2 & 0 \\ 14 & 6 \end{bmatrix} =
\begin{bmatrix} 4-2 & 7-0 \\ 10-14 & 12-6 \end{bmatrix} =
\begin{bmatrix} 2 & 7 \\ -4 & 6 \end{bmatrix}

Regression Example: Y = E\{Y\} + \varepsilon, since \varepsilon_i = Y_i - E\{Y_i\}:

\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} E\{Y_1\} \\ E\{Y_2\} \\ \vdots \\ E\{Y_n\} \end{bmatrix} +
\begin{bmatrix} Y_1 - E\{Y_1\} \\ Y_2 - E\{Y_2\} \\ \vdots \\ Y_n - E\{Y_n\} \end{bmatrix} =
E\{Y\} + \varepsilon
Matrix Multiplication

Multiplication of a matrix by a scalar (a single number):

3 \begin{bmatrix} 2 & 1 \\ -2 & 7 \end{bmatrix} =
\begin{bmatrix} 3(2) & 3(1) \\ 3(-2) & 3(7) \end{bmatrix} =
\begin{bmatrix} 6 & 3 \\ -6 & 21 \end{bmatrix}

Multiplication of a matrix by a matrix (requires #cols(A) = #rows(B)):

If A is r_A x c_A and B is r_B x c_B with c_A = r_B, then AB = [ab_{ij}] is r_A x c_B, where ab_{ij} is the sum of the products of the elements of the i-th row of A and the j-th column of B:

ab_{ij} = \sum_{k=1}^{c_A} a_{ik} b_{kj} \quad i = 1,\dots,r_A;\; j = 1,\dots,c_B

Example (a 3 x 2 matrix times a 2 x 2 matrix gives a 3 x 2 matrix):

\begin{bmatrix} 2 & 5 \\ 3 & -1 \\ 0 & 7 \end{bmatrix}
\begin{bmatrix} 3 & -1 \\ 2 & 4 \end{bmatrix} =
\begin{bmatrix}
2(3)+5(2) & 2(-1)+5(4) \\
3(3)+(-1)(2) & 3(-1)+(-1)(4) \\
0(3)+7(2) & 0(-1)+7(4)
\end{bmatrix} =
\begin{bmatrix} 16 & 18 \\ 7 & -7 \\ 14 & 28 \end{bmatrix}
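The 3 x 2 by 2 x 2 product above is easy to verify numerically (a quick check I added, not from the slides):

```python
import numpy as np

A = np.array([[2, 5],
              [3, -1],
              [0, 7]])   # 3 x 2
B = np.array([[3, -1],
              [2, 4]])   # 2 x 2

# @ computes the matrix product; the result is 3 x 2
AB = A @ B
print(AB)
```

Note that `B @ A` would raise an error here, since #cols(B) = 2 does not match #rows(A) = 3.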
Matrix Multiplication Examples - I

Simultaneous Equations (2 equations; x_1, x_2 unknown):

a_{11} x_1 + a_{12} x_2 = y_1
a_{21} x_1 + a_{22} x_2 = y_2
\quad\Rightarrow\quad
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
\quad\Rightarrow\quad
AX = Y

Sum of Squares:

\begin{bmatrix} 4 & 2 & 3 \end{bmatrix}
\begin{bmatrix} 4 \\ 2 \\ 3 \end{bmatrix} = 4^2 + 2^2 + 3^2 = 29

Regression Equation (Expected Values):

X\beta = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} =
\begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix} = E\{Y\}
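The sum-of-squares identity x'x = sum of squared elements can be checked directly (a one-line sketch I added):

```python
import numpy as np

x = np.array([4, 2, 3])

# x'x = 4^2 + 2^2 + 3^2 = 29
sum_sq = x @ x
print(sum_sq)  # 29
```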
Matrix Multiplication Examples - II

Matrices used in simple linear regression (that generalize to multiple regression):

Y'Y = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \sum_{i=1}^n Y_i^2

X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} =
\begin{bmatrix} n & \sum_{i=1}^n X_i \\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{bmatrix}

X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} \sum_{i=1}^n Y_i \\ \sum_{i=1}^n X_i Y_i \end{bmatrix}
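These X'X and X'Y formulas can be confirmed on a tiny made-up dataset (the numbers below are my own, chosen only for illustration):

```python
import numpy as np

# Small illustrative dataset (not from the text)
x = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 3.0, 5.0, 4.0])
n = len(x)
X = np.column_stack([np.ones(n), x])

XtX = X.T @ X
XtY = X.T @ Y

# X'X matches [[n, sum(x)], [sum(x), sum(x^2)]]
assert np.allclose(XtX, [[n, x.sum()], [x.sum(), (x**2).sum()]])
# X'Y matches [sum(Y), sum(x*Y)]
assert np.allclose(XtY, [Y.sum(), (x * Y).sum()])
print(XtX, XtY)
```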
Special Matrix Types

Symmetric Matrix: a square matrix with a transpose equal to itself (A' = A):

A = \begin{bmatrix} 6 & 19 & 8 \\ 19 & 14 & 3 \\ 8 & 3 & 1 \end{bmatrix} = A'

Diagonal Matrix: a square matrix with all off-diagonal elements equal to 0:

B = \begin{bmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_3 \end{bmatrix}
\qquad
\begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}

Note: diagonal matrices are symmetric (but not vice versa).

Identity Matrix: a diagonal matrix with all diagonal elements equal to 1 (acts like multiplying a scalar by 1): IA = AI = A:

I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
I_3 \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

Scalar Matrix: a diagonal matrix with all diagonal elements equal to a single number k, i.e. kI:

kI_4 = \begin{bmatrix} k & 0 & 0 & 0 \\ 0 & k & 0 & 0 \\ 0 & 0 & k & 0 \\ 0 & 0 & 0 & k \end{bmatrix}

1-vector, J matrix, and zero vector:

1_r = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}_{r \times 1}
\qquad
0_r = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}_{r \times 1}
\qquad
J_{r \times r} = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix}

Note: 1'1 = r and 11' = J.
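The identity, 1-vector, and J-matrix facts above can be verified numerically (a sketch I added; the symmetric matrix reuses the slide's example):

```python
import numpy as np

r = 4
ones = np.ones((r, 1))   # the 1-vector

# 1'1 = r (a scalar) and 11' = J, the r x r matrix of all ones
assert (ones.T @ ones).item() == r
J = ones @ ones.T
assert np.array_equal(J, np.ones((r, r)))

# IA = AI = A, and the slide's symmetric example satisfies A' = A
A = np.array([[6.0, 19.0, 8.0],
              [19.0, 14.0, 3.0],
              [8.0, 3.0, 1.0]])
assert np.allclose(np.eye(3) @ A, A)
assert np.array_equal(A, A.T)
print("all identities hold")
```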
Linear Dependence and Rank of a Matrix

• Linear Dependence: when a linear function of the columns (rows) of a matrix produces a zero vector (one or more columns (rows) can be written as a linear function of the other columns (rows)).
• Rank of a matrix: the number of linearly independent columns (rows) of the matrix. The rank cannot exceed the minimum of the number of rows and columns: rank(A) ≤ min(r, c).
• A matrix is full rank if rank(A) = min(r, c).

Examples:

A = \begin{bmatrix} 1 & 3 \\ 4 & 12 \end{bmatrix}: \quad
3A_{\cdot 1} - A_{\cdot 2} = 0
\;\Rightarrow\; columns of A are linearly dependent \;\Rightarrow\; rank(A) = 1

B = \begin{bmatrix} 1 & 3 \\ 4 & 30 \end{bmatrix}: \quad
\lambda_1 B_{\cdot 1} + \lambda_2 B_{\cdot 2} = 0 \; only \; for \; \lambda_1 = \lambda_2 = 0
\;\Rightarrow\; columns of B are linearly independent \;\Rightarrow\; rank(B) = 2
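Rank is easy to compute numerically; the sketch below uses the slide's dependent example plus a full-rank companion (the second matrix is my reconstruction):

```python
import numpy as np

A = np.array([[1, 3],
              [4, 12]])   # second column = 3 * first column
B = np.array([[1, 3],
              [4, 30]])   # columns are not proportional

rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
print(rank_A)  # 1 (columns linearly dependent)
print(rank_B)  # 2 (full rank)
```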
Matrix Inverse

• Note: for scalars (except 0), when we multiply a number by its reciprocal, we get 1: 2(1/2) = 1, x(1/x) = x(x^{-1}) = 1.
• In matrix form, if A is a square matrix of full rank (all rows and columns are linearly independent), then A has an inverse A^{-1} such that: A^{-1}A = AA^{-1} = I.

Example:

A = \begin{bmatrix} -2 & 8 \\ 4 & 2 \end{bmatrix}
\qquad
A^{-1} = \begin{bmatrix} -2/36 & 8/36 \\ 4/36 & 2/36 \end{bmatrix}

A^{-1}A = \begin{bmatrix} (4+32)/36 & (-16+16)/36 \\ (-8+8)/36 & (32+4)/36 \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I

For a diagonal matrix, the inverse is diagonal with the reciprocals of the diagonal elements:

B = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}
\qquad
B^{-1} = \begin{bmatrix} 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/6 \end{bmatrix}
\qquad
BB^{-1} = \begin{bmatrix} 4(1/4) & 0 & 0 \\ 0 & 2(1/2) & 0 \\ 0 & 0 & 6(1/6) \end{bmatrix} = I
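Both inverse properties can be checked numerically (a sketch I added using the example values as reconstructed above):

```python
import numpy as np

A = np.array([[-2.0, 8.0],
              [4.0, 2.0]])
A_inv = np.linalg.inv(A)

# A^{-1} A = A A^{-1} = I
assert np.allclose(A_inv @ A, np.eye(2))
assert np.allclose(A @ A_inv, np.eye(2))

# the inverse of a diagonal matrix is diagonal with reciprocal entries
B = np.diag([4.0, 2.0, 6.0])
assert np.allclose(np.linalg.inv(B), np.diag([1/4, 1/2, 1/6]))
print("inverse identities hold")
```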
Computing an Inverse of a 2x2 Matrix

For A full rank (columns/rows are linearly independent):

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\qquad
|A| = a_{11}a_{22} - a_{12}a_{21} \quad (the determinant of A)

A^{-1} = \frac{1}{|A|}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}

Note: if A is not full rank, one column is a multiple of the other (a_{12} = k a_{11}, a_{22} = k a_{21} for some value k), so:

|A| = a_{11}(k a_{21}) - (k a_{11})a_{21} = 0

Thus A^{-1} does not exist if A is not full rank. While there are rules for general r x r matrices, we will use computers to solve them.
Regression Example:

X'X = \begin{bmatrix} n & \sum_{i=1}^n X_i \\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{bmatrix}

|X'X| = n\sum_{i=1}^n X_i^2 - \left(\sum_{i=1}^n X_i\right)^2
= n\left(\sum_{i=1}^n X_i^2 - n\bar{X}^2\right)
= n\sum_{i=1}^n (X_i - \bar{X})^2

Note: \sum_{i=1}^n X_i^2 - n\bar{X}^2 = \sum_{i=1}^n (X_i - \bar{X})^2.

(X'X)^{-1} = \frac{1}{n\sum_{i=1}^n (X_i - \bar{X})^2}
\begin{bmatrix} \sum_{i=1}^n X_i^2 & -\sum_{i=1}^n X_i \\ -\sum_{i=1}^n X_i & n \end{bmatrix}
= \begin{bmatrix}
\dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} \\
\dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{1}{\sum_{i=1}^n (X_i - \bar{X})^2}
\end{bmatrix}
Use of Inverse Matrix – Solving Simultaneous Equations

AY = C, where A and C are matrices of constants and Y is a matrix of unknowns:

A^{-1}AY = A^{-1}C \;\Rightarrow\; Y = A^{-1}C \quad (assuming A is square and full rank)

Equation 1: 12y_1 - 6y_2 = 48 \qquad Equation 2: 10y_1 + 2y_2 = 12

A = \begin{bmatrix} 12 & -6 \\ 10 & 2 \end{bmatrix}
\qquad
C = \begin{bmatrix} 48 \\ 12 \end{bmatrix}
\qquad
|A| = 12(2) - (-6)(10) = 84

A^{-1} = \frac{1}{84}\begin{bmatrix} 2 & 6 \\ -10 & 12 \end{bmatrix}

Y = A^{-1}C = \frac{1}{84}\begin{bmatrix} 2(48) + 6(12) \\ -10(48) + 12(12) \end{bmatrix}
= \frac{1}{84}\begin{bmatrix} 96 + 72 \\ -480 + 144 \end{bmatrix}
= \frac{1}{84}\begin{bmatrix} 168 \\ -336 \end{bmatrix}
= \begin{bmatrix} 2 \\ -4 \end{bmatrix}

Note the wisdom of waiting to divide by |A| until the end of the calculation!
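In practice the same system is solved without forming the inverse explicitly (a sketch I added; `np.linalg.solve` is the standard tool):

```python
import numpy as np

# Equation 1: 12*y1 - 6*y2 = 48;  Equation 2: 10*y1 + 2*y2 = 12
A = np.array([[12.0, -6.0],
              [10.0, 2.0]])
C = np.array([48.0, 12.0])

# solve(A, C) computes Y = A^{-1} C via factorization, which is
# faster and more numerically stable than inv(A) @ C
y = np.linalg.solve(A, C)
print(y)  # [ 2. -4.]
```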
Useful Matrix Results

All rules assume that the matrices are conformable to the operations:

Addition Rules:

A + B = B + A \qquad (A + B) + C = A + (B + C)

Multiplication Rules:

(AB)C = A(BC) \qquad C(A + B) = CA + CB \qquad k(A + B) = kA + kB \;\; (k \; a \; scalar)

Transpose Rules:

(A')' = A \qquad (A + B)' = A' + B' \qquad (AB)' = B'A' \qquad (ABC)' = C'B'A'

Inverse Rules (full-rank, square matrices):

(AB)^{-1} = B^{-1}A^{-1} \qquad (ABC)^{-1} = C^{-1}B^{-1}A^{-1} \qquad (A^{-1})^{-1} = A \qquad (A')^{-1} = (A^{-1})'
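A few of these rules can be spot-checked on random matrices (a sketch I added; random Gaussian matrices are full rank with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# transpose rules
assert np.allclose((A @ B).T, B.T @ A.T)
assert np.allclose((A @ B @ C).T, C.T @ B.T @ A.T)

# inverse rules
assert np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A))
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
print("transpose and inverse rules hold")
```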
Random Vectors and Matrices

Shown for the case n = 3; generalizes to any n:

Random variables: Y_1, Y_2, Y_3:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix}
\qquad
E\{Y\} = \begin{bmatrix} E\{Y_1\} \\ E\{Y_2\} \\ E\{Y_3\} \end{bmatrix}

In general, for an n x p random matrix: E\{Y\} = [E\{Y_{ij}\}], \; i = 1,\dots,n;\; j = 1,\dots,p.

Variance-Covariance Matrix for a Random Vector:

\sigma^2\{Y\} = E\{(Y - E\{Y\})(Y - E\{Y\})'\}

= E\left\{
\begin{bmatrix} Y_1 - E\{Y_1\} \\ Y_2 - E\{Y_2\} \\ Y_3 - E\{Y_3\} \end{bmatrix}
\begin{bmatrix} Y_1 - E\{Y_1\} & Y_2 - E\{Y_2\} & Y_3 - E\{Y_3\} \end{bmatrix}
\right\}

= \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_2^2 & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_3^2
\end{bmatrix}
Linear Regression Example (n = 3)

Error terms are assumed to be independent, with mean 0 and constant variance \sigma^2:

E\{\varepsilon_i\} = 0, \quad \sigma^2\{\varepsilon_i\} = \sigma^2, \quad \sigma\{\varepsilon_i, \varepsilon_j\} = 0 \;\; (i \neq j)

E\{\varepsilon\} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\qquad
\sigma^2\{\varepsilon\} = \begin{bmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{bmatrix} = \sigma^2 I

Y = X\beta + \varepsilon \;\Rightarrow\; E\{Y\} = E\{X\beta + \varepsilon\} = X\beta + E\{\varepsilon\} = X\beta

\sigma^2\{Y\} = \sigma^2\{X\beta + \varepsilon\} = \sigma^2\{\varepsilon\} = \sigma^2 I
Mean and Variance of Linear Functions of Y

Let A be a k x n matrix of fixed constants and Y an n x 1 random vector. Then W = AY is a k x 1 random vector:

W = AY = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{bmatrix}
\begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} a_{11}Y_1 + \dots + a_{1n}Y_n \\ \vdots \\ a_{k1}Y_1 + \dots + a_{kn}Y_n \end{bmatrix}

E\{W\} = \begin{bmatrix} E\{a_{11}Y_1 + \dots + a_{1n}Y_n\} \\ \vdots \\ E\{a_{k1}Y_1 + \dots + a_{kn}Y_n\} \end{bmatrix} =
\begin{bmatrix} a_{11}E\{Y_1\} + \dots + a_{1n}E\{Y_n\} \\ \vdots \\ a_{k1}E\{Y_1\} + \dots + a_{kn}E\{Y_n\} \end{bmatrix} = AE\{Y\}

\sigma^2\{W\} = E\{(AY - AE\{Y\})(AY - AE\{Y\})'\}
= E\{A(Y - E\{Y\})(Y - E\{Y\})'A'\}
= AE\{(Y - E\{Y\})(Y - E\{Y\})'\}A'
= A\sigma^2\{Y\}A'
Multivariate Normal Distribution

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
\mu = E\{Y\} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}
\qquad
\Sigma = \sigma^2\{Y\} = \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{bmatrix}

Multivariate Normal Density function:

f(Y) = (2\pi)^{-n/2}|\Sigma|^{-1/2}\exp\left[-\frac{1}{2}(Y - \mu)'\Sigma^{-1}(Y - \mu)\right]
\qquad
Y \sim N(\mu, \Sigma)

Marginally, Y_i \sim N(\mu_i, \sigma_i^2), \; i = 1,\dots,n, with \sigma\{Y_i, Y_j\} = \sigma_{ij} \; (i \neq j).

Note: if A is a (full-rank) matrix of fixed constants:

W = AY \sim N(A\mu, A\Sigma A')
Simple Linear Regression in Matrix Form

Simple Linear Regression Model: Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1,\dots,n:

Y_1 = \beta_0 + \beta_1 X_1 + \varepsilon_1
Y_2 = \beta_0 + \beta_1 X_2 + \varepsilon_2
\;\vdots
Y_n = \beta_0 + \beta_1 X_n + \varepsilon_n

Defining:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\qquad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
\qquad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

Y = X\beta + \varepsilon \;\Rightarrow\; E\{Y\} = X\beta, \; since:

X\beta = \begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix}

Assuming constant variance and independence of the error terms \varepsilon_i:

\sigma^2\{Y\} = \sigma^2\{\varepsilon\} =
\begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 I

Further, assuming a normal distribution for the error terms \varepsilon_i \sim N(0, \sigma^2): \quad Y \sim N(X\beta, \sigma^2 I)
Estimating Parameters by Least Squares

Normal equations obtained from \partial Q / \partial \beta_0 and \partial Q / \partial \beta_1, setting each equal to 0:

n b_0 + b_1\sum_{i=1}^n X_i = \sum_{i=1}^n Y_i
b_0\sum_{i=1}^n X_i + b_1\sum_{i=1}^n X_i^2 = \sum_{i=1}^n X_i Y_i

Note, in matrix form, with

X'X = \begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix}
\qquad
X'Y = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}
\qquad
b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}

the normal equations are:

X'Xb = X'Y \;\Rightarrow\; b = (X'X)^{-1}X'Y

Based on the matrix form:

Q = (Y - X\beta)'(Y - X\beta) = Y'Y - Y'X\beta - \beta'X'Y + \beta'X'X\beta = Y'Y - 2\beta'X'Y + \beta'X'X\beta

\frac{\partial Q}{\partial \beta} = -2X'Y + 2X'X\beta

Setting this equal to zero, and replacing \beta with b: \; X'Xb = X'Y.
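Applying b = (X'X)^{-1}X'Y to the Toluca data from the earlier slide reproduces the familiar KNNL estimates (a sketch I added; `solve` is used instead of an explicit inverse):

```python
import numpy as np

# Toluca data (lot size x, work hours Y)
x = np.array([80, 30, 50, 90, 70, 60, 120, 80, 100, 50, 40, 70, 90,
              20, 110, 100, 30, 50, 90, 110, 30, 90, 40, 80, 70], dtype=float)
Y = np.array([399, 121, 221, 376, 361, 224, 546, 352, 353, 157, 160, 252, 389,
              113, 435, 420, 212, 268, 377, 421, 273, 468, 244, 342, 323], dtype=float)
X = np.column_stack([np.ones_like(x), x])

# b = (X'X)^{-1} X'Y, solved as the linear system (X'X) b = X'Y
b = np.linalg.solve(X.T @ X, X.T @ Y)
print(b)  # approximately [62.37, 3.5702]
```

So the fitted line is approximately \hat{Y} = 62.37 + 3.5702 X.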
Fitted Values and Residuals

\hat{Y}_i = b_0 + b_1 X_i \qquad e_i = Y_i - \hat{Y}_i

\hat{Y} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} =
\begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}

In matrix form:

\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY \qquad H = X(X'X)^{-1}X'

H is called the "hat" or "projection" matrix; note that H is idempotent (HH = H) and symmetric (H = H'):

HH = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}X' = H
H' = [X(X'X)^{-1}X']' = X(X'X)^{-1}X' = H

e = Y - \hat{Y} = Y - Xb = Y - HY = (I - H)Y

Note:

E\{\hat{Y}\} = E\{HY\} = HE\{Y\} = HX\beta = X(X'X)^{-1}X'X\beta = X\beta
\sigma^2\{\hat{Y}\} = H(\sigma^2 I)H' = \sigma^2 H \qquad s^2\{\hat{Y}\} = MSE \cdot H

E\{e\} = E\{(I - H)Y\} = (I - H)E\{Y\} = (I - H)X\beta = X\beta - X\beta = 0
\sigma^2\{e\} = (I - H)(\sigma^2 I)(I - H)' = \sigma^2(I - H) \qquad s^2\{e\} = MSE(I - H)
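The hat-matrix properties are easy to verify on a small made-up dataset (the numbers below are my own illustration):

```python
import numpy as np

# small illustrative dataset (not from the text)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
X = np.column_stack([np.ones_like(x), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix

# H is symmetric and idempotent
assert np.allclose(H, H.T)
assert np.allclose(H @ H, H)

Y_hat = H @ Y                          # fitted values
e = (np.eye(len(Y)) - H) @ Y           # residuals
# with an intercept in the model, residuals sum to zero
assert np.isclose(e.sum(), 0.0)
print(Y_hat)
```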
Analysis of Variance

Total (Corrected) Sum of Squares:

SSTO = \sum_{i=1}^n (Y_i - \bar{Y})^2 = \sum_{i=1}^n Y_i^2 - \frac{1}{n}\left(\sum_{i=1}^n Y_i\right)^2

Note: \sum Y_i^2 = Y'Y and \left(\sum Y_i\right)^2 = Y'JY, where J = 11', so:

SSTO = Y'Y - \frac{1}{n}Y'JY = Y'\left(I - \frac{1}{n}J\right)Y

Error Sum of Squares:

SSE = e'e = (Y - Xb)'(Y - Xb) = Y'Y - Y'Xb - b'X'Y + b'X'Xb = Y'Y - b'X'Y = Y'(I - H)Y

since X'Xb = X'Y, and b'X'Y = Y'X(X'X)^{-1}X'Y = Y'HY.

Regression Sum of Squares:

SSR = SSTO - SSE = b'X'Y - \frac{1}{n}Y'JY = Y'HY - \frac{1}{n}Y'JY = Y'\left(H - \frac{1}{n}J\right)Y

Note that SSTO, SSE, and SSR are all QUADRATIC FORMS: Y'AY for symmetric matrices A.
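The three quadratic forms can be computed directly and checked against the scalar formulas and the ANOVA identity (a sketch I added on my own small dataset):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
n = len(Y)
X = np.column_stack([np.ones(n), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T
J = np.ones((n, n))
I = np.eye(n)

# each sum of squares as a quadratic form Y'AY with symmetric A
SSTO = Y @ (I - J / n) @ Y
SSE = Y @ (I - H) @ Y
SSR = Y @ (H - J / n) @ Y

assert np.isclose(SSTO, ((Y - Y.mean())**2).sum())  # matches scalar formula
assert np.isclose(SSTO, SSR + SSE)                  # ANOVA decomposition
print(SSTO, SSR, SSE)
```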
Inferences in Linear Regression

b = (X'X)^{-1}X'Y \;\Rightarrow\; E\{b\} = (X'X)^{-1}X'E\{Y\} = (X'X)^{-1}X'X\beta = \beta

\sigma^2\{b\} = (X'X)^{-1}X'\sigma^2\{Y\}X(X'X)^{-1} = \sigma^2(X'X)^{-1}X'IX(X'X)^{-1} = \sigma^2(X'X)^{-1}

s^2\{b\} = MSE(X'X)^{-1}

Recall:

(X'X)^{-1} = \begin{bmatrix}
\dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} \\
\dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{1}{\sum_{i=1}^n (X_i - \bar{X})^2}
\end{bmatrix}

so:

s^2\{b_0\} = MSE\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]
\qquad
s^2\{b_1\} = \frac{MSE}{\sum_{i=1}^n (X_i - \bar{X})^2}
Estimated Mean Response at X = X_h:

X_h = \begin{bmatrix} 1 \\ X_h \end{bmatrix}
\qquad
\hat{Y}_h = X_h'b = b_0 + b_1 X_h

s^2\{\hat{Y}_h\} = X_h's^2\{b\}X_h = MSE \cdot X_h'(X'X)^{-1}X_h
= MSE\left[\frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]

Predicted New Response at X = X_h:

\hat{Y}_h = X_h'b = b_0 + b_1 X_h

s^2\{pred\} = MSE\left[1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]
= MSE\left[1 + X_h'(X'X)^{-1}X_h\right]
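The matrix and scalar forms of these variances agree numerically (a sketch I added on my own small dataset; X_h = 3.5 is an arbitrary choice):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
n = len(Y)
X = np.column_stack([np.ones(n), x])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
e = Y - X @ b
MSE = (e @ e) / (n - 2)

Xh = np.array([1.0, 3.5])                # X_h = 3.5 (arbitrary)
s2_mean = MSE * (Xh @ XtX_inv @ Xh)      # variance of estimated mean response

# matches MSE * [1/n + (Xh - xbar)^2 / Sxx]
Sxx = ((x - x.mean())**2).sum()
assert np.isclose(s2_mean, MSE * (1/n + (3.5 - x.mean())**2 / Sxx))

s2_pred = MSE * (1 + Xh @ XtX_inv @ Xh)  # prediction variance, always larger
assert s2_pred > s2_mean
print(s2_mean, s2_pred)
```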