TRANSCRIPT

Matrix Approach to Simple Linear Regression
KNNL – Chapter 5
Matrices

• Definition: A matrix is a rectangular array of numbers or symbolic elements.
• In many applications, the rows of a matrix represent individual cases (people, items, plants, animals, ...) and the columns represent attributes or characteristics.
• The dimension of a matrix is its number of rows and columns, often denoted r x c (r rows by c columns).
• A matrix can be represented in full form or abbreviated form:

A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1c} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2c} \\
\vdots &        &        & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{ic} \\
\vdots &        &        & \vdots &        & \vdots \\
a_{r1} & a_{r2} & \cdots & a_{rj} & \cdots & a_{rc}
\end{bmatrix}
= [a_{ij}], \quad i = 1,\dots,r;\; j = 1,\dots,c
Special Types of Matrices

Square Matrix: number of rows = number of columns:

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\qquad
B = \begin{bmatrix} 20 & 32 & 50 \\ 12 & 28 & 42 \\ 28 & 46 & 60 \end{bmatrix}

Vector: a matrix with one column (column vector) or one row (row vector):

C = \begin{bmatrix} 57 \\ 24 \\ 18 \end{bmatrix}
\qquad
D = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{bmatrix}

Transpose: the matrix formed by interchanging the rows and columns of a matrix (use a "prime" to denote the transpose):

E = \begin{bmatrix} 17 \\ 31 \end{bmatrix} \Rightarrow
E' = \begin{bmatrix} 17 & 31 \end{bmatrix}
\qquad
F = \begin{bmatrix} 6 & 15 & 22 \\ 86 & 138 & 25 \end{bmatrix}_{2 \times 3} \Rightarrow
F' = \begin{bmatrix} 6 & 86 \\ 15 & 138 \\ 22 & 25 \end{bmatrix}_{3 \times 2}

In general, for H = [h_{ij}], i = 1,\dots,r; j = 1,\dots,c (an r x c matrix), the transpose is H' = [h_{ji}], j = 1,\dots,c; i = 1,\dots,r (a c x r matrix).

Matrix Equality: matrices of the same dimension whose corresponding elements in the same cells are all equal:

A = \begin{bmatrix} 4 & 6 \\ 12 & 10 \end{bmatrix} =
B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\;\Rightarrow\; b_{11} = 4,\; b_{12} = 6,\; b_{21} = 12,\; b_{22} = 10
Regression Examples - Toluca Data

Response Vector:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
Y' = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix}

Design Matrix:

X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\qquad
X' = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}

Toluca data (n = 25); each row below lists [1  X  Y]:

1  80 399    1  30 121    1  50 221    1  90 376    1  70 361
1  60 224    1 120 546    1  80 352    1 100 353    1  50 157
1  40 160    1  70 252    1  90 389    1  20 113    1 110 435
1 100 420    1  30 212    1  50 268    1  90 377    1 110 421
1  30 273    1  90 468    1  40 244    1  80 342    1  70 323
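As a quick numerical companion (not part of the original slides; the variable names are my own), the response vector and design matrix for the Toluca data can be built directly with NumPy:

```python
import numpy as np

# Toluca data: lot size (X) and work hours (Y), n = 25
x = np.array([80, 30, 50, 90, 70, 60, 120, 80, 100, 50, 40, 70, 90,
              20, 110, 100, 30, 50, 90, 110, 30, 90, 40, 80, 70])
Y = np.array([399, 121, 221, 376, 361, 224, 546, 352, 353, 157, 160, 252, 389,
              113, 435, 420, 212, 268, 377, 421, 273, 468, 244, 342, 323])

# Design matrix: a column of 1s (for the intercept) next to the X column
X = np.column_stack([np.ones_like(x), x])
print(X.shape)  # (25, 2)
```

The column of ones is what carries the intercept term through all the matrix formulas that follow.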
Matrix Addition and Subtraction

Addition and subtraction of two matrices of common dimension (both r x c):

A = [a_{ij}], \; B = [b_{ij}] \quad i = 1,\dots,r;\; j = 1,\dots,c

A + B = [a_{ij} + b_{ij}] \qquad A - B = [a_{ij} - b_{ij}]

Example:

\begin{bmatrix} 4 & 7 \\ 10 & 12 \end{bmatrix} +
\begin{bmatrix} 2 & 0 \\ 14 & 6 \end{bmatrix} =
\begin{bmatrix} 4+2 & 7+0 \\ 10+14 & 12+6 \end{bmatrix} =
\begin{bmatrix} 6 & 7 \\ 24 & 18 \end{bmatrix}

\begin{bmatrix} 4 & 7 \\ 10 & 12 \end{bmatrix} -
\begin{bmatrix} 2 & 0 \\ 14 & 6 \end{bmatrix} =
\begin{bmatrix} 4-2 & 7-0 \\ 10-14 & 12-6 \end{bmatrix} =
\begin{bmatrix} 2 & 7 \\ -4 & 6 \end{bmatrix}

Regression Example: Y = E\{Y\} + \varepsilon, since \varepsilon_i = Y_i - E\{Y_i\}:

\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} E\{Y_1\} \\ E\{Y_2\} \\ \vdots \\ E\{Y_n\} \end{bmatrix} +
\begin{bmatrix} Y_1 - E\{Y_1\} \\ Y_2 - E\{Y_2\} \\ \vdots \\ Y_n - E\{Y_n\} \end{bmatrix} =
E\{Y\} + \varepsilon
Matrix Multiplication

Multiplication of a matrix by a scalar (a single number):

3 \begin{bmatrix} 2 & 1 \\ -2 & 7 \end{bmatrix} =
\begin{bmatrix} 3(2) & 3(1) \\ 3(-2) & 3(7) \end{bmatrix} =
\begin{bmatrix} 6 & 3 \\ -6 & 21 \end{bmatrix}

Multiplication of a matrix by a matrix (requires #cols(A) = #rows(B)):

If A is r_A x c_A and B is r_B x c_B with c_A = r_B, then AB = [ab_{ij}] is r_A x c_B, where ab_{ij} is the sum of the products of the elements of the i-th row of A and the j-th column of B:

ab_{ij} = \sum_{k=1}^{c_A} a_{ik} b_{kj} \quad i = 1,\dots,r_A;\; j = 1,\dots,c_B

Example (a 3 x 2 matrix times a 2 x 2 matrix gives a 3 x 2 matrix):

\begin{bmatrix} 2 & 5 \\ 3 & -1 \\ 0 & 7 \end{bmatrix}
\begin{bmatrix} 3 & -1 \\ 2 & 4 \end{bmatrix} =
\begin{bmatrix}
2(3)+5(2) & 2(-1)+5(4) \\
3(3)+(-1)(2) & 3(-1)+(-1)(4) \\
0(3)+7(2) & 0(-1)+7(4)
\end{bmatrix} =
\begin{bmatrix} 16 & 18 \\ 7 & -7 \\ 14 & 28 \end{bmatrix}
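The 3 x 2 by 2 x 2 product above is easy to verify numerically (a quick check I added, not from the slides):

```python
import numpy as np

A = np.array([[2, 5],
              [3, -1],
              [0, 7]])   # 3 x 2
B = np.array([[3, -1],
              [2, 4]])   # 2 x 2

# @ computes the matrix product; the result is 3 x 2
AB = A @ B
print(AB)
```

Note that `B @ A` would raise an error here, since #cols(B) = 2 does not match #rows(A) = 3.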
Matrix Multiplication Examples - I

Simultaneous Equations (2 equations; x_1, x_2 unknown):

a_{11} x_1 + a_{12} x_2 = y_1
a_{21} x_1 + a_{22} x_2 = y_2
\quad\Rightarrow\quad
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
\quad\Rightarrow\quad
AX = Y

Sum of Squares:

\begin{bmatrix} 4 & 2 & 3 \end{bmatrix}
\begin{bmatrix} 4 \\ 2 \\ 3 \end{bmatrix} = 4^2 + 2^2 + 3^2 = 29

Regression Equation (Expected Values):

X\beta = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} =
\begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix} = E\{Y\}
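The sum-of-squares identity x'x = sum of squared elements can be checked directly (a one-line sketch I added):

```python
import numpy as np

x = np.array([4, 2, 3])

# x'x = 4^2 + 2^2 + 3^2 = 29
sum_sq = x @ x
print(sum_sq)  # 29
```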
Matrix Multiplication Examples - II

Matrices used in simple linear regression (that generalize to multiple regression):

Y'Y = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \sum_{i=1}^n Y_i^2

X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} =
\begin{bmatrix} n & \sum_{i=1}^n X_i \\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{bmatrix}

X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} \sum_{i=1}^n Y_i \\ \sum_{i=1}^n X_i Y_i \end{bmatrix}
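These X'X and X'Y formulas can be confirmed on a tiny made-up dataset (the numbers below are my own, chosen only for illustration):

```python
import numpy as np

# Small illustrative dataset (not from the text)
x = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 3.0, 5.0, 4.0])
n = len(x)
X = np.column_stack([np.ones(n), x])

XtX = X.T @ X
XtY = X.T @ Y

# X'X matches [[n, sum(x)], [sum(x), sum(x^2)]]
assert np.allclose(XtX, [[n, x.sum()], [x.sum(), (x**2).sum()]])
# X'Y matches [sum(Y), sum(x*Y)]
assert np.allclose(XtY, [Y.sum(), (x * Y).sum()])
print(XtX, XtY)
```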
Special Matrix Types

Symmetric Matrix: a square matrix with a transpose equal to itself (A' = A):

A = \begin{bmatrix} 6 & 19 & 8 \\ 19 & 14 & 3 \\ 8 & 3 & 1 \end{bmatrix} = A'

Diagonal Matrix: a square matrix with all off-diagonal elements equal to 0:

B = \begin{bmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_3 \end{bmatrix}
\qquad
\begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}

Note: diagonal matrices are symmetric (but not vice versa).

Identity Matrix: a diagonal matrix with all diagonal elements equal to 1 (acts like multiplying a scalar by 1): IA = AI = A:

I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
I_3 \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

Scalar Matrix: a diagonal matrix with all diagonal elements equal to a single number k, i.e. kI:

kI_4 = \begin{bmatrix} k & 0 & 0 & 0 \\ 0 & k & 0 & 0 \\ 0 & 0 & k & 0 \\ 0 & 0 & 0 & k \end{bmatrix}

1-vector, J matrix, and zero vector:

1_r = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}_{r \times 1}
\qquad
0_r = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}_{r \times 1}
\qquad
J_{r \times r} = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix}

Note: 1'1 = r and 11' = J.
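The identity, 1-vector, and J-matrix facts above can be verified numerically (a sketch I added; the symmetric matrix reuses the slide's example):

```python
import numpy as np

r = 4
ones = np.ones((r, 1))   # the 1-vector

# 1'1 = r (a scalar) and 11' = J, the r x r matrix of all ones
assert (ones.T @ ones).item() == r
J = ones @ ones.T
assert np.array_equal(J, np.ones((r, r)))

# IA = AI = A, and the slide's symmetric example satisfies A' = A
A = np.array([[6.0, 19.0, 8.0],
              [19.0, 14.0, 3.0],
              [8.0, 3.0, 1.0]])
assert np.allclose(np.eye(3) @ A, A)
assert np.array_equal(A, A.T)
print("all identities hold")
```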
Linear Dependence and Rank of a Matrix

• Linear Dependence: when a linear function of the columns (rows) of a matrix produces a zero vector (one or more columns (rows) can be written as a linear function of the other columns (rows)).
• Rank of a matrix: the number of linearly independent columns (rows) of the matrix. The rank cannot exceed the minimum of the number of rows and columns: rank(A) ≤ min(r, c).
• A matrix is full rank if rank(A) = min(r, c).

Examples:

A = \begin{bmatrix} 1 & 3 \\ 4 & 12 \end{bmatrix}: \quad
3A_{\cdot 1} - A_{\cdot 2} = 0
\;\Rightarrow\; columns of A are linearly dependent \;\Rightarrow\; rank(A) = 1

B = \begin{bmatrix} 1 & 3 \\ 4 & 30 \end{bmatrix}: \quad
\lambda_1 B_{\cdot 1} + \lambda_2 B_{\cdot 2} = 0 \; only \; for \; \lambda_1 = \lambda_2 = 0
\;\Rightarrow\; columns of B are linearly independent \;\Rightarrow\; rank(B) = 2
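Rank is easy to compute numerically; the sketch below uses the slide's dependent example plus a full-rank companion (the second matrix is my reconstruction):

```python
import numpy as np

A = np.array([[1, 3],
              [4, 12]])   # second column = 3 * first column
B = np.array([[1, 3],
              [4, 30]])   # columns are not proportional

rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
print(rank_A)  # 1 (columns linearly dependent)
print(rank_B)  # 2 (full rank)
```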
Matrix Inverse

• Note: for scalars (except 0), when we multiply a number by its reciprocal, we get 1: 2(1/2) = 1, x(1/x) = x(x^{-1}) = 1.
• In matrix form, if A is a square matrix of full rank (all rows and columns are linearly independent), then A has an inverse A^{-1} such that: A^{-1}A = AA^{-1} = I.

Example:

A = \begin{bmatrix} -2 & 8 \\ 4 & 2 \end{bmatrix}
\qquad
A^{-1} = \begin{bmatrix} -2/36 & 8/36 \\ 4/36 & 2/36 \end{bmatrix}

A^{-1}A = \begin{bmatrix} (4+32)/36 & (-16+16)/36 \\ (-8+8)/36 & (32+4)/36 \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I

For a diagonal matrix, the inverse is diagonal with the reciprocals of the diagonal elements:

B = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}
\qquad
B^{-1} = \begin{bmatrix} 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/6 \end{bmatrix}
\qquad
BB^{-1} = \begin{bmatrix} 4(1/4) & 0 & 0 \\ 0 & 2(1/2) & 0 \\ 0 & 0 & 6(1/6) \end{bmatrix} = I
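Both inverse properties can be checked numerically (a sketch I added using the example values as reconstructed above):

```python
import numpy as np

A = np.array([[-2.0, 8.0],
              [4.0, 2.0]])
A_inv = np.linalg.inv(A)

# A^{-1} A = A A^{-1} = I
assert np.allclose(A_inv @ A, np.eye(2))
assert np.allclose(A @ A_inv, np.eye(2))

# the inverse of a diagonal matrix is diagonal with reciprocal entries
B = np.diag([4.0, 2.0, 6.0])
assert np.allclose(np.linalg.inv(B), np.diag([1/4, 1/2, 1/6]))
print("inverse identities hold")
```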
Computing an Inverse of a 2x2 Matrix

For A full rank (columns/rows are linearly independent):

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\qquad
|A| = a_{11}a_{22} - a_{12}a_{21} \quad (the determinant of A)

A^{-1} = \frac{1}{|A|}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}

Note: if A is not full rank, one column is a multiple of the other (a_{12} = k a_{11}, a_{22} = k a_{21} for some value k), so:

|A| = a_{11}(k a_{21}) - (k a_{11})a_{21} = 0

Thus A^{-1} does not exist if A is not full rank. While there are rules for general r x r matrices, we will use computers to solve them.
Regression Example:

X'X = \begin{bmatrix} n & \sum_{i=1}^n X_i \\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{bmatrix}

|X'X| = n\sum_{i=1}^n X_i^2 - \left(\sum_{i=1}^n X_i\right)^2
= n\left(\sum_{i=1}^n X_i^2 - n\bar{X}^2\right)
= n\sum_{i=1}^n (X_i - \bar{X})^2

Note: \sum_{i=1}^n X_i^2 - n\bar{X}^2 = \sum_{i=1}^n (X_i - \bar{X})^2.

(X'X)^{-1} = \frac{1}{n\sum_{i=1}^n (X_i - \bar{X})^2}
\begin{bmatrix} \sum_{i=1}^n X_i^2 & -\sum_{i=1}^n X_i \\ -\sum_{i=1}^n X_i & n \end{bmatrix}
= \begin{bmatrix}
\dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} \\
\dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{1}{\sum_{i=1}^n (X_i - \bar{X})^2}
\end{bmatrix}
Use of Inverse Matrix – Solving Simultaneous Equations

AY = C, where A and C are matrices of constants and Y is a matrix of unknowns:

A^{-1}AY = A^{-1}C \;\Rightarrow\; Y = A^{-1}C \quad (assuming A is square and full rank)

Equation 1: 12y_1 - 6y_2 = 48 \qquad Equation 2: 10y_1 + 2y_2 = 12

A = \begin{bmatrix} 12 & -6 \\ 10 & 2 \end{bmatrix}
\qquad
C = \begin{bmatrix} 48 \\ 12 \end{bmatrix}
\qquad
|A| = 12(2) - (-6)(10) = 84

A^{-1} = \frac{1}{84}\begin{bmatrix} 2 & 6 \\ -10 & 12 \end{bmatrix}

Y = A^{-1}C = \frac{1}{84}\begin{bmatrix} 2(48) + 6(12) \\ -10(48) + 12(12) \end{bmatrix}
= \frac{1}{84}\begin{bmatrix} 96 + 72 \\ -480 + 144 \end{bmatrix}
= \frac{1}{84}\begin{bmatrix} 168 \\ -336 \end{bmatrix}
= \begin{bmatrix} 2 \\ -4 \end{bmatrix}

Note the wisdom of waiting to divide by |A| until the end of the calculation!
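In practice the same system is solved without forming the inverse explicitly (a sketch I added; `np.linalg.solve` is the standard tool):

```python
import numpy as np

# Equation 1: 12*y1 - 6*y2 = 48;  Equation 2: 10*y1 + 2*y2 = 12
A = np.array([[12.0, -6.0],
              [10.0, 2.0]])
C = np.array([48.0, 12.0])

# solve(A, C) computes Y = A^{-1} C via factorization, which is
# faster and more numerically stable than inv(A) @ C
y = np.linalg.solve(A, C)
print(y)  # [ 2. -4.]
```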
Useful Matrix Results

All rules assume that the matrices are conformable to the operations:

Addition Rules:

A + B = B + A \qquad (A + B) + C = A + (B + C)

Multiplication Rules:

(AB)C = A(BC) \qquad C(A + B) = CA + CB \qquad k(A + B) = kA + kB \;\; (k \; a \; scalar)

Transpose Rules:

(A')' = A \qquad (A + B)' = A' + B' \qquad (AB)' = B'A' \qquad (ABC)' = C'B'A'

Inverse Rules (full-rank, square matrices):

(AB)^{-1} = B^{-1}A^{-1} \qquad (ABC)^{-1} = C^{-1}B^{-1}A^{-1} \qquad (A^{-1})^{-1} = A \qquad (A')^{-1} = (A^{-1})'
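A few of these rules can be spot-checked on random matrices (a sketch I added; random Gaussian matrices are full rank with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# transpose rules
assert np.allclose((A @ B).T, B.T @ A.T)
assert np.allclose((A @ B @ C).T, C.T @ B.T @ A.T)

# inverse rules
assert np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A))
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
print("transpose and inverse rules hold")
```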
Random Vectors and Matrices

Shown for the case n = 3; generalizes to any n:

Random variables: Y_1, Y_2, Y_3:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix}
\qquad
E\{Y\} = \begin{bmatrix} E\{Y_1\} \\ E\{Y_2\} \\ E\{Y_3\} \end{bmatrix}

In general, for an n x p random matrix: E\{Y\} = [E\{Y_{ij}\}], \; i = 1,\dots,n;\; j = 1,\dots,p.

Variance-Covariance Matrix for a Random Vector:

\sigma^2\{Y\} = E\{(Y - E\{Y\})(Y - E\{Y\})'\}

= E\left\{
\begin{bmatrix} Y_1 - E\{Y_1\} \\ Y_2 - E\{Y_2\} \\ Y_3 - E\{Y_3\} \end{bmatrix}
\begin{bmatrix} Y_1 - E\{Y_1\} & Y_2 - E\{Y_2\} & Y_3 - E\{Y_3\} \end{bmatrix}
\right\}

= \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_2^2 & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_3^2
\end{bmatrix}
Linear Regression Example (n = 3)

Error terms are assumed to be independent, with mean 0 and constant variance \sigma^2:

E\{\varepsilon_i\} = 0, \quad \sigma^2\{\varepsilon_i\} = \sigma^2, \quad \sigma\{\varepsilon_i, \varepsilon_j\} = 0 \;\; (i \neq j)

E\{\varepsilon\} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\qquad
\sigma^2\{\varepsilon\} = \begin{bmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{bmatrix} = \sigma^2 I

Y = X\beta + \varepsilon \;\Rightarrow\; E\{Y\} = E\{X\beta + \varepsilon\} = X\beta + E\{\varepsilon\} = X\beta

\sigma^2\{Y\} = \sigma^2\{X\beta + \varepsilon\} = \sigma^2\{\varepsilon\} = \sigma^2 I
Mean and Variance of Linear Functions of Y

Let A be a k x n matrix of fixed constants and Y an n x 1 random vector. Then W = AY is a k x 1 random vector:

W = AY = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{bmatrix}
\begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix} =
\begin{bmatrix} a_{11}Y_1 + \dots + a_{1n}Y_n \\ \vdots \\ a_{k1}Y_1 + \dots + a_{kn}Y_n \end{bmatrix}

E\{W\} = \begin{bmatrix} E\{a_{11}Y_1 + \dots + a_{1n}Y_n\} \\ \vdots \\ E\{a_{k1}Y_1 + \dots + a_{kn}Y_n\} \end{bmatrix} =
\begin{bmatrix} a_{11}E\{Y_1\} + \dots + a_{1n}E\{Y_n\} \\ \vdots \\ a_{k1}E\{Y_1\} + \dots + a_{kn}E\{Y_n\} \end{bmatrix} = AE\{Y\}

\sigma^2\{W\} = E\{(AY - AE\{Y\})(AY - AE\{Y\})'\}
= E\{A(Y - E\{Y\})(Y - E\{Y\})'A'\}
= AE\{(Y - E\{Y\})(Y - E\{Y\})'\}A'
= A\sigma^2\{Y\}A'
Multivariate Normal Distribution

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
\mu = E\{Y\} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}
\qquad
\Sigma = \sigma^2\{Y\} = \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{bmatrix}

Multivariate Normal Density function:

f(Y) = (2\pi)^{-n/2}|\Sigma|^{-1/2}\exp\left[-\frac{1}{2}(Y - \mu)'\Sigma^{-1}(Y - \mu)\right]
\qquad
Y \sim N(\mu, \Sigma)

Marginally, Y_i \sim N(\mu_i, \sigma_i^2), \; i = 1,\dots,n, with \sigma\{Y_i, Y_j\} = \sigma_{ij} \; (i \neq j).

Note: if A is a (full-rank) matrix of fixed constants:

W = AY \sim N(A\mu, A\Sigma A')
Simple Linear Regression in Matrix Form

Simple Linear Regression Model: Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1,\dots,n:

Y_1 = \beta_0 + \beta_1 X_1 + \varepsilon_1
Y_2 = \beta_0 + \beta_1 X_2 + \varepsilon_2
\;\vdots
Y_n = \beta_0 + \beta_1 X_n + \varepsilon_n

Defining:

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\qquad
X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\qquad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
\qquad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

Y = X\beta + \varepsilon \;\Rightarrow\; E\{Y\} = X\beta, \; since:

X\beta = \begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix}

Assuming constant variance and independence of the error terms \varepsilon_i:

\sigma^2\{Y\} = \sigma^2\{\varepsilon\} =
\begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 I

Further, assuming a normal distribution for the error terms \varepsilon_i \sim N(0, \sigma^2): \quad Y \sim N(X\beta, \sigma^2 I)
Estimating Parameters by Least Squares

Normal equations obtained from \partial Q / \partial \beta_0 and \partial Q / \partial \beta_1, setting each equal to 0:

n b_0 + b_1\sum_{i=1}^n X_i = \sum_{i=1}^n Y_i
b_0\sum_{i=1}^n X_i + b_1\sum_{i=1}^n X_i^2 = \sum_{i=1}^n X_i Y_i

Note, in matrix form, with

X'X = \begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix}
\qquad
X'Y = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}
\qquad
b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}

the normal equations are:

X'Xb = X'Y \;\Rightarrow\; b = (X'X)^{-1}X'Y

Based on the matrix form:

Q = (Y - X\beta)'(Y - X\beta) = Y'Y - Y'X\beta - \beta'X'Y + \beta'X'X\beta = Y'Y - 2\beta'X'Y + \beta'X'X\beta

\frac{\partial Q}{\partial \beta} = -2X'Y + 2X'X\beta

Setting this equal to zero, and replacing \beta with b: \; X'Xb = X'Y.
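Applying b = (X'X)^{-1}X'Y to the Toluca data from the earlier slide reproduces the familiar KNNL estimates (a sketch I added; `solve` is used instead of an explicit inverse):

```python
import numpy as np

# Toluca data (lot size x, work hours Y)
x = np.array([80, 30, 50, 90, 70, 60, 120, 80, 100, 50, 40, 70, 90,
              20, 110, 100, 30, 50, 90, 110, 30, 90, 40, 80, 70], dtype=float)
Y = np.array([399, 121, 221, 376, 361, 224, 546, 352, 353, 157, 160, 252, 389,
              113, 435, 420, 212, 268, 377, 421, 273, 468, 244, 342, 323], dtype=float)
X = np.column_stack([np.ones_like(x), x])

# b = (X'X)^{-1} X'Y, solved as the linear system (X'X) b = X'Y
b = np.linalg.solve(X.T @ X, X.T @ Y)
print(b)  # approximately [62.37, 3.5702]
```

So the fitted line is approximately \hat{Y} = 62.37 + 3.5702 X.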
Fitted Values and Residuals

\hat{Y}_i = b_0 + b_1 X_i \qquad e_i = Y_i - \hat{Y}_i

\hat{Y} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} =
\begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}

In matrix form:

\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY \qquad H = X(X'X)^{-1}X'

H is called the "hat" or "projection" matrix; note that H is idempotent (HH = H) and symmetric (H = H'):

HH = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}X' = H
H' = [X(X'X)^{-1}X']' = X(X'X)^{-1}X' = H

e = Y - \hat{Y} = Y - Xb = Y - HY = (I - H)Y

Note:

E\{\hat{Y}\} = E\{HY\} = HE\{Y\} = HX\beta = X(X'X)^{-1}X'X\beta = X\beta
\sigma^2\{\hat{Y}\} = H(\sigma^2 I)H' = \sigma^2 H \qquad s^2\{\hat{Y}\} = MSE \cdot H

E\{e\} = E\{(I - H)Y\} = (I - H)E\{Y\} = (I - H)X\beta = X\beta - X\beta = 0
\sigma^2\{e\} = (I - H)(\sigma^2 I)(I - H)' = \sigma^2(I - H) \qquad s^2\{e\} = MSE(I - H)
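The hat-matrix properties are easy to verify on a small made-up dataset (the numbers below are my own illustration):

```python
import numpy as np

# small illustrative dataset (not from the text)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
X = np.column_stack([np.ones_like(x), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix

# H is symmetric and idempotent
assert np.allclose(H, H.T)
assert np.allclose(H @ H, H)

Y_hat = H @ Y                          # fitted values
e = (np.eye(len(Y)) - H) @ Y           # residuals
# with an intercept in the model, residuals sum to zero
assert np.isclose(e.sum(), 0.0)
print(Y_hat)
```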
Analysis of Variance

Total (Corrected) Sum of Squares:

SSTO = \sum_{i=1}^n (Y_i - \bar{Y})^2 = \sum_{i=1}^n Y_i^2 - \frac{1}{n}\left(\sum_{i=1}^n Y_i\right)^2

Note: \sum Y_i^2 = Y'Y and \left(\sum Y_i\right)^2 = Y'JY, where J = 11', so:

SSTO = Y'Y - \frac{1}{n}Y'JY = Y'\left(I - \frac{1}{n}J\right)Y

Error Sum of Squares:

SSE = e'e = (Y - Xb)'(Y - Xb) = Y'Y - Y'Xb - b'X'Y + b'X'Xb = Y'Y - b'X'Y = Y'(I - H)Y

since X'Xb = X'Y, and b'X'Y = Y'X(X'X)^{-1}X'Y = Y'HY.

Regression Sum of Squares:

SSR = SSTO - SSE = b'X'Y - \frac{1}{n}Y'JY = Y'HY - \frac{1}{n}Y'JY = Y'\left(H - \frac{1}{n}J\right)Y

Note that SSTO, SSE, and SSR are all QUADRATIC FORMS: Y'AY for symmetric matrices A.
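The three quadratic forms can be computed directly and checked against the scalar formulas and the ANOVA identity (a sketch I added on my own small dataset):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
n = len(Y)
X = np.column_stack([np.ones(n), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T
J = np.ones((n, n))
I = np.eye(n)

# each sum of squares as a quadratic form Y'AY with symmetric A
SSTO = Y @ (I - J / n) @ Y
SSE = Y @ (I - H) @ Y
SSR = Y @ (H - J / n) @ Y

assert np.isclose(SSTO, ((Y - Y.mean())**2).sum())  # matches scalar formula
assert np.isclose(SSTO, SSR + SSE)                  # ANOVA decomposition
print(SSTO, SSR, SSE)
```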
Inferences in Linear Regression

b = (X'X)^{-1}X'Y \;\Rightarrow\; E\{b\} = (X'X)^{-1}X'E\{Y\} = (X'X)^{-1}X'X\beta = \beta

\sigma^2\{b\} = (X'X)^{-1}X'\sigma^2\{Y\}X(X'X)^{-1} = \sigma^2(X'X)^{-1}X'IX(X'X)^{-1} = \sigma^2(X'X)^{-1}

s^2\{b\} = MSE(X'X)^{-1}

Recall:

(X'X)^{-1} = \begin{bmatrix}
\dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} \\
\dfrac{-\bar{X}}{\sum_{i=1}^n (X_i - \bar{X})^2} & \dfrac{1}{\sum_{i=1}^n (X_i - \bar{X})^2}
\end{bmatrix}

so:

s^2\{b_0\} = MSE\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]
\qquad
s^2\{b_1\} = \frac{MSE}{\sum_{i=1}^n (X_i - \bar{X})^2}
Estimated Mean Response at X = X_h:

X_h = \begin{bmatrix} 1 \\ X_h \end{bmatrix}
\qquad
\hat{Y}_h = X_h'b = b_0 + b_1 X_h

s^2\{\hat{Y}_h\} = X_h's^2\{b\}X_h = MSE \cdot X_h'(X'X)^{-1}X_h
= MSE\left[\frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]

Predicted New Response at X = X_h:

\hat{Y}_h = X_h'b = b_0 + b_1 X_h

s^2\{pred\} = MSE\left[1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]
= MSE\left[1 + X_h'(X'X)^{-1}X_h\right]
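The matrix and scalar forms of these variances agree numerically (a sketch I added on my own small dataset; X_h = 3.5 is an arbitrary choice):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
n = len(Y)
X = np.column_stack([np.ones(n), x])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
e = Y - X @ b
MSE = (e @ e) / (n - 2)

Xh = np.array([1.0, 3.5])                # X_h = 3.5 (arbitrary)
s2_mean = MSE * (Xh @ XtX_inv @ Xh)      # variance of estimated mean response

# matches MSE * [1/n + (Xh - xbar)^2 / Sxx]
Sxx = ((x - x.mean())**2).sum()
assert np.isclose(s2_mean, MSE * (1/n + (3.5 - x.mean())**2 / Sxx))

s2_pred = MSE * (1 + Xh @ XtX_inv @ Xh)  # prediction variance, always larger
assert s2_pred > s2_mean
print(s2_mean, s2_pred)
```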