matrix approach to simple linear regressionusers.stat.ufl.edu/~winner/sta6208/matrix_reg.pdf ·...

Introduction to Matrices and Matrix Approach to Simple

Linear Regression

Matrices • Definition: A matrix is a rectangular array of numbers

or symbolic elements

• In many applications, the rows of a matrix will represent individuals cases (people, items, plants, animals,...) and columns will represent attributes or characteristics

• The dimension of a matrix is it number of rows and columns, often denoted as r x c (r rows by c columns)

• Can be represented in full form or abbreviated form:

11 12 1 1

21 22 2 2

1 2

1 2

1,..., ; 1,...,

j c

j c

ij

i i ij ic

r r rj rc

a a a a

a a a a

a i r j ca a a a

a a a a

A

Special Types of Matrices

11 12

21 22

1

2

3

4

Square Matrix: Number of rows = # of Columns

20 32 50

12 28 42

28 46 60

Vector: Matrix with one column (column vector) or one row (row vector)

57

24

18

r c

b b

b b

d

d

d

d

A B

C D 1 2 3

2 3 3 2

11 1

1

' 17 31 '

Transpose: Matrix formed by interchanging rows and columns of a matrix (use "prime" to denote transpose)

6 86 15 22

' 15 138 13 25

22 25

c

r c

r rc

f f f

h h

h h

E F

G G

H

11 1

1

1,..., ; 1,..., ' 1,..., ; 1,...,

Matrix Equality: Matrices of the same dimension, and corresponding elements in same cells are all equal:

r

ij jic r

c rc

h h

h i r j c h j c i r

h h

H

A11 12

11 12 21 22

21 22

4 6 = 4, 6, 12, 10

12 10

b bb b b b

b b

B

Regression Examples - Toluca Data

1

2

1

1 21

1

2

2

21 2

Response Vector:

'

1

1Design Matrix:

1

1 1 1'

n

n

nn

n

n

nn

Y

Y

Y

Y Y Y

X

X

X

X X X

Y

Y

X

X

X Y

1 80 399

1 30 121

1 50 221

1 90 376

1 70 361

1 60 224

1 120 546

1 80 352

1 100 353

1 50 157

1 40 160

1 70 252

1 90 389

1 20 113

1 110 435

1 100 420

1 30 212

1 50 268

1 90 377

1 110 421

1 30 273

1 90 468

1 40 244

1 80 342

1 70 323

Matrix Addition and Subtraction

2 22 2 2 2 2 2

11 1

1

Addition and Subtraction of 2 Matrices of Common Dimension:

4 7 2 0 4 2 7 0 6 7 4 2 7 0 2 7

10 12 14 6 10 14 12 6 24 18 10 14 12 6 4 6

c

r c

r rc

a a

a a

C D C D C D

A

11 1

1

11 11 1 1

1 1

11 11 1 1

1 1

1,..., ; 1,..., 1,..., ; 1,...,

1,..., ; 1,...,

c

ij ijr c

r rc

c c

ij ijr c

r r rc rc

c c

r c

r r

b b

a i r j c b i r j c

b b

a b a b

a b i r j c

a b a b

a b a b

a b

B

A B

A B

1 1 1 1 1

2 2 2 2 2

1 11 11 1

1,..., ; 1,...,

Regression Example:

since

ij ij

rc rc

n nn nn n

n n n n

a b i r j c

a b

Y E Y Y E Y

Y E Y Y E Y

Y E Y Y E Y

Y E Y ε Y E Y ε

1 1 1

2 2 2

n n n n

E Y

E Y

E Y

Matrix Multiplication Multiplication of a Matrix by a Scalar (single number):

2 1 3(2) 3(1) 6 33

2 7 3( 2) 3(7) 6 21

Multiplication of a Matrix by a Matrix (#cols( ) = #rows( )):

If :A A B B A

A Br c r c r

k k

c r

A A

A B

A B AB

th th

3 2 2 2

3 2 2 2 3 2

= 1,..., ; 1,...,

sum of the products of the elements of i row of and j column of :

2 53 1

3 12 4

0 7

2(3) 5(2) 2( 1) 5(4)

3(3) ( 1

B

ij A Bc

ij A B

ab i r j c

ab c r

A B

A B

A B AB

1

16 18

)(2) 3( 1) ( 1)(4) 7 7

0(3) 7(2) 0( 1) 7(4) 14 8

If : = = 1,..., ; 1,...,A A B B A B

c

A B ij ik kj A Br c r c r c

k

c r c ab a b i r j c

A B AB

Matrix Multiplication Examples - I

11 1 12 2 1 21 1 22 2 2

11 12 1 1

1 2

21 22 2 2

11 1 12 2 1

21 1 22 2 2

22 2

Simultaneous Equations:

(2 equations: , unknown):

4

Sum of Squares: 4 2 3 4 2 3 2

3

a x a x y a x a x y

a a x yx x

a a x y

a x a x y

a x a x y

AX = Y

1 0 1 1

2 0 1 20

1

0 1

29

1

1Regression Equation (Expected Values):

1 n n

X X

X X

X X

Matrix Multiplication Examples - II

1

2 2

1 2

1

1

12

1 2 2

1 1

Matrices used in simple linear regression (that generalize to multiple regression):

'

1

1 1 1 1'

1

n

n i

i

n

n

i

i

n nn

i i

i in

Y

YY Y Y Y

Y

Xn X

X

X X XX X

X

Y Y

X X

1 1 0 1 1

12 2 0 1 20

1 2 1

1 0 1

1

1 1 1 1'

1

n

i

i

nn

i i

in n n

Y X XY

Y X X

X X XX Y

Y X X

X Y Xβ

Special Matrix Types Symmetric Matrix: Square matrix with a transpose equal to itself: ':

6 19 8 6 19 8

19 14 3 19 14 3

8 3 1 8 3 1

Diagonal Matrix: Square matrix with all off-diagonal elements equal to 0:

A = A

A A' A

A

1

2

3

4 0 0 0 0

0 1 0 0 0 Note:Diagonal matrices are symmetric (not vice versa)

0 0 2 0 0

Identity Matrix: Diagonal matrix with all diagonal elements equal to 1 (acts like multiplying a

b

b

b

B

11 12 13 11 12 13

21 22 23 21 22 233 3 3 3

31 32 33 31 32 33

scalar by 1):

1 0 0

0 1 0

0 0 1

Scalar Matrix: Diagonal matrix with all diagonal elements equal to a single

a a a a a a

a a a a a a

a a a a a a

I A IA AI A

4 4

1 1 11

number"

0 0 0 1 0 0 0

0 0 0 0 1 0 0

0 0 0 0 0 1 0

0 0 0 0 0 0 1

1-Vector and matrix and zero-vector:

1 01 1

1 0 Note: ' 1 1 1

1 11 0

r r rr r r

k

kk k

k

k

I

1 J 0 1 1 11

1 11 1

1 1' 1 1 1

1 11 1

r r r rr

1 1 J

Linear Dependence and Rank of a Matrix

• Linear Dependence: When a linear function of the columns (rows) of a matrix produces a zero vector (one or more columns (rows) can be written as linear function of the other columns (rows))

• Rank of a matrix: Number of linearly independent columns (rows) of the matrix. Rank cannot exceed the minimum of the number of rows or columns of the matrix. rank(A) ≤ min(rA,ca)

• A matrix if full rank if rank(A) = min(rA,ca)

1 2 1 22 2 2 1 2 1

1 2 1 22 2 2 1 2 1

1 33 Columns of are linearly dependent rank( ) = 1

4 12

4 30 0 Columns of are linearly independent rank( ) = 2

4 12

A A A A A 0 A A

B B B B B 0 B B

Matrix Inverse

• Note: For scalars (except 0), when we multiply a number, by its reciprocal, we get 1: 2(1/2)=1 x(1/x)=x(x-1)=1

• In matrix form if A is a square matrix and full rank (all rows and columns are linearly independent), then A has an inverse: A-1 such that: A-1 A = A A-1 = I

2 8 2 8 4 32 16 16

2 8 2 8 1 036 36 36 36 36 36 36 36

4 2 4 2 4 2 4 2 8 8 32 4 0 1

36 36 36 36 36 36 36 36

4 0 0 1/ 4 0 0 4 1

0 2 0 0 1/ 2 0

0 0 6 0 0 1/ 6

-1 -1

-1 -1

A A A A I

B B BB

/ 4 0 0 0 0 0 0 0 0 1 0 0

0 0 0 0 2 1/ 2 0 0 0 0 0 1 0

0 0 0 0 0 0 0 0 6 1/ 6 0 0 1

I

Computing an Inverse of 2x2 Matrix 11 12

2 221 22

11 22 12 21

11 12 21 22

11 22 12 21 12 22 12

full rank (columns/rows are linearly independent)

Determinant of

Note: If A is not full rank (for some value ):

a a

a a

a a a a

k a ka a ka

a a a a ka a a k

A

A A

A 22

22 121

2 221 11

1

2

0

1 Thus does not exist if is not full rank

While there are rules for general matrices, we will use computers to solve them

Regression Example:

1

1

1 n

a

a a

a a

r r

X

X

X

-1A A A

A

X

2

22

1 12 2

1 1 1 12

1 1

2

1 1 1

2

1 1

1 Note:

n n

i in n n ni i

i i i in ni i i i

i i

i i

n n

i i

i i

in ni

i ii i

n X X

n X X n X n X Xn

X X

X X

X

n X X X n

X'X X'X

X'X

2 22 22 2

1 1 1 1 1

2

2 2

1 1 1

2 2

1 1

1

1

n n n n n

i i i i

i i i i

n n

i i

i i

n n

i i

i i

nX X X X nX X X X nX

X X

nX X X X

X

X X X X

X'X

Use of Inverse Matrix – Solving Simultaneous Equations

1 2 1 2

1

where and are matrices of of constants, is matrix of unknowns

(assuming is square and full rank)

Equation 1: 12 6 48 Equation 2: 10 2 12

12 6

10 2

y y y y

y

y

-1 -1 -1

AY = C A C Y

A AY A C Y = A C A

A Y2

48

12

2 6 2 61 1

10 12 10 1212( 2) 6(10) 84

2 6 48 96 72 168 21 1 1

10 12 12 480 144 336 484 84 84

Note the wisdom of waiting to divide by a

-1

-1

-1

C Y = A C

A

Y = A C

| A | t end of calculation!

Useful Matrix Results All rules assume that the matrices are conformable to operations:

Addition Rules:

( ) ( )

Multiplication Rules:

( ) ( ) scalar

Transpose Rules:

( ') ' ( ) ' '

k k k k

A B B A A B C A B C

(AB)C A(BC) C A B CA + CB A B A B

A A A B A ' ( ) '

Inverse Rules (Full Rank, Square Matrices):

-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

B AB B'A' (ABC)' = C'B'A'

(AB) = B A (ABC) = C B A (A ) = A (A') = (A )'

Random Vectors and Matrices

1

1 2 3 2

3

1

2

3

Shown for case of =3, generalizes to any :

Random variables: , ,

Expectation: In general: 1,..., ; 1,...,

Variance-Covariance Matri

ij

n p

n n

Y

Y Y Y Y

Y

E Y

E Y E Y i n j p

E Y

Y

E Y E Y

1 1

2

2 2 1 1 2 2 3 3

3 3

2

1 1 1 1 2 2 1 1 3 3

2

2 2 1 1 2 2 2 2 3 3

3 3 1 1 3 3 2 2 3

x for a Random Vector:

Y E Y

E Y E Y Y E Y Y E Y Y E Y

Y E Y

Y E Y Y E Y Y E Y Y E Y Y E Y



Y Y E Y Y E Y ' E

E

2

1 12 13

2

21 2 23

22

31 32 33

Linear Regression Example (n=3)

2

2 2

2

1

2 2

2

2

3

Error terms are assumed to be independent, with mean 0, constant variance :

0 , 0

0 0 0

0 0 0

0 0 0

i i i jE i j

2

2 2

ε E ε σ ε I

Y = Xβ + ε E Y E Xβ+ε Xβ E ε Xβ

σ Y σ Xβ+ε

2

2 2

2

0 0

0 0

0 0

2σ ε I

Mean and Variance of Linear Functions of Y

1

11 1 1 1 11 1 1

1

1 1 1

1 1

matrix of fixed constants random vector

...

random vector:

...

k n n

n n n

k

k kn n k k kn n

k

a a Y W a Y a Y

a a Y W a Y a Y

E W E a

E W

A Y

W AY W

E W

1 1 1 11 1 1

1 1 1 1

11 1 1

1

... ...

... ...

n n n n

k kn n k kn n

n

k kn n

Y a Y a E Y a E Y

E a Y a Y a E Y a E Y

a a E Y

a a E Y

2

AE Y

σ W = E AY -AE Y AY -AE Y ' = E A Y -E Y A Y -E Y ' =

E A Y - E Y Y - E Y 'A

2' = AE Y - E Y Y - E Y ' A' = Aσ Y A'

Multivariate Normal Distribution

21 1 1 12 1

22 2 21 2 2

2

1 2

1/2/2

2

Multivariate Normal Density function:

12 exp ~

2

~ , 1,...,

n

n

n n n n n

n

i i i

Y

Y

Y

f N

Y N i n Y

2

-1

Y μ = E Y Σ = σ Y

Y Σ Y -μ 'Σ Y -μ Y μ,Σ

,

Note, if is a (full rank) matrix of fixed constants:

~

i j ijY i j

N

A

W = AY Aμ,AΣA'

Simple Linear Regression in Matrix Form 0 1

1 0 1 1 1

2 0 1 2 2

0 1

1 1 1

2 2 20

1

Simple Linear Regression Model: 1,...,

Defining:

1

1=

1

i i i

n n n

n n n

Y X i n

Y X

Y X

Y X

Y X

Y X

Y X

Y X β ε

0 1 1

0 1 2

0 1

2

2

2

2

since:

Assuming constant variance, and independence of error terms :

0 0

0 0

0 0

Further, assuming normal distributio

n

i

n n

X

X

X

2 2

Y = Xβ +ε Xβ E Y

σ Y = σ ε I

2n for error terms : ~ ,i N Y Xβ I

Estimating Parameters by Least Squares

0 1

^ ^ ^ ^2

0 1 0 1

1 1 1 1 1

1

2

1 1

Normal equations obtained from: , and setting each equal to 0:

( ) ( )

Note: In matrix form: '

n n n n n

i i i i i i

i i i i i

n

i

i

n n

i i

i i

Q Q

i n X Y ii X X X Y

n X

X X

X X

^

1 0

^

1

1

2

0 1 0 0

1 1

Defining

Based on matrix form:

2

' 2 2

n

i

i

n

i i

i

n n

i i i

i i

Y

X Y

Q

Y Y Y X Y n

^

^ ^-1

X'Y β

X'Xβ = X'Y β = X'X X'Y

Y - Xβ ' Y - Xβ = Y'Y -Y'Xβ -β'X'Y +β'X'Xβ Y'Y - Y'Xβ + β'X'Xβ

2 2

1 1

1 1

0 1set

1 10

2

0 1

1 1 11

2 2 2

( ) 2 2 ' '

2 2 2

General Result for fixed symmetric matr

n n

i i

i i

n n

i i

i i

n n n

i i i i

i i i

X X

QY n X

QQ

X Y X X

^ ^-1

X'Y X'Xβ 0 X Xβ X Y β = X'X X'Yβ

ix and variable vector : 2

A w Aw A' wAw Aww w

Fitted Values and Residuals

^ ^ ^

0 1

^ ^^

1 0 1 1

^ ^^^

2 0 1 2

^ ^^

0 1

In Matrix form:

is called the "projection" or "hat" matrix, note that is id

i ii i i

n n

Y b X e Y Y

XY

XY

XY

^-1 -1

Y Xβ = X X'X X'Y = PY P = X X'X X'

P P

^ ^

1 111

^ ^

22 22

^ ^

empotent ( ) and symmetric( ):

n

n nn

Y Y YY

YY Y Y

YY Y Y

-1 -1 -1 -1 -1 -1

^

PP = P P = P'

PP = X X'X X'X X'X X' X'I X'X X' X X'X X' P P' = X X'X X' ' = X X'X X' = P

e Y - Y = Y - Xβ

2

2

Note:

MSE MSE

^

^ ^-1 2 2

2 2

^2 2

= Y - PY = (I - P)Y

E Y = E PY = PE Y = PXβ = X X'X X'Xβ = Xβ σ Y = Pσ IP' P

E e = E I P Y = I P E Y = I P Xβ Xβ - Xβ = 0 σ e = I P σ I I P ' I P

s Y P s e = I P

Analysis of Variance

2

212

1 1

2

12

1

Total (Corrected) Sum of Squares:

1 11 1 1

Note: ' ' '

1 1

Defining

n

in ni

i i

i i

n

ini

in n

i

Y

SSTO Y Y Yn

Y

Y SSTOn n n n

SSE

Y Y Y'JY J Y Y Y'JY Y I J Y

2^

1

as the Residual (Error) Sum of Squares:

since

Defining as the Regression Su

n

ii

i

SSE Y Y

SSR

^ ^ ^ ^ ^ ^ ^

^-1

e'e = Y - Xβ ' Y - Xβ = Y'Y -Y'Xβ-βX'Y +β'X'Xβ = Y'Y -β 'X'Y = Y' I - P Y

β'X'Y = Y'X' X'X X'Y = Y'PY

2

2^

1

1

m of Squares:

1 1'

Note that , , and are all QUADRATIC FORMS: for symmetric and idempotent matrice

n

ini

i

i

Y

SSR Y Y SSTO SSEn n n

SSTO SSR SSE

^

β'X'Y Y'PY Y'JY Y P J Y

Y'AY s A

Inferences in Linear Regression

2

2

2 2

1 1

2 2

1 1

where 2

1

Recall:1

n n

i i

i i

n n

i i

i i

MSE

SSEs MSE

n

X X

nX X X X

X

X X X X

^ ^-1 -1 -1

^ ^-1 -1 -1 -1 -1 -12 2 2 2 2

-1 2

β = X'X X'Y E β = X'X X'E Y = X'X X'Xβ = β

σ β = X'X X'σ Y X X'X = σ X'X X'IX X'X = σ X'X s β X'X

X'X s β

2

2 2

1 1

2 2

1 1

^ ^ ^ ^2

0 1

Estimated Mean Response at :

1

Predicted New Response at

n n

i i

i i

n n

i i

i i

h

h hh

h

MSE X MSE XMSE

nX X X X

XMSE MSE

X X X X

X X

Y X s Y MSEX

^

^ ^-12

h h h h h hX 'β X X 's β X X ' X'X X

^ ^ ^

2

0 1

:

pred 1

h

h h

X X

Y X s MSE

^

-1

h h hX 'β X ' X'X X

matrix approach to simple linear regressionusers.stat.ufl.edu/~winner/sta6208/matrix_reg.pdf ·...

Documents