Topic 11: Matrix Approach to Linear Regression
Outline
• Linear Regression in Matrix Form
The Model in Scalar Form
• Yi = β0 + β1Xi + ei
• The ei are independent Normally distributed random variables with mean 0 and variance σ2
• Consider writing out the n observations:
Y1 = β0 + β1X1 + e1
Y2 = β0 + β1X2 + e2
⋮
Yn = β0 + β1Xn + en
The Model in Matrix Form
$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
The Model in Matrix Form II
$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
The Design Matrix
$$X_{n \times 2} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}$$
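Below is a minimal NumPy sketch (not from the original slides; the x values are hypothetical) of how the design matrix is built: a column of ones for the intercept next to the column of predictor values.

```python
import numpy as np

# Hypothetical predictor values (illustrative only)
x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])

# Design matrix: first column all ones, second column the X_i
X = np.column_stack([np.ones(len(x)), x])
print(X)   # shape (n, 2); each row is [1, X_i]
```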
Vector of Parameters
$$\beta_{2 \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}$$
Vector of error terms
$$\varepsilon_{n \times 1} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
Vector of responses
$$Y_{n \times 1} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
Simple Linear Regression in Matrix Form
$$Y_{n \times 1} = X_{n \times 2}\,\beta_{2 \times 1} + \varepsilon_{n \times 1}$$
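As a quick illustration, here is a small simulation of Y = Xβ + ε; the values of β and σ are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])     # hypothetical predictors
X = np.column_stack([np.ones(len(x)), x])     # n x 2 design matrix
beta = np.array([20.0, 2.5])                  # assumed (beta0, beta1)
e = rng.normal(0.0, 1.5, size=len(x))         # e_i iid N(0, sigma^2), sigma = 1.5
Y = X @ beta + e                              # the model in matrix form
print(Y)
```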
Variance-Covariance Matrix
$$\sigma^2\{Y\} = \begin{bmatrix} \sigma^2\{Y_1\} & \sigma\{Y_1,Y_2\} & \cdots & \sigma\{Y_1,Y_n\} \\ \sigma\{Y_2,Y_1\} & \sigma^2\{Y_2\} & \cdots & \sigma\{Y_2,Y_n\} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma\{Y_n,Y_1\} & \sigma\{Y_n,Y_2\} & \cdots & \sigma^2\{Y_n\} \end{bmatrix}$$
Main diagonal values are the variances and off-diagonal values are the covariances.
Covariance Matrix of e
$$\text{Cov}\{\varepsilon\} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}_{n \times n} = \sigma^2 I$$
Independent errors mean that the covariance between any two different error terms is zero. A common variance means the main diagonal values are all equal.
Covariance Matrix of Y
$$\text{Cov}\{Y\} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}_{n \times n} = \sigma^2 I$$
Distributional Assumptions in Matrix Form
• e ~ N(0, σ2I)
• I is an n × n identity matrix
• Ones in the diagonal elements specify that the variance of each ei is 1 × σ2
• Zeros in the off-diagonal elements specify that the covariance between different ei is zero
• This implies that the correlations are zero (a simulation sketch follows below)
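A small simulation can make the assumption e ~ N(0, σ2I) concrete; this sketch (with arbitrary choices of n, σ, and the number of replications) draws many error vectors and checks that their sample covariance matrix is close to σ2I.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 4, 2.0, 200_000
E = rng.normal(loc=0.0, scale=sigma, size=(reps, n))  # each row is one error vector e
# Sample covariance across replications: approximately sigma^2 I
# (about 4.0 on the diagonal, about 0 off the diagonal)
print(np.cov(E, rowvar=False).round(2))
```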
Least Squares
• We want to minimize (Y − Xβ)′(Y − Xβ), the sum of the squared deviations
• We take the derivative with respect to the (vector) β
• This is like minimizing a quadratic function
• Recall the function Σ(Yi − β0 − β1Xi)2 we minimized using the scalar form
Least Squares
• The derivative is 2(Y − Xβ)′ times the derivative of (Y − Xβ) with respect to β
• In other words, –2X′(Y − Xβ)
• We set this equal to 0 (a vector of zeros)
• So, –2X′(Y − Xβ) = 0
• Or, X′Y = X′Xβ (the normal equations)
Normal Equations
• X′Y = (X′X)β
• Solving for β gives the least squares solution b = (b0, b1)′
• b = (X′X)–1(X′Y), as sketched in code below
• See KNNL p 199 for details
• This same matrix approach works for multiple regression!
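A minimal sketch of the normal equations in code (the data values are hypothetical, not from KNNL): solving the linear system X′Xb = X′Y is numerically preferable to forming the inverse explicitly.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])      # hypothetical predictors
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])   # hypothetical responses
X = np.column_stack([np.ones(len(x)), x])

b = np.linalg.solve(X.T @ X, X.T @ Y)          # b = (b0, b1)
print(b)
# np.linalg.lstsq(X, Y, rcond=None)[0] returns the same solution more stably.
```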
Fitted Values
$$\hat{Y} = Xb: \quad \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}$$
Hat Matrix
$$\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY, \qquad \text{where } H = X(X'X)^{-1}X'$$
We’ll use this matrix when assessing diagnostics in multiple regression.
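To see the hat matrix in action, this sketch (same hypothetical data as the earlier sketches) forms H = X(X′X)⁻¹X′ and confirms that HY reproduces the fitted values Xb.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])
X = np.column_stack([np.ones(len(x)), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix, n x n
b = np.linalg.solve(X.T @ X, X.T @ Y)      # least squares estimates
print(np.allclose(H @ Y, X @ b))           # True: Y-hat = HY = Xb
```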
Estimated Covariance Matrix of b
• The estimator b is a linear combination of the elements of Y
• These estimates are Normal if Y is Normal
• These estimates will be approximately Normal in general
A Useful Multivariate Theorem
• U ~ N(μ, Σ), a multivariate Normal vector
• V = c + DU, a linear transformation of U
c is a vector and D is a matrix
• Then V ~ N(c + Dμ, DΣD′)
Application to b
• b = (X′X)–1(X′Y) = ((X′X)–1X′)Y
• Since Y ~ N(Xβ, σ2I), the theorem says the vector b is Normally distributed with mean (X′X)–1X′Xβ = β and covariance σ2((X′X)–1X′)I((X′X)–1X′)′ = σ2(X′X)–1 (illustrated numerically below)
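In practice σ2 is unknown, so the covariance matrix of b is estimated by substituting MSE = SSE/(n − 2); here is a sketch with the same hypothetical data as above.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])
X = np.column_stack([np.ones(len(x)), x])

b = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ b                              # residuals
mse = resid @ resid / (len(Y) - 2)             # estimate of sigma^2
cov_b = mse * np.linalg.inv(X.T @ X)           # estimated covariance matrix of b
print(np.sqrt(np.diag(cov_b)))                 # standard errors of b0 and b1
```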
Background Reading
• We will use this framework to do multiple regression where we have more than one explanatory variable
• Adding another explanatory variable corresponds to adding another column to the design matrix
• See Chapter 6