Topic 11: Matrix Approach to Linear Regression
Outline
• Linear Regression in Matrix Form
The Model in Scalar Form
• Yi = β0 + β1Xi + ei
• The ei are independent Normally distributed random variables with mean 0 and variance σ2
• Consider writing out the n observations:
Y1 = β0 + β1X1 + e1
Y2 = β0 + β1X2 + e2
⋮
Yn = β0 + β1Xn + en
The Model in Matrix Form
$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
The Model in Matrix Form II
$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
The Design Matrix
$$X_{n \times 2} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}$$
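Below is a minimal NumPy sketch (not from the original slides; the x values are hypothetical) of how the design matrix is built: a column of ones for the intercept next to the column of predictor values.

```python
import numpy as np

# Hypothetical predictor values (illustrative only)
x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])

# Design matrix: first column all ones, second column the X_i
X = np.column_stack([np.ones(len(x)), x])
print(X)   # shape (n, 2); each row is [1, X_i]
```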
Vector of Parameters
$$\beta_{2 \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}$$
Vector of error terms
$$\varepsilon_{n \times 1} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
Vector of responses
$$Y_{n \times 1} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
Simple Linear Regression in Matrix Form
$$Y_{n \times 1} = X_{n \times 2}\,\beta_{2 \times 1} + \varepsilon_{n \times 1}$$
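As a quick illustration, here is a small simulation of Y = Xβ + ε; the values of β and σ are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])     # hypothetical predictors
X = np.column_stack([np.ones(len(x)), x])     # n x 2 design matrix
beta = np.array([20.0, 2.5])                  # assumed (beta0, beta1)
e = rng.normal(0.0, 1.5, size=len(x))         # e_i iid N(0, sigma^2), sigma = 1.5
Y = X @ beta + e                              # the model in matrix form
print(Y)
```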
Variance-Covariance Matrix
$$\sigma^2\{Y\} = \begin{bmatrix} \sigma^2\{Y_1\} & \sigma\{Y_1,Y_2\} & \cdots & \sigma\{Y_1,Y_n\} \\ \sigma\{Y_2,Y_1\} & \sigma^2\{Y_2\} & \cdots & \sigma\{Y_2,Y_n\} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma\{Y_n,Y_1\} & \sigma\{Y_n,Y_2\} & \cdots & \sigma^2\{Y_n\} \end{bmatrix}$$
Main diagonal values are the variances and off-diagonal values are the covariances.
Covariance Matrix of e
$$\text{Cov}\{\varepsilon\} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}_{n \times n} = \sigma^2 I$$
Independent errors mean that the covariance between any two different error terms is zero. A common variance means the main diagonal values are all equal.
Covariance Matrix of Y
$$\text{Cov}\{Y\} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}_{n \times n} = \sigma^2 I$$
Distributional Assumptions in Matrix Form
• e ~ N(0, σ2I)
• I is an n × n identity matrix
• Ones in the diagonal elements specify that the variance of each ei is 1 × σ2
• Zeros in the off-diagonal elements specify that the covariance between different ei is zero
• This implies that the correlations are zero (a simulation sketch follows below)
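A small simulation can make the assumption e ~ N(0, σ2I) concrete; this sketch (with arbitrary choices of n, σ, and the number of replications) draws many error vectors and checks that their sample covariance matrix is close to σ2I.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 4, 2.0, 200_000
E = rng.normal(loc=0.0, scale=sigma, size=(reps, n))  # each row is one error vector e
# Sample covariance across replications: approximately sigma^2 I
# (about 4.0 on the diagonal, about 0 off the diagonal)
print(np.cov(E, rowvar=False).round(2))
```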
Least Squares
• We want to minimize (Y − Xβ)′(Y − Xβ), the sum of the squared deviations
• We take the derivative with respect to the (vector) β
• This is like minimizing a quadratic function
• Recall the function Σ(Yi − β0 − β1Xi)2 we minimized using the scalar form
Least Squares
• The derivative is 2(Y − Xβ)′ times the derivative of (Y − Xβ) with respect to β
• In other words, –2X′(Y − Xβ)
• We set this equal to 0 (a vector of zeros)
• So, –2X′(Y − Xβ) = 0
• Or, X′Y = X′Xβ (the normal equations)
Normal Equations
• X′Y = (X′X)β
• Solving for β gives the least squares solution b = (b0, b1)′
• b = (X′X)–1(X′Y), as sketched in code below
• See KNNL p 199 for details
• This same matrix approach works for multiple regression!
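A minimal sketch of the normal equations in code (the data values are hypothetical, not from KNNL): solving the linear system X′Xb = X′Y is numerically preferable to forming the inverse explicitly.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])      # hypothetical predictors
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])   # hypothetical responses
X = np.column_stack([np.ones(len(x)), x])

b = np.linalg.solve(X.T @ X, X.T @ Y)          # b = (b0, b1)
print(b)
# np.linalg.lstsq(X, Y, rcond=None)[0] returns the same solution more stably.
```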
Fitted Values
$$\hat{Y} = Xb: \quad \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}$$
Hat Matrix
$$\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY, \qquad \text{where } H = X(X'X)^{-1}X'$$
We’ll use this matrix when assessing diagnostics in multiple regression.
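To see the hat matrix in action, this sketch (same hypothetical data as the earlier sketches) forms H = X(X′X)⁻¹X′ and confirms that HY reproduces the fitted values Xb.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])
X = np.column_stack([np.ones(len(x)), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix, n x n
b = np.linalg.solve(X.T @ X, X.T @ Y)      # least squares estimates
print(np.allclose(H @ Y, X @ b))           # True: Y-hat = HY = Xb
```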
Estimated Covariance Matrix of b
• The estimator b is a linear combination of the elements of Y
• These estimates are Normal if Y is Normal
• These estimates will be approximately Normal in general
A Useful Multivariate Theorem
• U ~ N(μ, Σ), a multivariate Normal vector
• V = c + DU, a linear transformation of U
c is a vector and D is a matrix
• Then V ~ N(c + Dμ, DΣD′)
Application to b
• b = (X′X)–1(X′Y) = ((X′X)–1X′)Y
• Since Y ~ N(Xβ, σ2I), the theorem says the vector b is Normally distributed with mean (X′X)–1X′Xβ = β and covariance σ2((X′X)–1X′)I((X′X)–1X′)′ = σ2(X′X)–1 (illustrated numerically below)
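In practice σ2 is unknown, so the covariance matrix of b is estimated by substituting MSE = SSE/(n − 2); here is a sketch with the same hypothetical data as above.

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 12.0, 15.0])
Y = np.array([33.0, 42.0, 45.0, 51.0, 53.0])
X = np.column_stack([np.ones(len(x)), x])

b = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ b                              # residuals
mse = resid @ resid / (len(Y) - 2)             # estimate of sigma^2
cov_b = mse * np.linalg.inv(X.T @ X)           # estimated covariance matrix of b
print(np.sqrt(np.diag(cov_b)))                 # standard errors of b0 and b1
```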
Background Reading
• We will use this framework to do multiple regression where we have more than one explanatory variable
• Adding another explanatory variable corresponds to adding another column to the design matrix
• See Chapter 6