the ols equations give a nice, clear intuitive meaning about the influence of the variable x on the...
![Page 1: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/1.jpg)
So

$$\hat{b}_1 = \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}, \qquad \hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X}$$

are how the computer determines the size of the slope and the intercept respectively in an OLS regression
The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since they show that:
i) the greater the covariance between X and Y, the larger the (absolute value of) the slope
ii) the smaller the variance of X, the larger the (absolute value of) the slope
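The two formulas can be tried out on a few numbers. The following is a minimal sketch, assuming NumPy and a small made-up data set; the helper name `ols_fit` is hypothetical and simply applies the covariance and variance formulas for the slope and intercept.

```python
import numpy as np

def ols_fit(x, y):
    """Return (b0, b1) from the covariance/variance formulas."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Moments computed with divisor N; the N cancels in the ratio
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    var_x = np.mean((x - x.mean()) ** 2)
    b1 = cov_xy / var_x               # slope
    b0 = y.mean() - b1 * x.mean()     # intercept
    return b0, b1

# Made-up data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b0, b1 = ols_fit(x, y)
print("intercept:", b0, "slope:", b1)

# Stretching X by a factor of 2 doubles Cov(X, Y) but quadruples Var(X),
# so the variance effect dominates and the slope halves
b0_wide, b1_wide = ols_fit(2 * x, y)
print("slope with stretched X:", b1_wide)
```

The slope and intercept agree with what a standard least-squares routine such as `np.polyfit(x, y, 1)` returns for the same data.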
![Page 2: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/2.jpg)
It is equally important to be able to interpret the effect of an estimated regression coefficient
Given that OLS essentially passes a straight line through the data, then given

$$\hat{Y} = \hat{b}_0 + \hat{b}_1 X$$

$$\frac{d\hat{Y}}{dX} = \hat{b}_1$$
So the OLS estimate of the slope will give an estimate of the unit change in the dependent variable y following a unit change in the level of the explanatory variable
(so you need to be aware of the units of measurement of your variables in order to be able to interpret what the OLS coefficient is telling you)
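To see why the units matter, here is a small sketch, assuming NumPy and entirely made-up numbers (schooling in years, hourly pay in pounds); the helper `slope` is hypothetical. Rescaling either variable rescales the estimated slope, even though the underlying relationship is unchanged.

```python
import numpy as np

def slope(x, y):
    """OLS slope: Cov(X, Y) / Var(X), both computed with divisor N."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)

# Made-up illustration: schooling in years, hourly pay in pounds
years = np.array([10.0, 11.0, 12.0, 13.0, 16.0, 18.0])
pay_pounds = np.array([6.5, 7.0, 8.2, 8.8, 11.5, 13.0])

b1_years = slope(years, pay_pounds)         # pounds per extra year
b1_months = slope(years * 12, pay_pounds)   # pounds per extra month
b1_pence = slope(years, pay_pounds * 100)   # pence per extra year

print(b1_years, b1_months, b1_pence)
```

Measuring X in months divides the slope by 12, and measuring Y in pence multiplies it by 100, so a coefficient can only be read once the units of both variables are known.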
![Page 3: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/3.jpg)
PROPERTIES OF OLS
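Using the fact that for any individual observation, i, the OLS residual is the difference between the actual and predicted value

$$\hat{u}_i = Y_i - \hat{Y}_i$$

Sub. in $\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$, so that

$$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{b}_0 - \hat{b}_1 X_i$$

Summing over all N observations in the data set

$$\sum_i \hat{u}_i = \sum_i Y_i - N\hat{b}_0 - \hat{b}_1 \sum_i X_i$$

and dividing by N

$$\frac{1}{N}\sum_i \hat{u}_i = \frac{1}{N}\sum_i Y_i - \hat{b}_0 - \hat{b}_1 \frac{1}{N}\sum_i X_i$$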
![Page 4: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/4.jpg)
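Since the sum of any series divided by the sample size gives the mean, can write

$$\frac{1}{N}\sum_i \hat{u}_i = \frac{1}{N}\sum_i Y_i - \hat{b}_0 - \hat{b}_1 \frac{1}{N}\sum_i X_i$$

as

$$\bar{\hat{u}} = \bar{Y} - \hat{b}_0 - \hat{b}_1\bar{X}$$

and since $\hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X}$

$$\bar{\hat{u}} = \bar{Y} - (\bar{Y} - \hat{b}_1\bar{X}) - \hat{b}_1\bar{X} = \bar{Y} - \bar{Y} + \hat{b}_1\bar{X} - \hat{b}_1\bar{X} = 0$$

So the mean value of the OLS residuals is zero
(as any residual should be, since random and unpredictable by definition)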
![Page 5: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/5.jpg)
The 2nd useful property of OLS is that

$$\bar{\hat{Y}} = \bar{Y}$$

the mean of the OLS predicted values equals the mean of the actual values in the data
(so OLS predicts average behaviour in the data set – another useful property)
![Page 7: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/7.jpg)
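Proof:

$$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{b}_0 - \hat{b}_1 X_i$$

summing

$$\sum_i \hat{u}_i = \sum_i Y_i - \sum_i \hat{Y}_i$$

Dividing by N

$$\frac{1}{N}\sum_i \hat{u}_i = \frac{1}{N}\sum_i Y_i - \frac{1}{N}\sum_i \hat{Y}_i$$

$$\bar{\hat{u}} = \bar{Y} - \bar{\hat{Y}}$$

We know from above that $\bar{\hat{u}} = 0$, so $\bar{Y} = \bar{\hat{Y}}$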
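Both properties (zero-mean residuals, and mean prediction equal to mean outcome) can be checked numerically. This is a sketch only, assuming NumPy and simulated data with arbitrary made-up parameters; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
x = rng.normal(10.0, 2.0, size=200)
y = 3.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)

# OLS slope and intercept from the covariance/variance formulas
b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x     # predicted values
u_hat = y - y_hat       # residuals

print("mean residual:", u_hat.mean())   # zero up to rounding error
print("mean fitted vs mean actual:", y_hat.mean(), y.mean())
```

The mean residual comes out as a number of the order of floating-point rounding error rather than exactly 0, which is the numerical counterpart of the algebraic result.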
![Page 8: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/8.jpg)
Story so far……
• Looked at idea behind OLS – fit a straight line through the data that gives the “best fit”
• Do this by choosing a value for the intercept and the slope of the straight line that will minimise the sum of squared residuals
• OLS formula to do this given by

$$\hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X}, \qquad \hat{b}_1 = \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}$$
![Page 9: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/9.jpg)
Things to do in Lecture 3
• Useful algebra results underlying OLS
• Invent a summary measure of how well regression fits the data ($R^2$)
• Look at contribution of individual variables in explaining outcomes (Bias, Efficiency)
Useful Algebraic aspects of OLS
OLS predicts average behaviour
![Page 12: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/12.jpg)
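The 3rd useful result is that the covariance between the fitted values of Y and the residuals must be zero.

$$\operatorname{Cov}(\hat{Y},\hat{u}) = 0$$

Proof (using the facts that the covariance of any variable with a constant is zero and that Cov(X,X) = Var(X)):

$$
\begin{aligned}
\operatorname{Cov}(\hat{Y},\hat{u}) &= \operatorname{Cov}([\hat{b}_0+\hat{b}_1X],\hat{u}) = \operatorname{Cov}(\hat{b}_0,\hat{u}) + \operatorname{Cov}(\hat{b}_1X,\hat{u}) \\
&= 0 + \hat{b}_1\operatorname{Cov}(X,\hat{u}) = \hat{b}_1\operatorname{Cov}(X,[Y-\hat{b}_0-\hat{b}_1X]) \\
&= \hat{b}_1\left[\operatorname{Cov}(X,Y) - \operatorname{Cov}(X,\hat{b}_0) - \operatorname{Cov}(X,\hat{b}_1X)\right] \\
&= \hat{b}_1\left[\operatorname{Cov}(X,Y) - \hat{b}_1\operatorname{Cov}(X,X)\right] \\
&= \hat{b}_1\left[\operatorname{Cov}(X,Y) - \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}\operatorname{Var}(X)\right] = 0
\end{aligned}
$$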
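A quick numerical check of this third result, as a sketch assuming NumPy and simulated data with made-up parameters (the helper `cov` is hypothetical and uses divisor N). It also prints Cov(X, û), the intermediate term that the proof shows to be zero.

```python
import numpy as np

def cov(a, b):
    """Covariance with divisor N."""
    return np.mean((a - a.mean()) * (b - b.mean()))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=500)
y = 1.0 - 2.0 * x + rng.normal(0.0, 3.0, size=500)

b1 = cov(x, y) / cov(x, x)       # Cov(X, X) is Var(X)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x              # fitted values
u_hat = y - y_hat                # residuals

print("Cov(X, u_hat)     =", cov(x, u_hat))
print("Cov(Y_hat, u_hat) =", cov(y_hat, u_hat))
```

Both covariances are zero up to rounding error, whatever seed or parameters are used, because the result is a property of the OLS formulas and not of the particular data.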
![Page 13: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/13.jpg)
GOODNESS OF FIT
Now we know how to summarise the relationships in the data using the OLS method, we next need a summary measure of “how well” the estimated OLS line fits the data
Think of the dispersion of all possible Y values (the variation in Y) being represented by a circle, and similarly the dispersion in the range of X values
[Figure: two circles, one labelled Y and one labelled X]
![Page 14: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/14.jpg)
The more the circles overlap the more the variation in the X data explains the variation in Y
[Figure: circles Y and X barely overlapping] Little overlap in values, so X does not explain much of the variation in Y
[Figure: circles Y and X largely overlapping] Large overlap in values, so the X variable explains much of the variation in Y
![Page 15: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/15.jpg)
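To derive a statistical measure which does much the same thing remember that

$$\hat{u}_i = Y_i - \hat{Y}_i \quad\Rightarrow\quad Y_i = \hat{Y}_i + \hat{u}_i$$

Using the rules on covariances (see problem set 0) we know that

$$\operatorname{Var}(Y) = \operatorname{Var}(\hat{Y}+\hat{u}) = \operatorname{Var}(\hat{Y}) + \operatorname{Var}(\hat{u}) + 2\operatorname{Cov}(\hat{Y},\hat{u})$$

and since $\operatorname{Cov}(\hat{Y},\hat{u}) = 0$

$$\operatorname{Var}(Y) = \operatorname{Var}(\hat{Y}) + \operatorname{Var}(\hat{u}) \tag{1}$$

So the variation in the variable of interest, Var(Y), is explained by either the variation in the variables included in the OLS model, $\operatorname{Var}(\hat{Y})$, or by variation in the residual, $\operatorname{Var}(\hat{u})$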
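The split of Var(Y) into a fitted part and a residual part can be verified on simulated data. A minimal sketch, assuming NumPy; the data-generating numbers are arbitrary and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=300)
y = 2.0 + 1.5 * x + rng.normal(0.0, 2.0, size=300)

# OLS fit from the covariance/variance formulas
b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

var_y = np.var(y)          # total variation in Y
var_fit = np.var(y_hat)    # variation explained by the model
var_res = np.var(u_hat)    # variation left in the residual

print("Var(Y)                  =", var_y)
print("Var(Y_hat) + Var(u_hat) =", var_fit + var_res)
```

The two printed numbers agree because the cross term 2Cov(Ŷ, û) vanishes for an OLS fit.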
![Page 16: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/16.jpg)
So we use the ratio

$$R^2 = \frac{\operatorname{Var}(\hat{Y})}{\operatorname{Var}(Y)}$$

as a measure of how well the model fits the data. ($R^2$ is also known as the coefficient of determination)
So $R^2$ measures the proportion of variation in the dependent variable explained by the model.
If the model explains all the variation in Y then the ratio equals 1
So the closer the ratio is to one the better the fit.
![Page 17: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/17.jpg)
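It is more common however to use one further algebraic adjustment.
Given (1) says that

$$\operatorname{Var}(Y) = \operatorname{Var}(\hat{Y}) + \operatorname{Var}(\hat{u})$$

Can write this as

$$\frac{1}{N}\sum (Y - \bar{Y})^2 = \frac{1}{N}\sum (\hat{Y} - \bar{\hat{Y}})^2 + \frac{1}{N}\sum (\hat{u} - \bar{\hat{u}})^2$$

The 1/N is common to both sides, so can cancel out and using the results that $\bar{\hat{Y}} = \bar{Y}$ and $\bar{\hat{u}} = 0$

Then we have

$$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum \hat{u}^2$$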
![Page 18: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/18.jpg)
The left side of the equation is the sum of the squared deviations of Y about its sample mean, $\sum (Y - \bar{Y})^2$.
This is called the Total Sum of Squares.
The right hand side consists of the sum of squared deviations of the predictions around the sample mean, $\sum (\hat{Y} - \bar{Y})^2$
(the Explained Sum of Squares)
and the Residual Sum of Squares, $\sum \hat{u}^2$

$$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum \hat{u}^2$$

$$TSS = ESS + RSS$$

From this can have an alternative definition of the goodness of fit

$$R^2 = \frac{ESS}{TSS} = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2} = 1 - \frac{RSS}{TSS}$$
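The sums of squares and both versions of the goodness-of-fit ratio can be computed directly. The following is a sketch under stated assumptions: NumPy, a made-up six-point data set, and a hypothetical helper `r_squared` for a regression with a single X variable.

```python
import numpy as np

def r_squared(x, y):
    """Return (TSS, ESS, RSS, R^2) for a one-regressor OLS fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
    b0 = y.mean() - b1 * x.mean()
    y_hat = b0 + b1 * x
    tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
    ess = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
    rss = np.sum((y - y_hat) ** 2)          # residual sum of squares
    return tss, ess, rss, ess / tss

# Made-up data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.8])

tss, ess, rss, r2 = r_squared(x, y)
print("TSS:", tss, "ESS + RSS:", ess + rss)
print("ESS/TSS:", r2, "1 - RSS/TSS:", 1 - rss / tss)
```

The two ways of writing the ratio give the same number because TSS = ESS + RSS holds exactly for an OLS fit with an intercept.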
![Page 19: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/19.jpg)
Can see from above that it must hold that $0 \le R^2 \le 1$
when ESS = 0, then $R^2 = 0$ (and the model explains none of the variation in the dependent variable)
when ESS = TSS, then $R^2 = 1$ (and the model explains all of the variation in the dependent variable)
In general the $R^2$ lies between these two extremes.
You will find that for cross-section data (i.e. samples of individuals, firms etc.) the $R^2$ are typically in the region of 0.2
for time-series data (i.e. samples of aggregate (whole economy) data measured at different points in time) the $R^2$ are typically in the region of 0.9
![Page 20: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/20.jpg)
Why?
Cross-section data sets are typically bigger and have a lot more variation, while time-series data sets are smaller and have less variation in the dependent variable (it is easier to fit a straight line if the data don’t vary much)
While we would like the $R^2$ to be as high as possible, you can only compare $R^2$ across models with the SAME dependent (Y) variable, so you can’t say one model is a better fit than another if the dependent variables are different
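As an illustration of why the dependent variable has to be the same, the sketch below (assuming NumPy, simulated made-up data and a hypothetical helper `r2`) fits the same X once to Y and once to log(Y). Each ratio is a share of a different total variation, Var(Y) in one case and Var(log Y) in the other, so the two numbers do not rank the models.

```python
import numpy as np

def r2(x, y):
    """R^2 = 1 - RSS/TSS for a one-regressor OLS fit."""
    b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
    b0 = y.mean() - b1 * x.mean()
    u_hat = y - (b0 + b1 * x)
    return 1 - np.sum(u_hat ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, size=400)
# Made-up process in which Y is always positive, so log(Y) exists
y = np.exp(0.3 * x + rng.normal(0.0, 0.5, size=400))

r2_level = r2(x, y)          # dependent variable: Y
r2_log = r2(x, np.log(y))    # dependent variable: log(Y)

print("R^2 with Y as dependent variable:     ", r2_level)
print("R^2 with log(Y) as dependent variable:", r2_log)
```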
![Page 21: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/21.jpg)
So the $R^2$ measures the proportion of variance in the dependent variable explained by the model.
Another useful interpretation of the $R^2$ is that it equals the square of the correlation coefficient between the actual and predicted values of Y
Proof: We know the formula for the correlation coefficient

$$r_{Y,\hat{Y}} = \frac{\operatorname{Cov}(Y,\hat{Y})}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}}$$

Sub. in for $Y = \hat{Y} + \hat{u}$
(actual value = predicted value plus residual)

$$r_{Y,\hat{Y}} = \frac{\operatorname{Cov}([\hat{Y}+\hat{u}],\hat{Y})}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}}$$
![Page 22: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/22.jpg)
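Expand the covariance terms

$$r_{Y,\hat{Y}} = \frac{\operatorname{Cov}([\hat{Y}+\hat{u}],\hat{Y})}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}} = \frac{\operatorname{Cov}(\hat{Y},\hat{Y}) + \operatorname{Cov}(\hat{u},\hat{Y})}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}} = \frac{\operatorname{Var}(\hat{Y})}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}}$$

(since already proved $\operatorname{Cov}(\hat{Y},\hat{u}) = 0$)
And can always write any variance term as the square root of the product of the term with itself

$$r_{Y,\hat{Y}} = \frac{\sqrt{\operatorname{Var}(\hat{Y})\operatorname{Var}(\hat{Y})}}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}}$$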
![Page 23: The OLS equations give a nice, clear intuitive meaning about the influence of the variable X on the size of the slope, since it shows that: So are how](https://reader036.vdocuments.net/reader036/viewer/2022081603/56649d9f5503460f94a8a067/html5/thumbnails/23.jpg)
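Cancelling terms

$$r_{Y,\hat{Y}} = \frac{\sqrt{\operatorname{Var}(\hat{Y})\operatorname{Var}(\hat{Y})}}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(\hat{Y})}} = \sqrt{\frac{\operatorname{Var}(\hat{Y})}{\operatorname{Var}(Y)}}$$

so

$$r_{Y,\hat{Y}} = \sqrt{R^2}$$

Thus the correlation coefficient is the square root of $R^2$.
E.g. $R^2 = 0.25$ implies the correlation coefficient between Y and the predicted values = √0.25 = 0.5 (with a single X variable, the correlation between Y and X has the same absolute value and takes the sign of the slope)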
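The link between the correlation coefficient and the goodness-of-fit ratio can be checked on simulated data. A sketch, assuming NumPy and a made-up process with a deliberately negative slope, so that the difference between the correlation of Y with its predicted values and the correlation of Y with X is visible.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=250)
y = 5.0 - 0.8 * x + rng.normal(0.0, 1.0, size=250)   # negative slope on purpose

# OLS fit from the covariance/variance formulas
b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

r2 = np.var(y_hat) / np.var(y)             # ratio of variances
r_y_yhat = np.corrcoef(y, y_hat)[0, 1]     # correlation of actual and predicted Y
r_y_x = np.corrcoef(y, x)[0, 1]            # correlation of Y and X

print("R^2:", r2, "squared corr(Y, Y_hat):", r_y_yhat ** 2)
print("corr(Y, Y_hat):", r_y_yhat, "corr(Y, X):", r_y_x)
```

Here corr(Y, Ŷ) is positive and equals the square root of the ratio, while corr(Y, X) has the same size but a negative sign, matching the sign of the estimated slope.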