terminologia econometrica
7/30/2019 TERMINOLOGIA ECONOMETRICA
Lecture 16. Heteroskedasticity
In the CLR model
$Y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_K X_{iK} + u_i, \quad i = 1, \ldots, n$
one of the assumptions was
Assumption 3 (Homoskedasticity): All $u_i$'s have the same variance, i.e. for $i = 1, \ldots, n$
$\mathrm{Var}(u_i) = E(u_i^2) = \sigma^2$
When is this a bad assumption?
If omitted variables are not correlated with the included variables (assumption 1), but have a different order of magnitude for (groups of) observations.
1. Cross-sectional data on units of different size, e.g. states, cities. Omitted variables may be larger for more populous states or cities.
2. Cross-sectional data on units at different points in time. Omitted variables may be more important at some points in time.
3. Cross-sectional data on units that face different restrictions on their behavior. For instance, high income individuals have more discretion in their spending.
Example of second case: Relation between income and experience.
Data on 222 university professors for 7 schools (UC Berkeley, UCLA, UCSD, Illinois, Stanford, Michigan, Virginia)
See graphs
Note
Variation in income (in 1000$) increases with work experience
Variation in relative income first increases and then decreases
The first observation is consistent with income being higher when work experience is greater
[Figure: Salary (1000$) and work experience (years since Ph.D.); vertical axis SALARY, horizontal axis YEARS]
[Figure: Log(Salary) and work experience (years since Ph.D.); vertical axis LNSALARY, horizontal axis YEARS]
Note that the log transformation reduces the variation in income with experience. Why?
If variation in income increases proportionally with the income level, then variation in relative income does not change with the income level.
Example
Income at work experience 4 years: 30, 40, 60, with absolute differences 10, 30, relative differences 33%, 100%, and log differences 0.29, 0.69 (all relative to the lowest)
Income at work experience 8 years: 90, 120, 180, with absolute differences 30, 90, relative differences 33%, 100%, and log differences 0.29, 0.69 (all relative to the lowest)
Often the variation is constant after the log transformation (but not in the salary example).
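The numbers in this example can be reproduced with a short sketch. The function name `differences` is an illustrative choice, not something from the lecture; the incomes are the ones given above.

```python
import math

def differences(incomes):
    """Absolute, relative and log differences of each income relative to the lowest."""
    base = min(incomes)
    rest = sorted(incomes)[1:]
    return {
        "absolute": [v - base for v in rest],
        "relative": [(v - base) / base for v in rest],
        "log": [math.log(v) - math.log(base) for v in rest],
    }

four_years = differences([30, 40, 60])
eight_years = differences([90, 120, 180])

for label, d in (("4 years", four_years), ("8 years", eight_years)):
    print(label, d["absolute"],
          [round(r, 2) for r in d["relative"]],
          [round(l, 2) for l in d["log"]])
```

The absolute differences triple between the two groups, while the relative and log differences are identical, which is why taking logs removes variation that is proportional to the income level.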
Is there heteroskedasticity if we estimate the model
$Y = \beta_1 + \beta_2 X + \beta_3 X^2 + u$
with
$Y$ = log salary
$X$ = work experience?
See output and graph
Interpretation of regression coefficients
Nature of relation: Maximum at 35 years of work experience
Heteroskedasticity: Plot squared OLS residuals against X
All examples are for cross-sections, but heteroskedasticity is also important in time-series data, e.g. volatility in the stock market. This is like case 2, but the omitted variable is news/information.
Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              3.809365     0.041338      92.15104   0.0000
YEARS          0.043853     0.004829      9.081645   0.0000
YEARS2        -0.000627     0.000121     -5.190657   0.0000

R-squared             0.536179   Mean dependent var      4.325410
Adjusted R-squared    0.531943   S.D. dependent var      0.302511
S.E. of regression    0.206962   Akaike info criterion  -0.299140
Sum squared resid     9.380504   Schwarz criterion      -0.253158
Log likelihood        36.20452   F-statistic             126.5823
Durbin-Watson stat    1.434005   Prob(F-statistic)       0.000000
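The stated maximum at about 35 years of work experience follows from the reported coefficients: a quadratic in X peaks where its derivative is zero, at X = -b2 / (2 b3). The sketch below only redoes that arithmetic; the names `b2`, `b3` and `marginal_effect` are illustrative.

```python
# Coefficients on YEARS and YEARS2 from the LNSALARY regression output
b2 = 0.043853
b3 = -0.000627

def marginal_effect(years):
    """Derivative of log salary with respect to experience: b2 + 2*b3*years."""
    return b2 + 2 * b3 * years

# The quadratic peaks where the marginal effect is zero
peak_years = -b2 / (2 * b3)

print(round(peak_years, 1))             # experience at which predicted log salary peaks
print(round(marginal_effect(0), 4))     # approximate proportional salary gain per year at 0 years
print(round(marginal_effect(20), 4))    # the gain is smaller after 20 years
```

Because the dependent variable is in logs, the marginal effect reads as an approximate proportional change in salary per additional year of experience.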
[Figure: Squared OLS residuals and work experience; vertical axis RESID2, horizontal axis YEARS]
Effect of heteroskedasticity on OLS estimators and tests
OLS estimators are unbiased (only assumptions 1 and 2 are needed)
The usual formula for the sampling variance is wrong (assumption 3 was used in its derivation)
The OLS estimators are not Best Linear Unbiased (BLU), i.e. better estimators may exist
The usual t- and F-tests cannot be used
Often the standard errors reported by the regression program are too small, i.e. estimates of regression coefficients seem more significant than they really are. This is the case in the simple regression model if the error variance increases with X.
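The last claim can be checked without simulation in the simple regression model. The sketch below compares the true sampling variance of the OLS slope with the expected value of the usual formula, for a fixed design and known error variances. The design (X = 1, ..., 20) and the variance pattern (proportional to X squared) are assumptions chosen for illustration, not values from the lecture.

```python
def slope_variances(x, sigma2):
    """True variance of the OLS slope vs. the expectation of the usual formula s^2 / Sxx.

    x      : regressor values (treated as fixed)
    sigma2 : error variance of each observation
    """
    n = len(x)
    xbar = sum(x) / n
    d = [xi - xbar for xi in x]
    sxx = sum(di ** 2 for di in d)
    # True variance: sum d_i^2 sigma_i^2 / Sxx^2
    true_var = sum(di ** 2 * s for di, s in zip(d, sigma2)) / sxx ** 2
    # E[s^2] = sum sigma_i^2 (1 - h_ii) / (n - 2), with leverage h_ii = 1/n + d_i^2 / Sxx
    leverage = [1 / n + di ** 2 / sxx for di in d]
    expected_s2 = sum(s * (1 - h) for s, h in zip(sigma2, leverage)) / (n - 2)
    usual_var = expected_s2 / sxx
    return true_var, usual_var

x = list(range(1, 21))
hetero_true, hetero_usual = slope_variances(x, [xi ** 2 for xi in x])  # variance rises with X
homo_true, homo_usual = slope_variances(x, [1.0] * len(x))             # constant variance

print(hetero_true, hetero_usual)   # usual formula understates the true variance
print(homo_true, homo_usual)       # the two agree under homoskedasticity
```

With constant variances the two quantities coincide; when the variance grows with X, the usual formula is on average too small, so reported standard errors understate the uncertainty.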
How do we detect heteroskedasticity?
Plot of squared OLS residuals against regressors
Tests
For a test we must specify a model for the heteroskedasticity
$\mathrm{Var}(u_i) = \sigma_i^2$
We cannot estimate these variances as parameters. Why not?
Models
$\sigma_i^2 = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Breusch-Pagan)
$\sigma_i = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Glejser)
$\ln \sigma_i^2 = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Harvey-Godfrey)
The $Z$s may be regressors, or squares or products of regressors
Choice of $Z$s
If there is heteroskedasticity because of size differences, choose size, e.g. population
If there is no clear choice, choose
$X_2, \ldots, X_K, \; X_2^2, \ldots, X_K^2, \; X_2 X_3, \ldots, X_{K-1} X_K$
i.e. the regressors, their squares and cross-products. The BP test with this choice is the White test
Test
1. Estimate by OLS and obtain the OLS residuals $e_i$, $i = 1, \ldots, n$
2. Estimate a linear regression of $e_i^2$ (BP), $|e_i|$ (G), or $\ln e_i^2$ (HG) on a constant and $Z_{i2}, \ldots, Z_{iL}$, and compute the $R^2$ of this regression
3. Compute the test statistic $LM = nR^2$ for the hypothesis $H_0: \alpha_2 = 0, \ldots, \alpha_L = 0$, where the $\alpha$s are the coefficients on the $Z$s. If $H_0$ is true (homoskedastic errors), then $LM$ has a $\chi^2$ distribution with $L - 1$ degrees of freedom. Use this to obtain the critical value.
This test is called the Lagrange Multiplier (LM) test for heteroskedasticity of a particular form.
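The three steps can be sketched in plain Python. This is a minimal illustration, not the routine used to produce the lecture's output: the helper names `ols` and `lm_statistic` are made up here, and the data are artificial, with errors whose size grows with x by construction.

```python
import math

def ols(y, X):
    """OLS via the normal equations; each row of X already includes the constant."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, resid

def lm_statistic(y, X, Z, form="BP"):
    """LM = n * R^2 from regressing a transform of the OLS residuals on Z."""
    _, e = ols(y, X)                                   # step 1: OLS residuals
    transform = {"BP": lambda v: v * v,                # e^2
                 "G": abs,                             # |e|
                 "HG": lambda v: math.log(v * v)}[form]  # ln e^2 (needs nonzero residuals)
    dep = [transform(ei) for ei in e]
    _, v = ols(dep, Z)                                 # step 2: auxiliary regression
    mean = sum(dep) / len(dep)
    r2 = 1 - sum(vi * vi for vi in v) / sum((di - mean) ** 2 for di in dep)
    return len(y) * r2                                 # step 3: LM = n R^2

# Artificial data: the error alternates in sign and grows in size with x
x = list(range(1, 41))
y = [1 + 2 * xi + (-1) ** xi * 0.1 * xi for xi in x]
X = [[1.0, float(xi)] for xi in x]

lm = lm_statistic(y, X, Z=X, form="BP")
print(lm)   # compare with 3.84, the 5% critical value of chi-squared with 1 df
```

Here Z contains a constant and x, so L = 2 and the statistic is compared with a chi-squared distribution with 1 degree of freedom.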
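Example: BP test with $Z_2 = X_2$ (years) and $Z_3 = X_3$ (years squared).
$R^2 = 0.0747$, $LM = nR^2 = 222 \times 0.0747 \approx 16.59$ (using the unrounded $R^2 = 0.074714$)
The critical value for 5% and a chi-squared distribution with 2 df is 5.99 (see book). Since 16.59 > 5.99, homoskedasticity is rejected.
White test: Add $Z_4 = X_2 X_3$, which is years cubed
$R^2 = 0.0810$, $LM = nR^2 = 222 \times 0.0810 = 17.98$
The critical value for 5% and a chi-squared distribution with 3 df is 7.81. Since 17.98 > 7.81, homoskedasticity is again rejected.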
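As a check on this arithmetic, the sketch below recomputes both LM statistics from the unrounded R-squared values in the two RESID2 regression outputs and attaches p-values. The helper `chi2_sf` is an illustrative name; it uses closed-form chi-squared tail probabilities that hold only for 2 and 3 degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, for df = 2 or 3 only."""
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("closed form implemented only for df = 2 or 3")

n = 222
r2_bp = 0.074714      # R-squared of RESID2 on YEARS, YEARS2
r2_white = 0.081048   # R-squared of RESID2 on YEARS, YEARS2, YEARS3

lm_bp = n * r2_bp
lm_white = n * r2_white

print(round(lm_bp, 2), chi2_sf(lm_bp, 2))        # statistic and p-value, 2 df
print(round(lm_white, 2), chi2_sf(lm_white, 3))  # statistic and p-value, 3 df
print(chi2_sf(5.99, 2), chi2_sf(7.81, 3))        # both close to 0.05
```

Both p-values are far below 0.05, in line with the comparison against the critical values 5.99 and 7.81.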
Dependent Variable: RESID2
Method: Least Squares
Date: 11/07/01 Time: 14:13
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             -0.011086     0.013473     -0.822858   0.4115
YEARS          0.006084     0.001574      3.865764   0.0001
YEARS2        -0.000129     3.94E-05     -3.270700   0.0012

R-squared             0.074714   Mean dependent var      0.042255
Adjusted R-squared    0.066264   S.D. dependent var      0.069804
S.E. of regression    0.067451   Akaike info criterion  -2.541402
Sum squared resid     0.996378   Schwarz criterion      -2.495420
Log likelihood        285.0957   F-statistic             8.841813
Durbin-Watson stat    1.707896   Prob(F-statistic)       0.000203
Dependent Variable: RESID2
Method: Least Squares
Date: 11/07/01 Time: 14:25
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             -0.027664     0.019079     -1.449980   0.1485
YEARS          0.010395     0.003853      2.698166   0.0075
YEARS2        -0.000380     0.000209     -1.821374   0.0699
YEARS3         3.99E-06     3.25E-06      1.225781   0.2216

R-squared             0.081048   Mean dependent var      0.042255
Adjusted R-squared    0.068402   S.D. dependent var      0.069804
S.E. of regression    0.067374   Akaike info criterion  -2.539262
Sum squared resid     0.989557   Schwarz criterion      -2.477953
Log likelihood        285.8581   F-statistic             6.408914
Durbin-Watson stat    1.694065   Prob(F-statistic)       0.000352
Estimation with heteroskedasticity
1. Use OLS but get correct standard errors
2. Find a better estimation procedure
It is possible to derive the correct standard error of the OLS estimator. The formula does not depend on the model for heteroskedasticity.
These standard errors are called heteroskedasticity-consistent standard errors. Many regression programs have this option.
Example: See output.
Note that the differences are small (the wrong standard errors are here too large).
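For the simple regression model, one common version of the heteroskedasticity-consistent standard error of the slope can be written out directly. The sketch below uses White's basic form (often labelled HC0), sqrt(sum d_i^2 e_i^2) / Sxx with d_i = x_i minus its mean. Whether the regression program behind the lecture's output applies exactly this form or a small-sample correction is not stated in the source, so treat that as an assumption. The data are artificial, with errors that grow with x.

```python
import math

def slope_standard_errors(x, y):
    """Usual and heteroskedasticity-consistent (HC0) standard errors of the OLS slope."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    d = [xi - xbar for xi in x]
    sxx = sum(di * di for di in d)
    slope = sum(di * (yi - ybar) for di, yi in zip(d, y)) / sxx
    intercept = ybar - slope * xbar
    e = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Usual formula: s^2 / Sxx, valid only under homoskedasticity
    usual = math.sqrt(sum(ei * ei for ei in e) / (n - 2) / sxx)
    # HC0: weights each squared residual by its squared deviation d_i^2
    robust = math.sqrt(sum(di * di * ei * ei for di, ei in zip(d, e))) / sxx
    return usual, robust

# Artificial data: the error alternates in sign and grows in size with x
x = list(range(1, 41))
y = [1 + 2 * xi + (-1) ** xi * 0.1 * xi for xi in x]

usual_se, robust_se = slope_standard_errors(x, y)
print(usual_se, robust_se)
```

In this artificial case the robust standard error exceeds the usual one, the opposite of the salary example, which shows that the direction of the error in the usual formula depends on the data.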
Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:49
Sample: 1 222
Included observations: 222
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              3.809365     0.026119      145.8466   0.0000
YEARS          0.043853     0.004361      10.05599   0.0000
YEARS2        -0.000627     0.000118     -5.322369   0.0000

R-squared             0.536179   Mean dependent var      4.325410
Adjusted R-squared    0.531943   S.D. dependent var      0.302511
S.E. of regression    0.206962   Akaike info criterion  -0.299140
Sum squared resid     9.380504   Schwarz criterion      -0.253158
Log likelihood        36.20452   F-statistic             126.5823
Durbin-Watson stat    1.434005   Prob(F-statistic)       0.000000

Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              3.809365     0.041338      92.15104   0.0000
YEARS          0.043853     0.004829      9.081645   0.0000
YEARS2        -0.000627     0.000121     -5.190657   0.0000

R-squared             0.536179   Mean dependent var      4.325410
Adjusted R-squared    0.531943   S.D. dependent var      0.302511
S.E. of regression    0.206962   Akaike info criterion  -0.299140
Sum squared resid     9.380504   Schwarz criterion      -0.253158
Log likelihood        36.20452   F-statistic             126.5823
Durbin-Watson stat    1.434005   Prob(F-statistic)       0.000000