terminologia econometrica

7/30/2019 TERMINOLOGIA ECONOMETRICA


Lecture 16. Heteroskedasticity

In the CLR model

$Y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_K X_{iK} + u_i, \quad i = 1, \ldots, n$

one of the assumptions was

Assumption 3 (Homoskedasticity): All $u_i$'s have the same variance, i.e. for $i = 1, \ldots, n$

$\mathrm{Var}(u_i) = E(u_i^2) = \sigma^2$



When is this a bad assumption?

If omitted variables are not correlated with the included variables (assumption 1), but have a different order of magnitude for (groups of) observations.

Cross-sectional data on units of different size, e.g. states, cities. Omitted variables may be larger for more populous states or cities.

Cross-sectional data on units at different points in time. Omitted variables may be more important at some points in time.

Cross-sectional data on units that face different restrictions on their behavior. For instance, high income individuals have more discretion in their spending.



Example of second case: Relation between income and experience.

Data on 222 university professors from 7 schools (UC Berkeley, UCLA, UCSD, Illinois, Stanford, Michigan, Virginia)

See graphs

Note

Variation in income (in 1000$) increases with work experience

Variation in relative income first increases and then decreases

This is consistent because income is higher with more work experience



[Figure: Salary (1000$) and work experience (years since Ph.D.). Horizontal axis: YEARS (0 to 50); vertical axis: SALARY (0 to 200).]



[Figure: Log(Salary) and work experience (years since Ph.D.). Horizontal axis: YEARS (0 to 50); vertical axis: LNSALARY (3.5 to 5.5).]



Note that the log transformation reduces the variation in income with experience. Why?

If variation in income increases proportionally with income level, then variation in relative income does not change with income level.

Example

Income at work experience 4 years: 30, 40, 60, with absolute differences 10, 30, relative differences 33%, 100% and log differences 0.29, 0.69 (all relative to the lowest)

Income at work experience 8 years: 90, 120, 180, with absolute differences 30, 90, relative differences 33%, 100% and log differences 0.29, 0.69 (all relative to the lowest)

Often the variation is constant after a log transformation (but not in the salary example).
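The arithmetic of the example can be checked with a short Python sketch. The function name `spread` and the variable names are illustrative and not part of the lecture.

```python
import math

def spread(incomes):
    """Absolute, relative and log differences, all relative to the lowest income."""
    base, *rest = sorted(incomes)
    absolute = [y - base for y in rest]
    relative = [(y - base) / base for y in rest]
    log_diff = [math.log(y) - math.log(base) for y in rest]
    return absolute, relative, log_diff

four_years = spread([30, 40, 60])     # incomes at 4 years of experience
eight_years = spread([90, 120, 180])  # incomes at 8 years of experience

for label, (absolute, relative, log_diff) in [("4 years", four_years),
                                              ("8 years", eight_years)]:
    print(label, absolute,
          [round(r, 2) for r in relative],
          [round(d, 2) for d in log_diff])
```

The absolute differences triple between the two groups, while the relative and log differences are the same, which is the point of the example.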



Is there heteroskedasticity if we estimate the following model?

$Y = \beta_1 + \beta_2 X + \beta_3 X^2 + u$

with

$Y$ = log salary

$X$ = work experience

See output and graph

Interpretation of regression coefficients

Nature of relation: Maximum at 35 years of work experience

Heteroskedasticity: Plot squared OLS residuals against $X$

All examples so far are for cross-sections, but heteroskedasticity is also important in time-series data, e.g. volatility in the stock market. This is like case 2, but the omitted variable is news/information.



Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C            3.809365     0.041338     92.15104      0.0000
YEARS        0.043853     0.004829      9.081645     0.0000
YEARS2      -0.000627     0.000121     -5.190657     0.0000

R-squared             0.536179    Mean dependent var       4.325410
Adjusted R-squared    0.531943    S.D. dependent var       0.302511
S.E. of regression    0.206962    Akaike info criterion   -0.299140
Sum squared resid     9.380504    Schwarz criterion       -0.253158
Log likelihood        36.20452    F-statistic              126.5823
Durbin-Watson stat    1.434005    Prob(F-statistic)        0.000000
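The maximum at about 35 years follows from setting the derivative of the fitted quadratic to zero, which gives $X^* = -\hat\beta_2 / (2\hat\beta_3)$. A minimal Python sketch using the coefficients from the output above (the variable and function names are illustrative):

```python
# Coefficients on YEARS and YEARS2 from the regression output
b_years = 0.043853
b_years2 = -0.000627

# The fitted log salary b1 + b2*X + b3*X^2 peaks where b2 + 2*b3*X = 0
turning_point = -b_years / (2 * b_years2)

def marginal_effect(years):
    """Approximate relative change in salary from one more year of experience."""
    return b_years + 2 * b_years2 * years

print(round(turning_point, 1))
print(marginal_effect(0), marginal_effect(20))
```

Because the dependent variable is in logs, `marginal_effect` is read as an approximate proportional change in salary per additional year; it is positive before the turning point and negative after it.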



[Figure: Squared OLS residuals and work experience. Horizontal axis: YEARS (0 to 50); vertical axis: RESID2 (0.0 to 0.8).]



Effect of heteroskedasticity on OLS estimators and tests

OLS estimators are unbiased (only assumptions 1 and 2 are needed)

The usual formula for the sampling variance is wrong (assumption 3 was used in its derivation)

The OLS estimators are not Best Linear Unbiased (BLU), i.e. better estimators may exist

The usual t- and F-tests cannot be used

Often the standard errors reported by the regression program are too small, i.e. estimates of regression coefficients seem more significant than they really are. This is the case in the simple regression model when the error variance increases with $X$.
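The last point can be illustrated with a small simulation. The sketch below is not from the lecture: the sample size, number of replications, true coefficients and the choice of an error standard deviation proportional to $X$ are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 5000

# Regressor held fixed across replications (assumed design)
x = rng.uniform(0.0, 10.0, size=n)
xc = x - x.mean()
sxx = np.sum(xc ** 2)

# Errors whose standard deviation is proportional to x (heteroskedastic)
u = rng.standard_normal((reps, n)) * x
y = 1.0 + 0.5 * x + u  # hypothetical true intercept 1.0 and slope 0.5

# OLS slope and intercept in every replication
slope = (y - y.mean(axis=1, keepdims=True)) @ xc / sxx
intercept = y.mean(axis=1) - slope * x.mean()

# Conventional standard error of the slope: sqrt(s^2 / Sxx)
resid = y - intercept[:, None] - slope[:, None] * x
s2 = np.sum(resid ** 2, axis=1) / (n - 2)
conventional_se = np.sqrt(s2 / sxx)

empirical_sd = slope.std(ddof=1)              # actual sampling variability
mean_conventional_se = conventional_se.mean() # what the program would report

print(slope.mean(), empirical_sd, mean_conventional_se)
```

In this design the average slope estimate stays close to the true value (unbiasedness), while the average conventional standard error falls short of the empirical standard deviation of the slope estimates.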



How do we detect heteroskedasticity?

Plot of squared OLS residuals against regressors

Tests

For a test we must specify a model for the heteroskedasticity

$\mathrm{Var}(u_i) = \sigma_i^2$

We cannot estimate these variances as parameters. Why not?

Models

$\sigma_i^2 = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Breusch-Pagan)

$\sigma_i = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Glejser)

$\ln \sigma_i^2 = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$ (Harvey-Godfrey)

The $Z$s may be regressors, or squares or products of regressors



Choice of $Z$s

If there is heteroskedasticity because of size differences, choose size, e.g. population

If there is no clear choice, choose

$X_2, \ldots, X_K, \; X_2^2, \ldots, X_K^2, \; X_2 X_3, \ldots$

i.e. the regressors, their squares and cross-products. The BP test with this choice is the White test
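Building this set of $Z$s is mechanical. A minimal NumPy sketch follows; the function name `white_regressors` and the random test data are hypothetical and only serve to show the construction.

```python
import itertools
import numpy as np

def white_regressors(X):
    """Build the Z matrix: constant, regressors, squares, cross-products.

    X is an (n, k) array holding the non-constant regressors X_2, ..., X_K.
    """
    n, k = X.shape
    columns = [np.ones(n)]                                  # constant
    columns += [X[:, j] for j in range(k)]                  # levels
    columns += [X[:, j] ** 2 for j in range(k)]             # squares
    columns += [X[:, i] * X[:, j]                           # cross-products
                for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(columns)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))  # two illustrative regressors
Z = white_regressors(X)
print(Z.shape)
```

When one regressor is itself the square of another (as with years and years squared), some of the generated columns coincide and the duplicates have to be dropped before running the auxiliary regression.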



Test

1. Estimate by OLS and obtain the OLS residuals $e_i$, $i = 1, \ldots, n$.

2. Estimate a linear regression of $e_i^2$ (BP), $|e_i|$ (G), or $\ln e_i^2$ (HG) on a constant and $Z_{i2}, \ldots, Z_{iL}$, and compute the $R^2$ of this regression.

3. Compute the test statistic $LM = nR^2$ for the hypothesis that all slope coefficients in the model for the variance are zero, $H_0: \alpha_2 = \cdots = \alpha_L = 0$. If $H_0$ is true (homoskedastic errors), then $LM$ has a $\chi^2$ distribution with $L - 1$ degrees of freedom. Use this to obtain the critical value.

This test is called the Lagrange Multiplier (LM) test for heteroskedasticity of a particular form.
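The three steps can be sketched in Python with NumPy. This is an illustration on simulated data, not the lecture's salary data: the function names, the data-generating process and the seed are assumptions, and the 5% critical value 5.99 for 2 degrees of freedom is hard-coded.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def r_squared(y, X):
    """R-squared of an OLS regression of y on X (X includes a constant)."""
    e = ols_residuals(y, X)
    return 1.0 - np.sum(e ** 2) / np.sum((y - y.mean()) ** 2)

def lm_statistic(y, X, Z, form="BP"):
    """LM = n * R^2 from the auxiliary regression on Z (constant included)."""
    e = ols_residuals(y, X)                      # step 1
    if form == "BP":
        dep = e ** 2
    elif form == "G":
        dep = np.abs(e)
    elif form == "HG":
        dep = np.log(e ** 2)
    else:
        raise ValueError("form must be 'BP', 'G' or 'HG'")
    return len(y) * r_squared(dep, Z)            # steps 2 and 3

# Simulated data with error spread that grows with experience (assumed)
rng = np.random.default_rng(1)
n = 500
years = rng.uniform(0.0, 40.0, size=n)
u = rng.standard_normal(n) * (0.05 + 0.01 * years)
y = 3.8 + 0.04 * years - 0.0006 * years ** 2 + u

X = np.column_stack([np.ones(n), years, years ** 2])
Z = X                                            # Z_2 = years, Z_3 = years^2
lm = lm_statistic(y, X, Z, form="BP")
reject = lm > 5.99                               # chi-squared, L - 1 = 2 df
print(lm, reject)
```

Passing `form="G"` or `form="HG"` changes only the dependent variable of the auxiliary regression; the statistic and its degrees of freedom are computed the same way.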



Example: BP test with $Z_2 = X_2$ (years) and $Z_3 = X_3$ (years squared).

$R^2 = 0.0747$, $LM = nR^2 = 222 \times 0.0747 = 16.59$

The critical value at the 5% level for a chi-squared distribution with 2 df is 5.99 (see book), so homoskedasticity is rejected.

White test: add $Z_4 = X_2 X_3$ (years cubed).

$R^2 = 0.0810$, $LM = nR^2 = 222 \times 0.0810 = 17.98$

The critical value at the 5% level for a chi-squared distribution with 3 df is 7.81, so homoskedasticity is again rejected.
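The two test decisions can be reproduced from the reported numbers. In the sketch below the dictionary layout and names are illustrative; the $R^2$ values and critical values are those given in the example. With the rounded $R^2$ of 0.0747 the first product comes out as 16.58 rather than 16.59, presumably because the reported statistic was computed from the unrounded $R^2$.

```python
n = 222  # number of observations

# (R-squared of auxiliary regression, 5% critical value) as reported
tests = {
    "BP (years, years^2)": (0.0747, 5.99),              # 2 df
    "White (years, years^2, years^3)": (0.0810, 7.81),  # 3 df
}

results = {}
for name, (r2, critical) in tests.items():
    lm = n * r2
    results[name] = (round(lm, 2), lm > critical)
    print(name, round(lm, 2), "reject" if lm > critical else "do not reject")
```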



Dependent Variable: RESID2
Method: Least Squares
Date: 11/07/01 Time: 14:13
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C           -0.011086     0.013473     -0.822858     0.4115
YEARS        0.006084     0.001574      3.865764     0.0001
YEARS2      -0.000129     3.94E-05     -3.270700     0.0012




Dependent Variable: RESID2
Method: Least Squares
Date: 11/07/01 Time: 14:25
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C           -0.027664     0.019079     -1.449980     0.1485
YEARS        0.010395     0.003853      2.698166     0.0075
YEARS2      -0.000380     0.000209     -1.821374     0.0699
YEARS3       3.99E-06     3.25E-06      1.225781     0.2216




Estimation with heteroskedasticity

Use OLS but get correct standard errors

Find a better estimation procedure

It is possible to derive the correct standard error of the OLS estimator. The formula does not depend on the model for heteroskedasticity.

These standard errors are called heteroskedasticity-consistent standard errors. Many regression programs have this option.

Example: See output.

Note that the differences are small (the wrong standard errors are here too large).
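One common form of the heteroskedasticity-consistent covariance matrix is White's estimator $(X'X)^{-1} X' \mathrm{diag}(e_i^2) X (X'X)^{-1}$. The sketch below implements this basic version without any small-sample correction, which regression programs may or may not apply, so it need not reproduce a given program's output digit for digit. The function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def ols_with_white_se(y, X):
    """OLS estimates with conventional and White (HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta

    # Conventional formula: s^2 (X'X)^{-1}, valid under homoskedasticity
    s2 = e @ e / (n - k)
    usual_se = np.sqrt(np.diag(s2 * XtX_inv))

    # White: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
    meat = X.T @ (X * e[:, None] ** 2)
    white_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, usual_se, white_se

# Simulated simple regression with error spread increasing in x (assumed)
rng = np.random.default_rng(2)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 1.0 + 0.5 * x + rng.standard_normal(n) * x
X = np.column_stack([np.ones(n), x])

beta, usual_se, white_se = ols_with_white_se(y, X)
print(beta, usual_se, white_se)
```

The coefficient estimates are the same under both options; only the standard errors, and hence the t-statistics, change.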



Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:49
Sample: 1 222
Included observations: 222
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C            3.809365     0.026119     145.8466      0.0000
YEARS        0.043853     0.004361     10.05599      0.0000
YEARS2      -0.000627     0.000118     -5.322369     0.0000

R-squared             0.536179    Mean dependent var       4.325410
Adjusted R-squared    0.531943    S.D. dependent var       0.302511
S.E. of regression    0.206962    Akaike info criterion   -0.299140
Sum squared resid     9.380504    Schwarz criterion       -0.253158
Log likelihood        36.20452    F-statistic              126.5823
Durbin-Watson stat    1.434005    Prob(F-statistic)        0.000000

Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C            3.809365     0.041338     92.15104      0.0000
YEARS        0.043853     0.004829      9.081645     0.0000
YEARS2      -0.000627     0.000121     -5.190657     0.0000

R-squared             0.536179    Mean dependent var       4.325410
Adjusted R-squared    0.531943    S.D. dependent var       0.302511
S.E. of regression    0.206962    Akaike info criterion   -0.299140
Sum squared resid     9.380504    Schwarz criterion       -0.253158
Log likelihood        36.20452    F-statistic              126.5823
Durbin-Watson stat    1.434005    Prob(F-statistic)        0.000000

