Generalized Regression Model
Based on Greene’s Note 15 (Chapter 8)
Generalized Regression Model

Setting: The classical linear model assumes that $E[\varepsilon\varepsilon'] = \mathrm{Var}[\varepsilon] = \sigma^2 I$. That is, observations are uncorrelated and all are drawn from a distribution with the same variance. The generalized regression (GR) model allows the variances to differ across observations and allows correlation across observations.
Implications

• The assumption that $\mathrm{Var}[\varepsilon] = \sigma^2 I$ is used to derive the result $\mathrm{Var}[b] = \sigma^2 (X'X)^{-1}$. If it is not true, then the use of $s^2 (X'X)^{-1}$ to estimate $\mathrm{Var}[b]$ is inappropriate.
• The assumption was used to derive most of our test statistics, so they must be revised as well.
• Least squares gives each observation a weight of 1/n. But if the variances are not equal, then some observations are more informative than others.
• Least squares is based on simple sums, so the information that one observation might provide about another is never used.
GR Model

• The generalized regression model: $y = X\beta + \varepsilon$, $E[\varepsilon|X] = 0$, $\mathrm{Var}[\varepsilon|X] = \sigma^2\Omega$. Regressors are well behaved. We consider some examples with $\mathrm{tr}(\Omega) = n$. (This is a normalization with no content.)
• Leading cases:
  – Simple heteroscedasticity
  – Autocorrelation
  – Panel data and heterogeneity more generally
Least Squares

• Still unbiased. (The proof did not rely on $\Omega$.)
• For consistency, we need the true variance of b:

$$\mathrm{Var}[b|X] = E[(b-\beta)(b-\beta)'|X] = (X'X)^{-1}\, E[X'\varepsilon\varepsilon'X|X]\, (X'X)^{-1} = \sigma^2 (X'X)^{-1}\, X'\Omega X\, (X'X)^{-1}.$$

Divide all four terms by n. If the middle one converges to a finite matrix of constants, we have the result, so we need to examine

$$\frac{1}{n} X'\Omega X = \frac{1}{n} \sum_i \sum_j \omega_{ij}\, x_i x_j'.$$

This will be another assumption of the model.
• Asymptotic normality? Easy for the heteroscedasticity case, very difficult for the autocorrelation case.
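To make the sandwich formula concrete, here is a minimal numpy sketch (the function name and interface are our own illustration, not Greene's):

```python
import numpy as np

def ols_true_variance(X, Omega, sigma2=1.0):
    """True variance of OLS b when Var[eps|X] = sigma2 * Omega:
    Var[b|X] = sigma2 (X'X)^-1 X'Omega X (X'X)^-1 (the sandwich form)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return sigma2 * XtX_inv @ (X.T @ Omega @ X) @ XtX_inv
```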
Robust Covariance Matrix

• Robust estimation: how do we estimate $\mathrm{Var}[b|X] = \sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}$ for the LS b?
• Note the distinction between estimating $\sigma^2\Omega$, an $n \times n$ matrix, and estimating the $K \times K$ matrix $\sigma^2 X'\Omega X = \sigma^2 \sum_i \sum_j \omega_{ij}\, x_i x_j'$.
• For modern applied econometrics:
  – The White estimator
  – Newey-West
The White Estimator

$$\mathrm{Est.Var}[b] = (X'X)^{-1} \left( \sum_{i=1}^n e_i^2\, x_i x_i' \right) (X'X)^{-1}$$

Use $\hat\sigma_i^2 = \dfrac{e_i^2}{\frac{1}{n}\sum_{i=1}^n e_i^2}$, $\hat\Omega = \mathrm{diag}(\hat\sigma_i^2)$; note $\mathrm{tr}(\hat\Omega) = n$. Then

$$\mathrm{Est.Var}[b] = \frac{1}{n} \left( \frac{X'X}{n} \right)^{-1} \left( \frac{\hat\sigma^2\, X'\hat\Omega X}{n} \right) \left( \frac{X'X}{n} \right)^{-1}$$

Does $\dfrac{X'\hat\Omega X}{n} - \dfrac{X'\Omega X}{n} \to 0$?
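A minimal numpy sketch of the White estimator (an illustrative function of our own, assuming X is n-by-K and y has length n):

```python
import numpy as np

def white_cov(X, y):
    """White heteroscedasticity-robust covariance for OLS:
    Est.Var[b] = (X'X)^-1 (sum_i e_i^2 x_i x_i') (X'X)^-1."""
    b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS coefficients
    e = y - X @ b                           # OLS residuals
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = (X * (e**2)[:, None]).T @ X      # sum_i e_i^2 x_i x_i'
    return b, XtX_inv @ meat @ XtX_inv
```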
Newey-West Estimator

Heteroscedasticity component (diagonal elements):

$$S_0 = \frac{1}{n} \sum_{i=1}^n e_i^2\, x_i x_i'$$

Autocorrelation component (off-diagonal elements):

$$S_1 = \frac{1}{n} \sum_{l=1}^{L} \sum_{t=l+1}^{n} w_l\, e_t e_{t-l} \left( x_t x_{t-l}' + x_{t-l} x_t' \right), \qquad w_l = 1 - \frac{l}{L+1} = \text{"Bartlett weight"}$$

$$\mathrm{Est.Var}[b] = \frac{1}{n} \left( \frac{X'X}{n} \right)^{-1} \left[ S_0 + S_1 \right] \left( \frac{X'X}{n} \right)^{-1}$$
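A corresponding numpy sketch (illustrative; the lag truncation `L` is an input the user chooses):

```python
import numpy as np

def newey_west_cov(X, y, L):
    """Newey-West covariance: sandwich with S = S0 + weighted
    autocovariance terms, Bartlett weights w_l = 1 - l/(L+1)."""
    n = X.shape[0]
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    Xe = X * e[:, None]                 # row t is e_t * x_t'
    S = Xe.T @ Xe / n                   # S0: the diagonal (White) part
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)         # Bartlett weight
        G = Xe[l:].T @ Xe[:-l] / n      # (1/n) sum_t e_t e_{t-l} x_t x_{t-l}'
        S += w * (G + G.T)              # off-diagonal terms, both directions
    A_inv = np.linalg.inv(X.T @ X / n)
    return A_inv @ S @ A_inv / n        # Est.Var[b]
```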
Generalized Least Squares

A transformation of the model: $P = \Omega^{-1/2}$, $P'P = \Omega^{-1}$.

$Py = PX\beta + P\varepsilon$, or $y_* = X_*\beta + \varepsilon_*$. Why?

$$E[\varepsilon_*\varepsilon_*'|X_*] = P\, E[\varepsilon\varepsilon'|X]\, P' = \sigma^2 P\Omega P' = \sigma^2\, \Omega^{-1/2}\, \Omega\, \Omega^{-1/2} = \sigma^2 \Omega^0 = \sigma^2 I$$
Generalized Least Squares

Aitken theorem: the generalized least squares estimator, GLS. $Py = PX\beta + P\varepsilon$, or $y_* = X_*\beta + \varepsilon_*$, with $E[\varepsilon_*\varepsilon_*'|X_*] = \sigma^2 I$. Use ordinary least squares in the transformed model; it satisfies the Gauss-Markov theorem:

$$b_* = (X_*'X_*)^{-1} X_*' y_*$$
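A minimal sketch of GLS by transformation, assuming $\Omega$ is known. Any P with $P'P = \Omega^{-1}$ works; here one is built from a Cholesky factor (the function is our illustration):

```python
import numpy as np

def gls(X, y, Omega):
    """GLS: transform by P with P'P = Omega^-1, then run OLS on (PX, Py)."""
    L = np.linalg.cholesky(np.linalg.inv(Omega))  # L @ L.T = Omega^-1
    P = L.T                                       # so P.T @ P = Omega^-1
    Xs, ys = P @ X, P @ y                         # the starred (transformed) model
    return np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)  # = (X'Om^-1 X)^-1 X'Om^-1 y
```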
Generalized Least Squares

Efficient estimation of $\beta$ and, by implication, the inefficiency of least squares b:

$$\hat\beta = (X_*'X_*)^{-1} X_*' y_* = (X'P'PX)^{-1} X'P'Py = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y \neq b.$$

$\hat\beta$ is efficient, so by construction, b is not.
Asymptotics for GLS

Asymptotic distribution of GLS. (Note: we apply the full set of results of the classical model to the transformed model.)

• Unbiasedness
• Consistency ("well behaved data")
• Asymptotic distribution
• Test statistics
Unbiasedness

$$\hat\beta = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y = \beta + (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}\varepsilon$$

$$E[\hat\beta|X] = \beta + (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}\, E[\varepsilon|X] = \beta \quad \text{if } E[\varepsilon|X] = 0.$$
Consistency

Use mean square convergence:

$$\mathrm{Var}[\hat\beta|X] = \frac{\sigma^2}{n} \left( \frac{X'\Omega^{-1}X}{n} \right)^{-1} \to 0\,?$$

Requires $\frac{1}{n} X'\Omega^{-1}X$ to be "well behaved": either converge to a constant matrix or diverge.

Heteroscedasticity case:

$$\frac{1}{n} X'\Omega^{-1}X = \frac{1}{n} \sum_{i=1}^n \frac{1}{\omega_{ii}}\, x_i x_i'$$

Autocorrelation case:

$$\frac{1}{n} X'\Omega^{-1}X = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n \omega^{ij}\, x_i x_j'.$$

Here there are $n^2$ terms, and convergence is unclear.
Asymptotic Normality

$$\sqrt{n}\,(\hat\beta - \beta) = \left( \frac{X'\Omega^{-1}X}{n} \right)^{-1} \frac{1}{\sqrt{n}}\, X'\Omega^{-1}\varepsilon$$

Converge to normal with a stable variance O(1)?

• Is $\frac{X'\Omega^{-1}X}{n}$ a constant matrix?
• Is $\frac{1}{\sqrt{n}}\, X'\Omega^{-1}\varepsilon$ a mean to which we can apply the central limit theorem?

Heteroscedasticity case?

$$\frac{1}{\sqrt{n}}\, X'\Omega^{-1}\varepsilon = \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{1}{\omega_{ii}}\, x_i \varepsilon_i.$$

$\mathrm{Var}[\varepsilon_i] = \sigma^2\omega_{ii}$, and $x_i$ is just data. Apply Lindeberg-Feller. (Or, assuming $\varepsilon_i/\omega_{ii}$ is a draw from a common distribution with mean zero and fixed variance, as in some recent treatments.)

Autocorrelation case?
Asymptotic Normality (Cont.)

For the autocorrelation case:

$$\frac{1}{\sqrt{n}}\, X'\Omega^{-1}\varepsilon = \frac{1}{\sqrt{n}} \sum_{i=1}^n \sum_{j=1}^n \omega^{ij}\, x_i \varepsilon_j$$

Does the double sum converge? Uncertain. Requires the elements of $\Omega$ to become small as the distance between i and j increases. (It has to resemble the heteroscedasticity case.)
Test Statistics (Assuming Known Ω)
• With known Ω, apply all familiar results to the transformed model.
• With normality, t and F statistics apply to least squares based on Py and PX.
• With asymptotic normality, use Wald statistics and the chi-squared distribution, still based on the transformed model.
Generalized (Weighted) Least Squares: Heteroscedasticity

$$\mathrm{Var}[\varepsilon] = \sigma^2\Omega = \sigma^2 \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_n \end{bmatrix}, \qquad \Omega^{-1/2} = \begin{bmatrix} 1/\sqrt{\omega_1} & 0 & \cdots & 0 \\ 0 & 1/\sqrt{\omega_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\sqrt{\omega_n} \end{bmatrix}$$

$$\hat\beta = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y = \left( \sum_{i=1}^n \frac{1}{\omega_i}\, x_i x_i' \right)^{-1} \left( \sum_{i=1}^n \frac{1}{\omega_i}\, x_i y_i \right)$$

$$\hat\varepsilon_i = y_i - x_i'\hat\beta, \qquad \hat\sigma^2 = \frac{\sum_{i=1}^n \hat\varepsilon_i^2/\omega_i}{n - K}$$
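In this case the matrix sums above collapse to simple weighted sums, which a short numpy sketch makes explicit (our illustration; `omega` holds the diagonal elements $\omega_i$):

```python
import numpy as np

def wls(X, y, omega):
    """Weighted least squares: weight observation i by 1/omega_i."""
    w = 1.0 / omega
    beta = np.linalg.solve((X * w[:, None]).T @ X,  # sum_i x_i x_i' / omega_i
                           X.T @ (w * y))           # sum_i x_i y_i  / omega_i
    e = y - X @ beta
    sigma2 = np.sum(e**2 / omega) / (len(y) - X.shape[1])
    return beta, sigma2
```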
Autocorrelation

$\varepsilon_t = \rho\varepsilon_{t-1} + u_t$ ("first order autocorrelation." How does this come about?)

Assume $-1 < \rho < 1$. Why?

$u_t$ = "nonautocorrelated white noise"

$\varepsilon_t = \rho\varepsilon_{t-1} + u_t$ (the autoregressive form)
$\quad = \rho(\rho\varepsilon_{t-2} + u_{t-1}) + u_t$
$\quad = \ldots$ (continue to substitute)
$\quad = u_t + \rho u_{t-1} + \rho^2 u_{t-2} + \rho^3 u_{t-3} + \ldots$ (the moving average form)

(Some observations about modeling time series.)
Autocorrelation

$$\mathrm{Var}[\varepsilon_t] = \mathrm{Var}[u_t + \rho u_{t-1} + \rho^2 u_{t-2} + \ldots] = \sum_{i=0}^{\infty} \rho^{2i}\, \mathrm{Var}[u_{t-i}] = \sum_{i=0}^{\infty} \rho^{2i} \sigma_u^2 = \frac{\sigma_u^2}{1 - \rho^2}$$

An easier way: since $\mathrm{Var}[\varepsilon_t] = \mathrm{Var}[\varepsilon_{t-1}]$ and $u_t$ is uncorrelated with $\varepsilon_{t-1}$,

$$\mathrm{Var}[\varepsilon_t] = \rho^2\, \mathrm{Var}[\varepsilon_{t-1}] + \mathrm{Var}[u_t] + 2\rho\, \mathrm{Cov}[\varepsilon_{t-1}, u_t] = \rho^2\, \mathrm{Var}[\varepsilon_t] + \sigma_u^2,$$

so

$$\mathrm{Var}[\varepsilon_t] = \frac{\sigma_u^2}{1 - \rho^2}.$$
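A quick simulation check of the stationary variance (a sketch with arbitrary values $\rho = 0.7$, $\sigma_u = 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_u, T = 0.7, 1.0, 200_000
u = rng.normal(0.0, sigma_u, T)
eps = np.empty(T)
eps[0] = u[0] / np.sqrt(1 - rho**2)   # start at the stationary scale
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]  # the autoregressive form
print(eps.var())                      # approximately 1.96
print(sigma_u**2 / (1 - rho**2))      # = 1.9608..., matching the formula
```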
Autocovariances

Continuing:

$$\mathrm{Cov}[\varepsilon_t, \varepsilon_{t-1}] = \mathrm{Cov}[\rho\varepsilon_{t-1} + u_t,\ \varepsilon_{t-1}] = \rho\, \mathrm{Cov}[\varepsilon_{t-1}, \varepsilon_{t-1}] + \mathrm{Cov}[u_t, \varepsilon_{t-1}] = \rho\, \mathrm{Var}[\varepsilon_{t-1}] = \frac{\rho\, \sigma_u^2}{1 - \rho^2}$$

$$\mathrm{Cov}[\varepsilon_t, \varepsilon_{t-2}] = \mathrm{Cov}[\rho\varepsilon_{t-1} + u_t,\ \varepsilon_{t-2}] = \rho\, \mathrm{Cov}[\varepsilon_{t-1}, \varepsilon_{t-2}] + \mathrm{Cov}[u_t, \varepsilon_{t-2}] = \frac{\rho^2\, \sigma_u^2}{1 - \rho^2},$$

and so on.
Autocorrelation Matrix

$$\sigma^2\Omega = \frac{\sigma_u^2}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{bmatrix}$$

(Note: $\mathrm{tr}(\Omega) = n$ as required.)
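The matrix is Toeplitz, with (i, j) element $\rho^{|i-j|}$ times the stationary variance, so it takes one line of numpy to build (our sketch):

```python
import numpy as np

def ar1_sigma2_omega(T, rho, sigma_u2=1.0):
    """sigma^2 * Omega for AR(1) errors: (i,j) element is
    rho^|i-j| * sigma_u^2 / (1 - rho^2). Note trace(Omega) = T."""
    idx = np.arange(T)
    Omega = rho ** np.abs(idx[:, None] - idx[None, :])
    return sigma_u2 / (1.0 - rho**2) * Omega
```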
Generalized Least Squares

$$\Omega^{-1/2} = \begin{bmatrix} \sqrt{1-\rho^2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & -\rho & 1 \end{bmatrix}$$

$$\Omega^{-1/2} y = \begin{bmatrix} \sqrt{1-\rho^2}\; y_1 \\ y_2 - \rho y_1 \\ y_3 - \rho y_2 \\ \vdots \\ y_T - \rho y_{T-1} \end{bmatrix}$$
The Autoregressive Transformation

$$y_t = x_t'\beta + \varepsilon_t, \qquad \varepsilon_t = \rho\varepsilon_{t-1} + u_t$$

$$\rho y_{t-1} = \rho x_{t-1}'\beta + \rho\varepsilon_{t-1}$$

$$y_t - \rho y_{t-1} = (x_t - \rho x_{t-1})'\beta + (\varepsilon_t - \rho\varepsilon_{t-1})$$

$$y_t - \rho y_{t-1} = (x_t - \rho x_{t-1})'\beta + u_t$$

(Where did the first observation go?)
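The Prais-Winsten form keeps the first observation by rescaling it with $\sqrt{1-\rho^2}$, which is exactly the first row of $\Omega^{-1/2}$ above; dropping it instead gives Cochrane-Orcutt. A sketch (our own illustrative function):

```python
import numpy as np

def prais_winsten_transform(X, y, rho):
    """Quasi-difference rows 2..T; rescale row 1 by sqrt(1 - rho^2)
    so the first observation is not lost."""
    c = np.sqrt(1.0 - rho**2)
    Xs = np.vstack([c * X[:1], X[1:] - rho * X[:-1]])
    ys = np.concatenate([c * y[:1], y[1:] - rho * y[:-1]])
    return Xs, ys   # OLS on (Xs, ys) is GLS for known rho
```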
Unknown Ω

• The problem, of course, is that Ω is unknown. For now, we will consider two methods of estimation:
  – Two step, or feasible, estimation. Estimate Ω first, then do GLS. Emphasize: same logic as White and Newey-West. We don't need to estimate Ω itself; we need to find a matrix that behaves the same as $(1/n) X'\Omega^{-1}X$.
  – Properties of the feasible GLS estimator
• Maximum likelihood estimation of $\beta$, $\sigma^2$, and $\Omega$ all at the same time.
  – Joint estimation of all parameters. Fairly rare. Some generalities…
  – We will examine two applications: Harvey's model of heteroscedasticity and Beach-MacKinnon on the first order autocorrelation model.
Specification

• Ω must be specified first.
• A full unrestricted Ω contains $n(n+1)/2 - 1$ parameters. (Why minus 1? Remember, $\mathrm{tr}(\Omega) = n$, so one element is determined.)
• Ω is generally specified in terms of a few parameters. Thus, $\Omega = \Omega(\theta)$ for some small parameter vector $\theta$. It becomes a question of estimating $\theta$.
• Examples:
Heteroscedasticity: Harvey's Model

• $\mathrm{Var}[\varepsilon_i | X] = \sigma^2 \exp(\theta' z_i)$
• $\mathrm{Cov}[\varepsilon_i, \varepsilon_j | X] = 0$

e.g.: $z_i$ = firm size
e.g.: $z_i$ = a set of dummy variables (e.g., countries) (the groupwise heteroscedasticity model)

• $[\sigma^2\Omega] = \mathrm{diagonal}[\exp(\alpha + \theta' z_i)]$, where $\alpha = \log(\sigma^2)$
AR(1) Model of Autocorrelation

$$\sigma^2\Omega = \frac{\sigma_u^2}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{bmatrix}$$
Two Step Estimation

The general result for estimation when Ω is estimated.

GLS uses $[X'\Omega^{-1}X]^{-1} X'\Omega^{-1}y$, which converges in probability to $\beta$.

We seek a vector which converges to the same thing that this does. Call it "feasible GLS" or FGLS, based on $[X'\hat\Omega^{-1}X]^{-1} X'\hat\Omega^{-1}y$.

The object is to find a set of parameters such that

$$[X'\hat\Omega^{-1}X]^{-1} X'\hat\Omega^{-1}y - [X'\Omega^{-1}X]^{-1} X'\Omega^{-1}y \to 0.$$
Feasible GLS

For FGLS estimation, we do not seek an estimator $\hat\Omega$ such that

$$\hat\Omega - \Omega \to 0.$$

This makes no sense, since $\hat\Omega$ is $n \times n$ and does not "converge" to anything. We seek a matrix $\hat\Omega$ such that

$$\frac{1}{n} X'\hat\Omega^{-1}X - \frac{1}{n} X'\Omega^{-1}X \to 0.$$

For the asymptotic properties, we will also require that

$$\frac{1}{\sqrt{n}} X'\hat\Omega^{-1}\varepsilon - \frac{1}{\sqrt{n}} X'\Omega^{-1}\varepsilon \to 0.$$

Note that in this case, these are two random vectors, which we require to converge to the same random vector.
Two Step FGLS

(Theorem 8.5) To achieve full efficiency, we do not need an efficient estimate of the parameters in Ω, only a consistent one. Why?
Harvey's Model

Examine Harvey's model once again.

Methods of estimation:

• Two step FGLS: use the least squares residuals to estimate $\theta$, then use $\{X'[\Omega(\hat\theta)]^{-1}X\}^{-1} X'[\Omega(\hat\theta)]^{-1}y$ to estimate $\beta$.
• Full maximum likelihood estimation: estimate all parameters simultaneously.
• A handy result due to Oberhofer and Kmenta: the "zig-zag" approach.

Examine a model of groupwise heteroscedasticity.
Harvey's Model for Groupwise Heteroscedasticity

• Groupwise sample: $y_{ig}, x_{ig}, \ldots$
• G groups, each with $N_g$ observations.
• $\mathrm{Var}[\varepsilon_{ig}] = \sigma_g^2$
• Let $d_{ig,j} = 1$ if observation $i,g$ is in group $j$, 0 else: a group dummy variable.
• $\mathrm{Var}[\varepsilon_{ig}] = \sigma_1^2 \exp(\theta_2 d_2 + \ldots + \theta_G d_G)$
• $\mathrm{Var}_1 = \sigma_1^2$, $\mathrm{Var}_2 = \sigma_1^2 \exp(\theta_2)$, and so on.
Estimating Variance Components

• OLS is still consistent:
• $\mathrm{Est.Var}_1 = e_1'e_1/N_1$ estimates $\sigma_1^2$
• $\mathrm{Est.Var}_2 = e_2'e_2/N_2$ estimates $\sigma_1^2 \exp(\theta_2)$
• The estimator of $\theta_2$ is $\ln\left[\dfrac{e_2'e_2/N_2}{e_1'e_1/N_1}\right]$
• (1) Now use FGLS (weighted least squares) and recompute the residuals using the WLS slopes
• (2) Recompute the variance estimators
• Iterate between (1) and (2) to a solution, as sketched below
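A compact numpy sketch of this zig-zag iteration (our own illustration; `groups` is an integer label per observation):

```python
import numpy as np

def groupwise_fgls(X, y, groups, iters=20):
    """Alternate (1) WLS slopes given group variances and
    (2) group variances given the current slopes."""
    uniq, inv = np.unique(groups, return_inverse=True)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # start from OLS
    for _ in range(iters):
        e = y - X @ beta
        sig2 = np.array([np.mean(e[inv == g]**2)       # e_g'e_g / N_g
                         for g in range(len(uniq))])
        w = 1.0 / sig2[inv]                            # weights 1/sigma_g^2
        beta = np.linalg.solve((X * w[:, None]).T @ X, X.T @ (w * y))
    return beta, sig2
```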