

14.384 Time Series Analysis, Fall 2007
Professor Anna Mikusheva
Paul Schrimpf, scribe
November 8, 2007
corrected September 2012

    Lecture 6

    GMM

This lecture extensively uses lectures given by Jim Stock as part of a mini-course at the NBER Summer Institute.

    Intro to GMM

We have data $z_t$, parameter $\theta$, and moment condition $Eg(z_t, \theta_0) = 0$. This moment condition identifies $\theta$. Many problems can be formulated in this way.

    Examples

OLS: we assume that $y_t = x_t'\beta + e_t$ and $Ee_t x_t = 0$. We can write this as a moment condition, $E[x_t(y_t - x_t'\beta)] = 0$, that is, $g(x_t, y_t, \beta) = x_t(y_t - x_t'\beta)$.
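As a quick sanity check (a minimal numpy sketch, not part of the lecture notes), solving the sample analog of this moment condition, $\frac{1}{T}\sum_t x_t(y_t - x_t'\beta) = 0$, reproduces the usual least-squares normal equations:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=(T, 2))                # regressors x_t
beta_true = np.array([1.0, -2.0])
y = x @ beta_true + rng.normal(size=T)     # y_t = x_t' beta + e_t

# Sample moment condition (1/T) X'(y - X beta) = 0  <=>  X'X beta = X'y,
# which is exactly the system of OLS normal equations
beta_mm = np.linalg.solve(x.T @ x, x.T @ y)
beta_ols = np.linalg.lstsq(x, y, rcond=None)[0]
```

Here `beta_mm` and `beta_ols` coincide, since both solve the same normal equations.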

IV: Consider a case when $y_t = x_t'\beta + e_t$, but the error term may be correlated with the regressor. However, we have so-called instruments $z_t$, correlated with $x_t$ but not correlated with the error term: $Ee_t z_t = 0$. This gives the moment condition $E[z_t(y_t - x_t'\beta)] = 0$, and $g(x_t, y_t, z_t, \beta) = z_t(y_t - x_t'\beta)$.

Euler Equation: This was the application for which Hansen developed GMM. Suppose we have CRRA utility, $u(c) = \frac{c^{1-\gamma} - 1}{1 - \gamma}$. An agent solves the problem of smoothing consumption over time (inter-temporal optimization). The first order condition from utility maximization gives us the Euler equation,
$$E\left[\beta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} R_{t+1} - 1 \,\middle|\, I_t\right] = 0,$$
where $I_t$ is the information available at time $t$ (the sigma-algebra of all variables observed before and at $t$). For any variable $z_t$ measurable with respect to $I_t$, we have
$$E\left[\left(\beta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} R_{t+1} - 1\right) z_t\right] = 0.$$
This moment condition can be used to estimate $\gamma$ and $\beta$.

    Estimation

Take the moment condition and replace it with its sample analog:
$$Eg(z_t, \theta) \;\longrightarrow\; \frac{1}{T}\sum_{t=1}^{T} g(z_t, \theta) = g_T(\theta)$$

Cite as: Anna Mikusheva, course materials for 14.384 Time Series Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].


Our estimate, $\hat\theta$, will be such that $g_T(\hat\theta) \approx 0$. Let's suppose $\theta$ is $k \times 1$ and $g_T(\theta)$ is $n \times 1$. If $n < k$, then $\theta$ is not identified. If $n = k$ (and $\frac{dg_0}{d\theta}(\theta_0)$ is full rank), then we are just identified, and we may find $\hat\theta$ such that $g_T(\hat\theta) = 0$. If $n > k$ (and $\frac{dg_0}{d\theta}(\theta_0)$ has rank $k$), then we are overidentified. In this case, it will generally be impossible to find $\hat\theta$ such that $g_T(\hat\theta) = 0$, so we instead minimize a quadratic form:
$$\hat\theta = \arg\min_\theta \; g_T(\theta)' W_T \, g_T(\theta),$$
where $W_T$ is symmetric and positive definite. Some things we want to know:

1. Asymptotics of $\hat\theta$

2. Efficient choice of $W_T$

3. Test of overidentifying restrictions
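To make the minimization concrete, here is a toy overidentified example (my own sketch, not from the lecture): estimating the mean and variance of a normal sample from three moment conditions ($n = 3 > k = 2$) with the simple choice $W_T = I$.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
z = rng.normal(loc=2.0, scale=1.5, size=2000)   # true mu = 2, sigma^2 = 2.25

def g_T(theta):
    """Sample moments: three conditions, two parameters (overidentified)."""
    mu, sig2 = theta
    u = z - mu
    return np.array([u.mean(), (u**2 - sig2).mean(), (u**3).mean()])

W = np.eye(3)                                    # arbitrary weighting matrix
objective = lambda th: g_T(th) @ W @ g_T(th)     # g_T' W g_T
res = minimize(objective, x0=[0.0, 1.0], method="Nelder-Mead")
mu_hat, sig2_hat = res.x
```

Since no $\hat\theta$ sets all three sample moments to zero exactly, the minimizer trades them off through the quadratic form.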

Assumptions

1. The parameter space is compact

2. $W_T \to^p W$

3. For the function $g_0(\theta) = Eg(z_t, \theta)$, assume that $g_0(\theta_0) = 0$ and $Wg_0(\theta) \neq 0$ for any $\theta \neq \theta_0$ (identification)

4. The function $g_0(\theta)$ is continuous

5. $g_T(\theta)$ converges uniformly in probability to $g_0(\theta)$

Theorem 1. Under assumptions 1-5 the estimator $\hat\theta$ is consistent, that is, $\hat\theta \to^p \theta_0$.

If $z_t$ is a strictly stationary and ergodic sequence, $g(z_t, \theta)$ is continuous at each $\theta$ with probability one, and there is $d(z)$ such that $\|g(z, \theta)\| \leq d(z)$ and $Ed(z) < \infty$, then assumption 5 is satisfied.


Asymptotic Distribution

Theorem 2. Under assumptions 1-9, $\hat\theta$ is asymptotically normal:
$$\sqrt{T}(\hat\theta - \theta_0) \Rightarrow N(0, \Sigma), \quad \Sigma = (R'WR)^{-1}(R'WSWR)(R'WR)^{-1},$$
where $S$ is the asymptotic variance of $\sqrt{T} g_T(\theta_0)$ and $R$ is the probability limit of $\frac{\partial g_T}{\partial \theta}(\theta_0)$.

Proof. (Sketch of the proof) To prove this, we take a Taylor expansion of the first order condition. The first order condition is:
$$\frac{\partial g_T(\theta)'}{\partial \theta} W_T \, g_T(\theta) \,\Big|_{\theta = \hat\theta} = 0$$
Expanding $g_T(\hat\theta)$ around $\theta_0$:
$$\sqrt{T} g_T(\hat\theta) = \sqrt{T} g_T(\theta_0) + \frac{\partial g_T}{\partial \theta}(\theta_0)\, \sqrt{T}(\hat\theta - \theta_0) + o_p(1)$$
We've assumed that $\sqrt{T} g_T(\theta_0) \Rightarrow N(0, S)$ and $\frac{\partial g_T}{\partial \theta}(\theta_0) \to^p R$. Then, putting our expansion into the first order condition, rearranging, and using these convergence results, we get
$$\sqrt{T}(\hat\theta - \theta_0) = -\left(\frac{\partial g_T'}{\partial \theta} W_T \frac{\partial g_T}{\partial \theta}\right)^{-1} \frac{\partial g_T'}{\partial \theta} W_T \sqrt{T}\, g_T(\theta_0) \Rightarrow -(R'WR)^{-1} R'W\, N(0, S)$$

Efficient Weighting Matrix

Lemma 3. The efficient choice of $W$ is $S^{-1}$. This choice of $W$ gives asymptotic variance $\Sigma^* = (R'S^{-1}R)^{-1}$. For any other $W$, $\Sigma - \Sigma^*$ is positive semi-definite.
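A quick numerical check of the lemma (my own sketch; $R$, $S$, and $W$ here are arbitrary matrices of the right shapes, not estimated from data): for a random positive definite $W$, the sandwich variance minus $(R'S^{-1}R)^{-1}$ has non-negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 4, 2
R = rng.normal(size=(n, k))            # derivative matrix, full column rank a.s.
A = rng.normal(size=(n, n))
S = A @ A.T + np.eye(n)                # a positive definite "long-run variance"
B = rng.normal(size=(n, n))
W = B @ B.T + np.eye(n)                # an arbitrary positive definite weight

def sandwich(W):
    """Sigma(W) = (R'WR)^{-1} (R'WSWR) (R'WR)^{-1}."""
    M = np.linalg.inv(R.T @ W @ R)
    return M @ (R.T @ W @ S @ W @ R) @ M

Sigma_W = sandwich(W)
Sigma_eff = np.linalg.inv(R.T @ np.linalg.inv(S) @ R)   # variance at W = S^{-1}
gap = np.linalg.eigvalsh(Sigma_W - Sigma_eff)           # should all be >= 0
```

Note that plugging $W = S^{-1}$ into the sandwich formula collapses it to $(R'S^{-1}R)^{-1}$, as the lemma states.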

In practice, we do not know $S$. There are several options available:

Feasible efficient GMM: Choose an arbitrary $W$, say $I$, get an initial estimate $\hat\theta_1$, use it to calculate $\hat S$ (in time series, we need to use Newey-West to estimate $S$), then re-run GMM with $W_T = \hat S^{-1}$.

We can also iterate this procedure, or plug in $S(\theta)$ to obtain the continuous updating estimator (CUE). Under the assumptions above these alternatives have the same asymptotic variance. In IV, feasible efficient GMM corresponds to TSLS, and CUE is the same as LIML.
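A sketch of the two-step procedure in the linear IV model (my own illustration; the simulated data are iid, so $\hat S$ is just the sample outer product of the moments — with time-series data one would use a Newey-West estimator instead):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 2000
Z = rng.normal(size=(T, 2))                   # two instruments, one parameter
v = rng.normal(size=T)
x = Z @ np.array([1.0, 0.5]) + v              # first stage
u = 0.8 * v + rng.normal(size=T)              # error correlated with x
y = 1.0 * x + u                               # true beta = 1

def gmm_beta(W):
    """Linear GMM: beta = (x'ZWZ'x)^{-1} x'ZWZ'y."""
    a = x @ Z                                  # the 1 x 2 row vector x'Z
    return (a @ W @ (Z.T @ y)) / (a @ W @ (Z.T @ x))

# Step 1: arbitrary weighting matrix W = I gives an initial estimate
beta1 = gmm_beta(np.eye(2))
# Step 2: estimate S from step-1 moments, re-run with W = S^{-1}
g = Z * (y - x * beta1)[:, None]               # g_t = Z_t u_t(beta1)
S_hat = g.T @ g / T
beta2 = gmm_beta(np.linalg.inv(S_hat))
```

Iterating would repeat step 2 with `beta2` in place of `beta1` until the estimate stabilizes.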

Tests of Over-identifying Restrictions: When we have more moment conditions than unknown parameters, we may test whether the model is correctly specified (that is, whether the moment conditions jointly hold):
$$J = [\sqrt{T} g_T(\hat\theta)]' \hat S^{-1} [\sqrt{T} g_T(\hat\theta)] \Rightarrow \chi^2_{n-k}$$
Under the null that all moment conditions are true, the statistic $J$ converges to a $\chi^2_{n-k}$. Large values of the J-statistic lead to rejection.

Remark 4. This is a test of all the restrictions jointly. If we reject, it does not tell us which moment condition is wrong. It merely means the moment conditions contradict one another.
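A sketch of computing the J-statistic in a linear IV setting (my own illustration; the instruments here are valid by construction, so the test should typically not reject):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
T = 2000
Z = rng.normal(size=(T, 2))                  # n = 2 moment conditions
x = Z @ np.array([1.0, 0.5]) + rng.normal(size=T)
u = rng.normal(size=T)                       # valid instruments: E[u_t Z_t] = 0
y = 2.0 * x + u                              # k = 1 parameter, true beta = 2

def gbar(b):
    return Z.T @ (y - x * b) / T             # g_T(beta)

# two-step efficient GMM estimate
a = x @ Z
beta1 = (a @ (Z.T @ y)) / (a @ (Z.T @ x))    # step 1, W = I
g = Z * (y - x * beta1)[:, None]
S_inv = np.linalg.inv(g.T @ g / T)
beta2 = (a @ S_inv @ (Z.T @ y)) / (a @ S_inv @ (Z.T @ x))

# J = T * g_T(beta2)' S^{-1} g_T(beta2), chi^2 with n - k = 1 df under the null
J = T * gbar(beta2) @ S_inv @ gbar(beta2)
p_value = chi2.sf(J, df=1)
```

Replacing `u` with an error correlated with one instrument would push `J` up and `p_value` toward zero.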



    Weak IV

The weak IV problem arises when weak correlation between instruments and regressors causes the distributions of IV estimators and test statistics to be poorly approximated by their standard asymptotic distributions (normal and chi-squared). The problem is not directly related to non-identification or partial identification (when only part of the unknown parameter is identified). We assume that all the parameters are identified in the classical sense. The issue concerns only the quality of classical asymptotics.

Consider a classical setup (with one endogenous regressor and no covariates, for simplicity):
$$y_t = x_t \beta + u_t, \quad x_t = Z_t' \pi + v_t,$$
where $x_t, y_t$ are one-dimensional, $Z_t$ is $k \times 1$ and $Eu_t Z_t = 0$. A GMM estimator for this moment condition solves the following minimization problem: $(y - x\beta)'ZWZ'(y - x\beta) \to \min$, and thus $\hat\beta = (x'ZWZ'x)^{-1}(x'ZWZ'y)$. The optimal matrix is $W = (Z'Z)^{-1}$. It leads to the estimator called two-stage least squares (TSLS): $\hat\beta_{TSLS} = \frac{x'P_Z y}{x'P_Z x}$, where $P_Z = Z(Z'Z)^{-1}Z'$. The problem of weak identification arises when the moment conditions are not very informative about the parameter of interest: the sample analog of the derivative of the moment function, $\frac{Z'x}{T}$, is small (weak correlation between instruments and regressor). Below are some known applied examples where the problem arises (the Stock and Watson mini-course at the NBER Summer Institute is an excellent reference).

Return to education and quarter of birth: In Angrist and Krueger (1991), $y_i$ is log earnings, $x_i$ is years of education, and $z_i$ is quarter of birth. Read the paper for explanations. Bound, Jaeger, and Baker (1995) showed that if one randomly assigns quarter of birth and runs IV, s/he obtains results close to the initial ones (but the instruments are irrelevant in this case!). What is important here is that Angrist and Krueger obtained quite narrow confidence sets (which usually indicates high accuracy of estimation) in a situation where there is little information in the data. This raises suspicions that those confidence sets have poor coverage and are based on poor asymptotic approximations. Nowadays, we can assign this example to the weak or many instruments setting.

Elasticity of inter-temporal substitution: Let $\Delta c_{t+1}$ be consumption growth from $t$ to $t+1$ and $r_{i,t+1}$ be the return on asset $i$ from $t$ to $t+1$. The linearized Euler equation is
$$E[(\Delta c_{t+1} - \tau_i - \psi r_{i,t+1}) | I_t] = 0$$
For any variable $Z_t$ measurable with respect to $I_t$ we have
$$E[(\Delta c_{t+1} - \tau_i - \psi r_{i,t+1}) Z_t] = 0$$
Our goal is to estimate $\psi$. It could be done in two ways:

Run IV (with instruments $Z_t$) in: $\Delta c_{t+1} = \tau_i + \psi r_{i,t+1} + u_t$

Run IV (with instruments $Z_t$) in: $r_{i,t+1} = \mu_i + \lambda \Delta c_{t+1} + u_t$, where $\lambda = 1/\psi$

Finding in the literature: the 95% confidence sets obtained by the two methods do not intersect(!). Explanation: the second regression suffers from a weak instrument problem, since it is routinely very difficult to predict changes in consumption.



Phillips curve:
$$\pi_t = \lambda x_t + \gamma_f E_t \pi_{t+1} + \gamma_b \pi_{t-1} + e_t$$
GMM estimates the moment condition:
$$E[(\pi_t - \lambda x_t - \gamma_f \pi_{t+1} - \gamma_b \pi_{t-1}) Z_{t-1}] = 0, \quad Z_{t-1} \in I_{t-1}$$
It is very difficult to predict $\pi_{t+1}$ beyond $\pi_{t-1}$ by any variables observed at time $t-1$, which causes weak relevance of any instruments.


MIT OpenCourseWare
http://ocw.mit.edu

14.384 Time Series Analysis
Fall 2013

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.