econometric analysis of panel data
DESCRIPTION
Econometric Analysis of Panel Data. William Greene Department of Economics Stern School of Business. Econometric Analysis of Panel Data. 20. Sample Selection and Attrition. Received Sunday, April 27, 2014 - PowerPoint PPT PresentationTRANSCRIPT
Part 20: Selection [1/66]
Econometric Analysis of Panel Data
William GreeneDepartment of EconomicsStern School of Business
Part 20: Selection [2/66]
Econometric Analysis of Panel Data
20. Sample Selection and Attrition
Part 20: Selection [3/66]
Received Sunday, April 27, 2014
I have a paper regarding strategic alliances between firms, and their impact on firm risk. While observing how a firm’s strategic alliance formation impacts its risk, I need to correct for two types of selection biases. The reviews at Journal of Marketing asked us to correct for the propensity of firms to enter into alliances, and also the propensity to select a specific partner, before we examine how the partnership itself impacts risk.
Our approach involved conducting a probit of alliance formation propensity, take the inverse mills and include it in the second selection equation which is also a probit of partner selection. Then, we include inverse mills from the second selection into the main model. The review team states that this is not correct, and we need an MLE estimation in order to correctly model the set of three equations. The Associate Editor’s point is given below. Can you please provide any guidance on whether this is a valid criticism of our approach. Is there a procedure in LIMDEP that can handle this set of three equations with two selection probit models?
AE’s comment:“Please note that the procedure of using an inverse mills ratio is only consistent when the main equation where the ratio is being used is linear. In non-linear cases (like the second probit used by the authors), this is not correct. Please see any standard econometric treatment like Greene or Wooldridge. A MLE estimator is needed which will be far from trivial to specify and estimate given error correlations between all three equations.”
Part 20: Selection [4/66]
Hello Dr. Greene,
My name is xxxxxxxxxx and I go to the University of xxxxxxxx.I see that you have an errata page on your website of your econometrics book 7th edition.It seems like you want to correct all mistakes so I think I have spotted a possible proofreading error.On page 477 (theorem 13.2) you want to show that theta is consistent and you say that
"But, at the true parameter values, qn(θ0) →0. So, if (13-7) is true, then it must followthat qn(θˆGMM) →θ0 as well because of the identification assumption"
I think in the second line it should be qn(θˆGMM) → 0, not θ0.
Part 20: Selection [5/66]
I also have a questions about nonlinear GMM - which is more or less nonlinear IV technique I suppose.
I am running a panel non-linear regression (non-linear in the parameters) and I have L parameters and K exogenous variables with L>K.
In particular my model looks kind of like this: Y = b1*X^b2 + e, and so I am trying to estimate the extra b2 that don't usually appear in a regression.From what I am reading, to run nonlinear GMM I can use the K exogenous variables to construct the orthogonality conditions but what should I use for the extra, b2 coefficients?Just some more possible IVs (like lags) of the exogenous variables?I agree that by adding more IVs you will get a more efficient estimation, but isn't it only the case when you believe the IVs are truly uncorrelated with the error term?So by adding more "instruments" you are more or less imposing more and more restrictive assumptions about the model (which might not actually be true).
I am asking because I have not found sources comparing nonlinear GMM/IV to nonlinear least squares. If there is no homoscadesticity/serial correlation what is more efficient/give tighter estimates?
Part 20: Selection [6/66]
Part 20: Selection [7/66]
Dueling Selection Biases – From two emails, same day.
“I am trying to find methods which can deal with data that is non-randomised and data that is non-randomised and suffers from selection biassuffers from selection bias.”
“I explain the probability of answering questions using, among other independent variables, a variable which measures knowledge breadth. Knowledge breadth can be constructed only for those individuals that fill in a skill description in the company intranet. This is where the selection bias comes from.
Part 20: Selection [8/66]
The Crucial Element Selection on the unobservables
Selection into the sample is based on both observables and unobservables
All the observables are accounted for Unobservables in the selection rule also appear
in the model of interest (or are correlated with unobservables in the model of interest)
“Selection Bias”=the bias due to not accounting for the unobservables that link the equations.
Part 20: Selection [9/66]
A Sample Selection Model
Linear model 2 step ML – Murphy & Topel Binary choice application Other models
Part 20: Selection [10/66]
Canonical Sample Selection ModelRegression Equationy*=x +Sample Selection Mechanismd*=z +u; d=1[d* > 0] (probit)y = y* if d = 1; not observed otherwiseIs the sample 'nonrandomly selected?'E[y*|x,d=1] = x +E[ | x,d 1]
= x +E[ | x,u z ]
= x something if Cor[ ,u|x] 0A left out variable problem (again)Incidental truncation
Part 20: Selection [11/66]
Applications Labor Supply model:
y*=wage-reservation wage d=labor force participation
Attrition model: Clinical studies of medicines Survival bias in financial data Income studies – value of a college application Treatment effects Any survey data in which respondents self select
to report Etc…
Part 20: Selection [12/66]
Estimation of the Selection Model Two step least squares
Inefficient Simple – exists in current software Simple to understand and widely used
Full information maximum likelihood Efficient Simple – exists in current software Not so simple to understand – widely
misunderstood
Part 20: Selection [13/66]
Heckman’s Model
i i
i i i i
i i i2
i i
i i i i i i
i i i
y*= +d*= +u; d=1[d* > 0] (probit)y = y * if d = 1; not observed otherwise[ ,u]~Bivariate Normal[0,0, , ,1]E[y*|x ,d=1] = +E[ | x ,d 1] = +E[ | x ,u
i
i
i
i i
x βz γ
x βx β z γ
i
]( ) = ( )
= Least squares is biased and inconsistent again. Left out variable
ii
i
i
z γx β z γx β
Part 20: Selection [14/66]
Two Step Estimation
i i i i
i
Step 1: Estimate the probit modeld*= +u; d=1[d* > 0] (probit).
ˆ( )ˆˆ Estimation of by . Now compute ˆ( )Step 2: Estimate the regression model with estimated re
i
i
i
z γz γγ γ z γ
i i
i i i
i i i i i i
i
i i i
gressory*= +y = y* if d = 1; not observed otherwiseE[y*|x ,d=1] = +E[ | x ,d 1] =
ˆ Linearly regress y on x , . Step2a. Fix standard errors (Murphy
i
i
i
xβ
x βxβ
and Topel). Estimate ˆand using and /ne'e
The “LAMBDA”
Part 20: Selection [15/66]
FIML Estimation
id 0
2i i i2d 1 2
i i
2
d 0
2 2i i id
logL log
/1 log exp 22 1y
Let
1/ , =- / , =1-
logL log1 log exp y ( 1 ) ( y )22
i
i
i i
z
z
xβ
z
x δ z x δ
1
Note: no inverse Mills ratio appears anywhere in the model.
Part 20: Selection [16/66]
Classic Application Mroz, T., Married women’s labor supply,
Econometrica, 1987. N =753 N1 = 428
A (my) specification LFP=f(age,age2,family income, education,
kids) Wage=g(experience, exp2, education, city)
Two step and FIML estimation
Part 20: Selection [17/66]
Selection Equation+---------------------------------------------+| Binomial Probit Model || Dependent variable LFP || Number of observations 753 || Log likelihood function -490.8478 |+---------------------------------------------++--------+--------------+----------------+--------+--------+----------+|Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|+--------+--------------+----------------+--------+--------+----------+---------+Index function for probability Constant| -4.15680692 1.40208596 -2.965 .0030 AGE | .18539510 .06596666 2.810 .0049 42.5378486 AGESQ | -.00242590 .00077354 -3.136 .0017 1874.54847 FAMINC | .458045D-05 .420642D-05 1.089 .2762 23080.5950 WE | .09818228 .02298412 4.272 .0000 12.2868526 KIDS | -.44898674 .13091150 -3.430 .0006 .69588313
Part 20: Selection [18/66]
Heckman Estimator and MLE
Part 20: Selection [19/66]
Extension – Treatment Effect
i i i i
i i i2
i i
i i i i i i i
What is the value of an elite college education?d*= +u; d=1[d* > 0] (probit)y*= + d observed for everyone[ ,u]~Bivariate Normal[0,0, , ,1]E[y*|x ,d=1] = + d+E[ | x ,d 1]
i
i
i
z γx β
xβ
i i i
i
i i i i i i i
= + +E[ | x ,u ]( ) = + ( )
= +E[y*|x ,d=0] = + d+E[ | x ,d 0]
( =
i i
ii
i
i
i
ii
x β z γz γx β z γ
x βxβ
zx β
)( )
Least squares is still biased and inconsistent. Left out variablei
γz γ
Part 20: Selection [20/66]
Sample Selection
An approach modeled on Heckman's modelRegression Equation:Prob[y=j|x,u]=P(λ); λ=exp(xβ+θu)Selection Equation: d=1[zδ+ε>0] (The usual probit)[u,ε]~n[0,0,1,1,ρ] (Var[u] is
2
absorbed in θ)Estimation:Nonlinear Least Squares: [Terza (1998, see cite in text).]
Φ(zδ+ρ)E[y|x,d=1]=exp(xβ+θρ ) Φ(zδ)FIML using Hermite quadrature: [Greene (Stern wp, 97-02, 1997)]
Part 20: Selection [21/66]
Extensions – Binary Data
i i i i
i i i i
i i i
i i
Application: In German Health care data, insurance choicesd*= +u; d=1[d* > 0] (probit)y*= + ,y=1[y* > 0] (probit)y = y* if d = 1; not observed otherwise[ ,u]~Bivariate Normal[0,0,1,
i
i
z γxβ
,1]Estimation:(1) Two step? (Wooldridge text)(2) FIML (Stata, LIMDEP)
Part 20: Selection [22/66]
Panel Data and Selection
it i it
it it it
it
Selection equation with time invariant individual effectd 1[ 0] Observation mechanism: (y , ) observed when d 1Primary equation of interestCommon effects linear regression model y
itz γx
it i it
it it
i i
| (d 1)"Selectivity" as usual arises as a problem when the unobservablesare correlated; Corr( , ) 0.The common effects, and make matters worse.
itx β
Part 20: Selection [23/66]
Panel Data and Sample Selection Models: A Nonlinear Time Series
I. 1990-1992: Fixed and Random Effects Extensions
II. 1995 and 2005: Model Identification through Conditional Mean Assumptions
III. 1997-2005: Semiparametric Approaches based on Differences and Kernel Weights
IV. 2007: Return to Conventional Estimators, with Bias Corrections
Part 20: Selection [24/66]
Panel Data Sample Selection Models
i
it i it
it it i it
Ti t 1
Verbeek, Economics Letters, 1990.d 1[ w 0] y | (d 1) ; Proposed "marginal likelihood" based on joint normality
logL
it
it
z γ (Random effects probit)x β (Fixed effects regression)
i,1 it i,2it i,1 i,2 i,1 i,22 2
it
it it i
i,1 i,2
u d u(2d 1) f(u ,u )du du(1 d )
( / )d (y y ) ( )(Integrate out the random effects; difference out the fixed effects.)u ,u are tim
it it
it it i
z γ+
x x 'β
e invariant uncorrelated standard normal variablesHow to do the integration? Natural candidate for simulation.(Not mentioned in the paper. Too early.)[Verbeek and Nijman: Selectivity "test" based on this model, International Economic Review, 1992.]
Part 20: Selection [25/66]
Zabel – Economics Letters Inappropriate to have a mix of FE and RE
models Two part solution
Treat both effects as “fixed” Project both effects onto the group means of
the variables in the equations Resulting model is two random effects
equations Use both random effects
Part 20: Selection [26/66]
Selection with Fixed Effects
22
0
2
* , , ~ [0,1]* , , ~ [0,1]
( , ) ~ [(0,0), ( ,1, )].
( )
( / ) 1
it
it i it it i i i i
it i it it i i i i
it it
i it i i i id
it i i it
y w w Nd u v v N
u N
L v v dv
v
x xz z
z z
z z
21-
1 ( , )
it
iti i i id
it it it i i
v w dv dw
y w
x x
Part 20: Selection [27/66]
Practical Complications
The bivariate normal integration is actually the product of two univariate normals, because in the specification above, vi and wi are assumed to be uncorrelated. Vella notes, however, “… given the computational demands of estimating by maximum likelihood induced by the requirement to evaluate multiple integrals, we consider the applicability of available simple, or two step procedures.”
Part 20: Selection [28/66]
Simulation
The first line in the log likelihood is of the form Ev[d=0(…)] and the second line is of the form Ew[Ev[(…)(…)/]]. Using simulation instead, the simulated likelihood is
,1 0
, , ,1 1 2
, ,
1
( / )1 1 1
it
it
RSi it i i rr d
R it i i r it r it rr d
it r it it i i r
L vR
vR
y w
z z
z z
x x
Part 20: Selection [29/66]
Correlated Effects
Suppose that wi and vi are bivariate standard normal with correlation vw. We can project wi on vi and write
wi = vwvi + (1-vw2)1/2hi
where hi has a standard normal distribution. To allow the correlation, we now simply substitute this expression for wi in the simulated (or original) log likelihood, and add vw to the list of parameters to be estimated. The simulation is then over still independent normal variates, vi and hi.
Part 20: Selection [30/66]
Conditional Means
Part 20: Selection [31/66]
A Feasible Estimator
Part 20: Selection [32/66]
Estimation
Part 20: Selection [33/66]
Kyriazidou - Semiparametrics
1N Ni=1 i1 i2 i i i i=1 i1 i2 i i i
ii
Assume 2 periodsEstimate selection equation by FE logitUse first differences and weighted least squares:
ˆ ˆ = d d d d y
ˆw1ˆ K kernel function.h h
Use
x x x
with longer panels - any pairwise differencesExtensions based on pairwise differences by Rochina-Barrachina and Dustman/Rochina-Barrachina (1999)
Part 20: Selection [34/66]
Bias Corrections Val and Vella, 2007 (Working paper) Assume fixed effects
Bias corrected probit estimator at the first step
Use fixed probit model to set up second step Heckman style regression treatment.
Part 20: Selection [35/66]
Postscript What selection process is at work?
All of the work examined here (and in the literature) assumes the selection operates anew in each period
An alternative scenario: Selection into the panel, once, at baseline.
Why aren’t the time invariant components correlated? (Greene, 2007, NLOGIT development)
Other models All of the work on panel data selection assumes the main
equation is a linear model. Any others? Discrete choice? Counts?
Part 20: Selection [36/66]
Sample Selection
Part 20: Selection [37/66]
TECHNICAL EFFICIENCY ANALYSIS CORRECTING FOR BIASES FROM OBSERVED AND UNOBSERVED
VARIABLES: AN APPLICATION TO A NATURAL RESOURCE MANAGEMENT PROJECT
Empirical Economics: Volume 43, Issue 1 (2012), Pages 55-72
Boris Bravo-UretaUniversity of Connecticut
Daniel SolisUniversity of Miami
William GreeneNew York University
Part 20: Selection [38/66]
The MARENA Program in Honduras
Several programs have been implemented to address resource degradation while also seeking to improve productivity, managerial performance and reduce poverty (and in some cases make up for lack of public support).
One such effort is the Programa Multifase de Manejo de Recursos Naturales en Cuencas Prioritarias or MARENA in Honduras focusing on small scale hillside farmers.
Part 20: Selection [39/66]OVERALL CONCEPTUAL FRAMEWORKOVERALL CONCEPTUAL FRAMEWORK
Working HYPOTHESIS: if farmers receive private Working HYPOTHESIS: if farmers receive private benefits (higher income) from project activities (e.g., benefits (higher income) from project activities (e.g., training, financing) then adoption is likely to be training, financing) then adoption is likely to be sustainable and to generate positive externalities.sustainable and to generate positive externalities.
More Production and Productivity
More Farm
Income
Sustainability
Off-Farm Income
MARENA
Training &
Financing
Natural, Human &
Social Capital
Part 20: Selection [40/66]
COMPONENT I: Strengthening Strategic Management Capabilities among Govt. Institutions (central and local) COMPONENT II: Support to Nat. Res. Management. Projects
Module 1: Promotion and Organization Modulo 2: Strengthening Local Institutions & Organizations Module 3: Investment (farm, municipal & regional)
COMPONENT III: Administration and Supervision
The MARENA Program
Part 20: Selection [41/66]
Component II - Module 3 focused on promoting investments in sustainable production systems with a budget of US $7.6 million (Bravo-Ureta, 2009).
The major activities undertaken with beneficiaries: training in business management and sustainable farming practices; and the provision of funds to co-finance investment activities through local rural savings associations (cajas rurales).
Component II - Module 3
Part 20: Selection [42/66]
Rural poverty in Honduras, largely due to policy driven unsustainable land use => environmental degradation, productivity losses, food insecurity, growing climatic vulnerability (GEF-IFAD, 2002).
ConclusionsConclusions
Part 20: Selection [43/66]
Expected Impact Evaluation
Part 20: Selection [44/66]
Methods
A matched group of beneficiaries and control farmers is determined using Propensity Score Matching techniques to mitigate biases that would stem from selection on observed variables.
In addition, we deal with possible self-selection on unobservables arising from unobserved variables using a selectivity correction model for stochastic frontiers introduced by Greene (2010).
Part 20: Selection [45/66]
A Sample Selected SF Model
di = 1[ z′ i + hi > 0], hi ~ N[0,12]
yi = + x′ i + i, i ~ N[0,2]
(yi,xi) observed only when di = 1.
i = vi - ui
ui = u|Ui| where Ui ~ N[0,12]
vi = vVi where Vi ~ N[0,12].
(hi,vi) ~ N2[(0,1), (1, v, v2)]
Part 20: Selection [46/66]
Simulated logL for the Standard SF Model2 21
2exp[ ( |) / ]( | ,| |)2
i i u i v
i i iv
y |Uf y U xx
2 212
| |
exp[ ( |) / ]( | ) (| |) | |2
i
i i u i vi i i iU
v
y |Uf y p U d Uxx
2122exp[ | | ](| |) , |U | 0. (Half normal)
2
ii i
Up U
2 212
1
1 exp[ ( |) / ]( | ) 2
R i i u ir vi r
v
y |Uf yR
xx
2 212
=1 1
1 exp[ ( |) / ]log ( , , , ) = log2
N R i i u ir vS u v i r
v
y |ULR
x
This is simply a linear regression with a random constant term, αi = α - σu |Ui |
Part 20: Selection [47/66]
Likelihood For a Sample Selected SF Model
2 212
2
| |
| ( , , ,| |)
exp ( | |) / )
2 (1 ) ( )
( | |) / 1
| ( , , ) | ( , , ,| |) (| |) | |
i
i i i i i
i i u i v
vi i i
i i u i i
i i i i i i i i i i iU
f y d U
y U
d dy U
f y d f y d U f U d U
x z
x
zx z
x z x z
Part 20: Selection [48/66]
Simulated Log Likelihood for a Selectivity Corrected Stochastic Frontier Model
2 212
1 12
exp ( | |) / )
2
1 ( | |) /log ( , , , , , ) log 1
(1 ) ( )
i i u ir vi
v
N Ri i u ir iS u v i r
i i
y Ud
y ULR
d
x
x z
z
The simulation is over the inefficiency term.
Part 20: Selection [49/66]
JLMS Estimator of ui
2 212
2
1 1
1
ˆˆ ˆ ˆexp ( | |) / )
ˆ 2ˆˆˆ ˆ ˆ ˆ( | |) /
ˆ1
1 1ˆ ˆˆ ˆˆ = ( | |) ,
ˆˆ Estimator of [ | ] ˆ
ˆ
i i u ir v
vir
i i u ir v i
R Ri u ir ir i irr r
ii i i
i
Rirr
y U
fy U a
A U f B fR R
Au E uB
g
x
x
1
1
ˆˆ ˆ ˆ| | where , 1
ˆ
Riru ir ir irR r
irr
fU g gf
Part 20: Selection [50/66]
Closed Form for the Selection Model The selection model can be estimated without simulation “The stochastic frontier model with correction for sample selection revisited.” Lai, Hung-pin. Forthcoming, JPA
Based on closed skew normal distribution Similar to Maddala’s 1982 result for the linear selection model. See slide 42. Not more computationally efficient. Statistical properties identical. Suggested possibility that simulation chatter is an element of inefficiency in the maximum simulated likelihood estimator.
Part 20: Selection [51/66]
Spanish Dairy Farms: Selection based on being farm #1-125. 6 periods
The theory works.
Closed Form vs. Simulation
Part 20: Selection [52/66]
Variables Usedin the Analysis
Production
Participation
Part 20: Selection [53/66]
Findings from the First Wave
Part 20: Selection [54/66]
A Panel Data Model Selection takes place only at the baseline. There is no attrition.
0 0 0
0
0 0 0
1[ > 0] Sample Selector, 0,1,... Stochastic Frontier
Selection effect is exerted on ; Corr( , , )( , ) ( ) ( | )
C
i i i
it i it it it
i i i
it i i it i
d hy w v u t
w h wP y d P d P y d
zx
0
0 1 0 00
0 1 0 0 0
onditioned on the selection ( ) observations are independent.
( , ,..., | ) ( | )
I.e., the selection is acting like a permanent random effect.
( , ,..., , ) ( ) (
i
Ti i iT i it it
Ti i iT i i it
h
P y y y d P y d
P y y y d P d P y 0| )t id
Part 20: Selection [55/66]
Simulated Log Likelihood
,
2 212
1 1 00
2
log ( , , , , )
exp ( | |) / )
21log( | |) /
1
i
S C u v
it it u itr v
T vR
d r tit it u itr v i
L
y U
R y U a
x
x
Part 20: Selection [56/66]
Benefit group is more efficient in both years The gap is wider in the second year Both means increase from year 0 to year 1 Both variances decline from year 0 to year 1
Main Empirical Conclusions from Waves 0 and 1
Part 20: Selection [57/66]
Part 20: Selection [58/66]
Attrition
In a panel, t=1,…,T individual I leaves the sample at time Ki and does not return.
If the determinants of attrition (especially the unobservables) are correlated with the variables in the equation of interest, then the now familiar problem of sample selection arises.
Part 20: Selection [59/66]
Application of a Two Period Model “Hemoglobin and Quality of Life in Cancer
Patients with Anemia,” Finkelstein (MIT), Berndt (MIT), Greene
(NYU), Cremieux (Univ. of Quebec) 1998 With Ortho Biotech – seeking to change
labeling of already approved drug ‘erythropoetin.’ r-HuEPO
Part 20: Selection [60/66]
QOL Study Quality of life study
i = 1,… 1200+ clinically anemic cancer patients undergoing chemotherapy, treated with transfusions and/or r-HuEPO
t = 0 at baseline, 1 at exit. (interperiod survey by some patients was not used)
yit = self administered quality of life survey, scale = 0,…,100 xit = hemoglobin level, other covariates
Treatment effects model (hemoglobin level) Background – r-HuEPO treatment to affect Hg level
Important statistical issues Unobservable individual effects The placebo effect Attrition – sample selection FDA mistrust of “community based” – not clinical trial based
statistical evidence Objective – when to administer treatment for maximum
marginal benefit
Part 20: Selection [61/66]
Dealing with Attrition The attrition issue: Appearance for the second
interview was low for people with initial low QOL (death or depression) or with initial high QOL (don’t need the treatment). Thus, missing data at exit were clearly related to values of the dependent variable.
Solutions to the attrition problem Heckman selection model (used in the study)
Prob[Present at exit|covariates] = Φ(z’θ) (Probit model) Additional variable added to difference model i =
Φ(zi’θ)/Φ(zi’θ) The FDA solution: fill with zeros. (!)
Part 20: Selection [62/66]
An Early Attrition ModelHausman, J . and Wise, D., "Attrition Bias in Experimental and Panel Data: The Gary Income Maintenance Experiment," Econometrica, 1979.A two period model:Structural response model (Random Effects Regre
i1 i1 i1 i
i2 i2 i2 i
i2 i2 i2 i2 i2
i2 i2
2 2 212 i1 i i2 i u u
ssion)y uy uAttrition model for observation in the second period (Probit)z * y vz 1(z * 0)Endogeneity "problem"
Corr[ u , u] /( ) C
x βx β
x θ w α
i2 i2 i i2 i2 i i2 iorr[v , u] Corr[v ( u), u)
Part 20: Selection [63/66]
Methods of Estimating the Attrition Model
Heckman style “selection” model Two step maximum likelihood Full information maximum likelihood Two step method of moments estimators Weighting schemes that account for the
“survivor bias”
Part 20: Selection [64/66]
Selection Model
i2 i2 i i2 i i
i2 i2
i2 i2
i2 i2
Reduced form probit model for second period observation equationz * x ( ) w ( u v ) r hz 1(z * 0)Conditional means for observations observed in the second period
E[y | x
i2i2 i2 12
i2
i2i1 i1 i2 i1 12
i2
(r ),z 1] x ( ) (r )First period conditional means for observations observed in the second period
(r )E[y | x ,z 1] x ( ) (r )(1) Estimate probit equation(2) Combine these two equations with a period dummy variable, use OLS with a constructed regressor in the second period
( )THE TWO DISTURBANCES ARE CORRELATED. TREAT THIS IS A SUR MODEL. EQUIVALENT TO MDE
Part 20: Selection [65/66]
Maximum Likelihood
2i1 i1
i 2
22 i2 12 i1 12 12 i112 2 2
12i2
i2 i2 i22
i2
(y x )log2LogL log2 2[(y y ) (x x ) )]log log 1 2 (1 )
zr ( / )(y x ) log
1
(1 z ) log
i2 12 i1 i12
12
r ( / )(y x )1
(1) See H&W for FIML estimation(2) Use the invariance principle to reparameterize(3) Estimate separately and use a two step ML with Murphy and Topel correction of asymptotic covariance matrix.
Part 20: Selection [66/66]
Part 20: Selection [67/66]
A Model of Attrition Nijman and Verbeek, Journal of Applied
Econometrics, 1992 Consumption survey (Holland, 1984 –
1986) Exogenous selection for participation (rotating
panel) Voluntary participation (missing not at
random – attrition)
Part 20: Selection [68/66]
Attrition Model
i,t 0 i,t i i,t
i i i i i
i,t 0 i,t i i i,t
The main equationy , Random effects consumption function
u , Mundlak device; u uncorrelated with y u , Reduced form random effect
xx X
x x
it
it
it
s model
The selection mechanisma 1[individual i asked to participate in period t] Purely exogenous a may depend on observables, but does not depend on unobservablesr 1[individual i cho
it
it it
it
oses to participate if asked] Endogenous. r is the endogenous participation dummy variable a 0 r 0 a 1 the selection mechanism operates
Part 20: Selection [69/66]
Selection Equation
i,t 0 i,t i i i,t
it
it
The main equationy u , Reduced form random effects model
The selection mechanismr 1[individual i chooses to participate if asked] Endogenous. r is the endogenous p
x x
it it
it
it 0 i,t i i,t i i,t it
i
articipation dummy variable a 0 r 0 a 1 the selection mechanism operatesr 1[ v w 0] all observed if a 1
State dependence: may include r
x x zz
,t-1
2
i,t i
v
,t i i
Latent persistent unobserved heterogeneity: 0."Selection" arises Cov[ ,w ] 0 Cov[u ,v if or ] 0
Part 20: Selection [70/66]
Estimation Using One Wave Use any single wave as a cross section
with observed lagged values. Advantage: Familiar sample selection
model Disadvantages
Loss of efficiency “One can no longer distinguish between state
dependence and unobserved heterogeneity.”
Part 20: Selection [71/66]
One Wave Model
it 0 it i i it
it 0 it i 1 i,t 1 2 i,t 1 i it
i,t-1
A standard sample selection model.y x x (u )r 1[ x x r a (v w ) 0]
With only one period of data and r exogenous,
this is the Heckman sample selection mod
i,t-1 i
i,t-1
i,t-1
el.If > 0, then r is correlated with v and the Heckman
approach fails.An assumption is required:(1) Include r and assume no unobserved heterogeneity
(2) Exclude r and assume there is n
i it i it
o state dependence.
In either case, now if Cov[(u ),(v w )] we can use OLS.Otherwise, use the maximum likelihood estimator.
Part 20: Selection [72/66]
Maximum Likelihood Estimation See Zabel’s model in slides 20 and 23. Because numerical integration is required in
one or two dimensions for every individual in the sample at each iteration of a high dimensional numerical optimization problem, this is, though feasible, not computationally attractive. The dimensionality of the optimization is irrelevant This is much easier in 2015 than it was in 1992
(especially with simulation) The authors did the computations with Hermite quadrature.
Part 20: Selection [73/66]
Testing for Selection? Maximum Likelihood Results
Covariances were highly insignificant. LR statistic=0.46.
Two step results produced the same conclusion based on a Hausman test
ML Estimation results looked like the two step results.
Part 20: Selection [74/66]
A Dynamic Ordered Probit Model
Part 20: Selection [75/66]
Random Effects Dynamic Ordered Probit Model
Jit it j 1 j i,t 1 i i,t
i,t j-1 it j
i,t i,t
Jit,j it j it j 1 j i,t 1 i
Random Effects Dynamic Ordered Probit Model
h * x h ( j)
h j if < h * <
h ( j) 1 if h = j
P P[h j] ( x h ( j) )
i
i
Jj 1 it j 1 j i,t 1 i
Ji 0 j 1 1,j i,1 i i
TNit,j j ji=1 t 1
( x h ( j) )
Parameterize Random Effects
h ( j) x u
Simulation or Quadrature Based Estimation
lnL= ln P f( )d
Part 20: Selection [76/66]
A Study of Health Status in the Presence of Attrition
“THE DYNAMICS OF HEALTH IN THE BRITISH HOUSEHOLD PANEL SURVEY,”
Contoyannis, P., Jones, A., N. RiceJournal of Applied Econometrics, 19, 2004, pp. 473-503. Self assessed health British Household Panel Survey (BHPS)
1991 – 1998 = 8 waves About 5,000 households
Part 20: Selection [77/66]
Attrition
Part 20: Selection [78/66]
Testing for Attrition Bias
Three dummy variables added to full model with unbalanced panel suggest presence of attrition effects.
Part 20: Selection [79/66]
Attrition Model with IP Weights
Assumes (1) Prob(attrition|all data) = Prob(attrition|selected variables) (ignorability) (2) Attrition is an ‘absorbing state.’ No reentry. Obviously not true for the GSOEP data above.Can deal with point (2) by isolating a subsample of those present at wave 1 and the monotonically shrinking subsample as the waves progress.
Part 20: Selection [80/66]
Probability Weighting Estimators A Patch for Attrition (1) Fit a participation probit equation for
each wave. (2) Compute p(i,t) = predictions of
participation for each individual in each period. Special assumptions needed to make this work
Ignore common effects and fit a weighted pooled log likelihood: Σi Σt [dit/p(i,t)]logLPit.
Part 20: Selection [81/66]
Inverse Probability WeightingPanel is based on those present at WAVE 1, N1 individualsAttrition is an absorbing state. No reentry, so N1 N2 ... N8.Sample is restricted at each wave to individuals who were present atthe pre
1 , 1
1
vious wave.d = 1[Individual is present at wave t]. d = 1 i, d 0 d 0.
covariates observed for all i at entry that relate to likelihood of being present at subsequen
it
i it i t
i
xt waves.
(health problems, disability, psychological well being, self employment, unemployment, maternity leave, student, caring for family member, ...)Probit model for d 1[it i x 1
1
ˆ], t = 2,...,8. fitted probability.ˆ ˆAssuming attrition decisions are independent, P
dˆInverse probability weight WP̂
Weighted log likelihood logL log (No common
it it
tit iss
itit
it
W it
w
L
8
1 1 effects.)N
i t
Part 20: Selection [82/66]
Spatial Autocorrelation in a Sample Selection Model
Alaska Department of Fish and Game. Pacific cod fishing eastern Bering Sea – grid of locations Observation = ‘catch per unit effort’ in grid square Data reported only if 4+ similar vessels fish in the region 1997 sample = 320 observations with 207 reported full data
Flores-Lagunes, A. and Schnier, K., “Sample selection and Spatial Dependence,” Journal of Applied Econometrics, 27, 2, 2012, pp. 173-204.
Part 20: Selection [83/66]
Spatial Autocorrelation in a Sample Selection Model
• LHS is catch per unit effort = CPUE• Site characteristics: MaxDepth, MinDepth, Biomass• Fleet characteristics:
Catcher vessel (CV = 0/1) Hook and line (HAL = 0/1) Nonpelagic trawl gear (NPT = 0/1) Large (at least 125 feet) (Large = 0/1)
Flores-Lagunes, A. and Schnier, K., “Sample selection and Spatial Dependence,” Journal of Applied Econometrics, 27, 2, 2012, pp. 173-204.
Part 20: Selection [84/66]
Spatial Autocorrelation in a Sample Selection Model*1 0 1 1 1 1 1
*2 0 2 2 2 2 2
21 1 12
122 12 2
*1 1
2
0~ , , (?? 1??)
0
Observation Mechanism
1 > 0 Probit Model
i i i i ij j ij i
i i i i ij j ij i
i
i
i i
i
y u u c u
y u u c u
N
y y
y
x
x
*2 1 if = 1, unobserved otherwise.i iy y
Part 20: Selection [85/66]
Spatial Autocorrelation in a Sample Selection Model
1 1 1
1 (1)1 1 1 2
2* (1) 2 (1)1 0 1 1 1 1 1 1
* (2) 2 (2 0 2 2 2 1 1
= Spatial weight matrix, 0.
[ ] = , likewise for
( ) , Var[ ] ( )
( ) , Var[ ] ( )
ii
N Ni i ij i i ijj j
Ni i ij i i ijj
y u
y u
u CuC C
u I C u
x
x
22)1
(1) (2)1 2 12 1
Cov[ , ] ( ) ( )
N
j
Ni i ij ijj
u u
Part 20: Selection [86/66]
Spatial Weights
2
1 ,
Euclidean distance
Band of 7 neighbors is usedRow standardized.
ijij
ij
cd
d
Part 20: Selection [87/66]
Two Step Estimation
0 122 (1)(1) (2)
1 11
2(1) 1 0 1
22 (1)1 1
Probit estimated by Pinske/Slade GMM
( )( ) ( )
( )
( )
Spatial regression with included IMR i
i
NNijij ij jj
iN
ijj i
Nijj
x
x
n second step(*) GMM procedure combines the two steps in one large estimation.
Part 20: Selection [88/66]