Econometric Methods and Applications I, Lecture 4
Non-linear regression models
a) The probit and tobit models as examples
b) Interpretation of the models
c) Relevant estimation methods (ML)
d) Final considerations about regressions
Literature: Wooldridge (2002, 2010), Ch. 15.1-15.4, 15.6, 17.1-17.4
Econometric Methods and Applications I, Lecture 4, Slide 2
Introduction (1)
> If the CIA is valid, then the causal parameter in the simplest binary D framework is obtained by estimating
$$E(Y \mid X = x, D = 1) - E(Y \mid X = x, D = 0)$$
and aggregating over the appropriate distribution of X
> Now, we consider the case in which a linear approximation of these conditional expectations is inadequate
Introduction (2)
> Example
• True conditional expectation of Y: $E(Y \mid X = x) = g(x, \theta)$
• Function used in estimation by linear regression: $x\beta$
• Implied error term by function used in estimation:
$$U = Y - X\beta \;\Rightarrow\; E(Y - X\beta \mid X = x) = E(Y \mid X = x) - x\beta = g(x, \theta) - x\beta \neq 0$$
> The implied error term is correlated with the variables used in estimation ⇒ OLS/FGLS is inconsistent
> Non-linear models may lead to a better approximation of $g(x, \theta)$ and may avoid inconsistent estimation
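A small numerical illustration of the implied error term, as a sketch only: the quadratic function g(x) = x² and the five grid points are hypothetical choices, not taken from the lecture. A straight line is fitted by least squares to the true conditional expectation, and the conditional mean of the implied error, g(x) minus the fitted line, is computed at each x.

```python
# Hypothetical example: the true conditional expectation is g(x) = x**2,
# but a straight line a + b*x is fitted by least squares.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
g = [x ** 2 for x in xs]  # true E(Y | X = x) on the grid

x_bar = sum(xs) / len(xs)
g_bar = sum(g) / len(g)
b = (sum((x - x_bar) * (y - g_bar) for x, y in zip(xs, g))
     / sum((x - x_bar) ** 2 for x in xs))
a = g_bar - b * x_bar

# Conditional mean of the implied error term at each x: g(x) - (a + b*x)
implied_error_mean = [y - (a + b * x) for x, y in zip(xs, g)]

print(a, b)                # -2.0 4.0
print(implied_error_mean)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The implied error has a conditional mean that is nonzero and varies systematically with x, which is the misspecification described above.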
Probit model (1-1)
> Example 1: With a binary outcome variable, linear regression is generally not attractive, because
$$E(Y \mid X = x) = 1 \times P(Y = 1 \mid X = x) + 0 \times P(Y = 0 \mid X = x) = P(Y = 1 \mid X = x)$$
• A probability is bounded between zero and one
• The probability (usually) corresponds to a cumulative distribution function (details later), which is generally not linear in its arguments (exception: uniform distribution)
Probit model (1-2)
Probit model (2)
> Model for $P(Y = 1 \mid X = x)$ based on a linear index model and a non-linear link function:
$$Y = 1(X\beta + U > 0) \;\Rightarrow$$
$$\begin{aligned}
E(Y \mid X = x) &= P(Y = 1 \mid X = x) \\
&= P(X\beta + U > 0 \mid X = x) \\
&= P(U > -X\beta \mid X = x) \\
&= 1 - P(U \le -X\beta \mid X = x) \\
&= 1 - F(-x\beta)
\end{aligned}$$
> F(.) denotes the cdf of U
Probit model (3)
> By making different distributional assumptions concerning F,
we obtain different models (logit, probit, etc.)
> Only in one special case (U is distributed uniformly in a fixed
interval) is F(.) a linear function (linear probability model)
> Generally, this expression simplifies for symmetric distributions:
$$E(Y \mid X = x) = 1 - F(-x\beta) \overset{U\ \text{symmetric}}{=} F(x\beta)$$
Probit model (4)
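> By assuming that U is normally distributed with mean zero and variance $\sigma^2$, we obtain the probit model:
$$E(Y \mid X = x) = \Phi\!\left(\frac{x\beta}{\sigma}\right), \qquad \Phi(a): \text{cdf of the standard normal distribution}$$
Remember: $A \sim N(0, \sigma^2) \;\Rightarrow\; P(A \le a) = \Phi\!\left(\frac{a}{\sigma}\right)$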
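The probit probability can be checked against a simulation of the index model. This is a minimal sketch under stated assumptions: the index value 0.8, σ = 2, the seed and the number of draws are hypothetical, and `probit_probability` is an illustrative helper name. It also checks the symmetry identity 1 − F(−a) = F(a) for the normal cdf.

```python
import random
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: cdf is Phi, pdf is phi

def probit_probability(index, sigma=1.0):
    """P(Y = 1 | X = x) = Phi(x*beta / sigma) for U ~ N(0, sigma^2)."""
    return std_normal.cdf(index / sigma)

# Hypothetical value of the linear index x*beta and of sigma.
index, sigma = 0.8, 2.0

# Simulate Y = 1(x*beta + U > 0) and record the share of ones.
rng = random.Random(0)
draws = 100_000
share_ones = sum(index + rng.gauss(0.0, sigma) > 0 for _ in range(draws)) / draws

print(probit_probability(index, sigma))  # analytic probability
print(share_ones)                        # simulated frequency of Y = 1

# Symmetry of the normal distribution: 1 - F(-a) = F(a)
a = 0.4
symmetry_gap = (1 - std_normal.cdf(-a)) - std_normal.cdf(a)
```

The two printed numbers should agree up to simulation noise.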
Probit model (5)
> General identification problem of binary choice models
• The following two models lead to the same dependent variable:
$$Y = 1(X\beta + U > 0) \quad \text{and} \quad Y = 1(X\beta\sigma + U\sigma > 0); \quad \sigma > 0.$$
• Therefore, it is impossible to distinguish the models empirically
• Some (convenient) normalisation is needed, usually $\sigma = 1$
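The identification problem can be made concrete with a few lines of code. In this sketch the index values, the error draws and the scale factors are all hypothetical. The scale factors are powers of two so that the floating-point comparison is exact.

```python
import random

rng = random.Random(0)
# Hypothetical index values x*beta and error draws u.
indices = [rng.uniform(-2, 2) for _ in range(1000)]
errors = [rng.gauss(0.0, 1.0) for _ in range(1000)]

def outcomes(scale):
    """Binary outcomes from Y = 1(x*beta*scale + u*scale > 0)."""
    return [int(xb * scale + u * scale > 0) for xb, u in zip(indices, errors)]

baseline = outcomes(1.0)
same_for_all_scales = all(outcomes(s) == baseline for s in (0.5, 2.0, 8.0))
print(same_for_all_scales)  # True
```

Every positive rescaling of (β, σ) generates exactly the same observed Y, so the data cannot separate the rescaled models and a normalisation such as σ = 1 is needed.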
Tobit model (1-1)
> Second example for a non-linear model: Tobit
• Motivation: Some dependent variables cannot fall below (rise
above) some threshold
− e.g. earnings cannot be negative
> Again, modelling is based on a latent linear index:
$$Y = 1(X\beta + U > 0)\,\underbrace{(X\beta + U)}_{Y^*}$$
Tobit model (1-2)
Tobit model (2)
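> Assume U is normally distributed with mean 0 and variance $\sigma^2$
> Derivation of E(Y|X) is somewhat complex (Wooldridge, 2002, Ch. 16.2):
$$E(Y \mid X = x) = \Phi\!\left(\frac{x\beta}{\sigma}\right) x\beta + \sigma\,\phi\!\left(\frac{x\beta}{\sigma}\right), \qquad \phi(a): \text{pdf of the standard normal distribution}$$
> Consider only the subpopulation having positive values of y:
$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\!\left(\frac{x\beta}{\sigma}\right)}{\Phi\!\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\!\left(\frac{x\beta}{\sigma}\right)$$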
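The two Tobit conditional expectations are linked by E(Y|X = x) = P(Y > 0|X = x) · E(Y|X = x, Y > 0), because Y equals zero otherwise. The sketch below evaluates both formulas and checks this link numerically. The function names and the values of xβ and σ are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()

def inverse_mills(a):
    """lambda(a) = phi(a) / Phi(a)."""
    return std_normal.pdf(a) / std_normal.cdf(a)

def tobit_mean(index, sigma):
    """E(Y | X = x) = Phi(xb/sigma) * xb + sigma * phi(xb/sigma)."""
    a = index / sigma
    return std_normal.cdf(a) * index + sigma * std_normal.pdf(a)

def tobit_mean_positive(index, sigma):
    """E(Y | X = x, Y > 0) = xb + sigma * lambda(xb/sigma)."""
    return index + sigma * inverse_mills(index / sigma)

index, sigma = 1.0, 2.0  # hypothetical values of x*beta and sigma
prob_positive = std_normal.cdf(index / sigma)  # P(Y > 0 | X = x)

lhs = tobit_mean(index, sigma)
rhs = prob_positive * tobit_mean_positive(index, sigma)
print(lhs, rhs)  # the two numbers coincide up to rounding
```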
Tobit model (3)
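> Estimation in the complete sample
$$E(Y \mid X = x) = \Phi\!\left(\frac{x\beta}{\sigma}\right) x\beta + \sigma\,\phi\!\left(\frac{x\beta}{\sigma}\right)$$
> OLS is inconsistent because of the neglected nonlinearity $\Phi\!\left(\frac{x\beta}{\sigma}\right)$ and the omitted variable $\phi\!\left(\frac{x\beta}{\sigma}\right)$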
Tobit model (4)
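> Estimation in the subsample with Y > 0
$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\!\left(\frac{x\beta}{\sigma}\right)}{\Phi\!\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\!\left(\frac{x\beta}{\sigma}\right)$$
> OLS in the population with positive Y is inconsistent because of the omitted variable
$$\lambda\!\left(\frac{x\beta}{\sigma}\right) = \frac{\phi\!\left(\frac{x\beta}{\sigma}\right)}{\Phi\!\left(\frac{x\beta}{\sigma}\right)}$$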
Effects of interest (1)
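> In case of a binary treatment D, we want to compute
$$E[\theta(x) \mid D = d] = E\big[\underbrace{E(Y \mid D = 1, X = x) - E(Y \mid D = 0, X = x)}_{\theta(x)} \,\big|\, D = d\big]$$
> Assume the simplest model without D x X interactions:
$$Y^* = X\beta + D\gamma + U; \qquad U \sim N(0, \sigma^2)$$
Probit:
$$\theta^{Probit}(x) = \Phi\!\left(\frac{x\beta + \gamma}{\sigma}\right) - \Phi\!\left(\frac{x\beta}{\sigma}\right)$$
Tobit:
$$\theta^{Tobit}(x) = \Phi\!\left(\frac{x\beta + \gamma}{\sigma}\right)(x\beta + \gamma) + \sigma\,\phi\!\left(\frac{x\beta + \gamma}{\sigma}\right) - \Phi\!\left(\frac{x\beta}{\sigma}\right) x\beta - \sigma\,\phi\!\left(\frac{x\beta}{\sigma}\right)$$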
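The effect formulas for a binary D translate directly into code. The sketch below uses hypothetical helper names and hypothetical values for xβ, γ and σ. It evaluates θ(x) at three index values to show that the same γ implies different effect sizes depending on x.

```python
from statistics import NormalDist

std_normal = NormalDist()

def tobit_mean(index, sigma):
    """Phi(index/sigma) * index + sigma * phi(index/sigma)."""
    a = index / sigma
    return std_normal.cdf(a) * index + sigma * std_normal.pdf(a)

def theta_probit(xb, gamma, sigma=1.0):
    """Effect of switching D from 0 to 1 on P(Y = 1 | X = x)."""
    return std_normal.cdf((xb + gamma) / sigma) - std_normal.cdf(xb / sigma)

def theta_tobit(xb, gamma, sigma):
    """Effect of switching D from 0 to 1 on E(Y | X = x) in the Tobit model."""
    return tobit_mean(xb + gamma, sigma) - tobit_mean(xb, sigma)

gamma, sigma = 0.5, 1.0  # hypothetical coefficient of D and error std. dev.
for xb in (-2.0, 0.0, 2.0):  # hypothetical values of x*beta
    print(xb, round(theta_probit(xb, gamma, sigma), 4),
          round(theta_tobit(xb, gamma, sigma), 4))
```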
Effects of interest (2)
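> If D is continuous, we may be more interested in how the conditional expectation changes for very small changes in D
Probit:
$$\frac{\partial E(Y \mid D = d, X = x)}{\partial d} = \phi\!\left(\frac{x\beta + d\gamma}{\sigma}\right)\frac{\gamma}{\sigma}$$
Tobit:
$$\frac{\partial E(Y \mid D = d, X = x)}{\partial d} = \Phi\!\left(\frac{x\beta + d\gamma}{\sigma}\right)\gamma$$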
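The two derivative formulas can be verified numerically with a central finite difference of the corresponding conditional expectation. This is only a sanity check under assumed values: xβ = 0.3, γ = 0.7, σ = 1.5 and d = 1.2 are hypothetical, as are the function names.

```python
from statistics import NormalDist

std_normal = NormalDist()
xb, gamma, sigma = 0.3, 0.7, 1.5  # hypothetical values

def probit_cef(d):
    """E(Y | D = d, X = x) in the probit model."""
    return std_normal.cdf((xb + d * gamma) / sigma)

def tobit_cef(d):
    """E(Y | D = d, X = x) in the Tobit model."""
    index = xb + d * gamma
    return std_normal.cdf(index / sigma) * index + sigma * std_normal.pdf(index / sigma)

def probit_marginal(d):
    """Analytic derivative: phi((xb + d*gamma)/sigma) * gamma / sigma."""
    return std_normal.pdf((xb + d * gamma) / sigma) * gamma / sigma

def tobit_marginal(d):
    """Analytic derivative: Phi((xb + d*gamma)/sigma) * gamma."""
    return std_normal.cdf((xb + d * gamma) / sigma) * gamma

def central_difference(f, d, h=1e-5):
    """Numerical derivative of f at d."""
    return (f(d + h) - f(d - h)) / (2 * h)

d = 1.2
probit_gap = abs(central_difference(probit_cef, d) - probit_marginal(d))
tobit_gap = abs(central_difference(tobit_cef, d) - tobit_marginal(d))
print(probit_gap, tobit_gap)  # both should be close to zero
```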
Effects of interest (3)
> The coefficient $\gamma$ is informative about the sign of the effect, but not about its magnitude, which also depends on the other coefficients and control variables (this is different in the linear regression model)
Estimation (1)
> Minimizing the squared deviation between actual and
predicted individual outcomes (least squares principle)
• is not efficient (due to the implied heteroscedasticity)
> Probit and Tobit models are usually estimated by maximum likelihood or the generalized method of moments
• both these estimation methods will be discussed in more detail in
Econometric Methods and Applications II
Estimation (2)
> Basic idea of Maximum Likelihood (ML)
• Choose the unknown coefficients in such a way that the observed
sample is most likely to come from an underlying population
described by the chosen values of the coefficients
> Properties of ML
• When the model is correctly specified and some further regularity
conditions are met, ML is consistent, asymptotically efficient,
and asymptotically normally distributed
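To make the ML idea concrete, the following sketch fits a probit model with an intercept and one regressor to simulated data. Several things are assumptions and not part of the lecture: the data-generating values (0.5 and 1.0), the normalisation σ = 1, the use of Fisher scoring as the optimiser, and all function names. A real application would rely on an established statistics package.

```python
import math
import random
from statistics import NormalDist

std_normal = NormalDist()

def clipped_probability(z):
    """Phi(z), kept away from 0 and 1 so that logs and ratios stay finite."""
    return min(max(std_normal.cdf(z), 1e-10), 1 - 1e-10)

def log_likelihood(a, b, xs, ys):
    """Probit log-likelihood for P(Y = 1 | x) = Phi(a + b*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = clipped_probability(a + b * x)
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total

def probit_ml(xs, ys, iterations=15):
    """Maximise the log-likelihood by Fisher scoring (a Newton-type method)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            z = a + b * x
            p = clipped_probability(z)
            dens = std_normal.pdf(z)
            score = dens * (y - p) / (p * (1 - p))  # d log-lik / d index
            info = dens * dens / (p * (1 - p))      # expected information
            g0 += score
            g1 += score * x
            h00 += info
            h01 += info * x
            h11 += info * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det  # solve the 2x2 system H * step = g
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Simulated data with hypothetical true coefficients (sigma normalised to 1).
rng = random.Random(0)
a_true, b_true = 0.5, 1.0
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ys = [int(a_true + b_true * x + rng.gauss(0.0, 1.0) > 0) for x in xs]

a_hat, b_hat = probit_ml(xs, ys)
print(a_hat, b_hat)  # should be close to a_true and b_true
print(log_likelihood(a_hat, b_hat, xs, ys), log_likelihood(0.0, 0.0, xs, ys))
```

The log-likelihood at the estimates is higher than at the starting values, and the estimates lie close to the values used to generate the data.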
Estimation (3)
> Basic idea of the Generalized Method of Moments (GMM)
• The model implies that the residual V (not U) has conditional expectation 0:
$$E\big[\underbrace{Y - E(Y \mid X = x)}_{V} \,\big|\, X = x\big] = E(V \mid X = x) = 0; \qquad \text{Probit: } V = Y - \Phi\!\left(\frac{X\beta}{\sigma}\right)$$
• Thus, it is uncorrelated with all functions of X. This defines a set of moment conditions (equalities) that hold in the population for the true parameters
• Choose the parameters such that the sample analogues of those moments (very often mean functions) come as close as possible to fulfilling the same conditions in the sample
• Under correct specification, GMM is (usually) $\sqrt{N}$-consistent and asymptotically normally distributed
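The moment conditions behind GMM can be illustrated by evaluating their sample analogues. In this sketch the data are simulated from a probit model with hypothetical coefficients (σ normalised to 1), and only two moments are used, E(V) = 0 and E(V·X) = 0. The function name `sample_moments` is illustrative. The block evaluates the moments and does not perform the GMM minimisation itself.

```python
import random
from statistics import NormalDist

std_normal = NormalDist()

# Simulated probit data with hypothetical true coefficients.
rng = random.Random(0)
a_true, b_true = 0.5, 1.0
xs = [rng.gauss(0.0, 1.0) for _ in range(20000)]
ys = [int(a_true + b_true * x + rng.gauss(0.0, 1.0) > 0) for x in xs]

def sample_moments(a, b):
    """Sample analogues of E(V) and E(V*X) with V = Y - Phi(a + b*X)."""
    v = [y - std_normal.cdf(a + b * x) for x, y in zip(xs, ys)]
    n = len(v)
    return sum(v) / n, sum(vi * x for vi, x in zip(v, xs)) / n

at_truth = sample_moments(a_true, b_true)
at_wrong = sample_moments(a_true, 0.0)  # slope wrongly set to zero

print(at_truth)  # both moments close to zero
print(at_wrong)  # the moment involving X is clearly away from zero
```

A GMM estimator would search for the parameter values that bring these sample moments as close to zero as possible.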
Estimation (4)
> Probit and Tobit models are usually estimated by ML
> Tobit model: There is a particularly simple 2-step GMM
estimator when considering observations with positive y:
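$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\!\left(\frac{x\beta}{\sigma}\right)}{\Phi\!\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\!\left(\frac{x\beta}{\sigma}\right)$$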
Estimation (5)
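$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\lambda\!\left(\frac{x\beta}{\sigma}\right)$$
> 1st step:
• Estimate a probit model to obtain consistent estimates of $\beta/\sigma$
• Use them to compute a consistent estimate of $\lambda\!\left(\frac{x\beta}{\sigma}\right)$ for every observation:
$$\hat\lambda_i = \lambda\!\left(x_i\,\widehat{\tfrac{\beta}{\sigma}}\right) = \frac{\phi\!\left(x_i\,\widehat{\tfrac{\beta}{\sigma}}\right)}{\Phi\!\left(x_i\,\widehat{\tfrac{\beta}{\sigma}}\right)}$$
> 2nd step
• Use $\hat\lambda_i$ as additional regressor in regression of Y on X (Heckit)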
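The two steps can be sketched on simulated Tobit data. Everything specific here is an assumption made for illustration: the true values (intercept 1, slope 1, σ = 1), the uniform regressor, the Fisher-scoring probit routine and the small pure-Python linear algebra helpers. It is not the lecture's implementation, and it does not compute standard errors, which would have to account for the estimated regressor.

```python
import random
from statistics import NormalDist

std_normal = NormalDist()

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def weighted_ls(rows, targets, weights):
    """Coefficients solving the weighted least-squares normal equations."""
    k = len(rows[0])
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for row, t, w in zip(rows, targets, weights):
        for i in range(k):
            xty[i] += w * row[i] * t
            for j in range(k):
                xtx[i][j] += w * row[i] * row[j]
    return solve(xtx, xty)

def probit_fit(rows, ys, iterations=15):
    """Probit ML by Fisher scoring; returns estimates of beta / sigma."""
    coef = [0.0] * len(rows[0])
    for _ in range(iterations):
        targets, weights = [], []
        for row, y in zip(rows, ys):
            z = sum(c * v for c, v in zip(coef, row))
            p = min(max(std_normal.cdf(z), 1e-10), 1 - 1e-10)
            dens = std_normal.pdf(z)
            targets.append((y - p) / dens)
            weights.append(dens * dens / (p * (1 - p)))
        step = weighted_ls(rows, targets, weights)
        coef = [c + s for c, s in zip(coef, step)]
    return coef

# Simulated Tobit data: Y = max(0, a + b*x + U), hypothetical true values.
rng = random.Random(1)
a_true, b_true, sigma_true = 1.0, 1.0, 1.0
xs = [rng.uniform(-2.0, 2.0) for _ in range(20000)]
ys = [max(0.0, a_true + b_true * x + rng.gauss(0.0, sigma_true)) for x in xs]
rows = [[1.0, x] for x in xs]

# 1st step: probit of 1(Y > 0) on (1, x) estimates beta / sigma.
index_coef = probit_fit(rows, [int(y > 0) for y in ys])

# 2nd step: OLS of Y on (1, x, lambda_hat) using observations with Y > 0.
pos_rows, pos_ys = [], []
for row, y in zip(rows, ys):
    if y > 0:
        z = sum(c * v for c, v in zip(index_coef, row))
        pos_rows.append(row + [std_normal.pdf(z) / std_normal.cdf(z)])
        pos_ys.append(y)
a_hat, b_hat, sigma_hat = weighted_ls(pos_rows, pos_ys, [1.0] * len(pos_ys))
print(index_coef, (a_hat, b_hat, sigma_hat))
```

The coefficient on the added regressor estimates σ, while the coefficients on the constant and x estimate β.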
Computing the effects of interest (1)
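$$\hat\theta^{Probit}(x_i) = \Phi\!\left(x_i\,\widehat{\tfrac{\beta}{\sigma}} + \widehat{\tfrac{\gamma}{\sigma}}\right) - \Phi\!\left(x_i\,\widehat{\tfrac{\beta}{\sigma}}\right)$$
$$\hat\theta^{Tobit}(x_i) = \Phi\!\left(\frac{x_i\hat\beta + \hat\gamma}{\hat\sigma}\right)(x_i\hat\beta + \hat\gamma) + \hat\sigma\,\phi\!\left(\frac{x_i\hat\beta + \hat\gamma}{\hat\sigma}\right) - \Phi\!\left(\frac{x_i\hat\beta}{\hat\sigma}\right) x_i\hat\beta - \hat\sigma\,\phi\!\left(\frac{x_i\hat\beta}{\hat\sigma}\right)$$
$$ATE = \frac{1}{N}\sum_{i=1}^{N} \hat\theta(x_i)$$
$$ATET = \frac{1}{N_1}\sum_{i=1}^{N} d_i\,\hat\theta(x_i)$$
$$ATENT = \frac{1}{N - N_1}\sum_{i=1}^{N} (1 - d_i)\,\hat\theta(x_i)$$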
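The averaging step is simple once the individual effects are available. The sketch below uses the probit effect formula with made-up inputs: the coefficient values stand in for probit estimates of β/σ (intercept and slope) and γ/σ, and the six observations are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical probit estimates of beta/sigma (intercept, slope) and gamma/sigma.
b0, b1, g = -0.2, 0.6, 0.4
xs = [-1.0, 0.0, 0.5, 1.0, 2.0, 3.0]  # hypothetical covariate values x_i
ds = [0, 0, 1, 0, 1, 1]               # hypothetical treatment indicators d_i

# Individual effects theta_hat(x_i) = Phi(index_i + g) - Phi(index_i)
theta = [std_normal.cdf(b0 + b1 * x + g) - std_normal.cdf(b0 + b1 * x) for x in xs]

n = len(xs)
n1 = sum(ds)  # number of treated observations
ate = sum(theta) / n
atet = sum(d * t for d, t in zip(ds, theta)) / n1
atent = sum((1 - d) * t for d, t in zip(ds, theta)) / (n - n1)
print(ate, atet, atent)
```

By construction the ATE is the weighted average of ATET and ATENT with weights N1/N and (N − N1)/N.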
Computing the effects of interest (3)
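> The same logic applies to a continuous D
> Averaging is over the various populations as before
> Most statistical software packages also provide the values of these derivatives or discrete changes of D (and other X) for a particular value of D and X, usually the sample mean
Probit:
$$\frac{\partial \hat E(Y \mid D = d, X = x)}{\partial d} = \phi\!\left(x\,\widehat{\tfrac{\beta}{\sigma}} + d\,\widehat{\tfrac{\gamma}{\sigma}}\right)\widehat{\tfrac{\gamma}{\sigma}}$$
Tobit:
$$\frac{\partial \hat E(Y \mid D = d, X = x)}{\partial d} = \Phi\!\left(\frac{x\hat\beta + d\hat\gamma}{\hat\sigma}\right)\hat\gamma$$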
Final considerations about regressions (1)
> Linear and non-linear regressions are tools to remove differences in the outcome variables due to observable variables
> Whether this is enough to uncover causal effects depends on the (non-)existence of other (unobservable) differences also related to selection (i.e. other factors influencing D and Y)
> Regressions uncover causal effects if
• conditional expectations are of linear or non-linear known form:
$$E\big[Y - E(Y \mid X = x) \mid X = x\big] = E\big[Y - g(X, \theta) \mid X = x\big] = E(U \mid X = x) = 0$$
• CIA holds
Final considerations about regressions (2)
> Regressions for causal inference
• If effect heterogeneity is expected, include (enough) interaction terms D x X
• Always check whether a coefficient has an interpretation as an effect or whether more complex calculations are required, in particular in models with D x X interactions and in non-linear models
• If unsure about non-linearities, or if substantial effect heterogeneity of unknown form is expected, use more flexible semi- or non-parametric methods (as discussed in the course “Flexible estimation in practice”)!
Final considerations about regressions (3)
> Regressions for causal inference: Fatal mistakes
• Bad controls
− Conditioning on variables influenced by D (simultaneous equation bias)
− Controls measured with error related to D (measurement error bias)
• Missing variables
− Variables related to Y and D are not in the data (omitted variable bias)
• Specification error of the conditional expectation functions acts like a
missing variable (or a measurement error)
− Misspecified bit of true regression becomes part of error term and may
violate E(U|X)=0 condition