part i: discrete choice models (theory and applications)binary outcomes ordinal outcomes multinomial...

98
Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias Universidad Cat´olica del Norte Workshop SOCHER 2017 Fondecyt Project N 11160104, Individual-specific inference for choice models October 5, 2017

Upload: others

Post on 25-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Part I: Discrete Choice Models (Theory andApplications)

Mauricio Sarrias

Universidad Catolica del NorteWorkshop SOCHER 2017

Fondecyt Project N 11160104, Individual-specific inference for choice models

October 5, 2017

Page 2: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 3: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Outline of this workshop

Part I: Discrete choice models (Theory and Applications):Binary outcomes.Ordered outcomes.Multinomial Logit Model.

Part II: Discrete choice models and individual-heterogeneity(Theory and Applications):

Binary outcomes.Multinomial Logit Model.

Page 4: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 5: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Binary Dependent Variable

We will study method for estimating model with binary dependentvariable:

yi ={

1 if some event occurs0 if the even does not occurs

Some examples:Is an adult a member of the labor force?Did a citizen vote in the last election?Does a high school student decide to go to college?Is a consumer more likely to buy the same brand or try a newbrand?Does the individual migrate?

Page 6: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Binary Dependent Variable

So, the question is: How to estimate a model then thedependent variable is binary?

The first approach is to apply OLS as if the dependent variable iscontinuous.

Page 7: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

What if OLS ... ?

The linear probability model is the regression model applied to abinary dependent variable. The structural model is:

yi = x′iβ + εi,

where

yi ={

1 if some event occurs0 if the even does not occurs

When y is a binary random variable, then:

E (yi|xi) = [1× Pr (yi = 1|xi)]+[0× Pr (yi = 0|xi)] = Pr (yi = 1|xi) = x′iβ

Page 8: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Linear Probability ModelFor a Single Independent Variable

Page 9: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Problems with the LPM

Heterokedasticity: If a binary random variable has mean µ,then its variance is µ(1− µ). Then:

Var (yi|xi) = Pr (yi = 1|xi) [1− Pr (yi = 1|xi)] = x′iβ (1− x′iβ) ,

which implies that the variance of the errors depend on the x’sand is not constant.Nonsensical Predictions: The LPM predicts values of y thatare negative or greater than 1.Functional Form: Since the model is linear, a unit increase inxk results in change of βk in the probability of an event. Theincrease is the same regardless of the current value of x.

Page 10: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Problems with the LPMMore on Functional Form

Consider the following two specifications:

yi = α+ βxi + δdi + εi

yi = F (α+ βxi + δdi)

where di is a dummy variable.The discrete change in y as d changes from 0 to 1, holding xconstant as:

∆y∆d = (α+ βxi + δ · 1)− (α+ βxi + δ · 0) = δ

For our second function, the discrete change is?

Page 11: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Problems with the LPMMore on Functional Form

Page 12: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 13: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

WLS

Since we know the exact form of the heterokedasticity function we canuse Weighted-Least-Square (WLS)

In this case:

E(ε2i |X

)= Var (εi|X) = σ2

0hi(X) (1)

wherehi(X) = x′iβ

(1− x′iβ

)(2)

Then we can estimate the WLS estimator:

βWLS =[n∑i=1

wixix′i

]−1 [ n∑i=1

wixiyi

], wi = 1/hi (3)

Page 14: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 15: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Determinants of Personal Computer Ownership

Consider the following model:

PCi = β0 + β1hsGPAi + β2ACTi + β3parcolli + εi (4)

where:PC: binary indicator equal to unity if the student owns acomputer, zero otherwise.hsGPA: High school GPA.ACT: is achievement test score.parcoll: binary indicator equal to unity if at least one parentattended college.1

We use data in GPA1.dta to estimate the probability ofowning a computer.

1Separate college indicators for the mother and father do not yield individuallysignificant results, as these are pretty highly correlated.

Page 16: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

* Open Datacd "/Users/mauriciosarrias/Documents/Clases/Discrete Choice Models/Examples/Binary"use "GPA1.DTA", clear

* Gen parcollquietly{gen parcoll = 0replace parcoll =1 if fathcoll == 1 | mothcoll == 1reg PC hsGPA ACT parcoll}

* Plot Predicted Valuesquietly margins, at(hsGPA = (16(0.5)33)) atmeansmarginsplot, yline(0, lcolor(red)) yline(1, lcolor(red))

Page 17: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Page 18: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

* Predicted valuequi reg PC hsGPA ACT parcollpredict yhat, xbsum yhat

Variable | Obs Mean Std. Dev. Min Max-------------+--------------------------------------------------------

yhat | 141 .3971631 .1000667 .1700624 .4974409

* Weightsgen h_hat = yhat * (1 - yhat)

Page 19: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

* Modelsquietly eststo ols : reg PC hsGPA ACT parcollquietly eststo olsr: reg PC hsGPA ACT parcoll, robustquietly eststo wls : reg PC hsGPA ACT parcoll [aweight = 1/ h_hat]esttab ols olsr wls, b se ///mtitle("OLS" "OLS Rob" "WLS")

------------------------------------------------------------(1) (2) (3)OLS OLS Rob WLS

------------------------------------------------------------hsGPA 0.0654 0.0654 0.0327

(0.137) (0.141) (0.130)

ACT 0.000565 0.000565 0.00427(0.0155) (0.0161) (0.0155)

parcoll 0.221* 0.221* 0.215*(0.0930) (0.0880) (0.0863)

_cons -0.000432 -0.000432 0.0262(0.491) (0.496) (0.477)

------------------------------------------------------------N 141 141 141------------------------------------------------------------Standard errors in parentheses* p<0.05, ** p<0.01, *** p<0.001

Page 20: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

WLSFinal Remarks

WLS and Robust Standard Errors help us to lead withHeterokedasticity.... but it does not solve the problems related to interpretation.

Page 21: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 22: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent Variable Approach

The latent y∗ is assumed to be linearly related to the observed x’sthrough the structural model:

y∗i = x′iβ + εi

The latent variable y∗ is linked to the observed binary variable y bythe measurement equation:

yi ={

1, if y∗i > τ

0, if y∗i ≤ τ(5)

Page 23: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent Variable Approach

Page 24: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent Variable Approach

Pr (yi = 1|xi) = Pr (y∗i > 0|xi)= Pr (x′iβ + εi > 0|xi)= Pr (εi > −x′iβ|xi)= 1− Pr (εi ≤ −x′iβ|xi) ∵ Pr(X > x) = 1− Pr(X ≤ x)= Pr (εi ≤ x′iβ|xi) by symmetry= F (x′iβ)

=∫ x′

−∞f(εi)dεi

As usually we assume that E (ε|X) = 0

Page 25: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent Variable Approach

Page 26: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 27: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

ProbitError term distributed as Normal

When ε is normal with E (ε|X) = 0 and Var (ε|X) = 1, the pdf is

φ(ε) = 1√2π

exp(−ε

2

2

)and the cumulative distribution function (cdf) is:

Φ(ε) =∫ ε

−∞

1√2π

exp(− t

2

2

)dt

Page 28: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

LogitError term distributed as Logistic

In the logistic model, the errors are assumed to have a standardlogistic distribution with mean 0 and variance π2/3. This unusualvariance is chosen because it results in a particularly simple equationfor the pdf:

λ(ε) = exp (ε)[1 + exp (ε)]2

and an even simpler equation for the cdf:

Λ(ε) = exp (ε)1 + exp(ε)

Page 29: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Normal and Logistic Distribution

Page 30: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

ProbitIf the εi’s are independently and normally distributed, εi ∼ N(0, σ2),then

Pr (yi = 1|xi) = Pr(εiσ> −x′iβ

σ

∣∣∣∣xi)= 1− Φ

(−x′iβ

σ

)= Φ

(x′iβσ

)by symmetry of standard normal distribution

The probability depends on β and σ,but only the ration β/σ is identified, but not the singleparameters β and σ.For instance, if β and σ each are multiplied by a constant c, thenthe probability remains unchanged.Typically, we let σ = 1 as normalization.

Page 31: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 32: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

ML EstimationThe outcome is Bernoulli distributed, the binomial distribution withjust one trial. A very convenient compact notation for the density ofyi, or more formally its probability mass function, is:

f (yi|xi) = P yi

i (1− Pi)1−yi , yi = 1, 0where Pi = F (x′iβ). This yields probabilities Pi and (1− Pi) sincef(1) = P 1(1− P )0 = P and f(0) = P 0(1− P )1 = 1− P . Assumingthat each probability is independent of that other, the jointprobability function (or Likelihood function) is:

Pr (Y1 = y1, Y2 = y2, ..., Yn = yn|X) =N∏i=1

f (yi|xi)

The likelihood function for a sample of n obvervations can be writtenas

L

β|y,X︸︷︷︸data

=n∏i=1

[F (x′iβ)]yi [1− F (x′iβ)]1−yi

Page 33: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Log-Likelihood FunctionTaking logs, we obtain the Log-Likelihood function, which must bemaximized

logL (β|data) = log(

n∏i=1

[F (x′iβ)]yi [1− F (x′iβ)]1−yi

)

=n∑i=1

{log([F (x′iβ)]yi

)+ log

([1− F (x′iβ)]1−yi

)}=

n∑i=1{yi log [F (x′iβ)] + (1− yi) log [1− F (x′iβ)]}

Useful TrickIf the distribution is symmetric, as the normal and logistic, then1− F (x′iβ) = F (−x′iβ). Let qi = 2yi − 1. Then:

logL (β|data) =n∑i=1

logF (qix′iβ)

Page 34: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Asymptotic Distribution of Probit and Logit ModelRecall that

√n(β − β0

)d−→ N(0, I−1)

where I is the sample Fisher Information defined as:

I = −E[

∂2

∂β∂β′log f(yi|xi;β)

]Then, an estimator of the expected value of the Hessian (orinformation matrix), is given by:

I(β) = −E[∂2 logL∂β∂β′

]= 1n

n∑i=1

f2(x′iβ)F (x′iβ) [1− F (x′iβ)]xix

′i

(6)

We can also use the following estimators:(−∑ni=1 H(wi, β)

)−1

or the outer product of the gradients.

Page 35: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

What about small samples?Griffiths et al. (1987)

We have three estimators.NR: inverse of the negative of the Hessian matrix from thelog-likelihood function.Scoring: inverse of the information matrix is used.BHHH: the inverse of the outer product of the first derivatives ofthe log-likelihood function.

They are asymptotically equivalent, but their performance canvary in finite samples.

They find that, on average, the Hessian matrix and the informationmatrix give almost identical results and lead to more accurateestimates of the asymptotic covariance matrix than does theestimator based on first derivatives.

Page 36: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 37: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal EffectsRecall that:

E (yi|xi) = [1× Pr (yi = 1|xi)]+[0× Pr (yi = 0|xi)] = Pr (yi = 1|xi) = F (x′iβ)

The marginal effect is given by:

∂E (yi|xi)∂xi︸ ︷︷ ︸

(K×1)

=[dF (x′iβ)d(x′iβ)

]β = f(x′iβ)︸ ︷︷ ︸

(1×1)

β︸︷︷︸(K×1)

where f(·) is the probability density function.

So:

Probit =⇒ ∂E (yi|xi)∂xi

= φ(x′iβi)β

Logit =⇒ ∂E (yi|xi)∂xi

= Λ(x′iβ) [1− Λ(x′iβ)]β

Page 38: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal EffectsComments

Some aspects are important to note:MEs will vary with the values of x.Current practices:

Calculate MEs at means of the variables.Calculate MEs at specific values.Evaluate MEs at every observation and use the sample average ofthe individual MEs.

Page 39: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal EffectsDummy variable

The appropriate ME for a binary independent variable, say, d, wouldbe:

ME =[Pr(yi = 1|x(d), di = 1

)]−[Pr(yi = 1|x(d), di = 0

)]where x(d) denotes the means of all the other variables in the model.

Page 40: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Average Partial Effects

Current practice: “average partial effects”. The quantity of interest is:

APE = Ex[∂E(y|x)∂x

]In practical terms, this suggests the computation:

APE = ¯γ = 1n

n∑i=1

f(x′iβ)β.

Page 41: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 42: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Goodness-of-Fit

A number of suggestions have been made for how to evaluate theoverall quality of a binary response model.

First approach: to mimic the R2 measure.Assess the predictive performance of the model

R2 are not directly applicable in non-linear models such as binaryresponse models, since we do not have a proper variancedecomposition result

The so-called pseudo R2 measures have been suggested.

Page 43: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Goodness-of-Fit

Let:logL(βr): the value of the maximized log-likelihood function inthe constant-only model.logL(βu): is the maximized log-likelihood value in the full model.

Note that the value of the log-likelihood function is always negative,so:

logL(βu) ≥ logL(βr) =⇒∣∣∣logL(βu)

∣∣∣ ≤ ∣∣∣logL(βr)∣∣∣

So that:

0 ≤ 1− logL(βu)logL(βr)

= R2McFadden ≤ 1

The McFadden R2 will be zero if the full model has noexplanatory power.

Page 44: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Information Measures

Definition (Akaike’s Information Criterion (AIC))Akaike’s (1973) information criterion is defined as

AIC = −2 log L+ 2Kn

(7)

where log L is the likelihood of the model and K is the number ofparameters in the model.

Larger values of log L indicates a better fit.−2 log L ranges from to 0 +∞ with smaller values indicating abetter fit.As K increases, −2 log L becomes smaller since moreparameters make what is observed more likely.2K is added as a penalty.All else being equal, smaller values suggest a better fitting model.Use to compare models across different samples or to comparenonested models.

Page 45: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Information Measures

Definition (Bayes Information Criterion (BIC))BIC information criterion is defined as

BIC = −2 log L+K lognn

(8)

where log L is the likelihood of the model and K is the number ofparameters in the model and n is the number of individuals.

It is possible to increase the likelihood by adding parameters, butdoing so may result in overfitting. Both BIC and AIC resolve thisproblem by introducing a penalty term for the number of parametersin the model; the penalty term is larger in BIC than in AIC.

Page 46: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 47: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Stata Example

Open BinaryStata.do

Page 48: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 49: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Ordinal Outcomes

When a variable is ordinal, its categories can be ranked from lowto high, but the distances between adjacent categories areunknown.Examples:

Kind of Job: low skilled, medium skilled, and highly skilled.Subjective Health Status: very bad, bad, good, very good.Likert Scales: strongly agree, agree, have no opinion, disagree, orstrongly disagree with a statement.

Page 50: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Ordinal Outcomes

Researchers often treat ordinal dependent variables as if they wereinterval. The dependent categories are numbered sequantially and theLRM is used. This involves the implicit assumption that the intervalsbetween adjacent categories are equal.

For example, the distance between strongly agreeing and agreeing isassumed to be the same as the distance between agreeing and beingneutral on a Likert Scale.

This is not correct. See for example MacKelvey and Zavoina (1975)and Winship and Mare (1984).

Page 51: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 52: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent ApproachConsider the latent variable y∗i ranging from −∞ to ∞, determind bythe following structural model

y∗i = x′iβ + εi (9)However, we do not observe y∗i . We only observed a discrete variableyi, which is though of as providing incomplete information about andunderlying y∗i according to the following rule:

yi =

1 if y∗i < κ1

2 if κ1 < y∗i < κ2

3 if κ2 < y∗i

(10)

or more general mechanism that accounts for the ordering nature ofy∗i is

yi = j ⇐⇒ κj−1 < y∗i ≤ κj j = 1, ..., J (11)The κ’s are called thresholds or cutpoints. The extreme categories 1and J are defined by open-ended intervals with κ0 = −∞ andκJ =∞. When J = 2, we have the BRM.

Page 53: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Latent ApproachConsider the following question: How do you grade your actualhealth status? The answer are: 1 = Poor, 2 = Fair, 3 =Good, and 4 = Very Good. Assume that this ordinal variable isrelated to a continuous, latent variable y∗ that indicates andindividual’s underlying health status. The observed y is related to y∗according to:

yi =

1 =⇒ Poor if κ0 = −∞ ≤ y∗i < κ1

2 =⇒ Fair if κ1 ≤ y∗i < κ2

3 =⇒ Good if κ2 ≤ y∗i < κ3

4 =⇒ Very Good if κ3 ≤ y∗i < κ4 =∞

(12)

Page 54: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Page 55: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

The probabilitiesFirst, consider the probability that y = 1. This implies that:

Pr(yi = 1|xi) = Pr(κ0 ≤ y∗i < κ1|xi)= Pr(κ0 ≤ x′iβ + εi < κ1|xi)= Pr(κ0 − x′iβ ≤ εi < κ1 − x′iβ|xi)= Pr(εi < κ1 − x′iβ|xi)− Pr(εi < κ0 − x′iβ|xi)= F (κ1 − x′iβ)− F (κ0 − x′iβ)

(13)

So, generalizing we obtain:

Pr(yi = j|xi) = F (κj − x′iβ)− F (κj−1 − x′iβ) (14)Note that:

F (κ0 − x′iβ) = F (−∞− x′iβ) = 0 (15)and

F (κJ − x′iβ) = F (∞− x′iβ) = 1 (16)

Page 56: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Ordered Probit Model

In the ordered probit model, we assume that the error term followa standard normal distribution, F (ε) = Φ(ε).

In this case, the probabilities are:

Pr(yi = j|xi) = Φ(κj − x′iβ

σε

)− Φ

(κj−1 − x′iβ

σε

)(17)

For identification we need σε

Page 57: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Ordered Probit Model

Page 58: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Ordered Logist Model

If εi ∼ iid Logistically distributed, then

Pr(yi = j|xi) = Λ(κj − x′iβ

σε

)− Λ

(κj−1 − x′iβ

σε

)(18)

where σε = π2/3

Page 59: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Identification

Since y∗ is latent, its means and variance cannot be estimated.For logit Var(ε|x) = π2/3.For Probit Var(ε|x) = 1

Can we estimate a constant is this model?

Page 60: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 61: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Estimation

The conditional probabilities are:

f(yi|xi;β, κ1, ..., κJ−1) = P di1i1 ...P diJ

iJ =J∏j=1

Pdij

ij (19)

where dij is defined as a binary indicator equal to one if yi = j and 0otherwise. For a sample of n independent pairs of obserations (yi,xi),the likelihood function is given by

L(θ; y,X) =n∏i=1

J∏j=1

Pdij

ij (20)

Page 62: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Estimation

Taking logarithms converts products into sums and we can write thelog-likelihood funcition as

logL(θ; y,X) =n∑i=1

J∑j=1

dijPij (21)

ML estimation of the parameters yields consistent, asymptoticallyefficient, and asymptotically normally distributed estimators.We can use LR, Wald, and Score test to test for generalrestrictions, and the invariance property together with the DeltaMethod for estimation and inference of predicted probabilities ormarginal probability effects.

Page 63: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 64: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Compesating Variation

DefinitionCV The variation in two regressors such that the latent variable doesnot change.

Let xil denote the l-element in xi and βl the correspondingparameter, and let m index the m-th elements in both vectors xi andβ, respectively.

Now consider a change in xil and xim at the same time, such that∆y∗i = 0 (and therefore all probabilities are unchanged). Thisrequires:

0 = βl∆xil + βm∆xim or ∆xil∆xim

= −βmβl

(22)

If, for example, xil is logarithmic income and xim is a dummy variableindicating unemployment, then the above ratio gives a trade-offration: The relative increase in income required to compensate for thenegative effect of unemployment

Page 65: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Compesating Variation

In some applications, it could be interesting to know how much anexplanatory varible should change to reach the next higher responsecategory. For this purpose, one could consider the ratio of the intervallength to the parameter

kj − kj−1

βl(23)

The smallest this ratio in absolute terms, the smallest the maximumchange in xil required to move the response from yi = j to yi = j + 1

Page 66: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Discrete Probability Effect

The ceteris paribus effect of a discrete change in one explanatoryvariable, say the l-th element in xi, is simply the difference betweenthe probabilities before and those after the change, given the values ofthe other variables:

∆Pij = Pr(yi = j|xi + ∆xil)− Pr(yi = j|xi)= [F (κj − βxi −∆xilβ)− F (κj−1 − βxi −∆xilβ)]− [F (κj − βxi)− F (κj−1 − βxi)]

(24)

Page 67: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Shift in Density Due to a Change ∆xi > 0 (β > 0)

Page 68: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal Effects

The marginal probability effect (MPE) is given by:

MPEijl = ∂Pij

∂xil=[f(kj−1 − x′iβ)− f(kj − x′iβ)

]βl (25)

Neither the signs nor the magnitudes of the coefficients are directlyinterpretable in the ordered choice models.MPE’s are functions of the covariates and therefore vary acrossindividuals.To calculate average marginal probability effects (AMPE’s) wehave to take expectations with respect to x, which is estimatedconsistently by replacing β by its ML estimate β and averaging overthe sample.

Page 69: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal Effects

Furthermore, in the computation of the MPE, the only certainties inthe signs of the partial effects in this model are as follows, where weconsider a variable with a positive coefficient:

Increases in that variable will increase the probability in thehighest cell and decrease the probability in the lowest cell.The sum of all the changes will be zero—the new probabilitiesmust still sum to one.The effects will begin at Pr(yi = 1) with one or more negativevalues, then change to a set of positives values; there will be onesign change (Single Crossing)

Page 70: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal Effects

One might also be interested in cummulative values of the partialeffects, such as

∂ Pr(y ≤ j|xi)∂xil

=J∑j=1

[f(kj−1 − x′iβ)− f(kj − x′iβ)]βl (26)

Page 71: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 72: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Stata Example

Open Ordered Lab.do

Page 73: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 74: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Introduction

Type of dependent variableDiscrete choices: choie of one among several mutually exclusivealternatives:

Commuting mode: train, car, bycicle, walking.Type of car: standard car, hybrid car, electric car.

Type of dataRevealed preferences: data which means that the data areobserved choice of individuals for, say, a transport mode.Stated preferences data: in this case individuals face a virtualsituation of choice for example the choice between different typesof automated cars.

Page 75: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Design of the choice experiment

Figure: Sample of a Choice Situation Presented to Respondents

Page 76: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 77: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Random Utility Model

The utility of a individual:

Uij = Vij + εij

where Vij is the deterministic component, and εij is the randomcomponent. This random component is intended to capture thevarious unpredictable choices people make.

Deterministic PartIt is usually assumed that is a linear combination of the observedexplantory variables:

Vij = αj + x′ijβ + z′iγj + w′ijδj

Page 78: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Random Utility Model

Trhee kinds of variables:alternative specific variables xij with generic coefficient β.individual specific variables zi with an alternative specificcoefficients γj .alternative specific variables wij with an alternative specificcoefficient δj

Page 79: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Random Utility Model

Only differences between utility matters. Let two alternatives j and k

Vij − Vik = (αj − αk) + β(xij − xik) + (γj − γk)zi + (δjwij − δkwik)

Coefficients for individual specific variables (including theintercept) should be alternative specific. Normalization isrequired γ1 = 0.Coefficients for alternative specific variables may (or may not) bealternative specific.

Cost is alternative specific, but the cost of standard vehicles maynot have the same impact on utility that the cost in a electric car.

Page 80: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 81: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Probabilities

If the error terms are EV type I, then:

Pij = eVij∑j eVij

= ex′ijβ∑

j ex′

ijβ

(27)

For a sample of n independent pairs of obserations (yi,xij), thelog-likelihood function is given by

logL(θ; y,X) =∑i

∑j

yij logPij (28)

where yij is a dummy variable which equal to 1 if individual i madechoice j and 0 otherwise.

Page 82: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Marginal effects

The marginal effects respect to individual-specific zi oralternative-specific xij variables are:

∂Pij∂zi

= Pij

(βj −

∑l

Pilβl

)∂Pij∂xij

= γPij(1− Pij)

∂Pij∂xil

= −γPijPil

Page 83: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 84: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

gmnl package in R

Page 85: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Formulas in gmnl package

#load packages and datalibrary("gmnl")library("mlogit")data("TravelMode", package = "AER")with(TravelMode, prop.table(table(mode[choice == "yes"])))

#### air train bus car## 0.2761905 0.3000000 0.1428571 0.2809524

head(TravelMode)

## individual mode choice wait vcost travel gcost income size## 1 1 air no 69 59 100 70 35 1## 2 1 train no 34 31 372 71 35 1## 3 1 bus no 35 25 417 70 35 1## 4 1 car yes 0 10 180 30 35 1## 5 2 air no 64 58 68 68 30 2## 6 2 train no 44 31 354 84 30 2

Page 86: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Formulas in gmnl package

Data is in long-format (one row per available mode). We transformthe data in mlogit.format

# transform dataTM <- mlogit.data(TravelMode, choice = "choice",

shape = "long",alt.levels = c("air", "train", "bus", "car"))

Page 87: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Formulas in gmnl package

Suppose we want to estimate a multinomial logit model wherethe variables wait and vcost are alternative-specific variableswith a generic coefficient β,income is an individual-specific variable with an alternativespecific coefficient γj ;and the variable travel is alternative-specific variable with analternative-specific coefficient δj .

f1 <- choice ˜ wait + vcost | income | travel

Page 88: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Formulas in gmnl package

By default, the alternative-specific constants (ASC) for eachalternative are included. They can be omitted by adding +0 or -1 inthe second part of the formula.

f2 <- choice ˜ wait + vcost | income + 0 | travelf2 <- choice ˜ wait + vcost | income - 1 | travel

Page 89: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Formulas in gmnl package

Some parts may by omitted when there is no ambiguity. For instance,a model with only individual-specific variables can be specified asfollows:

f2 <- choice ˜ 0 | income + size | 0f2 <- choice ˜ 0 | income + size | 1

Page 90: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

1 Binary outcomesIntroduction to Linear Probability Model (LPM)WLSExampleNon-Linear Probability ModelProbit and LogitEstimationMarginal EffectsGoodness-of-FitStata Example

2 Ordinal outcomesMotivationLatent ApproachML EstimationInterpretation of ParametersStata Example

3 Multinomial Logit ModelIntroductionRUMProbabilities and EstimationFormulas using gmnl packageAn example

Page 91: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

The dataset Train contains data about a stated preference survey inNetherlands. Users are asked to choose between two trains tripscharacterized by four attributes:

price: the price in cents of guilders,time: travel time in minutes,change: the number of changes,comfort: the class of comfort, 0, 1 or 2; 0 being the mostcomfortable class.

data("Train", package = "mlogit")Tr <- mlogit.data(Train, shape = "wide", choice = "choice",varying = 4:11, sep = "", alt.levels = c(1, 2), id = "id")

Page 92: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

We first convert price and time in more meaningful unities, hoursand euros (1 guilder is 2.20371 euros):

Tr$price <- Tr$price / 100 * 2.20371Tr$time <- Tr$time / 60

Page 93: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

Now we estimate the model: both alternatives being virtual equal(train trips), it is relevant to use only generic coefficients and toremove the ASC

mnl <- gmnl(choice ˜ price + time + change + comfort | -1,data = Tr,model = "mnl")

Page 94: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Examplesummary(mnl)

#### Model estimated on: Thu Oct 05 18:16:57 2017#### Call:## gmnl(formula = choice ˜ price + time + change + comfort | -1,## data = Tr, model = "mnl", method = "nr")#### Frequencies of categories:#### 1 2## 0.50324 0.49676#### The estimation took: 0h:0m:0s#### Coefficients:## Estimate Std. Error z-value Pr(>|z|)## price -0.0673581 0.0033933 -19.8506 < 2.2e-16 ***## time -1.7205517 0.1603517 -10.7299 < 2.2e-16 ***## change -0.3263410 0.0594892 -5.4857 4.118e-08 ***## comfort -0.9457257 0.0649455 -14.5618 < 2.2e-16 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#### Optimization of log-likelihood by Newton-Raphson maximisation## Log Likelihood: -1724.2## Number of observations: 2929## Number of iterations: 5## Exit of MLE: gradient close to zero

Page 95: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

Coefficients are not directly interpretable, but dividing them by theprice coefficient, we get the WTPs

coef(mnl)[-1] / coef(mnl)[1]

## time change comfort## 25.543370 4.844869 14.040276

or we can usewtp.gmnl(mnl, wrt = "price")

#### Willigness-to-pay respect to: price#### Estimate Std. Error t-value Pr(>|t|)## time 25.54337 2.09054 12.2185 < 2.2e-16 ***## change 4.84487 0.84345 5.7441 9.241e-09 ***## comfort 14.04028 0.88110 15.9349 < 2.2e-16 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We obtain the value of 26 euros for an hour of traveling, 5 euros for achange and 14 euros to access a more confortable class.

Page 96: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

Now we use the Fish data set.

data("Fishing", package = "mlogit")head(Fishing, 3)

## mode price.beach price.pier price.boat price.charter catch.beach## 1 charter 157.930 157.930 157.930 182.930 0.0678## 2 charter 15.114 15.114 10.534 34.534 0.1049## 3 boat 161.874 161.874 24.334 59.334 0.5333## catch.pier catch.boat catch.charter income## 1 0.0503 0.2601 0.5391 7083.332## 2 0.0451 0.1574 0.4671 1250.000## 3 0.4522 0.2413 1.0266 3750.000

Fish <- mlogit.data(Fishing, shape = "wide",varying = 2:9,choice = "mode")

mnl.fish <- gmnl(mode ˜ price | income | catch, data = Fish,model = "mnl")

Page 97: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Examplesummary(mnl.fish)

#### Model estimated on: Thu Oct 05 18:16:58 2017#### Call:## gmnl(formula = mode ˜ price | income | catch, data = Fish, model = "mnl",## method = "nr")#### Frequencies of categories:#### beach boat charter pier## 0.11337 0.35364 0.38240 0.15059#### The estimation took: 0h:0m:1s#### Coefficients:## Estimate Std. Error z-value Pr(>|z|)## boat:(intercept) 8.4184e-01 2.9996e-01 2.8065 0.0050080 **## charter:(intercept) 2.1549e+00 2.9746e-01 7.2443 4.348e-13 ***## pier:(intercept) 1.0430e+00 2.9535e-01 3.5315 0.0004132 ***## price -2.5281e-02 1.7551e-03 -14.4046 < 2.2e-16 ***## boat:income 5.5428e-05 5.2130e-05 1.0633 0.2876613## charter:income -7.2337e-05 5.2557e-05 -1.3764 0.1687089## pier:income -1.3550e-04 5.1172e-05 -2.6480 0.0080977 **## beach:catch 3.1177e+00 7.1305e-01 4.3724 1.229e-05 ***## boat:catch 2.5425e+00 5.2274e-01 4.8638 1.152e-06 ***## charter:catch 7.5949e-01 1.5420e-01 4.9254 8.417e-07 ***## pier:catch 2.8512e+00 7.7464e-01 3.6807 0.0002326 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#### Optimization of log-likelihood by Newton-Raphson maximisation## Log Likelihood: -1199.1## Number of observations: 1182## Number of iterations: 6## Exit of MLE: successive function values within tolerance limit

Page 98: Part I: Discrete Choice Models (Theory and Applications)Binary outcomes Ordinal outcomes Multinomial Logit Model Part I: Discrete Choice Models (Theory and Applications) Mauricio Sarrias

Binary outcomes Ordinal outcomes Multinomial Logit Model

Example

Predicted probabilities for the outcome or for all alternatives ifoutcome = FALSE:

head(fitted(mnl.fish))

## 1.beach 2.beach 3.beach 4.beach 5.beach 6.beach## 0.3114002 0.4537956 0.4567631 0.3701758 0.4763721 0.4216448

head(fitted(mnl.fish, outcome = FALSE))

## beach boat charter pier## [1,] 0.09299770 0.5011740 0.3114002 0.09442818## [2,] 0.09151069 0.2749292 0.4537956 0.17976449## [3,] 0.01410359 0.4567631 0.5125571 0.01657626## [4,] 0.17065866 0.1947959 0.2643696 0.37017583## [5,] 0.02858216 0.4763721 0.4543225 0.04072325## [6,] 0.01029792 0.5572462 0.4216448 0.01081103