
Limited Dependent Variables

We are often interested in explaining a dependent variable that can be measured only in a limited way.

Frequently it is even dichotomous.

Examples

War (1) vs. no war (0)

Vote vs. no vote

Regime change vs. no change

These are often Probability Models

E.g., power disparity leads to war:

$$Y_t = a + B_1 X_t + e_t$$

where $Y_t$ is the occurrence (or not) of war, and $X_t$ is a measure of power disparity.

We call this a Linear Probability Model.

Problems with LPM Regression

OLS in this case is called the Linear Probability Model.

Running the regression produces some problems:

Errors are not distributed normally

Errors are heteroskedastic

Predicted Ys can be outside the 0.0-1.0 bounds required for probability
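The out-of-bounds problem can be seen with a small sketch. The data below are invented purely for illustration (a hypothetical power-disparity score `x` and a 0/1 war indicator `y`), and the helper name `ols_fit` is an assumption, not something from the slides.

```python
# Toy data invented for illustration: x is a hypothetical power-disparity
# score, y is the occurrence (1) or not (0) of war.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def ols_fit(x, y):
    """Bivariate OLS: return (intercept, slope) for y = a + B1*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b1 = ols_fit(x, y)

# Fitted values of the linear probability model at each observed x.
fitted = [a + b1 * xi for xi in x]

# Fitted "probabilities" that fall outside the [0, 1] interval.
out_of_bounds = [p for p in fitted if p < 0.0 or p > 1.0]

# Implied error variance p * (1 - p) differs across observations,
# which is the heteroskedasticity problem (shown for in-bounds fits).
implied_variance = [p * (1.0 - p) for p in fitted if 0.0 <= p <= 1.0]

print("intercept, slope:", a, b1)
print("out-of-bounds fitted values:", out_of_bounds)
print("implied error variances:", implied_variance)
```

With these toy numbers the fitted line dips below 0 at the smallest `x` and rises above 1 at the largest, which is exactly what a probability cannot do.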

Logistic Model

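We need a model that produces true probabilities.

The logit, or cumulative logistic distribution, offers one approach:

$$Y_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}}$$

This produces a sigmoid curve. Look at the equation under 2 conditions (taking $B_2 > 0$):

$X_i = +\infty$: the exponential term goes to 0, so $Y_i$ approaches 1.

$X_i = -\infty$: the exponential term grows without bound, so $Y_i$ approaches 0.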

Sigmoid curve

http://en.wikipedia.org/wiki/Logistic_function
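A minimal sketch of the logistic function and its behavior at the two extremes. The coefficient values `B1` and `B2` are hypothetical, chosen only to make the curve easy to read, and very large finite values of X stand in for plus and minus infinity.

```python
import math

def logistic(z):
    """Cumulative logistic function 1 / (1 + e^(-z)).

    The two branches avoid overflow in math.exp for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coefficients (assumptions for illustration only).
B1, B2 = -2.0, 0.5

# Evaluate the curve at a few X values, including very large ones.
for x in (-1000, -10, 0, 4, 10, 1000):
    print(x, logistic(B1 + B2 * x))
```

The output stays strictly between 0 and 1 for moderate X and flattens toward the bounds at the extremes, which is the sigmoid shape.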

Probability Ratio

Note that

$$P_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}} = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$

Where

$$Z_i = B_1 + B_2 X_i$$

Log Odds Ratio

The logit is the log of the odds ratio $P_i / (1 - P_i)$, and is given by:

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = B_1 + B_2 X_i$$

This model gives us a coefficient that may be interpreted as the change in the log odds of the dependent variable for a one-unit change in $X_i$.
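The relationship between the probability, the odds, and the logit can be checked numerically. This is a sketch with hypothetical coefficients `b1` and `b2`. It shows that taking the log odds of the logistic probability recovers the linear index, and that a one-unit increase in X multiplies the odds by exp(B2).

```python
import math

def prob(x, b1, b2):
    """Logistic probability P = 1 / (1 + e^(-Z)) with Z = b1 + b2 * x."""
    z = b1 + b2 * x
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds P / (1 - P)."""
    return p / (1.0 - p)

def log_odds(p):
    """The logit: natural log of the odds."""
    return math.log(odds(p))

# Hypothetical coefficients (assumptions for illustration only).
b1, b2 = -2.0, 0.5

p3 = prob(3.0, b1, b2)
p4 = prob(4.0, b1, b2)

# The logit of the probability equals the linear index Z.
print("logit at x=3:", log_odds(p3), "Z at x=3:", b1 + b2 * 3.0)

# A one-unit change in x shifts the log odds by b2,
# i.e. multiplies the odds by exp(b2).
print("ratio of odds:", odds(p4) / odds(p3), "exp(b2):", math.exp(b2))
```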

Estimation of Model

We estimate this with maximum likelihood.

The significance tests are z statistics.

We can generate a Pseudo R2, which is an attempt to measure the percent of variation of the underlying logit function explained by the independent variables.

We test the full model with the Likelihood Ratio test (LR), which has a χ2 distribution with k degrees of freedom, where k is the number of independent variables.
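To make the estimation steps concrete, here is a rough sketch of maximum likelihood for a one-variable logit using Newton-Raphson iterations, followed by the LR statistic and a pseudo R2. Everything here is an assumption for illustration: the toy data are invented, the function names are hypothetical, and the pseudo R2 shown is McFadden's version (1 minus the ratio of log-likelihoods), since the slides do not say which variant is meant.

```python
import math

# Toy data invented for illustration (not the slides' dataset).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b1, b2, x, y):
    """Bernoulli log-likelihood of the logit model."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b1 + b2 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with an intercept and one regressor."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b1 + b2 * xi)
            w = p * (1.0 - p)
            g1 += yi - p              # score for the intercept
            g2 += (yi - p) * xi       # score for the slope
            h11 += w                  # information matrix entries
            h12 += w * xi
            h22 += w * xi * xi
        det = h11 * h22 - h12 * h12
        # Newton step: inverse of the 2x2 information matrix times the score.
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2

b1, b2 = fit_logit(x, y)
ll_model = log_likelihood(b1, b2, x, y)

# Null model: intercept only, so every case gets the sample mean of y.
ybar = sum(y) / len(y)
ll_null = len(y) * (ybar * math.log(ybar) + (1 - ybar) * math.log(1 - ybar))

lr = 2.0 * (ll_model - ll_null)           # LR statistic, chi-square with k = 1 df
pseudo_r2 = 1.0 - ll_model / ll_null      # McFadden's pseudo R2 (an assumption)
p_value = math.erfc(math.sqrt(lr / 2.0))  # upper tail of chi-square, 1 df only

print("B1, B2:", b1, b2)
print("LR:", lr, "p-value:", p_value, "pseudo R2:", pseudo_r2)
```

The `erfc` shortcut for the p-value holds only for one degree of freedom. With more independent variables a general chi-square tail function would be needed.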

Neural Networks

The alternate formulation is representative of a single-layer perceptron in an artificial neural network.

Probit

If we can assume that the dependent variable is actually the result of an underlying (and immeasurable) propensity or utility, we can use the cumulative normal probability function to estimate a Probit model

It is also more appropriate if the categories (or their propensities) are likely to be normally distributed.

It looks just like a logit model in practice

The Cumulative Normal Density Function

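The normal distribution is given by:

$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}}$$

The cumulative normal distribution function (the integral of this density) is:

$$F(X_0) = \int_{-\infty}^{X_0} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}} \, dX$$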

The Standard Normal CDF

We assume that there is an underlying threshold value ($I_i$): a case that exceeds it will be a 1, and 0 otherwise.

We can standardize and estimate this as

$$F(I_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{B_1 + B_2 X_i} e^{-z^2/2} \, dz$$
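The standard normal CDF has no closed form, but it can be written with the error function. The sketch below uses that identity to compute a probit probability. The coefficients are hypothetical and the function names are assumptions for illustration.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(x, b1, b2):
    """P(Y = 1) under a probit model with linear index b1 + b2 * x."""
    return std_normal_cdf(b1 + b2 * x)

# Hypothetical coefficients (assumptions for illustration only).
b1, b2 = -1.0, 0.4

for x in (0.0, 2.5, 5.0, 10.0):
    print(x, probit_prob(x, b1, b2))
```

As with the logit, the predicted values are bounded by 0 and 1 and trace out an S-shaped curve in X.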

Probit estimates

Again, maximum likelihood estimation

Again, a Pseudo R2

Again, an LR test with k degrees of freedom

Assumptions of Models

All Y's are in the {0,1} set

They are statistically independent

No multicollinearity

P(Yi=1) follows the cumulative normal distribution for probit, and the logistic function for logit

Ordered Probit

If the dependent variable can take on ordinal levels, we can extend the dichotomous Probit model to an n-chotomous, or ordered, Probit model

It simply has several threshold values estimated

Ordered logit works much the same way
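How several thresholds turn one linear index into probabilities for each ordinal level can be sketched as follows. The cutpoint values and the index `xb` are hypothetical, and the function names are assumptions. The probability of each category is the normal probability mass between adjacent thresholds.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, cutpoints):
    """Return P(Y = j) for j = 0..len(cutpoints).

    xb is the linear index (e.g. B * X) and cutpoints are the
    estimated thresholds in ascending order.
    """
    # Cumulative probabilities at each threshold, padded with 0 and 1
    # for the open-ended lowest and highest categories.
    cumulative = [0.0] + [std_normal_cdf(c - xb) for c in cutpoints] + [1.0]
    return [cumulative[j + 1] - cumulative[j] for j in range(len(cumulative) - 1)]

# Hypothetical values: two thresholds give three ordered categories.
probs = ordered_probit_probs(xb=0.3, cutpoints=[-0.5, 1.0])
print(probs, sum(probs))
```

Swapping `std_normal_cdf` for the logistic function would give the corresponding ordered logit probabilities.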

Multinomial Logit

If our dependent variable takes on several different values, but they are nominal (unordered), we use a multinomial logit model.
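A minimal sketch of how a multinomial logit assigns probabilities to unordered categories. One category serves as the base with its coefficients fixed at zero, and each other category gets its own intercept and slope. All coefficient values and names here are hypothetical.

```python
import math

def multinomial_logit_probs(x, coefs):
    """Return category probabilities for a single regressor x.

    coefs is a list of (intercept, slope) pairs, one per non-base
    category. Category 0 is the base, with coefficients fixed at 0.
    """
    scores = [0.0] + [b1 + b2 * x for b1, b2 in coefs]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coefficients for two non-base categories.
coefs = [(0.5, -0.2), (-1.0, 0.3)]
probs = multinomial_logit_probs(2.0, coefs)
print(probs, sum(probs))
```

With only one non-base category this reduces to the ordinary binary logit.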

Some additional info

The modal category is a good benchmark.

Present the % correctly predicted. This can be calculated and presented.

This, when compared to the share of cases in the modal category, gives us a good indication of fit.
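The comparison can be sketched as below. The outcomes and predicted probabilities are invented for illustration, the 0.5 cutoff is an assumption (a common default, not something the slides specify), and the function names are hypothetical.

```python
def percent_correct(y, probs, cutoff=0.5):
    """Percent of cases where the predicted class matches the observed one."""
    preds = [1 if p >= cutoff else 0 for p in probs]
    hits = sum(1 for pi, yi in zip(preds, y) if pi == yi)
    return 100.0 * hits / len(y)

def modal_benchmark(y):
    """Percent correct from always guessing the most common category."""
    ones = sum(y)
    return 100.0 * max(ones, len(y) - ones) / len(y)

# Invented outcomes and predicted probabilities (illustration only).
y = [0, 0, 0, 1, 0, 1, 1, 1, 0, 0]
probs = [0.1, 0.2, 0.3, 0.6, 0.55, 0.7, 0.8, 0.4, 0.2, 0.1]

print("percent correctly predicted:", percent_correct(y, probs))
print("modal category benchmark:", modal_benchmark(y))
```

A model is only adding something if its percent correctly predicted beats the modal benchmark.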

Stata

Use Leadership Change data (1992 cross section)

1992-Stata

Test different models

Dependent variable: leadership change

Examine its distribution:

tabulate ledchan1

Independent variables

Try different ones

Try corr and then pwcorr

Try the following

regress ledchan1 grwthgdp hlthexp illit_f polity2

logit ledchan1 grwthgdp hlthexp illit_f polity2

logistic ledchan1 grwthgdp hlthexp illit_f polity2

probit ledchan1 grwthgdp hlthexp illit_f polity2

ologit ledchan1 grwthgdp hlthexp illit_f polity2

oprobit ledchan1 grwthgdp hlthexp illit_f polity2

mlogit ledchan1 grwthgdp hlthexp illit_f polity2

tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll

Tobit

Assumes a 0 value, and then a scale above it.

E.g., the decision to incarcerate:

0 or 1 (imprison or not)

If imprisoned, then for how many years?

Other models

This leads to many other models:

Count models & Poisson regression

Duration/survival/hazard models

Censoring and truncation models

Selection bias models
