Limited Dependent Variables
Often there are occasions where we are interested in explaining a dependent variable that has only limited measurement
Frequently it is even dichotomous.
Examples
War (1) vs. no war (0)
Vote vs. no vote
Regime change vs. no change
These are often Probability Models
E.g. Power disparity leads to war:
$$Y_t = a + B_1 X_t + e_t$$
Where $Y_t$ is the occurrence (or not) of war, and $X_t$ is a measure of power disparity
We call this a Linear Probability Model
Problems with LPM Regression
OLS in this case is called the Linear Probability Model
Running regression produces some problems:
Errors are not distributed normally
Errors are heteroskedastic
Predicted Ys can be outside the 0.0-1.0 bounds required for probability
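The out-of-bounds problem is easy to see on a small example. The sketch below fits OLS by the closed-form formulas to ten made-up observations and lists the fitted values that fall outside the 0-1 range. The data and the names `ols_fit`, `fitted` and `out_of_bounds` are illustrative assumptions, not taken from any dataset used in these slides.

```python
# Toy data (made up for illustration): x = power disparity, y = war (1) or no war (0).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def ols_fit(x, y):
    """Return intercept a and slope b of y = a + b*x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = ols_fit(x, y)

# Fitted "probabilities" from the linear probability model.
fitted = [a + b * xi for xi in x]

# Any fitted value below 0 or above 1 cannot be a probability.
out_of_bounds = [(xi, round(p, 3)) for xi, p in zip(x, fitted) if p < 0 or p > 1]

print("intercept:", round(a, 3), "slope:", round(b, 3))
print("out-of-bounds fitted values:", out_of_bounds)
```

With these numbers the smallest and largest values of x produce fitted values below 0 and above 1 respectively, which is the third problem listed above.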
Logistic Model
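We need a model that produces true probabilities
The Logit, or cumulative logistic distribution, offers one approach:
$$Y_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}}$$
This produces a sigmoid curve. Look at the equation under 2 conditions:
$X_i = +\infty$: $Y_i \to 1$
$X_i = -\infty$: $Y_i \to 0$
(taking $B_2 > 0$; the two limits swap if $B_2 < 0$)
[Figure: sigmoid curve]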
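A minimal sketch of the logistic function, evaluated at a few values of X to show the sigmoid shape and the two limiting cases. The coefficient values `b1 = 0.0` and `b2 = 1.0` are hypothetical, chosen only to make the output easy to read.

```python
import math


def logistic(x, b1, b2):
    """Y = 1 / (1 + exp(-(b1 + b2*x))); the result always lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))


b1, b2 = 0.0, 1.0  # hypothetical coefficients

# Large negative x pushes Y toward 0, large positive x pushes Y toward 1.
for x in (-30, -2, 0, 2, 30):
    print(x, logistic(x, b1, b2))
```

Unlike the linear probability model, no value of X can push the predicted value outside the 0-1 interval.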
Probability Ratio
Note that
$$P_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}} = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$
Where
$$Z_i = B_1 + B_2 X_i$$
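The probability (odds) ratio named in the heading follows in one step from the logistic form of $P_i$. A short worked derivation, using only $P_i$ and $Z_i = B_1 + B_2 X_i$:

```latex
\begin{align*}
P_i &= \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}, \qquad Z_i = B_1 + B_2 X_i \\
1 - P_i &= \frac{1}{1 + e^{Z_i}} \\
\frac{P_i}{1 - P_i} &= e^{Z_i}
\end{align*}
```

So the odds in favor of the event are simply $e^{Z_i}$, which is why taking the log gives a linear function of $X_i$.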
Log Odds Ratio
The logit is the log of the odds ratio, and is given by:
$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = B_1 + B_2 X_i$$
This model gives us a coefficient that may be interpreted as the change in the log odds of the dependent variable for a one-unit change in $X_i$
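To make the coefficient interpretation concrete, the sketch below checks numerically that a one-unit increase in X changes the log odds by exactly B2, and multiplies the odds by exp(B2). The coefficients `b1 = -1.0`, `b2 = 0.5` and the starting value `x = 2.0` are hypothetical.

```python
import math


def prob(z):
    """Logistic probability for a given linear index z."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log of the odds ratio, ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


b1, b2 = -1.0, 0.5  # hypothetical coefficients
x = 2.0             # hypothetical starting value of X

p0 = prob(b1 + b2 * x)        # probability at x
p1 = prob(b1 + b2 * (x + 1))  # probability at x + 1

odds0 = p0 / (1.0 - p0)
odds1 = p1 / (1.0 - p1)

print("change in log odds:", logit(p1) - logit(p0), "vs b2 =", b2)
print("ratio of odds:", odds1 / odds0, "vs exp(b2) =", math.exp(b2))
```

The change in the probability itself is not constant: it depends on where on the sigmoid curve the case starts, which is why coefficients are read on the log-odds scale.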
Estimation of Model
We estimate this with maximum likelihood
The significance tests are z statistics
We can generate a Pseudo R2, which is an attempt to measure the percent of variation of the underlying logit function explained by the independent variables
We test the full model with the Likelihood Ratio test (LR), which has a χ2 distribution with k degrees of freedom, where k is the number of independent variables
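A rough sketch of what the software does behind these bullets, for a logit with one independent variable. It maximizes the log-likelihood by Newton-Raphson, then computes the LR statistic against an intercept-only model and a Pseudo R2. Several things here are assumptions: the toy data are made up, the Pseudo R2 is McFadden's version (one common definition), and the p-value shortcut via `math.erfc` is only valid for 1 degree of freedom.

```python
import math

# Toy data (made up for illustration).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def prob(b1, b2, xi):
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * xi)))


def log_likelihood(b1, b2):
    return sum(
        yi * math.log(prob(b1, b2, xi)) + (1 - yi) * math.log(1.0 - prob(b1, b2, xi))
        for xi, yi in zip(x, y)
    )


def fit_logit(iterations=25):
    """Newton-Raphson for the two-parameter logit, starting from (0, 0)."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        p = [prob(b1, b2, xi) for xi in x]
        # Gradient of the log-likelihood.
        g1 = sum(yi - pi for yi, pi in zip(y, p))
        g2 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h11 = sum(w)
        h12 = sum(wi * xi for wi, xi in zip(w, x))
        h22 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h11 * h22 - h12 * h12
        # Newton step: add (information matrix)^-1 times the gradient.
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2


b1, b2 = fit_logit()
ll_full = log_likelihood(b1, b2)

# Intercept-only model: the fitted probability is the sample mean of y.
p_bar = sum(y) / len(y)
ll_null = sum(yi * math.log(p_bar) + (1 - yi) * math.log(1.0 - p_bar) for yi in y)

lr = 2.0 * (ll_full - ll_null)            # LR statistic, chi-square with k = 1 df here
pseudo_r2 = 1.0 - ll_full / ll_null       # McFadden's Pseudo R2 (assumed definition)
p_value = math.erfc(math.sqrt(lr / 2.0))  # chi-square upper tail, 1 df only

print("b1:", b1, "b2:", b2)
print("LR:", lr, "p-value:", p_value, "Pseudo R2:", pseudo_r2)
```

With more than one independent variable the same steps apply, but the gradient and information matrix grow accordingly and the LR test uses k degrees of freedom.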
Neural Networks
The alternate formulation is representative of a single-layer perceptron in an artificial neural network.
Probit
If we can assume that the dependent variable is actually the result of an underlying (and unobservable) propensity or utility, we can use the cumulative normal probability function to estimate a Probit model
It is also more appropriate if the categories (or their propensities) are likely to be normally distributed
It looks just like a logit model in practice
The Cumulative Normal Density Function
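The normal distribution is given by:
$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$
The Cumulative Normal Density Function is:
$$F(X_0) = \int_{-\infty}^{X_0} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}\, dX$$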
The Standard Normal CDF
We assume that there is an underlying threshold value ($I_i$): a case that exceeds it will be a 1, and 0 otherwise.
We can standardize and estimate this as
$$F(I_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{B_1 + B_2 X_i} e^{-z^2/2}\, dz$$
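A small sketch of the probit probability, using the standard normal CDF from Python's standard library, alongside a logit curve for comparison. The coefficients are hypothetical, and the scale factor of 1.7 applied to the logit coefficients is a common rule of thumb for lining the two curves up, used here as an assumption.

```python
import math
from statistics import NormalDist


def probit_prob(x, b1, b2):
    """P(Y = 1) = standard normal CDF evaluated at b1 + b2*x."""
    return NormalDist().cdf(b1 + b2 * x)


def logit_prob(x, b1, b2):
    """P(Y = 1) from the logistic function, for comparison."""
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))


b1, b2 = -1.0, 0.5  # hypothetical probit coefficients
scale = 1.7         # assumed rule-of-thumb ratio of logit to probit coefficients

# Compare the two curves over a grid of x values from -10 to 10.
grid = [i / 10 for i in range(-100, 101)]
max_gap = max(
    abs(probit_prob(xi, b1, b2) - logit_prob(xi, scale * b1, scale * b2))
    for xi in grid
)

print("probit P(Y=1) at x = 2:", probit_prob(2.0, b1, b2))
print("largest gap between probit and rescaled logit:", max_gap)
```

The gap between the two curves stays small everywhere on the grid, which is the sense in which a probit looks much like a logit in practice, apart from the scale of the coefficients.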
Probit estimates
Again, maximum likelihood estimation
Again, a Pseudo R2
Again, an LR test with k degrees of freedom
Assumptions of Models
All Y’s are in {0,1} set
They are statistically independent
No multicollinearity
The P(Yi=1) is the cumulative normal distribution function for probit, and the logistic function for logit
Ordered Probit
If the dependent variable can take on ordinal levels, we can extend the dichotomous Probit model to an n-chotomous, or ordered, Probit model
It simply has several threshold values estimated
Ordered logit works much the same way
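As a sketch of how the several thresholds are used: in the usual ordered probit setup, the probability of each ordinal category is the area under the standard normal curve between adjacent thresholds, shifted by the linear index. The function name `ordered_probit_probs`, the thresholds (-1 and 1) and the index values are all hypothetical.

```python
from statistics import NormalDist


def ordered_probit_probs(xb, cutpoints):
    """Category probabilities for linear index xb and increasing cutpoints.

    With m cutpoints there are m + 1 ordered categories:
    P(Y = j) = CDF(c_j - xb) - CDF(c_{j-1} - xb),
    with the CDF taken as 0 below the first cutpoint and 1 above the last.
    """
    cdf = NormalDist().cdf
    cumulative = [0.0] + [cdf(c - xb) for c in cutpoints] + [1.0]
    return [cumulative[j + 1] - cumulative[j] for j in range(len(cumulative) - 1)]


cutpoints = [-1.0, 1.0]  # hypothetical thresholds: three ordered categories

low = ordered_probit_probs(0.0, cutpoints)   # a case with index 0
high = ordered_probit_probs(1.0, cutpoints)  # a case with a larger index

print("xb = 0:", low)
print("xb = 1:", high)
```

Raising the linear index moves probability mass from the lowest category toward the highest. Swapping the normal CDF for the logistic function gives the ordered logit counterpart.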
Multinomial Logit
If our dependent variable takes on different values, but they are nominal, we use a multinomial logit model
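A minimal sketch of how a multinomial logit turns linear indexes into category probabilities, assuming the usual normalization in which one base category has its index fixed at 0. The function name `mnl_probs` and the index values are hypothetical.

```python
import math


def mnl_probs(linear_indexes):
    """Probabilities for the base category followed by each other category.

    linear_indexes[j] is the linear index (x'B_j) for non-base category j;
    the base category's index is fixed at 0 for identification.
    """
    scores = [0.0] + list(linear_indexes)
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical indexes for two non-base categories (three categories in all).
probs = mnl_probs([0.5, -1.0])
print(probs)
```

With only one non-base category this reduces to the ordinary logit probability.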
Some additional info
The modal category is a good benchmark
Present % correctly predicted
This can be calculated and presented. When compared to the share of cases in the modal category, it gives us a good indication of fit.
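A small sketch of this comparison. The outcomes and predicted probabilities are made up, and the 0.5 cutoff for turning a predicted probability into a predicted category is an assumption (a conventional one).

```python
from collections import Counter

# Made-up outcomes and predicted probabilities from some fitted model.
y = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]
p_hat = [0.1, 0.2, 0.1, 0.3, 0.6, 0.4, 0.8, 0.4, 0.2, 0.6]

cutoff = 0.5  # assumed classification threshold
predicted = [1 if p >= cutoff else 0 for p in p_hat]

# Share of cases the model classifies correctly.
pct_correct = sum(1 for yi, pi in zip(y, predicted) if yi == pi) / len(y)

# Benchmark: always guess the most common (modal) category.
modal_category, modal_count = Counter(y).most_common(1)[0]
modal_share = modal_count / len(y)

print("percent correctly predicted:", pct_correct)
print("modal category:", modal_category, "share:", modal_share)
```

A model is only doing useful work if its percent correctly predicted beats the modal share; here the gain is 80% against a 70% benchmark.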
Stata
Use Leadership Change data (1992 cross section): 1992-Stata
Test different models
Dependent variable: Leadership change
Examine distribution
tabulate ledchan1
Independent variables
Try different ones
Try corr and then pwcorr
Try the following
regress ledchan1 grwthgdp hlthexp illit_f polity2
logit ledchan1 grwthgdp hlthexp illit_f polity2
logistic ledchan1 grwthgdp hlthexp illit_f polity2
probit ledchan1 grwthgdp hlthexp illit_f polity2
ologit ledchan1 grwthgdp hlthexp illit_f polity2
oprobit ledchan1 grwthgdp hlthexp illit_f polity2
mlogit ledchan1 grwthgdp hlthexp illit_f polity2
tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll
Tobit
Assumes a 0 value, and then a scale
E.g., the decision to incarcerate
0 or 1 (Imprison or not)
If Imprison, then for how many years?
Other models
This leads to many other models
Count models & Poisson regression
Duration/Survival/hazard models
Censoring and truncation models
Selection bias models