Parametric EHA Models
Sociology 229A: Event History Analysis
Class 6
Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission
Announcements
• Assignment #4 due
• Assignment #5 handed out
• Class topic:
• Parametric EHA models
• More diagnostics: Outliers
Parametric Proportional Hazard Models
• Cox models do not specify a functional form for the hazard curve, h(t)
• Rather, they examine effects of variables net of a baseline hazard trend (to be inferred from the data)
• $h(t) = h_0(t)e^{\beta X} = h_0(t)\exp(\beta X)$
• Parametric models specify the general shape of the hazard curve
• Approach is more familiar – more like regression
– We can model Y as a constant, a linear function, a logit function, a binomial function (poisson), etc.
• For instance, we could assume h(t) was linear
– Then solve for the value of a hazard slope that best fits the data (plus effects of other covariates on hazard rate).
Parametric Proportional Hazard Models
• Parametric models work best when you choose a curve that fits the data
• Just like OLS regression – which works best when the relationship between two variables is roughly linear
• If the actual relationship between two variables is non-linear, coefficient estimates may be incorrect
– Though sometimes one can transform variables (e.g., logging them) to get a good fit…
– Parametric models are more efficient than Cox models
• They can generate more precise estimates for a given sample size
• But, they can also be more wildly incorrect if you mis-specify h(t)!
– Note: These are proportional hazard models – like Cox!
• You must still check the proportional hazard assumption.
Exponential (Constant Rate) Model
• Exponential models are simplest: $h(t) = e^{(a + b_1X_1 + b_2X_2 + \dots + b_nX_n)} = e^{(a + \beta X)}$
• Note that there is no “t” in the equation… no coefficient that specifies time dependence of the hazard rate
– Rather, there are just exponentiated BXs
– PLUS: a, the constant
• Note 2: Box-Steffensmeier & Jones: $h(t) = e^{-(\beta X)}$
• An exponential model solves for the constant value (a) that best fits the data…
• Along with values of Bs, which reflect effects of X vars
• In effect, the model assumes a constant hazard rate.
Exponential (Constant Rate) Model
• Another way of looking at it: An exponential model is a lot like a Cox model
• But, with the assumption that the baseline hazard is a constant!
Cox: $h(t) = h_0(t)e^{(\beta X)}$
Exponential: $h(t) = e^{(a + \beta X)} = e^{a}e^{(\beta X)}$
Exponential (Constant Rate) Model
• Basic Model. Constant reflects base rate.
streg gdp degradation education democracy ngo ingo, dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects      =           92            Number of obs   =      1938
No. of failures      =           77
Time at risk         =         1938
                                               Wald chi2(6)    =     94.29
Log pseudolikelihood =    282.11796            Prob > chi2     =    0.0000

------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   -.044568   .1842564    -0.24   0.809    -.4057039    .3165679
 degradation |  -.4766958   .1044108    -4.57   0.000    -.6813372   -.2720543
   education |   .0377531   .0130314     2.90   0.004     .0122121    .0632942
   democracy |   .2295392   .0959669     2.39   0.017     .0414475     .417631
         ngo |   .4258148   .1576803     2.70   0.007     .1167671    .7348624
        ingo |   .3114173    .365112     0.85   0.394    -.4041891    1.027024
       _cons |  -4.565513   1.864396    -2.45   0.014    -8.219663   -.9113642
------------------------------------------------------------------------------
Constant shows base hazard rate estimated from data:
exp(-4.57) = .01
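The arithmetic behind that base rate can be checked directly. This is a minimal sketch in Python (used here only for illustration; the slides themselves use Stata). The variable names are assumptions, and the constant is copied from the _cons row of the exponential output.

```python
import math

# Constant (_cons) from the exponential model output
cons = -4.565513

# With all X variables at zero, h(t) = exp(a): a flat baseline hazard
baseline_hazard = math.exp(cons)

print(f"baseline hazard = {baseline_hazard:.4f}")
```

Rounded to two decimals this is the .01 reported on the slide.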
Exponential (Constant Rate) Model
• Suppose we plotted the baseline hazard rate estimated from our exponential model
• It would be a flat line: h(t) = .01
– This is the estimated hazard if all X vars are zero
• If we plotted the estimated hazard for some values of X (ex: democracy = 10), we would get a higher value
– Since democracy has a positive effect, Democ = 10 would yield a higher hazard than democ = 0
– But, again, the estimated hazard rate trend would be a flat line over time…
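A small worked example of that comparison, again as a Python sketch rather than Stata. It uses the democracy coefficient and constant from the exponential output and, as a simplifying assumption, sets all other X variables to zero (stcurve would normally hold them at their means, so the levels will differ from a plotted curve). The function name is hypothetical.

```python
import math

# Coefficients copied from the exponential model output
cons = -4.565513
b_democracy = 0.2295392

def hazard(democracy):
    """Estimated hazard with democracy set to a value and all other X vars at zero."""
    return math.exp(cons + b_democracy * democracy)

h0 = hazard(0)    # baseline
h10 = hazard(10)  # democracy = 10

# The exponential model has no t term, so these rates are the same in every year
for year in (1970, 1985, 2000):
    print(year, round(h0, 4), round(h10, 4))

print("hazard ratio, democracy 10 vs 0:", round(h10 / h0, 2))
```

The ratio between the two flat lines is exp(10 × .2295), whatever the year.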
Exponential Model: Baseline Hazard
• Ex: stcurve, hazard
[Figure: Exponential regression, hazard function plotted against analysis time, 1970 to 2000]
See, the estimated baseline hazard really is flat!
Exponential Model: Estimated Hazard
• stcurve, hazard at1(democ=1) at2(democ=10)
[Figure: Exponential regression, estimated hazard functions for democracy=1 and democracy=10 plotted against analysis time, 1970 to 2000; y-axis from .05 to .3]
Here are estimated hazards for 2 groups
Other vars pegged at mean
Exponential Model: Baseline Hazard
• Issue: Actual hazard is rising. A problem?
[Figure: smoothed hazard estimate plotted against analysis time, 1970 to 2000; y-axis from 0 to .1]
Is an exponential model appropriate?
Answer:
It can be, IF we have X variables that account for increasing hazard
If not, fit will be poor!
Exponential (Constant Rate) Model
• Cleves et al. 2004, p. 216:
• In the exponential model, h(t) being constant means that the failure rate is independent of time, and thus the failure process is said to lack memory.
• You may be tempted to view exponential regression as suitable for use only in the simplest of cases. This would be unfair. There is another sense in which the exponential model is the basis for all other models.
• The baseline hazard… is constant … the way in which the overall hazard varies is purely a function of X. The overall hazard need not be constant with time; it is just that every bit of how the hazard varies must be specified in BX. If you fully understand a process, you should be able to do that.
• When you do not understand a process, you are forced to assign a role to time, and in that way, you hope, put to the side your ignorance and still describe the part of the process that you do understand.
• In addition, exponential models can be used to model the overall hazard as a function of time, if they include t or functions of t as covariates.
Exponential (Constant Rate) Model
• The exponential model is extremely flexible…
• You specify substantive covariates (X variables) to explain failures
– It is probably not due to some inherent feature of time, but rather due to some variable that you hope to control for
– If you do a great job, you will fully explain why hazard rate appears to go up (or down) over time
• And, you can include functions of time as independent variables to address temporal variation
– Independent (X) variables can include time dummies, log time, linear time, time interactions, etc.
– That is, if you can’t explain time variation with substantive X variables, you can add time variables to model it
• But, if you mis-specify your model, results will be biased
– In that case, you might be better off with a Cox model…
Piecewise Exponential Model
• If you have a lot of cases, you can estimate a piecewise model
– Essentially a separate model for different chunks of time
• Model will yield different coefficients and base rate (constant) for multiple chunks of time
• Even if hazard is not constant over time, it may be more or less constant in each period
– This allows you to effectively model any hazard trend
– A related approach: Put in time-period dummies
• This gives a single set of bX coefficient estimates
• But, allows you to specify changes in the hazard rate over different periods
– NOTE: Don’t forget to omit one of the time dummies!
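To make the piecewise idea concrete, here is a minimal Python sketch of the simplest case, with no covariates. It relies on the standard result that the maximum likelihood estimate of a constant hazard is failures divided by time at risk, applied separately within each period. The period labels and counts are hypothetical and are not from the course data.

```python
# Hypothetical counts per period: (label, failures, time at risk)
periods = [
    ("period 1", 4, 400.0),
    ("period 2", 10, 500.0),
    ("period 3", 30, 600.0),
]

def constant_rate(failures, time_at_risk):
    """MLE of a constant hazard with no covariates: failures / time at risk."""
    return failures / time_at_risk

# One constant rate per chunk of time
piecewise_rates = {label: constant_rate(d, t) for label, d, t in periods}

# A single exponential model would instead pool everything into one rate
total_failures = sum(d for _, d, _ in periods)
total_time = sum(t for _, _, t in periods)
pooled_rate = constant_rate(total_failures, total_time)

for label, rate in piecewise_rates.items():
    print(label, round(rate, 3))
print("pooled", round(pooled_rate, 3))
```

In this made-up example the hazard rises across periods, which the single pooled rate hides. A full piecewise model would also estimate covariate effects within each period.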
Parametric Models
• Let’s try a more complex parametric model
• Example: Let’s specify a linear time trend
Linear: $h(t) = (a + \beta_0 t)e^{(\beta X)}$
Exponential: $h(t) = e^{(a + \beta X)} = e^{a}e^{(\beta X)}$
• In this case, we estimate a constant (a) and slope ($\beta_0$) which best summarize the time dependence of the hazard rate
• Note: this isn’t common – we have better options…
Gompertz Models
• Another option: an exponentiated line
• Rather than a linear function of time and exponentiated function of X, we’ll exponentiate everything:
Exponentiated Linear (Gompertz): $h(t) = e^{(a + \beta_0 t + \beta X)} = e^{(a + \gamma t)}e^{(\beta X)}$
• Slope coefficient is often represented by gamma: $\gamma$
• Note: Exponentiation alters the line… it isn’t a simple linear function anymore.
– It is flat if gamma = 0
– It is monotonically increasing if gamma > 0
– It is monotonically decreasing if gamma < 0
Gompertz Models
• Exponentiating a linear function generates a curve defined by the value of gamma (γ)
• Model estimates the value of γ that best fits the data
[Figure: Gompertz hazard curves for γ = 0, γ < 0, γ > 0, and γ >> 0]
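The effect of gamma on the curve's shape can be checked numerically. This Python sketch evaluates the Gompertz hazard at a few time points. The constant, the gamma values, and the time grid are hypothetical, chosen only to show the flat, rising, and falling cases.

```python
import math

def gompertz_hazard(t, a, gamma, xb=0.0):
    """Gompertz hazard: h(t) = exp(a + gamma*t) * exp(xb)."""
    return math.exp(a + gamma * t) * math.exp(xb)

a = -3.0  # hypothetical constant
times = [0, 5, 10, 15, 20]

for gamma in (-0.1, 0.0, 0.1):  # hypothetical shape values
    curve = [gompertz_hazard(t, a, gamma) for t in times]
    print(f"gamma={gamma:+.1f}:", [round(h, 4) for h in curve])
```

With gamma = 0 the function reduces to the exponential (constant rate) model.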
Gompertz Model
• Example: streg gdp degradation education democracy ngo ingo, robust nohr dist(gompertz)

Gompertz regression -- log relative-hazard form

No. of subjects      =           92            Number of obs   =      1938
No. of failures      =           77
Time at risk         =         1938
                                               Wald chi2(6)    =     46.48
Log pseudolikelihood =    307.64758            Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4633559   .2104244     2.20   0.028     .0509316    .8757802
 degradation |  -.4394712   .1434178    -3.06   0.002     -.720565   -.1583775
   education |   .0026837   .0145341     0.18   0.854    -.0258026      .03117
   democracy |   .2890106    .092612     3.12   0.002     .1074943    .4705268
         ngo |   .2522894   .1658275     1.52   0.128    -.0727265    .5773054
        ingo |   .0037688   .2275176     0.02   0.987    -.4421575    .4496952
       _cons |   -253.035   45.28363    -5.59   0.000    -341.7892   -164.2807
-------------+----------------------------------------------------------------
       gamma |    .124117   .0224506     5.53   0.000     .0801146    .1681195
------------------------------------------------------------------------------
Model estimates gamma to be positive, significant. Implies increasing baseline hazard
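One way to read the size of gamma: under the Gompertz form, moving one unit forward in time multiplies the hazard by exp(gamma). The Python sketch below applies that to the estimate in the output, assuming analysis time is measured in years (as the 1970 to 2000 axis on the plots suggests). Variable names are assumptions.

```python
import math

gamma = 0.124117  # estimated gamma from the Gompertz output

# Under the Gompertz form, h(t + 1) / h(t) = exp(gamma)
per_year = math.exp(gamma)
# Over ten time units the baseline hazard is multiplied by exp(10 * gamma)
per_decade = math.exp(10 * gamma)

print(f"baseline hazard multiplier per year:   {per_year:.3f}")
print(f"baseline hazard multiplier per decade: {per_decade:.2f}")
```

So the estimated baseline hazard grows by roughly 13% per year, net of the covariates.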
Gompertz Model: Estimated Hazard
• stcurve, hazard at1(democ=1) at2(democ=10)
[Figure: Gompertz regression, estimated hazard functions for democracy=1 and democracy=10 plotted against analysis time, 1970 to 2000; y-axis from 0 to 4]
Estimated hazards for 2 groups
Other vars pegged at mean
Note: curves are actually proportional – hard to see because bottom curve is nearly zero…
Weibull Models
• Another option: the Weibull curve
• Another curve that can fit monotonic hazards
Weibull: $h(t) = p\,t^{p-1}e^{(a + \beta X)}$
• Model estimates p to best fit the data
– Hazard is flat if p = 1
– Hazard is monotonically increasing if p > 1
– Hazard is monotonically decreasing if p < 1.
Weibull: Visually
• The Weibull family: Monotonic increasing or decreasing, depending on p
[Figure: Weibull hazard rate plotted against time for p = .5, p = 1, p = 2, and p = 4]
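The four curves can be reproduced numerically. This Python sketch evaluates the Weibull hazard for the p values shown in the figure. The time grid is hypothetical, and the constant and covariate terms are set to zero for simplicity.

```python
import math

def weibull_hazard(t, p, a=0.0, xb=0.0):
    """Weibull hazard: h(t) = p * t**(p - 1) * exp(a + xb), for t > 0."""
    return p * t ** (p - 1) * math.exp(a + xb)

times = [0.5, 1.0, 1.5, 2.0]
for p in (0.5, 1.0, 2.0, 4.0):  # the shape values shown in the figure
    print(f"p={p}:", [round(weibull_hazard(t, p), 3) for t in times])
```

With p = 1 the t term drops out and the hazard is constant, which is the exponential model.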
Weibull Model
• Example: streg gdp degradation education democracy ngo ingo, robust nohr dist(weibull)

Weibull regression -- log relative-hazard form

No. of subjects =           92                 Number of obs   =      1938
No. of failures =           77
Time at risk    =         1938
                                               LR chi2(6)      =     23.71
Log likelihood  =     307.6045                 Prob > chi2     =    0.0006

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4631871   .2360589     1.96   0.050     .0005202    .9258541
 degradation |  -.4396978   .1486662    -2.96   0.003    -.7310781   -.1483175
   education |   .0027319   .0141652     0.19   0.847    -.0250314    .0304953
   democracy |    .288927   .0913855     3.16   0.002     .1098147    .4680394
         ngo |   .2522595   .1610192     1.57   0.117    -.0633324    .5678514
        ingo |    .004058   .1835743     0.02   0.982     -.355741     .363857
       _cons |  -1884.071   280.0398    -6.73   0.000    -2432.939   -1335.203
-------------+----------------------------------------------------------------
       /ln_p |   5.511481   .1486542    37.08   0.000     5.220124    5.802837
-------------+----------------------------------------------------------------
           p |   247.5173   36.79449                      184.9571    331.2381
         1/p |   .0040401   .0006006                       .003019    .0054067
------------------------------------------------------------------------------
Parametric: Model Fit
• Parametric models use maximum likelihood estimation (MLE)
• Comparisons among nested models can be made using a likelihood ratio test (LR test)
• Just like logit: Addition of groups of variables can be tested with lrtest
– Some parametric models are themselves nested
• Ex: A Weibull model simplifies to an exponential model if p = 1
– Thus, exponential is nested within Weibull
• LR tests can be used to see if Weibull is preferable to exponential.
Parametric Model Fit: AIC
• Non-nested parametric models can be compared via the Akaike Information Criterion
$AIC = -2\ln(L) + 2(k + c)$
• k = # independent variables in the model
• c = # shape parameters in model (ex: p in Weibull)
– Exponential has one parameter (a); Weibull has 2.
• AIC compares likelihoods, but corrects for parameters in the model – rewarding simpler models…
• Low values = better model fit
– Even for negative values… -100 is better than -50.
Parametric: Model Fit
• How do you know which model fits best?
• 1. Look at the shape parameter
• Weibull: p, Gompertz: gamma
• If gamma is near zero or p near 1, they aren’t improving on fit compared to an exponential model
• 2. Conduct a likelihood ratio test
• For nested models only
• 3. Compare fit statistics: AIC
• Run models, then request “estat ic”
• Lower values = better.
Likelihood Ratio Test
• Ex: Compare Gompertz to exponential
– Likelihood ratio test
• Run full model (weibull or gompertz)
• estimates store fullmodel
• Run base model
• estimates store basemodel
• lrtest fullmodel basemodel, force

. lrtest gompertz exponential, force

Likelihood-ratio test                                  LR chi2(1)  =     51.06
(Assumption: exponential nested in gompertz)           Prob > chi2 =    0.0000
Significant effect indicates that full model (Gompertz) fits better than exponential
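The test statistic can be reproduced by hand from the two log pseudolikelihoods reported in the Gompertz and exponential outputs. This is a Python sketch, not Stata's own computation. Variable names are assumptions, and the p-value uses the closed form for a chi-squared variable with 1 degree of freedom.

```python
import math

# Log pseudolikelihoods reported in the Gompertz and exponential outputs
ll_gompertz = 307.64758
ll_exponential = 282.11796

# LR statistic: twice the difference in log likelihoods
lr_chi2 = 2 * (ll_gompertz - ll_exponential)

# Upper-tail probability of a chi-squared variable with 1 degree of freedom:
# P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(lr_chi2 / 2))

print(f"LR chi2(1) = {lr_chi2:.2f}, p = {p_value:.2g}")
```

The single degree of freedom corresponds to the one extra parameter (gamma) in the Gompertz model.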
Parametric: Model Fit
• AIC: Weibull, Gompertz, Exponential
• Request “estat ic” after each model is run

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
     weibull |   1938    295.7504    307.6045      8     -599.209   -554.6537
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
    gompertz |   1938    295.7926    307.6476      8    -599.2952   -554.7399
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
 Exponential |   1938    259.5519     282.118      7    -550.2359     -511.25
-----------------------------------------------------------------------------
AIC Results: Lower = better. Gompertz & Weibull fit better than Exponential; Little difference between Gompertz/Weibull.
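The AIC column can be reproduced from ll(model) and df using the formula AIC = -2 ln(L) + 2(k + c), where df is k + c. A Python sketch, with the function and dictionary names as assumptions and the inputs copied from the “estat ic” tables:

```python
def aic(log_likelihood, n_params):
    """AIC = -2 ln(L) + 2(k + c), where n_params = k + c."""
    return -2 * log_likelihood + 2 * n_params

# ll(model) and df as reported by "estat ic"
models = {
    "weibull": (307.6045, 8),
    "gompertz": (307.6476, 8),
    "exponential": (282.118, 7),
}

aic_values = {name: aic(ll, df) for name, (ll, df) in models.items()}

# Lowest AIC first
for name, value in sorted(aic_values.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {value:.4f}")
```

Small differences in the last digit for the exponential model come from the log likelihood being displayed to only three decimals.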
Ancillary Parameters
• Gompertz & Weibull models have parameters that determine the shape of the curve
• Gamma (γ), p
• Ex: Bigger γ = greater increase of h(t) over time
– You can actually specify covariate effects on those parameters
• Effectively allowing a different curve shape across values of X variables
• Ex: If you think that hazard increases more for men than women, you can look to see if Dmale affects γ
– streg male educ, dist(gompertz) ancillary(male)
– Model estimates effect of male on hazard AND on gamma…
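One way to write out what the male/educ example above estimates: the shape parameter itself becomes a linear function of the ancillary covariate. This is a sketch under the assumption that gamma is modeled linearly; $\gamma_0$, $\gamma_1$, $\beta_1$ and $\beta_2$ are labels introduced here for illustration.

```latex
h(t) = e^{(a + \beta_1 \text{male} + \beta_2 \text{educ})} \, e^{\gamma t},
\qquad
\gamma = \gamma_0 + \gamma_1 \, \text{male}
```

Women (male = 0) get time slope $\gamma_0$ and men get $\gamma_0 + \gamma_1$, so a positive $\gamma_1$ would mean the hazard rises faster over time for men.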
Choosing a Hazard Model
• A Cox model is a good starting point
• Fewer problems due to accidental mis-specification of the time-dependence of the hazard rate
• Box-Steffensmeier & Jones cite work showing that Cox models are 95% as efficient as parametric models under many circumstances
– Cox models treat time dependence as a “nuisance”, put the focus on substantive covariates
• Which is often desirable.
Choosing a Hazard Model
• Parametric models are good when
• 1. You have strong theoretical expectations about the hazard rate
• 2. You are confident that you can fit the time dependence well with a parametric model
• 3. You need the most efficient estimates possible
• AGAIN: Substantive model specification is typically more important
• Biases due to omitted variables are often greater than biases due to poor model choice (e.g., Cox vs. Weibull)
• Also: In small samples, outliers are likely to be more important.
PH Assumption
• Models discussed today are proportional hazard models…
• Require the same assumption as Cox models
• But, most of the “tests” of proportionality are only available in Cox models
• But: You can still use piecewise models and interaction terms to check the assumption.
Residuals in EHA
• OLS regression: Residuals = difference between predicted value of Y and observed
• Y-hat – Yi
• EHA: Residuals are more complicated
• You could compute predicted failure minus observed…
• But, what about censored cases? What is observed?
• There are a number of different ways to calculate residuals… each with different properties.
Residuals – Summary
• From Cleves et al. (2004) An Introduction to Survival Analysis Using Stata, p. 184:
• 1. Cox-Snell residuals
• … are useful for assessing overall model fit
• 2. Martingale residuals
• Are useful in determining the functional form of the covariates to be included in the model
• 3. Schoenfeld residuals (scaled & unscaled), score residuals, and efficient score residuals
• Are useful for checking & testing the proportional hazard assumption, examining leverage points, and identifying outliers
• NOTE: A residual is produced for each independent variable…
• 4. Deviance residuals
• Are useful in examining model accuracy and identifying outliers.
Martingale/Deviance Residuals: Outliers
• Martingale residuals: difference over time of observed failures minus expected failures
• Feature: range from +1 to –infinity
– Deviance residuals = martingale residuals that are rescaled to be symmetric around zero
• Easier to interpret
• Extreme martingale or deviance residuals may indicate outliers
• Plot residuals vs. time, case number, IVs, etc.
• Or simply sort data by residuals & list the cases.
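The rescaling from martingale to deviance residuals can be sketched in a few lines of Python, using the standard formula D = sign(M) · sqrt(-2[M + δ ln(δ - M)]), where δ is the failure indicator. The function names and the example cases are hypothetical; in practice the cumulative hazard would come from the fitted model.

```python
import math

def martingale_residual(failed, cum_hazard):
    """Observed failures (0 or 1) minus expected failures (cumulative hazard)."""
    return failed - cum_hazard

def deviance_residual(failed, cum_hazard):
    """Rescale a martingale residual so values are more symmetric around zero."""
    m = martingale_residual(failed, cum_hazard)
    # The log term only applies to failures; for censored cases it is zero
    log_term = failed * math.log(failed - m) if failed else 0.0
    # max() guards against tiny negative values caused by floating-point rounding
    magnitude = math.sqrt(max(0.0, -2 * (m + log_term)))
    return math.copysign(magnitude, m)

# Hypothetical cases: (failed, estimated cumulative hazard)
cases = [(1, 0.05), (1, 1.0), (0, 0.5), (0, 2.0)]
for failed, cum_hazard in cases:
    m = martingale_residual(failed, cum_hazard)
    d = deviance_residual(failed, cum_hazard)
    print(failed, cum_hazard, round(m, 3), round(d, 3))
```

The first case, a failure the model considered very unlikely, has a martingale residual just under +1 but a deviance residual above 2, which is why deviance plots make such cases easier to spot.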
Martingale & Deviance Residuals: Outliers
• Stata code to identify outliers:
* Run Cox model, calculate martingale residuals
stcox var1 var2 var3, robust nohr mgale(mg)
* Creates variable “mg” which contains martingale residuals
* Next, compute deviance residuals using “predict”
predict dev, deviance
gen caseid = _n
* Create plots of various types
scatter mg caseid
* Deviance residual plots are generally easier to interpret
scatter dev caseid, mlabel(newname2)
Deviance Residuals Plot
• Extreme values may be outliers
[Figure: scatterplot of deviance residuals (y-axis, -2 to 2) against caseid (x-axis, 0 to 3000), with points labeled by country name]
Here, no obvious outliers are visible
Scaled Schoenfeld Residuals: Outliers
• Stata code to identify outliers:
* Run Cox model, calculate residuals
stcox var1 var2 var3, nohr schoenfeld(sch*) scaledsch(sca*)
* Creates variables containing Schoenfeld & scaled Schoenfeld
* residuals… labeled sch1, sch2, sch3… and sca1, sca2, sca3… respectively
gen caseid = _n
* Create plots of various types
scatter sca1 caseid, mlabel(caselabel)
scatter sca2 caseid, mlabel(caselabel)
scatter sca3 caseid, mlabel(caselabel)
* … repeat for as many X variables as you have in the model
Scaled Schoenfeld Residuals: Plot
• A set of residuals is created for each X var
[Figure: scatterplot of scaled Schoenfeld residuals for gdp (y-axis, -5 to 10) against caseid (x-axis, 0 to 3000), with points labeled by country name]
Not too bad, but Latvia is a bit suspicious…
Scaled Schoenfeld Residuals: Plot
• Here is a plot for a different X var: INGOs…
[Figure: scatterplot of scaled Schoenfeld residuals for ingo (y-axis, -30 to 10) against caseid (x-axis, 0 to 3000), with points labeled by country name]
This can’t be good!
Efficient Score Residuals: Influential Cases
• Procedure for identifying outliers using ESRs
• It is possible to compute DFBETAs based on ESRs
• DFBETA: Change in a variable’s coefficient due to a particular case in the analysis
– Cases with big DFBETAs may be overly influential
– Issue: Stata cannot automatically compute DFBETAs…
• You have to compute them manually
• Also, computation = limited to 800 cases (for “intercooled stata”)
• Hopefully Stata will improve this in the future.
ESRs: Influential Cases
• Stata code to estimate DFBETAs:
* Run Cox model, request efficient score residuals
* Creates vars: esr1 to esr5 corresponding to vars listed in model
stcox gdp var1 var2 var3 var4, robust nohr esr(esr*)
* Create room for a matrix of up to 800 rows (for your cases)
set matsize 800
* Create esr matrix: one column per variable in the model, so it conforms with e(V)
mkmat esr1 esr2 esr3 esr4 esr5, matrix(esr)
* Multiply ESRs and Var/Cov matrix to estimate DFBETAs, save results
mat V=e(V)
mat Inf = esr*V
svmat Inf, names(s)
* Label estimates for subsequent plots (s1 corresponds to gdp, the first variable listed)
label var s1 "dfbeta – gdp"
label var s2 "dfbeta – var 1"
label var s3 "dfbeta – var 2"
label var s4 "dfbeta – var 3"
label var s5 "dfbeta – var 4"
* Plot DFBETAs for each variable vs. time or case number
scatter s1 _t, yline(0) mlab(caseID) s(i)
scatter s1 casenumber, yline(0) mlab(caseID) s(i)
* Look for extreme values (for each IV – s1 to s5)
DFBETA Example
• DFBETA for NGOs (plotted by casenumber)
[Figure: scatterplot of DFBETA for ngo (y-axis, -.05 to .1) against casenumber (x-axis, 0 to 800), with points labeled by country name]
DFBETA value indicates that presence of Latvia changes NGO coefficient by +.075 standard deviations
Outliers
• Cox Model: change due to removal of outlier

------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4572288   .2025104     2.26   0.024     .0603157    .8541419
 degradation |  -.4311475   .1131853    -3.81   0.000    -.6529867   -.2093083
   education |   .0027517   .0136965     0.20   0.841     -.024093    .0295964
   democracy |   .2836321   .0911985     3.11   0.002     .1048862    .4623779
         ngo |   .2874221   .1614045     1.78   0.075    -.0289248     .603769
        ingo |   -.026845   .2391101    -0.11   0.911    -.4954922    .4418021
------------------------------------------------------------------------------

RESULTS WITH LATVIA REMOVED:

------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .3654458   .2031124     1.80   0.072    -.0326471    .7635388
 degradation |  -.4472621   .1110395    -4.03   0.000    -.6648956   -.2296286
   education |  -.0002829   .0141668    -0.02   0.984    -.0280494    .0274837
   democracy |   .2715732   .0904942     3.00   0.003     .0942078    .4489385
         ngo |   .2245402   .1644891     1.37   0.172    -.0978526     .546933
        ingo |   .2735146    .200823     1.36   0.173    -.1200912    .6671204
------------------------------------------------------------------------------
Removing Latvia changes things…
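To see how much things change, the coefficients from the two tables can be differenced directly. A Python sketch, with the dictionary names as assumptions and the values copied from the Cox model results with and without Latvia:

```python
# Coefficients copied from the two Cox model tables
with_latvia = {
    "gdp": 0.4572288, "degradation": -0.4311475, "education": 0.0027517,
    "democracy": 0.2836321, "ngo": 0.2874221, "ingo": -0.026845,
}
without_latvia = {
    "gdp": 0.3654458, "degradation": -0.4472621, "education": -0.0002829,
    "democracy": 0.2715732, "ngo": 0.2245402, "ingo": 0.2735146,
}

# Change in each coefficient when Latvia is dropped
changes = {name: without_latvia[name] - with_latvia[name] for name in with_latvia}

# Largest absolute change first
for name, change in sorted(changes.items(), key=lambda item: -abs(item[1])):
    print(f"{name:12s} {with_latvia[name]:+.4f} -> {without_latvia[name]:+.4f}  (change {change:+.4f})")
```

The ingo coefficient moves the most and changes sign, in line with the extreme scaled Schoenfeld residual seen in the INGO plot. The ngo coefficient falls by about .06.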
Reading Discussion
• Empirical Example: Schofer, Evan. 2003. “The Global Institutionalization of Geological Science, 1800-1990.” American Sociological Review, 68 (Dec): 730-759.