Event Studies
Kaushik Krishnan¹
February 11, 2017

¹ Plagiarised from Pat Kline and Dave Card
7 January 2009
Ramalinga Raju confessed to an accounting fraud to the tune of USD 1.47 bn.
Their auditor was PwC.
What happened to other companies audited by PwC?
Satyam Effect
Figure 1: Satyam
Today
1. Econometrics from 60k ft refresher
2. Econometrics from 550 ft refresher
3. Difference in Differences
4. Event Studies
5. Event Studies in R
6. Replicate Satyam Event Study
Econometrics Refresher
The progress of the field, in rough chronological order:
- descriptive modeling
- causal modeling
- ‘prediction’
Descriptive Modeling
We want to summarize the relationship between some outcome y and some other variables x = (x1, x2, ..., xJ).

- Not trying to measure the causal effect of x on y.
- Only trying to take into account that y may be strongly related to some x's and only weakly related to others.
- Our benchmark: E[y|x]
E[y|x]

- We try to approximate the CEF with a linear "regression function".
- When we say that, we can mean two things:
  - "population regression": the function we could estimate with ∞ data
  - "sample regression": the function we can actually estimate on a given sample
Causal Modeling
Often, a descriptive analysis is not enough. Many debates in economics amount to disputes over the question:
“does x cause y?”
A very empirical notion of causality:
x causes y if, in an idealised experiment, we could manipulate x, leaving other factors constant, and observe that the mean of the distribution of outcomes of y has changed.
The Observability Problem
We need to be able to see two things:
- the distribution of y when x is manipulated (the "treatment")
- the distribution of y in the absence of manipulation (the "counterfactual")
Unfortunately, we cannot see both at the same time!
Solving the Observability Problem
We need a way to infer the counterfactual for the units that are treated.
Possible ideas
1. Observational Design – calculate mean outcomes for people who are treated and those who are not.
2. Pre-Post Design – compare outcomes for people who are treated with their outcomes prior to treatment.
3. RCT – randomly assign treatment, calculate mean outcomes for T's and C's.
How does this apply to Satyam?
Quick Regression Algebra Refresher
I said earlier that we want to find E[y|x]. Why?
1. We can always write:

   y_i = E[y_i|x_i] + ε_i,  where E[ε_i|x_i] = 0.

2. argmin_{m(x_i)} E[(y_i − m(x_i))²] = E[y_i|x_i]

Thus, estimating E[y_i|x_i] is our best shot at explaining the relationship between y and x.
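A sketch of the standard argument for fact 2: for any candidate function m, expand the mean squared error around the CEF,

E[(y_i − m(x_i))²] = E[(y_i − E[y_i|x_i])²] + E[(E[y_i|x_i] − m(x_i))²]

because the cross term has mean zero by the law of iterated expectations. Only the second term depends on m, and it is minimised by setting m(x_i) = E[y_i|x_i].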
E[y|x] can be an unwieldy object

OLS minimises 2 on the previous slide with the additional imposition of a linear CEF – the "Population Regression Function" (PRF):

β* = argmin_β E[(y_i − x_i'β)²]

The FOC of which is:

E[x_i(y_i − x_i'β*)] = 0

And with some work, we can see that:

β* = E[x_i x_i']⁻¹ E[x_i y_i]
The sample equivalent

β̂ = argmin_β (1/N) Σ_{i=1}^{N} (y_i − x_i'β)²

and

β̂ = [ (1/N) Σ_{i=1}^{N} x_i x_i' ]⁻¹ [ (1/N) Σ_{i=1}^{N} x_i y_i ]

which can be shown to be a "good" approximation of β*.
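A quick check of the sample formula on simulated data (a sketch; the data and variable names below are made up, not lecture data):

set.seed(1);
N <- 1000;
x <- cbind(1, rnorm(N), rnorm(N));            # regressors, including a constant
y <- drop(x %*% c(1, 2, -0.5) + rnorm(N));    # simulated outcome

# [ (1/N) sum x_i x_i' ]^-1 [ (1/N) sum x_i y_i ] -- the 1/N factors cancel
beta.hat <- solve(crossprod(x), crossprod(x, y));
cbind(beta.hat, coef(lm(y ~ x - 1)));         # matches lm() up to numerical precision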
Useful Facts
1. If E[y_i|x_i] = x_i'β_e then β* = β_e (PRF = CEF).
2. x_i'β* is the best linear approximation to E[y_i|x_i].
3. If your covariates are just indicator variables, e.g.

   x_i' = (1, D_{1i}, D_{2i})

   then

   E[y_i|x_i] = µ_0 + D_{1i}(µ_1 − µ_0) + D_{2i}(µ_2 − µ_0)

   and the PRF fits the mean of each group exactly! (A small simulated check follows.)
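A small simulated illustration of fact 3 (hypothetical data, not from the lecture): regressing on group dummies recovers the group means exactly.

set.seed(2);
g  <- sample(0:2, 300, replace = TRUE);     # three groups: 0, 1, 2
y  <- c(1, 3, 5)[g + 1] + rnorm(300);
D1 <- as.numeric(g == 1);
D2 <- as.numeric(g == 2);

coef(lm(y ~ D1 + D2));      # (mu_0, mu_1 - mu_0, mu_2 - mu_0)
tapply(y, g, mean);         # the group means themselves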
Difference in Differences Framework
Assume:

y_{it} = α_i + δ_t + D_{it}θ + ε_{it}

- α_i, a person effect
- δ_t, a time trend
- D_{it}, some event of interest
- ε_{it}, an error term
We are interested in θ.
Example: Housing Prices and Cancer Clusters (Davis 2004)

- A cancer cluster is discovered in Churchill County in 2000 (D)
- Nothing discovered in Lyon County or the State of Nevada
Figure 2: event
Trick
y_{i2} − y_{i1} ≡ Δy_i = δ_{2−1} + ΔD_{i2}θ + Δε_i

- We are now in familiar territory
- ΔD_{i2} is just a dummy for membership in the treatment region
- What do we know about regressing on group dummy membership?
- θ̂ = Δȳ_treated − Δȳ_control
- This is the difference of differences (a simulated sketch follows below)
- Can generalise to a case with controls easily
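A simulated two-period sketch of the trick (hypothetical data, not from the lecture): regressing Δy on the treatment-group dummy reproduces the difference of differences in means.

set.seed(3);
n  <- 200;
D  <- rep(0:1, each = n / 2);                # treatment-group membership
y1 <- rnorm(n);                              # period 1 outcome
y2 <- y1 + 0.5 + 1.0 * D + rnorm(n);         # period 2: common trend of 0.5, treatment effect of 1
dy <- y2 - y1;

coef(lm(dy ~ D))["D"];                       # regression estimate of theta
mean(dy[D == 1]) - mean(dy[D == 0]);         # the difference of differences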
DD in Action
Figure 3: diff
Very Compelling, But Why?
- Even though the treatment and control groups were not stationary, the difference was
- Time series econometrics fact: if two series are cointegrated then a linear combination of them is stationary. Hence, the difference between treatment and control yields a series centered around zero prior to treatment
- Ten years prior to treatment, the difference was relatively stable: i.e., they shared the same long run mean
- The goal of any DD analysis should be to reduce the data to a picture exhibiting such stable pre-treatment behaviour
- The change in relative outcomes in 2000 is bigger than the change in relative outcomes in all previous periods
Where Does DD Fail?

- Earnings of training program applicants "dipped" prior to enrolment (maybe why they enrolled in the first place)
- Match treatment and control units based on pre-treatment covariates to reduce the dip? Maybe
Figure 4: ashenfelter
How To Code DD
- As with 2SLS, it is best not to compute the differences yourself
- Instead, run a version of this regression:

  Y_{it} = αD_i + γPost_t + β(D_i × Post_t) + X_{it}'φ + ε_{it}

  where:
  - D_i is an indicator for being in the treatment group
  - Post_t is an indicator for the post-treatment period
  - β is the coefficient of interest
- Remember: cluster at the level of the unit to which treatment is assigned (county in Davis' case). A sketch of this regression in R follows.
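A sketch of that regression in R, assuming a data frame dd with columns y, treat, post and a cluster identifier county (these names are hypothetical), and assuming the sandwich and lmtest packages are available for clustered standard errors:

library(sandwich);
library(lmtest);

dd.fit <- lm(y ~ treat * post, data = dd);                        # treat:post is the DD coefficient beta
coeftest(dd.fit, vcov. = vcovCL(dd.fit, cluster = ~county));      # cluster at the level of treatment assignment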
Event Studies
- Generalisation of DD where different units are treated at different times
- Basic idea: reorder the panel in event time
- In financial applications, the dependent variable is excess returns – the deviation of a stock's return from the level implied by some market index (Campbell et al 1997); see the sketch after this list
- In other applications, we need to work a little harder to remove the predictable component of the outcome
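A minimal sketch of the financial-application case (assuming a hypothetical data frame d with daily returns r.stock and r.market, an estimation-window flag est and an event-window flag event.window; not the Satyam data):

mkt.model  <- lm(r.stock ~ r.market, data = d[d$est == 1, ]);   # market model fit on the estimation window
d$abnormal <- d$r.stock - predict(mkt.model, newdata = d);      # excess (abnormal) returns
CAR <- sum(d$abnormal[d$event.window == 1]);                    # cumulative abnormal return over the event window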
Classic Example
- Jacobson, LaLonde and Sullivan (1993) (JLS)
- Effect of job loss on earnings
- Y_{it}: earnings of individual i on date t
Set Up
1. Define e_i as the date at which individual i is displaced.
2. Define D^k_{it} = 1[t = e_i + k], i.e., D^k_{it} is a dummy indicating that worker i was displaced k periods ago.
3. Run the following regression:

   Y_{it} = α_i + γ_t + X_{it}'φ + Σ_{k=C̲}^{C̄} β_k D^k_{it} + u_{it}

4. Plot the β_k over event time. These are estimates of mean earnings in "event time" after having taken out individual and year specific effects.
JLS Results
Figure 5: jls
Details and The Devil - Two Control Groups
- An event study implicitly compares changes in the outcomes of the treated units to:
  - units that have not yet been treated, and
  - units that will never be treated
- Useful to test whether those two sets of controls are exchangeable
- Re-estimate the model without the never-treated units and see how the point estimates change
- You may lose a lot of power, but it is important to know whether most of the power is coming from contrasts with the never treated or from the differential timing of treatment onset among the eventually treated
More Details – Sample Construction
- If you have a balanced panel of T time periods and varying event dates then you cannot have a balanced sample in event time
- Must choose the endpoints (C̲, C̄) carefully
- Approach One: bin up the endpoints, i.e., D^{C̄}_{it} = 1[t ≥ e_i + C̄] (McCrary 2007)
- Approach Two: fully saturate the model and include all event time dummies, but only report those for which you have a balanced sample
More Devils – Normalising Coefficients
- If you have zero never-treated units, you cannot include all event time dummies even if you bin up the endpoints
- Why? HW (hint: is X'X invertible?)
- You need to normalise one event coefficient to zero
- Industry practice: normalise the first lead (−1 in event time) to zero. Makes it easy to test for impact.
Details in Devils – Individual Specific Trends
- If you add individual-specific trends to your regression, you are saying: each individual continues to grow at a different rate than everyone else (even in the long run)
- What if your outcome is bounded?
- The individual trend will be estimated off of both pre- and post-treatment variation. Is that okay?
- That means that features of the regression function before the event are determined by observations after your structural break (event)
- You are trying to estimate a counterfactual model in the absence of treatment. You should use only untreated observations
- Think hard before adding individual-specific trends. The most expansive specification is not always the best
Miscellaneous Woes and Shortcomings of Event Studies
- Same weaknesses as DD studies
- Common to find one thing in levels, another in logs, a third in first differences
- Event studies are parametric; linearity is a serious assumption
- Standard error clustering is an issue
- Evaluate robustness over various specifications of time effects
Practical Example
- You are given a panel dataset on juvenile curfew laws across US cities
- You are asked to run an event study on the effect of these laws on log juvenile arrests
head(m);
##    year  city  t enacted lnarrests
## 1:   81 Akron -9      90  6.568078
## 2:   82 Akron -8      90  6.678971
## 3:   83 Akron -7      90  6.778785
## 4:   84 Akron -6      90  6.698268
## 5:   85 Akron -5      90  6.732211
## 6:   86 Akron -4      90  6.641182
Event Time
Construct a variable E_it that equals one in the year that a city enacts a curfew law.

m$E <- 0;
m$E[m$enacted == m$year] <- 1;
EndPoints
# Binned endpoints: 6 or more years after / before enactment
m$Ecap1 <- 0;
m$Ecap1[m$year - m$enacted >= 6] <- 1;

m$Ecap0 <- 0;
m$Ecap0[m$year - m$enacted <= -6] <- 1;
Create Lags and Leads
m[,c(paste("E.lag",
1:5,sep="")) := lapply(1:5,
function(i) shift(E,i)),by=city];
m[,c(paste("E.lead",
5:2,sep="")) := lapply(5:2,
function(i)shift(E,i,type='lead')),
by=city];
What Does Our Data Look Like Now?
##    year  city  t enacted lnarrests E Ecap1 Ecap0 E.lag1 E.lag2 E.lag3
## 1:   81 Akron -9      90  6.568078 0     0     1     NA     NA     NA
## 2:   82 Akron -8      90  6.678971 0     0     1      0     NA     NA
## 3:   83 Akron -7      90  6.778785 0     0     1      0      0     NA
## 4:   84 Akron -6      90  6.698268 0     0     1      0      0      0
## 5:   85 Akron -5      90  6.732211 0     0     0      0      0      0
## 6:   86 Akron -4      90  6.641182 0     0     0      0      0      0
##    E.lag4 E.lag5 E.lead5 E.lead4 E.lead3 E.lead2
## 1:     NA     NA       0       0       0       0
## 2:     NA     NA       0       0       0       0
## 3:     NA     NA       0       0       0       0
## 4:     NA     NA       0       0       0       0
## 5:      0     NA       1       0       0       0
## 6:      0      0       0       1       0       0
Run The Regression

m$year <- factor(m$year);
m$city <- factor(m$city);

(ES_6 <- lm(lnarrests ~ . - t - enacted, data = m));
## 
## Call:
## lm(formula = lnarrests ~ . - t - enacted, data = m)
## 
## Coefficients:
##           (Intercept)                 year86                 year87
##             6.3888424             -0.0015315             -0.0334892
##                year88                 year89                 year90
##             0.0005694              0.1040470              0.1468098
##                year91                 year92                 year93
##             0.2180575              0.2527693              0.2606403
##                year94                 year95                 year96
##             0.3589144              0.3659621              0.2869393
##                year97                 year98                 year99
##             0.2625592              0.1508740             -0.0467437
##       cityAlbuquerque            cityAnaheim          cityAnchorage
##             0.1518789             -0.5606519             -0.3047332
##           cityAtlanta             cityAustin          cityBaltimore
##             0.6339827              0.7594920              1.2866206
##       cityBaton Rouge         cityBirmingham            cityBuffalo
##             0.0544757             -0.6613175             -0.4313937
##         cityCharlotte         cityCincinnati          cityCleveland
##             0.1385461              0.5014668              0.6160811
##  cityColorado Springs     cityCorpus Christi             cityDallas
##             0.7057790              0.1564510              1.1685181
##            cityDenver            cityDetroit            cityEl Paso
##             0.4997700              1.2125223              0.7304654
##        cityFort Worth             cityFresno            cityGarland
##             0.5230891              1.0688861             -0.2637803
##          cityGlendale            cityHouston            cityJackson
##            -0.3280997              1.6120248             -0.3004399
##  cityJacksonville (re        cityJersey City        cityKansas City
##             1.1521822              0.1061550              1.3030523
## cityLexington-Fayette         cityLong Beach        cityLos Angeles
##            -0.5522205              0.7105688              2.5286332
##        cityLouisville            cityLubbock            cityMadison
##            -0.0274826             -0.4705992             -0.1714147
##              cityMesa              cityMiami             cityMobile
##             0.5713493              0.1398018             -0.2992023
##       cityNew Orleans             cityNewark            cityNorfolk
##             0.5033780              0.1193804              0.2221923
##     cityOklahoma City            cityPhoenix           cityRichmond
##             0.9650265              1.6639581             -0.1209711
##        citySacramento          citySan Diego           citySan Jose
##             0.6619828              1.0827930              0.9016285
##        cityShreveport           citySt. Paul              cityTampa
##            -0.5206536              0.7851042              0.5642772
##            cityToledo              cityTulsa     cityVirginia Beach
##            -0.0247237              0.6495977              0.4710393
##           cityWichita                      E                  Ecap1
##             0.3376162             -0.0958179             -0.4444412
##                 Ecap0                 E.lag1                 E.lag2
##             0.1030182             -0.1866457             -0.2174169
##                E.lag3                 E.lag4                 E.lag5
##            -0.2285216             -0.2464958             -0.2915673
##               E.lead5                E.lead4                E.lead3
##             0.0802558              0.0840810              0.0473505
##               E.lead2
##             0.0111777
Collect Coefficients
coefs <- c(coef(ES_6)[c("Ecap0", "E.lead5", "E.lead4", "E.lead3", "E.lead2")],
           0,   # the omitted first lead (event time -1) is normalised to zero
           coef(ES_6)[c("E", "E.lag1", "E.lag2", "E.lag3", "E.lag4", "E.lag5", "Ecap1")]);

df.coefs <- data.frame(coefs, time = -6:6);

ggplot(data = df.coefs, aes(y = coefs, x = time, group = 1)) +
    geom_line() + geom_point() +
    geom_vline(xintercept = 0, linetype = 2);
[Plot: estimated event-study coefficients (coefs) against event time (−6 to 6), with a vertical dashed line at time 0]
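The plot above shows point estimates only. A sketch of adding cluster-robust confidence intervals, clustering by city (assumes the sandwich and lmtest packages are installed):

library(sandwich);
library(lmtest);

ct <- coeftest(ES_6, vcov. = vcovCL(ES_6, cluster = ~city));   # cluster at the city level

event.terms <- c("Ecap0", "E.lead5", "E.lead4", "E.lead3", "E.lead2",
                 "E", "E.lag1", "E.lag2", "E.lag3", "E.lag4", "E.lag5", "Ecap1");
ses <- ct[event.terms, "Std. Error"];
df.coefs$se <- c(ses[1:5], 0, ses[6:12]);   # the normalised lead (-1) gets a zero SE by construction

ggplot(data = df.coefs, aes(y = coefs, x = time, group = 1)) +
    geom_line() + geom_point() +
    geom_errorbar(aes(ymin = coefs - 1.96 * se, ymax = coefs + 1.96 * se), width = 0.2) +
    geom_vline(xintercept = 0, linetype = 2);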