Duration Analysis
Joan Llull
Panel Data and Duration
Master in Macroeconomic Policy and Financial Markets
Barcelona GSE
I. Introduction
A. Motivation
There are many examples in economics in which our variable of interest is a
duration. Duration data can answer the question “how long an individual has
been in a particular state when exiting from it”. Examples are the number of
weeks or months that an individual has been unemployed when she finds a job,
how long an individual has been in a hospital before leaving it, or what is the
life expectancy of an individual of a certain age (i.e. how long she has been alive
when she dies). In this chapter we learn the basic tools to model this kind of data.
This kind of analysis allows us to talk about why durations differ across individuals
(i.e. what is the effect of individual characteristics on duration), and about
how and why exit probabilities vary over time. These techniques have a long
tradition in biometrics. For this reason, much of the nomenclature is
borrowed from that field: survival probabilities, hazard functions, and so on.
B. Duration data
Our data consist of a sample of durations for N individuals, $t_1, t_2, ..., t_N$.
Importantly, these data are typically censored. Figure 1
draws two examples of censored samples. In the left plot, we have a hypothetical
sample of durations that is assumed to be obtained by interviewing individuals from
January 1990 to January 1992 on a monthly basis. For individuals 2 and 4, we
observe complete duration spells. Individual 1 was already unemployed at the
first interview date. Individual 3 is still unemployed at the last interview date. In
both cases we know that the duration of their unemployment spell is larger than
a certain value, but not by how much; i.e. we observe $t > \bar{t}$, but not $t$.
Figure 1. Two examples of censored observations
[Figure omitted. Panel A: Example 1; Panel B: Example 2. Each panel plots individuals (vertical axis) against date, from Jan90 to Jan92 (horizontal axis).]
Note: Black lines represent the time when the individual was unemployed. A dot indicates that the individual is still unemployed at that date, but we do not have further information about him/her. Vertical red dashed lines in Example 2 are interview dates.
The second hypothetical sample could be collected through registries. Imagine
that, in order to receive the unemployment benefit, workers have to show up at
the unemployment office and prove that they are still unemployed. In the figure,
individual 1 found a job in less than one year; in our data, we observe that he
does not appear in the sample of January 1991, so we learn that he found a job
before then; as a result, we know that the duration of this unemployment spell was
below one year, but not by how much. Similarly, individuals 3 and 4 found a job
during the second year. For these three individuals the data are censored in the sense
that we only observe that the duration is within a given interval, but we do not
know it exactly; i.e. we observe $\underline{t} < t < \bar{t}$ instead of $t$. Individual 2 is censored
in exactly the same way as individuals 1 and 3 from the left figure.
The presence of censoring is one of the main motivations to use the techniques
presented here (as opposed to, say, regression). If data are censored, sample
average durations are biased. This framework incorporates censoring into the
analysis without the need of additional strong assumptions.
On top of censoring, duration analysis allows for time/state dependence, i.e. the
probability of terminating the current spell depends on the duration of the spell.
For instance, we might be interested in analyzing whether individuals’ probability
of finding a job is decreasing in the time they have been unemployed in the current
spell. These techniques allow us to go beyond the simple average duration and
look at the shape of exit probabilities over different durations.
II. The Hazard Function
The hazard function is the most important object in this analysis. It is defined
as the probability/density of exiting at t conditional on being alive. In the unem-
ployment example, it is the probability that an individual finds a job at, say, the
10th month of unemployment, conditional on being unemployed in the 9th month.
Figure 2. Mortality Hazard Rate
[Figure omitted. Vertical axis: hazard function, from 0 to 1. Horizontal axis: age, from 0 to 100.]
Note: The line depicts the mortality hazard rate, i.e. the probability of dying at age a conditional on survival until that age.
This hazard function can be constant or time varying. For instance, if individu-
als receive offers in every period with the same probability, the hazard is constant.
A time-varying example is plotted in Figure 2. The figure depicts the mortality
hazard rates of a given population. Infant mortality makes the hazard decrease over the
first five years of age. A hazard rate of around 8% at age 5 does not mean that the
probability of dying at age 5 is 8% but, instead, that the probability of dying at
age 5 for those individuals who survived the first 4 years is 8%. A good example
to understand this distinction is the hazard rate at age 100: the probability of dying
at age 100 conditional on having survived until age 99 is almost 1; however, the
unconditional probability of dying at age 100 is almost zero, because very few
individuals survive that long.
We consider both discrete and continuous durations. Whenever our duration
of interest is discrete, the hazard function is a probability. When the duration is
continuous, the hazard function is a conditional density.
Our interest in the hazard function has theoretical and empirical grounds. From
a theoretical perspective, it is more appealing to model the decision of exiting con-
ditional on survival than the duration itself (e.g. modeling job offer arrival rate).
Empirically, it is also convenient to model the hazard function because it implies
a binomial discrete decision and delivers a very clean likelihood, and because it
allows us to take censoring into account without the need for further strong assumptions.
In this section, we characterize the unconditional hazard function. We build
bridges between hazard functions and pdfs/cdfs/probability masses, which then
are used to write the likelihood function. Regressors are introduced later on in
the chapter.
A. Hazard function for a discrete variable
Let t be a random variable with discrete support {1, 2, 3, ...} with probability
mass function p(τ) = Pr(t = τ) and cdf F (t) = p(1) + p(2) + ... + p(t) for
t = 1, 2, 3, .... The hazard function is defined as:
$$h(\tau) \equiv \Pr(t = \tau \mid t \geq \tau) = \frac{\Pr(t = \tau)}{\Pr(t \geq \tau)} = \frac{p(\tau)}{1 - F(\tau - 1)} = \frac{F(\tau) - F(\tau - 1)}{1 - F(\tau - 1)}. \tag{1}$$
This hazard function is a modeling decision (e.g. with a logistic or normal cdf).
We need to recover p(t) and F (t) in order to write the likelihood. To recover
them, we proceed recursively. In the first period, we know that:
$$h(1) = \Pr(t = 1 \mid t \geq 1) = \Pr(t = 1) = p(1) \quad \text{and} \quad F(1) = p(1). \tag{2}$$
In the second period, we can use equation (1):
$$h(2) = \frac{p(2)}{1 - F(1)} = \frac{p(2)}{1 - h(1)} \;\Rightarrow\; p(2) = h(2)(1 - h(1)). \tag{3}$$
Hence:
$$F(2) = p(1) + p(2) = h(1) + h(2)(1 - h(1)) \;\Leftrightarrow\; 1 - F(2) = (1 - h(2))(1 - h(1)). \tag{4}$$
In the third period:
$$h(3) = \frac{p(3)}{1 - F(2)} = \frac{p(3)}{(1 - h(2))(1 - h(1))} \;\Rightarrow\; p(3) = h(3) \prod_{s=1}^{2} (1 - h(s)). \tag{5}$$
Hence:
$$F(3) = p(1) + p(2) + p(3) = h(1) + h(2)(1 - h(1)) + h(3) \prod_{s=1}^{2} (1 - h(s)), \tag{6}$$
which implies:
$$1 - F(3) = \prod_{s=1}^{3} (1 - h(s)). \tag{7}$$
In general, the recursion is such that:
$$p(t) = F(t) - F(t - 1) = h(t)(1 - F(t - 1)) = h(t) \prod_{s=1}^{t-1} (1 - h(s)), \tag{8}$$
and:
$$F(t) = 1 - \prod_{s=1}^{t} (1 - h(s)). \tag{9}$$
These expressions are interpretable. In particular:
$$p(\tau) = \Pr(t = \tau) = h(\tau) \prod_{s=1}^{\tau-1} (1 - h(s)) = \Pr(t = \tau \mid t \geq \tau) \Pr(t > \tau - 1 \mid t \geq \tau - 1) \cdots, \tag{10}$$
or, in words, the probability of exiting at time t is equal to the probability of
exiting at time t conditional on survival times the probability of survival until t.
Additionally:
$$F(\tau) = \Pr(t \leq \tau) = 1 - \prod_{s=1}^{\tau} (1 - h(s)) = 1 - \Pr(t > \tau \mid t \geq \tau) \Pr(t > \tau - 1 \mid t \geq \tau - 1) \cdots = 1 - \Pr(t > \tau). \tag{11}$$
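The recursion in equations (8) and (9) can be sketched in a few lines of Python. This is only an illustration: the function name `pmf_cdf_from_hazard` and the hazard values are hypothetical and do not come from the notes.

```python
from typing import List, Tuple


def pmf_cdf_from_hazard(h: List[float]) -> Tuple[List[float], List[float]]:
    """Map discrete hazards h(1..T) into p(1..T) and F(1..T), equations (8)-(9)."""
    p, F = [], []
    survival = 1.0  # Pr(t >= tau); equals 1 at tau = 1
    for h_tau in h:
        p.append(h_tau * survival)   # equation (8): p(tau) = h(tau) * Pr(t >= tau)
        survival *= 1.0 - h_tau      # now Pr(t > tau) = prod_{s<=tau} (1 - h(s))
        F.append(1.0 - survival)     # equation (9)
    return p, F


# Hypothetical hazards; h = 1 in the last period forces everybody to exit by then.
hazards = [0.10, 0.20, 0.50, 1.00]
p, F = pmf_cdf_from_hazard(hazards)
print(p)
print(F)
```

In this made-up example the hazard in the fourth period is 1, yet the unconditional probability of exiting in that period is only 0.36, because only 36% of spells survive that long. This is the same distinction as in the age-100 mortality example.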
B. Hazard function for a continuous variable
Consider now the case of a continuous duration t. This random variable is
characterized by its pdf f(t) instead of a probability mass. Therefore, its hazard
function is also a density:
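$$h(\tau) = \lim_{dt \to 0} \frac{\Pr(\tau \leq t < \tau + dt \mid t \geq \tau)}{dt} = \lim_{dt \to 0} \frac{\Pr(\tau \leq t < \tau + dt)}{\Pr(t \geq \tau)} \Big/ dt = \frac{f(\tau)}{1 - F(\tau)}. \tag{12}$$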
Note that this expression is analogous to equation (1).
In order to derive f(t) and F (t) from the hazard function (so that we can write
the likelihood of our sample) we make use of the integrated or cumulative hazard :
$$H(t) = \int_0^t h(s)\,ds. \tag{13}$$
The integrated hazard can be written as a function of f(t) and F(t):
$$H(t) = \int_0^t \frac{f(s)}{1 - F(s)}\,ds = \left[-\ln(1 - F(s))\right]_0^t = -\ln[1 - F(t)]. \tag{14}$$
Therefore, we can trivially see that:
$$F(t) = 1 - \exp(-H(t)), \tag{15}$$
and, similarly:
$$f(t) = \frac{\partial F(t)}{\partial t} = h(t) \exp(-H(t)). \tag{16}$$
The interpretation of these expressions is not as straightforward as before, but
we can make a connection with the discrete case. In the discrete case:
$$\ln[1 - F(t)] = \sum_{s=1}^{t} \ln(1 - h(s)), \tag{17}$$
and $\ln(1 - h(s)) \approx -h(s)$ when $h(s)$ is small, which compares to the continuous case:
$$\ln[1 - F(t)] = -H(t) = \int_0^t (-h(s))\,ds. \tag{18}$$
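As a numerical sketch of equations (13), (15) and (16), the snippet below integrates a hazard with the trapezoidal rule and recovers the cdf and pdf. The linear hazard and all function names are assumptions made for illustration; the hazard was picked only because its integrated hazard has a closed form to compare against.

```python
import math


def cumulative_hazard(h, t, n=10_000):
    """Trapezoidal approximation of H(t) = integral_0^t h(s) ds, equation (13)."""
    step = t / n
    total = 0.5 * (h(0.0) + h(t))
    for k in range(1, n):
        total += h(k * step)
    return total * step


def cdf_from_hazard(h, t):
    return 1.0 - math.exp(-cumulative_hazard(h, t))    # equation (15)


def pdf_from_hazard(h, t):
    return h(t) * math.exp(-cumulative_hazard(h, t))   # equation (16)


def linear_hazard(s):
    # Hypothetical increasing hazard: H(t) = 0.05 t + 0.005 t^2 in closed form.
    return 0.05 + 0.01 * s


t = 10.0
H_exact = 0.05 * t + 0.005 * t ** 2
print(cumulative_hazard(linear_hazard, t), H_exact)
print(cdf_from_hazard(linear_hazard, t), 1.0 - math.exp(-H_exact))
print(pdf_from_hazard(linear_hazard, t))
```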
C. Some frequently used hazard functions
In the discrete case, we often avoid parametric assumptions on the hazard func-
tion, and we estimate it semi-parametrically. When the duration is continuous,
however, (and in some cases when it is discrete), we need to make functional form
assumptions on the hazard function. These are two widely used cases.
Constant hazard. This is the simplest possible hazard function. In our example
of unemployment duration, this assumption is consistent with a constant job ar-
rival rate, i.e. in every period we have the same probability of receiving a job offer,
no matter how long we have been unemployed. Therefore, we assume h(t) = λ,
with λ > 0.
Given this function, the integrated hazard is very easy to compute:
$$H(t) = \int_0^t \lambda\,du = \lambda t. \tag{19}$$
Therefore, cdf and pdf are trivially derived:
$$F(t) = 1 - e^{-\lambda t}, \tag{20}$$
and:
$$f(t) = \lambda e^{-\lambda t}, \tag{21}$$
which is the exponential distribution.
Therefore, the exponential distribution has a constant hazard (this is called the
memoryless property of the exponential distribution). Additionally, the expected
duration with this function is the inverse of the hazard function: E[T ] = 1/λ.
The discrete counterpart of this parametric family of functions is:
$$F(t) = 1 - (1 - \lambda)^t, \tag{22}$$
and:
$$f(t) = \lambda (1 - \lambda)^{t-1}. \tag{23}$$
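A quick numerical check of two properties of the constant hazard model: memorylessness in the continuous case, and an expected duration of 1/λ in the discrete case (equation (23)). The value λ = 0.12 is one of the values used in Figure 3; the function names and the truncation point of the sum are illustrative assumptions.

```python
import math

lam = 0.12  # constant hazard


def exp_survival(t: float) -> float:
    """Continuous case: 1 - F(t) = exp(-lam * t), from equation (20)."""
    return math.exp(-lam * t)


def geom_pmf(t: int) -> float:
    """Discrete case: lam * (1 - lam)^(t - 1), equation (23)."""
    return lam * (1.0 - lam) ** (t - 1)


# Memorylessness: Pr(T > s + u | T > s) does not depend on the elapsed time s.
cond_after_5 = exp_survival(5.0 + 3.0) / exp_survival(5.0)
cond_after_20 = exp_survival(20.0 + 3.0) / exp_survival(20.0)

# Expected duration in the discrete case, truncating the infinite sum far in the tail.
mean_discrete = sum(t * geom_pmf(t) for t in range(1, 2001))

print(cond_after_5, cond_after_20, exp_survival(3.0))
print(mean_discrete, 1.0 / lam)
```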
Figure 3. Examples of Hazard Functions: Constant and Weibull
[Figure omitted. Panel A (Constant Hazard): A. Hazard function h(t); B. Integrated hazard H(t); C. Pdf f(t); D. Cdf F(t). Panel B (Weibull Distribution): E. Hazard function h(t); F. Integrated hazard H(t); G. Pdf f(t); H. Cdf F(t). Horizontal axis in all plots: duration t, from 0 to 40.]
Note: Black: λ = 0.12; gray: λ = 0.05. Solid: α = 1; dotted: α = 0.5; dashed: α = 1.5; dot-dashed: α = 3. The examples in the top panel depict the hazard function (left), the integrated hazard (center-left), the pdf or probability mass function (center-right) and the cdf (right) of a constant hazard model with hazard equal to λ. The bottom panel depicts the corresponding functions for a Weibull hazard model with parameters λ and α.
The top panel of Figure 3 depicts the hazard rate, the integrated hazard, the pdf
and the cdf of a constant hazard model. We can see that a constant hazard
rate implies a decreasing unconditional probability of exiting and a cdf that
increases at a decreasing rate. The figure also illustrates that the integrated
hazard has no interpretation as a cdf, since it can take values
above 1, as it does in this case.
The Weibull distribution. Another hazard function that is commonly used
for continuous durations is derived from a two-parameter generalization of the
exponential distribution known as the Weibull distribution. This function allows
for a hazard that increases or decreases monotonically.
The Weibull distribution is given by:
$$F(t) = 1 - e^{-(\lambda t)^{\alpha}}, \qquad \lambda > 0, \; \alpha > 0. \tag{24}$$
Hence, its pdf is:
$$f(t) = \frac{\partial F(t)}{\partial t} = \alpha \lambda^{\alpha} t^{\alpha - 1} e^{-(\lambda t)^{\alpha}}, \tag{25}$$
the hazard function is:
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{\alpha \lambda^{\alpha} t^{\alpha - 1} e^{-(\lambda t)^{\alpha}}}{e^{-(\lambda t)^{\alpha}}} = \alpha \lambda^{\alpha} t^{\alpha - 1}, \tag{26}$$
and the integrated hazard is:
$$H(t) = \int_0^t h(s)\,ds = (\lambda t)^{\alpha}. \tag{27}$$
The bottom panel of Figure 3 plots this function for different combinations of
the two parameters. It can be seen that the function is very flexible in mimicking
different shapes for the hazard function. Its main limitation, however, is that
hazard rates are either monotonically increasing or monotonically decreasing (for
instance, it cannot approximate the hazard function of mortality from Figure 2).
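The monotonicity of the Weibull hazard can be checked directly from equations (24), (26) and (27). The sketch below uses the parameter values listed in the note to Figure 3; the function names and the evaluation grid are hypothetical.

```python
import math


def weibull_hazard(t: float, lam: float, alpha: float) -> float:
    return alpha * lam ** alpha * t ** (alpha - 1.0)            # equation (26)


def weibull_cum_hazard(t: float, lam: float, alpha: float) -> float:
    return (lam * t) ** alpha                                    # equation (27)


def weibull_cdf(t: float, lam: float, alpha: float) -> float:
    return 1.0 - math.exp(-weibull_cum_hazard(t, lam, alpha))    # equation (24)


def hazard_shape(alpha: float, lam: float = 0.12) -> str:
    """Classify the hazard on a coarse grid of durations."""
    grid = [1.0, 5.0, 10.0, 20.0, 40.0]
    hz = [weibull_hazard(t, lam, alpha) for t in grid]
    if all(a > b for a, b in zip(hz, hz[1:])):
        return "decreasing"
    if all(a < b for a, b in zip(hz, hz[1:])):
        return "increasing"
    return "constant"


for alpha in (0.5, 1.0, 1.5, 3.0):
    print(alpha, hazard_shape(alpha))
```

With α < 1 the hazard falls with duration, with α = 1 it collapses to the constant hazard λ, and with α > 1 it rises, so no choice of α produces a U-shaped hazard.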
III. Conditional Hazard Functions
This section discusses the introduction of covariates into the model. We need
to write the hazard function conditional on covariates:
$$h(t, \mathbf{x}) = \frac{f(t \mid \mathbf{x})}{1 - F(t \mid \mathbf{x})}. \tag{28}$$
Below we review some of the most popular approaches for doing this.
A. The proportional hazard model
The Proportional Hazard (PH) model (or Cox model, after Cox (1972)) is prob-
ably one of the most widely used duration models because of its simplicity. In
this model, the conditional hazard function is given by:
$$h(t, \mathbf{x}) = \lambda(t) \exp(\mathbf{x}'\beta). \tag{29}$$
That is, we factorize h(t,x) into a function of t and a function of x, so that two
different individuals have exit probabilities that are proportional for all t (hence
the name). For instance, if one individual is twice as likely as another
to exit from unemployment in period $t_1$ conditional on survival, she
is also twice as likely as the other individual to exit at $t_2$ conditional on survival.
The function λ(t) is called the baseline hazard function (as it provides the
shape of the conditional hazard function, that then is scaled differently for every
individual). The baseline hazard is often assumed to be given by one of the two
possibilities described above. Note that if $\mathbf{x}'\beta$ includes a constant term, then the
scale of the corresponding hazard function needs to be normalized (e.g. $\lambda(t) = 1$
in the constant case, and $\lambda(t) = \alpha t^{\alpha - 1}$ in the Weibull case).
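The proportionality property of equation (29) can be illustrated numerically. The sketch below assumes the normalized Weibull baseline; the coefficient vector, the covariate values and the function name `ph_hazard` are hypothetical.

```python
import math
from typing import Sequence


def ph_hazard(t: float, x: Sequence[float], beta: Sequence[float], alpha: float) -> float:
    """Equation (29) with the normalized Weibull baseline alpha * t^(alpha - 1)."""
    baseline = alpha * t ** (alpha - 1.0)
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return baseline * math.exp(index)


beta = [-2.0, 0.5]   # hypothetical: constant term and one covariate
x_a = [1.0, 1.0]     # individual a
x_b = [1.0, 0.0]     # individual b
alpha = 1.5

ratios = [ph_hazard(t, x_a, beta, alpha) / ph_hazard(t, x_b, beta, alpha)
          for t in (1.0, 6.0, 24.0)]
print(ratios, math.exp(0.5))
```

The hazard ratio between the two individuals equals exp((x_a − x_b)′β) at every duration, since the baseline cancels out.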
B. Discrete durations
In discrete duration models, even though sometimes we might proceed with the
standard PH model, it is common to select richer models. The reason is that the
problem then reduces to a sequence of Probit or Logit estimations. In general, we
can specify conditional hazard rates as follows:
$$h(\tau, \mathbf{x}) = \Pr(t = \tau \mid t \geq \tau, \mathbf{x}) = G(\gamma_{\tau} + \mathbf{x}'\beta_{\tau}), \tag{30}$$
where G(.) is a cdf (e.g. normal or logistic for Probit and Logit respectively).
This model is richer than the previous one because it allows for different
βs at different durations. The equivalent to the baseline hazard here, $\{\gamma_t\}_{t=0}^{T}$,
where T is the maximum duration observed in the data, can be specified in many
different ways. One possibility is to specify a polynomial in ln t like:
$$\gamma_t = \gamma_0 + \gamma_1 \ln t + \gamma_2 (\ln t)^2. \tag{31}$$
Another possibility is to leave these parameters free:
$$\gamma_t = \sum_{j=1}^{T^*} \gamma_j \mathbf{1}\{t = j\}. \tag{32}$$
Note that we only specify $T^* < T$ parameters instead of T. The reason for this is
identification: on the one hand, we want $T^*$ to be close enough to T, but, on the
other hand, we need a critical mass of individuals being alive at $T^*$ to be able to
identify exit rates (e.g. if T is the period in which the last individual exits, then
$\gamma_T$ would be such that the probability of exiting at period T is equal to 1). This
flexible option is very interesting, as it provides a semi-parametric estimation of
λ(t) and it allows us to test other parametric assumptions.
IV. Likelihood Functions
A. Complete continuous durations
Assume that we observe {t1, t2, ..., tN}. Then, the log-likelihood of this sample is:
$$L_N = \sum_{i=1}^{N} \ln f(t_i \mid \mathbf{x}_i), \tag{33}$$
where:
$$f(t \mid \mathbf{x}) = h(t, \mathbf{x}) \exp(-H(t, \mathbf{x})) = \lambda(t) \exp(\mathbf{x}'\beta) \exp\{-\Lambda(t) \exp(\mathbf{x}'\beta)\}, \tag{34}$$
and $\Lambda(t) = \int_0^t \lambda(s)\,ds$ is the integrated baseline hazard.
B. Censored continuous durations
Duration data can be censored, i.e. we know that $t > \bar{t}$ or that $\underline{t} < t < \bar{t}$,
but we do not observe the exact value of t. We allow the level of censoring to be
observation specific (with individual subindexes in the upper and lower limits).
The contribution to the likelihood of an observation that is censored because we
only observe that $t > \bar{t}$ is $\Pr(t \geq \bar{t} \mid \mathbf{x}) = 1 - F(\bar{t} \mid \mathbf{x})$. Hence, in this case, the
log-likelihood boils down to:
$$L_N = \sum_{i=1}^{N} \left\{ w_i \ln f(t_i \mid \mathbf{x}_i) + (1 - w_i) \ln(1 - F(\bar{t}_i \mid \mathbf{x}_i)) \right\}, \tag{35}$$
where $w_i = \mathbf{1}\{t_i < \bar{t}_i\}$, i.e. equals 1 if the observation is not censored. This is the
log-likelihood of a sample like the one generated in Example 1 of Figure 1.
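To make equation (35) concrete, the sketch below evaluates it for the constant hazard model without covariates, a simplifying assumption made here so that the maximizer has a closed form (number of completed spells divided by total observed time). The sample and the function name are made up for illustration.

```python
import math
from typing import List, Tuple


def loglik_exponential(lam: float, spells: List[Tuple[float, bool]]) -> float:
    """Equation (35) with f(t) = lam * exp(-lam * t) and 1 - F(t) = exp(-lam * t).

    Each spell is (observed time, completed flag); for a censored spell the
    observed time is the censoring point.
    """
    total = 0.0
    for t, completed in spells:
        if completed:
            total += math.log(lam) - lam * t   # ln f(t_i)
        else:
            total += -lam * t                  # ln(1 - F(censoring point))
    return total


# Hypothetical durations in months; False marks a right-censored spell.
spells = [(3.0, True), (7.0, True), (12.0, False), (5.0, True), (24.0, False)]

n_completed = sum(1 for _, c in spells if c)
exposure = sum(t for t, _ in spells)
lam_hat = n_completed / exposure   # solves the first-order condition of the log-likelihood

# Naive estimate that drops censored spells and inverts the mean completed duration.
naive = n_completed / sum(t for t, c in spells if c)

print(lam_hat, naive)
print(loglik_exponential(lam_hat, spells))
```

Dropping the censored spells overstates the hazard in this example, because the longest spells are precisely the ones that are censored.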
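Now consider Example 2 of Figure 1. There we have observations with
durations below one year ($t < \bar{t}^1$), others with durations between one and two years
($\bar{t}^1 < t < \bar{t}^2$), and others with durations above two years ($t > \bar{t}^2$). In this case,
the log-likelihood is:
$$L_N = \sum_{i=1}^{N} \left\{ w_i^1 \ln F(\bar{t}_i^1 \mid \mathbf{x}_i) + w_i^2 \ln\left(F(\bar{t}_i^2 \mid \mathbf{x}_i) - F(\bar{t}_i^1 \mid \mathbf{x}_i)\right) + (1 - w_i^1 - w_i^2) \ln\left(1 - F(\bar{t}_i^2 \mid \mathbf{x}_i)\right) \right\}, \tag{36}$$
where $w_i^1 = \mathbf{1}\{t_i < \bar{t}_i^1\}$, and $w_i^2 = \mathbf{1}\{\bar{t}_i^1 < t_i < \bar{t}_i^2\}$.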
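As a final example, consider a case like Example 2 of Figure 1 but in which, at
the starting point, individuals have already been unemployed for
$d_1, d_2, ..., d_N$ periods rather than 0, with $d_i$ known by the researcher. In this case, the log-likelihood
looks similar to (36) with the exception of conditioning on initial duration:
$$L_N = \sum_{i=1}^{N} \left\{ w_i^1 \ln \frac{F(d_i + \bar{t}_i^1 \mid \mathbf{x}_i) - F(d_i \mid \mathbf{x}_i)}{1 - F(d_i \mid \mathbf{x}_i)} + w_i^2 \ln \frac{F(d_i + \bar{t}_i^2 \mid \mathbf{x}_i) - F(d_i + \bar{t}_i^1 \mid \mathbf{x}_i)}{1 - F(d_i \mid \mathbf{x}_i)} + (1 - w_i^1 - w_i^2) \ln \frac{1 - F(d_i + \bar{t}_i^2 \mid \mathbf{x}_i)}{1 - F(d_i \mid \mathbf{x}_i)} \right\}, \tag{37}$$
where $t_i$ now denotes total duration including the initial $d_i$ periods, $w_i^1 = \mathbf{1}\{d_i < t_i < d_i + \bar{t}_i^1\}$, and $w_i^2 = \mathbf{1}\{d_i + \bar{t}_i^1 < t_i < d_i + \bar{t}_i^2\}$.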
C. Discrete durations
In the discrete duration model, we use a logistic or normal cdf (Logit or Probit)
to estimate the hazard function. Also, we use the link between the probability
of observing a given duration and the hazard rate seen in Section II. The
log-likelihood is given by:
$$L_N = \sum_{i=1}^{N} \sum_{\tau=1}^{T^*} w_{i\tau} \left\{ y_{i\tau} \ln G(\gamma_{\tau} + \mathbf{x}_i'\beta_{\tau}) + (1 - y_{i\tau}) \ln(1 - G(\gamma_{\tau} + \mathbf{x}_i'\beta_{\tau})) \right\}, \tag{38}$$
where $y_{i\tau} = \mathbf{1}\{t_i = \tau\}$, $w_{i\tau} = \mathbf{1}\{t_i \geq \tau\}$, and G(.) is a cdf (e.g. logistic or
normal as before). This expression includes two types of contributions:
• Spells that end at time τ:
$$\ln \Pr(t = \tau \mid \mathbf{x}) = \ln h(\tau, \mathbf{x}) + \sum_{s=1}^{\tau-1} \ln(1 - h(s, \mathbf{x})). \tag{39}$$
• Spells that are incomplete at time $T^*$:
$$\ln \Pr(t > T^* \mid \mathbf{x}) = \sum_{s=1}^{T^*} \ln(1 - h(s, \mathbf{x})). \tag{40}$$
For every period $\tau = 1, ..., T^*$ we estimate a Probit or Logit of exiting vs not
exiting conditional on survival (i.e. estimated on the sample of individuals still
alive, which are those with $w_{i\tau} = 1$). Given this, the importance of setting $T^*$
small enough becomes clear: we need enough observations alive to estimate
the Probit of exiting at period $T^*$ with precision. We can set $T_i^* = \min\{\bar{t}_i, T^*\}$
different for every individual if there are observations censored below the
maximum $T^*$ that we are considering.
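The structure of equation (38) can be sketched as a loop over the periods in which each individual is still at risk. The sketch makes several simplifying assumptions that are not in the notes: G is logistic, there is a single scalar covariate, β is held constant across τ, and a censored spell with recorded duration d is taken to have survived periods 1 to d. All names and numbers are hypothetical.

```python
import math
from typing import List, Tuple


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def discrete_loglik(gamma: List[float], beta: float,
                    sample: List[Tuple[int, bool, float]]) -> float:
    """Equation (38) with a logistic G and free baseline parameters gamma[tau - 1].

    Each observation is (duration, completed flag, x). T* is len(gamma).
    """
    t_star = len(gamma)
    total = 0.0
    for duration, completed, x in sample:
        last = min(duration, t_star)
        for tau in range(1, last + 1):                   # periods with w_{i,tau} = 1
            h = logistic(gamma[tau - 1] + x * beta)
            exits_now = completed and duration == tau    # y_{i,tau}
            total += math.log(h) if exits_now else math.log(1.0 - h)
    return total


gamma = [-2.0, -1.5, -1.0]   # T* = 3
sample = [(2, True, 0.0), (3, True, 1.0), (5, True, 1.0), (2, False, 0.0)]
print(discrete_loglik(gamma, 0.3, sample))
```

The third spell ends after T*, so it only contributes survival terms as in equation (40), while the first two contribute as in equation (39).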
V. Unobserved Heterogeneity
A. Unobserved heterogeneity vs spurious duration dependence
Often, we cannot observe all important determinants of durations. The omission
of important regressors can generate spurious duration dependence. To illustrate
this idea, consider a regressor $x \in \{0, 1\}$ (e.g. low vs high ability), and that the
conditional hazard is well represented by a constant proportional hazard model
in which:
$$h(t, x = 0) = h_0, \qquad h(t, x = 1) = h_1, \qquad h_1 > h_0. \tag{41}$$
Now assume that we do not observe x. The (unconditional) hazard that we
identify is the following:
$$h(\tau) = h_1 \Pr(x = 1 \mid t \geq \tau) + h_0 \Pr(x = 0 \mid t \geq \tau). \tag{42}$$
Figure 4. An Example with Unobserved Heterogeneity
[Figure omitted. Vertical axis: hazard function, from 0 to 0.15. Horizontal axis: duration t, from 0 to 40.]
Note: Black dashed: h1 (hazard rate when x = 1); gray dashed: h0 (hazard rate when x = 0); gray dashed: observed (unconditional) hazard.
The shape of this hazard is not constant anymore. Given that individuals with
x = 1 have a higher hazard of exiting, the proportion of individuals with x = 1
among survivors decreases over time, and the unconditional hazard converges to $h_0$.
Figure 4 is an example of this. In the figure, the conditional hazards are $h_1 = 0.14$
and $h_0 = 0.02$, and the initial fraction of individuals with x = 1 is 80%,
with 20% having x = 0. As it emerges from the figure, after 40 periods, the
unconditional hazard rate has nearly converged to $h_0$, as almost all individuals with
x = 1 have exited, while there are still individuals with x = 0 remaining unemployed.
Hence, not being able to control for a covariate x (e.g. ability) that correlates
with the hazard of exiting (e.g. high ability unemployed workers have a larger
hazard of exiting from unemployment) can create a spurious duration dependence.
In our unemployment example, not controlling for ability leads to the conclusion
that the hazard of finding a job decreases with the duration of the current
unemployment spell when this hazard is in fact constant, because an individual who
has been unemployed for a long time is more likely to be of low ability.
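The mechanism behind equation (42) can be reproduced with the parameter values quoted for Figure 4. The sketch treats durations as discrete periods, which is an assumption made here for simplicity (the figure itself may be based on continuous time), and the function name is hypothetical.

```python
h1, h0 = 0.14, 0.02   # conditional hazards from the Figure 4 example
share1 = 0.80         # initial fraction of individuals with x = 1


def unconditional_hazard(tau: int) -> float:
    """Equation (42), with durations treated as periods tau = 1, 2, ..."""
    surv1 = share1 * (1.0 - h1) ** (tau - 1)           # Pr(x = 1, t >= tau)
    surv0 = (1.0 - share1) * (1.0 - h0) ** (tau - 1)   # Pr(x = 0, t >= tau)
    weight1 = surv1 / (surv1 + surv0)                  # Pr(x = 1 | t >= tau)
    return h1 * weight1 + h0 * (1.0 - weight1)


for tau in (1, 5, 10, 20, 40):
    print(tau, round(unconditional_hazard(tau), 4))
```

Both conditional hazards are constant, yet the unconditional hazard falls with duration purely because the composition of survivors shifts towards x = 0.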
B. Dealing with heterogeneity in continuous hazard models
Lancaster (1979) addressed this problem by introducing a multiplicative random
effect in the proportional hazard specification:
$$h(t, \mathbf{x}, \nu) = \lambda(t) \exp(\mathbf{x}'\beta)\,\nu, \tag{43}$$
where ν is assumed independent of x with positive support, E[ν] = 1 and pdf g(ν).
Hence, h(t,x,ν) is a hazard function conditional on x and ν. The cdf conditional
on x and ν is:
$$F(t \mid \mathbf{x}, \nu) = 1 - \exp\left(-\int_0^t h(u, \mathbf{x}, \nu)\,du\right), \tag{44}$$
and the cdf of t given x only, based on which we write the integrated likelihood, is
$$F(t \mid \mathbf{x}) = \int_0^{\infty} F(t \mid \mathbf{x}, \nu)\, g(\nu)\,d\nu. \tag{45}$$
Lancaster assumed a Gamma distribution for g(ν). An important remark here is
that we need to include regressors to be able to identify this model.
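As a worked illustration of (45), suppose that ν follows a Gamma distribution with mean 1 and variance $\sigma^2$ (the symbol $\sigma^2$ is notation introduced here, and $\Lambda(t) = \int_0^t \lambda(u)\,du$ denotes the integrated baseline hazard). The integral then has a closed form, because it is the Laplace transform of the Gamma density:

$$1 - F(t \mid \mathbf{x}) = \int_0^{\infty} \exp\{-\nu \Lambda(t) \exp(\mathbf{x}'\beta)\}\, g(\nu)\,d\nu = \left[1 + \sigma^2 \Lambda(t) \exp(\mathbf{x}'\beta)\right]^{-1/\sigma^2},$$

so that the hazard conditional on x only is:

$$h(t, \mathbf{x}) = -\frac{\partial \ln[1 - F(t \mid \mathbf{x})]}{\partial t} = \frac{\lambda(t) \exp(\mathbf{x}'\beta)}{1 + \sigma^2 \Lambda(t) \exp(\mathbf{x}'\beta)}.$$

The denominator grows with t, so under this assumption the observed hazard declines relative to $\lambda(t) \exp(\mathbf{x}'\beta)$ as the spell gets longer, which is the spurious duration dependence discussed above.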
C. Dealing with heterogeneity in discrete hazard models
Similarly, in the discrete case, we can also write hazards conditional on ν and
x and then integrate over ν:
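$$\Pr(t = \tau \mid \mathbf{x}) = \int \Pr(t = \tau \mid \mathbf{x}, \nu)\, g(\nu)\,d\nu = \int h(\tau, \mathbf{x}, \nu) \prod_{s=1}^{\tau-1} [1 - h(s, \mathbf{x}, \nu)]\, g(\nu)\,d\nu, \tag{46}$$
where, for instance:
$$h(t, \mathbf{x}, \nu) = G(\gamma_t + \mathbf{x}'\beta_t + \nu). \tag{47}$$
A frequently used specification for g(ν) is a discrete-support mass point distribution:
$\{\nu_1, ..., \nu_m\}$ with probabilities $\{p_1, ..., p_m\}$:
$$\Pr(t = \tau \mid \mathbf{x}) = \sum_{j=1}^{m} \left\{ h(\tau, \mathbf{x}, \nu_j) \prod_{s=1}^{\tau-1} [1 - h(s, \mathbf{x}, \nu_j)]\, p_j \right\}, \tag{48}$$
where $\nu_j$ and $p_j$ are additional parameters to be estimated.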
VI. Multiple-exit discrete duration models
A. Discrete competing risk models
In this last section we discuss the case in which there are multiple exits from
the current state. For instance, we consider a model in which individuals can exit
from unemployment into a temporary or a permanent job.
Let duration t be discrete, and consider the two indicator functions dj with
j = 1, 2 that equal one if the exit is to alternative j. We can define the following
intensities of transition to each state:
$$\phi_j(\tau) = \Pr(t = \tau, d_j = 1 \mid t \geq \tau), \qquad j = 1, 2. \tag{49}$$
This expression has a direct link with the unconditional hazard rates:
$$h(\tau) = \Pr(t = \tau \mid t \geq \tau) = \phi_1(\tau) + \phi_2(\tau). \tag{50}$$
Conditional hazard rates are:
$$h_1(\tau) = \Pr(y_{1\tau} = 1 \mid t \geq \tau, y_{2\tau} = 0) \tag{51}$$
$$h_2(\tau) = \Pr(y_{2\tau} = 1 \mid t \geq \tau, y_{1\tau} = 0), \tag{52}$$
where $y_{j\tau} = \mathbf{1}\{t = \tau, d_j = 1\}$.
The mapping between intensities and conditional hazards is given by the
definition of conditional probability:
$$h_j(\tau) = \frac{\Pr(y_{j\tau} = 1 \mid t \geq \tau)}{\Pr(y_{k\tau} = 0 \mid t \geq \tau)} = \frac{\phi_j(\tau)}{1 - \phi_k(\tau)}, \tag{53}$$
where the numerator is the joint probability, as $y_{1\tau} = 1$ implies $y_{2\tau} = 0$ and vice versa.
Therefore, we can write the model in terms of either of the two. For instance,
a MNL for φ's is equivalent to a binary Logit for h's with the same parameters.
Models presented in terms of conditional hazards $h_1(t)$ and $h_2(t)$ are also known
as competing risk models. This name comes from considering two latent random
variables $t_1^*$ and $t_2^*$ such that the observed duration is $t = \min\{t_1^*, t_2^*\}$. If $t_1^*$ and
$t_2^*$ are independent, $h_1(t)$ and $h_2(t)$ can be interpreted as hazard rates of latent
durations:
$$h_j(\tau) = \Pr(t_j^* = \tau \mid t_j^* \geq \tau). \tag{54}$$
This implies that the analysis of exits to 1 takes exits to 2 as censored observations.
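The claim that a multinomial logit for the intensities is equivalent to binary logits for the conditional hazards can be verified numerically through equation (53). In the sketch below the index values `a1` and `a2` stand for whatever linear index the model assigns to each exit at a given (τ, x); they and the function names are hypothetical.

```python
import math


def mnl_intensities(a1: float, a2: float):
    """Multinomial logit intensities phi_1 and phi_2, with 'no exit' as the base outcome."""
    denom = 1.0 + math.exp(a1) + math.exp(a2)
    return math.exp(a1) / denom, math.exp(a2) / denom


def conditional_hazards(phi1: float, phi2: float):
    """Equation (53): h_j = phi_j / (1 - phi_k)."""
    return phi1 / (1.0 - phi2), phi2 / (1.0 - phi1)


a1, a2 = -1.2, -2.0   # hypothetical index values for one (tau, x)
phi1, phi2 = mnl_intensities(a1, a2)
ch1, ch2 = conditional_hazards(phi1, phi2)

binary_logit_1 = 1.0 / (1.0 + math.exp(-a1))
binary_logit_2 = 1.0 / (1.0 + math.exp(-a2))

print(ch1, binary_logit_1)
print(ch2, binary_logit_2)
print(phi1 + phi2)   # overall hazard, equation (50)
```

Dividing φ_j by 1 − φ_k removes the other exit from the denominator, which leaves exactly the binary logit probability with the same index.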
B. Full information ML
The log-likelihood function is analogous to the discrete case:
$$L_N = \sum_{i=1}^{N} \sum_{\tau=1}^{T^*} w_{i\tau} \left\{ y_{i\tau} \left( d_{1i} \ln \phi_1(\tau, \mathbf{x}_i) + d_{2i} \ln \phi_2(\tau, \mathbf{x}_i) \right) + (1 - y_{i\tau}) \ln(1 - \phi_1(\tau, \mathbf{x}_i) - \phi_2(\tau, \mathbf{x}_i)) \right\}, \tag{55}$$
where $y_{i\tau} = \mathbf{1}\{t_i = \tau\}$, and $w_{i\tau} = \mathbf{1}\{t_i \geq \tau\}$, as defined above. This expression
includes three types of contributions:
• Spells that end at time τ exiting to option 1:
$$\ln \Pr(t = \tau, d_1 = 1 \mid \mathbf{x}) = \ln \phi_1(\tau, \mathbf{x}) + \sum_{s=1}^{\tau-1} \ln(1 - \phi_1(s, \mathbf{x}) - \phi_2(s, \mathbf{x})). \tag{56}$$
• Spells that end at time τ exiting to option 2:
$$\ln \Pr(t = \tau, d_2 = 1 \mid \mathbf{x}) = \ln \phi_2(\tau, \mathbf{x}) + \sum_{s=1}^{\tau-1} \ln(1 - \phi_1(s, \mathbf{x}) - \phi_2(s, \mathbf{x})). \tag{57}$$
• Spells that are incomplete at time $T^*$:
$$\ln \Pr(t > T^* \mid \mathbf{x}) = \sum_{s=1}^{T^*} \ln(1 - \phi_1(s, \mathbf{x}) - \phi_2(s, \mathbf{x})). \tag{58}$$
C. Limited information ML based on competing risk models
We can also estimate separately the model for each option, considering exits
to the other option as censored, in a competing risk fashion. This is a LIML
estimation. The likelihood for option j would be:
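$$L_{Nj} = \sum_{i=1}^{N} \sum_{\tau=1}^{T^*} w_{i\tau} \left\{ y_{ij\tau} \ln h_j(\tau, \mathbf{x}_i) + (1 - y_{ij\tau}) \ln(1 - h_j(\tau, \mathbf{x}_i)) \right\}, \tag{59}$$
or, alternatively:
$$L_{Nj} = \sum_{i=1}^{N} \sum_{\tau=1}^{T^*} w_{i\tau} \left\{ y_{i\tau} d_{ij} \ln h_j(\tau, \mathbf{x}_i) + (1 - y_{i\tau} d_{ij}) \ln(1 - h_j(\tau, \mathbf{x}_i)) \right\}, \tag{60}$$
where $y_{ij\tau} = \mathbf{1}\{t_i = \tau, d_{ij} = 1\}$, $y_{i\tau} = \mathbf{1}\{t_i = \tau\}$, and $w_{i\tau} = \mathbf{1}\{t_i \geq \tau\}$, with
$t_i = \min\{t_{1i}^*, t_{2i}^*\}$, which delivers two types of contributions as in the standard discrete
case (with the "censored" contributions being for both censored observations and
exits to alternative $k \neq j$).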
References
Cameron, A. Colin and Pravin K. Trivedi (2005), Microeconometrics:
Methods and Applications, Cambridge University Press.
Cox, David R. (1972), “Regression Models and Life Tables (with Discussion)”,
Journal of the Royal Statistical Society, B, 34, 187-220.
Lancaster, Tony (1979), “Econometric Models for the Duration of Unemploy-
ment”, Econometrica, 47, 939-956.
Lancaster, Tony (1990), Econometric Analysis of Transition Data, Cambridge.
Van den Berg, Gerard (2001), “Duration Models: Specification, Identification
and Multiple Durations”, in J.J. Heckman and E. Leamer (eds.), Handbook of
Econometrics, Vol. 5, Ch. 55.