bayesian estimation of time-changed default intensity modelsbayesian estimation of time-changed...

Finance and Economics Discussion SeriesDivisions of Research & Statistics and Monetary Affairs

Federal Reserve Board, Washington, D.C.

Bayesian Estimation of Time-Changed Default Intensity Models

Michael B. Gordy and Pawel J. Szerszen

2015-002

Please cite this paper as:Gordy, Michael B. and Pawel J. Szerszen (2015). “Bayesian Estimation of Time-ChangedDefault Intensity Models,” Finance and Economic Discussion Series 2015-002. Board of Gov-ernors of the Federal Reserve System (U.S.). http://dx.doi.org/10.17016/FEDS.2015.002

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminarymaterials circulated to stimulate discussion and critical comment. The analysis and conclusions set forthare those of the authors and do not indicate concurrence by other members of the research staff or theBoard of Governors. References in publications to the Finance and Economics Discussion Series (other thanacknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Bayesian Estimation ofTime-Changed Default Intensity Models∗

Michael B. Gordy and Pawel J. SzerszenFederal Reserve Board, Washington, DC, USA

January 6, 2015

AbstractWe estimate a reduced-form model of credit risk that incorporates stochastic volatility indefault intensity via stochastic time-change. Our Bayesian MCMC estimation methodovercomes nonlinearity in the measurement equation and state-dependent volatility inthe state equation. We implement on firm-level time-series of CDS spreads, and findstrong in-sample evidence of stochastic volatility in this market. Relative to the widely-used CIR model for the default intensity, we find that stochastic time-change offersmodest benefit in fitting the cross-section of CDS spreads at each point in time, butvery large improvements in fitting the time-series, i.e., in bringing agreement betweenthe moments of the default intensity and the model-implied moments. Finally, we obtainmodel-implied out-of-sample density forecasts via auxiliary particle filter, and find thatthe time-changed model strongly outperforms the baseline CIR model.

JEL classification: C11, C15, C58, G12, G17Keywords: Bayesian estimation; MCMC; Particle filter; CDS; Credit derivatives; CIR

process; Stochastic time change

∗We have benefitted from discussion with Dobrislav Dobrev, and from the excellent research assistance ofHenry Fingerhut, Danny Kannell, Jim Marrone, Danny Marts and Bobak Moallemi. The opinions expressedhere are our own, and do not reflect the views of the Board of Governors or its staff. Address correspondenceto Pawel Szerszen, Federal Reserve Board, Washington, DC 20551; email 〈[email protected]〉.

1 Introduction

Credit spreads widened dramatically during the financial crisis of 2007–09. The CDX indexof five year North American investment grade corporate credit default swaps (CDS) rosefrom under 30 basis points (bp) in February 2007 to 280bp in November 2008. Perhapsless widely appreciated, the financial crisis also witnessed bursts of extreme volatility inspreads. This can be seen in Figure 1, where we plot daily changes in the log of the parspread on the on-the-run CDX index. The figure suggests that stochastic volatility is not aphenomenon peculiar to the financial crisis, as bursts of volatility can be seen, for example,around March 2005 and September 2011. More formally, Gordy and Willemann (2012) findthat a hypothesis of constant volatility in a Vasicek model of log-spreads is strongly rejectedeven in the pre-crisis data, and Alexander and Kaeck (2008) provide evidence of regime-dependent volatility in a Markov switching model of CDS spreads. Arguably, financialinstitutions might have been better prepared for the financial crisis had risk managementand rating models incorporated stochastic volatility. As a case study in model risk, Gordyand Willemann (2012) show that the high investment grade ratings assigned pre-crisis toconstant proportion debt obligations depended crucially on the rating agencies’ assumptionof constant volatility in spreads.

In practical application and in the empirical literature, the most widely-used pricingmodels for CDS take the so-called reduced form approach, pioneered by Jarrow and Turnbull(1995) and Duffie and Singleton (1999), in which a firm’s default occurs at the first eventof a non-explosive counting process with stochastic intensity. Broadly speaking, there arethree ways in which to accommodate the patterns of Figure 1 in this class of models. First,if we begin with the assumption that the log of the intensity follows a Vasicek process, asin Pan and Singleton (2008), then we can simply augment the process with positive andnegative jumps. The diffusion component would drive the low volatility periods, and thejumps would accommodate the high volatility periods. Observe here that the log transformof the intensity is what allows us to accommodate negative jumps without violating the zerolower bound on the intensity process, but it carries the price of computational intractability.A second approach builds on the widely-used and analytically tractable single-factor CIRspecification for the intensity. Jacobs and Li (2008) employ a two-factor specification inwhich a second CIR process controls the volatility of the intensity process. They estimatethe model on time-series of corporate bond spreads and find strong evidence of positivevolatility of volatility. The model retains tractability in pricing, but at the expense ofthe zero lower bound. Except when volatility of volatility is zero, there is no region of theparameter space for which the default intensity is bounded nonnegative. The third approachis to induce stochastic volatility via stochastic time-change. The constant volatility modelis assumed to apply in a latent “business time.” The speed of business time with respect to

1

calendar time is stochastic, and captures the intuition of time-variation in the rate of arrivalof news to the market. When applied to default intensity models, analytic tractability issacrified to a degree, but Mendoza-Arriaga and Linetsky (2014) and Costin, Gordy, Huang,and Szerszen (forthcoming) derive computationally efficient series solutions to the time-changed model. An advantage to this approach is that the calendar-time default intensityis bounded nonnegative so long as the business-time intensity is bounded nonnegative.

In this paper, we take the third approach. Although CDS pricing is tractable, estimationof the model presents significant challenges. The model has two latent state variables (onefor the default intensity in business time, and one for the time-change process). Due tononlinearities in the pricing function and state-dependent variance in the state evolution,the model is not well-suited to maximum likelihood estimation by Kalman filter or itsextensions. Bayesian Markov chain Monte Carlo (MCMC) estimation is viable, but theconventional MCMC algorithm will be prone to very poor convergence rates in our settingprimarily because the latent default intensity has high serial dependence. Variables that arehighly correlated are best sampled in blocks, but the model nonlinearity and state-dependentvariance make this infeasible. To overcome these obstacles, we build on the approachintroduced by Stroud, Müller, and Polson (2003) in which an auxiliary linearized modelfacilitates sampling of the latent state variables in blocks. As a by-product of the MCMCestimation, we obtain smoothed estimates of the path of the default intensity and time-change increments. Moreover, missing data is easily addressed in the MCMC algorithm.

We estimate the model on firm-level time-series of CDS spreads and find strong evidenceof material stochastic volatility. Relative to the model without time-change, we find thatstochastic time-change offers modest benefit in fitting the cross-section of CDS spreads ateach point in time. However, we find that stochastic time-change offers very large improve-ments in fitting the time-series, i.e., in bringing agreement between the in-sample momentsof the default intensity and the model-implied moments.

Finally, we assess model performance in out-of-sample density forecasts. We hold pa-rameters fixed at the mean of the in-sample posterior distribution, and filter state variablesthrough the out-of-sample period. Although non-adapted particle filters are straightforwardto implement in our model setting, they are prone to particle degeneration whenever abruptchanges of CDS spreads occur. We develop a partially-adapted auxiliary particle filter whicheffectively mitigates particle degeneration while preserving tractability. For nine of the tenfirms in our sample, we reject the baseline CIR model in favor of the model with stochastictime-change.

Our model specification and CDS pricing are set forth in Section 2. Our econometricmethodology is described in Section 3. Data and estimation results are presented in Section4. Out-of-sample performance is assessed in Section 5. Section 6 concludes.

2

2 Model specification and pricing

Mendoza-Arriaga and Linetsky (2014) and Costin et al. (forthcoming, hereafter CGHS)introduce stochastic time change to default intensity models for pricing credit default swapsand other credit sensitive instruments. A firm’s default occurs at the first event of a non-explosive counting process. Under the business-time clock, the intensity of the countingprocess is λt. The intuition driving default intensity models is that λtdt is the probabilityof default before business time t+ dt, conditional on survival to business time t.

We assume complete markets, no arbitrage, and the existence of an equivalent mar-tingale measure Q. Unless stated otherwise, all specifications of stochastic processes andexpectations are taken with respect to this measure. We assume that λt is a CIR processunder the business clock, i.e., that it follows the stochastic differential equation

dλt = (µ− κλt)dt+ σ√λtdWt (1)

where Wt is a Brownian motion. We assume that µ > 0 and initial condition λ0 ≥ 0, butdo not restrict κ. Our approach can easily be generalized to allow for independent positivePoisson jumps in the default intensity process, but we do not pursue this extension here.

Define Λt(s) as the time-integral (or “compensator”) of the default intensity from timet to time t+ s, i.e., Λt(s) =

∫ t+st λudu. Conditional on the survival to time t and the state

λt, the probability of survival to date t+ s is

St(s; `) = Pr(τ > t+ s|τ > t, λt = `) = E [exp(−Λt(s))|λt = `] . (2)

where τ denotes the default time of the firm. This function has an exponential affine formas detailed in Duffie and Singleton (2003, Appendix A.5).

We now introduce stochastic time-change. Let Tt be the stochastic business time asso-ciated with calendar time t. For a given firm, let τ denote the calendar default time, sothat τ = Tτ is the corresponding time under the business clock. We assume that processesλ and T are independent, i.e., that there is no leverage effect. In the empirical literature onstochastic volatility in stock returns, there is strong evidence for dependence between thevolatility factor and stock returns (e.g., Andersen et al., 2002; Jones, 2003; Jacquier et al.,2004). In the credit risk literature, the evidence is rather less compelling. Across the firmsin their sample, Jacobs and Li (2008) find a median correlation of around 1% between thedefault intensity diffusion and the volatility factor.

We assume that the time-change Tt is an inverse Gaussian (IG) process with meanparameter t and shape parameter αt2. This is an almost surely increasing Lévy pro-cess such that E [exp(u(Tt+s − Tt))] = exp(sΨ(u)) where the Laplace exponent is Ψ(u) =α(1−

√1− 2u/α

). As E [Tt+s − Tt] = s for all s and t, we say that the business clock is

3

unbiased, i.e., business time moves at the same speed on average as calendar time. Theparameter α can be interpreted as a precision parameter. As α → ∞, business time con-verges in probability to calendar time (hence, no time-change). As a Lévy subordinator,the process has increments Tt+s − Tt, Tt+2s − Tt+s, Tt+3s − Tt+2s, . . . that are independentand identically distributed, so the specification rules out the possibility of volatility clus-tering in time. The methodology of Mendoza-Arriaga and Linetsky (2014) allows Tt to beany Lévy subordinator, and CGHS allow for a still broader class of time-change processes.Nonetheless, we have found the IG specification to be flexible and numerically well-behavedin our econometric application, so impose this specification throughout this paper.

Independence of the default intensity and business clock implies that time-changing thedefault time is equivalent to time-changing Λt; that is, that the compensator in calendar timeis Λt(s) = ΛT (t)(Tt+s−Tt). Since λt and Tt are Markov processes, the triplet (Tt, λ(Tt), τ > t)is a sufficient statistic for the information at time t. The conditional calendar-time survivalprobability function is

St(s; `) = Pr(τ > t+ s|τ > t, Tt, λ(Tt) = `) = Pr(τ > Tt+s|τ > Tt, Tt, λ(Tt) = `)

= E[E[exp(−ΛT (t)(Tt+s − Tt))|Tt+s, Tt, λ(Tt) = `

] ∣∣∣∣Tt, λ(Tt) = `

]= Et

[ST (t)(Tt+s − Tt; `)

].

Mendoza-Arriaga and Linetsky (2014) provide a series solution to St(s; `) via Bochnersubordination of the eigenfunction expansion. The expansion is uniformly convergent whenthe intensity process is stationary (µ > 0 and κ > 0), but cannot be employed in thecase of non-stationarity (κ ≤ 0). Both Duffee (1999) and Jacobs and Li (2008) find thatthe default intensity process is indeed non-stationary under the risk-neutral measure forthe typical firm. In our empirical results below, we find that the mean of the posteriordistribution for κ under the risk-neutral measure is negative for every firm in our sample.Therefore, we employ the “expansion in derivatives” method of CGHS, which does notimpose any restriction on κ. Calendar-time survival probabilities are given by

St(s; `) ≈M∑m=0

α−mm∑j=0

cm,jsjDm+j

s ST (t)(s; `) (3)

where Ds is the differential operator dds , M is the order of approximation, and

cm,j = 12m

1m

1(j − 1)!

(2mm− j

)

for m ≥ j ≥ 1. For j = 0, c0,0 = 1 and cm,0 = 0 for m > 0. Derivatives of the business-time

4

survival function are easily computed using a recurrence rule in §4 of CGHS. CGHS showthat expansion (3) diverges as M → ∞ but yields an accurate approximation when theseries is truncated close to the numerically least term. In our empirical application, we fixM = 2, which CGHS find generally accurate to within the bid-ask spread.

Let rt denote the riskfree short rate, which we assume to be independent of the defaultintensity and time-change. Our pricing and estimation methods each can be generalized toallow for dependence, but at significant cost in complexity and computational resources.1

Duffee (1999) finds that the correlation between riskfree rates and default intensity is typ-ically negative but has second-order importance, so we do not pursue the extension here.The discount function is given by

Pt(s) = Et[exp

(−∫ t+s

trudu

)]CDS pricing follows the treatment of Leeming et al. (2010, §2). By independence of the

short rate and default risk, we can write the value of the remaining quarterly premium legpayments at t+ s1, . . . , t+ sn at date t as

PremiumLeg = cn∑i=1

Pt(si)St(si;λT (t))B(si)

where B(s) = min{0.25, s} is the coupon time interval. The value of protection leg paymentsis

ProtectLeg = (1− ρ)∫ s(n)

0Pt(s)qt(s;λT (t)) ds

where ρ is the expected recovery rate as a fraction of face value, and qt(s; `) is the densityof the default time under calendar time conditional on business-time intensity λT (t) = `.This density is simply qt(s; `) = −S′t(s; `). The model-implied par CDS spread equates thevalue of the premium leg to that of the protection leg.2

In Section 4, we will compare the smoothed estimates of the default intensity in amodel without time-change to the calendar-time intensity λt in the full model with time-change. The default intensity coincides with the instantaneous forward default rate, soλt = qt(0;λT (t)) is readily computed.3

As emphasized by Jarrow et al. (2005), market prices can reveal only the Q-intensity,1On pricing in a multifactor framework with dependence, see Remark 4.1 in Mendoza-Arriaga and Linet-

sky (2014) and §7 in CGHS.2Since April 2009, CDS have been traded on an “upfront” basis with annual coupons fixed to 1% or 5%.

For observations since this change in market convention, we convert upfront to par spread using the standardformula published by ISDA. Leeming et al. (2010, §4).

3Formal treatment of the existence of the calendar-time default intensity can be found in Mendoza-Arriagaand Linetsky (2014, Theorems 3.2(iii), 3.3(iii)).

5

which may or may not contain a risk-premium for the event risk associated with default.While the intensity under the physical measure P is not identified in our setting, oursmoothed estimate of λt reveals the P-dynamics of the Q-intensity.4 We specify the dynam-ics of λt under the two measures jointly in the standard fashion. The drift and volatilityparameters in (1) are restricted to be the same across the two measures, but the risk-neutralmean-reversion parameter κQ differs from the physical κP by an unrestricted risk-premium.This change of measure is called “drift change in the intensity” by Jarrow et al. (2005),and has been adopted by Duffee (1999) and Jacobs and Li (2008), among others. In theempirical implementation below, we will restrict κP > 0 to guarantee stationarity under thephysical measure, but will not restrict κQ.

In principle, we can introduce a second risk-premium on the uncertainty due to time-change. Say we assume that Tt+s − Tt ∼ IG(s, αs2) under the physical measure P. Forthe time-change process Tt to remain within the IG family under Q, the increments tothe business clock must be distributed Tt+s − Tt ∼ IG(sψ, αs2) under Q where the risk-premium parameter ψ is nonnegative.5 The issue is whether ψ can be estimated. Forlarge α, Tt+s − Tt converges in probability to sψ. Since λt follows a scale-invariant CIRprocess, the parameters of the time-changed default intensity could not be identified inthe asymptotic case of α→∞. In numerical experiments based on simulated time-series oflength comparable to our data and values of α from the range of estimates reported below inSection 4, we find that ψ is only very weakly identified. Therefore, we make the simplifyingassumption that uncertainty due to time-change is unpriced, i.e., that Tt+s−Tt ∼ IG(s, αs2)holds under both P and Q.

Stochastic time-change can have a dramatic effect on the kurtosis of changes in thetime-series of CDS spreads. In Figure 2, we plot kurtosis of the stationary distribution ofchanges in the CDS spread as a function of α on a log-log scale. Let CDS(λ;α) be thefive-year par spread as a function of the current business-time intensity. For each value of αand horizon δ, we first draw a sample λ(i)

0 , i = 1, . . . , I, from the stationary distribution forλt. Next, we draw the elapsed business-time T (i)

δ ; and then draw λ(i)(T (i)δ ) from the CIR

transition density. From this sample, we calculate the kurtosis of the stationary distributionof CDS(λ(i)(T (i)

δ );α) − CDS(λ(i)0 ;α). We plot separate curves for a one day horizon (δ =

1/250, assuming 250 trading days per year), a one month horizon (δ = 1/12), and anannual horizon (δ = 1). Parameters for the business-time CIR process are fixed to κP =κQ = 0.2, µ = 0.004, σ0 = .1, and recovery is fixed to ρ = 0.4. As we expect, kurtosis at allhorizons tends to its asymptotic CIR limit as α → ∞. For fixed α, kurtosis also tends to

4This limitation applies as well to Duffee (1999), Jacobs and Li (2008), and Pan and Singleton (2008).Driessen (2005) brings additional ratings-based information to bear to identify λP

t and λQt separately.

5This is the only form that satisfies the necessary integrability conditions on the Radon-Nikodym deriva-tive of the Lévy measures of jumps under P and Q set forth in Theorem 7.3 of Barndorff-Nielsen and Shiryaev(2010).

6

its CIR limit as δ → ∞. This is because an unbiased trend stationary time-change has noeffect on the distribution of a stationary process far into the future. For intermediate valuesof α (say, between 1 and 10), we see that time-change has a modest impact on kurtosisbeyond one year, but a material impact at a one month horizon, and a very large impactat a daily horizon.

3 Estimation method

Our choice of estimation method is guided by two characteristics of the model of Section2. First, the mapping from default intensity to CDS spreads is nonlinear and sensitive toparameters. Second, the latent default intensity is persistent under the physical measureand has state-dependent variance. Bayesian Markov chain Monte Carlo (MCMC) estimationis a natural choice in this setting, but the conventional MCMC algorithm will suffer fromvery poor convergence rates in our setting primarily because the latent default intensityhas high serial dependence. Variables that are highly correlated are best sampled in blocks,but the model nonlinearity and state-dependent variance make this infeasible. To overcomethis obstacle, we build on the approach introduced by Stroud, Müller, and Polson (2003,hereafter SMP) in which an auxiliary linear model facilitates sampling of the latent statevariables in blocks.

MCMC estimation has additional advantages. As a by-product of the estimation, weobtain smoothed estimates of default intensity and time-change increments. Moreover,missing data is easily addressed in the MCMC algorithm by augmenting the states. Relativeto particle filtering, a limitation of MCMC is that online estimation of states and parametersis computationally infeasible. We turn to particle filtering for out-of-sample forecastingexercises in Section 5.

Section 3.1 offers an introduction to Bayesian MCMC methods in general and the SMPalgorithm in particular. The presentation is in a simplified setting and may be skipped overby readers familiar with the SMP method. In Section 3.2, we adapt the SMP method tothe model of Section 2.

3.1 MCMC estimation with a linear auxiliary model

Our introduction to Bayesian MCMC methods is modeled after Chib and Greenberg (1996)and Jones (2003). Let Y = (y1, . . . , yT ) denote the vector of observations, X = (x1, . . . , xT )be the vector of latent state variables and Θ be the vector of model parameters.6 In Bayesianinference the joint posterior of states and parameters is of interest and can be derived given

6Throughout the paper, we employ caligraphic font to distinguish dataset dimensions from model quan-tities, e.g., T as a count of time-series observations vs. Tt as the stochastic time-change process.

7

the prior distribution on the parameters:

p(Θ, X|Y ) ∝ p(Y |X,Θ) · p(X|Θ) · p(Θ),

where p(Y |X,Θ) is the likelihood function of the model, p(X|Θ) is the probability distri-bution of state variables conditional on the parameters and p(Θ) is the prior probabilitydistribution on the parameters of the model. The joint posterior distribution p(Θ, X|Y )is in general analytically and computationally intractable. MCMC methods overcome thisproblem by breaking the high-dimensional vectors X and Θ into low-dimensional subvectorswith complete conditional posterior distributions that are more easily sampled. The jointposterior is, by construction, the invariant distribution of the chain.

The Gibbs sampler of Geman and Geman (1984) is typically the preferred method whenthe conditional posterior can be sampled directly. We partition X and Θ into, respectively,JX and J Θ subvectors X(1), X(2), ..., X(JX) and Θ(1),Θ(2), ...,Θ(JΘ). The Markov chainis initialized at chosen values X0 and Θ0. The chain is formed by drawing iterativelyfrom transition densities p(X(i)

n |X(−i)n ,Θn−1, Y ), i = 1, 2, . . . ,JX , and p(Θ(j)

n |Θ(−j)n , Xn, Y )

j = 1, 2, . . . ,J Θ, where X(−i)n ≡ (X(k)

n )k<i ∪ (X(k)n−1)k>i and Θ(−j)

n ≡ (Θ(k)n )k<j ∪ (Θ(k)

n−1)k>j .Under mild regularity conditions, the chain (Xn,Θn) converges to its invariant distributionp(Θ, X|Y ).

When it is difficult or impossible to draw directly from a complete conditional posteriorfor a subvectorX(i) or Θ(j), that step is replaced by the Metropolis-Hastings (MH) algorithm(Metropolis et al., 1953). Say X(i) is the variable to be sampled in iteration n. We samplefrom a tractable proposal density p(X;X(i)

n−1|X(−i)n ,Θn−1, Y ). The draw X is accepted with

probability

min

1,p(X|X(−i)

n ,Θn−1, Y ) · p(X(i)n−1; X|X(−i)

n ,Θn−1, Y )p(X(i)

n−1|X(−i)n ,Θn−1, Y ) · p(X;X(i)

n−1|X(−i)n ,Θn−1, Y )

If accepted, we set X(i)

n = X. If rejected, then the value X(i)n−1 is retained, i.e., we set

X(i)n = X

(i)n−1. A similar procedure is used when Θ(j) is the variable to be sampled.

Observe that if the true conditional density is taken as the proposal density, the draw isaccepted with probability one, and the MH step is simply a Gibbs sampler. In application,it is often the case that the true conditional density is too complicated to approximate withany reliability, in which case a standard approach is to sample as a normal random walk,i.e., draw X as a normal random variable centered at X(i)

n−1. While simple to implement,this approach may suffer from slow convergence.

Correlation across parameters or state variables poses a computational challenge inMCMC estimation. Joint sampling of correlated variables yields faster convergence rates,but is often much more difficult to implement. In our application, this dilemma is especially

8

acute for sampling of the default intensity. Strong persistence in the default intensity impliesthat we ought to sample jointly the default intensity states over extended blocks of time.Were the model linear and Gaussian, we could sample jointly via the forward filteringbackward sampling (FFBS) method of Carter and Kohn (1994) and Frühwirth-Schnatter(1994). In our model setting, nonlinearity and state-dependent variance make this infeasible.The single-move MCMC sampler (Carlin et al., 1992) can be applied, but we have foundthe cost in rate of convergence to be insurmountable.

SMP propose an MCMC algorithm that is well-suited to our problem. Their approachis quite general, but for illustration we adopt a simplified setting with nonlinearity in themeasurement equation, constant variance in the state equation, and both yt and xt uni-dimensional. The essential idea is to extend the state vector to include auxiliary mixingvariables ~k = (k1, . . . , kT ) with kt having discrete support {1, . . . ,K}. Conditional on themixing variable, the measurement equation is approximated as Gaussian and linear in xt,which allows FFBS to be applied to the proposal distribution. Finally, this FFBS drawis accepted or rejected, as in MH, to account for the gap between the true measurementequation and the linearized approximation that provides the proposal density. There is atrade-off in choosing the number of nodes in the approximation: The larger is K, the smalleris the gap between true and linearized models, so the higher the acceptance rate. However,the computational cost per MCMC iteration increases with K.

SMP specify the auxiliary model for yt as Gaussian with pa(yt|xt, kt = k,Θ) ∼ N(γt,k +βt,kxt, ω

2t,k) for (possibly) time-varying coefficients γt,k, βt,k, ωt,k. The mixture weights are

standardized Gaussian weights

pa(kt = k|xt,Θ) = φ(xt; νk, ξk)w(xt)

where φ(x; ν, ξ) denotes the normal density with mean ν and variance ξ2 and the scalingfactor w(x) =

∑Kk=1 φ(x; νk, ξk) guarantees that the weights sum to one. For tractability, the

mixture variables are assumed to be serially independent, conditional on xt. The auxiliarymixture model is then

pa(yt|xt,Θ) =K∑k=1

pa(yt|xt, kt = k,Θ) · pa(kt = k|xt,Θ) (4)

The node constants (νk, ξk) and identifying restrictions for (γt,k, βt,k, ωt,k) are chosen so thatthe auxiliary density pa(yt|xt,Θ) is close to the target density p(yt|xt,Θ).

In each iteration of the MCMC procedure, we draw Θ in the usual way, and draw thestate variables as follows. For notational compactness, we drop the subscript n for indexingMCMC iterations.

9

1. Generate mixture indicators kt from the complete conditional posterior pa(kt|yt, xt,Θ),which has a multinomial distribution. By Bayes’ rule,

pa(kt|yt, xt,Θ) ∝ pa(yt|xt, kt,Θ) · pa(kt|xt,Θ)

This is a Gibbs step.

2. Conditional on the mixture indicators, draw states jointly via FFBS from the MHproposal density for the mixture model. The proposal density is given by:

p(X|~k, Y,Θ) = p(x0)T∏t=1

p(xt|xt−1,Θ) · pa(yt|xt, kt,Θ) · φ(xt; νk(t), ξk(t))

This is similar to the MH proposal density in a linear model without auxiliary vari-ables, but incorporates an extra Gaussian kernel φ(xt; νk(t), ξk(t)) to capture the ad-ditional conditioning information from the auxiliary variables. Denote the proposaldraw X = (x1, . . . , xT ).

3. Accept or reject X. The acceptance probability is

min{

1,T∏t=1

p(yt|xt,Θ)w(xt)pa(yt|xt,Θ)

w(xt)pa(yt|xt,Θ)p(yt|xt,Θ)

}(5)

where the {xt} denote the previous draw of X in the Markov chain.

Observe that the acceptance probability is maximized when the auxiliary model likelihoodpa(yt|xt,Θ) is close to the true likelihood p(yt|xt,Θ). When the acceptance probability islow, performance may be improved by dividing the time-series into blocks, b = 1, . . . ,B,and tuning the node constants (νbk, ξbk) on a block-by-block basis.

3.2 Application to estimation of the time-changed default intensity model

In this section, we cast the model of Section 2 as a discrete-time state-space model, andthen adapt the SMP algorithm to accommodate a multidimensional measurement equationand state-dependent variance in the state equation for the default intensity.

Our notation needs to track three time-scales, one for observations, one for calendar time,and one for business time. Henceforth, we let t = 1, . . . , T index a set of daily observations.The associated (deterministic) calendar time is denoted Tt. For simplicity, we take tradingdays as equally spaced in calendar time at an interval of ∆ = Tt+1 − Tt = 1/250 of a year.The stochastic business time associated with observation time t is T (Tt). For notationalconvenience, let ht = λ(T (Tt)) be the business-time default intensity at the business time

10

associated with observation date t.7

Fixing a given reference entity, let yt,m be the log of the CDS spread observed on datet for maturity m ∈ M = {1, 2, 3, 5, 7, 10} years. Let Fm(ht;µ, κQ, σ, α, ρt, Pt) map ht to amodel-implied log-spread on a CDS of maturity m. For notational compactness, we dropexplicit reference to the date-t riskfree discount function and recovery rate, and simply writeFt,m(ht;µ, κQ, σ, α).

We assume that CDS log-spreads are observed with noise. Without measurement noise,the spread would be a deterministic function of underlying parameters and the state, whichwould lead to stochastic singularity in filtering and smoothing.8 In the option pricingliterature, this approach has been used by Eraker (2004) and Johannes et al. (2009), amongothers. Our measurement equation is

yt,m = Ft,m(ht;µ, κQ, σ, α) + ζ ε(y)t,m (6)

Measurement errors ε(y)t,m are assumed to be i.i.d. standard normal random variables. The

new parameter ζ scales the pricing errors. For parsimony, we impose homoscedasticityacross both time and contract maturity.

The model has two state evolution equations. The first governs the distribution ofincrements to the business time clock. Let

χt+1 = T (Tt+1)− T (Tt)iid∼ IG(∆, α∆2) (7)

be the inverse Gaussian increment to the business clock between observations t and t + 1.The second governs the evolution of the default intensity under the physical measure. Weapply a first order Euler discretization scheme to SDE (1) with stochastic time-incrementsto obtain

ht+1 = ht + (µ− κPht)χt+1 + σ√htχt+1 ε

(λ)t+1 (8)

The {ε(λ)t } are i.i.d. standard normal random variables, independent of {ε(y)

t,m}. We assumethat both κP and µ are strictly positive, so that ht is bounded nonnegative and stationary.9

We refer to equations (6)–(8) as the state-space representation.Our framework extends the basic SMP algorithm of Section 3.1 along two dimensions.

First, our measurement equation is of dimension #M, as we have one observation foreach contract in maturity set M. As in the standard SMP algorithm, we introduce a

7This should not be confused with the calendar time default intensity λ(Tt).8In our setting, measurement error could alternatively be justified by misspecification or estimation error

in the fitted riskfree term-structure Pt or by the presence of time-varying liquidity premia in the CDS market.9Discretization introduces the possibility of negative realizations of the default intensity. We avoid this

in the MCMC simulation by simple truncation at zero.

11

linear auxiliary model for the measurement equation with mixing variable k(y)t . Second,

and of greater consequence, the state equation (8) for ht has state-dependent volatility,as V [ht+1|ht] = σ2htχt+1. As FFBS requires state-independent volatility, we introducea linear auxiliary model for the state equation (8) with a new mixing variable k(λ)

t . Forparsimony, the two mixing variables share the same support {1, . . . ,K}, as well as meanand volatility constants (νk, ξk).10

We assume that log-spread yt,m is conditionally independent of k(λ)t given k

(y)t . The

auxiliary model for the measurement equation is conditionally Gaussian with

pa(yt|ht, k(y)t ,Θ) =

∏m∈M

pa(yt,m|ht, k(y)t ,Θ)

Hence, the auxiliary model is given by the following mixture of Gaussian densities:

pa(yt|ht,Θ) =K∑k=1

pa(k(y)t = k|ht,Θ)

∏m∈M

pa(yt,m|ht, k(y)t = k,Θ)

where

pa(k(y)t = k|ht,Θ) = φ(ht; νk, ξk)

w(ht)for w(x) =

K∑k=1

φ(x; νk, ξk)

As in the standard SMP algorithm, we assume that the auxiliary model for the measure-ment equation is linear and Gaussian conditional on the mixture indicator. We linearizethe measurement equation (6) locally at node k, and define

pa(yt,m|ht, k(y)t = k,Θ) ∼ N(Ft,m(νk; Θ) + F ′t,m(νk; Θ) · (ht − νk), ζ2)

Observe that the coefficients in the mean of the auxiliary distribution vary over time andacross maturity, as well as across nodes.

We now introduce the auxiliary model for the state evolution equation (8). Conditionalon k(λ)

t and ht−1, ht is independent of the measurement equation mixing variable k(y)t . The

auxiliary model is conditionally linear and Gaussian. We linearize state equation (8) atnode k, and define

pa(ht|ht−1, k(λ)t = k, χt,Θ) ∼ N(µχt + (1− κPχt)ht−1, σ

2χtνk).

As the drift of the true evolution equation is affine in ht−1, the mean of the auxiliary modelcan be matched to the true mean without conditioning on the node. Only the variance of

10In our implementation, we achieve higher acceptance rates if the time-series is divided into contiguousblocks of, say, 40–50 observations each. The node constants are tuned on a block-by-block basis.

12

the auxiliary model depends on k(λ).The auxiliary model for state evolution equation is decomposed in the usual way:

pa(ht|ht−1, χt,Θ) =K∑k=1

pa(ht|ht−1, k(λ)t = k, χt,Θ) · pa(k(λ)

t = k|ht−1,Θ) (9)

The state k(λ)t is assumed to be conditionally independent of k(y)

t with

pa(k(λ)t = k|ht−1,Θ) = φ(ht−1; νk, ξk)

w(ht−1) (10)

We introduce additional notation so that the algorithm can be presented compactly.The data vector is Y = (y1, . . . , yT ) where yt = (yt,m)m∈M is the vector of CDS log-spreadsobserved at time t for all maturitiesm ∈M. The parameter vector is Θ = (µ, κP, κQ, σ, α, ζ).The state vector X contains four components: default intensities X(λ) = (h1, . . . , hT ); time-change increments X(χ) = (χ1, . . . , χT ); mixture variables for the observation equationX(y) = ~k(y) = (k(y)

1 , . . . , k(y)T ) with k

(y)t ∈ {1, . . . ,K}; and mixture variables for the state

evolution equation X(x) = ~k(λ) = (k(λ)1 , . . . , k

(λ)T ) with k

(λ)t ∈ {1, . . . ,K}. For any subset

V ⊂ X, let X(−V ) denote X\V .In each iteration of the MCMC procedure, we follow these steps:

Draw the mixture indicator variables. By Bayes’ law and the assumption of condi-tional independence,

pa(k(y)t |X(−~k(y)), Y,Θ) ∝ pa(yt|ht, k(y)

t ,Θ) · pa(k(y)t |ht,Θ)

pa(k(λ)t |X(−~k(λ)), Y,Θ) ∝ pa(ht|ht−1, k

(λ)t , χt,Θ) · pa(k(λ)

t |ht−1,Θ)

The complete conditional posterior distribution for each mixture indicator is multi-nomial and is easily sampled.

Generate proposal default intensity states. We apply Bayes’ law to the auxiliary modelfor the state equation:

pa(X(λ)|X(−λ), Y,Θ) ∝ pa(Y |X(λ),~k(y),~k(λ),Θ) · pa(X(λ)|~k(y),~k(λ), X(χ),Θ)

∝ pa(Y |X(λ),~k(y),Θ) · pa(~k(y)|X(λ),Θ) · pa(X(λ)|~k(λ), X(χ),Θ)

13

where

pa(Y |X(λ),~k(y),Θ) =T∏t=1

∏m∈M

pa(yt,m|ht, k(y)t ,Θ)

pa(~k(y)|X(λ),Θ) =T∏t=1

φ(ht; νkt , ξkt)w(ht)

pa(X(λ)|~k(λ), X(χ),Θ) = pa(h0|k(λ)1 ,Θ)

T∏t=2

pa(ht|ht−1, k(λ)t , χt,Θ)

The proposal distribution is

p(X(λ)|X(−λ), Y,Θ) ∝ pa(X(λ)|X(−λ), Y,Θ) ·T∏t=1

w(ht)

The standard FFBS algorithm can be used to sample jointly default intensity statesfrom the proposal distribution p.

Accept or reject proposal draw. By repeated application of Bayes’ law:

p(X(λ)|X(−λ), Y,Θ)p(X(λ)|X(−λ), Y,Θ)

∝ p(Y |X(λ),Θ) · p(X(λ)|X(χ),Θ) · pa(~k(y)|X(λ),Θ) · pa(~k(λ)|X(λ),Θ)pa(Y |X(λ),~k(y),Θ) · pa(X(λ)|~k(λ),Θ) · pa(~k(y)|X(λ),Θ) ·

∏Tt=1w(ht)

∝ p(Y |X(λ),Θ) · p(X(λ)|X(χ),Θ)pa(Y |X(λ),Θ) · pa(X(λ)|X(χ),Θ) ·

∏Tt=1w(ht)

Hence, the MH acceptance probability is given by:

min{

1, p(Y |X(λ),Θ) · p(X(λ)|X(χ),Θ) · pa(Y |X(λ),Θ) · pa(X(λ)|X(χ),Θ) ·

∏Tt=1w(X(λ)

t )p(Y |X(λ),Θ) · p(X(λ)|X(χ),Θ) · pa(Y |X(λ),Θ) · pa(X(λ)|X(χ),Θ) ·

∏Tt=1w(ht)

}(5′)

where X(λ) = (h1, . . . , hT ) is the proposal draw and X(λ) = (h1, . . . , hT ) is the currentvalue. Relative to the standard SMP acceptance probability in (5), expression (5′)introduces the ratio of kernels pa(X(λ)|X(χ),Θ) and p(X(λ)|X(χ),Θ) to account forthe second auxiliary model for the state equation. Calculation of the acceptanceprobabilities is straightforward, as the kernels for true model p involve only Gaussiandensities, and the auxiliary kernels pa are mixtures of Gaussian densities.

Draw the parameters. The conditional posteriors for the parameters κP and ζ2 are con-jugate under the choice of normal prior and inverse gamma prior, respectively. Thus,the Gibbs sampler can be used for these parameters. The remaining parameters(µ, κQ, σ, α) have complete conditional posteriors that depend intractably both on the

14

time-series of default intensity states (ht)t=1,...,T and the observed data Y . We samplethese parameters via the MH normal random walk algorithm. For all parameters weassume flat uninformative priors. We restrict κP and α to be strictly positive.

Draw the time-change increments. By Bayes’ law and the conditional independenceof the increments,

p(χt|X(−χ), Y,Θ) ∝ p(ht|ht−1, χt,Θ) · p(χt|Θ)

It is straightforward to show that these kernels give a generalized inverse Gaussian(GIG) density from which we can sample directly in a Gibbs sampler step.

4 Empirical results

We estimate the CIR and time-changed CIR models on daily data for CDS spreads onten single-name reference entities for the period from the beginning of 2002 to the end of2009. The sample period spans the financial crisis of 2007–09. We retain the period fromthe beginning of 2010 to the end of 2011 for out-of-sample forecasting exercises in Section5. The sample of reference entities includes U.S. and European nonfinancial corporationsof varying credit quality: Alcoa, Anadarko, CenturyLink, Clear Channel, Ford, Lennar,Limited Brands, RadioShack, Sprint Nextel, and Tyson Foods. All have liquid trading inCDS. For each name, spreads on CDS of maturities 1, 2, 3, 5, 7 and 10 years are taken fromthe Markit database.11 In the upper panels of Figures 3 and 4 we plot time-series of 1, 5and 10 year CDS spreads for Alcoa and Ford, which are among the most heavily-tradednames in our sample.

Even for these liquid names, it is not unusual to find missing observations in CDSdata. In Table 1 we report the number of missing observations in our eight year sample byreference entity and maturity. As a practical matter, it would be infeasible to estimate themodel without a convenient and rigorous means to accommodate missing observations. InMCMC estimation, we can simply treat missing observations as latent variables and samplefrom their complete conditional posteriors. This approach avoids additional assumptionson the missing data points and does not interfere with the law of motion imposed by theobserved data points.12

Model-implied CDS spreads depend on the riskfree discount function Pt and on recoveryrates. We parameterize Pt using the Svensson yield curve, for which we rely on daily

11The time-series for Sprint Nextel Corp. begins on August 24, 2005. Data for the other nine firms spanthe entire sample period.

12See Tsay (2010, §12.6) on the handling of missing data via data augmentation.

15

parameter estimates provided by the Federal Reserve Board.13 Recovery rates are takenfrom the Markit database.

We estimate model specifications both with and without time-change in default intensityusing the algorithm of Section 3.2. Recall that the model without time-change is nested inthe time-change model as the limiting case α = ∞. After discarding the burn-in period,we simulate chains with 100,000 draws from the invariant distribution for each model. Theparameter estimates and the smoothed state estimates are calculated as means of drawsfrom the joint posterior distribution of parameters and states.

The parameter estimates of CDS pricing models are reported in Tables 2.a–2.j. Thetables report for each firm the mean, standard deviation, 5% quantile and 95% quantile ofthe posterior distribution for the state-space model in equations (6)–(8). All parametersare estimated with high precision except for the speed of mean reversion under the physicalmeasure. This exception is not unexpected. In maximum likelihood estimation of a CIRspecification for interest rates, Chapman and Pearson (2000) and Phillips and Yu (2009)show that κP has a large standard error in reasonable sample sizes. The speed of meanreversion under the risk-neutral measure (κQ) is for all firms negative even at the 95%quantiles. This finding is qualitatively consistent with Duffee (1999), Pan and Singleton(2008) and Jacobs and Li (2008), and underscores the necessity of a pricing methodologythat remains valid under non-stationarity.

The estimates of the parameter α provide strong evidence for stochastic time-changein default intensities. For two of the firms (Lennar and Sprint), the mean of the posteriordistribution of α is near 20. At such values, stochastic time-change has essentially no effecton the term-structure of credit spreads. Nonetheless, as illustrated in Figure 2, the kurtosisof daily changes in CDS spreads is nonetheless an order of magnitude larger than thekurtosis in the CIR model without time-change. For RadioShack, the mean and even the95% quantile of the posterior for α is below 2.0. At such values, time-change has a noticeableeffect on credit spreads. For the remaining seven firms in our sample, the estimated α liebetween these extremes.

Alcoa and Ford are typical in this regard, with estimated α between 7 and 8. In themiddle panels of Figures 3 and 4, we plot smoothed estimates of the calendar-time de-fault intensity (λt) for these two firms for the models with and without time-change. Thesmoothed time-series track quite closely across the two models, except when the estimatedCIR default intensity dips below 1bp. Intuitively, the close alignment is a consequenceof the small effect of time-change on the model-implied term-structure of spreads and thesufficiency of the observed term-structure to pin down the latent default intensity.

In the lower panels, we plot smoothed estimates of the time-change increments (χt)13Daily Svensson curves are publicly available for download at the Federal Reserve Board website. For

details on data sources and methodology, see Gürkaynak et al. (2007).

16

for the time-changed model. We scale by ∆ so that the increments have unit mean. Notsurprisingly, we find very large outliers during the financial crisis of 2007–09 with businesstime ticking 15 times faster than calendar time for the most active days for Alcoa and up toabout 25 times faster for Ford. Counter to our model assumption of Lévy time change, thesmoothed time-series suggest serial correlation in time-change increments. We leave this asan avenue for future research.

The moments of the estimated residuals in the measurement equations and state equa-tion for the default intensity provide metrics of in-sample fit for the CDS pricing modelswith and without time-change. Table 3 reports the posterior mean, standard deviation,skewness and kurtosis of the smoothed distribution of the stacked measurement equationresiduals, ε(y)

t,m. A correct specification implies that these residuals are standard normal withmean zero, standard deviation one, skewness zero and kurtosis three. Comparison of themoments suggests that stochastic time-change offers modest benefit in fitting CDS spreads.The most notable improvement is a reduction in skewness for nine of the ten referenceentities.

It is in the time-series dimension where we see a clear distinction between the modelswith and without time-change. Table 4 reports the moments of the smoothed distributionof the innovations (ε(λ)

t ) in the default intensity state equation. Across all firms in oursample, introducing stochastic time-change brings the mean, standard deviation, skewnessand kurtosis of the state equation innovations closer to the moments of the standard normaldistribution. In the case of Alcoa, for example, the mean falls from 0.05 for the CIR model to0.01 for the time-changed model, variance increases from 0.67 to 0.82, skewness decreasesfrom 0.34 to 0.03, and kurtosis falls from 4.27 to 3.22. Indeed, introducing time-changeessentially eliminates skewness and reduces kurtosis to below 3.25 for every firm in thesample.

We use the estimated parameters for Alcoa to illustrate how stochastic time-changealters the implications of the baseline CIR model. In Figure 5, we plot the term structureof CDS spreads on three dates, one during the credit boom (18 March 2005), one in theearly phase of the crisis (19 June 2008), and one in the late phase of the crisis (19 March2009).14 The five-year CDS spreads on these dates for Alcoa were 19bp, 93bp and 851bp,respectively. For each model and date, model parameters are fixed to the mean of theposterior distribution for the model, as reported in Table 2.a. The business time defaultintensity is taken from the estimated smoothed distribution of ht for the date and model.Despite the two-orders-of-magnitude difference across dates in the initial condition (ht), thetwo models produce very similar term structures, which suggests that the two models will

14Dates are chosen to be representative of the range of observed spreads over the sample period. Each ofthese dates immediately precedes a CDS settlement date, so the nominal maturity of a CDS on these datesequals the exact actual maturity.

17

yield similar fit to the cross-section of observed spreads on each sample date.The fitted models differ, however, in the forecast distribution for spreads at a future

horizon. We again fix parameters for each model to the mean of the posterior distributionfor Alcoa. For simplicity, we fix a riskless rate of 3% and recovery of 40%. For the CIRmodel, we simulate the quantiles of the forecast distribution of CDS(λδ) for δ = 1/250 (i.e.,a horizon of one trading day) given initial condition λ0. Similarly, for the time-changedmodel we simulate the quantiles of the forecast distribution of CDS(λ(Tδ);α). We do thisfor two values of λ0, one below the long-run mean (5 bp) and one above the long-run mean(50 bp), and report results in Table 5. Consistent with the findings on kurtosis of spreadchanges (Figure 2), we find that the time-change model assigns higher probability to verylarge changes in the spread over the short horizon.

We return briefly to consider whether empirical results are consistent with two orthog-onality assumptions imposed in Section 2:

• There is no leverage effect, i.e., the innovation ε(λ)t in the default intensity state

equation is uncorrelated with time-change increments χt.

• The default intensity process is independent of the riskfree short rate process, i.e., ε(λ)t

is uncorrelated with the change in the federal funds rate from t− 1 to t.

To test these assumptions, we report in Table 6 the posterior means and standard deviationsof the respective time-series correlations. We find the correlation between ε(λ)

t and changesin the short rate is economically small and statistically insignificant for all ten names in oursample. The correlation between ε(λ)

t and χt is positive for all ten names and in some casesstatistically significant, but generally of second-order economic magnitude. For all firmsexcept Sprint, the posterior mean correlation is under 3%.

5 Out-of-sample performance

In this section, we assess out-of-sample model performance. For each observation datein the out-of-sample period from January 2010 through December 2011, we calculate theday-ahead forecast density implied by the models with and without time-change in defaultintensity. For each model and each date, the forecast density is evaluated at the realizednext-day observation. The test statistic, due to Diebold and Mariano (1995) and Amisanoand Giacomini (2007), allows for a two-sided test of the null hypothesis that the two modelsperform equally well.

Construction of forecast densities presents new challenges. The MCMC algorithm ofSection 3.2 delivers a smoothed estimator of the default intensity, whereas for this applica-tion online filtering is required. As in our MCMC estimator, the filtering algorithm mustaccommodate nonlinearity in the measurement equation and non-Gaussian state evolution.

18

Particle filtering is well-suited to such challenges, but the most tractable filters, such asthe SIR filter of Gordon et al. (1993), are ill-suited to our setting. Since the latent time-change process has serially independent increments, propagation of state variables from t tot+ 1 using only time-t information leads to particle degeneration, because the filter cannot“anticipate” abrupt changes in CDS spreads.15 Contrasted to such non-adapted filters arefully-adapted filters, in which propagation is conditioned on the observation at t+ 1. Herethe challenge is tractability. To navigate between these obstacles, we develop a partially-adapted particle filter using an auxiliary linearized model in the spirit of our approach inSection 3.2. To reduce the dimensionality of the problem, we fix all model parameters tothe in-sample posterior means estimated in Section 4.

5.1 Partially-adapted particle filter on a linearized model

The particle filter is a recursive algorithm that constructs a discrete approximation to thefiltered distribution of the state variables at each observation date:

x(i)t

iid∼ p(xt|Y t) for all t = T + 1, . . . , T ∗

where Y t = (y1, . . . , yt) are observations available up to time t, xt is the state vector, I isthe number of particles i = 1, . . . , I, and T ∗ − T is the number of trading days in the out-of-sample period. To reduce notation, explicit dependence on model parameters is omittedthroughout this section.

In the auxiliary particle filter (APF) of Pitt and Shephard (1999), the sample of par-ticles at observation time t is constructed as the marginal distribution for xt in the jointdistribution for (xt, xt−1) given Y t, for which the kernel can be decomposed as

p(xt, xt−1|Y t) ∝ p(xt|xt−1, yt) · p(yt|xt−1) · p(xt−1|Y t−1). (11)

Fully-adapted sampling from this distribution is conducted in two steps. First, the particles{x(i)

t−1}i=1,...,I that serve as a discrete approximation to p(xt−1|Y t−1) are resampled withweights proportional to p(yt|x(i)

t−1). This step requires pointwise evaluation of the requireddensity and induces consistency of the “old” particles x(i)

t−1 with the “new” observationyt. Second, the resampled time-t − 1 particles are propagated to time t by sampling fromp(xt|xt−1, yt). As is often the case in application, in our model setting we cannot easilyevaluate p(yt|xt−1) for the resampling step, nor can we sample directly from p(xt|xt−1, yt).

15Particle degeneration arises when only a small number of particles have significant weight in the resam-pling stage. See Lopes and Tsay (2011) for a review of particle filter methods with discussion of particledegeneration.

19

We substitute xt = (ht, χt) in equation (11) and re-write the first two kernels as

p(ht, χt|ht−1, χt−1, yt) · p(yt|ht−1, χt−1)

= p(ht|χt, ht−1, χt−1, yt) · p(yt|χt, ht−1, χt−1) · p(χt|ht−1, χt−1, Yt−1).

In our model, the χt are iid, so p(χt|ht−1, χt−1, Yt−1) = p(χt). Conditional on ht−1, the CDS

spread at t does not depend on χt−1, so p(yt|χt, ht−1, χt−1) = p(yt|χt, ht−1). Similarly, con-ditional on ht−1, ht does not depend on χt−1, so p(ht|χt, ht−1, χt−1, yt) = p(ht|χt, ht−1, yt).Combining these terms, we re-write equation (11) as

p(ht, χt, ht−1, χt−1|Y t) ∝ p(ht|χt, ht−1, yt) · p(yt|χt, ht−1)p(χt) · p(ht−1, χt−1|Y t−1) (12)

As the propagation kernel p(ht|χt, ht−1, yt) is intractable, we introduce an auxiliarylinear model.16 Begin by writing

p(ht|χt, ht−1, yt) · p(yt|χt, ht−1) = p(yt|ht, χt, ht−1) · p(ht|χt, ht−1) (13)

In our setting, this is simplified by noting that

p(yt|ht, χt, ht−1) = p(yt|ht) =∏m∈M

p(yt,m|ht).

In the auxiliary model, the measurement equation is linearized as

yt,m = γt,m + βt,mht + ζ ε(y)t,m (6′)

so that pa(yt,m|ht) ∼ N(γt,m+βt,mht, ζ2). The choice of constants γt,m and βt,m determines

how well the auxiliary linear model approximates the true nonlinear model. The auxiliarymodel state equations are assumed to be the same as those for the true model in (7) and (8).In the appendix, we derive a proposal propagation density p(ht|χt, ht−1, yt) and auxiliaryevaluation weights pa(yt|χt, ht−1) such that equation (13) can be replaced by

p(ht|χt, ht−1, yt) · pa(yt|χt, ht−1) = pa(yt|ht) · p(ht|χt, ht−1) (13′)

The proposal propagation density is Gaussian (so sampling is trivial), and the auxiliaryevaluation weights are similarly convenient to compute.

We arrive at the following algorithm for sampling {ht, χt} conditional on particles{h(i)

t−1}i=1,...,I and on Y t:16The possibility of constructing a proposal distribution on a linearization of the true model was suggested

by Pitt and Shephard (1999, §3.3).

20

1. Draw particles {χ(i)t }i=1,...,I blindly from the unconditional IG distribution for χt.

2. Draw particles {h(i)t−1, χ

(i)t }i=1,...,I by resampling {h(i)

t−1, χ(i)t }i=1,...,I with weights pro-

portional to pa(yt|χ(i)t , h

(i)t−1).

3. Propagate {h(i)t }i=1,...,I by sampling from p(ht|χ(i)

t , h(i)t−1, yt).

4. To restore consistency with the law of the true model, generate {h(i)t , h

(i)t−1, χ

(i)t }i=1,...,I

by resampling from {h(i)t , h

(i)t−1, χ

(i)t }i=1,...,I with weights proportional to17

ω(i)t =

p(h(i)t |χ

(i)t , h

(i)t−1, yt) · p(yt|χ

(i)t , h

(i)t−1)

p(h(i)t |χ

(i)t , h

(i)t−1, yt) · pa(yt|χ

(i)t , h

(i)t−1)

= p(yt|h(i)t )

p(yt|h(i)t )

(14)

5. Discard the sample of χt, and regenerate by sampling χ(i)t from p(χt|h(i)

t , h(i)t−1). As

noted in Section 3.2, the conditional distributed is GIG.

This algorithm is fully-adapted for ht with respect to the auxiliary model. Even thoughthe auxiliary model diverges from the true model, with reasonable choices of γt,m and βt,mwe can still expect that the realization of yt will inform the propagation of ht−1 to ht. Weimpose

βt,m = F ′t,m(ht)

γt,m = Ft,m(ht)− βt,mht

where ht is given by the inverse of model-implied 5-year CDS spread observed at time t:ht = F−1

t,5 (yt,5).Sampling of χt in Step 1 is non-adapted, but the regenerated sample in Step 5 is fully

adapted, i.e., it utilizes all the information content of yt under the true model. The regen-eration step is permissible only because χt does not enter the observation equation (i.e.,because yt is independent of χt when conditioned on ht). It is straightforward to showthat the regenerated {χt}i=1,...,I has the same conditional distribution as the sample afterStep 4, but regeneration improves the efficiency of sampling. We include the final step tofacilitate examination of the time-series of filtered states in Section 5.2.

Finally, it is trivial to modify the algorithm for the baseline model without time-change,because the pricing model is a special case of the model with time-change. We can simplyfix χ(i)

t = ∆ in Step 1 and drop Step 5.17We denote the resampled ht−1 as {ht−1}i=1,...,I in order to avoid confusion with the original cloud of

particles {ht−1}i=1,...,I , which are conditioned only on Y t−1.

21

5.2 Density forecast analysis

In this section we present the out-of-sample model selection analysis based on the Dieboldand Mariano (1995) and Amisano and Giacomini (2007) test statistic. We begin with abrief description of the test. The average predictive likelihood for the out-of-sample pointst = T + 1, . . . , T ∗ is given by:

LM = 1T ∗ − T

T ∗∑t=T +1

log(p(yt|Y t−1,ΘM)

)

where M ∈ {TC,CIR} denotes a model specification with “CIR” for the baseline modelwithout time-change and “TC” for the model with time-change, p denotes the predictivelikelihood evaluated at the future realized CDS log-spread at time t, given model parametersΘM, which are fixed at the in-sample MCMC estimates. The predictive likelihood can becomputed from the filtering distribution as follows:

p(yt|Y t−1,ΘM) =∫p(yt|xt,ΘM) p(xt|Y t−1,ΘM) dxt

p(xt|Y t−1,ΘM) =∫p(xt|xt−1,ΘM) p(xt−1|Y t−1,ΘM) dxt−1

The particle filter provides a sample {x(i)t−1}i=1,...,I from the filtering distribution p(xt−1|Y t−1,ΘM).

We propagate this sample forward via the state evolution equations by drawing from thetransition density p(xt|x(i)

t−1,ΘM) to obtain a sample {x(i)t }i=1,...,I from the distribution

p(xt|Y t−1,ΘM). The predictive likelihood for observation t is approximated as the MonteCarlo integral

LM,t = log(

1I

I∑i=1

p(yt|x(i)t ,ΘM)

).

Finally, we average across time to obtain

LM = 1T ∗ − T

T ∗∑t=T +1

LM,t.

Intuitively, a model that performs better out-of-sample has a higher value of averagepredictive likelihood. Under the null hypothesis of equal performance the Amisano andGiacomini (2007) statistic

U =√T ∗ − T (LTC − LCIR)√

1T ∗−T −1

∑T ∗t=T +1

(LTC,t − LCIR,t

)2−(LTC − LCIR

)2(15)

22

has an asymptotic standard normal distribution.The partially-adapted APF of Section 5.1 is implemented for each model and each firm

with I = 100,000 particles. We confirm that our particle filter avoids significant particledegeneration. In the case of the time-changed model for Ford, for example, the range ofthe efficient sample size (as a fraction of the number of particles) for the partially-adaptedAPF is (68.3%, 99.8%) with the 1%-quantile of 80.6%. We have implemented the standardSIR filter as well, and in this case find a range of (1.7%, 88.4%) along with the 1%-quantileof 15.6%.

The test statistic U is estimated for each of the ten names in our sample and reportedin Table 7. For nine of the names (all firms but Lennar), the coefficient is positive andsignificant at the 1% level. In Figures 6 and 7, we contrast the typical case of Ford withthe exceptional case of Lennar. We plot CDS spreads (top graph), the filtered estimatesof default intensity (middle graph), and the filtered estimates of time-change increments(bottom graph) for the out-of-sample period covering 2010 and 2011.18 The filtered time-series of χt for Ford exhibits significant volatility with multiple spikes, whereas for Lennarthis time-series is roughly constant with only two minor instances of higher activity. Ineach case, the spikes in χt coincide with relatively large movements in the one-year CDSspread and the filtered default intensity. Thus, adding stochastic time-change to the baselineCIR model improves the forecast performance when the CDS spread exhibits volatility involatility, but may lead to overfit when volatility is relatively constant.

6 Conclusion

Stochastic volatility has been studied extensively in markets for equities, interest ratesand commodities, but there has been relatively little empirical work to date on stochasticvolatility in credit markets. We appear to be the first to estimate a model in which stochasticvolatility in default intensity is induced by stochastic time change. Nonlinearity in theCDS pricing function and state-dependent variance in the evolution of the default intensitypresent significant challenges for estimation. We overcome these difficulties using a newBayesian MCMC estimation method that builds on earlier work by Stroud, Müller, andPolson (2003). For out-of-sample density forecasts, we also develop an auxiliary particlefilter. Although computationally intensive, both methodologies performed well on our data,and may have application to related problems in modeling interest rate or volatility swaps.

We estimate our model on firm-level time-series of single-name credit default swapspreads. In-sample, stochastic time-change is found to be a statistically and economicallysignificant feature of the data for all firms in our sample. In out-of-sample density forecasts

18To be precise, the plotted value at date t for the filtered estimate is the mean of the sample of{x(i)

t }i=1,...,I from the partially-adapted APF.

23

tests, we find that the time-changed model outperforms the baseline CIR model at highlevels of significance for nine of our ten firms.

Smoothed estimates of the increments of stochastic time show large outliers during thefinancial crisis of 2007–09 with business time ticking substantially faster than calendar timefor the most volatile days in our sample. Intuitively, large jumps in the business clockallow the model to accommodate large daily changes in the default intensity (i.e., relativeto volatility). While the time-changed model offers only modest improvement in fittingthe term-structure of observable spreads on any given date, it greatly improves the fit inthe time-series dimension. The findings have important implications for risk-managementapplications, because the time-changed model allows for much higher kurtosis in the lossdistribution over short horizons, and so higher quantiles in the tail (i.e., higher value-at-risk). For similar reasons, the findings also suggest that the widely-used CIR model willtend to underprice deep out-of-the-money options on CDS.

While the model imposes a variety of independence restrictions, only one appears to bematerially at odds with the data. Counter to our model assumption of Lévy time change, thesmoothed time-series of state variables exhibit serial correlation in time-change increments.The pricing methodology can easily be extended to allow for such volatility clustering,but our MCMC and particle filter methodologies would be quite challenging without thetractability afforded by serial independence. We leave this as an avenue for future research.

Appendix: Proposal and resampling weights in linearized model

Application of the proposed adapted filter requires sampling from p(ht|χt, ht−1, yt) andevaluation of the weights pa(yt|χt, ht−1). We begin with the kernels on the right-hand-sideof equation (13′). Since pa(yt,m|ht) ∼ N(γt,m + βt,mht, ζ

2) and the measurement errors areindependent across maturity, we have

pa(yt|ht) =∏m∈M

φ (yt,m; γt,m + βt,mht, ζ)

=(

1√2πζ

)#M ∏m∈M

exp(−1

2

(yt,m − γt,m − βt,mht

ζ

)2)

=(

1√2πζ

)#M ∏m∈M

exp

−12

(ht − νt,mξt,m

)2 =

∏m∈M

1βt,m

φ (ht; νt,m, ξt,m)

where νt,m ≡ (yt,m − γt,m)/βt,m and ξt,m ≡ ζ/βt,m. From the state equation (8), we havep(ht|χt, ht−1) = φ (ht; νt,∗, ξt,∗) where νt,∗ ≡ ht−1 + (µ − κPht−1)χt and ξ2

t,∗ ≡ σ2 ht−1χt,where the asterisk indicates parameters of the state equation. For notational conveniencein combining these kernels, we define the set N = M ∪ {∗}.

24

The right-hand-side of equation (13′) is

pa(yt|ht)p(ht|χt, ht−1) =

∏m∈M

1βt,m

φ (ht; νt,m, ξt,m)

φ (ht; νt,∗, ξt,∗)

=(

1√2πζ

)#M 1√2πξt,∗

exp

−12∑m∈N

(ht − νt,mξt,m

)2 (16)

For m ∈ N, let ξ2t,−m ≡

∏j∈N\{m} ξ

2t,j , and let ηt,m be the weight ηt,m = ξ2

t,−m/∑j∈N ξ

2t,−j .

Let ξ2t be the harmonic sum of the variances, i.e.,

ξ2t ≡

∏m∈N ξ

2t,m∑

m∈N ξ2t,−m

=

∑m∈N

1ξ2t,m

−1

It is straightforward to sum the fractions within the exp() term of equation (16) as

∑m∈N

(ht − νt,mξt,m

)2

=(ht −

∑m∈N ηt,mνt,m

ξt

)2

+ 1ξ

2t

∑m∈N

ηt,mν2t,m −

∑m∈N

ηt,mνt,m

2

Observe that ht appears only in the first term, which we embed in the proposal density as

p(ht|χt, ht−1, yt) = φ

ht; ∑m∈N

ηt,mνt,m, ξt

Intuitively, the proposal is the posterior distribution for an unobserved state given a set ofindependent Gaussian signals with realization {νt,m}m∈N. The mth signal receives weightηt,m, which is proportional to its precision 1/ξ2

t,m. The precision of the posterior, 1/ξ2t , is

the sum of signal precisions.To enforce identity (13′), remaining terms in equation (16) are captured in the resam-

pling weight

pa(yt|χt, ht−1) =(

1√2πζ

)#Mξ

ξt,∗exp

(−1

2

∑m∈N ηt,mν

2t,m − (

∑m∈N ηt,mνt,m)2

ξ2t

)

The resampling weight is low when the variation across signals νt,m is large. Thus, a particlecarrying state (χt, ht−1) is penalized in the resampling step when it implies large differencesacross the #M + 1 equations in the likelihood of yt.

25

References

Carol Alexander and Andreas Kaeck. Capital structure and asset prices: Some effects ofbankruptcy procedures. Journal of Banking and Finance, 32(6):1008–1021, June 2008.

Gianni Amisano and Raffaella Giacomini. Comparing density forecasts via weighted likeli-hood ratio tests. Journal of Business and Economics Statistics, 25(2):177–199, 2007.

Torben G. Andersen, Luca Benzoni, and Jesper Lund. An empirical investigation ofcontinuous-time equity return models. Journal of Finance, 57(3):1239–1284, 2002.

Ole E. Barndorff-Nielsen and Albert Shiryaev. Change of Time and Change of Measure,volume 13 of Advanced Series on Statistical Science and Applied Probability. World Sci-entific, 2010.

Bradley P. Carlin, Nicholas G. Polson, and David S. Stoffer. A Monte Carlo approachto nonnormal and nonlinear state-space modeling. Journal of the American StatisticalAssociation, 87(418):493–500, 1992.

C.K. Carter and R. Kohn. On Gibbs sampling for state space models. Biometrika, 81(3):541–553, 1994.

David A. Chapman and Neil D. Pearson. Is the short rate drift actually nonlinear? Journalof Finance, 55(1):355–388, February 2000.

Siddhartha Chib and Edward Greenberg. Markov chain Monte Carlo simulation methodsin econometrics. Econometric Theory, 12(3):409–431, 1996.

Ovidiu Costin, Michael B. Gordy, Min Huang, and Pawel J. Szerszen. Expectations of func-tions of stochastic time with application to credit risk modeling. Mathematical Finance,forthcoming.

Francis X. Diebold and Roberto S. Mariano. Comparing predictive accuracy. Journal ofBusiness and Economics Statistics, 13(3):252–263, 1995.

Joost Driessen. Is default event risk priced in corporate bonds? Review of Financial Studies,18:165–195, 2005.

Gregory R. Duffee. Estimating the price of default risk. Review of Financial Studies, 12(1):197–226, Spring 1999.

Darrell Duffie and Kenneth J. Singleton. Modeling term structures of defaultable bonds.Review of Financial Studies, 12(4):687–720, 1999.

26

Darrell Duffie and Kenneth J. Singleton. Credit Risk: Pricing, Measurement, and Manage-ment. Princeton University Press, Princeton, 2003.

Bjørn Eraker. Do stock prices and volatility jump? Reconciling evidence from spot andoption prices. Journal of Finance, 59(3):1367–1403, 2004.

Sylvia Frühwirth-Schnatter. Data augmentation and dynamic linear models. Journal ofTime Series Analysis, 15(2):183–202, 1994.

D. Geman and S. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesianrestoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence,6:721–741, 1984.

N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-GaussianBayesian state estimation. IEE Proceedings F on Radar and Signal Process, 140(2):107–113, April 1993.

Michael B. Gordy and Søren Willemann. Constant proportion debt obligations: A post-mortem analysis of rating models. Management Science, 58(3):476–492, March 2012.

Refet S. Gürkaynak, Brian Sack, and Jonathan H. Wright. The U.S. Treasury yield curve:1961 to the present. Journal of Monetary Economics, 54(8):2291–2304, 2007.

Kris Jacobs and Xiaofei Li. Modeling the dynamics of credit spreads with stochastic volatil-ity. Management Science, 54(6):1176–1188, June 2008.

Eric Jacquier, Nicholas G. Polson, and Peter E. Rossi. Bayesian analysis of stochasticvolatility models with fat-tails and correlated errors. Journal of Econometrics, 122(1):185–212, September 2004.

Robert Jarrow and Stuart Turnbull. Pricing derivatives on financial securities subject tocredit risk. Journal of Finance, 50(1):53–86, March 1995.

Robert A. Jarrow, David Lando, and Fan Yu. Default risk and diversification: Theory andempirical implications. Mathematical Finance, 15(1):1–26, January 2005.

Michael S. Johannes, Nicholas G. Polson, and Jonathan R. Stroud. Optimal filtering ofjump diffusions: Extracting latent states from asset prices. Review of Financial Studies,22(7):2759–2799, 2009.

Christopher S. Jones. The dynamics of stochastic volatility: evidence from underlying andoptions market. Journal of Econometrics, 116(1–2):181–224, 2003.

27

Matthew Leeming, Søren Willemann, Arup Ghosh, and Rob Hagemans. Standard corpo-rate CDS handbook: Ongoing evolution of the CDS market. Technical report, BarclaysCapital, February 2010.

Hedibert F. Lopes and Ruey S. Tsay. Particle filters and Bayesian inference in financialeconometrics. Journal of Forecasting, 30:168–209, 2011.

Rafael Mendoza-Arriaga and Vadim Linetsky. Time-changed CIR default intensities withtwo-sided mean-reverting jumps. Annals of Applied Probability, 24(2):811–856, 2014.

N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller. Equationof state calculations by fast computing machines. Journal of Chemical Physics, 21(6):1087–1092, 1953.

Jun Pan and Kenneth J. Singleton. Default and recovery implicit in the term structure ofsovereign CDS spreads. Journal of Finance, 63(5), October 2008.

Peter C.B. Phillips and Jun Yu. Simulation-based estimation of contingent-claims prices.Review of Financial Studies, 22(9):3669–3705, September 2009.

Michael K. Pitt and Neil Shephard. Analytic convergence rates and parameterization issuesfor the Gibbs sampler applied to state space models. Journal of Time Series Analysis,20(1):63–85, January 1999.

Jonathan R. Stroud, Peter Müller, and Nicholas G. Polson. Nonlinear state-space modelswith state-dependent variances. Journal of the American Statistical Association, 98(462):377–386, 2003.

Ruey S. Tsay. Analysis of Financial Time Series. John Wiley & Sons, Hoboken, NJ, thirdedition, 2010.

28

Table 1: Counts of missing observations in CDS data.The table reports the number of missing observations in the estimation sample from the beginningof 2002 to the end of 2009 (with the time series for Sprint Nextel starting on August 24, 2005).

Maturity1y 2y 3y 5y 7y 10y

Alcoa 0 0 0 0 0 8Anadarko 6 32 0 0 0 8CenturyLink 61 115 12 3 11 107Clear Channel 0 1 0 0 0 0Ford 0 0 0 0 0 0Lennar 120 301 71 2 218 233Limited Brands 18 42 10 3 0 19RadioShack 16 17 4 0 1 4Sprint Nextel 0 0 0 0 0 0Tyson Foods 39 147 22 30 25 208

29

Table 2.a: Parameter estimates of CDS pricing models: Alcoa.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileAlcoa: CIR ModelκP 0.4794 0.3698 0.0366 1.1952σ 0.1877 0.0030 0.1826 0.1925µ ∗ 100 0.0829 0.0009 0.0814 0.0844κQ −0.2526 0.0068 −0.2634 −0.2413ζ 0.1627 0.0011 0.1609 0.1645Alcoa: CIR + IG Time Change ModelκP 0.6590 0.5188 0.0483 1.6713σ 0.2238 0.0019 0.2207 0.2269µ ∗ 100 0.0688 0.0009 0.0673 0.0702κQ −0.3787 0.0068 −0.3903 −0.3679ζ 0.1583 0.0010 0.1566 0.1600α 7.1439 0.4384 6.4713 7.9060

Table 2.b: Parameter estimates of CDS pricing models: Anadarko.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileAnadarko: CIR ModelκP 0.6569 0.3852 0.0915 1.3452σ 0.0975 0.0042 0.0910 0.1051µ ∗ 100 0.1643 0.0012 0.1623 0.1662κQ −0.0318 0.0035 −0.0382 −0.0265ζ 0.1627 0.0011 0.1610 0.1645Anadarko: CIR + IG Time Change ModelκP 1.7103 1.2431 0.1488 4.0767σ 0.2596 0.0025 0.2555 0.2636µ ∗ 100 0.1096 0.0018 0.1067 0.1128κQ −0.3398 0.0094 −0.3547 −0.3223ζ 0.1535 0.0010 0.1519 0.1552α 5.5933 0.3528 5.0505 6.2100

30

Table 2.c: Parameter estimates of CDS pricing models: CenturyLink.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileCenturyLink: CIR ModelκP 1.1616 0.7956 0.1122 2.6471σ 0.2531 0.0036 0.2471 0.2589µ ∗ 100 0.2095 0.0031 0.2044 0.2145κQ −0.3253 0.0104 −0.3423 −0.3078ζ 0.2002 0.0013 0.1980 0.2024CenturyLink: CIR + IG Time Change ModelκP 1.8575 1.3345 0.1640 4.3793σ 0.2922 0.0023 0.2884 0.2960µ ∗ 100 0.1558 0.0034 0.1505 0.1618κQ −0.5531 0.0132 −0.5730 −0.5302ζ 0.1944 0.0013 0.1923 0.1965α 5.9933 0.4425 5.3364 6.7968

Table 2.d: Parameter estimates of CDS pricing models: Clear Channel.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileClear Channel: CIR ModelκP 0.2310 0.1806 0.0171 0.5820σ 0.2332 0.0040 0.2266 0.2398µ ∗ 100 0.2594 0.0064 0.2493 0.2702κQ −0.3094 0.0115 −0.3275 −0.2904ζ 0.2689 0.0017 0.2661 0.2719Clear Channel: CIR + IG Time Change ModelκP 0.1759 0.1373 0.0133 0.4422σ 0.2662 0.0038 0.2601 0.2725µ ∗ 100 0.1983 0.0056 0.1889 0.2081κQ −0.5013 0.0172 −0.5301 −0.4738ζ 0.2640 0.0017 0.2613 0.2669α 3.8124 0.4316 3.1516 4.5369

31

Table 2.e: Parameter estimates of CDS pricing models: Ford.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileFord: CIR ModelκP 0.3466 0.2579 0.0283 0.8446σ 0.3207 0.0028 0.3162 0.3254µ ∗ 100 0.4686 0.0094 0.4529 0.4834κQ −0.4997 0.0086 −0.5136 −0.4859ζ 0.1824 0.0012 0.1805 0.1844Ford: CIR + IG Time Change ModelκP 0.3299 0.2508 0.0263 0.8129σ 0.3390 0.0028 0.3345 0.3436µ ∗ 100 0.4147 0.0112 0.3958 0.4318κQ −0.6419 0.0127 −0.6626 −0.6213ζ 0.1801 0.0012 0.1782 0.1821α 7.5902 0.6288 6.6347 8.6549

Table 2.f: Parameter estimates of CDS pricing models: Lennar.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileLennar: CIR ModelκP 0.4378 0.3252 0.0356 1.0646σ 0.2576 0.0057 0.2479 0.2670µ ∗ 100 0.2983 0.0037 0.2924 0.3047κQ −0.1736 0.0114 −0.1927 −0.1545ζ 0.1802 0.0013 0.1782 0.1823Lennar: CIR + IG Time Change ModelκP 0.6873 0.5193 0.0537 1.6897σ 0.3197 0.0037 0.3137 0.3258µ ∗ 100 0.2593 0.0027 0.2549 0.2639κQ −0.3330 0.0103 −0.3500 −0.3158ζ 0.1771 0.0012 0.1751 0.1791α 18.1031 1.4843 15.8002 20.6781

32

Table 2.g: Parameter estimates of CDS pricing models: Limited Brands.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileLimited Brands: CIR ModelκP 0.3738 0.2655 0.0343 0.8747σ 0.1334 0.0068 0.1218 0.1442µ ∗ 100 0.2292 0.0041 0.2227 0.2361κQ −0.0922 0.0088 −0.1063 −0.0771ζ 0.2399 0.0016 0.2374 0.2425Limited Brands: CIR + IG Time Change ModelκP 0.9273 0.7077 0.0722 2.2911σ 0.2531 0.0034 0.2477 0.2587µ ∗ 100 0.1518 0.0037 0.1453 0.1576κQ −0.3907 0.0148 −0.4178 −0.3680ζ 0.2334 0.0015 0.2309 0.2358α 3.8126 0.3055 3.3556 4.3503

Table 2.h: Parameter estimates of CDS pricing models: RadioShack.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileRadioShack: CIR ModelκP 0.6419 0.4726 0.0538 1.5468σ 0.1907 0.0025 0.1866 0.1947µ ∗ 100 0.0681 0.0022 0.0644 0.0717κQ −0.3902 0.0082 −0.4036 −0.3764ζ 0.2660 0.0018 0.2632 0.2689RadioShack: CIR + IG Time Change ModelκP 1.3374 1.0143 0.1068 3.2928σ 0.1968 0.0025 0.1928 0.2008µ ∗ 100 0.0388 0.0018 0.0358 0.0418κQ −0.6591 0.0167 −0.6856 −0.6314ζ 0.2561 0.0016 0.2534 0.2588α 1.7946 0.0901 1.6592 1.9488

33

Table 2.i: Parameter estimates of CDS pricing models: Sprint.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileSprint: CIR ModelκP 0.5126 0.4062 0.0367 1.3034σ 0.2404 0.0032 0.2351 0.2456µ ∗ 100 0.1407 0.0018 0.1378 0.1437κQ −0.3992 0.0101 −0.4154 −0.3820ζ 0.1836 0.0017 0.1809 0.1864Sprint: CIR + IG Time Change ModelκP 0.5894 0.4706 0.0422 1.5102σ 0.2543 0.0028 0.2498 0.2591µ ∗ 100 0.1320 0.0018 0.1291 0.1348κQ −0.4586 0.0106 −0.4769 −0.4424ζ 0.1822 0.0016 0.1796 0.1848α 21.3563 2.2668 17.8964 25.3523

Table 2.j: Parameter estimates of CDS pricing models: Tyson Foods.The table reports the mean, standard deviation, 5%- and 95%-quantiles of the MCMC posteriordistribution for the CDS pricing model in equations (6)-(8). The results are reported for modelspecifications without time-change (CIR model) and with time-change in default intensity (CIR +IG Time Change Model).

Mean Std Dev 5% Quantile 95% QuantileTyson Foods: CIR ModelκP 0.7200 0.5119 0.0630 1.6877σ 0.2469 0.0034 0.2412 0.2524µ ∗ 100 0.2753 0.0027 0.2708 0.2798κQ −0.2400 0.0075 −0.2524 −0.2279ζ 0.1683 0.0011 0.1665 0.1702Tyson Foods: CIR + IG Time Change ModelκP 1.1482 0.8454 0.0946 2.7624σ 0.2916 0.0027 0.2870 0.2960µ ∗ 100 0.2384 0.0026 0.2339 0.2427κQ −0.3805 0.0091 −0.3949 −0.3651ζ 0.1647 0.0011 0.1630 0.1665α 12.1615 0.8741 10.8246 13.6870

34

Table 3: Moments of measurement equation residuals.The table reports the posterior mean of mean, std. deviation, skewness and kurtosis of measurementequation residuals, ε(y), calculated at each iteration of MCMC algorithm for model specificationswithout time-change (CIR model) and with time-change in default intensity (CIR + IG Time ChangeModel). A correct specification implies that the residuals are standard normal with mean zero,standard deviation one, skewness zero and kurtosis three.

Mean Std Dev Skewness KurtosisAlcoaCIR Model −0.0157 0.9998 −0.1532 4.3142CIR-IG TC Model −0.0051 1.0000 0.0274 4.4102AnadarkoCIR Model −0.0048 1.0000 −0.4178 3.4865CIR-IG TC Model −0.0044 0.9999 −0.1429 3.4420CenturyLinkCIR Model −0.0169 0.9998 −0.6530 4.0214CIR-IG TC Model −0.0041 0.9999 −0.4849 4.0898Clear ChannelCIR Model −0.0112 0.9999 −0.2315 3.7204CIR-IG TC Model −0.0049 1.0000 −0.0709 3.6722FordCIR Model −0.0125 0.9999 −0.4358 4.3982CIR-IG TC Model −0.0048 0.9999 −0.3012 4.4348LennarCIR Model −0.0097 0.9999 −0.3456 3.9502CIR-IG TC Model −0.0040 0.9999 −0.1962 3.8164Limited BrandsCIR Model −0.0068 0.9999 −0.6464 3.5192CIR-IG TC Model −0.0028 1.0000 −0.4513 3.2429RadioShackCIR Model −0.0208 0.9998 −0.5183 3.5148CIR-IG TC Model −0.0022 1.0000 −0.2493 3.3394SprintCIR Model −0.0122 0.9999 −0.0525 3.2900CIR-IG TC Model −0.0047 1.0000 −0.0032 3.3974Tyson FoodsCIR Model −0.0134 0.9999 0.0039 3.6866CIR-IG TC Model −0.0042 1.0000 0.1442 3.7100

35

Table 4: Moments of state equation innovations.The table reports the posterior mean of mean, std. deviation, skewness and kurtosis of state equationresiduals, ε(λ), calculated at each iteration of MCMC algorithm for model specifications withouttime-change (CIR model) and with time-change in default intensity (CIR + IG Time Change Model).A correct specification implies that the residuals are standard normal with mean zero, standarddeviation one, skewness zero and kurtosis three.

Mean Std Dev Skewness KurtosisAlcoaCIR Model 0.0463 0.6674 0.3405 4.2667CIR-IG TC Model 0.0108 0.8230 0.0316 3.2158AnadarkoCIR Model 0.0159 0.9196 0.1544 3.2541CIR-IG TC Model −0.0018 0.8354 0.0232 3.1892CenturyLinkCIR Model 0.0492 0.7081 0.2956 3.3331CIR-IG TC Model −0.0024 0.8671 0.0242 3.1424Clear ChannelCIR Model 0.0420 0.8925 0.1341 3.4407CIR-IG TC Model 0.0015 0.9581 0.0177 3.0508FordCIR Model 0.0306 0.8265 0.1025 3.4196CIR-IG TC Model 0.0001 0.9351 0.0156 3.0696LennarCIR Model 0.0436 0.7893 0.2021 3.3338CIR-IG TC Model 0.0127 0.8083 0.0392 3.2169Limited BrandsCIR Model 0.0308 0.9215 0.1456 3.1110CIR-IG TC Model −0.0023 0.9208 0.0173 3.0802RadioShackCIR Model 0.0359 0.7439 0.1330 3.1776CIR-IG TC Model −0.0040 0.9290 0.0065 3.0734SprintCIR Model 0.0728 0.7659 0.4195 4.4997CIR-IG TC Model 0.0398 0.8156 0.0451 3.2393Tyson FoodsCIR Model 0.0406 0.7290 0.2760 3.4170CIR-IG TC Model 0.0003 0.8201 0.0351 3.2036

36

Table 5: Forecast distribution for 1-day horizon.For the CIR model, we report the quantiles of the forecast distribution of CDS(λδ) for δ = 1/250(i.e., a horizon of one trading day) given initial condition λ0 and the posterior mean parametersestimated for Alcoa. For the time-changed model we report the quantiles of the forecast distributionof CDS(λ(Tδ);α). Two values of λ0 are considered, one below the long-run mean (5 bp) and oneabove the long-run mean (50 bp). Parameters for each model are fixed to the mean of the posteriordistribution. A riskless rate is fixed at 3% and recovery rate at 40%.

λ0 (bp) Quantiles of 5-year CDS(λ(Tδ)) in basis points0.001 0.01 0.1 0.25 0.5 0.75 0.9 0.99 0.999

CIR Model5 17.2 17.6 18.9 20.0 21.5 23.3 25.2 28.9 32.2

50 42.7 47.2 54.1 58.5 63.6 69.1 74.3 83.9 90.9CIR-IG TC Model

5 17.5 17.5 21.2 22.5 23.1 23.7 24.7 32.7 61.850 17.5 40.7 68.8 72.5 74.3 76.1 79.3 104.5 177.6

Table 6: Validation of independence assumptions.For the time-changed model, we report the posterior mean and standard deviation of correlationsbetween innovations ε(λ)

t in the default intensity process and time-change increments χt, and betweenε

(λ)t and changes in the federal funds rate (dFFRt = FFRt−FFRt−1).

Corr[ε(λ), χ

]Corr

[ε(λ), dFFRt

]Mean Std Dev Mean Std Dev

Alcoa 0.0248 0.0125 0.0033 0.0204Anadarko 0.0209 0.0122 0.0023 0.0218CenturyLink 0.0212 0.0110 0.0048 0.0209Clear Channel 0.0284 0.0145 0.0000 0.0216Ford 0.0251 0.0138 −0.0083 0.0213Lennar 0.0231 0.0128 0.0109 0.0225Limited Brands 0.0235 0.0125 0.0010 0.0214RadioShack 0.0197 0.0109 0.0024 0.0221Sprint 0.0503 0.0184 −0.0014 0.0279Tyson Foods 0.0276 0.0122 0.0030 0.0204

37

Table 7: Out-of-sample density forecast performance results.The table reports Amisano and Giacomini (2007) test statistics defined in equation (15). Under thenull hypothesis of equal performance, the statistics have an asymptotic standard normal distribution.The positive (negative) values support the time-changed (standard CIR) model.

U

Alcoa 10.98Anadarko 6.47CenturyLink 17.65Clear Channel 47.62Ford 19.44Lennar −14.73Limited Brands 63.07RadioShack 39.79Sprint 9.45Tyson Foods 23.60

38

Figure 1: Daily changes in log-spreads.Daily changes in the log-spread for the on-the-run 5 year CDX.NA.IG index. Index roll dates(marked on the x-axis) are dropped to avoid contamination from changes in index composition.

−0.25

−0.2

−0.15

−0.1

−0.05

0

0.05

0.1

0.15

0.2

0.25C

hange in log−

spre

ads

21Oct03

22Mar04

20Sep04

21Mar05

20Sep05

20Mar06

20Sep06

20Mar07

20Sep07

25Mar08

02Oct08

20Mar09

21Sep09

22Mar10

20Sep10

21Mar11

20Sep11

20Mar12

20Sep12

20Mar13

39

Figure 2: Kurtosis of change in the CDS spread.Stationary CIR model under business time with parameters κP = κQ = 0.2, µ = 0.004, σ = 0.1,and R = 0.4. Black dotted line marks limit as δ → ∞ of kurtosis of spread change in CIR model.Both axes on log-scale. Moments of the calendar time increments are obtained by simulation with4 million trials.

10−1

100

101

102

103

101

102

103

104

α

Kurt

osis

1 day

1 month

1 year

40

Figure 3: In-sample CDS spreads and smoothed state estimates: Alcoa.We plot CDS spreads (top panel, log-scale), smoothed estimates of the default intensity (mid-dle panel, log-scale) and smoothed estimates of time-change increments (bottom panel) for Alcoa.Smoothed estimates of the default intensity are obtained as the posterior mean of the MCMC chainfor the model without time-change (CIR) and with time-change in default intensity (CIR + IG TimeChange). The smoothed estimates of time-change increments for the model with time-change arescaled by ∆ for ease of comparison against model without time-change.

10−4

10−2

100

CD

S s

prea

ds

CDS spreads

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009

1 year5 year10 year

10−6

10−4

10−2

100 smoothed default intensity

λt

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009

CIRCIR + IG time change

0

5

10

15smoothed time change increments

χt/∆

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009

41

Figure 4: In-sample CDS spreads and smoothed state estimates: Ford.We plot CDS spreads (top panel, log-scale), smoothed estimates of the default intensity (mid-dle panel, log-scale) and smoothed estimates of time-change increments (bottom panel) for Ford.Smoothed estimates of the default intensity are obtained as the posterior mean of the MCMC chainfor the model without time-change (CIR) and with time-change in default intensity (CIR + IG TimeChange). The smoothed estimates of time-change increments for the model with time-change arescaled by ∆ for ease of comparison against model without time-change.

10−3

10−1

101

CD

S s

prea

ds

CDS spreads

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009

1 year5 year10 year

10−3

10−2

10−1

100 smoothed default intensity

λt

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009


0

5

10

15

20

25smoothed time change increments

χt/∆

3/20

/200

26/

20/2

002

9/20

/200

212

/20/

2002

3/20

/200

36/

20/2

003

9/22

/200

312

/22/

2003

3/22

/200

46/

21/2

004

9/20

/200

412

/20/

2004

3/21

/200

56/

20/2

005

9/20

/200

512

/20/

2005

3/20

/200

66/

20/2

006

9/20

/200

612

/20/

2006

3/20

/200

76/

20/2

007

9/20

/200

712

/20/

2007

3/20

/200

86/

20/2

008

9/22

/200

812

/22/

2008

3/20

/200

96/

22/2

009

9/21

/200

912

/21/

2009

42

Figure 5: Model-implied term structure of spreads for Alcoa.Term-structure of credit spreads for Alcoa on three dates for fitted model with time-change (solidlines) and without time-change (dashed lines). Spreads plotted in basis points on a log scale.

1 2 3 4 5 6 7 8 9 10

101

102

103

18−Mar−2005

19−Mar−2009

19−Jun−2008

Maturity

Spre

ad (

bp)

43

Figure 6: Out-of-sample CDS spreads and filtered state estimates: Ford.We plot CDS spreads (top panel, log-scale), filtered estimates of the default intensity (middle panel,log-scale) and filtered estimates of time-change increments (bottom panel) for Ford. Filtered esti-mates of the default intensity are obtained as the mean of the filtered distribution produced by theparticle filter for the model without time-change (CIR) and with time-change in default intensity(CIR + IG Time Change). The filtered estimates of time-change increments for the model withtime-change are scaled by ∆ for ease of comparison against model without time-change.

10−3

10−2

10−1

CD

S s

prea

ds

CDS spreads

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11

1 year5 year10 year

10−3

10−2

10−1 filtered default intensity

λt

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11


0

5

10filtered time change increments

χt/∆

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11

44

Figure 7: Out-of-sample CDS spreads and filtered state estimates: Lennar.We plot CDS spreads (top panel, log-scale), filtered estimates of the default intensity (middle panel,log-scale) and filtered estimates of time-change increments (bottom panel) for Lennar. Filteredestimates of the default intensity are obtained as the mean of the filtered distribution producedby the particle filter for the model without time-change (CIR) and with time-change in defaultintensity (CIR + IG Time Change). The filtered estimates of time-change increments for the modelwith time-change are scaled by ∆ for ease of comparison against model without time-change.

10−3

10−2

10−1

CD

S s

prea

ds

CDS spreads

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11

1 year5 year10 year

10−2

10−1 filtered default intensity

λt

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11


0

5

10filtered time change increments

χt/∆

3/22

/201

0

6/21

/201

0

9/20

/201

0

12/2

0/20

10

3/21

/201

1

6/20

/201

1

9/20

/201

1

12/2

0/20

11

45

bayesian estimation of time-changed default intensity modelsbayesian estimation of time-changed...

Documents