Financial Data Analysis (uni-freiburg.de)
Multivariate GARCH Models
• C. Alexander (2008): Practical Financial Econometrics, Chapter II.4.5.
• A. Silvennoinen and T. Teräsvirta (2009): Multivariate GARCH Models, Handbook of Financial Time Series, Springer; available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1148139.
• L. Bauwens and S. Laurent and J. K. Rombouts (2006): Multivariate GARCH Models: A Survey, Journal of Applied Econometrics, 21, 79–109; available at http://ideas.repec.org/p/cor/louvco/2003031.html.
Multivariate GARCH
• Many problems in finance are inherently multivariate and require us to understand the dependence structure between assets.
• E.g.,
– portfolio analysis,
– volatility transmission: study of relations between the volatilities and covariances/correlations of several markets (e.g., emerging and developed markets, or different regions),
– relation between correlations and volatilities,
– tests of asset pricing models,
– futures hedging.
• Multivariate GARCH: Models for the evolution of volatilities and covariances/correlations.
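• Consider a return vector $r_t$ consisting of $N$ components, i.e., $r_t = [r_{1t}, r_{2t}, \ldots, r_{Nt}]'$ (a column vector),
\begin{align}
r_t &= \mu_t + \varepsilon_t \tag{1} \\
\mu_t &= E(r_t \mid I_{t-1}) = E_{t-1}(r_t) \tag{2} \\
\varepsilon_t \mid I_{t-1} &\sim N(0, H_t) \tag{3} \\
H_t &= \mathrm{Var}(r_t \mid I_{t-1}) = \mathrm{Var}_{t-1}(r_t) = \mathrm{Var}_{t-1}(\varepsilon_t), \tag{4}
\end{align}
where $I_t$ is the information available at time $t$, usually $I_t = \{r_t, r_{t-1}, \ldots\}$.
• The error term is $\varepsilon_t = [\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{Nt}]'$.
• This implies that $H_t$ is the conditional covariance matrix of $r_t$.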
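• Covariance matrix
$$H_t = \begin{bmatrix}
h_{1t}^2 & h_{12,t} & h_{13,t} & \cdots & h_{1N,t} \\
h_{12,t} & h_{2t}^2 & h_{23,t} & \cdots & h_{2N,t} \\
h_{13,t} & h_{23,t} & h_{3t}^2 & \cdots & h_{3N,t} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
h_{1N,t} & h_{2N,t} & h_{3N,t} & \cdots & h_{Nt}^2
\end{bmatrix}, \tag{5}$$
where
$$h_{jt}^2 = \mathrm{Var}_{t-1}(r_{jt}), \quad h_{ij,t} = \mathrm{Cov}_{t-1}(r_{it}, r_{jt}), \tag{6}$$
is symmetric and positive definite: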
• We know that for any linear combination (with weight vector $w = [w_1, w_2, \ldots, w_N]'$) of the elements of $r_t$,¹
$$0 < \mathrm{Var}_{t-1}\Big(\sum_i w_i r_{it}\Big) = \sum_i w_i^2 h_{it}^2 + \sum_i \sum_{j \neq i} w_i w_j h_{ij,t} = w' H_t w.$$
¹ The variance may be zero if the components are linearly dependent.
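• For example, with $N = 2$,
$$\mathrm{Var}_{t-1}(w_1 r_{1t} + w_2 r_{2t}) = w_1^2 h_{1t}^2 + 2 w_1 w_2 h_{12,t} + w_2^2 h_{2t}^2
= \begin{bmatrix} w_1 & w_2 \end{bmatrix}
\begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}.$$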
• If the conditional distribution of $r_t$ is multivariate normal, then, for example, the conditional $100 \times \xi\%$ portfolio Value–at–Risk (VaR) for any portfolio combination $w$ can be calculated as
$$\mathrm{VaR}_{t-1}(\xi) = w'\mu_t + \Phi^{-1}(\xi)\sqrt{w' H_t w}, \tag{7}$$
where $\Phi^{-1}(\xi)$ is the $\xi$–quantile of the standard normal distribution, e.g., $\Phi^{-1}(0.01) = -2.3263$ and $\Phi^{-1}(0.05) = -1.6449$.
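As an illustration of (7), the following minimal sketch evaluates the normal VaR quantile for a two–asset portfolio. The weights, mean vector, and covariance matrix are hypothetical values chosen for the example, not numbers from these slides; the function names are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist

def portfolio_variance(w, H):
    """Quadratic form w'Hw for a weight list w and covariance matrix H (list of rows)."""
    n = len(w)
    return sum(w[i] * H[i][j] * w[j] for i in range(n) for j in range(n))

def value_at_risk(w, mu, H, xi):
    """Equation (7): w'mu + Phi^{-1}(xi) * sqrt(w'Hw), assuming conditional normality."""
    mean = sum(wi * mi for wi, mi in zip(w, mu))
    return mean + NormalDist().inv_cdf(xi) * sqrt(portfolio_variance(w, H))

# Hypothetical inputs (percentage returns), for illustration only.
w = [0.6, 0.4]
mu = [0.05, 0.02]
H = [[4.0, 1.5],
     [1.5, 6.0]]

q01 = NormalDist().inv_cdf(0.01)      # standard normal 1% quantile
q05 = NormalDist().inv_cdf(0.05)      # standard normal 5% quantile
var_1pct = value_at_risk(w, mu, H, 0.01)
print(round(q01, 4), round(q05, 4), var_1pct)
```

The quantile is negative for small ξ, so the VaR in (7) is reported as a (typically negative) return quantile rather than as a positive loss figure.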
• Similar to the univariate GARCH,
$$r_t = \mu_t + \varepsilon_t, \quad \varepsilon_t = \sigma_t \eta_t, \quad \eta_t \overset{iid}{\sim} N(0, 1),$$
(3) is often written as
$$\varepsilon_t = H_t^{1/2} z_t, \quad z_t \overset{iid}{\sim} N(0, I), \tag{8}$$
where $N(0, I)$ denotes the $N$–dimensional normal distribution with a mean vector of zeros and identity covariance matrix, i.e., the $N$–dimensional standard normal.
• $H_t^{1/2}$ is an $N \times N$ matrix such that $H_t^{1/2} (H_t^{1/2})' = H_t$ (matrix square root).
• As $H_t$ is a covariance matrix, such a factorization exists, e.g., the Cholesky decomposition.
• A symmetric positive definite matrix $A$ can be factored as $A = LL'$, where $L$ is lower triangular with positive diagonal elements (the Cholesky factorization of $A$).²
• For example, if $N = 2$ (bivariate case), where
$$H_t = \begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix},$$
the Cholesky factorization is
$$L = \begin{bmatrix} \sqrt{h_{1t}^2} & 0 \\ h_{12,t}/\sqrt{h_{1t}^2} & \sqrt{h_{2t}^2 - h_{12,t}^2/h_{1t}^2} \end{bmatrix}.$$
• $LL' = H_t$ is easily checked, and $h_{2t}^2 - h_{12,t}^2/h_{1t}^2 = (h_{1t}^2 h_{2t}^2 - h_{12,t}^2)/h_{1t}^2 = (\det H_t)/h_{1t}^2 > 0$ since $H_t$ is positive definite.
² Other factorizations exist.
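The bivariate Cholesky factor can be checked numerically. The sketch below, using a hypothetical covariance matrix, compares the closed–form factor with NumPy's `numpy.linalg.cholesky` and then uses it as the matrix square root in (8) to generate errors with the desired covariance.

```python
import numpy as np

def cholesky_2x2(h1_sq, h12, h2_sq):
    """Closed-form lower-triangular Cholesky factor of [[h1_sq, h12], [h12, h2_sq]]."""
    l11 = np.sqrt(h1_sq)
    l21 = h12 / l11
    l22 = np.sqrt(h2_sq - h12 ** 2 / h1_sq)
    return np.array([[l11, 0.0], [l21, l22]])

# Hypothetical conditional covariance matrix (illustration only).
H = np.array([[4.0, 1.5],
              [1.5, 6.0]])

L = cholesky_2x2(H[0, 0], H[0, 1], H[1, 1])
L_numpy = np.linalg.cholesky(H)          # NumPy also returns the lower factor

# Use L as H^{1/2} in eps = H^{1/2} z with z ~ N(0, I).
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 200_000))
eps = L @ z
sample_cov = np.cov(eps)                 # should be close to H
print(L, sample_cov, sep="\n")
```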
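• It then follows from (8) that
\begin{align}
\mathrm{Var}_{t-1}(r_t) &= \mathrm{Var}_{t-1}(\varepsilon_t) \tag{9} \\
&= E_{t-1}(\varepsilon_t \varepsilon_t') - \underbrace{E_{t-1}(\varepsilon_t)}_{=0} E_{t-1}(\varepsilon_t)' \tag{10} \\
&= E_{t-1}\big(H_t^{1/2} z_t z_t' (H_t^{1/2})'\big) \tag{11} \\
&= H_t^{1/2} \underbrace{E_{t-1}(z_t z_t')}_{=\text{identity matrix}} (H_t^{1/2})' \tag{12} \\
&= H_t^{1/2} (H_t^{1/2})' = H_t. \tag{13}
\end{align}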
Main Problems
• There are two main problems when it comes to the specification of multivariate GARCH models:
(i) To keep estimation feasible, we need parsimonious models (i.e., models with a moderate number of parameters) which are still flexible enough to capture the most important aspects of the volatility/covariance dynamics.
(ii) We have to make sure that the conditional covariance matrix will remain positive definite at each point of time.
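• For the sake of illustration, consider a bivariate GARCH(1,1) of the general vec–type.
• The covariance matrix is then given by
$$H_t = \begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix},$$
where, in the most general case,
\begin{align*}
h_{1t}^2 &= c_1 + a_{11}\varepsilon_{1,t-1}^2 + a_{12}\varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{13}\varepsilon_{2,t-1}^2 + b_{11}h_{1,t-1}^2 + b_{12}h_{12,t-1} + b_{13}h_{2,t-1}^2 \\
h_{12,t} &= c_2 + a_{21}\varepsilon_{1,t-1}^2 + a_{22}\varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{23}\varepsilon_{2,t-1}^2 + b_{21}h_{1,t-1}^2 + b_{22}h_{12,t-1} + b_{23}h_{2,t-1}^2 \\
h_{2t}^2 &= c_3 + a_{31}\varepsilon_{1,t-1}^2 + a_{32}\varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{33}\varepsilon_{2,t-1}^2 + b_{31}h_{1,t-1}^2 + b_{32}h_{12,t-1} + b_{33}h_{2,t-1}^2,
\end{align*}
or
$$\underbrace{\begin{bmatrix} h_{1t}^2 \\ h_{12,t} \\ h_{2t}^2 \end{bmatrix}}_{=h_t}
= \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}
+ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{bmatrix} \varepsilon_{1,t-1}^2 \\ \varepsilon_{1,t-1}\varepsilon_{2,t-1} \\ \varepsilon_{2,t-1}^2 \end{bmatrix}
+ \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}
\begin{bmatrix} h_{1,t-1}^2 \\ h_{12,t-1} \\ h_{2,t-1}^2 \end{bmatrix}.$$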
• In this specification, both conditional variances, $h_{1t}^2$ and $h_{2t}^2$, and the conditional covariance, $h_{12,t}$, may depend on all lagged squared returns and variances and all lagged cross–products $\varepsilon_{1,t-1}\varepsilon_{2,t-1}$ and covariances.
• Although flexible, this model is difficult to handle in practice, since it requires estimation of 21 parameters (and this is for the bivariate case).
• Moreover, without further restrictions, there is no guarantee that the sequence of covariance matrices implied by an estimated process will be positive definite for all $t$.
• Such conditions are very tedious to work out and to impose in estimation.
• The system above is a bivariate version of the vec model, which is a straightforward generalization of univariate GARCH.
• The general case is still useful, as it nests many more practicable specifications.
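• The name derives from the fact that the model uses the vech operator.
• As the $N \times N$ matrix $H_t$ is symmetric, it contains only $N(N+1)/2$ independent elements, which may be obtained, for example, by excluding the upper triangular (redundant) part.
• The vech operator then stacks the remaining elements columnwise into an $N(N+1)/2$ column vector, e.g.,
$$\mathrm{vech}\left(\begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix}\right) = \begin{bmatrix} h_{1t}^2 \\ h_{12,t} \\ h_{2t}^2 \end{bmatrix},$$
$$\mathrm{vech}(\varepsilon_t \varepsilon_t') = \mathrm{vech}\left(\begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}\begin{bmatrix} \varepsilon_{1t} & \varepsilon_{2t} \end{bmatrix}\right)
= \mathrm{vech}\left(\begin{bmatrix} \varepsilon_{1t}^2 & \varepsilon_{1t}\varepsilon_{2t} \\ \varepsilon_{1t}\varepsilon_{2t} & \varepsilon_{2t}^2 \end{bmatrix}\right)
= \begin{bmatrix} \varepsilon_{1t}^2 \\ \varepsilon_{1t}\varepsilon_{2t} \\ \varepsilon_{2t}^2 \end{bmatrix}.$$
• The vec operator is similar, but without excluding the upper triangular part.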
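The two stacking operators are easy to state in code. The following sketch is one possible NumPy implementation; the function names `vech` and `vec` are chosen here for illustration and are not part of NumPy.

```python
import numpy as np

def vech(M):
    """Stack the lower triangle (including the diagonal) of M columnwise."""
    M = np.asarray(M)
    return np.concatenate([M[j:, j] for j in range(M.shape[0])])

def vec(M):
    """Stack all columns of M into one vector (column-major order)."""
    return np.asarray(M).flatten(order="F")

# Outer product of a hypothetical error vector, as in vech(eps eps').
eps = np.array([2.0, -1.0])
eta = vech(np.outer(eps, eps))            # [eps1^2, eps1*eps2, eps2^2]

S = np.array([[1, 2, 3],
              [2, 4, 5],
              [3, 5, 6]])
print(eta, vech(S), vec(S))
```

For an N × N matrix, `vech` returns N(N + 1)/2 elements and `vec` returns N² elements.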
• Then the vec(1,1) model can be written
$$h_t = c + A\eta_{t-1} + Bh_{t-1}, \tag{14}$$
where
$$h_t = \mathrm{vech}\,H_t \tag{15}$$
$$\eta_t = \mathrm{vech}(\varepsilon_t \varepsilon_t'). \tag{16}$$
• Without restrictions, there are
– $N(N+1)/2$ parameters in $c$,
– $N^2(N+1)^2/4$ parameters in $A$,
– $N^2(N+1)^2/4$ parameters in $B$.
• With $N = 2, 3, 5, 10$ assets, we have 21, 78, 465, 6105 parameters.
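The parameter counts follow directly from the dimensions of c, A, and B. A short check (the function name is illustrative):

```python
def vec_param_count(n):
    """Number of free parameters in the unrestricted vec(1,1) model with n assets."""
    m = n * (n + 1) // 2          # length of vech(H_t), i.e. parameters in c
    return m + 2 * m * m          # c plus the m x m matrices A and B

counts = [vec_param_count(n) for n in (2, 3, 5, 10)]
print(counts)
```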
Stationarity and Unconditional Variance
• Covariance stationarity of the vec(1,1) model (14),
$$h_t = c + A\eta_{t-1} + Bh_{t-1}, \tag{17}$$
requires the eigenvalues of the matrix
$$Q = A + B$$
to be inside the unit circle.
• If this holds, the unconditional covariance matrix (its vech) can be obtained by taking expectations on both sides of (17),
\begin{align*}
E(h_t) &= c + AE(\eta_{t-1}) + BE(h_{t-1}) \\
&= c + AE(h_{t-1}) + BE(h_{t-1}) \\
&= c + (A + B)E(h_t),
\end{align*}
hence
$$E(\mathrm{vech}\,H_t) = E(h_t) = (I - A - B)^{-1}c.$$
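• Covariance matrix forecasts:
\begin{align*}
h_{t+1} &= c + A\eta_t + Bh_t \\
E_t(h_{t+2}) &= c + AE_t(\eta_{t+1}) + Bh_{t+1} = c + (A + B)h_{t+1} \\
E_t(h_{t+3}) &= c + AE_t(\eta_{t+2}) + BE_t(h_{t+2}) \\
&= c + (A + B)E_t(h_{t+2}) = c + (A + B)c + (A + B)^2 h_{t+1} \\
&\;\;\vdots \\
E_t(h_{t+\tau}) &= \sum_{i=0}^{\tau-2}(A + B)^i c + (A + B)^{\tau-1}h_{t+1} \\
&= E(h_t) + (A + B)^{\tau-1}(h_{t+1} - E(h_t)),
\end{align*}
using
$$\sum_{i=0}^{\tau-2}(A + B)^i = [I - (A + B)^{\tau-1}](I - A - B)^{-1}.$$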
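The forecast recursion and its closed form can be compared numerically. The sketch below uses hypothetical (diagonal) parameter matrices for a bivariate system, checks the eigenvalue condition on A + B, and confirms that both routes give the same multi–step forecast.

```python
import numpy as np

def forecast_recursive(c, A, B, h_next, tau):
    """Iterate E_t(h_{t+k}) = c + (A + B) E_t(h_{t+k-1}), starting from h_{t+1}."""
    Q = A + B
    h = h_next.copy()
    for _ in range(tau - 1):
        h = c + Q @ h
    return h

def forecast_closed_form(c, A, B, h_next, tau):
    """E(h_t) + (A + B)^(tau-1) (h_{t+1} - E(h_t)), with E(h_t) = (I - A - B)^{-1} c."""
    Q = A + B
    h_bar = np.linalg.solve(np.eye(len(c)) - Q, c)
    return h_bar + np.linalg.matrix_power(Q, tau - 1) @ (h_next - h_bar)

# Hypothetical parameters for h_t = [h1^2, h12, h2^2]' (illustration only).
c = np.array([0.10, 0.02, 0.15])
A = np.diag([0.05, 0.04, 0.06])
B = np.diag([0.90, 0.92, 0.88])
h_next = np.array([3.0, 1.0, 4.0])        # h_{t+1}, known at time t

stationary = np.max(np.abs(np.linalg.eigvals(A + B))) < 1
h10_rec = forecast_recursive(c, A, B, h_next, 10)
h10_cf = forecast_closed_form(c, A, B, h_next, 10)
h_far = forecast_closed_form(c, A, B, h_next, 2000)   # approaches E(h_t)
print(stationary, h10_rec, h10_cf, h_far)
```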
• $E_t(h_{t+\tau})$ converges to the unconditional covariance matrix provided the covariance stationarity condition is satisfied.
• Calculation of higher moments of the vec model is considerably more involved than in the univariate GARCH model.³
³ C. M. Hafner (2003): Fourth Moment Structure of Multivariate GARCH Models, Journal of Financial Econometrics, 1, 26–54.
Special Case I: Diagonal VEC
• To reduce the number of parameters, the diagonal vec model restricts the matrices $A$ and $B$ in (14) to be diagonal.
• This means that
– each variance $h_{it}^2$ depends only on its own past squared error $\varepsilon_{i,t-1}^2$ and its own lag (as in the univariate case),
$$h_{it}^2 = c_{ii} + a_{ii}\varepsilon_{i,t-1}^2 + b_{ii}h_{i,t-1}^2, \quad i = 1, \ldots, N, \tag{18}$$
– each covariance $h_{ij,t}$ depends only on its own past cross–product of errors $\varepsilon_{i,t-1}\varepsilon_{j,t-1}$ and its own lag,
$$h_{ij,t} = c_{ij} + a_{ij}\varepsilon_{i,t-1}\varepsilon_{j,t-1} + b_{ij}h_{ij,t-1}, \quad i, j = 1, \ldots, N, \; i \neq j. \tag{19}$$
• Often this specification is sufficient to represent the dynamics of variances and covariances.
• However, it does not allow for volatility transmission, so it is not suitable for applications of that kind.
• With $N = 2, 3, 5, 10$ assets, we have 9, 18, 45, 165 parameters.
• Even in the diagonal vec model, conditions for positive definiteness are difficult to check and impose in estimation.
• To find sufficient conditions, the diagonal vec model (18) and (19) can be rewritten as
$$H_t = C + A \odot (\varepsilon_{t-1}\varepsilon_{t-1}') + B \odot H_{t-1}, \tag{20}$$
where the Hadamard product $\odot$ denotes elementwise multiplication of conformable matrices.
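• E.g., for $N = 2$,
$$\begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix}
= \begin{bmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{bmatrix}
+ \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix} \odot
\begin{bmatrix} \varepsilon_{1,t-1}^2 & \varepsilon_{1,t-1}\varepsilon_{2,t-1} \\ \varepsilon_{1,t-1}\varepsilon_{2,t-1} & \varepsilon_{2,t-1}^2 \end{bmatrix}
+ \begin{bmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{bmatrix} \odot
\begin{bmatrix} h_{1,t-1}^2 & h_{12,t-1} \\ h_{12,t-1} & h_{2,t-1}^2 \end{bmatrix}.$$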
Schur product theorem
• Consider two normally distributed zero–mean random vectors $X$ and $Y$ (of the same length), where
– $X$ has covariance matrix $\Sigma_X$, and
– $Y$ has covariance matrix $\Sigma_Y$,
and $X$ and $Y$ are independent.
• Now consider the product $X \odot Y$.
• The covariance between the $i$th and $j$th elements of $X \odot Y$ is
\begin{align*}
\mathrm{Cov}(X_i Y_i, X_j Y_j) &= E(X_i Y_i X_j Y_j) - E(X_i Y_i)E(X_j Y_j) \\
&\overset{\text{independence}}{=} E(X_i X_j)E(Y_i Y_j) - E(X_i)E(Y_i)E(X_j)E(Y_j) \\
&\overset{\text{zero mean}}{=} E(X_i X_j)E(Y_i Y_j) \\
&= \mathrm{Cov}(X_i, X_j)\,\mathrm{Cov}(Y_i, Y_j).
\end{align*}
• It follows that the covariance matrix of $X \odot Y$ is $\Sigma_X \odot \Sigma_Y$.
• Thus $\Sigma_X \odot \Sigma_Y$ is a covariance matrix and therefore positive definite.
• Since any positive definite matrix is the covariance matrix of a normal random vector, we conclude that the Hadamard product of two positive definite matrices is likewise positive definite.
• This result is often referred to as the Schur product theorem.⁴
⁴ Which is slightly more general, allowing also for positive semi–definite matrices.
• The Schur product theorem can be applied to the representation⁵
$$H_t = C + A \odot (\varepsilon_{t-1}\varepsilon_{t-1}') + B \odot H_{t-1} \tag{21}$$
to conclude that
– if (symmetric) matrices $C$, $A$, and $B$ are positive definite (or indeed semi–definite), and
– the initial covariance matrix $H_0$ is positive definite,
then $H_t$ will remain positive definite for all $t$.
• During estimation, matrices $C$, $A$, and $B$ can be parameterized in terms of their Cholesky factorization to guarantee positive semi–definiteness, i.e., one estimates the system
$$H_t = CC' + (AA') \odot (\varepsilon_{t-1}\varepsilon_{t-1}') + (BB') \odot H_{t-1}, \tag{22}$$
where $A$, $B$, and $C$ are lower triangular with positive diagonal.
⁵ Cf. Z. Ding and R. F. Engle (2001): Large Scale Conditional Covariance Matrix Modeling, Estimation and Testing, Academia Economic Papers, 29, 157–184.
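To see the Cholesky parameterization in (22) at work, the following sketch simulates the diagonal vec recursion with hypothetical lower–triangular factors and tracks the smallest eigenvalue of H_t along the path. All numerical values are made up for the illustration; in NumPy, `*` between arrays is the elementwise (Hadamard) product.

```python
import numpy as np

# Hypothetical lower-triangular factors with positive diagonals (illustration only).
L_C = np.array([[0.50, 0.00, 0.00], [0.10, 0.40, 0.00], [0.20, 0.10, 0.30]])
L_A = np.array([[0.25, 0.00, 0.00], [0.05, 0.20, 0.00], [0.10, 0.05, 0.20]])
L_B = np.array([[0.90, 0.00, 0.00], [0.85, 0.30, 0.00], [0.85, 0.10, 0.30]])

# Positive definite coefficient matrices, as in (22).
C, A, B = L_C @ L_C.T, L_A @ L_A.T, L_B @ L_B.T

rng = np.random.default_rng(1)
H = np.eye(3)                      # positive definite starting value H_0
min_eig = np.inf
for _ in range(1000):
    # Draw eps_t = H_t^{1/2} z_t; cholesky() would raise if H were not positive definite.
    eps = np.linalg.cholesky(H) @ rng.standard_normal(3)
    # Diagonal vec update written with Hadamard products.
    H = C + A * np.outer(eps, eps) + B * H
    min_eig = min(min_eig, np.linalg.eigvalsh(H).min())

print(min_eig)                     # stays strictly positive along the path
```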
• For high–dimensional systems, it is still not feasible to estimate (22) directly.
• Two ways to proceed:
(i) Introduce further simplifications.
(ii) Keep the full flexibility of the diagonal vec model and use a clever estimation method, as in Ledoit et al. (2003).⁶
• Briefly, as to (i), the most extreme case is the scalar–diagonal model, given by
$$H_t = CC' + \alpha\varepsilon_{t-1}\varepsilon_{t-1}' + \beta H_{t-1}, \tag{23}$$
where $\alpha$ and $\beta$ are positive scalars.
• We can do variance targeting by noting that the unconditional covariance matrix in (23) is $CC'/(1 - \alpha - \beta)$.
⁶ O. Ledoit, P. Santa–Clara and M. Wolf (2003): Flexible Multivariate GARCH Modeling with an Application to International Stock Markets, Review of Economics and Statistics, 85, 735–747.
• Thus, we may put
$$H_t = S(1 - \alpha - \beta) + \alpha\varepsilon_{t-1}\varepsilon_{t-1}' + \beta H_{t-1}, \tag{24}$$
where $S$ is the sample covariance matrix (or any long–run covariance matrix imposed by the analyst).
• Thus only two parameters need to be estimated numerically.⁷
• Clearly this is a very restrictive model, as it assumes the same kind of volatility dynamics to be present in each asset.
• A practically relevant special case of (23), with $C = 0$ and $\beta = 1 - \alpha = \lambda$, is the exponentially weighted moving average (EWMA), where
$$H_t = \lambda H_{t-1} + (1 - \lambda)\varepsilon_{t-1}\varepsilon_{t-1}' = (1 - \lambda)\sum_{i=1}^{\infty}\lambda^{i-1}\varepsilon_{t-i}\varepsilon_{t-i}',$$
where $\lambda$ is fixed at 0.94 for daily data in the RiskMetrics model.
⁷ The long–run covariance matrix is of course also an estimate, but it need not be estimated numerically.
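The EWMA recursion and its weighted–sum representation can be checked against each other on a finite sample, where the infinite sum is replaced by the observed history plus a geometrically decaying term for the starting value. The data below are simulated and the starting value H_0 is an arbitrary choice made for the illustration.

```python
import numpy as np

def ewma_covariance(eps, lam=0.94, H0=None):
    """Return H_0, ..., H_T with H_t = lam * H_{t-1} + (1 - lam) * eps_{t-1} eps_{t-1}'."""
    T, N = eps.shape
    H = np.empty((T + 1, N, N))
    H[0] = np.cov(eps, rowvar=False) if H0 is None else H0
    for t in range(1, T + 1):
        e = eps[t - 1]
        H[t] = lam * H[t - 1] + (1.0 - lam) * np.outer(e, e)
    return H

rng = np.random.default_rng(4)
T, lam = 300, 0.94
eps = rng.standard_normal((T, 2))          # simulated errors (illustration only)
H0 = np.eye(2)
H = ewma_covariance(eps, lam, H0)

# Finite-sample version of the weighted sum: weight (1 - lam) * lam^(i-1) on eps_{T-i}.
direct = lam ** T * H0 + sum(
    (1.0 - lam) * lam ** (i - 1) * np.outer(eps[T - i], eps[T - i])
    for i in range(1, T + 1)
)
print(H[T], direct, sep="\n")
```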
• As to (ii), we can estimate high–dimensional systems by applying the methodology suggested by Ledoit et al. (2003).
• The idea is to first obtain each set of coefficient estimates $\hat c_{ij}$, $\hat a_{ij}$, and $\hat b_{ij}$ separately for each pair of assets $(i, j)$.
• This can be achieved simply by estimating one– or two–dimensional GARCH models for the conditional variances and covariances, respectively.
• That is, in the first step, for all individual return series, we fit a univariate GARCH model to get coefficient estimates $\hat c_{ii}$, $\hat a_{ii}$, and $\hat b_{ii}$, $i = 1, \ldots, N$.
• We use these estimates to calculate a sequence of conditional variance estimates, $\hat h_{it}^2$, for each asset,
$$\hat h_{it}^2 = \hat c_{ii} + \hat a_{ii}\varepsilon_{i,t-1}^2 + \hat b_{ii}\hat h_{i,t-1}^2, \quad t = 1, \ldots, T, \; i = 1, \ldots, N. \tag{25}$$
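• In the second stage, the estimates (25) are used to specify bivariate likelihood functions for the off-diagonal elements. For example, if normality is assumed, we maximize
$$-\frac{1}{2}\sum_{t=1}^{T}\left(\log|H_{ij,t}| + \varepsilon_{ij,t}' H_{ij,t}^{-1}\varepsilon_{ij,t}\right),$$
where
$$\varepsilon_{ij,t} = \begin{bmatrix} \varepsilon_{it} \\ \varepsilon_{jt} \end{bmatrix}, \quad
H_{ij,t} = \begin{bmatrix} \hat h_{it}^2 & h_{ij,t} \\ h_{ij,t} & \hat h_{jt}^2 \end{bmatrix},$$
$\hat h_{it}^2$ and $\hat h_{jt}^2$ are from (25), and the diagonal vec GARCH specification is used for the conditional covariance,
$$h_{ij,t} = c_{ij} + a_{ij}\varepsilon_{i,t-1}\varepsilon_{j,t-1} + b_{ij}h_{ij,t-1}, \tag{26}$$
$$i, j = 1, 2, \ldots, N, \; i \neq j. \tag{27}$$
• In each of these second–step bivariate problems, only three parameters in (26) have to be estimated.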
• After the second step, we have estimates $\hat c_{ij}$, $\hat a_{ij}$, and $\hat b_{ij}$, $i, j = 1, \ldots, N$.
• These are then used to construct matrices
$$\hat C = [\hat c_{ij}]_{i,j=1,\ldots,N} \tag{28}$$
$$\hat A = [\hat a_{ij}]_{i,j=1,\ldots,N} \tag{29}$$
$$\hat B = [\hat b_{ij}]_{i,j=1,\ldots,N}. \tag{30}$$
• Nothing guarantees that these matrices are positive semi–definite, and thus their application may produce conditional covariance matrices that are not positive definite.⁸
• However, if they are not, we can transform matrices $\hat C$, $\hat A$, and $\hat B$ to positive semi–definite matrices $\tilde C$, $\tilde A$, and $\tilde B$, which are then taken to be the estimates of $C$, $A$, and $B$.
⁸ Actually, we only require $C \oslash (I - B)$ rather than $C$ to be positive semi–definite, where $\oslash$ is elementwise division, but this does not change the line of the argument; see Ledoit et al. (2003) for details.
• Matrices $\tilde C$, $\tilde A$, and $\tilde B$ are chosen such that they are positive semi–definite matrices and as close as possible to $\hat C$, $\hat A$, and $\hat B$ in the Frobenius norm,⁹ i.e., by minimizing $\|\tilde C - \hat C\|_F$, $\|\tilde A - \hat A\|_F$, and $\|\tilde B - \hat B\|_F$, where for an $N \times N$ matrix $M$,¹⁰
$$\|M\|_F = \sqrt{\sum_{i=1}^{N}\sum_{j=1}^{N} m_{ij}^2}.$$
• Ledoit et al. (2003) show that this works well in applications to volatility forecasting, Value–at–Risk measurement, and portfolio selection.
⁹ In addition, it is imposed that the diagonal elements do not change.
¹⁰ Matlab code for doing the optimization is available at http://www.iew.uzh.ch/institute/people/wolf/publications.html.
Special Case II: BEKK
• BEKK (Baba, Engle, Kraft, and Kroner) was suggested by Engle and Kroner (1995).¹¹
• This specifies, in its simplest form,
$$H_t = C^{*}C^{*\prime} + A^{*}\varepsilon_{t-1}\varepsilon_{t-1}' A^{*\prime} + B^{*}H_{t-1}B^{*\prime}, \tag{31}$$
where $C^{*}$ is a triangular matrix and $A^{*}$ and $B^{*}$ are $N \times N$ parameter matrices.
• This guarantees positive definiteness if the initialization of $H_t$ is positive definite.
• So the number of parameters is $N(5N + 1)/2$, i.e., for $N = 2, 3, 5, 10$ assets, we have 11, 24, 65, 255 parameters.
¹¹ Multivariate Simultaneous Generalized ARCH, Econometric Theory, 11, 122–150.
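• To see that this is a restricted vec model, consider the case $N = 2$, where
\begin{align*}
\begin{bmatrix} h_{1t}^2 & h_{12,t} \\ h_{12,t} & h_{2t}^2 \end{bmatrix}
&= \begin{bmatrix} c_{11}^{*} & 0 \\ c_{21}^{*} & c_{22}^{*} \end{bmatrix}
\begin{bmatrix} c_{11}^{*} & c_{21}^{*} \\ 0 & c_{22}^{*} \end{bmatrix} \\
&\quad + \begin{bmatrix} a_{11}^{*} & a_{12}^{*} \\ a_{21}^{*} & a_{22}^{*} \end{bmatrix}
\begin{bmatrix} \varepsilon_{1,t-1}^2 & \varepsilon_{1,t-1}\varepsilon_{2,t-1} \\ \varepsilon_{1,t-1}\varepsilon_{2,t-1} & \varepsilon_{2,t-1}^2 \end{bmatrix}
\begin{bmatrix} a_{11}^{*} & a_{12}^{*} \\ a_{21}^{*} & a_{22}^{*} \end{bmatrix}' \\
&\quad + \begin{bmatrix} b_{11}^{*} & b_{12}^{*} \\ b_{21}^{*} & b_{22}^{*} \end{bmatrix}
\begin{bmatrix} h_{1,t-1}^2 & h_{12,t-1} \\ h_{12,t-1} & h_{2,t-1}^2 \end{bmatrix}
\begin{bmatrix} b_{11}^{*} & b_{12}^{*} \\ b_{21}^{*} & b_{22}^{*} \end{bmatrix}',
\end{align*}
or
\begin{align*}
h_{1t}^2 &= c_1 + a_{11}^{*2}\varepsilon_{1,t-1}^2 + 2a_{11}^{*}a_{12}^{*}\varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{12}^{*2}\varepsilon_{2,t-1}^2 \\
&\quad + b_{11}^{*2}h_{1,t-1}^2 + 2b_{11}^{*}b_{12}^{*}h_{12,t-1} + b_{12}^{*2}h_{2,t-1}^2 \\
h_{12,t} &= c_2 + a_{11}^{*}a_{21}^{*}\varepsilon_{1,t-1}^2 + (a_{11}^{*}a_{22}^{*} + a_{21}^{*}a_{12}^{*})\varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{22}^{*}a_{12}^{*}\varepsilon_{2,t-1}^2 \\
&\quad + b_{11}^{*}b_{21}^{*}h_{1,t-1}^2 + (b_{11}^{*}b_{22}^{*} + b_{12}^{*}b_{21}^{*})h_{12,t-1} + b_{22}^{*}b_{12}^{*}h_{2,t-1}^2,
\end{align*}
where $c_1 = c_{11}^{*2}$ and $c_2 = c_{11}^{*}c_{21}^{*}$ are the corresponding elements of $C^{*}C^{*\prime}$.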
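• For the general relation between the models, the Kronecker product $\otimes$ turns out to be useful.
• For an $m \times n$ matrix $A$ and a $p \times q$ matrix $B$, this is defined as the $mp \times nq$ matrix
$$A \otimes B = \begin{bmatrix}
a_{11}B & a_{12}B & \cdots & a_{1n}B \\
a_{21}B & a_{22}B & \cdots & a_{2n}B \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1}B & a_{m2}B & \cdots & a_{mn}B
\end{bmatrix}.$$
• Important rule in time series analysis:
$$\mathrm{vec}(ABC) = (C' \otimes A)\,\mathrm{vec}(B).$$
• Then (31) can be written as
$$\mathrm{vec}(H_t) = \mathrm{vec}(C^{*}C^{*\prime}) + (A^{*} \otimes A^{*})\,\mathrm{vec}(\varepsilon_{t-1}\varepsilon_{t-1}') + (B^{*} \otimes B^{*})\,\mathrm{vec}(H_{t-1}). \tag{32}$$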
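The vec rule and the resulting representation (32) can be verified numerically with `numpy.kron`. The parameter matrices below are randomly generated or made up for the check and have no empirical meaning.

```python
import numpy as np

def vec(M):
    """Stack the columns of M (column-major order)."""
    return np.asarray(M).flatten(order="F")

rng = np.random.default_rng(2)

# General rule vec(XYZ) = (Z' kron X) vec(Y), with arbitrary conformable matrices.
X, Y, Z = rng.normal(size=(2, 3)), rng.normal(size=(3, 4)), rng.normal(size=(4, 2))
lhs_rule = vec(X @ Y @ Z)
rhs_rule = np.kron(Z.T, X) @ vec(Y)

# One BEKK step (31) versus its vec form (32); all values are hypothetical.
C_star = np.array([[0.5, 0.0], [0.2, 0.4]])
A_star = rng.normal(scale=0.3, size=(2, 2))
B_star = rng.normal(scale=0.6, size=(2, 2))
eps_prev = rng.normal(size=2)
H_prev = np.array([[4.0, 1.5], [1.5, 6.0]])

outer = np.outer(eps_prev, eps_prev)
H_bekk = C_star @ C_star.T + A_star @ outer @ A_star.T + B_star @ H_prev @ B_star.T
vec_H = (vec(C_star @ C_star.T)
         + np.kron(A_star, A_star) @ vec(outer)
         + np.kron(B_star, B_star) @ vec(H_prev))
print(H_bekk, vec_H.reshape(2, 2, order="F"), sep="\n")
```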
• Representation (32) directly leads to stationarity conditions and covariance matrix forecasts for the BEKK model.
• In practice, the diagonal BEKK model is sometimes used to further reduce the number of parameters to be estimated, where the parameter matrices $A^{*}$ and $B^{*}$ are diagonal.
Factor Models
• Basic idea: Co–movements of returns are driven by a small number of (observable or unobservable) common underlying variables, which are called factors.
• For example, as an observable factor, the return of a market index may be used as a proxy for the general tendency of the stock market.
• Consider the simplest case of just a single observable factor.
• Think of this as the market return, denoted by $r_{Mt}$.
• In portfolio analysis, where factor models are often used to structure covariance matrices, the model is also known as the single index model (SIM).
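• The return of asset $i$, $i = 1, \ldots, N$, is described by
$$r_{it} = \alpha_i + \beta_i r_{Mt} + \varepsilon_{it}, \quad i = 1, \ldots, N; \tag{33}$$
$$E(\varepsilon_{it}) = 0, \quad \mathrm{Var}_{t-1}(\varepsilon_{it}) = \sigma_{\varepsilon_i}^2, \quad i = 1, \ldots, N; \tag{34}$$
$$\mathrm{Cov}_{t-1}(\varepsilon_{it}, \varepsilon_{jt}) = 0, \quad i \neq j. \tag{35}$$
• Expected return and variance of the market return will be denoted by $E_{t-1}(r_{Mt}) = \mu_{Mt}$ and $\mathrm{Var}_{t-1}(r_{Mt}) = \sigma_{Mt}^2$, and we assume
$$\mathrm{Cov}_{t-1}(r_{Mt}, \varepsilon_{it}) = 0, \quad i = 1, \ldots, N. \tag{36}$$
• This structure implies that
$$E_{t-1}(r_{it}) = \alpha_i + \beta_i\mu_{Mt}, \quad i = 1, \ldots, N, \tag{37}$$
$$\mathrm{Var}_{t-1}(r_{it}) = \beta_i^2\sigma_{Mt}^2 + \sigma_{\varepsilon_i}^2, \quad i = 1, \ldots, N, \tag{38}$$
$$\mathrm{Cov}_{t-1}(r_{it}, r_{jt}) = \beta_i\beta_j\sigma_{Mt}^2, \quad i, j = 1, \ldots, N, \; i \neq j. \tag{39}$$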
• For the covariance structure of the returns, given by (39), Assumption (35) is crucial, as it implies that the only reason for asset $i$ and asset $j$ moving together is their joint dependence on the market return $r_{Mt}$.
• The first part of (38) is also often referred to as the systematic risk (which is related to the general tendency of the market), whereas the second part is the unsystematic (idiosyncratic, specific) risk, which is not related to market factors.
• The conditional variance of the market factor can be modeled by means of a univariate (asymmetric) (E)GARCH model, e.g.,
$$\sigma_{Mt}^2 = c + a\varepsilon_{M,t-1}^2 + b\sigma_{M,t-1}^2, \tag{40}$$
where
$$\varepsilon_{Mt} = r_{Mt} - \mu_{Mt}. \tag{41}$$
• Equation (38) implies that the GARCH effects in the market will be transferred to all the assets’ variances.
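• Defining
$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix}, \quad
\Sigma_\varepsilon = \begin{bmatrix}
\sigma_{\varepsilon_1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\varepsilon_2}^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{\varepsilon_N}^2
\end{bmatrix},$$
the conditional covariance matrix of the $N$–dimensional $r_t = [r_{1t}, r_{2t}, \ldots, r_{Nt}]'$ can be written as
$$\mathrm{Cov}_{t-1}(r_t) = \begin{bmatrix}
\beta_1^2\sigma_{Mt}^2 + \sigma_{\varepsilon_1}^2 & \beta_1\beta_2\sigma_{Mt}^2 & \cdots & \beta_1\beta_N\sigma_{Mt}^2 \\
\beta_1\beta_2\sigma_{Mt}^2 & \beta_2^2\sigma_{Mt}^2 + \sigma_{\varepsilon_2}^2 & \cdots & \beta_2\beta_N\sigma_{Mt}^2 \\
\vdots & \vdots & \ddots & \vdots \\
\beta_1\beta_N\sigma_{Mt}^2 & \beta_2\beta_N\sigma_{Mt}^2 & \cdots & \beta_N^2\sigma_{Mt}^2 + \sigma_{\varepsilon_N}^2
\end{bmatrix} = \beta\beta'\sigma_{Mt}^2 + \Sigma_\varepsilon.$$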
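The single index covariance matrix is cheap to build once the betas, the specific variances, and the market variance are available. The sketch below assumes these quantities are given (all numbers are hypothetical) and updates the market variance with the GARCH(1,1) recursion (40).

```python
import numpy as np

def market_variance_update(sigma2_prev, eps_prev, c, a, b):
    """GARCH(1,1) update (40) for the conditional market variance."""
    return c + a * eps_prev ** 2 + b * sigma2_prev

def sim_covariance(beta, sigma2_eps, sigma2_M):
    """Conditional covariance matrix beta beta' sigma2_M + diag(sigma2_eps)."""
    beta = np.asarray(beta, dtype=float)
    return np.outer(beta, beta) * sigma2_M + np.diag(sigma2_eps)

# Hypothetical inputs (illustration only).
beta = [0.8, 1.0, 1.3]
sigma2_eps = [2.0, 1.5, 3.0]
sigma2_M = market_variance_update(sigma2_prev=4.0, eps_prev=-2.0, c=0.2, a=0.05, b=0.9)

H = sim_covariance(beta, sigma2_eps, sigma2_M)
print(sigma2_M, H, sep="\n")
```

Only N betas, N specific variances, and the parameters of the market variance equation are needed, regardless of how many pairwise covariances the matrix contains.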
Modeling Conditional Correlations
• The models considered so far specified models for the conditional covariances, in addition to the variances.
• Another approach is to model the variances and the conditional correlations.
• One advantage is that conditional variances (or standard deviations) and conditional correlations can be modeled separately, which often allows for consistent two–step model estimation, thus reducing the computational burden.
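• For these models, we write $H_t$ as
$$H_t = D_t R_t D_t \tag{42}$$
$$D_t = \begin{bmatrix}
\sqrt{h_{1t}^2} & 0 & \cdots & 0 \\
0 & \sqrt{h_{2t}^2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{h_{Nt}^2}
\end{bmatrix}, \tag{43}$$
i.e., $D_t$ is a diagonal matrix with the conditional standard deviations on its main diagonal, and
$$R_t = \begin{bmatrix}
1 & \rho_{12,t} & \cdots & \rho_{1N,t} \\
\rho_{12,t} & 1 & \cdots & \rho_{2N,t} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{1N,t} & \rho_{2N,t} & \cdots & 1
\end{bmatrix} \tag{44}$$
is the conditional correlation matrix, i.e.,
$$\rho_{ij,t} = \mathrm{Corr}_{t-1}(\varepsilon_{it}, \varepsilon_{jt}), \quad i, j = 1, \ldots, N, \; i \neq j,$$
is the conditional correlation between assets $i$ and $j$.
• The conditional covariances are
$$h_{ij,t} = \rho_{ij,t}\sqrt{h_{it}^2 h_{jt}^2}, \quad i \neq j.$$
• Positive definiteness of $H_t$ follows from that of $R_t$ and the positivity of the conditional standard deviations in $D_t$.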
Constant Conditional Correlations (CCC)
• One of the first multivariate GARCH models (Bollerslev, 1990).¹²
• In this model $R_t = R$ is constant in (42), i.e., the conditional correlations are constant.
• We may write this as
$$\varepsilon_t = D_t z_t, \tag{45}$$
where $\{(z_{1t}, \ldots, z_{Nt})'\}$ is an iid series of (e.g., normally distributed) innovations with mean zero and covariance matrix $R$, i.e.,
$$z_t \sim N(0, R). \tag{46}$$
• Until ten years ago or so, it was also the most popular multivariate GARCH model because it can easily be estimated even for high–dimensional time series.
¹² Modelling the coherence in short–run nominal exchange rates: a multivariate generalized ARCH model, Review of Economics and Statistics, 73, 498–505.
• Note that $R$ is the constant conditional correlation matrix (i.e., the correlation matrix of the innovations), not the unconditional correlation matrix.
• Consistent two–step estimation is feasible even for high–dimensional time series:
• First estimate univariate GARCH models for each series.
• This allows for flexible specification of the univariate processes. For example, we may specify a standard GARCH for one component, AGARCH or EGARCH for another...
• Calculate the standardized residuals,
$$z_{it} = \frac{\varepsilon_{it}}{\sqrt{h_{it}^2}}, \quad i = 1, \ldots, N, \; t = 1, \ldots, T. \tag{47}$$
• Then, in view of (45) and (46), estimate $R$ as the correlation matrix of the standardized residuals.
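The second step of the CCC procedure can be sketched as follows. To keep the example self-contained, the univariate GARCH(1,1) parameters are taken as given (in practice they would come from a first–step maximum likelihood fit, which is not shown), and the data are simulated from a CCC model with a hypothetical correlation of 0.6.

```python
import numpy as np

def garch_variances(eps, omega, alpha, beta):
    """Filter h2_t = omega + alpha * eps_{t-1}^2 + beta * h2_{t-1} for one series."""
    h2 = np.empty(len(eps))
    h2[0] = omega / (1.0 - alpha - beta)       # start at the unconditional variance
    for t in range(1, len(eps)):
        h2[t] = omega + alpha * eps[t - 1] ** 2 + beta * h2[t - 1]
    return h2

def ccc_correlation(eps, params):
    """Step two: correlation matrix of the standardized residuals (47)."""
    h2 = np.column_stack([garch_variances(eps[:, i], *params[i])
                          for i in range(eps.shape[1])])
    z = eps / np.sqrt(h2)
    return np.corrcoef(z, rowvar=False), h2

# Simulate a bivariate CCC-GARCH process (all values hypothetical).
rng = np.random.default_rng(3)
T = 5000
params = [(0.10, 0.05, 0.90), (0.20, 0.08, 0.85)]   # (omega, alpha, beta) per series
R_true = np.array([[1.0, 0.6], [0.6, 1.0]])
z = rng.standard_normal((T, 2)) @ np.linalg.cholesky(R_true).T
eps = np.empty((T, 2))
h2 = np.array([om / (1.0 - al - be) for om, al, be in params])
for t in range(T):
    eps[t] = np.sqrt(h2) * z[t]
    h2 = np.array([om + al * eps[t, i] ** 2 + be * h2[i]
                   for i, (om, al, be) in enumerate(params)])

R_hat, h2_hat = ccc_correlation(eps, params)
D_last = np.diag(np.sqrt(h2_hat[-1]))
H_last = D_last @ R_hat @ D_last                   # H_t = D_t R D_t for the last date
print(R_hat, H_last, sep="\n")
```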
• The CCC has been extended to allow for volatility spillovers by specifying a multivariate GARCH structure for the conditional volatilities of the individual series.¹³
• For the bivariate case, this is
$$\begin{bmatrix} h_{1t}^2 \\ h_{2t}^2 \end{bmatrix}
= \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
+ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} \varepsilon_{1,t-1}^2 \\ \varepsilon_{2,t-1}^2 \end{bmatrix}
+ \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\begin{bmatrix} h_{1,t-1}^2 \\ h_{2,t-1}^2 \end{bmatrix}.$$
• This allows past squared returns and variances of all series to enter the individual conditional variance equations.
• This clearly requires simultaneous estimation of the conditional variance parameters in the first step.
¹³ T. Jeantheau (1998): Strong consistency of estimators for multivariate ARCH models, Econometric Theory, 14, 70–86; C. He and T. Teräsvirta (2004): An extended constant conditional correlation GARCH model and its fourth–moment structure, Econometric Theory, 20, 904–926.
Dynamic Conditional Correlation (DCC) Models
• The two–step estimation procedure makes application of the CCC to high–dimensional systems feasible, but more often than not the hypothesis of constant conditional correlations is rejected.
• For example, it is often observed that correlations between financial time series increase in turbulent periods, and are very high in crash situations.
• Thus models for dynamic conditional correlations (DCC) have been proposed.
• As an example, consider the model proposed by Engle (2002).¹⁴
¹⁴ Dynamic conditional correlation: a simple class of multivariate GARCH model, Journal of Business and Economic Statistics, 20, 339–350. A similar model was suggested by Y. K. Tse and A. K. C. Tsui (2002): A multivariate GARCH model with time–varying correlations, Journal of Business and Economic Statistics, 20, 351–362.
• In its simplest (scalar) form, this can be written as
$$\varepsilon_t \sim N(0, D_t R_t D_t), \tag{48}$$
$$D_t \sim \text{GARCH}, \tag{49}$$
$$z_t = D_t^{-1}\varepsilon_t \quad \text{(produces the standardized residuals (47))},$$
$$Q_t = (1 - a - b)S + az_{t-1}z_{t-1}' + bQ_{t-1}, \quad a, b \geq 0, \; a + b < 1, \tag{50}$$
$$R_t = \{\mathrm{diag}(Q_t)\}^{-1/2}\, Q_t\, \{\mathrm{diag}(Q_t)\}^{-1/2}. \tag{51}$$
• In (50), $S$ is the unconditional correlation matrix of the standardized residuals $z_t$.
• If the starting value for $Q_t$ in (50) is positive definite, then $Q_t$ is positive definite, but will not in general be a valid correlation matrix (i.e., with ones on the diagonal).
• Thus, the rescaling in (51) is necessary.
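The correlation recursion (50) and the rescaling (51) can be sketched in a few lines. The standardized residuals are simulated here, the values a = 0.05 and b = 0.90 are hypothetical, and starting the recursion at Q = S is one common choice assumed for the illustration rather than something prescribed above.

```python
import numpy as np

def dcc_correlations(z, a, b):
    """Return R_1, ..., R_T from the scalar DCC recursion (50) with rescaling (51)."""
    T, N = z.shape
    S = np.corrcoef(z, rowvar=False)      # sample correlation of standardized residuals
    Q = S.copy()                          # assumed starting value (positive definite)
    R = np.empty((T, N, N))
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)         # diag(Q)^{-1/2} Q diag(Q)^{-1/2}
        Q = (1.0 - a - b) * S + a * np.outer(z[t], z[t]) + b * Q
    return R

rng = np.random.default_rng(5)
z = rng.standard_normal((500, 3))         # simulated standardized residuals
R = dcc_correlations(z, a=0.05, b=0.90)

diag_ok = np.allclose(np.einsum("tii->ti", R), 1.0)
min_eig = np.linalg.eigvalsh(R).min()
print(diag_ok, min_eig)
```

`R[t]` depends only on residuals up to `t - 1`, so each matrix is a valid conditional correlation matrix for the following observation.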
• Two–step estimation is still feasible, which facilitates the application to time series of higher dimension.
• More general GARCH–like structures than the scalar model in (50) can be employed.
• An alternative multivariate GARCH model for dynamic conditional correlations was suggested by Pelletier (2006), i.e., a regime-switching model for dynamic correlations.¹⁵
• In Pelletier’s (2006) model, the conditional correlation matrix is subject to Markovian regime–switching and is given by
$$R(\Delta_t) = [\rho_{ij}(\Delta_t)], \quad i, j = 1, \ldots, N, \tag{52}$$
where $\{\Delta_t\}$ is a Markov chain with finite state space $S = \{1, \ldots, k\}$ and irreducible and aperiodic (primitive) transition matrix
$$P = \begin{bmatrix} p_{11} & \cdots & p_{k1} \\ \vdots & \cdots & \vdots \\ p_{1k} & \cdots & p_{kk} \end{bmatrix}, \tag{53}$$
where $p_{ij} = p(\Delta_t = j \mid \Delta_{t-1} = i)$, $i, j = 1, \ldots, k$.
¹⁵ D. Pelletier (2006): Regime–switching for Dynamic Correlations, Journal of Econometrics, 131, 445–473.
• In principle, any volatility model can be employed for generating the volatility dynamics of the individual assets. However, Pelletier assumes that an absolute value GARCH(1,1) (AVGARCH) process is appropriate, i.e.,
\begin{align}
\varepsilon_{i,t} &= z_{it}h_{it}, \tag{54} \\
h_{it} &= \omega_i + \alpha_i|\varepsilon_{i,t-1}| + \beta_i h_{i,t-1} \tag{55} \\
&= \omega_i + \alpha_i|z_{i,t-1}|h_{i,t-1} + \beta_i h_{i,t-1} \tag{56} \\
&= \omega_i + (\alpha_i|z_{i,t-1}| + \beta_i)h_{i,t-1} \tag{57} \\
&= \omega_i + c_{i,t-1}h_{i,t-1}, \tag{58}
\end{align}
$$\omega_i > 0, \quad \alpha_i, \beta_i \geq 0, \quad i = 1, \ldots, N,$$
where
$$c_{it} = \alpha_i|z_{i,t}| + \beta_i,$$
and where the volatility dynamics are specified in terms of the conditional standard deviation.
• The reason for doing so is that this allows for closed–form covariance matrix forecasts.
• This is not the case for the constant conditional correlation model where the volatility dynamics are specified in terms of the conditional variance.
• To see this, consider the standard model with a single regime only.
• The time–$t$ forecast of the conditional covariance between asset 1 and asset 2 at time $t + d$ is given by
\begin{align*}
\mathrm{Cov}_t(\varepsilon_{1,t+d}, \varepsilon_{2,t+d}) &= E_t(\varepsilon_{1,t+d}\varepsilon_{2,t+d}) = E_t(z_{1,t+d}z_{2,t+d}h_{1,t+d}h_{2,t+d}) \\
&= \underbrace{E(z_{1,t+d}z_{2,t+d})}_{=\rho_{12}} E_t(h_{1,t+d}h_{2,t+d}) \\
&= \rho_{12}E_t(h_{1,t+d}h_{2,t+d}).
\end{align*}
• Now we can substitute for $h_{1,t+d}$ and $h_{2,t+d}$ to obtain
\begin{align*}
h_{1,t+d}h_{2,t+d} &= (\omega_1 + c_{1,t+d-1}h_{1,t+d-1})(\omega_2 + c_{2,t+d-1}h_{2,t+d-1}) \\
&= \omega_1\omega_2 + \omega_1 c_{2,t+d-1}h_{2,t+d-1} + \omega_2 c_{1,t+d-1}h_{1,t+d-1} \\
&\quad + c_{1,t+d-1}c_{2,t+d-1}h_{1,t+d-1}h_{2,t+d-1}.
\end{align*}
• Defining
$$s_t = \omega_1\omega_2 + \omega_1 c_{2,t}h_{2,t} + \omega_2 c_{1,t}h_{1,t}, \quad c_{12,t} = c_{1t}c_{2t},$$
this can be written
$$h_{1,t+d}h_{2,t+d} = s_{t+d-1} + c_{12,t+d-1}h_{1,t+d-1}h_{2,t+d-1},$$
and taking conditional expectations gives
$$E_t(h_{1,t+d}h_{2,t+d}) = E_t(s_{t+d-1}) + c_{12}E_t(h_{1,t+d-1}h_{2,t+d-1}), \tag{59}$$
where $c_{12} = E(c_{1t}c_{2t})$.
• $E_t(s_{t+d-1})$ in (59) can be calculated via the GARCH(1,1) volatility forecast formula
$$E_t(h_{i,t+\tau}) = \omega_i\frac{1 - c_i^{\tau-1}}{1 - c_i} + c_i^{\tau-1}h_{i,t+1} = E(h_{it}) + c_i^{\tau-1}(h_{i,t+1} - E(h_{it})), \quad \tau \geq 1,$$
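where $E(h_{it}) = \omega_i(1 - c_i)^{-1}$, $c_i = E(c_{it}) = \alpha_i\kappa_1 + \beta_i$, $i = 1, 2$, and
$$\kappa_1 = E(|z_{it}|) = \begin{cases}
\sqrt{2/\pi} & \text{if } z_{it} \sim N(0, 1), \\[4pt]
\dfrac{\sqrt{\nu - 2}\,\Gamma\left(\frac{\nu-1}{2}\right)}{\sqrt{\pi}\,\Gamma(\nu/2)} & \text{if } z_{it} \sim t_\nu,
\end{cases} \tag{60}$$
where $t_\nu$ denotes Student’s $t$ distribution with $\nu > 2$ degrees of freedom and unit variance.
• Furthermore
$$c_{12} = E(c_{1t}c_{2t}) = \alpha_1\alpha_2 E(|z_{1t}z_{2t}|) + \kappa_1(\alpha_1\beta_2 + \alpha_2\beta_1) + \beta_1\beta_2,$$
where¹⁶
$$E(|z_{1t}z_{2t}|) = \frac{2}{\pi}\left(\rho_{12}\arcsin\rho_{12} + \sqrt{1 - \rho_{12}^2}\right)$$
both for normal as well as Student’s $t$ innovations.¹⁷
¹⁶ S. Nabeya (1951): Absolute Moments in 2–dimensional Normal Distribution, Annals of the Institute of Statistical Mathematics, 3, 2–6.
¹⁷ Provided the Student distribution has been standardized to have unit variance.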
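The moment formulas above are straightforward to evaluate. The sketch below implements κ₁ for both distributions, the absolute cross moment, and c₁₂, and runs two sanity checks: at ρ₁₂ = 0 the cross moment equals κ₁² (independence under normality), and at ρ₁₂ = 1 it equals E(z²) = 1. The parameter values plugged in at the end are the Gaussian single–regime AVGARCH estimates reported in Table 3; the function names are illustrative.

```python
import math

def kappa1_normal():
    """E|z| for z ~ N(0, 1)."""
    return math.sqrt(2.0 / math.pi)

def kappa1_student(nu):
    """E|z| for a unit-variance Student's t with nu > 2 degrees of freedom, as in (60)."""
    log_ratio = math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt(nu - 2.0) * math.exp(log_ratio) / math.sqrt(math.pi)

def abs_cross_moment(rho):
    """E|z1 z2| = (2/pi) * (rho * arcsin(rho) + sqrt(1 - rho^2))."""
    return (2.0 / math.pi) * (rho * math.asin(rho) + math.sqrt(1.0 - rho ** 2))

def c12(alpha1, beta1, alpha2, beta2, rho, kappa1):
    """E(c_1t c_2t) with c_it = alpha_i |z_it| + beta_i."""
    return (alpha1 * alpha2 * abs_cross_moment(rho)
            + kappa1 * (alpha1 * beta2 + alpha2 * beta1)
            + beta1 * beta2)

k_norm = kappa1_normal()
k_t = kappa1_student(7.257)
c12_gauss = c12(0.094, 0.901, 0.088, 0.896, rho=0.747, kappa1=k_norm)
print(k_norm, k_t, c12_gauss)
```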
• Now we can iterate (59) to obtain
\begin{align*}
E_t(h_{1,t+d}h_{2,t+d}) &= E_t(s_{t+d-1}) + c_{12}E_t(h_{1,t+d-1}h_{2,t+d-1}) \\
&= E_t(s_{t+d-1}) + c_{12}E_t(s_{t+d-2}) + c_{12}^2E_t(h_{1,t+d-2}h_{2,t+d-2}) \\
&\;\;\vdots \\
&= \sum_{\ell=0}^{d-2}c_{12}^{\ell}E_t(s_{t+d-\ell-1}) + c_{12}^{d-1}h_{1,t+1}h_{2,t+1}.
\end{align*}
• On the other hand, using a GARCH model in the variance and the squared returns, calculation of covariance matrix forecasts would require the evaluation of
$$\mathrm{Cov}_t(\varepsilon_{1,t+d}, \varepsilon_{2,t+d}) = \rho_{12}E_t(h_{1,t+d}h_{2,t+d}),$$
where
$$h_{i,t+d} = \sqrt{\omega_i + (\alpha_i z_{i,t+d-1}^2 + \beta_i)h_{i,t+d-1}^2}, \quad i = 1, 2,$$
which does not allow a closed–form solution.
• To illustrate the application of the model, we consider dynamic correlations between global stock market and real estate equity returns, using dollar–denominated weekly (Wednesday–to–Wednesday) returns of the MSCI world and the FTSE EPRA/NAREIT global indices over the period from January 1990 to May 2009 ($T = 1012$ observations).
• Continuously compounded percentage returns are considered, i.e.,
$$r_{it} = 100 \times \log(I_{it}/I_{i,t-1}), \quad i = 1, 2,$$
where $I_{1t}$ and $I_{2t}$ are the MSCI and EPRA/NAREIT index levels at time $t$, respectively.
[Figure: Index levels of the stock market and real estate indices, 1990–2010.]
[Figure: Two panels showing stock market returns and real estate returns, 1990–2010.]
[Figure: Two panels showing the ACF of absolute stock market returns and the ACF of absolute real estate returns, lags 0 to 50.]
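Table 1: Properties of stock market and real estate returns

                        covariance/
              mean      correlation matrix    skewness   kurtosis   JB
MSCI          0.051     4.806     0.768       −0.794     8.060      1186.1***
EPRA/NAREIT   0.019     4.242     6.346       −1.239     12.49      4053.3***

• In the covariance/correlation matrix, variances are on the diagonal, the covariance is below the diagonal, and the correlation is above the diagonal.
• JB is the Jarque–Bera test for normality.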
• Estimate models with one and two regimes, and with normal and Student’s $t$ innovations.
• First compare standard GARCH (i.e., modeling conditional variances) with absolute value GARCH (AVGARCH, i.e., modeling conditional standard deviations).
• Although models based on squared returns do slightly better than those using absolute values for Gaussian innovations, there are basically no differences for Student’s $t$ models.
• Clearly, as the latter lead to a much better fit than the former, results pertaining to them appear to be more informative.
• We also note that allowing for regime–dependent correlations in general substantially decreases the BIC, providing strong support for time–varying correlations.
Table 2: Likelihood–based goodness–of–fit

AVGARCH models
           Gaussian                Student’s t
           k = 1      k = 2        k = 1      k = 2
K          9          12           10         13
log L      −3877.9    −3843.6      −3832.5    −3806.0
BIC        7818.1     7770.1       7734.3     7702.0

GARCH models
           Gaussian                Student’s t
           k = 1      k = 2        k = 1      k = 2
K          9          12           10         13
log L      −3874.6    −3841.5      −3832.5    −3806.7
BIC        7811.6     7765.9       7734.2     7703.3

Reported are likelihood–based goodness–of–fit measures for various bivariate GARCH models fitted to the international stock and real estate equity markets. AVGARCH indicates Taylor’s (1986) absolute value GARCH process for the individual volatilities, as given by (55), whereas GARCH is the specification of Bollerslev (1986), where (55) is replaced by $h_{it}^2 = \omega_i + \alpha_i\varepsilon_{i,t-1}^2 + \beta_i h_{i,t-1}^2$, $i = 1, 2$. $K$ denotes the number of parameters of a model, $\log L$ is the value of the maximized log–likelihood, and BIC is the Bayesian information criterion, i.e., $\mathrm{BIC} = -2 \times \log L + K\log T$, where $T$ is the sample size.
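The BIC values in Table 2 can be reproduced from the reported log–likelihoods, parameter counts, and the sample size T = 1012, up to the rounding of log L in the table. A minimal check for the AVGARCH panel:

```python
from math import log

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 * log L + K * log T."""
    return -2.0 * log_lik + n_params * log(n_obs)

T = 1012
# (log L, K, BIC as reported) for the AVGARCH models in Table 2.
avgarch = {
    "Gaussian, k = 1": (-3877.9, 9, 7818.1),
    "Gaussian, k = 2": (-3843.6, 12, 7770.1),
    "Student's t, k = 1": (-3832.5, 10, 7734.3),
    "Student's t, k = 2": (-3806.0, 13, 7702.0),
}
recomputed = {name: bic(ll, k, T) for name, (ll, k, _) in avgarch.items()}
for name, value in recomputed.items():
    print(name, round(value, 1), avgarch[name][2])
```

Small discrepancies in the first decimal are to be expected because the log–likelihoods are only reported to one decimal place.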
Table 3: Parameter estimates for AVGARCH models

               Gaussian                             Student’s t
               k = 1            k = 2               k = 1            k = 2
ω1             0.054 (0.020)    0.058 (0.019)       0.043 (0.018)    0.041 (0.017)
α1             0.094 (0.016)    0.100 (0.016)       0.080 (0.016)    0.086 (0.016)
β1             0.901 (0.021)    0.893 (0.020)       0.918 (0.019)    0.914 (0.018)
ω2             0.085 (0.028)    0.054 (0.019)       0.057 (0.024)    0.036 (0.017)
α2             0.088 (0.015)    0.088 (0.013)       0.076 (0.015)    0.080 (0.014)
β2             0.896 (0.023)    0.910 (0.017)       0.916 (0.021)    0.924 (0.016)
ρ12(1)         0.747 (0.014)    0.654 (0.029)       0.745 (0.016)    0.669 (0.025)
ρ12(2)         −                0.913 (0.013)       −                0.894 (0.013)
ν              −                −                   7.257 (1.048)    7.286 (1.052)
π1,∞           1                0.649               1                0.645
(1 − p11)^−1   ∞                28.12               ∞                181.2
(1 − p22)^−1   −                15.21               −                99.67

Transition matrix P (equal to 1 for the models with k = 1):
Gaussian, k = 2:
  0.964 (0.019)   0.066 (0.029)
  0.036 (0.019)   0.934 (0.029)
Student’s t, k = 2:
  0.994 (0.004)   0.010 (0.009)
  0.006 (0.004)   0.990 (0.009)
• In the Gaussian regime–switching model, expected regime durations, $(1 - p_{jj})^{-1}$, $j = 1, 2$, are only 28 and 15 weeks for the low– and the high–correlation regime, respectively.
• They are approximately 3.5 years and two years in the Student’s $t$ model.
[Figure: Smoothed probability of regime 2 over time, 1990–2010; panels: Gaussian and Student’s t.]
[Figure: (Conditional) correlation against forecast horizon d; panels: Gaussian (d = 1 to 100) and Student’s t (d = 1 to 300); lines: π1t = 1, π2t = 1, and the unconditional correlation.]
• In view of these differences between regime–switching models based
on different distributions, we shall investigate their consequences for
out–of–sample forecasting.
• To this end, we first reestimate the models using the first 500 observations
and then update the estimates every four weeks, using an expanding
window of data.
• We use the estimates to construct ex–ante global minimum variance
portfolios (GMVP) for (cumulative) returns at forecast horizons D =1, 4, 8, 12, 16, 20, and 24.
• As pointed out by Ledoit et al. (2003), an advantage of using the GMVP
is that it allows us to refrain from specifying expected returns, “which is
more a task for the portfolio manager than a statistical problem”.
• In general, for a portfolio of $N$ assets with weight vector $w = [w_1, w_2, \ldots, w_N]'$, the portfolio mean and variance are given by
$$\mu_p = w'\mu, \quad \text{and} \quad \sigma_p^2 = w'Hw,$$
where $\mu$ and $H$ are the mean vector and covariance matrix, respectively, of the assets under study.
• A minimum variance (well diversified) portfolio is a portfolio that minimizes the variance for a prespecified level of expected portfolio return, $\mu_p^{*}$, i.e.,
$$\min \sigma_p^2 = w'Hw, \quad \text{subject to } \mu_p = w'\mu \geq \mu_p^{*}. \tag{61}$$
• Once we restrict our attention to efficient minimum variance portfolios,¹⁸ a higher expected return can only be achieved by accepting a higher variance, i.e., we trade off risk for expected return.
¹⁸ These have maximum expected return for a given variance.
• However, the optimal risk–return combination depends on the preferences of the investor, and it involves expected returns, which are much harder to estimate statistically than volatilities and covariances.
• Thus, if covariance matrix forecasts are of interest, one often concentrates on the GMVP, which is the portfolio that solves (61) without any restrictions on the portfolio mean return.
• For two assets, with $w$ being the weight of the first asset, portfolio variance $h_p^2$ is
\begin{align}
h_p^2 &= w^2h_1^2 + (1 - w)^2h_2^2 + 2w(1 - w)h_{12} \tag{62} \\
&= w^2(h_1^2 + h_2^2 - 2h_{12}) - 2w(h_2^2 - h_{12}) + h_2^2. \notag
\end{align}
• The GMVP is obtained by minimizing (62), i.e.,
$$w_{\mathrm{GMVP}} = \frac{h_2^2 - h_{12}}{h_1^2 + h_2^2 - 2h_{12}}.$$
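As a small worked example, the sketch below evaluates the two–asset GMVP weight using the unconditional sample variances and covariance from Table 1 (MSCI as asset 1, EPRA/NAREIT as asset 2) in place of a conditional forecast, and checks that nearby weights give a larger portfolio variance. The function names are illustrative.

```python
def portfolio_variance(w, h1_sq, h2_sq, h12):
    """Two-asset portfolio variance (62) with weight w on the first asset."""
    return w ** 2 * h1_sq + (1.0 - w) ** 2 * h2_sq + 2.0 * w * (1.0 - w) * h12

def gmvp_weight(h1_sq, h2_sq, h12):
    """Weight of the first asset in the global minimum variance portfolio."""
    return (h2_sq - h12) / (h1_sq + h2_sq - 2.0 * h12)

# Sample moments from Table 1, used here only as illustrative inputs.
h1_sq, h2_sq, h12 = 4.806, 6.346, 4.242

w_gmvp = gmvp_weight(h1_sq, h2_sq, h12)
v_gmvp = portfolio_variance(w_gmvp, h1_sq, h2_sq, h12)
print(round(w_gmvp, 4), round(v_gmvp, 4))
```

In a forecasting exercise, the inputs would be replaced by the model–implied multi–period variance and covariance forecasts at each forecast origin.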
• For the Gaussian CCC, we report the standard deviation of the realized
returns, whereas for the other models their respective standard deviation
divided by that of the Gaussian CCC is indicated.
• Compared to the latter, the improvements from switching to a Student’s
t distribution are generally small.
• Those from using a regime–switching model for the correlations are
larger, although still moderate for shorter forecast horizons.
• However, as D becomes larger, the relative performance of the regime–
switching approach improves considerably, but only for the Student’s t
model.
• This is very likely due to the fact that, at longer forecast horizons, the
higher persistence of the correlation regimes implied by the Student’s
t model becomes effective, whereas the conditional correlation of the
Gaussian model rapidly converges to its unconditional value.
Table 4: Properties of realized global minimum variance portfolio (GMVP) returns

D                               1       4       8       12      16      20      24
Gaussian CCC                    2.314   4.994   7.660   9.222   10.89   12.96   15.03
Student’s t CCC                 0.997   0.994   0.989   0.992   0.991   0.987   0.985
Gaussian regime–switching       0.978   0.970   0.962   0.970   0.970   0.963   0.965
Student’s t regime–switching    0.968   0.954   0.929   0.925   0.925   0.919   0.917

Shown are the results from calculating ex–ante global minimum variance portfolios (GMVP) implied by different GARCH models and for different forecast horizons. For the Gaussian CCC, we show, for each forecast horizon, $D$, the standard deviation of realized returns resulting from ex–ante GMVP portfolio weights. For the other models, we report their respective standard deviation divided by that of the Gaussian CCC. The first row of the table specifies the forecast horizon, $D$. The calculations refer to cumulative returns, i.e., if $r_{t+d}$ is the single–period return vector at time $t + d$, then the $D$–period ahead cumulative return vector at forecast origin $t$ is $\sum_{d=1}^{D} r_{t+d}$, and the multi–period covariance matrix expectations are calculated accordingly.