Download - Ch8_ Slides 21 July 2010
-
8/6/2019 Ch8_ Slides 21 July 2010
1/64
1
Chapter 8 (Brooks)
Modelling volatility and correlation
-
8/6/2019 Ch8_ Slides 21 July 2010
2/64
2
Stylized Facts of Financial Data
The linear structural and time series models assume constant error variance. This
assumption is not realistic in financial time series data especially at high frequency. Some
of the stylized facts of financial returns data are as follows:
- Fat Tails and Leptokurtosis: Distribution of financial data has tails thicker than thenormal distribution. Tails decrease slower than . Kurtosis of financial data are almost
always greater than 3 (correspond to normal). Ignoring this may lead to under estimation
of tail risk whichhas obvious consequence for asset allocation and risk management.
- Volatility clustering: Tendency for volatility in financial markets to appear in bunches.
Thus large returns (of either sign) are expected to follow large returns and small returns
(of either sign) to follow small returns. This is due to the information arrival process.Information (news) arrive in bunches.
- Asymmetric volatility (leverage effects): Tendency of volatility to rise more following
a large price fall than following a large price rise of the same magnitude. A price fall
results in increase in firms debt to equity ratio (i.e. leverage). Shareholder's residual claim
is at larger risk.
2x
e
-
8/6/2019 Ch8_ Slides 21 July 2010
3/64
3
A Sample Financial Asset Returns Time Series
Daily S&P 500 Returns for January 1990 December 1999
-0.08
-0.06
-0.04
-0.02
0.00
0.02
0.04
0.06
1/01/90 11/01/93 9/01/97
Return
Date
-
8/6/2019 Ch8_ Slides 21 July 2010
4/64
4
KSE-100 index returns
KSE-100 Daily Returns (Jan 1990-Dec 1999)
-10
-8
-6-4
-2
02
46
810
Day
-
8/6/2019 Ch8_ Slides 21 July 2010
5/64
5
KSE-100 index returns
0
200
400
600
800
1,000
1,200
-10 -5 0 5 10
Series: KSE index returnsSample Jan 1990 July 1999Observations 2609
Mean 0.032780Median 0.000000Maximum 12.76223Minimum -13.21425Std. Dev. 1.545728Skewness -0.265931Kurtosis 12.29654
Jarque-Bera 9425.946Probability 0.000000
-
8/6/2019 Ch8_ Slides 21 July 2010
6/64
6
Are KSE returns different fr II
n r l returns
Simulated iid returns with same mean and standard deviation as
KSE data for Jan 1990-Dec 1999
-6
-4
-2
0
2
4
6
8
Day
IIDR
etrun
-
8/6/2019 Ch8_ Slides 21 July 2010
7/64
7
KSE-100 index returns
KSE-100 index returns ( n 1 000- une 0 00 )
-10
-
--
-
0
10
-
8/6/2019 Ch8_ Slides 21 July 2010
8/64
8
KSE-100 index returns
0
100
200
300
400
500
600
-8 -6 -4 -2 0 2 4 6 8
Series: KSE index returnsSample Jan 2000 July 2008Observations 2217
Mean 0.097446Median 0.095419Maximum 8.507070Minimum -8.662332Std. Dev. 1.553112Skewness -0.262416Kurtosis 6.805081
Jarque-Bera 1362.909Probabil ity 0.000000
-
8/6/2019 Ch8_ Slides 21 July 2010
9/64
9
Dail Exchange rate ak Rs/US$
changes April 00 t ul 00
Daily Exchange Rate log change
-3
-2
-1
0
1
2
3
day
(%e
xchangeratechange)
-
8/6/2019 Ch8_ Slides 21 July 2010
10/64
10
Dail Exchange rate ak Rs/US$
changes April 00 t ul 00
0
00
00
300
400
0
Series: Dail Exchange RateSample april 00 jul 00Observati ns 5
Mean 0.0 07Me ian 0.000000Maximum .40 344Minimum .90 03St . Dev. 0.3 5Skewness 0. 5055
urt sis 0. 9349
arque Bera 7 5. 3r babilit 0.000000
-
8/6/2019 Ch8_ Slides 21 July 2010
11/64
11
Monthl inflation rate C I-1 ajor
cities %log change
Inflation Rate
-1.5
-1-0.5
0
0.5
1
1.5
2
2.53
M t
M
t
If
t
t
-
8/6/2019 Ch8_ Slides 21 July 2010
12/64
12
Monthl inflation rate C I-1 ajor
cities %log change
4
6
8
1
1
14
- .5 - . .5 1. 1.5 .
eries: INFLATIONample 1995M01 006M06
Observations 1 6
Mean 0.51 396Me ian 0.471 74Maximum .366600Minimum -0.851813
t . Dev. 0.599854kewness 0.381571urtosis 3. 556 8
Jarque-Bera 3.400593robabilit 0.18 6 9
-
8/6/2019 Ch8_ Slides 21 July 2010
13/64
13
Non-linear Models: A Definition
Campbell, Lo and MacKinlay (1997) define a non-linear data generatingprocess as one that can be written
yt=f(ut, ut-1, ut-2, )
where utis an iid error term andfis a non-linear function.
They also give a slightly more specific definition as
yt=g(ut-1, ut-2, )+ utW2(ut-1, ut-2, )wheregis a function of past error terms only and W2 is a variance term.
Models with nonlinear g() are non-linear in mean, while those withnonlinearW2() are non-linear in variance. Our traditional structural modelcould be something like:
yt=F1 +F2x2t+ ... +Fkxkt+ ut, or more compactly y = XF + u. We also assumed utb N(0,W2).
-
8/6/2019 Ch8_ Slides 21 July 2010
14/64
14
Heteroscedasticity Revisited
An example of a structural model iswith utb N(0, ).
The assumption that the variance of the errors is constant is known as
homoskedasticity, i.e. Var (ut) = .
What if the variance of the errors is not constant?
- heteroskedasticity
- would imply that standard error estimates could be wrong.
Is the variance of the errors likely to be constant over time? Not for financial data.
Wu2
Wu2
t F1 +F2x2t+F3x3t+F4x4t+ u t
-
8/6/2019 Ch8_ Slides 21 July 2010
15/64
15
Autoregressive Conditionally Heteroscedastic
(ARCH
) Models
So use a model which does not assume that the variance is constant.
Recall the definition of the variance ofut:
= Var(ut ut-1, ut-2,...) = E[(ut-E(ut))2 ut-1, ut-2,...]W
e usually assume that E(ut) = 0
so = Var(ut ut-1, ut-2,...) = E[ut2 ut-1, ut-2,...].
What could the current value of the variance of the errors plausiblydepend upon?
Previous squared error terms.
This leads to the autoregressive conditionally heteroscedastic modelfor the variance of the errors:
= E0 + E1 This is known as an ARCH(1) model.
Wt2
Wt2
Wt2
ut12
-
8/6/2019 Ch8_ Slides 21 July 2010
16/64
16
Autoregressive Conditionally Heteroscedastic
(ARCH) Models (contd)
The full model would be
yt= F1 + F2x2t+ ... + Fkxkt+ ut, utb N(0, )where = E0 + E1
We can easily extend t
his to t
he general case w
here t
he error variancedepends on q lags of squared errors:
= E0 + E1 +E2 +...+Eq This is an ARCH(q) model.
Instead of calling the variance , in the literature it is usually called ht
,so the model is
yt= F1 + F2x2t+ ... + Fkxkt+ ut, utb N(0,ht)where ht= E0 + E1 +E2 +...+Eq
Wt2
Wt2
Wt2
ut12
ut q2
ut q2
W
t
2
2
1tu2
2tu
2
1tu2
2tu
-
8/6/2019 Ch8_ Slides 21 July 2010
17/64
17
Another Way of Writing ARCH Models
For illustration, consider an ARCH(1). Instead of the above, we can
write
yt= F1 + F2x2t+ ... + Fkxkt+ ut, ut= vtWt
, vtb N(0,1)
The two are different ways of expressing exactly t
he same model.
Thefirst form is easier to understand while the second form is required for
simulation from an ARCH model.
W E Et tu! 0 1 12
-
8/6/2019 Ch8_ Slides 21 July 2010
18/64
18
Testing for ARCH Effects
1. First, run any postulated linear regression of the form given in the equation
above, e.g. yt= F1 + F2x2t+ ... + Fkxkt+ utsaving the residuals, .
2. Then square the residuals, and regress them on q own lags to test for ARCH
of orderq, i.e. run the regression
where vtis iid.
ObtainR2 from this regression
3. The test statistic is defined as (T-q)R2 (the number of observationsmultiplied by the coefficient of multiple correlation) from the lastregression, and is distributed as a G2(q).
tu
tqtqttt vuuuu ! 22
22
2
110
2 ... KKKK
-
8/6/2019 Ch8_ Slides 21 July 2010
19/64
19
Testing for ARCH Effects (contd)
4. The null and alternative hypotheses are
H0 : K1 = 0 and K2 = 0 and K3 = 0 and ... and Kq = 0
H1 : K1 {0 or K2 {0 orK3 {0 or ... or Kq {0.
If the value of the test statistic is greater than the critical value from the
G2 distribution, then reject the null hypothesis.
Note t
hat t
he A
RCHtest is also sometimes applied directly to returnsinstead of the residuals from Stage 1 above.
-
8/6/2019 Ch8_ Slides 21 July 2010
20/64
20
Problems with ARCH(q) Models
How do we decide on q?
The required value ofq might be very large
Non-negativity constraints might be violated.
When we estimate an ARCH model, we require Ei>0 i=1,2,...,q
(since variance cannot be negative)
A natural extension of an ARCH(q) model which gets around some of
these problems is a GARCH model.
-
8/6/2019 Ch8_ Slides 21 July 2010
21/64
21
Generalised ARCH (GARCH) Models
Due to Bollerslev (1986). Allows the conditional variance to be dependent
upon previous own lags
The variance equation is now
(1) This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the
variance equation.
We could also write
Substituting into (1) forWt-12 :
W
t
2
=E
0+ E
1
2
1tu +FW
t-1
2
Wt-12=E0+E1 2 2tu +FWt-22
Wt-2
2
= E0 + E12
3tu +FWt-32
Wt2
= E0 + E12
1tu +F(E0 + E12
2tu +FWt-22)
=E0+E12
1tu +E0 +E1F2
2tu +FWt-22
-
8/6/2019 Ch8_ Slides 21 July 2010
22/64
22
Generalised ARCH (GARCH) Models (contd)
Now substituting into (2) forWt-22
An infinite number of successive substitutions would yield
So the GARCH(1,1) model can be written as an infinite order ARCH model.
We can again extend t
he GA
RCH(1,1) model to a GA
RCH(p,q):
Wt2
=E0 + E12
1tu +E0F + E1F2
2tu +F2(E0 + E1
2
3tu +FWt-32)
Wt2=E0+E1 2 1tu +E0F+E1F
2
2tu +E0F2+E1F2 2 3tu +F
3Wt-32
Wt2
=E0(1+F+F2
)+E12
1tu (1+FL+F2
L2
) +F3
Wt-32
Wt2
= E0(1+F+F2+...) + E1 2 1tu (1+FL+F2L
2+...) +FgW02
Wt2
=E0+E1 2 1tu +E22
2tu +...+Eq2
qtu +F1Wt-12+F2Wt-22+...+FpWt-p2
Wt2=
! !
q
i
p
j
jtjitiu1 1
22
0 WFEE
-
8/6/2019 Ch8_ Slides 21 July 2010
23/64
23
Generalised ARCH (GARCH) Models (contd)
But in general a GARCH(1,1) model will be sufficient to capture the
volatility clustering in the data.
Why is GARCHbetter than ARCH?
- more parsimonious - avoids over fitting
- less likely to breech non-negativity constraints
-The ARCH part explains volatility clustering
-Th
e GARCH
part indicates the temporal dependence in volatility
-
8/6/2019 Ch8_ Slides 21 July 2010
24/64
24
The Unconditional Variance under the GARCH
Specification
The unconditional variance ofutis given by
when
A high sum of alpha and beta would indicate volatility persistence.This would be useful in volatility forecast
is termed non-stationarity in variance
is termed intergrated GARCH
For non-stationarity in variance, the conditional variance forecasts willnot converge on their unconditional value as the horizon increases.
ar(ut) =)(1
1
0
FEE
FE 1 < 1
FE 1 u 1
FE 1 = 1
-
8/6/2019 Ch8_ Slides 21 July 2010
25/64
25
Estimation of ARCH / GARCH Models
Since the model is no longer of the usual linear form, we cannot use
OLS.
We use another technique known as maximum likelihood.
The method works by finding the most likely values of the parameters
given the actual data.
More specifically, we form a log-likelihood function and maximise it.
-
8/6/2019 Ch8_ Slides 21 July 2010
26/64
26
Estimation of ARCH / GARCH Models (contd)
The steps involved in actually estimating an ARCH or GARCH model
are as follows
1. Specify the appropriate equations for the mean and the variance - e.g. anAR(1)- GARCH(1,1) model:
2. Specify the log-likelihood function to maximise:
3. The software will maximise the function and give parameter values and
their standard errors
t=Q + Jyt-1+ ut , utbN(0,Wt2)Wt
2= E0 + E1
2
1tu +FWt-12
!
!
!T
t
ttt
T
t
t yyT
L1
22
1
1
2
/)(2
1)log(
2
1)2log(
2WJQWT
-
8/6/2019 Ch8_ Slides 21 July 2010
27/64
27
Parameter Estimation using Maximum Likelihood
Consider the bivariate regression case with homoskedastic errors forsimplicity:
Assuming t
hat ut b N(0,W
2
), then yt b N( , W
2
) so th
at theprobability density function for a normally distributed random variable
with this mean and variance is given by
(1)
Successive values ofytwould trace out the familiar bell-shaped curve.
Assuming that utare iid, thenytwill also be iid.
ttt uxy ! 21 FF
tx
21 FF
!2
2
212
21
)(
2
1exp
2
1),(
W
FF
W
WFF ttttxy
xyf
-
8/6/2019 Ch8_ Slides 21 July 2010
28/64
28
Parameter Estimation using Maximum Likelihood
(contd)
Then the joint pdf for all the ys can be expressed as a product of theindividual density functions
(2)
Substituting into equation (2) for everyytfrom equation (1),
(3)
! !
T
t
tt
TTtT
xyxyyyf
12
2
212
2121
)(
2
1exp
)2(
1),,...,,(
W
FF
TWWFF
!
!
!
T
ttt
T
tT
Xyf
XyfXyfXyfXyyyf
1
2
21
2
421
2
2212
2
1211
2
2121
),(
),()...,(),(),,...,,(
WFF
WFFWFFWFFWFF
-
8/6/2019 Ch8_ Slides 21 July 2010
29/64
29
Parameter Estimation using Maximum Likelihood
(contd) The typical situation we have is that the xt andyt are given and we want to
estimate F1, F2, W2. If this is the case, then f(y) is known as the likelihoodfunction, denoted LF(F1, F2, W2), so we write
(4)
Maximum likelihood estimation involves choosing parameter values (F1,
F2,W2
) that maximise this function.
We want to differentiate (4) w.r.t. F1, F2,W2, but (4) is a product containingTterms.
! !
T
t
tt
TT
xyLF
12
2
212
21
)(
2
1exp
)2(
1),,(
W
FF
TWWFF
-
8/6/2019 Ch8_ Slides 21 July 2010
30/64
30
Since , we can take logs of (4).
Then, using the various laws for transforming functions containinglogarithms, we obtain the log-likelihood function, LLF:
which is equivalent to
(5)
Differentiating (5) w.r.t. F1, F2,W2, we obtain
(6)
max ( ) maxlog( ( ))x x
f x f x!
Parameter Estimation using Maximum Likelihood
(contd)
!
!
T
t
tt xyTTLLF1
2
2
21 )(
2
1)2log(
2log
W
FFTW
!
!
T
t
tt xyTTLLF
1
2
2
212 )(
2
1)2log(
2
log
2 W
FFW
!2
21
1
1.2).(
2
1
W
FF
xFx tt xyLLF
-
8/6/2019 Ch8_ Slides 21 July 2010
31/64
31
(7)
(8
) Setting (6)-(8) to zero to minimise the functions, and putting hats above
the parameters to denote the maximum likelihood estimators,
From (6),
(9)
Parameter Estimation using Maximum Likelihood
(contd)
!4
2
21
22
)(
2
11
2 W
FF
WxW
x tt xyTLLF
!2
21
2
.2).(
2
1
WFF
xFx ttt xxyLLF
! 0)( 21 tt xy FF
! 0
21 tt xy FF ! 0 21 tt xTy FF
! 011
21 tt xT
yT
FF
xy21
FF !
-
8/6/2019 Ch8_ Slides 21 July 2010
32/64
32
From (7),
(10)
From (8),
Parameter Estimation using Maximum Likelihood
(contd)
! 0)( 21 ttt xxy FF
! 0 221 tttt xxxy FF
! 0 221 tttt xxxy FF
! tttt xxyxyx )( 222 FF
! 2222 xTyxTxyx ttt FF
! yxTxyxTx ttt )( 222F
)(
222
xTx
yxTxy
t
tt
!F
! 22142 )(1
tt xy
TFF
WW
-
8/6/2019 Ch8_ Slides 21 July 2010
33/64
33
Rearranging,
(11)
How do these formulae compare with the OLS estimators?
(9) & (10) are identical to OLS
(11) is different. The OLS estimator was
Therefore the ML estimator of the variance of the disturbances is biased,although it is consistent.
But how does this help us in estimating heteroskedastic models?
W2 21
! T
ut
W2 21
!
T
kut
Parameter Estimation using Maximum Likelihood
(contd)
! 2212 )(1
tt xy
TFFW
-
8/6/2019 Ch8_ Slides 21 July 2010
34/64
34
Estimation ofGARCH Models Using
Maximum Likelihood
Now we haveyt= Q + Jyt-1 + ut , utb N(0, )
Unfortunately, the LLF for a model with time-varying variances cannot bemaximised analytically, except in the simplest of cases. So a numerical
procedure is used to maximise the log-likelihood function. A potentialproblem: local optima or multimodalities in the likelihood surface.
The way we do the optimisation is:
1. Set up LLF.
2. Use regression to get initial guesses for the mean parameters.
3. Choose some initial guesses for the conditional variance parameters.
4. Specify a convergence criterion - either by criterion or by value.
Wt2
Wt2
= E0 + E12
1tu +FWt-12
! ! !
T
tttt
T
tt yy
T
L 1
22
11
2/
)(2
1
)log(2
1
)2
log(2 WJQWT
-
8/6/2019 Ch8_ Slides 21 July 2010
35/64
35
Non-Normality and Maximum Likelihood
Recall that the conditional normality assumption forut is essential.
We can test for normality using the following representation
ut
= vtW
tv
tb N(0,1)
The sample counterpart is
Are the normal? Typically are still leptokurtic, although less so thanthe . Is this a problem? Not really, as we can use the ML with a robustvariance/covariance estimator. ML with robust standard errors is called Quasi-Maximum Likelihood or QML. The robust standard errors are developed by
Bollerslev and Wooldridge (1992)
W E E E Wt t tu! 0 1 12
2 12 v
ut
t
t
!W
t
t
t
uv
W
!
tv tv
tu
-
8/6/2019 Ch8_ Slides 21 July 2010
36/64
36
QML Estimator
Let the normal densit for error is.
his maximization gi es the QML estimates.According to ollersle and Wooldridge 1
-
8/6/2019 Ch8_ Slides 21 July 2010
37/64
37
Extensions to the Basic GARCH Model
Since the GARCH model was developed, a huge number of extensions
and variants have been proposed. Three of the most important
examples are EGARCH, GJR, and GARCH-M models.
Problems with GARCH(p,q) Models:
- Non-negativity constraints may still be violated
- GARCH models cannot account for leverage effects
Possible solutions: the exponential GARCH (EGARCH) model or the
GJRmodel, which are asymmetric GARCH models.
-
8/6/2019 Ch8_ Slides 21 July 2010
38/64
38
The EGARCH Model
Suggested by Nelson (1991). The variance equation is given by
Advantages of the model
- Since we model the log(Wt
2
), th
en even if the parameters are negative, Wt
2
will be positive.
- We can account for the leverage effect: if the relationship between
volatility and returns is negative, K, will be negative.
!
TWEWKWF[W2
)log()log( 21
1
2
1
12
1
2
t
t
t
t
tt
uu
-
8/6/2019 Ch8_ Slides 21 July 2010
39/64
39
The GJR Model
Due to Glosten, Jaganathan and Runkle
where It-1 = 1 ifut-1 < 0
= 0 otherwise
For a leverage effect, we would see K> 0.
We require E1 + K 0 and E1 u 0 for non-negativity.
Wt2 E0 E1
2
1tu +FWt-12+Kut-1
2It-1
-
8/6/2019 Ch8_ Slides 21 July 2010
40/64
40
An Example of the use of a GJR Model
Using monthly S&P 500 returns, December 1979- June 1998
Estimating a GJRmodel, we obtain the following results.
)198.3(
172.0!ty
)772.5()999.14()437.0()372.16(
604.0498.0015.0243.1 12
1
2
1
2
1
2
! ttttt Iuu WW
-
8/6/2019 Ch8_ Slides 21 July 2010
41/64
41
News Impact Curves
The news impact curve plots the next period volatility (ht) that would arise from various
positive and negative values ofut-1, given an estimated model.
News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR
Model Estimates:
0
0.0
0.0
0.0
0.0
0.1
0.1
0.1
-1 -0.9 -0.8 -0.7 -0.6 -0.
-0.4 -0.
-0.
-0.1 0 0.1 0.
0.
0.4 0.
0.6 0.7 0.8 0.9 1
Valu o
Lagged Shock
Valueo
Cond
iionalVariance
GARCH
G
R
-
8/6/2019 Ch8_ Slides 21 July 2010
42/64
42
GARCH-in Mean
We expect a risk to be compensated by a higher return. So why not letthe return of a security be partly determined by its risk?
Engle, Lilien and Robins (1987) suggested the ARCH-M specification.A GARCH-M model would be
H can be interpreted as a sort of risk premium.
It is possible to combine all or some of these models together to getmore complex hybrid models - e.g. an ARMA-EGARCH(1,1)-Mmodel.
t Q+HWt-1+ut , utbN(0,Wt2)
Wt2
= E0+ E12
1tu +FWt-12
-
8/6/2019 Ch8_ Slides 21 July 2010
43/64
43
What Use Are GARCH-type Models?
GARCH can model the volatility clustering effect since the conditionalvariance is autoregressive. Such models can be used to forecast volatility.
We could show that
Var (ytyt-1, yt-2, ...) = Var (ut ut-1, ut-2, ...)
So modelling Wt2 will give us models and forecasts forytas well.
Variance forecasts are additive over time.
-
8/6/2019 Ch8_ Slides 21 July 2010
44/64
44
Forecasting Variances using GARCH Models
Producing conditional variance forecasts from GARCH models uses avery similar approach to producing forecasts from ARMA models.
It is again an exercise in iterating with the conditional expectationsoperator.
Consider t
he following GA
RCH(1,1) model:, utb N(0,Wt2),
What is needed is to generate are forecasts ofWT+12 ;T, WT+22 ;T, ...,
WT+s2 ;T where ;T denotes all information available up to and
including observation T.
Adding one to each of the time subscripts of the above conditionalvariance equation, and then two, and then three would yield thefollowing equations
WT+12 =E0 +E1 u2T+FWT2 , WT+22 =E0 +E1 u2T+1 +FWT+12 , WT+32 =E0 +E1
u2T+2+FWT+22
tt uy ! Q2
1
2
110
2
! ttt u FWEEW
-
8/6/2019 Ch8_ Slides 21 July 2010
45/64
45
Forecasting Variances
usingG
ARCH
Models (Contd)
Let be the one step ahead forecast forW2 made at time T. This iseasy to calculate since, at time T, the values of all the terms on the
RHS are known.
would be obtained by taking the conditional expectation of thefirst equation at the bottom of slide 44:
Given, how is , the 2-step ahead forecast forW2 made at time T,calculated? Taking the conditional expectation of the second equation
at the bottom of slide 44:=E0 +E1E( ;T) +F
where E( ;T) is the expectation, made at time T, of , which isthe squared disturbance term.
2
,1
f
TW
2
,1fTW
2
,1
f
TW = E0 + E1
2
Tu +FWT
2
2
,1
f
TW
2
,2
f
TW
2
,2
f
TW2
1Tu2
,1
f
TW2
1Tu2
1Tu
-
8/6/2019 Ch8_ Slides 21 July 2010
46/64
46
Forecasting Variances
usingG
ARCH
Models (Contd) We can write
E(uT+12 ` ;t) =WT+12
But WT+12 is not known at time T, so it is replaced with the forecast for
it, , so that the 2-step ahead forecast is given by
=E0 +E1 +F=E0 +(E1+F)
By similar arguments, the 3-step ahead forecast will be given by
=ET(E0 +E1 +FWT+22)=E0 +(E1+F)
=E0 +(E1+F)[ E0 +(E1+F) ]=E0 +E0(E1+F) + (E1+F)2
Any s-step ahead forecast (s u 2) would be produced by
2
,1
f
TW
2
,2f
TW2
,1fT
W
2
,1fT
W
2
,2
f
TW
2
,1
f
TW
2
,3
f
TW
2
,2
f
TW
2
,1f
TW2
,1
f
TW
f
T
s
s
i
if
Ts hh ,11
1
1
1
1
10, )()(
!
! FEFEE
-
8/6/2019 Ch8_ Slides 21 July 2010
47/64
47
What Use Are Volatility Forecasts?
1. Option pricing
C = f(S, X, W2, T, rf)
2. Conditional betas
3. Dynamic hedge ratios
The Hedge Ratio - the size of the futures position to the size of the
underlying exposure, i.e. the number of futures contracts to buy or sell per
unit of the spot good.
FW
Wi tim t
m t
,,
,
!2
-
8/6/2019 Ch8_ Slides 21 July 2010
48/64
48
What Use Are Volatility Forecasts? (Contd)
What is the optimal value of the hedge ratio?
Assuming that the objective ofhedging is to minimise the variance of thehedged portfolio, the optimal hedge ratio will be given by
where h = hedge ratio
p = correlation coefficient between change in spot price (S) andchange in futures price (F)
WS= standard deviation ofS
WF= standard deviation ofF
What if the standard deviations and correlation are changing over time?
Use
h p
s
F!
W
W
tF
ts
tt ph,
,
W
W
!
-
8/6/2019 Ch8_ Slides 21 July 2010
49/64
49
Testing Non-linear Restrictions or
TestingH
ypotheses aboutN
on-linear Models
Usual t- and F-tests are still valid in non-linear models, but they are
not flexible enough.
There are three hypothesis testing procedures based on maximumlikelihood principles: Wald, Likelihood Ratio, Lagrange Multiplier.
Consider a single parameter, U to be estimated, Denote the MLE as
and a restricted estimate as .~U
U
-
8/6/2019 Ch8_ Slides 21 July 2010
50/64
50
Likelihood Ratio Tests
Estimate under the null hypothesis and under the alternative.
Then compare the maximised values of the LLF.
So we estimate the unconstrained model and achieve a given maximisedvalue of the LLF, denoted Lu
Then estimate the model imposing the constraint(s) and get a new value ofthe LLF denoted Lr.
Which will be bigger?
Lre Lu comparable to RRSS u URSS
The LR test statistic is given byLR = -2(Lr - Lu) b G2(m)
where m = number of restrictions
-
8/6/2019 Ch8_ Slides 21 July 2010
51/64
51
Likelihood Ratio Tests (contd)
Example: We estimate a GARCH model and obtain a maximised LLF of66.85. We are interested in testing whetherF = 0 i n the following equation.
yt
= Q + Jyt-1
+ ut
, ut
b N(0, )= E0 + E1 + F
We estimate the model imposing the restriction and observe the maximisedLLF falls to 64.54. Can we accept the restriction?
LR = -2(64.54-66.85) = 4.62.
The test follows a G2(1) = 3.84 at 5%, so reject the null.
Wt2
Wt2 ut1
2 2
1tW
-
8/6/2019 Ch8_ Slides 21 July 2010
52/64
52
Modelling Volatilit of KSE Returns
Lets consider dail KSE-100 index return data from an 1 000-une 0 008. We hold last obser ations forout of sample
forecast comparison. Here is the ofplot returns hich shows t picalpattern of financial data
KSE-100 index returns an 1 000 une 0 008)
-10
-8
--
-
0
8
10
ay
-
8/6/2019 Ch8_ Slides 21 July 2010
53/64
53
Identifying GARCH order
Correlogramof KSE-return here seem tobe some significant
autocorrelations up toand including10 lags. he mean e uation mayin ol e toomanyparameters
-
8/6/2019 Ch8_ Slides 21 July 2010
54/64
54
Identifying GARCH order
Correlogramof S uared KSE-return AC of first four lags and at some higher large are large. ossibly
indicating that simple ARCH may in ol e toomanyparameters.
-
8/6/2019 Ch8_ Slides 21 July 2010
55/64
55
Testing forARCH Effects in Returns
ARCH LM test can be directlyapplied tokse-returns to see if s uared returnshas autoregressi e structure
LM(11 =TR = 07*0.1484= 7. al=@chis (327. ,11 =0.000
As suggested in the literature e.g. Enders (2004,p146 we start with the mostparsimonious model for conditional olatility that has often been found tobesatisfactory in de eloped stockmarkets i.e. GARCH(1,1 ,.
If it f ails to satisfyany diagnostics,more complicated model are needed.
return
ss
quaredo
flags
oftscoefficienarewhereHo kk HHHH 0...: 21 !!!!
-
8/6/2019 Ch8_ Slides 21 July 2010
56/64
56
Tentati e Model ARMA(1,1 -GARCH
(1,1 Lets examine the fit ofARMA(1,1 -
GARCH(1,1 model.All the coefficients inmean and olatility e uations are significant.Also the GARCH model seem tobe
covariance stationaryas the sumofalpha+beta =0.18 2+0.7 25=0.937
-
8/6/2019 Ch8_ Slides 21 July 2010
57/64
57
Diagnostics rdinaryand StandardizedResiduals
The estimated models residualsshould be uncorrelated and the
residuals should not contain anyremaining conditional volatility.
Kurtosis of standardizedresiduals (residuals divided byconditional standard deviationis smaller than the kurtosis ofordinary residuals,which shouldbe the case if GARCH is asuitable model
- - -
-
Series !
rdinary R " SIds
Sample # $ $
#
%
!
bservations$ $
#
&
' ean -(
.( )
&%
#
)
Median -(
.(
#
$ % 0 $
Ma1
imum%
.% 2 2 3 3 &
Minimum -%
.% 0 ) & 3 $
Std.Dev.#
.2
4&
%
#
$
Skewness -(
.#
0 3
#
2
#
Kurtosis&
.( ) % 3 0 3
Jar5
ue-6
era #
2
#
&
.2
# #
Probability (
.( ( ( ( ( (
7
8 7 7
9 7 7
@ 7 7
A
7 7
B 7 7
C 7 7
-C -A
-9 7 9A
SeriesStandardized R
"SIDs
Sample#
$ $
#
%
!
bservations$ $
#
&
Mean -(
.( $ 3
2
#
4
Median -(
.(
# #
( 3 %
Ma1 imum
4
.$
#
2 % ) %
Minimum -0
.)
#
0 4 4 3
Std.Dev.(
.3 3 3 3 $ 3
Skewness -(
.2
#
( 0 4 (
Kurtosis2
.0
&3 ( 3 )
Jar5
ue-6
era & 2
3
.)
&) %
Probability (
.( ( ( ( ( (
-
8/6/2019 Ch8_ Slides 21 July 2010
58/64
58
Model Diagnostics Standardized
Residuals Standardized residuals and
s uared standardizedresiduals do not give
evidence ofanymisspecification
-
8/6/2019 Ch8_ Slides 21 July 2010
59/64
59
Testing forLeverage Effects (Enders,
p-148 If there is no leverage, the regression of s uared standardized
residuals on their level should give an overall test which isinsignificant and individual t-values insignificant.
sresid(-1 and sresid(-2)are significant as well as
overall -test.
This indicates presence of
leverage effect. Conditional
variance ofkse returns is
affected more by
negative shocks.
-
8/6/2019 Ch8_ Slides 21 July 2010
60/64
60
EGARCH Estimates
The coefficient of terminvolving lag residual allowsthe sign of residual toaffect
conditional variance. Ifasymmetry is present, thiscoefficient (in this case c(6))should be negative andsignificant. If this term is zerothere is noasymmetry. In
this case the term is negativeand significant whichconfirms previous diagnostictest.
-
8/6/2019 Ch8_ Slides 21 July 2010
61/64
61
EGARCH model
The estimated EGARCH model gives
Apositive shock in error last period (good news)ofone unitincreases volatilityby =-0.1103 +0.3173=0.207 units
A negative shock in error last period (bad news)ofone unitincreases volatilityby =-(-0.1103) +0.3173=0.4276 units
which is more than double hence volatility responds to
shocks in asymmetric way.
2
1
1
1
1
12 log8997.01103.03173.0173.0log
! tt
t
t
tt
uuW
WW
W
-
8/6/2019 Ch8_ Slides 21 July 2010
62/64
62
ARCH-in Mean
Using either standarddeviation orvariance inthe mean e uation gives
insignificant estimate ofriskpremium.Thus in thiscase ARCH-M is not asatisfactory description of
kse-returns.
-
8/6/2019 Ch8_ Slides 21 July 2010
63/64
63
Model Selection
The following table indicates that log likelihood correspondsto EGARCH (1,1)(with T distributed errors) is themaximum among competing models. Compared to normal
case parameter estimates in the case of EGARCH with terrors are slightly different.Akaike and Schwarz criteriaalso give similar results.
Model Log-Li eli ood
GARCH(1,1) -Normal Errors -3792.053
GARCH(1,1) -T Errors -3690.913
GARCH(1,1) -GED Errors -3685.821
EGARCH(1,1) Normal Errors -3787.982EGARCH(1,1) T Errors -3677.53
EGARCH(1,1) GED Errors -3678.178
-
8/6/2019 Ch8_ Slides 21 July 2010
64/64
64
Estimated Volatility
Volatility estimates from EGARCH model resembles more closely to realizeds uared returns
0
1 0
2 0
3 0
4 0
5 0
6 0
2 5 5 0 7 5 1 0 0
V G A R C H T V E G A R C H T K S E ^2
D a t e
volatility
E s t im a t e d V o la t i l ity a n d S q u a re d K S E R e t u rn s