UNIVERSITY OF AMSTERDAM

FACULTY OF ECONOMICS AND BUSINESS

Master thesis

in the subject of

Financial Econometrics

Forecasting intra-day volatility. Multiplicative Component Realized GARCH

Karolina Jerofejevaite

10603484

Supervised by Peter Boswijk

2014.08.25

Contents

1 Introduction
2 Literature review
3 Econometric methods
3.1 Multiplicative Component GARCH
3.2 Realized GARCH
3.2.1 Specification of the log-linear Realized GARCH(1,1)
3.2.2 Log-likelihood
3.2.3 Multi-period Forecast
3.2.4 Realized measures
4 Data
5 Results
5.1 Realized GARCH modelling results
5.2 Realized GARCH forecasting results
5.3 Modelling results for Multiplicative Component GARCH
5.4 Forecasting results for Multiplicative Component GARCH
6 Possible extensions
6.1 Asymmetries
6.2 Long memory
7 Limitations
8 Conclusions
A Appendix
A.1 15 minute Realized Kernel calculation
A.2 Diebold-Mariano test for Comparing Predictive Accuracy
A.3 Data analysis for Microsoft stock
A.4 Results for Microsoft stock
A.4.1 Realized GARCH results
A.4.2 Multiplicative Component GARCH results


1 Introduction

To date there is a large body of literature on modelling and forecasting the daily variance of returns, whereas intra-day volatility models are far less discussed. However, every year trading becomes more frequent and more automated, which motivates the development of intra-day volatility forecasting models. These forecasts serve as an important input to algorithms for scheduling trades, pricing assets and adjusting hedge ratios. They also help to place limit orders and can be used to calculate intra-day Value at Risk (VaR), which may lead to a reallocation of funds within a day. Consequently, interest in modelling intra-day variance is growing.

Thus my main goal in this thesis is to find the most empirically suitable approach to modelling and forecasting intra-day variance. One of the most influential papers on this matter was recently published by Engle and Sokalska (2012). The authors introduce the Multiplicative Component GARCH model for high-frequency intra-day financial returns, which specifies the conditional variance as a product of daily, diurnal and stochastic intra-day volatility components. My master thesis builds on this paper. I investigate the performance of these variance components and seek to answer the question: how significant are these components for the intra-day volatility modelling and forecasting results?

In that paper, commercially available volatility forecasts are used as the daily variance component. These predictions are made on the basis of a multi-factor risk model. In contrast, I want to make use of the information that high-frequency data provide. Thus I take a different approach and model the daily volatility component with the Realized GARCH model introduced by Hansen, Huang and Shek (2011). By its structure, the model accounts for the asymmetry and long memory properties of the daily returns. It has also been shown that the model gives substantial improvements in daily conditional variance modelling and forecasting over standard GARCH models. Standard GARCH models typically employ squared returns to extract information about the current level of daily volatility. Within the Realized GARCH model, observed realized measures of the latent volatility are used instead.


These measures are built from high-frequency returns in such a way that they approximate the quadratic variation of the true underlying price process by filtering out the microstructure noise. Many realized measures have been proposed; in my work I focus on the 5-minute Realized Variance, the sub-sampled 5-minute Realized Variance and the Realized Kernel. The last gives the most accurate daily forecasts among the three, so its forecasts are used as the daily variance component in the Multiplicative Component GARCH model. After applying this model, I predict intra-day volatility 15 minutes ahead and use different approaches to evaluate the forecasts obtained; that is, various proxies for the true volatility and various prediction measures are used. The models are applied to two frequently traded stocks: Intel Corporation and Microsoft. The data samples run from the 2nd of June, 1990 to the 3rd of May, 2011. Every trading day consists of 5-second log-returns. Thus in total there are 3000 days of observations with 4680 5-second log-returns within each day.

This thesis is organized as follows. Chapter 2 gives a literature review. Chapter 3 introduces the Realized GARCH and Multiplicative Component GARCH models. The data used are described in Chapter 4. The results obtained are summarized in Chapter 5. Possible extensions are discussed in Chapter 6. Limitations of the Realized GARCH model are set out in Chapter 7. Overall conclusions of the thesis are presented in Chapter 8.


2 Literature review

The frequency of trading in financial markets increases every year, and we find ourselves at the point where a proper intra-day volatility model is of substantial empirical importance. It has been argued that conventional GARCH models are not suitable for within-the-day modelling and fail to capture important features of intra-day volatility (see for instance Andersen and Bollerslev (1997)). The reason for this is the distinctive intra-day seasonality, or in other words the diurnal pattern, of the volatility. A number of closely connected models were developed to take account of intra-day volatility patterns; see Ghose and Kroner (1996), Andersen and Bollerslev (1997, 1998), Giot (2005) and Engle and Sokalska (2012). The latter extended the model proposed by Andersen and Bollerslev (1997) and introduced the Multiplicative Component GARCH model. In contrast to Andersen and Bollerslev, this model includes not only daily and diurnal volatility components but also a stochastic intra-day component.

This thesis builds on the work of Engle and Sokalska (2012). The authors were dealing with a huge amount of data (2500 US equities with high-frequency returns), so the main focus of their work was on applying a number of different specifications and then comparing the forecasting results. Namely, they construct models for separate companies, pool data into industries, and consider various criteria for grouping returns. They arrive at the conclusion that the forecasts from the pooled specifications outperform the corresponding forecasts from company-by-company estimation. This kind of modelling requires access to a comprehensive sample as well as an extremely fast computer. Thus in my thesis I take a different approach and focus on investigating the significance of the daily, diurnal and stochastic intra-day volatility components for the forecasting results, along with an in-depth investigation of the properties of the intra-day returns. Moreover, I apply different ways of evaluating the accuracy of the forecasts.

For the time being let us focus on modelling the daily volatility component. Commonly, standard GARCH models are used. Within the GARCH framework, daily returns (typically squared returns) are employed to extract information about the current level of volatility, and this information is used to form expectations about the next period's volatility. It must be emphasized, however, that squared returns offer only a weak signal about the current level of volatility. Moreover, it is known that a GARCH model is slow at 'catching up': it takes many periods for the conditional variance implied by the GARCH model to reach its new level, as discussed in Andersen (2003). Therefore, since I am dealing with high-frequency financial data, it is necessary to take advantage of this additional information. A number of realized measures of volatility, including Realized variance, bipower variation, the Realized Kernel and others (see Andersen and Bollerslev (1998), Andersen (2001), Barndorff-Nielsen and Shephard (2002, 2004, 2008), Hansen and Lunde (2006), Bandi and Russell (2008)), prove to be far more informative about the current level of volatility than the squared return. This makes realized measures very useful for modelling and forecasting future volatility. Andersen (2001) and Barndorff-Nielsen and Shephard (2002) show that applying Realized variance, a measure constructed by summing squared high-frequency returns, improves the understanding of time-varying variance and the ability to forecast future volatility. Hansen and Lunde (2006) carry out an in-depth analysis of Realized variance and investigate its upward bias at high frequencies. Their work shows that using 1- to 5-minute squared returns for the Realized variance measure gives the best results. Barndorff-Nielsen, Hansen, Lunde and Shephard (2008) expand this influential Realized variance literature by introducing the Realized Kernel. This non-negative estimator is robust to autocorrelation of the high-frequency returns and has broadly the same form as a standard heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimator.

Engle (2002) introduced a model called GARCH-X, which is a GARCH model that includes a realized measure. However, within the GARCH-X framework the variation in the realized measures is left unexplained; for this reason GARCH-X models are called partial. Engle and Gallo (2006) introduced the first 'complete' model in this context. Their model specifies a GARCH structure for each of the realized measures, so that an additional latent volatility process is introduced for each realized measure in the model. The model by Engle and Gallo (2006) is known as the multiplicative error model (MEM), because it builds on the MEM structure proposed by Engle (2002). Another complete model is the HEAVY model by Shephard and Sheppard (2010), which also incorporates at least two separate equations: one for the latent volatility and one for the realized measure. Thus, unlike the traditional GARCH models, these models operate with multiple latent volatility processes. A further example of a complete model was introduced by Hansen, Huang and Shek (2011) and is called Realized GARCH. This model combines a GARCH structure for the returns with an integrated model for the realized measures of volatility. Importantly, the authors show statistical gains from incorporating realized measures in volatility models. This is not the only paper that illustrates the benefit of including realized measures in the analysis. Shephard and Sheppard (2010) show that, when it comes to forecasting, HEAVY models outperform standard GARCH models substantially, both in sample and out of sample. From this it seems that HEAVY models would be a good choice for forecasting the daily variance component. However, in this model the latent volatility and realized measure equations are estimated separately, and Shephard and Sheppard (2010) briefly mention that pooling the information across the two equations might bring more explanatory power to the model. That is exactly what is done by Hansen, Huang and Shek (2011) in the Realized GARCH model. Therefore in this thesis Realized GARCH is chosen to model and forecast daily volatility.

In this Master thesis I combine what are arguably the best known practices for modelling intra-day and daily volatility: the Multiplicative Component GARCH of Engle and Sokalska (2012) for the intra-day variance and the Realized GARCH for forecasting the daily volatility component. In this way I obtain an extended model, called the Multiplicative Component Realized GARCH.

3 Econometric methods

General notation. Every trading day is divided into $N$ bins (time intervals); $j = 1, \dots, N$ is the index of the bin within a day and $t = 1, \dots, T$ is the index of the trading day. Combining the two, $\{t, j\}$ indicates the $j$-th bin on the $t$-th day.

The dataset contains high-frequency 5-second log returns. This means that the size of the bin is 5 seconds and there are 4680 such bins within each trading day. The 5-second log returns are defined as:

$$r_{t,j} = \log\left(\frac{S_{t,j}}{S_{t,j-1}}\right),$$

where $S_{t,j}$ denotes the stock price at time $\{t, j\}$.

The daily returns can then be obtained by summing over the 4680 5-second bins of the day:

$$r_t = \sum_{j=1}^{4680} r_{t,j}.$$

Throughout this thesis 15-minute returns are used often. To distinguish them from the high-frequency 5-second returns, I denote them by $r_{t,i}$, where $i = 1, \dots, 26$ (each day has 26 15-minute bins).
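As an illustration of the notation above, the following minimal sketch computes 5-second log returns from a price path and aggregates them to 15-minute and daily returns. The function names and the synthetic price path are hypothetical and not part of the thesis; the only assumption carried over from the text is the day length of 4680 5-second bins.

```python
import numpy as np

def log_returns(prices):
    """r_j = log(S_j / S_{j-1}) for a 1-D array of prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def aggregate_returns(r, bins_per_block):
    """Sum consecutive log returns into coarser, non-overlapping bins."""
    r = np.asarray(r, dtype=float)
    if r.size % bins_per_block != 0:
        raise ValueError("length must be a multiple of bins_per_block")
    return r.reshape(-1, bins_per_block).sum(axis=1)

# Synthetic one-day price path (illustrative only):
# 4681 prices give 4680 5-second log returns.
rng = np.random.default_rng(0)
prices = 20.0 * np.exp(np.cumsum(rng.normal(0.0, 1e-4, size=4681)))

r_5s = log_returns(prices)
r_15min = aggregate_returns(r_5s, 180)  # 180 x 5 s = 15 min, so 26 bins
r_daily = r_5s.sum()                    # daily return as the sum of all bins
```

Because log returns are additive, the 26 15-minute returns sum to the same daily return as the 4680 5-second returns.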

3.1 Multiplicative Component GARCH

The Multiplicative Component GARCH model for intra-day financial returns specifies the conditional variance as a product of daily, diurnal and stochastic intra-day volatility components. In this section I give the general specification of the model.

The intra-day returns $r_{t,i}$ are assumed to take the following form:

$$r_{t,i} = \sqrt{h_t s_i q_{t,i}}\,\varepsilon_{t,i}, \qquad \varepsilon_{t,i} \sim N(0, 1). \tag{1}$$

Here $h_t$ is the daily volatility component, $s_i$ is the diurnal variance pattern, $\varepsilon_{t,i}$ is the error term, and $q_{t,i}$ is the stochastic intra-day volatility component, with $E(q_{t,i}) = 1$. In this thesis I chose to use 15-minute returns. The modelling of the Multiplicative Component GARCH breaks down into three main parts.

(i) The daily volatility component $h_t$ needs to be modelled and predicted. For this purpose the Realized GARCH model, detailed in section 3.2, is used.

(ii) The deterministic diurnal pattern $s_i$ has to be obtained. To isolate the intra-day seasonality, first divide the squared 15-minute returns by the daily variance component $h_t$:

$$\frac{r_{t,i}^2}{h_t} = s_i q_{t,i} \varepsilon_{t,i}^2 \;\Rightarrow\; E\left[\frac{r_{t,i}^2}{h_t}\right] = s_i E[q_{t,i}] = s_i.$$

This leads to the following estimator of $s_i$:

$$\hat{s}_i = \frac{1}{T} \sum_{t=1}^{T} \frac{r_{t,i}^2}{h_t}. \tag{2}$$

(iii) The intra-day returns must be normalized by the daily ($h_t$) and diurnal ($s_i$) components:

$$z_{t,i} = \frac{r_{t,i}}{\sqrt{h_t s_i}} \approx \sqrt{q_{t,i}}\,\varepsilon_{t,i}.$$

These normalized returns $z_{t,i}$ are then used in a GARCH(1,1) model for the stochastic intra-day component $q_{t,i}$:

$$q_{t,i} = \omega + \alpha z_{t,i-1}^2 + \beta q_{t,i-1}, \tag{3}$$

where $z_{t,i} \mid \mathcal{F}_{t,i-1} \sim N(0, q_{t,i})$.

Here $\mathcal{F}_{t,i-1}$ is the σ-algebra generated by all the 15-minute returns observed before the current time $\{t, i\}$. In more detail, $\mathcal{F}_{t,i-1} = \{r_{1,1}, \dots, r_{1,N}, \dots, r_{t-1,1}, \dots, r_{t-1,N}, r_{t,1}, \dots, r_{t,i-1}\}$ with $N = 26$. I apply a simple GARCH(1,1) for the stochastic component $q_{t,i}$, as is done in the paper by Engle and Sokalska (2012). Possible extensions of this model are discussed in Chapter 6.
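Steps (ii) and (iii) can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the implementation used in the thesis: the daily variances `h` are taken as given (in the thesis they come from Realized GARCH forecasts), the data are synthetic, the GARCH(1,1) parameters are fixed by hand rather than estimated, the recursion is run over the flattened sequence of bins so that the first bin of a day follows the last bin of the previous day, and the starting value `q0 = 1` is an arbitrary choice.

```python
import numpy as np

def diurnal_pattern(r, h):
    """Equation (2): s_i = (1/T) * sum_t r_{t,i}^2 / h_t.
    r has shape (T, N), h has shape (T,)."""
    r = np.asarray(r, dtype=float)
    h = np.asarray(h, dtype=float)
    return np.mean(r**2 / h[:, None], axis=0)

def normalize_returns(r, h, s):
    """z_{t,i} = r_{t,i} / sqrt(h_t * s_i)."""
    r = np.asarray(r, dtype=float)
    return r / np.sqrt(np.outer(h, s))

def garch11_filter(z, omega, alpha, beta, q0=1.0):
    """Equation (3) applied to the flattened sequence of normalized returns.
    q0 is an assumed starting value."""
    z = np.ravel(np.asarray(z, dtype=float))
    q = np.empty(z.size)
    q[0] = q0
    for k in range(1, z.size):
        q[k] = omega + alpha * z[k - 1]**2 + beta * q[k - 1]
    return q

# Synthetic data (illustrative only): T days with N = 26 15-minute bins.
rng = np.random.default_rng(1)
T, N = 250, 26
h = np.exp(rng.normal(-8.0, 0.3, size=T))                    # daily variances
s_true = 1.0 + 0.8 * np.cos(np.linspace(0, 2 * np.pi, N))    # U-shaped pattern
r = np.sqrt(np.outer(h, s_true)) * rng.standard_normal((T, N))

s_hat = diurnal_pattern(r, h)
z = normalize_returns(r, h, s_hat)
q = garch11_filter(z, omega=0.05, alpha=0.05, beta=0.90)     # hand-picked values
```

Under equation (1), the conditional variance of the 15-minute return in bin $\{t, i\}$ is the product $h_t s_i q_{t,i}$, so a one-step-ahead variance forecast combines the three filtered components in the same way.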

3.2 Realized GARCH

In this part I introduce the Realized GARCH model proposed by Hansen, Huang and Shek (2011) and the realized measures applied. It is assumed that $E(r_t \mid \mathcal{F}_{t-1}) = 0$, which is empirically found to be an accurate assumption. Here the information set $\mathcal{F}_{t-1}$ contains the high-frequency returns $r_{t,j}$ and the realized measures constructed from these returns, observed up to day $t-1$. That is, $\mathcal{F}_{t-1} = \{r_{t-1,1}, \dots, r_{t-1,N}, r_{t-2,1}, \dots, r_{t-2,N}, \dots, r_{1,1}, \dots, r_{1,N}, x_{t-1}, \dots, x_1\}$, where $N = 4680$ and $x_{t-1}$ denotes the realized measure obtained for day $t-1$. As mentioned, $r_t$ are the daily returns and $t = 1, \dots, T$ is the index of the day. The interest lies in the unobserved daily volatility of the returns, $h_t = V(r_t \mid \mathcal{F}_{t-1})$.

First of all, a broader introduction to the realized measures is needed. Suppose we want to measure the variation over the period $[0, T]$, and assume that the log price process $Y$ is a Brownian semi-martingale. A continuous semi-martingale is a process that can be decomposed as $Y_t = Y_0 + A_t + M_t$, where $\{A_t\}_{t>0}$ is of bounded variation and $\{M_t\}_{t>0}$ is a continuous local martingale. Ito processes, also known as Brownian semi-martingales, form a subset, with $A_t = \int_0^t \mu_s\,ds$ and $M_t = \int_0^t \sigma_s\,dW_s$. Combining everything we get:

$$Y_t = \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dW_s.$$

The main focus lies in the latent quadratic variation of this process over the whole period of interest $[0, T]$:

$$[Y] = \int_0^T \sigma_u^2\,du,$$

where, due to the Ito isometry and the multiplication table, we have

$$V\left[\int_0^t \sigma_s\,dW_s\right] = E\left[\left(\int_0^t \sigma_s\,dW_s\right)^2\right] = E\left[\int_0^t \sigma_s^2\,ds\right].$$

See for instance Etheridge (2002).

Furthermore, assume that

$$X_{t,j} = Y_{t,j} + U_{t,j}$$

is a noisy observation of the true log price process $Y_{t,j}$ for all $\{t, j\} \in [0, T]$, with $U_{t,j}$ denoting the market microstructure noise, such that $E[U_{t,j}] = 0$ and $V[U_{t,j}] = \omega^2$.

The realized measures are constructed in such a way that they approximate the quadratic variation of the semi-martingale that drives the underlying log price process by filtering out the market microstructure noise. In the empirical part of my thesis I apply the following realized measures: Realized variance ($RV$), sub-sampled Realized variance ($RV_{sub}$) and Realized Kernel ($RK$). The detailed specifications of these measures are given at the end of this chapter.

The general framework of the Realized GARCH model consists of three equations, namely the return, GARCH and realized measure equations. In the GARCH equation the conditional variance $h_t$ depends not only on $h_{t-1}$ but also on the realized measure of volatility, denoted $x_{t-1}$. The realized measure equation (also called the measurement equation) is a very important component: it ties the realized measure to the latent volatility and provides a simple way of modelling the joint dependence between $r_t$ and $x_t$. The specification of the model in more detail can be found in section 3.2.1.

Hansen, Huang and Shek (2011) define different specifications of the model; however, they emphasize the log-linear Realized GARCH. There are a few reasons for this. First, the log-linear specification automatically ensures a positive variance. In practice the log-linear specification of the usual GARCH model is not often used because $r_t$ may take the value zero, which would cause censoring in the model. In contrast, within the Realized GARCH framework the logarithm of the squared return, $\log r_{t-1}^2$, does not appear in the model (this is explicitly shown below in equation (5)). For the motivation of not including returns in the Realized GARCH model I refer the reader to Hansen, Huang and Shek (2011). Another attractive feature of the log-linear Realized GARCH is that it maintains the ARMA structure that characterizes some of the standard GARCH models. All things considered, the log-linear Realized GARCH seems to be the best choice for empirical work.

Usually a GARCH(1,1) specification is applied to account for volatility clustering. Here I am dealing with the returns of the Intel Corporation and Microsoft stocks, and GARCH(1,1) proves to be enough to remove the serial correlation in the squared residuals of the data analysed (detailed information about the data is provided in Chapter 4).

Taking all of these arguments into account, I chose to use the log-linear Realized GARCH(1,1) model for daily volatility modelling and forecasting. The detailed specification of the model can be found in the next section.


3.2.1 Specification of the log-linear Realized GARCH(1,1)

Return equation:

$$r_t = \sqrt{h_t}\,\varepsilon_t. \tag{4}$$

GARCH equation:

$$h_t = \exp\{\omega + \beta \log h_{t-1} + \gamma \log x_{t-1}\}. \tag{5}$$

Realized measure equation:

$$\log x_t = \xi + \varphi \log h_t + \tau(\varepsilon_t) + u_t, \tag{6}$$

where $\tau(\varepsilon_t) = \tau_1 \varepsilon_t + \tau_2(\varepsilon_t^2 - 1)$ is a leverage function such that $E[\tau(\varepsilon_t)] = 0$, $\varepsilon_t \sim \text{iid } N(0, 1)$, $u_t \sim \text{iid } N(0, \sigma_u^2)$ and $x_t$ is a realized measure. It is shown in the paper by Hansen, Huang and Shek (2011) that the use of this type of leverage function $\tau(\varepsilon_t)$ in the realized measure equation (6) induces an EGARCH-type structure in the GARCH equation (5).

The model specification above reveals another argument for choosing the log-linear Realized GARCH. It lies in the specification of $r_t$ in equation (4), which implies that

$$\log(r_t^2) = \log(h_t) + \log(\varepsilon_t^2),$$

and a realized measure is in many ways similar to the squared return $r_t^2$, although it is a more accurate measure of $h_t$. Therefore, it is natural to express the log realized measure $\log(x_t)$ in terms of $\log(h_t)$ and $\varepsilon_t$, as in equation (6).
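To make the interplay of equations (4) to (6) concrete, the following sketch simulates a path from the log-linear Realized GARCH(1,1). It is an illustration only: the function name, the parameter values and the choice of starting $\log h_1$ at its unconditional mean $(\omega + \gamma\xi)/(1 - \beta - \varphi\gamma)$ are assumptions of this example, not taken from the thesis.

```python
import numpy as np

def simulate_realized_garch(n, omega, beta, gamma, xi, phi,
                            tau1, tau2, sigma_u, seed=0):
    """Simulate returns r, realized measures x and variances h
    from equations (4)-(6)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    u = sigma_u * rng.standard_normal(n)
    log_h = np.empty(n)
    log_x = np.empty(n)
    # Assumed start: unconditional mean of log h (needs beta + phi*gamma < 1).
    log_h[0] = (omega + gamma * xi) / (1.0 - beta - phi * gamma)
    for t in range(n):
        if t > 0:
            # GARCH equation (5), in logs
            log_h[t] = omega + beta * log_h[t - 1] + gamma * log_x[t - 1]
        # Realized measure equation (6) with leverage function tau(eps)
        log_x[t] = (xi + phi * log_h[t]
                    + tau1 * eps[t] + tau2 * (eps[t]**2 - 1.0) + u[t])
    r = np.exp(0.5 * log_h) * eps          # return equation (4)
    return r, np.exp(log_x), np.exp(log_h)

# Hand-picked illustrative parameter values.
params = dict(omega=0.2, beta=0.54, gamma=0.5, xi=-0.7, phi=0.89,
              tau1=-0.02, tau2=0.05, sigma_u=0.4)
r, x, h = simulate_realized_garch(1000, **params)
```

Because the recursion is written in logs, both $h_t$ and $x_t$ are positive by construction, which is the first advantage of the log-linear specification mentioned above.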

3.2.2 Log-likelihood

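For the purpose of estimation, the Gaussian specification is adopted, so that the log-likelihood is given by:

$$\ell(r, x; \theta) = -\frac{1}{2} \sum_{t=1}^{n} \left( \log(h_t) + \frac{r_t^2}{h_t} + \log(\sigma_u^2) + \frac{u_t^2}{\sigma_u^2} \right), \tag{7}$$

where $\theta = (\omega, \beta, \gamma, \xi, \varphi, \tau_1, \tau_2, \sigma_u^2)'$.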
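A minimal sketch of how the log-likelihood (7) could be evaluated for given data and parameters is shown below. The function name and the treatment of the initial value `log_h0` as a user-supplied input are assumptions of this example; the thesis does not state how the recursion is initialized. As in equation (7), additive constants are omitted.

```python
import numpy as np

def realized_garch_loglik(theta, r, x, log_h0):
    """Gaussian log-likelihood (7) of the log-linear Realized GARCH(1,1).
    theta = (omega, beta, gamma, xi, phi, tau1, tau2, sigma2_u);
    log_h0 is an assumed starting value for log h_1."""
    omega, beta, gamma, xi, phi, tau1, tau2, sigma2_u = theta
    r = np.asarray(r, dtype=float)
    log_x = np.log(np.asarray(x, dtype=float))
    n = r.size
    log_h = np.empty(n)
    log_h[0] = log_h0
    for t in range(1, n):
        # GARCH equation (5), in logs
        log_h[t] = omega + beta * log_h[t - 1] + gamma * log_x[t - 1]
    h = np.exp(log_h)
    eps = r / np.sqrt(h)                       # from return equation (4)
    # residual of the realized measure equation (6)
    u = log_x - xi - phi * log_h - tau1 * eps - tau2 * (eps**2 - 1.0)
    return -0.5 * np.sum(log_h + r**2 / h + np.log(sigma2_u) + u**2 / sigma2_u)

# Tiny hand-checkable example: with all parameters zero and sigma2_u = 1,
# h_t = 1 and u_t = log x_t = 0, so only the r_t^2 / h_t terms remain.
theta0 = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
ll = realized_garch_loglik(theta0, r=[1.0, 1.0], x=[1.0, 1.0], log_h0=0.0)
```

For estimation, the negative of this function could be passed to a general-purpose numerical optimizer; which optimizer and starting values were used in the thesis is not stated in this section.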


3.2.3 Multi-period Forecast

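The Realized GARCH model can be used to predict not only the conditional return variance but also the realized measure. Even more importantly, the advantage of having a model that fully describes the dynamic properties of the realized measure $x_t$ is that multi-period-ahead forecasting is possible. Such predictions are not feasible without the realized measure equation (6). The Realized GARCH model induces the following VARMA(1,1) structure, which is used for multi-period-ahead forecasts:

$$\begin{pmatrix} \log h_{t+k} \\ \log x_{t+k} \end{pmatrix} = \begin{pmatrix} \beta & \gamma \\ \varphi\beta & \varphi\gamma \end{pmatrix}^{k} \begin{pmatrix} \log h_t \\ \log x_t \end{pmatrix} + \sum_{j=0}^{k-1} \begin{pmatrix} \beta & \gamma \\ \varphi\beta & \varphi\gamma \end{pmatrix}^{j} \left[ \begin{pmatrix} \omega \\ \xi + \varphi\omega \end{pmatrix} + \begin{pmatrix} 0 \\ \tau(\varepsilon_{t+k-j}) + u_{t+k-j} \end{pmatrix} \right],$$

where $\tau(\varepsilon_{t+k-j}) = \tau_1 \varepsilon_{t+k-j} + \tau_2(\varepsilon_{t+k-j}^2 - 1)$. To simplify the expression, denote

$$A = \begin{pmatrix} \beta & \gamma \\ \varphi\beta & \varphi\gamma \end{pmatrix}, \quad b = \begin{pmatrix} \omega \\ \xi + \varphi\omega \end{pmatrix}, \quad Y_t = \begin{pmatrix} \log h_t \\ \log x_t \end{pmatrix}, \quad \zeta_t = \begin{pmatrix} 0 \\ \tau(\varepsilon_t) + u_t \end{pmatrix};$$

then we get:

$$Y_{t+k} = A^k Y_t + \sum_{j=0}^{k-1} A^j (b + \zeta_{t+k-j}). \tag{8}$$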
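Since $E[\tau(\varepsilon_t)] = 0$ and $E[u_t] = 0$, the innovation $\zeta_t$ has mean zero, so taking conditional expectations in equation (8) gives the point forecast $E[Y_{t+k} \mid \mathcal{F}_t] = A^k Y_t + \sum_{j=0}^{k-1} A^j b$. The sketch below computes this forecast by iterating $y \leftarrow A y + b$. The function name and parameter values are hypothetical. Note also that this is a forecast of $\log h_{t+k}$ and $\log x_{t+k}$; exponentiating it does not in general give the conditional mean of $h_{t+k}$.

```python
import numpy as np

def forecast_log_h_x(log_h_t, log_x_t, k, omega, beta, gamma, xi, phi):
    """k-step-ahead conditional mean of (log h, log x) implied by equation (8)."""
    A = np.array([[beta, gamma],
                  [phi * beta, phi * gamma]])
    b = np.array([omega, xi + phi * omega])
    y = np.array([log_h_t, log_x_t], dtype=float)
    for _ in range(k):
        y = A @ y + b          # one step of the recursion with zeta set to 0
    return y

# Hand-picked illustrative values.
pars = dict(omega=0.2, beta=0.54, gamma=0.5, xi=-0.7, phi=0.89)
y1 = forecast_log_h_x(-8.0, -7.5, k=1, **pars)
y5 = forecast_log_h_x(-8.0, -7.5, k=5, **pars)
```

For $k = 1$ the first component reduces to $\omega + \beta \log h_t + \gamma \log x_t$, which is the GARCH equation (5) in logs.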

3.2.4 Realized measures

In this section I describe in more detail the realized measures which are used in the empirical

part of the thesis.

Realized variance

The simplest and yet most widely used realized measure is the Realized variance. The main idea is to aggregate squared high-frequency intra-day returns $r_{t,j}$ to approximate the daily increments of the quadratic variation of the price process. In more detail, if the prices are observed without noise then, as $\max_j |t_j - t_{j-1}| \downarrow 0$, the Realized variance consistently estimates the quadratic variation of the price process on the $t$-th day. In other words, the Realized variance converges to the daily increment of the quadratic variation of the price process (see for instance Barndorff-Nielsen and Shephard (2002), Bandi and Russell (2008)). In my dataset $t_j = t_{j-1} + \delta_t$ with $\delta_t$ equal to 5 seconds. However, due to the market microstructure noise ($U_{t,j}$) there is a difference between the observed price process and the true price process, whose quadratic variation is the object of interest. The effect of market microstructure noise on the Realized variance estimates is illustrated by Barndorff-Nielsen and Shephard (2002), who show that these estimates are upward biased at high frequencies. Therefore, in practice 1- to 5-minute return data are used to mitigate the effect of the noise (see also Hansen and Lunde (2006)). I chose to use the 5-minute Realized variance in the empirical part of the thesis.

The Realized variance is defined as:

$$x_t = RV_t = \sum_{i=1}^{N} r_{t,i}^2. \tag{9}$$

Please note that here $r_{t,i}$ are 5-minute returns, so every 6.5-hour trading day is divided into $N = 78$ bins (time intervals).
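Equation (9) amounts to aggregating the 5-second returns into 5-minute returns and summing their squares. A minimal sketch, with a hypothetical function name and the assumption that any incomplete interval at the end of the day is dropped:

```python
import numpy as np

def realized_variance(r_5s, bins_per_interval=60):
    """Equation (9): sum of squared coarse returns.
    bins_per_interval = 60 turns 5-second returns into 5-minute returns."""
    r = np.asarray(r_5s, dtype=float)
    n_full = r.size // bins_per_interval          # number of complete intervals
    coarse = r[:n_full * bins_per_interval].reshape(n_full, bins_per_interval)
    r_coarse = coarse.sum(axis=1)                 # log returns are additive
    return float(np.sum(r_coarse**2))

# Hand-checkable example: a constant 5-second return of 0.01 gives
# 4680 / 60 = 78 five-minute returns of 0.6 each, so RV = 78 * 0.36.
rv_const = realized_variance(np.full(4680, 0.01))
```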

Sub-sampled Realized variance

If the Realized variance is computed from a subset of the data, then it is possible to average across many such estimators, each using a different subset. This is called sub-sampling. Theoretically this procedure is beneficial in reducing the upward bias of the Realized variance measure at high frequencies (see for instance Barndorff-Nielsen and Shephard (2002), Barndorff-Nielsen, Hansen, Lunde and Shephard (2008), Bandi (2008)). In this thesis the sub-sampled 5-minute Realized variance is constructed by shifting the starting time of the first 5-minute interval in 5-second increments. In this way I obtain 60 values of $RV_t$ (defined in eq. (9)) for each day and simply take the average of these Realized variances. This is how $RV_{sub}$ is obtained.
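The sub-sampling scheme described above can be sketched as follows. The text does not say how incomplete intervals created by shifting the grid are treated; this example simply drops them and applies no rescaling, which is an assumption, as is the function name.

```python
import numpy as np

def subsampled_rv(r_5s, bins_per_interval=60):
    """Average of the 5-minute Realized variances computed on grids whose
    start is shifted by 0, 1, ..., bins_per_interval - 1 five-second bins."""
    r = np.asarray(r_5s, dtype=float)
    rvs = []
    for offset in range(bins_per_interval):
        shifted = r[offset:]
        n_full = shifted.size // bins_per_interval   # incomplete tail is dropped
        coarse = shifted[:n_full * bins_per_interval]
        r_coarse = coarse.reshape(n_full, bins_per_interval).sum(axis=1)
        rvs.append(np.sum(r_coarse**2))
    return float(np.mean(rvs))

# Hand-checkable example with a constant 5-second return of 0.01:
# the unshifted grid has 78 intervals, each shifted grid has 77.
rv_sub_const = subsampled_rv(np.full(4680, 0.01))
```

Because the shifted grids contain one interval fewer than the unshifted grid, the plain average is slightly smaller than the unshifted Realized variance for the same data; whether and how the thesis corrects for this is not stated in this section.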

Realized Kernel

One of the concerns that arises when dealing with high-frequency data is the autocorrelation between the high-frequency returns. This motivated the construction of a measure that accounts for this serial correlation. Barndorff-Nielsen, Hansen, Lunde and Shephard (2008) proposed the Realized Kernel, which is defined as:

$$x_t = RK_t = K(X) = \sum_{h=-H}^{H} k\left(\frac{h}{H+1}\right)\gamma_h,$$

where $k(x)$ is a symmetric function (so that $k(x) = k(-x)$ for negative arguments), known as the Parzen kernel:

$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 0.5 \\ 2(1 - x)^3, & 0.5 \le x \le 1 \\ 0, & x > 1 \end{cases}$$

and

$$\gamma_h = \sum_{j=|h|+1}^{n} r_{t,j}\, r_{t,j-|h|},$$

where $n = \lfloor t/\delta_t \rfloor = 4680$, $\delta_t = 5$ seconds and $r_{t,j}$ are 5-second log returns. The authors show that as $n \to \infty$, $K(U) \to 0$ and $K(Y) \to [Y]$, which implies that we also have $K(X) \to [Y]$. Recall that $X$ is a noisy observation of $Y$ and $U$ denotes the market microstructure noise.
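The Realized Kernel formula above translates almost directly into code. The sketch below takes the bandwidth $H$ as given; the function names and the synthetic returns are assumptions of this example.

```python
import numpy as np

def parzen(x):
    """Parzen kernel, extended symmetrically to negative arguments."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x**2 + 6.0 * x**3
    if x <= 1.0:
        return 2.0 * (1.0 - x)**3
    return 0.0

def realized_kernel(r, H):
    """RK = sum_{h=-H}^{H} k(h / (H + 1)) * gamma_h, where
    gamma_h = sum_{j=|h|+1}^{n} r_j * r_{j-|h|}."""
    r = np.asarray(r, dtype=float)
    n = r.size
    rk = 0.0
    for h in range(-H, H + 1):
        lag = abs(h)
        gamma_h = np.dot(r[lag:], r[:n - lag])   # realized autocovariance
        rk += parzen(h / (H + 1)) * gamma_h
    return rk

# Synthetic 5-second returns for one day (illustrative only).
rng = np.random.default_rng(2)
r_day = rng.normal(0.0, 1e-4, size=4680)
rk = realized_kernel(r_day, H=10)
```

With $H = 0$ only the term $\gamma_0$ remains and the Realized Kernel coincides with the sum of squared 5-second returns.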

Now I briefly show how the optimal $H$ can be chosen. In the same paper, the authors argue that $H$ should be found using the formula:

$$H = c\,\xi^{4/5} n^{3/5}, \tag{10}$$

with $c = 3.5134$ for the Parzen kernel $k(x)$ detailed above and

$$\xi^2 = \frac{\omega^2}{RV_{sub}}.$$

Here $RV_{sub}$ is the sub-sampled Realized variance described above and $\omega^2$ can be found using the formula:

$$\omega^2 = \frac{1}{q} \sum_{i=1}^{q} \frac{RV_{dense}^{(i)}}{2 n_{(i)}}, \tag{11}$$

where $RV_{dense}^{(1)}, \dots, RV_{dense}^{(q)}$ are Realized variances calculated using every $q$-th observation (every 5th second, 10th second, 15th second and so on).
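Equations (10) and (11) can be sketched as below. Several details are assumptions of this example rather than statements from the thesis: the $q$ dense Realized variances are computed on the $q$ grids that use every $q$-th price and start at the 1st, 2nd, ..., $q$-th observation; $n_{(i)}$ is taken to be the number of returns on grid $i$; and the bandwidth is returned as a real number, leaving any rounding to an integer to the user.

```python
import numpy as np

def noise_variance(r, q):
    """Equation (11): average over q offset grids of RV_dense / (2 * n_i)."""
    r = np.asarray(r, dtype=float)
    p = np.concatenate(([0.0], np.cumsum(r)))   # log-price path relative to start
    estimates = []
    for i in range(q):
        ret = np.diff(p[i::q])                  # returns using every q-th price
        n_i = ret.size                          # assumed meaning of n_(i)
        estimates.append(np.sum(ret**2) / (2.0 * n_i))
    return float(np.mean(estimates))

def kernel_bandwidth(omega2, rv_sub, n, c=3.5134):
    """Equation (10): H = c * xi^(4/5) * n^(3/5) with xi^2 = omega2 / rv_sub."""
    xi2 = omega2 / rv_sub
    return c * xi2**(2.0 / 5.0) * n**(3.0 / 5.0)   # xi^(4/5) = (xi^2)^(2/5)

# Hand-checkable example: a constant return of 0.01 over 100 bins with q = 2
# gives returns of 0.02 on both grids, hence 0.0004 / 2 = 0.0002.
omega2_const = noise_variance(np.full(100, 0.01), q=2)
```

A larger noise-to-signal ratio $\xi^2$ leads to a larger bandwidth, so that more autocovariance terms are used when the data are noisier.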

It should be noted that within the Realized Kernel framework the highest-frequency returns can be used (here $r_{t,j}$ are 5-second returns) because, in contrast to the Realized variance, the Realized Kernel is not upward biased at high frequencies. Thus more information is used to construct this measure, which indicates that, theoretically, the Realized Kernel should approximate the quadratic variation of the log price process better than the Realized variance.

4 Data

In the empirical part of the thesis I apply the models to the Intel Corporation (INTC) and Microsoft stocks. The results obtained for the Microsoft stock are summarized in the appendix and serve as a robustness check. Therefore, in this section I provide plots and descriptive statistics only for the Intel Corporation (the corresponding results for Microsoft are shown in the appendix). The data sample for the INTC stock runs from the 2nd of June, 1990 to the 3rd of May, 2011. In total there are 3000 trading days (of 6.5 hours each), each with 4680 5-second log returns. I treat these 4680 × 3000 = 14,040,000 high-frequency log returns as known observations. In this comprehensive sample around 44% of the 5-second returns are zeros for the INTC stock. An additional 30 days of data (from 2011.05.04 to 2011.06.15) are also recorded in the data set and serve as out-of-sample observations for the evaluation of the forecasts. To illustrate the data, I plot the daily returns in Figure 1 and provide their descriptive statistics in Figure 2. The returns show two high-volatility periods, which correspond to the dot-com bubble (approximately between 2001 and 2003) and the credit crisis (between 2008 and 2010). From the descriptive statistics we see that the mean is close to zero and that the standard deviation is much smaller than 1. The return distribution has a kurtosis of 5.7 (> 3), which indicates fatter tails than those of the Normal distribution. This non-normality is mainly caused by volatility clustering, which can be clearly seen in Figure 1. Furthermore, the autocorrelations of the squared daily returns are plotted in Figure 3, where we observe that the autocorrelation is high and decays very slowly (long memory); in fact, it vanishes completely only after 500 lags.


Figure 1: Daily returns of the Intel Corporation stock

Figure 2: Descriptive statistics for the daily returns of the Intel Corporation stock


Figure 3: Autocorrelation for the squared daily Intel Corporation stock returns

5 Results

In this section I present the results obtained after applying the models detailed above. To apply the Multiplicative Component GARCH model I need forecasts of the daily volatility component. Therefore I first present the modelling and forecasting results obtained from the Realized GARCH model and only then the Multiplicative Component GARCH results.

5.1 Realized GARCH modelling results

Taking advantage of the available high-frequency data, I model the latent daily volatility with the Realized GARCH model. As emphasized in the section above, there are many different approaches to approximating the quadratic variation of the price process. I chose to apply three of those measures, namely the 5-minute Realized variance, the 5-minute sub-sampled Realized variance and the Realized Kernel. By comparing the results obtained, I then determine which of these measures is the most suitable for daily volatility modelling.

In Figure 4 all the mentioned realized measures obtained for the INTC stock are plotted together. All three measures clearly capture the two high-volatility periods within the sample, which correspond to the dot-com bubble and the credit crisis. However, comparing the measures with each other is quite difficult just from observing the graph, since all three seem to move very closely together. If we take a closer look (see Figure 5) we can notice that, especially during the low-volatility periods, the Realized Kernel tends, on average, to give higher volatility than the other measures. Here it should be noted that realized measures ignore the variation of overnight prices, which leads to a lower volatility level than that implied by the squared daily returns (Shephard and Sheppard 2010). Taking this into account, we can argue that a higher level of volatility obtained from a realized measure is desirable. In this context the Realized Kernel performs best.

In Table 1 I present the values of the log-likelihood for the Realized GARCH model (specified in equations (5) and (6)). The highest log-likelihood value is achieved when using the Realized Kernel (RK) measure. This is a further argument for choosing the Realized Kernel (RK) to model the latent daily volatility within the Realized GARCH framework.

Table 1: Log-likelihood for log-linear Realized GARCH(1,1)

Realized measure:              RV           RVsub        RK
Value of the log-likelihood:   11485.5891   11656.9490   12510.1461


Figure 4: Realized measures

Figure 5: Realized measures (closer look)


Table 2: Obtained results for log-linear Realized GARCH(1,1)

       Parameters                     Standard errors              Student's t                    p-value
       RV        RV sub    RK         RV       RV sub   RK         RV        RV sub    RK         RV       RV sub   RK
ω      -0.1191   -0.1490    0.2013    0.0824   0.0933   0.1234     -1.4447   -1.5973    1.6306    0.1486   0.1103   0.1031
β       0.6271    0.5768    0.5361    0.0210   0.0221   0.0209     29.8329   26.0514   25.6435    0.0000   0.0000   0.0000
γ       0.3554    0.4003    0.5000    0.0207   0.0220   0.0272     17.1356   18.1523   18.3764    0.0000   0.0000   0.0000
ξ      -0.0807   -0.0453   -0.7056    0.2151   0.2169   0.2181     -0.3753   -0.2091   -3.2353    0.7075   0.8344   0.0012
ϕ       0.9971    1.0050    0.8898    0.0269   0.0271   0.0273     37.0600   37.0561   32.5684    0.0000   0.0000   0.0000
τ1     -0.0188   -0.0167   -0.0219    0.0078   0.0073   0.0055     -2.4163   -2.2805   -3.9772    0.0157   0.0226   0.0001
τ2      0.0842    0.0809    0.0505    0.0055   0.0052   0.0038     15.1627   15.3759   13.3606    0.0000   0.0000   0.0000
π       0.9815    0.9790    0.9810


From the results in Table 2 some important conclusions can be drawn. Almost all parameters of the model are highly significant with every realized measure. The standard ($RV$) and sub-sampled ($RV_{sub}$) 5 minute Realized variances give very close results, so the sub-sampling procedure does not bring a material empirical improvement in this case. With the Realized Kernel ($RK$), however, a few differences can be noticed. First, the leverage effect (captured by the parameter $\tau_1$) becomes even more significant. Second, the parameter $\xi$ becomes significant with this realized measure, whereas it is insignificant with the other two. Third, when $RK$ is applied the parameter $\beta$ becomes considerably smaller and $\gamma$ larger. This implies that less weight is given to $h_{t-1}$ and more to $x_{t-1}$, which suggests that the Realized Kernel has more explanatory power for the latent daily volatility $h_t$.

The parameter $\pi = \beta + \varphi\gamma$ must lie in the interval $(-1, 1)$ for the model to be stationary (see Hansen, Huang and Shek (2011)). In fact, $\pi$ is the largest eigenvalue of the matrix $A$ in equation (8). In Table 2 we see that this parameter is close to 1 no matter which realized measure is used, which indicates that the autocorrelation of the squared returns decays slowly. Recall that in Chapter 4 it was shown that the squared daily returns of Intel Corporation have this long memory property. Thus it can be concluded that the Realized GARCH model captures the long memory property.
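The persistence values in the last row of Table 2 can be checked directly from the point estimates. The following is a minimal sketch in Python; the dictionary layout and the function name `persistence` are illustrative choices, not part of the original estimation code, and small differences in the fourth decimal can arise because the inputs are rounded.

```python
# Point estimates copied from Table 2 (columns RV, RV sub, RK).
estimates = {
    "RV":     {"beta": 0.6271, "gamma": 0.3554, "phi": 0.9971},
    "RV sub": {"beta": 0.5768, "gamma": 0.4003, "phi": 1.0050},
    "RK":     {"beta": 0.5361, "gamma": 0.5000, "phi": 0.8898},
}


def persistence(beta, gamma, phi):
    """Return pi = beta + phi * gamma, the persistence of log h_t."""
    return beta + phi * gamma


pi_values = {name: persistence(**p) for name, p in estimates.items()}

for name, pi in pi_values.items():
    # Stationarity requires |pi| < 1.
    print(f"{name}: pi = {pi:.4f}, stationary: {abs(pi) < 1}")
```

All three values are below one but very close to it, which is what the discussion of slowly decaying autocorrelation relies on.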

5.2 Realized GARCH forecasting results

In this part I determine which of the realized measures gives the best forecasting results within the Realized GARCH framework. As mentioned above, this model induces a VARMA(1,1) structure for multi-period forecasts (detailed in equation (8)). The benefit of this structure is that we can forecast not only the daily volatility ($h_t$) several periods ahead, but also the realized measure $x_t$. Therefore, in this section forecasts over different horizons, namely 1, 2, ..., 30 days ahead, are obtained for both $x_t$ and $h_t$ and then compared. In more detail, I use a data sample of 3000 days (from the 2nd of June, 1990 till the 3rd of May, 2011) to forecast the daily volatility and the realized measure 1, 2, ..., 30 days ahead.

The true volatility is not observable. Thus, when we want to evaluate variance forecasts, we face the problem of choosing the right proxy for the true volatility. Volatility forecasts are often compared to squared returns, but these carry little information about the variance. Another solution is to use a realized measure as the proxy. I chose to use three proxies, namely the 5 minute Realized variance $RV_{out-sample}$, the sub-sampled 5 minute Realized variance $RV_{sub-out-sample}$ and the Realized Kernel $RK_{out-sample}$. The forecasting procedure can be summarized as follows:

(i) By applying 3 different realized measures I obtain 3 different forecasts for $x_t$ and 3 for $h_t$.

(ii) In order to determine which measure gives the best forecasts, I calculate 3 out-of-sample realized measures, which serve as proxies for the true volatility.

(iii) Finally, the comparison takes place. Forecasts of $x_t$ and $h_t$ obtained with the Realized variance are compared to $RV_{out-sample}$, forecasts obtained with the sub-sampled Realized variance are compared to $RV_{sub-out-sample}$, and forecasts obtained with the Realized Kernel are compared to $RK_{out-sample}$.

The results are summarized in Table 3. Here we see that once again the Realized Kernel performs best and gives the most accurate forecasts.

Table 3: Out-of-sample forecast results for daily volatility and the realized measure over a 30 day horizon

Realized measure:             RV                      RV sub                  RK
Forecasts for:                h_t        x_t          h_t        x_t          h_t        x_t
RMSE:                         1.67E-04   1.93E-04     1.24E-04   1.02E-04     6.52E-05   7.66E-05
Compared to out of sample:    RV                      RV sub                  RK

Note: in Table 3 bold numbers mark the most accurate forecasts.

All things considered, the Realized Kernel is the most suitable realized measure for modelling daily volatility and obtaining the corresponding forecasts.


5.3 Modelling results for Multiplicative Component GARCH

In this section I present the results obtained after applying the Multiplicative Component GARCH model (detailed in equations (1), (2) and (3)) to the INTC stock. The last 30 days of observations are used for this model. Here $h_t$ is modelled by the Realized GARCH and $r_{t,i}$ are 15 minute returns. In total, there are 30*26=780 observations of $r_{t,i}$, and the diurnal pattern $s_i$ is calculated using these returns. The results of the Multiplicative Component GARCH model are summarized in Table 4. In this table we can see that only the parameter $\beta$ is significant, and it is very high. These results lead to the conclusion that $z_{t,i}$ does not carry any explanatory information for the stochastic component $q_{t,i}$. In other words, modelling the daily volatility component with Realized GARCH leaves no ARCH effects in the normalized 15 minute returns $z_{t,i}$. (The formal tests for ARCH effects in the squared normalized returns $z^2_{t,i}$ are given in Chapter 6.)

Table 4: GARCH(1,1) results for stochastic component q_{t,i}

      Parameter value   Standard error   Student's t   p-value
ω     0.006726          0.013649          0.492762     0.626165
α     0.015688          0.009767          1.606262     0.119850
β     0.978239          0.020707         47.242246     0.000000

Value of the log-likelihood: 1102.1942

In Figure 6 the diurnal pattern $s_i$ is plotted. It shows the variance of the returns in each of the 26 15 minute bins. From this graph it can be clearly seen that at the beginning of each trading day there is a substantial increase in volatility. At the end of the day the variance also increases, but considerably less than in the morning. This U-shaped intra-day seasonality of volatility is documented in a number of articles, see for instance Andersen and Bollerslev (1997) and Andersen et al. (2001).
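To make the calculation of the diurnal pattern concrete, the sketch below estimates $s_i$ as the average over days of the squared bin return divided by the daily variance. This is the estimator of Engle and Sokalska (2012) and is assumed here to match equation (2); the function name, the synthetic U-shaped pattern and the sample size of 500 days (chosen only to keep the illustration stable) are not from the thesis.

```python
import numpy as np


def diurnal_pattern(returns, daily_var):
    """Estimate s_i = mean over days t of r_{t,i}^2 / h_t.

    returns   : array of shape (T, N), intra-day returns r_{t,i}
    daily_var : array of shape (T,), daily variance component h_t
    """
    returns = np.asarray(returns, dtype=float)
    daily_var = np.asarray(daily_var, dtype=float)
    return np.mean(returns**2 / daily_var[:, None], axis=0)


# Synthetic illustration (assumed data, not the INTC sample).
rng = np.random.default_rng(0)
T, N = 500, 26                      # days, 15 minute bins per day
h = np.full(T, 4e-4)                # constant daily variance for simplicity
bins = np.arange(N)
true_s = 1.0 + 2.0 * ((bins - (N - 1) / 2) / ((N - 1) / 2)) ** 2  # U-shape
true_s /= true_s.sum()              # bin shares of the daily variance

r = np.sqrt(h[:, None] * true_s[None, :]) * rng.standard_normal((T, N))
s_hat = diurnal_pattern(r, h)

# Rescaled so that a flat pattern would equal 1 in every bin.
print(np.round(s_hat * N, 2))
```

With real data the morning peak would typically be higher than the afternoon one, unlike the symmetric shape simulated here.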

Figure 6: Diurnal pattern

For a better understanding and illustration of the model I plot the returns, the daily variance component $h_t$, the intra-day component $q_{t,i}$, the diurnal pattern $s_i$, the composite variance $h_{t,i}$ and the composite variance without the stochastic intra-day component, $g_{t,i}$ (see Figure 7). Here by composite variance I mean:

$$h_{t,i} = \sqrt{h_t s_i q_{t,i}} \tag{12}$$

whereas $g_{t,i}$ does not contain the stochastic component $q_{t,i}$ and is thus equal to:

$$g_{t,i} = \sqrt{h_t s_i}. \tag{13}$$

Figure 7 shows one important feature: the composite variance $h_{t,i}$ and the composite variance without the stochastic intra-day component $g_{t,i}$ move very closely together. Moreover, all the peaks of the diurnal pattern $s_i$ remain clearly visible in the composite variance graph, which indicates that the intra-day seasonality is of particular importance.

5.4 Forecasting results for Multiplicative Component GARCH

Figure 7: Volatility components

In this part I present the results obtained for the one-step-ahead forecasts of the stochastic intra-day volatility component $q_{t,i}$. That is, $q_{t,i}$ is predicted sequentially for each 15 minute bin of the following day. Assuming that today is 2011.05.03, I forecast the volatility for each 15 minute time interval of the next day, that is 2011.05.04. The procedure can be summarized as follows:

(i) I take the daily variance component $h_t$ modelled by Realized GARCH.

(ii) Then $s_i$ is calculated as detailed above.

(iii) Having this information, I model the stochastic intra-day volatility component $q_{t,i}$ by GARCH(1,1) and forecast it one period ahead, $q_{t+1,1}$. This way I obtain the prediction for the first 15 minute bin of 2011.05.04.

(iv) Let us assume that the first 15 minutes of the day 2011.05.04 have passed, that is, $r_{t+1,1}$ is now known. Using this additional information and the forecast of the daily volatility component $h_{t+1}$ (see equation (8)), $q_{t+1,2}$ can now be predicted.

This procedure is then repeated until $q_{t+1,i}$ is found for all $i = 1, ..., N$. Here $N$ is the number of bins in one trading day and is equal to 26, because there are 26 intervals of 15 minutes in one trading day.
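The sequential procedure in steps (iii) and (iv) can be sketched as a simple loop. The sketch assumes that equation (3) is the standard GARCH(1,1) recursion $q_{t,i} = \omega + \alpha z^2_{t,i-1} + \beta q_{t,i-1}$ with $z_{t,i} = r_{t,i}/\sqrt{h_t s_i}$; the function name and the placeholder inputs are hypothetical, while the coefficients are the estimates from Table 4.

```python
# GARCH(1,1) estimates for q_{t,i} taken from Table 4.
TABLE4 = {"omega": 0.006726, "alpha": 0.015688, "beta": 0.978239}


def forecast_q_path(z_next_day, q_last, z_last, omega, alpha, beta):
    """Sequential one-step-ahead forecasts of q for every bin of the next day.

    z_next_day : normalized returns z_{t+1,i}, revealed one bin at a time
    q_last     : q for the last bin of today
    z_last     : normalized return of the last bin of today
    """
    forecasts = []
    q_prev, z_prev = q_last, z_last
    for z in z_next_day:
        # Forecast for the current bin uses only information up to the previous bin.
        q_f = omega + alpha * z_prev**2 + beta * q_prev
        forecasts.append(q_f)
        # After the bin has passed, its normalized return becomes known.
        q_prev, z_prev = q_f, z
    return forecasts


# Placeholder inputs (assumed): every normalized return equals 1.
forecasts = forecast_q_path(z_next_day=[1.0] * 26, q_last=1.0, z_last=1.0, **TABLE4)
print(len(forecasts), round(forecasts[0], 6))
```

Because $\beta$ is close to one and $\alpha$ is small, the forecast path moves slowly, so the predicted level of $q_{t,i}$ changes little within the day.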


When we want to evaluate volatility forecasts, we again face the problem of choosing the right proxy for the true volatility. Here the true volatility for every bin of each day is denoted by $\sigma^2_{t,i}$. To approximate $\sigma^2_{t,i}$, 3 volatility proxies are employed, namely $r^2_{t,i}$, $RV_{sub,t,i}$ and $RK_{t,i}$. In more detail, $r^2_{t,i}$ are squared 15 minute returns, $RV_{sub,t,i}$ is a sub-sampled 1 minute Realized variance for every 15 minute time interval and $RK_{t,i}$ is a 15 minute Realized Kernel (a detailed explanation of the calculation of this measure can be found in the appendix).

I compare the composite intra-day variance $h_{t,i}$ and the composite intra-day variance without the stochastic component $g_{t,i}$ (detailed in (12) and (13)) to the true volatility proxies. This way I can determine whether including the predicted stochastic intra-day component $q_{t,i}$ makes the forecasts more accurate or not. For this purpose the root mean squared error (RMSE) is used. It is calculated as follows:

(i) $RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\sigma^2_{t,i} - h_{t,i}\right)^2}$

(ii) $RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\sigma^2_{t,i} - g_{t,i}\right)^2}$

where $\sigma^2_{t,i}$ is one of the three mentioned volatility proxies.
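As an illustration of how the two RMSE values are compared, here is a minimal sketch. The function name and the three-bin toy numbers are assumptions made for the example and are not results from the thesis.

```python
import math


def rmse(proxy, forecast):
    """Root mean squared error between a volatility proxy and a forecast."""
    n = len(proxy)
    return math.sqrt(sum((p - f) ** 2 for p, f in zip(proxy, forecast)) / n)


# Toy example with three bins (assumed values).
sigma2 = [1.0, 2.0, 3.0]   # volatility proxy per bin
h = [1.0, 2.0, 5.0]        # forecast with the stochastic component
g = [1.0, 2.0, 4.0]        # forecast without the stochastic component

rmse_h = rmse(sigma2, h)
rmse_g = rmse(sigma2, g)
print(rmse_h, rmse_g, "g is better" if rmse_g < rmse_h else "h is better")
```

In the thesis the same comparison is made with $N = 26$ bins for each of the three proxies.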

The results are summarized in Table 5. In order to check the robustness of the results I take a grid of dates: I assume that 'today' is 2008.12.12, 2009.09.30, 2010.07.19 or 2011.05.03 and obtain predictions for the following days, 2008.12.13, 2009.10.01, 2010.07.20 and 2011.05.04 respectively. In Table 5 bold numbers indicate which forecast gives better results, the one with the stochastic component ($h_{t,i}$) or the one without it ($g_{t,i}$). Asterisks mark forecasts that differ significantly according to the Diebold-Mariano (DM) test: (*) at the 10%, (**) at the 5% and (***) at the 1% significance level. The test is summarized in the appendix along with the values of the DM statistic. At first glance the forecasts give contradicting results for different time periods: including the stochastic intra-day component $q_{t,i}$ does not always improve forecast accuracy. On the other hand, the Diebold-Mariano test indicates that only three pairs of predictions are significantly different, and all three favour the forecasts obtained with the stochastic intra-day component. Another noticeable feature is that the predictions are more accurate when compared to $RV_{sub,t,i}$ and $RK_{t,i}$ rather than to $r^2_{t,i}$.


Table 5: RMSE comparison for composite variance

RMSE
Date:                  2008.12.12                  2009.09.30                2010.07.19               2011.05.03
Forecasts of:          h_{t,i}        g_{t,i}      h_{t,i}      g_{t,i}      h_{t,i}      g_{t,i}     h_{t,i}     g_{t,i}
Compared to proxies:
r^2_{t,i}              2.148E-09***   2.377E-09    1.55E-10     1.83E-10     8.3E-11***   8.9E-11     1.96E-10    1.52E-10
RV_{sub,t,i}           1.103E-09      1.084E-09    6.4E-11      8.1E-11      1.5E-11      1.6E-11     1E-10       5.2E-11
RK_{t,i}               8.47E-10       7.98E-10     7.6E-11***   1.18E-10     1.2E-11      1.2E-11     5E-11       4.3E-11

Along with the preceding analysis I also employ the approach used by Engle and Sokalska (2012). That is, I use the measure $MSE_{t,i} = (z^2_{t,i} - q^f_{t,i})^2$ to evaluate the forecasts (here $q^f_{t,i}$ denotes the predicted intra-day component). I obtain comparable results (see Table 5.4). However, the drawback of this approach is that it is more difficult to obtain other true volatility proxies in place of $z^2_{t,i}$ (recall that $z^2_{t,i}$ are squared normalized 15 minute returns). This motivated me to focus more on the evaluation of the composite intra-day variance forecasts.

Table 5.4: MSE comparison for the intra-day component

MSE
Date:                  2008.12.12                   2009.09.30                 2010.07.19                 2011.05.03
Forecasts of:          q_{t,i}      q_{t,i} = 1     q_{t,i}    q_{t,i} = 1     q_{t,i}    q_{t,i} = 1     q_{t,i}    q_{t,i} = 1
Compared to proxies:
z^2_{t,i}              2.0717***    2.2256          4.5305     5.4935          1.6432     1.6926          2.6495     2.9354

Note: (*) marks significantly different forecasts at the 10%, (**) at the 5% and (***) at the 1% significance level according to the Diebold-Mariano test.

Another possible way to compare the accuracy of the predictions is to calculate the out-of-sample log-likelihood. Recall that the intra-day stochastic component $q_{t,i}$ is modelled by GARCH(1,1). For evaluation purposes the first term of the likelihood, $\frac{1}{2}\log(2\pi)$, can be ignored. Consequently, we get the forecasting measure $LIK_{t,i}$:

$$LIK_{t,i} = -\log(q^f_{t,i}) - \frac{z^2_{t,i}}{q^f_{t,i}},$$

where $q^f_{t,i}$ denotes the predicted stochastic intra-day component. The results obtained with this measure are summarized in Table 6. Similarly, I construct LIK measures for $h_{t,i}$ and $g_{t,i}$:

(i) $LIK = -\log(h_{t,i}) - \frac{r^2_{t,i}}{h_{t,i}}$

(ii) $LIK = -\log(g_{t,i}) - \frac{r^2_{t,i}}{g_{t,i}}$

and summarize the results in Table 7. The results in Tables 6 and 7 are consistent with each other and indicate that including the stochastic intra-day component is beneficial for the forecasting accuracy.
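A minimal sketch of the LIK measure follows. Summing the per-bin values over the day is an assumption about how the table entries were aggregated, and the function name and toy inputs are illustrative only; a higher (less negative) value indicates the better forecast.

```python
import math


def lik(forecast_var, squared_obs):
    """Sum over bins of -log(f_i) - y_i / f_i.

    forecast_var : predicted variances (q^f, h or g), all strictly positive
    squared_obs  : matching squared returns (z^2 or r^2)
    """
    return sum(-math.log(f) - y / f for f, y in zip(forecast_var, squared_obs))


# Toy comparison (assumed values): forecast q against the benchmark q = 1.
z2 = [0.5, 2.0, 1.0]
lik_q = lik([0.6, 1.8, 1.0], z2)
lik_one = lik([1.0, 1.0, 1.0], z2)
print(lik_q, lik_one)
```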

Table 6: LIK comparison for the intra-day component

LIK
Date:                  2008.12.12                2009.09.30                2010.07.19                2011.05.03
Forecasts of:          q_{t,i}    q_{t,i} = 1    q_{t,i}    q_{t,i} = 1    q_{t,i}    q_{t,i} = 1    q_{t,i}    q_{t,i} = 1
Compared to proxies:
z^2_{t,i}              -31.28     -33.18         -37.09     -44.48         -25.95     -26.80         -35.78     -38.34

Table 7: LIK comparison for composite variance

LIK
Date:                  2008.12.12            2009.09.30            2010.07.19            2011.05.03
Forecasts of:          h_{t,i}    g_{t,i}    h_{t,i}    g_{t,i}    h_{t,i}    g_{t,i}    h_{t,i}    g_{t,i}
Compared to:
r^2_{t,i}              236.42     235.05     277.94     270.99     288.62     286.90     275.84     271.97

Overall, the results favour including the stochastic intra-day component $q_{t,i}$ to achieve more accurate predictions, but only slightly. Thus, for further investigation, I add one more forecast evaluation measure, MME(u), used by McMillan and Garcia (2009). The idea of this measure is that it weights under-predictions of the volatility more than over-predictions. It is defined as:

$$MME(u) = \frac{1}{h}\left[\sum_{i=1}^{u}\left|h_{t,i} - \sigma^2_{t,i}\right| + \sum_{i=1}^{o}\sqrt{\left|h_{t,i} - \sigma^2_{t,i}\right|}\right] \tag{14}$$

where $u$ is the number of under-predictions, $o$ the number of over-predictions and $h$ the number of forecasts. When we think about risk in general, under-estimating it can cause more trouble than over-estimating it. Thus it could be argued that MME(u) is the most important forecast evaluation measure discussed in this thesis. The results can be found in Table 8, along with a graph of the predictions of $q_{t,i}$, see Figure 8.

Table 8: MME(u) comparison for composite variance

MME(u)
Date:                  2008.12.12                 2009.09.30                 2010.07.19                 2011.05.03
Forecasts of:          h_{t,i}     g_{t,i}        h_{t,i}        g_{t,i}     h_{t,i}     g_{t,i}        h_{t,i}     g_{t,i}
Compared to proxies:
r^2_{t,i}              2.11E-03    2.07E-03***    1.20E-03       1.52E-03    8.22E-04    7.33E-04***    1.08E-03    1.36E-03
RV_{sub,t,i}           2.65E-03    2.49E-03       1.58E-03       1.89E-03    8.19E-04    6.13E-04       1.10E-03    1.61E-03
RK_{t,i}               3.04E-03    2.85E-03       2.13E-03***    2.59E-03    9.96E-04    8.00E-04       1.29E-03    1.95E-03

Note: (*) marks significantly different forecasts at the 10%, (**) at the 5% and (***) at the 1% significance level according to the Diebold-Mariano test.

After investigating Figure 8 and Table 8 a simple conclusion can be drawn. If $q_{t,i}$ is predicted to be below 1, then $h_{t,i}$ is biased downward compared to $g_{t,i}$ and consequently MME(u) gives worse results for $h_{t,i}$. This motivates the exclusion of the stochastic intra-day component $q_{t,i}$ in that case, and vice versa if $q_{t,i}$ is predicted to be above 1. Overall, this illustrates that the level of the predicted stochastic intra-day component is more important than its variation within the day.


Figure 8: Forecasts for qt,i, ∀ i = 1, ..., N

6 Possible extensions

6.1 Asymmetries

In this part I check whether a leverage effect is present in the daily returns and in the intra-day returns. As mentioned in section 3.2.1, the realized measure equation (6) induces an EGARCH-type structure in the GARCH equation (5) within the Realized GARCH framework. This motivates applying the EGARCH model to the daily returns and investigating the significance of the leverage effect. The results for this model are given in Table 9. The leverage effect is clearly important for the daily returns, as indicated by the highly significant coefficient $\alpha_2$. In order to illustrate the leverage effect more vividly I use the News Impact Curve (NIC), which shows how positive and negative shocks to the returns affect future volatility (see Figure 9). The NIC is very asymmetric, which implies that negative shocks to the returns have a stronger effect on volatility than positive shocks. Of course, these findings are expected.
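The asymmetry of the News Impact Curve can be quantified from the EGARCH coefficients. The sketch below evaluates the shock-dependent part of the EGARCH equation, $\alpha_1 |z| + \alpha_2 z$, for a standardized shock of plus and minus one, using the estimates reported in Tables 9 and 10. The function name is an illustrative choice, and treating this term as the news impact (holding the lagged log variance fixed) is a simplifying assumption.

```python
def egarch_news_impact(z, alpha1, alpha2):
    """Contribution of a standardized shock z to next period's log variance."""
    return alpha1 * abs(z) + alpha2 * z


# Estimates from Table 9 (daily returns) and Table 10 (normalized 15 minute returns).
daily = {"alpha1": 0.106276, "alpha2": -0.041434}
intraday = {"alpha1": 0.135951, "alpha2": -0.000207}

for label, coef in (("daily", daily), ("intra-day", intraday)):
    neg = egarch_news_impact(-1.0, **coef)
    pos = egarch_news_impact(+1.0, **coef)
    print(f"{label}: negative shock {neg:.6f}, positive shock {pos:.6f}, ratio {neg / pos:.2f}")
```

For the daily returns a negative unit shock raises the log variance by more than twice as much as a positive one, while for the intra-day returns the two responses are almost identical.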

On the other hand, if we investigate the standardized 15 minute returns $z_{t,i}$ and their properties (see Table 10 and Figure 10), we observe that the leverage effect is not present. This is indicated by the insignificant coefficient $\alpha_2$ and a symmetric News Impact Curve.

Table 9: EGARCH model results for the daily returns of Intel Corporation

Dependent Variable: r_t

$\log h_t = \alpha_0 + \alpha_1 \left| r_{t-1}/\sqrt{h_{t-1}} \right| + \alpha_2 \left( r_{t-1}/\sqrt{h_{t-1}} \right) + \alpha_3 \log h_{t-1}$

Variable   Coefficient   Std. Error   z-Statistic   Prob.
α0         -0.135701     0.017482      -7.762547    0.0000
α1          0.106276     0.011581       9.177085    0.0000
α2         -0.041434     0.007356      -5.633033    0.0000
α3          0.993618     0.001571     632.5466      0.0000

Figure 9:

Table 10: EGARCH model results for the normalized intra-day 15 minute returns of INTC

Dependent Variable: z_{t,i}

$\log q_{t,i} = \alpha_0 + \alpha_1 \left| z_{t,i-1}/\sqrt{q_{t,i-1}} \right| + \alpha_2 \left( z_{t,i-1}/\sqrt{q_{t,i-1}} \right) + \alpha_3 \log q_{t,i-1}$

Variable   Coefficient   Std. Error   z-Statistic   Prob.
α0         -0.106300     0.033075     -3.213912     0.0013
α1          0.135951     0.042946      3.165588     0.0015
α2         -0.000207     0.023703     -0.008713     0.9930
α3          0.867883     0.059561     14.57141      0.0000

Figure 10:

All things considered, it can be concluded that a longer period of time needs to be considered for the leverage effect to become significant. In other words, the leverage effect takes longer to materialize. This is the reason why in this thesis the leverage effect is accounted for in the daily volatility modelling but not on the intra-day basis (where the stochastic component is modelled by a simple GARCH(1,1)).

6.2 Long memory

Another property that needs to be accounted for is long memory. In section 5.3 I find that the normalized 15 minute returns $z_{t,i}$ do not carry any explanatory information for the intra-day stochastic component $q_{t,i}$; in other words, there are no ARCH effects on the intra-day basis. This can be seen from the graphs of $r_t$ and $z_{t,i}$ in Figures 1 and 11. The graph of $r_t$ shows that volatility clustering plays an important role and needs to be accounted for, whereas $z_{t,i}$ looks similar to white noise.

Figure 11:

Still, this property needs to be tested thoroughly. Again, for better illustration and comparison I investigate ARCH effects in the residuals of the squared daily returns $r^2_t$ and of the squared normalized intra-day returns $z^2_{t,i}$. First, the Breusch-Godfrey Serial Correlation LM Test is performed. The results are shown in Tables 11 and 13. Here the null hypothesis states that there is no serial correlation. It is clearly rejected for the residuals of the squared daily returns but not for those of the squared normalized intra-day returns, which means that there are ARCH effects in $r^2_t$ but not in $z^2_{t,i}$. This conclusion is supported by the Q-statistic as well, see Tables 12 and 14. Even though the Q-statistic indicates some autocorrelation in the $z^2_{t,i}$ residuals, it is not highly significant, as the autocorrelation (ACF) graph in Figure 12 clearly illustrates. Conversely, the ACF plot for $r^2_t$ (Figure 3) shows that autocorrelation is of particular importance and very persistent. In fact, the serial correlation of the squared daily returns dies out completely only after 500 lags, which implies very long memory.

All things considered, we arrive at another conclusion: intra-day returns that have been normalized by the daily variance and the diurnal volatility component no longer exhibit serial correlation, and no GARCH-type model needs to be applied to them. Thus ARCH effects and the long memory property should be accounted for on the daily basis (which in this thesis is done by Realized GARCH) but not within the day.
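For reference, the Q-statistics used in this comparison can be computed with a few lines of code. The sketch below implements the Ljung-Box form, $Q(m) = n(n+2)\sum_{k=1}^{m} \hat\rho_k^2/(n-k)$; it is assumed, not confirmed by the text, that this is the variant reported in Tables 12 and 14. The function name and the synthetic input are illustrative, and p-values (from a chi-square distribution with $m$ degrees of freedom) are left out to keep the example dependent on NumPy only.

```python
import numpy as np


def ljung_box_q(x, max_lag):
    """Return the Ljung-Box Q-statistics for lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    denom = np.sum(xc**2)
    q_stats = []
    cumulative = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = np.sum(xc[k:] * xc[:-k]) / denom
        cumulative += rho_k**2 / (n - k)
        q_stats.append(n * (n + 2) * cumulative)
    return np.array(q_stats)


# Synthetic illustration (assumed data): squared white noise, 780 observations.
rng = np.random.default_rng(1)
z2 = rng.standard_normal(780) ** 2
q = ljung_box_q(z2, max_lag=5)
print(np.round(q, 4))
```

Applied to a series with strong volatility clustering, the statistics grow quickly with the lag, as in Table 12; for a series without ARCH effects they stay small, as in Table 14.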

Table 11: Serial Correlation test for the squared daily returns

Breusch-Godfrey Serial Correlation LM Test:
F-statistic       13.71320    Prob. F(50,2949)        0.0000
Obs*R-squared     565.9347    Prob. Chi-Square(50)    0.0000

Dependent Variable: Residuals of r^2_t
Method: Least Squares
Sample: 1 3000

Table 12: Q-statistic for the squared daily returns

Q-statistic   Prob.
123.60        0.000
274.05        0.000
409.66        0.000
486.85        0.000
621.53        0.000
727.85        0.000
824.03        0.000
901.33        0.000
1060.0        0.000
1156.9        0.000
1279.3        0.000
1331.5        0.000
1433.7        0.000
1503.7        0.000
1618.0        0.000


Table 13: Serial Correlation test for the squared normalized intra-day returns

Breusch-Godfrey Serial Correlation LM Test:
F-statistic       1.674308    Prob. F(5,774)         0.1383
Obs*R-squared     8.346163    Prob. Chi-Square(5)    0.1382

Dependent Variable: Residuals of z^2_{t,i}
Method: Least Squares
Sample: 1 780

Table 14: Q-statistic for the squared normalized 15 minute returns

Q-statistic Prob.

6.8269 0.009

6.9267 0.031

7.4979 0.058

7.5141 0.111

7.9985 0.156


Figure 12: Autocorrelation for the squared normalized intra-day returns

7 Limitations

In this chapter I simulate daily returns from the Realized GARCH model (detailed in section 3.2.1) and explore the possible limitations of this model. As true parameters I use the values obtained from the model with the Realized Kernel (see the RK column in Table 2), that is: $\omega = 0.2013$; $\beta = 0.5361$; $\gamma = 0.5$; $\xi = -0.7056$; $\varphi = 0.8898$; $\tau_1 = -0.0219$; $\tau_2 = 0.0505$; $\sigma_u = 0.09$. The simulated returns are plotted in Figure 13. This graph raises the concern that the simulated returns show a lower degree of volatility clustering than the real data (compare with Figure 1). The autocorrelations of the squared simulated returns are plotted in Figure 14. They are of lower magnitude and decay faster than those observed for the squared returns of Intel Corporation (see Figure 3).

The serial correlation of the simulated squared returns is strongly influenced by the parameters $\tau_1$, $\tau_2$ and $\sigma_u$. If these are relatively small, then the variance of $\log(h_t)$ and of $\log(r^2_t)$ is also small (recall that $\log(r^2_t) = \log(h_t) + \log(\varepsilon^2_t)$), and this in turn implies small autocorrelations in $r^2_t$. In the model with the Realized Kernel the estimates of these three parameters are small, which leads to a low degree of volatility clustering. Thus, for simulation purposes, higher values of these parameters should be chosen. For instance, I implemented $\tau_1 = -0.25$, $\tau_2 = 0.1$ and $\sigma_u = 0.1$, which leads to an autocorrelation similar to that observed in the data, only with a faster decay, see Figure 15.
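The simulation can be sketched as follows. The sketch assumes that equations (5) and (6) are the standard log-linear Realized GARCH(1,1) specification of Hansen, Huang and Shek, namely $\log h_t = \omega + \beta \log h_{t-1} + \gamma \log x_{t-1}$ and $\log x_t = \xi + \varphi \log h_t + \tau_1 z_t + \tau_2 (z_t^2 - 1) + u_t$ with Gaussian $z_t$ and $u_t$. The function name, the seed and the choice to start the recursion at the unconditional mean of $\log h_t$ are illustrative assumptions.

```python
import numpy as np

# Parameter values quoted in the text (Realized Kernel column of Table 2).
RK_PARAMS = dict(omega=0.2013, beta=0.5361, gamma=0.5, xi=-0.7056,
                 phi=0.8898, tau1=-0.0219, tau2=0.0505, sigma_u=0.09)


def simulate_realized_garch(n, omega, beta, gamma, xi, phi, tau1, tau2,
                            sigma_u, seed=0):
    """Simulate n daily returns and the path of log h_t."""
    rng = np.random.default_rng(seed)
    pi = beta + phi * gamma
    # Start at the unconditional mean of log h_t (requires |pi| < 1).
    log_h = (omega + gamma * xi) / (1.0 - pi)
    returns = np.empty(n)
    log_h_path = np.empty(n)
    for t in range(n):
        z = rng.standard_normal()
        u = sigma_u * rng.standard_normal()
        returns[t] = np.exp(0.5 * log_h) * z
        log_h_path[t] = log_h
        # Measurement equation for the realized measure.
        log_x = xi + phi * log_h + tau1 * z + tau2 * (z**2 - 1.0) + u
        # GARCH equation for the next day's variance.
        log_h = omega + beta * log_h + gamma * log_x
    return returns, log_h_path


returns, log_h_path = simulate_realized_garch(3000, **RK_PARAMS)
print(returns.std(), log_h_path.mean())
```

Re-running the function with the larger values of $\tau_1$, $\tau_2$ and $\sigma_u$ mentioned above increases the variance of $\log h_t$ and hence the autocorrelation of the squared simulated returns.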

Overall, two important features should be noted. First, the values of the parameters $\tau_1$, $\tau_2$ and $\sigma_u$ obtained from the Realized GARCH model should be treated with care and might need adjustment to account for the higher level of volatility clustering observed in the data. Second, even with the adjusted parameters the autocorrelation still decays faster than in the data. Initially, after observing the results of the Realized GARCH model, it was argued in section 5.1 that this model captures the long memory property, as indicated by the value of $\pi$ being close to 1 (recall the results in Table 2). However, in light of the simulation results this conclusion needs to be adjusted: the Realized GARCH model fails to fully capture the long memory property.

Figure 13: Simulated daily returns


Figure 14: Autocorrelation of the simulated squared returns

Figure 15: Autocorrelation of the simulated squared returns with adjusted parameters


8 Conclusions

First of all, I would like to draw some important conclusions regarding the Realized GARCH model. It was shown that within the framework of this model the Realized Kernel provides the most accurate results, both in terms of modelling and in terms of forecasting. This suggests that the quadratic variation of the underlying true price process should be measured by the Realized Kernel rather than by the Realized variance. However, even then the simulation of the daily returns indicated that Realized GARCH does not fully capture the volatility clustering and its persistence observed in the data.

In terms of the Multiplicative Component GARCH model, the importance of the diurnal pattern is emphasized. In fact, the clear shape of this pattern obtained here coincides with the results of many papers on intra-day volatility. The majority of the forecast evaluation measures favour the inclusion of the stochastic intra-day volatility component, but its significance is still doubtful. The forecasting results evaluated by the various measures are close to each other and sometimes contradictory. Therefore an additional forecast evaluation measure was added, MME(u), which penalizes under-predictions more than over-predictions. This measure indicated that the stochastic intra-day component should be included in the model if it is predicted to be above 1. Thus the importance lies in the predicted level of the stochastic component rather than in its variation within the day. In-depth analysis of the intra-day returns (normalized by the daily variance component forecasts from the Realized GARCH model and by the deterministic diurnal pattern) shows that no GARCH-type model is needed on the intra-day basis. In more detail, no asymmetries and, in fact, no significant serial correlation are observed for the standardized squared 15 minute returns. Consequently, we can argue that modelling the daily volatility component and identifying the diurnal volatility component are of substantial importance for forecasting the intra-day variance, combined with accurate predictions of the level of the stochastic intra-day component.


References

[1] Andersen T.G., Bollerslev T. 1997. Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance 4: 115–158.

[2] Andersen T.G., Bollerslev T. 1998. Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39(4): 885–905.

[3] Andersen T.G., Bollerslev T., Das A. 2001. Variance-ratio Statistics and High-frequency Data: Testing for Changes in Intraday Volatility Patterns. The Journal of Finance 56(1): 305–327.

[4] Andersen T.G., Bollerslev T., Diebold F.X., Labys P. 2001. The distribution of exchange rate volatility. Journal of the American Statistical Association 96(453): 42–55 (correction published 2003, Vol. 98, p. 501).

[5] Andersen T.G., Bollerslev T., Diebold F.X., Labys P. 2003. Modeling and forecasting realized volatility. Econometrica 71(2): 579–625.

[6] Bandi F.M., Russell J.R. 2008. Microstructure Noise, Realized Variance, and Optimal Sampling. Review of Economic Studies 75: 339–369.

[7] Barndorff-Nielsen O.E., Shephard N. 2002. Econometric analysis of realised volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society B 64: 253–280.

[8] Barndorff-Nielsen O.E., Shephard N. 2004. Power and bipower variation with stochastic volatility and jumps (with discussion). Journal of Financial Econometrics 2: 1–48.

[9] Barndorff-Nielsen O.E., Hansen P.R., Lunde A., Shephard N. 2008. Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise. Econometrica 76: 1481–1536.

[10] Barndorff-Nielsen O.E., Hansen P.R., Lunde A., Shephard N. 2009a. Realised kernels in practice: trades and quotes. Econometrics Journal 12: 1–33.


[11] Engle R.F. 2002. New frontiers of ARCH models. Journal of Applied Econometrics 17: 425–446.

[12] Engle R.F., Gallo G. 2006. A multiple indicators model for volatility using intra-daily data. Journal of Econometrics 131: 3–27.

[13] Engle R.F., Sokalska M.E. 2012. Forecasting intraday volatility in the US equity market. Multiplicative Component GARCH. Journal of Financial Econometrics 10(1): 54–83.

[14] Etheridge A. 2002. A Course in Financial Calculus.

[15] Ghose D., Kroner K. 1996. Components of Volatility in Foreign Exchange Markets: An Empirical Analysis of High Frequency Data. Unpublished manuscript, Department of Economics, University of Arizona.

[16] Giot P. 2005. Market Risk Models for Intraday Data. European Journal of Finance 11: 309–324.

[17] Hansen P.R., Lunde A. 2006. Consistent Ranking of Volatility Models. Journal of Econometrics 131: 97–121.

[18] Hansen P.R., Huang Z., Shek H.H. 2011. Realized GARCH: A joint model for returns and realized measures of volatility. Journal of Applied Econometrics 27: 877–906.

[19] McMillan D.G., Garcia R.Q. 2009. Intra-day volatility forecasts. Applied Financial Economics 19: 611–623.

[20] Shephard N., Sheppard K. 2010. Realising the future: forecasting with high frequency based volatility (HEAVY) models. Journal of Applied Econometrics 25: 197–231.


A Appendix

A.1 15 minute Realized Kernel calculation

In order to calculate the 15 minute $RK_{t,i}$ the same procedure as in section 3.2.4 is employed. The only difference is that, to find the optimal $H$ for every 15 minute bin, I calculate the 1 minute sub-sampled Realized Variance and use returns sampled every 5, 10, ..., 150 seconds in order to find $RV^{(1)}_{dense}, RV^{(2)}_{dense}, ..., RV^{(q)}_{dense}$. From these, $\omega$ and $H$ are obtained for every 15 minute time interval.

A.2 Diebold-Mariano test for Comparing Predictive Accuracy

Let yt denote the series to be predicted, also assume that we have two obtained forecasts y1t ,

y2t . Then ε1t = yt − y1

t and ε2t = yt − y2t can be constructed. The interest lies in evaluating

whether or not the accuracy of the forecasts differ. The accuracy of each forecast is measured

by a particular loss function. The most common loss functions are:

(i) squared error loss: $(\varepsilon^i_t)^2$, $i = 1, 2$

(ii) absolute error loss: $|\varepsilon^i_t|$, $i = 1, 2$

In this thesis I chose to use the absolute error loss function because the magnitude of the obtained errors was very small. To determine whether one model predicts better than another, I test the null hypothesis:

$$H_0: E\left[|\varepsilon^1_t|\right] = E\left[|\varepsilon^2_t|\right]$$

with alternative:

$$H_1: E\left[|\varepsilon^1_t|\right] \neq E\left[|\varepsilon^2_t|\right].$$

The Diebold-Mariano test is based on the loss differential:

$$d_t = |\varepsilon^1_t| - |\varepsilon^2_t|$$

and the statistic is:

$$S = \frac{\bar{d}}{\sqrt{V_d/T}} \sim N(0, 1),$$


Table 15: The Diebold-Mariano (DM) Test for the composite variance

| Compared to proxies | 2008.12.12 | 2009.09.30 | 2010.07.19 | 2011.05.03 |
|---|---|---|---|---|
| $r^2_{t,i}$ | -2.0276*** | -0.5888 | -3.4989*** | 1.3354 |
| $RV_{sub,t,i}$ | -0.7402 | -0.3826 | -1.1151 | 0.9259 |
| $RK_{t,i}$ | 0.4761 | -3.8525*** | 0.5305 | 0.0303 |

Table 16: The Diebold-Mariano (DM) Test for $q_{t,i}$

| Compared to | 2008.12.12 | 2009.09.30 | 2010.07.19 | 2011.05.03 |
|---|---|---|---|---|
| $z^2_{t,i}$ | -4.1476*** | -1.2114 | -1.4253 | 1.2017 |

where

$$\bar{d} = \frac{1}{T} \sum_{t=t_0}^{T} d_t$$

and

$$V_d = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j, \qquad \gamma_j = \mathrm{cov}(d_t, d_{t-j}).$$
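The DM statistic defined above can be sketched in a few lines. This is an illustrative sketch, not the code used for the thesis results: the function name `diebold_mariano` and the argument `max_lag` are hypothetical, and the infinite sum in $V_d$ is truncated at `max_lag` autocovariances with equal weights, which is an assumption (with `max_lag=0` only $\gamma_0$ is used, a common choice for one-step-ahead forecasts). The p-value is two-sided and based on the $N(0,1)$ limit.

```python
import math
import numpy as np

def diebold_mariano(y, f1, f2, max_lag=0):
    """DM statistic S and two-sided p-value under absolute error loss.

    Assumption: V_d = gamma_0 + 2 * sum_{j=1}^{max_lag} gamma_j (truncated sum).
    """
    y, f1, f2 = (np.asarray(a, dtype=float) for a in (y, f1, f2))
    d = np.abs(y - f1) - np.abs(y - f2)        # loss differential d_t
    T = len(d)
    d_bar = d.mean()
    dc = d - d_bar
    v_d = np.dot(dc, dc) / T                   # gamma_0
    for j in range(1, max_lag + 1):
        v_d += 2.0 * np.dot(dc[j:], dc[:-j]) / T   # 2 * gamma_j
    if v_d <= 0.0:
        raise ValueError("estimated long-run variance is not positive")
    s = d_bar / math.sqrt(v_d / T)
    p_value = math.erfc(abs(s) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|S|))
    return s, p_value

# Toy example: the second forecast has the smaller absolute errors, so S > 0.
s, p = diebold_mariano([0, 0, 0, 0], [1, 2, 1, 2], [0.5, 0.5, 0.5, 0.5])
```

With this sign convention a negative $S$ means that the first forecast has the smaller average absolute error, and a positive $S$ favours the second forecast.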

In Table 15 I present the results of the DM test comparing the prediction accuracy of the 15-minute-ahead forecasts of $h_{t,i}$ (the composite variance) and $g_{t,i}$ (the composite variance with no stochastic component) for the Intel Corporation stock. In Table 16 I present the results of the DM test comparing the prediction accuracy of $q_{t,i}$ and $q_{t,i} = 1$. In both tables, (*) indicates significance at 10%, (**) at 5% and (***) at 1%.

A.3 Data analysis for Microsoft stock

The data sample for the Microsoft stock runs from the 2nd of June, 1990 to the 3rd of May, 2011. In total there are 3000 trading days of 6.5 hours each, with 4680 five-second log-returns per day. All in all, I treat 4680 × 3000 = 14,040,000 high-frequency log-returns as known observations. Around 43% of these 5-second returns are zeros. An additional 30 days of data (from 2011.05.04 to 2011.06.15), also recorded in the data set, serve as out-of-sample observations for evaluating the forecasts. To illustrate the data, I plot the daily returns and provide their descriptive statistics in Figures 16 and 17, along with the autocorrelations in Figure 18.

Figure 16: Daily log returns of the Microsoft stock

Figure 17: Descriptive statistics for the daily log returns of the Microsoft stock


Figure 18: Autocorrelation for the squared daily returns of the Microsoft stock

A.4 Results for Microsoft stock

To check whether the results obtained for Intel Corporation are robust, I carry out the same analysis for the Microsoft stock. The Microsoft and Intel Corporation stocks have very similar trading frequency; in fact, they are the two most frequently traded stocks in my dataset. Therefore, the results should be similar (summarized results for Microsoft can be found in Tables 17–25 and Figure 19).

On close inspection, one noticeable difference is the higher significance of the forecasts for the Microsoft stock. Otherwise, the results are very similar for both stocks. The same conclusions can therefore be drawn, which serves as a robustness check for the results obtained for the Intel Corporation stock.

A.4.1 Realized GARCH results


Table 17: Obtained results for log-linear Realized GARCH(1,1)

| | Parameters: RV | Parameters: RV sub | Parameters: RK | Standard errors: RV | Standard errors: RV sub | Standard errors: RK | Student's t: RV | Student's t: RV sub | Student's t: RK | p-value: RV | p-value: RV sub | p-value: RK |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\omega$ | 0.095615 | 0.076677 | 0.638091 | 0.083816 | 0.090438 | 0.126586 | 1.140776 | 0.847844 | 5.040761 | 0.254054 | 0.396593 | 0.000000 |
| $\beta$ | 0.648266 | 0.610604 | 0.555942 | 0.019106 | 0.019807 | 0.020644 | 33.930427 | 30.827763 | 26.930361 | 0.000000 | 0.000000 | 0.000000 |
| $\gamma$ | 0.362973 | 0.397128 | 0.536165 | 0.020708 | 0.021327 | 0.028291 | 17.528512 | 18.620863 | 18.951653 | 0.000000 | 0.000000 | 0.000000 |
| $\xi$ | -0.713624 | -0.638446 | -1.499560 | 0.205688 | 0.203290 | 0.185150 | -3.469458 | -3.140562 | -8.099162 | 0.000529 | 0.001703 | 0.000000 |
| $\phi$ | 0.916160 | 0.928232 | 0.791914 | 0.024092 | 0.023797 | 0.021673 | 38.027053 | 39.006406 | 36.539008 | 0.000000 | 0.000000 | 0.000000 |
| $\tau_1$ | -0.029849 | -0.034959 | -0.032247 | 0.008251 | 0.007842 | 0.005725 | -3.617575 | -4.458042 | -5.633122 | 0.000302 | 0.000009 | 0.000000 |
| $\tau_2$ | 0.091492 | 0.091127 | 0.056496 | 0.005505 | 0.005292 | 0.003835 | 16.620458 | 17.220512 | 14.732924 | 0.000000 | 0.000000 | 0.000000 |
| $\pi$ | 0.980808 | 0.979231 | 0.980539 | | | | | | | | | |
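As a reading aid for Table 17, the short sketch below recomputes two quantities from the reported numbers. It assumes that $\pi$ denotes the persistence of the log-linear Realized GARCH(1,1), $\pi = \beta + \phi\gamma$, and that each Student's t value is the estimate divided by its standard error; the dictionary `table17` and the function names are hypothetical and only hold values copied from the table.

```python
# Estimates copied from Table 17 (Microsoft stock), one entry per realized measure.
table17 = {
    "RV":     {"beta": 0.648266, "gamma": 0.362973, "phi": 0.916160, "pi": 0.980808},
    "RV sub": {"beta": 0.610604, "gamma": 0.397128, "phi": 0.928232, "pi": 0.979231},
    "RK":     {"beta": 0.555942, "gamma": 0.536165, "phi": 0.791914, "pi": 0.980539},
}

def implied_persistence(p):
    """Persistence implied by the estimates, assuming pi = beta + phi * gamma."""
    return p["beta"] + p["phi"] * p["gamma"]

def t_statistic(estimate, std_error):
    """Student's t value as the ratio of an estimate to its standard error."""
    return estimate / std_error

for name, p in table17.items():
    print(f"{name}: implied pi = {implied_persistence(p):.6f}, reported pi = {p['pi']:.6f}")
```

Under these assumptions the implied and reported persistence agree up to rounding for all three realized measures, which also confirms the column layout of the table.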


Table 18: Log-likelihood for log-linear Realized GARCH(1,1)

| Realized measure | RV | RV sub | RK |
|---|---|---|---|
| Value of the log-likelihood | 12155.3449 | 12317.2870 | 13253.8499 |

Table 19: Obtained results for out of sample forecasts of daily volatility and realized measure over 30 days horizon

| Realized measure | RMSE of forecasts for $h_t$ | RMSE of forecasts for $x_t$ | Compared to out of sample |
|---|---|---|---|
| RV | 1.07E-04 | 1.06E-04 | $RV_t$ |
| RV sub | 1.24E-04 | 1.28E-04 | $RV_{sub,t}$ |
| RK | 7.01E-05 | 1.00E-04 | $RK_t$ |

A.4.2 Multiplicative Component GARCH results

Table 20: GARCH(1,1) results for stochastic component $q_{t,i}$

| | Parameter value | Standard error | Student's t | p-value |
|---|---|---|---|---|
| $\omega$ | 0.059516 | 0.022729 | 2.618532 | 0.014305 |
| $\alpha$ | 0.093866 | 0.021823 | 4.301170 | 0.000199 |
| $\beta$ | 0.847342 | 0.034912 | 24.270579 | 0.000000 |

Value of the log-likelihood: 1102.1942
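The estimates in Table 20 can be summarised by the persistence and the implied long-run level of the stochastic component. The sketch below assumes the standard GARCH(1,1) recursion for $q_{t,i}$ (a constant $\omega$, a coefficient $\alpha$ on the lagged normalized squared return and a coefficient $\beta$ on the lagged $q$), for which the unconditional level is $\omega / (1 - \alpha - \beta)$; the variable and function names are hypothetical.

```python
import math

# Estimates copied from Table 20 (Microsoft stock).
omega, alpha, beta = 0.059516, 0.093866, 0.847342

persistence = alpha + beta                          # speed of mean reversion
unconditional_q = omega / (1.0 - persistence)       # implied long-run level of q
half_life = math.log(0.5) / math.log(persistence)   # in 15-minute bins

def forecast_q(q_next, steps):
    """Forecasts of q for 1, ..., `steps` bins ahead, given the one-step forecast.

    Assumption: the expected normalized squared return equals q, so forecasts
    revert geometrically toward the unconditional level.
    """
    return [unconditional_q + persistence ** k * (q_next - unconditional_q)
            for k in range(steps)]

print(f"persistence = {persistence:.6f}, long-run q = {unconditional_q:.4f}")
```

Under these assumptions the implied long-run level of $q_{t,i}$ is close to one, which is what one would expect if the daily and diurnal components already capture the level of the variance.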


Table 21: RMSE comparison for composite variance

RMSE

| Compared to proxies | 2008.12.12: $h_{t,i}$ | 2008.12.12: $g_{t,i}$ | 2009.09.30: $h_{t,i}$ | 2009.09.30: $g_{t,i}$ | 2010.07.19: $h_{t,i}$ | 2010.07.19: $g_{t,i}$ | 2011.05.03: $h_{t,i}$ | 2011.05.03: $g_{t,i}$ |
|---|---|---|---|---|---|---|---|---|
| $r^2_{t,i}$ | 5.724E-09 | 7.076E-09 | 6.3E-11 | 7.3E-11 | 3.74E-10* | 3.86E-10 | 2.6E-11*** | 7.2E-11 |
| $RV_{sub,t,i}$ | 1.5E-09 | 1.431E-09 | 4.8E-11* | 7.3E-11 | 1.7E-11 | 1.6E-11 | 7E-12*** | 2.8E-11 |
| $RK_{t,i}$ | 1.684E-09 | 1.654E-09 | 4.7E-11** | 6.7E-11 | 2E-11 | 1.8E-11 | 7E-12 | 7E-11 |

The Diebold-Mariano (DM) Test

| Compared to proxies | 2008.12.12 | 2009.09.30 | 2010.07.19 | 2011.05.03 |
|---|---|---|---|---|
| $r^2_{t,i}$ | -1.4504 | 0.5096 | -1.6770* | -3.9020*** |
| $RV_{sub,t,i}$ | -0.3738 | -1.8587* | 1.3959 | -2.1925** |
| $RK_{t,i}$ | -0.4910 | -2.0378** | 0.6318 | -0.9537 |

Figure 19: Forecasts for qt,i, ∀ i = 1, ..., N


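Table 22: MSE comparison for the intra-day component

MSE

| Compared to proxies | 2008.12.12: $q_{t,i}$ | 2008.12.12: $q_{t,i} = 1$ | 2009.09.30: $q_{t,i}$ | 2009.09.30: $q_{t,i} = 1$ | 2010.07.19: $q_{t,i}$ | 2010.07.19: $q_{t,i} = 1$ | 2011.05.03: $q_{t,i}$ | 2011.05.03: $q_{t,i} = 1$ |
|---|---|---|---|---|---|---|---|---|
| $z^2_{t,i}$ | 2.6926 | 3.0967 | 2.4829 | 2.8399 | 1.4256*** | 1.4830 | 1.2033*** | 1.4826 |

The Diebold-Mariano (DM) Test for $q_{t,i}$

| Compared to | 2008.12.12 | 2009.09.30 | 2010.07.19 | 2011.05.03 |
|---|---|---|---|---|
| $z^2_{t,i}$ | -0.0314 | -0.0173 | -2.383*** | -5.6998*** |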

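Table 23: LIK comparison for composite variance

LIK

| Compared to proxies | 2008.12.12: $h_{t,i}$ | 2008.12.12: $g_{t,i}$ | 2009.09.30: $h_{t,i}$ | 2009.09.30: $g_{t,i}$ | 2010.07.19: $h_{t,i}$ | 2010.07.19: $g_{t,i}$ | 2011.05.03: $h_{t,i}$ | 2011.05.03: $g_{t,i}$ |
|---|---|---|---|---|---|---|---|---|
| $r^2_{t,i}$ | 247.4329 | 244.0970 | 293.3140 | 289.6178 | 285.0877 | 282.9899 | 310.2033 | 302.2858 |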

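Table 24: LIK comparison for the intra-day component

LIK

| Compared to proxies | 2008.12.12: $q_{t,i}$ | 2008.12.12: $q_{t,i} = 1$ | 2009.09.30: $q_{t,i}$ | 2009.09.30: $q_{t,i} = 1$ | 2010.07.19: $q_{t,i}$ | 2010.07.19: $q_{t,i} = 1$ | 2011.05.03: $q_{t,i}$ | 2011.05.03: $q_{t,i} = 1$ |
|---|---|---|---|---|---|---|---|---|
| $z^2_{t,i}$ | -23.497532 | -29.640347 | -30.713465 | -37.900146 | -23.332222 | -24.208448 | -8.883522 | -1.861710 |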


Table 25: MME(u) comparison for composite variance

MME(u)

| Compared to proxies | 2008.12.12: $h_{t,i}$ | 2008.12.12: $g_{t,i}$ | 2009.09.30: $h_{t,i}$ | 2009.09.30: $g_{t,i}$ | 2010.07.19: $h_{t,i}$ | 2010.07.19: $g_{t,i}$ | 2011.05.03: $h_{t,i}$ | 2011.05.03: $g_{t,i}$ |
|---|---|---|---|---|---|---|---|---|
| $r^2_{t,i}$ | 1.89E-03 | 2.09E-03 | 9.02E-04 | 1.01E-03 | 9.25E-04* | 9.35E-04 | 3.15E-04 | 2.38E-04*** |
| $RV_{sub,t,i}$ | 3.22E-0 | 3.53E-03 | 1.40E-03* | 1.61E-03 | 6.31E-04 | 6.05E-04 | 7.04E-04 | 3.02E-04*** |
| $RK_{t,i}$ | 3.19E-03 | 3.63E-03 | 1.72E-03** | 1.91E-03 | 1.09E-03 | 1.10E-03 | 1.10E-03 | 4.91E-04 |
