REALIZED VOLATILITY ESTIMATION

Master thesis

Miquel Masoliver, Guillem Roig, Shikhar Singla

25.06.2014

Advisors: Christian Brownlees, Eulalia Nualart

Acknowledgements

We would like to thank our advisors Christian Brownlees and Eulalia Nualart for

the guidance and support they provided during the development of this thesis.

Contents

1 Introduction

2 Theoretical background

2.1 Realized Volatility Estimator

2.2 Subsampling to remedy for noise

2.3 Realized Kernel

2.4 Threshold Realized Variance

3 Simulation

3.1 Setup

3.2 Finding optimal parameters for TSRV & RK

3.3 Modeling price jumps

3.4 Implementing market microstructure noise

4 Data implementation

4.1 Data and methodology

4.2 Analysis

5 Conclusions

1 Introduction

In finance it is of utmost importance to understand volatility and its dynamics, since volatility is the main driver of portfolio construction, of the hedging and pricing of options, and of the determination of a firm's exposure to risk. It also plays a critical role in discovering trading and investment opportunities that provide an attractive risk-return trade-off [Yu09]. Therefore, it is not surprising that many estimators have been proposed to measure volatility from a discrete price sample: Parkinson [Par80], Rogers et al. [Yoo94] and Yang and Zhang [Zha00], among others, were the first proponents of new methodologies to estimate realized volatility using low-frequency daily data. Parametric models such as the seminal ARCH model first proposed by Engle [Eng82], and later contributions from Bollerslev and Mikkelsen [Mik96] among many others, have provided new ways to estimate and forecast volatility. However, the validity of such volatility measures relies upon specific distributional assumptions, which immediately calls into question the robustness of previous findings.

Andersen et al. proposed a non-parametric approach based on summing squares and cross-products of intraday high-frequency returns to construct estimates of realized daily volatility. The underlying idea is to use the quadratic variation as an ex-post measure of the variation of asset prices [Ebe01]. This approach, however, has the weakness that it can be sensitive to market frictions when applied to returns obtained over shorter time intervals, since it then estimates the magnitude of the noise term rather than volatility. To overcome these drawbacks, Zhang et al., on the one hand, proposed the so-called Two-Scale Realized Variance (TSRV), an unbiased estimator that incorporates noise into the model and separates the sample into multiple "grids" [AS05]. Barndorff-Nielsen et al., on the other hand, introduced the family of realized kernel estimators to carry out efficient feasible inference on the ex-post variation of underlying equity prices in the presence of market frictions [She08].

All of these estimators, however, turn out to be inconsistent in the presence of jumps. Corsi et al. introduce the family of threshold realized variance (TRV) estimators, which detect discontinuities in prices and exclude those price points from the volatility calculation [Ren10].

The main purpose of this study is to find the optimal volatility estimator in a non-parametric framework. In particular, this study focuses on the estimation of the daily integrated variance-covariance matrix of stock returns using simulated and high-frequency data in the presence of market microstructure noise, jumps, and non-synchronous trading. This work is structured in three building blocks: (i) Price processes are simulated in the presence of jumps and market microstructure noise. This allows us to obtain some insight into the estimators' performance. (ii) The aforementioned realized volatility estimators are applied to high-frequency data of the S&P 100 stocks of October 27th 2010 using 5-second, 10-second, 30-second, 1-minute and 2-minute time intervals. (iii) We use the estimated covariance matrices to construct the global minimum variance portfolio for each sampling frequency. These global minimum variance portfolios are used to build 30-day ex-post portfolio returns, and we use the variance of these returns to compare the performance of the estimators.

2 Theoretical background

In this section we provide a review of the main theory on different non-parametric

volatility estimators that take part in the study, as well as the rationale behind

using high-frequency data.

2.1 Realized Volatility Estimator

Volatility estimation and inference has attracted much attention in the financial

econometric and statistical literature. Even in a discrete time framework one

will often start with the sum of squared log-returns, as not only the simplest and

most natural estimator, but also as the one with the most desirable properties

as shown in [Yu09]. One of the recent achievements in financial econometrics

is the introduction of the concept of realized volatility [AS05], which allows one to

consistently estimate the accumulated price variation over some time interval

by summing over the high-frequency squared returns.

Let us assume that the logarithmic price of a given asset is governed by the following diffusion process,

$$ y_t = \int_0^t \mu(s)\,ds + \int_0^t \sigma(s)\,dW_s. \tag{1} $$

The object of interest is the amount of accumulated variation of the asset price over a time interval τ = [0, 1] (representing a trading day), called the integrated variance. Mathematically, it is given by the following expression,

$$ IV = \int_0^1 \sigma^2(s)\,ds. \tag{2} $$

Since intraday prices are not observed continuously, it is not feasible to compute the previous integral; therefore, one needs an estimator of the integrated variance. One feasible solution is to discretize the time interval τ = [0, 1] into a grid $t_0 = 0 \le \dots \le t_n = 1$ of subintervals of length $t_i - t_{i-1} = \Delta = 1/n$, and then set the prices $y_i = y_{t_i}$. Under such circumstances, the realized volatility estimator, defined as the sum of the squared returns, i.e.

$$ RV^2 = \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2, \tag{3} $$

turns out to be a natural estimator for the integrated variance. The asymptotic properties of this estimator are especially striking when sampling occurs at an increasing frequency, that is, for small $\Delta$, which, when assets trade every few seconds, is a realistic approximation to what we observe using the now commonly available transaction or quote-level sources of financial data. In particular, fully observing the sample path of an asset will perfectly reveal the volatility of that path.
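As a minimal sketch of equation (3), the following Python snippet computes the realized variance of a log-price path and compares it with the integrated variance of a toy Brownian path with constant volatility. The function name `realized_variance`, the seed and the parameter values are illustrative assumptions and are not taken from the study; the sketch sums every increment of the path it is given.

```python
import numpy as np

def realized_variance(log_prices):
    """Sum of squared increments of a log-price path sampled on a regular grid."""
    returns = np.diff(np.asarray(log_prices, dtype=float))
    return float(np.sum(returns ** 2))

# Toy check (assumed values): Brownian motion with constant volatility sigma
# on [0, 1], whose integrated variance is sigma**2.
rng = np.random.default_rng(0)
sigma, n = 0.2, 23_400
increments = sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)
path = np.concatenate(([0.0], np.cumsum(increments)))
rv = realized_variance(path)  # should be close to sigma**2 = 0.04
```

Without noise or jumps, the gap between `rv` and `sigma**2` shrinks as `n` grows, which is the consistency property discussed above.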

2.2 Subsampling to remedy for noise

As pointed out in the previous section, the realized variance estimator converges to the quadratic variation of the process; that is, approximating the integrated volatility by the RV estimator seems natural, since the theory of stochastic processes shows that

$$ \operatorname*{plim}_{n \to \infty} \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2 = \int_0^{T=1} \sigma_s^2\,ds. \tag{4} $$

However, in real life, market microstructure noise appears when dealing with

high-frequency data. Such noise captures a variety of frictions inherent in the

trading process: bid-ask bounces, discreteness of price changes, differences in

trade sizes, etc.

Assume a portfolio of N assets with M price points for each asset. Assume that the log prices are contaminated by market microstructure noise $u_i$, i.e.

$$ y_i = y_i^* + u_i. \tag{5} $$

In this case, the observed return is given by

$$ r_i = r_i^* + \varepsilon_i, \tag{6} $$

where $\varepsilon_i = u_i - u_{i-1}$ is the intraday noise increment. Therefore, the RV estimator can be decomposed as

$$ RV = RV^* + 2 \sum_{i=1}^{M} r_i^* \varepsilon_i + \sum_{j=1}^{M} \varepsilon_j^2. \tag{7} $$

The last term on the right can be interpreted as the (unobservable) realized

variance of the noise process, while the second term is induced by potential

dependence between the efficient price and the noise. Based on this decomposition,

RV is a biased estimator for IV [Lun06].
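To make the size of this bias concrete, the following worked step assumes (beyond what the text states) that the noise $u_i$ is i.i.d. with mean zero and variance $\omega^2$, and that it is independent of the efficient price. Under these assumptions the cross term in (7) has zero expectation and the last term grows linearly with the number of returns M:

```latex
\begin{aligned}
E[\varepsilon_j^2] &= E[(u_j - u_{j-1})^2]
  = E[u_j^2] - 2E[u_j u_{j-1}] + E[u_{j-1}^2] = 2\omega^2, \\
E\Big[\sum_{i=1}^{M} r_i^* \varepsilon_i\Big] &= 0, \\
E[RV] &= E[RV^*] + 2M\omega^2 .
\end{aligned}
```

Hence, under these assumptions, sampling more frequently (a larger M) inflates RV by roughly $2M\omega^2$, so the noise term dominates at the highest frequencies.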

Two-Scale Realized Variance Estimator

One approach to overcoming this drawback consists in explicitly incorporating microstructure noise into the analysis. Some estimators have been developed that make use of all the data, no matter how high the frequency and how noisy the data are. These methods decompose the total observed variance into a component attributable to the fundamental price and another attributable to the market microstructure noise [Yu09].

One of these consistent estimators, named Two-Scale Realized Variance (TSRV), was first proposed by Zhang et al. [AS05]. The underlying idea is to sample sparsely at some lower frequency and to evaluate the quadratic variation at the two frequencies. Averaging the results over the entire sample and taking a suitable linear combination, one obtains a consistent and asymptotically unbiased estimator of IV.

More precisely, suppose $T = \{t_1, \dots, t_n\}$ is a vector containing the times of the observed log prices in a certain trading day. Then $T$ is partitioned into $K$ non-overlapping sub-grids with an equal number of observations. The $k$th sub-grid ($k = 1, 2, \dots, K$) extracts from the whole intraday data the observations with the following times attached: $T_k = \{t_{k-1}, t_{k-1+K}, \dots, t_{k-1+n_k K}\}$, where $n_k$ is the largest integer such that the observation at time $t_{k-1+n_k K}$ is included in $T_k$. The TSRV is calculated as follows:

$$ TSRV^2 = \frac{1}{K} \sum_{k=1}^{K} \sum_{t_i, t_{i+1} \in T_k} (y_{t_{i+1}} - y_{t_i})^2 - \frac{\bar{n}}{n} \sum_{t_j, t_{j+1} \in T} (y_{t_{j+1}} - y_{t_j})^2, \tag{8} $$

where $y_t$ is the log-price process, $n$ is the total number of observations in an intraday dataset, and $\bar{n} = (n - K + 1)/K$.
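The construction in (8) can be sketched in a few lines of Python. This is an illustrative single-asset version only: the function name `tsrv`, the toy data and the choice K = 100 are assumptions, the sub-grids are built by taking every K-th observation from K different starting points, and no small-sample correction is applied.

```python
import numpy as np

def tsrv(log_prices, K):
    """Two-scale realized variance of one log-price path, following equation (8)."""
    y = np.asarray(log_prices, dtype=float)
    n = y.size - 1                          # number of returns on the full grid
    rv_all = np.sum(np.diff(y) ** 2)        # fast scale: every observation
    # Slow scale: realized variance on each of the K sub-grids, then averaged.
    rv_sub = np.mean([np.sum(np.diff(y[k::K]) ** 2) for k in range(K)])
    n_bar = (n - K + 1) / K                 # average number of returns per sub-grid
    return float(rv_sub - (n_bar / n) * rv_all)

# Toy data (assumed values): constant volatility plus i.i.d. Gaussian noise.
rng = np.random.default_rng(0)
n, sigma, noise_sd = 23_400, 0.2, 0.001
efficient = np.concatenate(
    ([0.0], np.cumsum(sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)))
)
observed = efficient + noise_sd * rng.standard_normal(n + 1)

rv_noisy = float(np.sum(np.diff(observed) ** 2))   # inflated by the noise
tsrv_noisy = tsrv(observed, K=100)                 # noise term largely cancels
```

On this toy path the plain realized variance is dominated by the noise, while the two-scale combination stays close to the integrated variance `sigma**2`.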

2.3 Realized Kernel

Realized kernel estimators, introduced by Barndorff-Nielsen et al. (2011), can be used to estimate the quadratic variation of an underlying price process from noisy high-frequency data while guaranteeing consistency and positive semi-definiteness [LS09].

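The multivariate realized kernel is defined as

$$ K(X) = \sum_{h=-H}^{H} k\left(\frac{h}{H+1}\right) \gamma_h, \tag{9} $$

where $\gamma_h$ is the matrix of autocovariances given by

$$ \gamma_h = \sum_{j=h+2}^{n} r_j r_{j-h}^{T}, \quad h \ge 0, \qquad \gamma_h = \gamma_{-h}^{T}, \quad h < 0, \tag{10} $$

and $k(x)$ is the Parzen function, given by

$$ k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 1/2, \\ 2(1 - x)^3, & 1/2 \le x \le 1, \\ 0, & x > 1. \end{cases} \tag{11} $$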

Here $r_j$ is the high-frequency return. We focus on the Parzen function because it satisfies the smoothness conditions and is guaranteed to produce a non-negative estimate. The preferred choice of bandwidth is $H^* = c^* \xi^{4/5} n^{3/5}$, where $c^* = 3.5134$ for the Parzen function, $\xi^2 = \omega^2/\sqrt{IQ}$ denotes the noise-to-signal ratio, $\omega^2$ is a measure of the microstructure noise variance and $IQ$ is the integrated quarticity [LS09].
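The following Python sketch illustrates equations (9) to (11). It is a simplified illustration rather than the implementation used in the study: the names `parzen` and `realized_kernel` are hypothetical, the kernel weight is evaluated at |h|/(H + 1) so that negative lags are weighted symmetrically (an assumption), every available pair of returns enters each autocovariance, and no end-point treatment of the price series is applied.

```python
import numpy as np

def parzen(x):
    """Parzen weight function of equation (11), applied to |x|."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def realized_kernel(returns, H):
    """Multivariate realized kernel from an (n x d) array of high-frequency returns."""
    r = np.asarray(returns, dtype=float)
    if r.ndim == 1:
        r = r[:, None]                      # treat a single asset as one column
    rk = r.T @ r                            # gamma_0: realized covariance
    for h in range(1, H + 1):
        gamma_h = r[h:].T @ r[:-h]          # sum over j of r_j r_{j-h}^T
        # gamma_{-h} is the transpose of gamma_h, so both lags share one weight.
        rk += parzen(h / (H + 1)) * (gamma_h + gamma_h.T)
    return rk
```

With `H = 0` the function reduces to the realized covariance matrix; larger bandwidths add weighted autocovariances that offset the noise-induced bias.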

2.4 Threshold Realized Variance

The three estimators described above are inconsistent when there are jumps in the prices. For an estimator to work well in the presence of jumps, it has to detect the jumps and exclude from the volatility calculation those price points where there is a jump. Corsi et al. (2010) introduce the family of threshold realized variance estimators, which are consistent in the presence of jumps [Ren10]. If the squared difference in log prices (the squared return) is above a certain threshold, it is not included in the calculation of the variance. Threshold realized variance is defined as follows:

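$$ TRV_\delta(y)_t = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \Theta(\delta)\}}. \tag{12} $$

The threshold function has to satisfy

$$ \lim_{\delta \to 0} \Theta(\delta) = 0 \quad \text{and} \quad \lim_{\delta \to 0} \frac{\delta \log(1/\delta)}{\Theta(\delta)} = 0, \tag{13} $$

i.e., it has to vanish more slowly than the modulus of continuity of Brownian motion in order to have convergence in probability (consistency). The two threshold functions we work with are as follows:

$$ TRV_1 = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \log(1/\delta)\sqrt{\delta}\}}, \tag{14} $$

$$ TRV_2 = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \sqrt{\delta}\}}, \tag{15} $$

where $T$ denotes the length of the interval (1 in our case) and

$$ \delta = \frac{1}{\text{number of observations}}. \tag{16} $$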
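Equations (12) to (16) translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the "number of observations" in (16) is the number of price points in the path.

```python
import numpy as np

def threshold_rv(log_prices, threshold):
    """Sum of squared returns, dropping those whose square exceeds the threshold."""
    r2 = np.diff(np.asarray(log_prices, dtype=float)) ** 2
    return float(np.sum(r2[r2 <= threshold]))

def trv1(log_prices):
    delta = 1.0 / len(log_prices)                    # equation (16)
    return threshold_rv(log_prices, np.log(1.0 / delta) * np.sqrt(delta))

def trv2(log_prices):
    delta = 1.0 / len(log_prices)                    # equation (16)
    return threshold_rv(log_prices, np.sqrt(delta))

# Toy path with three small moves of 0.01 and one jump of size 1.
path = [0.0, 0.01, 0.02, 1.02, 1.03]
rv = float(np.sum(np.diff(path) ** 2))               # includes the jump
trv_a, trv_b = trv1(path), trv2(path)                # both exclude the jump
```

On this toy path the squared jump exceeds both thresholds, so the two threshold estimators keep only the three small squared returns while the plain realized variance is dominated by the jump.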

3 Simulation

The aim of this section is to study the behavior of the different estimators in a controlled environment so that we can analyze their performance under various scenarios. The main advantage of simulating is that it provides us with the "true" variance-covariance matrix, since under the Heston model volatility is also simulated, which allows us to compare the performance of the estimators. The section starts by laying out the foundations of the framework under which we run the simulations, as well as the main technical aspects one has to take into account. Later, we offer some observations and comments on the results to highlight the main properties of each estimator.

3.1 Setup

The first step in simulating the covariance matrix is to impose some structure on the underlying asset's price and volatility. In a discrete-time framework, GARCH family processes are widely used. The analog in a continuous-time setting corresponds to a Geometric Brownian Motion price process and a Heston structure for the volatility. Therefore, the market model will be implemented by first simulating a price process and then generating its implied volatility given the Heston model. The price process is specified by

$$ dy_t = \mu y_t\,dt + \sigma_t y_t\,dW_t^S, \tag{17} $$

where $\mu$ represents the return of the asset and $\sigma_t$ the instantaneous volatility. The instantaneous variance is determined by a CIR process of the form

$$ d\sigma_t^2 = \kappa(\theta - \sigma_t^2)\,dt + \xi \sigma_t\,dW_t^\nu, \tag{18} $$

where $W_t^S$ and $W_t^\nu$ are Brownian motions with correlation $\rho$.

In the variance equation, $\theta$ is the long-run average variance; as $t$ tends to infinity, the expected value of $\sigma_t^2$ tends to $\theta$. $\kappa$ is the rate at which the variance reverts to $\theta$, and $\xi$ is the volatility of the volatility, which determines the variance of $\sigma_t^2$ [Hes93].

Simulating prices using the Heston model allows us to compute the integrated

variance-covariance (i.e. the true volatility) by adding the variance-covariance

matrix simulated at each step of the time interval.
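A minimal single-asset sketch of such a simulation is given below. The text does not state which discretization was used, so the Euler scheme with truncation of negative variances, the function name `simulate_heston` and all parameter values are assumptions made for illustration; the sketch simulates the log price directly and accumulates the simulated variance over the unit interval to obtain the "true" integrated variance.

```python
import numpy as np

def simulate_heston(n_steps, mu=0.05, kappa=5.0, theta=0.04, xi=0.5,
                    rho=-0.5, v0=0.04, seed=0):
    """Euler scheme for a Heston log-price path on [0, 1] (assumed parameters)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    log_price = np.zeros(n_steps + 1)
    var = np.empty(n_steps + 1)
    var[0] = v0
    for i in range(n_steps):
        z1, z2 = rng.standard_normal(2)
        dw_s = np.sqrt(dt) * z1
        # Second Brownian increment, correlated with the first through rho.
        dw_v = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho ** 2) * z2)
        v = max(var[i], 0.0)                 # truncate negative variances at zero
        log_price[i + 1] = log_price[i] + (mu - 0.5 * v) * dt + np.sqrt(v) * dw_s
        var[i + 1] = var[i] + kappa * (theta - v) * dt + xi * np.sqrt(v) * dw_v
    # "True" integrated variance: simulated variance accumulated over the grid.
    integrated_var = float(np.sum(np.maximum(var[:-1], 0.0)) * dt)
    return log_price, var, integrated_var

log_price, var, iv = simulate_heston(n_steps=5_000)
rv = float(np.sum(np.diff(log_price) ** 2))   # compare with iv
```

In the multivariate setting of the study, the same accumulation would be applied to the simulated variance-covariance matrix at each step rather than to a scalar variance.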

To compare the performance of the estimators we construct a matrix $A$ defined as the difference between the variance-covariance matrix of each estimator and the integrated variance-covariance matrix,

$$ A_{RV} = RV - IV, \qquad A_{TSRV} = TSRV - IV, \qquad A_{RK} = RK - IV, \tag{19} $$

where RV (TSRV, RK) denotes the variance-covariance matrix associated with the RV estimator (TSRV estimator, RK estimator).

Afterwards, these matrices are used to compute the Frobenius and the Infinite norm, which are defined as

$$ \|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{m} |a_{ij}|^2} \quad \text{(Frobenius norm)}, \tag{20} $$

$$ \|A\|_\infty = \max_{i,j} |a_{ij}| \quad \text{(Infinite norm)}. \tag{21} $$

Figure 1: Frobenius and Infinite norm for the matrices A as defined in (19), used to test the performance of the estimators in the benchmark scenario, i.e. with neither jumps nor noise.

The intuition behind these norms is that they serve as a metric to compare the performance of the estimators by measuring how much their estimated values differ from the "true" values. This distance is captured by the elements of the matrices $A_{RV}$, $A_{TSRV}$ and $A_{RK}$. The performance is assessed either by means of the Frobenius norm, i.e. by taking the square root of the sum of the squared elements of these matrices, or by means of the Infinite norm, i.e. by taking the element with the largest absolute value.
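Both error measures can be computed in a couple of lines. The helper below is a hypothetical illustration (the name `error_norms` and the example matrices are not from the study); it forms the error matrix A of (19) and returns the two norms of (20) and (21).

```python
import numpy as np

def error_norms(estimate, integrated):
    """Frobenius norm and largest absolute element of A = estimate - integrated."""
    A = np.asarray(estimate, dtype=float) - np.asarray(integrated, dtype=float)
    frobenius = float(np.linalg.norm(A, "fro"))
    largest = float(np.max(np.abs(A)))
    return frobenius, largest

# Made-up 2x2 example: the error matrix is [[0, 2], [2, 4]].
fro, largest = error_norms([[1.0, 2.0], [2.0, 5.0]], [[1.0, 0.0], [0.0, 1.0]])
```

The Frobenius norm aggregates the errors of all entries, whereas the second measure reports only the single worst entry.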

Figure 1 shows the benchmark scenario, i.e. the simulation of prices without adding jumps or noise. We can observe that RV is consistent and that its performance improves with the sampling frequency. This behavior also holds for all other estimators. It is important to point out that the TSRV outperforms the RV for low frequencies up to a threshold. Beyond that point the RV is the best among our estimators.

3.2 Finding optimal parameters for TSRV & RK

The optimal value of $K$ for TSRV is $c\,n^{2/3}$, where $c = \left(\frac{12\,\omega^2}{IQ}\right)^{1/3}$ and $n$ is the number of data points.

The preferred choice of bandwidth for RK is $H^* = c^* \xi^{4/5} n^{3/5}$, where $c^* = 3.5134$ for the Parzen function and $\xi^2 = \omega^2/\sqrt{IQ}$.

We estimate $\omega^2 = RV/(2n)$ and $IQ = RV_{sparse}^2$ as explained in [LS09]. We compute the subsampled realized variance based on returns from every 50th price point. More precisely, we compute a total of 50 realized variances by shifting the first observation; $RV_{sparse}$ is simply the average of these estimators. From this exercise we obtain a $K$ and an $H$ for every stock; the global $K$ and $H$ are simply the averages of these values.

3.3 Modeling price jumps

In a liquid and efficient financial market prices are set so that they reflect all

available information regarding all traded products in the exchange. Following

this rationale, when new information is released or generated prices will change

accordingly. This shift in the price can be either smooth or very sharp. We

will focus on the latter, since a sharp increase or decrease in the price can be

understood as a jump.

In finance, the building block of a jump model is the Poisson process. Let us consider a sequence $\{\tau_i\}_{i \ge 1}$ of independent exponential random variables with parameter $\lambda$, that is, with survival function

$$ P(\tau_i \ge y) = e^{-\lambda y}. \tag{22} $$

Let $T_n = \sum_{i=1}^{n} \tau_i$; then the process

$$ N(t) = \sum_{n \ge 1} \mathbf{1}_{\{T_n \le t\}} \tag{23} $$

is called a Poisson process with rate $\lambda$. In our case, the $\tau_i$ correspond to the waiting times between jumps, and $N(t)$ to the number of jumps that have occurred up to time $t$.

Poisson processes have become the paradigm for jump models in diffusion processes, since they share with Brownian motion the property of having independent and stationary increments, i.e., for every $t > s$ the increment $N_t - N_s$ is independent of the history of the process up to time $s$.

In reality, however, simple Poisson processes are of little interest, since they only consider one possible jump size and assume that jumps occur strictly one after the other. To relax such assumptions we consider a Compound Poisson process, where waiting times between jumps are exponentially distributed, whereas jump sizes can have an arbitrary distribution.

More precisely, letting $B_1, B_2, \dots$ denote the i.i.d. sequence of jump amplitudes, the total jump amplitude accumulated up to time $t$, $A(t)$, is given by

$$ A(t) = \sum_{n=1}^{N(t)} B_n, \tag{24} $$

where $N(t)$ is the number of jumps that have occurred up to time $t$.

In practice, we first generate the waiting times $\{X_n = t_n - t_{n-1}\}$, i.e., the time intervals between events, which are exponentially distributed as described in (22). From such a distribution one can explicitly compute the set of times $\{t_n\}$ at which the events take place,

$$ t_n = t_{n-1} - \frac{1}{\lambda} \ln(U), \tag{25} $$

where $U$ is a Uniform [0, 1] random variable. Therefore, by simulating a vector of Uniform [0, 1] draws and defining a $\lambda$ such that only one jump takes place per day, we obtain the set of times at which jumps occur.

The next step consists in modeling the amplitude of those jumps. Efficient

pricing theory dictates that a jump in the price can either be a consequence of

supply and demand adjustments or due to new information released. In both

cases this jump can move in either direction. Hence, the best way to model the

direction and size of the shift in price is by assigning to each jump a draw from

a (standard) normal distribution corresponding to its amplitude.
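The two steps above (jump times from (25), then normally distributed amplitudes) can be sketched as follows. The function names, the default parameter values and the choice to stop once the horizon is exceeded are illustrative assumptions, not details given in the text.

```python
import numpy as np

def simulate_jumps(lam, horizon=1.0, jump_sd=1.0, seed=0):
    """Jump times via t_n = t_{n-1} - ln(U)/lam and normally distributed sizes."""
    rng = np.random.default_rng(seed)
    times = []
    t = 0.0
    while True:
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
        t -= np.log(u) / lam            # equation (25)
        if t > horizon:
            break
        times.append(t)
    sizes = rng.normal(0.0, jump_sd, size=len(times))
    return np.array(times), sizes

def cumulative_jump(t, times, sizes):
    """Compound Poisson value A(t): sum of the sizes of all jumps up to time t."""
    return float(np.sum(sizes[times <= t]))

jump_times, jump_sizes = simulate_jumps(lam=1.0)
```

Adding `cumulative_jump(t, jump_times, jump_sizes)` to a simulated log-price path at each grid time t produces a price process with jumps; with `lam=1.0` the expected number of jumps over the unit interval is one, and a given draw may contain none.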

Figure 2 plots the performance of our estimators when we add jumps to the price process. In this case, a higher sampling frequency provides better estimates as well, but the first three estimators are inconsistent and thus unreliable. On the other hand, the Threshold Realized Variance estimators are consistent under jumps; both of them outperform the inconsistent estimators, and the two behave very similarly.

Figure 2: Frobenius and Infinite norm for the matrices A as defined in (19), used to test the performance of the estimators in the presence of jumps.

3.4 Implementing market microstructure noise

To study the effect of market microstructure on the behavior of the estimators we need to add noise to the price process. A simple yet effective approach is to add a random term to the price. This term consists of random draws from a normal distribution. Our main concern regarding noise is how it affects the behavior of the estimators depending on its size. Figure 3 shows the impact of noise on the estimation of volatility, measured using the Frobenius norm, as a function of the size of this distortion. On the horizontal axis we measure the sampling frequency in such a way that values far from the origin represent estimators using very few data points, while points close to the origin are true high-frequency estimators. We can see that for small interferences the effect is insignificant. When we increase the noise variance from $\sigma^2_{noise} = 0.025$ onwards, only consistent estimators such as the TSRV and Kernel remain reliable, whereas the RV is clearly outperformed due to its asymptotic inconsistency in the presence of noise.

Figure 3: Impact of noise using Frobenius norm
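The effect described in this section can be reproduced on a toy example by adding normal draws to a simulated log-price path and recomputing the realized variance at several sampling steps. All values below (constant volatility, the noise standard deviation `noise_sd`, the seed and the set of steps) are assumptions chosen for illustration and are not the settings behind Figure 3.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, noise_sd = 23_400, 0.2, 0.001

# Efficient log price with integrated variance sigma**2, then a noisy copy.
efficient = np.concatenate(
    ([0.0], np.cumsum(sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)))
)
observed = efficient + noise_sd * rng.standard_normal(n + 1)

def rv_at_step(log_prices, step):
    """Realized variance using every `step`-th observation."""
    return float(np.sum(np.diff(log_prices[::step]) ** 2))

# Step 1 uses every observation; step 120 uses one observation in 120.
rv_by_step = {step: rv_at_step(observed, step) for step in (1, 5, 30, 60, 120)}
rv_clean = rv_at_step(efficient, 1)
```

In this toy setting the realized variance of the noisy path is inflated most at the finest step and moves back towards `sigma**2` as the step increases, while the noise-free path is estimated well at the finest step.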

4 Data implementation

In the previous section we described how the estimators behave under different ways of simulating prices. The next steps are to assess whether estimates from intraday data outperform estimates using daily data, to determine which estimator performs best, and to address the question of whether intraday prices have jumps or not.

4.1 Data and methodology

All the estimators are obtained by using intraday tick-by-tick prices on the

trades executed on October 27th, 2010 in the S&P 100 index. This dataset

comprises the prices of the 94 stocks traded on that day from 09:30 to 16:00,

the opening and closing times of the exchange. The integrated volatility is also

estimated using daily data comprising the closing prices for the same stocks

during the previous 60 days (relative to October 27).

Data cleaning

Since the opening of the exchange, assets can be traded at will if two parties agree to do so. This means that trades between agents can materialize at any time. Therefore, the first problem to overcome in tick-by-tick data is its lack of synchronization. Non-synchronous trading delivers fresh (trade or quote) prices at irregularly spaced times which differ across stocks. Raw data on trades are saved with actual times, so for example a trade would be saved with the timestamp 093315 if it were executed at 09:33:15. In order to standardize the timestamps we normalize the trading day of 6.5 hours (23400 seconds) into a [0, 1] interval, measured in seconds elapsed since the opening. As such, this particular trade, executed 195 seconds after the 09:30 opening, would have a normalized timestamp of 195/23400 ≈ 0.00833.
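A small helper makes the normalization explicit. The function name and the HHMMSS input format handling are illustrative assumptions; the opening time of 09:30 and the 23400-second session come from the text.

```python
def normalize_timestamp(hhmmss, open_seconds=9 * 3600 + 30 * 60,
                        session_seconds=23_400):
    """Map an HHMMSS timestamp to the fraction of the trading day elapsed."""
    value = int(hhmmss)
    hours, rest = divmod(value, 10_000)
    minutes, seconds = divmod(rest, 100)
    elapsed = hours * 3600 + minutes * 60 + seconds - open_seconds
    if not 0 <= elapsed <= session_seconds:
        raise ValueError("timestamp outside the trading session")
    return elapsed / session_seconds

# A trade 195 seconds after the 09:30 opening maps to 195 / 23400.
example = normalize_timestamp("093315")
```

The opening maps to 0 and the 16:00 close maps to 1, so timestamps of different stocks become directly comparable on the unit interval.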

The second issue that needs to be tackled is the presence of simultaneous trades.

In that case, the approach consists in keeping only the last trade recorded. Since

they are simultaneous, there is no price distortion by arbitrarily choosing one of

them, since we apply the same policy for all cases. Last but not least, we may

have a problem of low liquidity in some assets in the sense that they are traded

very few times during a day. This has a tremendous impact on the estimators

since at some point we need to invert the variance-covariance matrix. If those

assets are not traded frequently or the price does not change significantly the

variance of those assets will be extremely small, resulting in the impossibility

to compute its inverse. We set a threshold on the minimum daily variance at

0.001. An asset with a variance lower than the threshold is dropped out of the

sample since it makes the estimation infeasible. After running these cleaning

procedures we obtain a clean database containing 91 assets. The list of stocks

is available in Table 2.
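Two of the cleaning steps can be sketched with NumPy. The first function keeps only the last trade recorded for each timestamp, as described above. The second places the irregular trades on a regular grid using the most recent available price; this previous-tick rule is an assumption of the sketch, since the text does not say how prices are aligned to the sampling grid. Function names and the example values are hypothetical.

```python
import numpy as np

def last_trade_per_timestamp(times, prices):
    """Keep the last recorded price for each timestamp (input sorted by time)."""
    times = np.asarray(times, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # A row is kept when the next timestamp differs; the final row is always kept.
    keep = np.append(times[1:] != times[:-1], True)
    return times[keep], prices[keep]

def previous_tick(times, prices, grid):
    """Price in force at each grid time: the latest trade at or before it."""
    idx = np.searchsorted(times, grid, side="right") - 1
    idx = np.clip(idx, 0, len(times) - 1)   # before the first trade, use the first price
    return np.asarray(prices, dtype=float)[idx]

# Two trades share timestamp 1; only the later record (price 12) survives.
t, p = last_trade_per_timestamp([0, 1, 1, 3], [10.0, 11.0, 12.0, 13.0])
sampled = previous_tick(t, p, np.array([0.0, 2.0, 4.0]))
```

Applying the same grid to every stock yields synchronized price vectors from which the covariance estimators can be computed.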

4.2 Analysis

At the core of our work lies the idea that some estimators provide better estimates than others, as discussed in previous sections. Asymptotic properties were discussed to assess the desirability of using certain candidates in the presence of specific distortions in the observed prices, such as noise and jumps. This section tries to determine whether there is evidence of a systematic outperformance of any estimator over all other studied candidates. Explicit comparisons between methods to assess which estimator is better have been hampered by the multitude of metrics that can be used in forming the comparisons. The distance between two covariance matrices is not well defined, and it is certainly not obvious that all elements of this difference should be treated as equally important.

Figure 4: Performance of RV and Kernel

To overcome this drawback, an asset allocation perspective is introduced to measure the value of covariance information. As shown by Engle and Colacito, realized volatility is the smallest for the correctly specified covariance matrix for any vector of expected returns [Col06]. Making use of this property, we construct the Global Minimum Variance Portfolio (GMVP) using data on the S&P 100 index. By means of the previous theoretical result, the best estimator should yield a GMVP with the least variance for the same expected return. To assess the performance of each estimator, we compute the variance of returns of a Buy-and-Hold strategy on these portfolios over the following 30 trading days (October 27th to December 9th 2010) as a function of intraday sampling frequency.
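For reference, the weights of the global minimum variance portfolio follow from the standard closed form w = Σ⁻¹1 / (1ᵀΣ⁻¹1), which is where the need to invert the estimated covariance matrix arises. The sketch below is illustrative: the function name and the two-asset example are assumptions, and a linear solve is used instead of forming the inverse explicitly.

```python
import numpy as np

def gmvp_weights(cov):
    """Global minimum variance weights: inv(cov) @ 1, rescaled to sum to one."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)      # solves cov @ x = 1 without inverting cov
    return x / x.sum()

# Two uncorrelated assets with variances 1 and 4.
cov = np.diag([1.0, 4.0])
w = gmvp_weights(cov)                   # weights 0.8 and 0.2
portfolio_variance = float(w @ cov @ w) # 0.8
```

Feeding each estimated covariance matrix into `gmvp_weights` and tracking the variance of the resulting portfolio's subsequent returns gives the comparison used in this section.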

As explained by Lunde et al., estimators may yield implausible results when working on real data [SS11]. TSRV happens to exhibit this misbehavior with our data. Nonetheless, it is worth mentioning that TSRV computed using a sampling frequency of 5 seconds outperforms RV at 5 seconds due to the large number of data points. However, as the number of data points decreases, the variance-covariance matrix obtained from TSRV is not invertible even after regularization.

Figure 4 shows the variance of returns for the Kernel estimator against the Realized Volatility. We also include an equally weighted portfolio as a benchmark. We can clearly observe that the RV estimator systematically outperforms the Kernel, with the exception of the highest sampling frequencies, where market microstructure noise becomes relevant and RV is inconsistent. Additionally, Figure 5 plots the variance of the minimum variance portfolio as a function of sampling frequency for the Realized Volatility against the Threshold RV. Interestingly, both specifications of the TRV and RV behave very similarly. This fact may lead us to think that there is no evidence of jumps in the data, or that the size and/or frequency of jumps must be small, as the three values are very close. To understand why the two specifications for TRV perform differently we need to go back to their definitions in section 2.4.

The difference between the two arises when defining how we measure a jump, i.e., the threshold we set on the amplitude of the price change required for it to be considered a jump. The first specification has a broader range, so only extreme events will be classified as jumps. In contrast, the second case is less strict and small variations can go beyond the threshold. It is important to note that the RV performs better at the maximum frequency as well as at relatively low frequencies, but the lowest variance is achieved by all estimators at a 30-second frequency. This is consistent with empirical studies such as [Pat11], in which beating a high-frequency (15 to 120 seconds) RV constitutes a serious (but, as seen, not impossible) challenge.

To conclude this section we offer a final remark. One may wonder whether it makes sense to apply this sort of estimator to the conventional daily data approach, in case it could improve volatility estimates. Table 1 below shows the volatility estimates of the Buy-and-Hold minimum variance portfolio as described previously, using daily data instead of intraday data. The covariances were computed using data on the 60 trading days prior to October 27. Not surprisingly, the outcome of this strategy is extremely poor, especially for the TSRV, the estimator that relies most on data abundance to perform the sub-sampling.

Figure 5: Performance of RV and Threshold RV

Estimator                       Variance
RCOV from daily data            5.370
TSRV from daily data            218.7957
RK from daily data              16.9148
Uniformly weighted portfolio    0.8372

Table 1: Performance of estimators using daily data

5 Conclusions

In this study we analyzed the performance of multiple methods to estimate

volatility using high-frequency data. We saw that in contrast with stochastic

processes theory, using the highest available frequency does not necessarily result

in the best approximation to a continuous sample path because financial

markets’ microstructure plays a distortionary role that leads to (in some cases)

inconsistent estimators.

This study chooses a series of non-parametric volatility estimators to measure realized volatility. The starting point was to measure volatility as an ex-post quadratic variation of asset prices, giving rise to the Realized Volatility estimator.

The next step in adding complexity to the process of measuring volatility is correcting for jumps and noise. To do that, we first compared the performance of the Realized Variance estimator, which is inconsistent under noise, with the performance of estimators that are consistent under noise, i.e., the TSRV and Realized Kernel estimators. As expected, we found that when noise is present TSRV and Realized Kernel outperform Realized Variance, especially for short sampling intervals.

The same procedure was applied in the presence of jumps. We compared Realized Variance (inconsistent in the presence of jumps) with the Threshold Realized Variance estimators, which are consistent under jumps. In this case, and consistent with the literature, we found that the threshold estimators outperform the other estimators. Our results support the hypothesis of Corsi et al. that price jumps do have an impact on future volatility [Ren10].

Lastly, we tested the estimators again using data from the S&P 100. Excluding TSRV, which fails to give plausible outcomes, the results are consistent with the ones obtained using simulated prices. Realized Kernel performs better than the Realized Variance estimator for sampling intervals shorter than 5 seconds. Therefore, we conclude that market microstructure noise emerges for sampling intervals shorter than 5 seconds.

Minimum values for the ex-post portfolio’s variance are found for sampling

frequencies equal to 30 seconds for RV, TRV1 and TRV2 estimators. Although

TRV1 outperforms RV we cannot conclude that jumps do exist in intraday data,

since both specifications behave very similarly.

To get more insights into the nature of jumps, threshold versions of TSRV and

Realized Kernel should be implemented to assess their performance.

References

[AS05] Lan Zhang, Per A. Mykland, Yacine Ait-Sahalia. A tale of two time

scales: Determining integrated volatility with noisy high-frequency

data. Journal of the American Statistical Association, 101 (472):1394–

1411, 2005.

[Col06] Robert Engle,Ricardo Colaccito. Testing and valuing dynamic correla-

tions for asset allocation. Journal of Business and Economic Statistics,

24(2), 2006.

[Ebe01] Torben G. Andersen, Tim Bollerslev, Francis X. Diebold, Heiko Ebens.

The distribution of realized stock return volatility. Joural of Financial

Econometrics, 61:43–76, 2001.

[Eng82] Robert Engle. Autoregressive conditional heteroskedasticity with es-

timates of the variance of u.k. inflation. Econometrica, 50:987–1007,

1982.

[Hes93] Steven L. Heston. A closed-form solution for options with stochastic

volatility with applications to bond and currency options. The Review

of Financial Studies, 6(2):327–343, 1993.

[LS09] O.E. Barndoff-Nielsen,P. Reinhard Hansen, A. Lunde and N. Shephard.

Realized kernels in practice: traded and quotes. The Econometrics

Journal, 2009.

[Lun06] Peter R. Hansen, Asger Lunde. Realized variance and market
microstructure noise. Journal of Business and Economic Statistics,
24(2):127–161, 2006.

[Mik96] Tim Bollerslev, Hans O. Mikkelsen. Modeling and pricing long memory

in stock market volatility. Journal of Econometrics, 73:151–184, 1996.

[Par80] M. Parkinson. The extreme value method for estimating the variance

of the rate of return. Journal of Business, 53:67–78, 1980.

[Pat11] Andrew J. Patton. Data-based ranking of realised volatility estimators.
Journal of Econometrics, 161:284–303, 2011.


[Ren10] Fulvio Corsi, Davide Pirino, Roberto Renò. Threshold bipower variation
and the impact of jumps on volatility forecasting. Journal of Econometrics,
159, 2010.

[She08] Ole Barndorff-Nielsen, Peter R. Hansen, Asger Lunde, Neil Shephard.

Realized kernels to measure ex post variation of equity prices in the

presence of noise. Econometrica, 76(6):1481–1536, 2008.

[SS11] Asger Lunde, Neil Shephard and Kevin Sheppard. Econometric analysis

of vast covariance matrices using composite realized kernels. Working

paper, 2011.

[Yoo94] L.C.G. Rogers, S. Satchell, Y. Yoon. Estimating the volatility of stock
prices: a comparison of methods that use high and low prices. Applied
Financial Economics, 4:241–247, 1994.

[Yu09] Yacine Aït-Sahalia, Jianlin Yu. High frequency market microstructure
noise estimates and liquidity measures. Annals of Applied Statistics,
3:422–457, 2009.

[Zha00] Dennis Yang, Qiang Zhang. Drift-independent volatility estimation
based on high, low, open, and close prices. Journal of Business,
73(3):477–491, 2000.


TICKER | COMPANY NAME | GICS SECTOR NAME | STATUS
MMM | 3M CO | Industrials |
ABT | ABBOTT LABORATORIES | Health Care |
ACN | ACCENTURE PLC-CL A | Information Technology |
ALL | ALLSTATE CORP | Financials |
MO | ALTRIA GROUP INC | Consumer Staples |
AMZN | AMAZON.COM INC | Consumer Discretionary |
AXP | AMERICAN EXPRESS CO | Financials |
AIG | AMERICAN INTERNATIONAL GROUP | Financials |
AMGN | AMGEN INC | Health Care |
APC | ANADARKO PETROLEUM CORP | Energy |
APA | APACHE CORP | Energy |
AAPL | APPLE INC | Information Technology |
T | AT&T INC | Telecommunication Services |
BAC | BANK OF AMERICA CORP | Financials |
BK | BANK OF NEW YORK MELLON CORP | Financials |
BAX | BAXTER INTERNATIONAL INC | Health Care |
BIIB | BIOGEN IDEC INC | Health Care |
BA | BOEING CO/THE | Industrials |
BMY | BRISTOL-MYERS SQUIBB CO | Health Care |
COF | CAPITAL ONE FINANCIAL CORP | Financials |
CAT | CATERPILLAR INC | Industrials |
CVX | CHEVRON CORP | Energy |
CSCO | CISCO SYSTEMS INC | Information Technology |
C | CITIGROUP INC | Financials | dropped
KO | COCA-COLA CO/THE | Consumer Staples |
CL | COLGATE-PALMOLIVE CO | Consumer Staples |
CMCSA | COMCAST CORP-CLASS A | Consumer Discretionary | dropped
COP | CONOCOPHILLIPS | Energy |
COST | COSTCO WHOLESALE CORP | Consumer Staples |
CVS | CVS CAREMARK CORP | Consumer Staples |
DVN | DEVON ENERGY CORPORATION | Energy |
DOW | DOW CHEMICAL CO/THE | Materials |
DD | DU PONT (E.I.) DE NEMOURS | Materials |
EBAY | EBAY INC | Information Technology |
EMC | EMC CORP/MA | Information Technology |
EMR | EMERSON ELECTRIC CO | Industrials |
EXC | EXELON CORP | Utilities |
XOM | EXXON MOBIL CORP | Energy |
FDX | FEDEX CORP | Industrials |
F | FORD MOTOR CO | Consumer Discretionary |
FCX | FREEPORT-MCMORAN COPPER | Materials |
GD | GENERAL DYNAMICS CORP | Industrials |
GE | GENERAL ELECTRIC CO | Industrials | dropped
GILD | GILEAD SCIENCES INC | Health Care |
GS | GOLDMAN SACHS GROUP INC | Financials |
GOOG | GOOGLE INC-CL C | Information Technology |
HAL | HALLIBURTON CO | Energy |
HPQ | HEWLETT-PACKARD CO | Information Technology |
HD | HOME DEPOT INC | Consumer Discretionary |
HON | HONEYWELL INTERNATIONAL INC | Industrials |
INTC | INTEL CORP | Information Technology |
IBM | INTL BUSINESS MACHINES CORP | Information Technology |
JNJ | JOHNSON & JOHNSON | Health Care |
JPM | JPMORGAN CHASE & CO | Financials |
LLY | ELI LILLY & CO | Health Care |
LMT | LOCKHEED MARTIN CORP | Industrials |
LOW | LOWE’S COS INC | Consumer Discretionary |
MA | MASTERCARD INC-CLASS A | Information Technology |
MCD | MCDONALD’S CORP | Consumer Discretionary |
MDT | MEDTRONIC INC | Health Care |
MRK | MERCK & CO. INC. | Health Care |
MET | METLIFE INC | Financials |
MSFT | MICROSOFT CORP | Information Technology |
MON | MONSANTO CO | Materials |
MS | MORGAN STANLEY | Financials |
NOV | NATIONAL OILWELL VARCO INC | Energy |
NKE | NIKE INC -CL B | Consumer Discretionary |
NSC | NORFOLK SOUTHERN CORP | Industrials |
OXY | OCCIDENTAL PETROLEUM CORP | Energy |
ORCL | ORACLE CORP | Information Technology |
PEP | PEPSICO INC | Consumer Staples |
PFE | PFIZER INC | Health Care |
PM | PHILIP MORRIS INTERNATIONAL | Consumer Staples |
PG | PROCTER & GAMBLE CO/THE | Consumer Staples |
QCOM | QUALCOMM INC | Information Technology |
RTN | RAYTHEON COMPANY | Industrials |
SLB | SCHLUMBERGER LTD | Energy |
SPG | SIMON PROPERTY GROUP INC | Financials |
SO | SOUTHERN CO/THE | Utilities |
SBUX | STARBUCKS CORP | Consumer Discretionary |
TGT | TARGET CORP | Consumer Discretionary |
TXN | TEXAS INSTRUMENTS INC | Information Technology |
TWX | TIME WARNER INC | Consumer Discretionary |
USB | US BANCORP | Financials |
UNP | UNION PACIFIC CORP | Industrials |
UNH | UNITEDHEALTH GROUP INC | Health Care |
UPS | UNITED PARCEL SERVICE-CL B | Industrials |
UTX | UNITED TECHNOLOGIES CORP | Industrials |
VZ | VERIZON COMMUNICATIONS INC | Telecommunication Services |
V | VISA INC-CLASS A SHARES | Information Technology |
WMT | WAL-MART STORES INC | Consumer Staples |
WAG | WALGREEN CO | Consumer Staples |
DIS | WALT DISNEY CO/THE | Consumer Discretionary |
WFC | WELLS FARGO & CO | Financials |

Table 2: S&P 100 index constituents for October 27th 2010

