realized volatility estimation (paper)
Barcelona GSE Master Project by Miquel Masoliver, Guillem Roig, Shikhar Singla
Master Program: Finance
About Barcelona GSE master programs: http://j.mp/MastersBarcelonaGSE
REALIZED VOLATILITY ESTIMATION
Master thesis
Miquel Masoliver, Guillem Roig, Shikhar Singla
25.06.2014
Advisors: Christian Brownlees, Eulalia Nualart
Acknowledgements
We would like to thank our advisors Christian Brownlees and Eulalia Nualart for
the guidance and support they provided during the development of this thesis.
Contents
1 Introduction
2 Theoretical background
2.1 Realized Volatility Estimator
2.2 Subsampling to remedy for noise
2.3 Realized Kernel
2.4 Threshold Realized Variance
3 Simulation
3.1 Setup
3.2 Finding optimal parameters for TSRV & RK
3.3 Modeling price jumps
3.4 Implementing market microstructure noise
4 Data implementation
4.1 Data and methodology
4.2 Analysis
5 Conclusions
1 Introduction
In finance it is of utmost importance to understand volatility and its dynamics,
since volatility is a main driver of portfolio construction, of the hedging and
pricing of options, and of the determination of a firm’s exposure to risk. It also
plays a critical role in discovering trading and investment opportunities that
provide an attractive risk-return trade-off [Yu09]. It is therefore not surprising
that many estimators have been proposed to measure volatility from a discrete price
sample: Parkinson [Par80], Rogers et al. [Yoo94] and Yang and Zhang [Zha00],
among others, were the first proponents of new methodologies to estimate
realized volatility using low-frequency daily data. Parametric models such as the
seminal ARCH model first proposed by Engle [Eng82], and subsequent contributions
from Bollerslev and Mikkelsen [Mik96] among many others, have provided
new ways to estimate and forecast volatility. However, the validity of such
volatility measures relies upon specific distributional assumptions, which
calls into question the robustness of previous findings.
Andersen et al. proposed a non-parametric approach based on summing squares
and cross-products of intraday high-frequency returns to construct estimates of
realized daily volatility. The underlying idea is to use the quadratic variation
as an ex-post measure of the variation of asset prices [Ebe01]. This approach,
however, has the weakness that it can be sensitive to market frictions when applied
to returns obtained over shorter time intervals, since it then estimates the
magnitude of the noise term rather than volatility. To overcome these drawbacks,
Zhang et al., on the one hand, proposed the so-called Two-Scale Realized Variance
(TSRV), an unbiased estimator that incorporates noise into the model and separates
the sample into multiple "grids" [AS05]. Barndorff-Nielsen et al., on the other
hand, introduced the family of realized kernel estimators to carry out efficient
feasible inference on the ex-post variation of underlying equity prices in the
presence of market frictions [She08].
All of these estimators, however, turn out to be inconsistent in the presence of
jumps. Corsi et al. introduced the family of threshold realized variance
estimators (TRV), which detect discontinuities in prices and exclude those price
points from the volatility calculation [Ren10].
The main purpose of this study is to find the optimal volatility estimator
in a non-parametric framework. In particular, this study focuses on the
estimation of the daily integrated variance-covariance matrix of stock returns
using simulated and high-frequency data in the presence of market microstructure
noise, jumps, and non-synchronous trading. This work is structured in three
building blocks: (i) Price processes are simulated in the presence of jumps and
market microstructure noise. This allows us to obtain some insight into the
estimators’ performance. (ii) The aforementioned realized volatility estimators
are applied to high-frequency data of the S&P 100 stocks on October 27th, 2010
using 5-second, 10-second, 30-second, 1-minute and 2-minute time intervals.
(iii) We use the estimated covariance matrices to construct the global minimum
variance portfolio for each sampling frequency. These global minimum variance
portfolios are used to build 30-day ex-post portfolio returns, and we use the
variance of these returns to compare the performance of the estimators.
2 Theoretical background
In this section we provide a review of the main theory on different non-parametric
volatility estimators that take part in the study, as well as the rationale behind
using high-frequency data.
2.1 Realized Volatility Estimator
Volatility estimation and inference have attracted much attention in the financial
econometric and statistical literature. Even in a discrete time framework one
will often start with the sum of squared log-returns, as it is not only the simplest
and most natural estimator, but also the one with the most desirable properties,
as shown in [Yu09]. One of the recent achievements in financial econometrics
is the introduction of the concept of realized volatility [AS05], which allows one
to consistently estimate the accumulated price variation over some time interval
by summing the high-frequency squared returns.
Let us consider that the logarithmic prices for a given asset are governed by
the following diffusion process,
\[ y_t = \int_0^t \mu(s)\,ds + \int_0^t \sigma(s)\,dW_s. \tag{1} \]
The object of interest is the amount of accumulated variation for the asset price
over a time interval τ = [0, 1] (representing a trading day), called integrated
variance. Mathematically it is given by the following expression,
\[ IV = \int_0^1 \sigma^2(s)\,ds. \tag{2} \]
Since intraday prices are not observed continuously, it is not feasible to compute
the previous integral; therefore, one needs an estimator for the integrated
variance. One feasible solution is to discretize the time interval τ = [0, 1]
into a grid of subintervals t_0 = 0 ≤ · · · ≤ t_n = 1 of length t_i − t_{i−1} = ∆ = 1/n,
and then set the prices y_i = y_{t_i}. Under such circumstances, the realized
volatility estimator, defined as the sum of the squared returns, i.e.
\[ RV^2 = \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2, \tag{3} \]
turns out to be a natural estimator for the integrated variance. The asymptotic
properties of this estimator are especially striking when sampling occurs at
an increasing frequency, that is, for small ∆, which, when assets trade every few
seconds, is a realistic approximation to what we observe using the now commonly
available transaction or quote-level sources of financial data. In particular, fully
observing the sample path of an asset will perfectly reveal the volatility of that
path.
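As an illustration, the sum of squared returns can be sketched in a few lines of Python. This is a minimal sketch, not the code used in the study: the function name `realized_variance` and the constant-volatility example (with hypothetical values for `sigma` and `n`) are assumptions made here, and the function simply sums over all consecutive returns of the array it receives.

```python
import numpy as np

def realized_variance(log_prices):
    """Sum of squared log-returns over the sampling grid."""
    returns = np.diff(np.asarray(log_prices, dtype=float))
    return float(np.sum(returns ** 2))

# Hypothetical example: constant volatility sigma on the unit interval,
# so that the integrated variance equals sigma**2.
rng = np.random.default_rng(0)
sigma, n = 0.2, 23400
y = np.cumsum(sigma * np.sqrt(1.0 / n) * rng.standard_normal(n))
rv = realized_variance(y)
print(rv, sigma ** 2)  # the two numbers should be close for large n
```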
2.2 Subsampling to remedy for noise
As pointed out in the previous section, the realized variance estimator
converges to the quadratic variation of the process. Approximating the
integrated variance by the RV estimator therefore seems natural, since stochastic
process theory shows that
\[ \operatorname*{plim}_{n \to \infty} \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2 = \int_0^{T=1} \sigma_s^2\,ds. \tag{4} \]
However, in real life, market microstructure noise appears when dealing with
high-frequency data. Such noise captures a variety of frictions inherent in the
trading process: bid-ask bounces, discreteness of price changes, differences in
trade sizes, etc.
Assume a portfolio of N assets with M price points for each asset. Assume that
the log prices are contaminated by market microstructure noise u_i, i.e.
\[ y_i = y_i^* + u_i. \tag{5} \]
In this case, the observed return is given by
\[ r_i = r_i^* + \varepsilon_i, \tag{6} \]
with the intraday noise increment ε_i = u_i − u_{i−1}. Therefore, the RV estimator
can be decomposed as
\[ RV = RV^* + 2 \sum_{i=1}^{M} r_i^* \varepsilon_i + \sum_{j=1}^{M} \varepsilon_j^2. \tag{7} \]
The last term on the right can be interpreted as the (unobservable) realized
variance of the noise process, while the second term is induced by potential
dependence between the efficient price and the noise. Based on this decompo-
sition, RV is a biased estimator for IV [Lun06].
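The decomposition can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the study: a constant-volatility efficient price, i.i.d. Gaussian noise with standard deviation `omega`, and hypothetical parameter values. Under i.i.d. noise the last term has expectation 2nω², which is why the bias of RV grows with the number of observations.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, omega = 23400, 0.2, 0.0005   # hypothetical values

# Efficient log-price y* and observed log-price y = y* + u.
y_star = np.cumsum(sigma * np.sqrt(1.0 / n) * rng.standard_normal(n + 1))
u = omega * rng.standard_normal(n + 1)
y = y_star + u

r_star = np.diff(y_star)   # efficient returns r*
eps = np.diff(u)           # noise increments eps_i = u_i - u_{i-1}
r = np.diff(y)             # observed returns r = r* + eps

rv = np.sum(r ** 2)                  # RV on noisy prices
rv_star = np.sum(r_star ** 2)        # RV on efficient prices
cross = 2.0 * np.sum(r_star * eps)   # cross term
noise = np.sum(eps ** 2)             # realized variance of the noise

print(rv, rv_star + cross + noise)   # identical up to rounding
print(noise, 2 * n * omega ** 2)     # noise term vs. its expectation
```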
Two-Scale Realized Variance Estimator
One approach to overcome this drawback consists in explicitly incorporating
microstructure noise into the analysis. Some estimators have been developed
that make use of all the data, no matter how high the frequency
and how noisy the data are. These methods decompose the total observed variance
into a component attributable to the fundamental price and another attributable
to the market microstructure noise [Yu09].
One of these consistent estimators, named Two-Scale Realized Variance (TSRV),
was first proposed by Zhang et al. [AS05]. The underlying idea is to sample
sparsely at some lower frequency and to evaluate the quadratic variation at the
two frequencies. By averaging the results over the entire sample and taking a
suitable linear combination, one obtains a consistent and asymptotically
unbiased estimator of IV.
More precisely, suppose T = {t_1, . . . , t_n} is a vector containing the times
of the observed log prices in a certain trading day. Then T is partitioned into
K non-overlapping sub-grids with an equal number of observations. The kth (k
= 1, 2, ..., K) sub-grid extracts the observations from the whole intraday data
with the following times attached: T_k = {t_{k−1}, t_{k−1+K}, ..., t_{k−1+n_k K}},
where n_k is the largest integer such that the (t_{k−1+n_k K})th observation is
included in T_k. The TSRV is calculated as follows:
\[ TSRV^2 = \frac{1}{K} \sum_{k=1}^{K} \sum_{t_i, t_{i+1} \in T_k} (y_{t_{i+1}} - y_{t_i})^2 - \frac{\bar{n}}{n} \sum_{t_j, t_{j+1} \in T} (y_{t_{j+1}} - y_{t_j})^2, \tag{8} \]
where y_t is the log-price process, n is the total number of observations in an
intraday dataset, and \bar{n} = (n − K + 1)/K.
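A minimal Python sketch of this two-scale construction is given below. It is an illustration rather than the implementation used in the study: the function name `tsrv` is hypothetical, the sub-grids are taken as every K-th observation starting from offsets 0, ..., K−1, and n is taken to be the number of observations, as in the text.

```python
import numpy as np

def tsrv(log_prices, K):
    """Two-scale estimator: average sub-grid RV minus a scaled full-grid RV."""
    y = np.asarray(log_prices, dtype=float)
    n = y.size                                   # number of observations
    rv_all = np.sum(np.diff(y) ** 2)             # RV on the full grid
    # RV on each of the K sub-grids y[k], y[k+K], y[k+2K], ..., then averaged.
    rv_sub = np.mean([np.sum(np.diff(y[k::K]) ** 2) for k in range(K)])
    n_bar = (n - K + 1) / K
    return float(rv_sub - (n_bar / n) * rv_all)
```

With K = 1 the single sub-grid coincides with the full grid and the two terms cancel, so the estimator is only informative for K > 1.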
2.3 Realized Kernel
Realized kernel estimators, introduced by Barndorff-Nielsen et al. (2011), can
be used to estimate the quadratic variation of an underlying price process
from noisy high-frequency data while guaranteeing consistency and positive
semi-definiteness [LS09].
The multivariate realized kernel is defined as
\[ K(X) = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \gamma_h, \tag{9} \]
where γ_h is the matrix of autocovariances given by
\[ \gamma_h = \sum_{j=h+2}^{n} r_j r_{j-h}^{T}, \quad h \ge 0; \qquad \gamma_h = \gamma_{-h}^{T}, \quad h < 0, \tag{10} \]
and k(x) is the Parzen function, given by
\[ k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 1/2, \\ 2(1-x)^3, & 1/2 \le x \le 1, \\ 0, & x > 1. \end{cases} \tag{11} \]
Here r_j is the high-frequency return, and k is extended symmetrically to negative
arguments, k(−x) = k(x). We focus on the Parzen function because it satisfies the
smoothness conditions and is guaranteed to produce a non-negative estimate. The
preferred choice of bandwidth is H* = c* ξ^{4/5} n^{3/5}, where c* = 3.5134 for the
Parzen function, ξ² = ω²/√IQ denotes the noise-to-signal ratio, ω² is a measure of
the microstructure noise variance and IQ is the integrated quarticity [LS09].
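The kernel-weighted sum of autocovariances can be sketched as follows. This is a simplified illustration, not the authors' code: the names `parzen` and `realized_kernel` are hypothetical, the autocovariances are summed over all available return pairs (no special treatment of the end points), and the Parzen weight is applied to |h| so that positive and negative lags receive the same weight.

```python
import numpy as np

def parzen(x):
    """Parzen weight function, evaluated at |x|."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def realized_kernel(returns, H):
    """returns: (n, d) array of high-frequency returns; H: bandwidth."""
    r = np.asarray(returns, dtype=float)
    if r.ndim == 1:
        r = r[:, None]
    n, d = r.shape
    rk = np.zeros((d, d))
    for h in range(H + 1):
        # gamma_h = sum_j r_j r_{j-h}^T over all available pairs
        gamma_h = r[h:].T @ r[:n - h]
        w = parzen(h / (H + 1))
        if h == 0:
            rk += w * gamma_h
        else:
            # lags h and -h together: gamma_h + gamma_h^T
            rk += w * (gamma_h + gamma_h.T)
    return rk
```

For H = 0 the function reduces to the realized covariance matrix, i.e. the sum of outer products of the returns.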
2.4 Threshold Realized Variance
The three estimators described above are inconsistent when there are jumps in
the prices. For an estimator to work well in the presence of jumps, it has to detect
the jumps and exclude the price points at which a jump occurs from the volatility
calculation. Corsi et al. (2010) introduce the family of threshold realized variance
estimators, which are consistent in the presence of jumps [Ren10]. If the squared
difference in log prices (the squared return) is above a certain threshold, it is not
included in the calculation of the variance. Threshold realized variance is defined
as follows:
\[ TRV_\delta(y)_t = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \Theta(\delta)\}}. \tag{12} \]
The threshold function has to satisfy
\[ \lim_{\delta \to 0} \Theta(\delta) = 0 \quad \text{and} \quad \lim_{\delta \to 0} \frac{\delta \log(1/\delta)}{\Theta(\delta)} = 0, \tag{13} \]
i.e., it has to vanish more slowly than the squared modulus of continuity of
Brownian motion in order to have convergence in probability (consistency). The two
threshold functions we work with are as follows:
\[ TRV_1 = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \log(1/\delta)\sqrt{\delta}\}}, \tag{14} \]
\[ TRV_2 = \sum_{j=1}^{[T/\delta]} (y_j - y_{j-1})^2 \, \mathbf{1}_{\{(y_j - y_{j-1})^2 \le \sqrt{\delta}\}}, \tag{15} \]
where T denotes the length of the interval (1 in our case) and
\[ \delta = \frac{1}{\text{number of observations}}. \tag{16} \]
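The two threshold estimators differ only in the cut-off applied to the squared returns, which the following sketch makes explicit. The function names are hypothetical and the example path (constant volatility plus a single jump of hypothetical size) is an assumption made here for illustration only.

```python
import numpy as np

def threshold_rv(log_prices, threshold):
    """Sum of squared returns, skipping those whose square exceeds the threshold."""
    r2 = np.diff(np.asarray(log_prices, dtype=float)) ** 2
    return float(np.sum(r2[r2 <= threshold]))

def trv1(log_prices):
    delta = 1.0 / len(log_prices)            # delta = 1 / number of observations
    return threshold_rv(log_prices, np.log(1.0 / delta) * np.sqrt(delta))

def trv2(log_prices):
    delta = 1.0 / len(log_prices)
    return threshold_rv(log_prices, np.sqrt(delta))

# Hypothetical example: diffusive path with one jump of size 0.1 at mid-day.
rng = np.random.default_rng(4)
n = 23400
y = np.cumsum(0.2 * np.sqrt(1.0 / n) * rng.standard_normal(n))
y_jump = y.copy()
y_jump[n // 2:] += 0.1
print(trv1(y_jump), trv2(y_jump))
```

In this example the squared jump return (about 0.01) lies below the first threshold (about 0.066 for n = 23400) but above the second (about 0.0065), so only the second specification removes it.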
3 Simulation
The aim of this section is to study the behavior of the different estimators
in a controlled environment so that we can analyze their performance under various
scenarios. The main advantage of simulating is that it provides us with the
"true" variance-covariance matrix, since under the Heston model volatility is
also simulated, which allows us to compare the performance of the estimators.
The section starts by laying out the foundations of the framework under which
we run the simulations as well as the main technical aspects one has to take
into account. We then offer some observations and comments on the results to
highlight the main properties of each estimator.
3.1 Setup
The first step to simulate the covariance matrix is to impose some structure on
the underlying asset’s price and volatility. In a discrete time framework, GARCH
family processes are widely used. The analog in a continuous time
setting corresponds to a Geometric Brownian Motion price process and a Heston
structure for the volatility. Therefore, the market model will be implemented by
first simulating a price process and then generating its volatility given
the Heston model. The price process is specified by
\[ dy_t = \mu y_t\,dt + \sigma_t y_t\,dW_t^S, \tag{17} \]
where µ represents the return of the asset and σ_t the instantaneous volatility.
The instantaneous variance is determined by a CIR process of the form
\[ d\sigma_t^2 = \kappa(\theta - \sigma_t^2)\,dt + \xi \sigma_t\,dW_t^\nu, \tag{18} \]
where W_t^S and W_t^ν are Brownian motions with correlation ρ.
In the variance equation, θ is the long-run average variance; as t tends to
infinity the expected value of σ_t² tends to θ. κ is the rate at
which the variance reverts to θ and ξ is the volatility of the volatility, which
determines the variance of σ_t² [Hes93].
Simulating prices using the Heston model allows us to compute the integrated
variance-covariance (i.e. the true volatility) by adding the variance-covariance
matrix simulated at each step of the time interval.
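For a single asset, such a simulation could be sketched as follows. This is a simplified illustration and not the simulation code of the study: it uses an Euler discretization of the log-price (obtained from the price equation by Itô's lemma) with the variance truncated at zero inside the square root, and the function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_heston(n, mu=0.05, kappa=5.0, theta=0.04, xi=0.5, rho=-0.5,
                    y0=np.log(100.0), v0=0.04, seed=0):
    """Euler scheme on [0, 1] with n steps; returns log-prices, variances, IV."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    y = np.empty(n + 1)   # log-price path
    v = np.empty(n + 1)   # instantaneous variance path
    y[0], v[0] = y0, v0
    for i in range(n):
        z1, z2 = rng.standard_normal(2)
        dw_s = np.sqrt(dt) * z1
        # second Brownian increment, correlated with the first through rho
        dw_v = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho ** 2) * z2)
        v_pos = max(v[i], 0.0)   # truncation keeps the square root well defined
        y[i + 1] = y[i] + (mu - 0.5 * v_pos) * dt + np.sqrt(v_pos) * dw_s
        v[i + 1] = v[i] + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos) * dw_v
    # "True" integrated variance: sum of the simulated variances times dt.
    iv = float(np.sum(np.maximum(v[:-1], 0.0)) * dt)
    return y, v, iv
```

The returned `iv` plays the role of the benchmark against which the estimators computed from `y` can be compared.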
To compare the performance of the estimators we construct a matrix A,
defined as the difference between the variance-covariance matrix of each estimator
and the integrated variance-covariance matrix,
\[ A_{RV} = RV - IV, \qquad A_{TSRV} = TSRV - IV, \qquad A_{RK} = RK - IV. \tag{19} \]
RV (TSRV, RK) denotes the variance-covariance matrix associated with
the RV estimator (TSRV estimator, RK estimator).
Afterwards, these matrices are used to compute the Frobenius and the Infinite
norm, which are defined as
\[ \text{Frobenius norm:} \quad \|A\|_f = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{m} |a_{ij}|^2}, \tag{20} \]
\[ \text{Infinite norm:} \quad \|A\|_i = \max_{i,j} |a_{ij}|. \tag{21} \]
Figure 1: Frobenius and Infinite norm of the matrices A defined in (19), used to
test the performance of the estimators in the benchmark scenario, i.e. no jumps and
no noise.
The intuition behind these norms is that they serve as a metric to compare
the performance of the estimators by measuring how much their estimated
values differ from the "true" values. This distance is captured by the elements of
the matrices A_RV, A_TSRV and A_RK. The performance is assessed either by
means of the Frobenius norm, i.e. the square root of the sum of the squared
elements of these matrices, or by means of the Infinite norm, i.e. the
element with the largest absolute value.
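Both norms are one-liners in NumPy, as the following sketch with hypothetical function names and a toy error matrix shows. Note that the norm in (21) is the largest absolute entry; NumPy's `np.linalg.norm(A, np.inf)` computes the maximum absolute row sum instead, so it is not used here.

```python
import numpy as np

def frobenius_norm(A):
    """Square root of the sum of squared entries, as in (20)."""
    return float(np.sqrt(np.sum(np.abs(A) ** 2)))

def max_abs_norm(A):
    """Largest absolute entry, as in (21)."""
    return float(np.max(np.abs(A)))

# Toy example of an error matrix A = estimate - true value.
A = np.array([[3.0, 0.0],
              [0.0, -4.0]])
print(frobenius_norm(A))  # 5.0
print(max_abs_norm(A))    # 4.0
```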
Figure 1 shows the benchmark scenario, i.e. the simulation of prices without
jumps or noise. We can observe that RV is consistent and that its
performance improves with the sampling frequency. This behavior also holds for all
other estimators. It is important to point out that the TSRV outperforms the
RV for low frequencies up to a threshold. Beyond that point the RV is the best
among our estimators.
3.2 Finding optimal parameters for TSRV & RK
The optimal value of K for TSRV is c n^{2/3}, where
\[ c = \left( \frac{12\,\omega^2}{IQ} \right)^{1/3} \]
and n is the number of data points.
The preferred choice of bandwidth for RK is H* = c* ξ^{4/5} n^{3/5}, where c* = 3.5134
for the Parzen function and ξ² = ω²/√IQ.
We estimate ω² = RV/(2n) and IQ = RV_sparse², as explained in [LS09]. We
compute the subsampled realized variance based on returns from every 50th price
point. More precisely, we compute a total of 50 realized variances by shifting the
first observation. RV_sparse is simply the average of these estimators. This
exercise gives K and H for every stock; the global K and H are simply the averages
of these K’s and H’s.
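The procedure for one stock can be sketched as below. This is an illustrative sketch under stated assumptions: the function names are hypothetical, n is taken as the number of returns, the results are rounded to integers of at least 1, and the Parzen constant is computed as ((12)²/0.269)^{1/5}, the expression reported by Barndorff-Nielsen et al. for this kernel.

```python
import numpy as np

# Parzen constant c*, computed from its ingredients (approximately 3.5134).
C_STAR_PARZEN = (12.0 ** 2 / 0.269) ** 0.2

def sparse_rv(log_prices, step=50):
    """Average of `step` realized variances, each using every step-th price."""
    y = np.asarray(log_prices, dtype=float)
    return float(np.mean([np.sum(np.diff(y[s::step]) ** 2) for s in range(step)]))

def tuning_parameters(log_prices, step=50):
    """Return (K, H): number of TSRV sub-grids and realized kernel bandwidth."""
    y = np.asarray(log_prices, dtype=float)
    n = y.size - 1                               # number of returns (assumption)
    rv = np.sum(np.diff(y) ** 2)
    omega2 = rv / (2.0 * n)                      # noise variance estimate
    iq = sparse_rv(y, step) ** 2                 # IQ = RV_sparse^2
    c = (12.0 * omega2 / iq) ** (1.0 / 3.0)
    K = max(1, int(round(c * n ** (2.0 / 3.0))))
    xi2 = omega2 / np.sqrt(iq)                   # noise-to-signal ratio xi^2
    # xi^(4/5) = (xi^2)^(2/5)
    H = max(1, int(round(C_STAR_PARZEN * xi2 ** 0.4 * n ** 0.6)))
    return K, H
```

Applying `tuning_parameters` to each stock and averaging the resulting values would give the global K and H described above.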
3.3 Modeling price jumps
In a liquid and efficient financial market prices are set so that they reflect all
available information regarding all traded products in the exchange. Following
this rationale, when new information is released or generated prices will change
accordingly. This shift in the price can be either smooth or very sharp. We
will focus on the latter, since a sharp increase or decrease in the price can be
understood as a jump.
In finance, the building block of a jump model is the Poisson process. Let us
consider a sequence {τ_i}_{i≥1} of independent exponential random variables with
parameter λ, that is, with tail probability
\[ P(\tau_i \ge y) = e^{-\lambda y}, \tag{22} \]
so that the cumulative distribution function is F(y) = 1 − e^{−λy}.
Let T_n = \sum_{i=1}^{n} τ_i; then the process
\[ N(t) = \sum_{n \ge 1} \mathbf{1}_{\{T_n \le t\}} \tag{23} \]
is called a Poisson process with rate λ. In our case, τ_i corresponds to the waiting
time between jumps, and N(t) to the number of jumps that have occurred up to time t.
Poisson processes have become the paradigm for jump models in diffusion
processes, since they share with Brownian motion the property of
increments being independent and stationary, i.e., for every t > s the increment
N_t − N_s is independent of the history of the process up to time s.
In reality, however, simple Poisson processes are of little interest since they
only consider one possible jump size and assume that jumps occur strictly one
after the other. To relax such assumptions we consider a compound Poisson
process, in which waiting times between jumps are exponentially distributed,
whereas jump sizes can have an arbitrary distribution.
More precisely, letting B_1, B_2, ... denote the i.i.d. amplitudes of successive
jumps, the total jump amplitude accumulated up to time t, A(t), is given by
\[ A(t) = \sum_{n=1}^{N(t)} B_n, \tag{24} \]
where N(t) is the number of jumps occurring up to time t.
In practice, we first generate the waiting times {X_n = t_n − t_{n−1}}, i.e., the
time intervals between events, by considering that they are exponentially
distributed as described in (22). From such a distribution
one can explicitly compute the set of times {t_n} at which the events take place,
\[ t_n = t_{n-1} - \frac{1}{\lambda} \ln(U), \tag{25} \]
where U is a Uniform [0, 1] random variable. Therefore, by simulating a vector of
Uniform [0, 1] draws and choosing λ such that one jump takes place per day on
average, we obtain the set of times at which jumps occur.
The next step consists in modeling the amplitude of those jumps. Efficient
pricing theory dictates that a jump in the price can either be a consequence of
supply and demand adjustments or due to new information released. In both
cases this jump can move in either direction. Hence, the best way to model the
direction and size of the shift in price is by assigning to each jump a draw from
a (standard) normal distribution corresponding to its amplitude.
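The jump component described above can be sketched in Python as follows. This is a minimal illustration, not the code of the study: the function name, the regular grid on [0, 1] and the `jump_std` scaling parameter are assumptions introduced here (with `jump_std=1.0` the amplitudes are standard normal draws).

```python
import numpy as np

def simulate_jumps(lam, n, jump_std=1.0, seed=0):
    """Compound Poisson process on [0, 1], evaluated on a grid of n points.

    Returns the jump times, the jump sizes and the cumulative amplitude A(t).
    """
    rng = np.random.default_rng(seed)
    times = []
    t = 0.0
    while True:
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
        t = t - np.log(u) / lam         # t_n = t_{n-1} - ln(U) / lambda
        if t > 1.0:
            break
        times.append(t)
    times = np.array(times)
    sizes = jump_std * rng.standard_normal(times.size)
    grid = np.arange(1, n + 1) / n
    # N(t): number of jump times <= t for every grid point.
    counts = np.searchsorted(times, grid, side="right")
    cum_sizes = np.concatenate(([0.0], np.cumsum(sizes)))
    return times, sizes, cum_sizes[counts]   # A(t) = B_1 + ... + B_{N(t)}
```

Adding the returned cumulative amplitude to a simulated log-price path of the same length yields a price process with jumps.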
Figure 2 plots the performance of our estimators when we add jumps to the
price process. In this case, a higher sampling frequency provides better estimates
as well, but the first three estimators are inconsistent and thus unreliable. On the
other hand, the Threshold Realized Variance estimators are consistent under jumps;
both of them outperform the inconsistent estimators and behave very
similarly.
Figure 2: Frobenius and Infinite norm of the matrices A defined in (19), used to
test the performance of the estimators in the presence of jumps.
3.4 Implementing market microstructure noise
To study the effect of market microstructure noise on the behavior of the
estimators we need to add noise to the price process. A simple yet effective
approach is to add a random term to the price. This term consists of random draws
from a normal distribution. Our main concern regarding noise is how it affects the
behavior of the estimators depending on its size. Figure 3 shows the impact of
noise on the estimation of volatility depending on the size of this distortion,
measured using the Frobenius norm. On the horizontal axis we measure the sampling
frequency in such a way that values far from the origin represent estimators using
very few data points, while points close to the origin are true high-frequency
estimators.
We can see that for small noise the effect is insignificant. When we
increase the noise variance from σ²_noise = 0.025 onwards, only consistent
estimators such as the TSRV and Kernel remain reliable, whereas the RV is clearly
outperformed due to its inconsistency in the presence of noise.
Figure 3: Impact of noise using Frobenius norm
4 Data implementation
In the previous section we described how the estimators behave under
different ways of simulating prices. The next steps are assessing whether estimates
from intraday data outperform estimates using daily data, determining which
estimator performs best, and addressing the question of whether intraday prices
have jumps or not.
4.1 Data and methodology
All the estimators are obtained by using intraday tick-by-tick prices on the
trades executed on October 27th, 2010 in the S&P 100 index. This dataset
comprises the prices of the 94 stocks traded on that day from 09:30 to 16:00,
the opening and closing times of the exchange. The integrated volatility is also
estimated using daily data comprising the closing prices for the same stocks
during the previous 60 days (relative to October 27).
Data cleaning
Since the opening of the exchange, assets can be traded at will if two parties
agree to do so. This means that trades between agents can materialize at any
time. Therefore, the first problem to overcome in tick-by-tick data is its lack of
synchronization. Non-synchronous trading delivers fresh (trade or quote) prices
at irregularly spaced times which differ across stocks. Raw data on trades are
saved with actual times, so for example a trade would be saved with a timestamp
093315 if it were executed at 09:33:15. In order to standardize the timestamps
we normalize the trading day of 6.5 hours (23400 seconds) into a [0, 1] interval
measured in seconds. Since the exchange opens at 09:30, this particular trade,
executed 195 seconds after the opening, would have a normalized timestamp of
195/23400 = 0.00833.
The second issue that needs to be tackled is the presence of simultaneous trades.
In that case, the approach consists in keeping only the last trade recorded. Since
the trades are simultaneous and we apply the same policy in all cases, arbitrarily
choosing one of them does not distort prices. Last but not least, some assets may
suffer from low liquidity in the sense that they are traded
very few times during a day. This has a tremendous impact on the estimators
since at some point we need to invert the variance-covariance matrix. If those
assets are not traded frequently or their price does not change significantly, the
variance of those assets will be extremely small, making it impossible
to compute the inverse. We set a threshold on the minimum daily variance at
0.001. An asset with a variance lower than the threshold is dropped from the
sample since it makes the estimation infeasible. After running these cleaning
procedures we obtain a clean database containing 91 assets. The list of stocks
is available in Table 2.
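The first two cleaning steps can be sketched as follows. The helper names are hypothetical, timestamps are assumed to be HHMMSS strings sorted in recording order, and the trading day is assumed to run from 09:30:00 to 16:00:00 as stated above.

```python
import numpy as np

OPEN_SECONDS = 9 * 3600 + 30 * 60   # 09:30:00 expressed in seconds
DAY_SECONDS = 23400                 # 6.5 trading hours

def normalize_timestamp(hhmmss):
    """Map an HHMMSS string to [0, 1] over the 09:30-16:00 trading day."""
    h, m, s = int(hhmmss[0:2]), int(hhmmss[2:4]), int(hhmmss[4:6])
    return (h * 3600 + m * 60 + s - OPEN_SECONDS) / DAY_SECONDS

def keep_last_per_timestamp(times, prices):
    """Among trades sharing a timestamp, keep only the last one recorded."""
    times = np.asarray(times)
    prices = np.asarray(prices)
    # A trade is kept if the next trade carries a different timestamp.
    keep = np.append(times[1:] != times[:-1], True)
    return times[keep], prices[keep]

print(normalize_timestamp("093315"))   # 195 / 23400, roughly 0.00833
t, p = keep_last_per_timestamp([1, 1, 2, 3, 3, 3], [10, 11, 12, 13, 14, 15])
print(t, p)                            # [1 2 3] [11 12 15]
```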
4.2 Analysis
At the core of our work lies the idea that some estimators provide better
estimates than others, as discussed in previous sections. Asymptotic properties
were discussed to assess the desirability of using certain candidates in the
presence of specific distortions in the observed prices such as noise and jumps.
This section tries to determine whether there is evidence of a systematic
outperformance of any estimator over all other studied candidates. Explicit
comparisons between methods to assess which estimator is better have been hampered
by the multitude of metrics that can be used in forming the comparisons. The
distance between two covariance matrices is not well defined, and it is certainly
not obvious that all elements of this difference should be treated as equally
important.
Figure 4: Performance of RV and Kernel
To overcome this drawback, an asset allocation perspective is introduced
to measure the value of covariance information. As shown by Engle and Colacito,
realized volatility is the smallest for the correctly specified covariance matrix
for any vector of expected returns [Col06]. Making use of this property, we
construct the Global Minimum Variance Portfolio (GMVP) using data on the S&P 100
index. By the previous theoretical result, the best estimator should
yield a GMVP with the least variance for the same expected return. To assess
the performance of each estimator, we compute the variance of returns of a
Buy-and-Hold strategy on these portfolios over the following 30 trading days
(October 27th to December 9th, 2010) as a function of the intraday sampling
frequency.
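For reference, the weights of the global minimum variance portfolio have the standard closed form w = Σ⁻¹1 / (1ᵀΣ⁻¹1), where Σ is the estimated covariance matrix and 1 a vector of ones. The sketch below, with a hypothetical function name and a toy covariance matrix, shows how these weights could be computed from any of the estimated matrices; it is not taken from the study itself.

```python
import numpy as np

def gmvp_weights(cov):
    """Global minimum variance weights: Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)   # Sigma^{-1} 1 without forming the inverse
    return x / x.sum()

# Toy example: two uncorrelated assets with variances 1 and 4.
cov = np.array([[1.0, 0.0],
                [0.0, 4.0]])
w = gmvp_weights(cov)
print(w)              # weights 0.8 and 0.2
print(w @ cov @ w)    # portfolio variance, 0.8
```

Solving the linear system instead of inverting the matrix explicitly is a design choice made here; it fails in the same situations in which the covariance matrix is not invertible.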
As explained by Lunde et al., estimators may yield implausible results when
working on real data [SS11]. TSRV happens to exhibit this misbehavior with our
data. Nonetheless, it is worth mentioning that TSRV computed using a sampling
frequency of 5 seconds outperforms RV at 5 seconds due to the large number of
data points. However, as the number of data points decreases, the
variance-covariance matrix obtained from TSRV is not invertible even after
regularization.
Figure 4 shows the variance of returns for the Kernel estimator against the
Realized Volatility. We also include an equally weighted portfolio as a benchmark.
We can clearly observe that the RV estimator systematically outperforms the
Kernel with the exception of the highest sampling frequencies, where market
microstructure noise becomes relevant and RV is inconsistent. Additionally,
Figure 5 plots the variance of the minimum variance portfolio as a function of
sampling frequency for the Realized Volatility against the Threshold RV.
Interestingly, both specifications of the TRV and the RV behave very similarly.
This fact may lead us to think that there is no evidence of jumps in the data, or
that the size and/or frequency of jumps must be small, as the three values are very
close. To understand why the two specifications for TRV perform differently we
need to go back to their definitions in section 2.4.
The difference between the two arises from how we define a jump,
i.e., the threshold we set on the amplitude of the price change required for it to
be considered a jump. The first specification has a higher threshold, so only
extreme events will be classified as jumps. In contrast, the second specification
has a lower threshold, so smaller variations can go beyond it. It is important to
note that the RV performs better at the maximum frequency as well as at relatively
low frequencies, but the lowest variance is achieved by all estimators at a
30-second frequency. This is consistent with empirical studies such as [Pat11], in
which beating a high-frequency (15 to 120 seconds) RV constitutes a serious (but,
as seen, not impossible) challenge.
To conclude this section we offer a final remark. One may wonder whether it
makes sense to apply this sort of estimator to conventional daily data,
in case it could improve volatility estimates. Table 1 below shows
the variance of the Buy-and-Hold minimum variance portfolio
described previously, using daily data instead of intraday data. The covariances
were computed using data on the 60 trading days prior to October 27. Not
surprisingly, the outcome of this strategy is extremely poor, especially for the
TSRV, the estimator that relies most on data abundance to perform the sub-sampling.
Figure 5: Performance of RV and Threshold RV
Estimator                        Variance
RCOV from daily data             5.370
TSRV from daily data             218.7957
RK from daily data               16.9148
Uniformly weighted portfolio     0.8372
Table 1: Performance of estimators using daily data
5 Conclusions
In this study we analyzed the performance of multiple methods to estimate
volatility using high-frequency data. We saw that in contrast with stochastic
processes theory, using the highest available frequency does not necessarily re-
sult in the best approximation to a continuous sample path because financial
markets’ microstructure plays a distortionary role that leads to (in some cases)
inconsistent estimators.
This study chooses a series of non-parametric volatility estimators to
measure realized volatility. The starting point was to measure volatility as an
ex-post quadratic variation of asset prices, giving rise to the Realized Volatility
estimator.
The next step in adding complexity to the process of measuring volatility is
correcting for jumps and noise. To do that, we first compared the
performance of the Realized Variance estimator, which is inconsistent under noise,
with the performance of estimators that are consistent under noise, i.e., the TSRV
and Realized Kernel estimators. As expected, we found that when noise is present
TSRV and Realized Kernel outperform Realized Variance, especially for short
sampling intervals.
The same procedure was applied in the presence of jumps. We
compared Realized Variance (inconsistent in the presence of jumps) with the
Threshold Realized Variance estimators, which are consistent under jumps. In this
case, and consistent with the literature, we found that the threshold estimators
outperform the other estimators. Our results support the hypothesis of Corsi et al.
that price jumps do have an impact on future volatility [Ren10].
Lastly, we tested again the estimators using data from S&P 100. Excluding
TSRV, that fails to give plausible outcomes, the results are consistent with the
ones obtained using simulated prices. Realized Kernel performs better than
Realized Variance estimator for sampling frequencies smaller than 5 seconds.
Therefore, we conclude that market microstructure noise emerges for sampling
frequencies smaller than 5 seconds.
Minimum values for the ex-post portfolio’s variance are found for sampling
frequencies equal to 30 seconds for RV, TRV1 and TRV2 estimators. Although
TRV1 outperforms RV we cannot conclude that jumps do exist in intraday data,
since both specifications behave very similarly.
To get more insights into the nature of jumps, threshold versions of TSRV and
Realized Kernel should be implemented to assess their performance.
References
[AS05] Lan Zhang, Per A. Mykland, Yacine Ait-Sahalia. A tale of two time
scales: Determining integrated volatility with noisy high-frequency
data. Journal of the American Statistical Association, 101 (472):1394–
1411, 2005.
[Col06] Robert Engle,Ricardo Colaccito. Testing and valuing dynamic correla-
tions for asset allocation. Journal of Business and Economic Statistics,
24(2), 2006.
[Ebe01] Torben G. Andersen, Tim Bollerslev, Francis X. Diebold, Heiko Ebens.
The distribution of realized stock return volatility. Joural of Financial
Econometrics, 61:43–76, 2001.
[Eng82] Robert Engle. Autoregressive conditional heteroskedasticity with es-
timates of the variance of u.k. inflation. Econometrica, 50:987–1007,
1982.
[Hes93] Steven L. Heston. A closed-form solution for options with stochastic
volatility with applications to bond and currency options. The Review
of Financial Studies, 6(2):327–343, 1993.
[LS09] O.E. Barndoff-Nielsen,P. Reinhard Hansen, A. Lunde and N. Shephard.
Realized kernels in practice: traded and quotes. The Econometrics
Journal, 2009.
[Lun06] Peter R. Hansen, Asger Lunde. Realized variance and market mi-
crostructure noise. Journal of Business and Economic Statistics,
24(2):127–161, 2006.
[Mik96] Tim Bollerslev, Hans O. Mikkelsen. Modeling and pricing long memory
in stock market volatility. Journal of Econometrics, 73:151–184, 1996.
[Par80] M. Parkinson. The extreme value method for estimating the variance
of the rate of return. Journal of Business, 53:61–65, 1980.
[Pat11] Andrew J. Patton. Data-based ranking of realised volatility estimators.
Journal of Econometrics, 161(2):284–303, 2011.
[Ren10] Fulvio Corsi, Davide Pirino, Roberto Renò. Threshold bipower variation
and the impact of jumps on volatility forecasting. Journal of Econometrics,
159, 2010.
[She08] Ole Barndorff-Nielsen, Peter R. Hansen, Asger Lunde, Neil Shephard.
Realized kernels to measure ex post variation of equity prices in the
presence of noise. Econometrica, 76(6):1481–1536, 2008.
[SS11] Asger Lunde, Neil Shephard and Kevin Sheppard. Econometric analysis
of vast covariance matrices using composite realized kernels. Working
paper, 2011.
[Yoo94] L.C.G. Rogers, S. Satchell, Y. Yoon. Estimating the volatility of stock
prices: a comparison of methods that use high and low prices. Applied
Financial Economics, 4:241–247, 1994.
[Yu09] Yacine Aït-Sahalia, Jialin Yu. High frequency market microstructure
noise estimates and liquidity measures. Annals of Applied Statistics,
3(1):422–457, 2009.
[Zha00] Dennis Yang, Qiang Zhang. Drift-independent volatility estimation
based on high, low, open, and close prices. Journal of Business,
73(3):477–491, 2000.
TICKER COMPANY NAME                  GICS SECTOR NAME            STATUS
MMM    3M CO                         Industrials
ABT    ABBOTT LABORATORIES           Health Care
ACN    ACCENTURE PLC-CL A            Information Technology
ALL    ALLSTATE CORP                 Financials
MO     ALTRIA GROUP INC              Consumer Staples
AMZN   AMAZON.COM INC                Consumer Discretionary
AXP    AMERICAN EXPRESS CO           Financials
AIG    AMERICAN INTERNATIONAL GROUP  Financials
AMGN   AMGEN INC                     Health Care
APC    ANADARKO PETROLEUM CORP       Energy
APA    APACHE CORP                   Energy
AAPL   APPLE INC                     Information Technology
T      AT&T INC                      Telecommunication Services
BAC    BANK OF AMERICA CORP          Financials
BK     BANK OF NEW YORK MELLON CORP  Financials
BAX    BAXTER INTERNATIONAL INC      Health Care
BIIB   BIOGEN IDEC INC               Health Care
BA     BOEING CO/THE                 Industrials
BMY    BRISTOL-MYERS SQUIBB CO       Health Care
COF    CAPITAL ONE FINANCIAL CORP    Financials
CAT    CATERPILLAR INC               Industrials
CVX    CHEVRON CORP                  Energy
CSCO   CISCO SYSTEMS INC             Information Technology
C      CITIGROUP INC                 Financials                  dropped
KO     COCA-COLA CO/THE              Consumer Staples
CL     COLGATE-PALMOLIVE CO          Consumer Staples
CMCSA  COMCAST CORP-CLASS A          Consumer Discretionary      dropped
COP    CONOCOPHILLIPS                Energy
COST   COSTCO WHOLESALE CORP         Consumer Staples
CVS    CVS CAREMARK CORP             Consumer Staples
DVN    DEVON ENERGY CORPORATION      Energy
DOW    DOW CHEMICAL CO/THE           Materials
DD     DU PONT (E.I.) DE NEMOURS     Materials
EBAY   EBAY INC                      Information Technology
EMC    EMC CORP/MA                   Information Technology
EMR    EMERSON ELECTRIC CO           Industrials
EXC    EXELON CORP                   Utilities
XOM    EXXON MOBIL CORP              Energy
FDX    FEDEX CORP                    Industrials
F      FORD MOTOR CO                 Consumer Discretionary
FCX    FREEPORT-MCMORAN COPPER       Materials
GD     GENERAL DYNAMICS CORP         Industrials
GE     GENERAL ELECTRIC CO           Industrials                 dropped
GILD   GILEAD SCIENCES INC           Health Care
GS     GOLDMAN SACHS GROUP INC       Financials
GOOG   GOOGLE INC-CL C               Information Technology
HAL    HALLIBURTON CO                Energy
HPQ    HEWLETT-PACKARD CO            Information Technology
HD     HOME DEPOT INC                Consumer Discretionary
HON    HONEYWELL INTERNATIONAL INC   Industrials
INTC   INTEL CORP                    Information Technology
IBM    INTL BUSINESS MACHINES CORP   Information Technology
JNJ    JOHNSON & JOHNSON             Health Care
JPM    JPMORGAN CHASE & CO           Financials
LLY    ELI LILLY & CO                Health Care
LMT    LOCKHEED MARTIN CORP          Industrials
LOW    LOWE’S COS INC                Consumer Discretionary
MA     MASTERCARD INC-CLASS A        Information Technology
MCD    MCDONALD’S CORP               Consumer Discretionary
MDT    MEDTRONIC INC                 Health Care
MRK    MERCK & CO. INC.              Health Care
MET    METLIFE INC                   Financials
MSFT   MICROSOFT CORP                Information Technology
MON    MONSANTO CO                   Materials
MS     MORGAN STANLEY                Financials
NOV    NATIONAL OILWELL VARCO INC    Energy
NKE    NIKE INC -CL B                Consumer Discretionary
NSC    NORFOLK SOUTHERN CORP         Industrials
OXY    OCCIDENTAL PETROLEUM CORP     Energy
ORCL   ORACLE CORP                   Information Technology
PEP    PEPSICO INC                   Consumer Staples
PFE    PFIZER INC                    Health Care
PM     PHILIP MORRIS INTERNATIONAL   Consumer Staples
PG     PROCTER & GAMBLE CO/THE       Consumer Staples
QCOM   QUALCOMM INC                  Information Technology
RTN    RAYTHEON COMPANY              Industrials
SLB    SCHLUMBERGER LTD              Energy
SPG    SIMON PROPERTY GROUP INC      Financials
SO     SOUTHERN CO/THE               Utilities
SBUX   STARBUCKS CORP                Consumer Discretionary
TGT    TARGET CORP                   Consumer Discretionary
TXN    TEXAS INSTRUMENTS INC         Information Technology
TWX    TIME WARNER INC               Consumer Discretionary
USB    US BANCORP                    Financials
UNP    UNION PACIFIC CORP            Industrials
UNH    UNITEDHEALTH GROUP INC        Health Care
UPS    UNITED PARCEL SERVICE-CL B    Industrials
UTX    UNITED TECHNOLOGIES CORP      Industrials
VZ     VERIZON COMMUNICATIONS INC    Telecommunication Services
V      VISA INC-CLASS A SHARES       Information Technology
WMT    WAL-MART STORES INC           Consumer Staples
WAG    WALGREEN CO                   Consumer Staples
DIS    WALT DISNEY CO/THE            Consumer Discretionary
WFC    WELLS FARGO & CO              Financials
Table 2: S&P 100 index constituents for October 27th 2010