INVITATION
To attend the public defense of the PhD thesis entitled
Essays in Behavioral Finance
Biases in Investment Decisions and Their Impact Across Asset Classes
by Luiz Fernando Fortes Félix
Monday October 1st, 2018
At 13.45 hours
Vrije Universiteit Amsterdam
The defense will be followed by a reception
Paranymphs
Klaas Reedijk
Rob van den Goorbergh
Essays in Behavioral Finance
Biases in Investment Decisions and Their Impact Across Asset Classes
Luiz Fernando Fortes Félix
reading committee:
prof.dr. R. Calcagno
prof.dr. M. van Dijk
prof.dr. U. von Lilienfeld-Toal
prof.dr. P. Verwijmeren
prof.dr. R. Zwinkels
ISBN 978-94-6332-385-7
© Luiz Fernando Fortes Felix
Contact: [email protected]
Cover design: Loes Kema
Printed by GVO drukkers & vormgevers B.V., Ede
School of Business and Economics
Vrije Universiteit Amsterdam
De Boelelaan 1105
1081HV Amsterdam
The Netherlands
VRIJE UNIVERSITEIT
ESSAYS IN BEHAVIORAL FINANCE
BIASES IN INVESTMENT DECISIONS AND THEIR IMPACT ACROSS ASSET CLASSES
ACADEMIC DISSERTATION
to obtain the degree of Doctor of Philosophy
at the Vrije Universiteit Amsterdam,
by authority of the rector magnificus
prof.dr. V. Subramaniam,
to be defended in public
before the doctoral committee
of the School of Business and Economics
on Monday 1 October 2018 at 13.45 hours
in the auditorium of the university,
De Boelelaan 1105
by
Luiz Fernando Fortes Felix
born in Belo Horizonte, Brazil
supervisor: prof.dr. P.A. Stork
co-supervisor: prof.dr. R. Kraussl
This thesis is dedicated to
Clarissa,
for daily supporting me to achieve and for joining my dreams,
and to my parents,
Sandra and Jose Luiz (†),
for empowering me to dream high.
Foreword and Acknowledgements
Having worked as a professional investor for a number of years, I can say that
acknowledging the influence of behavioral biases on investment decision making
is crucial for a sound investment process and a career in financial markets. In
this thesis, I focus on the overweighting of tail events, which is the most emblematic
feature of the Cumulative Prospect Theory. Observing and personally experiencing how
overweighting of tail events can lead to sub-optimal decisions was the main reason
why I decided to formally investigate it through a PhD.
This thesis is a joint product of me, my supervisor Philip Stork, and my co-supervisor
Roman Kraussl. I have no words to thank you both for the amount of hours and
thought dedicated to our project. I am very grateful for your critical approach, each in
your own unique style: the meticulous challenger and the devil’s advocate. Dear
Philip, I am pleased that after this journey together, I also consider you a mentor
and a good friend. I am delighted to have met your beautiful family on different
occasions, from casual visits to your house to a coincidental family holiday on the Côte
d’Azur. I will miss our “business” lunches at Symphony and hope we can replace this
ritual with a new one. Dear Roman, despite the distance and less frequent meetings,
our collaboration was much appreciated and joyful, often marked by wise comments
in prompt email replies past midnight, invariably ending with “best wishes”.
Many were the seminars and conferences where I presented my work: from the APG
Quant Roundtable and VU Brown Bag seminars to renowned conferences, such as
EEA-ESEM and EFA. I thank seminar participants for their useful comments, as
they truly helped improve this thesis and widen my perspective on my own
subject. More specifically, I would like to thank the following colleagues and
debaters, whose help and comments undoubtedly influenced this thesis: Andre Lucas,
Albert Menkveld, Arjen Siegmann, Ton Vorst and Remco Zwinkels at VU
University; Thijs Aaten, Louis Chaillet, Gillis Danielsen, Pieter van Foreest, Rob van den
Goorbergh, Jaroslav Krystul, Pim Lausberg, Rajiv Mallick, Koen Marree, Jan Mark
van Mill, Sunil Patil, Martin Prins, Klaas Reedijk, Ashutosh Shahi, Frank Smudde,
Olaf van Veen, Kevin Wees, Hans van Westrienen, Peter Wijn, Ruben Winnink and
Tim Zwinkels at APG Asset Management; Steven Desmyter, Yoav Git, Otto van
Hemert, Sandy Rattray, Graham Robertson, Matthew Sargaison, Markus Schanta,
Lionel Viaccoz and Tim Wong at MAN-AHL; Andy Moniz and Caio Natividade at
Deutsche Bank; Christian Ruprecht and Iskandar Vanblarcum at Barclays; Yang-Ho
Park and Emilio Osambela at the Federal Reserve Bank; Rui Almeida at Maastricht
University; Alessandro Beber at BlackRock; Roy Hoevenaars at Capstone Advisors
and Evert Vrugt (ex-APG).
Thank you to the members of my reading committee: prof.dr. R. Calcagno, prof.dr.
M. van Dijk, prof.dr. U. von Lilienfeld-Toal, prof.dr. P. Verwijmeren and prof.dr.
R. Zwinkels. I very much enjoyed reading, and benefited from, your comments on my
thesis. Thanks also to Norman Seeger for joining my PhD Assessment Committee.
I am also grateful for the support that my managers at APG Asset Management,
Peter Wijn and Klaas Reedijk, gave to this thesis. Without your backing, I would
not have managed to complete this PhD alongside working. Thank you to my other
(ex-)colleagues at Asset Allocation & Overlay, Mark van Aartsen, Jelle Jansen, Jos
Kalb, Ed Swiderski and Alex Tiebout, for encouraging me and backing me up during
conference days. At APG, I would also like to thank Rob van den Goorbergh, Peter
Strikwerda and Job Kooij for making my days, respectively, more quant (around a
roundtable), innovative and sporty.
A special thanks to my paranymphs Rob van den Goorbergh and Klaas Reedijk,
who have closely accompanied my PhD journey and really helped me on the final
details of my thesis and defense.
Thanks also to many friends in Amsterdam, Adriaan, Alan, Claudinha, Gisa, Guigo,
Hermine, Inge, Laís, Leroy, Margriet, Matheus, Virgilio among others, and in our
Randwijk community, who have not abandoned me but rather supported me despite
many declined invitations to meet them.
A huge thanks goes to my mom Sandra, whose immeasurable love has always
supported my endeavours, even though they meant being far from home. To my
deceased father, Jose Luiz, thanks a lot for teaching me to be a hard worker. I
would not have finished this thesis if it were not for your example. Many thanks
to my brother Claudio and my sister Luciana, whose encouragement and help made my
PhD path more meaningful, happy and pythonic. Thanks to my extended family
Jairo, Tokie, Sonia, Andre, Renato, Andre Reis, Tia Fatima, Tia Ana, Tio Jose
Lino, Neocles and Leila, among others, for their appreciation and encouragement.
Now, six years after I started my PhD, life is quite different from when I started
it. I am now the father of two amazing creatures: Thomas and Bernardo. I thank you
for your understanding on weekends and for going to bed early on weekdays. Nearly
nothing in life gives me more energy than spending time with these two little boys
and my wife. They are my wonders and I am happy to have more time for them
once my nights and weekends are no longer consumed by my “six-year-old baby”.
In the meantime, some beloved family members have left this world. Vovo Ge,
Vovo Xico and Ana Elizabeth, you will always be in my heart.
Last but not least, I would like to thank my better half, Clarissa Bonifacio. No one
inspired, motivated and supported me to progress in these six years like you did.
Luiz Fernando Fortes Felix
Amsterdam, August 2018
Contents
1 Introduction
1.1 An artificial intelligence introduction to this thesis
1.2 My introduction to this thesis
1.3 Cumulative Prospect Theory
1.4 Outline
2 The 2011 European Short Sale Ban: A Cure or a Curse?
2.1 Introduction
2.2 Data and methodology
2.3 Discussion of results
2.3.1 VaR levels and volatility skews
2.3.2 Financial contagion risk
2.3.3 Panel regression analysis
2.3.4 Robustness Tests
2.4 Conclusion
2.A Appendix: Implied jump risk estimation
2.A.1 Implied jump risk from risk-neutral distributions
2.A.2 The Figlewski (2010) approach for extracting RND from implied volatilities
2.A.3 The modified Figlewski (2010) approach
2.B Appendix: Extreme value theory
3 Single stock call options as lottery tickets: overpricing and investor sentiment
3.1 Introduction
3.2 Data and Methodology
3.2.1 Subjective density functions
3.2.2 Estimating CPT parameters
3.2.3 Density function tails’ consistency test
3.2.4 Estimating RND and EDF
3.3 Empirical analysis and results
3.3.1 Estimated CPT long-term parameters
3.3.2 Density functions tails’ consistency test results
3.3.3 Estimated CPT time-varying parameters
3.3.4 Time variation in probability weighting parameter and investors’ sentiment
3.4 Robustness tests
3.4.1 Kupiec’s test for tail comparison
3.4.2 Prelec’s weighting function parameter
3.4.3 Estimating time-varying γ under different assumptions for δ, α and β
3.4.4 Overweight of (right) tails driven by IV of single stock options
3.5 Conclusion
3.A Appendix: Risk-neutral densities and implied volatility analytics
3.A.1 Subject density function estimation
3.A.2 Single stock weighted average implied volatilities
3.B Appendix: Machine learning methods
3.B.1 Least Absolute Shrinkage and Selection Operator (Lasso)
3.B.2 k-Nearest-Neighbor classifier
3.C Appendix: Welch and Goyal (2008) equity market predictors
4 Implied Volatility Sentiment: A Tale of Two Tails
4.1 Introduction
4.2 Data and Methodology
4.3 Overweight of tails: dynamics and dependencies
4.3.1 Time-varying CPT parameters
4.3.2 Overweight of tails and sentiment
4.3.3 Overweight of tails, IV skews and higher moments of the RND
4.4 Predicting with overweight of tails
4.4.1 Predicting returns with DGspread and IV-sentiment
4.4.2 IV-sentiment pair trading strategy
4.4.3 Out-of-sample equity returns predictive tests
4.4.3.1 Univariate models and forecast combination
4.4.3.2 “Kitchen sink” and machine learning-based models
4.4.4 IV-sentiment and equity factors
4.4.5 Behavioral versus risk-sharing perspectives
4.5 Conclusion
5 Predictable Biases in Macroeconomic Forecasts and Their Impact Across Asset Classes
5.1 Introduction
5.2 Forecast biases, anchoring and rationality tests
5.3 Data and Methodology
5.3.1 Economic surprise predictive models
5.3.2 Market response predictive models
5.4 Empirical analysis and results
5.4.1 Predicting economic surprises
5.4.2 Market responses around macroeconomic announcements
5.4.3 Predicting market responses
5.4.4 Market responses, skewness of economic forecasts and regret
5.4.5 Robustness tests
5.4.5.1 Economic surprise models across regions
5.4.5.2 Expected and unexpected surprises
5.5 Conclusion
5.A Appendix: Machine learning methods
5.A.1 Principal component analysis
5.A.2 Ridge regression
5.A.3 Random Forest
5.A.4 Random forest variable Importance measure
6 Conclusion
Bibliography
Summary
Samenvatting (summary in Dutch)
Short biography
Publications
Conferences presentations
List of Figures
1.1 LDA topic modelling applied to this thesis
1.2 QR-code for interactive LDA output visualization
1.3 Impact of changes in λ, γ and δ in the CPT model
2.1 Short positions in stocks around ban date
2.2 Averaged implied volatility skews for banned and non-banned stocks
2.3 Sovereign CDS spreads, V2X and implied volatility skews around the ban date
2.4 Implied volatility skews and IV spread around the ban date
2.5 RND extraction using different methods
3.1 Cumulative density functions
3.2 Time varying gamma parameter in CPT
3.3 k-Nearest-Neighbors for IV skews
4.1 Information ratios for daily IV-based strategies
4.2 Information ratios for long- and short-leg of IV-based strategies
4.3 Information ratios, skewness and horizon for monthly IV-based strategies
4.4 Cumulative Sum of Squared Error Differences of single factor predictive regressions
4.5 Cumulative Sum of Squared Error Differences of combined predictive regressions
4.6 Correlation matrix between IV-sentiment factor and cross-sectional equity factors
5.1 Cumulative average returns (CAR) around the macroeconomic announcements
5.2 Importance measure from Random forest model
Chapter 1
Introduction
1.1 An artificial intelligence introduction to this thesis
I start this thesis where my mind is: artificial intelligence, more specifically natural language
processing (NLP), the main topic of my current research agenda at APG, together with deep
learning. NLP is the field of computer science concerned with the understanding of human
(natural) language. Given the power of NLP and, in particular, of topic modelling for dimensionality
reduction of text, I could not resist offering you this entrée to my thesis. This
choice does not come out of context, as I apply a number of other artificial intelligence methods
across this thesis, more specifically machine learning algorithms. Notwithstanding the concern
that by now I may have distracted the reader enough, I note that my own, high-dimensional
and detailed introduction is provided in the following section. Thus, if you have a taste for
technology and a natural interest in research, you will enjoy reading both introductions. If
time is your main concern, you should skip section 1.2. If precision is of singular interest and/or
you despise technology, you should jump straight to section 1.2.
An NLP Latent Dirichlet Allocation (LDA) topic modelling technique (see Blei et al., 2003)¹
applied to this thesis suggests it can be broken down into the following six topics with the
respective weights²:
1. Forecast of macroeconomic indicator using biases, 20.6%.
2. Investor sentiment using out-of-the-money (OTM) options, 14.1%.
3. Ban on European financial stocks and implied volatility (IV) skews, 12.2%.
4. Equity, sentiment, momentum, cross-sectional and surprise factor strategies, 14.0%.
5. Cumulative Prospect Theory (CPT) and overweight of small probabilities, 25.5%.
6. S&P500 index implied volatility (IV) across moneyness, 13.6%.
¹ LDA is a widely adopted NLP technique for topic modelling.
² Topic names are not automated but constructed from the top-10 most relevant terms for each topic.
Figure 1.1: LDA topic modelling applied to this thesis. This figure presents the visualization output of a Latent Dirichlet Allocation (LDA) method for topic modelling applied to this thesis’ text. On the left the global topic view is shown. On the right (with topic 5, the topic with highest frequency, selected) the term bar chart is shown. The parameter λ to set the relevance metric equals 0.6 in this figure.
Figure 1.2: QR-code for interactive LDA output visualization. This QR-code is linked to the LDAvis output of the Latent Dirichlet Allocation (LDA) model provided in Figure 1.1 above. On the left the global topic view is shown. Select topics by clicking on their corresponding bubble. Once a topic is selected, its most relevant terms (for the set value of λ) are displayed on the right side. On the right the term bar chart is shown. Red bars indicate term frequency within the selected topic. Blue bars indicate overall term frequency across the corpus. See Sievert and Shirley (2014) for further details.
Using a multidimensional scaling technique, we report the inter-topic distance chart, shown
on the left side of Figure 1.1, produced using a Python implementation of the LDAvis method of
Sievert and Shirley (2014). The chart suggests that this thesis contains a mix of topics that
are occasionally correlated (topics 4 and 1 and topics 5 and 6) but mostly quite distinct from
each other. Topic 3 seems the most distant topic from the other five topics, whereas topics 2
and 4 are the most common to all topics. Bubble sizes are proportional to the topic frequencies
reported above.
On the right side of Figure 1.1, the term bar chart is displayed. It shows the most relevant
terms for the different topics (with topic 5 selected) given our parameterization for the relevance
metric (λ) equal to 0.6³. For an interactive version of the LDA visualization output of this thesis,
please scan the QR-code provided in Figure 1.2⁴.
³ We use λ=0.6, as this is the optimal level of this parameter according to Sievert and Shirley (2014).
⁴ Alternatively, go to the following url: https://cdn.rawgit.com/luizfelix/PhD/b9121543/PhD_Luiz_ldavis_6.html#topic=0&lambda=0.6&term=
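The relevance metric behind the term bar chart can be sketched in a few lines. The formula below is the one proposed by Sievert and Shirley (2014): a weighted average of a term's log probability within a topic and its log lift over the whole corpus. The terms and probabilities are hypothetical toy values chosen for illustration only, not output of the thesis' LDA model.

```python
import math


def relevance(prob_in_topic, prob_in_corpus, lam=0.6):
    """Relevance of a term to a topic (Sievert and Shirley, 2014).

    lam = 1 ranks terms by their probability within the topic;
    lam = 0 ranks terms by their lift (topic probability / corpus probability).
    """
    lift = prob_in_topic / prob_in_corpus
    return lam * math.log(prob_in_topic) + (1.0 - lam) * math.log(lift)


# Hypothetical toy probabilities, for illustration only.
topic_probs = {"probability": 0.050, "tail": 0.040, "the": 0.060}
corpus_probs = {"probability": 0.010, "tail": 0.005, "the": 0.070}


def rank_terms(lam):
    """Return the terms sorted from most to least relevant for a given lam."""
    return sorted(
        topic_probs,
        key=lambda term: relevance(topic_probs[term], corpus_probs[term], lam),
        reverse=True,
    )


ranked_default = rank_terms(0.6)  # lam used in Figure 1.1
ranked_by_prob = rank_terms(1.0)  # pure within-topic probability
```

With λ=0.6, a frequent but uninformative term such as "the" drops to the bottom of the ranking, whereas λ=1 would place it first.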
The remainder of Chapter 1 is split into my introduction to this thesis, an introduction to
the Cumulative Prospect Theory (CPT) and the outline of this thesis.
1.2 My introduction to this thesis
This thesis is about behavioral finance, the sub-field of behavioral economics that studies the
impact of psychological and cognitive biases on financial decision making. The main hypothesis
of behavioral finance is that people systematically make irrational decisions when
outcomes are unknown. Behavioral finance is a breakthrough because it managed to challenge
classical economic and financial theories, which are both built on the assumption that
individuals are fundamentally rational, as implied by the expected utility theory. The proponents of
behavioral finance used lab experiments to show that individuals making decisions under
uncertainty violate the axioms of the expected utility theory. As such, behavioral finance models
were designed in a stylized form, disconnected from financial markets. Thus, this thesis adds
to the growing literature that attempts to validate the hypotheses made by behavioral finance
in real financial markets. Beyond that, by using forward-looking information sources, it tests
these hypotheses more directly than the literature has done.
More formally, the Cumulative Prospect Theory (CPT)⁵ introduced by Kahneman and Tversky
(1979)⁶ and Tversky and Kahneman (1992), the single main milestone of behavioral finance,
has paved the way for challenging the classical financial theory. Since their contribution, many
asset pricing puzzles and market inefficiencies have received behavioral interpretations. Most of
these interpretations link ex-post empirical observations (e.g., autocorrelation of stock returns
over a six-month horizon, see Jegadeesh and Titman (1993)) to stylized effects of behavioral
models (e.g., underreaction and overreaction, Barberis et al. (1998); Daniel et al. (1998); Hong
and Stein (1999)) and related behavioral biases (e.g., overconfidence and self-attribution, for
the case of Daniel et al. (1998)). Fewer are the papers using ex-ante information to directly
recognize the presence of behavioral biases in investment decision making⁷. Thus, the
overarching theme in this thesis is the link between behavioral finance, investments and ex-ante
information sources, which are either new or evaluated from a novel perspective.
Particularly, CPT’s overweighting of small probabilities is hypothesized to explain a series
of puzzles in asset pricing (see Barberis and Huang, 2008) but it lacks empirical validation.
Thus, most chapters of this thesis investigate market inefficiencies which we hypothesize to be
explained by the CPT probability weighting function. Using ex-ante information, we find it
to play a role in explaining some inefficient behaviors of market makers, retail investors and
institutional investors, giving rise to economically significant investment insights.
⁵ An overview of the Cumulative Prospect Theory (CPT) is provided in section 1.3.
⁶ Kahneman and Tversky (1979) introduced the Prospect Theory, which was later refined into the CPT.
⁷ Note that ex-post and ex-ante information are both effective for testing the presence of behavioral biases once historical data is available. The deemed benefit of using ex-ante data is that it allows for an instantaneous observation and test of biases. Nevertheless, given its anticipatory nature, the availability of ex-ante data is much more limited.
Unlike most behavioral finance research, a large part of this thesis attempts to
empirically validate its hypotheses using equity options data. This was a deliberate choice.
Option markets constitute a rich source of information because they efficiently provide
market-based estimates of investors’ preferences (Dennis and Mayhew, 2002; Barberis and Huang, 2008;
Dierkes, 2009; Chang et al., 2013) or expectations (Bates, 1991; Rubinstein, 1994; Jackwerth and
Rubinstein, 1996) or both (Kliger and Levy, 2009; Polkovnichenko and Zhao, 2013; Barberis,
2013). Beyond that, because of the availability of a cross-sectional structure of options (across
moneyness) and maturities, risk-neutral probability density functions for multiple horizons can
be estimated. Employing risk-neutral densities (RND) has several advantages. First, it provides
a true ex-ante probability distribution of market expectations. Second, it captures patterns in
the probability density which cannot be replicated by smoothing or parametric methods.
Third, it does not require an extremely long data history to estimate assets’ steady-state return
distributions and their conditional counterparts. And, lastly, it allows us to directly investigate
potential time-variation in distributions.
Note that, if the number of papers utilizing ex-ante information to test for behavioral biases
is small, studies linking behavioral insights to ex-ante density functions are even more scarce.
The ones capable of that are typically restricted to the usage of RND from the option markets.
These studies, however, solely focus on the index option market, as in Polkovnichenko and Zhao
(2013) and Dierkes (2009). To the best of our knowledge, we are the first to use RND from
single stock option markets, let alone, the first to combine ex-ante information from both index
option and single stock option markets.
Broadly speaking, we implement the Figlewski (2010) approach for RND estimation, which
is preferred to earlier methodologies because it is able to extrapolate fat tails beyond the
density’s body. For instance, a clear advantage of this method versus the one used by Bliss
and Panigirtzoglou (2004) is that the tails of the distribution of prices no longer resemble
Black-Scholes’ log-normal tails, whose constant volatility across moneyness is largely inconsistent with
empirical observations. Particularly, in this thesis we extend the Figlewski (2010) approach by
making the tails’ anchor points (the points where tails are attached to the distribution body) dynamic
rather than fixed, making it more flexible. Our addition to the Figlewski (2010) method largely
improves the quality of fitted Generalized Extreme Value tails in our data set.
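As a rough illustration of the step that underlies RND extraction from option prices, the sketch below applies the standard Breeden-Litzenberger relation, in which the risk-neutral density equals the second derivative of the call price with respect to the strike, compounded at the risk-free rate. It is not the procedure of this thesis: the Figlewski (2010) approach starts from smoothed implied volatilities and appends Generalized Extreme Value tails, whereas here the call prices come from a flat-volatility Black-Scholes formula and all inputs (spot, rate, volatility, maturity, strike grid) are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call (used only to generate toy prices)."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def rnd_from_calls(strikes, calls, rate, maturity):
    """Risk-neutral density on the interior strikes of an evenly spaced grid.

    Uses a central second difference of call prices across strikes,
    multiplied by exp(rate * maturity).
    """
    step = strikes[1] - strikes[0]
    growth = math.exp(rate * maturity)
    density = [
        growth * (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / step ** 2
        for i in range(1, len(strikes) - 1)
    ]
    return strikes[1:-1], density


# Hypothetical inputs, for illustration only.
spot, rate, vol, maturity = 100.0, 0.02, 0.20, 0.25
step = 0.5
strikes = [40.0 + step * i for i in range(321)]  # strikes from 40 to 200
calls = [bs_call(spot, k, rate, vol, maturity) for k in strikes]

grid, density = rnd_from_calls(strikes, calls, rate, maturity)
total_mass = sum(density) * step  # should be close to 1
implied_forward = sum(k * f for k, f in zip(grid, density)) * step
```

Under flat volatility the recovered density is log-normal, which is exactly the tail shape that the GEV extension is meant to avoid; with market prices the same second difference would be taken on a smoothed call price curve.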
To investigate how well the CPT model can explain option pricing, RND must be
transformed into subjective density functions. This is done by embedding a risk-reward trade-off
(i.e., a utility function) in RNDs through the use of the pricing kernel. Yet another way of
investigating how well the CPT matches option prices is to estimate the CPT’s γ and δ
probability distortion parameters required to make subjective density functions match RND. We
could not have implemented either of these approaches if it were not for the direction given
by Bliss and Panigirtzoglou (2004). Specifically, for utility functions with probability weighting,
such as the CPT and Prelec’s rank-dependent expected utility (RDEU), Polkovnichenko and
Zhao (2013) and Dierkes (2009) provided important guidance.
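To make the probability distortion concrete, the sketch below implements the weighting function of Tversky and Kahneman (1992), w(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ), and the resulting cumulative decision weights for a prospect with gains only. The value γ=0.61 is their estimate for gains (they report 0.69 for the loss-side parameter δ); the three-outcome prospect is a hypothetical example, and the estimation carried out in the chapters of this thesis is not reproduced here.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def decision_weights_gains(probs, gamma):
    """Cumulative decision weights for gains ordered from worst to best.

    Each weight equals w(probability of an outcome at least this good)
    minus w(probability of a strictly better outcome).
    """
    weights = [0.0] * len(probs)
    tail = 0.0  # probability of strictly better outcomes
    for i in range(len(probs) - 1, -1, -1):
        weight_of_better = tk_weight(tail, gamma)
        tail += probs[i]
        weights[i] = tk_weight(tail, gamma) - weight_of_better
    return weights


# Hypothetical prospect: small gain, medium gain, rare large gain.
probs = [0.90, 0.09, 0.01]
weights = decision_weights_gains(probs, gamma=0.61)
small_prob_weight = tk_weight(0.01, 0.61)  # larger than 0.01: overweighting
```

With γ below one the rare best outcome receives a decision weight several times its objective probability, which is the overweighting of small probabilities discussed throughout this thesis; γ=1 recovers the objective probabilities.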
We delved deep into the extraction of RND and into the estimation of CPT parameters from
options data; however, our conclusions are mostly dedicated to the underlying stock market
and equity sentiment rather than to option pricing itself⁸. As most applied behavioral finance
studies focus on the cross section of stocks (see Carhart, 1997; Barber and Odean, 2008; Cen
et al., 2013), the number of studies linking behavioral biases to asset class level and cross-asset
class investment strategies is substantially smaller and mostly concentrates on momentum (see
Benartzi and Thaler, 1995a; Asness et al., 2013; Moskowitz et al., 2012). At the same time,
there is a lack of studies that investigate CPT’s overweighting of tail events in connection with
cross-asset investing and asset allocation. And, as asset allocation is the primary determinant
of a portfolio’s return variability (see Brinson et al., 1986; Ibbotson and Kaplan, 2000), this
thesis not only fills a gap in the literature, but is also relevant to regulators, investors and other
market participants.
This thesis is important to regulators because it provides a unique perspective on
short-sale bans, implied jump risk and market failure, which are closely connected to contagion risk
and to behavioral inefficiencies. As equity market reversals and momentum crashes are also
linked to behavioral biases, acknowledging their effects on market prices should also be useful
for market prudential policy. This thesis matters to market participants because, as financial
decision makers, they are under the constant influence of behavioral biases, which, on the one
hand, might be the biggest challenge to their professional activity and, on the other hand,
create investment opportunities. Thus, understanding how specific behavioral biases impact
financial markets in aggregate and may impact our own behavior is important, if not critical,
in investments.
Our analyses mostly concentrate on tail measures and higher moments, both extracted
from RND: tail shape estimator (Hill, 1975), extreme downside risk (Hartmann et al., 2004),
conditional co-crash probabilities (Hartmann et al., 2004; Balla et al., 2014), jump risk (Yan,
2011), expected shortfall (Danielsson et al., 2006), risk-neutral skewness and kurtosis (Dennis
and Mayhew, 2002; Conrad et al., 2013) and implied volatility skews (Bates, 2003; Garleanu
et al., 2009; Vilkov and Xiao, 2013). Our starting point is the fact that implied volatility
skews from index put options have changed dramatically since the 1987 crash (see Bates, 1991;
Rubinstein, 1994; Jackwerth and Rubinstein, 1996). Since this event, implied volatility skews
have been traditionally associated with demand for portfolio insurance by institutional investors
and (left) tail fears (see Vilkov and Xiao, 2013), reflecting bearishness. The perspective provided
by the single stock option market is, however, very distinct from the one obtained from the
index option market, which has been the one studied in connection with probability weighting
functions (see Polkovnichenko and Zhao, 2013; Dierkes, 2009). In contrast, single stock options
are mostly traded by individual investors (Bollen and Whaley, 2004; Lakonishok et al., 2007)
to speculate on the upside of equity markets (Lakonishok et al., 2007; Bauer et al., 2009; Choy,
2015), reflecting bullish sentiment. Because we investigate both the index option and the single
stock option markets, we are able to propose a novel implied volatility-based sentiment measure,
jointly extracted from both the index and single stock option markets: IV-sentiment.
⁸ This is supported by the assumption that option and stock prices reflect the same information, in line with Conrad et al. (2013).
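Of the tail measures listed above, the Hill (1975) tail shape estimator is the simplest to write down: it averages the log excesses of the k largest observations over the (k+1)-th largest. The sketch below is a minimal illustration on a hypothetical, deterministic Pareto-type sample with tail index 3; in this thesis the tail measures are extracted from RND, which the sketch does not attempt to reproduce, and the choice of k is an assumption made for illustration.

```python
import math


def hill_estimator(observations, k):
    """Hill (1975) estimate of the tail shape (inverse of the tail index).

    observations: positive values, e.g. losses; k: number of upper order
    statistics used, with 1 <= k < len(observations).
    """
    ordered = sorted(observations, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


# Hypothetical sample: exact Pareto quantiles with tail index alpha = 3,
# so the true tail shape is 1 / alpha.
n, alpha = 2000, 3.0
sample = [((i - 0.5) / n) ** (-1.0 / alpha) for i in range(1, n + 1)]

xi_hat = hill_estimator(sample, k=200)
alpha_hat = 1.0 / xi_hat  # implied tail index
```

A larger estimated tail shape (a smaller tail index) indicates a fatter tail; in practice the estimate is sensitive to k, which is usually inspected over a range of values.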
In addition to the overweight of tail events, we also investigate the influence of other
behavioral biases, such as anchoring (Tversky and Kahneman, 1974), conservatism (Ward, 1982),
overconfidence (Daniel et al., 1998), herding (Scharfstein and Stein, 1990), regret (Loomes and
Sugden, 1982; Bell, 1982), and rational bias (Laster et al., 1999; Ottaviani and Sorensen, 2006)
in surveys of macroeconomic data forecasters, another source of ex-ante information. We
find some of these biases to be pervasive in the forecasting of US economic data releases and
also present in Continental Europe, the United Kingdom and Japan. Under this condition,
economic surprises are predictable. And, as market prices react to the unexpected
information flow, we find that predicting economic surprises gives rise to return predictability, with
implications across four asset classes: equities, bonds, foreign exchange and commodities. Our
results suggest that returns on assets that are sensitive to the fundamentals being revealed by
macro announcements (local equities and bonds) are more predictable around such events than
returns on foreign markets, currencies and commodities.
The curse of dimensionality (a relatively small number of data samples in a high dimensional
feature space) struck a few times in our research. Hence, we found the need to go beyond the
standard linear model to satisfactorily test our hypotheses. The fact that computational power
no longer hinders the application of advanced statistical techniques enables us to apply artificial
intelligence techniques whenever necessary. The application of these algorithms was sometimes
crucial for preventing overfitting, which could have misled us to erroneous conclusions, such as
in Chapter 5. Other times, machine learning methods were instrumental to clarify relations
that were blurred when analyzed through standard methods. In some situations, though, ma-
chine learning approaches were the ones leading to overfitting, such as in Chapter 4. We did
not restrict ourselves to the application of supervised learning methods. Unsupervised learning
techniques were also essential to perform some of our analysis, for instance, for dimensionality
reduction used in Chapter 5. In sum, this thesis suggests that machine learning and artificial
intelligence techniques are not the answer to all financial economics and investment questions but
simply one additional toolbox at the hands of the econometrician and investment researcher.
These techniques, even more than standard linear models, should be understood and applied
with care despite the ease of usage via Python or R. Insights into the artificial intelligence tech-
niques used are provided in the Appendices of the different chapters. For a general explanation
of these methods, see Blei et al. (2003) and Hastie et al. (2008).
1.3 Cumulative Prospect Theory
The prospect theory (PT) of Kahneman and Tversky (1979) incorporates behavioral biases
into the standard utility theory (Von Neumann and Morgenstern, 1947), which presumes that
individuals are rational9. Such behavioral anomalies are i) loss aversion, ii) risk seeking behavior
9The expected utility theory of Von Neumann and Morgenstern (1947) is the standard economics framework on decision making under risk. Their theory assumes that decision-makers behave as if they maximize the
and iii) non-linear preferences10. The CPT is described in terms of a value function (υ) and
a probability distortion function (π). The value function is analogous to the utility function
in the standard utility theory and it is defined relative to a reference point zero. Therefore,
positive values within the value function are considered as gains and negative values are losses,
which leads to:
\[
\upsilon(x) =
\begin{cases}
x^{\alpha}, & \text{if } x \geq 0 \\
-\lambda(-x)^{\beta}, & \text{if } x < 0
\end{cases}
\tag{1.1}
\]
where λ ≥ 1, 0 ≤ β ≤ 1, 0 ≤ α ≤ 1, and x are gains or losses. Thus, along the domain of x, the
CPT’s value function is asymmetrically S-shaped (see Figure 1.3a) with diminishing sensitivity
as x → ±∞.
The value function is, thus, concave over gains and convex over losses, unlike
the traditional utility function used by standard utility theory. Such a shape of the value
function implies diminishing marginal values as gains or losses increase, which means that any
additional unit of gain (loss) becomes less relevant when wealth increases (decreases). As α
and β increase, the effect of diminishing sensitivity decreases, and as λ increases the degree of
loss aversion increases. We also note in Figure 1.3a that the value function has a kink at the
reference point, which implies loss aversion, as the function is steeper for losses than for gains.
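As a minimal illustration of Equation (1.1), the Python sketch below evaluates the value function at the parameter values estimated by Tversky and Kahneman (1992), namely α = β = 0.88 and λ = 2.25. The function name `cpt_value` and the payoffs of ±100 are assumptions made only for this example.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """CPT value function of Eq. (1.1), measured relative to a reference point of zero."""
    if x >= 0:
        return x ** alpha            # concave over gains
    return -lam * (-x) ** beta       # convex and steeper over losses


# Hypothetical payoffs: a gain and a loss of the same size.
gain = cpt_value(100.0)
loss = cpt_value(-100.0)

# With alpha equal to beta, the loss-to-gain ratio is exactly lambda.
loss_aversion_ratio = -loss / gain
print(gain, loss, loss_aversion_ratio)
```

Because α = β under this parameterization, the ratio −υ(−x)/υ(x) equals λ for any x > 0, which is the kink at the reference point described above.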
Figure 1.3: Impact of changes in λ, γ and δ in the CPT model. Plot a in this figure shows the impact of changes in the loss aversion parameter λ on the Cumulative Prospect Theory (CPT) value function v(x). The v(x) is depicted for λ equal to 1, 1.5, 2 and 2.5. Plot b depicts the CPT weighting function w(x) for the probability weighting parameters γ and δ, respectively, for gains and losses, as well as the weighting function of Prelec (1998) for the probability weighting parameter δ. The w(x) is depicted for γ equal to 0.61, δ equal to 0.69 and Prelec δ equal to 0.5.
The use of a probability distortion function or decision weight function is the adjustment
made to the PT to address nonlinear preferences. This function takes probabilities and weights
expected value of some function defined over the potential (probabilistic) outcomes. Individuals are assumed to have stable and rational preferences; i.e., not influenced by the context or framing.
10Loss aversion is the property in which people are more sensitive to losses than gains. For details, see Kahneman and Tversky (1979), Tversky and Kahneman (1992) and Barberis and Huang (2001). Risk-seeking behavior happens when individuals are attracted by gambles with unfair prospects. The risk-seeking individual chooses a gamble over a sure thing even though the two outcomes have the same expected value. Non-linear preferences occur when preferences between risky prospects are not linear in the probabilities, thus, equally probable prospects are more heavily weighted by agents than others. For details, see Tversky and Kahneman (1992), Fox et al. (1996), Wu and Gonzalez (1996), Prelec (1998) and Hsu et al. (2009).
them non-linearly, so that the difference between probabilities at high percentiles, e.g., between
99 percent and 100 percent, has more impact on preferences than the difference between prob-
abilities at small percentiles, e.g., between 10 percent and 11 percent. This is the main advance
of the CPT over the original PT. The CPT applies probability distortions to the cumulative
probabilities (i.e., the CDF), whereas the PT applies them to individual probabilities (i.e.,
the PDF). The enhancement brought by this new formulation satisfies stochastic dominance
conditions not achieved by the PT, which renders the CPT applicable to a wider number of
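experiments. The probability distortion functions suggested by Tversky and Kahneman (1992),
respectively, for gains ($\pi^{+}$) and losses ($\pi^{-}$) are:
\begin{align}
\pi^{+}_{n} &= w^{+}(p_{n}) \tag{1.2a} \\
\pi^{+}_{i} &= w^{+}(p_{i} + \dots + p_{n}) - w^{+}(p_{i+1} + \dots + p_{n}), \quad \text{for } 0 \leq i \leq n-1 \tag{1.2b} \\
\pi^{-}_{-m} &= w^{-}(p_{-m}) \tag{1.2c} \\
\pi^{-}_{i} &= w^{-}(p_{-m} + \dots + p_{i}) - w^{-}(p_{-m} + \dots + p_{i-1}), \quad \text{for } 1-m \leq i \leq 0 \tag{1.2d}
\end{align}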
where p are objective probabilities of outcomes, which are ranked for gains from the reference
point i = 0 to i = n, the largest gain, and for losses from the largest loss i = −m to i = 0, the
reference point. Further, w+ and w−, the parametric form of the decision weighting functions,
are given by:
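\begin{align}
w^{+}(p) &= \frac{p^{\gamma}}{\left(p^{\gamma} + (1-p)^{\gamma}\right)^{1/\gamma}} \tag{1.3a} \\
w^{-}(p) &= \frac{p^{\delta}}{\left(p^{\delta} + (1-p)^{\delta}\right)^{1/\delta}} \tag{1.3b}
\end{align}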
where parameters γ and δ define the curvature of the weighting function for gains and losses,
which leads the probability distortion functions to assume inverse S-shapes. Figure 1.3b depicts
how low probability events are overweighted at the cost of moderate and high probabilities
within the CPT probability distortion functions. Tversky and Kahneman (1992) indicate that
the weighting functions for gains are slightly more curved than for losses (i.e., γ < δ). Values of
γ and δ smaller than one imply overweighting of small probability events (i.e., the
distribution tails), whereas values larger than one imply underweighting of tail probabilities.
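The inverse S-shape can be checked numerically. The sketch below implements the parametric weighting function of Equations (1.3a) and (1.3b) with a single curvature argument; the name `tk_weight` and the probabilities chosen for the check are illustrative assumptions.

```python
def tk_weight(p, curvature):
    """Tversky-Kahneman weighting function, Eqs. (1.3a)-(1.3b).

    curvature plays the role of gamma (gains) or delta (losses).
    """
    p = min(max(p, 0.0), 1.0)  # guard against rounding slightly outside [0, 1]
    numerator = p ** curvature
    denominator = (p ** curvature + (1.0 - p) ** curvature) ** (1.0 / curvature)
    return numerator / denominator


GAMMA = 0.61  # gains, Tversky and Kahneman (1992)

# A 1 percent tail event receives a decision weight well above 1 percent ...
small_tail = tk_weight(0.01, GAMMA)
# ... while a near-certain event is weighted below its objective probability.
near_certain = tk_weight(0.99, GAMMA)

# The step from 99 to 100 percent matters more than the step from 10 to 11 percent.
step_high = tk_weight(1.00, GAMMA) - tk_weight(0.99, GAMMA)
step_low = tk_weight(0.11, GAMMA) - tk_weight(0.10, GAMMA)
print(small_tail, near_certain, step_high, step_low)
```

Setting the curvature to one recovers w(p) = p, that is, no probability distortion.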
Note that the CPT model in its original parameterization gives rise to a distinctive fourfold
pattern of risk attitudes: risk aversion for gains of high probability, risk seeking for losses of
high probability, risk seeking for gains of low probability, and risk aversion for losses of low
probability.
The parameters estimated by Tversky and Kahneman for the CPT model, which are ex-
plored in Chapters 3 and 4 of this thesis, are λ = 2.25; β = 0.88; α = 0.88; γ = 0.61; δ = 0.69.
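To make two corners of the fourfold pattern concrete, the following sketch combines Equations (1.1), (1.2) and (1.3) under the Tversky and Kahneman parameters. The two prospects (a fair lottery ticket and an uninsured small-probability loss) and all function names are hypothetical choices for illustration, not prospects analyzed in this thesis.

```python
GAMMA, DELTA = 0.61, 0.69                # probability weighting, gains and losses
ALPHA, BETA, LAMBDA = 0.88, 0.88, 2.25   # value function parameters


def weight(p, curvature):
    p = min(max(p, 0.0), 1.0)  # guard against rounding slightly outside [0, 1]
    return p ** curvature / (p ** curvature + (1.0 - p) ** curvature) ** (1.0 / curvature)


def value(x):
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** BETA


def decision_weights(outcomes, probs):
    """Decision weights of Eqs. (1.2a)-(1.2d).

    outcomes must be sorted from the largest loss to the largest gain.
    """
    weights = []
    for i, x in enumerate(outcomes):
        if x >= 0:   # gains: differences of weighted decumulative probabilities
            w = weight(sum(probs[i:]), GAMMA) - weight(sum(probs[i + 1:]), GAMMA)
        else:        # losses: differences of weighted cumulative probabilities
            w = weight(sum(probs[:i + 1]), DELTA) - weight(sum(probs[:i]), DELTA)
        weights.append(w)
    return weights


def cpt_score(outcomes, probs):
    weights = decision_weights(outcomes, probs)
    return sum(w * value(x) for w, x in zip(weights, outcomes))


# Fair lottery ticket: pay 1, win 99 with 1 percent probability (expected value zero).
ticket = cpt_score([-1.0, 99.0], [0.99, 0.01])

# Uninsured risk: lose 100 with 1 percent probability, versus a sure loss of 1.
uninsured = cpt_score([-100.0, 0.0], [0.01, 0.99])
insured = value(-1.0)
print(ticket, uninsured, insured)
```

Under these assumptions the fair ticket has a positive CPT score (risk seeking for low-probability gains), while the sure loss is preferred to the uninsured gamble (risk aversion for low-probability losses).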
1.4 Outline
The chapters of this thesis can be read individually without prior study of the preceding chap-
ters. That said, we note that Chapter 3 does provide insights for a better understanding of
Chapter 4.
Chapter 2 (The 2011 European Short Sale Ban: A Cure or a Curse?) evaluates whether
the 2011 European short sale ban on financial stocks proved to be successful or had a negative
impact on financial markets. The analysis focuses on the effects of the short sale ban on
financial stability and contagion risk. Different from the previous literature, we explicitly take
an options market perspective and focus on market participants’ change in expectations. Our
starting point is the extraction of RND and implied volatility skews from single stock options.
Our estimated measure of implied jump risk is central to our analysis. We find that implied
jump risk tended to increase for banned stocks even more than for non-banned stocks, which
arguably is the opposite of what regulators target. However, contagion risk for banned stocks
did decrease during the ban relative to the previous period. This is likely due to the fact that
short trading activity eased after the imposition of the ban. Perhaps the main reason for
this was market failure. During the ban, market makers become more risk-sensitive following
equity market declines, which can be explained by the overweighting of tails, a feature of the
CPT. While we observe that the short sale ban is effective in restricting both outright and
synthetic shorts on banned stocks, we do find evidence of trading migration from single-stock
puts to index puts. The selling pressure potentially diverted from the financial stocks to a larger
share of the stock market, thereby reducing the destabilizing effects in the financial sector. As
described, Chapter 2 is the one that departs the most from the common theme in this thesis:
behavioral finance. CPT’s overweighting of tails is behind our main finding, but the link to
behavioral finance ends there. The chapter as a whole revolves around the effectiveness of the
2011 European short sale ban from an option market perspective.
Chapter 3 (Single stock call options as lottery tickets: overpricing and investor sentiment)
empirically tests whether the overpricing of out-of-the-money single stock calls can be explained
by the CPT. To the best of our knowledge, these tests have never been reported in the
literature. In line with Barberis and Huang (2008), our main hypothesis is that single stock call
options, typically traded by individual investors, are overpriced because this type of investor
overweights small probability events and overpays for positively skewed securities that resemble
lottery tickets. Specifically, we test whether tails of the CPT density function outperform the
RND and a set of rational subjective probability density functions on matching tails of the
distribution of realized returns. We find that overweighting of small probabilities embedded in
the CPT explains the richness of out-of-the-money single stock calls better than other utility
functions. We find our estimates for the CPT probability weighting function parameter to
be qualitatively consistent with the ones of Tversky and Kahneman (1992), particularly for
short-term options. Our estimates suggest, however, that overweighting of small probabilities is
less pronounced than suggested by the CPT. Moreover, overweighting of small probabilities is
strongly time-varying and to a large degree explained by the sentiment factor of Baker and Wur-
gler (2007), a result that is confirmed by the Least Absolute Shrinkage and Selection Operator
(Lasso) of Tibshirani (1996).
Chapter 4 (Implied Volatility Sentiment: A Tale of Two Tails) builds on Chapter 3 and
on Polkovnichenko and Zhao (2013) and Dierkes (2009), providing evidence that low prob-
ability events are occasionally overweighted, as observed in the pricing of out-of-the-money
single stock calls (due to individual investors’ trading activity) and index puts (due to insti-
tutional investors’ trading activity). We show that overweighting of tail events in these two
option markets is strongly time-varying and is linked to equity market sentiment and higher
moments of RND. As a consequence, we suggest a novel sentiment indicator: implied volatility-
Sentiment or IV-Sentiment. We find that our measure, jointly derived from index and single
stock options, explains investors’ overweight of tail events well. When attempting to predict
the equity risk premium out-of-sample, we find that IV-Sentiment adds value over and above
traditional factors, especially when multifactor predictive models are constrained. The struc-
ture provided by these constraints in addition to a simple forecast combination approach seems
also to outperform a “kitchen sink” model and a set of machine learning algorithms capable
of exploring non-linearities in the data and tackling multicollinearity issues, such as Random
forests (Breiman, 2001), Neural Networks, Principal Component Regression (Massy, 1965) and
Ridge regression (Hoerl and Kennard, 1970). When employed as a mean-reversion strategy, our
IV-Sentiment measure delivers economically significant results, which seem more robust than
the ones produced by the conventional sentiment factor. Last but not least, we find that a con-
trarian strategy based on IV-Sentiment shows limited exposure to a set of cross-sectional equity
factors, including Fama and French’s five factors, the momentum factor and the low-volatility
factor, and seems valuable in avoiding momentum crashes.
Chapter 5 (Predictable Biases in Macroeconomic Forecasts and Their Impact Across Asset
Classes) reveals how biases in macroeconomic forecasts are associated with economic surprises
and market responses across four asset classes around US data announcements. We start by
reiterating the previous finding of the literature that the consensus forecasts of US macroeco-
nomic releases embed anchoring (see Campbell and Sharpe, 2009). Further, to the best of our
knowledge, we are the first to find that the skewness of the distribution of economic forecasts
is a strong predictor of economic surprises, suggesting that forecasters behave strategically
(rational bias) and possess private information. By using a popularity measure per economic
indicator and by expanding the number of countries/regions and indicators tested relative to
Campbell and Sharpe (2009), we advocate that the prevalence of biases is related to attention,
which is also a novel insight in the literature. Under these conditions, both economic surprises
and returns of assets that are sensitive to macroeconomic conditions are predictable. Our find-
ings indicate that local equities and bond markets are more predictable than foreign markets,
currencies and commodities. On an out-of-sample basis, point forecasting is performed better by
non-linear machine learning models as they seem to capture the dynamics of market responses
around macroeconomic announcements better than linear regression models and avoid overfit-
ting. Yet, when forecasters fail to correctly forecast the direction of economic surprises, regret
becomes a relevant cognitive bias to explain asset price responses. We find that the behavioral
and rational biases encountered in US economic forecasting also exist in Continental Europe,
the United Kingdom and Japan, albeit to a lesser extent.
Chapter 2
The 2011 European Short Sale Ban: A Cure or a Curse?∗
2.1 Introduction
On August 11, 2011, Belgium, France, Italy, and Spain imposed short sale bans on financial
stocks. The European Securities and Markets Authority (ESMA) stated that the reason for
the short sale bans was to curb market abuse and the spread of false rumors2. The spread
of false rumors is dangerous because it may increase the risk of financial contagion3, thereby
endangering financial stability.
Recent academic studies argue that short sale bans, at best, do not affect stock price levels
and, at worst, contribute to their decline and negatively impact market quality. For instance,
Boehmer et al. (2013) conclude that it is unclear whether the SEC’s 2008 imposition of short
sale bans achieved the goal of providing a floor for U.S. equity markets. Beber and Pagano
(2013) investigate the impact of the 2008 bans on stock markets in 30 different countries and
find that banned stocks underperform stocks not included in the bans.
∗This chapter is based on Felix et al. (2016a). I am grateful to Iftekhar Hasan (the editor) and two anonymous referees at the Journal of Financial Stability for useful comments and suggestions. We also thank seminar participants at IX Seminar on Risk, Financial Stability and Banking of the Banco Central do Brasil 2014 in Sao Paulo, International Risk Management Conference 2014 in Warsaw, Financial Management Association (FMA) European Conference 2014 in Maastricht, 3rd International Conference of the Financial Engineering and Banking Society - FEBS/LabEx-ReFi 2013 in Paris, VU University Amsterdam and APG Asset Management in Amsterdam for their helpful comments. We thank Markit Securities Finance for providing the data on short stock positions and borrowing costs. We thank APG Asset Management for making available a large part of the additional data set.
2ESMA stated on August 11, 2011: “European financial markets have been very volatile over recent weeks. The developments have raised concerns for securities markets regulators across the European Union. [....] While short selling can be a valid trading strategy, when used in combination with spreading false market rumors this is clearly abusive. [...] Today some authorities have decided to impose or extend existing short selling bans in their respective countries. They have done so either to restrict the benefits that can be achieved from spreading false rumors or to achieve a regulatory level playing field, given the close inter-linkage between some EU markets”.
3Financial contagion occurs when a relatively contained shock, which initially affects only one or a few institutions, sectors or countries, propagates via larger shocks to the rest of the financial sector, economy or other countries.
In this chapter, we explicitly take an options market perspective, as opposed to employing
only the stock market itself. Our study focuses on market participants’ changes in beliefs
and expectations, as in the work of Yan (2011), Chang et al. (2013), and Chira et al. (2013).
Forward-looking probabilities implied by options prices, i.e., risk neutral densities (RND), and
the implied volatility (IV) skew, are used to assess how the ban affects implied jump risk on
banned and non-banned stocks. We employ a data set of daily IV across a range of different
moneyness levels for all optionable European stocks listed in Belgium, France, Italy, and Spain.
We note that using option-implied data is a novel approach in the literature to analyze the
impact of short sale bans on financial markets.
We focus not only on the outermost tails of RNDs but also on the tails of realized returns.
We argue that it is the more extreme parts of the distributions that best reflect implied jump
risk. We use extreme value theory (EVT) to assess how investors, through their perception of
implied jump risk, differentiated between banned and non-banned stocks upon the introduction
of the 2011 European short selling ban.
Our work is related to that of Melick et al. (1997) and Birru and Figlewski (2011) because
it examines the behavior of RNDs over specific events. The rationale of using RND and IV
skews to assess how the ban affected implied jump risk is also supported by Bates (2000) and
Rubinstein (1994). They show that before the 1987 crash, the probability of large negative stock
returns was small and fairly close to that suggested by the normal distribution. Just prior to
the crash, however, the option-implied probability of jumps rose considerably at the same time
that the IV skew became steeper. The left tail of the RND of returns became considerably
fatter and thus negatively skewed with increased kurtosis, a phenomenon attributed to crash
fear (Rubinstein, 1994). As a result, out-of-the-money (OTM) puts are systematically priced
at a higher level relative to at-the-money (ATM) ones.
The main contributions of this chapter are threefold. First, we provide evidence that the
ban increased implied jump risk levels, particularly impacting the banned financial stocks. We
show that it is the imposition of the ban itself that led to the increase in implied jump risk,
rather than other causes, such as information flow, options-trading volumes, or stock-specific
factors. This finding is important because increased implied jump risk may provoke financial
contagion (see Ait-Sahalia et al., 2015) and increase systemic risk. Because of the connection
between implied jump risk and contagion, shifts in implied jump risk are closely monitored by
regulators4.
Second, we find that after the announcement of the ban, financial contagion risk actually
drops for banned stocks. This finding seems to run contrary to what one might expect, given
the documented increases in implied jump risk levels for banned stocks. Interestingly, for the
non-banned stocks, we document that contagion risk levels do indeed increase after the ban,
thus behaving in line with the rise in implied jump risk levels. We argue that this difference
may be caused by (formal and informal) market makers’ reluctance to further increase their
4For instance, Poon and Granger (2003) note that the Bank of England uses implied volatilities to assess market sentiment.
options’ inventory risk, leading to relatively steep IV skews, reduced volumes, and widened bid-
ask spreads for banned stocks. Such a supply shift occurs because market makers become more
risk-sensitive following equity market declines, which can be explained by the overweighting of
tails feature of the Cumulative Prospect Theory (CPT) of Tversky and Kahneman (1992).
Third, we compare the effects of the 2011 European ban to its 2008 American counterpart.
Investors may be able to obtain economic short exposure to banned stocks through a derivatives-
based strategy that replicates the payoff of a stock’s short sale. Such a “substitution effect”
(see Battalio and Schultz, 2011; Grundy et al., 2012) is characterized by a migration of trading
volume from one instrument to another. We find that no substitution effect occurred between
regular short selling and synthetic shorting through single stock puts during the 2011 European
ban. Instead of a substitution effect, our results show a migration out of single stock puts into
the EuroStoxx 50 index options market. We conclude that this type of migration diversifies
selling pressure initially concentrated in financial stocks across a larger share of the stock
market, thereby reducing systemic risks and enhancing overall financial stability.
2.2 Data and methodology
The 2011 short sale ban on financial stocks in the Euro member countries Belgium, France,
Italy, and Spain was established by a coordinated act of the European Securities and Markets
Authority (ESMA) and the national financial market regulators of those countries on August
11, 2011. The announcement was made via a public statement issued by the ESMA and
was followed by publications on the same day by the Belgian Financial Services and Markets
Authority (FSMA), the French Autorité des Marchés Financiers (AMF), the Italian Commissione
Nazionale per le Società e la Borsa (Consob), and the Spanish Comisión Nacional del Mercado
de Valores (CNMV). The ban entered into effect on August 12, 2011. Table 2.1 provides an
overview of the banned financial stocks.
The ban on covered short selling not only prohibited the creation of new net short positions
but also banned increases in existing ones, including intra-day operations. Naked short selling
had already been prohibited in these four markets since 2008. Positions arising from formal
market-making activities were exempted from the ban. The ban targeted not only public mar-
kets but also over-the-counter (OTC) markets. In terms of scope, the national announcements
differed. The Belgian FSMA announced that the ban applied to net economic short positions
of any kind, while the French AMF communicated that derivatives could only be used to hedge,
create or extend net long positions. For the Italian Consob, the ban covered only shares and
not exchange-traded funds (ETFs) or any derivatives, while the Spanish CNMV imposed the
ban on all trades in equities or indices.
During the ban, holders of financial stocks were still allowed to use single stock derivatives
or simply sell their holdings to hedge their portfolios. Investors exposed to stocks were allowed
to hedge their overall equity market exposure by trading the market index or single stock
derivatives. It was the short selling of banned stocks that was prohibited, not hedging them
or reducing equity market risk. The creation or extension of marginal net short positions in
banned securities as a result of hedging equity market risk was still allowed.
Table 2.1: Overview of banned financial stocks

Belgium: Ageas; Dexia; KBC Group; KBC Ancora.

France: April Group; Axa; BNP Paribas; CIC; CNP Assurances; Crédit Agricole; Euler Hermes; Natixis; Paris Ré; Scor; Société Générale.

Italy: Azimut Holding; Banca Carige; Banca Finnat; Banca Generali; Banca Ifis; Banca Intermobiliare; Banca Monte Paschi di Siena; Banca Popolare Emilia Romagna; Banca Popolare Etruria e Lazio; Banca Popolare Milano; Banca Popolare Sondrio; Banca Profilo; Banco di Desio e Brianza; Banco di Sardegna Rsp; Banco Popolare; Cattolica Assicurazioni; Credito Artigiano; Credito Emiliano; Credito Valtellinese; Fondiaria Sai; Generali; Intesa Sanpaolo; Mediobanca; Mediolanum; Milano Assicurazioni; Ubi Banca; Unicredit; Unipol; Vittoria Assicurazioni.

Spain: Banca Cívica, S.A.; Banco Bilbao Vizcaya Argentaria, S.A.; Banco de Sabadell, S.A.; Banco de Valencia, S.A.; Banco Español de Crédito, S.A.; Banco Pastor, S.A.; Banco Popular Español, S.A.; Banco Santander, S.A.; Bankia, S.A.; Bankinter, S.A.; Bolsas y Mercados Españoles, S.A.; Caixabank, S.A.; Caja de Ahorros del Mediterráneo; Grupo Catalana de Occidente, S.A.; Mapfre, S.A.; Renta 4 Servicios de Inversion, S.A.

This table lists the financial stocks banned from short selling on August 11, 2011, in Belgium, France, Italy, and Spain by their respective national financial market regulators in a coordinated act with the European Securities and Markets Authority (ESMA).
The European short sale ban was initially intended to be in place for 15 days
only, with the exception of Belgium, which announced that the ban would remain in effect
indefinitely. Nevertheless, the ban was extended by the Spanish CNMV, the French AMF, and
the Italian Consob several times. On February 13, 2012, both FSMA and AMF announced
the lifting of the ban with immediate effect in Belgium and with retroactive effect, to February
11, in France. On February 15, the CNMV announced the lifting of the ban from February 16
onwards, and on February 24, the Italian ban expired.
Our sample covers the period from February 15, 2008, to March 27, 2012, and includes
1,073 trading days. It consists of all stocks that had listed options as of February 2012 on
the Belgian (Brussels Stock Exchange/Euronext Brussels), French (Paris Bourse or Euronext
Paris), Italian (Milan Stock Exchange or Borsa Italiana), and Spanish (Bolsa de Madrid) stock
exchanges. Overall, our sample comprises 185 stocks, of which 105 are included in these stock
exchanges’ main indices, i.e., the Belgian BEL20, the French CAC40, the Italian MIB, and the
Spanish IBEX35.
From Bloomberg, we source daily trading volumes and the number of shares outstanding per
stock, as well as trading volumes and put-call volume ratios for listed options. Trading volumes for listed
puts on the EuroStoxx 50 index, the V2X index (the IV index from the EuroStoxx 50 index),
and generic series of five-year sovereign credit default swaps (CDS) for Belgium, France, Italy,
and Spain are also collected from Bloomberg. Daily short stock positions (utilization rates)
and costs of short selling (simple average fee, simple average rebate, and cost of borrow score)
were kindly provided by Markit Securities Finance (formerly Data Explorers).
We implement the method by Figlewski (2010) for obtaining RNDs. He builds on the Bree-
den and Litzenberger (1978) formulae and interpolates and smooths the IV structure instead
of interpolating option prices. A clear strength of the Figlewski (2010) approach is its ability
to fill in intermediate grid values of the IV curve between the available strikes (the body of
the RND) with reduced noise and to extrapolate the RND beyond such observable strikes with
tails of flexible and reasonable shape.
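The Breeden and Litzenberger (1978) result states that the RND equals the future value of the second derivative of the call price with respect to the strike, f(K) = e^{rT} ∂²C/∂K². The sketch below applies this with a central finite difference on a fine strike grid. It is not the Figlewski (2010) procedure: the IV curve is a hypothetical smooth skew (`hypothetical_iv`) standing in for the interpolated and smoothed IV structure, the tail extrapolation is omitted, and the spot, rate and maturity values are assumptions.

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call without dividends."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d1 - sd)


def hypothetical_iv(spot, strike):
    # Assumed skew: IV rises as the strike falls (not fitted to any data).
    return 0.25 - 0.10 * math.log(strike / spot)


def risk_neutral_density(spot, rate, maturity, strikes):
    """Breeden-Litzenberger density on the interior points of an evenly spaced grid."""
    step = strikes[1] - strikes[0]
    calls = [bs_call(spot, k, rate, hypothetical_iv(spot, k), maturity) for k in strikes]
    growth = math.exp(rate * maturity)
    return [
        growth * (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / step ** 2
        for i in range(1, len(calls) - 1)
    ]


SPOT, RATE, MATURITY = 100.0, 0.01, 0.25          # three-month horizon
strikes = [20.0 + 0.5 * i for i in range(561)]    # strikes from 20 to 300
density = risk_neutral_density(SPOT, RATE, MATURITY, strikes)
total_mass = sum(density) * 0.5                   # should be close to one
print(total_mass)
```

Working with a smoothed IV curve rather than raw option prices matters here because the second difference amplifies any noise or kink in the inputs.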
To calculate RND for the stocks of interest, we obtain daily IV data for seven moneyness
levels, i.e., 80, 90, 95, 100, 105, 110, and 120, at the three-month maturity. Implied volatilities
are extracted by reverse engineering the Black-Scholes model from Bloomberg’s 16:00 hours
closing mid-prices (Bloomberg, 2008). In line with Figlewski (2010), IVs for the 80 to 95
moneyness levels are obtained from puts, while IVs for the 105 to 120 moneyness levels are
obtained from calls. For consistency with our IV skew measure, we use ATM IV from puts.
Because we intend to compare RND from banned stocks to non-banned ones, we compute
IV skews and extract RNDs for these two groups of stocks separately. More details on the
application of the method are included in Appendix 2.A.
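Extracting an IV from an option price amounts to solving the Black-Scholes formula for the volatility that reproduces the observed price. A minimal sketch for a European put, using bisection, is given below; the solver choice, the bounds on volatility and the input values are assumptions for illustration and do not describe Bloomberg's procedure.

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_put(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European put without dividends."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / sd
    d2 = d1 - sd
    return strike * math.exp(-rate * maturity) * norm_cdf(-d2) - spot * norm_cdf(-d1)


def implied_vol_put(price, spot, strike, rate, maturity, low=1e-4, high=5.0, tol=1e-10):
    """Bisection on volatility; the put price is increasing in volatility."""
    for _ in range(200):
        mid = 0.5 * (low + high)
        if bs_put(spot, strike, rate, mid, maturity) > price:
            high = mid
        else:
            low = mid
        if high - low < tol:
            break
    return 0.5 * (low + high)


# Round trip at the 80 percent moneyness level with a hypothetical 35 percent volatility.
quoted_price = bs_put(100.0, 80.0, 0.01, 0.35, 0.25)
recovered_vol = implied_vol_put(quoted_price, 100.0, 80.0, 0.01, 0.25)
print(recovered_vol)
```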
We make use of extreme value theory (EVT) to measure implied jump risk5 because it
focuses on tail events, such as jumps in return distributions. EVT allows us to compare the
value-at-risk (VaR) implied by RNDs for banned and non-banned stocks. We first estimate
the tail shape estimator (φ) following Hill (1975) and then compute the VaR using the semi-parametric
quantile estimator (qp) of Hartmann et al. (2004). In a next step, we employ a bivariate EVT
method to calculate commonality in jumps and, hence, contagion risk from historical returns.
EVT is well suited to measure contagion risk because it does not assume any specific return
distribution. Our approach estimates how likely it is that one stock will experience a crash
beyond a specific extreme negative return threshold conditional on another stock crash beyond
an equally probable threshold6.
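The univariate step can be sketched as follows. The code assumes the textbook form of the Hill (1975) estimator, applied to the m largest losses, and the commonly used semi-parametric quantile formula q_p = X_(n−m) (m/(np))^(1/α); whether this matches the exact definitions of φ and qp in Appendix 2.B is an assumption, as are the function names and the synthetic sample.

```python
import math


def hill_tail_index(losses, m):
    """Hill estimate of the tail index alpha from the m largest (positive) losses."""
    ordered = sorted(losses, reverse=True)
    threshold = ordered[m]  # the (m + 1)-th largest loss
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:m]) / m
    return 1.0 / mean_log_excess, threshold


def tail_quantile(losses, m, p):
    """Loss level exceeded with probability p, extrapolated from the threshold."""
    alpha, threshold = hill_tail_index(losses, m)
    n = len(losses)
    return threshold * (m / (n * p)) ** (1.0 / alpha)


# Synthetic check: evenly spaced quantiles of a Pareto tail with true index 3.
n = 1000
losses = [(i / (n + 1)) ** (-1.0 / 3.0) for i in range(1, n + 1)]

alpha_hat, _ = hill_tail_index(losses, m=100)
var_level = tail_quantile(losses, m=100, p=0.001)
print(alpha_hat, var_level)
```

A lower tail index means a fatter tail, so the same exceedance probability maps to a more extreme loss level.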
We use the daily IV skews of individual European equities as a second measure of implied
jump risk7. The IV skew is calculated as the difference between the IV of three-month OTM
listed puts at the 80 percent moneyness level and ATM puts with the same maturity for every
stock in our sample. As with RNDs, we construct two indices of IV skew, one for banned
and the other for non-banned stocks, by equally averaging the stock-specific IV skews of the
5We acknowledge that the expression “jump risk” can also refer to the physical or actual, real-world jump risk. However, in this chapter, we work with the implied, risk-neutral jump risk measure only. This measure of jump risk can be viewed as the sum of the actual (real-world) jump risk plus a risk premium. Hence, implied jump risk increases may be caused by increases in physical jump risk, increases in the risk premium, or both.
6We refer to Hartmann et al. (2004) and Balla et al. (2014), who use the conditional co-crash (CCC) probability estimator, which is applied to each pair of stocks in our sample. Appendix 2.B includes a detailed discussion of both our employed univariate and bivariate EVT-methodologies.
7A fat left tail in the RND of returns is a corollary to the fact that the IV skew is steep, see Bakshi et al. (2003).
constituents of each index. We also calculate single country versions of the banned and non-
banned IV skews for Belgium, France, Italy, and Spain. Table 2.2 presents descriptive statistics
for our IV skew measures for the entire sample period, both for the overall and single-country
levels.
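The construction of the two IV skew indices can be sketched as follows. This is a minimal illustration for a single day, assuming implied volatilities are quoted in volatility points; the function names, the dictionary layout, and the tickers in the usage note are hypothetical.

```python
from statistics import mean

def iv_skew(iv_otm_put_80, iv_atm_put):
    """IV skew: IV of the 80 percent moneyness OTM put minus IV of the ATM put."""
    return iv_otm_put_80 - iv_atm_put

def skew_indices(quotes, banned):
    """Equal-weighted IV skew of banned and of non-banned stocks on one day.

    quotes: {ticker: (iv_otm_put_80, iv_atm_put)}; banned: set of tickers.
    """
    banned_skews = [iv_skew(*q) for t, q in quotes.items() if t in banned]
    other_skews = [iv_skew(*q) for t, q in quotes.items() if t not in banned]
    return mean(banned_skews), mean(other_skews)
```

For example, with quotes = {"BANK_A": (40.0, 32.0), "BANK_B": (38.0, 32.0), "UTIL_C": (30.0, 25.0)} and banned = {"BANK_A", "BANK_B"}, the banned index is the average of 8.0 and 6.0, and the non-banned index equals the single skew of 5.0.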
Table 2.2: Descriptive statistics
Statistics          Overall                 Belgium                 France
                    Non-banned  Banned      Non-banned  Banned      Non-banned  Banned
Average             5.73        6.49        5.33        7.11        6.23        8.57
Median              5.31        5.98        5.14        6.94        6.09        8.16
Standard deviation  1.22        1.63        1.40        2.25        1.28        2.04
Skew                1.15        1.91        0.67        0.48        0.56        0.54
Excess Kurtosis     0.88        5.24        0.29        0.64        -0.11       -0.51
Jarque-Bera         266.0***    1835.4***   81.0***     58.5***     54.9***     61.9***

Statistics          Italy                   Spain
                    Non-banned  Banned      Non-banned  Banned
Average             6.49        7.49        3.85        4.96
Median              6.23        7.24        3.19        3.98
Standard deviation  1.26        1.70        2.78        3.84
Skew                1.02        0.89        4.03        4.35
Excess Kurtosis     0.89        1.16        18.87       22.44
Jarque-Bera         214.5***    195.8***    18395.7***  25297.4***

This table provides descriptive statistics for the IV skews of non-banned and banned stocks calculated over the full sample period (February 15, 2008 to March 27, 2012) for the overall group of stocks as well as separately for Belgium, France, Italy, and Spain. We perform Jarque-Bera normality tests for all groups of stocks to infer whether IV skews are normally distributed or not. The null hypothesis (H0) for the Jarque-Bera test is that data is normally distributed. Rejection of H0 is denoted by ***, **, and *, at the one, five, and ten percent significance level, respectively.
Table 2.2 shows that the average and median IV skews for banned stocks are higher than
for non-banned stocks, an observation that pertains not only to the overall numbers but also to
each country separately. The standard deviation of the IV skew is higher for banned stocks. As
expected, the distributions of the IV skews are all positively skewed. All IV skew distributions
reported here have fat right tails and are not normal; thus, we use a non-parametric Mann-
Whitney U-test to make statistical inferences.
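For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Mann-Whitney U-test. It assumes the large-sample normal approximation without tie or continuity correction, which is a simplification and not necessarily the variant used for the results in this chapter; the function name is hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value from a normal approximation)."""
    # Pool both samples, remembering the group (0 = x, 1 = y) of each value.
    combined = sorted((v, g) for g, s in ((0, x), (1, y)) for v in s)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1            # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))
```

Applied to two sub-period samples of daily IV skews, a small p-value would lead to rejecting the null hypothesis that both samples come from the same population.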
2.3 Discussion of results
We first examine the short selling utilization rate8 and the performance of banned and non-
banned stocks around the ban announcement day. Figure 2.1 shows that the imposition of the
2011 European ban strongly affects the short selling of stocks in Belgium, France, Italy, and
Spain. Plot A indicates that the short selling utilization rates fall for banned stocks from 32 to
27 percent in the months of August and September 2011, especially after the ban announcement
on August 11. This drop in short selling utilization is visible in three of the four crisis countries.
For Belgium and Italy, short selling utilization drops from 29 to 23 percent and 24 percent,
respectively. For France, it remains unchanged at approximately 10 percent during this period.
For Spain, it drops from 53 percent on the ban announcement day to 45 percent on September
30. We observe that such drops in the utilization rate come from the decrease in the value of
short selling, the numerator of the utilization rate, as inventories of stocks available for lending
in the four countries remain relatively unchanged. The decreasing utilization rates indicate
that the ban was effective in reducing short selling, despite market makers still being allowed
to short banned stocks.
8 The short selling utilization rate is calculated as Utilization=100*(ValueOnLoan/InventoryValue), where ValueOnLoan is the beneficial owner value of the loan and InventoryValue is the beneficial owner inventory value. Utilization measures the value of a stock utilized for securities lending against the total value of inventory available for lending, i.e., its short selling demand.
Figure 2.1: Short positions in stocks around ban date. This figure presents the average short utilization rates calculated for banned (Plot A) and non-banned (Plot B) stocks in our sample. Utilization rates have been calculated for the full sample of banned and non-banned stocks as well as separately for the stocks in Belgium, France, Italy, and Spain.
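The utilization formula and the argument that the drop stems from the numerator can be made concrete with a small sketch. The monetary amounts below are invented for illustration and are not taken from our data; only the formula follows the definition of the utilization rate given in this section.

```python
def utilization(value_on_loan, inventory_value):
    """Short selling utilization rate in percent: 100 * ValueOnLoan / InventoryValue."""
    if inventory_value <= 0:
        raise ValueError("inventory value must be positive")
    return 100.0 * value_on_loan / inventory_value

# Hypothetical amounts: the value on loan falls while the inventory is unchanged.
before = utilization(32.0, 100.0)
after = utilization(27.0, 100.0)
```

With an unchanged inventory, the fall in utilization from before to after is driven entirely by the lower value on loan.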
The reduction in short selling of financial stocks is especially noteworthy when utilization
rates for banned and non-banned stocks are compared. Plot B shows that the utilization rate for
non-banned stocks increases from 16 to 18 percent, on average, during August and September
2011, an increase observed across all four euro countries.
Despite such changes in utilization rates, short selling of banned financial stocks far exceeds
the level measured in non-banned stocks. In August 2011, the average short selling utilization
rate for financial stocks is twice the level reported for stocks of the other sectors (32 percent in
Plot A vs. 16 percent in Plot B). Short sellers would have benefited much more from further
deterioration of financial stocks rather than from a potential weakness in the average stock.
Despite such a dichotomy, utilization rates for the overall market around the ban announcement
day were at their highest levels since 2010 for the four crisis countries. For Italy and Spain,
the short selling activity was concentrated in mid-caps (DataExplorers, 2011), which is
consistent with a large short selling interest in their banks.
From the end of June until the ban announcement, the EuroStoxx Banks index dropped by
32 percent, whereas the EuroStoxx 50 index fell by 22 percent. In the first ten days of August
2011, before the ban, shares in European banks fell by 23 percent, whereas the European index
corrected by 17 percent. In the month after the ban was announced, European
banks’ stocks lost an additional 18 percent, while non-financial equity dropped by only 6 per-
cent. Our data on short selling positions and returns suggest that financial stocks were indeed
under strong pressure.
2.3.1 VaR levels and volatility skews
In the following analysis of VaR levels implied by RNDs, we distinguish five sub-periods: (1) the
U.S. recession period (February 15, 2008, to June 30, 2009); (2) the 2009/2010 stock market
rally (July 1, 2009, to April 26, 2010); (3) the European crisis period (April 27, 2010, to
August 10, 2011), initiated by Standard and Poor’s downgrade of Greece’s sovereign bonds to
junk status; (4) the ban period (August 11, 2011, to February 16, 2012); and (5) the post-ban
period (February 17, 2012, to March 27, 2012).
Panel A of Table 2.3 shows that during the ban period, the RND-implied VaR levels for
banned stocks, i.e., the perceived implied jump risks, are significantly higher than for non-
banned stocks. The same conclusion holds for the post-ban period. We observe that the
VaR levels from RNDs during the ban period are significantly higher than during the pre-ban
period. The ten percent VaR for banned stocks increases from 38 to 62 percent,
whereas for non-banned stocks it increases to a much lesser extent, from 35 to 46
percent. We observe similar differences in extreme downside risk for these two sub-samples
at both the five- and one-percent VaR levels. Interestingly, the VaR levels for the post-ban
period are not significantly different from the ban period for both banned and non-banned
stocks. The other sub-sample that had very distinct downside risk features in comparison to
the preceding period was the 2009 stock market rally. The latter period had significantly lower
VaR priced in RND returns than the U.S. recession period, especially for banned stocks but
also for non-banned equity. The ten percent VaR for banned stocks was 50 percent during
the recession and 40 percent during the rally, whereas for non-banned stocks it was 47 and 41
percent, respectively. We find that the VaR levels for banned stocks were generally higher than
for non-banned stocks, and downside risk priced in RND reached its peak during the ban.
Table 2.3: Extreme downside risk and implied volatility skew
Panel A - Extreme downside risk
Sample split                                      10% VaR                             5% VaR                              1% VaR
                                                  Non-banned  Banned      NB vs. B    Non-banned  Banned      NB vs. B    Non-banned  Banned      NB vs. B
Full sample: 02/15/2008 - 03/27/2012              -0.31       -0.34       -1.30       -0.36       -0.40       -1.50       -0.51       -0.57       -1.9
US recession: 02/15/2008 - 06/30/2009             -0.47       -0.50       -0.90       -0.53       -0.58       -1.20       -0.73       -0.82       -1.8
2009 stock market rally: 07/01/2009 - 04/26/2010  -0.41**     -0.40**     0.30        -0.47**     -0.46**     0.20        -0.63**     -0.63**     0.00
Pre-ban European crisis: 04/27/2010 - 08/10/2011  -0.35**     -0.38       -1.10       -0.40*      -0.44       -1.00       -0.58       -0.62       -1.00
Ban period: 08/11/2011 - 02/16/2012               -0.46***    -0.62***    -4.00***    -0.52***    -0.70***    -4.00***    -0.69***    -0.94***    -4.10***
Post-ban period: 02/17/2012 - 03/27/2012          -0.45       -0.56       -2.10**     -0.50       -0.64       -2.20**     -0.66       -0.84       -2.20**

Panel B - Implied volatility skew
Sample split                                      Overall                 Belgium                 France                  Italy                   Spain
                                                  Non-banned  Banned      Non-banned  Banned      Non-banned  Banned      Non-banned  Banned      Non-banned  Banned
Full sample: 02/15/2008 - 03/27/2012              5.31        5.98        5.14        6.94        6.09        8.16        6.23        7.24        3.19        3.98
US recession: 02/15/2008 - 06/30/2009             5.02        5.99        4.70        6.97        5.46        7.83        6.31        8.02        3.47        4.87
2009 stock market rally: 07/01/2009 - 04/26/2010  5.05*       5.41***     4.46***     6.24***     5.69**      7.46***     5.99***     6.00***     2.42***     3.26***
Pre-ban European crisis: 04/27/2010 - 08/10/2011  5.78***     6.05***     5.63***     7.81***     6.42***     8.14***     5.82        7.11***     3.60***     3.98***
Ban period: 08/11/2011 - 02/16/2012               6.05**      7.34***     5.90***     7.28        6.99***     11.97***    7.14***     7.98***     3.06***     3.98
Post-ban period: 02/17/2012 - 03/27/2012          5.17***     6.37***     5.32***     5.25***     5.59***     10.65***    6.73***     5.49***     2.73***     4.98***

Panel A shows the ten, five, and one percent extreme downside risk estimates, or value-at-risk (VaR), of the risk neutral densities (RNDs) for all non-banned and all banned stocks during the full sample period, as well as for the five different sub-periods. Asterisks used as superscript to VaRs denote the outcome of the t-tests specified in Eqs. (3.2.7) and (2.B.3) across different sample periods. The column “NB vs. B” shows the t-stats of the test that compares VaRs of non-banned and banned stocks, using Eqs. (3.2.7) and (2.B.3). The null hypothesis (H0) is that there is no difference between the VaR from non-banned and banned stocks. Panel B provides the median IV skews for non-banned and banned groups of stocks during the same periods as in Panel A. Mann-Whitney (MW) U-tests are applied to the IV skew of paired sample splits to infer whether the medians are statistically different from each other. The null hypothesis (H0) for the MW U-test is that there is no difference between the two unrelated samples. In both panels, rejection of H0 is denoted by the asterisks ***, **, and *, at the one, five, and ten percent significance level, respectively. In Panel B, the superscripts are placed in the cell of the second sub-sample that is compared.
Figure 2.2 depicts the historical behavior of our proxy for implied jump risk, the average IV
skew, for banned and non-banned stocks. The ban period is highlighted, with the beginning of
the shadowed part representing the ban announcement day. We observe that between 2008 and
2012, spikes in average IV skews were well above their mean, coinciding with periods of market
turmoil. Figure 2.2 shows that the IV skews rise strongly in 2008, around the Lehman collapse,
and wane after the market trough in March 2009. In 2010, the IV skews jump on April 27, the
day the Greek government bonds were downgraded by Standard and Poor’s to “junk” status.
The IV skew then strongly reverses on October 18, 2010, when a task force of European leaders
agreed on a rescue package to improve the European Union’s economic governance in an effort
to tackle the financial crisis. The 2011 jump in IV skews coincides with the ban announcement
day on August 11. The announcement was not accompanied by any major event related to the
European financial crisis, to equity markets in general or to the financial sector. We observe
that on all three occasions, the IV skew of banned stocks exceeded the IV skews for non-banned
stocks.
Figure 2.2 shows that implied jump risk rises strongly just prior to the ban announcement
for both banned and non-banned stocks. Such spikes in implied jump risk occur during the day
on August 11, 2011, whereas the ban was officially announced only after the market closed9.
The increase in the average IV skew on August 11 amounts to 2.16 volatility points for banned
stocks and to 1.05 volatility points for non-banned stocks. Both
increases exceed the 99th percentile of all daily IV skew changes in our sample. On August
12, the average IV skew continued to rise sharply, by 0.78 volatility points for banned stocks,
a movement exceeding the 94th percentile of all daily IV skew changes in our sample. On that
same day, the IV skew for non-banned stocks rose by 0.55 volatility points, exceeding the 96th
percentile. We observe that jumps in the IV skews around the ban announcement day are
outliers in our sample. More importantly, the rise in the IV skew for banned stocks is much
more pronounced than for non-banned stocks.
9 It is unclear whether any information on the upcoming short selling ban leaked before the market closed on August 11. However, given that a ban on covered short selling on all stocks was already introduced in Greece on August 8, the extension of the ban to other European countries might have been expected by some market participants.
Figure 2.2: Averaged implied volatility skews for banned and non-banned stocks. This figure depicts the average IV skew for banned and non-banned stocks over the entire sample period. Averages are calculated over all stocks in Belgium, France, Italy, and Spain that have listed options. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The European short selling ban period (August 12, 2011, to February 16, 2012) is shadowed.
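The percentile statements about daily IV skew changes can be reproduced along the following lines. This is a minimal sketch, assuming the percentile of a change is the share of all daily changes in the sample that lie strictly below it; the function names are hypothetical and the chapter's exact percentile convention is not stated.

```python
def daily_changes(levels):
    """First differences of a daily series of IV skew levels."""
    return [b - a for a, b in zip(levels, levels[1:])]

def percentile_rank(sample, value):
    """Percentage of observations in sample that are strictly below value."""
    return 100.0 * sum(1 for s in sample if s < value) / len(sample)
```

For instance, percentile_rank(daily_changes(skew_series), 2.16) would return the percentile of a 2.16 volatility point increase within a (hypothetical) series skew_series of daily average IV skews.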
We also observe from Figure 2.2 that after the announcement of the short sale ban, the IV
skew levels of both banned and non-banned stocks remained elevated for several weeks. During
the entire ban period, the IV skew of banned stocks remained relatively high, whereas the IV
skew for non-banned stocks slowly declined to pre-ban levels. This persistence in the high level
of implied jump risk indicates that the ban did not diminish market participants’ concerns
regarding European financial stocks.
Table 2.3, Panel B presents the corresponding medians for the whole period and for the five
sub-periods separately. We observe that the sub-periods 1, 3, and 4 have the highest IV skews.
They also roughly match the periods of market turmoil and volatility humps highlighted in
Figure 2.2: the global financial crisis, the European sovereign debt crisis, and the 2011 European
ban period. The median IV skew for banned stocks during the ban period is 7.34, significantly
higher than during the pre-ban European crisis, when it was 6.05. For non-banned stocks, the
median IV skew during the ban period is 6.05, only slightly higher than during the European
crisis, when it was 5.78.
Moreover, the median IV skew during the European crisis period is also significantly higher
than during the stock market rally. Figure 2.2 also indicates that the IV skew for banned
stocks exceeds that for non-banned stocks in most periods. We observe similar patterns in the
country-specific data. We find that the short selling ban contributes to an increase in implied
jump risk, especially with respect to banned stocks. Conversely, once the ban is lifted on
February 16, 2012, IV skews drop significantly for both banned and non-banned stocks.
Figure 2.3: Sovereign CDS spreads, V2X and implied volatility skews around the ban date. This figure depicts the five-year sovereign CDS spreads for Belgium, France, Italy, and Spain as well as the V2X and the IV skews for non-banned and banned stocks. Sovereign CDS spreads proxy for country-specific information flow. V2X is the implied volatility index from the EuroStoxx 50 index, and it proxies for market-wide information flow. The V2X series is multiplied by 0.10 to fit the same scale as the IV skew. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The ban announcement on August 11, 2011, is indicated by the vertical line.
Our empirical findings indicate that the short sale ban did not reduce the implied jump risk
during the European financial crisis. Otherwise, VaR levels and IV skews would have receded.
On the contrary, we find significant evidence that VaR levels increased strongly and that IV
skews jumped instantly when the ban was introduced and remained high during the period of
the ban, particularly for banned stocks.
A potential flaw in the empirical analysis so far is that large movements in the IV skew,
observed during the ban or at the time of its announcement, may have been contemporaneous
with the dissemination of other relevant information. If so, we cannot draw a clear connection
between the ban announcement and IV skew behavior. Figure 2.3 indicates that the IV skews
rise on August 11, even though we do not observe negative shocks within country-specific CDS
spreads and the V2X index10.
Figure 2.3 shows that information flow for the four crisis countries was relatively benign
around the ban announcement date, as CDS spread levels remain unchanged, whereas the
V2X even decreases after showing a large spike in the days preceding the ban. Equity market
movements around that period further support the presence of such positive information flow11.
The EuroStoxx 50 index rose by 2.86 percent on August 11 and by 4.15 percent on August 12,
whereas the EuroStoxx Banks index rose by 2.96 and 5.26 percent, respectively. Moreover, no
other major announcement was made during these days. The absence of negative information
strongly suggests that the ban announcement itself catalyzed the rise in implied jump risk.
10 We use CDS spreads to proxy the country-specific information flow. We adopt the V2X, the European counterpart of the VIX (the IV index for S&P500 index options), as a proxy for the European equity market information flow.
11 Figure 2.3 shows that moves in the country-specific IV skew match the sovereign CDS spread behavior in this period very well. CDS spreads moved sideways for Belgium, France, and Italy, while they rose for Spain. This divergence can be explained by the fact that on February 13, 2012, Spain’s sovereign debt rating was downgraded two notches by Moody’s, from A1 to A3, which was much more severe than the rating changes for the other three crisis countries.
2.3.2 Financial contagion risk
In this section, we assess the development of financial contagion risk, using average conditional
co-crash (CCC) probabilities. The average CCC probability measures the likelihood that a
banned (non-banned) stock crashes, given that another banned (non-banned) stock crashes.
We estimate the bivariate CCC probabilities for all pairs of banned and all pairs of non-banned
stocks using realized daily returns. Table 2.4, Panel A presents the results from the estimation
of Eq. (2.B.4) for the full sample and individual sub-samples. Over the full sample, the CCC
probability for the banned stocks is 32 percent, while for the non-banned stocks, it is 29 percent,
which is not significantly different. In the first two sub-periods, the contagion risk of the
banned stocks is similar to that of the other stocks. However, during the
pre-ban European crisis period, we find that contagion risk for banned stocks (42 percent)
becomes significantly higher than that for non-banned stocks (32 percent). Surprisingly, this
substantial difference is no longer observed during the ban period, when the CCC probability
for banned stocks decreases to 32 percent, while it increases to 41 percent for non-banned
stocks. This decrease in contagion risk for banned stocks is one of the major findings of this
chapter. Apparently, the imposition of the ban decreased systemic risk. This effect occurred
despite the increase in forward-looking implied jump risk across the same sample and period.
In a next step, we analyze whether the CCC probabilities for banned and non-banned
stocks are different across samples. We observe that the CCC probabilities for banned stocks
during the U.S. recession (27 percent) and the 2009 equity market rally period (28 percent) are
not significantly different. The same assessment holds for non-banned stocks, where Panel A
indicates CCC probabilities at 26 and 23 percent for these two sub-periods, respectively. The
pre-ban period, however, witnesses an abrupt and statistically significant increase in the CCC
probability for banned (from 28 to 42 percent) and non-banned (from 23 to 32 percent) stocks.
Clearly, contagion risk is higher across the board once the European crisis is triggered, but it
is especially higher for financial stocks. After the ban is announced, banned stocks’ contagion
risk falls from 42 to 32 percent, while for non-banned stocks, contagion risk rises from 32 to 41
percent.
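To convey the intuition behind a conditional co-crash probability, the sketch below computes a naive empirical counterpart: the share of days on which stock A falls below its own crash threshold among the days on which stock B falls below an equally probable threshold. This counting approach is an assumption made for illustration and is not the EVT-based estimator of Eq. (2.B.4); the function names are hypothetical.

```python
import math
from itertools import permutations

def crash_threshold(returns, p):
    """Empirical crash threshold: the k-th smallest return, k = max(1, floor(p * n))."""
    k = max(1, math.floor(p * len(returns) + 1e-9))
    return sorted(returns)[k - 1]

def co_crash_probability(r_a, r_b, p):
    """Share of B's crash days on which A also crashes (same-length daily series)."""
    q_a, q_b = crash_threshold(r_a, p), crash_threshold(r_b, p)
    b_crash_days = [i for i, r in enumerate(r_b) if r <= q_b]
    both = sum(1 for i in b_crash_days if r_a[i] <= q_a)
    return both / len(b_crash_days)

def average_ccc(returns_by_stock, p):
    """Average co-crash probability over all ordered pairs of stocks in a group."""
    pairs = list(permutations(returns_by_stock, 2))
    total = sum(co_crash_probability(returns_by_stock[a], returns_by_stock[b], p)
                for a, b in pairs)
    return total / len(pairs)
```

Calling average_ccc separately on the banned and the non-banned group gives one number per group, analogous in spirit to the averages reported in Table 2.4, Panel A.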
Table 2.4: Extreme downside risk and implied volatility skew
Panel A - Conditional co-crash probabilities
Sample split                                      Conditional co-crash-probabilities
                                                  Non-banned  Banned      NB vs. B
Full sample: 02/15/2008 - 03/27/2012              0.29        0.32*       1.7
US recession: 02/15/2008 - 06/30/2009             0.26        0.27        0.4
2009 stock market rally: 07/01/2009 - 04/26/2010  0.23        0.28        1.6
Pre-ban European crisis: 04/27/2010 - 08/10/2011  0.32*       0.42*       2.2**
Ban period: 08/11/2011 - 02/16/2012               0.41        0.32        -1.3
Post-ban period: 02/17/2012 - 03/27/2012          NA          NA          NA

Panel B - Option trading volumes and put-call ratios
Sample split                                      Put volume              Put-call volume ratio
                                                  Non-banned  Banned      Non-banned  Banned
Full sample: 02/15/2008 - 03/27/2012              1,064       1,690       7.0         3.8
US recession: 02/15/2008 - 06/30/2009             877         1,377       7.1         4.1
2009 stock market rally: 07/01/2009 - 04/26/2010  1,200***    1,747***    7.0         3.4***
Pre-ban European crisis: 04/27/2010 - 08/10/2011  1,157       1,905**     6.3         3.7**
Ban period: 08/11/2011 - 02/16/2012               943***      1,727***    8.7***      3.8
Post-ban period: 02/17/2012 - 03/27/2012          1,245***    2,758***    7.9         5.7***

Panel A shows the average conditional co-crash (CCC) probabilities calculated by Eq. (2.B.4) among all non-banned and all banned stocks during the full sample period, as well as for the five different sub-periods. Asterisks used as superscript to CCC-probabilities denote the outcome of the t-tests specified in Eqs. (3.2.7) and (2.B.3) across different sample periods. The column “NB vs. B” shows the t-stats of the test that compares CCC-probabilities of non-banned and banned stocks, using Eqs. (3.2.7) and (2.B.3). The null hypothesis (H0) is that there is no difference between the CCC-probabilities from non-banned and banned stocks. Panel B shows the median daily trading volume, measured by the number of contracts traded in put options, as well as the median daily put-call volume ratio for all non-banned and banned stocks for the overall sample period and for the five different sub-periods. We apply Mann-Whitney U-tests to investigate whether the medians are statistically different from each other. The null hypothesis (H0) is that there is no difference between the populations of the two samples. In both panels, rejection of H0 is denoted by the asterisks ***, **, and *, at the one, five, and ten percent significance level, respectively. In Panel B, the superscripts are placed in the cell of the second sub-sample that is compared.
Another variable that potentially plays a major role for financial contagion risk is trading
activity. Bollen and Whaley (2004) suggest that the IV skew might be closely linked to trading
activity in the options market. They find that changes in the shape of the IV function are
directly related to net buying pressure on options from end-users’ public order flow. They
argue that end-users trade options for portfolio insurance, agency, and speculative reasons,
rather than for market-making reasons. Garleanu et al. (2009) confirm their findings and
observe that the size of the IV skew is positively and significantly related to demand pressures
from institutional investors seeking portfolio insurance.
We inspect daily put and call trading volumes as well as the put-call volume ratio as proxies
for trading pressure, as suggested by Dennis and Mayhew (2002). We measure volume as the
median number of contracts traded on a specific day for all stocks in the sample. We obtain
an overall put-call volume ratio by averaging the single-stock contracts. Again, we evaluate
these measures over the five periods previously identified in our data set. Table 2.4, Panel B
documents that the median number of single-stock puts traded per day for each banned stock
decreases significantly from 1,905 during the pre-ban European crisis period to 1,727 during the
ban period. For non-banned stocks, the median volume of puts also drops, from 1,157 during
the pre-ban period to 943 during the ban. The median put-call volume ratio for non-banned
stocks significantly increases during the ban, from 6.3 to 8.7, whereas the median put-call
volume ratio for banned stocks hardly changes.
The findings in Panel B provide no evidence that individual stock options, particularly puts,
experienced a large rise in trading activity. Thus, we find no evidence of a substitution effect
of the short selling of common stock into single-stock put options. We also find no evidence
that trading activity completely dried up during the ban period. This result is in line with
Grundy et al. (2012), who show that the overall volume of options trading dropped during the
2008 U.S. short selling ban. This behavior of trading volumes indicates that during the ban,
the IV skew does not increase as a result of increased selling pressure, as originally suggested
by Bollen and Whaley (2004) and Garleanu et al. (2009).
We assume that once short selling activity in banned stocks diminishes, the demand for
synthetic shorts via put options should increase. During the ban, informal market makers in
options (high-frequency traders and hedge funds)12 can no longer delta-hedge by short selling
stocks. Hence, they become less willing to sell protection, significantly impairing the supply of
puts.
As securities-lending programs were in less demand by short sellers during the ban, it
became cheaper to borrow stocks. In unreported results, we find that three common measures
of borrowing costs (the simple average fee, the simple average rebate, and the daily cost of
borrow score) indeed fall for banned stocks, from the date the ban was introduced until the end
of September 2011. Borrowing costs constitute, however, only one component of hedging costs,
and, depending on market circumstances, not necessarily the largest one. Costs incurred by
bid-ask spreads and price impact may easily outpace borrowing costs in times of thin trading
activity. Beber and Pagano (2013) illustrate that the 2008 U.S. ban is associated with an
increase in bid-ask spreads ranging between 1.64 and 1.98 percentage points among international
stocks where the average bid-ask spread is 3.93 percentage points13. Likewise, Battalio and
Schultz (2011) and Grundy et al. (2012) note that bid-ask spreads on options on banned stocks
also rose significantly during the 2008 U.S. short sale ban. In contrast, on August 11, 2011,
the fee for borrowing shares of the Spanish bank Santander was only 51 bps per annum. Therefore,
lower borrowing costs may not have helped much in encouraging market makers to write puts
during the ban.
A final explanation for a smaller supply of puts during the ban is that option sellers became
more risk-sensitive following equity market declines. Garleanu et al. (2009) find that end-users
have a net long-position in equity index options with a corresponding large net position in
OTM puts. Conversely, market makers are short in OTM puts. Following a market decline,
they become more reluctant to write additional puts. This behavior is fully consistent with
the overweighting of small probabilities in the Cumulative Prospect Theory of Tversky
and Kahneman (1992). According to this model, agents making decisions under risk, such as
market makers, tend to perceive tail events as more probable than they are, causing them to
adopt a risk-averse attitude. In the days before the introduction of the European short sale
ban, equity markets strongly corrected on the back of an intensifying European financial crisis;
thus, it is not difficult to envision high risk aversion among market makers during the ban and
a diminished willingness to sell puts. Holders of financial stocks suddenly had to pay much
higher prices to buy protection: three-month 80 percent moneyness OTM puts on financial
stocks became, on average, 16 percent more expensive on August 11 compared to the average
of the previous 21 trading days.
12 Boehmer et al. (2013) note that approximately 50 percent of all options trading is currently supplied by such informal market makers.
13 Sobaci et al. (2014) provide similar results for emerging markets.
On the ban announcement day, the trading volume for puts on the EuroStoxx 50 index
reached 2,573,868, which is the second-highest daily trading volume for this instrument in our
sample14. A potential explanation for such a high trading volume is that after the imposition
of the ban, the skew from stock options relative to index options became too costly. The
spread between the IV skew from the EuroStoxx 50 index put options and single-stock puts,
which is normally highly positive, was just marginally positive during the ban, reaching zero on
December 20, 2011. Because index puts are far more liquid than single-stock puts, a liquidity
premium no longer existed, and a migration from single-stock puts to index puts took place.
Such an explanation is also in line with the “flight-to-liquidity” models suggested by Pastor
and Stambaugh (2003) and Acharya and Pedersen (2005).
2.3.3 Panel regression analysis
To further assess the effect created by the short selling ban and trading activity on IV skews,
we run a panel regression analysis with the IV skew (IVSkew) as the dependent variable.
This regression allows us to isolate the relationship between the IV skew, banned stocks, and
trading activity by controlling for other determinants of the IV skew, such as information flow
and idiosyncratic factors. We use the following firm-specific control variables: daily turnover
(Turnover), systematic risk (Beta), and firm size (Size). We use Turnover as a proxy for stock
liquidity, following Dennis and Mayhew (2002).
We calculate an individual stock’s daily turnover by dividing its daily trading volume by
its number of shares outstanding. The stock’s beta is our control variable for systematic risk.
The market return is assumed to be the equal-weighted average daily return for all stocks in
our sample. The daily estimation of the beta uses a rolling window of one year’s worth of
data, where the data begin one year before the first sample date. Firm size is calculated as the
number of shares outstanding on a specific day multiplied by the stock price.
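The firm-specific control variables can be sketched as follows. The turnover, size, and equal-weighted market return follow the definitions given above; the beta is computed here as the ratio of the sample covariance to the market variance over a rolling window, and the default window of 250 trading days for “one year’s worth of data” is an assumption. All function names are hypothetical.

```python
def turnover(volume, shares_outstanding):
    """Daily turnover: trading volume divided by shares outstanding."""
    return volume / shares_outstanding

def firm_size(shares_outstanding, price):
    """Firm size: shares outstanding times the stock price on that day."""
    return shares_outstanding * price

def equal_weighted_market(returns_by_stock):
    """Equal-weighted average daily return across all stocks (same-length series)."""
    series = list(returns_by_stock.values())
    return [sum(day) / len(day) for day in zip(*series)]

def beta(stock, market):
    """Covariance of stock and market returns divided by the market variance."""
    n = len(market)
    mean_s, mean_m = sum(stock) / n, sum(market) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var

def rolling_beta(stock, market, window=250):
    """Beta re-estimated each day over the preceding `window` observations."""
    return [beta(stock[t - window:t], market[t - window:t])
            for t in range(window, len(market) + 1)]
```

The first element of rolling_beta uses the first `window` observations, which mirrors starting the data one year before the first sample date.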
Control variables are uncorrelated with each other in both the cross-sectional and the time-
series dimension (unreported here). We employ de-trended levels of sovereign CDS spreads for
Belgium, France, Italy, and Spain (CCDS) and the V2X volatility index (V2X) as control
variables for country-specific and equity market information flows15. Additionally, we proxy
firm-specific information flows with daily stock returns (R), trading pressure via single-stock
put option trading volume (PVlme), and trading volume of puts on the EuroStoxx 50 index
(E50PVlme)16. Our resulting Model 2.1 is given as follows:
14The heaviest trading in EuroStoxx 50 index puts took place on October 10, 2008, when the Belgian bankDexia was bailed out and 2,604,185 contracts were traded.
15Based on the Johansen cointegration test, we find no cointegration between the de-trended CDS spreads ofthe four crisis countries and the V2X index at the five-percent significance level.
16Single-stock put option trading volume is computed as the average daily trading volume of puts divided by1,000. Put trading volume is not used as an additional cross-sectional variable because data are only availablefor a limited set of stocks (122 of 186). E50PVlme is the daily trading volume of puts on the EuroStoxx 50
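\[
\begin{aligned}
\mathit{IVSkew}_{i,t} ={}& c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
&+ D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.1}
\]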
where $D^{B}_{t}$ is a dummy variable equal to one if the date is within the ban period (August 11,
2011, to February 16, 2012), and zero otherwise. $D^{Bned}_{t}$ is a dummy variable equal to one
if the underlying stock is a banned stock, and zero otherwise. $D^{PostB}_{t}$ is a dummy variable
equal to one if the date is after the lifting of the ban (from February 17, 2012, onwards), and
zero otherwise. The interaction term $D^{B}_{t} \ast D^{Bned}_{t}$ combines the ban-period and
banned-stock dummies. This variable captures the effect on the IV skew when two conditions
hold: the stock is banned and the ban is in place. Analogously, $D^{PostB}_{t} \ast D^{Bned}_{t}$
flags banned stocks after the ban was lifted.
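The construction of the ban-related dummies can be sketched as follows. This is an illustrative sketch only: the function name and the dictionary keys are hypothetical labels, while the dates are the ones stated in the text.

```python
from datetime import date

BAN_START = date(2011, 8, 11)  # first day of the ban period
BAN_END = date(2012, 2, 16)    # last day of the ban period


def ban_dummies(day, is_banned_stock):
    """Return the five ban-related regressors of Model 2.1 for one stock-day."""
    d_ban = int(BAN_START <= day <= BAN_END)   # ban in place on this date
    d_post = int(day > BAN_END)                # date after the ban was lifted
    d_banned = int(is_banned_stock)            # stock subject to the ban
    return {
        "D_B": d_ban,
        "D_Bned": d_banned,
        "D_B_x_Bned": d_ban * d_banned,
        "D_PostB": d_post,
        "D_PostB_x_Bned": d_post * d_banned,
    }
```

For example, `ban_dummies(date(2011, 9, 1), True)` switches on the ban-period dummy, the banned-stock dummy, and their interaction, while both post-ban terms remain zero.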
We use generalized least squares (GLS) to account for potential serial correlation in the
residuals. We estimate our panel regression over three different periods: (a) the full period,
ranging from February 15, 2008, to March 27, 2012; (b) the period that starts on April 27,
2010, when the European sovereign crisis is deemed to have begun, to March 27, 2012; and
(c) the ban period, ranging from August 11, 2011, to February 16, 2012. Table 2.5, Panel
A reports the regression results. Over the full sample period (column a), all coefficients are
statistically significant at the one-percent level, except for the dummy variable $D^{B}_{t}$ and the
post-ban dummy variables. The results for V2X, country CDS spreads, Beta, and Turnover
are in line with the results reported in the literature and with our expectations. We expected
V2X and CDS spreads to be positively related to the IV skew, as implied jump risk priced
for individual stocks is likely to increase with equity market volatility and country credit risk.
Contrary to our expectations, stock returns and size are positively related to the IV skew.
Nevertheless, our size-skew estimates are in line with the results reported in Engle and Mistry
(2008). They suggest that size proxies for beta, warranting a positive relationship between size
and skew.
The results obtained from our dummy variables over the full sample period confirm that
the ban positively affected the IV skew for banned stocks: $D^{B}_{t} \ast D^{Bned}_{t}$ has a positive sign and
is statistically significant. The coefficient of this interaction dummy indicates that the
ban increases IV skews for banned stocks by 0.3 volatility points, which is economically relevant
because it amounts to approximately five percent of the median IV skew in our data set. This
is a strong result, given the large set of control variables used. This finding suggests that the
IV skew for banned stocks during the European short sale ban was abnormally high compared
to that for non-banned stocks and that for banned stocks in other periods. Furthermore, the
estimated coefficient of $D^{Bned}_{t}$ implies that financial stocks have IV skews that are, on average,
0.72 volatility points higher than IV skews for non-banned stocks. This finding is consistent
with our descriptive statistics provided in Table 2.2. The three dummy estimates confirm that
the IV skew for all stocks was higher during the ban, and the effect was more pronounced for
banned stocks.
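To make the role of the interaction term concrete, the sketch below plugs the point estimates from Table 2.5, Panel A, column (a) into the dummy part of Model 2.1 and computes a difference-in-differences contrast. It is an illustration under stated assumptions: the helper name is hypothetical, the point estimates are taken at face value (the ban-period coefficient of -0.180 is not statistically significant), and all other regressors are held fixed.

```python
# Point estimates from Table 2.5, Panel A, column (a).
COEF = {"D_B": -0.180, "D_Bned": 0.719, "D_B_x_Bned": 0.306}


def dummy_contribution(in_ban, banned, coef=COEF):
    """Sum of the dummy terms for a stock-day up to the end of the ban."""
    return (coef["D_B"] * in_ban
            + coef["D_Bned"] * banned
            + coef["D_B_x_Bned"] * in_ban * banned)


# Change for banned stocks when the ban starts, minus the same change
# for non-banned stocks: this isolates the interaction coefficient.
did = ((dummy_contribution(1, 1) - dummy_contribution(0, 1))
       - (dummy_contribution(1, 0) - dummy_contribution(0, 0)))
```

The contrast `did` reproduces the interaction coefficient, which is why that coefficient is read as the ban effect specific to banned stocks.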
index divided by 1,000,000 and is used to capture the potential indirect substitution effect of trading pressure on single-stocks’ puts by index puts.
Column b of Panel A shows that during the euro crisis period, all parameter estimates for
control variables have the same signs and comparable statistical significance levels as
those obtained when estimating Model 2.1 over the full period (column a). The impact of the
ban on the IV skews of banned stocks is even stronger, though. On average, the ban increases
the IV skew for banned stocks by 0.45 volatility points, which amounts to roughly nine percent
of the median IV skew across our data set. At the same time, for the average stock, the IV
skew decreases by 1.1 volatility points during the ban. These results support our hypothesis
that investors seem to have differentiated between banned and non-banned stocks upon the
introduction of the ban.
Table 2.5: Panel regression results
                               Panel A                             Panel B                             Panel C
                               (a) Full    (b) Euro    (c) Ban     (a) Full    (b) Euro    (c) Ban     (a) Full    (a) Full sple.
                               sample      crisis      period      sample      crisis      period      sample      Default Prob.
Intercept                      2.823***    2.198***    -0.283      2.951***    2.214***    -0.264      2.680***    1.084***
                               (0.155)     (0.329)     (0.344)     (0.052)     (0.330)     (0.346)     (0.156)     (0.068)
V2X                            0.053***    0.094***    0.035***    0.050***    0.096***    0.029***    0.073***    0.031***
                               (0.004)     (0.011)     (0.009)     (0.001)     (0.011)     (0.009)     (0.004)     (0.002)
Country CDS                    0.006***    0.003***    0.008***    0.007***    0.003***    0.009***    0.003***    0.008***
                               (0.001)     (0.001)     (0.001)     (0.000)     (0.001)     (0.001)     (0.001)     (0.000)
Stock returns                  5.509***    9.663***    7.055***    5.843***    9.693***    6.997***    4.458***    -0.203
                               (0.921)     (1.710)     (1.385)     (0.360)     (1.709)     (1.385)     (0.960)     (0.417)
Stock turnover                 -19.317***  -22.535***  -25.602***  -18.916***  -22.178***  -26.247***  4.749**     67.677***
                               (1.834)     (2.468)     (3.313)     (1.376)     (2.466)     (3.318)     (2.318)     (1.964)
Stock size                     0.048***    0.066***    0.109***    0.050***    0.067***    0.109***    0.030***    -0.035***
                               (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)
Stock Beta                     1.015***    1.710***    3.702***    1.037***    1.712***    3.701***    1.482***    1.368***
                               (0.036)     (0.060)     (0.097)     (0.028)     (0.060)     (0.097)     (0.046)     (0.038)
Dummy Ban Period               -0.180      -1.104***               0.676***    0.358***                1.195***    1.052***
                               (0.118)     (0.195)                 (0.028)     (0.064)                 (0.116)     (0.048)
Dummy Stock Banned             0.719***    0.357***    0.037       -0.305***   -1.123***   0.030       0.475***    -0.692***
                               (0.035)     (0.064)     (0.049)     (0.035)     (0.196)     (0.049)     (0.041)     (0.039)
Dummy Ban Period*Stock Banned  0.306***    0.451***                0.327***    0.474***                0.311***    1.796***
                               (0.098)     (0.119)                 (0.069)     (0.119)                 (0.110)     (0.075)
Dummy Post Ban                 -0.201      -0.695***               -0.293***   -0.704***               1.025***    0.897***
                               (0.200)     (0.224)                 (0.059)     (0.224)                 (0.209)     (0.090)
Dummy Post Ban*Stock Banned    0.280       0.389*                  0.314**     0.416*                  0.491**     0.978***
                               (0.207)     (0.230)                 (0.138)     (0.229)                 (0.247)     (0.154)
Overall put volume             0.305***    0.185       -0.297**    0.274***    0.182       -0.333**    0.442***    0.111***
                               (0.074)     (0.147)     (0.144)     (0.020)     (0.147)     (0.146)     (0.073)     (0.027)
EuroStoxx50 put volume         -0.723***   -1.358***   0.397**     -0.691***   -1.372***   0.428**     -1.114***   -0.541***
                               (0.119)     (0.220)     (0.189)     (0.033)     (0.220)     (0.191)     (0.119)     (0.045)
IV spread                                                          0.062       -0.619      3.304**
                                                                   (0.106)     (0.727)     (1.509)
Default Probability                                                                                    -12.356***
                                                                                                       (0.498)
R2                             0.1046      0.1289      0.2757      0.1076      0.1308      0.2765      0.1452      0.2478
# Obs (Unbalanced Panel)       146201      73327       21298       142048      73327       21298       74622       84095
Panel A reports the panel regression results for Model 2.1. Panel B reports the panel regression results for Model 2.2. Panel C reports the panel regression results for Models 2.3 and 2.4. The dependent variable for Models 2.1, 2.2 and 2.3 is the IV skew. Model 2.4 uses probability of default from single-name CDS (PD) as the dependent variable. We distinguish three different periods: (a) full sample (from February 15, 2008 to March 27, 2012), (b) euro crisis (from April 27, 2010 to March 27, 2012), and (c) ban period (from August 11, 2011 to February 16, 2012). The single stock IV skew is the dependent variable and information flow (Country CDS and V2X), firm-specific control variables (Return, Turnover, Size, Beta), trading volume on single put options (Put volume) and on index options (EuroStoxx50 put volume), a proxy for supply shift on option markets (IV spread), firms’ probability of default from single-name CDS market (PD) and dummies are the explanatory variables. The intercept is estimated as common to all cross-sections and no weighting is used in the cross-sections for estimation. Residuals are not normal for most cross-sections. We apply White-Heteroskedasticity consistent standard error and covariance estimates. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
Empirical results change more strongly when we estimate Model 2.1 for the 2011 European
ban period. Column c shows that all control variables still have the same signs and the results
are strongly statistically significant; however, the estimate of $D^{Bned}_{t}$ is no longer statistically
significant. Because this sub-sample covers only the ban period, the dummies $D^{B}_{t}$,
$D^{PostB}_{t}$, $D^{B}_{t} \ast D^{Bned}_{t}$, and $D^{PostB}_{t} \ast D^{Bned}_{t}$ are no longer
applicable. The insignificance of $D^{Bned}_{t}$ suggests that, within the ban period,
financial stocks are no longer associated with higher IV skews relative to the average stock.
The lack of significance of $D^{Bned}_{t}$ is, however, connected to its cross-correlation with beta: during
the ban, the beta of financial stocks is 1.30 relative to 0.90 for non-banned stocks. Additionally,
PVlme becomes negatively and significantly related to IVSkew. Thus, a rise in the skew during
the ban period is associated with a lower volume of single-stock puts. This result is consistent
with our hypothesis that a supply shift drove the IV skew during the ban rather than a change
in demand. In such a setting, large upward movements in the skew could have been caused by
low trading volumes in OTM puts. At the same time, the link between E50PVlme and IVSkew
turns positive and significant. This relation is explained by the above-noted increase in the
volume of index puts traded, in parallel with the supply-led rise in the IV skew during the ban.
A ban may be considered ineffective when selling pressure migrates from banned securities
to alternative instruments. However, in the case of the 2011 European short sale ban, we
observe that the migration of selling pressure from financial stocks to put options on European
indices has not jeopardized the efficacy of the short sale ban. As a result of the migration,
the ban appears to have diverted selling pressure initially concentrated in financial stocks to
a larger share of the market. This hypothesis is consistent with the fact that contagion risk
decreased for banned stocks during the ban but increased for non-banned stocks.
When the short sale ban was introduced on August 11, 2011, any further selling pressure
on financial stocks could have led to destabilizing shocks and financial contagion. The price of
OTM puts on banned stocks rose as a result of lower trading volume rather than through a
substitution effect. The richness of OTM puts made it substantially more expensive for market
participants to take a synthetic short position. Hence, imposition of the ban likely helped to
curb downward price pressure, which benefited financial sector stability.
In a next step, we analyze whether a supply shift in the options market was an important
driver of IV skews during the ban. As market makers use their bid-ask quotes for inventory
management, spread measures from options markets indicate whether a supply shift occurred
around the ban or not. We calculate two bid-ask spread-based measures for put options: (i)
percentage spread, i.e., (ask-mid)/mid, and (ii) IV spread, i.e., δIV = δC/Vega, to evaluate the
impact on IV caused by market makers’ inventory management17. Percentage spread represents
the percentage of the put mid-price that market makers charge to supply an option. IV spread
represents the translation of percentage spread into volatility points, i.e., how many volatility
points market makers charge to supply an option. We evaluate the behavior of percentage
spread and IV spread in the full sample and in our sub-samples by calculating median spreads
of these two metrics across put options on the 28 stocks in our sample that belong to the
EuroStoxx 50 index18. The results are shown in Table 2.6.
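The two spread measures and their median across options can be sketched as below. This is a minimal illustration rather than the original implementation: the function names and the tuple layout of the quotes are hypothetical, and Vega is assumed to be expressed as the price change per one volatility point so that the IV spread comes out in volatility points.

```python
from statistics import median


def percentage_spread(ask, mid):
    """Share of the mid-price charged by market makers: (ask - mid) / mid."""
    return (ask - mid) / mid


def iv_spread(ask, mid, vega):
    """Half-spread translated into volatility points: (ask - mid) / Vega."""
    return (ask - mid) / vega


def median_spreads(quotes):
    """Median of both measures over a list of (ask, mid, vega) tuples."""
    pct = median(percentage_spread(a, m) for a, m, _ in quotes)
    ivs = median(iv_spread(a, m, v) for a, m, v in quotes)
    return pct, ivs
```

With an ask of 1.10, a mid-price of 1.00 and a Vega of 0.50, the percentage spread is about 0.10 and the IV spread about 0.20 volatility points.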
17 IV spread uses options’ Greek Vega, i.e., δC/δIV, to obtain δIV, where δC is the difference between ask and mid-prices. Option prices and Vegas are from ATM options and are obtained from Bloomberg and Datastream. Although the IV spread measure only estimates the impact on IV caused by changes in spreads from ATM options, we assume that such an increase in spread is also indicative of a supply shift on OTM options and, consequently, on the IV skew. This assumption is conservative, as bid-ask spreads of OTM options are typically higher than those of ATM options due to the lower liquidity of OTM options.
18 From the full Eurostoxx 50 index sample, we discard those stocks for which the required options data are not available.
Table 2.6: Robustness checks
Sample split                                         Percentage spread         IV spread
                                                     Non-banned   Banned       Non-banned   Banned
Full sample (02/15/2008 to 03/27/2012)               0.088        0.069        0.057        0.072
U.S. recession (02/15/2008 to 06/30/2009)            0.08         0.06         0.06         0.041
2009 stock market rally (07/01/2009 to 04/26/2010)   0.081        0.058        0.050***     0.039***
Pre-ban European crisis (04/27/2010 to 08/10/2011)   0.099***     0.076***     0.075**      0.082***
Ban period (08/11/2011 to 02/16/2012)                0.090***     0.089***     0.025***     0.149***
Post-ban period (02/17/2012 to 03/27/2012)           0.072***     0.083**      0.018***     0.166
This table shows the median daily Percentage spread measure as well as the median IV spread measure for non-banned and banned stocks that belong to the EuroStoxx 50 index for the overall sample period and for the five different sub periods. The Percentage spread is defined as (ask-mid)/mid, where ask is the asking price of an ATM option, and mid is the mid-price of an ATM option. This metric represents the percentage of the mid-price that is charged by market makers to sell an option. The IV spread is defined as δIV = δC/Vega, where δC is the difference between ask- and mid-prices, i.e., the spread, and Vega is obtained for ATM options. We apply Mann-Whitney U-tests to assess whether the medians are statistically different from each other. The null hypothesis (H0) is that there is no difference between the populations of the two samples. Rejection of the (H0) is denoted by the asterisks ***, **, and *, indicating significance at the one, five, and ten percent level, respectively.
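For readers unfamiliar with the Mann-Whitney U-test mentioned in the table notes, a minimal sketch follows. It is an assumption-laden illustration, not the procedure behind Table 2.6: the function name is hypothetical, the p-value uses the large-sample normal approximation without tie or continuity corrections, and which pairs of samples were compared is not specified here.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic of sample x, approximate two-sided p-value)."""
    # Pool both samples, remembering which sample each value came from.
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of the 1-based ranks i+1..j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p_value
```

A small p-value leads to rejection of the null hypothesis that the two samples come from the same population.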
The IV spread metric behaves in line with percentage spread. The IV spread is relatively low
during both the U.S. recession and the 2009 stock market rally, with the latter period reporting
statistically significantly lower spreads than the former. The pre-ban period experiences a sudden
and statistically significant rise in IV spread, from 0.050 to 0.075, for non-banned stocks and
from 0.039 to 0.082 for financial stocks. The IV spread continues to rise during the ban period
for banned stocks, from 0.082 to 0.149. In contrast, it falls by two-thirds for non-banned stocks,
from 0.075 to 0.025. During the post-ban period, IV spread continues to rise for banned stocks,
whereas it falls for non-banned stocks19. These results from our two spread measures confirm
that during the ban, market makers widened their spreads for options on financial stocks, while
no such supply shift seems to have occurred for options on the other stocks.
To formally test the overall impact of options bid-ask spread on IV skew, we specify our
Model 2.2, which comprises Model 2.1 with the addition of the IV spread as an explanatory
variable, as follows:
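\[
\begin{aligned}
\mathit{IVSkew}_{i,t} ={}& c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
&+ D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \mathit{IVspread}_{t} + \varepsilon_{t}.
\end{aligned}
\tag{2.2}
\]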
Table 2.5, Panel B presents the estimates of Model 2.2. We observe that IV spread has a
statistically significant (positive) relation with IV skews during the ban period but not during
the other two periods. During the ban period, on average, a one volatility point increase in
IV spread is linked to a 3.30 volatility point increase in IV skew. Within the full sample and
during the euro crisis period, however, rises in IV spread have no statistically significant
impact on IV skew. Most explanatory variables in Model 2.2 have the same signs and similar
significance levels as observed in the estimation of Model 2.1. This is always the case for the
joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$.
More intuitively, Figure 2.4 shows the jump in IV skews around the ban announcement day
and a coincident large spike in IV spread.
19 The rise in IV spread during the post-ban period is mainly caused by Spain, which matches the behavior of Spanish stocks’ IV skew and sovereign CDS in such periods. This rise is likely caused by the country’s sovereign debt rating downgrade by Moody’s on February 13, 2012.
Figure 2.4: Implied volatility skews and IV spread around the ban date. This figure depicts the average IV skews for banned and non-banned stocks as well as the average IV spread for the 28 stocks in our sample that belong to the EuroStoxx 50 index from July 11, 2011, to December 30, 2011. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The ban announcement on August 11, 2011, indicated by the vertical line, coincides with a large spike in IV spread and with large increases in the IV skew for banned and for non-banned stocks.
We see that on August 11, 2011, the banned stocks’ average IV skew rose by 2.16 volatility
points, whereas for our sample of 28 stocks, this increase was 1.77 volatility points. Of these
1.77 volatility points, a rise of 0.32 volatility points in IV skew (approximately 18 percent) was
caused by a widening of the bid-ask spread, as suggested by the IV spread variable shown in
Figure 2.4. Due to the conservative nature of this variable, which is based on ATM options
rather than on OTM options, such an impact on IV skew coming from bid-ask spreads is
material. These findings reinforce our view that IV skews have risen due to a supply shift
among market makers and other options providers, rather than further selling pressure on
financial stocks via options.
In a final step, we incorporate information from the fixed income market into our panel
regression analysis. We use the probability of default, following Hull et al. (2005), who build
on the Merton (1974) credit risk model. Hull et al. (2005) find that implied volatility skews
from single-stock options are linked to the firms’ default risk. We specify the probability of
default both as a stock-specific information flow proxy (Model 2.3) and to replace the dependent
variable in Model 2.1, which leads to Model 2.4. Models 2.3 and 2.4 are estimated for the full
sample, ranging from February 15, 2008, to March 27, 2012. We use the same GLS panel
regression approach as in Model 2.1, with the following specifications:
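\[
\begin{aligned}
\mathit{IVSkew}_{i,t} ={}& c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
&+ D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \mathit{PD}_{i,t} + \varepsilon_{t},
\end{aligned}
\tag{2.3}
\]
and
\[
\begin{aligned}
\mathit{PD}_{i,t} ={}& c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
&+ D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.4}
\]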
where $PD_{i,t}$ is the probability of default20 implied by the 5-year CDS for firm i at time t.
Because CDS data are not available for all firms in our sample, the number of cross-sections
used in Model 2.3 and Model 2.4 equals 83. Table 2.5, Panel C reports the regression results
for both models.
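To give a sense of how a CDS spread maps into a default probability, the sketch below uses the flat hazard rate approximation (often called the credit triangle). This is a deliberately simplified stand-in and not the ISDA standard model used in this chapter. The function name, the quotation of the spread in basis points, and the reading of the probability as cumulative over the five-year horizon are assumptions for the example. The 40 percent recovery ratio is the one stated in the chapter.

```python
import math


def default_probability(spread_bps, horizon_years=5.0, recovery=0.40):
    """Approximate cumulative default probability implied by a CDS spread.

    Assumption: constant hazard rate = spread / (1 - recovery).
    """
    hazard = (spread_bps / 10_000) / (1.0 - recovery)
    return 1.0 - math.exp(-hazard * horizon_years)
```

Under these assumptions, a spread of 120 basis points implies a hazard rate of two percent per year and a five-year default probability of roughly 9.5 percent.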
The second-to-last column in Panel C shows that the Model 2.3 estimates for the full period
are consistent with the Model 2.1 estimates (column a) for the joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$, with
a significant coefficient of 0.3. The dummy $D^{Bned}_{t}$ has the same sign and statistical significance
as in Model 2.1. The coefficient of the $PD_{i,t}$ variable is negative and strongly statistically
significant. This negative link supports our hypothesis that the ban itself is responsible for an
increase in implied jump risk for banned stocks.
The results in Panel C show that the risk premium priced in CDS default probabilities does
not increase during the ban, as was observed for the IV skews in Table 2.3. We argue
that the implied jump risk rose due to an increase in physical jump risk, not to an increase
in the risk premium. This result seems to be in line with our conclusion that the increase in
implied jump risk during the ban was due to a supply shift instead of further selling pressure or
increased risk premium required by investors to hold financial stocks. The explanatory power
for Model 2.3 (14.5 percent) is higher than for Model 2.1 (10.5 percent), indicating that $PD_{i,t}$
is a powerful variable in explaining the dynamics of jump risk.
The last column of Panel C reports the estimates for Model 2.4, where $PD_{i,t}$ is the dependent
variable. We see that the ban period has increased the probability of default for banned stocks
by 1.8 percent, on average, as evidenced by the coefficient of the joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$. The
coefficient of the dummy $D^{B}_{t}$ is also positive, meaning that the probability of default rose
across the board once the ban was introduced. These findings suggest that the ban has negatively
impacted fixed income markets because the probability of default increased slightly after the
introduction of the ban.
2.3.4 Robustness Tests
In Model 2.1, we observe shifts in the signs of PVlme and E50PVlme across different sample
periods. Hence, we now run an additional GLS panel regression as a robustness check to control
for any influence of the short sale ban. We estimate a reduced form of Model 2.1 that excludes
all dummies related to the ban, while using only pre-ban data. Thus, our Model 2.5 is specified
as follows:
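\[
\begin{aligned}
\mathit{IVSkew}_{i,t} ={}& c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + \mathit{PVlme}_{t} \\
&+ \mathit{E50PVlme}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.5}
\]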
where the variables are defined as in Model 2.1. We estimate Model 2.5 for (a) the entire
pre-ban period (from February 15, 2008, to August 10, 2011); (b) the U.S. recession period
20 Probability of default implied by CDS spreads is calculated using the ISDA standard model. The recovery ratio is 40 percent.
(from February 15, 2008, to June 30, 2009); (c) the stock market rally period (from July 1,
2009, to April 26, 2010); and (d) the European sovereign crisis until the last trading day before the
ban was implemented (from April 27, 2010, to August 10, 2011). Panel A of Table 2.7 presents
the regression results of Model 2.5.
Table 2.7: Robustness checks
                           Panel A - Model 2.5                                 Panel B - Model 2.1                    Panel C - Model 2.1       Panel D - Model 2.1
                           All          US           Market       Euro         All          Euro         Ban          Yan (2011)   IVSkew       All
                           (pre-ban)    recession    rally        Crisis                    Crisis                    95 minus 105 adjusted for
                           Vol skew     Vol skew     Vol skew     Vol skew                                                         ask-prices   Vol skew
Intercept                  3.213***     2.196***     6.035***     2.429***     2.893***     1.981***     -0.728**     0.197**      4.016***     2.574***
                           (0.170)      (0.102)      (0.281)      (0.473)      (0.164)      (0.353)      (0.335)      (0.100)      (0.190)      (0.149)
V2X                        0.052***     0.052***     -0.022**     0.125***     0.050***     0.103***     0.037***     0.013***     0.076***     0.065***
                           (0.005)      (0.002)      (0.010)      (0.017)      (0.004)      (0.012)      (0.009)      (0.003)      (0.005)      (0.004)
Country CDS spread         0.005***     0.025***     0.024***     0.000        0.006***     0.003***     0.009***     -0.003***    0.000
                           (0.001)      (0.001)      (0.001)      (0.002)      (0.001)      (0.001)      (0.001)      (0.000)      (0.001)
Country Bond spread                                                                                                                             -0.130***
                                                                                                                                                (0.034)
Stock returns              4.861***     4.085***     3.569***     11.958***    5.383***     9.735***     7.090***     -0.983       3.899***     5.346***
                           (1.060)      (0.667)      (1.234)      (2.927)      (0.995)      (1.832)      (1.374)      (0.670)      (1.347)      (0.934)
Stock turnover             -22.602***   -18.110***   -5.120       -30.131***   -19.694***   -23.984***   -27.241***   -10.699***   -39.059***   -17.274***
                           (2.200)      (3.245)      (4.332)      (3.727)      (1.890)      (2.572)      (3.330)      (2.585)      (4.812)      (1.907)
Stock size                 0.042***     0.034***     0.038***     0.054***     0.049***     0.069***     0.114***     0.011***     0.008***     0.048***
                           (0.001)      (0.001)      (0.001)      (0.001)      (0.001)      (0.001)      (0.001)      (0.000)      (0.001)      (0.001)
Stock Beta                 0.939***     0.944***     0.383***     1.262***     1.002***     1.739***     3.893***     1.012***     0.955***     0.990***
                           (0.031)      (0.046)      (0.037)      (0.052)      (0.040)      (0.068)      (0.099)      (0.033)      (0.063)      (0.036)
Dummy Ban Period                                                               -0.184       -1.217***                 0.304***     1.334***     0.399***
                                                                               (0.124)      (0.207)                   (0.071)      (0.066)      (0.117)
Dummy Stock Banned                                                             0.712***     0.339***     0.087        0.418***     1.987***     0.745***
                                                                               (0.036)      (0.068)      (0.056)      (0.023)      (0.143)      (0.034)
Dummy Ban Period*Stock                                                         0.318***     0.497***                  -0.033       0.379**      0.450***
                                                                               (0.106)      (0.128)                   (0.058)      (0.151)      (0.097)
Dummy Post Ban                                                                 -0.277       -0.743***                 -0.835***    1.414***     0.194
                                                                               (0.210)      (0.237)                   (0.117)      (0.261)      (0.203)
Dummy Post Ban*Stock                                                           0.438**      0.578**                   0.479***     0.392        0.343
                                                                               (0.223)      (0.244)                   (0.119)      (0.314)      (0.210)
Overall put volume         0.364***     0.052        -0.016       0.370*       0.341***     0.236        -0.274*      0.490***     0.396***     0.273***
                           (0.081)      (0.054)      (0.056)      (0.201)      (0.077)      (0.156)      (0.140)      (0.047)      (0.087)      (0.074)
EuroStoxx50 put volume     -0.901***    -0.252***    -0.165       -2.188***    -0.724***    -1.452***    0.423**      -0.710***    -0.808***    -0.752***
                           (0.132)      (0.082)      (0.164)      (0.308)      (0.124)      (0.233)      (0.184)      (0.077)      (0.144)      (0.120)
R2                         0.0830       0.1412       0.2757       0.0987       0.0994       0.1271       0.2854       0.0315       0.2013       0.1021
Observations               119759       44644        28230        46735        128757       64470        18702        148757       25152        146201
Panel A reports the panel regression results for Model 2.5, using only the pre-ban period data. We distinguish four sub-periods: (a) Full pre-ban period: Feb 15, 2008 to Aug 10, 2011; (b) U.S. recession: Feb 15, 2008 to Jun 30, 2009; (c) Market rally: Jul 1, 2009 to Apr 26, 2010; and (d) Euro crisis: Apr 27, 2010 to Aug 10, 2011. Panel B reports the estimates after removal of the Belgian data from the full sample. Here we distinguish three different periods: (a) Full period: Feb 15, 2008 to Mar 27, 2012; (b) Euro crisis: Apr 27, 2010 to Mar 27, 2012; and (c) Ban period: Aug 11, 2011 to Feb 16, 2012. Panel C reports the regression results for Model 2.1, where the explained variable IV skew is substituted by (a) Yan (2011) 95 minus 105 IV skew measure, and by (b) our proxy for the IV skew measure from ask-prices. Panel D reports the regression results for Model 2.1, where the country CDS spreads are replaced by sovereign spreads versus Germany. The single stock IV skew is the dependent variable and information flow (Country CDS spread, Country Bond Spread and V2X), firm-specific control variables (Return, Turnover, Size, Beta), trading volume on single put options (Put volume) and on index options (EuroStoxx50 put volume), and dummies are the explanatory variables. The intercept is set equal in all cross-sections and no weighting is used. Residuals are not normal for most cross-sections. We report White-Heteroskedasticity consistent standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
The first column of Table 2.7 shows that the Model 2.5 estimates for the pre-ban period
are consistent with the Model 2.1 estimates (Table 2.5, Panel A, column a) for the full sample
period. Hence, we find that increased trading activity in single-stock puts is linked to a high IV
skew of single-stock options. This relation confirms the findings of Bollen and Whaley (2004)
and Garleanu et al. (2009), who link trading pressure to IV skews. Columns b, c, and d of Panel A
show that the estimates of the Model 2.5 parameters during all three sub-periods have the same
signs and statistical significance levels as those obtained over the full pre-ban period (column
a). Hence, the findings remain stable across various time periods and within two different model
specifications.
As an additional robustness check, we analyze whether the IV skews for stocks in other
European countries increased around the date of the short sale ban announcement. Such an
increase could be evidence of financial contagion effects in options markets. If so, the steep rise
in implied jump risk would also be observed in other European countries that did not adopt the
ban and that were vulnerable to or already hit by the financial crisis. European countries that
fit such criteria are Greece, Ireland, and Portugal (Grammatikos and Vermeulen, 2012). We
compile the IV skew data for only Ireland and Greece because Portugal does not have a public
equity options market. In unreported results, we find no indication that implied jump risk for
these stocks materially changes when the short selling ban is introduced. These observations
strengthen our earlier conclusion that the rise in the level of implied jump risk on the day of the
ban announcement is connected to the ban itself, as opposed to other reasons, such as financial
contagion.
In another robustness check, Panel B reports the results of estimating Model 2.1 after
removing the Belgian shares from the sample. A potential justification for excluding the Belgian
data from the analysis is that the Belgian banks in particular experienced relatively heavy
government intervention during the crisis period. This intervention may distort the estimations
for Belgium. However, Panel B of Table 2.7 shows that the results do not materially change
after the removal of the Belgian data.
Our findings are also not altered when we use the IV slope measure of Yan (2011), which is
the IV of close-to-ATM puts minus the IV of calls, as the dependent variable instead of our IV
skew measure (see Panel C). Because our measurement of the IV skew is based on mid-prices,
which may be unaffected by the widening of bid-ask spreads caused by market makers’ response
to the market turmoil, we also test whether our results hold if our IV skew measure comes from
ask prices. We hypothesize that the ask price-based IV skew is biased upwards by the wider than
normal bid-ask spreads during the European sovereign crisis and the ban period. Our analysis
shows that the main findings still hold when we add the IV spread variable to the IV skew on the
left-hand side of Model 2.1. This result, reported in Panel C, indicates that our regressions are
not biased by the use of IV from mid-prices in the construction of our explained variable, the
IV skew.
As a final robustness test of Model 2.1, we substitute CDS spreads with sovereign spreads
from government bonds. The intuition behind this robustness check is that uncovered
positions in sovereign European CDS were also banned on November 1, 2012. We calculate
spreads vis-a-vis Germany, using a maturity of ten years. Panel D indicates that our findings
are affected rather little by this substitution and remain robust.
2.4 Conclusion
Recent research suggests that the short sale bans introduced during the 2008 financial crisis have
reduced market quality around the world, perhaps even to the extent that the bans’ costs
outweighed their benefits (see, for example, Battalio and Schultz, 2011; Grundy et al., 2012; Boehmer
et al., 2013; Beber and Pagano, 2013). Nevertheless, market regulators in Belgium, France,
Italy, and Spain re-introduced a short sale ban on financial stocks in August 2011 to combat
the European financial crisis.
To analyze the effects of the European 2011 short sale ban on financial market stability
and contagion risk, we extracted RNDs and IV skews from single-stock options. Our results
indicate that implied jump risk of banned stocks was higher during the ban period than in
any other period analyzed. We find that on the day of the ban announcement, implied jump
risk levels for both banned and non-banned stocks showed a significant rise. Implied jump
risk tended to increase for banned stocks even more than for non-banned stocks. Furthermore,
during the imposition of the ban, the banned stocks’ average IV skews remained at an elevated
level, whereas this metric dropped for the other stocks. During the ban, the median IV skews
for both the banned and non-banned stocks reached their highest levels when compared to any
other period in the sample. This adverse effect on IV skews seems to occur due to a supply
shift induced by a rise in market makers’ risk aversion, which is consistent with the behavior
predicted by the Cumulative Prospect Theory of Tversky and Kahneman (1992). Regardless of its
cause, our findings show that, even after controlling for information flow and stock-specific
factors, the short sale bans themselves increased implied jump risk, especially for the banned
stocks.
We further document that contagion risk for both banned and non-banned stocks already
increased significantly during the pre-ban period. For non-banned stocks, contagion risk rose
even more upon imposition of the ban. However, we find that contagion risk for banned stocks
decreased during the ban relative to the pre-ban period.
Our approach of using option-implied data to analyze the impact of short sale bans on
financial markets is only a first step. We believe that our knowledge on this topic would benefit
from additional future research. Of particular interest would be the analysis of ban-driven
increases in implied jump risk using the mutually exciting jumps model of Ait-Sahalia et al.
(2015). We hypothesize that a lack of coordination by country regulators in introducing bans
may be undesirable, as shocks in jump risk caused by subsequent bans may cross-excite each
other and lead to financial contagion, which is of great importance to the supervisory policy
agenda.
While we observe that the short sale ban is effective in restricting both outright and synthetic
shorts on banned stocks, we do find evidence of trading migration to the Eurostoxx 50 index
options market. Investors seem to switch from single-stock puts to index puts because of “flight-
to-liquidity” incentives. The selling pressure potentially diverted from the financial stocks to
a larger share of the stock market, thereby reducing the destabilizing effects in the financial
sector.
The question remains whether the 2011 European short selling ban was a cure or a curse. If
the first and foremost goal of imposing a ban is reducing systemic risk, then the 2011 bans do
seem to fulfill this purpose. However, we note that this success comes at a cost, which is that
the implied jump risk increases. Despite the fact that this effect in implied jump risk indicates
market failure and may have adversely influenced market participants’ expectations, it helped
to preserve market stability by reducing contagion risk. Thus, what is the appropriate balance
between market failure and systemic risk? Bans should be avoided if possible, and should only
be used as a last resort when all other means have failed, as government and regulators should
prioritize financial market stability over transitory market failure.
2.A Appendix: Implied jump risk estimation
2.A.1 Implied jump risk from risk-neutral distributions
In this section, we describe how we compute IVs for the groups of banned and non-banned
stocks. The banned group constituents are the stocks that were prohibited from short sales.
The non-banned group constituents are the remaining stocks in our sample. We compute IVs
for banned and non-banned stocks separately by equally averaging IV21 on each moneyness level
available across all stocks belonging to either the banned group or the non-banned group. This
step produces one IV structure across our seven moneyness levels (80, 90, 95, 100, 105, 110 and
120) for both groups for every day in our sample. Then, we apply the Black-Scholes model to
our IV data to obtain options prices for the banned and non-banned groups of stocks. We set the
instantaneous price level of both groups equal to 100, and as a result, the percentage moneyness
level automatically reflects strike prices per group. When applying the Black-Scholes model,
we calculate contemporaneous dividend yield for banned and non-banned stocks by equally
weighting dividend yield from the individual stocks. The risk-free rate applied is the Euribor
three-month maturity.
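The pricing step above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the IV values, the interest rate and the dividend yield are hypothetical, and the dividend is treated as a continuous yield for simplicity.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot: float, strike: float, iv: float, t: float, r: float, q: float) -> float:
    """Black-Scholes call price with a continuous dividend yield q (an assumption)."""
    vol_sqrt_t = iv * math.sqrt(t)
    d1 = (math.log(spot / strike) + (r - q + 0.5 * iv ** 2) * t) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return spot * math.exp(-q * t) * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)


# Hypothetical group-average IVs per moneyness level (illustrative values only).
avg_iv = {80: 0.45, 90: 0.40, 95: 0.38, 100: 0.36, 105: 0.35, 110: 0.34, 120: 0.33}

SPOT = 100.0  # price level normalised to 100, so moneyness equals the strike
call_prices = {
    k: bs_call(SPOT, float(k), iv, t=0.25, r=0.01, q=0.03)
    for k, iv in avg_iv.items()
}
```

Because the price level is set to 100, the dictionary keys serve both as moneyness levels and as strikes.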
Once options prices for the average banned and non-banned stocks are obtained, we can
extract the RND of equity returns using the Breeden and Litzenberger (1978) formulae for the
strikes along the body of our distribution, i.e., from the 80 to 120 moneyness levels:
$$RND(S) = \exp(rT)\,\left.\frac{\partial^2 C(T,K)}{\partial K^2}\right|_{K=S}, \tag{2.A.1}$$
where RND(S) is the risk-neutral probability of observing the terminal index level (S) at time
T , r is the risk-free rate for the specific maturity, K is the strike price, and C is the index
option price. Computing the second derivative of the option price relative to strike prices via
central differences leads to:
$$RND(S) \approx \exp(rT)\,\frac{C(T, S-\Delta K) - 2C(T, S) + C(T, S+\Delta K)}{(\Delta K)^2}. \tag{2.A.2}$$
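As a hedged illustration of Eq. (2.A.2), the sketch below applies the central difference to call prices generated from a flat-volatility Black-Scholes model, for which the true RND is lognormal. The parameter values and the helper names (`bs_call`, `rnd_central_diff`) are assumptions made for this example only.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, vol, t, r):
    """Black-Scholes call (no dividends), used here only to generate test prices."""
    d1 = (math.log(spot / strike) + (r + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)


def rnd_central_diff(call, s, dk, t, r):
    """Eq. (2.A.2): second central difference of call prices, compounded at r."""
    return math.exp(r * t) * (call(s - dk) - 2.0 * call(s) + call(s + dk)) / dk ** 2


spot, vol, t, r = 100.0, 0.30, 0.25, 0.01
dk = 0.5  # strike spacing


def price(k):
    return bs_call(spot, k, vol, t, r)


strikes = [20.0 + dk * i for i in range(461)]  # 20, 20.5, ..., 250
density = [rnd_central_diff(price, k, dk, t, r) for k in strikes]
total_mass = sum(density) * dk  # should be close to one on a wide grid
```

With observed option data the strike grid is much narrower (80 to 120 here), which is why the tails have to be extrapolated separately.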
Following Figlewski (2010), extrapolation beyond the body of the RND22 is performed by
fitting a generalized extreme value (GEV) distribution using two extreme anchor points on
each side of the body of the RND and extending a tail with the same shape23. The GEV-based
21 IV is calculated through reverse engineering the Black-Scholes model, while assuming constant interest rates and discrete dividends. Interpolation is used to calculate the IV at a fixed level of moneyness and at a fixed time to maturity.
22 The Figlewski (2010) method is close to the method used by Bliss and Panigirtzoglou (2004), where body and tails are also extracted separately. These authors use a weighted natural spline algorithm for interpolation, which has the same decreasing noise effect in RNDs. Extrapolation is done by the introduction of pseudo data points, which has the effect of pasting lognormal tails into the RND. One advantage of both approaches is that extrapolation does not result in negative probabilities, which is possible when the spline interpolation is applied. We favor the approach by Figlewski (2010) because the use of the lognormal tails by Bliss and Panigirtzoglou (2004) assumes that the IV is constant beyond the observable strikes, resembling the Black-Scholes model and being largely inconsistent with empirical evidence.
23 Figlewski (2010) argues that interpolation using fourth-order splines is superior to cubic splines because it avoids kinks in the RND. The translation from interpolated IV curve into RND would require taking higher-order derivatives than those used by the construction of the spline.
extrapolation is then used to model the tails of the RND toward the moneyness levels 0 and 200.
We initially use the first and third percentiles of the RND’s body as (outer and inner) anchor
points for the left tail and the 99th and 97th percentiles as (outer and inner) anchor points for
the right tail. We extend the approach of Figlewski (2010) by allowing these anchor points to
change if the fitted GEV curves produce implausible tails, e.g., zero probability under the tails.
Eqs. (2.A.3) and (2.A.4) give, respectively, the GEV’s cumulative distribution function and
probability distribution function:
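$$F_{GEV}(S_T) = \exp\left[-\left(1 + \omega\,\frac{S_T - \mu}{\sigma}\right)^{-1/\omega}\right], \tag{2.A.3}$$
and
$$f_{GEV}(S_T) = \frac{1}{\sigma}\left[1 + \omega\,\frac{S_T - \mu}{\sigma}\right]^{(-1/\omega)-1}\exp\left[-\left(1 + \omega\,\frac{S_T - \mu}{\sigma}\right)^{-1/\omega}\right], \tag{2.A.4}$$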
where ω > 0 sets a fat tail relative to the normal, ω = 0 sets a normal tail, and ω < 0 sets
a distribution tail that is thinner than the normal. The µ and σ are location and dispersion
parameters. Because fitting GEV curves entails setting these three parameters, Figlewski (2010)
also imposes three conditions on the tail: i) that the total probability in the tail of the body
(up to the inner anchor point) is the same for the RND and the GEV approximation, ii) that
the shape of the RND equals the shape of the GEV curve in the inner anchor point, and iii)
that the shape of the RND equals the shape of the GEV curve in the outer anchor point. We
refer to Appendix 2.A.2 below for more details.
Once the body and tails of the RND for terminal index levels are obtained for banned and
non-banned stocks, we convert them into return RNDs by calculating log-returns relative to the
starting index level S0. Finally, we compute probabilities for every percentage return quantile
of the PDF via linear interpolation, which are normalized to integrate to one.
2.A.2 The Figlewski (2010) approach for extracting RND from implied volatilities
In this section, we describe the Figlewski (2010) approach and how we apply it to our sample.
In the Figlewski (2010) method, the following three conditions are imposed for the right tail:
Condition 1: $F_{GEV}(X(\alpha_{innerR})) = \alpha_{innerR}$
Condition 2: $f_{GEV}(X(\alpha_{innerR})) = f_{body}(X(\alpha_{innerR}))$
Condition 3: $f_{GEV}(X(\alpha_{outerR})) = f_{body}(X(\alpha_{outerR}))$
where $X(\alpha_{innerR})$ represents the exercise price corresponding to the α-quantile of the RND used
as the inner anchor point in the right tail, whereas $X(\alpha_{outerR})$ denotes the same but for the
outer anchor point in the right tail. For the left tail, these conditions are modified to:
Condition 1: $F_{GEV}(-X(\alpha_{innerL})) = 1 - \alpha_{innerL}$
Condition 2: $f_{GEV}(-X(\alpha_{innerL})) = f_{body}(X(\alpha_{innerL}))$
Condition 3: $f_{GEV}(-X(\alpha_{outerL})) = f_{body}(X(\alpha_{outerL}))$
We fit the GEV curves by implementing the following optimization:
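$$GEV(\omega, \mu, \sigma) = \arg\min\,(y), \tag{2.A.5}$$
where the objective function $y_R$ for the right tail following the three conditions above is:
$$y_R = \left[F_{GEV}(X(\alpha_{innerR})) - \alpha_{innerR}\right]^2 + \left[f_{GEV}(X(\alpha_{innerR})) - f_{body}(X(\alpha_{innerR}))\right]^2 + \left[f_{GEV}(X(\alpha_{outerR})) - f_{body}(X(\alpha_{outerR}))\right]^2, \tag{2.A.6}$$
whereas, for the left tail, the objective function $y_L$ is:
$$y_L = \left[F_{GEV}(-X(\alpha_{innerL})) - (1 - \alpha_{innerL})\right]^2 + \left[f_{GEV}(-X(\alpha_{innerL})) - f_{body}(X(\alpha_{innerL}))\right]^2 + \left[f_{GEV}(-X(\alpha_{outerL})) - f_{body}(X(\alpha_{outerL}))\right]^2. \tag{2.A.7}$$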
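A minimal sketch of the right-tail objective in Eq. (2.A.6) is given below, assuming ω ≠ 0. The anchor points and parameter values are hypothetical: the "body" values are generated from a known GEV so that the objective can be checked to vanish at the true parameters. In practice the three parameters would be found with a numerical optimiser, which is not shown here.

```python
import math


def gev_cdf(x, omega, mu, sigma):
    """GEV cumulative distribution, Eq. (2.A.3); requires omega != 0."""
    z = 1.0 + omega * (x - mu) / sigma
    if z <= 0.0:  # outside the support of the distribution
        return 0.0 if omega > 0 else 1.0
    return math.exp(-z ** (-1.0 / omega))


def gev_pdf(x, omega, mu, sigma):
    """GEV density, Eq. (2.A.4); requires omega != 0."""
    z = 1.0 + omega * (x - mu) / sigma
    if z <= 0.0:
        return 0.0
    return z ** (-1.0 / omega - 1.0) * math.exp(-z ** (-1.0 / omega)) / sigma


def right_tail_objective(params, x_inner, x_outer, alpha_inner, f_body_inner, f_body_outer):
    """Sum of squared violations of the three right-tail conditions, Eq. (2.A.6)."""
    omega, mu, sigma = params
    return (
        (gev_cdf(x_inner, omega, mu, sigma) - alpha_inner) ** 2
        + (gev_pdf(x_inner, omega, mu, sigma) - f_body_inner) ** 2
        + (gev_pdf(x_outer, omega, mu, sigma) - f_body_outer) ** 2
    )


# Synthetic check: anchor values generated from a known GEV (hypothetical numbers).
true_params = (0.1, 100.0, 10.0)
x_inner, x_outer = 125.0, 135.0
targets = (
    gev_cdf(x_inner, *true_params),
    gev_pdf(x_inner, *true_params),
    gev_pdf(x_outer, *true_params),
)
y_true = right_tail_objective(true_params, x_inner, x_outer, *targets)
y_off = right_tail_objective((0.3, 100.0, 10.0), x_inner, x_outer, *targets)
```

The objective is zero at the generating parameters and strictly positive for the perturbed tail shape, which is the property the minimisation in Eq. (2.A.5) relies on.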
2.A.3 The modified Figlewski (2010) approach
The approach by Figlewski (2010) performs nicely for many observations in our sample. How-
ever, for some observations, the fitted GEV curves are implausible. We illustrate the problem
encountered in Figure 2.5, where the right tail of the RND is reasonably fitted by GEV, but
the left tail is not. To avoid ending up with implausible tails, we allow the inner anchor
points to change by a predefined amount (∆IAnchor), following a loop-algorithm from itera-
tion m = 1, ...,M . Within this algorithm, the inner anchor points are mainly the ones to shift
to accommodate a better-behaved GEV curve. Exceptionally, however, the outer anchor points
are also shifted. The algorithm for the left tail is given as:
1. Let the α-quantile inner anchor point ($\alpha_{innerL}$) increase by ∆IAnchor as m → M loops
until $y_L^m > 5^{-25}$ and the median of $\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} < 0$; otherwise stop loop.
2. If $y_L^{m-1} < y_L^m$, then evaluate if the median of $\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} > 0$. If yes, stop loop and
use α-quantile inner anchor point ($\alpha_{innerL}$) of $y_L^m$ for GEV estimation. If the median of
$\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} < 0$, continue loop by increasing α-quantile inner anchor point ($\alpha_{innerL}$)
by ∆IAnchor.
3. If $y_L^{m-1} < y_L^m$, then evaluate if $0.05 > \int_0^{X(\alpha_{outerL})} F_{GEV}^{m-1} > 0.1$. If so, stop loop and use
α-quantile inner anchor point ($\alpha_{innerL}$) of $y_L^{m-1}$ for GEV estimation, otherwise continue
loop.
4. If the α-quantile inner anchor point ($\alpha_{innerL}$) increases up to the mode (peak) of the RND,
then it stops increasing and the α-quantile outer anchor point ($\alpha_{outerL}$) starts increasing by
a very small step of 0.01 percent. If the α-quantile outer anchor point ($\alpha_{outerL}$) increases
more than 10 times, then stop loop and use α-quantile outer and inner anchor points from
the iteration with lowest $y_L$ for GEV estimation.
Thus, our modification to the Figlewski (2010) approach is that the RND body is always
extracted from the IV by using the Breeden and Litzenberger (1978) formulae. In contrast,
Figlewski (2010) substitutes the original RND in the interval between the inner anchor point
and the end of the original RND.
Figure 2.5: RND extraction using different methods. Plot A depicts the RND of banned stocks for March 24, 2011, using the Figlewski (2010) approach. Plot B depicts the RND of banned stocks for the same date, using the modified Figlewski approach described here. We note that in Plot A both the left and the right tails of the RND, fitted by GEV curves, are implausible because they contain abruptly declining tails under which the probability is close to zero. It is not the approach that causes such distortion but the limited range of moneyness in our data set.
2.B Appendix: Extreme value theory
When applying EVT, we first estimate the tail shape estimator (ϕ), using Hill (1975):
$$\varphi = \frac{1}{\theta} = \frac{1}{k}\sum_{j=1}^{k}\ln\left(\frac{x_j}{x_{k+1}}\right), \tag{2.B.1}$$
where xj are ranked returns in ascending order j = 1, . . . , n; n is the sample size; k is the
number of extreme returns used in the tail estimation; and xk+1 is the return “tail cut-off
point”. The tail shape estimator ϕ measures the curvature, i.e., the fatness of the tails of the
return distribution: a high (low) ϕ indicates that the tail is fat (thin).
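The Hill estimator of Eq. (2.B.1) can be sketched as below. The sketch assumes that the left-tail returns have been converted into positive losses and ranked from largest to smallest; the simulated Pareto sample and the choice of k are hypothetical and serve only to check that the estimator recovers a known tail shape.

```python
import math
import random


def hill_estimator(losses, k):
    """Eq. (2.B.1): average log-excess of the k largest losses over the cut-off."""
    ranked = sorted(losses, reverse=True)
    cutoff = ranked[k]  # x_{k+1}, the tail cut-off point
    return sum(math.log(x / cutoff) for x in ranked[:k]) / k


random.seed(0)
theta_true = 3.0  # hypothetical tail index, so the true tail shape is 1/3
sample = [random.paretovariate(theta_true) for _ in range(20000)]
phi_hat = hill_estimator(sample, k=800)  # 4 percent of the observations
```

Repeating the last line over a range of k values and plotting the results gives the Hill-plot used to choose k.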
After extracting RNDs for both banned and non-banned stocks, we next determine the
optimal number of observations k used to estimate parameter ϕ in Eq. (2.B.1). For this
purpose, we produce Hill-plots for the left tail of our two RNDs. Such Hill-plots depict the
relationship between k and ϕ as a curve. The optimal value of k is selected as the minimum
level for which the value of ϕ stabilizes, thus where a stable trade-off between the approximation
of the tail shape by the Pareto distribution and the uncertainty of such approximation occurs
(because of the use of fewer observations). We set k equal to four percent or 43 observations,
which matches the level used in, e.g., Hartmann et al. (2004).
Once ϕ is obtained, we compute extreme downside risk, hereafter VaR, using a semi-
parametric quantile estimator used in Hartmann et al. (2004):
$$q_p = x_{k+1}\left(\frac{k}{pn}\right)^{1/\theta}, \tag{2.B.2}$$
where n is the sample size, p is a chosen exceedance probability, which means the likelihood
that a return xj exceeds the tail value q, and xk+1 is the “tail cut-off point”. Note that qp has
as one of its inputs the estimated tail shape parameter ϕ. The qp statistic indicates the level
of the worst return occurring with probability p. Since the tail quantile statistic
$$\frac{\sqrt{k}}{\ln\left(\frac{k}{pn}\right)}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$$
is asymptotically normally distributed, we follow Hartmann et al. (2004) and use the following
t-statistic for this estimator:
$$T_q = \frac{q_1 - q_2}{\sigma[q_1 - q_2]} \sim N(0, 1), \tag{2.B.3}$$
where the denominator is the standard deviation of the difference between the two estimated VaRs,
computed using 1,000 bootstraps. The null hypothesis of this test is that the two VaRs, q1 and q2,
are equal.
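A rough sketch of Eqs. (2.B.2) and (2.B.3) follows. It assumes losses are expressed as positive numbers, uses simulated Pareto samples (hypothetical data), and runs only 200 bootstrap replications instead of 1,000 to keep the example fast. The function names are illustrative.

```python
import math
import random
import statistics


def tail_quantile(losses, k, p):
    """Eq. (2.B.2) applied to losses expressed as positive numbers."""
    ranked = sorted(losses, reverse=True)
    cutoff = ranked[k]  # x_{k+1}
    phi = sum(math.log(x / cutoff) for x in ranked[:k]) / k  # Hill estimate, 1/theta
    return cutoff * (k / (p * len(ranked))) ** phi


def bootstrap_t(sample1, sample2, k, p, n_boot=200, seed=1):
    """Eq. (2.B.3) with a bootstrapped standard deviation of the VaR difference."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        boot1 = rng.choices(sample1, k=len(sample1))  # resample with replacement
        boot2 = rng.choices(sample2, k=len(sample2))
        diffs.append(tail_quantile(boot1, k, p) - tail_quantile(boot2, k, p))
    sigma = statistics.stdev(diffs)
    return (tail_quantile(sample1, k, p) - tail_quantile(sample2, k, p)) / sigma


rng = random.Random(0)
sample_a = [rng.paretovariate(3.0) for _ in range(2000)]
sample_b = [rng.paretovariate(3.0) for _ in range(2000)]
var_a = tail_quantile(sample_a, k=80, p=0.001)  # true quantile is 10 for this Pareto
t_stat = bootstrap_t(sample_a, sample_b, k=80, p=0.001)
```

Because both samples are drawn from the same distribution, the statistic should typically be small in absolute value, in line with the null hypothesis of equal VaRs.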
In the next step, we employ a bivariate EVT method to calculate commonality in jumps,
hence, contagion risk from historical returns. EVT is well suited to measure contagion risk
because it does not assume any specific return distribution. Our approach estimates how
likely it is that one stock will experience a crash beyond a specific extreme negative return
threshold conditional on another stock crashing beyond an equally probable threshold. We refer
to Hartmann et al. (2004) and Balla et al. (2014) who use the conditional co-crash (CCC)
probability estimator, which is applied to each pair of stocks in our sample, as follows:
$$CCC_{ij} = 2 - \frac{1}{k}\sum_{t=1}^{N} I\left[V_{it} > x_{i,N-k} \text{ or } V_{jt} > x_{j,N-k}\right], \tag{2.B.4}$$
where the function I is the crash indicator function, in which I = 1 in case of a crash, and I = 0
otherwise, Vit and Vjt are returns for stocks i and j at time t; xi,N−k, and xj,N−k are extreme
crash thresholds. The estimation of the CCC-probabilities requires setting k as the number of
observations used in Eq. (2.B.4). For consistency with our Hill-estimator, we again use k = 43
as the minimum level for which the value of ϕ is stable in our Hill-plots. Furthermore, because
the CCC-probability is asymptotically normal if k/N → 0 as k, N → ∞ (see Hartmann et al., 2004),
a t-test for this estimator is obtained by the same bootstrap-based approach that is used in
Eq. (2.B.3).
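The estimator in Eq. (2.B.4) can be illustrated with the short sketch below. It assumes that crashes are coded as large positive losses, that the threshold is the (N−k)-th ascending order statistic (so that exactly k observations exceed it when there are no ties), and it uses toy data rather than the sample of the thesis.

```python
def ccc_probability(losses_i, losses_j, k):
    """Eq. (2.B.4): conditional co-crash probability for a pair of stocks."""
    n = len(losses_i)
    thr_i = sorted(losses_i)[n - k - 1]  # x_{i,N-k}
    thr_j = sorted(losses_j)[n - k - 1]  # x_{j,N-k}
    either = sum(1 for a, b in zip(losses_i, losses_j) if a > thr_i or b > thr_j)
    return 2.0 - either / k


# Toy data: identical series crash together, reversed series never do.
series = [float(v) for v in range(1, 11)]
ccc_same = ccc_probability(series, series, k=2)
ccc_opposite = ccc_probability(series, series[::-1], k=2)
```

The two toy cases mark the bounds of the estimator: one when the extreme losses always coincide, and zero when they never do.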
Chapter 3
Single stock call options as lottery tickets: overpricing and investor sentiment∗
3.1 Introduction
Barberis and Huang (2008) hypothesize that Tversky and Kahneman’s (1992) Cumulative
Prospect Theory (CPT) explains a number of seemingly unrelated pricing puzzles. In contrast
to previous literature, which concentrates on the CPT’s value function (see Benartzi and Thaler,
1995b; Barberis et al., 2001; Barberis and Huang, 2001), Barberis and Huang (2008) focus on
the probability weighting functions of the model. They conclude that the CPT’s overweighting
of small probability events explains why investors prefer positively skewed returns, or “lottery
ticket” type of securities. Because of this preference, investors overpay for positively skewed
securities, making them expensive and causing them to yield low forward returns. The authors
argue that this mechanism is the reason why IPO stocks, private equity, distressed stocks, single
segment firms and deep out-of-the-money (OTM) single stock calls are overpriced, among other
irrational pricing phenomena.
The proposition made by Barberis and Huang (2008) that deep OTM single stock calls
resemble overpriced lottery-like securities due to investors’ overweighting of tails has not yet been
verified empirically2. Empirical studies on probability weighting functions implied by option
prices are offered by Dierkes (2009), Kliger and Levy (2009), and Polkovnichenko and Zhao
(2013)3. The evidence provided by these papers is, however, based on the index put options
∗This chapter is based on Felix et al. (2016b). I am grateful to Deborah A. Trask (the editor) and one anonymous referee at Journal of Behavioral Finance for their useful comments and suggestions. We also thank seminar participants at the IFABS 2016 Barcelona Conference, at the VU University Amsterdam in 2016, at the APG Asset Management Quant Roundtable in 2016, at the 2016 Research in Behavioral Finance Conference in Amsterdam and at the Board of Governors of the Federal Reserve System in Washington D.C. in 2016 for their helpful comments. We thank APG Asset Management for making available part of the data set.
2 Boyer and Vorkink (2014) provide evidence that lottery-like single stock options do deliver lower forward returns than options with lower ex-ante skewness. However, their paper does not test why these options are overvalued, nor does it analyze the potential time-variation in ex-ante skewness and forward returns. Conrad et al. (2013) find similar results for ex-ante skewness and subsequent stock returns.
3 These studies focus on the rank-dependent expected utility (RDEU) rather than the CPT, as the RDEU is seamlessly effective in dealing with the overweighting of probability phenomena. The RDEU’s probability weighting functions are strictly monotonically increasing, whereas the CPT one is not. RDEU functions are
market, which behaves very differently from the single stock option market. The main buyers
of OTM index puts are institutional investors, which use them for portfolio insurance (Bates,
2003; Bollen and Whaley, 2004; Lakonishok et al., 2007; Barberis and Huang, 2008). Because
institutional investors comprise around two-thirds of the total equity market capitalization
(Blume and Keim, 2012), their option trading activity strongly impacts the pricing of put
options (Bollen and Whaley, 2004) by making them expensive. The results of Dierkes (2009)
and Polkovnichenko and Zhao (2013) reiterate this evidence and suggest that overweighting
of small probabilities partially explains the pricing puzzle present in the equity index option
market.
Contrary to the index put market, trading activity in single stock calls is concentrated among
individual investors (Bollen and Whaley, 2004; Lakonishok et al., 2007) and is speculative in
nature (Lakonishok et al., 2007; Bauer et al., 2009; Choy, 2015). Beyond that, Mitton and
Vorkink (2007), Bauer et al. (2009), and Kumar (2009) provide important empirical support for
the link between preference for skewness and individual investor trading activity. The fact
that many individual investors have a substantial portion of their portfolios tied up in low-risk
investments, such as pensions, social security, 401(k)s, and IRAs, or are averse to (or constrained
from) borrowing (Frazzini and Pedersen, 2014) encourages them to buy financial instruments with
implicit leverage, such as call options. Hence, given the very distinct clientele of these two option
markets (institutional investors vs. retail investors) and the different motivation for trading
(portfolio insurance vs. speculation), we reason that the overpricing of OTM single stock calls is
a puzzle in itself, requiring empirical proof that is independent from the index option market.
The first contribution of this chapter is to investigate whether the CPT can empirically
explain the claimed overpricing of OTM single stock call options. To that purpose, we empir-
ically test whether tails of the CPT density function outperform the risk-neutral density and
rational subjective probability density functions on matching tails of the distribution of realized
returns. We find that our estimates for the CPT probability weighting function parameter γ
are qualitatively consistent with the ones predicted by Tversky and Kahneman (1992), particularly
for short-term options. However, our estimates suggest that the overweighting of small probabilities
is less pronounced than suggested by the CPT. This analysis complements the results
of Barberis and Huang (2008) and provides novel support to explain the overpricing of OTM
single stock calls. Our empirical results extend the findings of Dierkes (2009), Kliger and Levy
(2009), Polkovnichenko and Zhao (2013), because we show that investors’ overweighting of
small probabilities is not restricted to the pricing of index puts but also applies to single stock
calls.
Secondly, we provide evidence that overweighting of small probabilities is strongly time-
varying and connected to the Baker and Wurgler (2007) investor sentiment factor. These
findings contrast with the CPT model, where the probability weighting parameter for gains (γ)
is constant at 0.61. In fact, our estimations suggest that the γ parameter fluctuates widely
around that level, sometimes even reflecting underweighting of small probabilities. We show
also easier to estimate because they use one less parameter than the CPT.
that overweighting of small probabilities was quite strong during the dot-com bubble, which
coincided with a strong rise in investor sentiment. The strong time-variation in the overweighting of
tails indicates that investors have either a “bias in beliefs” or time-varying (rather than static)
skewness preferences; see Barberis (2013) for a discussion on the topic4.
Moreover, we find that overweighting of small probabilities is largely horizon-dependent,
because this bias is mostly observed within short-term options prices (i.e., three- and six-
months) rather than in long-term ones (i.e., twelve-months). We reason that such a positive
term structure of tails’ overweighting exists because individual investors may speculate using the
cheapest available call at their disposal. In other words, individual investors buy the cheapest
lottery tickets that they can find. As three- and six-month options have much less time-value
than twelve-month ones, more pronounced overweighting of small probabilities within short-
term options seems sensible. This result is consistent with individual investors being the typical
buyers of OTM single stock calls and the fact that they mostly use short-term instruments to
speculate on the upside of equities (Lakonishok et al., 2007).
In our analysis of probability weighting functions, we focus on the outermost tails of RNDs5.
We argue that, as distribution tails (mostly estimated from OTM options) are the sections of
the distribution that reflect low probability events, we may analyze these locally, thus, isolated
from the distribution’s body. To this purpose, we use extreme value theory (EVT) and Kupiec’s
test (as a robustness check), which are especially suited for the analysis of tail probabilities
and, so far, have not been applied to the evaluation of the overpricing of OTM options. As
an additional robustness check, we replace the CPT by the rank-dependent expected utility
(RDEU) function of Prelec (1998). This alteration reconfirms the presence of overweighted
small probabilities by investors within the OTM single stock call market and, at the same time,
reiterates that such bias is less pronounced than suggested by the CPT model. Time-variation
of the weighting function parameters is also observed when RDEU is applied.
The remainder of this chapter is organized as follows. Section 3.2 describes the data and
methodology employed in our study. Section 3.3 presents our empirical analysis and section
3.4 discusses our robustness tests. Section 3.5 concludes.
3.2 Data and Methodology
In this section, we first describe the theoretical background that allows us to relate empirical
density functions (EDF), RND, and subjective density functions. This is a key step for testing
4 We acknowledge that it is unclear whether overpricing of OTM calls is caused by overweighting of small probabilities (i.e., a matter of preferences), or rather by biased beliefs (i.e., investors’ expectations). Barberis (2013) eloquently discusses how both phenomena are distinctly different and how both (individually or jointly) may explain the overpricing in OTM options. In this chapter we take a myopic view and use only the first explanation, for ease of exposition. Disentangling the two (beliefs and preferences) would potentially be very interesting, but we deem it to be outside the scope of this chapter.
5 By contrast, Dierkes (2009) and Polkovnichenko and Zhao (2013) explore the relation between overweighting of small probabilities and options prices by analyzing the full RND from options. Dierkes (2009) applies Berkowitz’s tests, whereas Polkovnichenko and Zhao (2013) estimate an empirical weighting function via polynomial regressions.
the hypothesis that the CPT helps to explain overpricing of OTM options, because we build on
the assumption that investors’ subjective density estimates should correspond, on average6, to
the distribution of realizations (see Bliss and Panigirtzoglou, 2004). Thus, testing whether the
CPT’s weighting function explains the overpricing of OTM options, ultimately, relates to how
the subjective density function produced by CPT’s preferences matches empirical returns. Be-
cause the representative agent is not observable, subjective density functions are not estimable
like EDF and RND are. As such, we build on the following theory to derive subjective density
functions from RNDs.
In our empirical exercise, we first derive subjective density functions for (a) the power
and (b) exponential utility functions. Because the CPT model contains not only a utility
function (the value function) but also a probability weighting scheme (the weighting function),
we produce two density functions: (c) the partial CPT density function (hereafter PCPT),
where only the value function is taken into account, and (d) the CPT density function, where
the value and the weighting functions are considered. Lastly, we also calibrate γ to market data
and are, then, able to compute (e) the estimated CPT density (ECPT). We provide details on
estimation methods for our five subjective density functions, (a) to (e), in section 3.2.1, and
for the RND and EDF in section 3.2.4.
Once all five subjective density functions are obtained, we distinguish four analyses in our
empirical analysis section: 1) the estimation of long-term CPT value and weighting function
parameters (from which we can produce the ECPT density) (section 3.3.1); 2) EVT-based
tests of consistency between tails of the EDF, the RND and our five subjective probability
distributions (section 3.3.2); 3) the estimation of time-varying γ parameter (section 3.3.3); and
4) a regression linking the CPT time-varying probability weighting parameter (γ) to sentiment
measures as well as numerous control variables (section 3.3.4).
We use single stock weighted average implied volatility (IV) data for the largest 100
stocks of the S&P 500 index within our RND estimations. Appendix 3.A.2 shows how single
stock weighted average IVs are computed. Weights applied are the S&P 500 index weights
normalized by the sum of weights of stocks for which IVs are available. Because of the S&P
500 index methodology and the unavailability of IV information for every stock on all days
in our sample, stock weights in this basket change on a daily basis. The sum of weights is,
on average, 58 percent of the total S&P 500 index capitalization and it fluctuates between
46 and 65 percent. The IV data comes from closing mid-option prices from January 2, 1998
to March 19, 2013 for fixed maturities for five moneyness levels, i.e., 80, 90, 100, 110, and
120, at the three-, six- and twelve-month maturity. Continuously compounded stock market
returns are calculated throughout our analysis from the basket of stocks weighted with the
same daily-varying loadings used for aggregating the IV data. IV data and stock weights are
kindly provided by Barclays7. Single stock returns are downloaded via Bloomberg.
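The weight normalization described above can be sketched for a single day and moneyness level as follows. The tickers, weights and IVs are hypothetical, and the sketch does not reproduce the full procedure of Appendix 3.A.2.

```python
def weighted_average_iv(index_weights, ivs):
    """Average IVs with index weights renormalised over stocks that have an IV."""
    available = {name: w for name, w in index_weights.items() if ivs.get(name) is not None}
    covered = sum(available.values())  # share of the index covered on this day
    average = sum(w / covered * ivs[name] for name, w in available.items())
    return average, covered


# Hypothetical tickers, index weights and IVs for one day and one moneyness level.
index_weights = {"AAA": 0.04, "BBB": 0.03, "CCC": 0.01}
ivs = {"AAA": 0.30, "BBB": 0.20, "CCC": None}  # no IV quote for CCC on this day
basket_iv, coverage = weighted_average_iv(index_weights, ivs)
```

The second return value corresponds to the sum of weights reported above, which changes from day to day as IV availability changes.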
6 This implies that investors are somewhat rational. This assumption is not inconsistent with the CPT assumption that the representative agent is less than fully rational. The CPT suggests that investors are biased, not that decision makers are utterly irrational to the point that their subjective density forecast should not correspond, on average, to the realized return distribution.
7 We thank Barclays for providing the implied volatility data. Barclays disclaimer: “Any analysis that utilizes
We take the perspective of end-users of single stock OTM call options8. Hence, we assume
that supply imbalances are minimal and do not impact implied volatilities. We think this
assumption is reasonable because 1) option markets for the largest 100 U.S. stocks are liquid;
2) any un-hedged risk run by market makers can be easily hedged by purchasing the stock;
and 3) unhedged risk by market makers is likely much smaller when supplying call options
relative to put options. Market makers run little unhedged risk when supplying call options
vis-à-vis supplying puts because stock returns are negatively skewed, making gap and jump
risk much lower on the upside than on the downside. Garleanu et al. (2009) have shown that
this condition is different for the index option market, where market makers mostly provide
put options for portfolio insurance programs. As the authors suggest, put sellers become more
risk-sensitive following equity market declines, as their un-hedged risk increases, which makes
them unwilling to write additional puts to the market. Our implied volatility data show no
indication of an increase in the implied volatility skew from 120 percent moneyness options, nor
from at-the-money options around moments of market stress (e.g., the 2008-09 global financial
crisis). Hence, we find no evidence of the presence of supply imbalances in the OTM calls in
our sample.
3.2.1 Subjective density functions
Standard utility theory tells us that since the representative agent does not have risk-neutral
preferences, RNDs are inconsistent with both the subjective density function and the EDF9, i.e.,
with “real-world” probabilities. Hence, if investors are risk-averse or risk seeking, their subjective probability function
should differ from the one implied by option prices. The relation between the RND fQ(ST ),
and “real-world” probability distributions, fP (ST ), with ST being wealth or consumption10, is
described by ς(ST ), the pricing kernel or the marginal rate of substitution (of consumption at
time T for consumption at time t)11:
$$\frac{f_Q(S_T)}{f_P(S_T)} = \Lambda\,\frac{U'(S_T)}{U'(S_t)} \equiv \varsigma(S_T), \tag{3.2.1}$$
where Λ is the subjective discount factor (the time-preference constant) and U(·) is the rep-
any data of Barclays, including all opinions and/or hypotheses therein, is solely the opinion of the author and not of Barclays. Barclays has not sponsored, approved or otherwise been involved in the making or preparation of this Report, nor in any analysis or conclusions presented herein. Any use of any data of Barclays used herein is pursuant to a license.”
8 We assume that end-users of single stock OTM call options have the same preferences across underlying securities. This assumption is supported by the evidence provided by Bollen and Whaley (2004) and Lakonishok et al. (2007) that trading activity in equity calls is concentrated among individual investors and is speculative in nature.
9 Anagnou et al. (2002) and Bliss and Panigirtzoglou (2004) have tested the consistency between RNDs and physical densities estimated from historical data and found that such distributions are inconsistent, i.e., RNDs are poor forecasters of the distribution of realizations.
10 Note that, as the value function within the CPT measures utility versus a reference point, ST is not strictly positive in this model. A negative ST denotes a loss of wealth or consumption, whereas a positive ST represents a gain.
11 The condition necessary for Eq. (3.2.1) to hold is that markets are complete and frictionless and a single risky asset is traded.
45
resentative investor utility function. As U(ST ) is a random variable, the pricing kernel is also
called the stochastic discount factor. Thus, Eq. (3.2.1) tell us that the “real-world” distribution
equates to the RND when adjusted by the pricing kernel. The intuition behind Eq. (3.2.1) is
that a real-world or risk-adjusted probability distribution can be obtained from the RND, once
the risk trade-off embedded in the representative investor utility function is considered.
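As a rough illustration of Eq. (3.2.1), the sketch below turns an RND into a subjective density by dividing it by marginal utility and renormalizing. It is a minimal sketch under stated assumptions: the bell-shaped RND, the grid, the power utility form U'(S) = S^(-rho) and the value rho = 3 are all hypothetical choices for illustration, not inputs taken from this chapter.

```python
import math


def trapezoid(y, x):
    """Trapezoidal integral of y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def subjective_density(grid, rnd, marginal_utility):
    """Divide the RND by marginal utility and renormalize to integrate to one.

    The constant part of the pricing kernel (Lambda / U'(S_t)) cancels in the
    normalization, so only U'(S_T) is needed.
    """
    raw = [q / marginal_utility(s) for s, q in zip(grid, rnd)]
    total = trapezoid(raw, grid)
    return [v / total for v in raw]


def density_mean(grid, density):
    """Mean of a density tabulated on a grid."""
    return trapezoid([s * d for s, d in zip(grid, density)], grid)


# Hypothetical inputs: a bell-shaped RND on a price grid around S0 = 100 and a
# power utility with relative risk aversion rho = 3 (illustrative value only).
grid = [50.0 + i for i in range(101)]
bell = [math.exp(-0.5 * ((s - 100.0) / 15.0) ** 2) for s in grid]
rnd = [b / trapezoid(bell, grid) for b in bell]
rho = 3.0
power_density = subjective_density(grid, rnd, lambda s: s ** (-rho))
```

With a risk-averse utility, the adjustment shifts probability mass towards higher terminal values, so the mean of `power_density` lies above the mean of `rnd`.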
Since CPT-biased investors price options as if the data-generating process has a cumulative distribution $\tilde{F}_P(S_T) = w(F_P(S_T))$12, where w is the weighting function, its density function becomes $\tilde{f}_P(S_T) = w'(F_P(S_T)) \cdot f_P(S_T)$ (see Dierkes, 2009; Polkovnichenko and Zhao, 2013). Thus, CPT-biased agents assess probability distributions as if their tails contained more weight than they do in reality, i.e., they have a preference for skewness or a “bias in beliefs”, as Barberis (2013) argues. Consequently, evaluating whether the CPT’s propositions apply is equivalent to testing whether Eq. (3.2.1) still holds if $f_P(S_T)$ is replaced by $\tilde{f}_P(S_T)$, leading to:
$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T). \qquad (3.2.2)$$
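The probability distortion in the denominator of Eq. (3.2.2) can be sketched as follows. The code assumes the standard one-parameter weighting function of Tversky and Kahneman (1992), w(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ); the function names are hypothetical and the snippet is only an illustration of how w'(F_P) rescales a density.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) weighting: p^g / (p^g + (1 - p)^g)^(1/g)."""
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def tk_weight_derivative(p, gamma):
    """Analytical derivative w'(p), valid for 0 < p < 1."""
    d = p ** gamma + (1.0 - p) ** gamma
    first = gamma * p ** (gamma - 1.0) * d ** (-1.0 / gamma)
    second = p ** gamma * d ** (-1.0 / gamma - 1.0) * (
        p ** (gamma - 1.0) - (1.0 - p) ** (gamma - 1.0)
    )
    return first - second


def distorted_density(cdf_value, pdf_value, gamma):
    """Density as perceived by a CPT-biased investor: w'(F_P) * f_P."""
    return tk_weight_derivative(cdf_value, gamma) * pdf_value


# With gamma = 1 probabilities are left unweighted; with gamma = 0.61 the
# density far in the right tail (cumulative probability 0.99) is inflated.
tail_unweighted = distorted_density(0.99, 0.02, 1.0)
tail_weighted = distorted_density(0.99, 0.02, 0.61)
```

For γ below one, `tail_weighted` exceeds `tail_unweighted`, which is the overweighting of tails described in the text.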
We then further manipulate Eq. (3.2.2) so as to relate the original EDF directly to the CPT subjective density function, by “undoing” the effect of the CPT probability distortion functions within the PCPT density function. The relation between the EDF and the CPT density function is given by Eq. (3.2.3), and its derivation from Eq. (3.2.2) is provided in Appendix 3.A.1:
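$$\underbrace{f_P(S_T)}_{\text{EDF}} = \underbrace{\frac{\dfrac{f_Q(S_T)}{\nu'(S_T)}}{\displaystyle\int \dfrac{f_Q(x)}{\nu'(x)}\,dx}\,(w^{-1})'\big(F_P(S_T)\big)}_{\text{CPT density function}} \qquad (3.2.3)$$
where $\nu'(S_T)$ is the CPT’s marginal utility function.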
This result gives us a clear representation of the CPT subjective density function, in which the value function and the weighting function are taken into account simultaneously. At this stage, as we can produce the RND and the set of subjective densities of interest, including the CPT density, we can evaluate how consistent their tails are with realizations.
12Similarly, if investors are rational, their subjective density functions should be consistent, on average, with the empirical density function. Bliss and Panigirtzoglou (2004) find that subjective density functions, produced from RND adjusted by two types of representative investors’ utility functions (power and exponential) with plausible relative risk aversion parameters, outperform RND on forecasting density functions.
3.2.2 Estimating CPT parameters
We start by evaluating the empirical validity of the CPT for single stock call options by comparing the EDF to the CPT density function parameterized by Tversky and Kahneman (1992). Subsequently, we estimate the CPT parameters λ (loss aversion) and γ (probability weighting) with the same goal. Within the probability weighting function, we estimate only γ, and not δ, because we are interested in the gains side of the distribution, which is extracted from call options. We estimate these parameters non-parametrically, by minimizing the weighted squared distance between the physical distribution and the partial CPT density function for every bin above the median of the two distributions, as follows:
$$\upsilon(\lambda) = \min \sum_{b=1}^{B} W_b \left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \qquad (3.2.4)$$
where $EDF^{b}_{prob}$ and $CPT^{b}_{prob}$ are, respectively, the probability within bin b in the empirical and CPT density functions, and $W_b$ are weights given by $\frac{1}{\frac{1}{\sqrt{2}}\int_{0.5}^{\infty} e^{-x^{2}/2}\,dx = 1}$, the reciprocal of the normalized normal probability distribution (above its median), split into the same total number of bins (B) used for the EDF and CPT. The loss aversion parameter, λ, in Eq. (3.2.4) is optimized using multiple constraint intervals: [0,3], [0,5] and [0,10]. Once the optimal λ is known, we minimize Eq. (3.2.5) using both its estimate and the CPT λ:
$$w^{+}(\gamma, \delta = \gamma) = \min \sum_{b=1}^{B} W_b \left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \qquad (3.2.5)$$
where γ, the probability weighting parameter for gains, is constrained by the permutations of the following upper bounds (1.2, 1.35, 1.5, 1.75 and 2) and lower bounds (-0.25, 0 and 0.28). We apply weights in these optimizations because matching the tails of the probability distributions is more important in our analysis than matching their bodies.
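The bounded single-parameter minimization of Eq. (3.2.5) can be sketched as below. This is a minimal sketch, not the optimizer used in the thesis: it swaps in a plain grid search over the bounded interval, assumes the Tversky and Kahneman (1992) weighting function, uses hypothetical tail-heavy weights (the reciprocal of the normal density at each bin midpoint), and fits a synthetic target generated with a known γ so that the search can be checked.

```python
from statistics import NormalDist


def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function."""
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def bin_probabilities(cdf_edges, gamma):
    """Probability mass per bin after distorting the cumulative probabilities."""
    w = [tk_weight(p, gamma) for p in cdf_edges]
    return [w[i + 1] - w[i] for i in range(len(w) - 1)]


def weighted_rss(target, model, weights):
    """Weighted sum of squared differences between bin probabilities."""
    return sum(wb * (t - m) ** 2 for wb, t, m in zip(weights, target, model))


def fit_gamma(target, cdf_edges, weights, lower, upper, step=0.001):
    """Grid search on [lower, upper]; returns the gamma with the lowest RSS."""
    best_gamma, best_rss = lower, float("inf")
    n_steps = int(round((upper - lower) / step))
    for i in range(n_steps + 1):
        gamma = lower + i * step
        rss = weighted_rss(target, bin_probabilities(cdf_edges, gamma), weights)
        if rss < best_rss:
            best_gamma, best_rss = gamma, rss
    return best_gamma, best_rss


# Hypothetical setup: ten bins above the median, spanning z = 0 to z = 3.
normal = NormalDist()
z_edges = [0.3 * i for i in range(11)]
cdf_edges = [normal.cdf(z) for z in z_edges]
# Assumed weights: tail bins count more than bins close to the median.
weights = [1.0 / normal.pdf(0.5 * (z_edges[i] + z_edges[i + 1])) for i in range(10)]
# Synthetic target built with a known gamma, only to verify the search.
target = bin_probabilities(cdf_edges, 0.61)

gamma_hat, rss = fit_gamma(target, cdf_edges, weights, lower=0.28, upper=1.2)
```

Repeating `fit_gamma` for each pair of lower and upper bounds and keeping the pair with the lowest RSS would mirror the permutation of bounds described above.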
Our non-linear bounded optimization is a single-parameter one: we first estimate the optimal γ (which we impose to equal δ) across all permutations of upper and lower bounds and select the bounds that produce the lowest residual sum of squares (RSS). Subsequently, we estimate λ and γ following the sequence of optimizations described by Eqs. (3.2.4) and (3.2.5). This method resembles the ones of Kliger and Levy (2009), Dierkes (2009), Chabi-Yo and Song (2013), and Polkovnichenko and Zhao (2013). Once the optimal parameters λ and γ are estimated, we can produce another long-term subjective density function: the ECPT, which stands for estimated CPT, where we apply the optimal γ to characterize its probability weighting function. Finally, we also estimate time-varying γ under different assumptions for λ, so as to evaluate the sensitivity of γ to changes in λ.
3.2.3 Density function tails’ consistency test
We check for tail consistency of our set of five subjective density functions (CPT, PCPT, ECPT, power and exponential), the RND, and the EDF by applying extreme value theory (EVT). EVT allows us to estimate the shape of the tails of these seven PDFs and to extract the returns implied by an extreme quantile within our PDFs. We estimate the tail shape parameter (ϕ) by means of the Hill (1975) estimator:
$$\phi = \frac{1}{\theta} = \frac{1}{k} \sum_{j=1}^{k} \ln\left(\frac{x_j}{x_{k+1}}\right), \qquad (3.2.6)$$
where k is the number of extreme returns used in the tail estimation, and $x_{k+1}$ is the tail cut-off point. The tail shape estimator ϕ measures the curvature, i.e., the fatness of the tails of the return distribution: a high (low) ϕ indicates that the tail is fat (thin). The inverse of ϕ is the tail index (θ), which determines the tail probability’s rate of decay. A high (low) θ indicates that the tail decays quickly (slowly) and, therefore, is thin (fat). The tail shape estimator and the tail index give us a good representation of the curvature of the tails, but since tails may have the same shape while implying different extreme observations, we also employ the semi-parametric extreme quantile estimator from De Haan et al. (1994):
$$q_p = x_{k+1} \left(\frac{k}{pn}\right)^{\frac{1}{\theta}}, \qquad (3.2.7)$$
where n is the sample size, p is the corresponding exceedance probability, i.e., the likelihood that a return $x_j$ exceeds the tail value q, and $x_{k+1}$ is the tail cut-off point. We note that one of the inputs of $q_p$ is the tail shape estimator ϕ. Similar to value-at-risk (VaR) modeling, the $q^{-}_p$ statistic indicates the level of the worst return occurring with probability p, which is small. This is why we call $q_p$ the extreme quantile return (EQR). As we are interested only in the upside returns with probability p estimated from calls, we only compute $q^{+}_p$ by applying the same methodology to the right side of the RND obtained from the single stock option market13.
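Eqs. (3.2.6) and (3.2.7) translate into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the sample is synthetic Pareto data with a known tail index of 4 (so ϕ = 0.25), and it assumes that the largest observations are strictly positive so that the logarithm is defined.

```python
import math
import random


def hill_tail_shape(values, k):
    """Hill (1975) estimate of phi from the k largest observations.

    Returns (phi, cutoff), where cutoff is x_{k+1}, the (k+1)-th largest value.
    """
    ordered = sorted(values, reverse=True)
    cutoff = ordered[k]
    phi = sum(math.log(x / cutoff) for x in ordered[:k]) / k
    return phi, cutoff


def extreme_quantile(values, k, p):
    """Eq. (3.2.7): q_p = x_{k+1} * (k / (p * n)) ** (1 / theta), 1 / theta = phi."""
    phi, cutoff = hill_tail_shape(values, k)
    n = len(values)
    return cutoff * (k / (p * n)) ** phi


# Synthetic check on Pareto draws with tail index theta = 4 (phi = 0.25).
random.seed(0)
sample = [random.paretovariate(4.0) for _ in range(20000)]
k = int(0.04 * len(sample))  # four percent of observations in the tail
phi_hat, cutoff = hill_tail_shape(sample, k)
q_hat = extreme_quantile(sample, k, p=0.01)
```

For this synthetic sample the true one percent quantile is 0.01 ** (-0.25), which `q_hat` should approximate up to sampling error.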
In addition to the EQR, we also evaluate the density function tails using the expected shortfall (ES), which captures the average loss beyond the tail cut-off point. As we are interested in the upside of the distribution, we call this measure the expected upside (EU), i.e., the average gain beyond the tail cut-off point. We evaluate the EU following the formula of Danielsson et al. (2006) for the ES, which relates the EQR (i.e., the VaR) to the ES (i.e., the CVaR) as described below:
$$EU_q(p) = \frac{\theta}{\theta - 1} \cdot x_{k+1} \left(\frac{k}{pn}\right)^{\frac{1}{\theta}}, \qquad (3.2.8)$$
where θ is the tail index.
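A worked instance of Eq. (3.2.8) may help to see how the EU scales the EQR by θ/(θ − 1). The inputs below are illustrative numbers, not estimates from this chapter, and the function name is hypothetical.

```python
def expected_upside(theta, cutoff, k, p, n):
    """Eq. (3.2.8): EU = theta / (theta - 1) * x_{k+1} * (k / (p * n)) ** (1 / theta)."""
    if theta <= 1.0:
        raise ValueError("the tail mean is only finite for theta > 1")
    eqr = cutoff * (k / (p * n)) ** (1.0 / theta)
    return theta / (theta - 1.0) * eqr


# Illustrative inputs: theta = 4, cut-off return of 20 percent, k = 800, n = 20000.
eu_1pct = expected_upside(theta=4.0, cutoff=0.20, k=800, p=0.01, n=20000)
```

The thinner the tail (the higher θ), the closer the multiplier θ/(θ − 1) is to one and the closer the EU is to the EQR.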
De Haan et al. (1994) show that the tail shape estimator statistic $\sqrt{k}\,(\hat{\phi}(k) - \phi)$ and the tail quantile statistic $\frac{\sqrt{k}}{\ln(k/(pn))}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$ are asymptotically normally distributed. Hence, according to Hartmann et al. (2004) and Straetmans et al. (2008), the t-statistics for such estimators are given by:
$$T_{\phi} = \frac{\phi_1 - \phi_2}{\sigma[\phi_1 - \phi_2]} \sim N(0, 1), \qquad (3.2.9a)$$
and
$$T_q = \frac{q_1 - q_2}{\sigma[q_1 - q_2]} \sim N(0, 1), \qquad (3.2.9b)$$
where the denominators are the bootstrapped standard deviations of the differences between the estimated shape parameters ϕ and between the quantile parameters $q_p$, using 1,000 bootstraps. The null hypothesis of this test is that the ϕ and $q_p$ parameters of the two distributions being compared are equal, i.e., ϕ1 = ϕ2 and q1 = q2. The alternative hypothesis is that ϕ and $q_p$ have unequal means. This t-test is also applied to our EU analysis, as the EU follows the same distribution as the tail quantile statistic $\frac{\sqrt{k}}{\ln(k/(pn))}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$, given that the EU is the extreme quantile estimator multiplied by a constant.

13Our EQR measure is closely connected to the risk-neutral tail loss measure of Vilkov and Xiao (2013).
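The test statistic of Eq. (3.2.9a) can be sketched as follows. The resampling scheme is an assumption made for illustration (each sample is resampled independently with replacement and the standard deviation of the bootstrapped differences is used as the denominator); the thesis does not spell out these details here. The data are synthetic Pareto draws.

```python
import math
import random
import statistics


def hill_tail_shape(values, k):
    """Hill (1975) estimate of phi from the k largest observations."""
    ordered = sorted(values, reverse=True)
    cutoff = ordered[k]
    return sum(math.log(x / cutoff) for x in ordered[:k]) / k


def bootstrap_t_stat(sample_1, sample_2, k, n_boot=1000, seed=0):
    """T = (phi_1 - phi_2) / sigma[phi_1 - phi_2], with a bootstrapped sigma."""
    rng = random.Random(seed)
    point_diff = hill_tail_shape(sample_1, k) - hill_tail_shape(sample_2, k)
    boot_diffs = []
    for _ in range(n_boot):
        resample_1 = rng.choices(sample_1, k=len(sample_1))
        resample_2 = rng.choices(sample_2, k=len(sample_2))
        boot_diffs.append(
            hill_tail_shape(resample_1, k) - hill_tail_shape(resample_2, k)
        )
    return point_diff / statistics.stdev(boot_diffs)


# Synthetic example: a fat-tailed sample (theta = 1.5) against a thin-tailed
# one (theta = 6); fewer bootstraps than in the text to keep the run short.
rng = random.Random(1)
fat = [rng.paretovariate(1.5) for _ in range(2500)]
thin = [rng.paretovariate(6.0) for _ in range(2500)]
t_stat = bootstrap_t_stat(fat, thin, k=100, n_boot=200)
```

A large positive `t_stat` rejects the null of equal tail shapes; the same function applies to EQRs if `hill_tail_shape` is replaced by a quantile estimator.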
3.2.4 Estimating RND and EDF
For the estimation of the RND, the first step is to apply the Black-Scholes model to our IV data to obtain option prices (C) for the S&P 500 index. Once our data is normalized so that strikes are expressed in terms of percentage moneyness, the instantaneous price level of the S&P 500 index (S0) equals 100 for every period for which we would like to obtain implied returns. Contemporaneous dividend yields for the S&P 500 index are used for the calculation of C, as well as the risk-free rate from three-, six-, and twelve-month T-bills. Because we have IV data for five levels of moneyness, we implement a modified Figlewski (2010) method for extracting our RND structure, as in Felix et al. (2016a). The main advantage of this method over other techniques is that it extracts the body and tails of the distribution separately, thereby allowing for fat tails.
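The first step, converting implied volatilities into call prices with S0 normalized to 100, can be sketched as below. The moneyness levels, volatilities, rate and dividend yield are hypothetical placeholders, and the later steps of the Figlewski (2010) method (interpolation, differentiation and tail fitting) are not shown.

```python
import math
from statistics import NormalDist


def black_scholes_call(spot, strike, rate, dividend_yield, vol, maturity):
    """Black-Scholes price of a European call with a continuous dividend yield."""
    sqrt_t = math.sqrt(maturity)
    d1 = (
        math.log(spot / strike) + (rate - dividend_yield + 0.5 * vol ** 2) * maturity
    ) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    cdf = NormalDist().cdf
    discounted_spot = spot * math.exp(-dividend_yield * maturity)
    discounted_strike = strike * math.exp(-rate * maturity)
    return discounted_spot * cdf(d1) - discounted_strike * cdf(d2)


# Hypothetical IV quotes at five moneyness levels (strike in percent of S0 = 100)
# for a three-month maturity.
implied_vols = {90: 0.26, 95: 0.24, 100: 0.22, 105: 0.21, 110: 0.20}
prices = {
    strike: black_scholes_call(
        100.0, strike, rate=0.02, dividend_yield=0.015, vol=vol, maturity=0.25
    )
    for strike, vol in implied_vols.items()
}
```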
The Figlewski (2010) method is close to the one employed by Bliss and Panigirtzoglou (2004), where body and tails are also extracted separately. Bliss and Panigirtzoglou (2004) use a weighted natural spline algorithm for interpolation, which has the same noise-reducing effect on RNDs as using splines in the absence of knots, as done in Figlewski (2010). The extrapolation in Bliss and Panigirtzoglou (2004) is done by introducing a pseudo-data point, which has the effect of pasting lognormal tails onto the RND. One advantage of these two approaches is that the extrapolation does not result in negative probabilities, which is possible when splines are applied in such a case. Nevertheless, we favor the approach of Figlewski (2010), as the lognormal tails employed by Bliss and Panigirtzoglou (2004) assume that IV is constant beyond the observable strikes, resembling the Black-Scholes model. The modification made to the Figlewski (2010) method by Felix et al. (2016a) entails having flexible inner anchor points (as opposed to fixed anchor points) for fitting tails to the risk-neutral density. The aim of this modification is to prevent the method from estimating density functions with implausible shapes.
We estimate the EDF in two different ways. First, using the entire sample of realized returns (r), we estimate long-term EDFs non-parametrically, where r = ln(ST/St), St is the realized return index at time t, and ST is the level of the same index three, six or twelve months forward, i.e., respectively 63, 126 and 252 days forward. Because of overlapping periods, we initially estimate our empirical distribution from non-overlapping returns for these three maturities by using distinct starting points. This methodology is also applied by Jackwerth (2000) and Ait-Sahalia and Lo (2000). However, because the length of the overlapping periods is relatively large compared to our total sample, especially for the twelve-month forward returns, we average the distributions obtained with distinct starting points to smooth the shape of our multiple-horizon distributions14.
14As a robustness check to this approach, we compare our three-, six- and twelve-month empirical distributions
In a second step, we estimate time-varying EDFs built from an invariant component, the standardized innovation density, and a time-varying part, the conditional variance ($\sigma^2_{t|t-1}$) produced by an EGARCH model (see Nelson, 1991). We first define the standardized innovation as the ratio of empirical returns to their conditional standard deviation ($\ln(S_t/S_{t-1})/\sigma_{t|t-1}$) produced by the EGARCH model. From the set of standardized innovations produced, we can then estimate a density shape, i.e., the standardized innovation density. The advantage of such a density shape over a parametric one is that it may include the typically observed fat tails and negative skewness, which are not incorporated in simple parametric models, e.g., the normal. As mentioned, this density shape is invariant; it is made time-varying by multiplying each standardized innovation by the EGARCH conditional standard deviation at time t, which is specified as follows:
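$$\ln(S_t/S_{t-1}) = \mu + \varepsilon_t, \quad \varepsilon \sim f(0, \sigma^2_{t|t-1}) \qquad (3.2.10a)$$
and
$$\sigma^2_{t|t-1} = \omega_1 + \alpha \varepsilon^2_{t-1} + \beta \sigma^2_{t-1|t-2} + \vartheta\,\mathrm{Max}[0, -\varepsilon_{t-1}]^2, \qquad (3.2.10b)$$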
where α captures the sensitivity of the conditional variance to lagged squared innovations ($\varepsilon^2_{t-1}$), β captures the sensitivity of the conditional variance to the lagged conditional variance ($\sigma^2_{t-1|t-2}$), and ϑ allows for the asymmetric impact of lagged returns ($\vartheta\,\mathrm{Max}[0, -\varepsilon_{t-1}]^2$). The model is estimated by maximum likelihood, where innovations are assumed to be normally distributed.
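The variance recursion of Eq. (3.2.10b) and the resulting standardized innovations can be sketched as follows. The parameter values and the three-observation return series are illustrative only; in practice the parameters would come from the maximum likelihood estimation, which is not shown, and the initial variance is an assumption.

```python
import math


def conditional_variances(returns, mu, omega, alpha, beta, vartheta, initial_variance):
    """Recursion of Eq. (3.2.10b); entry t is the variance that applies to returns[t]."""
    variances = [initial_variance]
    for r in returns[:-1]:
        eps = r - mu
        downside = max(0.0, -eps)
        variances.append(
            omega + alpha * eps ** 2 + beta * variances[-1] + vartheta * downside ** 2
        )
    return variances


def standardized_innovations(returns, variances):
    """Returns divided by their conditional standard deviation, as in the text."""
    return [r / math.sqrt(v) for r, v in zip(returns, variances)]


# Illustrative values (not estimates from the thesis).
returns = [0.01, -0.02, 0.0]
variances = conditional_variances(
    returns, mu=0.0, omega=1e-5, alpha=0.1, beta=0.8, vartheta=0.1,
    initial_variance=1e-4,
)
z = standardized_innovations(returns, variances)
```

The negative return on the second day raises the variance for the third day through both the α and the ϑ terms, which is the asymmetry the model is meant to capture.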
Up to this point, we have produced a one-day horizon EDF for every day in our sample, but we still lack time-varying EDFs for the three-, six-, and twelve-month horizons. Thus, we use bootstrapping to draw 1,000 paths towards these desired horizons by randomly selecting single innovations ($\varepsilon_{t+1}$) from the one-day horizon EDFs available for each day in our sample. We note that once the first return is drawn, the conditional variance ($\sigma^2_{t-1|t-2}$) is updated, affecting the subsequent innovation drawings of a path. This sequential exercise continues through time until the desired horizon is reached. In order to account for drift in the simulated paths, we add the daily drift estimated from the long-term EDF plus the risk-free rate to the drawn innovations; thus, the one-period simulated return is $\varepsilon_{t+1} + \mu + R_f$. The density functions produced by the collection of returns implied by the terminal values of every path and their starting points are our three-, six-, and twelve-month EDFs. These simulated paths contain, respectively, 63, 126, and 252 daily returns. We note that by drawing returns from stylized distributions with fat tails and excess skewness, our EDFs for the three relevant horizons also embed such features. Finally, once these three time-varying EDFs are estimated for all days in our sample, we estimate γ for each of these days using Eq. (3.2.5)15.
with the ones calculated from non-overlapping returns. We use data since 1871 for the US equity price index, made available by Welch and Goyal (2008), who use S&P 500 data since 1926, and data from Robert Shiller’s website for the preceding period. Our empirical distributions are quite similar to the ones estimated from the longer data set, suggesting that they are, indeed, suitable as long-term distributions. The use of overlapping returns is less problematic in our calculations than in regression estimation, where statistical inferences on parameter estimates can be strongly affected by overlapping returns’ serial correlation.
15Due to drift, the model of time-varying EDF for the twelve-month horizon occasionally does not match the
Our approach for estimating both the long-term EDF and the time-varying EDF is closely
connected to the method applied by Polkovnichenko and Zhao (2013). The time-varying method
used by these authors is based on Rosenberg and Engle (2002). The choice for an EGARCH
approach versus the standard GARCH model is due to the asymmetric feature of the former
model that embeds the “leverage effect”16.
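The path simulation behind the time-varying EDFs can be sketched as below. It is a minimal sketch under stated assumptions: standardized innovations are drawn with replacement from a pool, scaled by the current conditional standard deviation, and the variance is then updated with the recursion of Eq. (3.2.10b). All parameter values and the innovation pool are hypothetical.

```python
import math
import random


def simulate_terminal_returns(z_pool, start_variance, horizon, n_paths, mu, rf,
                              omega, alpha, beta, vartheta, seed=0):
    """Bootstrap paths and return the cumulative log return of each path."""
    rng = random.Random(seed)
    terminal = []
    for _path in range(n_paths):
        variance, cumulative = start_variance, 0.0
        for _day in range(horizon):
            # Draw a standardized innovation and scale it by the current volatility.
            eps = rng.choice(z_pool) * math.sqrt(variance)
            cumulative += eps + mu + rf  # one-period simulated return
            downside = max(0.0, -eps)
            variance = (
                omega + alpha * eps ** 2 + beta * variance + vartheta * downside ** 2
            )
        terminal.append(cumulative)
    return terminal


# Hypothetical pool of standardized innovations and illustrative parameters;
# 63 daily steps correspond to the three-month horizon.
pool = [-2.5, -1.0, -0.3, 0.0, 0.2, 0.8, 1.1, 1.7]
three_month = simulate_terminal_returns(
    pool, start_variance=1e-4, horizon=63, n_paths=1000, mu=0.0003, rf=0.0001,
    omega=1e-5, alpha=0.05, beta=0.85, vartheta=0.05,
)
```

A histogram or kernel density of `three_month` would then play the role of the three-month EDF for the starting day in question.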
3.3 Empirical analysis and results
In this section, we present our empirical results. Since we estimate EDF in the two ways
described (the long-term and time-varying EDFs), we are able to estimate long-term and time-
varying γ’s by minimizing Eq. (3.2.5). We use our long-term γ estimates to compute the
ECPT to compare it to the other subjective density functions using the tests described in
section 3.2.3. The time-varying estimates of γ are analyzed in sections 3.3.3 and 3.3.4 with
the use of a regression model. Finally, in section 3.4, we perform four robustness tests on our
results by using an alternative weighting function to the CPT, the one embedded in the Prelec
(1998) model, and we apply Kupiec’s test to probability tails, among other checks.
3.3.1 Estimated CPT long-term parameters
We report the estimated CPT parameters (λ and γ) extracted from long-term density functions
in Table 3.1, Panel A. Our first finding is that λ, the parameter of loss aversion, which is 2.25 in
the CPT, fluctuates around that number for six- and twelve-month options but shows a quite
different outcome for three-month options. Our estimation of λ from three-month options is
1.02, which indicates no loss aversion. For the six- and twelve-month options λ is 2.66 and 3.00,
respectively. This finding suggests that loss aversion is more pronounced at longer maturities
than suggested by the CPT. Apart from that, twelve-month λ estimates vary strongly across the different optimization upper bounds used (i.e., 3, 5 and 10), always matching the bound value, whereas estimates from the three- and six-month option maturities are very stable across upper bounds.
The estimated probability weighting function parameter γ is slightly higher than the one
suggested by the CPT (i.e., 0.61) at the three- and six-month horizons, respectively, at 0.75 and
0.81. For twelve-month options, γ is around 1.09. These results suggest that overweighting of
small probabilities occurs in short-term options (up to six-months), while twelve-month options
seem to behave more rationally. These findings support our hypothesis that individual investors
are, on average, biased when purchasing single stock call options, as suggested by Barberis and
Huang (2008).
one of the PCPT model. This difference makes the estimation of γ (Eq. (3.2.5)) challenging, as a large number of γ estimates produce unreasonable PDFs, such as ones with non-monotonic CDFs. Therefore, to perform the optimizations given by Eq. (3.2.5), we set the mode of the simulated EDF equal to the one of the PCPT.
16The leverage effect is the negative correlation between an asset’s returns and changes in its volatility. For a comparison between alternative GARCH approaches, see Bollerslev et al. (2009).
Table 3.1: Long-term CPT parameters and consistency test on tail shape

Panel A - long-term CPT parameters
             Gamma (γ)          Lambda (λ)         (γ|λ)
Maturity     Estimate   RSS     Estimate   RSS     Estimate   RSS
3 months     0.75       0.02    1.02       0.12    0.54       0.01
6 months     0.81       0.02    2.66       0.3     0.87       0.02
12 months    1.09       0.06    3          1.64    1.12       0.07

Panel B - statistical test on tail shape parameters
                             Phi
Maturity     (1) vs (2)      (1)        (2) EDF    p-value    t-stat
3 months     RND vs EDF      0.20*      0.29       0.1%       -3.6
             Power vs EDF    0.17***    0.29       0.0%       -4.9
             Expo vs EDF     0.18***    0.29       0.0%       -4.6
             PCPT vs EDF     0.17***    0.29       0.0%       -4.9
             CPT vs EDF      0.20       0.29       0.0%       -4.6
             ECPT vs EDF     0.20       0.29       0.0%       -3.7
6 months     RND vs EDF      0.19**     0.23       2.8%       -2.3
             Power vs EDF    0.16***    0.23       0.0%       -4.0
             Expo vs EDF     0.16***    0.23       0.0%       -3.9
             PCPT vs EDF     0.17***    0.23       0.0%       -4.0
             CPT vs EDF      0.19**     0.23       0.0%       -3.9
             ECPT vs EDF     0.18***    0.23       0.0%       -2.8
12 months    RND vs EDF      0.22       0.14       0.0%       4.0
             Power vs EDF    0.14***    0.14       37.7%      0.3
             Expo vs EDF     0.14***    0.14       39.7%      -0.1
             PCPT vs EDF     0.18**     0.14       1.9%       2.5
             CPT vs EDF      0.22***    0.14       0.0%       4.3
             ECPT vs EDF     0.18**     0.14       1.7%       2.5
Panel A of this table reports the estimated long-term CPT parameters gamma (γ), lambda (λ), and γ conditional on the optimal λ (γ|λ) from the single stock options, as well as the residual sum of squares (RSS) of Eqs. (3.2.4) and (3.2.5). The parameter γ defines the curvature of the weighting function for gains, which leads the probability distortion functions to assume inverse S-shapes. Estimated parameters close to unity lead to weighting functions that are close to un-weighted probabilities, whereas parameters close to zero denote a larger overweighting of small probabilities. The parameter λ is the loss aversion parameter. These parameters are long-term since their estimates are obtained by setting the average CPT density functions to match the return distribution realized within our full sample. These parameters are estimated using Eqs. (3.2.4) and (3.2.5). Panel B reports the results from the statistical test of the tail shape parameter phi (ϕ), according to Eqs. (3.2.6) and (3.2.9a), applied to the averaged probability density functions. As the densities compared here are averaged (for the RND and the subjective densities) or estimated using our full sample (for realized returns), this test aims to assess the long-term consistency between distribution tail shapes. The null hypothesis of this test is that ϕ from the two distributions being compared have equal means and, therefore, tail shapes are consistent. The rejection of the null hypothesis is tested by t-tests of Eq. (3.2.9a) at the ten, five, and one percent statistical levels, respectively, shown by superscripts *, **, ***, assigned to ϕ for the RND, displayed in column (1), and ϕ for the EDF, shown in column (2).
3.3.2 Density functions tails’ consistency test results
As specified in section 3.2.3, we test the empirical consistency of density function tails among a set of five subjective distributions (CPT, PCPT, ECPT, power, exponential), the RND, and the EDF. We perform these tests by employing EVT through the application of Eqs. (3.2.6) to (3.2.9b). For this purpose, we require return streams ($x_j$), which are only available for the long-term EDF. Thus, we apply an inverse transform sampling technique to our other PDFs to obtain sampled returns for them. This method, also known as the Smirnov method, entails drawing n random numbers from a uniformly distributed variable U = (u1, u2, ..., un) on the interval [0, 1] and, subsequently, computing $x_j \leftarrow F^{-1}(u_j)$, where F are the CDFs of interest (see Devroye, 1986, p.28). Hence, the Smirnov method simulates returns that follow the distribution of interest by randomly drawing probabilities and mapping them through the inverse CDF.
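The sampling step can be sketched as follows for a CDF that is only available on a grid. The linear interpolation between grid points and the function names are assumptions made for illustration; the tabulated CDF is a toy example.

```python
import bisect
import random


def inverse_cdf(u, grid, cdf):
    """Invert a tabulated, increasing CDF at probability u by linear interpolation."""
    if u <= cdf[0]:
        return grid[0]
    if u >= cdf[-1]:
        return grid[-1]
    i = bisect.bisect_right(cdf, u)  # cdf[i - 1] <= u < cdf[i]
    weight = (u - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
    return grid[i - 1] + weight * (grid[i] - grid[i - 1])


def sample_returns(grid, cdf, n, seed=0):
    """Draw n returns x_j = F^{-1}(u_j), with u_j uniform on [0, 1)."""
    rng = random.Random(seed)
    return [inverse_cdf(rng.random(), grid, cdf) for _ in range(n)]


# Toy CDF on a return grid (hypothetical numbers).
return_grid = [-0.5, -0.2, 0.0, 0.2, 0.5, 1.0]
cdf_values = [0.0, 0.1, 0.45, 0.8, 0.97, 1.0]
draws = sample_returns(return_grid, cdf_values, n=5000)
```

The resulting `draws` can be fed into the Hill estimator in the same way as the realized returns of the EDF.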
Once we obtain returns for all of these PDFs, the next step is to set k, the optimal number of observations used for the estimation of ϕ by Eq. (3.2.6), the Hill estimator. For this purpose, we produce Hill plots for the right tail of our distributions, which depict the relationship between k and ϕ as a curve (see Straetmans et al., 2008). The optimal k is picked by observing the interval in this curve where the value of ϕ stabilizes while k changes. This area suggests a stable trade-off between a good approximation of the tail shape by the Pareto distribution and the uncertainty of such an approximation (due to the use of fewer observations). The interval that corresponds to roughly four to seven percent of observations seems to be a stable region across the Hill plots of the tails of the EDF and the CPT. As an increase in k increases the statistical power of the estimator but may distort the shape of the tail, we set k equal to four percent of observations, as suggested by the Hill plots for the EDF and CPT tails.
We examine whether the tail shape parameter (ϕ), computed via the Hill (1975) estimator,
for the RND and for our subjective density functions (i.e., power, exponential, PCPT, CPT
and ECPT) matches the one for the EDF. The outcomes from the statistical tests performed
to compare tail shape parameters (Eq. (3.2.9a)) are provided in Table 3.1, Panel B. The results suggest that, for the three-month maturity options, the ϕ estimates for the RND, CPT and ECPT (at 0.20) are the closest to the EDF parameter (at 0.29), but they are not statistically equal. The ϕ estimates for the power, exponential, and PCPT density functions do not match the one for the EDF, as they are all around 0.17 and, thus, exhibit fatter tails than the EDF.
We observe that the results for the six- and twelve-month options are very similar to the
ones obtained for the three-month expiry. The parameter estimate ϕ of the EDF is statistically
equal to the RND and CPT. Parameter ϕ ranges from 0.18 to 0.19 for the CPT, ECPT, and
RND for the six- and twelve-month maturities, whereas it is 0.23 for the EDF. The estimate of
ϕ for the RND (0.19 and 0.22 for the six- and twelve-month maturities, respectively) somewhat
matches the one for the EDF at the six-month maturity but it is off at the twelve-month
maturity. The parameter estimates ϕ for the power, exponential, and PCPT density functions
match the EDF’s ϕ at the twelve-month maturity only. Generally, the parameter estimates ϕ
for these subjective density functions are too small in comparison to the one of the EDF. This
means that these six- and twelve-month maturity subjective density functions have fatter tails
than the EDF, the other subjective densities (CPT and ECPT), and the RND. These results
suggest that the shape of the CPT density function is a good match to the shape of realized
tails.
After k is chosen and the shape estimator ϕ for the EDF, RND, power, exponential, PCPT, CPT, and ECPT is computed, extreme quantile returns (EQR) can also be estimated via Eq. (3.2.7). Subsequently, the t-test in Eq. (3.2.9b) is applied using the one, five and ten percent statistical significance levels. This test evaluates whether the EQRs estimated from a pair of distributions (RND, power, exponential, PCPT, and CPT versus EDF) have equal means (the null hypothesis). The results of this test are shown in Table 3.2, Panel A.
Analyzing the density functions derived from the three-month option maturity, we find that the EQR implied by the CPT is the only one that matches the realized EQR, and only at the first quantile, at 21 percent. The EQR implied by the ECPT is almost the same as the one implied by the CPT; thus, it also statistically matches the EDF. By contrast, the EQRs for the RND, power, exponential, and PCPT densities always overshoot the one for the EDF. All comparisons between these distributions’ EQRs and the EDF’s at the three-month maturity reject the null hypothesis that returns at the same quantile are equal. This pattern is observed across all quantiles analyzed, i.e., at the tenth, fifth, and first quantiles. This empirical finding indicates that the equity market upside implied in option markets (i.e., the RND) and by the power, exponential and PCPT densities is always higher than the one realized by the equity market. The results for the PCPT resemble the ones for the RND. The EQRs from the CPT and the ECPT are clearly the best matches for the EDF.
For the six-month maturity, upside returns priced by the RND and ECPT best match the EQR of the EDF. The EQRs for the EDF are roughly 18, 22, and 32 percent for the tenth, fifth, and first quantile of returns, respectively, whereas the EQRs for the ECPT are 19, 21, and 28 percent. For the RND, such extreme upside return estimates are 19, 22, and 30 percent. Thus, the ECPT statistically matches the realized EQR best at the tenth and fifth quantiles, whereas the RND is the best match for the first quantile. No rational subjective density function consistently matches the EQR of the EDF. The power, exponential, and PCPT densities almost always overshoot the EQR of realized returns. By contrast, the CPT density always undershoots the EDF’s extreme returns. Despite always overshooting the EQR of the EDF, the PCPT is the only other subjective density (apart from the ECPT) that has an EQR statistically equal to that of the EDF, which happens only at the first quantile.
In contrast to the three- and six-month maturities, the EQRs from the RND for the twelve-
month maturity all underestimate the EQRs from realized returns. The EQRs of realized
returns are 32, 35, and 44 percent for the tenth, fifth and first quantiles, respectively, whereas
for the RND these are 22, 26, and 37 percent, respectively. The same underestimation is
documented for the densities linked to the CPT (i.e., PCPT, CPT and ECPT), as their tail returns are largely out of sync with realized ones, especially for the CPT, in which the overweighting of tails forces EQRs further away from the EDF ones (vis-a-vis the PCPT EQRs). The EQRs of the exponential densities continue to largely overshoot the ones for the EDF. However, the power utility function density successfully matches the realized EQRs at all quantiles and with strong statistical significance.
Table
3.2:EVT
consistency
testson
tail
retu
rns
PanelA
-Tailsextrem
equantilereturns(EQR)
10%
quantile
5%
quantile
1%
quantile
Maturity
(1)vs.(2)
(1)
(2)EDF
p-value
t-stat
(1)
(2)EDF
p-value
t-stat
(1)
(2)EDF
p-value
t-stat
3RND
vs.EDF
0.16***
0.11
0.0%
-7.7
0.19***
0.13
0.0%
-8.2
0.26***
0.21
0.0%
-5
months
Powervs.EDF
0.21***
0.11
0.0%
-13.9
0.23***
0.13
0.0%
-15.4
0.31***
0.21
0.0%
-10.4
Expovs.EDF
0.21***
0.11
0.0%
-13.9
0.24***
0.13
0.0%
-15.7
0.32***
0.21
0.0%
-11.5
PCPT
vs.EDF
0.18***
0.11
0.0%
-10.8
0.21***
0.13
0.0%
-11.6
0.27***
0.21
0.0%
-7.3
CPT
vs.EDF
0.13***
0.11
0.2%
-3.3
0.15***
0.13
0.7%
-2.9
0.21
0.21
36.3%
0.4
ECPT
vs.EDF
0.15***
0.11
0.0%
-5.4
0.17***
0.13
0.0%
-5.3
0.23*
0.21
8.1%
-1.8
6RND
vs.EDF
0.19
0.18
18.5%
-1.2
0.22
0.22
36.5%
-0.4
0.3*
0.32
8.9%
1.7
months
Powervs.EDF
0.25***
0.18
0.0%
-9.6
0.28***
0.22
0.0%
-10.2
0.36***
0.32
0.0%
-4.3
Expovs.EDF
0.26***
0.18
0.0%
-10.6
0.29***
0.22
0.0%
-11.5
0.37***
0.32
0.0%
-5.5
PCPT
vs.EDF
0.22***
0.18
0.0%
-5.1
0.24***
0.22
0.0%
-4.6
0.32
0.32
36.4%
-0.4
CPT
vs.EDF
0.16***
0.18
0.0%
4.2
0.18***
0.22
0.0%
6.5
0.25***
0.32
0.0%
6.6
ECPT
vs.EDF
0.19
0.18
28.4%
-0.8
0.21
0.22
35.2%
0.5
0.28***
0.32
0.5%
2.9
12
RND
vs.EDF
0.22***
0.32
0.0%
10.5
0.26***
0.35
0.0%
11.5
0.37***
0.44
0.0%
6
months
Powervs.EDF
0.33
0.32
19.8%
-1.2
0.36*
0.35
9.9%
-1.7
0.45*
0.44
9.2%
-1.7
Expovs.EDF
0.34***
0.32
0.2%
-3.2
0.38***
0.35
0.0%
-4
0.47***
0.44
0.1%
-3.5
PCPT
vs.EDF
0.26***
0.32
0.0%
6.4
0.3***
0.35
0.0%
70.39***
0.44
0.0%
3.7
CPT
vs.EDF
0.18***
0.32
0.0%
23.3
0.21***
0.35
0.0%
26.9
0.3***
0.44
0.0%
12.2
ECPT
vs.EDF
0.26***
0.32
0.0%
7.1
0.29***
0.35
0.0%
7.8
0.39***
0.44
0.0%
4.1
PanelB
-Tailsexpected
upsidereturns(EU)
10%
quantile
5%
quantile
1%
quantile
Maturity
(1)vs.(2)
(1)
(2)EDF
p-value
t-stat
(1)
(2)EDF
p-value
t-stat
(1)
(2)EDF
p-value
t-stat
3RND
vs.EDF
0.2***
0.15
0.0%
-5.3
0.23***
0.19
0.0%
-5.1
0.32*
0.3
8.9%
-1.7
months
Powervs.EDF
0.25***
0.15
0.0%
-10.3
0.28***
0.19
0.0%
-10.9
0.37***
0.3
0.0%
-5.9
Expovs.EDF
0.26***
0.15
0.0%
-10.6
0.29***
0.19
0.0%
-11.4
0.39***
0.3
0.0%
-7.2
PCPT
vs.EDF
0.22***
0.15
0.0%
-7.5
0.25***
0.19
0.0%
-7.4
0.33***
0.3
0.6%
-2.9
CPT
vs.EDF
0.16
0.15
18.6%
-1.2
0.19
0.19
39.0%
-0.2
0.26***
0.3
0.1%
3.4
ECPT
vs.EDF
0.18***
0.15
0.5%
-3
0.21**
0.19
3.2%
-2.2
0.28
0.3
13.3%
1.5
6RND
vs.EDF
0.24
0.24
36.9%
0.4
0.27
0.28
10.9%
1.6
0.37***
0.41
0.2%
3.3
months
Powervs.EDF
0.3***
0.24
0.0%
-6.8
0.33***
0.28
0.0%
-6.6
0.43
0.41
14.4%
-1.4
Expovs.EDF
0.31***
0.24
0.0%
-7.9
0.34***
0.28
0.0%
-8
0.45**
0.41
1.4%
-2.6
PCPT
vs.EDF
0.26**
0.24
1.4%
-2.6
0.29
0.28
13.3%
-1.5
0.38*
0.41
5.9%
2
CPT
vs.EDF
0.2***
0.24
0.0%
60.22***
0.28
0.0%
8.8
0.3***
0.41
0.0%
8.1
ECPT
vs.EDF
0.23
0.24
15.3%
1.4
0.26***
0.28
0.2%
3.3
0.35***
0.41
0.0%
4.9
12
RND
vs.EDF
0.28***
0.37
0.0%
7.6
0.33***
0.4
0.0%
7.7
0.46***
0.51
0.7%
2.8
months
Powervs.EDF
0.38
0.37
14.7%
-1.4
0.42*
0.4
5.8%
-2
0.53*
0.51
5.8%
-2
Expovs.EDF
0.4***
0.37
0.2%
-3.2
0.44***
0.4
0.0%
-4
0.55***
0.51
0.1%
-3.4
PCPT
vs.EDF
0.32***
0.37
0.0%
4.8
0.36***
0.4
0.0%
4.9
0.48*
0.51
6.4%
1.9
CPT
vs.EDF
0.23***
0.37
0.0%
19.2
0.27***
0.4
0.0%
21.5
0.38***
0.51
0.0%
9.1
ECPT
vs.EDF
0.31***
0.37
0.0%
5.3
0.35***
0.4
0.0%
5.6
0.47**
0.51
3.0%
2.3
This table reports the results from statistical tests of the extreme quantile return, EQR (in Panel A) and tail expected upside returns, EU (in Panel B), performed according to Eqs. (3.2.7), (3.2.8) and (3.2.9b) applied to averaged density functions. Since the densities compared here are averaged for the RND and for the subjective densities or estimated using our full sample for realized returns, these tests aim to investigate the long-term consistency between the distribution tails. The null hypothesis of these tests is that the EQR and the tail expected upside returns from the distributions being compared have equal means and, therefore, tails are consistent. The rejection of the null hypothesis is tested by t-tests of Eq. (3.2.9b) at the ten, five, and one percent statistical levels, respectively, shown by superscripts *, **, ***, assigned to the EQR or the EU for the RND, displayed in column (1), and for the same statistics for the EDF, shown in column (2).
In line with these results for the EQR, Table 3.2, Panel B, shows that, at the three-month horizon, the expected upside (EU) of the EDF is most closely matched by the CPT and ECPT density functions for the tenth, fifth and first quantiles. The three-month EUs estimated from the realized returns are 15, 19, and 30 percent for these quantiles. The ECPT EUs for the same horizon are 18, 21, and 28 percent, respectively, and the CPT EUs are 16, 19, and 26 percent. Thus, estimates from these two density functions are mostly statistically equal to the realized ones. As in our analysis of the EQR, the EUs of the other subjective densities are much larger than the EDF expected upside at all quantiles. The exponential density has the highest expected upside across the different quantiles and is the furthest away from the realized returns. The RND-implied expected upside is somewhat more conservative and relatively closer to the realized one, but it comes statistically close to it only at the one percent quantile, where equality is rejected only at the ten percent level.
For the six-month maturity, the expected upsides for the CPT and ECPT density functions are no longer that close to each other nor to the realized ones. The EDF expected upside always exceeds the ones for the CPT and ECPT. Only at the tenth quantile does the expected upside of the ECPT density function equal the realized one. The densities that best match the expected upside of the EDF are the PCPT and the RND.
For the twelve-month horizon, the expected upside for the realized returns is 37, 40, and 51 percent for the tenth, fifth and first quantiles. In line with the results from our EQR analysis, the power density again best matches realized EUs, as its estimates are statistically equal to the realized ones (at the five percent level) across all three quantiles. Second best performers are the PCPT and ECPT densities, which match the realized EU at the one percent quantile level.
In summary, across the three EVT tests performed (i.e., on tail shape, EQR and EU), the three option maturities and the three quantiles evaluated, we observe that the success rate of the CPT subjective density functions in matching the EDF tails is 57 percent. In contrast, this success rate is 38 percent for the power utility, 33 percent for the RND and only 10 percent for the exponential utility density function. These results suggest that CPT-related distributions, although not always matching the EQRs and EUs of the EDF, seem to best match the EDF tails, especially at the short maturities. More specifically, the ECPT seems to have some advantage over the other methods for the three- and six-month maturities. This result is not a surprise, because allowing the CPT weighting function to assume different shapes provides extra flexibility to match the data relative to traditional utility functions. Thus, even if our findings suggest that the CPT does not fully explain single stock options pricing, its overweighting of small probabilities goes very far in explaining such market data, with the exception of twelve-month options. These findings reiterate our takeaway from section 3.3.1, in which a positive term structure of overweight of tails appears to play a substantial role: twelve-month options are priced more rationally than shorter-term ones, which seem to be priced as a result of lottery buying by individual investors.
Figure 3.1 compares the CDFs from six of our equity return densities: the EDF, the RND, the CPT, the PCPT, the exponential- and the power-utility density¹⁷. We focus on the right tails of these distributions as we are interested in how closely the RND from call options and the derived subjective density functions match the tails of the EDF.
In Figure 3.1, we see that the tails implied by option prices (RND, in red) are fatter than the
tails from the CPT (in dark blue) and EDF (in green) density functions over the three-month
horizon. The tails for the CPT and the EDF are almost identical above the 120 terminal level, i.e., above a 20 percent return. The right tail of the RND distribution is clearly much fatter than the ones of the CPT and EDF, but it is still thinner than the ones of the PCPT, the exponential- and the power-utility densities. Thus, the upside risk implied from options is much higher than the one realized by the EDF, a sign of potentially biased behavior by investors in such options. This observation is confirmed by the tail shape parameter (ϕ), the
EQRs and the EU estimated across the different quantiles, which in all cases report higher
upside in the RND than in the EDF and the CPT. Figure 3.1 also suggests that the upside
risk of the RND is more consistent with the PCPT density, whereas the CPT tails seem very
distinct from the PCPT, which is in line with our earlier findings.
Panel (b) of Figure 3.1, which depicts the CDFs of our studied densities at the six-month horizon, suggests that the RND and the EDF are closer than at the three-month horizon.
At the same time, the CPT density seems more disconnected from the EDF. This finding
matches our results from the EQR and the expected upside comparisons. The PCPT tail is,
at this horizon, higher than the EDF, CPT, and RND ones and closer to the EDF one than
to the CPT one, especially at its very extreme. This finding is also confirmed by our EQR
and expected upside tests, as the PCPT is statistically equal to the EDF at the one percent
quantile. The exponential and power utility densities have right tails that are much fatter than
the other densities, including the EDF.
Panel (c) of Figure 3.1 shows that, at the twelve-month horizon, the CPT’s CDF tails seem completely disconnected from the EDF. The EDF tails are much fatter than the CPT ones and slightly
fatter than the RND ones. In fact, the RND seems to match the EDF for terminal levels above
120. This finding suggests that long-term options trade in a much less CPT-biased manner
than short-term options.
Overall, Figure 3.1 confirms our hypothesis that end-users of OTM single stock calls are likely biased and behave as if buying lottery tickets when trading short-term options. These
results strengthen the evidence provided by Ilmanen (2012), Barberis (2013), Conrad et al.
(2013), Boyer and Vorkink (2014) and Choy (2015) that investors push single stock options
prices to extreme valuation levels. Investors seem to overweight small probabilities especially
at short-term horizons. Next, we analyze the time-variation in overweight of small probabilities
to better understand the underlying reasons for our findings.
¹⁷ We omit the ECPT for better visualization as its CDFs are very similar to the CPT ones. The similarity is caused by the ECPT left tail weighting function parameter (δ) being the same for the CPT and because the estimated long-term γ for the three maturities are close to the Tversky and Kahneman (1992) one.
(a) Three-month horizon (b) Six-month horizon
(c) Twelve-month horizon
Figure 3.1: Cumulative density functions. This figure shows three plots that depict the cumulative density function (CDF) for equity returns obtained from the empirical density function (EDF), the risk-neutral density (RND), and four subjective density functions: 1) the power utility density, 2) the exponential utility density, 3) the cumulative prospect theory (CPT) density, and 4) the partial CPT (PCPT) density. The equity returns’ CDFs from these six sources are presented for three-, six-, and twelve-month horizons. The plots display the cumulative probabilities on the y-axis and the terminal price levels on the x-axis, given an initial price level of 100.
3.3.3 Estimated CPT time-varying parameters
To investigate time-variation in the CPT’s overweighting of small probabilities in single stock
options, we apply Eq.(3.2.5) to each day in the sample to estimate the empirical γ (weighting
function) parameter. Lower and upper bounds of -0.25 and 1.75 were used in this optimization, as they produced the lowest RSS across all permutations of bounds when γ was optimized using the CPT parameterization. We estimate γ under four different assumptions about λ, the loss
aversion parameter: 1) λ equals 2.25, the CPT parameterization; 2) no loss aversion, λ equals
1; 3) augmented loss aversion, λ equals 3; and 4) optimal λ, as estimated by Eq.(3.2.4).
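To make the role of γ concrete, the minimal sketch below evaluates the Tversky and Kahneman (1992) probability weighting function for gains, w(p) = p^γ / (p^γ + (1 − p)^γ)^(1/γ), at a small tail probability. It is an illustration only and does not reproduce the estimation of Eq. (3.2.5): the function name `tk_weight`, the example probability of one percent and the restriction to γ > 0 are assumptions made here.

```python
def tk_weight(p: float, gamma: float) -> float:
    """Tversky-Kahneman (1992) probability weighting function for gains.

    Assumes 0 <= p <= 1 and gamma > 0 (an assumption of this sketch).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if gamma <= 0.0:
        raise ValueError("gamma must be positive in this sketch")
    numerator = p ** gamma
    denominator = (numerator + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


# Hypothetical tail probability of one percent.
p_tail = 0.01
for gamma in (0.61, 1.0, 1.75):
    # gamma < 1 inflates the small probability, gamma = 1 leaves it unchanged,
    # and gamma > 1 deflates it.
    print(f"gamma={gamma:.2f}  w({p_tail}) = {tk_weight(p_tail, gamma):.4f}")
```

With γ at the CPT value of 0.61 the one percent probability is weighted several times higher than its objective value, whereas values of γ above one shrink it, which is why the share of days with γ < 1 is read as the incidence of overweighting.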
Table 3.3, Panel A reports the statistics when λ equals 2.25. We find that the median and the mean of the time-varying γ estimated from the three-month options (0.91 and 0.89, respectively) are above the CPT value of 0.61 but still reflect overweight of small probabilities. This suggests that overweighting of small probabilities is present within the pricing of three-month call options, as suggested by the theory. The distribution of γ is skewed to the right, and overweight of small probabilities is present on 64 percent of the days at the three-month maturity. The 25th percentile of γ is 0.74, clearly suggesting a less pronounced overweight of small probabilities than suggested by the CPT. The estimates of γ range from 0 to 1.75 (values above one imply an underweighting of small probabilities) and are volatile, with a standard deviation of 0.23.
Table 3.3: Time-varying gamma parameter
Panel A - Gamma with CPT loss aversion (λ = 2.25)
                                                                  % γ < 1  % γ < 1  % γ < 1  % γ < 1
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  (full)   (98-03)  (03-08)  (08-13)  RSS
3 months   -     0.74       0.91    0.89  1.04       1.75  0.23   64%      97%      35%      59%      0.0209
6 months   -     0.81       0.99    0.96  1.14       1.75  0.28   52%      92%      18%      46%      0.017
12 months  0.04  0.91       1.03    1.01  1.14       1.75  0.22   41%      83%      11%      29%      0.0225
Panel B - Gamma with no loss aversion (λ = 1)
                                                                  % γ < 1  % γ < 1  % γ < 1  % γ < 1
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  (full)   (98-03)  (03-08)  (08-13)  RSS
3 months   0.32  0.52       0.66    0.67  0.8        1.27  0.18   97%      100%     94%      97%      0.0253
6 months   0.32  0.55       0.71    0.72  0.87       1.75  0.21   90%      98%      79%      92%      0.0198
12 months  0.29  0.62       0.83    0.8   0.98       1.75  0.22   81%      98%      63%      83%      0.0169
Panel C - Gamma with augmented loss aversion (λ = 3)
                                                                  % γ < 1  % γ < 1  % γ < 1  % γ < 1
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  (full)   (98-03)  (03-08)  (08-13)  RSS
3 months   0.45  0.81       0.96    0.98  1.11       1.75  0.25   58%      93%      27%      53%      0.023
6 months   0.38  0.89       1.02    1.06  1.25       1.75  0.25   45%      83%      13%      38%      0.0196
12 months  0.37  0.98       1.07    1.09  1.19       1.75  0.21   31%      66%      6%       22%      0.0265
Panel D - Gamma with optimized loss aversion
                                                                  % γ < 1  % γ < 1  % γ < 1  % γ < 1
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  (full)   (98-03)  (03-08)  (08-13)  RSS
3 months   0.33  0.52       0.66    0.67  0.81       1.75  0.18   97%      100%     93%      97%      0.0249
6 months   0.37  0.86       1.01    1.02  1.17       1.75  0.24   47%      86%      15%      40%      0.0187
12 months  0.34  0.96       1.05    1.06  1.17       1.75  0.2    35%      73%      8%       24%      0.025
This table reports the summary statistics of the estimated CPT time-varying parameter gamma (γ) from the single stock options market for each day in our full sample across different values of lambda (λ). The parameter λ is the loss aversion parameter and the parameter γ defines the curvature of the weighting function for gains, which leads the probability distortion functions to assume inverse S-shapes. An estimated γ parameter close to unity leads to a weighting function that is close to the unweighted probabilities, whereas values close to zero denote a larger overweighting of small probabilities. The column with heading % γ < 1 reports the percentage of observations in which γ < 1, thus, the proportion of the sample in which overweight of small probabilities is observed. We report this metric for the full sample as well as for three equal-sized splits of our full sample, namely: (98-03), from 1998-01-05 to 2003-01-30; (03-08), from 2003-01-31 to 2008-02-21; and (08-13), from 2008-02-22 to 2013-03-19. Panel A reports the summary statistics of γ when we assume the CPT parameterization, where λ equals 2.25. Panel B reports the summary statistics of γ when we assume the loss aversion parameter λ equals 1 (no loss aversion). Panel C reports the summary statistics of γ when we assume λ equals 3 (augmented loss aversion). Panel D reports the summary statistics of γ when we assume λ to equal its estimated (optimal) values, as reported in Table 3.1, Panel A.
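The % γ < 1 columns can be read as simple frequency counts. The sketch below, which uses the sub-sample dates stated in the note above but entirely hypothetical daily γ values and helper names, shows one way such shares could be computed for the full sample and for each split.

```python
from datetime import date

# Sub-sample boundaries as stated in the table note.
SPLITS = {
    "98-03": (date(1998, 1, 5), date(2003, 1, 30)),
    "03-08": (date(2003, 1, 31), date(2008, 2, 21)),
    "08-13": (date(2008, 2, 22), date(2013, 3, 19)),
}


def share_below_one(gammas):
    """Fraction of gamma estimates below one (overweight of small probabilities)."""
    gammas = list(gammas)
    if not gammas:
        raise ValueError("no observations in this sample")
    return sum(1 for g in gammas if g < 1.0) / len(gammas)


def shares_by_split(observations, splits=SPLITS):
    """Share of days with gamma < 1 for the full sample and each sub-sample."""
    result = {"full": share_below_one(g for _, g in observations)}
    for label, (start, end) in splits.items():
        result[label] = share_below_one(
            g for day, g in observations if start <= day <= end
        )
    return result


# Hypothetical daily estimates (date, gamma); not taken from the thesis data.
daily_gamma = [
    (date(1999, 3, 1), 0.55),
    (date(2001, 6, 15), 0.80),
    (date(2004, 2, 10), 1.10),
    (date(2006, 9, 5), 1.30),
    (date(2007, 1, 22), 0.95),
    (date(2009, 4, 1), 0.90),
    (date(2012, 11, 30), 1.20),
]
shares = shares_by_split(daily_gamma)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```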
Interestingly, when we split the sample in three parts (as shown in Table 3.3), we observe that overweight of small probabilities is most strongly present at the beginning of our sample, on 97 percent of the days from 1998-01-05 to 2003-01-30, but has faded since 2003. During the period from 2003-01-31 to 2008-02-21, underweight of small probabilities is present on 65 percent of the days, whereas this condition is less pervasive from 2008-02-22 until the end of the sample on 2013-03-19. This finding suggests that overpricing of single stock options is sample specific and not structural. Even if sample specific, overweight of small probabilities seems, in general, much less pronounced than implied by the 0.61 parameter offered by the CPT. These results seem to only partially confirm our hypothesis that the CPT can empirically explain the overpricing of OTM single stock call options.
At the six-month maturity, overweighting of small probabilities is less frequent than at the three-month tenor. The median γ for this maturity is 0.99, implying roughly neutral probability weighting. The long-term γ equals 0.81 and is somewhat out of sync with the time-varying estimates. As for the three-month maturity, the distribution of γ is slightly skewed to the right. The 75th percentile of γ equals 1.14 and suggests an underweighting of tail probabilities. However, probability weighting is largely sample dependent: within the overall sample, 52 percent of all observations reflect overweight of small probabilities, but between 1998 and 2003 its occurrence is 92 percent.
Unlike the other maturities, γ estimates for the twelve-month maturity tend towards underweight of tail probabilities. The median γ is 1.03, whereas the mean γ is 1.01. Time variation and sample dependence are present as for the other maturities but, at the twelve-month maturity, the percentage of days with overweight of tails is smaller: 41 percent in the full sample but still 83 percent for the 1998-2003 sample.
In summary, the statistics in Table 3.3, Panel A, indicate that the weighting function parameters γ for the three maturities evaluated are time-varying and sample specific. Overweight of small probabilities holds for the three-month maturity, less convincingly so for the six-month maturity, and not at all for the twelve-month maturity; for the latter two maturities, neutral probability weighting and underweight of tails prevail, respectively.
Because the loss-aversion parameter λ is of high importance in the CPT model, we estimate γ under different λ parameterizations, more specifically, for 1) λ equals 1, 2) λ equals 3, and 3) optimal λ, as estimated from the long-term empirical distribution (see Table 3.1).
We report the summary statistics of the new γ estimates in Panel B of Table 3.3, when we assume λ equals 1. The new median and mean estimates for γ are 0.66 and 0.67 for the three-month maturity, respectively, and, thus, lower than when γ was estimated under the CPT loss aversion calibration (λ=2.25). The 75th percentile of γ also decreases, from 1.04 to 0.80. At the six-month horizon, the difference between γ with λ equals 2.25 and with λ equals 1 is also large. The median γ under the CPT λ is 0.99, whereas it is 0.71 when λ equals 1. The means are 0.96 and 0.72, respectively. At the 75th percentile, γ becomes 0.87 when λ equals 1. For the twelve-month maturity, we observe a similar effect. The median γ is 0.83 when λ equals 1, whereas it is 1.03 when λ equals 2.25. In brief, a lower loss aversion parameter consistently gives rise to lower γ estimates, across the different options’ maturities and quantiles. The opposite effect is observed when λ is increased from 2.25 to 3, as shown by Table 3.3, Panel C. The median and mean γ when λ equals 3 become 0.96 and 0.98 for the three-month maturity, in comparison to 0.91 and 0.89 when λ equals 2.25. Such a rise in the central tendency of γ estimates is also observed within the six- and twelve-month maturities and across the 25 and 75 percent quantiles.
Table 3.3, Panel D, which reports γ estimates when optimized λ parameters are used, shows distinct results for the three-month maturity versus the six- and twelve-month maturities. For the three-month maturity, we observe a downward shift in γ estimates, whereas for the six- and twelve-month maturities, an upward movement in estimates occurs. However, this seemingly opposite effect is, in fact, qualitatively equal to the result just described when we use λ as 1 or 3, as the optimal λ parameters estimated for the three-, six-, and twelve-month maturities are, respectively, 1.02, 2.66 and 3.00 (i.e., λ decreases for the three-month maturity and increases for the six- and twelve-month maturities vis-a-vis the CPT parameterization).
The reason why a lower (higher) loss aversion gives rise to a decreased (increased) γ is that it increases (decreases) the probability on the left side of the distribution, influencing the probabilities and the shape of the right side of the CPT distribution. High values of λ push the CPT density to have more probability on the right side of the distribution, which is spread proportionally to the probabilities originally observed in the right-side bins (i.e., creating a bump in the center-right side of the distribution), all else equal. Thus, the impact of such a probability shift fades as the tail approaches. Nevertheless, the right tail of the CPT density does turn fatter (and the γ parameter higher) as λ is made higher. The opposite occurs if low values of λ are assumed: the right tail of the CPT density becomes thinner, causing γ estimates to be low (a lower γ more forcefully turns the RND right tail into such a thin CPT tail). One important finding from our experimentation with different λ parameters is that the time variation observed when λ equals 2.25 is unchanged. The standard deviation and range of γ estimates are roughly the same across the different λ values. However, the percentage of days on which overweight of tails is observed in the different samples changes dramatically, towards a more frequent presence of overweight of small probabilities when low levels of λ are used (and vice versa). The large difference in the presence of overweight of small probabilities across samples remains.
We interpret our finding that γ is strongly time-varying and sample dependent across all maturities and under different λ assumptions as strong evidence that single stock options are not overvalued due to a structural skewness preference, as Barberis (2013) may suggest. We reckon that, if static skewness preferences drove overweight of small probabilities, the parameter γ would be relatively stable throughout our sample. Given that γ is highly volatile, we support the view that investors experience (time-varying) “bias in beliefs” or, alternatively, time-varying preferences (see Barberis, 2013)¹⁸. Our results are in line with Green and Hwang (2011), Chen et al. (2015) and Jiao (2016), who report similar time-varying effects in the overpricing, skewness effects and returns for IPOs and lottery-like stocks. These papers also report that, beyond time-varying effects, stronger skewness preferences are associated with higher participation of individual investors (trading in IPOs, trading around earnings announcements and owning stocks) at the expense of institutional investors.
¹⁸ Barberis (2013) distinguishes investors’ time-varying beliefs from skewness preferences as he argues that investors with biased beliefs mistakenly overestimate tail events, whereas preference for skewness leads to overweight of tails, which is less likely to be a mistake. As an example, the author suggests that investors that overweight small probability events correctly anticipate the distribution of a stock’s future returns but overweight the state of the world in which a stock turns out to be “the next Google”. In the example, overestimation of tail events would occur when the investor attributes a higher chance to the stock being the next Google. As we do not attempt to distinguish between biased beliefs and time-varying preferences, we use the term overweight of small probabilities throughout this chapter.
3.3.4 Time variation in probability weighting parameter and investors’ sentiment
As observed in section 3.3.3, the probability weighting parameter γ is clearly time-varying. In the following, we investigate which factors may explain this time-variation of γ. Our main hypothesis is that it is linked to investor sentiment. The link between sentiment and overweighting of small probabilities or lottery buying in OTM single stock calls originates from the fact that individual investors are highly influenced by market sentiment and attention-grabbing stocks (Barberis et al., 1998; Barber and Odean, 2008), and that OTM single stock calls trading is speculative in nature and mostly done by individual investors (Lakonishok et al., 2007). For instance, Lakonishok et al. (2007) argue that the IT bubble of 2000, a period of high variation of γ, is linked to elevated investor sentiment, when the least sophisticated investors were the ones most inclined to purchase calls on growth and IT stocks. Figure 3.2 depicts time-varying γ’s and the Baker and Wurgler (2007) sentiment factor. It provides evidence that these measures move in tandem at times. For example, during the IT bubble, the level of γ seems quite connected with the level of sentiment, especially for the three- and six-month options.
Figure 3.2: Time varying gamma parameter in CPT. This figure depicts the time-varying nature of the gamma (γ) parameter from three-, six-, and twelve-month single stock options estimated using the CPT parameterization as well as the sentiment factor of Baker and Wurgler (2007).
To formally test our hypothesis that time variation of γ is linked to investor sentiment, we design a regression model. In Eq. (3.3.1), the explained variables are γ for the three-, six-, and twelve-month horizons, and the explanatory variables are the Baker and Wurgler (2007) sentiment measure¹⁹; the percentage of bullish investors minus the percentage of bearish investors given by the survey of the American Association of Individual Investors (AAII), used as a proxy for individual investor sentiment by Han (2008); and a set of control variables among the ones tested by Welch and Goyal (2008)²⁰ as potential forecasters of the equity market. The data frequency used in the regression is monthly, as this is the highest frequency available from the sentiment data and from the Welch and Goyal (2008) data set²¹. Our regression sample starts in January 1998 and ends in February 2013²². Our OLS regression model is specified as follows:
\[
\begin{aligned}
\gamma_t = {} & c + \psi_1 \cdot \mathit{Sent}_t + \psi_2 \cdot \mathit{IISent}_t + \psi_3 \cdot \mathit{E12}_t + \psi_4 \cdot \mathit{B/m}_t + \psi_5 \cdot \mathit{Ntis}_t \\
& + \psi_6 \cdot \mathit{Rfree}_t + \psi_7 \cdot \mathit{Infl}_t + \psi_8 \cdot \mathit{Corpr}_t + \psi_9 \cdot \mathit{Svar}_t + \psi_{10} \cdot \mathit{CSP}_t + \varepsilon_t,
\end{aligned}
\tag{3.3.1}
\]
where Sent is the Baker and Wurgler (2007) sentiment measure, IISent is the AAII individual investor sentiment measure, E12 is the twelve-month moving sum of earnings on the S&P 500 index, B/m is the book-to-market ratio, Ntis is the net equity expansion, Rfree is the risk-free rate, Infl is the annual inflation rate, Corpr is the corporate spread, Svar is the stock market variance, and CSP is the cross-sectional premium.
¹⁹ Available at http://people.stern.nyu.edu/jwurgler/.
Additionally, we run univariate models for each explanatory factor to understand the individual relation between γ and the control variables:
\[
\gamma_t = c_i + \psi_i \cdot x_{i,t} + \varepsilon_t, \tag{3.3.2}
\]
where x_i denotes the i-th of the n explanatory variables specified above, for i = 1, ..., n.
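As an illustration of the univariate specification in Eq. (3.3.2), the sketch below first averages daily γ estimates within each calendar month, since the regressors are monthly, and then fits a one-regressor OLS line in closed form. It is a minimal sketch under stated assumptions: the helper names, the toy γ values and the toy sentiment series are hypothetical, and the code is not the estimation routine used for Table 3.4.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean


def monthly_mean(daily):
    """Average daily (date, value) observations within each calendar month."""
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[(day.year, day.month)].append(value)
    return {month: fmean(values) for month, values in sorted(buckets.items())}


def ols_univariate(x, y):
    """Return (intercept, slope, r_squared) for y = c + psi * x + eps."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length of at least two")
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("regressor has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ssr / sst if sst > 0.0 else 0.0
    return intercept, slope, r_squared


# Hypothetical daily gamma estimates and monthly sentiment readings.
daily_gamma = [
    (date(2000, 1, 3), 0.80), (date(2000, 1, 4), 0.90),
    (date(2000, 2, 1), 1.00), (date(2000, 2, 2), 1.10),
    (date(2000, 3, 1), 0.95),
]
sentiment = {(2000, 1): 1.5, (2000, 2): -0.5, (2000, 3): 0.5}

gamma_monthly = monthly_mean(daily_gamma)
months = sorted(set(gamma_monthly) & set(sentiment))
c, psi, r2 = ols_univariate(
    [sentiment[m] for m in months], [gamma_monthly[m] for m in months]
)
print(f"intercept={c:.3f}, slope={psi:.3f}, R2={r2:.2f}")
```

In this toy example the slope is negative by construction, mirroring the sign one would expect if higher sentiment goes together with lower γ; the magnitudes carry no empirical meaning.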
Table 3.4, Panel A presents the estimates of Eq. (3.3.1). We note the high explanatory power of the multivariate regression, ranging from 67 to 71 percent. As expected, we observe that the coefficient on Sent is consistently negative and statistically significant across the three different horizons studied. On average, a one-unit increase in Sent is associated with a decrease of roughly 0.1 in γ, all else being equal. The univariate regressions on Sent confirm the negative link between sentiment and γ. For all option maturities, a negative relation between the Baker and Wurgler (2007) sentiment measure and γ is found. The explanatory power of the variable Sent in the univariate setting is also high, between 22 and 29 percent. These findings altogether support our hypothesis that overweighting of small probabilities increases at higher levels of sentiment and that sentiment strongly impacts the probability weighting bias of call option investors.
In contrast with the variable Sent, the coefficients for the individual investor sentiment (IISent) are not statistically significant in either the multivariate or the univariate setting; they are positive at the three-month maturity but not consistently so at the longer maturities (see Table 3.4). The univariate regressions of γ on IISent have rather low explanatory power. The positive relationship between IISent and γ at the three-month maturity may be attributed to potential capitulations in individual investor sentiment, as this indicator is strongly mean-reverting.
²⁰ The complete set and description of variables suggested by Welch and Goyal (2008) is provided in Appendix 3.C. From the complete set of variables used by Welch and Goyal (2008), we select a smaller set using the cross-correlation between them to avoid multicollinearity in our regression analysis. Because we run a multivariate model, using the full set of variables is undesirable as some of them correlate 80 percent with each other. We exclude variables that correlate more than 40 percent with each other.
²¹ Given the fact that γ is estimated on a daily basis, we average γ throughout each month.
²² This sample is only possible because Welch and Goyal (2008) and Baker and Wurgler (2007) have updated and made available their datasets after publication.
Table 3.4: Regression results: CPT parametrization
Panel A - Multivariate analysis
Maturity   3m          6m          12m
Intercept  0.564***    0.646***    0.769***
           (0.068)     (0.090)     (0.074)
Sent       -0.078***   -0.121***   -0.114***
           (0.024)     (0.032)     (0.024)
AAIISent   0.094       0.030       -0.003
           (0.059)     (0.085)     (0.058)
E12        0.049***    0.052***    0.044***
           (0.008)     (0.010)     (0.011)
B/m        0.617**     0.698*      0.387
           (0.268)     (0.344)     (0.331)
Ntis       0.357       0.201†      -0.869
           (0.553)     (0.676)     (0.567)
Rfree      -0.022**    -0.030**    -0.016
           (0.010)     (0.014)     (0.011)
Infl       -0.580†     -3.111†     0.370†
           (2.539)     (3.493)     (2.916)
Corpr      0.100†      -0.174†     -0.037†
           (0.275)     (0.373)     (0.349)
Svar       -9.762***   -11.792***  -8.679**
           (3.179)     (4.156)     (3.686)
CSP        -0.219†     -0.219†     -0.116†
           (0.197)     (0.233)     (0.238)
R2         71%         68%         67%
F-stat     36.5        31.7        30.2
AIC        -247.1      -166.0      -233.7
BIC        -213.4      -132.3      -200.0
Panel B - Univariate analysis
Regressor  Maturity  Intercept  (s.e.)   Coefficient  (s.e.)   R2   F-stat  AIC     BIC
Sent       3m        0.891***   (0.020)  -0.145***    (0.020)  22%  43.7    -326.1  -320.0
Sent       6m        0.969***   (0.023)  -0.206***    (0.028)  29%  64.3    -186.0  -179.9
Sent       12m       1.016***   (0.019)  -0.157***    (0.020)  27%  57.7    34.1    40.3
AAIISent   3m        0.864***   (0.022)  0.031        (0.083)  0%   0.2     0.0     0.1
AAIISent   6m        0.939***   (0.026)  -0.040       (0.107)  0%   0.2     0.0     0.1
AAIISent   12m       0.996***   (0.022)  -0.067       (0.079)  1%   0.8     0.0     0.1
E12        3m        0.573***   (0.031)  0.059***     (0.005)  37%  90.3    0.0     0.0
B/m        3m        0.513***   (0.061)  1.428***     (0.266)  31%  70.0    0.0     0.2
Ntis       3m        0.865***   (0.022)  0.617        (0.811)  0%   0.7     0.0     0.7
Rfree      3m        0.917***   (0.031)  -0.022       (0.013)  3%   5.4     0.0     0.0
Infl       3m        0.855***   (0.023)  5.867        (4.778)  1%   2.2     0.0     3.9
Corpr      3m        0.869***   (0.022)  -0.314       (0.554)  0%   0.3     0.0     0.5
Svar       3m        0.917***   (0.026)  -13.280***   (4.835)  18%  33.9    0.0     2.3
CSP        3m        0.864***   (0.021)  0.626*       (0.313)  2%   3.8     0.0     0.3
The eight Welch and Goyal (2008) control variables used in our multivariate regression are linked to γ in very distinct manners. First, it is fair to say that they add substantial explanatory power to our multivariate regressions. The three-, six-, and twelve-month multivariate models explain, respectively, 71, 68, and 67 percent of the variation in γ. Most of these relations are stable, as the coefficient signs change only rarely across maturities. The control variables that are statistically significant in our multivariate setting are E12, B/m, Rfree, and Svar (Table 3.4). We observe that γ is positively linked to E12, the twelve-month moving sum of earnings on the S&P 500 index, as well as to B/m, the book-to-market ratio, in both multivariate and univariate regressions. The positive relation between E12, B/m and γ could be explained by mean-reversion of earnings and valuation being linked to a greater overweighting of small probabilities, which could be justified by higher investor sentiment outweighing earnings downgrades and rising valuations in a rallying market. These two variables have high explanatory power for γ in the univariate setting, respectively 37 and 31 percent at the three-month horizon. The significance of Rfree is, however, somewhat unstable: it is significant in the multivariate regressions at the three- and six-month maturities but not in the univariate regression. Further, the stock market variance, Svar, is negatively linked to γ. Apparently, the riskier the environment, the higher the overweighting of small probabilities. In the univariate setting (at the three-month horizon), Svar alone explains 18 percent of γ, which is relatively high. Table 3.4, Panel B indicates that the cross-sectional premium CSP is positive and statistically significant in the univariate setting for the three-month horizon, despite being negative and not significant in the multivariate regressions.
To corroborate our results, we also apply the Least Absolute Shrinkage and Selection Operator (Lasso) methodology to our main multivariate regressions (see Tibshirani, 1996, and Appendix 3.B.1). We apply the Lasso to select the regressors that are most relevant for the overall fit of γ by our sentiment and control variables. The coefficients that shrink to zero via the Lasso are identified in Table 3.4 (Panel A) with a dagger (†). Model selection via the Lasso confirms that Sent and IISent are more relevant for the overall fit of γ than some of the fundamental factors used, namely, Ntis, Infl, Corpr and CSP.
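To illustrate how the Lasso sets weak coefficients exactly to zero, the sketch below implements a plain coordinate-descent solver for the standard Lasso objective, (1/(2n)) Σ (y_i − c − x_i'b)² + α Σ |b_j|, in pure Python. It is a minimal sketch under stated assumptions and not the procedure of Appendix 3.B.1: the function names, the penalty level α = 0.1 and the toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimise (1/(2n)) * sum((y - c - X b)^2) + alpha * sum(|b|).

    X is a list of rows and y a list of targets.
    Returns (intercept, coefficients).
    """
    n, p = len(y), len(X[0])
    # Centre the data so the unpenalised intercept can be recovered afterwards.
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residuals when all coefficients are zero
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the partial residual excluding b_j.
            rho = sum(
                Xc[i][j] * (resid[i] + Xc[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, alpha) / col_sq[j]
            step = new_beta - beta[j]
            if step != 0.0:
                resid = [resid[i] - Xc[i][j] * step for i in range(n)]
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_means))
    return intercept, beta


# Hypothetical toy data: y depends strongly on the first column and only
# marginally on the second one.
X = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
y = [1.0 + 2.0 * a + 0.05 * b for a, b in X]
intercept, beta = lasso_coordinate_descent(X, y, alpha=0.1)
print(intercept, beta)
```

In this toy setting the coefficient on the weak regressor is shrunk exactly to zero while the strong one is only slightly reduced, which is the selection behavior that the daggers in Table 3.4 summarize.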
The results provided by our OLS regression and by the Lasso indicate that supportive
fundamental data for equity markets do not necessarily intensify biased behavior of single stock
call option investors. This is an interesting takeaway, especially considering the notion that
sentiment does appear to affect such behavior: single stock option investors seem to overweight
small probabilities when sentiment is exuberant, not necessarily when stock fundamentals are
exuberant.
More importantly, these results support our earlier findings that overweight of small proba-
bilities is strongly time-varying and linked to sentiment. Therefore, overweight of small proba-
bilities is unlikely to result from (static) investor preferences but from investors’ bias-in-beliefs
or time-varying preferences, which seem conditional on sentiment levels. Furthermore, we also
run our regression models (Eqs. (3.3.1) and (3.3.2)) using different assumptions about the value
of λ, the loss aversion parameter. In this exercise we set λ to imply 1) no loss aversion (λ=1),
2) augmented loss aversion (λ=3) and 3) optimal loss aversion, where λ assumes the estimated
value by Eq. (3.2.4) and reported in Table 3.1, Panel A.
Table 3.5: Regression results: alternative loss aversion parameterization

Panel A - Multivariate analysis
           λ = 1                            λ = 3                            Optimum λ
Maturity   3m         6m         12m        3m         6m         12m        3m         6m         12m
Intercept  0.348***   0.347***   0.413***   0.576***   0.661***   0.798***   0.354***   0.663***   0.817***
           (0.056)    (0.07)     (0.065)    (0.084)    (0.094)    (0.074)    (0.056)    (0.085)    (0.069)
Sent       -0.050***  -0.061***  -0.050**   -0.077***  -0.094***  -0.091***  -0.051***  -0.089***  -0.093***
           (0.018)    (0.022)    (0.02)     (0.027)    (0.03)     (0.022)    (0.018)    (0.027)    (0.021)
AAIISent   0.087*     0.082      0.110**    0.108      0.092      -0.008     0.086*     0.074      -0.003
           (0.05)     (0.061)    (0.049)    (0.074)    (0.083)    (0.057)    (0.05)     (0.076)    (0.054)
E12        0.043***   0.045***   0.054***   0.049***   0.043***   0.036***   0.043***   0.042***   0.036***
           (0.007)    (0.008)    (0.008)    (0.01)     (0.01)     (0.01)     (0.007)    (0.01)     (0.009)
B/m        0.563**    0.754***   0.647**    0.795**    0.885**    0.527*     0.571**    0.823**    0.43
           (0.227)    (0.267)    (0.269)    (0.334)    (0.377)    (0.316)    (0.227)    (0.336)    (0.292)
Ntis       0.258      0.029      -0.357     0.266      0.057†     -0.884†    0.253      0.223†     -0.900*†
           (0.486)    (0.593)    (0.556)    (0.686)    (0.705)    (0.558)    (0.486)    (0.643)    (0.525)
Rfree      -0.013*    -0.011     -0.021**   -0.01      -0.008     0.008      -0.013*    -0.014     -0.003
           (0.008)    (0.01)     (0.009)    (0.012)    (0.014)    (0.011)    (0.008)    (0.012)    (0.01)
Infl       -0.558†    -1.626†    2.157†     0.468†     -0.3†      0.941†     -0.59†     -1.034†    0.259†
           (2.029)    (2.678)    (2.313)    (3.265)    (3.836)    (2.981)    (2.044)    (3.496)    (2.72)
Corpr      0.086†     -0.038†    0.097†     0.094†     -0.112†    -0.153†    0.089†     -0.168†    -0.133†
           (0.206)    (0.263)    (0.262)    (0.319)    (0.38)     (0.323)    (0.208)    (0.347)    (0.312)
Svar       -6.371**   -7.192**   -5.469*    -8.742**   -9.835**   -8.126**   -6.634**   -9.520**   -8.018**
           (2.533)    (3.188)    (2.899)    (3.484)    (3.845)    (3.373)    (2.542)    (3.681)    (3.142)
CSP        -0.076     -0.099†    -0.024†    -0.186†    -0.196†    -0.059†    -0.073     -0.152†    -0.066†
           (0.153)    (0.182)    (0.178)    (0.22)     (0.235)    (0.196)    (0.154)    (0.215)    (0.189)
R2         70%        68%        73%        63%        62%        64%        70%        65%        65%
F-stats    35         32         40         25.3       24         25.7       34.9       26.7       27.6
AIC        -309.4     -254.6     -278.5     -184.8     -166.5     -247.6     -307.1     -193.4     -268.4
BIC        -275.7     -220.9     -244.8     -151.1     -132.9     -213.9     -273.4     -159.7     -234.8

Panel B - Univariate analysis
           λ = 1                            λ = 3                            Optimum λ
Maturity   3m         6m         12m        3m         6m         12m        3m         6m         12m
Intercept  0.662***   0.722***   0.797***   0.980***   1.055***   1.095***   0.668***   1.023***   1.065***
           (0.017)    (0.019)    (0.02)     (0.022)    (0.023)    (0.018)    (0.017)    (0.021)    (0.017)
Sent       -0.105***  -0.126***  -0.117***  -0.141***  -0.163***  -0.117***  -0.106***  -0.162***  -0.125***
           (0.015)    (0.017)    (0.017)    (0.021)    (0.023)    (0.016)    (0.015)    (0.022)    (0.016)
R2         18%        19%        16%        18%        22%        18%        18%        24%        23%
F-stats    33.2       36.6       30.4       34.3       44.2       34.9       33.6       49.6       45.4
AIC        -326.1     -186       34.1       34.1       34.1       34.1       34.1       34.1       34.1
BIC        -320       -179.9     40.3       40.3       40.3       40.3       40.3       40.3       40.3

This table reports the regression results for Eq. (3.3.1), in a multivariate setting, in Panel A and for Eq. (3.3.2), in a univariate setting, in Panel B. Across columns, the parameterization of lambda (λ) differs, so Eqs. (3.3.1) and (3.3.2) are run under the assumption that 1) loss aversion is absent (λ = 1); 2) loss aversion is augmented (λ = 3) and 3) λ is optimal (as given by Table 3.1, Panel A). The dependent variable in the regression of Eq. (3.3.1) is gamma (γ), where the explanatory variables are 1) the Baker and Wurgler (2007) sentiment measure; 2) the AAII individual investor sentiment measure and 3) the explanatory variables used by Welch and Goyal (2008), excluding the factors that correlate with each other in excess of 40 percent. The regressors identified with a dagger (†) are the ones shrunk to zero by the application of the Lasso. Panel B reports the regression results for Eq. (3.3.2), which regresses γ on the same explanatory variables mentioned before in the univariate setting.
Table 3.5 indicates that the results for Sent are similar to the ones obtained in our main
regressions: Sent is negatively linked to γ and statistically significant at all horizons but with
less statistical significance, explanatory power and magnitude at the twelve-month horizon.
This result applies to the multivariate regression model only. Across all options maturities,
the Sent coefficients become larger when λ equals 3 and they shrink when λ equals 1. The
observed relation between changes in λ and the Sent coefficient is intuitive. We argue that as λ
increases, the probabilities on the left side of the CPT distribution increase, favoring a thinner
tail on the right side of the PCPT distribution, which, in turn, requires less tail overweighting
(i.e., a higher γ) for the PCPT to match the EDF. As an increase in λ produces a higher γ, the
coefficient linking γ to the sentiment factor also increases in magnitude.
The explanatory power of these regressions is, once again, high, as R2 ranges from 62 to
73 percent in the multivariate models. The explanatory power of Sent ranges from 16 to 24
percent in the univariate setting. Table 3.5 reiterates the relation between IISent, the AAII
individual investor sentiment measure, the Welch and Goyal (2008) control variables and γ in
our main regressions. IISent is rarely significantly linked to γ. The control variables that are
robustly linked to γ in our main regression (E12, B/m and Svar) remain strongly connected
to it within these auxiliary regressions. Applying the Lasso model selection technique to these
regressions gives analogous results. Sent, IISent, E12, B/m, Svar and
Rfree always survive the Lasso variable selection procedure, whereas the Ntis, Infl, Corpr and
CSP coefficients often shrink to zero (as in our main regression, these coefficients are identified
with a dagger (†) in Table 3.5, Panel A).
The robustness of the relation between γ and Sent suggests that changes in the overweight-
ing of tails are not conditional on the level of the loss aversion parameter. In other words, levels
of loss aversion do not drive investors to overweight upside tail events, as one could hypothesize
when associating upside speculation with a state of low loss aversion. Thus, our results suggest
that overweighting of small probabilities is a phenomenon stably linked to sentiment, rather
than positive fundamentals or loss aversion levels. Our results tie in closely with the findings of
Green and Hwang (2011), who investigate the relation between IPOs’ expected skewness and
returns. They find that the skewness effect is stronger during periods of high investor sentiment.
Along the same lines, Chen et al. (2015) conclude that when gambling sentiment is high,
stocks with lottery-like characteristics earn positive abnormal returns in the short run, followed
by underperformance in the long run.
3.4 Robustness tests
3.4.1 Kupiec’s test for tail comparison
We employ Kupiec’s (1995) test to compare the tails of the EDF with the ones of the subjective
density functions and of the RND as a robustness test to the EVT methods applied. Kupiec’s
test was originally designed to evaluate the accuracy of Value-at-risk (VaR) models, where the
estimated VaRs were compared with realized ones. Because the VaR is no different from the
EQR on the downside, i.e., the $q_{p}^{-}$ statistic, we can also make use of Kupiec’s method to test the
accuracy of the $q_{p}^{+}$ statistic for subjective densities and the RND in matching realized EQRs.
Kupiec’s method computes a proportion of failure (POF) statistic that evaluates how often a
VaR level is violated over a specified time span. Thus, if the number of realized violations
is significantly higher than the number of violations implied by the level of confidence of the
VaR, then such a risk model or consistency of tails is challenged. Kupiec’s POF test, which is
designed as a log-likelihood ratio test, is defined as:
$$LR_{POF} = -2\log\left[(1-p^{*})^{n-v}(p^{*})^{v}\right] + 2\log\left[\left(1-\frac{v}{n}\right)^{n-v}\left(\frac{v}{n}\right)^{v}\right] \sim \chi^{2}(1), \tag{3.4.1}$$
where p∗ is the POF under the null hypothesis, n is the sample size, and v is the number of
violations in the sample. The null hypothesis of this test is v/n = p∗, i.e., the realized probability
of failure matches the predicted one. Thus, if the LR statistic exceeds the critical value,
χ2(1) = 3.841, the null hypothesis is rejected at the five percent level. In our empirical problem,
p∗ equals the assumed probability that the EQR of the subjective and risk-neutral densities will
violate the EQR of the realized returns, whereas v/n is the realized frequency of violations.
Because we apply Kupiec’s test to upside returns, violations mean that returns are higher than a
positive threshold.
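As a minimal sketch of Eq. (3.4.1), the function below computes the LR statistic and its p-value under the χ2(1) distribution. The function name `kupiec_pof` and the sample counts in the usage lines are hypothetical and are not taken from our data set; the p-value uses the identity that the upper tail of a χ2(1) variable at x equals erfc(sqrt(x/2)).

```python
import math


def _xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_pof(n, v, p_star):
    """Kupiec (1995) proportion-of-failures test.

    n: sample size, v: number of violations, p_star: POF under the null (0 < p_star < 1).
    Returns the LR statistic and its p-value under the chi-square(1) distribution.
    """
    pi_hat = v / n  # realized frequency of violations
    log_l0 = _xlogy(n - v, 1.0 - p_star) + _xlogy(v, p_star)  # likelihood under the null
    log_l1 = _xlogy(n - v, 1.0 - pi_hat) + _xlogy(v, pi_hat)  # likelihood at v/n
    lr = max(-2.0 * (log_l0 - log_l1), 0.0)  # guard against tiny negative rounding
    p_value = math.erfc(math.sqrt(lr / 2.0))  # P(chi2(1) > lr)
    return lr, p_value


# Hypothetical counts: 400 observations, p* = 10 percent.
lr_match, p_match = kupiec_pof(400, 40, 0.10)  # realized frequency equals p*
lr_low, p_low = kupiec_pof(400, 20, 0.10)      # realized frequency of 5 percent
print(lr_match, p_match)
print(lr_low, p_low, lr_low > 3.841)
```

When the realized frequency equals p∗ the statistic is zero and the null is not rejected; in the second hypothetical case the statistic exceeds 3.841, so the null is rejected at the five percent level.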
The first step in applying Kupiec’s test to our data set is to set the expected percentage
of failure (p∗) between the EQR from the EDF and from the subjective and risk-neutral densities.
We set p∗ to five and ten percent. These percentages can be seen as the expected
frequency with which the tails of the subjective and RND distributions overstate the tails of
the distribution of the realized returns. As a fatter tail is a symptom of overweighting of
small probabilities, we expect that densities that do not adjust for the CPT weighting function
will deliver a higher frequency of failures than the CPT density function. The results of Kupiec’s
test are reported in Table 3.6.
Panel A in Table 3.6 suggests that the probability of failure for the RND, power, exponential,
and PCPT densities is particularly high at the three-month horizon, with more than 99 percent
for the EQR at 90 and 95 percent and for p∗ equal to five and ten percent. These densities often
contain fatter tails than the EDF. For the CPT density, the POF is much lower across the two
values of p∗ used and the 90 and 95 percent EQR. The POF for the 90 percent EQR is roughly
58 percent for the CPT, irrespective of p∗. At the 95 percent EQR, the POF is 46 percent
for the CPT. These findings suggest that at the 90 and 95 percent EQR, the CPT densities
overstate the EDF tails less frequently than the other densities. The violations of the EDF tails
are, however, still significant, as they occur between 46 and 58 percent of the time. Nevertheless,
when we analyze the 99 percent EQR, we find that the POF for all densities decreases considerably
and, for the CPT, it becomes 16 percent.
Panel B of Table 3.6 depicts a very similar pattern of the POF for the probability densities
derived from the six-month options as we find for the three-month options. The POF is very
close to 100 percent for all densities apart from the CPT at the 90 and 95 percent EQR, while
at the 99 percent EQR violations fall substantially, even more than what we observed for the
three-month options. Nevertheless, the CPT remains the best approximation for the EDF, as
its POF is the lowest. Kupiec’s test suggests that, at the 99 percent EQR, the CPT density is
statistically indistinguishable from the EDF when p∗ is five percent, whereas the RND also
approximates the empirical returns when p∗ is ten percent. Otherwise, the results for p∗ of five
and ten percent are very similar.
Table 3.6: Robustness checks: Kupiec’s test

Panel A - Three-month calls
              EQR 90%                    EQR 95%                    EQR 99%
p∗ = 10%      POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    99.9%    0.0000   ∞        99.2%    0.0000   ∞        50.5%    0.0000   414.8
Power vs EDF  100.0%   0.0000   ∞        100.0%   0.0000   ∞        84.7%    0.0000   ∞
Expo vs EDF   100.0%   0.0000   ∞        100.0%   0.0000   ∞        86.8%    0.0000   ∞
PCPT vs EDF   100.0%   0.0000   ∞        100.0%   0.0000   ∞        67.2%    0.0000   752.0
CPT vs EDF    58.2%    0.0000   559.6    45.7%    0.0000   333.3    16.0%    0.0002   13.6
p∗ = 5%       POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    99.9%    0.0000   ∞        99.2%    0.0000   ∞        50.5%    0.0000   671.5
Power vs EDF  100.0%   0.0000   ∞        100.0%   0.0000   ∞        84.7%    0.0000   ∞
Expo vs EDF   100.0%   0.0000   ∞        100.0%   0.0000   ∞        86.8%    0.0000   ∞
PCPT vs EDF   100.0%   0.0000   ∞        100.0%   0.0000   ∞        67.2%    0.0000   ∞
CPT vs EDF    58.2%    0.0000   861.9    45.7%    0.0000   561.3    16.0%    0.0000   65.5

Panel B - Six-month calls
              EQR 90%                    EQR 95%                    EQR 99%
p∗ = 10%      POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    99.9%    0.0000   ∞        93.3%    0.0000   ∞        13.8%    0.0160   5.8
Power vs EDF  99.9%    0.0000   ∞        97.7%    0.0000   ∞        22.1%    0.0000   49.7
Expo vs EDF   99.9%    0.0000   ∞        97.8%    0.0000   ∞        23.0%    0.0000   56.4
PCPT vs EDF   99.9%    0.0000   ∞        97.3%    0.0000   ∞        17.0%    0.0000   18.2
CPT vs EDF    62.4%    0.0000   647.0    36.3%    0.0000   197.3    5.7%     0.0019   9.6
p∗ = 5%       POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    99.9%    0.0000   ∞        93.3%    0.0000   ∞        13.8%    0.0000   44.8
Power vs EDF  99.9%    0.0000   ∞        97.7%    0.0000   ∞        22.1%    0.0000   137.7
Expo vs EDF   99.9%    0.0000   ∞        97.8%    0.0000   ∞        23.0%    0.0000   149.7
PCPT vs EDF   99.9%    0.0000   ∞        97.3%    0.0000   ∞        17.0%    0.0000   76.0
CPT vs EDF    62.4%    0.0000   ∞        36.3%    0.0000   369.9    5.7%     0.5474   0.4

Panel C - Twelve-month calls
              EQR 90%                    EQR 95%                    EQR 99%
p∗ = 10%      POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    62.8%    0.0000   655.1    25.0%    0.0000   72.9     20.3%    0.0000   37.0
Power vs EDF  93.5%    0.0000   ∞        42.5%    0.0000   283.5    29.3%    0.0000   114.7
Expo vs EDF   94.6%    0.0000   ∞        43.1%    0.0000   292.7    30.4%    0.0000   126.2
PCPT vs EDF   79.5%    0.0000   1067.2   36.1%    0.0000   194.7    24.4%    0.0000   68.3
CPT vs EDF    29.4%    0.0000   115.2    7.2%     0.0480   3.9      8.4%     0.2666   1.2
p∗ = 5%       POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF    62.8%    0.0000   ∞        25.0%    0.0000   177.9    20.3%    0.0000   114.2
Power vs EDF  93.5%    0.0000   ∞        42.5%    0.0000   492.6    29.3%    0.0000   245.6
Expo vs EDF   94.6%    0.0000   ∞        43.1%    0.0000   505.3    30.4%    0.0000   263.6
PCPT vs EDF   79.5%    0.0000   ∞        36.1%    0.0000   366.1    24.4%    0.0000   170.2
CPT vs EDF    29.4%    0.0000   246.4    7.2%     0.0631   3.5      8.4%     0.0048   8.0

This table reports the results from Kupiec’s (1995) percentage of failure (POF) test for violations of the extreme quantile returns (EQR) from the empirical density function (EDF) by the EQR of a set of RND and subjective density functions. The test is performed as a robustness check of the extreme value theory (EVT)-based tests performed on the EQR and on the expected upside returns. The null hypothesis of the test, which is designed as a log-likelihood ratio test (Eq. (3.4.1)), is that the realized probability of failure (v/n) matches the predicted one, p∗. Thus, if the LR statistic exceeds the critical value, χ2(1) = 3.841, the null hypothesis is rejected at the five percent level. Translating the methodology to our empirical problem, p∗ becomes the assumed probability that the EQR of the subjective and of the risk-neutral densities will violate the EQR of the realized returns, where v/n is the realized frequency of violations. We note that because we apply Kupiec’s test to the upside returns, violations mean that returns are higher than a positive threshold.
Panel C presents the POF for the twelve-month maturity. We find once again that the CPT
tails are the ones that violate the EDF tails the least. The POF for this density is about
29 percent for the 90 percent EQR, seven percent for the 95 percent EQR, and eight percent
for the 99 percent EQR. These findings suggest that the tails of the CPT closely match the
EDF ones, especially far out in the tail, i.e., at the 95 and 99 percent EQR. The RND, power,
exponential, and PCPT densities record POFs that are much smaller than for the three- and
six-month maturities but that are still high in comparison to the CPT.
We note that results for the PCPT and the CPT are quite distinct, whereas results for the
PCPT are somewhat closer to the ones of the RND. This suggests that the weighting function,
rather than the value function, is the component within the CPT density function that most
forcefully brings the RND closer to the EDF. Overall, our analysis using Kupiec’s test
leads to results similar to the ones reached in our EVT analysis and provides further evidence that
the CPT model is superior in matching realized returns.
3.4.2 Prelec’s weighting function parameter
As another robustness check, we estimate the weighting function parameter ω of the RDEU
model suggested by Prelec (1998) in order to test whether our conclusions are robust to other
weighting function formulations²³. The Prelec weighting function $w_{p}^{+-}$ is given by Eq. (3.4.2):
$$w_{p}^{+-}(p) = \exp\left(-(-\log(p))^{\omega}\right), \tag{3.4.2}$$
where the parameter ω defines the curvature of the weighting function for both gains and
losses, which also leads to S-shaped probability distortion functions. We note that according
to Prelec (1998) the standard ω parameter value equals 0.65. Our time-varying and long-term
(LT) estimates for ω are presented in Table 3.7, Panel A.
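A short sketch can make the role of ω in Eq. (3.4.2) concrete. The function names below are hypothetical, and the single-parameter Tversky and Kahneman (1992) weighting function is included only for comparison, assuming its standard form w(p) = p^γ / (p^γ + (1−p)^γ)^(1/γ). The sketch checks that ω < 1 overweights small probabilities, that ω = 1 leaves probabilities unweighted, and that the Prelec function stays monotonic on a grid where the CPT function with a low γ does not.

```python
import math


def prelec_weight(p, omega):
    """Prelec (1998) weighting function of Eq. (3.4.2): exp(-(-log p)^omega)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** omega))


def tk_weight(p, gamma):
    """Tversky and Kahneman (1992) weighting function (assumed standard form)."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def is_monotonic(weight, param, grid_size=999):
    """Check that the weighting function is non-decreasing on an even grid in (0, 1)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    values = [weight(p, param) for p in grid]
    return all(a <= b for a, b in zip(values, values[1:]))


small_p = 0.01
print(prelec_weight(small_p, 0.65))  # roughly 0.067: a one percent event is overweighted
print(prelec_weight(small_p, 1.0))   # neutral weighting returns the probability itself
print(is_monotonic(prelec_weight, 0.2), is_monotonic(tk_weight, 0.2))
```

The last line illustrates why a lower bound on γ matters in the optimizations: for sufficiently low γ the CPT weighting function is no longer monotonic, whereas the Prelec form remains so for any positive ω.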
The long-term estimates of ω are somewhat in line with the one suggested by the RDEU,
but less so for the twelve-month horizon: the ω estimates from the three-, six-, and twelve-month
options are 0.46, 0.67, and 1.11, respectively. These parameters are somewhat consistent with our
long-term estimates for γ of 0.75, 0.81, and 1.09 (see Table 3.1), as they suggest overweighting
of small probabilities that fades as the option horizon increases. Similarly, time-varying
estimates of ω also indicate more overweight of small probabilities than suggested
by γ estimations. We find the mean (0.95) and median (0.93) for time-varying estimates of
ω from three-month options to be higher than the ones suggested by Prelec (1998). This
outcome means that overweighting of small probabilities within the single stock option markets
is less than suggested by RDEU (similar to our conclusion concerning CPT parameters) and
that estimated Prelec parameters imply a less pronounced overweight of tails than suggested
by our CPT parameter estimations. In line with our results for the CPT, for the six- and
twelve-month maturities, underweight of small probabilities is, however, more frequent than an
overweight. The average ω for the six-month options is 1.02 (median being 0.99), and for the
twelve-month options is 1.05 (median being 1.07). The fact that investors tend to overweight
small probabilities to a much lesser extent in the short-term and that estimates are higher than
suggested by their respective lab-based estimates confirms our main findings.
²³ A major advantage of Prelec’s (1998) weighting function vis-à-vis the CPT is that it is monotonic for any value of ω, whereas the CPT can have a non-monotonic probability weighting for low levels of γ.
Table 3.7: Robustness checks: time-varying weighting function parameters

Panel A - Prelec omega (ω)
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % ω < 1  % ω < 1  % ω < 1  % ω < 1  RSS     LT
                                                                           (98-03)  (03-08)  (08-13)
3 months   0.42  0.76       0.93    0.95  1.07       1.75  0.27   64%      95%      36%      61%      0.0204  0.46
6 months   0.37  0.84       0.99    1.02  1.17       1.75  0.26   51%      88%      21%      45%      0.017   0.68
12 months  0.44  0.94       1.07    1.05  1.18       1.75  0.21   39%      79%      10%      28%      0.0201  1.14

Panel B - Gamma with overweight of small probabilities on the right tail (δ = 0.69)
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                           (98-03)  (03-08)  (08-13)
3 months   0.44  0.7        0.86    0.97  1.21       1.75  0.34   58%      99%      23%      53%      0.0231
6 months   0.4   0.75       1.01    1.04  1.27       1.75  0.31   49%      93%      13%      43%      0.0198
12 months  0.4   0.83       1.05    1.04  1.24       1.75  0.25   43%      87%      11%      32%      0.0238

Panel C - Gamma with neutral probability weighting on the right tail (δ = 1)
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                           (98-03)  (03-08)  (08-13)
3 months   0.48  0.73       0.88    0.97  1.15       1.75  0.3    62%      98%      30%      58%      0.023
6 months   0.43  0.8        0.99    1.03  1.24       1.75  0.3    51%      92%      16%      45%      0.0191
12 months  0.5   0.87       1.03    1.02  1.13       1.75  0.22   44%      84%      12%      35%      0.0233

Panel D - Gamma with pronounced diminishing sensitivities to gains and losses (α and β = 0.75)
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                           (98-03)  (03-08)  (08-13)
3 months   0.45  0.81       0.96    0.98  1.09       1.75  0.25   57%      93%      27%      52%      0.023
6 months   0.38  0.9        1.02    1.06  1.23       1.75  0.25   44%      82%      13%      36%      0.0196
12 months  0.32  0.99       1.07    1.09  1.18       1.75  0.2    30%      65%      5%       21%      0.0276

Panel E - Gamma with no diminishing sensitivities to gains and losses (α and β = 1)
Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                           (98-03)  (03-08)  (08-13)
3 months   0     0.72       0.88    0.87  1.03       1.75  0.24   67%      98%      40%      64%      0.0204
6 months   0     0.78       0.98    0.93  1.14       1.75  0.3    55%      94%      21%      49%      0.0163
12 months  0.04  0.86       1.02    0.98  1.14       1.75  0.25   44%      88%      11%      33%      0.0207

This table reports robustness checks of our time-varying estimates of overweight of small probabilities. Panel A reports the summary statistics of the estimated omega (ω) parameter, which is the parameter used in Prelec’s (1998) probability weighting function (see Eq. (3.4.2)). Similarly to the CPT, the parameter ω defines the curvature of the weighting function for gains and losses, which leads the probability weighting functions to assume inverse S-shapes. An ω parameter equal to one means a weighting function with un-weighted (neutral) probabilities, whereas ω < 1 denotes overweighting of small probabilities. Similarly to γ, we estimate long-term ω’s (reported for γ in Table 3.1, Panel A) as well as time-varying ω’s (reported for γ in Table 3.4, Panel C). Panels B and C report γ estimates when the CPT’s probability weighting parameter for the left side of the distribution (δ) is assumed to be, respectively, 0.69 (the CPT parameterization) and 1 (neutral probability weighting). Panels D and E report γ estimates when the CPT’s value weighting parameters α and β for diminishing sensitivity to gains and losses are assumed to be, respectively, 0.75 (increased diminishing sensitivity) and 1 (no diminishing sensitivity). We assume in these robustness tests that the loss aversion parameter λ equals 2.25.
The sample dependence observed in our main results is confirmed by the usage of Prelec’s
weighting function as overweight of tails is pervasive mostly in the 1998-2003 sample. Overall,
the robustness checks following Prelec (1998) confirm our main findings regarding time-variation
and sample dependence of overweighting of small probabilities, and reiterate our conclusion.
3.4.3 Estimating time-varying γ under different assumptions for δ, α and β
As an additional robustness test of our time-varying estimates of γ, we also run optimizations
in which we fix the parameter δ instead of jointly optimizing it with γ. We impose δ = 1 (no overweight
of small probabilities on the left side of the distribution) or δ = 0.69, the value of δ within the CPT.
In line with our previous robustness test, Table 3.7, Panels B and C, suggests that results from
optimizations with different values for δ are qualitatively the same as our main results, i.e., a
positive term structure and sample dependency of overweight of small probabilities. Unreported
results also indicate a negative correlation between γ and sentiment and high explanatory power
of regressions. R2 is between 13 and 21 percent for three- and six-month options and between
0 and 3 percent for twelve-month options. However, neutral probability weighting on the left side
of the distribution (δ=1) adjusts γ downwards when compared to our main results. Conversely,
when δ is 0.69, an upwards adjustment to γ estimates occurs.
Similarly, we also estimate γ under different assumptions for α and β. We assumed α=β=1
(no diminishing sensitivity to gains and losses) and α=β=0.75 (more pronounced diminishing
sensitivity to gains and losses) instead of the CPT parameterization α=β=0.88. Our results,
reported in Table 3.7, Panels D and E, suggest that lower sensitivity to gains and losses (higher
α and β) leads to a decrease in overweight of small probabilities (higher γ estimates), whereas
higher sensitivity to gains and losses (lower α and β) leads to an increase in overweight of tails
(lower γ estimates). This effect is similar to the one observed for changes in λ (described in
Section 3.3.3), which also magnifies the sensitivity to losses when increased.
As indicated in section 3.2.2, we have also estimated time-varying γ using different lower
(-0.25, 0 and 0.28) and upper bounds (1.2, 1.35, 1.5, 1.75 and 2). Results differ across bounds
in that higher bounds produce upward shifts in the estimated γ across all
quantiles, medians and averages, so that overweight of small probabilities becomes
less pronounced but remains present. The time-variation pattern observed in Figure 3.2 and,
more importantly, the strong negative relationship with sentiment reported in Table 3.4 are,
however, extremely robust to changes in lower and upper optimization bounds. This result
strengthens our conclusion that overweight of small probabilities is largely time varying and
reflects investor sentiment.
3.4.4 Overweight of (right) tails driven by IV of single stock options
Finally, given that overweight of small probabilities by single stock call investors was most
evident during the IT bubble period (as Table 3.3 suggests), we evaluate whether this
finding may have been driven by movements in the IV of index options rather than changes
in the IV of single stock options. We perform this analysis because our methodology for
calculating weighted average single stock IVs partly relies on the IV of index options
(as it depends on implied correlations), as Eqs. 3.A.8j and 3.A.8l in Appendix 3.A.2 suggest.
Essentially, we want to ensure that the overweight of small probabilities observed in our
single stock options data is not caused by a rise in index options’ IV. As overweight of small
probabilities is a corollary of high IV skew²⁴, we examine the IV skews (120 percent moneyness
versus at-the-money, ATM) from both index options and single stock options within our
sample using a k-Nearest-Neighbors (KNN) algorithm (see Appendix 3.B.2 for details). Figure
3.3 depicts a scatter plot that relates single stock IV skews (on the y-axis) to index option IV
skews (on the x-axis), overlaid with the decision boundary between overweight of tails (in red)
and its absence (in blue), produced by applying the KNN algorithm to our full data
sample. The picture suggests that overweight of small probabilities is almost never caused
by positive index IV skews, whereas positive single stock IV skews very often produce overweight
of tails rather than underweight. Overweight of tails is mostly caused by situations in which
single stock IV skews are higher than index IV skews, which suggests that either high single stock
IV skews or low implied correlations are responsible for overweight of tails, not index options’
IV. These conditions can be anecdotally confirmed by our observation of IV skews during the
IT bubble of the 2000s. During that period, when overweight of tails was pervasive, the IV skew from
single stock options was quite high, close to +10 volatility points, whereas the same IV skew from
index options reached extremely low levels, such as -15 volatility points. This disconnect between
the two IV markets, which drove the implied correlation to 2.8 percent (an extremely low level),
suggests that the index options’ IV was not the driver of overweight of tails during the IT
bubble. These findings reiterate our suggestion that the overweight of small probabilities observed
in our sample is caused by trading in single stock options by retail investors, rather than by activity
in the index option market.
²⁴ While this relation is widely acknowledged, Jarrow and Rudd (1982), Corrado and Su (1997) and Longstaff (1995) provide a formal theorem for the link between IV skew and risk-neutral skewness and kurtosis.
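The classification step behind such a decision boundary can be sketched as follows. This is only an illustration under stated assumptions: the ten paired IV skews are made-up numbers, the class labels are hypothetical names, and k is set to 3 for the toy sample, whereas the boundary in Figure 3.3 uses a k selected by cross-validation on the full sample.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of ((index_skew, stock_skew), label); distances are Euclidean.
    """
    by_distance = sorted(
        ((px - query[0]) ** 2 + (py - query[1]) ** 2, label)
        for (px, py), label in train
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical paired IV skews in volatility points: (index skew, single stock skew).
train = [
    ((-15.0, 10.0), "overweight"), ((-10.0, 8.0), "overweight"),
    ((-8.0, 6.0), "overweight"), ((-12.0, 5.0), "overweight"),
    ((-5.0, 7.0), "overweight"),
    ((2.0, -3.0), "no_overweight"), ((4.0, -5.0), "no_overweight"),
    ((3.0, -1.0), "no_overweight"), ((5.0, -6.0), "no_overweight"),
    ((1.0, -4.0), "no_overweight"),
]

# New observations are assigned to the class of their nearest neighbours.
bubble_like = knn_predict(train, (-11.0, 9.0), k=3)
calm_like = knn_predict(train, (3.0, -4.0), k=3)
print(bubble_like, calm_like)
```

An odd k avoids tied votes with two classes; evaluating `knn_predict` over a fine grid of skew pairs would trace out a decision boundary of the kind overlaid on the scatter plot.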
(a) 3 months
Figure 3.3: k-Nearest-Neighbors for IV skews. This figure shows a scatter plot depicting the relation between single stock (120 percent minus ATM) IV skews (on the y-axis) and index (120 percent minus ATM) IV skews (on the x-axis). Observations colored in red imply the presence of overweight of small probabilities on the right side of the distribution (γ < 1), whereas observations colored in blue imply either neutral probability weighting (γ = 1) or underweight of small probabilities (γ > 1). The decision boundary is produced by a k-Nearest-Neighbors algorithm (k=41, estimated via cross-validation) and delimits the region in which a new observation (of paired IV skews, such as the solid dots) will be assigned to the overweight of small probabilities class (in red) or the alternative class (in blue).
3.5 Conclusion
Single stock OTM call options are deemed overpriced because investors overpay for positively
skewed securities, which resemble lottery tickets. The probability weighting function of the CPT
model of Tversky and Kahneman (1992) provides an appealing explanation for why these
options are expensive: investors’ preferences for positively skewed securities. In our empirical
analysis, we find that the CPT subjective density function implied by single stock options
outperforms the RND and two rational density functions (from the power and exponential
utilities) in matching the tails of realized equity returns. We estimate the CPT probability
weighting function parameter γ and find that it is qualitatively consistent with the one
predicted by Tversky and Kahneman (1992), particularly for short-term options. This outcome
endorses our hypothesis that investors in single stock call options are biased.
Our analysis provides detailed insights into the behavior of single stock option investors.
Our empirical findings suggest that overweight of small probabilities is less pronounced than
proposed by the CPT. We find the presence of a positive term structure of overweighting of
tails, because it becomes less pronounced as the option maturity increases. Investors in single
stock calls are more biased when trading short-term contracts, whereas they seem to be more
rational (less biased) when trading long-term calls. This result is consistent with individual
investors being the typical buyers of OTM single stock calls and the fact that they mostly use
short-term options to speculate on the upside of equities.
We also find that investors’ overweighting of small probabilities is strongly time-varying and
sample dependent. Time-variation in γ remains strong even when we account for different
levels of loss aversion, different diminishing sensitivities to gains and losses, different degrees of
overweighting of the left tail and an alternative (Prelec’s) weighting function. The strong
time-variation and sample dependency of γ suggest that investors do not have a static preference for
skewness, but rather time-varying preferences or “bias in beliefs” (see Barberis, 2013).
Such time-variation in γ is also confirmed by the fact that overweighting of tails is pronounced in
periods in which sentiment is high, for instance, the IT bubble period. This finding is consistent
with the Baker and Wurgler (2007) sentiment measure being the main explanatory variable of
overweighting of small probabilities. Our results challenge the view that single stock call options
are structurally overpriced and offer the insight that the overweight of tail events implied in these
options is conditional on sentiment levels and option maturity rather than on positive stock
fundamentals, loss aversion levels or investor preferences for skewness.
Our findings have several important practical implications. First, the understanding of time-
variation in investors’ overweighting of small probabilities could be used in the development of
behavioral option pricing models, which remains in its infancy. To the extent that overweighting
of small probabilities is a latent variable or, simply, not trivial to estimate, we contemplate that
future option pricing models should be more sentiment-aware than current ones. Second, of
importance for such next generation option-pricing models is the inclusion of a positive term
structure of tails’ overweighting. Such potential modifications to option pricing have large
and direct consequences for risk management, hedging and arbitrage activities. Third, from a
financial stability point of view, investors’ overweighting of small probabilities in single stock
options could be of use to regulators for triangulating the presence of speculative equity market
bubbles.
3.A Appendix: Risk-neutral densities and implied volatility analytics
3.A.1 Subjective density function estimation
We hereby present the derivations required to achieve Eq. (3.2.3) in the main text, Eq. (3.A.7)
here, from Eq. (3.2.2), called here Eq. (3.A.1):
$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T), \tag{3.A.1}$$
where $f_P(S_T)$ is the “real-world” probability distribution, $f_Q(S_T)$ is the RND, $\varsigma(S_T)$ is the
pricing kernel, $w$ is the weighting function and $F_P(S_T)$ is the “real-world” cumulative distribution
function.
The first step of our derivation entails re-arranging Eq. (3.A.1) into (3.A.2b) via Eq.
(3.A.2a), which demonstrates that for the CPT to hold, the subjective density function should
be consistent with the probability weighted EDF:
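$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{w'(F_P(S_T))}_{\text{probability weighting}} \cdot \underbrace{f_P(S_T)}_{\text{EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{3.A.2a}$$
$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{3.A.2b}$$
$$\frac{f_Q(S_T)}{\Lambda \frac{U'(S_T)}{U'(S_t)}} = \underbrace{\frac{f_Q(S_T)}{\varsigma(S_T)}}_{\text{Subjective density}} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \tag{3.A.3}$$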
Following Ait-Sahalia and Lo (2000) and Bliss and Panigirtzoglou (2004), Eq. (3.A.3) can
be manipulated so that the time-preference constant Λ of the pricing kernel vanishes, producing
Eq. (3.A.4), which directly relates the probability weighted EDF, the RND, and the marginal
utility, U ′(ST ):
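$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \frac{\frac{U'(S_t)}{\Lambda U'(S_T)}\, f_Q(S_T)}{\int \frac{U'(S_t)}{\Lambda U'(x)}\, f_Q(x)\,dx} = \underbrace{\frac{\frac{f_Q(S_T)}{U'(S_T)}}{\int \frac{f_Q(x)}{U'(x)}\,dx}}_{\text{Generic subjective density function}} \tag{3.A.4}$$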
where $\int \frac{f_Q(x)}{U'(x)}\,dx$ normalizes the resulting subjective density function to integrate to one. Once
the utility function is estimated, Eq. (3.A.4) allows us to convert the RND into the probability
weighted EDF. Eq. (3.A.4) can also be used to estimate the subjective density function for a
(rational) investor who has a power or exponential utility function, by disregarding the weighting
function $w(\cdot)$, so that the left-hand side of the equation becomes $f_P(S_T)$. In the remainder of
the chapter we call these subjective distributions power and exponential density functions.
As we hypothesize that the representative investor has a CPT utility function, its marginal
utility function is $U'(S_T) = \upsilon'(S_T)$, and, thus, $\upsilon'(S_T) = \alpha S_T^{\alpha-1}$ for $S_T \geq 0$, and
$\upsilon'(S_T) = -\lambda\beta(-S_T)^{\beta-1}$ for $S_T < 0$, leading to Eqs. (3.A.5) and (3.A.6):
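$$\tilde{f}_P(S_T) = \frac{\frac{f_Q(S_T)}{\alpha S_T^{\alpha-1}}}{\int \frac{f_Q(x)}{\alpha x^{\alpha-1}}\,dx} \quad \text{for } S_T \geq 0, \text{ and} \tag{3.A.5}$$
$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \underbrace{\frac{\frac{f_Q(S_T)}{-\lambda\beta(-S_T)^{\beta-1}}}{\int \frac{f_Q(x)}{-\lambda\beta(-x)^{\beta-1}}\,dx}}_{\text{Partial CPT density function}} \quad \text{for } S_T < 0. \tag{3.A.6}$$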
Eqs. (3.A.5) and (3.A.6), hence, relate the EDF where probabilities are weighted according
to the CPT probability distortion functions, on the LHS, to the subjective density function
derived from the CPT value function, on the RHS, separately for gains and losses, i.e., the
PCPT density function. The relationships specified by Eqs. (3.A.5) and (3.A.6) fully state the
relation we would like to depict, although one additional manipulation is convenient for our
argumentation. Assuming that the function $w(F_P(S_T))$ is strictly increasing over the domain
[0,1], there is a one-to-one relationship between $w(F_P(S_T))$ and a unique inverse $w^{-1}(F_P(S_T))$.
So, the result $\tilde{f}_P(S_T) = w'(F_P(S_T))\, f_P(S_T)$ also implies $\tilde{f}_P(S_T) \cdot (w^{-1})'(F_P(S_T)) = f_P(S_T)$²⁵. This
outcome allows us to directly relate the original EDF to the CPT subjective density function,
by “undoing” the effect of the CPT probability distortion functions within the PCPT density
function:
$$\underbrace{f_P(S_T)}_{\text{EDF}} = \underbrace{\frac{\frac{f_Q(S_T)}{\upsilon'(S_T)}}{\int \frac{f_Q(x)}{\upsilon'(x)}\,dx}\,(w^{-1})'(F_P(S_T))}_{\text{CPT density function}} \tag{3.A.7}$$
Thus, once the relation between the probability weighting function of EDF and the PCPT
density is established, as in Eqs. (3.A.5) and (3.A.6), one can eliminate the weighting scheme
affecting returns by applying the inverse of such weightings to the subjective density function
without endangering such equalities, as in Eq. (3.A.7), numbered Eq. (3.2.3) in the main text.
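As a rough numerical illustration of Eq. (3.A.4), the sketch below divides a risk-neutral density by marginal utility and renormalizes the result so that it integrates to one. The lognormal-shaped RND, the price grid, the power-utility marginal utility $U'(S) = S^{-\theta}$ with $\theta = 3$, and all function names are assumptions made for illustration only; they are not the inputs or the code used in this chapter.

```python
import math

def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y taken on grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def subjective_density(rnd, grid, marginal_utility):
    """Divide the RND by marginal utility and renormalize to integrate to one."""
    raw = [q / marginal_utility(s) for q, s in zip(rnd, grid)]
    norm = trapezoid(raw, grid)
    return [r / norm for r in raw]

# Hypothetical lognormal-shaped RND on a grid of terminal prices (S_0 = 100).
grid = [50.0 + 0.5 * i for i in range(301)]
sigma, mu = 0.2, math.log(100.0)
rnd = [
    math.exp(-(math.log(s) - mu) ** 2 / (2 * sigma ** 2)) / (s * sigma * math.sqrt(2 * math.pi))
    for s in grid
]

theta = 3.0  # assumed coefficient of relative risk aversion (power utility)
density = subjective_density(rnd, grid, lambda s: s ** (-theta))

# Compare the mean terminal price under the RND and under the subjective density.
mean_rnd = trapezoid([s * q for s, q in zip(grid, rnd)], grid) / trapezoid(rnd, grid)
mean_subjective = trapezoid([s * p for s, p in zip(grid, density)], grid)
print(round(mean_rnd, 2), round(mean_subjective, 2))
```

For a risk-averse investor the division by a decreasing marginal utility shifts probability mass towards higher terminal prices, so the subjective mean exceeds the risk-neutral one in this example.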
3.A.2 Single stock weighted average implied volatilities
In the following we derive the weighted average single stock IV, Eq. (3.A.8l), and the implied
correlation approximation, Eq. (3.A.8j):
$$\sigma_P^2 = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i \neq j}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j \tag{3.A.8a}$$
25 A drawback of the CPT model is that it allows for non-strictly increasing functions, which would not allow invertibility. This is the reason why the newer literature on probability distortion functions favors other strictly monotonic functions, such as Prelec’s (1998) $w(p) = e^{-(-\ln(p))^{\delta}}$, as the weighting functions. Nevertheless, because the CPT parameters of our interest ($\gamma = 0.61$; $\delta = 0.69$) impose strict monotonicity, we can obtain the inverse of the probability function, $w^{-1}(p)$, numerically.
Starting from the portfolio variance $\sigma_P^2$ formula given by Eq. (3.A.8a), where $i$ and $j$ are indexes
for the portfolio constituents, this relation can be re-written for an equity index as:
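$$\sigma_I^2 = \sum_{i,j=1}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j \tag{3.A.8b}$$
implying that,
$$\sum_{i \neq j}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j = \sum_{i,j=1}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j - \sum_{i=1}^{n} w_i^2 \sigma_i^2 \tag{3.A.8c}$$
where,
$$\rho_{ij}(x) = \begin{cases} \rho, & \text{if } i \neq j \\ 1, & \text{if } i = j \end{cases} \tag{3.A.8d}$$
and where $\sigma_I^2$ is the equity index option-implied variance. Then, assuming $\rho$ as the estimator
for average stock correlation we have:
$$\sigma_I^2 = \rho \sum_{i \neq j}^{n} w_i w_j \sigma_i \sigma_j + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8e}$$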
which, given equality 3.A.8c, can be re-written as:
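$$\sigma_I^2 = \rho \sum_{i,j=1}^{n} w_i w_j \sigma_i \sigma_j - \rho \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8f}$$
$$= \rho \left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \rho \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8g}$$
$$= \rho \left[ \left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2 \right] + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8h}$$
$$\rho = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}. \tag{3.A.8i}$$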
As $\sum_{i=1}^{n} w_i^2 \sigma_i^2$ is relatively small, we can simplify Eq. (3.A.8i), the implied correlation,
into the approximated implied correlation given by Eq. (3.A.8j). Note that, as $\sum_{i=1}^{n} w_i^2 \sigma_i^2$ is
always positive, the approximated implied correlation will always overstate the true implied
correlation.
$$\rho \approx \frac{\sigma_I^2}{\left( \sum_{i=1}^{n} w_i \sigma_i \right)^2}. \tag{3.A.8j}$$
Further, in order to obtain the weighted average single stock implied volatility, Eq. (3.A.8l),
we take the square root of both sides of the approximation and re-arrange the terms:
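$$\sqrt{\rho} \approx \frac{\sigma_I}{\sum_{i=1}^{n} w_i \sigma_i} \tag{3.A.8k}$$
with
$$\sum_{i=1}^{n} w_i \sigma_i \approx \frac{\sigma_I}{\sqrt{\rho}}. \tag{3.A.8l}$$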
Lastly, note that, given equality 3.A.8c, Eq. 3.A.8i can be re-written as:
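$$\rho = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\sum_{i \neq j}^{n} w_i w_j \sigma_i \sigma_j} = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\sum_{i=1}^{n} \sum_{j \neq i} w_i w_j \sigma_i \sigma_j}, \tag{3.A.8m}$$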
which is the implied correlation (IC) measure employed by Driessen et al. (2013).
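The relations in Eqs. (3.A.8i) to (3.A.8l) can be checked numerically with a short sketch. The three-stock index, its weights and its implied volatilities below are hypothetical, and the function name is ours; the index IV is constructed from an assumed true correlation of 0.5 so that the exact formula can be verified against it.

```python
import math

def implied_correlation(index_iv, weights, stock_ivs):
    """Return (exact, approximated) implied correlation, Eqs. (3.A.8i) and (3.A.8j)."""
    own_var = sum((w * s) ** 2 for w, s in zip(weights, stock_ivs))
    avg_iv = sum(w * s for w, s in zip(weights, stock_ivs))
    exact = (index_iv ** 2 - own_var) / (avg_iv ** 2 - own_var)
    approx = index_iv ** 2 / avg_iv ** 2
    return exact, approx

# Hypothetical three-stock index (weights sum to one).
weights = [0.40, 0.35, 0.25]
stock_ivs = [0.30, 0.25, 0.40]
true_rho = 0.5  # assumed average correlation used to build the index IV

own_var = sum((w * s) ** 2 for w, s in zip(weights, stock_ivs))
avg_iv = sum(w * s for w, s in zip(weights, stock_ivs))
index_iv = math.sqrt(true_rho * (avg_iv ** 2 - own_var) + own_var)  # Eq. (3.A.8h)

exact, approx = implied_correlation(index_iv, weights, stock_ivs)
weighted_avg_iv = index_iv / math.sqrt(approx)  # Eq. (3.A.8l)
print(round(exact, 4), round(approx, 4), round(weighted_avg_iv, 4))
```

In this example the exact formula recovers the assumed correlation, while the approximation lies above it, in line with the remark that the approximated implied correlation overstates the true one.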
3.B Appendix: Machine learning methods
3.B.1 Least Absolute Shrinkage and Selection Operator (Lasso)
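The regression coefficients obtained by the Lasso methodology applied ($\beta^{L}_{\theta}$) are estimated by
minimizing the quantity:
$$\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \kappa \sum_{j=1}^{p} |\beta_j| = RSS + \kappa \sum_{j=1}^{p} |\beta_j| \tag{3.B.1}$$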
where κ is the tuning parameter, which is estimated via cross-validation. The cross-validation
applied by us uses ten equal-size splits of our overall data set.
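A minimal sketch of how the objective in Eq. (3.B.1) can be minimized and how κ can be chosen with ten equal-size splits is given below. Cyclic coordinate descent is our choice of solver for illustration, not necessarily the one used in this study; the simulated data, the candidate grid for κ and all function names are likewise assumptions.

```python
import random

def soft_threshold(rho, t):
    """Shrink rho toward zero by t (the Lasso proximal step)."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0

def lasso_fit(X, y, kappa, n_iter=100):
    """Minimize RSS + kappa * sum(|beta_j|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = z = 0.0
            for xi, yi in zip(X, y):
                # Fitted value excluding predictor j (partial residual).
                partial = beta0 + sum(beta[k] * xi[k] for k in range(p) if k != j)
                rho += xi[j] * (yi - partial)
                z += xi[j] ** 2
            beta[j] = soft_threshold(rho, kappa / 2.0) / z
        # The intercept is not penalized.
        beta0 = sum(yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)) / n
    return beta0, beta

def cv_error(X, y, kappa, k=10):
    """Mean squared out-of-fold error over k equal-size splits."""
    n = len(X)
    sse = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        b0, b = lasso_fit(X_train, y_train, kappa)
        sse += sum((y[i] - b0 - sum(bj * xj for bj, xj in zip(b, X[i]))) ** 2 for i in held_out)
    return sse / n

# Simulated data: only the first of three predictors matters (assumption).
random.seed(0)
n = 60
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [1.0 + 2.0 * row[0] + random.gauss(0, 0.5) for row in X]

grid = [0.0, 1.0, 10.0, 100.0]  # hypothetical candidate values for kappa
best_kappa = min(grid, key=lambda kap: cv_error(X, y, kap))
print(best_kappa, lasso_fit(X, y, best_kappa))
```

A very large κ sets every slope coefficient to zero, whereas κ = 0 reduces the problem to ordinary least squares; cross-validation picks a value between these extremes.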
3.B.2 k-Nearest-Neighbor classifier
The k-Nearest-Neighbor (KNN) classifier is one of the approaches in machine learning that
attempts to estimate the conditional distribution of the explained variable (Y ) given the ex-
planatory variables (X) and, subsequently, classify new observations to the class with highest
estimated probability. The KNN classifier uses the Euclidean distance to first identify the $k$
closest observations within the training data (in-sample data) to a new test (out-of-sample)
observation provided ($x_0$). This neighborhood of points around the test observation $x_0$ is
defined as $N_0$. KNN then estimates the conditional probability that $x_0$ belongs to a class $j$ as
the percentage of training observations ($y_i$) in the neighborhood $N_0$ whose class is also $j$:
$$\Pr(Y = j \mid X = x_0) = \frac{1}{k} \sum_{i \in N_0} I(y_i = j) \tag{3.B.2}$$
In a third step, KNN applies the Bayes rule to perform out-of-sample classification (in test
data) of x0 to the class with the largest probability. For further details, see Hastie et al. (2008).
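The three steps described above can be sketched in a few lines. The training points, the class labels ("up" and "down") and the function names are hypothetical and serve only to illustrate Eq. (3.B.2).

```python
import math
from collections import Counter

def knn_probabilities(train_X, train_y, x0, k):
    """Estimate Pr(Y = j | X = x0) as the class shares among the k nearest neighbors."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x0))
    neighborhood = order[:k]  # indices of the k closest training points (N_0)
    counts = Counter(train_y[i] for i in neighborhood)
    return {label: count / k for label, count in counts.items()}

def knn_classify(train_X, train_y, x0, k):
    """Assign x0 to the class with the largest estimated probability."""
    probs = knn_probabilities(train_X, train_y, x0, k)
    return max(probs, key=probs.get)

# Hypothetical two-dimensional training sample.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["down", "down", "down", "up", "up", "up"]

probs_near_up = knn_probabilities(train_X, train_y, (4.5, 4.5), k=3)
probs_between = knn_probabilities(train_X, train_y, (2, 2), k=5)
label_between = knn_classify(train_X, train_y, (2, 2), k=5)
print(probs_near_up, probs_between, label_between)
```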
3.C Appendix: Welch and Goyal (2008) equity market predictors
The complete set and summarized descriptions of variables provided by Welch and Goyal
(2008)26 that are used in our study is given as:
1. Dividend-price ratio (log), D/P: Difference between the log of dividends paid on the
S&P 500 index and the log of stock prices (S&P 500 index).
2. Dividend yield (log), D/Y: Difference between the log of dividends and the log of
lagged stock prices.
3. Earnings, E12: 12-month moving sum of earnings on the S&P 500 index.
4. Earnings-price ratio (log), E/P: Difference between the log of earnings on the S&P
500 index and the log of stock prices.
5. Dividend-payout ratio (log), D/E: Difference between the log of dividends and the
log of earnings.
6. Stock variance, SVAR: Sum of squared daily returns on the S&P 500 index.
7. Book-to-market ratio, B/M: Ratio of book value to market value for the Dow Jones
Industrial Average.
8. Net equity expansion, NTIS: Ratio of twelve-month moving sums of net issues by
NYSE-listed stocks to total end-of-year market capitalization of NYSE stocks.
9. Treasury bill rate, TBL: Interest rate on a three-month Treasury bill.
10. Long-term yield, LTY: Long-term government bond yield.
11. Long-term return, LTR: Return on long-term government bonds.
12. Term spread, TMS: Difference between the long-term yield and the Treasury bill rate.
13. Default yield spread, DFY: Difference between BAA- and AAA-rated corporate bond
yields.
14. Default return spread, DFR: Difference between returns of long-term corporate and
government bonds.
15. Cross-sectional premium, CSP: Measures the relative valuation of high- and low-beta
stocks.
16. Inflation, INFL: Calculated from the CPI (all urban consumers) using t−1 information
due to the publication lag of inflation numbers.
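Several of the predictors listed above are simple transformations of a handful of series. The sketch below computes the log ratios and the two yield spreads from hypothetical input values; the numbers and the function names are illustrative assumptions and do not come from the Welch and Goyal (2008) data set.

```python
import math

def log_ratios(dividends, earnings, price, lagged_price):
    """Log valuation ratios D/P, D/Y, E/P and D/E as defined in the list above."""
    return {
        "D/P": math.log(dividends) - math.log(price),
        "D/Y": math.log(dividends) - math.log(lagged_price),
        "E/P": math.log(earnings) - math.log(price),
        "D/E": math.log(dividends) - math.log(earnings),
    }

def yield_spreads(lty, tbl, baa, aaa):
    """Term spread (TMS) and default yield spread (DFY)."""
    return {"TMS": lty - tbl, "DFY": baa - aaa}

# Hypothetical index-level inputs.
ratios = log_ratios(dividends=40.0, earnings=100.0, price=2000.0, lagged_price=1950.0)
spreads = yield_spreads(lty=0.030, tbl=0.010, baa=0.050, aaa=0.038)
print(ratios, spreads)
```

By construction, the log dividend-price ratio equals the sum of the log dividend-payout ratio and the log earnings-price ratio, which is a quick consistency check on the inputs.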
26Available at http://www.hec.unil.ch/agoyal/.
Chapter 4
Implied Volatility Sentiment: A Tale of Two Tails∗
4.1 Introduction
End-users of out-of-the-money (OTM) options tend to overweight small probability events. This
behavioral bias, suggested by Tversky and Kahneman’s (1992) Cumulative Prospect Theory, is
claimed to be present in the pricing of OTM index puts and in OTM single stock calls (Barberis
and Huang, 2008; Polkovnichenko and Zhao, 2013)2. Within the index option market, the
typical end-users of OTM puts are institutional investors, who use them to protect their large
equity portfolios. Because institutional investors have large portfolios and hold a substantial
part of the total market capitalization, OTM index puts are frequently in high demand and,
as a result, are overvalued. The reason for such richness of OTM puts goes back to the 1987
financial market crash. Bates (1991) and Jackwerth and Rubinstein (1996) argue that the
implied distribution of equity market expected returns from index options changed considerably
following the 1987 market crash. Their findings demonstrate that, since the crash, a large shift
in market participants’ demand for such instruments took place, evidenced by the probabilities
implied by options prices. Before the crash, the probability of large negative stock returns was
close to the one suggested by a normal distribution. In contrast, just prior to the 1987 crash, the
probability of large negative returns implied by option prices rose considerably. Such increased
demand for hedging against tail risk events suggested a change in beliefs and attitude towards
risk. Investors feared another crash and became more willing to give up upside potential in
equities to hedge against the risk of drawdowns via put options. Bates (2003) suggests that even
∗ This chapter is based on Felix et al. (2017a). We thank seminar participants at the APG Asset Management Quant Roundtable, at the Infiniti 2017 Conference in Valencia, at the EEA-ESEM 2017 Conference in Lisbon and at the 2018 annual meeting of the European Financial Management Association (EFMA) in Milano for their helpful comments. We thank APG Asset Management for making available part of the data set.
2 We acknowledge that it is yet unclear whether the overweighting of small probabilities is caused solely by preferences (i.e., a behavioral bias) or rather by biased beliefs (i.e., investors’ expectations). Barberis (2013) eloquently discusses how both phenomena are distinctly different and how both (individually or jointly) may potentially explain the existence of overpriced OTM options. In this chapter we take a myopic view and use only the first explanation, the existence of a behavioral bias, for ease of exposition.
models adjusted for stochastic volatility, stochastic interest rates, and random jumps do not
fully explain the high level of OTM puts’ implied volatilities (IV). Likewise, Garleanu et al.
(2009) argue that the excessive IV of OTM puts cannot be explained either by option-pricing
models that take such institutional investors’ demand pressure into account3.
It has been claimed that OTM calls on single stocks are systematically expensive (Barberis
and Huang, 2008; Boyer and Vorkink, 2014). The typical end-users of OTM single stock calls
are individual investors. Bollen and Whaley (2004) state that changes in the IV structure of
single stock options across moneyness are driven by the net purchase of calls by individual
investors. The literature provides several explanations for such strong buying pressure of calls
by retail investors. For example, Mitton and Vorkink (2007) and Barberis and Huang (2008)
propose models in which investors have a clear preference for positive return skewness, or
“lottery ticket” type of assets. In consequence of this preference, retail investors overpay for
these leveraged securities, making OTM calls expensive and causing them to yield low forward
returns. Cornell (2009) presents a behavioral explanation for the overpricing of single stock
calls: because investors are overconfident in their stock-picking skills, they buy calls to get
the most “bang for the buck”. A related explanation for the structural overpricing of single
stock calls is leverage aversion or leverage constraint: because investors are averse to borrowing
(levering) or constrained to do so, they buy instruments with implicit leverage to achieve their
return targets.
Beyond this literature that supports the link between institutional and individual investor
trading activity and the structural overvaluation of OTM options, we argue that short-term
trading dynamics also influence the pricing of OTM options. For instance, Han (2008) provides
evidence that the index options IV smirk is steeper when professional investors are bearish. He
concludes that the steepness of the IV structure across moneyness relates to investors’ sentiment.
In the same line, Amin et al. (2004) argue that investors bid up the prices of put options after
increases in stock market volatility and rising risk aversion, whereas such buying pressure wanes
following positive momentum in equity markets. Mahani and Poteshman (2008) argue that
trading in single stock call options around earnings announcements is speculative in nature and
dominated by unsophisticated retail investors. Lakonishok et al. (2007) show evidence that long
call prices increased substantially during bubble times (1990 and 2000) and that most of the
single stock options’ market activity consists of speculative directional call positions. Lemmon
and Ni (2011) discuss that the demand for single stock options (dominated by speculative
individual investors’ trades) positively relates to sentiment. Lastly, Polkovnichenko and Zhao
(2013) suggest that time-variation in overweight of small probabilities derived from index put
options might depend on sentiment, whereas Felix et al. (2016b) provide evidence that the time-
varying overweight of small probabilities from single stock options largely links to sentiment.
The above studies suggest that OTM index puts and single stock calls are systematically
3 It is important to disentangle the (equity) hedging behavior of institutional investors from their overall trading activity. Studies, such as Frijns et al. (2015), provide evidence that institutional investors price stocks rationally, supporting the idea that the argued behavioral bias might be confined to institutional investors’ portfolio insurance decisions.
overpriced and that the valuation misalignment fluctuates considerably over time, caused by
changes in investor sentiment. In this chapter, we delve deeper into this issue and investigate how
overweight of small probabilities links to sentiment and forward returns.
The first contribution of our study is to evaluate the information content of overweighted
small probabilities from index puts and single stock calls, as a measure of sentiment. We assess
the ability of this measure to predict forward equity returns and, more specifically, equity market
reversals, defined as abrupt changes in the market direction4. Because we find overweight small
probabilities to be strongly linked to IV skews, we hypothesize that reversals may follow not
only periods of excessive overweight of tails but also periods of extreme IV skews5.
One characteristic of the literature that analyzes the informational content of IV skews is
that it evaluates index puts’ IV skews and single stock calls’ IV skews completely separated
from each other. As such, our second contribution is that we are, to the best of our knowledge,
the first in the literature to use IV skews jointly extracted from both the index and single stock
option market as an indicator for investors’ sentiment. Our sentiment measure, the so-called
IV-sentiment, is calculated as the IV of OTM index puts minus the IV of OTM single stock calls.
We conjecture that our IV-sentiment measure advances the understanding of investors’
sentiment because it captures the very distinct nature of these markets’ two main categories of
end-users: 1) IV from OTM puts captures institutional investors’ willingness to pay for leverage
to hedge their downside risk (portfolio insurance), as a measure of bearishness, whereas 2) IV
from OTM single stock calls captures levering by individual investors for speculation on the
upside (“lottery tickets” buying), as a measure of bullishness. Thus, a high level of IV-sentiment
indicates bearish sentiment, as IV from index puts outpace the ones from single stock calls. In
contrast, low levels of IV-sentiment indicate bullish sentiment, as IV from single stock calls
become high relative to the ones from index puts.
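In code, the IV-sentiment measure is a simple difference between two implied volatilities. The sketch below uses hypothetical IV readings, and it labels observations by the sign of the difference purely for illustration; the thresholds that separate high from low IV-sentiment in the empirical analysis are not specified at this point in the text.

```python
def iv_sentiment(index_put_iv, stock_call_iv):
    """IV of OTM index puts minus IV of OTM single stock calls."""
    return index_put_iv - stock_call_iv

# Hypothetical daily observations: (OTM index put IV, OTM single stock call IV).
observations = [(0.32, 0.27), (0.24, 0.29), (0.28, 0.28)]
readings = [iv_sentiment(put_iv, call_iv) for put_iv, call_iv in observations]

# Sign-based labels are an illustrative assumption, not the chapter's rule.
labels = ["bearish" if r > 0 else "bullish" if r < 0 else "neutral" for r in readings]
print(labels)
```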
We find that our IV-sentiment measure predicts equity market reversals better than over-
weight of small probabilities itself. It also delivers positive risk-adjusted returns more consis-
tently than the common Baker and Wurgler (2007) sentiment factor when evaluated via two
trading strategies, a high-frequency and a low-frequency one. In univariate and multivariate
predictive regression settings, our IV-sentiment measure improves the out-of-sample forecast
ability of traditional equity risk-premium models. This result is likely due to the uniqueness
of our IV-sentiment measure relative to traditional predictive factors, as well as caused by the
4 Reversals in the context of this chapter are not to be confused with the, so-called, reversal (cross-sectional) strategy, i.e., a strategy that buys (sells) stocks with low (high) total returns over the past month, as first documented by Lehmann (1990). We focus on the overall equity market, rather than investigating single stocks.
5 The literature on IV skew has largely explored the level of volatility skew across stocks and their cross-section of returns. However, insights on the link between the skew and the overall stock market are still incipient. The study by Doran et al. (2007) is one of the few that has tested the power of IV skews as a predictor of aggregate market returns. However, they only analyze the relation between skews and one-day ahead returns (found to be weakly negatively related), and ignore any longer and perhaps more persistent effects. Similarly, several studies have already attempted to recognize the conditionality of forward equity market returns to other volatility-type of measures: Ang and Liu (2007) for realized variance, Bliss and Panigirtzoglou (2004) for risk-aversion implied by risk-neutral probability distribution function embedded in cross-sections of options, Bollerslev et al. (2009) for variance risk premium, Driessen et al. (2013) for option-implied correlations, Pollet and Wilson (2008) for historical correlations, and Vilkov and Xiao (2013) for the risk-neutral tail loss measure. Most of these studies document a short-term negative relation between risk measures and equity market movements.
imposition of some structure into our models (in the form of coefficient constraints). Once
these models are constrained, forecast combination approaches largely outperform individual
predictors and advanced machine learning techniques in forecasting the equity risk-premium
in our data set. Thus, the third contribution of our study is to complement the literature
on out-of-sample forecasting of the equity risk-premium (Welch and Goyal, 2008; Campbell
and Thompson, 2008; Rapach et al., 2010) by suggesting a new predictor, the IV-sentiment
measure. Concurrently, we reiterate earlier findings that constrained linear models remain a
powerful tool to forecast equity returns.
A final contribution of our work is to reveal the ability of our IV-sentiment measure to
improve on time-series momentum, cross-sectional momentum and equity buy-and-hold
investment strategies. Our sentiment measure is uncorrelated with these strategies, also at the
tails, for instance, when cross-sectional momentum crashes contemporaneously to market re-
bounds (Daniel and Moskowitz, 2016). Consequently, we document an increase in the infor-
mational content of such strategies when combined with the IV-sentiment strategy, especially
for cross-sectional momentum. In line with this outcome, we also report that returns from an
IV-sentiment-based strategy are poorly explained by widely used equity risk factors, such as
Fama and French’s five factors, the momentum factor (WML) and the low-volatility factor
(BAB). Hence, we propose that active equity managers could benefit from IV-sentiment by
using it for Beta-timing.
The remainder of this chapter is organized as follows. Section 4.2 describes the data and
the main methods employed in our empirical study. In section 4.3, within three sub-sections,
we focus on estimating overweight of small probabilities parameters from the index and single
stock option markets as well as linking it to the Baker and Wurgler (2007) sentiment factor and
other proxies for sentiment. In section 4.4 we test how our sentiment proxy based on overweight
of small probabilities relates to forward equity returns. Section 4.5 concludes.
4.2 Data and Methodology
We use S&P 500 index options’ IV data and single stock weighted average IV data from the
largest 100 stocks of the S&P 500 index within our risk-neutral density (RND) estimations.
The IV data comes from closing mid-option prices from January 2, 1998 to March 19, 2013
for fixed maturities for five moneyness levels, i.e., 80, 90, 100, 110, and 120, at the three-, six-
and twelve-month maturity both for index and single stock options. Eq. (3.A.8l) in Appendix
3.A.2 shows how weighted average single stock IV are computed. We apply the S&P 500
index weights normalized by the sum of weights of stocks for which IVs across all moneyness
levels are available. Following the S&P 500 index methodology and the unavailability of IV
information for every stock in all days in our sample, stocks weights in this basket change
on a daily basis. The sum of weights is, on average, 58 percent of the total S&P 500 index
capitalization and it fluctuates from 46 to 65 percent. Continuously compounded stock market
returns are calculated throughout our analysis from the basket of stocks weighted with the
same daily-varying loadings used for aggregating the IV data6. For index options, we use the
S&P 500 index prices to calculate continuously compounded stock market returns. Realized
index returns and single stock returns are downloaded via Bloomberg.
Overweight of small probabilities is embedded in the cumulative prospect theory (CPT)
model by means of the weighting function of the probability of prospects. Within the CPT
model, overweight of small probabilities is measured by the probability weighting function
parameters δ and γ for the left (losses) and right (gains) side of the return distribution, re-
spectively. δ and γ < 1 imply overweight of small probabilities, whereas δ and γ > 1 imply
underweight of small probabilities, and δ and γ equal to 1 mean neutral weighting of prospects
(see Tversky and Kahneman, 1992).
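The role of the weighting parameter can be illustrated with the single-parameter weighting function of Tversky and Kahneman (1992), $w(p) = p^{\gamma} / (p^{\gamma} + (1-p)^{\gamma})^{1/\gamma}$, which takes the same form for losses with δ in place of γ. The sketch below evaluates it at a small probability for three parameter values; the function name and the chosen probability are illustrative assumptions.

```python
def cpt_weight(p, gamma):
    """Tversky and Kahneman (1992) probability weighting function."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

small_p = 0.01
for gamma in (0.61, 1.0, 1.2):
    # gamma < 1 inflates the decision weight of the tail event, gamma > 1 deflates it.
    print(gamma, round(cpt_weight(small_p, gamma), 4))
```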
Our methodology builds on the assumption that investors’ subjective density estimates
should correspond, on average7, to the distribution of realizations (Bliss and Panigirtzoglou,
2004). Thus, estimating CPT probability weighting function parameters δ and γ is only feasible
if two basic inputs are available: the CPT subjective density function and the distribution
of realizations, i.e., the empirical density function (EDF). The methodology applied by us to
estimate these two parameters comprises: 1) estimating the returns’ risk-neutral density from
option prices using a modified Figlewski (2010) method; 2) estimating the partial CPT density
function using the CPT marginal utility function; 3) “undoing” the effect of the probability
weighting function (w) to obtain the CPT subjective density function; 4) simulating time-
varying empirical return distributions using the Rosenberg and Engle (2002) approach; and 5)
minimizing the squared difference of the tail probabilities of the CPT and EDF to obtain daily
optimal δ’s and γ’s.
Our starting point for obtaining the CPT probability weighting function parameters δ and γ
is the estimation of RND from IV data. In order to estimate the RND, we first apply the Black-
Scholes model to our IV data to obtain options prices (C) for the S&P 500 index. Once our
data is normalized, so strikes are expressed in terms of percentage moneyness, the instantaneous
price level of the S&P 500 index (S0) equals 100 for every period for which we would like to
obtain implied returns. Contemporaneous dividend yields for the S&P 500 index are used
for the calculation of P as well as the risk-free rate from three-, six- and twelve-month T-bills.
Because we have IV data for five levels of moneyness, we implement a modified Figlewski (2010)
method for extracting the RND structure. The main advantage of the Figlewski (2010) method
over other techniques is that it extracts the body and tails of the distribution separately, thereby
allowing for fat tails.
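The first step, converting IV quotes into option prices with the Black-Scholes model on a normalized price level of 100, can be sketched as follows. The IV smile, the interest rate and the dividend yield below are hypothetical values, and the function names are ours; the subsequent Figlewski (2010) density extraction is not shown.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, iv, maturity, rate=0.0, div_yield=0.0):
    """Black-Scholes price of a European call with a continuous dividend yield."""
    vol_sqrt_t = iv * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate - div_yield + 0.5 * iv ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return (spot * math.exp(-div_yield * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))

# Hypothetical three-month IV smile at the five moneyness levels (S_0 = 100).
smile = {80: 0.32, 90: 0.26, 100: 0.21, 110: 0.19, 120: 0.20}
prices = {k: bs_call(100.0, float(k), iv, 0.25, rate=0.02, div_yield=0.015)
          for k, iv in smile.items()}
print({k: round(c, 3) for k, c in prices.items()})
```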
Once the RND is estimated, we must change the measure to translate it into the subjective
6 We thank Barclays Capital for providing the implied volatility data. Barclays Capital disclosure: “Any analysis that utilizes any data of Barclays, including all opinions and/or hypotheses therein, is solely the opinion of the author and not of Barclays. Barclays has not sponsored, approved or otherwise been involved in the making or preparation of this Report, nor in any analysis or conclusions presented herein. Any use of any data of Barclays used herein is pursuant to a license.”
7 This assumption implies that investors are somewhat rational, which is not inconsistent with the CPT-assumption that the representative agent is less than fully rational. The CPT suggests that investors are biased, not that decision makers are utterly irrational to the point that their subjective density forecast should not correspond, on average, to the realized return distribution.
density function, a real-world probability distribution. This operation is possible via the pricing
kernel as follows:
$$\frac{f_Q(S_T)}{f_P(S_T)} = \Lambda \frac{U'(S_T)}{U'(S_t)} \equiv \varsigma(S_T), \tag{4.2.1}$$
where, fQ(ST ) is the RND, fP (ST ) is the real-world probability distribution, ST is wealth or
consumption, ς(ST ) is the pricing kernel, Λ is the subjective discount factor (the time-preference
constant) and U(·) is the representative investor utility function.
Since CPT-biased investors price options as if the data-generating process has a cumulative
distribution $\tilde{F}_P(S_T) = w(F_P(S_T))$, where $w$ is the weighting function, its density function
becomes $\tilde{f}_P(S_T) = w'(F_P(S_T)) \cdot f_P(S_T)$ (Dierkes, 2009; Polkovnichenko and Zhao, 2013) and
Eq. (4.2.1) collapses into Eq. (4.2.2):
$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T), \tag{4.2.2}$$
which, re-arranged into Eq. (4.2.4) via Eqs. (4.2.3a) and (4.2.3b), demonstrates that for the
CPT to hold, the subjective density function should be consistent with the probability weighted
EDF:
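$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{w'(F_P(S_T))}_{\text{probability weighting}} \cdot \underbrace{f_P(S_T)}_{\text{EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{4.2.3a}$$
$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{4.2.3b}$$
$$\frac{f_Q(S_T)}{\Lambda \frac{U'(S_T)}{U'(S_t)}} = \underbrace{\frac{f_Q(S_T)}{\varsigma(S_T)}}_{\text{Subjective density}} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \tag{4.2.4}$$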
Following Bliss and Panigirtzoglou (2004), Eq. (4.2.4) can be manipulated so that the
time-preference constant Λ of the pricing kernel vanishes, producing Eq. (4.2.5), which directly
relates the probability weighted EDF, the RND, and the marginal utility, U ′(ST ):
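$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \frac{\frac{U'(S_t)}{\Lambda U'(S_T)}\, f_Q(S_T)}{\int \frac{U'(S_t)}{\Lambda U'(x)}\, f_Q(x)\,dx} = \underbrace{\frac{\frac{f_Q(S_T)}{U'(S_T)}}{\int \frac{f_Q(x)}{U'(x)}\,dx}}_{\text{Generic subjective density function}} \tag{4.2.5}$$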
where $\int \frac{f_Q(x)}{U'(x)}\,dx$ normalizes the resulting subjective density function to integrate to one. Once
the utility function is estimated, Eq. (4.2.5) allows us to convert RND into the probability
weighted EDF. As the CPT marginal utility function is $U'(S_T) = \upsilon'(S_T)$, and, thus, $\upsilon'(S_T) = \alpha S_T^{\alpha-1}$
for $S_T \geq 0$, and $\upsilon'(S_T) = -\lambda\beta(-S_T)^{\beta-1}$ for $S_T < 0$, we obtain Eqs. (4.2.6) and (4.2.7):
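$$\tilde{f}_P(S_T) = \frac{\frac{f_Q(S_T)}{\alpha S_T^{\alpha-1}}}{\int \frac{f_Q(x)}{\alpha x^{\alpha-1}}\,dx} \quad \text{for } S_T \geq 0, \text{ and} \tag{4.2.6}$$
$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \underbrace{\frac{\frac{f_Q(S_T)}{-\lambda\beta(-S_T)^{\beta-1}}}{\int \frac{f_Q(x)}{-\lambda\beta(-x)^{\beta-1}}\,dx}}_{\text{Partial CPT density function}} \quad \text{for } S_T < 0. \tag{4.2.7}$$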
Eq. (4.2.6) relates the probability weighted EDF (on the LHS), which uses the CPT
probability distortion function for weighting, to the subjective density function on the RHS,
derived from the CPT value function for gains ($S_T \geq 0$). We call the RHS the partial CPT
density function (PCPT), as it does not embed the probability weighting function. Eq. (4.2.7) is the
corresponding equation for losses ($S_T < 0$). As the function $w(F_P(S_T))$ is strictly increasing
over the domain [0,1], there is a one-to-one relationship between $w(F_P(S_T))$ and a unique inverse
$w^{-1}(F_P(S_T))$. So, the result $\tilde{f}_P(S_T) = w'(F_P(S_T))\, f_P(S_T)$ also implies
$\tilde{f}_P(S_T) \cdot (w^{-1})'(F_P(S_T)) = f_P(S_T)$. This outcome allows us to directly relate the original EDF to the CPT subjective
density function, by “undoing” the effect of the CPT probability distortion functions within
the PCPT density function:
$$\underbrace{f_P(S_T)}_{\text{EDF}} = \underbrace{\frac{\frac{f_Q(S_T)}{\upsilon'(S_T)}}{\int \frac{f_Q(x)}{\upsilon'(x)}\,dx}\,(w^{-1})'(F_P(S_T))}_{\text{CPT density function}} \tag{4.2.8}$$
Thus, once the relation between the probability weighting function of the EDF and the
PCPT density is established, as in Eqs. (4.2.6) and (4.2.7), one can eliminate the weighting
scheme affecting returns by applying the inverse of such weightings to the subjective density
function without endangering such equalities, as in Eq. (4.2.8).
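Because the CPT weighting function has no closed-form inverse, the term $(w^{-1})'$ in Eq. (4.2.8) has to be obtained numerically. One possible way, sketched below, is to invert $w$ by bisection and to differentiate the inverse by central differences. The bisection approach, the tolerance, the step size and the function names are our assumptions for illustration; the sketch relies on $w$ being strictly increasing, as is the case for γ = 0.61.

```python
def cpt_weight(p, gamma):
    """Tversky and Kahneman (1992) probability weighting function."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

def cpt_weight_inverse(y, gamma, tol=1e-12):
    """Invert w by bisection; assumes w is strictly increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cpt_weight(mid, gamma) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_derivative(y, gamma, h=1e-6):
    """Central-difference estimate of (w^{-1})'(y)."""
    return (cpt_weight_inverse(y + h, gamma) - cpt_weight_inverse(y - h, gamma)) / (2.0 * h)

gamma = 0.61
p = 0.05
y = cpt_weight(p, gamma)                  # distorted probability
recovered = cpt_weight_inverse(y, gamma)  # should return the original probability

# Inverse function rule: (w^{-1})'(w(p)) * w'(p) should be close to one.
w_prime = (cpt_weight(p + 1e-6, gamma) - cpt_weight(p - 1e-6, gamma)) / 2e-6
print(round(recovered, 6), round(inverse_derivative(y, gamma) * w_prime, 4))
```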
As the RND is converted into the subjective density function, we must also estimate daily
empirical density functions (EDF). We build such time-varying EDFs from an invariant
component, the standardized innovation density, and a time-varying part, the lagged conditional
variance ($\sigma^2_{t|t-1}$) produced by an EGARCH model (Nelson, 1991). We first define the
standardized innovation as the ratio of empirical returns to their conditional standard deviation
($\ln(S_t/S_{t-1})/\sigma_{t|t-1}$) produced by the EGARCH model. From the set of standardized
innovations produced, we can then estimate a density shape, i.e., the standardized innovation density.
The advantage of such a density shape versus a parametric one is that it may include the
typically observed fat tails and negative skewness, which are not incorporated in simple
parametric models, e.g., the normal distribution. This density shape is invariant and it is made
time-varying by multiplying each standardized innovation by the EGARCH conditional
standard deviation at time $t$, which is specified as follows:
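$$
\ln(S_t/S_{t-1}) = \mu + \varepsilon_t, \quad \varepsilon \sim f(0, \sigma^2_{t|t-1}) \tag{4.2.9a}
$$
and
$$
\sigma^2_{t|t-1} = \omega_1 + \alpha\varepsilon^2_{t-1} + \beta\sigma^2_{t-1|t-2} + \vartheta\,\mathrm{Max}[0, -\varepsilon_{t-1}]^2, \tag{4.2.9b}
$$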
where $\alpha$ captures the sensitivity of the conditional variance to lagged squared innovations
($\varepsilon^2_{t-1}$), $\beta$ captures the sensitivity of the conditional variance to the lagged
conditional variance ($\sigma^2_{t-1|t-2}$), and $\vartheta$ allows for the asymmetric impact of
lagged returns ($\vartheta\,\mathrm{Max}[0,-\varepsilon_{t-1}]^2$). The model is estimated by
maximum likelihood, where innovations are assumed to be normally distributed.
Up to now, we have produced a one-day horizon EDF for every day in our sample, but we still lack
time-varying EDFs for the three-, six-, and twelve-month horizons. Thus, we use bootstrapping
to draw 1,000 paths towards these desired horizons by randomly selecting single innovations
($\varepsilon_{t+1}$) from the one-day horizon EDFs available for each day in our sample. We note
that once the first return is drawn, the conditional variance ($\sigma^2_{t-1|t-2}$) is updated,
affecting the subsequent innovation drawings of a path. This sequential exercise continues through
time until the desired horizon is reached. To account for drift in the simulated paths, we add the
daily drift estimated from the long-term EDF to drawn innovations, so that the one-period simulated
returns equal $\varepsilon_{t+1} + \mu$. The density functions produced by the collection of returns
implied by the terminal values of every path and their starting points are our three-, six-, and
twelve-month EDFs. These simulated paths contain, respectively, 63, 126, and 252 daily returns. We
note that by drawing returns from stylized distributions with fat tails and excess skewness, our EDFs
for the three relevant horizons also embed such features. This estimation method for time-varying
EDF is based on Rosenberg and Engle (2002).
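The bootstrap described above can be sketched as follows. This is a simplified illustration, not our estimation code: the function name, argument names, and all numerical inputs are hypothetical, the variance recursion is the one in Eq. (4.2.9b), and the horizon return of a path is taken to be the sum of its daily log returns.

```python
import math
import random


def simulate_horizon_returns(z, sigma2_0, eps_0, mu, omega, alpha, beta, theta,
                             horizon, n_paths=1000, seed=0):
    """Bootstrap horizon log returns from standardized innovations z.

    z        : list of standardized innovations (the invariant density shape)
    sigma2_0 : last available conditional variance
    eps_0    : last available innovation
    horizon  : number of daily returns per path (e.g. 63, 126 or 252)
    """
    rng = random.Random(seed)
    horizon_returns = []
    for _ in range(n_paths):
        sigma2_prev, eps_prev, total = sigma2_0, eps_0, 0.0
        for _ in range(horizon):
            # Update the conditional variance as in Eq. (4.2.9b).
            sigma2 = (omega + alpha * eps_prev ** 2 + beta * sigma2_prev
                      + theta * max(0.0, -eps_prev) ** 2)
            # Draw one standardized innovation and rescale it.
            eps = rng.choice(z) * math.sqrt(sigma2)
            # One-period simulated return: innovation plus daily drift.
            total += eps + mu
            sigma2_prev, eps_prev = sigma2, eps
        horizon_returns.append(total)
    return horizon_returns


# Toy call with made-up parameters (illustrative values only).
paths = simulate_horizon_returns(
    z=[-2.0, -0.5, 0.0, 0.4, 1.1], sigma2_0=1e-4, eps_0=0.0, mu=0.0004,
    omega=1e-6, alpha=0.05, beta=0.90, theta=0.05, horizon=63,
)
```

A histogram or kernel estimate of the returned list then plays the role of the horizon EDF for that day.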
Finally, once these three time-varying EDFs are estimated for all days in our sample, we
estimate δ and γ for each of these days using Eqs. (4.2.10) and (4.2.11).
$$
w^{+}(\gamma, \delta = \gamma) = \mathrm{Min} \sum_{b=1}^{B} W_b \left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \tag{4.2.10}
$$
$$
w^{-}(\delta, \delta = \gamma) = \mathrm{Min} \sum_{b=1}^{B} W_b \left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \tag{4.2.11}
$$
where $EDF^{b}_{prob}$ and $CPT^{b}_{prob}$ are the probability within bin b in the empirical and
CPT density functions and $W_b$ are weights given by
$\dfrac{1}{\frac{1}{\sqrt{2\pi}}\int_{0.5}^{\infty} e^{-x^2/2}\,dx = 1}$, the reciprocal of the normalized
normal probability distribution (above its median), split in the same total number of bins (B)
used for the EDF and CPT. Parameters δ and γ are constrained by an upper bound of 1.75
and a lower bound of -0.25. The weights applied in these optimizations are due to the higher
importance of matching probability tails in our analysis than the body of the distributions.
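The objective in Eqs. (4.2.10) and (4.2.11) is a weighted sum of squared differences between bin probabilities, minimized over a single bounded parameter. The sketch below illustrates it with a plain grid search over the interval [-0.25, 1.75]. The grid search, the function names, and the callable `cpt_probs_for` (a stand-in for whatever maps a candidate parameter to CPT bin probabilities) are assumptions of this example; the optimizer actually used is not specified here.

```python
def weighted_sse(edf_probs, cpt_probs, weights):
    """Weighted sum of squared bin-probability differences."""
    return sum(w * (e - c) ** 2 for w, e, c in zip(weights, edf_probs, cpt_probs))


def fit_parameter(edf_probs, cpt_probs_for, weights,
                  lower=-0.25, upper=1.75, steps=2000):
    """Grid search for the parameter that minimizes the weighted SSE.

    cpt_probs_for : hypothetical callable, parameter -> list of CPT bin probabilities
    """
    best_param, best_loss = None, float("inf")
    for i in range(steps + 1):
        param = lower + (upper - lower) * i / steps
        loss = weighted_sse(edf_probs, cpt_probs_for(param), weights)
        if loss < best_loss:
            best_param, best_loss = param, loss
    return best_param, best_loss


# Toy two-bin example in which the "true" parameter is 0.8 (illustrative only).
toy_edf = [0.08, 0.92]
toy_model = lambda g: [0.1 * g, 1.0 - 0.1 * g]
best, loss = fit_parameter(toy_edf, toy_model, weights=[1.0, 1.0])
```

Larger weights on the outer bins make mismatches in the tails more costly than mismatches in the body of the distribution, which is the purpose of $W_b$.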
4.3 Overweight of tails: dynamics and dependencies
4.3.1 Time-varying CPT parameters
In this section, we evaluate the dynamics of the overweighting of tails within the single stock
and index option markets. Descriptive statistics of the CPT’s estimated δ and γ parameters
via the methodology presented in section 4.2 are provided in Table 4.1.
Table 4.1: Descriptive statistics

Panel A - Gamma

| Maturity | Min | 25% Q | Median | Mean | 75% Q | Max | StDev | % γ < 1 | % γ < 1 (98-03) | % γ < 1 (03-08) | % γ < 1 (08-13) | RSS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 months | - | 0.74 | 0.91 | 0.89 | 1.04 | 1.75 | 0.23 | 64% | 97% | 35% | 59% | 0.0209 |
| 6 months | - | 0.81 | 0.99 | 0.96 | 1.14 | 1.75 | 0.28 | 52% | 92% | 18% | 46% | 0.0170 |
| 12 months | 0.04 | 0.91 | 1.03 | 1.01 | 1.14 | 1.75 | 0.22 | 41% | 83% | 11% | 29% | 0.0225 |

Panel B - Delta

| Maturity | Min | 25% Q | Median | Mean | 75% Q | Max | StDev | % δ < 1 | % δ < 1 (98-03) | % δ < 1 (03-08) | % δ < 1 (08-13) | RSS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 months | 0.29 | 0.64 | 0.68 | 0.68 | 0.72 | 1.01 | 0.08 | 100% | 100% | 100% | 100% | 0.0579 |
| 6 months | 0.30 | 0.54 | 0.60 | 0.60 | 0.65 | 1.75 | 0.10 | 100% | 100% | 100% | 100% | 0.0198 |
| 12 months | - | 0.40 | 0.45 | 0.47 | 0.52 | 1.75 | 0.10 | 100% | 100% | 100% | 100% | 0.0169 |

This table reports the summary statistics of the estimated cumulative prospect theory (CPT) parameters gamma (γ) from the single stock options market and delta (δ) from the index option market for each day in our sample as well as the optimizations’ residual sum of squares (RSS). The parameters γ and δ define the curvature of the weighting function for gains and losses, respectively, which leads the probability distortion functions to have inverse S-shapes. The γ and δ parameters close to unity lead to weighting functions that are close to unweighted (neutral) probabilities, whereas parameters close to zero indicate large overweight of small probabilities. Panel A reports the summary statistics of gamma (γ) when we assume a parameter of risk aversion (λ) equal to 2.25 (the standard CPT parametrization). Panel B reports the summary statistics of delta (δ) under the same risk aversion assumption. Column headings % γ < 1 and % δ < 1 report the percentage of observations in which parameters γ and δ are smaller than one, i.e., the proportion of the sample in which overweight of small probabilities is observed. We report this metric for the full sample as well as for three equal-sized splits of our full samples, namely: 98-03, from January 5, 1998 to January 30, 2003; 03-08, from January 31, 2003 to February 21, 2008; and 08-13, from February 22, 2008 to March 19, 2013.
We report summary statistics of the estimated γ for three-, six- and twelve-month options
in Panel A for the right tail from single stock options. The median and mean time-varying γ
estimates for three-month options are 0.91 and 0.89, which considerably exceed the parameter
value of 0.61 that is indicated by Tversky and Kahneman (1992). This finding suggests that
overweight of small probabilities is present within the pricing of short-term single stock call
options, but to a much lesser extent than provided by the theory. The results in Panel A
also show that γ is highly time-varying and strongly sample dependent. Overweight of small
probabilities in the single stock option market is very pronounced from 1998 to 2003 (present
at 97 percent of all times), but infrequent from 2003 to 2008 (present at only 35 percent of
all times). Our γ-estimates from three-month options range from 0 to 1.75 and the standard
deviation of estimates is 0.23. In Panel B, we report summary statistics of the estimated δ
from index options for the left tail. For δ estimated from three-month options, the median and
mean estimates are both 0.68, implying a probability weighting that roughly matches the one
in the CPT, which calibrates δ at 0.69. The δ-estimates are also time-varying; however, their
standard deviation (0.08) is almost three times lower than for the γ-estimates (0.23). The range
of δ-estimates is also much narrower than for γ, as it is between 0.29 and 1.01. In contrast to
the γ-estimates, our δ-estimates reflect a consistent overweight of small probabilities across all
sub-samples.
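To give a sense of what these parameter values mean for a tail event, the following sketch evaluates the probability weighting function at a one percent probability. It assumes the standard one-parameter functional form of Tversky and Kahneman (1992), $w(p) = p^{\gamma} / (p^{\gamma} + (1-p)^{\gamma})^{1/\gamma}$; the function name and the chosen probability are illustrative.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function (assumed form)."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


# Decision weight of a 1% tail event for the theoretical calibration (0.61),
# a value close to our three-month gamma estimates (0.89), and the neutral case (1.0).
for g in (0.61, 0.89, 1.0):
    print(g, tk_weight(0.01, g))
```

A parameter of one leaves the probability unchanged, whereas lower values assign the tail event a decision weight above its objective probability, with the distortion growing as the parameter falls.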
At the six-month maturity, overweight of small probabilities for γ seems even less acute
than suggested by the theory and by the three-month options findings. The median and mean
γ estimates for this maturity are 0.99 and 0.96, respectively. The distribution of γ is somewhat
skewed to the right, i.e., towards a less pronounced overweight of small probabilities, as the
median is higher than the mean. The 75th percentile of γ (1.14) already suggests an underweighting
of small probabilities. For index options with six-month maturity, the estimated δ indicates an
even more pronounced overweight of small probabilities (both the mean and median δ equal
0.60) than for three-month options. Overweight of small probabilities is again documented
across all sub-samples for δ but not for γ, for which overweight of small probabilities is more
frequent than underweight only in the 1998-2003 sample.
The γ estimates for the twelve-month maturity tend even more towards probability
underweighting than the six-month ones. The median γ is 1.03, whereas the mean γ is 1.01.
Overweight of small probabilities appears in only 41 percent of all times in the overall sample
and is rare in the 2003-2008 sample (11 percent of all times). In contrast, the mean and median
of the δ estimates from index options are 0.47 and 0.45, respectively, indicating an even stronger
overweight of small probabilities than for single stock options and other maturities. We argue
that such a pattern could be caused by institutional investors buying long-term protection, as
twelve-month OTM index options are less liquid than short-term ones.
OTM index puts seem to be structurally expensive from the perspective of overweight of
small probabilities, despite the fact that the degree of overvaluation varies in time. In contrast,
OTM single stock options are only occasionally expensive. Our γ estimates indicate
an infrequent occurrence of overweight of small probabilities in single stock options, clustered
within specific parts of our sample, e.g., during the 1998-2003 period. Our results fit nicely
within the seminal literature, for instance with Dierkes (2009), Kliger and Levy (2009), and
Polkovnichenko and Zhao (2013), regarding the index option market, and with Felix et al.
(2016b) regarding the single stock option market.
4.3.2 Overweight of tails and sentiment
In order to evaluate how time-variation in overweight of small probabilities relates to sentiment,
we run regressions between our proxies for overweight of tails, the Baker and Wurgler (2007)
sentiment measure and other explanatory control variables. Since we aim to combine overweight
of small probabilities parameters from both index options (bearish sentiment) and single stock
options (bullish sentiment), we use the Delta minus Gamma spread, δ - γ, as the explained
variable. The Delta minus Gamma spread captures the overweighting of small probabilities
from both index options and single stock options, because δ is the CPT tail overweight parameter
estimated from the index option market and γ is the equivalent parameter estimated from
the single stock option market. The explanatory variables in these regressions are (1) the Baker and
Wurgler (2007) sentiment measure⁸, (2) the percentage of bullish investors minus the percentage
of bearish investors given by the survey of the American Association of Individual Investors
(AAII), (3) a proxy for individual investors’ sentiment (see Han, 2008), and (4) a set of control
variables among the ones tested by Welch and Goyal (2008)⁹ as potential forecasters of the
equity market. The data frequency used is monthly, as this is the highest frequency in which
the Baker and Wurgler (2007) sentiment factor and the Welch and Goyal (2008) data set are
available. Our regression sample starts in January 1998 and ends in February 2013¹⁰. The OLS
regression model applied is given as:
$$
DGspread[\tau]_t = c + SENT_t + IISENT_t + E12_t + B/M_t + NTIS_t + TBL_t + INFL_t + CORPR_t + SVAR_t + CSP_t + \varepsilon_t, \tag{4.3.1}
$$
where τ is the option horizon, DGspread is the Delta minus Gamma spread, SENT is the Baker
and Wurgler (2007) sentiment measure, IISENT is the AAII individual investor sentiment
measure, E12 is the twelve-month moving sum of earnings of the S&P 500 index, B/M is
the book-to-market ratio, NTIS is the net equity expansion, TBL is the risk-free rate, INFL
is the annual inflation rate, CORPR is the corporate spread, SVAR is the stock variance,
and CSP is the cross-sectional premium. We also run the following univariate models for each
explanatory factor separately to understand their individual relation with the DGspread:
$$
DGspread[\tau]_t = \alpha_i + \beta_i x_{i,t} + \varepsilon_t, \tag{4.3.2}
$$
where x represents the 10 explanatory variables specified in Eq. (4.3.1), thus $i = 1, \ldots, 10$.
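Each univariate model in Eq. (4.3.2) is a simple regression with an intercept, a slope, and an R². A minimal sketch of the point estimates is given below; the function name and the toy numbers are assumptions of this example, and the Newey-West adjustment of the standard errors reported in our tables is deliberately left out.

```python
def univariate_ols(x, y):
    """Intercept, slope and R-squared of y = alpha + beta * x + error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - ss_res / ss_tot


# Toy data lying exactly on the line y = 1 + 2x (illustrative only).
alpha_hat, beta_hat, r2 = univariate_ols([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In the application, `x` would hold one explanatory variable (for instance SENT) and `y` the DGspread for a given maturity.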
Table 4.2 Panel A reports the results of Eq. (4.3.1), estimated across our three maturities for
the DGspread. The explanatory power of the multivariate regression is very high, ranging from
30 to 57 percent. As expected, SENT is positively linked to DGspread and statistically significant
across the three- and six-month maturities. This suggests that high sentiment exacerbates
overweight of small probabilities measured as DGspread. However, this relation is negative and
not significant at the twelve-month maturity. The univariate regressions of SENT confirm the
positive link between sentiment and DGspread at shorter maturities. Once again, this relation
is not present at the twelve-month horizon. The explanatory power of SENT in the univariate
setting is also high for the three- and six-month horizons, with 17 and 32 percent, respectively.
This result strengthens our hypothesis that overweight of small probabilities increases at higher
levels of sentiment and that sentiment seems to have a strong link to probability weighting by
investors as priced by index puts and single stock call options. This finding, however, applies
to the three- and six-month horizons only, since the twelve-month univariate regression has an
R² of zero.
⁸ Available at http://people.stern.nyu.edu/jwurgler/.
⁹ The complete set of variables provided by Welch and Goyal (2008) that is employed here is discussed in Appendix 3.C. In order to avoid multicollinearity in our regression analysis (some variables correlate 80 percent with each other), we exclude all variables that correlate more than 40 percent with others.
¹⁰ This sample is only possible because Welch and Goyal (2008) and Baker and Wurgler (2007) have updated and made available their data sets after publication.
Table 4.2: Regression results: Delta minus Gamma spread

Panel A - Multivariate

| | 3m | 6m | 12m |
|---|---|---|---|
| Intercept | 0.003 (0.056) | -0.491*** (0.037) | -0.490*** (0.058) |
| SENT | 0.030* (0.017) | 0.064*** (0.013) | -0.024 (0.019) |
| IISENT | 0.041 (0.047) | 0.096** (0.038) | -0.106** (0.048) |
| E12 | 0.000 (0.006) | -0.003 (0.004) | -0.028*** (0.007) |
| B/M | -0.364* (0.217) | 0.125 (0.132) | 0.163 (0.211) |
| NTIS | 0.560 (0.391) | 0.259 (0.285) | -0.814 (0.523) |
| TBL | 0.013 (0.008) | 0.036*** (0.006) | 0.029*** (0.009) |
| INFL | 0.453 (2.507) | 1.843 (1.885) | 2.311 (2.176) |
| CORPR | 0.225 (0.285) | 0.233 (0.202) | 0.044 (0.273) |
| SVAR | -1.426 (1.331) | 3.519*** (1.153) | 3.470* (1.982) |
| CSP | -0.125 (0.136) | 0.198 (0.122) | 0.261 (0.235) |
| R2 | 36% | 57% | 30% |
| F-stats | 8.2 | 19.5 | 6.4 |
| AIC | -308.1 | -369.1 | -273.2 |
| BIC | -274.4 | -335.4 | -239.6 |

Panel B - Univariate

| Regressor | Maturity | Intercept | Coefficient | R2 | F-stats | AIC | BIC |
|---|---|---|---|---|---|---|---|
| SENT | 3m | -0.063*** (0.010) | 0.071*** (0.014) | 17% | 32.5 | -326.1 | -320.0 |
| SENT | 6m | -0.369*** (0.008) | 0.097*** (0.016) | 32% | 72.9 | -186.0 | -179.9 |
| SENT | 12m | -0.520*** (0.013) | -0.003 (0.016) | 0% | 0.0 | 34.1 | 40.3 |
| IISENT | 3m | -0.064*** (0.011) | 0.123*** (0.044) | 6% | 9.1 | 0.0 | 0.0 |
| IISENT | 6m | -0.365*** (0.010) | 0.125** (0.050) | 6% | 9.4 | 0.0 | 0.0 |
| IISENT | 12m | -0.508*** (0.012) | -0.124*** (0.043) | 5% | 8.0 | 0.0 | 0.0 |
| E12 | 6m | -0.048 (0.031) | -0.001 (0.006) | 0% | 0.0 | 0.0 | 0.0 |
| B/M | 6m | 0.131*** (0.031) | -0.737*** (0.130) | 27% | 58.0 | 0.0 | 0.1 |
| NTIS | 6m | -0.055*** (0.011) | 1.075** (0.440) | 4% | 7.1 | 0.0 | 0.4 |
| TBL | 6m | -0.121*** (0.015) | 0.030*** (0.006) | 21% | 40.7 | 0.0 | 0.0 |
| INFL | 6m | -0.055*** (0.013) | 1.784 (3.350) | 0% | 0.7 | 0.0 | 2.2 |
| CORPR | 6m | -0.053*** (0.011) | 0.128 (0.472) | 0% | 0.2 | 0.0 | 0.3 |
| SVAR | 6m | -0.039*** (0.012) | -3.376** (1.307) | 4% | 6.1 | 0.0 | 1.4 |
| CSP | 6m | -0.052*** (0.011) | 0.029 (0.197) | 0% | 0.0 | 0.0 | 0.2 |

Panel A reports the regression results for Eq. (4.3.1) in a multivariate setting. The dependent variable is Delta minus Gamma spread (δ-γ), while as explanatory variables we specify: 1) the Baker and Wurgler (2007) sentiment measure (SENT), 2) the individual investor sentiment (IISENT), and 3) the explanatory variables used by Welch and Goyal (2008), while excluding factors that correlate to each other in excess of 40 percent (see Appendix 3.C for the full list of variables). Panel B reports the regression results for Eq. (4.3.2), in a univariate setting, in which Delta minus Gamma spread is regressed on the same set of explanatory variables. We report Newey-West adjusted standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
IISENT is also positively connected to DGspread in the multivariate regression at the
three- and six-month horizons but negatively at the twelve-month horizon. These results are
confirmed by the univariate regressions, as IISENT is positively linked to DGspread at the three-
and six-month horizons. The explanatory power of these regressions is 6 percent for both the
three- and six-month maturities, which is relatively high. For the twelve-month maturity in the
univariate regression, IISENT is negatively linked to DGspread and is statistically significant.
Once we analyze the other control variables in our regression, we observe that the results
are less stable than for the sentiment proxies. Table 4.2 indicates that some signs of control
variables change in both the multivariate and univariate regressions. TBL is the only control
variable that remains statistically significant and keeps its sign across the multivariate and
univariate models. The explanatory power of TBL is 21 percent in the univariate setting,
whereas the other independent variable with high explanatory power is book-to-market with 27
percent. B/M is only statistically significant in the three-month maturity of the multivariate
regressions. NTIS is negatively and significantly linked to DGspread in the univariate setting
as well as in the multivariate regression in the twelve-month maturity. SVAR is negatively and
significantly linked to DGspread in the univariate regression but in the multivariate regression
this result is not observed. Overall, these empirical findings suggest that fundamentals have a
relatively unstable link to the DGspread.
We note that the high stability of the relation between the sentiment factors and the
DGspread within the multivariate regressions indicates that sentiment and overweight of small
probabilities are strongly connected.
4.3.3 Overweight of tails, IV skews and higher moments of the RND
As a next step, we assess the relationship between DGspread and higher moments (skewness and
kurtosis) of the RND implied by options and IV skew measures. We undertake this analysis
for two reasons: 1) to understand to what extent DGspread is connected to other metrics
seemingly derived from IV, and 2) to approximate DGspread by an easier-to-obtain measure,
given the comprehensive estimation procedures required to compute γ and δ.
We expect a positive link between the estimated DGspread and IV skew measures, because
the presence of fat tails in the RND is a pre-condition for overweight of tail
probabilities and a corollary of OTM IVs being rich versus at-the-money (ATM) IVs. Similarly,
we observe negative skewness and fat tails in RNDs only if OTM options are expensive
versus ATM options and vice versa¹¹. Consequently, γ and δ are likely to be smaller than one
(overweight of small probabilities), and DGspread differs from zero if OTM options are expensive
versus ATM options, which supports the use of IV skew as another proxy for overweight
of tails.
¹¹ While these relations are widely acknowledged, Jarrow and Rudd (1982) and Longstaff (1995) provide a formal theorem for the link between IV skew and risk-neutral moments, whereas Bakshi et al. (2003) offer a comprehensive empirical test of this proposition for index options.
The IV skew measures used at the beginning are the standard measures: 1) IV 90 percent
(moneyness) minus ATM, 2) IV 80 percent minus ATM from index options (which captures
bearish sentiment), 3) IV 110 percent minus ATM, and 4) IV 120 percent minus ATM from
single stock calls (which captures bullish sentiment). However, as overweight of small proba-
bilities is observed from the tails of the two markets jointly via DGspread, and standard IV
skew measures only capture information from one market at a time, we suggest a new IV-based
measure. Our proposed IV skew sentiment metric, so-called IV-sentiment, is a combined mea-
sure of the index and single stock options markets. Our IV-sentiment measure is specified as
follows:
$$
\text{IV-sentiment} = \text{OTM index put IV}_{\tau p} - \text{OTM single stock call IV}_{\tau c}, \tag{4.3.3}
$$
where the subscript τ = 1...3 indexes the different option maturities used, p specifies the
moneyness levels 80 and 90 percent from index put options, and c specifies the moneyness levels
110 and 120 percent from single stock call options. Thus, our sentiment measure is calculated
as permutations of IVs from the three-, six- and twelve-month maturities, and four points in
the moneyness grid (80, 90, 110, and 120 percent), where the absolute distance between the
two moneyness levels used per sentiment measure and the ATM level (100 percent moneyness)
is kept constant. In other words, the IV-sentiment metric produced is restricted to the 90
minus 110 percent and the 80 minus 120 percent measures, hereafter called the IV-sentiment
90-110 and IV-sentiment 80-120 measures. From the granular data set across different moneyness
levels and maturities, we create six distinct skew-based measures of IV-sentiment. Using
such a construction, our IV-sentiment measure jointly incorporates bearish sentiment from
institutional investors and bullish sentiment from retail investors, similarly to DGspread.
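The construction in Eq. (4.3.3) amounts to a difference of two implied volatilities per maturity and moneyness pair. The sketch below shows how the six measures arise; the function name, the dictionary layout, and the implied volatility numbers are made up for illustration and do not come from our data set.

```python
def iv_sentiment(index_put_iv, stock_call_iv, pairs=((90, 110), (80, 120))):
    """Eq. (4.3.3): OTM index put IV minus OTM single stock call IV.

    Both inputs map (maturity_in_months, moneyness_in_percent) -> implied vol.
    """
    maturities = sorted({maturity for maturity, _ in index_put_iv})
    measures = {}
    for maturity in maturities:
        for put_level, call_level in pairs:
            label = f"{put_level}-{call_level}"
            measures[(maturity, label)] = (
                index_put_iv[(maturity, put_level)]
                - stock_call_iv[(maturity, call_level)]
            )
    return measures


# Made-up implied volatilities (illustrative only).
index_put_iv = {(m, k): 0.20 + 0.001 * (100 - k) for m in (3, 6, 12) for k in (80, 90)}
stock_call_iv = {(m, k): 0.30 + 0.0005 * (k - 100) for m in (3, 6, 12) for k in (110, 120)}
measures = iv_sentiment(index_put_iv, stock_call_iv)
```

Three maturities times two symmetric moneyness pairs yield the six IV-sentiment series.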
We assess the isolated relationship between DGspread and higher moments of the RND,
(standard) IV skews, and our IV-sentiment measures using the univariate models presented by
Eqs. (4.3.4) to (4.3.7). These models are estimated with OLS, where Newey-West standard
errors are used for statistical inference. Our daily regression samples start on January 2, 1998
and end on March 19, 2013.
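$$
DGspread[\tau] = \alpha_t\left[\tfrac{K}{S}\right] + IVSent_t\left[\tfrac{K}{S}; \tau\right], \tag{4.3.4}
$$
$$
DGspread[\tau] = \alpha_t + KURT^{m}_t(\tau), \tag{4.3.5}
$$
$$
DGspread[\tau] = \alpha_t + SKEW^{m}_t(\tau), \tag{4.3.6}
$$
$$
DGspread[\tau] = \alpha_t\left[\tfrac{K}{S}\right] + IVSKEW_t\left[\tfrac{K}{S}; \tau\right], \tag{4.3.7}
$$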
where K/S is the moneyness level of the option, τ is the option horizon, DGspread is the Delta
minus Gamma spread, IVSent is our IV-sentiment measure, SKEW is the RND return skewness implied
by options, KURT is the RND return kurtosis implied by options, and IVSKEW is the single market
IV skew measure, for both index option and single stock option markets. We note that the
superscript m for the variables KURT and SKEW aims to distinguish RND kurtosis and
skewness obtained from either RND implied by index options (m = io) or single stock options
(m = sso).
We estimate multivariate models of DGspread regressed on RND skewness, kurtosis, IV
skews and IV-sentiment to better understand the relation between these measures jointly and
overweight of small probabilities:
$$
DGspread[\tau] = \alpha_t\left[\tfrac{K}{S}\right] + SKEW^{m}_t(\tau) + KURT^{m}_t(\tau) + IVSent_t\left[\tfrac{K}{S}; \tau\right], \tag{4.3.8}
$$
Table 4.3 Panel A reports the estimates of Eqs. (4.3.4) to (4.3.7), when the DGspread is
regressed on RND moments, IV skews and IV-sentiment 90-110 in a univariate setting. The
empirical findings indicate that IV-sentiment is the variable that explains DGspread the most
across all maturities. The explanatory power of IV-sentiment is not only the highest but
also the most consistent, as its R² ranges from 30 to 46 percent. IV-sentiment is
negatively related to DGspread. Such a negative sign of the IV-sentiment regressor was expected
because the DGspread rises with higher bullish sentiment, whereas higher IV-sentiment suggests
a more pronounced bearish sentiment. Risk-neutral skewness and kurtosis also strongly explain
DGspread (by roughly 30 percent), though only within the three-month maturity. Skewness
and kurtosis explain DGspread by roughly 10 percent for six-month options, and 7 percent
for twelve-month ones. The coefficient signs are in line with our expectations since high levels
of RND skewness are associated with high DGspread (a bullish sentiment signal), while low
levels of RND kurtosis (less pronounced fat tails) are associated with high DGspread¹². In
contrast, standard IV skews explain very little of DGspread within the three-month maturity,
only between 0 and 4 percent. At longer maturities, the IV skews are able to better explain
DGspread; however, mostly when the skew measure comes from the single stock options market
(between 17 and 21 percent). As a robustness check, we note that the regression results are
virtually unchanged by the usage of either IV-sentiment 90-110 or 80-120 measures. As a first
impression, these results imply that IV-sentiment is strongly connected to DGspread and to
overweight of small probabilities.
Panel B shows that, in the multivariate regressions, IV-sentiment
is the most stable regressor with respect to coefficient signs, being negatively linked to DGspread
across all regressions, and is always statistically significant. These regressions have high
explanatory power (ranging from 42 to 61 percent), especially considering the daily frequency
of the data, which potentially contains more noise than lower-frequency data. In the multivariate
regression we use the IV-sentiment 90-110, while the (unreported) results using IV-sentiment
80-120 are qualitatively the same. Due to likely multicollinearity in this multivariate model,
we believe that our univariate models are more insightful than the multivariate one.
¹² The regression results reported here use RND kurtosis and skewness from index options (m = io). The results when RND is extracted from single stock options (m = sso) are unreported but qualitatively the same, as the coefficient signs are equal to the reported ones and the regressions’ explanatory power is roughly in the same range.
Table 4.3: Regression results: Delta minus Gamma spread and risk-neutral measures

Panel A - Univariate regressions

| Regressor | Maturity | Intercept | Coefficient | R2 | F-stats | AIC | BIC |
|---|---|---|---|---|---|---|---|
| Skewness | 3m | 0.019** (0.007) | 0.122*** (0.003) | 32% | 1861.1 | -2001 | -1989 |
| Skewness | 6m | -0.219*** (0.010) | 0.073*** (0.004) | 9% | 408.6 | -13 | -1 |
| Skewness | 12m | -0.436*** (0.009) | 0.054*** (0.003) | 7% | 285.7 | -944 | -931 |
| Kurtosis | 3m | -0.045*** (0.006) | -0.015*** (0.000) | 30% | 1714.6 | -1900 | -1888 |
| Kurtosis | 6m | -0.254*** (0.008) | -0.009*** (0.000) | 10% | 423.4 | -27 | -14 |
| Kurtosis | 12m | -0.460*** (0.007) | -0.007*** (0.000) | 7% | 315.6 | -972 | -959 |
| IV-sentiment 90-110 | 3m | -0.295*** (0.004) | -1.998*** (0.046) | 34% | 2085.1 | -2151 | -2139 |
| IV-sentiment 90-110 | 6m | -0.499*** (0.004) | -2.774*** (0.064) | 46% | 3423.7 | -2093 | -2081 |
| IV-sentiment 90-110 | 12m | -0.683*** (0.004) | -2.359*** (0.062) | 36% | 2228.0 | -2437 | -2424 |
| IV-sentiment 80-120 | 3m | -0.186*** (0.004) | -1.606*** (0.042) | 30% | 1707.2 | -1895 | -1883 |
| IV-sentiment 80-120 | 6m | -0.368*** (0.003) | -2.438*** (0.058) | 45% | 3199.9 | -1971 | -1959 |
| IV-sentiment 80-120 | 12m | -0.593*** (0.003) | -2.124*** (0.056) | 35% | 2121.2 | -2368 | -2355 |
| IV 110-ATM skew | 3m | -0.195*** (0.010) | 1.082** (0.435) | 0% | 10.3 | -485 | -472 |
| IV 110-ATM skew | 6m | -0.141*** (0.013) | 13.681*** (0.717) | 17% | 810.0 | -362 | -349 |
| IV 110-ATM skew | 12m | -0.332*** (0.011) | 16.172*** (0.711) | 21% | 1065.3 | -1612 | -1599 |
| IV 90-ATM skew | 3m | 0.029 (0.028) | -4.941*** (0.557) | 4% | 148.4 | -621 | -608 |
| IV 90-ATM skew | 6m | -0.052 (0.034) | -8.399*** (0.903) | 4% | 177.3 | 202 | 215 |
| IV 90-ATM skew | 12m | -0.407*** (0.027) | -4.997*** (0.993) | 1% | 49.4 | -717 | -705 |

Panel B - Multivariate regressions

| | 3m | 6m | 12m |
|---|---|---|---|
| Intercept | -0.273*** (0.025) | -0.465*** (0.038) | -0.495*** (0.040) |
| Skewness | 0.093*** (0.008) | 0.000 (0.009) | -0.032*** (0.009) |
| Kurtosis | -0.002** (0.001) | -0.007*** (0.001) | -0.009*** (0.001) |
| IV-sentiment 90-110 | -1.989*** (0.063) | -2.462*** (0.108) | -1.677*** (0.126) |
| IV 110-ATM skew | 0.511 (0.371) | 5.525*** (0.672) | 8.106*** (1.004) |
| IV 90-ATM skew | 3.876*** (0.391) | 4.129*** (0.732) | 0.000 (0.933) |
| R2 | 61% | 53% | 42% |
| F-stats | 1214.5 | 903.8 | 707.4 |
| AIC | -4154 | -2636 | -3008 |
| BIC | -4116 | -2598 | -2970 |

Panel A reports the regression results for Eqs. (4.3.4), (4.3.5), (4.3.6) and (4.3.7) in a univariate setting. The dependent variable for these regressions is Delta minus Gamma spread (δ-γ), a proxy for overweight of small probabilities. As explanatory variables we specify the risk-neutral skewness and kurtosis, IV 110-ATM skew (from single stock options), IV 90-ATM skew (from index options), and our IV-sentiment measure in two permutations per maturity: 1) IV-sentiment 90-110, and 2) IV-sentiment 80-120. Our IV-sentiment measure is an IV skew measure that combines information from the index option market and the single stock option market, see Eq. (4.3.3). For instance, the IV-sentiment 90-110 measure combines the IV from the 90 percent moneyness level from the index option market and the 110 percent moneyness level from the single stock option market. Panel B reports the regression results for Eq. (4.3.8) in a multivariate setting, in which Delta minus Gamma spread is regressed on the same set of explanatory variables. We report Newey-West adjusted standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
These findings strongly suggest that DGspread co-moves with our IV-sentiment measure
within the three-, six-, and twelve-month maturities. Hence, we feel comfortable using
IV-sentiment to approximate the overweighting of small probabilities, similarly to DGspread.
4.4 Predicting with overweight of tails
4.4.1 Predicting returns with DGspread and IV-sentiment
Section 4.3.1 has documented that the overweighting of small probabilities is strongly
time-varying. We hypothesize that it is linked to equity market reversals. Thus, in the following,
we employ regression analysis to test if overweight of small probabilities (proxied by DGspread)
can predict equity market returns. Given the results of section 4.3.3, in which our IV-sentiment
measure strongly links to the DGspread, we also run such predictive regressions by using
IV-sentiment as the explanatory variable.
In order to test the predictability of these two metrics, we regress rolling forward returns
with eight different investment horizons (42, 84, 126, 252, 315, 525, 735, and 945 days) on values
of DGspread and of our IV-sentiment measure, as specified by Eqs. (4.4.1) and (4.4.2):
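$$
\frac{p_{t+h+1}}{p_{t+1}} = \alpha_h + \beta_h\,DGspread[\tau]_t + \varepsilon_t, \tag{4.4.1}
$$
$$
\frac{p_{t+h+1}}{p_{t+1}} = \alpha_h + \beta_h\,IVSent[\tau]_t + \varepsilon_t, \tag{4.4.2}
$$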
where p is the equity market price level, h is the investment horizon, τ is the option maturity,
α is the unconditional expected mean of forward returns, and β is the sensitivity of forward
returns to DGspread and to IV-sentiment. We estimate Eqs. (4.4.1) and (4.4.2) via OLS with
Newey-West adjusted standard errors for the regression coefficients, due to the presence
of serial correlation in forward returns. Our regression samples start on January 2, 1998 and
end on March 19, 2013.
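The left-hand side of Eqs. (4.4.1) and (4.4.2) pairs the signal observed at day t with the price ratio between day t+1 and day t+h+1. A minimal sketch of this alignment is shown below; the function name and the toy series are assumptions of this example, and the regression step itself (OLS with Newey-West standard errors) is not reproduced.

```python
def forward_return_pairs(prices, signal, h):
    """Pair signal[t] with the forward price ratio p[t+h+1] / p[t+1].

    prices : daily price levels of the equity market
    signal : daily predictor values (e.g. DGspread or IV-sentiment), same length
    h      : investment horizon in days
    """
    pairs = []
    # The last usable t satisfies t + h + 1 <= len(prices) - 1.
    for t in range(len(prices) - h - 1):
        pairs.append((signal[t], prices[t + h + 1] / prices[t + 1]))
    return pairs


# Toy series (illustrative only).
toy_prices = [100.0, 101.0, 102.0, 103.0, 104.0]
toy_signal = [0.1, 0.2, 0.3, 0.4, 0.5]
pairs = forward_return_pairs(toy_prices, toy_signal, h=2)
```

Skipping one day between the signal date and the start of the return window, as the equations do, avoids using a price that is observed at the same time as the signal. Because consecutive windows overlap, the resulting return series is serially correlated, which is the reason for the Newey-West adjustment.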
Table 4.4 presents the empirical findings of forward returns regressed on DGspread. The
explanatory power of these regressions is mostly in the single digits and rarely exceeds ten percent.
For the three-month maturity, the explanatory power rises steadily up to the two-year horizon
(to nine percent), and then drops to one percent for forward returns at the 945-day horizon.
We note that DGspread tends to have low explanatory power and is not significant for short
horizons (42 to 126 days) and for higher maturities (twelve-month options). The coefficients
of DGspread are always negative for the three- and six-month maturities. This result was
expected as it implies that a high (low) DGspread, i.e., a bullish (bearish) sentiment, predicts
negative (positive) forward returns, i.e., reversals. For the twelve-month maturity, the coefficient
signs are unstable, being negative (and statistically significant) for the 252-day horizon,
while sometimes positive and insignificant for shorter horizons.
Table 4.4: Regression results: Delta minus gamma spread and IV-sentiment

Panel A - Delta minus Gamma spread; Panel B - IV-sentiment 90-110. Standard errors are in brackets.

Three-month options

| Panel | Horizon | Intercept | DGspread/IV-Sent | R2 | F-stats | AIC | BIC |
|---|---|---|---|---|---|---|---|
| A | 42 | 0.000 (0.003) | -0.03*** (0.007) | 1% | 31.4 | 0.0017 | 0.0061 |
| A | 84 | -0.004 (0.003) | -0.08*** (0.009) | 2% | 81.9 | 0.0023 | 0.0085 |
| A | 126 | -0.009** (0.004) | -0.13*** (0.010) | 4% | 150.1 | 0.0030 | 0.0108 |
| A | 252 | -0.016*** (0.006) | -0.26*** (0.015) | 7% | 274.5 | 0.0042 | 0.0157 |
| A | 315 | -0.023*** (0.006) | -0.33*** (0.018) | 9% | 347.3 | 0.0047 | 0.0175 |
| A | 525 | -0.030*** (0.008) | -0.43*** (0.032) | 9% | 351.0 | 0.0061 | 0.0232 |
| A | 735 | -0.027*** (0.008) | -0.39*** (0.034) | 7% | 228.5 | 0.0066 | 0.0255 |
| A | 945 | 0.003 (0.009) | -0.19*** (0.036) | 1% | 42.4 | 0.0075 | 0.0288 |
| B | 42 | 0.01*** (0.002) | 0.13*** (0.013) | 4% | 150.7 | 0.0011 | 0.0106 |
| B | 84 | 0.01*** (0.002) | 0.26*** (0.017) | 8% | 317.6 | 0.0015 | 0.0146 |
| B | 126 | 0.02*** (0.003) | 0.38*** (0.020) | 10% | 418.3 | 0.0019 | 0.0184 |
| B | 252 | 0.04*** (0.004) | 0.65*** (0.021) | 15% | 678.2 | 0.0026 | 0.0251 |
| B | 315 | 0.04*** (0.004) | 0.70*** (0.021) | 16% | 709.1 | 0.0027 | 0.0261 |
| B | 525 | 0.06*** (0.005) | 1.11*** (0.031) | 23% | 1083.3 | 0.0034 | 0.0337 |
| B | 735 | 0.08*** (0.006) | 1.59*** (0.033) | 34% | 1737.3 | 0.0039 | 0.0382 |
| B | 945 | 0.08*** (0.007) | 1.52*** (0.048) | 26% | 1096.3 | 0.0047 | 0.0460 |

Six-month options

| Panel | Horizon | Intercept | DGspread/IV-Sent | R2 | F-stats | AIC | BIC |
|---|---|---|---|---|---|---|---|
| A | 42 | -0.004 (0.003) | -0.03*** (0.007) | 1% | 31.7 | 0.0023 | 0.0056 |
| A | 84 | -0.017*** (0.004) | -0.08*** (0.009) | 3% | 114.0 | 0.0031 | 0.0077 |
| A | 126 | -0.023*** (0.005) | -0.12*** (0.010) | 4% | 145.2 | 0.0040 | 0.0099 |
| A | 252 | -0.050*** (0.007) | -0.25*** (0.014) | 8% | 301.5 | 0.0057 | 0.0142 |
| A | 315 | -0.066*** (0.008) | -0.32*** (0.017) | 10% | 399.4 | 0.0062 | 0.0159 |
| A | 525 | -0.123*** (0.009) | -0.53*** (0.023) | 16% | 669.1 | 0.0079 | 0.0204 |
| A | 735 | -0.138*** (0.008) | -0.56*** (0.024) | 17% | 651.7 | 0.0084 | 0.0221 |
| A | 945 | -0.088*** (0.009) | -0.39*** (0.028) | 7% | 232.6 | 0.0097 | 0.0257 |
| B | 42 | 0.02*** (0.002) | 0.24*** (0.022) | 5% | 186.6 | 0.0014 | 0.0176 |
| B | 84 | 0.04*** (0.002) | 0.45*** (0.027) | 8% | 344.4 | 0.0019 | 0.0241 |
| B | 126 | 0.05*** (0.003) | 0.64*** (0.029) | 10% | 447.9 | 0.0025 | 0.0304 |
| B | 252 | 0.10*** (0.005) | 1.12*** (0.032) | 15% | 660.3 | 0.0036 | 0.0437 |
| B | 315 | 0.12*** (0.005) | 1.38*** (0.040) | 18% | 803.0 | 0.0040 | 0.0486 |
| B | 525 | 0.18*** (0.007) | 2.18*** (0.050) | 26% | 1204.0 | 0.0054 | 0.0629 |
| B | 735 | 0.19*** (0.009) | 2.22*** (0.059) | 23% | 959.3 | 0.0063 | 0.0717 |
| B | 945 | 0.14*** (0.011) | 1.48*** (0.066) | 9% | 288.5 | 0.0079 | 0.0873 |

Twelve-month options

| Panel | Horizon | Intercept |
|---|---|---|
| A | 42 | 0.007 (0.005) |
| A | 84 | 0.004 (0.007) |
| A | 126 | 0.012 (0.009) |
| A | 252 | -0.037*** (0.013) |
| A | 315 | -0.100*** (0.014) |
| A | 525 | -0.177*** (0.019) |
| A | 735 | -0.228*** (0.020) |
| A | 945 | -0.164*** (0.021) |
| B | 42 | 0.02*** (0.002) |
| B | 84 | 0.04*** (0.003) |
| B | 126 | 0.06*** (0.003) |
| B | 252 | 0.11*** (0.005) |
| B | 315 | 0.14*** (0.006) |
| B | 525 | 0.21*** (0.008) |
| B | 735 | 0.21*** (0.009) |
| B | 945 | 0.13*** (0.012) |
DGsp
read/IV
-Sen
t0.00
-0.02
-0.01
-0.14***
-0.27***
-0.44***
-0.53***
-0.40***
0.25***
0.46***
0.68***
1.23***
1.53***
2.30***
2.23***
1.26***
(0.009)
(0.012)
(0.015)
(0.022)
(0.024)
(0.033)
(0.037)
(0.040)
(0.024)
(0.029)
(0.030)
(0.035)
(0.043)
(0.056)
(0.068)
(0.077)
R2
0%
0%
0%
2%
5%
8%
11%
5%
4%
7%
10%
15%
18%
23%
19%
5%
F-stats
0.0
2.8
0.9
62.8
198.7
313.8
396.4
166.3
160.0
299.2
409.8
636.9
806.9
1048.0
753.0
162.8
AIC
0.0037
0.0052
0.0066
0.0096
0.0105
0.0136
0.0144
0.0164
0.0016
0.0022
0.0028
0.0041
0.0046
0.0063
0.0074
0.0093
BIC
0.0066
0.0093
0.0119
0.0172
0.0190
0.0249
0.0267
0.0310
0.0194
0.0267
0.0336
0.0486
0.0539
0.0709
0.0813
0.0987
Panel
Areportsth
eregressionresu
ltsforEq.(4.4.1),
whichregresses
theDelta
minusGammaspread
oneightdifferen
thorizo
nsforforw
ard
equityretu
rns.
Panel
Breportsth
eregressionresu
lts
forEq.(4.4.2),
whichregresses
theIV
-sen
timen
t90-110
mea
sure
onth
esameforw
ard
equityretu
rnsusedin
Panel
A.Theex
plained
variablesare
forw
ard
retu
rnsforth
eS&P
500index
mea
sured
over
thefollowinghorizo
ns:
42,84,126,252,315,525,735,and945days.
Wereport
New
ey-W
estadjusted
standard
errors
inbrackets.
Theasterisks***,**,and*indicate
significa
nce
atth
eone,
five,
andtenpercentlevel,resp
ectively.
Panel B reports the regression results of Eq. (4.4.2), i.e., the outcomes of forward returns
regressed on our IV-sentiment 90-110 measure for three-, six-, and twelve-month
maturities13. The pattern of R2 across the different horizons tested is similar across the three
option maturities and analogous to the one observed for DGspread at the three-month
maturity: R2 rises from four percent to 28 percent when the horizon increases from 42 days (two
months) to 525 days (two years); beyond the two-year horizon, the explanatory power falls
slightly at 735 days (roughly three years) and collapses at the 945-day (3.7 years) horizon.
We observe that the explanatory power for the six- and twelve-month option maturity is just
slightly lower than for the three-month maturity. Statistical significance of the estimators is
often high, across option maturities and return horizons. The coefficients for the IV-sentiment
90-110 measure are always positive. This is as expected as it means that high (low) IV-
sentiment, i.e., bearish (bullish) sentiment, predicts positive (negative) forward returns. The
explanatory power, the stability of the coefficient signs, and the statistical significance of the
regressors using our IV-sentiment 90-110 measure clearly dominate the regression results that
use Delta minus Gamma spread. These results strengthen our earlier findings that our IV-
sentiment measure is a good representation of sentiment, especially concerning the prediction
of equity market reversals.
4.4.2 IV-sentiment pair trading strategy
Our previous results suggest that IV-sentiment is more strongly connected to forward returns
than the Delta minus Gamma spread itself. As such, we construct a trading strategy to further
test the predictive power of IV-sentiment. This strategy consists of a high-frequency (daily)
trading rule that aims to predict equity market reversals. Our hypothesis is that when the
IV-sentiment measure is significantly higher (lower) than its normal level, the overweighting of
small probabilities is extreme and likely to mean-revert in the subsequent periods in tandem
with the underlying market. The trading strategy, thus, buys (sells) equities when there is
excessive bearishness/panic (excessive bullishness/complacency) indicated by the high (low)
level of IV-sentiment.
The strategy is tested via a pair-trading rule between long and short positions in the S&P
500 index and a USD cash return index. For simplicity, the strategy is implemented as a
purely directional strategy where positions are constant in size and IV-sentiment is normalized
via a Z-score. The trading rule enters a five percent long equities position when the
IV-sentiment is higher than a pre-specified threshold, for example, two standard deviations
above its historical mean. The trading rule closes this position, by entering into a full cash
position, when the normalized IV-sentiment measure converges back to its average. Conversely,
the rule enters a short equities position when the IV-sentiment is lower than the threshold of
two standard deviations below its historical mean and buys back a full cash position when it
converges to its average. A trading cost of five basis points is charged on the five percent position traded
13 The regression results for our IV-sentiment 80-120 measure are qualitatively no different from the ones we present for IV-sentiment 90-110.
in equities. In order to avoid strategy overfitting, we 1) compute the Z-score using multiple
look-back periods, and 2) use multiple threshold levels to define excessive sentiment14. We
evaluate these contrarian strategies on a volatility-adjusted basis using standard performance
analytics such as the information ratio, downside risk characteristics, and higher moments of
returns. We compare these strategies to 1) other contrarian strategies that make use of implied
volatilities, such as an IV skew-based strategy, a volatility risk premia (VRP) strategy, and an
implied-correlation-based (IC) strategy15, 2) the equity market beta, i.e., the S&P 500 index,
and 3) alternative beta strategies, such as writing put options, a 110-95 collar strategy, the
G10 FX carry, equity cross-sectional momentum, and a time-series momentum strategy16. We
further evaluate such strategies by estimating the pairwise correlation coefficients between them,
as well as tail and higher-moment dependency statistics such as conditional co-crash
(CCC) probabilities (see Appendix 2.B) and co-skewness. Our back-test samples start on
January 2, 1998 and end on December 4, 201517.
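The trading rule described above can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the back-test code of the thesis: the Z-score window excludes the current day, a new weight earns returns from the next day onward, and the trading cost is deducted on the day the weight changes. The names `zscore_positions` and `strategy_returns` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_positions(signal, lookback=252, threshold=2.0, size=0.05):
    """Map a daily sentiment series into equity weights (+size, -size or 0)."""
    positions, pos = [], 0.0
    for t in range(len(signal)):
        if t >= lookback:
            window = signal[t - lookback:t]  # assumption: window excludes day t
            sd = pstdev(window)
            z = 0.0 if sd == 0 else (signal[t] - mean(window)) / sd
            if pos == 0.0 and z > threshold:
                pos = size        # extreme bearishness: buy equities
            elif pos == 0.0 and z < -threshold:
                pos = -size       # extreme bullishness: sell equities
            elif (pos > 0 and z <= 0) or (pos < 0 and z >= 0):
                pos = 0.0         # signal back at its average: move to cash
        positions.append(pos)
    return positions


def strategy_returns(positions, equity_ret, cash_ret, cost=0.0005):
    """Daily returns: yesterday's weight earns today's returns, net of costs."""
    out = []
    for t in range(1, len(positions)):
        w = positions[t - 1]
        gross = w * equity_ret[t] + (1.0 - w) * cash_ret[t]
        out.append(gross - cost * abs(positions[t] - w))
    return out
```

Running the rule over a grid of `lookback` and `threshold` values, as the text describes, yields one information ratio per parameter pair; the dispersion of those ratios is what the boxplots summarize.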
The boxplots of information ratios obtained by our IV-sentiment strategies and other IV-
based strategies are provided in Figure 4.1. We see that the IV-sentiment 90-110 strategy seems
to perform better than the IV-sentiment 80-120 strategy, as the information ratio means and
dispersion of the former strategy dominate the ones for the latter. The average information
ratio for the IV-sentiment 90-110 strategy is positive for the three- and six-month option
maturities but negative for the twelve-month maturity. For the three- and six-month strategies, all
one-standard deviation boxes for the information ratio lie in positive territory, suggesting that the
IV-sentiment 90-110 strategy is robust to changes in look-back and outer-threshold parameters.
Further, the IV-sentiment 90-110 is superior to single-market IV skew-based strategies for the
three- and six-month maturities, but not for the twelve-month maturity. At the three-month
maturity, the average information ratio and dispersion for the IV-sentiment 90-110 strategy are
similar to the ones for the VRP strategy. However, for the six- and twelve-month maturities,
the VRP strategies dominate the IV-sentiment 90-110 based on the average information ratio,
despite larger dispersion for the six-month maturity strategy.
Figure 4.1 shows that the IC strategies seem to deliver relatively high and consistent information
ratios, especially when calculated using the 80 and 90 percent moneyness levels. At
the three- and six-month maturities, the performance of IC strategies matches the performance
of the IV-sentiment 90-110 and VRP strategies. At the twelve-month maturity, the 80 and 90
percent IC strategies are superior to the IV-sentiment 90-110 measure. Overall, the boxplots
in Figure 4.1 suggest that the IV-sentiment 90-110 strategy is robust to changes in parameters
14 We also tested a percentile normalization and found results that are qualitatively similar to the use of Z-scores.
15 An implied-correlation (or dispersion trading) strategy buys (sells) index options and sells (buys) single-stock options, while delta hedging, to arbitrage price differences in these two volatility markets.
16 Strategy return series used are, respectively, the CBOE S&P 500 BuyWrite Index, the CBOE Investable Correlation Index, the S&P 500 index, CBOE put writing index, the CBOE 110-95 collar, the DB G10 FX carry index, the JPMorgan Equity Momentum index and the Credit Suisse Managed Futures index.
17 As our IV-sentiment measure requires much less (cross-sectional) IV data than the DGspread to be calculated, we were able to extend our full sample, which originally ended on March 19, 2013, until December 4, 2015.
but also that its performance is matched by other IV-based strategies. Table 4.5 Panel A pro-
vides the performance analytics for the IV-sentiment 90-110 strategy, as well as for alternative
strategies.
[Figure 4.1 panels: A) Three-month options; B) Six-month options; C) Twelve-month options]
Figure 4.1: Information ratios for daily IV-based strategies. The boxplots depict the distribution of information ratios (IR) obtained by the IV-based strategies tested, when different look-back periods and outer-thresholds are used per factor-specific strategy. Boxplot A depicts the distribution of IRs when the IV factor used is obtained from three-month options. Panels B and C depict the same information while using the IV factors obtained from six- and twelve-month options, respectively.
We observe that the IV-sentiment 90-110 strategy (using three-month option maturity)
delivers returns (20 basis points) and risk-adjusted returns (0.29) that are superior to many of
the other strategies compared, such as the S&P 500, the IV skew, the VRP, the IC, the 110-95
collar, the G10 FX carry, and the equity momentum. The only strategies that deliver
equal or higher risk-adjusted returns than our IV-sentiment 90-110 strategy are the time-series
momentum and the put writing. The return skewness for our IV-sentiment strategy is
positive (0.10) and above the average of the other strategies. A strategy with surprisingly
highly positive return skewness is the IC (0.43). The drawdown characteristics of our
IV-sentiment strategy, such as the maximum drawdown, the average recovery time, and the
maximum daily drawdown, are broadly similar to those of the other IV-based strategies.
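For readers who want to reproduce statistics of this kind, the sketch below computes an information ratio, return skewness, and maximum drawdown from a daily return series. It is illustrative only and rests on assumptions the text does not spell out: the information ratio is taken as annualized mean return over annualized volatility with 252 trading days per year, skewness is the population third standardized moment, and drawdowns are measured on compounded returns. All function names are hypothetical.

```python
import math


def information_ratio(returns, periods_per_year=252):
    """Annualized mean return divided by annualized volatility."""
    n = len(returns)
    mu = sum(returns) / n
    var = sum((r - mu) ** 2 for r in returns) / (n - 1)
    return (mu * periods_per_year) / math.sqrt(var * periods_per_year)


def skewness(returns):
    """Third standardized moment (population version)."""
    n = len(returns)
    mu = sum(returns) / n
    m2 = sum((r - mu) ** 2 for r in returns) / n
    m3 = sum((r - mu) ** 3 for r in returns) / n
    return m3 / m2 ** 1.5


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded return path."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst
```

Under this definition, an average return of 0.20 percent with a volatility of 0.71 percent gives a ratio of about 0.28, in line with the 0.29 reported for the IV-sentiment 90-110 strategy once rounding of the inputs is taken into account.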
Table 4.5: IV-sentiment based pair-trade strategy

Columns: (1) IV-sentiment 90-110; (2) IV Skew 3m 90; (3) VRP 3m 90; (4) IC 3m 90; (5) S&P500; (6) Put writing; (7) 110-95 collar; (8) G10 FX carry; (9) Equity Momentum; (10) CTA; (11) S&P500 + IV Sent; (12) Eq. Mom + IV Sent; (13) CTA + IV Sent.

Panel A - Back-test results
                                   (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)    (11)    (12)    (13)
Average return                   0.20%   0.14%   0.12%   0.17%   0.10%   0.34%   0.14%   0.18%   0.10%   0.51%   0.21%   0.24%   0.53%
Volatility                       0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%   0.71%
Information ratio                 0.29    0.20    0.17    0.24    0.14    0.48    0.20    0.26    0.14    0.71    0.29    0.34    0.75
Skewness                          0.10   -0.07   -0.01    0.43   -0.18   -0.60    0.01   -0.93   -0.45   -0.37   -0.05    0.09   -0.37
Kurtosis                         15.84   24.73   29.02   18.54    8.12   24.13    2.25   12.04    4.23    2.89    7.66   17.53    2.91
Max drawdown                     -1.7%   -1.6%   -2.9%   -1.7%   -3.0%   -2.5%   -2.9%   -3.2%   -2.9%   -1.4%   -2.4%   -2.2%   -1.1%
Avg recovery time (in years)      0.43    0.42    0.41    0.35    0.22    0.06    0.20    0.16    0.25    0.14    0.13    0.21    0.14
Max daily drawdown              -0.55%  -0.53%  -0.49%  -0.47%  -0.34%  -0.53%  -0.29%  -0.50%  -0.35%  -0.31%  -0.29%  -0.43%  -0.48%

Panel B - Correlation matrix
                              (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)
(1) IV-sentiment 90-110         1   0.41   0.18   0.70   0.10   0.04   0.13   0.07  -0.16  -0.11
(2) IV Skew 3m 90            0.41      1   0.55   0.59   0.18   0.16   0.08   0.16  -0.05  -0.03
(3) VRP 3m 90                0.18   0.55      1   0.51   0.41   0.42   0.18   0.15  -0.13  -0.11
(4) IC 3m 90                 0.70   0.59   0.51      1   0.34   0.31   0.24   0.14  -0.21  -0.13
(5) S&P500                   0.10   0.18   0.41   0.34      1   0.89   0.88   0.28  -0.05  -0.15
(6) Put writing              0.04   0.16   0.42   0.31   0.89      1   0.68   0.26  -0.08  -0.16
(7) 110-95 collar            0.13   0.08   0.18   0.24   0.88   0.68      1   0.23   0.13  -0.05
(8) G10 FX carry             0.07   0.16   0.15   0.14   0.28   0.26   0.23      1   0.02  -0.05
(9) Equity Momentum         -0.16  -0.05  -0.13  -0.21  -0.05  -0.08   0.13   0.02      1   0.30
(10) CTA                    -0.11  -0.03  -0.11  -0.13  -0.15  -0.16  -0.05  -0.05   0.30      1

Panel C - Tail dependence with IV-sentiment 90-110
                                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)      (10)
Co-skewness                  1.6E-12  -2.9E-13   1.0E-11   4.6E-12  -6.5E-10  -3.3E-09   9.6E-10  -9.8E-10   5.9E-10  -3.2E-09
1% cond. crash prob.            100%       51%       36%       77%       23%       21%       19%        9%       19%        2%
2% cond. crash prob.            100%       46%       37%       76%       32%       26%       26%       13%       15%        2%
5% cond. crash prob.            100%       44%       37%       78%       29%       32%       26%       17%       18%        7%

Panel A reports the results of contrarian pair-trade strategies based on our IV-sentiment 110-95 indicator and on other IV-based strategies such as the IV Skew, the volatility-risk premia (VRP), and other traditional and alternative beta strategies, i.e., buy & hold the S&P 500 index, put writing, 110-95 collar, G10 FX carry, cross-sectional equity momentum, and time-series momentum (CTA). The IV-based strategies use 252 days as the look-back period and +/- two standard deviations as convergence thresholds. The columns (11) and (12) of Panel A report statistics for strategies that combine the three-month IV-sentiment 90-110 strategy (column (1)) with the buy & hold the S&P 500 index (column (5)) and the CTA strategy (column (10)). Panel B reports the correlation coefficients of daily returns estimated over the period between January 2, 1998 and December 4, 2015, for the same strategies reported in Panel A. Panel C reports the co-skewness and the conditional co-crash (CCC) probabilities of the three-month IV-sentiment 90-110 with the other strategies, which indicate the degree of tail-dependence among them.
In the following, we combine our IV-sentiment strategy, one at a time, with a simple buy-and-hold
of the S&P 500 index, a cross-sectional equity momentum strategy, and a time-series momentum
strategy. These combinations are done by weighting returns in a 50/50 percent
proportion. Statistics for the combined strategies are presented in columns (11) to (13) of Panel A of
Table 4.5. We note that the combinations improve the information ratios of these three
strategies. The information ratio for the S&P 500 rises from 0.14 to 0.29, for the time-series
momentum from 0.71 to 0.75, and by a staggering 0.20 points for the cross-sectional momentum
strategy, from 0.14 to 0.34. The drawdown and skewness characteristics are also improved,
especially for the cross-sectional momentum strategy. We argue that these improvements in
the information ratio and downside statistics occur due to the low correlation and the low
higher-moment and tail dependencies of our IV-sentiment strategy with these alternative strategies. For
instance, Table 4.5 Panel B indicates that the IV-sentiment strategy is negatively correlated
with both equity momentum and time-series momentum, at -0.16 and -0.11, respectively.
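The diversification argument above can be checked with a back-of-the-envelope calculation. The sketch below assumes that the information ratio is mean return over volatility and that both legs are scaled to the same volatility (0.71 percent in Table 4.5); under those assumptions a weighted blend has volatility vol * sqrt(w^2 + (1-w)^2 + 2w(1-w)rho). The function name is hypothetical, and the inputs are the rounded figures from Panels A and B.

```python
import math


def combined_information_ratio(mu_a, mu_b, vol, rho, w=0.5):
    """Information ratio of a blend of two strategies with equal volatility `vol`."""
    mu = w * mu_a + (1.0 - w) * mu_b
    sigma = vol * math.sqrt(w ** 2 + (1.0 - w) ** 2 + 2.0 * w * (1.0 - w) * rho)
    return mu / sigma


# IV-sentiment (0.20%) blended 50/50 with each alternative, using Panel B correlations.
ir_spx = combined_information_ratio(0.20, 0.10, 0.71, 0.10)    # S&P 500
ir_mom = combined_information_ratio(0.20, 0.10, 0.71, -0.16)   # equity momentum
ir_cta = combined_information_ratio(0.20, 0.51, 0.71, -0.11)   # time-series momentum
```

The three values come out at roughly 0.28, 0.33, and 0.75, close to the 0.29, 0.34, and 0.75 reported in columns (11) to (13); the small gaps are consistent with rounding in the published inputs.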
Co-skewness and, especially, CCC probabilities of the IV-sentiment strategy with momentum
strategies are also very low (see Panel C of Table 4.5). Since Daniel and Moskowitz (2016)
document that momentum strategies, in particular cross-sectional momentum, are prone to
crashes, we suggest that the large improvement delivered by IV-sentiment to these strategies
is likely due to the reduction of their large negative tails.
Moreover, Table 4.5 Panel B indicates that the IV-sentiment strategy is, on average, positively
related to other strategies. The highest correlation observed for the IV-sentiment strategy
is with the IC strategy (0.70), which is an intuitive result given that these are the only two
strategies driven jointly by the index option market and the single stock option market. The
correlations of our IV-sentiment strategy with other IV-based strategies are also relatively high:
0.18 with the VRP and 0.41 with the IV skew 90 percent. The correlation of the IV-sentiment
strategy with the S&P 500 index is very low, at 0.10. The correlation of the IV-sentiment strategy
with other strategies that perform poorly in “bad times” is also low, at 0.04 with the put
writing, at 0.07 with the G10 FX carry, and at 0.13 with the 110-95 collar strategy. We also
note that other strategies can be highly correlated with each other, e.g., at 0.89 between the
S&P 500 and the put-writing, whereas negative correlations are mostly observed for momentum
strategies. Our findings on correlations among strategies are mostly reiterated by the estimated
tail-dependence between them using co-skewness and CCC probabilities reported in Panel C of
Table 4.5.
As a robustness check, we analyze whether our IV-sentiment high-frequency trading strategy
performs well due to both its legs or whether its merit is concentrated in either the long- or
the short-leg. We separate the performance of the two legs of the strategy as if they were two
different strategies and we compute individual performance statistics. In order to visualize the
results, we produce boxplots of information ratios (IRs) separately for the three option maturities,
which are shown in Figure 4.2.
[Figure 4.2 panels: A) Three-month options; B) Six-month options; C) Twelve-month options]
Figure 4.2: Information ratios for long- and short-leg of IV-based strategies. The boxplots depict the distribution of information ratios (IRs) obtained by the IV-based strategies tested, when different look-back periods and outer-thresholds are used per factor-specific strategy. Boxplots on the top row (in green) refer to IRs produced by the long-leg of IV-based strategies, whereas the ones in the second row refer to the short-leg of the same strategies. Boxplot A depicts the distribution of IRs when the IV factor used is obtained from three-month options. Panels B and C depict the same information while using the IV factors obtained from six- and twelve-month options, respectively.
The distributions of IRs for the long positions are shown in the upper plots, while
the distributions of IRs for the short positions are shown in the lower plots. We note that the
dispersion of IRs from the short-leg is much higher than from the long-leg; outliers are much more
frequent in the short-leg. We find that the median IRs of long-legs are substantially higher than
those of short-legs. The IR distributions of the short positions seem slightly skewed to the negative side,
whereas for the long positions they seem skewed to the positive side. These results indicate
that the merit of our IV-sentiment strategy is concentrated in its buy-signal rather than in its
sell-signal.
Figure 4.2 suggests that other IV-based strategies also seem to have their long-legs perform-
ing much better than their short-legs. This finding suggests that extreme bearish sentiment
signals may be more reliable than extreme bullish sentiment signals. One explanation for this
finding is that the IV may be more reactive on the downside, due to the leverage effect18.
In contrast, on the upside, a higher IV driven by the bidding up of call options might be offset
by an overall lower IV. Our results are partially in line with the literature on cross-sectional
returns and skew measures. Barberis and Huang (2008) suggest that stocks that have a high
18 The leverage effect refers to the typically observed negative correlation between equity returns and changes in their volatility, and was first noted by Black (1976).
skew tend to have high subsequent returns, whereas for a call with a high skew this relation is
inverse. However, other studies, such as Cremers and Weinbaum (2010), suggest that the rela-
tion between returns and volatility skews has the opposite direction. Assuming that there are
systematic reasons for OTM implied volatilities across stocks to move in tandem, e.g., market
risk, as suggested by Dennis and Mayhew (2002) and Duan and Wei (2009), then the logical
consequence from the cross-sectional relation between the implied skew and returns would be
that the overall equity market should reverse following times of extremely high skews.
Our results, thus, offer additional findings to the literature that explores the link between
variance-measures and forward returns (Ang and Liu, 2007; Bliss and Panigirtzoglou, 2004;
Doran et al., 2007; Pollet and Wilson, 2008). Most of these studies recognize a negative and
short-term relation between risk measures and returns, where a high variance links to subse-
quent negative to low returns. In contrast, our findings suggest that a high level of IV skew
relates to subsequent positive and high returns. Our finding is mostly in line with Boller-
slev et al. (2009), who document that equity market reversals are predicted by the variance
risk-premium.
Further, we aimed to compare the trading performance of the Baker and Wurgler (2007)
sentiment measure to our high-frequency strategy, but this was not possible, as the former factor
is only available at a monthly or quarterly frequency and was only published until 2010. Thus,
as a next step, we evaluate how trading strategies using our suggested IV-sentiment measure
compare to strategies that use the sentiment factor of Baker and Wurgler (2007). We do this
by implementing a low-frequency pair-trading strategy using both predictors. This pair-trading
strategy is identical to the one applied above with the only difference being the rebalancing
frequency and the number of observations in the look-back window. We use the following look-
backs for the calculation of Z-scores: 1, 3, 6, 9, 12, 18, and 24 months. The IV-sentiment
measures used are the IV-sentiment 80-120 and 90-110 factors, available in our three different
option maturities. Other back-test features (e.g., trading costs, strategy exit) are the same
as for the high-frequency pair-trade strategy. Figure 4.3 presents our results in a series of
boxplots. The empirical findings are displayed in columns for the different option maturities
and in rows for the different statistics evaluated: 1) information ratio, 2) return skewness and
3) horizon, proxied by the average drawdown length (in months) observed per strategy.
Our findings suggest that the IRs of the IV-sentiment strategies are much less dispersed
than the ones for the sentiment factor by Baker and Wurgler (2007). The median IR for the
IV-sentiment 90-110 factor is also higher than for the other two strategies. The IV-sentiment
90-110 factor is the only strategy in which almost all backtests deliver positive IRs, with the
exception of a few outliers. This is not the case for the other strategies, as a substantial number
of backtests deliver negative IRs.
[Figure 4.3 panels: A) Three-month options; B) Six-month options; C) Twelve-month options]
Figure 4.3: Information ratios, skewness and horizon for monthly IV-based strategies. The boxplots depict the distribution of information ratios (IRs), return skewness, and trade horizon (average drawdown) obtained by the IV-sentiment strategies tested, as well as the Baker and Wurgler (2007) sentiment factor when different look-back periods and outer-thresholds are used per strategy. Boxplot A depicts the distribution of these statistics when the IV factor used is obtained from three-month options. Panels B and C depict the same information when the IV factors used are obtained from six- and twelve-month options, respectively. Boxplots of IR, return skewness and trade horizon for the Baker and Wurgler (2007) factor are the same across option horizons but are shown for comparison with the IV-sentiment strategies.
In line with our earlier results, the IV-sentiment 90-110 factor seems to dominate the
IV-sentiment 80-120 factor. The return skewness for the IV-sentiment 90-110 strategy also
dominates the ones for the other two strategies, as all boxplot features (median, one standard
deviation, high and low percentile, and outliers) are superior. The IV-sentiment 90-110 factor
delivers the lowest median horizon of all strategies. The average horizons estimated for the
IV-sentiment 90-110 factor are 12, 13, and 19 months, respectively, for the strategies based on
the three-, six- and twelve-month options. The dispersion of strategies’ horizon is, however,
higher for the IV-sentiment 90-110 factor than for the Baker and Wurgler (2007) sentiment
factor. We conclude that a trading strategy based on our IV-sentiment measure seems to outperform
one based on the sentiment factor by Baker and Wurgler (2007) on several key aspects: IR, return
skewness, and trade horizon.
4.4.3 Out-of-sample equity returns predictive tests
4.4.3.1 Univariate models and forecast combination
Following our hypothesis that extreme bearish and bullish sentiment might be followed
by reversals in equity markets, we test here whether our IV-sentiment measure has
out-of-sample predictive power in forecasting the equity risk premium, in line with the analysis
introduced by Welch and Goyal (2008). We follow the methodology used by Campbell and
Thompson (2008) and Rapach et al. (2010), who build on Welch and Goyal (2008). Hence,
similarly to these three studies, our predictive OLS regressions are formulated as:
$$ r_{t+1} = \alpha_i + \beta_i x_{i,t} + \varepsilon_{t+1}, \tag{4.4.3} $$
where $r_{t+1}$ is the monthly excess return of the S&P 500 index over the risk-free interest rate,
$x_{i,t}$ is an explanatory variable hypothesized to have predictive power, and $\varepsilon_{t+1}$ is the error term.
Our predictive regressions also use the monthly data set provided by Welch and Goyal (2008)19,
but the set of 14 explanatory variables used closely follows Rapach et al. (2010)20.
From the predictive regressions in Eq. (4.4.3), we generate out-of-sample forecasts for the
next quarter (t + 1) by using an expanding window. Following Rapach et al. (2010), the first
parameters are estimated using data from January 1947 until December 1964, and forecasts
are produced from January 1965 until December 2014. The estimation window for B/M starts
slightly later than January 1947, but the number of observations available still allows forecasting
with B/M to start in January 1965. For the IV-sentiment-based regression, the data used
for the first parameter estimation start in January 1998 and end in December 1999, so that
out-of-sample forecasting is performed from January 2000 to December 2014 only.
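The expanding-window procedure can be sketched as follows. This is a minimal illustration, not the thesis code: it assumes two equally long, date-aligned lists `x` (predictor) and `r` (excess return), fits Eq. (4.4.3) on all pairs (x[t], r[t+1]) observed before the forecast date, and produces the historical-average benchmark from the same information set. The names `ols_fit` and `expanding_forecasts` are hypothetical.

```python
def ols_fit(x, y):
    """Intercept and slope of a univariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0.0:
        return my, 0.0  # degenerate sample: fall back to the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def expanding_forecasts(x, r, first):
    """Out-of-sample forecasts of r[s] for s = first, ..., len(r) - 1.

    Each forecast uses only data available before period s: the regression is
    re-estimated on the pairs (x[t], r[t+1]) with t + 1 < s, and the benchmark
    is the average of r[0], ..., r[s-1].
    """
    model, bench = [], []
    for s in range(first, len(r)):
        a, b = ols_fit(x[:s - 1], r[1:s])
        model.append(a + b * x[s - 1])
        bench.append(sum(r[:s]) / s)
    return model, bench
```

Because the window only ever grows, early forecasts rest on few observations, which is one reason the initial estimation period matters for short samples such as the IV-sentiment series.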
Following Campbell and Thompson (2008) and Rapach et al. (2010), restrictions on the re-
gression model specified by Eq. (4.4.3) are applied. The first restriction entails a sign restriction
on the slope coefficients of Eq. (4.4.3) for the 14 Welch and Goyal (2008) variables we employed.
The second restriction comprises setting negative forecasts of the equity risk premium to zero.
We specify an additional model containing both coefficient and forecast sign restrictions. The
original Eq. (4.4.3) with no restrictions applied is called the unrestricted model, whereas the
model with the two restrictions is called the restricted model. Once individual forecasts for $r_{t+1}$
are obtained using the restricted and unrestricted models for every variable, weighted measures
of central tendency (mean and median) of the N forecasts are generated by Eq. (4.4.4):
$$ \hat{r}_{c,t+1} = \sum_{i=1}^{N} \omega_{i,t} \, \hat{r}_{i,t+1}, \tag{4.4.4} $$
where $\{\omega_{i,t}\}_{i=1}^{N}$ are the combining weights available at time t. Our forecast combination method
19 Welch and Goyal (2008) monthly data was updated until December 2014 and is available at http://www.hec.unil.ch/agoyal/.
20 These variables are: the dividend price ratio, the dividend yield, the earnings-price ratio, the dividend-payout ratio, the book-to-market ratio, the net equity issuance, the Treasury bill rate, the long-term yield, the long-term return, the term spread, the default yield spread, the default return spread, the inflation rate, and the stock variance.
is a simpler and more agnostic approach than the one used by Rapach et al. (2010)21. The
mean and median combination methods are simply the equally weighted ($\omega_{i,t} = 1/N$) average and
the median of the forecasts. Our benchmark forecasting model is the historical average model with
the use of an expanding window.
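The restrictions and the combination step can be illustrated with a short sketch. Two points are assumptions on our part and not taken from the text: when the estimated slope has the unexpected sign, the forecast falls back to the historical average (one common way to implement setting the slope to zero), and the expected sign of each predictor is passed in by the user. The function names are hypothetical.

```python
from statistics import mean, median


def restricted_forecast(a, b, x_last, hist_mean, expected_sign=1):
    """Apply a slope sign restriction and a non-negativity restriction."""
    if b * expected_sign < 0:
        forecast = hist_mean          # assumption: wrong-signed slope is dropped
    else:
        forecast = a + b * x_last
    return max(forecast, 0.0)         # negative risk premium forecasts set to zero


def combine(forecasts, method="mean"):
    """Equal-weighted mean or median of N individual forecasts."""
    return mean(forecasts) if method == "mean" else median(forecasts)
```

For example, `combine([0.01, 0.02, 0.06])` gives the mean combination and `combine([0.01, 0.02, 0.06], "median")` the median one; neither requires estimating additional parameters, which is the transparency argument made in the text.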
We use the out-of-sample R2 statistic ($R^2_{OS}$) introduced by Campbell and Thompson
(2008) and followed by Rapach et al. (2010) for forecast evaluation. This method compares
the performance of a return forecast $\hat{r}_{t+1}$ and a benchmark or naïve return forecast $\bar{r}_{t+1}$ with
the actual realized return ($r_{t+1}$). We note that this method can be applied both to the single
factor-based forecast models and to the combined or multifactor forecast models, both
described in the previous section. The $R^2_{OS}$ statistic is given as:
$$ R^2_{OS} = 1 - \frac{\sum_{k=q_0+1}^{q} (r_{m+k} - \hat{r}_{m+k})^2}{\sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2}, \tag{4.4.5} $$
which evaluates the return forecasts from a predictive model (in the numerator) and the return
forecasts from a benchmark or naïve model (in the denominator) by comparing the mean
squared prediction errors (MSPE) for both methods. Because the ratio of MSPEs is subtracted
from 1 in the $R^2_{OS}$ statistic, its interpretation becomes: if $R^2_{OS} > 0$, then the MSPE of
$\hat{r}_{t+1}$ is smaller than that of $\bar{r}_{t+1}$, indicating that the forecasting model outperforms the
naïve (benchmark) model, and vice versa. To better evaluate the out-of-sample performance of models
graphically, we employ the cumulative sum of squared error difference ($CSSED_{OS}$) statistic given below. The
advantage of $CSSED_{OS}$ over $R^2_{OS}$ is that it starts at zero and accumulates over time in a
homoscedastic manner, whereas $R^2_{OS}$ typically displays a very high volatility at the start of the
(accumulation) period and a lower volatility as t increases22.
$$ CSSED_{OS} = \sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2 - \sum_{k=q_0+1}^{q} (r_{m+k} - \hat{r}_{m+k})^2. \tag{4.4.6} $$
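Both evaluation statistics follow directly from three aligned series: realized returns, model forecasts, and benchmark forecasts. The sketch below is a plain transcription of Eqs. (4.4.5) and (4.4.6) under that assumption; the function names `r2_os` and `cssed_path` are hypothetical.

```python
def r2_os(actual, model, bench):
    """Out-of-sample R2: one minus the ratio of model to benchmark squared errors."""
    sse_model = sum((r - f) ** 2 for r, f in zip(actual, model))
    sse_bench = sum((r - b) ** 2 for r, b in zip(actual, bench))
    return 1.0 - sse_model / sse_bench


def cssed_path(actual, model, bench):
    """Running sum of (benchmark squared error - model squared error)."""
    path, total = [], 0.0
    for r, f, b in zip(actual, model, bench):
        total += (r - b) ** 2 - (r - f) ** 2
        path.append(total)
    return path
```

An upward-sloping stretch of the cumulative path marks periods in which the predictive model beats the historical average, and the last value of the path divided by the benchmark's sum of squared errors recovers the out-of-sample R2.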
The results from our out-of-sample equity returns predictive tests are reported in Table 4.6.
Panel A reports the findings for the out-of-sample forecasting period between January 1965
and December 2014 for all individual variables except our IV-sentiment factor (IV Sent), for
which forecasts are only available from January 2004 to December 2014, and for the combined
forecasts. For individual models, $R^2_{OS}$ comes from the restricted model, whereas for the aggregated
models, the results are reported for both the restricted and the unrestricted models. The
results of the aggregate models are reported in means and medians, reflecting the aggregation
21 Rapach et al. (2010) classify their combination methods in two classes: the first class uses a mean, median, and trimmed mean approach for forecast combination, and the second class uses a discounted mean square prediction error (DMSPE) methodology. The DMSPE method aims to set combining weights as a function of the historical forecasting performance of the individual models during the out-of-sample period. This method weights more recent forecasts heavier than older ones by the use of one additional parameter. Despite the desirable features of such a second class combination method, we prefer to stick to the first class methods only because they are more transparent and do not require the choice of an additional parameter.
22 The undesirable graphical pattern of $R^2_{OS}$ is caused by the normalization through $\sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2$, which at the start of the sample tends to be very small relative to $CSSED_{OS}$. Note that $R^2_{OS} = CSSED_{OS} / \sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2$.
108
method used.
Table 4.6: Out-of-sample equity risk premium
Columns (1)-(4): individual predictive regression model forecasts. Columns (5)-(6): combination forecasts. Columns (7)-(8): machine learning methods.

Panel A. 1965:1-2014:12 out-of-sample period
| Predictor (1) | R2OS (%) (2) | Predictor (3) | R2OS (%) (4) | Combining method (5) | R2OS (%) (6) | Method (7) | R2OS (%) (8) |
|---|---|---|---|---|---|---|---|
| D/P | -0.30 | LTY | -0.28 | Mean-Unconstrained | 1.08 | Kitchen-sink (OLS) | -88.14 |
| D/Y | -0.11 | TMS | -0.50 | Median-Unconstrained | 0.64 | Ridge regression | 0.81 |
| E/P | -0.41 | LTR | 0.22 | | | Principal Component Regression | -5.93 |
| D/E | -0.76 | DFY | -0.69 | Mean-Constrained | 1.11 | Random Forest | -9.97 |
| B/M | -0.88 | DFR | -0.55 | Median-Constrained | 0.63 | Neural Networks | -84.14 |
| NTIS | -0.83 | TBL | -0.01 | | | Mean-models | -6.35 |
| INFL | 0.48 | | | | | Median-models | -2.06 |
| SVAR | 0.02 | | | | | | |

Panel B. 2004:1-2014:12 out-of-sample period
| Predictor (1) | R2OS (%) (2) | Predictor (3) | R2OS (%) (4) | Combining method (5) | R2OS (%) (6) | Method (7) | R2OS (%) (8) |
|---|---|---|---|---|---|---|---|
| D/P | -0.82 | LTY | 0.62 | Mean-Unconstrained | 0.25 | Kitchen-sink (OLS) | -62.64 |
| D/Y | -0.53 | TMS | -0.94 | Median-Unconstrained | -0.35 | Ridge regression | -1.76 |
| E/P | -1.31 | LTR | 0.01 | Mean-Unconstrained + IVSent | 0.63 | Principal Component Regression | 0.09 |
| D/E | -2.13 | DFY | -1.26 | Median-Unconstrained + IVSent | 0.27 | Random Forest | -8.80 |
| B/M | -0.16 | DFR | -0.64 | Mean-Constrained | 0.40 | Neural Networks | -66.95 |
| NTIS | -2.63 | TBL | -0.05 | Median-Constrained | -0.25 | Mean-models | 1.24 |
| INFL | -2.58 | IVSent3m | 2.45 | Mean-Constrained + IVSent | 0.75 | Median-models | 2.12 |
| SVAR | 4.17 | IVSent6m | 2.45 | Median-Constrained + IVSent | 0.19 | | |
| | | IVSent12m | 1.59 | | | | |
This table reports the results from the predictive regressions of individual factor models and of combined-factor models relative to the historical average naïve (benchmark) model. R2OS is the Campbell and Thompson (2008) out-of-sample R2 statistic. If R2OS > 0, then the mean squared prediction error (MSPE) of $\hat{r}_{t+1}$, i.e., the predictive regression forecast, is smaller than that of $\bar{r}_{t+1}$, i.e., the naïve forecast, indicating that the forecasting model outperforms the latter (benchmark) model. Panel A reports the results for the full out-of-sample period available (1965:1-2014:12) for all variables tested by Rapach et al. (2010). Panel B reports the results for the latest period within the entire out-of-sample history (2004:1-2014:12) and includes the three-month IV-sentiment 90-110 factor (IVSent) in addition to the variables tested by Rapach et al. (2010).
Panel A suggests that performance is not consistent across factors within the longer history
of the out-of-sample test. Some factors outperform others by a large amount. Concurrently, the
performance of most single factors is quite inconsistent through time, as Figure 4.4 depicts: the
slope and level of CSSEDOS repeatedly change from negative to positive and vice versa for
almost all factors. For some of them, CSSEDOS even flips sign at times within the sample. In
contrast, the aggregated models deliver better performance across restricted and unrestricted
models, whether averages or medians are used as the aggregation method. Moreover, the performance
of the weakest aggregate model (0.63) is superior to that of the best individual factor (INFL at 0.48)
within the full sample.
Once we evaluate the period from January 2004 to December 2014, when IVSent is used,
we observe that the performance across factors remains inconsistent. The performance across
individual factors looks less dispersed in this sample than in the full sample, but the overall
performance deteriorates. The IVSent factor performs well (ranging from 1.59 to 2.45 depending
on the maturity), despite being strongly outperformed by the SVAR factor, while other
factors perform extremely poorly (NTIS at -2.63, INFL at -2.58). The combined models that
do not include IVSent in their median versions (restricted and unrestricted) underperform the
naïve forecasting benchmark, as their R2OS is negative.
Figure 4.4: Cumulative Sum of Squared Error Differences of single factor predictive regressions. The lines in every plot depict the out-of-sample Cumulative Sum of Squared Error Differences (CSSEDOS) calculated by Eq. (4.4.6) for the historical average benchmark-forecasting model minus the cumulative squared prediction errors for the single-factor forecasting models constructed by using 14 out of all the explanatory variables suggested by Welch and Goyal (2008), as well as the IV-sentiment 90-110 factor with a three-month maturity. Positive values of CSSEDOS mean that single-factor forecasting models that employ the Welch and Goyal (2008) factors and IVSent outperform the historical average benchmark-forecasting model.
Interestingly, when our IVSent factor is added to these models, the performance improves
substantially, outperforming the benchmark. We observe the same for models based on the
mean: the mean-unconstrained and the mean-constrained models ex-IVSent show an R2OS of
0.25 and 0.40, respectively. When the IVSent factor is added to them, R2OS improves to 0.63
and 0.75, respectively. Therefore, our IVSent factor appears to affect the
combined model in a very distinct way when compared to other factors. R2OS from models
that use median forecasts is worse than for models that aggregate forecasts by averaging.
Nonetheless, improvements delivered by the inclusion of IVSent and the imposition of model
constraints are qualitatively the same across models aggregated by either median or averaging.
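The mechanics of the first-class combination methods discussed above can be sketched in a few lines of Python. The forecast values below are hypothetical, the function names are ours for illustration, and the restriction shown (truncating negative equity premium forecasts at zero) is only one of the Campbell and Thompson (2008) constraints; it is included here as an assumption about how a restricted forecast may be formed, not as a description of our exact implementation.

```python
from statistics import mean, median


def restrict(forecast):
    """Assumed restriction: a negative equity premium forecast is set to zero."""
    return max(forecast, 0.0)


def combine(forecasts, method="mean"):
    """Combine the t+1 forecasts of the individual predictive regressions."""
    if method == "mean":
        return mean(forecasts)
    if method == "median":
        return median(forecasts)
    raise ValueError("method must be 'mean' or 'median'")


# Hypothetical one-step-ahead forecasts from four univariate models.
individual = [0.004, -0.002, 0.010, 0.001]
ivsent_forecast = 0.012  # hypothetical forecast from an IVSent-based model

mean_ex = combine(individual, "mean")
median_ex = combine(individual, "median")
mean_with = combine(individual + [ivsent_forecast], "mean")
mean_restricted = combine([restrict(f) for f in individual], "mean")
```

Adding one more individual forecast simply enlarges the pool that is averaged, which is why a weakly correlated predictor can shift the combined forecast noticeably even when the pool is large.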
We also find that our IVSent factor is largely uncorrelated with other factors. The correlation
coefficient of the IVSent factor that uses three-month options with other individual factors is
most of the time negative or close to zero, and only exceeds 0.5 when evaluated against the
long-term yield (LTY)²³. This correlation is higher for the IVSent factor computed using six- and
twelve-month option maturities. These results suggest that the improvements made by our
IVSent factor to the combined models stem partially from diversification benefits rather than
from forecast performance (R2OS) alone.
23 A full correlation matrix among the individual predictive factors tested by Rapach et al. (2010) and IV-sentiment factors can be provided upon request.
4.4.3.2 “Kitchen sink” and machine learning-based models
Further, we also test a “kitchen sink” model²⁴ as used by Welch and Goyal (2008) and Rapach
et al. (2010), but we extend it beyond the standard linear model toward machine learning
algorithms. Our aim is to test whether more advanced models can fix the exceptionally poor
out-of-sample performance of the multivariate approach to forecasting the equity risk premium,
as reported by Welch and Goyal (2008) and Rapach et al. (2010). The models we test
in addition to the “kitchen sink” OLS model are: 1) Ridge regression (Hoerl and Kennard,
1970), 2) Principal Component Regression (Massy, 1965), 3) Random forest (Breiman, 2001),
and 4) Neural Networks²⁵,²⁶. Our hypothesis for running this “horse race” of models is that
machine learning-based models might be able to improve over the multivariate OLS regression
by 1) reducing its variance and thus avoiding overfitting, 2) better modelling potential
non-linearities present in the data, or 3) dampening the effect of collinearity in the regressors.
Our results from testing a “kitchen sink” OLS model reiterate those of Welch and Goyal
(2008) and Rapach et al. (2010) (see Table 4.6). The model is the worst-performing one in R2OS
terms across all univariate and multivariate models. In contrast, individual machine learning
algorithms using the same set of variables outperform the “kitchen sink” model but do not
consistently outperform the models that combine forecasts from univariate models. The Ridge
regression model seems to be the best performing across all multivariate models, as it delivers
a high R2OS in the January 1965-December 2014 sample and a less negative R2OS than other models
in the January 2004-December 2014 sample. Given its linear character, the main advantages
of Ridge regression over the “kitchen sink” are the regularization (shrinkage) applied as well as
its adequacy to multicollinear systems. As the principal component regression also addresses
multicollinearity problems and performs quite poorly in the January 1965-December 2014
sample, we conjecture that the main benefit delivered by the Ridge regression might be the
shrinkage, which likely dampens the overfitting undergone by the “kitchen sink” model. The
Random forest model performs poorly, although less badly than the “kitchen sink” and the
Neural Networks models, suggesting that the structure imposed by constraints plus forecast
combination seems to add more value to predictions than the ability to capture non-linear
relationships. The Neural Networks model performs as badly as the “kitchen sink” model, likely
due to overfitting. As we intentionally did not tune the Random forest and the Neural Networks
models much, the chance that these models are overfitted is high, especially for the Neural Networks
model. These two approaches are known for their potential for overfitting if stop-training
procedures are not imposed.

24 The “kitchen sink” includes all 14 predictive variables used in our univariate models.
25 We tune Ridge regression by using cross-validation with 10 folds. We tune our Random forest model using a single pass of out-of-bag errors to estimate the optimal number of predictors sampled for splitting at each node. We use cross-validation in the estimation of our Neural Networks model to come up with the number of layers and neurons (among a set of pre-defined structures) only. We do not apply any early-stop procedure. A detailed description of the tuning procedure applied to these models is beyond the scope of this thesis.
26 A more detailed description of the Ridge regression and Random forest models is provided in Appendix 5.A. In brief, the Principal Component Regression model is a regression analysis in which the explanatory variables are the orthogonal factors generated by a principal component analysis (PCA) (see Appendix 5.A for details on PCA). Given the complexity and flexibility of Neural Networks/deep learning, further details on this method are beyond the scope of this thesis but available in Haykin (1999). For more insight into all the machine learning methods used in this chapter, see Hastie et al. (2008).
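To make the shrinkage argument concrete, the sketch below compares “kitchen sink” OLS coefficients with Ridge coefficients on two nearly collinear predictors. It is a minimal pure-Python illustration under stated assumptions: the data are hypothetical and demeaned (so no intercept is fitted), the penalty value is arbitrary rather than cross-validated, and the helper names `solve` and `ridge` are ours.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge(x_rows, y, lam):
    """Solve (X'X + lam I) b = X'y; lam = 0 reproduces OLS."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


def squared_norm(coefs):
    return sum(c * c for c in coefs)


# Hypothetical demeaned data: the second predictor nearly duplicates the first.
x_rows = [[-2.0, -1.9], [-1.0, -1.1], [0.0, 0.0], [1.0, 1.1], [2.0, 1.9]]
y = [-0.9, -0.6, 0.1, 0.4, 1.0]

ols_coefs = ridge(x_rows, y, 0.0)
ridge_coefs = ridge(x_rows, y, 1.0)
```

With collinear regressors the OLS coefficients tend to be large and offsetting, whereas the penalized solution has a smaller coefficient norm, which is the variance-reducing effect we conjecture to be at work.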
Observing the evolution of CSSEDOS for the median-based (restricted and unrestricted)
combined models in Plot A of Figure 4.5, we notice that both lines have slopes that are
predominantly positive or flat. Positive slopes of the CSSEDOS curve indicate that the combined
model outperforms the benchmark out-of-sample. These CSSEDOS lines match very closely
the ones presented by Rapach et al. (2010) up to 2004, when their sample ends. The evolution
of CSSEDOS for our individual factors in Figure 4.4 is also very similar to Rapach et al. (2010): some
CSSEDOS curves are positively sloped during certain periods, but often all factors display
negatively sloped curves. The CSSEDOS curve for the IVSent factor is mostly positively sloped but
relatively flat from 2004 to 2007, as the last plot in Figure 4.4 indicates. These results reiterate
the primary conclusion of Welch and Goyal (2008), Campbell and Thompson (2008) and
Rapach et al. (2010): individual predictors that reliably outperform the historical average in
forecasting the equity risk premium are rare but, once these models are sensibly restricted and
aggregated in a multi-factor model, their out-of-sample predictive power improves considerably.
This conclusion applies also to the inclusion of our IVSent factor within the multi-factor
model. Plot B of Figure 4.5 shows that the CSSEDOS curves for the models that include the
IVSent factor are visibly steeper than the ones that do not include it. Further, the findings in
Figure 4.5 indicate that restricted models seem to be superior to unrestricted ones by having
either higher or less volatile CSSEDOS.
(a) Without IV-sentiment (b) With IV-sentiment
Figure 4.5: Cumulative Sum of Squared Error Differences of combined predictive regressions. The black line in Plot A depicts the Cumulative Sum of Squared Error Differences (CSSEDOS) for the historical average benchmark-forecasting model minus the cumulative squared prediction errors for the aggregated predictive regression-forecasting model constructed by using 14 Welch and Goyal (2008) explanatory variables in univariate unrestricted models. The green and red lines in Plot A depict the same forecast evaluation statistic, i.e., the CSSEDOS, when such 14 univariate models are restricted as suggested by Campbell and Thompson (2008). The red line represents the CSSEDOS when coefficients are constrained to have the same sign as the priors suggest. Plot B zooms in on the January 2003-December 2014 period, where the black and red lines are the same as in Plot A, whereas the green and blue lines are the CSSEDOS when our IV-sentiment factor is added to the multifactor forecasts model for the unrestricted and restricted model, respectively. The forecasting period is January 1965-December 2014 for all variables except IVSent, for which forecasts are only available from January 2004-December 2014. Forecast aggregation in both models is done by calculating the mean of the t+1 forecast from each individual predictive regression.
However, even if the combined factor models perform much better than the individual
predictors do, the red and black lines in Plots A and B of Figure 4.5 are not always positively
sloped, which is in line with Rapach et al. (2010). The CSSEDOS curve is strongly positively sloped
from 1965 to 1975, more moderately positively sloped from 1975 to 1992, negatively sloped from
1992 to 2000, and then slightly positive to flat until 2008, when it drops sharply amid the global
financial crisis, up to December 2014. The addition of our IVSent factor to the combined model
produces the blue and green lines in Plot B of Figure 4.5. These new curves are similarly flat
during the 2004 to 2008 period, but both rise sharply from the beginning of
2008. These curves’ profiles suggest that our IVSent factor has considerably improved the
out-of-sample performance of the combined model, especially in times when the other factors broke
down or did not provide an edge versus the historical average predictor. Thus, the inclusion
of our IVSent factor seems to revive the conclusion reached by the previous literature, namely that
combined factor models are able to improve on individual factor models. At the same
time, the recent poor performance of the combined models ex-IVSent underscores that factor
identification is still a major challenge for the specification of combined models. Overall, our
empirical findings suggest that IV-based factors provide a relevant explanatory variable for the
time-variation of equity returns.
4.4.4 IV-sentiment and equity factors
In this section we test whether the stream of returns produced by the IV-sentiment trading
strategy is connected to (cross-sectional) equity factors. Our goal in this analysis is to evaluate
whether the IV-sentiment loads heavily on equity factors identified in the literature. Since the
IV-sentiment aims to time entry and exit-points into the equity markets, it could potentially
also be used by equity managers to time their beta exposure. Nevertheless, if this timing-
strategy largely resembles equity factors, it should be less useful to equity portfolio managers.
We perform this analysis using Eqs. (4.4.7a) to (4.4.7d), as well as univariate models for
each individual factor employed in these equations:
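$$IVSent_d = \alpha_d + (Mkt-RF)_d + SMB_d + HML_d + \varepsilon_d, \tag{4.4.7a}$$
$$IVSent_d = \alpha_d + (Mkt-RF)_d + SMB_d + HML_d + WML_d + \varepsilon_d, \tag{4.4.7b}$$
$$IVSent_d = \alpha_d + (Mkt-RF)_d + SMB_d + HML_d + WML_d + RMW_d + CMA_d + \varepsilon_d, \tag{4.4.7c}$$
$$IVSent_m = \alpha_m + (Mkt-RF)_m + SMB_m + HML_m + WML_m + RMW_m + CMA_m + BAB_m + \varepsilon_m, \tag{4.4.7d}$$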
where the subscript d = 1, 2, ..., D stands for daily returns, whereas the subscript m = 1, 2, ..., M
stands for monthly returns, both extending from January 2, 1999 to December 8, 2015. The first
set of explanatory variables, used in Eq. (4.4.7a), are the market (Mkt-RF), the size (SMB) and
the value (HML) factors, as proposed by Fama and French (1992). Additionally, the profitability
(RMW) and investment (CMA)²⁷ factors of Fama and French (2015), the momentum factor
(WML) of Carhart (1997) and the low- versus high-beta factor (BAB), known as the “Betting Against
Beta” factor of Frazzini and Pedersen (2014), are used in Eqs. (4.4.7b) to (4.4.7d)²⁸. The
correlation structure of these factors, estimated using our monthly data, is reported in
Figure 4.6. In brief, it suggests that some cross-sectional equity factors can be highly
positively or negatively correlated with each other but, more importantly, the IV-sentiment
strategy seems only weakly correlated with all series.

27 The Fama and French factors SMB, HML, RMW and CMA stand, respectively, for small minus big (size), high minus low (valuation), robust minus weak (profitability) and conservative minus aggressive (investments).
28 The regressions that include the BAB factor have monthly frequency, as this factor is not available at a daily frequency.

Figure 4.6: Correlation matrix between IV-sentiment factor and cross-sectional equity factors. The upper triangular part of the matrix above reports the correlation coefficient between pairs of cross-sectional equity factors and the IV-sentiment factor. These equity factors are the market (Mkt-RF), the size (SMB) and the value (HML) factors, the profitability (RMW), the investment (CMA), the momentum factor (WML) and the “Betting Against Beta” factor (BAB). The font size of each coefficient reflects its magnitude, whereas asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively. In the diagonal, the histograms of factor returns are depicted. The lower triangular part of the matrix depicts scatter plots of the returns of the multiple pairs of factors.

Table 4.7 reports the results of Eqs. (4.4.7a) to (4.4.7d). First, we observe that the
IV-sentiment strategy has very little beta exposure, as the coefficients for the (Mkt-RF) factor are close to
zero in its univariate model as well as across all multivariate models. This result matches
our expectations, as IV-sentiment has, in fact, a time-varying long or short exposure to the
equity market. The IV-sentiment strategy also seems to have a large-cap tilt, as the coefficient
of SMB is often statistically significant and small or negative, ranging from -0.107 to 0.147.
Again, this is an expected result, as the IV-sentiment strategy is implemented in the US
large-cap universe, i.e., the S&P 500 Index. Coefficients for HML are also either low or negative,
suggesting a growth tilt. The HML coefficient is positive in the simpler models, i.e., the univariate regression
and the Fama and French (1992) model, but negative in the more comprehensive models.
This finding suggests the presence of multicollinearity in the model, which is likely affecting
the estimated coefficient for HML. This effect is likely caused by the addition of the RMW
factor, as these factors have a correlation of 0.5 in our sample (see Figure 4.6), whereas
the literature reports correlations of up to 0.8.
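Table 4.7: Regression results: IV-sentiment and equity factors

Panel A - Multivariate
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Intercept | 0.000 | 0.000 | 0.000 | 0.007* |
| | (0.000) | (0.948) | (0.000) | (0.004) |
| Mkt-RF | 0.070*** | 0.042*** | 0.060*** | 0.072 |
| | (0.010) | (0.011) | (0.012) | (0.104) |
| SMB | 0.134*** | 0.153*** | 0.136*** | -0.107 |
| | (0.021) | (0.021) | (0.022) | (0.152) |
| HML | 0.080*** | 0.018 | -0.064*** | -0.271 |
| | (0.019) | (0.021) | (0.024) | (0.180) |
| WML | | -0.121*** | -0.141*** | -0.179* |
| | | (0.015) | (0.015) | (0.094) |
| RMW | | | -0.042 | -0.130 |
| | | | (0.029) | (0.220) |
| CMA | | | 0.244*** | 0.624** |
| | | | (0.036) | (0.245) |
| BAB | | | | -0.186 |
| | | | | (0.126) |
| R2 | 2% | 4% | 5% | 13% |
| F-stats | 36.7 | 45.0 | 38.0 | 2.5 |
| AIC | -29771 | -29837 | -29879 | -430 |
| BIC | -29739 | -29798 | -29827 | -405 |

Panel B - Univariate (one regression per factor)
| | Mkt-RF | SMB | HML | WML | RMW | CMA | BAB |
|---|---|---|---|---|---|---|---|
| Intercept | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.007 |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.004) |
| Factor coefficient | 0.073*** | 0.147*** | 0.086*** | -0.134*** | -0.137*** | 0.107*** | -0.215** |
| | (0.010) | (0.021) | (0.020) | (0.013) | (0.024) | (0.029) | (0.098) |
| R2 | 1% | 1% | 0% | 2% | 1% | 0% | 4% |
| F-stats | 49.2 | 47.6 | 19.4 | 104.5 | 31.5 | 13.7 | 4.8 |
| AIC | -29715 | -29713 | -29685 | -29769 | -29698 | -29680 | -429 |
| BIC | -29696 | -29694 | -29666 | -29750 | -29678 | -29661 | -421 |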
This table reports regression results for Eqs. (4.4.7a) to (4.4.7d). The dependent variable is the stream of returns produced by the contrarian strategy based on our IV-sentiment 90-110 indicator, while the explanatory variables are equity (cross-sectional) factors, namely: the market (Mkt-RF), size (SMB), value (HML), profitability (RMW), investment (CMA), momentum (WML) and low- versus high-beta (BAB). Panel A reports the regression results in a multivariate setting, using four distinct models: 1) the Fama-French three-factor model, 2) the Fama-French three-factor model with the addition of the Carhart (1997) momentum factor, 3) the Fama-French five-factor model with the momentum factor and 4) the latter model with the addition of the BAB (Betting Against Beta) factor suggested by Frazzini and Pedersen (2014). Note that, as the BAB factor is only available at a monthly frequency, regressions that contain this factor use monthly data, whereas the data used in the other regressions have a daily frequency. We report standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
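For readers who wish to reproduce the univariate regressions of Panel B on their own data, the sketch below shows the closed-form estimates of the intercept, the slope and R2. The function name and the return series are hypothetical and serve only as an illustration; standard errors, significance tests and information criteria are deliberately left out.

```python
def univariate_ols(y, x):
    """Regress y on a constant and x; return (alpha, beta, r_squared)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - ss_res / ss_tot


# Hypothetical daily returns: strategy returns (y) and one equity factor (x).
strategy_returns = [0.004, -0.002, 0.001, 0.003, -0.005, 0.002]
factor_returns = [0.010, -0.008, 0.002, -0.004, 0.006, -0.001]

alpha, beta, r_squared = univariate_ols(strategy_returns, factor_returns)
```

A low R2 in such a regression, as reported in Table 4.7, means that the factor explains little of the variation in the strategy returns even when the slope itself is statistically significant.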
Turning to the factors in Eqs. (4.4.7b) to (4.4.7d) only, we find that IV-sentiment has
negative exposure to the cross-sectional momentum factor (WML) consistently across all
regressions. At first glance, this result makes sense, as IV-sentiment is a mean-reversion strategy.
Nevertheless, because IV-sentiment reflects mean-reversion in the overall equity market,
hence in the time series rather than cross-sectionally, the expectation of a negative relation
between these variables is ambiguous. Moskowitz et al. (2012), however, report that time-series momentum
and cross-sectional momentum in the equity markets are strongly related²⁹, which
suggests that our original assumption that IV-sentiment is negatively correlated with WML holds.
Among all factors, WML is almost the only one whose statistical significance holds across
all regressions. WML also seems to deliver high explanatory power relative to the other
factors used, with an R2 of around 2 percent. This strong and robust negative link between IV-sentiment
and WML reiterates our earlier suggestion that these two risk factors seem to complement each
other. By doing so, IV-sentiment might be able to mitigate some momentum crashes.

29 Moskowitz et al. (2012) report that the coefficient of time-series momentum on cross-sectional momentum equals 0.57 with a t-stat of 15.52 in a univariate model.
Moreover, the exposure of IV-sentiment to the profitability factor (RMW) is small and
always negative, although its coefficients are statistically significant only in the univariate
regression and not in the two multivariate models in which it appears. IV-sentiment is positively
exposed to the investment factor (CMA), as its coefficients are significant across all regressions. We
interpret this positive relation with IV-sentiment as reflecting a higher frequency of reversals
in periods when firm investments are low (likely during recessions or in the late economic cycle),
which coincides with conservative firms outperforming aggressive ones. Besides, IV-sentiment
loads negatively on the BAB factor, although this loading is only statistically significant in the univariate
regression. Fama and French (2016) argue that the BAB factor is linked to the profitability
factor (RMW), which may help explain why both regressors are not statistically significant
in the multivariate model, whereas they are strongly significant in the univariate regressions.
In line with this suggestion, the estimated correlation between these two factors in our sample
is 0.59 (see Figure 4.6).
Last but not least, none of our regression models explains much of the variability of IV-sentiment:
R2 is always low, reaching at best 13 percent for Eq. (4.4.7d). This finding indicates that
the IV-sentiment strategy is quite distinct from factors typically used by portfolio managers
for single name equity management. Hence, as the IV-sentiment strategy embeds a timing
approach for equity markets, which can be implemented via a dynamic exposure to market
beta, equity portfolio managers could enhance their strategies by making use of it.
4.4.5 Behavioral versus risk-sharing perspectives
Another perspective on equity market dynamics, provided by IV-based factors that are jointly
extracted from single stock and index options, is the implied correlation (ρ). It is approximated
by Eq. (4.4.8), which is derived in Appendix 3.A.2:

$$\rho \approx \frac{\sigma_I^2}{\left(\sum_{i=1}^{n} w_i \sigma_i\right)^2}, \tag{4.4.8}$$

where $\sigma_I^2$ is the variance of index options, $\sigma_i$ is the volatility of stocks i = 1, ..., n in the index, and
$w_i$ is each stock's weight in the index. The implied correlation measures the level of the average
correlation between stocks that are constituents of an index. The IV of index options, i.e., $\sigma_I^2$,
can be matched by that of single stock options, weighted by the constituents’ loadings in
the index, i.e., $\left(\sum_{i=1}^{n} w_i \sigma_i\right)^2$. Thus, if IV can be used as a measure of absolute expensiveness
of an option, the implied correlation provides a relative valuation measure between the index
and single stock options: a high (low) level of implied correlation means that index options are
expensive (cheap) relative to single stock options.
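A short numerical illustration of Eq. (4.4.8) may help. The Python sketch below uses hypothetical weights and implied volatilities for a three-stock index; the function name is ours and the numbers are not taken from our sample.

```python
def implied_correlation(index_vol, weights, stock_vols):
    """Approximate implied correlation: index variance over squared weighted stock vol."""
    weighted_vol = sum(w * s for w, s in zip(weights, stock_vols))
    return index_vol ** 2 / weighted_vol ** 2


# Hypothetical three-stock index (weights sum to one).
weights = [0.5, 0.3, 0.2]
stock_vols = [0.30, 0.40, 0.50]   # single stock implied volatilities

rho_normal = implied_correlation(0.25, weights, stock_vols)
rho_stressed = implied_correlation(0.40, weights, stock_vols)
```

With a weighted single stock volatility of 0.37, an index implied volatility of 0.25 gives an implied correlation of roughly 0.46, whereas an index implied volatility of 0.40 pushes the measure above one. The second case illustrates how bidding up index options relative to single stock options can produce values outside the admissible range of a correlation coefficient.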
Table 4.8 Panel A presents descriptive statistics of the implied correlations between the
index and single stock options’ IV. The means and medians suggest that the implied correlation
monotonically decreases with an increase in the moneyness level. The implied correlation means
range from 0.30 to 0.65, a somewhat wide range given that these are averaged measures. Such a
relatively high dispersion of implied correlations is confirmed by their standard deviations, which
are around 0.14. The distributions of the implied correlation are mostly negatively skewed, as
medians are most of the time higher than their means. The most striking result is given by
the maximum and minimum implied correlations: the maximum implied correlation observed
across all maturities and moneyness levels reported reaches 135 percent. Implied correlations
above 100 percent are observed for many options, mostly for puts at the 80 and 90 percent
moneyness levels. This finding implies that, in order to match the weighted IV of puts on single
stocks that are part of the S&P 500 index to the IV of a put on the index (with the same level
of moneyness), an average correlation above 100 percent between the single stock put options
is required. However, as correlation coefficients are bounded between −100 and +100 percent,
these levels of implied correlation are indicative of irrational behavior by investors, who bid up
index puts to levels that contradict market completeness.
Table 4.8: Implied and realized correlations

Panel A - Implied correlations
| Statistics \ Maturity, moneyness | 3m 80% | 3m 90% | 3m ATM | 3m 110% | 3m 120% | 6m 80% | 6m 90% | 12m 80% | 12m 90% |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.65 | 0.56 | 0.45 | 0.35 | 0.3 | 0.64 | 0.56 | 0.6 | 0.54 |
| Median | 0.67 | 0.57 | 0.45 | 0.35 | 0.3 | 0.65 | 0.56 | 0.61 | 0.54 |
| Minimum | 0.24 | 0.18 | 0.12 | 0.07 | 0.03 | 0.26 | 0.21 | 0.26 | 0.22 |
| Maximum | 1.35 | 1.11 | 0.86 | 0.72 | 0.68 | 1.07 | 0.95 | 1.1 | 1.01 |
| 10th percentile | 0.44 | 0.35 | 0.27 | 0.17 | 0.13 | 0.41 | 0.34 | 0.39 | 0.34 |
| 90th percentile | 0.81 | 0.73 | 0.63 | 0.53 | 0.49 | 0.8 | 0.73 | 0.77 | 0.72 |
| Standard deviation | 0.15 | 0.14 | 0.14 | 0.13 | 0.14 | 0.14 | 0.14 | 0.14 | 0.13 |
| Skew | -0.46 | -0.39 | 0.1 | 0.29 | 0.29 | -0.6 | -0.38 | -0.26 | -0.2 |
| Excess Kurtosis | 0.6 | -0.02 | -0.37 | -0.38 | -0.66 | 0.09 | -0.18 | 0.03 | -0.22 |

Panel B - Realized correlations
| Statistics \ Look-back period | 30 Days | 60 Days | 90 Days | 180 Days | 720 Days |
|---|---|---|---|---|---|
| Mean | 0.3 | 0.25 | 0.25 | 0.26 | 0.36 |
| Median | 0.27 | 0.22 | 0.24 | 0.25 | 0.31 |
| Minimum | 0 | 0.01 | 0 | 0.01 | 0.06 |
| Maximum | 0.84 | 0.69 | 0.67 | 0.61 | 0.74 |
| 10th percentile | 0.1 | 0.05 | 0.04 | 0.07 | 0.08 |
| 90th percentile | 0.54 | 0.47 | 0.48 | 0.52 | 0.71 |
| Standard deviation | 0.17 | 0.16 | 0.16 | 0.16 | 0.2 |
| Skew | 0.66 | 0.38 | 0.6 | 0.37 | 0.42 |
| Excess Kurtosis | -0.02 | -0.68 | -0.21 | -0.86 | -0.88 |
Panel A reports the descriptive statistics for the implied correlations between index options and single stock options for three-month options at the 80, 90, ATM (100), 110, and 120 percent moneyness levels, and for six- and twelve-month options at the 80 and 90 percent moneyness levels over the full sample, which extends from January 2, 1998 to March 19, 2013. The implied correlation (ρ) is approximated by Eq. (4.4.8): $\rho \approx \sigma_I^2 / \left(\sum_{i=1}^{n} w_i \sigma_i\right)^2$, where $\sigma_I^2$ is the implied variance of an index option and $\sum_{i=1}^{n} w_i \sigma_i$ is the weighted average single stock implied volatility, as in Eq. (3.A.8l) of Appendix 3.A.2. Panel B reports the descriptive statistics for the average pair-correlations for the 50 largest constituents of the S&P 500 index calculated over the same sample, which extends from January 2, 1998 to March 19, 2013.
We also find that trading in the opposite direction of such evidently irrational investor behavior
has been very profitable, as implied correlations higher than 100 percent were very effective
as an entry point for contrarian strategies. Across the maturities and moneyness levels where
we can observe such biased behavior, a sentiment strategy that buys the equity market when
the implied correlation is above 100 percent and sells it when the implied correlation falls back
to 50 percent yields an average net information ratio of 0.35, with information ratios ranging
from 0.27 to 0.52.
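The entry and exit rule described above can be written as a simple two-state signal. The Python sketch below is a stylized illustration only: it assumes that the position is flat (rather than short) after the sale, that the signal is acted upon in the same period, and that transaction costs are ignored. The function name, the parameter names and the implied correlation path are hypothetical.

```python
def contrarian_positions(implied_corr, entry=1.0, exit_level=0.5):
    """Return 1 (long the equity market) or 0 (flat) for each observation."""
    positions = []
    is_long = False
    for rho in implied_corr:
        if not is_long and rho > entry:
            is_long = True            # buy when implied correlation exceeds 100%
        elif is_long and rho <= exit_level:
            is_long = False           # sell once it falls back to 50%
        positions.append(1 if is_long else 0)
    return positions


# Hypothetical implied correlation path.
rho_path = [0.60, 1.05, 0.90, 0.70, 0.50, 0.55, 1.20]
positions = contrarian_positions(rho_path)
# positions is [0, 1, 1, 1, 0, 0, 1]
```

Because entry and exit thresholds differ, the position depends on the path of the implied correlation and not only on its current level, which is why the rule is tracked with a state variable.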
The implied correlation means and medians provided in Panel A are far higher than the
same measures from realized average pair-correlations between the 50 largest constituents of the
S&P 500 index as of February 14, 2014, as provided in Panel B. Such average pair-correlations
range from 0.25 to 0.36 when look-back periods of 30, 60, 90, 180, and 720 days are evaluated,
which is substantially lower than most average implied correlations posted for the different
option maturity and moneyness levels reported in Panel A. In fact, the average realized
correlations are often below the 10th percentile of the implied correlation for some options’ maturity
and moneyness levels. The 90th percentile of realized correlations often matches the average
implied correlations reported. The maximum realized correlations are at most 84 percent, using
an extremely short look-back of 30 days, much lower than the 135 percent observed for implied
correlations. These empirical findings strongly suggest that implied correlations substantially
overshoot realized ones. Similarly, the implied correlation sometimes reaches values as low
as three percent for some options, especially on the call side (above ATM moneyness). This
level is also low when compared to put options. The minimum implied correlation observed for
OTM puts is 0.18, whereas for call options it is 0.03. The fact that those extremely low values
of the implied correlation from calls largely undershoot implied correlations from put options
may also suggest less than fully rational pricing on the call side. It indicates that single stock
calls are expensive relative to index calls, which matches our postulation that individual
investors use single stock calls to speculate on the upside.
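For comparison with the implied measure, the realized average pair-correlation over a given look-back window can be computed as sketched below. The sketch assumes an equally weighted average over all stock pairs, which is our simplification for illustration; the function names and the return series are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equally long return series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pair_correlation(returns_by_stock, lookback):
    """Equally weighted mean of pairwise correlations over the last `lookback` points."""
    windows = [series[-lookback:] for series in returns_by_stock]
    pairs = list(combinations(windows, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical daily returns for three stocks.
stock_a = [0.010, -0.020, 0.015, 0.000, -0.005]
stock_b = [0.008, -0.010, 0.020, -0.004, -0.002]
stock_c = [-0.003, 0.006, 0.001, 0.002, -0.007]

avg_corr = average_pair_correlation([stock_a, stock_b, stock_c], lookback=5)
```

By construction each pairwise correlation lies between -1 and +1, so their average cannot exceed 100 percent, in contrast to the implied correlation of Eq. (4.4.8).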
Despite the strong evidence of irrational behavior by investors provided by the extreme
levels of implied correlation, which indirectly links to the IV skew being at extreme levels at
times, we conjecture that such phenomena may also have a risk-bearing explanation. Reversal
strategies such as the ones designed by us earn attractive long-term risk-adjusted returns, but
are highly dependent on equity markets at the tail (see Table 4.5, Panel C). Additionally, IV-
sentiment-based reversal strategies experience the largest daily drawdowns among all strategies
evaluated (see Table 4.5, Panel A). Thus, their attractive risk-adjusted returns are, partially,
compensation for downside risk. Therefore, the risk borne by investors that bet on reversals
in equity markets is the risk of poor timing of losses (Campbell and Cochrane, 1999; Harvey
and Siddique, 2000) and downside risk (Ang et al., 2006). In brief, betting on equity market
reversals is a risky activity.
We note that this rational explanation for excesses in sentiment is also linked to limits-to-arbitrage.
The limits-to-arbitrage literature argues that, as investors have finite access to
capital (Brunnermeier and Pedersen, 2009) and feedback trading can keep markets irrational
for a long period of time (De Long et al., 1990), contrarian strategies aiming to exploit the
effect of irrational trading are not without risk. For example, once bearish sentiment seems
excessive, the risk of betting on a reversal may be tolerable only to a few investors, because 1)
higher volatility drags investors’ risk budget usage closer to its limits, and 2) access to funding
is limited. Thus, trying to “catch a falling knife” in the equity markets is not suitable
for all investors, as it involves high risk. Contrarian strategies are, then, mainly accessible to
investors that have enough capital or funding liquidity. Similar considerations are career risk
(Chan et al., 2002), negative skewness of returns (Harvey and Siddique, 2000), poor timing of
losses (Campbell and Cochrane, 1999; Harvey and Siddique, 2000), and risk aversion of market
makers (Garleanu et al., 2009). One final element in the characterization of reversals as a
compensation for risk is the presence of correlation risk priced in index options (see Driessen
et al., 2009, 2013; Krishnam and Ritchken, 2008; Jackwerth and Vilkov, 2015), which is present
in assets that perform well when market-wide correlations are higher than expected.
4.5 Conclusion
End-users of OTM options tend to overweight small probability events, i.e., tail events. This
bias is strongly time-varying and present in both OTM index puts and single stock calls, due
to institutional and individual investors’ trading activity, respectively. Individual investors
typically buy OTM single stock calls (“lottery tickets”) to speculate on the upside of equi-
ties (indicating bullish sentiment), whereas institutional investors typically buy OTM index
puts (portfolio insurance) to protect their large equity holdings (indicating bearish sentiment).
Hence, overweight of small probabilities derived from equity option prices should capture in-
vestors’ sentiment and, thus, potentially predict equity returns.
The parameters that directly capture overweight of small probabilities from option prices
such as the Delta (δ) and Gamma (γ) CPT parameters or the Delta minus Gamma spread
(as designed by us) are difficult to estimate. Because the Delta minus Gamma spread is found
to be strongly linked to risk-neutral moments and IV skews, we circumvent these estimation
challenges by proposing a simplified but still informative sentiment proxy: IV-sentiment. The
uniqueness of our IV-sentiment measure is that it is jointly calculated from the IV of OTM
index puts and single stock call options. It aims to capture both bullish and bearish sentiment,
respectively, from individual investors and institutional investors’ trading in options.
We find that our IV-sentiment predicts mean-reversion better than the overweighting of
small probabilities parameter Delta minus Gamma spread. Contrarian-trading strategies using
our IV-sentiment measure produce economically significant risk-adjusted returns. The joint
use of information from the single stock and index option markets seems to be the reason for
the superior forecast ability of our IV-sentiment measure, because factors that use implied
volatility skews from a single market achieve significantly inferior results. The performance of
our IV-sentiment measure also seems more consistent in delivering a positive information ratio
than the Baker and Wurgler (2007) sentiment factor. Moreover, it is more positively skewed,
has a shorter horizon than the standard factor and allows for a daily strategy rebalancing.
Our IV-sentiment factor seems to forecast returns as well as other well-known predictors
of equity returns. Since it is uncorrelated to these predictors of the equity risk-premium, it
significantly improves the quality of predictive models, especially when such frameworks are
constrained, as in the terms of Campbell and Thompson (2008). The structure provided by
these constraints in addition to a simple forecast combination approach also seems to outperform
a “kitchen sink” model and a set of machine learning algorithms capable of exploring non-linearities
in the data, applying regularization and tackling multicollinearity issues.
Further, the IV-sentiment strategy has little exposure to a set of widely used cross-sectional
equity factors, which includes Fama and French’s five-factors, the momentum factor (WML)
and the low-volatility factor (BAB). The link between the momentum factor (WML) and
IV-sentiment is found to be consistently negative. At the same time, these factors explain
very little variability of the IV-sentiment strategy. One implication of these findings is that
IV-sentiment could be employed as a Beta-timing tool by active equity managers. Another
implication is that WML and IV-sentiment seem to largely diversify each other and, thus,
prove beneficial for portfolio optimization.
The prediction of reversals seems to be further enhanced when the volatility skews priced by
OTM index puts and single stock calls are clearly irrational, e.g., when implied correlations are
higher than 100 percent. Timing market reversals using our IV-sentiment measure is, however,
not without risk. Reversal strategies, like ours, are exposed to large drawdowns, which likely
happen during “bad times”. Nevertheless, we find that combining our sentiment strategy with
other strategies, such as buy-and-hold the S&P 500 index, time-series momentum and cross-
sectional equity momentum can improve their risk-adjusted returns. Cross-sectional momentum
is the strategy that benefits the most when combined with our contrarian-sentiment strategy,
because the two strategies are negatively correlated with each other and have low
tail dependence. This outcome is largely in line with the finding that WML and IV-sentiment
are strongly negatively correlated and indicates a promising avenue for future research on the
mitigation of momentum crashes by our measure.
Chapter 5
Predictable Biases in Macroeconomic
Forecasts and Their Impact Across
Asset Classes∗
5.1 Introduction
The presence of bias in analysts’ forecasts is a widely investigated topic. Early literature focuses
on the bias present in equity analysts’ forecasting of earnings per share (FEPS), and attempts to
explain why earnings estimates are systematically overoptimistic. De Bondt and Thaler (1990)
suggest that equity analysts suffer from a cognitive failure which leads them to overreact and
have too extreme expectations. At the same time, Mendenhall (1991) argues that underreaction
to past quarterly earnings and stock returns contributes to an overoptimistic bias in earnings.
Overreaction and underreaction as causes for an overoptimistic FEPS are, though, reconciled
by Easterwood and Nutt (1999), who argue that analysts underreact to negative earnings
announcements but overreact to positive ones. Another branch of the literature on analysts’
forecasts proposes that this perceived bias is caused by strategic behavior, i.e., a rational
bias. For instance, Michaely and Womack (1999) argue that equity analysts employed by
brokerage firms (underwriter analysts) often recommend companies that their employer has
recently taken public. In the same vein, Tim (2001) suggests that a rational bias exists within
corporate earnings forecasts because analysts trade off forecast accuracy against improved
management access (via positive forecasts).
Only recently has the analysis of potential biases in macroeconomic forecasts received the same
attention that the literature has given to FEPS. For instance, Laster et al. (1999) argue that
forecasters have a dual goal: forecasting accuracy and publicity. Forecasters would depart from
∗This chapter is based on Felix et al. (2017b). We thank seminar participants at the 2017 Econometrics and Financial Data Science workshop at the Henley Business School in Reading, at the APG Asset Management Quant Roundtable seminar in 2017, at the MAN-AHL Research Seminar in 2017, at the 2018 Annual Conference of the Swiss Society for Financial Market Research (SGF) in Zurich, at the 2018 EEA-ESEM Conference in Cologne and at the 2018 European Finance Association (EFA) in Warsaw for their helpful comments. We thank APG Asset Management and AHL Partners LLP for making available part of the dataset.
the consensus (which is typically accurate) when incentives related to their firms’ publicity
outpace the wages received by being accurate. The authors find this trade-off to vary by
industry. Ottaviani and Sorensen (2006) compare two theories of professional forecasting,
which lead to either forecasts that are excessively dispersed or forecasts that are biased towards
the prior mean (herding)2. A drawback of this early literature on macroeconomic forecasts is
that it fails to empirically test the direction and size of the bias, but mostly elucidates that
dispersion of forecasts is plausible under different (sometimes stringent) assumptions. We also
note that both previous papers focus on rational bias explanations for macroeconomic forecasts
rather than on cognitive issues.
To the best of our knowledge, Campbell and Sharpe (2009) is the first study to address
macroeconomic forecasts from both an empirical and a behavioral bias perspective. Their study
hypothesizes that experts’ consensus forecasts of economic releases are systematically biased
towards the previous release. This bias is consistent with the adjustment heuristic proposed
by Tversky and Kahneman (1974). This cognitive bias, commonly known as anchoring, is
characterized by the human propensity to rely too heavily on the initial value (the “anchor”) of
an estimation when updating forecasts. In other words, individuals tend to make adjustments
to original estimates that do not fully incorporate the newly available information. Thus,
anchoring underweights new information in favor of the “anchor”.
Campbell and Sharpe (2009) hypothesize that surprises over economic releases are predictable,
as forecasters will tend to underreact to new information. They find that the previous
economic releases of 10 important US economic indicators explain up to 25 percent of the
subsequent economic surprises3. Anchoring in forecasting seems not to be, however, restricted
to macroeconomic data releases. Cen et al. (2013) have shown that anchoring also plays a
significant role in FEPS of firms by stock analysts. Their study suggests that analysts tend to
issue optimistic (pessimistic) forecasts when the firms’ FEPS is lower (higher) than the industry
median.
Further, Zhang (2006) investigates the link between anchoring, underreaction and informa-
tion uncertainty. The author builds on the earlier post-earnings-announcement-drift (PEAD)
literature (see, e.g., Stickel, 1991), which states that analysts underreact to new information
when revising their forecasts due to behavioral biases, such as conservatism (Ward, 1982) or
overconfidence (Daniel et al., 1998). He suggests that a greater dispersion (disagreement) in
analysts’ FEPS, which forms his proxy for information uncertainty, contributes to a larger degree
of analyst underreaction. Consequently, in an environment of high dispersion of FEPS,
or for firms with greater information uncertainty, analysts will tend to incur larger positive
(negative) forecast errors and larger subsequent forecast revisions following good (bad) news.
Capistran and Timmermann (2009) argue that the causality between underreaction and
disagreement depicted by Zhang (2006) may also work the other way around. Capistran and
2Ottaviani and Sorensen (2006) build on the reputational herding model of Scharfstein and Stein (1990), who suggest that forecasters (investment managers in their case) mimic the decisions of others and ignore substantive private information, mostly due to concerns about their reputation in the labor market.
3However, it is as yet unclear whether this bias has a behavioral nature or whether it is driven by professional forecasters’ strategic incentives.
Timmermann (2009) argue that, as forecasters have asymmetric and differing loss functions,
they react differently to macroeconomic news. In doing so, forecasters update their predictions
in different ways and at different points in time as a reaction to the same news flow, giving
rise to forecast disagreement. In line with Capistran and Timmermann (2009), Mankiw and
Thomas (1997) suggest that, as there are costs involved in gathering information and making
adjustments to forecasts, experts underreact to recent news and only update their predictions
periodically. Thus, in such a sticky-information model for forecast adjustments, only part of
the pool of forecasters would update their predictions in each period, also contributing to
dispersion of forecasts and information uncertainty. Interestingly, Zarnowitz and Lambros (1987)
and Lahiri and Sheng (2010) use dispersion of forecasts as a measure of forecast uncertainty,
not information uncertainty.
When attempting to find predictive value in disagreement measures among forecasters, Leg-
erstee and Franses (2015) use the standard deviation of forecasts and the 5th and 95th percentile
of survey forecasts to predict macroeconomic fundamentals. The 5th and 95th percentiles of sur-
vey forecasts (especially when used in combination with the mean or median) are, arguably,
proxies for the skewness of forecasts, which is explicitly explored by Colacito et al. (2016) and
Truong et al. (2016). In their study, Colacito et al. (2016) use skewness of expected macroeconomic
fundamentals to predict expected returns, whereas Truong et al. (2016) use the skewness
of FEPS survey data to predict quarterly earnings.
Finally, Legerstee and Franses (2015) use the number of forecasts collected, a proxy for
“attention”, as a predictor of future macroeconomic releases. Arguably, this popularity measure
could be used as a direct predictor of macroeconomic data, as these authors do, but it could
also be employed as a weighting scheme to test whether the pervasiveness of biases fluctuates
with attention.
Hence, because anchoring is to some extent linked to other inherent properties of the pool
of forecasts, as the above literature demonstrates, we hereby investigate other potential biases
that might be embedded in macroeconomic consensus forecasts. The main hypothesis of this
chapter is that, beyond anchoring, these inefficiencies are informative in predicting economic
surprises. As the literature suggests, such biases are expressed by moments of the distribution
of macroeconomic forecasts, such as the disagreement among forecasters (second moment) and
skewness of forecasts (third moment). As market prices react to the information flow, economic
surprise predictability might give rise to return predictability, as reported by Campbell and
Sharpe (2009) and Cen et al. (2013). As a consequence, we conjecture that economic surprises
as well as asset returns around these releases are predictable.
Our contribution to the literature on forecasting bias is four-fold. First, we identify new
biases in experts’ expectations (over and above the anchoring bias), which are statistically
significant predictors of economic surprises. More specifically, we are the first to empirically
validate the rational bias hypothesis of Laster et al. (1999) and Ottaviani and Sorensen (2006)
in a large multi-country data set of macroeconomic releases. Within such models, forecasters
possess private information which is unveiled via the skewness of the distribution of forecasts.
Second, by using a popularity measure per economic indicator and by expanding the number
of countries/regions and indicators tested vis-a-vis Campbell and Sharpe (2009), we advocate
that the prevalence of biases is related to attention. This finding is supported by the fact
that as we move from very popular economic releases, such as the Non-farm payrolls (NFP)
employment number, Retail Sales and Consumer Confidence towards less watched indicators,
biases become less pervasive. The same effect is observed when we compare our results for
the US to those in other countries, in which economic indicators are forecasted by far fewer
experts. Third, we confirm the hypothesis that, by predicting economic surprises, one can
predict asset returns around macroeconomic announcements. We find that expected economic
surprises can largely predict the direction of market responses around data releases in-sample
and, to a lesser degree, out-of-sample. Hence, the expected component of surprises can explain
market responses, whereas previous research (see Campbell and Sharpe, 2009) suggests that
markets only respond to the unpredictable component of surprises. The explanatory power
and predictability achieved by our models are higher for local equity and bond markets than
for foreign markets, currencies and commodities, which is intuitive, as those markets are the
ones more intrinsically linked to the fundamentals being revealed by macroeconomic indicators.
On an out-of-sample basis, point forecasts are produced more accurately by non-linear machine learning
models as they seem to capture the dynamics of market responses around macroeconomic
announcements better than linear regression models. Fourth, we are the first to recognize that
a regret bias (see Loomes and Sugden, 1982; Bell, 1982) might influence how asset markets react
to macroeconomic surprises.
The four key implications of our research are: 1) a better understanding of the “market
consensus” and of the informational content of higher moments of the distribution of macroe-
conomic forecasts by regulators, policy makers and market participants; 2) the challenge of
standard weighting schemes used in economic surprise indexes, which, we reckon, can be im-
proved by changing from “popularity” (or “attention”)-weighted to unweighted; 3) the proposi-
tion that advanced statistical learning techniques should be used to refine the forecast of market
responses around macroeconomic releases; and 4) the opening of a new stream in the literature to
investigate regret effects in asset responses around announcements of forecasted figures.
The remainder of this chapter is organized as follows. Section 5.2 provides a generic formu-
lation of research applied to forecast biases. Section 5.3 describes the data and methodology
employed in our study. Section 5.4 presents our empirical analysis and Section 5.5 concludes.
5.2 Forecast biases, anchoring and rationality tests
Let us first introduce the generic formulation of research applied to forecast biases, as used
by Aggarwal et al. (1995), Schirm (2003) and Campbell and Sharpe (2009). In brief, this
formulation consists of a rationality test in which macroeconomic forecasts are tested for the
properties of rational expectations. Such assessment is done, in its basic format, by running
regressions with the actual release, At, as the explained variable, and the most recent forecast,
Ft, as the explanatory variable, as follows:
\[ A_t = \beta_1 F_t + \varepsilon_t. \tag{5.2.1} \]
Rationality holds when β1 is not significantly different from unity, while a β1 significantly higher
(lower) than one suggests a structural downward (upward) bias of forecasts. Observing serial
correlation in the error term would also suggest irrationality, as one would be able to forecast
At using an autoregressive model.
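The rationality test of Eq. (5.2.1) can be illustrated with a minimal sketch on simulated data. The function name, the simulated series and the use of classical (rather than Newey-West) standard errors are assumptions made for brevity; the data are constructed so that forecasts are structurally too low (β1 = 1.2).

```python
import numpy as np

def rationality_test(actual, forecast):
    """Regress A_t on F_t without intercept; return beta1 and the t-stat for H0: beta1 = 1."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    beta1 = (f @ a) / (f @ f)                  # OLS slope through the origin
    resid = a - beta1 * f
    sigma2 = (resid @ resid) / (len(a) - 1)    # residual variance, one parameter estimated
    se = np.sqrt(sigma2 / (f @ f))             # classical standard error (illustrative)
    return beta1, (beta1 - 1.0) / se

# Simulated example: actual releases are systematically 20 percent above forecasts.
rng = np.random.default_rng(0)
forecast = rng.normal(2.0, 1.0, size=500)
actual = 1.2 * forecast + rng.normal(0.0, 0.5, size=500)

beta1, t_stat = rationality_test(actual, forecast)
```

A large positive t-statistic rejects β1 = 1 in favor of β1 > 1, which corresponds to the structural downward bias of forecasts described in the text.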
An alternative and more intuitive formulation of this rationality test, as suggested by Campbell
and Sharpe (2009), can be achieved by subtracting the forecast from both sides of Eq.
(5.2.1):
\[ S_t \equiv A_t - F_t = \beta_2 F_t + \varepsilon_t, \tag{5.2.2} \]
where β2 = β1 − 1. This manipulation yields the forecast error or the “surprise”, St, as the new explained
variable, which is still dependent on forecast values. In Eq. (5.2.2), rationality holds when β2
is not significantly different from zero; otherwise, a structural bias is perceived. For the specific
case of anchoring, we can dissect the forecast bias using the following model:
\[ F_t = \lambda E[A_t] + (1 - \lambda) A, \tag{5.2.3} \]
where E[At] is the forecaster’s unbiased prediction, and A is the anchor, which equals the
value of the previous release of the indicator of interest. In such a model, if λ < 1 so that
1 − λ > 0, then the consensus forecast is anchored to the previous release of the indicator.
If λ = 1, no anchor is observed. By applying expectations to Eq. (5.2.2) and substituting
E[At] = E[St] + Ft into Eq. (5.2.3), we obtain Eq. (5.2.4a) and, after some manipulations, Eq. (5.2.4d):
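\begin{align*}
F_t &= \lambda (E[S_t] + F_t) + (1 - \lambda) A, \tag{5.2.4a} \\
\lambda E[S_t] &= F_t - \lambda F_t - A + \lambda A, \tag{5.2.4b} \\
E[S_t] &= \frac{F_t - \lambda F_t - A + \lambda A}{\lambda}, \tag{5.2.4c} \\
E[S_t] &= \frac{(1 - \lambda)(F_t - A)}{\lambda}, \tag{5.2.4d}
\end{align*}
assuming γ = (1 − λ)/λ and adding an intercept (α) we find4:
\[ S_t = \alpha + \gamma (F_t - A) + \varepsilon_t, \tag{5.2.5a} \]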
4The above derivation builds fully on the work of Campbell and Sharpe (2009). The only difference between our approach and theirs lies in the fact that they consider the anchor to be the average value of the forecasted series over a number (h�3) of previous releases, whereas our anchor variable relies only on the previous release (h=1). Robustness tests for h>1 will be provided in future versions of this study.
\[ S_t = \alpha + \gamma \, ESA_t + \varepsilon_t, \tag{5.2.5b} \]
which reveals a direct test of anchoring, identified when the γ coefficient is positive, where
ESAt is the expected surprise given the presence of an anchor.
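The anchoring test can be sketched on simulated data as follows. The simulation, variable names and parameter values (λ = 0.8) are illustrative assumptions, not the chapter's data; the sketch only checks that regressing the surprise on the gap between forecast and anchor recovers γ = (1 − λ)/λ, and hence λ = 1/(1 + γ).

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 2000, 0.8                                    # lam < 1 means forecasts are anchored

prev_release = rng.normal(0.0, 1.0, n)                # anchor A (previous release)
unbiased = prev_release + rng.normal(0.0, 1.0, n)     # forecaster's unbiased prediction E[A_t]
forecast = lam * unbiased + (1 - lam) * prev_release  # anchored consensus forecast F_t
actual = unbiased + rng.normal(0.0, 0.5, n)           # actual release A_t
surprise = actual - forecast                          # S_t = A_t - F_t

# Regress the surprise on an intercept and the expected surprise given the anchor.
X = np.column_stack([np.ones(n), forecast - prev_release])
alpha_hat, gamma_hat = np.linalg.lstsq(X, surprise, rcond=None)[0]

lam_hat = 1.0 / (1.0 + gamma_hat)                     # invert gamma = (1 - lam) / lam
```

With λ = 0.8 the true γ is 0.25, so a positive estimate of γ signals anchoring, while γ close to zero (λ close to one) would signal its absence.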
5.3 Data and Methodology
In this study, we mostly employ ordinary least squares (OLS) regression analysis adjusted for
Newey-West standard errors with the goal of offering interpretability to our results. More advanced
statistical learning techniques are employed but their usage is restricted to Section 5.4.3.
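As an illustration of the estimation approach, the sketch below computes OLS coefficients with Newey-West (Bartlett kernel) standard errors using NumPy only. The function name, the lag length of four and the simulated data are assumptions; the text does not state the lag length used, and no small-sample correction is applied here.

```python
import numpy as np

def ols_newey_west(y, X, lags):
    """OLS coefficients with Newey-West (HAC, Bartlett kernel) standard errors."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    xu = X * resid[:, None]                    # score contributions x_t * u_t
    S = xu.T @ xu                              # lag-0 term (White estimator)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)      # Bartlett weights keep S positive semi-definite
        gamma = xu[lag:].T @ xu[:-lag]         # lag-l autocovariance of the scores
        S += weight * (gamma + gamma.T)
    cov = xtx_inv @ S @ xtx_inv                # sandwich covariance matrix
    return beta, np.sqrt(np.diag(cov))

# Simulated example with a known slope of 2.
rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=n)
y = 0.5 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, se = ols_newey_west(y, X, lags=4)
```

Setting `lags=0` reduces the estimator to heteroskedasticity-robust (White) standard errors.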
We use macroeconomic release data from the ECO function in Bloomberg in our analysis.
This data comprises time-stamped real-time released figures for 43 distinct US macroeconomic
indicators, as well as information on forecasters’ expectations for each release. See Table
5.1 for an overview of these indicators. This expectations information comprises 1) the
previous economic release, 2) the cross-sectional standard deviation of forecasts, 3) the lagged
median survey expectations, and 4) the skewness in economists’ forecasts, calculated as
the mean minus median survey expectations. We use similar data sets for Continental Europe,
the United Kingdom and Japan for robustness testing. Our daily data set spans the period
from January 1997 to December 2016, thus covering 4,422 business days and 21,048 individual
announcements. The consensus forecast is the forecast median, in line with Bloomberg’s (and
most other studies’) definition.
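The survey-based quantities described above can be computed from a cross-section of individual forecasts as in the following sketch. The function name and the payroll-style numbers are hypothetical, and the use of the sample (ddof=1) standard deviation is an assumption, since the text does not specify the convention.

```python
import numpy as np

def survey_moments(forecasts, previous_release):
    """Consensus (median), dispersion, skewness proxy and anchor gap for one release."""
    x = np.asarray(forecasts, dtype=float)
    consensus = np.median(x)                   # consensus forecast is the forecast median
    return {
        "consensus": consensus,
        "std": np.std(x, ddof=1),              # cross-sectional disagreement (sample std assumed)
        "skew": np.mean(x) - consensus,        # skewness proxy: mean minus median
        "esa": consensus - previous_release,   # forecast minus anchor, F_t - A
    }

# Hypothetical survey of five forecasters with one optimistic outlier.
m = survey_moments([180, 190, 200, 205, 260], previous_release=190)
```

In this example the outlier pulls the mean (207) above the median (200), so the skewness proxy is positive.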
We note that the economic indicators tracked are released at different frequencies and
at different times throughout the month. This asynchronicity among indicators makes it challenging
to process the information flow coming from them and to jointly test for the predictability of surprises.
Therefore, predictability is separately tested for each indicator, and results are subsequently
aggregated.
As we intend to use states of the economy as a control variable in our empirical analysis, we
have also implemented the Principal Component Analysis (PCA)5-based nowcasting method of
Beber et al. (2015) using the same 43 distinct US macroeconomic indicators. Their nowcasting
method allows us to access the real-time growth and inflation conditions present at the time
of any economic release6. Table 5.1 provides details on stationarity adjustments, directional
adjustments, frequency of release, starting publication date for the series, and (common) release
time. Finally, we also use the 12-month change in stock market prices (i.e., the S&P500 index
5PCA is an unsupervised machine learning method that transforms correlated variables into a set of orthogonal (linearly uncorrelated) variables, so-called principal components.
6The Beber et al. (2015) nowcasting method splits indicators among 4 categories (i.e., output, employment, sentiment, and inflation). We follow the same classification but we aggregate the output, employment and sentiment indicators into a single category, i.e., growth. As our set of indicators perfectly matches that of Beber et al. (2015), this attribution exercise is straightforward. The only nuance that distinguishes our nowcasting method from these authors’ is that we use a single parameter to adjust for the non-stationarity of some series. Beber et al. (2015) adjust series using one-month and twelve-month changes, whereas we use six-month changes across all non-stationary indicators.
prices) and the VIX index in order to proxy for wealth effects and risk-appetite, respectively,
as additional control variables in our empirical analysis.
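The PCA step behind the nowcasting controls can be illustrated with a deliberately simplified sketch: the first principal component of a standardized, balanced panel of indicators is taken as the common "growth" state. This is not the real-time procedure of Beber et al. (2015), which handles asynchronous releases; the function name and the simulated three-indicator panel are assumptions.

```python
import numpy as np

def first_principal_component(panel):
    """First principal component of a (time x indicators) panel via SVD."""
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)   # standardize each indicator
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    if loadings.sum() < 0:                                  # fix the arbitrary sign of the component
        loadings = -loadings
    factor = z @ loadings                                   # time series of the common state
    share = s[0] ** 2 / np.sum(s ** 2)                      # variance explained by the component
    return factor, loadings, share

# Simulated panel: three indicators driven by one latent growth factor plus noise.
rng = np.random.default_rng(4)
growth = rng.normal(size=300)
panel = np.outer(growth, [1.0, 0.8, 0.6]) + 0.3 * rng.normal(size=(300, 3))

factor, loadings, share = first_principal_component(panel)
```

In this toy setting the extracted component tracks the latent factor closely, which is the property that motivates using it as a state variable.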
Table 5.1: Overview of US macro releases
# | Indicator name | Type | Start | Frequency | Release time | Direction | Stationary
--- | --- | --- | --- | --- | --- | --- | ---
1 | US Initial Jobless Claims SA | Growth | 31/12/96 | W | 14:30:00 GMT | -1 | No
2 | US Employees on Nonfarm Payroll | Growth | 02/01/97 | M | 14:30:00 GMT | 1 | No
3 | U-3 US Unemployment Rate Total | Growth | 07/01/97 | M | 14:30:00 GMT | -1 | No
4 | US Employees on Nonfarm Payroll Manuf. | Growth | 08/01/97 | M | 14:30:00 GMT | 1 | Yes
5 | US Continuing Jobless Claims SA | Growth | 09/01/97 | W | 14:30:00 GMT | -1 | No
6 | ADP National Employment Report | Growth | 09/01/97 | M | 14:15:00 GMT | 1 | No
7 | US Average Weekly Hours All Employees | Growth | 10/01/97 | M | 14:30:00 GMT | 1 | No
8 | US Personal Income MoM SA | Growth | 10/01/97 | M | 14:30:00 GMT | 1 | Yes
9 | ISM Manufacturing PMI SA | Growth | 14/01/97 | M | 16:00:00 GMT | 1 | Yes
10 | US Manufacturers New Orders Total | Growth | 14/01/97 | M | 16:00:00 GMT | 1 | Yes
11 | Federal Reserve Consumer Credit | Growth | 16/01/97 | M | 21:00:00 GMT | 1 | No
12 | Merchant Wholesalers Inventories | Growth | 17/01/97 | M | 16:00:00 GMT | 1 | Yes
13 | US Industrial Production MOM SA | Growth | 17/01/97 | M | 15:15:00 GMT | 1 | Yes
14 | GDP US Chained 2009 Dollars QoQ | Growth | 28/01/97 | Q | 14:30:00 GMT | 1 | Yes
15 | US Capacity Utilization % of Total | Growth | 03/02/97 | M | 15:15:00 GMT | 1 | Yes
16 | US Personal Consumption Expenditures | Growth | 03/02/97 | M | 14:30:00 GMT | 1 | Yes
17 | US Durable Goods New Orders Ind. | Growth | 25/02/97 | M | 14:30:00 GMT | 1 | Yes
18 | US Auto Sales Domestic Vehicle | Growth | 04/03/97 | M | 23:00:00 GMT | 1 | No
19 | Adjusted Retail & Food Service | Growth | 26/03/97 | M | 14:30:00 GMT | 1 | Yes
20 | Adjusted Retail Sales Less Autos | Growth | 03/07/97 | M | 14:30:00 GMT | 1 | Yes
21 | US Durable Goods New Orders Total | Growth | 16/07/97 | M | 14:30:00 GMT | 1 | Yes
22 | GDP US Personal Consumption Change | Growth | 12/08/97 | Q | 14:30:00 GMT | 1 | Yes
23 | ISM Non-Manufacturing PMI | Growth | 26/11/97 | M | 16:00:00 GMT | 1 | No
24 | US Manufacturing & Trade Inventories | Growth | 12/12/97 | M | 16:00:00 GMT | -1 | Yes
25 | Philadelphia Fed Business Outlook | Growth | 13/08/98 | M | 16:00:00 GMT | 1 | Yes
26 | MNI Chicago Business Barometer | Growth | 08/01/99 | M | 16:00:00 GMT | 1 | Yes
27 | Conference Board US Leading Ind. | Growth | 14/05/99 | M | 16:00:00 GMT | 1 | Yes
28 | Conference Board Consumer Conf. | Growth | 01/07/99 | M | 16:00:00 GMT | 1 | No
29 | US Empire State Manufacturing | Growth | 13/06/01 | M | 14:30:00 GMT | 1 | Yes
30 | Richmond Federal Reserve Manuf. | Growth | 13/06/01 | M | 16:00:00 GMT | 1 | Yes
31 | ISM Milwaukee Purchasers Manuf. | Growth | 28/12/01 | M | 16:00:00 GMT | 1 | Yes
32 | University of Michigan Consumer Sent. | Growth | 25/07/02 | M | 16:00:00 GMT | 1 | No
33 | Dallas Fed Manufacturing Outlook | Growth | 15/11/02 | M | 16:30:00 GMT | 1 | Yes
34 | US PPI Finished Goods Less Food & En. | Inflation | 30/01/03 | M | 14:30:00 GMT | 1 | Yes
35 | US CPI Urban Consumers MoM SA | Inflation | 30/04/04 | M | 14:30:00 GMT | 1 | Yes
36 | US CPI Urban Consumers Less Food & En. | Inflation | 26/05/05 | M | 14:30:00 GMT | 1 | Yes
37 | Bureau of Labor Statistics Employment | Inflation | 30/06/05 | Q | 14:30:00 GMT | 1 | Yes
38 | US Output Per Hour Nonfarm Business | Inflation | 25/10/05 | Q | 14:30:00 GMT | -1 | Yes
39 | US PPI Finished Goods SA MoM% | Inflation | 02/08/06 | M | 14:30:00 GMT | 1 | Yes
40 | US Import Price Index by End User | Inflation | 31/07/07 | M | 14:30:00 GMT | 1 | Yes
41 | US GDP Price Index QoQ SAAR | Inflation | 05/02/08 | Q | 14:30:00 GMT | 1 | Yes
42 | US Personal Con. Exp. Core MOM SA | Inflation | 26/01/09 | M | 14:30:00 GMT | 1 | Yes
43 | US Personal Cons. Exp. Price YOY SA | Inflation | 05/02/10 | M | 14:30:00 GMT | 1 | Yes
This table reports the 43 US macroeconomic indicators used in our main analysis. Indicators are classified as either growth or inflation related. Column Start reports the date that the time series of each macroeconomic indicator begins. Column Frequency reports the frequency at which the indicator is released, where Q stands for quarterly, M stands for monthly and W stands for weekly. Release time reports the typical (most frequent) release time of the indicator in GMT time. Direction states the potential directional adjustment, represented by -1 when the given indicator reports a quantity that is inversely related to growth or inflation. The column Stationary shows if an indicator’s series is stationary; a stationarity adjustment (i.e., towards 6 months differences) is applied within our data manipulation step so the series can be modelled using our methodology.
5.3.1 Economic surprise predictive models
Following Eq. (5.2.5b), we hereby extend the anchor-only predictive model for economic sur-
prises by incorporating moments of the distribution of macroeconomic forecasts and the control
variables stated above. The moments of the distribution of macroeconomic forecasts added are
1) the lagged median forecast (first moment), 2) the disagreement among forecasters (second
moment) and 3) the skewness of forecasts (third moment). Eq. (5.3.1) is our unrestricted economic
surprise model (UnES model):
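\[ S_t = \alpha + ESA_\varphi + SurvLag_\varphi + Std_\varphi + Skew_\varphi + Infl_\varphi + Growth_\varphi + Stocks_\varphi + VIX_\varphi + \varepsilon_t, \tag{5.3.1} \]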
where subscript ϕ (used hereafter) is t-1, ESA is the expected surprise given anchor7, SurvLag is
the lagged consensus forecast (the previous median of economic forecasts), Std is the dispersion
(standard deviation) of economic estimates across forecasters, and Skew is the skewness of
economic estimates across forecasters. SurvLag, Std and Skew are the three variables selected
to test our hypothesis that alternative measures inherent to the pool of economic forecasts
can reflect biases in expectations over economic releases. More specifically, we use SurvLag to
test whether an anchor towards the previous consensus forecast exists. We employ Std to test
for the effect of forecaster disagreement and information uncertainty on the predictability
of economic surprises, in line with Zhang (2006). Skew is used to test for the presence of
strategic behavior and rational bias in macroeconomic forecasting, in line with the forecasters’
dual-goal hypothesis of forecasting accuracy and publicity as discussed in Laster et al. (1999)
and Ottaviani and Sorensen (2006). Infl and Growth are the states of inflation and economic
growth produced by the nowcasting method implemented. Stocks and VIX are the stock
market returns and implied volatility. Infl, Growth, Stocks and VIX are control variables in
our model.
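Because the slope coefficients are omitted from Eq. (5.3.1) for conciseness, it may help to see the same model with them written out. The labels γ1 to γ8 below are hypothetical names introduced only for this illustration, with ϕ replaced by t − 1:

```latex
\begin{align*}
S_t = \alpha &+ \gamma_1 ESA_{t-1} + \gamma_2 SurvLag_{t-1} + \gamma_3 Std_{t-1} + \gamma_4 Skew_{t-1} \\
&+ \gamma_5 Infl_{t-1} + \gamma_6 Growth_{t-1} + \gamma_7 Stocks_{t-1} + \gamma_8 VIX_{t-1} + \varepsilon_t
\end{align*}
```

Written this way, the hypotheses map to individual coefficients: γ1 captures anchoring to the previous release, γ2 anchoring to the previous consensus, γ3 the effect of disagreement, and γ4 the rational-bias effect of skewness, while γ5 to γ8 belong to the control variables. Restricting γ2 = ... = γ8 = 0 leaves an anchor-only specification of the same form as Eq. (5.2.5b).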
5.3.2 Market response predictive models
Once predictive models of economic surprises are estimated via Eqs. (5.2.5b) and (5.3.1), we
use the predictions to explain market responses between one minute before and one minute
after ([t-1, t+1]) the release-time (t) of macroeconomic data. We do this by using the expected
economic surprise produced by the different economic surprise models as an explanatory variable
to forecast returns. We use three types of market response predictive models: 1) the anchor-
only model, in which the expected surprise given anchor (ESA) is the only predictor of economic
surprises, thus Eq. (5.3.1) with only one explanatory variable; 2) the unrestricted model, using
all explanatory variables stated by Eq. (5.3.1); and 3) the unrestricted-extended response
7 The coefficient γ is excluded from this model representation and subsequent ones for conciseness of presentation. We use the subscript ϕ (i.e., t − 1) to clearly state that the model is predictive. In reality, the subscript t would still suggest a prediction, as most macroeconomic indicator surveys close for forecast submission days before the economic release. For the case of Bloomberg, surveys close one business day prior to the data announcement.
model, which entails the unrestricted model extended with a set of exogenous variables. The
generic formulation of the two expected surprise-based models used is given by Eq. (5.3.2),
whereas Eq. (5.3.3) specifies the 1) anchor-only model, as follows:
\[ R_t = \omega + E(S_t) + \varepsilon_t, \tag{5.3.2} \]
\[ R_t = \omega + E(\alpha + \gamma ESA_t) + \varepsilon_t, \tag{5.3.3} \]
where Rt is the market response calculated around the interval [t-1, t+1], thus the one minute
before and one minute after the time the economic data is made available, and E(St), the
expected surprise, is derived from Eq. (5.2.5b).
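As a minimal sketch of the two steps just described, the snippet below computes a market response over [t-1, t+1] and fits a univariate OLS regression of responses on expected surprises. The simple-return definition in basis points, the dictionary of prices keyed by minute offset, and both function names are assumptions made for illustration; an explicit slope coefficient is included for the fit, although it is omitted from the compact notation above.

```python
def market_response_bps(prices_by_offset):
    """Simple return, in basis points, from one minute before (offset -1)
    to one minute after (offset +1) the release time."""
    p_before = prices_by_offset[-1]
    p_after = prices_by_offset[1]
    return (p_after / p_before - 1.0) * 1e4


def ols_univariate(x, y):
    """Closed-form OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Here x would hold the expected surprises produced by a surprise model and y the corresponding market responses.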
The unrestricted response model is specified by Eq. (5.3.4):
\[ R_t = \omega + E(\underbrace{\alpha + ESA_\phi + SurvLag_\phi + Std_\phi + Skew_\phi + Infl_\phi + Growth_\phi + Stocks_\phi + VIX_\phi}_{\text{Unrestricted economic surprise model (UnES)}}) + \varepsilon_t. \tag{5.3.4} \]
Eq. (5.3.5) provides a generic formulation of the unrestricted-extended response model
because we implement it not only as an OLS regression but also in the form of a Ridge
regression and a Random forest model8:
\[ R_t = \omega + E(UnES)_\phi + ESA_\phi + SurvLag_\phi + Std_\phi + Skew_\phi + Infl_\phi + Growth_\phi + Stocks_\phi + VIX_\phi + \sum_{t=s}^{S} R_{t-s} + \varepsilon_t, \tag{5.3.5} \]
where UnES is the outcome of the unrestricted economic surprise model of Eq. (5.3.1) and s=[5,
10, 20, 30, 40, 50, 60] minutes. For the Ridge regression model, we tune the shrinkage
hyperparameter (φ, typically called λ) via cross-validation using three splits of the training data set.
For Random forest, we first run a cross-validation step for feature selection (using variable
importance as guidance) and then tune the model by minimizing out-of-bag (OOB) errors9 to
obtain the parameter m for the number of random features considered at each branch split10.
We allow the Random forest model to grow 500 trees per run.
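The tuning of the shrinkage hyperparameter via three cross-validation splits can be sketched as follows. To stay self-contained, the example uses a single predictor, contiguous folds, and a user-supplied grid of candidate values; these simplifications and all function names are assumptions for illustration and do not reproduce the multivariate implementation used here.

```python
def fit_ridge_1d(x, y, lam):
    """One-predictor ridge fit with an unpenalized intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / (sxx + lam)              # shrinkage enters the denominator
    return mean_y - slope * mean_x, slope


def cv_mse(x, y, lam, n_splits=3):
    """Mean validation MSE over n_splits contiguous folds."""
    n = len(x)
    bounds = [round(i * n / n_splits) for i in range(n_splits + 1)]
    fold_errors = []
    for k in range(n_splits):
        lo, hi = bounds[k], bounds[k + 1]
        x_train, y_train = x[:lo] + x[hi:], y[:lo] + y[hi:]
        a, b = fit_ridge_1d(x_train, y_train, lam)
        sq_err = sum((y[i] - (a + b * x[i])) ** 2 for i in range(lo, hi))
        fold_errors.append(sq_err / (hi - lo))
    return sum(fold_errors) / n_splits


def tune_ridge(x, y, grid):
    """Pick the candidate shrinkage value with the lowest CV error."""
    return min(grid, key=lambda lam: cv_mse(x, y, lam))
```

With noiseless linear data the procedure selects zero shrinkage; with noisy data and many predictors a positive value is typically preferred.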
We calculate the market responses across equity, treasury, currency, and commodity markets.
More specifically, we use the following instruments: S&P500 index future, Euro-Stoxx
index future, FTSE100 index future, 2-year US Treasury Note future, 2-year Bund future,
10-year Gilt future, Oil WTI future, Gold future, Copper future, GBPUSD forwards, JPYUSD
forwards, CHFUSD forwards, AUDUSD forwards, EURUSD forwards, and CADUSD forwards.
The market responses are calculated and used in our analysis for the entire history available
per market instrument11,12.
8 See Appendix 5.A for details on the Ridge regression and Random forest models.
9 The usage of out-of-bag errors is an efficient replacement for cross-validation for tuning methods that rely on bootstrap to reduce the variance of a learning method. As such methods already make use of a bootstrapped subset of the observations to fit the model, whereas another subset of the observations is unused, the latter subset (the so-called out-of-bag (OOB) observations) can be used to calculate the prediction error, thus called the OOB error.
10 Given the relatively small number of observations available in our data set, we apply 100 repeats of our OOB-based tuning approach to obtain m, which is selected as the mode of the optimal m across all repeats.
5.4 Empirical analysis and results
We split our empirical analysis and results section into five parts. Section 5.4.1 reports the
results of predicting models for economic surprises. Section 5.4.2 dissects market responses
as cumulative average returns (CAR) across multiple time-frames. Section 5.4.3 reports our
findings of market response predictive models. Section 5.4.4 evaluates the presence of regret
effects around macroeconomic announcements. Section 5.4.5 checks for the robustness of our
findings.
5.4.1 Predicting economic surprises
In this section we report our findings from Eqs. (5.2.5b) and (5.3.1), i.e., the anchor-only
(restricted) model and the unrestricted model, respectively, which we use to forecast economic
surprises. Table (5.2) reports aggregated results of these models across all 43 distinct US
macroeconomic indicators analyzed. We evaluate the sign consistency (with our expectations)
and the statistical strength of the individual regressors by computing the percentage of times
that the coefficients are positive (as expected) and statistically significant at the ten percent
level across regressions run separately for each economic indicator. The model quality is
evaluated using explanatory power (R2) as well as the Akaike Information Criterion (AIC) per
individual (economic indicator's) regression.
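The aggregation across indicator-specific regressions can be sketched as below. Each regression is assumed to contribute one (coefficient, p-value) pair for the regressor of interest; the function names and the input layout are hypothetical and serve only to make the three shares explicit.

```python
def share_significant(results, alpha=0.10):
    """Share of regressions in which the coefficient is significant at alpha."""
    return sum(1 for _, p in results if p < alpha) / len(results)


def share_positive(results):
    """Share of regressions in which the coefficient is positive."""
    return sum(1 for coef, _ in results if coef > 0) / len(results)


def share_positive_and_significant(results, alpha=0.10):
    """Share of regressions with a positive and significant coefficient."""
    hits = sum(1 for coef, p in results if coef > 0 and p < alpha)
    return hits / len(results)
```

The first two functions correspond to the kind of statistics shown in Panels A and B of Table (5.2), whereas the third corresponds to the combined measure reported at the bottom of Table (5.3).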
Table (5.2) suggests that the anchor-only model estimates confirm the general finding of
the previous literature, in which the expected surprise given the anchor (ESA) is a strong
predictor of economic surprises. We observe that ESA is significantly linked to surprises 65
percent of the time in our sample. This result is confirmed by the unrestricted model, in
which ESA is statistically significant 67 percent of the time. The results for the unrestricted
model reveal that the Skew factor is also often significant (72 percent) across our individual
indicator regressions. This supports our conjecture that forecasters may behave strategically
(a rational bias), which is in line with Laster et al. (1999) and Ottaviani and Sorensen (2006).
SurvLag and Std are somewhat statistically significant, 40 and 35 percent of the time,
respectively. The result for SurvLag challenges our hypothesis that an anchor towards the
previous consensus forecast holds empirically. The weak statistical significance of Std among
our individual regressions also suggests that disagreement among forecasters and information
uncertainty are only weakly linked to economic surprises. The control variables Infl, Growth,
Stocks, and IV (the implied volatility measure, VIX) are significant between 7 and 33 percent
of the time, suggesting a somewhat weak relation between them and economic surprises.
11 Response data is available since 18/9/2002 for the S&P500 index E-mini future, 22/6/1998 for the Euro-Stoxx index future, 1/1/1996 for the FTSE100 index future, 2/1/1996 for the 2-year US Treasury Note future, 10/5/1999 for the EUREX 2-year Bund future, 1/1/1996 for the 10-year Gilt future, 2/1/1996 for the NYMEX Oil WTI future, 2/1/1996 for the NYMEX Gold, 2/1/1996 for the NYMEX Copper, 1/1/1996 for GBPUSD forwards, 1/1/2000 for JPYUSD forwards, 1/1/2000 for CHFUSD forwards, 1/1/1996 for AUDUSD forwards, 16/7/1997 for EURUSD forwards, and 1/1/2000 for CADUSD forwards. This response data is provided by AHL Partners LLP.
12 Return series for futures use first and second contracts for all markets. In general, return series use first contracts, which are rolled into second contracts between 5 and 10 days prior to the last trading day of first contracts, following standard market practice. Return series for currencies are calculated using synthetic one-month forwards.
Table 5.2: Aggregated results of anchor-only (restricted) and unrestricted economic surprise models for the US
Model | Anchor-only | Unrestricted
Panel A - Percentage of statistical significance per factor
Intercept | 0.35 | 0.42
ESA | 0.65 | 0.67
Std | | 0.35
SurvLag | | 0.40
Skew | | 0.72
Infl | | 0.16
Growth | | 0.33
Stocks | | 0.07
IV | | 0.23
Panel B - Percentage of positive coefficients
Intercept | 0.47 | 0.56
ESA | 0.81 | 0.88
Std | | 0.56
SurvLag | | 0.56
Skew | | 0.93
Infl | | 0.49
Growth | | 0.30
Stocks | | 0.51
IV | | 0.23
Panel C - Model quality
Mean R2 | 4% | 17%
Median R2 | 2% | 14%
Stdev R2 | 4% | 10%
AIC | 923 | 896
Panel A reports the percentage of statistically significant coefficients across anchor-only and unrestricted regression models for economic surprises of US macroeconomic indicators. For example, 0.65 found for the ESA variable within the anchor-only model means that 65 percent of the ESA coefficients across the individual regressions run for the 43 US macroeconomic indicators are statistically significant at the 10 percent level. Panel B reports the percentage of positive coefficients across anchor-only and unrestricted regression models for economic surprises of US macroeconomic indicators. Panel C reports the mean, median and standard deviation of the explanatory power (R2) achieved across all indicator-specific regressions, as well as the average Akaike Information Criterion (AIC).
From an explanatory power perspective, the unrestricted model dominates the anchor-only
model. The mean R2 across the predictive surprise models of the different economic indicators
is 4 percent for the anchor-only model and 17 percent for the unrestricted model (R2 medians
are 2 and 14 percent, respectively).
For the anchor-only regressions, we report positive coefficients for the ESA factor 81
percent of the time. The unrestricted model delivers a positively signed ESA coefficient
88 percent of the time. Both results suggest a robust relationship between economic surprises
and the anchor factor. The frequency of positive coefficients found for Skew is, however,
even higher than for ESA. The Skew regressors are positive 93 percent of the time across all
regressions. SurvLag and Std are, at 56 percent, also largely positive but to a lesser extent
than ESA and Skew. Our control variables are positive to an even lesser extent (between 23
and 51 percent). The results provided by AIC are in line with R2, as the average AIC for the
anchor-only model is higher (923) than for the unrestricted model (896). These findings are,
thus, supportive of our hypothesis that a rational bias may be embedded in macroeconomic
forecasting due to strategic behavior of forecasters, which is in line with Laster et al. (1999)
and Ottaviani and Sorensen (2006).
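To make the AIC comparison concrete, the sketch below uses the common Gaussian form of the criterion for least-squares models, n ln(RSS/n) + 2k, with additive constants dropped. The exact AIC variant behind the reported figures is not stated here, so this formula and the function name are assumptions for illustration; only the ordering of models (lower is better) carries over.

```python
import math


def gaussian_aic(residuals, n_params):
    """AIC for a least-squares fit, up to an additive constant:
    n * ln(RSS / n) + 2 * k, where k is the number of estimated parameters."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    return n * math.log(rss / n) + 2 * n_params
```

An additional regressor lowers the AIC only if the reduction in the residual sum of squares outweighs the penalty of 2 per parameter.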
Table (5.3) presents the results of the individual predictive surprise models (restricted and
unrestricted). The R2 gain ratio (reported in the second-to-last column) computes how many
times the R2 of the unrestricted model is higher than the R2 of the restricted model. From an R2
perspective, the unrestricted models largely outperform the anchor-only model. The R2 gain
ratio ranges from 1 to ∞ (the ratio is infinite when the anchor-only R2 rounds to zero), as the
average R2 across the anchor-only model is 3.7 percent, whereas for the unrestricted model it is
17 percent.
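The R2 gain ratio can be written as a short function. The handling of a zero anchor-only R2 as an infinite gain mirrors the ∞ entries in Table (5.3); the function name is hypothetical.

```python
import math


def r2_gain(r2_unrestricted, r2_anchor_only):
    """Ratio of unrestricted to anchor-only R2; infinite if the latter is zero."""
    if r2_anchor_only == 0:
        return math.inf
    return r2_unrestricted / r2_anchor_only
```

For example, the rounded R2 values of 3 and 45 percent reported for US Personal Income MoM SA give a gain of 15, in line with the table.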
The Conference Board Consumer Confidence indicator is the variable for which R2 is the
highest in the anchor-only model (14 percent), followed by the US PPI Finished Goods SA
MoM% indicator (13 percent). Most R2 are of a single digit level, and for only four indicators
do the regressions yield explanatory power above 10 percent. Most anchor coefficients are
statistically significant at least at the 10 percent level.
When the unrestricted model is used, US Personal Income MoM SA (45 percent) is the
indicator with the highest R2, followed by US GDP Price Index QoQ SAAR (41 percent), and
Adjusted Retail Food Service Sales (36 percent). Most R2 reach a double-digit level, in contrast
with the anchor-only model. Most anchor coefficients are also statistically significant, in line
with the anchor-only model. In line with earlier results, the Skew coefficients are mostly positive
and statistically significant, whereas the coefficient sign is more unstable for the SurvLag and
Std coefficients. The control variables within the unrestricted model are mostly statistically
not significant, especially when inflation surprises are being forecasted.
More importantly, by analyzing individual models' results, we are able to explore an additional
aspect of macroeconomic indicators: popularity. We measure popularity by averaging
the number of analysts that provide forecasts for a given indicator in our sample. In Table
(5.3), popularity is reported in the last column as a Popularity weight measure, which uses
the sum of our popularity measure across all indicators as denominator. We also aggregate
statistics in Table (5.3) using the nine most popular US economic indicators as employed by
Campbell and Sharpe (2009)13.
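A minimal sketch of the popularity weighting follows. The inputs are hypothetical: a mapping from indicator name to the average number of analysts submitting forecasts, and a mapping from indicator name to the statistic being aggregated (for instance R2).

```python
def popularity_weights(avg_analysts):
    """Each indicator's average analyst count divided by the sum over all indicators."""
    total = sum(avg_analysts.values())
    return {name: count / total for name, count in avg_analysts.items()}


def weighted_average(values, weights):
    """Popularity-weighted average of an indicator-level statistic."""
    return sum(values[name] * weights[name] for name in weights)
```

By construction the weights sum to one, so an equally popular set of 43 indicators would each receive a weight of about 2.3 percent.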
13 The indicators used by Campbell and Sharpe (2009) are the NFP Employment Indicator, Michigan Consumer Confidence, Consumer Price Index (CPI) headline and Core, Industrial Production, ISM Manufacturing Index and Retail Sales Headline and ex-Autos. New Homes Sales is also used by these authors, but as housing data is out of scope of our set of macroeconomic indicators, this item is not part of our set of nine most popular US indicators.
Table 5.3: Results of anchor-only (restricted) and unrestricted models for economic surprises per economic indicator
Indicator | Anchor-only model: R2 | AIC | Intercept | Anchor | Unrestricted model: R2 | AIC | Intercept | Anchor | Std | SurvLag | Skew | Inflation | Growth | Stocks | VIX | xR2 gain | Popularity weight
US Initial Jobless Claims SA | 0% | 22901 | 128.8 | -0.1** | 7% | 21925 | -3586.0 | -0.1 | -0.2 | 0.0 | 2.1*** | 582.8 | 358 | -24468 | 253*** | ∞ | 2.0%
US Employees on Nonfarm Payrol | 0% | 6151 | -12** | -0.1 | 8% | 5884 | 50** | 0.0000 | -0.001** | -0.0001 | 0.003*** | -51 | -172 | -1* | | ∞ | 4.4%
U-3 US Unemployment Rate Total | 0% | -2451 | 0.0*** | 0.1 | 11% | -2362 | 0.0 | 0.0 | 0.1 | 0.0** | 1.2*** | 0.0 | 0.0 | 0.0 | 0.0 | ∞ | 4.3%
US Employees on Nonfarm Payrol | 2% | 5003 | -5009*** | -0.2* | 16% | 4937 | 1235 | 0 | -1*** | 0** | 2*** | -592 | 1971** | 122767 | 99 | 8 | 0.9%
US Continuing Jobless Claims S | 2% | 17944 | 2 | 0.3*** | 8% | 17910 | 30*** | 0*** | 0 | 0*** | 0 | 0 | -2** | -106 | 1*** | 4 | 0.3%
ADP National Employment Report | 1% | 3132 | 3786.5 | 0.1 | 13% | 3129 | 49** | 0 | 0 | 0 | 0** | 4 | -1 | 506 | -1 | 13 | 0.9%
US Average Weekly Hours All Em | 0% | -183 | 0.0 | -0.1 | 27% | -196 | 0.0 | 0.4* | 0.8* | 0.0 | 2.3*** | 0.0 | 0.0 | 0.6 | 0.0 | ∞ | 0.6%
US Personal Income MoM SA | 3% | -2153 | 0.0** | 0.1** | 45% | -2109 | 0.0** | 0.2*** | 0.2 | 0.3*** | 3.3*** | 0.0 | 0.0*** | 0.0 | 0.0 | 15 | 3.4%
ISM Manufacturing PMI SA | 1% | 996 | 0.1 | 0.2* | 5% | 939 | 2.2 | 0.1 | 0.1 | 0.0 | 1.1 | 0.0 | 0.0 | 15.4* | 0.0 | 5 | 3.7%
US Manufacturers New Orders To | 1% | -1777 | 0.0 | 0.0 | 7% | -1638 | 0.0* | 0.1* | -0.4** | 0.1 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 7 | 3.0%
Federal Reserve Consumer Credi | 1% | 11472 | 668.9* | 0.1 | 8% | 10667 | 964 | 0 | 0 | 0 | 0** | -532** | 286** | -26825 | 7 | 8 | 1.9%
Merchant Wholesalers Inventori | 1% | -1908 | 0.0** | 0.1* | 9% | -1796 | 0.0 | 0.2** | -0.2 | 0.2 | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 | 9 | 1.4%
US Industrial Production MOM S | 7% | -2056 | 0.0* | 0.2*** | 24% | -1940 | 0.0 | 0.3*** | -0.4 | 0.4*** | 3.0*** | 0.0 | 0.0 | 0.0 | 0.0 | 3 | 3.7%
GDP US Chained 2009 Dollars Qo | 1% | -1860 | 0.0 | 0.0* | 9% | -1787 | 0.0** | 0.1*** | 0.3 | 0.1** | 1.7** | 0.0 | 0.0 | 0.0 | 0.0 | 9 | 5.5%
US Capacity Utilization % of T | 3% | -2063 | 0.0 | 0.2** | 14% | -1928 | 0.0 | 0.3*** | 0.5* | 0.0 | 1.9*** | 0.0 | 0.0 | 0.0 | 0.0* | 5 | 3.2%
US Personal Consumption Expend | 9% | -2414 | 0.0 | 0.1*** | 15% | -2248 | 0.0** | 0.2*** | 0.2 | 0.2*** | 0.0 | 0.0 | 0.0* | 0.0 | 0.0 | 2 | 3.5%
US Durable Goods New Orders In | 5% | -1080 | 0.0 | 0.1*** | 25% | -1087 | 0.0 | 0.2*** | 0.8*** | 0.4*** | 2.5*** | 0.0 | 0.0*** | 0.1 | 0.0 | 5 | 3.4%
US Auto Sales Domestic Vehicle | 9% | 6253 | 114.6*** | 0.3*** | 23% | 6203 | -156 | 0*** | 0*** | 0 | 0*** | -31 | 8 | -5986* | 1 | 3 | 1.2%
Adjusted Retail & Food Service | 11% | -1426 | 0.0 | 0.2*** | 36% | -1475 | 0.0 | 0.2*** | 1.1*** | 0.1 | 4.2*** | 0.0 | 0.0 | 0.0 | 0.0*** | 3 | 3.1%
Adjusted Retail Sales Less Aut | 11% | -1533 | 0.0 | 0.2*** | 18% | -1535 | 0.0 | 0.3*** | 0.5 | 0.2 | 2.0** | 0.0 | 0.0 | 0.0 | 0.0 | 2 | 2.8%
US Durable Goods New Orders To | 0% | -1055 | 0.0** | 0.0 | 15% | -1072 | 0.0 | 0.1* | -0.1 | 0.2 | 1.4* | 0.0 | 0.0*** | 0.0 | 0.0 | ∞ | 1.4%
GDP US Personal Consumption Ch | 4% | -1404 | 0.0 | 0.1** | 10% | -1400 | 0.0** | 0.1* | -0.2 | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0** | 3 | 0.5%
ISM Non-Manufacturing NMI | 4% | 457 | 0.1 | 0.4** | 25% | 444 | -5.5** | 0.4** | 1.9* | 0.1*** | 1.1 | -0.1 | -0.1** | 34.8** | -0.1* | 6 | 1.6%
US Manufacturing & Trade Inven | 2% | -2247 | 0.0 | 0.1** | 14% | -2152 | 0.0** | 0.2*** | -0.6** | 0.0 | 1.5*** | 0.0 | 0.0 | 0.0 | 0.0 | 7 | 2.5%
Philadelphia Fed Business Outl | 2% | 1738 | -0.7 | 0.3** | 12% | 1624 | 7.3*** | 0.2 | -1.1* | -0.1 | 2.9*** | -0.1 | 0.0 | -18.2 | -0.2** | 6 | 2.5%
MNI Chicago Business Barometer | 3% | 1368 | 0.7** | 0.3** | 6% | 1293 | 2.6 | 0.4** | 0.7 | 0.0 | 2.5* | 0.1 | 0.2 | -4.2 | 0.0 | 2 | 2.5%
Conference Board US Leading In | 6% | -2358 | 0.0* | 0.1*** | 35% | -2311 | 0.0** | 0.2*** | 0.2 | 0.2*** | 2.0*** | 0.0* | 0.0** | 0.0 | 0.0 | 6 | 2.6%
Conference Board Consumer Conf | 14% | 1414 | 0.3 | 0.7*** | 20% | 1325 | 0.3 | 0.6*** | 0.1 | 0.0 | 2.6** | 0.4* | -0.1 | -0.6 | -0.1 | 1 | 3.3%
US Empire State Manufacturing | 1% | 1272 | -0.5 | 0.2 | 11% | 1268 | 3.0 | 0.1 | 1.0 | -0.1 | 4.3*** | 0.2 | -0.1 | -37.8 | -0.3** | 11 | 1.7%
Richmond Federal Reserve Manuf | 1% | 970 | -1.0 | 0.2 | 18% | 959 | -3.2 | 0.2 | 1.5** | -0.1 | 3.3*** | 0.4 | -0.2 | -62.7 | 0.0 | 18 | 0.2%
ISM Milwaukee Purchasers Manuf | 0% | 451 | 0.0 | 0.0 | 12% | 456 | 10.1** | 0.1 | -0.4 | -0.2** | 0.9 | 0.5 | 0.5 | 1.9 | 0.0 | ∞ | 0.1%
University of Michigan Consume | 3% | 2109 | -0.2* | 0.5*** | 9% | 2095 | 1.1 | 0.5*** | -0.3 | 0.0 | 2.1*** | 0.2** | -0.1** | 9.1 | 0.0 | 3 | 2.6%
Dallas Fed Manufacturing Outlo | 8% | 652 | -3.4*** | 0.6** | 14% | 659 | 0.8 | 0.6** | -0.7 | 0.0 | 0.3 | 0.1 | -0.3 | -52.5 | -0.1 | 2 | 0.2%
US PPI Finished Goods Less Foo | 9% | -1885 | 0.0 | 0.3*** | 18% | -1734 | 0.0** | 0.3*** | 0.5 | 0.8*** | 1.1* | 0.0 | 0.0 | 0.0 | 0.0 | 2 | 3.4%
US CPI Urban Consumers MoM SA | 1% | -2559 | 0.0 | 0.0 | 21% | -2408 | 0.0** | 0.2*** | 0.2 | 0.2*** | 1.1*** | 0.0 | 0.0 | 0.0 | 0.0 | 21 | 3.8%
US CPI Urban Consumers Less Fo | 2% | -2714 | 0.0 | -0.1** | 9% | -2523 | 0.0** | -0.1 | -0.3 | -0.4** | 0.4 | 0.0 | 0.0* | 0.0 | 0.0 | 5 | 3.7%
Bureau of Labor Statistics Emp | 1% | -778 | 0.0 | 0.0 | 10% | -718 | 0.0 | 0.0 | -0.1 | 0.1 | 0.0 | 0.0 | 0.0* | 0.0 | 0.0 | 10 | 2.8%
US Output Per Hour Nonfarm Bus | 3% | -1110 | 0.0*** | 0.1** | 13% | -1090 | 0.0 | 0.1*** | 0.1 | 0.1*** | 0.4 | 0.0 | 0.0 | 0.1 | 0.0 | 4 | 3.0%
US PPI Finished Goods SA MoM% | 13% | -1554 | 0.0 | 0.2*** | 29% | -1533 | 0.0*** | 0.4*** | 1.0** | 0.4*** | 2.0** | 0.0 | 0.0 | 0.0 | 0.0 | 2 | 3.6%
US Import Price Index by End U | 1% | -1661 | 0.0 | 0.0 | 23% | -1677 | 0.0 | 0.1*** | 0.0 | 0.3*** | 2.5*** | 0.0** | 0.0 | 0.0 | 0.0* | 23 | 2.0%
US GDP Price Index QoQ SAAR | 9% | -1189 | 0.0 | 0.2*** | 41% | -1238 | 0.0 | 0.3*** | -0.4** | 0.0 | 3.5*** | 0.0* | 0.0* | 0.0 | 0.0 | 5 | 1.1%
US Personal Consumption Expend | 1% | -1660 | 0.0** | -0.1 | 27% | -1689 | 0.0 | 0.1 | -0.6** | -0.1 | 1.4*** | 0.0 | 0.0 | 0.0 | 0.0 | 27 | 1.3%
US Personal Consumption Expend | 1% | -1524 | 0.0 | 0.0 | 12% | -1528 | 0.0 | 0.1* | 0.1 | 0.0 | 0.6** | 0.0 | 0.0 | 0.0 | 0.0 | 12 | 0.5%
Average | 3.7% | 923 | - | - | 17% | 896 | - | - | - | - | - | - | - | - | - | - | 2.3%
Popularity-weighted average | 4.0% | 110 | - | - | 17% | 102 | - | - | - | - | - | - | - | - | - | - | -
Average of most popular indicators | 4.9% | -312 | - | - | 18% | -313 | - | - | - | - | - | - | - | - | - | - | -
% of positive & significant coefficients (P&SC) | - | - | 7% | 58% | - | - | 12% | 67% | 19% | 28% | 70% | 5% | 5% | 5% | 5% | - | -
Popularity-weighted % of P&SC | - | - | 6% | 64% | - | - | 8% | 70% | 17% | 39% | 75% | 6% | 3% | 5% | 2% | - | -
P&SC of most popular indicators | - | - | 0% | 67% | - | - | 11% | 67% | 22% | 33% | 78% | 11% | 0% | 11% | 0% | - | -
This table reports results of anchor-only (restricted) and unrestricted regression models for economic surprises. Regression results are reported per economic indicator. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively. The popularity weight provided in the last column of the table uses the sum of our popularity measure across all indicators as base. We measure popularity by averaging the number of analysts that provide forecasts for a given indicator in our sample.
Overall, we find that the model quality is higher for popular indicators. The R2 (AIC)
weighted using our popularity measure for the anchor-only model is 4.0 percent (110), whereas
the (unweighted) average R2 (AIC) is 3.7 percent (923). For the unrestricted model, the
weighted R2 (weighted AIC) is 17 percent (102), whereas the average R2 (average AIC) is 17
percent (896). Hence, popular indicators seem better explained by our explanatory variables. If
we compare the percentage of positive and significant coefficients across all models with the
same measure weighted by popularity and computed for the most popular indicators only (see
the last three rows of Table (5.3)), we observe that ESA and Skew are more likely to hold with
the correct sign among popular indicators. As far as ESA is concerned, this result applies to both
the anchor-only model and the unrestricted model. Hence, we conjecture that the rational and
behavioral biases modelled by ESA and Skew are more present among popular indicators. This
finding makes explicit that the bias analyzed here links to the active behavior of forecasters, not
to their lack of action, as suggested by inattention-type (behavioral) explanations, such as the
anchoring bias, advocated by Mendenhall (1991), Stickel (1991), Campbell and Sharpe (2009)
and Cen et al. (2013).
5.4.2 Market responses around macroeconomic announcements
In the following we evaluate how asset prices of four different asset classes (equities, treasuries,
foreign exchange, and commodities) behave around macroeconomic announcements. Because
we primarily investigate US macroeconomic releases, we target reactions in US local markets
and the EURUSD, the main USD currency cross. Hence, we analyze responses on the following
assets: S&P500 index future, 2-year US Treasury future, and EURUSD forwards. Note that
bond returns are adjusted to have the opposite sign so as to be consistent with the expected
response to surprises for equity returns and currencies. The response time-frames used (in
minutes) are -60, -50, -40, -30, -20, -10, -5, -1, 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60. Negative
time-frames imply a time before the relevant economic release, whereas positive ones mean the
minutes after the economic release.
We assess market responses around macroeconomic announcements by calculating cumulative
average returns (CARs) and classifying responses around announcements as good or bad
news to the asset. This way, we calculate CARs separately for announcements that had a positive
or negative effect on the specific asset price.
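The CAR calculation can be sketched as follows. The example assumes that each announcement contributes a list of returns over consecutive time-frames, that announcements are classified by the sign of the return in the time-frame containing the release, and that the CAR is the running sum of the cross-event average return; these choices and the function names are assumptions for illustration.

```python
def split_by_response_sign(event_returns, announcement_index):
    """Split events into positive and negative responses at the release."""
    positive = [ev for ev in event_returns if ev[announcement_index] > 0]
    negative = [ev for ev in event_returns if ev[announcement_index] < 0]
    return positive, negative


def cumulative_average_returns(event_returns):
    """Average returns across events per time-frame, then cumulate over time."""
    n_events = len(event_returns)
    n_steps = len(event_returns[0])
    averages = [sum(ev[j] for ev in event_returns) / n_events for j in range(n_steps)]
    car, running = [], 0.0
    for avg in averages:
        running += avg
        car.append(running)
    return car
```

Applying cumulative_average_returns to each of the two groups returned by split_by_response_sign gives one CAR path for good news and one for bad news.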
Figure (5.1) illustrates market responses, separately for the S&P500 futures, the 2-year
US Treasury futures, and EURUSD forwards in rows, whereas the first column displays plots
of reactions to good news and the second column offers plots of reactions to bad news. The
CARs around macroeconomic announcements for positive and negative responses across multiple
time intervals are provided in Table (5.4), given in basis points (bps) and as a percentage
of the CAR observed during the interval from one hour before until one minute after the
macroeconomic announcement, [t-60min, t+1min].
A) Positive responses B) Negative responses
Figure 5.1: Cumulative average returns (CAR) around the macroeconomic announcements. The line plots above depict the CAR (across all indicators) around the time of macroeconomic announcements (in blue). The time of macroeconomic announcements within these plots is t = 0 on the x-axis. The -60, -50, -40, -30, -20, -10, -5, -1 time-frames, which precede t = 0, represent the minutes prior to the macro announcement. The 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60 time-frames represent the minutes after the macro announcement. The shadowed area around the CAR line shows its one standard deviation (68.27 percent) confidence interval. The securities evaluated are the S&P500 index future, the 2-year treasury bond future and EURUSD forwards, respectively, reported in rows one, two and three.
For the S&P500 futures (row one, first column of Figure (5.1); first row of Table (5.4)), we observe
that the largest part of the positive response happens around the macroeconomic announcement
(which occurs between time-frames -1 and 1). The CAR from one hour before the announcement
until one minute after the news is roughly +/- 10 bps for positive and negative responses,
respectively, whereas the response around the announcement is also approximately +/- 10 bps.
In fact, the average CAR observed around the announcement makes up more than 100
percent of the overall CAR observed in the [t-60min, t+1min] interval (i.e., 108 percent for
positive and 102 percent for negative responses). We conclude that pre-announcement drifts are,
on average, in the opposite direction to the overall CARs observed. However, we note that the
pre-announcement drifts are very small relative to the response observed within the [t-1min,
t+1min] interval. Our results also suggest that, from one minute after releases are made public
up to 60 minutes afterwards, there are only small post-announcement drift effects within the
S&P500 futures, as the CAR in the [t+1min, t+60min] interval amounts to only 1 and 5 percent
of the CAR observed within the [t-60min, t+1min] interval for the positive and negative
responses, respectively. In brief, the positive and negative market responses for the S&P500 index
future amid macroeconomic announcements have a similar pattern: almost no drift prior to the
announcement, a jump at the announcement, and a roughly flat post-drift effect up to one hour
after the announcement.
This CAR pattern suggests that no exploitable market underreaction to US macroeconomic
news releases seems to be present within the local equity market. At the same time, as there
is no pervasive pre-announcement drift observed, no evidence of leakage or usage of private
information by market participants is found.
The CARs observed in different time-frames for the Euro-Stoxx and FTSE100 index futures
show patterns similar to the ones found for the S&P500 index future. The pre-announcement
drifts have the opposite direction to the response found close to the macroeconomic news release,
both for positive and negative responses. The post-drift we observe is in the same direction as
the response, but is in both markets of higher magnitude than the one found for the S&P500
index future, ranging from 10 to 24 percent of the CARs found in the [t-60min, t+1min] interval.
This result seems to suggest that both Euro-Stoxx and FTSE100 index futures are less efficient
than the S&P500 index future.
For the 2-year US Treasury futures (row two, first column of Figure (5.1); see also Table
(5.4)), we see that the positive response is also very distinct around the announcement. The average
CAR from one hour before the announcement until 30 minutes before the announcement is
flat. There is some evidence of a pre-drift in the direction of the response from 30 minutes
before the announcement until one minute before the announcement of roughly 10 percent of
the CARs observed in the [t-60min, t+1min] interval. The CARs observed for both the positive
and negative responses in the interval [t-60min, t+1min] are between 2.4 and 2.7 bps in absolute
terms. Differently from equity markets, there is some evidence of a post-announcement drift, with
an additional 0.6 and -0.3 bps (23 and 13 percent of the CARs experienced in the [t-60min,
t+1min] interval) move expected after positive and negative responses, respectively. When treasury
markets for the UK and Germany are evaluated (see Table (5.4)), we observe similar patterns
for the post-announcement drift and response in the [t-1min, t+1min] interval, but no consistent
pre-announcement drift. We note that the post-announcement drifts for positive responses are
consistently larger than the ones for negative responses.
We see that for the EURUSD (row three, first column of Figure (5.1); see also Table (5.4)), the
responses are once again very distinct and concentrated closely around the macroeconomic
announcement time. The CAR from one hour before the announcement until one minute before
the announcement is roughly zero. The CAR within the interval from -1 minute to +1 minute
is relatively large, roughly 5 bps, concentrating between 102 and 105 percent of the CAR
observed in the [t-60min, t+1min] interval. Differently from what is observed for the treasury and
equity markets, the post-announcement drift tends to be in the opposite direction of the
response observed around the data releases, dampening between 5 and 21 percent of the CAR
observed in the [t-60min, t+1min] interval. For other currencies, the pre-announcement drifts
are on average small and inconsistent with each other and with the responses observed in the
[t-1min, t+1min] interval. The post-announcement drifts are mostly in the opposite direction
to the response around announcements for the negative responses. However, for positive
responses, the post-announcement drifts are in the same direction as the responses
around the announcements for most currencies and only in the opposite direction for the EURUSD
and CADUSD.
Table 5.4: Cumulative average returns (CAR) around macroeconomic announcements
Panel A - Cumulative average return (CAR) for positive market responses
Absolute (in Bps) As percentage of [t-60min, t+1min]
[t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60] [t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60]
S&P500 0.2 -0.9 9.9 9.2 0.1 2% -10% 108% 100% 1%
Euro-Stoxx -0.4 -0.5 16.3 15.5 1.6 -2% -3% 106% 100% 10%
FTSE100 -0.2 -0.4 10.1 9.5 1.4 -2% -5% 106% 100% 15%
2y Bund -0.0 -0.0 1.2 1.1 0.2 -3% -1% 104% 100% 20%
2y T-Note -0.1 0.2 2.3 2.4 0.6 -3% 7% 96% 100% 23%
10y Gilt -0.5 0.3 6.6 6.3 1.4 -8% 5% 103% 100% 21%
WTI Oil 1.2 0.6 6.7 8.5 -1.9 14% 7% 78% 100% -22%
Gold -0.3 0.8 7.1 7.7 0.3 -3% 11% 92% 100% 4%
Copper -1.3 0.5 8.8 7.9 2.6 -17% 6% 110% 100% 33%
USDGBP -0.5 0.0 4.1 3.6 0.4 -14% 0% 114% 100% 10%
USDEUR -0.2 -0.1 5.4 5.1 -1.1 -3% -1% 105% 100% -21%
USDJPY -0.2 0.4 5.8 6.0 0.7 -3% 7% 96% 100% 12%
USDCHF -0.4 -0.2 5.4 4.8 0.4 -8% -5% 113% 100% 7%
USDAUD -0.3 -0.2 8.4 8.0 1.5 -3% -2% 105% 100% 19%
USDCAD 0.5 0.3 5.4 6.2 -0.6 7% 5% 87% 100% -9%
Panel B - Cumulative average return (CAR) for negative market responses
Absolute (in Bps) As percentage of [t-60min, t+1min]
[t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60] [t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60]
S&P500 0.1 0.1 -9.7 -9.5 -0.5 -1% -1% 102% 100% 5%
Euro-Stoxx 1.0 0.9 -16.2 -14.2 -3.5 -7% -7% 114% 100% 24%
FTSE100 1.2 0.6 -10.3 -8.5 -0.9 -14% -7% 121% 100% 10%
2y Bund 0.0 -0.1 -1.2 -1.2 -0.1 0% 7% 94% 100% 8%
2y T-Note 0.0 -0.3 -2.4 -2.7 -0.3 -1% 10% 91% 100% 13%
10y Gilt 0.4 0.1 -6.8 -6.2 -0.6 -7% -2% 109% 100% 9%
WTI Oil -0.9 0.2 -6.8 -7.6 -0.5 12% -2% 90% 100% 7%
Gold -0.2 0.8 -6.9 -6.3 0.4 3% -13% 110% 100% -6%
Copper 0.2 -1.0 -8.8 -9.6 1.3 -2% 11% 92% 100% -14%
GBPUSD -0.0 -0.4 -4.2 -4.7 0.3 1% 8% 91% 100% -7%
EURUSD 0.3 -0.2 -5.3 -5.1 0.3 -5% 3% 102% 100% -5%
JPYUSD 0.3 0.5 -5.6 -4.7 -0.1 -7% -11% 118% 100% 1%
CHFUSD 0.4 -0.3 -5.2 -5.2 0.4 -7% 6% 101% 100% -7%
AUDUSD -0.0 -0.8 -7.9 -8.8 1.0 0% 10% 90% 100% -12%
CADUSD 0.2 -0.2 -4.9 -4.9 0.4 -4% 4% 100% 100% -9%
Panel A reports the cumulative average returns (CAR) around macroeconomic announcements for positive market responses across several markets and time-frames (in minutes). Absolute CAR are reported on the left sub-panel, whereas the CAR for each time-frame as a percentage of the CAR for the [t-60min, t+1min] interval is reported on the right sub-panel. Panel B reports similar CAR information but for negative market responses. t is the time of announcement of macroeconomic data releases.
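The relation between the left and right sub-panels of Table (5.4) can be checked with a small worked example using the S&P500 row of Panel A. The function name is hypothetical, and because the inputs are the rounded bps figures printed in the table, recomputed percentages for other rows may differ from the reported ones by a percentage point.

```python
def car_shares(car_by_window, reference_window="[t-60, t+1]"):
    """Express each window's CAR as a rounded percentage of the reference CAR."""
    reference = car_by_window[reference_window]
    return {w: round(100 * v / reference) for w, v in car_by_window.items()}


# S&P500, positive responses, in bps (Table 5.4, Panel A)
sp500_positive = {
    "[t-60, t-30]": 0.2,
    "[t-30, t-1]": -0.9,
    "[t-1, t+1]": 9.9,
    "[t-60, t+1]": 9.2,
    "[t+1, t+60]": 0.1,
}
shares = car_shares(sp500_positive)
```

The resulting shares are 2, -10, 108, 100 and 1 percent, matching the right sub-panel for the S&P500.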
We assess the CAR around US macroeconomic announcements for three commodities: WTI
oil, gold, and copper. The most notable difference between our results for these commodities
versus the other asset classes investigated is that the responses observed in the [t-1min,
t+1min] interval for commodities concentrate less of the overall CAR observed in the [t-60min,
t+1min] interval than for the other three asset classes. The responses observed in
the [t-1min, t+1min] interval range from 78 percent to 110 percent of the CAR in the [t-60min,
t+1min] interval. Evidence of any pre- or post-announcement drift is very inconsistent across
commodities and across positive and negative responses. The reason for such inconsistency
might be that commodities are less clearly linked to the business cycle of a particular country
compared to equities, treasuries and currencies. For instance, as countries may be consumers or
suppliers of specific commodities, it is unclear how the macroeconomic announcements in a
specific country should affect the price of commodities14.
Finally, it is worth noting some common features observed in Figure (5.1) for
S&P500 futures, 2-year US Treasury futures and EURUSD. Firstly, when market responses are
one standard deviation higher than the average reaction, markets mean-revert strongly in the
two minutes following the surprise and continue to do so for the following three minutes,
though less aggressively. Secondly, when market responses are one standard deviation lower
than the average reaction, a post-drift in the two minutes after the surprise is observed.
Further, volatility tends to increase prior to announcements for the S&P500 and 2-year
US Treasury futures markets but not for the EURUSD market. Such an increase in volatility starts
more than 30 minutes before announcements in the S&P500 futures markets, whereas for
2-year US Treasury futures, it happens only in the last 20 minutes before announcements.
5.4.3 Predicting market responses
In this section we analyze the estimates from Eqs. (5.2.5a), (5.3.4) and (5.3.5), i.e., the
anchor-only (restricted) model, the unrestricted model (used to forecast economic surprises), and
the unrestricted-extended model. Table (5.5) reports R2, AIC, the frequency of the expected
surprise coefficients that are positive for the three OLS-based models employed, and hit-ratios
as well as root mean squared error (RMSE) for all models. Hit-ratios and RMSEs are reported
for our train and out-of-sample (test) data sets15,16, whereas other statistics are calculated
in-sample, i.e., using the full data set.
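The evaluation set-up described above can be sketched in a few lines. This is a minimal illustration and not the code used for Table (5.5): the function names, the toy data and the convention that a zero forecast or zero response counts as a miss are assumptions made here.

```python
import math

def chronological_split(observations, train_fraction=0.75):
    """Split time-ordered observations: earliest part for training, latest part for testing."""
    cut = int(len(observations) * train_fraction)
    return observations[:cut], observations[cut:]

def hit_ratio(predicted, realized):
    """Share of observations for which forecast and realized response share the same sign.
    Assumption: a zero forecast or a zero response counts as a miss."""
    hits = sum(1 for p, r in zip(predicted, realized) if p * r > 0)
    return hits / len(realized)

def rmse(predicted, realized):
    """Root mean squared error of the point forecasts."""
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, realized)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical (forecast, realized response) pairs, ordered in time.
pairs = [(0.2, 0.1), (-0.1, 0.3), (0.4, 0.5), (-0.3, -0.2),
         (0.1, -0.1), (0.2, 0.2), (-0.2, -0.4), (0.3, -0.1)]
train, test = chronological_split(pairs)
test_pred = [p for p, _ in test]
test_real = [r for _, r in test]
print(len(train), len(test))
print(hit_ratio(test_pred, test_real), rmse(test_pred, test_real))
```

Splitting by time rather than at random keeps the test observations strictly later than the training ones, which mirrors the 75/25 split described in the text.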
Table (5.5) reports that R2 monotonically increases across the three OLS regression models
as we move from the restricted model to more comprehensive models17. The magnitude of gains
in R2 across the three types of models suggests that the unrestricted-extended models have much
higher explanatory power. On average, anchor-only models deliver R2 of 1.5 percent, whereas
unrestricted response models have R2 of 2 percent on average. In contrast, unrestricted-extended
models, which no longer are univariate models, post average R2 of 29 percent.
14 For stocks, it is also unclear how positive news impacts prices. Late in a tightening cycle (high inflation), good news is bad for equities, whereas at an early stage in a tightening cycle (low inflation), positive macroeconomic news is definitely good for equities.
15 The in-sample period extends through our full data set (i.e., from 1997 to 2016), whereas our out-of-sample period (our test data set) comprises the latest 25 percent of observations of the full data set. The training data used for tuning (typically via cross-validation) of machine learning methods and estimation of models employed for out-of-sample forecasting uses the earliest 75 percent of observations of the full data set.
16 Not all statistics are provided for the Ridge and Random forest models as they are not available or are not straightforward to estimate or aggregate.
17 Note that both restricted and unrestricted models are univariate models.
The AIC statistic estimated across the different models somewhat challenges the results
provided by R2: complex unrestricted models are deemed less informative once penalties for
complexity are applied. The AICs for the restricted model are lower than for the unrestricted
model 74 percent of the time (indicating dominance of the anchor-only model), whereas the AICs
for the restricted models dominate the AICs from unrestricted-extended models at all times. The
same AIC dominance holds for the restricted models over the unrestricted models. Average
AICs across these three types of models confirm these findings. AICs of Ridge models are,
however, superior to those of their OLS counterparts, the unrestricted-extended models,
indicating that model quality is improved by shrinkage. These first results indicate that the
in-sample fit of the most complex models is superior to that of simpler models from an explanatory
power perspective, but not from a parsimony perspective. Despite that, differences in AICs are
not large, indicating that the superiority of small models on this criterion is not absolute.
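To make the trade-off between R2 and AIC concrete, the sketch below fits a univariate OLS model and computes both statistics. It assumes the Gaussian least-squares form AIC = n ln(RSS/n) + 2k (defined up to an additive constant); the text does not state which AIC variant is used, and the data and function names are hypothetical.

```python
import math

def ols_univariate(x, y):
    """Closed-form OLS intercept and slope for a single regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

def r2_and_aic(y, fitted, n_params):
    """R2 and AIC (assumed form: n*ln(RSS/n) + 2k) for given fitted values."""
    n = len(y)
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - rss / tss, n * math.log(rss / n) + 2 * n_params

# Hypothetical data: one regressor and a noisy linear response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
alpha, beta = ols_univariate(x, y)
fitted = [alpha + beta * xi for xi in x]
r2, aic = r2_and_aic(y, fitted, n_params=2)  # intercept and slope
print(round(r2, 3), round(aic, 2))
```

Under this form, each extra parameter adds 2 to the AIC, so a larger model is preferred only if it lowers n ln(RSS/n) by more than the penalty. R2, by contrast, cannot fall when regressors are added.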
When evaluating the coefficient signs of the expected surprise factor in market response
predictive models18, at first glance, we find that coefficients are mostly positive. Among anchor-
only models, on average 57 percent of the coefficients for expected surprise are positive, whereas
for unrestricted models this is 64 percent. Within larger models, such as unrestricted-extended
ones, the percentage of positive coefficients for the expected surprises falls to 54 percent.
Further, we evaluate results from our OLS models by splitting them between local
(US) markets and foreign markets. We find that, from an R2 perspective, the unrestricted
and unrestricted-extended models of local markets seem to outperform those of foreign markets.
The percentage of positive coefficients for the expected surprise variable is equal or higher for
local markets than for foreign markets across all models. This is an intuitive result, as we assume
that local fundamentals should explain local markets better than they explain foreign markets.
For the US, however, due to its dominant economic position, this assumption
might be weaker than for other countries.
When we assess model goodness of fit across the different asset classes evaluated, we find
that the R2 of our three OLS models is much higher for the stock and bond markets. Within
stocks, the Euro-Stoxx-based models are the ones with the highest R2. In bonds, the 2y Bund
models are the ones with the highest explanatory power. Copper-based models have the highest R2
in commodities. Results from AIC and from currencies are more mixed. In univariate models,
the percentage of positive coefficients for expected surprises is higher for stocks and bonds
(always above 59 percent) than for other asset classes, in line with our expectations.
Further, R2 is almost the same for growth and inflation indicators. Nevertheless, AIC
points to a clear superiority in fit of models based on growth indicators over those based on
inflation indicators. The percentage of positive coefficients for the expected surprise variable
is also consistently higher for growth indicators (between 57 and 69 percent on average), whereas
it is always lower than 50 percent for inflation indicators. We think that this result is caused
by positive growth surprises being less directly linked to subsequent interest rate increases by
central banks than positive inflation surprises, as higher interest rates typically produce
negative shocks to equities and, more indirectly, commodities.
18 We expect expected surprise coefficients to be, in general, positive, as we expect that equities, commodities and currencies would typically appreciate in response to a positive economic surprise. Bond returns are adjusted to have the opposite sign in order to be consistent with the other asset classes.
Table 5.5: Results of restricted and unrestricted market response models per market

Anchor-only response model
Markets R2 AIC %E(Surprise) coeff.>0 Hit-ratio Train/Test RMSE (x1000) Train/Test
S&P500 2.0% -1298 63% 53%/48% 2.16/1.16
Euro-Stoxx 2.3% -1128 63% 55%/51% 3.28/1.65
FTSE100 2.2% -1620 60% 54%/49% 2.02/0.99
2y Bund 3.0% -1133 64% 53%/46% 0.21/0.09
2y T-Note 1.1% -1066 52% 54%/51% 0.43/0.22
10y Gilt 1.1% -1499 61% 55%/50% 1.18/0.90
WTI Oil 0.9% -1334 53% 53%/50% 1.47/1.31
Gold 0.8% -1654 42% 54%/51% 1.47/1.92
Copper 1.2% -1311 60% 54%/47% 1.50/1.08
USDGBP 1.0% -2457 60% 53%/51% 0.76/0.82
USDEUR 1.0% -2273 56% 53%/51% 0.96/1.10
USDJPY 1.3% -2113 60% 53%/50% 1.08/1.20
USDCHF 1.1% -2154 60% 53%/50% 0.98/1.09
USDAUD 1.8% -1837 56% 52%/50% 1.43/1.26
USDCAD 0.9% -1960 53% 54%/50% 0.95/1.04
Local (avg) 1.5% -1182 63% 54%/50% 1.30/0.69
Foreign (avg) 1.4% -1729 58% 54%/50% 1.33/1.11
Stocks (avg) 2.2% -1349 62% 54%/50% 2.49/1.27
Bonds (avg) 1.8% -1232 59% 54%/49% 0.61/0.41
Currencies (avg) 1.2% -2132 58% 53%/51% 1.03/1.08
Commodities (avg) 1.0% -1433 52% 54%/49% 1.48/1.44
Growth (avg) 1.5% -2623 60% 54%/50% 1.35/1.09
Inflation (avg) 1.4% -1522 48% 53%/50% 1.16/0.88
Average 1.5% -1656 57% 54%/50% 1.32/1.06
Pop-weighted avg. 1.4% -1658 58% 54%/50% 1.39/1.13
Most pop. indicators (avg) 1.1% -1246 63% 54%/50% 1.52/1.23

Unrestricted response model
Markets R2 AIC %E(Surprise) coeff.>0 Hit-ratio Train/Test RMSE (x1000) Train/Test
S&P500 2.4% -1273 65% 54%/49% 2.16/1.15
Euro-Stoxx 3.4% -1130 67% 55%/52% 3.26/1.67
FTSE100 2.7% -1589 67% 54%/51% 2.00/0.97
2y Bund 4.3% -1134 83% 55%/52% 0.21/0.10
2y T-Note 2.6% -1058 79% 54%/52% 0.42/0.22
10y Gilt 2.2% -1488 73% 55%/49% 1.15/0.89
WTI Oil 1.1% -1311 67% 52%/49% 1.47/1.32
Gold 0.7% -1620 42% 54%/50% 1.48/1.93
Copper 2.4% -1298 65% 54%/50% 1.49/1.09
USDGBP 1.1% -2387 67% 53%/52% 0.76/0.83
USDEUR 1.4% -2267 63% 55%/52% 0.96/1.10
USDJPY 1.6% -2114 72% 54%/51% 1.08/1.21
USDCHF 1.4% -2155 74% 54%/50% 0.98/1.09
USDAUD 1.3% -1823 44% 53%/49% 1.43/1.27
USDCAD 0.8% -1960 37% 53%/48% 0.95/1.04
Local (avg) 2.5% -1165 65% 54%/51% 1.29/0.68
Foreign (avg) 1.9% -1714 63% 54%/50% 1.32/1.12
Stocks (avg) 2.8% -1331 67% 54%/51% 2.47/1.26
Bonds (avg) 3.0% -1227 78% 55%/51% 0.59/0.40
Currencies (avg) 1.3% -2118 60% 54%/50% 1.03/1.09
Commodities (avg) 1.4% -1410 58% 53%/50% 1.48/1.45
Growth (avg) 2.0% -2608 69% 54%/51% 1.35/1.10
Inflation (avg) 1.8% -1504 48% 54%/49% 1.15/0.88
Average 2.0% -1640 64% 54%/50% 1.32/1.06
Pop-weighted avg. 2.0% -1639 68% 54%/50% 1.38/1.13
Most pop. indicators (avg) 2.1% -1691 79% 54%/51% 1.51/1.23

Unrestricted-extended response model
Markets R2 AIC %E(Surprise) coeff.>0 Hit-ratio Train/Test RMSE (x1000) Train/Test
S&P500 36% -1241 60% 64%/51% 1.76/2.53
Euro-Stoxx 38% -1103 53% 64%/52% 2.66/2.84
FTSE100 35% -1562 42% 63%/52% 1.63/1.35
2y Bund 46% -1094 50% 66%/49% 0.16/0.20
2y T-Note 46% -1017 50% 66%/52% 0.33/0.36
10y Gilt 28% -1445 51% 65%/52% 0.99/1.10
WTI Oil 27% -1271 58% 61%/51% 1.25/1.87
Gold 24% -1576 51% 61%/50% 1.31/2.07
Copper 33% -1261 49% 63%/52% 1.23/1.36
USDGBP 17% -2329 70% 60%/48% 0.70/0.89
USDEUR 18% -2213 56% 60%/51% 0.87/1.15
USDJPY 19% -2065 67% 61%/50% 0.99/1.29
USDCHF 19% -2106 58% 61%/51% 0.89/1.17
USDAUD 24% -1784 49% 61%/52% 1.27/1.39
USDCAD 19% -1910 49% 62%/50% 0.86/1.14
Local (avg) 41% -1129 55% 65%/52% 1.05/1.44
Foreign (avg) 27% -1671 54% 62%/51% 1.14/1.37
Stocks (avg) 36% -1302 52% 64%/52% 2.02/2.24
Bonds (avg) 40% -1185 50% 66%/51% 0.50/0.55
Currencies (avg) 20% -2068 58% 61%/50% 0.93/1.17
Commodities (avg) 28% -1370 53% 62%/51% 1.27/1.77
Growth (avg) 29% -2561 57% 63%/51% 1.16/1.44
Inflation (avg) 31% -1464 44% 63%/50% 0.96/1.20
Average 29% -1599 54% 63%/51% 1.12/1.40
Pop-weighted avg. 26% -1598 56% 62%/51% 1.20/1.40
Most pop. indicators (avg) 24% -1651 59% 62%/52% 1.34/1.44

Unrestr. Ridge response model
Markets AIC Hit-ratio Train/Test RMSE (x1000) Train/Test
S&P500 -1306 53%/51% 1.44/1.31
Euro-Stoxx -1169 53%/52% 0.95/1.08
FTSE100 -1630 54%/52% 1.48/1.55
2y Bund -1192 52%/49% 1.50/1.16
2y T-Note -1100 54%/52% 0.21/0.13
10y Gilt -1445 54%/51% 2.19/1.30
WTI Oil -1304 51%/51% 3.30/1.98
Gold -1611 53%/51% 0.97/1.12
Copper -1326 54%/53% 2.01/1.08
USDGBP -2367 52%/49% 1.48/1.98
USDEUR -2251 53%/52% 1.16/0.96
USDJPY -2102 53%/50% 0.98/1.12
USDCHF -2144 52%/50% 0.43/0.24
USDAUD -1820 52%/51% 0.77/0.85
USDCAD -1947 53%/50% 1.09/1.24
Local (avg) -1203 53%/51% 1.31/0.77
Foreign (avg) -1716 53%/51% 1.33/1.20
Stocks (avg) -1368 53%/52% 2.50/1.45
Bonds (avg) -1246 53%/51% 0.60/0.44
Currencies (avg) -2105 52%/50% 1.03/1.12
Commodities (avg) -1414 52%/52% 1.49/1.56
Growth (avg) -2600 53%/51% 1.36/1.17
Inflation (avg) -1535 53%/50% 1.15/0.98
Average -1648 53%/51% 1.33/1.14
Pop-weighted avg. -1640 52%/51% 1.39/1.21
Most pop. indicators (avg) -1687 52%/52% 1.53/1.31

Unrestr. RF response model
Markets Hit-ratio Train/Test RMSE (x1000) Train/Test
S&P500 54%/50% 2.20/1.21
Euro-Stoxx 53%/54% 3.36/1.80
FTSE100 53%/54% 2.06/0.95
2y Bund 51%/47% 0.22/0.10
2y T-Note 50%/49% 0.44/0.23
10y Gilt 53%/50% 1.18/0.89
WTI Oil 50%/50% 1.52/1.32
Gold 52%/51% 1.51/1.93
Copper 51%/50% 1.52/1.08
USDGBP 52%/50% 0.78/0.83
USDEUR 54%/53% 0.97/1.10
USDJPY 53%/53% 1.11/1.22
USDCHF 52%/51% 1.00/1.10
USDAUD 52%/52% 1.47/1.27
USDCAD 53%/52% 0.96/1.05
Local (avg) 52%/50% 1.32/0.72
Foreign (avg) 52%/51% 1.36/1.13
Stocks (avg) 54%/53% 2.54/1.32
Bonds (avg) 51%/49% 0.61/0.41
Currencies (avg) 52%/52% 1.05/1.09
Commodities (avg) 51%/50% 1.52/1.45
Growth (avg) 52%/51% 1.38/1.11
Inflation (avg) 52%/51% 1.18/0.90
Average 52%/51% 1.35/1.07
Pop-weighted avg. 52%/51% 1.41/1.15
Most pop. indicators (avg) 52%/50% 1.54/1.26

This table reports results of anchor-only (restricted), unrestricted and unrestricted-extended market response models 1) per market evaluated, 2) aggregated per geographical coverage (i.e., local or foreign), 3) aggregated across asset classes (i.e., stocks, bonds, FX and commodities), 4) aggregated per type of macroeconomic indicator to predict economic surprises (i.e., growth or inflation) and 5) aggregated using popularity weights. We aggregate results by averaging statistics from the individual (macro indicator-specific) models. Statistics reported are the average R2, the explanatory power; AIC, the Akaike Information Criterion; Coeff.>0, the percentage of positive coefficients; hit-ratios and RMSE (x1000). Hit-ratios and RMSEs are reported for our train and out-of-sample or test data, whereas other statistics are calculated in-sample. The in-sample period extends through the full data set for each indicator, whereas our out-of-sample period (our test data set) comprises the last 25 percent of observations of the data set. The first 75 percent of the data is our training data set, which is used for tuning and estimation of predictive models. For machine learning-based predictive models, only AIC (for Ridge), hit-ratios and RMSEs are reported because inference of other statistics is not straightforward or because results may get distorted when aggregated. Statistics for the individual markets are also averages as they are aggregated from models that are based on the set of individual macroeconomic indicators investigated by us.
Weighting model fit outcomes by our popularity measure does not lead to additional
insights, as R2 and AIC are nearly the same across models that rely on less-popular indicators
and models that rely on popular indicators. The percentage of positive coefficients for UnES
is, though, clearly higher for popular indicators.
As we assess the performance of predictions made by the various models, our first impression
is that anchor-only and unrestricted models do not convincingly beat a 50 percent hit-ratio
out-of-sample, despite delivering a roughly 54 percent hit-ratio in the train data set.
Out-of-sample hit-ratios for the unrestricted-extended models (51 percent) are only slightly
better than a coin flip. The same applies to the average Ridge and Random forest
models, as they also post out-of-sample hit-ratios of around 51 percent. Interestingly, train
hit-ratios for the unrestricted-extended models seem to overstate out-of-sample hit-ratios more
heavily than the train hit-ratios of the Ridge and Random forest models do19. For instance, the
average train hit-ratio for the unrestricted-extended model is 63 percent, whereas it is 53 percent
for the Ridge and 52 percent for the Random forest. Random forest is the model whose train
hit-ratios seem to overstate test hit-ratios the least.
Across all unrestricted-extended frameworks, hit-ratios seem to be consistently higher for
models that forecast stock returns than for models that predict other asset classes, especially
currencies and commodities, matching our findings from R2. Hit-ratios also suggest that models
based on growth indicators do a better job at forecasting market direction than models based on
inflation indicators. Further, train and test hit-ratios of models that use popular macroeconomic
indicators are not consistently higher than hit-ratios for the average model.
Concerning RMSE, a first noticeable observation is that OLS unrestricted-extended models
outperform all other models (including machine learning-based models) in the train set but deliver
a higher average test RMSE than all other models. This is the case across popular and unpopular
indicators and might be a symptom of overfitting. Machine learning-based models, however,
report average train RMSEs that are higher than for the unrestricted-extended model but
deliver much lower out-of-sample RMSEs, respectively 1.14x10^-3 and 1.07x10^-3 for the Ridge
and Random forest models (versus 1.40x10^-3 for the unrestricted-extended model). RMSEs
are often higher for the in-sample period than for the out-of-sample period, which may be
caused by the different levels of market volatility in the two sample splits. RMSEs also vary
substantially across asset classes, which is explained by the diverse levels of return volatility of
the asset classes used. As expected, RMSEs are the lowest for bonds and currencies and higher
for stocks and commodities. Further, both train and test RMSEs are the lowest for models
that use inflation indicators and predict local markets.
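The shrinkage applied by Ridge regression can be illustrated in the simplest possible case, a single demeaned regressor, where the Ridge slope has the closed form sum(x*y) / (sum(x*x) + penalty). The sketch below is only an illustration under that assumption: the data and the penalty values are hypothetical, and the models in this chapter use many regressors with a penalty tuned on the training data.

```python
def ridge_slope(x, y, penalty):
    """Ridge slope for one regressor after demeaning x and y.
    A penalty of zero reproduces the OLS slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / (sxx + penalty)

# Hypothetical data: one regressor and a noisy linear response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]

for penalty in (0.0, 1.0, 10.0, 100.0):
    # The slope moves toward zero as the penalty grows.
    print(penalty, round(ridge_slope(x, y, penalty), 4))
```

Pulling coefficients toward zero makes the fitted model less sensitive to noise in the training sample. This typically raises train errors somewhat and may lower test errors, which is the pattern reported for the Ridge models above.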
Our results also indicate that market responses created by announcements of popular
macroeconomic indicators are less predictable than responses to unpopular indicators, as train
and test RMSEs are consistently higher for popular indicators across all models. This finding
suggests that, even if economic surprises in popular indicators are easier to forecast (as biases
are more pervasive), their market responses are less anticipated (in RMSE terms) by the
predictable part of economic surprises. As reported earlier, the estimated hit-ratios do not suggest
that popular indicators are more predictable either. Hence, to some extent, market participants
seem either to discount the biases incurred by forecasters when trading around economic
surprises of popular indicators or to have better models to predict surprises. These results are
somewhat connected to Campbell and Sharpe (2009), who conclude that market participants
“look through” forecasters’ biases within ten popular US macroeconomic indicators20,21,22.
19 This result, which suggests overfitting by the OLS models, is what motivates us to apply shrinkage, as done by the Ridge regression.
When we dig into the drivers of forecasts produced by the unrestricted-extended, Ridge
regression and Random forest models, we find that the E(UnES) variable is important but
ranks only after a few market-based variables, such as the 5-minute asset return prior to
the announcement as well as the prior-day level of the VIX index and stock returns. We
base this conclusion on two metrics: the percentage of significant coefficients estimated by our
unrestricted-extended OLS models and the Importance measure extracted from our Random
forest models. In addition to these two metrics, we rely to some extent on the percentage of
positive coefficients from regression models to evaluate whether the relations found between
responses and the UnES, ESA and Skew variables have the expected coefficient sign23.
Table (5.6) reports the percentage of positive coefficients for predictive models using the
OLS and Ridge approaches (in the train and test data sets, respectively reported in Panels A
and B). We observe that the estimated coefficients for UnES and ESA are more often positive
than negative, in line with our expectations. For Skew this result is less strong, as within
the OLS model this variable is only positive between 45 and 48 percent of the time. Nevertheless,
because, among all regressors, Skew and UnES are the most correlated variables (reaching
a correlation of 0.8 for some of our macroeconomic indicators), the fact that Skew is mostly
negative might simply be a manifestation of multicollinearity in the regression model. We also
find the coefficients of Stocks and $R_{t-5}$ to be consistently positive, suggesting a positive
serial correlation between returns during announcements and prior asset returns. In the case of
Stocks, such a relation might be linked to time-series momentum, which is typically captured in
daily frequency data24. In the case of $R_{t-5}$, a positive coefficient indicates the presence of
pre-announcement price drift in the direction of the economic surprise-led responses just a few
minutes prior to the data release, indicating potential leakage of information or short-term
trading activity by informed investors.
20 We use nine out of the ten US macroeconomic indicators evaluated by Campbell and Sharpe (2009). The only indicator used by these authors and not by us is the New Home Sales statistic, as we do not include housing market data in our analysis.
21 The average weight used to calculate popularity-weighted statistics is 2.3 percent, whereas the average weight of the indicators used by Campbell and Sharpe (2009) within such a weighting scheme is 3.5 percent, denoting the use of very popular indicators by the authors.
22 Qualitatively similar results are obtained when we perform the supervised learning approaches specified in section 5.4.3 as a classification problem rather than in a regression setting. These results are available upon request.
23 We evaluate the percentage of positive signs for these three variables only, as we do not have a prior for the relation between past returns and market responses around macroeconomic announcements. The same applies to the relation between return volatility and market responses around announcements.
24 If it were found that positive (negative) responses amid positive (negative) data surprises are strengthened by existing positive (negative) time-series momentum, one could hypothesize that serial correlation in prices (i.e., momentum) is intensified by economic surprises in the same direction or by a series of such surprises, i.e., serial correlation in surprises.
Table 5.6: Results for unrestricted-extended market response models per factor
Panel A - Unrestricted response models - Train set Panel B - Unrestricted response models - Test set/Out-of-sample
Extended (OLS) Ridge Random Forest Extended (OLS) Ridge Random Forest
% positive % significant % positive Importance (x10^5) % positive % significant % positive Importance (x10^5)
Intercept 49% 10% 47% - 50% 10% 46% -
UnES 58% 13% 65% 2.4 56% 12% 66% 2.0
ESA 52% 11% 59% 1.4 51% 10% 57% 1.2
Skew 48% 10% 62% 2.3 45% 10% 52% 1.9
Std 44% 13% 50% 1.6 45% 12% 47% 1.2
SurvLag 52% 10% 47% 1.6 51% 9% 49% 1.3
Inflation 52% 14% 52% 1.8 57% 12% 56% 1.4
Growth 51% 8% 47% 1.8 51% 7% 50% 1.4
Stocks 66% 28% 64% 3.5 65% 25% 65% 3.0
VIX 49% 13% 51% 3.4 50% 11% 53% 2.9
VIXdif 58% 15% 49% 1.9 55% 15% 55% 1.6
ret60 51% 15% 46% 1.8 51% 15% 49% 1.4
ret55 50% 19% 50% 2.1 52% 16% 54% 1.7
ret50 46% 21% 49% 2.2 47% 16% 47% 1.8
ret45 48% 18% 49% 2.3 48% 17% 49% 1.9
ret40 43% 16% 43% 2.4 45% 13% 43% 1.9
ret35 53% 16% 51% 2.7 52% 16% 52% 2.3
ret30 44% 15% 41% 2.2 42% 12% 42% 1.8
ret25 51% 17% 47% 2.2 51% 15% 51% 1.9
ret20 48% 17% 47% 2.1 50% 13% 47% 1.8
ret15 40% 21% 38% 2.5 45% 15% 41% 2.1
ret10 49% 22% 48% 2.4 53% 18% 52% 2.0
ret5 66% 27% 59% 3.5 65% 23% 65% 2.9
Average 51% 16% 51% 2.3 51% 14% 52% 1.9
Panel A reports details on the fit of unrestricted-extended models for the in-sample period. Panel B reports details on the fit of unrestricted-extended models for the out-of-sample period. Across the two panels, we contrast results from the Unrestricted-extended, Unrestricted-Ridge regression and Unrestricted-Random Forest models to provide some interpretation of our results. As the three models used do not provide a common variable for direct comparison, these statistics are mostly used here to map the best predictors and, perhaps, to find confirmation of that from other models.
Turning to the percentage of significant coefficients estimated by the unrestricted-extended
OLS model, we find that Stocks and $R_{t-5}$ are the variables most strongly connected to asset
responses amid macroeconomic announcements. Returns over other time frames (minutes) before
announcements are also connected to returns during announcements, although the
direction of the relationship is not clear. Among non-market data based regressors, UnES
and Std are linked to market responses the most, indicating the relevance of UnES for market
predictions. Using the Random forest Importance measure (i.e., node impurity) as guidance25,
we find that Stocks and $R_{t-5}$ but also the VIX are highly relevant for predictions (see Table
(5.6) and Figure 5.2). As reported by Importance, UnES is the most relevant non-market
data predictor used by Random forest. The fact that past returns have predictive power
in forecasting returns around data announcements also adds to the pool of evidence in the
literature on the failure of the Efficient Market Hypothesis (EMH) in its weak form.
25See Appendix 5.A for details.
Figure 5.2: Importance measure from Random forest model. The bar charts above depict the Importance measure produced by the Random forest model applied to predict market responses around the announcements of macroeconomic data, in the train and test data sets. More specifically, the Importance measure computes the average node impurity across all trees grown by the Random forest, which reflects how partitions made by the different explanatory variables at each node into two sub-regions perform versus a constant fit over the entire region, i.e., a 'pure' node. For the case of regression models, performance is calculated in terms of squared errors (see Appendix 5.A for additional details on the Importance measure).
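The building block of such an impurity-based measure can be sketched for a single node. The example computes how much the sum of squared errors falls when one region is split in two, relative to a constant fit over the whole region. It is an illustration only: the data, the threshold values and the function names are hypothetical, and the averaging across nodes and trees described in Appendix 5.A is not reproduced here.

```python
def sse(values):
    """Sum of squared errors around the mean, i.e., the error of a constant fit."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_gain(x, y, threshold):
    """Reduction in squared error from splitting on x <= threshold
    versus fitting one constant to the whole region."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return sse(y) - (sse(left) + sse(right))

# Hypothetical node: the response jumps when x exceeds 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.0, 3.0, 3.0]

print(split_gain(x, y, threshold=2.0))  # split at the jump
print(split_gain(x, y, threshold=1.0))  # less informative split
```

A variable that repeatedly produces large reductions of this kind across the nodes and trees of the forest receives a high importance score.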
In brief, we show that cross-asset returns around US macroeconomic data announcements can
be largely explained by variables that represent biases in the behavior of forecasters, such as
UnES and ESA, as well as market-based variables, such as Stocks and $R_{t-5}$. Beyond that, we
find that these variables also have some out-of-sample predictive power. Explanatory power
and predictability26 are higher for local stocks and bonds than for currencies and commodities,
whereas local markets are better predicted than foreign markets. Goodness of fit measures
(R2 and AIC) indicate that larger models deliver much higher explanatory power but, taking
parsimony into account, bigger models are only preferred when regularized. Contrary to our
results on the predictability of economic surprises, in which popular indicators are found to be
more predictable, we find that market returns around announcements of popular macroeconomic
indicators are less predictable than responses provoked by unpopular indicators.
26 We consider hit-ratio as our predictability measure as RMSE cannot be adequately used to compare return forecasts of assets with very distinct volatility as in our exercise. We used the median hit-ratio among unrestricted-extended models to rank the predictability across the different asset classes studied.
Beyond that, our results suggest that the (regularized) machine learning methods applied
are better at avoiding overfitting in our data set than simpler models, such as the OLS
regression. No model applied consistently outperforms the other methods in producing superior
RMSEs and hit-ratios; however, Random forest dominates the other methods on point forecasts,
as it consistently delivers lower RMSEs. We hypothesize that this result might be driven by the
fact that Random forest is the only non-linear method among the models tested. Finally, the
variable Importance measures calculated seem to challenge the myth that Random forest is a
“black-box” method, as they allow for some model interpretation.
5.4.4 Market responses, skewness of economic forecasts and regret
In the previous sections, we observed that the skewness of economic forecasts is strongly and
positively linked to economic surprises and market responses. In the following, we hypothesize
that the relation between the skewness of economic forecasts and market responses depends on
failures of our skewness-based model in forecasting surprises. The rationale behind this
hypothesis is that if market participants use experts’ forecasts to trade, market responses might
be adversely affected by the skewness of forecasts when they fail to correctly predict surprises.
More specifically, we hypothesize that: 1) if a forecasted surprise fails to predict the direction
of the realized surprise, then the corresponding market response is relatively large and in the
opposite direction to the forecasted surprise (i.e., in line with the realized surprise); 2) if a
forecasted surprise is in line with the realized surprise, then the subsequent market response is
relatively small and in line with both the forecasted and realized surprise.
The intuition of Hypothesis (1) is that a regret effect takes place in asset markets, in line
with the models of Loomes and Sugden (1982) and Bell (1982). Therefore, investors who are
positioned in line with the expected surprises cut their losses quickly after the economic
release is made public. Concurrently, when realized surprises are in line with expected ones
(Hypothesis (2)), no additional trading activity is expected from market participants that are
holding such expectations, as it is likely they have positioned themselves according to their
expectations ahead of the specific release27. One strong assumption made in this exercise is
that the direction of the expected surprise is driven by the direction of the skewness of forecasts,
which is in line with our estimated economic surprise models but is not a result found for every
single macroeconomic indicator in our analysis28.
27 An implicit assumption embedded in Hypotheses (1) and (2) is that market participants that take part in an economic forecast survey also trade in asset markets (in line with their own forecasts) and that their forecasts influence market participants who trade in these markets.
28 As indicated by Table (5.2), 93 percent of the estimated unrestricted models for economic surprises have a positive Skew coefficient.
In order to test Hypotheses (1) and (2), we specify the following regression models:

$$R_t = \alpha + \mathit{Skew}_t^{+} \ast S_t^{-} + \varepsilon_t, \tag{5.4.1a}$$
$$R_t = \alpha + \mathit{Skew}_t^{+} \ast S_t^{+} + \varepsilon_t, \tag{5.4.1b}$$
$$R_t = \alpha + \mathit{Skew}_t^{-} \ast S_t^{-} + \varepsilon_t, \tag{5.4.1c}$$
$$R_t = \alpha + \mathit{Skew}_t^{-} \ast S_t^{+} + \varepsilon_t, \tag{5.4.1d}$$

where $R_t$ is the market response. $\mathit{Skew}_t^{+}$ is the skewness in forecasts when it is
positive and $\mathit{Skew}_t^{-}$ when it is negative. $S_t^{+}$ is the realized surprise when it is
positive and $S_t^{-}$ when it is negative. Hence, explanatory variables in these models are
interactions between surprises and skewness in forecasts. To make the interpretation of the
estimated coefficients of these interaction terms easier, we run regressions with the absolute
value of these explanatory variables. Importantly, given our assumption that the direction of the
expected surprise is driven by the direction of the skewness of forecasts, we interpret the
variables $\mathit{Skew}_t^{+} \ast S_t^{-}$ and $\mathit{Skew}_t^{-} \ast S_t^{+}$ as scenarios in
which the expected surprise failed to forecast the realized economic surprise. In the same line,
$\mathit{Skew}_t^{+} \ast S_t^{+}$ and $\mathit{Skew}_t^{-} \ast S_t^{-}$ are scenarios in which the
expected surprise was successful in forecasting the economic surprise.
Hence, these regressions split the direction of the skewness and realized surprises to map
the four possible scenarios in which the responses can be evaluated: 1) the presence of positive
skewness and negative economic surprise (skewness fails to forecast the direction of surprise); 2)
the presence of positive skewness and positive economic surprise (skewness successfully forecasts
the direction of surprise); 3) the presence of negative skewness and negative economic surprise
(skewness successfully forecasts the direction of surprise); and 4) the presence of negative
skewness and positive economic surprise (skewness fails to forecast the direction of surprise).
The scenarios that give rise to regret are the ones in which the skewness fails to forecast the
direction of economic surprise, thus, the scenarios number 1 and 4.
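The four-scenario mapping can be sketched as follows. This is a hypothetical illustration rather than the code behind Table (5.7): the function names and the toy observations are assumptions, surprises are taken to be already standardized, and observations with zero skewness or zero surprise are simply left out.

```python
def scenario(skew, surprise):
    """Return the scenario number (1 to 4) used in the text, or None if
    skewness or surprise is exactly zero (a case the text does not cover)."""
    if skew > 0 and surprise < 0:
        return 1  # skewness fails to forecast the surprise (regret)
    if skew > 0 and surprise > 0:
        return 2  # skewness forecasts the surprise correctly
    if skew < 0 and surprise < 0:
        return 3  # skewness forecasts the surprise correctly
    if skew < 0 and surprise > 0:
        return 4  # skewness fails to forecast the surprise (regret)
    return None

def group_by_scenario(observations):
    """Collect (|Skew * S|, response) pairs per scenario, one univariate
    regression sample per scenario."""
    groups = {1: [], 2: [], 3: [], 4: []}
    for skew, surprise, response in observations:
        number = scenario(skew, surprise)
        if number is not None:
            groups[number].append((abs(skew * surprise), response))
    return groups

# Hypothetical (skewness, standardized surprise, market response) observations.
observations = [(0.5, -1.0, -0.3), (0.5, 1.0, 0.1), (-0.4, -2.0, -0.2),
                (-0.4, 1.5, 0.4), (0.0, 1.0, 0.0)]
groups = group_by_scenario(observations)
for number in sorted(groups):
    print(number, groups[number])
```

Because the regressor enters in absolute value, the sign of each estimated slope can be read directly as the direction of the market response in that scenario.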
We note that Eqs. (5.4.1a) to (5.4.1d) do perform this four-scenario mapping by implement-
ing each one of them as an individual univariate model. In order to have enough observations
to run regressions for each of these four scenarios, we do not run regressions at the individual
economic indicator level but we aggregate observations for all economic releases. Aggregating
surprise data for all the economic indicators within our sample is possible because surprises are
also available as a number of standard deviations from the mean (apart from in raw surprise
format)29.
29 The application of these regressions using raw surprise would be biased as the different magnitude of the multiple economic releases would create a biased relation between the explanatory variable (surprises) and the explained variable (market response).
Table 5.7: Test of market response in presence of regret
Panel A - Univariate
Markets R2 Skw+Surp- R2 Skw+Surp+ R2 Skw-Surp- R2 Skw-Surp+
S&P500 0.18% -19.4 0.01% 5.5 0.10% -15.9 0.07% 12.6
Euro-Stoxx 0.04% -6.4 0.04% 8.6 0.01% -3.0 0.12% 11.5
FTSE100 0.10% -11.8 0.00% 3.2 0.00% -0.7 0.00% 3.0
2y Bund 0.54% -26.9*** 0.03% 9.4 0.01% 3.0 0.07% 12.6
2y T-Note 0.35% -3.6* 0.02% 0.8 1.50% 7.4*** 0.00% 0.2
10y Gilt 0.07% -14.5 0.09% -25.6 0.07% 13.9 0.00% -0.1
WTI Oil 0.14% -29.0 0.05% -24.4 0.06% 20.3 0.01% 9.6
Gold 0.00% -1.9 0.01% -3.5 0.01% -2.5 0.08% 9.2
Copper 0.07% -12.9 0.02% -9.0 0.01% 5.8 0.00% -2.8
USDGBP 0.11% -13.8 0.04% 12.5 0.09% 15.6 0.02% 8.3
USDEUR 0.00% -2.1 0.14% 17.9 1.37% 38.6*** 0.34% 22.7**
USDJPY 0.28% -16.4** 0.01% 3.0 0.03% -5.8 0.21% 15.2*
USDCHF 0.05% 3.4 0.00% 0.4 0.06% 3.2 0.09% 4.6
USDAUD 0.08% -6.7 0.04% -7.4 0.00% 1.8 0.13% 9.8
USDCAD 0.04% -7.1 0.00% 3.2 0.12% -13.1 0.01% 3.0
% coefficients > 0 7% 67% 60% 87%
% sig. coefficients > 0 0% - 100% 100%
This table reports results of our test for the presence of regret within market responses, see Eqs. (5.4.1a) to (5.4.1d), in a univariate setting. $\mathit{Skew}_t^{+}$ is the skewness in forecasts when it is positive and $\mathit{Skew}_t^{-}$ when it is negative. $S_t^{+}$ is the realized surprise when it is positive and $S_t^{-}$ when it is negative. Explanatory variables in these models are interaction terms between surprises and skewness in forecasts. $\mathit{Skew}_t^{+} \ast S_t^{-}$ and $\mathit{Skew}_t^{-} \ast S_t^{+}$ are scenarios in which the expected surprise failed to forecast the realized economic surprise. $\mathit{Skew}_t^{+} \ast S_t^{+}$ and $\mathit{Skew}_t^{-} \ast S_t^{-}$ are scenarios in which the expected surprise was successful in forecasting the economic surprise. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
Table (5.7) reports the regression results for Eqs. (5.4.1a) to (5.4.1d) in a univariate
setting. We observe that when economic surprises are negative, the percentage of coefficients
found to be positive is lower than when economic surprises are positive. This indicates that
negative surprises are more closely linked to negative market responses than positive surprises
are, when the absolute values of Skew_t^+ ∗ S_t^- and Skew_t^- ∗ S_t^- are used as regressors.
This is in line with what we expected30. However, when the skewness of forecasts is positive, the
percentage of positive coefficients is the lowest (7 percent). This result suggests that negative
responses are more frequently linked to negative surprises when the skewness of forecasts fails to
correctly predict the economic surprise. This result is confirmed when we only use statistically
significant coefficients in our analysis, as the percentage of positive and statistically significant
coefficients for Skew_t^+ ∗ S_t^- is zero versus 100 percent for Skew_t^- ∗ S_t^-. We interpret
this finding as supportive of our hypothesis that regret affects market participants when trading
around economic surprises.
In line with our expectations, when surprises are positive, the number of coefficients pointing
towards a positive market response always exceeds 50 percent. Nevertheless, the number of
coefficients pointing towards a positive market response is higher when the skewness of forecasts
is negative (87 percent) than when it is positive (67 percent). This finding is confirmed when
only statistically significant coefficients are taken into account in the analysis. This finding
connects to our results when negative surprises are evaluated and supports our conjecture of a
regret effect within economic surprises.
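To make the four scenarios concrete, the following is a minimal sketch of how each announcement could be assigned to one scenario and how the corresponding interaction series could be built. The function names, the sample values and the use of the raw product Skew_t ∗ S_t as the regressor are illustrative assumptions; the exact sign and scaling conventions of Eqs. (5.4.1a) to (5.4.1d) are not reproduced here.

```python
LABELS = ["Skew+*S-", "Skew+*S+", "Skew-*S-", "Skew-*S+"]


def classify_scenario(skew, surprise):
    """Return (label, skew_was_right) for one announcement, or None if an input is zero."""
    if skew == 0 or surprise == 0:
        return None
    label = ("Skew+" if skew > 0 else "Skew-") + "*" + ("S+" if surprise > 0 else "S-")
    # The skewness "forecasts" the surprise correctly when both have the same sign.
    return label, (skew > 0) == (surprise > 0)


def interaction_terms(skews, surprises):
    """Build one regressor series per scenario; entries are zero outside that scenario."""
    series = {label: [] for label in LABELS}
    for skew, surprise in zip(skews, surprises):
        scenario = classify_scenario(skew, surprise)
        for label in LABELS:
            active = scenario is not None and scenario[0] == label
            # Assumption: the raw product is used as the interaction term.
            series[label].append(skew * surprise if active else 0.0)
    return series


# Hypothetical forecast skewness and standardized surprises for four announcements.
skews = [0.8, 0.5, -0.4, -1.2]
surprises = [-1.0, 2.0, -0.5, 1.5]
terms = interaction_terms(skews, surprises)
```

Each of the four series can then serve as the single explanatory variable of one univariate regression of market returns, which is the setting summarized in Table 5.7.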
30 We note that we reverse the coefficient sign of bond returns so that it is consistent with equity returns and currencies as far as the expected direction of returns given economic surprises is concerned.
5.4.5 Robustness tests
5.4.5.1 Economic surprise models across regions
As a robustness test, we apply Eqs. (5.2.5a) and (5.3.1) across other regions, namely, Continen-
tal Europe, the United Kingdom and Japan31. Table (5.8) indicates that our results for these
three regions are qualitatively the same as the ones reported for the US: unrestricted models
tend to improve the R2 of anchor-only models and the coefficients for the ESA and Skew fac-
tors are mostly positive (the expected sign). These two coefficients are positive between 56 and
75 percent of all times, which is however lower than the percentage of correct signs found for the
US. Yet, among coefficients for all factors (including control variables), ESA and Skew remain
the ones that are mostly positive. Moreover, in significance terms, models (anchor-only and
unrestricted) for Europe, Japan and the United Kingdom perform worse than the US model,
as the percentage of coefficients that are significant is, in general, lower than for the US.
Table 5.8: Aggregated results of anchor-only and unrestricted economic surprise models for Continental Europe, the UK and Japan
Region             Cont. Europe                    UK                       Japan
Model         Anchor-only Unrestricted  Anchor-only Unrestricted  Anchor-only Unrestricted
Panel A - Percentage of statistical significance per factor
Intercept            0.22         0.29         0.33         0.19         0.24         0.16
Bias                 0.44         0.37         0.33         0.31         0.42         0.28
Std                               0.18                      0.42                      0.16
SurvLag                           0.20                      0.33                      0.31
Skew                              0.09                      0.44                      0.13
Infl                              0.11                      0.11                      0.09
Growth                            0.14                      0.19                      0.13
Stocks                            0.15                      0.22                      0.13
IV                                0.29                      0.25                      0.06
Panel B - Explanatory power (R2)
Mean R2                8%          25%           3%          18%           3%          16%
Median R2              3%          16%           1%          13%           2%          10%
Stdev R2              17%          24%           5%          13%           3%          20%
Panel C - Percentage of positive coefficients
Intercept            0.37         0.53         0.56         0.50         0.52         0.78
Bias                 0.66         0.62         0.58         0.72         0.70         0.69
Std                               0.42                      0.56                      0.44
SurvLag                           0.47                      0.36                      0.31
Skew                              0.56                      0.75                      0.59
Infl                              0.40                      0.25                      0.44
Growth                            0.44                      0.44                      0.34
Stocks                            0.48                      0.44                      0.47
IV                                0.26                      0.42                      0.19
Panel A reports the percentage of statistically significant coefficients (factors) across anchor-only and unrestricted regression models for economic surprises of macroeconomic indicators for Europe, the United Kingdom and Japan. For example, 0.44 found for the ESA factor within the anchor-only model for Europe means that 44 percent of the ESA coefficients across the individual regressions run for the European macroeconomic indicators are statistically significant at the 10 percent level. Panel B reports the mean, median and standard deviation of the explanatory power (R2) achieved across all indicator-specific regressions. Panel C reports the percentage of positive coefficients across anchor-only and unrestricted regression models for economic surprises of the same macroeconomic indicators.
We conjecture that the difference in presence of biases in macroeconomic forecasting across
31 The overview of macro releases for these regions can be provided upon request.
the different regions might be explained by the number of experts dedicated to macroeconomic
forecasts across these countries/regions. The average number of analysts providing forecasts
across all indicators and through our sample is 44 for the US. For Europe, Japan and the
United Kingdom this number is, respectively, 9, 13, and 15. We argue that, as the number of
forecasters increases for a specific indicator or within a country, it becomes more likely that
1) convergence towards the previous release happens simply by the law of large numbers; 2)
forecasters possess private information; 3) such private information is revealed by the skewness
in forecasts, given strategic behavior by experts.
5.4.5.2 Expected and unexpected surprises
As an additional robustness test, we compare how expected economic surprises are linked to
market responses vis-à-vis unexpected surprises. As mentioned earlier, expected surprises are
the ones that can be predicted, in line with Campbell and Sharpe (2009). In this study, we have
used the anchor-only and unrestricted models, as given by Eqs. (5.2.5b) and (5.3.1), to estimate
expected economic surprises. In contrast, unexpected surprises are the residual component of
surprises. In other words, the unexpected surprise is the part of the economic surprise that
cannot be forecasted. We link market returns around macroeconomic data announcements with
expected surprises and unexpected surprises via the following two-component response model:
Rt = ω + δ1E[St] + δ2(St − E[St]) + εt, (5.4.2)
where E[St] can be provided by either the anchor-only model or by the unrestricted economic
surprise model, and St is the realized surprise. Essentially, the goal of running such a
regression model is to understand and compare whether and how market prices react to the
two components of economic surprises: the predictable component of surprises (by evaluating
δ1), and the unpredictable portion of economic surprises (by evaluating δ2).
In order to compare how market responses can be explained by expected surprises versus
unexpected surprises, we also run the following unexpected-surprise response model:
Rt = ω + δ3(St − E[St]) + εt. (5.4.3)
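As an illustration of Eqs. (5.4.2) and (5.4.3), the following is a minimal sketch that estimates both response models by ordinary least squares on simulated data and compares their R2. The helper names (`solve`, `ols`), the simulated series and the coefficient values are assumptions made for this example only; Newey-West standard errors, which are used for the reported results, are omitted for brevity.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(columns, y):
    """OLS with intercept. columns is a list of regressor lists; returns (coefficients, R2)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - rss / tss


# Simulated data (assumption): returns load on both surprise components.
random.seed(0)
n_obs = 500
expected = [random.gauss(0.0, 0.5) for _ in range(n_obs)]          # E[S_t]
surprise = [e + random.gauss(0.0, 1.0) for e in expected]          # S_t
unexpected = [s - e for s, e in zip(surprise, expected)]           # S_t - E[S_t]
returns = [0.1 + 0.5 * e + 1.0 * u + random.gauss(0.0, 1.0)
           for e, u in zip(expected, unexpected)]                  # R_t

beta_two, r2_two = ols([expected, unexpected], returns)   # Eq. (5.4.2): omega, delta1, delta2
beta_unexp, r2_unexp = ols([unexpected], returns)         # Eq. (5.4.3): omega, delta3
r2_gain = r2_two - r2_unexp                               # absolute R2 gain
pct_gain = r2_gain / r2_unexp                             # percentage R2 gain
```

Because the model in Eq. (5.4.3) is nested in the one in Eq. (5.4.2), the in-sample R2 of the two-component model can never be lower; the size of the gain indicates how much the predictable component adds.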
The estimates of Eqs. (5.4.2) and (5.4.3) are provided by Table (5.9). Panel A reports the
R2 as well as the percentage of δ1 and δ2 coefficients that are significant for both the anchor-only
(two-components) response model and the unrestricted (two-components) response model.
Table 5.9: Results of two-component (expected- and unexpected-economic surprise) response models
Panel A - Two-component response models (column groups marked A)
Panel B - Comparison between R2 from two-component response models and unexpected response models (column groups marked B)
                                          A: Anchor-only         A: Unrestricted          B: Anchor-only         B: Unrestricted
Indicator                             R2  %Sig E  %Sig U      R2  %Sig E  %Sig U      R2    Gain   %Gain      R2    Gain   %Gain
US Initial Jobless Claims SA        8.8%     40%     93%    8.9%     60%     87%    8.5%    0.3%      3%    8.4%    0.5%      6%
US Employees on Nonfarm Payrol     20.9%      7%    100%   22.2%     60%    100%   20.6%    0.2%      1%   20.0%    2.2%     11%
U-3 US Unemployment Rate Total      2.0%      7%     53%    1.8%      0%     47%    1.6%    0.4%     23%    1.4%    0.3%     23%
US Employees on Nonfarm Payrol      1.8%      0%     47%    2.1%      7%     47%    1.6%    0.2%     14%    1.9%    0.3%     15%
US Continuing Jobless Claims S      1.9%      0%     73%    2.2%      0%     73%    1.8%    0.1%      5%    2.1%    0.1%      5%
ADP National Employment Report     33.2%     13%     93%   33.3%     60%     93%   32.8%    0.5%      1%   28.7%    4.6%     16%
US Average Weekly Hours All Em      1.6%      0%      0%    2.0%      0%     20%    1.1%    0.5%     47%    1.8%    0.2%     10%
US Personal Income MoM SA           0.9%      0%     27%    1.0%     20%      7%    0.7%    0.2%     27%    0.4%    0.7%    194%
ISM Manufacturing PMI SA           19.1%     20%     87%   19.3%     33%     87%   18.8%    0.3%      2%   18.4%    0.9%      5%
US Manufacturers New Orders To      6.5%     20%     87%    6.6%     27%     80%    6.0%    0.5%      9%    5.8%    0.8%     13%
Federal Reserve Consumer Credi      1.9%     29%     14%    1.2%      7%      7%    1.0%    0.9%     90%    0.8%    0.3%     42%
Merchant Wholesalers Inventori      0.8%     20%      0%    0.5%      0%      0%    0.3%    0.5%    176%    0.3%    0.2%     68%
US Industrial Production MOM S     19.9%     27%     93%   21.4%     67%     87%   19.2%    0.7%      4%   18.1%    3.3%     18%
GDP US Chained 2009 Dollars Qo     24.0%     60%    100%   24.3%     47%    100%   22.4%    1.6%      7%   23.2%    1.1%      5%
US Capacity Utilization % of T     15.7%     60%     93%   16.0%     73%     93%   13.8%    1.9%     14%   13.3%    2.6%     20%
US Personal Consumption Expend      1.4%     13%     40%    1.3%     13%     33%    0.8%    0.6%     74%    0.9%    0.4%     41%
US Durable Goods New Orders In     13.9%      7%     93%   13.6%     60%     93%   13.5%    0.4%      3%   12.3%    1.3%     11%
US Auto Sales Domestic Vehicle      4.4%     40%      7%    1.9%      7%      0%    0.5%    3.9%    754%    0.9%    1.0%    109%
Adjusted Retail & Food Service     21.6%     20%     93%   23.5%     67%     93%   21.0%    0.7%      3%   21.2%    2.3%     11%
Adjusted Retail Sales Less Aut     25.6%     73%     93%   25.9%     73%     93%   22.1%    3.5%     16%   23.4%    2.5%     11%
US Durable Goods New Orders To     22.0%     13%     87%   22.0%     67%     93%   21.5%    0.5%      2%   20.0%    2.0%     10%
GDP US Personal Consumption Ch      4.7%     27%     80%    5.4%     27%     67%    3.6%    1.1%     31%    3.7%    1.7%     46%
ISM Non-Manufacturing NMI          25.4%     33%     80%   26.0%     40%     87%   23.2%    2.2%     10%   24.3%    1.7%      7%
US Manufacturing & Trade Inven      0.8%      7%      7%    0.8%      0%     13%    0.4%    0.4%     88%    0.4%    0.4%     93%
Philadelphia Fed Business Outl     20.2%     33%    100%   21.7%     67%     93%   19.5%    0.8%      4%   18.1%    3.5%     20%
MNI Chicago Business Barometer      7.7%     60%     87%    7.9%      7%     93%    6.6%    1.1%     17%    7.4%    0.6%      8%
Conference Board US Leading In      3.3%      0%     67%    3.8%     67%     40%    3.1%    0.3%      9%    1.3%    2.5%    202%
Conference Board Consumer Conf     23.5%     53%     93%   24.7%     67%     93%   21.7%    1.8%      8%   22.2%    2.4%     11%
US Empire State Manufacturing      15.8%     33%     87%   15.5%     47%     80%   14.7%    1.0%      7%   13.6%    1.9%     14%
Richmond Federal Reserve Manuf      2.8%      7%     40%    2.8%      0%     53%    2.0%    0.7%     36%    2.6%    0.2%      9%
ISM Milwaukee Purchasers Manuf      3.5%     27%     13%    2.5%      7%     13%    1.8%    1.7%     95%    1.5%    1.0%     70%
University of Michigan Consume      4.0%     40%     73%    4.5%     40%     73%    3.4%    0.7%     20%    2.9%    1.6%     54%
Dallas Fed Manufacturing Outlo      1.0%      0%      0%    2.1%     20%      0%    0.5%    0.6%    125%    0.6%    1.5%    255%
US PPI Finished Goods Less Foo      8.4%     33%     87%    8.9%     27%     87%    7.5%    1.0%     13%    7.2%    1.8%     25%
US CPI Urban Consumers MoM SA       9.0%     20%    100%   10.9%      0%    100%    8.4%    0.6%      7%   10.6%    0.3%      3%
US CPI Urban Consumers Less Fo     17.6%      7%    100%   19.3%      7%    100%   17.5%    0.2%      1%   19.0%    0.3%      1%
Bureau of Labor Statistics Emp      7.1%     33%     20%    5.1%     20%     20%    2.8%    4.4%    158%    2.6%    2.5%     95%
US Output Per Hour Nonfarm Bus      4.4%     27%     53%    4.2%     27%     53%    3.1%    1.3%     44%    2.7%    1.5%     55%
US PPI Finished Goods SA MoM%       6.3%     27%     67%    6.3%     53%     47%    5.5%    0.9%     16%    4.4%    2.0%     45%
US Import Price Index by End U      2.4%     20%     67%    2.8%     60%     27%    1.8%    0.6%     30%    0.9%    1.9%    200%
US GDP Price Index QoQ SAAR         2.0%      0%     40%    3.4%     60%     13%    1.6%    0.4%     26%    0.9%    2.5%    261%
US Personal Consumption Expend      0.8%      0%      7%    1.7%      7%     27%    0.6%    0.2%     25%    1.2%    0.5%     46%
US Personal Consumption Expend      1.6%      7%     27%    1.5%      7%     13%    1.1%    0.5%     45%    1.0%    0.5%     45%
Average                             9.7%     22%     62%   10.0%     32%     59%    8.8%    0.9%     49%    8.7%    1.4%     51%
Popularity-weighted average        11.6%     26%     72%   12.0%     37%     68%   10.6%    1.0%     33%   10.5%    1.5%     42%
Most pop. indicators (avg)         19.0%     26%     95%   20.1%     48%     94%   18.1%    0.9%      5%   18.4%    1.7%      9%
Panel A of the table reports results for the two-component response model based on the anchor-only and on the unrestricted model for economic surprises. Two-component response models separate economic surprises into expected surprises, as forecasted by economic surprise predictive models, and unexpected surprises, as the residual between economic surprises and expected surprises as given by Eq. (5.4.2). Statistics reported for two-component models are R2 and the percentage of regression coefficients that are statistically significant across different markets (%Sig E for expected and %Sig U for unexpected surprises). Panel B reports R2 for the (univariate) unexpected-surprise response model as given by Eq. (5.4.3) as well as the absolute and percentage gain in R2 (Gain and %Gain) delivered by two-component response models versus unexpected-surprise response models. Regression results are reported per economic indicator. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.
Our findings suggest that the coefficient for unexpected surprises is often significant, on
average in 62 and 59 percent of all cases for the anchor-only and unrestricted models,
respectively. Expected surprises are also frequently significant, although less often than
unexpected surprises. Still, because expected surprises are significant in 22 and 32 percent of
all cases, we conclude that market responses are also strongly linked to expected surprises,
not only to unexpected ones. The fact that expected surprises are more often significant in the
unrestricted models than in the anchor-only models, and that unexpected surprises are less often
significant in the unrestricted models, reiterates our result that factors beyond ESA, such as
Skew, are informative in predicting surprises.
Panel B of Table (5.9) compares the R2 of the univariate unexpected-surprise response models,
based on the anchor-only and unrestricted models, with the explanatory power of their two-
component (multivariate) counterparts. At first glance, we see that the average explanatory
power delivered by the anchor-only and unrestricted-based unexpected-surprise models is
quite similar (8.8 and 8.7 percent), suggesting that the unexplained portion of the surprise as
modelled by these two approaches is comparable. When we compare the average R2 delivered by
the two-component models with the one delivered by unexpected surprises only (across anchor-only
and unrestricted models), the R2 for the two-component models is only marginally higher, by
roughly one percentage point. Nevertheless, when we evaluate the percentage gain in R2 delivered
by the two-component models (versus the unexpected-surprise models), there is an indication that
the two-component models increase the explanatory power of the unexpected-surprise models
quite substantially. This gain is roughly 50 percent on average, across the anchor-only and the
unrestricted model. This finding indicates that the predictable portion of surprises adds
substantial explanatory power to response models relative to the explanatory power of
unexpected-surprise-only models. These results add to our finding that expected surprise models
comprise a relevant source of information for estimating market responses around economic
surprises.
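The two gain measures discussed above can be reproduced with simple arithmetic. The sketch below uses a few rounded anchor-only R2 figures from Table 5.9; the function name `r2_gains` is a label chosen for this example, and because the inputs are rounded the results only approximate the tabulated gains.

```python
def r2_gains(r2_two_component, r2_unexpected):
    """Return (absolute gain, percentage gain) of the two-component R2, inputs in percent."""
    gain = r2_two_component - r2_unexpected
    return gain, 100.0 * gain / r2_unexpected


# (two-component R2, unexpected-only R2), anchor-only columns of Table 5.9, in percent.
rows = {
    "US Initial Jobless Claims SA": (8.8, 8.5),
    "US Auto Sales Domestic Vehicle": (4.4, 0.5),
    "Average": (9.7, 8.8),
}

for name, (r2_two, r2_unexp) in rows.items():
    gain, pct = r2_gains(r2_two, r2_unexp)
    print(f"{name}: gain = {gain:.1f} pp, percentage gain = {pct:.0f}%")
```

Indicators with a very low unexpected-only R2, such as auto sales, produce very large percentage gains from small absolute gains. This is why the average of the indicator-level percentage gains (roughly 50 percent) is much larger than the percentage gain computed from the average R2 figures (roughly one percentage point on a base of about nine percent).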
Further, for expected surprises the percentage of significant coefficients is roughly the
same for the popularity-weighted and the un-weighted averages. This finding is in line with
our earlier observation that responses connected to surprises of popular indicators are just as
predictable as responses provoked by unpopular indicators. In contrast, we observe that
the percentage of significant coefficients for unexpected surprises is higher for the
popularity-weighted average and for the most popular indicators than for the un-weighted average,
across anchor-only and unrestricted surprise models. In the same vein, the percentage gain in R2
delivered by the two-component models is lower for popular indicators than for the average
indicator, reflecting that the explanatory power added by expected surprises (on top of the R2
produced by the unexplained portion) is smaller for popular indicators than for the average
indicator. These findings indicate that, although expected surprises are connected to market
responses across indicators of any level of popularity, the unexpected component of surprises is
more prevalent relative to the expected one for popular indicators than for un-popular indicators.
In other words, the unexpected component of surprises for popular indicators dominates their
expected component, which happens less frequently for un-popular indicators.
In this way, our findings diverge from the bold conclusion of Campbell and Sharpe (2009)
that traders “look-through” the bias. Nevertheless, our results differ from theirs partially
because we use a much broader set of economic indicators, which includes un-popular ones. When
we focus on the set of popular indicators used by these authors, our results are more in line
with theirs, in that the unexpected component of surprises is the main explanatory variable of
market responses (despite not being the only one). Hence, the link between expected surprises
and market responses is more prevalent in a set of less popular indicators, despite the fact that
biases are more pervasive in popular indicators.
5.5 Conclusion
This chapter investigates how forecasters' biases, both cognitive and rational, are associated
with future macroeconomic surprises and their respective market responses around announce-
ments in the US. We empirically confirm that the anchor bias previously recognized in the
literature remains pervasive, but we show that higher moments of the distribution of economic
forecasts are also informative for predicting surprises. Particularly, our results suggest that the
skewness of the distribution of economic forecasts provides reliable information to predict eco-
nomic surprises. Whereas anchoring has clear behavioral roots, we assume that the information
contained in the skewness of forecasts reflects a rational bias. This assumption builds on the
literature on strategic behavior by forecasters, who have dual and contradicting objectives, i.e.,
forecast accuracy and publicity. According to this stream of research, forecasters typically stay
close to the “pack” (herding) but eventually, when in the possession of what they perceive to
be private information, they intentionally issue off-consensus forecasts, contributing to a highly
skewed distribution of forecasts. Our results, thus, suggest that professional forecasters often
possess private information and that they do make use of it by issuing controversial (and in-
formative) forecasts. Under these conditions, macroeconomic surprises are, at least, partially
predictable. Predictability is, though, stronger for popular indicators, suggesting that as we
move from widely followed indicators towards less watched ones, biases become less pervasive.
In the same vein, we show that predictability and the strong link between macroeconomic sur-
prises and forecast skewness also holds for other countries/regions, such as Continental Europe,
the United Kingdom and Japan, albeit, to a lesser extent.
A consequence of economic surprises being predictable might be that responses observed in
asset returns around macroeconomic announcements are also predictable. Our findings confirm
this hypothesis in-sample and, to a lesser degree, out-of-sample. We identify that forecasts
made using our unrestricted-extended economic surprise models outperform simpler models in
forecasting market responses across the four asset classes studied. The success of these models
is partially associated with the expected economic surprises modelled and partially linked to past
returns, challenging the Efficient Market Hypothesis (EMH).
Asset returns around announcements of highly-followed macroeconomic indicators are as
predictable as around releases of unpopular indicators, despite economic surprises being more
predictable for popular indicators. Nevertheless, market responses around announcements of
popular indicators are more frequently linked to the unexpected component of surprises than
responses around unpopular indicators are. We also find that machine learning techniques
outperform OLS regression models in point forecasts, which may be linked to their non-linear
nature. Undoubtedly, the regularized machine learning models applied by us are superior to OLS
regression at avoiding overfitting in our data set. Further, we find that returns of assets that are sensitive
to the fundamentals being revealed by macro announcements (local equities and bonds) are
more predictable around such events than foreign markets, currencies and commodities.
Yet, when forecasters fail to correctly forecast the direction of the economic surprises,
another bias seems to play a role in explaining market responses: regret. We identify the presence
of this cognitive bias as we find that negative (positive) market responses are more pervasive
when the skewness of forecasts failed to correctly forecast surprises. Future research is
warranted to strengthen our conclusions on the matter, while extending our findings to other
cases, such as the forecasting of quarterly earnings releases, seems the natural next step to
take. Extending our skewness-based forecast approach by using the skewness of top-quartile
forecasters only should also strengthen our findings.
We conclude with the four key implications of our findings: 1) a better understanding of
the “market consensus” and of the informational content of higher moments of the distribution
of macroeconomic forecasts by regulators, policy makers and market participants; 2) the chal-
lenge of standard weighting schemes used in economic surprise indexes, which we find can be
improved by changing from “popularity” (or “attention”)-weighted to un-weighted, as market
responses around announcements of popular indicators are not more predictable than responses
around releases of unpopular indicators; 3) the proposition that advanced statistical learning
techniques should be used to refine the forecast of market responses amid macroeconomic re-
leases, especially when such methods prevent overfitting and are somewhat transparent; and 4)
the opening of a new stream in the literature to investigate regret effects in asset price responses
around announcements of forecasted figures.
5.A Appendix: Machine learning methods
5.A.1 Principal component analysis
Principal component analysis (PCA) is likely the most popular linear dimensionality reduction
tool. PCA is an unsupervised method, which bears no association with an explained
variable Y but only with the features X1, X2, ..., Xp of a data set or model. PCA aims to
summarize a large set of correlated variables into a smaller number of orthogonal variables that
explain most of the variability in the original set. Hence, the first principal component (PC1) of
a data set is the orthogonal variable produced by the linear combination of the provided features
that explains the most variance in this data set. More formally, to obtain PC1 one must
solve the following optimization problem, which maximizes the explained sample variance:
w_{(1)} = \arg\max_{w} \left\{ \frac{1}{n} \sum_{i=1}^{n} \Big( \sum_{j=1}^{p} w_{j1} x_{ij} \Big)^{2} \right\} \quad \text{subject to } \|w\| = 1 \qquad (5.A.1)
where w collects the weights, or loadings, w_jK of each feature j in the K-th principal component
being calculated. Once PC1 is computed, the second principal component (PC2) can be computed in
the same manner after subtracting PC1 from the original data set. The subsequent higher-order
components (PC-K) are found in the same manner.
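The optimization in Eq. (5.A.1) and the sequential extraction of further components can be sketched as follows. This is a minimal illustration using power iteration on the sample covariance matrix and deflation of the data; the function names, the small data set and the choice of power iteration (rather than a full eigendecomposition or singular value decomposition, as most statistical packages would use) are assumptions made for this example.

```python
import math


def center(data):
    """Subtract the column mean from every feature."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in data]


def variance_along(data, w):
    """Sample variance (1/n) * sum_i (sum_j w_j x_ij)^2 of centered data projected on w."""
    return sum(sum(wj * xj for wj, xj in zip(w, row)) ** 2 for row in data) / len(data)


def first_component(data, iterations=500):
    """Unit-norm loadings maximizing the projected variance, found by power iteration."""
    n, p = len(data), len(data[0])
    cov = [[sum(row[a] * row[b] for row in data) / n for b in range(p)] for a in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        v = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w


def deflate(data, w):
    """Remove from each observation the part explained by the component with loadings w."""
    out = []
    for row in data:
        score = sum(wj * xj for wj, xj in zip(w, row))
        out.append([xj - score * wj for wj, xj in zip(w, row)])
    return out


# Hypothetical data: two strongly correlated features and one weakly related feature.
raw = [[2.0, 1.8, 0.3], [1.0, 1.2, -0.4], [0.0, -0.1, 0.5],
       [-1.0, -0.9, -0.6], [-2.0, -2.0, 0.2]]
x = center(raw)
w1 = first_component(x)                 # loadings of PC1
w2 = first_component(deflate(x, w1))    # loadings of PC2, after subtracting PC1
```

In this example the first component loads mainly on the two correlated features and captures most of the total variance, while the second component is orthogonal to the first by construction of the deflation step.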
5.A.2 Ridge regression
The Ridge regression (Hoerl and Kennard, 1970) is a shrinkage method, similar to the Least
Absolute Shrinkage and Selection Operator (Lasso) of Tibshirani (1996). The main difference
between the Lasso and Ridge regression is that the former translates each coefficient by a
constant factor φ (typically denoted λ), truncating at zero, whereas the latter applies
proportional shrinkage. The main consequence of this difference is that coefficients shrunk by
the Lasso can equal exactly zero, whereas for Ridge regression coefficients only approach zero in
the limit. Therefore, the Ridge regression only applies shrinkage, and not shrinkage and variable
selection simultaneously, which should help forecast accuracy but does not improve model
interpretation as the Lasso does. Thus, regression models shrunk by the Lasso are sparse versions
of the original regression model, whereas Ridge regressions are not.
The regression coefficients obtained by the Ridge regression methodology applied (βLθ ) are
estimated by minimizing the quantity:
\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^{2} + \phi \sum_{j=1}^{p} \beta_j^{2} = RSS + \phi \sum_{j=1}^{p} \beta_j^{2} \qquad (5.A.2)
where φ is the tuning parameter, which is estimated via cross-validation. The cross-validation
applied uses three equal-size splits of our training data set. For a comparison between Ridge
regression and the Lasso, see Hastie et al. (2008).
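A minimal sketch of the minimization in Eq. (5.A.2), together with a three-fold cross-validation of the tuning parameter φ, is given below. The closed-form solution on centered data (which leaves the intercept unpenalized), the contiguous fold construction, the candidate grid and the simulated data are all assumptions made for this illustration and are not meant to replicate the exact implementation behind the reported results.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    out = [0.0] * n
    for r in range(n - 1, -1, -1):
        out[r] = (m[r][n] - sum(m[r][c] * out[c] for c in range(r + 1, n))) / m[r][r]
    return out


def ridge_fit(x, y, phi):
    """Minimize RSS + phi * sum_j beta_j^2; returns (beta0, [beta_1..beta_p])."""
    n, p = len(x), len(x[0])
    x_mean = [sum(row[j] for row in x) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in x]
    yc = [v - y_mean for v in y]
    gram = [[sum(r[a] * r[b] for r in xc) + (phi if a == b else 0.0)
             for b in range(p)] for a in range(p)]
    rhs = [sum(r[a] * v for r, v in zip(xc, yc)) for a in range(p)]
    beta = solve(gram, rhs)
    beta0 = y_mean - sum(bj * mj for bj, mj in zip(beta, x_mean))
    return beta0, beta


def predict(model, x):
    beta0, beta = model
    return [beta0 + sum(bj * v for bj, v in zip(beta, row)) for row in x]


def cv_phi(x, y, grid, folds=3):
    """Pick the phi in grid with the lowest out-of-fold squared error (equal-size folds)."""
    n = len(y)
    best_phi, best_sse = None, None
    for phi in grid:
        sse = 0.0
        for f in range(folds):
            held_out = list(range(f * n // folds, (f + 1) * n // folds))
            train = [i for i in range(n) if i not in set(held_out)]
            model = ridge_fit([x[i] for i in train], [y[i] for i in train], phi)
            preds = predict(model, [x[i] for i in held_out])
            sse += sum((y[i] - pr) ** 2 for i, pr in zip(held_out, preds))
        if best_sse is None or sse < best_sse:
            best_phi, best_sse = phi, sse
    return best_phi


# Simulated training data (assumption): three features, one of them irrelevant.
random.seed(1)
x_train = [[random.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y_train = [0.5 + 1.0 * r[0] - 2.0 * r[1] + random.gauss(0, 1) for r in x_train]
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
phi_star = cv_phi(x_train, y_train, grid)
model = ridge_fit(x_train, y_train, phi_star)
```

Increasing φ shrinks the slope coefficients proportionally towards zero without setting any of them exactly to zero, which is the behaviour that distinguishes Ridge regression from the Lasso in the discussion above.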
5.A.3 Random Forest
Random forest (Breiman, 2001) is a decision tree-based method derived from bootstrap aggre-
gation (i.e., bagging). Bagging entails fitting a regression many times by applying bootstrap
to the training data set and then averaging the predictions of each model. The goal of applying
bagging is to reduce a prediction's variance by averaging. As decision trees typically suffer from
high variance, ensemble methods such as bagging, Random forest and boosting are warranted
for better predictive accuracy. Random forest builds on bagging by growing
a large collection of trees with the nuance that they are constrained to be as de-correlated as
possible. Similarly to bagging, after de-correlated trees are grown as a ‘random forest’, the
predictions coming from them are averaged into a single prediction.
The Random forest (regression) predictor is given by:
f^{B}_{rf}(x) = \frac{1}{B} \sum_{b=1}^{B} T(x; \Theta_b) \qquad (5.A.3)
where B is the number of trees T(·) grown, Θb is the set of characteristics of the b-th tree in
terms of its split variables, cut-points at each node and terminal-node values, and x is
the set of explanatory variables. For details on decision trees, which are the basic building
block of Random forest, see Hastie et al. (2008).
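The averaging in Eq. (5.A.3) can be illustrated with a deliberately simplified forest. In the sketch below every tree is a depth-one regression tree (a stump) fitted on a bootstrap sample and restricted to a random subset of features, so that each tuple (split variable, cut-point, two terminal-node values) plays the role of Θb. The function names, the stump simplification and the toy data are assumptions for illustration; practical Random forest implementations grow much deeper trees and redraw the feature subset at every split.

```python
import random


def fit_stump(x, y, features):
    """Depth-one regression tree: best single split among the candidate features."""
    best = None
    for j in features:
        values = sorted(set(row[j] for row in x))
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2.0
            left = [yi for row, yi in zip(x, y) if row[j] <= cut]
            right = [yi for row, yi in zip(x, y) if row[j] > cut]
            left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - left_mean) ** 2 for v in left)
                   + sum((v - right_mean) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, cut, left_mean, right_mean)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(y) / len(y)
        return (features[0], float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    j, cut, left_mean, right_mean = stump
    return left_mean if row[j] <= cut else right_mean


def fit_forest(x, y, n_trees=50, n_features=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, p = len(x), len(x[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # bootstrap sample
        features = rng.sample(range(p), n_features)       # de-correlates the trees
        forest.append(fit_stump([x[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict_forest(forest, row):
    """Eq. (5.A.3): average the predictions of all B trees."""
    return sum(predict_stump(tree, row) for tree in forest) / len(forest)


# Toy data (assumption): the response jumps when the first feature exceeds 1.0.
x = [[i / 10.0, ((i * 7) % 10) / 10.0] for i in range(20)]
y = [1.0 if row[0] > 1.0 else 0.0 for row in x]
forest = fit_forest(x, y)
low = predict_forest(forest, [0.0, 0.5])
high = predict_forest(forest, [1.9, 0.5])
```

Trees that happen to split on the irrelevant second feature contribute the same value to both predictions, so the difference between `high` and `low` is driven by the trees that split on the informative feature.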
5.A.4 Random forest variable Importance measure
The Importance measure applied in our study32 computes the average node impurity across all
trees grown by the Random forest, reflecting how optimum partitions made by the different
explanatory variables at each node compare with a ‘pure’ node, i.e., a constant fit over the
entire region.
The natural starting point for the calculation of the Importance measure (Ψv) per variable
v = 1...V is a single decision tree T as follows:
\Psi_v^{2}(T) = \sum_{j=1}^{J-1} \iota_j^{2} \, I(w(j) = v) \qquad (5.A.4)
where the sum collects the importance of each variable v across the total number of nodes J
of the tree, and I(·) is an indicator function that equals one when w(j), the variable used to
split node j, is v. At each node j, the splitting variable partitions the region into two
sub-regions, in which observations are either greater than or equal to, or smaller than, a
boundary level. The variable chosen to make this split is the one that maximizes the improvement
(versus the previous nodes in this branch) ι_j^2 in squared errors achieved in this node relative
to the decision of no partition, i.e., a ‘pure’ node.
32 Note that there are alternative Importance measures that can be used in Random forest and in other tree-based algorithms.
As many trees are utilized in Random forest, the Importance measure per tree (\Psi_v^{2}(T))
has to be aggregated into an overall model Importance measure, which is achieved by averaging
Importance across the total number of trees M:
\Psi_v^{2} = \frac{1}{M} \sum_{m=1}^{M} \Psi_v^{2}(T_m). \qquad (5.A.5)
For more details on Importance measures see Hastie et al. (2008).
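The two aggregation steps in Eqs. (5.A.4) and (5.A.5) can be sketched as follows. The representation of a tree as a list of (split variable, squared-error improvement) pairs, the function names and the numbers used are assumptions made for this illustration; in practice these pairs would be read off the nodes of the fitted trees.

```python
import math


def tree_importance(nodes, n_variables):
    """Eq. (5.A.4): sum the squared-error improvements of the nodes split on each variable.

    nodes is a list of (split_variable_index, improvement) pairs, one per splitting node.
    """
    scores = [0.0] * n_variables
    for variable, improvement in nodes:
        scores[variable] += improvement
    return scores


def forest_importance(trees, n_variables):
    """Eq. (5.A.5): average the per-tree squared importance over all M trees."""
    totals = [0.0] * n_variables
    for nodes in trees:
        for v, score in enumerate(tree_importance(nodes, n_variables)):
            totals[v] += score
    return [total / len(trees) for total in totals]


# Two hypothetical trees over three explanatory variables (indices 0, 1, 2).
trees = [
    [(0, 4.0), (1, 1.0), (0, 0.5)],   # tree 1: variable 0 is used at two nodes
    [(1, 2.0), (2, 0.5)],             # tree 2: variable 0 is never used
]
squared_importance = forest_importance(trees, 3)
importance = [math.sqrt(s) for s in squared_importance]
```

For these inputs the squared importances are 2.25, 1.5 and 0.25, so variable 0 ranks first even though it is absent from the second tree, because its splits deliver the largest improvements in the first tree.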
Chapter 6
Conclusion
The main theme of this thesis is the link between behavioral finance, ex-ante informational
sources, particularly probability densities extracted from option markets and macroeconomic
survey data, and investment strategies.
Chapter 2 analyses the effects of the European 2011 short sale ban on financial market
stability and contagion risk through the lenses of risk-neutral densities (RND) and implied
jump risk from single-stock options. We find that the short sale bans imposed by Belgium,
France, Italy, and Spain increased implied jump risk, especially for the banned stocks, even after
controlling for information flow and stock-specific factors. Partially, this is caused by a smaller
supply of puts during the ban as market makers became more risk-sensitive following equity
market declines (see Garleanu et al., 2009), which can be explained by the CPT’s overweighting
of tails. However, we find that contagion risk for banned stocks decreased during the ban relative
to the pre-ban period. While we observe that the short sale ban is effective in restricting both
outright and synthetic shorts on banned stocks, we do find evidence that investors seem to switch
from single-stock puts to index puts because of “flight-to-liquidity” incentives. This migration
likely diverted selling pressure from the financial stocks to a larger share of the stock market,
thereby reducing the destabilizing effects in the financial sector. Thus, if the first and foremost
goal of imposing a ban is reducing systemic risk, then the 2011 bans do seem to fulfill this
purpose. However, we note that this success comes at a cost: the increase in the implied jump
risk due to a supply shift. Despite the fact that this effect in implied jump risk indicates
market failure and may have adversely influenced market participants’ expectations, it helped
to preserve market stability by reducing contagion risk.
In Chapter 3 we estimate the CPT probability weighting function parameter γ for gains
and find that it is qualitatively consistent with the one predicted by Tversky and Kahneman
(1992), endorsing our hypothesis that investors in single stock call options are biased. However,
the overweighting of small probabilities is less pronounced than proposed by the CPT and exhibits
a positive term structure, as it becomes less pronounced as the option maturity increases.
Investors' overweighting of small probabilities is also largely time-varying and sample dependent.
It is pronounced in periods in which sentiment is high, for instance, the IT bubble period, and
it abates when sentiment is low. Our results challenge the view that single stock call options
are structurally overpriced and offer the insight that the overweighting of tail events implied in
these options is conditional on sentiment levels and option maturity rather than on positive stock
fundamentals, loss aversion levels or investor preferences for skewness.
Chapter 4 finds that the overweighting of small probability events is strongly time-varying and
present in both OTM index puts and single stock calls, due to individual and institutional
investors' trading activity, respectively. In order to capture both bullish and bearish sentiment from
trading activity of these two types of market participants, we propose a novel indicator: IV-
sentiment. Contrarian-trading strategies using our IV-sentiment measure produce economically
significant risk-adjusted returns. The joint use of information from the single stock and index
option markets seems to be the reason for the superior forecast ability of our indicator. More-
over, IV-sentiment seems to forecast returns as well as other well-known predictors of equity
returns and is uncorrelated to these predictors, significantly improving the quality of multifactor
predictive models. An IV-sentiment-based strategy also has little exposure to a set of widely used
cross-sectional equity factors, which includes Fama and French’s five factors, the momentum
factor and the low-volatility factor. We find that combining our sentiment strategy with other
strategies, such as buy-and-hold the S&P 500 index, time-series momentum and cross-sectional
equity momentum can largely improve their risk-return trade-offs. Cross-sectional momentum
benefits the most from our IV-sentiment measure as it seems to mitigate momentum crashes.
Chapter 5 investigates how forecasters’ biases, both cognitive and rational, are associated
with future macroeconomic surprises around announcements in the US. We empirically confirm
that the anchoring bias previously recognized in macroeconomic forecasting remains pervasive but
also that the skewness of the distribution of economic forecasts provides reliable information for
the prediction of economic surprises, denoting the presence of a rational bias. A consequence
of economic surprises being predictable is that responses observed in asset returns around
macroeconomic announcements are also predictable. Our findings confirm this hypothesis in-
sample and, to a lesser degree, out-of-sample. Returns of assets that are sensitive to the
fundamentals being revealed by macro announcements (local equities and bonds) are found
to be more predictable around such events than foreign markets, currencies and commodities.
Yet, when forecasters fail to correctly forecast the direction of the economic surprises, a regret
bias also seems to play a role in explaining market responses.
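The two quantities at the centre of Chapter 5, the economic surprise and the skewness of the cross-section of forecasts, can be illustrated with a minimal sketch. The numbers below are invented for illustration, the surprise is taken as the release minus the median forecast (an assumption, as the chapter may standardize it differently), and the helper `sample_skewness` is a hypothetical name for a plain moment-based estimator.

```python
from statistics import mean, median


def sample_skewness(values):
    """Moment-based skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # all forecasts identical: no asymmetry to measure
    return m3 / m2 ** 1.5


# Hypothetical individual forecasts for one release (illustrative numbers only).
forecasts = [180, 185, 190, 190, 195, 200, 240]
actual = 215  # hypothetical released figure

consensus = median(forecasts)        # consensus taken as the median forecast
surprise = actual - consensus        # surprise defined here as actual minus consensus
skew = sample_skewness(forecasts)    # asymmetry of the forecast distribution

print(f"consensus = {consensus}, surprise = {surprise}, skewness = {skew:.2f}")
```

In a predictive setting, the skewness computed before the announcement would enter as a regressor for the surprise observed afterwards.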
Our findings yield further validation of behavioral finance and new angles on cross-asset
investing, equity timing, option pricing and short-sale bans. Hence, this thesis draws
implications not only for academia but also for regulators, investors and other market participants.
For instance, it provides an additional set of tools to regulators, which can be used to monitor
contagion effects (Chapter 2) and the build-up of speculative equity market bubbles (Chapter
3). It suggests that investors’ overweighting of small probabilities and a positive term
structure of tails’ overweighting can be used in the development of behavioral option pricing models
(Chapter 3). It proposes that a contrarian IV-sentiment-based strategy for equity timing can
benefit equity investors and asset allocators (Chapter 4). It challenges the “popularity”-based
weighting schemes used in economic indexes in favor of unweighted schemes (Chapter 5).
In conclusion, this thesis finds strong but intricate relations between behavioral biases,
ex-ante distributions of equity market returns from options, moments of macroeconomic forecasts
and asset prices. It adds novel evidence that behavioral models resemble the decision-making
process of key market participants and can guide the design of investment strategies, with
implications for academics and practitioners.
We acknowledge that our research can still be expanded in several directions. For instance,
it would be interesting to analyze ban-driven increases in implied jump risk using the mutually
exciting jumps model of Ait-Sahalia et al. (2015). This analysis is relevant as it might justify co-
ordinated introduction of bans by regulators. As our IV-sentiment-based strategy is negatively
correlated with and largely improves the momentum factor, it could be further investigated as a
protection against momentum crashes (see Daniel and Moskowitz, 2016). Moreover, we conjec-
ture that using individual forecasters’ information may largely strengthen our results regarding
the presence of bias in macroeconomic forecasting, beyond improving the forecast ability of
models. We also think that the data set on macroeconomic releases used in Chapter 5 should
be further explored. This data set can serve as the backbone of studies that investigate under-
and over-reaction of asset prices to macroeconomic news (in line with Chan, 2003) and the
economic cycle on a high-frequency basis. Given our preliminary finding that regret (see Loomes
and Sugden, 1982; Bell, 1982) seems to drive market responses when forecasters fail to correctly
forecast the direction of economic surprises, the matter requires further investigation. Future
research on regret around economic surprises should not only validate our finding but also test
for its presence elsewhere, such as in the forecasting of quarterly earnings releases. Finally,
given the infancy of research that applies artificial intelligence methods to forecasting financial
data, more studies on the topic are warranted. These methods should be further explored
because they allow for great dimensionality reduction and overcome some of the limitations of
standard regressions, such as multicollinearity and linearity.
Bibliography
Acharya, V., Pedersen, L., 2005. Asset pricing with liquidity risk. Journal of Financial Eco-
nomics 77 (2), 375–410.
Aggarwal, R., Mohanty, S., Song, F., 1995. Are survey forecasts of macroeconomic variables
rational? Journal of Business 68, 99–119.
Ait-Sahalia, Y., Cacho-Diaz, J., Laeven, R., 2015. Modeling financial contagion using mutually
exciting jump processes. Journal of Financial Economics 117 (3), 585–606.
Ait-Sahalia, Y., Lo, A., 2000. Nonparametric risk management and implied risk aversion. Journal
of Econometrics 94 (1-2), 9–51.
Amin, K., Coval, J., Seyhun, H., 2004. Index option prices and stock market momentum.
Journal of Business 77 (4), 835–874.
Anagnou, I., Bedendo, M., Hodges, S., Tompkins, R., 2002. The relationship between implied
and realised probability density function. Working paper, University of Warwick and the
University of Technology, Vienna.
Ang, A., Chen, J., Xing, Y., 2006. Downside risk. Review of Financial Studies
19 (4), 1191–1239.
Ang, A., Liu, J., 2007. Risk, return and dividends. Journal of Financial Economics 85, 1–38.
Asness, C., Moskowitz, T., Pedersen, L., 2013. Value and momentum everywhere. Journal of
Finance 68 (3), 929–985.
Baker, M., Wurgler, J., 2007. Investor sentiment in the stock market. Journal of Economic
Perspectives 21 (Spring), 129–157.
Bakshi, G., Kapadia, N., Madan, D., 2003. Stock returns characteristics, skew laws, and the
differential pricing of individual equity options. Review of Financial Studies 16 (1), 101–143.
Balla, E., Ergen, I., Migueis, M., 2014. Tail dependence and indicators of systemic risk for large
US depositories. Journal of Financial Stability 15, 195–209.
Barber, B., Odean, T., 2008. All that glitters: The effect of attention and news on the buying
behavior of individual and institutional investors. Review of Financial Studies 21, 785–818.
Barberis, N., 2013. The psychology of tail events: Progress and challenges. American Economic
Review 103 (3), 611–616.
Barberis, N., Huang, M., 2001. Mental accounting, loss aversion and individual stock returns.
Journal of Finance 56 (4), 1247–1292.
Barberis, N., Huang, M., 2008. Stocks as lotteries: The implication of probability weighting for
security prices. American Economic Review 98 (5), 2066–2100.
Barberis, N., Huang, M., Santos, T., 2001. Prospect theory and asset prices. Quarterly Journal
of Economics 116 (1), 1–53.
Barberis, N., Shleifer, A., Vishny, R., 1998. A model of investor sentiment. Journal of Financial
Economics 49 (3), 307–343.
Bates, D., 1991. The crash of 87: Was it expected? the evidence from options markets. Journal
of Finance 46 (3), 1009–1044.
Bates, D., 2000. Post-87 crash fears in the S&P 500 futures option market. Journal of
Econometrics 94 (1-2), 181–238.
Bates, D., 2003. Empirical option pricing: A retrospection. Journal of Econometrics 116, 387–
404.
Battalio, R., Schultz, P., 2011. Regulatory uncertainty and market liquidity: the 2008 short
sale ban’s impact on equity option markets. Journal of Finance 66 (6), 2013–2052.
Bauer, R., Cosemans, M., Eichholtz, P., 2009. Option trading and individual investor perfor-
mance. Journal of Banking & Finance 33 (4), 731–746.
Beber, A., Brandt, M., Luisi, M., 2015. Distilling the macroeconomic news flow. Journal of
Financial Economics 117 (3), 489–507.
Beber, A., Pagano, M., 2013. Short selling bans around the world: evidence from the 2007-2009
crisis. Journal of Finance 68 (1), 343–381.
Bell, D., 1982. Regret in decision making under uncertainty. Operations research 30 (5), 961–
981.
Benartzi, S., Thaler, R., 1995a. Myopic loss aversion and the equity premium puzzle. The
Quarterly Journal of Economics 110 (1), 73–92.
Benartzi, S., Thaler, R., 1995b. Myopic loss aversion and the equity premium puzzle. Quarterly
Journal of Economics 110 (1), 73–92.
Birru, J., Figlewski, S., 2011. Anatomy of a meltdown: the risk neutral density for the S&P 500
in the fall of 2008. Journal of Financial Markets 15 (2), 151–180.
Black, F., 1976. Studies of stock price volatility changes. Proceedings of the 1976 Meetings of
the American Statistical Association, 171–181.
Blei, D., Ng, A., Jordan, M., 2003. Latent dirichlet allocation. Journal of Machine Learning
Research 3 (Jan), 993–1022.
Bliss, R., Panigirtzoglou, N., 2004. Option-implied risk aversion estimates. Journal of Finance
59 (1), 407–446.
Bloomberg, 2008. Introduction into the new bloomberg implied volatility calculations. (March).
Blume, M., Keim, D., 2012. Institutional investors and stock market liquidity: Trends and
relationships. Working paper: The Wharton School.
Boehmer, E., Jones, C., Zhang, X., 2013. Shackling short sellers: the 2008 shorting ban. Review
of Financial Studies 26 (6), 1363–1400.
Bollen, N., Whaley, R., 2004. Does net buying pressure affect the shape of implied volatility
function? Journal of Finance 59 (2), 711–754.
Bollerslev, T., Tauchen, G., Zhou, H., 2009. Expected stock returns and variance risk premia.
Review of Financial Studies 22 (11), 4463–4492.
Boyer, B., Vorkink, K., 2014. Stock options as lotteries. Journal of Finance 69 (4), 1485–1527.
Breeden, D., Litzenberger, R., 1978. Prices of state-contingent claims implicit in option prices.
Journal of Business 51 (4), 621–651.
Breiman, L., 2001. Random forests. Machine Learning 45 (1), 5–32.
Brinson, G., Hood, L., Beebower, G., 1986. Determinants of portfolio performance. Financial
Analysts Journal 42 (4), 39–48.
Brunnermeier, M., Pedersen, L., 2009. Market liquidity and funding liquidity. Review of Finan-
cial Studies 22 (6), 2201–2238.
Campbell, J., Cochrane, J., 1999. By force of habit: A consumption-based explanation of
aggregate stock market behavior. Journal of Political Economy 107 (2), 205–251.
Campbell, J., Thompson, S., 2008. Predicting the equity premium out of sample: Can anything
beat the historical average? Review of Financial Studies 21 (4), 1509–1531.
Campbell, S., Sharpe, S., 2009. Anchoring bias in consensus forecasts and its effect on market
prices. Journal of Financial and Quantitative Analysis 44 (2), 369–390.
Capistran, C., Timmermann, A., 2009. Disagreement and biases in inflation expectations. Jour-
nal of Money, Credit and Banking 41 (2), 365–396.
Carhart, M., 1997. On persistence in mutual fund performance. The Journal of Finance 52 (1),
57–82.
Cen, L., Hilary, G., Wei, K., 2013. The role of anchoring bias in equity market: evidence
from analysts’ earnings forecasts and stock returns. Journal of Financial and Quantitative
Analysis 48 (1), 47–76.
Chabi-Yo, F., Song, Z., 2013. Recovering the probability weights of tail events with volatility
risk from option prices. Working paper Ohio State University and Federal Reserve System,
1–55.
Chan, L., Chen, H.-L., Lakonishok, J., 2002. On mutual fund investment styles. Review of
Financial Studies 15 (5), 1407–1437.
Chan, W., 2003. Stock price reaction to news and no-news: drift and reversal after headlines.
Journal of Financial Economics 70 (2), 223–260.
Chang, I., Christoffersen, P., Jacobs, K., 2013. Market skewness risk and the cross section of
stock returns. Journal of Financial Economics 107 (1), 46–68.
Chen, Y., Kumar, A., Zhang, C., 2015. Searching for gambles: investor attention, gambling
sentiment, and stock market outcomes. SSRN working paper 2635572.
Chira, I., Madura, J., Viale, K., 2013. Bank exposure to market fear. Journal of Financial
Stability 9, 451–459.
Choy, S., 2015. Retail clientele and option returns. Journal of Banking & Finance 51 (5),
141–159.
Colacito, R., Ghysels, E., Meng, J., Siwasarit, W., 2016. Skewness in expected macro funda-
mentals and the predictability of equity returns: Evidence and theory. Review of Financial
Studies 29 (8), 2069–2109.
Conrad, J., Dittmar, R., Ghysels, E., 2013. Ex-ante skewness and expected stock returns.
Journal of Finance 68 (1), 85–124.
Cornell, B., 2009. The pricing of volatility and skewness: A new interpretation. Journal of
Investing 18, 27–30.
Corrado, C., Su, T., 1997. Implied volatility skews and stock index skewness and kurtosis
implied by S&P 500 index option prices. Journal of Derivatives 4 (4), 8–19.
Cremers, M., Weinbaum, D., 2010. Deviations from put-call parity and stock return predictabil-
ity. Journal of Financial and Quantitative Analysis 45, 335–367.
Daniel, K., Hirshleifer, D., Subrahmanyam, A., 1998. Investor psychology and security market
under- and overreactions. Journal of Finance 53 (6), 1839–1885.
Daniel, K., Moskowitz, T., 2016. Momentum crashes. Journal of Financial Economics 122 (2),
221–247.
Danielsson, J., Jorgensen, B., Sarma, M., de Vries, C., 2006. Comparing downside risk measures
for heavy tailed distributions. Economics Letters 92 (2), 202–208.
DataExplorers, L., 2011. Securities lending review Q3 2011: Back to its roots. Third ed. Data
Explorers Limited, London.
De Bondt, W., Thaler, R., 1990. Do security analysts overreact? American Economic Review
80 (2), 52–57.
De Haan, L., Jansen, D., Koedijk, K., de Vries, C., 1994. Safety first portfolio selection, extreme
value theory and long run asset risks. In Proceedings from a Conference on Extreme Value
Theory and Applications, Galambos J (ed.), Kluwer Academic: Boston, MA, 471–487.
De Long, J., Shleifer, A., Summers, L., Waldmann, R., 1990. Noise trader risk in financial
markets. Journal of Political Economy 98 (4), 703–738.
Dennis, P., Mayhew, S., 2002. Risk-neutral skewness: evidence from stock options. Journal of
Financial and Quantitative Analysis 37 (3), 471–493.
Devroye, L., 1986. Non-uniform random variate generation. Springer-Verlag, New York.
Dierkes, M., 2009. Option-implied risk attitude under rank-dependent utility. Unpublished work-
ing paper. University of Munster, Munster, Germany.
Doran, J., Peterson, D., Tarrant, B., 2007. Is there information in the volatility skew? Journal
of Futures Markets 27 (10), 921–959.
Driessen, J., Maenhout, P., Vilkov, G., 2009. The price of correlation risk: Evidence from equity
options. Journal of Finance 64 (3), 1377–1406.
Driessen, J., Maenhout, P., Vilkov, G., 2013. Option-implied correlations and the price of
correlation risk. SSRN working paper no. 2166829, 1–46.
Duan, J.-C., Wei, J., 2009. Systematic risk and the price structure of individual equity options.
Review of Financial Studies 22 (5), 1981–2006.
Easterwood, J., Nutt, S., 1999. Inefficiency in analysts’ earnings forecasts: Systematic misre-
action or systematic optimism? Journal of Finance 54 (5), 1777–1797.
Engle, R., Mistry, A., 2008. Priced risk and asymmetric volatility in the cross-section of skew-
ness. SSRN working paper 1354529.
Fama, E., French, K., 1992. The cross-section of expected stock returns. Journal of Finance
47 (2), 427–465.
Fama, E., French, K., 2015. A five-factor asset pricing model. Journal of Financial Economics
116 (1), 1–22.
Fama, E., French, K., 2016. Dissecting anomalies with a five-factor model. Review of Financial
Studies 29 (1), 69–103.
Felix, L., Kraussl, R., Stork, P., 2016a. The 2011 european short sale ban: A cure or a curse?
Journal of Financial Stability 25, 115–131.
Felix, L., Kraussl, R., Stork, P., 2016b. Single stock call options as lottery tickets: overpricing
and investor sentiment. Forthcoming in Journal of Behavioral Finance, 1–38.
Felix, L., Kraussl, R., Stork, P., 2017a. Implied volatility sentiment: a tale of two tails. Tin-
bergen Institute Discussion Paper 17-002/IV - SSRN working paper 2758641, 1–54.
Felix, L., Kraussl, R., Stork, P., 2017b. Predictable biases in macroeconomic forecasts and their
impact across asset classes. SSRN working paper 3008976, 1–40.
Figlewski, S., 2010. Estimating the implied risk neutral density for the US market portfolio.
In Volatility and Time Series Econometrics: Essays in Honor of Robert F. Engle - Oxford
University Press.
Fox, C., Rogers, B., Tversky, A., 1996. Options traders exhibit subadditive decision weights.
Journal of Risk and Uncertainty 13, 5–17.
Frazzini, A., Pedersen, L., 2014. Betting against beta. Journal of Financial Economics 111 (1),
1–25.
Frijns, B., Huynh, T., Tourani-Rad, A., Westerholm, P., 2015. Institutional trading and asset
pricing. FIRN Research Paper No. 2531823, 1–55.
Garleanu, N., Pedersen, L. H., Poteshman, A. M., 2009. Demand-based option pricing. Review
of Financial Studies 22 (10), 4259–4299.
Grammatikos, T., Vermeulen, R., 2012. Transmission of the financial and sovereign debt crises
to the emu: stock prices, cds spreads and exchange rates. Journal of International Money
and Finance 31 (3), 469–480.
Green, T., Hwang, B.-H., 2011. Initial public offering as lotteries: skewness preferences and
first-day returns. Management Science 86 (2), 432–444.
Grundy, B., Lim, B., Verwijmeren, P., 2012. Do option markets undo restrictions on short sales?
evidence from the 2008 short-sale ban. Journal of Financial Economics 106 (2), 331–348.
Han, B., 2008. Investor sentiment and option prices. Review of Financial Studies 21 (1), 387–
414.
Hartmann, P., Straetmans, S., de Vries, C., 2004. Asset market linkages in crisis periods. The
Review of Economics and Statistics 86 (1), 313–326.
Harvey, C., Siddique, A., 2000. Conditional skewness in asset pricing tests. Journal of Finance
55 (3), 1263–1295.
Hastie, T., Tibshirani, R., Friedman, J., 2008. The elements of statistical learning: data mining,
inference, and prediction (2nd ed.), Springer–Verlag, New York.
Haykin, S., 1999. Neural networks: A comprehensive foundation (2nd ed.), Pearson Prentice
Hall.
Hill, B., 1975. A simple general approach to inference about the tail of a distribution. Annals
of Statistics 3 (5), 1163–1173.
Hoerl, A., Kennard, R., 1970. Ridge regression: biased estimation for nonorthogonal problems.
Technometrics 12 (1), 55–67.
Hong, H., Stein, J., 1999. A unified theory of underreaction, momentum trading and
overreaction in asset markets. Journal of Finance 54 (6), 2143–2184.
Hsu, M., Krajbich, I., Zhao, C., Camerer, C., 2009. Neural response to reward anticipation
under risk is nonlinear in probabilities. Journal of Neuroscience 29 (7), 2231–2237.
Hull, J., Nelken, I., White, A., 2005. Merton’s model, credit risk, and volatility skews. Journal
of Credit Risk 1 (1), 3–28.
Ibbotson, R., Kaplan, P., 2000. Does asset allocation policy explain 40, 90, or 100 percent of
performance? Financial Analysts Journal 56 (1), 26–33.
Ilmanen, A., 2012. Do financial markets reward buying or selling insurance and lottery tickets?
Financial Analysts Journal 68 (5), 26–36.
Jackwerth, J., 2000. Recovering risk aversion from option prices and realized returns. Review
of Financial Studies 13 (2), 433–451.
Jackwerth, J., Rubinstein, M., 1996. Recovering probability distributions from options prices.
Journal of Finance 51 (5), 1611–1631.
Jackwerth, J., Vilkov, G., 2015. Asymmetric volatility risk: Evidence from option markets.
SSRN working paper 2325380.
Jarrow, R., Rudd, A., 1982. Approximate option valuation for arbitrary stochastic processes.
Journal of Financial Economics 10 (5), 347–369.
Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: implications for
stock market efficiency. Journal of Finance 48, 65–91.
Jiao, Y., 2016. Lottery preference and earnings announcement premia. SSRN Working Paper
2522798.
Kahneman, D., Tversky, A., 1979. Prospect theory: An analysis of decision under risk.
Econometrica 47 (2), 263–291.
Kliger, D., Levy, O., 2009. Theories of choice under risk: Insights from financial markets.
Journal of Economic Behavior & Organization 71 (2), 330–346.
Krishnam, C., P. R., Ritchken, P., 2008. Correlation risk. SSRN Working Paper 1027479, 1–31.
Kumar, A., 2009. Who gambles in the stock market? Journal of Finance 64 (4), 1889–1933.
Kupiec, P., 1995. Techniques for verifying the accuracy of risk management models. Journal of
Derivatives 3, 73–84.
Lahiri, K., Sheng, X., 2010. Measuring forecast uncertainty by disagreement: The missing link.
Journal of Applied Econometrics 25 (4), 514–538.
Lakonishok, J., Lee, I., Pearson, N., Poteshman, A., 2007. Option market activity. Review of
Financial Studies 20 (3), 813–857.
Laster, D., Bennett, P., Geoum, I., 1999. Rational bias in macroeconomic forecasts. Quarterly
Journal of Economics 114 (1), 293–318.
Legerstee, R., Franses, P., 2015. Does disagreement amongst forecasters have predictive value?
Journal of Forecasting 34 (4), 290–302.
Lehmann, B., 1990. Fads, martingales, and market efficiency. Quarterly Journal of Economics
105, 1–28.
Lemmon, M., Ni, S., 2011. The effects of investor sentiment on speculative trading and prices
of stock and index options. SSRN working paper no. 1572427.
Longstaff, F., 1995. Option pricing and the martingale restriction. Review of financial studies
8 (4), 1091–1124.
Loomes, G., Sugden, R., 1982. Regret theory: an alternative theory of rational choice under
uncertainty. Economic Journal 92 (368), 805–824.
Mahani, R., Poteshman, A., 2008. Overreaction to stock market news and misevaluation of stock
prices by unsophisticated investors: evidence from the options market. Journal of Empirical
Finance 15 (4), 635–655.
Mankiw, N., Reis, R., Wolfers, J., 2003. Disagreement about inflation expectations. NBER
Macroeconomics Annual 2003 18, 209–248.
Massy, W., 1965. Principal components regression in exploratory statistical research. Journal
of the American Statistical Association 60 (309), 234–256.
Melick, W., Thomas, C., 1997. Recovering an asset’s implied pdf from option prices: an
application to crude oil during the gulf crisis. Journal of Financial and Quantitative Analysis
32 (1), 91–115.
Mendenhall, R., 1991. Evidence of possible underweighting of earnings-related information.
Journal of Accounting Research 29, 140–170.
Merton, R., 1974. On the pricing of corporate debt: the risk structure of interest rates. Journal
of Finance 29 (2), 449–470.
Michaely, R., Womack, K., 1999. Conflict of interest and the credibility of underwriter analyst
recommendations. Review of Financial Studies 12 (4), 653–686.
Mitton, T., Vorkink, K., 2007. Equilibrium underdiversification and the preference for skewness.
Review of Financial Studies 20 (4), 1255–1288.
Moskowitz, T., Ooi, Y. H., Pedersen, L. H., 2012. Time series momentum. Journal of Financial
Economics 104 (2), 228–250.
Nelson, D., 1991. Conditional heteroskedasticity in asset returns: A new approach. Economet-
rica 59 (2), 347–370.
Ottaviani, M., Sorensen, P., 2006. The strategy of professional forecasting. Journal of Financial
Economics 81 (2), 441–466.
Pastor, L., Stambaugh, R., 2003. Liquidity risk and expected stock returns. Journal of Political
Economy 111 (3), 642–685.
Polkovnichenko, V., Zhao, F., 2013. Probability weighting functions implied in option prices.
Journal of Financial Economics 107 (3), 580–609.
Pollet, J., Wilson, M., 2008. Average correlation and stock market returns. Journal of Financial
Economics 96 (3), 364–380.
Poon, S.-H., Granger, C., 2003. Forecasting volatility in financial markets: a review. Journal
of Economic Literature 41 (2), 478–539.
Prelec, D., 1998. The probability weighting function. Econometrica 66 (3), 497–527.
Rapach, D., Strauss, J., Zhou, G., 2010. Out-of-sample equity premium prediction: Combina-
tion forecasts and links to the real economy. Review of Financial Studies 23 (2), 821–862.
Rosenberg, J., Engle, R., 2002. Empirical pricing kernels. Journal of Financial Economics
64 (3), 341–372.
Rubinstein, M., 1994. Implied binomial trees. Journal of Finance 49 (3), 771–818.
Scharfstein, D., Stein, J., 1990. Herd behavior and investment. American Economic Review
80 (3), 465–479.
Schirm, D., 2003. A comparative analysis of the rationality of consensus forecasts of u.s. eco-
nomic indicators. Journal of Business 76, 547–561.
Sievert, C., Shirley, K., 2014. Ldavis: a method for visualizing and interpreting topics. Proceed-
ings of the Workshop on Interactive Language Learning, Visualization, and Interfaces June,
63–70.
Sobaci, C., Sensoy, A., Erturk, M., 2014. Impact of short selling activity on market dynamics:
evidence from an emerging market. Journal of Financial Stability 15, 53–62.
Stickel, S., 1991. Common stock returns surrounding earnings forecast revisions: more puzzling
evidence. The Accounting Review 66, 402–416.
Straetmans, S., Verschoor, W., Wolff, C., 2008. Extreme US stock market fluctuations in the
wake of 9/11. Journal of Applied Econometrics 23 (1), 17–42.
Tibshirani, R., 1996. Regression shrinkage and selection via the lasso. Journal of the Royal
Statistical Society 58 (1), 267–288.
Tim, T., 2001. Rationality and analysts’ forecast bias. Journal of Finance 61 (1), 369–385.
Truong, C., Shane, P., Zhao, Q., 2016. Information in the tails of the distribution of analysts’
quarterly earnings forecasts. Financial Analysts Journal 73 (5), 84–99.
Tversky, A., Kahneman, D., 1974. Judgement under uncertainty: heuristics and biases. Science
185, 1124–1131.
Tversky, A., Kahneman, D., 1992. Advances in prospect theory: Cumulative representation of
uncertainty. Journal of Risk and Uncertainty 5 (4), 297–323.
Vilkov, G., Xiao, Y., 2013. Option-implied information and predictability of extreme returns.
SAFE (Goethe University Frankfurt) Working Paper Series 5, 1–36.
Von Neumann, J., Morgenstern, O., 1947. Theory of games and economic behavior, 2nd edition.
Princeton University Press, Princeton.
Ward, E., 1982. Conservatism in human information processing. In Daniel Kahneman, Paul
Slovic and Amos Tversky. (1982). Judgment under uncertainty: Heuristics and biases, Cam-
bridge University Press, New York.
Welch, I., Goyal, A., 2008. A comprehensive look at the empirical performance of equity pre-
mium prediction. Review of Financial Studies 21 (4), 1455–1508.
Wu, G., Gonzalez, R., 1996. Curvature of the probability weighting function. Management
Science 42 (12), 1676–1690.
Yan, S., 2011. Jump risk, stock returns, and slope of implied volatility smile. Journal of Finan-
cial Economics 99 (1), 216–233.
Zarnowitz, V., Lambros, L., 1987. Consensus and uncertainty in economic prediction. Journal
of Political Economy 95 (3), 591–621.
Zhang, X., 2006. Information uncertainty and analyst forecast behavior. Contemporary Ac-
counting Research 23 (2), 565–590.
Summary
This PhD thesis is about behavioral finance, the sub-field of behavioral economics that studies
the impact of psychological and cognitive biases on financial decision making. The main
hypothesis of behavioral finance is that people systematically make irrational decisions when outcomes
are unknown. Behavioral finance is a breakthrough because it managed to challenge
classical economic and financial theories, which are both built on the assumption that individuals
are fundamentally rational, as implied by expected utility theory. The proponents of
behavioral finance used lab experiments to prove that individuals making decisions under uncertainty
violate the axioms of expected utility theory. As such, behavioral finance models were
designed in a stylized form, disconnected from financial markets. This thesis adds to the
growing literature that attempts to validate the hypotheses made by behavioral finance in real
financial markets. Specifically, most of my research investigates market inefficiencies which we
hypothesize to be explained by the Cumulative Prospect Theory (CPT) probability weighting
function. Using ex-ante information from option prices, we find it to play a role in explaining
some inefficient behaviors of market makers, retail investors and institutional investors, which
produces interesting investment insights. Additionally, my research recognizes the
influence of other cognitive biases, such as anchoring, conservatism, overconfidence, herding, regret
and rational bias, in the behavior of professional forecasters of macroeconomic data.
Samenvatting
This dissertation is about ‘behavioral finance’, the subfield of behavioral economics that
studies the influence of human psychology on financial decisions. Behavioral finance holds
that people systematically make irrational choices when they have to decide in an uncertain
situation. This insight brought about a breakthrough in economic science, which had until
then always relied on expected utility theory, which holds that people are essentially
rationally acting individuals. Using laboratory experiments, behavioral finance researchers
showed that the behavior of people making decisions under uncertainty does not conform to the
axioms of expected utility theory. Subsequently, stylized behavioral finance models were
formulated in isolation from financial markets, and a flourishing research literature now
attempts to validate them with financial market data. This dissertation is a contribution to
that literature. In particular, I examine to what extent market inefficiencies are explained
by the so-called probability weighting function of Cumulative Prospect Theory (CPT). Using
ex-ante information from option prices, my research shows that these probability weightings
play a role in explaining the inefficient behavior of market makers, retail investors and
institutional investors. In addition, I identify in the dissertation the influence of other
behavioral effects, including anchoring, conservatism, overconfidence, herding, regret and
rational bias, by making use of another ex-ante source of information, namely surveys among
forecasters of macroeconomic statistics.
Short biography
Luiz Fernando Fortes Felix was born in Belo Horizonte (Brazil) on July 26, 1978. During
his early years he lived in many places, among them a village in the Amazon
forest (Serra dos Carajas) and the United States, where he graduated from high school. In
2001, he graduated in Public Administration at Fundacao Joao Pinheiro and in 2002 in Law
at Universidade Federal de Minas Gerais, both in Belo Horizonte. Subsequently, he obtained
a Diploma in Finance from IBMEC (Brazil) and an MSc degree in Finance and Investments
from Durham University (UK), both courses being fully funded by scholarships. During his
professional career, Luiz has acquired the CFA and CQF charters.
He has worked in financial markets since 2001. He spent the first years of his career
managing fixed income portfolios at a Brazilian pension fund. In 2005, he joined ABN AMRO
Asset Management in Amsterdam (the Netherlands) as a quantitative investment strategist.
Since 2008 he has worked at the Asset Allocation & Overlay (AA&O) department of APG Asset
Management in Amsterdam. APG is one of the largest investors in the world fully dedicated to
managing pension funds’ assets and liabilities. In AA&O Luiz has managed several
derivative-based strategies, ranging from hedging and protection programs to systematic active
strategies. He has also managed APG’s Absolute Return Strategies (ARS) pool, which invests in a
set of renowned hedge funds. He has been closely involved in the introduction of active management
within AA&O, being the main designer of its tactical asset allocation (TAA) and the active
FX mandate. His responsibilities include the research, design and management of investment
strategies in the global equities, fixed income, commodities and foreign exchange markets.
Luiz is the coordinator of AA&O’s portfolio research and the co-founder of the APG Quant
Roundtable. Lately, he has been closely engaged with the APG Innovation initiative, which led
him to design natural language processing (NLP) and deep learning-based investment strategies.
Luiz wrote his PhD thesis while working full-time at APG Asset Management from 2012
to 2018, mainly during evenings and weekends. He is married to Clarissa Calil Bonifacio and
together they have two children, Thomas and Bernardo, who were 4 and 1 years old, respectively,
at the time of writing.
Publications
Modified versions of Chapters 2 and 3 of this thesis are published as:
1. Felix, L., Kraussl, R., Stork, P., 2016a. The 2011 European short sale ban: A cure or a
curse? Journal of Financial Stability 25, 115-131.
2. Felix, L., Kraussl, R., Stork, P., 2016b. Single stock call options as lottery tickets:
overpricing and investor sentiment. Forthcoming in Journal of Behavioral Finance, 1-38.
Extended versions of Chapters 4 and 5 of this thesis are available in the form of the following
research papers:
1. Felix, L., Kraussl, R., Stork, P., 2017a. Implied volatility sentiment: a tale of two tails.
Tinbergen Institute Discussion Paper 17-002/IV - SSRN working paper 2758641, 1-54.
Available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2899680.
2. Felix, L., Kraussl, R., Stork, P., 2017b. Predictable biases in macroeconomic forecasts
and their impact across asset classes. SSRN working paper 3008976, 1-40. Available at
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3008976.
Conference presentations
1. 2013 3rd International Conference of the Financial Engineering and Banking Society
(FEBS) in Paris.
2. 2014 IX Seminar on Risk, Financial Stability and Banking Conference of the Banco
Central do Brasil in Sao Paulo.
3. 2014 International Risk Management Conference in Warsaw.
4. 2014 Financial Management Association (FMA) European Conference in Maastricht.
5. 2016 IFABS Conference in Barcelona.
6. 2016 Research in Behavioral Finance Conference (RBFC) in Amsterdam.
7. Board of Governors of the Federal Reserve System research seminar in Washington D.C.
in 2016.
8. 2017 Infiniti Conference in Valencia.
9. 2017 EEA-ESEM Conference in Lisbon.
10. MAN-AHL Research Seminar in London in 2017.
11. 2017 Econometrics and Financial Data Science workshop at the Henley Business School
in Reading.
12. 2018 Conference of the Swiss Society for Financial Market Research (SGF) in Zurich.
13. 2018 annual meeting of the European Financial Management Association (EFMA) in
Milan.
14. 2018 EEA-ESEM Conference in Cologne.
15. 2018 European Finance Association (EFA) in Warsaw.
16. 2018 Research in Behavioral Finance Conference (RBFC) in Amsterdam.