
INVITATION

To attend the public defense of the PhD thesis entitled

Essays in Behavioral Finance

Biases in Investment Decisions and Their Impact Across Asset Classes

by Luiz Fernando Fortes Félix

Monday October 1st, 2018

At 13.45 hours

Vrije Universiteit Amsterdam, Amsterdam

The defense will be followed by a reception

Paranymphs

Klaas Reedijk

Rob van den Goorbergh

Essays in Behavioral Finance

Biases in Investment Decisions and Their Impact Across Asset Classes

Luiz Fernando Fortes Felix

reading committee:

prof.dr. R. Calcagno

prof.dr. M. van Dijk

prof.dr. U. von Lilienfeld-Toal

prof.dr. P. Verwijmeren

prof.dr. R. Zwinkels

ISBN 978-94-6332-385-7

© Luiz Fernando Fortes Felix


Cover design: Loes Kema

Printed by GVO drukkers & vormgevers B.V., Ede

School of Business and Economics

Vrije Universiteit Amsterdam

De Boelelaan 1105

1081HV Amsterdam

The Netherlands

VRIJE UNIVERSITEIT

ESSAYS IN BEHAVIORAL FINANCE

BIASES IN INVESTMENT DECISIONS AND THEIR IMPACT ACROSS ASSET CLASSES

ACADEMIC DISSERTATION

to obtain the degree of Doctor of Philosophy at the Vrije Universiteit Amsterdam, by authority of the rector magnificus prof.dr. V. Subramaniam, to be defended in public before the doctoral committee of the School of Business and Economics on Monday 1 October 2018 at 13.45 hours in the auditorium of the university, De Boelelaan 1105

by

Luiz Fernando Fortes Felix

born in Belo Horizonte, Brazil

supervisor: prof.dr. P.A. Stork

co-supervisor: prof.dr. R. Kraussl

This thesis is dedicated to

Clarissa,

for daily supporting me to achieve and for joining my dreams,

and to my parents,

Sandra and Jose Luiz (†),

for empowering me to dream high.

Foreword and Acknowledgements

Having worked as a professional investor for a number of years, I can say that acknowledging the influence of behavioral biases in investment decision making is crucial for a sound investment process and a career in financial markets. In this thesis, I focus on the overweighting of tail events, which is the most emblematic feature of the Cumulative Prospect Theory. Observing and personally experiencing how the overweighting of tail events can lead to sub-optimal decisions was the main reason why I decided to formally investigate it through a PhD.

This thesis is a joint product of me, my supervisor Philip Stork, and my co-supervisor Roman Kraussl. I have no words to thank you both for the amount of hours and thoughts dedicated to our project. I am very grateful for your critical approach, each in your own unique style: the meticulous challenger and the devil’s advocate. Dear Philip, I am pleased that after this journey together, I also consider you a mentor and a good friend. I am delighted to have met your beautiful family on different occasions, from casual visits to your house to a coincidental family holiday on the Côte d’Azur. I will miss our “business” lunches at Symphony and hope we can replace this ritual with a new one. Dear Roman, despite the distance and less frequent meetings, our collaboration was much appreciated and joyful, often marked by wise comments in prompt email replies past midnight, invariably ending with “best wishes”.

Many were the seminars and conferences where I presented my work: from APG

Quant Roundtable and VU Brown Bag seminars to renowned conferences, such as

EEA-ESEM and EFA. I thank seminar participants for their useful comments, as they truly helped improve this thesis and widen my perspective on my own subject. More specifically, I would like to thank the following colleagues and debaters, whose help and comments undoubtedly influenced this thesis: Andre Lucas, Albert Menkveld, Arjen Siegmann, Ton Vorst and Remco Zwinkels at VU University; Thijs Aaten, Louis Chaillet, Gillis Danielsen, Pieter van Foreest, Rob van den

Goorbergh, Jaroslav Krystul, Pim Lausberg, Rajiv Mallick, Koen Marree, Jan Mark

van Mill, Sunil Patil, Martin Prins, Klaas Reedijk, Ashutosh Shahi, Frank Smudde,

Olaf van Veen, Kevin Wees, Hans van Westrienen, Peter Wijn, Ruben Winnink and

Tim Zwinkels at APG Asset Management; Steven Desmyter, Yoav Git, Otto van

Hemert, Sandy Rattray, Graham Robertson, Matthew Sargaison, Markus Schanta,

Lionel Viaccoz and Tim Wong at MAN-AHL; Andy Moniz and Caio Natividade at

Deutsche Bank; Christian Ruprecht and Iskandar Vanblarcum at Barclays; Yang-Ho

Park and Emilio Osambela at the Federal Reserve Bank; Rui Almeida at Maastricht

University; Alessandro Beber at BlackRock; Roy Hoevenaars at Capstone Advisors

and Evert Vrugt (ex-APG).

Thank you to the members of my reading committee: prof.dr. R. Calcagno, prof.dr. M. van Dijk, prof.dr. U. von Lilienfeld-Toal, prof.dr. P. Verwijmeren and prof.dr. R. Zwinkels. I very much enjoyed reading your comments on my thesis and benefited from them. Thanks also to Norman Seeger for joining my PhD Assessment Committee.

I am also grateful for the support that my managers at APG Asset Management,

Peter Wijn and Klaas Reedijk, gave to this thesis. Without your backing, I would

not have managed to complete this PhD alongside work. Thank you to my other (ex-)colleagues at Asset Allocation & Overlay, Mark van Aartsen, Jelle Jansen, Jos Kalb, Ed Swiderski and Alex Tiebout, for encouraging me and backing me up during

conference days. At APG, I also would like to thank Rob van den Goorbergh, Peter

Strikwerda and Job Kooij for making my days, respectively, more quant (around a

roundtable), innovative and sporty.

A special thanks to my paranymphs Rob van den Goorbergh and Klaas Reedijk,

who have closely accompanied my PhD journey and really helped me on the final

details of my thesis and defense.

Thanks also to many friends in Amsterdam, Adriaan, Alan, Claudinha, Gisa, Guigo,

Hermine, Inge, Laís, Leroy, Margriet, Matheus and Virgilio, among others, and in our Randwijk community, who have not abandoned me but rather supported me despite the many invitations to meet that I declined.

A huge thanks goes to my mom Sandra, whose immeasurable love has always supported my endeavours, even when they meant being far from home. To my deceased father, Jose Luiz, thanks a lot for teaching me to be a hard worker. I would not have finished this thesis were it not for your example. Many thanks

to my brother Claudio and my sister Luciana, whose incentives and help made my

PhD path more meaningful, happy and pythonic. Thanks to my extended family

Jairo, Tokie, Sonia, Andre, Renato, Andre Reis, Tia Fatima, Tia Ana, Tio Jose

Lino, Neocles and Leila, among others, for their appreciation and encouragement.

Now, six years after I started my PhD, life is quite different from when I started it. I am now the father of two amazing creatures: Thomas and Bernardo. I thank you

for the understanding on weekends and for going to bed early on week days. Nearly

nothing in life gives me more energy than spending time with these two little boys

and my wife. They are my wonders and I am happy to have more time for them

once my nights and weekends are no longer consumed by my “six-year old baby”.

In the meantime, some beloved family members have left this world. Vovo Ge, Vovo Xico and Ana Elizabeth, you will always be in my heart.

Last but not least, I would like to thank my better half, Clarissa Bonifacio. No one

inspired, motivated and supported me to progress in these six years like you did.

Luiz Fernando Fortes Felix

Amsterdam, August 2018

Contents

1 Introduction
1.1 An artificial intelligence introduction to this thesis
1.2 My introduction to this thesis
1.3 Cumulative Prospect Theory
1.4 Outline

2 The 2011 European Short Sale Ban: A Cure or a Curse?
2.1 Introduction
2.2 Data and methodology
2.3 Discussion of results
2.3.1 VaR levels and volatility skews
2.3.2 Financial contagion risk
2.3.3 Panel regression analysis
2.3.4 Robustness Tests
2.4 Conclusion
2.A Appendix: Implied jump risk estimation
2.A.1 Implied jump risk from risk-neutral distributions
2.A.2 The Figlewski (2010) approach for extracting RND from implied volatilities
2.A.3 The modified Figlewski (2010) approach
2.B Appendix: Extreme value theory

3 Single stock call options as lottery tickets: overpricing and investor sentiment
3.1 Introduction
3.2 Data and Methodology
3.2.1 Subjective density functions
3.2.2 Estimating CPT parameters
3.2.3 Density function tails’ consistency test
3.2.4 Estimating RND and EDF
3.3 Empirical analysis and results
3.3.1 Estimated CPT long-term parameters
3.3.2 Density functions tails’ consistency test results
3.3.3 Estimated CPT time-varying parameters
3.3.4 Time variation in probability weighting parameter and investors’ sentiment
3.4 Robustness tests
3.4.1 Kupiec’s test for tail comparison
3.4.2 Prelec’s weighting function parameter
3.4.3 Estimating time-varying γ under different assumptions for δ, α and β
3.4.4 Overweight of (right) tails driven by IV of single stock options
3.5 Conclusion
3.A Appendix: Risk-neutral densities and implied volatility analytics
3.A.1 Subject density function estimation
3.A.2 Single stock weighted average implied volatilities
3.B Appendix: Machine learning methods
3.B.1 Least Absolute Shrinkage and Selection Operator (Lasso)
3.B.2 k-Nearest-Neighbor classifier
3.C Appendix: Welch and Goyal (2008) equity market predictors

4 Implied Volatility Sentiment: A Tale of Two Tails
4.1 Introduction
4.3 Overweight of tails: dynamics and dependencies
4.3.1 Time-varying CPT parameters
4.3.2 Overweight of tails and sentiment
4.3.3 Overweight of tails, IV skews and higher moments of the RND
4.4 Predicting with overweight of tails
4.4.1 Predicting returns with DGspread and IV-sentiment
4.4.2 IV-sentiment pair trading strategy
4.4.3 Out-of-sample equity returns predictive tests
4.4.3.1 Univariate models and forecast combination
4.4.3.2 “Kitchen sink” and machine learning-based models
4.4.4 IV-sentiment and equity factors
4.4.5 Behavioral versus risk-sharing perspectives
4.5 Conclusion

5 Predictable Biases in Macroeconomic Forecasts and Their Impact Across Asset Classes
5.1 Introduction
5.2 Forecast biases, anchoring and rationality tests
5.3.1 Economic surprise predictive models
5.3.2 Market response predictive models
5.4 Empirical analysis and results
5.4.1 Predicting economic surprises
5.4.2 Market responses around macroeconomic announcements
5.4.3 Predicting market responses
5.4.4 Market responses, skewness of economic forecasts and regret
5.4.5 Robustness tests
5.4.5.1 Economic surprise models across regions
5.4.5.2 Expected and unexpected surprises
5.5 Conclusion
5.A Appendix: Machine learning methods
5.A.1 Principal component analysis
5.A.2 Ridge regression
5.A.3 Random Forest
5.A.4 Random forest variable Importance measure

6 Conclusion

Bibliography

Summary

Samenvatting

Short biography

Publications

Conferences presentations

List of Figures

1.1 LDA topic modelling applied to this thesis
1.2 QR-code for interactive LDA output visualization
1.3 Impact of changes in λ, γ and δ in the CPT model
2.1 Short positions in stocks around ban date
2.2 Averaged implied volatility skews for banned and non-banned stocks
2.3 Sovereign CDS spreads, V2X and implied volatility skews around the ban date
2.4 Implied volatility skews and IV spread around the ban date
2.5 RND extraction using different methods
3.1 Cumulative density functions
3.2 Time varying gamma parameter in CPT
3.3 k-Nearest-Neighbors for IV skews
4.1 Information ratios for daily IV-based strategies
4.2 Information ratios for long- and short-leg of IV-based strategies
4.3 Information ratios, skewness and horizon for monthly IV-based strategies
4.4 Cumulative Sum of Squared Error Differences of single factor predictive regressions
4.5 Cumulative Sum of Squared Error Differences of combined predictive regressions
4.6 Correlation matrix between IV-sentiment factor and cross-sectional equity factors
5.1 Cumulative average returns (CAR) around the macroeconomic announcements
5.2 Importance measure from Random forest model

Chapter 1

Introduction

1.1 An artificial intelligence introduction to this thesis

I start this thesis where my mind is: artificial intelligence, more specifically natural language processing (NLP), which, together with deep learning, is the main topic of my current research agenda at APG. NLP is the field of computer science concerned with the understanding of human (natural) language. Given the power of NLP and, specifically, of topic modelling for the dimensionality reduction of text, I could not resist offering you this entrée to my thesis. This choice does not come out of context, as I apply a number of other artificial intelligence methods across this thesis, more specifically machine learning algorithms. Although I may by now have distracted the reader enough, I note that my own, high-dimensional and detailed introduction is provided in the following section. Thus, if you have a taste for technology and a natural interest in research, you will enjoy reading both introductions. If time is your main concern, you should skip section 1.2. If precision is of singular interest and/or you despise technology, you should jump to section 1.2 straight away.

An NLP Latent Dirichlet Allocation (LDA) topic modelling technique (see Blei et al., 2003) [1] applied to this thesis suggests it can be broken down into the following six topics with the respective weights [2]:

1. Forecast of macroeconomic indicator using biases, 20.6%.

2. Investor sentiment using out-of-the-money (OTM) options, 14.1%.

3. Ban on European financial stocks and implied volatility (IV) skews, 12.2%.

4. Equity, sentiment, momentum, cross-sectional and surprise factor strategies, 14.0%.

5. Cumulative Prospect Theory (CPT) and overweight of small probabilities, 25.5%.

6. S&P500 index implied volatility (IV) across moneyness, 13.6%.

[1] LDA is a widely adopted NLP technique for topic modelling.

[2] Topic names are not automated but constructed from the top-10 most relevant terms for each topic.


Figure 1.1: LDA topic modelling applied to this thesis. This figure presents the visualization output of a Latent Dirichlet Allocation (LDA) method for topic modelling applied to this thesis’ text. On the left the global topic view is shown. On the right (with topic 5, the topic with highest frequency, selected) the term bar chart is shown. The parameter λ to set the relevance metric equals 0.6 in this figure.

Figure 1.2: QR-code for interactive LDA output visualization. This QR-code is linked to the LDAvis output of the Latent Dirichlet Allocation (LDA) model provided in the Figure 1.1 above. On the left the global topic view is shown. Select topics by clicking at their corresponding bubble. Once a topic is selected, its most relevant terms (for the set value of λ) are displayed on the right side. On the right the term bar chart is shown. Red bars indicate term frequency within the selected topic. Blue bars indicate overall term frequency across the corpus. See Sievert and Shirley (2014) for further details.

Using a multidimensional scaling technique, we report the inter-topic distance chart, shown

on the left side of Figure 1.1, produced using the LDAvis Python library implemented by Sievert

and Shirley (2014). As such, the chart suggests that this thesis contains a mix of topics that

are occasionally correlated (topics 4 and 1 and topics 5 and 6) but mostly quite distinct from

each other. Topic 3 seems the most distant topic from the other five topics, whereas topics 2

and 4 are the most common to all topics. Bubble sizes are proportional to the topic frequencies

reported above.

On the right side of Figure 1.1, the term bar chart is displayed. It shows the most relevant terms for the different topics (with topic 5 selected) given our parameterization for the relevance metric (λ) equal to 0.6 [3]. For an interactive version of the LDA visualization output of this thesis, please scan the QR-code provided in Figure 1.2 [4].

[3] We use λ=0.6, as this is the optimal level of this parameter according to Sievert and Shirley (2014).

[4] Alternatively, go to the following URL: https://cdn.rawgit.com/luizfelix/PhD/b9121543/PhD_Luiz_ldavis_6.html#topic=0&lambda=0.6&term=
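To make the role of λ concrete, the relevance metric of Sievert and Shirley (2014) ranks a term w within topic k by λ·log(φ_kw) + (1 − λ)·log(φ_kw / p_w), where φ_kw is the probability of the term within the topic and p_w its overall probability in the corpus. The sketch below is only an illustration of that formula: the function names and the toy probabilities are hypothetical and are not taken from the LDA model fitted to this thesis.

```python
import math


def relevance(phi_kw, p_w, lam=0.6):
    """Relevance of a term to a topic for a given lambda (Sievert and Shirley, 2014)."""
    return lam * math.log(phi_kw) + (1.0 - lam) * math.log(phi_kw / p_w)


def rank_terms(topic_term_probs, corpus_term_probs, lam=0.6):
    """Return the terms of one topic sorted from most to least relevant."""
    scores = {
        term: relevance(prob, corpus_term_probs[term], lam)
        for term, prob in topic_term_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical probabilities, chosen only to illustrate the ranking.
topic = {"probability": 0.05, "tail": 0.04, "the": 0.06}     # phi_kw within one topic
corpus = {"probability": 0.01, "tail": 0.005, "the": 0.07}   # p_w across the corpus

ranking = rank_terms(topic, corpus, lam=0.6)
```

With λ = 1 the ranking reduces to the raw within-topic probability, so a common word such as "the" would come first. Lower values of λ penalize terms that are frequent across the whole corpus, which is why topic-specific terms move up in the example.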


The remainder of Chapter 1 is split into my introduction to this thesis, an introduction to

the Cumulative Prospect Theory (CPT) and the outline of this thesis.

1.2 My introduction to this thesis

This thesis is about behavioral finance, the sub-field of behavioral economics that studies the impact of psychological and cognitive biases on financial decision making. The main hypothesis of behavioral finance is that people systematically make irrational decisions when outcomes are unknown. Behavioral finance is a breakthrough because it managed to challenge classical economic and financial theories, which are both built on the assumption that individuals are fundamentally rational, as implied by expected utility theory. The proponents of behavioral finance used lab experiments to show that individuals making decisions under uncertainty violate the axioms of expected utility theory. As such, behavioral finance models were designed in a stylized form, disconnected from financial markets. This thesis therefore adds to the growing literature that attempts to validate the hypotheses made by behavioral finance in real financial markets. Beyond that, by using forward-looking information sources, it tests these hypotheses more directly than the literature has done.

More formally, the Cumulative Prospect Theory (CPT) [5] introduced by Kahneman and Tversky (1979) [6] and Tversky and Kahneman (1992), the main single milestone of behavioral finance, has paved the way for challenging classical financial theory. Since their contribution, many asset pricing puzzles and market inefficiencies have received behavioral interpretations. Most of these interpretations link ex-post empirical observations (e.g., autocorrelation of stock returns over a six-month horizon, see Jegadeesh and Titman (1993)) to stylized effects of behavioral models (e.g., underreaction and overreaction, Barberis et al. (1998); Daniel et al. (1998); Hong and Stein (1999)) and related behavioral biases (e.g., overconfidence and self-attribution, for the case of Daniel et al. (1998)). Fewer are the papers using ex-ante information to directly recognize the presence of behavioral biases in investment decision making [7]. Thus, the overarching theme in this thesis is the link between behavioral finance, investments and ex-ante information sources, which are either new or evaluated from a novel perspective.

Particularly, CPT’s overweighting of small probabilities is hypothesized to explain a series

of puzzles in asset pricing (see Barberis and Huang, 2008) but it lacks empirical validation.

Thus, most chapters of this thesis investigate market inefficiencies which we hypothesize to be

explained by the CPT probability weighting function. Using ex-ante information, we find it

to play a role in explaining some inefficient behaviors of market makers, retail investors and

institutional investors, giving rise to economically significant investment insights.
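As an illustration of what overweighting of small probabilities means, the sketch below evaluates the probability weighting function of Tversky and Kahneman (1992), w(p) = p^c / (p^c + (1 − p)^c)^(1/c), where the curvature c is γ for gains and δ for losses. The function name is hypothetical, and the values 0.61 and 0.69 are the original Tversky and Kahneman (1992) estimates, not parameters estimated in this thesis.

```python
def tk_weight(p, curvature):
    """Tversky-Kahneman (1992) probability weighting function.

    `curvature` plays the role of gamma (gains) or delta (losses);
    a value of 1 leaves probabilities undistorted.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p in (0.0, 1.0):
        return p
    numerator = p ** curvature
    return numerator / (numerator + (1.0 - p) ** curvature) ** (1.0 / curvature)


GAMMA_TK92 = 0.61  # gains, as estimated by Tversky and Kahneman (1992)
DELTA_TK92 = 0.69  # losses, as estimated by Tversky and Kahneman (1992)

small_gain = tk_weight(0.01, GAMMA_TK92)   # decision weight of a 1% tail gain
likely_gain = tk_weight(0.99, GAMMA_TK92)  # decision weight of a 99% gain
small_loss = tk_weight(0.01, DELTA_TK92)   # decision weight of a 1% tail loss
```

With curvature below one, the decision weight of a 1% event is several times larger than 1%, while a 99% event receives a weight below 99%. This is the inverse-S shape behind the tail overweighting discussed above.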

[5] An overview of the Cumulative Prospect Theory (CPT) is provided in section 1.3.

[6] Kahneman and Tversky (1979) introduced the Prospect Theory, which was later refined into the CPT.

[7] Note that ex-post and ex-ante information are both effective for testing the presence of behavioral biases once historical data is available. The deemed benefit of using ex-ante data is that it allows for an instantaneous observation and test of biases. Nevertheless, given its anticipatory nature, the availability of ex-ante data is much more limited.


Differently from most behavioral finance research, a large part of this thesis attempts to empirically validate its hypotheses using equity options data. This was a deliberate choice. Option markets constitute a rich source of information because they efficiently provide market-based estimates of investors’ preferences (Dennis and Mayhew, 2002; Barberis and Huang, 2008; Dierkes, 2009; Chang et al., 2013) or expectations (Bates, 1991; Rubinstein, 1994; Jackwerth and Rubinstein, 1996) or both (Kliger and Levy, 2009; Polkovnichenko and Zhao, 2013; Barberis, 2013). Beyond that, because of the availability of a cross-sectional structure of options (across moneyness) and maturities, risk-neutral probability density functions for multiple horizons can be estimated. Employing risk-neutral densities (RND) has several advantages. First, it provides a true ex-ante probability distribution of market expectations. Second, it captures patterns in the probability density which cannot be replicated by smoothing or parametric methods. Third, it does not require an extremely long data history to estimate assets’ steady-state return distributions and their conditional counterparts. And, lastly, it allows us to directly investigate potential time-variation in distributions.

Note that, if the number of papers utilizing ex-ante information to test for behavioral biases

is small, studies linking behavioral insights to ex-ante density functions are even more scarce.

The ones capable of that are typically restricted to the usage of RND from the option markets.

These studies, however, solely focus on the index option market, as in Polkovnichenko and Zhao

(2013) and Dierkes (2009). To the best of our knowledge, we are the first to use RND from

single stock option markets, let alone the first to combine ex-ante information from both index

option and single stock option markets.

Broadly speaking, we implement the Figlewski (2010) approach for RND estimation, which

is preferred to earlier methodologies because it is able to extrapolate fat tails beyond the

density’s body. For instance, a clear advantage of this method versus the one used by Bliss

and Panigirtzoglou (2004) is that tails of the distribution of prices no longer resemble Black-

Scholes’ log-normal tails, as its constant volatility across moneyness is largely inconsistent with

empirical observations. In particular, in this thesis we extend the Figlewski (2010) approach by making the tails’ anchor points (the points where the tails are attached to the distribution body) dynamic rather than fixed, which makes the method more flexible. Our addition to the Figlewski (2010) method largely improves the quality of the fitted Generalized Extreme Value tails in our data set.
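The body of an RND is commonly recovered from the second derivative of call prices with respect to the strike, q(K) = e^{rT} ∂²C/∂K² (the Breeden-Litzenberger relation). The sketch below illustrates only that step, by finite differences on an equally spaced strike grid. It is a simplified example under stated assumptions: the call prices are synthetic Black-Scholes prices with a flat 20% volatility, all inputs are hypothetical, and neither the Generalized Extreme Value tail fitting nor the dynamic anchor points described above are reproduced.

```python
import math


def bs_call(spot, strike, vol, rate, maturity):
    """Black-Scholes call price, used here only to generate synthetic quotes."""
    std = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / std
    d2 = d1 - std
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


def rnd_from_calls(strikes, calls, rate, maturity):
    """Risk-neutral density at the interior strikes of an equally spaced grid."""
    step = strikes[1] - strikes[0]
    growth = math.exp(rate * maturity)
    density = []
    for i in range(1, len(strikes) - 1):
        second_diff = (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / step ** 2
        density.append((strikes[i], growth * second_diff))
    return density


# Hypothetical market: spot 100, flat 20% volatility, 1% rate, 3-month maturity.
STEP = 0.5
strikes = [20.0 + STEP * i for i in range(521)]  # strikes from 20 to 280
calls = [bs_call(100.0, k, 0.20, 0.01, 0.25) for k in strikes]

rnd = rnd_from_calls(strikes, calls, rate=0.01, maturity=0.25)
total_mass = sum(q for _, q in rnd) * STEP   # should be close to 1
rnd_mean = sum(k * q for k, q in rnd) * STEP  # should be close to the forward price
```

Because the synthetic prices use a constant volatility, the recovered density is log-normal. That is the thin-tailed shape the text contrasts with empirically observed tails, and with real quotes the volatility would vary across moneyness.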

To investigate how well the CPT model can explain option pricing, RNDs must be transformed into subjective density functions. This is done by embedding a risk-reward trade-off (i.e., a utility function) in the RNDs through the use of the pricing kernel. Another way of investigating how well the CPT matches option prices is to estimate the CPT’s γ and δ probability distortion parameters required to make subjective density functions match the RND. We could not have implemented either of these approaches if it were not for the direction given by Bliss and Panigirtzoglou (2004). Specifically for utility functions with probability weighting, such as the CPT and Prelec’s rank-dependent expected utility (RDEU), Polkovnichenko and Zhao (2013) and Dierkes (2009) provided important guidance.
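The transformation from an RND to a subjective density can be sketched as follows. In the spirit of Bliss and Panigirtzoglou (2004), the subjective density is taken to be proportional to the RND divided by marginal utility, p(S) ∝ q(S) / U'(S), and is then renormalized. The example is a minimal sketch under simplifying assumptions: it uses a power utility with a hypothetical risk aversion of 2 and a bell-shaped toy RND, whereas this thesis works with the CPT and probability weighting, which are not implemented here.

```python
import math


def subjective_density(grid, rnd, marginal_utility):
    """Rescale a risk-neutral density by 1 / U'(S) and renormalize.

    `grid` is an equally spaced grid of terminal prices and `rnd` holds the
    risk-neutral density evaluated on that grid.
    """
    step = grid[1] - grid[0]
    unnormalized = [q / marginal_utility(s) for s, q in zip(grid, rnd)]
    mass = sum(unnormalized) * step
    return [u / mass for u in unnormalized]


# Hypothetical inputs: a bell-shaped RND centred at 100 on a grid from 50 to 150.
STEP = 0.5
grid = [50.0 + STEP * i for i in range(201)]
raw = [math.exp(-0.5 * ((s - 100.0) / 10.0) ** 2) for s in grid]
rnd = [r / (sum(raw) * STEP) for r in raw]

RISK_AVERSION = 2.0  # assumed power-utility coefficient, for illustration only
subjective = subjective_density(grid, rnd, lambda s: s ** (-RISK_AVERSION))

rnd_mean = sum(s * q for s, q in zip(grid, rnd)) * STEP
subjective_mean = sum(s * p for s, p in zip(grid, subjective)) * STEP
```

For a risk-averse investor, marginal utility is higher in bad states, so dividing by it shifts probability mass towards high prices. The subjective mean therefore exceeds the risk-neutral mean, the difference being the risk premium implied by the assumed utility function.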


We delved deep into the extraction of RND and into the estimation of CPT parameters from options data; however, our conclusions are mostly dedicated to the underlying stock market and equity sentiment rather than to options’ pricing itself [8]. As most applied behavioral finance

studies focus on the cross section of stocks (see Carhart, 1997; Barber and Odean, 2008; Cen

et al., 2013), the number of studies linking behavioral biases to asset class level and across asset

class investment strategies is substantially smaller and mostly concentrates on momentum (see

Benartzi and Thaler, 1995a; Asness et al., 2013; Moskowitz et al., 2012). At the same time,

there is a lack of studies that investigate CPT’s overweighting of tail events in connection to

cross-asset investing and asset allocation. And, as asset allocation is the primary determinant

of a portfolio’s return variability (see Brinson et al., 1986; Ibbotson and Kaplan, 2000), this

thesis not only fills a gap in the literature, but is relevant to regulators, investors and other

market participants.

This thesis is important to regulators because it provides a unique perspective on short-

sale bans, implied jump risk and market failure, which are closely connected to contagion risk

and to behavioral inefficiencies. As equity market reversals and momentum crashes are also

linked to behavioral biases, acknowledging their effects on market prices should also be useful for market prudential policy. This thesis matters to market participants because, as financial decision makers, they are under the constant influence of behavioral biases, which, on the one hand, might be the biggest challenge to their professional activity and, on the other hand,

create investment opportunities. Thus, understanding how specific behavioral biases impact

financial markets in aggregate and may impact our own behavior is important, if not critical

in investments.

Our analyses mostly concentrate on tail measures and higher moments, both extracted

from RND: tail shape estimator (Hill, 1975), extreme downside risk (Hartmann et al., 2004),

conditional co-crash probabilities (Hartmann et al., 2004; Balla et al., 2014), jump risk (Yan,

2011), expected shortfall (Danielsson et al., 2006), risk-neutral skewness and kurtosis (Dennis

and Mayhew, 2002; Conrad et al., 2013) and implied volatility skews (Bates, 2003; Garleanu

et al., 2009; Vilkov and Xiao, 2013). Our starting point is the fact that implied volatility

skews from index put options have dramatically changed since the 1987 crash (see Bates, 1991;

Rubinstein, 1994; Jackwerth and Rubinstein, 1996). Since this event, implied volatility skews

have been traditionally associated with demand for portfolio insurance by institutional investors

and (left) tail fears (see Vilkov and Xiao, 2013), reflecting bearishness. The perspective provided

by the single stock option market is, however, very distinct from the one obtained from the

index option markets, which has been the one studied in connection with probability weighting

functions (see Polkovnichenko and Zhao, 2013; Dierkes, 2009). In contrast, single stock options

are mostly traded by individual investors (Bollen and Whaley, 2004; Lakonishok et al., 2007)

to speculate on the upside of equity markets (Lakonishok et al., 2007; Bauer et al., 2009; Choy,

2015), reflecting bullish sentiment. Because we investigate both the index option and the single

[8] This is supported by the assumption that options and stock prices reflect the same information, in line with Conrad et al. (2013).


stock option markets, we are able to propose a novel implied volatility-based sentiment measure,

jointly extracted from both the index and single stock option markets: IV-sentiment.
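Of the tail measures listed above, the tail shape estimator of Hill (1975) is the simplest to state: with the observations ordered as X(1) ≥ X(2) ≥ ..., the estimate based on the k largest values is ξ = (1/k) Σ_{i=1..k} ln X(i) − ln X(k+1), and the tail index is α = 1/ξ. The sketch below applies it to a simulated sample of losses. The function name, the simulated Pareto data and the choice k = 500 are assumptions for illustration only; in this thesis the tail measures are extracted from RNDs rather than from simulated samples.

```python
import math
import random


def hill_tail_index(losses, k):
    """Hill (1975) estimate of the tail index alpha from the k largest losses."""
    ordered = sorted((x for x in losses if x > 0.0), reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of positive observations")
    threshold = ordered[k]  # the (k+1)-th largest observation
    xi = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / xi


# Simulated losses from a Pareto distribution with true tail index 3.
random.seed(0)
sample = [random.paretovariate(3.0) for _ in range(5000)]
alpha_hat = hill_tail_index(sample, k=500)
```

A lower α means a fatter tail. The estimate is sensitive to the choice of k, which trades off bias (too many observations from the body of the distribution) against variance (too few tail observations).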

In addition to the overweight of tail events, we also investigate the influence of other be-

havioral biases, such as anchoring (Tversky and Kahneman, 1974), conservatism (Ward, 1982),

overconfidence (Daniel et al., 1998), herding (Scharfstein and Stein, 1990), regret (Loomes and

Sugden, 1982; Bell, 1982), and rational bias (Laster et al., 1999; Ottaviani and Sorensen, 2006)

amid surveys of macroeconomic data forecasters, another source of ex-ante information. We

find some of these biases to be pervasive in the forecasting of US economic data releases and

also present in Continental Europe, the United Kingdom and Japan. Under this condition,

economic surprises are predictable. And, as market prices react to the unexpected informa-

tion flow, we find that predicting economic surprises gives rise to return predictability, with

implications across four asset classes: equities, bonds, foreign exchange and commodities. Our

results suggest that returns on assets that are sensitive to the fundamentals being revealed by

macro announcements (local equities and bonds) are more predictable around such events than

foreign markets, currencies and commodities.

The curse of dimensionality (a relatively small number of data samples in a high dimensional

feature space) struck a few times in our research. Hence, we found the need to go beyond the

standard linear model to satisfactorily test our hypothesis. The fact that computational power

no longer hinders the application of advanced statistical techniques enables us to apply artificial

intelligence techniques whenever necessary. The application of these algorithms was sometimes

crucial for preventing overfitting, which could have misled us to erroneous conclusions, such as

in Chapter 5. Other times, machine learning methods were instrumental to clarify relations

that were blurred when analyzed through standard methods. In some situations, though, ma-

chine learning approaches were the ones leading to overfitting, such as in Chapter 4. We did

not restrict ourselves to the application of supervised learning methods. Unsupervised learning

techniques were also essential to perform some of our analysis, for instance, for dimensionality

reduction used in Chapter 5. In sum, this thesis suggests that machine learning and artificial

intelligence techniques are not the answer to all financial economics and investment questions but

simply one additional toolbox in the hands of the econometrician and investment researcher.

These techniques, even more than standard linear models, should be understood and applied

with care despite the ease of usage via Python or R. Insights into the artificial intelligence tech-

niques used are provided in the Appendices of the different chapters. For a general explanation

of these methods, see Blei et al. (2003) and Hastie et al. (2008).

1.3 Cumulative Prospect Theory

Prospect theory (PT) of Kahneman and Tversky (1979) incorporates behavioral biases

into the standard utility theory (Von Neumann and Morgenstern, 1947), which presumes that

individuals are rational9. Such behavioral anomalies are i) loss aversion, ii) risk seeking behavior

9 The expected utility theory of Von Neumann and Morgenstern (1947) is the standard economics framework on decision making under risk. Their theory assumes that decision-makers behave as if they maximize the

and iii) non-linear preferences10. The CPT is described in terms of a value function (υ) and

a probability distortion function (π). The value function is analogous to the utility function

in the standard utility theory and it is defined relative to a reference point zero. Therefore,

positive values within the value function are considered as gains and negative values are losses,

which leads to:

$$\upsilon(x) = \begin{cases} x^{\alpha}, & \text{if } x \ge 0 \\ -\lambda(-x)^{\beta}, & \text{if } x < 0 \end{cases} \tag{1.1}$$

where λ ≥ 1, 0 ≤ β ≤ 1, 0 ≤ α ≤ 1, and x are gains or losses. Thus, along the domain of x, the

CPT’s value function is asymmetrically S-shaped (see Figure 1.3a) with diminishing sensitivity

as x → ±∞.

The value function is thus concave over gains and convex over losses, unlike

the traditional utility function used by standard utility theory. Such a shape of the value

function implies diminishing marginal values as gains or losses increase, which means that any

additional unit of gain (loss) becomes less relevant when wealth increases (decreases). As α

and β increase, the effect of diminishing sensitivity decreases, and as λ increases the degree of

loss aversion increases. We also note in Figure 1.3a that the value function has a kink at the

reference point, which implies loss aversion, as the function is steeper for losses than for gains.
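As a minimal illustration, and not the implementation used in this thesis, the value function of Eq. (1.1) can be sketched in Python. The function name `cpt_value` is hypothetical; the default parameter values are the Tversky and Kahneman estimates quoted in this section.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """CPT value function of Eq. (1.1), defined relative to a reference point of zero."""
    if x >= 0:
        return x ** alpha           # concave over gains
    return -lam * (-x) ** beta      # convex and steeper over losses


if __name__ == "__main__":
    # Print the value of a few gains and losses of equal size.
    for x in (-100.0, -10.0, 0.0, 10.0, 100.0):
        print(x, round(cpt_value(x), 3))
```

With α = β, a loss of a given size is valued at exactly −λ times the value of the equally sized gain, which is the kink at the reference point described above.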

Figure 1.3: Impact of changes in λ, γ and δ in the CPT model. Panel (a) shows the impact of changes in the loss aversion parameter λ on the Cumulative Prospect Theory (CPT) value function v(x). The v(x) is depicted for λ equal to 1, 1.5, 2 and 2.5. Panel (b) depicts the CPT weighting function w(x) for the probability weighting parameters γ and δ, respectively, for gains and losses, as well as the weighting function of Prelec (1998) for the probability weighting parameter δ. The w(x) is depicted for γ equal to 0.61, δ equal to 0.69 and Prelec δ equal to 0.5.

The use of a probability distortion function or decision weight function is the adjustment

made to the PT to address nonlinear preferences. This function takes probabilities and weights

expected value of some function defined over the potential (probabilistic) outcomes. Individuals are assumed to have stable and rational preferences; i.e., not influenced by the context or framing.

10 Loss aversion is the property in which people are more sensitive to losses than gains. For details, see Kahneman and Tversky (1979), Tversky and Kahneman (1992) and Barberis and Huang (2001). Risk-seeking behavior happens when individuals are attracted by gambles with unfair prospects. The risk-seeking individual chooses a gamble over a sure thing even though the two outcomes have the same expected value. Non-linear preferences occur when preferences between risky prospects are not linear in the probabilities, so that some prospects are weighted more heavily by agents than other, equally probable ones. For details, see Tversky and Kahneman (1992), Fox et al. (1996), Wu and Gonzalez (1996), Prelec (1998) and Hsu et al. (2009).

them non-linearly, so that the difference between probabilities at high percentiles, e.g., between

99 percent and 100 percent, has more impact on preferences than the difference between prob-

abilities at small percentiles, e.g., between 10 percent and 11 percent. This is the main advance

of the CPT over the original PT. The CPT applies probability distortions to the cumulative

probabilities (i.e., the CDF), whereas the PT applies them to individual probabilities (i.e.,

the PDF). The enhancement brought by this new formulation satisfies stochastic dominance

conditions not achieved by the PT, which renders the CPT applicable to a wider number of

experiments. The probability distortion functions suggested by Tversky and Kahneman (1992),

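respectively, for gains ($\pi^{+}_{n}$) and losses ($\pi^{-}_{-m}$) are:

$$\pi^{+}_{n} = w^{+}(p_{n}) \tag{1.2a}$$

$$\pi^{+}_{i} = w^{+}(p_{i} + \dots + p_{n}) - w^{+}(p_{i+1} + \dots + p_{n}), \quad \text{for } 0 \le i \le n-1 \tag{1.2b}$$

$$\pi^{-}_{-m} = w^{-}(p_{-m}) \tag{1.2c}$$

$$\pi^{-}_{i} = w^{-}(p_{-m} + \dots + p_{i}) - w^{-}(p_{-m} + \dots + p_{i-1}), \quad \text{for } 1-m \le i \le 0 \tag{1.2d}$$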

where p are objective probabilities of outcomes, which are ranked for gains from the reference

point i = 0 to i = n, the largest gain, and for losses from the largest loss i = −m to i = 0, the

reference point. Further, w+ and w−, the parametric form of the decision weighting functions,

are given by:

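$$w^{+}(p) = \frac{p^{\gamma}}{\left(p^{\gamma} + (1-p)^{\gamma}\right)^{1/\gamma}} \tag{1.3a}$$

$$w^{-}(p) = \frac{p^{\delta}}{\left(p^{\delta} + (1-p)^{\delta}\right)^{1/\delta}} \tag{1.3b}$$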

where parameters γ and δ define the curvature of the weighting function for gains and losses,

which leads the probability distortion functions to assume inverse S-shapes. Figure 1.3b depicts

how low probability events are overweighted at the cost of moderate and high probabilities

within the CPT probability distortion functions. Tversky and Kahneman (1992) indicate that

the weighting functions for gains are slightly more curved than for losses (i.e., γ < δ), whereas

γ and δ parameters smaller than one mean overweighting of small probability events (i.e., the

distribution tails), and γ and δ larger than one mean underweighting of tail probabilities.
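The inverse S-shape can be checked numerically. The sketch below is illustrative only and assumes the parametric form of Eqs. (1.3a) and (1.3b); the function name `tk_weight` and its argument `curvature` (standing for γ or δ) are hypothetical.

```python
def tk_weight(p, curvature):
    """Tversky and Kahneman (1992) weighting function, Eqs. (1.3a)-(1.3b).

    `curvature` plays the role of gamma (gains) or delta (losses).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    numerator = p ** curvature
    denominator = (p ** curvature + (1.0 - p) ** curvature) ** (1.0 / curvature)
    return numerator / denominator


if __name__ == "__main__":
    # Compare objective probabilities with their distorted counterparts for gains.
    for p in (0.01, 0.10, 0.50, 0.90, 0.99):
        print(p, round(tk_weight(p, 0.61), 4))
```

For a curvature below one, small probabilities map to larger weights and large probabilities to smaller weights; a curvature of exactly one leaves probabilities undistorted.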

Note that the CPT model in its original parameterization gives rise to a distinctive fourfold

pattern of risk attitudes: risk aversion for gains of high probability, risk seeking for losses of

high probability, risk seeking for gains of low probability, and risk aversion for losses of low

probability.

The parameters estimated by Tversky and Kahneman for the CPT model, which are explored

in Chapters 3 and 4 of this thesis, are λ = 2.25; β = 0.88; α = 0.88; γ = 0.61; δ = 0.69.
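To make the rank-dependent construction of Eqs. (1.2a) to (1.2d) concrete, the following sketch evaluates a discrete prospect with the parameter values above. It is an illustration under stated assumptions rather than the estimation code of Chapters 3 and 4: all function names are hypothetical, outcomes are measured relative to a reference point of zero, and zero outcomes are treated as gains.

```python
def tk_weight(p, curvature):
    """Weighting function of Eqs. (1.3a)-(1.3b)."""
    p = min(1.0, max(0.0, p))  # guard against rounding slightly outside [0, 1]
    return p ** curvature / (p ** curvature + (1.0 - p) ** curvature) ** (1.0 / curvature)


def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function of Eq. (1.1)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta


def decision_weights(outcomes, probs, gamma=0.61, delta=0.69):
    """Return outcomes sorted in ascending order together with their decision weights."""
    pairs = sorted(zip(outcomes, probs))
    weights = []
    for k, (x, _) in enumerate(pairs):
        if x >= 0:
            # Gains: distort the probability of doing at least this well, Eqs. (1.2a)-(1.2b).
            upper = sum(q for _, q in pairs[k:])
            upper_next = sum(q for _, q in pairs[k + 1:])
            weights.append(tk_weight(upper, gamma) - tk_weight(upper_next, gamma))
        else:
            # Losses: distort the probability of doing at least this badly, Eqs. (1.2c)-(1.2d).
            lower = sum(q for _, q in pairs[:k + 1])
            lower_prev = sum(q for _, q in pairs[:k])
            weights.append(tk_weight(lower, delta) - tk_weight(lower_prev, delta))
    return [x for x, _ in pairs], weights


def cpt_prospect_value(outcomes, probs):
    """Sum of decision weights times values over all outcomes."""
    xs, ws = decision_weights(outcomes, probs)
    return sum(w * cpt_value(x) for x, w in zip(xs, ws))


if __name__ == "__main__":
    # Hypothetical lottery ticket: pay 1 for a 1% chance of a net gain of 99.
    print(cpt_prospect_value([-1.0, 99.0], [0.99, 0.01]))
```

The hypothetical lottery in the usage example has an expected value of zero, so any non-zero CPT value it receives comes from the value and weighting functions alone.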

1.4 Outline

The chapters of this thesis can be read individually without prior study of the preceding chap-

ters. That said, we note that Chapter 3 does provide insights for a better understanding of

Chapter 4.

Chapter 2 (The 2011 European Short Sale Ban: A Cure or a Curse? ) evaluates whether

the 2011 European short sale ban on financial stocks proved to be successful or had a negative

impact on financial markets. The analysis focuses on the effects of the short sale ban on

financial stability and contagion risk. Different from the previous literature, we explicitly take

an options market perspective and focus on market participants’ change in expectations. Our

starting point is the extraction of RND and implied volatility skews from single stock options.

Our estimated measure of implied jump risk is central to our analysis. We find that implied

jump risk tended to increase for banned stocks even more than for non-banned stocks, which

arguably is the opposite of what regulators target. However, contagion risk for banned stocks

did decrease during the ban relative to the previous period. This is likely due to the fact that

short trading activity eased after the imposition of the ban. Perhaps the main reason for

it was market failure. During the ban, market makers become more risk-sensitive following

equity market declines, which can be explained by the overweighting of tails, a feature of the

CPT. While we observe that the short sale ban is effective in restricting both outright and

synthetic shorts on banned stocks, we do find evidence of trading migration from single-stock

puts to index puts. The selling pressure potentially diverted from the financial stocks to a larger

share of the stock market, thereby reducing the destabilizing effects in the financial sector. As

described, Chapter 2 is the one that departs the most from the common theme in this thesis:

behavioral finance. CPT’s overweighting of tails is behind our main finding, but that is about

it. The chapter as a whole gravitates around the effectiveness of the 2011 European short sale

ban from an option market perspective.

Chapter 3 (Single stock call options as lottery tickets: overpricing and investor sentiment)

empirically tests whether the overpricing of out-of-the-money single stock calls can be explained

by the CPT. To the best of our knowledge, these tests have never been reported in the liter-

ature. In line with Barberis and Huang (2008), our main hypothesis is that single stock call

options, typically traded by individual investors, are overpriced because this type of investor

overweights small probability events and overpays for positively skewed securities that resemble

lottery tickets. Specifically, we test whether tails of the CPT density function outperform the

RND and a set of rational subjective probability density functions on matching tails of the

distribution of realized returns. We find that overweighting of small probabilities embedded in

the CPT explains the richness of out-of-the-money single stock calls better than other utility

functions. We find our estimates for the CPT probability weighting function parameter to

be qualitatively consistent with the ones of Tversky and Kahneman (1992), particularly for

short-term options. Our estimates suggest, however, that overweighting of small probabilities is

less pronounced than suggested by the CPT. Moreover, overweighting of small probabilities is

strongly time-varying and to a large degree explained by the sentiment factor of Baker and Wur-

gler (2007), a result that is confirmed by the Least Absolute Shrinkage and Selection Operator

(Lasso) of Tibshirani (1996).

Chapter 4 (Implied Volatility Sentiment: A Tale of Two Tails) builds on Chapter 3 and

on Polkovnichenko and Zhao (2013) and Dierkes (2009), providing evidence that low prob-

ability events are occasionally overweighted, as observed in the pricing of out-of-the-money

single stock calls (due to individual investors’ trading activity) and index puts (due to insti-

tutional investors’ trading activity). We show that overweighting of tail events in these two

option markets is strongly time-varying and is linked to equity market sentiment and higher

moments of RND. As a consequence, we suggest a novel sentiment indicator: implied volatility-

Sentiment or IV-Sentiment. We find that our measure, jointly derived from index and single

stock options, explains investors’ overweighting of tail events well. When attempting to predict

the equity risk premium out-of-sample, we find that IV-Sentiment adds value over and above

traditional factors, especially when multifactor predictive models are constrained. The struc-

ture provided by these constraints in addition to a simple forecast combination approach seems

also to outperform a “kitchen sink” model and a set of machine learning algorithms capable

of exploring non-linearities in the data and tackling multicollinearity issues, such as Random

forests (Breiman, 2001), Neural Networks, Principal Component Regression (Massy, 1965) and

Ridge regression (Hoerl and Kennard, 1970). When employed as a mean-reversion strategy, our

IV-Sentiment measure delivers economically significant results, which seem more robust than

the ones produced by the conventional sentiment factor. Last but not least, we find that a con-

trarian strategy based on IV-Sentiment shows limited exposure to a set of cross-sectional equity

factors, including Fama and French’s five factors, the momentum factor and the low-volatility

factor, and seems valuable in avoiding momentum crashes.

Chapter 5 (Predictable Biases in Macroeconomic Forecasts and Their Impact Across Asset

Classes) reveals how biases in macroeconomic forecasts are associated with economic surprises

and market responses across four asset classes around US data announcements. We start by

reiterating the previous finding of the literature that the consensus forecasts of US macroeco-

nomic releases embed anchoring (see Campbell and Sharpe, 2009). Further, to the best of our

knowledge, we are the first to find that the skewness of the distribution of economic forecasts

is a strong predictor of economic surprises, suggesting that forecasters behave strategically

(rational bias) and possess private information. By using a popularity measure per economic

indicator and by expanding the number of countries/regions and indicators tested relative to

Campbell and Sharpe (2009), we advocate that the prevalence of biases is related to attention,

which is also a novel insight in the literature. Under these conditions, both economic surprises

and returns of assets that are sensitive to macroeconomic conditions are predictable. Our find-

ings indicate that local equities and bond markets are more predictable than foreign markets,

currencies and commodities. On an out-of-sample basis, point forecasting is better performed by

non-linear machine learning models as they seem to capture the dynamics of market responses

around macroeconomic announcements better than linear regression models and avoid overfit-

ting. Yet, when forecasters fail to correctly forecast the direction of economic surprises, regret

becomes a relevant cognitive bias to explain asset price responses. We find that the behavioral

and rational biases encountered in US economic forecasting also exist in Continental Europe,

the United Kingdom and Japan, albeit to a lesser extent.

Chapter 2

The 2011 European Short Sale Ban: A Cure or a Curse?∗

2.1 Introduction

On August 11, 2011, Belgium, France, Italy, and Spain imposed short sale bans on financial

stocks. The European Securities and Markets Authority (ESMA) stated that the reason for

the short sale bans was to curb market abuse and the spread of false rumors2. The spread

of false rumors is dangerous because it may increase the risk of financial contagion3, thereby

endangering financial stability.

Recent academic studies argue that short sale bans, at best, do not affect stock price levels

and, at worst, contribute to their decline and negatively impact market quality. For instance,

Boehmer et al. (2013) conclude that it is unclear whether the SEC’s 2008 imposition of short

sale bans achieved the goal of providing a floor for U.S. equity markets. Beber and Pagano

(2013) investigate the impact of the 2008 bans on stock markets in 30 different countries and

find that banned stocks underperform stocks not included in the bans.

∗ This chapter is based on Felix et al. (2016a). I am grateful to Iftekhar Hasan (the editor) and two anonymous referees at the Journal of Financial Stability for useful comments and suggestions. We also thank seminar participants at the IX Seminar on Risk, Financial Stability and Banking of the Banco Central do Brasil 2014 in Sao Paulo, International Risk Management Conference 2014 in Warsaw, Financial Management Association (FMA) European Conference 2014 in Maastricht, 3rd International Conference of the Financial Engineering and Banking Society - FEBS/LabEx-ReFi 2013 in Paris, VU University Amsterdam and APG Asset Management in Amsterdam for their helpful comments. We thank Markit Securities Finance for providing the data on short stock positions and borrowing costs. We thank APG Asset Management for making available a large part of the additional data set.

2 ESMA stated on August 11, 2011: “European financial markets have been very volatile over recent weeks. The developments have raised concerns for securities markets regulators across the European Union. [....] While short selling can be a valid trading strategy, when used in combination with spreading false market rumors this is clearly abusive. [...] Today some authorities have decided to impose or extend existing short selling bans in their respective countries. They have done so either to restrict the benefits that can be achieved from spreading false rumors or to achieve a regulatory level playing field, given the close inter-linkage between some EU markets”.

3 Financial contagion occurs when a relatively contained shock, which initially affects only one or a few institutions, sectors or countries, propagates via larger shocks to the rest of the financial sector, economy or other countries.

In this chapter, we explicitly take an options market perspective, as opposed to employing

only the stock market itself. Our study focuses on market participants’ changes in beliefs

and expectations, as in the work of Yan (2011), Chang et al. (2013), and Chira et al. (2013).

Forward-looking probabilities implied by options prices, i.e., risk neutral densities (RND), and

the implied volatility (IV) skew, are used to assess how the ban affects implied jump risk on

banned and non-banned stocks. We employ a data set of daily IV across a range of different

moneyness levels for all optionable European stocks listed in Belgium, France, Italy, and Spain.

We note that using option-implied data is a novel approach in the literature to analyze the

impact of short sale bans on financial markets.

We focus not only on the outmost tails of RNDs but also on the tails of realized returns.

We argue that it is the more extreme parts of the distributions that best reflect implied jump

risk. We use extreme value theory (EVT) to assess how investors, through their perception of

implied jump risk, differentiated between banned and non-banned stocks upon the introduction

of the 2011 European short selling ban.

Our work is related to that of Melick et al. (1997) and Birru and Figlewski (2011) because

it examines the behavior of RNDs over specific events. The rationale of using RND and IV

skews to assess how the ban affected implied jump risk is also supported by Bates (2000) and

Rubinstein (1994). They show that before the 1987 crash, the probability of large negative stock

returns was small and fairly close to that suggested by the normal distribution. Just prior to

the crash, however, the option-implied probability of jumps rose considerably at the same time

that the IV skew became steeper. The left tail of the RND of returns became considerably

fatter and thus negatively skewed with increased kurtosis, a phenomenon attributed to crash

fear (Rubinstein, 1994). As a result, out-of-the-money (OTM) puts are systematically priced

at a higher level relative to at-the-money (ATM) ones.

The main contributions of this chapter are threefold. First, we provide evidence that the

ban increased implied jump risk levels, particularly impacting the banned financial stocks. We

show that it is the imposition of the ban itself that led to the increase in implied jump risk,

rather than other causes, such as information flow, options-trading volumes, or stock-specific

factors. This finding is important because increased implied jump risk may provoke financial

contagion (see Ait-Sahalia et al., 2015) and increase systemic risk. Because of the connection

between implied jump risk and contagion, shifts in implied jump risk are closely monitored by

regulators4.

Second, we find that after the announcement of the ban, financial contagion risk actually

drops for banned stocks. This finding seems to run contrary to what one might expect, given

the documented increases in implied jump risk levels for banned stocks. Interestingly, for the

non-banned stocks, we document that contagion risk levels do indeed increase after the ban,

thus behaving in line with the rise in implied jump risk levels. We argue that this difference

may be caused by (formal and informal) market makers’ reluctance to further increase their

4 For instance, Poon and Granger (2003) note that the Bank of England uses implied volatilities to assess market sentiment.

options’ inventory risk, leading to relatively steep IV skews, reduced volumes, and widened bid-

ask spreads for banned stocks. Such a supply shift occurs because market makers become more

risk-sensitive following equity market declines, which can be explained by the overweighting of

tails feature of the Cumulative Prospect Theory (CPT) of Tversky and Kahneman (1992).

Third, we compare the effects of the 2011 European ban to its 2008 American counterpart.

Investors may be able to obtain economic short exposure to banned stocks through a derivatives-

based strategy that replicates the payoff of a stock’s short sale. Such a “substitution effect”

(see Battalio and Schultz, 2011; Grundy et al., 2012) is characterized by a migration of trading

volume from one instrument to another. We find that no substitution effect occurred between

regular short selling and synthetic shorting through single stock puts during the 2011 European

ban. Instead of a substitution effect, our results show a migration out of single stock puts into

the EuroStoxx 50 index options market. We conclude that this type of migration diversifies

selling pressure initially concentrated in financial stocks across a larger share of the stock

market, thereby reducing systemic risks and enhancing overall financial stability.

2.2 Data and methodology

The 2011 short sale ban on financial stocks in the Euro member countries Belgium, France,

Italy, and Spain was established by a coordinated act of the European Securities and Markets

Authority (ESMA) and the national financial market regulators of those countries on August

11, 2011. The announcement was made via a public statement issued by the ESMA and

was followed by publications on the same day by the Belgian Financial Services and Markets

Authority (FSMA), the French Autorité des Marchés Financiers (AMF), the Italian Commissione

Nazionale per le Società e la Borsa (Consob), and the Spanish Comisión Nacional del Mercado

de Valores (CNMV). The ban entered into effect on August 12, 2011. Table 2.1 provides an

overview of the banned financial stocks.

The ban on covered short selling not only prohibited the creation of new net short positions

but also banned increases in existing ones, including intra-day operations. Naked short selling

had already been prohibited in these four markets since 2008. Positions arising from formal

market-making activities were exempted from the ban. The ban targeted not only public mar-

kets but also over-the-counter (OTC) markets. In terms of scope, the national announcements

differed. The Belgian FSMA announced that the ban applied to net economic short positions

of any kind, while the French AMF communicated that derivatives could only be used to hedge,

create or extend net long positions. For the Italian Consob, the ban covered only shares and

not exchange-traded funds (ETFs) or any derivatives, while the Spanish CNMV imposed the

ban on all trades in equities or indices.

During the ban, holders of financial stocks were still allowed to use single stock derivatives

or simply sell their holdings to hedge their portfolios. Investors exposed to stocks were allowed

to hedge their overall equity market exposure by trading the market index or single stock

derivatives. It was the short selling of banned stocks that was prohibited, not hedging them

or reducing equity market risk. The creation or extension of marginal net short positions in

banned securities as a result of hedging equity market risk was still allowed.

Table 2.1: Overview of banned financial stocks

| Belgium | France | Italy | Spain |
|---|---|---|---|
| Ageas | April Group | Azimut Holding | Banca Cívica, S.A. |
| Dexia | Axa | Banca Carige | Banco Bilbao Vizcaya Argentaria, S.A. |
| KBC Group | BNP Paribas | Banca Finnat | Banco de Sabadell, S.A. |
| KBC Ancora | CIC | Banca Generali | Banco de Valencia S.A. |
| | CNP Assurances | Banca Ifis | Banco Español de Crédito, S.A. |
| | Crédit Agricole | Banca Intermobiliare | Banco Pastor, S.A. |
| | Euler Hermès | Banca Monte Paschi di Siena | Banco Popular Español, S.A. |
| | Natixis | Banca Popolare Emilia Romagna | Banco Santander, S.A. |
| | Paris Ré | Banca Popolare Etruria e Lazio | Bankia, S.A. |
| | Scor | Banca Popolare Milano | Bankinter, S.A. |
| | Société Générale | Banca Popolare Sondrio | Bolsas y Mercados Españoles, S.A. |
| | | Banca Profilo | Caixabank, S.A. |
| | | Banco di Desio e Brianza | Caja de Ahorros del Mediterráneo |
| | | Banco di Sardegna Rsp | Grupo Catalana de Occidente, S.A. |
| | | Banco Popolare | Mapfre, S.A. |
| | | Cattolica Assicurazioni | Bolsas y Mercados Españoles, S.A. |
| | | Credito Artigiano | Renta 4 Servicios de Inversion, S.A. |
| | | Credito Emiliano | |
| | | Credito Valtellinese | |
| | | Fondiaria Sai | |
| | | Generali | |
| | | Intesa Sanpaolo | |
| | | Mediobanca | |
| | | Mediolanum | |
| | | Milano Assicurazioni | |
| | | Ubi Banca | |
| | | Unicredit | |
| | | Unipol | |
| | | Vittoria Assicurazioni | |

This table lists the financial stocks banned from short selling on August 11, 2011, in Belgium, France, Italy, and Spain by their respective national financial market regulators in a coordinated act with the European Securities and Markets Authority (ESMA).

The European short sale ban was initially intended to be in place for the next 15 days

only, with the exception of Belgium, which announced that the ban would remain in effect

indefinitely. Nevertheless, the ban was extended by the Spanish CNMV, the French AMF, and

the Italian Consob several times. On February 13, 2012, both FSMA and AMF announced

the lifting of the ban with immediate effect in Belgium and with retroactive effect, to February

11, in France. On February 15, the CNMV announced the lifting of the ban from February 16

onwards, and on February 24, the Italian ban expired.

Our sample covers the period from February 15, 2008, to March 27, 2012, and includes

1,073 trading days. It consists of all stocks that had listed options as of February 2012 on

the Belgian (Brussels Stock Exchange/Euronext Brussels), French (Paris Bourse or Euronext

Paris), Italian (Milan Stock Exchange or Borsa Italiana), and Spanish (Bolsa de Madrid) stock

exchanges. Overall, our sample comprises 185 stocks, of which 105 are included in these stock

exchanges’ main indices, i.e., the Belgian BEL20, the French CAC40, the Italian MIB, and the

Spanish IBEX35.

From Bloomberg, we source daily trading volumes and the number of shares outstanding per

stock, as well as trading volumes and put-call volume ratios for listed options. Trading volumes for listed

puts on the EuroStoxx 50 index, the V2X index (the IV index from the EuroStoxx 50 index),

and generic series of five-year sovereign credit default swaps (CDS) for Belgium, France, Italy,

and Spain are also collected from Bloomberg. Daily short stock positions (utilization rates)

and costs of short selling (simple average fee, simple average rebate, and cost of borrow score)

were kindly provided by Markit Securities Finance (formerly Data Explorers).

We implement the method by Figlewski (2010) for obtaining RNDs. He builds on the Bree-

den and Litzenberger (1978) formulae and interpolates and smooths the IV structure instead

of interpolating option prices. A clear strength of the Figlewski (2010) approach is its ability

to fill in intermediate grid values of the IV curve between the available strikes (the body of

the RND) with reduced noise and to extrapolate the RND beyond such observable strikes with

tails of flexible and reasonable shape.
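The idea of smoothing implied volatilities rather than option prices, and then differentiating the resulting call prices twice with respect to the strike, can be sketched as follows. This is a simplified illustration and not the procedure of Figlewski (2010) or of Appendix 2.A: the quadratic least-squares smile, the zero interest rate default, the finite-difference step and all function names are assumptions, and the tail extrapolation beyond the quoted strikes is omitted.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(spot, strike, vol, maturity, rate=0.0):
    """Black-Scholes price of a European call (no dividends)."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def fit_quadratic(xs, ys):
    """Least-squares fit y ~ a + b*x + c*x^2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def rnd_from_iv(spot, moneyness, ivs, maturity, rate=0.0, step=0.5):
    """Risk-neutral density on a strike grid spanning the quoted moneyness range."""
    a, b, c = fit_quadratic([mny - 1.0 for mny in moneyness], ivs)

    def call(strike):
        x = strike / spot - 1.0                      # moneyness centred at ATM
        return bs_call(spot, strike, a + b * x + c * x * x, maturity, rate)

    low, high = spot * moneyness[0], spot * moneyness[-1]
    n_steps = int(round((high - low) / step))
    grid = [low + i * step for i in range(n_steps + 1)]
    growth = math.exp(rate * maturity)
    # Breeden-Litzenberger: density = e^{rT} * second derivative of the call price in the strike.
    density = [growth * (call(k + step) - 2.0 * call(k) + call(k - step)) / step ** 2
               for k in grid]
    return grid, density


if __name__ == "__main__":
    # Hypothetical three-month smile at the seven moneyness levels used in this chapter.
    levels = [0.80, 0.90, 0.95, 1.00, 1.05, 1.10, 1.20]
    smile = [0.32, 0.27, 0.25, 0.23, 0.215, 0.205, 0.20]
    strikes, dens = rnd_from_iv(100.0, levels, smile, 0.25)
    print(strikes[0], strikes[-1], min(dens) > 0.0)
```

A quadratic fit is used here only to keep the example short and dependency-free; a smoothing spline, as in the cited approach, follows the quoted volatilities more closely.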

To calculate RND for the stocks of interest, we obtain daily IV data for seven moneyness

levels, i.e., 80, 90, 95, 100, 105, 110, and 120, at the three-month maturity. Implied volatilities

are extracted by inverting the Black-Scholes model using Bloomberg’s 16:00 hours

closing mid-prices (Bloomberg, 2008). In line with Figlewski (2010), IVs for the 80 to 95

moneyness levels are obtained from puts, while IVs for the 105 to 120 moneyness levels are

obtained from calls. For consistency with our IV skew measure, we use ATM IV from puts.

Because we intend to compare RND from banned stocks to non-banned ones, we compute

IV skews and extract RNDs for these two groups of stocks separately. More details on the

application of the method are included in Appendix 2.A.

We make use of extreme value theory (EVT) to measure implied jump risk5 because it

focuses on tail events, such as jumps in return distributions. EVT allows us to compare the

value-at-risk (VaR) implied by RNDs for banned and non-banned stocks. We first estimate

the tail shape parameter (φ) with the Hill (1975) estimator and then compute the VaR with the semi-parametric

quantile estimator (qp) of Hartmann et al. (2004). In the next step, we employ a bivariate EVT

method to calculate commonality in jumps and, hence, contagion risk from historical returns.

EVT is well suited to measure contagion risk because it does not assume any specific return

distribution. Our approach estimates how likely it is that one stock will experience a crash

beyond a specific extreme negative return threshold conditional on another stock crashing beyond

an equally probable threshold6.
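The univariate and bivariate steps can be illustrated with textbook versions of the estimators. The sketch below is an assumption-laden illustration and not the code behind Appendix 2.B: it takes losses as positive numbers, uses the standard Hill estimator of the tail index, a common form of the semi-parametric tail quantile, and a simple counting estimator of the conditional co-crash probability. All function names and the choice of the number of tail observations are hypothetical, and whether φ in the text denotes the tail index or its inverse is not settled here.

```python
import math


def hill_tail_index(losses, m):
    """Hill (1975) estimate of the tail index from the m largest losses."""
    x = sorted(losses, reverse=True)       # x[0] is the largest loss
    if not 0 < m < len(x) or x[m] <= 0:
        raise ValueError("need 0 < m < n and positive tail observations")
    threshold = x[m]                       # the (m+1)-th largest observation
    mean_log_excess = sum(math.log(x[j] / threshold) for j in range(m)) / m
    return 1.0 / mean_log_excess


def tail_quantile(losses, m, p):
    """Semi-parametric loss level exceeded with (small) probability p."""
    n = len(losses)
    threshold = sorted(losses, reverse=True)[m]
    alpha = hill_tail_index(losses, m)
    return threshold * (m / (n * p)) ** (1.0 / alpha)


def co_crash_probability(losses_a, losses_b, k):
    """Share of B's k worst days on which A also has one of its k worst days."""
    thr_a = sorted(losses_a, reverse=True)[k]
    thr_b = sorted(losses_b, reverse=True)[k]
    joint = sum(1 for a, b in zip(losses_a, losses_b) if a > thr_a and b > thr_b)
    return joint / k


if __name__ == "__main__":
    # Deterministic Pareto-type sample with tail index 3, used only as a toy input.
    n = 1000
    sample = [(i / (n + 1)) ** (-1.0 / 3.0) for i in range(1, n + 1)]
    print(hill_tail_index(sample, 50), tail_quantile(sample, 50, 0.001))
```

Because both thresholds in `co_crash_probability` are exceeded by the same number of observations, the two crash events are equally probable by construction, in line with the description above.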

We use the daily IV skews of individual European equities as a second measure of implied

jump risk7. The IV skew is calculated as the difference between the IV of three-month OTM

listed puts at the 80 percent moneyness level and ATM puts with the same maturity for every

stock in our sample. As with RNDs, we construct two indices of IV skew, one for banned

and the other for non-banned stocks, by equally averaging the stock-specific IV skews of the

5 We acknowledge that the expression “jump risk” can also refer to the physical or actual, real-world jump risk. However, in this chapter, we work with the implied, risk-neutral jump risk measure only. This measure of jump risk can be viewed as the sum of the actual (real-world) jump risk plus a risk premium. Hence, implied jump risk increases may be caused by increases in physical jump risk, increases in the risk premium, or both.

6 We refer to Hartmann et al. (2004) and Balla et al. (2014), who use the conditional co-crash (CCC) probability estimator, which is applied to each pair of stocks in our sample. Appendix 2.B includes a detailed discussion of both our employed univariate and bivariate EVT-methodologies.

7 A fat left tail in the RND of returns is a corollary to the fact that the IV skew is steep, see Bakshi et al. (2003).

constituents of each index. We also calculate single country versions of the banned and non-

banned IV skews for Belgium, France, Italy, and Spain. Table 2.2 presents descriptive statistics

for our IV skew measures for the entire sample period, both for the overall and single-country

levels.
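
As a minimal sketch of the IV skew construction described above, the following computes the stock-level skew and an equal-weighted index. The tickers and implied volatilities are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def iv_skew(iv_otm_put_80, iv_atm_put):
    """IV skew in volatility points: 80%-moneyness OTM put IV minus ATM put IV."""
    return iv_otm_put_80 - iv_atm_put


def skew_index(quotes):
    """Equal-weighted average of stock-level IV skews.

    `quotes` maps a ticker to (IV of the OTM put, IV of the ATM put).
    """
    return mean(iv_skew(otm, atm) for otm, atm in quotes.values())


# Hypothetical three-month IVs in percent; tickers and numbers are made up.
banned_quotes = {"BANK_A": (38.0, 30.0), "BANK_B": (36.5, 31.5)}
banned_index = skew_index(banned_quotes)
```

Applying `skew_index` once to the banned constituents and once to the non-banned constituents yields the two indices; restricting the input to one country yields the single-country versions.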

Table 2.2: Descriptive statistics

                    Overall                 Belgium                 France
Statistics          Non-banned  Banned      Non-banned  Banned      Non-banned  Banned
Average             5.73        6.49        5.33        7.11        6.23        8.57
Median              5.31        5.98        5.14        6.94        6.09        8.16
Standard deviation  1.22        1.63        1.40        2.25        1.28        2.04
Skew                1.15        1.91        0.67        0.48        0.56        0.54
Excess Kurtosis     0.88        5.24        0.29        0.64        -0.11       -0.51
Jarque-Bera         266.0***    1835.4***   81.0***     58.5***     54.9***     61.9***

                    Italy                   Spain
Statistics          Non-banned  Banned      Non-banned  Banned
Average             6.49        7.49        3.85        4.96
Median              6.23        7.24        3.19        3.98
Standard deviation  1.26        1.70        2.78        3.84
Skew                1.02        0.89        4.03        4.35
Excess Kurtosis     0.89        1.16        18.87       22.44
Jarque-Bera         214.5***    195.8***    18395.7***  25297.4***

This table provides descriptive statistics for the IV skews of non-banned and banned stocks calculated over the full sample period (February 15, 2008 to March 27, 2012) for the overall group of stocks as well as separately for Belgium, France, Italy, and Spain. We perform Jarque-Bera normality tests for all groups of stocks to infer whether IV skews are normally distributed or not. The null hypothesis (H0) for the Jarque-Bera test is that data is normally distributed. Rejection of H0 is denoted by ***, **, and *, at the one, five, and ten percent significance level, respectively.

Table 2.2 shows that the average and median IV skews for banned stocks are higher than

for non-banned stocks, an observation that pertains not only to the overall numbers but also to

each country separately. The standard deviation of the IV skew is higher for banned stocks. As

expected, the distributions of the IV skews are all positively skewed. All IV skew distributions

reported here have fat right tails and are not normal; thus, we use a non-parametric Mann-

Whitney U-test to make statistical inferences.
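
For readers unfamiliar with the Mann-Whitney U-test, the sketch below shows its mechanics: observations from both samples are ranked jointly and the rank sum of one sample is compared with its expectation under the null hypothesis. It is an illustration under simplifying assumptions (large-sample normal approximation, no tie correction in the variance), not the implementation used for the tables in this chapter.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b, z, two-sided p) using the normal approximation."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # ignores tie correction
    z = (u_a - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_a, u_b, z, p_two_sided
```

Because the test relies only on ranks, it does not require the IV skews to be normally distributed, which is the property that motivates its use here.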

2.3 Discussion of results

We first examine the short selling utilization rate8 and the performance of banned and non-

banned stocks around the ban announcement day. Figure 2.1 shows that the imposition of the

2011 European ban strongly affects the short selling of stocks in Belgium, France, Italy, and

Spain. Plot A indicates that the short selling utilization rates fall for banned stocks from 32 to

27 percent in the months of August and September 2011, especially after the ban announcement

on August 11. This drop in short selling utilization is widespread across the four crisis countries.

For Belgium and Italy, short selling utilization drops from 29 to 23 percent and 24 percent,

respectively. For France, it remains unchanged at approximately 10 percent during this period.

For Spain, it drops from 53 percent on the ban announcement day to 45 percent on September

8 The short selling utilization rate is calculated as Utilization = 100*(ValueOnLoan/InventoryValue), where ValueOnLoan is the beneficial owner value of the loan and InventoryValue is the beneficial owner inventory value. Utilization measures the value of a stock utilized for securities lending against the total value of inventory available for lending, i.e., its short selling demand.

Figure 2.1: Short positions in stocks around ban date. This figure presents the average short utilization rates calculated for banned (Plot A) and non-banned (Plot B) stocks in our sample. Utilization rates have been calculated for the full sample of banned and non-banned stocks as well as separately for the stocks in Belgium, France, Italy, and Spain.

30. We observe that such drops in the utilization rate come from the decrease in the value of

short selling, the numerator of the utilization rate, as inventories of stocks available for lending

in the four countries remain relatively unchanged. The decreasing utilization rates indicate

that the ban was effective in reducing short selling, despite market makers still being allowed

to short banned stocks.
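
The utilization rate defined in footnote 8 can be written as a one-line function. The sketch below uses hypothetical values in arbitrary currency units to show why a fall in the value on loan with an unchanged inventory lowers utilization.

```python
def utilization_rate(value_on_loan, inventory_value):
    """Utilization = 100 * (ValueOnLoan / InventoryValue), in percent."""
    if inventory_value <= 0:
        raise ValueError("inventory value must be positive")
    return 100.0 * value_on_loan / inventory_value


# Hypothetical figures: the lendable inventory stays at 100 while the value on
# loan falls from 32 to 27, so utilization drops by the same five points.
before_ban = utilization_rate(32.0, 100.0)
after_ban = utilization_rate(27.0, 100.0)
```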

The reduction in short selling of financial stocks is especially noteworthy when utilization

rates for banned and non-banned stocks are compared. Plot B shows that the utilization rate for

non-banned stocks increases from 16 to 18 percent, on average, during August and September

2011, an increase observed across all four euro countries.

Despite such changes in utilization rates, short selling of banned financial stocks far exceeds

the level measured in non-banned stocks. In August 2011, the average short selling utilization

rate for financial stocks is twice the level reported for stocks of the other sectors (32 percent in

Plot A vs. 16 percent in Plot B). Short sellers would have benefited much more from further

deterioration of financial stocks than from a potential weakness in the average stock.

Despite such a dichotomy, utilization rates for the overall market around the ban announcement

day were at their highest levels since 2010 for the four crisis countries. For Italy and Spain,

the short selling activity was concentrated in mid-caps (DataExplorers, 2011), which matches a

large short selling interest in their banks.

From the end of June until the ban announcement, the EuroStoxx Banks index dropped by

32 percent, whereas the EuroStoxx 50 index fell by 22 percent. In the first ten days of August

2011, before the ban, shares in European banks fell by 23 percent, whereas the European index

corrected by 17 percent. In the subsequent month after the ban was announced, European

banks’ stocks lost an additional 18 percent, while non-financial equity dropped by only 6 percent.

Our data on short selling positions and returns suggest that financial stocks were indeed

under strong pressure.

2.3.1 VaR levels and volatility skews

In the following analysis of VaR levels implied by RNDs, we distinguish five sub-periods: (1) the

U.S. recession period (February 15, 2008, to June 30, 2009); (2) the 2009/2010 stock market

rally (July 1, 2009, to April 26, 2010); (3) the European crisis period (April 27, 2010, to

August 10, 2011), initiated by Standard and Poor’s downgrade of Greece’s sovereign bonds to

junk status; (4) the ban period (August 11, 2011, to February 16, 2012); and (5) the post-ban

period (February 17, 2012, to March 27, 2012).

Panel A of Table 2.3 shows that during the ban period, the RND-implied VaR levels for

banned stocks, i.e., the perceived implied jump risks, are significantly higher than for non-

banned stocks. The same conclusion holds for the post-ban period. We observe that the

VaR levels from RNDs during the ban period are significantly higher than during the pre-ban

period. The ten percent VaR for banned stocks increases from 38 to 62 percent,

whereas for non-banned stocks, it increases to a much lesser extent, from 35 to 46

percent. We observe similar differences in extreme downside risk for these two sub-samples

at both the five- and one-percent VaR levels. Interestingly, the VaR levels for the post-ban

period are not significantly different from the ban period for both banned and non-banned

stocks. The other sub-sample that had very distinct downside risk features in comparison to

the preceding period was the 2009 stock market rally. The latter period had significantly lower

VaR priced in RND returns than the U.S. recession period, especially for banned stocks but

also for non-banned equity. The ten percent VaR for banned stocks was 50 percent during

the recession and 40 percent during the rally, whereas for non-banned stocks it was 47 and 41

percent, respectively. We find that the VaR levels for banned stocks were generally higher than

for non-banned stocks, and downside risk priced in RND reached its peak during the ban.

Table 2.3: Extreme downside risk and implied volatility skew

Panel A - Extreme downside risk

                                                   10% VaR                             5% VaR                              1% VaR
Sample split                                       Non-banned  Banned      NB vs. B    Non-banned  Banned      NB vs. B    Non-banned  Banned      NB vs. B
Full sample: 02/15/2008 to 03/27/2012              -0.31       -0.34       -1.30       -0.36       -0.40       -1.50       -0.51       -0.57       -1.9
US recession: 02/15/2008 to 06/30/2009             -0.47       -0.50       -0.90       -0.53       -0.58       -1.20       -0.73       -0.82       -1.8
2009 stock market rally: 07/01/2009 to 04/26/2010  -0.41**     -0.40**     0.30        -0.47**     -0.46**     0.20        -0.63**     -0.63**     0.00
Pre-ban European crisis: 04/27/2010 to 08/10/2011  -0.35**     -0.38       -1.10       -0.40*      -0.44       -1.00       -0.58       -0.62       -1.00
Ban period: 08/11/2011 to 02/16/2012               -0.46***    -0.62***    -4.00***    -0.52***    -0.70***    -4.00***    -0.69***    -0.94***    -4.10***
Post-ban period: 02/17/2012 to 03/27/2012          -0.45       -0.56       -2.10**     -0.50       -0.64       -2.20**     -0.66       -0.84       -2.20**

Panel B - Implied volatility skew

                                                   Overall                 Belgium                 France                  Italy                   Spain
Sample split                                       Non-banned  Banned      Non-banned  Banned      Non-banned  Banned      Non-banned  Banned      Non-banned  Banned
Full sample: 02/15/2008 to 03/27/2012              5.31        5.98        5.14        6.94        6.09        8.16        6.23        7.24        3.19        3.98
US recession: 02/15/2008 to 06/30/2009             5.02        5.99        4.70        6.97        5.46        7.83        6.31        8.02        3.47        4.87
2009 stock market rally: 07/01/2009 to 04/26/2010  5.05*       5.41***     4.46***     6.24***     5.69**      7.46***     5.99***     6.00***     2.42***     3.26***
Pre-ban European crisis: 04/27/2010 to 08/10/2011  5.78***     6.05***     5.63***     7.81***     6.42***     8.14***     5.82        7.11***     3.60***     3.98***
Ban period: 08/11/2011 to 02/16/2012               6.05**      7.34***     5.90***     7.28        6.99***     11.97***    7.14***     7.98***     3.06***     3.98
Post-ban period: 02/17/2012 to 03/27/2012          5.17***     6.37***     5.32***     5.25***     5.59***     10.65***    6.73***     5.49***     2.73***     4.98***

Panel A shows the ten, five, and one percent extreme downside risk estimates, or value-at-risk (VaR), of the risk neutral densities (RNDs) for all non-banned and all banned stocks during the full sample period, as well as for the five different sub-periods. Asterisks used as superscript to VaRs denote the outcome of the t-tests specified in Eqs. (3.2.7) and (2.B.3) across different sample periods. The column “NB vs. B” shows the t-stats of the test that compares VaRs of non-banned and banned stocks, using Eqs. (3.2.7) and (2.B.3). The null hypothesis (H0) is that there is no difference between the VaR from non-banned and banned stocks. Panel B provides the median IV skews for non-banned and banned groups of stocks during the same periods as in Panel A. Mann-Whitney (MW) U-tests are applied to the IV skew of paired sample splits to infer whether the medians are statistically different from each other. The null hypothesis (H0) for the MW U-test is that there is no difference between the two unrelated samples. In both panels, rejection of H0 is denoted by the asterisks ***, **, and *, at the one, five, and ten percent significance level, respectively. In Panel B, the superscripts are placed in the cell of the second sub-sample that is compared.

Figure 2.2 depicts the historical behavior of our proxy for implied jump risk, the average IV

skew, for banned and non-banned stocks. The ban period is highlighted, with the beginning of

the shadowed part representing the ban announcement day. We observe that between 2008 and

2012, spikes in average IV skews were well above their mean, coinciding with periods of market

turmoil. Figure 2.2 shows that the IV skews rise strongly in 2008, around the Lehman collapse,

and wane after the market trough in March 2009. In 2010, the IV skews jump on April 27, the

day the Greek government bonds were downgraded by Standard and Poor’s to “junk” status.

The IV skew then strongly reverses on October 18, 2010, when a task force of European leaders

agreed on a rescue package to improve the European Union’s economic governance in an effort

to tackle the financial crisis. The 2011 jump in IV skews coincides with the ban announcement

day on August 11. The announcement was not accompanied by any major event related to the

European financial crisis, to equity markets in general or to the financial sector. We observe

that on all three occasions, the IV skew of banned stocks exceeded the IV skews for non-banned

stocks.

Figure 2.2 shows that implied jump risk rises strongly just prior to the ban announcement

for both banned and non-banned stocks. Such spikes in implied jump risk occur during the day

on August 11, 2011, whereas the ban was officially announced only after the market closed9.

The increase in the average IV skew on August 11 for banned stocks is equivalent to 2.16

volatility points, while for non-banned stocks it is equivalent to 1.05 volatility points. Both

differences exceed the 99th percentile of all daily IV skew changes in our sample. On August

12, the average IV skew continued to rise sharply, by 0.78 volatility points for banned stocks,

a movement exceeding the 94th percentile of all daily IV skew changes in our sample. On that

9 It is unclear whether any information on the upcoming short selling ban leaked before the market closed on August 11. However, given that a ban on covered short selling on all stocks was already introduced in Greece on August 8, the extension of the ban to other European countries might have been expected by some market participants.

Figure 2.2: Averaged implied volatility skews for banned and non-banned stocks. This figure depicts the average IV skew for banned and non-banned stocks over the entire sample period. Averages are calculated over all stocks in Belgium, France, Italy, and Spain that have listed options. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The European short selling ban period (August 12, 2011, to February 16, 2012) is shadowed.

same day, the IV skew for non-banned stocks rose by 0.55 volatility points, exceeding the 96th

percentile. We observe that jumps in the IV skews around the ban announcement day are

outliers in our sample. More importantly, the rise in the IV skew for banned stocks is much

more pronounced than for non-banned stocks.
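
The percentile comparisons above can be reproduced with a simple rank calculation. The sketch below is illustrative only: the helper names and the sample of daily changes are hypothetical, and percentile conventions differ slightly across statistical packages.

```python
def share_below(changes, value):
    """Percentage of observed daily changes that are strictly smaller than `value`."""
    return 100.0 * sum(1 for c in changes if c < value) / len(changes)


def exceeds_percentile(changes, value, percentile):
    """True if `value` is larger than at least `percentile` percent of the sample."""
    return share_below(changes, value) >= percentile


# Hypothetical daily IV skew changes in volatility points (not the thesis data).
history = [-0.4, -0.2, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3, 2.16]
is_outlier = exceeds_percentile(history, 2.16, 90)
```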

We also observe from Figure 2.2 that after the announcement of the short sale ban, the IV

skew levels of both banned and non-banned stocks remained elevated for several weeks. During

the entire ban period, the IV skew of banned stocks remained relatively high, whereas the IV

skew for non-banned stocks slowly declined to pre-ban levels. This persistence in the high level

of implied jump risk indicates that the ban did not diminish market participants’ concerns

regarding European financial stocks.

Table 2.3, Panel B presents the corresponding medians for the whole period and for the five

sub-periods separately. We observe that the sub-periods 1, 3, and 4 have the highest IV skews.

They also roughly match the periods of market turmoil and volatility humps highlighted in

Figure 2.2: the global financial crisis, the European sovereign debt crisis, and the 2011 European

ban period. During the ban period, the median IV skew for banned stocks is 7.34, significantly higher

than during the pre-ban European crisis, when it was 6.05. For non-banned stocks, the median IV skew during the

ban period is 6.05, only slightly higher than during the European crisis, when it was 5.78.

Moreover, the median IV skew during the European crisis period is also significantly higher

than during the stock market rally. Figure 2.2 also indicates that the IV skew for banned

stocks exceeds that for non-banned stocks in most periods. We observe similar patterns in the

country-specific data. We find that the short selling ban contributes to an increase in implied

jump risk, especially with respect to banned stocks. Conversely, once the ban is lifted on

Figure 2.3: Sovereign CDS spreads, V2X and implied volatility skews around the ban date. This figure depicts the five-year sovereign CDS spreads for Belgium, France, Italy, and Spain as well as the V2X and the IV skews for non-banned and banned stocks. Sovereign CDS spreads proxy for country-specific information flow. V2X is the implied volatility index from the EuroStoxx 50 index, and it proxies for market-wide information flow. The V2X series is multiplied by 0.10 to fit the same scale as the IV skew. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The ban announcement on August 11, 2011, is indicated by the vertical line.

February 16, 2012, IV skews drop significantly for both banned and non-banned stocks.

Our empirical findings indicate that the short sale ban did not reduce the implied jump risk

during the European financial crisis. Otherwise, VaR levels and IV skews would have receded.

On the contrary, we find significant evidence that VaR levels increased strongly and that IV

skews jumped instantly when the ban was introduced and remained high during the period of

the ban, particularly for banned stocks.

A potential flaw in the empirical analysis so far is that large movements in the IV skew,

observed during the ban or at the time of its announcement, may have coincided

with the dissemination of other relevant information. If so, we cannot draw a clear connection

between the ban announcement and IV skew behavior. Figure 2.3 indicates that the IV skews

rise on August 11, even though we do not observe negative shocks within country-specific CDS

spreads and the V2X index10.

Figure 2.3 shows that information flow for the four crisis countries was relatively benign

around the ban announcement date, as CDS spread levels remain unchanged, whereas the

V2X even decreases after showing a large spike in the days preceding the ban. Equity market

movements around that period further support the presence of such positive information flow11.

10 We use CDS spreads to proxy the country-specific information flow. We adopt the V2X, the European counterpart of the VIX (the IV index for S&P500 index options), as a proxy for the European equity market information flow.

11 Figure 2.3 shows that moves in the country-specific IV skew match the sovereign CDS spread behavior in this period very well. CDS spreads moved sideways for Belgium, France, and Italy, while they rose for Spain. This divergence can be explained by the fact that on February 13, 2012, Spain’s sovereign debt rating was downgraded two notches by Moody’s, from A1 to A3, which was much more severe than the rating changes for the other three crisis countries.

The EuroStoxx 50 index rose by 2.86 percent on August 11 and by 4.15 percent on August 12,

whereas the EuroStoxx Banks index rose by 2.96 and 5.26 percent, respectively. Moreover, no

other major announcement was made during these days. The absence of negative information

strongly suggests that the ban announcement itself catalyzed the rise in implied jump risk.

2.3.2 Financial contagion risk

In this section, we assess the development of financial contagion risk, using average conditional

co-crash (CCC) probabilities. The average CCC probability measures the likelihood that a

banned (non-banned) stock crashes, given that another banned (non-banned) stock crashes.

We estimate the bivariate CCC probabilities for all pairs of banned and all pairs of non-banned

stocks using realized daily returns. Table 2.4, Panel A presents the results from the estimation

of Eq. (2.B.4) for the full sample and individual sub-samples. Over the full sample, the CCC

probability for the banned stocks is 32 percent, while for the non-banned stocks, it is 29 percent,

which is not significantly different. In the early sub-periods, the contagion risk of

banned stocks is at a level similar to that of the other stocks. However, during the

pre-ban European crisis period, we find that contagion risk for banned stocks (42 percent)

becomes significantly different from that for non-banned stocks (32 percent). Surprisingly, this

substantial difference is no longer observed during the ban period, when the CCC probability

for banned stocks decreases to 32 percent, while it increases to 41 percent for non-banned

stocks. This decrease in contagion risk for banned stocks is one of the major findings of this

chapter. Apparently, the imposition of the ban decreased systemic risk. This effect occurred

despite the increase in forward-looking implied jump risk across the same sample and period.
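
To build intuition for the CCC probability, the sketch below computes a plain empirical counterpart: among the days on which one stock falls below its own p-quantile, it counts the share of days on which the other stock also falls below its p-quantile. This is a simplification for illustration only. The estimates in Table 2.4 rely on the semi-parametric EVT estimator of Eq. (2.B.4), which is not reproduced here; the function names and data are hypothetical.

```python
def empirical_quantile(values, p):
    """Value at the lower p-quantile, taken as an order statistic."""
    ordered = sorted(values)
    k = max(round(p * len(ordered)), 1)  # number of observations in the tail
    return ordered[k - 1]


def co_crash_probability(returns_x, returns_y, p):
    """Empirical P(X <= q_x(p) | Y <= q_y(p)) for two aligned return series."""
    if len(returns_x) != len(returns_y):
        raise ValueError("return series must be aligned and of equal length")
    q_x = empirical_quantile(returns_x, p)
    q_y = empirical_quantile(returns_y, p)
    crash_days_y = [i for i, y in enumerate(returns_y) if y <= q_y]
    joint = sum(1 for i in crash_days_y if returns_x[i] <= q_x)
    return joint / len(crash_days_y)


# Two stylized cases: perfectly aligned crashes and perfectly offsetting moves.
series = [float(i) for i in range(100)]
same = co_crash_probability(series, series, p=0.05)
opposite = co_crash_probability(series, series[::-1], p=0.05)
```

Using the same probability p for both stocks mirrors the "equally probable threshold" in the text: each stock is compared with its own tail, so differences in volatility across stocks do not drive the result.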

In a next step, we analyze whether the CCC probabilities for banned and non-banned

stocks are different across samples. We observe that the CCC probabilities for banned stocks

during the U.S. recession (27 percent) and the 2009 equity market rally period (28 percent) are

not significantly different. The same assessment holds for non-banned stocks, where Panel A

indicates CCC probabilities at 26 and 23 percent for these two sub-periods, respectively. The

pre-ban period, however, witnesses an abrupt and statistically significant increase in the CCC

probability for banned (from 28 to 42 percent) and non-banned (from 23 to 32 percent) stocks.

Clearly, contagion risk is higher across the board once the European crisis is triggered, but it

is especially higher for financial stocks. After the ban is announced, banned stocks’ contagion

risk falls from 42 to 32 percent, while for non-banned stocks, contagion risk rises from 32 to 41

percent.

Table 2.4: Extreme downside risk and implied volatility skew

Panel A - Conditional co-crash probabilities

                                                   Conditional co-crash probabilities
Sample split                                       Non-banned  Banned      NB vs. B
Full sample: 02/15/2008 to 03/27/2012              0.29        0.32*       1.7
US recession: 02/15/2008 to 06/30/2009             0.26        0.27        0.4
2009 stock market rally: 07/01/2009 to 04/26/2010  0.23        0.28        1.6
Pre-ban European crisis: 04/27/2010 to 08/10/2011  0.32*       0.42*       2.2**
Ban period: 08/11/2011 to 02/16/2012               0.41        0.32        -1.3
Post-ban period: 02/17/2012 to 03/27/2012          NA          NA          NA

Panel B - Option trading volumes and put-call ratios

                                                   Put volume              Put-call volume ratio
Sample split                                       Non-banned  Banned      Non-banned  Banned
Full sample: 02/15/2008 to 03/27/2012              1,064       1,690       7.0         3.8
US recession: 02/15/2008 to 06/30/2009             877         1,377       7.1         4.1
2009 stock market rally: 07/01/2009 to 04/26/2010  1,200***    1,747***    7.0         3.4***
Pre-ban European crisis: 04/27/2010 to 08/10/2011  1,157       1,905**     6.3         3.7**
Ban period: 08/11/2011 to 02/16/2012               943***      1,727***    8.7***      3.8
Post-ban period: 02/17/2012 to 03/27/2012          1,245***    2,758***    7.9         5.7***

Panel A shows the average conditional co-crash (CCC) probabilities calculated by Eq. (2.B.4) among all non-banned and all banned stocks during the full sample period, as well as for the five different sub-periods. Asterisks used as superscript to CCC-probabilities denote the outcome of the t-tests specified in Eqs. (3.2.7) and (2.B.3) across different sample periods. The column “NB vs. B” shows the t-stats of the test that compares CCC-probabilities of non-banned and banned stocks, using Eqs. (3.2.7) and (2.B.3). The null hypothesis (H0) is that there is no difference between the CCC-probabilities from non-banned and banned stocks. Panel B shows the median daily trading volume, measured by the number of contracts traded in put options, as well as the median daily put-call volume ratio for all non-banned and banned stocks for the overall sample period and for the five different sub-periods. We apply Mann-Whitney U-tests to investigate whether the medians are statistically different from each other. The null hypothesis (H0) is that there is no difference between the populations of the two samples. In both panels, rejection of H0 is denoted by the asterisks ***, **, and *, at the one, five, and ten percent significance level, respectively. In Panel B, the superscripts are placed in the cell of the second sub-sample that is compared.

Another variable that potentially plays a major role for financial contagion risk is trading

activity. Bollen and Whaley (2004) suggest that the IV skew might be closely linked to trading

activity in the options market. They find that changes in the shape of the IV function are

directly related to net buying pressure on options from end-users’ public order flow. They

argue that end-users trade options for portfolio insurance, agency, and speculative reasons,

rather than for market-making reasons. Garleanu et al. (2009) confirm their findings and

observe that the size of the IV skew is positively and significantly related to demand pressures

from institutional investors seeking portfolio insurance.

We inspect daily put and call trading volumes as well as the put-call volume ratio as proxies

for trading pressure, as suggested by Dennis and Mayhew (2002). We measure volume as the

median number of contracts traded on a specific day for all stocks in the sample. We obtain

an overall put-call volume ratio by averaging the single-stock contracts. Again, we evaluate

these measures over the five periods previously identified in our data set. Table 2.4, Panel B

documents that the median number of single-stock puts for each banned stock traded per day

decreases significantly from 1,905 during the pre-ban European crisis period to 1,727 during the

ban period. For non-banned stocks, the median volume of puts also drops, from 1,157 during

the pre-ban period to 943 during the ban. The median put-call volume ratio for non-banned

stocks significantly increases during the ban, from 6.3 to 8.7, whereas the median put-call

volume ratio for banned stocks hardly changes.

The findings in Panel B provide no evidence that individual stock options, particularly puts,

experienced a large rise in trading activity. Thus, we find no evidence of a substitution effect

of the short selling of common stock into single-stock put options. We also find no evidence

that trading activity completely dried up during the ban period. This result is in line with

Grundy et al. (2012), who show that the overall volume of options trading dropped during the

2008 U.S. short selling ban. This behavior of trading volumes indicates that during the ban,

the IV skew does not increase as a result of increased buying pressure on options, as originally suggested

by Bollen and Whaley (2004) and Garleanu et al. (2009).

We assume that once short selling activity in banned stocks diminishes, the demand for

synthetic shorts via put options should increase. During the ban, informal market makers in

options (high-frequency traders and hedge funds)12 can no longer delta-hedge by short selling

stocks. Hence, they become less willing to sell protection, significantly impairing the supply of

puts.

As securities-lending programs were in less demand by short sellers during the ban, it

became cheaper to borrow stocks. In unreported results, we find that three common measures

of borrowing costs (the simple average fee, the simple average rebate, and the daily cost of

borrow score) indeed fall for banned stocks, from the date the ban was introduced until the end

of September 2011. Borrowing costs constitute, however, only one component of hedging costs,

and, depending on market circumstances, not necessarily the largest one. Costs incurred by

bid-ask spreads and price impact may easily outpace borrowing costs in times of thin trading

activity. Beber and Pagano (2013) illustrate that the 2008 U.S. ban is associated with an

increase in bid-ask spreads ranging between 1.64 and 1.98 percentage points among international

stocks where the average bid-ask spread is 3.93 percentage points13. Likewise, Battalio and

Schultz (2011) and Grundy et al. (2012) note that bid-ask spreads on options on banned stocks

also rose significantly during the 2008 U.S. short sale ban. In contrast, on August 11, 2011,

the fee for borrowing shares of the Spanish bank Santander was only 51 bps per annum. Therefore,

lower borrowing costs may not have helped much in encouraging market makers to write puts

during the ban.

A final explanation for a smaller supply of puts during the ban is that option sellers became

more risk-sensitive following equity market declines. Garleanu et al. (2009) find that end-users

have a net long-position in equity index options with a corresponding large net position in

OTM puts. Conversely, market makers are short in OTM puts. Following a market decline,

they become more reluctant to write additional puts. This behavior is fully consistent with

the overweighting of small probabilities that is a feature of the Cumulative Prospect Theory of Tversky

and Kahneman (1992). According to this model, agents making decisions under risk, such as

market makers, tend to perceive tail events as more probable than they are, causing them to

adopt a risk-averse attitude. In the days before the introduction of the European short sale

ban, equity markets strongly corrected on the back of an intensifying European financial crisis;

thus, it is not difficult to envision high risk aversion among market makers during the ban and

a diminished willingness to sell puts. Holders of financial stocks suddenly had to pay much

12 Boehmer et al. (2013) note that approximately 50 percent of all options trading is currently supplied by such informal market makers.

13 Sobaci et al. (2014) provide similar results for emerging markets.

higher prices to buy protection: three-month 80 percent moneyness OTM puts on financial

stocks, on average, became 16 percent more expensive on August 11, compared to the average

of the previous 21 trading days.

On the ban announcement day, the trading volume for puts on the EuroStoxx 50 index

reached 2,573,868, which is the second-highest daily trading volume for this instrument in our

sample14. A potential explanation for such a high trading volume is that after the imposition

of the ban, the skew from stock options relative to index options became too costly. The

spread between the IV skew from the EuroStoxx 50 index put options and single-stock puts,

which is normally highly positive, was just marginally positive during the ban, reaching zero on

December 20, 2011. Because index puts are far more liquid than single-stock puts, a liquidity

premium no longer existed, and a migration from single-stock puts to index puts took place.

Such an explanation is also in line with the “flight-to-liquidity” models suggested by Pastor

and Stambaugh (2003) and Acharya and Pedersen (2005).

2.3.3 Panel regression analysis

To further assess the effect created by the short selling ban and trading activity on IV skews,

we run a panel regression analysis with the IV skew (IVSkew) as the dependent variable.

This regression allows us to isolate the relationship between the IV skew, banned stocks, and

trading activity by controlling for other determinants of the IV skew, such as information flow

and idiosyncratic factors. We use the following firm-specific control variables: daily turnover

(Turnover), systematic risk (Beta), and firm size (Size). We use Turnover as a proxy for stock

liquidity, following Dennis and Mayhew (2002).

We calculate an individual stock’s daily turnover by dividing its daily trading volume by

its number of shares outstanding. The stock’s beta is our control variable for systematic risk.

The market return is assumed to be the equal-weighted average daily return for all stocks in

our sample. The daily estimation of the beta uses a rolling window of one year’s worth of

data, where the data begin one year before the first sample date. Firm size is calculated as the

number of shares outstanding on a specific day multiplied by the stock price.
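
The firm-specific controls described above reduce to simple ratios and a rolling regression slope. The sketch below illustrates them under stated assumptions: beta is taken to be the OLS slope of stock returns on the equal-weighted market return, and the rolling window is passed as a number of observations (the text specifies one year of data but not a day count). All function names and numbers are hypothetical.

```python
from statistics import mean


def turnover(daily_volume, shares_outstanding):
    """Daily turnover: trading volume divided by shares outstanding."""
    return daily_volume / shares_outstanding


def firm_size(shares_outstanding, price):
    """Firm size: shares outstanding multiplied by the stock price."""
    return shares_outstanding * price


def market_return(stock_returns_today):
    """Equal-weighted average of the daily returns of all sample stocks."""
    return mean(stock_returns_today)


def rolling_beta(stock_returns, market_returns, window):
    """OLS slope over the last `window` observations: cov(stock, market) / var(market)."""
    s = stock_returns[-window:]
    m = market_returns[-window:]
    mean_s, mean_m = mean(s), mean(m)
    covariance = sum((a - mean_s) * (b - mean_m) for a, b in zip(s, m))
    variance = sum((b - mean_m) ** 2 for b in m)
    return covariance / variance


# Hypothetical returns: the stock moves exactly twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00, -0.01]
stock = [2 * r for r in market]
beta = rolling_beta(stock, market, window=5)
```

Recomputing `rolling_beta` each day on the most recent window gives the daily beta series; starting the data one year before the first sample date, as described in the text, ensures that a full window is available from the first day.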

Control variables are uncorrelated with each other in both the cross-sectional and the time-

series dimension (unreported here). We employ de-trended levels of sovereign CDS spreads for

Belgium, France, Italy, and Spain (CCDS) and the V2X volatility index (V2X) as control

variables for country-specific and equity market information flows15. Additionally, we proxy

firm-specific information flows with daily stock returns (R), trading pressure via single-stock

put option trading volume (PVlme), and trading volume of puts on the EuroStoxx 50 index

(E50PVlme)16. Our resulting Model 2.1 is given as follows:

14 The heaviest trading in EuroStoxx 50 index puts took place on October 10, 2008, when the Belgian bank Dexia was bailed out and 2,604,185 contracts were traded.

15 Based on the Johansen cointegration test, we find no cointegration between the de-trended CDS spreads of the four crisis countries and the V2X index at the five-percent significance level.

16 Single-stock put option trading volume is computed as the average daily trading volume of puts divided by 1,000. Put trading volume is not used as an additional cross-sectional variable because data are only available for a limited set of stocks (122 of 186). E50PVlme is the daily trading volume of puts on the EuroStoxx 50


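\[
\begin{aligned}
\mathit{IVSkew}_{i,t} = {} & c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
& + D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.1}
\]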

where $D^{B}_{t}$ is a dummy variable equal to one if the date is within the ban period (August 11, 2011, to February 16, 2012), and zero otherwise. $D^{Bned}_{t}$ is a dummy variable equal to one if the underlying stock is a banned stock, and zero otherwise. $D^{PostB}_{t}$ is a dummy variable equal to one if the date is after the lifting of the ban (from February 17, 2012, onwards), and zero otherwise. An additional dummy variable, $D^{B}_{t} \ast D^{Bned}_{t}$, is created as the interaction term of the ban-period and banned-stock dummies. This variable captures the effect on the IV skew when two conditions hold: the stock is banned and the ban is in place.
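The dummy definitions above can be sketched in a few lines. The ban dates are taken from the text; the function name and the dictionary keys are hypothetical labels chosen for this illustration.

```python
from datetime import date

# Ban window as stated in the text.
BAN_START = date(2011, 8, 11)
BAN_END = date(2012, 2, 16)


def ban_dummies(day, is_banned_stock):
    """Return the dummy variables of Model 2.1 for one stock-day.

    The keys are illustrative names, not identifiers from the thesis.
    """
    d_b = 1 if BAN_START <= day <= BAN_END else 0      # ban in place
    d_postb = 1 if day > BAN_END else 0                # after the ban was lifted
    d_bned = 1 if is_banned_stock else 0               # stock subject to the ban
    return {
        "D_B": d_b,
        "D_Bned": d_bned,
        "D_B_x_Bned": d_b * d_bned,          # banned stock while ban is in place
        "D_PostB": d_postb,
        "D_PostB_x_Bned": d_postb * d_bned,  # banned stock after the ban
    }


during = ban_dummies(date(2011, 9, 1), is_banned_stock=True)
after = ban_dummies(date(2012, 2, 17), is_banned_stock=True)
before = ban_dummies(date(2010, 1, 4), is_banned_stock=False)
```

The interaction terms are simple products, so they equal one only when both underlying conditions hold.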

We use generalized least squares (GLS) to account for potential serial correlation in the

residuals. We estimate our panel regression over three different periods: (a) the full period,

ranging from February 15, 2008, to March 27, 2012; (b) the period that starts on April 27,

2010, when the European sovereign crisis is deemed to have begun, to March 27, 2012; and

(c) the ban period, ranging from August 11, 2011, to February 16, 2012. Table 2.5, Panel

A reports the regression results. Over the full sample period (column a), all coefficients are

statistically significant at the one-percent level, except for the dummy variable $D^{B}_{t}$ and the

post-ban dummy variables. The results for V2X, country CDS spreads, Beta, and Turnover

are in line with the results reported in the literature and with our expectations. We expected

V2X and CDS spreads to be positively related to the IV skew, as implied jump risk priced

for individual stocks is likely to increase with equity market volatility and country credit risk.

Contrary to our expectations, stock returns and size are positively related to the IV skew.

Nevertheless, our size-skew estimates are in line with the results reported in Engle and Mistry

(2008). They suggest that size proxies for beta, warranting a positive relationship between size

and skew.

The results obtained from our dummy variables over the full sample period confirm that

the ban positively affected the IV skew for banned stocks: $D^{B}_{t} \ast D^{Bned}_{t}$ has a positive sign and is statistically significant. The coefficient of this interaction term indicates that the

ban increases IV skews for banned stocks by 0.3 volatility points, which is economically relevant

because it amounts to approximately five percent of the median IV skew in our data set. This

is a strong result, given the large set of control variables used. This finding suggests that the

IV skew for banned stocks during the European short sale ban was abnormally high compared

to that for non-banned stocks and that for banned stocks in other periods. Furthermore, the

estimated coefficient of $D^{Bned}_{t}$ implies that financial stocks have IV skews that are, on average,

0.72 volatility points higher than IV skews for non-banned stocks. This finding is consistent

with our descriptive statistics provided in Table 2.2. The three dummy estimates confirm that

the IV skew for all stocks was higher during the ban, and the effect was more pronounced for

banned stocks.
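To make the reading of the three dummy estimates concrete, the short sketch below combines the full-sample coefficients from Table 2.5, Panel A, column (a). It is an illustrative calculation, not part of the original analysis: the variable names are hypothetical, all controls are held fixed, and the pre-ban period is the baseline. Note that the ban-period estimate (-0.180) is not statistically significant, so any sum involving it should be read with caution.

```python
# Full-sample estimates from Table 2.5, Panel A, column (a), in volatility points.
coef_ban_period = -0.180    # ban-period dummy (not statistically significant)
coef_stock_banned = 0.719   # banned-stock dummy
coef_interaction = 0.306    # ban-period dummy times banned-stock dummy

# Banned versus non-banned stocks before the ban.
gap_pre_ban = coef_stock_banned

# Banned versus non-banned stocks while the ban is in place.
gap_during_ban = coef_stock_banned + coef_interaction

# Change for a banned stock when moving from the pre-ban to the ban period.
banned_change = coef_ban_period + coef_interaction

# Change for a non-banned stock when moving from the pre-ban to the ban period.
non_banned_change = coef_ban_period
```

Under these assumptions the fitted gap between banned and non-banned stocks widens from about 0.72 to about 1.03 volatility points once the ban is in place.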

index divided by 1,000,000 and is used to capture the potential indirect substitution effect of trading pressure on single-stock puts by index puts.


Column b of Panel A shows that during the euro crisis period, all parameter estimates for control variables have identical signs and comparable statistical significance levels compared to the results obtained in estimating Model 2.1 over the full period (column a). The impact of the ban on the IV skews of banned stocks is even stronger, though. On average, the ban increases the IV skew for banned stocks by 0.45 volatility points, which amounts to roughly nine percent of the median IV skews across our data set. At the same time, for the average stock, IV skews decrease by 1.1 volatility points during the ban. These results support our hypothesis that investors seem to have differentiated between banned and non-banned stocks upon the ban introduction.

Table 2.5: Panel regression results

| | Panel A: (a) Full sample | Panel A: (b) Euro crisis | Panel A: (c) Ban period | Panel B: (a) Full sample | Panel B: (b) Euro crisis | Panel B: (c) Ban period | Panel C: (a) Full sample | Panel C: (a) Full sample, Default Prob. |
|---|---|---|---|---|---|---|---|---|
| Intercept | 2.823*** | 2.198*** | -0.283 | 2.951*** | 2.214*** | -0.264 | 2.680*** | 1.084*** |
| | (0.155) | (0.329) | (0.344) | (0.052) | (0.330) | (0.346) | (0.156) | (0.068) |
| V2X | 0.053*** | 0.094*** | 0.035*** | 0.050*** | 0.096*** | 0.029*** | 0.073*** | 0.031*** |
| | (0.004) | (0.011) | (0.009) | (0.001) | (0.011) | (0.009) | (0.004) | (0.002) |
| Country CDS | 0.006*** | 0.003*** | 0.008*** | 0.007*** | 0.003*** | 0.009*** | 0.003*** | 0.008*** |
| | (0.001) | (0.001) | (0.001) | (0.000) | (0.001) | (0.001) | (0.001) | (0.000) |
| Stock returns | 5.509*** | 9.663*** | 7.055*** | 5.843*** | 9.693*** | 6.997*** | 4.458*** | -0.203 |
| | (0.921) | (1.710) | (1.385) | (0.360) | (1.709) | (1.385) | (0.960) | (0.417) |
| Stock turnover | -19.317*** | -22.535*** | -25.602*** | -18.916*** | -22.178*** | -26.247*** | 4.749** | 67.677*** |
| | (1.834) | (2.468) | (3.313) | (1.376) | (2.466) | (3.318) | (2.318) | (1.964) |
| Stock size | 0.048*** | 0.066*** | 0.109*** | 0.050*** | 0.067*** | 0.109*** | 0.030*** | -0.035*** |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| Stock Beta | 1.015*** | 1.710*** | 3.702*** | 1.037*** | 1.712*** | 3.701*** | 1.482*** | 1.368*** |
| | (0.036) | (0.060) | (0.097) | (0.028) | (0.060) | (0.097) | (0.046) | (0.038) |
| Dummy Ban Period | -0.180 | -1.104*** | | 0.676*** | 0.358*** | | 1.195*** | 1.052*** |
| | (0.118) | (0.195) | | (0.028) | (0.064) | | (0.116) | (0.048) |
| Dummy Stock Banned | 0.719*** | 0.357*** | 0.037 | -0.305*** | -1.123*** | 0.030 | 0.475*** | -0.692*** |
| | (0.035) | (0.064) | (0.049) | (0.035) | (0.196) | (0.049) | (0.041) | (0.039) |
| Dummy Ban Period*Stock Banned | 0.306*** | 0.451*** | | 0.327*** | 0.474*** | | 0.311*** | 1.796*** |
| | (0.098) | (0.119) | | (0.069) | (0.119) | | (0.110) | (0.075) |
| Dummy Post Ban | -0.201 | -0.695*** | | -0.293*** | -0.704*** | | 1.025*** | 0.897*** |
| | (0.200) | (0.224) | | (0.059) | (0.224) | | (0.209) | (0.090) |
| Dummy Post Ban*Stock Banned | 0.280 | 0.389* | | 0.314** | 0.416* | | 0.491** | 0.978*** |
| | (0.207) | (0.230) | | (0.138) | (0.229) | | (0.247) | (0.154) |
| Overall put volume | 0.305*** | 0.185 | -0.297** | 0.274*** | 0.182 | -0.333** | 0.442*** | 0.111*** |
| | (0.074) | (0.147) | (0.144) | (0.020) | (0.147) | (0.146) | (0.073) | (0.027) |
| EuroStoxx50 put volume | -0.723*** | -1.358*** | 0.397** | -0.691*** | -1.372*** | 0.428** | -1.114*** | -0.541*** |
| | (0.119) | (0.220) | (0.189) | (0.033) | (0.220) | (0.191) | (0.119) | (0.045) |
| IV spread | | | | 0.062 | -0.619 | 3.304** | | |
| | | | | (0.106) | (0.727) | (1.509) | | |
| Default Probability | | | | | | | -12.356*** | |
| | | | | | | | (0.498) | |
| R2 | 0.1046 | 0.1289 | 0.2757 | 0.1076 | 0.1308 | 0.2765 | 0.1452 | 0.2478 |
| # Obs (Unbalanced Panel) | 146201 | 73327 | 21298 | 142048 | 73327 | 21298 | 74622 | 84095 |

Panel A reports the panel regression results for Model 2.1. Panel B reports the panel regression results for Model 2.2. Panel C reports the panel regression results for Models 2.3 and 2.4. The dependent variable for Models 2.1, 2.2 and 2.3 is the IV skew. Model 2.4 uses probability of default from single-name CDS (PD) as the dependent variable. We distinguish three different periods: (a) full sample (from February 15, 2008 to March 27, 2012), (b) euro crisis (from April 27, 2010 to March 27, 2012), and (c) ban period (from August 11, 2011 to February 16, 2012). The single stock IV skew is the dependent variable and information flow (Country CDS and V2X), firm-specific control variables (Return, Turnover, Size, Beta), trading volume on single put options (Put volume) and on index options (EuroStoxx50 put volume), a proxy for supply shift on option markets (IV spread), firms’ probability of default from single-name CDS market (PD) and dummies are the explanatory variables. The intercept is estimated as common to all cross-sections and no weighting is used in the cross-sections for estimation. Residuals are not normal for most cross-sections. We apply White heteroskedasticity-consistent standard error and covariance estimates. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

The empirical results change more markedly when we estimate Model 2.1 for the 2011 European ban period. Column c shows that all control variables still have the same signs and the results are strongly statistically significant; however, the estimate of $D^{Bned}_{t}$ is no longer statistically significant. Because this sub-sample covers only the ban period, the dummies $D^{B}_{t}$, $D^{PostB}_{t}$, $D^{B}_{t} \ast D^{Bned}_{t}$, and $D^{PostB}_{t} \ast D^{Bned}_{t}$ are no longer applicable. The insignificance of $D^{Bned}_{t}$ suggests that, within the ban period, financial stocks are no longer associated with higher IV skews relative to the average stock.


The lack of significance of $D^{Bned}_{t}$ is, however, connected to its cross-correlation with beta for financial stocks during the ban (i.e., 1.30 relative to 0.90 for non-banned stocks). Additionally, PVlme becomes negatively and significantly related to IVSkew. Thus, a rise in the skew during

the ban period is associated with a lower volume of single-stock puts. This result is consistent

with our hypothesis that a supply shift drove the IV skew during the ban rather than a change

in demand. In such a setting, large upward movements in the skew could have been caused by

low trading volumes in OTM puts. At the same time, the link between E50PVlme and IVSkew

turns positive and significant. This relation is explained by the above-noted increase in the

volume of index puts traded, in parallel with the supply-led rise in the IV skew during the ban.

A ban may be considered ineffective when selling pressure migrates from banned securities

to alternative instruments. However, in the case of the 2011 European short sale ban, we

observe that the migration of selling pressure from financial stocks to put options on European

indices has not jeopardized the efficacy of the short sale ban. As a result of the migration,

the ban appears to have diverted selling pressure initially concentrated in financial stocks to

a larger share of the market. This hypothesis is consistent with the fact that contagion risk

decreased for banned stocks during the ban but increased for non-banned stocks.

When the short sale ban was introduced on August 11, 2011, any further selling pressure

on financial stocks could have led to destabilizing shocks and financial contagion. The price of

OTM puts on banned stocks rose as a result of lower trading volume rather than through a

substitution effect. The richness of OTM puts made it substantially more expensive for market

participants to take a synthetic short position. Hence, imposition of the ban likely helped to

curb downward price pressure, which benefited financial sector stability.

In a next step, we analyze whether a supply shift in the options market was an important

driver of IV skews during the ban. As market makers use their bid-ask quotes for inventory

management, spread measures from options markets indicate whether a supply shift occurred

around the ban or not. We calculate two bid-ask spread-based measures for put options: (i) percentage spread, i.e., (ask-mid)/mid, and (ii) IV spread, i.e., δIV = δC/Vega, to evaluate the impact on IV caused by market makers’ inventory management17. Percentage spread represents

the percentage of the put mid-price that market makers charge to supply an option. IV spread

represents the translation of percentage spread into volatility points, i.e., how many volatility

points market makers charge to supply an option. We evaluate the behavior of percentage

spread and IV spread in the full sample and in our sub-samples by calculating median spreads

of these two metrics across put options on the 28 stocks in our sample that belong to the

EuroStoxx 50 index18. The results are shown in Table 2.6.
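The two spread measures can be illustrated with a short sketch. The quotes below are hypothetical, the function names are ours, and Vega is assumed to be expressed as the change in option price per one volatility point, so that the IV spread comes out in volatility points; the thesis does not state the units of the Vega series it uses.

```python
from statistics import median


def percentage_spread(ask, mid):
    """Share of the mid-price charged by the market maker: (ask - mid) / mid."""
    return (ask - mid) / mid


def iv_spread(ask, mid, vega):
    """Half-spread translated into volatility points: (ask - mid) / Vega.

    Assumption: `vega` is the price change per one volatility point.
    """
    return (ask - mid) / vega


# Hypothetical ATM put quotes for three stocks: (ask, mid, vega).
quotes = [
    (2.10, 2.00, 0.40),
    (1.58, 1.50, 0.32),
    (3.24, 3.00, 0.60),
]

pct_spreads = [percentage_spread(a, m) for a, m, _ in quotes]
iv_spreads = [iv_spread(a, m, v) for a, m, v in quotes]

# Cross-sectional medians, as used for the sub-period comparison.
median_pct = median(pct_spreads)
median_iv = median(iv_spreads)
```

For the first hypothetical quote, a half-spread of 0.10 on a mid-price of 2.00 is a percentage spread of five percent, and with a Vega of 0.40 it corresponds to 0.25 volatility points.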

17 IV spread uses options’ Greek Vega, i.e., δC/δIV, to obtain δIV, where δC is the difference between ask and mid-prices. Option prices and Vegas are from ATM options and are obtained from Bloomberg and Datastream. Although the IV spread measure only estimates the impact on IV caused by changes in spreads from ATM options, we assume that such increase in spread is also indicative of supply shift on OTM options and, consequently, on the IV skew. This assumption is conservative, as bid-ask spreads of OTM options are typically higher than those of ATM options due to the lower liquidity of OTM options.

18 From the full Eurostoxx 50 index sample, we discard those stocks for which the required options data are not available.


Table 2.6: Robustness checks

| Sample split | Percentage spread: Non-banned | Percentage spread: Banned | IV spread: Non-banned | IV spread: Banned |
|---|---|---|---|---|
| Full sample (02/15/2008 to 03/27/2012) | 0.088 | 0.069 | 0.057 | 0.072 |
| U.S. recession (02/15/2008 to 06/30/2009) | 0.08 | 0.06 | 0.06 | 0.041 |
| 2009 stock market rally (07/01/2009 to 04/26/2010) | 0.081 | 0.058 | 0.050*** | 0.039*** |
| Pre-ban European crisis (04/27/2010 to 08/10/2011) | 0.099*** | 0.076*** | 0.075** | 0.082*** |
| Ban period (08/11/2011 to 02/16/2012) | 0.090*** | 0.089*** | 0.025*** | 0.149*** |
| Post-ban period (02/17/2012 to 03/27/2012) | 0.072*** | 0.083** | 0.018*** | 0.166 |

This table shows the median daily Percentage spread measure as well as the median IV spread measure for non-banned and banned stocks that belong to the EuroStoxx 50 index for the overall sample period and for the five different sub periods. The Percentage spread is defined as (ask-mid)/mid, where ask is the asking price of an ATM option, and mid is the mid-price of an ATM option. This metric represents the percentage of the mid-price that is charged by market makers to sell an option. The IV spread is defined as δIV = δC/Vega, where δC is the difference between ask- and mid-prices, i.e., the spread, and Vega is obtained for ATM options. We apply Mann-Whitney U-tests to assess whether the medians are statistically different from each other. The null hypothesis (H0) is that there is no difference between the populations of the two samples. Rejection of the (H0) is denoted by the asterisks ***, **, and *, indicating significance at the one, five, and ten percent level, respectively.
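The Mann-Whitney U-test mentioned in the table note can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used to produce the table: it relies on the large-sample normal approximation without tie or continuity corrections, the function names are ours, and the two samples are hypothetical daily IV spreads for two sub-periods.

```python
import math


def average_ranks(values):
    """Rank observations from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p-value) using the normal approximation.

    Simplification: no tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - n1 * n2 / 2) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, p_value


# Hypothetical daily IV spreads in two sub-periods.
pre_ban = [0.070, 0.081, 0.079, 0.085, 0.090, 0.076]
ban = [0.140, 0.151, 0.149, 0.133, 0.160, 0.155]
u_pre, u_ban, p = mann_whitney_u(pre_ban, ban)
```

In this toy example every ban-period observation exceeds every pre-ban observation, so the U statistic for the pre-ban sample is zero and the null hypothesis of identical populations is rejected at conventional levels.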

The IV spread metric behaves in line with percentage spread. The IV spread is relatively low

during both the U.S. recession and the 2009 stock market rally, with the latter period reporting statistically significantly lower spreads than the former. The pre-ban period experiences a sudden

and statistically significant rise in IV spread, from 0.050 to 0.075, for non-banned stocks and

from 0.039 to 0.082 for financial stocks. The IV spread continues to rise during the ban period

for banned stocks, from 0.082 to 0.149. In contrast, it falls by two-thirds for non-banned stocks,

from 0.075 to 0.025. During the post-ban period, IV spread continues to rise for banned stocks,

whereas it falls for non-banned stocks19. These results from our two spread measures confirm

that during the ban, market makers widened their spreads for options on financial stocks, while

no such supply shift seems to have occurred for options on the other stocks.

To formally test the overall impact of options bid-ask spread on IV skew, we specify our

Model 2.2, which comprises Model 2.1 with the addition of the IV spread as an explanatory

variable, as follows:


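\[
\begin{aligned}
\mathit{IVSkew}_{i,t} = {} & c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
& + D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \mathit{IVspread}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.2}
\]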

Table 2.5, Panel B presents the estimates of Model 2.2. We observe that IV spread has a

statistically significant (positive) relation with IV skews during the ban period but not during

the other two periods. During the ban period, on average, a one volatility point increase in IV spread is linked to a 3.30 volatility point increase in IV skew. Within the full sample and during the euro crisis period, however, rises in IV spread have no statistically significant effect on IV skew. Most explanatory variables in Model 2.2 have the same signs and similar significance levels as observed in the estimation of Model 2.1. This is always the case for the joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$.

More intuitively, Figure 2.4 shows the jump in IV skews around the ban announcement day

and a coincident large spike in IV spread.

19 The rise in IV spread during the post-ban period is mainly caused by Spain, which matches the behavior of Spanish stocks’ IV skew and sovereign CDS in such periods. This rise is likely caused by the country’s sovereign debt rating downgrade by Moody’s on February 13, 2012.


Figure 2.4: Implied volatility skews and IV spread around the ban date. This figure depicts the average IV skews for banned and non-banned stocks as well as the average IV spread for the 28 stocks in our sample that belong to the EuroStoxx 50 index from July 11, 2011, to December 30, 2011. The IV skew per stock is calculated as the difference between the IV of the 80 percent moneyness OTM put option and the ATM put option. The ban announcement on August 11, 2011, indicated by the vertical line, coincides with a large spike in IV spread and with large increases in the IV skew for banned and for non-banned stocks.

We see that on August 11, 2011, the banned stocks’ average IV skew rose by 2.16 volatility

points, whereas for our sample of 28 stocks, this increase was 1.77 volatility points. Of these

1.77 volatility points, a rise of 0.32 volatility points in IV skew (approximately 18 percent) was

caused by a widening of the bid-ask spread, as suggested by the IV spread variable shown in

Figure 2.4. Due to the conservative nature of this variable, which is based on ATM options

rather than on OTM options, such an impact on IV skew coming from bid-ask spreads is

material. These findings reinforce our view that IV skews have risen due to a supply shift

among market makers and other options providers, rather than further selling pressure on

financial stocks via options.

In a final step, we incorporate information from the fixed income market into our panel

regression analysis. We use the probability of default, following Hull et al. (2005), who build

on the Merton (1974) credit risk model. Hull et al. (2005) find that implied volatility skews

from single-stock options are linked to the firms’ default risk. We specify the probability of

default both as a stock-specific information flow proxy (Model 2.3) and to replace the dependent

variable in Model 2.1, which leads to Model 2.4. Models 2.3 and 2.4 are estimated for the full

sample, ranging from February 15, 2008, to March 27, 2012. We use the same GLS panel

regression approach as in Model 2.1, with the following specifications:


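\[
\begin{aligned}
\mathit{IVSkew}_{i,t} = {} & c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
& + D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \mathit{PD}_{i,t} + \varepsilon_{t},
\end{aligned}
\tag{2.3}
\]

and

\[
\begin{aligned}
\mathit{PD}_{i,t} = {} & c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + D^{B}_{t} + D^{Bned}_{t} \\
& + D^{B}_{t} \ast D^{Bned}_{t} + D^{PostB}_{t} + D^{PostB}_{t} \ast D^{Bned}_{t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \varepsilon_{t},
\end{aligned}
\tag{2.4}
\]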


where $\mathit{PD}_{i,t}$ is the probability of default20 implied by the 5-year CDS for firm i at time t.

Because CDS data are not available for all firms in our sample, the number of cross-sections

used in Model 2.3 and Model 2.4 equals 83. Table 2.5, Panel C reports the regression results

for both models.
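As a rough intuition for how a CDS spread maps into a default probability, the sketch below uses the flat hazard-rate approximation (often called the credit triangle). This is explicitly not the ISDA standard model used in the thesis; it is a simplified stand-in. The 40 percent recovery ratio and the 5-year horizon follow the text, while the function name and the 200 basis point spread are hypothetical.

```python
import math


def default_probability(cds_spread, recovery=0.40, horizon_years=5.0):
    """Cumulative default probability under a constant hazard rate.

    Approximation: hazard = spread / (1 - recovery), and
    PD(T) = 1 - exp(-hazard * T). The ISDA standard model instead
    bootstraps the hazard curve from the full premium and protection legs.
    """
    hazard = cds_spread / (1.0 - recovery)
    return 1.0 - math.exp(-hazard * horizon_years)


# Hypothetical 5-year CDS spread of 200 basis points.
pd_5y = default_probability(0.02)
```

Under these assumptions a 200 basis point spread implies a five-year default probability of roughly 15 percent, and the mapping is increasing in the spread and decreasing in the recovery ratio.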

The second-to-last column in Panel C shows that the Model 2.3 estimates for the full period

are consistent with the Model 2.1 estimates (column a) for the joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$, with a significant coefficient of 0.3. The dummy $D^{Bned}_{t}$ has the same sign and statistical significance as in Model 2.1. The coefficient of the $\mathit{PD}_{i,t}$ variable is negative and strongly statistically

significant. This negative link supports our hypothesis that the ban itself is responsible for an

increase in implied jump risk for banned stocks.

The results in Panel C show that the risk premium priced in CDS default probabilities does not increase during the ban, as was observed for the IV skews in Table 2.3. We argue

that the implied jump risk rose due to an increase in physical jump risk, not to an increase

in the risk premium. This result seems to be in line with our conclusion that the increase in

implied jump risk during the ban was due to a supply shift instead of further selling pressure or

increased risk premium required by investors to hold financial stocks. The explanatory power

for Model 2.3 (14.5 percent) is higher than for Model 2.1 (10.5 percent), indicating that $\mathit{PD}_{i,t}$

is a powerful variable in explaining the dynamics of jump risk.

The last column of Panel C reports the estimates for Model 2.4, where $\mathit{PD}_{i,t}$ is the dependent variable. We see that the ban period increased the probability of default for banned stocks by 1.8 percent, on average, as evidenced by the coefficient of the joint dummy $D^{B}_{t} \ast D^{Bned}_{t}$. The coefficient of the dummy $D^{B}_{t}$ is also positive, meaning that the probability of default rose across the board once the ban was introduced. These findings suggest that the ban negatively impacted fixed income markets because the probability of default increased slightly after the introduction of the ban.

2.3.4 Robustness Tests

In Model 2.1, we observe shifts in the signs of PVlme and E50PVlme across different sample

periods. Hence, we now run an additional GLS panel regression as a robustness check to control

for any influence of the short sale ban. We estimate a reduced form of Model 2.1 that excludes

all dummies related to the ban, while using only pre-ban data. Thus, our Model 2.5 is specified

as follows:

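\[
\mathit{IVSkew}_{i,t} = c + \mathit{V2X}_{t} + \mathit{CCDS}_{i,t} + R_{i,t} + \mathit{Turn}_{i,t} + \mathit{Size}_{i,t} + \mathit{Beta}_{i,t} + \mathit{PVlme}_{t} + \mathit{E50PVlme}_{t} + \varepsilon_{t},
\tag{2.5}
\]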

where the variables are defined as in Model 2.1. We estimate Model 2.5 for (a) the entire

pre-ban period (from February 15, 2008, to August 10, 2011); (b) the U.S. recession period

20 Probability of default implied by CDS spreads is calculated using the ISDA standard model. The recovery ratio is 40 percent.


(from February 15, 2008, to June 30, 2009); (c) the stock market rally period (from July 1,

2009, to April 26, 2010); and (d) the European sovereign crisis until the last trading day before the

ban was implemented (from April 27, 2010, to August 10, 2011). Panel A of Table 2.7 presents

the regression results of Model 2.5.

Table 2.7: Robustness checks

| | Panel A (Model 2.5): All (pre-ban), Vol skew | Panel A (Model 2.5): US recession, Vol skew | Panel A (Model 2.5): Market rally, Vol skew | Panel A (Model 2.5): Euro Crisis, Vol skew | Panel B (Model 2.1): All | Panel B (Model 2.1): Euro Crisis | Panel B (Model 2.1): Ban | Panel C (Model 2.1): Yan (2011) 95 minus 105 | Panel C (Model 2.1): IVSkew adjusted for ask-prices | Panel D (Model 2.1): All, Vol skew |
|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 3.213*** | 2.196*** | 6.035*** | 2.429*** | 2.893*** | 1.981*** | -0.728** | 0.197** | 4.016*** | 2.574*** |
| | (0.170) | (0.102) | (0.281) | (0.473) | (0.164) | (0.353) | (0.335) | (0.100) | (0.190) | (0.149) |
| V2X | 0.052*** | 0.052*** | -0.022** | 0.125*** | 0.050*** | 0.103*** | 0.037*** | 0.013*** | 0.076*** | 0.065*** |
| | (0.005) | (0.002) | (0.010) | (0.017) | (0.004) | (0.012) | (0.009) | (0.003) | (0.005) | (0.004) |
| Country CDS spread | 0.005*** | 0.025*** | 0.024*** | 0.000 | 0.006*** | 0.003*** | 0.009*** | -0.003*** | 0.000 | |
| | (0.001) | (0.001) | (0.001) | (0.002) | (0.001) | (0.001) | (0.001) | (0.000) | (0.001) | |
| Country Bond spread | | | | | | | | | | -0.130*** |
| | | | | | | | | | | (0.034) |
| Stock returns | 4.861*** | 4.085*** | 3.569*** | 11.958*** | 5.383*** | 9.735*** | 7.090*** | -0.983 | 3.899*** | 5.346*** |
| | (1.060) | (0.667) | (1.234) | (2.927) | (0.995) | (1.832) | (1.374) | (0.670) | (1.347) | (0.934) |
| Stock turnover | -22.602*** | -18.110*** | -5.120 | -30.131*** | -19.694*** | -23.984*** | -27.241*** | -10.699*** | -39.059*** | -17.274*** |
| | (2.200) | (3.245) | (4.332) | (3.727) | (1.890) | (2.572) | (3.330) | (2.585) | (4.812) | (1.907) |
| Stock size | 0.042*** | 0.034*** | 0.038*** | 0.054*** | 0.049*** | 0.069*** | 0.114*** | 0.011*** | 0.008*** | 0.048*** |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.000) | (0.001) | (0.001) |
| Stock Beta | 0.939*** | 0.944*** | 0.383*** | 1.262*** | 1.002*** | 1.739*** | 3.893*** | 1.012*** | 0.955*** | 0.990*** |
| | (0.031) | (0.046) | (0.037) | (0.052) | (0.040) | (0.068) | (0.099) | (0.033) | (0.063) | (0.036) |
| Dummy Ban Period | | | | | -0.184 | -1.217*** | | 0.304*** | 1.334*** | 0.399*** |
| | | | | | (0.124) | (0.207) | | (0.071) | (0.066) | (0.117) |
| Dummy Stock Banned | | | | | 0.712*** | 0.339*** | 0.087 | 0.418*** | 1.987*** | 0.745*** |
| | | | | | (0.036) | (0.068) | (0.056) | (0.023) | (0.143) | (0.034) |
| Dummy Ban Period*Stock | | | | | 0.318*** | 0.497*** | | -0.033 | 0.379** | 0.450*** |
| | | | | | (0.106) | (0.128) | | (0.058) | (0.151) | (0.097) |
| Dummy Post Ban | | | | | -0.277 | -0.743*** | | -0.835*** | 1.414*** | 0.194 |
| | | | | | (0.210) | (0.237) | | (0.117) | (0.261) | (0.203) |
| Dummy Post Ban*Stock | | | | | 0.438** | 0.578** | | 0.479*** | 0.392 | 0.343 |
| | | | | | (0.223) | (0.244) | | (0.119) | (0.314) | (0.210) |
| Overall put volume | 0.364*** | 0.052 | -0.016 | 0.370* | 0.341*** | 0.236 | -0.274* | 0.490*** | 0.396*** | 0.273*** |
| | (0.081) | (0.054) | (0.056) | (0.201) | (0.077) | (0.156) | (0.140) | (0.047) | (0.087) | (0.074) |
| EuroStoxx50 put volume | -0.901*** | -0.252*** | -0.165 | -2.188*** | -0.724*** | -1.452*** | 0.423** | -0.710*** | -0.808*** | -0.752*** |
| | (0.132) | (0.082) | (0.164) | (0.308) | (0.124) | (0.233) | (0.184) | (0.077) | (0.144) | (0.120) |
| R2 | 0.0830 | 0.1412 | 0.2757 | 0.0987 | 0.0994 | 0.1271 | 0.2854 | 0.0315 | 0.2013 | 0.1021 |
| Observations | 119759 | 44644 | 28230 | 46735 | 128757 | 64470 | 18702 | 148757 | 25152 | 146201 |

Panel A reports the panel regression results for Model 2.5, using only the pre-ban period data. We distinguish four sub-periods: (a) Full pre-ban period: Feb 15, 2008 to Aug 10, 2011; (b) U.S. recession: Feb 15, 2008 to Jun 30, 2009; (c) Market rally: Jul 1, 2009 to Apr 26, 2010; and (d) Euro crisis: Apr 27, 2010 to Aug 10, 2011. Panel B reports the estimates after removal of the Belgian data from the full sample. Here we distinguish three different periods: (a) Full period: Feb 15, 2008 to Mar 27, 2012; (b) Euro crisis: Apr 27, 2010 to Mar 27, 2012; and (c) Ban period: Aug 11, 2011 to Feb 16, 2012. Panel C reports the regression results for Model 2.1, where the explained variable IV skew is substituted by (a) Yan (2011) 95 minus 105 IV skew measure, and by (b) our proxy for the IV skew measure from ask-prices. Panel D reports the regression results for Model 2.1, where the country CDS spreads are replaced by sovereign spreads versus Germany. The single stock IV skew is the dependent variable and information flow (Country CDS spread, Country Bond Spread and V2X), firm-specific control variables (Return, Turnover, Size, Beta), trading volume on single put options (Put volume) and on index options (EuroStoxx50 put volume), and dummies are the explanatory variables. The intercept is set equal in all cross-sections and no weighting is used. Residuals are not normal for most cross-sections. We report White heteroskedasticity-consistent standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

The first column of Table 2.7 shows that the Model 2.5 estimates for the pre-ban period

are consistent with the Model 2.1 estimates (Table 2.5, Panel A, column a) for the full sample

period. Hence, we find that increased trading activity in single-stock puts is linked to a higher IV skew of single-stock options. This relation confirms the findings of Bollen and Whaley (2004) and Garleanu et al. (2009), who link trading pressure to IV skews. Columns b to d of Panel A show that the estimates of the Model 2.5 parameters during all three sub-periods have the same signs and statistical significance levels as those obtained over the full pre-ban period (column a). Hence, the findings remain stable across various time periods and across two different model specifications.

As an additional robustness check, we analyze whether the IV skews for stocks in other

European countries increased around the date of the short sale ban announcement. Such an

increase could be evidence of financial contagion effects in options markets. If so, the steep rise

in implied jump risk would also be observed in other European countries that did not adopt the

ban and that were vulnerable to or already hit by the financial crisis. European countries that

fit such criteria are Greece, Ireland, and Portugal (Grammatikos and Vermeulen (2012)). We

compile the IV skew data for only Ireland and Greece because Portugal does not have a public

equity options market. In unreported results, we find no indication that implied jump risk for

these stocks materially changes when the short selling ban is introduced. These observations

strengthen our earlier conclusion that the rise in the level of implied jump risk on the day of the

ban announcement is connected to the ban itself, as opposed to other reasons, such as financial

contagion.

In another robustness check, Panel B reports the results of estimating Model 2.1 after

removing the Belgian shares from the sample. A potential justification for excluding the Belgian

data from the analysis is that the Belgian banks in particular experienced relatively heavy

government intervention during the crisis period. This intervention may distort the estimations

for Belgium. However, Panel B of Table 2.7 shows that the results do not materially change

after the removal of the Belgian data.

Our findings are also not altered when we use the IV slope measure of Yan (2011), which is

the IV of close-to-ATM puts minus the IV of calls, as the dependent variable instead of our IV

skew measure (see Panel C). Because our measurement of the IV skew is based on mid-prices,

which may be unaffected by bid-ask widening due to market makers’ response to the market

turmoil, we also test whether our results hold if our IV skew measure comes from ask prices.

We hypothesize that the ask price-based IV skew is biased upwards by the wider than normal

bid-ask spreads during the European sovereign crisis and the ban period. Our analysis shows

that the main findings still hold when we add the IV spread variable to the IV skew on the

left-hand side of Model 2.1. This result, reported in Panel C, indicates that our regressions are

not biased by the use of IV from mid-prices in the construction of our explained variable, the

IV skew.

As a final robustness test of Model 2.1, we substitute CDS spreads with sovereign spreads

from government bonds. The intuition behind this robustness test is that uncovered

positions in sovereign European CDS were also banned on November 1, 2012. We calculate

spreads vis-a-vis Germany, using a maturity of ten years. Panel D indicates that our findings

are affected rather little by this substitution and remain robust.

2.4 Conclusion

Recent research suggests that the short sale bans introduced during the 2008 financial crisis have

reduced market quality around the world, perhaps even to the extent that the bans’ benefits

were outweighed (see, for example, Battalio and Schultz, 2011; Grundy et al., 2012; Boehmer

et al., 2013; Beber and Pagano, 2013). Nevertheless, market regulators in Belgium, France,

Italy, and Spain re-introduced a short sale ban on financial stocks in August 2011 to combat

the European financial crisis.

To analyze the effects of the European 2011 short sale ban on financial market stability

and contagion risk, we extracted RNDs and IV skews from single-stock options. Our results

indicate that implied jump risk of banned stocks was higher during the ban period than in

any other period analyzed. We find that on the day of the ban announcement, implied jump


risk levels for both banned and non-banned stocks showed a significant rise. Implied jump

risk tended to increase for banned stocks even more than for non-banned stocks. Furthermore,

during the imposition of the ban, the banned stocks’ average IV skews remained at an elevated

level, whereas this metric dropped for the other stocks. During the ban, the median IV skews

for both the banned and non-banned stocks reached their highest levels when compared to any

other period in the sample. This adverse effect on IV skews seems to stem from a supply shift induced by a rise in market makers’ risk aversion, which is consistent with the behavior predicted by the Cumulative Prospect Theory of Tversky and Kahneman (1992). Regardless of its cause, our findings show that, even after controlling for information flow and stock-specific factors, the short sale bans themselves increased implied jump risk, especially for the banned

stocks.

We further document that contagion risk for both banned and non-banned stocks already

increased significantly during the pre-ban period. For non-banned stocks, contagion risk rose

even more upon imposition of the ban. However, we find that contagion risk for banned stocks

decreased during the ban relative to the pre-ban period.

Our approach of using option-implied data to analyze the impact of short sale bans on

financial markets is only a first step. We believe that our knowledge on this topic would benefit

from additional future research. Of particular interest would be the analysis of ban-driven

increases in implied jump risk using the mutually exciting jumps model of Ait-Sahalia et al.

(2015). We hypothesize that a lack of coordination by country regulators in introducing bans

may be undesirable, as shocks in jump risk caused by subsequent bans may cross-excite each

other and lead to financial contagion, which is of great importance to the supervisory policy

agenda.

While we observe that the short sale ban is effective in restricting both outright and synthetic

shorts on banned stocks, we do find evidence of trading migration to the Eurostoxx 50 index

options market. Investors seem to switch from single-stock puts to index puts because of

“flight-to-liquidity” incentives. The selling pressure was potentially diverted from the financial stocks to

a larger share of the stock market, thereby reducing the destabilizing effects in the financial

sector.

The question remains whether the 2011 European short selling ban was a cure or a curse. If

the first and foremost goal of imposing a ban is reducing systemic risk, then the 2011 bans do

seem to fulfill this purpose. However, we note that this success comes at a cost, which is that

the implied jump risk increases. Despite the fact that this effect in implied jump risk indicates

market failure and may have adversely influenced market participants’ expectations, it helped

to preserve market stability by reducing contagion risk. Thus, what is the appropriate balance

between market failure and systemic risk? Bans should be avoided if possible, and should only

be used as a last resort when all other means have failed, as governments and regulators should

prioritize financial market stability over transitory market failure.

2.A Appendix: Implied jump risk estimation

2.A.1 Implied jump risk from risk-neutral distributions

In this section, we describe how we compute IVs for the groups of banned and non-banned

stocks. The banned group constituents are the stocks that were prohibited from short sales.

The non-banned group constituents are the remaining stocks in our sample. We compute IVs

for banned and non-banned stocks separately by equally averaging IV21 on each moneyness level

available across all stocks belonging to either the banned group or the non-banned group. This

step produces one IV structure across our seven moneyness levels (80, 90, 95, 100, 105, 110 and

120) for both groups for every day in our sample. Then, we apply the Black-Scholes model to

our IV data to obtain options prices for the banned and non-banned groups of stocks. We set the

instantaneous price level of both groups equal to 100, and as a result, the percentage moneyness

level automatically reflects strike prices per group. When applying the Black-Scholes model,

we calculate contemporaneous dividend yield for banned and non-banned stocks by equally

weighting dividend yield from the individual stocks. The risk-free rate applied is the Euribor

three-month maturity.
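The averaging and pricing steps above can be sketched as follows. This is a minimal illustration rather than the implementation used in the chapter: the function names, the dictionary layout of the IV quotes, and the use of a continuous dividend yield in the Black-Scholes formula are assumptions made for the example.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, vol, rate, div_yield, maturity):
    """Black-Scholes call price with a continuous dividend yield (assumed form)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return (spot * math.exp(-div_yield * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))

def group_iv_curve(ivs_by_stock, moneyness_levels):
    """Equal-weighted average IV per moneyness level, skipping missing quotes."""
    curve = {}
    for level in moneyness_levels:
        quotes = [ivs[level] for ivs in ivs_by_stock if level in ivs]
        if quotes:
            curve[level] = sum(quotes) / len(quotes)
    return curve

def group_call_prices(iv_curve, rate, div_yield, maturity, spot=100.0):
    """One call price per moneyness level; with spot = 100 the level is the strike."""
    return {level: bs_call(spot, float(level), iv, rate, div_yield, maturity)
            for level, iv in iv_curve.items()}
```

Setting the spot to 100 is what lets the percentage moneyness level double as the strike, so no per-stock price information is needed once the IVs are averaged.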

Once options prices for the average banned and non-banned stocks are obtained, we can

extract the RND of equity returns using the Breeden and Litzenberger (1978) formulae for the

strikes along the body of our distribution, i.e., from the 80 to 120 moneyness levels:

$$\mathrm{RND}(S) = \exp(rT)\,\left.\frac{\partial^2 C(T,K)}{\partial K^2}\right|_{K=S}, \tag{2.A.1}$$

where RND(S) is the risk-neutral probability of observing the terminal index level (S) at time

T , r is the risk-free rate for the specific maturity, K is the strike price, and C is the index

option price. Computing the second derivative of the option price relative to strike prices via

central differences leads to:

$$\mathrm{RND}(S) \approx \exp(rT)\,\frac{C(T,S-\Delta K) - 2C(T,S) + C(T,S+\Delta K)}{(\Delta K)^2}. \tag{2.A.2}$$
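A small numerical sketch of Eq. (2.A.2) is given below. The flat-volatility Black-Scholes pricer is only a stand-in for the option prices (an assumption for illustration, with hypothetical parameter values); under it the recovered density should coincide with the lognormal density, which provides a convenient sanity check.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(strike, spot=100.0, vol=0.2, rate=0.0, maturity=1.0):
    """Black-Scholes call without dividends, used here only as a test pricer."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def rnd_central_difference(call_price, s, dk, rate, maturity):
    """Eq. (2.A.2): exp(rT) times the central second difference of C in the strike."""
    second_difference = call_price(s - dk) - 2.0 * call_price(s) + call_price(s + dk)
    return math.exp(rate * maturity) * second_difference / dk ** 2

# Density at S = 100 with a strike spacing of 0.5 (illustrative values).
density_at_100 = rnd_central_difference(bs_call, 100.0, 0.5, 0.0, 1.0)
```

With only seven moneyness levels, the strike spacing in practice is much coarser than in this example, which is one reason the tails are handled separately.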

Following Figlewski (2010), extrapolation beyond the body of the RND22 is performed by

fitting a generalized extreme value (GEV) distribution using two extreme anchor points on

each side of the body of the RND and extending a tail with the same shape23. The GEV-based

21 IV is calculated through reverse engineering the Black-Scholes model, while assuming constant interest rates and discrete dividends. Interpolation is used to calculate the IV at a fixed level of moneyness and at a fixed time to maturity.

22 The Figlewski (2010) method is close to the method used by Bliss and Panigirtzoglou (2004), where body and tails are also extracted separately. These authors use a weighted natural spline algorithm for interpolation, which has the same noise-reducing effect in RNDs. Extrapolation is done by the introduction of pseudo data points, which has the effect of pasting lognormal tails into the RND. One advantage of both approaches is that extrapolation does not result in negative probabilities, which is possible when the spline interpolation is applied. We favor the approach by Figlewski (2010) because the use of the lognormal tails by Bliss and Panigirtzoglou (2004) assumes that the IV is constant beyond the observable strikes, resembling the Black-Scholes model and being largely inconsistent with empirical evidence.

23 Figlewski (2010) argues that interpolation using fourth-order splines is superior to cubic splines because it avoids kinks in the RND. The translation from interpolated IV curve into RND would require taking higher-order derivatives than those used by the construction of the spline.

extrapolation is then used to model the tails of the RND toward the moneyness levels 0 and 200.

We initially use the first and third percentiles of the RND’s body as (outer and inner) anchor

points for the left tail and the 99th and 97th percentiles as (outer and inner) anchor points for

the right tail. We extend the approach of Figlewski (2010) by allowing these anchor points to

change if the fitted GEV curves produce implausible tails, e.g., zero probability under the tails.

Eqs. (2.A.3) and (2.A.4) give, respectively, the GEV’s cumulative distribution function and

probability distribution function:

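$$F_{GEV}(S_T) = \exp\left[-\left(1 + \omega\left(\frac{S_T-\mu}{\sigma}\right)\right)^{-1/\omega}\right], \tag{2.A.3}$$

and

$$f_{GEV}(S_T) = \frac{1}{\sigma}\left[1 + \omega\left(\frac{S_T-\mu}{\sigma}\right)\right]^{(-1/\omega)-1}\exp\left[-\left(1 + \omega\left(\frac{S_T-\mu}{\sigma}\right)\right)^{-1/\omega}\right], \tag{2.A.4}$$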

where ω > 0 sets a fat tail relative to the normal, ω = 0 sets a normal tail, and ω < 0 sets

a distribution tail that is thinner than the normal. The µ and σ are location and dispersion

parameters. Because fitting GEV curves entails setting these three parameters, Figlewski (2010)

also imposes three conditions on the tail: i) that the total probability in the tail of the body

(up to the inner anchor point) is the same for the RND and the GEV approximation, ii) that

the shape of the RND equals the shape of the GEV curve in the inner anchor point, and iii)

that the shape of the RND equals the shape of the GEV curve in the outer anchor point. We

refer to Appendix 2.A.2 below for more details.
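Eqs. (2.A.3) and (2.A.4) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the handling of ω = 0 through the Gumbel limit and of points outside the support are choices made for the example, not details taken from the chapter.

```python
import math

def gev_cdf(s, omega, mu, sigma):
    """Eq. (2.A.3); omega = 0 is treated as the Gumbel limit."""
    y = (s - mu) / sigma
    if omega == 0.0:
        return math.exp(-math.exp(-y))
    z = 1.0 + omega * y
    if z <= 0.0:
        # Outside the support: below it when omega > 0, above it when omega < 0.
        return 0.0 if omega > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / omega))

def gev_pdf(s, omega, mu, sigma):
    """Eq. (2.A.4); zero outside the support."""
    y = (s - mu) / sigma
    if omega == 0.0:
        return math.exp(-y) * math.exp(-math.exp(-y)) / sigma
    z = 1.0 + omega * y
    if z <= 0.0:
        return 0.0
    return z ** (-1.0 / omega - 1.0) * math.exp(-z ** (-1.0 / omega)) / sigma
```

A quick check of any implementation is that the CDF equals exp(-1) at the location parameter for every value of ω, and that a finite-difference derivative of the CDF reproduces the PDF.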

Once the body and tails of the RND for terminal index levels are obtained for banned and

non-banned stocks, we convert them into return RNDs by calculating log-returns relative to the

starting index level S0. Finally, we compute probabilities for every percentage return quantile

of the PDF via linear interpolation, which are normalized to integrate to one.

2.A.2 The Figlewski (2010) approach for extracting RND from implied volatilities

In this section, we describe the Figlewski (2010) approach and how we apply it to our sample.

In the Figlewski (2010) method, the following three conditions are imposed for the right tail:

Condition 1: $F_{GEV}(X(\alpha_{innerR})) = \alpha_{innerR}$

Condition 2: $f_{GEV}(X(\alpha_{innerR})) = f_{body}(X(\alpha_{innerR}))$

Condition 3: $f_{GEV}(X(\alpha_{outerR})) = f_{body}(X(\alpha_{outerR}))$

where X(αinnerR) represents the exercise price corresponding to the α-quantile of the RND used

as the inner anchor point in the right tail, whereas X(αouterR) denotes the same but for the

outer anchor point in the right tail. For the left tail, these conditions are modified to:

Condition 1: $F_{GEV}(-X(\alpha_{innerL})) = 1 - \alpha_{innerL}$

Condition 2: $f_{GEV}(-X(\alpha_{innerL})) = f_{body}(X(\alpha_{innerL}))$

Condition 3: $f_{GEV}(-X(\alpha_{outerL})) = f_{body}(X(\alpha_{outerL}))$

We fit the GEV curves by implementing the following optimization:

$$GEV(\omega, \mu, \sigma) = \arg\min_{\omega,\mu,\sigma}\,(y), \tag{2.A.5}$$

where the objective function yR for the right tail following the three conditions above is:

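$$y_R = \left[F_{GEV}(X(\alpha_{innerR})) - \alpha_{innerR}\right]^2 + \left[f_{GEV}(X(\alpha_{innerR})) - f_{body}(X(\alpha_{innerR}))\right]^2 + \left[f_{GEV}(X(\alpha_{outerR})) - f_{body}(X(\alpha_{outerR}))\right]^2, \tag{2.A.6}$$

whereas, for the left tail, the objective function $y_L$ follows the three left-tail conditions above:

$$y_L = \left[F_{GEV}(-X(\alpha_{innerL})) - (1 - \alpha_{innerL})\right]^2 + \left[f_{GEV}(-X(\alpha_{innerL})) - f_{body}(X(\alpha_{innerL}))\right]^2 + \left[f_{GEV}(-X(\alpha_{outerL})) - f_{body}(X(\alpha_{outerL}))\right]^2. \tag{2.A.7}$$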
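The two objective functions can be written as plain functions of the parameter triple (ω, µ, σ), which any general-purpose numerical minimizer can then drive toward zero. The sketch below is an illustration under stated assumptions: the function names are hypothetical, ω is assumed to be non-zero, the left-tail objective is written with the sign flip and the 1 − α target of the left-tail conditions, and the choice of optimizer is deliberately left open.

```python
import math

def gev_cdf(s, omega, mu, sigma):
    """GEV CDF for omega != 0."""
    z = 1.0 + omega * (s - mu) / sigma
    if z <= 0.0:
        return 0.0 if omega > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / omega))

def gev_pdf(s, omega, mu, sigma):
    """GEV PDF for omega != 0."""
    z = 1.0 + omega * (s - mu) / sigma
    if z <= 0.0:
        return 0.0
    return z ** (-1.0 / omega - 1.0) * math.exp(-z ** (-1.0 / omega)) / sigma

def right_tail_objective(params, f_body, x_inner, x_outer, alpha_inner):
    """Sum of squared violations of the three right-tail conditions."""
    omega, mu, sigma = params
    if sigma <= 0.0:
        return float("inf")  # penalize an invalid dispersion parameter
    return ((gev_cdf(x_inner, omega, mu, sigma) - alpha_inner) ** 2
            + (gev_pdf(x_inner, omega, mu, sigma) - f_body(x_inner)) ** 2
            + (gev_pdf(x_outer, omega, mu, sigma) - f_body(x_outer)) ** 2)

def left_tail_objective(params, f_body, x_inner, x_outer, alpha_inner):
    """Same idea for the left tail: the GEV is evaluated at -X and the CDF
    target is 1 - alpha, as in the left-tail conditions."""
    omega, mu, sigma = params
    if sigma <= 0.0:
        return float("inf")
    return ((gev_cdf(-x_inner, omega, mu, sigma) - (1.0 - alpha_inner)) ** 2
            + (gev_pdf(-x_inner, omega, mu, sigma) - f_body(x_inner)) ** 2
            + (gev_pdf(-x_outer, omega, mu, sigma) - f_body(x_outer)) ** 2)
```

Because the CDF term is of order one while the density terms are typically much smaller, the three squared terms are not on the same scale; whether and how to rescale them is a design choice that the equations above do not prescribe.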

2.A.3 The modified Figlewski (2010) approach

The approach by Figlewski (2010) performs well for many observations in our sample. However,

for some observations, the fitted GEV curves are implausible. We illustrate the problem

encountered in Figure 2.5, where the right tail of the RND is reasonably fitted by GEV, but

the left tail is not. To avoid ending up with implausible tails, we allow the inner anchor

points to change by a predefined amount (∆IAnchor), following a loop algorithm over iterations

m = 1, ...,M . Within this algorithm, it is mainly the inner anchor points that shift

to accommodate a better-behaved GEV curve. Exceptionally, however, the outer anchor points

are also shifted. The algorithm for the left tail is given as:

1. Let the α-quantile inner anchor point ($\alpha_{innerL}$) increase by $\Delta IAnchor$ as $m \to M$, looping as long as $y_L^m > 5^{-25}$ and the median of $\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} < 0$; otherwise stop the loop.

2. If $y_L^{m-1} < y_L^m$, then evaluate whether the median of $\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} > 0$. If yes, stop the loop and use the α-quantile inner anchor point ($\alpha_{innerL}$) of $y_L^m$ for GEV estimation. If the median of $\left.\frac{\partial^2 f_{GEV}}{\partial K^2}\right|_{K=0}^{K=X(\alpha_{innerL})} < 0$, continue the loop by increasing the α-quantile inner anchor point ($\alpha_{innerL}$) by $\Delta IAnchor$.

3. If $y_L^{m-1} < y_L^m$, then evaluate if $0.05 > \int_0^{X(\alpha_{outerL})} F_{GEV}^{m-1} > 0.1$. If so, stop the loop and use the α-quantile inner anchor point ($\alpha_{innerL}$) of $y_L^{m-1}$ for GEV estimation; otherwise continue the loop.

4. If the α-quantile inner anchor point ($\alpha_{innerL}$) increases up to the mode (peak) of the RND, then it stops increasing and the α-quantile outer anchor point ($\alpha_{outerL}$) starts increasing by a very small step of 0.01 percent. If the α-quantile outer anchor point ($\alpha_{outerL}$) increases more than 10 times, then stop the loop and use the α-quantile outer and inner anchor points from the iteration with the lowest $y_L$ for GEV estimation.
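The control flow of the steps above can be illustrated with a deliberately simplified sketch. It keeps only the inner-anchor loop (the stopping rule of steps 1 and 2 and the cap at the mode from step 4) and omits the tail-mass check of step 3 and the outer-anchor shifts. The callables `objective` and `is_convex`, the tolerance `tol`, and all names are hypothetical placeholders, so this should be read as one possible reading of the loop, not as the procedure used in the chapter.

```python
def shift_inner_anchor(objective, is_convex, alpha_start, step, alpha_mode, tol):
    """Raise the left-tail inner anchor quantile until the fitted tail is acceptable.

    objective(alpha): minimized y_L when the inner anchor sits at quantile alpha.
    is_convex(alpha): True when the median second derivative of the fitted GEV
        density over the tail is positive.
    Returns the accepted anchor, or the anchor with the lowest objective if the
    mode of the RND is reached first.
    """
    best_alpha, best_y = None, float("inf")
    m = 0
    while True:
        alpha = alpha_start + m * step
        if alpha > alpha_mode:
            # Anchor reached the mode: fall back on the best fit seen so far.
            return best_alpha
        y = objective(alpha)
        if y <= tol or is_convex(alpha):
            # Fit is tight enough or the tail is well behaved: stop the loop.
            return alpha
        if y < best_y:
            best_alpha, best_y = alpha, y
        m += 1
```

In a full implementation, `objective` would wrap a GEV fit at the given anchor, and the branch taken at the mode would hand over to the outer-anchor adjustment described in step 4.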

Thus, our modification to the Figlewski (2010) approach is that the RND body is always

extracted from the IV by using the Breeden and Litzenberger (1978) formulae. In contrast,

Figlewski (2010) substitutes the original RND in the interval between the inner anchor point

and the end of the original RND.

Figure 2.5: RND extraction using different methods. Plot A depicts the RND of banned stocks for March 24, 2011, using the Figlewski (2010) approach. Plot B depicts the RND of banned stocks for the same date, using the modified Figlewski approach described here. We note that in Plot A both the left and the right tails of the RND, fitted by GEV curves, are implausible because they contain abruptly declining tails under which the probability is close to zero. It is not the approach that causes such distortion but the limited range of moneyness in our data set.

2.B Appendix: Extreme value theory

When applying EVT, we first estimate the tail shape estimator (ϕ), using Hill (1975):

$$\varphi = \frac{1}{\theta} = \frac{1}{k}\sum_{j=1}^{k}\ln\left(\frac{x_j}{x_{k+1}}\right), \tag{2.B.1}$$

where xj are ranked returns in ascending order j = 1, . . . , n; n is the sample size; k is the

number of extreme returns used in the tail estimation; and xk+1 is the return “tail cut-off

point”. The tail shape estimator ϕ measures the curvature, i.e., the fatness of the tails of the

return distribution: a high (low) ϕ indicates that the tail is fat (thin).
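A minimal sketch of the Hill estimator in Eq. (2.B.1) for the left tail is shown below. It assumes that returns are ranked in ascending order, so that the k most negative returns come first and the cut-off x_{k+1} is itself negative; the function name and this sign convention are assumptions for the example.

```python
import math

def hill_estimator(returns, k):
    """Eq. (2.B.1): average log ratio of the k most extreme returns to x_{k+1}."""
    ranked = sorted(returns)   # ascending: the worst returns come first
    cutoff = ranked[k]         # x_{k+1}, the tail cut-off point
    if cutoff >= 0.0:
        raise ValueError("the tail cut-off must be a negative return")
    # x_j / x_{k+1} >= 1 because both are negative and x_j is more extreme.
    return sum(math.log(x / cutoff) for x in ranked[:k]) / k
```

For example, with returns of -8, -4, -2, -1, 0.5 and 1 and k = 2, the cut-off is -2 and the estimate is (ln 4 + ln 2) / 2.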

After extracting RNDs for both banned and non-banned stocks, we next determine the

optimal number of observations k used to estimate parameter ϕ in Eq. (2.B.1). For this

purpose, we produce Hill-plots for the left tail of our two RNDs. Such Hill-plots depict the

relationship between k and ϕ as a curve. The optimal value of k is selected as the minimum

level for which the value of ϕ stabilizes, thus where a stable trade-off between the approximation

of the tail shape by the Pareto distribution and the uncertainty of such approximation occurs

(because of the use of fewer observations). We set k equal to four percent or 43 observations,

which matches the level used in, e.g., Hartmann et al. (2004).

Once ϕ is obtained, we compute extreme downside risk, hereafter VaR, using a semi-

parametric quantile estimator used in Hartmann et al. (2004):

$$q_p = x_{k+1}\left(\frac{k}{pn}\right)^{\frac{1}{\theta}}, \tag{2.B.2}$$

where n is the sample size, p is a chosen exceedance probability, which means the likelihood

that a return xj exceeds the tail value q, and xk+1 is the “tail cut-off point”. Note that qp has

as one of its inputs the estimated tail shape parameter ϕ. The qp statistic indicates the level

of the worst return occurring with probability p. Since the tail quantile statistic

$$\frac{\sqrt{k}}{\ln\left(\frac{k}{pn}\right)}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$$

is asymptotically normally distributed, we follow Hartmann et al. (2004) and use the following

t-statistic for this estimator:

$$T_q = \frac{q_1 - q_2}{\sigma[q_1 - q_2]} \sim N(0, 1), \tag{2.B.3}$$

where the denominator is calculated as the standard deviation of the difference between the two estimated VaRs, using

1,000 bootstraps. The null hypothesis of this test is that q1 and q2 do not come from independent

samples of normal distributions and, therefore, that the VaRs are equal.
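The quantile estimator of Eq. (2.B.2) and the test statistic of Eq. (2.B.3) can be sketched as below. The function names are hypothetical, and taking the denominator as the sample standard deviation of paired bootstrap differences is an assumption about how the bootstrap is organized; the bootstrap replications themselves are supplied by the caller.

```python
import statistics

def tail_quantile(cutoff, k, n, p, phi):
    """Eq. (2.B.2): q_p = x_{k+1} * (k / (p n)) ** (1 / theta), with phi = 1 / theta."""
    return cutoff * (k / (p * n)) ** phi

def var_difference_t_stat(q1, q2, boot_q1, boot_q2):
    """Eq. (2.B.3): VaR difference over the bootstrapped dispersion of that difference."""
    differences = [a - b for a, b in zip(boot_q1, boot_q2)]
    return (q1 - q2) / statistics.stdev(differences)
```

When p equals k/n the estimator returns the cut-off itself, and smaller exceedance probabilities push the estimate further into the tail, the more so the larger ϕ is. As an illustration, k = 43 and a hypothetical n = 1075 correspond to the four percent tail fraction mentioned above.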

In the next step, we employ a bivariate EVT method to calculate commonality in jumps,

hence, contagion risk from historical returns. EVT is well suited to measure contagion risk

because it does not assume any specific return distribution. Our approach estimates how

likely it is that one stock will experience a crash beyond a specific extreme negative return

threshold conditional on another stock crash beyond an equally probable threshold. We refer

to Hartmann et al. (2004) and Balla et al. (2014) who use the conditional co-crash (CCC)

probability estimator, which is applied to each pair of stocks in our sample, as follows:

$$CCC_{ij} = 2 - \frac{1}{k}\sum_{t=1}^{N} I\left[V_{it} > x_{i,N-k} \text{ or } V_{jt} > x_{j,N-k}\right], \tag{2.B.4}$$

where the function I is the crash indicator function, in which I = 1 in case of a crash, and I = 0

otherwise, Vit and Vjt are returns for stocks i and j at time t; xi,N−k, and xj,N−k are extreme

crash thresholds. The estimation of the CCC-probabilities requires setting k as the number of

observations used in Eq. (2.B.4). For consistency with our Hill-estimator, we again use k = 43

as the minimum level for which the value of ϕ is stable in our Hill-plots. Furthermore, because

the CCC-probability is asymptotically normal if k/N → 0 as k,N → ∞ (see Hartmann et al., 2004),

a t-test for such estimator is obtained by the same bootstrap-based approach that is used in

equation (2.B.3).
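A compact sketch of the estimator in Eq. (2.B.4) follows. It assumes that the inputs are expressed as losses (negated returns), so that a crash is an exceedance above the threshold, and that x_{i,N-k} is the (N-k)-th smallest observation of series i; the function name and these conventions are assumptions for the example.

```python
def conditional_co_crash(v_i, v_j, k):
    """Eq. (2.B.4): 2 minus the count of days on which either series crashes, over k."""
    n = len(v_i)
    threshold_i = sorted(v_i)[n - k - 1]   # x_{i,N-k}: k observations lie above it
    threshold_j = sorted(v_j)[n - k - 1]   # x_{j,N-k}
    either_crash = sum(1 for a, b in zip(v_i, v_j)
                       if a > threshold_i or b > threshold_j)
    return 2.0 - either_crash / k
```

Without ties, each series exceeds its own threshold exactly k times. If the two series always crash together, the count equals k and the estimate is one; if they never crash on the same day, the count equals 2k and the estimate is zero.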

Chapter 3

Single stock call options as lottery tickets: overpricing and investor sentiment∗

3.1 Introduction

Barberis and Huang (2008) hypothesize that Tversky and Kahneman’s (1992) Cumulative

Prospect Theory (CPT) explains a number of seemingly unrelated pricing puzzles. In contrast

to previous literature, which concentrates on the CPT’s value function (see Benartzi and Thaler,

1995b; Barberis et al., 2001; Barberis and Huang, 2001), Barberis and Huang (2008) focus on

the probability weighting functions of the model. They conclude that the CPT’s overweighting

of small probability events explains why investors prefer positively skewed returns, or “lottery

ticket” type of securities. Because of this preference, investors overpay for positively skewed

securities, making them expensive and causing them to yield low forward returns. The authors

argue that this mechanism is the reason why IPO stocks, private equity, distressed stocks, single

segment firms, and deep out-of-the-money (OTM) single stock calls are overpriced, among other

irrational pricing phenomena.

The proposition made by Barberis and Huang (2008) that deep OTM single stock calls

resemble overpriced lottery-like securities due to investors’ overweighting of tails has not yet been

verified empirically2. Empirical studies on probability weighting functions implied by option

prices are offered by Dierkes (2009), Kliger and Levy (2009), and Polkovnichenko and Zhao

(2013)3. The evidence provided by these papers is, however, based on the index put options

∗ This chapter is based on Felix et al. (2016b). I am grateful to Deborah A. Trask (the editor) and one anonymous referee at the Journal of Behavioral Finance for their useful comments and suggestions. We also thank seminar participants at the IFABS 2016 Barcelona Conference, at the VU University Amsterdam in 2016, at the APG Asset Management Quant Roundtable in 2016, at the 2016 Research in Behavioral Finance Conference in Amsterdam and at the Board of Governors of the Federal Reserve System in Washington D.C. in 2016 for their helpful comments. We thank APG Asset Management for making available part of the data set.

2 Boyer and Vorkink (2014) provide evidence that lottery-like single stock options do deliver lower forward returns than options with lower ex-ante skewness. However, their paper does not test why these options are overvalued, nor does it analyze the potential time-variation in ex-ante skewness and forward returns. Conrad et al. (2013) find similar results for ex-ante skewness and subsequent stock returns.

3 These studies focus on the rank-dependent expected utility (RDEU) rather than the CPT, as the RDEU is seamlessly effective in dealing with the overweighting of probability phenomena. The RDEU’s probability weighting functions are strictly monotonically increasing, whereas the CPT one is not. RDEU functions are

market, which behaves very differently from the single stock option market. The main buyers

of OTM index puts are institutional investors, which use them for portfolio insurance (Bates,

2003; Bollen and Whaley, 2004; Lakonishok et al., 2007; Barberis and Huang, 2008). Because

institutional investors comprise around two-thirds of the total equity market capitalization

(Blume and Keim, 2012), their option trading activity strongly impacts the pricing of put

options (Bollen and Whaley, 2004) by making them expensive. The results of Dierkes (2009)

and Polkovnichenko and Zhao (2013) reiterate this evidence and suggest that overweighting

of small probabilities partially explains the pricing puzzle present in the equity index option

market.

Contrary to the index put market, trading activity in single stock calls is concentrated among

individual investors (Bollen and Whaley, 2004; Lakonishok et al., 2007) and is speculative in

nature (Lakonishok et al., 2007; Bauer et al., 2009; Choy, 2015). Beyond that, Mitton and

Vorkink (2007), Bauer et al. (2009), and Kumar (2009) provide important empirical support for

the link between preference for skewness and individual investor trading activity. The fact

that many individual investors have a substantial portion of their portfolios tied up in low-risk

investments, such as pensions, social security, 401(k)s, and IRAs, or are averse (or constrained)

to borrowing (Frazzini and Pedersen, 2014) encourages them to buy financial instruments with

implicit leverage, such as call options. Hence, given the very distinct clientele of these two option

markets (institutional investors vs. retail investors) and the different motivations for trading

(portfolio insurance vs. speculation), we reason that the overpricing of OTM single stock calls is

a puzzle in itself, requiring empirical evidence that is independent of the index option market.

The first contribution of this chapter is to investigate whether the CPT can empirically

explain the claimed overpricing of OTM single stock call options. To that end, we empirically

test whether the tails of the CPT density function outperform those of the risk-neutral density and

rational subjective probability density functions in matching the tails of the distribution of realized

returns. We find that our estimates for the CPT probability weighting function parameter γ

are qualitatively consistent with the ones predicted by Tversky and Kahneman (1992), particularly

for short-term options. Our estimates do suggest, however, that the overweighting of small probabilities

is less pronounced than suggested by the CPT. This analysis complements the results

of Barberis and Huang (2008) and provides novel support to explain the overpricing of OTM

single stock calls. Our empirical results extend the findings of Dierkes (2009), Kliger and Levy

(2009), Polkovnichenko and Zhao (2013), because we show that investors’ overweighting of

small probabilities is not restricted to the pricing of index puts but also applies to single stock

calls.

Secondly, we provide evidence that overweighting of small probabilities is strongly time-

varying and connected to the Baker and Wurgler (2007) investor sentiment factor. These

findings contrast with the CPT model, where the probability weighting parameter for gains (γ)

is constant at 0.61. In fact, our estimations suggest that the γ parameter fluctuates widely

around that level, sometimes even reflecting underweighting of small probabilities. We show

also easier to estimate because they use one less parameter than the CPT.

that overweighting of small probabilities was quite strong during the dot-com bubble, which

coincided with a strong rise in investor sentiment. The strong time-variation in the overweighting of

tails indicates that investors have either a “bias in beliefs” or time-varying (rather than static)

skewness preferences; see Barberis (2013) for a discussion of the topic4.

Moreover, we find that overweighting of small probabilities is largely horizon-dependent,

because this bias is mostly observed within short-term option prices (i.e., three- and six-month)

rather than in long-term ones (i.e., twelve-month). We reason that such a positive

term structure of tail overweighting exists because individual investors may speculate using the

cheapest available call at their disposal. In other words, individual investors buy the cheapest

lottery tickets that they can find. As three- and six-month options have much less time value

than twelve-month ones, more pronounced overweighting of small probabilities within short-term

options seems sensible. This result is consistent with individual investors being the typical

buyers of OTM single stock calls and the fact that they mostly use short-term instruments to

speculate on the upside of equities (Lakonishok et al., 2007).

In our analysis of probability weighting functions, we focus on the outmost tails of RNDs5.

We argue that, as distribution tails (mostly estimated from OTM options) are the sections of

the distribution that reflect low probability events, we may analyze these locally, thus, isolated

from the distribution’s body. To this purpose, we use extreme value theory (EVT) and Kupiec’s

test (as a robustness check), which are especially suited for the analysis of tail probabilities

and, so far, have not yet been employed in the evaluation of overpricing of OTM options. As

an additional robustness check, we replace the CPT by the rank-dependent expected utility

(RDEU) function of Prelec (1998). This alteration reconfirms the presence of overweighted

small probabilities by investors within the OTM single stock call market and, at the same time,

reiterates that such bias is less pronounced than suggested by the CPT model. Time-variation

of the weighting function parameters is also observed when RDEU is applied.

The remainder of this chapter is organized as follows. Section 3.2 describes the data and

methodology employed in our study. Section 3.3 presents our empirical analysis and section

3.4 discusses our robustness tests. Section 3.5 concludes.

3.2 Data and Methodology

In this section, we first describe the theoretical background that allows us to relate empirical

density functions (EDF), RND, and subjective density functions. This is a key step for testing

4 We acknowledge that it is unclear whether overpricing of OTM calls is caused by overweighting of small probabilities (i.e., a matter of preferences), or rather by biased beliefs (i.e., investors’ expectations). Barberis (2013) eloquently discusses how both phenomena are distinctly different and how both (individually or jointly) may explain the overpricing in OTM options. In this chapter we take a myopic view and use only the first explanation, for ease of exposition. Disentangling the two (beliefs and preferences) would potentially be very interesting, but we deem it to be outside the scope of this chapter.

5 In contrast, Dierkes (2009) and Polkovnichenko and Zhao (2013) explore the relation between overweighting of small probabilities and options prices by analyzing the full RND from options. Dierkes (2009) applies Berkowitz’s tests, whereas Polkovnichenko and Zhao (2013) estimate an empirical weighting function via polynomial regressions.

the hypothesis that the CPT helps to explain overpricing of OTM options, because we build on

the assumption that investors’ subjective density estimates should correspond, on average6, to

the distribution of realizations (see Bliss and Panigirtzoglou, 2004). Thus, testing whether the

CPT’s weighting function explains the overpricing of OTM options, ultimately, relates to how

the subjective density function produced by CPT’s preferences matches empirical returns. Be-

cause the representative agent is not observable, subjective density functions are not estimable

like EDF and RND are. As such, we build on the following theory to derive subjective density

functions from RNDs.

In our empirical exercise, we first derive subjective density functions for (a) the power

and (b) exponential utility functions. Because the CPT model contains not only a utility

function (the value function) but also a probability weighting scheme (the weighting function),

we produce two density functions: (c) the hereafter called partial CPT density function (PCPT),

where only the value function is taken into account, and (d) the CPT density function, where

the value and the weighting functions are considered. Lastly, we also calibrate γ to market data

and are, then, able to compute (e) the estimated CPT density (ECPT). We provide details on

estimation methods for our five subjective density functions, (a) to (e), in section 3.2.1, and

for the RND and EDF in section 3.2.4.

Once all five subjective density functions are obtained, we distinguish four analyses in our

empirical analysis section: 1) the estimation of long-term CPT value and weighting function

parameters (from which we can produce the ECPT density) (section 3.3.1); 2) EVT-based

tests of consistency between tails of the EDF, the RND and our five subjective probability

distributions (section 3.3.2); 3) the estimation of time-varying γ parameter (section 3.3.3); and

4) a regression linking the CPT time-varying probability weighting parameter (γ) to sentiment

measures as well as numerous control variables (section 3.3.4).

We use single stock weighted average implied volatility (IV) data for the largest 100

stocks of the S&P 500 index within our RND estimations. Appendix 3.A.2 shows how single

stock weighted average IVs are computed. The weights applied are the S&P 500 index weights

normalized by the sum of weights of stocks for which IVs are available. Because of the S&P

500 index methodology and because IV information is not available for every stock on all days

in our sample, stock weights in this basket change on a daily basis. The sum of weights is,

on average, 58 percent of the total S&P 500 index capitalization and it fluctuates between

46 and 65 percent. The IV data comes from closing mid-option prices from January 2, 1998

to March 19, 2013 for fixed maturities for five moneyness levels, i.e., 80, 90, 100, 110, and

120, at the three-, six- and twelve-month maturity. Continuously compounded stock market

returns are calculated throughout our analysis from the basket of stocks weighted with the

same daily-varying loadings used for aggregating the IV data. IV data and stock weights are

kindly provided by Barclays7. Single stock returns are downloaded via Bloomberg.
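The weight normalization described above can be illustrated for a single day and a single moneyness level. This is a sketch only: the tickers, weights and IVs are made up, the function name is hypothetical, and the actual computation is the one documented in Appendix 3.A.2.

```python
def basket_iv(index_weights, ivs):
    """Weighted average IV, with index weights renormalized over available quotes."""
    available = {name: iv for name, iv in ivs.items() if iv is not None}
    total_weight = sum(index_weights[name] for name in available)
    return sum(index_weights[name] / total_weight * iv
               for name, iv in available.items())

# Hypothetical inputs: stock C has no IV quote on this day.
weights = {"A": 0.04, "B": 0.02, "C": 0.01}
quotes = {"A": 0.20, "B": 0.30, "C": None}
daily_iv = basket_iv(weights, quotes)
```

Because the set of stocks with available quotes changes from day to day, the renormalized weights change as well, which is why the same daily weights are also applied to the return series.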

6 This implies that investors are somewhat rational. This assumption is not inconsistent with the CPT assumption that the representative agent is less than fully rational. The CPT suggests that investors are biased, not that decision makers are utterly irrational to the point that their subjective density forecast should not correspond, on average, to the realized return distribution.

7We thank Barclays for providing the implied volatility data. Barclays disclaimer: ”Any analysis that utilizes

We take the perspective of end-users of single stock OTM call options8. Hence, we assume

that supply imbalances are minimal and do not impact implied volatilities. We think this

assumption is reasonable because 1) option markets for the largest 100 U.S. stocks are liquid;

2) any un-hedged risk run by market makers can be easily hedged by purchasing the stock;

and 3) unhedged risk by market makers is likely much smaller when supplying call options

relative to put options. Market makers run little unhedged risk when supplying call options

vis-a-vis supplying puts because stock returns are negatively skewed, making gap and jump

risk much lower on the upside than on the downside. Garleanu et al. (2009) have shown that

this condition is different for the index option market, where market makers mostly provide

put options for portfolio insurance programs. As the authors suggest, put sellers become more

risk-sensitive following equity market declines, as their un-hedged risk increases, which makes

them unwilling to write additional puts to the market. Our implied volatility data show no

indication of an increase in the implied volatility skew from 120 percent moneyness options, nor

from at-the-money options around moments of market stress (e.g., the 2008-09 global financial

crisis). Hence, we find no evidence of the presence of supply imbalances in the OTM calls in

our sample.

3.2.1 Subjective density functions

Standard utility theory tells us that since the representative agent does not have risk-neutral

preferences, RNDs are inconsistent with subjective density functions and EDFs9, which both reflect “real-world”

probabilities. Hence, if investors are risk-averse or risk seeking, their subjective probability function

should differ from the one implied by option prices. The relation between the RND fQ(ST ),

and “real-world” probability distributions, fP (ST ), with ST being wealth or consumption10, is

described by ς(ST ), the pricing kernel or the marginal rate of substitution (of consumption at

time T for consumption at time t)11:

$$\frac{f_Q(S_T)}{f_P(S_T)} = \Lambda\,\frac{U'(S_T)}{U'(S_t)} \equiv \varsigma(S_T), \tag{3.2.1}$$

where Λ is the subjective discount factor (the time-preference constant) and U(·) is the representative investor utility function. As $U(S_T)$ is a random variable, the pricing kernel is also called the stochastic discount factor. Thus, Eq. (3.2.1) tells us that the “real-world” distribution equates to the RND when adjusted by the pricing kernel. The intuition behind Eq. (3.2.1) is that a real-world or risk-adjusted probability distribution can be obtained from the RND, once the risk trade-off embedded in the representative investor utility function is considered.

any data of Barclays, including all opinions and/or hypotheses therein, is solely the opinion of the author and not of Barclays. Barclays has not sponsored, approved or otherwise been involved in the making or preparation of this Report, nor in any analysis or conclusions presented herein. Any use of any data of Barclays used herein is pursuant to a license.”

8 We assume that end-users of single stock OTM call options have the same preferences across underlying securities. This assumption is supported by the evidence provided by Bollen and Whaley (2004) and Lakonishok et al. (2007) that trading activity in equity calls is concentrated among individual investors and is speculative in nature.

9 Anagnou et al. (2002) and Bliss and Panigirtzoglou (2004) have tested the consistency between RNDs and physical densities estimated from historical data and found that such distributions are inconsistent, i.e., RNDs are poor forecasters of the distribution of realizations.

10 Note that, as the value function within the CPT measures utility versus a reference point, $S_T$ is not strictly positive in this model. A negative $S_T$ denotes a loss of wealth or consumption, whereas a positive $S_T$ represents a gain.

11 The condition necessary for Eq. (3.2.1) to hold is that markets are complete and frictionless and a single risky asset is traded.

Since CPT-biased investors price options as if the data-generating process had a cumulative distribution $\tilde{F}_P(S_T) = w(F_P(S_T))$¹², where $w$ is the weighting function, its density function becomes $\tilde{f}_P(S_T) = w'(F_P(S_T)) \cdot f_P(S_T)$ (see Dierkes, 2009; Polkovnichenko and Zhao, 2013). Thus, CPT-biased agents assess probability distributions as if their tails contained more weight than they do in reality, i.e., they have a preference for skewness or “bias in beliefs”, as Barberis (2013) argues. Consequently, evaluating whether the CPT’s propositions apply is equivalent to testing whether Eq. (3.2.1) still holds if $f_P(S_T)$ is replaced by $\tilde{f}_P(S_T)$, leading to:

$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T). \tag{3.2.2}$$

We then further manipulate Eq. (3.2.2) so as to directly relate the original EDF to the CPT subjective density function, by “undoing” the effect of the CPT probability distortion functions within the PCPT density function. The relation between the EDF and the CPT density function is given by Eq. (3.2.3), and its derivation from Eq. (3.2.2) is provided in Appendix 3.A.1:

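$$\underbrace{f_P(S_T)}_{\text{EDF}} = \underbrace{\frac{\dfrac{f_Q(S_T)}{\nu'(S_T)}}{\displaystyle\int \dfrac{f_Q(x)}{\nu'(x)}\,dx}\,(w^{-1})'(F_P(S_T))}_{\text{CPT density function}} \tag{3.2.3}$$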

where ν ′(ST ) is the CPT’s marginal utility function.

This result gives us a clear representation of the CPT subjective density function, in which the value function and the weighting function are taken into account simultaneously. At this stage, as we can produce the RND and the set of subjective densities of interest, including the CPT density, we can evaluate how consistent their tails are with realizations.
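As a numerical illustration of the normalized ratio of the RND to marginal utility that appears in Eq. (3.2.3), the sketch below divides a density by a marginal utility function and renormalizes it. It is only a sketch under stated assumptions: the RND is a hypothetical lognormal, a power utility with an assumed relative risk aversion of 3 stands in for the marginal utility, and the probability weighting factor is left out. The grid, the parameter values and the function names are illustrative and are not taken from the thesis.

```python
import numpy as np


def integrate(y, x):
    """Trapezoidal rule, written out so the sketch does not depend on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def subjective_density(s, f_q, marginal_utility):
    """Divide the RND by marginal utility and renormalize so it integrates to one."""
    ratio = f_q / marginal_utility(s)
    return ratio / integrate(ratio, s)


# Hypothetical lognormal RND on a grid of terminal prices (spot normalized to 100).
s = np.linspace(20.0, 300.0, 2001)
sigma = 0.2
mu = np.log(100.0) - 0.5 * sigma**2
f_q = np.exp(-((np.log(s) - mu) ** 2) / (2 * sigma**2)) / (s * sigma * np.sqrt(2 * np.pi))
f_q /= integrate(f_q, s)

# Assumed power utility: U'(S) = S**(-rho), with rho chosen for illustration only.
rho = 3.0
f_p = subjective_density(s, f_q, lambda x: x ** (-rho))

mean_q = integrate(s * f_q, s)
mean_p = integrate(s * f_p, s)
print(f"RND mean: {mean_q:.2f}, subjective mean: {mean_p:.2f}")
```

With a risk-averse utility the adjusted density shifts probability mass towards higher terminal prices, which is the sense in which the RND understates the upside relative to a risk-adjusted density.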

12 Similarly, if investors are rational, their subjective density functions should be consistent, on average, with the empirical density function. Bliss and Panigirtzoglou (2004) find that subjective density functions, produced from RND adjusted by two types of representative investors’ utility functions (power and exponential) with plausible relative risk aversion parameters, outperform RND on forecasting density functions.

3.2.2 Estimating CPT parameters

We start evaluating the empirical validity of the CPT for single stock call options by comparing the EDF to the CPT density function parameterized by Tversky and Kahneman (1992). Subsequently, we estimate the CPT parameters λ and γ with the same goal. We only estimate γ within the probability weighting function, and not δ, because we are interested in the gains side of the distribution, which is extracted from call options. We estimate these parameters non-parametrically, by minimizing the weighted squared distance between the physical distribution and the partial CPT density function for every bin above the median of the two distributions, as follows:

$$\upsilon(\lambda) = \min \sum_{b=1}^{B} W_b\left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \tag{3.2.4}$$

where $EDF^{b}_{prob}$ and $CPT^{b}_{prob}$ are, respectively, the probability within bin $b$ in the empirical and CPT density functions, and $W_b$ are weights given by $1 \big/ \left(\frac{1}{\sqrt{2}}\int_{0.5}^{\infty} e^{-\frac{x^2}{2}}\,dx = 1\right)$, the reciprocal of the normalized normal probability distribution (above its median), split into the same total number of bins ($B$) used for the EDF and CPT. The loss aversion parameter, λ, in Eq. (3.2.4) is optimized using multiple constraint intervals: [0,3], [0,5] and [0,10]. Once the optimal λ is known, we minimize Eq. (3.2.5) using its estimate and the CPT λ:

$$w^{+}(\gamma, \delta = \gamma) = \min \sum_{b=1}^{B} W_b\left(EDF^{b}_{prob} - CPT^{b}_{prob}\right)^2, \tag{3.2.5}$$

where γ, the probability weighting parameter for gains, is constrained by the permutations of the following upper bounds (1.2, 1.35, 1.5, 1.75 and 2) and lower bounds (-0.25, 0 and 0.28). Weights are applied in these optimizations because matching the tails of the probability distributions is more important in our analysis than matching their bodies.

Our non-linear bounded optimization is a single-parameter one, where we first estimate the optimal γ (which we impose to equal δ) across all permutations of upper and lower bounds to select the bounds that produce the lowest residual sum of squares (RSS). Subsequently, we estimate λ and γ as suggested by the sequence of optimizations described by Eqs. (3.2.4) and (3.2.5). This method resembles those of Kliger and Levy (2009), Dierkes (2009), Chabi-Yo and Song (2013), and Polkovnichenko and Zhao (2013). Once the optimal parameters λ and γ are estimated, we can produce another long-term subjective density function: the ECPT, which stands for estimated CPT, where we apply the optimal γ for the characterization of its probability weighting function. Finally, we also estimate time-varying γ using different assumptions for λ, so as to evaluate the sensitivity of γ to changes in λ.
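The bounded, single-parameter weighted least squares step can be sketched as follows. This is a simplified illustration and not the full procedure of the thesis: it distorts a hypothetical baseline CDF (a standard normal, bins above the median) with the Tversky and Kahneman (1992) weighting function and recovers γ from synthetic target bin probabilities. The baseline, the bin grid, the weights (reciprocal of the baseline bin probabilities) and the helper names are assumptions made for the example; the construction of the CPT density from the RND and the value function is not shown.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm


def tk_weight(p, gamma):
    """Tversky and Kahneman (1992) probability weighting function."""
    return p**gamma / (p**gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def bin_probs(cdf_at_edges, gamma):
    """Bin probabilities implied by the weighted CDF w(F)."""
    return np.diff(tk_weight(cdf_at_edges, gamma))


def fit_gamma(target_probs, cdf_at_edges, weights, lower=0.28, upper=1.2):
    """Weighted least squares estimate of gamma within [lower, upper]."""

    def loss(gamma):
        diff = target_probs - bin_probs(cdf_at_edges, gamma)
        return float(np.sum(weights * diff**2))

    # A coarse grid locates the best region; a bounded search then refines it.
    grid = np.linspace(lower, upper, 200)
    best = grid[int(np.argmin([loss(g) for g in grid]))]
    step = grid[1] - grid[0]
    res = minimize_scalar(
        loss,
        bounds=(max(lower, best - step), min(upper, best + step)),
        method="bounded",
    )
    return res.x, res.fun


# Hypothetical setup: 30 bins above the median of a standard normal baseline.
edges = np.linspace(0.0, 3.0, 31)
cdf_at_edges = norm.cdf(edges)
weights = 1.0 / np.diff(cdf_at_edges)      # heavier weights on thin tail bins
target = bin_probs(cdf_at_edges, 0.7)      # synthetic target built with gamma = 0.7

gamma_hat, rss = fit_gamma(target, cdf_at_edges, weights)
print(f"estimated gamma: {gamma_hat:.3f}, RSS: {rss:.2e}")
```

Repeating `fit_gamma` over each pair of lower and upper bounds and keeping the pair with the lowest RSS would mirror the bound selection described above.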

3.2.3 Density function tails’ consistency test

We check for tail consistency of our set of five subjective density functions (CPT, PCPT, ECPT, power and exponential), the RND, and the EDF by applying extreme value theory (EVT). EVT allows us to estimate the shape of the tails of these PDFs and to extract the returns implied by an extreme quantile within our PDFs. We estimate the tail shape estimator (ϕ) by means of the Hill (1975) estimator:

$$\varphi = \frac{1}{\theta} = \frac{1}{k}\sum_{j=1}^{k} \ln\left(\frac{x_j}{x_{k+1}}\right), \tag{3.2.6}$$

where $k$ is the number of extreme returns used in the tail estimation, and $x_{k+1}$ is the tail cut-off point. The tail shape estimator ϕ measures the curvature, i.e., the fatness of the tails of the return distribution: a high (low) ϕ indicates that the tail is fat (thin). The inverse of ϕ is the tail index (θ), which determines the tail probability’s rate of decay. A high (low) θ indicates that the tail decays quickly (slowly) and, therefore, is thin (fat). The tail shape estimator and the tail index give us a good representation of the curvature of the tails, but since tails may have the same shape while implying different extreme observations, we also employ the semi-parametric extreme quantile estimator from De Haan et al. (1994):

$$q_p = x_{k+1}\left(\frac{k}{pn}\right)^{\frac{1}{\theta}}, \tag{3.2.7}$$

where $n$ is the sample size, $p$ is the corresponding exceedance probability, i.e., the likelihood that a return $x_j$ exceeds the tail value $q$, and $x_{k+1}$ is the tail cut-off point. We note that one of the inputs of $q_p$ is the tail shape estimator ϕ. Similar to value-at-risk (VaR) modeling, the $q^{-}_{p}$ statistic indicates the level of the worst return occurring with probability $p$, which is small. This is the reason why we call $q_p$ the extreme quantile return (EQR). As we are interested only in the upside returns with probability $p$ estimated from calls, we only compute $q^{+}_{p}$ by applying the same methodology to the right side of the RND obtained from the single stock option market¹³.

In addition to the EQR, we also evaluate the density function tails using expected shortfall (ES), which captures the average loss beyond the tail cut-off point. As we are interested in the upside of the distribution, we call this measure expected upside (EU), i.e., the average gain beyond the tail cut-off point. We evaluate the EU following the Danielsson et al. (2006) formula for the ES, which relates the EQR (i.e., the VaR) to the ES (i.e., the CVaR) as described below:

$$EU_q(p) = \frac{\theta}{\theta - 1}\cdot x_{k+1}\left(\frac{k}{pn}\right)^{\frac{1}{\theta}}, \tag{3.2.8}$$

where θ is the tail index.
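Eqs. (3.2.6) to (3.2.8) can be sketched in a few lines for the right tail of a sample of positive returns. The code below is a minimal illustration, assuming a synthetic Pareto sample with a known tail index of 4 (so that ϕ = 0.25) in place of the return streams used in the thesis; the function names and the sample are hypothetical.

```python
import numpy as np


def hill_tail(returns, k):
    """Hill estimate for the right tail: returns (phi, theta, cutoff), Eq. (3.2.6)."""
    x = np.sort(np.asarray(returns, dtype=float))[::-1]   # descending order
    cutoff = x[k]                                          # x_{k+1}, the tail cut-off
    phi = float(np.mean(np.log(x[:k] / cutoff)))
    return phi, 1.0 / phi, float(cutoff)


def extreme_quantile(returns, k, p):
    """Extreme quantile return for exceedance probability p, Eq. (3.2.7)."""
    phi, _, cutoff = hill_tail(returns, k)
    return cutoff * (k / (p * len(returns))) ** phi        # exponent 1/theta = phi


def expected_upside(returns, k, p):
    """Expected upside beyond the extreme quantile, Eq. (3.2.8); needs theta > 1."""
    _, theta, _ = hill_tail(returns, k)
    return theta / (theta - 1.0) * extreme_quantile(returns, k, p)


# Synthetic sample: classical Pareto with minimum 1 and tail index 4.
rng = np.random.default_rng(0)
sample = rng.pareto(4.0, 20000) + 1.0
k = int(0.04 * len(sample))                                # four percent of observations

phi, theta, cutoff = hill_tail(sample, k)
q = extreme_quantile(sample, k, p=0.01)
eu = expected_upside(sample, k, p=0.01)
print(f"phi={phi:.3f}, theta={theta:.2f}, EQR={q:.3f}, EU={eu:.3f}")
```

The estimator requires the tail observations and the cut-off to be positive, which holds for the upside tail considered here.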

De Haan et al. (1994) show that the tail shape estimator statistic $\sqrt{k}\,(\hat{\varphi}(k) - \varphi)$ and the tail quantile statistic $\frac{\sqrt{k}}{\ln\left(\frac{k}{pn}\right)}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$ are asymptotically normally distributed. Hence, according to Hartmann et al. (2004) and Straetmans et al. (2008), the t-statistics for such estimators are given by:

$$T_{\varphi} = \frac{\varphi_1 - \varphi_2}{\sigma[\varphi_1 - \varphi_2]} \sim N(0, 1), \tag{3.2.9a}$$

and

$$T_{q} = \frac{q_1 - q_2}{\sigma[q_1 - q_2]} \sim N(0, 1), \tag{3.2.9b}$$

where the denominators are calculated from the bootstrapped differences between the estimated shape parameters ϕ and between the quantile parameters $q_p$, using 1,000 bootstraps. The null hypothesis of this test is that the ϕ and $q_p$ parameters of the two distributions have equal means, i.e., $\varphi_1 = \varphi_2$ and $q_1 = q_2$. The alternative hypothesis is that ϕ and $q_p$ have unequal means. This t-test is also applied to our EU analysis, as the EU follows the same distribution as the tail quantile statistic $\frac{\sqrt{k}}{\ln\left(\frac{k}{pn}\right)}\left[\ln\frac{\hat{q}(p)}{q(p)}\right]$, given that the EU is the extreme quantile estimator multiplied by a constant.

13 Our EQR measure is closely connected to the risk-neutral tail loss measure of Vilkov and Xiao (2013).
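A bootstrap version of the test in Eq. (3.2.9a) might look as follows. This is a hedged sketch rather than the implementation used in the thesis: it assumes that σ[·] is taken as the standard deviation of the bootstrapped differences, resamples each sample with replacement, and uses two synthetic Pareto samples with different tail indices. All names and inputs are hypothetical.

```python
import numpy as np


def hill_phi(x, k):
    """Hill tail shape estimate from the k largest observations."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    return float(np.mean(np.log(x[:k] / x[k])))


def bootstrap_t_stat(sample_1, sample_2, tail_fraction=0.04, n_boot=1000, seed=0):
    """t-statistic for phi_1 = phi_2 with a bootstrapped standard error."""
    rng = np.random.default_rng(seed)
    k1 = int(tail_fraction * len(sample_1))
    k2 = int(tail_fraction * len(sample_2))
    point_diff = hill_phi(sample_1, k1) - hill_phi(sample_2, k2)

    diffs = np.empty(n_boot)
    for b in range(n_boot):
        r1 = rng.choice(sample_1, size=len(sample_1), replace=True)
        r2 = rng.choice(sample_2, size=len(sample_2), replace=True)
        diffs[b] = hill_phi(r1, k1) - hill_phi(r2, k2)

    # Assumption: the denominator is the standard deviation of the bootstrapped differences.
    return point_diff / diffs.std(ddof=1)


rng = np.random.default_rng(1)
fat = rng.pareto(2.0, 5000) + 1.0     # tail index 2, phi = 0.5
thin = rng.pareto(5.0, 5000) + 1.0    # tail index 5, phi = 0.2

t_stat = bootstrap_t_stat(fat, thin)
print(f"t-statistic: {t_stat:.2f}")
```

The same structure applies to the quantile test of Eq. (3.2.9b) by swapping the Hill estimate for the extreme quantile estimate.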

3.2.4 Estimating RND and EDF

For the estimation of the RND, the first step taken is the application of the Black-Scholes

model to our IV data to obtain options prices (C) for the S&P 500 index. Once our data

is normalized so strikes are expressed in terms of percentage moneyness, the instantaneous

price level of the S&P 500 index (S0) equals 100 for every period for which we would like to

obtain implied returns. Contemporaneous dividend yields for the S&P 500 index are used for

the calculation of C, as well as the risk-free rate from three-, six-, and twelve-month T-bills.

Because we have IV data for five levels of moneyness, we implement a modified Figlewski (2010)

method for extracting our RND structure, as in Felix et al. (2016a). The main advantage of

this method over other techniques is that it extracts the body and tails of the distribution

separately, thereby allowing for fat tails.
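The first step, converting implied volatilities into call prices with the Black-Scholes model on a moneyness grid with the spot normalized to 100, can be sketched as below. The five moneyness levels, the IV quotes, the rate and the dividend yield are hypothetical values chosen for illustration; the subsequent extraction of the density body and tails is not shown.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(s0, strike, t, r, q, iv):
    """Black-Scholes price of a European call with continuous dividend yield q."""
    d1 = (log(s0 / strike) + (r - q + 0.5 * iv**2) * t) / (iv * sqrt(t))
    d2 = d1 - iv * sqrt(t)
    return s0 * exp(-q * t) * norm_cdf(d1) - strike * exp(-r * t) * norm_cdf(d2)


# Hypothetical IV quotes keyed by percentage moneyness (strike as a percent of spot).
s0 = 100.0
iv_by_moneyness = {80: 0.28, 90: 0.24, 100: 0.20, 110: 0.19, 120: 0.18}
maturity, rate, dividend_yield = 0.5, 0.02, 0.015   # assumed six-month inputs

prices = {
    m: bs_call(s0, float(m), maturity, rate, dividend_yield, iv)
    for m, iv in iv_by_moneyness.items()
}
for m, c in prices.items():
    print(f"moneyness {m}%: call price {c:.3f}")
```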

The Figlewski (2010) method is close to the one employed by Bliss and Panigirtzoglou

(2004), where body and tails are also extracted separately. Bliss and Panigirtzoglou (2004)

use a weighted natural spline algorithm for interpolation, which has the same decreasing-noise

effect in RNDs of using splines in the absence of knots, as done in Figlewski (2010). The

extrapolation in Bliss and Panigirtzoglou (2004) is done by the introduction of a pseudo-data

point, which has the effect of pasting lognormal tails into the RND. One advantage of these two

approaches is that the extrapolation does not result in negative probabilities, which is possible

when splines are applied in such cases. Nevertheless, we favor the approach of Figlewski (2010)

as the lognormal tails employed by Bliss and Panigirtzoglou (2004) assume that IV is constant

beyond the observable strikes, resembling the Black-Scholes model. The modification made to

the Figlewski (2010) method by Felix et al. (2016a) entailed having flexible inner anchor points

(as opposed to having fixed anchor points) for fitting tails to the risk neutral density. The aim

of this modification is to prevent the method from estimating density functions with

implausible shapes.

We estimate the EDF in two different ways. First, using the entire sample of realized returns ($r$), we estimate long-term EDFs non-parametrically, where $r = \ln(S_T/S_t)$, $S_t$ is the realized return index at time $t$, and $S_T$ is the level of the same index three, six or twelve months forward, i.e., respectively 63, 126 and 252 days forward. Because of overlapping periods, we initially estimate our empirical distribution from non-overlapping returns for these three maturities by using distinct starting points. This methodology is also applied by Jackwerth (2000) and Ait-Sahalia and Lo (2000). However, because the length of the overlapping periods is relatively large compared to our total sample, especially for the twelve-month forward returns, we average the distributions obtained with distinct starting points to smooth the shape of our multiple-horizon distributions¹⁴.
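The averaging over distinct starting points can be illustrated with a histogram-based sketch. It assumes a simulated price series and a fixed set of return bins, both hypothetical; each starting point yields one set of non-overlapping returns, and the resulting bin frequencies are averaged.

```python
import numpy as np


def averaged_edf(prices, horizon, bin_edges):
    """Average the histograms of non-overlapping log returns over all starting points."""
    prices = np.asarray(prices, dtype=float)
    hists = []
    for start in range(horizon):
        path = prices[start::horizon]                 # one non-overlapping sampling grid
        if len(path) < 2:
            continue
        r = np.log(path[1:] / path[:-1])              # r = ln(S_T / S_t)
        counts, _ = np.histogram(r, bins=bin_edges)
        if counts.sum() > 0:
            hists.append(counts / counts.sum())
    return np.mean(hists, axis=0)


# Hypothetical daily price index simulated as a geometric random walk.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 5000)))
edges = np.linspace(-1.0, 1.0, 41)

edf_3m = averaged_edf(prices, horizon=63, bin_edges=edges)
print(f"bins: {len(edf_3m)}, total probability: {edf_3m.sum():.6f}")
```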

14As a robustness check to this approach, we compare our three-, six- and twelve-month empirical distributions


In a second step, we estimate time-varying EDFs built from an invariant component, the standardized innovation density, and a time-varying part, the conditional variance ($\sigma^2_{t|t-1}$) produced by an EGARCH model (see Nelson, 1991). We first define the standardized innovation as the ratio of empirical returns to their conditional standard deviation, $\ln(S_t/S_{t-1})/\sigma_{t|t-1}$, produced by the EGARCH model. From the set of standardized innovations produced, we can then estimate a density shape, i.e., the standardized innovation density. The advantage of such a density shape over a parametric one is that it may include the typically observed fat tails and negative skewness, which are not incorporated in simple parametric models, e.g., the normal. As mentioned, this density shape is invariant; it is made time-varying by multiplying each standardized innovation by the EGARCH conditional standard deviation at time $t$, which is specified as follows:

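$$\ln(S_t/S_{t-1}) = \mu + \varepsilon_t, \quad \varepsilon \sim f(0, \sigma^2_{t|t-1}) \tag{3.2.10a}$$

and

$$\sigma^2_{t|t-1} = \omega_1 + \alpha\varepsilon^2_{t-1} + \beta\sigma^2_{t-1|t-2} + \vartheta\,\mathrm{Max}[0, -\varepsilon_{t-1}]^2, \tag{3.2.10b}$$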

where α captures the sensitivity of the conditional variance to lagged squared innovations ($\varepsilon^2_{t-1}$), β captures the sensitivity of the conditional variance to the lagged conditional variance ($\sigma^2_{t-1|t-2}$), and ϑ allows for the asymmetric impact of lagged returns ($\vartheta\,\mathrm{Max}[0, -\varepsilon_{t-1}]^2$). The model is estimated by maximizing the log-likelihood, where innovations are assumed to be normally distributed.
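The variance recursion, as written in Eq. (3.2.10b), can be sketched directly. The parameter values and the initial variance below are hypothetical and chosen only to show the asymmetric response to negative innovations; the likelihood-based estimation of the parameters is not shown.

```python
import numpy as np


def conditional_variance(eps, omega, alpha, beta, theta, sigma2_init):
    """Iterate the recursion of Eq. (3.2.10b) over a series of innovations."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps) + 1)
    sigma2[0] = sigma2_init
    for t in range(len(eps)):
        downside = max(0.0, -eps[t])                  # Max[0, -eps_{t-1}]
        sigma2[t + 1] = (
            omega + alpha * eps[t] ** 2 + beta * sigma2[t] + theta * downside**2
        )
    return sigma2


# Hypothetical parameters (omega, alpha, beta, theta) and starting variance.
params = dict(omega=1e-6, alpha=0.03, beta=0.90, theta=0.08, sigma2_init=1e-4)

after_gain = conditional_variance([0.02], **params)[1]
after_loss = conditional_variance([-0.02], **params)[1]
print(f"variance after +2%: {after_gain:.6f}, after -2%: {after_loss:.6f}")
```

A negative innovation raises next-period variance by more than a positive innovation of the same size, which is the asymmetry the ϑ term is meant to capture.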

Up to this point, we managed to produce a one-day horizon EDF for every day in our

sample but we still lack time-varying EDFs for the three-, six-, and twelve-month horizons.

Thus, we use bootstrapping to draw 1,000 paths towards these desired horizons by randomly

selecting single innovations (εt+1) from the one-day horizon EDFs available for each day in

our sample. We note that once the first return is drawn, the conditional variance ($\sigma^2_{t-1|t-2}$) is updated, affecting the subsequent innovation drawings of a path. This sequential exercise continues through time until the desired horizon is reached. In order to account for drift in the simulated paths, we add the daily drift estimated from the long-term EDF plus the risk-free rate to the drawn innovations; thus, the one-period simulated return is $\varepsilon_{t+1} + \mu + R_f$. The

density functions produced by the collection of returns implied by the terminal values of every

path and their starting points are our three-, six-, and twelve-month EDFs. These simulated

paths contain, respectively, 63, 126, and 252 daily returns. We note that by drawing returns

from stylized distributions with fat-tails and excess skewness, our EDFs for the three relevant

horizons also embed such features. Finally, once these three time-varying EDFs are estimated

for all days in our sample, we estimate γ for each of these days using Eq. (3.2.5)15.
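The path simulation described above can be sketched as follows, assuming a pool of standardized innovations, the variance recursion of Eq. (3.2.10b) and a constant daily drift. The innovation pool (rescaled Student-t draws), the parameter values, the drift and the function name are hypothetical and serve only to illustrate the sequential updating of the conditional variance along each path.

```python
import numpy as np


def simulate_horizon_returns(std_innovations, sigma2_start, params, horizon,
                             drift, n_paths=1000, seed=0):
    """Bootstrap standardized innovations and compound them to a horizon return."""
    omega, alpha, beta, theta = params
    rng = np.random.default_rng(seed)
    z = rng.choice(std_innovations, size=(n_paths, horizon), replace=True)

    sigma2 = np.full(n_paths, float(sigma2_start))
    total = np.zeros(n_paths)
    for t in range(horizon):
        eps = z[:, t] * np.sqrt(sigma2)               # rescale by conditional std. dev.
        total += eps + drift                          # one-period return: eps + mu + Rf
        # Update the conditional variance before the next draw, as in Eq. (3.2.10b).
        sigma2 = (omega + alpha * eps**2 + beta * sigma2
                  + theta * np.maximum(0.0, -eps) ** 2)
    return total


# Hypothetical pool of standardized innovations with fat tails (unit variance).
rng = np.random.default_rng(42)
pool = rng.standard_t(5, size=5000) / np.sqrt(5.0 / 3.0)

horizon_returns = simulate_horizon_returns(
    pool, sigma2_start=1e-4, params=(1e-6, 0.03, 0.90, 0.08),
    horizon=63, drift=0.0003,
)
print(f"paths: {len(horizon_returns)}, mean 63-day return: {horizon_returns.mean():.4f}")
```

A histogram or kernel estimate of `horizon_returns` would then play the role of the simulated EDF for the chosen horizon and starting day.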

with the ones calculated from non-overlapping returns. We use data since 1871 for the US equity price index, made available by Welch and Goyal (2008), who use S&P 500 data since 1926, and data from Robert Shiller’s website for the preceding period. Our empirical distributions are quite similar to the ones estimated from the longer data set, suggesting that they are indeed suitable as long-term distributions. The use of overlapping returns is less problematic in our calculations than in regression estimation, where statistical inferences on parameter estimates can be strongly affected by overlapping returns’ serial correlation.

15Due to drift, the model of time-varying EDF for the twelve-month horizon occasionally does not match the


Our approach for estimating both the long-term EDF and the time-varying EDF is closely

connected to the method applied by Polkovnichenko and Zhao (2013). The time-varying method

used by these authors is based on Rosenberg and Engle (2002). The choice for an EGARCH

approach versus the standard GARCH model is due to the asymmetric feature of the former

model that embeds the “leverage effect”16.

3.3 Empirical analysis and results

In this section, we present our empirical results. Since we estimate EDF in the two ways

described (the long-term and time-varying EDFs), we are able to estimate long-term and time-

varying γ’s by minimizing Eq. (3.2.5). We use our long-term γ estimates to compute the

ECPT to compare it to the other subjective density functions using the tests described in

section 3.2.3. The time-varying estimates of γ are analyzed in sections 3.3.3 and 3.3.4 with

the use of a regression model. Finally, in section 3.4, we perform four robustness tests on our

results by using an alternative weighting function to the CPT, the one embedded in the Prelec

(1998) model, and we apply Kupiec’s test to probability tails, among other checks.

3.3.1 Estimated CPT long-term parameters

We report the estimated CPT parameters (λ and γ) extracted from long-term density functions

in Table 3.1, Panel A. Our first finding is that λ, the parameter of loss aversion, which is 2.25 in

the CPT, fluctuates around that number for six- and twelve-month options but shows a quite

different outcome for three-month options. Our estimation of λ from three-month options is

1.02, which indicates no loss aversion. For the six- and twelve-month options λ is 2.66 and 3.00,

respectively. This finding suggests that loss aversion is more pronounced at longer maturities

than suggested by the CPT. Apart from that, twelve-month λ estimates vary widely across the different optimization upper bounds used (i.e., 3, 5 and 10), always matching the bound value, whereas estimates from the three- and six-month option maturities are very stable across upper bounds.

The estimated probability weighting function parameter γ is slightly higher than the one suggested by the CPT (i.e., 0.61) at the three- and six-month horizons, at 0.75 and 0.81, respectively. For twelve-month options, γ is around 1.09. These results suggest that overweighting of small probabilities occurs in short-term options (up to six months), while twelve-month options

seem to behave more rationally. These findings support our hypothesis that individual investors

are, on average, biased when purchasing single stock call options, as suggested by Barberis and

Huang (2008).

one of the PCPT model. This difference makes the estimation of γ (Eq. (3.2.5)) challenging, as a large number of γ estimates produce unreasonable PDFs, such as non-monotonic CDFs. Therefore, to perform the optimizations given by Eq. (3.2.5), we set the mode of the simulated EDF equal to the one of the PCPT.

16 The leverage effect is the negative correlation between an asset’s returns and changes in its volatility. For a comparison between alternative GARCH approaches, see Bollerslev et al. (2009).


Table 3.1: Long-term CPT parameters and consistency test on tail shape

Panel A - long-term CPT parameters

             Gamma (γ)          Lambda (λ)         (γ|λ)
Maturity     Estimate   RSS     Estimate   RSS     Estimate   RSS
3 months     0.75       0.02    1.02       0.12    0.54       0.01
6 months     0.81       0.02    2.66       0.3     0.87       0.02
12 months    1.09       0.06    3          1.64    1.12       0.07

Panel B - statistical test on tail shape parameters (Phi)

Maturity     (1) vs (2)      (1)        (2) EDF   p-value   t-stat
3 months     RND vs EDF      0.20∗      0.29      0.1%      −3.6
             Power vs EDF    0.17∗∗∗    0.29      0.0%      −4.9
             Expo vs EDF     0.18∗∗∗    0.29      0.0%      −4.6
             PCPT vs EDF     0.17∗∗∗    0.29      0.0%      −4.9
             CPT vs EDF      0.20       0.29      0.0%      −4.6
             ECPT vs EDF     0.20       0.29      0.0%      −3.7
6 months     RND vs EDF      0.19∗∗     0.23      2.8%      −2.3
             Power vs EDF    0.16∗∗∗    0.23      0.0%      −4.0
             Expo vs EDF     0.16∗∗∗    0.23      0.0%      −3.9
             PCPT vs EDF     0.17∗∗∗    0.23      0.0%      −4.0
             CPT vs EDF      0.19∗∗     0.23      0.0%      −3.9
             ECPT vs EDF     0.18∗∗∗    0.23      0.0%      −2.8
12 months    RND vs EDF      0.22       0.14      0.0%      4.0
             Power vs EDF    0.14∗∗∗    0.14      37.7%     0.3
             Expo vs EDF     0.14∗∗∗    0.14      39.7%     −0.1
             PCPT vs EDF     0.18∗∗     0.14      1.9%      2.5
             CPT vs EDF      0.22∗∗∗    0.14      0.0%      4.3
             ECPT vs EDF     0.18∗∗     0.14      1.7%      2.5

Panel A of this table reports the estimated long-term CPT parameters gamma (γ), lambda (λ), and γ conditional on the optimal λ (γ|λ) from the single stock options, as well as the residual sum of squares (RSS) of Eqs. (3.2.4) and (3.2.5). The parameter γ defines the curvature of the weighting function for gains, which leads the probability distortion functions to assume inverse S-shapes. Estimated parameters close to unity lead to weighting functions that are close to un-weighted probabilities, whereas parameters close to zero denote a larger overweighting of small probabilities. The parameter λ is the loss aversion parameter. These parameters are long-term since their estimates are obtained by setting the average CPT density functions to match the return distribution realized within our full sample. These parameters are estimated using Eqs. (3.2.4) and (3.2.5). Panel B reports the results from the statistical test of the tail shape parameter phi (ϕ), according to Eqs. (3.2.6) and (3.2.9a), applied to the averaged probability density functions. As the densities compared here are averaged, for the RND and subjective densities, or estimated using our full sample, for realized returns, this test aims to assess the long-term consistency between distribution tail shapes. The null hypothesis of these tests is that ϕ from the two distributions being compared have equal means and, therefore, tail shapes are consistent. The rejection of the null hypothesis is tested by t-tests of Eq. (3.2.9a) at the ten, five, and one percent statistical levels, respectively, shown by superscripts *, **, ***, assigned to ϕ for the RND, displayed in column (1), and ϕ for the EDF, shown in column (2).

3.3.2 Density functions tails’ consistency test results

As specified in section 3.2.3, we test the empirical consistency of density function tails among a

set of five subjective distributions (CPT, PCPT, ECPT, power, exponential), the RND, and

the EDF. We perform these tests by employing EVT through the application of Eqs. (3.2.6)

to (3.2.9b). For such purpose, we require return streams (xj), which are only available for the

long-term EDF. Thus, we apply an inversion transform sampling technique to our other PDFs

to obtain sampled returns for them. Such method, also known as the Smirnov method, entails


drawing n random numbers from a uniformly distributed variable U = (u1, u2, ..., un) bounded

at interval [0, 1] and, subsequently, computing xj ← F−1(uj), where F are the CDFs of interest

(see Devroye, 1986, p.28). Hence, the Smirnov method simulates returns that resemble the

ones of the inverse CDF by randomly drawing probabilities along such function.
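Inverse transform sampling from a tabulated CDF can be sketched with linear interpolation of the inverse CDF. The example assumes a hypothetical exponential CDF evaluated on a grid, standing in for the CDFs of the RND and subjective densities; the function name and grid are illustrative.

```python
import numpy as np


def sample_from_cdf(grid, cdf, n, seed=0):
    """Draw n values x_j = F^{-1}(u_j), with u_j uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=n)
    # Interpolating the grid against the CDF values evaluates the inverse CDF at u.
    return np.interp(u, cdf, grid)


# Hypothetical tabulated CDF: exponential distribution with unit mean.
grid = np.linspace(0.0, 20.0, 2001)
cdf = 1.0 - np.exp(-grid)

draws = sample_from_cdf(grid, cdf, n=100_000)
print(f"sample mean: {draws.mean():.3f} (the exponential has mean 1)")
```

The sampled values can then be passed to the tail estimators in the same way as realized returns.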

Once we obtain returns for all five PDFs, the next step is to set k as the optimal number of

observations used for estimation of ϕ by Eq. (3.2.6), the Hill-estimator. For this purpose, we

produce Hill-plots for the right tail of our distributions, which depict the relationship between

k and ϕ as a curve (see Straetmans et al., 2008). Picking the optimal k is done by observing

the interval in this curve where the value of ϕ stabilizes while k changes. This area suggests a

stable trade-off between a good approximation of the tail shape by the Pareto distribution and

the uncertainty of such approximation (by the use of fewer observations). The interval that

corresponds to roughly four to seven percent of observations seems to be a stable region across

the Hill-plots of the tails of the EDF and the CPT. As an increase in k increases the statistical

power of the estimator but may distort the shape of the tail, we decide to set k as chosen from

the Hill-plots for EDF and CPT tails equal to four percent.

We examine whether the tail shape parameter (ϕ), computed via the Hill (1975) estimator,

for the RND and for our subjective density functions (i.e., power, exponential, PCPT, CPT

and ECPT) matches the one for the EDF. The outcomes from the statistical tests performed

to compare tail shape parameters (Eq. (3.2.9a)) are provided in Table 3.1, Panel B. Results

suggest that for the three-month maturity options, ϕ for the RND, CPT and ECPT (at 0.20)

are the closest to the EDF parameter (at 0.29) but they are not statistically equal. The ϕ estimates for the power, exponential, and PCPT density functions do not match the one for the

EDF, as they are all around 0.17 and, thus, exhibit fatter tails than the EDF.

We observe that the results for the six- and twelve-month options are very similar to the

ones obtained for the three-month expiry. The parameter estimate ϕ of the EDF is statistically

equal to the RND and CPT. Parameter ϕ ranges from 0.18 to 0.19 for the CPT, ECPT, and

RND for the six- and twelve-month maturities, whereas it is 0.23 for the EDF. The estimate of

ϕ for the RND (0.19 and 0.22 for the six- and twelve-month maturities, respectively) somewhat

matches the one for the EDF at the six-month maturity but it is off at the twelve-month

maturity. The parameter estimates ϕ for the power, exponential, and PCPT density functions

match the EDF’s ϕ at the twelve-month maturity only. Generally, the parameter estimates ϕ

for these subjective density functions are too small in comparison to the one of the EDF. This

means that these six- and twelve-month maturity subjective density functions have fatter tails

than the EDF, the other subjective densities (CPT and ECPT), and the RND. These results

suggest that the shape of the CPT density function is a good match to the shape of realized

tails.

After k is chosen and the shape estimator ϕ for the EDF, RND, power, exponential, PCPT,

CPT, and ECPT is computed, extreme quantile returns (EQR) can also be estimated via Eq.

(3.2.7). Subsequently, the t-test in Eq. (3.2.9b) is applied using the one, five and ten percent

statistical significance levels. This test evaluates whether the EQRs estimated from a set of


two distributions (RND, power, exponential, PCPT, and CPT versus EDF) have equal means

(the null hypothesis). The results of this test are shown in Table 3.2, Panel A.

Analyzing the density functions derived from the three-month option maturity, we find that

the EQR implied by the CPT is the only one that matches the realized EQR, and solely at the first quantile, at 21 percent. The EQR implied by the ECPT is almost the same as that implied by the CPT; thus, it also statistically matches the EDF. In contrast, the EQRs for the RND,

power, exponential, and PCPT densities always overshoot the one for the EDF. All comparisons

between these distributions’ EQR at the three-month maturity reject the null hypothesis that

returns at the same quantile are equal. This pattern is observed across all quantiles analyzed,

i.e., at the tenth, fifth, and first quantiles. This empirical finding indicates that the equity

market upside implied in option markets (i.e., the RND) and the power, exponential and PCPT

densities are always higher than the ones realized by the equity market. The results for the

PCPT resemble the ones for the RND. The EQRs from the CPT and the ECPT are clearly the

best matches for the EDF.

For the six-month maturity, upside returns priced by the RND and ECPT best match the

EQR. The EQRs for the EDF are roughly 18, 22, and 32 percent for the tenth, fifth, and first

quantile of returns, respectively, whereas the EQRs for the ECPT are 19, 21, and 28 percent.

For the RND, such extreme upside return estimates are 19, 22, and 30 percent. Thus, the ECPT

statistically matches the realized EQR best at the tenth and fifth quantiles, whereas the RND is the best match for the first quantile. No rational subjective density function consistently matches the EQR of the EDF. The power, exponential, and PCPT densities almost always overshoot the EQR of realized returns. In contrast, the CPT density always undershoots the

EDF’s extreme returns. Despite always overshooting the EQR of the EDF, the PCPT is the

only other subjective density (apart from the ECPT) that has EQR statistically equal to the

EDF, which happens only at the first quantile EQR.

In contrast to the three- and six-month maturities, the EQRs from the RND for the twelve-

month maturity all underestimate the EQRs from realized returns. The EQRs of realized

returns are 32, 35, and 44 percent for the tenth, fifth and first quantiles, respectively, whereas

for the RND these are 22, 26, and 37 percent, respectively. The same underestimation is

documented for the densities linked to the CPT (i.e., PCPT, CPT and ECPT) as tail returns

are largely out of sync with realized ones, especially for the CPT, in which overweighting of tails

will force EQRs further away from EDF ones (vis-a-vis the PCPT EQRs). The EQRs of the

exponential densities continue to largely overshoot the ones for the EDF. However, the power

utility function density successfully matches the EQR returns across all EQR values and with

strong statistical significance.


Table 3.2: EVT consistency tests on tail returns

Panel A - Tails extreme quantile returns (EQR)

                              10% quantile                          5% quantile                           1% quantile
Maturity    (1) vs. (2)       (1)       (2) EDF  p-value  t-stat    (1)       (2) EDF  p-value  t-stat    (1)       (2) EDF  p-value  t-stat
3 months    RND vs. EDF       0.16***   0.11     0.0%     -7.7      0.19***   0.13     0.0%     -8.2      0.26***   0.21     0.0%     -5
            Power vs. EDF     0.21***   0.11     0.0%     -13.9     0.23***   0.13     0.0%     -15.4     0.31***   0.21     0.0%     -10.4
            Expo vs. EDF      0.21***   0.11     0.0%     -13.9     0.24***   0.13     0.0%     -15.7     0.32***   0.21     0.0%     -11.5
            PCPT vs. EDF      0.18***   0.11     0.0%     -10.8     0.21***   0.13     0.0%     -11.6     0.27***   0.21     0.0%     -7.3
            CPT vs. EDF       0.13***   0.11     0.2%     -3.3      0.15***   0.13     0.7%     -2.9      0.21      0.21     36.3%    0.4
            ECPT vs. EDF      0.15***   0.11     0.0%     -5.4      0.17***   0.13     0.0%     -5.3      0.23*     0.21     8.1%     -1.8
6 months    RND vs. EDF       0.19      0.18     18.5%    -1.2      0.22      0.22     36.5%    -0.4      0.3*      0.32     8.9%     1.7
            Power vs. EDF     0.25***   0.18     0.0%     -9.6      0.28***   0.22     0.0%     -10.2     0.36***   0.32     0.0%     -4.3
            Expo vs. EDF      0.26***   0.18     0.0%     -10.6     0.29***   0.22     0.0%     -11.5     0.37***   0.32     0.0%     -5.5
            PCPT vs. EDF      0.22***   0.18     0.0%     -5.1      0.24***   0.22     0.0%     -4.6      0.32      0.32     36.4%    -0.4
            CPT vs. EDF       0.16***   0.18     0.0%     4.2       0.18***   0.22     0.0%     6.5       0.25***   0.32     0.0%     6.6
            ECPT vs. EDF      0.19      0.18     28.4%    -0.8      0.21      0.22     35.2%    0.5       0.28***   0.32     0.5%     2.9
12 months   RND vs. EDF       0.22***   0.32     0.0%     10.5      0.26***   0.35     0.0%     11.5      0.37***   0.44     0.0%     6
            Power vs. EDF     0.33      0.32     19.8%    -1.2      0.36*     0.35     9.9%     -1.7      0.45*     0.44     9.2%     -1.7
            Expo vs. EDF      0.34***   0.32     0.2%     -3.2      0.38***   0.35     0.0%     -4        0.47***   0.44     0.1%     -3.5
            PCPT vs. EDF      0.26***   0.32     0.0%     6.4       0.3***    0.35     0.0%     7         0.39***   0.44     0.0%     3.7
            CPT vs. EDF       0.18***   0.32     0.0%     23.3      0.21***   0.35     0.0%     26.9      0.3***    0.44     0.0%     12.2
            ECPT vs. EDF      0.26***   0.32     0.0%     7.1       0.29***   0.35     0.0%     7.8       0.39***   0.44     0.0%     4.1

Panel B - Tails expected upside returns (EU)

                              10% quantile                          5% quantile                           1% quantile
Maturity    (1) vs. (2)       (1)       (2) EDF  p-value  t-stat    (1)       (2) EDF  p-value  t-stat    (1)       (2) EDF  p-value  t-stat
3 months    RND vs. EDF       0.2***    0.15     0.0%     -5.3      0.23***   0.19     0.0%     -5.1      0.32*     0.3      8.9%     -1.7
            Power vs. EDF     0.25***   0.15     0.0%     -10.3     0.28***   0.19     0.0%     -10.9     0.37***   0.3      0.0%     -5.9
            Expo vs. EDF      0.26***   0.15     0.0%     -10.6     0.29***   0.19     0.0%     -11.4     0.39***   0.3      0.0%     -7.2
            PCPT vs. EDF      0.22***   0.15     0.0%     -7.5      0.25***   0.19     0.0%     -7.4      0.33***   0.3      0.6%     -2.9
            CPT vs. EDF       0.16      0.15     18.6%    -1.2      0.19      0.19     39.0%    -0.2      0.26***   0.3      0.1%     3.4
            ECPT vs. EDF      0.18***   0.15     0.5%     -3        0.21**    0.19     3.2%     -2.2      0.28      0.3      13.3%    1.5
6 months    RND vs. EDF       0.24      0.24     36.9%    0.4       0.27      0.28     10.9%    1.6       0.37***   0.41     0.2%     3.3
            Power vs. EDF     0.3***    0.24     0.0%     -6.8      0.33***   0.28     0.0%     -6.6      0.43      0.41     14.4%    -1.4
            Expo vs. EDF      0.31***   0.24     0.0%     -7.9      0.34***   0.28     0.0%     -8        0.45**    0.41     1.4%     -2.6
            PCPT vs. EDF      0.26**    0.24     1.4%     -2.6      0.29      0.28     13.3%    -1.5      0.38*     0.41     5.9%     2
            CPT vs. EDF       0.2***    0.24     0.0%     6         0.22***   0.28     0.0%     8.8       0.3***    0.41     0.0%     8.1
            ECPT vs. EDF      0.23      0.24     15.3%    1.4       0.26***   0.28     0.2%     3.3       0.35***   0.41     0.0%     4.9
12 months   RND vs. EDF       0.28***   0.37     0.0%     7.6       0.33***   0.4      0.0%     7.7       0.46***   0.51     0.7%     2.8
            Power vs. EDF     0.38      0.37     14.7%    -1.4      0.42*     0.4      5.8%     -2        0.53*     0.51     5.8%     -2
            Expo vs. EDF      0.4***    0.37     0.2%     -3.2      0.44***   0.4      0.0%     -4        0.55***   0.51     0.1%     -3.4
            PCPT vs. EDF      0.32***   0.37     0.0%     4.8       0.36***   0.4      0.0%     4.9       0.48*     0.51     6.4%     1.9

CPT

vs.EDF

0.23***

0.37

0.0%

19.2

0.27***

0.4

0.0%

21.5

0.38***

0.51

0.0%

9.1

ECPT

vs.EDF

0.31***

0.37

0.0%

5.3

0.35***

0.4

0.0%

5.6

0.47**

0.51

3.0%

2.3

This

table

reportsth

eresu

ltsfrom

statisticaltestsofth

eex

trem

equantile

retu

rn,EQR

(in

Panel

A)and

tail

expected

upsideretu

rns,

EU

(in

Panel

B),

perform

edaccord

ingto

Eqs.

(3.2.7),

(3.2.8)and(3.2.9b)applied

toaveraged

den

sity

functions.

Since

theden

sities

comparedhereare

averaged

forth

eRND

andforth

esu

bjectiveden

sities

orestimatedusingourfullsample

forrealized

retu

rns,

thesetestsaim

toinvestigate

thelong-term

consisten

cybetween

thedistribution

tails.

Thenull

hypoth

esis

ofth

esetestsis

thatth

eEQR

and

thetail

expected

upsideretu

rnsfrom

the

distributionsbeingco

mparedhaveeq

ualmea

nsand,th

erefore,tailsare

consisten

t.Therejectionofth

enullhypoth

esis

istested

byt-testsofEq.(3.2.9b)atth

eten,five,

andonepercentstatistical

levels,

resp

ectively,

shownbysu

perscripts

*,**,***,assigned

toth

eEQR

orth

eEU

forth

eRND,displayed

inco

lumn(1),

andforth

esamestatisticsforth

eEDF,sh

ownin

column(2).


In line with these results for the EQR, Table 3.2, Panel B, shows that the expected upside

(EU) for the EDF is more closely matched within the three-month horizon by the CPT and

ECPT density functions for the tenth, fifth and first quantiles. The three-month horizon EUs

estimated from the realized returns are 15, 19, and 30 percent for the mentioned quantiles. The

ECPT EUs for the same horizon are 18, 21, and 28 percent, respectively. For the CPT, EUs are

16, 19 and 26 percent. Thus, estimates from these two density functions are mostly statistically

equal to the realized returns. Similarly to our analysis on the EQR, for the other subjective

densities, the EUs for all quantiles are also much larger than the EDF expected upside. The

exponential density has the highest expected upside across the different quantiles, being the

furthest away from the realized returns. The RND-implied expected upside is somewhat more
conservative and relatively closer to the realized ones, but it is statistically equal to them
(at the five percent level) only at the one percent quantile.
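To make the two tail statistics concrete, the following is a minimal sketch of their plain sample analogues: the EQR as the return at which the right tail begins and the EU as the average return inside that tail. It is an illustration under simplifying assumptions only. The estimates in Table 3.2 come from the EVT estimators of Eqs. (3.2.7) and (3.2.8), which are not reproduced here, and the function name `right_tail_stats` and the sample data are hypothetical.

```python
def right_tail_stats(returns, tail_prob):
    """Return (EQR, EU) for the right tail that holds `tail_prob` of the sample.

    Sample analogue only (an assumption of this sketch): the thesis estimates
    these quantities with EVT estimators rather than raw order statistics.
    """
    if not 0.0 < tail_prob < 1.0:
        raise ValueError("tail_prob must lie strictly between 0 and 1")
    if not returns:
        raise ValueError("returns must not be empty")
    ordered = sorted(returns)
    n = len(ordered)
    # Number of observations that fall in the right tail (at least one).
    k = max(1, round(tail_prob * n))
    tail = ordered[n - k:]
    eqr = tail[0]                # threshold return: smallest value in the tail
    eu = sum(tail) / len(tail)   # average return conditional on being in the tail
    return eqr, eu


# Purely illustrative returns (not thesis data): 1%, 2%, ..., 100%.
sample = [i / 100 for i in range(1, 101)]
eqr_10, eu_10 = right_tail_stats(sample, 0.10)
print(eqr_10, eu_10)
```

By construction the EU is never below the EQR at the same quantile, which is the pattern visible when Panel B is compared with Panel A.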

For the six-month maturity, the expected upsides for the CPT and ECPT density functions

are no longer that close to each other nor to the realized ones. The EDF expected upside always

exceeds the ones for the CPT and ECPT. Only at the tenth quantile, the expected upside of the

ECPT density function equals the realized one. The densities which better match the expected

upside of the EDFs are the PCPT and the RND.

For the twelve-month horizon, the expected upside for the realized returns is 37, 40, and 51

percent for the tenth, fifth and first quantiles. In line with the results from our EQR analysis,

the power density again best matches realized EUs, as estimates are statistically equal across
all quantiles. Second best performers are the PCPT and ECPT densities, which match the

realized EU at the one percent quantile level.

In summary, across the three EVT tests performed (i.e., on tail shape, EQR and EU), the

three option maturities and the three quantiles evaluated, we observe that the success rate of

the CPT subjective density functions on matching the EDF tails is 57 percent. In contrast, this

success rate is 38 percent for the power utility, 33 percent for the RND and only 10 percent for

the exponential utility density function. These results suggest that CPT-related distributions,

although not always matching the EQRs and EUs of the EDF, seem to best match the tails
of the EDF, especially at the short maturities. More specifically, the ECPT seems to have
some advantage over the other methods for the three- and six-month maturities. This result is
not a surprise, because allowing the CPT weighting function to assume different shapes entails
extra flexibility to match the data relative to traditional utility functions. Thus,
even if our findings suggest that the CPT does not fully explain single stock options pricing, its

overweighting of small probabilities feature goes very far in explaining such market data, with

the exception of twelve-month options. These findings reiterate our takeaway from section

3.3.1, in which a positive term structure of overweight of tails appears to play a substantial

role: twelve-month options are priced more rationally than shorter term ones, which seem to be

priced as a result of lottery buying by individual investors. Figure 3.1 compares the CDFs from

six of our equity return densities: the EDF, the RND, the CPT, the PCPT, the exponential-


and the power-utility density17. We focus on the right tails of these distributions as we are

interested in how closely the RND from call options and derived subjective density functions

match the tails of the EDF.

In Figure 3.1, we see that the tails implied by option prices (RND, in red) are fatter than the

tails from the CPT (in dark blue) and EDF (in green) density functions over the three-month

horizon. The tails for the CPT and the EDF are almost identical above the 120 terminal

level, i.e., at the 20 percent return. The right tail of the RND distribution is clearly much

fatter than the ones of the CPT and EDF, but it is still thinner than the ones of the PCPT,

the exponential- and the power-utility densities. Thus, the upside risk implied from options

is much higher than the one realized by the EDF, a sign of a potentially biased behavior by

investors in such options. This observation is confirmed by the tail shape parameter (ϕ), the

EQRs and the EU estimated across the different quantiles, which in all cases report higher

upside in the RND than in the EDF and the CPT. Figure 3.1 also suggests that the upside

risk of the RND is more consistent with the PCPT density, whereas the CPT tails seem very

distinct from the PCPT, which is in line with our earlier findings.

The plot in panel (b), which depicts the CDF for our studied densities at the six-month

horizon, suggests that the RND and the EDF are closer than at the three-month horizon.

At the same time, the CPT density seems more disconnected from the EDF. This finding

matches our results from the EQR and the expected upside comparisons. The PCPT tail is,

at this horizon, higher than the EDF, CPT, and RND ones and closer to the EDF one than

to the CPT one, especially at its very extreme. This finding is also confirmed by our EQR

and expected upside tests, as the PCPT is statistically equal to the EDF at the one percent

quantile. The exponential and power utility densities have right tails that are much fatter than

the other densities, including the EDF.

Figure 3.1 shows that at the twelve-month horizon the CPT’s CDF tails seem completely

disconnected from the EDF. The EDF tails are much fatter than the CPT ones and slightly

fatter than the RND ones. In fact, the RND seems to match the EDF for terminal levels above

120. This finding suggests that long-term options trade in a much less CPT-biased manner

than short-term options.

Overall, Figure 3.1 confirms our hypothesis that end-users of OTM single stock calls are

likely biased and behave as if buying lottery tickets when trading short-term options. These

results strengthen the evidence provided by Ilmanen (2012), Barberis (2013), Conrad et al.

(2013), Boyer and Vorkink (2014) and Choy (2015) that investors push single stock options

prices to extreme valuation levels. Investors seem to overweight small probabilities especially

at short-term horizons. Next, we analyze the time-variation in overweight of small probabilities

to better understand the underlying reasons for our findings.

17 We omit the ECPT for better visualization as its CDFs are very similar to the CPT ones. The similarity is caused by the ECPT left tail weighting function parameter (δ) being the same as for the CPT and because the estimated long-term γ for the three maturities are close to the Tversky and Kahneman (1992) one.


(a) Three-month horizon (b) Six-month horizon (c) Twelve-month horizon

Figure 3.1: Cumulative density functions. This figure shows three plots that depict the cumulative density function (CDF) for equity returns obtained from the empirical density function (EDF), the risk-neutral density (RND), and the four subjective density functions: 1) the power utility density, 2) the exponential utility density, 3) the cumulative prospect theory density (CPT), and 4) the partial CPT (PCPT). The equity returns’ CDFs from these six sources are presented for three-, six-, and twelve-month horizons. The plots display the cumulative probabilities on the y-axis and the terminal price levels on the x-axis, given an initial price level of 100.

3.3.3 Estimated CPT time-varying parameters

To investigate time-variation in the CPT’s overweighting of small probabilities in single stock

options, we apply Eq.(3.2.5) to each day in the sample to estimate the empirical γ (weighting

function) parameter. Lower and upper bounds of -0.25 and 1.75 were used in this optimization

as they produced the lowest RSS across all permutations of bounds when γ was optimized using

the CPT parameterization. We estimate γ under four different assumptions about λ, the loss

aversion parameter: 1) λ equals 2.25, the CPT parameterization; 2) no loss aversion, λ equals

1; 3) augmented loss aversion, λ equals 3; and 4) optimal λ, as estimated by Eq.(3.2.4).

Table 3.3, Panel A reports the statistics when λ equals 2.25. We find that the median and

the mean time-varying values of γ, estimated from the three-month options are above its CPT

value of 0.61 but still reflect overweight of small probabilities. This suggests that overweighting


of small probabilities is present within the pricing of three-month call options as suggested by

the theory. The distribution of γ is skewed to the right and overweight of small probabilities

is present in 64 percent of the days at the three-month maturity. The 25th percentile of γ is 0.74,
clearly suggesting a less pronounced overweight of small probabilities than suggested by the
CPT. The estimates of γ range from 0 to 1.75 (the latter implying an underweighting of small
probabilities) and are volatile, with a standard deviation of 0.23.

Table 3.3: Time-varying gamma parameter

Panel A - Gamma with CPT loss aversion (λ = 2.25)

Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                  (full)   (98-03)  (03-08)  (08-13)
3 months   -     0.74       0.91    0.89  1.04       1.75  0.23   64%      97%      35%      59%      0.0209
6 months   -     0.81       0.99    0.96  1.14       1.75  0.28   52%      92%      18%      46%      0.017
12 months  0.04  0.91       1.03    1.01  1.14       1.75  0.22   41%      83%      11%      29%      0.0225

Panel B - Gamma with no loss aversion (λ = 1)

Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                  (full)   (98-03)  (03-08)  (08-13)
3 months   0.32  0.52       0.66    0.67  0.8        1.27  0.18   97%      100%     94%      97%      0.0253
6 months   0.32  0.55       0.71    0.72  0.87       1.75  0.21   90%      98%      79%      92%      0.0198
12 months  0.29  0.62       0.83    0.8   0.98       1.75  0.22   81%      98%      63%      83%      0.0169

Panel C - Gamma with augmented loss aversion (λ = 3)

Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                  (full)   (98-03)  (03-08)  (08-13)
3 months   0.45  0.81       0.96    0.98  1.11       1.75  0.25   58%      93%      27%      53%      0.023
6 months   0.38  0.89       1.02    1.06  1.25       1.75  0.25   45%      83%      13%      38%      0.0196
12 months  0.37  0.98       1.07    1.09  1.19       1.75  0.21   31%      66%      6%       22%      0.0265

Panel D - Gamma with optimized loss aversion

Maturity   Min   25% Qtile  Median  Mean  75% Qtile  Max   StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                  (full)   (98-03)  (03-08)  (08-13)
3 months   0.33  0.52       0.66    0.67  0.81       1.75  0.18   97%      100%     93%      97%      0.0249
6 months   0.37  0.86       1.01    1.02  1.17       1.75  0.24   47%      86%      15%      40%      0.0187
12 months  0.34  0.96       1.05    1.06  1.17       1.75  0.2    35%      73%      8%       24%      0.025

This table reports the summary statistics of the estimated CPT time-varying parameter gamma (γ) from the single stock options market for each day in our full sample across different values of lambda (λ). The parameter λ is the loss aversion parameter and the parameter γ defines the curvature of the weighting function for gains, which leads the probability distortion functions to assume inverse S-shapes. An estimated γ parameter close to unity leads to a weighting function that is close to the unweighted probabilities, whereas values close to zero denote a larger overweighting of small probabilities. The column with heading % γ < 1 reports the percentage of observations in which γ < 1, thus, the proportion of the sample in which overweight of small probabilities is observed. We report this metric for the full sample as well as for three equal-sized splits of our full sample, namely: (98-03), from 1998-01-05 to 2003-01-30; (03-08) from 2003-01-31 to 2008-02-21 and; (08-13) from 2008-02-22 to 2013-03-19. Panel A reports the summary statistics of γ when we assume the CPT parameterization, where λ equals 2.25. Panel B reports the summary statistics of γ when we assume the loss aversion parameter λ equals 1 (no loss aversion). Panel C reports the summary statistics of γ when we assume λ equals to 3 (augmented loss aversion). Panel D reports the summary statistics of γ when we assume λ to equal its estimated (optimal) values, as reported in Table 3.1, Panel A.
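The role of γ described in the notes above can be illustrated with a short sketch. It assumes that the weighting function for gains takes the standard single-parameter Tversky and Kahneman (1992) form, w(p) = p^γ / (p^γ + (1 − p)^γ)^(1/γ), which is consistent with the CPT value of 0.61 quoted in the text; the function name `tk_weight` is hypothetical and the code is not the estimation routine of Eq. (3.2.5).

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function for gains.

    Assumption of this sketch: the single-parameter inverse-S form,
    w(p) = p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


# A 1 percent tail event under three values of gamma:
for g in (0.61, 1.0, 1.5):
    print(g, tk_weight(0.01, g))
```

With γ below one the 1 percent event receives a decision weight above 1 percent (overweighting), γ equal to one returns the unweighted probability, and γ above one shrinks the weight (underweighting), which is how the % γ < 1 columns are read.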

Interestingly, when we split the sample in three parts (as shown in Table 3.3), we observe

that overweight of small probabilities is most strongly present at the beginning of our sample,

in 97 percent of the days from 1998-01-05 to 2003-01-30, but that has faded since 2003. During

the period from 2003-01-31 to 2008-02-21, underweight of small probabilities is present in 65

percent of the days, whereas such condition is less pervasive from 2008-02-22 onwards, i.e., until

2013-03-19. This finding suggests that overpricing of single stock options is sample specific and

not structural. Even if sample specific, overweight of small probabilities seems, in general,


much less pronounced than the 0.61 parameter offered by the CPT. These results seem to only

partially confirm our hypothesis that the CPT can empirically explain the overpricing of OTM

single stock call options.
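The sub-sample comparison can be sketched as follows. Only the three date windows are taken from the notes to Table 3.3; the daily estimates below are hypothetical placeholders, and the helper name `share_below_one` is ours.

```python
from datetime import date


def share_below_one(observations, start, end):
    """Fraction of days in [start, end] whose gamma estimate is below one."""
    in_window = [g for d, g in observations if start <= d <= end]
    if not in_window:
        raise ValueError("no observations inside the window")
    return sum(1 for g in in_window if g < 1.0) / len(in_window)


# Sub-sample boundaries as stated in the notes to Table 3.3.
windows = {
    "98-03": (date(1998, 1, 5), date(2003, 1, 30)),
    "03-08": (date(2003, 1, 31), date(2008, 2, 21)),
    "08-13": (date(2008, 2, 22), date(2013, 3, 19)),
}

# Hypothetical (date, gamma) pairs, for illustration only (not thesis data).
daily_gamma = [
    (date(1999, 3, 1), 0.70),
    (date(2001, 6, 15), 0.85),
    (date(2004, 9, 1), 1.20),
    (date(2006, 2, 10), 0.95),
    (date(2007, 5, 4), 1.10),
    (date(2010, 11, 3), 1.05),
]

shares = {label: share_below_one(daily_gamma, lo, hi)
          for label, (lo, hi) in windows.items()}
print(shares)
```

The share of days with underweighting in a window is simply one minus the reported share, which is how the 65 percent figure for 2003-2008 follows from the 35 percent entry in Panel A.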

At the six-month maturity, overweighting of small probabilities is less frequent than at the
three-month tenor. The median γ for such maturity is 0.99, implying roughly neutral probability
weighting. The long-term γ equals 0.81 and is somewhat out-of-sync with the time-varying
estimates. Similarly to the three-month maturity, the distribution of γ is also slightly skewed to
the right. The 75th percentile of γ equals 1.14 and suggests an underweighting of tail probabilities.

However, probability weighting is largely sample dependent as within the overall sample, 52

percent of all observations reflect overweight of small probabilities but, between 1998 and 2003,

its occurrence is 92 percent.

Differently from the other maturities, γ estimates for the twelve-month maturity tend to-

wards underweight of tail probabilities. The median γ is 1.03, whereas the mean γ is 1.01.

Time variation and sample dependence are present as for the other maturities but, at the

twelve-month maturity, the percentage of days with overweight of tails is smaller, 41 percent

in the full sample but still 83 percent for the 1998-2003 sample.

In summary, the statistics in Table 3.3, Panel A, indicate that the weighting function
parameters γ for the three maturities evaluated are time-varying and sample specific. Overweight
of small probabilities holds for the three-month maturity, less convincingly so for the six-month
maturity, and not at all for the twelve-month maturity; for the latter two, neutral probabilities
and underweight of tails, respectively, prevail.

Because the loss-aversion parameter λ is of high importance in the CPT model, we estimate

γ under different λ parameterizations, more specifically, for 1) λ equals 1, 2) λ equals 3, and
3) optimal λ, as estimated from the long-term empirical distribution (see Table 3.1).

We report the summary statistics of the new γ estimates in Panel B of Table 3.3, when

we assume λ equals 1. The new median and mean estimates for γ are 0.66 and 0.67 for the

three-month maturity, respectively, and, thus, lower than when γ was estimated under the CPT

loss aversion calibration (λ=2.25). The 75th percentile of γ also decreases, from 1.04 to 0.80.

At the six-month horizon, the difference between γ with λ equals 2.25 and with λ equals 1

is also large. The median γ for the CPT λ is 0.99, whereas for when λ equals 1 it is 0.71.

The means are 0.96 and 0.72, respectively. At the 75th percentile using λ equals 1, γ becomes

0.87. For the twelve-month maturity, we observe a similar effect. The median γ for when

λ equals 1 is 0.83, whereas for when λ equals 2.25 it is 1.03. In brief, a lower loss aversion
parameter consistently gives rise to lower γ estimates, across the different options’ maturities
and quantiles. The opposite effect is observed when λ is increased from 2.25 to 3, as shown
by Table 3.3, Panel C. The median and mean γ when λ equals 3 become 0.96 and 0.98 for

the three-month maturity, in comparison to 0.91 and 0.89 when λ equals 2.25. Such rise in

central tendency of γ estimates is also observed within the six- and twelve-month maturities

and across the 25 and 75 percent quantiles. Table 3.3, Panel D, which reports γ estimates when

optimized λ parameters are used, shows distinct results for the three-month maturity versus the


six- and twelve-month maturity. For the three-month maturity, we observe a downward shift to

γ estimates, whereas for six- and twelve-month maturities, an upward movement in estimates

occurs. However, this initially opposite effect in estimates is, in fact, qualitatively equal to the

result just described when we use λ as 1 or 3, as the optimal λ parameters estimated for the

three-, six and twelve-month maturities are, respectively, 1.02, 2.66 and 3.00 (i.e., it decreases

for the three-month maturity and increases for the six- and twelve-month maturity vis-a-vis

the CPT parameterization).

The reason why a lower (higher) loss aversion gives rise to a decreased (increased) γ is that it

increases (decreases) the probability on the left side of distribution, influencing the probabilities

and the shape of the right side of the CPT distribution. High values of λ push the CPT density

to have more probability on the right side of the distribution, which is spread proportionally to

the probabilities originally observed in the right-side bins (i.e., creating a bump into the center-

right side of the distribution), all else equal. Thus, the impact of such probability shift fades as

the tail approaches. Nevertheless, the right tail of the CPT density does turn fatter (and the

γ parameter higher) as λ is made higher. The opposite occurs if low values of λ are assumed:

the right tail of the CPT density becomes thinner, causing γ estimates to be low (which more

forcefully can turn the RND right tail into such thin CPT tail). One important finding from

our experimentation with different λ parameters is that the time variation observed when λ

equals 2.25 is unchanged. The standard deviation and range of γ estimates are roughly the
same across the different λ values. However, the percentage of days on which overweight of
tails is observed in the different samples changes dramatically, towards a more frequent
presence of overweight of small probabilities when low levels of λ are used (and vice versa). The
large difference in the presence of overweight of small probabilities across samples remains.

We interpret our finding that γ is strongly time-varying and sample dependent across all

maturities and under different λ assumptions as strong evidence that single stock options are

not overvalued due to a structural skewness preference, as Barberis (2013) may suggest. We

reckon that, if static skewness preferences would drive overweight of small probabilities, param-

eter γ would be relatively stable throughout our sample. Given that the γ is largely volatile,

we support the view that investors experience (time-varying) “bias in beliefs” or, alternatively,

time-varying preferences (see Barberis, 2013)18. Our results are in line with Green and Hwang

(2011), Chen et al. (2015) and Jiao (2016), who report similar time-varying effects in the
overpricing, skewness effects and returns for IPOs and lottery-like stocks. These papers also report
that, beyond time-varying effects, stronger skewness preferences are associated with higher
participation of individual investors (trading in IPOs, trading around earnings announcements and
owning stocks) to the detriment of institutional investors.

18 Barberis (2013) distinguishes investors’ time-varying beliefs from skewness preferences as he argues that investors with biased beliefs mistakenly overestimate tail events, whereas preference for skewness leads to overweight of tails, which is less likely to be a mistake. As an example, the author suggests that investors that overweight small probabilities events correctly anticipate the distribution of a stock’s future returns but overweight the state of the world in which a stock turns out to be “the next Google”. In the example, overestimation of tail events would occur when the investor attributes a higher chance to the stock being the next Google. As we do not attempt to distinguish between biased beliefs and time-varying preferences, we use the term overweight of small probabilities throughout this chapter.


3.3.4 Time variation in probability weighting parameter and investors’ sentiment

As observed in section 3.3.3, the probability weighting parameter γ is clearly time-varying. In

the following, we investigate which factors may explain this time-variation of γ. Our main hy-

pothesis is that it is linked to investor sentiment. The link between sentiment and overweighting

of small probabilities or lottery buying in OTM single stock calls originates from the fact that

individual investors are highly influenced by market sentiment and attention-grabbing stocks

(Barberis et al., 1998; Barber and Odean, 2008), and that OTM single stock calls trading is

speculative in nature and mostly done by individual investors (Lakonishok et al., 2007). For

instance, Lakonishok et al. (2007) argue that the IT bubble of 2000, a period of high variation

of γ, is linked to elevated investor sentiment, when the least sophisticated investors were the

ones most inclined to purchase calls on growth and IT stocks. Figure 3.2 depicts time-varying

γ’s and the Baker and Wurgler (2007) sentiment factor. It provides evidence that these mea-

sures move in tandem at times. For example, during the IT bubble, the level of γ seems quite

connected with the level of sentiment, especially for the three- and six-month options.

Figure 3.2: Time-varying gamma parameter in CPT. This figure depicts the time-varying nature of the gamma (γ) parameter from three-, six-, and twelve-month single stock options estimated using the CPT parameterization as well as the sentiment factor of Baker and Wurgler (2007).

To formally test our hypothesis that time variation of γ is linked to investor sentiment,

we design a regression model. In Eq. (3.3.1) the explained variables are γ for the three-, six-,
and twelve-month horizons and the explanatory variables are the Baker and Wurgler

(2007) sentiment measure19; the percentage of bullish investors minus the percentage of bearish

investors given by the survey of the American Association of Individual Investors (AAII), used

as a proxy for individual investor sentiment by Han (2008); and a set of control variables among

19Available at http://people.stern.nyu.edu/jwurgler/.


the ones tested by Welch and Goyal (2008)20 as potential forecasters of the equity market. The

data frequency used in the regression is monthly as this is the highest frequency available from

the sentiment data and from the Welch and Goyal (2008) data set21. Our regression sample

starts in January 1998 and ends in February 201322. Our OLS regression model is specified as

follows:

\[
\gamma_t = c + \psi_1 \cdot \mathit{Sent}_t + \psi_2 \cdot \mathit{IISent}_t + \psi_3 \cdot \mathit{E12}_t + \psi_4 \cdot \mathit{B/m}_t + \psi_5 \cdot \mathit{Ntis}_t + \psi_6 \cdot \mathit{Rfree}_t + \psi_7 \cdot \mathit{Infl}_t + \psi_8 \cdot \mathit{Corpr}_t + \psi_9 \cdot \mathit{Svar}_t + \psi_{10} \cdot \mathit{CSP}_t + \varepsilon_t, \tag{3.3.1}
\]

where Sent is the Baker and Wurgler (2007) sentiment measure, IISent is the AAII individual
investor sentiment measure, E12 is the twelve-month moving sum of earnings on the S&P 500
index, B/m is the book-to-market ratio, Ntis is the net equity expansion, Rfree is the risk-free
rate, Infl is the annual inflation rate, Corpr is the corporate spread, Svar is the stock market
variance, and CSP is the cross-sectional premium.

Additionally, we run univariate models for each explanatory factor to understand the indi-

vidual relation between γ and the control variables:

\[
\gamma_t = c_i + \psi_i \cdot x_{i,t} + \varepsilon_t, \tag{3.3.2}
\]

where x_{i,t} denotes the i-th of the n explanatory variables specified above, for i = 1, ..., n.
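As an illustration of the univariate specification in Eq. (3.3.2), the sketch below fits a one-regressor OLS model in closed form and reports the intercept, the slope and the R². The function name `ols_univariate` and the monthly series are hypothetical and serve only to show the mechanics; they are not the thesis data or its estimation code.

```python
def ols_univariate(x, y):
    """Closed-form OLS fit of y_t = c + psi * x_t + e_t.

    Returns (c, psi, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    psi = sxy / sxx                 # slope coefficient
    c = mean_y - psi * mean_x       # intercept
    ss_res = sum((yi - (c + psi * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0.0 else 0.0
    return c, psi, r_squared


# Hypothetical monthly observations (illustrative only).
sent_monthly = [-1.0, -0.5, 0.0, 0.5, 1.0]
gamma_monthly = [1.05, 0.95, 0.90, 0.85, 0.75]

c_hat, psi_hat, r2 = ols_univariate(sent_monthly, gamma_monthly)
print(c_hat, psi_hat, r2)
```

A negative slope in such a fit means that γ falls, and overweighting of small probabilities rises, as the sentiment regressor increases.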

Table 3.4, Panel A presents the estimates of Eq. (3.3.1). We note the high explanatory

power of the multivariate regression, ranging from 67 to 71 percent. As expected, we observe

that Sent is consistently negative and statistically significant across the three different horizons

studied. On average, each one-unit difference in Sent is linked to roughly -0.1 difference in γ,

all else being equal. The univariate regressions of Sent confirm the negative link between

sentiment and γ. For all option maturities, a negative relation between the Baker and Wurgler

(2007) sentiment measure and γ is found. The explanatory power of the variable Sent in the

univariate setting is also high, between 22 and 29 percent. These findings altogether support

our hypothesis that overweighting of small probabilities increases at higher levels of sentiment

and that sentiment strongly impacts the probability weighting bias of call option investors.

In contrast with the variable Sent, the coefficients for the individual investor sentiment
(IISent) are positive at the three-month maturity but not statistically significant in either the
multivariate or the univariate setting (see Table 3.4). The univariate regressions run on γ have
rather low explanatory

power. The positive relationship between IISent and γ at the three-month maturity may

be attributed to potential capitulations in individual investor sentiment, as such indicator is

strongly mean-reverting.

20 The complete set and description of variables suggested by Welch and Goyal (2008) is provided in Appendix 3.C. From the complete set of variables used by Welch and Goyal (2008), we select a smaller set using the cross-correlation between them to avoid multicollinearity in our regression analysis. Because we run a multivariate model, using the full set of variables is undesirable as some of them correlate 80 percent with each other. We exclude variables that correlate more than 40 percent with each other.

21 Given the fact that γ is estimated on a daily basis, we average γ throughout each month.

22 This sample is only possible because Welch and Goyal (2008) and Baker and Wurgler (2007) have updated and made available their datasets after publication.
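The correlation screen described in footnote 20 could look roughly like the sketch below. The exact selection procedure is not spelled out in the text, so the greedy, order-dependent rule used here (keep a variable only if its absolute correlation with every variable already kept does not exceed 40 percent) is an assumption, as are the function names and the toy series.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("series must not be constant")
    return cov / sqrt(var_a * var_b)


def screen_by_correlation(series, threshold=0.40):
    """Greedy screen (assumed rule): keep variables in insertion order and
    drop any variable whose |correlation| with a kept one exceeds threshold."""
    kept = []
    for name, values in series.items():
        if all(abs(pearson(values, series[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidate regressors (illustrative only).
candidates = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 4.0, 6.0, 8.0, 11.0],   # almost collinear with x1
    "x3": [1.0, -1.0, 0.0, -1.0, 1.0],  # uncorrelated with x1
}
selected = screen_by_correlation(candidates)
print(selected)
```

Because the rule is greedy, the outcome depends on the order in which candidates are listed; a different ordering could retain x2 instead of x1.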


Table 3.4: Regression results: CPT parametrization

Panel A - Multivariate analysis

Maturity    3m          6m          12m
Intercept   0.564***    0.646***    0.769***
            (0.068)     (0.090)     (0.074)
Sent        -0.078***   -0.121***   -0.114***
            (0.024)     (0.032)     (0.024)
AAIISent    0.094       0.030       -0.003
            (0.059)     (0.085)     (0.058)
E12         0.049***    0.052***    0.044***
            (0.008)     (0.010)     (0.011)
B/m         0.617**     0.698*      0.387
            (0.268)     (0.344)     (0.331)
Ntis        0.357       0.201†      -0.869
            (0.553)     (0.676)     (0.567)
Rfree       -0.022**    -0.030**    -0.016
            (0.010)     (0.014)     (0.011)
Infl        -0.580†     -3.111†     0.370†
            (2.539)     (3.493)     (2.916)
Corpr       0.100†      -0.174†     -0.037†
            (0.275)     (0.373)     (0.349)
Svar        -9.762***   -11.792***  -8.679**
            (3.179)     (4.156)     (3.686)
CSP         -0.219†     -0.219†     -0.116†
            (0.197)     (0.233)     (0.238)
R2          71%         68%         67%
F-stat      36.5        31.7        30.2
AIC         -247.1      -166.0      -233.7
BIC         -213.4      -132.3      -200.0

Panel B - Univariate analysis (one regression per row)

Regressor  Maturity  Intercept          Coefficient          R2    F-stat  AIC     BIC
Sent       3m        0.891*** (0.020)   -0.145*** (0.020)    22%   43.7    -326.1  -320.0
Sent       6m        0.969*** (0.023)   -0.206*** (0.028)    29%   64.3    -186.0  -179.9
Sent       12m       1.016*** (0.019)   -0.157*** (0.020)    27%   57.7    34.1    40.3
AAIISent   3m        0.864*** (0.022)   0.031 (0.083)        0%    0.2     0.0     0.1
AAIISent   6m        0.939*** (0.026)   -0.040 (0.107)       0%    0.2     0.0     0.1
AAIISent   12m       0.996*** (0.022)   -0.067 (0.079)       1%    0.8     0.0     0.1
E12        3m        0.573*** (0.031)   0.059*** (0.005)     37%   90.3    0.0     0.0
B/m        3m        0.513*** (0.061)   1.428*** (0.266)     31%   70.0    0.0     0.2
Ntis       3m        0.865*** (0.022)   0.617 (0.811)        0%    0.7     0.0     0.7
Rfree      3m        0.917*** (0.031)   -0.022 (0.013)       3%    5.4     0.0     0.0
Infl       3m        0.855*** (0.023)   5.867 (4.778)        1%    2.2     0.0     3.9
Corpr      3m        0.869*** (0.022)   -0.314 (0.554)       0%    0.3     0.0     0.5
Svar       3m        0.917*** (0.026)   -13.280*** (4.835)   18%   33.9    0.0     2.3
CSP        3m        0.864*** (0.021)   0.626* (0.313)       2%    3.8     0.0     0.3



The nine Welch and Goyal (2008) control variables used in our multivariate regressions are linked to γ in very distinct manners. First, they add substantial explanatory power: the three-, six-, and twelve-month multivariate models explain, respectively, 71, 68, and 67 percent of the level of γ. Most of these relations are stable, because the coefficient signs change only rarely. The control variables that are statistically significant in our multivariate setting are E12, B/m, Rfree, Infl, Svar, and CSP (Table 3.4). We observe that γ is positively linked to E12, the twelve-month moving sum of earnings on the S&P 500 index, as well as to B/m, the book-to-market ratio, in both multivariate and univariate regressions. The positive relation between E12, B/m and γ could be explained by mean-reversion of earnings and valuation being linked to a greater overweighting of small probabilities, which could be justified by higher investor sentiment outweighing earnings downgrades and rising valuations in a rallying market. These two variables have high explanatory power for γ, respectively 37 and 31 percent at the three-month horizon. The significance of Rfree is, however, somewhat unstable: at the three- and six-month maturities, Rfree is significant in the multivariate regression but not in the univariate regression. Further, the stock market variance, Svar, is negatively linked to γ. Apparently, the riskier the environment, the stronger the overweighting of small probabilities. At the three-month horizon, the explanatory power of the univariate regression on Svar is 18 percent, which is relatively high. Table 3.4, Panel B indicates that the cross-sectional premium, CSP, is positive and statistically significant in the univariate setting for the three-month horizon, despite being negative and not significant in the multivariate regressions.

To corroborate our results, we also apply the Least Absolute Shrinkage and Selection Operator (Lasso) methodology to our main multivariate regressions (see Tibshirani, 1996, and Appendix 3.B.1). We use the Lasso to select the regressors, among our sentiment and control variables, that are most relevant for the overall fit of γ. The coefficients that shrink to zero via the Lasso are identified in Table 3.4 (Panel A) with a dagger (†). Model selection via the Lasso confirms that Sent and IISent are more relevant for the overall fit of γ than some of the fundamental factors used, namely Ntis, Infl, Corpr and CSP.
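The selection mechanism of the Lasso can be illustrated with a minimal sketch. The code below is not the estimation procedure of Appendix 3.B.1; it is a generic cyclic coordinate descent on simulated data, and the penalty level `alpha`, the sample size and the data-generating coefficients are all assumptions made for illustration. It shows how the soft-thresholding step sets the coefficients of irrelevant regressors to exactly zero, which is what the dagger (†) marks in the tables.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0 inside the band."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=500):
    """Minimise (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1 by cyclic coordinate descent.

    X is assumed to have standardised columns and y to be centred,
    so no intercept is fitted.
    """
    n, k = X.shape
    beta = np.zeros(k)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(k):
            # Partial residual: remove the fit of every regressor except j.
            partial = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial / n
            beta[j] = soft_threshold(rho, alpha) / col_scale[j]
    return beta

rng = np.random.default_rng(0)
n_obs = 200
X = rng.standard_normal((n_obs, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)
# Hypothetical data: only the first two regressors drive the response.
y = -0.8 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(n_obs)
y = y - y.mean()

beta_hat = lasso_coordinate_descent(X, y, alpha=0.1)
selected = np.flatnonzero(beta_hat != 0.0)  # indices of surviving regressors
```

In this simulated setting the two relevant regressors survive with shrunken coefficients, while the two irrelevant ones are removed. In practice the penalty level would be chosen by cross-validation rather than fixed by hand.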

The results provided by our OLS regression and by the Lasso indicate that supportive

fundamental data for equity markets do not necessarily intensify biased behavior of single stock

call option investors. This is an interesting takeaway, especially considering the notion that

sentiment does appear to affect such behavior: single stock option investors seem to overweight

small probabilities when sentiment is exuberant, not necessarily when stock fundamentals are

exuberant.

More importantly, these results support our earlier findings that overweight of small probabilities is strongly time-varying and linked to sentiment. Therefore, overweight of small probabilities is unlikely to result from (static) investor preferences but rather from investors’ bias in beliefs or time-varying preferences, which seem conditional on sentiment levels. Furthermore, we also run our regression models (Eqs. (3.3.1) and (3.3.2)) using different assumptions about the value of λ, the loss aversion parameter. In this exercise we set λ to imply 1) no loss aversion (λ=1), 2) augmented loss aversion (λ=3) and 3) optimal loss aversion, where λ assumes the value estimated by Eq. (3.2.4) and reported in Table 3.1, Panel A.

Table 3.5: Regression results: alternative loss aversion parameterization

Panel A - Multivariate analysis

            λ = 1                            λ = 3                            Optimum λ
Maturity    3m         6m         12m        3m         6m         12m        3m         6m         12m
Intercept   0.348***   0.347***   0.413***   0.576***   0.661***   0.798***   0.354***   0.663***   0.817***
            (0.056)    (0.07)     (0.065)    (0.084)    (0.094)    (0.074)    (0.056)    (0.085)    (0.069)
Sent        -0.050***  -0.061***  -0.050**   -0.077***  -0.094***  -0.091***  -0.051***  -0.089***  -0.093***
            (0.018)    (0.022)    (0.02)     (0.027)    (0.03)     (0.022)    (0.018)    (0.027)    (0.021)
AAIISent    0.087*     0.082      0.110**    0.108      0.092      -0.008     0.086*     0.074      -0.003
            (0.05)     (0.061)    (0.049)    (0.074)    (0.083)    (0.057)    (0.05)     (0.076)    (0.054)
E12         0.043***   0.045***   0.054***   0.049***   0.043***   0.036***   0.043***   0.042***   0.036***
            (0.007)    (0.008)    (0.008)    (0.01)     (0.01)     (0.01)     (0.007)    (0.01)     (0.009)
B/m         0.563**    0.754***   0.647**    0.795**    0.885**    0.527*     0.571**    0.823**    0.43
            (0.227)    (0.267)    (0.269)    (0.334)    (0.377)    (0.316)    (0.227)    (0.336)    (0.292)
Ntis        0.258      0.029      -0.357     0.266      0.057†     -0.884†    0.253      0.223†     -0.900*†
            (0.486)    (0.593)    (0.556)    (0.686)    (0.705)    (0.558)    (0.486)    (0.643)    (0.525)
Rfree       -0.013*    -0.011     -0.021**   -0.01      -0.008     0.008      -0.013*    -0.014     -0.003
            (0.008)    (0.01)     (0.009)    (0.012)    (0.014)    (0.011)    (0.008)    (0.012)    (0.01)
Infl        -0.558†    -1.626†    2.157†     0.468†     -0.3†      0.941†     -0.59†     -1.034†    0.259†
            (2.029)    (2.678)    (2.313)    (3.265)    (3.836)    (2.981)    (2.044)    (3.496)    (2.72)
Corpr       0.086†     -0.038†    0.097†     0.094†     -0.112†    -0.153†    0.089†     -0.168†    -0.133†
            (0.206)    (0.263)    (0.262)    (0.319)    (0.38)     (0.323)    (0.208)    (0.347)    (0.312)
Svar        -6.371**   -7.192**   -5.469*    -8.742**   -9.835**   -8.126**   -6.634**   -9.520**   -8.018**
            (2.533)    (3.188)    (2.899)    (3.484)    (3.845)    (3.373)    (2.542)    (3.681)    (3.142)
CSP         -0.076     -0.099†    -0.024†    -0.186†    -0.196†    -0.059†    -0.073     -0.152†    -0.066†
            (0.153)    (0.182)    (0.178)    (0.22)     (0.235)    (0.196)    (0.154)    (0.215)    (0.189)
R2          70%        68%        73%        63%        62%        64%        70%        65%        65%
F-stats     35         32         40         25.3       24         25.7       34.9       26.7       27.6
AIC         -309.4     -254.6     -278.5     -184.8     -166.5     -247.6     -307.1     -193.4     -268.4
BIC         -275.7     -220.9     -244.8     -151.1     -132.9     -213.9     -273.4     -159.7     -234.8

Panel B - Univariate analysis

            λ = 1                            λ = 3                            Optimum λ
Maturity    3m         6m         12m        3m         6m         12m        3m         6m         12m
Intercept   0.662***   0.722***   0.797***   0.980***   1.055***   1.095***   0.668***   1.023***   1.065***
            (0.017)    (0.019)    (0.02)     (0.022)    (0.023)    (0.018)    (0.017)    (0.021)    (0.017)
Sent        -0.105***  -0.126***  -0.117***  -0.141***  -0.163***  -0.117***  -0.106***  -0.162***  -0.125***
            (0.015)    (0.017)    (0.017)    (0.021)    (0.023)    (0.016)    (0.015)    (0.022)    (0.016)
R2          18%        19%        16%        18%        22%        18%        18%        24%        23%
F-stats     33.2       36.6       30.4       34.3       44.2       34.9       33.6       49.6       45.4
AIC         -326.1     -186       34.1       34.1       34.1       34.1       34.1       34.1       34.1
BIC         -320       -179.9     40.3       40.3       40.3       40.3       40.3       40.3       40.3

This table reports the regression results for Eq. (3.3.1), in a multivariate setting, in Panel A and for Eq. (3.3.2), in a univariate setting, in Panel B. Across columns, the parameterization of lambda (λ) differs, so Eqs. (3.3.1) and (3.3.2) are run under the assumption that 1) loss aversion is absent (λ = 1); 2) loss aversion is augmented (λ = 3) and 3) λ is optimal (as given by Table 3.1, Panel A). The dependent variable in the regression of Eq. (3.3.1) is gamma (γ), and the explanatory variables are 1) the Baker and Wurgler (2007) sentiment measure; 2) the AAII individual investor sentiment measure and 3) the explanatory variables used by Welch and Goyal (2008), excluding the factors that correlate with each other in excess of 40 percent. The regressors identified with a dagger (†) are the ones shrunk to zero by the application of the Lasso. Panel B reports the regression results for Eq. (3.3.2), which regresses γ on the same explanatory variables in the univariate setting.

Table 3.5 indicates that the results for Sent are similar to the ones obtained in our main regressions: Sent is negatively linked to γ and statistically significant at all horizons, but with less statistical significance, explanatory power and magnitude at the twelve-month horizon. This result applies to the multivariate regression model only. Across all option maturities, the Sent coefficients become larger in magnitude when λ equals 3 and shrink when λ equals 1. The observed relation between changes in λ and the Sent coefficient is intuitive. We argue that as λ increases, the probabilities on the left side of the CPT distribution increase, favoring a thinner tail on the right side of the PCPT distribution, which then requires less adjustment for overweighting of tails (that is, a higher γ) for the PCPT to match the EDF. As a higher γ is obtained from such an increase in λ, the coefficient linking γ to the given sentiment factor also increases in magnitude. The explanatory power of these regressions is, once again, high, as R2 ranges from 62 to 73 percent in the multivariate models. The explanatory power of Sent ranges from 16 to 24 percent in the univariate setting. Table 3.5 also confirms the relations found in our main regressions between γ and IISent, the AAII individual investor sentiment measure, and between γ and the Welch and Goyal (2008) control variables. IISent is rarely significantly linked to γ. The control variables that are robustly linked to γ in our main regression (E12, B/m and Svar) remain strongly connected to it within these auxiliary regressions. Applying the Lasso model selection technique to these regressions gives analogous results. Sent, IISent, E12, B/m, Svar and Rfree always survive the Lasso variable selection procedure, whereas the Ntis, Infl, Corpr and CSP coefficients often shrink to zero (as in our main regression, these coefficients are identified with a dagger (†) in Table 3.5, Panel A).

The robustness of the relation between γ and Sent suggests that changes in the overweighting of tails are not conditional on the level of the loss aversion parameter. In other words, levels of loss aversion do not drive investors to overweight upside tail events, as one could hypothesize when associating upside speculation with a state of low loss aversion. Thus, our results suggest that overweighting of small probabilities is a phenomenon stably linked to sentiment, rather than to positive fundamentals or loss aversion levels. Our results tie in closely with the findings of Green and Hwang (2011), who investigate the relation between IPOs’ expected skewness and returns. They find that the skewness effect is stronger during periods of high investor sentiment. Along the same lines, Chen et al. (2015) conclude that when gambling sentiment is high, stocks with lottery-like characteristics earn positive abnormal returns in the short run, followed by underperformance in the long run.

3.4 Robustness tests

3.4.1 Kupiec’s test for tail comparison

We employ Kupiec’s (1995) test to compare the tails of the EDF with those of the subjective density functions and of the RND, as a robustness test for the EVT methods applied. Kupiec’s test was originally designed to evaluate the accuracy of Value-at-Risk (VaR) models, where estimated VaRs are compared with realized ones. Because the VaR is no different from the EQR on the downside, i.e., the $q_p^{-}$ statistic, we can also use Kupiec’s method to test the accuracy of the $q_p^{+}$ statistic for the subjective densities and the RND in matching realized EQRs. Kupiec’s method computes a proportion of failure (POF) statistic that evaluates how often a VaR level is violated over a specified time span. Thus, if the number of realized violations is significantly higher than the number of violations implied by the confidence level of the VaR, then the risk model, or here the consistency of tails, is challenged. Kupiec’s POF test, which is designed as a log-likelihood ratio test, is defined as:

$$LR_{POF} = -2\log\left[(1-p^{*})^{n-v}(p^{*})^{v}\right] + 2\log\left[\left(1-\frac{v}{n}\right)^{n-v}\left(\frac{v}{n}\right)^{v}\right] \sim \chi^{2}(1), \tag{3.4.1}$$

where p∗ is the POF under the null hypothesis, n is the sample size, and v is the number of violations in the sample. The null hypothesis of the test is v/n = p∗, i.e., the realized probability of failure matches the predicted one. Thus, if the LR statistic exceeds the critical value, χ2(1) = 3.841, the hypothesis is rejected at the five percent level. In our empirical problem, p∗ equals the assumed probability that the EQR of the subjective and risk-neutral densities will violate the EQR of the realized returns, whereas v/n is the realized frequency of violations. Because we apply Kupiec’s test to upside returns, violations mean that returns are higher than a positive threshold.
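As a minimal sketch of how the statistic in Eq. (3.4.1) can be computed, the function below evaluates the likelihood ratio and its p-value under the χ2(1) distribution. The function name `kupiec_pof` and the counts in the usage lines are hypothetical and are not taken from our data; the convention 0·log(0) = 0 is an implementation choice for the boundary cases v = 0 and v = n.

```python
import math

def kupiec_pof(n, v, p_star):
    """Kupiec (1995) proportion-of-failures test.

    n: sample size, v: number of violations, p_star: failure rate under the null.
    Returns the likelihood-ratio statistic and its chi-square(1) p-value.
    """
    if not 0.0 < p_star < 1.0:
        raise ValueError("p_star must lie strictly between 0 and 1")
    if not 0 <= v <= n:
        raise ValueError("v must lie between 0 and n")

    def log_likelihood(p):
        # Binomial log-likelihood kernel, using the convention 0 * log(0) = 0.
        ll = 0.0
        if n - v > 0:
            ll += (n - v) * math.log(1.0 - p)
        if v > 0:
            ll += v * math.log(p)
        return ll

    lr = -2.0 * log_likelihood(p_star) + 2.0 * log_likelihood(v / n)
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    # Survival function of a chi-square with one degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

CRITICAL_5PCT = 3.841  # chi-square(1) critical value quoted in the text

# Hypothetical counts, not taken from the thesis data.
lr_ok, p_ok = kupiec_pof(n=400, v=40, p_star=0.10)    # realized rate equals p*
lr_bad, p_bad = kupiec_pof(n=400, v=64, p_star=0.10)  # realized rate of 16 percent
```

When the realized frequency equals p∗ the statistic is zero and the null is not rejected, whereas a realized frequency of 16 percent against p∗ of ten percent yields a statistic well above the critical value.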

The first step in applying Kupiec’s test to our data set is to set the expected percentage of failure (p∗) between the EQR from the EDF and from the subjective and risk-neutral densities. We set p∗ to five and ten percent. These percentages can be seen as the expected frequency with which the tails of the subjective and RND distributions overstate the tails of the distribution of realized returns. As a fatter tail is a symptom of overweighting of small probabilities, we expect that densities that do not adjust for the CPT weighting function will deliver a higher frequency of failures than the CPT density function. Kupiec’s test results are reported in Table 3.6.

Panel A in Table 3.6 suggests that the probability of failure for the RND, power, exponential, and PCPT densities is particularly high at the three-month horizon: it exceeds 99 percent for the EQR at 90 and 95 percent, for p∗ equal to both five and ten percent. These densities often contain fatter tails than the EDF. For the CPT density, the POF is much lower across the two values of p∗ used and the 90 and 95 percent EQR. The POF for the 90 percent EQR is roughly 58 percent for the CPT, irrespective of p∗. At the 95 percent EQR, the POF is 46 percent for the CPT. These findings suggest that at the 90 and 95 percent EQR, the CPT densities overstate the EDF tails less frequently than the other densities. The violations of the EDF tails are, however, still significant as they occur between 41 and 52 percent of times. Nevertheless, when we analyze the 99 percent EQR, we find that the POF for all densities decreases considerably and, for the CPT, it becomes 16 percent.

Panel B of Table 3.6 depicts a pattern of the POF for the probability densities derived from the six-month options that is very similar to the one we find for the three-month options. The POF is very close to 100 percent for all densities apart from the CPT at the 90 and 95 percent EQR, while at the 99 percent EQR violations fall substantially, even more than what we observed for the three-month options. Nevertheless, the CPT remains the best approximation for the EDF, as its POF is the lowest. Kupiec’s test result suggests that the CPT density is statistically equal to the EDF, whereas the RND also equals the empirical returns at the ten percent level. The results for p∗ at five or ten percent are very similar.


Table 3.6: Robustness checks: Kupiec’s test

Panel A - Three-month calls

                EQR 90%                    EQR 95%                    EQR 99%
p = 10%         POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      99.9%    0.0000   ∞        99.2%    0.0000   ∞        50.5%    0.0000   414.8
Power vs EDF    100.0%   0.0000   ∞        100.0%   0.0000   ∞        84.7%    0.0000   ∞
Expo vs EDF     100.0%   0.0000   ∞        100.0%   0.0000   ∞        86.8%    0.0000   ∞
PCPT vs EDF     100.0%   0.0000   ∞        100.0%   0.0000   ∞        67.2%    0.0000   752.0
CPT vs EDF      58.2%    0.0000   559.6    45.7%    0.0000   333.3    16.0%    0.0002   13.6

p = 5%          POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      99.9%    0.0000   ∞        99.2%    0.0000   ∞        50.5%    0.0000   671.5
Power vs EDF    100.0%   0.0000   ∞        100.0%   0.0000   ∞        84.7%    0.0000   ∞
Expo vs EDF     100.0%   0.0000   ∞        100.0%   0.0000   ∞        86.8%    0.0000   ∞
PCPT vs EDF     100.0%   0.0000   ∞        100.0%   0.0000   ∞        67.2%    0.0000   ∞
CPT vs EDF      58.2%    0.0000   861.9    45.7%    0.0000   561.3    16.0%    0.0000   65.5

Panel B - Six-month calls

                EQR 90%                    EQR 95%                    EQR 99%
p = 10%         POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      99.9%    0.0000   ∞        93.3%    0.0000   ∞        13.8%    0.0160   5.8
Power vs EDF    99.9%    0.0000   ∞        97.7%    0.0000   ∞        22.1%    0.0000   49.7
Expo vs EDF     99.9%    0.0000   ∞        97.8%    0.0000   ∞        23.0%    0.0000   56.4
PCPT vs EDF     99.9%    0.0000   ∞        97.3%    0.0000   ∞        17.0%    0.0000   18.2
CPT vs EDF      62.4%    0.0000   647.0    36.3%    0.0000   197.3    5.7%     0.0019   9.6

p = 5%          POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      99.9%    0.0000   ∞        93.3%    0.0000   ∞        13.8%    0.0000   44.8
Power vs EDF    99.9%    0.0000   ∞        97.7%    0.0000   ∞        22.1%    0.0000   137.7
Expo vs EDF     99.9%    0.0000   ∞        97.8%    0.0000   ∞        23.0%    0.0000   149.7
PCPT vs EDF     99.9%    0.0000   ∞        97.3%    0.0000   ∞        17.0%    0.0000   76.0
CPT vs EDF      62.4%    0.0000   ∞        36.3%    0.0000   369.9    5.7%     0.5474   0.4

Panel C - Twelve-month calls

                EQR 90%                    EQR 95%                    EQR 99%
p = 10%         POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      62.8%    0.0000   655.1    25.0%    0.0000   72.9     20.3%    0.0000   37.0
Power vs EDF    93.5%    0.0000   ∞        42.5%    0.0000   283.5    29.3%    0.0000   114.7
Expo vs EDF     94.6%    0.0000   ∞        43.1%    0.0000   292.7    30.4%    0.0000   126.2
PCPT vs EDF     79.5%    0.0000   1067.2   36.1%    0.0000   194.7    24.4%    0.0000   68.3
CPT vs EDF      29.4%    0.0000   115.2    7.2%     0.0480   3.9      8.4%     0.2666   1.2

p = 5%          POF      p-value  LR-stat  POF      p-value  LR-stat  POF      p-value  LR-stat
RND vs EDF      62.8%    0.0000   ∞        25.0%    0.0000   177.9    20.3%    0.0000   114.2
Power vs EDF    93.5%    0.0000   ∞        42.5%    0.0000   492.6    29.3%    0.0000   245.6
Expo vs EDF     94.6%    0.0000   ∞        43.1%    0.0000   505.3    30.4%    0.0000   263.6
PCPT vs EDF     79.5%    0.0000   ∞        36.1%    0.0000   366.1    24.4%    0.0000   170.2
CPT vs EDF      29.4%    0.0000   246.4    7.2%     0.0631   3.5      8.4%     0.0048   8.0

This table reports the results from Kupiec’s (1995) percentage of failure (POF) test for violations of the extreme quantile returns (EQR) from the empirical density function (EDF) by the EQR of a set of RND and subjective density functions. The test is performed as a robustness check to the extreme value theory (EVT)-based tests performed on the EQR and on the expected upside returns. The null hypothesis of the test, which is designed as a log-likelihood ratio test (Eq. (3.4.1)), is that the realized probability of failure (v/n) matches the predicted one, p∗. Thus, if the LR exceeds the critical value, χ2(1) = 3.841, the hypothesis is rejected at the five percent level. Translating the methodology to our empirical problem, p∗ becomes the assumed probability that the EQR of the subjective and of the risk-neutral densities will violate the EQR of the realized returns, where v/n is the realized frequency of violations. We note that because we apply Kupiec’s test to the upside returns, violations mean that returns are higher than a positive threshold.

Panel C presents the POF for the twelve-month maturity. We find once again that the CPT tails are the ones that violate the EDF tails the least. The POF for these densities is about 29 percent for the 90 percent EQR, seven percent for the 95 percent EQR, and four percent for the 99 percent EQR. These findings suggest that the tails of the CPT closely match those of the EDF, especially far out in the tail, i.e., at the 95 and 99 percent EQR. The RND, power, exponential, and PCPT densities record POFs that are much smaller than for the three- and six-month maturities but still high in comparison to the CPT.

We note that the results for the PCPT and the CPT are quite distinct, whereas the results for the PCPT are somewhat closer to those of the RND. This suggests that the weighting function, rather than the value function, is the component within the CPT density function that most forcefully causes the RND to approximate the EDF. Overall, our analysis using Kupiec’s test leads to results similar to the ones reached in our EVT analysis and provides further evidence that the CPT model is superior in matching realized returns.

3.4.2 Prelec’s weighting function parameter

As another robustness check, we estimate the weighting function parameter ω of the RDEU model suggested by Prelec (1998) in order to test whether our conclusions are robust to other weighting function formulations (footnote 23). The Prelec weighting function $w_p^{+-}$ is given by Eq. (3.4.2):

$$w_p^{+-}(p) = \exp\left(-(-\log p)^{\omega}\right), \tag{3.4.2}$$

where the parameter ω defines the curvature of the weighting function for both gains and losses, which also leads to S-shaped probability distortion functions. We note that according to Prelec (1998) the standard ω parameter value equals 0.65. Our time-varying and long-term (LT) estimates for ω are presented in Table 3.7, Panel A.
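To make the role of ω concrete, the sketch below evaluates Eq. (3.4.2) next to the Tversky and Kahneman (1992) weighting function, w(p) = p^γ / (p^γ + (1−p)^γ)^(1/γ). The function names, the probability grid and the parameter values other than ω = 0.65 are assumptions chosen for illustration; the monotonicity check is a simple numerical test on a grid, not a proof.

```python
import numpy as np

def prelec_weight(p, omega):
    """Prelec (1998) one-parameter weighting function, as in Eq. (3.4.2)."""
    p = np.asarray(p, dtype=float)
    return np.exp(-(-np.log(p)) ** omega)

def tk_weight(p, gamma):
    """Tversky and Kahneman (1992) weighting function for a single tail."""
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

def is_monotone(weights):
    """True when the weights never decrease along the probability grid."""
    return bool(np.all(np.diff(weights) >= 0.0))

grid = np.linspace(0.001, 0.999, 999)  # interior probabilities only

small_p = 0.01
overweighted = prelec_weight(small_p, 0.65)   # above 0.01: overweighting
neutral = prelec_weight(small_p, 1.0)         # equals 0.01: no distortion
underweighted = prelec_weight(small_p, 1.2)   # below 0.01: underweighting

prelec_monotone = is_monotone(prelec_weight(grid, 0.2))
tk_monotone_low_gamma = is_monotone(tk_weight(grid, 0.2))
```

With ω below one a one percent probability is weighted above one percent, with ω equal to one it is left unchanged, and with ω above one it is underweighted. On this grid the Prelec function stays monotone at ω = 0.2, whereas the Tversky and Kahneman form at γ = 0.2 does not.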

The long-term estimates of ω are somewhat in line with the one suggested by the RDEU, but less so for the twelve-month horizon: the ω estimates from the three-, six-, and twelve-month options are 0.46, 0.67, and 1.11, respectively. These parameters are somewhat consistent with our long-term estimates for γ of 0.75, 0.81, and 1.09 (see Table 3.1), as they suggest an overweighting of small probabilities that fades as the option horizon increases. Similarly, time-varying estimates of ω also indicate more overweight of small probabilities than suggested by γ estimations. We find the mean (0.95) and median (0.93) of the time-varying estimates of ω from three-month options to be higher than the value suggested by Prelec (1998). This outcome means that overweighting of small probabilities within the single stock option market is less than suggested by RDEU (similar to our conclusion concerning CPT parameters) and that the estimated Prelec parameters imply a less pronounced overweight of tails than suggested by our CPT parameter estimations. In line with our results for the CPT, for the six- and twelve-month maturities, underweight of small probabilities is, however, more frequent than overweight. The average ω for the six-month options is 1.02 (median 0.99), and for the twelve-month options it is 1.05 (median 1.07). The fact that investors tend to overweight small probabilities to a much lesser extent in the short-term and that estimates are higher than suggested by their respective lab-based estimates confirms our main findings.

23 A major advantage of Prelec’s (1998) weighting function vis-a-vis the CPT is that it is monotonic for any value of ω, whereas the CPT can have a non-monotonic probability weighting for low levels of γ.

Table 3.7: Robustness checks: time-varying weighting function parameters

Panel A - Prelec omega (ω)

Maturity    Min    25% Qtile  Median  Mean   75% Qtile  Max    StDev  % ω < 1  % ω < 1  % ω < 1  % ω < 1  RSS     LT
                                                                               (98-03)  (03-08)  (08-13)
3 months    0.42   0.76       0.93    0.95   1.07       1.75   0.27   64%      95%      36%      61%      0.0204  0.46
6 months    0.37   0.84       0.99    1.02   1.17       1.75   0.26   51%      88%      21%      45%      0.017   0.68
12 months   0.44   0.94       1.07    1.05   1.18       1.75   0.21   39%      79%      10%      28%      0.0201  1.14

Panel B - Gamma with overweight of small probabilities on the right tail (δ = 0.69)

Maturity    Min    25% Qtile  Median  Mean   75% Qtile  Max    StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                               (98-03)  (03-08)  (08-13)
3 months    0.44   0.7        0.86    0.97   1.21       1.75   0.34   58%      99%      23%      53%      0.0231
6 months    0.4    0.75       1.01    1.04   1.27       1.75   0.31   49%      93%      13%      43%      0.0198
12 months   0.4    0.83       1.05    1.04   1.24       1.75   0.25   43%      87%      11%      32%      0.0238

Panel C - Gamma with neutral probability weighting on the right tail (δ = 1)

Maturity    Min    25% Qtile  Median  Mean   75% Qtile  Max    StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                               (98-03)  (03-08)  (08-13)
3 months    0.48   0.73       0.88    0.97   1.15       1.75   0.3    62%      98%      30%      58%      0.023
6 months    0.43   0.8        0.99    1.03   1.24       1.75   0.3    51%      92%      16%      45%      0.0191
12 months   0.5    0.87       1.03    1.02   1.13       1.75   0.22   44%      84%      12%      35%      0.0233

Panel D - Gamma with pronounced diminishing sensitivities to gains and losses (α and β = 0.75)

Maturity    Min    25% Qtile  Median  Mean   75% Qtile  Max    StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                               (98-03)  (03-08)  (08-13)
3 months    0.45   0.81       0.96    0.98   1.09       1.75   0.25   57%      93%      27%      52%      0.023
6 months    0.38   0.9        1.02    1.06   1.23       1.75   0.25   44%      82%      13%      36%      0.0196
12 months   0.32   0.99       1.07    1.09   1.18       1.75   0.2    30%      65%      5%       21%      0.0276

Panel E - Gamma with no diminishing sensitivities to gains and losses (α and β = 1)

Maturity    Min    25% Qtile  Median  Mean   75% Qtile  Max    StDev  % γ < 1  % γ < 1  % γ < 1  % γ < 1  RSS
                                                                               (98-03)  (03-08)  (08-13)
3 months    0      0.72       0.88    0.87   1.03       1.75   0.24   67%      98%      40%      64%      0.0204
6 months    0      0.78       0.98    0.93   1.14       1.75   0.3    55%      94%      21%      49%      0.0163
12 months   0.04   0.86       1.02    0.98   1.14       1.75   0.25   44%      88%      11%      33%      0.0207

This table reports robustness checks of our time-varying estimates of overweight of small probabilities. Panel A reports the summary statistics of the estimated omega (ω) parameter, which is the parameter used in Prelec’s (1998) probability weighting function (see Eq. (3.4.2)). Similarly to the CPT, the parameter ω defines the curvature of the weighting function for gains and losses, which leads the probability weighting functions to assume inverse S-shapes. An ω parameter equal to one means a weighting function with un-weighted (neutral) probabilities, whereas ω < 1 denotes overweighting of small probabilities. Similarly to γ, we estimate long-term ω’s (reported for γ in Table 3.1, Panel A) as well as time-varying ω’s (reported for γ in Table 3.4, Panel C). Panels B and C report γ estimates when the CPT’s probability weighting parameter for the left side of the distribution (δ) is assumed to be, respectively, 0.69 (the CPT parameterization) and 1 (neutral probability weighting). Panels D and E report γ estimates when the CPT’s value function parameters α and β for diminishing sensitivity to gains and losses are assumed to be, respectively, 0.75 (increased diminishing sensitivity) and 1 (no diminishing sensitivity). We assume in these robustness tests that the loss aversion parameter λ equals 2.25.

The sample dependence observed in our main results is confirmed by the use of Prelec’s weighting function, as overweight of tails is pervasive mostly in the 1998-2003 sample. Overall, the robustness checks following Prelec (1998) confirm our main findings regarding the time-variation and sample dependence of overweighting of small probabilities.

3.4.3 Estimating time-varying γ under different assumptions for δ, α and β

As an additional robustness test of our time-varying estimates of γ, we also run optimizations in which we fix the parameter δ instead of jointly optimizing it with γ. We impose δ = 1 (no overweight of small probabilities on the left side of the distribution) or δ = 0.69, the value of δ within the CPT. In line with our previous robustness test, Table 3.7, Panels B and C, suggests that results from optimizations with different values for δ are qualitatively the same as our main results, i.e., a positive term structure and sample dependency of overweight of small probabilities. Unreported results also indicate a negative correlation between γ and sentiment and high explanatory power of the regressions. R2 is between 13 and 21 percent for three- and six-month options and between 0 and 3 percent for twelve-month options. However, neutral probability weighting on the left side of the distribution (δ=1) adjusts γ downwards when compared to our main results. Conversely, when δ is 0.69, an upward adjustment to the γ estimates occurs.

Similarly, we also estimate γ under different assumptions for α and β. We assume α=β=1 (no diminishing sensitivity to gains and losses) and α=β=0.75 (more pronounced diminishing sensitivity to gains and losses) instead of the CPT parameterization α=β=0.88. Our results, reported in Table 3.7, Panels D and E, suggest that lower sensitivity to gains and losses (higher α and β) leads to a decrease in overweight of small probabilities (higher γ estimates), whereas higher sensitivity to gains and losses (lower α and β) leads to an increase in overweight of tails (lower γ estimates). This effect is similar to the one observed for changes in λ (described in Section 3.3.3), which also magnifies the sensitivity to losses when increased.

As indicated in Section 3.2.2, we have also estimated time-varying γ using different lower bounds (-0.25, 0 and 0.28) and upper bounds (1.2, 1.35, 1.5, 1.75 and 2). Results differ across bounds in that higher bounds produce upward shifts in the estimated γ across all quantiles, medians and averages, so that overweight of small probabilities becomes less pronounced but remains present. The time-variation pattern observed in Figure 3.2 and, more importantly, the strong negative relationship with sentiment reported in Table 3.4 are, however, extremely robust to changes in the lower and upper optimization bounds. This result strengthens our conclusion that overweight of small probabilities is largely time-varying and reflects investor sentiment.

3.4.4 Overweight of (right) tails driven by IV of single stock options

Finally, given that overweight of small probabilities by single stock call investors was most evident during the IT bubble period (as Table 3.3 suggests), we evaluate whether this finding may have been driven by movements in the IV of index options rather than by changes in the IV of single stock options. We perform this analysis because our methodology for calculating weighted average single stock IVs partly relies on the IV of index options (as it depends on implied correlations), as Eqs. 3.A.8j and 3.A.8l in Appendix 3.A.2 suggest. Essentially, we want to ensure that the overweight of small probabilities observed in our single stock options data is not caused by a rise in index options’ IV. As overweight of small probabilities is a corollary of high IV skew (footnote 24), we examine the IV skews (120 percent moneyness versus at-the-money, ATM) from both index options and single stock options within our sample using a k-Nearest-Neighbors (KNN) algorithm (see Appendix 3.B.2 for details). Figure 3.3 depicts a scatter plot that relates single stock IV skews (on the y-axis) to index option IV skews (on the x-axis), overlaid with the decision boundary between overweight of tails (in red) and its absence (in blue), produced by applying the KNN algorithm to our full data sample. The picture suggests that overweight of small probabilities is almost never caused by positive index IV skews, whereas positive single stock IV skews very often produce overweight of tails rather than underweight. Overweight of tails mostly arises in situations where single stock IV skews are higher than index IV skews, which suggests that either high single stock IV skews or low implied correlations are responsible for overweight of tails, not index options’ IV. These conditions can be anecdotally confirmed by our observation of IV skews during the IT bubble of the early 2000s. During that period, when overweight of tails was pervasive, the IV skew from single stock options was quite high, close to +10 volatility points, whereas the same IV skew from index options reached extremely low levels, such as -15 volatility points. This disconnect between the two IV markets, which drove the implied correlation to 2.8 percent (an extremely low level), suggests that the index options’ IV was not the driver of overweight of tails during the IT bubble. These findings reiterate our suggestion that the overweight of small probabilities observed in our sample is caused by trading in single stock options by retail investors, rather than by activity in the index option market.

24 While this relation is widely acknowledged, Jarrow and Rudd (1982), Corrado and Su (1997) and Longstaff (1995) provide a formal theorem for the link between IV skew and risk-neutral skewness and kurtosis.
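The classification step behind the decision boundary can be sketched as follows. This is a generic majority-vote KNN on simulated skews, not the procedure of Appendix 3.B.2: the data, the labelling rule and the function name `knn_predict` are hypothetical, and only the value k = 41 is taken from the caption of Figure 3.3.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k):
    """Classify each query point by majority vote among its k nearest neighbours.

    train_x: (n, 2) array of features, train_y: (n,) array of 0/1 labels,
    query: (m, 2) array of points to classify. Euclidean distance is used.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=int)
    query = np.atleast_2d(np.asarray(query, dtype=float))
    # Pairwise squared distances between query points and training points.
    diff = query[:, None, :] - train_x[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    nearest = np.argsort(dist2, axis=1)[:, :k]
    votes = train_y[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

rng = np.random.default_rng(1)
n_obs = 300
# Hypothetical skews in volatility points: (index IV skew, single stock IV skew).
index_skew = rng.normal(-5.0, 4.0, n_obs)
stock_skew = rng.normal(0.0, 4.0, n_obs)
features = np.column_stack([index_skew, stock_skew])
# Illustrative labelling rule only: class 1 ("overweight of tails", gamma < 1)
# whenever the single stock skew exceeds the index skew by more than 5 points.
labels = (stock_skew - index_skew > 5.0).astype(int)

k = 41  # odd k avoids ties; the figure caption reports k = 41
new_points = np.array([[-15.0, 10.0],   # stock skew far above index skew
                       [5.0, -10.0]])   # stock skew far below index skew
predicted = knn_predict(features, labels, new_points, k)
```

Under this illustrative labelling rule, a new observation with a high single stock skew and a very low index skew is assigned to the overweight class, and the reverse configuration to the alternative class. Evaluating `knn_predict` on a fine grid of paired skews would trace out a decision boundary of the kind overlaid on the scatter plot.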

(a) 3 months

Figure 3.3: k-Nearest-Neighbors for IV skews. This figure shows a scatter plot depicting the relation between single stock (120 percent minus ATM) IV skews (on the y-axis) and index (120 percent minus ATM) IV skews (on the x-axis). Observations colored in red imply the presence of overweight of small probabilities on the right side of the distribution (γ < 1), whereas observations colored in blue imply either neutral probability weighting (γ = 1) or underweight of small probabilities (γ > 1). The decision boundary is produced by a k-Nearest-Neighbors algorithm (k=41, estimated via cross-validation) and delimits the region in which a new observation (of paired IV skews, such as the solid dots) will be assigned to the overweight of small probabilities class (in red) or the alternative class (in blue).


3.5 Conclusion

Single stock OTM call options are deemed overpriced because investors overpay for positively skewed securities, which resemble lottery tickets. The probability weighting function of the CPT, the theoretical model of Tversky and Kahneman (1992), provides an appealing explanation for why these options are expensive: investors’ preferences for positively skewed securities. In our empirical analysis, we find that the CPT subjective density function implied by single stock options outperforms the RND and two rational density functions (from the power and exponential utilities) in matching the tails of realized equity returns. We estimate the CPT probability weighting function parameter γ and find that the estimates are qualitatively consistent with the value predicted by Tversky and Kahneman (1992), particularly for short-term options. This outcome endorses our hypothesis that investors in single stock call options are biased.

Our analysis provides detailed insights into the behavior of single stock option investors.

Our empirical findings suggest that overweight of small probabilities is less pronounced than

proposed by the CPT. We find the presence of a positive term structure of overweighting of

tails, because it becomes less pronounced as the option maturity increases. Investors in single

stock calls are more biased when trading short-term contracts, whereas they seem to be more

rational (less biased) when trading long-term calls. This result is consistent with individual

investors being the typical buyers of OTM single stock calls and the fact that they mostly use

short-term options to speculate on the upside of equities.

We also find that investors’ overweighting of small probabilities is strongly time-varying and

sample dependent. Time-variation in γ’s remains strong even when we account for different

levels of loss aversion, different diminishing sensitivities to gains and losses, different degrees of

overweighting of the left tail and an alternative (Prelec’s) weighting function. The strong time-

variation and sample dependency of γ suggest that investors do not have a static preference for

skewness, but rather time-varying preferences or “bias in beliefs” (see Barberis, 2013).

Such time-variation in γ is also confirmed by the overweighting of tails being pronounced in

periods in which sentiment is high, for instance, the IT bubble period. This finding is consistent

with the Baker and Wurgler (2007) sentiment measure being the main explanatory variable of

overweighting of small probabilities. Our results challenge the view that single stock call options

are structurally overpriced and offer the insight that the overweight of tail events implied in these

options is conditional on sentiment levels and option maturity rather than positive stock

fundamentals, loss aversion levels or investor preferences for skewness.

Our findings have several important practical implications. First, the understanding of time-

variation in investors’ overweighting of small probabilities could be used in the development of

behavioral option pricing models, which remains in its infancy. To the extent that overweighting

of small probabilities is a latent variable or, simply, not trivial to estimate, we contemplate that

future option pricing models should be more sentiment-aware than current ones. Second, of

importance for such next generation option-pricing models is the inclusion of a positive term

structure of tails’ overweighting. Such potential modifications to option pricing have large

and direct consequences for risk management, hedging and arbitrage activities. Third, from a


financial stability point of view, investors’ overweighting of small probabilities in single stock

options could be of use to regulators for triangulating the presence of speculative equity market

bubbles.


3.A Appendix: Risk-neutral densities and implied volatility analytics

3.A.1 Subjective density function estimation

We hereby present the derivations required to achieve Eq. (3.2.3) in the main text, Eq. (3.A.7)

here, from Eq. (3.2.2), called here Eq. (3.A.1):

$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T), \tag{3.A.1}$$

where $f_P(S_T)$ is the “real-world” probability distribution, $f_Q(S_T)$ is the RND, $\varsigma(S_T)$ is the pricing kernel, $w$ is the weighting function and $F_P(S_T)$ is the “real-world” cumulative distribution function.

The first step of our derivation entails re-arranging Eq. (3.A.1) into (3.A.2b) via Eq.

(3.A.2a), which demonstrates that for the CPT to hold, the subjective density function should

be consistent with the probability weighted EDF:

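$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{w'(F_P(S_T))}_{\text{probability weighting}} \cdot \underbrace{f_P(S_T)}_{\text{EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{3.A.2a}$$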


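$$\phantom{f_Q(S_T)} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{3.A.2b}$$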

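$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \frac{f_Q(S_T)}{\Lambda \dfrac{U'(S_T)}{U'(S_t)}} = \underbrace{\frac{f_Q(S_T)}{\varsigma(S_T)}}_{\text{Subjective density}} \tag{3.A.3}$$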

Following Ait-Sahalia and Lo (2000) and Bliss and Panigirtzoglou (2004), Eq. (3.A.3) can

be manipulated so that the time-preference constant Λ of the pricing kernel vanishes, producing

Eq. (3.A.4), which directly relates the probability weighted EDF, the RND, and the marginal

utility, $U'(S_T)$:

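$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \frac{\dfrac{U'(S_t)}{\Lambda U'(S_T)} f_Q(S_T)}{\displaystyle\int \dfrac{U'(S_t)}{\Lambda U'(x)} f_Q(x)\,dx} = \underbrace{\frac{\dfrac{f_Q(S_T)}{U'(S_T)}}{\displaystyle\int \dfrac{f_Q(x)}{U'(x)}\,dx}}_{\text{Generic subjective density function}} \tag{3.A.4}$$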

where $\int \frac{f_Q(x)}{U'(x)}\,dx$ normalizes the resulting subjective density function to integrate to one. Once the utility function is estimated, Eq. (3.A.4) allows us to convert the RND into the probability weighted EDF. Eq. (3.A.4) can also be used to estimate the subjective density function for a (rational) investor that has a power or exponential utility function, by disregarding the weighting function $w(\cdot)$, so the left-hand side of the equation becomes $f_P(S_T)$. In the remainder of the chapter we call these subjective distributions power and exponential density functions.
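As an illustration of Eq. (3.A.4) for the rational (no probability weighting) case, the following minimal sketch converts an RND into a power density function on a discrete grid. The lognormal RND, the grid, the risk-aversion value `eta` and the parameterization $U'(S) = S^{-\eta}$ of the power utility are assumptions made for this example only; they are not the estimates used in the chapter.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal rule on a grid of points xs with function values ys."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))

def subjective_density(grid, rnd, marginal_utility):
    """Eq. (3.A.4): divide the RND by U'(S_T) and renormalize to integrate to one."""
    raw = [q / marginal_utility(s) for s, q in zip(grid, rnd)]
    norm = trapezoid(grid, raw)
    return [r / norm for r in raw]

def lognormal_pdf(s, mu, sigma):
    """Lognormal density, used here only as a stand-in for an estimated RND."""
    z = (math.log(s) - mu) / sigma
    return math.exp(-0.5 * z * z) / (s * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical inputs: price grid from 20 to 300 and an RND with mean 100.
grid = [20.0 + 0.5 * i for i in range(561)]
sigma = 0.2
rnd = [lognormal_pdf(s, math.log(100.0) - 0.5 * sigma ** 2, sigma) for s in grid]

eta = 2.0  # assumed relative risk aversion of the power utility
power_density = subjective_density(grid, rnd, lambda s: s ** (-eta))

mean_rnd = trapezoid(grid, [s * q for s, q in zip(grid, rnd)])
mean_power = trapezoid(grid, [s * p for s, p in zip(grid, power_density)])
```

In this setup a risk-averse investor (`eta > 0`) shifts probability mass toward higher terminal prices relative to the RND, so `mean_power` exceeds `mean_rnd`.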

As we hypothesize that the representative investor has a CPT utility function, its marginal utility function is $U'(S_T) = \upsilon'(S_T)$, and, thus, $\upsilon'(S_T) = \alpha S_T^{\alpha-1}$ for $S_T \geq 0$, and $\upsilon'(S_T) = -\lambda\beta(-S_T)^{\beta-1}$ for $S_T < 0$, leading to Eq. (3.A.5):

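$$\tilde{f}_P(S_T) = \frac{\dfrac{f_Q(S_T)}{\alpha S_T^{\alpha-1}}}{\displaystyle\int \dfrac{f_Q(x)}{\alpha x^{\alpha-1}}\,dx} \quad \text{for } S_T \geq 0, \text{ and} \tag{3.A.5}$$

$$\tilde{f}_P(S_T) = \underbrace{\frac{\dfrac{f_Q(S_T)}{-\lambda\beta(-S_T)^{\beta-1}}}{\displaystyle\int \dfrac{f_Q(x)}{-\lambda\beta(-x)^{\beta-1}}\,dx}}_{\text{Partial CPT density function}} \quad \text{for } S_T < 0. \tag{3.A.6}$$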

Eqs. (3.A.5) and (3.A.6), hence, relate the EDF where probabilities are weighted according to the CPT probability distortion functions, on the LHS, to the subjective density function derived from the CPT value function, on the RHS, separately for gains and losses, i.e., the PCPT density function. The relationships specified by Eqs. (3.A.5) and (3.A.6) fully state the relation we would like to depict, although one additional manipulation is convenient for our argumentation. Assuming that the function $w(F_P(S_T))$ is strictly increasing over the domain [0,1], there is a one-to-one relationship between $w(F_P(S_T))$ and a unique inverse $w^{-1}(F_P(S_T))$. So, the result $\tilde{f}_P(S_T) = w'(F_P(S_T)) f_P(S_T)$ also implies $\tilde{f}_P(S_T) \cdot (w^{-1})'(F_P(S_T)) = f_P(S_T)$25. This outcome allows us to directly relate the original EDF to the CPT subjective density function, by “undoing” the effect of the CPT probability distortion functions within the PCPT density function:


$$f_P(S_T) = \left( \frac{\dfrac{f_Q(S_T)}{\upsilon'(S_T)}}{\displaystyle\int \dfrac{f_Q(x)}{\upsilon'(x)}\,dx} \right) (w^{-1})'(F_P(S_T)) \tag{3.A.7}$$

Thus, once the relation between the probability weighting function of EDF and the PCPT

density is established, as in Eqs. (3.A.5) and (3.A.6), one can eliminate the weighting scheme

affecting returns by applying the inverse of such weightings to the subjective density function

without endangering such equalities, as in Eq. (3.A.7), numbered Eq. (3.2.3) in the main text.
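The inversion step can be sketched numerically as follows. The code uses the Tversky and Kahneman (1992) weighting function $w(p) = p^{\gamma} / (p^{\gamma} + (1-p)^{\gamma})^{1/\gamma}$ with γ = 0.61 and obtains $w^{-1}$ by bisection and $(w^{-1})'$ by a central finite difference. Bisection, the tolerance and the step size `h` are choices made for this illustration; the thesis states only that the inverse is obtained numerically.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function w(p)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

def tk_weight_inverse(q, gamma, tol=1e-12):
    """Invert w by bisection; only valid when w is strictly increasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tk_weight(mid, gamma) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_derivative(q, gamma, h=1e-6):
    """Central finite-difference approximation of (w^{-1})'(q)."""
    upper = tk_weight_inverse(q + h, gamma)
    lower = tk_weight_inverse(q - h, gamma)
    return (upper - lower) / (2.0 * h)

gamma = 0.61          # CPT parameter for gains cited in the text
p = 0.01              # a small tail probability
w_p = tk_weight(p, gamma)                # weighted probability, larger than p
p_back = tk_weight_inverse(w_p, gamma)   # recovers p up to the tolerance
slope = inverse_derivative(w_p, gamma)   # factor that "undoes" the weighting
```

With γ < 1 the small probability is overweighted (`w_p > p`), and `slope` is the reciprocal of $w'(p)$, which is the quantity that multiplies the PCPT density in Eq. (3.A.7).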

3.A.2 Single stock weighted average implied volatilities

In the following we derive the weighted average single stock IV, Eq. (3.A.8l), and the implied

correlation approximation, Eq. (3.A.8j):

$$\sigma_P^2 = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i \neq j}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j \tag{3.A.8a}$$

25A drawback of the CPT model is that it allows for non-strictly increasing functions, which would not allow invertibility. This is the reason why the newer literature on probability distortion functions favors other strictly monotonic functions, such as Prelec’s (1998) $w(p) = e^{-(-\ln(p))^{\delta}}$, as the weighting functions. Nevertheless, because the CPT parameters of our interest (γ = 0.61; δ = 0.69) impose strict monotonicity, we can obtain the inverse of the probability function, $w^{-1}(p)$, numerically.


Starting from the portfolio variance σ2P formula given by Eq. (3.A.8a), where i and j are indexes

for the portfolio constituents, this relation can be re-written for an equity index as:

$$\sigma_I^2 = \sum_{i,j=1}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j \tag{3.A.8b}$$

implying that,

$$\sum_{i \neq j}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j = \sum_{i,j=1}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j - \sum_{i=1}^{n} w_i^2 \sigma_i^2 \tag{3.A.8c}$$

where,

$$\rho_{ij}(x) = \begin{cases} \rho, & \text{if } i \neq j \\ 1, & \text{if } i = j \end{cases} \tag{3.A.8d}$$

and where σ2I is the equity index option-implied variance. Then, assuming ρ as the estimator

for average stock correlation we have:

$$\sigma_I^2 = \rho \sum_{i \neq j}^{n} w_i w_j \sigma_i \sigma_j + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8e}$$

which, given equality 3.A.8c, can be re-written as:

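\begin{align}
\sigma_I^2 &= \rho \sum_{i,j=1}^{n} w_i w_j \sigma_i \sigma_j - \rho \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8f} \\
&= \rho \left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \rho \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8g} \\
&= \rho \left[ \left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2 \right] + \sum_{i=1}^{n} w_i^2 \sigma_i^2, \tag{3.A.8h}
\end{align}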

$$\rho = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} w_i \sigma_i \right)^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}. \tag{3.A.8i}$$

As $\sum_{i=1}^{n} w_i^2 \sigma_i^2$ is relatively small, we can simplify Eq. (3.A.8i), the implied correlation, into the approximated implied correlation given by Eq. (3.A.8j). Note that, as $\sum_{i=1}^{n} w_i^2 \sigma_i^2$ is always positive, the approximated implied correlation will always overstate the true implied correlation.

$$\rho \approx \frac{\sigma_I^2}{\left( \sum_{i=1}^{n} w_i \sigma_i \right)^2}. \tag{3.A.8j}$$

Further, in order to obtain the weighted average single stock implied volatility, Eq. (3.A.8l), we take the square root of both sides of the approximation and re-arrange the terms:

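$$\sqrt{\rho} \approx \frac{\sigma_I}{\sum_{i=1}^{n} w_i \sigma_i} \tag{3.A.8k}$$

with

$$\sum_{i=1}^{n} w_i \sigma_i \approx \frac{\sigma_I}{\sqrt{\rho}}. \tag{3.A.8l}$$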

Lastly, note that, given equality 3.A.8c, Eq. 3.A.8i can be re-written as:

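$$\rho = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\sum_{i \neq j}^{n} w_i w_j \sigma_i \sigma_j} = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\sum_{i=1}^{n} \sum_{j \neq i} w_i w_j \sigma_i \sigma_j}, \tag{3.A.8m}$$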

which is the implied correlation (IC) measure employed by Driessen et al. (2013).
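A small numerical check of Eqs. (3.A.8i), (3.A.8j) and (3.A.8l) can make the approximation error concrete. The weights, volatilities and the equicorrelation of 0.5 below are hypothetical values chosen for illustration; the index volatility is constructed from Eq. (3.A.8h) so that the exact formula should recover the assumed correlation.

```python
import math

def implied_correlation(index_vol, weights, vols):
    """Exact implied correlation, Eq. (3.A.8i)."""
    own = sum((w * s) ** 2 for w, s in zip(weights, vols))
    avg = sum(w * s for w, s in zip(weights, vols))
    return (index_vol ** 2 - own) / (avg ** 2 - own)

def implied_correlation_approx(index_vol, weights, vols):
    """Approximated implied correlation, Eq. (3.A.8j)."""
    avg = sum(w * s for w, s in zip(weights, vols))
    return index_vol ** 2 / avg ** 2

# Hypothetical four-stock index.
weights = [0.4, 0.3, 0.2, 0.1]
vols = [0.25, 0.30, 0.35, 0.40]
true_rho = 0.5

# Index variance implied by a common correlation, following Eq. (3.A.8h).
own_term = sum((w * s) ** 2 for w, s in zip(weights, vols))
avg_vol = sum(w * s for w, s in zip(weights, vols))
index_vol = math.sqrt(true_rho * (avg_vol ** 2 - own_term) + own_term)

rho_exact = implied_correlation(index_vol, weights, vols)
rho_approx = implied_correlation_approx(index_vol, weights, vols)

# Eq. (3.A.8l): weighted average single stock IV backed out from the index IV.
weighted_avg_iv = index_vol / math.sqrt(rho_approx)
```

In this example `rho_exact` recovers the assumed 0.5, while `rho_approx` is higher, in line with the remark that the approximation overstates the implied correlation. With only four stocks the gap is large; it shrinks as the number of constituents grows and the sum of squared weighted volatilities becomes small.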

3.B Appendix: Machine learning methods

3.B.1 Least Absolute Shrinkage and Selection Operator (Lasso)

The regression coefficients obtained by the Lasso methodology applied ($\beta^{L}_{\theta}$) are estimated by

minimizing the quantity:

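$$\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \kappa \sum_{j=1}^{p} |\beta_j| = \mathrm{RSS} + \kappa \sum_{j=1}^{p} |\beta_j| \tag{3.B.1}$$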

where κ is the tuning parameter, which is estimated via cross-validation. The cross-validation

applied by us uses ten equal-size splits of our overall data set.
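The following minimal sketch minimizes the objective of Eq. (3.B.1) by coordinate descent and picks κ from a small grid using ten equal-size splits. Coordinate descent, the interleaved fold assignment, the κ grid and the simulated data are assumptions made for illustration; the thesis does not state which solver or grid was used.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; this is the update implied by the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def predict(beta0, beta, x):
    return beta0 + sum(b * v for b, v in zip(beta, x))

def lasso_fit(X, y, kappa, n_sweeps=50):
    """Coordinate descent on RSS + kappa * sum(|beta_j|), as in Eq. (3.B.1)."""
    n, p = len(X), len(X[0])
    beta0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlate predictor j with the residual that excludes its own term.
            rho = sum(X[i][j] * (y[i] - predict(beta0, beta, X[i]) + beta[j] * X[i][j])
                      for i in range(n))
            beta[j] = soft_threshold(rho, kappa / 2.0) / sum(row[j] ** 2 for row in X)
        # The intercept is not penalized.
        beta0 = sum(y[i] - predict(0.0, beta, X[i]) for i in range(n)) / n
    return beta0, beta

def cv_mse(X, y, kappa, n_folds=10):
    """Mean squared prediction error over n_folds equal-size held-out splits."""
    n, total = len(X), 0.0
    for fold in range(n_folds):
        held_out = set(range(fold, n, n_folds))
        X_train = [X[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        beta0, beta = lasso_fit(X_train, y_train, kappa)
        total += sum((y[i] - predict(beta0, beta, X[i])) ** 2 for i in held_out)
    return total / n

# Simulated (hypothetical) data: only the first of three predictors matters.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(60)]
y = [1.0 + 2.0 * row[0] + random.gauss(0.0, 0.5) for row in X]

kappa_grid = [0.0, 1.0, 10.0, 100.0]
best_kappa = min(kappa_grid, key=lambda k: cv_mse(X, y, k))
beta0_hat, beta_hat = lasso_fit(X, y, best_kappa)
```

A very large κ sets every slope coefficient to zero, while κ = 0 reproduces ordinary least squares; cross-validation selects a value between these extremes.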

3.B.2 k-Nearest-Neighbor classifier

The k-Nearest-Neighbor (KNN) classifier is one of the approaches in machine learning that

attempts to estimate the conditional distribution of the explained variable (Y ) given the ex-

planatory variables (X) and, subsequently, classify new observations to the class with highest

estimated probability. The KNN classifier uses the Euclidean distance to first identify the k

closest observations within the training data (in-sample data) to a new test (out-of-sample)

observation provided (x0). Such neighborhood of points around the test observation x0 is de-

fined as N0. KNN, then, estimates the conditional probability of x0 to belong to a class j as

the percentage of old observations (yi) in the neighborhood N0 whose class is also j:

$$\Pr(Y = j \mid X = x_0) = \frac{1}{k} \sum_{i \in N_0} I(y_i = j) \tag{3.B.2}$$

In a third step, KNN applies the Bayes rule to perform out-of-sample classification (in test

data) of x0 to the class with the largest probability. For further details, see Hastie et al. (2008).
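The three steps above can be sketched in a few lines. The training points below are hypothetical pairs of (index IV skew, single stock IV skew) with made-up class labels; they are not the observations plotted in Figure 3.3.

```python
import math
from collections import Counter

def knn_class_probabilities(train_X, train_y, x0, k):
    """Eq. (3.B.2): share of the k nearest training points falling in each class."""
    by_distance = sorted(range(len(train_X)),
                         key=lambda i: math.dist(train_X[i], x0))
    neighborhood = by_distance[:k]  # indices forming N0
    counts = Counter(train_y[i] for i in neighborhood)
    return {label: count / k for label, count in counts.items()}

def knn_classify(train_X, train_y, x0, k):
    """Assign x0 to the class with the largest estimated probability."""
    probabilities = knn_class_probabilities(train_X, train_y, x0, k)
    return max(probabilities, key=probabilities.get)

# Hypothetical training data: (index IV skew, single stock IV skew).
train_X = [(-0.050, -0.010), (-0.040, -0.020), (-0.060, -0.015),
           (-0.020, 0.020), (-0.010, 0.030), (-0.015, 0.025)]
train_y = ["other", "other", "other",
           "overweight", "overweight", "overweight"]

x0 = (-0.012, 0.028)  # new out-of-sample observation
probs_k3 = knn_class_probabilities(train_X, train_y, x0, k=3)
probs_k5 = knn_class_probabilities(train_X, train_y, x0, k=5)
label = knn_classify(train_X, train_y, x0, k=3)
```

With k = 3 all neighbors of `x0` belong to one class, whereas k = 5 pulls in two points from the other class and lowers the estimated probability, which illustrates why k is chosen by cross-validation.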


3.C Appendix: Welch and Goyal (2008) equity market predictors

The complete set of variables provided by Welch and Goyal (2008)26 that are used in our study, with summarized descriptions, is given below:

1. Dividend-price ratio (log), D/P: Difference between the log of dividends paid on the

S&P 500 index and the log of stock prices (S&P 500 index).

2. Dividend yield (log), D/Y: Difference between the log of dividends and the log of

lagged stock prices.

3. Earnings, E12: 12-month moving sum of earnings on the S&P 500 index.

4. Earnings-price ratio (log), E/P: Difference between the log of earnings on the S&P

500 index and the log of stock prices.

5. Dividend-payout ratio (log), D/E: Difference between the log of dividends and the

log of earnings.

6. Stock variance, SVAR: Sum of squared daily returns on the S&P 500 index.

7. Book-to-market ratio, B/M: Ratio of book value to market value for the Dow Jones

Industrial Average.

8. Net equity expansion, NTIS: Ratio of twelve-month moving sums of net issues by

NYSE-listed stocks to total end-of-year market capitalization of NYSE stocks.

9. Treasury bill rate, TBL: Interest rate on a three-month Treasury bill.

10. Long-term yield, LTY: Long-term government bond yield.

11. Long-term return, LTR: Return on long-term government bonds.

12. Term spread, TMS: Difference between the long-term yield and the Treasury bill rate.

13. Default yield spread, DFY: Difference between BAA- and AAA-rated corporate bond

yields.

14. Default return spread, DFR: Difference between returns of long-term corporate and

government bonds.

15. Cross-sectional premium, CSP: Measures the relative valuation of high- and low-beta

stocks.

16. Inflation, INFL: Calculated from the CPI (all urban consumers) using t−1 information

due to the publication lag of inflation numbers.

26Available at http://www.hec.unil.ch/agoyal/.


Chapter 4

Implied Volatility Sentiment: A Tale of Two Tails∗

4.1 Introduction

End-users of out-of-the-money (OTM) options tend to overweight small probability events. This

behavioral bias, suggested by Tversky and Kahneman’s (1992) Cumulative Prospect Theory, is

claimed to be present in the pricing of OTM index puts and in OTM single stock calls (Barberis

and Huang, 2008; Polkovnichenko and Zhao, 2013)2. Within the index option market, the

typical end-users of OTM puts are institutional investors, who use them to protect their large

equity portfolios. Because institutional investors have large portfolios and hold a substantial

part of the total market capitalization, OTM index puts are frequently in high demand and,

as a result, are overvalued. The reason for such richness of OTM puts goes back to the 1987

financial market crash. Bates (1991) and Jackwerth and Rubinstein (1996) argue that the

implied distribution of equity market expected returns from index options changed considerably

following the 1987 market crash. Their findings demonstrate that, since the crash, a large shift

in market participants’ demand for such instruments took place, evidenced by the probabilities

implied by options prices. Before the crash, the probability of large negative stock returns was

close to the one suggested by a normal distribution. In contrast, just prior to the 1987 crash, the

probability of large negative returns implied by option prices rose considerably. Such increased

demand for hedging against tail risk events suggested a change in beliefs and attitude towards

risk. Investors feared another crash and became more willing to give up upside potential in

equities to hedge against the risk of drawdowns via put options. Bates (2003) suggests that even

∗This chapter is based on Felix et al. (2017a). We thank seminar participants at the APG Asset Management Quant Roundtable, at the Infiniti 2017 Conference in Valencia, at the EEA-ESEM 2017 Conference in Lisbon and at the 2018 annual meeting of the European Financial Management Association (EFMA) in Milano for their helpful comments. We thank APG Asset Management for making available part of the data set.

2We acknowledge that it is yet unclear whether the overweighting of small probabilities is caused solely by preferences (i.e., a behavioral bias) or rather by biased beliefs (i.e., investors’ expectations). Barberis (2013) eloquently discusses how both phenomena are distinctly different and how both (individually or jointly) may potentially explain the existence of overpriced OTM options. In this chapter we take a myopic view and use only the first explanation, the existence of a behavioral bias, for ease of exposition.


models adjusted for stochastic volatility, stochastic interest rates, and random jumps do not

fully explain the high level of OTM puts’ implied volatilities (IV). Accordingly, Garleanu et al.

(2009) argue that excessive IV from OTM puts cannot be explained either by option-pricing

models that take such institutional investors’ demand pressure into account3.

It has been claimed that OTM calls on single stocks are systematically expensive (Barberis

and Huang, 2008; Boyer and Vorkink, 2014). The typical end-users of OTM single stock calls

are individual investors. Bollen and Whaley (2004) state that changes in the IV structure of

single stock options across moneyness are driven by the net purchase of calls by individual

investors. The literature provides several explanations for such strong buying pressure of calls

by retail investors. For example, Mitton and Vorkink (2007) and Barberis and Huang (2008)

propose models in which investors have a clear preference for positive return skewness, or

“lottery ticket” type of assets. In consequence of this preference, retail investors overpay for

these leveraged securities, making OTM calls expensive and causing them to yield low forward

returns. Cornell (2009) presents a behavioral explanation for the overpricing of single stock

calls: because investors are overconfident in their stock-picking skills, they buy calls to get

the most “bang for the buck”. A related explanation for the structural overpricing of single

stock calls is leverage aversion or leverage constraint: because investors are averse to borrowing

(levering) or constrained to do so, they buy instruments with implicit leverage to achieve their

return targets.

Beyond this literature that supports the link between institutional and individual investor

trading activity and the structural overvaluation of OTM options, we argue that short-term

trading dynamics also influence the pricing of OTM options. For instance, Han (2008) provides

evidence that the index options IV smirk is steeper when professional investors are bearish. He

concludes that the steepness of the IV structure across moneyness relates to investors’ sentiment.

In the same line, Amin et al. (2004) argue that investors bid up the prices of put options after

increases in stock market volatility and rising risk aversion, whereas such buying pressure wanes

following positive momentum in equity markets. Mahani and Poteshman (2008) argue that

trading in single stock call options around earnings announcements is speculative in nature and

dominated by unsophisticated retail investors. Lakonishok et al. (2007) show evidence that long

call prices increased substantially during bubble times (1990 and 2000) and that most of the

single stock options’ market activity consists of speculative directional call positions. Lemmon

and Ni (2011) discuss that the demand for single stock options (dominated by speculative

individual investors’ trades) positively relates to sentiment. Lastly, Polkovnichenko and Zhao

(2013) suggest that time-variation in overweight of small probabilities derived from index put

options might depend on sentiment, whereas Felix et al. (2016b) provide evidence that the time-

varying overweight of small probabilities from single stock options largely links to sentiment.

The above studies suggest that OTM index puts and single stock calls are systematically

3It is important to disentangle the (equity) hedging behavior of institutional investors from their overall trading activity. Studies, such as Frijns et al. (2015), provide evidence that institutional investors price stocks rationally, supporting the idea that the argued behavioral bias might be confined to institutional investors’ portfolio insurance decisions.


overpriced and that the valuation misalignment fluctuates considerably over time, caused by

changes in investor sentiment. In this chapter, we delve deeper into this link and investigate how

overweight of small probabilities links to sentiment and forward returns.

The first contribution of our study is to evaluate the information content of overweighted

small probabilities from index puts and single stock calls, as a measure of sentiment. We assess

the ability of this measure to predict forward equity returns and, more specifically, equity market

reversals, defined as abrupt changes in the market direction4. Because we find overweight of small

probabilities to be strongly linked to IV skews, we hypothesize that reversals may follow not

only periods of excessive overweight of tails but also periods of extreme IV skews5.

One characteristic of the literature that analyzes the informational content of IV skews is

that it evaluates index puts’ IV skews and single stock calls’ IV skews completely separated

from each other. As such, our second contribution is that we are, to the best of our knowledge,

the first in the literature to use IV skews jointly extracted from both the index and single stock

option market as an indicator for investors’ sentiment. Our sentiment measure, the so-called

IV-sentiment, is calculated as the IV of OTM index puts minus the IV of OTM single stock calls.

We conjecture that our IV-sentiment measure advances the understanding of investors’

sentiment because it captures the very distinct nature of these markets’ two main categories of

end-users: 1) IV from OTM puts captures institutional investors’ willingness to pay for leverage

to hedge their downside risk (portfolio insurance), as a measure of bearishness, whereas 2) IV

from OTM single stock calls captures levering by individual investors for speculation on the

upside (“lottery tickets” buying), as a measure of bullishness. Thus, a high level of IV-sentiment

indicates bearish sentiment, as IV from index puts outpaces that from single stock calls. In

contrast, low levels of IV-sentiment indicate bullish sentiment, as IV from single stock calls

becomes high relative to that from index puts.

We find that our IV-sentiment measure predicts equity market reversals better than over-

weight of small probabilities itself. It also delivers positive risk-adjusted returns more consis-

tently than the common Baker and Wurgler (2007) sentiment factor when evaluated via two

trading strategies, a high-frequency and a low-frequency one. In univariate and multivariate

predictive regression settings, our IV-sentiment measure improves the out-of-sample forecast

ability of traditional equity risk-premium models. This result is likely due to the uniqueness

of our IV-sentiment measure relative to traditional predictive factors, as well as caused by the

4Reversals in the context of this chapter are not to be confused with the so-called reversal (cross-sectional) strategy, i.e., a strategy that buys (sells) stocks with low (high) total returns over the past month, as first documented by Lehmann (1990). We focus on the overall equity market, rather than investigating single stocks.

5The literature on IV skew has largely explored the level of volatility skew across stocks and their cross-section of returns. However, insights on the link between the skew and the overall stock market are still incipient. The study by Doran et al. (2007) is one of the few that has tested the power of IV skews as a predictor of aggregate market returns. However, they only analyze the relation between skews and one-day ahead returns (found to be weakly negatively related), and ignore any longer and perhaps more persistent effects. Similarly, several studies have already attempted to recognize the conditionality of forward equity market returns to other volatility-type measures: Ang and Liu (2007) for realized variance, Bliss and Panigirtzoglou (2004) for risk-aversion implied by the risk-neutral probability distribution function embedded in cross-sections of options, Bollerslev et al. (2009) for the variance risk premium, Driessen et al. (2013) for option-implied correlations, Pollet and Wilson (2008) for historical correlations, and Vilkov and Xiao (2013) for the risk-neutral tail loss measure. Most of these studies document a short-term negative relation between risk measures and equity market movements.


imposition of some structure into our models (in the form of coefficient constraints). Once

these models are constrained, forecast combination approaches largely outperform individual

predictors and advanced machine learning techniques in forecasting the equity risk-premium

in our data set. Thus, the third contribution of our study is to complement the literature

on out-of-sample forecasting of the equity risk-premium (Welch and Goyal, 2008; Campbell

and Thompson, 2008; Rapach et al., 2010) by suggesting a new predictor, the IV-sentiment

measure. Concurrently, we reiterate earlier findings that constrained linear models remain a

powerful tool to forecast equity returns.

A final contribution of our work is to reveal the ability of our IV-sentiment measure to

improve on time-series momentum, cross-sectional momentum and equity buy-and-hold

investment strategies. Our sentiment measure is uncorrelated to these strategies, also at the

tails, for instance, when cross-sectional momentum crashes contemporaneously to market re-

bounds (Daniel and Moskowitz, 2016). Consequently, we document an increase in the infor-

mational content of such strategies when combined with the IV-sentiment strategy, especially

for cross-sectional momentum. In line with this outcome, we also report that returns from an

IV-sentiment-based strategy are poorly explained by widely used equity risk factors, such as

Fama and French’s five factors, the momentum factor (WML) and the low-volatility factor

(BAB). Hence, we propose that active equity managers could benefit from IV-sentiment by

using it for Beta-timing.

The remainder of this chapter is organized as follows. Section 4.2 describes the data and

the main methods employed in our empirical study. In section 4.3, within three sub-sections,

we focus on estimating overweight of small probabilities parameters from the index and single

stock option markets as well as linking it to the Baker and Wurgler (2007) sentiment factor and

other proxies for sentiment. In section 4.4 we test how our sentiment proxy based on overweight

of small probabilities relates to forward equity returns. Section 4.5 concludes.


We use S&P 500 index options’ IV data and single stock weighted average IV data from the

largest 100 stocks of the S&P 500 index within our risk-neutral density (RND) estimations.

The IV data comes from closing mid-option prices from January 2, 1998 to March 19, 2013

for fixed maturities for five moneyness levels, i.e., 80, 90, 100, 110, and 120, at the three-, six-

and twelve-month maturity both for index and single stock options. Eq. (3.A.8l) in Appendix

3.A.2 shows how weighted average single stock IV are computed. We apply the S&P 500

index weights normalized by the sum of weights of stocks for which IVs across all moneyness

levels are available. Because of the S&P 500 index methodology and the unavailability of IV

information for every stock on all days in our sample, stock weights in this basket change

on a daily basis. The sum of weights is, on average, 58 percent of the total S&P 500 index

capitalization and it fluctuates from 46 to 65 percent. Continuously compounded stock market

returns are calculated throughout our analysis from the basket of stocks weighted with the


same daily-varying loadings used for aggregating the IV data6. For index options, we use the

S&P 500 index prices to calculate continuously compounded stock market returns. Realized

index returns and single stock returns are downloaded via Bloomberg.
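The daily re-weighting described above can be sketched as follows. The tickers, index weights and IV values are hypothetical, and the example uses a single IV per stock, whereas the chapter requires IVs across all moneyness levels to be available before a stock enters the basket.

```python
def normalized_weights(index_weights, iv_by_stock):
    """Renormalize index weights over the stocks whose IV is available on a given day."""
    available = {ticker: weight for ticker, weight in index_weights.items()
                 if iv_by_stock.get(ticker) is not None}
    coverage = sum(available.values())  # share of the index covered by the basket
    weights = {ticker: weight / coverage for ticker, weight in available.items()}
    return weights, coverage

def weighted_average_iv(index_weights, iv_by_stock):
    """Weighted average single stock IV using the renormalized weights."""
    weights, coverage = normalized_weights(index_weights, iv_by_stock)
    average = sum(weight * iv_by_stock[ticker] for ticker, weight in weights.items())
    return average, coverage

# Hypothetical snapshot for one day: BBB has no IV quote and drops out of the basket.
index_weights = {"AAA": 0.05, "BBB": 0.03, "CCC": 0.02}
iv_by_stock = {"AAA": 0.30, "BBB": None, "CCC": 0.40}

basket_weights, coverage = normalized_weights(index_weights, iv_by_stock)
average_iv, _ = weighted_average_iv(index_weights, iv_by_stock)
```

Because availability changes from day to day, `basket_weights` and `coverage` would be recomputed for every date, which is why the basket weights vary daily.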

Overweight of small probabilities is embedded in the cumulative prospect theory (CPT)

model by means of the weighting function of the probability of prospects. Within the CPT

model, overweight of small probabilities is measured by the probability weighting function

parameters δ and γ for the left (losses) and right (gains) side of the return distribution, re-

spectively. δ and γ < 1 imply overweight of small probabilities, whereas δ and γ > 1 imply

underweight of small probabilities, and δ and γ equal to 1 means neutral weighting of prospects

(see Tversky and Kahneman, 1992).

Our methodology builds on the assumption that investors’ subjective density estimates

should correspond, on average7, to the distribution of realizations (Bliss and Panigirtzoglou,

2004). Thus, estimating CPT probability weighting function parameters δ and γ is only feasible

if two basic inputs are available: the CPT subjective density function and the distribution

of realizations, i.e., the empirical density function (EDF). The methodology applied by us to

estimate these two parameters comprises: 1) estimating the returns’ risk-neutral density from

option prices using a modified Figlewski (2010) method; 2) estimating the partial CPT density

function using the CPT marginal utility function; 3) “undoing” the effect of the probability

weighting function (w) to obtain the CPT subjective density function; 4) simulating time-

varying empirical return distributions using the Rosenberg and Engle (2002) approach; and 5)

minimizing the squared difference of the tail probabilities of the CPT and EDF to obtain daily

optimal δ’s and γ’s.

Our starting point for obtaining the CPT probability weighting function parameters δ and γ

is the estimation of RND from IV data. In order to estimate the RND, we first apply the Black-

Scholes model to our IV data to obtain options prices (C) for the S&P 500 index. Once our

data is normalized, so strikes are expressed in terms of percentage moneyness, the instantaneous

price level of the S&P 500 index (S0) equals 100 for every period for which we would like to

obtain implied returns. Contemporaneous dividend yields for the S&P 500 index are used

for the calculation of P as well as the risk-free rate from three-, six- and twelve-month T-bills.

Because we have IV data for five levels of moneyness, we implement a modified Figlewski (2010)

method for extracting the RND structure. The main advantage of the Figlewski (2010) method

over other techniques is that it extracts the body and tails of the distribution separately, thereby

allowing for fat tails.
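The first step, converting IVs into option prices with the price level normalized to 100, can be sketched with the Black-Scholes formula extended for a continuous dividend yield. The continuous-yield form, the IV smile, the rate and the dividend yield below are assumptions made for illustration rather than the exact inputs of the chapter.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, strike, maturity, rate, dividend_yield, sigma):
    """Black-Scholes call price with a continuous dividend yield."""
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike)
          + (rate - dividend_yield + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return (s0 * math.exp(-dividend_yield * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))

# Strikes in percentage moneyness with S0 = 100, as in the normalized data.
s0 = 100.0
strikes = [80.0, 90.0, 100.0, 110.0, 120.0]
ivs = [0.30, 0.25, 0.20, 0.18, 0.17]  # hypothetical three-month IV smile

call_prices = [bs_call(s0, k, 0.25, 0.02, 0.015, iv) for k, iv in zip(strikes, ivs)]
```

The resulting call prices across the five moneyness levels are the inputs from which a risk-neutral density can then be extracted.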

Once the RND is estimated, we must change the measure to translate it into the subjective

6We thank Barclays Capital for providing the implied volatility data. Barclays Capital disclosure: “Any analysis that utilizes any data of Barclays, including all opinions and/or hypotheses therein, is solely the opinion of the author and not of Barclays. Barclays has not sponsored, approved or otherwise been involved in the making or preparation of this Report, nor in any analysis or conclusions presented herein. Any use of any data of Barclays used herein is pursuant to a license.”

7This assumption implies that investors are somewhat rational, which is not inconsistent with the CPT assumption that the representative agent is less than fully rational. The CPT suggests that investors are biased, not that decision makers are utterly irrational to the point that their subjective density forecast should not correspond, on average, to the realized return distribution.


density function, a real-world probability distribution. This operation is possible via the pricing

kernel as follows:

$$\frac{f_Q(S_T)}{f_P(S_T)} = \Lambda \frac{U'(S_T)}{U'(S_t)} \equiv \varsigma(S_T), \tag{4.2.1}$$

where $f_Q(S_T)$ is the RND, $f_P(S_T)$ is the real-world probability distribution, $S_T$ is wealth or consumption, $\varsigma(S_T)$ is the pricing kernel, $\Lambda$ is the subjective discount factor (the time-preference constant) and $U(\cdot)$ is the representative investor utility function.

Since CPT-biased investors price options as if the data-generating process has a cumulative distribution $\tilde{F}_P(S_T) = w(F_P(S_T))$, where $w$ is the weighting function, its density function becomes $\tilde{f}_P(S_T) = w'(F_P(S_T)) \cdot f_P(S_T)$ (Dierkes, 2009; Polkovnichenko and Zhao, 2013) and Eq. (4.2.1) collapses into Eq. (4.2.2):

$$\frac{f_Q(S_T)}{w'(F_P(S_T)) \cdot f_P(S_T)} = \varsigma(S_T), \tag{4.2.2}$$

which, re-arranged into Eq. (4.2.4) via Eqs. (4.2.3a) and (4.2.3b), demonstrates that for the

CPT to hold, the subjective density function should be consistent with the probability weighted

EDF:


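$$\underbrace{f_Q(S_T)}_{\text{RND}} = \underbrace{w'(F_P(S_T))}_{\text{probability weighting}} \cdot \underbrace{f_P(S_T)}_{\text{EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{4.2.3a}$$

$$\phantom{f_Q(S_T)} = \underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} \cdot \underbrace{\varsigma(S_T)}_{\text{pricing kernel}} \tag{4.2.3b}$$

$$\underbrace{\tilde{f}_P(S_T)}_{\text{probability weighted EDF}} = \frac{f_Q(S_T)}{\Lambda \dfrac{U'(S_T)}{U'(S_t)}} = \underbrace{\frac{f_Q(S_T)}{\varsigma(S_T)}}_{\text{Subjective density}} \tag{4.2.4}$$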

Following Bliss and Panigirtzoglou (2004), Eq. (4.2.4) can be manipulated so that the

time-preference constant Λ of the pricing kernel vanishes, producing Eq. (4.2.5), which directly

relates the probability weighted EDF, the RND, and the marginal utility, $U'(S_T)$:


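\[
w'(F_P(S_T)) \cdot f_P(S_T) = \frac{\frac{U'(S_t)}{\Lambda U'(S_T)}\, f_Q(S_T)}{\int \frac{U'(S_t)}{\Lambda U'(x)}\, f_Q(x)\,dx} = \underbrace{\frac{\frac{f_Q(S_T)}{U'(S_T)}}{\int \frac{f_Q(x)}{U'(x)}\,dx}}_{\text{generic subjective density function}} \tag{4.2.5}
\]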

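where $\int \frac{f_Q(x)}{U'(x)}\,dx$ normalizes the resulting subjective density function to integrate to one. Once the utility function is estimated, Eq. (4.2.5) allows us to convert the RND into the probability weighted EDF. As the CPT marginal utility function is $U'(S_T) = \upsilon'(S_T)$, with $\upsilon'(S_T) = \alpha S_T^{\alpha-1}$ for $S_T \geq 0$ and $\upsilon'(S_T) = -\lambda\beta(-S_T)^{\beta-1}$ for $S_T < 0$, we obtain Eqs. (4.2.6) and (4.2.7):

\[
w'(F_P(S_T)) \cdot f_P(S_T) = \frac{\frac{f_Q(S_T)}{\alpha S_T^{\alpha-1}}}{\int \frac{f_Q(x)}{\alpha x^{\alpha-1}}\,dx} \quad \text{for } S_T \geq 0, \text{ and} \tag{4.2.6}
\]

\[
w'(F_P(S_T)) \cdot f_P(S_T) = \underbrace{\frac{\frac{f_Q(S_T)}{-\lambda\beta(-S_T)^{\beta-1}}}{\int \frac{f_Q(x)}{-\lambda\beta(-x)^{\beta-1}}\,dx}}_{\text{partial CPT density function}} \quad \text{for } S_T < 0. \tag{4.2.7}
\]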

Eq. (4.2.6) relates the probability weighted EDF (on the LHS), which uses the CPT probability distortion function for weighting, to the subjective density function on the RHS, derived from the CPT value function for gains ($S_T \geq 0$). We call the RHS the partial CPT density function (PCPT), as it does not embed the probability weighting function. Eq. (4.2.7) is the corresponding equation for losses ($S_T < 0$). As the function $w(F_P(S_T))$ is strictly increasing over the domain [0,1], there is a one-to-one relationship between $w(F_P(S_T))$ and a unique inverse $w^{-1}(F_P(S_T))$. So, the result for the weighted density, $\tilde{f}_P(S_T) = w'(F_P(S_T)) \cdot f_P(S_T)$, also implies $\tilde{f}_P(S_T) \cdot (w^{-1})'(F_P(S_T)) = f_P(S_T)$. This outcome allows us to directly relate the original EDF to the CPT subjective density function, by “undoing” the effect of the CPT probability distortion functions within the PCPT density function:

\[
f_P(S_T) = \frac{\frac{f_Q(S_T)}{\upsilon'(S_T)}}{\int \frac{f_Q(x)}{\upsilon'(x)}\,dx} \cdot (w^{-1})'(F_P(S_T)) \tag{4.2.8}
\]

Thus, once the relation between the probability weighting function of the EDF and the

PCPT density is established, as in Eqs. (4.2.6) and (4.2.7), one can eliminate the weighting

scheme affecting returns by applying the inverse of such weightings to the subjective density

function without endangering such equalities, as in Eq. (4.2.8).
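The gains branch of the conversion in Eq. (4.2.6) can be sketched numerically. The following is a minimal Python illustration, not the estimation code used in this chapter: the function names, the toy uniform RND and the grid are hypothetical, and the curvature value alpha = 0.88 is assumed (it is the standard Tversky and Kahneman (1992) calibration, which is not restated in this section).

```python
def trapezoid(xs, ys):
    """Trapezoidal-rule integral of ys over the grid xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def pcpt_density_gains(xs, f_q, alpha=0.88):
    """RHS of Eq. (4.2.6): f_Q(x) / (alpha * x**(alpha - 1)), rescaled to integrate to one.

    xs must be strictly positive (gains domain); alpha = 0.88 is an assumed value.
    """
    ratio = [q / (alpha * x ** (alpha - 1.0)) for x, q in zip(xs, f_q)]
    norm = trapezoid(xs, ratio)  # the normalizing integral in the denominator
    return [r / norm for r in ratio]


# Toy RND: uniform on [0.5, 1.5], purely illustrative.
xs = [0.5 + 0.01 * i for i in range(101)]
f_q = [1.0] * len(xs)
pcpt = pcpt_density_gains(xs, f_q)
```

With alpha below one, dividing by the marginal value shifts probability mass towards larger outcomes; with alpha equal to one the RND is returned unchanged.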

As the RND is converted into the subjective density function, we must also estimate daily empirical density functions (EDF). We build such time-varying EDFs from an invariant component, the standardized innovation density, and a time-varying part, the lagged conditional variance ($\sigma^2_{t|t-1}$) produced by an EGARCH model (Nelson, 1991). We first define the standardized innovation as the ratio of empirical returns to their conditional standard deviation ($\ln(S_t/S_{t-1})/\sigma_{t|t-1}$) produced by the EGARCH model. From the set of standardized innovations, we then estimate a density shape, i.e., the standardized innovation density. The advantage of such a density shape over a parametric one is that it may include the typically observed fat tails and negative skewness, which are not incorporated in simple parametric models, e.g., the normal distribution. This density shape is invariant and is made time-varying by multiplying each standardized innovation by the EGARCH conditional standard deviation at time t, which is specified as follows:

\[
\ln(S_t/S_{t-1}) = \mu + \varepsilon_t, \quad \varepsilon_t \sim f(0, \sigma^2_{t|t-1}) \tag{4.2.9a}
\]

and

\[
\sigma^2_{t|t-1} = \omega_1 + \alpha \varepsilon^2_{t-1} + \beta \sigma^2_{t-1|t-2} + \vartheta \operatorname{Max}[0, -\varepsilon_{t-1}]^2, \tag{4.2.9b}
\]

where α captures the sensitivity of the conditional variance to lagged squared innovations ($\varepsilon^2_{t-1}$), β captures the sensitivity of the conditional variance to the lagged conditional variance ($\sigma^2_{t-1|t-2}$), and ϑ allows for the asymmetric impact of lagged returns ($\vartheta \operatorname{Max}[0, -\varepsilon_{t-1}]^2$). The model is estimated by maximum likelihood, where innovations are assumed to be normally distributed.
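The variance recursion in Eq. (4.2.9b) and the standardized innovations it feeds can be sketched in a few lines. This is a minimal illustration that takes the parameters as given rather than estimating them; the function names, the starting variance `var0` and all numerical values are hypothetical.

```python
def conditional_variances(returns, mu, omega, alpha, beta, theta, var0):
    """Filter sigma^2_{t|t-1} through the recursion of Eq. (4.2.9b).

    variances[t] is the variance of returns[t] conditional on information up to t-1;
    var0 is an assumed starting value for the first observation.
    """
    variances = [var0]
    for r in returns[:-1]:
        eps = r - mu                              # innovation, Eq. (4.2.9a)
        downside = max(0.0, -eps) ** 2            # asymmetric term, active for negative innovations
        variances.append(omega + alpha * eps ** 2 + beta * variances[-1] + theta * downside)
    return variances


def standardized_innovations(returns, variances):
    """Ratio of each return to its conditional standard deviation."""
    return [r / v ** 0.5 for r, v in zip(returns, variances)]


# Hypothetical daily log returns and parameters.
rets = [0.01, -0.02, 0.0]
var_path = conditional_variances(rets, mu=0.0, omega=1e-6, alpha=0.05,
                                 beta=0.9, theta=0.1, var0=1e-4)
z = standardized_innovations(rets, var_path)
```

The negative second return raises the next-day variance by more than a positive return of the same size would, which is the asymmetry that ϑ is meant to capture.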

So far, we have produced a one-day horizon EDF for every day in our sample, but we still lack time-varying EDFs for the three-, six-, and twelve-month horizons. Thus, we use bootstrapping to draw 1,000 paths towards these desired horizons by randomly selecting single innovations ($\varepsilon_{t+1}$) from the one-day horizon EDFs available for each day in our sample. We note that once the first return is drawn, the conditional variance is updated ($\sigma^2_{t-1|t-2}$), affecting the subsequent innovation drawings of a path. This sequential exercise continues through time until the desired horizon is reached. To account for drift in the simulated paths, we add the daily drift estimated from the long-term EDF to drawn innovations, so that the one-period simulated returns equal $\varepsilon_{t+1} + \mu$. The density functions produced by the collection of returns implied by the terminal values of every path and their starting points are our three-, six-, and twelve-month EDFs. These simulated paths contain, respectively, 63, 126, and 252 daily returns. We note that by drawing returns from stylized distributions with fat tails and excess skewness, our EDFs for the three relevant horizons also embed such features. This estimation method for time-varying EDF is based on Rosenberg and Engle (2002).
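The bootstrap just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the seed handling and every parameter value are hypothetical, and the variance update reuses the recursion of Eq. (4.2.9b).

```python
import math
import random


def simulate_horizon_returns(std_innovations, var0, mu, omega, alpha, beta, theta,
                             horizon=63, n_paths=1000, seed=0):
    """Bootstrap cumulative log returns over `horizon` days.

    Each step draws one standardized innovation, scales it by the current
    conditional standard deviation, adds the daily drift mu, and then updates
    the conditional variance before the next draw.
    """
    rng = random.Random(seed)
    terminal = []
    for _ in range(n_paths):
        var, total = var0, 0.0
        for _ in range(horizon):
            eps = rng.choice(std_innovations) * math.sqrt(var)
            total += mu + eps                     # one-period simulated return
            var = omega + alpha * eps ** 2 + beta * var + theta * max(0.0, -eps) ** 2
        terminal.append(total)
    return terminal


# Hypothetical standardized innovations and parameters; 63 steps for the three-month horizon.
paths = simulate_horizon_returns([-1.8, -0.4, 0.1, 0.6, 1.1], var0=1e-4, mu=0.0003,
                                 omega=1e-6, alpha=0.05, beta=0.9, theta=0.1,
                                 horizon=63, n_paths=1000, seed=42)
```

The list `paths` holds one terminal return per simulated path; a histogram or kernel estimate of it would play the role of the three-month EDF for that day.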

Finally, once these three time-varying EDFs are estimated for all days in our sample, we estimate δ and γ for each of these days using Eqs. (4.2.10) and (4.2.11):

\[
w^{+}(\gamma, \delta = \gamma) = \operatorname{Min} \sum_{b=1}^{B} W_b \left( EDF^{b}_{prob} - CPT^{b}_{prob} \right)^2, \tag{4.2.10}
\]

\[
w^{-}(\delta, \delta = \gamma) = \operatorname{Min} \sum_{b=1}^{B} W_b \left( EDF^{b}_{prob} - CPT^{b}_{prob} \right)^2, \tag{4.2.11}
\]

where $EDF^{b}_{prob}$ and $CPT^{b}_{prob}$ are the probabilities within bin b in the empirical and CPT density functions and $W_b$ are weights given by $\frac{1}{\frac{1}{\sqrt{2\pi}}\int_{0.5}^{\infty} e^{-x^{2}/2}\,dx = 1}$, the reciprocal of the normalized normal probability distribution (above its median), split into the same total number of bins (B) used for the EDF and CPT. Parameters δ and γ are constrained by an upper bound of 1.75 and a lower bound of -0.25. The weights are applied in these optimizations because matching the probability tails matters more in our analysis than matching the body of the distributions.
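A weighted least-squares fit of this kind can be illustrated with a simple grid search. The sketch below is a deliberately simplified stand-in for the optimizations in Eqs. (4.2.10) and (4.2.11): it assumes the single-parameter Tversky and Kahneman (1992) weighting function, matches weighted bin probabilities directly, uses hypothetical tail-heavy weights in place of $W_b$, and restricts the grid to positive values (0.30 to 1.75) because the assumed functional form is undefined at zero.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) probability weighting function (assumed functional form)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def weighted_bin_probs(cum_probs, gamma):
    """Bin probabilities after distorting the cumulative probabilities at the bin edges."""
    w = [tk_weight(p, gamma) for p in cum_probs]
    return [w[i + 1] - w[i] for i in range(len(w) - 1)]


def fit_gamma(cum_probs, target_probs, weights, grid):
    """Grid value minimizing the weighted sum of squared bin-probability differences."""
    def loss(g):
        probs = weighted_bin_probs(cum_probs, g)
        return sum(wb * (t - p) ** 2 for wb, t, p in zip(weights, target_probs, probs))
    return min(grid, key=loss)


# Synthetic check: targets generated with gamma = 0.61 should be recovered.
cum_probs = [i / 20 for i in range(21)]                # 20 equally probable bins
target = weighted_bin_probs(cum_probs, 0.61)
weights = [1.0 + abs(b - 9.5) for b in range(20)]      # hypothetical weights, heavier in the tails
grid = [0.30 + 0.01 * k for k in range(146)]           # 0.30, 0.31, ..., 1.75
gamma_hat = fit_gamma(cum_probs, target, weights, grid)
```

A parameter of one leaves probabilities unweighted, while values below one inflate small tail probabilities, which is the pattern the estimated γ and δ are meant to detect.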


4.3 Overweight of tails: dynamics and dependencies

4.3.1 Time-varying CPT parameters

In this section, we evaluate the dynamics of the overweighting of tails within the single stock

and index option markets. Descriptive statistics of the CPT’s estimated δ and γ parameters

via the methodology presented in section 4.2 are provided in Table 4.1.

Table 4.1: Descriptive statistics

Panel A - Gamma

Maturity     Min    25% Q   Median   Mean   75% Q   Max    StDev   % γ < 1   % γ < 1   % γ < 1   % γ < 1   RSS
                                                                             (98-03)   (03-08)   (08-13)
3 months     -      0.74    0.91     0.89   1.04    1.75   0.23    64%       97%       35%       59%       0.0209
6 months     -      0.81    0.99     0.96   1.14    1.75   0.28    52%       92%       18%       46%       0.0170
12 months    0.04   0.91    1.03     1.01   1.14    1.75   0.22    41%       83%       11%       29%       0.0225

Panel B - Delta

Maturity     Min    25% Q   Median   Mean   75% Q   Max    StDev   % δ < 1   % δ < 1   % δ < 1   % δ < 1   RSS
                                                                             (98-03)   (03-08)   (08-13)
3 months     0.29   0.64    0.68     0.68   0.72    1.01   0.08    100%      100%      100%      100%      0.0579
6 months     0.30   0.54    0.60     0.60   0.65    1.75   0.10    100%      100%      100%      100%      0.0198
12 months    -      0.40    0.45     0.47   0.52    1.75   0.10    100%      100%      100%      100%      0.0169

This table reports the summary statistics of the estimated cumulative prospect theory (CPT) parameters gamma (γ) from the single stock options market and delta (δ) from the index option market for each day in our sample as well as the optimizations’ residual sum of squares (RSS). The parameters γ and δ define the curvature of the weighting function for gains and losses, respectively, which leads the probability distortion functions to have inverse S-shapes. The γ and δ parameters close to unity lead to weighting functions that are close to unweighted (neutral) probabilities, whereas parameters close to zero indicate a large overweight of small probabilities. Panel A reports the summary statistics of gamma (γ) when we assume a parameter of risk aversion (λ) equal to 2.25 (the standard CPT parametrization). Panel B reports the summary statistics of delta (δ) under the same risk aversion assumption. Column headings % γ < 1 and % δ < 1 report the percentage of observations in which parameters γ and δ are smaller than one, i.e., the proportion of the sample in which overweight of small probabilities is observed. We report this metric for the full sample as well as for three equal-sized splits of our full sample, namely: 98-03, from January 5, 1998 to January 30, 2003; 03-08, from January 31, 2003 to February 21, 2008; and 08-13, from February 22, 2008 to March 19, 2013.

We report summary statistics of the estimated γ for three-, six- and twelve-month options in Panel A for the right tail from single stock options. The median and mean time-varying γ estimates for three-month options are 0.91 and 0.89, which considerably exceed the parameter value of 0.61 that is indicated by Tversky and Kahneman (1992). This finding suggests that overweight of small probabilities is present within the pricing of short-term single stock call options, but to a much lesser extent than suggested by the theory. The results in Panel A also show that γ is highly time-varying and strongly sample dependent. Overweight of small probabilities in the single stock option market is very pronounced from 1998 to 2003 (present 97 percent of the time), but infrequent from 2003 to 2008 (present only 35 percent of the time). Our γ-estimates from three-month options range from 0 to 1.75 and the standard deviation of estimates is 0.23. In Panel B, we report summary statistics of the estimated δ from index options for the left tail. For δ estimated from three-month options, the median and mean estimates are both 0.68, implying a probability weighting that roughly matches the one in the CPT, which calibrates δ at 0.69. The δ-estimates are also time-varying; however, their standard deviation (0.08) is roughly three times lower than for the γ-estimates. The range of δ-estimates is also much narrower than for γ, as it is between 0.29 and 1.01. In contrast to the γ-estimates, our δ-estimates reflect a consistent overweight of small probabilities across all sub-samples.

At the six-month maturity, overweight of small probabilities for γ seems even less acute

than suggested by the theory and by the three-month options findings. The median and mean

γ estimates for this maturity are 0.99 and 0.96, respectively. The distribution of γ is somewhat

skewed to the right, i.e., towards a less pronounced overweight of small probabilities, as the

median is higher than the mean. The 75 percent quantile of γ (1.14) already suggests an underweighting of

probabilities. For index options with six-month maturity, the estimated δ indicates an

even more pronounced overweight of small probabilities (both the mean and median δ equal

0.60) than for three-month options. Overweight of small probabilities is again documented

across all samples for δ but not for γ, in which overweight of small probabilities is more frequent

than underweight of small probabilities only in the 1998-2003 sample.

The γ estimates for the twelve-month maturity tend even more towards probability underweighting than the six-month ones. The median γ is 1.03, whereas the mean γ is 1.01. Overweight of small probabilities appears only 41 percent of the time in the overall sample and is nearly absent in the 2003-2008 sample (11 percent of the time). In contrast, the mean and median for the δ estimates from index options are 0.47 and 0.45, respectively, indicating an even stronger overweight of small probabilities than for single stock options and other maturities. We argue that such a pattern could be caused by institutional investors buying long-term protection, as twelve-month OTM index options are less liquid than short-term ones.

OTM index puts seem to be structurally expensive from the perspective of overweight of

small probabilities, despite the fact that the degree of overvaluation varies in time. Concur-

rently, OTM single stock options are only occasionally expensive. Our γ estimates indicate

an infrequent occurrence of overweight of small probabilities in single stock options, clustered

within specific parts of our sample, e.g., during the 1998-2003 period. Our results fit nicely

within the seminal literature, for instance with Dierkes (2009), Kliger and Levy (2009), and

Polkovnichenko and Zhao (2013), regarding the index option market, and with Felix et al.

(2016b) regarding the single stock option market.

4.3.2 Overweight of tails and sentiment

In order to evaluate how time-variation in overweight of small probabilities relates to sentiment, we regress our proxy for overweight of tails on the Baker and Wurgler (2007) sentiment measure and other explanatory control variables. Since we aim to combine overweight of small probabilities parameters from both index options (bearish sentiment) and single stock options (bullish sentiment), we use the Delta minus Gamma spread, δ - γ, as the explained variable. The Delta minus Gamma spread captures the overweighting of small probabilities from both index options and single stock options, because δ is the CPT tail overweight parameter estimated from the index option market and γ is the equivalent parameter estimated from the single stock option market. The explanatory variables in these regressions are (1) the Baker and Wurgler (2007) sentiment measure⁸, (2) the percentage of bullish investors minus the percentage of bearish investors given by the survey of the American Association of Individual Investors (AAII), (3) a proxy for individual investors’ sentiment (see Han, 2008), and (4) a set of control variables among the ones tested by Welch and Goyal (2008)⁹ as potential forecasters of the equity market. The data frequency used is monthly, as this is the highest frequency at which the Baker and Wurgler (2007) sentiment factor and the Welch and Goyal (2008) data set are available. Our regression sample starts in January 1998 and ends in February 2013¹⁰. The OLS regression model applied is given as:

\[
DGspread[\tau]_t = c + SENT_t + IISENT_t + E12_t + B/M_t + NTIS_t + TBL_t + INFL_t + CORPR_t + SVAR_t + CSP_t + \varepsilon_t, \tag{4.3.1}
\]

where τ is the option horizon, DGspread is the Delta minus Gamma spread, SENT is the Baker and Wurgler (2007) sentiment measure, IISENT is the AAII individual investor sentiment measure, E12 is the twelve-month moving sum of earnings of the S&P 500 index, B/M is the book-to-market ratio, NTIS is the net equity expansion, TBL is the risk-free rate, INFL is the annual inflation rate, CORPR is the corporate spread, SVAR is the stock variance, and CSP is the cross-sectional premium. We also run the following univariate models for each explanatory factor separately to understand their individual relation with the DGspread:

\[
DGspread[\tau]_t = \alpha_i + \beta_i x_{i,t} + \varepsilon_t, \tag{4.3.2}
\]

where x represents the 10 explanatory variables specified in Eq. (4.3.1), thus i = 1...10.

Table 4.2 Panel A reports the results of Eq. (4.3.1), estimated across our three maturities for the DGspread. The explanatory power of the multivariate regression is high, ranging from 30 to 57 percent. As expected, SENT is positively linked to DGspread and statistically significant across the three- and six-month maturities. This suggests that high sentiment exacerbates overweight of small probabilities measured as DGspread. However, this relation is negative and not significant at the twelve-month maturity. The univariate regressions of SENT confirm the positive link between sentiment and DGspread at shorter maturities. Once again, this relation is not present at the twelve-month horizon. The explanatory power of SENT in the univariate setting is also high for the three- and six-month horizons, at 17 and 32 percent, respectively. This result strengthens our hypothesis that overweight of small probabilities increases at higher levels of sentiment and that sentiment seems to have a strong link to probability weighting by investors as priced by index puts and single stock call options. This finding, however, applies to the three- and six-month horizons only, since the twelve-month univariate regression has an R2 of zero.

⁸ Available at http://people.stern.nyu.edu/jwurgler/.

⁹ The complete set of variables provided by Welch and Goyal (2008) that is employed here is discussed in Appendix 3.C. In order to avoid multicollinearity in our regression analysis (some variables correlate 80 percent with each other), we exclude all variables that correlate more than 40 percent with others.

¹⁰ This sample is only possible because Welch and Goyal (2008) and Baker and Wurgler (2007) have updated and made available their data sets after publication.


Table 4.2: Regression results: Delta minus Gamma spread

Panel A - Multivariate

Variable     3m                   6m                   12m
Intercept    0.003 (0.056)        -0.491*** (0.037)    -0.490*** (0.058)
SENT         0.030* (0.017)       0.064*** (0.013)     -0.024 (0.019)
IISENT       0.041 (0.047)        0.096** (0.038)      -0.106** (0.048)
E12          0.000 (0.006)        -0.003 (0.004)       -0.028*** (0.007)
B/M          -0.364* (0.217)      0.125 (0.132)        0.163 (0.211)
NTIS         0.560 (0.391)        0.259 (0.285)        -0.814 (0.523)
TBL          0.013 (0.008)        0.036*** (0.006)     0.029*** (0.009)
INFL         0.453 (2.507)        1.843 (1.885)        2.311 (2.176)
CORPR        0.225 (0.285)        0.233 (0.202)        0.044 (0.273)
SVAR         -1.426 (1.331)       3.519*** (1.153)     3.470* (1.982)
CSP          -0.125 (0.136)       0.198 (0.122)        0.261 (0.235)
R2           36%                  57%                  30%
F-stats      8.2                  19.5                 6.4
AIC          -308.1               -369.1               -273.2
BIC          -274.4               -335.4               -239.6

Panel B - Univariate (one regression per row)

Regressor   Maturity   Intercept            Coefficient          R2     F-stats   AIC      BIC
SENT        3m         -0.063*** (0.010)    0.071*** (0.014)     17%    32.5      -326.1   -320.0
SENT        6m         -0.369*** (0.008)    0.097*** (0.016)     32%    72.9      -186.0   -179.9
SENT        12m        -0.520*** (0.013)    -0.003 (0.016)       0%     0.0       34.1     40.3
IISENT      3m         -0.064*** (0.011)    0.123*** (0.044)     6%     9.1       0.0      0.0
IISENT      6m         -0.365*** (0.010)    0.125** (0.050)      6%     9.4       0.0      0.0
IISENT      12m        -0.508*** (0.012)    -0.124*** (0.043)    5%     8.0       0.0      0.0
E12         6m         -0.048 (0.031)       -0.001 (0.006)       0%     0.0       0.0      0.0
B/M         6m         0.131*** (0.031)     -0.737*** (0.130)    27%    58.0      0.0      0.1
NTIS        6m         -0.055*** (0.011)    1.075** (0.440)      4%     7.1       0.0      0.4
TBL         6m         -0.121*** (0.015)    0.030*** (0.006)     21%    40.7      0.0      0.0
INFL        6m         -0.055*** (0.013)    1.784 (3.350)        0%     0.7       0.0      2.2
CORPR       6m         -0.053*** (0.011)    0.128 (0.472)        0%     0.2       0.0      0.3
SVAR        6m         -0.039*** (0.012)    -3.376** (1.307)     4%     6.1       0.0      1.4
CSP         6m         -0.052*** (0.011)    0.029 (0.197)        0%     0.0       0.0      0.2

Panel A reports the regression results for Eq. (4.3.1) in a multivariate setting. The dependent variable is Delta minus Gamma spread (δ - γ), while as explanatory variables we specify: 1) the Baker and Wurgler (2007) sentiment measure (SENT), 2) the individual investor sentiment (IISENT), and 3) the explanatory variables used by Welch and Goyal (2008), while excluding factors that correlate to each other in excess of 40 percent (see Appendix 3.C for the full list of variables). Panel B reports the regression results for Eq. (4.3.2), in a univariate setting, in which Delta minus Gamma spread is regressed on the same set of explanatory variables. We report Newey-West adjusted standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

IISENT is also positively connected to DGspread in the multivariate regression at the

three- and six-month horizons but negatively at the twelve-month horizon. These results are

confirmed by the univariate regressions, as IISENT is positively linked to DGspread at the three-

and six-month horizons. Explanatory power of these regressions is at 6 percent for both the

three- and six-month maturities, which is relatively high. For the twelve-month maturity in the

univariate regression, IISENT is negatively linked to DGspread and is statistically significant.

Once we analyze the other control variables in our regression, we observe that the results

are less stable than for the sentiment proxies. Table 4.2 indicates that some signs of control

variables change in both the multivariate and univariate regressions. TBL is the only control

variable that remains statistically significant and keeps its sign across the multivariate and

univariate models. The explanatory power of TBL is 21 percent in the univariate setting,

whereas the other independent variable with high explanatory power is book-to-market with 27

percent. B/M is only statistically significant in the three-month maturity of the multivariate

regressions. NTIS is negatively and significantly linked to DGspread in the univariate setting

as well as in the multivariate regression in the twelve-month maturity. SVAR is negatively and

significantly linked to DGspread in the univariate regression but in the multivariate regression

this result is not observed. Overall, these empirical findings suggest that fundamentals have a

relatively unstable link to the DGspread.

We note that the high stability of the relation between the sentiment factors and the

DGspread within the multivariate regressions evidences that sentiment and overweight of small

probabilities are strongly connected.

4.3.3 Overweight of tails, IV skews and higher moments of the RND

As a next step, we assess the relationship between DGspread and higher moments (skewness and kurtosis) of the RND implied by options and IV skew measures. We undertake this analysis for two reasons: 1) to understand to what extent DGspread is connected to other metrics seemingly derived from IV, and 2) to approximate DGspread by an easier-to-obtain measure, given the comprehensive estimation procedures required to compute γ and δ.
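For reference, the higher moments mentioned above can be computed from any density sampled on a grid. The snippet below is a minimal sketch, assuming an evenly spaced grid and using a standard normal density as a placeholder for an option-implied RND; the function name is hypothetical.

```python
import math


def density_moments(xs, dens):
    """Mean, variance, skewness and kurtosis of a density sampled on an evenly spaced grid."""
    total = sum(dens)
    probs = [d / total for d in dens]             # discrete probabilities per grid point
    mean = sum(p * x for p, x in zip(probs, xs))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, xs))
    skew = sum(p * (x - mean) ** 3 for p, x in zip(probs, xs)) / var ** 1.5
    kurt = sum(p * (x - mean) ** 4 for p, x in zip(probs, xs)) / var ** 2
    return mean, var, skew, kurt


# Placeholder density: standard normal on [-8, 8].
xs = [-8.0 + 0.01 * i for i in range(1601)]
dens = [math.exp(-0.5 * x * x) for x in xs]
mean, var, skew, kurt = density_moments(xs, dens)
```

For the normal placeholder the skewness is zero and the kurtosis is three; an RND with a fat left tail would show negative skewness and kurtosis above three.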

We expect a positive link between the estimated DGspread and IV skew measures, because the presence of fat tails in the RND is a pre-condition for overweight of tail probabilities and a corollary of OTM IVs being rich versus at-the-money (ATM) IVs. Similarly, we observe negative skewness and fat tails in RNDs only if OTM options are expensive versus ATM options and vice versa¹¹. Consequently, γ and δ are likely to be smaller than one (overweight of small probabilities), and DGspread differs from zero if OTM options are expensive versus ATM options, which supports the use of IV skew as another proxy for overweight of tails.

¹¹ While these relations are widely acknowledged, Jarrow and Rudd (1982) and Longstaff (1995) provide a formal theorem for the link between IV skew and risk-neutral moments, whereas Bakshi et al. (2003) offer a comprehensive empirical test of this proposition for index options.

The IV skew measures we use initially are the standard measures: 1) IV 90 percent (moneyness) minus ATM, 2) IV 80 percent minus ATM from index options (which captures bearish sentiment), 3) IV 110 percent minus ATM, and 4) IV 120 percent minus ATM from single stock calls (which captures bullish sentiment). However, as overweight of small probabilities is observed from the tails of the two markets jointly via DGspread, and standard IV skew measures only capture information from one market at a time, we suggest a new IV-based measure. Our proposed IV skew sentiment metric, so-called IV-sentiment, is a combined measure of the index and single stock options markets. Our IV-sentiment measure is specified as follows:

\[
\text{IV-sentiment} = \text{OTM index put IV}_{\tau p} - \text{OTM single stock call IV}_{\tau c}, \tag{4.3.3}
\]

where, the subscript τ = 1...3 indexes the different option-maturities used, p specifies the

moneyness levels 80 and 90 percent from index put options, and c specifies the moneyness levels

110 and 120 percent from single stock call options. Thus, our sentiment measure is calculated

as permutations of IVs from the three-, six- and twelve-month maturities, and four points in

the moneyness (80, 90, 110, and 120 percent) level grid, where the absolute distance from the

two moneyness levels used per sentiment measure and the ATM level (100 percent moneyness)

is kept constant. In other words, the IV-sentiment metric produced is restricted to the 80

minus 120 percent and the 90 minus 110 percent measures, hereafter called the IV-sentiment

90-110 and IV-sentiment 80-120 measures. From the granular data set across different money-

ness levels and maturities, we create six distinct skew-based measures of IV-sentiment. Using

such a construction, our IV-sentiment measure jointly incorporates bearish sentiment from

institutional investors and bullish sentiment from retail investors, similarly to DGspread.
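The construction in Eq. (4.3.3) amounts to a single subtraction per maturity and moneyness pair. The sketch below illustrates it with made-up implied volatilities; the function name, the dictionary layout and all numbers are hypothetical.

```python
def iv_sentiment(index_put_iv, stock_call_iv, put_moneyness, call_moneyness):
    """Eq. (4.3.3): OTM index put IV minus OTM single stock call IV.

    Both moneyness levels (in percent) must sit equally far from ATM (100),
    which restricts the measure to the 90-110 and 80-120 pairs.
    """
    if 100 - put_moneyness != call_moneyness - 100:
        raise ValueError("moneyness levels must be symmetric around 100")
    return index_put_iv[put_moneyness] - stock_call_iv[call_moneyness]


# Hypothetical three-month implied volatilities keyed by moneyness (percent).
index_put_iv = {80: 0.32, 90: 0.26}
stock_call_iv = {110: 0.27, 120: 0.29}

sent_90_110 = iv_sentiment(index_put_iv, stock_call_iv, 90, 110)
sent_80_120 = iv_sentiment(index_put_iv, stock_call_iv, 80, 120)
```

Repeating the two calls for the three-, six- and twelve-month maturities would give the six IV-sentiment series described above.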

We assess the isolated relationship between DGspread and higher moments of the RND,

(standard) IV skews, and our IV-sentiment measures using the univariate models presented by

Eqs. (4.3.4) to (4.3.7). These models are estimated with OLS, where Newey-West standard

errors are used for statistical inference. Our daily regression samples start on January 2, 1998

and end on March 19, 2013.

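\[
DGspread[\tau] = \alpha_t\left[\frac{K}{S}\right] + IVSent_t\left[\frac{K}{S}; \tau\right], \tag{4.3.4}
\]

\[
DGspread[\tau] = \alpha_t + KURT^{m}_t(\tau), \tag{4.3.5}
\]

\[
DGspread[\tau] = \alpha_t + SKEW^{m}_t(\tau), \tag{4.3.6}
\]

\[
DGspread[\tau] = \alpha_t\left[\frac{K}{S}\right] + IVSKEW_t\left[\frac{K}{S}; \tau\right], \tag{4.3.7}
\]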

where $\frac{K}{S}$ is the moneyness level of the option, τ is the option horizon, DGspread is the Delta minus Gamma spread, IVSent is our IV-sentiment measure, SKEW is the RND return skewness implied by options, KURT is the RND return kurtosis implied by options, and IVSKEW is the single market IV skew measure, for both index option and single stock option markets. We note that the superscript m for the variables KURT and SKEW aims to distinguish RND kurtosis and skewness obtained from either RND implied by index options (m = io) or single stock options (m = sso).

We estimate multivariate models of DGspread regressed on RND skewness, kurtosis, IV

skews and IV-sentiment to better understand the relation between these measures jointly and

overweight of small probabilities:

\[
DGspread[\tau] = \alpha_t\left[\frac{K}{S}\right] + SKEW^{m}_t(\tau) + KURT^{m}_t(\tau) + IVSent_t\left[\frac{K}{S}; \tau\right]. \tag{4.3.8}
\]

Table 4.3 Panel A reports the estimates of Eqs. (4.3.4) to (4.3.7), when the DGspread is regressed on RND moments, IV skews and IV-sentiment in a univariate setting. The empirical findings indicate that IV-sentiment is the variable that explains DGspread the most across all maturities. The explanatory power of IV-sentiment is not only the highest but also the most consistent, as its R2 ranges from 30 to 46 percent. IV-sentiment is negatively related to DGspread. Such a negative sign of the IV-sentiment regressor was expected because the DGspread rises with higher bullish sentiment, whereas higher IV-sentiment suggests a more pronounced bearish sentiment. Risk-neutral skewness and kurtosis also strongly explain DGspread (by roughly 30 percent), though only within the three-month maturity. Skewness and kurtosis explain DGspread by roughly 10 percent for six-month options, and 7 percent for twelve-month ones. The coefficient signs are in line with our expectations since high levels of RND skewness are associated with high DGspread (a bullish sentiment signal), while low levels of RND kurtosis (less pronounced fat tails) are associated with high DGspread¹². In contrast, standard IV skews explain very little of DGspread within the three-month maturity, only between 0 and 4 percent. At longer maturities, the IV skews are better able to explain DGspread, though mostly when the skew measure comes from the single stock options market (between 17 and 21 percent). As a robustness check, we note that the regression results are virtually unchanged by the usage of either IV-sentiment 90-110 or 80-120 measures. As a first impression, these results imply that IV-sentiment is strongly connected to DGspread and to overweight of small probabilities.

Panel B shows that IV-sentiment is the most stable regressor in the multivariate regressions with respect to coefficient signs: it is negatively linked to DGspread across all regressions and is always statistically significant. These regressions have high explanatory power (ranging from 42 to 61 percent), especially given the daily frequency of the data, which potentially contains more noise than lower-frequency data. In the multivariate regression we use the IV-sentiment 90-110, while the (unreported) results using IV-sentiment 80-120 are qualitatively the same. Due to likely multicollinearity in this multivariate model, we believe that our univariate models are more insightful than the multivariate one.

¹² The regression results reported here use RND kurtosis and skewness from index options (m = io). The results when RND is extracted from single stock options (m = sso) are unreported but qualitatively the same, as the coefficient signs are equal to the reported ones and the regressions’ explanatory power is roughly in the same range.


Table 4.3: Regression results: Delta minus Gamma spread and risk-neutral measures

Panel A - Univariate regressions (one regression per row)

Regressor             Maturity   Intercept            Coefficient           R2     F-stats   AIC     BIC
Skewness              3m         0.019** (0.007)      0.122*** (0.003)      32%    1861.1    -2001   -1989
Skewness              6m         -0.219*** (0.010)    0.073*** (0.004)      9%     408.6     -13     -1
Skewness              12m        -0.436*** (0.009)    0.054*** (0.003)      7%     285.7     -944    -931
Kurtosis              3m         -0.045*** (0.006)    -0.015*** (0.000)     30%    1714.6    -1900   -1888
Kurtosis              6m         -0.254*** (0.008)    -0.009*** (0.000)     10%    423.4     -27     -14
Kurtosis              12m        -0.460*** (0.007)    -0.007*** (0.000)     7%     315.6     -972    -959
IV-sentiment 90-110   3m         -0.295*** (0.004)    -1.998*** (0.046)     34%    2085.1    -2151   -2139
IV-sentiment 90-110   6m         -0.499*** (0.004)    -2.774*** (0.064)     46%    3423.7    -2093   -2081
IV-sentiment 90-110   12m        -0.683*** (0.004)    -2.359*** (0.062)     36%    2228.0    -2437   -2424
IV-sentiment 80-120   3m         -0.186*** (0.004)    -1.606*** (0.042)     30%    1707.2    -1895   -1883
IV-sentiment 80-120   6m         -0.368*** (0.003)    -2.438*** (0.058)     45%    3199.9    -1971   -1959
IV-sentiment 80-120   12m        -0.593*** (0.003)    -2.124*** (0.056)     35%    2121.2    -2368   -2355
IV 110-ATM skew       3m         -0.195*** (0.010)    1.082** (0.435)       0%     10.3      -485    -472
IV 110-ATM skew       6m         -0.141*** (0.013)    13.681*** (0.717)     17%    810.0     -362    -349
IV 110-ATM skew       12m        -0.332*** (0.011)    16.172*** (0.711)     21%    1065.3    -1612   -1599
IV 90-ATM skew        3m         0.029 (0.028)        -4.941*** (0.557)     4%     148.4     -621    -608
IV 90-ATM skew        6m         -0.052 (0.034)       -8.399*** (0.903)     4%     177.3     202     215
IV 90-ATM skew        12m        -0.407*** (0.027)    -4.997*** (0.993)     1%     49.4      -717    -705

Panel B - Multivariate regressions

Variable              3m                   6m                   12m
Intercept             -0.273*** (0.025)    -0.465*** (0.038)    -0.495*** (0.040)
Skewness              0.093*** (0.008)     0.000 (0.009)        -0.032*** (0.009)
Kurtosis              -0.002** (0.001)     -0.007*** (0.001)    -0.009*** (0.001)
IV-sentiment 90-110   -1.989*** (0.063)    -2.462*** (0.108)    -1.677*** (0.126)
IV 110-ATM skew       0.511 (0.371)        5.525*** (0.672)     8.106*** (1.004)
IV 90-ATM skew        3.876*** (0.391)     4.129*** (0.732)     0.000 (0.933)
R2                    61%                  53%                  42%
F-stats               1214.5               903.8                707.4
AIC                   -4154                -2636                -3008
BIC                   -4116                -2598                -2970

Panel A reports the regression results for Eqs. (4.3.4), (4.3.5), (4.3.6) and (4.3.7) in a univariate setting. The dependent variable for these regressions is Delta minus Gamma spread (δ - γ), a proxy for overweight of small probabilities. As explanatory variables we specify the risk-neutral skewness and kurtosis, IV 110-ATM skew (from single stock options), IV 90-ATM skew (from index options), and our IV-sentiment measure in two permutations per maturity: 1) IV-sentiment 90-110, and 2) IV-sentiment 80-120. Our IV-sentiment measure is an IV skew measure that combines information from the index option market and the single stock option market, see Eq. (4.3.3). For instance, the IV-sentiment 90-110 measure combines the IV from the 90 percent moneyness level from the index option market and the 110 percent moneyness level from the single stock option market. Panel B reports the regression results for Eq. (4.3.8) in a multivariate setting, in which Delta minus Gamma spread is regressed on the same set of explanatory variables. We report Newey-West adjusted standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

These findings strongly suggest that DGspread co-moves with our IV-sentiment measure within the three-, six-, and twelve-month maturities. Hence, we feel comfortable using IV-sentiment to approximate the overweighting of small probabilities, similarly to DGspread.

4.4 Predicting with overweight of tails

4.4.1 Predicting returns with DGspread and IV-sentiment

Section 4.3.1 has documented that the overweighting of small probabilities is strongly time-varying.

We hypothesize that it is linked to equity market reversals. Thus, in the following,

we employ regression analysis to test if overweight of small probabilities (proxied by DGspread)

can predict equity market returns. Given the results of section 4.3.3, in which our IV-sentiment

measure strongly links to the DGspread, we also run such predictive regressions by using IV-

sentiment as the explanatory variable.

In order to test the predictive power of these two metrics, we regress rolling forward returns on DGspread and on our IV-sentiment measure for eight different investment horizons: 42, 84, 126, 252, 315, 525, 735, and 945 days, as specified by Eqs. (4.4.1) and (4.4.2):

\[ \frac{p_{t+h+1}}{p_{t+1}} = \alpha_h + \beta_h \, DGspread[\tau]_t + \varepsilon_t, \qquad (4.4.1) \]

\[ \frac{p_{t+h+1}}{p_{t+1}} = \alpha_h + \beta_h \, IVSent[\tau]_t + \varepsilon_t, \qquad (4.4.2) \]

where p is the equity market price level, h is the investment horizon, τ is the option maturity, α is the unconditional expected mean of forward returns, and β is the sensitivity of forward returns to DGspread and to IV-sentiment. We estimate Eqs. (4.4.1) and (4.4.2) via OLS with a Newey-West adjustment to the standard errors of the coefficient estimates, due to the presence of serial correlation in forward returns. Our regression samples start on January 2, 1998 and end on March 19, 2013.
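The estimation step described above can be sketched as follows. This is a minimal illustration, not the code used for the reported results: the function names `forward_returns` and `ols_newey_west`, the Bartlett kernel, the absence of a small-sample correction, and the choice of lag length are all assumptions, since the text does not specify them.

```python
import numpy as np


def forward_returns(prices, h):
    """Gross forward return p[t+h+1] / p[t+1] for every t where both prices exist."""
    prices = np.asarray(prices, dtype=float)
    return prices[h + 1:] / prices[1:-h]


def ols_newey_west(y, x, lags):
    """OLS of y on a constant and x, with Newey-West (Bartlett kernel) standard errors.

    Returns (coefficients, standard errors), ordered as [intercept, slope].
    The lag length is an input because the source does not state which one was used.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    scores = X * resid[:, None]              # x_t * e_t, one row per observation
    S = scores.T @ scores                    # lag-0 term of the long-run covariance
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)    # Bartlett weight
        gamma = scores[lag:].T @ scores[:-lag]
        S += weight * (gamma + gamma.T)
    xtx_inv = np.linalg.inv(X.T @ X)
    cov = xtx_inv @ S @ xtx_inv
    return beta, np.sqrt(np.diag(cov))
```

With a price series `prices` and a daily signal `signal` (DGspread or IV-sentiment), one would compute `y = forward_returns(prices, h)` and regress it on `signal[:len(y)]`. Because h-day forward returns overlap, a lag length of at least h is a natural (assumed) choice.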

Table 4.4 presents the empirical findings of forward returns regressed on DGspread. The explanatory power of these regressions is mostly in the single digits and rarely exceeds ten percent. For the three-month maturity, the explanatory power rises steadily up to the two-year horizon (to nine percent), and then drops to four percent for forward returns at the 945-day horizon. We note that DGspread tends to have low explanatory power and is not significant for short horizons (42 to 126 days) and for higher maturities (twelve-month options). The coefficients of DGspread are always negative for the three- and six-month maturities. This result was expected, as it implies that a high (low) DGspread, i.e., bullish (bearish) sentiment, predicts negative (positive) forward returns, i.e., reversals. For the twelve-month maturity, the coefficient signs are unstable, being negative (and statistically significant) for the 252-day horizon, while sometimes positive and insignificant for shorter horizons.

Table 4.4: Regression results: Delta minus Gamma spread and IV-sentiment

Panel A - Delta minus Gamma spread

Three-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   0.000      -0.004     -0.009**   -0.016***  -0.023***  -0.030***  -0.027***  0.003
            (0.003)    (0.003)    (0.004)    (0.006)    (0.006)    (0.008)    (0.008)    (0.009)
DGspread    -0.03***   -0.08***   -0.13***   -0.26***   -0.33***   -0.43***   -0.39***   -0.19***
            (0.007)    (0.009)    (0.010)    (0.015)    (0.018)    (0.032)    (0.034)    (0.036)
R2          1%         2%         4%         7%         9%         9%         7%         1%
F-stats     31.4       81.9       150.1      274.5      347.3      351.0      228.5      42.4
AIC         0.0017     0.0023     0.0030     0.0042     0.0047     0.0061     0.0066     0.0075
BIC         0.0061     0.0085     0.0108     0.0157     0.0175     0.0232     0.0255     0.0288

Six-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   -0.004     -0.017***  -0.023***  -0.050***  -0.066***  -0.123***  -0.138***  -0.088***
            (0.003)    (0.004)    (0.005)    (0.007)    (0.008)    (0.009)    (0.008)    (0.009)
DGspread    -0.03***   -0.08***   -0.12***   -0.25***   -0.32***   -0.53***   -0.56***   -0.39***
            (0.007)    (0.009)    (0.010)    (0.014)    (0.017)    (0.023)    (0.024)    (0.028)
R2          1%         3%         4%         8%         10%        16%        17%        7%
F-stats     31.7       114.0      145.2      301.5      399.4      669.1      651.7      232.6
AIC         0.0023     0.0031     0.0040     0.0057     0.0062     0.0079     0.0084     0.0097
BIC         0.0056     0.0077     0.0099     0.0142     0.0159     0.0204     0.0221     0.0257

Twelve-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   0.007      0.004      0.012      -0.037***  -0.100***  -0.177***  -0.228***  -0.164***
            (0.005)    (0.007)    (0.009)    (0.013)    (0.014)    (0.019)    (0.020)    (0.021)
DGspread    0.00       -0.02      -0.01      -0.14***   -0.27***   -0.44***   -0.53***   -0.40***
            (0.009)    (0.012)    (0.015)    (0.022)    (0.024)    (0.033)    (0.037)    (0.040)
R2          0%         0%         0%         2%         5%         8%         11%        5%
F-stats     0.0        2.8        0.9        62.8       198.7      313.8      396.4      166.3
AIC         0.0037     0.0052     0.0066     0.0096     0.0105     0.0136     0.0144     0.0164
BIC         0.0066     0.0093     0.0119     0.0172     0.0190     0.0249     0.0267     0.0310

Panel B - IV-sentiment 90-110

Three-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   0.01***    0.01***    0.02***    0.04***    0.04***    0.06***    0.08***    0.08***
            (0.002)    (0.002)    (0.003)    (0.004)    (0.004)    (0.005)    (0.006)    (0.007)
IV-Sent     0.13***    0.26***    0.38***    0.65***    0.70***    1.11***    1.59***    1.52***
            (0.013)    (0.017)    (0.020)    (0.021)    (0.021)    (0.031)    (0.033)    (0.048)
R2          4%         8%         10%        15%        16%        23%        34%        26%
F-stats     150.7      317.6      418.3      678.2      709.1      1083.3     1737.3     1096.3
AIC         0.0011     0.0015     0.0019     0.0026     0.0027     0.0034     0.0039     0.0047
BIC         0.0106     0.0146     0.0184     0.0251     0.0261     0.0337     0.0382     0.0460

Six-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   0.02***    0.04***    0.05***    0.10***    0.12***    0.18***    0.19***    0.14***
            (0.002)    (0.002)    (0.003)    (0.005)    (0.005)    (0.007)    (0.009)    (0.011)
IV-Sent     0.24***    0.45***    0.64***    1.12***    1.38***    2.18***    2.22***    1.48***
            (0.022)    (0.027)    (0.029)    (0.032)    (0.040)    (0.050)    (0.059)    (0.066)
R2          5%         8%         10%        15%        18%        26%        23%        9%
F-stats     186.6      344.4      447.9      660.3      803.0      1204.0     959.3      288.5
AIC         0.0014     0.0019     0.0025     0.0036     0.0040     0.0054     0.0063     0.0079
BIC         0.0176     0.0241     0.0304     0.0437     0.0486     0.0629     0.0717     0.0873

Twelve-month options
Horizon     42         84         126        252        315        525        735        945
Intercept   0.02***    0.04***    0.06***    0.11***    0.14***    0.21***    0.21***    0.13***
            (0.002)    (0.003)    (0.003)    (0.005)    (0.006)    (0.008)    (0.009)    (0.012)
IV-Sent     0.25***    0.46***    0.68***    1.23***    1.53***    2.30***    2.23***    1.26***
            (0.024)    (0.029)    (0.030)    (0.035)    (0.043)    (0.056)    (0.068)    (0.077)
R2          4%         7%         10%        15%        18%        23%        19%        5%
F-stats     160.0      299.2      409.8      636.9      806.9      1048.0     753.0      162.8
AIC         0.0016     0.0022     0.0028     0.0041     0.0046     0.0063     0.0074     0.0093
BIC         0.0194     0.0267     0.0336     0.0486     0.0539     0.0709     0.0813     0.0987

Panel A reports the regression results for Eq. (4.4.1), which regresses the Delta minus Gamma spread on eight different horizons for forward equity returns. Panel B reports the regression results for Eq. (4.4.2), which regresses the IV-sentiment 90-110 measure on the same forward equity returns used in Panel A. The explained variables are forward returns for the S&P 500 index measured over the following horizons: 42, 84, 126, 252, 315, 525, 735, and 945 days. We report Newey-West adjusted standard errors in brackets. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

Panel B reports the regression results of Eq. (4.4.2), i.e., the outcomes of forward returns regressed on our IV-sentiment 90-110 measure for three-, six-, and twelve-month maturities¹³. The pattern of R2 across the different horizons tested is similar across the three option maturities and analogous to the one observed for DGspread at the three-month maturity: R2 rises from four percent to 28 percent when the horizon increases from 42 days (two months) to 525 days (two years), while beyond the two-year horizon the explanatory power falls slightly for the 735-day horizon (roughly three years) and collapses for the 945-day horizon (3.7 years). We observe that the explanatory power for the six- and twelve-month option maturities is just slightly lower than for the three-month maturity. Statistical significance of the estimators is often high across option maturities and return horizons. The coefficients for the IV-sentiment 90-110 measure are always positive. This is as expected, as it means that high (low) IV-sentiment, i.e., bearish (bullish) sentiment, predicts positive (negative) forward returns. The explanatory power, the stability of the coefficient signs, and the statistical significance of the regressors using our IV-sentiment 90-110 measure clearly dominate the regression results that use Delta minus Gamma spread. These results strengthen our earlier findings that our IV-sentiment measure is a good representation of sentiment, especially concerning the prediction of equity market reversals.

4.4.2 IV-sentiment pair trading strategy

Our previous results suggest that IV-sentiment is more strongly connected to forward returns

than Delta minus Gamma spread itself. As such, we construct a trading strategy to further

test the predictive power of IV-sentiment. This strategy consists of a high-frequency (daily)

trading rule that aims to predict equity market reversals. Our hypothesis is that when the

IV-sentiment measure is significantly higher (lower) than its normal level, overweight of small

probabilities is then extreme and likely to mean-revert in the subsequent periods in tandem

with the underlying market. The trading strategy, thus, buys (sells) equities when there is

excessive bearishness/panic (excessive bullishness/complacency) indicated by the high (low)

level of IV-sentiment.

The strategy is tested via a pair-trading rule between long and short positions in the S&P 500 index and a USD cash return index. For simplicity, the strategy is implemented as a purely directional strategy in which positions are constant in size and IV-sentiment is normalized via a Z-score. The trading rule enters a five percent long equities position when the normalized IV-sentiment is higher than a pre-specified threshold, for example, two historical standard deviations. The trading rule closes such a position, by entering into a full cash position, when the normalized IV-sentiment measure converges back to its average. Conversely, the rule enters a short equities position when the IV-sentiment is lower than its negative two historical standard deviations threshold and buys back a full cash position when it converges to its average. A trading cost of five basis points is charged on the five percent position traded in equities. In order to avoid strategy overfitting, we 1) compute the Z-score using multiple look-back periods, and 2) use multiple threshold levels to define excessive sentiment¹⁴. We evaluate these contrarian strategies on a volatility-adjusted basis using standard performance analytics such as the information ratio, downside risk characteristics, and higher moments of returns. We compare these strategies to 1) other contrarian strategies that make use of implied volatilities, such as an IV skew-based strategy, a volatility risk premia (VRP) strategy, and an implied-correlation-based (IC) strategy¹⁵, 2) the equity market beta, i.e., the S&P 500 index, and 3) alternative beta strategies, such as writing put options, a 110-95 collar strategy, the G10 FX carry, equity cross-sectional momentum, and a time-series momentum strategy¹⁶. We further evaluate such strategies by estimating the pairwise correlation coefficients between them, as well as tail and higher-moment dependency statistics such as conditional co-crash (CCC) probabilities (see Appendix 2.B) and co-skewness. Our back-test samples start on January 2, 1998 and end on December 4, 2015¹⁷.

¹³ The regression results for our IV-sentiment 80-120 measure are qualitatively no different from the ones we present for IV-sentiment 90-110.
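The trading rule described above can be sketched in a few lines. This is an illustrative reading of the rule, not the back-test code behind the reported results. The function names are hypothetical, and several details are assumptions: the look-back window includes the current observation, a position decided on day t earns the return of day t+1, returns are measured in excess of cash, and the five basis points are charged on the notional traded on the day the position changes.

```python
import numpy as np


def rolling_zscore(signal, lookback):
    """Z-score of signal[t] against the `lookback` observations ending at t."""
    signal = np.asarray(signal, dtype=float)
    z = np.full(len(signal), np.nan)
    for t in range(lookback - 1, len(signal)):
        window = signal[t - lookback + 1 : t + 1]
        sd = window.std(ddof=1)
        if sd > 0:
            z[t] = (signal[t] - window.mean()) / sd
    return z


def equity_positions(z, threshold=2.0, size=0.05):
    """Daily equity weight: +size after z > threshold, -size after z < -threshold."""
    pos = np.zeros(len(z))
    current = 0.0
    for t, zt in enumerate(z):
        if not np.isnan(zt):
            # Exit to cash once the Z-score has converged back to its mean (zero).
            if (current > 0 and zt <= 0) or (current < 0 and zt >= 0):
                current = 0.0
            # Enter only from cash, and only when sentiment is extreme.
            if current == 0.0:
                if zt > threshold:
                    current = size       # excessive bearishness: buy equities
                elif zt < -threshold:
                    current = -size      # excessive bullishness: sell equities
        pos[t] = current
    return pos


def strategy_excess_returns(pos, equity_ret, cash_ret, cost=0.0005):
    """Excess return over cash, net of a proportional cost on the notional traded."""
    pos = np.asarray(pos, dtype=float)
    equity_ret = np.asarray(equity_ret, dtype=float)
    cash_ret = np.asarray(cash_ret, dtype=float)
    held = np.concatenate([[0.0], pos[:-1]])                 # position set at t-1
    traded = np.abs(np.diff(np.concatenate([[0.0], pos])))   # notional traded at t
    return held * (equity_ret - cash_ret) - cost * traded
```

Running the three functions over a grid of `lookback` and `threshold` values gives one back-test per parameter pair, which is how the multiple look-back periods and thresholds mentioned above could be explored.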

The boxplots of information ratios obtained by our IV-sentiment strategies and other IV-

based strategies are provided in Figure 4.1. We see that the IV-sentiment 90-110 strategy seems

to perform better than the IV-sentiment 80-120 strategy, as the information ratio means and

dispersion of the former strategy dominate the ones for the latter. The average information

ratio for the IV-sentiment 90-110 strategy is positive for the three- and six-month option

maturities but negative for the twelve-month. For the three- and six-month strategies, all

one-standard deviation boxes for the information ratio lie in positive territory, suggesting that the

IV-sentiment 90-110 strategy is robust to changes in look-back and outer-threshold parameters.

Further, the IV-sentiment 90-110 is superior to single-market IV skew-based strategies for the

three- and six-month maturities, but not for the twelve-month maturity. At the three-month

maturity, the average information ratio and dispersion for the IV-sentiment 90-110 strategy are

similar to the ones for the VRP strategy. However, for the six- and twelve-month maturities,

the VRP strategies dominate the IV-sentiment 90-110 based on the average information ratio,

despite larger dispersion for the six-month maturity strategy.

Figure 4.1 shows that the IC strategies seem to deliver relatively high and consistent information ratios, especially when calculated using the 80 and 90 percent moneyness levels. At the three- and six-month maturities, the performance of the IC strategies matches the performance of the IV-sentiment 90-110 and VRP strategies. At the twelve-month maturity, the 80 and 90 percent IC strategies are superior to the IV-sentiment 90-110 measure. Overall, the boxplots in Figure 4.1 suggest that the IV-sentiment 90-110 strategy is robust to changes in parameters but also that its performance is matched by other IV-based strategies. Table 4.5 Panel A provides the performance analytics for the IV-sentiment 90-110 strategy, as well as for alternative strategies.

¹⁴ We also tested a percentile normalization and found results that are qualitatively similar to the use of Z-scores.

¹⁵ An implied-correlation (or dispersion trading) strategy buys (sells) index options and sells (buys) single stock options, while delta hedging, to arbitrage price differences in these two volatility markets.

¹⁶ Strategy return series used are, respectively, the CBOE S&P 500 BuyWrite Index, the CBOE Investable Correlation Index, the S&P 500 index, the CBOE put writing index, the CBOE 110-95 collar, the DB G10 FX carry index, the JPMorgan Equity Momentum index and the Credit Suisse Managed Futures index.

¹⁷ As our IV-sentiment measure requires much less (cross-sectional) IV data than the DGspread to be calculated, we were able to extend our full sample, which originally ended on March 19, 2013, until December 4, 2015.

[Figure 4.1 panels: A) Three-month options, B) Six-month options, C) Twelve-month options]

Figure 4.1: Information ratios for daily IV-based strategies. The boxplots depict the distribution of information ratios (IRs) obtained by the IV-based strategies tested, when different look-back periods and outer-thresholds are used per factor-specific strategy. Boxplot A depicts the distribution of IRs when the IV factor used is obtained from three-month options. Panels B and C depict the same information while using the IV factors obtained from six- and twelve-month options, respectively.

We observe that the IV-sentiment 90-110 strategy (using the three-month option maturity) delivers returns (20 basis points) and risk-adjusted returns (0.29) that are superior to many of the other strategies compared, such as the S&P 500, the IV skew, the VRP, the IC, the 110-95 collar, the G10 FX carry, and the equity momentum. The only strategies that deliver equal or higher risk-adjusted returns than our IV-sentiment 90-110 strategy are the time-series momentum and the put writing. The return skewness of our IV-sentiment strategy is positive (0.10) and above the average of the other strategies. A strategy with surprisingly high return skewness is the IC (0.43). The drawdown characteristics of our IV-sentiment strategy, such as the maximum drawdown, the average recovery time, and the maximum daily drawdown, are somewhat similar to those of the other IV-based strategies.
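For readers who want to reproduce this kind of comparison, the statistics discussed above can be computed from a daily return series roughly as follows. This is a sketch under stated assumptions: the information ratio is taken as annualized mean return over annualized volatility, 252 trading days per year are assumed, skewness is the plain moment estimator, and "maximum daily drawdown" is read as the worst single-day return. The function names are hypothetical.

```python
import numpy as np


def scale_to_vol(returns, target_vol, periods_per_year=252):
    """Rescale a return series so that its annualized volatility equals target_vol."""
    r = np.asarray(returns, dtype=float)
    realized = r.std(ddof=1) * np.sqrt(periods_per_year)
    return r * (target_vol / realized)


def performance_stats(returns, periods_per_year=252):
    """Summary statistics of a daily return series (all definitions are assumptions)."""
    r = np.asarray(returns, dtype=float)
    mean_ann = r.mean() * periods_per_year
    vol_ann = r.std(ddof=1) * np.sqrt(periods_per_year)
    centered = r - r.mean()
    skewness = (centered ** 3).mean() / (centered ** 2).mean() ** 1.5
    wealth = np.cumprod(1.0 + r)
    # Running peak of the wealth curve, counting the starting wealth of 1.0.
    running_peak = np.maximum.accumulate(np.maximum(wealth, 1.0))
    drawdown = wealth / running_peak - 1.0
    return {
        "average_return": mean_ann,
        "volatility": vol_ann,
        "information_ratio": mean_ann / vol_ann,
        "skewness": skewness,
        "max_drawdown": drawdown.min(),
        "worst_day": r.min(),
    }
```

Scaling every strategy to the same volatility with `scale_to_vol` before calling `performance_stats` makes average returns and drawdowns directly comparable across strategies, which is the purpose of a volatility-adjusted comparison.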


Table 4.5: IV-sentiment based pair-trade strategy

Columns: (1) IV-sentiment 90-110; (2) IV Skew 3m 90; (3) VRP 3m 90; (4) IC 3m 90; (5) S&P 500; (6) Put writing; (7) 110-95 collar; (8) G10 FX carry; (9) Equity Momentum; (10) CTA; (11) S&P 500 + IV Sent; (12) Eq. Mom + IV Sent; (13) CTA + IV Sent.

Panel A - Back-test results
                               (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)     (12)     (13)
Average return                 0.20%    0.14%    0.12%    0.17%    0.10%    0.34%    0.14%    0.18%    0.10%    0.51%    0.21%    0.24%    0.53%
Volatility                     0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%    0.71%
Information ratio              0.29     0.20     0.17     0.24     0.14     0.48     0.20     0.26     0.14     0.71     0.29     0.34     0.75
Skewness                       0.10     -0.07    -0.01    0.43     -0.18    -0.60    0.01     -0.93    -0.45    -0.37    -0.05    0.09     -0.37
Kurtosis                       15.84    24.73    29.02    18.54    8.12     24.13    2.25     12.04    4.23     2.89     7.66     17.53    2.91
Max drawdown                   -1.7%    -1.6%    -2.9%    -1.7%    -3.0%    -2.5%    -2.9%    -3.2%    -2.9%    -1.4%    -2.4%    -2.2%    -1.1%
Avg recovery time (in years)   0.43     0.42     0.41     0.35     0.22     0.06     0.20     0.16     0.25     0.14     0.13     0.21     0.14
Max daily drawdown             -0.55%   -0.53%   -0.49%   -0.47%   -0.34%   -0.53%   -0.29%   -0.50%   -0.35%   -0.31%   -0.29%   -0.43%   -0.48%

Panel B - Correlation matrix
                       (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)
(1) IV-sentiment       1        0.41     0.18     0.70     0.10     0.04     0.13     0.07     -0.16    -0.11
(2) IV Skew            0.41     1        0.55     0.59     0.18     0.16     0.08     0.16     -0.05    -0.03
(3) VRP                0.18     0.55     1        0.51     0.41     0.42     0.18     0.15     -0.13    -0.11
(4) IC                 0.70     0.59     0.51     1        0.34     0.31     0.24     0.14     -0.21    -0.13
(5) S&P 500            0.10     0.18     0.41     0.34     1        0.89     0.88     0.28     -0.05    -0.15
(6) Put writing        0.04     0.16     0.42     0.31     0.89     1        0.68     0.26     -0.08    -0.16
(7) 110-95 collar      0.13     0.08     0.18     0.24     0.88     0.68     1        0.23     0.13     -0.05
(8) G10 FX carry       0.07     0.16     0.15     0.14     0.28     0.26     0.23     1        0.02     -0.05
(9) Equity Momentum    -0.16    -0.05    -0.13    -0.21    -0.05    -0.08    0.13     0.02     1        0.30
(10) CTA               -0.11    -0.03    -0.11    -0.13    -0.15    -0.16    -0.05    -0.05    0.30     1

Panel C - Tail dependence with IV-sentiment 90-110
                       (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)        (10)
Co-skewness            1.6E-12    -2.9E-13   1.0E-11    4.6E-12    -6.5E-10   -3.3E-09   9.6E-10    -9.8E-10   5.9E-10    -3.2E-09
1% cond. crash prob.   100%       51%        36%        77%        23%        21%        19%        9%         19%        2%
2% cond. crash prob.   100%       46%        37%        76%        32%        26%        26%        13%        15%        2%
5% cond. crash prob.   100%       44%        37%        78%        29%        32%        26%        17%        18%        7%

Panel A reports the results of contrarian pair-trade strategies based on our IV-sentiment 90-110 indicator and on other IV-based strategies such as the IV Skew, the volatility-risk premia (VRP), and other traditional and alternative beta strategies, i.e., buy & hold the S&P 500 index, put writing, 110-95 collar, G10 FX carry, cross-sectional equity momentum, and time-series momentum (CTA). The IV-based strategies use 252 days as the look-back period and +/- two standard deviations as convergence thresholds. Columns (11) to (13) of Panel A report statistics for strategies that combine the three-month IV-sentiment 90-110 strategy (column (1)) with the buy & hold the S&P 500 index (column (5)), the equity momentum strategy (column (9)), and the CTA strategy (column (10)). Panel B reports the correlation coefficients of daily returns estimated over the period between January 2, 1998 and December 4, 2015, for the same strategies reported in Panel A. Panel C reports the co-skewness and the conditional co-crash (CCC) probabilities of the three-month IV-sentiment 90-110 with the other strategies, which indicate the degree of tail-dependence among them.

In the following, we combine our IV-sentiment strategy with a simple buy-and-hold of the S&P 500 index, a cross-sectional equity momentum strategy, and a time-series momentum strategy, each on a standalone basis. These combinations are done by weighting returns in a 50/50 proportion. Statistics for the combined strategies are presented in columns (11) to (13) of Panel A of Table 4.5. We note that the combination improves the information ratios of all three strategies. The information ratio rises from 0.14 to 0.29 for the S&P 500, from 0.71 to 0.75 for the time-series momentum, and by a staggering 0.20 points, from 0.14 to 0.34, for the cross-sectional momentum strategy. The drawdown and skewness characteristics are also improved, especially for the cross-sectional momentum strategy. We argue that these improvements in the information ratio and downside statistics are due to the low correlation and the low higher-moment and tail dependencies of our IV-sentiment strategy with these alternative strategies. For instance, Table 4.5 Panel B indicates that the IV-sentiment strategy is negatively correlated with both equity momentum and time-series momentum, at -0.16 and -0.11, respectively.
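A back-of-the-envelope check, which is not part of the original analysis, illustrates why low correlation drives these improvements. Assuming that both legs of a 50/50 combination have the same volatility σ (consistent with the common 0.71% volatility reported in Panel A of Table 4.5) and a return correlation ρ, the information ratio of the combination follows directly from the information ratios of the two legs:

```latex
% Assumptions: equal weights, equal volatility \sigma for both legs,
% correlation \rho between the two return series.
\[
\mathrm{IR}_c
  = \frac{\tfrac{1}{2}(\mu_1 + \mu_2)}{\sigma\sqrt{\tfrac{1}{2}(1+\rho)}}
  = \frac{\mathrm{IR}_1 + \mathrm{IR}_2}{\sqrt{2(1+\rho)}}
\]
% Inputs taken from Table 4.5 (IV-sentiment IR = 0.29):
\[
\begin{aligned}
\text{S\&P 500 } (\mathrm{IR}=0.14,\ \rho=0.10):\quad & 0.43/\sqrt{2.20} \approx 0.29 \\
\text{Equity momentum } (\mathrm{IR}=0.14,\ \rho=-0.16):\quad & 0.43/\sqrt{1.68} \approx 0.33 \\
\text{CTA } (\mathrm{IR}=0.71,\ \rho=-0.11):\quad & 1.00/\sqrt{1.78} \approx 0.75
\end{aligned}
\]
```

These values are close to the reported 0.29, 0.34, and 0.75. The small gap for equity momentum is plausibly due to rounding of the inputs. The lower (or more negative) ρ is, the smaller the denominator and the larger the gain from combining.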

Co-skewness and, especially, CCC probabilities of the IV-sentiment strategy with momentum strategies are also very low (see Panel C of Table 4.5). Since Daniel and Moskowitz (2016) document that momentum strategies, in particular cross-sectional momentum, are prone to crashes, we suggest that the large improvement delivered by IV-sentiment to these strategies is likely due to the reduction of their large negative tails.

Moreover, Table 4.5 Panel B indicates that the IV-sentiment strategy is, on average, positively correlated with the other strategies. The highest correlation observed for the IV-sentiment strategy is with the IC strategy (0.70), which is an intuitive result given that these are the only two strategies driven jointly by the index option market and the single stock option market. The correlations of our IV-sentiment strategy with other IV-based strategies are also relatively high: 0.18 with the VRP and 0.41 with the IV skew 90 percent. The correlation of the IV-sentiment strategy with the S&P 500 index is very low, at 0.10. The correlation of the IV-sentiment strategy with other strategies that perform poorly in “bad times” is also low, at 0.04 with the put writing, at 0.07 with the G10 FX carry, and at 0.13 with the 110-95 collar strategy. We also note that other strategies can be highly correlated with each other, e.g., 0.89 between the S&P 500 and the put writing, whereas negative correlations are mostly observed for momentum strategies. Our findings on correlations among strategies are mostly confirmed by the estimated tail dependence between them, measured by the co-skewness and CCC probabilities reported in Panel C of Table 4.5.

As a robustness check, we analyze whether our IV-sentiment high-frequency trading strategy

performs well due to both its legs or whether its merit is concentrated in either the long- or

the short-leg. We separate the performance of the two legs of the strategy as if they were two

different strategies and we compute individual performance statistics. In order to visualize the

results, we produce boxplots of information ratios (IRs) separately for the three option maturities,

which are shown in Figure 4.2.



Figure 4.2: Information ratios for long- and short-leg of IV-based strategies. The boxplots depict the distribution of information ratios (IRs) obtained by the IV-based strategies tested, when different look-back periods and outer-thresholds are used per factor-specific strategy. Boxplots in the top row (in green) refer to IRs produced by the long-leg of IV-based strategies, whereas the ones in the second row refer to the short-leg of the same strategies. Boxplot A depicts the distribution of IRs when the IV factor used is obtained from three-month options. Panels B and C depict the same information while using the IV factors obtained from six- and twelve-month options, respectively.

The distributions of IRs for the long positions are shown in the upper plots, while

the distributions of IRs for the short positions are shown at the bottom. We note that the dispersion of

IRs from the short-leg is much higher than from the long-leg; outliers are much more frequent in

the short-leg. We find that the median IRs of long-legs are substantially higher than for short-

legs. The IR distributions of the short positions seem slightly skewed to the negative side,

whereas for the long positions they seem skewed to the positive side. These results indicate

that the merit of our IV-sentiment strategy is concentrated in its buy-signal rather than in its

sell-signal.

Figure 4.2 suggests that the long-legs of other IV-based strategies also perform much better than their short-legs. This finding suggests that extreme bearish sentiment signals may be more reliable than extreme bullish sentiment signals. One explanation for this finding is that the IV may be more reactive on the downside, due to the leverage effect¹⁸. In contrast, on the upside, a higher IV led by the bidding of call options might be offset by an overall lower IV. Our results are partially in line with the literature on cross-sectional returns and skew measures. Barberis and Huang (2008) suggest that stocks that have a high skew tend to have high subsequent returns, whereas for a call with a high skew this relation is inverse. However, other studies, such as Cremers and Weinbaum (2010), suggest that the relation between returns and volatility skews has the opposite direction. Assuming that there are systematic reasons for OTM implied volatilities across stocks to move in tandem, e.g., market risk, as suggested by Dennis and Mayhew (2002) and Duan and Wei (2009), the logical consequence of the cross-sectional relation between the implied skew and returns would be that the overall equity market should reverse following times of extremely high skews.

¹⁸ The leverage effect refers to the typically observed negative correlation between equity returns and changes in their volatility, and was first noted by Black (1976).

Our results, thus, offer additional findings to the literature that explores the link between

variance-measures and forward returns (Ang and Liu, 2007; Bliss and Panigirtzoglou, 2004;

Doran et al., 2007; Pollet and Wilson, 2008). Most of these studies recognize a negative and

short-term relation between risk measures and returns, where a high variance links to subse-

quent negative to low returns. In contrast, our findings suggest that a high level of IV skew

relates to subsequent positive and high returns. Our finding is mostly in line with Boller-

slev et al. (2009), who document that equity market reversals are predicted by the variance

risk-premium.

Further, we aimed to compare the trading performance of the Baker and Wurgler (2007) sentiment measure to our high-frequency strategy, but this was not possible as the former factor is only available at a monthly or quarterly frequency and was only published until 2010. Thus, as a next step, we compare trading strategies that use our suggested IV-sentiment measure with strategies that use the sentiment factor of Baker and Wurgler (2007). We do this by implementing a low-frequency pair-trading strategy using both predictors. This pair-trading strategy is identical to the one applied above, the only differences being the rebalancing frequency and the number of observations in the look-back window. We use the following look-backs for the calculation of Z-scores: 1, 3, 6, 9, 12, 18, and 24 months. The IV-sentiment measures used are the IV-sentiment 80-120 and 90-110 factors, available for our three different option maturities. Other back-test features (e.g., trading costs, strategy exit) are the same as for the high-frequency pair-trade strategy. Figure 4.3 presents our results as a series of boxplots. The empirical findings are displayed in columns for the different option maturities and in rows for the different statistics evaluated: 1) information ratio, 2) return skewness, and 3) horizon, proxied by the average drawdown length (in months) observed per strategy.

Our findings suggest that the IRs of the IV-sentiment strategies are much less dispersed

than the ones for the sentiment factor by Baker and Wurgler (2007). The median IR for the

IV-sentiment 90-110 factor is also higher than for the other two strategies. The IV-sentiment

90-110 factor is the only strategy in which almost all backtests deliver positive IRs, with the

exception of a few outliers. This is not the case for the other strategies, as a substantial number

of backtests deliver negative IRs.



Figure 4.3: Information ratios, skewness and horizon for monthly IV-based strategies. The boxplots depict the distribution of information ratios (IRs), return skewness, and trade horizon (average drawdown) obtained by the IV-sentiment strategies tested, as well as by the Baker and Wurgler (2007) sentiment factor, when different look-back periods and outer-thresholds are used per strategy. Boxplot A depicts the distribution of these statistics when the IV factor used is obtained from three-month options. Panels B and C depict the same information when the IV factors used are obtained from six- and twelve-month options, respectively. Boxplots of IR, return skewness and trade horizon for the Baker and Wurgler (2007) factor are the same across option maturities but are shown for comparison with the IV-sentiment strategies.

In line with our earlier results, the IV-sentiment 90-110 factor seems to dominate the

IV-sentiment 80-120 factor. The return skewness for the IV-sentiment 90-110 strategy also

dominates the ones for the other two strategies, as all boxplot features (median, one standard

deviation, high and low percentile, and outliers) are superior. The IV-sentiment 90-110 factor

delivers the lowest median horizon of all strategies. The average horizons estimated for the

IV-sentiment 90-110 factor are 12, 13, and 19 months, respectively, for the strategies based on

the three-, six- and twelve-month options. The dispersion of strategies’ horizon is, however,

higher for the IV-sentiment 90-110 factor than for the Baker and Wurgler (2007) sentiment

factor. We can conclude that our IV-sentiment measure seems to outperform a trading strategy

based on the sentiment factor by Baker and Wurgler (2007) on several key aspects: IR, return

skewness, and trade horizon.


4.4.3 Out-of-sample equity returns predictive tests

4.4.3.1 Univariate models and forecast combination

Following our hypothesis that extreme bearish and bullish sentiment might be followed

by reversals in equity markets, we test here whether our IV-sentiment measure has

out-of-sample predictive power in forecasting the equity risk premium, in line with the analysis

introduced by Welch and Goyal (2008). We follow the methodology used by Campbell and

Thompson (2008) and Rapach et al. (2010), who build on Welch and Goyal (2008). Hence,

similarly to these three studies, our predictive OLS regressions are formulated as:

\[ r_{t+1} = \alpha_i + \beta_i x_{i,t} + \varepsilon_{t+1}, \qquad (4.4.3) \]

where $r_{t+1}$ is the monthly excess return of the S&P 500 index over the risk-free interest rate,

$x_{i,t}$ is an explanatory variable hypothesized to have predictive power, and $\varepsilon_{t+1}$ is the error term.

Our predictive regressions also use the monthly data set provided by Welch and Goyal (2008)¹⁹,

but the set of 14 explanatory variables used closely follows Rapach et al. (2010)²⁰.

From the predictive regressions in Eq. (4.4.3), we generate out-of-sample forecasts for the

next quarter (t + 1) by using an expanding window. Following Rapach et al. (2010), the first

parameters are estimated using data from January 1947 until December 1964, and forecasts

are produced from January 1965 until December 2014. The estimation window for B/M starts

slightly later than January 1947, but enough observations are available for the B/M forecasts

to start in January 1965 as well. For the IV-sentiment-based regression, the data used

for the first parameter estimation starts in January 1998 and ends in December 1999, so that

out-of-sample forecasting is performed from January 2000 to December 2014 only.
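The expanding-window procedure described above can be sketched as follows. This is a minimal illustration rather than the code behind the reported results. The function name `expanding_window_forecasts` and the argument `first_window` are hypothetical, and the sketch assumes that `r[s]` is the excess return of period s and `x[s]` is the predictor observed at the end of period s.

```python
import numpy as np


def expanding_window_forecasts(r, x, first_window):
    """One-step-ahead forecasts of r[t+1] from the regression r[s+1] = a + b * x[s] + e.

    At each date t, the coefficients are re-estimated with all pairs (x[s], r[s+1])
    for s < t, so only information available at t is used. Entries without a
    forecast are NaN.
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    if first_window < 2:
        raise ValueError("need at least two observations to fit a line")
    forecasts = np.full(len(r), np.nan)
    for t in range(first_window, len(r) - 1):
        slope, intercept = np.polyfit(x[:t], r[1 : t + 1], 1)
        forecasts[t + 1] = intercept + slope * x[t]
    return forecasts
```

In this sketch the estimation sample grows by one observation at each step, so early forecasts rest on few data points, which matters for the IV-sentiment regression given its short initial estimation window.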

Following Campbell and Thompson (2008) and Rapach et al. (2010), restrictions on the re-

gression model specified by Eq. (4.4.3) are applied. The first restriction entails a sign restriction

on the slope coefficients of Eq. (4.4.3) for the 14 Welch and Goyal (2008) variables we employed.

The second restriction comprises setting negative forecasts of the equity risk premium to zero.

We specify an additional model containing both coefficient and forecast sign restrictions. The

original Eq. (4.4.3) with no restrictions applied is called the unrestricted model, whereas the

model with the two restrictions is called the restricted model. Once individual forecasts for rt+1

are obtained using the restricted and unrestricted models for every variable, weighted measures

of central tendency (mean and median) of the N forecasts are generated by Eq. (4.4.4):

$$\hat{r}_{c,t+1} = \sum_{i=1}^{N} \omega_{i,t}\, \hat{r}_{i,t+1}, \tag{4.4.4}$$

where $\{\omega_{i,t}\}_{i=1}^{N}$ are the combining weights available at time $t$. Our forecast combination method

19 Welch and Goyal (2008) monthly data was updated until December 2014 and is available at http://www.hec.unil.ch/agoyal/.

20 These variables are: the dividend-price ratio, the dividend yield, the earnings-price ratio, the dividend-payout ratio, the book-to-market ratio, the net equity issuance, the Treasury bill rate, the long-term yield, the long-term return, the term spread, the default yield spread, the default return spread, the inflation rate, and the stock variance.

is a simpler and more agnostic approach than the one used by Rapach et al. (2010)21. The

mean and median combination methods are simply the equally weighted ($\omega_{i,t} = 1/N$) average and

median of the forecasts. Our benchmark forecasting model is the historical average model with

the use of an expanding window.
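
A minimal sketch of the two restrictions and of the mean and median combination follows. It is illustrative only and rests on assumptions that are not stated in the text: the names `restricted_forecast` and `combine` are hypothetical, and the treatment of a wrong-signed slope (dropping the predictor and falling back to the historical average) is one possible reading of the sign restriction, not necessarily the exact rule applied here.

```python
from statistics import mean, median


def restricted_forecast(alpha, beta, x_t, expected_sign, hist_avg):
    """Apply a slope-sign restriction and a non-negativity restriction.

    expected_sign is +1 or -1, the sign that theory suggests for the slope.
    Assumption: a wrong-signed slope is treated as zero, so the forecast
    falls back to the historical average hist_avg.
    """
    if beta * expected_sign < 0:
        forecast = hist_avg
    else:
        forecast = alpha + beta * x_t
    # Second restriction: a negative equity premium forecast is set to zero.
    return max(forecast, 0.0)


def combine(forecasts, method="mean"):
    """Combine N individual forecasts by their equal-weighted mean or median."""
    if method == "mean":
        return mean(forecasts)
    if method == "median":
        return median(forecasts)
    raise ValueError("method must be 'mean' or 'median'")
```

For example, `combine([0.01, 0.02, 0.06])` averages three individual forecasts, while `combine([0.01, 0.02, 0.06], "median")` returns the middle one and is therefore less sensitive to a single extreme forecast.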

We use the out-of-sample $R^2$ statistic ($R^2_{OS}$) introduced by Campbell and Thompson

(2008) and followed by Rapach et al. (2010) for forecast evaluation. This method compares

the performance of a return forecast $\hat{r}_{t+1}$ and a benchmark or naïve return forecast $\bar{r}_{t+1}$ with

the actual realized return ($r_{t+1}$). We note that this method can be applied both to the single

factor-based forecast models and to the combined or multifactor forecast models, which are

described in the previous section. The $R^2_{OS}$ statistic is given as:

$$R^2_{OS} = 1 - \frac{\sum_{k=q_0+1}^{q} (r_{m+k} - \hat{r}_{m+k})^2}{\sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2}, \tag{4.4.5}$$

which evaluates the return forecasts from a predictive model (in the numerator) and the return

forecasts from a benchmark or naïve model (in the denominator) by comparing the mean

squared prediction errors (MSPE) of both methods. Because the ratio of MSPEs is subtracted

from 1 in the $R^2_{OS}$ statistic, its interpretation is as follows: if $R^2_{OS} > 0$, then the MSPE of the model forecast $\hat{r}_{t+1}$ is smaller

than that of the benchmark forecast $\bar{r}_{t+1}$, indicating that the forecasting model outperforms the naïve (benchmark) model,

and vice versa. To better evaluate the out-of-sample performance of models graphically, we

employ the cumulative sum of squared error differences ($CSSED_{OS}$) statistic given below. The

advantage of $CSSED_{OS}$ over $R^2_{OS}$ is that it starts at zero and accumulates over time in a

homoscedastic manner, whereas $R^2_{OS}$ typically displays a very high volatility at the start of the

(accumulation) period and a lower volatility as $t$ increases22.

$$CSSED_{OS} = \sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2 - \sum_{k=q_0+1}^{q} (r_{m+k} - \hat{r}_{m+k})^2. \tag{4.4.6}$$
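
The two evaluation statistics can be computed as in the sketch below. It is a hedged illustration rather than the thesis implementation: the function names `r2_os` and `cssed_path` and the three aligned input lists (realized returns, model forecasts, benchmark forecasts) are assumptions. The cumulative statistic is returned as a path so that its slope can be inspected over time, and its final value divided by the benchmark's sum of squared errors reproduces the out-of-sample R-squared.

```python
def r2_os(realized, model, bench):
    """Out-of-sample R-squared: 1 minus model SSE over benchmark SSE."""
    sse_model = sum((r - f) ** 2 for r, f in zip(realized, model))
    sse_bench = sum((r - b) ** 2 for r, b in zip(realized, bench))
    return 1.0 - sse_model / sse_bench


def cssed_path(realized, model, bench):
    """Running sum of benchmark squared errors minus model squared errors.

    An upward-sloping path means the model is beating the benchmark
    over that stretch of the out-of-sample period.
    """
    path, total = [], 0.0
    for r, f, b in zip(realized, model, bench):
        total += (r - b) ** 2 - (r - f) ** 2
        path.append(total)
    return path
```

Plotting the list returned by `cssed_path` against time gives the kind of curve discussed in the remainder of this section.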

The results from our out-of-sample equity returns predictive tests are reported in Table 4.6.

Panel A reports the findings for the out-of-sample forecasting period between January 1965

and December 2014 for all individual variables except our IV-sentiment factor (IVSent), for

which forecasts are only available from January 2004 to December 2014, and for the combined

forecasts. For individual models, R2OS comes from the restricted model, whereas for the aggre-

gated models, the results are reported for both the restricted and the unrestricted models. The

results of the aggregate models are reported in means and medians, reflecting the aggregation

21 Rapach et al. (2010) classify their combination methods in two classes: the first class uses a mean, median, and trimmed mean approach for forecast combination, and the second class uses a discounted mean square prediction error (DMSPE) methodology. The DMSPE method aims to set combining weights as a function of the historical forecasting performance of the individual models during the out-of-sample period. This method weights more recent forecasts more heavily than older ones by the use of one additional parameter. Despite the desirable features of such a second class combination method, we prefer to stick to the first class methods only because they are more transparent and do not require the choice of an additional parameter.

22 The undesirable graphical pattern of $R^2_{OS}$ is caused by the normalization through $\sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2$, which at the start of the sample tends to be very small relative to $CSSED_{OS}$. Note that $R^2_{OS} = CSSED_{OS} / \sum_{k=q_0+1}^{q} (r_{m+k} - \bar{r}_{m+k})^2$.

method used.

Table 4.6: Out-of-sample equity risk premium

Columns (1)-(4): individual predictive regression model forecasts. Columns (5)-(6): combination forecasts. Columns (7)-(8): machine learning methods.

Predictor R2OS(%)   Predictor R2OS(%)   Combining method              R2OS(%)   Method                         R2OS(%)
(1)           (2)   (3)           (4)   (5)                               (6)   (7)                                (8)

Panel A. 1965:1-2014:12 out-of-sample period
D/P         -0.30   LTY         -0.28   Mean-Unconstrained               1.08   Kitchen-sink (OLS)              -88.14
D/Y         -0.11   TMS         -0.50   Median-Unconstrained             0.64   Ridge regression                  0.81
E/P         -0.41   LTR          0.22                                           Principal Component Regression   -5.93
D/E         -0.76   DFY         -0.69   Mean-Constrained                 1.11   Random Forest                    -9.97
B/M         -0.88   DFR         -0.55   Median-Constrained               0.63   Neural Networks                 -84.14
NTIS        -0.83   TBL         -0.01                                           Mean-models                      -6.35
INFL         0.48                                                               Median-models                    -2.06
SVAR         0.02

Panel B. 2004:1-2014:12 out-of-sample period
D/P         -0.82   LTY          0.62   Mean-Unconstrained               0.25   Kitchen-sink (OLS)              -62.64
D/Y         -0.53   TMS         -0.94   Median-Unconstrained            -0.35   Ridge regression                 -1.76
E/P         -1.31   LTR          0.01   Mean-Unconstrained + IVSent      0.63   Principal Component Regression    0.09
D/E         -2.13   DFY         -1.26   Median-Unconstrained + IVSent    0.27   Random Forest                    -8.80
B/M         -0.16   DFR         -0.64   Mean-Constrained                 0.40   Neural Networks                 -66.95
NTIS        -2.63   TBL         -0.05   Median-Constrained              -0.25   Mean-models                       1.24
INFL        -2.58   IVSent3m     2.45   Mean-Constrained + IVSent        0.75   Median-models                     2.12
SVAR         4.17   IVSent6m     2.45   Median-Constrained + IVSent      0.19
                    IVSent12m    1.59

This table reports the results from the predictive regressions of individual factor models and of combined-factor models relative to the historical average naïve (benchmark) model. $R^2_{OS}$ is the Campbell and Thompson (2008) out-of-sample $R^2$ statistic. If $R^2_{OS} > 0$, then the mean squared prediction error (MSPE) of $\hat{r}_{t+1}$, i.e., the predictive regression forecast, is smaller than that of $\bar{r}_{t+1}$, i.e., the naïve forecast, indicating that the forecasting model outperforms the latter (benchmark) model. Panel A reports the results for the full out-of-sample period available (1965:1-2014:12) for all variables tested by Rapach et al. (2010). Panel B reports the results for the latest period within the entire out-of-sample history (2004:1-2014:12) and includes the three-month IV-sentiment 90-110 factor (IVSent) in addition to the variables tested by Rapach et al. (2010).

Panel A suggests that performance is not consistent across factors within the longer history

of the out-of-sample test. Some factors outperform others by a large amount. Concurrently, the

performance of most single factors is quite inconsistent through time, as Figure 4.4 depicts: the

slope and levels of CSSEDOS constantly change from negative to positive and vice-versa for

almost all factors. For some of them, CSSEDOS even flips sign at times within the sample. In

contrast, the aggregated models deliver better performance across restricted and unrestricted

models, whether the mean or the median is used as the aggregation method. Moreover, the performance

of the weakest aggregate model (0.63) is superior to the best individual factor (INFL at 0.48)

within the full sample.

Once we evaluate the period from January 2004 to December 2014, when IVSent is used,

we observe that the performance across factors remains inconsistent. The performance across

individual factors looks less dispersed in this sample than in the full sample, but the overall

performance deteriorates. The IVSent factor performs well (ranging from 1.59 to 2.45 depending

on the maturity), despite being strongly outperformed by the SVAR factor (4.17), while other

factors perform extremely poorly (NTIS at -2.63, INFL at -2.58). The combined models that

do not include IVSent in their median versions (restricted and unrestricted) underperform the

naïve forecasting benchmark as their $R^2_{OS}$ is negative.

Figure 4.4: Cumulative Sum of Squared Error Differences of single factor predictive regressions. The lines in every plot depict the out-of-sample Cumulative Sum of Squared Error Differences ($CSSED_{OS}$) calculated by Eq. (4.4.6) for the historical average benchmark-forecasting model minus the cumulative squared prediction errors for the single-factor forecasting models constructed by using 14 out of all the explanatory variables suggested by Welch and Goyal (2008), as well as the IV-sentiment 90-110 factor with a three-month maturity. Positive values of $CSSED_{OS}$ mean that single-factor forecasting models that employ the Welch and Goyal (2008) factors and IVSent outperform the historical average benchmark-forecasting model.

Interestingly, when our IVSent factor is added to these models, the performance improves

substantially, outperforming the benchmark. We observe the same for models based on the

mean: the mean-unconstrained and the mean-constrained models ex-IVSent show an $R^2_{OS}$ of

0.25 and 0.40, respectively. When the IVSent factor is added to them, $R^2_{OS}$ improves to 0.63

and 0.75, respectively. Therefore, our IVSent factor seems to affect the

combined model in a very distinct way when compared to other factors. The $R^2_{OS}$ of models

that use median forecasts is lower than that of models that aggregate forecasts by averaging.

Nonetheless, improvements delivered by the inclusion of IVSent and the imposition of model

constraints are qualitatively the same across models aggregated by either median or averaging.

We also find that our IVSent factor is largely uncorrelated with other factors. The correlation

coefficient of the IVSent factor that uses three-month options with other individual factors is

mostly negative or close to zero, and only exceeds 0.5 when evaluated against the long-term

yield (LTY)23. Such correlation is higher for the IVSent factor computed using six- and

twelve-month option maturities. These results suggest that the improvements made by our

IVSent factor to the combined models stem partially from diversification benefits rather than

from forecast performance ($R^2_{OS}$) alone.

23A full correlation matrix among the individual predictive factors tested by Rapach et al. (2010) and IV-sentiment factors can be provided upon request.

4.4.3.2 “Kitchen sink” and machine learning-based models

Further, we also test a “kitchen sink” model24 as used by Welch and Goyal (2008) and Rapach

et al. (2010), but we extend it beyond the standard linear model toward machine learning

algorithms. Our aim is to test whether more advanced models can fix the exceptionally poor

out-of-sample performance of the multivariate approach to forecasting the equity risk premium,

as reported by Welch and Goyal (2008) and Rapach et al. (2010). The models we test

in addition to the “kitchen sink” OLS model are: 1) Ridge regression (Hoerl and Kennard,

1970), 2) Principal Component Regression (Massy, 1965), 3) Random forest (Breiman, 2001),

and 4) Neural Networks25,26. Our hypothesis for running this model “horse race” is that

machine learning-based models might be able to improve on the multivariate OLS regression

by 1) reducing its variance and, thus, avoiding overfitting, 2) better modelling potential

non-linearities present in the data, or 3) dampening the effect of collinearity in the regressors.
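
To illustrate the variance-reduction and collinearity channels, the sketch below compares OLS with Ridge regression on two nearly collinear regressors. It is a toy illustration and not the estimation used in this chapter: the penalty `lam` is fixed by hand rather than chosen by cross-validation, the predictors are demeaned but not standardized, and the helper names `solve` and `ridge_slopes` are hypothetical. The Ridge slopes solve (X'X + lam * I) b = X'y, so `lam = 0` recovers OLS.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def ridge_slopes(X, y, lam):
    """Ridge slope coefficients on demeaned data (intercept is not penalized)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [yi - y_mean for yi in y]
    # Build X'X + lam * I and X'y from the demeaned data.
    xtx = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(Xc, yc)) for i in range(p)]
    return solve(xtx, xty)
```

Calling `ridge_slopes(X, y, 0.0)` and `ridge_slopes(X, y, 1.0)` on the same data shows the shrinkage directly: the penalized coefficient vector has a smaller norm than the OLS one.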

Our results from testing a “kitchen sink” OLS model confirm those of Welch and Goyal

(2008) and Rapach et al. (2010) (see Table 4.6). The model is the worst performing one in R2OS

terms across all univariate and multivariate models. In contrast, individual machine learning

algorithms using the same set of variables outperform the “kitchen sink” model but do not

consistently outperform the models that combine forecasts from univariate models. The Ridge

regression model seems to be the best performing across all multivariate models as it delivers

a high $R^2_{OS}$ in the January 1965-December 2014 sample and a less negative $R^2_{OS}$ than other models

in the January 2004-December 2014 sample. Given its linear character, the main advantages

of Ridge regression over the “kitchen sink” are the regularization (shrinkage) applied as well as

its suitability for multicollinear systems. As the principal component regression also addresses

multicollinearity problems yet performs quite poorly in the January 1965-December 2014

sample, we conjecture that the main benefit delivered by the Ridge regression might be the

shrinkage, which likely dampens the overfitting suffered by the “kitchen sink” model. The

Random forest model performs poorly, although less badly than the “kitchen sink” and the

Neural Networks models, suggesting that the structure imposed by constraints plus forecast

combination seems to add more value to predictions than being able to capture non-linear

relationships. The Neural Networks model performs as badly as the “kitchen sink” model, likely

due to overfitting. As we intentionally did not tune the Random forest and the Neural Networks

models much, the chance that these models are overfitted is high, especially for the Neural Networks

model. These two approaches are known for their potential for overfitting if stop-training

24 The “kitchen sink” includes all 14 predictive variables used in our univariate models.

25 We tune Ridge regression by using cross-validation with 10 folds. We tune our Random forest model using a single pass of out-of-bag errors to estimate the optimal number of predictors sampled for splitting at each node. We use cross-validation in the estimation of our Neural Networks model only to select the number of layers and neurons (among a set of pre-defined structures). We do not apply any early-stopping procedure. A detailed description of the tuning procedure applied to these models is beyond the scope of this thesis.

26 A more detailed description of the Ridge regression and Random forest models is provided in Appendix 5.A. In brief, the Principal Component Regression model is a regression analysis in which the explanatory variables are the orthogonal factors generated by a principal component analysis (PCA) (see Appendix 5.A for details on PCA). Given the complexity and flexibility of Neural Networks/deep learning, further details on this method are beyond the scope of this thesis but are available in Haykin (1999). For more insight into all the machine learning methods used in this chapter, see Hastie et al. (2008).

procedures are not imposed.

Observing the evolution of CSSEDOS for the median-based (restricted and unrestricted)

combined models in Plot A of Figure 4.5, we notice that both lines have slopes that are pre-

dominantly positive or flat. Positive slopes of the CSSEDOS curve indicate that the combined

model outperforms the benchmark out-of-sample. These CSSEDOS lines match very closely

the ones presented by Rapach et al. (2010) up to 2004, when their sample ends. The evolution

of R2OS for our individual factors in Figure 4.4 is also very similar to Rapach et al. (2010): some

CSSEDOS curves are positively sloped during certain periods, but often all factors display

negatively sloped curves. The $R^2_{OS}$ curve for the IVSent factor is mostly positively sloped but

relatively flat from 2004 to 2007, as the last plot in Figure 4.4 indicates. These results reiter-

ate the primary conclusion of Welch and Goyal (2008), Campbell and Thompson (2008) and

Rapach et al. (2010): individual predictors that reliably outperform the historical average in

forecasting the equity risk premium are rare but, once these models are sensibly restricted and

aggregated in a multi-factor model, their out-of-sample predictive power improves considerably.

This conclusion applies also to the inclusion of our IVSent factor within the multi-factor

model. Plot B of Figure 4.5 shows that the $CSSED_{OS}$ curves for the models that include the

IVSent factor are visibly steeper than the ones that do not include it. Further, the findings in

Figure 4.5 indicate that restricted models seem to be superior to unrestricted ones by having

either higher or less volatile CSSEDOS.

(a) Without IV-sentiment (b) With IV-sentiment

Figure 4.5: Cumulative Sum of Squared Error Differences of combined predictive regressions. The black line in Plot A depicts the Cumulative Sum of Squared Error Differences ($CSSED_{OS}$) for the historical average benchmark-forecasting model minus the cumulative squared prediction errors for the aggregated predictive regression-forecasting model constructed by using 14 Welch and Goyal (2008) explanatory variables in univariate unrestricted models. The green and red lines in Plot A depict the same forecast evaluation statistic, i.e., the $CSSED_{OS}$, when such 14 univariate models are restricted as suggested by Campbell and Thompson (2008). The red line represents the $CSSED_{OS}$ when coefficients are constrained to have the same sign as the priors suggest. Plot B zooms in on the January 2003-December 2014 period, where the black and red lines are the same as in Plot A, whereas the green and blue lines are the $CSSED_{OS}$ when our IV-sentiment factor is added to the multifactor forecast model for the unrestricted and restricted model, respectively. The forecasting period is January 1965-December 2014 for all variables except IVSent, for which forecasts are only available from January 2004 to December 2014. Forecast aggregation in both models is done by calculating the mean of the t + 1 forecast from each individual predictive regression.

However, even if the combined factor models perform much better than the individual

predictors do, the red and black lines in Plots A and B of Figure 4.5 are not always positively

sloped, which is in line with Rapach et al. (2010). The R2OS curve is strongly positively sloped

from 1965 to 1975, more moderately positively sloped from 1975 to 1992, negatively sloped from

1992 to 2000, and then slightly positive to flat until 2008, when it sharply drops amid the global

financial crisis up to December 2014. The addition of our IVSent factor to the combined model

produces the blue and green lines in Plot B of Figure 4.5. These new curves have an equally flat

slope during the 2004 to 2008 period, while both experience a sharp rise from the beginning of

2008 onward. These curves’ profiles suggest that our IVSent factor has considerably improved the

out-of-sample performance of the combined model, especially in times when the other factors broke

down or did not provide an edge versus the historical average predictor. Thus, the inclusion

of our IVSent factor seems to revive the conclusion reached by the previous literature that

combined factor models are able to improve on individual factor models. At the same

time, the recent poor performance of the combined models ex-IVSent underscores that factor

identification is still a major challenge for the specification of combined models. Overall, our

empirical findings suggest that IV-based factors provide a relevant explanatory variable for the

time-variation of equity returns.

4.4.4 IV-sentiment and equity factors

In this section we test whether the stream of returns produced by the IV-sentiment trading

strategy is connected to (cross-sectional) equity factors. Our goal in this analysis is to evaluate

whether the IV-sentiment loads heavily on equity factors identified in the literature. Since the

IV-sentiment aims to time entry and exit-points into the equity markets, it could potentially

also be used by equity managers to time their beta exposure. Nevertheless, if this timing-

strategy largely resembles equity factors, it should be less useful to equity portfolio managers.

We perform this analysis using Eqs. (4.4.7a) to (4.4.7d), as well as univariate models for

each individual factor employed in these equations:

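$$IVSent_d = \alpha_d + \beta_1 (Mkt-RF)_d + \beta_2 SMB_d + \beta_3 HML_d + \varepsilon_d, \tag{4.4.7a}$$

$$IVSent_d = \alpha_d + \beta_1 (Mkt-RF)_d + \beta_2 SMB_d + \beta_3 HML_d + \beta_4 WML_d + \varepsilon_d, \tag{4.4.7b}$$

$$IVSent_d = \alpha_d + \beta_1 (Mkt-RF)_d + \beta_2 SMB_d + \beta_3 HML_d + \beta_4 WML_d + \beta_5 RMW_d + \beta_6 CMA_d + \varepsilon_d, \tag{4.4.7c}$$

$$IVSent_m = \alpha_m + \beta_1 (Mkt-RF)_m + \beta_2 SMB_m + \beta_3 HML_m + \beta_4 WML_m + \beta_5 RMW_m + \beta_6 CMA_m + \beta_7 BAB_m + \varepsilon_m, \tag{4.4.7d}$$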

where the subscript $d = 1, 2, \ldots, D$ stands for daily returns, whereas the subscript $m = 1, 2, \ldots, M$

stands for monthly returns, both extending from January 2, 1999 to December 8, 2015. The first

set of explanatory variables, used in Eq. (4.4.7a), comprises the market (Mkt-RF), the size (SMB) and

the value (HML) factors, as proposed by Fama and French (1992). Additionally, the profitability

(RMW) and investment (CMA)27 factors of Fama and French (2015), the momentum factor

27The Fama and French factors SMB, HML, RMW and CMA stand, respectively, for small minus big (size),

(WML) of Carhart (1997) and the low- versus high-beta factor (BAB), known as the “Betting Against

Beta” factor of Frazzini and Pedersen (2014), are used in Eqs. (4.4.7b) to (4.4.7d)28. The

correlation structure of these factors, estimated using our monthly data, is reported in

Figure 4.6 below. In brief, it suggests that some cross-sectional equity factors can be highly

positively or negatively correlated with each other but, more importantly, that the IV-sentiment

strategy seems only weakly correlated with all series.
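
For the univariate versions of these regressions, the factor loading and the explanatory power can be computed in closed form, as in the sketch below. It is an illustrative sketch under assumptions: the function name `factor_exposure` and the list inputs are hypothetical, and no standard errors or significance tests are computed.

```python
from statistics import mean


def factor_exposure(strategy, factor):
    """Univariate OLS of strategy returns on one factor's returns.

    Returns (alpha, beta, r_squared), where beta is the factor loading and
    r_squared is the share of the strategy's variance explained by the factor.
    """
    s_bar, f_bar = mean(strategy), mean(factor)
    sff = sum((f - f_bar) ** 2 for f in factor)
    sss = sum((s - s_bar) ** 2 for s in strategy)
    sfs = sum((f - f_bar) * (s - s_bar) for f, s in zip(factor, strategy))
    beta = sfs / sff
    alpha = s_bar - beta * f_bar
    r_squared = sfs ** 2 / (sff * sss)
    return alpha, beta, r_squared
```

A loading close to zero combined with a low R-squared indicates that the strategy's returns are largely unrelated to that factor.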

Figure 4.6: Correlation matrix between IV-sentiment factor and cross-sectional equity factors. The upper triangular part of the matrix above reports the correlation coefficient between pairs of cross-sectional equity factors and the IV-sentiment factor. These equity factors are the market (Mkt-RF), the size (SMB) and the value (HML) factors, the profitability (RMW), the investment (CMA), the momentum factor (WML) and the “Betting Against Beta” factor (BAB). The font size of each coefficient reflects its magnitude, whereas asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively. In the diagonal, the histograms of factor returns are depicted. The lower triangular part of the matrix depicts scatter plots of the returns of the multiple pairs of factors.

Table 4.7 reports the results of Eqs. (4.4.7a) to (4.4.7d). First, we observe that the IV-sentiment

strategy has very little Beta exposure, as the coefficients for the (Mkt-RF) factor are close to

zero across its univariate model as well as across all multivariate models. This result matches

our expectations as IV-sentiment has, in fact, a time-varying long or short exposure to the

equity market. The IV-sentiment strategy also seems to have a large-cap tilt as the coefficient

of SMB is often statistically significant and small or negative, ranging from -0.107 to 0.147.

high minus low (valuation), robust minus weak (profitability) and conservative minus aggressive (investments).

28 The regressions that include the BAB factor have monthly frequency as this factor is not available at a daily frequency.

Again, this is an expected result as the IV-sentiment strategy is implemented in the US large

cap universe, i.e., the S&P 500 index. Coefficients for HML are also either low or negative,

suggesting a growth tilt. HML is positive in the simpler models, i.e., the univariate regression

and the Fama and French (1992) model, but negative in the more comprehensive models.

This finding suggests the presence of multicollinearity in the model, which is likely affecting

the estimated coefficient for HML. This effect is likely caused by the addition of the RMW

factor, as these factors have a correlation of 0.5 in our sample (see Figure 4.6), whereas the

literature reports correlations of up to 0.8.

Table 4.7: Regression results: IV-sentiment and equity factors

           Panel A - Multivariate                      Panel B - Univariate
Intercept  0.000      0.000      0.000      0.007*     0.000      0.000      0.000      0.000      0.000      0.000      0.007
           (0.000)    (0.948)    (0.000)    (0.004)    (0.000)    (0.000)    (0.000)    (0.000)    (0.000)    (0.000)    (0.004)
Mkt-RF     0.070***   0.042***   0.060***   0.072      0.073***
           (0.010)    (0.011)    (0.012)    (0.104)    (0.010)
SMB        0.134***   0.153***   0.136***   -0.107                0.147***
           (0.021)    (0.021)    (0.022)    (0.152)               (0.021)
HML        0.080***   0.018      -0.064***  -0.271                           0.086***
           (0.019)    (0.021)    (0.024)    (0.180)                          (0.020)
WML                   -0.121***  -0.141***  -0.179*                                     -0.134***
                      (0.015)    (0.015)    (0.094)                                     (0.013)
RMW                              -0.042     -0.130                                                 -0.137***
                                 (0.029)    (0.220)                                                (0.024)
CMA                              0.244***   0.624**                                                           0.107***
                                 (0.036)    (0.245)                                                           (0.029)
BAB                                         -0.186                                                                       -0.215**
                                            (0.126)                                                                      (0.098)
R2         2%         4%         5%         13%        1%         1%         0%         2%         1%         0%         4%
F-stats    36.7       45.0       38.0       2.5        49.2       47.6       19.4       104.5      31.5       13.7       4.8
AIC        -29771     -29837     -29879     -430       -29715     -29713     -29685     -29769     -29698     -29680     -429
BIC        -29739     -29798     -29827     -405       -29696     -29694     -29666     -29750     -29678     -29661     -421

This table reports regression results for Eqs. (4.4.7a) to (4.4.7d). The dependent variable is the stream of returns produced by the contrarian strategy based on our IV-sentiment 90-110 indicator, while the explanatory variables are equity (cross-sectional) factors, namely: the market (Mkt-RF), size (SMB), value (HML), profitability (RMW), investment (CMA), momentum (WML) and low- versus high-beta (BAB). Panel A reports the regression results in a multivariate setting, using four distinct models: 1) the Fama-French three-factor model, 2) the Fama-French three-factor model with the addition of the Carhart (1997) momentum factor, 3) the Fama-French five-factor model with the momentum factor and 4) the latter model with the addition of the BAB (Betting Against Beta) factor suggested by Frazzini and Pedersen (2014). Note that as the BAB factor is only available at monthly frequency, regressions that contain this factor use monthly data, whereas the data used in the other regressions have daily frequency. We report standard errors in brackets. Asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

Turning to the factors in Eqs. (4.4.7b) to (4.4.7d) only, we find that IV-sentiment has

negative exposure to the cross-sectional momentum factor (WML) consistently across all re-

gressions. At first glance, this result makes sense as IV-sentiment is a mean-reversion strategy.

Nevertheless, because the IV-sentiment reflects mean-reversion in the overall equity market,

hence in time-series, rather than cross-sectionally, the expectation of a negative relation be-

tween these variables is ambiguous. Moskowitz et al. (2012) report that time-series momentum

and cross-sectional momentum in the equity markets are strongly related though29, which sug-

gests that our original assumption that IV-sentiment is negatively correlated to WML holds.

Among all factors, WML is nearly the only one whose statistical significance holds across

29 Moskowitz et al. (2012) report that the coefficient of time-series momentum on cross-sectional momentum equals 0.57 with a t-stat of 15.52 in a univariate model.

all regressions. WML also seems to deliver high explanatory power (around 2 percent)

relative to the other factors used. This strong and robust negative link between IV-sentiment

and WML reiterates our earlier suggestion that these two risk factors seem to complement each

other. And, by doing so, IV-sentiment might be able to mitigate some momentum crashes.

Moreover, the exposure of IV-sentiment to the profitability factor (RMW ) is small and

always negative, despite the fact that the coefficients are not statistically significant in the two

multivariate models applied, only in the univariate regression. IV-sentiment is positively ex-

posed to the investment factor (CMA) as its coefficients are significant across all regressions. We

interpret that this positive relation with IV-sentiment relates to a higher frequency of reversals

in periods when firm investments are low (likely during recessions or in the late economic cycle),

which coincides with conservative firms outperforming aggressive ones. Besides, IV-sentiment

loads negatively on the BAB factor, despite being only statistically significant in the univariate

regression. This connection is argued to be linked to the profitability factor (RMW ) by Fama

and French (2016), which may help explain why both regressors are not statistically significant

in the multivariate model, whereas they are strongly significant in the univariate regressions.

In line with this suggestion, the estimated correlation between these two factors in our sample

is 0.59 (see Figure (4.6)).

Last but not least, none of our regression models explains much of the variability of IV-sentiment,

as $R^2$ is always low, reaching 13 percent at best for Eq. (4.4.7d). This finding indicates that

the IV-sentiment strategy is quite distinct from factors typically used by portfolio managers

for single name equity management. Hence, as the IV-sentiment strategy embeds a timing

approach for equity markets, which can be implemented via a dynamic exposure to market

Beta, equity portfolio managers could enhance their strategies by making use of it.

4.4.5 Behavioral versus risk-sharing perspectives

Another perspective on equity market dynamics, provided by IV-based factors that are jointly

extracted from single stock and index options, is the implied correlation ($\rho$). It is approximated

by Eq. (4.4.8), which is derived in Appendix 3.A.2:

$$\rho \approx \frac{\sigma_I^2}{\left(\sum_{i=1}^{n} w_i \sigma_i\right)^2}, \tag{4.4.8}$$

where $\sigma_I^2$ is the variance of index options, $\sigma_i$ is the volatility of stocks $i = 1, \ldots, n$ in the index, and

$w_i$ is the stocks’ weight in the index. The implied correlation measures the level of the average

correlation between stocks that are constituents of an index. The IV of index options, i.e., $\sigma_I^2$,

can be matched by the one of single stock options, weighted by its constituents’ loadings in

the index, i.e., $\left(\sum_{i=1}^{n} w_i \sigma_i\right)^2$. Thus, if IV can be used as a measure of absolute expensiveness

of an option, the implied correlation provides a relative valuation measure between the index

and single stock options: a high (low) level of implied correlation means that index options are

expensive (cheap) relative to single stock options.
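
The approximation in Eq. (4.4.8) is straightforward to compute, as the sketch below shows. The function name `implied_correlation` and the example inputs are hypothetical and chosen only for illustration. The point of the example is that the ratio exceeds one whenever the index implied volatility is higher than the weighted average of the single stock implied volatilities.

```python
def implied_correlation(index_vol, weights, stock_vols):
    """Approximate implied correlation as in Eq. (4.4.8).

    index_vol is the index option implied volatility, stock_vols are the
    single stock implied volatilities and weights are the index weights.
    """
    weighted_vol = sum(w * s for w, s in zip(weights, stock_vols))
    return index_vol ** 2 / weighted_vol ** 2
```

With two equally weighted stocks at 30 percent implied volatility, an index implied volatility of 20 percent gives a ratio of about 0.44, whereas an index implied volatility of 33 percent gives a ratio above one.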

Table 4.8 Panel A presents descriptive statistics of the implied correlations between the

index and single stock options’ IV. The means and medians suggest that the implied correlation

monotonically decreases with an increase in the moneyness level. The implied correlation means

range from 0.30 to 0.65, a somewhat wide range given that these are averaged measures. Such a

relatively high dispersion of implied correlations is confirmed by their standard deviations, which

are around 0.14. The distributions of the implied correlation are mostly negatively skewed, as

medians are mostly higher than their means. The most striking result is given by

the maximum and minimum implied correlations: the maximum implied correlation observed

across all maturities and moneyness levels reported reaches 135 percent. Implied correlations

above 100 percent are observed for many options, mostly for puts at the 80 and 90 percent

moneyness levels. This finding implies that in order to match the weighted IV of puts on single

stocks that are part of the S&P 500 index to the IV of a put on the index (with same levels

of moneyness), an average correlation above 100 percent between the single stock put options

is required. However, as correlation coefficients are bounded between −100 and +100 percent,

these levels of implied correlation are indicative of irrational behavior by investors, who bid up

index puts to levels that contradict market completeness.

Table 4.8: Implied and realized correlations

Panel A - Implied correlations

Statistics \ Maturity, moneyness     3m 80%   3m 90%   3m ATM  3m 110%  3m 120%   6m 80%   6m 90%  12m 80%  12m 90%
Mean                                   0.65     0.56     0.45     0.35      0.3     0.64     0.56      0.6     0.54
Median                                 0.67     0.57     0.45     0.35      0.3     0.65     0.56     0.61     0.54
Minimum                                0.24     0.18     0.12     0.07     0.03     0.26     0.21     0.26     0.22
Maximum                                1.35     1.11     0.86     0.72     0.68     1.07     0.95      1.1     1.01
10th percentile                        0.44     0.35     0.27     0.17     0.13     0.41     0.34     0.39     0.34
90th percentile                        0.81     0.73     0.63     0.53     0.49      0.8     0.73     0.77     0.72
Standard deviation                     0.15     0.14     0.14     0.13     0.14     0.14     0.14     0.14     0.13
Skew                                  -0.46    -0.39      0.1     0.29     0.29     -0.6    -0.38    -0.26     -0.2
Excess Kurtosis                         0.6    -0.02    -0.37    -0.38    -0.66     0.09    -0.18     0.03    -0.22

Panel B - Realized correlations

Statistics \ Look-back period        30 Days   60 Days   90 Days  180 Days  720 Days
Mean                                     0.3      0.25      0.25      0.26      0.36
Median                                  0.27      0.22      0.24      0.25      0.31
Minimum                                    0      0.01         0      0.01      0.06
Maximum                                 0.84      0.69      0.67      0.61      0.74
10th percentile                          0.1      0.05      0.04      0.07      0.08
90th percentile                         0.54      0.47      0.48      0.52      0.71
Standard deviation                      0.17      0.16      0.16      0.16       0.2
Skew                                    0.66      0.38       0.6      0.37      0.42
Excess Kurtosis                        -0.02     -0.68     -0.21     -0.86     -0.88

Panel A reports the descriptive statistics for the implied correlations between index options and single stock options for three-month options at the 80, 90, ATM (100), 110, and 120 percent moneyness levels, and for six- and twelve-month options at the 80 and 90 percent moneyness levels over the full sample, which extends from January 2, 1998 to March 19, 2013. The implied correlation ($p$) is approximated by Eq. (4.4.8): $p \approx \sigma_I^2 / (\sum_{i=1}^{n} w_i \sigma_i)^2$, where $\sigma_I^2$ is the squared implied volatility of an index option and $\sum_{i=1}^{n} w_i \sigma_i$ is the weighted average single stock implied volatility, as in Eq. (3.A.8l) of Appendix 3.A.2. Panel B reports the descriptive statistics for the average pair-correlations for the 50 largest constituents of the S&P 500 index calculated over the same sample, which extends from January 2, 1998 to March 19, 2013.

We also find that trading in the opposite direction of such evidently irrational investor behavior

has been very profitable, as implied correlations higher than 100 percent were very effective

as an entry point for contrarian strategies. Across the maturities and moneyness levels where

we can observe such biased behavior, a sentiment strategy that buys the equity market when

the implied correlation is above 100 percent and sells it when the implied correlation falls back

to 50 percent, yields an average net information ratio of 0.35, with information ratios ranging

from 0.27 to 0.52.
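The entry and exit rule just described can be sketched as a simple two-state signal. This is only an illustration under stated assumptions: the function name `contrarian_positions` is hypothetical, positions change on the same observation that triggers the signal, and transaction costs, position sizing and the information ratio calculation are left out.

```python
def contrarian_positions(implied_corr, entry=1.0, exit_level=0.5):
    """Return 1 (long equities) or 0 (flat) for each observation.

    Enter when the implied correlation exceeds `entry` (100 percent) and
    stay long until it falls back to `exit_level` (50 percent) or below.
    """
    position = 0
    positions = []
    for rho in implied_corr:
        if position == 0 and rho > entry:
            position = 1  # implied correlation above 100 percent: buy
        elif position == 1 and rho <= exit_level:
            position = 0  # implied correlation back at 50 percent: sell
        positions.append(position)
    return positions


# Hypothetical implied correlation path (not thesis data).
series = [0.6, 0.9, 1.05, 0.8, 0.7, 0.5, 0.45, 1.2]
positions = contrarian_positions(series)
print(positions)
```

The gap between the entry and exit thresholds keeps the position open while the implied correlation normalizes, rather than closing it as soon as the series dips back below 100 percent.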

The implied correlation means and medians provided by Panel A are far higher than the

same measures from realized average pair-correlations between the 50 largest constituents of the

S&P 500 index as of February 14, 2014, as provided in Panel B. Such average pair-correlations

range from 0.25 to 0.36 when look-back periods of 30, 60, 90, 180, and 720 days are evaluated,

which is substantially lower than most average implied correlations posted for the different

option maturity and moneyness levels reported in Panel A. In fact, the average realized correlations are often below the 10th percentile of the implied correlation for some options’ maturity and moneyness levels. The 90th percentile of realized correlations often matches the average implied correlations reported. The maximum realized correlations are at most 84 percent, using an extremely short look-back of 30 days, much lower than the 135 percent observed for implied correlations. These empirical findings strongly suggest that implied correlations substantially overshoot realized ones. Similarly, the implied correlation sometimes reaches values as low as three percent for some options, especially on the call side (above ATM moneyness). This level is also low when compared to put options: the minimum implied correlation from OTM puts is 0.18, whereas for call options it is 0.03. The fact that those extremely low values of the implied correlation from calls largely undershoot implied correlations from put options may also suggest less than fully rational pricing on the call side. It indicates that single stock options are expensive relative to index calls, which matches our postulation that individual investors use single stock calls to speculate on the upside.
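For reference, an average pair-correlation of the kind reported in Panel B can be computed as in the sketch below. It assumes a matrix of returns with one column per stock and uses the whole matrix as a single look-back window; the helper name `average_pair_correlation` and the toy returns are hypothetical.

```python
import numpy as np


def average_pair_correlation(returns):
    """Mean of all pairwise correlations between the columns of `returns`."""
    returns = np.asarray(returns, dtype=float)
    corr = np.corrcoef(returns, rowvar=False)     # N x N correlation matrix
    upper = np.triu_indices(corr.shape[0], k=1)   # each pair counted once
    return float(corr[upper].mean())


# Toy returns built so that the answer is known by construction:
# two columns move together and the third is their mirror image.
base = np.array([0.01, -0.02, 0.015, 0.003, -0.007])
returns = np.column_stack([base, 2.0 * base, -base])
avg_corr = average_pair_correlation(returns)
print(avg_corr)
```

Here the three pairwise correlations are +1, -1 and -1, so the average is -1/3. Applying the same function to rolling windows of 30 to 720 days would give one observation per date for each look-back period.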

Despite the strong evidence of irrational behavior by investors provided by the extreme

levels of implied correlation, which indirectly links to the IV skew being at extreme levels at

times, we conjecture that such phenomena may also have a risk-bearing explanation. Reversal

strategies such as the ones designed by us earn attractive long-term risk-adjusted returns, but

are highly dependent on equity markets at the tail (see Table 4.5, Panel C). Additionally, IV-

sentiment-based reversal strategies experience the largest daily drawdowns among all strategies

evaluated (see Table 4.5, Panel A). Thus, their attractive risk-adjusted returns are, partially,

compensation for downside risk. Therefore, the risk borne by investors that bet on reversals

in equity markets is the risk of poor timing of losses (Campbell and Cochrane, 1999; Harvey

and Siddique, 2000) and downside risk (Ang et al., 2006). In brief, betting on equity market

reversals is a risky activity.

We note that this rational explanation for excesses in sentiment is also linked to limits-to-arbitrage. The limits-to-arbitrage literature argues that, as investors have finite access to capital (Brunnermeier and Pedersen, 2009) and feedback trading can keep markets irrational for a long period of time (De Long et al., 1990), contrarian strategies aiming to exploit the effect of irrational trading are not without risk. For example, once bearish sentiment seems excessive, the risk of betting on a reversal may be tolerable only to a few investors, because 1) higher volatility drags investors’ risk budget usage closer to its limits, and 2) access to funding is limited. Thus, the ability to “catch a falling knife” in the equity markets is not suitable for all investors, as it involves high risk. Contrarian strategies are, then, mainly accessible to investors that have enough capital or funding liquidity. Similar considerations are career risk (Chan et al., 2002), negative skewness of returns (Harvey and Siddique, 2000), poor timing of losses (Campbell and Cochrane, 1999; Harvey and Siddique, 2000), and risk aversion of market makers (Garleanu et al., 2009). One final element in the characterization of reversals as a compensation for risk is the presence of correlation risk priced in index options (see Driessen et al., 2009, 2013; Krishnam and Ritchken, 2008; Jackwerth and Vilkov, 2015), which is present in assets that perform well when market-wide correlations are higher than expected.

4.5 Conclusion

End-users of OTM options tend to overweight small probability events, i.e., tail events. This

bias is strongly time-varying and present in both OTM index puts and single stock calls, due to institutional and individual investors’ trading activity, respectively. Individual investors

typically buy OTM single stock calls (“lottery tickets”) to speculate on the upside of equi-

ties (indicating bullish sentiment), whereas institutional investors typically buy OTM index

puts (portfolio insurance) to protect their large equity holdings (indicating bearish sentiment).

Hence, overweight of small probabilities derived from equity option prices should capture in-

vestors’ sentiment and, thus, potentially predict equity returns.

The parameters that directly capture overweight of small probabilities from option prices

such as the Delta (δ) and Gamma (γ) CPT parameters or the Delta minus Gamma spread

(as designed by us) are difficult to estimate. Because Delta minus Gamma spread is found

to be strongly linked to risk-neutral moments and IV skews, we circumvent these estimation

challenges by proposing a simplified but still informative sentiment proxy: IV-sentiment. The

uniqueness of our IV-sentiment measure is that it is jointly calculated from the IV of OTM

index puts and single stock call options. It aims to capture both bullish and bearish sentiment,

respectively, from individual investors and institutional investors’ trading in options.

We find that our IV-sentiment predicts mean-reversion better than the overweighting of

small probabilities parameter Delta minus Gamma spread. Contrarian-trading strategies using

our IV-sentiment measure produce economically significant risk-adjusted returns. The joint

use of information from the single stock and index option markets seems to be the reason for

the superior forecast ability of our IV-sentiment measure, because factors that use implied

volatility skews from a single market achieve significantly inferior results. The performance of

our IV-sentiment measure seems also more consistent in delivering a positive information ratio

than the Baker and Wurgler (2007) sentiment factor. Moreover, it is more positively skewed,

has a shorter horizon than the standard factor and allows for a daily strategy rebalancing.

Our IV-sentiment factor seems to forecast returns as well as other well-known predictors

of equity returns. Since it is uncorrelated to these predictors of the equity risk-premium, it

significantly improves the quality of predictive models, especially when such frameworks are

constrained, as in the terms of Campbell and Thompson (2008). The structure provided by

these constraints in addition to a simple forecast combination approach seems also to outperform

a “kitchen sink” model and a set of machine learning algorithms capable of exploring non-linearities in the data, applying regularization and tackling multicollinearity issues.

Further, the IV-sentiment strategy is little exposed to a set of widely used cross-sectional

equity factors, which includes Fama and French’s five-factors, the momentum factor (WML)

and the low-volatility factor (BAB). The link between the momentum factor (WML) and

IV-sentiment is found to be consistently negative. At the same time, these factors explain

very little variability of the IV-sentiment strategy. One implication of these findings is that

IV-sentiment could be employed as a Beta-timing tool by active equity managers. Another

implication is that WML and IV-sentiment seem to largely diversify each other and, thus,

prove beneficial for portfolio optimization.

The prediction of reversals seems to be further enhanced when the volatility skews priced by

OTM index puts and single stock calls are clearly irrational, e.g., when implied correlations are

higher than 100 percent. Timing market reversals using our IV-sentiment measure is, however,

not without risk. Reversal strategies, like ours, are exposed to large drawdowns, which likely

happen during “bad times”. Nevertheless, we find that combining our sentiment strategy with

other strategies, such as buy-and-hold the S&P 500 index, time-series momentum and cross-

sectional equity momentum can improve their risk-adjusted returns. Cross-sectional momentum

is the strategy that benefits the most when combined with our contrarian-sentiment strategy,

which is caused by these strategies being negatively correlated with each other and having low

tail dependence. This outcome is largely in line with the finding that WML and IV-sentiment

are strongly negatively correlated and indicates a promising avenue for future research on the

mitigation of momentum crashes by our measure.

Chapter 5

Predictable Biases in Macroeconomic Forecasts and Their Impact Across Asset Classes∗

5.1 Introduction

The presence of bias in analysts’ forecasts is a widely investigated topic. Early literature focuses

on the bias present in equity analysts’ forecasting of earnings per share (FEPS), and attempts to

explain why earnings estimates are systematically overoptimistic. De Bondt and Thaler (1990)

suggest that equity analysts suffer from a cognitive failure which leads them to overreact and

have too extreme expectations. At the same time, Mendenhall (1991) argues that underreaction

to past quarterly earnings and stock returns contributes to an overoptimistic bias in earnings.

Overreaction and underreaction as causes for an overoptimistic FEPS are, though, reconciled

by Easterwood and Nutt (1999), who argue that analysts underreact to negative earnings announcements but overreact to positive ones. Another branch of the literature on analysts’ forecasts proposes that this perceived bias is caused by strategic behavior, i.e., a rational bias. For instance, Michaely and Womack (1999) argue that equity analysts employed by

brokerage firms (underwriter analysts) often recommend companies that their employer has

recently taken public. In the same vein, Lim (2001) suggests that a rational bias exists within

corporate earnings forecasts because analysts trade off this bias to improve management access

(via positive forecasts) and forecast accuracy.

Only recently has the analysis of potential biases in macroeconomic forecasts received the same attention that the literature has given to FEPS. For instance, Laster et al. (1999) argue that forecasters have a dual goal: forecasting accuracy and publicity. Forecasters would depart from the consensus (which is typically accurate) when incentives related to their firms’ publicity outpace the wages received by being accurate. The authors find this trade-off to vary by industry. Ottaviani and Sorensen (2006) compare two theories of professional forecasting, which lead to either forecasts that are excessively dispersed or forecasts that are biased towards the prior mean (herding)². A drawback of this early literature on macroeconomic forecasts is that it fails to empirically test the direction and size of the bias, but mostly elucidates that dispersion of forecasts is plausible under different (sometimes stringent) assumptions. We also note that both previous papers focus on rational bias explanations for macroeconomic forecasts rather than on cognitive issues.

∗This chapter is based on Felix et al. (2017b). We thank seminar participants at the 2017 Econometrics and Financial Data Science workshop at the Henley Business School in Reading, at the APG Asset Management Quant Roundtable seminar in 2017, at the MAN-AHL Research Seminar in 2017, at the 2018 Annual Conference of the Swiss Society for Financial Market Research (SGF) in Zurich, at the 2018 EEA-ESEM Conference in Cologne and at the 2018 European Finance Association (EFA) in Warsaw for their helpful comments. We thank APG Asset Management and AHL Partners LLP for making available part of the data set.

To the best of our knowledge, Campbell and Sharpe (2009) is the first study to address

macroeconomic forecasts from both an empirical and behavioral bias approach. Their study

hypothesizes that experts’ consensus forecasts of economic releases are systematically biased

towards the previous release. This bias is consistent with the adjustment heuristic proposed

by Tversky and Kahneman (1974). This cognitive bias, commonly known as anchoring, is

characterized by the human propensity to rely too heavily on the initial value (the “anchor”) of

an estimation when updating forecasts. In other words, individuals tend to make adjustments

to original estimates that do not fully incorporate the newly available information. Thus,

anchoring underweights new information in favor of the “anchor”.

Campbell and Sharpe (2009) hypothesize that surprises over economic releases are predictable, as forecasters tend to underreact to new information. They find that the previous economic releases of 10 important US economic indicators explain up to 25 percent of the subsequent economic surprises³. Anchoring in forecasting seems not to be, however, restricted

to macroeconomic data releases. Cen et al. (2013) have shown that anchoring also plays a

significant role in FEPS of firms by stock analysts. Their study suggests that analysts tend to

issue optimistic (pessimistic) forecasts when the firms’ FEPS is lower (higher) than the industry

median.

Further, Zhang (2006) investigates the link between anchoring, underreaction and informa-

tion uncertainty. The author builds on the earlier post-earnings-announcement-drift (PEAD)

literature (see, e.g., Stickel, 1991), which states that analysts underreact to new information

when revising their forecasts due to behavioral biases, such as conservatism (Ward, 1982) or

overconfidence (Daniel et al., 1998). He suggests that a greater dispersion (disagreement) in analysts’ FEPS, which forms his proxy for information uncertainty, contributes to a large degree of analyst underreaction. Consequently, in an environment of high dispersion of FEPS, or for firms with greater information uncertainty, analysts will tend to incur larger positive (negative) forecast errors and larger subsequent forecast revisions following good (bad) news.

Capistran and Timmermann (2009) argue that the causality between underreaction and disagreement depicted by Zhang (2006) may also work the other way around. They argue that, as forecasters have asymmetric and differing loss functions, they react differently to macroeconomic news. In doing so, forecasters update their predictions in different ways and at different points in time as a reaction to the same news flow, giving rise to forecast disagreement. In line with Capistran and Timmermann (2009), Mankiw and Thomas (1997) suggest that, as there are costs involved in gathering information and making adjustments to forecasts, experts underreact to recent news and only update their predictions periodically. Thus, in such a sticky-information model for forecast adjustments, only part of the pool of forecasters would update their predictions at each period, which also contributes to dispersion of forecasts and information uncertainty. Interestingly, Zarnowitz and Lambros (1987) and Lahiri and Sheng (2010) use dispersion of forecasts as a measure of forecast uncertainty, not information uncertainty.

²Ottaviani and Sorensen (2006) build on the reputational herding model of Scharfstein and Stein (1990), who suggest that forecasters (investment managers in their case) mimic the decisions of others and ignore substantive private information, mostly due to concerns about their reputation in the labor market.

³However, it is as yet unclear whether this bias has a behavioral nature or is led by professional forecasters’ strategic incentives.

When attempting to find predictive value in disagreement measures among forecasters, Leg-

erstee and Franses (2015) use the standard deviation of forecasts and the 5th and 95th percentile

of survey forecasts to predict macroeconomic fundamentals. The 5th and 95th percentiles of sur-

vey forecasts (especially when used in combination with the mean or median) are, arguably,

proxies for the skewness of forecasts, which is explicitly explored by Colacito et al. (2016) and

Truong et al. (2016). In their study, Colacito et al. (2016) use skewness of expected macroeconomic fundamentals to predict expected returns, whereas Truong et al. (2016) use the skewness of FEPS survey data to predict quarterly earnings.

Finally, Legerstee and Franses (2015) use the number of forecasts collected, a proxy for “attention”, as a predictor of future macroeconomic releases. Arguably this popularity measure

could be used as a direct predictor of macroeconomic data, as these authors do, but it could

also be employed as a weighting scheme to test whether the pervasiveness of biases fluctuates

with attention.

Hence, because anchoring is to some extent linked to other inherent properties of the pool

of forecasts, as the above literature demonstrates, we hereby investigate other potential biases

that might be embedded in macroeconomic consensus forecasts. The main hypothesis of this

chapter is that, beyond anchoring, these inefficiencies are informative in predicting economic

surprises. As the literature suggests, such biases are expressed by moments of the distribution

of macroeconomic forecasts, such as the disagreement among forecasters (second moment) and

skewness of forecasts (third moment). As market prices react to the information flow, economic

surprise predictability might give rise to return predictability, as reported by Campbell and

Sharpe (2009) and Cen et al. (2013). As a consequence, we conjecture that economic surprises

as well as asset returns around these releases are predictable.

Our contribution to the literature on forecasting bias is four-fold. First, we identify new

biases in experts’ expectations (over and above the anchoring bias), which are statistically

significant predictors of economic surprises. More specifically, we are the first to empirically

validate the rational bias hypothesis of Laster et al. (1999) and Ottaviani and Sorensen (2006)

in a large multi-country data set of macroeconomic releases. Within such models, forecasters

possess private information which is unveiled via the skewness of the distribution of forecasts.

Second, by using a popularity measure per economic indicator and by expanding the number

of countries/regions and indicators tested vis-a-vis Campbell and Sharpe (2009), we advocate

that the prevalence of biases is related to attention. This finding is supported by the fact

that as we move from very popular economic releases, such as the Non-farm payrolls (NFP)

employment number, Retail Sales and Consumer Confidence towards less watched indicators,

biases become less pervasive. The same effect is observed when we compare our results for

the US to those in other countries, in which economic indicators are forecasted by far fewer

experts. Third, we confirm the hypothesis that, by predicting economic surprises, one can

predict asset returns around macroeconomic announcements. We find that expected economic

surprises can largely predict the direction of market responses around data releases in-sample

and, to a lesser degree, out-of-sample. Hence, the expected component of surprises can explain

market responses, whereas previous research (see Campbell and Sharpe, 2009) suggests that

markets only respond to the unpredictable component of surprises. The explanatory power

and predictability achieved by our models are higher for local equity and bond markets than

for foreign markets, currencies and commodities, which is intuitive, as those markets are the

ones more intrinsically linked to the fundamentals being revealed by macroeconomic indicators.

On an out-of-sample basis, point forecasting is better performed by non-linear machine learning

models as they seem to capture the dynamics of market responses around macroeconomic

announcements better than linear regression models. Fourth, we are the first to recognize that

a regret bias (see Loomes and Sugden, 1982; Bell, 1982) might influence how asset markets react

to macroeconomic surprises.

The four key implications of our research are: 1) a better understanding of the “market

consensus” and of the informational content of higher moments of the distribution of macroe-

conomic forecasts by regulators, policy makers and market participants; 2) the challenge of

standard weighting schemes used in economic surprise indexes, which, we reckon, can be im-

proved by changing from “popularity” (or “attention”)-weighted to unweighted; 3) the proposi-

tion that advanced statistical learning techniques should be used to refine the forecast of market

responses amid macroeconomic releases and 4) the opening of a new stream in the literature to

investigate regret effects in asset responses around announcements of forecasted figures.

The remainder of this chapter is organized as follows. Section 5.2 provides a generic formu-

lation of research applied to forecast biases. Section 5.3 describes the data and methodology

employed in our study. Section 5.4 presents our empirical analysis and Section 5.5 concludes.

5.2 Forecast biases, anchoring and rationality tests

Let us first introduce the generic formulation of research applied to forecast biases, as used

by Aggarwal et al. (1995), Schirm (2003) and Campbell and Sharpe (2009). In brief, this

formulation consists of a rationality test in which macroeconomic forecasts are assessed for the properties of rational expectations. Such assessment is done, in its basic format, by running regressions with the actual release, $A_t$, as the explained variable, and the most recent forecast, $F_t$, as the explanatory variable, as follows:

$$A_t = \beta_1 F_t + \varepsilon_t, \tag{5.2.1}$$

Rationality holds when $\beta_1$ is not significantly different from unity, while a $\beta_1$ significantly higher (lower) than one suggests a structural downward (upward) bias of forecasts. Observing serial correlation in the error term would also suggest irrationality, as one would be able to forecast $A_t$ using an autoregressive model.
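A minimal sketch of this rationality test on simulated data is given below. The data-generating values (a true slope of 0.8 and the noise scale) are assumptions made purely for illustration, and plain OLS standard errors are used for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated forecasts and releases; the true beta_1 of 0.8 is an assumption.
forecast = rng.normal(0.0, 1.0, n)
actual = 0.8 * forecast + rng.normal(0.0, 0.5, n)

# No-intercept OLS of the actual release on the forecast, as in Eq. (5.2.1).
beta1 = float(forecast @ actual / (forecast @ forecast))
resid = actual - beta1 * forecast
sigma2 = float(resid @ resid) / (n - 1)
se_beta1 = float(np.sqrt(sigma2 / (forecast @ forecast)))

# Rationality corresponds to beta_1 = 1, so the t-statistic is taken against unity.
t_stat_unity = (beta1 - 1.0) / se_beta1
print(round(beta1, 3), round(t_stat_unity, 1))
```

With a slope estimate well below one and a strongly negative t-statistic against unity, the simulated forecasts would be classified as structurally upward biased in the sense described above.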

An alternative and more intuitive formulation of this rationality test, as suggested by Campbell and Sharpe (2009), can be achieved by subtracting the forecast from both sides of Eq. (5.2.1):

$$S_t \equiv A_t - F_t = \beta_2 F_t + \varepsilon_t, \tag{5.2.2}$$

so that $\beta_2 = \beta_1 - 1$. This manipulation yields the forecast error or the “surprise”, $S_t$, as the new explained variable, which is still dependent on forecast values. In Eq. (5.2.2), rationality holds when $\beta_2$ is not significantly different from zero; otherwise, a structural bias is perceived. For the specific case of anchoring, we can dissect the forecast bias using the following model:

$$F_t = \lambda E[A_t] + (1 - \lambda) A, \tag{5.2.3}$$

where $E[A_t]$ is the forecaster’s unbiased prediction, and $A$ is the anchor, which equals the value of the previous release of the indicator of interest. In such a model, if $\lambda < 1$ so that $1 - \lambda > 0$, then the consensus forecast is anchored to the previous release of the indicator. If $\lambda = 1$, no anchor is observed. By applying expectations to Eq. (5.2.2) and substituting $E[A_t] = E[S_t] + F_t$ into Eq. (5.2.3), we obtain Eq. (5.2.4a) and, after some manipulations, Eq. (5.2.4d):

$$F_t = \lambda (E[S_t] + F_t) + (1 - \lambda) A, \tag{5.2.4a}$$

$$\lambda E[S_t] = F_t - \lambda F_t - A + \lambda A, \tag{5.2.4b}$$

$$E[S_t] = \frac{F_t - \lambda F_t - A + \lambda A}{\lambda}, \tag{5.2.4c}$$

$$E[S_t] = \frac{(1 - \lambda)(F_t - A)}{\lambda}, \tag{5.2.4d}$$

assuming $\gamma = \frac{1 - \lambda}{\lambda}$ and adding an intercept ($\alpha$) we find⁴:

$$S_t = \alpha + \gamma (F_t - A) + \varepsilon_t, \tag{5.2.5a}$$

$$S_t = \alpha + \gamma \, ESA_t + \varepsilon_t, \tag{5.2.5b}$$

which reveals a direct test of anchoring, identified when the $\gamma$ coefficient is positive, where $ESA_t \equiv F_t - A$ is the expected surprise given the presence of an anchor.

⁴The above derivation builds fully on the work of Campbell and Sharpe (2009). The only difference between our approach and theirs lies in the fact that they consider the anchor to be the average value of the forecasted series over a number (h�3) of previous releases, whereas our anchor variable relies only on the previous release (h=1). Robustness tests for h>1 will be provided in future versions of this study.
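The anchoring test can be checked on simulated data, as in the sketch below. All data-generating choices are assumptions made for illustration: an anchoring weight of $\lambda = 0.8$, an autoregressive release series, and unit-variance noise. Under these assumptions the regression slope should be close to $\gamma = (1 - \lambda)/\lambda = 0.25$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 20000, 0.8  # lam is the assumed weight on the unbiased prediction

prev = 0.0                 # anchor A: the previous release
gap = np.empty(n)          # F_t - A, the expected surprise ESA_t
surprise = np.empty(n)     # S_t = A_t - F_t
for t in range(n):
    unbiased = 0.5 * prev + rng.normal()          # E[A_t], known to the forecaster
    actual = unbiased + rng.normal()              # realized release A_t
    forecast = lam * unbiased + (1 - lam) * prev  # anchored forecast, Eq. (5.2.3)
    gap[t] = forecast - prev
    surprise[t] = actual - forecast
    prev = actual

# OLS with intercept, as in Eq. (5.2.5a): S_t = alpha + gamma (F_t - A) + e_t.
gamma_hat, alpha_hat = np.polyfit(gap, surprise, 1)
lambda_hat = 1.0 / (1.0 + gamma_hat)  # invert gamma = (1 - lambda) / lambda
print(round(gamma_hat, 3), round(lambda_hat, 3))
```

A positive slope estimate signals anchoring, and inverting it recovers the implied weight on the unbiased prediction.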


In this study, we mostly employ ordinary least squares (OLS) regression analysis with Newey-West adjusted standard errors, with the goal of offering interpretability to our results. More advanced statistical learning techniques are employed but their usage is restricted to Section 5.4.3.
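For readers unfamiliar with the adjustment, the sketch below shows one common way to compute OLS estimates with Newey-West standard errors using a Bartlett kernel. It is an illustration and not our exact implementation: the helper name `ols_newey_west`, the lag length of four, the absence of a small-sample correction and the simulated data are all assumptions.

```python
import numpy as np


def ols_newey_west(y, X, lags):
    """OLS coefficients and Newey-West (Bartlett kernel) standard errors."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    xu = X * resid[:, None]                # x_t * u_t for every observation
    s = xu.T @ xu                          # lag-0 term (heteroskedasticity only)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett weights decline linearly
        gamma = xu[lag:].T @ xu[:-lag]     # autocovariance of the scores
        s += weight * (gamma + gamma.T)
    cov = xtx_inv @ s @ xtx_inv            # sandwich covariance estimator
    return beta, np.sqrt(np.diag(cov))


# Simulated regression with an intercept (illustrative values only).
rng = np.random.default_rng(2)
n = 500
x = rng.normal(size=n)
y = 0.1 + 0.3 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta, se = ols_newey_west(y, X, lags=4)
print(beta, se)
```

With `lags=0` the estimator reduces to heteroskedasticity-robust (White) standard errors, which gives a simple way to sanity-check the implementation.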

We use macroeconomic release data from the ECO function in Bloomberg in our analysis.

These data comprise time-stamped real-time released figures for 43 distinct US macroeconomic indicators, as well as information on forecasters’ expectations for each release. See Table 5.1 for an overview of these indicators. This expectations information comprises 1) the previous economic release, 2) the cross-sectional standard deviation of forecasts, 3) the lagged median survey expectations, and 4) the skewness in economists’ forecasts, calculated as

the mean minus median survey expectations. We use similar data sets for Continental Europe,

the United Kingdom and Japan for robustness testing. Our daily data set spans the period

from January 1997 to December 2016, thus covering 4,422 business days and 21,048 individual

announcements. The consensus forecast is the forecast median, in line with Bloomberg’s (and

most other studies’) definition.
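To make these definitions concrete, the sketch below computes the consensus, disagreement, skewness and surprise for a single release. The function name `forecast_moments`, the dictionary keys and the numbers are hypothetical; the use of the sample (rather than population) standard deviation is also an assumption.

```python
import statistics


def forecast_moments(forecasts, actual, previous):
    """Summary measures of one release's pool of forecasts."""
    consensus = statistics.median(forecasts)          # consensus = forecast median
    mean = statistics.fmean(forecasts)
    return {
        "consensus": consensus,
        "disagreement": statistics.stdev(forecasts),  # cross-sectional std. dev.
        "skew": mean - consensus,                     # mean minus median
        "surprise": actual - consensus,               # release minus consensus
        "anchor_gap": consensus - previous,           # consensus minus previous release
    }


# Hypothetical payroll-style release, in thousands of jobs (not Bloomberg data).
moments = forecast_moments(
    forecasts=[180, 190, 200, 210, 260], actual=230, previous=175
)
print(moments)
```

In this example one optimistic forecaster pulls the mean above the median, so the skewness measure is positive while the consensus itself is unaffected.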

We note that the economic indicators tracked are released at different frequencies and throughout the month. This asynchronicity among indicators poses some challenges to process

the information flow coming from them and to jointly test for the predictability of surprises.

Therefore, predictability is separately tested for each indicator, and results are subsequently

aggregated.

As we intend to use states of the economy as a control variable in our empirical analysis, we have also implemented the Principal Component Analysis (PCA)⁵-based nowcasting method of Beber et al. (2015) using the same 43 distinct US macroeconomic indicators. Their nowcasting method allows us to access the real-time growth and inflation conditions present at the time of any economic release⁶. Table 5.1 provides details on stationarity adjustments, directional adjustments, frequency of release, starting publication date for the series, and (common) release time. Finally, we also use the 12-month change in stock market prices (i.e., the S&P 500 index prices) and the VIX index in order to proxy for wealth effects and risk-appetite, respectively, as additional control variables in our empirical analysis.

⁵PCA is an unsupervised machine learning method that transforms correlated variables into a set of orthogonal (linearly independent) variables, so-called principal components.

⁶The Beber et al. (2015) nowcasting method splits indicators among 4 categories (i.e., output, employment, sentiment, and inflation). We follow the same classification but we aggregate the output, employment and sentiment indicators into a single category, i.e., growth. As our set of indicators perfectly matches the ones of Beber et al. (2015), this attribution exercise is straightforward. The only nuance that distinguishes our nowcasting method from these authors’ is that we use a single parameter to adjust for the non-stationarity of some series. Beber et al. (2015) adjust series using one-month and twelve-month changes, whereas we use six-month changes across all non-stationary indicators.
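The core PCA step behind such a nowcast can be sketched as follows. This is a generic first-principal-component extraction on a balanced, simulated panel and not the full real-time procedure of Beber et al. (2015), which also has to handle asynchronous releases; the function name `first_principal_component` and the simulated common factor are assumptions.

```python
import numpy as np


def first_principal_component(panel):
    """Scores of the first principal component of a T x N indicator panel."""
    panel = np.asarray(panel, dtype=float)
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)  # standardize each indicator
    _, _, vt = np.linalg.svd(z, full_matrices=False)      # rows of vt are loadings
    return z @ vt[0]


# Five simulated indicators driven by one common "growth" factor plus noise.
rng = np.random.default_rng(3)
factor = rng.normal(size=500)
panel = factor[:, None] + 0.5 * rng.normal(size=(500, 5))

pc1 = first_principal_component(panel)
corr = np.corrcoef(pc1, factor)[0, 1]
print(round(abs(corr), 3))
```

The sign of a principal component is arbitrary, which is why the comparison with the simulated factor uses the absolute correlation; in practice the sign would be fixed using the directional adjustments listed in Table 5.1.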

Table 5.1: Overview of US macro releases

# Indicator name Type Start Frequency Release time Direction Stationary

1 US Initial Jobless Claims SA Growth 31/12/96 W 14:30:00 GMT -1 No

2 US Employees on Nonfarm Payroll Growth 02/01/97 M 14:30:00 GMT 1 No

3 U-3 US Unemployment Rate Total Growth 07/01/97 M 14:30:00 GMT -1 No

4 US Employees on Nonfarm Payroll Manuf. Growth 08/01/97 M 14:30:00 GMT 1 Yes

5 US Continuing Jobless Claims SA Growth 09/01/97 W 14:30:00 GMT -1 No

6 ADP National Employment Report Growth 09/01/97 M 14:15:00 GMT 1 No

7 US Average Weekly Hours All Employees Growth 10/01/97 M 14:30:00 GMT 1 No

8 US Personal Income MoM SA Growth 10/01/97 M 14:30:00 GMT 1 Yes

9 ISM Manufacturing PMI SA Growth 14/01/97 M 16:00:00 GMT 1 Yes

10 US Manufacturers New Orders Total Growth 14/01/97 M 16:00:00 GMT 1 Yes

11 Federal Reserve Consumer Credit Growth 16/01/97 M 21:00:00 GMT 1 No

12 Merchant Wholesalers Inventories Growth 17/01/97 M 16:00:00 GMT 1 Yes

13 US Industrial Production MOM SA Growth 17/01/97 M 15:15:00 GMT 1 Yes

14 GDP US Chained 2009 Dollars QoQ Growth 28/01/97 Q 14:30:00 GMT 1 Yes

15 US Capacity Utilization % of Total Growth 03/02/97 M 15:15:00 GMT 1 Yes

16 US Personal Consumption Expenditures Growth 03/02/97 M 14:30:00 GMT 1 Yes

17 US Durable Goods New Orders Ind. Growth 25/02/97 M 14:30:00 GMT 1 Yes

18 US Auto Sales Domestic Vehicle Growth 04/03/97 M 23:00:00 GMT 1 No

19 Adjusted Retail & Food Service Growth 26/03/97 M 14:30:00 GMT 1 Yes

20 Adjusted Retail Sales Less Autos Growth 03/07/97 M 14:30:00 GMT 1 Yes

21 US Durable Goods New Orders Total Growth 16/07/97 M 14:30:00 GMT 1 Yes

22 GDP US Personal Consumption Change Growth 12/08/97 Q 14:30:00 GMT 1 Yes

23 ISM Non-Manufacturing PMI Growth 26/11/97 M 16:00:00 GMT 1 No

24 US Manufacturing & Trade Inventories Growth 12/12/97 M 16:00:00 GMT -1 Yes

25 Philadelphia Fed Business Outlook Growth 13/08/98 M 16:00:00 GMT 1 Yes

26 MNI Chicago Business Barometer Growth 08/01/99 M 16:00:00 GMT 1 Yes

27 Conference Board US Leading Ind. Growth 14/05/99 M 16:00:00 GMT 1 Yes

28 Conference Board Consumer Conf. Growth 01/07/99 M 16:00:00 GMT 1 No

29 US Empire State Manufacturing Growth 13/06/01 M 14:30:00 GMT 1 Yes

30 Richmond Federal Reserve Manuf. Growth 13/06/01 M 16:00:00 GMT 1 Yes

31 ISM Milwaukee Purchasers Manuf. Growth 28/12/01 M 16:00:00 GMT 1 Yes

32 University of Michigan Consumer Sent. Growth 25/07/02 M 16:00:00 GMT 1 No

33 Dallas Fed Manufacturing Outlook Growth 15/11/02 M 16:30:00 GMT 1 Yes

34 US PPI Finished Goods Less Food & En. Inflation 30/01/03 M 14:30:00 GMT 1 Yes

35 US CPI Urban Consumers MoM SA Inflation 30/04/04 M 14:30:00 GMT 1 Yes

36 US CPI Urban Consumers Less Food & En. Inflation 26/05/05 M 14:30:00 GMT 1 Yes

37 Bureau of Labor Statistics Employment Inflation 30/06/05 Q 14:30:00 GMT 1 Yes

38 US Output Per Hour Nonfarm Business Inflation 25/10/05 Q 14:30:00 GMT -1 Yes

39 US PPI Finished Goods SA MoM% Inflation 02/08/06 M 14:30:00 GMT 1 Yes

40 US Import Price Index by End User Inflation 31/07/07 M 14:30:00 GMT 1 Yes

41 US GDP Price Index QoQ SAAR Inflation 05/02/08 Q 14:30:00 GMT 1 Yes

42 US Personal Con. Exp. Core MOM SA Inflation 26/01/09 M 14:30:00 GMT 1 Yes

43 US Personal Cons. Exp. Price YOY SA Inflation 05/02/10 M 14:30:00 GMT 1 Yes

This table reports the 43 US macroeconomic indicators used in our main analysis. Indicators are classified as either growth or inflation related. Column Start reports the date that the time series of each macroeconomic indicator begins. Column Frequency reports in which frequency the indicator is released, where Q stands for quarterly, M stands for monthly and W stands for weekly. Release time reports the typical (most frequent) release time of the indicator in GMT time. Direction states the potential directional adjustment, represented by -1 when the given indicator reports a quantity that is inversely related to growth or inflation. The column Stationary shows if an indicator's series is stationary; a stationary adjustment (i.e., towards 6 months differences) is applied within our data manipulation step so the series can be modelled using our methodology.


5.3.1 Economic surprise predictive models

Following Eq. (5.2.5b), we extend the anchor-only predictive model for economic surprises by incorporating moments of the distribution of macroeconomic forecasts and the control variables stated above. The moments of the distribution of macroeconomic forecasts added are 1) the lagged median forecast (first moment); 2) the disagreement among forecasters (second moment); and 3) the skewness of forecasts (third moment). Eq. (5.3.1) is our unrestricted economic surprise model (UnES model):

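$$S_t = \alpha + \text{ESA}_{\phi} + \text{SurvLag}_{\phi} + \text{Std}_{\phi} + \text{Skew}_{\phi} + \text{Infl}_{\phi} + \text{Growth}_{\phi} + \text{Stocks}_{\phi} + \text{VIX}_{\phi} + \varepsilon_t, \tag{5.3.1}$$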

where subscript ϕ (used hereafter) is t-1, ESA is the expected surprise given anchor7, SurvLag is the lagged consensus forecast (the previous median of economic forecasts), Std is the dispersion (standard deviation) of economic estimates across forecasters, and Skew is the skewness of economic estimates across forecasters. SurvLag, Std and Skew are the three variables selected to test our hypothesis that alternative measures inherent to the pool of economic forecasts can reflect biases in expectations of economic releases. More specifically, we use SurvLag to test whether an anchor towards the previous consensus forecast exists. We employ Std to test for the effect of forecasters' disagreement and information uncertainty on the predictability of economic surprises, in line with Zhang (2006). Skew is used to test for the presence of strategic behavior and rational bias in macroeconomic forecasting, in line with the forecasters' dual-goal hypothesis of forecasting accuracy and publicity as discussed in Laster et al. (1999) and Ottaviani and Sorensen (2006). Infl and Growth are the states of inflation and economic growth produced by the nowcasting method implemented. Stocks and VIX are the stock market returns and implied volatility, respectively. Infl, Growth, Stocks and VIX are control variables in our model.
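As an illustration of the three forecast-pool regressors, the sketch below computes the median, dispersion and skewness of a single survey. It is only a minimal example under stated assumptions: the function name `forecast_moments` and the survey values are hypothetical, and the population standard deviation and the moment coefficient of skewness are assumed definitions, since the text does not state which estimators are used. SurvLag would then be the median obtained from the previous survey.

```python
import statistics

def forecast_moments(forecasts):
    """Return (median, dispersion, skewness) for one survey's pool of forecasts."""
    n = len(forecasts)
    median = statistics.median(forecasts)
    mean = statistics.fmean(forecasts)
    std = statistics.pstdev(forecasts)  # population standard deviation (assumed definition)
    if std == 0:
        return median, 0.0, 0.0  # all forecasters agree: no dispersion, no skew
    # Moment coefficient of skewness: third central moment divided by std cubed
    skew = sum((x - mean) ** 3 for x in forecasts) / (n * std ** 3)
    return median, std, skew

# Hypothetical survey: most forecasters cluster near 200, one issues a bold high call
survey = [190.0, 195.0, 200.0, 200.0, 205.0, 260.0]
median, std, skew = forecast_moments(survey)
print(f"median={median:.1f} std={std:.2f} skew={skew:.2f}")
```

In this hypothetical pool the single bold forecast leaves the median unchanged but produces a positive skew, which is the kind of asymmetry the Skew regressor is meant to capture.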

5.3.2 Market response predictive models

Once predictive models of economic surprises are estimated via Eqs. (5.2.5b) and (5.3.1), we

use the predictions to explain market responses between one minute before and one minute

after ([t-1, t+1]) the release-time (t) of macroeconomic data. We do this by using the expected

economic surprise produced by the different economic surprise models as an explanatory variable

to forecast returns. We use three types of market response predictive models: 1) the anchor-

only model, in which the expected surprise given anchor (ESA) is the only predictor of economic

surprises, thus Eq. (5.3.1) with only one explanatory variable; 2) the unrestricted model, using

all explanatory variables stated by Eq. (5.3.1); and 3) the unrestricted-extended response

7 The coefficient γ is excluded from this model representation and subsequent ones for conciseness of presentation. We use the subscript ϕ (i.e., t − 1) to clearly state that the model is predictive. In reality, the subscript t would still suggest a prediction as most macroeconomic indicator surveys close for forecast submission days before the economic release. For the case of Bloomberg, surveys close one business day prior to the data announcement.


model, which entails the unrestricted model extended with a set of exogenous variables. The

generic formulation of the two expected surprise-based models used is given by Eq. (5.3.2),

whereas Eq. (5.3.3) specifies the 1) anchor-only model, as follows:

$$R_t = \omega + E(S_t) + \varepsilon_t, \tag{5.3.2}$$

$$R_t = \omega + E(\alpha + \gamma\,\text{ESA}_t) + \varepsilon_t, \tag{5.3.3}$$

where Rt is the market response calculated around the interval [t-1, t+1], thus the one minute

before and one minute after the time the economic data is made available, and E(St), the

expected surprise, is derived from Eq. (5.2.5b).
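The two-step logic of the anchor-only response model can be sketched as follows. This is a simplified illustration, not the estimation code behind the results: the helper `ols_fit` and all data are hypothetical, both stages are fitted in-sample for brevity, and the slope `beta` on the expected surprise is an assumption, since Eq. (5.3.2) is printed without coefficients.

```python
def ols_fit(x, y):
    """Univariate OLS: return (intercept, slope) of y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data, one entry per past release of a single indicator
esa = [0.4, -0.2, 0.1, -0.5, 0.3, 0.0]        # expected surprise given anchor
surprise = [0.3, -0.1, 0.2, -0.4, 0.1, 0.1]   # realised surprise S_t
response = [6.0, -3.0, 2.5, -7.0, 3.0, 0.5]   # return over [t-1, t+1], in bps

# Stage 1: surprise model, S_t = alpha + gamma * ESA_t + e_t
alpha, gamma = ols_fit(esa, surprise)
expected_surprise = [alpha + gamma * e for e in esa]

# Stage 2: response model, R_t = omega + beta * E(S_t) + e_t (beta is an assumed slope)
omega, beta = ols_fit(expected_surprise, response)
print(f"gamma={gamma:.2f} beta={beta:.2f}")
```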

The unrestricted response model is specified by Eq. (5.3.4):

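$$R_t = \omega + E(\underbrace{\alpha + \text{ESA}_{\phi} + \text{SurvLag}_{\phi} + \text{Std}_{\phi} + \text{Skew}_{\phi} + \text{Infl}_{\phi} + \text{Growth}_{\phi} + \text{Stocks}_{\phi} + \text{VIX}_{\phi}}_{\text{Unrestricted economic surprise model (UnES)}}) + \varepsilon_t. \tag{5.3.4}$$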

Eq. (5.3.5) provides a generic formulation of the unrestricted-extended response model because we implement it not only as an OLS regression but also as a Ridge regression and as a Random forest model8:

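$$R_t = \omega + E(\text{UnES})_{\phi} + \text{ESA}_{\phi} + \text{SurvLag}_{\phi} + \text{Std}_{\phi} + \text{Skew}_{\phi} + \text{Infl}_{\phi} + \text{Growth}_{\phi} + \text{Stocks}_{\phi} + \text{VIX}_{\phi} + \sum_{t=s}^{S} R_{t-s} + \varepsilon_t, \tag{5.3.5}$$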

where UnES is the outcome of the unrestricted economic surprise model of Eq. (5.3.1) and s=[5,

10, 20, 30, 40, 50, 60] minutes. For the Ridge regression model, we tune the shrinkage hyper-

parameter (φ, typically called λ) via cross-validation using three splits of the train data set.

For Random forest, we first run a cross-validation step for feature selection (using variable

importance as guidance) and, then, tune the model by minimizing out-of-bag (OOB) errors9 to

obtain the parameter m for the number of random features considered at each branch split10.

We allow the Random forest model to grow 500 trees per run.
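The tuning of the shrinkage hyper-parameter via three-split cross-validation can be illustrated with a deliberately reduced example. The sketch below uses a single predictor, for which the ridge slope has the closed form Σxy / (Σx² + λ) on centred data; the actual model has many regressors. The function names, the candidate grid, the contiguous fold layout and the data are all assumptions made for illustration.

```python
def ridge_fit(x, y, lam):
    """One-predictor ridge: the slope is shrunk by lam, the intercept is not penalised."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / (sxx + lam)
    return mean_y - slope * mean_x, slope

def cv_mse(x, y, lam, n_splits=3):
    """Mean squared validation error over contiguous folds of the training data."""
    n = len(x)
    bounds = [k * n // n_splits for k in range(n_splits + 1)]
    sq_err = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Fit on everything outside the fold, validate on the fold itself
        x_fit, y_fit = x[:lo] + x[hi:], y[:lo] + y[hi:]
        a, b = ridge_fit(x_fit, y_fit, lam)
        sq_err += sum((y[i] - (a + b * x[i])) ** 2 for i in range(lo, hi))
    return sq_err / n

def tune_lambda(x, y, grid):
    """Pick the candidate lambda with the lowest cross-validated error."""
    return min(grid, key=lambda lam: cv_mse(x, y, lam))

# Hypothetical training data: expected surprises and market responses (bps)
x = [0.2, -0.1, 0.4, -0.3, 0.1, 0.5, -0.4, 0.0, 0.3]
y = [2.5, -0.5, 3.0, -4.0, 2.0, 4.5, -3.0, -1.0, 1.5]
grid = [0.0, 0.01, 0.1, 1.0]
best_lambda = tune_lambda(x, y, grid)
print(best_lambda)
```

Setting λ to zero recovers the OLS fit, so the grid search nests the unpenalised model as a special case.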

We calculate the market responses across equity, treasury, currency, and commodity mar-

kets. More specifically, we use the following instruments: S&P500 index future, Euro-Stoxx

index future, FTSE100 index future, 2-year US Treasury Note future, 2-year Bund future, 10-

year Gilt future, Oil WTI future, Gold future, Copper future, GBPUSD forwards, JPYUSD

8 See Appendix 5.A for details on the Ridge regression and Random forest models.

9 The usage of out-of-bag errors is an efficient replacement for cross-validation for tuning methods that rely on bootstrap to reduce the variance of a learning method. As such methods already make use of a bootstrapped subset of the observations to fit the model, whereas another subset of the observations is unused, the latter subset (the so-called out-of-bag (OOB) observations) can be used to calculate prediction errors, thus called OOB errors.

10 Given the relatively small number of observations available in our data set, we apply 100 repeats of our OOB-based tuning approach to obtain m, which is selected as the mode of the optimal m across all repeats.


forwards, CHFUSD forwards, AUDUSD forwards, EURUSD forwards, and CADUSD forwards.

The market responses are calculated and used in our analysis for the entire history available

per market instrument11,12.

5.4 Empirical analysis and results

We split our empirical analysis and results section into five parts. Section 5.4.1 reports the results of predictive models for economic surprises. Section 5.4.2 dissects market responses

as cumulative average returns (CAR) across multiple time-frames. Section 5.4.3 reports our

findings of market response predictive models. Section 5.4.4 evaluates the presence of regret

effects around macroeconomic announcements. Section 5.4.5 checks for the robustness of our

findings.

5.4.1 Predicting economic surprises

In this section we report our findings from Eqs. (5.2.5b) and (5.3.1), i.e., the anchor-only

(restricted) model and the unrestricted model, respectively, which we use to forecast economic

surprises. Table (5.2) reports aggregated results of these models across all 43 distinct US

macroeconomic indicators analyzed. We evaluate the sign consistency (with our expectations)

and the statistical strength of the individual regressors by computing the percentage of times

that the coefficients are positive (as expected) and statistically significant at the ten percent

level across regressions run separately for each economic indicator. The model quality is evaluated using explanatory power (R2) as well as the Akaike Information Criterion (AIC) per individual (economic indicator's) regression.
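The aggregation across indicator-specific regressions can be sketched as below. The example is hypothetical: the function name `coefficient_shares` and the coefficient and p-value pairs are invented for illustration, and only the ten percent significance threshold comes from the text.

```python
def coefficient_shares(results, level=0.10):
    """results: one (coefficient, p_value) pair per indicator-specific regression."""
    n = len(results)
    significant = sum(p < level for _, p in results) / n
    positive = sum(c > 0 for c, _ in results) / n
    both = sum(c > 0 and p < level for c, p in results) / n
    return {
        "significant": significant,
        "positive": positive,
        "positive_and_significant": both,
    }

# Hypothetical ESA estimates from four indicator regressions
esa_results = [(0.30, 0.01), (0.10, 0.20), (-0.05, 0.04), (0.20, 0.08)]
shares = coefficient_shares(esa_results)
print(shares)
```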

Table (5.2) suggests that the anchor-only model estimates confirm the general finding of the previous literature that the expected surprise given the anchor (ESA) is a strong predictor of economic surprises. We observe that ESA is significantly linked to surprises in 65 percent of the regressions in our sample. This result is confirmed by the unrestricted model, in which ESA is statistically significant in 67 percent of the cases. The results for the unrestricted model reveal that the Skew factor is also often significant (72 percent) across our individual indicator regressions. This supports our conjecture that forecasters may behave strategically (a rational bias), which is in line with Laster et al. (1999) and Ottaviani and Sorensen (2006). SurvLag and Std are less often statistically significant, in 40 and 35 percent of the cases,

11 Response data is available since 18/9/2002 for the S&P500 index E-mini future, 22/6/1998 for the Euro-Stoxx index future, 1/1/1996 for the FTSE100 index future, 2/1/1996 for the 2-year US Treasury Note future, 10/5/1999 for the EUREX 2-year Bund future, 1/1/1996 for the 10-year Gilt future, 2/1/1996 for the NYMEX Oil WTI future, 2/1/1996 for the NYMEX Gold, 2/1/1996 for the NYMEX Copper, 1/1/1996 for GBPUSD forwards, 1/1/2000 for JPYUSD forwards, 1/1/2000 for CHFUSD forwards, 1/1/1996 for AUDUSD forwards, 16/7/1997 for EURUSD forwards, and 1/1/2000 for CADUSD forwards. This response data is provided by AHL Partners LLP.

12 Return series for futures use first and second contracts for all markets. In general, return series use first contracts, which are rolled into second contracts between 5 and 10 days prior to the last trading day of first contracts, following standard market practice. Return series for currencies are calculated using synthetic one-month forwards.


respectively. The result for SurvLag challenges our hypothesis that an anchor towards the previous consensus forecast holds empirically. The weak statistical significance of Std among our individual regressions also suggests that disagreement among forecasters and information uncertainty are only weakly linked to economic surprises. The control variables Infl, Growth, Stocks, and VIX (labelled IV in Table (5.2)) are significant between 7 and 33 percent of the time, suggesting a somewhat weak relation between them and economic surprises.

Table 5.2: Aggregated results of anchor-only (restricted) and unrestricted economic surprise models for the US

Model Anchor-only Unrestricted

Panel A - Percentage of statistical significance per factor

Intercept 0.35 0.42

ESA 0.65 0.67

Std - 0.35

SurvLag - 0.40

Skew - 0.72

Infl - 0.16

Growth - 0.33

Stocks - 0.07

IV - 0.23

Panel B - Percentage of positive coefficients

Intercept 0.47 0.56

ESA 0.81 0.88

Std - 0.56

SurvLag - 0.56

Skew - 0.93

Infl - 0.49

Growth - 0.30

Stocks - 0.51

IV - 0.23

Panel C - Model quality

Mean R2 4% 17%

Median R2 2% 14%

Stdev R2 4% 10%

AIC 923 896

Panel A reports the percentage of statistically significant coefficients across anchor-only and unrestricted regression models for economic surprises of US macroeconomic indicators. For example, 0.65 found for the ESA variable within the anchor-only model means that 65 percent of the ESA coefficients across the individual regressions run for the 43 US macroeconomic indicators are statistically significant at the 10 percent level. Panel B reports the percentage of positive coefficients across anchor-only and unrestricted regression models for economic surprises of US macroeconomic indicators. Panel C reports the mean, median and standard deviation of the explanatory power (R2) achieved across all indicator-specific regressions, as well as the average Akaike Information Criterion (AIC).

From an explanatory power perspective, the unrestricted model dominates the anchor-only

model. The mean R2 across the predictive surprise models of the different economic indicators

is 4 percent for the anchor-only model and 17 percent for the unrestricted model (R2 medians

are 2 and 14 percent, respectively).

We report for the anchor-only regressions positive coefficients for the ESA factor in 81 percent of the cases. The unrestricted model delivers a positively signed ESA coefficient in 88 percent of the cases. Both results suggest a robust relationship between economic surprises and the anchor factor. The frequency of positive coefficients found for Skew is, however, even higher than for ESA. The Skew regressors are positive 93 percent of the time across all regressions. SurvLag and Std, at 56 percent, are also mostly positive but to a lesser extent than ESA and Skew. Our control variables are positive to an even lesser extent (between 23 and 51 percent). The results provided by AIC are in line with R2 as the average AIC for the anchor-only model is higher (923) than for the unrestricted model (896). These findings are, thus, supportive of our hypothesis that a rational bias may be embedded in macroeconomic forecasting due to strategic behavior of forecasters, which is in line with Laster et al. (1999) and Ottaviani and Sorensen (2006).

Table (5.3) presents the results of the individual predictive surprise models (restricted and unrestricted). The R2 gain ratio (reported in the second-to-last column) measures how many times higher the R2 of the unrestricted model is than the R2 of the restricted model. From an R2 perspective, the unrestricted models largely outperform the anchor-only models. The R2 gain ratio ranges from 1 to ∞, as the average R2 across the anchor-only model is 3.7 percent, whereas for the unrestricted model it is 17 percent.
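The R2 gain ratio is simple arithmetic, and a short sketch makes the ∞ entries explicit. The function name `r2_gain` is hypothetical; the inputs are the rounded R2 percentages printed in Table (5.3), so ratios computed this way can differ slightly from ratios based on unrounded values.

```python
import math

def r2_gain(r2_restricted, r2_unrestricted):
    """Ratio of unrestricted to restricted R2; infinite when the restricted R2 is zero."""
    if r2_restricted == 0:
        return math.inf
    return r2_unrestricted / r2_restricted

# Rounded R2 values in percent, as printed in Table (5.3)
print(round(r2_gain(3, 45)))  # US Personal Income MoM SA
print(r2_gain(0, 7))          # US Initial Jobless Claims SA
```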

The Conference Board Consumer Confidence indicator is the variable for which R2 is the highest in the anchor-only model (14 percent), followed by the US PPI Finished Goods SA MoM% indicator (13 percent). Most R2 values are at a single-digit level, and for only four indicators do the regressions yield explanatory power above 10 percent. Most anchor coefficients are statistically significant at least at the 10 percent level.

When the unrestricted model is used, US Personal Income MoM SA (45 percent) is the indicator with the highest R2, followed by US GDP Price Index QoQ SAAR (41 percent), and Adjusted Retail & Food Service Sales (36 percent). Most R2 values reach a double-digit level, in contrast with the anchor-only model. Most anchor coefficients are also statistically significant, in line with the anchor-only model. In line with earlier results, the Skew coefficients are mostly positive and statistically significant, whereas the coefficient sign is more unstable for the SurvLag and Std coefficients. The control variables within the unrestricted model are mostly not statistically significant, especially when inflation surprises are being forecasted.

More importantly, by analyzing individual models' results, we are able to explore an additional aspect of macroeconomic indicators: popularity. We measure popularity by averaging the number of analysts that provide forecasts for a given indicator in our sample. In Table (5.3), popularity is reported in the last column as a Popularity weight measure, which uses the sum of our popularity measure across all indicators as denominator. We also aggregate statistics in Table (5.3) using the nine most popular US economic indicators as employed by Campbell and Sharpe (2009)13.
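The popularity weight and a popularity-weighted statistic can be sketched as follows. All names and numbers are hypothetical: the indicator labels, the average forecaster counts and the R2 values are invented to show the arithmetic only.

```python
def popularity_weights(avg_forecasters):
    """Each indicator's average forecaster count divided by the sum across indicators."""
    total = sum(avg_forecasters.values())
    return {name: count / total for name, count in avg_forecasters.items()}

def weighted_mean(values, weights):
    """Weighted average of an indicator-level statistic (weights sum to one)."""
    return sum(values[name] * weights[name] for name in values)

# Hypothetical average number of forecasters and hypothetical R2 per indicator
avg_forecasters = {"Indicator A": 80.0, "Indicator B": 40.0, "Indicator C": 40.0}
r2 = {"Indicator A": 0.10, "Indicator B": 0.02, "Indicator C": 0.04}

weights = popularity_weights(avg_forecasters)
weighted_r2 = weighted_mean(r2, weights)
simple_r2 = sum(r2.values()) / len(r2)
print(f"weighted={weighted_r2:.3f} unweighted={simple_r2:.3f}")
```

In this invented example the most followed indicator also has the highest R2, so the weighted average exceeds the unweighted one.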

13 The indicators used by Campbell and Sharpe (2009) are the NFP Employment Indicator, Michigan Consumer Confidence, Consumer Price Index (CPI) headline and Core, Industrial Production, ISM Manufacturing Index and Retail Sales Headline and ex-Autos. New Homes Sales is also used by these authors but as housing data is out-of-scope of our set of macroeconomic indicators this item is not part of our set of nine most popular US indicators.


Table 5.3: Results of anchor-only (restricted) and unrestricted models for economic surprises per economic indicator

Model: Anchor-only model (R2, AIC, Intercept, Anchor); Unrestricted model (R2, AIC, Intercept, Anchor, Std, SurvLag, Skew, Inflation, Growth, Stocks, VIX); x R2 gain; Popularity weight

Statistics/Regressors R2 AIC Intercept Anchor R2 AIC Intercept Anchor Std SurvLag Skew Inflation Growth Stocks VIX xR2gain Popularity weight

US Initial Jobless Claims SA 0% 22901 128.8 -0.1** 7% 21925 -3586.0 -0.1 -0.2 0.0 2.1*** 582.8 358 -24468 253*** ∞ 2.0%

US Employees on Nonfarm Payrol 0% 6151 -12** -0.1 8% 5884 50** 0.0000 -0.001** -0.0001 0.003*** -51 -172 -1* ∞ 4.4%

U-3 US Unemployment Rate Total 0% -2451 0.0*** 0.1 11% -2362 0.0 0.0 0.1 0.0** 1.2*** 0.0 0.0 0.0 0.0 ∞ 4.3%

US Employees on Nonfarm Payroll Manuf. 2% 5003 -5009*** -0.2* 16% 4937 1235 0 -1*** 0** 2*** -592 1971** 122767 99 8 0.9%

US Continuing Jobless Claims S 2% 17944 20.3*** 8% 17910 30*** 0*** 0 0*** 0 0 -2** -106 1*** 4 0.3%

ADP National Employment Report 1% 3132 3786.5 0.1 13% 3129 49** 0 0 0 0** 4 -1 506 -1 13 0.9%

US Average Weekly Hours All Em 0% -183 0.0 -0.1 27% -196 0.0 0.4* 0.8* 0.0 2.3*** 0.0 0.0 0.6 0.0 ∞ 0.6%

US Personal Income MoM SA 3% -2153 0.0** 0.1** 45% -2109 0.0** 0.2*** 0.2 0.3*** 3.3*** 0.0 0.0*** 0.0 0.0 15 3.4%

ISM Manufacturing PMI SA 1% 996 0.1 0.2* 5% 939 2.2 0.1 0.1 0.0 1.1 0.0 0.0 15.4* 0.0 5 3.7%

US Manufacturers New Orders To 1% -1777 0.0 0.0 7% -1638 0.0* 0.1* -0.4** 0.1 0.1 0.0 0.0 0.0 0.0 7 3.0%

Federal Reserve Consumer Credi 1% 11472 668.9* 0.1 8% 10667 964 0 0 0 0** -532** 286** -26825 7 8 1.9%

Merchant Wholesalers Inventori 1% -1908 0.0** 0.1* 9% -1796 0.0 0.2** -0.2 0.2 0.9 0.0 0.0 0.0 0.0 9 1.4%

US Industrial Production MOM S 7% -2056 0.0* 0.2*** 24% -1940 0.0 0.3*** -0.4 0.4*** 3.0*** 0.0 0.0 0.0 0.0 3 3.7%

GDP US Chained 2009 Dollars Qo 1% -1860 0.0 0.0* 9% -1787 0.0** 0.1*** 0.3 0.1** 1.7** 0.0 0.0 0.0 0.0 9 5.5%

US Capacity Utilization % of T 3% -2063 0.0 0.2** 14% -1928 0.0 0.3*** 0.5* 0.0 1.9*** 0.0 0.0 0.0 0.0* 5 3.2%

US Personal Consumption Expend 9% -2414 0.0 0.1*** 15% -2248 0.0** 0.2*** 0.2 0.2*** 0.0 0.0 0.0* 0.0 0.0 2 3.5%

US Durable Goods New Orders In 5% -1080 0.0 0.1*** 25% -1087 0.0 0.2*** 0.8*** 0.4*** 2.5*** 0.0 0.0*** 0.1 0.0 5 3.4%

US Auto Sales Domestic Vehicle 9% 6253 114.6*** 0.3*** 23% 6203 -156 0*** 0*** 0 0*** -31 8 -5986* 1 3 1.2%

Adjusted Retail & Food Service 11% -1426 0.0 0.2*** 36% -1475 0.0 0.2*** 1.1*** 0.1 4.2*** 0.0 0.0 0.0 0.0*** 3 3.1%

Adjusted Retail Sales Less Aut 11% -1533 0.0 0.2*** 18% -1535 0.0 0.3*** 0.5 0.2 2.0** 0.0 0.0 0.0 0.0 2 2.8%

US Durable Goods New Orders To 0% -1055 0.0** 0.0 15% -1072 0.0 0.1* -0.1 0.2 1.4* 0.0 0.0*** 0.0 0.0 ∞ 1.4%

GDP US Personal Consumption Ch 4% -1404 0.0 0.1** 10% -1400 0.0** 0.1* -0.2 0.0 0.2 0.0 0.0 0.0 0.0** 3 0.5%

ISM Non-Manufacturing NMI 4% 457 0.1 0.4** 25% 444 -5.5** 0.4** 1.9* 0.1*** 1.1 -0.1 -0.1** 34.8** -0.1* 6 1.6%

US Manufacturing & Trade Inven 2% -2247 0.0 0.1** 14% -2152 0.0** 0.2*** -0.6** 0.0 1.5*** 0.0 0.0 0.0 0.0 7 2.5%

Philadelphia Fed Business Outl 2% 1738 -0.7 0.3** 12% 1624 7.3*** 0.2 -1.1* -0.1 2.9*** -0.1 0.0 -18.2 -0.2** 6 2.5%

MNI Chicago Business Barometer 3% 1368 0.7** 0.3** 6% 1293 2.6 0.4** 0.7 0.0 2.5* 0.1 0.2 -4.2 0.0 2 2.5%

Conference Board US Leading In 6% -2358 0.0* 0.1*** 35% -2311 0.0** 0.2*** 0.2 0.2*** 2.0*** 0.0* 0.0** 0.0 0.0 6 2.6%

Conference Board Consumer Conf 14% 1414 0.3 0.7*** 20% 1325 0.3 0.6*** 0.1 0.0 2.6** 0.4* -0.1 -0.6 -0.1 1 3.3%

US Empire State Manufacturing 1% 1272 -0.5 0.2 11% 1268 3.0 0.1 1.0 -0.1 4.3*** 0.2 -0.1 -37.8 -0.3** 11 1.7%

Richmond Federal Reserve Manuf 1% 970 -1.0 0.2 18% 959 -3.2 0.2 1.5** -0.1 3.3*** 0.4 -0.2 -62.7 0.0 18 0.2%

ISM Milwaukee Purchasers Manuf 0% 451 0.0 0.0 12% 456 10.1** 0.1 -0.4 -0.2** 0.9 0.5 0.5 1.9 0.0 ∞ 0.1%

University of Michigan Consume 3% 2109 -0.2* 0.5*** 9% 2095 1.1 0.5*** -0.3 0.0 2.1*** 0.2** -0.1** 9.1 0.0 3 2.6%

Dallas Fed Manufacturing Outlo 8% 652 -3.4*** 0.6** 14% 659 0.8 0.6** -0.7 0.0 0.3 0.1 -0.3 -52.5 -0.1 2 0.2%

US PPI Finished Goods Less Foo 9% -1885 0.0 0.3*** 18% -1734 0.0** 0.3*** 0.5 0.8*** 1.1* 0.0 0.0 0.0 0.0 2 3.4%

US CPI Urban Consumers MoM SA 1% -2559 0.0 0.0 21% -2408 0.0** 0.2*** 0.2 0.2*** 1.1*** 0.0 0.0 0.0 0.0 21 3.8%

US CPI Urban Consumers Less Fo 2% -2714 0.0 -0.1** 9% -2523 0.0** -0.1 -0.3 -0.4** 0.4 0.0 0.0* 0.0 0.0 5 3.7%

Bureau of Labor Statistics Emp 1% -778 0.0 0.0 10% -718 0.0 0.0 -0.1 0.1 0.0 0.0 0.0* 0.0 0.0 10 2.8%

US Output Per Hour Nonfarm Bus 3% -1110 0.0*** 0.1** 13% -1090 0.0 0.1*** 0.1 0.1*** 0.4 0.0 0.0 0.1 0.0 4 3.0%

US PPI Finished Goods SA MoM% 13% -1554 0.0 0.2*** 29% -1533 0.0*** 0.4*** 1.0** 0.4*** 2.0** 0.0 0.0 0.0 0.0 2 3.6%

US Import Price Index by End U 1% -1661 0.0 0.0 23% -1677 0.0 0.1*** 0.0 0.3*** 2.5*** 0.0** 0.0 0.0 0.0* 23 2.0%

US GDP Price Index QoQ SAAR 9% -1189 0.0 0.2*** 41% -1238 0.0 0.3*** -0.4** 0.0 3.5*** 0.0* 0.0* 0.0 0.0 5 1.1%

US Personal Con. Exp. Core MOM SA 1% -1660 0.0** -0.1 27% -1689 0.0 0.1 -0.6** -0.1 1.4*** 0.0 0.0 0.0 0.0 27 1.3%

US Personal Cons. Exp. Price YOY SA 1% -1524 0.0 0.0 12% -1528 0.0 0.1* 0.1 0.0 0.6** 0.0 0.0 0.0 0.0 12 0.5%

Average 3.7% 923 - - 17% 896 - - - - - - - - - - 2.3%

Popularity-weighted average 4.0% 110 - - 17% 102 - - - - - - - - - - -

Average of most popular indicators 4.9% -312 - - 18% -313 - - - - - - - - - - -

% of positive & significant coefficients (P&SC) - - 7% 58% - - 12% 67% 19% 28% 70% 5% 5% 5% 5% - -

Popularity-weighted % of P&SC - - 6% 64% - - 8% 70% 17% 39% 75% 6% 3% 5% 2% - -

P&SC of most popular indicators - - 0% 67% - - 11% 67% 22% 33% 78% 11% 0% 11% 0% - -

The table above reports results of anchor-only (restricted) and unrestricted regression models for economic surprises. Regression results are reported per economic indicator. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively. The popularity weight provided in the last column of the table uses the sum of our popularity measure across all indicators as base. We measure popularity by averaging the number of analysts that provide forecasts for a given indicator in our sample.


Overall, we find that the model quality is higher for popular indicators. The R2 (AIC) weighted using our popularity measure for the anchor-only model is 4.0 percent (110), whereas the (unweighted) average R2 (AIC) is 3.7 percent (923). For the unrestricted model, the weighted R2 (weighted AIC) is 17 percent (102), whereas the average R2 (average AIC) is 17 percent (896). Hence, popular indicators seem better explained by our explanatory variables. If we compare the percentage of positive and significant coefficients across all models (see the last three rows of Table (5.3)) with the same measure weighted by popularity and with the measure computed for the most popular indicators only, we observe that ESA and Skew are more likely to hold with the correct sign among popular indicators. As far as ESA is concerned, this result applies to both the anchor-only model and the unrestricted model. Hence, we conjecture that the rational and behavioral biases modelled by ESA and Skew are more present among popular indicators. This finding makes explicit that the bias analyzed here is linked to the active behavior of forecasters, not to their lack of action, as suggested by the inattention-type (behavioral) explanations, such as the anchoring bias, advocated by Mendenhall (1991), Stickel (1991), Campbell and Sharpe (2009) and Cen et al. (2013).

5.4.2 Market responses around macroeconomic announcements

In the following we evaluate how asset prices of four different asset classes (equities, treasuries,

foreign exchange, and commodities) behave around macroeconomic announcements. Because

we primarily investigate US macroeconomic releases, we target reactions in US local markets

and the EURUSD, the main USD currency cross. Hence, we analyze responses on the following

assets: S&P500 index future, 2-year US Treasury future, and EURUSD forwards. Note that bond returns are adjusted to have the opposite sign so as to be consistent with the expected response to surprises for equity returns and currencies. The response time-frames used (in

minutes) are -60, -50, -40, -30, -20, -10, -5, -1, 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60. Negative

time-frames imply a time before the relevant economic release, whereas positive ones mean the

minutes after the economic release.

We assess market responses around macroeconomic announcements by calculating cumula-

tive average returns (CARs) and classifying responses around announcements as good or bad

news to the asset. This way, we calculate CAR separately for announcements that had a positive

or negative effect on the specific asset price.
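The CAR calculation by response sign can be sketched as below. This is an illustrative reading of the procedure, not the code behind the results: the function name `car_by_sign`, the event data and the rule of classifying each event by the sign of its return in the announcement window are assumptions.

```python
from itertools import accumulate

def car_by_sign(events, announce_idx):
    """events: per-event lists of interval returns (bps), one entry per time-frame step.

    Events are grouped by the sign of the return in the announcement window
    (an assumed classification rule), averaged per step and then cumulated.
    """
    groups = {"positive": [], "negative": []}
    for event in events:
        if event[announce_idx] > 0:
            groups["positive"].append(event)
        elif event[announce_idx] < 0:
            groups["negative"].append(event)
    cars = {}
    for label, group in groups.items():
        if not group:
            cars[label] = []
            continue
        average = [sum(step) / len(group) for step in zip(*group)]
        cars[label] = list(accumulate(average))
    return cars

# Hypothetical events with three steps: pre-window, announcement window, post-window
events = [
    [0.5, 10.0, 1.0],
    [-0.5, 8.0, 0.0],
    [0.2, -9.0, -1.0],
]
cars = car_by_sign(events, announce_idx=1)
print(cars)
```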

Figure (5.1) illustrates market responses for the S&P500 futures, the 2-year US Treasury futures, and EURUSD forwards in separate rows; the first column displays plots of reactions to good news and the second column plots of reactions to bad news. The CARs around macroeconomic announcements for positive and negative responses across multiple time intervals are provided in Table (5.4), given in basis points (bps) and as a percentage of the CAR observed from one hour before until one minute after the macroeconomic announcement, i.e., the interval [t-60min, t+1min].
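Expressing a CAR over a sub-window as a percentage of the CAR over [t-60min, t+1min] is a one-line calculation, sketched below. The function name `window_share` and the CAR path are hypothetical; the numbers are chosen only so that the announcement window accounts for more than 100 percent of the reference CAR, which happens whenever the pre-announcement drift runs against the response.

```python
def window_share(car, start, end, ref_start=-60, ref_end=1):
    """Move of the CAR path over [start, end] as a percentage of the reference-window move."""
    move = car[end] - car[start]
    reference = car[ref_end] - car[ref_start]
    return 100.0 * move / reference

# Hypothetical CAR path in bps, keyed by minutes relative to the release (t = 0)
car_path = {-60: 0.0, -1: -0.8, 1: 10.0, 60: 10.1}

announcement = window_share(car_path, -1, 1)  # response over [t-1min, t+1min]
pre_drift = window_share(car_path, -60, -1)   # drift over [t-60min, t-1min]
post_drift = window_share(car_path, 1, 60)    # drift over [t+1min, t+60min]
print(round(announcement), round(pre_drift), round(post_drift))
```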


Figure 5.1: Cumulative average returns (CAR) around the macroeconomic announcements. Panel A) Positive responses; Panel B) Negative responses. The line plots depict the CAR (across all indicators) around the time of macroeconomic announcements (in blue). The time of macroeconomic announcements within these plots is t = 0 on the x-axis. The -60, -50, -40, -30, -20, -10, -5, -1 time-frames, which precede t = 0, represent the minutes prior to the macro announcement. The 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60 time-frames represent the minutes after the macro announcement. The shaded area around the CAR line shows its one standard deviation (68.27 percent) confidence interval. The securities evaluated are the S&P500 index future, the 2-year treasury bond future and EURUSD forwards, respectively, reported in rows one, two and three.

For the S&P500 futures (Figure (5.1), row one; first column of Table (5.4)), we observe that the largest part of the positive response happens around the macroeconomic announcement (which occurs between time-frames -1 and 1). The CARs from one hour before the announcement until one minute after the news are roughly +/- 10 bps for positive and negative responses, respectively, whereas the response around the announcement is also approximately +/- 10 bps. In fact, the average CAR observed around the announcement makes up more than 100 percent of the overall CAR observed in the [t-60min, t+1min] interval (i.e., 108 percent for positive and 102 percent for negative responses). We conclude that pre-announcement drifts are, on average, in the opposite direction to the overall CARs observed. However, we note that the pre-announcement drifts are very small relative to the response observed within the [t-1min, t+1min] interval. Our results also suggest that, from one minute after releases are made public up to 60 minutes afterwards, there are only small post-announcement drift effects within the S&P500 futures, as the drift amounts to only 1 and 5 percent of the CAR observed within the [t-60min, t+1min] interval for the positive and negative responses, respectively. In brief, the positive and negative market responses for the S&P500 index future around macroeconomic announcements have a similar pattern: almost no drift prior to the announcement, a jump at


the announcement, and roughly a flat post-drift effect up to one-hour after the announcement.

This CAR pattern suggests that no exploitable market underreaction to US macroeconomic

news releases seems to be present within the local equity market. At the same time, as there

is no pervasive pre-announcement drift observed, no evidence of leakage or usage of private

information by market participants is found.

The CARs observed in different time-frames for the Euro-Stoxx and FTSE100 index futures

show patterns similar to the ones found for the S&P500 index future. The pre-announcement

drifts have the opposite direction to the response found close to the macroeconomic news release,

both for positive and negative responses. The post-drift we observe is in the same direction as

the response, but is in both markets of higher magnitude than the one found for the S&P500

index future, ranging from 10 to 24 percent of the CARs found in the [t-60min, t+1min] interval.

This result seems to suggest that both Euro-Stoxx and FTSE100 index futures are less efficient

than the S&P500 index future.

For the 2-year US Treasury futures (Figure (5.1), we see in row two, first column of Table

(5.4)) that the positive response is also very distinct around the announcement. The average

CAR from one-hour before the announcement until 30-minutes before the announcement is

flat. There is some evidence of a pre-drift in the direction of the response from 30-minutes

before the announcement until one-minute before the announcement of roughly 10 percent of

the CARs observed in the [t-60min, t+1min] interval. The CARs observed for both the positive

and negative responses in the interval [t-60min, t+1min] is between absolute 2.4 and 2.7 bps.

Unlike in equity markets, there is some evidence of a post-announcement drift, with

an additional move of 0.6 and -0.3 bps (20 and 13 percent of the CARs experienced in the [t-60min,

t+1min] interval) expected after positive and negative responses, respectively. When treasury

markets for the UK and Germany are evaluated (see Table (5.4)), we observe similar patterns

for the post-announcement drift and response in the [t-1min, t+1min] interval, but no consistent

pre-announcement drift. We note that the post-announcement drifts for positive responses are

consistently larger than the ones for negative responses.

We see that for the EURUSD (Figure (5.1), row three, first column, and Table (5.4)), the

responses are once again very distinct and concentrated closely around the macroeconomic an-

nouncement time. The CAR from one-hour before the announcement until one minute before

the announcement is roughly zero. The CAR within the interval -1 minute and +1 minute

is relatively large, roughly 5 bps, concentrating between 102 and 105 percent of the CAR

observed in the [t-60min, t+1min] interval. Unlike what is observed for the treasury and

equity markets, the post-announcement drift tends to be in the opposite direction of the

response observed around the data releases, dampening between 5 and 21 percent of the CAR

observed in the [t-60min, t+1min] interval. For other currencies, the pre-announcement drifts

are on average small and inconsistent with each other and with the responses observed in the

[t-1min, t+1min] interval. The post-announcement drifts are mostly in the opposite direc-

tion to the response around announcements for the negative responses. However, for positive

responses, the post-announcement drift responses are in the same direction as the responses


around the announcements for most currencies and only in opposite direction for the EURUSD

and CADUSD.

Table 5.4: Cumulative average returns (CAR) around macroeconomic announcements

Panel A - Cumulative average return (CAR) for positive market responses

Absolute (in Bps) As percentage of [t-60min, t+1min]

[t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60] [t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60]

S&P500 0.2 -0.9 9.9 9.2 0.1 2% -10% 108% 100% 1%

Euro-Stoxx -0.4 -0.5 16.3 15.5 1.6 -2% -3% 106% 100% 10%

FTSE100 -0.2 -0.4 10.1 9.5 1.4 -2% -5% 106% 100% 15%

2y Bund -0.0 -0.0 1.2 1.1 0.2 -3% -1% 104% 100% 20%

2y T-Note -0.1 0.2 2.3 2.4 0.6 -3% 7% 96% 100% 23%

10y Gilt -0.5 0.3 6.6 6.3 1.4 -8% 5% 103% 100% 21%

WTI Oil 1.2 0.6 6.7 8.5 -1.9 14% 7% 78% 100% -22%

Gold -0.3 0.8 7.1 7.7 0.3 -3% 11% 92% 100% 4%

Copper -1.3 0.5 8.8 7.9 2.6 -17% 6% 110% 100% 33%

USDGBP -0.5 0.0 4.1 3.6 0.4 -14% 0% 114% 100% 10%

USDEUR -0.2 -0.1 5.4 5.1 -1.1 -3% -1% 105% 100% -21%

USDJPY -0.2 0.4 5.8 6.0 0.7 -3% 7% 96% 100% 12%

USDCHF -0.4 -0.2 5.4 4.8 0.4 -8% -5% 113% 100% 7%

USDAUD -0.3 -0.2 8.4 8.0 1.5 -3% -2% 105% 100% 19%

USDCAD 0.5 0.3 5.4 6.2 -0.6 7% 5% 87% 100% -9%

Panel B - Cumulative average return (CAR) for negative market responses

Absolute (in Bps) As percentage of [t-60min, t+1min]

[t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60] [t-60, t-30] [t-30, t-1] [t-1, t+1] [t-60, t+1] [t+1,t+60]

S&P500 0.1 0.1 -9.7 -9.5 -0.5 -1% -1% 102% 100% 5%

Euro-Stoxx 1.0 0.9 -16.2 -14.2 -3.5 -7% -7% 114% 100% 24%

FTSE100 1.2 0.6 -10.3 -8.5 -0.9 -14% -7% 121% 100% 10%

2y Bund 0.0 -0.1 -1.2 -1.2 -0.1 0% 7% 94% 100% 8%

2y T-Note 0.0 -0.3 -2.4 -2.7 -0.3 -1% 10% 91% 100% 13%

10y Gilt 0.4 0.1 -6.8 -6.2 -0.6 -7% -2% 109% 100% 9%

WTI Oil -0.9 0.2 -6.8 -7.6 -0.5 12% -2% 90% 100% 7%

Gold -0.2 0.8 -6.9 -6.3 0.4 3% -13% 110% 100% -6%

Copper 0.2 -1.0 -8.8 -9.6 1.3 -2% 11% 92% 100% -14%

GBPUSD -0.0 -0.4 -4.2 -4.7 0.3 1% 8% 91% 100% -7%

EURUSD 0.3 -0.2 -5.3 -5.1 0.3 -5% 3% 102% 100% -5%

JPYUSD 0.3 0.5 -5.6 -4.7 -0.1 -7% -11% 118% 100% 1%

CHFUSD 0.4 -0.3 -5.2 -5.2 0.4 -7% 6% 101% 100% -7%

AUDUSD -0.0 -0.8 -7.9 -8.8 1.0 0% 10% 90% 100% -12%

CADUSD 0.2 -0.2 -4.9 -4.9 0.4 -4% 4% 100% 100% -9%

Panel A reports the cumulative average returns (CAR) around macroeconomic announcements for positive market responses across several markets and time-frames (in minutes). Absolute CAR are reported on the left sub-panel, whereas the CAR for each time-frame as percentage of the CAR for the [t-60min, t+1min] interval is reported on the right sub-panel. Panel B reports the similar CAR information but for negative market responses. t is the time of announcement of macroeconomic data releases.
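As a quick check on how the two sub-panels of Table 5.4 relate, the sketch below recomputes the percentage columns from the absolute CARs for the S&P500 row of Panel A. The function name `car_shares` is ours and purely illustrative. Because the published absolute CARs are already rounded to one decimal, small deviations from the printed percentages are possible for other rows.

```python
def car_shares(window_cars_bps, reference_bps):
    """Express each window's CAR as a rounded percentage of a reference CAR."""
    return {
        window: round(100 * car / reference_bps)
        for window, car in window_cars_bps.items()
    }


# Absolute CARs (in bps) for positive S&P500 responses, Panel A of Table 5.4.
sp500_positive = {
    "[t-60, t-30]": 0.2,
    "[t-30, t-1]": -0.9,
    "[t-1, t+1]": 9.9,
    "[t+1, t+60]": 0.1,
}
reference = 9.2  # CAR over [t-60, t+1], taken from the same table row

shares = car_shares(sp500_positive, reference)
print(shares)
```

Read this way, a value above 100 percent in the [t-1, t+1] column simply means that the jump at the announcement more than offsets a pre-announcement drift of the opposite sign.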

We assess the CAR around US macroeconomic announcements for three commodities: WTI

oil, gold, and copper. The most notable difference between our results for these commodi-

ties versus the other asset classes investigated is that the responses observed in the [t-1min,

t+1min] interval for commodities concentrate less of the overall CAR observed in the [t-60min,

t+1min] interval than for the other three asset classes. The responses observed in

the [t-1min, t+1min] interval range from 78 percent to 110 percent. Evidence of any pre- or

post-announcement drift is very inconsistent across commodities and across positive and neg-

ative responses. The reason for such inconsistency might be that commodities are less clearly

linked to the business cycle of a particular country compared to equities, treasuries and cur-


rencies. For instance, as countries may be consumers or suppliers of specific commodities, it is

unclear how the macroeconomic announcements in a specific country should affect the price of

commodities14.

Finally, it is worth noting some common features observed in Figure (5.1) for

S&P500 futures, 2-year US Treasury futures and EURUSD. Firstly, when market responses are

one standard deviation higher than the average reaction, markets mean-revert strongly in the

two minutes following the surprise and continue to do so for the following three minutes,

though less aggressively. Secondly, when market responses are one standard deviation lower

than the average reaction, a post-drift in the subsequent two minutes after the surprise is ob-

served. Further, volatility tends to increase prior to announcements for the S&P500 and 2-year

US Treasury futures markets but not for the EURUSD market. Such increase in volatility starts

even more than 30 minutes before announcements in the S&P500 futures markets, whereas for

2-year US Treasury futures, it happens only in the last 20 minutes before announcements.

5.4.3 Predicting market responses

In this section we analyze the estimates from Eqs. (5.2.5a), (5.3.4) and (5.3.5), i.e., the anchor-

only (restricted) model, the unrestricted model (used to forecast economic surprises), and

the unrestricted-extended model. Table (5.5) reports R2, AIC, the frequency of the expected

surprise coefficients that are positive for the three OLS-based models employed, and hit-ratios

as well as root mean squared error (RMSE) for all models. Hit-ratios and RMSEs are reported

for our train and out-of-sample or test data set15,16, whereas other statistics are calculated

in-sample, i.e., using the full data set.
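The two out-of-sample statistics used throughout this section can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the function names and the toy numbers are hypothetical, and we assume that a "hit" means that the predicted and realized returns share the same sign. The 75/25 chronological split follows the description of the train and test data sets.

```python
import math


def chronological_split(observations, train_share=0.75):
    """Split time-ordered observations into the earliest share (train) and the rest (test)."""
    cut = int(len(observations) * train_share)
    return observations[:cut], observations[cut:]


def hit_ratio(predicted, realized):
    """Share of observations whose predicted and realized returns have the same sign."""
    hits = sum(1 for p, r in zip(predicted, realized) if p * r > 0)
    return hits / len(realized)


def rmse(predicted, realized):
    """Root mean squared error between predicted and realized returns."""
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, realized)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical, time-ordered (prediction, realization) pairs.
predicted = [0.4, -0.2, 0.1, -0.3, 0.2, 0.5, -0.1, 0.3]
realized = [0.5, -0.1, -0.2, -0.4, 0.1, -0.6, 0.2, 0.2]

train, test = chronological_split(list(zip(predicted, realized)))
train_pred, train_real = zip(*train)
test_pred, test_real = zip(*test)

print(hit_ratio(train_pred, train_real), hit_ratio(test_pred, test_real))
print(rmse(train_pred, train_real), rmse(test_pred, test_real))
```

Splitting by position rather than at random keeps the test observations strictly later than the training ones, which is what makes the test statistics a genuine out-of-sample check.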

Table (5.5) reports that R2 monotonically increases across the three OLS regression models

as we move from the restricted model to more comprehensive models17. The magnitude of gains

in R2 across the three types of models suggests that the unrestricted-extended models have much

higher explanatory power. On average, anchor-only models deliver R2 of 1.5 percent, whereas

unrestricted response models have R2 of 2 percent on average. In contrast, unrestricted-extended

models, which no longer are univariate models, post average R2 of 29 percent.

The AIC statistic estimated across the different models challenges somewhat the results

provided by R2: complex unrestricted models are deemed less informative once penalties for

complexity are applied. The AICs for the restricted model are lower than those

for the unrestricted model 74 percent of the time (indicating dominance of the anchor-only model), whereas the AICs

for the restricted models dominate the AICs from unrestricted-extended models at all times. The

14For stocks, it is also unclear how positive news impacts prices. Late in a tightening cycle (high inflation), good news is bad for equities, whereas at an early stage in a tightening cycle (low inflation), positive macroeconomic news is definitely good for equities.

15The in-sample period extends through our full data set (i.e., from 1997 to 2016), whereas our out-of-sample period (our test data set) comprises the latest 25 percent of observations of the full data set. The training data used for tuning (typically via cross-validation) of machine learning methods and estimation of models employed for out-of-sample forecasting uses the earliest 75 percent of observations of the full data set.

16Not all statistics are provided for the Ridge and Random forest model as they are not available or are not straightforward to estimate or aggregate.

17Note that both restricted and unrestricted models are univariate models.


same AIC dominance holds for the restricted models over the unrestricted models. Average

AICs across these three types of models confirm these findings. AICs of Ridge models are,

however, superior to those of their OLS counterparts, the unrestricted-extended models,

indicating that model quality is improved by shrinkage. These first results indicate that

in-sample fit is superior for the most complex models versus simpler models from an explanatory

power perspective, but not from a parsimony perspective. Despite that, differences in AICs are

not large, indicating that the superiority of small models on this criterion is not absolute.
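To see how a model with a much higher R2 can still rank worse on AIC, consider the sketch below. It uses the textbook AIC of a least-squares fit with Gaussian errors (up to an additive constant); the exact AIC variant used for Table 5.5 is not restated here, so this form is an assumption. The sample size of 100 observations and the parameter counts (2 for a univariate model with intercept, 23 for the extended model, matching the number of rows in Table 5.6) are likewise illustrative, as is the normalization of the total sum of squares to one.

```python
import math


def gaussian_aic(n_obs, rss, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


n = 100    # hypothetical number of announcements for one indicator
tss = 1.0  # total sum of squares, normalized to one

# R2 = 1 - RSS/TSS, so an R2 of 2 percent leaves RSS = 0.98 and 29 percent leaves 0.71.
aic_small = gaussian_aic(n, rss=tss * (1 - 0.02), n_params=2)
aic_large = gaussian_aic(n, rss=tss * (1 - 0.29), n_params=23)

print(round(aic_small, 1), round(aic_large, 1))
```

With these inputs the small model has the lower (better) AIC: the gain in fit of the large model is outweighed by the penalty of two units per additional parameter. With a larger sample the ranking can reverse, which is consistent with AIC differences that are not large.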

When evaluating the coefficient signs of the expected surprise factor in market response

predictive models18, at first glance, we find that coefficients are mostly positive. Among anchor-

only models, on average 57 percent of the coefficients for expected surprise are positive, whereas

for unrestricted models this is 64 percent. Within larger models, such as unrestricted-extended

ones, the percentage of positive coefficients for the expected surprises falls to 54 percent.

Further, we evaluate results coming from our OLS models by making a split between local

(US) markets and foreign markets. We find that, from an R2 perspective, the unrestricted

and unrestricted-extended models of local markets seem to outperform foreign markets. The

percentage of positive coefficients for the expected surprise variable is equal or higher for local

markets than for foreign markets across all models. This is an intuitive result, as we assume

that local fundamentals should explain local markets better than local conditions explain

foreign markets. Though, for the US, due to its dominant economic position, this assumption

might be weaker than for other countries.

When we assess model goodness of fit across the different asset classes evaluated, we find

that R2 for our three OLS models are much higher for the stock and bond markets. Within

stocks, the Euro-Stoxx-based models are the ones with the highest R2. In bonds, the 2y Bund

models are the ones with the highest explanatory power. Copper-based models have the highest R2

in commodities. Results from AIC and from currencies are more mixed. In univariate models,

the percentages of positive coefficients for expected surprises are higher for stocks and bonds

(always above 59 percent) than for other asset classes, in line with our expectations.

Further, R2 is almost the same for growth and inflation indicators. Nevertheless, AIC

points to a clearly superior fit for models based on growth indicators over those based on inflation indicators.

The percentage of positive coefficients for the expected surprise variable is also consistently

higher for growth indicators (between 57 and 69 percent on average), whereas it is always lower than 50

percent for inflation indicators. We think that this result is caused by positive growth surprises

being less directly linked to subsequent increases in interest rates by central banks than positive

inflation surprises, as higher interest rates typically produce negative shocks to equities and,

more indirectly, commodities.

18We expect expected surprise coefficients to be, in general, positive, as we expect that equities, commodities and currencies would typically appreciate in response to a positive economic surprise. Bond returns are adjusted to have the opposite sign in order to be consistent with the other asset classes.


Table 5.5: Results of restricted and unrestricted market response models per market

Anchor-only response model

Markets / R2 / AIC / % E(Surprise) coeff. > 0 / Hit-ratio (Train/Test) / RMSE x1000 (Train/Test)
S&P500                      2.0%   -1298   63%   53%/48%   2.16/1.16
Euro-Stoxx                  2.3%   -1128   63%   55%/51%   3.28/1.65
FTSE100                     2.2%   -1620   60%   54%/49%   2.02/0.99
2y Bund                     3.0%   -1133   64%   53%/46%   0.21/0.09
2y T-Note                   1.1%   -1066   52%   54%/51%   0.43/0.22
10y Gilt                    1.1%   -1499   61%   55%/50%   1.18/0.90
WTI Oil                     0.9%   -1334   53%   53%/50%   1.47/1.31
Gold                        0.8%   -1654   42%   54%/51%   1.47/1.92
Copper                      1.2%   -1311   60%   54%/47%   1.50/1.08
USDGBP                      1.0%   -2457   60%   53%/51%   0.76/0.82
USDEUR                      1.0%   -2273   56%   53%/51%   0.96/1.10
USDJPY                      1.3%   -2113   60%   53%/50%   1.08/1.20
USDCHF                      1.1%   -2154   60%   53%/50%   0.98/1.09
USDAUD                      1.8%   -1837   56%   52%/50%   1.43/1.26
USDCAD                      0.9%   -1960   53%   54%/50%   0.95/1.04
Local (avg)                 1.5%   -1182   63%   54%/50%   1.30/0.69
Foreign (avg)               1.4%   -1729   58%   54%/50%   1.33/1.11
Stocks (avg)                2.2%   -1349   62%   54%/50%   2.49/1.27
Bonds (avg)                 1.8%   -1232   59%   54%/49%   0.61/0.41
Currencies (avg)            1.2%   -2132   58%   53%/51%   1.03/1.08
Commodities (avg)           1.0%   -1433   52%   54%/49%   1.48/1.44
Growth (avg)                1.5%   -2623   60%   54%/50%   1.35/1.09
Inflation (avg)             1.4%   -1522   48%   53%/50%   1.16/0.88
Average                     1.5%   -1656   57%   54%/50%   1.32/1.06
Pop-weighted avg.           1.4%   -1658   58%   54%/50%   1.39/1.13
Most pop. indicators (avg)  1.1%   -1246   63%   54%/50%   1.52/1.23

Unrestricted response model

Markets / R2 / AIC / % E(Surprise) coeff. > 0 / Hit-ratio (Train/Test) / RMSE x1000 (Train/Test)
S&P500                      2.4%   -1273   65%   54%/49%   2.16/1.15
Euro-Stoxx                  3.4%   -1130   67%   55%/52%   3.26/1.67
FTSE100                     2.7%   -1589   67%   54%/51%   2.00/0.97
2y Bund                     4.3%   -1134   83%   55%/52%   0.21/0.10
2y T-Note                   2.6%   -1058   79%   54%/52%   0.42/0.22
10y Gilt                    2.2%   -1488   73%   55%/49%   1.15/0.89
WTI Oil                     1.1%   -1311   67%   52%/49%   1.47/1.32
Gold                        0.7%   -1620   42%   54%/50%   1.48/1.93
Copper                      2.4%   -1298   65%   54%/50%   1.49/1.09
USDGBP                      1.1%   -2387   67%   53%/52%   0.76/0.83
USDEUR                      1.4%   -2267   63%   55%/52%   0.96/1.10
USDJPY                      1.6%   -2114   72%   54%/51%   1.08/1.21
USDCHF                      1.4%   -2155   74%   54%/50%   0.98/1.09
USDAUD                      1.3%   -1823   44%   53%/49%   1.43/1.27
USDCAD                      0.8%   -1960   37%   53%/48%   0.95/1.04
Local (avg)                 2.5%   -1165   65%   54%/51%   1.29/0.68
Foreign (avg)               1.9%   -1714   63%   54%/50%   1.32/1.12
Stocks (avg)                2.8%   -1331   67%   54%/51%   2.47/1.26
Bonds (avg)                 3.0%   -1227   78%   55%/51%   0.59/0.40
Currencies (avg)            1.3%   -2118   60%   54%/50%   1.03/1.09
Commodities (avg)           1.4%   -1410   58%   53%/50%   1.48/1.45
Growth (avg)                2.0%   -2608   69%   54%/51%   1.35/1.10
Inflation (avg)             1.8%   -1504   48%   54%/49%   1.15/0.88
Average                     2.0%   -1640   64%   54%/50%   1.32/1.06
Pop-weighted avg.           2.0%   -1639   68%   54%/50%   1.38/1.13
Most pop. indicators (avg)  2.1%   -1691   79%   54%/51%   1.51/1.23

Unrestricted-extended response model

Markets / R2 / AIC / % E(Surprise) coeff. > 0 / Hit-ratio (Train/Test) / RMSE x1000 (Train/Test)
S&P500                      36%    -1241   60%   64%/51%   1.76/2.53
Euro-Stoxx                  38%    -1103   53%   64%/52%   2.66/2.84
FTSE100                     35%    -1562   42%   63%/52%   1.63/1.35
2y Bund                     46%    -1094   50%   66%/49%   0.16/0.20
2y T-Note                   46%    -1017   50%   66%/52%   0.33/0.36
10y Gilt                    28%    -1445   51%   65%/52%   0.99/1.10
WTI Oil                     27%    -1271   58%   61%/51%   1.25/1.87
Gold                        24%    -1576   51%   61%/50%   1.31/2.07
Copper                      33%    -1261   49%   63%/52%   1.23/1.36
USDGBP                      17%    -2329   70%   60%/48%   0.70/0.89
USDEUR                      18%    -2213   56%   60%/51%   0.87/1.15
USDJPY                      19%    -2065   67%   61%/50%   0.99/1.29
USDCHF                      19%    -2106   58%   61%/51%   0.89/1.17
USDAUD                      24%    -1784   49%   61%/52%   1.27/1.39
USDCAD                      19%    -1910   49%   62%/50%   0.86/1.14
Local (avg)                 41%    -1129   55%   65%/52%   1.05/1.44
Foreign (avg)               27%    -1671   54%   62%/51%   1.14/1.37
Stocks (avg)                36%    -1302   52%   64%/52%   2.02/2.24
Bonds (avg)                 40%    -1185   50%   66%/51%   0.50/0.55
Currencies (avg)            20%    -2068   58%   61%/50%   0.93/1.17
Commodities (avg)           28%    -1370   53%   62%/51%   1.27/1.77
Growth (avg)                29%    -2561   57%   63%/51%   1.16/1.44
Inflation (avg)             31%    -1464   44%   63%/50%   0.96/1.20
Average                     29%    -1599   54%   63%/51%   1.12/1.40
Pop-weighted avg.           26%    -1598   56%   62%/51%   1.20/1.40
Most pop. indicators (avg)  24%    -1651   59%   62%/52%   1.34/1.44

Unrestr. Ridge response model

Markets / AIC / Hit-ratio (Train/Test) / RMSE x1000 (Train/Test)
S&P500                      -1306   53%/51%   1.44/1.31
Euro-Stoxx                  -1169   53%/52%   0.95/1.08
FTSE100                     -1630   54%/52%   1.48/1.55
2y Bund                     -1192   52%/49%   1.50/1.16
2y T-Note                   -1100   54%/52%   0.21/0.13
10y Gilt                    -1445   54%/51%   2.19/1.30
WTI Oil                     -1304   51%/51%   3.30/1.98
Gold                        -1611   53%/51%   0.97/1.12
Copper                      -1326   54%/53%   2.01/1.08
USDGBP                      -2367   52%/49%   1.48/1.98
USDEUR                      -2251   53%/52%   1.16/0.96
USDJPY                      -2102   53%/50%   0.98/1.12
USDCHF                      -2144   52%/50%   0.43/0.24
USDAUD                      -1820   52%/51%   0.77/0.85
USDCAD                      -1947   53%/50%   1.09/1.24
Local (avg)                 -1203   53%/51%   1.31/0.77
Foreign (avg)               -1716   53%/51%   1.33/1.20
Stocks (avg)                -1368   53%/52%   2.50/1.45
Bonds (avg)                 -1246   53%/51%   0.60/0.44
Currencies (avg)            -2105   52%/50%   1.03/1.12
Commodities (avg)           -1414   52%/52%   1.49/1.56
Growth (avg)                -2600   53%/51%   1.36/1.17
Inflation (avg)             -1535   53%/50%   1.15/0.98
Average                     -1648   53%/51%   1.33/1.14
Pop-weighted avg.           -1640   52%/51%   1.39/1.21
Most pop. indicators (avg)  -1687   52%/52%   1.53/1.31

Unrestr. RF response model

Markets / Hit-ratio (Train/Test) / RMSE x1000 (Train/Test)
S&P500                      54%/50%   2.20/1.21
Euro-Stoxx                  53%/54%   3.36/1.80
FTSE100                     53%/54%   2.06/0.95
2y Bund                     51%/47%   0.22/0.10
2y T-Note                   50%/49%   0.44/0.23
10y Gilt                    53%/50%   1.18/0.89
WTI Oil                     50%/50%   1.52/1.32
Gold                        52%/51%   1.51/1.93
Copper                      51%/50%   1.52/1.08
USDGBP                      52%/50%   0.78/0.83
USDEUR                      54%/53%   0.97/1.10
USDJPY                      53%/53%   1.11/1.22
USDCHF                      52%/51%   1.00/1.10
USDAUD                      52%/52%   1.47/1.27
USDCAD                      53%/52%   0.96/1.05
Local (avg)                 52%/50%   1.32/0.72
Foreign (avg)               52%/51%   1.36/1.13
Stocks (avg)                54%/53%   2.54/1.32
Bonds (avg)                 51%/49%   0.61/0.41
Currencies (avg)            52%/52%   1.05/1.09
Commodities (avg)           51%/50%   1.52/1.45
Growth (avg)                52%/51%   1.38/1.11
Inflation (avg)             52%/51%   1.18/0.90
Average                     52%/51%   1.35/1.07
Pop-weighted avg.           52%/51%   1.41/1.15
Most pop. indicators (avg)  52%/50%   1.54/1.26

This table reports results of anchor-only (restricted), unrestricted and unrestricted-extended market response models 1) per market evaluated, 2) aggregated per geographical coverage (i.e., local or foreign), 3) aggregated across asset classes (i.e., stocks, bonds, FX and commodities), 4) aggregated per type of macroeconomic indicator to predict economic surprises (i.e., growth or inflation) and 5) aggregated using popularity weights. We aggregate results by averaging statistics from the individual (macro indicator-specific) models. Statistics reported are the average R2, the explanatory power; AIC, the Akaike Information Criterion; Coeff. > 0, the percentage of positive coefficients; hit-ratios and RMSE (x1000). Hit-ratios and RMSEs are reported for our train and out-of-sample or test data, whereas other statistics are calculated in-sample. The in-sample period extends through the full data set for each indicator, whereas our out-of-sample period (our test data set) comprises of the last 25 percent of observations of the data set. The first 75 percent of the data is our training data set, which is used for tuning and estimation of predictive models. For machine learning-based predictive models, only AIC (for Ridge), hit-ratios and RMSEs are reported because inference of other statistics is not straightforward or because results may get distorted when aggregated. Statistics for the individual markets are also averages as they are aggregated from models that are based on the set of individual macroeconomic indicators investigated by us.


Weighting model fit outcomes using our popularity measure does not lead to additional

insights as R2 and AIC are nearly the same across models that rely on less-popular indicators

and models that rely on popular indicators. The percentage of positive coefficients for UnES

is, though, clearly higher for popular indicators.

As we assess the performance of predictions made by the various models, our first impression

is that anchor-only and unrestricted models do not convincingly beat a 50 percent hit-ratio

out-of-sample, despite delivering a roughly 54 percent hit-ratio in the train data set. Even for

the unrestricted-extended model, out-of-sample hit-ratios are only slightly better than a coin flip (51

percent). The same applies to the average Ridge and Random forest

models, as they also post out-of-sample hit-ratios of around 51 percent. Interestingly, train

hit-ratios for the unrestricted-extended model overstate out-of-sample hit-ratios more heavily

than the train hit-ratios of the Ridge and Random forest models do19. For instance, the

average train hit-ratio for the unrestricted-extended model is 63 percent, whereas it is 53 percent

for Ridge and 52 percent for Random forest. Random forest is the model whose train hit-ratios

overstate test hit-ratios the least.
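The degree to which train hit-ratios overstate test hit-ratios can be read directly from the "Average" row of Table 5.5. The short sketch below, with variable names of our own choosing, computes the train-minus-test gap in percentage points for each model.

```python
# Average (train, test) hit-ratios in percent, from the "Average" row of Table 5.5.
average_hit_ratios = {
    "Anchor-only": (54, 50),
    "Unrestricted": (54, 50),
    "Unrestricted-extended": (63, 51),
    "Ridge": (53, 51),
    "Random forest": (52, 51),
}

# Gap between train and test hit-ratios, in percentage points.
gaps = {
    model: train - test
    for model, (train, test) in average_hit_ratios.items()
}
least_overstated = min(gaps, key=gaps.get)

print(gaps)
print(least_overstated)
```

The gap of 12 percentage points for the unrestricted-extended OLS model, against 2 for Ridge and 1 for Random forest, is the overstatement discussed in the text.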

Across all unrestricted-extended frameworks, hit-ratios seem to be consistently higher for

models that forecast stock returns versus models that predict other asset classes, especially

currencies and commodities, matching our findings from R2. Hit-ratios also suggest that models

based on growth indicators do a better job at forecasting market direction than models based on

inflation indicators. Further, train and test hit-ratios of models that use popular macroeconomic

indicators are not consistently higher than hit-ratios for the average model.

Concerning RMSE, a first noticeable observation is that OLS unrestricted-extended models

outperform all other models (including machine learning-based models) in the train set but deliver

higher average test RMSE than all other models. This is the case across popular and unpopular

indicators and might be a symptom of overfitting. Machine learning-based models, however,

report average train RMSEs that are higher than for the unrestricted-extended model but

deliver much lower out-of-sample RMSEs, respectively 1.14x10−3 and 1.07x10−3 for the Ridge

and Random forest model (versus 1.40x10−3 for the unrestricted-extended model). RMSEs

are often higher for the in-sample period than for the out-of-sample period, which may be

caused by the different levels of market volatility in the two sample splits. RMSEs also vary

substantially across asset classes, which is explained by the differing levels of return volatility of

the asset classes used. As expected, RMSEs are the lowest for bonds and currencies and higher

for stocks and commodities. Further, both train and test RMSEs are the lowest for models

that use inflation indicators and predict local markets.
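The mechanism by which shrinkage trades in-sample fit for stability can be illustrated in the simplest possible setting. The sketch below uses the closed-form slope of a no-intercept regression with a single predictor and an L2 penalty; it is not the multivariate Ridge specification estimated in this chapter, and the data and penalty value are hypothetical.

```python
def ridge_slope(x, y, penalty):
    """Slope minimizing sum((y - b*x)^2) + penalty * b^2 for a single predictor."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + penalty)


# Hypothetical standardized predictor and response.
x = [-1.0, -0.5, 0.5, 1.0]
y = [-0.8, 0.1, 0.3, 1.2]

ols = ridge_slope(x, y, penalty=0.0)     # a zero penalty recovers the OLS slope
shrunk = ridge_slope(x, y, penalty=2.5)  # a positive penalty pulls the slope toward zero

print(ols, shrunk)
```

A coefficient pulled toward zero fits the training data less closely, which is in line with the higher train RMSEs but lower test RMSEs reported for the regularized models.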

Our results also indicate that market responses created by announcements of popular

macroeconomic indicators are less predictable than responses of unpopular indicators as train

and test RMSEs are consistently higher for popular indicators, across all models. This finding

suggests that, even if economic surprises in popular indicators are easier to forecast (as biases

are more pervasive), their market responses are less anticipated (in RMSE terms) by the pre-

19This result, which suggests overfitting by the OLS models, is what motivates us to apply shrinkage, as doneby the Ridge regression.


dictable part of economic surprises. As earlier reported, hit-ratios estimated do not suggest

that popular indicators are more predictable either. Hence, to some extent, market partici-

pants seem to either discount the biases incurred by forecasters when trading around economic

surprises of popular indicators or to have better models to predict surprises. These results are

somewhat connected to Campbell and Sharpe (2009), who conclude that market participants

“look through” forecasters’ biases within ten popular US macroeconomic indicators20,21,22.

When we dig into the drivers of forecasts produced by the unrestricted-extended, Ridge

regression and Random forest models, we find that the E(UnES) variable is important but

only after a couple of market-based variables, such as the 5-minute asset return prior to

the announcement as well as the prior-day level of the VIX index and stock returns. We

base this conclusion on two metrics: the percentage of significant coefficients estimated by our

unrestricted-extended OLS models and the Importance measure extracted from our Random

forest models. In addition to these two metrics, we rely to some extent on the percentage of

positive coefficients from regression models to evaluate whether the relations found between responses

and the UnES, ESA and Skew variables have the expected coefficient sign23.

Table (5.6) reports the percentage of positive coefficients for predictive models using the

OLS and Ridge approaches (in the train and test data sets, respectively reported in Panels A

and B). We observe that the estimated coefficients for UnES and ESA are more often positive

than negative, in line with our expectations. For Skew this result is less strong, as within

the OLS model this variable is only positive between 45 and 48 percent of times. Nevertheless,

because, among all regressors, Skew and UnES are the most correlated variables (reaching

a correlation of 0.8 for some of our macroeconomic indicators), the fact that Skew is mostly

negative might simply be a manifestation of multicollinearity in the regression model. We also

find Stocks and Rt−5 to be consistently positive, suggesting a positive serial correlation

between returns during announcements and prior asset returns. In the case of Stocks, such

relation might be linked to time-series momentum, which is typically captured in daily frequency

data24. In the case of Rt−5, a positive coefficient indicates the presence of pre-announcement

price drift in the direction of the economic surprise-led responses just a few minutes prior to

the data release, indicating potential leakage of information or short-term trading activity by

20We use nine out of the ten US macroeconomic indicators evaluated by Campbell and Sharpe (2009). The only indicator used by these authors and not by us is the New Home Sales statistic, as we do not include housing market data in our analysis.

21The average weight used to calculate popularity-weighted statistics is 2.3 percent, whereas the average weight of the indicator used by Campbell and Sharpe (2009) within such weighting scheme is 3.5 percent, denoting the use of very popular indicators by the authors.

22Qualitatively similar results are obtained when we perform the supervised learning approaches specified in section 5.4.3 as a classification problem rather than in a regression setting. These results are available upon request.

23We evaluate the percentage of positive sign for these three variables only as we do not have a prior for the relation between past returns and market responses around macroeconomic announcements. The same applies for the relation between return volatility and market responses around announcements.

24If it is found that positive (negative) responses amid positive (negative) data surprises might be strengthened by the existing positive (negative) time-series momentum, one could hypothesize that serial correlation in prices (i.e., momentum) is intensified by economic surprises in the same direction or a series of such surprises, i.e., serial correlation in surprises.


informed investors.

Table 5.6: Results for unrestricted-extended market response models per factor

Panel A - Unrestricted response models - Train set Panel B - Unrestricted response models - Test set/Out-of-sample

Extended (OLS) Ridge Random Forest Extended (OLS) Ridge Random Forest

% positive % significant % positive Importance (x10^5) % positive % significant % positive Importance (x10^5)

Intercept 49% 10% 47% - 50% 10% 46% -

UnES 58% 13% 65% 2.4 56% 12% 66% 2.0

ESA 52% 11% 59% 1.4 51% 10% 57% 1.2

Skew 48% 10% 62% 2.3 45% 10% 52% 1.9

Std 44% 13% 50% 1.6 45% 12% 47% 1.2

SurvLag 52% 10% 47% 1.6 51% 9% 49% 1.3

Inflation 52% 14% 52% 1.8 57% 12% 56% 1.4

Growth 51% 8% 47% 1.8 51% 7% 50% 1.4

Stocks 66% 28% 64% 3.5 65% 25% 65% 3.0

VIX 49% 13% 51% 3.4 50% 11% 53% 2.9

VIXdif 58% 15% 49% 1.9 55% 15% 55% 1.6

ret60 51% 15% 46% 1.8 51% 15% 49% 1.4

ret55 50% 19% 50% 2.1 52% 16% 54% 1.7

ret50 46% 21% 49% 2.2 47% 16% 47% 1.8

ret45 48% 18% 49% 2.3 48% 17% 49% 1.9

ret40 43% 16% 43% 2.4 45% 13% 43% 1.9

ret35 53% 16% 51% 2.7 52% 16% 52% 2.3

ret30 44% 15% 41% 2.2 42% 12% 42% 1.8

ret25 51% 17% 47% 2.2 51% 15% 51% 1.9

ret20 48% 17% 47% 2.1 50% 13% 47% 1.8

ret15 40% 21% 38% 2.5 45% 15% 41% 2.1

ret10 49% 22% 48% 2.4 53% 18% 52% 2.0

ret5 66% 27% 59% 3.5 65% 23% 65% 2.9

Average 51% 16% 51% 2.3 51% 14% 52% 1.9

Panel A reports details on the fit of unrestricted-extended models for the in-sample period. Panel B reports details on the fit of unrestricted-extended models for the out-of-sample period. Across the two panels, we contrast results from the Unrestricted-extended, Unrestricted-Ridge regression and Unrestricted-Random Forest models to provide some interpretation into our results. As the three models used do not provide a common variable for direct comparison, these statistics are mostly used here to map the best predictors and, perhaps, find confirmation of that from other models.

Turning to the percentage of significant coefficients estimated by the unrestricted-extended

OLS model, we find that Stocks and Rt−5 are the variables most strongly connected to asset

responses amid macroeconomic announcements. Returns at other time frames (minutes) before

announcements are also connected to returns during announcements, despite the fact that the

direction of the relationship is not clear. Among non-market data based regressors, UnES

and Std are linked to market responses the most, indicating the relevance of UnES for market

predictions. Using the Random forest Importance measure as guidance (i.e., node impurity)25,

we find that Stocks and Rt−5 but also the VIX are highly relevant for predictions (see Table

(5.6) and Figure 5.2). As reported by Importance, UnES is the most relevant non-market

data predictor used by Random forest. The fact that past returns have predictive power

in forecasting returns around data announcements also adds to the pool of evidence in the

literature of failure of the Efficient Market Hypothesis (EMH) in its weak form.

25See Appendix 5.A for details.


Figure 5.2: Importance measure from Random forest model. The bar charts above depict the Importance measure produced by the Random forest model applied to predict market responses around the announcements of macroeconomic data, in the train and test data sets. More specifically, the Importance measure computes the average node impurity across all trees grown by the Random forest, which reflects how partitions made by the different explanatory variables at each node into two sub-regions perform versus a constant fit over the entire region, i.e., a 'pure' node. For the case of regression models, performance is calculated in terms of squared errors (see Appendix 5.A for additional details on the Importance measure).
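The building block behind such an impurity-based measure can be sketched as the reduction in squared error obtained by splitting a node in two, relative to a constant fit over the whole node. The code below shows this standard regression-tree quantity for a single split; the exact aggregation across nodes and trees used for Figure 5.2 is given in Appendix 5.A and is not reproduced here, and the function names and toy data are ours.

```python
def sum_squared_errors(values):
    """Squared error of fitting a constant (the mean) to the values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(feature, target, threshold):
    """Reduction in squared error from splitting on feature <= threshold."""
    left = [t for f, t in zip(feature, target) if f <= threshold]
    right = [t for f, t in zip(feature, target) if f > threshold]
    return (
        sum_squared_errors(target)
        - sum_squared_errors(left)
        - sum_squared_errors(right)
    )


# Toy example: the target switches level between the second and third observation.
feature = [1, 2, 3, 4]
target = [0, 0, 1, 1]

print(split_gain(feature, target, threshold=2.5))  # split at the level shift
print(split_gain(feature, target, threshold=1.5))  # a less informative split
```

A variable that repeatedly yields large gains when it is chosen for a split accumulates a high importance, which is how a forest of many trees can still be summarized variable by variable.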

In brief, we show that cross-asset returns around US macroeconomic data announcements can

be largely explained by variables that represent biases in the behavior of forecasters, such as

UnES and ESA, as well as market-based variables, such as Stocks and Rt−5. Beyond that, we

find that these variables also have some out-of-sample predictive power. Explanatory power

and predictability26 are higher for local stocks and bonds than for currencies and commodities,

whereas local markets are better predicted than foreign markets. Goodness of fit measures

(R2 and AIC) indicate that larger models deliver much higher explanatory power but, taking

parsimony into account, bigger models are only preferred when regularized. Contrary to our

26We consider hit-ratio as our predictability measure as RMSE cannot be adequately used to compare return forecasts of assets with very distinct volatility as in our exercise. We used the median hit-ratio among unrestricted-extended models to rank the predictability across the different asset classes studied.


results on the predictability of economic surprises, in which popular indicators are found to be more predictable, we find that market returns around announcements of popular macroeconomic indicators are less predictable than responses provoked by unpopular indicators.

Beyond that, our results suggest that the (regularized) machine learning methods applied are better at avoiding overfitting in our data set than simpler models, such as the OLS regression. No model consistently outperforms the others on both RMSEs and hit-ratios; however, Random forest dominates the other methods on point forecasts, as it consistently delivers lower RMSEs. We hypothesize that this result might be driven by the fact that Random forest is the only non-linear method among the models tested. Finally, the variable Importance measures calculated seem to challenge the myth that Random forest is a “black-box” method, as they allow for some model interpretation.
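The two forecast-quality measures used in this comparison can be illustrated with a minimal sketch. The function names and the numbers below are hypothetical and are not taken from our data; the example only shows why RMSE scales with an asset's volatility while the hit-ratio does not.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of point forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def hit_ratio(actual, predicted):
    """Share of forecasts whose sign matches the sign of the realized return."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return hits / len(actual)

# Hypothetical returns (in percent) for a low-volatility asset ...
bond_actual = [0.02, -0.01, 0.03, -0.02]
bond_pred = [0.01, -0.02, -0.01, -0.01]
# ... and the same pattern scaled by 100 for a high-volatility asset.
oil_actual = [2.0, -1.0, 3.0, -2.0]
oil_pred = [1.0, -2.0, -1.0, -1.0]

print("RMSE:", rmse(bond_actual, bond_pred), rmse(oil_actual, oil_pred))
print("Hit-ratio:", hit_ratio(bond_actual, bond_pred), hit_ratio(oil_actual, oil_pred))
```

In this toy case both assets have the same hit-ratio, whereas the RMSE of the more volatile asset is larger by the scaling factor, which is why the hit-ratio is the more natural measure for ranking predictability across asset classes.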

5.4.4 Market responses, skewness of economic forecasts and regret

In the previous sections, we observed that the skewness of economic forecasts is strongly and

positively linked to economic surprises and market responses. In the following, we hypothesize

that the relation between skewness of economic forecasts and market responses depends on

failures of our skewness-based model in forecasting surprises. The rationale behind this hy-

pothesis is that if market participants use experts’ forecasts to trade, market responses might

be adversely affected by the skewness of forecasts when they fail to correctly predict surprises.

More specifically, we hypothesize that: 1) if a forecasted surprise fails to predict the direction of the realized surprise, then the corresponding market response is relatively large and in the opposite direction to the forecasted surprise (i.e., in line with the realized surprise); 2) if a forecasted surprise is in line with the realized surprise, then the subsequent market response is relatively small and in line with both the forecasted and realized surprise.

The intuition behind Hypothesis (1) is that a regret effect takes place in asset markets, in line with the models of Loomes and Sugden (1982) and Bell (1982). Therefore, investors who are positioned in line with the expected surprise close their losing positions quickly after the economic release is made public. Conversely, when realized surprises are in line with expected ones (Hypothesis (2)), no additional trading activity is expected from market participants holding such expectations, as they have likely positioned themselves according to their expectations ahead of the specific release27. One strong assumption made in this exercise is that the direction of the expected surprise is driven by the direction of the skewness of forecasts, which is in line with our estimated economic surprise models but is not a result found for every single macroeconomic indicator in our analysis28.

27An implicit assumption embedded in Hypotheses (1) and (2) is that market participants who take part in an economic forecast survey also trade in asset markets (in line with their own forecasts) and that their forecasts influence market participants who trade in these markets.

28As indicated by Table (5.2), 93 percent of the estimated unrestricted models for economic surprises have a positive Skew coefficient.


In order to test the Hypotheses (1) and (2), we specify the following regression models:

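$$R_t = \alpha + \mathrm{Skew}^{+}_t \ast S^{-}_t + \varepsilon_t, \qquad (5.4.1a)$$

$$R_t = \alpha + \mathrm{Skew}^{+}_t \ast S^{+}_t + \varepsilon_t, \qquad (5.4.1b)$$

$$R_t = \alpha + \mathrm{Skew}^{-}_t \ast S^{-}_t + \varepsilon_t, \qquad (5.4.1c)$$

$$R_t = \alpha + \mathrm{Skew}^{-}_t \ast S^{+}_t + \varepsilon_t, \qquad (5.4.1d)$$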

where $R_t$ is the market response. $\mathrm{Skew}^{+}_t$ is the skewness in forecasts when it is positive and $\mathrm{Skew}^{-}_t$ when it is negative. $S^{+}_t$ is the realized surprise when it is positive and $S^{-}_t$ when it is negative. Hence, the explanatory variables in these models are interactions between surprises and skewness in forecasts. To make the interpretation of the estimated coefficients of these interaction terms easier, we run the regressions with the absolute value of these explanatory variables. Importantly, given our assumption that the direction of the expected surprise is driven by the direction of the skewness of forecasts, we interpret the variables $\mathrm{Skew}^{+}_t \ast S^{-}_t$ and $\mathrm{Skew}^{-}_t \ast S^{+}_t$ as scenarios in which the expected surprise failed to forecast the realized economic surprise. Along the same lines, $\mathrm{Skew}^{+}_t \ast S^{+}_t$ and $\mathrm{Skew}^{-}_t \ast S^{-}_t$ are scenarios in which the expected surprise was successful in forecasting the economic surprise.

Hence, these regressions split the direction of the skewness and realized surprises to map

the four possible scenarios in which the responses can be evaluated: 1) the presence of positive

skewness and negative economic surprise (skewness fails to forecast the direction of surprise); 2)

the presence of positive skewness and positive economic surprise (skewness successfully forecasts

the direction of surprise); 3) the presence of negative skewness and negative economic surprise

(skewness successfully forecasts the direction of surprise); and 4) the presence of negative

skewness and positive economic surprise (skewness fails to forecast the direction of surprise).

The scenarios that give rise to regret are the ones in which the skewness fails to forecast the direction of the economic surprise, that is, scenarios 1 and 4.
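The four-scenario mapping can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not our estimation code: the data are hypothetical, each regressor is assumed to equal the absolute interaction term in its own scenario and zero otherwise, and a plain OLS slope is computed without the Newey-West adjustment.

```python
def interaction_regressors(skew, surprise):
    """Split |skew * surprise| into the four sign scenarios of Eqs. (5.4.1a)-(5.4.1d)."""
    keys = ["skew_pos_surp_neg", "skew_pos_surp_pos", "skew_neg_surp_neg", "skew_neg_surp_pos"]
    regs = {k: [] for k in keys}
    for k, s in zip(skew, surprise):
        x = abs(k * s)
        regs["skew_pos_surp_neg"].append(x if k > 0 and s < 0 else 0.0)  # scenario 1 (regret)
        regs["skew_pos_surp_pos"].append(x if k > 0 and s > 0 else 0.0)  # scenario 2
        regs["skew_neg_surp_neg"].append(x if k < 0 and s < 0 else 0.0)  # scenario 3
        regs["skew_neg_surp_pos"].append(x if k < 0 and s > 0 else 0.0)  # scenario 4 (regret)
    return regs

def ols_univariate(x, y):
    """Return (intercept, slope) of a univariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical observations: forecast skewness, standardized surprise, market response.
skew = [0.5, 0.8, -0.4, -0.6, 0.3, -0.2]
surprise = [-1.0, 1.2, -0.5, 0.9, -0.7, 1.5]
returns = [-0.6, 0.1, -0.1, 0.3, -0.3, 0.4]

regs = interaction_regressors(skew, surprise)
for name, x in regs.items():
    print(name, ols_univariate(x, returns))
```

Because the regressors are absolute values, a negative slope in a negative-surprise scenario reads directly as "larger interaction, more negative response".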

We note that Eqs. (5.4.1a) to (5.4.1d) do perform this four-scenario mapping by implement-

ing each one of them as an individual univariate model. In order to have enough observations

to run regressions for each of these four scenarios, we do not run regressions at the individual

economic indicator level but we aggregate observations for all economic releases. Aggregating

surprise data for all the economic indicators within our sample is possible because surprises are

also available as a number of standard deviations from the mean (apart from in raw surprise

format)29.

29The application of these regressions using raw surprises would be biased, as the different magnitudes of the multiple economic releases would distort the relation between the explanatory variable (surprises) and the explained variable (market response).


Table 5.7: Test of market response in presence of regret

Panel A - Univariate

Markets R2 Skw+Surp- R2 Skw+Surp+ R2 Skw-Surp- R2 Skw-Surp+

S&P500 0.18% -19.4 0.01% 5.5 0.10% -15.9 0.07% 12.6

Euro-Stoxx 0.04% -6.4 0.04% 8.6 0.01% -3.0 0.12% 11.5

FTSE100 0.10% -11.8 0.00% 3.2 0.00% -0.7 0.00% 3.0

2y Bund 0.54% -26.9*** 0.03% 9.4 0.01% 3.0 0.07% 12.6

2y T-Note 0.35% -3.6* 0.02% 0.8 1.50% 7.4*** 0.00% 0.2

10y Gilt 0.07% -14.5 0.09% -25.6 0.07% 13.9 0.00% -0.1

WTI Oil 0.14% -29.0 0.05% -24.4 0.06% 20.3 0.01% 9.6

Gold 0.00% -1.9 0.01% -3.5 0.01% -2.5 0.08% 9.2

Copper 0.07% -12.9 0.02% -9.0 0.01% 5.8 0.00% -2.8

USDGBP 0.11% -13.8 0.04% 12.5 0.09% 15.6 0.02% 8.3

USDEUR 0.00% -2.1 0.14% 17.9 1.37% 38.6*** 0.34% 22.7**

USDJPY 0.28% -16.4** 0.01% 3.0 0.03% -5.8 0.21% 15.2*

USDCHF 0.05% 3.4 0.00% 0.4 0.06% 3.2 0.09% 4.6

USDAUD 0.08% -6.7 0.04% -7.4 0.00% 1.8 0.13% 9.8

USDCAD 0.04% -7.1 0.00% 3.2 0.12% -13.1 0.01% 3.0

% coefficients > 0 7% 67% 60% 87%

% sig. coefficients > 0 0% - 100% 100%

This table reports results of our test for the presence of regret within market responses, see Eqs. (5.4.1a) to (5.4.1d), in a univariate setting. $\mathrm{Skew}^{+}_t$ is the skewness in forecasts when it is positive and $\mathrm{Skew}^{-}_t$ when it is negative. $S^{+}_t$ is the realized surprise when it is positive and $S^{-}_t$ when it is negative. Explanatory variables in these models are interaction terms between surprises and skewness in forecasts. $\mathrm{Skew}^{+}_t \ast S^{-}_t$ and $\mathrm{Skew}^{-}_t \ast S^{+}_t$ are scenarios in which the expected surprise failed to forecast the realized economic surprise. $\mathrm{Skew}^{+}_t \ast S^{+}_t$ and $\mathrm{Skew}^{-}_t \ast S^{-}_t$ are scenarios in which the expected surprise was successful in forecasting the economic surprise. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.

Table (5.7) reports the regression results for Eqs. (5.4.1a) to (5.4.1d) in a univariate setting. We observe that when economic surprises are negative, the percentage of coefficients found to be positive is lower than when economic surprises are positive. This indicates that negative surprises are more often linked to negative market responses than positive surprises are, when the absolute values of the interaction terms are used as regressors. This is in line with what we expected30. Moreover, among negative surprises, the percentage of positive coefficients is lowest (7 percent) when the skewness of forecasts is positive. This result suggests that negative responses are more frequently linked to negative surprises when the skewness of forecasts fails to correctly predict the economic surprise. This result is confirmed when we only use statistically significant coefficients in our analysis, as the percentage of positive and statistically significant coefficients for $\mathrm{Skew}^{+}_t \ast S^{-}_t$ is zero versus 100 percent for $\mathrm{Skew}^{-}_t \ast S^{-}_t$. We interpret this finding as supportive of our hypothesis that regret affects market participants trading around economic surprises.

In line with our expectations, when surprises are positive, the share of coefficients pointing towards a positive market response always exceeds 50 percent. Nevertheless, the share of coefficients pointing towards a positive market response is higher when the skewness of forecasts is negative (87 percent) than when it is positive (67 percent). This finding is confirmed when only statistically significant coefficients are taken into account in the analysis. This finding connects to our results for negative surprises and supports our conjecture of a regret effect within economic surprises.
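The shares of positive coefficients discussed above can be recomputed directly from the coefficient columns of Table (5.7). The short sketch below copies the 15 coefficients per scenario from the table (in the order in which the markets are listed); the dictionary keys and variable names are ours and purely illustrative.

```python
# Coefficients from Table 5.7, Panel A, one list per scenario (15 markets each).
coefficients = {
    "Skw+Surp-": [-19.4, -6.4, -11.8, -26.9, -3.6, -14.5, -29.0, -1.9,
                  -12.9, -13.8, -2.1, -16.4, 3.4, -6.7, -7.1],
    "Skw+Surp+": [5.5, 8.6, 3.2, 9.4, 0.8, -25.6, -24.4, -3.5,
                  -9.0, 12.5, 17.9, 3.0, 0.4, -7.4, 3.2],
    "Skw-Surp-": [-15.9, -3.0, -0.7, 3.0, 7.4, 13.9, 20.3, -2.5,
                  5.8, 15.6, 38.6, -5.8, 3.2, 1.8, -13.1],
    "Skw-Surp+": [12.6, 11.5, 3.0, 12.6, 0.2, -0.1, 9.6, 9.2,
                  -2.8, 8.3, 22.7, 15.2, 4.6, 9.8, 3.0],
}

# Percentage of coefficients that are strictly positive, rounded as in the table.
share_positive = {
    scenario: round(100 * sum(1 for c in values if c > 0) / len(values))
    for scenario, values in coefficients.items()
}
print(share_positive)
```

The result matches the "% coefficients > 0" row of the table: the share is lowest for the failed-forecast scenario with a negative surprise and highest for the failed-forecast scenario with a positive surprise.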

30We note that we reverse the coefficient sign of bond returns so that it is consistent with equity and currency returns as far as the expected direction of returns given economic surprises is concerned.


5.4.5 Robustness tests

5.4.5.1 Economic surprise models across regions

As a robustness test, we apply Eqs. (5.2.5a) and (5.3.1) across other regions, namely Continental Europe, the United Kingdom and Japan31. Table (5.8) indicates that our results for these three regions are qualitatively the same as the ones reported for the US: unrestricted models tend to improve on the R2 of anchor-only models, and the coefficients for the ESA and Skew factors are mostly positive (the expected sign). These two coefficients are positive between 56 and 75 percent of the time, which is, however, lower than the percentage of correct signs found for the US. Yet, among the coefficients for all factors (including control variables), ESA and Skew remain the ones that are most often positive. Moreover, in terms of significance, the models (anchor-only and unrestricted) for Europe, Japan and the United Kingdom perform worse than the US model, as the percentage of significant coefficients is, in general, lower than for the US.

Table 5.8: Aggregated results of anchor-only and unrestricted economic surprise models for Continental Europe, the UK and Japan

Region       Cont. Europe                UK                          Japan
Model        Anchor-only   Unrestricted  Anchor-only   Unrestricted  Anchor-only   Unrestricted

Panel A - Percentage of statistical significance per factor
Intercept    0.22          0.29          0.33          0.19          0.24          0.16
Bias         0.44          0.37          0.33          0.31          0.42          0.28
Std          -             0.18          -             0.42          -             0.16
SurvLag      -             0.20          -             0.33          -             0.31
Skew         -             0.09          -             0.44          -             0.13
Infl         -             0.11          -             0.11          -             0.09
Growth       -             0.14          -             0.19          -             0.13
Stocks       -             0.15          -             0.22          -             0.13
IV           -             0.29          -             0.25          -             0.06

Panel B - Explanatory power (R2)
Mean R2      8%            25%           3%            18%           3%            16%
Median R2    3%            16%           1%            13%           2%            10%
Stdev R2     17%           24%           5%            13%           3%            20%

Panel C - Percentage of positive coefficients
Intercept    0.37          0.53          0.56          0.50          0.52          0.78
Bias         0.66          0.62          0.58          0.72          0.70          0.69
Std          -             0.42          -             0.56          -             0.44
SurvLag      -             0.47          -             0.36          -             0.31
Skew         -             0.56          -             0.75          -             0.59
Infl         -             0.40          -             0.25          -             0.44
Growth       -             0.44          -             0.44          -             0.34
Stocks       -             0.48          -             0.44          -             0.47
IV           -             0.26          -             0.42          -             0.19

Panel A reports the percentage of statistically significant coefficients (factors) across anchor-only and unrestricted regression models for economic surprises of macroeconomic indicators for Europe, the United Kingdom and Japan. For example, the 0.44 found for the ESA factor (reported in the Bias row) within the anchor-only model for Europe means that 44 percent of the ESA coefficients across the individual regressions run for the European macroeconomic indicators are statistically significant at the 10 percent level. Panel B reports the mean, median and standard deviation of the explanatory power (R2) achieved across all indicator-specific regressions. Panel C reports the percentage of positive coefficients across anchor-only and unrestricted regression models for economic surprises of the same macroeconomic indicators.

We conjecture that the difference in presence of biases in macroeconomic forecasting across

31The overview of macro releases for these regions can be provided upon request.


the different regions might be explained by the number of experts dedicated to macroeconomic

forecasts across these countries/regions. The average number of analysts providing forecasts

across all indicators and through our sample is 44 for the US. For Europe, Japan and the

United Kingdom this number is, respectively, 9, 13, and 15. We argue that, as the number of

forecasters increases for a specific indicator or within a country, it becomes more likely that

1) convergence towards the previous release happens simply by the law of large numbers; 2)

forecasters possess private information; 3) such private information is revealed by the skewness

in forecasts, given strategic behavior by experts.

5.4.5.2 Expected and unexpected surprises

As an additional robustness test, we compare how expected economic surprises are linked to market responses vis-à-vis unexpected surprises. As mentioned earlier, expected surprises are the ones that can be predicted, in line with Campbell and Sharpe (2009). In this study, we have used the anchor-only and unrestricted models, as given by Eqs. (5.2.5b) and (5.3.1), to estimate expected economic surprises. In contrast, unexpected surprises are the residual component of surprises. In other words, the unexpected surprise is the part of the economic surprise that cannot be forecasted. We link market returns around macroeconomic data announcements with expected surprises and unexpected surprises via the following two-component response model:

Rt = ω + δ1E[St] + δ2(St − E[St]) + εt, (5.4.2)

where E[St] can be provided by either the anchor-only model or the unrestricted economic surprise model, and St is the realized surprise. Essentially, the goal of running such a regression model is to understand and compare whether and how market prices react to the two components of economic surprises: the predictable component (by evaluating δ1) and the unpredictable component (by evaluating δ2).

In order to compare how market responses can be explained by expected surprises versus

unexpected surprises, we also run the following unexpected-surprise response model:

Rt = ω + δ3(St − E[St]) + εt, (5.4.3)

The estimates of Eqs. (5.4.2) and (5.4.3) are provided in Table (5.9). Panel A reports the R2 as well as the percentage of δ1 and δ2 coefficients that are significant for both the anchor-only (two-component) response model and the unrestricted (two-component) response model.
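As an illustration of how Eqs. (5.4.2) and (5.4.3) relate, the sketch below fits both regressions on a small, noise-free, hypothetical data set and computes the absolute and percentage R2 gain of the two-component model. The helper functions `solve` and `ols` are ours, and the data-generating coefficients (0.1, 0.5, 1.0) are arbitrary assumptions; no Newey-West adjustment is applied.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(columns, y):
    """OLS with intercept; returns (coefficients, R2), coefficients[0] being the intercept."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - rss / tss

# Hypothetical expected surprises E[S_t] and unexpected surprises S_t - E[S_t].
expected = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1]
unexpected = [1.0, -0.5, -0.2, 0.8, 0.3, -1.1]
# Hypothetical responses generated as R_t = 0.1 + 0.5 E[S_t] + 1.0 (S_t - E[S_t]).
response = [0.1 + 0.5 * e + 1.0 * u for e, u in zip(expected, unexpected)]

beta_two, r2_two = ols([expected, unexpected], response)   # Eq. (5.4.2)
beta_unexp, r2_unexp = ols([unexpected], response)         # Eq. (5.4.3)
r2_gain = r2_two - r2_unexp
pct_gain = r2_gain / r2_unexp
print(beta_two, r2_two, r2_unexp, r2_gain, pct_gain)
```

Since the unexpected-surprise model is nested in the two-component model, the R2 gain is non-negative by construction; the question of interest is how large it is relative to the R2 of the unexpected-surprise model.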


Table 5.9: Results of two-component (expected- and unexpected-economic surprise) response models

                                Panel A - Two-component response models         Panel B - Two-component vs. unexpected response models
                                Anchor-only             Unrestricted            Anchor-only             Unrestricted
Indicator                       R2      %Sig.E  %Sig.U  R2      %Sig.E  %Sig.U  R2 U    Gain    %Gain   R2 U    Gain    %Gain
US Initial Jobless Claims SA    8.8%    40%     93%     8.9%    60%     87%     8.5%    0.3%    3%      8.4%    0.5%    6%
[…] Payrol                      20.9%   7%      100%    22.2%   60%     100%    20.6%   0.2%    1%      20.0%   2.2%    11%
U-3 US Unemployment Rate Total  2.0%    7%      53%     1.8%    0%      47%     1.6%    0.4%    23%     1.4%    0.3%    23%
[…] Payrol                      1.8%    0%      47%     2.1%    7%      47%     1.6%    0.2%    14%     1.9%    0.3%    15%
US Continuing Jobless Claims S  1.9%    0%      73%     2.2%    0%      73%     1.8%    0.1%    5%      2.1%    0.1%    5%
ADP National Employment Report  33.2%   13%     93%     33.3%   60%     93%     32.8%   0.5%    1%      28.7%   4.6%    16%
US Average Weekly Hours All Em  1.6%    0%      0%      2.0%    0%      20%     1.1%    0.5%    47%     1.8%    0.2%    10%
US Personal Income MoM SA       0.9%    0%      27%     1.0%    20%     7%      0.7%    0.2%    27%     0.4%    0.7%    194%
ISM Manufacturing PMI SA        19.1%   20%     87%     19.3%   33%     87%     18.8%   0.3%    2%      18.4%   0.9%    5%
US Manufacturers New Orders To  6.5%    20%     87%     6.6%    27%     80%     6.0%    0.5%    9%      5.8%    0.8%    13%
Federal Reserve Consumer Credi  1.9%    29%     14%     1.2%    7%      7%      1.0%    0.9%    90%     0.8%    0.3%    42%
Merchant Wholesalers Inventori  0.8%    20%     0%      0.5%    0%      0%      0.3%    0.5%    176%    0.3%    0.2%    68%
US Industrial Production MOM S  19.9%   27%     93%     21.4%   67%     87%     19.2%   0.7%    4%      18.1%   3.3%    18%
GDP US Chained 2009 Dollars Qo  24.0%   60%     100%    24.3%   47%     100%    22.4%   1.6%    7%      23.2%   1.1%    5%
US Capacity Utilization % of T  15.7%   60%     93%     16.0%   73%     93%     13.8%   1.9%    14%     13.3%   2.6%    20%
US Personal Consumption Expend  1.4%    13%     40%     1.3%    13%     33%     0.8%    0.6%    74%     0.9%    0.4%    41%
US Durable Goods New Orders In  13.9%   7%      93%     13.6%   60%     93%     13.5%   0.4%    3%      12.3%   1.3%    11%
US Auto Sales Domestic Vehicle  4.4%    40%     7%      1.9%    7%      0%      0.5%    3.9%    754%    0.9%    1.0%    109%
Adjusted Retail & Food Service  21.6%   20%     93%     23.5%   67%     93%     21.0%   0.7%    3%      21.2%   2.3%    11%
Adjusted Retail Sales Less Aut  25.6%   73%     93%     25.9%   73%     93%     22.1%   3.5%    16%     23.4%   2.5%    11%
US Durable Goods New Orders To  22.0%   13%     87%     22.0%   67%     93%     21.5%   0.5%    2%      20.0%   2.0%    10%
GDP US Personal Consumption Ch  4.7%    27%     80%     5.4%    27%     67%     3.6%    1.1%    31%     3.7%    1.7%    46%
ISM Non-Manufacturing NMI       25.4%   33%     80%     26.0%   40%     87%     23.2%   2.2%    10%     24.3%   1.7%    7%
US Manufacturing & Trade Inven  0.8%    7%      7%      0.8%    0%      13%     0.4%    0.4%    88%     0.4%    0.4%    93%
Philadelphia Fed Business Outl  20.2%   33%     100%    21.7%   67%     93%     19.5%   0.8%    4%      18.1%   3.5%    20%
MNI Chicago Business Barometer  7.7%    60%     87%     7.9%    7%      93%     6.6%    1.1%    17%     7.4%    0.6%    8%
Conference Board US Leading In  3.3%    0%      67%     3.8%    67%     40%     3.1%    0.3%    9%      1.3%    2.5%    202%
Conference Board Consumer Conf  23.5%   53%     93%     24.7%   67%     93%     21.7%   1.8%    8%      22.2%   2.4%    11%
US Empire State Manufacturing   15.8%   33%     87%     15.5%   47%     80%     14.7%   1.0%    7%      13.6%   1.9%    14%
Richmond Federal Reserve Manuf  2.8%    7%      40%     2.8%    0%      53%     2.0%    0.7%    36%     2.6%    0.2%    9%
ISM Milwaukee Purchasers Manuf  3.5%    27%     13%     2.5%    7%      13%     1.8%    1.7%    95%     1.5%    1.0%    70%
University of Michigan Consume  4.0%    40%     73%     4.5%    40%     73%     3.4%    0.7%    20%     2.9%    1.6%    54%
Dallas Fed Manufacturing Outlo  1.0%    0%      0%      2.1%    20%     0%      0.5%    0.6%    125%    0.6%    1.5%    255%
US PPI Finished Goods Less Foo  8.4%    33%     87%     8.9%    27%     87%     7.5%    1.0%    13%     7.2%    1.8%    25%
US CPI Urban Consumers MoM SA   9.0%    20%     100%    10.9%   0%      100%    8.4%    0.6%    7%      10.6%   0.3%    3%
US CPI Urban Consumers Less Fo  17.6%   7%      100%    19.3%   7%      100%    17.5%   0.2%    1%      19.0%   0.3%    1%
Bureau of Labor Statistics Emp  7.1%    33%     20%     5.1%    20%     20%     2.8%    4.4%    158%    2.6%    2.5%    95%
US Output Per Hour Nonfarm Bus  4.4%    27%     53%     4.2%    27%     53%     3.1%    1.3%    44%     2.7%    1.5%    55%
US PPI Finished Goods SA MoM%   6.3%    27%     67%     6.3%    53%     47%     5.5%    0.9%    16%     4.4%    2.0%    45%
US Import Price Index by End U  2.4%    20%     67%     2.8%    60%     27%     1.8%    0.6%    30%     0.9%    1.9%    200%
US GDP Price Index QoQ SAAR     2.0%    0%      40%     3.4%    60%     13%     1.6%    0.4%    26%     0.9%    2.5%    261%
US Personal Consumption Expend  0.8%    0%      7%      1.7%    7%      27%     0.6%    0.2%    25%     1.2%    0.5%    46%
US Personal Consumption Expend  1.6%    7%      27%     1.5%    7%      13%     1.1%    0.5%    45%     1.0%    0.5%    45%
Average                         9.7%    22%     62%     10.0%   32%     59%     8.8%    0.9%    49%     8.7%    1.4%    51%
Popularity-weighted average     11.6%   26%     72%     12.0%   37%     68%     10.6%   1.0%    33%     10.5%   1.5%    42%
Most pop. indicators (avg)      19.0%   26%     95%     20.1%   48%     94%     18.1%   0.9%    5%      18.4%   1.7%    9%

Column key: in Panel A, R2 is the R2 of the two-component response model, and %Sig.E and %Sig.U are the percentages of significant coefficients on expected and unexpected surprises, respectively. In Panel B, R2 U is the R2 of the unexpected response model, and Gain and %Gain are the R2 gain and the percentage R2 gain delivered by the two-component model.

Panel A of this table reports results for the two-component response model based on the anchor-only and on the unrestricted model for economic surprises. Two-component response models separate economic surprises into expected surprises, as forecasted by economic surprise predictive models, and unexpected surprises, as the residual between economic surprises and expected surprises, as given by Eq. (5.4.2). Statistics reported for two-component models are R2 and the percentage of regression coefficients that are statistically significant across different markets. Panel B reports R2 for the (univariate) unexpected-surprise response model as given by Eq. (5.4.3), as well as the absolute and percentage gain in R2 delivered by two-component response models versus unexpected-surprise response models. Regression results are reported per economic indicator. We use Newey-West adjustments to compute coefficient standard errors. The asterisks ***, **, and * indicate significance at the one, five, and ten percent level, respectively.


Our findings suggest that the coefficient for unexpected surprises is often significant: on average in 62 and 59 percent of cases for the anchor-only and unrestricted models, respectively. The coefficient for expected surprises is also frequently significant, although less often than the one for unexpected surprises. Still, because expected surprises are significant in 22 and 32 percent of cases, we conclude that market responses are also strongly linked to expected surprises, not only to unexpected ones. The fact that expected surprises are more often significant in the unrestricted models than in the anchor-only model, and that unexpected surprises are less often significant in the unrestricted models, reiterates our result that factors beyond ESA, such as Skew, are informative in predicting surprises.

Panel B of Table (5.9) compares the R2 of the univariate unexpected-surprise response models, based on the anchor-only and unrestricted surprise models, with the explanatory power of their two-component (multivariate) counterparts. At first glance, we see that the average explanatory power delivered by the anchor-only-based and unrestricted-based unexpected-surprise models is quite similar, suggesting that the unexplained portion of the surprise as modelled by these two approaches is comparable. When we compare the average R2 delivered by the two-component models with the one delivered by unexpected surprises only (across anchor-only and unrestricted models), the R2 of the two-component models is only marginally higher, by roughly one percentage point. Nevertheless, when we evaluate the percentage gain in R2 delivered by the two-component models (versus the unexpected-surprise models), there is an indication that the two-component models increase the explanatory power of the unexpected-surprise models quite substantially. This gain is roughly 50 percent on average, across the anchor-only and the unrestricted model. This finding indicates that the predictable portion of surprises adds substantial explanatory power to response models relative to the explanatory power of unexpected-surprise-only models. These results add to our finding that expected surprise models comprise a relevant source of information for estimating market responses around economic surprises.
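The difference between the roughly one percentage point gain in average R2 and the roughly 50 percent average percentage gain arises because the latter is an average of indicator-level ratios. The sketch below uses the 43 anchor-only percentage gains reported in Panel B of Table (5.9), in table order, together with the reported average R2 values; the variable names are ours, and the median is added only as an illustrative summary that is not reported in the table.

```python
from statistics import mean, median

# Percentage R2 gains (anchor-only model), one per indicator, as reported in Table 5.9.
pct_gain_anchor_only = [
    3, 1, 23, 14, 5, 1, 47, 27, 2, 9, 90, 176, 4, 7, 14, 74, 3, 754, 3, 16, 2, 31,
    10, 88, 4, 17, 9, 8, 7, 36, 95, 20, 125, 13, 7, 1, 158, 44, 16, 30, 26, 25, 45,
]

# Average of the indicator-level percentage gains (the "Average" row of the table).
average_of_ratios = mean(pct_gain_anchor_only)

# Ratio of the averages: gain in average R2 relative to the average unexpected-model R2.
avg_r2_two_component = 9.7
avg_r2_unexpected = 8.8
ratio_of_averages = 100 * (avg_r2_two_component - avg_r2_unexpected) / avg_r2_unexpected

print(round(average_of_ratios), median(pct_gain_anchor_only), round(ratio_of_averages))
```

The average of ratios is pulled up by indicators whose unexpected-surprise R2 is close to zero, where even a small absolute gain translates into a very large percentage gain.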

Further, for expected surprises the percentage of significant coefficients is roughly the same for the popularity-weighted and the un-weighted averages. These findings are in line with our earlier observation that responses connected to surprises of popular indicators are just as predictable as responses provoked by unpopular indicators. In contrast, we observe that the percentage of significant coefficients for unexpected surprises is higher for the popularity-weighted average and the most popular indicators than for the un-weighted average, across anchor-only and unrestricted surprise models. In the same vein, the percentage gain in R2 delivered by the two-component models is lower for popular indicators than for the average indicator, reflecting that the explanatory power added by expected surprises (on top of the R2 produced by the unexplained portion) is smaller for popular indicators than for the average indicator. These findings indicate that, although expected surprises are connected to market responses across indicators of any level of popularity, the unexpected component of surprises is more prevalent relative to the expected one for popular indicators than for un-popular indicators. In other words, the unexpected component of surprises for popular indicators dominates their expected component, which happens less frequently for un-popular indicators.

In this way, our findings diverge from the bold conclusion of Campbell and Sharpe (2009) that traders “look-through” the bias. Nevertheless, our results differ from theirs partly because we use a much broader set of economic indicators, which includes un-popular ones. When we focus on the set of popular indicators used by these authors, our results are more in line with theirs, in that the unexpected component of surprises is the main explanatory variable of market responses (despite not being the only one). Hence, the link between expected surprises and market responses is more prevalent in a set of less popular indicators, despite the fact that biases are more pervasive in popular indicators.

5.5 Conclusion

This chapter investigates how forecasters' biases, both cognitive and rational, are associated with future macroeconomic surprises and their respective market responses around announcements in the US. We empirically confirm that the anchor bias previously recognized in the literature remains pervasive, but we show that higher moments of the distribution of economic forecasts are also informative in predicting surprises. Particularly, our results suggest that the

skewness of the distribution of economic forecasts provides reliable information to predict eco-

nomic surprises. Whereas anchoring has clear behavioral roots, we assume that the information

contained in the skewness of forecasts reflects a rational bias. This assumption builds on the

literature on strategic behavior by forecasters, who have dual and contradicting objectives, i.e.,

forecast accuracy and publicity. According to this stream of research, forecasters typically stay

close to the “pack” (herding) but eventually, when in the possession of what they perceive to

be private information, they intentionally issue off-consensus forecasts, contributing to a highly

skewed distribution of forecasts. Our results, thus, suggest that professional forecasters often

possess private information and that they do make use of it by issuing controversial (and in-

formative) forecasts. Under these conditions, macroeconomic surprises are, at least, partially

predictable. Predictability is, though, stronger for popular indicators, suggesting that as we

move from widely followed indicators towards less watched ones, biases become less pervasive.

In the same vein, we show that predictability and the strong link between macroeconomic surprises and forecast skewness also hold for other countries/regions, such as Continental Europe, the United Kingdom and Japan, albeit to a lesser extent.

A consequence of economic surprises being predictable might be that responses observed in

asset returns around macroeconomic announcements are also predictable. Our findings confirm

this hypothesis in-sample and, to a lesser degree, out-of-sample. We find that forecasts made using our unrestricted-extended economic surprise models outperform simpler models in forecasting market responses across the four asset classes studied. The success of these models is partially associated with the expected economic surprises modelled and partially linked to past returns, challenging the Efficient Market Hypothesis (EMH).

Asset returns around announcements of highly-followed macroeconomic indicators are as predictable as around releases of unpopular indicators, despite economic surprises being more predictable for popular indicators. Nevertheless, market responses around announcements of popular indicators are more frequently linked to the unexpected component of surprises than responses around unpopular indicators are. We also find that machine learning techniques outperform OLS regression models in point forecasts, which may be linked to their non-linear nature. Undoubtedly, the regularized machine learning models we apply are superior to OLS regression at avoiding overfitting in our data set. Further, we find that returns of assets that are sensitive to the fundamentals being revealed by macro announcements (local equities and bonds) are more predictable around such events than foreign markets, currencies and commodities.

Yet, when forecasters fail to correctly forecast the direction of the economic surprises, an-

other bias seems to play a role in explaining market responses: regret. We identify the presence

of this cognitive bias as we find that negative (positive) market responses are more pervasive when the skewness of forecasts fails to correctly forecast surprises. Future research is warranted to strengthen our conclusions on the matter, while extending our findings to other cases, such as the forecasting of quarterly earnings releases, seems the natural next step to take. Extending our skewness-based forecast approach by using the skewness of top-quartile forecasters only should also strengthen our findings.

We conclude with the four key implications of our findings: 1) a better understanding of

the “market consensus” and of the informational content of higher moments of the distribution

of macroeconomic forecasts by regulators, policy makers and market participants; 2) the chal-

lenge of standard weighting schemes used in economic surprise indexes, which we find can be

improved by changing from “popularity” (or “attention”)-weighted to un-weighted, as market

responses around announcements of popular indicators are not more predictable than responses

around releases of unpopular indicators; 3) the proposition that advanced statistical learning

techniques should be used to refine the forecast of market responses amid macroeconomic re-

leases, especially when such methods prevent overfitting and are somewhat transparent; and 4)

the opening of a new stream in the literature to investigate regret effects in asset price responses

around announcements of forecasted figures.


5.A Appendix: Machine learning methods

5.A.1 Principal component analysis

Principal component analysis (PCA) is likely the most popular linear dimensionality reduction

tool. As such, PCA is an unsupervised method, which bares no association with an explained

variable Y but only with the features X1, X2, ..., Xp of a data set or model. PCA aims to

summarize a large set of correlated variables into a smaller number of orthogonal variables that

explain most of the variability in the original set. Hence, the first principal component (PC1) of

a data set is the orthogonal variable produced by a linear combination of the provided features

that can explain the most variance in this data set. More formally, to obtain PC1 one must

solve the following optimization problem, which maximizes the explained sample variance:

$$w^{(1)} = \arg\max_{w} \left\{ \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{p} w_{j1} x_{ij} \right)^{2} \right\} \quad \text{subject to } \|w\| = 1 \tag{5.A.1}$$

where $w_{j1}$ is the weight, or loading, of feature j in the first principal component; more generally, $w_{jK}$ denotes the loading of feature j in the Kth principal component.

Once PC1 is computed, the second principal component (PC2) can be computed in

the same manner after subtracting PC1 from the original data set. Higher-order principal

components (PC-K) are found analogously.
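The sequential procedure described above can be illustrated with a minimal sketch. It assumes mean-centred features and uses an eigendecomposition to solve the maximization in (5.A.1); the function names and the synthetic data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def leading_loadings(X):
    """Unit-norm w maximising (1/n) * sum_i (sum_j w_j x_ij)^2 for centred X."""
    second_moment = X.T @ X / X.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(second_moment)  # ascending order
    return eigenvectors[:, -1]

def pca_by_deflation(X, n_components):
    """Extract loadings one component at a time, deflating the data in between."""
    centred = X - X.mean(axis=0)
    residual = centred.copy()
    loadings = []
    for _ in range(n_components):
        w = leading_loadings(residual)
        loadings.append(w)
        # Remove the part of the data already explained by this component.
        residual = residual - np.outer(residual @ w, w)
    return np.column_stack(loadings), centred

# Illustrative data: four correlated features (synthetic, not the thesis data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
W, X_centred = pca_by_deflation(X, n_components=2)
scores = X_centred @ W  # PC1 and PC2 as new orthogonal variables
```

The columns of `W` hold the loadings of PC1 and PC2, and `scores` holds the corresponding orthogonal variables. In practice a singular value decomposition would usually be preferred over repeated deflation; the loop is kept here only because it mirrors the text.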

5.A.2 Ridge regression

The Ridge regression (Hoerl and Kennard, 1970) is a shrinkage method, similar to the Least

Absolute Shrinkage and Selection Operator (Lasso) of Tibshirani (1996). The main difference

between the Lasso and Ridge regression is that the former translates each coefficient by a

constant factor φ (typically denoted λ), truncating at zero, whereas the latter applies proportional

shrinkage. The main consequence of this difference is that coefficients shrunk by the Lasso can equal

exactly zero, whereas Ridge regression coefficients only approach zero in the limit. Therefore, the Ridge

regression only applies shrinkage and not shrinkage and variable selection simultaneously, which

should help forecast accuracy but does not improve model interpretation as the Lasso does.

Thus, regression models shrunk by the Lasso are sparse versions of the original regression model,

whereas Ridge regressions are not.

The regression coefficients obtained by the Ridge regression methodology applied (βLθ ) are

estimated by minimizing the quantity:

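$$\sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^{2} + \varphi \sum_{j=1}^{p} \beta_j^{2} = RSS + \varphi \sum_{j=1}^{p} \beta_j^{2} \tag{5.A.2}$$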

where φ is the tuning parameter, which is estimated via cross-validation. The cross-validation

applied uses three equal-size splits of our training data set. For a comparison between Ridge

regression and the Lasso, see Hastie et al. (2008).
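As an illustration of the penalized criterion in (5.A.2), the sketch below fits a Ridge regression in closed form and picks φ by three-fold cross-validation. The grid of φ values, the contiguous (unshuffled) folds, the function names and the synthetic data are all assumptions made for the example, not details taken from the thesis.

```python
import numpy as np

def ridge_fit(X, y, phi):
    """Minimise RSS + phi * sum_j beta_j^2; the intercept beta_0 is not penalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + phi * np.eye(X.shape[1]), Xc.T @ yc)
    beta0 = y_mean - x_mean @ beta
    return beta0, beta

def choose_phi(X, y, grid, n_splits=3):
    """Pick phi from `grid` by cross-validation over equal-size contiguous splits."""
    indices = np.arange(len(y))
    folds = np.array_split(indices, n_splits)
    cv_error = []
    for phi in grid:
        sse = 0.0
        for held_out in folds:
            train = np.setdiff1d(indices, held_out)
            beta0, beta = ridge_fit(X[train], y[train], phi)
            sse += np.sum((y[held_out] - beta0 - X[held_out] @ beta) ** 2)
        cv_error.append(sse / len(y))
    return grid[int(np.argmin(cv_error))]

# Synthetic example data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(90, 5))
y = 0.5 + X @ np.array([1.0, -2.0, 0.0, 0.0, 0.5]) + rng.normal(size=90)
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
phi_star = choose_phi(X, y, grid)
beta0_hat, beta_hat = ridge_fit(X, y, phi_star)
```

Setting `phi=0` reproduces ordinary least squares, while larger values shrink the coefficient vector towards zero without setting any entry exactly to zero, which is the contrast with the Lasso drawn in the text.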


5.A.3 Random Forest

Random forest (Breiman, 2001) is a decision tree-based method derived from bootstrap aggregation

(i.e., bagging). Bagging entails fitting a regression many times by applying the bootstrap

to the training data set and then averaging the predictions of each model. The goal of applying

bagging is to reduce the variance of predictions by averaging. As decision trees typically suffer from

high variance, ensemble methods such as bagging, Random forest and boosting are warranted

for better predictive accuracy. Random forest builds on bagging by growing

a large collection of trees with the twist that they are constructed to be as de-correlated as

possible. As in bagging, once the de-correlated trees are grown as a 'random forest', the

predictions coming from them are averaged into a single prediction.

The Random forest (regression) predictor is given by:

$$f^{B}_{rf}(x) = \frac{1}{B} \sum_{b=1}^{B} T(x; \Theta_b) \tag{5.A.3}$$

where B is the number of trees T(·) grown, Θb is the set of characteristics of the bth tree, namely

the split variables, the cut-points at each node and the terminal-node values, and x is

the set of explanatory variables. For details on decision trees, which are the basic building block

for Random forest, see Hastie et al. (2008).
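To make the averaging in (5.A.3) concrete, the following is a minimal sketch of a regression Random forest built from scratch: each tree is grown on a bootstrap sample, and only a random subset of features is tried at each node to de-correlate the trees. The tree depth, the number of features tried per node, the number of trees and all function names are assumptions chosen for brevity; this is not the implementation used in the thesis.

```python
import numpy as np

def best_split(X, y, features):
    """Best (gain, feature, cut) among `features`, measured by squared-error reduction."""
    parent_sse = np.sum((y - y.mean()) ** 2)
    best = None
    for j in features:
        for cut in np.unique(X[:, j])[:-1]:      # every cut leaves both sides non-empty
            left = X[:, j] <= cut
            sse = (np.sum((y[left] - y[left].mean()) ** 2)
                   + np.sum((y[~left] - y[~left].mean()) ** 2))
            if best is None or parent_sse - sse > best[0]:
                best = (parent_sse - sse, j, cut)
    return best

def grow_tree(X, y, rng, depth=3, n_try=2, min_split=10):
    """Grow one tree T(x; Theta_b), trying a random subset of features per node."""
    if depth == 0 or len(y) < min_split:
        return {"value": y.mean()}
    features = rng.choice(X.shape[1], size=n_try, replace=False)
    split = best_split(X, y, features)
    if split is None or split[0] <= 0:
        return {"value": y.mean()}
    _, j, cut = split
    left = X[:, j] <= cut
    return {"feature": j, "cut": cut,
            "left": grow_tree(X[left], y[left], rng, depth - 1, n_try, min_split),
            "right": grow_tree(X[~left], y[~left], rng, depth - 1, n_try, min_split)}

def predict_tree(tree, x):
    """Follow the split rules down to a terminal node and return its value."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["cut"] else tree["right"]
    return tree["value"]

def fit_forest(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, len(y), size=len(y))   # bootstrap sample
        forest.append(grow_tree(X[rows], y[rows], rng))
    return forest

def predict_forest(forest, X):
    """Average the B tree predictions, as in equation (5.A.3)."""
    return np.array([np.mean([predict_tree(t, x) for t in forest]) for x in X])

# Synthetic example: only the first feature drives y (hypothetical data).
rng = np.random.default_rng(2)
X = rng.uniform(size=(120, 3))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=120)
forest = fit_forest(X, y)
fitted = predict_forest(forest, X)
```

Each dictionary in `forest` plays the role of one Θb: it stores the split variable, the cut-point and, at the leaves, the terminal-node value.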

5.A.4 Random forest variable Importance measure

The Importance measure applied in our study³² computes the average node impurity across all

trees grown by the Random forest, reflecting how optimum partitions made by the different

explanatory variables at each node compare with a ‘pure’ node, i.e., a constant fit over the

entire region.

The natural starting point for the calculation of the Importance measure (Ψv) per variable

v = 1...V is a single decision tree T as follows:

$$\Psi_v^{2}(T) = \sum_{j=1}^{J-1} \iota_j^{2} \, I(w(j) = v) \tag{5.A.4}$$

where the sum collects the importance of each variable v across the internal nodes of the tree

(J − 1 for a tree with J terminal nodes). At each node j, the splitting variable w(j) partitions the region into two sub-regions,

one with values equal to or higher than a boundary level and one with values smaller than it, and

the indicator I(·) equals one when variable v is the one used for the split. The variable chosen to make this split is the

one that maximizes the improvement (versus the previous nodes in this branch) ι²j in squared

errors achieved in this node relative to the decision of no partition, i.e., a 'pure' node.

³² Note that there are alternative Importance measures that can be used in Random forest and in other tree-based algorithms.

As many trees are utilized in Random forest, the Importance measure per tree (Ψ²v(T))

has to be aggregated into an overall model Importance measure, which is achieved by averaging

Importance across the total number of trees M:

$$\Psi_v^{2} = \frac{1}{M} \sum_{m=1}^{M} \Psi_v^{2}(T_m). \tag{5.A.5}$$

For more details on Importance measures see Hastie et al. (2008).
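Equations (5.A.4) and (5.A.5) can be sketched in a few lines. The example assumes each tree is stored as a nested dictionary in which every internal node records its split variable (`feature`) and the squared-error improvement of the split (`gain`, standing in for ι²j); this representation and the two hand-built trees are hypothetical and serve only to show the bookkeeping.

```python
def tree_importance(tree, n_variables):
    """Sum the squared-error improvements of every internal node, per split variable."""
    importance = [0.0] * n_variables
    stack = [tree]
    while stack:
        node = stack.pop()
        if "feature" in node:                      # internal node; leaves carry no split
            importance[node["feature"]] += node["gain"]
            stack.extend([node["left"], node["right"]])
    return importance

def forest_importance(forest, n_variables):
    """Average the per-tree importance over all M trees, as in (5.A.5)."""
    per_tree = [tree_importance(t, n_variables) for t in forest]
    return [sum(column) / len(forest) for column in zip(*per_tree)]

# Two hand-built trees (hypothetical): variable 0 splits twice, variable 1 once.
leaf = {"value": 0.0}
tree_1 = {"feature": 0, "gain": 4.0,
          "left": {"feature": 1, "gain": 1.0, "left": leaf, "right": leaf},
          "right": leaf}
tree_2 = {"feature": 0, "gain": 2.0, "left": leaf, "right": leaf}
importance = forest_importance([tree_1, tree_2], n_variables=2)
```

Here variable 0 accumulates improvements of 4 and 2 across the two trees and variable 1 an improvement of 1 in the first tree only, so the averaged importances are 3 and 0.5, respectively.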


Chapter 6

Conclusion

The main theme of this thesis is the link between behavioral finance, ex-ante informational

sources, particularly probability densities extracted from option markets and macroeconomic

survey data, and investment strategies.

Chapter 2 analyses the effects of the European 2011 short sale ban on financial market

stability and contagion risk through the lens of risk-neutral densities (RND) and implied

jump risk from single-stock options. We find that the short sale bans imposed by Belgium,

France, Italy, and Spain increased implied jump risk, especially for the banned stocks, even after

controlling for information flow and stock-specific factors. Partially, this is caused by a smaller

supply of puts during the ban as market makers became more risk-sensitive following equity

market declines (see Garleanu et al., 2009), which can be explained by the CPT’s overweighting

of tails. However, we find that contagion risk for banned stocks decreased during the ban relative

to the pre-ban period. While we observe that the short sale ban is effective in restricting both

outright and synthetic shorts on banned stocks, we do find evidence that investors seem to switch

from single-stock puts to index puts because of “flight-to-liquidity” incentives. This migration

likely diverted selling pressure from the financial stocks to a larger share of the stock market,

thereby reducing the destabilizing effects in the financial sector. Thus, if the first and foremost

goal of imposing a ban is reducing systemic risk, then the 2011 bans do seem to fulfill this

purpose. However, we note that this success comes at a cost: the increase in the implied jump

risk due to a supply shift. Despite the fact that this effect in implied jump risk indicates

market failure and may have adversely influenced market participants’ expectations, it helped

to preserve market stability by reducing contagion risk.

In Chapter 3 we estimate the CPT probability weighting function parameter γ for gains

and find that it is qualitatively consistent with the one predicted by Tversky and Kahneman

(1992), endorsing our hypothesis that investors in single stock call options are biased. However, the

overweighting of small probabilities is less pronounced than proposed by the CPT and exhibits

a positive term structure as it becomes less pronounced as the option maturity increases. In-

vestors’ overweighting of small probabilities is also largely time-varying and sample dependent.

It is pronounced in periods in which sentiment is high, for instance the IT bubble period, and

it abates when sentiment is low. Our results challenge the view that single stock call options


are structurally overpriced and offer the insight that the overweighting of tail events implied in these

options is conditional on sentiment levels and option maturity rather than positive stock

fundamentals, loss aversion levels or investor preferences for skewness.

Chapter 4 finds that the overweighting of small probability events is strongly time-varying and present

in both OTM index puts and single stock calls, due to individual and institutional investors'

trading activity, respectively. In order to capture both bullish and bearish sentiment from

trading activity of these two types of market participants, we propose a novel indicator: IV-

sentiment. Contrarian-trading strategies using our IV-sentiment measure produce economically

significant risk-adjusted returns. The joint use of information from the single stock and index

option markets seems to be the reason for the superior forecast ability of our indicator. More-

over, IV-sentiment seems to forecast returns as well as other well-known predictors of equity

returns and is uncorrelated to these predictors, significantly improving the quality of multifactor

predictive models. An IV-sentiment-based strategy also has little exposure to a set of widely used

cross-sectional equity factors, which includes Fama and French’s five factors, the momentum

factor and the low-volatility factor. We find that combining our sentiment strategy with other

strategies, such as a buy-and-hold of the S&P 500 index, time-series momentum and cross-sectional

equity momentum, can substantially improve their risk-return trade-offs. Cross-sectional momentum

benefits the most from our IV-sentiment measure as it seems to mitigate momentum crashes.

Chapter 5 investigates how forecasters’ biases, both cognitive and rational, are associated

with future macroeconomic surprises around announcements in the US. We empirically confirm

that the anchor bias previously recognized in macroeconomic forecasting remains pervasive but

also that the skewness of the distribution of economic forecasts provides reliable information for

the prediction of economic surprises, denoting the presence of a rational bias. A consequence

of economic surprises being predictable is that responses observed in asset returns around

macroeconomic announcements are also predictable. Our findings confirm this hypothesis in-

sample and, to a lesser degree, out-of-sample. Returns of assets that are sensitive to the

fundamentals being revealed by macro announcements (local equities and bonds) are found

to be more predictable around such events than foreign markets, currencies and commodities.

Yet, when forecasters fail to correctly forecast the direction of the economic surprises, a regret

bias seems also to play a role in explaining market responses.

Our findings yield further validation of behavioral finance and new angles to cross-asset

investing, equity timing, options pricing and short-sale bans. Hence, this thesis draws impli-

cations not only for academia but also for regulators, investors and other market participants.

For instance, it provides an additional set of tools to regulators, which can be used to monitor

contagion effects (Chapter 2) and the build up of speculative equity market bubbles (Chapter

3). It suggests that investors’ overweighting of small probabilities and a positive term struc-

ture of tails’ overweighting can be used in the development of behavioral option pricing models

(Chapter 3). It proposes that a contrarian IV-sentiment-based strategy for equity timing can

benefit equity investors and asset allocators (Chapter 4). It challenges the “popularity”-based

weighting schemes used in economic indexes in favor of an un-weighted scheme (Chapter 5).


In conclusion, this thesis finds strong but intricate relations between behavioral biases, ex-

ante distributions of equity market returns from options, moments of macroeconomic forecasts

and asset prices. It adds novel evidence that behavioral models resemble the decision-making

process of key market participants and can guide the design of investment strategies, with

implications for academics and practitioners.

We acknowledge that our research can still be expanded in several directions. For instance,

it would be interesting to analyze ban-driven increases in implied jump risk using the mutually

exciting jumps model of Ait-Sahalia et al. (2015). This analysis is relevant as it might justify co-

ordinated introduction of bans by regulators. As our IV-sentiment-based strategy is negatively

correlated with and largely improves the momentum factor, it could be further investigated as a

protection against momentum crashes (see Daniel and Moskowitz, 2016). Moreover, we conjec-

ture that using individual forecasters’ information may largely strengthen our results regarding

the presence of bias in macroeconomic forecasting, beyond improving the forecast ability of

models. We also think that the data set on macroeconomic releases used in Chapter 5 should

be further explored. This data set can serve as the backbone of studies that investigate under-

and over-reaction of asset prices to macroeconomic news (in line with Chan, 2003) and the

economic cycle on a high-frequency basis. Given our preliminary finding that regret (see Loomes

and Sugden, 1982; Bell, 1982) seems to drive market responses when forecasters fail to correctly

forecast the direction of economic surprises, the matter requires further investigation. Future

research on regret around economic surprises should not only validate our finding but also test

for its presence elsewhere, such as in the forecasting of quarterly earnings releases. Finally,

given the infancy of research that applies artificial intelligence methods to forecasting financial

data, more studies on the topic are warranted. These methods should be further explored

because they allow for great dimensionality reduction and overcome some of the limitations of

standard regressions, such as multicollinearity and linearity.


Bibliography

Acharya, V., Pedersen, L., 2005. Asset pricing with liquidity risk. Journal of Financial Eco-

nomics 77 (2), 375–410.

Aggarwal, R., Mohanty, S., Song, F., 1995. Are survey forecasts of macroeconomic variables

rational? Journal of Business 68, 99–119.

Ait-Sahalia, Y., Cacho-Diaz, J., Laeven, R., 2015. Modeling financial contagion using mutually

exciting jump processes. Journal of Financial Economics 117 (3), 585–606.

Ait-Sahalia, Y., Lo, A., 2000. Nonparametric risk management and implied risk aversion.

Journal of Econometrics 94 (3), 9–51.

Amin, K., Coval, J., Seyhun, H., 2004. Index option prices and stock market momentum.

Journal of Business 77 (4), 835–874.

Anagnou, I., Bedendo, M., Hodges, S., Tompkins, R., 2002. The relationship between implied

and realised probability density function. Working paper, University of Warwick and the

University of Technology, Vienna.

Ang, A., Chen, J., Xing, Y., 2006. Downside risk. Review of Financial Studies

19 (4), 1191–1239.

Ang, A., Liu, J., 2007. Risk, return and dividends. Journal of Financial Economics 85, 1–38.

Asness, C., Moskowitz, T., Pedersen, L., 2013. Value and momentum everywhere. Journal of

Finance 68 (3), 929–985.

Baker, M., Wurgler, J., 2007. Investor sentiment in the stock market. Journal of Economic

Perspectives 21 (Spring), 129–157.

Bakshi, G., Kapadia, N., Madan, D., 2003. Stock returns characteristics, skew laws, and the

differential pricing of individual equity options. Review of Financial Studies 16 (1), 101–143.

Balla, E., Ergen, I., Migueis, M., 2014. Tail dependence and indicators of systemic risk for large

US depositories. Journal of Financial Stability 15, 195–209.

Barber, B., Odean, T., 2008. All that glitters: The effect of attention and news on the buying

behavior of individual and institutional investors. Review of Financial Studies 21, 785–818.


Barberis, N., 2013. The psychology of tail events: Progress and challenges. American Economic

Review 103 (3), 611–616.

Barberis, N., Huang, M., 2001. Mental accounting, loss aversion and individual stock returns.

Journal of Finance 56 (4), 1247–1292.

Barberis, N., Huang, M., 2008. Stocks as lotteries: The implication of probability weighting for

security prices. American Economic Review 98 (5), 2066–2100.

Barberis, N., Huang, M., Santos, T., 2001. Prospect theory and asset prices. Quarterly Journal

of Economics 56 (1), 1–53.

Barberis, N., Shleifer, A., Vishny, R., 1998. A model of investor sentiment. Journal of Financial

Economics 49 (3), 307–343.

Bates, D., 1991. The crash of 87: Was it expected? The evidence from options markets. Journal

of Finance 46 (3), 1009–1044.

Bates, D., 2000. Post-87 crash fears in the S&P 500 futures option market.

Journal of Econometrics 94 (1-2), 181–238.

Bates, D., 2003. Empirical option pricing: A retrospection. Journal of Econometrics 116, 387–

404.

Battalio, R., Schultz, P., 2011. Regulatory uncertainty and market liquidity: the 2008 short

sale bans impact on equity option markets. Journal of Finance 66 (6), 2013–2052.

Bauer, R., Cosemans, M., Eichholtz, P., 2009. Option trading and individual investor perfor-

mance. Journal of Banking & Finance 33 (4), 731–746.

Beber, A., Brandt, M., Luisi, M., 2015. Distilling the macroeconomic news flow. Journal of

Financial Economics 117 (3), 489–507.

Beber, A., Pagano, M., 2013. Short selling bans around the world: evidence from the 2007-2009

crisis. Journal of Finance 68 (1), 343–381.

Bell, D., 1982. Regret in decision making under uncertainty. Operations research 30 (5), 961–

981.

Benartzi, S., Thaler, R., 1995a. Myopic loss aversion and the equity premium puzzle. The

Quarterly Journal of Economics 110 (1), 73–92.

Benartzi, S., Thaler, R., 1995b. Myopic loss aversion and the equity premium puzzle. Quarterly

Journal of Economics 110 (1), 73–92.

Birru, J., Figlewski, S., 2011. Anatomy of a meltdown: the risk neutral density for the S&P 500

in the fall of 2008. Journal of Financial Markets 15 (2), 151–180.


Black, F., 1976. Studies of stock price volatility changes. Proceedings of the 1976 Meetings of

the American Statistical Association, 171–181.

Blei, D., Ng, A., Jordan, M., 2003. Latent Dirichlet allocation. Journal of Machine Learning

Research 3 (Jan), 993–1022.

Bliss, R., Panigirtzoglou, N., 2004. Option-implied risk aversion estimates. Journal of Finance

59 (1), 407–446.

Bloomberg, 2008. Introduction into the new Bloomberg implied volatility calculations. (March).

Blume, M., Keim, D., 2012. Institutional investors and stock market liquidity: Trends and

relationships. Working paper: The Wharton School.

Boehmer, E., Jones, C., Zhang, X., 2013. Shackling short sellers: the 2008 shorting ban. Review

of Financial Studies 26 (6), 1363–1400.

Bollen, N., Whaley, R., 2004. Does net buying pressure affect the shape of implied volatility

function? Journal of Finance 59 (2), 711–754.

Bollerslev, T., Tauchen, G., Zhou, H., 2009. Expected stock returns and variance risk premia.

Review of Financial Studies 22 (11), 4463–4492.

Boyer, B., Vorkink, K., 2014. Stock options as lotteries. Journal of Finance 69 (4), 1485–1527.

Breeden, D., Litzenberger, R., 1978. Prices of state-contingent claims implicit in option prices.

Journal of Business 51 (4), 621–651.

Breiman, L., 2001. Random forests. Machine Learning 45 (1), 5–32.

Brinson, G., Hood, L., Beebower, G., 1986. Determinants of portfolio performance. Financial

Analysts Journal 42 (4), 39–48.

Brunnermeier, M., Pedersen, L., 2009. Market liquidity and funding liquidity. Review of Finan-

cial Studies 22 (6), 2201–2238.

Campbell, J., Cochrane, J., 1999. By force of habit: A consumption-based explanation of

aggregate stock market behavior. Journal of Political Economy 107 (2), 205–251.

Campbell, J., Thompson, S., 2008. Predicting the equity premium out of sample: Can anything

beat the historical average? Review of Financial Studies 21 (4), 1509–1531.

Campbell, S., Sharpe, S., 2009. Anchoring bias in consensus forecasts and its effect on market

prices. Journal of Financial and Quantitative Analysis 44 (2), 369–390.

Capistran, C., Timmermann, A., 2009. Disagreement and biases in inflation expectations. Jour-

nal of Money, Credit and Banking 41 (2), 365–396.


Carhart, M., 1997. On persistence in mutual fund performance. The Journal of Finance 52 (1),

57–82.

Cen, L., Hilary, G., Wei, K., 2013. The role of anchoring bias in equity market: evidence

from analysts’ earnings forecasts and stock returns. Journal of Financial and Quantitative

Analysis 48 (1), 47–76.

Chabi-Yo, F., Song, Z., 2013. Recovering the probability weights of tail events with volatility

risk from option prices. Working paper Ohio State University and Federal Reserve System,

1–55.

Chan, L., Chen, H.-L., Lakonishok, J., 2002. On mutual fund investment styles. Review of

Financial Studies 15 (5), 1407–1437.

Chan, W., 2003. Stock price reaction to news and no-news: drift and reversal after headlines.

Journal of Financial Economics 70 (2), 223–260.

Chang, I., Christoffersen, P., Jacobs, K., 2013. Market skewness risk and the cross section of

stock returns. Journal of Financial Economics 107 (1), 46–68.

Chen, Y., Kumar, A., Zhang, C., 2015. Searching for gambles: investor attention, gambling

sentiment, and stock market outcomes. SSRN working paper 2635572.

Chira, I., Madura, J., Viale, K., 2013. Bank exposure to market fear. Journal of Financial

Stability 9, 451–459.

Choy, S., 2015. Retail clientele and option returns. Journal of Banking & Finance 51 (5),

141–159.

Colacito, R., Ghysels, E., Meng, J., Siwasarit, W., 2016. Skewness in expected macro funda-

mentals and the predictability of equity returns: Evidence and theory. Review of Financial

Studies 29 (8), 2069–2109.

Conrad, J., Dittmar, R., Ghysels, E., 2013. Ex-ante skewness and expected stock returns.


Cornell, B., 2009. The pricing of volatility and skewness: A new interpretation. Journal of

Investing 18, 27–30.

Corrado, C., Su, T., 1997. Implied volatility skews and stock index skewness and kurtosis

implied by S&P 500 index option prices. Journal of Derivatives 4 (4), 8–19.

Cremers, M., Weinbaum, D., 2010. Deviations from put-call parity and stock return predictabil-

ity. Journal of Financial and Quantitative Analysis 45, 335–367.

Daniel, K., Hirshleifer, D., Subrahmanyam, A., 1998. Investor psychology and security market

under- and overreactions. Journal of Finance 53 (6), 1839–1885.


Daniel, K., Moskowitz, T., 2016. Momentum crashes. Journal of Financial Economics 122 (2),

221–247.

Danielsson, J., Jorgensen, B., Sarma, M., de Vries, C., 2006. Comparing downside risk measures

for heavy tailed distributions. Economics Letters 92 (2), 202–208.

DataExplorers, L., 2011. Securities lending review Q3 2011: Back to its roots. Third ed. Data

Explorers Limited, London.

De Bondt, W., Thaler, R., 1990. Do security analysts overreact? American Economic Review

80 (2), 52–57.

De Haan, L., Jansen, D., Koedijk, K., de Vries, C., 1994. Safety first portfolio selection, extreme

value theory and long run asset risks. In Proceedings from a Conference on Extreme Value

Theory and Applications, Galambos J (ed.), Kluwer Academic: Boston, MA, 471–487.

De Long, J., Shleifer, A., Summers, L., Waldmann, R., 1990. Noise trader risk in financial

markets. Journal of Political Economy 98 (4), 703–738.

Dennis, P., Mayhew, S., 2002. Risk-neutral skewness: evidence from stock options. Journal of

Financial and Quantitative Analysis 37 (3), 471–493.

Devroye, L., 1986. Non-uniform random variate generation. Springer-Verlag, New York.

Dierkes, M., 2009. Option-implied risk attitude under rank-dependent utility. Unpublished work-

ing paper. University of Munster, Munster, Germany.

Doran, J., Peterson, D., Tarrant, B., 2007. Is there information in the volatility skew? Journal

of Futures Markets 27 (10), 921–959.

Driessen, J., Maenhout, P., Vilkov, G., 2009. The price of correlation risk: Evidence from equity

options. Journal of Finance 64 (3), 1377–1406.

Driessen, J., Maenhout, P., Vilkov, G., 2013. Option-implied correlations and the price of

correlation risk. SSRN working paper no. 2166829, 1–46.

Duan, J.-C., Wei, J., 2009. Systematic risk and the price structure of individual equity options.


Easterwood, J., Nutt, S., 1999. Inefficiency in analysts’ earnings forecasts: Systematic misre-

action or systematic optimism? Journal of Finance 54 (5), 1777–1797.

Engle, R., Mistry, A., 2008. Priced risk and asymmetric volatility in the cross-section of skew-

ness. SSRN working paper 1354529.

Fama, E., French, K., 1992. The cross-section of expected stock returns. Journal of Finance

47 (2), 427–465.


Fama, E., French, K., 2015. A five-factor asset pricing model. Journal of Financial Economics

116 (1), 1–22.

Fama, E., French, K., 2016. Dissecting anomalies with a five-factor model. Review of Financial

Studies 29 (1), 69–103.

Felix, L., Kraussl, R., Stork, P., 2016a. The 2011 European short sale ban: A cure or a curse?

Journal of Financial Stability 25, 115–131.

Felix, L., Kraussl, R., Stork, P., 2016b. Single stock call options as lottery tickets: overpricing

and investor sentiment. Forthcoming in Journal of Behavioral Finance, 1–38.

Felix, L., Kraussl, R., Stork, P., 2017a. Implied volatility sentiment: a tale of two tails. Tin-

bergen Institute Discussion Paper 17-002/IV - SSRN working paper 2758641, 1–54.

Felix, L., Kraussl, R., Stork, P., 2017b. Predictable biases in macroeconomic forecasts and their

impact across asset classes. SSRN working paper 3008976, 1–40.

Figlewski, S., 2010. Estimating the implied risk neutral density for the US market portfolio.

In Volatility and Time Series Econometrics: Essays in Honor of Robert F. Engle - Oxford

University Press.

Fox, C., Rogers, B., Tversky, A., 1996. Options traders exhibit subadditive decision weights.

Journal of Risk and Uncertainty 13, 5–17.

Frazzini, A., Pedersen, L., 2014. Betting against beta. Journal of Financial Economics 111 (1),

1–25.

Frijns, B., Huynh, T., Tourani-Rad, A., Westerholm, P., 2015. Institutional trading and asset

pricing. FIRN Research Paper No. 2531823, 1–55.

Garleanu, N., Pedersen, L. H., Poteshman, A. M., 2009. Demand-based option pricing. Review


Grammatikos, T., Vermeulen, R., 2012. Transmission of the financial and sovereign debt crises

to the EMU: stock prices, CDS spreads and exchange rates. Journal of International Money

and Finance 31 (3), 469–480.

Green, T., Hwang, B.-H., 2011. Initial public offering as lotteries: skewness preferences and

first-day returns. Management Science 86 (2), 432–444.

Grundy, B., Lim, B., Verwijmeren, P., 2012. Do option markets undo restrictions on short sales?

evidence from the 2008 short-sale ban. Journal of Financial Economics 106 (2), 331–348.

Han, B., 2008. Investor sentiment and option prices. Review of Financial Studies 21 (1), 387–

414.


Hartmann, P., Straetmans, S., de Vries, C., 2004. Asset market linkages in crisis periods. The

Review of Economics and Statistics 86 (1), 313–326.

Harvey, C., Siddique, A., 2000. Conditional skewness in asset pricing tests. Journal of Finance

60 (3), 1263–1296.

Hastie, T., Tibshirani, R., Friedman, J., 2008. The elements of statistical learning: data mining,

inference, and prediction (2nd ed.), Springer–Verlag, New York.

Haykin, S., 1999. Neural networks: A comprehensive foundation (2nd ed.), Pearson Prentice

Hall.

Hill, B., 1975. A simple general approach to inference about the tail of a distribution. Annals

of Statistics 3 (5), 1163–1173.

Hoerl, A., Kennard, R., 1970. Ridge regression: biased estimation for nonorthogonal problems.

Technometrics 12 (1), 55–67.

Hong, H., Stein, J., 1999. A unified theory of underreaction, momentum trading and overreac-

tion in asset markets. Journal of Finance 59 (6), 2143–2184.

Hsu, M., Krajbich, I., Zhao, C., Camerer, C., 2009. Neural response to reward anticipation

under risk is nonlinear in probabilities. Journal of Neuroscience 29 (7), 2231–2237.

Hull, J., Nelken, I., White, A., 2005. Merton’s model, credit risk, and volatility skews. Journal

of Credit Risk 1 (1), 3–28.

Ibbotson, R., Kaplan, P., 2000. Does asset allocation policy explain 40, 90, or 100 percent of

performance? Financial Analysts Journal 56 (1), 26–33.

Ilmanen, A., 2012. Do financial markets reward buying or selling insurance and lottery tickets?

Financial Analysts Journal 68 (5), 26–36.

Jackwerth, J., 2000. Recovering risk aversion from option prices and realized returns. Review


Jackwerth, J., Rubinstein, M., 1996. Recovering probability distributions from options prices.


Jackwerth, J., Vilkov, G., 2015. Asymmetric volatility risk: Evidence from option markets.

SSRN working paper 2325380.

Jarrow, R., Rudd, A., 1982. Approximate option valuation for arbitrary stochastic processes.


Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: implications for

stock market efficiency. Journal of Finance 48, 65–91.


Jiao, Y., 2016. Lottery preference and earnings announcement premia. SSRN Working Paper

2522798.

Kahneman, D., Tversky, A., 1979. Prospect theory: An analysis of decision under risk.

Econometrica 47 (2), 263–291.

Kliger, D., Levy, O., 2009. Theories of choice under risk: Insights from financial markets.

Journal of Economic Behavior & Organization 71 (2), 330–346.

Krishnam, C., P. R., Ritchken, P., 2008. Correlation risk. SSRN Working Paper 1027479, 1–31.

Kumar, A., 2009. Who gambles in the stock market? Journal of Finance 64 (4), 1889–1933.

Kupiec, P., 1995. Techniques for verifying the accuracy of risk management models. Journal of

Derivatives 3, 73–84.

Lahiri, K., Sheng, X., 2010. Measuring forecast uncertainty by disagreement: The missing link.

Journal of Applied Econometrics 25 (4), 514–538.

Lakonishok, J., Lee, I., Pearson, N., Poteshman, A., 2007. Option market activity. Review of

Financial Studies 20 (3), 813–857.

Laster, D., Bennett, P., Geoum, I., 1999. Rational bias in macroeconomic forecasts. Quarterly

Journal of Economics 114 (1), 293–318.

Legerstee, R., Franses, P., 2015. Does disagreement amongst forecasters have predictive value?

Journal of Forecasting 34 (4), 290–302.

Lehmann, B., 1990. Fads, martingales, and market efficiency. Quarterly Journal of Economics

105, 1–28.

Lemmon, M., Ni, S., 2011. The effects of investor sentiment on speculative trading and prices

of stock and index options. SSRN working paper no. 1572427.

Longstaff, F., 1995. Option pricing and the martingale restriction. Review of financial studies

8 (4), 1091–1124.

Loomes, G., Sugden, R., 1982. Regret theory: an alternative theory of rational choice under

uncertainty. Economic Journal 92 (4), 805–924.

Mahani, R., Poteshman, A., 2008. Overreaction to stock market news and misevaluation of stock

prices by unsophisticated investors: evidence from the options market. Journal of Empirical

Finance 15 (4), 635–655.

Mankiw, W., Thomas, C., 1997. Recovering an asset’s implied pdf from option prices: an

application to crude oil during the Gulf crisis. Journal of Financial and Quantitative Analysis

32 (1), 91–115.


Massy, W., 1965. Principal components regression in exploratory statistical research. Journal

of the American Statistical Association 60 (309), 234–256.

Melick, N., Reis, R., Wolfers, J., 1997. Disagreement about inflation expectations. NBER

Macroeconomics Annual 2003 18, 209–248.

Mendenhall, R., 1991. Evidence of possible underweighting of earnings-related information.

Journal of Accounting Research 29, 140–170.

Merton, R., 1974. On the pricing of corporate debt: the risk structure of interest rates. Journal

of Finance 29 (2), 449–470.

Michaely, R., Womack, K., 1999. Conflict of interest and the credibility of underwriter analyst

recommendations. Review of Financial Studies 12 (4), 653–686.

Mitton, T., Vorkink, K., 2007. Equilibrium underdiversification and the preference for skewness.


Moskowitz, T., Ooi, Y. H., Pedersen, L. H., 2012. Time series momentum. Journal of Financial

Economics 104 (2), 228–250.

Nelson, D., 1991. Conditional heteroskedasticity in asset returns: A new approach. Economet-

rica 59 (2), 347–370.

Ottaviani, M., Sorensen, P., 2006. The strategy of professional forecasting. Journal of Financial

Economics 81 (2), 441–466.

Pastor, L., Stambaugh, R., 2003. Liquidity risk and expected stock returns. Journal of Political

Economy 111 (3), 642–685.

Polkovnichenko, V., Zhao, F., 2013. Probability weighting functions implied in option prices. Journal of Financial Economics 107 (3), 580–609.


Pollet, J., Wilson, M., 2008. Average correlation and stock market returns. Journal of Financial

Economics 96 (3), 364–380.

Poon, S.-H., Granger, C., 2003. Forecasting volatility in financial markets: a review. Journal of Economic Literature 41 (2), 478–539.

Prelec, D., 1998. The probability weighting function. Econometrica 66 (3), 497–527.

Rapach, D., Strauss, J., Zhou, G., 2010. Out-of-sample equity premium prediction: Combina-

tion forecasts and links to the real economy. Review of Financial Studies 23 (2), 821–862.

Rosenberg, J., Engle, R., 2002. Empirical pricing kernels. Journal of Financial Economics

64 (3), 341–372.

Rubinstein, M., 1994. Implied binomial trees. Journal of Finance 49 (3), 771–818.


Scharfstein, D., Stein, J., 1990. Herd behavior and investment. American Economic Review

80 (3), 465–479.

Schirm, D., 2003. A comparative analysis of the rationality of consensus forecasts of U.S. economic indicators. Journal of Business 76, 547–561.

Sievert, C., Shirley, K., 2014. LDAvis: a method for visualizing and interpreting topics. Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces June, 63–70.

Sobaci, C., Sensoy, A., Erturk, M., 2014. Impact of short selling activity on market dynamics: evidence from an emerging market. Journal of Financial Stability 15, 53–62.

Stickel, S., 1991. Common stock returns surrounding earnings forecast revisions: more puzzling

evidence. The Accounting Review 66, 402–416.

Straetmans, S., Verschoor, W., Wolff, C., 2008. Extreme US stock market fluctuations in the wake of 9/11. Journal of Applied Econometrics 23 (1), 17–42.

Tibshirani, R., 1996. Regression shrinkage and selection via the lasso. Journal of the Royal

Statistical Society 58 (1), 267–288.

Tim, T., 2001. Rationality and analysts’ forecast bias. Journal of Finance 61 (1), 369–385.

Truong, C., Shane, P., Zhao, Q., 2016. Information in the tails of the distribution of analysts’

quarterly earnings forecasts. Financial Analysts Journal 73 (5), 84–99.

Tversky, A., Kahneman, D., 1974. Judgement under uncertainty: heuristics and biases. Science

185, 1124–1131.

Tversky, A., Kahneman, D., 1992. Advances in prospect theory: Cumulative representation of

uncertainty. Journal of Risk and Uncertainty 5 (4), 297–323.

Vilkov, G., Xiao, Y., 2013. Option-implied information and predictability of extreme returns.

SAFE (Goethe University Frankfurt) Working Paper Series 5, 1–36.

Von Neumann, J., Morgenstern, O., 1947. Theory of games and economic behavior, 2nd edition. Princeton University Press, Princeton.

Ward, E., 1982. Conservatism in human information processing. In Daniel Kahneman, Paul

Slovic and Amos Tversky. (1982). Judgment under uncertainty: Heuristics and biases, Cam-

bridge University Press, New York.

Welch, I., Goyal, A., 2008. A comprehensive look at the empirical performance of equity pre-

mium prediction. Review of Financial Studies 21 (4), 1455–1508.

Wu, G., Gonzalez, R., 1996. Curvature of the probability weighting function. Management Science 42 (12), 1676–1690.

Yan, S., 2011. Jump risk, stock returns, and slope of implied volatility smile. Journal of Finan-

cial Economics 99 (1), 216–233.

Zarnowitz, V., Lambros, L., 1987. Consensus and uncertainty in economic prediction. Journal

of Political Economy 95 (3), 591–621.

Zhang, X., 2006. Information uncertainty and analyst forecast behavior. Contemporary Ac-

counting Research 23 (2), 565–590.


Summary

This PhD thesis is about behavioral finance, the sub-field of behavioral economics that studies the impact of psychological and cognitive biases on financial decision making. The main hypothesis of behavioral finance is that people systematically make irrational decisions when outcomes are uncertain. Behavioral finance was a breakthrough because it challenged classical economic and financial theories, both of which are built on the assumption that individuals are fundamentally rational, as implied by expected utility theory. The proponents of behavioral finance used lab experiments to show that individuals making decisions under uncertainty violate the axioms of expected utility theory. As a result, behavioral finance models were designed in a stylized form, disconnected from financial markets. This thesis therefore adds to the growing literature that attempts to validate the hypotheses of behavioral finance in real financial markets. Specifically, most of my research investigates market inefficiencies that we hypothesize to be explained by the probability weighting function of Cumulative Prospect Theory (CPT). Using ex-ante information from option prices, we find that this function plays a role in explaining some inefficient behaviors of market makers, retail investors and institutional investors, which produces interesting investment insights. Additionally, my research recognizes the influence of other cognitive biases, such as anchoring, conservatism, overconfidence, herding, regret, and rational bias, on the behavior of professional forecasters of macroeconomic data.


Samenvatting (Dutch summary, in English translation)

This dissertation is about behavioral finance, the sub-field of behavioral economics that studies the influence of human psychology on financial decisions. Behavioral finance holds that people systematically make irrational choices when they have to decide in an uncertain situation. This insight brought about a breakthrough in economics, which until then had always relied on expected utility theory, which holds that people are essentially rationally acting individuals. Using laboratory experiments, behavioral finance researchers showed that the behavior of people making decisions under uncertainty is inconsistent with the axioms of expected utility theory. Stylized behavioral finance models were subsequently formulated in isolation from financial markets, and a flourishing research literature now attempts to validate them with financial market data. This dissertation is a contribution to that literature. In particular, I examine the extent to which market inefficiencies are explained by the so-called probability weighting function from Cumulative Prospect Theory (CPT). Using ex-ante information from option prices, my research shows that these probability weightings play a role in explaining the inefficient behavior of market makers, retail investors and institutional investors. In addition, the dissertation identifies the influence of other behavioral effects, including anchoring, conservatism, overconfidence, herding, regret, and rational bias, by using another ex-ante source of information, namely surveys among forecasters of macroeconomic statistics.


Short biography

Luiz Fernando Fortes Felix was born in Belo Horizonte (Brazil) on July 26, 1978. During his early years he lived in many places, among them a village in the Amazon forest (Serra dos Carajas) and the United States, where he graduated from high school. In 2001, he graduated in Public Administration from Fundacao Joao Pinheiro and in 2002 in Law from Universidade Federal de Minas Gerais, both in Belo Horizonte. Subsequently, he obtained a Diploma in Finance from IBMEC (Brazil) and an MSc degree in Finance and Investments from Durham University (UK), both fully funded by scholarships. Over the course of his professional career, Luiz has earned the CFA and CQF charters.

He has worked in financial markets since 2001. He spent the first years of his career managing fixed income portfolios at a Brazilian pension fund. In 2005, he joined ABN AMRO Asset Management in Amsterdam (the Netherlands) as a quantitative investment strategist. Since 2008 he has worked at the Asset Allocation & Overlay (AA&O) department of APG Asset Management in Amsterdam. APG is one of the largest investors in the world fully dedicated to managing pension funds’ assets and liabilities. In AA&O Luiz has managed several derivative-based strategies, ranging from hedging and protection programs to systematic active strategies. He has also managed APG’s Absolute Return Strategies (ARS) pool, which invests in a set of renowned hedge funds. He has been closely involved in the introduction of active management within AA&O, being the main designer of its tactical asset allocation (TAA) and the active FX mandate. His responsibilities include the research, design and management of investment strategies in the global equities, fixed income, commodities and foreign exchange markets. Luiz is the coordinator of AA&O’s portfolio research and the co-founder of the APG Quant Roundtable. More recently, he has been closely engaged with the APG Innovation initiative, which has led him to design natural language processing (NLP) and deep learning-based investment strategies.

Luiz wrote his PhD thesis while working full-time at APG Asset Management from 2012 to 2018, mainly during evenings and weekends. He is married to Clarissa Calil Bonifacio and together they have two children, Thomas and Bernardo, aged 4 and 1, respectively, at the time of writing.


Publications

Modified versions of Chapters 2 and 3 of this thesis are published as:

1. Felix, L., Kraussl, R., Stork, P., 2016a. The 2011 European short sale ban: A cure or a curse? Journal of Financial Stability 25, 115–131.

2. Felix, L., Kraussl, R., Stork, P., 2016b. Single stock call options as lottery tickets: overpricing and investor sentiment. Forthcoming in Journal of Behavioral Finance, 1–38.

Extended versions of Chapters 4 and 5 of this thesis are available in the form of the following

research papers:

1. Felix, L., Kraussl, R., Stork, P., 2017a. Implied volatility sentiment: a tale of two tails.

Tinbergen Institute Discussion Paper 17-002/IV - SSRN working paper 2758641, 1-54.

Available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2899680.

2. Felix, L., Kraussl, R., Stork, P., 2017b. Predictable biases in macroeconomic forecasts

and their impact across asset classes. SSRN working paper 3008976, 1-40. Available at

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3008976.


Conference presentations

1. 2013 3rd International Conference of the Financial Engineering and Banking Society

(FEBS) in Paris.

2. 2014 IX Seminar on Risk, Financial Stability and Banking Conference of the Banco

Central do Brasil in Sao Paulo.

3. 2014 International Risk Management Conference in Warsaw.

4. 2014 Financial Management Association (FMA) European Conference in Maastricht.

5. 2016 IFABS Conference in Barcelona.

6. 2016 Research in Behavioral Finance Conference (RBFC) in Amsterdam.

7. Board of Governors of the Federal Reserve System research seminar in Washington D.C.

in 2016.

8. 2017 Infiniti Conference in Valencia.

9. 2017 EEA-ESEM Conference in Lisbon.

10. MAN-AHL Research Seminar in London in 2017.

11. 2017 Econometrics and Financial Data Science workshop at the Henley Business School

in Reading.

12. 2018 Conference of the Swiss Society for Financial Market Research (SGF) in Zurich.

13. 2018 annual meeting of the European Financial Management Association (EFMA) in Milan.

14. 2018 EEA-ESEM Conference in Cologne.

15. 2018 European Finance Association (EFA) in Warsaw.

16. 2018 Research in Behavioral Finance Conference (RBFC) in Amsterdam.
