UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl)
UvA-DARE (Digital Academic Repository)
Computational techniques in queueing and fluctuation theory
Mohammad Asghari, N.
Citation for published version (APA): Mohammad Asghari, N. (2014). Computational techniques in queueing and fluctuation theory.
General rights: It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).
Disclaimer/Complaints regulations: If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.
Download date: 28 May 2020
Computational techniques
in queueing and fluctuation theory
INVITATION
for the public defense of my PhD thesis
Computational techniques
in queueing and fluctuation theory
that will take place on Tuesday November 25, 2014 at 14.00
in the Agnietenkapel, Oudezijds voorburgwal 231, Amsterdam
The defense will be followed by a reception
Naser Mohammad Asghari
Korteweg-de Vries Instituut voor Wiskunde
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
PhD thesis, University of Amsterdam
Copyright © 2014 by Naser Mohammad Asghari, Amsterdam
All rights reserved. No part of this book may be reproduced,
in any form or by any means, without permission in writing
from the author.
ACADEMIC DISSERTATION
to obtain the degree of doctor
at the University of Amsterdam
by authority of the Rector Magnificus
prof. dr. D.C. van den Boom
before a committee appointed by the Doctorate Board,
to be defended in public in the Agnietenkapel
on Tuesday 25 November 2014, at 14:00
by
Naser Mohammad Asghari
born in Tehran, Iran
Doctoral committee
Supervisor (promotor): Prof. dr. M. R. H. Mandjes
Other members:
Dr. P. J. C. Spreij
Prof. dr. R. Núñez Queija
Prof. dr. ir. C. W. Oosterlee
Prof. dr. J. H. van Zanten
Prof. dr. D. T. Crommelin
Prof. dr. B. F. Heidergott
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Acknowledgments
When I was a school student, I was fascinated by almost every topic: mathematics, physics,
chemistry, biology, history, geography, ... (except arts!), and I wanted to be an expert in all of
them. One day I studied physics, the other day history, and so on. Finally I found out that I like
physics and mathematics most. So I started mathematics at university, but the next year I
changed to physics. After I graduated in physics, I was introduced to economics and
finance by my brother Mohsen and my best friend, Mehdi. Physicists had already been
active in economics and finance for some time, and they called their research econophysics.
Research in econophysics led me to stochastic processes and mathematical finance.
I was very lucky that I managed to persuade Michel Mandjes to supervise me, despite the
fact that I did not have a mathematics background. Although I had studied probability and
stochastic processes, my knowledge was not enough to start doing research. Michel trusted
me and guided me such that I managed to get on the right track in quite a short period
of time. In fact, it would not be possible to accomplish this thesis without his great help
and support. Michel, thank you for your trust, support, guidance and kindness. You also
supported me apart from my thesis and I always appreciate it.
I would also like to thank Peter Spreij. He gave me the chance to be an assistant for his
courses at UvA. He is a great teacher and those courses still help me in my research and in
my job. He also supported me when I was looking for a job. Thank you, Peter.
I met Martijn Pistorius during a summer school in mathematical finance, in 2011 in Ljubljana.
I recall that I enjoyed his lectures. When he came to KdVI, whenever I had challenges in my
research, he always had valuable suggestions and comments. I would like to thank him.
KdVI is a prestigious institute, and I am really proud of being a member of it. I would not
have gotten this chance without the support of Jan Wiegerinck, to whom I am very grateful.
I would also like to thank Evelien, Henneke and Marieke for their help. I never felt lonely at
KdVI with Nabi, Paul, Jevgenijs, Ricardo, Enno, Loek, Arie, and Piotr. I will never forget the
dinners we had at Jevgenijs’s place.
I am also so thankful to Peter den Iseger, Anwar Walid and Krzysztof Debicki for the nice
collaboration when writing our joint papers.
I would like to thank Drona Kandhai for giving me the chance to work in his team at ING. I
really enjoy working with my managers and colleagues at ING: Dirk, Veronica, Bart, Geert-
Jan, Joanna, Artem, Dmytro, Daan, Jef, Markus, Xiaoyu, Carlos and Frits Koen.
Outside the University and ING, I belong to an Iranian community. We have a great time
together. I would like to thank Mohammad & Maryam, Vahid & Sara, Hodjat & Marzieh,
Mehdi & Marzieh, Afshin & Fatemeh, Ali & Azadeh, Mahdi & Mahdieh, Masoud & Hoda,
Amin & Fatemeh, Naser & Aylar, Shayan & Narges, Mohammad & Fahimeh, Saeed & Andisheh,
Farzin & Mahsa, Hossein, Mahdi, Jafar, Narges, Rojman, Mohammad, Abbas, Danial,
Amir, Behnaz, Maryam and Rokhsareh.
I want to thank my family. How would I get here without your love, support and encouragement?
My mother and father, sisters and brother, nephew and nieces, my father, mother,
brothers and sisters in law. I love you all and I cannot imagine a moment without you.
My wife, Sareh, deserves a special acknowledgment. Thank you for your love and support.
You always give me energy and encouragement to continue. This thesis is dedicated to you.
Naser Mohammad Asghari,
Amsterdam, October 2014
Contents
Acknowledgments
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 Models
1.3 Preliminaries on Lévy fluctuation theory
1.4 Preliminaries on Markov fluid models
1.5 Outline of thesis, contributions
2 Numerical techniques in Lévy fluctuation theory
2.1 Introduction
2.2 Preliminaries
2.3 Laplace inversion
2.4 Approximation with rational Laplace transform
2.5 Small jumps
2.6 Beta processes
2.7 Discussion and concluding remarks
3 Evaluation of option prices
3.1 Introduction
3.2 Preliminaries
3.3 Transforms of prices and Greeks of lookback options
3.4 Numerical validation
3.5 Discussion
3.6 Concluding Remarks
4 Asymptotics of the supremum of a Lévy process
4.1 Introduction
4.2 Asymptotics
4.3 Importance sampling
5 Energy-Efficient Scheduling in Multi-Core Servers
5.1 Introduction
5.2 Energy Cost Function and Models
5.3 Optimization of Energy Consumption
5.4 Robustness Analysis
5.5 Conclusion
Bibliography
Samenvatting
Summary
About the author
List of Figures
5.1 Optimal cost of static serving and sleep mode strategies vs. buffering cost rate.
5.2 Thresholds in the 1-threshold and the hysteretic strategy with respect to β.
5.3 Optimal cost of the hysteretic strategy and the 1-threshold strategy.
5.4 Optimal cost of strategies with many thresholds.
5.5 Service rate as function of buffer content in continuous strategy.
List of Tables
2.1 Brownian Motion.
2.2 Compound Poisson with exponential jumps.
2.3 Compound Poisson with Weibull jumps.
2.4 Compound Poisson with Weibull jumps.
2.5 Compound Poisson with Pareto jumps.
2.6 Compound Poisson with shifted-Pareto jumps.
2.7 Compound Poisson with both upward and downward jumps.
2.8 CGMY-like upward-jumps.
2.9 Variance Gamma process.
2.10 Beta process.
3.1 Black-Scholes model.
3.2 Greeks corresponding to Black-Scholes model.
3.3 Merton model.
3.4 Kou model.
3.5 CGMY model.
3.6 Greeks corresponding to CGMY model.
3.7 Beta model.
4.1 Simulation results corresponding to Compound Poisson process.
4.2 Simulation results corresponding to Variance Gamma process.
5.1 Optimal cost of single-server strategies.
5.2 Optimal cost of multiple-server strategies.
5.3 Robustness with respect to changes in the mean on-time.
5.4 Robustness with respect to changes in the distribution of the on-times, with the mean on-time unchanged.
5.5 Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time unchanged.
5.6 Robustness with respect to changes in the distribution of the on-times, with the mean on-time changing as well.
5.7 Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time changing as well.
Chapter 1 Introduction
1.1 Motivation
Stochastic phenomena are everywhere around us. Consider the example of the valuation
of options (or other financial products) in financial markets. Since the future prices of the
underlying assets are highly uncertain, we need a model to describe them. Such a model
should be able to accurately capture the random features of the underlying stochastic process.
Since option prices are essentially functionals of the evolution of the price of
the underlying asset, the model can then be used to price the options.
Another example in which randomness plays an important role is the design of various
types of communication infrastructures. A commonly used paradigm is to model such
systems as queueing networks. The design objective typically reflects the tradeoff between
the cost and the quality-of-service delivered to the customers; adding capacity improves the
perceived quality, but obviously comes at a price. In recent years substantial emphasis has
been put on designing the network such that it uses energy resources efficiently, for instance
by adaptively changing the processor speeds of the queues as a function of the current
workload. Such algorithms need to be set up delicately, as they should not compromise the
performance perceived by the network’s users.
These are just illustrative examples of instances where stochastic modeling offers a useful
mathematical framework. Of course such models can be applied in many other situations
as well. One could think of a wide variety of other domains, such as population biology,
chemical reaction networks, and statistical physics. The use of techniques from stochastic
modeling to optimize the usage of network resources obviously extends beyond
the communication networks that we mentioned above: this approach is also used intensively
in transport networks and logistic networks. The list of potential application
areas keeps on expanding; recently developed areas include social networks, brain
modeling, and forensic sciences.
Modeling real-life situations by means of stochastic systems is typically a first step; in order
for these models to have practical use, they need to allow fast and accurate (numerical)
evaluation. To illustrate this, let us go back to the two motivating examples that we introduced
above. In the context of option pricing, traders need to respond to the clients’ requests
nearly instantly, which requires that they be able to compute prices virtually in real
time. The alternative is to rely on simulation-based techniques, which are fast only when
specific simplifications have been imposed on the underlying model (for instance by assuming
a simplistic volatility model, or by assuming the underlying stochastic process is just
Brownian); when the ambition is to use more sophisticated models, simulation techniques
become inherently problematic. This motivates the research on fast and accurate numerical
computation techniques that do not require us to a priori simplify the underlying stochastic
processes.
Also in the setting of the (optimal) design of energy-efficient communication networks,
computational issues play a prominent role. A typical objective function includes the energy
consumption per time unit as well as a performance measure that reflects the quality-of-service,
and the idea is to find the service policy that strikes an optimal balance between
these components (or the less ambitious objective of identifying a policy that is ‘close’ to this
optimum). The optimization tries to find, for any specific condition the network can be in,
what the best service strategy is. The complexity of finding an optimum lies in the huge
parameter space that we have to optimize over, as well as the sometimes unpleasant specific
properties of the objective function (it is not necessarily a smooth unimodal function, for
instance). In the setting of designing a network the optimization routine can in principle be
performed off-line. It should be realized, however, that the underlying algorithm may need
to evaluate the objective function very frequently, and as a consequence we
need a technique that evaluates the system’s performance (for a single parameter instance)
fast and sufficiently accurately. Perhaps equally important is ensuring that the
resulting design is robust: when the input parameters differ slightly from their estimated
values, the system should still behave ‘nearly’ optimally.
1.2 Models
A canonical model in stochastic modeling is the so-called random walk. Consider a sequence
of independent and identically distributed increments $Y_1, Y_2, \ldots$, and the associated random
walk
\[ S_n := \sum_{i=1}^{n} Y_i. \]
The probabilistic behavior of these random walks has been an object of intensive study.
Classical results are the law of large numbers, saying that $S_n/n$ converges (in various specific
forms) to the corresponding mean, and the central limit theorem, stating that a centered
and normalized version of $S_n/n$ converges in distribution to a standard Normal random variable.
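Both classical results can be illustrated numerically. The following is a minimal Python sketch, assuming Uniform(0,1) increments (an arbitrary choice made only for this illustration, with mean 1/2 and variance 1/12); the function names `random_walk` and `normalized_sum` are hypothetical and do not come from the thesis.

```python
import math
import random


def random_walk(increments):
    """Return the partial sums S_1, ..., S_n of the given increments."""
    path, total = [], 0.0
    for y in increments:
        total += y
        path.append(total)
    return path


def normalized_sum(n, rng, mean=0.5, std=math.sqrt(1.0 / 12.0)):
    """Centered and normalized sum (S_n - n*mean) / (std*sqrt(n)).

    The defaults are the mean and standard deviation of Uniform(0,1),
    which is the increment law assumed in this sketch.
    """
    s_n = sum(rng.random() for _ in range(n))
    return (s_n - n * mean) / (std * math.sqrt(n))


rng = random.Random(2014)

# Law of large numbers: S_n / n should be close to the mean 1/2.
n = 10_000
path = random_walk(rng.random() for _ in range(n))
lln_estimate = path[-1] / n

# Central limit theorem: the normalized sums should have mean close to 0
# and variance close to 1.
samples = [normalized_sum(50, rng) for _ in range(2_000)]
clt_mean = sum(samples) / len(samples)
clt_var = sum((z - clt_mean) ** 2 for z in samples) / (len(samples) - 1)
```

The tolerances one would accept for `lln_estimate`, `clt_mean` and `clt_var` depend on the sample sizes; the values above are chosen so that the Monte Carlo error is small compared to the quantities being illustrated.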
In the context of the option pricing example introduced above, observations from financial
data suggest that models with independent and identically distributed increments form a
natural framework. On the other hand, since in this example time is in principle not slotted,
a discrete-time framework is less appropriate. This motivates looking at the
continuous-time counterpart of the above random-walk model, viz. the
Lévy model. A Lévy process $(X_t)_{t \ge 0}$ is a real-valued continuous-time process, with $X_0 = 0$,
such that all increments are independent and identically distributed (meaning that for all $s$,
$h$ and $t$ we have that $X_{s+h} - X_s$ and $X_{t+h} - X_t$ have the same distribution, and that $X_{t+h} - X_t$
is independent of the path up to time $t$, $(X_u)_{u \le t}$). In some cases processes in finance are not
well modeled by $(X_t)_{t \ge 0}$ itself, but rather by $(e^{X_t})_{t \ge 0}$ (thus also avoiding the sometimes
problematic issue that the underlying process can attain negative values).
In the example of option pricing, various payoff structures involve both the value $X_T$ of the
underlying at maturity $T$ and the largest value $\overline{X}_T$ attained until $T$. An example here is
the so-called barrier option: there, for a given strike $K$ and barrier $H$, the payoff is given by
$\max\{X_T - K, 0\}$ provided that $\overline{X}_T \ge H$; if we have insight into the probabilistic properties
of these objects, we are in a position to price the option. This explains the interest in getting a
handle on the joint distribution of $X_T$ and the running maximum $\overline{X}_T$. As it turns out, for Lévy
processes there is a vast body of literature available on this joint distribution, commonly
referred to as Wiener-Hopf theory. We recapitulate the first principles, as well as a collection
of key results, of this theory in Section 1.3.
In the context of communication networks, a key concept is that of a queue. The data that
cannot be processed immediately upon arrival is temporarily stored in a buffer, thus naturally
leading to a notion of a queueing process. The Lévy process defined above can in
principle attain any real value (i.e., positive and negative), so to make it a queueing process
we should prevent it from becoming negative. The (commonly followed) way of truncating
the process is by applying a so-called reflection or regulation: we introduce a queueing process
(or: workload process) by
\[ Q_t := \sup_{s \le t} (X_t - X_s); \]
observe that this entails that the workload is a functional of the driving Lévy process $(X_t)_{t \ge 0}$.
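On a discrete time grid the reflection can be computed in a single pass, since the supremum of $X_t - X_s$ over $s \le t$ equals $X_t$ minus the running minimum of the path. The following Python sketch assumes a path sampled at finitely many grid points starting from $X_0 = 0$; the function name `workload` and the sample path are hypothetical and only serve as an illustration.

```python
def workload(path):
    """Reflected (workload) path on a grid.

    For a sampled path X_0, X_1, ..., X_n this returns
    Q_k = max_{j <= k} (X_k - X_j) = X_k - min_{j <= k} X_j.
    """
    running_min = path[0]
    reflected = []
    for x in path:
        running_min = min(running_min, x)
        reflected.append(x - running_min)
    return reflected


# Hypothetical net-input path (input minus offered service), starting at 0.
net_input = [0.0, 1.0, 0.5, -0.5, -1.5, -1.0, 0.0, -2.0]
queue = workload(net_input)
```

The reflected path is non-negative by construction, and it stays at zero for as long as the driving path keeps setting new minima.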
It is remarked that in such a Lévy-driven queue the amounts of traffic fed into the system
in two disjoint periods of time are independent, which is typically not true in the context
of communication networks: if a user is generating traffic at a high rate at some point in
time, it is relatively likely that he or she is still doing so a small amount of time later. This
observation motivates why traffic is often modeled as a so-called on-off process: the user’s
activity level alternates between generating traffic at some rate $r$ for a while, and being silent,
thus creating some positive dependence within the queue’s input process. Subsequent
on- and off-times may be assumed to be independent.
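An on-off source is easy to simulate. The sketch below assumes exponentially distributed on- and off-times (the text above only assumes independence, so the exponential law and all parameter values are assumptions of this illustration) and estimates the long-run mean traffic rate, which should be close to $r$ times the fraction of time the source is on. The function names are hypothetical.

```python
import random


def on_off_periods(n_cycles, mean_on, mean_off, rng):
    """Independent (on-time, off-time) pairs; exponential laws are assumed."""
    return [
        (rng.expovariate(1.0 / mean_on), rng.expovariate(1.0 / mean_off))
        for _ in range(n_cycles)
    ]


def traffic_generated(periods, rate):
    """Total traffic and total elapsed time; the source emits at `rate` while on."""
    total_on = sum(on for on, _ in periods)
    total_time = sum(on + off for on, off in periods)
    return rate * total_on, total_time


rng = random.Random(25)
periods = on_off_periods(20_000, mean_on=1.0, mean_off=3.0, rng=rng)
traffic, elapsed = traffic_generated(periods, rate=2.0)

# Long-run mean rate; theory suggests rate * mean_on / (mean_on + mean_off) = 0.5.
mean_rate = traffic / elapsed
```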
1.3 Preliminaries on Lévy fluctuation theory
Brownian motion is one of the most frequently used stochastic models, being applied in a variety
of domains. Among its most important properties are the continuity of its sample paths
and its scale invariance. It is noted, however, that many phenomena which are modeled by
Brownian motion do not have these properties. In finance, for example, returns are often
modeled by Brownian motion, but they tend to jump up and down; these discontinuities
cannot be ignored in many situations [31].
Another classical stochastic process is the Poisson process. This has discontinuous sample
paths (as it jumps up by steps of size 1, after exponentially distributed times). A Poisson
process is a non-decreasing process, and therefore has paths of bounded variation over finite
time (as opposed to Brownian motion).
Although Brownian motion and the Poisson process are seemingly completely different,
they share a number of important features. Both have right-continuous paths with left limits,
and they both are Markovian processes that start at the origin. They belong to a wide class of
stochastic processes, Lévy processes, named after the French mathematician Paul Lévy, which
play an important role in this thesis. We primarily focus on the analysis of the probabilistic
properties related to the extreme values of these processes; this branch
of research is commonly referred to as fluctuation theory. For a complete treatment of Lévy
processes and fluctuation theory, we refer to e.g. the textbook [68].
A process $X = \{X_t, t \ge 0\}$ defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a Lévy process if it
has right-continuous paths with left limits, and if it has stationary and independent increments,
with $X_0 = 0$. ‘Stationarity’ in this context means that increments over time intervals
of the same length are identically distributed; ‘independence’ refers to the property that
increments corresponding to non-overlapping time intervals behave statistically independently.
An important, and highly convenient, property of the Lévy process is that its characteristic
function obeys a closed-form formula, which is in fact an immediate consequence of the process
having stationary and independent increments. From the definition of a Lévy process,
it is known that $X_t$ is infinitely divisible for any $t > 0$. Realizing that, for any $n \in \mathbb{N}$, $X_t$
equals
\[ X_t = X_{t/n} + (X_{2t/n} - X_{t/n}) + \cdots + (X_t - X_{(n-1)t/n}), \]
it follows that
\[ \log \mathbb{E} e^{isX_t} = t \log \mathbb{E} e^{isX_1} = t\,\xi(s) \]
(first for rational $t$, and with a limiting argument also for real $t$). The function
$\xi(s) := \log \mathbb{E} e^{isX_1}$ is often referred to as the characteristic exponent of the Lévy process. As stated
in the following theorem, it can be shown that a Lévy process can be uniquely defined by a
triple $(\mu, \sigma, \Pi)$ [68, 34, 21].
Theorem 1.3.1. Lévy-Khintchine formula for Lévy processes. Suppose that $\mu \in \mathbb{R}$, $\sigma \ge 0$.
Let the measure $\Pi$ be concentrated on $\mathbb{R} \setminus \{0\}$, in such a way that the regularity condition
$\int_{\mathbb{R}} \min\{1, x^2\}\,\Pi(dx) < \infty$ is met. This triple $(\mu, \sigma, \Pi)$ defines, for any $s \in \mathbb{R}$,
\[ \xi(s) = \log \mathbb{E} e^{isX_1} = i\mu s - \frac{1}{2} s^2 \sigma^2 + \int_{-\infty}^{\infty} \left( e^{isx} - 1 - isx\,1_{\{|x|<1\}} \right) \Pi(dx). \]
Then there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which a Lévy process can be defined having
characteristic exponent $\xi(s)$. $\Pi$ is called the corresponding Lévy measure, while $\mu$ is
often referred to as the drift, and $\sigma^2$ as the parameter of the Brownian term.
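As a sanity check, the Lévy-Khintchine formula can be evaluated numerically for a concrete triple. The Python sketch below assumes a Lévy measure with density $\lambda \eta e^{-\eta x}$ on $(0, \infty)$ (a compound Poisson process with exponential jumps; this choice and all parameter values are assumptions of the illustration), integrates the formula with a composite Simpson rule, and compares the result with the closed form that follows from $\int_0^\infty (e^{isx} - 1)\,\eta e^{-\eta x}\,dx = \eta/(\eta - is) - 1$. The function names are hypothetical.

```python
import cmath
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; f may be complex-valued."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def levy_khintchine(s, mu, sigma, density, upper=40.0):
    """Characteristic exponent xi(s) for a Levy measure with a density on (0, upper).

    The integral is split at x = 1, where the truncation indicator jumps.
    """
    small = simpson(
        lambda x: (cmath.exp(1j * s * x) - 1 - 1j * s * x) * density(x), 0.0, 1.0
    )
    large = simpson(
        lambda x: (cmath.exp(1j * s * x) - 1) * density(x), 1.0, upper, n=20000
    )
    return 1j * mu * s - 0.5 * s * s * sigma * sigma + small + large


def closed_form(s, mu, sigma, lam, eta):
    """Same exponent in closed form for the exponential-jump example."""
    # Compensator of the small jumps: integral of x * lam*eta*exp(-eta*x) over (0, 1).
    compensator = lam * (1 - math.exp(-eta) * (1 + eta)) / eta
    return (
        1j * (mu - compensator) * s
        - 0.5 * s * s * sigma * sigma
        + lam * (eta / (eta - 1j * s) - 1)
    )


lam, eta = 3.0, 2.0
density = lambda x: lam * eta * math.exp(-eta * x)
numeric = levy_khintchine(1.5, 0.3, 0.8, density)
exact = closed_form(1.5, 0.3, 0.8, lam, eta)
```

The truncation point `upper` is chosen so that the neglected tail of the density is far below the quadrature error.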
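The characteristic exponent can be rewritten as
\[ \xi(s) = \left\{ i\mu s - \frac{1}{2} s^2 \sigma^2 \right\} + \left\{ \Pi(\mathbb{R} \setminus (-1,1)) \int_{|x| \ge 1} (e^{isx} - 1)\, \frac{\Pi(dx)}{\Pi(\mathbb{R} \setminus (-1,1))} \right\} + \left\{ \int_{0<|x|<1} (e^{isx} - 1 - isx)\,\Pi(dx) \right\}. \]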
It should be mentioned that in case $\Pi(\mathbb{R} \setminus (-1,1)) = 0$ the second term is left out. Based on
this formula for the characteristic exponent, the Lévy process $X_t$ can be decomposed as the
independent sum of processes $X^{(1)}$, $X^{(2)}$ and $X^{(3)}$, which are described as follows (Lévy-Itô
decomposition). In the first place, $X^{(1)}$ is a linear Brownian motion with drift. Then, $X^{(2)}$ is
a compound Poisson process with rate $\Pi(\mathbb{R} \setminus (-1,1))$, where the jumps are independent and
identically distributed with distribution $\Pi(dx)/\Pi(\mathbb{R} \setminus (-1,1))$ concentrated on $\{x : |x| \ge 1\}$.
Finally, concentrating on the last term, it is first observed that it can be written as
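\[ \int_{0<|x|<1} (e^{isx} - 1 - isx)\,\Pi(dx) = \sum_{n \ge 0} \left\{ \lambda_n \int_{2^{-(n+1)} \le |x| < 2^{-n}} (e^{isx} - 1)\,F_n(dx) - is\lambda_n \left( \int_{2^{-(n+1)} \le |x| < 2^{-n}} x\,F_n(dx) \right) \right\} \]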
where $\lambda_n := \Pi(\{x : 2^{-(n+1)} \le |x| < 2^{-n}\})$ and $F_n(dx) := \Pi(dx)/\lambda_n$. The component $X^{(3)}$ can
thus be considered as the superposition of (at most) a countable number of independent
compound Poisson processes with different rates and linear drift. In fact, $X^{(3)}$ is a square
integrable martingale with an almost surely countable number of jumps on each finite time
interval. Importantly, the number of jumps can be infinite almost surely, leading to the class
of Lévy models with infinite activity; we will work intensively with this class later in this
thesis.
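The dyadic construction can be made concrete for a simple infinite-activity example. The sketch below assumes a Lévy measure with density $x^{-1-a}$ on $(0, 1)$ with $0 < a < 2$ (an assumption made for this illustration only, with positive small jumps only) and computes the band rates $\lambda_n$ in closed form. The rates grow geometrically, so the total jump rate is infinite, whereas the second moments of the bands sum to the finite value $1/(2-a)$, in line with the regularity condition on $\Pi$. The function names are hypothetical.

```python
def band_rate(n, a):
    """lambda_n = Pi([2^-(n+1), 2^-n)) for the assumed density x^(-1-a) on (0, 1)."""
    lo, hi = 2.0 ** -(n + 1), 2.0 ** -n
    return (lo ** -a - hi ** -a) / a


def band_second_moment(n, a):
    """Integral of x^2 * x^(-1-a) over the n-th dyadic band."""
    lo, hi = 2.0 ** -(n + 1), 2.0 ** -n
    return (hi ** (2 - a) - lo ** (2 - a)) / (2 - a)


a = 0.5
rates = [band_rate(n, a) for n in range(30)]

# The rates increase by a factor 2^a per band: infinitely many small jumps.
total_rate_first_30_bands = sum(rates)

# The second moments are summable; the full sum equals 1 / (2 - a).
second_moment = sum(band_second_moment(n, a) for n in range(30))
```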
Further relevant classifications are the following. If $\Pi(-\infty, 0) = 0$, then it follows from the
Lévy-Itô decomposition that the corresponding Lévy process has no negative jumps. In this
case it is referred to as a spectrally positive Lévy process. Conversely, a Lévy process is
called spectrally negative if $-X$ is spectrally positive (i.e., it has no positive jumps). These two
classes of processes are generally indicated by S+ and S− respectively, and referred to as
the spectrally one-sided class. As we will see throughout this thesis, for the class of spectrally
one-sided Lévy processes often very explicit analysis is possible.
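Let $X$ be a spectrally positive process, and assume in addition $\int_{(0,\infty)} \min(1, x)\,\Pi(dx) < \infty$,
$\sigma = 0$ and $\mu - \int_{(0,1)} x\,\Pi(dx) \ge 0$ (that is, the drift is non-negative once the compensation
of the small jumps is removed). Then, again from the Lévy-Itô decomposition, it follows that the process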
$X$ has non-decreasing paths. Such a process is referred to as a subordinator. Given a Lévy
process $X_t$ and an independent subordinator $\tau_s$, we can introduce another Lévy process by
sampling $X_t$ at stochastic time epochs which are defined by the subordinator. More precisely,
suppose that $X_t$ is a Lévy process with characteristic exponent $\xi$ and $\tau = \{\tau_s : s \ge 0\}$ is
an independent subordinator with characteristic exponent $\Xi$. Then the process $Y$, which is
defined by $Y_s := X_{\tau_s}$, is a Lévy process, and its characteristic exponent is given by $\Xi \circ \xi$ [68]. For
example, a possible representation of the so-called Variance Gamma process (a frequently
used infinite-activity Lévy process) corresponds to sampling a Brownian motion at times
that result from a Gamma process [34].
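Subordination translates directly into a simulation recipe: draw the increments of the subordinator and evaluate the Brownian motion over those random time increments. The Python sketch below does this for the Variance Gamma example. The parametrization (Brownian drift `theta`, volatility `sigma`, and a Gamma subordinator with unit mean rate and variance rate `nu`) is an assumption of this illustration, as are the function name and the parameter values. Under this parametrization $Y_1$ has mean `theta` and variance `sigma**2 + theta**2 * nu`, which the sample moments should approximate.

```python
import math
import random


def variance_gamma_path(n_steps, dt, theta, sigma, nu, rng):
    """Sample Y = X(tau) on a grid with spacing dt.

    X is a Brownian motion with drift theta and volatility sigma; tau is a
    Gamma subordinator whose increments over dt have shape dt/nu and scale nu.
    """
    y, path = 0.0, [0.0]
    for _ in range(n_steps):
        dg = rng.gammavariate(dt / nu, nu)  # increment of the random clock
        y += theta * dg + sigma * math.sqrt(dg) * rng.gauss(0.0, 1.0)
        path.append(y)
    return path


rng = random.Random(7)
theta, sigma, nu = 0.2, 0.5, 0.4

# Values of Y at time 1, each built from four increments of length 0.25.
endpoints = [
    variance_gamma_path(4, 0.25, theta, sigma, nu, rng)[-1] for _ in range(20_000)
]
sample_mean = sum(endpoints) / len(endpoints)
sample_var = sum((y - sample_mean) ** 2 for y in endpoints) / (len(endpoints) - 1)
```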
1.3.1 Wiener-Hopf factorization
A collection of important results in Lévy fluctuation theory are immediate consequences
of the so-called Wiener-Hopf factorization. In fact, this Wiener-Hopf factorization provides a
powerful tool with several applications in probability (e.g. related to finance). In this section
we first introduce a few relevant concepts, and then we roughly sketch a proof of the
Wiener-Hopf factorization theorem in a discrete-time framework. The continuous-time setting
is considerably more technical, and therefore we decided to just state the main result,
and leave the underlying considerations out.
As mentioned, any Lévy process is defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathbb{F}$ be
the filtration $\mathbb{F} = \{\mathcal{F}_t : t \ge 0\}$, so that we obtain a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, on
which we assume $X$ is defined. Then a non-negative random variable $\tau$, defined on the
same filtered probability space, is called a stopping time if $\{\tau \le t\} \in \mathcal{F}_t$ for all $t > 0$. It should
be mentioned that it is not a priori ruled out that a stopping time could have the property
that $\mathbb{P}(\tau = \infty) > 0$. Now suppose that $\tau$ is a stopping time. The process $\widetilde{X} = \{\widetilde{X}_t : t \ge 0\}$, where
\[ \widetilde{X}_t = X_{\tau+t} - X_\tau, \quad t \ge 0, \]
defined on $\{\tau < \infty\}$, is independent of $\mathcal{F}_\tau$ and has the same law as $X$, and hence is a Lévy
process. For instance, the first entrance time (first hitting time) of a given subset $B \subseteq \mathbb{R}$ is
an $\mathbb{F}$-stopping time.
One elementary but useful concept, applying to all Lévy processes, is duality; it is a direct
consequence of the stationary independent increments. In fact, duality can informally be described
as a kind of symmetry under time reversal. When a path of a Lévy process is reversed
in time, over a finite time horizon, the new path is distributionally equivalent. More
precisely, for each $t > 0$ the processes $\{X_{(t-s)-} - X_t : 0 \le s \le t\}$ and $\{-X_s : 0 \le s \le t\}$
have the same law.
An interesting direct consequence of this duality property concerns a relationship between
the running supremum and the running infimum, which are defined by
\[ \overline{X}_t := \sup_{0 \le s \le t} X_s, \qquad \underline{X}_t := \inf_{0 \le s \le t} X_s. \]
The processes $\{\overline{X}_t : t \ge 0\}$ and $\{\underline{X}_t : t \ge 0\}$ are the key objects studied in fluctuation theory,
and will play an important role in the sequel of this thesis. We arrive at the following useful
lemma.
Lemma 1.3.2. For each fixed $t > 0$, the pairs $(\overline{X}_t, \overline{X}_t - X_t)$ and $(X_t - \underline{X}_t, -\underline{X}_t)$ have the same
distribution under $\mathbb{P}$.
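The lemma has a direct random-walk analogue, obtained by reversing the order of the increments, and this analogue can be verified exactly by enumeration. The Python sketch below assumes a random walk started at 0 with a small, arbitrarily chosen increment law (an assumption of the illustration, as are the names `STEP_LAW` and `joint_laws`); it computes the joint law of (maximum, maximum minus endpoint) and of (endpoint minus minimum, minus minimum) with exact rational arithmetic, where maximum and minimum are taken over times 0, ..., n.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical increment law: value -> probability.
STEP_LAW = {-1: Fraction(1, 2), 0: Fraction(1, 6), 2: Fraction(1, 3)}


def joint_laws(n):
    """Exact joint laws of (max, max - S_n) and (S_n - min, -min) after n steps."""
    law_max = defaultdict(Fraction)
    law_min = defaultdict(Fraction)
    for steps in product(STEP_LAW, repeat=n):
        prob = Fraction(1)
        s, hi, lo = 0, 0, 0  # the walk starts at 0, which counts for max and min
        for y in steps:
            prob *= STEP_LAW[y]
            s += y
            hi, lo = max(hi, s), min(lo, s)
        law_max[(hi, hi - s)] += prob
        law_min[(s - lo, -lo)] += prob
    return dict(law_max), dict(law_min)


law_max, law_min = joint_laws(5)
same_distribution = law_max == law_min
```

Because the increment law is deliberately asymmetric, the agreement of the two laws is not a consequence of any symmetry of the walk itself, only of the time-reversal argument.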
We now consider the running maximum and minimum up to $\tau(q)$, which represents an
exponentially distributed time with arbitrary parameter $q > 0$ (i.e., mean $1/q$). It can be
seen that if the Lévy process is not a compound Poisson process, then its maxima are
attained at unique times; we define $\overline{G}_t := \sup\{s < t : \overline{X}_s = X_s\}$ and
$\underline{G}_t := \sup\{s < t : \underline{X}_s = X_s\}$. We are now ready to state the Wiener-Hopf factorization (where it is noted that
the compound Poisson case should be treated slightly differently; see [68]).
Theorem 1.3.3. The Wiener-Hopf factorization. Suppose that X is any Lévy process and let
τ(q) be an independent exponentially distributed random variable with parameter q > 0.
Then the following statements hold.
1. The pairs
$$(\overline{G}_{\tau(q)}, \overline{X}_{\tau(q)}) \quad\text{and}\quad (\tau(q) - \overline{G}_{\tau(q)}, \overline{X}_{\tau(q)} - X_{\tau(q)})$$
are independent and infinitely divisible. For any $\theta, \vartheta \in \mathbb{R}$ the following factorization
applies:
$$\frac{q}{q - i\vartheta + \xi(\theta)} = \mathbb{E}\left(e^{i\vartheta \overline{G}_{\tau(q)} + i\theta \overline{X}_{\tau(q)}}\right) \mathbb{E}\left(e^{i\vartheta \underline{G}_{\tau(q)} + i\theta \underline{X}_{\tau(q)}}\right),$$
where the two expectations on the right-hand side are the Wiener-Hopf factors.
2. When setting ϑ = 0, the Wiener-Hopf factors may be identified in terms of the Laplace
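exponents $\overline{\kappa}(\alpha, q)$ and $\underline{\kappa}(\alpha, q)$, which are defined by (for some $\overline{k}_0 > 0$, and $\alpha \ge 0$)
$$\overline{\kappa}(\alpha, q) := \mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}} = \overline{k}_0 \exp\left(-\int_0^\infty \int_{(0,\infty)} \frac{1}{t}\left(e^{-qt} - e^{-qt-\alpha x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \tag{1.1}$$
for the running maximum, and (for some $\underline{k}_0 > 0$, and $\alpha \le 0$)
$$\underline{\kappa}(\alpha, q) := \mathbb{E} e^{-\alpha \underline{X}_{\tau(q)}} = \underline{k}_0 \exp\left(-\int_0^\infty \int_{(-\infty,0)} \frac{1}{t}\left(e^{-qt} - e^{-qt-\alpha x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \tag{1.2}$$
for the running minimum. In addition,
$$\overline{\kappa}(\alpha, q)\,\underline{\kappa}(-\alpha, q) = \frac{q}{q - \log \mathbb{E} e^{-\alpha X_1}} =: \mathcal{K}(\alpha, q).$$
Note that there are some constants in the expressions of the Wiener-Hopf factorization which
are not identified ($\overline{k}_0$ and $\underline{k}_0$, that is). They depend on the normalization which is chosen in
the definition of local time; for more background on this issue we refer to [68].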
We do not provide a proof of the above result. Instead, in order to convey the main ideas
behind it, we include a rough sketch of a possible proof in a discrete-time framework (which
corresponds to a random walk). We have chosen to do so, since in the continuous-time
framework there are a number of rather technical steps that have to be dealt with; at the same
time, we believe that the main ideas behind the discrete-time counterpart provide useful
insights [34].
To this end, consider the random walk $S_n := \sum_{i=1}^{n} Y_i$, with the $Y_i$ being i.i.d., distributed as
a generic random variable $Y$. Let $\overline{S}_n$ be the running maximum process
$$\overline{S}_n := \sup_{i \in \{1,\ldots,n\}} S_i;$$
$G_n$ denotes the (first) epoch at which that running maximum is attained. Let $T$ be an
(independent) geometric random variable, i.e., $\mathbb{P}(T = k) = p(1-p)^k$, for some $p \in (0, 1)$ and
$k \in \{0, 1, \ldots\}$.
Observe that the number of new maxima attained before time $T$ (the number of excursions)
is a geometric random variable; it is denoted by $N$. It follows that both $\overline{S}_T$ and $G_T$
can be written as the sum of $N$ i.i.d. non-negative random variables. It can be shown that a
geometric sum of i.i.d. random variables is infinitely divisible. Based on the above, we can
conclude that $\overline{S}_T$ and $G_T$ are infinitely divisible as well.
Furthermore, in line with the duality property of Lévy processes, we have that
$(S_T - \overline{S}_T, T - G_T)$ is independent of $(\overline{S}_T, G_T)$. This can be intuitively understood as follows. First
realize that the geometric distribution is memoryless. Suppose now that we are told that the
maximum (before time $T$) is attained at a specific epoch $G_T$; the value of this epoch
does not have any impact on the amount by which the process goes down between $G_T$ and
$T$. A similar property holds for the residual time $T - G_T$ until ‘the geometric clock expires’.
Also, it follows from the duality property that $S_T - \overline{S}_T$ has the same distribution as the
running minimum of the random walk at time $T$.
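After these first observations, we now include a bit of elementary algebra, leading to the
identification of the Wiener-Hopf factors. To this end, we first notice that it can be verified
that, with $s \in (0, 1]$ and $\alpha \in \mathbb{R}$,
$$\mathbb{E}\, s^{T} e^{\alpha i S_T} = \frac{p}{1 - (1-p)\,s\,\mathbb{E} e^{\alpha i Y}}.$$
On the other hand, this quantity can be alternatively written as
$$\begin{aligned}
&\exp\left(-\int_{-\infty}^{\infty} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right) \\
&\quad= \exp\left(-\sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n\, \mathbb{E} e^{\alpha i S_n}\right)(1-p)^n\right) \\
&\quad= \exp\left(-\sum_{n=1}^{\infty} \frac{1}{n}\left((1-p)^n - \left((1-p)\,s\,\mathbb{E} e^{\alpha i Y}\right)^n\right)\right) \\
&\quad= \exp\left(\log p - \log\left(1 - s(1-p)\,\mathbb{E} e^{\alpha i Y}\right)\right).
\end{aligned}$$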
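The equality between the closed-form transform and its series representation can be checked numerically. The sketch below does so for increments Y that are normally distributed with mean 0.5 and unit variance; this distribution, the parameter values and the function names are assumptions made purely for illustration. It uses that the characteristic function of $S_n$ is the n-th power of that of Y.

```python
import cmath


def transform_direct(p, s, phi):
    """p / (1 - (1 - p) * s * E exp(i*alpha*Y)), with phi = E exp(i*alpha*Y)."""
    return p / (1 - (1 - p) * s * phi)


def transform_series(p, s, phi, n_terms=500):
    """exp(-sum_n (1/n) * (1 - s^n * phi^n) * (1 - p)^n), truncated at n_terms."""
    total = 0
    for n in range(1, n_terms + 1):
        total += (1 - s**n * phi**n) * (1 - p) ** n / n
    return cmath.exp(-total)


p, s, alpha, mean = 0.3, 0.9, 1.7, 0.5
phi = cmath.exp(1j * alpha * mean - alpha**2 / 2)  # char. function of N(mean, 1)

direct = transform_direct(p, s, phi)
series = transform_series(p, s, phi)
print(direct, series)
```

Since the terms of the series decay like $(1-p)^n/n$, truncating after a few hundred terms is more than sufficient for p = 0.3.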
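Recall that we found that $(S_T, T)$ can be written as the sum of two independent terms, viz.
$(\overline{S}_T, G_T)$ and $(S_T - \overline{S}_T, T - G_T)$, which are both infinitely divisible. As a result, it follows
that
$$\mathbb{E}\, s^{G_T} e^{\alpha i \overline{S}_T} = \exp\left(-\int_{0}^{\infty} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right),$$
and
$$\mathbb{E}\, s^{T-G_T} e^{\alpha i (S_T - \overline{S}_T)} = \exp\left(-\int_{-\infty}^{0} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right).$$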
As mentioned above, the step from discrete time to continuous time introduces a substantial
amount of technicalities. If we ‘extrapolate’ our discrete-time findings, we
obtain the following. Let $T$ be an exponentially distributed random time with mean $1/\vartheta$,
independent of the Lévy process $(X_t)_t$. For $\beta \ge 0$ and $\alpha \in \mathbb{R}$, we can easily show that
$$\mathbb{E} e^{-\beta T + \alpha i X_T} = \frac{\vartheta}{\vartheta + \beta - \log \mathbb{E} e^{\alpha i X_1}}. \tag{1.3}$$
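Using the Frullani integral identity [68], we also have
$$\begin{aligned}
&\exp\left(-\int_0^\infty \int_{-\infty}^{\infty} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \\
&\quad= \exp\left(-\int_0^\infty \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t}\, \mathbb{E} e^{\alpha i X_t}\right) dt\right) \\
&\quad= \exp\left(-\int_0^\infty \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta-\log \mathbb{E} e^{\alpha i X_1})t}\right) dt\right) \\
&\quad= \frac{\vartheta}{\vartheta + \beta - \log \mathbb{E} e^{\alpha i X_1}}.
\end{aligned}$$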
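The last step relies on the Frullani integral $\int_0^\infty (e^{-at} - e^{-bt})/t \, dt = \log(b/a)$, valid when a and b have positive real parts. A minimal numerical sketch is given below for the case that X is a Brownian motion with drift, for which $\log \mathbb{E} e^{\alpha i X_1} = \alpha i \mu - \sigma^2\alpha^2/2$; the choice of process, the parameter values, and the integration grid are assumptions made for this illustration. The substitution $t = e^u$ turns the integrand into a smooth, rapidly decaying function of u, so that a plain trapezoidal rule suffices.

```python
import cmath
import math


def frullani_integral(a, b, lo=-40.0, hi=6.0, step=0.005):
    """Trapezoidal rule for int_0^inf (exp(-a t) - exp(-b t)) / t dt, with t = exp(u)."""
    n = int(round((hi - lo) / step))
    total = 0j
    for k in range(n + 1):
        t = math.exp(lo + k * step)
        value = cmath.exp(-a * t) - cmath.exp(-b * t)  # dt / t becomes du
        total += value * (0.5 if k in (0, n) else 1.0)
    return total * step


theta, beta, alpha = 1.0, 0.5, 2.0      # killing rate, time and space arguments
mu, sigma = -0.3, 1.0                   # drift and volatility (illustrative)
log_cf = 1j * alpha * mu - 0.5 * sigma**2 * alpha**2

a, b = theta, theta + beta - log_cf
lhs = cmath.exp(-frullani_integral(a, b))
rhs = theta / (theta + beta - log_cf)
print(lhs, rhs)
```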
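Mimicking the discrete-time setup, we obtain the corresponding results in the continuous-time
framework:
$$\mathbb{E} e^{-\beta \overline{G}_T + \alpha i \overline{X}_T} = \frac{\overline{\kappa}(\vartheta+\beta, -\alpha i)}{\overline{\kappa}(\vartheta, 0)} = \exp\left(-\int_0^\infty \int_0^\infty \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right),$$
and
$$\mathbb{E} e^{-\beta (T - \overline{G}_T) + \alpha i (X_T - \overline{X}_T)} = \frac{\underline{\kappa}(\vartheta+\beta, \alpha i)}{\underline{\kappa}(\vartheta, 0)} = \exp\left(-\int_0^\infty \int_{-\infty}^0 \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right),$$
where the functions $\overline{\kappa}$ and $\underline{\kappa}$ are defined in Equations (1.1) and (1.2).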
1.3.2 Second factorization identity
Much of our analysis relies on the Wiener-Hopf factorization and its ramifications. A second
result that we use in this thesis is usually referred to as the second factorization identity and it
holds for any Lévy process. It can be found in e.g. [68, p.176].
Let us define the first hitting time (or first passage time) as
$$\sigma(x) := \inf\{t \ge 0 : X_t > x\},$$
and let, as before, $T$ be an exponentially distributed random variable with mean $q^{-1}$. Then
the following result holds for any Lévy process $(X_t)_t$. As the proof is straightforward and
insightful, we decided to include it.
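Lemma 1.3.4. For $q, \bar{q} \ge 0$, $\beta > 0$,
$$\int_0^\infty e^{-\beta x}\, \mathbb{E}\left(e^{-q\sigma(x) - \bar{q}(X_{\sigma(x)} - x)} 1_{\{\sigma(x)<\infty\}}\right) dx = \frac{1}{\beta - \bar{q}}\left(1 - \frac{\mathbb{E} e^{-\beta \overline{X}_T}}{\mathbb{E} e^{-\bar{q} \overline{X}_T}}\right) = \frac{1}{\beta - \bar{q}}\left(1 - \frac{\overline{\kappa}(\beta, q)}{\overline{\kappa}(\bar{q}, q)}\right),$$
where $\overline{\kappa}(\alpha, q)$ was defined in Equation (1.1).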
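Proof. We follow the proof of [68, Exercise 6.7]. $X_t$ is a Lévy process and hence it is Markovian;
in addition $T$ is exponentially distributed. Due to the memoryless property of the
exponential distribution, we have
$$\mathbb{E}\left(e^{-\bar{q}(\overline{X}_T - X_{\sigma(x)})} \,\middle|\, \sigma(x) \le T\right) = \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T}\right)$$
and therefore
$$\mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) = \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T} 1_{\{\sigma(x) \le T\}}\right) = \mathbb{E}\left(e^{-\bar{q} X_{\sigma(x)}} 1_{\{\sigma(x) \le T\}}\right) \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T}\right).$$
In addition, the first factor on the right-hand side of the previous equation can be expressed
explicitly in terms of the distributions of $\sigma(x)$ and $T$. It follows that
$$\begin{aligned}
\mathbb{E}\left(e^{-\bar{q} X_{\sigma(x)}} 1_{\{\sigma(x) < T\}}\right) &= \int_0^\infty e^{-qs} \int_0^\infty q e^{-q(t-s)}\, 1_{\{s<t\}}\, \mathbb{E}\left(e^{-\bar{q} X_s} \,\middle|\, \sigma(x) = s\right) dt\; \mathbb{P}(\sigma(x) \in ds) \\
&= \mathbb{E}\left(e^{-q\sigma(x) - \bar{q} X_{\sigma(x)}} 1_{\{\sigma(x)<\infty\}}\right).
\end{aligned}$$
Combining the above results gives
$$\int_0^\infty (\beta - \bar{q}) e^{-(\beta-\bar{q})x}\, \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) dx = \int_0^\infty \int_x^\infty (\beta - \bar{q}) e^{-(\beta-\bar{q})x} e^{-\bar{q}u}\, \mathbb{P}(\overline{X}_T \in du)\,dx.$$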
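By interchanging the order of integration we have
$$\int_0^\infty (\beta - \bar{q}) e^{-(\beta-\bar{q})x}\, \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) dx = \mathbb{E}\left(e^{-\bar{q}\,\overline{X}_T}\right) - \mathbb{E}\left(e^{-\beta \overline{X}_T}\right).$$
Theorem 1.3.3 proves the claim. □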
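For completeness, the interchange of the order of integration used in the proof can be spelled out as follows. Here $\bar{q}$ denotes the second parameter of the lemma and $\overline{X}_T$ the running maximum at time T; the computation is a sketch of the standard Fubini argument rather than a quotation from [68].

```latex
\begin{aligned}
\int_0^\infty \int_x^\infty (\beta-\bar q)\,e^{-(\beta-\bar q)x}\,e^{-\bar q u}\,
  \mathbb{P}(\overline{X}_T\in du)\,dx
&= \int_{(0,\infty)} e^{-\bar q u}
  \left(\int_0^u (\beta-\bar q)\,e^{-(\beta-\bar q)x}\,dx\right)
  \mathbb{P}(\overline{X}_T\in du) \\
&= \int_{(0,\infty)} e^{-\bar q u}\left(1-e^{-(\beta-\bar q)u}\right)
  \mathbb{P}(\overline{X}_T\in du) \\
&= \mathbb{E}\left(e^{-\bar q\,\overline{X}_T}\right)
  -\mathbb{E}\left(e^{-\beta\,\overline{X}_T}\right).
\end{aligned}
```

In the last step a possible atom of $\overline{X}_T$ at 0 may be added at no cost, since the integrand $e^{-\bar q u} - e^{-\beta u}$ vanishes at u = 0.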
1.3.3 Some remarks on Wiener-Hopf factorization
The Wiener-Hopf factorization theorem provides an elegant decomposition in terms of the
characteristic functions of the running maximum and the running minimum associated with
the underlying Lévy process. The result, however, does not say how one should calculate
each of these characteristic functions; preferably one would express the Wiener-Hopf factors in terms
of the model primitives, i.e., the characteristic exponent ξ(·).
In fact such a decomposition is possible for specific classes of Lévy processes only. This is for
instance the case if the driving Lévy process belongs to the class of spectrally one-sided Lévy
processes, i.e., $X_t$ has only negative jumps ($X_t \in \mathcal{S}_-$) or only positive jumps ($X_t \in \mathcal{S}_+$): in
both cases $\overline{\kappa}(\alpha, q)$ can be expressed in closed form in terms of the characteristic exponent.
Let $X_t \in \mathcal{S}_-$, and let $\Phi(\beta) := \log \mathbb{E} e^{\beta X_1}$ be the so-called Laplace exponent, with right
inverse denoted by $\Psi(\cdot)$. Then, as it turns out,
$$\overline{\kappa}(\alpha, q) = \frac{\Psi(q)}{\Psi(q) + \alpha}.$$
In the spectrally positive case $X_t \in \mathcal{S}_+$ the decomposition is usually referred to as the
(generalized version of the) Pollaczek-Khinchine formula. Then we have
$$\overline{\kappa}(\alpha, q) = \frac{q}{\psi(q)}\, \frac{\psi(q) - \alpha}{q - \varphi(\alpha)},$$
where the Laplace exponent is defined by $\varphi(\alpha) := \log \mathbb{E} e^{-\alpha X_1}$, and $\psi(\cdot)$ is the inverse of $\varphi(\cdot)$.
The function $\underline{\kappa}(\alpha, q)$ follows from $\overline{\kappa}(\alpha, q)\,\underline{\kappa}(-\alpha, q) = \mathcal{K}(\alpha, q)$ [34].
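To make the spectrally negative formula concrete, consider the simplest example: a Brownian motion with drift μ and volatility σ, which has no jumps at all and is therefore spectrally negative. Its Laplace exponent is $\mu\beta + \sigma^2\beta^2/2$, and the right inverse is available in closed form. The sketch below (with illustrative parameter values and function names chosen for this example) checks that the right inverse indeed inverts the Laplace exponent, and that the product of the transforms of the running maximum and the running minimum at an exponential time equals $q/(q - \log \mathbb{E} e^{-\alpha X_1})$. It uses the standard fact that, for this process, the running maximum and minus the running minimum at the exponential time are exponentially distributed.

```python
import math


def laplace_exponent(beta, mu, sigma):
    """log E exp(beta * X_1) for Brownian motion with drift."""
    return mu * beta + 0.5 * sigma**2 * beta**2


def right_inverse(q, mu, sigma):
    """Largest root of laplace_exponent(beta) = q."""
    return (-mu + math.sqrt(mu**2 + 2 * q * sigma**2)) / sigma**2


def max_transform(alpha, q, mu, sigma):
    """E exp(-alpha * running maximum at an exponential time with rate q)."""
    root = right_inverse(q, mu, sigma)
    return root / (root + alpha)


def min_transform(alpha, q, mu, sigma):
    """E exp(-alpha * running minimum at the exponential time); needs alpha < rho."""
    rho = (mu + math.sqrt(mu**2 + 2 * q * sigma**2)) / sigma**2
    return rho / (rho - alpha)


mu, sigma, q, alpha = -0.5, 1.2, 0.8, 0.4   # illustrative values only
product = max_transform(alpha, q, mu, sigma) * min_transform(alpha, q, mu, sigma)
target = q / (q - laplace_exponent(-alpha, mu, sigma))
print(product, target)
```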
The following case can be dealt with (semi-)explicitly as well. If the jumps in one direction
(either downward or upward) have a phase type distribution (which we further comment on
below), whereas the jumps in the other direction are allowed to have a general distribution,
the Wiener-Hopf decomposition can be performed in terms of the roots of the equation q =
ξ(s); see e.g. [72, 71].
Another class of Lévy processes for which the Wiener-Hopf decomposition is possible in
more explicit terms, is the class of processes which have a meromorphic Lévy exponent in the
complex plane, e.g. expressed in terms of beta and digamma functions. Also for this class
of Lévy processes the Wiener-Hopf factorization can be evaluated in terms of the roots of
equation q = ξ(s), but now this equation has infinitely many roots [66, 67].
Any distribution on (0,∞) can be approximated arbitrarily well by a phase-type distribution [10, Thm.
III.4.2]; that is, the class of phase-type distributions is dense (in the sense of weak convergence)
in the set of all probability distributions on (0,∞). If we are in a situation in which the
jumps in both directions are general, we could replace those in one direction by their phase-
type counterpart, leading to a Lévy process that is covered by [72, 71]. Several methods
have been developed to deal with approximating a distribution on (0,∞) by a phase-type
distribution, see for example [40, 57]. The approach which we used in this thesis is based on
the expectation-maximization algorithm.
So far we did not discuss the impact of the ‘small jumps’ in case the driving Lévy process
has infinite activity. In order to enter the setup of [72, 71], with phase-type jumps in one
direction and general jumps in the other direction, it is implicitly assumed that the Lévy
process is of finite activity, i.e., $\int_{-\infty}^{\infty} \Pi(dx) < \infty$ (as these small jumps cannot be described by
a compound Poisson stream of phase-type distributed jumps). This issue can be remedied
as follows; focus on the situation that we wish to replace the positive jumps by a phase-
type counterpart. The jump distribution on (ε,∞) can be approximated
by a phase-type distribution for any specific ε > 0. A Brownian motion with drift (with
appropriately chosen parameters) can then compensate for the jumps smaller than ε [10]. By
picking the value of ε suitably small, the approximation turns out to be highly accurate. More
concretely, assuming the upward jumps are approximated by the phase-type distribution
$P_{ph}(dx)$, the parameters of the Brownian motion are calculated by the following formulas:
$$\mu_\varepsilon := \int_0^\varepsilon x\,(\Pi - P_{ph})(dx), \qquad \sigma_\varepsilon^2 := \int_0^\varepsilon x^2\,(\Pi - P_{ph})(dx).$$
The approximation of small jumps by a deterministic drift and a Brownian motion
is also frequently used in Monte Carlo simulation [10].
From the above, we conclude that if one manages to (sufficiently accurately) approximate
the Lévy measure with a phase type distribution, the factors in the Wiener-Hopf decompo-
sition can be evaluated. As a consequence, we have the Laplace transforms of the running
maximum and minimum. The next step is to invert these, in order to evaluate the distributions
of the running maximum and minimum. There are several methods developed for
performing Laplace (and/or Fourier) inversion. Most of the Laplace inversion techniques
are based on the well-known Poisson summation formula (PSF) [1, 38]. The PSF is given by the
following formula, for any $v \in [0, 1)$ and any ‘damping factor’ $a \in \mathbb{R}$:
$$\sum_{k=-\infty}^{\infty} \hat{f}\left(a + 2\pi i(k + v)\right) = \sum_{k=0}^{\infty} e^{-ak} e^{-2\pi i k v} f(k),$$
where $\hat{f}$ is the Laplace (Fourier) transform of the function $f(x)$. The right-hand side of
this equation is a discrete Fourier transform which can be computed efficiently by the well-
known fast Fourier transform algorithm, obviously provided that one can evaluate the left-
hand side of the equation. The method developed by den Iseger [36] approximates
the infinite sum by a finite sum at appropriately chosen points and weights.
This technique has been extensively tested, and has been shown to calculate inverse Laplace
and Fourier transforms quickly and accurately. The technique can be adapted to perform
the inversion of non-smooth functions and even functions with singularities. It is also
remarked that the extension of the method to multi-dimensional mixed Laplace/Fourier
inversion is straightforward; for details of the implementation, as well as a series of extensions,
we refer to [36].
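The Poisson summation formula itself is easy to check numerically for a simple test function. The sketch below uses $f(x) = x e^{-x}$, whose Laplace transform is $1/(1+s)^2$; the test function, the values of a and v, and the brute-force truncation of the left-hand side are all assumptions made for illustration. In particular, this is not the quadrature rule of den Iseger [36]: the slow (order 1/K) convergence of the truncated left-hand side is precisely what more refined methods are designed to avoid.

```python
import cmath
import math


def laplace_transform(s):
    """Laplace transform of f(x) = x * exp(-x)."""
    return 1 / (1 + s) ** 2


def f(x):
    return x * math.exp(-x)


def psf_left(a, v, n_terms=200_000):
    """Symmetric truncation of the doubly infinite sum on the left-hand side."""
    total = laplace_transform(a + 2j * math.pi * v)
    for k in range(1, n_terms + 1):
        total += laplace_transform(a + 2j * math.pi * (k + v))
        total += laplace_transform(a + 2j * math.pi * (-k + v))
    return total


def psf_right(a, v, n_terms=200):
    """Damped discrete Fourier sum of the values f(0), f(1), f(2), ..."""
    return sum(math.exp(-a * k) * cmath.exp(-2j * math.pi * k * v) * f(k)
               for k in range(n_terms))


a, v = 0.5, 0.3
left, right = psf_left(a, v), psf_right(a, v)
print(left, right)
```

The test function was chosen to be continuous with f(0) = 0, so that no correction term for a jump at the origin is needed.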
1.4 Preliminaries on Markov fluid models
The topics of fluctuation theory and queueing theory are intimately related; e.g. the steady-
state distribution of the workload in a queueing system can often be translated in terms of
the probability of an associated ‘free’ (i.e., not truncated at 0) process attaining a given set.
In this sense, techniques developed in the context of fluctuation theory for Lévy processes,
can be used in the context of Lévy-driven queues as well [34, 83]. In this section we leave the
setting of Lévy processes though, focusing on queues with Markov fluid input. We provide
the preliminaries necessary for the last chapter of this thesis.
Queueing theory studies the evolution of a storage process, which can be in terms of either
(discrete) customers or workload. A queue is characterized by an arrival process, the distri-
bution of the service requirements, and a service discipline. In general a queue can have one
or more servers (where servers can be interpreted as internet servers, cashiers in shops, etc.),
which process the clients’ jobs. Often the arrival process and service times are uncertain, and
therefore random objects are used to model these.
There is a vast body of literature on the mathematical modeling of all sorts of queues. A
large subclass of these models can be summarized by the notation A/B/n/m − S, a
characterization due to Kendall in the 1950s. Here A and B correspond to, respectively,
the distributional properties of the arrival process and service requirements. The number of
servers and the maximum number of jobs which can wait in the system (i.e., in its buffer) are
indicated by n and m respectively. Finally, S specifies the service discipline; it can be for
instance first-come-first-served or processor-sharing.
The model we consider in this thesis does not fit in the Kendall notation. We consider a fluid
model, in which a reservoir is fed by a continuous traffic stream [65, 83, 4, 63, 30]. The server
may be considered as the output flow of the reservoir; the output rate is usually considered
constant but may also be stochastic [74, 75, 89, 42, 20]. We give a more detailed description
below.
1.4.1 Markov fluid model
Consider the following fluid reservoir (or buffer), where the amount of fluid in the reservoir
at time $t$ is denoted by $C_t$. Let $(X_t)_t$ denote the so-called background process, which is
assumed to be an irreducible continuous-time Markov process; this background process models the
stochasticity of the input flow into the reservoir. We assume that $X_t$ has a finite state space
$\mathcal{N} \subset \mathbb{N}$, i.e., $X_t$ attains values in the set $\mathcal{N} = \{1, 2, \ldots, N\}$.
Now the content of the reservoir is driven by $(X_t)_t$, in the sense that the net input rate into the
reservoir is $r_i$ when the process $(X_t)_t$ is in state $i \in \mathcal{N}$, unless the reservoir is empty and
the net input rate is negative (as in that situation the reservoir remains empty). The
content of the reservoir is evidently stochastic, and its dynamics are given by the following
differential equation (with $i = X_t$):
$$\frac{dC_t}{dt} = \begin{cases} \max(r_i, 0) & \text{if } C_t = 0, \\ r_i & \text{if } C_t > 0. \end{cases} \tag{1.4}$$
The above formula tacitly assumes that the buffer has infinite capacity; as an aside, we note
that if there is a finite buffer of size $B > 0$, the following equation holds:
$$\frac{dC_t}{dt} = \begin{cases} \max(r_i, 0) & \text{if } C_t = 0, \\ \min(r_i, 0) & \text{if } C_t = B, \\ r_i & \text{if } C_t \in (0, B). \end{cases}$$
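The dynamics above can be made concrete with a few lines of Python. Since the net input rate is constant between jumps of the background process, the content at the end of such a period is obtained by adding rate times duration and clipping at 0 (and at B for a finite buffer). The function name `evolve_content`, the two-state example, and the particular background path are hypothetical and serve only as an illustration.

```python
def evolve_content(start, segments, rates, capacity=None):
    """Buffer content at the jump epochs of a piecewise-constant background path.

    segments: list of (state, duration) pairs; rates: net input rate per state.
    """
    content = start
    history = [content]
    for state, duration in segments:
        # Constant rate during the segment; the content is reflected at 0.
        content = max(content + rates[state] * duration, 0.0)
        if capacity is not None:
            content = min(content, capacity)  # finite buffer: also capped at B
        history.append(content)
    return history


rates = {1: -1.0, 2: 2.0}                               # state 1 drains, state 2 fills
segments = [(2, 1.0), (1, 0.5), (1, 3.0), (2, 0.25)]    # (state, sojourn time)

print(evolve_content(0.0, segments, rates))                # [0.0, 2.0, 1.5, 0.0, 0.5]
print(evolve_content(0.0, segments, rates, capacity=1.0))  # [0.0, 1.0, 0.5, 0.0, 0.5]
```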
It is assumed that at least one of the net flow rates is positive, as otherwise the model is trivial
(in that $C_t = 0$ all the time). However, when the buffer capacity is infinite, the requirement
that the queue be stable means that we have to impose the condition
$$\sum_{i \in \mathcal{N}} p_i r_i < 0,$$
where $p_i$ is the stationary probability of $X_t$ being in state $i \in \mathcal{N}$; if this condition is not
met, the queue grows to infinity [65].
We denote by $F_i(y, t)$ the probability that, at time $t$, the content is at most $y$ and the background
process $X_t$ is in state $i$. In other words,
$$F_i(y, t) := \mathbb{P}[X_t = i, C_t \le y], \quad i \in \mathcal{N},\ y \ge 0.$$
In addition, the background process is a continuous-time Markov chain with generator
matrix $Q = [q_{ij}]$ such that
$$\begin{aligned}
\mathbb{P}[X(t+h) = j \mid X(t) = i] &= q_{ij}h + o(h), \quad j \ne i, \\
\mathbb{P}[X(t+h) = i \mid X(t) = i] &= 1 + q_{ii}h + o(h),
\end{aligned} \tag{1.5}$$
where $q_{ij} \ge 0$ if $i \ne j$ and $q_{ii} = -\sum_{j \ne i} q_{ij}$, for $i \in \mathcal{N}$.
Consider $F_i(y, t+h)$, i.e., the probability of the background process being in state $i$ and
the buffer content being at most $y$ at time $t+h$, where $h$ ‘is infinitesimally small’. Using
Equations (1.5), $F_i(y, t+h)$ can be expressed in terms of the $F_j(y, t)$. Let $y > 0$, $h > 0$ and
$y - r_i h > 0$ for all $i \in \mathcal{N}$. Then,
$$F_i(y, t+h) = (1 + q_{ii}h)F_i(y - r_i h, t) + h\sum_{j \ne i} q_{ji}F_j(y - r_j h, t) + o(h). \tag{1.6}$$
It is elementary to rewrite the above equation as
$$\left(F_i(y, t+h) - F_i(y, t)\right) + \left(F_i(y, t) - F_i(y - r_i h, t)\right) = q_{ii}h F_i(y - r_i h, t) + h\sum_{j \ne i} q_{ji}F_j(y - r_j h, t) + o(h).$$
Assuming that $\partial_y F_i$ and $\partial_t F_i$ exist, dividing by $h$, in the limit $h \to 0$ we obtain the following
equation:
$$\frac{\partial F_i(y, t)}{\partial t} = \sum_{j \in \mathcal{N}} q_{ji}F_j(y, t) - r_i\frac{\partial F_i(y, t)}{\partial y}.$$
This equation (which can be regarded as a Kolmogorov forward equation) [65, 60, 77] can be
written in compact matrix form; then it becomes
$$\frac{\partial F(y, t)}{\partial t} = F(y, t)Q - \frac{\partial F(y, t)}{\partial y}R, \quad y > 0, \tag{1.7}$$
where $F(y, t) = (F_1(y, t), \ldots, F_N(y, t))^{\mathsf{T}}$ and $R = \mathrm{diag}(r_1, \ldots, r_N)$. When the stability
condition is satisfied, it can be shown that there exists a stationary distribution of $(X_t, C_t)$;
i.e., the partial derivative with respect to time vanishes as $t \to \infty$, and we have that
$$F(y)Q = \frac{dF(y)}{dy}R, \tag{1.8}$$
where $F(y)$ denotes the corresponding stationary distribution of the fluid level in the
reservoir.
Evidently, (1.8) alone does not uniquely specify the stationary distribution; a set of coefficients
still needs to be specified. More specifically, the solution of (1.8) has the form
$$F(y) = \sum_{j=1}^{N} c_j e^{\xi_j y} V^{(j)}, \tag{1.9}$$
where $(\xi_j, V^{(j)})$ are the eigenvalues and corresponding eigenvectors of the matrix $R^{-1}Q^{\mathsf{T}}$,
and the $c_j$ are constants which have to be determined by imposing boundary conditions [69].
In particular, when the buffer size is infinitely large and $\mathrm{Re}(\xi_j) > 0$, the corresponding
coefficient $c_j$ has to be zero, as otherwise the probability $F(y)$ cannot be bounded between 0 and
1. We divide the states into two separate sets, in terms of their net input rates; we define
$\mathcal{N}_+ \equiv \{i \in \mathcal{N} \mid r_i > 0\}$ and $\mathcal{N}_- \equiv \{i \in \mathcal{N} \mid r_i < 0\}$ (assuming for ease that there is no $i$ such
that $r_i = 0$). As a consequence, we have the following boundary condition corresponding to
an empty reservoir:
$$F_i(0) = 0, \quad i \in \mathcal{N}_+;$$
recall that the content of the reservoir is increasing when $X_t \in \mathcal{N}_+$. On the other hand, if the
buffer is finite then the following boundary conditions have to hold:
$$F_i(B-) = p_i, \quad i \in \mathcal{N}_-,$$
where $B$ is the buffer size. It turns out that these conditions yield as many equations as there
are unknowns $c_j$ (viz. $N$). An important role in this argument is played by the property
that, if the stability condition is satisfied, the number of eigenvalues with negative real part
equals $|\mathcal{N}_+|$, there is one zero eigenvalue, and the other eigenvalues have positive real part.
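The spectral recipe for the infinite-buffer case can be sketched with NumPy as follows. The code treats the $F_i(y)$ as the entries of a vector, uses the eigenpairs of $R^{-1}Q^{\mathsf{T}}$, discards eigenvalues with positive real part, fixes the coefficient of the zero eigenvalue through the limit $F(\infty) = p$, and determines the remaining coefficients from $F_i(0) = 0$ for states with positive net rate. The function names and the two-state on/off example (generator and rates) are assumptions made for this illustration, and no attempt is made to handle defective or nearly degenerate eigenvalue problems.

```python
import numpy as np


def stationary_distribution(Q):
    """Solve p Q = 0 with the entries of p summing to one."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p


def fluid_stationary_cdf(Q, r):
    """Return p and a function y -> (F_1(y), ..., F_N(y)) for an infinite buffer."""
    p = stationary_distribution(Q)
    if p @ r >= 0:
        raise ValueError("stability condition sum_i p_i r_i < 0 is violated")
    xi, V = np.linalg.eig(np.diag(1.0 / r) @ Q.T)
    neg = np.where(xi.real < -1e-12)[0]   # eigenvalues with negative real part
    up = np.where(r > 0)[0]               # states with positive net input rate
    # Boundary conditions F_i(0) = 0 for i in N_+ give a square linear system.
    c = np.linalg.solve(V[np.ix_(up, neg)], -p[up])

    def F(y):
        return (p + (V[:, neg] * np.exp(xi[neg] * y)) @ c).real

    return p, F


Q = np.array([[-1.0, 1.0], [2.0, -2.0]])   # on/off background process
r = np.array([-1.0, 1.0])                  # net rates: drain when off, fill when on
p, F = fluid_stationary_cdf(Q, r)
print(p, F(0.0), F(1.0))
```

For this example the probability of an empty buffer, $F_1(0)$, equals $(p_1 r_1 + p_2 r_2)/r_1 = 1/3$, which can be verified by hand from the boundary condition.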
1.4.2 Workload-dependent service rate models
In the setting above, the net input rates and the transition matrix did not depend on the
current buffer content (i.e., the fluid amount in the reservoir). We now consider situations in
which we depart from this framework.
A first scenario is that in which the rates ri and the transition matrix Q are functions of the
current buffer content. This class of models is, for obvious reasons, sometimes referred to as
feedback models. We primarily consider the case in which there are (finitely many) levels such
that between two subsequent levels ri and Q are constant [89, 42]. We consider a second
scenario as well, in which the rates ri depend on the direction in which the level is crossed;
we call this scenario a hysteresis-type model [74, 75].
First we consider models in which $r_i$ and $Q$ are locally constant between specified levels,
and do not depend on the direction in which the level is crossed. Suppose there are $K + 1$
buffer levels such that
$$0 = B_0 \le B_1 \le \cdots \le B_{K-1} \le B_K \le \infty$$
and, as a consequence, $K$ buffer regimes; when $y \in (B_{k-1}, B_k)$ the flow rates are given
by $R^k = \mathrm{diag}(r^k_1, \ldots, r^k_N)$, where $k \in \{1, 2, \ldots, K\}$. We remark that this setup can even be
extended to a continuous counterpart, in which $r_i$ and $Q$ depend on the current buffer level
in a continuous fashion [89]; here we only consider the case that the flow rates are piecewise
constant functions of the buffer content.
Assume that all matrices $R^k$ are invertible. As a consequence, the steady-state probability
distribution $F^k_i(y)$ of the buffer content in each regime $k$ follows from a Kolmogorov equation
in the spirit of Equation (1.8) (with $R$ and $Q$ replaced by their ‘local counterparts’), and the
(local) solutions are given by an expression in the spirit of Equation (1.9). To complete the
solution we need to align the solutions corresponding to the individual regimes. This is
done by imposing the appropriate boundary conditions at each buffer threshold. It requires
a substantial amount of administration to identify all boundary conditions, and to verify
that their number equals the number of unknown coefficients.
We now focus on the hysteresis-type model. For ease, we now assume that the buffer is
infinite and there are only two thresholds, the lower level and the upper level, which are indicated
by $B_\ell$ and $B_u$, respectively (where obviously $B_\ell \le B_u$). In addition, for ease we assume that
in each state of the background process, the net input rate can take two values (i.e., a higher
and a lower value). Introduce an indicator process $I(\cdot) \in \{+, -\}$, corresponding to the
two regimes, i.e., those in which the higher and lower net input rates apply.
The process is assumed to evolve as follows. It starts with an empty buffer and the indicator
is +. The indicator stays + as long as the buffer content remains below $B_u$. At the moment
that the buffer content reaches the level $B_u$ (and the background process is in state $i$), the
indicator changes from + into − (while the background process remains in $i$). Then the indicator
remains − as long as the buffer content is above the level $B_\ell$; when the content reaches $B_\ell$, the
indicator changes to + again (where the background process does not change). The process
continues in this fashion.
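The switching rule of the indicator can be summarized in a small Python sketch. The function names and the sample sequence of buffer contents are hypothetical; the thresholds `lower` and `upper` play the roles of the lower and upper levels.

```python
def next_indicator(indicator, content, lower, upper):
    """Return the regime indicator after observing the current buffer content."""
    if indicator == "+" and content >= upper:
        return "-"   # upper threshold reached from below
    if indicator == "-" and content <= lower:
        return "+"   # lower threshold reached from above
    return indicator


def indicator_path(contents, lower, upper, start="+"):
    """Apply the switching rule along a sequence of observed buffer contents."""
    indicator, path = start, []
    for content in contents:
        indicator = next_indicator(indicator, content, lower, upper)
        path.append(indicator)
    return path


contents = [0.0, 1.0, 3.0, 2.5, 1.5, 0.5, 2.0, 3.2]
print("".join(indicator_path(contents, lower=1.0, upper=3.0)))  # ++---++-
```

Note that between the two thresholds the indicator depends on the past: the contents 2.5 and 2.0 in the example lie in the same band but are assigned different regimes.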
With techniques similar to those explained above, we can evaluate the steady-state
probabilities
$$F^-_i(x) := \mathbb{P}(I = -, X = i, C \le x), \qquad F^+_i(x) := \mathbb{P}(I = +, X = i, C \le x).$$
Finding these is again a matter of setting up differential equations, and imposing the appro-
priate boundary conditions. For a detailed analysis we refer to [74].
1.5 Outline of thesis, contributions
The primary objective of this thesis is to contribute to the development of computational
techniques in queueing and fluctuation theory. Essentially, three types of techniques will be
explored.
• In the first place, we systematically validate a (one- and two-dimensional) Laplace and
Fourier inversion algorithm. Our approach is based on an algorithm proposed by den
Iseger [36], and several variants thereof, which essentially rely on the Poisson summa-
tion formula. We do so in the context of Lévy fluctuation theory, aiming at numerically
evaluating the probability distribution of the running maximum process. While this
approach has a variety of potential applications, we primarily focus on applying it to
price specific exotic options, viz. the so-called lookback option.
• The second technique we present in this thesis is importance sampling. This technique
aims at reducing the variance of simulation-based estimators of performance mea-
sures. For instance, when estimating rare event probabilities, straightforward simu-
lation is extremely time consuming, as many paths need to be generated in order to
obtain an estimate with low variance. The idea behind importance sampling is to gen-
erate paths under an alternative measure, which is chosen such that the event under
study is not rare anymore. Correcting the simulation output by a likelihood ratio, the
resulting estimator is unbiased. The challenge is to find a good new measure, for which
the variance of the estimator (provably) reduces.
• In the third place, considering a node in a communication network, we assess the
tradeoff between the quality of service and the energy consumption. Different service
scenarios are considered; in each scenario the service speed is determined in a specific
way by the evolution of the buffer occupancy. We use techniques from optimization to
minimize a cost function (encompassing quality of service and energy consumption),
where the parameter space covers all feasible service strategies. The optimization rou-
tine is based on simulated annealing in combination with classical Newton-Raphson-
type algorithms, in the sense that we identify by simulated annealing the initial point
of the Newton-Raphson optimization routine (which locally finds the optimum).
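To illustrate the idea behind importance sampling mentioned in the second item above, the following Python sketch estimates the tail probability P(Z > c) of a standard normal random variable Z. Samples are drawn under an alternative measure, here a normal distribution with mean c, under which the event is no longer rare, and each sample is weighted by the likelihood ratio. This textbook example, the function name, and the parameter values are chosen purely for illustration; it is not the algorithm for Lévy processes developed in Chapter 4.

```python
import math
import random


def tail_probability_is(c, n_samples, seed=0):
    """Importance sampling estimate of P(Z > c) for a standard normal Z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(c, 1.0)   # sample under the shifted measure N(c, 1)
        if x > c:
            # Likelihood ratio of N(0, 1) with respect to N(c, 1), evaluated at x.
            total += math.exp(-c * x + 0.5 * c * c)
    return total / n_samples


c = 4.0
estimate = tail_probability_is(c, n_samples=100_000)
exact = 0.5 * math.erfc(c / math.sqrt(2.0))
print(estimate, exact)
```

With crude Monte Carlo, 100,000 samples would on average contain only about three exceedances of the level 4, whereas under the shifted measure roughly half of the samples contribute to the estimate.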
In more detail, the contributions of the individual chapters can be summarized as follows.
There are four studies, the first three focusing on fluctuation-theoretic aspects in a Lévy-
driven system, and the last evaluating the tradeoff between quality of service and energy
consumption in a queue with fluid input. We now detail the specific contributions.
Chapter 2 presents a framework for numerical computations in fluctuation theory for Lévy
processes. More specifically, with as before $\overline{X}_t := \sup_{0 \le s \le t} X_s$ denoting the running
maximum of the Lévy process $X_t$, the aim is to evaluate $\mathbb{P}(\overline{X}_t \le x)$ for $t, x > 0$. We do so
by approximating the Lévy process under consideration by another Lévy process for which
the double transform $\mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}}$ is known, with τ(q) an exponentially distributed random
variable with mean 1/q; then we use a fast and highly accurate Laplace inversion technique
(of almost machine precision) to obtain the distribution of $\overline{X}_t$. A broad range of examples
illustrates the attractive features of our approach. This chapter is based on [5].
In Chapter 3 our objective is to compute the prices (and corresponding sensitivities, known
as Greeks) of lookback options driven by Lévy processes. In this setup, the risk-neutral
evolution of the stock price, say $S_t$, is given by $S_0 e^{X_t}$, with $S_0$ the initial price and $X_t$ representing
a Lévy process. Lookback option prices are functions of the stock price $S_T$ at the maturity
time $T$ and the running maximum $\overline{S}_T := \sup_{0 \le t \le T} S_t$, and as a consequence the Wiener-
Hopf decomposition provides us with all probabilistic information needed to evaluate these
prices. To overcome the complication that in general only an implicit form of the Wiener-
Hopf factors is available, we follow the same approach as in Chapter 2: we approximate the
Lévy process under consideration by an appropriately chosen other Lévy process for which
the double transform $\mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}}$ is known; as before, τ(q) is an exponentially distributed
random variable with mean 1/q. The second step is to write the transform of the lookback
option prices in terms of this double transform. Finally, we use state-of-the-art numerical
inversion techniques to compute the prices and Greeks (i.e., sensitivities with respect to the
initial price $S_0$ and maturity time $T$); these rely on the techniques featuring in Chapter 2. We
test our procedure for a broad range of relevant Lévy processes, including a number of
‘traditional’ models (Black-Scholes, Merton) and more recently proposed models (CGMY and
Beta processes), showing excellent performance in terms of speed and accuracy. This chapter has
been submitted for publication to the Journal of Computational Finance [7].
In Chapter 4 the focus is on numerical techniques to evaluate rare-event probabilities in a
Lévy setting. We analyze the tail asymptotics corresponding to the all-time maximum value
attained by a Lévy process with negative drift. This chapter has two main contributions: a
short and elementary proof of these asymptotics, and an importance sampling algorithm to
estimate the rare-event probabilities under consideration. This chapter is based on [6] which
has been accepted for publication in Statistics and Probability Letters.
Finally, Chapter 5 considers the tradeoff between quality of service and capacity cost in
communication networks. More specifically, we develop techniques for analyzing and op-
timizing energy management in multi-core servers with so-called speed scaling capabilities
(i.e., the service speed can be adjusted based on the current buffer occupancy, or the evolu-
tion of the buffer occupancy in the recent past). Our framework incorporates the processor’s
dynamic power, but it also accounts for other intricate and relevant power features such
as the static (leakage) power and switching overhead between speed levels. Using stochas-
tic fluid models to capture traffic burst dynamics, the chapter proposes and studies differ-
ent strategies for adapting the multi-core processor speeds based on the observable buffer
content, so as to optimize objective functions that balance energy consumption and
performance. The strategies can be non-hysteretic (i.e., the processor speed depends on current
buffer level relative to the buffer thresholds) or hysteretic (i.e., it matters in which direction
the buffer thresholds are crossed). It is shown that, under rather general conditions, strate-
gies which use more threshold levels are more efficient with respect to power consumption;
however, most of the efficiency gain is achieved with 1 or 2 thresholds only. In addition,
the optimal power consumptions of the different strategies are only very mildly sensitive
to perturbations in the input parameters, implying the highly advantageous property that
the performance is robust to estimation errors in the system’s input traffic parameters. This
chapter has appeared as [9], and a short version as [8].
Our objective is that all chapters are self-contained, i.e., they can be read separately. As a
consequence, there will be some inevitable amount of overlap between these chapters. We
also remark that we have pursued a maximum level of uniformity regarding the notation
used throughout the thesis, and that this is also in line with the notation introduced in the
present chapter.
Chapter 2
Numerical techniques in Lévy fluctuation theory
In this chapter we discuss a numerical technique for the fast and accurate evaluation of the
distribution of the running supremum attained by a Lévy process.
2.1 Introduction
As explained in Chapter 1, owing to their wide applicability and their attractive mathemati-
cal properties, Lévy processes play an important role in applied probability. In mathematical
terms, they are characterized as processes with stationary and independent increments, and,
as such, the class of Lévy processes covers e.g. Brownian motion and (compound) Poisson
processes (but is substantially broader; for instance processes with infinitely many jumps
in finite time intervals belong to this class as well). Over the past decades Lévy processes
have found widespread use in various application domains. More specifically, they are in-
tensively studied in both mathematical finance and operations research, see, among many
other sources, for instance [10, 31, 35].
With $X_t$ denoting the Lévy process (assuming $X_0 = 0$), a substantial research effort
concentrates on analyzing probabilistic properties of the so-called running maximum process
$\bar X_t := \sup_{0\le s\le t} X_s$. More particularly, one wishes to determine the probability
$\mathbb{P}(\bar X_t \le x)$ for $t, x > 0$, or alternatively the corresponding density. The branch
of research focusing on this type of problems is commonly known as fluctuation theory [21, 68, 83].
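As mentioned in Chapter 1, a Lévy process is characterized by its Lévy exponent
$\log \mathbb{E} e^{isX_1}$, which is necessarily of the form
\[ \log \mathbb{E} e^{isX_1} = isd - \tfrac{1}{2} s^2\sigma^2 + \int_{-\infty}^{\infty} \left(e^{isx} - 1 - isx\,1_{\{|x|<1\}}\right) \Pi(dx), \tag{2.1} \]
where $d \in \mathbb{R}$, $\sigma \ge 0$, and the spectral measure $\Pi(\cdot)$, concentrated on
$\mathbb{R} \setminus \{0\}$, satisfies
\[ \int_{\mathbb{R}} \min\{x^2, 1\}\, \Pi(dx) < \infty. \]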
The triplet (d, σ2,Π) is usually referred to as the characteristic triplet, as it uniquely defines the
Lévy process [21, Ch. I, Thm. 1]. The three terms in the right-hand side of the representation
(2.1) are, for obvious reasons, often called the (deterministic) drift term, the Brownian term,
and the jump term. Special cases of Lévy processes are deterministic drifts (only a drift term)
and Brownian motions (only a Brownian term). The class of Lévy processes also contains
compound Poisson processes; then we have just the jump term (and also the first term if a
deterministic drift is present), and in addition there should be a well-defined
arrival rate (which requires that $\int_{-\infty}^{\infty} \Pi(dx) < \infty$). The class is wider
though, as it also includes processes with infinitely many jumps in a finite amount of time
(usually referred to as ‘small jumps’); this happens in case $\int_{-\infty}^{\infty} \Pi(dx) = \infty$.
In principle the distribution of $\bar X_t$ is fully specified through the so-called Wiener-Hopf
decomposition, see e.g. [68, Ch. 6]. It implies that, with $\tau(q)$ denoting an exponential random
variable with mean $1/q$ that is independent of the Lévy process $X_t$,
\[ \kappa(\alpha, q) := \mathbb{E} e^{-\alpha \bar X_{\tau(q)}} = k_0 \exp\left( -\int_0^\infty \int_{(0,\infty)} \frac{1}{t} \left( e^{-t} - e^{-qt-\alpha x} \right) \mathbb{P}(X_t \in dx)\, dt \right), \tag{2.2} \]
where $k_0$ is a normalizing constant. From a practical standpoint, the use of this
characterization is limited, as it provides us with the double transform of
$\mathbb{P}(\bar X_t \in dx)$; realize that
\[ \frac{1}{q} \cdot \mathbb{E} e^{-\alpha \bar X_{\tau(q)}} = \int_0^\infty e^{-qt} \int_0^\infty e^{-\alpha x}\, \mathbb{P}(\bar X_t \in dx)\, dt, \tag{2.3} \]
which in general cannot be inverted explicitly.
The above entails that, in order to get numerical values for the density
$\mathbb{P}(\bar X_t \in dx)$ or the distribution function $\mathbb{P}(\bar X_t \le x)$, one option
is to (i) first evaluate the double integral (2.2) numerically, and then to (ii) numerically
invert the double transform (2.3). The primary
objective of this chapter is to develop a methodology to evaluate $\mathbb{P}(\bar X_t \le x)$, but
to do so by bypassing stage (i) above. The underlying idea is that we make use of the fact that for
quite a substantial class of Lévy processes $X_t$, the double transform κ(α, q) can be expressed
explicitly in terms of the Lévy exponent; we replace the Lévy process under consideration by
a (suitably chosen) Lévy process in this class, so that only stage (ii) remains to be performed.
As mentioned above, for a broad class of Lévy processes the double transform κ(α, q) can be
expressed explicitly in terms of the Lévy exponent; in some cases still a number of (relatively
straightforward) numerical computations need to be performed. We give a brief overview
of such processes here.
• The most standard examples in which this is possible are the ones in which the under-
lying Lévy process is spectrally one-sided. This means that Xt has either only negative
jumps (the spectrally negative case; write X ∈ S−) or only positive jumps (the spec-
trally positive case; write X ∈ S+). In the former case the running maximum up to
the exponential epoch τ(q) has an exponential distribution, whereas in the latter case
the so-called generalized Pollaczek-Khinchine formula applies; see e.g. [35, Ch. III
and IV]. In both cases, κ(α, q) can be expressed in closed-form in terms of the Lévy
exponent.
• More recently it was found that κ(α, q) can be expressed in semi-explicit terms
if the jumps in one direction (either upward or downward) are phase-type (or, more
generally, have a rational Laplace transform), whereas the jumps in the other direction
are allowed to have a general distribution — see for results along these lines [11, 71, 72].
In this chapter, we concentrate on the setting of Lewis and Mordecki [71] in which the
positive jumps have a rational Laplace transform, and the downward jumps are general;
we write X ∈ R. In this case κ(α, q) can be expressed in terms of the zeros of a specific
equation (that needs to be solved numerically).
• If the Lévy exponent is a meromorphic function (write: X ∈ M ), expressed in terms of
beta and digamma functions, the Wiener-Hopf factorization can be done in essentially
the same way as in case of phase-type distributed jumps [66, 67]. This Wiener-Hopf fac-
torization, however, is now in terms of an infinite product, due to the infinitely many
poles of the Lévy exponent, so there is a truncation error. In the context of the present
chapter we consider the class of Beta processes [66, Section 4], which has meromorphic
Lévy exponent.
As indicated above, in our numerical evaluation scheme we approximate the Lévy process
under consideration by one in the class for which we can compute the double transform
κ(α, q) explicitly (that is, a Lévy process in S−, S+, or R). In case there are non-phase-type
jumps in both directions, the jumps in one direction are approximated by using a phase-type
distribution; if there are ‘small jumps’, we approximate the jumps of the Lévy process by
the sum of an appropriately chosen compound Poisson process and Brownian motion [15].
Then we have an approximation for κ(α, q), which is inverted using the inversion approach
presented in [36]; this approach can be considered as ‘state-of-the-art’ in terms of accuracy
(near machine precision), speed and general applicability.
To the best of our knowledge, our study is the first systematic account that tackles the
numerical evaluation of $\mathbb{P}(\bar X_t \le x)$ for $t, x > 0$ (or the corresponding density) in full generality.
Building on the ideas mentioned above, we study in great detail the numerical accuracy and
complexity of our approximation method. This is done for an extensive set of examples,
covering many of the specific Lévy processes proposed in the literature. It is noted that par-
ticular Lévy processes were already dealt with before, see for instance [13] for the CGMY
process; [87, 92] focus on numerical aspects related to the spectrally-negative case.
The remainder of this chapter is organized as follows. Section 2.2 sketches the preliminaries
of our approach: it reviews the results for the spectrally one-sided case as well as the results
from [71] for the case the positive jumps have a rational Laplace transform. In Section 2.3
the case of one-sided jumps is dealt with, with a focus on Brownian motion and compound
Poisson; the output of the numerical experiments is validated against either exact results
or simulation-based results. Then Section 2.4 studies the effect of replacing the positive
jumps by a phase-type counterpart; to assess the accuracy of the method we also perform
these approximations for instances that do allow explicit calculation of the double transform
κ(α, q). Section 2.5 concerns the approximation of small jumps by the sum of a Brownian
motion and a compound Poisson process. When the Wiener-Hopf factorization is available,
there is an efficient method [67] for sampling the running maximum (called Wiener-Hopf
Monte Carlo, or WH-MC). In Section 2.6 we consider Beta processes, and use WH-MC to
assess the accuracy of our approximation technique. In addition, as Beta processes are in
M , we can represent κ(α, q) as an infinite product; we also include the results obtained by
truncating this product and performing the inversion.
2.2 Preliminaries
Recalling that we denote by $\bar X_t$ the running maximum process of the Lévy process $X_t$, and
by $\tau(q)$ an exponentially distributed random variable (with mean $1/q$, for $q > 0$), we review
in this section Lévy processes for which the double transform of $\bar X_{\tau(q)}$, denoted by
κ(α, q), can be explicitly expressed in terms of the model’s primitives, or immediately computable
quantities.
We first consider the situation that there are no positive jumps, that is, the spectrally negative
case. Following [21, Ch. VII], for $X \in S_-$ we define $\Phi(\beta) := \log \mathbb{E} e^{\beta X_1}$,
and $\Psi(\cdot)$ its right-inverse [68, p. 211]. Then κ(α, q) satisfies the following simple expression:
\[ \kappa(\alpha, q) = \frac{\Psi(q)}{\Psi(q) + \alpha}. \tag{2.4} \]
In other words, $\bar X_{\tau(q)}$ is exponentially distributed with parameter $\Psi(q)$, or, equivalently,
\[ \int_0^\infty q e^{-qt}\, \mathbb{P}(\bar X_t \in dx)\, dt = \Psi(q) e^{-\Psi(q) x}\, dx. \tag{2.5} \]
Then we consider the case of no negative jumps, usually referred to as the spectrally positive
case. For $X \in S_+$ we define the Laplace exponent by the function
$\varphi(\cdot) : [0,\infty) \to [0,\infty)$, defined through
$\varphi(\alpha) := \log \mathbb{E} e^{-\alpha X_1}$. In this case, with $\psi(\cdot)$ being the
inverse of $\varphi(\cdot)$,
\[ \kappa(\alpha, q) = \frac{q}{\psi(q)} \cdot \frac{\psi(q) - \alpha}{q - \varphi(\alpha)}. \tag{2.6} \]
This result is sometimes referred to as the (generalized) Pollaczek-Khinchine formula [55,
98]; see also [10, Ch. IX, Thm. 3.10].
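Brownian motion with drift is both spectrally negative and spectrally positive, so (2.4) and (2.6) should return the same value of κ(α, q). The following minimal sketch checks this numerically. It assumes $X_t = dt + \sigma B_t$, for which $\Phi(\beta) = d\beta + \sigma^2\beta^2/2$ and $\varphi(\alpha) = -d\alpha + \sigma^2\alpha^2/2$, so that both inverses follow from a quadratic equation; the function names and parameter values are illustrative only.

```python
import math

def psi_neg(q, d, sigma):
    """Right-inverse Psi(q) of Phi(beta) = d*beta + sigma^2*beta^2/2."""
    return (-d + math.sqrt(d * d + 2.0 * sigma**2 * q)) / sigma**2

def kappa_neg(alpha, q, d, sigma):
    """Formula (2.4): kappa = Psi(q) / (Psi(q) + alpha)."""
    p = psi_neg(q, d, sigma)
    return p / (p + alpha)

def phi_pos(alpha, d, sigma):
    """Laplace exponent phi(alpha) = log E exp(-alpha X_1)."""
    return -d * alpha + 0.5 * sigma**2 * alpha**2

def psi_pos(q, d, sigma):
    """Inverse psi(q) of phi: the positive root of phi(alpha) = q."""
    return (d + math.sqrt(d * d + 2.0 * sigma**2 * q)) / sigma**2

def kappa_pos(alpha, q, d, sigma):
    """Formula (2.6): kappa = (q / psi(q)) * (psi(q) - alpha) / (q - phi(alpha)).

    The removable singularity at alpha = psi(q) is not handled here.
    """
    p = psi_pos(q, d, sigma)
    return (q / p) * (p - alpha) / (q - phi_pos(alpha, d, sigma))

# Illustrative parameters (assumed, not taken from the text).
d, sigma, q = -0.5, 1.0, 2.0
for alpha in (0.3, 1.0, 4.0):
    print(alpha, kappa_neg(alpha, q, d, sigma), kappa_pos(alpha, q, d, sigma))
```

The two columns printed agree up to rounding, which reflects the identity $\psi(q)\Psi(q) = 2q/\sigma^2$ for this process.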
We finally consider the case in which the jumps in the downward direction are general, but
those in the upward direction are assumed to have a rational Laplace transform [72]. We
define this class R by the Lévy processes $X_t$ such that for a finite and positive λ,
\[ \xi(s) := \log \mathbb{E} e^{isX_1} = isd - \tfrac{1}{2} s^2\sigma^2 + \int_{-\infty}^{0} \left(e^{isx} - 1 - isx\,1_{\{x>-1\}}\right) \Pi(dx) + \lambda \left( \sum_{k=1}^{K} \sum_{j=1}^{n_k} c_{kj} \left( \frac{i\alpha_k}{s + i\alpha_k} \right)^{j} - 1 \right), \]
where the $\alpha_k$ are ordered such that
$0 \le \mathrm{Re}(\alpha_1) < \mathrm{Re}(\alpha_2) \le \cdots \le \mathrm{Re}(\alpha_K)$. This corresponds to
a Lévy process with a general jump-size distribution in the downwards direction, while the
upwards jumps have density
\[ p(x) = \sum_{k=1}^{K} \sum_{j=1}^{n_k} c_{kj} (\alpha_k)^j \frac{x^{j-1}}{(j-1)!} e^{-\alpha_k x}, \qquad x > 0. \]
Now let $\beta_j(q)$ be the j-th root of $q = \xi(s)$, with multiplicity $m_j(q)$; let $m(q)$ be the
total number of distinct roots. Then
\[ \kappa(\alpha, q) = \prod_{k=1}^{K} \left( \frac{\alpha + \alpha_k}{\alpha_k} \right)^{n_k} \prod_{j=1}^{m(q)} \left( \frac{\beta_j(q)}{\alpha + \beta_j(q)} \right)^{m_j(q)}; \tag{2.7} \]
this expression can be inverted with respect to α, after having performed a partial fraction
expansion. Further details and properties of the roots are given in [72, Thm. 2.2].
2.3 Laplace inversion
As pointed out in the introduction, our approach requires a technique to perform Laplace
transform inversion. More specifically, our methodology proposes a way to approximate
the double transform $\kappa(\alpha, q) = \mathbb{E} e^{-\alpha \bar X_{\tau(q)}}$. In this section we
first describe such a Laplace transform inversion technique in detail. As the objective of this
section is to assess the accuracy of the double inversion technique, we then focus on a situation
in which both κ(α, q) and $\mathbb{P}(\bar X_t \le x)$ are explicitly known (viz. Brownian motion
with drift). Then we consider
situations for which we do know κ(α, q); for these cases we use simulation to validate our
numerical findings.
2.3.1 Laplace inversion
As indicated in the introduction, in our approach an important role is played by techniques
to perform Laplace inversion. We advocate the use of the method developed by den Iseger
[36]. It is in the spirit of approaches developed earlier [1, 38], in the sense that it relies
on the Poisson summation formula. This Poisson summation formula relates an infinite
sum of Laplace transform values to the z-transform of the function values f(kΔ), with k =
0, . . . ,M − 1, that we wish to evaluate, from which the f(kΔ) can be computed relying on
the well-known fast Fourier transform [33].
A first complication is that the above-mentioned infinite sum tends to converge slowly.
Abate and Whitt [1] remedy this using a so-called Euler summation, but in general the
convergence remains prohibitively slow unless knowledge of the location of singularities
is available. One of den Iseger’s contributions [36] is to approximate the infinite sum by
a finite sum by using a Gaussian quadrature. The resulting algorithm is a substantial im-
provement over earlier algorithms in the sense that (i) it can handle a larger class of Laplace
transforms (e.g., no knowledge of the location of discontinuities or singularities is needed),
(ii) the algorithm only needs numerical values of the Laplace transform, is fast (that is, the
function values f(kΔ), with k = 0, . . . ,M −1, are computed at once, in order M logM time),
and is of nearly machine precision, and (iii) it can be extended to multiple dimensions. We stress
that this last feature is of crucial importance to us, as in our setting we are often dealing with
two-dimensional transforms.
In our numerical experiments we used the modified Laplace inversion for non-smooth func-
tions which was developed in [36, Section 6.2]. This modification is effective for functions
with discontinuities, singularities and local non-smoothness (even if we do not a priori know
their locations). The experiments reported on in [36] show that the algorithm typically re-
sults in approximations of (nearly) machine precision. Below we explain in greater detail
how this modification works.
Let $\hat f(s)$ be the Laplace transform of the complex-valued Lebesgue integrable function $f(x)$.
Then it holds that (see e.g. [1])
\[ \sum_{k=-\infty}^{\infty} \hat f\bigl(a + 2\pi i (k + \nu)\bigr) = \sum_{k=0}^{\infty} e^{-ak} e^{-2\pi i k \nu} f(k), \tag{2.8} \]
where $a$ is a given real number. In this approach we approximate the left-hand side of (2.8)
by a finite summation
\[ \sum_{k=1}^{n} \beta_k\, \hat f(a + i\lambda_k + 2\pi i \nu), \]
where the $(\beta_k)_{k=1}^{n}$ are appropriately chosen positive numbers and $(\lambda_k)_{k=1}^{n}$
appropriately chosen real numbers. In [36, Appendix A] it is described how these numbers can be
generated for such a quadrature rule.
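The Poisson summation formula (2.8), and the slow convergence of its left-hand side, can be observed on a simple case. The sketch below assumes the test function $f(x) = x e^{-x}$ with Laplace transform $\hat f(s) = 1/(1+s)^2$ (our choice for illustration; it is continuous at 0, so no half-value correction is needed) and truncates the infinite sum symmetrically at $|k| \le K$. It does not implement the Gaussian quadrature of [36].

```python
import cmath
import math

def f(x):
    """Test function f(x) = x * exp(-x) (assumed for illustration)."""
    return x * math.exp(-x)

def f_hat(s):
    """Laplace transform of f: 1 / (1 + s)^2."""
    return 1.0 / (1.0 + s) ** 2

def lhs(a, nu, K):
    """Left-hand side of (2.8), truncated symmetrically at |k| <= K."""
    return sum(f_hat(a + 2j * math.pi * (k + nu)) for k in range(-K, K + 1))

def rhs(a, nu, terms=200):
    """Right-hand side of (2.8); the summands decay geometrically in k."""
    return sum(cmath.exp(-(a + 2j * math.pi * nu) * k) * f(k) for k in range(terms))

a, nu = 0.5, 0.3
exact = rhs(a, nu)
for K in (10, 1000, 100000):
    # The truncation error decays only like 1/K.
    print(K, abs(lhs(a, nu, K) - exact))
```

The error of the truncated left-hand side shrinks roughly in proportion to 1/K, which illustrates why a plain truncation is impractical and an acceleration step such as Euler summation or a quadrature rule is needed.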
Suppose now that f(·) has a singularity in x = α for some α ∈ R. Let w(·) be a window
function, that is, a trigonometric polynomial with period 1, with w(0) = 1 and w(α) = 0.
Define
\[ f_w(x) := w(x)^q f(x), \]
for some positive integer q. The parameter q is chosen such that $f_w(x)$ is smooth in x = α;
also observe that $f_w(k)$ coincides with $f(k)$ at k = 0, 1, . . .. Now the ‘normal’ Laplace
inversion technique, as described in [36, Section 4], applied to $f_w(\cdot)$, can be used to compute
the $f(k)$ (with integer k). If the function has multiple singularities, say in the points $\alpha_j$
with j = 1, 2, . . . , m, the window function is the product of window functions, that is,
$w(x) = \prod_{j=1}^{m} w_j(x)$. If there is a singularity at x = 0, a situation that occurs
frequently in the examples of the present chapter, the window function is
\[ w(x) = \sin^2\left( \frac{\pi x}{2} \right), \]
and in the way described above we can compute the function values $f(2k + 1)$. Guidelines
for choosing the parameter q are given in [36, Remark 6.5].
The modified algorithm described above can be improved for functions with various sorts
of non-smoothness; we now describe an improvement detailed in [36, Section 6.3] which
is useful when we do not know the location of the singularity. Suppose that the window
function depends on the point k, $f_{w_k}(x) = w_k(x) f(x)$, such that $w_k(k) = 1$, and the
ε-support, with a > 0,
\[ \{ x : |e^{-at} f_{w_k}(x)| \ge \varepsilon \} \]
of $f_{w_k}$ is $[k - \delta, k + \delta]$, with δ a given positive control parameter and ε a
predefined tolerance. In order to be sure that $f_{w_k}(\cdot)$ is smooth on [0,∞), it is sufficient
that f(·) is smooth on $[k - \delta, k + \delta]$. As a result, it is only needed that f(·) be smooth
on $[k - \delta, k + \delta]$ to compute $f(k)$ in great precision using the quadrature rule mentioned
above. As it turns out, a good choice for the window function is the Gaussian function defined by
\[ w(t) = \exp\left( -\frac{1}{2} \left( \frac{t}{\sigma} \right)^{2} \right) \]
for given tolerance ε and control parameter δ, where σ is chosen such that
\[ \exp\left( -\frac{1}{2} \left( \frac{\delta}{\sigma} \right)^{2} \right) < \varepsilon. \]
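The condition on σ can be solved explicitly: it holds whenever $\sigma < \delta / \sqrt{2 \log(1/\varepsilon)}$. A minimal sketch of this choice follows; the safety factor and the function names are our own illustrative assumptions and are not taken from [36].

```python
import math

def gaussian_window_sigma(delta, eps, safety=0.99):
    """A sigma satisfying exp(-(delta/sigma)^2 / 2) < eps.

    The bound delta / sqrt(2 log(1/eps)) is scaled by a safety factor
    below 1 so that the inequality is strict (an illustrative choice).
    """
    if not (0.0 < eps < 1.0) or delta <= 0.0:
        raise ValueError("need 0 < eps < 1 and delta > 0")
    return safety * delta / math.sqrt(2.0 * math.log(1.0 / eps))

def window(t, sigma):
    """Gaussian window w(t) = exp(-(t/sigma)^2 / 2)."""
    return math.exp(-0.5 * (t / sigma) ** 2)

sigma = gaussian_window_sigma(delta=1.0, eps=1e-12)
print(sigma, window(1.0, sigma))  # the window value at delta lies below eps
```

A smaller tolerance ε or a smaller control parameter δ forces a narrower window.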
We also mention that [36, Section 5] points out how multi-dimensional inversion can be per-
formed. For further implementation details we refer to [36].
This Laplace inversion method can be adjusted to facilitate the numerical computation of
Laplace transforms; such a procedure is needed in situations that no explicit expressions are
available (for instance for the Pareto or Weibull distribution). The key idea behind it
concerns the transformation of the Legendre coefficients. Legendre polynomials are a complete
orthogonal set of polynomials in $L^2([0, 1])$ and, in addition, the shifted versions of the
Legendre polynomials are a complete set in $L^2(\mathbb{R})$. Therefore, any function in
$L^2(\mathbb{R})$ can be approximated with an expansion of shifted Legendre polynomials. On the other hand there is a
complete set of functions in the Laplace domain; for a definition we refer to [37, Appendix
A]. The coefficients of the expansions in these two spaces are linked together through the
Poisson summation formula (2.8). As demonstrated in [37], such a method can compute the
Laplace transform with (almost) machine precision accuracy; it only needs knowledge of the
coefficients of the expansion which can be computed by Gaussian quadratures.
In the rest of this section we systematically assess the performance of the inversion technique
developed in [36] (and described above), in the context of the evaluation of $\mathbb{P}(\bar X_t \le x)$.
We start by considering a case in which explicit analysis is possible (viz. Brownian motion
with drift). Then we consider a number of other examples for which no explicit expression
is available (but in which we do know κ(α, q)); in those cases we compare our numerical
output with simulations.
2.3.2 Comparison with exact results
In this subsection we consider a case in which the distribution function of $\bar X_t$, that is, $\mathbb{P}(\bar X_t \le x)$, is known explicitly.
Example 1. Let $X_t$ be a Brownian motion with drift, i.e., $X_t = dt + \sigma B_t$ with $B_t$ being
standard Brownian motion and d ∈ R. It holds that [56, p. 49]
\[ \mathbb{P}(\bar X_t \le x) = 1 - \Phi_N\left( \frac{-x + dt}{\sigma\sqrt{t}} \right) - e^{2dx/\sigma^2}\, \Phi_N\left( \frac{-x - dt}{\sigma\sqrt{t}} \right), \]
with $\Phi_N(\cdot)$ denoting the distribution function of a standard Normal random variable.
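The explicit formula of Example 1 is easy to evaluate directly, and it can also be used to check the exponential law (2.5): integrating $q e^{-qt}\,\mathbb{P}(\bar X_t \le x)$ over t should give $1 - e^{-\Psi(q)x}$, where for Brownian motion with drift $\Psi(q) = (-d + \sqrt{d^2 + 2\sigma^2 q})/\sigma^2$. The sketch below does both with the standard library only. The midpoint rule after the substitution $u = e^{-qt}$ is a crude quadrature chosen by us for brevity, not the inversion method of [36].

```python
import math

def norm_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def sup_cdf(t, x, d, sigma):
    """P(sup_{0<=s<=t} X_s <= x) for X_s = d*s + sigma*B_s, with t, x > 0."""
    s = sigma * math.sqrt(t)
    return (1.0 - norm_cdf((-x + d * t) / s)
            - math.exp(2.0 * d * x / sigma**2) * norm_cdf((-x - d * t) / s))

def killed_sup_cdf(q, x, d, sigma, n=20000):
    """Midpoint rule for the integral of q*exp(-q*t)*sup_cdf(t, x) over t > 0,
    written as an integral over u = exp(-q*t) in (0, 1)."""
    h = 1.0 / n
    return h * sum(sup_cdf(-math.log((i + 0.5) * h) / q, x, d, sigma)
                   for i in range(n))

d, sigma = -0.5, 1.0
print(sup_cdf(0.1, 0.1, d, sigma))           # compare with the exact column of Table 2.1

q, x = 1.0, 0.5
Psi = (-d + math.sqrt(d * d + 2.0 * sigma**2 * q)) / sigma**2
print(killed_sup_cdf(q, x, d, sigma), 1.0 - math.exp(-Psi * x))
```

With d = −0.5, σ = 1 and q = 1 one has Ψ(q) = 2, so the last line compares the quadrature value with $1 - e^{-1}$.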
As highlighted in Section 2.3.1, several Laplace inversion variants are described in [36]; they
differ in the way they deal with discontinuities and singularities. In this example, and all
following numerical computations presented in this chapter, we use the variant described in
[36, Section 6.3]. Table 2.1 focuses on $\mathbb{P}(\bar X_t \le x)$, and compares the output of our
numerical experiments with the exact values and simulation-based estimates. Here, and in all other
examples reported on in this chapter, we perform $10^7$ independent replications per simulation
experiment.

time t   x     Simulation   Exact value   Error, use (2.5)   Error, use (2.4)
0.1      0.1   0.286726     0.28679183    1.021e-14          2.012e-11
         0.2   0.525182     0.52535042    1.909e-14          2.063e-11
         0.5   0.912001     0.91208092    9.298e-15          1.416e-10
         1.0   0.999051     0.99906069    8.121e-17          9.550e-10
0.3      0.1   0.190579     0.19063594    5.995e-15          8.468e-11
         0.2   0.358863     0.35900170    1.144e-14          1.564e-10
         0.5   0.723613     0.72378120    2.248e-14          4.055e-10
         1.0   0.959856     0.95991828    6.335e-15          2.833e-09
0.5      0.1   0.161220     0.16126780    3.997e-15          2.946e-10
         0.2   0.305175     0.30529875    8.993e-15          6.374e-10
         0.5   0.636069     0.63611270    1.860e-14          2.216e-09
         1.0   0.908323     0.90832011    3.802e-15          6.548e-09

Table 2.1: Brownian motion with parameters d = −0.5 and σ = 1.0.
The third column contains the simulation-based estimate, the fourth the exact value based on
the above formula. In the last two columns we use the explicit expression (2.4) that we have
for κ(α, q) (or alternatively representation (2.6); recall that Brownian motion is spectrally
negative as well as spectrally positive!). In the fifth column we use the fact that we can per-
form the inversion with respect to α explicitly, as seen in Eqn. (2.5); then a one-dimensional
numerical Laplace inversion is used to approximate the probability of interest. The resulting
error (compared to the exact result) is given. In the last column we present the values ob-
tained when subjecting (2.4) to two-dimensional Laplace inversion; again the error is given.
Observe that in the former approach the errors are at most of the order of $10^{-14}$, and in the
latter approach at most of the order of $10^{-9}$.
2.3.3 Comparison with simulation results
In the next set of examples, we let Xt correspond to a Brownian motion with drift, plus a
compound Poisson process with upward jumps. In other words,
\[ \log \mathbb{E} e^{isX_1} = isd - \tfrac{1}{2} s^2\sigma^2 + \lambda \left( \mathbb{E} e^{isJ} - 1 \right), \tag{2.9} \]
with J ≥ 0 the random variable associated with the jumps, and λ > 0. As this process is
spectrally positive, (2.6) applies. We consider various jump-size distributions J , thus cover-
ing both light-tailed and heavy-tailed scenarios.
One way to determine $\bar X_t$ in a simulation is by sampling the values of the Lévy process on a
grid (yielding $X_0, X_\Delta, X_{2\Delta}, \ldots, X_{t-\Delta}, X_t$), and to then take the maximum
(tacitly assuming that t is a multiple of Δ). This procedure is inherently biased: the value found
in this way is necessarily no larger than $\bar X_t$, but of course this bias decreases when Δ ↓ 0.
In this section we consider the situation that the Lévy process is the sum of a deterministic
drift, a Brownian term, and a compound Poisson process, and it turns out that for this specific
scenario there is an attractive alternative. First observe that it is trivial to sample the jump
epochs of the compound Poisson process up to time t, and the values of the Lévy process at these
jump epochs, as well as the value at time t itself; call the resulting time epochs
$t_0 = 0, t_1, \ldots, t_{N-1}, t_N = t$. Then realize that the distribution of the maximum between
$t_i$ and $t_{i+1}$ is known; it follows essentially from the distribution of the maximum attained
by a Brownian bridge. The corresponding distribution function is invertible, and as a consequence
it is elementary to sample from it. It is now clear that in this way we can generate all
information needed to determine $\bar X_t$ (without any approximation). The procedure is described
in detail in [50].
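The sampling scheme just described can be sketched as follows. This is our own minimal illustration and not necessarily the exact procedure of [50]. It relies on the standard fact that, for a Brownian bridge with volatility σ running from level a to level b over a time span h, the maximum M satisfies $\mathbb{P}(M > m) = \exp(-2(m-a)(m-b)/(\sigma^2 h))$ for $m \ge \max(a, b)$, which can be inverted in closed form. The function name, the argument `draw_jump` and the parameter values are hypothetical.

```python
import math
import random

def sample_running_max(t, d, sigma, lam, draw_jump, rng):
    """One draw of sup_{0<=s<=t} X_s, where X is a drift d plus sigma times
    Brownian motion plus a compound Poisson process (rate lam, jumps draw_jump(rng))."""
    now, level, running_max = 0.0, 0.0, 0.0
    while now < t:
        # Time until the next jump epoch; no jumps at all when lam == 0.
        gap = rng.expovariate(lam) if lam > 0.0 else math.inf
        jump_here = now + gap < t
        h = gap if jump_here else t - now
        # Value of the continuous part (drift + Brownian term) at the interval end.
        end = level + d * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        # Maximum of the Brownian bridge from `level` to `end`, by inversion.
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        bridge_max = 0.5 * (level + end + math.sqrt(
            (end - level) ** 2 - 2.0 * sigma**2 * h * math.log(u)))
        running_max = max(running_max, bridge_max)
        if jump_here:
            now, level = now + gap, end + draw_jump(rng)
            running_max = max(running_max, level)
        else:
            now, level = t, end
    return running_max

# Sanity check without jumps: Brownian motion with d = -0.5, sigma = 1 (as in Table 2.1).
rng = random.Random(1)
n = 20000
hits = sum(sample_running_max(0.1, -0.5, 1.0, 0.0, lambda r: 0.0, rng) <= 0.1
           for _ in range(n))
print(hits / n)  # should be close to the exact value for t = x = 0.1
```

Because the drift does not affect the law of the bridge once both endpoints are fixed, the same inversion step is used on every interval between consecutive epochs.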
Example 2. In this example, we assume that jump size J has an exponential distribution,
that is, P(J > x) = exp(−μx), with μ > 0. The results are presented in Table 2.2. Again,
the third column contains the simulation-based estimate. In the fourth column, we rely
on (2.7) with one-dimensional inversion (observe that the upward jumps are of phase-type,
hence this formula applies); the resulting approximation is given. The fifth column displays
the approximation based on (2.6) with two-dimensional inversion. Finally, the last column
gives the difference between the previous two columns. It is concluded that both inversion-based
methods are close to the simulation-based estimates; in addition, the inversion-based
methods give nearly the same result (up to roughly $10^{-9}$, that is).
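For exponential jumps the two representations (2.6) and (2.7) of κ(α, q) can be compared before any inversion is carried out. The sketch below is an illustration under stated assumptions: we take K = 1, $n_1 = 1$ and $\alpha_1 = \mu$ in (2.7), and we read the roots $\beta_j(q)$ as the two positive solutions $\beta_1 < \mu < \beta_2$ of $\log \mathbb{E} e^{\beta X_1} = q$ (our interpretation of the root equation, which requires σ > 0). All roots are located by bisection; the parameter values are arbitrary.

```python
import math

def bisect(g, lo, hi, iters=200):
    """Root of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi(alpha, d, sigma, lam, mu):
    """phi(alpha) = log E exp(-alpha X_1): drift d, volatility sigma,
    Exp(mu) upward jumps arriving at rate lam."""
    return -d * alpha + 0.5 * sigma**2 * alpha**2 + lam * (mu / (mu + alpha) - 1.0)

def kappa_26(alpha, q, d, sigma, lam, mu):
    """Formula (2.6), with psi(q) found numerically (alpha != psi(q) assumed)."""
    hi = 1.0
    while phi(hi, d, sigma, lam, mu) <= q:
        hi *= 2.0
    psi = bisect(lambda a: phi(a, d, sigma, lam, mu) - q, 0.0, hi)
    return (q / psi) * (psi - alpha) / (q - phi(alpha, d, sigma, lam, mu))

def kappa_27(alpha, q, d, sigma, lam, mu):
    """Formula (2.7) for K = 1, n_1 = 1, alpha_1 = mu (assumed root convention)."""
    g = lambda b: phi(-b, d, sigma, lam, mu) - q   # log E exp(b X_1) - q
    tiny = 1e-12 * mu
    beta1 = bisect(g, 0.0, mu - tiny)              # root in (0, mu)
    hi = 2.0 * mu
    while g(hi) <= 0.0:
        hi *= 2.0
    beta2 = bisect(g, mu + tiny, hi)               # root in (mu, infinity)
    return (alpha + mu) / mu * beta1 / (alpha + beta1) * beta2 / (alpha + beta2)

d, sigma, lam, mu, q = -0.5, 1.0, 1.0, 2.0, 1.0
for alpha in (0.25, 1.0, 5.0):
    print(alpha, kappa_26(alpha, q, d, sigma, lam, mu), kappa_27(alpha, q, d, sigma, lam, mu))
```

Under these assumptions the two expressions coincide up to rounding, because $(\mu + \alpha)(q - \varphi(\alpha))$ is a cubic polynomial in α with roots $\psi(q)$, $-\beta_1$ and $-\beta_2$.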
Example 3. Now let J have a Weibull distribution: $\mathbb{P}(J > x) = \exp(-\mu x^{\gamma})$, with
μ, γ > 0. For γ ∈ (0, 1), this tail is heavier than exponential, for γ > 1 lighter. More
specifically, for γ < 1 the Weibull distribution is subexponential: despite the fact that all
moments exist, there is no open neighborhood around the origin on which the moment generating
function is finite; in Table 2.3 the jump sizes are subexponential. The third column contains
simulation-based estimates, the fourth is based on doubly inverting expression (2.6). Notice that
in this situation we cannot approximate the probability $\mathbb{P}(\bar X_t \le x)$ relying on
(2.7), as the positive jumps do not have a rational Laplace transform. The last two columns will be
commented on in the next section. We observe that the approximation based on double Laplace
inversion of (2.6) performs reasonably well compared to the simulation-based estimates; the fit is
better in the light-tailed case.
Example 4. Let J now be sampled from a Pareto distribution: $\mathbb{P}(J > x) = (x+1)^{-\gamma}$,
for some γ > 0. This tail is heavier than the Weibull tail: just a finite number of moments exists;
more precisely, the k-th moment exists if k < γ. Table 2.5 should be read in the same way as
Tables 2.3 and 2.4. We conclude that there is a good fit relative to the simulation-based results.
2.4 Approximation with rational Laplace transform
From the examples presented in the previous section we conclude that the numerical inversion
procedure works well, even if the approximation requires a double inversion. In all these
examples, however, the Lévy process involved was such that the double transform κ(α, q)
was given in closed form.
In this section we add a complication. We consider cases in which Xt is such that we do
not have an explicit expression for κ(α, q). The focus will now be on Lévy processes that are
Brownian motion with drift, plus compound Poisson processes (with upward and down-
ward jumps); ‘small jumps’ will only be incorporated in the next section. If the jumps in the
upward direction do not have a rational Laplace transform, the results of [71] do not apply
(see Section 2.2), and hence we do not have an explicit expression for κ(α, q). The idea is
now that we approximate the distribution of the upward jumps by a phase-type distribution
(while leaving the jumps in the downward direction unchanged), so that we are again in
the framework of [71] — realize that the class of phase-type distributions is contained in the
class of distributions with a rational Laplace transform. The objective of this section is to
assess how well such an approximation performs, in terms of evaluating $\mathbb{P}(\bar X_t \le x)$.
2.4.1 Fitting of phase-type distributions
There are various papers dealing with approximating a distribution on (0,∞) by a phase-
type distribution, see for instance [40, 57]. In our work we rely on the approach developed in
[14], based on the EM algorithm, and [94], who propose a comparable approach that focuses
primarily on mixtures of Erlangs. For a precise definition of phase-type distributions, see
e.g. [10, Ch. III]; they can be thought of as distributions of absorption times in a finite-state
continuous-time Markov chain. More precisely, with d denoting the dimension of the state
space, and d − 1 states being transient and the remaining state absorbing, a phase-type dis-
tribution corresponds to the entrance time of the absorbing state. This class covers mixtures
and sums of exponential distributions (and hence also the Erlang distribution, being dis-
tributed as the sum of independent exponential random variables with the same mean). The
class of phase-type distributions is dense, in that any distribution on (0,∞) can, in principle,
be approximated arbitrarily well; the price to be paid, though, is that the dimension d of the
associated Markov chain may become large.
The performance of the EM-based algorithm proposed is assessed in detail in [14] — it was
shown that quite a large class of distributions can be accurately approximated by phase-type
distributions (of relatively low dimension d). From this it is, however, not a priori clear what
the impact is of replacing the upward jumps by an appropriate phase-type random variable
when evaluating $\mathbb{P}(\bar X_t \le x)$ in the way described above; we do not have an explicit bound
on the error introduced by replacing the jump distribution by its phase-type counterpart. It
is the primary objective of this section to study this effect.
The remainder of this section consists of two parts. In Section 2.4.2 we consider models
of which the upward jumps do not have a rational Laplace transform, but that are in S+
(i.e., there are no downward jumps). Due to (2.6), we know κ(α, q), so that we can apply
the inversion approach developed in [36] (see Section 2.3) to evaluate $\mathbb{P}(\bar X_t \le x)$.
Then we approximate the upward jumps with phase-type random variables, compute κ(α, q) relying
on [71], and again perform the inversion. Then we compare both numerical approximations
of $\mathbb{P}(\bar X_t \le x)$.
In Section 2.4.3 we consider models for which we do not know κ(α, q), i.e., models in which
both the upward and downward jumps have general distributions. We approximate the upward
jumps by phase-type random variables, and proceed as before. We then compare with
simulation to assess the accuracy of this approach.
2.4.2 Comparison with results for spectrally-positive Lévy processes
In the examples below (2.9) applies: the Lévy process consists of a Brownian term (with drift)
increased by a compound Poisson process with positive jumps. As a result, κ(α, q) is given
by (2.6). To assess the impact of replacing the upward jumps by their phase-type counterpart,
we first use the EM-algorithm to find a phase-type approximation for the jumps, and then
approximate $\mathbb{P}(\bar X_t \le x)$, relying on (2.7) and a single-dimensional Laplace inversion.
Example 5. We go back to the setting of Example 3: we let J have a Weibull distribution with
γ = 0.5 and γ = 2, respectively. In the fifth column of Tables 2.3 and 2.4 we display the
resulting numerical approximations. The last column gives the difference with the result of
doubly inverting (2.6). It is concluded that the differences roughly range between $10^{-4}$ and
$10^{-7}$.
Example 6. We now return to the setting of Example 4: we assume that J corresponds to a
Pareto distribution. The last two columns of Table 2.5 should be read as the corresponding
columns in Tables 2.3 and 2.4. Here the differences with the result based on (2.6) roughly
range between $10^{-3}$ and $10^{-5}$.
Example 7. Now consider a slightly harder example: J follows a shifted-Pareto distribution,
that is, $\mathbb{P}(J > x) = 1$ for x ≤ 1, and $\mathbb{P}(J > x) = x^{-\gamma}$ for x > 1, for some
γ > 0; observe that the support of J is (1,∞). In this case the approximating phase-type
distribution is a mixture of Erlang distributions of high degree; to this end, realize that an
Erl(n, n) random variable (having mean 1 and variance 1/n, i.e., vanishing as n grows large)
approximates a deterministic random variable equal to 1. Table 2.6 should be read in the same way
as Table 2.5. We observe that, owing to the fact that the distribution of the jumps does not have
support (0,∞), the phase-type-based approximation performs relatively poorly.
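The remark on Erl(n, n) random variables can be illustrated by sampling the corresponding phase-type distribution as an absorption time: the chain passes through n transient phases in sequence and stays an exponential time with rate n in each. The following sketch is a plain Monte Carlo illustration (not the EM fitting procedure of [14]); the sample sizes and the seed are arbitrary choices.

```python
import random

def sample_erlang_phase_type(n, rate, rng):
    """Absorption time of a chain visiting n transient phases in sequence,
    with an Exp(rate) sojourn in each phase: an Erlang(n, rate) variable."""
    return sum(rng.expovariate(rate) for _ in range(n))

def mean_and_variance(samples):
    """Sample mean and unbiased sample variance."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, v

rng = random.Random(7)
for n in (1, 10, 50):
    draws = [sample_erlang_phase_type(n, float(n), rng) for _ in range(10000)]
    m, v = mean_and_variance(draws)
    # The mean stays near 1 while the variance shrinks like 1/n.
    print(n, m, v, 1.0 / n)
```

The variance decays only like 1/n, so mimicking a distribution with a point of non-smoothness at x = 1 requires many phases, which is consistent with the weaker performance reported in Example 7.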
2.4.3 Comparison with simulation results
In this subsection we deal with an example in which we do not know κ(α, q) (as opposed to
the examples presented in Section 2.4.2).
Example 8. In this example we consider a compound Poisson process with two-sided jumps, plus
Brownian motion with drift. The upward jumps are Weibullian, and approximated by a
phase-type distribution. The downward jumps are exponential. The numerical results (Table 2.7) are
compared to simulation-based estimates, and show a good fit. (As an aside we mention that
in this case the downward jumps are of phase-type, whereas the upward jumps are not.
Consequently, also in this case κ(α, q) can be given explicitly, in terms of a number of roots, see
[71]. We do not pursue this approach.)
2.5 Small jumps
So far we have developed a technique that can deal with all Lévy processes consisting of
deterministic drifts, Brownian motions and compound Poisson processes. This means that
we have not yet looked at processes with small jumps. In this section we rely on results from
[15] to deal with these. The main result used is that under appropriate conditions a Lévy
process with small jumps can be accurately approximated by the sum of an appropriately
chosen compound Poisson process and Brownian motion. We first write the jump part of
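the Lévy exponent in the form
\[
  \int_{-\infty}^{\infty} \bigl(e^{isx} - 1 - isx\,1_{\{|x|<\varepsilon\}}\bigr)\,\Pi(dx)
  = \int_{-\varepsilon}^{\varepsilon} \bigl(e^{isx} - 1 - isx\bigr)\,\Pi(dx)
  + \int_{\mathbb{R}\setminus[-\varepsilon,\varepsilon]} \bigl(e^{isx} - 1\bigr)\,\Pi(dx);
\]
let the first term correspond to a Lévy process, say, $X^{(1,\varepsilon)}_t$, and the second term (which is a
compound Poisson process) to, say, $X^{(2,\varepsilon)}_t$. Then, with the ‘small jump component’ $X^{(1,\varepsilon)}_t$
replaced by a Brownian motion with drift, the jump part of the process can be
approximated by (for some small value of ε)
\[
  \mu_\varepsilon t + \sigma_\varepsilon B_t + X^{(2,\varepsilon)}_t, \tag{2.10}
\]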
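where $B_t$ is a standard Brownian motion, and
\[
  \mu_\varepsilon := \int_{-\varepsilon}^{\varepsilon} x\,\Pi(dx), \qquad
  \sigma^2_\varepsilon := \int_{-\varepsilon}^{\varepsilon} x^2\,\Pi(dx).
\]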
To shed some light on the accuracy of such an approximation, we mention that, under
appropriate conditions [15], as ε ↓ 0,
\[
  \left(\frac{X^{(1,\varepsilon)}(t) - \mu_\varepsilon t}{\sigma_\varepsilon}\right)_{t\ge 0}
  \xrightarrow{d} (B_t)_{t\ge 0}. \tag{2.11}
\]
A sufficient condition for (2.11) to hold is that, with L(·) a slowly varying function at 0, Π(·)
has a density of the form $L(x)/|x|^{\gamma+1}$ for x ↓ 0, with γ ∈ (0, 2). This condition
applies to e.g. stable Lévy processes and CGMY processes, but not to e.g. Variance Gamma
processes (as these correspond to γ = 0). We also mention that the use of (2.10) is advocated
for Variance Gamma in [44]; see the third algorithm on p. 25.
Approximating the distribution of the upward jumps by a phase-type distribution, we are
again in the setting of Section 2.4. As a result, we can use the methodology developed earlier
to perform the numerical computations. There is an obvious trade-off between accuracy and
computational effort when varying ε.
Example 9. In this example we consider a Lévy process whose upward jumps are CGMY-like,
that is, for C, M, Y > 0, the Lévy density is
\[
  \Pi(x) = C\,\frac{e^{-Mx}}{x^{1+Y}}
\]
for x > 0 and 0 otherwise. A Brownian term is added. The third and fourth columns of Table
2.8 present simulation-based estimates, based on approximation (2.10), with ε = 0.1 and
ε = 0.05, respectively. Observe that this model is contained in $S_+$, so that (2.6) applies and
κ(α, q) is given explicitly; the fifth column gives the results based on double inversion of
(2.6). In the last column $X^{(2,\varepsilon)}_t$ is approximated by a Lévy process with phase-type jumps;
as usual, we apply (2.7). From the small difference between both simulation-based columns,
we conclude that those values are likely to be close to the true values. The inversion-based
columns are well in agreement with each other and with the simulation-based output.
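For the one-sided CGMY-like density of this example, the truncated moments entering (2.10) reduce to integrals of $x^{a-1}e^{-Mx}$ over (0, ε), with a = 1 − Y for $\mu_\varepsilon$ and a = 2 − Y for $\sigma^2_\varepsilon$. The Python sketch below is a minimal illustration under our own assumptions (the function names are hypothetical, and the integrals are evaluated by integrating the Taylor series of $e^{-Mx}$ term by term, which is one convenient choice and not necessarily the routine used for the tables).

```python
import math

def truncated_power_exp_integral(a, M, eps, tol=1e-16, max_terms=200):
    """Integral of x**(a-1) * exp(-M*x) over (0, eps), for a > 0.

    Term-by-term integration of the Taylor series of exp(-M*x) gives
        eps**a * sum_k (-M*eps)**k / (k! * (a + k)).
    """
    if a <= 0:
        raise ValueError("the integral diverges at 0 unless a > 0")
    coeff = 1.0                      # holds (-M*eps)**k / k!
    total = 0.0
    for k in range(max_terms):
        term = coeff / (a + k)
        total += term
        if abs(term) <= tol * abs(total):
            break
        coeff *= -M * eps / (k + 1)
    return eps ** a * total

def small_jump_moments(C, M, Y, eps):
    """(mu_eps, sigma_eps**2) for the density C*exp(-M*x)/x**(1+Y) on x > 0."""
    mu = C * truncated_power_exp_integral(1.0 - Y, M, eps)      # requires Y < 1
    sigma2 = C * truncated_power_exp_integral(2.0 - Y, M, eps)  # requires Y < 2
    return mu, sigma2

if __name__ == "__main__":
    for eps in (0.1, 0.05):
        print(eps, small_jump_moments(C=1.0, M=2.0, Y=0.5, eps=eps))
```

Since the model already contains a Brownian term, the variance $\sigma^2_\varepsilon$ obtained this way would be added to the variance of that term.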
Example 10. We now consider a Variance Gamma process, which can be regarded as a
(standard, in our case) Brownian motion where the time parameter follows a Gamma process.
More precisely, with $B_t$ being a standard Brownian motion, and $Y_t$ a Gamma process
with parameters 1 and 1, the Lévy process under consideration is given by $B_{Y_t}$; this is an
example of subordinated Brownian motion.
The third column of Table 2.9 presents simulation-based estimates, based on approximation
(2.10), with ε = 0.01. The fourth column also gives simulation-based estimates, but now
simulating the Variance Gamma process as subordinated Brownian motion. This means that
we sample the values of the Gamma process on a grid, and then generate the Brownian
motion at these values. We have performed this procedure for different grid sizes, N =
200, N = 500, N = 1000 but we observed just minor differences (negligible with respect to
the width of the confidence interval).
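The subordinated-Brownian-motion simulation described above can be sketched as follows. This is an illustration under stated assumptions: we read ‘parameters 1 and 1’ as a Gamma process whose increment over a time step dt has shape dt and scale 1, the function names and the number of paths are our own, and the maximum is taken over grid points only (so it cannot exceed the true running maximum).

```python
import math
import random

def vg_running_max(t, n_grid, rng):
    """Running maximum over a grid on [0, t] of B(Y_s), with Y a Gamma process.

    The increment of Y over a step dt is Gamma(shape=dt, scale=1) (assumed
    parametrization); B is a standard Brownian motion evaluated at the
    sampled values of Y.
    """
    dt = t / n_grid
    x = 0.0
    running_max = 0.0                      # the process starts at 0
    for _ in range(n_grid):
        dy = rng.gammavariate(dt, 1.0)     # increment of the Gamma clock
        x += rng.gauss(0.0, math.sqrt(dy)) # Brownian increment over dy
        running_max = max(running_max, x)
    return running_max

def estimate_cdf(t, xs, n_grid=200, n_paths=5000, seed=0):
    """Monte Carlo estimates of P(running maximum at time t <= x) for each x in xs."""
    rng = random.Random(seed)
    maxima = [vg_running_max(t, n_grid, rng) for _ in range(n_paths)]
    return {x: sum(m <= x for m in maxima) / n_paths for x in xs}

if __name__ == "__main__":
    print(estimate_cdf(0.5, [0.1, 0.2, 0.5, 1.0], n_grid=200, n_paths=2000))
```

Re-running the sketch with different values of `n_grid` gives a rough impression of the sensitivity to the grid size.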
To obtain the fifth column, the upper tail of the Lévy measure is split into a Brownian component
and a compound Poisson component, as explained earlier in this section. We observe a
reasonable fit. The problem with the approach that we proposed, however, is that we cut out
the interval (0, ε), so that the positive jumps have a distribution with support (ε,∞);
we saw before (viz. in the shifted-Pareto case) that such distributions do not lend themselves
to being approximated by a phase-type distribution. We can remedy this effect by allowing the
jump size distribution to have support (0,∞): we give the Lévy measure of $X^{(2,\varepsilon)}_t$ the value
Π(ε) in the interval (0, ε); the parameters of the Brownian motion are then adapted such that
the first two moments give the desired match. The last column gives the resulting estimates;
the fit is considerably better than in the previous column and in addition the dimension of
the approximating mixture of Erlangs is substantially lower.
2.6 Beta processes
In this section we test our methodology for the class of Beta processes, which fall in the class
M of Lévy processes for which the Lévy exponent is meromorphic. For these processes,
κ(α, q) can be represented [66] in terms of an infinite product. By truncating this infinite
product and performing a one-dimensional inversion, we can approximate $P(\bar X_t \le x)$. We
also obtain a simulation-based benchmark, by applying the sampling method developed
by Kuznetsov et al. [67]. We start this section by reviewing this simulation technique.
2.6.1 Wiener-Hopf Monte Carlo (WH-MC) simulation method
Suppose that we are able to sample the running maximum $\bar X_{\tau(q)}$ and the running minimum
$\underline X_{\tau(q)}$, where τ(q) is an exponentially distributed random variable with mean 1/q. Then
by the method developed by Kuznetsov et al. [67], based on an algorithm introduced by
Carr [27], we are able to evaluate $E[F(X_t, \bar X_t)]$, the main ideas being the following. By the
strong law of large numbers we know that $\sum_{i=1}^{n} \frac{t}{n} e_i \to t$ as n → ∞, where the $e_i$ constitute
a sequence of i.i.d. exponentially distributed random variables with mean 1. The random
variable $\sum_{i=1}^{n} \frac{t}{n} e_i$ is equal in law to a gamma random variable with parameters n and
q = n/t; we denote it by g(n, q). As a consequence, $P(X_{g(n,q)} \in dx, \bar X_{g(n,q)} \in dy)$ is a suitable
approximation to $P(X_t \in dx, \bar X_t \in dy)$, taking n sufficiently large. The following result is
due to [67, Thm. 1], to which we refer for more details.
Theorem 1. For all n ≥ 1 and q > 0, define $g(n, q) := \sum_{i=1}^{n} \frac{t}{n} e_i$, where t/n = 1/q. Then
\[
  \bigl(X_{g(n,q)}, \bar X_{g(n,q)}\bigr) \stackrel{d}{=} \bigl(V(n, q), J(n, q)\bigr), \tag{2.12}
\]
where V(n, q) and J(n, q) are defined iteratively through
\[
  V(n, q) = V(n-1, q) + S^{(n)}_q + I^{(n)}_q,
\]
\[
  J(n, q) = \max\bigl(J(n-1, q),\; V(n-1, q) + S^{(n)}_q\bigr),
\]
and $V(0, q) = J(0, q) = 0$, $S^{(0)}_q = I^{(0)}_q = 0$, $\{S^{(j)}_q ; j \ge 1\}$ is a sequence of i.i.d. random variables
with common distribution $\bar X_{\tau(q)}$, and $\{I^{(j)}_q ; j \ge 1\}$ is a sequence of i.i.d. random variables with
common distribution $\underline X_{\tau(q)}$.
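The recursion of Theorem 1 translates directly into a sampling routine. The Python sketch below is an illustration, not the implementation of [67]: the function names are our own, and as a test case we assume a Brownian motion with drift d and volatility σ, for which the supremum and the negative of the infimum at an independent exponential time with rate q are exponentially distributed with rates $(\sqrt{d^2+2q\sigma^2} - d)/\sigma^2$ and $(\sqrt{d^2+2q\sigma^2} + d)/\sigma^2$, and for which the distribution of the running maximum is known in closed form.

```python
import math
import random

def wh_mc_sample(t, n, sample_sup, sample_inf, rng):
    """One draw of (V(n, q), J(n, q)) with q = n / t, via the recursion of Theorem 1.

    sample_sup(q, rng): draw of the supremum at an independent Exp(q) time (>= 0).
    sample_inf(q, rng): draw of the infimum at an independent Exp(q) time (<= 0).
    """
    q = n / t
    v = 0.0
    j = 0.0
    for _ in range(n):
        s = sample_sup(q, rng)
        i = sample_inf(q, rng)
        j = max(j, v + s)      # uses V(n-1, q): update J before V
        v = v + s + i
    return v, j

def make_bm_samplers(d, sigma):
    """Samplers for Brownian motion with drift d and volatility sigma (test case)."""
    def root(q):
        return math.sqrt(d * d + 2.0 * q * sigma * sigma)
    def sample_sup(q, rng):
        return rng.expovariate((root(q) - d) / sigma ** 2)
    def sample_inf(q, rng):
        return -rng.expovariate((root(q) + d) / sigma ** 2)
    return sample_sup, sample_inf

def bm_running_max_cdf(x, t, d, sigma):
    """Exact P(running maximum at time t <= x) for Brownian motion with drift, x >= 0."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    s = sigma * math.sqrt(t)
    return phi((x - d * t) / s) - math.exp(2.0 * d * x / sigma ** 2) * phi((-x - d * t) / s)

if __name__ == "__main__":
    rng = random.Random(0)
    sup_s, inf_s = make_bm_samplers(-0.5, 1.0)
    draws = [wh_mc_sample(0.5, 50, sup_s, inf_s, rng)[1] for _ in range(4000)]
    print(sum(j <= 0.5 for j in draws) / len(draws), bm_running_max_cdf(0.5, 0.5, -0.5, 1.0))
```

For a Beta process the two samplers would instead draw from (truncated versions of) the series given in Theorem 2 below.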
2.6.2 Beta processes
The class of Beta processes consists of Lévy processes defined by the triplet (μ, σ,Π), where
the Lévy measure is defined as
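\[
  \Pi(x) = c_1 \frac{e^{-\alpha_1\beta_1 x}}{(1 - e^{-\beta_1 x})^{\lambda_1}}\, 1_{\{x>0\}}
         + c_2 \frac{e^{\alpha_2\beta_2 x}}{(1 - e^{\beta_2 x})^{\lambda_2}}\, 1_{\{x<0\}},
\]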
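with parameters $\alpha_i > 0$, $\beta_i > 0$, $c_i \ge 0$ and $\lambda_i \in (0, 3)\setminus\{1, 2\}$. Its Lévy exponent is
\[
  \Psi(s) = i(\mu - \rho)s - \tfrac{1}{2}\sigma^2 s^2
          + \frac{c_1}{\beta_1} B\Bigl(\alpha_1 - \frac{is}{\beta_1},\, 1 - \lambda_1\Bigr)
          + \frac{c_2}{\beta_2} B\Bigl(\alpha_2 + \frac{is}{\beta_2},\, 1 - \lambda_2\Bigr) - \gamma. \tag{2.13}
\]
Here $B(x, y) := \Gamma(x)\Gamma(y)/\Gamma(x+y)$ is the well-known Beta function. In addition, with
$\psi(x) := d \log(\Gamma(x))/dx$,
\[
  \gamma = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1) + \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2),
\]
\[
  \rho = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1)\bigl(\psi(1 + \alpha_1 - \lambda_1) - \psi(\alpha_1)\bigr)
       - \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2)\bigl(\psi(1 + \alpha_2 - \lambda_2) - \psi(\alpha_2)\bigr).
\]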
The Lévy exponent of the beta process is a meromorphic function in C; it turns out to be
possible to identify all roots of the equation q−Ψ(s) = 0; these roots are characterized in the
following theorem [66, Thm. 10].
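Theorem 2. For q > 0 and Ψ(s) defined above, the equation q − Ψ(iξ) = 0 has infinitely many
solutions, all of which are real and simple. They are such that $\xi^-_0 \in (-\alpha_1\beta_1, 0)$ and $\xi^+_0 \in (0, \alpha_2\beta_2)$,
while for n ∈ {1, 2, . . .},
\[
  \xi^-_n \in \bigl(\beta_1(-\alpha_1 - n),\, \beta_1(-\alpha_1 - n + 1)\bigr), \qquad
  \xi^+_n \in \bigl(\beta_2(\alpha_2 + n - 1),\, \beta_2(\alpha_2 + n)\bigr).
\]
Moreover, for x > 0,
\[
  P(\bar X_{\tau(q)} \in dx) = -\Bigl(\sum_{k=0}^{\infty} C^-_k\, \xi^-_k\, e^{\xi^-_k x}\Bigr)\,dx, \tag{2.14}
\]
where, with k ∈ {1, 2, . . .},
\[
  C^-_0 = \prod_{n\ge 1} \frac{1 + \xi^-_0/(\beta_1(n - 1 + \alpha_1))}{1 - \xi^-_0/\xi^-_n}, \qquad
  C^-_k = \frac{1 + \xi^-_k/(\beta_1(k - 1 + \alpha_1))}{1 - \xi^-_k/\xi^-_0}
          \prod_{n\ge 1,\, n\ne k} \frac{1 + \xi^-_k/(\beta_1(n - 1 + \alpha_1))}{1 - \xi^-_k/\xi^-_n}.
\]
A similar expression holds for $P(-\underline X_{\tau(q)} \in dx)$, but $\{\xi^-_n\}$ must be replaced by $\{-\xi^+_n\}$ and $\alpha_1, \beta_1$
must be replaced by $\alpha_2, \beta_2$.
Note that if σ = 0 and $\lambda_i < 2$ the distribution of $\bar X_{\tau(q)}$ has an atom at zero which is equal to
$1 - \sum_{k\ge 0} C^-_k$ (integrating the density (2.14) over (0,∞) yields $\sum_{k\ge 0} C^-_k$); it can be written as
$\prod_{n\ge 0} \bigl(-\xi^-_n/(\beta_1(n + \alpha_1))\bigr)$.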
The above theorem provides us with information about the location of the poles, thus facili-
tating the efficient determination of their exact positions (use for instance a simple bisection
method). However, for performing the inverse Laplace transform we need to find poles for
complex values of q; as computing roots for complex q is time consuming, we rely on the
method that we explain in Section 2.7.2.
Example 11. In this example, we consider a Beta process with parameters
(μ, σ;α1, β1, λ1, c1;α2, β2, λ2, c2) = (−0.5, 1; 1, 1.5, 1.5, 1; 1, 1.5, 1.5, 1).
Because the Beta process is a Lévy process with small jumps, we need to approximate the
jumps smaller than ε with a Brownian motion in the Monte-Carlo simulation; in the Wiener-Hopf
Monte-Carlo simulation we do not need this approximation. It is also noted that the
distributions of $\bar X_{\tau(q)}$ and $\underline X_{\tau(q)}$ are expressed in terms of infinite series, and as a consequence
we have to perform a truncation to sample from them. In Table 2.10 the third
and fourth columns show the results obtained from ‘ordinary simulation’, using ε = 0.1 and
ε = 0.05. The next two columns display the estimates obtained relying on WH-MC with
the number of iterations n in Thm. 1 equal to 20 and 100, respectively. We perform $10^7$
realizations in each simulation.
The seventh column shows the outcome of (2.14), where the summation is truncated after
25 terms. The last column, finally, is based on inverting the Laplace transform obtained
by approximating the positive jumps by their phase-type counterparts. If we leave out the
jumps smaller than ε, the positive jump size distribution will have support (ε,∞), which is
poorly approximated by Erlang distributions, as we explained in Example 10; we remedy
this complication in the same way as we did in Example 10.
2.7 Discussion and concluding remarks
We conclude this chapter by briefly discussing a number of issues that affect the accuracy
and computation time.
2.7.1 Remarks on fitting of phase-type distribution
Let f(x) be the density which we wish to approximate with a so-called hyper-Erlang distribution
of degree N, that is, we wish to find $\alpha_j$, $\lambda_j$ and $n_j$ such that
\[
  f(x) \approx \sum_{j=1}^{N} \alpha_j \frac{(\lambda_j x)^{n_j - 1}}{(n_j - 1)!}\, \lambda_j e^{-\lambda_j x}; \tag{2.15}
\]
here $n_1, \ldots, n_N \in \mathbb{N}$ are the numbers of phases of the individual Erlang distributions, the
$\lambda_j$ are positive numbers, while the $\alpha_j$ are positive numbers such that $\sum_{j=1}^{N} \alpha_j = 1$.
With the EM algorithm we can optimize the parameters αj and λj for a given N and pre-
defined set of nj . In order to find the ‘best’ fitting we have tried a large set of vectors
(n1, n2, · · · , nN ) for a given N , in order to identify the (n1, n2, · · · , nN ) which maximizes
the likelihood. This procedure can be implemented efficiently; for a detailed discussion we
refer to [94].
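For fixed numbers of phases, the EM iteration for the mixture (2.15) takes a simple form: the E-step computes, for each observation, the posterior probability of each Erlang branch, and the M-step sets $\alpha_j$ to the average posterior weight and $\lambda_j$ to $n_j$ times the total weight divided by the weighted sum of the observations. The Python sketch below illustrates this standard update on a sample of positive data points; it is our own minimal version (names, initialization and iteration count are assumptions), not the optimized implementation discussed in [94].

```python
import math

def erlang_logpdf(x, n, lam):
    """Log-density at x > 0 of an Erlang distribution with n phases and rate lam."""
    return n * math.log(lam) + (n - 1) * math.log(x) - lam * x - math.lgamma(n)

def em_step(data, shapes, alphas, lams):
    """One EM update for fixed shapes; also returns the log-likelihood of the inputs."""
    N = len(shapes)
    weight_sum = [0.0] * N     # sum over observations of the posterior weights
    weighted_x = [0.0] * N     # sum over observations of weight * x
    loglik = 0.0
    for x in data:
        logs = [math.log(alphas[j]) + erlang_logpdf(x, shapes[j], lams[j]) for j in range(N)]
        m = max(logs)                              # log-sum-exp for numerical stability
        dens = [math.exp(v - m) for v in logs]
        total = sum(dens)
        loglik += m + math.log(total)
        for j in range(N):
            r = dens[j] / total                    # posterior weight of branch j
            weight_sum[j] += r
            weighted_x[j] += r * x
    new_alphas = [w / len(data) for w in weight_sum]
    new_lams = [shapes[j] * weight_sum[j] / weighted_x[j] for j in range(N)]
    return new_alphas, new_lams, loglik

def fit_hyper_erlang(data, shapes, n_iter=50):
    """Run n_iter EM steps; every branch starts with the sample mean as its mean."""
    N = len(shapes)
    mean = sum(data) / len(data)
    alphas = [1.0 / N] * N
    lams = [n / mean for n in shapes]
    history = []
    for _ in range(n_iter):
        alphas, lams, loglik = em_step(data, shapes, alphas, lams)
        history.append(loglik)
    return alphas, lams, history
```

Comparing the final log-likelihood over several candidate vectors $(n_1, \ldots, n_N)$ then mimics the search over shapes described above.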
Note that all phase-type distributions have the features that they (i) are light-tailed, and (ii)
have support (0,∞). As a consequence, it heavily depends on the distribution under consid-
eration whether it can be approximated well by a phase-type distribution. From our experi-
ments, we observed that for light-tailed distributions Erlang distributions of low dimension
suffice; see for example the Weibull distribution with γ ≥ 1. Distributions with heavier tails
(Pareto, the Weibull distribution with γ < 1) are significantly harder to approximate (in that
they require a mixture of Erlangs of high dimension); it is emphasized that the fit of the dis-
tribution’s tail may be poor in this case, while the ‘body’ of the distribution is approximated
quite well. Distributions with support different from (0,∞) are even harder to fit; think of
the shifted-Pareto distribution. In this case, recall that the approximating phase-type distri-
bution contains multiple Erlang distributions of high degree; note that an Erl(n, n) random
variable (which has mean 1, and variance 1/n, i.e., vanishing as n grows large) can be used
to approximate a deterministic (1) random variable. Our experiments indicate that, despite
the fact that we included such high-degree Erlang distributions, the resulting numerics are
decent, but not highly accurate.
2.7.2 Remarks on the computation time
As we mentioned earlier in this chapter, the computation time of the Laplace inversion algorithm
is of the order M log(M), if function values f(kΔ), k = 0, 1, . . . , M − 1 are to be
computed. It is emphasized, though, that the bulk of the computation time is not related to
this inversion, but rather to the numerics related to identifying the roots $\beta_j(q)$ in (2.7), which
solve q = ξ(s). The number of roots of this equation equals $N_p + 1$ if there is a Brownian
component in the Lévy exponent, and $N_p$ otherwise; here $N_p$ denotes the total number of phases of
the individual distributions of which the phase-type distribution is composed. For details we
refer to [71].
In general, finding these roots can be extremely time consuming, as we lack precise knowledge
about the locations of the roots in the complex plane and their multiplicity. In addition,
the Laplace inversion algorithm needs to compute these roots for different values of q. In
order to save time, we first compute the roots $\beta_j$ of the equation a = ξ(s), with a being a real
damping factor. Note that the roots of the equation a + iq = ξ(s) change continuously with
respect to q in the complex plane; considering the roots as functions of q such that
\[
  \xi[\beta_j(q)] = a + iq; \qquad \beta_j(0) = \beta_j, \tag{2.16}
\]
we obtain by differentiating with respect to q the ordinary differential equation
\[
  \frac{d\beta_j(q)}{dq} = \frac{i}{\xi'[\beta_j(q)]}. \tag{2.17}
\]
Applying this procedure, we can find the roots efficiently for different values of q by using,
for example, an adaptive Runge-Kutta method [66].
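The root-tracking idea of (2.16)-(2.17) can be sketched in a few lines of Python. The sketch makes several simplifying assumptions that are ours: a fixed-step classical fourth-order Runge-Kutta scheme is used instead of an adaptive one, the function names are hypothetical, and as a check we take the quadratic exponent $\xi(s) = ds + \tfrac{1}{2}\sigma^2 s^2$ (Brownian motion with drift), for which the root with positive real part is available in closed form.

```python
import cmath

def track_root(xi_prime, beta0, q_end, n_steps=200):
    """Integrate d(beta)/dq = i / xi'(beta) from q = 0 to q_end with classical RK4.

    beta0 must solve xi(beta0) = a for the real damping factor a.
    """
    def rhs(beta):
        return 1j / xi_prime(beta)
    h = q_end / n_steps
    beta = complex(beta0)
    for _ in range(n_steps):
        k1 = rhs(beta)
        k2 = rhs(beta + 0.5 * h * k1)
        k3 = rhs(beta + 0.5 * h * k2)
        k4 = rhs(beta + h * k3)
        beta += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return beta

def bm_exponent_root(a, q, d, sigma):
    """Root with positive real part of d*s + 0.5*sigma**2*s**2 = a + i*q (check only)."""
    return (-d + cmath.sqrt(d * d + 2.0 * sigma ** 2 * (a + 1j * q))) / sigma ** 2

if __name__ == "__main__":
    d, sigma, a = -1.0, 1.0, 1.0
    start = bm_exponent_root(a, 0.0, d, sigma)
    tracked = track_root(lambda s: d + sigma ** 2 * s, start, q_end=5.0)
    print(tracked, bm_exponent_root(a, 5.0, d, sigma))
```

In the phase-type setting each of the $N_p$ or $N_p + 1$ roots would be tracked in this way, with intermediate values stored at the values of q required by the inversion algorithm.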
time t   x     Simulation   Appr., use (2.7)   Appr., use (2.6)   Difference
0.1      0.1   0.340062     0.34025703         0.34025703         1.32e-09
0.1      0.2   0.579295     0.57935467         0.57935467         1.24e-10
0.1      0.5   0.890919     0.89079051         0.89079051         2.04e-10
0.1      1.0   0.959618     0.95963405         0.95963405         4.18e-10
0.3      0.1   0.241525     0.24164945         0.24164945         5.67e-10
0.3      0.2   0.420271     0.42037262         0.42037262         9.24e-11
0.3      0.5   0.720007     0.71992788         0.71992788         4.34e-10
0.3      1.0   0.876619     0.87671756         0.87671756         3.04e-09
0.5      0.1   0.206808     0.20683652         0.20683652         4.94e-10
0.5      0.2   0.361211     0.36125826         0.36125826         7.48e-10
0.5      0.5   0.633832     0.63363062         0.63363062         1.57e-09
0.5      1.0   0.809454     0.80949895         0.80949895         1.16e-09
Table 2.2: Compound Poisson with exponential jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are exponential with mean 1. The Brownian term has parameters d = −1.5 and σ = 1.0.
time t   x     Simulation   Appr., use (2.6)   Appr. Ph., use (2.7)   Difference
0.1      0.1   0.306441     0.30554528         0.30503331             5.12e-04
0.1      0.2   0.541505     0.53979428         0.53897538             8.19e-04
0.1      0.5   0.884350     0.88191833         0.88112773             7.90e-04
0.1      1.0   0.960646     0.95807537         0.95813022             5.48e-05
0.3      0.1   0.203235     0.20174542         0.20107108             6.74e-04
0.3      0.2   0.368249     0.36527241         0.36413314             1.14e-03
0.3      0.5   0.683732     0.67832431         0.67690345             1.42e-03
0.3      1.0   0.866122     0.85926538         0.85922344             4.19e-05
0.5      0.1   0.166972     0.16488486         0.16421784             6.67e-04
0.5      0.2   0.303935     0.29990763         0.29877142             1.13e-03
0.5      0.5   0.580408     0.57275238         0.57129482             1.46e-03
0.5      1.0   0.780043     0.76979656         0.76990473             1.08e-04
Table 2.3: Compound Poisson with Weibull jumps; heavy-tailed case. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Weibull with parameters μ = 1.0 and γ = 0.5. The Brownian term has parameters d = −1.0 and σ = 1.0. The number of Erlang distributions fitted to the Weibull distribution is N_Er = 7, and the highest number of phases is n_max = 3.
time t   x     Simulation   Appr., use (2.6)   Appr. Ph., use (2.7)   Difference
0.1      0.1   0.298493     0.29822869         0.29822927             5.75e-07
0.1      0.2   0.526594     0.52650719         0.52650797             7.80e-07
0.1      0.5   0.861135     0.86135242         0.86133659             1.58e-05
0.1      1.0   0.953920     0.95401102         0.95437502             3.64e-04
0.3      0.1   0.189862     0.18976048         0.18978895             2.84e-05
0.3      0.2   0.343766     0.34370972         0.34376675             5.70e-05
0.3      0.5   0.643406     0.64373428         0.64389320             1.59e-04
0.3      1.0   0.848913     0.84918147         0.84978994             6.08e-04
0.5      0.1   0.152512     0.15236195         0.15243569             7.37e-05
0.5      0.2   0.277445     0.27746044         0.27759827             1.38e-04
0.5      0.5   0.536884     0.53709063         0.53740254             3.12e-04
0.5      1.0   0.761019     0.76122687         0.76199729             7.70e-04
Table 2.4: Compound Poisson with Weibull jumps; light-tailed case. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Weibull with parameters μ = 1.0 and γ = 2. The Brownian term has parameters d = −1.0 and σ = 1.0. The number of Erlang distributions fitted to the Weibull distribution is N_Er = 4, and the highest number of phases is n_max = 5.
time t   x     Simulation   Appr., use (2.6)   Appr. Ph., use (2.7)   Difference
0.1      0.1   0.417032     0.41702642         0.41707991             5.35e-05
0.1      0.2   0.661758     0.66178212         0.66185155             6.94e-05
0.1      0.5   0.909879     0.91002954         0.91013778             1.08e-04
0.1      1.0   0.950149     0.95024208         0.95049174             2.50e-04
0.3      0.1   0.329487     0.32952345         0.32968987             1.66e-04
0.3      0.2   0.530314     0.53051325         0.53077411             2.61e-04
0.3      0.5   0.778595     0.77893652         0.77936986             4.33e-04
0.3      1.0   0.863911     0.86403154         0.86472833             6.97e-04
0.5      0.1   0.292768     0.29285157         0.29315023             2.99e-04
0.5      0.2   0.472333     0.47253618         0.47301425             4.78e-04
0.5      0.5   0.702192     0.70257211         0.70334963             7.77e-04
0.5      1.0   0.796445     0.79658771         0.79772318             1.13e-03
Table 2.5: Compound Poisson with Pareto jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Pareto with parameter γ = 1.0. The Brownian term has parameters d = −2.5 and σ = 1.0. The number of Erlang distributions fitted to the Pareto distribution is N_Er = 10, and the highest number of phases is n_max = 5.
time t   x     Simulation   Appr., use (2.6)   Appr. Ph., use (2.7)   Difference
0.1      0.1   0.409234     0.40893688         0.40897511             3.82e-05
0.1      0.2   0.647698     0.64752872         0.64761795             8.92e-05
0.1      0.5   0.881551     0.88133353         0.88190216             5.99e-04
0.1      1.0   0.912168     0.91201427         0.91587049             3.86e-03
0.3      0.1   0.302718     0.30249674         0.30373567             1.24e-03
0.3      0.2   0.485837     0.48568042         0.48783318             2.15e-03
0.3      0.5   0.704911     0.70473313         0.70947429             4.74e-03
0.3      1.0   0.779802     0.77970465         0.79028644             1.05e-02
0.5      0.1   0.252952     0.25275076         0.25585642             3.10e-03
0.5      0.2   0.407164     0.40696892         0.41214744             5.18e-03
0.5      0.5   0.600502     0.60036169         0.60967756             9.31e-03
0.5      1.0   0.685971     0.68588977         0.70143444             1.55e-02
Table 2.6: Compound Poisson with shifted-Pareto jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are shifted-Pareto with parameter γ = 1.0. The Brownian term has parameters d = −2.5 and σ = 1.0. The number of Erlang distributions fitted to the shifted-Pareto distribution is N_Er = 12, and the highest number of phases is n_max = 28.
time t   x     Simulation   Appr. Ph., use (2.7)
0.1      0.1   0.310605     0.31083581
0.1      0.2   0.540572     0.54068394
0.1      0.5   0.866999     0.86676851
0.1      1.0   0.955574     0.95648076
0.3      0.1   0.212655     0.21413828
0.3      0.2   0.376741     0.37879054
0.3      0.5   0.674504     0.67608908
0.3      1.0   0.863495     0.86495277
0.5      0.1   0.181433     0.19042446
0.5      0.2   0.322121     0.33436792
0.5      0.5   0.588677     0.59969483
0.5      1.0   0.794101     0.80039848
Table 2.7: Compound Poisson with both upward and downward jumps. Both the upward jumps and downward jumps occur according to a Poisson process with rate λ = 1; the positive jump sizes are Weibull with parameters μ = 1 and γ = 2, and the negative jump sizes exponential with mean 1.0. The Brownian term has parameters d = −1.0 and σ = 1.0. The number of Erlang distributions fitted to the Weibull distribution is N_Er = 5, and the highest number of phases is n_max = 5.
time t   x     Simulation ε = 0.1   Simulation ε = 0.05   Appr., use (2.6)   Appr. Ph., use (2.7)
0.1      0.1   0.464093             0.464372              0.46468919         0.46685728
0.1      0.2   0.707813             0.707969              0.70792216         0.71202807
0.1      0.5   0.947668             0.947733              0.94756177         0.95056833
0.1      1.0   0.993627             0.993675              0.99361188         0.99453926
0.3      0.1   0.412429             0.412595              0.41282521         0.41675490
0.3      0.2   0.636461             0.636570              0.63645668         0.64307083
0.3      0.5   0.896372             0.896337              0.89611778         0.90206520
0.3      1.0   0.981780             0.981841              0.98174473         0.98406940
0.5      0.1   0.402468             0.402620              0.40282695         0.40744480
0.5      0.2   0.621987             0.622003              0.62191009         0.62953509
0.5      0.5   0.882154             0.882088              0.88182180         0.88897926
0.5      1.0   0.976001             0.975993              0.97592972         0.97898461
Table 2.8: The upward jumps are CGMY-like, with parameters C = 1.0, M = 2.0 and Y = 0.5; there are no downward jumps. The Brownian term has parameters d = −4.0 and σ = 1.0. The number of Erlang distributions fitted to the CGMY upper tail is N_Er = 5, and the highest number of phases is n_max = 9, after having cut off the interval (0, ε), with ε = 0.1, from Π(·).
time t   x     Simulation ε = 0.01   Simulation SBM   Appr. Ph., use (2.7)   Appr. Ph. adapted
0.1      0.1   0.862579              0.863298         0.87678746             0.85836619
0.1      0.2   0.909530              0.909840         0.92167074             0.90125495
0.1      0.5   0.962386              0.962496         0.96661308             0.96199422
0.1      1.0   0.987754              0.987790         0.98923639             0.98548799
0.3      0.1   0.663389              0.665469         0.68917587             0.65958697
0.3      0.2   0.759177              0.760215         0.78453098             0.74556038
0.3      0.5   0.886828              0.887072         0.89861106             0.88560877
0.3      1.0   0.959078              0.959243         0.96420072             0.95351051
0.5      0.1   0.536040              0.538339         0.56558233             0.54403583
0.5      0.2   0.646397              0.647893         0.67988532             0.64197788
0.5      0.5   0.816052              0.816467         0.83582281             0.81968867
0.5      1.0   0.927131              0.927450         0.93698719             0.92169160
Table 2.9: Variance Gamma process, d = 0.0, σ² = 1.0 and κ = 1.0. For the fifth column, the number of Erlang distributions fitted to the upper tail is N_Er = 10 and the highest number of phases is n_max = 10, after having cut off the interval (0, ε), with ε = 0.01, from Π(·). For the last column, the number of Erlang distributions fitted to the upper tail is N_Er = 3 and the highest number of phases is n_max = 4, after having cut off the interval (0, ε), with ε = 0.01, from Π(·).
time t   x     Simulation   Simulation   WH-MC    WH-MC     Laplace inversion   Appr. Ph.
               ε = 0.1      ε = 0.05     n = 20   n = 100   (2.14)              adapted
0.1      0.1   0.2693       0.2696       0.2760   0.2713    0.2696              0.2626
0.1      0.2   0.4849       0.4850       0.4961   0.4883    0.4850              0.4707
0.1      0.5   0.8476       0.8474       0.8517   0.8475    0.8475              0.8367
0.1      1.0   0.9710       0.9711       0.9686   0.9675    0.9710              0.9708
0.3      0.1   0.1724       0.1726       0.1761   0.1732    0.1725              0.1788
0.3      0.2   0.3162       0.3163       0.3235   0.3185    0.3163              0.3213
0.3      0.5   0.6255       0.6255       0.6359   0.6287    0.6255              0.6187
0.3      1.0   0.8703       0.8702       0.8714   0.8688    0.8703              0.8605
0.5      0.1   0.1421       0.1423       0.1450   0.1425    0.1422              0.1580
0.5      0.2   0.2613       0.2615       0.2669   0.2625    0.2613              0.2822
0.5      0.5   0.5291       0.5290       0.5389   0.5316    0.5291              0.5427
0.5      1.0   0.7847       0.7848       0.7896   0.7843    0.7848              0.7832
Table 2.10: Beta process, with parameters of Example 11. The number of Erlang distributions fitted to the upper tail is N_Er = 3, and the highest number of phases is n_max = 4, after having cut off the interval (0, ε), with ε = 0.1, from Π(·).
Chapter 3
Evaluation of option prices
In the previous chapter we developed a numerical technique for evaluating the distribution
of the running supremum of Lévy processes. In this chapter we use this numerical technique
for pricing a specific class of path-dependent options.
3.1 Introduction
Standard options (or: vanilla options) have a payoff structure that depends on the price evo-
lution of the underlying asset only through the price at expiration. There is an abundance
of exotic options that are traded nowadays, however, with payoff structures that are sub-
stantially more involved. Lookback options are examples of derivatives of which the payoff
depends on the maximum (or minimum) price over the life of the option, and possibly the
price of the underlying asset at maturity as well. They come in two flavors: lookback options
with fixed strike, and those with a floating strike.
With the stochastic process $S_t$ representing the evolution of the stock price and $\bar S_T :=
\sup_{0\le t\le T} S_t$ the associated running maximum process, the payoff of the fixed strike call option
is
\[
  P^{(c)}_{\rm fix}(T, K) := \max\{\bar S_T - K, 0\} = (\bar S_T - K)^+,
\]
with strike price K and maturity time T; analogously, the payoff of the put-counterpart is
given by $P^{(p)}_{\rm fix}(T, K) := (K - \underline S_T)^+$, with $\underline S_t$ the running minimum process. As indicated by
these payoffs, this type of option has a fixed, a priori known strike price, but, as opposed
to the ‘traditional’ European option, the underlying trigger is not the price at maturity but
rather the maximum (or minimum) of the underlying asset price over the life of the option.
The payoff of the floating strike call option is
\[
  P^{(c)}_{\rm fl}(T, L) := \max\{S_T - L\underline S_T, 0\} = (S_T - L\underline S_T)^+;
\]
in case L ≤ 1 the quantity $S_T - L\underline S_T$ is always nonnegative, so that the payoff reduces to $S_T - L\underline S_T$. This means
that the strike price is fixed at the asset’s minimal price during the option’s life, multiplied
with a specified constant L. The payoff of the put-counterpart is defined by $P^{(p)}_{\rm fl}(T, L) :=
\max\{L\bar S_T - S_T, 0\}$, which reduces to $L\bar S_T - S_T$ if L ≥ 1.
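As a small illustration of the four payoff definitions, the following Python sketch evaluates them for a single price path observed on a time grid. The function name and the dictionary keys are our own; note that the maximum and minimum are taken over the observed grid points only, which is an approximation of the continuously monitored extremes.

```python
def lookback_payoffs(path, K, L):
    """Payoffs of the four lookback options for one sampled price path.

    path: prices S_t on a time grid, with path[0] = S_0 and path[-1] = S_T.
    K: strike of the fixed strike options; L: multiplier of the floating strike options.
    """
    if not path:
        raise ValueError("path must contain at least one price")
    s_T = path[-1]
    s_max = max(path)      # running maximum at maturity (on the grid)
    s_min = min(path)      # running minimum at maturity (on the grid)
    return {
        "fixed_call": max(s_max - K, 0.0),
        "fixed_put": max(K - s_min, 0.0),
        "floating_call": max(s_T - L * s_min, 0.0),
        "floating_put": max(L * s_max - s_T, 0.0),
    }

if __name__ == "__main__":
    print(lookback_payoffs([100.0, 105.0, 95.0, 102.0], K=100.0, L=1.0))
```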
Importantly, unlike vanilla options, the lookback options discussed above as well as other
exotic options have a path-dependent payoff. This means that their payoff does not depend
on $S_T$ only, but also involves a certain functional of the process $S_t$, for 0 ≤ t ≤ T (i.e.,
the maximum or minimum value attained). As a consequence it is highly nontrivial to price
such options, or to numerically assess the sensitivities of the price with respect to the various
model parameters such as the maturity and the initial price of the underlying asset (the
‘Greeks’).
It has been widely recognized that the classical Black-Scholes model [23], in which the price
evolution process is given by $S_t = S_0 e^{X_t}$ for a Brownian motion $X_t$, fails to reproduce the
smile effect; instead, in this model the volatility is constant with respect to strike prices. This
has motivated researchers and practitioners to depart from the Black-Scholes model, and to
consider the more general situation in which $X_t$ follows a Lévy process. Lévy processes, characterized
by the property that their increments are stationary and independent, form a rich
class, covering a broad spectrum of possible jump structures [32]. At the same time, how-
ever, they allow for explicit analysis, even if more complex metrics are involved [21, 68]. A
specific branch of the Lévy literature is about fluctuation theory, describing the probabilistic
features of the extreme values attained by the Lévy process under study. Important in the
context of the present chapter are so-called Wiener-Hopf results, which characterize the joint
distribution of the running maximum $\bar X_T$ (or running minimum $\underline X_T$) and the value of the
process $X_T$, in terms of a double Laplace transform. More specifically, an expression is given
for
\[
  \bar\kappa(\alpha, q) := E e^{-\alpha \bar X_{\tau(q)}} = \int_0^{\infty} q e^{-qt}\, E e^{-\alpha \bar X_t}\, dt,
\]
where τ(q) is an exponentially distributed random variable, independent of the evolution of
the Lévy process $X_t$, with mean $q^{-1}$; there is an analogous result for the transform $\underline\kappa(\alpha, q)$
associated with the running minimum counterpart. In addition, the transform of $\bar X_{\tau(q)}$ (or
$\underline X_{\tau(q)}$) jointly with the value at τ(q), i.e., $X_{\tau(q)}$, can be given.
The major difficulty with these results lies in their implicitness. Numerical evaluation of the
transforms requires the availability of the density of $X_t$ for any t ≥ 0, while often only the
associated characteristic function $E e^{isX_1}$ is given (which characterizes the law at any t ≥ 0,
due to $E e^{isX_t} = (E e^{isX_1})^t$, as a consequence of the stationary and independent increments).
For particular classes of Lévy processes, however, (semi-)explicit expressions for κ(α, q) are
available. In this context we mention (i) the class of spectrally one-sided Lévy processes (i.e.,
$X_t$ has either only negative jumps, or only positive jumps), (ii) the class of Lévy processes
with general jumps in one direction and phase-type jumps in the other direction, (iii) the class
of Lévy processes of which the Lévy exponent $\log E e^{isX_1}$ is a meromorphic function.
The aim of Chapter 2 was to evaluate the probability distribution of $\bar X_t$ for any Lévy process.
The strategy followed consisted of the following steps. First the Lévy process under consideration
is approximated by a Lévy process from one of the classes (i)-(ii)-(iii); this can be
done in principle arbitrarily accurately (at the expense of higher computation times). Then
for this approximate Lévy process the Laplace transform of $\bar X_{\tau(q)}$ is evaluated. The last step
is to rely on the numerical algorithms described in [36, 37] to invert this transform. The
approach has been validated by extensive numerical experimentation, the main conclusion
being that the proposed method is fast and highly accurate in all scenarios considered.
The idea behind the present chapter is to use the framework of Chapter 2 in order to numerically
evaluate lookback option prices and the associated Greeks. With techniques similar
to those used by Nguyen-Ngoc and Yor [78, 79], we find the transforms of the option prices
(fixed and floating strike, call and put), as well as those related to their sensitivities (Greeks)
with respect to the maturity T and initial price $S_0$, in terms of the Wiener-Hopf transforms
$\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$ introduced above. Then we replace the Lévy process under study by
an approximating Lévy process of which $\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$ can be evaluated. Finally,
the numerical inversion routines of [36, 37] are used to numerically evaluate the prices and
Greeks.
The numerical experiments performed in this chapter cover a broad variety of underlying
Lévy processes. We start with the classical models, viz. the model in which $X_t$ corresponds
to a Brownian motion with drift, often referred to as the Black-Scholes model [23], and the
model with additional normally distributed jumps at Poisson epochs, known as the Merton
model [76]. These models still allow explicit expressions for the Wiener-Hopf transforms
$\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$; the numerics primarily serve the purpose of checking whether the inversion
techniques provide correct output. Next we consider examples from the class of Lévy
processes with infinite activity, i.e., processes with infinitely many jumps over any finite interval,
which have been shown to provide a particularly good fit of option price data. This class
contains e.g. the tempered stable process [62], the normal inverse Gaussian process
[18], the variance gamma process [73], and the CGMY process [28]; the latter process features
in one of the examples, in addition to the so-called Beta process recently analyzed by
Kuznetsov et al. [66, 67].
A substantial body of work is concerned with the numerical evaluation of prices and Greeks
of specific exotic options, usually just for particular classes of driving Lévy processes; see e.g.
[32, Ch. XI] and [78, 79], and references therein, for nice accounts of the literature. We refer
to [58] for a study on barrier options for the class of generalized hyper-exponential Lévy
models (covering e.g. tempered stable processes, normal inverse Gaussian processes, and
variance gamma processes). An alternative method, the so-called likelihood ratio method,
has been developed by Glasserman and Liu in [48, 49].
This chapter is organized as follows. Section 3.2 sketches the preliminaries: some back-
ground on Lévy processes, key results in fluctuation theory, a brief account of phase-type
approximations, and a short description of the numerical inversion techniques developed
in [36, 37]. In Section 3.3 we compute the transforms of the lookback options under study,
in terms of the Wiener-Hopf transforms $\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$. Section 3.4 presents the main
findings of our numerical experiments. This chapter is concluded by a discussion in Section
3.5.
3.2 Preliminaries
In this section we introduce notation, review the main properties of Lévy processes, present
a short overview of Wiener-Hopf theory, point out how to perform phase-type approxima-
tions, and sketch the numerical inversion technique developed in [36, 37].
3.2.1 Lévy processes
We now present a brief review of Lévy processes, with a focus on fluctuation-theoretic results
(i.e., results concerning the distribution of the running maximum and running minimum
process). More detailed accounts can be found in the textbooks by Bertoin [21], Kyprianou
[68], and Sato [88], and the survey paper [35]. The textbooks by Cont and Tankov [32] and
Schoutens [90] focus on the use of Lévy processes in finance.
Let (Xt)t≥0 be a Lévy process, i.e., a (single-dimensional) stochastic process with stationary
and independent increments, defined on an appropriately chosen probability space (Ω,F ,P),
shifted such that X0 = 0. Recall from Chapter 1 that for any Lévy process (Xt)t≥0, the
distribution of X1 is infinitely divisible, which is equivalent to the logarithm of its
characteristic function (known as the Lévy exponent) being of the Lévy-Khintchine form
\[
\log \mathbb{E} e^{isX_1} = isd - \frac{1}{2} s^2\sigma^2 + \int_{-\infty}^{\infty} \left( e^{isx} - 1 - isx\, 1_{\{|x|<1\}} \right) \Pi(dx),
\]
where d ∈ R, σ ≥ 0 and the spectral measure Π(dx), concentrated on R \ {0}, satisfies
\[
\int_{\mathbb{R}} \min\{x^2, 1\}\, \Pi(dx) < \infty.
\]
The triplet (d, σ², Π), commonly referred to as the characteristic triplet, uniquely defines the
Lévy process [21, 68, 88]. The first and the second terms are related to a deterministic drift
and a Brownian component, respectively. The jumps of the Lévy process are contained in the
third term. If $\int_{-\infty}^{\infty} \Pi(dx) < \infty$, then these jumps can be interpreted as a compound Poisson
process. In case $\int_{-\infty}^{\infty} \Pi(dx) = \infty$, on the contrary, the Lévy process has infinitely many jumps
in any time interval. Examples of processes of the latter kind are gamma processes, variance
gamma processes, and normal inverse Gaussian processes.
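We now discuss results related to the running maximum process $\bar X_t := \sup_{0\le s\le t} X_s$ and
the running minimum process $\underline X_t := \inf_{0\le s\le t} X_s$. Recall that τ(q) denotes an exponential
random variable with mean q^{-1}, independent of the considered Lévy process. Wiener-Hopf
theory states that $\bar X_{\tau(q)}$ and $X_{\tau(q)} - \bar X_{\tau(q)}$ are independent; in addition,
$X_{\tau(q)} - \bar X_{\tau(q)}$ is distributed as $\underline X_{\tau(q)}$. More specifically, for α ≥ 0,
\[
\bar\kappa(\alpha, q) := \mathbb{E} e^{-\alpha \bar X_{\tau(q)}} = \bar k_0 \exp\left( -\int_0^\infty \int_{(0,\infty)} \frac{1}{t} \left( e^{-qt} - e^{-qt-\alpha x} \right) \mathbb{P}(X_t \in dx)\, dt \right); \tag{3.1}
\]
here $\bar k_0 > 0$ is a normalizing constant. Similarly, for some $\underline k_0 > 0$, and α ≤ 0,
\[
\underline\kappa(\alpha, q) := \mathbb{E} e^{-\alpha \underline X_{\tau(q)}} = \underline k_0 \exp\left( -\int_0^\infty \int_{(-\infty,0)} \frac{1}{t} \left( e^{-qt} - e^{-qt-\alpha x} \right) \mathbb{P}(X_t \in dx)\, dt \right). \tag{3.2}
\]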
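In addition, due to the independence mentioned above,
\begin{align*}
\bar\kappa(\alpha, q)\, \underline\kappa(\alpha, q) &= \mathbb{E} e^{-\alpha \bar X_{\tau(q)}}\, \mathbb{E} e^{-\alpha \underline X_{\tau(q)}} = \mathbb{E} e^{-\alpha \bar X_{\tau(q)}}\, \mathbb{E} e^{-\alpha X_{\tau(q)} + \alpha \bar X_{\tau(q)}} \\
&= \mathbb{E} e^{-\alpha X_{\tau(q)}} = \int_0^\infty q e^{-qt}\, \mathbb{E} e^{-\alpha X_t}\, dt \\
&= \int_0^\infty q \left( \exp\left( -q + \log \mathbb{E} e^{-\alpha X_1} \right) \right)^t dt = \frac{q}{q - \log \mathbb{E} e^{-\alpha X_1}} =: K(\alpha, q). \tag{3.3}
\end{align*}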
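As a small numerical illustration of the factorization, the sketch below takes Brownian motion with drift, for which the running maximum and minus the running minimum at time τ(q) are exponentially distributed, and checks that the product of the two factors equals q/(q − log E e^{−αX_1}). The function name and parameter values are illustrative assumptions, not part of the thesis.

```python
import math

def wiener_hopf_bm(alpha, q, mu, sigma):
    """Wiener-Hopf factors for Brownian motion with drift mu and volatility sigma.

    Returns (E exp(-alpha * max), E exp(-alpha * min), K(alpha, q)), where max and
    min are the running extremes at an independent exponential time with rate q.
    """
    root = math.sqrt(mu * mu + 2.0 * q * sigma * sigma)
    beta_plus = (-mu + root) / sigma**2    # rate of the running maximum
    beta_minus = (mu + root) / sigma**2    # rate of minus the running minimum
    kappa_bar = beta_plus / (beta_plus + alpha)
    kappa_under = beta_minus / (beta_minus - alpha)
    log_mgf = -alpha * mu + 0.5 * alpha**2 * sigma**2   # log E exp(-alpha X_1)
    return kappa_bar, kappa_under, q / (q - log_mgf)

# A purely imaginary argument, as used in Fourier-type inversion.
kb, ku, K = wiener_hopf_bm(alpha=0.7j, q=1.3, mu=0.01, sigma=0.2)
print(abs(kb * ku - K))
```

The same check can be repeated for real α as long as α stays below the rate of the minimum factor.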
Suppose we wish to evaluate the distributions of $\bar X_t$ and $\underline X_t$. The Wiener-Hopf
decomposition, as given above, entails that one option is to (i) first numerically evaluate the double
integrals in the exponent in the right-hand sides of (3.1)–(3.2), and then to (ii) numerically
invert these.
A principal problem of this approach is that often only the Lévy exponent corresponding to the
Lévy process under study is available; in other words, we do not have an explicit expression
for the distribution P(Xt ∈ dx). In this chapter we use a technique that circumvents this problem,
and that was proposed in Chapter 2 to evaluate the distributions of $\bar X_t$ and $\underline X_t$, to compute
the prices of lookback options and the associated Greeks.
The main idea of the approach advocated in Chapter 2 is to evaluate $\mathbb{P}(\bar X_t \le x)$ bypassing
stage (i) above; evidently, $\mathbb{P}(\underline X_t \le x)$ can be computed in an analogous fashion. The underlying
idea is that we make use of the fact that for quite a substantial class of Lévy processes Xt,
the double transform $\bar\kappa(\alpha, q)$ can be expressed explicitly in terms of the Lévy exponent; we
approximate the Lévy process under consideration by an appropriately chosen Lévy process
in this class, so that only stage (ii) remains to be performed.
We now describe a few classes of Lévy processes for which $\bar\kappa(\alpha, q)$ (and because of (3.3) also
$\underline\kappa(\alpha, q)$) can be expressed explicitly in terms of the Lévy exponent $\log \mathbb{E} e^{isX_1}$. It is noted that
in some cases still a number of (relatively straightforward) numerical computations need to
be performed.
• Spectrally one-sided processes. First consider the situation in which the underlying Lévy
process Xt has either only negative jumps (the spectrally negative case; write X ∈ S−) or only
positive jumps (the spectrally positive case; write X ∈ S+). In the former case the running
maximum up to the exponential epoch τ(q) has an exponential distribution, whereas in the
latter case the so-called generalized Pollaczek-Khinchine formula applies; see e.g. [35, Ch. III
and IV]. In both cases, $\bar\kappa(\alpha, q)$ can be expressed in closed form in terms of the Lévy exponent,
as we point out now.
Following [21, Ch. VII], for X ∈ S− we define $\Phi(\beta) := \log \mathbb{E} e^{\beta X_1}$, and let Ψ(·) denote its
right-inverse [68, p. 211]. Then
\[
\bar\kappa(\alpha, q) = \frac{\Psi(q)}{\Psi(q) + \alpha}. \tag{3.4}
\]
In other words, $\bar X_{\tau(q)}$ is exponentially distributed with parameter Ψ(q). In case X ∈ S+,
define $\varphi(\alpha) := \log \mathbb{E} e^{-\alpha X_1}$, and let ψ(·) be the inverse of ϕ(·). Then
\[
\bar\kappa(\alpha, q) = \frac{q}{\psi(q)}\, \frac{\psi(q) - \alpha}{q - \varphi(\alpha)}. \tag{3.5}
\]
This result is sometimes referred to as the (generalized) Pollaczek-Khinchine formula [55,
98]; see also [10, Ch. IX, Thm. 3.10].
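For the spectrally negative case, evaluating (3.4) only requires the right-inverse Ψ(q). The following is a minimal sketch under assumed model ingredients (drift, Brownian part, and downward exponential jumps with rate `eta` arriving at rate `lam`); the bisection routine and all names are hypothetical and merely illustrate the computation.

```python
def laplace_exponent(beta, mu, sigma, lam, eta):
    """Phi(beta) = log E exp(beta X_1): drift + Brownian part + downward Exp(eta) jumps."""
    return mu * beta + 0.5 * sigma**2 * beta**2 + lam * (eta / (eta + beta) - 1.0)

def right_inverse(q, phi, tol=1e-13):
    """Largest root of phi(beta) = q on [0, inf), by bracketing and bisection.

    Relies on phi being convex with phi(0) = 0, so that phi < q exactly on [0, Psi(q)).
    """
    lo, hi = 0.0, 1.0
    while phi(hi) < q:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if phi(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kappa_bar_spectrally_negative(alpha, q, phi):
    """Transform of the running maximum at time tau(q), cf. (3.4)."""
    psi_q = right_inverse(q, phi)
    return psi_q / (psi_q + alpha)

phi = lambda b: laplace_exponent(b, mu=0.05, sigma=0.2, lam=3.0, eta=10.0)
print(right_inverse(0.5, phi), kappa_bar_spectrally_negative(1.0, 0.5, phi))
```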
• Processes with phase-type jumps. It has been found more recently that $\bar\kappa(\alpha, q)$ can be
expressed in semi-explicit terms if the jumps in one direction (either upward or downward)
are phase-type (or, more generally, have a rational Laplace transform), whereas the jumps in
the other direction are allowed to have a general distribution; see [11, 71, 72] for results along
these lines. In this chapter, we concentrate on the setting of Lewis and Mordecki [71]
in which the positive jumps have a rational Laplace transform, and the downward jumps are
general; we write X ∈ R. In this case $\bar\kappa(\alpha, q)$ can be expressed in terms of the zeros of a
specific equation (that needs to be solved numerically).
More specifically, we consider a Lévy process with jumps, with a general jump-size
distribution in the downward direction, while the upward jumps have density
\[
p(x) = \sum_{k=1}^{K} \sum_{j=1}^{n_k} c_{kj}\, (\alpha_k)^j\, \frac{x^{j-1}}{(j-1)!}\, e^{-\alpha_k x}, \qquad x > 0.
\]
As in Chapter 2, these Lévy processes define the class R such that for a finite and positive λ,
\[
\xi(s) := \log \mathbb{E} e^{isX_1} = isd - \frac{1}{2} s^2\sigma^2 + \int_{-\infty}^{0} \left( e^{isx} - 1 - isx\, 1_{\{x>-1\}} \right) \Pi(dx)
+ \lambda \left( \sum_{k=1}^{K} \sum_{j=1}^{n_k} c_{kj} \left( \frac{i\alpha_k}{s + i\alpha_k} \right)^j - 1 \right),
\]
where the αk are ordered such that 0 ≤ Re(α1) < Re(α2) ≤ · · · ≤ Re(αK).
Now let βj(q) denote the j-th root of the equation q = ξ(s), which we assume to have
multiplicity mj(q); let m(q) denote the total number of distinct roots. Then $\bar\kappa(\alpha, q)$ can be
expressed in terms of the αk and βj(q):
\[
\bar\kappa(\alpha, q) = \prod_{k=1}^{K} \left( \frac{\alpha + \alpha_k}{\alpha_k} \right)^{n_k} \prod_{j=1}^{m(q)} \left( \frac{\beta_j(q)}{\alpha + \beta_j(q)} \right)^{m_j(q)}; \tag{3.6}
\]
this expression can be inverted with respect to α, after having performed a partial fraction
expansion. Further details and properties of the roots are given in [71] and [72, Thm. 2.2].
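To make (3.6) concrete, the sketch below treats the simplest member of the class: a jump diffusion with exponential upward jumps (K = 1, n_1 = 1) and, as an extra assumption made only for this illustration, exponential downward jumps as well. It assumes the relevant roots are the two positive real solutions β of q = log E e^{βX_1} (that is, s = −iβ in ξ), which it locates by bisection on either side of the pole at `eta_up`. All names and parameter values are hypothetical.

```python
def kou_log_mgf(beta, mu, sigma, lam, p, eta_up, eta_down):
    """G(beta) = log E exp(beta X_1) for a double-exponential jump diffusion."""
    jumps = p * eta_up / (eta_up - beta) + (1.0 - p) * eta_down / (eta_down + beta)
    return mu * beta + 0.5 * sigma**2 * beta**2 + lam * (jumps - 1.0)

def bisect(func, lo, hi, iters=200):
    """Plain bisection; assumes func(lo) < 0 < func(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def positive_roots(q, mu, sigma, lam, p, eta_up, eta_down):
    """The two positive roots of G(beta) = q, one on each side of the pole eta_up."""
    g = lambda b: kou_log_mgf(b, mu, sigma, lam, p, eta_up, eta_down) - q
    eps = 1e-9 * eta_up
    beta1 = bisect(g, 0.0, eta_up - eps)   # g(0) = -q < 0, g -> +inf at the pole
    hi = 2.0 * eta_up
    while g(hi) < 0.0:
        hi *= 2.0
    beta2 = bisect(g, eta_up + eps, hi)    # g -> -inf just right of the pole
    return beta1, beta2

def kappa_bar(alpha, eta_up, beta1, beta2):
    """Formula (3.6) with K = 1, n_1 = 1 and two simple roots."""
    return (alpha + eta_up) / eta_up * beta1 / (alpha + beta1) * beta2 / (alpha + beta2)

params = dict(mu=0.0, sigma=0.2, lam=3.0, p=0.4, eta_up=25.0, eta_down=20.0)
b1, b2 = positive_roots(0.5, **params)
print(b1, b2, kappa_bar(1.0, params["eta_up"], b1, b2))
```

For higher-order phase-type jumps the roots are in general complex, and a root finder in the complex plane would be needed instead of bisection.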
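• Processes with a meromorphic Lévy exponent. If the Lévy exponent is a meromorphic function
in the complex plane, the Wiener-Hopf factorization can be evaluated in the same way as in the
case of phase-type jumps [66]. As in Section 2.6.2, the class of Beta processes M
consists of Lévy processes defined by the triplet (d, σ, Π), with the Lévy measure Π(·) such that
\[
\Pi(x) = c_1 \frac{e^{-\alpha_1\beta_1 x}}{(1 - e^{-\beta_1 x})^{\lambda_1}}\, 1_{\{x>0\}} + c_2 \frac{e^{\alpha_2\beta_2 x}}{(1 - e^{\beta_2 x})^{\lambda_2}}\, 1_{\{x<0\}},
\]
with parameters αi > 0, βi > 0, ci ≥ 0 and λi ∈ (0, 3)\{1, 2}. Its Lévy exponent is given by
\[
\Psi(s) = i(d - \rho)s - \frac{1}{2}\sigma^2 s^2 + \frac{c_1}{\beta_1} B\left( \alpha_1 - \frac{is}{\beta_1},\, 1 - \lambda_1 \right) + \frac{c_2}{\beta_2} B\left( \alpha_2 + \frac{is}{\beta_2},\, 1 - \lambda_2 \right) - \gamma, \tag{3.7}
\]
with B(x, y) := Γ(x)Γ(y)/Γ(x + y) denoting the Beta function. In addition, defining the
function ψ(x) := d log(Γ(x))/dx,
\[
\gamma = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1) + \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2),
\]
\[
\rho = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1) \left( \psi(1 + \alpha_1 - \lambda_1) - \psi(\alpha_1) \right) - \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2) \left( \psi(1 + \alpha_2 - \lambda_2) - \psi(\alpha_2) \right).
\]
The Lévy exponent of the Beta process is a meromorphic function in C. There are infinitely
many roots of the equation q − Ψ(s) = 0; these are real and simple, and characterized in [66,
Thm. 10], as follows. The roots are such that $\xi_0^- \in (-\alpha_1\beta_1, 0)$ and $\xi_0^+ \in (0, \alpha_2\beta_2)$, while for
n ∈ {1, 2, . . .},
\[
\xi_n^- \in \left( \beta_1(-\alpha_1 - n),\, \beta_1(-\alpha_1 - n + 1) \right), \qquad \xi_n^+ \in \left( \beta_2(\alpha_2 + n - 1),\, \beta_2(\alpha_2 + n) \right).
\]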
Moreover, for x > 0,
\[
\mathbb{P}(\bar X_{\tau(q)} \in dx) = -\left( \sum_{k=0}^{\infty} C_k^-\, \xi_k^-\, e^{\xi_k^- x} \right) dx, \tag{3.8}
\]
where, with k ∈ {1, 2, . . .},
\[
C_0^- = \prod_{n \ge 1} \frac{1 + \xi_0^- / (\beta_1(n - 1 + \alpha_1))}{1 - \xi_0^- / \xi_n^-}, \qquad
C_k^- = \frac{1 + \xi_k^- / (\beta_1(k - 1 + \alpha_1))}{1 - \xi_k^- / \xi_0^-} \prod_{n \ge 1,\, n \ne k} \frac{1 + \xi_k^- / (\beta_1(n - 1 + \alpha_1))}{1 - \xi_k^- / \xi_n^-}.
\]
A similar expression holds for the density $\mathbb{P}(-\underline X_{\tau(q)} \in dx)$, but $\{\xi_n^-\}$ must be replaced by
$\{-\xi_n^+\}$, while α1, β1 must be replaced by α2, β2.
Based on the above characterization, the poles can be determined efficiently. It is noted,
though, that for performing the inverse Laplace transform these poles must be determined for
complex values of q; in Section 2.7 it is pointed out how this can be done.
3.2.2 Phase-type approximations, and a few other implementational issues
As discussed in the previous subsection, if the jumps in one direction have a phase-type
distribution, while those in the other direction can have any distribution, the Wiener-Hopf
decomposition can be performed in (semi-)explicit terms. For the case of a Lévy process with
non-phase-type jumps, this leads to the idea of approximating the distribution of the jumps
in one direction by a phase-type distribution. Recall that the class of phase-type distribu-
tions has the attractive feature that it forms a dense class of distributions within the class of
distributions on (0,∞) [10, Ch. III, Thm. 4.2]. As a result, any distribution on (0,∞) can
be approximated by a phase-type distribution arbitrarily closely. One approach we use for
fitting an arbitrary distribution on (0,∞) by a suitable phase-type distribution is developed
by Asmussen et al. [14], while an alternative method is proposed by Horváth and Telek [57].
Both methods are based on the expectation-maximization (usually referred to as EM) algorithm.
The method developed in [14] generates an approximation with a general phase-type distri-
bution; this essentially means that it does not necessarily need a good guess to initialize the
algorithm, but for an accurate approximation the degree of phase-type may be prohibitively
large. The method of [57] focuses on mixtures of Erlang distributions, and tends to give an
accurate approximation already with a relatively low number of Erlang distributions; the de-
grees of these Erlang distributions, however, do not change while the algorithm is running,
and as a consequence we need to have a proper guess about these degrees. It has turned
out that for our purposes the algorithm of [57] is more appropriate than the first one with
respect to accuracy and CPU time consumption.
Evidently, in the case of small jumps, i.e., $\int_{-\infty}^{\infty} \Pi(dx) = \infty$, the jumps cannot be described as
a compound Poisson process: there are infinitely many jumps in any finite time interval. In
this situation, which we will come across in Section 3.4 when discussing the CGMY model,
we replace the small jumps (i.e., those that are in absolute value smaller than some small
ε > 0) by an appropriately chosen Brownian motion. This procedure, theoretically backed
by the findings in e.g. [15], is explained in greater detail in Section 2.5.
3.2.3 Laplace inverse transform
Our approach to option pricing relies on an advanced numerical method for Laplace transform
inversion [36, 37]. The method is capable of performing Laplace inversion, Fourier
inversion as well as mixed Laplace-Fourier inversion accurately and fast; importantly,
multi-dimensional transforms can also be handled. In this subsection we briefly discuss
the inversion method; see Section 2.3 for a more detailed account.
Like many other numerical inversion algorithms, the technique applied here is based on the
well-known Poisson summation formula (PSF) [1, 38]. This PSF is given by, for v ∈ [0, 1) and
the ‘damping factor’ a ∈ R,
\[
\sum_{k=-\infty}^{\infty} \hat f\left( a + 2\pi i (k + v) \right) = \sum_{k=0}^{\infty} e^{-ak}\, e^{-2\pi i k v} f(k), \tag{3.9}
\]
in which $\hat f$ is the Laplace (Fourier) transform
\[
\hat f(s) := \int_{-\infty}^{\infty} e^{-st} f(t)\, dt.
\]
The idea is to replace the infinite summation in the left-hand side by an appropriately chosen
finite sum; in [36, Appendix A] it is pointed out how such a quadrature rule can be found.
Then the values of f(k) can be computed efficiently by the fast Fourier transform algorithm
(FFT); see e.g. the seminal paper [32].
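The identity (3.9) can be checked numerically on a simple test function. The sketch below uses f(t) = t e^{−t} on [0, ∞), whose transform is 1/(s + 1)², and compares a brute-force symmetric truncation of the left-hand side with the right-hand side summed in closed form. This is only a sanity check of the formula; it is not the quadrature rule of [36, 37], and the test function and truncation level are assumptions made for the illustration.

```python
import cmath
import math

def laplace_transform(s):
    """Transform of f(t) = t * exp(-t) on [0, inf): 1 / (s + 1)^2."""
    return 1.0 / (s + 1.0) ** 2

def psf_left(a, v, n_terms):
    """Symmetric truncation of sum_k fhat(a + 2*pi*i*(k + v))."""
    return sum(laplace_transform(a + 2j * math.pi * (k + v))
               for k in range(-n_terms, n_terms + 1))

def psf_right(a, v):
    """sum_{k>=0} exp(-a k) exp(-2 pi i k v) f(k) for f(k) = k exp(-k).

    Uses sum_{k>=0} k z^k = z / (1 - z)^2 with z = exp(-(a + 1) - 2 pi i v).
    """
    z = cmath.exp(-(a + 1.0) - 2j * math.pi * v)
    return z / (1.0 - z) ** 2

lhs = psf_left(a=0.5, v=0.3, n_terms=20000)
rhs = psf_right(a=0.5, v=0.3)
print(abs(lhs - rhs))
```

The slow (order 1/N) decay of the truncation error visible here is precisely why a dedicated quadrature rule is preferable to naive truncation.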
Numerical experiments in [36, 37] show that, under general circumstances, this method
evaluates the function values to near machine precision. It is also pointed out how the
method can be adapted, with a simple modification, such that it is capable of handling
discontinuities. The method can also be extended to multidimensional Laplace inversion.
Although the method has been developed for Laplace inversion, it can be adapted
in a straightforward manner to perform Fourier inversion. The implementation, as
well as its extension to multidimensional inversion, is described in detail in [36, 37].
3.3 Transforms of prices and Greeks of lookback options
We consider the setting described in the introduction: a model in which the price of an
underlying asset evolves as $S_t = S_0 e^{X_t}$, where Xt is a Lévy process with X0 = 0. In this
chapter we primarily focus on lookback options. To test our numerical procedures, however,
we also include a simpler option, viz. the vanilla option.
We consider the usual setup, as introduced in more detail e.g. in [78]: a market with two basic
assets, viz. the usual bank account with an interest rate r > 0, and the option associated with
an underlying asset whose evolution in time is represented by the stochastic process St.
3.3.1 Vanilla option
For the vanilla call option the payoff is given by
\[
P^{(c)}_{\mathrm{van}}(T, K) := (S_T - K)^+,
\]
where, as usual, T is the maturity time and K is the strike price. This option is simpler than the
lookback options given in the introduction (which also depend on the extreme values of St
for t ∈ [0, T]). The put-counterpart is defined through the payoff $P^{(p)}_{\mathrm{van}}(T, K) := (K - S_T)^+$.
Our goal is to compute the price of the vanilla option, i.e.,
\[
V^{(c)}_{\mathrm{van}}(T, K) := \mathbb{E}\left[ e^{-rT} P^{(c)}_{\mathrm{van}}(T, K) \right];
\]
the analysis of the put-counterpart works similarly. It requires some elementary algebra to
verify that, with k := log(K/S0),
\[
V^{(c)}_{\mathrm{van}}(T, K) = S_0 e^{-rT} \int_k^\infty (e^x - e^k)\, \mathbb{P}(X_T \in dx).
\]
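Let $\hat V^{(c)}_{\mathrm{van}}(T, \alpha)$ be the Fourier transform with respect to k:
\[
\hat V^{(c)}_{\mathrm{van}}(T, \alpha) := S_0 e^{-rT} \int_{-\infty}^{\infty} e^{i\alpha k} e^{\eta k} \int_k^\infty \left( e^x - e^k \right) \mathbb{P}(X_T \in dx)\, dk,
\]
where η > 0 is a damping factor. By changing the integration order it is readily found that
\[
\hat V^{(c)}_{\mathrm{van}}(T, \alpha) = \frac{S_0 e^{-rT}}{(i\alpha + \eta)(i\alpha + \eta + 1)}\, \mathbb{E} e^{(i\alpha + \eta + 1) X_T}
= \frac{S_0}{(i\alpha + \eta)(i\alpha + \eta + 1)} \left( e^{-r}\, \mathbb{E} e^{(i\alpha + \eta + 1) X_1} \right)^T, \tag{3.10}
\]
where the last step is due to the Lévy nature of Xt. We have expressed the transform
$\hat V^{(c)}_{\mathrm{van}}(T, \alpha)$ in terms of the Lévy exponent corresponding to Xt and the maturity T.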
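The transform (3.10) can be inverted by elementary means when the Lévy exponent is available. The sketch below does so for Brownian motion with drift, using a plain trapezoidal rule for the inverse Fourier integral rather than the inversion routine of [36, 37], and compares the outcome with the Black-Scholes formula. The truncation point, step size and damping factor η = 1 are illustrative assumptions.

```python
import cmath
import math

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes call price, used here only as a benchmark."""
    d_plus = (-math.log(K / S0) + T * (r + 0.5 * sigma**2)) / (sigma * math.sqrt(T))
    d_minus = d_plus - sigma * math.sqrt(T)
    cdf = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return S0 * cdf(d_plus) - K * math.exp(-r * T) * cdf(d_minus)

def vanilla_transform(alpha, S0, T, r, log_mgf, eta):
    """Right-hand side of (3.10); log_mgf(z) = log E exp(z X_1)."""
    z = 1j * alpha + eta
    return S0 * cmath.exp(T * (log_mgf(z + 1.0) - r)) / (z * (z + 1.0))

def vanilla_call_by_inversion(S0, K, T, r, log_mgf, eta=1.0, upper=100.0, n=4000):
    """exp(-eta k)/pi * int_0^upper Re[exp(-i alpha k) * transform(alpha)] d alpha."""
    k = math.log(K / S0)
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        alpha = j * h
        weight = 0.5 if j in (0, n) else 1.0   # trapezoidal end-point weights
        value = cmath.exp(-1j * alpha * k) * vanilla_transform(alpha, S0, T, r, log_mgf, eta)
        total += weight * value.real
    return math.exp(-eta * k) * total * h / math.pi

r, sigma = 0.03, 0.2
mu = r - 0.5 * sigma**2
log_mgf_bs = lambda z: z * mu + 0.5 * z * z * sigma**2
print(vanilla_call_by_inversion(100.0, 100.0, 1.0, r, log_mgf_bs), bs_call(100.0, 100.0, 1.0, r, sigma))
```

For other Lévy models only `log_mgf` changes, provided the exponential moment of order η + 1 is finite.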
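We now determine the transforms of a set of Greeks, i.e., sensitivities. We focus on the
sensitivities with respect to the initial price of the underlying asset S0 and the maturity T;
in the sequel we refer to these Greeks by Δ and Θ. Regarding the former, it is elementary to
verify that
\[
\Delta^{(c)}_{\mathrm{van}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{van}}(T, K)}{\partial S_0} = e^{-rT} \int_{\log(K/S_0)}^{\infty} e^x\, \mathbb{P}(X_T \in dx).
\]
Writing k := log(K/S0) and transforming to k in the same way as above, we obtain the
transform
\[
\hat\Delta^{(c)}_{\mathrm{van}}(T, \alpha) := e^{-rT} \int_{-\infty}^{\infty} e^{i\alpha k} e^{\eta k} \int_k^\infty e^x\, \mathbb{P}(X_T \in dx)\, dk = \frac{1}{i\alpha + \eta} \left( e^{-r}\, \mathbb{E} e^{(i\alpha + \eta + 1) X_1} \right)^T.
\]
Note that, after inversion, the resulting Greek depends on S0 through k = log(K/S0).
We now concentrate on the Greek with respect to the maturity time. With
\[
\Theta^{(c)}_{\mathrm{van}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{van}}(T, K)}{\partial T},
\]
we have, by differentiating (3.10) with respect to T, that
\[
\hat\Theta^{(c)}_{\mathrm{van}}(T, \alpha) := \frac{S_0}{(i\alpha + \eta)(i\alpha + \eta + 1)} \left( e^{-r}\, \mathbb{E} e^{(i\alpha + \eta + 1) X_1} \right)^T \left( \log \mathbb{E} e^{(i\alpha + \eta + 1) X_1} - r \right).
\]
It is noted that transforms of second-order Greeks can be determined similarly.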
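As an illustration of the transform of Θ, the sketch below inverts it for Brownian motion with drift by a plain trapezoidal rule (again not the routine of [36, 37]) and compares the result with the derivative of the Black-Scholes price with respect to maturity. Function names, the damping factor η = 1 and the discretization are assumptions made for this example.

```python
import cmath
import math

def theta_transform(alpha, S0, T, r, log_mgf, eta):
    """Transform of dV/dT: the vanilla transform times (log E exp((i alpha + eta + 1) X_1) - r)."""
    z = 1j * alpha + eta
    exponent = log_mgf(z + 1.0) - r
    return S0 * cmath.exp(T * exponent) * exponent / (z * (z + 1.0))

def invert(transform, k, eta, upper=100.0, n=4000):
    """Trapezoidal approximation of exp(-eta k)/pi * int_0^upper Re[exp(-i a k) transform(a)] da."""
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        alpha = j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * (cmath.exp(-1j * alpha * k) * transform(alpha)).real
    return math.exp(-eta * k) * total * h / math.pi

def bs_theta(S0, K, T, r, sigma):
    """dV/dT of the Black-Scholes call (sensitivity with respect to maturity)."""
    sqrt_T = math.sqrt(T)
    d_plus = (-math.log(K / S0) + T * (r + 0.5 * sigma**2)) / (sigma * sqrt_T)
    d_minus = d_plus - sigma * sqrt_T
    pdf = math.exp(-0.5 * d_plus**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(d_minus / math.sqrt(2.0)))
    return S0 * pdf * sigma / (2.0 * sqrt_T) + r * K * math.exp(-r * T) * cdf

S0, K, T, r, sigma, eta = 100.0, 100.0, 1.0, 0.03, 0.2, 1.0
mu = r - 0.5 * sigma**2
log_mgf_bs = lambda z: z * mu + 0.5 * z * z * sigma**2
theta = invert(lambda a: theta_transform(a, S0, T, r, log_mgf_bs, eta), math.log(K / S0), eta)
print(theta, bs_theta(S0, K, T, r, sigma))
```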
The vanilla options are path independent, in the sense that their prices depend on the asset
price process only through the asset price at maturity time T , and are independent of the
specific shape of the path during the time interval (0, T ). The lookback options, which we
are going to study now, are path dependent.
3.3.2 Fixed strike lookback options
We now focus on pricing fixed-strike lookback options; again we present our analysis for the
call option, but the put-variant is dealt with analogously. In our derivations, we follow the
same line of reasoning as in [78]. Our goal is to evaluate, in terms of transforms,
\[
V^{(c)}_{\mathrm{fix}}(T, K) := \mathbb{E}\left[ e^{-rT} P^{(c)}_{\mathrm{fix}}(T, K) \right],
\]
as well as its Greeks with respect to S0 and T; recall the definition of the payoff $P^{(c)}_{\mathrm{fix}}(T, K)$
from the introduction of this chapter. If K ≤ S0, it automatically follows that $P^{(c)}_{\mathrm{fix}}(T, K) = \bar S_T - K$;
the option price $V^{(c)}_{\mathrm{fix}}(T, K)$ can then be evaluated as pointed out in Chapter 2. Realize
that this case corresponds to a ‘riskless’ option, of which it is guaranteed that the payoff is
non-negative. Let us therefore turn to the more realistic setting in which K > S0.
We again parametrize k = log(K/S0), which is now necessarily positive. Let $\hat V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha)$ be
the transform with respect to k and T:
\[
\hat V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-\vartheta T} \int_0^\infty e^{-\alpha k}\, V^{(c)}_{\mathrm{fix}}(T, K)\, dk\, dT.
\]
The idea of treating the maturity T as an exponentially distributed random variable was first
proposed in [46] for barrier options, but just for the Black-Scholes model. This expression can be
rewritten as the threefold integral, which we in the sequel assume to converge,
\[
S_0 \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_0^\infty e^{-\alpha k} \int_k^\infty (e^x - e^k)\, \mathbb{P}(\bar X_T \in dx)\, dk\, dT.
\]
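Now change the order of integration: first integrate over k ∈ [0, x], so as to obtain
\[
\frac{S_0 \vartheta}{r + \vartheta} \int_0^\infty (r + \vartheta) e^{-(r+\vartheta)T} \int_0^\infty \left( \frac{1}{\alpha} \left( e^x - e^{(1-\alpha)x} \right) - \frac{1}{\alpha - 1} \left( 1 - e^{(1-\alpha)x} \right) \right) \mathbb{P}(\bar X_T \in dx)\, dT.
\]
This expression can be expressed in terms of transforms related to the running maximum
after an exponentially distributed time with mean (r + ϑ)^{-1}:
\[
\frac{S_0 \vartheta}{r + \vartheta} \left( \frac{1}{\alpha} \left( \mathbb{E} e^{\bar X_{\tau(r+\vartheta)}} - \mathbb{E} e^{(1-\alpha) \bar X_{\tau(r+\vartheta)}} \right) - \frac{1}{\alpha - 1} \left( 1 - \mathbb{E} e^{(1-\alpha) \bar X_{\tau(r+\vartheta)}} \right) \right).
\]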
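This expression can be written in terms of the transform $\bar\kappa(\alpha, q)$ introduced earlier:
\[
\frac{S_0 \vartheta}{r + \vartheta} \left( \frac{\bar\kappa(-1, r + \vartheta) - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha} - \frac{1 - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha - 1} \right).
\]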
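The last expression is straightforward to evaluate once the transform of the running maximum is available. As a sketch, consider the Black-Scholes case, where the running maximum at time τ(q) is exponentially distributed with some rate β; carrying out the integrations directly then gives S0 ϑ/(r + ϑ) · 1/((β − 1)(α + β − 1)), which the code uses as a cross-check. The helper names and the chosen values of ϑ and α are assumptions for illustration only.

```python
import math

def beta_plus(q, r, sigma):
    """Rate of the exponentially distributed running maximum at time tau(q)
    for Brownian motion with drift mu = r - sigma^2/2 (Black-Scholes case)."""
    mu = r - 0.5 * sigma**2
    return (-mu + math.sqrt(mu**2 + 2.0 * q * sigma**2)) / sigma**2

def fixed_strike_transform(theta, alpha, S0, r, kappa_bar):
    """Transform of the fixed-strike call price in terms of kappa_bar(., r + theta)."""
    q = r + theta
    k_m1 = kappa_bar(-1.0, q)         # E exp(max at tau(q))
    k_a1 = kappa_bar(alpha - 1.0, q)  # E exp((1 - alpha) * max at tau(q))
    return S0 * theta / q * ((k_m1 - k_a1) / alpha - (1.0 - k_a1) / (alpha - 1.0))

def fixed_strike_transform_direct(theta, alpha, S0, r, sigma):
    """Same quantity, integrating the exponential law of the maximum directly."""
    q = r + theta
    b = beta_plus(q, r, sigma)
    return S0 * theta / q / ((b - 1.0) * (alpha + b - 1.0))

S0, r, sigma = 100.0, 0.03, 0.2
kappa_bar = lambda a, q: beta_plus(q, r, sigma) / (beta_plus(q, r, sigma) + a)
via_kappa = fixed_strike_transform(2.0, 3.0, S0, r, kappa_bar)
direct = fixed_strike_transform_direct(2.0, 3.0, S0, r, sigma)
print(via_kappa, direct)
```

Note that the formula requires α ∉ {0, 1} and a finite exponential moment of the maximum, which in this example holds because r + ϑ > r.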
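The Greek related to the initial asset price S0 is
\[
\Delta^{(c)}_{\mathrm{fix}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{fix}}(T, K)}{\partial S_0} = e^{-rT} \int_{\log(K/S_0)}^{\infty} e^x\, \mathbb{P}(\bar X_T \in dx).
\]
With the usual transformation k = log(K/S0), we find
\[
\hat\Delta^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_0^\infty e^{-\alpha k} \int_k^\infty e^x\, \mathbb{P}(\bar X_T \in dx)\, dk\, dT
= \frac{\vartheta}{r + \vartheta}\, \frac{\bar\kappa(-1, r + \vartheta) - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha}.
\]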
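Now consider the Greek with respect to the maturity T. Interchanging the order of the
integrals and integration by parts yields
\[
\hat\Theta^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-\vartheta T} \int_0^\infty e^{-\alpha k}\, \frac{\partial V^{(c)}_{\mathrm{fix}}(T, K)}{\partial T}\, dk\, dT = \vartheta\, \hat V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha).
\]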
3.3.3 Floating strike lookback options
In this subsection the focus lies on floating strike lookback options, presenting, as usual, the
results for the call variant. We characterize, in terms of transforms,
\[
V^{(c)}_{\mathrm{fl}}(T, L) := \mathbb{E}\left[ e^{-rT} P^{(c)}_{\mathrm{fl}}(T, L) \right],
\]
as well as its Greeks with respect to S0 and T; the payoff function $P^{(c)}_{\mathrm{fl}}(T, L)$ is defined in the
introduction. If L ≤ 1, this payoff equals $S_T - L \underline S_T$, being non-negative, and allowing for
relatively easy evaluation. We therefore focus on the more realistic (and challenging) setting
in which L > 1.
We parametrize ℓ := log L (which is positive), and define
\[
\hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = \int_0^\infty \vartheta e^{-\vartheta T} \int_0^\infty e^{-\alpha \ell}\, V^{(c)}_{\mathrm{fl}}(T, e^{\ell})\, d\ell\, dT.
\]
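After some algebra, it is seen that this expression equals
\[
S_0 \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_0^\infty e^{-\alpha \ell} \int_{y=-\infty}^{0} \int_{x=\ell+y}^{\infty} (e^x - e^{\ell+y})\, \mathbb{P}(X_T \in dx, \underline X_T \in dy)\, d\ell\, dT.
\]
Interchange the order of the integrals, such that first the integral over ℓ ∈ [0, x − y] is
evaluated. This reduces to, with the inner integral corresponding to the variable y and the ‘middle’
integral to x,
\[
S_0 \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_{-\infty}^{\infty} \int_{-\infty}^{0} \left( e^x\, \frac{1 - e^{-\alpha(x-y)}}{\alpha} - e^y\, \frac{1 - e^{-(\alpha-1)(x-y)}}{\alpha - 1} \right) \mathbb{P}(X_T \in dx, \underline X_T \in dy)\, dT.
\]
We thus find
\[
\hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = S_0\, \frac{\vartheta}{r + \vartheta} \left( \frac{1}{\alpha(\alpha - 1)}\, \mathbb{E}\left[ e^{-(\alpha-1) X_{\tau(r+\vartheta)} + \alpha \underline X_{\tau(r+\vartheta)}} \right] + \frac{1}{\alpha}\, \mathbb{E} e^{X_{\tau(r+\vartheta)}} - \frac{1}{\alpha - 1}\, \mathbb{E} e^{\underline X_{\tau(r+\vartheta)}} \right).
\]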
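Consider the first term between the brackets in the previous display. By virtue of (i) the
trivial identity $-(\alpha-1)x + \alpha \underline x = (\alpha-1)(\underline x - x) + \underline x$, (ii) the fact that (due to Wiener-Hopf theory)
$\underline X_{\tau(r+\vartheta)}$ and $\underline X_{\tau(r+\vartheta)} - X_{\tau(r+\vartheta)}$ are independent, and (iii) the fact that
$\underline X_{\tau(r+\vartheta)} - X_{\tau(r+\vartheta)}$ is distributed as $-\bar X_{\tau(r+\vartheta)}$, we have that
\[
\mathbb{E}\left[ e^{-(\alpha-1) X_{\tau(r+\vartheta)} + \alpha \underline X_{\tau(r+\vartheta)}} \right] = \mathbb{E} e^{-(\alpha-1) \bar X_{\tau(r+\vartheta)}}\; \mathbb{E} e^{\underline X_{\tau(r+\vartheta)}}.
\]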
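These considerations eventually lead to the identity, using the notation introduced in Eqn.
(3.3),
\[
\hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = S_0\, \frac{\vartheta}{r + \vartheta} \left( \frac{\bar\kappa(\alpha - 1, r + \vartheta)\, \underline\kappa(-1, r + \vartheta)}{\alpha(\alpha - 1)} + \frac{K(-1, r + \vartheta)}{\alpha} - \frac{\underline\kappa(-1, r + \vartheta)}{\alpha - 1} \right).
\]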
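As a sketch of how this identity can be evaluated, the code below again takes the Black-Scholes case, where both Wiener-Hopf factors are exponential. Writing the payoff as e^{min}(e^{D} − e^{ℓ})^+ with D the (independent) distance between the process and its running minimum yields the closed form S0 ϑ/(r + ϑ) · E e^{min} /((β − 1)(α + β − 1)), with β the rate of the maximum; this is used as a cross-check. All function names and parameter values are illustrative assumptions.

```python
import math

def rates(q, r, sigma):
    """Rates of the running maximum and of minus the running minimum at time tau(q)
    for Brownian motion with drift mu = r - sigma^2/2."""
    mu = r - 0.5 * sigma**2
    root = math.sqrt(mu**2 + 2.0 * q * sigma**2)
    return (-mu + root) / sigma**2, (mu + root) / sigma**2

def floating_strike_transform(theta, alpha, S0, r, kappa_bar, kappa_under, K):
    """Transform of the floating-strike call price in terms of the Wiener-Hopf factors."""
    q = r + theta
    ku = kappa_under(-1.0, q)
    return S0 * theta / q * (kappa_bar(alpha - 1.0, q) * ku / (alpha * (alpha - 1.0))
                             + K(-1.0, q) / alpha - ku / (alpha - 1.0))

S0, r, sigma = 100.0, 0.03, 0.2
mu = r - 0.5 * sigma**2
kappa_bar = lambda a, q: rates(q, r, sigma)[0] / (rates(q, r, sigma)[0] + a)
kappa_under = lambda a, q: rates(q, r, sigma)[1] / (rates(q, r, sigma)[1] - a)
K = lambda a, q: q / (q - (-a * mu + 0.5 * a**2 * sigma**2))

def floating_strike_transform_direct(theta, alpha):
    """Closed form for the Black-Scholes case, obtained by direct integration."""
    q = r + theta
    b_plus, b_minus = rates(q, r, sigma)
    return S0 * theta / q * (b_minus / (b_minus + 1.0)) / ((b_plus - 1.0) * (alpha + b_plus - 1.0))

print(floating_strike_transform(2.0, 3.0, S0, r, kappa_bar, kappa_under, K),
      floating_strike_transform_direct(2.0, 3.0))
```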
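We now turn to the Greeks. $\hat\Delta^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$, defined in the obvious way, is simply $\hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)/S_0$,
which is independent of S0; it is evident from the definition of the payoff that $\hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$ is
linear in S0. Regarding, in self-evident notation, $\Theta^{(c)}_{\mathrm{fl}}(T, L)$, it is seen that, with the same line
of reasoning as used for the fixed strike lookback option, $\hat\Theta^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = \vartheta\, \hat V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$.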
3.4 Numerical validation
In this section we consider a set of frequently used, practically relevant Lévy processes Xt.
For each of them we have implemented our inversion technique to numerically invert the
transforms identified in the previous section.
3.4.1 Black-Scholes model
Our first example concerns the celebrated Black-Scholes model, in which Xt represents a
Brownian motion with drift. This model is admittedly an oversimplification of reality, as
argued in the introduction, but, due to the fact that it allows explicit computations, serves as
an ideal benchmark to test our numerical techniques.
The Lévy process Xt which governs the price of the underlying asset follows a Brownian
motion with drift μ and standard deviation parameter σ:
\[
dX_t = \mu\, dt + \sigma\, dW_t,
\]
with Wt a standard Brownian motion. We pick μ = r − σ²/2, such that $e^{-rt} S_t$, like Wt, is a
martingale under the risk-neutral measure. It is readily checked that the price of the vanilla
option is given by the well-known Black-Scholes formula:
\[
V^{(c)}_{\mathrm{van}}(T, K) = S_0 \Phi_N(d_+) - K e^{-rT} \Phi_N(d_-),
\]
where
\[
d_\pm := \frac{-\log(K/S_0) + T(r \pm \sigma^2/2)}{\sigma\sqrt{T}}
\]
and $\Phi_N(u)$ is the cumulative distribution function of the standard Normal distribution. The
Greeks with respect to S0 and T follow by differentiation.
The price of the fixed strike lookback option reads, with k = log(K/S0) > 0,
\[
V^{(c)}_{\mathrm{fix}}(T, K) = S_0 e^{-rT} \int_k^\infty (e^y - e^k)\, \mathbb{P}(\bar X_T \in dy),
\]
which can be further evaluated using the relation [56, p. 49]
\[
\mathbb{P}(\bar X_T \le y) = 1 - \Phi_N\left( \frac{-y + \mu T}{\sigma\sqrt{T}} \right) - \exp\left( \frac{2\mu y}{\sigma^2} \right) \Phi_N\left( \frac{-y - \mu T}{\sigma\sqrt{T}} \right).
\]
As a consequence, the option price can be numerically evaluated by performing an elementary
integration with arbitrary precision. Formulas for the Greeks follow in a similar fashion.
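The elementary integration mentioned above can be sketched as follows. The code assumes the standard reflection-principle form of the distribution of the running maximum (with exponent +2μy/σ²), rewrites the price by integration by parts as S0 e^{−rT} ∫_k^∞ e^y P(max > y) dy, and applies Simpson's rule; the truncation point and grid size are illustrative assumptions.

```python
import math

def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def max_tail(y, T, mu, sigma):
    """P(max_{0<=s<=T} X_s > y) for y >= 0, Brownian motion with drift mu."""
    s = sigma * math.sqrt(T)
    return (norm_cdf((-y + mu * T) / s)
            + math.exp(2.0 * mu * y / sigma**2) * norm_cdf((-y - mu * T) / s))

def fixed_strike_lookback_bs(S0, K, T, r, sigma, n=2000):
    """Fixed-strike lookback call for K > S0, by Simpson's rule (n must be even)."""
    mu = r - 0.5 * sigma**2
    k = math.log(K / S0)
    upper = max(k, mu * T) + 12.0 * sigma * math.sqrt(T)   # tail beyond is negligible
    h = (upper - k) / n
    integrand = lambda y: math.exp(y) * max_tail(y, T, mu, sigma)
    total = integrand(k) + integrand(upper)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * integrand(k + j * h)
    return S0 * math.exp(-r * T) * total * h / 3.0

print(fixed_strike_lookback_bs(100.0, 110.0, 0.5, 0.03, 0.2))
```

With r = 0.03, σ = 0.2, S0 = 100, T = 0.5 and K = 110 this should land close to the corresponding entry of Table 3.1.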
The price of the floating strike lookback option is given by, with L > 1 and ℓ := log L,
\[
V^{(c)}_{\mathrm{fl}}(T, L) = S_0 e^{-rT} \int_{y=-\infty}^{0} \int_{x=\ell+y}^{\infty} (e^x - e^{\ell+y})\, \mathbb{P}(X_T \in dx;\, \underline X_T \in dy).
\]
This expression can be further evaluated by first conditioning on the value of XT. Then
realize that Xt, conditional on the value of XT, is a Brownian bridge. The distribution function
of the minimum value attained by the Brownian bridge is known to be, for y ≤ 0 and y ≤ x,
\[
\mathbb{P}(\underline X_T \le y \mid X_T = x) = \exp\left( -\frac{2y(y - x)}{T\sigma^2} \right).
\]
This leads to an algorithm that quantifies $V^{(c)}_{\mathrm{fl}}(T, L)$ by performing a number of elementary
numerical integrations. Again, such procedures can be applied as well to find the Greeks.
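The conditioning step can be illustrated on a simpler quantity than the option price: the distribution function of the running minimum itself. The sketch below integrates the Brownian-bridge formula against the Gaussian density of X_T by Simpson's rule and compares the result with the closed-form law of the minimum obtained from the reflection principle. The grid size and truncation point are assumptions made for this example.

```python
import math

def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def min_cdf_closed_form(y, T, mu, sigma):
    """P(min_{0<=s<=T} X_s <= y) for y <= 0 (reflection principle)."""
    s = sigma * math.sqrt(T)
    return (norm_cdf((y - mu * T) / s)
            + math.exp(2.0 * mu * y / sigma**2) * norm_cdf((y + mu * T) / s))

def min_cdf_by_conditioning(y, T, mu, sigma, n=4000):
    """P(X_T <= y) + int_y^inf P(min <= y | X_T = x) P(X_T in dx), Simpson's rule."""
    s = sigma * math.sqrt(T)

    def integrand(x):
        bridge = math.exp(-2.0 * y * (y - x) / (T * sigma**2))
        density = math.exp(-0.5 * ((x - mu * T) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        return bridge * density

    upper = max(y, mu * T) + 12.0 * s
    h = (upper - y) / n
    total = integrand(y) + integrand(upper)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * integrand(y + j * h)
    return norm_cdf((y - mu * T) / s) + total * h / 3.0

print(min_cdf_by_conditioning(-0.1, 0.5, 0.01, 0.2), min_cdf_closed_form(-0.1, 0.5, 0.01, 0.2))
```

The same conditioning, applied to the payoff instead of an indicator, gives the nested integrations needed for the floating strike price.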
Table 3.1 presents numerical output for an example corresponding to the Black-Scholes
model with parameters r = 0.03, σ = 0.2, and S0 = 100. The column ‘Exact computa-
tion’ corresponds to the explicit evaluation techniques mentioned above; here we rely on the
explicit formula for the vanilla option, and numerical integration for the lookback options
(which can be done arbitrarily accurately). The numerical experiments with vanilla options
show a virtually perfect fit, close to machine precision. For the lookback options, which require
multiple inversions, the performance is still remarkably good.
Table 3.2 presents the corresponding results for the Greeks with respect to the initial asset
price S0 and the maturity T. Again we have a nearly perfect performance for the vanilla
option, and still highly accurate results for the lookback options. From the table we conclude
that $\Delta^{(c)}_{\mathrm{fl}}(T, L) = V^{(c)}_{\mathrm{fl}}(T, L)/S_0$, in line with an observation we made above.
3.4.2 Jump-diffusion models
A Lévy process is a jump-diffusion if it has the following form:
\[
X_t = \mu t + \sigma W_t + \sum_{i=1}^{N_t} J_i;
\]
here the first two terms on the right-hand side correspond to a Brownian motion (with drift),
whereas the last term is a compound Poisson term. The process Nt is a Poisson process (with
rate λ > 0) which counts the number of jumps up to time t, and the Ji are i.i.d. jumps. We
consider two specific models here: the so-called Merton model and the Kou model.
Vanilla option, V(c)van(T,K)
Time T   Strike K   Simulation   Fourier inversion   Error
0.5      90         12.799       12.799              4.2·10^-14
0.5      100        6.370        6.371               5.7·10^-14
0.5      110        2.611        2.612               2.1·10^-13
1        90         15.428       15.429              1.6·10^-13
1        100        9.142        9.413               1.8·10^-13
1        110        5.292        5.293               3.0·10^-13

Fixed strike lookback option, V(c)fix(T,K)
Time T   Strike K   Simulation   Laplace inversion   Error
0.5      110        5.133        5.134               1.9·10^-6
0.5      115        3.070        3.070               1.8·10^-6
0.5      120        1.757        1.757               3.9·10^-7
1        110        10.315       10.316              1.1·10^-6
1        115        7.534        7.535               1.3·10^-6
1        120        5.408        5.409               2.8·10^-7

Floating strike lookback option, V(c)fl(T,L)
Time T   L          Simulation   Laplace inversion   Error
0.5      1.10       4.800        4.800               1.7·10^-6
0.5      1.15       2.887        2.888               1.8·10^-6
0.5      1.20       1.661        1.661               3.9·10^-7
1        1.10       9.346        9.347               9.8·10^-7
1        1.15       6.869        6.870               1.3·10^-6
1        1.20       4.959        4.960               2.8·10^-7

Table 3.1: Black-Scholes model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation. Parameter values are: μ = 0.01 and σ = 0.2.
Vanilla option, Δ(c)van(T,K) and Θ(c)van(T,K)
Time T   Strike K   Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      90         0.821          5.761          0.822         5.770         1.0·10^-14    0.0
0.5      100        0.570          7.065          0.570         7.074         4.0·10^-15    0.0
0.5      110        0.309          5.830          0.310         5.836         9.5·10^-15    3.0·10^-15
1        90         0.781          4.827          0.781         4.832         9.0·10^-15    0.0
1        100        0.598          5.377          0.599         5.380         5.6·10^-15    0.0
1        110        0.410          4.958          0.410         4.961         4.0·10^-15    3.0·10^-15

Fixed strike lookback option, Δ(c)fix(T,K) and Θ(c)fix(T,K)
Time T   Strike K   Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      110        0.606          11.364         0.606         11.366        1.7·10^-5     3.3·10^-4
0.5      115        0.409          9.067          0.410         9.069         1.2·10^-6     4.8·10^-5
0.5      120        0.262          6.689          0.262         6.690         2.8·10^-7     2.6·10^-5
1        110        0.796          9.517          0.796         9.519         1.7·10^-7     3.4·10^-5
1        115        0.635          8.621          0.635         8.622         1.3·10^-6     6.1·10^-5
1        120        0.495          7.517          0.495         7.519         3.1·10^-7     2.3·10^-5

Floating strike lookback option, Δ(c)fl(T,L) and Θ(c)fl(T,L)
Time T   L          Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      1.10       0.048          10.254         0.048         10.254        1.7·10^-7     2.3·10^-5
0.5      1.15       0.029          8.310          0.029         8.311         1.8·10^-8     1.7·10^-4
0.5      1.20       0.017          6.201          0.017         6.202         3.9·10^-9     2.3·10^-5
1        1.10       0.093          8.133          0.093         8.077         9.8·10^-10    3.4·10^-5
1        1.15       0.069          7.501          0.069         7.472         1.3·10^-9     5.8·10^-5
1        1.20       0.050          6.635          0.050         6.620         2.8·10^-10    6.8·10^-5

Table 3.2: Greeks corresponding to Black-Scholes model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation. Parameter values are: μ = 0.01 and σ = 0.2.
Vanilla option, V(c)van(T,K)
Time T   Strike K   Simulation   StdDev   Fourier inversion   Error
0.5      90         13.072       0.0013   13.083              4.3·10^-9
0.5      100        6.775        0.0010   6.786               6.5·10^-7
0.5      110        2.981        0.0007   2.99                3.3·10^-7
1        90         15.857       0.0018   15.88               1.3·10^-6
1        100        9.985        0.0015   9.994               8.9·10^-7
1        110        5.878        0.0012   5.879               5.6·10^-7

Fixed strike lookback option, V(c)fix(T,K)
Time T   Strike K   Simulation   StdDev   Laplace inversion
0.5      110        5.849        0.0009   5.914
0.5      115        3.66         0.0007   3.716
0.5      120        2.21         0.0006   2.255
1        110        11.483       0.0015   11.603
1        115        8.612        0.0013   8.720
1        120        6.371        0.0012   6.465

Floating strike lookback option, V(c)fl(T,L)
Time T   L          Simulation   StdDev   Laplace inversion
0.5      1.1        5.448        0.0008   5.498
0.5      1.15       3.434        0.0007   3.476
0.5      1.2        2.086        0.0005   2.120
1        1.1        10.329       0.0014   10.425
1        1.15       7.798        0.0012   7.886
1        1.2        5.801        0.0011   5.878

Table 3.3: Merton model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation (for vanilla option only). Parameter values are: μ = 0.01, σ = 0.2, λ = 10.0, δ = 0.025, and ρ = −δ²/2.
• In the Merton model [76], the jumps have a Gaussian distribution with mean ρ and
variance δ². The distribution of Xt can be given explicitly:
\[
\mathbb{P}(X_t \in dx) = e^{-\lambda t} \sum_{k=0}^{\infty} \frac{(\lambda t)^k}{k!}\, \frac{1}{\sqrt{2\pi(\sigma^2 t + k\delta^2)}} \exp\left( -\frac{1}{2}\, \frac{(x - \mu t - k\rho)^2}{\sigma^2 t + k\delta^2} \right) dx. \tag{3.11}
\]
Using this formula the price of the vanilla option can be obtained as a series whose summands
can be expressed in terms of the Black-Scholes formula. For the lookback options we use
simulation as a benchmark. In our inversion approach we approximate the upward jumps by a
phase-type distribution (see Section 2.4), and rely on the results presented in [72]. The results
are presented in Table 3.3, with those obtained by inversion in the columns labelled
‘Fourier inversion’ and ‘Laplace inversion’. The conclusions are similar to those regarding
the Black-Scholes model: a nearly perfect fit for the vanilla option, and highly accurate
performance for the lookback options.
• The distribution of the jump sizes in the Kou model [64] is an asymmetric double exponential distribution, with density
\[ \mathbb{P}(J_i \in dx) = \left( p\lambda_+ e^{-\lambda_+ x} 1_{x>0} + (1-p)\lambda_- e^{-\lambda_- |x|} 1_{x<0} \right), \qquad (3.12) \]
where λ± > 0 and p ∈ [0, 1]. In this case the probability distribution of X_t cannot be given in closed form, and as a result we use in Table 3.4 simulation as a benchmark for both the vanilla and the lookback options. An advantage of this model over the Merton model is that here the jumps are phase type, and therefore the Wiener-Hopf factors are readily evaluated. The performance is comparable to that of the Merton model.
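The Merton density (3.11) is a Poisson mixture of Gaussian densities, which makes it easy to evaluate numerically. The following is a minimal sketch, not the implementation used for the tables: the function names, the truncation of the series after `n_terms` terms and the trapezoidal check are assumptions made here for illustration. It evaluates the density with the parameter values of Table 3.3 and verifies that it integrates to one and has mean μt + λtρ.

```python
import math

def merton_density(x, t, mu, sigma, lam, delta, rho, n_terms=60):
    """Density of X_t as in (3.11), with the Poisson series truncated (assumption)."""
    total = 0.0
    weight = math.exp(-lam * t)            # Poisson probability of k = 0 jumps
    for k in range(n_terms):
        var = sigma ** 2 * t + k * delta ** 2
        mean = mu * t + k * rho
        gauss = math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        total += weight * gauss
        weight *= lam * t / (k + 1)        # Poisson probability of k + 1 jumps
    return total

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals (helper for the check)."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

# Parameter values of Table 3.3, maturity T = 1.
params = dict(t=1.0, mu=0.01, sigma=0.2, lam=10.0, delta=0.025, rho=-0.025 ** 2 / 2)
mass = trapezoid(lambda x: merton_density(x, **params), -2.5, 2.5, 1000)
mean = trapezoid(lambda x: x * merton_density(x, **params), -2.5, 2.5, 1000)
```

Each summand is a Gaussian density, so the vanilla price inherits the same series structure with one Black-Scholes-type term per number of jumps k.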
3.4.3 Infinite activity Lévy processes
In this section we consider Lévy processes with infinite activity. As an example we use
CGMY processes [28], but other infinite-activity models can be dealt with similarly; in Ex-
ample 10 of Chapter 2, for instance, also variance gamma processes are covered.
For the models we described before (Black-Scholes, Merton, Kou) Monte-Carlo simulation is a viable alternative to inversion-based techniques: compound Poisson processes are straightforward to generate, while it is also known how to obtain exact samples from $(X_t, \bar X_t)$ when X_t is a Brownian motion [50]. Monte Carlo methods are problematic for infinite-activity models, however, due to the property of infinitely many jumps in finite time intervals. Techniques have been developed to replace the process' small jumps (say, those in absolute value smaller than ε) by a suitably chosen Brownian motion, whereas the remaining jumps can be described by a compound Poisson process; see the approach presented in Section 2.5, which is theoretically backed by the convergence results in [15].

Vanilla option, V^(c)_van(T, K)
Time T   Strike K   Simulation   StdDev   Fourier inversion
0.5      90         13.361       0.0014   13.358
0.5      100        7.166        0.0011   7.164
0.5      110        3.341        0.0008   3.34
1        90         16.312       0.0019   16.308
1        100        10.532       0.0016   10.529
1        110        6.423        0.0013   6.421

Fixed strike lookback option, V^(c)_fix(T, K)
Time T   Strike K   Simulation   StdDev   Fourier inversion
0.5      110        6.465        0.0010   6.463
0.5      115        4.198        0.0008   4.197
0.5      120        2.65         0.0007   2.649
1        110        12.494       0.0016   12.491
1        115        9.565        0.0015   9.563
1        120        7.241        0.0013   7.239

Floating strike lookback option, V^(c)_fl(T, L)
Time T   L      Simulation   StdDev   Fourier inversion
0.5      1.1    5.988        0.0009   5.986
0.5      1.15   3.912        0.0008   3.911
0.5      1.2    2.482        0.0006   2.481
1        1.1    11.157       0.0015   11.155
1        1.15   8.598        0.0014   8.596
1        1.2    6.546        0.0012   6.545

Table 3.4: Kou model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, p = 1/2, λ− = 39, and λ+ = 40.
Such approximations are useful when devising computational techniques, too: approximating the large positive jumps by a phase-type distribution, we have identified a model in the class R, allowing the evaluation of its Wiener-Hopf factors $\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$, as treated in [72].
Owing to its inherent versatility, the CGMY model is among the most popular models used when modeling asset prices. Particularly when we add a Brownian motion, the six parameters typically allow capturing the process' crucial features. The Lévy measure of the CGMY process is given by
\[ \Pi(dx) = C \frac{e^{-Mx}}{x^{1+Y}} 1_{x>0} + C \frac{e^{-G|x|}}{|x|^{1+Y}} 1_{x<0}, \]
with C, G, M > 0 and Y ∈ [0, 2). The corresponding Lévy exponent reads
\[ \log \mathbb{E} e^{sX_1} = \mu s + \frac{1}{2}\sigma^2 s^2 + C\Gamma(-Y) \left[ (M - s)^Y - M^Y + (G + s)^Y - G^Y \right]. \qquad (3.13) \]
For Monte-Carlo simulation we may 'remove' the jumps smaller than ε, in the sense that we approximate them by a drift and diffusion process such that $\mu_\varepsilon = \int_{-\varepsilon}^{\varepsilon} x \, \Pi(dx)$ and $\sigma_\varepsilon^2 = \int_{-\varepsilon}^{\varepsilon} x^2 \, \Pi(dx)$. As indicated above, the large positive jumps (i.e., those larger than ε) are then approximated by phase-type jumps, so that we have determined an approximative model in R. This technique is dealt with in detail, and thoroughly validated, in Chapter 2; here we follow an alternative approach, cf. [13, Section 2.1], which we describe now.
Consider for ease just the upper tail of the Lévy measure; the lower tail can be dealt with analogously. First realize that
\[ C \frac{e^{-Mx}}{x^{1+Y}} = C e^{-Mx} \int_0^{\infty} \frac{u^Y e^{-ux}}{\Gamma(1+Y)} \, du. \qquad (3.14) \]
Evidently, the above integral can be approximated by a weighted sum of exponential terms,
\[ C \frac{e^{-Mx}}{x^{1+Y}} \approx \sum_{i=1}^{N} c_i (u_i + M) e^{-(u_i + M)x}, \]
where $c_i := C w_i u_i^Y / [(u_i + M)\Gamma(1+Y)]$, with quadrature points $u_i$, and the $w_i$ denoting the corresponding Gaussian weights. When choosing N suitably large, the approximation can be made arbitrarily accurate. Following the same procedure for the lower tail, we have approximated the Lévy process under consideration by a Lévy process in R, for which we can evaluate $\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$.
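The sum-of-exponentials approximation of the upper tail can be sketched numerically as follows. Note the assumption: instead of the Gaussian quadrature rule mentioned in the text, this illustration uses the trapezoidal rule on a logarithmic grid (substituting u = e^z in (3.14)), simply because it needs no external library; the grid parameters `h`, `z_min`, `z_max` and all function names are hypothetical choices. The coefficients c_i are formed exactly as in the definition above.

```python
import math

def log_grid_rule(h=0.5, z_min=-20.0, z_max=7.0):
    """Nodes u_i and weights w_i such that int_0^inf f(u) du ~ sum_i w_i f(u_i).

    Trapezoidal rule in z after substituting u = exp(z), so du = u dz.
    The integrand is negligible at both ends of the grid, so the end
    points are not treated separately.
    """
    n = int(round((z_max - z_min) / h)) + 1
    nodes = [math.exp(z_min + i * h) for i in range(n)]
    weights = [h * u for u in nodes]
    return nodes, weights

def hyperexp_coefficients(C, M, Y, nodes, weights):
    """c_i = C w_i u_i^Y / [(u_i + M) Gamma(1 + Y)], as in the text."""
    g = math.gamma(1.0 + Y)
    return [C * w * u ** Y / ((u + M) * g) for u, w in zip(nodes, weights)]

def approx_upper_tail(x, M, nodes, coeffs):
    """sum_i c_i (u_i + M) exp(-(u_i + M) x)."""
    return sum(c * (u + M) * math.exp(-(u + M) * x) for u, c in zip(nodes, coeffs))

def exact_upper_tail(x, C, M, Y):
    """C exp(-M x) / x^(1 + Y), the CGMY Levy density for x > 0."""
    return C * math.exp(-M * x) / x ** (1.0 + Y)

# Parameter values of Table 3.5: C = 1/2, M = 4, Y = 1/2.
C, M, Y = 0.5, 4.0, 0.5
nodes, weights = log_grid_rule()
coeffs = hyperexp_coefficients(C, M, Y, nodes, weights)
rel_errors = [
    abs(approx_upper_tail(x, M, nodes, coeffs) / exact_upper_tail(x, C, M, Y) - 1.0)
    for x in (0.05, 0.5, 2.0)
]
```

Each term c_i (u_i + M) e^{−(u_i+M)x} is an exponential density with rate u_i + M weighted by the intensity c_i, so the approximating jump part is hyperexponential. The approximation degrades as x ↓ 0, which is why the small jumps are handled separately.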
We now discuss how the CGMY process can be simulated. The above integral representation (3.14) is only valid for x > 0; small jumps have to be treated separately. Assuming that we exclude x ∈ (−ε, ε) for a specific value of ε, we can choose the number of terms in the Gaussian quadrature expansion of the integral so as to obtain the desired accuracy. Then we need to add drift and diffusion terms to the process to compensate for the exclusion of jumps in absolute terms smaller than ε; the corresponding parameters are
\[ \mu_\varepsilon = \int_{-\varepsilon}^{\varepsilon} \left( C \frac{e^{-Mx}}{x^{1+Y}} - \sum_{i=1}^{N} c_i (u_i + M) e^{-(u_i + M)x} \right) x \, dx, \]
and
\[ \sigma_\varepsilon^2 = \int_{-\varepsilon}^{\varepsilon} \left( C \frac{e^{-Mx}}{x^{1+Y}} - \sum_{i=1}^{N} c_i (u_i + M) e^{-(u_i + M)x} \right) x^2 \, dx; \]
these expressions can be evaluated in more explicit terms (but the resulting formulas do not provide any additional insight). The numerical findings can be found in Table 3.5; results for the corresponding Greeks are presented in Table 3.6.
3.4.4 Beta processes
In this last example we consider the situation that X_t follows a Beta process [66, 67]. As this process has small jumps, the idea is to approximate the jumps (in absolute value) smaller than ε by a suitably chosen Brownian motion with drift, as explained in Section 3.2.2 and Section 2.5; we picked ε = 0.05. As we pointed out in Section 3.2.1, the distributions of $\bar X_{\tau(q)}$ and $\underline X_{\tau(q)}$ are given in terms of infinite series; in our experiments we truncated these at 25 terms. In Table 3.7 we present results obtained by ordinary simulation (with the specific ε given above), and by the Wiener-Hopf Monte Carlo method recently developed in [67] (denoted by 'KKPS'); see also the brief summary in Section 2.6.1.
Vanilla option, V^(c)_van(T, K)
Time T   Strike K   Simulation   StdDev   Fourier inversion
0.5      90         16.994       0.0026   16.931
0.5      100        11.349       0.0023   11.276
0.5      110        7.555        0.0020   7.491
1        90         22.347       0.0037   22.253
1        100        17.323       0.0035   17.224
1        110        13.448       0.0032   13.354

Fixed strike lookback option, V^(c)_fix(T, K)
Time T   Strike K   Simulation   StdDev   Laplace inversion
0.5      110        11.894       0.0024   11.737
0.5      115        9.523        0.0023   9.39
0.5      120        7.77         0.0022   7.661
1        110        22.547       0.0038   22.302
1        115        19.567       0.0037   19.34
1        120        17.09        0.0036   16.883

Floating strike lookback option, V^(c)_fl(T, L)
Time T   L      Simulation   StdDev   Laplace inversion
0.5      1.1    11.75        0.0022   11.717
0.5      1.15   9.468        0.0021   9.44
0.5      1.2    7.665        0.0020   7.643
1        1.1    20.172       0.0034   20.135
1        1.15   17.645       0.0033   17.612
1        1.2    15.454       0.0032   15.424

Table 3.5: CGMY model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, C = 1/2, G = 3, M = 4, and Y = 1/2.
Vanilla option, Δ^(c)_van(T, K) and Θ^(c)_van(T, K)
Time T   Strike K   Δ Simulation   Δ StdDev    Θ Simulation   Θ StdDev   Δ Inversion   Θ Inversion
0.5      90         0.736          6.0·10^−5   11.775         0.0146     0.736         11.768
0.5      100        0.579          6.4·10^−5   13.356         0.0136     0.579         13.348
0.5      110        0.424          6.4·10^−5   12.941         0.0124     0.424         12.938
1        90         0.714          7.0·10^−5   8.848          0.0158     0.714         8.832
1        100        0.611          7.3·10^−5   9.602          0.0151     0.611         9.587
1        110        0.51           7.4·10^−5   9.709          0.0144     0.51          9.695

Fixed strike lookback option, Δ^(c)_fix(T, K) and Θ^(c)_fix(T, K)
Time T   Strike K   Δ Simulation   Δ StdDev    Θ Simulation   Θ StdDev   Δ Inversion   Θ Inversion
0.5      110        0.779          6.7·10^−5   24.357         0.0106     0.781         24.38
0.5      115        0.635          7.0·10^−5   22.474         0.0104     0.636         22.501
0.5      120        0.516          7.0·10^−5   20.3           0.0101     0.517         20.323
1        110        0.99           7.3·10^−5   19.863         0.0109     0.992         19.861
1        115        0.879          7.9·10^−5   19.245         0.0108     0.88          19.246
1        120        0.781          8.2·10^−5   18.453         0.0107     0.782         18.455

Floating strike lookback option, Δ^(c)_fl(T, L) and Θ^(c)_fl(T, L)
Time T   L      Δ Simulation   Δ StdDev    Θ Simulation   Θ StdDev   Δ Inversion   Θ Inversion
0.5      1.1    0.117          2.2·10^−5   19.972         0.0141     0.117         19.977
0.5      1.15   0.094          2.1·10^−5   18.781         0.0135     0.094         18.794
0.5      1.2    0.076          2.0·10^−5   17.203         0.0128     0.076         17.216
1        1.1    0.201          3.4·10^−5   14.428         0.0160     0.201         14.412
1        1.15   0.176          3.3·10^−5   14.314         0.0156     0.176         14.303
1        1.2    0.154          3.2·10^−5   13.994         0.0152     0.154         13.986

Table 3.6: Greeks corresponding to CGMY model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, C = 1/2, G = 3, M = 4, and Y = 1/2.
Vanilla option, V^(c)_van(T, K)
Time T   Strike K   Simulation   StdDev   Simulation – KKPS   Inversion
0.5      90         17.449       0.0035   18.671              17.455
0.5      100        12.325       0.0033   13.52               12.33
0.5      110        8.845        0.0031   10.054              8.85
1        90         23.133       0.0051   25.057              23.142
1        100        18.577       0.0049   20.476              18.585
1        110        15.038       0.0047   16.929              15.046

Fixed strike lookback option, V^(c)_fix(T, K)
Time T   Strike K   Simulation   StdDev   Simulation – KKPS   Inversion
0.5      110        13.941       0.0036   15.403              14.516
0.5      115        11.638       0.0035   13.127              12.191
0.5      120        9.88         0.0034   11.393              10.395
1        110        25.661       0.0056   28.103              27.251
1        115        22.794       0.0055   25.252              24.302
1        120        20.385       0.0054   22.858              21.808

Floating strike lookback option, V^(c)_fl(T, L)
Time T   L      Simulation   StdDev   Simulation – KKPS   Inversion
0.5      1.1    12.403       0.0032   13.134              12.881
0.5      1.15   10.391       0.0031   11.251              10.861
0.5      1.2    8.84         0.0031   9.802               9.284
1        1.1    21.37        0.0049   22.505              22.589
1        1.15   19.083       0.0048   20.327              20.255
1        1.2    17.134       0.0048   18.473              18.253

Table 3.7: Beta model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, α1 = α2 = 1, β1 = β2 = 4, λ1 = λ2 = 3/2 and c1 = c2 = 2.
3.5 Discussion
In this section we reflect on the computational effort required by our algorithm, and discuss possible extensions to alternative exotic options.
3.5.1 Remarks on computational effort
The computational effort incurred by the numerical Laplace inversion [36, 37] is of the order M log M, if function values f(kΔ), for k = 0, 1, . . . , M − 1, are to be determined. As quantified in detail in [36, 37], the associated computation time is typically extremely low, often negligible relative to other components of the algorithm. More specifically, when approximating the Lévy process under consideration by a Lévy process in R or M, our algorithm requires us to compute roots, which can be time consuming. For a more detailed account of this issue we refer to Section 2.7.2, where it is also described how to accelerate this search, relying on ideas proposed in e.g. [66]. Note also that for Beta processes we have information on the location of the roots, as mentioned earlier in this chapter; relatively straightforward bisection procedures can be applied there.
To give an impression of the computational gain with respect to simulation-based estimation, consider the numerical experiments performed for the Black-Scholes model. On a standard PC, generation of 10^5 samples (essentially consisting of pairs $(X_T, \bar X_T)$) took about 2 seconds per instance, while generating the full table (i.e., several maturities and strikes) for the vanilla option by Fourier inversion took less than 0.1 s, and generating the full table for the lookback options by double Laplace inversion about 1.1 s. For the other models (Kou, Merton, CGMY, Beta) the comparison is even more in favor of the inversion-based methods, for the reason that the time needed to generate a single sample is now proportional to the maturity T (while the computational effort associated with the inversion method is independent of T).
3.5.2 Other exotic options
Various other exotic options can be dealt with fully analogously. Consider, for example, the variable notional call and put options [79, Section 3.2], with payoffs
\[ P^{(c)}_{vn}(T, L) := \frac{(S_T - L \underline S_T)^+}{\underline S_T}, \qquad P^{(p)}_{vn}(T, L) := \frac{(L \bar S_T - S_T)^+}{\bar S_T}, \]
respectively; for the call (put) option we assume L ≥ 1 (L ≤ 1, respectively), so as to avoid the less interesting situation of a certainly positive payoff. The joint transforms (with respect to the maturity time T as well as ℓ = log L) and corresponding Greeks can be determined as in Section 3.3.3, cf. [79, Prop. 3.4]. Other options whose payoff can be written in terms of $S_T$, $\bar S_T$, and $\underline S_T$, T, and a single other parameter (e.g., the K or L featuring in lookback options) result in explicit expressions (in terms of $\bar\kappa(\alpha, q)$, $\underline\kappa(\alpha, q)$, and $\mathcal{K}(\alpha, q)$) for the double transform of the option price, and can be analyzed and numerically evaluated in a similar fashion.
Barrier options. So far all transforms considered were single transforms (vanilla option) or double transforms (lookback options), for which the inversion techniques of [36, 37] were feasible; we remark, however, that the inversion routines for double transforms, while still being fast and accurate, were already substantially slower and less accurate than those for single transforms. It is noted that barrier options, and their associated Greeks, can be analyzed in terms of triple transforms, as can be seen as follows.
Consider for instance the so-called Up-and-In barrier call option, where it is remarked that the other flavors (Up-and-Out, Down-and-In, Down-and-Out, and the put variants) can be analyzed fully analogously. The Up-and-In barrier call option has payoff
\[ P^{(c)}_{uib}(T, K, H) := (S_T - K)^+ \, 1\{\bar S_T \ge H\}; \]
we are interested in the more challenging case that max{S_0, K} < H (noting that if this condition is not fulfilled the barrier is reached whenever the payoff is positive, so that the option reduces to a vanilla call). Putting k := log(K/S_0) and h := log(H/S_0), we wish to evaluate
\[ V^{(c)}_{uib}(T, K, H) := \mathbb{E}\left[ e^{-rT} P^{(c)}_{uib}(T, K, H) \right]. \]
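Now let $V^{(c)}_{uib}(\vartheta, \alpha, \beta)$ be the transform with respect to k, h, and T:
\[ S_0 \int_0^{\infty} \vartheta e^{-(r+\vartheta)T} \int_{-\infty}^{\infty} \int_0^{\infty} e^{i\alpha k} e^{-\beta h} \int_{y=k}^{\infty} \int_{x=h}^{\infty} (e^y - e^k) \, \mathbb{P}(\bar X_T \in dx, X_T \in dy) \, dh \, dk \, dT. \]
This expression reduces, when interchanging the order of integration, to
\[ \frac{S_0}{i\alpha(i\alpha+1)\beta} \int_0^{\infty} \vartheta e^{-(r+\vartheta)T} \int_{y=-\infty}^{\infty} \int_{x=0}^{\infty} e^{(i\alpha+1)y} (1 - e^{-\beta x}) \, \mathbb{P}(\bar X_T \in dx, X_T \in dy) \, dT \]
\[ = \frac{S_0}{i\alpha(i\alpha+1)\beta} \, \frac{\vartheta}{r+\vartheta} \, \mathbb{E}\left( e^{(i\alpha+1) X_{\tau(r+\vartheta)}} \left( 1 - e^{-\beta \bar X_{\tau(r+\vartheta)}} \right) \right), \]
which can be expressed in terms of the functions $\bar\kappa(\alpha, q)$, $\underline\kappa(\alpha, q)$, and $\mathcal{K}(\alpha, q)$ (as we did for the floating strike lookback option):
\[ \frac{S_0 \vartheta}{r+\vartheta} \left( \frac{\mathcal{K}(-i\alpha-1, r+\vartheta) - \bar\kappa(-i\alpha-1+\beta, r+\vartheta) \, \underline\kappa(-i\alpha-1, r+\vartheta)}{i\alpha(i\alpha+1)\beta} \right). \]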
The Greeks can be characterized (in terms of transforms) as before.
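The interchange of the order of integration used for the barrier transform rests on two elementary integrals, spelled out here as a sketch. It assumes Im α < 0 (so that the k-integral converges at −∞) and β ≠ 0, conditions that are not stated explicitly above; x denotes the value of the running maximum and y the terminal value, so that h ranges over [0, x] and k over (−∞, y]:

\[ \int_0^{x} e^{-\beta h} \, dh = \frac{1 - e^{-\beta x}}{\beta}, \qquad \int_{-\infty}^{y} e^{i\alpha k} (e^y - e^k) \, dk = \frac{e^{(i\alpha+1)y}}{i\alpha} - \frac{e^{(i\alpha+1)y}}{i\alpha+1} = \frac{e^{(i\alpha+1)y}}{i\alpha(i\alpha+1)}. \]

Multiplying the two results gives the prefactor 1/(iα(iα+1)β) and the integrand e^{(iα+1)y}(1 − e^{−βx}); the remaining integral over T against ϑe^{−(r+ϑ)T} equals ϑ/(r+ϑ) times an expectation at an independent exponential time with rate r + ϑ.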
Barrier option prices are, however, significantly harder to evaluate than lookback option prices, as the transform to be inverted is threefold. In this case, the inversion techniques of [36, 37] become slow and less accurate. Adapting the inversion techniques to facilitate triple inversion is a topic for further research. For the class of generalized hyperexponential Lévy processes, [58] shows that the transform with respect to the maturity time T (for given K and H) can be explicitly calculated; hence, for this class of Lévy processes just a single-dimensional inversion is needed.
3.6 Concluding Remarks
This chapter proposes and validates a technique for pricing lookback options driven by a
general exponential Lévy model. The main idea is that we approximate the Lévy process
under investigation by a Lévy process for which the Wiener-Hopf factorization can be done
in (semi-)explicit terms; this approximation can be in principle as accurate as needed (of
course at the expense of an increase in computation time). With the Wiener-Hopf factors
being known, the option prices as well as corresponding Greeks can be evaluated by per-
forming (potentially multi-dimensional) Fourier and Laplace inversion.
We have thoroughly tested the proposed algorithm, by considering a broad range of driving Lévy processes and varying the parameter values. The procedure consistently yielded fast and accurate results.
Chapter 4
Asymptotics of the supremum of a Lévy process
In the previous chapters we have studied numerical techniques for evaluating the distribution of the running supremum attained by a Lévy process, and applied them to option pricing. In this chapter we discuss an alternative evaluation technique: we determine the corresponding tail asymptotics, and present an importance-sampling-based fast simulation method.
4.1 Introduction
Consider the Lévy process X ≡ (X_t)_{t≥0}, and define Q := sup_{t≥0} X_t as its all-time supremum. The object of study in this chapter is the asymptotic behavior of P(Q > u), as u → ∞. In particular, we present a short derivation of the result that, under a condition that ascertains that the underlying Lévy process is light-tailed (in a sense specified below), there is a constant ω > 0 such that P(Q > u)e^{ωu} converges to a positive constant, as u → ∞. This result, which can be considered as the continuous-time counterpart of Cramér's seminal result for random walks, was first established in [22]; see also [68, Section VII.2]. The first contribution of this note is a novel, insightful, compact derivation of the above asymptotics, essentially relying on the so-called second factorization identity [82]. The second contribution concerns a fast and efficient scheme for evaluating P(Q > u) for u large, relying on importance sampling. The applicability of this procedure is demonstrated in a number of numerical experiments.
4.2 Asymptotics
Let ζ(ϑ) be the Lévy exponent associated with the process (X_t)_{t≥0}, in that E e^{ϑX_t} = exp(tζ(ϑ)). We assume that the process has a negative drift, i.e., E X_1 < 0, and that 0 is regular for (0,∞) [68, Def. 6.4]. The process (X_t)_{t≥0} is light-tailed in the sense that the Cramér condition holds:
\[ \exists \, \omega \in (0, \infty) : \zeta(\omega) = 0; \]
in the sequel we refer to this root simply by ω.
We now consider the ω-twisted version of the Lévy process (X_t)_{t≥0}; that is, we associate a measure ℚ with the process (X_t)_{t≥0} so that dℚ(X_t ≤ x) = e^{ωx} dP(X_t ≤ x). This twisted version is again a Lévy process, and has Lévy exponent ζ(ϑ + ω), in that
\[ \mathbb{E}_{\mathbb{Q}} e^{\vartheta X_t} = \exp(t \zeta(\vartheta + \omega)); \]
E_ℚ(·) denotes expectation under ℚ. For an explicit description of (X_t)_{t≥0} under ℚ, we refer to, e.g., [10, Thm. XIII.3.4]; the drift is adjusted, the Brownian term remains unchanged, while the jumps are exponentially twisted (with 'twist' ω) with an adapted arrival rate.
Define σ(u) := inf{t ≥ 0 : X_t > u}. Then a standard change-of-measure argument yields
\[ \mathbb{P}(Q > u) = \mathbb{E}_{\mathbb{Q}} e^{-\omega X_{\sigma(u)}}, \]
see e.g. [10, Eqn. (XIII.5.2)] or [22, Remark 2]. Now decompose X_{σ(u)} into the level u and the non-negative random quantity B_u := X_{σ(u)} − u, to be interpreted as the overshoot over level u. We thus obtain
\[ \lim_{u \to \infty} \mathbb{P}(Q > u) e^{\omega u} = \lim_{u \to \infty} \mathbb{E}_{\mathbb{Q}} e^{-\omega B_u}, \]
provided these limits are well-defined.
Now define the following function:
\[ \kappa(\alpha) := \exp\left( -\int_0^{\infty} \int_{(0,\infty)} \frac{1}{t} (1 - e^{-\alpha x}) \, \mathbb{P}(X_t \in dx) \, dt \right), \]
and κ_ℚ(α) its counterpart under ℚ. The second factorization identity, due to [82] (see also [68, Exercise 6.7]), entails that under the stated assumptions
\[ \int_0^{\infty} e^{-\beta x} \, \mathbb{E}\left( e^{-\gamma (X_{\sigma(x)} - x)} 1\{\sigma(x) < \infty\} \right) dx = \frac{1}{\beta - \gamma} \left( 1 - \frac{\kappa(\beta)}{\kappa(\gamma)} \right). \qquad (4.1) \]
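Realize that E_ℚ X_1 = ζ′(ω) > 0, and hence σ(u) < ∞ almost surely for all u > 0 under ℚ. Thus, relying on (4.1), we have that
\[ \lim_{u \to \infty} \mathbb{E}_{\mathbb{Q}} e^{-\omega B_u} = \lim_{\beta \downarrow 0} \frac{\beta}{\beta - \omega} \left( 1 - \frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(\omega)} \right). \qquad (4.2) \]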
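From
\[ \begin{aligned} \kappa_{\mathbb{Q}}(\beta) &= \exp\left( -\int_0^{\infty} \int_{(0,\infty)} \frac{1}{t} \left( 1 - e^{-\beta x} \right) e^{\omega x} \, \mathbb{P}(X_t \in dx) \, dt \right) \\ &= \exp\left( -\int_0^{\infty} \int_{(0,\infty)} \frac{1}{t} \left( \left( 1 - e^{-(\beta - \omega)x} \right) - (1 - e^{\omega x}) \right) \mathbb{P}(X_t \in dx) \, dt \right) = \frac{\kappa(\beta - \omega)}{\kappa(-\omega)}, \end{aligned} \]
it is immediate that
\[ \frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(\omega)} = \frac{\kappa(\beta - \omega)}{\kappa(0)}. \]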
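As is argued in [68, p. 188], ℓ(β) := 1/κ(β) → 0 as β ↓ −ω. It follows that (4.2) equals
\[ \frac{1}{\omega \kappa(0)} \lim_{\beta \downarrow 0} \frac{\beta}{\ell(\beta - \omega)} = C := \frac{1}{\omega \kappa(0)} \, \frac{1}{\ell'(-\omega)}. \]
The final result is given in the theorem below.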
Theorem 3. As u → ∞,
\[ \mathbb{P}(Q > u) e^{\omega u} \to C. \qquad (4.3) \]
The right hand side of (4.3) is understood as 0 when ℓ′(−ω) = ∞. It is also remarked that
the assumption that 0 be regular for (0,∞) rules out that (Xt)t≥0 is a Poisson process whose
Lévy measure is lattice. It is easily seen what happens if this condition is lifted; one can then
just consider the process at (or, more precisely, immediately after) the jump epochs of the
Poisson process, thus reducing the problem to that of the supremum attained by a discrete-
time random walk [41, p. 393]. For more reflections on the assumptions imposed, see [22,
Remark 1] and [68, p. 186].
4.3 Importance sampling
The result of the previous section suggests approximating P(Q > u) by Ce^{−ωu}. There are, however, two practical objections to this procedure.
(i) In the first place, Thm. 3 does not provide us with any error bounds, and in light of
this we do not have any insight in the error made for a specific value of u.
(ii) In the second place, this approximation requires us to evaluate the constant C, which
can be problematic. In case the jumps are one-sided, κ(·) can be explicitly expressed
in terms of ζ(·) [68, Section 6.5], and hence C can be computed, but this is not possible
when there are jumps in both directions. A next idea could be to evaluate κ(·) numer-
ically. Realize, however, that in many cases the Lévy process under study is given in
terms of the Lévy exponent ζ(·) only; no explicit expression for P(Xt ∈ dx) is available,
and therefore evaluation of κ(0) and ℓ′(−ω) is not straightforward.
In this section we describe how such complications can be remedied, relying on a rare-event
simulation technique; cf. [12, Section XII.6a].
It is known that estimation of small probabilities by (naïve) simulation is time consuming. Suppose our objective is to obtain, at a given confidence level, an estimate whose confidence interval half-width is smaller than a given fraction of the estimate itself (for instance 10%). Then the number of independent runs needed is roughly inversely proportional to the probability of interest [12, Section VI.1]. This problem can be solved by simulating under another measure than the original one, a technique usually referred to as importance sampling [12, Section V.1]; by weighing the simulation output by appropriately defined likelihood ratios, an unbiased estimator is obtained, while the corresponding variance may be reduced.
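The inverse proportionality can be made explicit with a short computation. This is a sketch under the assumption that the naïve estimator is the fraction of N independent runs in which the event of interest occurs; p denotes the probability of interest, t the normal quantile belonging to the confidence level, and ε the target ratio between half-width and estimate (t and ε are the symbols used further on in this section). With p_N the naïve estimator,

\[ \mathrm{Var}\, p_N = \frac{p(1-p)}{N}, \qquad t \sqrt{\frac{p(1-p)}{N}} \le \varepsilon p \iff N \ge \frac{t^2}{\varepsilon^2} \, \frac{1-p}{p}, \]

so that for small p the required number of runs grows essentially like 1/p.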
The starting point of the procedure is the representation P(Q > u) = E_ℚ e^{−ωX_{σ(u)}}. The idea is to simulate in each run the Lévy process under ℚ until u has been exceeded (which happens with probability 1), to record the value of X_{σ(u)}, and to compute e^{−ωX_{σ(u)}}. In self-evident notation, the estimator based on N independent runs becomes
\[ \alpha_N(u) := \frac{1}{N} \sum_{n=1}^{N} e^{-\omega X^{(n)}_{\sigma(u)}}. \]
To analyze the performance of this estimator, realize that
\[ \mathrm{Var}\, \alpha_N(u) = \frac{1}{N} \left( \mathbb{E}_{\mathbb{Q}} e^{-2\omega X_{\sigma(u)}} - \left( \mathbb{E}_{\mathbb{Q}} e^{-\omega X_{\sigma(u)}} \right)^2 \right). \]
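Using the methodology developed in the previous section, we obtain that
\[ \lim_{u \to \infty} \mathbb{E}_{\mathbb{Q}} e^{-2\omega X_{\sigma(u)}} e^{2\omega u} = \lim_{\beta \downarrow 0} \frac{\beta}{\beta - 2\omega} \left( 1 - \frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(2\omega)} \right) = \frac{1}{2\omega \kappa(\omega)} \, \frac{1}{\ell'(-\omega)}. \]
As a consequence,
\[ \lim_{u \to \infty} \left( \mathrm{Var}\, \alpha_N(u) \right) e^{2\omega u} = \frac{\bar C}{N}, \quad \text{where} \quad \bar C := \frac{1}{2\omega \kappa(\omega)} \, \frac{1}{\ell'(-\omega)} - \left( \frac{1}{\omega \kappa(0)} \, \frac{1}{\ell'(-\omega)} \right)^2. \]
Suppose that runs are sampled until the ratio of the confidence interval half-width and the estimator drops below ε; we use 'confidence value' t (e.g., for a 95% confidence level the value of t is roughly 1.96). As a consequence, for large u the minimum value of N_u should fulfil
\[ t \sqrt{\frac{\bar C e^{-2\omega u}}{N_u}} \le \varepsilon C e^{-\omega u}. \]
From the above the following claim follows.
Corollary 1. As u → ∞,
\[ N_u \to t^2 \bar C / (\varepsilon^2 C^2). \]
In other words, the number of runs needed is hardly affected by the value of the exceedance level u. It thus follows that the proposed procedure has bounded relative error [12, Eqn. VI.(1.2)].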
Example 1. In this case (X_t)_{t≥0} corresponds to the superposition of a negative drift and a compound Poisson process with standard Normal jumps. Notice that the resulting Lévy process is spectrally two-sided. We have, for some d, λ > 0,
\[ \zeta(\vartheta) = -d\vartheta + \lambda \left( e^{\vartheta^2/2} - 1 \right), \]
with E X_1 = ζ′(0) = −d < 0. Let ω solve ζ(ω) = 0; it can be checked that the ω-twisted version of the Lévy process corresponds with a compound Poisson process with arrival rate λe^{ω²/2} = λ + dω and jumps that are Normally distributed with mean ω and variance 1. In Table 4.1 we display, for various values of u, estimates of our target probability P(Q > u), as obtained by direct simulation (that is, under the original measure P; denoted by p_N(u)) and by importance sampling (denoted by α_N(u)). We have continued simulating until the ratio of the confidence interval half-width and the estimator drops below ε = 1% (taking t = 1.96); N_u is the number of runs needed. We also present the CPU time needed.

Naïve Simulation
Level u   p_N(u)     N_u         CPU time (sec.)
0.5       0.581416   110,630     97
1         0.468690   174,196     239
2         0.293742   369,464     1,061
5         0.071168   2,005,515   30,596

Importance Sampling Simulation
Level u   α_N(u)         N_u     CPU time (sec.)
0.5       0.582281       8,577   0.58
1         0.467459       8,009   0.51
2         0.294591       7,622   0.49
5         0.071485       7,840   0.55
10        6.70 × 10^−3   7,837   0.64
20        5.95 × 10^−5   7,710   0.79
50        4.12 × 10^−11  7,991   1.36
100       2.26 × 10^−21  7,820   2.15

Table 4.1: Simulation results corresponding to Example 1. Parameters related to the Compound Poisson process: drift −d = −0.25 (i.e., d = 0.25) and λ = 1. Decay rate: ω = 0.47260. Precision/confidence: ε = 1%, t = 1.96.
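The procedure of Example 1 can be sketched in a few lines. This is an illustration under stated assumptions, not the code used to produce Table 4.1: the function names, the bisection bracket, the number of runs and the seed are choices made here. It uses that ζ(ϑ + ω) = −dϑ + λe^{ω²/2}(e^{ωϑ+ϑ²/2} − 1) when ζ(ω) = 0, so under the twisted measure the drift remains −d, the arrival rate becomes λe^{ω²/2}, and the jumps are Normal with mean ω and variance 1. Because the drift is negative, the level u can only be crossed at a jump epoch.

```python
import math
import random

def zeta(theta, d, lam):
    """Levy exponent of Example 1: drift -d plus compound Poisson with N(0,1) jumps."""
    return -d * theta + lam * (math.exp(theta ** 2 / 2.0) - 1.0)

def cramer_root(d, lam, lo=1e-6, hi=5.0, tol=1e-12):
    """Positive root of zeta by bisection (the bracket [lo, hi] is an assumption).

    zeta is convex with zeta(0) = 0 and zeta'(0) = -d < 0, so it is negative
    just to the right of 0 and has exactly one positive root.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if zeta(mid, d, lam) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def importance_sampling_estimate(u, d, lam, n_runs, rng):
    """Average of exp(-omega * X_sigma(u)) over n_runs paths simulated under Q."""
    omega = cramer_root(d, lam)
    lam_q = lam * math.exp(omega ** 2 / 2.0)   # twisted arrival rate, equals lam + d * omega
    total = 0.0
    for _ in range(n_runs):
        x = 0.0
        while x <= u:
            x -= d * rng.expovariate(lam_q)    # drift until the next jump epoch
            x += rng.gauss(omega, 1.0)         # twisted jump: Normal(omega, 1)
        total += math.exp(-omega * x)          # likelihood ratio at the crossing
    return total / n_runs

omega = cramer_root(0.25, 1.0)
estimate = importance_sampling_estimate(5.0, 0.25, 1.0, 20000, random.Random(1))
```

With d = 0.25 and λ = 1 the root agrees with the decay rate ω = 0.47260 reported in Table 4.1, and the estimate for u = 5 can be compared with the value of α_N(5) listed there. The work per run grows only linearly in u, since the twisted process drifts upward at rate ζ′(ω) > 0.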
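Example 2. In this numerical example (X_t)_{t≥0} is a Variance Gamma process, or, equivalently, the difference between two Gamma processes (which is spectrally two-sided, too). The Lévy exponent is given by
\[ \zeta(\vartheta) = \beta \log\left( \frac{\alpha_1}{\alpha_1 - \vartheta} \right) + \beta \log\left( \frac{\alpha_2}{\alpha_2 + \vartheta} \right). \]
It is readily checked that
\[ \mathbb{E} X_1 = \zeta'(0) = \frac{\beta}{\alpha_1} - \frac{\beta}{\alpha_2}, \]
which is assumed to be negative (i.e., α1 > α2 > 0). Again, ω solves ζ(ω) = 0; the ω-twisted process is Variance Gamma as well, but now with Lévy exponent
\[ \beta \log\left( \frac{\alpha_1 - \omega}{\alpha_1 - \vartheta - \omega} \right) + \beta \log\left( \frac{\alpha_2 + \omega}{\alpha_2 + \vartheta + \omega} \right). \]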
In [12, Ch. XII] it is pointed out how a Variance Gamma process can be simulated; we do so by replacing all jumps smaller than ε = 0.05 by a Brownian motion with σ_ε = 0.0345, while the bigger jumps (both upward and downward) now correspond with a Compound Poisson process (with intensities λ+ and λ−, respectively). In Table 4.2 we display, for various values of u, estimates of P(Q > u), as obtained by direct simulation and by importance sampling; again ε = 1% and t = 1.96. Again we continue simulating until a precision of 1% is achieved, and record the number of runs N_u needed, as well as the CPU time.

Naïve Simulation
Level u   p_N(u)     N_u         CPU time (sec.)
0.1       0.360057   273,112     603
0.2       0.285182   385,165     1,185
0.5       0.155404   835,137     5,626
1         0.063159   2,279,294   40,521

Importance Sampling Simulation
Level u   α_N(u)         N_u       CPU time (sec.)
0.1       0.302785       101,312   78.19
0.2       0.263005       99,969    76.07
0.5       0.160835       101,764   79.08
1         0.067980       107,966   89.09
2         0.012298       114,929   101.53
5         7.75 × 10^−5   122,577   116.42
10        1.76 × 10^−8   123,915   121.44
20        9.09 × 10^−16  125,771   129.14
50        1.30 × 10^−37  124,446   139.28
100       4.98 × 10^−74  125,365   162.77

Table 4.2: Simulation results corresponding to Example 2. Parameters of the Variance Gamma process: d = −0.25, β = 1, α1 = 2, and α2 = 1. Parameters related to the approximation of the Variance Gamma process by the sum of a Brownian motion and a Compound Poisson process: ε = 0.05, σ_ε = 0.0345, λ+ = 0.9115, λ− = 1.2339. Decay rate: ω = 1.6770. Precision/confidence: ε = 1%, t = 1.96.
Chapter 5
Energy-Efficient Scheduling in Multi-Core Servers
Previous chapters concentrated on computational issues related to Lévy fluctuation theory.
This chapter discusses an entirely different context: the use of Markov-fluid models to de-
sign service strategies in multi-core servers.
5.1 Introduction
Today, information and communication technology (ICT) accounts for about 2% of global
CO2 emissions [26, 51], which is about the same as the emissions of the entire aviation indus-
try [26]. However, ICT emissions are expected to almost double by 2020 [26]. Considering
the ICT energy use more closely, it is envisaged that more than half of it is likely to be due to
data centers. The Green Data Project [53] determined that 37% of data center utility power is
consumed by data storage equipment, 23% by networking equipment and 40% by servers.
Therefore, there has been a great interest in exploring efficient mechanisms for managing
and optimizing CPU energy consumption.
Manufacturers of semiconductor chips, as well as the servers that use them, are already
increasing the computing throughput per watt in such a way that it roughly doubles every
two years. The semiconductor industry is continuing to harness performance gains through
Moore’s Law by developing multi-core processors [54]. The utilization of data centers can
be remarkably low, e.g. 10% [52], due to various reasons including uneven application fit,
risk management, and uncertainty in demand forecasts. Interestingly, servers in data centers
tend to spend most of their time at low utilization; in addition, it should be realized that many
applications can tolerate some delay. All these considerations clearly open up opportunities
for developing strategies to adapt the processor speeds (that is, turning off or slowing down
processors when possible), in order to achieve substantial energy savings.
Energy-aware processors can achieve energy-proportional computing [19] by utilizing speed
scaling, so as to adapt the ‘speed’ of the server CPU to the processing load and the service
performance requirements. Speed scaling is enabled by dynamic voltage/frequency scaling
(DVFS) [45] to decrease the supply voltage and the clock rate. Speed scaling designs can be
highly sophisticated (adapting the speed at all times to the current state; this is dynamic speed
scaling), or very simple (running at a given static speed except when idle, to balance energy
and performance).
The growing adoption of speed scaling designs has spurred quantitative research into the
topic. The analytic study of the speed scaling problem began with Yao et al. [97] about
two decades ago. Subsequent studies considered three main performance objectives that
balance energy and performance: (i) minimize the total energy used in order to meet per-
formance requirements, e.g., [2, 85]; (ii) optimize performance (e.g., minimize delay) given
an energy/power budget, e.g., [24, 86]; and (iii) optimize a linear combination of expected
performance and energy usage [3, 95, 96].
The study presented in this chapter is of type (iii); this objective is appropriate in Web
settings since typically neither (strict) job completion deadlines nor fixed energy budgets
apply. In our work we choose to consider an objective function which is a linear combination
of energy usage, queueing cost (reflected by the delay), and speed switching costs;
depending on the specific situation at hand, the switching costs, not incorporated in much of
the prior work, may affect the quantitative analysis. The processor we consider is multi-core,
where each core can in principle be set at a different speed; however, practical implementation
considerations often require that all cores run at equal speed. Our analysis captures
the processor’s dynamic power as well as its static power; the former is frequency-dependent
and is controlled by changing the frequency and the voltage of the processor using the DVFS
technology, whereas the latter is frequency-independent and not controlled by DVFS. It is
noted that the static power is becoming significant as transistors are getting smaller and
faster [70]. Our feature-rich objective function, along with the enabling mechanisms that we
propose, provides an approach for studying and optimizing
the server energy and performance tradeoffs. Importantly, these enabling mechanisms for
adjusting the number of active cores and their speeds are based on quantities that can be
easily observed (viz. the queue’s buffer contents).
Our main contribution is that we propose a stochastic fluid model for the analysis and opti-
mization of multi-core processing systems. An important distinction between our work and
previous studies lies in the modelling assumptions: whereas previous papers use queues
with Poisson arrivals, our framework is more realistic in that the burstiness of the arrivals
of processing jobs is captured by relying on on-off fluid processes. In our setup we have the
freedom to choose appropriate distributions for the corresponding on and off periods; we
can pick exponential distributions (with a coefficient of variation (CoV) equaling 1), or dis-
tributions that are more regular (CoV < 1) or more variable (CoV > 1) than the exponential
distribution. The arrival process can be easily extended beyond on-off processes; then the
fluid rate can take any one of multiple (more than two) values according to a modulating
process. The fluid model is appropriate when the processors are fast and job arrivals are
bursty and of high rate. Stochastic fluid models have been successfully used in prior studies
of data packet queueing and transmission over communication links [4, 39, 74, 89], as they
preserve the structural properties of the modeled system, while their solution remains relatively
simple. As a result, the framework allows us to analyze a broad range of strategies for
adapting the multi-core server speeds, all of which attempt to optimize objective functions
which balance energy consumption and performance. The strategies studied differ in the
number of buffer threshold levels used, but also in the dependence of the processing speed
on the specific threshold that is crossed (as well as the direction in which it is crossed, i.e., if
the policy is hysteretic).
The more specific contributions of our work are the following. We first discuss several
schemes that are intended to reduce energy consumption in multi-core processors (i.e., processors
with multiple CPUs) or multiple servers. Essentially there are two classes of strategies:
those in which the service rate depends just on the current buffer content in relation
to the values of a set of thresholds, and those in which it also matters in what direction the
thresholds are crossed (hysteretic control). We propose to rely on stochastic fluid theory to
evaluate the performance of the underlying queueing systems. Then we introduce a family
of cost functions that incorporates a tunable trade-off between the power consumption of
the servers and the quality-of-service. The power consumption cost has three components:
(i) static power, (ii) dynamic power, which is a function of the processing speed of each server,
and (iii) power consumption due to switching between processing speed levels [25, 43]. The
quality-of-service is a function of the queueing delay of processing jobs waiting for service.
In order to implement the service strategies the system design parameters (service rates,
buffer thresholds) should be chosen so as to minimize the cost function; to this end we ap-
ply a conjugate gradient method. We then present the results of our numerical experiments,
in which we compare the various service strategies. Importantly, we quantify the benefits
(in terms of our cost function) of more versatile service strategies — this enables us to eval-
uate the benefit of ‘richer’ strategies (e.g., with many thresholds) over ‘simpler’ strategies
(e.g., a static policy, or the ‘sleep mode strategy’, which reduces the service rate only when
the queue is empty). Evidently, strategies which use more threshold levels are more effi-
cient in terms of power consumption; however, for a reasonable switching overhead of the
processing speed, the efficiency gain quickly diminishes beyond a few thresholds.
A seeming drawback of the service strategies proposed is that the optimal tuning of the pa-
rameters requires knowledge of the statistical properties of the system’s input traffic. While
it is obviously formally true that perturbations of the input model lead to non-optimal pa-
rameter settings, extensive numerical experiments reveal that this sensitivity is remarkably
weak. Even relatively significant changes in the input (both in terms of the duration of the
source’s active periods, and the distributional properties of those active periods), lead to
very modest (percentagewise) changes in the cost function. This robustness property makes
the use of the proposed energy efficient strategies highly attractive.
This chapter is organized as follows. In Section 5.2 we describe the stochastic fluid models
used, and the underlying cost function. Section 5.3 then describes and motivates the various
service strategies that fall in the framework of our stochastic fluid queues, shows how
to numerically optimize the cost function, and presents the main findings. The robustness
analysis and related insights can be found in Section 5.4, and conclusions are presented in
Section 5.5.
5.2 Energy Cost Function and Models
In this section we give a detailed description of the energy cost function that we use in this
chapter. Then we explain two stochastic fluid queueing models that can be used to model
various service strategies (which will be further analyzed in Section 5.3).
5.2.1 System Cost Function
The system cost function which we seek to optimize reflects a balance between power usage
and processing performance. We assume that the power consumption of each server consists
of three components, viz. a static, dynamic, and switching component. The static component
is essentially determined by the technology used [29, 93], that is, independent of the actual
service rate used in the underlying queueing system. The dynamic power component is
approximated, following various studies [91, 97], by C s^α for some constants α ≥ 2 and
C > 0, where s is the service rate used at that moment. In our work we choose α = 3,
and we normalize all costs such that C ≡ 1 (and the value of the static component resulting
from this normalization is γ). It is further assumed that there are n ∈ N identical processors
(CPU cores) operating, which effectively means that the static energy consumption per time
unit is γn, independent of their service rates; if the number of active processors is a random
variable, this component is γ En. As mentioned, the third component is the switching energy
cost, which reflects the cost incurred (that is, energy consumed) when changing the value
of the service rate. We set this component at β ES, where the random variable S is the
corresponding switching rate (per unit time), and β > 0 is a constant (normalized as above)
[61]; cf. the framework used in [47].
The system cost function also includes a measure of job buffering (queueing for processing)
cost, which reflects the degree of quality-of-service provided. We assume that this component
depends linearly on the mean of the amount of data W stored in the buffer:
$\delta\,\mathbb{E}W = \delta \int x\,\mathbb{P}(W \in \mathrm{d}x)$, where δ > 0 is the buffering cost per data unit per
unit of time. Summarizing, the cost function E we consider in this chapter is given by
\[
E = \gamma\,\mathbb{E}n + \sum_{i=1}^{n} \mathbb{E}_i[s^3] + \beta \sum_{i=1}^{n} \mathbb{E}_i S + \delta\,\mathbb{E}W; \tag{5.1}
\]
here $\mathbb{E}_i[s^3]$ is the third moment of the service rate of the i-th server, and $\mathbb{E}_i S$ the
corresponding mean switching rate.
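As a minimal sketch of how the four components of (5.1) are combined, the following Python function evaluates the cost for given steady-state quantities. The function and argument names are illustrative assumptions, not part of the chapter; the steady-state inputs themselves have to be supplied by the fluid models of Section 5.2.2.

```python
from typing import Sequence


def system_cost(gamma: float, beta: float, delta: float,
                mean_active_servers: float,
                third_moments: Sequence[float],
                switching_rates: Sequence[float],
                mean_workload: float) -> float:
    """Evaluate the cost (5.1) from its four components (names are hypothetical)."""
    static = gamma * mean_active_servers      # gamma * E n
    dynamic = sum(third_moments)              # sum_i E_i[s^3], with C normalized to 1
    switching = beta * sum(switching_rates)   # beta * sum_i E_i S
    buffering = delta * mean_workload         # delta * E W
    return static + dynamic + switching + buffering
```

For a single server one would pass one-element lists for the third moments and the switching rates.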
The choice of a cubic form for the power used when running at speed s is in line with much
of the prior literature (with a couple of notable exceptions, see e.g. [16]). That is because
the dynamic power of CMOS (Complementary Metal Oxide Semiconductor) circuits is proportional
to $V^2 f$, with V denoting the supply voltage and f the clock frequency; see for instance
[59]. Operating at a higher frequency requires dynamic voltage scaling to a higher voltage,
nominally with V being roughly proportional to f. In this way we obtain the cubic relationship.
At the methodological level, any other functional form can be chosen; the resulting
setup is evidently computationally equivalent. For a further validation of the form of this
cost component, we refer to e.g. [95, Section II].
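The cubic law can be spelled out in one line. In the sketch below, $\kappa$ and $c$ are proportionality constants introduced here purely for illustration (they do not appear in the chapter), and the service rate $s$ is identified with the clock frequency $f$.

```latex
\[
P_{\mathrm{dyn}} = \kappa V^{2} f, \qquad V = c f
\quad\Longrightarrow\quad
P_{\mathrm{dyn}} = \kappa c^{2} f^{3}.
\]
```

Under the normalization $\kappa c^{2} = 1$ (that is, C ≡ 1), halving the speed reduces the dynamic power by a factor 8, while the dynamic energy spent per unit of work, $s^{3}/s = s^{2}$, drops by a factor 4. This is why slowing down, when the delay budget allows it, saves energy.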
Several recent papers, see e.g. [81, 80], argue that the switching cost can be a relevant com-
ponent in the cost function, justifying including this in our model. While the extent of the
overhead may be debatable, its inclusion makes our model richer and thus of more potential
use.
We finish this subsection with a couple of remarks regarding the choice of the cost function,
and, more specifically, the selection of the scalars β, γ, δ.
(i) The approach we follow (that is, considering an objective function consisting of vari-
ous sorts of ‘cost components’) has been intensively used in the leading literature on
power-aware speed scaling, see in particular the recent papers by Wierman et al. such
as [3, 95]. In those works the performance metric used is
\[
\mathbb{E}[T] + \frac{\mathbb{E}[E]}{\beta'},
\]
where T is the response time of a job, E is the expected energy expended on that job,
whereas the parameter β′ represents the relative cost of delay.
Our objective function is of the same type, but includes other relevant ‘cost types’, and
is ‘system oriented’ rather than ‘job oriented’.
Objective functions of this type can also be regarded as disutility curves, as they reflect
the burden due to various ‘undesirable effects’ (many processors to be allocated, a sys-
tematically high value of the processor speed s, frequent switching, and performance
degradation).
(ii) It should be borne in mind that the specific choice of the scalars β, γ, δ is a provider-
specific issue, and to a large extent motivated by commercial and technological con-
siderations. Indeed, the parameters reflect the ‘disutility’ associated with the number
of servers active, the (statistical distribution of the) processing rate used, the switching
rate, and the quality-of-service delivered (in terms of buffer content); as a consequence,
the specific values of β, γ, δ critically depend on the policy of the provider. Evidently,
when for instance the provider wishes to offer a strict quality level in the Service Level
Agreement (as agreed upon with the customers), the parameter δ should be chosen
relatively high, but it is up to the provider what specific value is selected. As a result,
the selection of appropriate values for β, γ, δ has a subjective component, and it is not
the objective of this chapter to provide guidelines on how to pick them.
(iii) As indicated in the introduction, our study also covers the sensitivity of the optimal
solution with respect to the various parameters. In many of the experiments reported
on, we vary the scalars β, γ, δ, and study the impact on the objective function, but also,
quite importantly, the values of the optimal parameters (buffer thresholds, processor
speeds, etc.). In other words, we do not wish to study the system under one specific
set of parameters, but rather provide insight into the effect of varying these parameters.
5.2.2 Fluid Models
In this chapter we analyze various service strategies relying on the theory of stochastic fluid
queues. In this subsection we review the essentials of this theory, with a focus on the specific
strategies that we evaluate. It is noted that fluid models keep track of the system’s workload,
not of the number of active jobs.
We consider an infinite-buffer queue fed by a single traffic source, served by one server or
a few parallel servers. A simple but common model for the queue’s input process is a so-called
on-off source, alternating between being active (generating traffic at a constant rate,
say, r > 0) and being silent; the on- and off-times are assumed exponentially distributed with
mean values μ^{-1} and λ^{-1}, respectively. The assumptions of a single source, the source being
of the on-off type, and an infinite buffer have been imposed for simplicity. The analysis can
be extended in a straightforward fashion to finite buffers, more advanced source models,
and multiple sources; actually, in Section 5.4 we depart from the exponentiality assumption
to quantify the impact of non-exponential on-times. Importantly, in our work the service
rate of the queue may depend on the current buffer level (referred to as ‘threshold models’
below), or on the direction in which thresholds are crossed (referred to as ‘hysteretic models’
below).
As our cost function depends on steady-state quantities only, we now point out how to
evaluate the queue’s equilibrium distribution. We enumerate the states of the on-off source
as {−,+}, and define by $F_i(x)$ the equilibrium probability that the source is in state
$i \in \{-,+\}$ while at the same time the buffer content does not exceed level x; let $F_i(x,t)$
denote its transient counterpart (corresponding to time t ≥ 0). Let D denote a diagonal matrix
such that $d_i := D_{ii}$ is the difference between the input rate and the service rate when the
source is in state $i \in \{-,+\}$. It is easily verified that, for Δt small, up to o(Δt)-terms,
\[
F_+(x,t) = F_+(x - d_+\Delta t,\, t - \Delta t)\,(1 - \mu\Delta t) + \lambda\Delta t\, F_-(x - d_-\Delta t,\, t - \Delta t),
\]
and
\[
F_-(x,t) = F_-(x - d_-\Delta t,\, t - \Delta t)\,(1 - \lambda\Delta t) + \mu\Delta t\, F_+(x - d_+\Delta t,\, t - \Delta t),
\]
as pointed out in greater detail in [4]. Now take $F_+(x - d_+\Delta t, t - \Delta t)$ to the left-hand side
in the first of these equations, and $F_-(x - d_-\Delta t, t - \Delta t)$ in the second. Then divide the
equations by Δt, and let Δt ↓ 0, to obtain
\[
d_+ \frac{\partial F_+}{\partial x} + \frac{\partial F_+}{\partial t} = -\mu F_+(x,t) + \lambda F_-(x,t),
\]
and
\[
d_- \frac{\partial F_-}{\partial x} + \frac{\partial F_-}{\partial t} = -\lambda F_-(x,t) + \mu F_+(x,t).
\]
Letting t → ∞, we thus obtain that the equilibrium probabilities satisfy the system of differential
equations (following the convention that vectors are denoted by bold symbols)
\[
D \frac{\mathrm{d}}{\mathrm{d}x} \boldsymbol{F}(x) = Q\, \boldsymbol{F}(x), \tag{5.2}
\]
where $\boldsymbol{F}(x) = (F_-(x), F_+(x))^{\top}$ and Q is the transpose of the generator matrix of the
source, whose transition rates are $q_{-+} = \lambda$ (from off to on) and $q_{+-} = \mu$ (from on to off).
This system can be solved taking into account appropriate boundary conditions and a
normalization. Obviously, to obtain a proper limiting distribution, we should have that the
stability condition $\pi_+ d_+ + \pi_- d_- < 0$ is fulfilled, with $\pi_\pm$ the stationary probabilities of the
source states: the long-term drift should be negative.
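The stability condition is easy to check numerically for a constant service rate s. The sketch below assumes the on-off source described above (on-rate r, mean on-time 1/μ, mean off-time 1/λ), so that d_+ = r − s and d_− = −s; the function names are hypothetical. The rate at which the drift vanishes, λr/(λ + μ), is the quantity the chapter later calls smin.

```python
from typing import Tuple


def source_stationary(lam: float, mu: float) -> Tuple[float, float]:
    """Stationary probabilities (pi_off, pi_on) of the exponential on-off source."""
    return mu / (lam + mu), lam / (lam + mu)


def mean_drift(lam: float, mu: float, r: float, s: float) -> float:
    """Long-term drift pi_on * d_on + pi_off * d_off for a constant service rate s."""
    pi_off, pi_on = source_stationary(lam, mu)
    d_off, d_on = -s, r - s          # net input rate in each source state
    return pi_on * d_on + pi_off * d_off


def minimal_stable_rate(lam: float, mu: float, r: float) -> float:
    """Service rate at which the drift is exactly zero; stability needs s above it."""
    return lam * r / (lam + mu)
```

With λ = 1, μ = 0.5 and r = 1 (the normalization used in Section 5.3), the drift is negative only for s above 2/3.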
Importantly, the system of differential equations (5.2) still applies when Q and D depend on
the current value of the buffer in a piecewise constant way [39, 75], albeit with additional
boundary and continuity conditions. In the case that Q or D are continuous functions of the
buffer content [89], (5.2) holds, but not for the distribution function F (x), but rather for the
density f(x). It is noted that in this situation the stability condition cannot be immediately
expressed in terms of the model primitives; the stability conditions for the various model
variants can be found in [39, 75, 89].
We now consider two specific ways in which the service reacts to the buffer content process:
a model in which the current buffer level determines the service rate, and a model in which
it also matters in what direction certain thresholds are crossed.
Many Thresholds Model
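In this model there are threshold levels $0 = B_0 < B_1 < B_2 < \cdots < B_N$. The queue is
served at rate $s_i$ when the buffer content is between $B_i$ and $B_{i+1}$ (with the convention
$B_{N+1} = \infty$). Here it is assumed that $s_0 < s_1 < s_2 < \cdots < s_N < r$ (where $s_N < r$ ensures
that the system is non-trivial, as otherwise the system would remain empty all the time). The
buffer content distribution can be evaluated as pointed out in e.g. [89]; the probability
density is given by
\[
\begin{pmatrix} f_-(x) \\ f_+(x) \end{pmatrix}
= \lambda D_0\, e^{-g(x)} \begin{pmatrix} 1/s(x) \\ 1/(r - s(x)) \end{pmatrix},
\]
where g(x) is a continuous function of x (whose specific form we leave out here), $D_0$ is the
(stationary) probability that the buffer is empty, and s(x) is the service rate when the buffer
content equals x. In our specific model, s(x) is constant and equal to $s_i$ for $B_i \le x \le B_{i+1}$,
where i = 0, . . . , N; then we also have $g(x) = \nu_i + \eta_i x$, where
\[
\eta_i = \frac{\mu}{r - s_i} - \frac{\lambda}{s_i}, \qquad \nu_i = \nu_{i-1} + B_i(\eta_{i-1} - \eta_i),
\]
and $\nu_0 = 0$. The probability of an empty buffer follows from
\[
D_0^{-1} = 1 + \lambda \int_0^{\infty} e^{-g(x)} \left( \frac{1}{s(x)} + \frac{1}{r - s(x)} \right) \mathrm{d}x
= 1 + \lambda \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right)
+ \lambda \frac{\xi_N}{\eta_N}\, e^{-\nu_N - \eta_N B_N},
\]
where $\xi_i = s_i^{-1} + (r - s_i)^{-1}$. If there is only one server (n = 1, that is), the third moment
of the service rate, the mean workload and the switching rate are given by
\[
\mathbb{E}[s^3] = D_0 s_0^3 + \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, s_i^3\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right)
+ \lambda D_0 \frac{\xi_N}{\eta_N}\, s_N^3\, e^{-\nu_N} e^{-\eta_N B_N};
\]
\[
\begin{aligned}
\mathbb{E}W = {} & \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, e^{-\nu_i} \left( B_i e^{-\eta_i B_i} - B_{i+1} e^{-\eta_i B_{i+1}} \right) \\
& + \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i^2}\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right)
+ \lambda D_0 \frac{\xi_N}{\eta_N^2}\, e^{-\nu_N} e^{-\eta_N B_N} (1 + \eta_N B_N);
\end{aligned}
\]
\[
\mathbb{E}S = \lambda D_0 \sum_{i=1}^{N} \left( e^{-\nu_{i-1}} e^{-\eta_{i-1} B_i} + e^{-\nu_i} e^{-\eta_i B_i} \right).
\]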
The case of multiple servers should be treated differently, see Section 5.2.3.
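The closed-form expressions of the many-thresholds model translate directly into code. The following is a minimal sketch, assuming the formulas for D0, E[s^3], EW and ES as stated above for a single server; the function name and the returned dictionary keys are illustrative choices. The degenerate case η_i = 0 (a service rate exactly equal to λr/(λ + μ)) is not handled and raises an error.

```python
import math


def threshold_model_metrics(B, s, lam, mu, r):
    """Steady-state quantities of the single-server many-thresholds model.

    B = [B_1, ..., B_N] (increasing), s = [s_0, ..., s_N] with 0 < s_i < r.
    Returns a dict with D0, E[s^3] ("Es3"), EW and ES.
    """
    N = len(B)
    if len(s) != N + 1:
        raise ValueError("need N thresholds and N + 1 service rates")
    Bf = [0.0] + list(B)                                   # B_0 = 0
    eta = [mu / (r - si) - lam / si for si in s]
    xi = [1.0 / si + 1.0 / (r - si) for si in s]
    nu = [0.0]
    for i in range(1, N + 1):                              # keeps g continuous at B_i
        nu.append(nu[i - 1] + Bf[i] * (eta[i - 1] - eta[i]))
    if eta[N] <= 0.0:
        raise ValueError("unstable: eta_N must be positive")
    if any(e == 0.0 for e in eta):
        raise ValueError("eta_i = 0 needs separate treatment")

    def low(i):      # exp(-g(B_i)), using the parameters of interval i
        return math.exp(-nu[i] - eta[i] * Bf[i])

    def high(i):     # exp(-g(B_{i+1})); zero on the last, unbounded interval
        return math.exp(-nu[i] - eta[i] * Bf[i + 1]) if i < N else 0.0

    def b_high(i):   # B_{i+1} * exp(-g(B_{i+1})); zero on the last interval
        return Bf[i + 1] * high(i) if i < N else 0.0

    # Integrals of xi_i * e^{-g(x)} and of x * xi_i * e^{-g(x)} over interval i.
    mass = [xi[i] / eta[i] * (low(i) - high(i)) for i in range(N + 1)]
    first = [xi[i] / eta[i] * (Bf[i] * low(i) - b_high(i))
             + xi[i] / eta[i] ** 2 * (low(i) - high(i)) for i in range(N + 1)]

    D0 = 1.0 / (1.0 + lam * sum(mass))
    Es3 = D0 * s[0] ** 3 + lam * D0 * sum(m * si ** 3 for m, si in zip(mass, s))
    EW = lam * D0 * sum(first)
    ES = lam * D0 * sum(math.exp(-nu[i - 1] - eta[i - 1] * Bf[i]) + low(i)
                        for i in range(1, N + 1))
    return {"D0": D0, "Es3": Es3, "EW": EW, "ES": ES}
```

A useful sanity check is the case without thresholds (N = 0), where the mean workload must reduce to the buffering term of the static strategy in (5.3).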
Hysteretic Model
In this model there are two threshold buffer levels; its dynamics are as in [74], and can be
described as follows. The queue is served with the lower service rate s− if the buffer content
is under the higher threshold Bh, until this threshold has been hit from below. From that
moment it is served with the higher service rate s+, until the lower threshold level Bℓ is hit
from above. Then the service rate is changed into s− again, and the procedure repeats. The
stationary buffer content distribution can be computed by using the techniques from [39, 74].
F±(x) = P(W < x) denotes the buffer content distribution function, where the subscripts
+ and − refer to the cases that the queue is served at rate s+ and s−, respectively; its first
(second) component corresponds to the source being off (on).
Let the diagonal matrix $D_\pm$ be given by $\operatorname{diag}(-s_\pm, r - s_\pm)$. Under $B_\ell$ the queue is always
served with rate $s_-$, and beyond $B_h$ with rate $s_+$. As a result, $F_+(x) = 0$ for $0 \le x \le B_\ell$
and $F_-(x) = F_-(B_h)$ for $x \ge B_h$. If the buffer is empty, it cannot remain empty as long
as the source is on, so the second component of $F_-(0)$ equals 0. For obvious reasons we
restrict ourselves to the case that $r - s_\pm$ is larger than zero; as a consequence the distribution
function can have a jump only at 0 (and $F_\pm(\cdot)$ is continuous elsewhere). In addition there is
the obvious normalization equation (with $F_{--}(\cdot)$ the distribution function in the time that
$s_-$ is active and the source is off, etc.)
\[
F_{--}(\infty) + F_{-+}(\infty) + F_{+-}(\infty) + F_{++}(\infty) = 1.
\]
It requires some elementary calculus to obtain that the eigenvalues of $D_\pm^{-1} Q$ are $\{0, -\eta_\pm\}$
and the corresponding eigenvectors are
\[
u_0 = \begin{pmatrix} \mu/\lambda \\ 1 \end{pmatrix}, \qquad
u_\pm = \begin{pmatrix} (r - s_\pm)/s_\pm \\ 1 \end{pmatrix},
\]
where $\eta_\pm = \mu (r - s_\pm)^{-1} - \lambda s_\pm^{-1}$. To ensure that our queueing system is stable, $\eta_+$ must be
positive; hence, $s_+$ should be larger than $s_{\min} = \lambda r/(\mu + \lambda)$. Using the findings of [74], the
distribution function of the buffer content is:
\[
\begin{aligned}
0 \le x \le B_\ell: \quad & F_-(x) = a_1 \left( u_0 - u_- e^{-\eta_- x} \right), \\
& F_+(x) = 0; \\
B_\ell \le x \le B_h: \quad & F_-(x) = a_2 u_0 x + a_3 u_- e^{-\eta_- x} + \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \\
& F_+(x) = a_4 u_0 x + a_5 u_+ e^{-\eta_+ x} + \begin{pmatrix} v_3 \\ v_4 \end{pmatrix}; \\
B_h \le x < \infty: \quad & F_-(x) = F_-(B_h), \\
& F_+(x) = a_6 u_+ e^{-\eta_+ x} + \begin{pmatrix} v_5 \\ v_6 \end{pmatrix};
\end{aligned}
\]
here the $a_i$ and $v_i$ (with $i \in \{1, \ldots, 6\}$) are constants following from boundary conditions,
continuity conditions, substitution, and additional conditions [74, 75]. The number of times
the service rate switches per unit time is given explicitly in [74]:
\[
\mathbb{E}S = f_{--}(B_h) \cdot (r - s_-) + f_{++}(B_\ell) \cdot s_+,
\]
with $f_{--}(\cdot)$ the density of the buffer content in the time that $s_-$ is active and the source is
off, and $f_{++}(\cdot)$ defined likewise.
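The eigenpairs quoted above can be verified with a few lines of arithmetic. The sketch below makes one assumption explicit: Q is taken as the transposed generator of the on-off source (columns summing to zero, state order off, on), which is the arrangement under which the stated eigenvectors are right eigenvectors of D^{-1}Q. The function names are hypothetical.

```python
def decay_rate(lam, mu, r, s):
    """eta = mu / (r - s) - lam / s for a constant service rate s in (0, r)."""
    return mu / (r - s) - lam / s


def apply_dinv_q(lam, mu, r, s, v):
    """Return D^{-1} Q v for D = diag(-s, r - s).

    Q is assumed to be the transposed generator, state order (off, on).
    """
    q = ((-lam, mu), (lam, -mu))
    d = (-s, r - s)
    return tuple((q[i][0] * v[0] + q[i][1] * v[1]) / d[i] for i in range(2))


def eigenpairs(lam, mu, r, s):
    """The two (eigenvalue, eigenvector) pairs stated in the text."""
    eta = decay_rate(lam, mu, r, s)
    return [(0.0, (mu / lam, 1.0)), (-eta, ((r - s) / s, 1.0))]
```

Applying `apply_dinv_q` to each eigenvector should return the eigenvector scaled by its eigenvalue; positivity of `decay_rate` at s_+ is the stability requirement s_+ > smin.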
5.2.3 Multiple Server Models
We now focus on the situation of multiple servers. Three cases are distinguished; each case
has a specific cost function.
Case 1
Under the first threshold B1 the queue is served by only one server which works at a constant
rate s0. Above this threshold the second server starts working at a constant rate s1, but such
that the first server still works at rate s0. A next server is switched on when the buffer content
reaches a next threshold B2, and so on; there are n− 1 thresholds (and hence n servers). The
cost function reads
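\[
E = (s_0^3 + \gamma) F(B_1) + (s_0^3 + s_1^3 + 2\gamma)\bigl(F(B_2) - F(B_1)\bigr)
+ \cdots + \Bigl( \sum_{i=0}^{n-1} s_i^3 + n\gamma \Bigr) \bigl(1 - F(B_{n-1})\bigr) + \beta\,\mathbb{E}S + \delta\,\mathbb{E}W.
\]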
Case 2
Below the threshold B1 one server works at rate s0, while beyond B1 a second server is
switched on and the first server adjusts its rate so that both work at rate s1/2. In general,
when the buffer content hits Bi, server i+ 1 is switched on, and all servers adjust their rates
to si/(i+ 1). As a consequence,
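\[
E = (s_0^3 + \gamma) F(B_1) + 2\bigl((s_1/2)^3 + \gamma\bigr)\bigl(F(B_2) - F(B_1)\bigr)
+ \cdots + n\bigl((s_{n-1}/n)^3 + \gamma\bigr)\bigl(1 - F(B_{n-1})\bigr) + \beta\,\mathbb{E}S + \delta\,\mathbb{E}W.
\]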
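To make the structure of the Case 2 cost concrete, the sketch below evaluates only its power part (dynamic plus static), given the values of the buffer content distribution at the thresholds. It assumes that in region k (between B_{k-1} and B_k, with k servers active) every server runs at rate s_{k-1}/k, and that the last region uses s_{n-1}/n; the function name and argument layout are hypothetical. The switching and buffering terms β ES and δ EW have to be added separately.

```python
def case2_power_cost(F_at_thresholds, rates, gamma):
    """Power part of the Case 2 cost.

    F_at_thresholds = [F(B_1), ..., F(B_{n-1})], rates = [s_0, ..., s_{n-1}].
    Region k (k = 1..n) has k active servers, each at rate rates[k-1] / k.
    """
    n = len(rates)
    if len(F_at_thresholds) != n - 1:
        raise ValueError("need n - 1 threshold probabilities for n rates")
    edges = [0.0] + list(F_at_thresholds) + [1.0]
    total = 0.0
    for k in range(1, n + 1):
        prob = edges[k] - edges[k - 1]                 # P(buffer content in region k)
        total += k * ((rates[k - 1] / k) ** 3 + gamma) * prob
    return total
```

The formula also shows why sharing the load can pay off: k servers at rate s/k together burn s^3/k^2 in dynamic power, at the price of k times the static power γ.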
Case 3
In this case there are two different kinds of thresholds. At some thresholds the servers only
change their rates, but all servers work with the same rate; these thresholds we denote by
$B_i$. At other thresholds, a server is switched on or off; we denote these by $B^{*}_{m_j}$. This means
that if the buffer content is under $B^{*}_{m_1}$, then only one server works, but there are thresholds
$\{B_1, B_2, \ldots, B_{m_1 - 1}\}$ such that $B_1 < B_2 < \cdots < B^{*}_{m_1}$ and at $B_i$ the service rate changes to $s_i$.
Beyond $B^{*}_{m_1}$ and under $B^{*}_{m_2}$ two servers work, both with rate $s_{m_1}/2$, and there are $(m_2 - m_1)$
thresholds at which the servers change their rates. In general, at $B^{*}_{m_i}$ the (i + 1)-th server
starts operating; at $B_j$ such that $B^{*}_{m_{i-1}} < B_j < B^{*}_{m_i}$, i servers adjust their rates to $s_j/i$. We
have the following cost function:
\[
\begin{aligned}
E = {} & (s_0^3 + \gamma) F(B_1) + \sum_{i=1}^{m_1} (s_i^3 + \gamma)\bigl(F(B_{i+1}) - F(B_i)\bigr) \\
& + \sum_{i=m_1+1}^{m_2} 2\bigl((s_i/2)^3 + \gamma\bigr)\bigl(F(B_{i+1}) - F(B_i)\bigr) + \cdots \\
& + n\bigl((s_{i-1}/n)^3 + \gamma\bigr)\bigl(1 - F(B_{m_n})\bigr) + \beta\,\mathbb{E}S + \delta\,\mathbb{E}W.
\end{aligned}
\]
5.3 Optimization of Energy Consumption
So far we described models so as to operate the queue in an energy efficient manner. Clearly,
the value of the cost function depends on the choice of the parameters involved (service rates
and values of the thresholds). In this section, we define a number of strategies that fit in the
framework introduced above, and examine their performance in terms of our cost function.
Before doing so, we first explain the optimization algorithm we used.
5.3.1 Algorithm and Implementation
For finding the optimal service rates and threshold levels we used the Polak-Ribière variant
of the conjugate gradient method [84]. The gradient of the cost function, which is needed for
this method, is computed by a standard finite difference method. This algorithm does not
necessarily find the objective function’s global minimum; this problem can be remedied by
starting the conjugate gradient method at different initial values. As a check, we also applied
a simulated annealing based method [84]; after performing extensive tests, it turned out that
the conjugate gradient algorithm provided us with correct results.
It is noted that we have to impose specific constraints on the parameter space. Beyond the
last threshold level the matrix $D^{-1}Q$ must have at least one negative eigenvalue, to ensure
stability; $s_{\min}$ is the minimum value of the service rate to make sure that this is the case. The
matrix $D^{-1}Q$ always has a zero eigenvalue, but if $s = s_{\min}$ it has at least two zero eigenvalues
while all other eigenvalues are positive; this case leads to a minor technicality and is dealt
with separately. The (trivial) constraints $0 < B_1 \le B_2 \le \cdots \le B_N$ and $0 < s_0 \le s_1 \le \cdots \le s_N < r$
must be imposed explicitly.
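The optimization loop described above can be sketched as follows. This is not the implementation used for the experiments, but a minimal, unconstrained illustration under stated assumptions: a central finite-difference gradient, the Polak-Ribière update with the common non-negativity safeguard (often called PR+), a simple backtracking line search, and random restarts as a guard against local minima. All function names and default values are hypothetical, and the ordering constraints on thresholds and service rates would still have to be enforced separately (for instance by reparametrization or penalties).

```python
import random


def fd_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def polak_ribiere_cg(f, x0, max_iter=500, tol=1e-8):
    """Nonlinear conjugate gradient (PR+) with a backtracking line search."""
    x = list(x0)
    g = fd_gradient(f, x)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        slope = dot(g, d)
        if slope >= 0.0:                       # not a descent direction: restart
            d = [-gi for gi in g]
            slope = -dot(g, g)
        t, fx = 1.0, f(x)
        # Halve the step until the sufficient-decrease (Armijo) condition holds.
        while t > 1e-14 and f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = fd_gradient(f, x)
        diff = [gn - go for gn, go in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / dot(g, g))   # Polak-Ribiere, clipped at 0
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


def multistart(f, bounds, n_starts=10, seed=0):
    """Run the CG search from several random initial points; keep the best result."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x = polak_ribiere_cg(f, x0)
        if best is None or f(x) < f(best):
            best = x
    return best
```

In the setting of this chapter, `f` would map a parameter vector (thresholds and service rates) to the value of the cost function (5.1).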
5.3.2 Serving Strategies
As indicated earlier, in this section we assume that the on- and off-periods are exponentially
distributed with means 1/μ and 1/λ respectively. Without loss of generality, we assume that
λ = 1 and that the source’s transmission rate is r = 1; this is essentially a renormalization of
time and space. When comparing single-server models the static cost does not play a role in
the optimization and can therefore be left out. When comparing the single-server case with
the multiple-server case, it is obviously required to include the static component.
Static Strategy
This is the simplest strategy of serving, in which the server works at a given constant rate
at any buffer content. For this model the cost function can be optimized explicitly. The cost
function, E(s), has only two components, viz. the dynamic cost and the buffering cost:
\[
E(s) = s^3 + \delta\, \frac{\lambda r (r - s)}{(\lambda + \mu)\bigl(\lambda (s - r) + \mu s\bigr)}. \tag{5.3}
\]
Fig. 5.1 shows the optimal cost for different values of the buffering cost rate δ.
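For the static strategy the optimization is one-dimensional, so a simple numerical search suffices to illustrate it. The sketch below evaluates (5.3) as written and minimizes it over the stable range (λr/(λ + μ), r) by ternary search, which is valid because both terms of (5.3) are convex on that interval. The function names and default parameters (λ = 1, μ = 0.5, r = 1) are illustrative assumptions; for very small δ the optimum approaches the left end of the interval and this sketch loses accuracy.

```python
def static_cost(s, delta, lam=1.0, mu=0.5, r=1.0):
    """Cost (5.3) of the static strategy: dynamic power plus buffering cost."""
    mean_workload = lam * r * (r - s) / ((lam + mu) * (lam * (s - r) + mu * s))
    return s ** 3 + delta * mean_workload


def optimal_static_rate(delta, lam=1.0, mu=0.5, r=1.0, iters=100):
    """Ternary search for the minimizer of (5.3) on the stable range of s."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    lo, hi = lam * r / (lam + mu), r          # lo is the stability bound
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if static_cost(m1, delta, lam, mu, r) < static_cost(m2, delta, lam, mu, r):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

Calling `static_cost(optimal_static_rate(delta), delta)` for a range of δ values gives the optimal static cost as a function of the buffering cost rate.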
Figure 5.1: Optimal cost of the static serving and sleep mode strategies vs. buffering cost rate δ. The switching cost rate is 0.5; the mean burst time is 2 (i.e., μ = 0.5).
Sleep Mode Strategy
Here the server is off when the queue is empty and starts working as soon as the source
enters the active mode. Also in this case the optimal service rate can be computed explicitly.
The service alternates between 0 and a constant service rate higher than smin. Switching
consumes energy but even for high values of the switching cost rate, β, the sleep mode
serving strategy outperforms the static strategy, in terms of our cost function. Fig. 5.1 shows
the optimal cost of the static and sleep mode strategy, for different values of the buffering
cost rate δ; the switching cost rate is relatively high (β = 0.5). For small δ both strategies
have nearly the same energy cost, but when δ increases the difference increases. For small
values of β this difference tends to be significantly higher.
1-Threshold Strategy
We now optimize our objective function with respect to a single threshold (B1) and two
service rates (s0, s1). The optimal s0 is always lower than smin, and we have seen before
that s1 must be higher than smin. When β increases, the threshold increases, so as to reduce
the switching cost. In Fig. 5.2 we show the behavior of the thresholds when varying β. It
shows that for β up to roughly 0.025, the 1-threshold strategy and the hysteretic strategy (see
below) are equivalent, but for larger β the thresholds (Bℓ, Bh) bifurcate, with B1 lying in between.
Increasing β leads to an increase in s0, but a decrease in s1. Also, increasing the buffering
cost rate δ reduces s0 and B1, but increases s1. Typical numerical results are shown in Table
5.1.
Figure 5.2: Thresholds in the 1-threshold strategy (MTM-1: B1) and the hysteretic strategy (HysM: Bℓ and Bh) with respect to β. The parameters chosen are μ = 0.5, δ = 0.05.
Figure 5.3: Optimal cost of the hysteretic strategy (HysM) and the 1-threshold strategy (MTM-1) as a function of β. For small β (switching cost rate) the hysteretic strategy behaves like the 1-threshold strategy, but if β is large, then the hysteretic strategy is more efficient (μ = 0.5, δ = 0.05).
Hysteretic Strategy
The hysteretic strategy is very similar to the 1-threshold strategy; actually if the thresholds
are equal (Bh = Bℓ), then they match. For low values of the switching cost rate, the optimal
thresholds of the hysteretic strategy are equal and coincide with the threshold of the
1-threshold strategy; the optimal service rates of the two models are equal, too. Fig. 5.3
shows the optimal cost of the hysteretic and 1-threshold strategy as a function of the parameter
β. It is observed that until a critical level the thresholds Bh and Bℓ are equal, and
increase as a function of β; above this critical level the upper threshold Bh increases while
the lower threshold Bℓ decreases. This means that the hysteretic strategy is more efficient than
the 1-threshold strategy only if the switching cost is high (as under the hysteretic strategy the
switching rate ES is reduced). Table 5.1 shows values of the objective function for a representative
set of parameters.

δ      β      HysM     MTM-1    MTM-2    MTM-3
Scenario 1: μ = 0.5
0.02   0.0    0.3807   0.3807   0.3740   0.3713
0.02   0.02   0.3835   0.3839   0.3785   0.3771
0.02   0.05   0.3858   0.3885   0.3848   0.3848
0.05   0.0    0.4340   0.4340   0.4221   0.4190
0.05   0.02   0.4385   0.4385   0.4286   0.4274
0.05   0.05   0.4435   0.4450   0.4380   0.4378
0.10   0.0    0.4896   0.4896   0.4731   0.4700
0.10   0.02   0.4956   0.4956   0.4815   0.4810
0.10   0.05   0.5044   0.5044   0.4934   0.4934
Scenario 2: μ = 2.0
0.02   0.0    0.0704   0.0704   0.0679   0.0671
0.02   0.02   0.0736   0.0764   0.0762   0.0762
0.02   0.05   0.0756   0.0809   0.0809   0.0835
0.05   0.0    0.0914   0.0914   0.0875   0.0863
0.05   0.02   0.0971   0.0998   0.0993   0.0993
0.05   0.05   0.1007   0.1069   0.1069   0.1069
0.10   0.0    0.1147   0.1147   0.1092   0.1079
0.10   0.02   0.1238   0.1256   0.1244   0.1244
0.10   0.05   0.1291   0.1362   0.1361   0.1361

Table 5.1: Optimal cost of single-server strategies. The third column relates to the hysteretic model, and the next three columns to the models with 1, 2, and 3 thresholds, respectively, denoted by MTM-1, MTM-2, MTM-3.
N -Threshold Strategy
In this model the optimization must be performed over N thresholds (B1, B2, . . . , BN) and
N + 1 service rates (s0, s1, . . . , sN). The optimal values of the first threshold (B1) and the
first service rate (s0) tend to be very close to zero. This means that the system resembles the
sleep mode strategy described above. Obviously, if two consecutive service rates are equal
(that is, si = si+1), then an N-threshold strategy is equivalent to an (N−1)-threshold strategy.
This phenomenon occurs when the switching cost rate is high, so that it is beneficial to have
a relatively low number of thresholds. We present some sample results of the 2-threshold
strategy and the 3-threshold strategy in Table 5.1.
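The reduction from an N-threshold to an (N−1)-threshold strategy can be illustrated with a small sketch. It assumes, purely for illustration, that the rate s(i−1) applies below threshold Bi and the rate si at or above it; the function name collapse_thresholds and the tolerance tol are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple


def collapse_thresholds(
    thresholds: List[float], rates: List[float], tol: float = 1e-12
) -> Tuple[List[float], List[float]]:
    """Drop every threshold whose two adjacent service rates coincide.

    thresholds: B_1 < ... < B_N
    rates:      s_0, ..., s_N (assumed: s_{i-1} below B_i, s_i at or above B_i)
    """
    if len(rates) != len(thresholds) + 1:
        raise ValueError("need N thresholds and N + 1 service rates")
    kept_thresholds: List[float] = []
    kept_rates: List[float] = [rates[0]]
    for b, s in zip(thresholds, rates[1:]):
        # A threshold only matters if the rate actually changes when it is crossed.
        if abs(s - kept_rates[-1]) > tol:
            kept_thresholds.append(b)
            kept_rates.append(s)
    return kept_thresholds, kept_rates


if __name__ == "__main__":
    # s_1 = s_2, so the 3-threshold strategy is effectively a 2-threshold strategy.
    print(collapse_thresholds([1.0, 2.0, 3.0], [0.0, 0.5, 0.5, 1.0]))
```

In this example the middle threshold is removed because crossing it does not change the service rate, which is the situation the text associates with a high switching cost rate.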
Continuous Serving Strategy
We now consider the model in which the service rate is a continuous function of the buffer
content; the corresponding stationary distribution has been obtained in [89]. For optimizing
this model we could either use calculus-of-variations methods or approximate this model
by an N -threshold model (with the thresholds fixed). Following the latter approach, as-
sume there are N threshold levels such that the distance between two consecutive levels is
constant, and choose the thresholds such that Bi+1 − Bi goes to 0 as N → ∞; evidently, the
solution we obtain for large N will be close to the solution to the continuous serving strategy.
It is clear that the optimal cost of N-threshold models decreases when N increases, as one can
approximate the optimal service rates of the continuous strategy increasingly accurately. Fig.
5.4 shows the optimal cost as a function of N; it gives us insight into the amount by which the
objective function decreases when N increases. From the figure we observe that the optimal
cost converges, as expected, to a limit when the number of thresholds becomes large; this
limiting strategy corresponds to the situation in which the server constantly adapts its
processor speed. If switching between service rates does consume energy, the objective function
will be high for high values of N, unless the service rates si remain constant for a substantial
set of i ∈ {0, . . . , N} (meaning that there is effectively no switching); a typical example of
the optimal service rate is depicted in Fig. 5.5.
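The relation between a continuous service-rate function and its N-threshold approximation on an equidistant grid can be sketched as below. This is only an illustration under stated assumptions: in the thesis the rates si are obtained by optimization, whereas here a hypothetical continuous rate function is simply sampled at band midpoints; the truncation level b_max and the names discretize_rate and rate_at are likewise assumptions.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def discretize_rate(
    rate: Callable[[float], float], b_max: float, n: int
) -> Tuple[List[float], List[float]]:
    """Return n equidistant thresholds on (0, b_max] and n + 1 piecewise-constant rates."""
    if n < 1 or b_max <= 0:
        raise ValueError("need n >= 1 and b_max > 0")
    step = b_max / n  # constant distance B_{i+1} - B_i, which tends to 0 as n grows
    thresholds = [i * step for i in range(1, n + 1)]
    # Band i is [i*step, (i+1)*step); sample the continuous rate at its midpoint.
    rates = [rate((i + 0.5) * step) for i in range(n)]
    rates.append(rate(b_max))  # rate used for buffer levels at or above b_max
    return thresholds, rates


def rate_at(x: float, thresholds: List[float], rates: List[float]) -> float:
    """Service rate of the threshold strategy at buffer content x."""
    return rates[bisect_right(thresholds, x)]


if __name__ == "__main__":
    def continuous(x: float) -> float:
        # Hypothetical continuous rate, not taken from the thesis.
        return min(1.0, 0.2 * x)

    for n in (5, 20, 100):
        th, rt = discretize_rate(continuous, b_max=5.0, n=n)
        gap = max(abs(rate_at(x / 100, th, rt) - continuous(x / 100)) for x in range(1, 500))
        print(n, round(gap, 4))
```

The printed gap between the piecewise-constant and the continuous rate shrinks as the number of thresholds grows, which is the mechanism behind the convergence described above.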
Multiple Servers Strategy
If γ, the energy rate consumed by an operating server, is small, it is obviously efficient to
activate many servers. Hence, the optimal number of operating servers depends on γ. Above
we did not take γ into account, as we had so far limited ourselves to single-server strategies.
Now that we compare single-server strategies with multiple-server strategies, we clearly have to
add this cost component. Table 5.2 presents the optimal values of our cost function for
different sets of parameters.

[Figure 5.4 plot: optimal cost (vertical axis) against the number of thresholds n (horizontal axis).]
Figure 5.4: Optimal cost of strategies with many thresholds, as a function of the number of thresholds. The
parameters chosen are μ = 0.5, δ = 0.05, β = 0.0.

[Figure 5.5 plot: serving rate (vertical axis) against buffer content (horizontal axis), for β = 0.0 and β = 0.02.]
Figure 5.5: Service rate as a function of buffer content in the continuous strategy, approximated by 100
thresholds. The solid line represents the service rate when there is no switching cost, and the dashed
line the service rate if switching consumes energy.
γ       C1-12    C1-23    C2-12    C2-23    C3-01    C3-10
Scenario 1:
0.05    0.2392   0.1954   0.2276   0.1919   0.2253   0.2271
0.10    0.3250   0.3114   0.3152   0.3056   0.3143   0.3135
0.20    0.4859   0.4859   0.4801   0.4801   0.4801   0.4752
0.50    0.9145   0.9360   0.9325   0.9344   0.9303   0.8506
Scenario 2:
0.05    0.1718   0.1534   0.1642   0.1501   0.1630   0.1556
0.10    0.2487   0.2416   0.2256   0.2408   0.2429   0.2253
0.20    0.3868   0.3868   0.3284   0.3834   0.3834   0.3507
0.50    0.7394   0.7737   0.7400   0.7676   0.6320   0.6309
Table 5.2: Optimal cost of multiple-server strategies. C1-12 (C1-23) relates to Case 1 with 1 threshold and
2 servers (2 thresholds and 3 servers, respectively). C2-12 (C2-23) relates to Case 2 with 1 threshold
and 2 servers (2 thresholds and 3 servers, respectively). C3-01 (C3-10) relates to Case 3 with 0 (1)
threshold under and 1 (0) threshold beyond Bℓ (which is the threshold at which the second server
starts operating). Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and
(μ = 1.0, δ = 0.05, β = 0.0), respectively.
5.3.3 Comparing the Service Strategies
Above we assessed the performance of a wide range of service strategies; see e.g. Tables 5.1–5.2.
In this section we further compare the strategies; we say that strategy A is better than
strategy B if strategy A’s optimal cost is lower than strategy B’s optimal cost. Let us start with
a number of obvious observations. Denote by f(s0) the cost function of the static strategy, and
by g(B1, s0, s1) the cost function of the 1-threshold strategy. If s0 = s1, then the 1-threshold
strategy’s cost function is equal to the static strategy’s cost function. It is now immediate that
min g(B1, s0, s1) is less than or equal to min f(s0), and hence the 1-threshold strategy is at
least as good as the static strategy. By the same token, every N-threshold strategy is at least
as good as any M-threshold strategy if N ≥ M. Also, the hysteretic strategy is at least as
efficient as the 1-threshold strategy.
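The comparison between the static and the 1-threshold strategy can be written out as a short chain of (in)equalities, using only the functions f and g introduced above: restricting the minimization of g to the subset s1 = s0 cannot give a lower value than minimizing over all parameters.

```latex
\[
  f(s_0) = g(B_1, s_0, s_0) \quad \text{for every } B_1,
\]
\[
  \min_{B_1,\, s_0,\, s_1} g(B_1, s_0, s_1)
  \;\le\; \min_{B_1,\, s_0} g(B_1, s_0, s_0)
  \;=\; \min_{s_0} f(s_0).
\]
```

The same nesting argument applies to any pair of threshold strategies in which the parameter set of one contains that of the other.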
If β is high, the optimization procedure tries to decrease the mean switching rate ES, which
can be done in two ways: (i) eliminating thresholds by setting the service rates below and above
a threshold to the same value, or (ii) increasing the distance between the threshold levels. In
both cases, however, the mean buffer content will increase, so if δ is also high the cost function
will not decrease. As shown by the numerical output above (and a wide set of additional
experiments, not reported here), the hysteretic strategy tends to be better than the N-threshold
strategy if the switching and buffering costs are relatively high.
We showed that if the static component is very small, then multiple-server strategies
outperform single-server strategies; evidently, if γ is high, then multiple servers are not
necessarily better. We finish this section with an interesting (though straightforward)
observation. Let us assume that strategy A is a single-server static strategy, while strategies B
and C correspond to 2 and 3 servers, respectively, operating with rate s0, being the same rate
as in strategy A. Because all these strategies work with the same service rates, their workloads
are equal as well. If γ < (5/36) s0³, then the 3-server strategy is the most efficient; if
(5/36) s0³ < γ < (3/4) s0³, then the 2-server strategy is the most efficient; and if γ > (3/4) s0³,
then it is optimal to use one server. Observe that the optimal number of servers depends
on the value of the parameter γ.
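The two break-even values of γ can be reproduced with a small calculation. The sketch below assumes that the dynamic power of a server running at speed s is s³, and that n servers share the total rate s0 equally (each running at s0/n), so that the energy rate is n (s0/n)³ + n γ. These assumptions are inferred from the thresholds (5/36) s0³ and (3/4) s0³ quoted above; the function names are hypothetical.

```python
def energy_rate(n_servers: int, s0: float, gamma: float) -> float:
    """Energy rate of n servers sharing total rate s0 (assumed cubic dynamic power)."""
    per_server_speed = s0 / n_servers
    return n_servers * (per_server_speed ** 3 + gamma)


def best_server_count(s0: float, gamma: float, max_servers: int = 3) -> int:
    """Number of servers (1..max_servers) with the lowest energy rate."""
    return min(range(1, max_servers + 1), key=lambda n: energy_rate(n, s0, gamma))


if __name__ == "__main__":
    s0 = 1.0
    # Break-even points: 1 vs 2 servers at (3/4) s0^3, 2 vs 3 servers at (5/36) s0^3.
    for gamma in (0.10, 0.50, 1.00):
        print(gamma, best_server_count(s0, gamma))
```

Under these assumptions, equating s0³ + γ with s0³/4 + 2γ gives γ = (3/4) s0³, and equating s0³/4 + 2γ with s0³/9 + 3γ gives γ = (5/36) s0³; the buffering cost drops out because the workloads coincide.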
5.4 Robustness Analysis
So far we found the optimal parameters (in terms of service rates and thresholds) of various
service strategies, and compared these with each other. Now suppose that we decide
to choose a specific strategy, which is suitable for implementation and has sufficiently low
energy cost. In order to be able to identify the optimal parameter values, however, we need
to know the precise values of the input traffic characteristics. The question is now how an
estimation error propagates: to what extent is the value of the objective function affected?
We consider two scenarios: in the first we perturb the value of the mean on-time (keeping
its distribution exponential), while in the second we perturb the distribution of the on-time.
We compare the strategies (with and without the perturbation) in terms of the value of the
cost function, using the following procedure. First we find the optimal parameter values
(service rates, thresholds) and the corresponding optimal cost. Then we perturb the input
process, and we compute the cost function of the perturbed model with the optimal parameter
values as obtained for the unperturbed model. These parameters are in general not optimal for
the perturbed model, and hence the resulting cost is at least as high as the optimal cost of the
perturbed model. We use the difference of these costs divided by the optimal cost of the
perturbed model as a ‘robustness measure’, quantifying the impact of the perturbation:
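\[
  \Delta E := 100 \times
  \frac{E_{\mathrm{per}}(X_{\mathrm{unper}}) - E_{\mathrm{per}}(X_{\mathrm{per}})}
       {E_{\mathrm{per}}(X_{\mathrm{per}})},
  \tag{5.4}
\]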
where Eper is the cost function of the perturbed model, Xunper is the vector of optimal
parameters of the non-perturbed model, and Xper is the vector of optimal parameters of the
perturbed model.
In a first series of experiments, we assume that the distribution of on-periods is still
exponential but the rate, μ, is perturbed. Evidently, decreasing μ makes the on-times longer, so
that the queueing system can become unstable; we restrict ourselves to values of μ that keep the
queue stable. From extensive simulation experiments, see Table 5.3, we conclude that the
perturbation hardly affects the optimality of our designs. For example, an increase of 20% in μ
leads to an increase of about 5% in the value of the cost function in the static strategy (StM)
and an increase of only about 1% in our dynamic strategies. More interestingly, a decrease of
20% in μ leads to an increase of 31% in the value of the cost function in the static strategy
(StM), but an increase of only a few percent in our dynamic strategies.

Δμ (%)   StM     HysM    MTM-1   MTM-2   MTM-3
Scenario 1:
  5      0.48    0.10    0.10    0.06    0.04
 10      1.67    0.36    0.36    0.24    0.18
 20      5.39    1.21    1.21    0.84    0.64
 -5      0.67    0.13    0.13    0.08    0.06
-10      3.44    0.59    0.59    0.34    0.26
-20      31.1    3.40    3.40    1.81    1.27
Scenario 2:
  5      0.48    0.09    0.10    0.06    0.05
 10      1.67    0.33    0.37    0.24    0.19
 20      5.39    1.11    1.23    0.88    0.75
 -5      0.67    0.11    0.13    0.07    0.06
-10      3.44    0.52    0.62    0.33    0.25
-20      31.1    2.94    3.66    1.67    1.03
Table 5.3: Robustness with respect to changes in the mean on-time. The first column is the change in μ. StM
stands for static strategy. All numbers are percentages. Scenario 1 and Scenario 2 have parameters
(μ = 0.5, δ = 0.05, β = 0.0) and (μ = 0.5, δ = 0.05, β = 0.05), respectively.
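The robustness measure ΔE of (5.4) can be sketched in a few lines. The function below takes the cost function of the perturbed model as a callable; the quadratic toy cost used in the example is purely hypothetical and only serves to make the sketch runnable (the actual cost function of the thesis requires the fluid-model evaluation described earlier).

```python
from typing import Callable, Sequence


def robustness_measure(
    cost_per: Callable[[Sequence[float]], float],
    x_unper: Sequence[float],
    x_per: Sequence[float],
) -> float:
    """Percentage cost increase from using the unperturbed optimum in the perturbed model.

    cost_per: cost function of the perturbed model (E_per)
    x_unper:  optimal parameters of the unperturbed model (X_unper)
    x_per:    optimal parameters of the perturbed model (X_per)
    """
    optimal_cost = cost_per(x_per)
    if optimal_cost <= 0:
        raise ValueError("the optimal cost of the perturbed model must be positive")
    return 100.0 * (cost_per(x_unper) - optimal_cost) / optimal_cost


if __name__ == "__main__":
    def toy_cost(x: Sequence[float]) -> float:
        # Hypothetical perturbed-model cost with its minimum (cost 1.0) at x = 2.0.
        return 1.0 + (x[0] - 2.0) ** 2

    print(robustness_measure(toy_cost, x_unper=[1.5], x_per=[2.0]))
```

Because Xper minimizes Eper, the measure is nonnegative, and a small value indicates that the parameters tuned for the unperturbed model remain close to optimal.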
In a second series of experiments, we study the effect of a perturbation of the distribution of
the on-times. A first observation is that there is a connection between the coefficient of
variation (CoV) of the on-time distribution and the value of the cost function (where the CoV of
a random variable is defined as the ratio of its standard deviation to its mean): if this CoV is
smaller (larger) than one, then the optimal cost is smaller (larger) than the optimal cost
corresponding to exponentially distributed on-times; this is because (informally) variability of
the on-periods increases the energy cost as there will be more queueing. Tables 5.4–5.5 show
the impact of perturbing the distribution of the on-time (keeping its mean fixed); we do so by
replacing the exponential distribution by an Erlang (CoV smaller than 1) or a hyper-exponential
(CoV larger than 1) distribution. The robustness measures are again computed by ΔE, as defined
in (5.4). In case of the Erlang distribution, the mean is kept at 1/μ, but there are now two
phases (each of mean duration 1/(2μ)), leading to a squared CoV of 1/2. The hyper-exponential
distribution that we chose corresponds to an exponential random variable with mean 1/(2μ) with
probability 1/2, and an exponential random variable with mean 3/(2μ), also with probability 1/2,
such that the resulting mean is 1/μ and the squared CoV is 3/2. In Tables 5.4–5.7, ΔEhyp (ΔEErl)
is the percentage error in the value of the objective function when it is optimized assuming
exponential on-times whereas the on-time distribution is in fact hyper-exponential (Erlang).

          StM      MTM-1    MTM-2    MTM-3
Scenario 1:
Eexp      0.4832   0.4340   0.4221   0.4190
Ehyp      0.5036   0.4464   0.4339   0.4306
EErl      0.4596   0.4180   0.4073   0.4046
ΔEhyp     0.28%    0.17%    0.10%    0.09%
ΔEErl     0.39%    0.21%    0.07%    0.04%
Scenario 2:
Eexp      0.4832   0.4450   0.4380   0.4378
Ehyp      0.5036   0.4568   0.4494   0.4488
EErl      0.4596   0.4302   0.4234   0.4231
ΔEhyp     0.28%    0.22%    0.20%    0.05%
ΔEErl     0.39%    0.25%    0.08%    0.05%
Table 5.4: Robustness with respect to changes in the distribution of the on-times, with the mean on-time
unchanged. Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and
(μ = 0.5, δ = 0.05, β = 0.05), respectively.
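The stated means and squared CoVs of the two alternative on-time distributions can be checked with elementary moment formulas. The sketch below uses only the facts that an exponential random variable with mean m has variance m² and second moment 2m²; the function names are illustrative.

```python
from typing import Tuple


def erlang2_mean_scv(mu: float) -> Tuple[float, float]:
    """Mean and squared CoV of a sum of two exponential phases, each with mean 1/(2*mu)."""
    phase_mean = 1.0 / (2.0 * mu)
    mean = 2.0 * phase_mean
    variance = 2.0 * phase_mean ** 2  # variances of independent phases add up
    return mean, variance / mean ** 2


def hyperexp_mean_scv(mu: float) -> Tuple[float, float]:
    """Mean and squared CoV of a 50/50 mixture of exponentials with means 1/(2*mu), 3/(2*mu)."""
    branches = [(0.5, 1.0 / (2.0 * mu)), (0.5, 3.0 / (2.0 * mu))]
    mean = sum(p * m for p, m in branches)
    second_moment = sum(p * 2.0 * m * m for p, m in branches)
    variance = second_moment - mean ** 2
    return mean, variance / mean ** 2


if __name__ == "__main__":
    mu = 0.5
    print(erlang2_mean_scv(mu))   # mean 1/mu, squared CoV 1/2
    print(hyperexp_mean_scv(mu))  # mean 1/mu, squared CoV 3/2
```

Both distributions therefore share the mean 1/μ of the exponential case while lying on either side of its squared CoV of 1.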
In Tables 5.6–5.7 the same experiments have been performed, but the mean on-time is also
increased by 10%. The numerical output shows that multiple-threshold strategies are even
more robust; a similar conclusion holds for multiple-server strategies.
          C1-12    C2-12    C3-01
Scenario 1:
Eexp      0.2392   0.2276   0.2253
Ehyp      0.2446   0.2317   0.2287
EErl      0.2323   0.2229   0.2202
ΔEhyp     0.23%    0.23%    0.12%
ΔEErl     0.29%    0.30%    0.09%
Scenario 2:
Eexp      0.4859   0.4801   0.4801
Ehyp      0.4885   0.4813   0.4813
EErl      0.4816   0.4775   0.4775
ΔEhyp     0.02%    0.03%    0.03%
ΔEErl     0.02%    0.04%    0.04%
Table 5.5: Robustness of multiple-server strategies with respect to changes in the distribution of the
on-times, with the mean on-time unchanged. Scenario 1 and Scenario 2 have parameters
(μ = 0.5, γ = 0.05, δ = 0.05, β = 0.0) and (μ = 0.5, γ = 0.2, δ = 0.05, β = 0.0), respectively.
          StM      MTM-1    MTM-2    MTM-3
Scenario 1:
Eexp      0.4832   0.4340   0.4221   0.4190
Ehyp      0.5368   0.4783   0.4653   0.4617
EErl      0.4921   0.4497   0.4382   0.4356
ΔEhyp     4.99%    1.21%    0.65%    0.49%
ΔEErl     0.90%    0.21%    0.19%    0.27%
Scenario 2:
Eexp      0.4832   0.4450   0.4380   0.4378
Ehyp      0.5368   0.4883   0.4802   0.4796
EErl      0.4921   0.4613   0.4541   0.4537
ΔEhyp     4.99%    1.30%    0.88%    0.22%
ΔEErl     0.90%    0.21%    0.28%    0.25%
Table 5.6: Robustness with respect to changes in the distribution of the on-times, with the mean on-time
changing as well. The mean on-time of the non-exponential distribution is 10% higher than that of the
exponential distribution. Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and
(μ = 0.5, δ = 0.05, β = 0.05), respectively.
          C1-12    C2-12    C3-01
Scenario 1:
Eexp      0.2392   0.2276   0.2253
Ehyp      0.2535   0.2402   0.2372
EErl      0.2412   0.2308   0.2288
ΔEhyp     0.99%    0.80%    0.35%
ΔEErl     0.51%    0.57%    0.44%
Scenario 2:
Eexp      0.4859   0.4801   0.4801
Ehyp      0.5008   0.4932   0.4933
EErl      0.4943   0.4897   0.4898
ΔEhyp     0.08%    0.07%    0.07%
ΔEErl     0.04%    0.05%    0.05%
Table 5.7: Robustness of multiple-server strategies with respect to changes in the distribution of the
on-times, with the mean on-time changing as well. The mean on-time of the non-exponential distribution
is 10% higher than that of the exponential distribution. Scenario 1 and Scenario 2 have parameters
(μ = 0.5, γ = 0.05, δ = 0.05, β = 0.0) and (μ = 0.5, γ = 0.2, δ = 0.05, β = 0.0), respectively.
5.5 Conclusion
This chapter has presented a modeling framework for controlling and optimizing the energy
management in emerging multi-core servers with speed scaling capabilities. In addition
to incorporating dynamic power, our framework also includes the static (leakage) power
and the switching overhead between speed levels; these features were largely unaccounted
for in prior work.
We proposed and studied different dynamic strategies for adapting the multi-core server
speeds on the basis of observations of the current buffer content. For a given strategy we
have shown how the performance of the system can be evaluated relying on stochastic fluid
models. The resulting numerical evaluation technique enabled us to calculate the value of
objective functions that balance energy consumption and performance. We have studied
strategies in which the service policy depends on the current buffer content only, but also
strategies in which it matters in what direction thresholds are crossed (i.e., hysteretic
control).
The following general conclusions were drawn. Evidently, strategies that use more thresh-
old levels are more efficient with respect to power consumption, but, remarkably, most of
the efficiency gain is achieved with only 1 or 2 thresholds. Furthermore, our objective
functions are only mildly sensitive to perturbations in the input parameters. As a consequence,
our procedure is robust, in that estimation errors in the input parameters hardly affect its
performance. A more specific conclusion is that if the switching cost is considerable, then the
hysteretic model performs better than the model with one threshold, but the model with
2 thresholds is always more efficient.
Bibliography
[1] J. ABATE and W. WHITT (1995). Numerical inversion of Laplace transforms of proba-
bility distributions. ORSA J. Comp., 7, pp. 36-43.
[2] S. ALBERS and H. FUJIWARA (2006). Energy-efficient algorithms for flow time mini-
mization. Lecture Notes in Computer Science (STACS), 3884, pp. 621-633.
[3] L. ANDREW, A. WIERMAN and A. TANG (2012). Speed scaling for processor sharing
systems: Optimality and robustness. Performance Evaluation, 69, pp. 601-622.
[4] D. ANICK, D. MITRA and M. SONDHI (1982). Stochastic theory of a data-handling
system with multiple sources. AT&T, Bell Syst. Techn. J., 61, pp. 1871-1894.
[5] N. ASGHARI, P. DEN ISEGER and M. MANDJES (2014). Numerical techniques in Lévy
fluctuation theory. Methodol. Comput. Appl. Probab., 16, pp. 31-52.
[6] N. ASGHARI, K. DEBICKI and M. MANDJES (2014). Exact tail asymptotics of the
supremum attained by a Lévy process, accepted for publication in Statistics and Proba-
bility Letters.
[7] N. ASGHARI and M. MANDJES (2014). Transform-based evaluation of prices and
Greeks of lookback options driven by Lévy processes, submitted for publication to
J. Comp. Fin.
[8] N. ASGHARI, M. MANDJES and A. WALID (2013). Modeling and optimization of en-
ergy management in multi-core servers. Performance Evaluation Review, 41, pp. 38-40.
[9] N. ASGHARI, M. MANDJES and A. WALID (2014). Energy-efficient scheduling in
multi-core servers. Computer Networks, 59, pp. 33-43.
[10] S. ASMUSSEN (2003). Applied Probability and Queues. Springer, New York, NY, USA.
[11] S. ASMUSSEN, F. AVRAM and M. PISTORIUS (2004). Russian and American put options
under exponential phase-type Lévy models. Stoch. Proc. Appl., 109, pp. 79-111.
[12] S. ASMUSSEN and P. GLYNN (2007). Stochastic Simulation: Algorithms and Analysis.
Springer, New York, NY, USA.
[13] S. ASMUSSEN, D. MADAN and M. PISTORIUS (2007). Pricing equity default swaps
under an approximation to the CGMY Lévy model. J. Comp. Fin., 11, pp. 79-93.
[14] S. ASMUSSEN, O. NERMAN and M. OLSSON (1996). Fitting phase-type distributions
via the EM algorithm. Scand. J. Stat., 23, pp. 419-441.
[15] S. ASMUSSEN and J. ROSINSKI (2004). Approximations of small jumps of a Lévy pro-
cess with a view towards simulation. J. Appl. Probab., 38, pp. 482-493.
[16] N. BANSAL, H. CHAN and K. PRUHS (2009). Speed scaling with an arbitrary power
function. Proc. ACM-SIAM SODA, pp. 693-701.
[17] N. BANSAL, K. PRUHS and C. STEIN (2007). Speed scaling for weighted flow times.
Proc. ACM-SIAM SODA, pp. 805-813.
[18] O. BARNDORFF-NIELSEN (1998). Processes of normal inverse Gaussian type. Financ.
Stoch., 2, pp. 41-68.
[19] L. BARROSO and U. HÖLZLE (2007). The case for energy-proportional computing.
IEEE Computer, 40, No. 12, pp. 33-37.
[20] R. BEKKER (2005). Queues with state-dependent rates. PhD thesis, Technische Universiteit
Eindhoven, Eindhoven, The Netherlands.
[21] J. BERTOIN (1998). Lévy Processes. Cambridge University Press, Cambridge, UK.
[22] J. BERTOIN and R. DONEY (1994). Cramér’s estimate for Lévy processes. Statist. Probab.
Lett., 21, pp. 363-365.
[23] F. BLACK and M. SCHOLES (1973). The pricing of options and corporate liabilities. J.
Polit. Econ., 81, pp. 637-654.
[24] D. BUNDE (2006). Power-aware scheduling for makespan and flow. Proc. ACM Symp.
Parallel Alg. and Arch., pp. 190-196.
[25] T. BURD, T. PERING, A. STRATAKOS and R. BRODERSEN (2000). A dynamic voltage
scaled microprocessor system, IEEE J. Solid-State Circuits, 35, pp. 1571-1580.
[26] C. CHAN, A. GYGAX, E. WONG, C. LECKIE, A. NIRMALATHAS and D. KILPER (2013).
Methodologies for Assessing the Use-Phase Power Consumption and Greenhouse Gas
Emissions of Telecommunications Network Services. Environ. Sci. Technol., 47, pp. 485-
492.
[27] P. CARR (1998). Randomization and the American Put. Rev. Fin. Studies, 11, pp. 597-
626.
[28] P. CARR, H. GEMAN, D. MADAN and M. YOR (2002). The fine structure of asset returns:
an empirical investigation. J. Business, 75, pp. 305-332.
[29] S. CHO and R. MELHEM (2010). On interplay of parallelization, program performance,
and energy consumption. IEEE J. Trans. Par. Distr. Syst., 21, pp. 342-353.
[30] J. COHEN (1974). Superimposed renewal processes and storage with gradual input,
Stochastic Process. Appl., 2, pp. 31-57.
[31] R. CONT and P. TANKOV (2004). Financial Modelling with Jump Processes. Chapman &
Hall/CRC Press, Boca Raton, FL, USA.
[32] R. CONT and P. TANKOV (2008). Financial Modelling with Jump Processes, 2nd edition.
Chapman & Hall / CRC Press, London, United Kingdom.
[33] J. COOLEY and J. TUKEY (1965). An algorithm for the machine calculation of complex
Fourier series. Math. Comput., 19, pp. 297-301.
[34] K. DEBICKI and M. MANDJES (2015). Queues and Lévy fluctuation theory - an applied
probability approach. Springer, to be published.
[35] K. DEBICKI and M. MANDJES (2012). Lévy-driven queues. Surveys in Operations Re-
search and Management Science, 17, pp. 15-37.
[36] P. DEN ISEGER (2006). Numerical transform inversion using Gaussian quadrature.
Probab. Engg. Inf. Sci., 20, pp. 1-44.
[37] P. DEN ISEGER and E. OLDENKAMP (2006). Pricing guaranteed return rate products
and discretely sampled Asian options. J. Comp. Fin., 9, pp. 383-403.
[38] H. DUBNER and J. ABATE (1968). Numerical inversion of Laplace transforms by relat-
ing them to the finite Fourier cosine transform. J. ACM, 15, pp. 115-123.
[39] A. ELWALID (1995). Analysis of adaptive rate-based congestion control for high-speed
Wide-Area Networks. Proc. IEEE ICC ’95, pp. 1948-1953.
[40] A. FELDMANN and W. WHITT (1998). Fitting mixtures of exponentials to long-tail dis-
tributions to analyze network performance models. Perf. Eval., 31, pp. 245-279.
[41] W. FELLER (1966). An Introduction to Probability Theory and its Applications. Wiley, New
York, NY, USA.
[42] N. VAN FOREEST (2004). Queues with Congestion-dependent Feedback. PhD thesis, Twente
University, Enschede, The Netherlands.
[43] A. FRANCINI (2012). Selection of a rate adaptation scheme for network hardware. Proc.
IEEE Infocom, pp. 2831-2835.
[44] M. FU (2007). Variance-Gamma and Monte Carlo. In: Advances in Mathematical Finance,
eds. Fu, Jarrow, Yen, Elliott. Birkhäuser, pp. 21-35.
[45] A. GANDHI, M. HARCHOL-BALTER, R. DAS and C. LEFURGY (2009). Optimal power
allocation in server farms. Sigmetrics ’09 Proceedings, pp. 157-168.
[46] H. GEMAN and M. YOR (1996). Pricing and hedging double barrier options: a proba-
bilistic approach. Math. Finance, 6, pp. 365-387.
[47] J. GEORGE and J. HARRISON (2001). Dynamic control of a queue with adjustable ser-
vice rate. Oper. Res., 49, pp. 720-731.
[48] P. GLASSERMAN and Z. LIU (2010). Sensitivity estimates from characteristic functions.
Oper. Res., 58, pp. 1611-1623.
[49] P. GLASSERMAN and Z. LIU (2011). Estimating Greeks in simulating Lévy-driven
models. J. Comp. Fin., 14, pp. 3-56.
[50] P. GLYNN and M. MANDJES (2011). Simulation-based computation of the workload
correlation function in a Lévy-driven queue. J. Appl. Probab., 48, pp. 114-130.
[51] J. GOMBINER (2011). Carbon Footprinting the Internet. Consilience: Journal of Sustainable
Development, 5, pp. 119-124.
[52] A. GREENBERG, J. HAMILTON, D. MALTZ and P. PATEL (2009). The cost of a cloud:
research problems in data center networks. ACM Sigcomm/Computer Communication
Review, 39, pp. 68-73.
[53] Green Data Project. http://www.greendataproject.org/
[54] V. GUPTA and R. NATHUJI (2010). Analyzing performance of asymmetric multicore
processors for latency sensitive datacenter applications. Proc. Usenix HotPower.
[55] J. HARRISON (1977). The supremum distribution of a Lévy process with no negative
jumps. Adv. Appl. Probab., 9, pp. 417-422.
[56] J. HARRISON (1985). Brownian Motion and Stochastic Flow Systems. Wiley, New York,
NY, USA.
[57] A. HORVÁTH and M. TELEK (2002). Phfit: a general phase-type fitting tool. In: Proc. of
12th Performance TOOLS, LNCS 2324, pp. 82-91.
[58] M. JEANNIN and M. PISTORIUS (2010). A transform approach to compute prices and
Greeks of barrier options driven by a class of Lévy processes. Quant. Financ., 10, pp. 629-644.
[59] S. KAXIRAS and M. MARTONOSI (2008). Computer Architecture Techniques for Power-
Efficiency. Morgan and Claypool.
[60] O. KELLA and W. STADJE (2002). Exact results for a fluid model with state-dependent
flow rates. Probability in the Engineering and Informational Sciences, 16, No. 4, pp. 389-402.
[61] M. KITAEV and R. SERFOZO (1999). M/M/1 queues with switching costs and hys-
teretic optimal control. Oper. Res., 47, pp. 310-312.
[62] I. KOPONEN (1995). Analytic approach to the problem of convergence of truncated
Lévy flights towards the Gaussian stochastic process. Phys. Rev. E, 52, pp. 1197-1199.
[63] L. KOSTEN (1974). Stochastic theory of a multi-entry buffer (I). Delft Progr. Rep., Series
F, 1, pp. 10-18.
[64] S. KOU (2002). A jump-diffusion model for option pricing. Man. Sci., 48, pp. 1086-1101.
[65] V. KULKARNI (1997). Fluid models for single buffer systems, Frontiers in queueing.
CRC, Boca Raton, FL, pp. 321-338.
[66] A. KUZNETSOV (2010). Wiener-Hopf factorization and distribution of extrema for a
family of Lévy processes. Annals of Applied Probability, 20, pp. 1801-1830.
[67] A. KUZNETSOV, A. KYPRIANOU, J. PARDO and K. VAN SCHAIK (2011). A Wiener-
Hopf Monte Carlo simulation technique for Lévy processes. Ann. Appl. Probab., 21, pp.
2171-2190.
[68] A. KYPRIANOU (2006). Introductory Lectures on Fluctuations of Lévy Processes with Appli-
cations. Springer, Berlin, Germany.
[69] P. LANCASTER and M. TISMENETSKY (1985). The Theory of Matrices with Applications.
Academic Press, 2nd edition.
[70] E. LE SUEUR and G. HEISER (2010). Dynamic voltage and frequency scaling: the laws
of diminishing returns. Proc. Usenix HotPower.
[71] A. LEWIS and E. MORDECKI (2005). Wiener-Hopf factorization for Lévy processes hav-
ing negative jumps with rational transforms. Submitted for publication.
[72] A. LEWIS and E. MORDECKI (2008). Wiener-Hopf factorization for Lévy processes having
positive jumps with rational transforms. J. Appl. Probab., 45, pp. 118-134.
[73] D. MADAN and F. MILNE (1991). Option pricing with VG martingale components.
Math. Financ., 1, pp. 39-55.
[74] R. MALHOTRA, M. MANDJES, W. SCHEINHARDT and H. VAN DEN BERG (2009). A
feedback fluid queue with two congestion control thresholds. Math. Meth. Oper. Res.,
70, pp. 149-169.
[75] M. MANDJES, D. MITRA and W. SCHEINHARDT (2003). Models of network access using
feedback fluid queues. Queueing Syst., 44, pp. 365-398.
[76] R. MERTON (1976). Option pricing when underlying stock returns are discontinuous.
J. Financ. Econ., 3, pp. 125-144.
[77] D. MITRA (1988). Stochastic theory of a fluid model of producers and consumers cou-
pled by a buffer. Adv. Appl. Probab., 20, pp. 646-676.
[78] L. NGUYEN-NGOC (2003). Exotic options in general Lévy models. Prépublication 850,
Univ. Paris 6, Laboratoire de Probabilités et Modèles Aléatoires.
[79] L. NGUYEN-NGOC and M. YOR (2007). Lookback and barrier options under general
Lévy processes. In: Handbook of Financial Econometrics, Y. Aït-Sahalia and L. Hansen
(eds.). North-Holland, Amsterdam, the Netherlands.
[80] S. PARK, J. PARK, D. SHIN, Y. WANG, Q. XIE, N. CHANG and M. PEDRAM (2013). Ac-
curate modeling of the delay and energy overhead of dynamic voltage and frequency
scaling in modern microprocessors. IEEE Trans. on Computer Aided Design, 32, No. 5,
pp. 695-708.
[81] J. PARK, D. SHIN, N. CHANG and M. PEDRAM (2010). Accurate modeling and cal-
culation of delay and energy overheads of dynamic voltage scaling in modern high-
performance microprocessors. Proc. of Symposium on Low Power Electronics and Design,
pp. 419-424.
[82] E. PECHERSKII and B. ROGOZIN (1969). On the joint distribution of random variables
associated with fluctuations of a process with independent increments. Th. Probab.
Appl., 14, pp. 410-423.
[83] N. PRABHU (1998). Stochastic Storage Processes, 2nd edition. Springer, New York, NY,
USA.
[84] W. PRESS, S. TEUKOLSKY, W. VETTERLING and B. FLANNERY (1992). Numerical Recipes
in C, 2nd Edition, Cambridge University Press.
[85] K. PRUHS, P. UTHAISOMBUT and G. WOEGINGER (2008). Getting the best response
for your erg. ACM Transactions on Algorithms, 4, pp. 38:1-38:17.
[86] K. PRUHS, R. VAN STEE and P. UTHAISOMBUT (2008). Speed scaling of tasks with
precedence constraints. Theory Comput. Syst., 43, pp. 67-80.
[87] L. ROGERS (2000). Evaluating first-passage probabilities for spectrally one-sided Lévy
processes. J. Appl. Probab., 37, pp. 1173-1180.
[88] K. SATO (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge Univer-
sity Press, Cambridge, United Kingdom.
[89] W. SCHEINHARDT, N. VAN FOREEST and M. MANDJES (2005). Continuous feedback
fluid queues. Oper. Res. Letters, 33, pp. 551-559.
[90] W. SCHOUTENS (2003). Lévy Processes in Finance. Wiley, New York, United States.
[91] A. SINHA and A. CHANDRAKASAN (2001). JouleTrack — A web-based tool for soft-
ware energy profiling. Proc. Design Automation Conf. (DAC), pp. 220–225.
[92] B. SURYA (2008). Evaluating scale functions of spectrally negative Lévy processes. J.
Appl. Probab., 45, pp. 135-149.
[93] R. TEODORESCU and J. TORRELLAS (2008). Variation-aware application scheduling
and power management for CMPs. Proc. Int’l Symp. Computer Architecture (ISCA), pp.
363-374.
[94] A. THÜMMLER, P. BUCHHOLZ and M. TELEK (2006). A novel approach for phase-type
fitting with the EM Algorithm. IEEE Trans. Dep. Sec. Comp., 3, pp. 245-258.
[95] A. WIERMAN, L. ANDREW and M. LIN (2011). Speed scaling: an algorithmic per-
spective. Chapter in: Handbook on Energy-Aware and Green Computing. Chapman &
Hall/CRC Computer and Information Science Series.
[96] A. WIERMAN, L. ANDREW and A. TANG (2009). Power-aware speed scaling in pro-
cessor sharing systems. Proc. IEEE Infocom.
[97] F. YAO, A. DEMERS and S. SHENKER (1995). A scheduling model for reduced CPU
energy. Proc. IEEE Symp. Foundations of Computer Science (FOCS), pp. 374-382.
[98] V. ZOLOTAREV (1964). The first passage time of a level and the behaviour at infinity
for a class of processes with independent increments. Th. Probab. Appl., 9, pp. 653-661.
Samenvatting (Summary)
Lévy processes (i.e., processes with stationary and independent increments) play an
important role in applied probability. They are used in numerous applications, ranging
from actuarial mathematics and other finance-oriented models to operations research and
even biology. One of the most important research areas within Lévy processes is concerned
with analyzing the distribution of the supremum (or infimum) attained by the process over
a given time horizon; the resulting process is usually called the ‘running maximum’ (or
‘running minimum’). The research results in this area are known as fluctuation theory.
The main goal of this thesis is to develop numerical techniques for determining the
distribution of the running maximum of Lévy processes, and to work these out in a number
of financial applications. The second goal concerns computational techniques that help
optimize the energy consumption of servers handling traffic in a communication network.
The traffic in such a network is modeled as an on-off process, in which the on- and
off-times are random variables.
In the second chapter of this thesis, we develop a numerical technique that is based on
the so-called Wiener-Hopf factorization, and with which we can determine the distribution
function of the running maximum (or minimum) of a general Lévy process (i.e., a Lévy
process with both upward and downward jumps). Thanks to numerical Laplace and Fourier
inversion techniques developed by Den Iseger, we are able to do this with almost machine
precision. This approach has a multitude of possible applications.
In Chapter 3 we look in particular at applying the techniques of Chapter 2 to the pricing
of specific exotic options, namely so-called lookback options. We remark, however, that
our technique can in principle be used for pricing any option whose payoff is determined
by the running maximum or minimum, for example the so-called barrier option.
The second technique we discuss in this book is importance sampling; see Chapter 4.
This approach aims to reduce the variance of simulation-based estimators. Direct (‘naive’)
simulation is inefficient and inaccurate when the event of interest is rare; to cope with
this, the idea behind importance sampling is to generate simulation paths using an
alternative probability measure under which the event occurs frequently. Clearly, the
simulation output has to be corrected (by means of likelihood ratios) in order to obtain
an unbiased estimate. The main challenge lies in finding a good alternative probability
measure that makes the resulting variance as small as possible; we succeed in doing so in
our Lévy model.
‘Energy-aware’ processors are intended to handle traffic efficiently by adapting the
processing speed of the CPU to the current load and the imposed performance requirements.
In Chapter 5 we consider this problem under an objective function that is a linear
combination of energy consumption, the perceived quality (measured in terms of the delay
that traffic incurs in the network), and the frequency with which the processing speed has
to be adjusted; all this for a so-called multi-core processor. Our main contribution is
that we develop a stochastic fluid model with which this system can be described and
optimized. We discuss various schemes, each of which adapts the processing speed in its
own specific way, and quantify the reduction in energy consumption. We show, moreover,
that optimal strategies are robust, in the sense that perturbations of the parameters
hardly affect them. This robustness property makes the practical use of optimal
strategies very attractive.
Summary
Lévy processes, i.e., processes with stationary and independent increments, play an
important role in applied probability. They have widespread applications, ranging from
insurance and financial mathematics to operations research and even biology. One of the
main branches of research on Lévy processes concentrates on analyzing the probabilistic
properties of the supremum (or infimum) attained by the process over a given period of
time, usually referred to as the running maximum (or minimum). This topic is commonly
known as fluctuation theory.
The main objective of this thesis is to develop numerical techniques to calculate the
probability distribution of the running maximum of Lévy processes, and to consider a
number of specific financial applications. The second objective is to propose a numerical
method to optimize the energy consumption of servers handling traffic in a communication
network. The traffic itself is modeled by a random process, usually an on-off process
with random on- and off-times.
In the second chapter of this thesis, a numerical technique based on the Wiener-Hopf
factorization is developed to evaluate the probability distribution of the running
maximum (or minimum) of a general Lévy process (i.e., a Lévy process with possibly
two-sided jumps). Thanks to the numerical Laplace and Fourier inversion technique
developed by den Iseger, we are able to compute these probabilities to almost machine
precision. This approach has a variety of potential applications.
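For orientation, the identity behind such a Wiener-Hopf approach can be written as follows. This is the standard textbook form, given here as a sketch; the notation (characteristic exponent $\Psi$, exponential time $e_q$, running maximum $\bar X$ and minimum $\underline X$) is assumed for this illustration and need not match the notation of Chapter 2.

```latex
% Assumed notation: E e^{i\theta X_t} = e^{-t\Psi(\theta)}, and e_q is an
% exponentially distributed time with rate q > 0, independent of X.
% \bar X_t = \sup_{s \le t} X_s, \underline X_t = \inf_{s \le t} X_s.
\begin{align*}
  \frac{q}{q + \Psi(\theta)}
    &= \mathbb{E}\, e^{i\theta \bar X_{e_q}}
       \cdot \mathbb{E}\, e^{i\theta \underline X_{e_q}}, \\
  \mathbb{P}\bigl(\bar X_{e_q} \le x\bigr)
    &= \int_0^\infty q\, e^{-qt}\,
       \mathbb{P}\bigl(\bar X_t \le x\bigr)\, \mathrm{d}t .
\end{align*}
```

The first line factorizes the transform of $X_{e_q}$ into a running-maximum part and a running-minimum part. The second line shows that the law of $\bar X_{e_q}$ is, up to the factor $q$, a Laplace transform in time of the law of $\bar X_t$, so that a Fourier inversion in $\theta$ followed by a Laplace inversion in $q$ recovers $\mathbb{P}(\bar X_t \le x)$.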
In Chapter 3, we primarily focus on applying the technique developed in Chapter 2 to
price a specific class of exotic options, viz. so-called lookback options. However, the
method can be employed for pricing many other options which depend on the maximum and/or
minimum attained by the underlying Lévy process, for instance the barrier option.
The second technique presented in this thesis is importance sampling; see Chapter 4.
This technique is used to reduce the variance of simulation-based estimators.
Straightforward simulation is inefficient and inaccurate for estimating rare-event
probabilities, so the idea of importance sampling is to generate simulation paths under
an alternative measure under which the event is no longer rare. The simulation results
then have to be corrected by an appropriate likelihood ratio to obtain an unbiased
estimate. The main challenge of this method is to find an appropriate alternative measure
and the corresponding likelihood ratio, which we succeed in finding in our Lévy setting.
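The change-of-measure idea can be illustrated with a minimal sketch. It is not the estimator of Chapter 4: as a simplifying assumption it replaces the Lévy process by a Gaussian random walk with increments N(-mu, 1) and estimates the probability that its running maximum ever exceeds a level b. The alternative measure is the classical exponential twist with parameter 2*mu, under which the increments are N(+mu, 1); the function name and all parameters are hypothetical.

```python
import math
import random


def siegmund_estimate(mu, b, n_paths, seed=0):
    """Importance-sampling estimate of P(sup_n S_n > b) for a random walk
    with i.i.d. N(-mu, 1) increments (mu > 0), an illustrative stand-in
    for a Levy process.

    Paths are simulated under the exponentially twisted measure with
    parameter theta = 2 * mu, under which increments are N(+mu, 1), so
    the level b is crossed with probability one. Each path contributes
    the likelihood ratio exp(-theta * S_tau) at the crossing time tau.
    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    theta = 2.0 * mu  # solves -mu * theta + theta**2 / 2 = 0
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        s = 0.0
        while s <= b:
            s += rng.gauss(mu, 1.0)  # increment under the twisted measure
        weight = math.exp(-theta * s)  # likelihood ratio at crossing time
        total += weight
        total_sq += weight * weight
    mean = total / n_paths
    variance = max(total_sq / n_paths - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_paths)


if __name__ == "__main__":
    estimate, std_err = siegmund_estimate(mu=0.5, b=10.0, n_paths=2000, seed=1)
    print(f"estimate = {estimate:.3e}, standard error = {std_err:.1e}")
```

Because every simulated path ends above b, each weight is at most exp(-2*mu*b), so the estimate respects this upper bound by construction. Naive simulation under the original measure would instead need a very large number of paths before observing even a single crossing.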
Energy-aware processors are intended to operate efficiently by adapting the speed of the
server CPU to the processing load and the service-level requirements. In Chapter 5, we
consider a performance objective which is a linear combination of energy usage, queueing
cost (reflected by delay) and speed-switching cost for a multi-core processor. Our
analysis captures the static power as well as the dynamic power of the processor. Our
main contribution is that we propose a stochastic fluid model for the analysis and
optimization of such multi-core processing systems. We discuss several schemes that lead
to a reduction in energy consumption. We show that the optimal strategies are robust
under perturbations of the system parameters and of the statistical properties of the
traffic. This robustness property makes the use of the optimal strategies highly
attractive in practical situations.
About the author
Naser Mohammad Asghari was born in Tehran, Iran, in August 1976. During his school years
he was fascinated by mathematics and physics. In 1994 he entered the bachelor's program
in mathematics at Sharif University of Technology (SUT), but changed to physics the next
year. In 1998 he was accepted into a master's program in cosmology at SUT. After
graduating from this program in 2000, he continued his studies in a PhD program in
astrophysics at the Institute for Advanced Studies in Basic Science. He defended his PhD
thesis in 2006, and then did a one-year postdoc at the same institute. During the period
2007–2010 he was an assistant professor at the Aerospace Research Institute and Azad
University of Zanjan. In 2010 he started his second PhD at the KdV Institute for
Mathematics, University of Amsterdam, the Netherlands, in the field of applied
probability under the supervision of Professor Michel Mandjes. Since 2012 he has been
working as a quantitative analyst at ING Bank in Amsterdam.