
UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl)

UvA-DARE (Digital Academic Repository)

Computational techniques in queueing and fluctuation theory

Mohammad Asghari, N.


Citation for published version (APA): Mohammad Asghari, N. (2014). Computational techniques in queueing and fluctuation theory.

General rights: It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/Complaints regulations: If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.

Download date: 28 May 2020

https://dare.uva.nl/personal/pure/en/publications/computational-techniques-in-queueing-and-fluctuation-theory(03570407-b5bb-4c6f-9cb9-9db68a335046).html

Computational techniques in queueing and fluctuation theory

Naser Mohammad Asghari

INVITATION

for the public defense of my PhD thesis



that will take place on Tuesday November 25, 2014 at 14.00

in the Agnietenkapel, Oudezijds voorburgwal 231, Amsterdam

The defense will be followed by a reception

Naser Mohammad Asghari


“Thesis-Naser” — 2014/10/8 — 15:10 — page i — #1









Korteweg-de Vries Instituut voor Wiskunde

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

PhD thesis, Universiteit van Amsterdam

Copyright © 2014 by Naser Mohammad Asghari, Amsterdam

All rights reserved. No part of this book may be reproduced,

in any form or by any means, without permission in writing

from the author.




ACADEMIC DISSERTATION

to obtain the degree of doctor

at the Universiteit van Amsterdam

on the authority of the Rector Magnificus

prof. dr. D.C. van den Boom

before a committee appointed by the Doctorate Board,

to be defended in public in the Agnietenkapel

on Tuesday 25 November 2014, at 14:00

by

Naser Mohammad Asghari

born in Tehran, Iran


Doctorate committee

Supervisor: Prof. dr. M. R. H. Mandjes

Other members: Dr. P. J. C. Spreij, Prof. dr. R. Núñez Queija, Prof. dr. ir. C. W. Oosterlee, Prof. dr. J. H. van Zanten, Prof. dr. D. T. Crommelin, Prof. dr. B. F. Heidergott

Faculteit der Natuurwetenschappen, Wiskunde en Informatica


Acknowledgments

When I was a school student, I was fascinated by almost every topic: mathematics, physics, chemistry, biology, history, geography, ... (except arts!), and I wanted to be an expert in all of them. One day I studied physics, the other day history, and so on. Finally I found out that I like physics and mathematics most. So I started mathematics at university, but the next year I changed to physics. After I graduated in physics, I was introduced to economics and finance by my brother Mohsen, and my best friend, Mehdi. Physicists had already been active in economics and finance for some time, and they called their research econophysics. Research in econophysics led me to stochastic processes and mathematical finance.

I was very lucky that I managed to persuade Michel Mandjes to supervise me, despite the

fact that I did not have a mathematics background. Although I had studied probability and

stochastic processes, my knowledge was not enough to start doing research. Michel trusted

me and guided me such that I managed to get on the right track in quite a short period

of time. In fact, it would not be possible to accomplish this thesis without his great help

and support. Michel, thank you for your trust, support, guidance and kindness. You also

supported me apart from my thesis and I always appreciate it.

I would also like to thank Peter Spreij. He gave me the chance to be an assistant for his

courses at UvA. He is a great teacher and those courses still help me in my research and in

my job. He also supported me when I was looking for a job. Thank you, Peter.

I met Martijn Pistorius during a summer school in mathematical finance, in 2011 in Ljubljana.

I recall that I enjoyed his lectures. When he came to KdVI, whenever I had challenges in my

research, he always had valuable suggestions and comments. I would like to thank him.

KdVI is a prestigious institute, and I am really proud of being a member of it. I would not

have gotten this chance without the support of Jan Wiegerinck, to whom I am very grateful.


I would also like to thank Evelien, Henneke and Marieke for their help. I never felt lonely at

KdVI with Nabi, Paul, Jevgenijs, Ricardo, Enno, Loek, Arie, and Piotr. I will never forget the

dinners we had at Jevgenijs’s place.

I am also so thankful to Peter den Iseger, Anwar Walid and Krzysztof Debicki for the nice

collaboration when writing our joint papers.

I would like to thank Drona Kandhai for giving me the chance to work in his team at ING. I

really enjoy working with my managers and colleagues at ING: Dirk, Veronica, Bart, Geert-Jan, Joanna, Artem, Dmytro, Daan, Jef, Markus, Xiaoyu, Carlos and Frits Koen.

Outside the University and ING, I belong to an Iranian community. We have a great time

together. I would like to thank Mohammad & Maryam, Vahid & Sara, Hodjat & Marzieh,

Mehdi & Marzieh, Afshin & Fatemeh, Ali & Azadeh, Mahdi & Mahdieh, Masoud & Hoda,

Amin & Fatemeh, Naser & Aylar, Shayan & Narges, Mohammad & Fahimeh, Saeed & Andisheh, Farzin & Mahsa, Hossein, Mahdi, Jafar, Narges, Rojman, Mohammad, Abbas, Danial, Amir, Behnaz, Maryam and Rokhsareh.

I want to thank my family. How would I have gotten here without your love, support and encouragement? My mother and father, sisters and brother, nephew and nieces, my father, mother, brothers and sisters in law. I love you all and I cannot imagine a moment without you.

My wife, Sareh, deserves a special acknowledgment. Thank you for your love and support.

You always give me energy and encouragement to continue. This thesis is dedicated to you.

Naser Mohammad Asghari,

Amsterdam, October 2014


Contents

Acknowledgments

List of Figures

List of Tables

1 Introduction
1.1 Motivation
1.2 Models
1.3 Preliminaries on Lévy fluctuation theory
1.4 Preliminaries on Markov fluid models
1.5 Outline of thesis, contributions

2 Numerical techniques in Lévy fluctuation theory
2.1 Introduction
2.2 Preliminaries
2.3 Laplace inversion
2.4 Approximation with rational Laplace transform
2.5 Small jumps
2.6 Beta processes
2.7 Discussion and concluding remarks

3 Evaluation of option prices
3.1 Introduction
3.2 Preliminaries
3.3 Transforms of prices and Greeks of lookback options
3.4 Numerical validation
3.5 Discussion
3.6 Concluding Remarks

4 Asymptotics of the supremum of a Lévy process
4.1 Introduction
4.2 Asymptotics
4.3 Importance sampling

5 Energy-Efficient Scheduling in Multi-Core Servers
5.1 Introduction
5.2 Energy Cost Function and Models
5.3 Optimization of Energy Consumption
5.4 Robustness Analysis
5.5 Conclusion

Bibliography

Samenvatting

Summary

About the author


List of Figures

5.1 Optimal cost of static serving and sleep mode strategies vs. buffering cost rate.
5.2 Thresholds in the 1-threshold and the hysteretic strategy with respect to β.
5.3 Optimal cost of the hysteretic strategy and the 1-threshold strategy.
5.4 Optimal cost of strategies with many thresholds.
5.5 Service rate as function of buffer content in continuous strategy.


List of Tables

2.1 Brownian Motion.
2.2 Compound Poisson with exponential jumps.
2.3 Compound Poisson with Weibull jumps.
2.4 Compound Poisson with Weibull jumps.
2.5 Compound Poisson with Pareto jumps.
2.6 Compound Poisson with shifted-Pareto jumps.
2.7 Compound Poisson with both upward and downward jumps.
2.8 CGMY-like upward-jumps.
2.9 Variance Gamma process.
2.10 Beta process.
3.1 Black-Scholes model.
3.2 Greeks corresponding to Black-Scholes model.
3.3 Merton model.
3.4 Kou model.
3.5 CGMY model.
3.6 Greeks corresponding to CGMY model.
3.7 Beta model.
4.1 Simulation results corresponding to Compound Poisson process.
4.2 Simulation results corresponding to Variance Gamma process.
5.1 Optimal cost of single-server strategies.
5.2 Optimal cost of multiple-server strategies.
5.3 Robustness with respect to changes in the mean on-time.
5.4 Robustness with respect to changes in the distribution of the on-times, with the mean on-time unchanged.
5.5 Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time unchanged.
5.6 Robustness with respect to changes in the distribution of the on-times, with the mean on-time changing as well.
5.7 Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time changing as well.


Chapter 1

Introduction

1.1 Motivation

Stochastic phenomena are everywhere around us. Consider the example of the valuation of options (or other financial products) in financial markets. Since the future prices of the underlying assets are highly uncertain, we need a model to describe them. Such a model should be able to accurately capture the random features of the underlying stochastic process. Since option prices are essentially functionals of the evolution of the price of the underlying asset, the model can then be used to price the options.

Another example in which randomness plays an important role is the design of various types of communication infrastructures. A commonly used paradigm is to model such systems as queueing networks. The design objective typically reflects the tradeoff between the cost and the quality-of-service delivered to the customers; adding capacity improves the perceived quality, but obviously comes at a price. In recent years substantial emphasis has been put on designing the network such that it uses energy resources efficiently, for instance by adaptively changing the processor speeds of the queues as a function of the current workload. Such algorithms need to be set up delicately, as they should not compromise the performance perceived by the network’s users.

These are just illustrative examples of instances where stochastic modeling offers a useful mathematical framework. Of course such models can be applied in many other situations as well. One could think of a wide variety of other domains, such as population biology, chemical reaction networks, and statistical physics. The use of techniques from stochastic modeling to optimize the usage of network resources obviously extends beyond the communication networks mentioned above: this approach is also used intensively in transport networks and logistic networks. The list of potential application areas keeps expanding; recently developed areas include social networks, brain modeling, and forensic sciences.

Modeling real-life situations by means of stochastic systems is typically a first step; in order for these models to be of practical use, they need to allow fast and accurate (numerical) evaluation. To illustrate this, let us go back to the two motivating examples introduced above. In the context of option pricing, traders need to respond to clients’ requests nearly instantly, which requires that they be able to compute prices virtually in real time. The alternative is to rely on simulation-based techniques, which are fast only when specific simplifications have been imposed on the underlying model (for instance by assuming a simplistic volatility model, or by assuming the underlying stochastic process is just Brownian); when the ambition is to use more sophisticated models, simulation techniques become inherently problematic. This motivates the research on fast and accurate numerical computation techniques that do not require us to simplify the underlying stochastic processes a priori.

Computational issues also play a prominent role in the (optimal) design of energy-efficient communication networks. A typical objective function includes the energy consumption per time unit as well as a performance measure that reflects the quality-of-service, and the idea is to find the service policy that strikes an optimal balance between these components (or, less ambitiously, to identify a policy that is ‘close’ to this optimum). The optimization tries to find the best service strategy for any specific condition the network can be in. The complexity of finding an optimum lies in the huge parameter space that we have to optimize over, as well as the sometimes unpleasant properties of the objective function (it is not necessarily a smooth unimodal function, for instance). In the setting of designing a network the optimization routine can in principle be performed off-line. It should be realized, however, that the underlying algorithm may require the objective function to be evaluated very frequently, and as a consequence we need a technique that evaluates the system’s performance (for a single parameter instance) fast and sufficiently accurately. Perhaps equally important is that the resulting design be robust: when the input parameters differ slightly from their estimated values, the system should still behave ‘nearly’ optimally.


1.2 Models

A canonical model in stochastic modeling is the so-called random walk. Consider a sequence of independent and identically distributed increments $Y_1, Y_2, \ldots$, and the associated random walk

$$S_n := \sum_{i=1}^{n} Y_i.$$

The probabilistic behavior of these random walks has been an object of intensive study. Classical results are the law of large numbers, saying that $S_n/n$ converges (in various specific forms) to the corresponding mean, and the central limit theorem, stating that a centered and normalized version of $S_n/n$ converges to a standard Normal random variable.
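The two classical limit results can be illustrated numerically. The following is a minimal sketch, assuming increments that are uniform on $[-1, 2]$ (an arbitrary choice made here for illustration, with mean 0.5 and variance 0.75); the function names and the seed are likewise hypothetical.

```python
import math
import random


def random_walk(increments):
    """Return the partial sums S_1, ..., S_n of the given increments."""
    sums, total = [], 0.0
    for y in increments:
        total += y
        sums.append(total)
    return sums


rng = random.Random(2014)          # fixed seed, purely for reproducibility
mean, var = 0.5, 0.75              # mean and variance of Uniform(-1, 2)

# Law of large numbers: S_n / n should be close to the mean for large n.
n = 100_000
s_n = random_walk(rng.uniform(-1.0, 2.0) for _ in range(n))[-1]
lln_estimate = s_n / n


def standardized_sum(n_steps):
    """(S_n - n*mean) / sqrt(n*var), which is approximately N(0, 1) by the CLT."""
    s = sum(rng.uniform(-1.0, 2.0) for _ in range(n_steps))
    return (s - n_steps * mean) / math.sqrt(n_steps * var)


samples = [standardized_sum(200) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((z - sample_mean) ** 2 for z in samples) / (len(samples) - 1)

print(f"S_n/n = {lln_estimate:.4f} (mean {mean})")
print(f"standardized sums: mean {sample_mean:.3f}, variance {sample_var:.3f}")
```

The standardized sums should have sample mean near 0 and sample variance near 1; the exact figures depend on the seed.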

In the context of the option pricing example introduced above, observations from financial data suggest that models with independent and identically distributed increments form a natural framework. On the other hand, as time is in principle not slotted in this example, a discrete-time framework is less appropriate. This motivates looking at the continuous-time counterpart of the above random-walk model, viz. the Lévy model. A Lévy process $(X_t)_{t\ge 0}$ is a real-valued continuous-time process, with $X_0 = 0$, such that all increments are independent and identically distributed (meaning that for all $s$, $h$ and $t$ we have that $X_{s+h} - X_s$ and $X_{t+h} - X_t$ have the same distribution, and that $X_{t+h} - X_t$ is independent of $X_t$). In some cases processes in finance are not well modeled by $(X_t)_{t\ge 0}$ itself, but rather by $(e^{X_t})_{t\ge 0}$ (thus also avoiding the sometimes problematic issue that the underlying process can attain negative values).

In the example of option pricing, various payoff structures involve both the value $X_T$ of the underlying at maturity $T$ and the largest value $\overline{X}_T$ attained until $T$. An example here is the so-called barrier option: there, for a given strike $K$ and barrier $H$, the payoff is given by $\max\{X_T - K, 0\}$ provided that $\overline{X}_T \ge H$; if we have insight into the probabilistic properties of these objects, we are in a position to price the option. This explains the interest in getting a handle on the joint distribution of $X_T$ and the running maximum $\overline{X}_T$. As it turns out, for Lévy processes there is a vast body of literature available on this joint distribution, commonly referred to as Wiener-Hopf theory. We recapitulate the first principles, as well as a collection of key results, of this theory in Section 1.3.
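To make the barrier payoff concrete, the sketch below evaluates it on a discretely observed path. The path values, strike and barrier are hypothetical, and the maximum over grid points is only an approximation of the continuous-time running maximum (it can miss excursions between observation times).

```python
def running_maximum(path):
    """Largest value attained along a discretely observed path."""
    return max(path)


def up_and_in_payoff(path, strike, barrier):
    """max(X_T - K, 0) if the running maximum reached the barrier H, else 0."""
    terminal_value = path[-1]
    if running_maximum(path) >= barrier:
        return max(terminal_value - strike, 0.0)
    return 0.0


# Hypothetical path X_0, ..., X_T observed on a time grid.
path = [0.0, 0.8, 1.5, 0.7, 1.2]
payoff_hit = up_and_in_payoff(path, strike=1.0, barrier=1.4)    # barrier was reached
payoff_miss = up_and_in_payoff(path, strike=1.0, barrier=2.0)   # barrier never reached
print(payoff_hit, payoff_miss)
```

Pricing the option then amounts to taking an expectation of this payoff, which is why the joint law of the terminal value and the running maximum matters.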

In the context of communication networks, a key concept is that of a queue. The data that cannot be processed immediately upon arrival is temporarily stored in a buffer, thus naturally leading to a notion of a queueing process. The Lévy process defined above can in principle attain any real value (i.e., positive and negative), so to make it a queueing process we should prevent it from becoming negative. The (commonly followed) way of truncating the process is by applying a so-called reflection or regulation: we introduce a queueing process (or: workload process) by

$$Q_t := \sup_{s \le t}\,(X_t - X_s);$$

observe that this entails that the workload is a functional of the driving Lévy process $(X_t)_{t\ge 0}$.
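In discrete time the reflection map is easy to compute. The sketch below, with hypothetical increments, evaluates $Q_k = \max_{j \le k}(X_k - X_j) = X_k - \min_{j \le k} X_j$ directly and compares it with the familiar Lindley-type recursion $Q_k = \max(Q_{k-1} + \text{increment}, 0)$; it is an illustration of the definition, not the numerical method used later in the thesis.

```python
from itertools import accumulate


def workload_by_reflection(increments):
    """Q_k = max_{j <= k} (X_k - X_j) = X_k - min_{j <= k} X_j, with X_0 = 0."""
    path = [0.0] + list(accumulate(increments))
    running_min = list(accumulate(path, min))
    return [x - m for x, m in zip(path, running_min)]


def workload_by_lindley(increments):
    """Same workload via the recursion Q_k = max(Q_{k-1} + increment, 0)."""
    workload = [0.0]
    for dx in increments:
        workload.append(max(workload[-1] + dx, 0.0))
    return workload


increments = [1.0, -2.0, 0.5, 1.5, -3.0, 2.0]   # hypothetical net input per time slot
reflected = workload_by_reflection(increments)
lindley = workload_by_lindley(increments)
print(reflected)
print(lindley)
```

Both functions return the same non-negative sequence, which shows that the workload depends on the driving process only through its path.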

It is remarked that in such a Lévy-driven queue the amounts of traffic fed into the system in two disjoint periods of time are independent, which is typically not true in the context of communication networks: if a user is generating traffic at a high rate at some point in time, it is relatively likely that he or she is still doing so a short while later. This observation motivates why traffic is often modeled as a so-called on-off process: the user’s activity level alternates between generating traffic at some rate $r$ for a while, and being silent, thus creating some positive dependence within the queue’s input process. Subsequent on- and off-times may be assumed to be independent.

1.3 Preliminaries on Lévy fluctuation theory

Brownian motion is one of the most frequently used stochastic models, being applied in a variety of domains. Among its most important properties are the continuity of its sample paths and its scale invariance. It is noted, however, that many phenomena which are modeled by Brownian motion do not have these properties. In finance, for example, returns are often modeled by Brownian motion, but they tend to jump up and down; these discontinuities cannot be ignored in many situations [31].

Another classical stochastic process is the Poisson process. This has discontinuous sample

paths (as it jumps up by steps of size 1, after exponentially distributed times). A Poisson

process is a non-decreasing process, and therefore has paths of bounded variation over finite

time (as opposed to Brownian motion).

Although Brownian motion and the Poisson process are seemingly completely different, they share a number of important features. Both have right-continuous paths with left limits, and they both are Markovian processes that start at the origin. They belong to a wide class of stochastic processes, Lévy processes, named after the French mathematician Paul Lévy, which play an important role in this thesis. We primarily focus on the analysis of the probabilistic properties related to the extreme values of these processes; this branch of research is commonly referred to as fluctuation theory. For a complete treatment of Lévy processes and fluctuation theory, we refer to e.g. the textbook [68].

A process $X = \{X_t, t \ge 0\}$ defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a Lévy process if it has right-continuous paths with left limits, and if it has stationary and independent increments, with $X_0 = 0$. ‘Stationarity’ in this context means that increments corresponding to a fixed time interval are identically distributed; ‘independence’ refers to the property that increments corresponding to non-overlapping time intervals behave statistically independently.

An important, and highly convenient, property of the Lévy process is that its characteristic function obeys a closed-form formula, which is in fact an immediate consequence of the process having stationary and independent increments. From the definition of a Lévy process, it is known that $X_t$ is infinitely divisible for any $t > 0$. Realizing that, for any $n \in \mathbb{N}$, $X_t$ equals

$$X_t = X_{t/n} + (X_{2t/n} - X_{t/n}) + \cdots + (X_t - X_{(n-1)t/n}),$$

it follows that

$$\log \mathbb{E} e^{isX_t} = t \log \mathbb{E} e^{isX_1} = t\,\xi(s)$$

(first for rational $t$, and with a limiting argument also for real $t$). The function $\xi(s) := \log \mathbb{E} e^{isX_1}$ is often referred to as the characteristic exponent of the Lévy process. As stated in the following theorem, it can be shown that a Lévy process can be uniquely defined by a triple $(\mu, \sigma, \Pi)$ [68, 34, 21].

Theorem 1.3.1. Lévy-Khintchine formula for Lévy processes. Suppose that $\mu \in \mathbb{R}$, $\sigma \ge 0$. Let the measure $\Pi$ be concentrated on $\mathbb{R}\setminus\{0\}$, in such a way that the regularity condition $\int_{\mathbb{R}} \min\{1, x^2\}\,\Pi(dx) < \infty$ is met. This triple $(\mu, \sigma, \Pi)$ defines, for any $s \in \mathbb{R}$,

$$\xi(s) = \log \mathbb{E} e^{isX_1} = i\mu s - \tfrac{1}{2} s^2 \sigma^2 + \int_{-\infty}^{\infty} \left(e^{isx} - 1 - isx\,1_{\{|x| < 1\}}\right) \Pi(dx).$$

Then there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which a Lévy process can be defined having Lévy characteristic exponent $\xi(s)$. $\Pi$ is called the corresponding Lévy measure, while $\mu$ is often referred to as the drift, and $\sigma^2$ as the parameter of the Brownian term.
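As a sanity check on the formula, the characteristic exponent can be evaluated by numerical quadrature for a simple Lévy measure and compared with a closed form. The sketch below assumes a measure with density $\lambda \eta e^{-\eta x}$ on $(0, \infty)$ (compound Poisson with exponential jumps); the parameter values, the truncation point `upper` and the helper names are choices made here for illustration only.

```python
import cmath
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def characteristic_exponent(s, mu, sigma, levy_density, upper=60.0):
    """xi(s) for a Levy measure with a density on (0, infinity), truncated at `upper`."""
    # The indicator 1{|x| < 1} splits the integral at x = 1.
    small = simpson(lambda x: (cmath.exp(1j * s * x) - 1 - 1j * s * x) * levy_density(x),
                    0.0, 1.0, 2_000)
    large = simpson(lambda x: (cmath.exp(1j * s * x) - 1) * levy_density(x),
                    1.0, upper, 20_000)
    return 1j * mu * s - 0.5 * s**2 * sigma**2 + small + large


# Assumed example: jumps arrive at rate lam and are exponential with rate eta.
lam, eta, mu, sigma, s = 1.5, 2.0, 0.3, 0.8, 1.7


def density(x):
    return lam * eta * math.exp(-eta * x)


numeric = characteristic_exponent(s, mu, sigma, density)

# Closed form of the same integral for this particular Levy measure.
compensator = -math.exp(-eta) + (1 - math.exp(-eta)) / eta   # int_0^1 x*eta*exp(-eta*x) dx
closed = (1j * mu * s - 0.5 * s**2 * sigma**2
          + lam * (eta / (eta - 1j * s) - 1) - 1j * s * lam * compensator)

print(abs(numeric - closed))
```

The printed difference should be at the level of the quadrature error; the characteristic function of $X_t$ then follows as $\exp(t\,\xi(s))$.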


The characteristic exponent can be rewritten as

$$\xi(s) = \left\{ i\mu s - \tfrac{1}{2} s^2 \sigma^2 \right\} + \left\{ \Pi(\mathbb{R}\setminus(-1,1)) \int_{|x| \ge 1} (e^{isx} - 1)\, \frac{\Pi(dx)}{\Pi(\mathbb{R}\setminus(-1,1))} \right\} + \left\{ \int_{0 < |x| < 1} (e^{isx} - 1 - isx)\, \Pi(dx) \right\}.$$

It should be mentioned that in case $\Pi(\mathbb{R}\setminus(-1,1)) = 0$ the second term is left out. Based on this formula of the characteristic exponent, the Lévy process $X_t$ can be decomposed as the independent sum of processes $X^{(1)}$, $X^{(2)}$ and $X^{(3)}$, which are described as follows (Lévy-Itô decomposition). In the first place, $X^{(1)}$ is a linear Brownian motion with drift. Then, $X^{(2)}$ is a compound Poisson process with rate $\Pi(\mathbb{R}\setminus(-1,1))$, where the jumps are independent and identically distributed with distribution $\Pi(dx)/\Pi(\mathbb{R}\setminus(-1,1))$ concentrated on $\{x : |x| \ge 1\}$.

Finally, concentrating on the last term, it is first observed that it can be written as

$$\int_{0 < |x| < 1} (e^{isx} - 1 - isx)\, \Pi(dx) = \sum_{n \ge 0} \left\{ \lambda_n \int_{2^{-(n+1)} \le |x| < 2^{-n}} (e^{isx} - 1)\, F_n(dx) - is\lambda_n \left( \int_{2^{-(n+1)} \le |x| < 2^{-n}} x\, F_n(dx) \right) \right\},$$

where $\lambda_n := \Pi(2^{-(n+1)} \le |x| < 2^{-n})$ and $F_n(dx) := \Pi(dx)/\lambda_n$. The component $X^{(3)}$ can thus be considered as the superposition of (at most) a countable number of independent compound Poisson processes with different rates and linear drift. In fact, $X^{(3)}$ is a square integrable martingale with an almost surely countable number of jumps on each finite time interval. Importantly, the number of jumps can be infinite almost surely, leading to the class of Lévy models with infinite activity; we will work intensively with this class later in this thesis.

Further relevant classifications are the following. If $\Pi(-\infty, 0) = 0$, then it follows from the Lévy-Itô decomposition that the corresponding Lévy process has no negative jumps. In this case it is referred to as a spectrally positive Lévy process. Conversely, a Lévy process is called spectrally negative if $-X$ is spectrally positive (i.e., it has no positive jumps). These two classes of processes are generally indicated by S+ and S− respectively, and referred to as the spectrally one-sided class. As we will see throughout this thesis, for the class of spectrally one-sided Lévy processes a very explicit analysis is often possible.

Let $X$ be a spectrally positive process, and assume in addition $\int_{(0,\infty)} \max(1, x)\, \Pi(dx) < \infty$, $\sigma = 0$ and $\mu \ge 0$. Then, again from the Lévy-Itô decomposition, it follows that the process $X$ has non-decreasing paths. Such a process is referred to as a subordinator. Given a Lévy process $X_t$ and an independent subordinator $\tau_s$, we can introduce another Lévy process by sampling $X_t$ at stochastic time epochs which are defined by the subordinator. More precisely, suppose that $X_t$ is a Lévy process with characteristic exponent $\xi$ and $\tau = \{\tau_s : s \ge 0\}$ is an independent subordinator with characteristic exponent $\Xi$. Then the process $Y$, which is defined by $X_{\tau_s}$, is a Lévy process, and its characteristic exponent is given by $\Xi \circ \xi$ [68]. For example, a possible representation of the so-called Variance Gamma process (a frequently used infinite-activity Lévy process) corresponds to sampling a Brownian motion at times that result from a Gamma process [34].
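The composition rule can be checked numerically for the Variance Gamma example. The sketch below rests on two assumptions that are not spelled out in the text: the exponent of the subordinator is taken to be the cumulant function $u \mapsto \log \mathbb{E} e^{u\tau_1}$ (so that the composition with $\xi(s) = \log \mathbb{E} e^{isX_1}$ is well defined), and the Gamma process is normalized to have mean 1 and variance $\nu$ per unit time. All parameter values are hypothetical.

```python
import cmath
import math


def brownian_exponent(s, theta, sigma):
    """xi(s) = log E exp(i s X_1) for Brownian motion with drift theta, volatility sigma."""
    return 1j * theta * s - 0.5 * sigma**2 * s**2


def gamma_cumulant(u, nu):
    """log E exp(u tau_1) for a Gamma subordinator with mean 1 and variance nu per unit time."""
    return -cmath.log(1 - nu * u) / nu


def variance_gamma_exponent(s, theta, sigma, nu):
    """Exponent of Y_t = X_{tau_t}: the composition of the two functions above."""
    return gamma_cumulant(brownian_exponent(s, theta, sigma), nu)


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


theta, sigma, nu, s = -0.1, 0.3, 0.5, 2.0
composed = cmath.exp(variance_gamma_exponent(s, theta, sigma, nu))

# Independent check: condition on tau_1 and average exp(xi(s) * tau_1)
# over its Gamma(shape 1/nu, scale nu) density, truncated at 40.
shape = 1 / nu
xi = brownian_exponent(s, theta, sigma)


def gamma_density(x):
    return x ** (shape - 1) * math.exp(-x / nu) / (math.gamma(shape) * nu**shape)


direct = simpson(lambda x: cmath.exp(xi * x) * gamma_density(x), 0.0, 40.0, 40_000)
print(abs(composed - direct))
```

Both routes give the characteristic function of $Y_1$; the first uses the composition of exponents, the second conditions on the value of the subordinator.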

1.3.1 Wiener-Hopf factorization

A collection of important results in Lévy fluctuation theory are immediate consequences of the so-called Wiener-Hopf factorization. In fact, this Wiener-Hopf factorization provides a powerful tool with several applications in probability (e.g. related to finance). In this section we first introduce a few relevant concepts, and then we roughly sketch a proof for the Wiener-Hopf factorization theorem in a discrete-time framework. The continuous-time setting is considerably more technical, and therefore we decided to just state the main result, and leave the underlying considerations out.

As mentioned, any Lévy process is defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathbb{F}$ be the filtration $\mathbb{F} = \{\mathcal{F}_t : t \ge 0\}$, so that we obtain a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, on which we assume $X$ is defined. Then a non-negative random variable $\tau$, defined on the same filtered probability space, is called a stopping time if $\{\tau \le t\} \in \mathcal{F}_t$ for all $t > 0$. It should be mentioned that it is not a priori ruled out that a stopping time could have the property that $\mathbb{P}(\tau = \infty) > 0$. Now suppose that $\tau$ is a stopping time. The process $\widetilde{X} = \{\widetilde{X}_t : t \ge 0\}$, where

$$\widetilde{X}_t = X_{\tau+t} - X_\tau, \quad t \ge 0,$$

defined on $\{\tau < \infty\}$, is independent of $\mathcal{F}_\tau$ and has the same law as $X$, and hence is a Lévy process. For instance, the first entrance time (first hitting time) of a given subset $B \subseteq \mathbb{R}$ is an $\mathbb{F}$-stopping time.


One elementary but useful concept, applying to all Lévy processes, is duality; it is a direct consequence of the stationary independent increments. In fact, duality can informally be described as a kind of symmetry under time reversal. When a path of a Lévy process is reversed in time, over a finite time horizon, the new path is distributionally equivalent. More precisely, for each $t > 0$ the processes $\{X_{(t-s)-} - X_t : 0 \le s \le t\}$ and $\{-X_s : 0 \le s \le t\}$ have the same law.

An interesting direct consequence of this duality property concerns a relationship between the running supremum and the running infimum, which are defined by

$$\overline{X}_t := \sup_{0 \le s \le t} X_s, \qquad \underline{X}_t := \inf_{0 \le s \le t} X_s.$$

The processes $\{\overline{X}_t : t \ge 0\}$ and $\{\underline{X}_t : t \ge 0\}$ are the key objects studied in fluctuation theory, and will play an important role in the sequel of this thesis. We arrive at the following useful lemma.

Lemma 1.3.2. For each fixed $t > 0$, the pairs $(\overline{X}_t, \overline{X}_t - X_t)$ and $(X_t - \underline{X}_t, -\underline{X}_t)$ have the same distribution under $\mathbb{P}$.
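The discrete-time analogue of this lemma can be verified by brute force. The sketch below enumerates all paths of a random walk with equally likely steps (the step set $\{-1, 2\}$ and the horizon $n = 8$ are arbitrary choices made here) and counts how often each value of the two pairs occurs; time reversal maps paths one-to-one, so the two tallies should coincide.

```python
from collections import Counter
from itertools import accumulate, product


def duality_counts(steps, n):
    """Count, over all n-step paths, the two pairs appearing in the lemma."""
    max_side, min_side = Counter(), Counter()
    for increments in product(steps, repeat=n):
        path = [0] + list(accumulate(increments))
        top, bottom, end = max(path), min(path), path[-1]
        max_side[(top, top - end)] += 1          # (running max, max minus endpoint)
        min_side[(end - bottom, -bottom)] += 1   # (endpoint minus running min, minus min)
    return max_side, min_side


max_side, min_side = duality_counts(steps=(-1, 2), n=8)
print(max_side == min_side)
```

Since every path has the same probability here, equal counts mean equal distributions of the two pairs.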

We now consider the running maximum and minimum up to $\tau(q)$, which represents an exponentially distributed time, with arbitrary parameter $q > 0$ (i.e., mean $1/q$). It can be seen that if the Lévy process is not a compound Poisson process, then its maxima are attained at unique times; we define $\overline{G}_t := \sup\{s < t : X_s = \overline{X}_s\}$ and $\underline{G}_t := \sup\{s < t : X_s = \underline{X}_s\}$. We are now ready to state the Wiener-Hopf factorization (where it is noted that the compound Poisson case should be treated slightly differently; see [68]).

Theorem 1.3.3. The Wiener-Hopf factorization. Suppose that X is any Lévy process and let

τ(q) be an independent exponentially distributed random variable with parameter q > 0.

Then the following statements hold.

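1. The pairs

$$(\overline{G}_{\tau(q)}, \overline{X}_{\tau(q)}) \quad\text{and}\quad (\tau(q) - \overline{G}_{\tau(q)}, \overline{X}_{\tau(q)} - X_{\tau(q)})$$

are independent and infinitely divisible. For any $\theta, \vartheta \in \mathbb{R}$ the following factorization applies:

$$\frac{q}{q - i\vartheta + \xi(\theta)} = \mathbb{E}\left(e^{i\vartheta \overline{G}_{\tau(q)} + i\theta \overline{X}_{\tau(q)}}\right)\,\mathbb{E}\left(e^{i\vartheta \underline{G}_{\tau(q)} + i\theta \underline{X}_{\tau(q)}}\right),$$

where the two expectations on the right-hand side are the Wiener-Hopf factors.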

2. When setting ϑ = 0, the Wiener-Hopf factors may be identified in terms of the Laplace



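exponents $\overline{\kappa}(\alpha, q)$ and $\underline{\kappa}(\alpha, q)$, which are defined by (for some $\overline{k}_0 > 0$, and $\alpha \ge 0$)

$$\overline{\kappa}(\alpha, q) := \mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}} = \overline{k}_0 \exp\left(-\int_0^\infty \int_{(0,\infty)} \frac{1}{t}\left(e^{-qt} - e^{-qt-\alpha x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \tag{1.1}$$

for the running maximum, and (for some $\underline{k}_0 > 0$, and $\alpha \le 0$)

$$\underline{\kappa}(\alpha, q) := \mathbb{E} e^{-\alpha \underline{X}_{\tau(q)}} = \underline{k}_0 \exp\left(-\int_0^\infty \int_{(-\infty,0)} \frac{1}{t}\left(e^{-qt} - e^{-qt-\alpha x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \tag{1.2}$$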

for the running minimum. In addition,

$$\overline{\kappa}(\alpha, q)\,\underline{\kappa}(-\alpha, q) = \frac{q}{q - \log \mathbb{E} e^{-\alpha X_1}} =: K(\alpha, q).$$

Note that there are some constants in the expressions of the Wiener-Hopf factorization which are not identified ($\overline{k}_0$ and $\underline{k}_0$, that is). They depend on the normalization which is chosen in the definition of local time; for more background on this issue we refer to [68].

We do not provide a proof of the above result. Instead, in order to convey the main ideas behind it, we include a rough sketch of a possible proof in a discrete-time framework (which corresponds to a random walk). We have chosen to do so since in the continuous-time framework there are a number of rather technical steps that have to be dealt with; at the same time, we believe that the main ideas behind the discrete-time counterpart provide useful insights [34].

To this end, consider the random walk $S_n := \sum_{i=1}^n Y_i$, with the $Y_i$ being i.i.d., distributed as a generic random variable $Y$. Let $\overline{S}_n$ be the running maximum process

$$\overline{S}_n := \sup_{i \in \{1,\dots,n\}} S_i;$$

$G_n$ denotes the (first) epoch at which that running maximum is attained. Let $T$ be an (independent) geometric random variable, i.e., $\mathbb{P}(T = k) = p(1-p)^k$, for some $p \in (0,1)$ and $k \in \{0, 1, \dots\}$.

Realize that the number of maxima attained before time $T$ (the number of excursions) is a geometric random variable; it is denoted by $N$. It follows that both $\overline{S}_T$ and $G_T$ can be written as the sum of $N$ i.i.d. non-negative random variables. It can be shown that a geometric sum of i.i.d. random variables is infinitely divisible. Based on the above, we can conclude that $\overline{S}_T$ and $G_T$ are infinitely divisible as well.

Furthermore, in line with the duality property of Lévy processes, we have that $(S_T - \overline{S}_T, T - G_T)$ is independent of $(\overline{S}_T, G_T)$. This can be intuitively understood as follows. First realize that the geometric distribution is memoryless. Suppose now that we are told that the maximum (before time $T$) is attained at a specific epoch $G_T$; the value of this epoch does not have any impact on the amount by which the process goes down between $G_T$ and $T$. A similar property holds for the residual time $T - G_T$ until ‘the geometric clock expires’. Also, it is observed that, from the duality property, $S_T - \overline{S}_T$ has the same distribution as the running minimum $\underline{S}_T$.

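After these first observations, we now include a bit of elementary algebra, leading to the identification of the Wiener-Hopf factors. To this end, we first notice that it can be verified that, with $s \in (0, 1]$ and $\alpha \in \mathbb{R}$,

$$\mathbb{E}\, s^T e^{\alpha i S_T} = \frac{p}{1 - (1-p)\,s\,\mathbb{E} e^{\alpha i Y}}.$$

On the other hand, this quantity can alternatively be written as

$$\begin{aligned}
&\exp\left(-\int_{-\infty}^{\infty} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right) \\
&\quad= \exp\left(-\sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n\, \mathbb{E} e^{\alpha i S_n}\right)(1-p)^n\right) \\
&\quad= \exp\left(-\sum_{n=1}^{\infty} \frac{1}{n}\left((1-p)^n - \left((1-p)\,s\,\mathbb{E} e^{\alpha i Y}\right)^n\right)\right) \\
&\quad= \exp\left(\log p - \log\left(1 - s(1-p)\,\mathbb{E} e^{\alpha i Y}\right)\right).
\end{aligned}$$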

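Recall that we found that $(S_T, T)$ can be written as the sum of two independent terms, viz. $(\overline{S}_T, G_T)$ and $(S_T - \overline{S}_T, T - G_T)$, which are both infinitely divisible. As a result, it follows that

$$\mathbb{E}\, s^{G_T} e^{\alpha i \overline{S}_T} = \exp\left(-\int_0^{\infty} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right),$$

and

$$\mathbb{E}\, s^{T - G_T} e^{\alpha i (S_T - \overline{S}_T)} = \exp\left(-\int_{-\infty}^{0} \sum_{n=1}^{\infty} \frac{1}{n}\left(1 - s^n e^{\alpha i x}\right)(1-p)^n\, \mathbb{P}(S_n \in dx)\right).$$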

As mentioned above, the step from discrete time to continuous time introduces a substantial amount of technicalities. If we were to ‘extrapolate’ our discrete-time findings, we would obtain the following. Let $T$ be an exponentially distributed random time with mean $1/\vartheta$, independent of the Lévy process $(X_t)_t$. For $\beta \ge 0$ and $\alpha \in \mathbb{R}$, we can easily show that

$$\mathbb{E} e^{-\beta T + \alpha i X_T} = \frac{\vartheta}{\vartheta + \beta - \log \mathbb{E} e^{\alpha i X_1}}. \tag{1.3}$$

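Using the Frullani integral identity [68], we also have

$$\begin{aligned}
&\exp\left(-\int_0^{\infty} \int_{-\infty}^{\infty} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right) \\
&\quad= \exp\left(-\int_0^{\infty} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t}\, \mathbb{E} e^{\alpha i X_t}\right) dt\right) \\
&\quad= \exp\left(-\int_0^{\infty} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta-\log \mathbb{E} e^{\alpha i X_1})t}\right) dt\right) \\
&\quad= \frac{\vartheta}{\vartheta + \beta - \log \mathbb{E} e^{\alpha i X_1}}.
\end{aligned}$$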
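The last step relies on the Frullani identity, i.e., the integral of $(e^{-at} - e^{-bt})/t$ over $(0,\infty)$ equals $\log(b/a)$. A minimal numerical sanity check is sketched below, assuming a Brownian motion with drift as the Lévy process (so that $\log \mathbb{E} e^{\alpha i X_1}$ is explicit); the midpoint rule, the truncation point, and all parameter values are illustrative choices rather than part of the thesis.

```python
import cmath

def frullani_lhs(a, b, upper=60.0, n=120_000):
    """Midpoint-rule value of exp(-int_0^upper (e^{-a t} - e^{-b t}) / t dt)."""
    h = upper / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h
        total += (cmath.exp(-a * t) - cmath.exp(-b * t)) / t
    return cmath.exp(-total * h)

vartheta, beta, alpha = 1.0, 0.5, 0.7
mu, sigma = -0.3, 1.0                                # hypothetical drift and volatility
log_cf = 1j * alpha * mu - 0.5 * (sigma * alpha) ** 2  # log E exp(i alpha X_1)

a = vartheta
b = vartheta + beta - log_cf
lhs = frullani_lhs(a, b)
rhs = a / b                                          # right-hand side of (1.3)
print(lhs, rhs)
```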

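Mimicking the discrete-time setup, we obtain the analogous results in the continuous-time framework:

$$\mathbb{E} e^{-\beta \overline{G}_T + \alpha i \overline{X}_T} = \frac{\overline{\kappa}(\vartheta + \beta, -\alpha i)}{\overline{\kappa}(\vartheta, 0)} = \exp\left(-\int_0^{\infty} \int_0^{\infty} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right),$$

and

$$\mathbb{E} e^{-\beta (T - \overline{G}_T) + \alpha i (X_T - \overline{X}_T)} = \frac{\underline{\kappa}(\vartheta + \beta, \alpha i)}{\underline{\kappa}(\vartheta, 0)} = \exp\left(-\int_0^{\infty} \int_{-\infty}^{0} \frac{1}{t}\left(e^{-\vartheta t} - e^{-(\vartheta+\beta)t} e^{\alpha i x}\right) \mathbb{P}(X_t \in dx)\,dt\right),$$

where the functions $\overline{\kappa}$ and $\underline{\kappa}$ are defined in Equations (1.1) and (1.2).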

1.3.2 Second factorization identity

Much of our analysis relies on the Wiener-Hopf factorization and its ramifications. A second

result that we use in this thesis is usually referred to as the second factorization identity and it

holds for any Lévy process. It can be found in e.g. [68, p.176].



Let us define the first hitting time (or first passage time) as

σ(x) := inf{t ≥ 0 : Xt > x},

and let, as before, T be an exponentially distributed random variable with mean q−1. Then

the following result holds for any Lévy process (Xt)t. As the proof is straightforward and

insightful, we decided to include it.

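Lemma 1.3.4. For $q, \bar{q} \ge 0$ and $\beta > 0$,

$$\int_0^{\infty} e^{-\beta x}\, \mathbb{E}\left(e^{-q\sigma(x) - \bar{q}(X_{\sigma(x)} - x)} 1_{\{\sigma(x) < \infty\}}\right) dx = \frac{1}{\beta - \bar{q}}\left(1 - \frac{\mathbb{E} e^{-\beta \overline{X}_T}}{\mathbb{E} e^{-\bar{q} \overline{X}_T}}\right) = \frac{1}{\beta - \bar{q}}\left(1 - \frac{\overline{\kappa}(\beta, q)}{\overline{\kappa}(\bar{q}, q)}\right),$$

where $\overline{\kappa}(\alpha, q)$ was defined in Equation (1.1).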

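Proof. We follow the proof of [68, Exercise 6.7]. $X_t$ is a Lévy process and hence it is Markovian; in addition $T$ is exponentially distributed. Due to the memoryless property of the exponential distribution we have, conditional on $\{\sigma(x) < T\}$,

$$\mathbb{E}\left(e^{-\bar{q}(\overline{X}_T - X_{\sigma(x)})}\right) = \mathbb{E}\left(e^{-\bar{q} \overline{X}_T}\right)$$

and therefore

$$\mathbb{E}\left(e^{-\bar{q} \overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) = \mathbb{E}\left(e^{-\bar{q} \overline{X}_T} 1_{\{\sigma(x) \le T\}}\right) = \mathbb{E}\left(e^{-\bar{q} X_{\sigma(x)}} 1_{\{\sigma(x) \le T\}}\right) \mathbb{E}\left(e^{-\bar{q} \overline{X}_T}\right).$$

In addition, the first factor on the right-hand side of the previous equation can be expressed explicitly in terms of the distributions of $\sigma(x)$ and $T$. It follows that

$$\mathbb{E}\left(e^{-\bar{q} X_{\sigma(x)}} 1_{\{\sigma(x) < T\}}\right) = \int_0^{\infty} e^{-qs} \int_0^{\infty} q e^{-q(t-s)}\, \mathbb{E}\left(1_{\{s < t\}} e^{-\bar{q} X_s}\right) dt\, \mathbb{P}(\sigma(x) \in ds) = \mathbb{E}\left(e^{-q\sigma(x) - \bar{q} X_{\sigma(x)}} 1_{\{\sigma(x) < \infty\}}\right).$$

Combining the above results gives

$$\int_0^{\infty} (\beta - \bar{q}) e^{-(\beta - \bar{q})x}\, \mathbb{E}\left(e^{-\bar{q} \overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) dx = \int_0^{\infty} \int_x^{\infty} (\beta - \bar{q}) e^{-(\beta - \bar{q})x} e^{-\bar{q} u}\, \mathbb{P}(\overline{X}_T \in du)\, dx.$$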



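By interchanging the order of integration we have

$$\int_0^{\infty} (\beta - \bar{q}) e^{-(\beta - \bar{q})x}\, \mathbb{E}\left(e^{-\bar{q} \overline{X}_T} 1_{\{\overline{X}_T > x\}}\right) dx = \mathbb{E}\left(e^{-\bar{q} \overline{X}_T}\right) - \mathbb{E}\left(e^{-\beta \overline{X}_T}\right).$$

Theorem 1.3.3 proves the claim. □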
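The interchange of integrals in the last step of the proof can be verified numerically in a simple special case. The sketch below assumes, purely for illustration, that the running maximum at time T is exponentially distributed with a hypothetical rate `lam` (as happens for spectrally negative processes); `q_bar` denotes the second Laplace argument appearing in the lemma, and the midpoint rule with a truncated integration range is an illustrative choice.

```python
import math

def lhs_integral(beta, q_bar, lam, upper=40.0, n=200_000):
    """Midpoint rule for int_0^inf (beta - q_bar) e^{-(beta - q_bar) x} E[e^{-q_bar M}; M > x] dx,
    with M exponentially distributed with rate lam."""
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        tail = lam / (lam + q_bar) * math.exp(-(lam + q_bar) * x)  # E[e^{-q_bar M}; M > x]
        total += (beta - q_bar) * math.exp(-(beta - q_bar) * x) * tail
    return total * h

def rhs_difference(beta, q_bar, lam):
    """E e^{-q_bar M} - E e^{-beta M} for M exponentially distributed with rate lam."""
    return lam / (lam + q_bar) - lam / (lam + beta)

print(lhs_integral(2.0, 0.5, 1.5), rhs_difference(2.0, 0.5, 1.5))
```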

1.3.3 Some remarks on Wiener-Hopf factorization

The Wiener-Hopf factorization theorem provides an elegant decomposition in terms of the characteristic functions of the running maximum and the running minimum associated with the underlying Lévy process. The result, however, does not say how one should calculate each of these characteristic functions; preferably one would express the Wiener-Hopf factors in terms of the model primitives, i.e., the characteristic exponent ξ(·).

In fact such a decomposition is possible for specific classes of Lévy processes only. This is for instance the case if the driving Lévy process belongs to the class of spectrally one-sided Lévy processes, i.e., Xt has only negative jumps (Xt ∈ S−) or only positive jumps (Xt ∈ S+): in both cases $\overline{\kappa}(\alpha, q)$ can be expressed in closed form in terms of the characteristic exponent.

Let Xt ∈ S−, and let $\Phi(\beta) := \log \mathbb{E} e^{\beta X_1}$ be the so-called Laplace exponent, where we denote its right inverse by $\Psi(\cdot)$. Then, as it turns out,

$$\overline{\kappa}(\alpha, q) = \frac{\Psi(q)}{\Psi(q) + \alpha}.$$
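As a concrete illustration, the sketch below evaluates this expression for a Brownian motion with drift, which has no jumps and can therefore be treated as a spectrally negative process; for this process the Laplace exponent is quadratic and its right inverse is explicit. The function names and the parameter values are hypothetical.

```python
import math

def laplace_exponent(beta, mu, sigma):
    """Phi(beta) = log E exp(beta X_1) for X_t = mu t + sigma W_t."""
    return mu * beta + 0.5 * sigma ** 2 * beta ** 2

def right_inverse(q, mu, sigma):
    """Psi(q): the largest root of Phi(beta) = q."""
    return (-mu + math.sqrt(mu ** 2 + 2.0 * sigma ** 2 * q)) / sigma ** 2

def kappa_max(alpha, q, mu, sigma):
    """Transform of the running maximum at an exponential time: Psi(q) / (Psi(q) + alpha)."""
    psi = right_inverse(q, mu, sigma)
    return psi / (psi + alpha)

mu, sigma, q = -0.5, 1.0, 0.2
print(right_inverse(q, mu, sigma), kappa_max(1.0, q, mu, sigma))
```

The closed form says that the running maximum at the exponential time is itself exponentially distributed with rate Ψ(q).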

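In the spectrally positive case Xt ∈ S+ the decomposition is usually referred to as the (generalized version of the) Pollaczek-Khinchine formula. Then we have

$$\overline{\kappa}(\alpha, q) = \frac{q}{\psi(q)} \cdot \frac{\psi(q) - \alpha}{q - \varphi(\alpha)},$$

where the Laplace exponent is defined by $\varphi(\alpha) := \log \mathbb{E} e^{-\alpha X_1}$, and $\psi(\cdot)$ is the inverse of $\varphi(\cdot)$. The function $\underline{\kappa}(\alpha, q)$ follows from $\overline{\kappa}(\alpha, q)\,\underline{\kappa}(-\alpha, q) = K(\alpha, q)$ [34].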

The following case can be dealt with (semi-)explicitly as well. If the jumps in one direction

(either downward or upward) have a phase type distribution (which we further comment on

below), whereas the jumps in the other direction are allowed to have a general distribution,

the Wiener-Hopf decomposition can be performed in terms of the roots of the equation q =

ξ(s); see e.g. [72, 71].

Another class of Lévy processes for which the Wiener-Hopf decomposition is possible in



more explicit terms, is the class of processes which have a meromorphic Lévy exponent in the

complex plane, e.g. expressed in terms of beta and digamma functions. Also for this class

of Lévy processes the Wiener-Hopf factorization can be evaluated in terms of the roots of

equation q = ξ(s), but now this equation has infinitely many roots [66, 67].

Any distribution on (0,∞) can be approximated arbitrarily well by a phase-type distribution [10, Thm. III.4.2]; the class of phase-type distributions is dense (in the sense of weak convergence) in the set of all probability distributions on (0,∞). If we are in a situation in which the jumps in both directions are general, we could replace those in one direction by their phase-type counterpart, leading to a Lévy process that is covered by [72, 71]. Several methods have been developed for approximating a distribution on (0,∞) by a phase-type distribution, see for example [40, 57]. The approach which we used in this thesis is based on the expectation-maximization algorithm.

So far we did not discuss the impact of the ‘small jumps’ in case the driving Lévy process has infinite activity. In order to enter the setup of [72, 71], with phase-type jumps in one direction and general jumps in the other direction, it is implicitly assumed that the Lévy process is of finite activity, i.e., $\int_{-\infty}^{\infty} \Pi(dx) < \infty$ (as small jumps cannot be described by a compound Poisson stream of phase-type distributed jumps). This issue can be remedied as follows; focus on the situation that we wish to replace the positive jumps by a phase-type counterpart. The jump distribution on (ε,∞) can be approximated by a phase-type distribution for any specific ε > 0. A Brownian motion with drift (with appropriately chosen parameters) can then compensate for the jumps smaller than ε [10]. By picking the value of ε suitably small, the approximation turns out to be highly accurate. More concretely, assuming the upward jumps are approximated by the phase-type distribution $P_{\rm ph}(dx)$, the parameters of the Brownian motion are calculated by the following formulas:

$$\mu_\varepsilon := \int_0^{\varepsilon} x\, (\Pi - P_{\rm ph})(dx), \qquad \sigma_\varepsilon^2 := \int_0^{\varepsilon} x^2\, (\Pi - P_{\rm ph})(dx).$$

The approximation of small jumps by a deterministic drift process and a Brownian motion is also frequently used in Monte Carlo simulation [10].
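The two small-jump parameters above can be computed by straightforward quadrature. The following is a rough sketch under simplifying assumptions that are not taken from the thesis: the Lévy measure has the gamma-process density a·exp(−bx)/x, and the phase-type approximation is assumed to put no mass on (0, ε), so that only the Lévy measure contributes on that interval. The helper `small_jump_moments` and the parameter values are hypothetical.

```python
import math

def small_jump_moments(levy_density, eps, n=20_000):
    """Midpoint-rule approximations of int_0^eps x nu(x) dx and int_0^eps x^2 nu(x) dx."""
    h = eps / n
    first = second = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        w = levy_density(x) * h
        first += x * w
        second += x * x * w
    return first, second

a, b, eps = 2.0, 3.0, 0.01

def gamma_density(x):
    """Levy density of a gamma process (illustrative choice)."""
    return a * math.exp(-b * x) / x

mu_eps, var_eps = small_jump_moments(gamma_density, eps)

# Closed forms available for this particular density, used as a cross-check
mu_exact = a * (1.0 - math.exp(-b * eps)) / b
var_exact = a * (1.0 - math.exp(-b * eps) * (1.0 + b * eps)) / b ** 2
print(mu_eps, mu_exact, var_eps, var_exact)
```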

From the above, we conclude that if one manages to (sufficiently accurately) approximate the Lévy measure with a phase-type distribution, the factors in the Wiener-Hopf decomposition can be evaluated. As a consequence, we have the Laplace transforms of the running maximum and minimum. The next step is to invert these, in order to evaluate the distributions of the running maximum and minimum. Several methods have been developed for performing Laplace (and/or Fourier) inversion. Most of the Laplace inversion techniques are based on the well-known Poisson summation formula (PSF) [1, 38]. The PSF is given by the following formula, for any $v \in [0, 1)$ and any ‘damping factor’ $a \in \mathbb{R}$:

$$\sum_{k=-\infty}^{\infty} \hat{f}\left(a + 2\pi i (k + v)\right) = \sum_{k=0}^{\infty} e^{-ak} e^{-2\pi i k v} f(k),$$

where $\hat{f}$ is the Laplace (Fourier) transform of the function $f(x)$. The right-hand side of this equation is a discrete Fourier transform, which can be computed efficiently by the well-known fast Fourier transform algorithm, provided that one can evaluate the left-hand side of the equation. The method developed by den Iseger [36] approximates the infinite sum by a finite sum at appropriately chosen points and weights. This technique has been extensively tested, and has been shown to compute Laplace and Fourier inversions quickly and accurately. The technique can be adapted to perform the inversion of non-smooth functions and even functions with singularities. It is also remarked that the extension of the method to multi-dimensional mixed Laplace/Fourier inversion is straightforward; for details of the implementation, as well as a series of extensions, we refer to [36].
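To give a flavour of numerical Laplace inversion, the sketch below implements a classical Fourier-series inversion with Euler summation, in the style of Abate and Whitt. This is explicitly not den Iseger's algorithm [36], which uses a different quadrature rule; it is only a simpler relative that also traces back to the Poisson summation formula. The parameter defaults (A = 18.4, n = 15, m = 11) are the values commonly quoted for this method, and the function name is hypothetical.

```python
import math
from math import comb

def euler_inversion(F, t, A=18.4, n=15, m=11):
    """Approximate f(t) from its Laplace transform F by a Fourier series with Euler summation."""
    h = A / (2.0 * t)
    partial = 0.5 * F(complex(h, 0.0)).real
    sums = []
    for k in range(1, n + m + 1):
        # k-th term of the alternating series, evaluated at (A + 2 k pi i) / (2 t)
        partial += (-1) ** k * F(complex(h, k * math.pi / t)).real
        sums.append(partial)                     # sums[k - 1] holds the partial sum s_k
    # Euler summation: binomial average of the partial sums s_n, ..., s_{n+m}
    averaged = sum(comb(m, j) * sums[n + j - 1] for j in range(m + 1)) / 2 ** m
    return math.exp(A / 2.0) / t * averaged

# Example: F(s) = 1 / (s + 1) is the transform of f(t) = exp(-t)
for t in (0.5, 1.0, 2.0):
    print(t, euler_inversion(lambda s: 1.0 / (s + 1.0), t), math.exp(-t))
```

The binomial averaging accelerates the slowly converging alternating series; without it, many more terms would be needed for comparable accuracy.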

1.4 Preliminaries on Markov fluid models

The topics of fluctuation theory and queueing theory are intimately related; e.g. the steady-

state distribution of the workload in a queueing system can often be translated in terms of

the probability of an associated ‘free’ (i.e., not truncated at 0) process attaining a given set.

In this sense, techniques developed in the context of fluctuation theory for Lévy processes,

can be used in the context of Lévy-driven queues as well [34, 83]. In this section we leave the

setting of Lévy processes though, focusing on queues with Markov fluid input. We provide

the preliminaries necessary for the last chapter of this thesis.

Queueing theory studies the evolution of a storage process, which can be in terms of either

(discrete) customers or workload. A queue is characterized by an arrival process, the distri-

bution of the service requirements, and a service discipline. In general a queue can have one

or more servers (where servers can be interpreted as internet servers, cashiers in shops, etc.),

which process the clients’ jobs. Often the arrival process and service times are uncertain, and

therefore random objects are used to model these.


There is a vast body of literature on the mathematical modeling of all sorts of queues. A large subclass of these models can be summarized by the notation A/B/n/m − S, a characterization due to Kendall in the 1950s. Here A and B correspond to, respectively, the distributional properties of the arrival process and the service requirements. The number of servers and the maximum number of jobs which can wait in the system (i.e., in its buffer) are indicated by n and m, respectively. Finally, S specifies the service discipline; it can be for instance first-come-first-served or processor-sharing.

The model we consider in this thesis does not fit in the Kendall notation. We consider a fluid model, in which a reservoir is fed by a continuous traffic stream [65, 83, 4, 63, 30]. The server may be considered as the output flow of the reservoir; the output rate is usually considered constant but may also be stochastic [74, 75, 89, 42, 20]. We give a more detailed description below.

1.4.1 Markov fluid model

Consider the following fluid reservoir (or buffer), where the amount of fluid in the reservoir at time $t$ is denoted by $C_t$. Let $(X_t)_t$ denote the so-called background process, which is assumed to be an irreducible continuous-time Markov process; this background process models the stochasticity of the input flow into the reservoir. We assume that $X_t$ has a finite state space $\mathcal{N} \subset \mathbb{N}$, i.e., $X_t$ attains values in the set $\mathcal{N} = \{1, 2, \dots, N\}$.

Now the content of the reservoir is driven by $(X_t)_t$, in the sense that the input rate into the reservoir is $r_i$ when the process $(X_t)_t$ is in state $i \in \mathcal{N}$, unless the reservoir is empty and the net input flow rate is negative (as in that situation the reservoir remains empty). The content of the reservoir is evidently stochastic, and its dynamics are given by the following differential equation:

$$\frac{dC_t}{dt} = \begin{cases} \max(r_i, 0) & \text{if } C_t = 0, \\ r_i & \text{if } C_t > 0. \end{cases} \tag{1.4}$$

The above formula tacitly assumes that the buffer has infinite capacity; as an aside, we note that if there is a finite buffer $B > 0$, the following equation holds:

$$\frac{dC_t}{dt} = \begin{cases} \max(r_i, 0) & \text{if } C_t = 0, \\ \min(r_i, 0) & \text{if } C_t = B, \\ r_i & \text{if } C_t \in (0, B). \end{cases}$$
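Because the net input rate is constant between jumps of the background process, the dynamics above can be integrated exactly over each sojourn interval: the content moves linearly and is clamped at the boundaries 0 and B. The following minimal sketch illustrates this; the function names and the example path are hypothetical.

```python
def evolve_content(content, rate, duration, capacity=float("inf")):
    """Buffer content after `duration` time units at constant net input rate `rate`."""
    return min(max(content + rate * duration, 0.0), capacity)

def simulate(path, initial=0.0, capacity=float("inf")):
    """`path` is a list of (net_rate, sojourn_time) pairs of the background process."""
    content, trace = initial, [initial]
    for rate, duration in path:
        content = evolve_content(content, rate, duration, capacity)
        trace.append(content)
    return trace

# Fill for 1 time unit, drain for 3 (the buffer empties), fill for 4 (the buffer saturates at B = 5)
trace = simulate([(2.0, 1.0), (-1.0, 3.0), (2.0, 4.0)], capacity=5.0)
print(trace)  # [0.0, 2.0, 0.0, 5.0]
```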



It is assumed that at least one of the net flow rates is positive, as otherwise the model is trivial (in that $C_t = 0$ all the time). Moreover, when the buffer capacity is infinite, the requirement that the queue be stable means that we have to impose the condition

$$\sum_{i \in \mathcal{N}} p_i r_i < 0,$$

where $p_i$ is the stationary probability of $X_t$ being in state $i \in \mathcal{N}$; if this condition were not met, the buffer content would grow to infinity [65].
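The stability condition is easy to check numerically once the stationary distribution of the background process is available. The sketch below computes it by iterating the uniformized transition matrix, which is one simple option among several (solving the linear system directly would work equally well); the two-state generator and the rates are hypothetical example values.

```python
def stationary_distribution(Q, iterations=10_000):
    """Stationary row vector p of an irreducible generator Q, via the uniformized chain P = I + Q / rate."""
    n = len(Q)
    rate = 2.0 * max(-Q[i][i] for i in range(n))   # uniformization rate, strictly above all exit rates
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [p[j] + sum(p[i] * Q[i][j] for i in range(n)) / rate for j in range(n)]
    return p

def mean_drift(Q, rates):
    """Sum of p_i r_i; the infinite-buffer queue is stable when this is negative."""
    p = stationary_distribution(Q)
    return sum(p_i * r_i for p_i, r_i in zip(p, rates))

Q = [[-1.0, 1.0], [2.0, -2.0]]    # hypothetical two-state generator
rates = [1.0, -3.0]               # hypothetical net input rates
print(stationary_distribution(Q), mean_drift(Q, rates))
```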

We denote by $F_i(y, t)$ the probability that the content is at most $y$ and the background process $X_t$ is in state $i$. In other words,

$$F_i(y, t) := \mathbb{P}[X_t = i, C_t \le y], \qquad i \in \mathcal{N},\ y \ge 0.$$

In addition, the background process is a continuous-time Markov chain with generator matrix $Q = [q_{ij}]$ such that, for $j \neq i$,

$$\begin{aligned}
\mathbb{P}[X(t+h) = j \mid X(t) = i] &= q_{ij} h + o(h), \\
\mathbb{P}[X(t+h) = i \mid X(t) = i] &= 1 + q_{ii} h + o(h),
\end{aligned} \tag{1.5}$$

where $q_{ij} \ge 0$ if $i \neq j$ and $q_{ii} = -\sum_{j \neq i} q_{ij}$, for $i \in \mathcal{N}$.

Consider $F_i(y, t+h)$, i.e., the probability that the background process is in state $i$ and the buffer content is at most $y$ at time $t + h$, where $h$ ‘is infinitesimally small’. Using Equations (1.5), $F_i(y, t+h)$ can be expressed in terms of the $F_j(y, t)$. Let $y > 0$, $h > 0$ and $y - r_i h > 0$ for all $i \in \mathcal{N}$. Then,

$$F_i(y, t+h) = (1 + q_{ii} h) F_i(y - r_i h, t) + h \sum_{j \neq i} q_{ji} F_j(y - r_j h, t) + o(h). \tag{1.6}$$

It is elementary to rewrite the above equation as

$$\left(F_i(y, t+h) - F_i(y, t)\right) + \left(F_i(y, t) - F_i(y - r_i h, t)\right) = h \sum_{j \in \mathcal{N}} q_{ji} F_j(y - r_j h, t) + o(h).$$

Assuming that $\partial_y F_i$ and $\partial_t F_i$ exist, dividing by $h$, in the limit $h \to 0$ we obtain the following


equation:

$$\frac{\partial F_i(y, t)}{\partial t} = \sum_{j \in \mathcal{N}} q_{ji} F_j(y, t) - \frac{\partial F_i(y, t)}{\partial y} r_i.$$

This equation (which can be regarded as a Kolmogorov forward equation) [65, 60, 77] can be written in compact matrix form; then it becomes

$$\frac{\partial F(y, t)}{\partial t} = F(y, t) Q - \frac{\partial F(y, t)}{\partial y} R, \qquad y > 0, \tag{1.7}$$

where $F(y, t) = (F_1(y, t), \dots, F_N(y, t))^{\rm T}$ and $R = \mathrm{diag}(r_1, \dots, r_N)$. When the stability condition is satisfied, it can be shown that there exists a stationary distribution of $(X_t, C_t)$; i.e., the partial derivative with respect to time vanishes as $t \to \infty$, and we have that

$$F(y) Q = \frac{\partial F(y)}{\partial y} R, \tag{1.8}$$

where F (y) denotes the corresponding stationary distribution of the fluid level in the reser-

voir.

Evidently, (1.8) alone does not uniquely specify the stationary distribution; a set of coefficients still needs to be specified. More specifically, the solution of (1.8) has the form

$$F(y) = \sum_{j=1}^{N} c_j e^{\xi_j y} V^{(j)}, \tag{1.9}$$

where $(\xi_j, V^{(j)})$ are the eigenvalues and corresponding eigenvectors of the matrix $R^{-1} Q^{\rm T}$, and the $c_j$ are constants which have to be determined by imposing boundary conditions [69]. In particular, when the buffer size is infinitely large, if $\mathrm{Re}(\xi_j) > 0$, the corresponding coefficient $c_j$ has to be zero, as otherwise the probability $F(y)$ cannot be bounded between 0 and 1. We divide the states into two separate sets, in terms of their net input rates; we define $\mathcal{N}_+ \equiv \{i \in \mathcal{N} \mid r_i > 0\}$ and $\mathcal{N}_- \equiv \{i \in \mathcal{N} \mid r_i < 0\}$ (assuming for ease that there is no $i$ such that $r_i = 0$). As a consequence, we have the following boundary condition corresponding to an empty reservoir:

$$F_i(0) = 0, \qquad i \in \mathcal{N}_+;$$

recall that the content of the reservoir is increasing when $X_t \in \mathcal{N}_+$. On the other hand, if the



buffer is finite then the following boundary conditions have to hold:

$$F_i(B-) = p_i, \qquad i \in \mathcal{N}_-,$$

where $B$ is the buffer size. It turns out that these conditions yield as many equations as there are unknowns $c_j$ (viz. $N$). An important role in this argument is played by the property that, if the stability condition is satisfied, the number of eigenvalues with negative real part equals $|\mathcal{N}_+|$, there is one zero eigenvalue, and the other eigenvalues have positive real part.
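For a two-state background process with an infinite buffer, the spectral solution can be written down by hand, which makes for a compact illustration of the procedure. In the sketch below (an assumed on/off-type example, not taken from the thesis) state 1 has rate r1 > 0, state 2 has rate r2 < 0, and the generator has off-diagonal entries a (from 1 to 2) and b (from 2 to 1). The matrix R^{-1}Q^T then has the eigenvalue 0, with eigenvector proportional to the stationary distribution, and a single negative eigenvalue with eigenvector (r2, −r1); the remaining coefficient follows from the boundary condition F_1(0) = 0.

```python
import math

def two_state_solution(a, b, r1, r2):
    """Stationary F_1, F_2 for an infinite buffer; state 1 has rate r1 > 0, state 2 has rate r2 < 0."""
    p1, p2 = b / (a + b), a / (a + b)           # stationary distribution of the background chain
    assert r1 > 0 > r2 and p1 * r1 + p2 * r2 < 0, "need r1 > 0 > r2 and a stable queue"
    xi = -(a / r1 + b / r2)                      # the only negative eigenvalue of R^{-1} Q^T
    c = -p1 / r2                                 # coefficient fixed by F_1(0) = 0
    F1 = lambda y: p1 + c * r2 * math.exp(xi * y)
    F2 = lambda y: p2 - c * r1 * math.exp(xi * y)
    return xi, F1, F2

xi, F1, F2 = two_state_solution(a=1.0, b=2.0, r1=1.0, r2=-3.0)
print(xi, F1(0.0), F2(0.0), F1(50.0) + F2(50.0))
```

In this example F_2(0) is the probability of an empty buffer, and the tail of the buffer content decays exponentially at rate −ξ.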

1.4.2 Workload-dependent service rate models

In the setting above, the net input rates and the transition matrix did not depend on the

current buffer content (i.e., the fluid amount in the reservoir). We now consider situations in

which we depart from this framework.

A first scenario is that in which the rates $r_i$ and the transition matrix $Q$ are functions of the current buffer content. This class of models is, for obvious reasons, sometimes referred to as feedback models. We primarily consider the case in which there are (finitely many) levels such that between two subsequent levels $r_i$ and $Q$ are constant [89, 42]. We consider a second scenario as well, in which the rates $r_i$ depend on the direction in which the level is crossed; we call this scenario a hysteresis-type model [74, 75].

First we consider models in which $r_i$ and $Q$ are locally constant between specified levels, and do not depend on the direction in which the level is crossed. Suppose there are $K + 1$ buffer levels such that

$$0 = B_0 \le B_1 \le \dots \le B_{K-1} \le B_K \le \infty$$

and, as a consequence, $K$ buffer regimes; when $y \in (B_{k-1}, B_k)$ the flow rates are given by $R^k = \mathrm{diag}(r^k_1, \dots, r^k_N)$, where $k \in \{1, 2, \dots, K\}$. We remark that this setup can even be extended to a continuous counterpart, in which $r_i$ and $Q$ depend on the current buffer level in a continuous fashion [89]; here we only consider the case that the flow rates are piecewise constant functions of the buffer content.

Assume that all matrices $R^k$ are invertible. As a consequence, the steady-state distribution $F^k_i(y)$ of the buffer content in each regime $k$ follows from a Kolmogorov equation in the spirit of Equation (1.8) (with $R$ and $Q$ replaced by their ‘local counterparts’), and the (local) solutions are given by an expression in the spirit of Equation (1.9). To complete the


solution we need to align the solutions corresponding to the individual regimes. This is

done by imposing the appropriate boundary conditions at each buffer threshold. It requires

a substantial amount of administration to identify all boundary conditions, and to verify

that their number equals the number of unknown coefficients.

We now focus on the hysteresis-type model. For ease, we now assume that the buffer is infinite and there are only two thresholds, the lower level and the upper level, which are indicated by $B_\ell$ and $B_u$, respectively (where obviously $B_\ell \le B_u$). In addition, for ease we assume that in each background process state, the net input rate can take two values (i.e., a higher and a lower value). Introduce an indicator process $I(\cdot) \in \{+, -\}$, corresponding to the two regimes, i.e., those in which the higher and lower net input rates apply.

The process is assumed to evolve as follows. It starts with an empty buffer and the indicator is +. The indicator stays + as long as the buffer content remains below $B_u$. At the moment that the buffer content reaches the level $B_u$ (and the background process is in state $i$), the indicator changes from + into − (while the background process remains in $i$). Then the indicator remains − as long as the buffer content is above the level $B_\ell$; when the content reaches $B_\ell$, the indicator changes to + again (where the background process does not change). The process continues in this fashion.
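The switching rule just described is a small two-state machine. The sketch below is a minimal illustration with hypothetical names and threshold values; it only tracks the indicator for a given sequence of observed buffer contents and does not model the background process.

```python
def next_indicator(indicator, content, lower, upper):
    """Update the regime indicator ('+' or '-') given the current buffer content."""
    if indicator == "+" and content >= upper:
        return "-"      # the upper threshold is reached from below
    if indicator == "-" and content <= lower:
        return "+"      # the lower threshold is reached from above
    return indicator    # between the thresholds the indicator keeps its last value

indicator, history = "+", []
for content in [0.0, 2.0, 4.0, 3.0, 1.5, 1.0, 2.5, 4.0]:
    indicator = next_indicator(indicator, content, lower=1.0, upper=4.0)
    history.append(indicator)
print(history)  # ['+', '+', '-', '-', '-', '+', '+', '-']
```

Note that for contents strictly between the two thresholds the indicator depends on the path history, which is exactly the hysteresis effect.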

With techniques similar to those explained above, we can evaluate the steady-state probabilities

$$F^-_i(x) := \mathbb{P}(I = -, X = i, C \le x), \qquad F^+_i(x) := \mathbb{P}(I = +, X = i, C \le x).$$

Finding these is again a matter of setting up differential equations, and imposing the appro-

priate boundary conditions. For a detailed analysis we refer to [74].

1.5 Outline of thesis, contributions

The primary objective of this thesis is to contribute to the development of computational

techniques in queueing and fluctuation theory. Essentially, three types of techniques will be

explored.

• In the first place, we systematically validate a (one- and two-dimensional) Laplace and

Fourier inversion algorithm. Our approach is based on an algorithm proposed by den

Iseger [36], and several variants thereof, which essentially rely on the Poisson summa-

tion formula. We do so in the context of Lévy fluctuation theory, aiming at numerically



evaluating the probability distribution of the running maximum process. While this

approach has a variety of potential applications, we primarily focus on applying it to

price specific exotic options, viz. the so-called lookback option.

• The second technique we present in this thesis is importance sampling. This technique

aims at reducing the variance of simulation-based estimators of performance mea-

sures. For instance, when estimating rare event probabilities, straightforward simu-

lation is extremely time consuming, as many paths need to be generated in order to

obtain an estimate with low variance. The idea behind importance sampling is to gen-

erate paths under an alternative measure, which is chosen such that the event under

study is not rare anymore. Correcting the simulation output by a likelihood ratio, the

resulting estimator is unbiased. The challenge is to find a good new measure, for which

the variance of the estimator (provably) reduces.

• In the third place, considering a node in a communication network, we assess the

tradeoff between the quality of service and the energy consumption. Different service

scenarios are considered; in each scenario the service speed is determined in a specific

way by the evolution of the buffer occupancy. We use techniques from optimization to

minimize a cost function (encompassing quality of service and energy consumption),

where the parameter space covers all feasible service strategies. The optimization rou-

tine is based on simulated annealing in combination with classical Newton-Raphson-

type algorithms, in the sense that we identify by simulated annealing the initial point

of the Newton-Raphson optimization routine (which locally finds the optimum).
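The importance sampling idea from the second item above can be illustrated with a textbook example that is unrelated to the specific Lévy setting of this thesis: estimating a small Gaussian tail probability by sampling from a shifted Gaussian and reweighting with the likelihood ratio. The function name, the sample size, and the seed below are hypothetical choices.

```python
import math
import random

def tail_probability_is(level, n_samples, seed=0):
    """Importance-sampling estimate of P(Z > level), Z ~ N(0, 1), sampling from N(level, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(level, 1.0)                    # sample under the alternative measure
        if y > level:
            # likelihood ratio of N(0, 1) with respect to N(level, 1), evaluated at y
            total += math.exp(-level * y + 0.5 * level ** 2)
    return total / n_samples

estimate = tail_probability_is(4.0, 20_000)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))        # exact Gaussian tail for comparison
print(estimate, exact)
```

Under the shifted measure the event is no longer rare (it occurs for about half of the samples), whereas a naive simulation with the same budget would typically observe it at most once.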

In more detail, the contributions of the individual chapters can be summarized as follows.

There are four studies, the first three focusing on fluctuation-theoretic aspects in a Lévy-

driven system, and the last evaluating the tradeoff between quality of service and energy

consumption in a queue with fluid input. We now detail the specific contributions.

Chapter 2 presents a framework for numerical computations in fluctuation theory for Lévy processes. More specifically, with as before $\overline{X}_t := \sup_{0 \le s \le t} X_s$ denoting the running maximum of the Lévy process $X_t$, the aim is to evaluate $\mathbb{P}(\overline{X}_t \le x)$ for $t, x > 0$. We do so by approximating the Lévy process under consideration by another Lévy process for which the double transform $\mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}}$ is known, with $\tau(q)$ an exponentially distributed random variable with mean $1/q$; then we use a fast and highly accurate Laplace inversion technique (of almost machine precision) to obtain the distribution of $\overline{X}_t$. A broad range of examples illustrates the attractive features of our approach. This chapter is based on [5].



In Chapter 3 our objective is to compute the prices (and corresponding sensitivities, known as Greeks) of lookback options driven by Lévy processes. In this setup, the risk-neutral evolution of the stock price, say $S_t$, is given by $S_0 e^{X_t}$, with $S_0$ the initial price and $X_t$ representing a Lévy process. Lookback option prices are functions of the stock price $S_T$ at the maturity time $T$ and the running maximum $\overline{S}_T := \sup_{0 \le t \le T} S_t$, and as a consequence the Wiener-Hopf decomposition provides us with all probabilistic information needed to evaluate these prices. To overcome the complication that in general only an implicit form of the Wiener-Hopf factors is available, we follow the same approach as in Chapter 2: we approximate the Lévy process under consideration by an appropriately chosen other Lévy process for which the double transform $\mathbb{E} e^{-\alpha \overline{X}_{\tau(q)}}$ is known; as before, $\tau(q)$ is an exponentially distributed random variable with mean $1/q$. The second step is to write the transform of the lookback option prices in terms of this double transform. Finally, we use state-of-the-art numerical inversion techniques to compute the prices and Greeks (i.e., sensitivities with respect to the initial price $S_0$ and the maturity time $T$); these rely on the techniques featuring in Chapter 2. We test our procedure for a broad range of relevant Lévy processes, including a number of ‘traditional’ models (Black-Scholes, Merton) and more recently proposed models (CGMY and Beta processes), showing excellent performance in terms of speed and accuracy. This work has been submitted for publication to the Journal of Computational Finance [7].

In Chapter 4 the focus is on numerical techniques to evaluate rare-event probabilities in a

Lévy setting. We analyze the tail asymptotics corresponding to the all-time maximum value

attained by a Lévy process with negative drift. This chapter has two main contributions: a

short and elementary proof of these asymptotics, and an importance sampling algorithm to

estimate the rare-event probabilities under consideration. This chapter is based on [6] which

has been accepted for publication in Statistics and Probability Letters.

Finally, Chapter 5 considers the tradeoff between quality of service and capacity cost in

communication networks. More specifically, we develop techniques for analyzing and op-

timizing energy management in multi-core servers with so-called speed scaling capabilities

(i.e., the service speed can be adjusted based on the current buffer occupancy, or the evolu-

tion of the buffer occupancy in the recent past). Our framework incorporates the processor’s

dynamic power, but it also accounts for other intricate and relevant power features such

as the static (leakage) power and switching overhead between speed levels. Using stochas-

tic fluid models to capture traffic burst dynamics, the chapter proposes and studies differ-

ent strategies for adapting the multi-core processor speeds based on the observable buffer

content, so as to optimize objective functions that balance energy consumption and performance.

The strategies can be non-hysteretic (i.e., the processor speed depends on the current

buffer level relative to the buffer thresholds) or hysteretic (i.e., it matters in which direction

the buffer thresholds are crossed). It is shown that, under rather general conditions, strate-

gies which use more threshold levels are more efficient with respect to power consumption;

however, most of the efficiency gain is achieved with 1 or 2 thresholds only. In addition,

the optimal power consumptions of the different strategies are only very mildly sensitive

to perturbations in the input parameters, implying the highly advantageous property that

the performance is robust to estimation errors in the system’s input traffic parameters. This

chapter has appeared as [9], and a short version as [8].

Our objective is that all chapters are self-contained, i.e., they can be read separately. As a

consequence, there will be some inevitable amount of overlap between these chapters. We

also remark that we have pursued a maximum level of uniformity regarding the notation

used throughout the thesis, and that this is also in line with the notation introduced in the

present chapter.


Chapter 2

Numerical techniques in Lévy fluctuation theory

In this chapter we discuss a numerical technique to quickly and accurately evaluate the distribution of the running supremum attained by a Lévy process.

2.1 Introduction

As explained in Chapter 1, owing to their wide applicability and their attractive mathemati-

cal properties, Lévy processes play an important role in applied probability. In mathematical

terms, they are characterized as processes with stationary and independent increments, and,

as such, the class of Lévy processes covers e.g. Brownian motion and (compound) Poisson

processes (but is substantially broader; for instance processes with infinitely many jumps

in finite time intervals belong to this class as well). Over the past decades Lévy processes

have found widespread use in various application domains. More specifically, they are in-

tensively studied in both mathematical finance and operations research, see, among many

other sources, for instance [10, 31, 35].

With $X_t$ denoting the Lévy process (assuming $X_0 = 0$), a substantial research effort concentrates on analyzing probabilistic properties of the so-called running maximum process $\bar X_t := \sup_{0 \le s \le t} X_s$. More particularly, one wishes to determine the probability $\mathbb{P}(\bar X_t \le x)$ for $t, x > 0$, or alternatively the corresponding density. The branch of research focusing on this type of problems is commonly known as fluctuation theory [21, 68, 83].


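As mentioned in Chapter 1, a Lévy process is characterized by its Lévy exponent $\log \mathbb{E}e^{isX_1}$, which is necessarily of the form

$$\log \mathbb{E}e^{isX_1} = isd - \tfrac{1}{2}s^2\sigma^2 + \int_{-\infty}^{\infty}\left(e^{isx} - 1 - isx\,1_{\{|x|<1\}}\right)\Pi(dx), \tag{2.1}$$

where $d \in \mathbb{R}$, $\sigma \ge 0$, and the spectral measure $\Pi(\cdot)$, concentrated on $\mathbb{R}\setminus\{0\}$, satisfies

$$\int_{\mathbb{R}} \min\{x^2, 1\}\,\Pi(dx) < \infty.$$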

The triplet (d, σ2,Π) is usually referred to as the characteristic triplet, as it uniquely defines the

Lévy process [21, Ch. I, Thm. 1]. The three terms in the right-hand side of the representation

(2.1) are, for obvious reasons, often called the (deterministic) drift term, the Brownian term,

and the jump term. Special cases of Lévy processes are deterministic drifts (only a drift term)

and Brownian motions (only a Brownian term). The class of Lévy processes also contains compound Poisson processes; then we just have the jump term (together with the first term if a deterministic drift is present), and in addition there should be a well-defined arrival rate (which requires that $\int_{-\infty}^{\infty}\Pi(dx) < \infty$). The class is wider though, as it also includes processes with infinitely many jumps in a finite amount of time (usually referred to as ‘small jumps’); this happens in case $\int_{-\infty}^{\infty}\Pi(dx) = \infty$.

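In principle the distribution of $\bar X_t$ is fully specified through the so-called Wiener-Hopf decomposition, see e.g. [68, Ch. 6]. It implies that, with $\tau(q)$ denoting an exponential random variable with mean $1/q$ that is independent of the Lévy process $X_t$,

$$\kappa(\alpha,q) := \mathbb{E}e^{-\alpha \bar X_{\tau(q)}} = k_0 \exp\left(-\int_0^\infty\int_{(0,\infty)} \frac{1}{t}\left(e^{-t} - e^{-qt-\alpha x}\right)\mathbb{P}(X_t \in dx)\,dt\right), \tag{2.2}$$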

where $k_0$ is a normalizing constant. From a practical standpoint, the use of this characterization is limited, as it provides us with the double transform of $\mathbb{P}(\bar X_t \in dx)$; realize that

$$\frac{1}{q}\,\mathbb{E}e^{-\alpha \bar X_{\tau(q)}} = \int_0^\infty e^{-qt}\int_0^\infty e^{-\alpha x}\,\mathbb{P}(\bar X_t \in dx)\,dt, \tag{2.3}$$

which in general cannot be inverted explicitly.

The above entails that, in order to get numerical values for the density P(Xt ∈ dx) or the

distribution function P(Xt ≤ x), one option is to (i) first evaluate the double integral (2.2)

numerically, and then to (ii) numerically invert the double transform (2.3). The primary


objective of this chapter is to develop a methodology to evaluate P(Xt ≤ x), but to do so

by bypassing stage (i) above. The underlying idea is that we make use of the fact that for

quite a substantial class of Lévy processes Xt, the double transform κ(α, q) can be expressed

explicitly in terms of the Lévy exponent; we replace the Lévy process under consideration by

a (suitably chosen) Lévy process in this class, so that only stage (ii) remains to be performed.

As mentioned above, for a broad class of Lévy processes the double transform κ(α, q) can be

expressed explicitly in terms of the Lévy exponent; in some cases still a number of (relatively

straightforward) numerical computations need to be performed. We give a brief overview

of such processes here.

• The most standard examples in which this is possible are the ones in which the under-

lying Lévy process is spectrally one-sided. This means that Xt has either only negative

jumps (the spectrally negative case; write X ∈ S−) or only positive jumps (the spec-

trally positive case; write X ∈ S+). In the former case the running maximum up to

the exponential epoch τ(q) has an exponential distribution, whereas in the latter case

the so-called generalized Pollaczek-Khinchine formula applies; see e.g. [35, Ch. III

and IV]. In both cases, κ(α, q) can be expressed in closed-form in terms of the Lévy

exponent.

• More recently it has been found that κ(α, q) can be expressed in semi-explicit terms if the jumps in one direction (either upward or downward) are phase-type (or, more generally, have a rational Laplace transform), whereas the jumps in the other direction are allowed to have a general distribution; see [11, 71, 72] for results along these lines.

In this chapter, we concentrate on the setting of Lewis and Mordecki [71] in which the

positive jumps have a rational Laplace transform, and the downward jumps are general;

we write X ∈ R. In this case κ(α, q) can be expressed in terms of the zeros of a specific

equation (that needs to be solved numerically).

• If the Lévy exponent is a meromorphic function (write: X ∈ M ), expressed in terms of

beta and digamma functions, the Wiener-Hopf factorization can be done in essentially

the same way as in case of phase-type distributed jumps [66, 67]. This Wiener-Hopf fac-

torization, however, is now in terms of an infinite product, due to the infinitely many

poles of the Lévy exponent, so there is a truncation error. In the context of the present

chapter we consider the class of Beta processes [66, Section 4], which has meromorphic

Lévy exponent.



As indicated above, in our numerical evaluation scheme we approximate the Lévy process

under consideration by one in the class for which we can compute the double transform

κ(α, q) explicitly (that is, a Lévy process in S−, S+, or R). In case there are non-phase-type

jumps in both directions, the jumps in one direction are approximated by using a phase-type

distribution; if there are ‘small jumps’, we approximate the jumps of the Lévy process by

the sum of an appropriately chosen compound Poisson process and Brownian motion [15].

Then we have an approximation for κ(α, q), which is inverted using the inversion approach

presented in [36]; this approach can be considered as ‘state-of-the-art’ in terms of accuracy

(near machine precision), speed and general applicability.

To the best of our knowledge, our study is the first systematic account that tackles the nu-

merical evaluation of P(Xt ≤ x) for t, x > 0 (or the corresponding density) in full generality.

Building on the ideas mentioned above, we study in great detail the numerical accuracy and

complexity of our approximation method. This is done for an extensive set of examples,

covering many of the specific Lévy processes proposed in the literature. It is noted that par-

ticular Lévy processes were already dealt with before, see for instance [13] for the CGMY

process; [87, 92] focus on numerical aspects related to the spectrally-negative case.

The remainder of this chapter is organized as follows. Section 2.2 sketches the preliminaries

of our approach: it reviews the results for the spectrally one-sided case as well as the results

from [71] for the case the positive jumps have a rational Laplace transform. In Section 2.3

the case of one-sided jumps is dealt with, with a focus on Brownian motion and compound

Poisson; the output of the numerical experiments is validated against either exact results

or simulation-based results. Then Section 2.4 studies the effect of replacing the positive

jumps by a phase-type counterpart; to assess the accuracy of the method we also perform

these approximations for instances that do allow explicit calculation of the double transform

κ(α, q). Section 2.5 concerns the approximation of small jumps by the sum of a Brownian

motion and a compound Poisson process. When the Wiener-Hopf factorization is available,

there is an efficient method [67] for sampling the running maximum (called Wiener-Hopf

Monte Carlo, or WH-MC). In Section 2.6 we consider Beta processes, and use WH-MC to

assess the accuracy of our approximation technique. In addition, as Beta processes are in

M , we can represent κ(α, q) as an infinite product; we also include the results obtained by

truncating this product and performing the inversion.



2.2 Preliminaries

Recalling that we denote by $\bar X_t$ the running maximum process of the Lévy process $X_t$, and by $\tau(q)$ an exponentially distributed random variable (with mean $1/q$, for $q > 0$), we review in this section Lévy processes for which the double transform of $\bar X_{\tau(q)}$, denoted by κ(α, q),

can be explicitly expressed in terms of the model’s primitives, or immediately computable

quantities.

We first consider the situation that there are no positive jumps, that is, the spectrally negative case. Following [21, Ch. VII], for X ∈ S− we define $\Phi(\beta) := \log \mathbb{E}e^{\beta X_1}$, and $\Psi(\cdot)$ its right-inverse [68, p. 211]. Then κ(α, q) satisfies the following simple expression:

$$\kappa(\alpha,q) = \frac{\Psi(q)}{\Psi(q)+\alpha}. \tag{2.4}$$

In other words, $\bar X_{\tau(q)}$ is exponentially distributed with parameter $\Psi(q)$, or, equivalently,

$$\int_0^\infty q e^{-qt}\,\mathbb{P}(\bar X_t \in dx)\,dt = \Psi(q)e^{-\Psi(q)x}\,dx. \tag{2.5}$$

Then we consider the case of no negative jumps, usually referred to as the spectrally positive case. For X ∈ S+ we define the Laplace exponent by the function $\varphi(\cdot) : [0,\infty) \to [0,\infty)$, defined through $\varphi(\alpha) := \log \mathbb{E}e^{-\alpha X_1}$. In this case, with $\psi(\cdot)$ being the inverse of $\varphi(\cdot)$,

$$\kappa(\alpha,q) = \frac{q}{\psi(q)}\cdot\frac{\psi(q)-\alpha}{q-\varphi(\alpha)}. \tag{2.6}$$

This result is sometimes referred to as the (generalized) Pollaczek-Khinchine formula [55,

98]; see also [10, Ch. IX, Thm. 3.10].
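As a sanity check on (2.4) and (2.6), the sketch below evaluates both expressions for Brownian motion with drift, which is spectrally negative and spectrally positive at the same time, so the two formulas should agree. The function names and the parameter values are illustrative assumptions; the closed forms for the inverses follow from solving the quadratic equations $\Phi(\beta) = q$ and $\varphi(\alpha) = q$ for this particular process.

```python
import math

def kappa_spectrally_negative(alpha, q, d, sigma):
    """Eq. (2.4) for X_t = d*t + sigma*B_t, where Phi(beta) = d*beta + sigma^2*beta^2/2."""
    # Right-inverse of Phi: the nonnegative root of Phi(beta) = q.
    big_psi = (-d + math.sqrt(d * d + 2.0 * sigma ** 2 * q)) / sigma ** 2
    return big_psi / (big_psi + alpha)

def kappa_spectrally_positive(alpha, q, d, sigma):
    """Eq. (2.6) for the same process, where phi(alpha) = -d*alpha + sigma^2*alpha^2/2."""
    phi = -d * alpha + 0.5 * sigma ** 2 * alpha ** 2
    # Inverse of phi: the positive root of phi(alpha) = q.
    small_psi = (d + math.sqrt(d * d + 2.0 * sigma ** 2 * q)) / sigma ** 2
    # Note: alpha = small_psi is a removable singularity (0/0) and is avoided below.
    return (q / small_psi) * (small_psi - alpha) / (q - phi)

if __name__ == "__main__":
    d, sigma = -0.5, 1.0  # illustrative values
    for alpha, q in [(0.3, 1.0), (2.0, 0.5), (5.0, 4.0)]:
        print(alpha, q,
              kappa_spectrally_negative(alpha, q, d, sigma),
              kappa_spectrally_positive(alpha, q, d, sigma))
```

Both columns of the printed output should coincide up to rounding, which reflects the algebraic identity $\psi(q)\Psi(q) = 2q/\sigma^2$ for this process.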

We finally consider the case in which the jumps in the downward direction are general, but

those in the upward direction are assumed to have a rational Laplace transform [72]. We

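define this class R by the Lévy processes $X_t$ such that, for a finite and positive $\lambda$,

$$\xi(s) := \log \mathbb{E}e^{isX_1} = isd - \tfrac{1}{2}s^2\sigma^2 + \int_{-\infty}^{0}\left(e^{isx} - 1 - isx\,1_{\{x>-1\}}\right)\Pi(dx) + \lambda\left(\sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{kj}\left(\frac{i\alpha_k}{s+i\alpha_k}\right)^{j} - 1\right),$$

where the $\alpha_k$ are ordered such that $0 \le \mathrm{Re}(\alpha_1) < \mathrm{Re}(\alpha_2) \le \cdots \le \mathrm{Re}(\alpha_K)$. This corresponds to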


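a Lévy process with a general jump-size distribution in the downward direction, while the upward jumps have density

$$p(x) = \sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{kj}\,(\alpha_k)^j\,\frac{x^{j-1}}{(j-1)!}\,e^{-\alpha_k x}, \qquad x > 0.$$

Now let $\beta_j(q)$ be the $j$-th root of $q = \xi(s)$, with multiplicity $m_j(q)$; let $m(q)$ be the total number of distinct roots. Then

$$\kappa(\alpha,q) = \prod_{k=1}^{K}\left(\frac{\alpha+\alpha_k}{\alpha_k}\right)^{n_k}\prod_{j=1}^{m(q)}\left(\frac{\beta_j(q)}{\alpha+\beta_j(q)}\right)^{m_j(q)}; \tag{2.7}$$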

this expression can be inverted with respect to α, after having performed a partial fraction

expansion. Further details and properties of the roots are given in [72, Thm. 2.2].
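To make (2.7) concrete, the following sketch treats the simplest member of the class: Brownian motion with drift plus a compound Poisson process with exponential($\mu$) upward jumps and no downward jumps ($K = 1$, $n_1 = 1$, $\alpha_1 = \mu$). It assumes that the $\beta_j(q)$ in (2.7) are the roots with positive real part of $\log \mathbb{E}e^{\beta X_1} = q$; this convention, the helper names and the parameter values are assumptions made for the illustration and should be checked against [72, Thm. 2.2]. Since this process is also spectrally positive, the result can be compared with (2.6).

```python
import math

def bisect(func, lo, hi, iters=200):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(q, d, sigma, lam, mu):
    """Roots of log E exp(beta*X_1) = q, rewritten as the cubic P(beta) = 0."""
    def poly(beta):
        return (d * beta + 0.5 * sigma ** 2 * beta ** 2 - q) * (mu - beta) + lam * beta
    # P(0) = -q*mu < 0 < P(mu) = lam*mu; P -> -inf as beta -> +inf, P -> +inf as beta -> -inf.
    hi = 2.0 * mu
    while poly(hi) > 0:
        hi *= 2.0
    lo = -1.0
    while poly(lo) < 0:
        lo *= 2.0
    return bisect(poly, 0.0, mu), bisect(poly, mu, hi), bisect(poly, lo, 0.0)

def kappa_rational(alpha, q, d, sigma, lam, mu):
    """Eq. (2.7) with K = 1, n_1 = 1 and two simple positive roots."""
    beta1, beta2, _ = real_roots(q, d, sigma, lam, mu)
    return (alpha + mu) / mu * beta1 / (alpha + beta1) * beta2 / (alpha + beta2)

def kappa_spectrally_positive(alpha, q, d, sigma, lam, mu):
    """Eq. (2.6); psi(q) equals minus the negative root of the cubic."""
    phi = -d * alpha + 0.5 * sigma ** 2 * alpha ** 2 + lam * (mu / (mu + alpha) - 1.0)
    psi = -real_roots(q, d, sigma, lam, mu)[2]
    return (q / psi) * (psi - alpha) / (q - phi)

if __name__ == "__main__":
    args = (1.0, -0.5, 1.0, 1.0, 2.0)  # q, d, sigma, lam, mu (illustrative values)
    for alpha in (0.7, 3.0):
        print(alpha, kappa_rational(alpha, *args), kappa_spectrally_positive(alpha, *args))
```

Bisection is used only to keep the sketch dependency-free; for jump distributions with more phases a polynomial root finder would be the more natural choice.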

2.3 Laplace inversion

As pointed out in the introduction, our approach requires a technique to perform Laplace transform inversion. More specifically, our methodology proposes a way to approximate the double transform $\kappa(\alpha,q) = \mathbb{E}e^{-\alpha \bar X_{\tau(q)}}$. In this section we first describe such a Laplace transform inversion technique in detail. As the objective of this section is to assess the accuracy of the double inversion technique, we then focus on a situation in which both κ(α, q) and $\mathbb{P}(\bar X_t \le x)$ are explicitly known (viz. Brownian motion with drift). Then we consider situations in which we know κ(α, q) but have no explicit expression for $\mathbb{P}(\bar X_t \le x)$; for these cases we use simulation to validate our numerical findings.

2.3.1 Laplace inversion

As indicated in the introduction, in our approach an important role is played by techniques

to perform Laplace inversion. We advocate the use of the method developed by den Iseger

[36]. It is in the spirit of approaches developed earlier [1, 38], in the sense that it relies

on the Poisson summation formula. This Poisson summation formula relates an infinite

sum of Laplace transform values to the z-transform of the function values f(kΔ), with k =

0, . . . ,M − 1, that we wish to evaluate, from which the f(kΔ) can be computed relying on

the well-known fast Fourier transform [33].

A first complication is that the above-mentioned infinite sum tends to converge slowly.



Abate and Whitt [1] remedy this using a so-called Euler summation, but in general the

convergence remains prohibitively slow unless knowledge of the location of singularities

is available. One of den Iseger’s contributions [36] is to approximate the infinite sum by

a finite sum by using a Gaussian quadrature. The resulting algorithm is a substantial improvement over earlier algorithms in the sense that (i) it can handle a larger class of Laplace transforms (e.g., no knowledge of the location of discontinuities or singularities is needed), (ii) it only needs numerical values of the Laplace transform, is fast (that is, the function values $f(k\Delta)$, with $k = 0, \ldots, M-1$, are computed at once, in order $M \log M$ time), and is of nearly machine precision, and (iii) it can be extended to multiple dimensions. It is stressed that the last feature is of crucial importance to us, as in our setting we are often dealing with two-dimensional transforms.

In our numerical experiments we used the modified Laplace inversion for non-smooth func-

tions which was developed in [36, Section 6.2]. This modification is effective for functions

with discontinuities, singularities and local non-smoothness (even if we do not a priori know

their locations). The experiments reported on in [36] show that the algorithm typically re-

sults in approximations of (nearly) machine precision. Below we explain in greater detail

how this modification works.

Let $\hat f(s)$ be the Laplace transform of the complex-valued Lebesgue integrable function $f(x)$. Then it holds that (see e.g. [1])

$$\sum_{k=-\infty}^{\infty} \hat f\big(a + 2\pi i(k+\nu)\big) = \sum_{k=0}^{\infty} e^{-ak}e^{-2\pi i k\nu} f(k), \tag{2.8}$$

where $a$ is a given real number. In this approach we approximate the left-hand side of (2.8) by a finite summation

$$\sum_{k=1}^{n} \beta_k\,\hat f(a + i\lambda_k + 2\pi i\nu),$$

where the $(\beta_k)_{k=1}^{n}$ are appropriately chosen positive numbers and the $(\lambda_k)_{k=1}^{n}$ appropriately chosen real numbers. In [36, Appendix A] it is described how these numbers can be generated for such a quadrature rule.
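The Poisson summation formula (2.8), and the slow convergence of its left-hand side, can be illustrated with a small numerical check. The sketch below uses the test function $f(x) = x e^{-x}$, with Laplace transform $\hat f(s) = 1/(1+s)^2$, for which the right-hand side of (2.8) has the closed form $z/(1-z)^2$ with $z = e^{-(a+1)-2\pi i\nu}$. The test function, the values of $a$ and $\nu$, and the naive symmetric truncation are assumptions chosen for the illustration; this is not the quadrature rule of [36].

```python
import cmath
import math

def lhs_truncated(a, nu, n_terms):
    """Symmetric partial sum of the left-hand side of (2.8) for fhat(s) = 1/(1+s)^2."""
    total = 0j
    for k in range(-n_terms, n_terms + 1):
        s = a + 2j * math.pi * (k + nu)
        total += 1.0 / (1.0 + s) ** 2
    return total

def rhs_closed_form(a, nu):
    """Right-hand side of (2.8) for f(x) = x*exp(-x), using sum_k k*z^k = z/(1-z)^2."""
    z = cmath.exp(-(a + 1.0) - 2j * math.pi * nu)
    return z / (1.0 - z) ** 2

if __name__ == "__main__":
    a, nu = 0.5, 0.3  # illustrative values
    exact = rhs_closed_form(a, nu)
    for n_terms in (10, 100, 1000):
        # The truncation error decays only like 1/n_terms.
        print(n_terms, abs(lhs_truncated(a, nu, n_terms) - exact))
```

The error of the truncated sum shrinks roughly tenfold for every tenfold increase in the number of terms, which is the slow convergence that motivates replacing the infinite sum by a quadrature rule.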

Suppose now that $f(\cdot)$ has a singularity in $x = \alpha$ for some $\alpha \in \mathbb{R}$. Let $w(\cdot)$ be a window function, that is, a trigonometric polynomial with period 1, with $w(0) = 1$ and $w(\alpha) = 0$. Define

$$f_w(x) := w(x)^q f(x),$$



for some positive integer $q$. The parameter $q$ is chosen such that $f_w(x)$ is smooth in $x = \alpha$; also observe that $f_w(k)$ coincides with $f(k)$ at $k = 0, 1, \ldots$. Now the ‘normal’ Laplace inversion technique, as described in [36, Section 4], applied to $f_w(\cdot)$, can be used to compute the $f(k)$ (with integer $k$). If the function has multiple singularities, say in the points $\alpha_j$ with $j = 1, 2, \ldots, m$, the window function is the product of window functions, that is, $w(x) = \prod_{j=1}^{m} w_j(x)$. If there is a singularity at $x = 0$, a situation that occurs frequently in the examples of the present chapter, the window function is

$$w(x) = \sin^2\left(\frac{\pi x}{2}\right),$$

and in the way described above we can compute the function values $f(2k+1)$. Guidelines for choosing the parameter $q$ are given in [36, Remark 6.5].

The modified algorithm described above can be improved for functions with various sorts of non-smoothness; we now describe an improvement detailed in [36, Section 6.3] which is useful when we do not know the location of the singularity. Suppose that the window function depends on the point $k$, $f_{w_k}(x) = w_k(x)f(x)$, such that $w_k(k) = 1$, and the $\varepsilon$-support, with $a > 0$,

$$\{x : |e^{-ax} f_{w_k}(x)| \ge \varepsilon\}$$

of $f_{w_k}$ is $[k-\delta, k+\delta]$, with $\delta$ a given positive control parameter and $\varepsilon$ a predefined tolerance. For $f_{w_k}(\cdot)$ to be smooth on $[0,\infty)$, it is then sufficient that $f(\cdot)$ is smooth on $[k-\delta, k+\delta]$. As a result, $f(\cdot)$ only needs to be smooth on $[k-\delta, k+\delta]$ in order to compute $f(k)$ with great precision using the quadrature rule mentioned above. As it turns out, a good choice for the window function is the Gaussian function defined by

$$w(t) = \exp\left(-\frac{1}{2}\left(\frac{t}{\sigma}\right)^2\right)$$

for given tolerance $\varepsilon$ and control parameter $\delta$, where $\sigma$ is chosen such that

$$\exp\left(-\frac{1}{2}\left(\frac{\delta}{\sigma}\right)^2\right) < \varepsilon.$$
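The condition on $\sigma$ can be solved explicitly for $0 < \varepsilon < 1$. The short derivation below is an added illustration; the numerical value of $\varepsilon$ in the comment is an example chosen here and is not taken from [36].

```latex
\exp\!\Big(-\tfrac{1}{2}\,(\delta/\sigma)^2\Big) < \varepsilon
\;\Longleftrightarrow\;
\frac{\delta^2}{2\sigma^2} > \ln\frac{1}{\varepsilon}
\;\Longleftrightarrow\;
\sigma < \frac{\delta}{\sqrt{2\ln(1/\varepsilon)}}.
% Example: for \varepsilon = 10^{-16} one has \sqrt{2\ln 10^{16}} \approx 8.58,
% so the window width must satisfy \sigma < \delta / 8.58.
```

In words, a tighter tolerance forces a narrower Gaussian window relative to the control parameter $\delta$, but only at a logarithmic rate.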

We also mention that [36, Section 5] points out how multi-dimensional inversion can be per-

formed. For further implementation details we refer to [36].

This Laplace inversion method can be adjusted to facilitate the numerical computation of



Laplace transforms; such a procedure is needed in situations in which no explicit expressions are available (for instance for the Pareto or Weibull distribution). The key idea behind it concerns the transformation of the Legendre coefficients. Legendre polynomials are a complete orthogonal set of polynomials in $L^2([0,1])$ and, in addition, the shifted versions of the Legendre polynomials are a complete set in $L^2(\mathbb{R})$. Therefore, any function in $L^2(\mathbb{R})$ can be approximated with an expansion of shifted Legendre polynomials. On the other hand there is a

complete set of functions in the Laplace domain; for a definition we refer to [37, Appendix

A]. The coefficients of the expansions in these two spaces are linked together through the

Poisson summation formula (2.8). As demonstrated in [37], such a method can compute the

Laplace transform with (almost) machine precision accuracy; it only needs knowledge of the

coefficients of the expansion which can be computed by Gaussian quadratures.

In the rest of this section we systematically assess the performance of the inversion technique

developed in [36] (and described above), in the context of the evaluation of P(Xt ≤ x).

We start by considering a case in which explicit analysis is possible (viz. Brownian motion

with drift). Then we consider a number of other examples for which no explicit expression

is available (but in which we do know κ(α, q)); in those cases we compare our numerical

output with simulations.

2.3.2 Comparison with exact results

In this subsection we consider a case in which the distribution function of $\bar X_t$, that is, $\mathbb{P}(\bar X_t \le x)$, is known explicitly.

Example 1. Let $X_t$ be a Brownian motion with drift, i.e., $X_t = dt + \sigma B_t$ with $B_t$ being standard Brownian motion and $d \in \mathbb{R}$. It holds that [56, p. 49]

$$\mathbb{P}(\bar X_t \le x) = 1 - \Phi_N\left(\frac{-x+dt}{\sigma\sqrt{t}}\right) - e^{2dx/\sigma^2}\,\Phi_N\left(\frac{-x-dt}{\sigma\sqrt{t}}\right),$$

with $\Phi_N(\cdot)$ denoting the distribution function of a standard Normal random variable.
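The closed-form expression of Example 1 is easy to evaluate directly. The following minimal sketch (function names are illustrative) computes it with the error function from the standard library and evaluates it for $d = -0.5$, $\sigma = 1.0$, the parameters used in Table 2.1.

```python
import math

def std_normal_cdf(z):
    """Distribution function of a standard Normal random variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def running_max_cdf(x, t, d, sigma):
    """P(sup_{0<=s<=t} X_s <= x) for X_t = d*t + sigma*B_t, with x, t > 0."""
    s = sigma * math.sqrt(t)
    return (1.0
            - std_normal_cdf((-x + d * t) / s)
            - math.exp(2.0 * d * x / sigma ** 2) * std_normal_cdf((-x - d * t) / s))

if __name__ == "__main__":
    for t in (0.1, 0.3, 0.5):
        for x in (0.1, 0.2, 0.5, 1.0):
            print(t, x, round(running_max_cdf(x, t, d=-0.5, sigma=1.0), 8))
```

The printed values can be compared with the exact-value column of Table 2.1.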

As highlighted in Section 2.3.1, several Laplace inversion variants are described in [36]; they

differ in the way they deal with discontinuities and singularities. In this example, and all

following numerical computations presented in this chapter, we use the variant described in

[36, Section 6.3]. Table 2.1 focuses on P(Xt ≤ x), and compares the output of our numerical

experiments with the exact values and simulation-based estimates. Here, and in all other examples reported on in this chapter, we perform $10^7$ independent replications per simulation experiment.

t     x     Simulation   Exact value   Error, use (2.5)   Error, use (2.4)
0.1   0.1   0.286726     0.28679183    1.021e-14          2.012e-11
0.1   0.2   0.525182     0.52535042    1.909e-14          2.063e-11
0.1   0.5   0.912001     0.91208092    9.298e-15          1.416e-10
0.1   1.0   0.999051     0.99906069    8.121e-17          9.550e-10
0.3   0.1   0.190579     0.19063594    5.995e-15          8.468e-11
0.3   0.2   0.358863     0.35900170    1.144e-14          1.564e-10
0.3   0.5   0.723613     0.72378120    2.248e-14          4.055e-10
0.3   1.0   0.959856     0.95991828    6.335e-15          2.833e-09
0.5   0.1   0.161220     0.16126780    3.997e-15          2.946e-10
0.5   0.2   0.305175     0.30529875    8.993e-15          6.374e-10
0.5   0.5   0.636069     0.63611270    1.860e-14          2.216e-09
0.5   1.0   0.908323     0.90832011    3.802e-15          6.548e-09

Table 2.1: Brownian motion with parameters d = −0.5 and σ = 1.0.

The third column contains the simulation-based estimate, the fourth the exact value based on

the above formula. In the last two columns we use the explicit expression (2.4) that we have

for κ(α, q) (or alternatively representation (2.6); recall that Brownian motion is spectrally

negative as well as spectrally positive!). In the fifth column we use the fact that we can per-

form the inversion with respect to α explicitly, as seen in Eqn. (2.5); then a one-dimensional

numerical Laplace inversion is used to approximate the probability of interest. The resulting

error (compared to the exact result) is given. In the last column we present the values ob-

tained when subjecting (2.4) to two-dimensional Laplace inversion; again the error is given.

Observe that in the former approach the errors are at most of order $10^{-14}$, and in the latter approach at most of order $10^{-9}$.

2.3.3 Comparison with simulation results

In the next set of examples, we let $X_t$ correspond to a Brownian motion with drift, plus a compound Poisson process with upward jumps. In other words,

$$\log \mathbb{E}e^{isX_1} = isd - \tfrac{1}{2}s^2\sigma^2 + \lambda\left(\mathbb{E}e^{isJ} - 1\right), \tag{2.9}$$

with J ≥ 0 the random variable associated with the jumps, and λ > 0. As this process is

spectrally positive, (2.6) applies. We consider various jump-size distributions J , thus cover-

ing both light-tailed and heavy-tailed scenarios.

One way to determine $\bar X_t$ in a simulation is by sampling the values of the Lévy process on a grid (yielding $X_0, X_\Delta, X_{2\Delta}, \ldots, X_{t-\Delta}, X_t$), and to then take the maximum (tacitly assuming that $t$ is a multiple of $\Delta$). This procedure is inherently biased: the value found in this way can never exceed $\bar X_t$, but of course this bias decreases when $\Delta \downarrow 0$. In this section we

consider the situation that the Lévy process is the sum of a deterministic drift, a Brownian

term, and a compound Poisson process, and it turns out that for this specific scenario there is

an attractive alternative. First observe that it is trivial to sample the jump epochs of the com-

pound Poisson process up to time t, and the values of the Lévy process at these jump epochs,

as well as the value at time t itself; call the resulting time epochs t0 = 0, t1, . . . , tN−1, tN = t.

Then realize that the distribution of the maximum between ti and ti+1 is known — it follows

essentially from the distribution of the maximum attained by a Brownian bridge. The corre-

sponding distribution function is invertible, and as a consequence it is elementary to sample

from it. It is now clear that in this way we can generate all information needed to determine

$\bar X_t$ (without any approximation). The procedure is described in detail in [50].
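The idea of this grid-free procedure can be sketched as follows. This is an illustrative implementation written for this passage, not necessarily the algorithm of [50]: between consecutive epochs it samples the end value of the Brownian motion with drift and then draws the maximum of the connecting Brownian bridge by inverting $\mathbb{P}(M > m) = \exp(-2(m-a)(m-b)/(\sigma^2 \Delta))$, where $a$ and $b$ are the end points and $\Delta$ the interval length. The function names, the seed and the parameter values are assumptions.

```python
import math
import random

def sample_running_max(t, d, sigma, lam, jump_sampler, rng):
    """One exact sample of sup_{0<=s<=t} X_s for drift + Brownian motion + upward compound Poisson."""
    # Jump epochs on (0, t), followed by the time horizon t itself.
    epochs = []
    s = rng.expovariate(lam) if lam > 0 else math.inf
    while s < t:
        epochs.append(s)
        s += rng.expovariate(lam)
    epochs.append(t)

    x, prev, running_max = 0.0, 0.0, 0.0
    for i, epoch in enumerate(epochs):
        dt = epoch - prev
        # Value just before the epoch: Brownian motion with drift over (prev, epoch).
        end = x + d * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Maximum of the Brownian bridge from x to end, by inversion of its distribution function.
        u = 1.0 - rng.random()  # uniform on (0, 1]
        bridge_max = 0.5 * (x + end + math.sqrt((end - x) ** 2 - 2.0 * sigma ** 2 * dt * math.log(u)))
        running_max = max(running_max, bridge_max)
        # An upward jump occurs at every epoch except the final horizon t.
        x = end + (jump_sampler(rng) if i < len(epochs) - 1 else 0.0)
        running_max = max(running_max, x)
        prev = epoch
    return running_max

def estimate_cdf(x, t, d, sigma, lam, jump_sampler, n, rng):
    """Monte Carlo estimate of P(running maximum at time t <= x) from n replications."""
    hits = sum(sample_running_max(t, d, sigma, lam, jump_sampler, rng) <= x for _ in range(n))
    return hits / n

if __name__ == "__main__":
    rng = random.Random(2014)
    # Without jumps this should be close to the exact value 0.28679183 of Table 2.1.
    print(estimate_cdf(0.1, 0.1, -0.5, 1.0, 0.0, lambda r: 0.0, 20000, rng))
    # With exponential(2) upward jumps arriving at rate 1 (illustrative values).
    print(estimate_cdf(0.1, 0.1, -0.5, 1.0, 1.0, lambda r: r.expovariate(2.0), 20000, rng))
```

Conditional on the end points, the drift plays no role in the bridge maximum, which is why only $\sigma$ and the interval length enter the inversion step.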

Example 2. In this example, we assume that the jump size $J$ has an exponential distribution, that is, $\mathbb{P}(J > x) = \exp(-\mu x)$, with $\mu > 0$. The results are presented in Table 2.2. Again,

the third column contains the simulation-based estimate. In the fourth column, we rely

on (2.7) with one-dimensional inversion (observe that the upward jumps are of phase-type,

hence this formula applies); the resulting approximation is given. The fifth column displays

the approximation based on (2.6) with two-dimensional inversion. Finally, the last column

gives the difference between the previous two columns. It is concluded that both inversion-

based methods are close to the simulation-based estimates; in addition, the inversion-based

methods give nearly the same result (up to roughly $10^{-9}$, that is).

Example 3. Now let $J$ have a Weibull distribution: $\mathbb{P}(J > x) = \exp(-\mu x^{\gamma})$, with $\mu, \gamma > 0$. For $\gamma \in (0,1)$, this tail is heavier than exponential, for $\gamma > 1$ lighter. More specifically, for

γ < 1 the Weibull distribution is subexponential: despite the fact that all moments exist, there

is no open neighborhood around the origin such that the moment generating function is

finite; in Table 2.3 the jump sizes are subexponential. The third column contains simulation-

based estimates, the fourth is based on doubly inverting expression (2.6). Notice that in this

situation we cannot approximate the probability P(Xt ≤ x) relying on (2.7), as the positive

jumps do not have a rational Laplace transform. The last two columns will be commented on

in the next section. We observe that the approximation based on double Laplace inversion of

(2.6) performs reasonably well compared to the simulation-based estimates; the fit is better

in the light-tailed case.


Example 4. Let $J$ now be sampled from a Pareto distribution: $\mathbb{P}(J > x) = (x+1)^{-\gamma}$, for some $\gamma > 0$. This tail is heavier than the Weibull tail: just a finite number of moments exists; more precisely, the $k$-th moment exists if $k < \gamma$. Table 2.5 should be read as Tables 2.3 and 2.4. We conclude that there is a good fit relative to the simulation-based results.

2.4 Approximation with rational Laplace transform

From the examples presented in the previous section we conclude that the numerical inversion

procedure works well, even if the approximation requires a double inversion. In all these

examples, however, the Lévy process involved was such that the double transform κ(α, q)

was given in closed form.

In this section we add a complication. We consider cases in which Xt is such that we do

not have an explicit expression for κ(α, q). The focus will now be on Lévy processes that are

Brownian motion with drift, plus compound Poisson processes (with upward and down-

ward jumps); ‘small jumps’ will only be incorporated in the next section. If the jumps in the

upward direction do not have a rational Laplace transform, the results of [71] do not apply

(see Section 2.2), and hence we do not have an explicit expression for κ(α, q). The idea is

now that we approximate the distribution of the upward jumps by a phase-type distribution

(while leaving the jumps in the downward direction unchanged), so that we are again in

the framework of [71] — realize that the class of phase-type distributions is contained in the

class of distributions with a rational Laplace transform. The objective of this section is to

assess how well such an approximation performs, in terms of evaluating P(Xt ≤ x).

2.4.1 Fitting of phase-type distributions

There are various papers dealing with approximating a distribution on (0,∞) by a phase-

type distribution, see for instance [40, 57]. In our work we rely on the approach developed in

[14], based on the EM algorithm, and [94], who propose a comparable approach that focuses

primarily on mixtures of Erlangs. For a precise definition of phase-type distributions, see

e.g. [10, Ch. III]; they can be thought of as distributions of absorption times in a finite-state

continuous-time Markov chain. More precisely, with d denoting the dimension of the state

space, and d − 1 states being transient and the remaining state absorbing, a phase-type dis-

tribution corresponds to the entrance time of the absorbing state. This class covers mixtures

and sums of exponential distributions (and hence also the Erlang distribution, being dis-

tributed as the sum of independent exponential random variables with the same mean). The



class of phase-type distributions is dense, in that any distribution on (0,∞) can, in principle,

be approximated arbitrarily well; the price to be paid, though, is that the dimension d of the

associated Markov chain may become large.

The performance of the EM-based algorithm proposed is assessed in detail in [14] — it was

shown that quite a large class of distributions can be accurately approximated by phase-type

distributions (of relatively low dimension d). From this it is, however, not a priori clear what

the impact is of replacing the upward jumps by an appropriate phase-type random variable

when evaluating P(Xt ≤ x) in the way described above — we do not have an explicit bound

on the error introduced by replacing the jump distribution by its phase-type counterpart. It

is the primary objective of this section to study this effect.

The remainder of this section consists of two parts. In Section 2.4.2 we consider models

of which the upward jumps do not have a rational Laplace transform, but that are in S+

(i.e., there are no downward jumps). Due to (2.6), we know κ(α, q), so that we can apply

the inversion approach developed in [36] (see Section 2.3) to evaluate P(Xt ≤ x). Then we

approximate the upward jumps with phase-type random variables, compute κ(α, q) relying

on [71], and again perform the inversion. Then we compare both numerical approximations

of P(Xt ≤ x).

In Section 2.4.3 we consider models for which we do not know κ(α, q), i.e., models in which

both the upward and downward jump have general distributions. We approximate the up-

ward jumps by phase-type random variables, and proceed as before. We then compare with

simulation to assess the accuracy of this approach.

2.4.2 Comparison with results for spectrally-positive Lévy processes

In the examples below (2.9) applies: the Lévy process consists of a Brownian term (with drift)

increased by a compound Poisson process with positive jumps. As a result, κ(α, q) is given

by (2.6). To assess the impact of replacing the upward jumps by their phase-type counterpart,

we first use the EM-algorithm to find a phase-type approximation for the jumps, and then

approximate P(Xt ≤ x), relying on (2.7) and a single-dimensional Laplace inversion.

Example 5. We go back to the setting of Example 3: we let J have a Weibull distribution with

γ = 0.5 and γ = 2, respectively. In the fifth column of Tables 2.3 and 2.4 we display the

resulting numerical approximations. The last column gives the difference with the result of

doubly inverting (2.6). It is concluded that the differences roughly range between $10^{-4}$ and $10^{-7}$.


Example 6. We now return to the setting of Example 4: we assume that J corresponds to a

Pareto distribution. The last two columns of Table 2.5 should be read as the corresponding

columns in Tables 2.3 and 2.4. Here the differences with the result based on (2.6) roughly range between $10^{-3}$ and $10^{-5}$.

Example 7. Now consider a slightly harder example: $J$ follows a shifted-Pareto distribution, that is, $\mathbb{P}(J > x) = 1$ for $x \le 1$, and $\mathbb{P}(J > x) = x^{-\gamma}$ for $x > 1$, for some $\gamma > 0$; observe that the support of $J$ is $(1,\infty)$. In this case the approximating phase-type distribution is a mixture of Erlang distributions of high degree; to this end, realize that an Erl($n$, $n$) random variable (having mean 1 and variance $1/n$, i.e., vanishing as $n$ grows large) approximates a random variable that is deterministically equal to 1. Table 2.6 should be read as Table 2.5. We observe that, due to the fact that the distribution of the jumps does not have support $(0,\infty)$, the phase-type-based approximation performs relatively poorly.
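The remark that an Erl($n$, $n$) random variable concentrates around 1 can be made visible numerically. The sketch below evaluates the Erlang distribution function through its Poisson-sum representation, working in the log domain to avoid overflow for large $n$; the function name and the chosen values of $n$ and $x$ are illustrative assumptions.

```python
import math

def erlang_cdf(x, n, rate):
    """P(Erl(n, rate) <= x) = 1 - sum_{j<n} exp(-rate*x) * (rate*x)^j / j!."""
    if x <= 0.0:
        return 0.0
    log_y = math.log(rate * x)
    # Each Poisson term is computed as exp(log term) to stay within floating-point range.
    tail = sum(math.exp(-rate * x + j * log_y - math.lgamma(j + 1)) for j in range(n))
    return 1.0 - tail

if __name__ == "__main__":
    for n in (1, 10, 50, 200):
        # As n grows, the mass below 0.8 vanishes and the mass below 1.2 tends to one.
        print(n, erlang_cdf(0.8, n, n), erlang_cdf(1.2, n, n))
```

The slow sharpening of the step at 1 indicates why a high Erlang degree is needed to mimic a jump distribution whose support starts away from the origin.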

2.4.3 Comparison with simulation results

In this subsection we deal with an example in which we do not know κ(α, q) (as opposed to

the examples presented in Section 2.4.2).

Example 8. In this example we consider compound Poisson with two-sided jumps, plus

Brownian motion with drift. The upward jumps are Weibullian, and approximated by a

phase-type distribution. The downward jumps are exponential. The numerical results are

compared to simulation-based estimates, and show a good fit. (As an aside we mention that in this case the downward jumps are of phase-type, whereas the upward jumps are not. Consequently, also in this case κ(α, q) can be given explicitly, in terms of a number of roots, see [71]. We do not pursue this approach.)

2.5 Small jumps

So far we have developed a technique that can deal with all Lévy processes consisting of

deterministic drifts, Brownian motions and compound Poisson processes. This means that

we have not yet looked at processes with small jumps. In this section we rely on results from

[15] to deal with these. The main result used is that under appropriate conditions a Lévy

process with small jumps can be accurately approximated by the sum of an appropriately

chosen compound Poisson process and Brownian motion. We first write the jump part of



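the Lévy exponent in the form

$$\int_{-\infty}^{\infty}\left(e^{isx} - 1 - isx\,1_{\{|x|<\varepsilon\}}\right)\Pi(dx) = \int_{-\varepsilon}^{\varepsilon}\left(e^{isx} - 1 - isx\right)\Pi(dx) + \int_{\mathbb{R}\setminus[-\varepsilon,\varepsilon]}\left(e^{isx} - 1\right)\Pi(dx);$$

let the first term correspond to a Lévy process, say, $X^{(1,\varepsilon)}_t$, and the second term (which is a compound Poisson process) to, say, $X^{(2,\varepsilon)}_t$. Then the ‘small jump component’ $X^{(1,\varepsilon)}_t$ can be approximated by (for some small value of $\varepsilon$)

$$\mu_\varepsilon t + \sigma_\varepsilon B_t + X^{(2,\varepsilon)}_t, \tag{2.10}$$

where $B_t$ is a standard Brownian motion, and

$$\mu_\varepsilon := \int_{-\varepsilon}^{\varepsilon} x\,\Pi(dx), \qquad \sigma_\varepsilon^2 := \int_{-\varepsilon}^{\varepsilon} x^2\,\Pi(dx).$$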

To shed some light on the accuracy of such an approximation, we mention that under appropriate conditions [15]

\[
\left(\frac{X^{(1,\varepsilon)}(t) - \mu_\varepsilon t}{\sigma_\varepsilon}\right)_{t\ge 0}
\xrightarrow{d} (B_t)_{t\ge 0}. \tag{2.11}
\]

A sufficient condition for (2.11) to hold is that, with L(·) a slowly varying function at 0, Π(·) has a density of the form $L(x)/|x|^{\gamma+1}$ for x ↓ 0, with γ ∈ (0, 2). This condition applies to, e.g., stable Lévy processes and CGMY processes, but not to, e.g., Variance Gamma processes (as these correspond to γ = 0). We also mention that the use of (2.10) is advocated for Variance Gamma in [44]; see the third algorithm on p. 25 there.

Approximating the distribution of the upward jumps by a phase-type distribution, we are

again in the setting of Section 2.4. As a result, we can use the methodology developed earlier

to perform the numerical computations. There is an obvious trade-off between accuracy and

computational effort when varying ε.

Example 9. In this example we consider a Lévy process whose upward jumps are CGMY-like, that is, for C, M, Y > 0,

\[
\Pi(x) = \frac{C e^{-Mx}}{x^{1+Y}}
\]

for x > 0 and 0 otherwise. A Brownian term is added. The third and fourth columns of Table

2.8 present simulation-based estimates, based on approximation (2.10), with ε = 0.1 and

ε = 0.05, respectively. Observe that this model is contained in S+, so that (2.6) applies and


κ(α, q) is given explicitly; the fifth column gives the results based on double inversion of

(2.6). In the last column X(2,ε)t is approximated by a Lévy process with phase-type jumps;

as usual, we apply (2.7). From the small difference between both simulation-based columns,

we conclude that those values are likely to be close to the true values. The inversion-based

columns are well in agreement with each other and with the simulation-based output.
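
As an illustration of how the constants in (2.10) might be obtained for the CGMY-like density of Example 9, the following minimal sketch computes $\mu_\varepsilon$ and $\sigma_\varepsilon^2$ by quadrature. The function name, the substitution $x = \varepsilon v^4$ and the midpoint rule are assumptions made for this sketch; they are not taken from the text, and the substitution is only adequate for Y below roughly 0.75.

```python
import math

def small_jump_moments(C, M, Y, eps, n=20000):
    """Return (mu_eps, sigma2_eps) for the density C*exp(-M*x)/x**(1+Y) on (0, eps).

    Hypothetical helper: uses x = eps * v**4 and the midpoint rule in v, which
    tames the integrable singularity of the density at 0.
    """
    h = 1.0 / n
    mu = 0.0
    sigma2 = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        x = eps * v**4
        dx_dv = 4.0 * eps * v**3
        density = C * math.exp(-M * x) / x ** (1.0 + Y)
        mu += x * density * dx_dv * h          # integrand of mu_eps
        sigma2 += x * x * density * dx_dv * h  # integrand of sigma_eps^2
    return mu, sigma2

# Parameters of Table 2.8 (C = 1, M = 2, Y = 0.5) with eps = 0.1.
mu_eps, sigma2_eps = small_jump_moments(1.0, 2.0, 0.5, 0.1)
```

For M = 0 the integrals are available in closed form, $C\varepsilon^{1-Y}/(1-Y)$ and $C\varepsilon^{2-Y}/(2-Y)$, which gives a simple check of the quadrature.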

Example 10. We now consider a Variance Gamma process, which can be regarded as a (standard, in our case) Brownian motion where the time parameter follows a Gamma process. More precisely, with $B_t$ being a standard Brownian motion, and $Y_t$ a Gamma process with parameters 1 and 1, the Lévy process under consideration is given by $B_{Y_t}$; this is an example of subordinated Brownian motion.

The third column of Table 2.9 presents simulation-based estimates, based on approximation

(2.10), with ε = 0.01. The fourth column also gives simulation-based estimates, but now

simulating the Variance Gamma process as subordinated Brownian motion. This means that

we sample the values of the Gamma process on a grid, and then generate the Brownian

motion at these values. We have performed this procedure for different grid sizes, N =

200, N = 500, N = 1000 but we observed just minor differences (negligible with respect to

the width of the confidence interval).
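
The grid-based procedure behind the fourth column might be sketched as follows. This is a minimal illustration, not the code used for Table 2.9: it assumes that "parameters 1 and 1" means that the Gamma increments over a step of length dt have shape dt and scale 1, and all function names are hypothetical. Note that the maximum over the grid slightly underestimates the true running maximum.

```python
import math
import random

def simulate_vg_path(t, n_grid, rng):
    """Return (X_t, maximum over the grid) for X = B_Y, with Y a Gamma process."""
    dt = t / n_grid
    x = 0.0
    running_max = 0.0
    for _ in range(n_grid):
        dy = rng.gammavariate(dt, 1.0)            # Gamma increment: shape dt, scale 1
        x += math.sqrt(dy) * rng.gauss(0.0, 1.0)  # Brownian increment over "time" dy
        running_max = max(running_max, x)
    return x, running_max

def estimate_max_cdf(t, level, n_grid=200, n_paths=2000, seed=1):
    """Monte Carlo estimate of P(max over the grid <= level)."""
    rng = random.Random(seed)
    hits = sum(simulate_vg_path(t, n_grid, rng)[1] <= level for _ in range(n_paths))
    return hits / n_paths
```

Calling `estimate_max_cdf(0.1, 0.1, n_grid=200)` and repeating with `n_grid=500` or `n_grid=1000` mirrors the grid-size comparison described above.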

To obtain the fifth column, the upper tail of the Lévy measure is split into a Brownian component and a compound Poisson component, as explained earlier in this section. We observe a reasonable fit. The problem with the approach that we proposed, however, is that we cut out the interval (0, ε), so that the positive jumps have a distribution with support (ε,∞); we saw before (viz. in the shifted-Pareto case) that such distributions do not lend themselves to approximation by a phase-type distribution. We can remedy this effect by allowing the jump size distribution to have support (0,∞): we give the Lévy measure of $X^{(2,\varepsilon)}_t$ the value Π(ε) in the interval (0, ε); the parameters of the Brownian motion are then adapted such that the first two moments give the desired match. The last column gives the resulting estimates; the fit is considerably better than in the previous column and, in addition, the dimension of the approximating mixture of Erlangs is substantially lower.

2.6 Beta processes

In this section we test our methodology for the class of Beta processes, which fall in the class

of Lévy processes M for which the Lévy exponent is meromorphic. For these processes,

κ(α, q) can be represented [66] in terms of an infinite product. By truncating this infinite



product and performing a one-dimensional inversion, we can approximate P(Xt ≤ x). We

also obtain a simulation-based benchmark, by performing the sampling method developed

by Kuznetsov et al. [67]. We start this section by reviewing this simulation technique.

2.6.1 Wiener-Hopf Monte Carlo (WH-MC) simulating method

Suppose that we are able to sample the running maximum $\overline{X}_{\tau(q)}$ and the running minimum $\underline{X}_{\tau(q)}$, where τ(q) is an exponentially distributed random variable with mean 1/q. Then by the method developed by Kuznetsov et al. [67], based on an algorithm introduced by Carr [27], we are able to evaluate $E[F(X_t, \overline{X}_t)]$, the main ideas being the following. By the strong law of large numbers we know that $\sum_{i=1}^{n} \frac{t}{n} e_i \to t$ as n → ∞, where $(e_i)$ constitutes a sequence of i.i.d. exponentially distributed random variables with mean 1. The random variable $\sum_{i=1}^{n} \frac{t}{n} e_i$ is equal in law to a gamma random variable with parameters n and q = n/t; we denote it by g(n, q). As a consequence, $P(X_{g(n,q)} \in dx, \overline{X}_{g(n,q)} \in dy)$ is a suitable approximation to $P(X_t \in dx, \overline{X}_t \in dy)$, taking n sufficiently large. The following result is due to [67, Thm. 1], to which we refer for more details.

Theorem 1. For all n ≥ 1 and q > 0, define $g(n, q) := \sum_{i=1}^{n} \frac{t}{n} e_i$. Then

\[
\bigl(X_{g(n,q)}, \overline{X}_{g(n,q)}\bigr) \stackrel{d}{=} \bigl(V(n, q), J(n, q)\bigr), \tag{2.12}
\]

where V(n, q) and J(n, q) are defined iteratively through

\[
V(n, q) = V(n-1, q) + S_q^{(n)} + I_q^{(n)}, \qquad
J(n, q) = \max\bigl(J(n-1, q),\, V(n-1, q) + S_q^{(n)}\bigr),
\]

and V(0, q) = J(0, q) = 0, $S_q^{(0)} = I_q^{(0)} = 0$; $\{S_q^{(j)}; j \ge 1\}$ is a sequence of i.i.d. random variables with common distribution $\overline{X}_{\tau(q)}$, and $\{I_q^{(j)}; j \ge 1\}$ is a sequence of i.i.d. random variables with common distribution $\underline{X}_{\tau(q)}$.
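
The recursion of Theorem 1 can be sketched in a few lines once samplers for the supremum and infimum at an exponential time are available. The example below is an illustration only: it uses Brownian motion with drift, for which (a standard fact, not taken from the text) the supremum and minus the infimum at an independent exponential time are exponentially distributed with the rates given in the comments. The function name and parameter values are assumptions.

```python
import math
import random

def wh_mc_sample(t, n, mu, sigma, rng):
    """One draw of (V(n, q), J(n, q)) with q = n / t, for X_t = mu*t + sigma*B_t."""
    q = n / t
    root = math.sqrt(mu * mu + 2.0 * q * sigma * sigma)
    rate_sup = (-mu + root) / sigma**2  # supremum at time tau(q) ~ Exp(rate_sup)
    rate_inf = (mu + root) / sigma**2   # minus infimum at time tau(q) ~ Exp(rate_inf)
    v = 0.0
    j = 0.0
    for _ in range(n):
        s = rng.expovariate(rate_sup)    # S_q^{(k)} >= 0
        i = -rng.expovariate(rate_inf)   # I_q^{(k)} <= 0
        j = max(j, v + s)                # uses V(k-1, q), as in the theorem
        v = v + s + i
    return v, j

rng = random.Random(2014)
samples = [wh_mc_sample(1.0, 20, -0.5, 1.0, rng) for _ in range(5000)]
estimate = sum(j <= 0.5 for _, j in samples) / len(samples)  # approximates P(max <= 0.5)
```

In this special case E V(n, q) = μt exactly, which offers a quick sanity check of an implementation.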

2.6.2 Beta processes

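The class of Beta processes consists of Lévy processes defined by the triplet (μ, σ, Π), where the Lévy measure is defined as

\[
\Pi(x) = c_1 \frac{e^{-\alpha_1\beta_1 x}}{(1 - e^{-\beta_1 x})^{\lambda_1}}\,1_{\{x>0\}}
+ c_2 \frac{e^{\alpha_2\beta_2 x}}{(1 - e^{\beta_2 x})^{\lambda_2}}\,1_{\{x<0\}},
\]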


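with parameters $\alpha_i > 0$, $\beta_i > 0$, $c_i \ge 0$ and $\lambda_i \in (0, 3)\setminus\{1, 2\}$. Its Lévy exponent is

\[
\Psi(s) = i(\mu - \rho)s - \tfrac{1}{2}\sigma^2 s^2
+ \frac{c_1}{\beta_1} B\Bigl(\alpha_1 - \frac{is}{\beta_1},\, 1 - \lambda_1\Bigr)
+ \frac{c_2}{\beta_2} B\Bigl(\alpha_2 + \frac{is}{\beta_2},\, 1 - \lambda_2\Bigr) - \gamma. \tag{2.13}
\]

Here B(x, y) := Γ(x)Γ(y)/Γ(x+y) is the well-known Beta function. In addition, with ψ(x) := d log(Γ(x))/dx,

\[
\gamma = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1) + \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2),
\]

\[
\rho = \frac{c_1}{\beta_1} B(\alpha_1, 1 - \lambda_1)\bigl(\psi(1 + \alpha_1 - \lambda_1) - \psi(\alpha_1)\bigr)
- \frac{c_2}{\beta_2} B(\alpha_2, 1 - \lambda_2)\bigl(\psi(1 + \alpha_2 - \lambda_2) - \psi(\alpha_2)\bigr).
\]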

The Lévy exponent of the beta process is a meromorphic function in C; it turns out to be

possible to identify all roots of the equation q−Ψ(s) = 0; these roots are characterized in the

following theorem [66, Thm. 10].

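Theorem 2. For q > 0 and Ψ(s) defined above, the equation q − Ψ(iξ) = 0 has infinitely many solutions, all of which are real and simple. They are such that $\xi_0^- \in (-\alpha_1\beta_1, 0)$ and $\xi_0^+ \in (0, \alpha_2\beta_2)$, while for n ∈ {1, 2, . . .},

\[
\xi_n^- \in \bigl(\beta_1(-\alpha_1 - n),\, \beta_1(-\alpha_1 - n + 1)\bigr), \qquad
\xi_n^+ \in \bigl(\beta_2(\alpha_2 + n - 1),\, \beta_2(\alpha_2 + n)\bigr).
\]

Moreover, for x > 0,

\[
P(\overline{X}_{\tau(q)} \in dx) = -\Bigl(\sum_{k=0}^{\infty} C_k^-\, \xi_k^-\, e^{\xi_k^- x}\Bigr)dx, \tag{2.14}
\]

where, with k ∈ {1, 2, . . .},

\[
C_0^- = \prod_{n\ge 1} \frac{1 + \xi_0^-/\bigl(\beta_1(n - 1 + \alpha_1)\bigr)}{1 - \xi_0^-/\xi_n^-}, \qquad
C_k^- = \frac{1 + \xi_k^-/\bigl(\beta_1(k - 1 + \alpha_1)\bigr)}{1 - \xi_k^-/\xi_0^-}
\prod_{n\ge 1,\, n\ne k} \frac{1 + \xi_k^-/\bigl(\beta_1(n - 1 + \alpha_1)\bigr)}{1 - \xi_k^-/\xi_n^-}.
\]

A similar expression holds for $P(-\underline{X}_{\tau(q)} \in dx)$, but $\{\xi_n^-\}$ must be replaced by $\{-\xi_n^+\}$ and $\alpha_1, \beta_1$ must be replaced by $\alpha_2, \beta_2$.

Note that if σ = 0 and $\lambda_i < 2$ the distribution of $\overline{X}_{\tau(q)}$ has an atom at zero which is equal to $1 - \sum_{k\ge 0} C_k^-$; it can be written as $\prod_{n\ge 0} \bigl(-\xi_n^-\bigr)/\bigl(\beta_1(n + \alpha_1)\bigr)$.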

The above theorem provides us with information about the location of the poles, thus facili-

tating the efficient determination of their exact positions (use for instance a simple bisection

method). However, for performing the inverse Laplace transform we need to find poles for

complex values of q; as computing roots for complex q is time consuming, we rely on the



method that we explain in Section 2.7.2.
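
For real q, the bisection idea mentioned above can be sketched as follows for the two roots closest to the origin, using the brackets of Theorem 2. This is an illustrative sketch only: the helper names are hypothetical, ρ is passed in by the caller (for the symmetric parameter set used in the sketch the two terms in ρ cancel, so ρ = 0), and Ψ(iξ) is evaluated directly from (2.13), where it is real for real ξ inside the brackets.

```python
import math

def rgamma(z):
    """Reciprocal gamma function 1/Gamma(z); equals 0 at the poles of Gamma."""
    try:
        return 1.0 / math.gamma(z)
    except ValueError:  # z is 0 or a negative integer
        return 0.0

def beta_fn(x, y):
    return math.gamma(x) * math.gamma(y) * rgamma(x + y)

def psi_at_i_xi(xi, mu, sigma, a1, b1, l1, c1, a2, b2, l2, c2, rho=0.0):
    """Psi(i*xi) from (2.13) for real xi in (-a1*b1, a2*b2)."""
    gamma_const = c1 / b1 * beta_fn(a1, 1 - l1) + c2 / b2 * beta_fn(a2, 1 - l2)
    return (-(mu - rho) * xi + 0.5 * sigma**2 * xi**2
            + c1 / b1 * beta_fn(a1 + xi / b1, 1 - l1)
            + c2 / b2 * beta_fn(a2 - xi / b2, 1 - l2)
            - gamma_const)

def bisect(f, lo, hi, tol=1e-13):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A symmetric parameter set (mu, sigma; alpha, beta, lambda, c on both sides).
params = dict(mu=-0.5, sigma=1.0, a1=1.0, b1=1.5, l1=1.5, c1=1.0,
              a2=1.0, b2=1.5, l2=1.5, c2=1.0)
q = 1.0
f = lambda xi: q - psi_at_i_xi(xi, **params)
delta = 1e-9  # stay strictly inside the brackets, away from the poles
xi_minus = bisect(f, -params["a1"] * params["b1"] + delta, 0.0)
xi_plus = bisect(f, 0.0, params["a2"] * params["b2"] - delta)
```

The remaining roots $\xi_n^\pm$ could be bracketed in the same way using the intervals of Theorem 2, with the interval endpoints again shifted slightly inwards.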

Example 11. In this example, we consider a Beta process with parameters

(μ, σ;α1, β1, λ1, c1;α2, β2, λ2, c2) = (−0.5, 1; 1, 1.5, 1.5, 1; 1, 1.5, 1.5, 1).

Because the Beta process is a Lévy process with small jumps, we need to approximate the jumps smaller than ε with a Brownian motion in the Monte Carlo simulation; in the Wiener-Hopf Monte Carlo simulation we do not need this approximation. It is also noted that the distributions of $\overline{X}_{\tau(q)}$ and $\underline{X}_{\tau(q)}$ are expressed in terms of infinite series, and as a consequence we have to perform a truncation to sample from them. In Table 2.10 the third and fourth columns show the results obtained from 'ordinary simulation', using ε = 0.1 and ε = 0.05. The next two columns display the estimates obtained relying on WH-MC with the number of iterations n in Thm. 1 equal to 20 and 100, respectively. We perform $10^7$ realizations in each simulation.

The seventh column shows the outcome of (2.14), where the summation is truncated after

25 terms. The last column, finally, is based on inverting the Laplace transform obtained

by approximating the positive jumps by their phase-type counterparts. If we leave out the jumps smaller than ε, the positive jump-size distribution will have support (ε,∞), which is poorly approximated by Erlang distributions, as we explained in Example 10; we remedy this complication in the same way as we did in Example 10.

2.7 Discussion and concluding remarks

We conclude this chapter by briefly discussing a number of issues that affect the accuracy

and computation time.

2.7.1 Remarks on fitting of phase-type distribution

Let f(x) be the density which we wish to approximate with a so-called hyper-Erlang distribution of degree N, that is, we wish to find $\alpha_j$, $\lambda_j$ and $n_j$ such that

\[
f(x) \approx \sum_{j=1}^{N} \alpha_j\, \frac{(\lambda_j x)^{n_j - 1}}{(n_j - 1)!}\, \lambda_j e^{-\lambda_j x}; \tag{2.15}
\]


here $n_1, \ldots, n_N \in \mathbb{N}$ are the numbers of phases of the individual Erlang distributions, the $\lambda_j$ are positive numbers, while the $\alpha_j$ are positive numbers such that $\sum_{j=1}^{N} \alpha_j = 1$.
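
To make (2.15) concrete, the following small sketch evaluates a hyper-Erlang density and its mean for given weights, rates and phase numbers. The function names and the parameter values in the usage lines are illustrative assumptions; the sketch does not perform the EM fit discussed below.

```python
import math

def hyper_erlang_pdf(x, weights, rates, phases):
    """Right-hand side of (2.15) at x >= 0."""
    total = 0.0
    for alpha, lam, n in zip(weights, rates, phases):
        total += (alpha * lam * (lam * x) ** (n - 1)
                  / math.factorial(n - 1) * math.exp(-lam * x))
    return total

def hyper_erlang_mean(weights, rates, phases):
    """Mean of the mixture: sum of alpha_j * n_j / lambda_j."""
    return sum(alpha * n / lam for alpha, lam, n in zip(weights, rates, phases))

def erlang_variance(n, lam):
    """Variance of a single Erlang distribution with n phases and rate lam."""
    return n / lam**2

# Illustrative mixture of two Erlang components.
weights, rates, phases = [0.3, 0.7], [2.0, 5.0], [1, 3]
density_at_half = hyper_erlang_pdf(0.5, weights, rates, phases)
mean = hyper_erlang_mean(weights, rates, phases)
```

With n phases and rate n, `erlang_variance(n, n)` equals 1/n, which is the Erl(n, n) behaviour referred to in the text.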

With the EM algorithm we can optimize the parameters αj and λj for a given N and pre-

defined set of nj . In order to find the ‘best’ fitting we have tried a large set of vectors

(n1, n2, · · · , nN ) for a given N , in order to identify the (n1, n2, · · · , nN ) which maximizes

the likelihood. This procedure can be implemented efficiently; for a detailed discussion we

refer to [94].

Note that all phase-type distributions have the features that they (i) are light-tailed, and (ii)

have support (0,∞). As a consequence, it heavily depends on the distribution under consid-

eration whether it can be approximated well by a phase-type distribution. From our experi-

ments, we observed that for light-tailed distributions Erlang distributions of low dimension

suffice; see for example the Weibull distribution with γ ≥ 1. Distributions with heavier tails

(Pareto, the Weibull distribution with γ < 1) are significantly harder to approximate (in that

they require a mixture of Erlangs of high dimension); it is emphasized that the fit of the dis-

tribution’s tail may be poor in this case, while the ‘body’ of the distribution is approximated

quite well. Distributions with support different from (0,∞) are even harder to fit; think of

the shifted-Pareto distribution. In this case, recall that the approximating phase-type distri-

bution contains multiple Erlang distributions of high degree; note that an Erl(n, n) random

variable (which has mean 1, and variance 1/n, i.e., vanishing as n grows large) can be used

to approximate a deterministic (1) random variable. Our experiments indicate that, despite

the fact that we included such high-degree Erlang distributions, the resulting numerics are

decent, but not highly accurate.

2.7.2 Remarks on the computation time

As we mentioned earlier in this chapter, the computation time of the Laplace inversion al-

gorithm is of the order M log(M), if function values f(kΔ), k = 0, 1, · · · ,M − 1 are to be

computed. It is emphasized, though, that the bulk of the computation time is not related to

this inversion, but rather to the numerics related to identifying the roots βj(q) in (2.7), which

solve q = ξ(s). The number of roots of this equation equals $N_p + 1$ if there is a Brownian component in the Lévy exponent, and $N_p$ otherwise; here $N_p$ denotes the total number of phases of the individual distributions of which the phase-type distribution is composed. For details we refer to [71].

In general, finding these roots can be extremely time consuming, as we lack precise knowledge about the locations of the roots in the complex plane and their multiplicity. In addition, the Laplace inversion algorithm needs to compute these roots for different values of q. In order to save time, we first compute the roots $\beta_j$ of the equation a = ξ(s), with a being a real damping factor. Note that the roots of the equation a + iq = ξ(s) change continuously with respect to q in the complex plane; considering the roots as explicit functions of q such that

\[
\xi[\beta_j(q)] = a + iq, \qquad \beta_j(0) = \beta_j, \tag{2.16}
\]

we obtain by differentiating with respect to q the ordinary differential equation

\[
\frac{d\beta_j(q)}{dq} = \frac{i}{\xi'[\beta_j(q)]}. \tag{2.17}
\]

Applying this procedure, we can find the roots efficiently for different values of q by using, for example, an adaptive Runge-Kutta method [66].
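
A minimal sketch of root tracking via (2.17) is given below. For simplicity it swaps in a classical fixed-step RK4 scheme instead of an adaptive method, and it uses the quadratic exponent of a Brownian motion with drift as a stand-in for ξ, because the roots are then known in closed form and the result can be checked. The parameter values and function names are assumptions made for the illustration.

```python
import cmath

D, SIGMA = -1.5, 1.0  # illustrative drift and volatility (assumed values)

def xi(s):
    """Stand-in exponent xi(s) = D*s + 0.5*SIGMA^2*s^2."""
    return D * s + 0.5 * SIGMA**2 * s * s

def xi_prime(s):
    return D + SIGMA**2 * s

def track_root(beta0, q_end, steps=500):
    """Integrate d(beta)/dq = i / xi'(beta) from q = 0 to q_end with classical RK4."""
    rhs = lambda b: 1j / xi_prime(b)
    h = q_end / steps
    beta = complex(beta0)
    for _ in range(steps):
        k1 = rhs(beta)
        k2 = rhs(beta + 0.5 * h * k1)
        k3 = rhs(beta + 0.5 * h * k2)
        k4 = rhs(beta + h * k3)
        beta += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return beta

def exact_root(a, q):
    """Closed-form root of a + i*q = xi(s) on the branch that is positive at q = 0."""
    return (-D + cmath.sqrt(D * D + 2.0 * SIGMA**2 * (a + 1j * q))) / SIGMA**2

a = 1.0                             # real damping factor
beta_start = exact_root(a, 0.0)     # root of a = xi(s)
beta_q5 = track_root(beta_start, 5.0)
```

In a phase-type setting the same loop would be run for each root $\beta_j$, with ξ and ξ′ replaced by the exponent at hand; the starting roots at q = 0 still have to be found by a separate root finder.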



time t   x     Simulation   Appr., use (2.7)   Appr., use (2.6)   Difference
0.1      0.1   0.340062     0.34025703         0.34025703         1.32e-09
         0.2   0.579295     0.57935467         0.57935467         1.24e-10
         0.5   0.890919     0.89079051         0.89079051         2.04e-10
         1.0   0.959618     0.95963405         0.95963405         4.18e-10
0.3      0.1   0.241525     0.24164945         0.24164945         5.67e-10
         0.2   0.420271     0.42037262         0.42037262         9.24e-11
         0.5   0.720007     0.71992788         0.71992788         4.34e-10
         1.0   0.876619     0.87671756         0.87671756         3.04e-09
0.5      0.1   0.206808     0.20683652         0.20683652         4.94e-10
         0.2   0.361211     0.36125826         0.36125826         7.48e-10
         0.5   0.633832     0.63363062         0.63363062         1.57e-09
         1.0   0.809454     0.80949895         0.80949895         1.16e-09

Table 2.2: Compound Poisson with exponential jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are exponential with mean 1. The Brownian term has parameters d = −1.5 and σ = 1.0.

time t   x     Simulation   Appr., use (2.6)   Appr. Ph., use (2.7)   Difference
0.1      0.1   0.306441     0.30554528         0.30503331             5.12e-04
         0.2   0.541505     0.53979428         0.53897538             8.19e-04
         0.5   0.884350     0.88191833         0.88112773             7.90e-04
         1.0   0.960646     0.95807537         0.95813022             5.48e-05
0.3      0.1   0.203235     0.20174542         0.20107108             6.74e-04
         0.2   0.368249     0.36527241         0.36413314             1.14e-03
         0.5   0.683732     0.67832431         0.67690345             1.42e-03
         1.0   0.866122     0.85926538         0.85922344             4.19e-05
0.5      0.1   0.166972     0.16488486         0.16421784             6.67e-04
         0.2   0.303935     0.29990763         0.29877142             1.13e-03
         0.5   0.580408     0.57275238         0.57129482             1.46e-03
         1.0   0.780043     0.76979656         0.76990473             1.08e-04

Table 2.3: Compound Poisson with Weibull jumps; heavy-tailed case. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Weibull with parameters μ = 1.0 and γ = 0.5. The Brownian term has parameters d = −1.0 and σ = 1.0. Number of Erlang distributions fitted to the Weibull distribution: $N_{Er} = 7$; highest number of phases: $n_{\max} = 3$.




         0.2   0.526594     0.52650719         0.52650797             7.80e-07
         0.5   0.861135     0.86135242         0.86133659             1.58e-05
         1.0   0.953920     0.95401102         0.95437502             3.64e-04
0.3      0.1   0.189862     0.18976048         0.18978895             2.84e-05
         0.2   0.343766     0.34370972         0.34376675             5.70e-05
         0.5   0.643406     0.64373428         0.64389320             1.59e-04
         1.0   0.848913     0.84918147         0.84978994             6.08e-04
0.5      0.1   0.152512     0.15236195         0.15243569             7.37e-05
         0.2   0.277445     0.27746044         0.27759827             1.38e-04
         0.5   0.536884     0.53709063         0.53740254             3.12e-04
         1.0   0.761019     0.76122687         0.76199729             7.70e-04

Table 2.4: Compound Poisson with Weibull jumps; light-tailed case. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Weibull with parameters μ = 1.0 and γ = 2. The Brownian term has parameters d = −1.0 and σ = 1.0. Number of Erlang distributions fitted to the Weibull distribution: $N_{Er} = 4$; highest number of phases: $n_{\max} = 5$.


         0.2   0.661758     0.66178212         0.66185155             6.94e-05
         0.5   0.909879     0.91002954         0.91013778             1.08e-04
         1.0   0.950149     0.95024208         0.95049174             2.50e-04
0.3      0.1   0.329487     0.32952345         0.32968987             1.66e-04
         0.2   0.530314     0.53051325         0.53077411             2.61e-04
         0.5   0.778595     0.77893652         0.77936986             4.33e-04
         1.0   0.863911     0.86403154         0.86472833             6.97e-04
0.5      0.1   0.292768     0.29285157         0.29315023             2.99e-04
         0.2   0.472333     0.47253618         0.47301425             4.78e-04
         0.5   0.702192     0.70257211         0.70334963             7.77e-04
         1.0   0.796445     0.79658771         0.79772318             1.13e-03

Table 2.5: Compound Poisson with Pareto jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are Pareto with parameter γ = 1.0. The Brownian term has parameters d = −2.5 and σ = 1.0. Number of Erlang distributions fitted to the Pareto distribution: $N_{Er} = 10$; highest number of phases: $n_{\max} = 5$.




         0.2   0.647698     0.64752872         0.64761795             8.92e-05
         0.5   0.881551     0.88133353         0.88190216             5.99e-04
         1.0   0.912168     0.91201427         0.91587049             3.86e-03
0.3      0.1   0.302718     0.30249674         0.30373567             1.24e-03
         0.2   0.485837     0.48568042         0.48783318             2.15e-03
         0.5   0.704911     0.70473313         0.70947429             4.74e-03
         1.0   0.779802     0.77970465         0.79028644             1.05e-02
0.5      0.1   0.252952     0.25275076         0.25585642             3.10e-03
         0.2   0.407164     0.40696892         0.41214744             5.18e-03
         0.5   0.600502     0.60036169         0.60967756             9.31e-03
         1.0   0.685971     0.68588977         0.70143444             1.55e-02

Table 2.6: Compound Poisson with shifted-Pareto jumps. The jumps occur according to a Poisson process with rate λ = 1; the jump sizes are shifted-Pareto with parameter γ = 1.0. The Brownian term has parameters d = −2.5 and σ = 1.0. Number of Erlang distributions fitted to the shifted-Pareto distribution: $N_{Er} = 12$; highest number of phases: $n_{\max} = 28$.

time t   x     Simulation   Appr. Ph., use (2.7)
0.1      0.1   0.310605     0.31083581
         0.2   0.540572     0.54068394
         0.5   0.866999     0.86676851
         1.0   0.955574     0.95648076
0.3      0.1   0.212655     0.21413828
         0.2   0.376741     0.37879054
         0.5   0.674504     0.67608908
         1.0   0.863495     0.86495277
0.5      0.1   0.181433     0.19042446
         0.2   0.322121     0.33436792
         0.5   0.588677     0.59969483
         1.0   0.794101     0.80039848

Table 2.7: Compound Poisson with both upward and downward jumps. Both the upward jumps and downward jumps occur according to a Poisson process with rate λ = 1; the positive jump sizes are Weibull with parameters μ = 1 and γ = 2, and the negative jump sizes exponential with mean 1.0. The Brownian term has parameters d = −1.0 and σ = 1.0. Number of Erlang distributions fitted to the Weibull distribution: $N_{Er} = 5$; highest number of phases: $n_{\max} = 5$.



time t   x     Simulation ε = 0.1   Simulation ε = 0.05   Appr., use (2.6)   Appr. Ph., use (2.7)
0.1      0.1   0.464093             0.464372              0.46468919         0.46685728
         0.2   0.707813             0.707969              0.70792216         0.71202807
         0.5   0.947668             0.947733              0.94756177         0.95056833
         1.0   0.993627             0.993675              0.99361188         0.99453926
0.3      0.1   0.412429             0.412595              0.41282521         0.41675490
         0.2   0.636461             0.636570              0.63645668         0.64307083
         0.5   0.896372             0.896337              0.89611778         0.90206520
         1.0   0.981780             0.981841              0.98174473         0.98406940
0.5      0.1   0.402468             0.402620              0.40282695         0.40744480
         0.2   0.621987             0.622003              0.62191009         0.62953509
         0.5   0.882154             0.882088              0.88182180         0.88897926
         1.0   0.976001             0.975993              0.97592972         0.97898461

Table 2.8: The upward jumps are CGMY-like, with parameters C = 1.0, M = 2.0 and Y = 0.5; there are no downward jumps. The Brownian term has parameters d = −4.0 and σ = 1.0. Number of Erlang distributions fitted to the CGMY upper tail: $N_{Er} = 5$; highest number of phases: $n_{\max} = 9$, after having cut off the interval (0, ε), with ε = 0.1, from Π(·).

time t   x     Simulation ε = 0.01   Simulation SBM   Appr. Ph., use (2.7)   Appr. Ph. adapted
0.1      0.1   0.862579              0.863298         0.87678746             0.85836619
         0.2   0.909530              0.909840         0.92167074             0.90125495
         0.5   0.962386              0.962496         0.96661308             0.96199422
         1.0   0.987754              0.987790         0.98923639             0.98548799
0.3      0.1   0.663389              0.665469         0.68917587             0.65958697
         0.2   0.759177              0.760215         0.78453098             0.74556038
         0.5   0.886828              0.887072         0.89861106             0.88560877
         1.0   0.959078              0.959243         0.96420072             0.95351051
0.5      0.1   0.536040              0.538339         0.56558233             0.54403583
         0.2   0.646397              0.647893         0.67988532             0.64197788
         0.5   0.816052              0.816467         0.83582281             0.81968867
         1.0   0.927131              0.927450         0.93698719             0.92169160

Table 2.9: Variance Gamma process, d = 0.0, $\sigma^2 = 1.0$ and κ = 1.0. For the fifth column, the number of Erlang distributions fitted to the upper tail is $N_{Er} = 10$ and the highest number of phases is $n_{\max} = 10$, after having cut off the interval (0, ε), with ε = 0.01, from Π(·). For the last column, the number of Erlang distributions fitted to the upper tail is $N_{Er} = 3$ and the highest number of phases is $n_{\max} = 4$, after having cut off the interval (0, ε), with ε = 0.01, from Π(·).



               Simulation   Simulation   WH-MC    WH-MC     Laplace            Appr. Ph.
time t   x     ε = 0.1      ε = 0.05     n = 20   n = 100   inversion (2.14)   adapted
0.1      0.1   0.2693       0.2696       0.2760   0.2713    0.2696             0.2626
         0.2   0.4849       0.4850       0.4961   0.4883    0.4850             0.4707
         0.5   0.8476       0.8474       0.8517   0.8475    0.8475             0.8367
         1.0   0.9710       0.9711       0.9686   0.9675    0.9710             0.9708
0.3      0.1   0.1724       0.1726       0.1761   0.1732    0.1725             0.1788
         0.2   0.3162       0.3163       0.3235   0.3185    0.3163             0.3213
         0.5   0.6255       0.6255       0.6359   0.6287    0.6255             0.6187
         1.0   0.8703       0.8702       0.8714   0.8688    0.8703             0.8605
0.5      0.1   0.1421       0.1423       0.1450   0.1425    0.1422             0.1580
         0.2   0.2613       0.2615       0.2669   0.2625    0.2613             0.2822
         0.5   0.5291       0.5290       0.5389   0.5316    0.5291             0.5427
         1.0   0.7847       0.7848       0.7896   0.7843    0.7848             0.7832

Table 2.10: Beta process, with parameters of Example 11. Number of Erlang distributions fitted to the upper tail: $N_{Er} = 3$; highest number of phases: $n_{\max} = 4$, after having cut off the interval (0, ε), with ε = 0.1, from Π(·).


Chapter 3

Evaluation of option prices

In the previous chapter we developed a numerical technique for evaluating the distribution

of the running supremum of Lévy processes. In this chapter we use this numerical technique

for pricing a specific class of path-dependent options.

3.1 Introduction

Standard options (or: vanilla options) have a payoff structure that depends on the price evo-

lution of the underlying asset only through the price at expiration. There is an abundance

of exotic options that are traded nowadays, however, with payoff structures that are sub-

stantially more involved. Lookback options are examples of derivatives of which the payoff

depends on the maximum (or minimum) price over the life of the option, and possibly the

price of the underlying asset at maturity as well. They come in two flavors: lookback options

with fixed strike, and those with a floating strike.

With the stochastic process $S_t$ representing the evolution of the stock prices and $\overline{S}_T := \sup_{0\le t\le T} S_t$ the associated running maximum process, the payoff of the fixed strike call option is

\[
P^{(c)}_{\mathrm{fix}}(T, K) := \max\{\overline{S}_T - K, 0\} = (\overline{S}_T - K)^+,
\]

with strike price K and maturity time T; analogously, the payoff of the put-counterpart is given by $P^{(p)}_{\mathrm{fix}}(T, K) := (K - \underline{S}_T)^+$, with $\underline{S}_t$ the running minimum process. As indicated by these payoffs, this type of option has a fixed, a priori known strike price, but, as opposed to the 'traditional' European option, the underlying trigger is not the price at maturity but



rather the maximum (or minimum) of the underlying asset price over the life of the option.

The payoff of the floating strike call option is

\[
P^{(c)}_{\mathrm{fl}}(T, L) := \max\{S_T - L\underline{S}_T, 0\} = (S_T - L\underline{S}_T)^+;
\]

in case L ≤ 1 the payoff is always nonnegative, and reduces to $S_T - L\underline{S}_T$. This means that the strike price is fixed at the asset's minimal price during the option's life, multiplied by a specified constant L. The payoff of the put-counterpart is defined by $P^{(p)}_{\mathrm{fl}}(T, L) := \max\{L\overline{S}_T - S_T, 0\}$, which reduces to $L\overline{S}_T - S_T$ if L ≥ 1.
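
The four payoffs defined above can be illustrated on a discretely observed price path. The sketch below is only an illustration: the function name is hypothetical, and taking the maximum and minimum over a finite list of observations is an assumption that stands in for the continuously monitored extremes.

```python
def lookback_payoffs(path, strike, multiplier):
    """Payoffs of the four lookback options for an observed price path.

    path: prices S_0, ..., S_T; strike: K; multiplier: L.
    """
    s_T = path[-1]
    s_max = max(path)   # discrete proxy for the running maximum at T
    s_min = min(path)   # discrete proxy for the running minimum at T
    return {
        "fixed_call": max(s_max - strike, 0.0),
        "fixed_put": max(strike - s_min, 0.0),
        "floating_call": max(s_T - multiplier * s_min, 0.0),
        "floating_put": max(multiplier * s_max - s_T, 0.0),
    }

payoffs = lookback_payoffs([100.0, 110.0, 95.0, 105.0], strike=100.0, multiplier=1.0)
```

For this path the maximum is 110, the minimum is 95 and the terminal price is 105, so with K = 100 and L = 1 the fixed strike call pays 10 and the floating strike put pays 5.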

Importantly, unlike vanilla options, the lookback options discussed above as well as other exotic options have a path-dependent payoff. This means that their payoff does not depend on $S_T$ only, but also involves a certain functional of the process $S_t$, for 0 ≤ t ≤ T (i.e.,

the maximum or minimum value attained). As a consequence it is highly nontrivial to price

such options, or to numerically assess the sensitivities of the price with respect to the various

model parameters such as the maturity and the initial price of the underlying asset (the

‘Greeks’).

It has been widely recognized that the classical Black-Scholes model [23], in which the price evolution process is given by $S_t = S_0 e^{X_t}$ for a Brownian motion $X_t$, fails to reproduce the smile effect; instead, in this model the volatility is constant with respect to strike prices. This

has motivated researchers and practitioners to depart from the Black-Scholes model, and to

consider the more general situation in which Xt follows a Lévy process. Lévy processes, char-

acterized by the property that their increments are stationary and independent, form a rich

class, covering a broad spectrum of possible jump structures [32]. At the same time, how-

ever, they allow for explicit analysis, even if more complex metrics are involved [21, 68]. A

specific branch of the Lévy literature is about fluctuation theory, describing the probabilistic

features of the extreme values attained by the Lévy process under study. Important in the

context of the present chapter are so-called Wiener-Hopf results, which characterize the joint distribution of the running maximum $\overline{X}_T$ (or running minimum $\underline{X}_T$) and the value of the process $X_T$, in terms of a double Laplace transform. More specifically, an expression is given for

\[
\kappa(\alpha, q) := E e^{-\alpha \overline{X}_{\tau(q)}} = \int_0^{\infty} q e^{-qt}\, E e^{-\alpha \overline{X}_t}\,dt,
\]

where τ(q) is an exponentially distributed random variable, independent of the evolution of the Lévy process $X_t$, with mean $q^{-1}$; there is an analogous result for the transform $\underline{\kappa}(\alpha, q)$


associated with the running minimum counterpart. In addition, the transform of $\overline{X}_{\tau(q)}$ (or $\underline{X}_{\tau(q)}$) jointly with the value at τ(q), i.e., $X_{\tau(q)}$, can be given.

The major difficulty with these results lies in their implicitness. Numerical evaluation of the transforms requires the availability of the density of $X_t$ for any t ≥ 0, while often only the associated characteristic function $E e^{isX_1}$ is given (which characterizes the law at any t ≥ 0, due to $E e^{isX_t} = (E e^{isX_1})^t$, as a consequence of the stationary and independent increments). For particular classes of Lévy processes, however, (semi-)explicit expressions for κ(α, q) are available. In this context we mention (i) the class of spectrally one-sided Lévy processes (i.e., $X_t$ has either only negative jumps, or only positive jumps), (ii) the class of Lévy processes with general jumps in one direction and phase-type jumps in the other direction, (iii) the class of Lévy processes of which the Lévy exponent $\log E e^{isX_1}$ is a meromorphic function.

The aim of Chapter 2 was to evaluate the probability distribution of $\overline{X}_t$ for any Lévy process. The strategy followed consisted of the following steps. First the Lévy process under consideration is approximated by a Lévy process from one of the classes (i)-(ii)-(iii); this can be done in principle arbitrarily accurately (at the expense of higher computation times). Then for this approximate Lévy process the Laplace transform of $\overline{X}_{\tau(q)}$ is evaluated. The last step

is to rely on the numerical algorithms described in [36, 37] to invert this transform. The

approach has been validated by extensive numerical experimentation, the main conclusion

being that the proposed method is fast and highly accurate in all scenarios considered.

The idea behind the present chapter is to use the framework of Chapter 2 in order to numer-

ically evaluate lookback option prices and the associated Greeks. With techniques similar

to those used by Nguyen-Ngoc and Yor [78, 79], we find the transforms of the option prices

(fixed and floating strike, call and put), as well as those related to their sensitivities (Greeks)

with respect to the maturity T and initial price $S_0$, in terms of the Wiener-Hopf transforms κ(α, q) and $\underline{\kappa}(\alpha, q)$ introduced above. Then we replace the Lévy process under study by an approximating Lévy process of which κ(α, q) and $\underline{\kappa}(\alpha, q)$ can be evaluated. Finally,

the numerical inversion routines of [36, 37] are used to numerically evaluate the prices and

Greeks.

The numerical experiments performed in this chapter cover a broad variety of underlying

Lévy processes. We start by the classical models, viz. the model in which Xt corresponds

to a Brownian motion with drift, often referred to as the Black-Scholes model [23], and the

model with additional Normally distributed jumps at Poisson epochs, known as the Mer-

ton model [76]. These models still allow explicit expressions for the Wiener-Hopf transforms


κ(α, q) and $\underline{\kappa}(\alpha, q)$; the numerics primarily serve the purpose of checking whether the inversion techniques provide correct output. Next we consider examples from the class of Lévy processes with infinite activity, i.e., processes with infinitely many jumps over any finite interval, which have been shown to provide a particularly good fit of option price data. This class contains e.g. the tempered stable process [62], the normal inverse Gaussian process

[18], the variance gamma process [73], and the CGMY process [28]; the latter process fea-

tures in one of the examples, in addition to the so-called Beta process recently analyzed by

Kuznetsov et al. [66, 67].

A substantial body of work is concerned with the numerical evaluation of prices and Greeks

of specific exotic options, usually just for particular classes of driving Lévy processes; see e.g.

[32, Ch. XI] and [78, 79], and references therein, for nice accounts of the literature. We refer

to [58] for a study on barrier options for the class of generalized hyper-exponential Lévy

models (covering e.g. tempered stable processes, normal inverse Gaussian processes, and

variance gamma processes). An alternative method, the so-called likelihood ratio method,

has been developed by Glasserman and Liu in [48, 49].

This chapter is organized as follows. Section 3.2 sketches the preliminaries: some back-

ground on Lévy processes, key results in fluctuation theory, a brief account of phase-type

approximations, and a short description of the numerical inversion techniques developed

in [36, 37]. In Section 3.3 we compute the transforms of the lookback options under study,

in terms of the Wiener-Hopf transforms κ(α, q) and $\underline{\kappa}(\alpha, q)$. Section 3.4 presents the main

findings of our numerical experiments. This chapter is concluded by a discussion in Section

3.5.

3.2 Preliminaries

In this section we introduce notation, review the main properties of Lévy processes, present

a short overview of Wiener-Hopf theory, point out how to perform phase-type approxima-

tions, and sketch the numerical inversion technique developed in [36, 37].

3.2.1 Lévy processes

We now present a brief review of Lévy processes, with a focus on fluctuation-theoretic results

(i.e., results concerning the distribution of the running maximum and running minimum

process). More detailed accounts can be found in the textbooks by Bertoin [21], Kyprianou



[68], and Sato [88], and the survey paper [35]. The textbooks by Cont and Tankov [32] and

Schoutens [90] focus on the use of Lévy processes in finance.

Let (Xt)t≥0 be a Lévy process, i.e., a (single-dimensional) stochastic process with stationary

and independent increments, defined on an appropriately chosen probability space (Ω,F ,P),

shifted such that X0 = 0. Recall from Chapter 1 that for any Lévy process (Xt)t≥0, the distribution of X1 is infinitely divisible, which is equivalent to the logarithm of its characteristic function (known as the Lévy exponent) admitting the Lévy-Khintchine representation

$$\log \mathbb{E}e^{isX_1} = isd - \tfrac{1}{2}s^2\sigma^2 + \int_{-\infty}^{\infty}\left(e^{isx} - 1 - isx\,1_{\{|x|<1\}}\right)\Pi(dx),$$

where d ∈ R, σ ≥ 0 and the spectral measure Π(dx), concentrated on R \ {0}, satisfies

$$\int_{\mathbb{R}} \min\{x^2, 1\}\,\Pi(dx) < \infty.$$

The triplet (d, σ², Π), commonly referred to as the characteristic triplet, uniquely defines the Lévy process [21, 68, 88]. The first and the second terms are related to a deterministic drift and a Brownian component, respectively. The jumps of the Lévy process are contained in the third term. If $\int_{-\infty}^{\infty}\Pi(dx) < \infty$, then these jumps can be interpreted as a compound Poisson process. In case $\int_{-\infty}^{\infty}\Pi(dx) = \infty$, on the contrary, the Lévy process has infinitely many jumps

in any time interval. Examples of processes of the latter kind are gamma processes, variance

gamma processes, and normal inverse Gaussian processes.

We now discuss results related to the running maximum process $\bar X_t := \sup_{0\le s\le t} X_s$ and the running minimum process $\underline X_t := \inf_{0\le s\le t} X_s$. Recall that τ(q) denotes an exponential random variable with mean $q^{-1}$, independent of the considered Lévy process. Wiener-Hopf theory states that $\bar X_{\tau(q)}$ and $X_{\tau(q)} - \bar X_{\tau(q)}$ are independent; in addition, $X_{\tau(q)} - \bar X_{\tau(q)}$ is distributed as $\underline X_{\tau(q)}$. More specifically, for α ≥ 0,


(−

∫ ∞

0

∫(0,∞)

1

t


)P(Xt ∈ dx)dt

); (3.1)

here k0 > 0 is a normalizing constant. Similarly, for some k0 > 0, and α ≤ 0,


(−

∫ ∞

0

∫(−∞,0)

1

t


)P(Xt ∈ dx)dt

). (3.2)



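In addition, due to the independence mentioned above,

$$\begin{aligned}
\bar\kappa(\alpha, q)\,\underline\kappa(\alpha, q) &= \mathbb{E}e^{-\alpha \bar X_{\tau(q)}}\,\mathbb{E}e^{-\alpha \underline X_{\tau(q)}} = \mathbb{E}e^{-\alpha \bar X_{\tau(q)}}\,\mathbb{E}e^{-\alpha X_{\tau(q)} + \alpha \bar X_{\tau(q)}} \\
&= \mathbb{E}e^{-\alpha X_{\tau(q)}} = \int_0^\infty q e^{-qt}\,\mathbb{E}e^{-\alpha X_t}\,dt \\
&= \int_0^\infty q\left(\exp\left(-q + \log \mathbb{E}e^{-\alpha X_1}\right)\right)^t dt = \frac{q}{q - \log \mathbb{E}e^{-\alpha X_1}} =: K(\alpha, q).
\end{aligned} \tag{3.3}$$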
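The last two steps of (3.3) can be checked numerically. The following is a minimal sketch, assuming a Brownian motion with drift (so that log E exp(−αX_1) = −αμ + α²σ²/2); the function names and parameter values are illustrative and not taken from the thesis.

```python
import math

def levy_exponent_bm(alpha, mu, sigma):
    """log E exp(-alpha * X_1) for Brownian motion with drift mu and volatility sigma."""
    return -alpha * mu + 0.5 * (alpha * sigma) ** 2

def K_closed_form(alpha, q, mu, sigma):
    """K(alpha, q) = q / (q - log E exp(-alpha * X_1)), the right-hand side of (3.3)."""
    return q / (q - levy_exponent_bm(alpha, mu, sigma))

def K_by_quadrature(alpha, q, mu, sigma, t_max=40.0, n=40000):
    """Composite Simpson approximation of int_0^inf q e^{-qt} E e^{-alpha X_t} dt.

    Uses E exp(-alpha X_t) = exp(t * log E exp(-alpha X_1)); n must be even and
    the integral is truncated at t_max (an assumption of this sketch).
    """
    phi = levy_exponent_bm(alpha, mu, sigma)
    h = t_max / n

    def integrand(t):
        return q * math.exp((phi - q) * t)

    total = integrand(0.0) + integrand(t_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3.0
```

The identity requires q > log E exp(−αX_1), so that the integral converges.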

Suppose we wish to evaluate the distributions of $\bar X_t$ and $\underline X_t$. The Wiener-Hopf decomposition, as given above, entails that one option is to (i) first numerically evaluate the double integrals in the exponent in the right-hand sides of (3.1)–(3.2), and then to (ii) numerically invert these.

A principal problem of this approach is that often only the Lévy exponent corresponding to the Lévy process under study is available; in other words, we do not have an explicit expression for the density P(Xt ∈ dx). In this chapter we use a technique that circumvents this problem, and that was proposed in Chapter 2 to evaluate the distributions of $\bar X_t$ and $\underline X_t$, to compute the prices of lookback options and the associated Greeks.

The main idea of the approach advocated in Chapter 2 is to evaluate $\mathbb{P}(\bar X_t \le x)$ bypassing stage (i) above; evidently, $\mathbb{P}(\underline X_t \le x)$ can be computed in an analogous fashion. The underlying idea is that for quite a substantial class of Lévy processes Xt the double transform $\bar\kappa(\alpha, q)$ can be expressed explicitly in terms of the Lévy exponent; we approximate the Lévy process under consideration by an appropriately chosen Lévy process in this class, so that only stage (ii) remains to be performed.

We now describe a few classes of Lévy processes for which $\bar\kappa(\alpha, q)$ (and because of (3.3) also $\underline\kappa(\alpha, q)$) can be expressed explicitly in terms of the Lévy exponent $\log \mathbb{E}e^{isX_1}$. It is noted that in some cases a number of (relatively straightforward) numerical computations still need to be performed.

• Spectrally one-sided processes. First consider the situation in which the underlying Lévy process Xt has either only negative jumps (the spectrally negative case; write X ∈ S−) or only positive jumps (the spectrally positive case; write X ∈ S+). In the former case the running maximum up to the exponential epoch τ(q) has an exponential distribution, whereas in the latter case the so-called generalized Pollaczek-Khinchine formula applies; see e.g. [35, Ch. III and IV]. In both cases, $\bar\kappa(\alpha, q)$ can be expressed in closed form in terms of the Lévy exponent, as we point out now.



Following [21, Ch. VII], for X ∈ S− we define $\Phi(\beta) := \log \mathbb{E}e^{\beta X_1}$, and let Ψ(·) denote its right-inverse [68, p. 211]. Then

$$\bar\kappa(\alpha, q) = \frac{\Psi(q)}{\Psi(q) + \alpha}. \tag{3.4}$$

In other words, $\bar X_{\tau(q)}$ is exponentially distributed with parameter Ψ(q). In case X ∈ S+, define $\varphi(\alpha) := \log \mathbb{E}e^{-\alpha X_1}$, and let ψ(·) be the inverse of ϕ(·). Then

$$\bar\kappa(\alpha, q) = \frac{q}{\psi(q)}\,\frac{\psi(q) - \alpha}{q - \varphi(\alpha)}. \tag{3.5}$$

This result is sometimes referred to as the (generalized) Pollaczek-Khinchine formula [55, 98]; see also [10, Ch. IX, Thm. 3.10].

• Processes with phase-type jumps. It has been found more recently that $\bar\kappa(\alpha, q)$ can be expressed in semi-explicit terms if the jumps in one direction (either upward or downward) are phase-type (or, more generally, have a rational Laplace transform), whereas the jumps in the other direction are allowed to have a general distribution; see [11, 71, 72] for results along these lines. In this chapter, we concentrate on the setting of Lewis and Mordecki [71] in which the positive jumps have a rational Laplace transform, and the downward jumps are general; we write X ∈ R. In this case $\bar\kappa(\alpha, q)$ can be expressed in terms of the zeros of a specific equation (that needs to be solved numerically).

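More specifically, we consider a Lévy process with jumps, with a general jump-size distribution in the downward direction, while the upward jumps have density

$$p(x) = \sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{kj}\,(\alpha_k)^j\,\frac{x^{j-1}}{(j-1)!}\,e^{-\alpha_k x}, \qquad x > 0.$$

As in Chapter 2, these Lévy processes define the class R such that for a finite and positive λ,

$$\xi(s) := \log \mathbb{E}e^{isX_1} = isd - \tfrac{1}{2}s^2\sigma^2 + \int_{-\infty}^{0}\left(e^{isx} - 1 - isx\,1_{\{x>-1\}}\right)\Pi(dx) + \lambda\left(\sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{kj}\left(\frac{i\alpha_k}{s + i\alpha_k}\right)^{j} - 1\right),$$

where the $\alpha_k$ are ordered such that $0 \le \mathrm{Re}(\alpha_1) < \mathrm{Re}(\alpha_2) \le \cdots \le \mathrm{Re}(\alpha_K)$.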

Now let $\beta_j(q)$ denote the j-th root of the equation q = ξ(s), which we assume to have multiplicity $m_j(q)$; let m(q) denote the total number of distinct roots. Then $\bar\kappa(\alpha, q)$ can be expressed in terms of the $\alpha_k$ and $\beta_j(q)$:

$$\bar\kappa(\alpha, q) = \prod_{k=1}^{K}\left(\frac{\alpha + \alpha_k}{\alpha_k}\right)^{n_k}\prod_{j=1}^{m(q)}\left(\frac{\beta_j(q)}{\alpha + \beta_j(q)}\right)^{m_j(q)}; \tag{3.6}$$

this expression can be inverted with respect to α after performing a partial fraction expansion. Further details and properties of the roots are given in [71] and [72, Thm. 2.2].

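• Processes with a meromorphic Lévy exponent. If the Lévy exponent is a meromorphic function in the complex plane, the Wiener-Hopf factorization can be evaluated in the same way as in the case of phase-type jumps [66]. As in Section 2.6.2, the class of Beta processes M consists of Lévy processes defined by the triplet (d, σ, Π), with the Lévy measure Π(·) having density

$$\Pi(x) = c_1\,\frac{e^{-\alpha_1\beta_1 x}}{(1 - e^{-\beta_1 x})^{\lambda_1}}\,1_{\{x>0\}} + c_2\,\frac{e^{\alpha_2\beta_2 x}}{(1 - e^{\beta_2 x})^{\lambda_2}}\,1_{\{x<0\}},$$

with parameters $\alpha_i > 0$, $\beta_i > 0$, $c_i \ge 0$ and $\lambda_i \in (0, 3)\setminus\{1, 2\}$. Its Lévy exponent is given by

$$\Psi(s) = i(d - \rho)s - \tfrac{1}{2}\sigma^2 s^2 + \frac{c_1}{\beta_1}B\left(\alpha_1 - \frac{is}{\beta_1},\, 1 - \lambda_1\right) + \frac{c_2}{\beta_2}B\left(\alpha_2 + \frac{is}{\beta_2},\, 1 - \lambda_2\right) - \gamma, \tag{3.7}$$

with B(x, y) := Γ(x)Γ(y)/Γ(x + y) denoting the Beta function. In addition, defining the function ψ(x) := d log(Γ(x))/dx,

$$\gamma = \frac{c_1}{\beta_1}B(\alpha_1, 1 - \lambda_1) + \frac{c_2}{\beta_2}B(\alpha_2, 1 - \lambda_2),$$

$$\rho = \frac{c_1}{\beta_1}B(\alpha_1, 1 - \lambda_1)\left(\psi(1 + \alpha_1 - \lambda_1) - \psi(\alpha_1)\right) - \frac{c_2}{\beta_2}B(\alpha_2, 1 - \lambda_2)\left(\psi(1 + \alpha_2 - \lambda_2) - \psi(\alpha_2)\right).$$

The Lévy exponent of the Beta process is a meromorphic function in C. There are infinitely many roots of the equation q − Ψ(s) = 0; these are real and simple, and characterized in [66, Thm. 10], as follows. The roots are such that $\xi_0^- \in (-\alpha_1\beta_1, 0)$ and $\xi_0^+ \in (0, \alpha_2\beta_2)$, while for n ∈ {1, 2, . . .},

$$\xi_n^- \in \left(\beta_1(-\alpha_1 - n),\ \beta_1(-\alpha_1 - n + 1)\right), \qquad \xi_n^+ \in \left(\beta_2(\alpha_2 + n - 1),\ \beta_2(\alpha_2 + n)\right).$$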



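Moreover, for x > 0,

$$\mathbb{P}(\bar X_{\tau(q)} \in dx) = -\left(\sum_{k=0}^{\infty} C_k^-\,\xi_k^-\,e^{\xi_k^- x}\right)dx, \tag{3.8}$$

where, with k ∈ {1, 2, . . .},

$$C_0^- = \prod_{n\ge 1}\frac{1 + \xi_0^-/\left(\beta_1(n - 1 + \alpha_1)\right)}{1 - \xi_0^-/\xi_n^-}, \qquad C_k^- = \frac{1 + \xi_k^-/\left(\beta_1(k - 1 + \alpha_1)\right)}{1 - \xi_k^-/\xi_0^-}\prod_{n\ge 1,\, n\neq k}\frac{1 + \xi_k^-/\left(\beta_1(n - 1 + \alpha_1)\right)}{1 - \xi_k^-/\xi_n^-}.$$

A similar expression holds for the density $\mathbb{P}(-\underline X_{\tau(q)} \in dx)$, but $\{\xi_n^-\}$ must be replaced by $\{-\xi_n^+\}$, while $\alpha_1, \beta_1$ must be replaced by $\alpha_2, \beta_2$.

Based on the above characterization, the poles can be determined efficiently. It is noted, though, that for performing the inverse Laplace transform these poles must be determined for complex values of q; in Section 2.7 it is pointed out how this can be done.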
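To make the spectrally negative formula (3.4) concrete, the sketch below evaluates it for a Brownian motion with drift, which has no jumps and therefore falls (trivially) in S−. In that case Φ(β) = μβ + σ²β²/2 and the right-inverse Ψ(q) is the positive root of Φ(β) = q. The function names are illustrative assumptions, not notation from the thesis.

```python
import math

def Phi(beta, mu, sigma):
    """Laplace exponent log E exp(beta * X_1) of Brownian motion with drift."""
    return mu * beta + 0.5 * sigma ** 2 * beta ** 2

def Psi(q, mu, sigma):
    """Right-inverse of Phi: the largest root beta >= 0 of Phi(beta) = q, q >= 0."""
    return (-mu + math.sqrt(mu * mu + 2.0 * sigma * sigma * q)) / sigma ** 2

def kappa_bar(alpha, q, mu, sigma):
    """E exp(-alpha * running maximum at an independent exp(q) time), cf. (3.4)."""
    psi = Psi(q, mu, sigma)
    return psi / (psi + alpha)
```

For instance, `kappa_bar(-1.0, q, mu, sigma)` gives the expected exponential of the running maximum, which is finite only when Ψ(q) > 1.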

3.2.2 Phase-type approximations, and a few other implementational issues

As discussed in the previous subsection, if the jumps in one direction have a phase-type

distribution, while those in the other direction can have any distribution, the Wiener-Hopf

decomposition can be performed in (semi-)explicit terms. For the case of a Lévy process with

non-phase-type jumps, this leads to the idea of approximating the distribution of the jumps

in one direction by a phase-type distribution. Recall that the class of phase-type distribu-

tions has the attractive feature that it forms a dense class of distributions within the class of

distributions on (0,∞) [10, Ch. III, Thm. 4.2]. As a result, any distribution on (0,∞) can

be approximated by a phase-type distribution arbitrarily closely. One approach we use for

fitting an arbitrary distribution on (0,∞) by a suitable phase-type distribution is developed

by Asmussen et al. [14], while an alternative method is proposed by Horváth and Telek [57].

Both methods are based on the expectation-maximization (usually referred to as EM) algorithm.

The method developed in [14] generates an approximation with a general phase-type distri-

bution; this essentially means that it does not necessarily need a good guess to initialize the

algorithm, but for an accurate approximation the degree of phase-type may be prohibitively

large. The method of [57] focuses on mixtures of Erlang distributions, and tends to give an

accurate approximation already with a relatively low number of Erlang distributions; the de-

grees of these Erlang distributions, however, do not change while the algorithm is running,

and as a consequence we need to have a proper guess about these degrees. It has turned



out that for our purposes the algorithm of [57] is more appropriate than the first one with

respect to accuracy and CPU time consumption.

Evidently, in the case of small jumps, i.e., $\int_{-\infty}^{\infty}\Pi(dx) = \infty$, the jumps cannot be described as

a compound Poisson process: there are infinitely many jumps in any finite time interval. In

this situation, which we will come across in Section 3.4 when discussing the CGMY model,

we replace the small jumps (i.e., those that are in absolute value smaller than some small

ε > 0) by an appropriately chosen Brownian motion. This procedure, theoretically backed

by the findings in e.g. [15], is explained in greater detail in Section 2.5.

3.2.3 Laplace inverse transform

Our approach to option pricing relies on an advanced numerical method for Laplace inversion [36, 37]. The method performs Laplace inversion, Fourier inversion, as well as mixed Laplace-Fourier inversion accurately and fast; importantly, multi-dimensional transforms can be handled as well. In this subsection we briefly discuss the inversion method; see Section 2.3 for a more detailed account.

Like many other numerical inversion algorithms, the technique applied here is based on the well-known Poisson summation formula (PSF) [1, 38]. This PSF is given by, for v ∈ [0, 1) and the ‘damping factor’ a ∈ R,

$$\sum_{k=-\infty}^{\infty}\hat f\left(a + 2\pi i(k + v)\right) = \sum_{k=0}^{\infty} e^{-ak}\,e^{-2\pi i k v}\,f(k), \tag{3.9}$$

in which $\hat f$ is the Laplace (Fourier) transform

$$\hat f(s) := \int_{-\infty}^{\infty} e^{-st} f(t)\,dt.$$

The idea is to replace the infinite summation in the left-hand side by an appropriately chosen finite sum; in [36, Appendix A] it is pointed out how such a quadrature rule can be found. Then the values of f(k) can be computed efficiently by the fast Fourier transform algorithm (FFT); see e.g. the seminal paper [32].
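The identity (3.9) can be illustrated on a function whose transform is known. The sketch below assumes f(t) = t e^{−t} for t ≥ 0 (and zero otherwise), so that the transform equals 1/(1 + s)²; it compares a brute-force truncation of the left-hand side with the rapidly converging right-hand side. This is only a numerical illustration of the formula, not the quadrature rule of [36, 37], and all names are hypothetical.

```python
import cmath
import math

def psf_left(a, v, n_terms=100000):
    """Symmetric truncation of sum_k fhat(a + 2*pi*i*(k + v)), fhat(s) = 1/(1+s)^2."""
    total = 0j
    for k in range(-n_terms, n_terms + 1):
        s = a + 2j * math.pi * (k + v)
        total += 1.0 / (1.0 + s) ** 2
    return total

def psf_right(a, v, n_terms=200):
    """Truncation of sum_{k>=0} e^{-a k} e^{-2*pi*i*k*v} f(k), with f(k) = k e^{-k}."""
    total = 0j
    for k in range(n_terms):
        total += math.exp(-a * k) * cmath.exp(-2j * math.pi * k * v) * k * math.exp(-k)
    return total
```

The left-hand side converges only like 1/n_terms when truncated naively, which is why a carefully chosen finite quadrature rule is used in practice.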

Numerical experiments in [36, 37] show that, under general circumstances, this method evaluates the function values to near machine precision. It is also pointed out there how the method can be adapted, with a simple modification, such that it is capable of handling discontinuities.

Although the method has been developed for Laplace inversion, it can be adapted in a straightforward manner to perform Fourier inversion. Details on the implementation, as well as its extension to multidimensional inversion, are described in [36, 37].

3.3 Transforms of prices and Greeks of lookback options

We consider the setting described in the introduction: a model in which the price of an

underlying asset evolves as St = S0 eXt , where Xt is a Lévy process with X0 = 0. In this

chapter we primarily focus on lookback options. To test our numerical procedures, however,

we also include a simpler option, viz. the vanilla option.

We consider the usual setup, as introduced in more detail e.g. in [78]: a market with two basic

assets, viz. the usual bank account with an interest rate r > 0, and the option associated with

an underlying asset whose evolution in time is represented by the stochastic process St.

3.3.1 Vanilla option

For the vanilla call option the payoff is given by

$$P^{(c)}_{\mathrm{van}}(T, K) := (S_T - K)^+,$$

where, as usual, T is the maturity time and K is the strike price. This option is simpler than the lookback options given in the introduction (which also depend on the extreme values of St for t ∈ [0, T]). The put-counterpart is defined through the payoff $P^{(p)}_{\mathrm{van}}(T, K) := (K - S_T)^+$.

Our goal is to compute the price of the vanilla option, i.e.,

$$V^{(c)}_{\mathrm{van}}(T, K) := \mathbb{E}\left[e^{-rT} P^{(c)}_{\mathrm{van}}(T, K)\right];$$

the analysis of the put-counterpart works similarly. It requires some elementary algebra to verify that, with k := log(K/S0),

$$V^{(c)}_{\mathrm{van}}(T, K) = S_0\,e^{-rT}\int_k^\infty (e^x - e^k)\,\mathbb{P}(X_T \in dx).$$


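Let $V^{(c)}_{\mathrm{van}}(T, \alpha)$ be the Fourier transform with respect to k:

$$V^{(c)}_{\mathrm{van}}(T, \alpha) := S_0\,e^{-rT}\int_{-\infty}^{\infty} e^{i\alpha k}\,e^{\eta k}\int_k^\infty (e^x - e^k)\,\mathbb{P}(X_T \in dx)\,dk,$$

where η > 0 is a damping factor. By changing the integration order it is readily found that

$$V^{(c)}_{\mathrm{van}}(T, \alpha) = \frac{S_0\,e^{-rT}}{(i\alpha + \eta)(i\alpha + \eta + 1)}\,\mathbb{E}e^{(i\alpha + \eta + 1)X_T} = \frac{S_0}{(i\alpha + \eta)(i\alpha + \eta + 1)}\left(e^{-r}\,\mathbb{E}e^{(i\alpha + \eta + 1)X_1}\right)^T, \tag{3.10}$$

where the last step is due to the Lévy nature of Xt. We have expressed the transform $V^{(c)}_{\mathrm{van}}(T, \alpha)$ in terms of the Lévy exponent corresponding to Xt and the maturity T.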
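As an illustration of how (3.10) can be used, the sketch below inverts the damped transform for a Brownian motion with drift μ = r − σ²/2 and compares the result with the Black-Scholes formula. It uses a plain trapezoidal rule on the inverse Fourier integral rather than the inversion algorithm of [36, 37]; the damping value, truncation point, and function names are assumptions made for this example.

```python
import cmath
import math

def norm_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def black_scholes_call(S0, K, r, sigma, T):
    d_plus = (-math.log(K / S0) + T * (r + 0.5 * sigma ** 2)) / (sigma * math.sqrt(T))
    d_minus = d_plus - sigma * math.sqrt(T)
    return S0 * norm_cdf(d_plus) - K * math.exp(-r * T) * norm_cdf(d_minus)

def vanilla_transform(alpha, S0, r, sigma, T, eta):
    """Right-hand side of (3.10) with log E exp(z X_1) = mu*z + sigma^2 z^2 / 2."""
    z = 1j * alpha + eta + 1.0
    mu = r - 0.5 * sigma ** 2
    log_mgf = mu * z + 0.5 * sigma ** 2 * z * z
    return S0 / ((1j * alpha + eta) * z) * cmath.exp(T * (log_mgf - r))

def call_by_inversion(S0, K, r, sigma, T, eta=1.0, alpha_max=200.0, n=20000):
    """Invert e^{eta k} V(k) = (1/2pi) int e^{-i alpha k} Vhat(alpha) d alpha.

    The integrand at -alpha is the complex conjugate of the one at alpha, so
    only alpha >= 0 is needed; trapezoidal rule with step alpha_max / n.
    """
    k = math.log(K / S0)
    h = alpha_max / n
    total = 0.5 * vanilla_transform(0.0, S0, r, sigma, T, eta).real
    for j in range(1, n + 1):
        alpha = j * h
        total += (cmath.exp(-1j * alpha * k)
                  * vanilla_transform(alpha, S0, r, sigma, T, eta)).real
    return math.exp(-eta * k) * total * h / math.pi
```

With S0 = 100, K = 100, r = 0.03, σ = 0.2 and T = 1 both routines should agree to many digits, in line with the vanilla rows of Table 3.1.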

We now determine the transforms of a set of Greeks, i.e., sensitivities. We focus on the sensitivities with respect to the initial price of the underlying asset S0 and the maturity T; in the sequel we refer to these Greeks by Δ and Θ. Regarding the former, it is elementary to verify that

$$\Delta^{(c)}_{\mathrm{van}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{van}}(T, K)}{\partial S_0} = e^{-rT}\int_{\log(K/S_0)}^{\infty} e^x\,\mathbb{P}(X_T \in dx).$$

Writing k := log(K/S0) and transforming to k in the same way as above, we obtain the transform

$$\Delta^{(c)}_{\mathrm{van}}(T, \alpha) := e^{-rT}\int_{-\infty}^{\infty} e^{i\alpha k}\,e^{\eta k}\int_k^\infty e^x\,\mathbb{P}(X_T \in dx)\,dk = \frac{1}{i\alpha + \eta}\left(e^{-r}\,\mathbb{E}e^{(i\alpha + \eta + 1)X_1}\right)^T.$$

Realize that the expression in the right-hand side implicitly depends on S0, as k = log(K/S0).

We now concentrate on the Greek with respect to the maturity time. With

$$\Theta^{(c)}_{\mathrm{van}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{van}}(T, K)}{\partial T},$$

we have that

$$\Theta^{(c)}_{\mathrm{van}}(T, \alpha) := \frac{S_0}{(i\alpha + \eta)(i\alpha + \eta + 1)}\left(e^{-r}\,\mathbb{E}e^{(i\alpha + \eta + 1)X_1}\right)^T\left(\log \mathbb{E}e^{(i\alpha + \eta + 1)X_1} - r\right).$$

It is noted that transforms of second-order Greeks can be determined similarly.

The vanilla options are path independent, in the sense that their prices depend on the asset

price process only through the asset price at maturity time T , and are independent of the



specific shape of the path during the time interval (0, T ). The lookback options, which we

are going to study now, are path dependent.

3.3.2 Fixed strike lookback options

We now focus on pricing fixed-strike lookback options; again we present our analysis for the call option, but the put-variant is dealt with analogously. In our derivations, we follow the same line of reasoning as in [78]. Our goal is to evaluate, in terms of transforms,

$$V^{(c)}_{\mathrm{fix}}(T, K) := \mathbb{E}\left[e^{-rT} P^{(c)}_{\mathrm{fix}}(T, K)\right],$$

as well as its Greeks with respect to S0 and T; recall the definition of the payoff $P^{(c)}_{\mathrm{fix}}(T, K)$ from the introduction of this chapter. If K ≤ S0, it automatically follows that $P^{(c)}_{\mathrm{fix}}(T, K) = \bar S_T - K$, with $\bar S_T$ the maximum of the asset price over [0, T]; the option price can then be evaluated as pointed out in Chapter 2. Realize that this case corresponds to a ‘riskless’ option, of which it is guaranteed that the payoff is non-negative. Let us therefore turn to the more realistic setting in which K > S0.

We again parametrize k = log(K/S0), which is now necessarily positive. Let $V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha)$ be the transform with respect to k and T:

$$V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-\vartheta T}\int_0^\infty e^{-\alpha k}\,V^{(c)}_{\mathrm{fix}}(T, K)\,dk\,dT.$$

The idea of treating the maturity T as an exponential random variable was first proposed in [46] for barrier options, albeit only for the Black-Scholes model. The transform can be rewritten as the threefold integral, which we in the sequel assume to converge,

$$S_0\int_0^\infty \vartheta e^{-(r+\vartheta)T}\int_0^\infty e^{-\alpha k}\int_k^\infty (e^x - e^k)\,\mathbb{P}(\bar X_T \in dx)\,dk\,dT.$$

Now change the order of integration: first integrate over k ∈ [0, x], so as to obtain

$$\frac{S_0\vartheta}{r + \vartheta}\int_0^\infty (r + \vartheta)e^{-(r+\vartheta)T}\int_0^\infty\left(\frac{1}{\alpha}\left(e^x - e^{(1-\alpha)x}\right) - \frac{1}{\alpha - 1}\left(1 - e^{(1-\alpha)x}\right)\right)\mathbb{P}(\bar X_T \in dx)\,dT.$$

This expression can be expressed in terms of transforms related to the running maximum after an exponentially distributed time with mean $(r + \vartheta)^{-1}$:

$$\frac{S_0\vartheta}{r + \vartheta}\left(\frac{1}{\alpha}\left(\mathbb{E}e^{\bar X_{\tau(r+\vartheta)}} - \mathbb{E}e^{(1-\alpha)\bar X_{\tau(r+\vartheta)}}\right) - \frac{1}{\alpha - 1}\left(1 - \mathbb{E}e^{(1-\alpha)\bar X_{\tau(r+\vartheta)}}\right)\right).$$


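This expression can be written in terms of the transform $\bar\kappa(\alpha, q)$ introduced earlier:

$$\frac{S_0\vartheta}{r + \vartheta}\left(\frac{\bar\kappa(-1, r + \vartheta) - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha} - \frac{1 - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha - 1}\right).$$

The Greek related to the initial asset price S0 is

$$\Delta^{(c)}_{\mathrm{fix}}(T, K) := \frac{\partial V^{(c)}_{\mathrm{fix}}(T, K)}{\partial S_0} = e^{-rT}\int_{\log(K/S_0)}^{\infty} e^x\,\mathbb{P}(\bar X_T \in dx).$$

With the usual transformation k = log(K/S0), we find

$$\Delta^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-(r+\vartheta)T}\int_0^\infty e^{-\alpha k}\int_k^\infty e^x\,\mathbb{P}(\bar X_T \in dx)\,dk\,dT = \frac{\vartheta}{r + \vartheta}\,\frac{\bar\kappa(-1, r + \vartheta) - \bar\kappa(\alpha - 1, r + \vartheta)}{\alpha}.$$

Now consider the Greek with respect to the maturity T. Interchanging the order of the integrals and integrating by parts (the boundary term at T = 0 vanishes because K > S0) yields

$$\Theta^{(c)}_{\mathrm{fix}}(\vartheta, \alpha) := \int_0^\infty \vartheta e^{-\vartheta T}\int_0^\infty e^{-\alpha k}\,\frac{\partial V^{(c)}_{\mathrm{fix}}(T, K)}{\partial T}\,dk\,dT = \vartheta\,V^{(c)}_{\mathrm{fix}}(\vartheta, \alpha).$$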
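The step from the integral over the law of the running maximum to the expression in terms of the Wiener-Hopf factor can be checked numerically. The sketch below assumes a Brownian motion with drift, for which the running maximum at an independent exponential time with rate q = r + ϑ is exponentially distributed with parameter Ψ(q); it evaluates the fixed-strike price transform both through the factor and by direct quadrature. Function names and parameter values are illustrative assumptions.

```python
import math

def psi_bm(q, mu, sigma):
    """Positive root of mu*beta + sigma^2*beta^2/2 = q."""
    return (-mu + math.sqrt(mu * mu + 2.0 * sigma * sigma * q)) / sigma ** 2

def kappa_bar(alpha, q, mu, sigma):
    """E exp(-alpha * running maximum at an exp(q) time) for Brownian motion with drift."""
    psi = psi_bm(q, mu, sigma)
    return psi / (psi + alpha)

def fixed_strike_transform(S0, r, theta, alpha, mu, sigma):
    """Price transform written in terms of the Wiener-Hopf factor kappa_bar."""
    q = r + theta
    k_m1 = kappa_bar(-1.0, q, mu, sigma)
    k_a = kappa_bar(alpha - 1.0, q, mu, sigma)
    return S0 * theta / q * ((k_m1 - k_a) / alpha - (1.0 - k_a) / (alpha - 1.0))

def fixed_strike_transform_direct(S0, r, theta, alpha, mu, sigma, x_max=20.0, n=20000):
    """Same quantity by Simpson quadrature against the exponential density of the maximum."""
    q = r + theta
    psi = psi_bm(q, mu, sigma)

    def integrand(x):
        g = ((math.exp(x) - math.exp((1.0 - alpha) * x)) / alpha
             - (1.0 - math.exp((1.0 - alpha) * x)) / (alpha - 1.0))
        return g * psi * math.exp(-psi * x)

    h = x_max / n
    total = integrand(0.0) + integrand(x_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return S0 * theta / q * total * h / 3.0
```

The comparison requires Ψ(r + ϑ) > 1 (so that the expected exponential of the maximum is finite) and α ∉ {0, 1}.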

3.3.3 Floating strike lookback options

In this subsection the focus lies on floating strike lookback options, presenting, as usual, the results for the call variant. We characterize, in terms of transforms,

$$V^{(c)}_{\mathrm{fl}}(T, L) := \mathbb{E}\left[e^{-rT} P^{(c)}_{\mathrm{fl}}(T, L)\right],$$

as well as its Greeks with respect to S0 and T; the payoff function $P^{(c)}_{\mathrm{fl}}(T, L)$ is defined in the introduction. If L ≤ 1, this payoff equals $S_T - L\underline S_T$, with $\underline S_T$ the minimum of the asset price over [0, T], being non-negative, and allowing for relatively easy evaluation. We therefore focus on the more realistic (and challenging) setting in which L > 1.

We parametrize ℓ := log L (which is positive), and define

$$V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = \int_0^\infty \vartheta e^{-\vartheta T}\int_0^\infty e^{-\alpha \ell}\,V^{(c)}_{\mathrm{fl}}(T, e^{\ell})\,d\ell\,dT.$$



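After some algebra, it is seen that this expression equals

$$S_0\int_0^\infty \vartheta e^{-(r+\vartheta)T}\int_0^\infty e^{-\alpha \ell}\int_{y=-\infty}^{0}\int_{x=\ell+y}^{\infty}\left(e^x - e^{\ell + y}\right)\mathbb{P}(X_T \in dx, \underline X_T \in dy)\,d\ell\,dT.$$

Interchange the order of the integrals, such that first the integral over ℓ ∈ [0, x − y] is evaluated. This reduces to, with the inner integral corresponding to the variable y and the ‘middle’ integral to x,

$$S_0\int_0^\infty \vartheta e^{-(r+\vartheta)T}\int_{-\infty}^{\infty}\int_{-\infty}^{0}\left(e^x\,\frac{1 - e^{-\alpha(x-y)}}{\alpha} - e^y\,\frac{1 - e^{-(\alpha-1)(x-y)}}{\alpha - 1}\right)\mathbb{P}(X_T \in dx, \underline X_T \in dy)\,dT.$$

We thus find

$$V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = S_0\,\frac{\vartheta}{r + \vartheta}\left(\frac{1}{\alpha(\alpha - 1)}\,\mathbb{E}\left[e^{-(\alpha-1)X_{\tau(r+\vartheta)} + \alpha \underline X_{\tau(r+\vartheta)}}\right] + \frac{1}{\alpha}\,\mathbb{E}e^{X_{\tau(r+\vartheta)}} - \frac{1}{\alpha - 1}\,\mathbb{E}e^{\underline X_{\tau(r+\vartheta)}}\right).$$

Consider the first term between the brackets in the previous display. By virtue of (i) the trivial identity $-(\alpha - 1)x + \alpha\underline x = (\alpha - 1)(\underline x - x) + \underline x$, (ii) the fact that (due to Wiener-Hopf theory) $\underline X_{\tau(r+\vartheta)}$ and $\underline X_{\tau(r+\vartheta)} - X_{\tau(r+\vartheta)}$ are independent, and (iii) the fact that $\underline X_{\tau(r+\vartheta)} - X_{\tau(r+\vartheta)}$ is distributed as $-\bar X_{\tau(r+\vartheta)}$, we have that

$$\mathbb{E}\left[e^{-(\alpha-1)X_{\tau(r+\vartheta)} + \alpha \underline X_{\tau(r+\vartheta)}}\right] = \mathbb{E}e^{-(\alpha-1)\bar X_{\tau(r+\vartheta)}}\;\mathbb{E}e^{\underline X_{\tau(r+\vartheta)}}.$$

These considerations eventually lead to the identity, using the notation introduced in Eqn. (3.3),

$$V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = S_0\,\frac{\vartheta}{r + \vartheta}\left(\frac{\bar\kappa(\alpha - 1, r + \vartheta)\,\underline\kappa(-1, r + \vartheta)}{\alpha(\alpha - 1)} + \frac{K(-1, r + \vartheta)}{\alpha} - \frac{\underline\kappa(-1, r + \vartheta)}{\alpha - 1}\right).$$

We now turn to the Greeks. $\Delta^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$, defined in the obvious way, is simply $V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)/S_0$, which is independent of S0; it is evident from the definition of the payoff that $V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$ is linear in S0. Regarding, in self-evident notation, $\Theta^{(c)}_{\mathrm{fl}}(T, L)$, it is seen that, with the same line of reasoning as used for the fixed strike lookback option, $\Theta^{(c)}_{\mathrm{fl}}(\vartheta, \alpha) = \vartheta\,V^{(c)}_{\mathrm{fl}}(\vartheta, \alpha)$.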

3.4 Numerical validation

In this section we consider a set of frequently used, practically relevant Lévy processes Xt.

For each of them we have implemented our inversion technique to numerically invert the


transforms identified in the previous section.

3.4.1 Black-Scholes model

Our first example concerns the celebrated Black-Scholes model, in which Xt represents a

Brownian motion with drift. This model is admittedly an oversimplification of reality, as

argued in the introduction, but, due to the fact that it allows explicit computations, serves as

an ideal benchmark to test our numerical techniques.

The Lévy process Xt which governs the price of the underlying asset follows a Brownian motion with drift μ and standard deviation parameter σ:

$$dX_t = \mu\,dt + \sigma\,dW_t,$$

with Wt a standard Brownian motion. We pick μ = r − σ²/2, such that $e^{-rt}S_t$, like Wt, is a martingale under the risk-neutral measure. It is readily checked that the price of the vanilla option is given by the well-known Black-Scholes formula:

$$V^{(c)}_{\mathrm{van}}(T, K) = S_0\,\Phi_N(d_+) - K e^{-rT}\,\Phi_N(d_-),$$

where

$$d_\pm := \frac{-\log(K/S_0) + T(r \pm \sigma^2/2)}{\sigma\sqrt{T}}$$

and $\Phi_N(u)$ is the cumulative distribution function of the standard Normal distribution. The Greeks with respect to S0 and T follow by differentiation.

The price of the fixed strike lookback option reads, with k = log(K/S0),

$$V^{(c)}_{\mathrm{fix}}(T, K) = S_0\,e^{-rT}\int_k^\infty (e^y - e^k)\,\mathbb{P}(\bar X_T \in dy),$$

which can be further evaluated using the relation [56, p. 49]

$$\mathbb{P}(\bar X_T \le y) = 1 - \Phi_N\left(\frac{-y + \mu T}{\sigma\sqrt{T}}\right) - \exp\left(\frac{2\mu y}{\sigma^2}\right)\Phi_N\left(\frac{-y - \mu T}{\sigma\sqrt{T}}\right).$$

As a consequence, the option price can be numerically evaluated by performing an elementary integration with arbitrary precision. Formulas for the Greeks follow in a similar fashion.
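The ‘elementary integration’ can be sketched as follows. The code assumes the standard reflection-principle law of the running maximum of a Brownian motion with drift, in the form with the factor exp(2μy/σ²), and uses integration by parts to write the price as an integral of e^y times the tail probability of the maximum. The truncation point, step count, and function names are assumptions of this sketch.

```python
import math

def norm_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def running_max_tail(y, mu, sigma, T):
    """P(max_{0<=s<=T} X_s > y) for y >= 0, X a Brownian motion with drift mu."""
    s = sigma * math.sqrt(T)
    return (norm_cdf((-y + mu * T) / s)
            + math.exp(2.0 * mu * y / sigma ** 2) * norm_cdf((-y - mu * T) / s))

def fixed_strike_lookback_call(S0, K, r, sigma, T, y_max=3.0, n=6000):
    """S0 e^{-rT} int_k^inf (e^y - e^k) dF(y) = S0 e^{-rT} int_k^inf e^y (1 - F(y)) dy.

    Requires K > S0 (so k > 0); Simpson's rule on [k, y_max], n even.
    """
    mu = r - 0.5 * sigma ** 2
    k = math.log(K / S0)

    def integrand(y):
        return math.exp(y) * running_max_tail(y, mu, sigma, T)

    h = (y_max - k) / n
    total = integrand(k) + integrand(y_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(k + j * h)
    return S0 * math.exp(-r * T) * total * h / 3.0
```

For r = 0.03, σ = 0.2, S0 = 100, T = 0.5 and K = 110 this should reproduce the corresponding fixed strike entry of Table 3.1 (about 5.134).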



The price of the floating strike lookback is defined by, with L > 1 and ℓ := log L,

$$V^{(c)}_{\mathrm{fl}}(T, L) = S_0\,e^{-rT}\int_{y=-\infty}^{0}\int_{x=\ell+y}^{\infty}\left(e^x - e^{\ell+y}\right)\mathbb{P}(X_T \in dx;\ \underline X_T \in dy).$$

This expression can be further evaluated by first conditioning on the value of XT. Then realize that Xt, conditional on the value of XT, is a Brownian bridge. The distribution function of the minimum value attained by the Brownian bridge is known to be, for y ≤ 0 and y ≤ x,

$$\mathbb{P}(\underline X_T \le y \mid X_T = x) = \exp\left(-\frac{2y(y - x)}{T\sigma^2}\right).$$

This leads to an algorithm that quantifies $V^{(c)}_{\mathrm{fl}}(T, L)$ by performing a number of elementary numerical integrations. Again, such procedures can be applied as well to find the Greeks.

Table 3.1 presents numerical output for an example corresponding to the Black-Scholes

model with parameters r = 0.03, σ = 0.2, and S0 = 100. The column ‘Exact computa-

tion’ corresponds to the explicit evaluation techniques mentioned above; here we rely on the

explicit formula for the vanilla option, and numerical integration for the lookback options

(which can be done arbitrarily accurately). The numerical experiments with vanilla options

show a virtually perfect fit, close to machine precision. For the lookback options, that require

multiple inversions, the performance is still remarkably good.

Table 3.2 presents the corresponding results for the Greeks with respect to the initial asset price S0 and the maturity T. Again we have a nearly perfect performance for the vanilla option, and still highly accurate results for the lookback options. From the table we conclude that $\Delta^{(c)}_{\mathrm{fl}}(T, L) = V^{(c)}_{\mathrm{fl}}(T, L)/S_0$, in line with an observation we made above.

3.4.2 Jump-diffusion models

A Lévy process is a jump-diffusion if it has the following form:

$$X_t = \mu t + \sigma W_t + \sum_{i=1}^{N_t} J_i;$$

here the first two terms on the right-hand side correspond to a Brownian motion (with drift), whereas the last term is a compound Poisson term. The process Nt is a Poisson process (with rate λ > 0) which counts the number of jumps up to time t, and the Ji are i.i.d. jumps. We consider two specific models here: the so-called Merton model and the Kou model.



Vanilla option, V^(c)_van(T,K)
Time T   Strike K   Simulation   Fourier inversion   Error
0.5      90         12.799       12.799              4.2·10^−14
0.5      100        6.370        6.371               5.7·10^−14
0.5      110        2.611        2.612               2.1·10^−13
1        90         15.428       15.429              1.6·10^−13
1        100        9.142        9.413               1.8·10^−13
1        110        5.292        5.293               3.0·10^−13

Fixed strike lookback option, V^(c)_fix(T,K)
Time T   Strike K   Simulation   Laplace inversion   Error
0.5      110        5.133        5.134               1.9·10^−6
0.5      115        3.070        3.070               1.8·10^−6
0.5      120        1.757        1.757               3.9·10^−7
1        110        10.315       10.316              1.1·10^−6
1        115        7.534        7.535               1.3·10^−6
1        120        5.408        5.409               2.8·10^−7

Floating strike lookback option, V^(c)_fl(T,L)
Time T   L          Simulation   Laplace inversion   Error
0.5      1.10       4.800        4.800               1.7·10^−6
0.5      1.15       2.887        2.888               1.8·10^−6
0.5      1.20       1.661        1.661               3.9·10^−7
1        1.10       9.346        9.347               9.8·10^−7
1        1.15       6.869        6.870               1.3·10^−6
1        1.20       4.959        4.960               2.8·10^−7

Table 3.1: Black-Scholes model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation. Parameter values are: μ = 0.01 and σ = 0.2.



Vanilla option, Δ^(c)_van(T,K) and Θ^(c)_van(T,K)
Time T   Strike K   Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      90         0.821          5.761          0.822         5.770         1.0·10^−14    0.0
0.5      100        0.570          7.065          0.570         7.074         4.0·10^−15    0.0
0.5      110        0.309          5.830          0.310         5.836         9.5·10^−15    3.0·10^−15
1        90         0.781          4.827          0.781         4.832         9.0·10^−15    0.0
1        100        0.598          5.377          0.599         5.380         5.6·10^−15    0.0
1        110        0.410          4.958          0.410         4.961         4.0·10^−15    3.0·10^−15

Fixed strike lookback option, Δ^(c)_fix(T,K) and Θ^(c)_fix(T,K)
Time T   Strike K   Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      110        0.606          11.364         0.606         11.366        1.7·10^−5     3.3·10^−4
0.5      115        0.409          9.067          0.410         9.069         1.2·10^−6     4.8·10^−5
0.5      120        0.262          6.689          0.262         6.690         2.8·10^−7     2.6·10^−5
1        110        0.796          9.517          0.796         9.519         1.7·10^−7     3.4·10^−5
1        115        0.635          8.621          0.635         8.622         1.3·10^−6     6.1·10^−5
1        120        0.495          7.517          0.495         7.519         3.1·10^−7     2.3·10^−5

Floating strike lookback option, Δ^(c)_fl(T,L) and Θ^(c)_fl(T,L)
Time T   L          Δ Simulation   Θ Simulation   Δ Inversion   Θ Inversion   Δ Error       Θ Error
0.5      1.10       0.048          10.254         0.048         10.254        1.7·10^−7     2.3·10^−5
0.5      1.15       0.029          8.310          0.029         8.311         1.8·10^−8     1.7·10^−4
0.5      1.20       0.017          6.201          0.017         6.202         3.9·10^−9     2.3·10^−5
1        1.10       0.093          8.133          0.093         8.077         9.8·10^−10    3.4·10^−5
1        1.15       0.069          7.501          0.069         7.472         1.3·10^−9     5.8·10^−5
1        1.20       0.050          6.635          0.050         6.620         2.8·10^−10    6.8·10^−5

Table 3.2: Greeks corresponding to Black-Scholes model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation. Parameter values are: μ = 0.01 and σ = 0.2.




Vanilla option, V^(c)_van(T,K)
Time T   Strike K   Simulation   StdDev   Fourier inversion   Error
0.5      90         13.072       0.0013   13.083              4.3·10^−9
0.5      100        6.775        0.0010   6.786               6.5·10^−7
0.5      110        2.981        0.0007   2.99                3.3·10^−7
1        90         15.857       0.0018   15.88               1.3·10^−6
1        100        9.985        0.0015   9.994               8.9·10^−7
1        110        5.878        0.0012   5.879               5.6·10^−7

Fixed strike lookback option, V^(c)_fix(T,K)
Time T   Strike K   Simulation   StdDev   Laplace inversion
0.5      110        5.849        0.0009   5.914
0.5      115        3.66         0.0007   3.716
0.5      120        2.21         0.0006   2.255
1        110        11.483       0.0015   11.603
1        115        8.612        0.0013   8.720
1        120        6.371        0.0012   6.465

Floating strike lookback option, V^(c)_fl(T,L)
Time T   L          Simulation   StdDev   Laplace inversion
0.5      1.1        5.448        0.0008   5.498
0.5      1.15       3.434        0.0007   3.476
0.5      1.2        2.086        0.0005   2.120
1        1.1        10.329       0.0014   10.425
1        1.15       7.798        0.0012   7.886
1        1.2        5.801        0.0011   5.878

Table 3.3: Merton model; results obtained by simulation, results obtained by Fourier/Laplace inversion, and absolute value of the error, compared with exact computation (for vanilla option only). Parameter values are: μ = 0.01, σ = 0.2, λ = 10.0, δ = 0.025, and ρ = −δ²/2.



• In the Merton model [76], the jumps have a Gaussian distribution with mean ρ and variance δ². The distribution of Xt can be given explicitly:

$$\mathbb{P}(X_t \in dx) = e^{-\lambda t}\sum_{k=0}^{\infty}\frac{(\lambda t)^k}{k!}\,\frac{1}{\sqrt{2\pi(\sigma^2 t + k\delta^2)}}\exp\left(-\frac{1}{2}\,\frac{(x - \mu t - k\rho)^2}{\sigma^2 t + k\delta^2}\right)dx. \tag{3.11}$$

The results are presented in Table 3.3. Using this formula the price of the vanilla option can be obtained as a series whose summands can be expressed in terms of the Black-Scholes formula. For the lookback options we use simulation as a benchmark. In our inversion approach we approximate the upward jumps by a phase-type distribution (see Section 2.4), and rely on the results presented in [72]. The results thus obtained are given in the last column. The conclusions are similar to those regarding the Black-Scholes model: a nearly perfect fit for the vanilla option, and highly accurate performance for the lookback options.

• The distribution of the jump sizes in the Kou model [64] is an asymmetric exponential distribution, with density

$$\mathbb{P}(J_i \in dx) = \left(p\,\lambda_+ e^{-\lambda_+ x}\,1_{\{x>0\}} + (1 - p)\,\lambda_- e^{-\lambda_-|x|}\,1_{\{x<0\}}\right)dx, \tag{3.12}$$

where λ± > 0 and p ∈ [0, 1]. In this case the probability distribution of Xt cannot be given in closed form, and as a result we use simulation as a benchmark in Table 3.4, for both the vanilla and the lookback options. An advantage of this model over the Merton model is that here the jumps are phase-type, and therefore the Wiener-Hopf factors are readily evaluated. The performance is comparable to that of the Merton model.
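The series representation of the Merton vanilla price implied by (3.11) can be sketched as follows: conditional on k jumps, X_T is Gaussian with mean μT + kρ and variance σ²T + kδ², so each summand is a Black-Scholes-type expression. The interest rate r = 0.03 used in the usage note below is an assumption carried over from the Black-Scholes example (it is not stated in the caption of Table 3.3), and the function names are illustrative.

```python
import math

def norm_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def merton_call(S0, K, r, T, mu, sigma, lam, rho, delta, n_terms=150):
    """Vanilla call price S0 e^{-rT} E[(e^{X_T} - e^k)^+] via the Poisson mixture (3.11)."""
    k = math.log(K / S0)
    weight = math.exp(-lam * T)  # Poisson probability of zero jumps
    total = 0.0
    for j in range(n_terms):
        m = mu * T + j * rho                 # conditional mean of X_T
        v = sigma ** 2 * T + j * delta ** 2  # conditional variance of X_T
        s = math.sqrt(v)
        # E[(e^Z - e^k)^+] for Z ~ N(m, v)
        term = (math.exp(m + 0.5 * v) * norm_cdf((m + v - k) / s)
                - math.exp(k) * norm_cdf((m - k) / s))
        total += weight * term
        weight *= lam * T / (j + 1)          # next Poisson probability
    return S0 * math.exp(-r * T) * total
```

With S0 = 100, K = 100, T = 0.5 and the parameters of Table 3.3, the series should be close to the tabulated inversion value 6.786.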

3.4.3 Infinite activity Lévy processes

In this section we consider Lévy processes with infinite activity. As an example we use

CGMY processes [28], but other infinite-activity models can be dealt with similarly; in Ex-

ample 10 of Chapter 2, for instance, also variance gamma processes are covered.

For the models we described before (Black-Scholes, Merton, Kou) Monte Carlo simulation is a viable alternative to inversion-based techniques: compound Poisson processes are straightforward to generate, while it is also known how to obtain exact samples from $(X_t, \bar X_t)$ when Xt is a Brownian motion [50]. Monte Carlo methods are problematic for infinite-activity models, however, due to the infinitely many jumps in finite time intervals. Techniques have nevertheless been developed to replace the process’ small jumps (say, those in absolute value




Vanilla option, V^(c)_van(T,K)
Time T   Strike K   Simulation   StdDev   Fourier inversion
0.5      90         13.361       0.0014   13.358
0.5      100        7.166        0.0011   7.164
0.5      110        3.341        0.0008   3.34
1        90         16.312       0.0019   16.308
1        100        10.532       0.0016   10.529
1        110        6.423        0.0013   6.421

Fixed strike lookback option, V^(c)_fix(T,K)
(table entries missing from the source text)

Floating strike lookback option, V^(c)_fl(T,L)
Time T   L          Simulation   StdDev   Fourier inversion
0.5      1.1        5.988        0.0009   5.986
0.5      1.15       3.912        0.0008   3.911
0.5      1.2        2.482        0.0006   2.481
1        1.1        11.157       0.0015   11.155
1        1.15       8.598        0.0014   8.596
1        1.2        6.546        0.0012   6.545

Table 3.4: Kou model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, p = 1/2, λ− = 39, and λ+ = 40.



smaller than ε) by a suitably chosen Brownian motion, whereas the remaining jumps can be

described by a compound Poisson process; see the approach presented in Section 2.5, which

is theoretically backed by the convergence results in [15].

Such approximations are useful when devising computational techniques, too: approximating the large positive jumps by a phase-type distribution, we have identified a model in the class R, allowing the evaluation of its Wiener-Hopf factors $\bar\kappa(\alpha, q)$ and $\underline\kappa(\alpha, q)$, as treated in [72].

Owing to its inherent versatility, the CGMY model is among the most popular models used

when modeling asset prices. Particularly when we add a Brownian motion, the six parame-

ters typically allow capturing the process’ crucial features. The Lévy measure of the CGMY

process is given by

\[
\Pi(dx) = C\,\frac{e^{-Mx}}{x^{1+Y}}\,1_{\{x>0\}}\,dx + C\,\frac{e^{-G|x|}}{|x|^{1+Y}}\,1_{\{x<0\}}\,dx,
\]

with C,G,M > 0 and Y ∈ [0, 2). The corresponding Lévy exponent reads

\[
\log E e^{sX_1} = \mu s + \tfrac{1}{2}\sigma^2 s^2 + C\,\Gamma(-Y)\left[(M-s)^Y - M^Y + (G+s)^Y - G^Y\right]. \tag{3.13}
\]
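As a small numerical companion to (3.13), the sketch below evaluates the exponent for real arguments. It is restricted, as an assumption of this sketch, to −G < s < M and to Y in (0, 2) with Y ≠ 1, where Γ(−Y) is finite and the powers are real; the function name is hypothetical.

```python
import math


def cgmy_exponent(s, mu, sigma, C, G, M, Y):
    """Evaluate the right-hand side of (3.13) for real s in (-G, M).

    Illustrative helper; assumes Y in (0, 2), Y != 1, so that Gamma(-Y) is finite.
    """
    if not -G < s < M:
        raise ValueError("s must lie in (-G, M)")
    if not 0.0 < Y < 2.0 or Y == 1.0:
        raise ValueError("this sketch requires Y in (0, 2) with Y != 1")
    jump_part = C * math.gamma(-Y) * (
        (M - s) ** Y - M ** Y + (G + s) ** Y - G ** Y
    )
    return mu * s + 0.5 * sigma ** 2 * s ** 2 + jump_part


# Parameter values as in Table 3.5.
psi_half = cgmy_exponent(0.5, 0.01, 0.2, 0.5, 3.0, 4.0, 0.5)
```

The exponent vanishes at s = 0, and for a symmetric pure-jump specification (μ = σ = 0, G = M) it is even in s and positive away from the origin.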

For Monte Carlo simulation we may ‘remove’ the jumps smaller than ε in absolute value, in the sense that we approximate them by a drift and diffusion process with $\mu_\varepsilon = \int_{-\varepsilon}^{\varepsilon} x\,\Pi(dx)$ and $\sigma^2_\varepsilon = \int_{-\varepsilon}^{\varepsilon} x^2\,\Pi(dx)$. As indicated above, the large positive jumps (i.e., those larger than ε) are then approximated by phase-type jumps, so that we have determined an approximative model in R. This technique is dealt with in detail, and thoroughly validated, in Chapter 2; here we follow an alternative approach, cf. [13, Section 2.1], which we describe now.

Consider for ease just the upper tail of the Lévy measure; the lower tail can be dealt with

analogously. First realize that

\[
\frac{C e^{-Mx}}{x^{1+Y}} = C e^{-Mx} \int_0^\infty \frac{u^{Y} e^{-ux}}{\Gamma(1+Y)}\,du. \tag{3.14}
\]

Evidently, the above integral can be approximated by a weighted sum of exponential terms,

\[
\frac{C e^{-Mx}}{x^{1+Y}} \approx \sum_{i=1}^{N} c_i\,(u_i+M)\,e^{-(u_i+M)x},
\]

where $c_i := C w_i u_i^{Y} / [(u_i+M)\,\Gamma(1+Y)]$, with quadrature points $u_i$, and the $w_i$ denoting the corresponding Gaussian weights. When choosing N suitably large, the approximation can be made arbitrarily accurate. Following the same procedure for the lower tail, we have approximated the Lévy process under consideration by a Lévy process in R, for which we can evaluate $\bar\kappa(\alpha, q)$ and $\underline{\kappa}(\alpha, q)$.
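The construction above can be sketched numerically as follows. To keep the example free of external dependencies, the Gaussian rule mentioned in the text is replaced here by a plain trapezoidal rule on a logarithmic grid (substituting u = e^s in (3.14)); this swap, the grid parameters s_min, s_max and h, and the function names are assumptions of this sketch, not part of the original method.

```python
import math


def exponential_mixture(C, M, Y, s_min=-12.0, s_max=8.0, h=0.5):
    """Return rates (u_i + M) and coefficients c_i such that
    C e^{-Mx} / x^{1+Y}  ~  sum_i c_i (u_i + M) e^{-(u_i + M) x}.

    Nodes and weights come from the trapezoidal rule in s = log(u), used
    here as a stand-in for the Gaussian rule of the text.
    """
    n = int(round((s_max - s_min) / h)) + 1
    rates, coeffs = [], []
    for j in range(n):
        u = math.exp(s_min + j * h)  # quadrature point u_i
        w = h * u                    # weight w_i, since du = u ds
        rates.append(u + M)
        coeffs.append(C * w * u ** Y / ((u + M) * math.gamma(1.0 + Y)))
    return rates, coeffs


def mixture_density(x, rates, coeffs):
    """Evaluate the approximating sum of exponentials at x > 0."""
    return sum(c * r * math.exp(-r * x) for r, c in zip(rates, coeffs))


C, M, Y = 0.5, 4.0, 0.5  # upper-tail parameters as in Table 3.5
rates, coeffs = exponential_mixture(C, M, Y)
rel_errors = [
    abs(mixture_density(x, rates, coeffs) / (C * math.exp(-M * x) / x ** (1.0 + Y)) - 1.0)
    for x in (0.05, 0.5, 1.0, 5.0)
]
```

With these illustrative grid settings the sum has 41 terms; the grid range has to be adapted to the interval of x values on which the approximation is needed (here roughly from ε = 0.05 up to moderate jump sizes).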

We now discuss how the CGMY process can be simulated. The above integral representation (3.14) is only valid for x > 0; small jumps have to be treated separately. Assuming that we exclude x ∈ (−ε, ε) for a specific value of ε, we can choose the number of terms in the Gaussian quadrature expansion of the integral so as to obtain the desired accuracy. Then we need to add drift and diffusion terms to the process to compensate for the exclusion of jumps that are in absolute value smaller than ε; the corresponding parameters are

\[
\mu_\varepsilon = \int_{-\varepsilon}^{\varepsilon} \left( \frac{C e^{-Mx}}{x^{1+Y}} - \sum_{i=1}^{N} c_i\,(u_i+M)\,e^{-(u_i+M)x} \right) x\,dx,
\]

and

\[
\sigma^2_\varepsilon = \int_{-\varepsilon}^{\varepsilon} \left( \frac{C e^{-Mx}}{x^{1+Y}} - \sum_{i=1}^{N} c_i\,(u_i+M)\,e^{-(u_i+M)x} \right) x^2\,dx;
\]

these expressions can be evaluated in more explicit terms (but the resulting formulas do not provide any additional insight). The numerical findings can be found in Table 3.5; results for the corresponding Greeks are presented in Table 3.6.

3.4.4 Beta processes

In this last example we consider the situation that $X_t$ follows a Beta process [66, 67]. As this process has small jumps, the idea is to approximate the jumps (in absolute value) smaller than ε by a suitably chosen Brownian motion with drift, as explained in Section 3.2.2 and Section 2.5; we picked ε = 0.05. As we pointed out in Section 3.2.1, the distributions of $\bar X_{\tau(q)}$ and $\underline{X}_{\tau(q)}$ are given in terms of infinite series; in our experiments we truncated these at 25.

In Table 3.7 we present results obtained by ordinary simulation (with the specific ε given

above), and by the Wiener-Hopf Monte Carlo method recently developed in [67] (denoted

by ‘KKPS’); see also the brief summary in Section 2.6.1.

Vanilla option, V^(c)_van(T,K)

[rows not recoverable from the source]

Fixed strike lookback option, V^(c)_fix(T,K)

Time T   Strike K   Simulation   StdDev   Laplace inversion
0.5      110        11.894       0.0024   11.737
0.5      115        9.523        0.0023   9.39
0.5      120        7.77         0.0022   7.661
1        110        22.547       0.0038   22.302
1        115        19.567       0.0037   19.34
1        120        17.09        0.0036   16.883

Floating strike lookback option, V^(c)_fl(T,L)

Time T   L      Simulation   StdDev   Laplace inversion
0.5      1.1    11.75        0.0022   11.717
0.5      1.15   9.468        0.0021   9.44
0.5      1.2    7.665        0.0020   7.643
1        1.1    20.172       0.0034   20.135
1        1.15   17.645       0.0033   17.612
1        1.2    15.454       0.0032   15.424

Table 3.5: CGMY model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, C = 1/2, G = 3, M = 4, and Y = 1/2.

Vanilla option, Δ^(c)_van(T,K) and Θ^(c)_van(T,K)

Time T   Strike K   Δ Simulation   StdDev       Θ Simulation   StdDev   Δ Inversion   Θ Inversion
0.5      90         0.736          6.0 · 10^−5   11.775         0.0146   0.736         11.768
0.5      100        0.579          6.4 · 10^−5   13.356         0.0136   0.579         13.348
0.5      110        0.424          6.4 · 10^−5   12.941         0.0124   0.424         12.938
1        90         0.714          7.0 · 10^−5   8.848          0.0158   0.714         8.832
1        100        0.611          7.3 · 10^−5   9.602          0.0151   0.611         9.587
1        110        0.51           7.4 · 10^−5   9.709          0.0144   0.51          9.695

Fixed strike lookback option, Δ^(c)_fix(T,K) and Θ^(c)_fix(T,K)

Time T   Strike K   Δ Simulation   StdDev       Θ Simulation   StdDev   Δ Inversion   Θ Inversion
0.5      110        0.779          6.7 · 10^−5   24.357         0.0106   0.781         24.38
0.5      115        0.635          7.0 · 10^−5   22.474         0.0104   0.636         22.501
0.5      120        0.516          7.0 · 10^−5   20.3           0.0101   0.517         20.323
1        110        0.99           7.3 · 10^−5   19.863         0.0109   0.992         19.861
1        115        0.879          7.9 · 10^−5   19.245         0.0108   0.88          19.246
1        120        0.781          8.2 · 10^−5   18.453         0.0107   0.782         18.455

Floating strike lookback option, Δ^(c)_fl(T,L) and Θ^(c)_fl(T,L)

Time T   L      Δ Simulation   StdDev       Θ Simulation   StdDev   Δ Inversion   Θ Inversion
0.5      1.1    0.117          2.2 · 10^−5   19.972         0.0141   0.117         19.977
0.5      1.15   0.094          2.1 · 10^−5   18.781         0.0135   0.094         18.794
0.5      1.2    0.076          2.0 · 10^−5   17.203         0.0128   0.076         17.216
1        1.1    0.201          3.4 · 10^−5   14.428         0.0160   0.201         14.412
1        1.15   0.176          3.3 · 10^−5   14.314         0.0156   0.176         14.303
1        1.2    0.154          3.2 · 10^−5   13.994         0.0152   0.154         13.986

Table 3.6: Greeks corresponding to CGMY model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, C = 1/2, G = 3, M = 4, and Y = 1/2.

Vanilla option, V^(c)_van(T,K)

Time T   Strike K   Simulation   StdDev   Simulation – KKPS   Inversion
0.5      90         17.449       0.0035   18.671              17.455
0.5      100        12.325       0.0033   13.52               12.33
0.5      110        8.845        0.0031   10.054              8.85
1        90         23.133       0.0051   25.057              23.142
1        100        18.577       0.0049   20.476              18.585
1        110        15.038       0.0047   16.929              15.046

Fixed strike lookback option, V^(c)_fix(T,K)

Time T   Strike K   Simulation   StdDev   Simulation – KKPS   Inversion
0.5      110        13.941       0.0036   15.403              14.516
0.5      115        11.638       0.0035   13.127              12.191
0.5      120        9.88         0.0034   11.393              10.395
1        110        25.661       0.0056   28.103              27.251
1        115        22.794       0.0055   25.252              24.302
1        120        20.385       0.0054   22.858              21.808

Floating strike lookback option, V^(c)_fl(T,L)

Time T   L      Simulation   StdDev   Simulation – KKPS   Inversion
0.5      1.1    12.403       0.0032   13.134              12.881
0.5      1.15   10.391       0.0031   11.251              10.861
0.5      1.2    8.84         0.0031   9.802               9.284
1        1.1    21.37        0.0049   22.505              22.589
1        1.15   19.083       0.0048   20.327              20.255
1        1.2    17.134       0.0048   18.473              18.253

Table 3.7: Beta model; results obtained by simulation, and results obtained by Fourier/Laplace inversion. Parameter values are: μ = 0.01, σ = 0.2, α1 = α2 = 1, β1 = β2 = 4, λ1 = λ2 = 3/2 and c1 = c2 = 2.

3.5 Discussion

In this section we reflect on the computational effort required by our algorithm, and discuss possible extensions to alternative exotic options.

3.5.1 Remarks on computational effort

The computational effort incurred by the numerical Laplace inversion [36, 37] is in the order

of M logM , if function values f(kΔ), for k = 0, 1, . . . ,M − 1 are to be determined. As quan-

tified in detail in [36, 37], the associated computation time is typically extremely low, often

negligible relative to other components of the algorithm. More specifically, when approxi-

mating the Lévy process under consideration by a Lévy process in R or M , our algorithm

requires us to compute roots, which can be time consuming. For a more detailed account of this issue we refer to Section 2.7.2, where it is also described how to accelerate this search, relying on ideas proposed in e.g. [66]. Note also that for Beta processes we have information on the location of the roots, as mentioned earlier in this chapter; relatively straightforward bisection procedures can be applied there.

To give an impression of the computational gain with respect to simulation-based estimation, consider the numerical experiments performed for the Black-Scholes model. On a standard PC, generation of $10^5$ samples (essentially consisting of pairs $(X_T, \bar X_T)$) took about 2 seconds per instance, while generating the full table (i.e., several maturities and strikes) for the vanilla option by Fourier inversion took less than 0.1 s, and generating the full table for the lookback options by double Laplace inversion about 1.1 s. For the other models (Kou, Merton, CGMY, Beta) the comparison is even more in favor of the inversion-based methods, for the reason that the time needed to generate a single sample is now proportional to the maturity T (while the computational effort associated with the inversion method is independent of T).

3.5.2 Other exotic options

Various other exotic options can be dealt with fully analogously. For example, the variable

notional call and put options [79, Section 3.2], with payoffs

\[
P^{(c)}_{\rm vn}(T, L) := \frac{(S_T - L\,\underline{S}_T)^+}{\underline{S}_T}, \qquad
P^{(p)}_{\rm vn}(T, L) := \frac{(L\,\bar S_T - S_T)^+}{\bar S_T},
\]

respectively; for the call (put) option we assume L ≥ 1 (L ≤ 1, respectively), so as to avoid the less interesting situation of a certainly positive payoff. The joint transforms (with respect to the maturity time T as well as ℓ = log L) and corresponding Greeks can be determined as in Section 3.3.3, cf. [79, Prop. 3.4]. Other options whose payoff can be written in terms of $S_T$, $\bar S_T$, $\underline{S}_T$, T, and a single other parameter (e.g., the K or L featuring in lookback options) result in explicit expressions (in terms of $\bar\kappa(\alpha, q)$, $\underline{\kappa}(\alpha, q)$, and $\mathcal{K}(\alpha, q)$) for the double transform of the option price, and can be analyzed and numerically evaluated in a similar fashion.

Barrier options. So far all transforms considered were single transforms (vanilla option) or double transforms (lookback options), for which the inversion techniques of [36, 37] were feasible; we remark, however, that the inversion routines for double transforms, while still being fast and accurate, were already substantially slower and less accurate than those for single transforms. It is noted that barrier options, and their associated Greeks, can be analyzed in terms of triple transforms, as can be seen as follows.

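Consider for instance the so-called Up-and-In barrier call option; the other flavors (Up-and-Out, Down-and-In, Down-and-Out, and the put variants) can be analyzed fully analogously. The Up-and-In barrier call option has payoff

\[
P^{(c)}_{\rm uib}(T,K,H) := (S_T - K)^+\, 1\{\bar S_T \ge H\};
\]

we are interested in the more challenging case that max{S0, K} < H (if this condition is not fulfilled, the barrier is certainly reached whenever the option ends in the money, so that the payoff coincides with that of the vanilla call). Putting k := log(K/S0) and h := log(H/S0), we wish to evaluate

\[
V^{(c)}_{\rm uib}(T,K,H) := E\left[e^{-rT} P^{(c)}_{\rm uib}(T,K,H)\right].
\]

Now let $V^{(c)}_{\rm uib}(\vartheta, \alpha, \beta)$ be the transform with respect to k, h, and T:

\[
S_0 \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_{-\infty}^{\infty} \int_0^\infty e^{i\alpha k} e^{-\beta h} \int_{y=k}^{\infty} \int_{x=h}^{\infty} (e^{y} - e^{k})\, P(\bar X_T \in dx, X_T \in dy)\, dh\, dk\, dT.
\]

This expression reduces, when interchanging the order of integration, to

\[
\frac{S_0}{i\alpha(i\alpha+1)\beta} \int_0^\infty \vartheta e^{-(r+\vartheta)T} \int_{y=-\infty}^{\infty} \int_{x=0}^{\infty} e^{(i\alpha+1)y}\,(1 - e^{-\beta x})\, P(\bar X_T \in dx, X_T \in dy)\, dT
\]
\[
= \frac{S_0}{i\alpha(i\alpha+1)\beta}\, \frac{\vartheta}{r+\vartheta}\, E\left( e^{(i\alpha+1) X_{\tau(r+\vartheta)}} \left(1 - e^{-\beta \bar X_{\tau(r+\vartheta)}}\right) \right),
\]

which can be expressed in terms of the functions $\bar\kappa(\alpha, q)$, $\underline{\kappa}(\alpha, q)$, and $\mathcal{K}(\alpha, q)$ (as we did for the floating strike lookback option):

\[
\frac{S_0 \vartheta}{r+\vartheta} \left( \frac{\mathcal{K}(-i\alpha-1, r+\vartheta) - \bar\kappa(-i\alpha-1+\beta, r+\vartheta)\,\underline{\kappa}(-i\alpha-1, r+\vartheta)}{i\alpha(i\alpha+1)\beta} \right).
\]

The Greeks can be characterized (in terms of transforms) as before.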

Barrier option prices are, however, significantly harder to evaluate than lookback option prices, as the transform to be inverted is threefold. In this case, the inversion techniques of [36, 37] become slow and less accurate. Adapting the inversion techniques to facilitate triple inversion is a topic for further research. For the class of generalized hyperexponential Lévy processes, [58] shows that the transform with respect to the maturity time T (for given K and H) can be explicitly calculated; hence, for this class of Lévy processes just a single-dimensional inversion is needed.

3.6 Concluding Remarks

This chapter proposes and validates a technique for pricing lookback options driven by a

general exponential Lévy model. The main idea is that we approximate the Lévy process

under investigation by a Lévy process for which the Wiener-Hopf factorization can be done

in (semi-)explicit terms; this approximation can be in principle as accurate as needed (of

course at the expense of an increase in computation time). With the Wiener-Hopf factors

being known, the option prices as well as corresponding Greeks can be evaluated by per-

forming (potentially multi-dimensional) Fourier and Laplace inversion.

We have thoroughly tested the proposed algorithm by considering a broad range of driving Lévy processes, while also varying the parameter values. The procedure consistently yielded fast and accurate results.

Chapter 4

Asymptotics of the supremum of a Lévy process

In the previous chapters we have studied numerical techniques for evaluating the distribution of the running supremum attained by a Lévy process, and applied them to option pricing. In this chapter we discuss an alternative evaluation technique: we determine the corresponding tail asymptotics, and present an importance-sampling-based fast simulation method.

4.1 Introduction

Consider the Lévy process $X \equiv (X_t)_{t\ge 0}$, and define $Q := \sup_{t\ge 0} X_t$ as its all-time supremum. The object of study in this chapter is the asymptotic behavior of $P(Q > u)$ as u → ∞. In particular, we present a short derivation of the result that, under a condition that ensures that the underlying Lévy process is light-tailed (in a sense specified below), there is a constant ω > 0 such that $P(Q > u)\,e^{\omega u}$ converges to a positive constant as u → ∞. This result, which can be considered as the continuous-time counterpart of Cramér’s seminal result for random walks, was first established in [22]; see also [68, Section VII.2]. The first contribution of this chapter is a novel, insightful, compact derivation of the above asymptotics, essentially relying on the so-called second factorization identity [82]. The second contribution concerns a fast and efficient scheme for evaluating $P(Q > u)$ for u large, relying on importance sampling. The applicability of this procedure is demonstrated in a number of numerical experiments.


4.2 Asymptotics

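Let ζ(ϑ) be the Lévy exponent associated with the process $(X_t)_{t\ge 0}$, in that $\mathbb{E} e^{\vartheta X_t} = \exp(t\zeta(\vartheta))$. We assume that the process has a negative drift, i.e., $\mathbb{E} X_1 < 0$, and that 0 is regular for (0,∞) [68, Def. 6.4]. We further assume that the process is light-tailed in the sense that the Cramér condition holds:

\[
\exists\, \omega \in (0,\infty) : \zeta(\omega) = 0;
\]

in the sequel we refer to this root simply by ω.

We now consider the ω-twisted version of the Lévy process $(X_t)_{t\ge 0}$; that is, we associate a measure $\mathbb{Q}$ to the process so that $d\mathbb{Q}(X_t \le x) = e^{\omega x}\, d\mathbb{P}(X_t \le x)$. This twisted version is again a Lévy process, and has Lévy exponent ζ(ϑ + ω), i.e.,

\[
\mathbb{E}_{\mathbb{Q}}\, e^{\vartheta X_t} = \exp(t\zeta(\vartheta + \omega));
\]

$\mathbb{E}_{\mathbb{Q}}(\cdot)$ denotes expectation under $\mathbb{Q}$. For an explicit description of $(X_t)_{t\ge 0}$ under $\mathbb{Q}$, we refer to, e.g., [10, Thm. XIII.3.4]; the drift is adjusted, the Brownian term remains unchanged, while the jumps are exponentially twisted (with ‘twist’ ω) with an adapted arrival rate.

Define $\sigma(u) := \inf\{t \ge 0 : X_t > u\}$. Then a standard change-of-measure argument yields

\[
\mathbb{P}(Q > u) = \mathbb{E}_{\mathbb{Q}}\, e^{-\omega X_{\sigma(u)}},
\]

see e.g. [10, Eqn. (XIII.5.2)] or [22, Remark 2]. Now decompose $X_{\sigma(u)}$ into the level u and the non-negative random quantity $B_u := X_{\sigma(u)} - u$, to be interpreted as the overshoot over level u. We thus obtain

\[
\lim_{u\to\infty} \mathbb{P}(Q > u)\, e^{\omega u} = \lim_{u\to\infty} \mathbb{E}_{\mathbb{Q}}\, e^{-\omega B_u},
\]

provided these limits exist.

Now define the following function:

\[
\kappa(\alpha) := \exp\left( -\int_0^\infty \int_{(0,\infty)} \frac{1}{t}\,(1 - e^{-\alpha x})\, \mathbb{P}(X_t \in dx)\, dt \right),
\]

and $\kappa_{\mathbb{Q}}(\alpha)$ its counterpart under $\mathbb{Q}$. The second factorization identity, due to [82] (see also [68, Exercise 6.7]), entails that under the stated assumptions

\[
\int_0^\infty e^{-\beta x}\, \mathbb{E}\left( e^{-\gamma (X_{\sigma(x)} - x)}\, 1\{\sigma(x) < \infty\} \right) dx = \frac{1}{\beta - \gamma} \left( 1 - \frac{\kappa(\beta)}{\kappa(\gamma)} \right). \tag{4.1}
\]

Realize that $\mathbb{E}_{\mathbb{Q}} X_1 = \zeta'(\omega) > 0$, and hence σ(u) < ∞ almost surely for all u > 0 under $\mathbb{Q}$. Thus, relying on (4.1), we have that

\[
\lim_{u\to\infty} \mathbb{E}_{\mathbb{Q}}\, e^{-\omega B_u} = \lim_{\beta \downarrow 0} \frac{\beta}{\beta - \omega} \left( 1 - \frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(\omega)} \right). \tag{4.2}
\]

From

\[
\kappa_{\mathbb{Q}}(\beta) = \exp\left( -\int_0^\infty \int_{(0,\infty)} \frac{1}{t}\,(1 - e^{-\beta x})\, e^{\omega x}\, \mathbb{P}(X_t \in dx)\, dt \right)
\]
\[
= \exp\left( -\int_0^\infty \int_{(0,\infty)} \frac{1}{t} \left( (1 - e^{-(\beta - \omega)x}) - (1 - e^{\omega x}) \right) \mathbb{P}(X_t \in dx)\, dt \right)
= \frac{\kappa(\beta - \omega)}{\kappa(-\omega)},
\]

it is immediate that

\[
\frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(\omega)} = \frac{\kappa(\beta - \omega)}{\kappa(0)}.
\]

As is argued in [68, p. 188], $\varrho(\beta) := 1/\kappa(\beta) \to 0$ as β ↓ −ω. It follows that (4.2) equals

\[
\frac{1}{\omega \kappa(0)} \lim_{\beta \downarrow 0} \frac{\beta}{\varrho(\beta - \omega)} = C := \frac{1}{\omega \kappa(0)}\, \frac{1}{\varrho'(-\omega)}.
\]

The final result is given in the theorem below.

Theorem 3. As u → ∞,

\[
\mathbb{P}(Q > u)\, e^{\omega u} \to C. \tag{4.3}
\]

The right hand side of (4.3) is understood as 0 when $\varrho'(-\omega) = \infty$. It is also remarked that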

the assumption that 0 be regular for (0,∞) rules out that (Xt)t≥0 is a Poisson process whose

Lévy measure is lattice. It is easily seen what happens if this condition is lifted; one can then

just consider the process at (or, more precisely, immediately after) the jump epochs of the

Poisson process, thus reducing the problem to that of the supremum attained by a discrete-

time random walk [41, p. 393]. For more reflections on the assumptions imposed, see [22,

Remark 1] and [68, p. 186].


4.3 Importance sampling

The result of the previous section suggests approximating $P(Q > u)$ by $C e^{-\omega u}$. There are, however, two practical objections to this procedure.

(i) In the first place, Thm. 3 does not provide us with any error bounds, and in light of this we do not have any insight into the error made for a specific value of u.

(ii) In the second place, this approximation requires us to evaluate the constant C, which can be problematic. In case the jumps are one-sided, κ(·) can be explicitly expressed in terms of ζ(·) [68, Section 6.5], and hence C can be computed, but this is not possible when there are jumps in both directions. A next idea could be to evaluate κ(·) numerically. Realize, however, that in many cases the Lévy process under study is given in terms of the Lévy exponent ζ(·) only; no explicit expression for P(Xt ∈ dx) is available, and therefore evaluation of κ(0) and $\varrho'(-\omega)$ (with $\varrho = 1/\kappa$) is not straightforward.

In this section we describe how such complications can be remedied, relying on a rare-event

simulation technique; cf. [12, Section XII.6a].

It is known that estimation of small probabilities by (naïve) simulation is time consuming. Suppose our objective is to obtain an estimate for which, at a given confidence level, the half-width of the confidence interval is smaller than a given fraction (for instance 10%) of the estimate itself. Then the number of independent runs needed is roughly inversely proportional to the probability of interest [12, Section VI.1]. This problem can be solved by simulating under a measure different from the original one, a technique usually referred to as importance sampling [12, Section V.1]; by weighting the simulation output by appropriately defined likelihood ratios, an unbiased estimator is obtained, while the corresponding variance may be reduced.

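The starting point of the procedure is the representation $\mathbb{P}(Q > u) = \mathbb{E}_{\mathbb{Q}}\, e^{-\omega X_{\sigma(u)}}$. The idea is to simulate in every run the Lévy process under $\mathbb{Q}$ until u has been exceeded (which happens with probability 1), to record the value of $X_{\sigma(u)}$, and to compute $e^{-\omega X_{\sigma(u)}}$. In self-evident notation, the estimator based on N independent runs becomes

\[
\alpha_N(u) := \frac{1}{N} \sum_{n=1}^{N} e^{-\omega X^{(n)}_{\sigma(u)}}.
\]

To analyze the performance of this estimator, realize that

\[
{\rm Var}\, \alpha_N(u) = \frac{1}{N} \left( \mathbb{E}_{\mathbb{Q}}\, e^{-2\omega X_{\sigma(u)}} - \left( \mathbb{E}_{\mathbb{Q}}\, e^{-\omega X_{\sigma(u)}} \right)^2 \right).
\]

Using the methodology developed in the previous section, we obtain that

\[
\lim_{u\to\infty} \mathbb{E}_{\mathbb{Q}}\, e^{-2\omega X_{\sigma(u)}}\, e^{2\omega u} = \lim_{\beta \downarrow 0} \frac{\beta}{\beta - 2\omega} \left( 1 - \frac{\kappa_{\mathbb{Q}}(\beta)}{\kappa_{\mathbb{Q}}(2\omega)} \right) = \frac{1}{2\omega \kappa(\omega)}\, \frac{1}{\varrho'(-\omega)}.
\]

As a consequence,

\[
\lim_{u\to\infty} \left( {\rm Var}\, \alpha_N(u) \right) e^{2\omega u} = \frac{\bar C}{N}, \quad \text{where} \quad \bar C := \frac{1}{2\omega \kappa(\omega)}\, \frac{1}{\varrho'(-\omega)} - \left( \frac{1}{\omega \kappa(0)}\, \frac{1}{\varrho'(-\omega)} \right)^2.
\]

Suppose that runs are sampled until the ratio of the confidence interval half-width and the estimator drops below ε; we use ‘confidence value’ t (e.g., for a 95% confidence interval the value of t is roughly 1.96). As a consequence, for u large, the minimum value of $N_u$ should fulfil

\[
t \sqrt{\frac{\bar C e^{-2\omega u}}{N_u}} \le \varepsilon\, C e^{-\omega u}.
\]

From the above the following claim follows.

Corollary 1. As u → ∞,

\[
N_u \to \frac{t^2 \bar C}{\varepsilon^2 C^2}.
\]

In other words, the number of runs needed is hardly affected by the value of the exceedance level u. It thus follows that the proposed procedure has bounded relative error [12, Eqn. VI.(1.2)].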

Example 1. In this case $(X_t)_{t\ge 0}$ corresponds to the superposition of a negative drift and a compound Poisson process with standard Normal jumps. Notice that the resulting Lévy process is spectrally two-sided. We have, for some d, λ > 0,

\[
\zeta(\vartheta) = -d\vartheta + \lambda\left( e^{\vartheta^2/2} - 1 \right),
\]

with $\mathbb{E} X_1 = \zeta'(0) = -d < 0$. Let ω solve ζ(ω) = 0; it can be checked that the ω-twisted version of the Lévy process corresponds with the same drift −d and a compound Poisson process with arrival rate $\lambda e^{\omega^2/2} = \lambda + d\omega$ and jumps that are Normally distributed with mean ω and variance 1. In Table 4.1 we display, for various values of u, estimates of our target probability P(Q > u), as obtained by direct simulation (that is, under the original measure P; denoted by $p_N(u)$) and by importance sampling (denoted by $\alpha_N(u)$). We have continued simulating until the ratio of the confidence interval half-width and the estimator drops below ε = 1% (taking t = 1.96); $N_u$ is the number of runs needed. We also present the CPU time needed.

Naïve Simulation

Level u   pN(u)      Nu          CPU time (sec.)
0.5       0.581416   110,630     97
1         0.468690   174,196     239
2         0.293742   369,464     1,061
5         0.071168   2,005,515   30,596

Importance Sampling Simulation

Level u   αN(u)           Nu      CPU time (sec.)
0.5       0.582281        8,577   0.58
1         0.467459        8,009   0.51
2         0.294591        7,622   0.49
5         0.071485        7,840   0.55
10        6.70 × 10^−3    7,837   0.64
20        5.95 × 10^−5    7,710   0.79
50        4.12 × 10^−11   7,991   1.36
100       2.26 × 10^−21   7,820   2.15

Table 4.1: Simulation results corresponding to Example 1. Parameters related to the Compound Poisson process: d = 0.25 (i.e., drift −0.25) and λ = 1. Decay rate: ω = 0.47260. Precision/confidence: ε = 1%, t = 1.96.
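The importance sampling procedure of Example 1 can be sketched as follows. This is a minimal illustration rather than the implementation used for Table 4.1: the function names, the bisection bracket, the seed and the fixed number of runs are assumptions of this sketch (the experiments in the text use a stopping rule based on the confidence interval instead). Since the drift is negative, level u can only be exceeded at a jump epoch, so it suffices to inspect the process right after each jump.

```python
import math
import random


def levy_exponent(theta, d, lam):
    """zeta(theta) = -d*theta + lam*(exp(theta^2/2) - 1), as in Example 1."""
    return -d * theta + lam * (math.exp(0.5 * theta * theta) - 1.0)


def decay_rate(d, lam, lo=1e-6, hi=5.0, tol=1e-12):
    """Positive root omega of zeta by bisection; assumes zeta(lo) < 0 < zeta(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if levy_exponent(mid, d, lam) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def importance_sampling_estimate(u, d, lam, n_runs, rng):
    """Estimate P(Q > u) by averaging exp(-omega * X_sigma(u)) over runs under Q."""
    omega = decay_rate(d, lam)
    rate_q = lam * math.exp(0.5 * omega * omega)  # jump arrival rate under Q
    total = 0.0
    for _ in range(n_runs):
        x = 0.0
        while x <= u:
            x -= d * rng.expovariate(rate_q)  # drift -d until the next jump
            x += rng.gauss(omega, 1.0)        # twisted jump: Normal(omega, 1)
        total += math.exp(-omega * x)         # likelihood-ratio weight
    return total / n_runs, omega


estimate, omega = importance_sampling_estimate(5.0, 0.25, 1.0, 20_000, random.Random(1))
```

With d = 0.25 and λ = 1 the bisection reproduces the decay rate reported in Table 4.1, and the estimate for u = 5 can be compared with the corresponding table entry.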

Example 2. In this numerical example $(X_t)_{t\ge 0}$ is a Variance Gamma process, or, equivalently, the difference between two Gamma processes (which is spectrally two-sided, too). The Lévy exponent is given by

\[
\zeta(\vartheta) = \beta \log\left( \frac{\alpha_1}{\alpha_1 - \vartheta} \right) + \beta \log\left( \frac{\alpha_2}{\alpha_2 + \vartheta} \right).
\]

It is readily checked that

\[
\mathbb{E} X_1 = \zeta'(0) = \frac{\beta}{\alpha_1} - \frac{\beta}{\alpha_2},
\]

which is assumed to be negative (i.e., α1 > α2 > 0). Again, ω solves ζ(ω) = 0; the ω-twisted process is Variance Gamma as well, but now with Lévy exponent

\[
\beta \log\left( \frac{\alpha_1 - \omega}{\alpha_1 - \vartheta - \omega} \right) + \beta \log\left( \frac{\alpha_2 + \omega}{\alpha_2 + \vartheta + \omega} \right).
\]

In [12, Ch. XII] it is pointed out how a Variance Gamma process can be simulated; we do so by replacing all jumps smaller than ε = 0.05 by a Brownian motion with σε = 0.0345, while the bigger jumps (both upward and downward) now correspond with a Compound Poisson process (with intensities λ+ and λ−, respectively). In Table 4.2 we display, for various values of u, estimates of P(Q > u), as obtained by direct simulation and by importance sampling. Again we continue simulating until a precision of ε = 1% is achieved (with t = 1.96), and record the number of runs $N_u$ needed, as well as the CPU time.

Naïve Simulation

Level u   pN(u)      Nu          CPU time (sec.)
0.1       0.360057   273,112     603
0.2       0.285182   385,165     1,185
0.5       0.155404   835,137     5,626
1         0.063159   2,279,294   40,521

Importance Sampling Simulation

Level u   αN(u)           Nu        CPU time (sec.)
0.1       0.302785        101,312   78.19
0.2       0.263005        99,969    76.07
0.5       0.160835        101,764   79.08
1         0.067980        107,966   89.09
2         0.012298        114,929   101.53
5         7.75 × 10^−5    122,577   116.42
10        1.76 × 10^−8    123,915   121.44
20        9.09 × 10^−16   125,771   129.14
50        1.30 × 10^−37   124,446   139.28
100       4.98 × 10^−74   125,365   162.77

Table 4.2: Simulation results corresponding to Example 2. Parameters of the Variance Gamma process: d = −0.25, β = 1, α1 = 2, and α2 = 1. Parameters related to the approximation of the Variance Gamma process by the sum of a Brownian motion and a Compound Poisson process: ε = 0.05, σε = 0.0345, λ+ = 0.9115, λ− = 1.2339. Decay rate: ω = 1.6770. Precision/confidence: ε = 1%, t = 1.96.

Chapter 5

Energy-Efficient Scheduling in Multi-Core Servers

Previous chapters concentrated on computational issues related to Lévy fluctuation theory.

This chapter discusses an entirely different context: the use of Markov-fluid models to de-

sign service strategies in multi-core servers.

5.1 Introduction

Today, information and communication technology (ICT) accounts for about 2% of global CO2 emissions [26, 51], which is about the same as the emissions of the entire aviation industry [26]. However, ICT emissions are expected to almost double by 2020 [26]. Considering

the ICT energy use more closely, it is envisaged that more than half of it is likely to be due to

data centers. The Green Data Project [53] determined that 37% of data center utility power is

consumed by data storage equipment, 23% by networking equipment and 40% by servers.

Therefore, there has been a great interest in exploring efficient mechanisms for managing

and optimizing CPU energy consumption.

Manufacturers of semiconductor chips, as well as the servers that use them, are already

increasing the computing throughput per watt in such a way that it roughly doubles every

two years. The semiconductor industry is continuing to harness performance gains through

Moore’s Law by developing multi-core processors [54]. The utilization of data centers can

be remarkably low, e.g. 10% [52], due to various reasons including uneven application fit,

risk management, and uncertainty in demand forecasts. Interestingly, servers in data centers

tend to spend most of their time at low utilization; in addition, it should be realized that many

applications can tolerate some delay. All these considerations clearly open up opportunities

for developing strategies to adapt the processor speeds (that is, turning off or slowing down

processors when possible), in order to achieve substantial energy savings.

Energy-aware processors can achieve energy-proportional computing [19] by utilizing speed

scaling, so as to adapt the ‘speed’ of the server CPU to the processing load and the service

performance requirements. Speed scaling is enabled by dynamic voltage/frequency scaling

(DVFS) [45] to decrease the supply voltage and the clock rate. Speed scaling designs can be

highly sophisticated (adapting the speed at all times to the current state; this is dynamic speed

scaling), or very simple (running at a given static speed except when idle, to balance energy

and performance).

The growing adoption of speed scaling designs has spurred quantitative research into the

topic. The analytic study of the speed scaling problem began with Yao et al. [97] about

two decades ago. Subsequent studies considered three main performance objectives that

balance energy and performance: (i) minimize the total energy used in order to meet per-

formance requirements, e.g., [2, 85]; (ii) optimize performance (e.g., minimize delay) given

an energy/power budget, e.g., [24, 86]; and (iii) optimize a linear combination of expected

performance and energy usage [3, 95, 96].

The study presented in the present chapter is of type (iii); this objective is appropriate in Web settings since typically neither (strict) job completion deadlines nor fixed energy budgets apply. In our work we choose to consider an objective function which is a linear combination of energy usage, queueing cost (reflected by the delay), and speed switching costs;

depending on the specific situation at hand, the switching costs, not incorporated in much of

the prior work, may affect the quantitative analysis. The processor we consider is multi-core,

where each core can in principle be set at a different speed; however, practical implemen-

tation considerations often require that all cores run at equal speed. Our analysis captures

the processor’s dynamic power as well as its static power; the former can be characterized

as frequency-dependent, and being controlled by changing the frequency and the voltage of

the processor using the DVFS technology, whereas the latter is frequency-independent and

not controlled by DVFS. It is noted that the static power is becoming significant as transistors are getting smaller and faster [70]. Our feature-rich objective function, along with the enabling mechanisms that we propose, provides an approach for studying and optimizing the server energy and performance tradeoffs. Importantly, these enabling mechanisms for adjusting the number of active cores and their speeds are based on quantities that can be easily observed (viz. the queue’s buffer contents).

Our main contribution is that we propose a stochastic fluid model for the analysis and opti-

mization of multi-core processing systems. An important distinction between our work and

previous studies lies in the modelling assumptions: whereas previous papers use queues

with Poisson arrivals, our framework is more realistic in that the burstiness of the arrivals

of processing jobs is captured by relying on on-off fluid processes. In our setup we have the

freedom to choose appropriate distributions for the corresponding on and off periods; we

can pick exponential distributions (with a coefficient of variation (CoV) equaling 1), or dis-

tributions that are more regular (CoV < 1) or more variable (CoV > 1) than the exponential

distribution. The arrival process can be easily extended beyond on-off processes; then the

fluid rate can take any one of multiple (more than two) values according to a modulating

process. The fluid model is appropriate when the processors are fast and job arrivals are

bursty and of high rate. Stochastic fluid models have been successfully used in prior studies

of data packet queueing and transmission over communication links [4, 39, 74, 89], as they preserve the structural properties of the modeled system, while their solution remains relatively simple. As a result, the framework allows us to analyze a broad range of strategies for

adapting the multi-core server speeds, all of which attempt to optimize objective functions

which balance energy consumption and performance. The strategies studied differ in the

number of buffer threshold levels used, but also in the dependence of the processing speed

on the specific threshold that is crossed (as well as the direction in which it is crossed, i.e., if

the policy is hysteretic).

The more specific contributions of our work are the following. We first discuss several

schemes that are intended to reduce energy consumption in multi-core processors (i.e., processors with multiple CPUs) or multiple servers. Essentially there are two classes of strategies: those in which the service rate depends just on the current buffer content in relation

to the values of a set of thresholds, and those in which it also matters in what direction the

thresholds are crossed (hysteretic control). We propose to rely on stochastic fluid theory to

evaluate the performance of the underlying queueing systems. Then we introduce a family of cost functions that incorporates a tunable trade-off between the power consumption of the servers and the quality-of-service. The power consumption cost has three components: (i) static power, (ii) dynamic power, which is a function of the processing speed of each server,



and (iii) power consumption due to switching between processing speed levels [25, 43]. The

quality-of-service is a function of the queueing delay of processing jobs waiting for service.

In order to implement the service strategies the system design parameters (service rates,

buffer thresholds) should be chosen so as to minimize the cost function; to this end we ap-

ply a conjugate gradient method. We then present the results of our numerical experiments,

in which we compare the various service strategies. Importantly, we quantify the benefits

(in terms of our cost function) of more versatile service strategies — this enables us to eval-

uate the benefit of ‘richer’ strategies (e.g., with many thresholds) over ‘simpler’ strategies

(e.g., a static policy, or the ‘sleep mode strategy’, which reduces the service rate only when

the queue is empty). Evidently, strategies which use more threshold levels are more effi-

cient in terms of power consumption; however, for a reasonable switching overhead of the

processing speed, the efficiency gain quickly diminishes beyond a few thresholds.

A seeming drawback of the service strategies proposed is that the optimal tuning of the pa-

rameters requires knowledge of the statistical properties of the system’s input traffic. While

it is obviously formally true that perturbations of the input model lead to non-optimal pa-

rameter settings, extensive numerical experiments reveal that this sensitivity is remarkably

weak. Even relatively significant changes in the input (both in terms of the duration of the

source’s active periods, and the distributional properties of those active periods), lead to

very modest (percentagewise) changes in the cost function. This robustness property makes

the use of the proposed energy efficient strategies highly attractive.

This chapter is organized as follows. In Section 5.2 we describe the stochastic fluid models

used, and the underlying cost function. Section 5.3 then describes and motivates the various service strategies that fall within the framework of our stochastic fluid queues, shows how to numerically optimize the cost function, and presents the main findings. The robustness

analysis and related insights can be found in Section 5.4, and conclusions are presented in

Section 5.5.

5.2 Energy Cost Function and Models

In this section we give a detailed description of the energy cost function that we use in this

chapter. Then we explain two stochastic fluid queueing models that can be used to model

various service strategies (which will be further analyzed in Section 5.3).



5.2.1 System Cost Function

The system cost function which we seek to optimize reflects a balance between power usage

and processing performance. We assume that the power consumption of each server consists

of three components, viz. a static, dynamic, and switching component. The static component

is essentially determined by the technology used [29, 93], that is, independent of the actual

service rate used in the underlying queueing system. The dynamic power component is

approximated, following various studies [91, 97], by $C s^{\alpha}$ for some constants $\alpha \ge 2$ and

C > 0, where s is the service rate used at that moment. In our work we choose α = 3,

and we normalize all costs such that C ≡ 1 (and the value of the static component resulting

from this normalization is γ). It is further assumed that there are n ∈ N identical processors

(CPU cores) operating, which effectively means that the static energy consumption per time

unit is γn, independent of their service rates; if the number of active processors is a random

variable, this component is γ En. As mentioned, the third component is the switching energy

cost, which reflects the cost incurred (that is, energy consumed) when changing the value

of the service rate. We set this component at β ES, where the random variable S is the

corresponding switching rate (per unit time), and β > 0 is a constant (normalized as above)

[61]; cf. the framework used in [47].

The system cost function also includes a measure of job buffering (queueing for processing)

cost, which reflects the degree of quality-of-service provided. We assume that this component depends linearly on the average amount of data $W$ stored in the buffer: $\delta\,\mathbb{E}W = \delta \int x\, \mathbb{P}(W \in \mathrm{d}x)$, where $\delta > 0$ is the buffering cost per data unit per unit of time. Summarizing, the cost function ($E$) we consider in this chapter is given by

\[
E = \gamma\,\mathbb{E}n + \sum_{i=1}^{n} \mathbb{E}_i[s^3] + \beta \sum_{i=1}^{n} \mathbb{E}_i S + \delta\,\mathbb{E}W; \tag{5.1}
\]

here $\mathbb{E}_i[s^3]$ is the third moment of the service rate of the $i$-th server, and $\mathbb{E}_i S$ the corresponding mean switching rate.

The choice of a cubic form for the power used when running at speed $s$ is in line with much of the prior literature (with a couple of notable exceptions, see e.g. [16]). The reason is that the dynamic power of CMOS (Complementary Metal Oxide Semiconductor) circuits is proportional to $V^2 f$, with $V$ denoting the supply voltage and $f$ the clock frequency; see for instance [59]. Operating at a higher frequency requires dynamic voltage scaling to a higher voltage, nominally with $V$ being roughly proportional to $f$. In this way we obtain the cubic relationship. At the methodological level, any other functional form can be chosen; the resulting setup is evidently computationally equivalent. For a further validation of the form of this cost component, we refer to e.g. [95, Section II].
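The scaling argument above can be written out in one line. The proportionality constants $\kappa$ and $c$ are illustrative symbols that do not appear in the original text, and identifying the service rate $s$ with the clock frequency $f$ is the modelling assumption made above.

```latex
\[
P_{\mathrm{dyn}} = \kappa\, V^{2} f, \qquad V \approx c\, f
\quad\Longrightarrow\quad
P_{\mathrm{dyn}} \approx \kappa\, c^{2} f^{3} \;\propto\; s^{3}.
\]
```

Under these assumptions the dynamic term has the form $C s^{\alpha}$ with $\alpha = 3$ and $C = \kappa c^{2}$.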

Several recent papers, see e.g. [81, 80], argue that the switching cost can be a relevant com-

ponent in the cost function, justifying including this in our model. While the extent of the

overhead may be debatable, its inclusion makes our model richer and thus of more potential

use.

We finish this subsection with a couple of remarks regarding the choice of the cost function,

and, more specifically, the selection of the scalars β, γ, δ.

(i) The approach we follow (that is, considering an objective function consisting of vari-

ous sorts of ‘cost components’) has been intensively used in the leading literature on

power-aware speed scaling, see in particular the recent papers by Wierman et al. such

as [3, 95]. In those works the performance metric used is

\[
\mathbb{E}[T] + \frac{\mathbb{E}[E]}{\beta'},
\]

where $T$ is the response time of a job, $E$ is the energy expended on that job, whereas the parameter $\beta'$ represents the relative cost of delay.

Our objective function is of the same type, but includes other relevant ‘cost types’, and

is ‘system oriented’ rather than ‘job oriented’.

Objective functions of this type can also be regarded as disutility curves, as they reflect

the burden due to various ‘undesirable effects’ (many processors to be allocated, a sys-

tematically high value of the processor speed s, frequent switching, and performance

degradation).

(ii) It should be borne in mind that the specific choice of the scalars β, γ, δ is a provider-

specific issue, and to a large extent motivated by commercial and technological con-

siderations. Indeed, the parameters reflect the ‘disutility’ associated with the number

of servers active, the (statistical distribution of the) processing rate used, the switching

rate, and the quality-of-service delivered (in terms of buffer content); as a consequence,

the specific values of β, γ, δ critically depend on the policy of the provider. Evidently,

when for instance the provider wishes to offer a strict quality level in the Service Level

Agreement (as agreed upon with the customers), the parameter δ should be chosen



relatively high, but it is up to the provider what specific value is selected. As a result,

the selection of appropriate values for β, γ, δ has a subjective component, and it is not

the objective of this chapter to provide guidelines on how to pick them.

(iii) As indicated in the introduction, our study also covers the sensitivity of the optimal

solution with respect to the various parameters. In many of the experiments reported

on, we vary the scalars β, γ, δ, and study the impact on the objective function, but also,

quite importantly, the values of the optimal parameters (buffer thresholds, processor

speeds, etc.). In other words, we do not wish to study the system under one specific

set of parameters, but rather provide insight into the effect of varying these parameters.

5.2.2 Fluid Models

In this chapter we analyze various service strategies relying on the theory of stochastic fluid

queues. In this subsection we review the essentials of this theory, with a focus on the specific

strategies that we evaluate. It is noted that fluid models keep track of the system’s workload,

not of the number of active jobs.

We consider an infinite-buffer queue fed by a single traffic source, served by one server or a few parallel servers. A simple but common model for the queue’s input process is a so-called on-off source, alternating between being active (generating traffic at a constant rate, say, $r > 0$) and being silent; the on- and off-times are assumed exponentially distributed with mean values $\mu^{-1}$ and $\lambda^{-1}$, respectively. The assumptions of a single source, the source being of the on-off type, and an infinite buffer have been imposed for simplicity. The analysis can be extended in a straightforward fashion to finite buffers, more advanced source models, and multiple sources; actually, in Section 5.4 we depart from the exponentiality assumption to quantify the impact of non-exponential on-times. Importantly, in our work the service rate of the queue may depend on the current buffer level (referred to as ‘threshold models’ below), or on the direction in which thresholds are crossed (referred to as ‘hysteretic models’ below).

As our cost function depends on steady-state quantities only, we now point out how to

evaluate the queue’s equilibrium distribution. We enumerate the states of the on-off source

as $\{-,+\}$, and denote by $F_i(x)$ the equilibrium probability that the source is in state $i \in \{-,+\}$ while at the same time the buffer content does not exceed level $x$; let $F_i(x,t)$ denote its transient counterpart (corresponding to time $t \ge 0$). Let $D$ denote a diagonal matrix such that $d_i := D_{ii}$ is the difference between the input rate and the service rate when the source is



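in state $i \in \{-,+\}$. It is easily verified that, for $\Delta t$ small, up to $o(\Delta t)$-terms,

\[
F_+(x,t) = F_+(x - d_+\Delta t,\, t-\Delta t)\,(1-\mu\Delta t) + \lambda\Delta t\, F_-(x - d_-\Delta t,\, t-\Delta t),
\]

and

\[
F_-(x,t) = F_-(x - d_-\Delta t,\, t-\Delta t)\,(1-\lambda\Delta t) + \mu\Delta t\, F_+(x - d_+\Delta t,\, t-\Delta t),
\]

as pointed out in greater detail in [4]; recall that the source leaves the on-state at rate $\mu$ and the off-state at rate $\lambda$. Now take $F_+(x - d_+\Delta t,\, t-\Delta t)$ to the left-hand side in the first of these equations, and $F_-(x - d_-\Delta t,\, t-\Delta t)$ in the second. Then divide the equations by $\Delta t$, and let $\Delta t \downarrow 0$, to obtain

\[
d_+\frac{\partial F_+}{\partial x} + \frac{\partial F_+}{\partial t} = \lambda F_-(x,t) - \mu F_+(x,t),
\]

and

\[
d_-\frac{\partial F_-}{\partial x} + \frac{\partial F_-}{\partial t} = \mu F_+(x,t) - \lambda F_-(x,t).
\]

Letting $t \to \infty$, we thus obtain that the equilibrium probabilities satisfy the system of differential equations (following the convention that vectors are denoted by bold symbols)

\[
D\,\frac{\mathrm{d}}{\mathrm{d}x}\boldsymbol{F}(x) = Q^{\top} \boldsymbol{F}(x), \tag{5.2}
\]

where $\boldsymbol{F}(x) = (F_-(x), F_+(x))^{\top}$ and $Q$ is the generator matrix of the source, with $q_{-+} = \lambda$ and $q_{+-} = \mu$. This system can be solved taking into account appropriate boundary conditions and a normalization. Obviously, to obtain a proper limiting distribution, we should have that the stability condition $\pi_+ d_+ + \pi_- d_- < 0$ is fulfilled, with $(\pi_-, \pi_+) = (\mu, \lambda)/(\lambda+\mu)$ the stationary distribution of the source: the long-term drift should be negative.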
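For a constant service rate $s$ the stability condition can be checked numerically. The following is a minimal sketch, assuming $d_- = -s$ and $d_+ = r - s$; the function names are hypothetical and not taken from the thesis. It also evaluates the non-zero eigenvalue of the $2 \times 2$ matrix $D^{-1}Q^{\top}$, which has the same sign as the drift.

```python
def stationary_source(lam, mu):
    """Stationary probabilities (pi_off, pi_on) of the on-off source.

    Off-periods end at rate lam (mean 1/lam), on-periods at rate mu (mean 1/mu).
    """
    total = lam + mu
    return mu / total, lam / total


def mean_drift(lam, mu, r, s):
    """Long-run drift pi_- d_- + pi_+ d_+ with d_- = -s and d_+ = r - s."""
    pi_off, pi_on = stationary_source(lam, mu)
    return pi_off * (-s) + pi_on * (r - s)


def nonzero_eigenvalue(lam, mu, r, s):
    """Non-zero eigenvalue of D^{-1} Q^T for D = diag(-s, r - s).

    The matrix is singular because Q has zero row sums, so its eigenvalues
    are 0 and its trace, lam / s - mu / (r - s).
    """
    return lam / s - mu / (r - s)


def s_min(lam, mu, r):
    """Service rate at which the drift is exactly zero (stability boundary)."""
    return lam * r / (lam + mu)
```

With $\lambda = 1$, $\mu = 0.5$ and $r = 1$, for instance, any $s$ above `s_min(1.0, 0.5, 1.0)` (that is, above $2/3$) gives a negative drift and a negative non-zero eigenvalue.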

Importantly, the system of differential equations (5.2) still applies when Q and D depend on

the current value of the buffer in a piecewise constant way [39, 75], albeit with additional

boundary and continuity conditions. In the case that Q or D are continuous functions of the

buffer content [89], (5.2) holds, but not for the distribution function F (x), but rather for the

density f(x). It is noted that in this situation the stability condition cannot be immediately

expressed in terms of the model primitives; the stability conditions for the various model

variants can be found in [39, 75, 89].

We now consider two specific ways in which the service reacts to the buffer content process:

a model in which the current buffer level determines the service rate, and a model in which

it also matters in what direction certain thresholds are crossed.



Many Thresholds Model

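In this model there are threshold levels $0 = B_0 < B_1 < B_2 < \cdots < B_N$. The queue is served at rate $s_i$ when the buffer content is between $B_i$ and $B_{i+1}$. Here it is assumed that $s_0 < s_1 < s_2 < \cdots < s_N < r$ (where $s_N < r$ ensures that the system is non-trivial, as otherwise the system would remain empty all the time). The buffer content distribution can be evaluated as pointed out in e.g. [89]; the probability density is given by

\[
\begin{pmatrix} f_-(x) \\ f_+(x) \end{pmatrix}
= \lambda D_0\, e^{-g(x)}
\begin{pmatrix} 1/s(x) \\ 1/(r - s(x)) \end{pmatrix},
\]

where $g(x)$ is a continuous function of $x$ (whose specific form we leave out here), $D_0$ is the (stationary) probability that the buffer is empty, and $s(x)$ is the service rate when the buffer content equals $x$. In our specific model, $s(x)$ is constant and equal to $s_i$ for $B_i \le x \le B_{i+1}$ where $i = 0, \dots, N$ (with the convention $B_{N+1} = \infty$); then we also have $g(x) = \nu_i + \eta_i x$, where

\[
\eta_i = \frac{\mu}{r - s_i} - \frac{\lambda}{s_i}, \qquad \nu_i = \nu_{i-1} + B_i(\eta_{i-1} - \eta_i),
\]

and $\nu_0 = 0$. The probability of an empty buffer follows from

\[
\begin{aligned}
D_0^{-1} &= 1 + \lambda \int_0^{\infty} e^{-g(x)} \left( \frac{1}{s(x)} + \frac{1}{r - s(x)} \right) \mathrm{d}x \\
&= 1 + \lambda \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right) + \lambda\, \frac{\xi_N}{\eta_N}\, e^{-\nu_N - \eta_N B_N},
\end{aligned}
\]

where $\xi_i = s_i^{-1} + (r - s_i)^{-1}$. If there is only one server ($n = 1$, that is), the third moment of the service rate, the mean workload, and the mean switching rate are given by

\[
\begin{aligned}
\mathbb{E}[s^3] &= D_0 s_0^3 + \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, s_i^3\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right) + \lambda D_0\, \frac{\xi_N}{\eta_N}\, s_N^3\, e^{-\nu_N} e^{-\eta_N B_N}; \\
\mathbb{E}W &= \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i}\, e^{-\nu_i} \left( B_i e^{-\eta_i B_i} - B_{i+1} e^{-\eta_i B_{i+1}} \right) \\
&\quad + \lambda D_0 \sum_{i=0}^{N-1} \frac{\xi_i}{\eta_i^2}\, e^{-\nu_i} \left( e^{-\eta_i B_i} - e^{-\eta_i B_{i+1}} \right) + \lambda D_0\, \frac{\xi_N}{\eta_N^2}\, e^{-\nu_N} e^{-\eta_N B_N} (1 + \eta_N B_N); \\
\mathbb{E}S &= \lambda D_0 \sum_{i=1}^{N} \left( e^{-\nu_{i-1}} e^{-\eta_{i-1} B_i} + e^{-\nu_i} e^{-\eta_i B_i} \right).
\end{aligned}
\]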
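The closed-form expressions above translate directly into code. The sketch below is one possible implementation under stated assumptions: the function name and argument layout are hypothetical, every $\eta_i$ is assumed non-zero, and $\eta_N > 0$ is required for stability.

```python
import math


def many_threshold_metrics(lam, mu, r, thresholds, rates):
    """Evaluate D_0, E[s^3], EW and ES from the closed-form expressions.

    thresholds: [B_1, ..., B_N] (B_0 = 0 is implied), increasing.
    rates:      [s_0, ..., s_N] with 0 < s_i < r.
    Assumes eta_i != 0 for all i and eta_N > 0.
    """
    B = [0.0] + list(thresholds)
    N = len(B) - 1
    if len(rates) != N + 1:
        raise ValueError("need exactly one more rate than thresholds")
    eta = [mu / (r - s) - lam / s for s in rates]
    xi = [1.0 / s + 1.0 / (r - s) for s in rates]
    if eta[N] <= 0:
        raise ValueError("stability requires eta_N > 0")
    nu = [0.0]
    for i in range(1, N + 1):
        nu.append(nu[i - 1] + B[i] * (eta[i - 1] - eta[i]))

    mass = 0.0   # integral of xi(x) e^{-g(x)} over x > 0
    cube = 0.0   # the same integral weighted by s(x)^3
    first = 0.0  # the same integral weighted by x
    for i in range(N + 1):
        lo = B[i]
        e_lo = math.exp(-nu[i] - eta[i] * lo)
        if i < N:
            hi = B[i + 1]
            e_hi = math.exp(-nu[i] - eta[i] * hi)
            m = xi[i] / eta[i] * (e_lo - e_hi)
            x1 = (xi[i] / eta[i] * (lo * e_lo - hi * e_hi)
                  + xi[i] / eta[i] ** 2 * (e_lo - e_hi))
        else:  # last interval extends to infinity
            m = xi[i] / eta[i] * e_lo
            x1 = xi[i] / eta[i] ** 2 * e_lo * (1.0 + eta[i] * lo)
        mass += m
        cube += rates[i] ** 3 * m
        first += x1

    d0 = 1.0 / (1.0 + lam * mass)
    es3 = d0 * rates[0] ** 3 + lam * d0 * cube
    ew = lam * d0 * first
    es = lam * d0 * sum(
        math.exp(-nu[i - 1] - eta[i - 1] * B[i]) + math.exp(-nu[i] - eta[i] * B[i])
        for i in range(1, N + 1)
    )
    return d0, es3, ew, es
```

As a sanity check, calling the function without thresholds and with a single rate $s$ gives $\mathbb{E}[s^3] = s^3$ and $D_0 = 1 - \lambda r / ((\lambda + \mu) s)$, the familiar expression for a fluid queue with constant service rate.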



The case of multiple servers should be treated differently, see Section 5.2.3.

Hysteretic Model

In this model there are two threshold buffer levels; its dynamics are as in [74], and can be described as follows. The queue is served with the lower service rate $s_-$ if the buffer content is under the higher threshold $B_h$, until this threshold has been hit from below. From that moment it is served with the higher service rate $s_+$, until the lower threshold level $B_\ell$ is hit from above. Then the service rate is changed into $s_-$ again, and the procedure repeats. The stationary buffer content distribution can be computed by using the techniques from [39, 74]. $\boldsymbol{F}_\pm(x) = \mathbb{P}(W < x)$ denotes the buffer content distribution function, where the subscripts $+$ and $-$ refer to the cases that the queue is served at rate $s_+$ and $s_-$, respectively; its first (second) component corresponds to the source being off (on).

Let the diagonal matrix $D_\pm$ be given by $\mathrm{diag}(-s_\pm,\, r - s_\pm)$. Under $B_\ell$ the queue is always served with rate $s_-$, and beyond $B_h$ with rate $s_+$. As a result, $\boldsymbol{F}_+(x) = \boldsymbol{0}$ for $0 \le x \le B_\ell$ and $\boldsymbol{F}_-(x) = \boldsymbol{F}_-(B_h)$ for $x \ge B_h$. If the buffer is empty, it cannot remain empty as long as the source is on, so the second component of $\boldsymbol{F}_-(0)$ equals 0. For obvious reasons we restrict ourselves to the case that $r - s_\pm$ is larger than zero; as a consequence the distribution function can have a jump only at 0 (and $\boldsymbol{F}_\pm(\cdot)$ is continuous elsewhere). In addition there is the obvious normalization equation (with $F_{--}(\cdot)$ the distribution function in the time that $s_-$ is active and the source is off, etc.)

\[
F_{--}(\infty) + F_{-+}(\infty) + F_{+-}(\infty) + F_{++}(\infty) = 1.
\]

It requires some elementary calculus to obtain that the eigenvalues of $D_\pm^{-1} Q^{\top}$ (with $Q^{\top}$ the transpose of the generator matrix $Q$) are $\{0, -\eta_\pm\}$ and the corresponding eigenvectors are

\[
\boldsymbol{u}_0 = \begin{pmatrix} \mu/\lambda \\ 1 \end{pmatrix}, \qquad
\boldsymbol{u}_\pm = \begin{pmatrix} (r - s_\pm)/s_\pm \\ 1 \end{pmatrix},
\]

where $\eta_\pm = \mu (r - s_\pm)^{-1} - \lambda\, s_\pm^{-1}$. To ensure that our queueing system is stable, $\eta_+$ must be positive; hence, $s_+$ should be larger than $s_{\min} = \lambda r/(\mu + \lambda)$. Using the findings of [74], the



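distribution function of the buffer content is:

\[
\begin{aligned}
0 \le x \le B_\ell: \quad & \boldsymbol{F}_-(x) = a_1 \left( \boldsymbol{u}_0 - \boldsymbol{u}_- e^{-\eta_- x} \right), \\
& \boldsymbol{F}_+(x) = \boldsymbol{0}; \\
B_\ell \le x \le B_h: \quad & \boldsymbol{F}_-(x) = a_2 \boldsymbol{u}_0\, x + a_3 \boldsymbol{u}_- e^{-\eta_- x} + \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \\
& \boldsymbol{F}_+(x) = a_4 \boldsymbol{u}_0\, x + a_5 \boldsymbol{u}_+ e^{-\eta_+ x} + \begin{pmatrix} v_3 \\ v_4 \end{pmatrix}; \\
B_h \le x < \infty: \quad & \boldsymbol{F}_-(x) = \boldsymbol{F}_-(B_h), \\
& \boldsymbol{F}_+(x) = a_6 \boldsymbol{u}_+ e^{-\eta_+ x} + \begin{pmatrix} v_5 \\ v_6 \end{pmatrix};
\end{aligned}
\]

here the $a_i$ and $v_i$ (with $i \in \{1, \dots, 6\}$) are constants following from boundary conditions, continuity conditions, substitution, and additional conditions [74, 75]. The number of times the service rate switches per unit time is given explicitly in [74]:

\[
\mathbb{E}S = f_{--}(B_h) \cdot (r - s_-) + f_{++}(B_\ell) \cdot s_+,
\]

with $f_{--}(\cdot)$ the density of the buffer content in the time that $s_-$ is active and the source is off, and $f_{++}(\cdot)$ defined likewise.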
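The switching rule of the hysteretic model can be sketched as a two-state controller. The code below is an illustration only; the function names and the sampled-trajectory interface are assumptions, not part of the analysis in [74]. It shows why separating the two thresholds reduces the number of rate changes along a given buffer trajectory.

```python
def hysteretic_rate(buffer_level, high_mode, b_low, b_high, s_minus, s_plus):
    """Apply the hysteretic rule once.

    high_mode is True while the queue is served at s_plus. Returns the
    updated mode and the service rate to use at this buffer level.
    """
    if not high_mode and buffer_level >= b_high:
        high_mode = True        # upper threshold hit from below
    elif high_mode and buffer_level <= b_low:
        high_mode = False       # lower threshold hit from above
    return high_mode, (s_plus if high_mode else s_minus)


def count_switches(levels, b_low, b_high, s_minus, s_plus):
    """Count rate changes along a sampled buffer trajectory (start in low mode)."""
    mode, switches = False, 0
    for x in levels:
        new_mode, _ = hysteretic_rate(x, mode, b_low, b_high, s_minus, s_plus)
        switches += int(new_mode != mode)
        mode = new_mode
    return switches
```

Setting `b_low == b_high` reduces the rule to a single threshold, which typically switches more often on a trajectory that oscillates around that level.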

5.2.3 Multiple Server Models

We now focus on the situation of multiple servers. Three cases are distinguished; each case

has a specific cost function.

Case 1

Under the first threshold $B_1$ the queue is served by only one server which works at a constant rate $s_0$. Above this threshold the second server starts working at a constant rate $s_1$, but such that the first server still works at rate $s_0$. A next server is switched on when the buffer content reaches a next threshold $B_2$, and so on; there are $n - 1$ thresholds (and hence $n$ servers). The cost function reads

\[
\begin{aligned}
E ={}& (s_0^3 + \gamma)\, F(B_1) + (s_0^3 + s_1^3 + 2\gamma)\,(F(B_2) - F(B_1)) \\
&+ \cdots + \left( \sum_{i=0}^{n-1} s_i^3 + n\gamma \right) (1 - F(B_{n-1})) + \beta\, \mathbb{E}S + \delta\, \mathbb{E}W.
\end{aligned}
\]



Case 2

Below the threshold $B_1$ one server works at rate $s_0$, while beyond $B_1$ a second server is switched on and the first server adjusts its rate so that both work at rate $s_1/2$. In general, when the buffer content hits $B_i$, server $i + 1$ is switched on, and all servers adjust their rates to $s_i/(i + 1)$. As a consequence,

\[
\begin{aligned}
E ={}& (s_0^3 + \gamma)\, F(B_1) + 2\left( (s_1/2)^3 + \gamma \right) (F(B_2) - F(B_1)) \\
&+ \cdots + n \left( (s_{n-1}/n)^3 + \gamma \right) (1 - F(B_{n-1})) + \beta\, \mathbb{E}S + \delta\, \mathbb{E}W.
\end{aligned}
\]
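The static and dynamic parts of the Case 1 and Case 2 cost functions differ only in how the rates are combined per buffer region. The sketch below makes this explicit; the function names are hypothetical, and the values $F(B_1), \dots, F(B_{n-1})$ are taken as given inputs (in practice they follow from the fluid model). The terms $\beta\,\mathbb{E}S + \delta\,\mathbb{E}W$ are left out and would be added separately.

```python
def power_case1(cdf_at_thresholds, rates, gamma):
    """Static plus dynamic power for Case 1.

    cdf_at_thresholds: [F(B_1), ..., F(B_{n-1})].
    rates: [s_0, ..., s_{n-1}], where s_k is the fixed rate of server k + 1.
    """
    edges = [0.0] + list(cdf_at_thresholds) + [1.0]
    total = 0.0
    for k in range(len(rates)):          # k + 1 servers are active
        prob = edges[k + 1] - edges[k]
        total += (sum(s ** 3 for s in rates[:k + 1]) + (k + 1) * gamma) * prob
    return total


def power_case2(cdf_at_thresholds, rates, gamma):
    """Static plus dynamic power for Case 2: k + 1 servers share total rate s_k."""
    edges = [0.0] + list(cdf_at_thresholds) + [1.0]
    total = 0.0
    for k, s in enumerate(rates):
        prob = edges[k + 1] - edges[k]
        total += (k + 1) * ((s / (k + 1)) ** 3 + gamma) * prob
    return total
```

Note that the two cases lead to different buffer distributions for the same numerical rates, so the same `cdf_at_thresholds` should not be reused when comparing them.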

Case 3

In this case there are two different kinds of thresholds. At some thresholds the servers only change their rates, but all servers work with the same rate; these thresholds we denote by $B_i$. At other thresholds, a server is switched on or off; we denote these by $B^{\ell}_{m_j}$. This means that if the buffer content is under $B^{\ell}_{m_1}$, then only one server works, but there are thresholds $\{B_1, B_2, \dots, B_{m_1 - 1}\}$ such that $B_1 < B_2 < \cdots < B^{\ell}_{m_1}$, and at $B_i$ the service rate changes to $s_i$. Beyond $B^{\ell}_{m_1}$ and under $B^{\ell}_{m_2}$ two servers work, both with rate $s_{m_1}/2$, and there are $(m_2 - m_1)$ thresholds at which the servers change their rates. In general, at $B^{\ell}_{m_i}$ the $(i + 1)$-th server starts operating; at $B_j$ such that $B^{\ell}_{m_{i-1}} < B_j < B^{\ell}_{m_i}$, $i$ servers adjust their rates to $s_j/i$. We have the following cost function:

\[
\begin{aligned}
E ={}& (s_0^3 + \gamma)\, F(B_1) + \sum_{i=1}^{m_1} (s_i^3 + \gamma)\,(F(B_{i+1}) - F(B_i)) \\
&+ \sum_{i=m_1+1}^{m_2} 2\left( (s_i/2)^3 + \gamma \right) (F(B_{i+1}) - F(B_i)) + \cdots \\
&+ n\left( (s_{i-1}/n)^3 + \gamma \right) (1 - F(B_{m_n})) + \beta\, \mathbb{E}S + \delta\, \mathbb{E}W.
\end{aligned}
\]

5.3 Optimization of Energy Consumption

So far we described models so as to operate the queue in an energy efficient manner. Clearly,

the value of the cost function depends on the choice of the parameters involved (service rates

and values of the thresholds). In this section, we define a number of strategies that fit in the

framework introduced above, and examine their performance in terms of our cost function.

Before doing so, we first explain the optimization algorithm we used.



5.3.1 Algorithm and Implementation

For finding the optimal service rates and threshold levels we used the Polak-Ribière variant of the conjugate gradient method [84]. The gradient of the cost function, which is needed for

this method, is computed by a standard finite difference method. This algorithm does not

necessarily find the objective function’s global minimum; this problem can be remedied by

starting the conjugate gradient method at different initial values. As a check, we also applied

a simulated annealing based method [84]; after performing extensive tests, it turned out that

the conjugate gradient algorithm provided us with correct results.
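The procedure described above can be sketched as follows. This is an illustrative implementation, not the code used for the experiments: the Armijo backtracking line search, the non-negativity safeguard on the Polak-Ribière coefficient, and all function names and tolerances are assumptions made here for the sake of a runnable example.

```python
def fd_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def backtracking_step(f, x, d, g, step=1.0, shrink=0.5, c=1e-4):
    """Armijo backtracking along direction d; returns the accepted step size."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(g, d))
    while step > 1e-12:
        trial = [xi + step * di for xi, di in zip(x, d)]
        if f(trial) <= fx + c * step * slope:
            return step
        step *= shrink
    return 0.0


def polak_ribiere_cg(f, x0, tol=1e-6, max_iter=2000):
    """Nonlinear conjugate gradient with a safeguarded Polak-Ribiere update."""
    x = list(x0)
    g = fd_gradient(f, x)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        if sum(gi * di for gi, di in zip(g, d)) >= 0.0:
            d = [-gi for gi in g]       # not a descent direction: restart
        t = backtracking_step(f, x, d, g)
        if t == 0.0:
            break
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = fd_gradient(f, x)
        beta = (sum(gn * (gn - go) for gn, go in zip(g_new, g))
                / sum(go * go for go in g))
        beta = max(beta, 0.0)           # safeguard: fall back to steepest descent
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


def multi_start(f, starts):
    """Run the local search from several initial points and keep the best."""
    return min((polak_ribiere_cg(f, s) for s in starts), key=f)
```

The constraints on thresholds and service rates are not handled in this sketch; in practice they could be enforced through a reparametrization or a penalty term.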

It is noted that we have to impose specific constraints on the parameter space. Beyond the

last threshold level the matrix $D^{-1}Q$ must have at least one negative eigenvalue, to ensure stability; $s_{\min}$ is the minimum value of the service rate to make sure that this is the case. The matrix $D^{-1}Q$ always has a zero eigenvalue, but if $s = s_{\min}$ it has at least two zero eigenvalues while all other eigenvalues are positive; this case leads to a minor technicality and is dealt with separately. The (trivial) constraints $0 < B_1 \le B_2 \le \cdots \le B_N$ and $0 < s_0 \le s_1 \le \cdots \le s_N < r$ must be imposed explicitly.

5.3.2 Serving Strategies

As indicated earlier, in this section we assume that the on- and off-periods are exponentially

distributed with means 1/μ and 1/λ respectively. Without loss of generality, we assume that

λ = 1 and that the source’s transmission rate is r = 1; this is essentially a renormalization of

time and space. When comparing single-server models the static cost does not play a role in

the optimization and can therefore be left out. When comparing the single-server case with

the multiple-server case, it is obviously required to include the static component.

Static Strategy

This is the simplest strategy of serving, in which the server works at a given constant rate

at any buffer content. For this model the cost function can be optimized explicitly. The cost

function, E(s), has only two components, viz. the dynamic cost and the buffering cost:

\[
E(s) = s^3 + \delta\, \frac{\lambda r (r - s)}{(\lambda + \mu)\,(\lambda (s - r) + \mu s)}. \tag{5.3}
\]

Fig. 5.1 shows the optimal cost for different values of the buffering cost rate δ.
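Although (5.3) can be optimized explicitly, a numerical search is a convenient cross-check. The sketch below uses a golden-section search on $(s_{\min}, r]$, where both terms of (5.3) are convex in $s$; the function names and the small offset from $s_{\min}$ are assumptions made for this illustration.

```python
import math


def static_cost(s, lam, mu, r, delta):
    """Cost (5.3): dynamic power s^3 plus buffering cost delta * EW."""
    return s ** 3 + delta * lam * r * (r - s) / (
        (lam + mu) * (lam * (s - r) + mu * s))


def optimal_static_rate(lam, mu, r, delta, tol=1e-10):
    """Golden-section search for the minimiser of (5.3) on (s_min, r]."""
    s_min = lam * r / (lam + mu)
    lo, hi = s_min + 1e-9, r            # stay strictly above the stability bound
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = hi - phi * (hi - lo), lo + phi * (hi - lo)
    while hi - lo > tol:
        if static_cost(a, lam, mu, r, delta) < static_cost(b, lam, mu, r, delta):
            hi, b = b, a                # minimum lies in [lo, b]
            a = hi - phi * (hi - lo)
        else:
            lo, a = a, b                # minimum lies in [a, hi]
            b = lo + phi * (hi - lo)
    return 0.5 * (lo + hi)
```

With the normalization $\lambda = r = 1$ used in this section, `optimal_static_rate(1.0, 0.5, 1.0, delta)` can be evaluated over a grid of `delta` values to trace out the static curve.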

Figure 5.1: Optimal cost of static serving and sleep mode strategies vs. buffering cost rate (x-axis: δ; y-axis: optimal cost; curves: static, sleep mode). The switching cost rate is 0.5; the mean burst time is 2 (i.e., μ = 0.5).

Sleep Mode Strategy

Here the server is off when the queue is empty and starts working as soon as the source

enters the active mode. Also in this case the optimal service rate can be computed explicitly.

The service alternates between 0 and a constant service rate higher than smin. Switching

consumes energy but even for high values of the switching cost rate, β, the sleep mode

serving strategy outperforms the static strategy, in terms of our cost function. Fig. 5.1 shows

the optimal cost of the static and sleep mode strategy, for different values of the buffering

cost rate δ; the switching cost rate is relatively high (β = 0.5). For small δ both strategies

have nearly the same energy cost, but when δ increases the difference increases. For small

values of β this difference tends to be significantly higher.

1-Threshold Strategy

We now optimize our objective function with respect to a single threshold ($B_1$) and two service rates ($s_0$, $s_1$). The optimal $s_0$ is always lower than $s_{\min}$, and we have seen before that $s_1$ must be higher than $s_{\min}$. When β increases, the optimal threshold increases, so as to reduce the switching cost. In Fig. 5.2 we show the behavior of the thresholds when varying β. It shows that for β below approximately 0.025, the 1-threshold strategy and the hysteretic strategy (see below) are equivalent, but for β above this value the thresholds ($B_\ell$, $B_h$) bifurcate, with $B_1$ lying in between. Increasing β leads to an increase in $s_0$, but a decrease in $s_1$. Also, increasing the buffering cost rate δ reduces $s_0$ and $B_1$, but increases $s_1$. Typical numerical results are shown in Table 5.1.

Figure 5.2: Thresholds in the 1-threshold and the hysteretic strategy with respect to β (x-axis: β; y-axis: threshold; curves: MTM-1: $B_1$, HysM: $B_\ell$, HysM: $B_h$). The parameters chosen are μ = 0.5, δ = 0.05.

Figure 5.3: Optimal cost of the hysteretic strategy and the 1-threshold strategy (x-axis: β; y-axis: optimal cost; curves: MTM-1, HysM). For small β (switching cost rate) the hysteretic strategy behaves like the 1-threshold strategy, but if β is large, then the hysteretic strategy is more efficient (μ = 0.5, δ = 0.05).

Hysteretic Strategy

The hysteretic strategy is very similar to the 1-threshold strategy; actually if the thresholds are equal ($B_h = B_\ell$), then they match. For low values of the switching cost rate, the optimal thresholds of the hysteretic strategy are equal and coincide with the threshold of the 1-threshold strategy; the optimal service rates of the two models are equal, too. Fig. 5.3 shows the optimal cost of the hysteretic and one-threshold strategy as a function of the parameter β. It is observed that up to a critical level of β the thresholds $B_h$ and $B_\ell$ are equal, and increase as a function of β; above this critical level the upper threshold $B_h$ increases while the lower threshold $B_\ell$ decreases. It means that the hysteretic strategy is more efficient than the 1-threshold strategy only if the switching cost is high (as under the hysteretic strategy the switching rate $\mathbb{E}S$ is reduced). Table 5.1 shows values of the objective function for a representative set of parameters.

Table 5.1: Optimal cost of single-server strategies. The third column relates to the hysteretic model, and the next three columns to the models with 1, 2, and 3 thresholds, respectively, denoted by MTM-1, MTM-2, MTM-3.

| δ | β | HysM | MTM-1 | MTM-2 | MTM-3 |
|---|---|---|---|---|---|
| Scenario 1: μ = 0.5 | | | | | |
| 0.02 | 0.0 | 0.3807 | 0.3807 | 0.3740 | 0.3713 |
| 0.02 | 0.02 | 0.3835 | 0.3839 | 0.3785 | 0.3771 |
| 0.02 | 0.05 | 0.3858 | 0.3885 | 0.3848 | 0.3848 |
| 0.05 | 0.0 | 0.4340 | 0.4340 | 0.4221 | 0.4190 |
| 0.05 | 0.02 | 0.4385 | 0.4385 | 0.4286 | 0.4274 |
| 0.05 | 0.05 | 0.4435 | 0.4450 | 0.4380 | 0.4378 |
| 0.10 | 0.0 | 0.4896 | 0.4896 | 0.4731 | 0.4700 |
| 0.10 | 0.02 | 0.4956 | 0.4956 | 0.4815 | 0.4810 |
| 0.10 | 0.05 | 0.5044 | 0.5044 | 0.4934 | 0.4934 |
| Scenario 2: μ = 2.0 | | | | | |
| 0.02 | 0.0 | 0.0704 | 0.0704 | 0.0679 | 0.0671 |
| 0.02 | 0.02 | 0.0736 | 0.0764 | 0.0762 | 0.0762 |
| 0.02 | 0.05 | 0.0756 | 0.0809 | 0.0809 | 0.0835 |
| 0.05 | 0.0 | 0.0914 | 0.0914 | 0.0875 | 0.0863 |
| 0.05 | 0.02 | 0.0971 | 0.0998 | 0.0993 | 0.0993 |
| 0.05 | 0.05 | 0.1007 | 0.1069 | 0.1069 | 0.1069 |
| 0.10 | 0.0 | 0.1147 | 0.1147 | 0.1092 | 0.1079 |
| 0.10 | 0.02 | 0.1238 | 0.1256 | 0.1244 | 0.1244 |
| 0.10 | 0.05 | 0.1291 | 0.1362 | 0.1361 | 0.1361 |



N -Threshold Strategy

In this model the optimization must be performed over $N$ thresholds ($B_1, B_2, \dots, B_N$) and $N + 1$ service rates ($s_0, s_1, \dots, s_N$). The optimal values of the first threshold ($B_1$) and the first service rate ($s_0$) tend to be very close to zero. It means that the system resembles the sleep mode strategy described above. Obviously, if two consecutive service rates are equal ($s_i = s_{i+1}$, that is), then an $N$-threshold strategy is equivalent to an $(N-1)$-threshold strategy. This phenomenon occurs when the switching cost rate is high, so that it is beneficial to have a relatively low number of thresholds. We present some sample results of the 2-threshold strategy and the 3-threshold strategy in Table 5.1.

Continuous Serving Strategy

We now consider the model in which the service rate is a continuous function of the buffer content; the corresponding stationary distribution has been obtained in [89]. For optimizing this model we could either use calculus-of-variations methods or approximate this model by an $N$-threshold model (with the thresholds fixed). Following the latter approach, assume there are $N$ threshold levels such that the distance between two consecutive levels is constant, and choose the thresholds such that $B_{i+1} - B_i$ goes to 0 as $N \to \infty$; evidently, the solution we obtain for large $N$ will be close to the solution to the continuous serving strategy. It is clear that the optimal cost of $N$-threshold models decreases when $N$ increases, as one can increasingly accurately approximate the optimal service rates of the continuous strategy. Fig. 5.4 shows the optimal cost as a function of $N$; it gives us insight into the amount by which the objective function decreases when $N$ increases. From the figure we observe that the optimal cost converges, as expected, to a limit when the number of thresholds becomes large; this limiting strategy corresponds to the situation in which the server constantly adapts its processor speed. If switching between service rates does consume energy, the objective function will be high for high values of $N$, unless the service rates $s_i$ remain constant for a substantial set of $i \in \{0, \dots, N\}$ (meaning that there is effectively no switching); a typical example of the optimal service rate is depicted in Fig. 5.5.

Multiple Servers Strategy

If γ, the energy rate consumed by an operating server, is small, it is obviously efficient to activate many servers. Hence, the optimal number of operating servers depends on γ. Above we did not take into account γ as we so far limited ourselves to single-server strategies. Now that we compare single-server strategies with multiple-server strategies, we clearly have to add this cost component. Table 5.2 presents the optimal values of our cost function for different sets of parameters.

Figure 5.4: Optimal cost of strategies with many thresholds, as a function of the number of thresholds (x-axis: number of thresholds, labelled n in the plot; y-axis: optimal cost). The parameters chosen are μ = 0.5, δ = 0.05, β = 0.0.

Figure 5.5: Service rate as function of buffer content in continuous strategy, approximated by 100 thresholds (x-axis: buffer content; y-axis: serving rate; curves: β = 0.0, β = 0.02). The solid line represents the service rate when there is no switching cost, and the dashed line the service rate if switching consumes energy.

Table 5.2: Optimal cost of multiple-server strategies. C1-12 (C1-23) relates to Case 1 with 1 threshold and 2 servers (2 thresholds and 3 servers, respectively). C2-12 (C2-23) relates to Case 2 with 1 threshold and 2 servers (2 thresholds and 3 servers, respectively). C3-01 (C3-10) relates to Case 3 with 0 (1) threshold under and 1 (0) threshold beyond the threshold at which the second server starts operating. Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and (μ = 1.0, δ = 0.05, β = 0.0), respectively.

| γ | C1-12 | C1-23 | C2-12 | C2-23 | C3-01 | C3-10 |
|---|---|---|---|---|---|---|
| Scenario 1 | | | | | | |
| 0.05 | 0.2392 | 0.1954 | 0.2276 | 0.1919 | 0.2253 | 0.2271 |
| 0.10 | 0.3250 | 0.3114 | 0.3152 | 0.3056 | 0.3143 | 0.3135 |
| 0.20 | 0.4859 | 0.4859 | 0.4801 | 0.4801 | 0.4801 | 0.4752 |
| 0.50 | 0.9145 | 0.9360 | 0.9325 | 0.9344 | 0.9303 | 0.8506 |
| Scenario 2 | | | | | | |
| 0.05 | 0.1718 | 0.1534 | 0.1642 | 0.1501 | 0.1630 | 0.1556 |
| 0.10 | 0.2487 | 0.2416 | 0.2256 | 0.2408 | 0.2429 | 0.2253 |
| 0.20 | 0.3868 | 0.3868 | 0.3284 | 0.3834 | 0.3834 | 0.3507 |
| 0.50 | 0.7394 | 0.7737 | 0.7400 | 0.7676 | 0.6320 | 0.6309 |

5.3.3 Comparing the Service Strategies

Above we assessed the performance of a wide range of service strategies, see e.g. Tables 5.1–

5.2. In this section we further compare the strategies; we say that strategy A is better than

strategy B if strategy A’s optimal cost is lower than strategy B’s optimal cost. Let us start with

a number of obvious observations. Denote by $f(s_0)$ the cost function of the static strategy, and by $g(B_1, s_0, s_1)$ the cost function of the 1-threshold strategy. If $s_0 = s_1$, then the cost of the 1-threshold strategy equals that of the static strategy. It is now immediate that $\min g(B_1, s_0, s_1)$ is less than or equal to $\min f(s_0)$, and hence the 1-threshold strategy is at least as good as the static strategy. By the same token, every $N$-threshold strategy is at least as good as any $M$-threshold strategy if $N \ge M$. Also, the hysteretic strategy is at least as efficient as the 1-threshold strategy.

If β is high, the optimization procedure tries to decrease the mean switching rate ES, which
can be done in two ways: (i) eliminating thresholds by setting the service rates below and above
a threshold to the same value, or (ii) increasing the distance between the threshold levels. In
both cases, however, the mean buffer content will increase, so if δ is also high the cost function
will not decrease. As shown by the numerical output above (and a wide set of additional
experiments, not reported here), the hysteretic strategy tends to be better than the N-threshold
strategy if the switching and buffering costs are relatively high.
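Why hysteresis lowers the switching rate can be illustrated with a small sketch. It rests on assumptions that are not taken from this section: the buffer content is observed at discrete instants, the 1-threshold rule uses the high rate exactly when the level exceeds a single level `b`, and the hysteretic rule switches up at `b_up` and down at `b_down`. The function names and the sample path are hypothetical.

```python
from typing import Sequence


def count_switches_threshold(levels: Sequence[float], b: float) -> int:
    """Count rate switches under a 1-threshold rule: high rate iff level > b."""
    switches = 0
    high = levels[0] > b
    for x in levels[1:]:
        new_high = x > b
        if new_high != high:
            switches += 1
            high = new_high
    return switches


def count_switches_hysteretic(
    levels: Sequence[float], b_down: float, b_up: float
) -> int:
    """Count switches when the rate goes up at b_up and down again at b_down."""
    if not b_down < b_up:
        raise ValueError("b_down must be strictly below b_up")
    switches = 0
    high = levels[0] >= b_up
    for x in levels[1:]:
        if not high and x >= b_up:
            high = True
            switches += 1
        elif high and x <= b_down:
            high = False
            switches += 1
    return switches


# Hypothetical sampled buffer levels that oscillate around the level 1.0.
levels = [0.0, 1.1, 0.9, 1.1, 0.9, 1.6, 0.4, 1.1]
print(count_switches_threshold(levels, b=1.0))                  # 7
print(count_switches_hysteretic(levels, b_down=0.5, b_up=1.5))  # 2
```

On this path the small oscillations around the single threshold each trigger a switch, whereas the hysteretic rule ignores them; the price is that the buffer spends more time at higher levels before the rate is raised.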


We showed that if the static component is very small, then multiple-server strategies
outperform single-server strategies; evidently, if γ is high, then multiple servers are not
necessarily better. We finish this section with an interesting (though straightforward)
observation. Let us assume that strategy A is a single-server static strategy, while strategies B
and C correspond to 2 and 3 servers, respectively, operating with rate s0, being the same rate
as in strategy A. Because all these strategies work with the same service rates, their workloads
are equal as well. If γ < (5/36) s0^3, then the 3-server strategy is the most efficient; if
(5/36) s0^3 < γ < (3/4) s0^3, then the 2-server strategy is the most efficient; and if
γ > (3/4) s0^3, then it is optimal to use one server. Hence the optimal number of servers
depends on the value of the parameter γ.
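The two break-even values can be reproduced with a few lines of code. The sketch below assumes a power model that is not spelled out in this section: each active server consumes static power γ plus dynamic power equal to the cube of its speed, and the total rate s0 is split evenly over the servers. Under that assumption n servers cost nγ + s0^3/n^2, which gives exactly the thresholds (3/4) s0^3 and (5/36) s0^3. The function names are hypothetical.

```python
def total_power(n_servers: int, s0: float, gamma: float) -> float:
    """Assumed model: static power gamma per server plus cubic dynamic power,
    with the total service rate s0 divided evenly over the servers."""
    per_server_speed = s0 / n_servers
    return n_servers * (gamma + per_server_speed ** 3)


def best_number_of_servers(s0: float, gamma: float, candidates=(1, 2, 3)) -> int:
    """Return the candidate number of servers with the lowest total power."""
    return min(candidates, key=lambda n: total_power(n, s0, gamma))


s0 = 1.0  # thresholds are then 5/36 (about 0.139) and 3/4
for gamma in (0.1, 0.5, 1.0):
    print(gamma, best_number_of_servers(s0, gamma))
# 0.1 -> 3 servers, 0.5 -> 2 servers, 1.0 -> 1 server
```

Equating the cost of 1 and 2 servers gives γ = (1 − 1/4) s0^3 = (3/4) s0^3, and equating 2 and 3 servers gives γ = (1/4 − 1/9) s0^3 = (5/36) s0^3.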

5.4 Robustness Analysis

So far we found the optimal parameters (in terms of service rates and thresholds) of various
service strategies, and compared these with each other. Now suppose that we decide
to choose a specific strategy, which is suitable for implementation and has sufficiently low
energy cost. In order to be able to identify the optimal parameter values, however, we need
to know the precise values of the input traffic characteristics. Now the question is how an
estimation error propagates: to what extent is the value of the objective function affected?
We consider two scenarios: in the first we perturb the value of the mean on-time (keeping
its distribution exponential), while in the second we perturb the distribution of the on-time.
We compare the strategies (with and without the perturbation) in terms of the value of the
cost function, using the following procedure. First we find the optimal parameter values
(service rates, thresholds) and the corresponding optimal cost. Then we perturb the input
process, and we compute the cost function of the perturbed model with the optimal parameter
values as obtained for the unperturbed model. These parameters are likely not optimal for the
perturbed model, and hence the resulting cost is at least as high as the optimal cost of the
perturbed model. We use the difference of these costs divided by the optimal cost as a
'robustness measure', quantifying the impact of the perturbation:

\[
  \Delta E := 100 \times
  \frac{E_{\mathrm{per}}(X_{\mathrm{unper}}) - E_{\mathrm{per}}(X_{\mathrm{per}})}
       {E_{\mathrm{per}}(X_{\mathrm{per}})},
  \tag{5.4}
\]

where E_per is the cost function of the perturbed model, X_unper is the vector of optimal
parameters of the non-perturbed model, and X_per is the vector of optimal parameters of the
perturbed model. In a first series of experiments, we assume that the distribution of the
on-periods is still exponential but its rate, μ, is perturbed. Evidently, decreasing μ makes the
on-times longer, so that the queueing system can become unstable; we restrict ourselves to
values of μ that keep the queue stable. From extensive simulation experiments, see Table 5.3,
we conclude that the perturbation hardly affects the optimality of our designs. For example,
an increase of 20% in μ leads to an increase of about 5% in the value of the cost function under
the static strategy (StM), and to an increase of only about 1% under our dynamic strategies.
More interestingly, a decrease of 20% in μ leads to an increase of 31% in the value of the cost
function under the static strategy (StM), but to an increase of only a few percent under our
dynamic strategies.

Δμ (%)    StM     HysM    MTM-1   MTM-2   MTM-3

Scenario 1:
   5      0.48    0.10    0.10    0.06    0.04
  10      1.67    0.36    0.36    0.24    0.18
  20      5.39    1.21    1.21    0.84    0.64
  -5      0.67    0.13    0.13    0.08    0.06
 -10      3.44    0.59    0.59    0.34    0.26
 -20     31.1     3.40    3.40    1.81    1.27

Scenario 2:
   5      0.48    0.09    0.10    0.06    0.05
  10      1.67    0.33    0.37    0.24    0.19
  20      5.39    1.11    1.23    0.88    0.75
  -5      0.67    0.11    0.13    0.07    0.06
 -10      3.44    0.52    0.62    0.33    0.25
 -20     31.1     2.94    3.66    1.67    1.03

Table 5.3: Robustness with respect to changes in the mean on-time. The first column is the change in μ. StM stands for static strategy. All numbers are percentages. Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and (μ = 0.5, δ = 0.05, β = 0.05), respectively.
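The robustness measure (5.4) used in these experiments can be sketched in a few lines. The helper below is a hypothetical illustration: it takes any callable for the perturbed cost function together with the two parameter vectors. The quadratic toy cost is an assumption chosen for easy arithmetic and is not the fluid-model cost of this chapter.

```python
from typing import Any, Callable


def robustness_measure(
    cost_per: Callable[[Any], float], x_unper: Any, x_per: Any
) -> float:
    """Percentage excess cost, as in (5.4), of running the perturbed model
    with parameters x_unper instead of its own optimal parameters x_per."""
    optimal_cost = cost_per(x_per)
    return 100.0 * (cost_per(x_unper) - optimal_cost) / optimal_cost


# Toy perturbed cost with minimizer x = 2.0 (an assumption for illustration).
def toy_cost(x: float) -> float:
    return (x - 2.0) ** 2 + 1.0


# Parameters tuned for the unperturbed model (1.5) cost 1.25 instead of 1.0.
print(robustness_measure(toy_cost, x_unper=1.5, x_per=2.0))  # 25.0
```

Because x_per minimizes the perturbed cost, the measure is non-negative by construction, and it vanishes when the unperturbed optimum happens to remain optimal.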

In a second series of experiments, we study the effect of a perturbation of the distribution of
the on-times. A first observation is that there is a connection between the coefficient of
variation (CoV) of the on-time distribution and the value of the cost function: if this CoV is
smaller (larger) than one, then the optimal cost is smaller (larger) than the optimal cost
corresponding to exponentially distributed on-times (where the CoV of a random variable is
defined as the ratio of its standard deviation to its mean); this is because (informally)
variability of the on-periods increases the energy cost, as there will be more queueing.

              StM      MTM-1    MTM-2    MTM-3

Scenario 1:
E_exp         0.4832   0.4340   0.4221   0.4190
E_hyp         0.5036   0.4464   0.4339   0.4306
E_Erl         0.4596   0.4180   0.4073   0.4046
ΔE_hyp        0.28%    0.17%    0.10%    0.09%
ΔE_Erl        0.39%    0.21%    0.07%    0.04%

Scenario 2:
E_exp         0.4832   0.4450   0.4380   0.4378
E_hyp         0.5036   0.4568   0.4494   0.4488
E_Erl         0.4596   0.4302   0.4234   0.4231
ΔE_hyp        0.28%    0.22%    0.20%    0.05%
ΔE_Erl        0.39%    0.25%    0.08%    0.05%

Table 5.4: Robustness with respect to changes in the distribution of the on-times, with the mean on-time unchanged. Scenario 1 and Scenario 2 have parameters (μ = 0.5, δ = 0.05, β = 0.0) and (μ = 0.5, δ = 0.05, β = 0.05), respectively.

Tables 5.4–5.5 show the impact of perturbing the distribution of the on-time (keeping its mean
fixed); we do so by replacing the exponential distribution by an Erlang (CoV smaller than
1) or a hyper-exponential (CoV larger than 1) distribution. The robustness measures are again
computed by ΔE, as defined in (5.4). In the case of the Erlang distribution, the mean is kept at
1/μ, but there are now two phases (each of mean duration 1/(2μ)), leading to a squared CoV
of 1/2. The hyper-exponential distribution that we chose corresponds to an exponential random
variable with mean 1/(2μ) with probability 1/2, and an exponential random variable with
mean 3/(2μ), also with probability 1/2, such that the resulting mean is 1/μ and the squared
CoV is 3/2. In Tables 5.4–5.7, ΔE_hyp (ΔE_Erl) is the percentage error in the value of the
objective function when the parameters are optimized assuming exponential on-times, whereas
the on-time distribution is in fact hyper-exponential (Erlang).
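The stated means and squared CoVs of the two perturbed on-time distributions can be checked with exact rational arithmetic. The sketch below uses only the standard facts that an exponential random variable with mean m has variance m^2 and second moment 2m^2; the function names are hypothetical.

```python
from fractions import Fraction


def erlang2_moments(mu: Fraction):
    """Mean and squared CoV of two independent exponential phases in series,
    each with mean 1/(2*mu)."""
    phase_mean = 1 / (2 * mu)
    mean = 2 * phase_mean
    variance = 2 * phase_mean ** 2  # variances of independent phases add up
    return mean, variance / mean ** 2


def hyperexp_moments(mu: Fraction):
    """Mean and squared CoV of a 50/50 mixture of exponentials with means
    1/(2*mu) and 3/(2*mu)."""
    p = Fraction(1, 2)
    branch_means = (1 / (2 * mu), 3 / (2 * mu))
    mean = sum(p * m for m in branch_means)
    second_moment = sum(p * 2 * m ** 2 for m in branch_means)
    return mean, (second_moment - mean ** 2) / mean ** 2


mu = Fraction(1, 2)  # the value of mu used in the scenarios of Tables 5.4-5.7
print(erlang2_moments(mu))   # (Fraction(2, 1), Fraction(1, 2))
print(hyperexp_moments(mu))  # (Fraction(2, 1), Fraction(3, 2))
```

Both distributions keep the mean at 1/μ = 2, while the squared CoV moves to 1/2 and 3/2, on either side of the value 1 of the exponential distribution.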

In Tables 5.6–5.7 the same experiments have been performed, but the mean on-time is also

increased by 10%. The numerical output shows that multiple-threshold strategies are even

more robust; a similar conclusion holds for multiple-server strategies.


              C1-12    C2-12    C3-01

Scenario 1:
E_exp         0.2392   0.2276   0.2253
E_hyp         0.2446   0.2317   0.2287
E_Erl         0.2323   0.2229   0.2202
ΔE_hyp        0.23%    0.23%    0.12%
ΔE_Erl        0.29%    0.30%    0.09%

Scenario 2:
E_exp         0.4859   0.4801   0.4801
E_hyp         0.4885   0.4813   0.4813
E_Erl         0.4816   0.4775   0.4775
ΔE_hyp        0.02%    0.03%    0.03%
ΔE_Erl        0.02%    0.04%    0.04%

Table 5.5: Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time unchanged. Scenario 1 and Scenario 2 have parameters (μ = 0.5, γ = 0.05, δ = 0.05, β = 0.0) and (μ = 0.5, γ = 0.2, δ = 0.05, β = 0.0), respectively.

              StM      MTM-1    MTM-2    MTM-3

Scenario 1:
E_exp         0.4832   0.4340   0.4221   0.4190
E_hyp         0.5368   0.4783   0.4653   0.4617
E_Erl         0.4921   0.4497   0.4382   0.4356
ΔE_hyp        4.99%    1.21%    0.65%    0.49%
ΔE_Erl        0.90%    0.21%    0.19%    0.27%

Scenario 2:
E_exp         0.4832   0.4450   0.4380   0.4378
E_hyp         0.5368   0.4883   0.4802   0.4796
E_Erl         0.4921   0.4613   0.4541   0.4537
ΔE_hyp        4.99%    1.30%    0.88%    0.22%
ΔE_Erl        0.90%    0.21%    0.28%    0.25%

Table 5.6: Robustness with respect to changes in the distribution of the on-times, with the mean on-time changing as well. The mean on-time of the non-exponential distribution is 10% higher than that of the exponential distribution. Scenario 1 and Scenario 2 have parameters (μ = 0.5, γ = 0.05, β = 0.0) and (μ = 0.5, γ = 0.05, β = 0.05), respectively.


              C1-12    C2-12    C3-01

Scenario 1:
E_exp         0.2392   0.2276   0.2253
E_hyp         0.2535   0.2402   0.2372
E_Erl         0.2412   0.2308   0.2288
ΔE_hyp        0.99%    0.80%    0.35%
ΔE_Erl        0.51%    0.57%    0.44%

Scenario 2:
E_exp         0.4859   0.4801   0.4801
E_hyp         0.5008   0.4932   0.4933
E_Erl         0.4943   0.4897   0.4898
ΔE_hyp        0.08%    0.07%    0.07%
ΔE_Erl        0.04%    0.05%    0.05%

Table 5.7: Robustness of multiple-server strategies with respect to changes in the distribution of the on-times, with the mean on-time changing as well. The mean on-time of the non-exponential distribution is 10% higher than that of the exponential distribution. Scenario 1 and Scenario 2 have parameters (μ = 0.5, γ = 0.05, δ = 0.05, β = 0.0) and (μ = 0.5, γ = 0.2, δ = 0.05, β = 0.0), respectively.

5.5 Conclusion

This chapter has presented a modeling framework for controlling and optimizing energy
management in emerging multi-core servers with speed-scaling capabilities. In addition
to incorporating dynamic power, our framework also includes the static (leakage) power
and the switching overhead between speed levels; these features were largely unaccounted
for in prior work.

We proposed and studied different dynamic strategies for adapting the multi-core server
speeds on the basis of observations of the current buffer content. For a given strategy we
have shown how the performance of the system can be evaluated relying on stochastic fluid
models. The resulting numerical evaluation technique enabled us to calculate the value of
objective functions that balance energy consumption and performance. We have studied
strategies in which the service policy depends on the current buffer value only, but also
strategies in which it matters in what direction thresholds are crossed (i.e., hysteretic
control).

The following general conclusions were drawn. Evidently, strategies that use more threshold
levels are more efficient with respect to power consumption, but, remarkably, most of
the efficiency gain is achieved with only 1 or 2 thresholds. Furthermore, our objective
functions are only mildly sensitive to perturbations in the input parameters. As a consequence,
our procedure is robust, in that estimation errors in the input parameters hardly affect its
performance. A more specific conclusion is that if the switching cost is considerable, then the
hysteretic model performs better than the model with one threshold, but the model with
2 thresholds is always more efficient.




Samenvatting (Dutch summary, in English translation)

Lévy processes (i.e., processes with stationary and independent increments) play an
important role in applied probability. They are used in numerous applications, ranging from
actuarial mathematics and other finance-oriented models to operations research and even
biology. One of the most important research areas within Lévy processes is concerned with
analyzing the distribution of the supremum (or infimum) attained by the process over a given
time horizon; the resulting process is usually called the 'running maximum' (or 'running
minimum'). The research results in this area are known as fluctuation theory.

The main goal of this thesis is to develop numerical techniques for determining the
distribution of the running maximum of Lévy processes, and to work these out in a number of
financial applications. The second goal concerns computational techniques that help to
optimize the energy consumption of servers handling traffic in a communication network. The
traffic in such a network is modeled as an on-off process, in which the on- and off-times are
random variables.

In the second chapter of this thesis, we develop a numerical technique that is based on the
so-called Wiener-Hopf factorization, and with which we can determine the distribution
function of the running maximum (or minimum) of a general Lévy process (i.e., a Lévy process
with both upward and downward jumps). Thanks to numerical Laplace and Fourier inversion
techniques developed by Den Iseger, we are able to do this with almost machine precision.
This approach has a multitude of possible applications.

In Chapter 3 we focus in particular on applying the techniques of Chapter 2 to the pricing of
specific exotic options, namely so-called lookback options. We note, however, that our
technique can in principle be used for pricing any option whose payoff is determined by the
running maximum or minimum, for example the so-called barrier option.

The second technique that we discuss in this book is importance sampling; see Chapter 4.
This approach aims to reduce the variance of simulation-based estimators. Direct ('naive')
simulation is inefficient and inaccurate if the event of interest is rare; to cope with this, the
idea behind importance sampling is to generate simulation paths under an alternative
probability measure, under which the event occurs frequently. Clearly, the simulation output
has to be corrected (by means of likelihood ratios) in order to obtain an unbiased estimate.
The main challenge lies in finding a good alternative probability measure that makes the
resulting variance as small as possible; we succeed in doing so in our Lévy model.

'Energy-aware' processors are intended to handle traffic efficiently by adapting the
processing speed of the CPU to the current load and the imposed performance requirements. In
Chapter 5 we consider this problem under an objective function that is a linear combination
of energy consumption, the experienced quality (measured in terms of the delay incurred by
the traffic in the network), and the frequency with which the processing speed needs to be
adjusted; all this for a so-called multi-core processor. Our main contribution is that we
develop a stochastic fluid model with which this system can be described and optimized. We
discuss various schemes, each of which adapts the processing speed in its own specific way,
and quantify the reduction in energy usage. Moreover, we show that optimal strategies are
robust, in the sense that perturbations of the parameters hardly affect them. This robustness
property makes the practical use of optimal strategies highly attractive.


Summary

Lévy processes, i.e., processes with stationary and independent increments, play an
important role in applied probability. They have widespread applications, ranging from
insurance and financial mathematics to operations research and even biology. One of the main
branches of research on Lévy processes concentrates on analyzing the probabilistic properties
of the supremum (or infimum) attained by the process over a given period of time, usually
referred to as the running maximum (or minimum). This topic is commonly known as
fluctuation theory.

The main objective of this thesis is to develop numerical techniques to calculate the
probability distribution of the running maximum of Lévy processes, and to consider a number
of specific financial applications. The other objective is to propose a numerical method to
optimize the energy consumption of servers handling traffic in a communication network. The
traffic itself is modeled by a random process, usually an on-off process with random on- and
off-times.

In the second chapter of this thesis, a numerical technique based on the Wiener-Hopf
factorization is developed to evaluate the probability distribution of the running maximum (or
minimum) of a general Lévy process (i.e., a Lévy process with possibly two-sided jumps).
Thanks to the numerical Laplace and Fourier inversion technique developed by den Iseger, we
are able to compute this distribution numerically with almost machine precision. This
approach has a variety of potential applications.

In Chapter 3, we primarily focus on applying the technique developed in Chapter 2 to price
specific exotic options, viz. the so-called lookback options. However, the method can be
employed for pricing many other options which depend on the maximum and/or minimum
attained by the underlying Lévy process, for instance barrier options.


The second technique presented in this thesis is importance sampling; see Chapter 4. This technique is essentially used to reduce the variance of a simulation-based estimator. Since straightforward simulation is inefficient and inaccurate for estimating rare-event probabilities, the idea of importance sampling is to generate simulation paths under an alternative measure under which the event is no longer rare. The simulation results then have to be corrected by an appropriate likelihood ratio to obtain an unbiased estimate. The main challenge of this method is to find the appropriate alternative measure and the corresponding likelihood ratio, which we succeed in finding in our Lévy setting.
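
The mechanism of importance sampling can be illustrated with a minimal sketch on a toy problem. This is not the Lévy setting of Chapter 4: we assume a standard normal random variable Z and estimate the rare-event probability P(Z > 4). The alternative measure is taken to be a normal distribution with its mean shifted to the threshold, which is an illustrative choice, and the function names are hypothetical.

```python
import math
import random


def tail_prob_importance_sampling(threshold, n_samples, seed=0):
    """Estimate P(Z > threshold), Z ~ N(0, 1), by sampling from N(threshold, 1).

    The mean shift to the threshold is an assumed, illustrative choice of
    alternative measure; it makes the event {Z > threshold} non-rare.
    """
    rng = random.Random(seed)
    shift = threshold
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)  # draw under the alternative measure
        if x > threshold:
            # Likelihood ratio of N(0, 1) with respect to N(shift, 1) at x:
            # exp(-x^2/2) / exp(-(x - shift)^2/2) = exp(-shift*x + shift^2/2)
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def tail_prob_exact(threshold):
    """Exact standard normal tail probability, used as a reference value."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


estimate = tail_prob_importance_sampling(4.0, 100_000)
exact = tail_prob_exact(4.0)
print(f"importance sampling: {estimate:.3e}, exact: {exact:.3e}")
```

Under the original measure, only a handful of the 100,000 samples would be expected to exceed the threshold, so a straightforward estimator would be very noisy. Under the shifted measure roughly half of the samples do, and weighting each of them by the likelihood ratio keeps the estimator unbiased.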

Energy-aware processors are intended to operate efficiently by adapting the speed of the server CPU to the processing load and the service level requirement. In Chapter 5, we consider a performance objective which is a linear combination of energy usage, queueing cost (reflected by delay) and speed switching cost for a multi-core processor. Our analysis captures the static power as well as the dynamic power of the processor. Our main contribution is a stochastic fluid model for the analysis and optimization of such multi-core processing systems. We discuss several schemes that lead to a reduction in energy consumption. We show that the optimal strategies are robust under perturbations of the system parameters and of the statistical properties of the traffic. This robustness property makes the use of the optimal strategies highly attractive in practical situations.


About the author

Naser Mohammad Asghari was born in Tehran, Iran, in August 1976. During his school years he was fascinated by mathematics and physics. In 1994 he entered the bachelor program in mathematics at Sharif University of Technology (SUT), but changed to physics the next year. In 1998 he was accepted into a master program in cosmology at SUT. After graduating from this master program in 2000, he continued his studies in a PhD program in astrophysics at the Institute for Advanced Studies in Basic Science. He defended his PhD thesis in 2006, and then did a one-year postdoc at the same institute. During the period 2007–2010 he was an assistant professor at Aerospace Research Institute and Azad University of Zanjan. In 2010 he started his second PhD at the KdV Institute for Mathematics, University of Amsterdam, the Netherlands, in the field of applied probability under the supervision of Professor Michel Mandjes. Since 2012 he has been working as a quantitative analyst at ING Bank in Amsterdam.
