LIMIT THEOREMS AND APPROXIMATIONS WITH APPLICATIONS TO INSURANCE RISK AND QUEUEING THEORY a dissertation submitted to the department of management science and engineering and the committee on graduate studies of stanford university in partial fulfillment of the requirements for the degree of doctor of philosophy Jose H. Blanchet August 2004


LIMIT THEOREMS AND APPROXIMATIONS WITH

APPLICATIONS TO INSURANCE RISK AND

QUEUEING THEORY

a dissertation

submitted to the

department of management science and engineering

and the committee on graduate studies

of stanford university

in partial fulfillment of the requirements

for the degree of

doctor of philosophy

Jose H. Blanchet

August 2004

© Copyright by Jose H. Blanchet 2004

All Rights Reserved


I certify that I have read this dissertation and that, in

my opinion, it is fully adequate in scope and quality as a

dissertation for the degree of Doctor of Philosophy.

Peter W. Glynn (Principal Adviser)

I certify that I have read this dissertation and that, in

my opinion, it is fully adequate in scope and quality as a

dissertation for the degree of Doctor of Philosophy.

Nicholas Bambos

I certify that I have read this dissertation and that, in

my opinion, it is fully adequate in scope and quality as a

dissertation for the degree of Doctor of Philosophy.

David O. Siegmund

Approved for the University Committee on Graduate

Studies.


Acknowledgements

First, I want to thank God for giving me the opportunity of living all these wonderful experiences at Stanford together with my beloved wife, Citlalli. Thanks, Lalli, for being extremely supportive and for always being interested in, and willing to listen to, my ideas. I consider your support and help throughout the completion of this step in my academic life extremely valuable!

My advisor, Professor Peter Glynn, has been a constant source of encouragement and support. I had the fortune of enjoying a rich academic experience at Stanford and, obviously, my interactions with Professor Glynn have played a crucial role in making my Stanford experience so enjoyable. The example of Professor Glynn as researcher, teacher and advisor is something that I treasure as one of the most important lessons that I am keeping as a part of my learning experience.

I am grateful to the members of both the reading and examination committees (Professors Bambos, Diaconis, Glynn, Siegmund and Van Roy) for taking the time to read this dissertation and provide valuable feedback through useful conversations and interesting questions. In particular, thanks to Professor Siegmund for the useful comments that he provided during several insightful discussions. I also want to acknowledge the support that I received, in particular at early stages of my Ph.D. work, from Professor David Luenberger.

Naturally, I want to thank my parents and brothers, especially my mother and sisters, Rocío and Roxanna, for being extremely supportive and loving! And, finally, I would like to express my gratitude to my friends and colleagues at Stanford, who have contributed to making my overall experience at Stanford wonderful in every dimension.

Thanks!


Contents

Acknowledgements

1 Introduction

2 Corrected Diffusion Approximations
2.1 The Main Results
2.2 Short-time Asymptotics for the Cauchy Process
2.3 Reduction to Cauchy Process’ Asymptotics
2.4 An Asymptotic Expansion for $I(\theta, b)$
2.5 Expansions for $r(\Delta)$ and $E_\theta R^k(\infty)$
2.5.1 The Expansion for $r(\Delta)$
2.5.2 The Expansion for $E_\theta R(\infty)^k$ as $\theta \searrow 0$
2.6 Technical Proofs

3 Cramer-Lundberg with Heavy Tails
3.1 A Cramer-Lundberg Representation
3.2 Connection to Corrected Diffusion Approximations
3.3 Technical Development

4 Geometric Sums and Applications
4.1 Asymptotics for Geometric Sums
4.2 Asymptotics of Defective Renewal Equations

5 Approximating Discounted Rewards
5.1 Motivating Examples
5.2 Law of Large Numbers
5.3 The Central Limit Theorem
5.4 Edgeworth Expansion
5.4.1 The discrete time setting
5.4.2 The continuous time setting
5.5 Large Deviations
5.5.1 The continuous time setting
5.5.2 The discrete time setting

Bibliography

Chapter 1

Introduction

This dissertation focuses on the development of limit theorems and approximations for several performance measures that play an important role in a great variety of applied disciplines, including Insurance Risk Theory, Queueing Theory, Statistical Sequential Analysis, and Time Series Analysis, among others. To be more precise, let us use the insurance setting as a vehicle to provide a unified overview of the types of results that are developed in the subsequent chapters of this dissertation.

When dealing with the contingent nature of the insurance business, risk managers take advantage of stochastic models and tools that are used to effectively assess the risk of insurance portfolios (see Bowers et al. (1997)). A popular model, widely used in the insurance community to analyze collective risk, is the so-called renewal model (see Bowers et al. (1997) p. 432 and Asmussen (2001) Ch. 5). The renewal model assumes that the claims arrive according to a renewal process, with independent and identically distributed (iid) inter-arrival times. It is also assumed that the claim sizes are represented by a sequence of iid non-negative random variables (rv’s), independent of the arrival process. Finally, the model specifies a constant (aggregated) premium rate, which is received by the insurance company. A fundamental quantity in the risk analysis of insurance portfolios is the so-called ruin probability or probability of bankruptcy. Of course, if the premium rate charged is less than or equal to the equilibrium pay-out rate, then the law of large numbers (LLN) implies that the company will go bankrupt eventually with probability one. Consequently, insurance companies would typically charge a positive “safety loading” in addition to the equilibrium pay-out rate. Note, however, that in competitive environments, one would typically expect insurance companies to charge small safety loadings to their customers.

The first portion of this dissertation addresses the problem of understanding the probability of eventual ruin, parametrically in the premium rate, under low safety loading environments. This problem, in turn, involves studying the mathematical structure of a random walk with small negative drift. Indeed, the time to bankruptcy in the renewal model can be represented as the first hitting time of a certain level (which is just the initial reserve level of the company) for a random walk that has a negative drift proportional to the safety loading. As a result, ruin occurs in finite time if the maximum of a random walk with negative drift ever exceeds this level or, equivalently, if the corresponding first hitting time of this level is finite. Consequently, the aforementioned insurance problem motivates the parametric analysis of the distribution of the maximum of a random walk with small negative drift.

Incidentally, the distribution of the all-time maximum of a random walk with negative drift corresponds to the steady-state waiting time distribution (excluding service) of the single-server queue (which is one of the most fundamental models in the theory of queues). As in the insurance setting discussed previously, the underlying random walk would often have close to zero drift, which translates into the so-called heavy traffic regime that is widely used in the modern analysis of queueing systems. Heavy traffic analysis is often done through diffusion approximations. In fact, as we shall see in Chapter 2, our parametric analysis of the distribution of the maximum of a random walk with close to zero drift corrects the natural diffusion approximation based on Brownian motion (which provides a crude “first order” approximation to the distribution of the all-time maximum of the random walk). Corrected diffusion approximations (CDA’s) for the distribution of the maximum of a random walk were introduced by Siegmund (1979). Siegmund’s second order correction to the standard Brownian approximation was motivated by applications in Statistical Sequential Analysis, specifically applications related to the proper design of statistical tests that run up to a suitably defined first hitting time of an underlying random walk. The theory presented in Chapter 2 extends the development initiated by Siegmund (1979) and subsequent results in Statistical Sequential Analysis (see, for example, Chang (1992) and Chang and Peres (1997)).

The previous discussion presents some examples of applied disciplines that can potentially benefit from the results in the second chapter of this dissertation. Of course, in some of these disciplines, stylized features arising from modeling considerations and statistical analysis of the data may give rise to additional technical complications that must be addressed. For example, in the insurance setting described before, it turns out that, in several branches of the insurance business (such as property insurance), heavy tailed structure (in particular, claim sizes that do not have exponential moments) seems to be an appropriate modeling feature to consider. (Other examples are discussed in Chapter 3 below.) Unfortunately, techniques (such as exponential changes of measure) that are extremely useful in the analysis of light tailed systems (i.e., systems for which exponential moments exist) do not extend to the heavy tailed case. For instance, coming back to the insurance arena, the corrected diffusion approximation by Siegmund (1979), and the extension provided in Chapter 2 of this dissertation, rely on light tailed techniques. Another approximation for the ruin probability, which is typically very powerful in light tailed settings, is the celebrated Cramer-Lundberg approximation. It turns out that, in the light tailed case, the CDA presented in Chapter 2 and the Cramer-Lundberg approximation are intimately connected. Due to its success in applications involving light tailed characteristics, analogous forms of the Cramer-Lundberg approximation have been developed to cover a large class of heavy tailed claims (more precisely, subexponential claims; see Embrechts, Klüppelberg and Mikosch (1997)). These extensions to heavy tailed contexts are developed for large values of the initial reserve and fixed safety loading, and they typically perform poorly for the values of the initial reserve encountered in practical applications (see Embrechts, Klüppelberg and Mikosch (1997) p. 54). In Chapter 3, we introduce a new interpretation of the Cramer-Lundberg approximation for heavy tailed claims under the low safety loading asymptotic regime. In this dissertation (specifically in Chapter 3) we focus only on the proposed Cramer-Lundberg type of approximation in diffusion scale, which is related to the CDA presented in Chapter 2. Thus, in simple terms, Chapter 3 provides a new Cramer-Lundberg type of approximation for heavy tailed claims, interpreted in a low safety loading asymptotic regime, that seems to perform well in practical applications. (See Asmussen and Binswanger (1997), who analyzed a related approximation provided by Hogan (1986), which is discussed in Chapter 3 of this dissertation.) The approximation provided in Chapter 3 blends accurate approximations in diffusion scale with standard Cramer-Lundberg asymptotics for large values of the reserve in a coherent way; this parallels the relationship between the CDA of Chapter 2 and the Cramer-Lundberg asymptotic in the light tailed case.

As was mentioned before, the analysis of stochastic systems with heavy tailed characteristics gives rise to technical complications because standard light tailed techniques are not applicable. In order to deal with the problem of providing accurate approximations for the probability of bankruptcy in heavy tailed contexts, we developed new techniques that, in particular, are applied to obtain the results described in the previous paragraph. These new techniques are presented in Chapter 4 of this dissertation. More precisely, Chapter 4 develops asymptotic expansions of so-called random geometric sums (or geometric convolutions) when the success parameter of the geometric random variable is close to zero. The direct connection to the ruin problem and the distribution of the maximum of a random walk comes from a well known representation of the all-time maximum of a random walk as the sum of a geometric number of iid positive random variables. The techniques developed in Chapter 4 have implications beyond the ruin problem previously discussed. In particular, as we shall see, asymptotic expansions of geometric sums are closely related to so-called defective renewal equations. As we discuss in Chapter 4, these types of integral equations arise naturally in many areas of applied probability (including queueing theory and insurance risk theory). The asymptotics developed for geometric sums are then used to obtain expansions for defective renewal equations that are close to being proper. Again, this asymptotic regime arises repeatedly in queueing and insurance. For instance, as we shall see in Chapter 4, these results are useful in the development of corrected heavy traffic approximations for M/G/c queueing models and in the analysis of generalizations of classical renewal risk models.

Finally, it should be recognized that investments may play an important role in the bankruptcy of insurance companies. Indeed, if one introduces investment effects in the risk reserve, the probability of bankruptcy can be expressed in terms of the distribution of a so-called perpetuity or infinite horizon discounted reward (see Asmussen (2001) Ch. 7). This motivates the theme of the last chapter of this dissertation, namely Chapter 5. Specifically, in Chapter 5 we develop approximations for the distribution of infinite horizon discounted rewards. The theory provided in Chapter 5 is developed, just as in the previous chapters, in a “low profit environment”, which again is natural in many application settings (such as the insurance context that we have been emphasizing). In particular, we develop central limit theorems, laws of large numbers, Edgeworth expansions and large deviation principles (rough and exact) for the distribution of perpetuities under low interest rates. As we shall also discuss in Chapter 5, these approximations are relevant not only to the insurance ruin problem, but also to other applied disciplines (including time series analysis and finance).

Chapter 2

Corrected Diffusion Approximation

for the Maximum of Random Walk

Let $(X_n : n \ge 1)$ be a sequence of independent and identically distributed (iid) random variables (rv’s), and let $S = (S_n : n \ge 0)$ be its associated random walk (so that $S_0 = 0$ and $S_n = X_1 + \cdots + X_n$ for $n \ge 1$). In this chapter, we focus on the development of high accuracy approximations to the distribution of the maximum r.v.

\[ M = \max\{S_n : n \ge 0\}. \]

Clearly, $-\mu \triangleq EX_1$ must typically be negative in order that $M$ be finite-valued. The distribution of $M$ is of importance in a number of different disciplines.

For $x > 0$, $\{M > x\} = \{\tau(x) < \infty\}$, where $\tau(x) = \inf\{n \ge 1 : S_n > x\}$, so that computing the tail of $M$ is equivalent to computing a level crossing probability for the random walk $S$. Because of this level crossing interpretation, the tail of $M$ is of great interest to both the sequential analysis and risk theory communities. In particular, in the setting of insurance risk, $P(\tau(x) < \infty)$ is the probability that an insurer will face ruin in finite time (when the insurer starts with initial reserve $x$ and is subjected to iid claims over time); see, for example, Asmussen (2000).

The distribution of $M$ also arises in the analysis of the single most important model in queueing theory, namely the single-server queue. If the inter-arrival and service times for successive customers are iid with a mean arrival rate less than the mean service rate, then $W = (W_n : n \ge 0)$ is a positive recurrent Markov chain on $[0, \infty)$, where $W_n$ is the waiting time (exclusive of service) for customer $n$. If $W_\infty$ is a random variable having the stationary distribution of $W$, then Kiefer and Wolfowitz (1956) showed that $W_\infty$ has the distribution of $M$ for an appropriately defined random walk. As a consequence, computing the distribution of $M$ is of fundamental importance to queueing theorists.
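As a rough illustration of the waiting-time chain $W$ described above, the following sketch simulates it with the standard Lindley recursion $W_{n+1} = \max(W_n + V_n - T_{n+1}, 0)$, where $V_n$ is a service time and $T_{n+1}$ an inter-arrival time. The exponential distributions, the rates, and the function names are assumptions chosen for this example only; they are not taken from the dissertation.

```python
import random


def simulate_waiting_times(n_customers, arrival_rate, service_rate, seed=0):
    """Simulate W_0, ..., W_n via the Lindley recursion, starting from W_0 = 0.

    Assumes exponential inter-arrival and service times (an M/M/1 queue),
    which is only one illustrative choice of increment distribution.
    """
    rng = random.Random(seed)
    w = 0.0
    waits = [w]
    for _ in range(n_customers):
        service = rng.expovariate(service_rate)       # V_n
        interarrival = rng.expovariate(arrival_rate)  # T_{n+1}
        # The increment X = V - T has negative mean when arrival_rate < service_rate.
        w = max(w + service - interarrival, 0.0)
        waits.append(w)
    return waits


def tail_estimate(waits, x, burn_in=1000):
    """Fraction of post-burn-in waiting times exceeding x (estimates P(W_inf > x))."""
    sample = waits[burn_in:]
    return sum(1 for w in sample if w > x) / len(sample)


if __name__ == "__main__":
    waits = simulate_waiting_times(200_000, arrival_rate=0.5, service_rate=1.0, seed=1)
    print(tail_estimate(waits, 1.0))
```

Because successive waiting times are correlated, the empirical tail converges slowly when the arrival rate is close to the service rate, which is exactly the regime where analytical approximations become attractive.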

Since $W$ is a positive recurrent Markov chain, the distribution of $M$ can be computed as the solution to the equation describing the stationary distribution of $W$. This linear integral equation is known as Lindley’s equation (see Lindley (1952)) and is of Wiener-Hopf type; it is challenging to solve, both analytically and numerically. As a result, approximations are frequently employed instead. One important such approximation holds as $\mu \searrow 0$. This asymptotic regime corresponds in risk theory to the setting in which the “safety loading” is small (i.e., the premium charged is close to the typical pay-out for claims) and in queueing theory to the “heavy traffic” setting in which the server is utilized close to 100% of the time. Thus, this asymptotic regime is of great interest from an applications standpoint. Kingman (1963) showed that the approximation

\[ P(M > x) \approx \exp\left(-2\mu x/\sigma^2\right) \tag{1} \]

is valid as $\mu \searrow 0$, where $\sigma^2 = \mathrm{Var}(X_1)$. (A more precise statement of this result will be given in Section 2.1.) Because the right hand side of (1) is the exact value of the level crossing probability for the natural Brownian approximation to the random walk $S$, (1) is often called the diffusion approximation to the distribution of $M$.
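To get a feel for the accuracy of (1), the sketch below compares it with a case in which the tail of $M$ is known in closed form. It assumes increments $X = V - T$ with $V$ and $T$ independent exponentials (the M/M/1 queue), for which the classical formula $P(M > x) = \rho\, e^{-(\nu - \lambda)x}$ holds with $\rho = \lambda/\nu$. The rates and function names are illustrative assumptions, not quantities from the text.

```python
import math


def diffusion_tail(x, mu, sigma2):
    """Right-hand side of (1): exp(-2*mu*x/sigma^2), with mu = -E X_1 > 0."""
    return math.exp(-2.0 * mu * x / sigma2)


def mm1_exact_tail(x, arrival_rate, service_rate):
    """Exact P(M > x) for X = V - T, V ~ Exp(service_rate), T ~ Exp(arrival_rate)."""
    rho = arrival_rate / service_rate
    return rho * math.exp(-(service_rate - arrival_rate) * x)


# Heavy-traffic example: utilization 95% (illustrative values).
arrival_rate, service_rate = 0.95, 1.0
mu = 1.0 / arrival_rate - 1.0 / service_rate              # mean inter-arrival minus mean service
sigma2 = 1.0 / arrival_rate**2 + 1.0 / service_rate**2    # Var(V) + Var(T)

if __name__ == "__main__":
    for x in (1.0, 10.0, 50.0):
        print(x, diffusion_tail(x, mu, sigma2), mm1_exact_tail(x, arrival_rate, service_rate))
```

In this example the two exponents nearly agree, but the diffusion approximation misses the prefactor $\rho$; capturing such constants is the kind of refinement that corrected approximations aim at.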

As with any such approximation, there are applications for which (1) delivers poor results. Siegmund (1979) therefore proposed a so-called “corrected diffusion approximation” that reflects information in the increment distribution beyond the mean and variance. This corrected diffusion approximation computes the next term in the asymptotic expansion (as $\mu \searrow 0$) beyond that given by the right hand side of (1). The main result in this chapter (Theorem 1) is a development of the full asymptotic expansion initiated by Siegmund. We compute all the terms in the asymptotic expansion for general random walks with increments having exponential moments; see Section 2.5 for details on the calculation of the relevant coefficients in the expansion. Our theorem can be viewed as a non-Gaussian counterpart to the corresponding expansion provided recently by Chang and Peres (1997) for Gaussian random walks. As perhaps expected, the mathematical approach followed here is quite different from that used by Chang and Peres.

As is well known in the literature, there is a close connection between such corrections and asymptotic expansions for the moments of the ascending ladder height random variables associated with the random walk. Theorem 2 establishes an asymptotic expansion for the mean of the first strict ascending ladder height for random walks with light-tailed symmetric and continuous increments. As indicated in Section 2.5, this permits one to develop asymptotic expansions for all the moments of the ascending ladder heights (and for the limiting overshoot induced by the associated renewal process); see also Theorem 4.

This chapter is organized as follows. The main results are described in Section 2.1. A key connection to asymptotic expansions for the “short-time” behavior of the Cauchy process is made in Section 2.2. Section 2.3 shows how all the integrals required for our asymptotic expansion can be reduced to the short-time asymptotics of Section 2.2. Section 2.4 provides rigorous support for the remaining details in the argument used to compute the coefficients in the expansion. Section 2.5 summarizes the computation of the coefficients, and discusses an expansion related to the moments of the strict ascending ladder height. Finally, any proof that does not follow the statement of the corresponding result can be found in our final section, namely Section 2.6.

2.1 The Main Results

To state our main results, we adopt the parameterization utilized by Siegmund (1979). We assume throughout this chapter that the $X_i$’s have exponential moments, so that $E\exp(\theta X_1) < \infty$ for $\theta$ in a neighborhood containing the origin. For such $\theta$, define

\[ \psi(\theta) = \log E\exp(\theta X_1). \]

Then, for each such $\theta$, we can define the probability measure $P_\theta$ having the property that for $n \ge 0$,

\[ P_\theta(A) = E\left(\exp(\theta S_n - n\psi(\theta))\, 1_A\right) \]

for $A \in \sigma(S_j : 0 \le j \le n)$. As is well known, $S$ is again a random walk with iid increments under $P_\theta$, having common increment distribution

\[ P_\theta(X_1 \in dx) = \exp(\theta x - \psi(\theta))\, P(X_1 \in dx) \]

for $x \in \mathbb{R}$ (with mean $E_\theta X_1 = \psi'(\theta)$ and variance $\mathrm{Var}_\theta(X_1) = \psi''(\theta)$). Without any loss of generality, assume that $EX_1 = 0$ and $\mathrm{Var}(X_1) = 1$. Since $\psi(\cdot)$ is strictly convex on its domain of finiteness, $E_\theta X_1 < 0$ for $\theta < 0$. Thus, $P_\theta$ induces a random walk with negative drift when $\theta < 0$. We therefore focus on corrected approximations to $P_\theta(M > x)$ as $\theta \nearrow 0$.

A key step in the analysis of $P_\theta(M > x)$ is the judicious application of Wald’s likelihood ratio identity; see, for example, Siegmund (1985), p. 13. For $\theta_0$ in some interval of the form $(-\eta, 0)$, there exists a positive $\theta_1$ such that $\psi(\theta_0) = \psi(\theta_1)$. Set $\Delta = \theta_1 - \theta_0$. Note that parameterizing in terms of $\Delta$ is essentially equivalent to parameterization in terms of $\theta_0$ (or parameterization in terms of the drift $\mu = -\psi'(\theta_0)$). The likelihood ratio identity then asserts that

\[ P_{\theta_0}(\tau(x) < \infty) = E_{\theta_1}\exp\left(-(\theta_1 - \theta_0) S_{\tau(x)}\right) = \exp(-(\theta_1 - \theta_0)x)\, E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(x)), \tag{2} \]

where $R(x) = S_{\tau(x)} - x$ is the so-called “overshoot” at level $x$.

Suppose now that $X_1$ is strongly non-lattice, in the sense that for each $\delta > 0$,

\[ \inf_{|\lambda| > \delta} |1 - g(\lambda)| > 0, \tag{3} \]

where $g(\lambda) = E\exp(i\lambda X_1)$ is the characteristic function of $X_1$ (under $P_0$). Applying renewal theory to the random walk at strictly increasing ladder epochs then establishes that

\[ E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(x)) \to E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(\infty)) \tag{4} \]

as $x \to \infty$.

Siegmund (1979) showed that the renewal theorem can be applied uniformly for $\Delta < \eta$ (see also Chang (1992)). Hence, (2) yields

\[ P_{\theta_0}(M > x) = \exp(-\Delta x)\, E_{\theta_1}\exp(-\Delta R(\infty)) + o\left(\exp(-(\Delta + r)x)\right) \tag{5} \]

for some $r > 0$ (uniformly in $\theta_0 > -\eta/2$). In insurance risk theory, $\Delta$ is called the “adjustment coefficient” and the quantity $E_{\theta_1}\exp(-\Delta R(\infty))$ is known as the Cramer-Lundberg constant (cf. Asmussen (2001)).
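The conjugate pair $(\theta_0, \theta_1)$ with $\psi(\theta_0) = \psi(\theta_1)$, and hence $\Delta = \theta_1 - \theta_0$, rarely has a closed form, but it is easy to compute numerically. The following is a minimal sketch using bisection. The example distribution ($X = E - 1$ with $E$ a unit-rate exponential, so that $EX = 0$ and $\mathrm{Var}(X) = 1$), the bracket `upper`, and the function names are assumptions made for illustration.

```python
import math


def psi_shifted_exponential(theta):
    """psi(theta) = log E exp(theta*X) for X = E - 1, E ~ Exp(1); finite for theta < 1."""
    return -theta - math.log(1.0 - theta)


def conjugate_root(psi, theta0, upper, tol=1e-12):
    """Return theta1 in (0, upper) with psi(theta1) = psi(theta0), given theta0 < 0.

    Relies on psi being convex with psi(0) = 0 and psi'(0) = 0, so that psi is
    increasing on (0, upper) and bisection applies.
    """
    target = psi(theta0)
    lo, hi = 0.0, upper
    if psi(hi) < target:
        raise ValueError("upper bound too small: psi(upper) < psi(theta0)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    theta0 = -0.05
    theta1 = conjugate_root(psi_shifted_exponential, theta0, upper=0.9)
    print("theta1 =", theta1, "Delta =", theta1 - theta0)
```

For a symmetric increment distribution, $\psi$ is even and the routine simply returns $\theta_1 = -\theta_0$, so that $\Delta = 2\theta_1$.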

Relation (5) may alternatively be written as

\[ P_{\theta_0}(\Delta M > x) = \exp(-x)\, E_{\theta_1}\exp(-\Delta R(\infty)) + o\left(\exp(-rx/\Delta)\right) \tag{6} \]

where $o(\exp(-rx/\Delta))$ is uniform in $\theta_0 > -\eta/2$. Note that $\exp(-x)$ is precisely the level crossing probability of level $x/\Delta$ for a Brownian motion with drift $-\Delta/2$ and unit variance. Since $E_{\theta_0}X_1 \sim -\Delta/2$ as $\theta_0 \nearrow 0$, (6) provides rigorous support for the diffusion approximation (1). Furthermore, a correction to the diffusion approximation described at the beginning of this chapter can be obtained by developing an asymptotic expansion for $E_{\theta_1}\exp(-\Delta R(\infty))$.

Siegmund (1979) obtained his corrected diffusion approximation by showing that

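\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \exp(-\Delta\beta_1) + o\left(\Delta^2\right) \tag{7} \]

as $\Delta \downarrow 0$, where $\beta_1$ can be computed explicitly as

\[ \beta_1 = \frac{1}{6}EX_1^3 - \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{1}{\lambda^2}\,\mathrm{Re}\log\{2(1 - g(\lambda))/\lambda^2\}\, d\lambda. \tag{8} \]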

Note that by computing the single integral (8), Siegmund’s corrected diffusion approximation to the distribution of $M$ provides a parametric approximation that is valid for all random walks having negative drift sufficiently close to zero. Such parametric approximations are convenient in many application settings (e.g., in studying the behavior of a queue when utilization is close to 100%).
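As an illustration of how cheaply the integral in (8) can be evaluated, the sketch below computes $\beta_1$ for standard Gaussian increments, where $EX_1^3 = 0$ and $g(\lambda) = \exp(-\lambda^2/2)$. It assumes the normalization $1/(2\pi)$ for the integral over the whole real line, uses a plain midpoint rule on $(0, \text{cutoff})$, and handles the tail analytically; the cutoff, step count, and function name are choices made for this example. For Gaussian walks the constant is known to be approximately 0.5826, which gives a sanity check.

```python
import math


def beta1_gaussian(cutoff=10.0, steps=200_000):
    """Approximate beta_1 in (8) for standard normal increments (E X^3 = 0)."""
    h = cutoff / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h  # midpoint rule on (0, cutoff)
        # 2(1 - g(lam))/lam^2 with g(lam) = exp(-lam^2/2); expm1 keeps accuracy near 0.
        ratio = -2.0 * math.expm1(-0.5 * lam * lam) / (lam * lam)
        total += math.log(ratio) / (lam * lam) * h
    # Beyond the cutoff g(lam) is negligible, so the integrand is log(2/lam^2)/lam^2,
    # whose integral over (cutoff, inf) equals (log 2 - 2 log(cutoff) - 2)/cutoff.
    total += (math.log(2.0) - 2.0 * math.log(cutoff) - 2.0) / cutoff
    # The integrand is even, so the integral over R is twice the one over (0, inf).
    return -(1.0 / (2.0 * math.pi)) * 2.0 * total


if __name__ == "__main__":
    print(beta1_gaussian())
```

For non-Gaussian increments one would replace the characteristic function and add the third-moment term; the numerical treatment of the integral stays the same in spirit.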

Our main theorem shows that there is a full asymptotic expansion for

\[ r(\Delta) \triangleq \log E_{\theta_1}\exp(-\Delta R(\infty)). \]

Theorem 1 Suppose that $X_1$ has exponential moments and is strongly non-lattice. Then, $r(\cdot)$ (initially defined on $[0, \upsilon)$ for $\upsilon > 0$) admits an analytic extension on a neighborhood of the origin in the complex plane.

Remark An immediate consequence of Theorem 1 and the implicit function theorem is that the Cramer-Lundberg constant, namely $\exp(r(\Delta(\theta_0)))$, initially defined for all $\theta_0 < 0$ sufficiently close to zero, admits an analytic extension on a disc containing the origin in the complex plane.

According to Theorem 1,

\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \exp\left(\sum_{n=1}^{\infty} \beta_n \Delta^n\right), \tag{9} \]

where $\beta_1$ is given by (8) and $\beta_2 = 0$. (This latter equality follows from the fact that the error term in (7) is $o(\Delta^2)$.) Obviously, in order for (9) to be useful from an applied standpoint, we need a means of numerically computing the $\beta_n$’s. This issue is discussed in Section 2.5. We establish there that the $\beta_n$’s can be successively computed via a finite number of one-dimensional integrations reminiscent of the integral appearing in (8). Thus, the $\beta_n$’s can easily be computed, thereby yielding cheaply computable high-order parametric corrections to the diffusion approximation (1).
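To make the effect of the first correction concrete, consider the special case of standard Gaussian increments. This case is chosen here purely for illustration, since $\psi$, $\theta_1$ and $\Delta$ are then explicit; the comparison below uses only (1), (5) and (7).

```latex
% Gaussian increments: X_1 ~ N(0,1) under P_0 (illustrative special case)
\psi(\theta) = \tfrac{1}{2}\theta^{2}, \qquad
\theta_1 = -\theta_0, \qquad
\Delta = \theta_1 - \theta_0 = -2\theta_0 = 2\mu,
\quad \text{where } \mu = -\psi'(\theta_0) = -\theta_0 .

% Diffusion approximation (1) with \sigma^2 = 1, versus (5) combined with (7):
P_{\theta_0}(M > x) \approx e^{-2\mu x}
\qquad \text{versus} \qquad
P_{\theta_0}(M > x) \approx e^{-\Delta x}\, e^{-\Delta \beta_1} = e^{-2\mu (x + \beta_1)} .
```

In this case the first-order correction amounts to replacing the level $x$ by $x + \beta_1$. For instance, with $\mu = 0.1$, $x = 5$ and the commonly quoted Gaussian value $\beta_1 \approx 0.583$, the two expressions are roughly $e^{-1} \approx 0.368$ and $e^{-1.117} \approx 0.327$.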

The argument above also permits us to establish asymptotic expansions for certain ladder height quantities. As noted earlier, renewal theory applies to the random walk when sampled at strictly increasing ladder epochs. The renewal theorem invoked above actually establishes that

\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right)}{\Delta\, E_{\theta_1}S_{\tau_+}}, \tag{10} \]

where $\tau_+ = \inf\{n \ge 1 : S_n > 0\}$ is the first (strict) increasing ladder epoch (see Asmussen (1987)). In view of (2), it follows that

\[ 1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right) = P_{\theta_0}(\tau_+ = \infty). \tag{11} \]

Random walk duality (see, for example, p. 173 of Siegmund (1985)) implies that

\[ P_{\theta_0}(\tau_+ = \infty) = 1/E_{\theta_0}\tau_-, \tag{12} \]

where $\tau_- = \inf\{n \ge 1 : S_n \le 0\}$. If the $X_i$’s are symmetric rv’s with common continuous distribution function, $\Delta = 2\theta_1$ and $E_{\theta_0}\tau_- = E_{\theta_1}\tau_+$. Furthermore, (10) to (12) then imply that

\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{1}{2\theta_1 \left(E_{\theta_1}S_{\tau_+}\right)\left(E_{\theta_1}\tau_+\right)}. \]

In view of Wald’s identity, we then obtain the relation

\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{\psi'(\theta_1)}{2\theta_1 \left(E_{\theta_1}S_{\tau_+}\right)^2}. \]

As a consequence, Theorem 1 then yields a full asymptotic expansion for the expected ladder height $E_{\theta_1}S_{\tau_+}$. We record this result as our Theorem 2.

Theorem 2 Assume that $X_1$ has exponential moments and is symmetric with a continuous distribution function. Then,

\[ E_{\theta_1}S_{\tau_+} = \sqrt{\frac{\psi'(\theta_1)}{2\theta_1}}\, \exp\left(-\frac{1}{2}\sum_{m=0}^{\infty} \beta_{2m+1} (2\theta_1)^{2m+1}\right). \]

Given our above argument, the only remaining issue in proving Theorem 2 is establishing that $\beta_{2n} = 0$ for $n \ge 1$ in the presence of symmetry. This fact is proven in Section 2.6.

The most important device that we use to prove Theorems 1 and 2 is a convenient representation for $r(\Delta)$. This representation is a key idea in our mathematical development. To introduce our representation, put $\phi(\theta) = E\exp(\theta X_1)$ for $\theta \in \mathbb{R}$ and, for $z \in \mathbb{C}$, set $\gamma(z) = E\exp(zX_1)$. Note that $\phi$ is finite-valued on a neighborhood $\mathcal{N}$ of the origin and $\gamma$ is analytic on the strip $\{x + iy : x \in \mathcal{N}, y \in \mathbb{R}\}$. For non-negative $\theta \in \mathcal{N}$ and $b \in \mathbb{R}$, put

\[ \rho(\theta, b) = \log E_\theta \exp(-bR(\infty)). \]

Note $r(\Delta) = \rho(\theta_1, \Delta)$, where $\theta_1 = \theta_1(\Delta) > \theta_0(\Delta) = \theta_0$ is such that $\psi(\theta_1(\Delta)) = \psi(\theta_0(\Delta))$. Woodroofe (1979) showed that

\[ \rho(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda} \log\left(\frac{\gamma(\theta) - \gamma(\theta + i\lambda)}{-i\phi'(\theta)\lambda}\right) d\lambda; \tag{13} \]

see also Corollary 8.45 and Theorem 8.51 of Siegmund (1985). While (13) is convenient for many purposes, it presents difficulties in the current circumstances because of the singularity (in the logarithm) that arises when $\theta \searrow 0$. The following representation for $\rho(\theta, b)$ is free of such singularities.

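Theorem 3 Suppose $X_1$ has exponential moments and is strongly non-lattice. Then, for non-negative $\theta \in \mathcal{N}$ and $b > 0$,

\[ \rho(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda} \log\left(\frac{2\left(\gamma(\theta) - \gamma(\theta + i\lambda)\right)}{\lambda\left(\lambda - 2i\phi'(\theta)\right)}\right) d\lambda. \tag{14} \]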

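Siegmund’s computation of $\beta_1$ takes advantage of the fact that the first order behavior of $r(\Delta)$ should match that of

\[ s(\Delta) = \log E_0 \exp(-\Delta R(\infty)). \tag{15} \]

Since $s(\Delta) = \rho(0, \Delta)$, Theorem 3 implies that

\[ s(\Delta) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda} \log\left(2(1 - g(\lambda))\lambda^{-2}\right) d\lambda; \tag{16} \]

see also p. 226 of Siegmund (1985). We proceed to analyze $\rho(\theta, b)$ by writing $\rho(\theta, b) = s(b) + I(\theta, b)$. In view of both Theorem 3 and (16),

\[ I(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda} \log\left(\frac{\lambda\left(\gamma(\theta) - \gamma(\theta + i\lambda)\right)}{\left(\lambda - 2i\phi'(\theta)\right)\left(1 - g(\lambda)\right)}\right) d\lambda. \tag{17} \]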
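The passage from (14) and (16) to (17) is a one-line computation on the logarithms, written out below for convenience. It is a formal check that ignores the choice of branch of the logarithm, and it uses only the definitions of $\gamma$, $\phi$ and $g$ given above together with the normalization $EX_1 = 0$.

```latex
% At \theta = 0: \gamma(0) = 1, \gamma(i\lambda) = g(\lambda), \phi'(0) = E X_1 = 0,
% so the logarithm in (14) reduces to the one in (16):
\log\frac{2\left(\gamma(0) - \gamma(i\lambda)\right)}{\lambda\left(\lambda - 2i\phi'(0)\right)}
  = \log\left(2(1 - g(\lambda))\lambda^{-2}\right).

% Subtracting the logarithm in (16) from the one in (14) gives the integrand of (17):
\log\frac{2\left(\gamma(\theta) - \gamma(\theta + i\lambda)\right)}{\lambda\left(\lambda - 2i\phi'(\theta)\right)}
  - \log\frac{2(1 - g(\lambda))}{\lambda^{2}}
  = \log\frac{\lambda\left(\gamma(\theta) - \gamma(\theta + i\lambda)\right)}
             {\left(\lambda - 2i\phi'(\theta)\right)\left(1 - g(\lambda)\right)} .
```

Since both integrals share the kernel $-b/((b + i\lambda)i\lambda)$, the difference $\rho(\theta, b) - s(b)$ is the integral of this kernel against the last logarithm.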

In the next sections, we develop asymptotics, as $b \searrow 0$, appropriate to the integrals arising in (16) and (17). Such asymptotics can be used to provide asymptotic expansions for the moments (or, equivalently, the cumulants) of the limiting expected overshoot r.v. $R(\infty)$ under $P_\theta$ as $\theta \searrow 0$. Specifically, for $n \ge 1$, let

\[ \kappa_n(\theta) = (-1)^n \left.\frac{\partial^n}{\partial b^n}\rho(\theta, b)\right|_{b=0}. \]

Theorem 4 Assume that $X_1$ has exponential moments and is strongly non-lattice. Then (for all $n \ge 1$) $\kappa_n(\cdot)$, initially defined on $[0, \upsilon)$ for $\upsilon > 0$, can be extended to be an analytic function throughout a disc in the complex plane containing the origin.

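An important implication of Theorem 4 is that it can be directly applied to obtain complete asymptotics for the steady-state mean of the waiting time sequence, namely $E_{\theta_0}M$ ($= E_{\theta_0}W_\infty$). In particular, Siegmund (1979) shows (see also Theorem 6.7, p. 275, of Asmussen (1987)) that

\[
\begin{aligned}
E_{\theta_0}M &= \frac{E_{\theta_0}\left(S_{\tau_+};\ \tau_+ < \infty\right)}{P_{\theta_0}(\tau_+ = \infty)}
 = \frac{E_{\theta_1}S_{\tau_+}\exp\left(-\Delta S_{\tau_+}\right)}{1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right)} \\
&= \frac{E_{\theta_1}\left(1 - \Delta R(\infty)\right)\exp(-\Delta R(\infty))}{\Delta\, E_{\theta_1}\exp(-\Delta R(\infty))}
 = \frac{1}{\Delta} + \frac{\partial}{\partial b}\rho(\theta_1, \Delta).
\end{aligned}
\tag{18}
\]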

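Thus, since

\[ \frac{\partial}{\partial b}\rho(\theta, b) = \sum_{m=0}^{\infty} (-1)^{m+1} \kappa_{m+1}(\theta)\, \frac{b^m}{m!}, \]

it follows that Theorem 4 can be applied directly to provide the full asymptotic expansion for $E_{\theta_0}M$. Indeed, our analysis in Sections 2.2 to 2.4 yields an asymptotic expansion for $\kappa_n(\cdot)$ around zero, which in turn implies the expansion

\[ E_{\theta_0}M = \frac{1}{\Delta} + \sum_{m=0}^{n}\sum_{j=0}^{n-m} (-1)^{m+1} \kappa_{m+1}^{(j)}(0)\, \frac{\theta_1(\Delta)^j}{j!}\, \frac{\Delta^m}{m!} + O\left(\Delta^{n+1}\right) \]

valid for all $n \ge 0$. (The sign $(-1)^{m+1}$ is the one dictated by the definition of $\kappa_n$ above, since $\partial_b^{m+1}\rho(\theta, 0) = (-1)^{m+1}\kappa_{m+1}(\theta)$.) The explicit computation of the derivatives $\kappa_{m+1}^{(j)}(0)$, for $j, m \ge 0$, is discussed in Section 2.5.2.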

Finally, the analytic extension of $\kappa_n(\cdot)$ and $r(\cdot)$ is a consequence of the following result.

Proposition 1 If $X_1$ has exponential moments and a strongly non-lattice distribution, then $I(\cdot)$ (defined as in (17) on a domain containing $[0, \upsilon) \times [0, \upsilon)$ with $\upsilon > 0$) can be analytically extended throughout a disc containing the origin in $\mathbb{C} \times \mathbb{C}$.

Moreover, with the aid of Theorem 1 it follows easily (from (18) and the implicit function theorem) that $\Delta E_{\theta_0}M$ (initially defined for $\theta_0 < 0$) can be analytically extended (as a function of $\Delta(\theta_0)$) in a neighborhood of the origin in the complex plane.

2.2 Short-time Asymptotics for the Cauchy Process

The approach described in Section 2 suggests computing an asymptotic expansion

for r (∆) by developing appropriate expansions for s (∆) and I (θ1,∆). In this section, we will show how asymptotics for s (∆) can be obtained. Section 4 shows how

asymptotics for I (θ,∆) (and, as a result, also for I (θ1 (∆) ,∆)) can be reduced to

the types of integrals considered here.

Since s (b) is real for b positive, it follows that the integral of the imaginary part

of (16) must vanish. Hence, s (b) equals the integral of the real part of (16), so that

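$$\begin{aligned} s(b) ={}& \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^2+\lambda^2}\operatorname{Re}\log\left(2(1-g(\lambda))\lambda^{-2}\right)d\lambda \\ &-\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{\left(b^2+\lambda^2\right)\lambda}\operatorname{Im}\log(1-g(\lambda))\,d\lambda. \end{aligned}\tag{19}$$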

Both of the above integrals take the form

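$$K(b,f) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^2+\lambda^2}f(\lambda)\,d\lambda = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{1+\lambda^2}f(\lambda b)\,d\lambda \tag{20}$$

for suitably defined $f$. Note that if $Y = (Y(t) : t \ge 0)$ is a standard Cauchy process (so that $Y(1)$ is distributed as a standard Cauchy r.v.), $K(t,f)$ can then be represented as

$$K(t,f) = \frac{1}{2}E\left(f(Y(t)) \mid Y(0) = 0\right).$$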

Hence, representing K (t, f) as a power series in t is equivalent to the development of

short-time asymptotics of the Cauchy process. Such asymptotics are also of general


analytical interest, because of their relevance to Fourier analysis. Integrals of the

type (20) are closely related to “approximate identities of the Fejer type”; see p. 31

of Butzer (1971).
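As a numerical illustration of the representation of K (t, f) as one half of a Cauchy expectation, the following minimal Python sketch evaluates the second form of (20) using the substitution λ = tan u, under which dλ/(1 + λ²) = du. The test function f(x) = 1/(1 + x²), the midpoint rule, and the names `cauchy_kernel_integral`, `lorentzian` and `closed_form` are illustrative assumptions and are not part of the original development. For this particular f the integral is a convolution of two Cauchy kernels, so that K (b, f) = 1/(2(1 + b)).

```python
import math


def cauchy_kernel_integral(b, f, n=20000):
    """Approximate K(b, f) = (1/2pi) * int f(b*lam) / (1 + lam^2) dlam.

    With lam = tan(u) the weight dlam / (1 + lam^2) becomes du on
    (-pi/2, pi/2), which is integrated here with the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -math.pi / 2 + (i + 0.5) * h
        total += f(b * math.tan(u))
    return total * h / (2 * math.pi)


def lorentzian(x):
    """Illustrative test function f(x) = 1 / (1 + x^2) (an assumption)."""
    return 1.0 / (1.0 + x * x)


def closed_form(b):
    """K(b, f) for the test function above: 1 / (2 * (1 + b))."""
    return 1.0 / (2.0 * (1.0 + b))


if __name__ == "__main__":
    for b in (0.05, 0.5, 2.0):
        print(b, cauchy_kernel_integral(b, lorentzian), closed_form(b))
```

Expanding 1/(2(1 + b)) in powers of b gives 1/2 − b/2 + b²/2 − ..., which is the kind of short-time expansion that the next proposition describes in general.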

Let L be the space of functions f : R → C for which E |f (Y (1))| is finite and for which f is infinitely differentiable at zero. For f : R → C, let $\bar f$ be the symmetrization of f defined via $\bar f(x) = (f(x) + f(-x))/2$. The following result provides our short-time asymptotic expansion for K (t, f).

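Proposition 2 Suppose f belongs to L. Then, K (·, f) is infinitely differentiable at the origin and

$$K^{(n)}(0,f) = \begin{cases} (-1)^{n/2}\,\bar f^{(n)}(0)/2, & n \text{ even}, \\[4pt] (-1)^{(n-1)/2}\,n!\,\dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty}\left(T_{(n+1)/2}\bar f\right)(\lambda)\,d\lambda, & n \text{ odd}, \end{cases}$$

where, for $j \ge 0$, $T_j$ acts on even functions in L as

$$T_j f(\lambda) = \frac{f(\lambda)-\sum_{k=0}^{j-1}f^{(2k)}(0)\lambda^{2k}/(2k)!}{\lambda^{2j}}.$$

Furthermore, the family of linear operators $(T_n : n \ge 0)$ is a commutative semigroup, so that $T_{n+m} = T_nT_m$ for $m, n \ge 0$.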

Remark Note that the even derivatives of $f$ match those of $\bar f(\cdot)$. One might therefore be tempted to write the derivatives of $K(\cdot,f)$ in terms of integrals of $T_jf$ rather than $T_j\bar f$. The problem is that $T_jf$ typically has a singularity at the origin, unless the odd derivatives of $f$ at zero vanish. As a consequence, the integrals defining the derivative of $K(\cdot,f)$ may diverge if they were defined directly in terms of $f$. To avoid this, we use the symmetrization $\bar f$.

Proof of Proposition 2. The fact that each $T_n$ is a linear operator, and that the family forms a commutative semigroup, is straightforward. To obtain the formula for the derivatives of $K(\cdot,f)$ at the origin, note that $K(\cdot,f) = K(\cdot,\bar f)$, where $\bar f$ is the symmetrization of $f$ given by $\bar f(\cdot) = (f(\cdot)+f(-\cdot))/2$. Furthermore, if $f \in L$ then $\bar f$ is also in L. Observe that the Dominated Convergence Theorem implies that

$$K(t,\bar f) \to \bar f(0)/2$$


as $t \searrow 0$. This motivates writing

$$K(t,\bar f) = \bar f(0)/2 + \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^2+\lambda^2}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda.$$

Since $E|\bar f(Y(1))|$ is finite, it follows that the above integrand is uniformly dominated by an integrable function for $|\lambda|$ bounded away from zero. On the other hand, $\bar f(\lambda)-\bar f(0) = O(\lambda^2)$ as $\lambda \to 0$, so the integrand is also uniformly (in $t$) dominated for $|\lambda|$ small. Hence, the Dominated Convergence Theorem yields the conclusion that

$$K(t,f) = \bar f(0)/2 + \frac{t}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^2}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda + o(t)$$

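as $t \to 0$. In fact,

$$\begin{aligned} K(t,f) &= \bar f(0)/2 + \frac{t}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^2}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda - \frac{t^2}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^2+\lambda^2}\,\frac{\bar f(\lambda)-\bar f(0)}{\lambda^2}\,d\lambda \\ &= \bar f(0)/2 + \frac{t}{2\pi}\int_{-\infty}^{\infty}\left(T_1\bar f\right)(\lambda)\,d\lambda - t^2K\left(t,T_1\bar f\right). \end{aligned}\tag{21}$$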
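The step leading to (21) rests on an elementary splitting of the Cauchy kernel. Spelled out as a sketch (with $\bar f$ denoting the symmetrization of $f$, $T_1\bar f(\lambda) = (\bar f(\lambda)-\bar f(0))/\lambda^2$, and $K$ carrying the normalization $1/(2\pi)$ as in (20)):

```latex
\frac{t}{t^{2}+\lambda^{2}}
  = \frac{t}{\lambda^{2}}
  - t^{2}\,\frac{t}{t^{2}+\lambda^{2}}\,\frac{1}{\lambda^{2}},
\qquad\text{so that}\qquad
\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^{2}+\lambda^{2}}
   \bigl(\bar f(\lambda)-\bar f(0)\bigr)\,d\lambda
  = \frac{t}{2\pi}\int_{-\infty}^{\infty}\bigl(T_{1}\bar f\bigr)(\lambda)\,d\lambda
  - t^{2}\,K\bigl(t,T_{1}\bar f\bigr).
```

Both integrals on the right are finite because $T_1\bar f$ is bounded near the origin and decays like $\lambda^{-2}$ at infinity whenever $\bar f$ is bounded; the latter is an extra assumption made only for this illustration.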

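If we apply (21) recursively to $K(\cdot,T_1\bar f)$, $K(\cdot,T_2\bar f)$, ..., we find that $K(t,f)$ satisfies

$$K(t,f) = \sum_{j=0}^{n}(-1)^j\left(t^{2j}\frac{\left(T_j\bar f\right)(0)}{2} + \frac{t^{2j+1}}{2\pi}\int_{-\infty}^{\infty}\left(T_{j+1}\bar f\right)(\lambda)\,d\lambda\right) + (-1)^{n+1}t^{2(n+1)}K\left(t,T_{n+1}\bar f\right),$$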

yielding the result.
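The one-step identity (21) that drives the proof can be checked numerically. The sketch below does so for the illustrative (assumed) test function f(x) = exp(−x²)(1 + x), whose symmetrization is exp(−x²); all function names, the quadrature rule and the choice t = 0.3 are assumptions made for this example only.

```python
import math


def K(t, f, n=20000):
    """K(t, f) = (1/2pi) * int f(t*lam) / (1 + lam^2) dlam, via lam = tan(u)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -math.pi / 2 + (i + 0.5) * h
        total += f(t * math.tan(u))
    return total * h / (2 * math.pi)


def integral_over_line(g, n=20000):
    """int g(lam) dlam over the real line, via lam = tan(u), dlam = du / cos(u)^2."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -math.pi / 2 + (i + 0.5) * h
        total += g(math.tan(u)) / math.cos(u) ** 2
    return total * h


def f(x):
    """Illustrative, non-even test function (an assumption)."""
    return math.exp(-x * x) * (1.0 + x)


def f_bar(x):
    """Symmetrization (f(x) + f(-x)) / 2, equal to exp(-x^2) here."""
    return 0.5 * (f(x) + f(-x))


def T1_f_bar(lam):
    """(T_1 f_bar)(lam) = (f_bar(lam) - f_bar(0)) / lam^2."""
    if abs(lam) < 1e-6:
        return -1.0  # limit f_bar''(0) / 2 for this particular test function
    return (f_bar(lam) - f_bar(0.0)) / (lam * lam)


def check_identity_21(t):
    """Return both sides of (21) for the test function above."""
    lhs = K(t, f)
    rhs = (f_bar(0.0) / 2
           + t / (2 * math.pi) * integral_over_line(T1_f_bar)
           - t * t * K(t, T1_f_bar))
    return lhs, rhs


if __name__ == "__main__":
    print(check_identity_21(0.3))
```

For this test function the left-hand side is also available in closed form, K (t, f) = exp(t²) erfc(t)/2, which gives an independent check of the quadrature.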

With Proposition 2 in hand, our asymptotic expansion for s (∆) follows immediately.

2.3 Reducing the Analysis to Cauchy Process Short-time Asymptotics

As we discussed earlier in Section 2, the backbone of our asymptotic analysis for

r (∆) is given by the relation ρ (θ, b) = s (b)+ I (θ, b). In Section 3, we studied how to


develop asymptotics for s (b). In this section, we will study how to reduce the analysis

of the remaining term I (θ, b) to that already studied in Section 3. Recall that

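$$I(\theta,b) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-v(\theta,\lambda)\right)d\lambda,$$

where

$$v(\theta,\lambda) = \frac{\lambda}{\lambda-2i\phi'(\theta)}\left(1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)}\right).$$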

A natural strategy to now follow is to express the logarithm as a power series in

v (θ,λ), followed by an expansion for v as

$$v(\theta,\lambda) = \sum_{n=0}^{\infty}v_n(i\lambda)\frac{\theta^n}{n!}. \tag{22}$$

One could then apply Proposition 2 (as for (19)) to the real and imaginary parts in

each of the resulting integrals that would appear as coefficients for θn. However, the

expansion (22) requires that the function v be expressible as a joint power series in

non-negative powers of θ and λ. Unfortunately, the presence of the term (λ − 2iφ′(θ)) in the denominator of v precludes the existence of such a joint power series.

To avoid this difficulty we write v as

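$$v(\theta,\lambda) = \frac{\lambda H(\theta,\lambda)}{\lambda-2i\phi'(\theta)},$$

so that

$$H(\theta,\lambda) = 1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)}.$$

The function $H(\cdot)$ is well behaved because the term $2i\phi'(\theta)/\lambda$ controls the behavior of $(\gamma(\theta)-\gamma(\theta+i\lambda))(1-g(\lambda))^{-1}$ as $\lambda \searrow 0$. As a consequence, $H(\cdot)$ can be smoothly defined at $\lambda = 0$ via the relation $H(\theta,0) = 1-\phi''(\theta)$. Our next result describes the analytic structure of $H(\cdot)$.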

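Proposition 3 Let $D_{\eta/2} \triangleq \{z \in \mathbb{C} : |z| < \eta/2\}$ and, for $(z_1,z_2) \in D_{\eta/2}\times\left(D_{\eta/2}\cup\mathbb{R}\right)$, put

$$H(z_1,z_2) = 1-\frac{2i\gamma'(z_1)}{z_2}-\frac{\gamma(z_1)-\gamma(z_1+iz_2)}{1-\gamma(iz_2)}.$$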


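Then, for every $z_1 \in D_{\eta/2}$, the function $H(z_1,\cdot)$ is analytic on $D_{\eta/2}$. Similarly, for every $z_2 \in D_{\eta/2}\cup\mathbb{R}$, the function $H(\cdot,z_2)$ is analytic on $D_{\eta/2}$. Finally, $H(z_1,\lambda)$ can be represented as an absolutely and uniformly convergent series, for $\lambda \in \mathbb{R}$ and $z_1 \in D_{\eta/2}$, namely

$$H(z_1,\lambda) = \sum_{k=1}^{\infty}h_k(i\lambda)\frac{z_1^k}{k!}, \tag{23}$$

where $h_k(i\lambda) \triangleq \left(\gamma^{(k)}(i\lambda)-\mu_k\right)/(1-g(\lambda))-\left(2i\mu_{k+1}/\lambda\right)$. In particular, this implies that

$$\sup_{\lambda\in\mathbb{R}}|H(z_1,\lambda)| \to 0$$

as $z_1 \to 0$.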

Remark Note that the function $\widetilde H(z_1,z_2) \triangleq H(z_1,z_2)-H(z_1,0) = H(z_1,z_2)-1+\gamma''(z_1)$ satisfies the same properties stated for $H(\cdot)$ in Proposition 3 with $\widetilde h_k(i\lambda) \triangleq h_k(i\lambda)+\mu_{k+2}$; this follows from the analyticity of $\gamma(\cdot)$ and the fact that $\gamma''(0) = 1$. Moreover, observe that completely analogous analytic properties apply to the function $\widetilde G(z_1,z_2) = (\gamma''(z_1))^{-1}\widetilde H(z_1,z_2)$ defined on $D_{\eta/2}\times\left(D_{\eta/2}\cup\mathbb{R}\right)$.

Note that $|\lambda/(\lambda-2i\phi'(\theta))| = |\lambda|\left(\lambda^2+(2\phi'(\theta))^2\right)^{-1/2} \le 1$. It follows from Proposition 3 that for $r > 0$ small enough,

$$\sup_{\theta\in(0,r)}\ \sup_{\lambda\in\mathbb{R}}|v(\theta,\lambda)| < 1.$$

Therefore, for all $0 < \theta < r$, we can proceed to expand $\log(1-v)$ in powers of $v$ and formally integrate each term in the obtained expansion to express $I(\theta,b)$ in terms of integrals of the form

$$J_k(a,b,f) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\left(\frac{i\lambda}{a+i\lambda}\right)^k f(i\lambda)\,d\lambda, \tag{24}$$

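where $a, b > 0$, $f(i\cdot) \in L$ and $k \ge 0$. Because $J_0(a,b,f) \triangleq J_0(b,f)$ can be written as

$$\begin{aligned} J_0(b,f) ={}& \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^2+\lambda^2}\left(\operatorname{Re}f(i\lambda)-\lambda^{-1}\operatorname{Im}f(i\lambda)\right)d\lambda \\ &+\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^2+\lambda^2}\left(\operatorname{Im}f(i\lambda)+\lambda^{-1}\operatorname{Re}f(i\lambda)\right)d\lambda, \end{aligned}\tag{25}$$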


it follows that asymptotics for J0 can be computed in terms of asymptotics for the

K-type integrals that are the subject of Proposition 2. In view of the development leading

to (24), a key to our asymptotic expansion for I (θ, b) is therefore the reduction of

integrals Jk (a, b, f) for k ≥ 1 to integrals of the form J0 (b, f). A key identity in

establishing this reduction step is the following.

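Lemma 1 Suppose that $a, b \ge 0$. Then, for $m, n \ge 0$,

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{(i\lambda)^{m+1}}{(a+i\lambda)^{m+n+1}}\,d\lambda = 0.$$

Furthermore,

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)\,i\lambda}\log(1+ai\lambda)\,d\lambda = 0.$$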

Proof. For $a, b > 0$, let the function of a complex variable $f(\cdot)$ be defined as

$$f(z) = \frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}.$$

Consider the contour (in the clockwise direction) $C(r) = C_1(r)+C_2(r)$, where $C_1(r) = \{re^{i\tau} : -\pi \le \tau \le 0\}$ and $C_2(r) = \{\lambda : \lambda \in [-r,r]\}$. Since $f$ is (complex) analytic on $\operatorname{Im}(z) \le 0$, Cauchy's theorem yields

$$\frac{1}{2\pi}\int_{C(r)}\frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}\,dz = 0.$$

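This, in turn, implies that

$$\begin{aligned} \frac{1}{2\pi}\int_{-r}^{r}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{(i\lambda)^{m+1}}{(a+i\lambda)^{m+n+1}}\,d\lambda &= \frac{-1}{2\pi}\int_{C_1(r)}\frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}\,dz \\ &= \frac{-1}{2\pi}\int_{-\pi}^{0}\frac{b\,(ir)^{m+1}e^{(m+1)\tau i}}{\left(b+ire^{\tau i}\right)\left(a+ire^{\tau i}\right)^{m+n+1}}\,d\tau. \end{aligned}$$

Letting $r \to \infty$, we obtain (by virtue of dominated convergence) the first part of the lemma. For the second part, let us define

$$f_1(a) = \frac{1}{2\pi}\int_{-\infty}^{\infty}-\left((1+i\lambda)\,i\lambda\right)^{-1}\log(1+ai\lambda)\,d\lambda.$$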


A routine dominated convergence argument, combined with our previous analysis, shows that

$$f_1'(a) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)(1+ai\lambda)}\,d\lambda = 0.$$

The proof of the lemma is completed by observing that $f_1(a) \to 0$ as $a \searrow 0$.
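The first identity of Lemma 1 lends itself to a quick numerical sanity check. The sketch below is a minimal illustration, not part of the original argument: the function name `lemma1_integral`, the parameter values and the quadrature (substitution λ = tan u followed by the midpoint rule) are all assumptions made for the example.

```python
import math


def lemma1_integral(a, b, m, n, num_nodes=4000):
    """Approximate (1/2pi) * int -b/((b+i*lam)*i*lam) * (i*lam)^(m+1)/(a+i*lam)^(m+n+1) dlam.

    Uses lam = tan(u), dlam = du / cos(u)^2, and the midpoint rule on
    (-pi/2, pi/2). num_nodes is kept even so that lam = 0 is never a node.
    """
    h = math.pi / num_nodes
    total = 0j
    for i in range(num_nodes):
        u = -math.pi / 2 + (i + 0.5) * h
        il = 1j * math.tan(u)
        integrand = -b / ((b + il) * il) * il ** (m + 1) / (a + il) ** (m + n + 1)
        total += integrand / math.cos(u) ** 2
    return total * h / (2 * math.pi)


if __name__ == "__main__":
    for (m, n) in ((0, 0), (2, 1), (3, 0)):
        print(m, n, abs(lemma1_integral(0.7, 1.3, m, n)))
```

All poles of the integrand lie in the upper half plane (at λ = ia and λ = ib), which is why closing the contour in the lower half plane, as in the proof, gives zero; the printed magnitudes should be at the level of rounding error.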

Let $L_0$ be the subspace of L (recall the definition of L preceding Proposition 2) for which $f(0) = 0$. Also, for $f \in L$, let $\widetilde f(\cdot) = f(\cdot)-f(0)$ ($\in L_0$). We are now ready to offer a proposition that reduces the evaluation of the integrals $J_k(a,b,f)$ for $k \ge 1$ to that of integrals such as $J_0(b,f)$, thereby permitting the application of the short-time asymptotics of Section 3.

Proposition 4 Suppose that $f \in L_0$. Then, for $k \ge 1$ and $n \ge 0$,

$$J_k(a,b,f) = J_0\left(b,\ \sum_{j=0}^{n}\binom{k+j-1}{j}(-a)^j\,\widetilde T_j\widetilde f\right)+b\,o(a^n), \tag{26}$$

where the linear operator $\widetilde T_j$ ($j \ge 0$) acts on functions $\widetilde f(i\cdot) \in L_0$ as

$$\left(\widetilde T_j\widetilde f\right)(i\lambda) = \frac{\widetilde f(i\lambda)-\sum_{m=1}^{j}\widetilde f^{(m)}(0)\,(i\lambda)^m/m!}{(i\lambda)^j}.$$

Moreover, the family of operators $(\widetilde T_j : j \ge 0)$ constitutes a commutative semigroup, so that $\widetilde T_m\widetilde T_n = \widetilde T_{m+n}$.

Remark As for Proposition 2, one might be tempted to express the right-hand side of (26) in terms of $f$ rather than $\widetilde f$. However, $\widetilde T_jf$ is generally non-integrable with respect to the kernel that defines $J_0$. Finally, note that, if all integrals are interpreted in terms of Cauchy principal value, one can apply Proposition 4 directly to functions that do not vanish at the origin by defining $J_0(b,f) = J_0(b,f(\cdot)-f(0))+f(0)/2$.

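Proof of Proposition 4. That $(\widetilde T_j : j \ge 0)$ is a family of linear operators forming a commutative semigroup is immediate. By virtue of Lemma 1, it follows that

$$J_m(a,b,f) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\left(\frac{i\lambda}{a+i\lambda}\right)^m\widetilde f(i\lambda)\,d\lambda.$$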


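Observe that $\widetilde f(i\cdot)$ is now in the domain of the operators $\widetilde T_n$, $n \ge 1$. On the other hand, we can write

$$J_m\left(a,b,\widetilde f\right) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\widetilde f(i\lambda)\,d\lambda+\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\widetilde f(i\lambda)\left(\left(\frac{i\lambda}{a+i\lambda}\right)^m-1\right)d\lambda. \tag{27}$$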

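Note that

$$\left(\frac{i\lambda}{a+i\lambda}\right)^m-1 = -\sum_{k=1}^{m}\binom{m}{k}\frac{a^k(i\lambda)^{m-k}}{(a+i\lambda)^m}.$$

Once again, by appealing to Lemma 1 and to the definition of $\widetilde T_k\widetilde f$, it follows that, for $m \ge k \ge 1$,

$$a^kJ_m\left(a,b,\widetilde T_k\widetilde f\right) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{a^k(i\lambda)^{m-k}}{(a+i\lambda)^m}\,\widetilde f(i\lambda)\,d\lambda.$$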

Combining this observation with (27), we obtain

$$J_m(a,b,f) = J_m\left(a,b,\widetilde f\right) = J_0\left(b,\widetilde f\right)-\sum_{k=1}^{m}\binom{m}{k}a^kJ_m\left(a,b,\widetilde T_k\widetilde f\right). \tag{28}$$

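The recursive relation (28) can now be expressed in operator form as

$$J_m\left(a,b,\widetilde f\right) = J_0\left(b,\widetilde f\right)+J_m\left(a,b,\left(1-\left(1+a\widetilde T\right)^m\right)\widetilde f\right).$$

(Here, we have used the semigroup property of the family of operators $\widetilde T_m$.) Iterating the previous expression, we arrive at

$$\begin{aligned} J_m(a,b,f) = J_m\left(a,b,\widetilde f\right) &= \sum_{k=0}^{n}J_0\left(b,\left(1-\left(1+a\widetilde T\right)^m\right)^k\widetilde f\right)+J_m\left(a,b,\left(1-\left(1+a\widetilde T\right)^m\right)^{n+1}\widetilde f\right) \\ &= J_0\left(b,\ \sum_{j=0}^{n}\binom{m+j-1}{j}(-a)^j\,\widetilde T_j\widetilde f\right)+b\,o(a^n), \end{aligned}\tag{29}$$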


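where the last equality in (29) has been obtained by using the semigroup property of the operators $\widetilde T_m$ and by noting that the coefficient of $a^j\widetilde T_j$ in (29) (for $j \le n$) must match that of $x^j$ in the formal expansion of

$$p(x) = \frac{1-\left(1-(1+x)^m\right)^{n+1}}{1-\left(1-(1+x)^m\right)} = \frac{1}{(1+x)^m}+O\left(x^{n+1}\right).$$

That the error term in (29) is $b\,o(a^n)$ comes from the fact that $aJ_m(a,b,\widetilde f) = b\,o(1)$ as $a \searrow 0$, which can be seen as follows:

$$\left|aJ_m\left(a,b,\widetilde f\right)\right| = \left|\frac{a}{2\pi}\int_{-\infty}^{\infty}\frac{-b\,(i\lambda)^{m-1}\widetilde f(i\lambda a)}{(b+i\lambda a)(1+i\lambda)^m}\,d\lambda\right| \le \frac{b}{2\pi}\int_{-\infty}^{\infty}\left|\frac{\widetilde f(i\lambda a)}{\lambda(1+i\lambda)}\right|d\lambda = b\,o(1),$$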

where the last step follows by a dominated convergence argument. This concludes

the proof of the proposition.
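The coefficient matching used in the last step of the proof can be verified with exact integer arithmetic. The following sketch (function names are illustrative assumptions) expands the truncated geometric sum p(x) = Σ_{k=0}^{n} (1 − (1 + x)^m)^k up to degree n and compares it with the binomial coefficients appearing in (26) and (29).

```python
from math import comb


def poly_mul(p, q, deg):
    """Multiply two coefficient lists, discarding terms of degree > deg."""
    out = [0] * (deg + 1)
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j > deg:
                break
            out[i + j] += pi * qj
    return out


def truncated_p(m, n):
    """Coefficients of x^0..x^n in sum_{k=0}^{n} (1 - (1 + x)^m)^k."""
    q = [-comb(m, i) for i in range(n + 1)]  # -(1 + x)^m, truncated
    q[0] += 1                                # 1 - (1 + x)^m has no constant term
    total = [0] * (n + 1)
    power = [1] + [0] * n                    # q^0
    for _ in range(n + 1):
        total = [t + c for t, c in zip(total, power)]
        power = poly_mul(power, q, n)
    return total


def negative_binomial_coeffs(m, n):
    """Coefficients of x^0..x^n in (1 + x)^(-m): (-1)^j * C(m + j - 1, j)."""
    return [(-1) ** j * comb(m + j - 1, j) for j in range(n + 1)]


if __name__ == "__main__":
    print(truncated_p(3, 5))
    print(negative_binomial_coeffs(3, 5))
```

Because 1 − (1 + x)^m has no constant term, its k-th power starts at degree k, so truncating the geometric sum at k = n leaves the coefficients of degree at most n unchanged.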

Proposition 4, combined with our development for K (t, ·) in Section 3, provides all the elements required to develop asymptotic expansions for integrals of the form

Jm (a, b, f). Since, as discussed earlier at a formal level, I (θ, b) can be expressed as a

sum of terms such as Jm (a, b, f), it follows that the whole asymptotic analysis of r (∆)

and ρ (θ, b) can be reduced to that of Section 3. A complete rigorous justification for

this representation for I (θ, b) is one of the main issues discussed in Section 5.

2.4 An Asymptotic Expansion for I (θ, b)

In Sections 3 and 4, we have developed the tools required to obtain asymptotic expansions, in powers of b, for s (b) and I (θ, b). We have done this by showing that

the problem can be reduced to short-time asymptotics for the Cauchy process. The

purpose of this section is to make rigorous the expansion for I (θ, b), in powers of θ,

that was outlined in Section 4.


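Noting the important role that functions vanishing at the origin play in Proposition 4, it seems appropriate to define

$$\widetilde H(\theta,\lambda) \triangleq H(\theta,\lambda)-H(\theta,0) = H(\theta,\lambda)-1+\phi''(\theta) = \sum_{k=1}^{\infty}\widetilde h_k(i\lambda)\frac{\theta^k}{k!}, \tag{30}$$

where $\widetilde h_k(i\lambda) \triangleq \left(\gamma^{(k)}(i\lambda)-\mu_k\right)/(1-g(\lambda))-\left(2i\mu_{k+1}/\lambda\right)+\mu_{k+2}$ is such that $\widetilde h_k(0) = 0$. The next proposition shows how a simplified expression for $I(\theta,b)$ in terms of $\widetilde H$ can be obtained.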

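Proposition 5 Define $\Psi(\theta) = 2\phi'(\theta)/\phi''(\theta)$. Then,

$$I(\theta,b) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{\phi''(\theta)^{-1}\lambda\widetilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)}\right)d\lambda. \tag{31}$$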

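Proof. Just note that

$$\begin{aligned} \log(1-v(\theta,\lambda)) &= \log\left(1-\frac{\lambda\widetilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)\phi''(\theta)}-\frac{\lambda(1-\phi''(\theta))}{\lambda-i\Psi(\theta)\phi''(\theta)}\right) \\ &= \log\left(\frac{i\lambda/\Psi(\theta)+1}{i\lambda\phi''(\theta)/\Psi(\theta)+1}\right)+\log\left(1-\frac{\phi''(\theta)^{-1}\lambda\widetilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)}\right). \end{aligned}$$

Thus, (31) follows from Lemma 1 by noting that

$$\begin{aligned} &\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{i\lambda/\Psi(\theta)+1}{i\lambda\phi''(\theta)/\Psi(\theta)+1}\right)d\lambda \\ &\qquad= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)\,i\lambda}\log\left(\frac{i\lambda b/\Psi(\theta)+1}{i\lambda b\phi''(\theta)/\Psi(\theta)+1}\right)d\lambda = 0. \end{aligned}$$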

Additional simplifications reduce the complexity of the expansion for I (θ, b). In

particular, the expression for the integral J0 (b, f) simplifies when it is known that J0 (b, f) is real; see (25). Fortunately, our analysis of I (θ, b) gives rise to such real-valued J0 (b, f)’s. To establish this result, we introduce the following family of functions.

Definition A function f : R → C is said to have the “parity property” if Re f (i·) and Im f (i·) are even and odd functions respectively. The class of functions possessing the parity property will be denoted by P.


Note that if f (i·) is in the domain of J0 (b, ·) and f possesses the parity property, then we must have that Im J0 (b, f) = 0 (since it corresponds to an integral on the

real line of an odd integrable function). The family of functions enjoying the parity

property has certain closure characteristics that will be useful for the rest of our

development. These closure properties are discussed in the next proposition.

Proposition 6 The class P of functions forms an algebra on R (i.e. a vector space on R that is closed under product of functions). In addition, if $f \in P$, then $1/f(\cdot)$ (defined on its domain of finiteness) also possesses the parity property. Finally, if $f$ is in the domain of $\widetilde T$ and has the parity property, then $\widetilde Tf \in P$.

Proof. Certainly P constitutes a vector space on R and it is almost immediate that $\widetilde T$ preserves the parity property. Now, if $f_1, f_2 \in P$, then $\operatorname{Re}(f_1f_2) = \operatorname{Re}(f_1)\operatorname{Re}(f_2)-\operatorname{Im}(f_1)\operatorname{Im}(f_2)$ must clearly be even. Similarly, $\operatorname{Im}(f_1f_2)$ must be odd, which implies that $f_1f_2 \in P$. Finally, note that

$$\frac{1}{f} = \frac{\operatorname{Re}(f)}{\operatorname{Re}(f)^2+\operatorname{Im}(f)^2}-i\,\frac{\operatorname{Im}(f)}{\operatorname{Re}(f)^2+\operatorname{Im}(f)^2},$$

which immediately implies that $\operatorname{Re}1/f$ and $\operatorname{Im}1/f$ are even and odd functions respectively and thus $1/f \in P$.
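The closure properties of Proposition 6 can be illustrated on concrete functions. In the sketch below, the two moment generating functions (of an exponential and of a Gaussian law), the sample points and the helper name `has_parity_property` are assumptions chosen for illustration; a check at finitely many points is of course only a plausibility test, not a proof.

```python
import cmath


def has_parity_property(f, points=(0.3, 1.1, 2.7), tol=1e-12):
    """Check at sample points that Re f(i*lam) is even and Im f(i*lam) is odd."""
    for lam in points:
        plus, minus = f(1j * lam), f(-1j * lam)
        if abs(plus.real - minus.real) > tol or abs(plus.imag + minus.imag) > tol:
            return False
    return True


def gamma_exp(z):
    """Moment generating function of an Exp(1) law: 1 / (1 - z)."""
    return 1.0 / (1.0 - z)


def gamma_gauss(z):
    """Moment generating function of a N(0.3, 1) law: exp(0.3 z + z^2 / 2)."""
    return cmath.exp(0.3 * z + z * z / 2.0)


def product(z):
    return gamma_exp(z) * gamma_gauss(z)


def reciprocal(z):
    return 1.0 / gamma_exp(z)


def real_combination(z):
    return 2.0 * gamma_exp(z) - 0.5 * gamma_gauss(z)


def rotated(z):
    """Multiplication by i swaps the roles of Re and Im, so parity fails."""
    return 1j * gamma_exp(z)


if __name__ == "__main__":
    for g in (gamma_exp, gamma_gauss, product, reciprocal, real_combination, rotated):
        print(g.__name__, has_parity_property(g))
```

The last example shows why P is an algebra over R only: multiplying by a non-real scalar destroys the parity property.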

We now present the main result of this section, which yields an expansion for

I (θ, b) in powers of θ and coefficients involving only integrals of the form J0 (b, f)

with f satisfying the parity property.

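Proposition 7 For $k, m \ge 1$, let the coefficient multiplying $\theta^k$ in the power series representation of $\widetilde G(\theta,\lambda)^m \triangleq \left(\phi''(\theta)^{-1}\widetilde H(\theta,\lambda)\right)^m$ be defined as $\widetilde g_{k,m}(i\lambda)$. Then, $\widetilde g_{k,m}(\cdot) \in P$ can be recursively computed via

$$\widetilde g_{k,m+1}(i\lambda) = \sum_{n=0}^{k}\widetilde g_{n+1,m}(i\lambda)\,\widetilde g_{k-n,1}(i\lambda).$$

Consider $b > 0$ and let $\chi(\theta) = -\Psi(\theta)/\theta$. Then,

$$I(\theta,b) = \sum_{m=1}^{n}\theta^m\sum_{j=0}^{m-1}\chi(\theta)^jJ_0(b,E_{j,m})+b\,o(\theta^n), \tag{32}$$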


where $E_{j,m}(i\lambda)$, defined for $0 \le j \le m-1$ and $m \ge 1$ as

$$E_{j,m} = -\sum_{k=0}^{m-j-1}\frac{1}{m-j-k}\binom{m-k-1}{j}\widetilde T_j\,\widetilde g_{k,m-j-k},$$

satisfies the parity property.

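Proof. Since $\gamma(i\lambda) = E_0\cos(\lambda X_1)+iE_0\sin(\lambda X_1)$, it follows that $\gamma^{(k)}(i\lambda)-\mu_k \in P$, as does the function $1-\gamma(i\lambda)$. By the closure properties described in Proposition 6, we may easily conclude that $\widetilde g_{k,1} \in P$. A second application of Proposition 6 shows that $\widetilde g_{k,m} \in P$ and $E_{j,m} \in P$. The recursive expression provided for $\widetilde g_{k,m}$ follows from standard convolution operations of power series. For $n \ge 1$, define

$$\widetilde G_n(\theta,\lambda) \triangleq \sum_{k=1}^{n}\widetilde g_{k,1}(i\lambda)\frac{\theta^k}{k!}$$

and

$$I_n(\theta,b) \triangleq \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{\widetilde G_n(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right)d\lambda.$$

Note that

$$\log\left(1-\frac{\widetilde G(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right)-\log\left(1-\frac{\widetilde G_n(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right) = \log\left(1-\frac{\lambda}{\lambda-i2\phi'(\theta)}\,\frac{\widetilde G(\theta,\lambda)-\widetilde G_n(\theta,\lambda)}{1-\widetilde G_n(\theta,\lambda)\lambda\left(\lambda-i2\phi'(\theta)\right)^{-1}}\right).$$

On the other hand, from the remark following Proposition 3 and because $\log(1+z) = z(1+\varepsilon(z))$ for $z \in \mathbb{C}$, where $|\varepsilon(z)| \le |z|$ for $|z| \le 1/2$ (see Proposition 8.46, Breiman (1992)), we can see that there exists a constant $B > 0$ such that

$$|I(b,\theta)-I_n(b,\theta)| \le B\int_{-\infty}^{\infty}\frac{b\left|\widetilde G_n(\theta,\lambda)-\widetilde G(\theta,\lambda)\right|}{\left(b^2+\lambda^2\right)^{1/2}\left(\lambda^2+(2\phi'(\theta))^2\right)^{1/2}}\,d\lambda.$$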


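Essentially by making the change of variables $u = \lambda\theta$ we then see that, for all $\theta \in (0,\delta)$ for some $\delta > 0$, we have

$$|I(b,\theta)-I_n(b,\theta)| \le Bb\theta^n\int_{-\infty}^{\infty}\frac{\left|\widetilde G_n(\theta,\lambda\theta)-\widetilde G(\theta,\lambda\theta)\right|}{\theta^{n+1}|\lambda|\left(\lambda^2+1\right)^{1/2}}\,d\lambda.$$

It follows easily from the previous inequality and the Dominated Convergence Theorem that

$$I(b,\theta)-I_n(b,\theta) = b\,o(\theta^n).$$

Using the expansion of $\log(1+z)$ at $z = 0$ and a similar dominated convergence argument, we can write

$$\begin{aligned} I_n(b,\theta) &= \frac{-1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\sum_{m=1}^{n}\frac{1}{m}\left(\frac{i\lambda}{i\lambda+\Psi(\theta)}\right)^m\widetilde G_n(\theta,\lambda)^m\,d\lambda+b\,o(\theta^n) \\ &= \frac{-1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\sum_{m=1}^{n}\frac{1}{m}\left(\frac{i\lambda}{i\lambda+\Psi(\theta)}\right)^m\sum_{k=0}^{n-m}\theta^{k+m}\,\widetilde g_{k,m}(i\lambda)\,d\lambda+b\,o(\theta^n). \end{aligned}\tag{33}$$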

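Using Proposition 4 and (33), we obtain that

$$\begin{aligned} I(\theta,b) &= -\sum_{m=1}^{n}\theta^m\sum_{k=0}^{m-1}J_{m-k}\left(\Psi(\theta),b,\frac{\widetilde g_{k,m}}{m-k}\right)+b\,o(\theta^n) \\ &= -\sum_{m=1}^{n}\theta^m\sum_{k=0}^{m-1}J_0\left(b,\ \sum_{j=0}^{n}\binom{m-k+j-1}{j}(-\Psi(\theta))^j\,\frac{\widetilde T_j\widetilde g_{k,m}}{m-k}\right)+b\,o(\theta^n) \\ &= -\sum_{m=1}^{n}\theta^m\sum_{j=0}^{m-1}\theta^j\chi(\theta)^jJ_0\left(b,\ \sum_{k=0}^{m-1}\binom{m-k+j-1}{j}\frac{\widetilde T_j\widetilde g_{k,m}}{m-k}\right)+b\,o(\theta^n) \\ &= \sum_{m=1}^{n}\theta^m\sum_{j=0}^{m-1}\chi(\theta)^jJ_0\left(b,\ -\sum_{k=0}^{m-j-1}\binom{m-k-1}{j}\frac{\widetilde T_j\widetilde g_{k,m}(i\lambda)}{m-j-k}\right)+b\,o(\theta^n), \end{aligned}\tag{34}$$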

which yields the desired conclusion.
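The proof appeals to standard convolution operations of power series to obtain the coefficients of the powers of $\widetilde G$. A minimal sketch of that bookkeeping is given below. It works with plain coefficients of θ^k (no 1/k! normalization) and with numeric coefficients; in the setting of Proposition 7 each coefficient would instead be a function of iλ. The function names and the data layout are illustrative assumptions.

```python
def cauchy_product(a, b, order):
    """Coefficients (up to theta^order) of the product of two power series.

    a and b are lists with a[k] the coefficient of theta^k, len >= order + 1.
    """
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(order + 1)]


def series_powers(coeffs, max_power):
    """Return {m: coefficients of F^m} for m = 1..max_power, truncated at len(coeffs) - 1."""
    order = len(coeffs) - 1
    powers = {1: list(coeffs)}
    for m in range(1, max_power):
        powers[m + 1] = cauchy_product(powers[m], coeffs, order)
    return powers


if __name__ == "__main__":
    # F(theta) = theta + theta^2, truncated at order 4.
    table = series_powers([0, 1, 1, 0, 0], 3)
    for m, c in table.items():
        print(m, c)
```

Since the series has no constant term, the m-th power starts at θ^m, which mirrors the fact that only terms with m ≤ n contribute to an expansion of order n in (33).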

In view of the previous result, an explicit expression for the coefficients in the

expansion for J0 (·, f), when f satisfies the parity property, deserves special attention. Providing such explicit expressions is the aim of the next proposition.


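Proposition 8 Suppose that $f(i\cdot) \in L_0$ has the parity property. Then, $J_0(\cdot,f)$ is infinitely differentiable at zero and

$$J_0^{(n)}(0,f) = \begin{cases} (-1)^{n/2}\left(f_{RE}^{(n)}(0)-n!\displaystyle\int_{-\infty}^{\infty}\left(T_{n/2+1}f_{IM}\right)(\lambda)\,d\lambda\right), & n \text{ even}, \\[6pt] (-1)^{(n+1)/2}\left(\dfrac{f_{IM}^{(n+1)}(0)}{(n+1)}-n!\displaystyle\int_{-\infty}^{\infty}\left(T_{(n+1)/2}f_{RE}\right)(\lambda)\,d\lambda\right), & n \text{ odd}, \end{cases}\tag{35}$$

where $f_{IM}(i\lambda) = \operatorname{Im}f(i\lambda)\lambda^{-1}$ and $f_{RE}(i\lambda) = \operatorname{Re}f(i\lambda)$.

Proof. The proof follows by a direct application of Proposition 2 combined with the fact that $\operatorname{Im}J_0(b,f) = 0$.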

We close this section with some remarks that clarify how the expansion just derived for $I(\theta,b)$ can alternatively be viewed through the prism of a formal operator expansion. The analytic properties stated in Proposition 1 provide rigorous justification for the expansions outlined next. First, we note that if $\theta > 0$ is small enough and $b > 0$, we can formally write

$$I(\theta,b) = -\sum_{k=1}^{\infty}\frac{1}{k}J_k\left(\Psi(\theta),b,\phi''(\theta)^{-k}\widetilde H^k(\theta,\cdot)\right). \tag{36}$$

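Formally interpreting $(1+a\widetilde T)^{-m}$ as

$$\left(1+a\widetilde T\right)^{-m} = \sum_{k=0}^{\infty}\binom{m+k-1}{k}(-a)^k\,\widetilde T^k,$$

in combination with the expansion (26) developed for $J_k(a,b,f)$ and equality (36), allows us to write

$$I(\theta,b) = -\sum_{k=1}^{\infty}\frac{1}{k}J_0\left(b,\phi''(\theta)^{-k}\left(1+\Psi(\theta)\widetilde T\right)^{-k}\widetilde H^k(\theta,\cdot)\right).$$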

If we introduce the convention that for commutative operators B1 (θ), B2 (θ) and

functions F1 (θ, ·), F2 (θ, ·), expressions of the form B1 (θ)F1 (θ, ·)B2 (θ)F2 (θ, ·) (or any permutation of this form) are always interpreted as

(B1 (θ)B2 (θ)) (F1 (θ, ·)F2 (θ, ·)) ,


then we can write

$$I(\theta,b) = J_0\left(b,\log\left(1-\phi''(\theta)^{-1}\left(1+\Psi(\theta)\widetilde T\right)^{-1}\widetilde H(\theta,\cdot)\right)\right). \tag{37}$$

Expression (37) provides a convenient shorthand notation for the expansion of $I(\theta,b)$, in powers of $\theta$ and with coefficients in terms of integrals of the form $J_0(b,\cdot)$. In addition, note that, in order to recover the coefficients in the expansion for $I(\theta_1(\cdot),\cdot)$, one can apply formal differentiation to (37) in both arguments $\theta$ and $b$ (always having in mind that (37) is just a formalism representing a certain asymptotic expansion). Hence, for example, one can obtain the first term in the expansion for $I(\theta_1(\cdot),\cdot)$ as

$$\left.\partial_\Delta I(\theta_1(\Delta),\Delta)\right|_{\Delta=0} = \partial_\theta I(0,0)\,\partial_\Delta\theta_1(0)+\partial_bI(0,0),$$

where the formal derivatives applied to (37) must be interpreted using the formal

operator convention introduced earlier. Thus, for example, if B (θ) is an operator of

the form

B (θ) =∞Xk=0

bkθkeT kk,

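applied to a function $F(\theta,\lambda) = \sum f_k(i\lambda)\theta^k/k!$, we interpret the formal derivative $\partial_\theta\log(1-B(\theta)F(\theta,\cdot))$ as

$$\partial_\theta\log(1-B(\theta)F(\theta,\cdot)) = -\partial_\theta B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}F(\theta,\cdot)-B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}\partial_\theta F(\theta,\cdot),$$

where

$$\partial_\theta B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}F(\theta,\cdot) = \sum_{k=0}^{\infty}\left(\partial_\theta B(\theta)B(\theta)^k\right)F(\theta,\cdot)^{k+1},$$

and, similarly,

$$B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}\partial_\theta F(\theta,\cdot) = \sum_{k=0}^{\infty}B(\theta)^{k+1}\left(F(\theta,\cdot)^k\,\partial_\theta F(\theta,\cdot)\right).$$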


Thus, it is possible to combine this formalism with the expansion

$$J_0(b,f) = \sum_{n=1}^{m}J_0^{(n)}(0,f)\,b^n/n!+O\left(b^{m+1}\right)$$

to recover the coefficients in the expansion for $I(\theta_1(\cdot),\cdot)$ in powers of $\Delta$.

2.5 Expansions for r (∆) and EθRk (∞)

In previous sections, we developed all the elements required to rigorously compute a

full asymptotic expansion for r (·) in powers of ∆. In the first part of this section, as a summary, we indicate how the developments obtained in the previous three sections can be applied to provide an asymptotic expansion for r (·) in powers of ∆. In view of the level of complexity in the computation of the constants βn, the description in this section is intended to provide guidance for an easy-to-design practical implementation in a computational package such as Mathematica or Matlab. An efficient implementation of the procedure will appear elsewhere. In the second part of this section, also

as a direct consequence of the analysis in the previous sections, we will develop a

rigorous asymptotic expansion for the cumulants of R (∞) under Pθ in powers of θ.

2.5.1 The Expansion for r (∆)

An algorithm for computing βk for k ≤ n proceeds as follows:

1. Expand $s(\Delta)$ up to terms of order $O(\Delta^{n+1})$ using Proposition 8.

2. Similarly, expand the functions $J_0(\cdot,E_{j,m})$ up to terms $O(\Delta^{n-m})$ with $0 \le j \le m-1$ and $1 \le m \le n$. This also can be done by applying Proposition 8, since $E_{j,m}$ has the parity property.

3. Finally, the terms obtained can be combined with an expansion for $\theta_1(\Delta)$ up to terms of order $O(\Delta^{n+1})$. Such an expansion can be easily obtained using the implicit function theorem and therefore is omitted.


Observe that the previous algorithm provides an asymptotic expansion for r (·) in powers of ∆. However, because of Theorem 1, we actually have that this asymptotic

expansion converges absolutely in a neighborhood of the origin.

As a simple application of the previous expansion, we show that β2 = 0.

Proposition 9 Suppose that X1 has exponential moments and is strongly non-lattice. Then

$$r(\Delta) = -\Delta\beta_1+O\left(\Delta^3\right).$$

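Proof. We only need to show that $\beta_2 = 0$. Note that by virtue of Proposition 8, the coefficient multiplying $\Delta^2$ in the expansion of $s(\Delta)$ equals

$$s_2 = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^2}\left(\frac{\operatorname{Im}\log(1-g(\lambda))}{\lambda}-\mu_3\right)d\lambda-\left(\mu_4/12-\mu_3^2/18\right).$$

In order to show that $\beta_2 = 0$ it suffices to show that $\theta_1J(\Delta,E_{0,1}) \sim -\Delta^2s_2$ or (since $J(\Delta,E_{j,m}) = O(\Delta)$, $\theta_1/2 \sim \Delta$ and $\phi''(\theta_1) \sim 1$), that $\Delta J(\Delta,E_{0,1}) \sim -2\Delta^2s_2$, where

$$\Delta J(\Delta,E_{0,1}) = \frac{\Delta}{2\pi}\int_{-\infty}^{\infty}\frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda$$

$$= \frac{1}{\pi}\int_{0}^{\infty}\frac{\Delta^2}{\Delta^2+\lambda^2}\operatorname{Re}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda \tag{38}$$

$$\qquad-\frac{\Delta}{\pi}\int_{0}^{\infty}\frac{\Delta^2}{\left(\Delta^2+\lambda^2\right)\lambda}\operatorname{Im}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda. \tag{39}$$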

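Note that $g'(\lambda) = i\gamma'(i\lambda)$ and that $\operatorname{Im}\log\left(2\lambda^{-2}\right) = 0$; hence, we can write

$$\operatorname{Re}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right) = -\operatorname{Im}\frac{d}{d\lambda}\left(\log\left(2(1-g(\lambda))\lambda^{-2}\right)-\mu_3i\lambda\right),$$

which implies, using integration by parts, that the integral in (38) equals

$$-\frac{1}{\pi}\int_{0}^{\infty}\frac{2\lambda\Delta^2}{\left(\Delta^2+\lambda^2\right)^2}\operatorname{Im}\left(\log\left(2(1-g(\lambda))\lambda^{-2}\right)-\mu_3i\lambda\right)d\lambda \sim -\frac{\Delta^2}{\pi}\int_{0}^{\infty}\frac{2}{\lambda^2}\left(\frac{\operatorname{Im}\log(1-g(\lambda))}{\lambda}-\mu_3\right)d\lambda, \tag{40}$$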


where (40) has been obtained using dominated convergence and simple manipulations. It follows from Proposition 2 and a first order asymptotic expansion of $E_{0,1}(i\lambda)$ that (39) equals $-\Delta^2\left(\mu_3^2/9-\mu_4/6\right)$. Combining this last estimate together with (40) into (38) and (39) yields $\Delta J(\Delta,E_{0,1}) \sim -2\Delta^2s_2$, which is exactly what we wanted to show to conclude that $\beta_2 = 0$.

2.5.2 The Expansion for $E_\theta R(\infty)^k$ as $\theta \searrow 0$

We shall provide asymptotics for $E_\theta R(\infty)^k = E_\theta\left(S_{\tau_+}^k\right)/\left(k!\,E_\theta\left(S_{\tau_+}\right)\right)$ via the cumulants $(\kappa_j(\theta) : j \ge k)$ of $R(\infty)$ under $P_\theta$. In particular, these estimates yield the proof of Theorem 4 stated in Section 2. The idea is to develop an asymptotic expansion, in powers of $b$, for $s(b)$ and $I(\theta,b)$ respectively and to match coefficients in the expression

$$\rho(\theta,b) = -\kappa_1(\theta)b+\kappa_2(\theta)b^2/2-\kappa_3(\theta)b^3/3!+\ldots = s(b)+I(\theta,b). \tag{41}$$

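In order to perform this task, we will take advantage of Proposition 7 as follows. First, let us define, for $k \ge 1$, $\alpha_{k,j,m} = J_0^{(k)}(0,E_{j,m})/k!$ (which can be explicitly computed via Proposition 8). With this notation, we can write, for $l, n \ge 1$,

$$\begin{aligned} I(\theta,b) &= \sum_{m=1}^{n}\theta^m\sum_{j=0}^{m-1}\chi(\theta)^j\left(\sum_{k=1}^{l}\alpha_{k,j,m}b^k+O\left(b^{l+1}\right)\right)+b\,o(\theta^n) \\ &= \sum_{k=1}^{l}b^k\sum_{m=1}^{n}\sum_{j=0}^{m-1}\theta^m\chi(\theta)^j\alpha_{k,j,m}+\theta\,O\left(b^{l+1}\right)+b\,o(\theta^n). \end{aligned}$$

Therefore, we obtain that, for all $s, n \ge 1$, $\kappa_s(\theta)$ satisfies

$$\kappa_s(\theta) = (-1)^s\left(\kappa_s(0)+s!\sum_{m=1}^{n}\sum_{j=1}^{m-1}\theta^m\chi(\theta)^j\alpha_{s,j,m}\right)+O\left(\theta^{n+1}\right).$$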

Consequently, $\kappa_n(\cdot)$ is an infinitely differentiable function at $\theta = 0$ and for $m \ge 0$ and $n \ge 1$ we have

$$\frac{\kappa_n^{(m)}(0)}{n!} = (-1)^n\frac{\kappa_n(0)}{n!}+\sum_{s=0}^{m-1}\sum_{j=0}^{m-1-s}\chi_{s,j}\,\alpha_{n,j,m-s},$$


where, for $n, j \ge 1$, $\chi_{n,j}$ is the coefficient multiplying $\theta^n$ in the expansion for $\chi(\theta)^j$. In particular, the $\chi_{n,j}$ can be computed recursively as

$$\chi_{k,j+1} = \sum_{n=0}^{k}\chi_{n,j}\,\chi^{(k-n)}(0)/(k-n)!,$$

with $\chi_{n,1} = \chi^{(n)}(0)/n!$.

2.6 Technical Proofs

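Proof of Theorem 3. Using Lemma 1, we can add

$$0 = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{(b+i\lambda)\,i\lambda}\log\left(1+i\lambda/2\phi'(\theta)\right)d\lambda$$

to expression (13) for $\rho(\theta,b)$ to obtain

$$\begin{aligned} \rho(\theta,b) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{-i\phi'(\theta)\lambda\left(1+i\lambda/2\phi'(\theta)\right)}\right)d\lambda \\ &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{2\left(\gamma(\theta)-\gamma(\theta+i\lambda)\right)}{\lambda\left(\lambda-2i\phi'(\theta)\right)}\right)d\lambda, \end{aligned}$$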

yielding the conclusion of the theorem.
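For the reader's convenience, the algebra behind the last equality in this proof is the following one-line identity for the denominator inside the logarithm (a worked step added as an aid, using only the symbols of the proof itself):

```latex
-i\phi'(\theta)\,\lambda\left(1+\frac{i\lambda}{2\phi'(\theta)}\right)
  = -i\phi'(\theta)\lambda+\frac{\lambda^{2}}{2}
  = \frac{\lambda\left(\lambda-2i\phi'(\theta)\right)}{2}.
```

Dividing $\gamma(\theta)-\gamma(\theta+i\lambda)$ by this quantity produces the factor 2 in the numerator of the final expression.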

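Proof of Proposition 3. It follows immediately, by a Taylor series expansion of $\gamma(\cdot)$, that a series representation for $H$ can be written (for fixed $\lambda$ and $\theta$ such that $0 < |\lambda|+|\theta| < \eta$) as

$$\begin{aligned} H(\theta,\lambda) &= 1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)} \\ &= 1-\frac{2i}{\lambda}\sum_{k=1}^{\infty}\mu_{k+1}\frac{\theta^k}{k!}-\frac{1}{1-g(\lambda)}\sum_{k=0}^{\infty}\left(\mu_k-\gamma^{(k)}(i\lambda)\right)\frac{\theta^k}{k!} \\ &= \sum_{k=1}^{\infty}h_k(i\lambda)\frac{\theta^k}{k!}. \end{aligned}$$

In fact, the functions $h_k(i\cdot)$ can be analytically extended throughout the disc $D_{\eta/2} = \{z \in \mathbb{C} : |z| < \eta/2\}$. This is easily seen as follows. Recall that $\gamma(\cdot)$ (and therefore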


$\gamma^{(k)}(\cdot)$) are analytic on N (defined in Section 2). Also, observe that $1-\gamma(iz) \sim z^2/2$ and $\gamma^{(k)}(iz)-\mu_k \sim iz\mu_{k+1}$ as $z \to 0$. Thus, $\left(\gamma^{(k)}(iz)-\mu_k\right)/(1-\gamma(iz))$ possesses a simple pole at 0 with residue equal to $2i\mu_{k+1}$, which implies that the natural extension of $h_k$ defined as

$$h_k(iz) = \frac{\gamma^{(k)}(iz)-\mu_k}{1-\gamma(iz)}-\frac{2i\mu_{k+1}}{z} = \frac{\left(\gamma^{(k)}(iz)-\mu_k\right)z-2i\mu_{k+1}(1-\gamma(iz))}{(1-\gamma(iz))\,z}$$

is analytic on $D_{\eta/2}$. Now, by virtue of the maximum principle (see, for example, Rudin (1987), p. 253) we have that if $\delta > 0$ is suitably small,

$$\sup_{|z|\le\delta}|h_k(iz)| \le \sup_{|z|=\delta}|h_k(iz)|.$$

Since $\gamma(z)$ is a non-constant analytic function defined on $D_{\eta/2}$ (which is an open set and thus has an accumulation point), then $1-\gamma(z)$ has an isolated zero at $z = 0$. Thus, it is possible to choose $\delta > 0$ in such a way that

$$\inf_{|z|=\delta}|1-\gamma(iz)| > \varepsilon > 0,$$

for some $\varepsilon > 0$. Consequently,

$$\sup_{|z|\le\delta}|h_k(iz)| \le \sup_{|z|=\delta}|h_k(iz)| \le \frac{1}{\varepsilon\delta}\sup_{|z|=\delta}\left|\left(\gamma^{(k)}(iz)-\mu_k\right)z-2i\mu_{k+1}(1-\gamma(iz))\right|.$$

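Observe that, for $|z| < \eta/2$, $\gamma^{(k)}(z) = E_0\left(X^k\exp(zX)\right)$. Therefore, if $z = x+iy$, with $|z| = \delta$,

$$\left|\gamma^{(k)}(iz)\right| \le E_0\left(|X|^k|\exp(izX)|\right) = E_0\left(|X|^k|\exp(yX)|\right) \le E_0\left(|X|^k\exp(\delta|X|)\right).$$

A similar bound can be obtained for $\gamma(z)$ and we can conclude that there exists $B > 0$ such that

$$\sup_{|z|\le\delta}|h_k(iz)| \le B\left(E_0\left(|X|^k(\exp(\delta|X|)+1)\right)+E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\right).$$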


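Now, suppose that $\delta < \eta/2$. Then, if $z_1 \in D_{\eta/2}$, we can define

$$B\sum_{k=1}^{\infty}E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\frac{z_1^k}{k!} \triangleq N_1(z_1)$$

in such a way that the previous series converges absolutely and uniformly on $D_{\eta/2}$. Similarly, we can define

$$B\sum_{k=1}^{\infty}E_0\left(|X|^k(\exp(\delta|X|)+1)\right)\frac{z_1^k}{k!} = BE_0\left(\sum_{k=1}^{\infty}\frac{|X|^kz_1^k}{k!}(\exp(\delta|X|)+1)\right) = BE_0\left((\exp(z_1|X|)-1)(\exp(\delta|X|)+1)\right) \triangleq N_2(z_1).$$

Note that, for $j = 1, 2$, $N_j(z_1) \to 0$ as $z_1 \to 0$. On the other hand, since $g(\lambda)$ is strongly non-lattice, we have that

$$\sup_{|\lambda|\ge\delta}|h_k(i\lambda)| = \sup_{|\lambda|\ge\delta}\left|\frac{\gamma^{(k)}(i\lambda)-\mu_k}{1-g(\lambda)}-\frac{2i\mu_{k+1}}{\lambda}\right| \le B\left(E_0\left(|X|^k\right)+E_0\left(|X|^{k+1}\right)\right),$$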

if $B < \infty$ is big enough. The previous estimates imply that there exist constants $0 < M_k \le B\left(E_0\left(|X|^k(\exp(\delta|X|)+1)\right)+E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\right)$ such that

$$\sup_{z_2\in\mathbb{R}\cup D_{\eta/2}}|h_k(iz_2)| \le M_k$$

and $\left|\sum_{k=1}^{\infty}M_k\frac{z_1^k}{k!}\right| \le \sum_{k=1}^{\infty}\left|M_k\frac{z_1^k}{k!}\right| < \infty$ for $z_1 \in D_{\eta/2}$. Thus, using the Weierstrass M test, we obtain the validity of (23). Finally, the invoked Weierstrass M test combined with the analytic functions convergence theorem (see Theorem 10.28, p. 214, of Rudin (1987)) yields the analyticity of $H(z_1,\cdot)$ on $\mathbb{R}\cup D_{\eta/2}$ (for $z_1 \in D_{\eta/2}$) and similarly for $H(\cdot,z_2)$ on $D_{\eta/2}$ (for $z_2 \in \mathbb{R}\cup D_{\eta/2}$).

Proof of Proposition 1. We start by writing

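$$I(\theta,b) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda.$$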


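The strategy will be to study this integral on $\{|\lambda| < \delta\}$ and $\{|\lambda| \ge \delta\}$ separately (where $\delta > 0$ is some convenient small number to be characterized later):

$$I(\theta,b) = -\frac{1}{2\pi}\int_{-\delta}^{\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda \tag{42}$$

$$\qquad-\frac{1}{2\pi}\int_{|\lambda|\ge\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda. \tag{43}$$

Let us define $I_A(\theta,b)$ and $I_B(\theta,b)$ as (42) and (43) respectively. Suppose that $0 < b < \delta < \eta/2$. By making $u = b\lambda$, we can write

$$I_A(\theta,b) = -\frac{1}{2\pi}\int_{-\delta}^{\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda.$$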

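Let $C = \{w \in \mathbb{C} : |w| \le \delta\}\cap\{\operatorname{Im}(w) \le 0\}$, and observe that by virtue of Proposition 3, we can pick $\delta_1 > 0$ in such a way that for all $0 < \theta < \delta_1$ the function

$$f_1(w) = \frac{b}{(b+iw)}\,\frac{1}{iw}\log\left(1-\frac{H(\theta,w)w}{w-2\phi'(\theta)i}\right)$$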

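is analytic on $C$. Thus, applying Cauchy's theorem to the contour enclosing $C$ we obtain

$$\begin{aligned} I_A(\theta,b) &= -\frac{1}{2\pi}\int_{-\pi}^{0}\frac{b}{\left(b+i\delta e^{i\lambda}\right)}\,\frac{i\delta e^{i\lambda}}{i\delta e^{i\lambda}}\log\left(1-\frac{H\left(\theta,\delta e^{i\lambda}\right)\delta e^{i\lambda}}{\delta e^{i\lambda}-2\phi'(\theta)i}\right)d\lambda \\ &= \frac{1}{2\pi}\int_{-\pi}^{0}\frac{ib\delta^{-1}e^{-i\lambda}}{\left(1-ib\delta^{-1}e^{-i\lambda}\right)}\log\left(1-\frac{H\left(\theta,\delta e^{i\lambda}\right)}{1-i2\phi'(\theta)\delta^{-1}e^{-i\lambda}}\right)d\lambda. \end{aligned}\tag{44}$$

The equality (44) has been obtained by simple algebraic manipulations. Observe that the previous expression in combination with Proposition 3 and the analyticity of the functions $\phi'(\theta)$ ($\sim 0$) at zero immediately gives that $I_A(\theta,b)$ can be represented as an absolutely convergent double power series in $\theta$ and $b$ on the set $0 < |\theta|+|b| < \delta_2$ for some $\delta_2 > 0$. Indeed, if we pick $\delta_2$ small enough, it is possible to provide an explicit power series representation for $I_A(\theta,b)$ by using the expansion of $\log(1-w)$ at $w = 0$ in combination with the series representation (23) for the function $H(\theta,\lambda)$ derived in Proposition 3 and a Taylor expansion of $(1-w)^{-1}$ around $w = 0$.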


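The analysis of $I_B(\theta,b)$ is easier,

\[
I_B(\theta,b) = \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{b}{\left(1-ib\lambda^{-1}\right)}\,\frac{1}{\lambda^{2}}\,\log\left(1-\frac{H(\theta,\lambda)}{1-i2\phi'(\theta)\lambda^{-1}}\right)d\lambda.
\]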

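Hence, in order to show that $I_B(\cdot)$ can be written as an absolutely convergent double power series in a neighborhood of the origin, it suffices to show (by Fubini's theorem) that

\[
\int_{|\lambda|\ge\delta} \sum_{k,j,m\ge 0} \binom{m+k}{k} \frac{b^{j+1}\left(2\left(E(\exp(\theta|X|)-1-|X|)\right)\right)^{m}}{(k+1)\,|\lambda|^{j+2+m}} \left(\sum_{s=1}^{\infty} |h_s(i\lambda)|\frac{\theta^{s}}{s!}\right)^{k+1} d\lambda
\]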

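is finite for all non-negative $\theta$ and $b$ such that $\theta + b < \delta_3$ for some $\delta_3 > 0$. But this fact follows easily from Proposition 3. First note, by the change of variables $\lambda = u\delta$, that the previous expression equals

\[
\int_{|u|\ge 1} \sum_{k,j,m\ge 0} \binom{m+k}{k} \frac{b^{j+1}\left(2\left(E(\exp(\theta|X|)-1-|X|)\right)\right)^{m}}{\delta^{j+m+1}(k+1)\,|u|^{j+2+m}} \left(\sum_{s=1}^{\infty} |h_s(iu\delta)|\frac{\theta^{s}}{s!}\right)^{k+1} du,
\]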

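now pick $\delta_3$ small enough so that $0 < \max\left(b, 2\left(E(\exp(\theta|X|)-1-|X|)\right)\right) < \delta_3 < \delta$ (if $\theta + b < \delta_3$), and use Proposition 3 to conclude that one $\delta_3$ can be chosen so that $\sum_{s=1}^{\infty} |h_s(iu\delta)|\,\theta^{s}/s! < c < 1 - \delta_3/\delta$. Therefore, we can bound the previous sum by

\[
\int_{|u|\ge 1} \sum_{k,j,m\ge 0} \binom{m+k}{k} \frac{(\delta_3/\delta)^{j+m+1}}{(k+1)\,|u|^{2}}\,c^{k+1}\,du \le \frac{2}{3}\,\frac{1}{1-\delta_3/\delta}\left|\log\left(1-\frac{c}{1-\delta_3/\delta}\right)\right| < \infty.
\]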

The conclusions obtained for both $I_A(\cdot)$ and $I_B(\cdot)$ indicate that for all $0 \le \theta, b \le \upsilon$ (for some $\upsilon > 0$) $I(\theta,b)$ can be written as

\[
I(\theta,b) = \sum_{j,k\ge 1} \theta^{j} b^{k} I_{jk},
\]

where the previous series converges absolutely on the specified region on $\theta$ and $b$. The previous expression provides the natural analytic extension of $I(\cdot)$ on $D_{\upsilon}^{2} = \{(z_1,z_2) \in \mathbb{C}\times\mathbb{C} : |z_1| + |z_2| < \upsilon\}$.


Proof of Theorem 1. Since

\[
\exp(s(\Delta)) = \frac{1-E_0\left(\exp\left(-\Delta S_{\tau_+}\right)\right)}{\Delta E_0\left(S_{\tau_+}\right)} = E_0\left(\exp(-\Delta R(\infty))\right),
\]

the analytic extension of the term $s(\Delta)$ follows from that of the right hand side, which comes from the fact that $S_{\tau_+}$ has exponential moments (see Asmussen (1987)). Thus, since $r(\Delta) = s(\Delta) + I(\theta_1(\Delta),\Delta)$, we just have to analyze $I(\theta_1(\Delta),\Delta)$. However, from the implicit function theorem, we know that $\theta_1(\cdot)$ is analytic in a neighborhood of the origin; thus, the analytic functions convergence theorem (see Theorem 10.28, p. 214, of Rudin (1987)) combined with Theorem 1 yields the desired conclusion.

Proof of Theorem 4. From Theorem 1, we know that for $0 \le \theta, b \le \upsilon$ (for some $\upsilon > 0$)

\[
I(\theta,b) = \sum_{j=1}^{\infty} b^{j} I_{\cdot,j}(\theta),
\]

where each function $I_{\cdot,j}(\theta)$ can be expanded in absolutely convergent power series for $0 \le \theta \le \upsilon$, and thus can be analytically extended throughout a neighborhood of the origin in the complex plane. But,

\begin{align*}
\rho(\theta,b) &= -\kappa_1(\theta)b + \kappa_2(\theta)b^{2}/2 - \kappa_3(\theta)b^{3}/3! + \dots\\
&= s(b) + I(\theta,b),
\end{align*}

where s (·) is (real) analytic at zero. Hence, the conclusion of the Theorem follows

immediately by matching coefficients.

Next, we show that if the distribution of X1 is symmetric then for n ≥ 1, β2n = 0.

Proof of Theorem 2. As we discussed before, all that we need to show is that

β2n = 0. We have shown that an absolutely convergent power series representation

is possible for r (∆) when ∆ is small, thus it suffices to show that if 0 < ∆ < δ

(where δ > 0 is suitably small), then an asymptotic expansion for r (∆) is given in

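odd powers of $\Delta$ only. Using the integral expression (14), integrating on $|\lambda| \le \delta$ and $|\lambda| > \delta$ we can write

\begin{align}
r(\Delta) ={}& \frac{1}{2\pi}\int_{|\lambda|<\delta} \frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+i\lambda)\right)}{\lambda\left(\lambda-2i\phi'(\theta_1)\right)}\right)d\lambda \tag{45}\\
& + \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+i\lambda)\right)}{\lambda\left(\lambda-2i\phi'(\theta_1)\right)}\right)d\lambda. \tag{46}
\end{align}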

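Define by $A(\Delta)$ and $B(\Delta)$ the integrals appearing in expressions (45) and (46) respectively. We first analyze $A(\Delta)$. Using a similar argument as in the proof of Theorem 1, we see that

\[
A(\Delta) = \frac{1}{2\pi}\int_{C_1} \frac{-\Delta}{(\Delta+iz)\,iz}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iz)\right)}{z\left(z-2i\phi'(\theta_1)\right)}\right)dz,
\]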

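where the trajectory $C_1$ is defined as $C_1 = \{\delta e^{i\lambda} : \lambda \in [0,-\pi)\}$. Also, define the trajectory $C_2 = \{\delta e^{i\lambda} : \lambda \in [-\pi,0)\}$. The proof of the theorem will be complete if we show that $A(\Delta)$ is an odd function. That is, we must show that $A(\Delta) = -A(-\Delta)$. Note that

\begin{align}
-A(-\Delta) &= -\frac{1}{2\pi}\int_{C_1} \frac{-\Delta}{(-\Delta+iz)\,iz}\,\log\left(\frac{2\left(\gamma(-\theta_1)-\gamma(-\theta_1+iz)\right)}{z\left(z-2i\phi'(-\theta_1)\right)}\right)dz \notag\\
&= \frac{1}{2\pi}\int_{C_2} \frac{-\Delta}{(\Delta+iw)\,iw}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iw)\right)}{w\left(w-2i\phi'(\theta_1)\right)}\right)dw. \tag{47}
\end{align}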

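Equality (47) was obtained by making the change of variables $-w = z$ and using that $\gamma(\theta_1)$ and $\phi'(\theta_1)$ are even and odd functions of $\theta_1$ respectively. In view of (47), in order to show that $A(\Delta) = -A(-\Delta)$, it suffices to show that

\[
0 = \frac{1}{2\pi}\int_{C} \frac{-\Delta}{(\Delta+iw)\,iw}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iw)\right)}{w\left(w-2i\phi'(\theta_1)\right)}\right)dw,
\]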

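where $C = C_1 + C_2$ is the contour corresponding to the circle with radius $\delta$. Now,

\begin{align}
& \frac{1}{2\pi}\int_{C} \frac{-\Delta}{(\Delta+iw)\,iw}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iw)\right)}{w\left(w-2i\phi'(\theta_1)\right)}\right)dw \notag\\
&\quad = \frac{1}{2\pi}\int_{-C} \frac{\Delta}{w(w-i\Delta)}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iw)\right)}{w(w-i\Delta)}\right)dw \tag{48}\\
&\quad\quad + \frac{1}{2\pi}\int_{-C} \frac{\Delta}{w(w-i\Delta)}\,\log\left(\frac{w-i\Delta}{w-i2\phi'(\theta_1)}\right)dw. \tag{49}
\end{align}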


We will show that both terms (48) and (49) vanish. We first consider (49). For $\gamma \in [0,1]$ and $a \in [-\delta,\delta]$, define $f(\gamma)$ as

\[
f(\gamma) = \frac{1}{2\pi}\int_{-C} \frac{\Delta}{w(w-i\Delta)}\,\log(\gamma w - ia)\,dw.
\]

Using residue calculus (see Rudin (1987), p. 224) it is easy to see that $f(0) = 0$. A standard dominated convergence argument yields

\[
f'(\gamma) = \frac{1}{2\pi}\int_{-C} \frac{\Delta}{(w-i\Delta)(\gamma w - ia)}\,dw = 0,
\]

where the previous integral has again been evaluated using residue calculus. As a

result, we obtain that f (1) = 0. Applying these considerations with a = ∆ and

a = 2φ0 (θ1) shows that the integral in (49) equals zero. We also can apply residue

calculus to evaluate (48) directly as follows. Consider

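\[
f_1(w) = \frac{\Delta}{w(w-i\Delta)}\,\log\left(\frac{2\left(\gamma(\theta_1)-\gamma(\theta_1+iw)\right)}{w(w-i\Delta)}\right).
\]

Using the change of variables $w = h + i\Delta$ and the definition of $\Delta = \theta_1 - \theta_0$ with $\gamma(\theta_1) = \gamma(\theta_0)$ we can evaluate the residue of $f_1$ at $w = i\Delta$ as $\mathrm{Residue}(f_1; i\Delta) = -i\log\left(-2\gamma'(\theta_0)/\Delta\right)$. We also can obtain $\mathrm{Residue}(f_1; 0) = i\log\left(2\gamma'(\theta_1)/\Delta\right)$. Therefore, using residue calculus we obtain that the integral in (48) equals

\[
-i\log\left(-2\gamma'(\theta_0)/\left(2\gamma'(\theta_1)\right)\right) = -i\log\left(\gamma'(\theta_1)/\gamma'(\theta_1)\right) = 0,
\]

since in the case of symmetric distributions $\gamma'(\lambda)$ is odd and $\theta_1 = -\theta_0$.

Finally, we analyze $B(\Delta)$. Note that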

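\begin{align}
B(\Delta) ={}& \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\,\log\left(2(1-g(\lambda))\lambda^{-2}\right)d\lambda \tag{50}\\
& + \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\,\log\left(1-\frac{\lambda H(\theta_1,\lambda)}{\lambda-2i\phi'(\theta_1)}\right)d\lambda. \tag{51}
\end{align}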

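Let $B_1(\Delta)$ and $B_2(\Delta)$ be defined as (50) and (51) respectively. Since $X_1$ is symmetric, it follows that $\log\left(2(1-g(\lambda))\lambda^{-2}\right)$ is real. As a result, we obtain, just by integrating the real and imaginary parts of the integrand in $B_1(\Delta)$,

\[
B_1(\Delta) = \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{\Delta}{\Delta^{2}+\lambda^{2}}\,\log\left(2(1-g(\lambda))\lambda^{-2}\right)d\lambda. \tag{52}
\]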
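The step from (50) to (52) can be spelled out as follows. This is only a sketch of the algebra, writing $L(\lambda) = \log\left(2(1-g(\lambda))\lambda^{-2}\right)$ as a shorthand that is not used elsewhere in the text, and assuming (as the symmetry argument indicates) that $L$ is real and even in $\lambda$.

```latex
% Split the kernel of (50) into real and imaginary parts (using 1/i = -i):
\[
\frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}
  = \frac{-\Delta(\Delta-i\lambda)}{i\lambda\,(\Delta^{2}+\lambda^{2})}
  = \frac{\Delta}{\Delta^{2}+\lambda^{2}}
    + i\,\frac{\Delta^{2}}{\lambda\,(\Delta^{2}+\lambda^{2})}.
\]
% The imaginary part is odd in \lambda, so against the even function L it
% integrates to zero over the symmetric set \{|\lambda| \ge \delta\}.
% For 0 < \Delta < \delta \le |\lambda| the real part expands geometrically:
\[
\frac{\Delta}{\Delta^{2}+\lambda^{2}}
  = \sum_{n\ge 0} (-1)^{n}\,\frac{\Delta^{2n+1}}{\lambda^{2n+2}},
\]
% so that, term by term, only odd powers of \Delta appear in B_1(\Delta).
```

The geometric expansion is valid because $\Delta/|\lambda| \le \Delta/\delta < 1$ on the region of integration.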


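Expression (52) yields an asymptotic expansion in odd powers of $\Delta$ for $B_1(\Delta)$. Again, integrating the real and imaginary parts in $B_2(\Delta)$ we obtain

\begin{align}
B_2(\Delta) ={}& \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{\Delta}{\left(\Delta^{2}+\lambda^{2}\right)}\,\mathrm{Re}\log\left(1-\frac{\lambda H(\theta_1,\lambda)}{\lambda-2i\phi'(\theta_1)}\right)d\lambda \tag{53}\\
& - \frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{\Delta^{2}}{\left(\Delta^{2}+\lambda^{2}\right)\lambda}\,\mathrm{Im}\log\left(1-\frac{\lambda H(\theta_1,\lambda)}{\lambda-2i\phi'(\theta_1)}\right)d\lambda. \tag{54}
\end{align}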

The previous identity for $B_2(\Delta)$ is obtained by observing that the integral of the imaginary part must vanish. This occurs because for all $\theta_1$ small the function $\log\left(1 - i\lambda H(\theta_1,\lambda)/(i\lambda + 2\phi'(\theta_1))\right)$ satisfies the parity property, which can be verified by observing that, since $\gamma(i\lambda) = E_0\cos(\lambda X) + iE_0\sin(\lambda X)$, it follows that $h_k(i\lambda) \in \mathcal{P}$; also, using Proposition 6, we obtain that $i\lambda/(i\lambda + 2\phi'(\theta_1))$ satisfies the parity property. Therefore, the closure properties proved in Proposition 6 together with an expansion of the logarithm yield that $\log\left(1 - i\lambda H(\theta_1,\lambda)/(i\lambda + 2\phi'(\theta_1))\right) \in \mathcal{P}$, which justifies (53) and (54). For notational convenience let us define

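\[
C(\theta_1,\lambda) = \sum_{k=1}^{\infty} h_{2k}(i\lambda)\,\theta_1^{2k}/(2k)! \tag{55}
\]

and

\[
D(\theta_1,\lambda) = -i\sum_{k=1}^{\infty} h_{2k-1}(i\lambda)\,\theta_1^{2k-1}/(2k-1)!, \tag{56}
\]

where $h_k(i\lambda) = (\gamma^{(k)}(i\lambda) - \mu_k)/(1 - \gamma(i\lambda)) - 2i\mu_{k+1}/\lambda$. Since the distribution of $X_1$ is symmetric we have that $\gamma(i\lambda)$ is even and real. Moreover, we also have that $h_k(i\lambda)$ is even if and only if $k$ is even. We also can see that $\mathrm{Re}(H(\theta,\lambda)) \triangleq C(\theta,\lambda)$ and $\mathrm{Im}(H(\theta,\lambda)) \triangleq D(\theta,\lambda)$ are even and odd functions of both $\theta$ and $\lambda$ (meaning that for every $\theta \in (-\eta/2,\eta/2)$ fixed, $C(\theta,\cdot)$ is even and, similarly, for each $\lambda \in \mathbb{R}$, $C(\cdot,\lambda)$ is also even on $(-\eta/2,\eta/2)$, say). Using this notation, we can write

\begin{align}
\frac{\lambda H(\theta_1,\lambda)}{\lambda - 2\phi'(\theta_1)i} ={}& \frac{\lambda^{2}C(\theta_1,\lambda) - 2\phi'(\theta_1)\lambda D(\theta_1,\lambda)}{\lambda^{2} + (2\phi'(\theta_1))^{2}} \tag{57}\\
& + i\,\frac{2\phi'(\theta_1)\lambda C(\theta_1,\lambda) + \lambda^{2}D(\theta_1,\lambda)}{\lambda^{2} + (2\phi'(\theta_1))^{2}}. \tag{58}
\end{align}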


Let us define $C(\theta_1,\lambda)$ and $D(\theta_1,\lambda)$ as the real and imaginary parts of $\lambda H(\theta_1,\lambda)/(\lambda - 2\phi'(\theta_1)i)$, respectively, as indicated in the corresponding expressions (57) and (58). Since $\lambda H(\theta_1,\lambda)/(\lambda - 2\phi'(\theta_1)i)$ holds the parity property, $C(\theta_1,\lambda)$ and $D(\theta_1,\lambda)$ are even and odd functions in both arguments $\theta_1$ and $\lambda$. By symmetry of the distribution of $X_1$ we have that $\Delta = 2\theta_1$; also as a consequence of symmetry, $2\phi'(\theta_1)$ is an odd (real) analytic function of $\theta_1$ at the origin, which implies that $(2\phi'(\theta_1))^{2}$ is even. Hence, using the expansion of $\log(1-z)$ at $z = 0$ in expressions (53) and (54) (justified by virtue of Proposition 3), we see that an asymptotic expansion for the integral (53) involves expanding expressions of the form

\[
K(\theta_1)\,\frac{1}{2\pi}\int_{|\lambda|\ge\delta} \frac{\Delta}{\left(\Delta^{2}+\lambda^{2}\right)}\,C(\theta_1,\lambda)^{k}\,D(\theta_1,\lambda)^{2m}\,d\lambda \tag{59}
\]

where K (θ1) is an even function of θ1 which is also (real) analytic at the origin. This

implies in view of (55) to (58) and the properties of 2φ0 (θ1) discussed before, that an

asymptotic expansion for (59) must be given in odd powers of ∆ only, which must be

also the case for the integral in (53). The treatment for the integral (54) is completely

analogous and also yields an asymptotic expansion in odd powers of ∆. This yields

the conclusion of the theorem.

Chapter 3

The Cramer-Lundberg Theorem in

the Presence of Heavy Tails

Let $S = (S_n : n \ge 0)$ be the random walk generated by the sequence $X = (X_n : n \ge 1)$ of independent and identically distributed random variables (iid rv's) with $EX_1 = 0$ and $EX_1^{2} = 1$ (so that $S_0 = 0$ and $S_n = X_1 + \dots + X_n$ for $n \ge 1$). Assume that the $X_i$'s are strongly non-lattice, in the sense that $g(\lambda) \triangleq E\exp(i\lambda X_1)$ satisfies, for each $\varepsilon > 0$,

\[
\inf_{|\lambda|>\varepsilon} |1 - g(\lambda)| > 0.
\]

Or, in other words, that $\limsup_{|\lambda|\to\infty} |g(\lambda)| < 1$ (see Siegmund (1985), p. 176).

Let us introduce a small location parameter $\delta > 0$ representing the drift of the random walk. More precisely, let us consider a parametric family of random walks, $S^{\delta} = (S_n^{\delta} : n \ge 0)$, generated by the sequence $X^{\delta} = (X_n - \delta : n \ge 1)$. So that

\[
S_n^{\delta} = S_n - n\delta.
\]

We shall focus on developing highly accurate approximations, of Cramer-Lundberg type, for the distribution of

\[
M_{\delta} = \max_{n\ge 0} S_n^{\delta}
\]

in the presence of heavy tailed increments. For our purposes here, we say that $X$ has “heavy tails” if for all $\theta \ne 0$, $E\exp(\theta|X|) = \infty$.

Driven by a number of important applications in several disciplines, a great deal of

effort has been put into understanding the distributional properties of Mδ. The book

by Asmussen (2003) provides a detailed account of several important applications

settings in which the distribution of $M_{\delta}$ plays a major role. Most notably we mention:

insurance risk theory, in which P (Mδ > x) is the probability of eventual ruin of an

insurer that faces iid claims and possesses initial reserve x; queueing theory, in which

the waiting time sequence (excluding service) in the single-server queue, under iid

inter-arrival and processing times, and first-come first-served service discipline, turns

out to converge in distribution to $M_{\delta}$ (see Kiefer and Wolfowitz (1956)), and sequential

analysis, in which the tail probability P (Mδ > x) can be interpreted as the power of

a one-sided sequential probability ratio test (see Siegmund (1985)).

Many problems in applied probability motivate study of models with heavy tails.

For instance, in certain lines of the insurance business, such as fire insurance, statistical evidence suggests that claim sizes generally exhibit heavy tailed behavior (see,

for example, p. 436 of Bowers et al (1997) and Embrechts, Klüppelberg and Mikosch

(1997)). Queueing theory also gives rise to heavy-tails. For example, when mod-

eling data traffic in communication networks, evidence has been found suggesting

that exponential tail features (present in traditional models of data traffic) are not

compatible with empirical observations (see Adler, Feldman and Taqqu (1998), and

Willinger et al (1995)). Therefore, developing asymptotic analysis for systems with

heavy tail characteristics is an important applied problem.

Computing the exact distribution of $M_{\delta}$ (either numerically or analytically) under

general increment distributions is well known to be a challenging problem. Essentially,

it entails solving a Wiener-Hopf type equation known as Lindley's equation (see Lindley (1952)). This integral equation corresponds to the equation describing the stationary distribution of the positive recurrent Markov chain $W_{n+1} = (W_n + X_n - \delta)^{+}$.

Consequently, most of the literature has been focused on developing approximations

and numerical algorithms for computing the distribution of Mδ. One of the most


popular approximations is based on the so-called Cramer-Lundberg asymptotic for-

mula (see equation (1) below). This formula was initially developed for light tailed

increment distributions (i.e. E exp (η |X1|) < ∞ for η in a neighborhood of the ori-

gin). The Cramer-Lundberg approximation is a celebrated result in insurance risk and

queueing theory (see Asmussen (2001, 2003) and Grandell (1991)), and it is widely

accepted that it tends to perform very well in practice (see the discussion in Asmussen

(2001) and Grandell (1991)). This performance can be explained via the exponential

rate of convergence that actually holds in many practical applications, see equation

(2) below.
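The Lindley recursion mentioned above is easy to experiment with. The following is a minimal sketch, not part of the original development: the function names are hypothetical, the chain is started at $W_0 = 0$ (an assumption), and the second function only serves to check the standard identity $W_n = \max_{0\le k\le n}(S_n^{\delta} - S_k^{\delta})$ on a given path.

```python
from itertools import accumulate


def lindley_path(increments, delta):
    """Return W_0, ..., W_n for W_{k+1} = max(W_k + X_k - delta, 0), W_0 = 0."""
    w = 0.0
    path = [w]
    for x in increments:
        w = max(w + x - delta, 0.0)
        path.append(w)
    return path


def reflected_walk(increments, delta):
    """Return max_{0<=k<=n} (S_n - S_k) for the drifted walk, for each n.

    Computed directly from the partial sums of X_k - delta; quadratic cost,
    intended only as a cross-check of lindley_path on short inputs.
    """
    partial = [0.0] + list(accumulate(x - delta for x in increments))
    return [
        max(partial[n] - partial[k] for k in range(n + 1))
        for n in range(len(partial))
    ]


if __name__ == "__main__":
    xs = [1.0, -3.0, 2.0, 0.25]
    print(lindley_path(xs, 0.5))
    print(reflected_walk(xs, 0.5))
```

Both functions return the same path for the same input, which illustrates why the stationary law of the chain and the law of the all time maximum of the drifted walk are linked.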

As we shall see in Section 2, the Cramer-Lundberg representation for $P(M_{\delta} > x)$ in the case of light tailed increments can be interpreted in a “scaled” form as a function of $\delta > 0$ only (i.e. we allow $x = y(\delta) = O(\delta^{-b})$ for $b \ge 1$ as $\delta \searrow 0$, see (2) and (3) below). The case of $y(\delta) = O(\delta^{-1})$ (as $\delta \searrow 0$) is of great interest, since it corresponds to the so-called diffusion scale (see equation (4)). With this scaled interpretation we can see that the Cramer-Lundberg approximation has an error that is exponentially small as $\delta \searrow 0$ (or, equivalently, $y(\delta) \nearrow \infty$). Note that the case $\delta$ close to zero is encountered often in practice. For instance, in the queueing setting described before, $\delta \approx 0$ corresponds to the so-called heavy traffic regime in which the server is busy close to 100% of the time (this terminology actually motivated the title of this chapter). In insurance risk theory, $\delta$ close to zero implies that the premium charged is close to the typical pay-out for claims (in the language of risk theorists, the “safety loading” is small). Furthermore, our scaled form of the Cramer-Lundberg representation allows us to obtain a corresponding heavy tailed version (assuming $E|X_1|^{3+\alpha} < \infty$ for $\alpha > 0$) of the standard Cramer-Lundberg approximation that provides a good fit (as $\delta \searrow 0$) at essentially every region of the quantile space (see (5)). In particular, the error obtained is of polynomial form in $\delta$, at a rate that depends on the number of moments available. Although we state the complete form of our scaled Cramer-Lundberg representation (see (5)), we focus only on the diffusion region of the space (i.e. $y(\delta) = O(\delta^{-1})$), which yields the most important result of this chapter, namely, Theorem 1. The details for the case $y(\delta) = O(\delta^{-b})$ for $b > 1$ are given in Blanchet, Olvera-Cravioto and Glynn (2004).


Initial forms of heavy tailed Cramer-Lundberg asymptotics for P (Mδ > x) were

given by Bahr (1975) and Borovkov (1976) in the context of the so-called classical risk

model or (equivalently) the single-server queue with Poisson arrivals. Further gener-

alizations were developed by Embrechts and Veraverbeeke (1982). These heavy tailed

versions of the Cramer-Lundberg approximation tend to perform well only at very

large quantile values (see Asmussen and Binswanger (1997), and also the discussion

in Embrechts, Klüppelberg and Mikosch (1997) p. 54). The approximations provided

in this chapter (in particular, see Theorem 1) are intended to yield good fit in more “typical” values of the distribution (i.e. on the region $x = y(\delta) = O(\delta^{-1})$, which corresponds to the diffusion scale). For large quantiles (i.e. $x = y(\delta) = O(\delta^{-b})$ for $b \ge 1$) our approximations match earlier results mentioned above.

A closely related approximation, of the type of so-called “corrected diffusion approximations” (CDA's), has been tested in practical applications by Asmussen and Binswanger (1997) and shows satisfactory performance. This first order CDA was developed by Hogan (1986). As we shall see, Theorem 1 not only allows one to strengthen and recover Hogan's CDA but it also significantly reduces the error of the diffusion approximation (see (4) below) as $\delta \searrow 0$.

Section 2 introduces our “scaled” Cramer-Lundberg representation and discusses

our main results (see Theorem 1) using ideas from the light tailed case. Section 3

studies the connection between our proposed representation and corrected diffusion

approximations. The technical development is given in Section 4.

3.1 A Cramer-Lundberg Representation

As we mentioned previously, the so-called Cramer-Lundberg asymptotic formula was

initially developed for light tailed random walks. In particular, suppose that there

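exists a positive solution $\theta^{\delta}$ to the equation

\[
\phi\left(\theta^{\delta}\right) = \exp\left(\theta^{\delta}\delta\right),
\]

where $\phi(\theta) \triangleq E\exp(\theta X_1)$. For $x > 0$ define $\tau(x) = \inf\{n \ge 1 : S_n^{\delta} > x\}$. Since $\{\tau(x) < \infty\} = \{M_{\delta} > x\}$, the fundamental identity of sequential analysis establishes that

\[
P(M_{\delta} > x) = P(\tau(x) < \infty) = \exp\left(-\theta^{\delta}x\right) E_{\theta^{\delta}}\exp\left(-\theta^{\delta}\left(S_{\tau(x)}^{\delta} - x\right)\right),
\]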

where

\[
P_{\theta^{\delta}}(A) = E\left(\exp\left(\theta^{\delta}(S_n - n\delta)\right)1_A\right)
\]

for every set $A \in \sigma(X_1, \dots, X_n)$ (where $\sigma(X_1, \dots, X_n)$ is the sigma-field generated by $X_1, \dots, X_n$). The “overshoot” $R(x) \triangleq S_{\tau(x)}^{\delta} - x$ can be interpreted as the residual life time of the embedded renewal process generated by the strictly ascending ladder heights of $S^{\delta}$. The standard Cramer-Lundberg asymptotic is then obtained by applying renewal theory at strictly ascending ladder heights yielding

\[
P(M_{\delta} > x) \sim \exp\left(-\theta^{\delta}x + r(\delta)\right) \tag{1}
\]

as $x \to \infty$, where $r(\delta) = \log E\exp\left(-\theta^{\delta}R(\infty)\right)$.

Moreover, since we are assuming strongly non-lattice increment distributions, a result by Stone (1965) on rates of convergence in renewal theory guarantees an exponential rate of convergence in (1). In particular, the Cramer-Lundberg representation

\[
P(M_{\delta} > x) = \exp\left(-\theta^{\delta}x + r(\delta)\right) + O\left(e^{-ax}\right) \tag{2}
\]

holds for some $a > 0$ (see Asmussen (2003) p. 196). It turns out that the exponential rate of convergence in (2) is uniform in $\delta > 0$ (see Lemma 5 of Siegmund (1979) or Lemma 1 below), allowing us to write (see Chang (1992)) the following scaled Cramer-Lundberg representation for $P(M_{\delta} > y(\delta))$

\[
P(M_{\delta} > y(\delta)) = \exp\left(-\theta^{\delta}y(\delta) + r(\delta)\right) + O\left(e^{-ay(\delta)}\right), \tag{3}
\]

which is valid for some $a > 0$ (uniformly on $\delta \in [0,\delta_1]$ for some $\delta_1 > 0$) and $y(\delta) = O(\delta^{-b})$ for $b > 0$. Of special importance is the case in which $b = 1$ (i.e. $y(\delta) = O(\delta^{-1})$). Using the implicit function theorem it is easy to see that $\theta^{\delta} = 2\delta + O(\delta^{2})$; we therefore can recover, from (3), Kingman's (1963) diffusion approximation

\[
P(M_{\delta} > x/\delta) \approx \exp(-2x) + o(1), \tag{4}
\]

valid as $\delta \searrow 0$, for $x > 0$.
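To see the relation $\theta^{\delta} = 2\delta + O(\delta^{2})$ numerically, one can solve $\phi(\theta^{\delta}) = \exp(\theta^{\delta}\delta)$ for a concrete light tailed example. The sketch below assumes $X_1 = \xi - 1$ with $\xi$ a unit-rate exponential (mean zero, variance one), so that $\phi(\theta) = e^{-\theta}/(1-\theta)$ for $\theta < 1$; this choice of distribution, the function name and the bisection bracket are illustrative assumptions, not taken from the text.

```python
import math


def adjustment_coefficient(delta, tol=1e-14):
    """Positive root of log(phi(theta)) = theta * delta for X = Exp(1) - 1.

    With phi(theta) = exp(-theta) / (1 - theta), the equation reads
    -log(1 - theta) - theta * (1 + delta) = 0 on (0, 1).
    The bracket below assumes 0 < delta < 0.5.
    """
    def f(theta):
        return -math.log1p(-theta) - theta * (1.0 + delta)

    lo, hi = delta, 1.0 - 1e-12  # f(lo) < 0 < f(hi) on the assumed range
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for delta in (0.1, 0.01, 0.001):
        theta = adjustment_coefficient(delta)
        print(delta, theta, theta / (2.0 * delta))
```

The printed ratio $\theta^{\delta}/(2\delta)$ approaches one as $\delta$ decreases, in line with the expansion used to obtain (4).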

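In this chapter, we introduce a heavy tailed version of the scaled Cramer-Lundberg representation (3). In particular, if $E|X_1|^{3+\alpha} < \infty$ for $\alpha > 0$, then, for each $\varepsilon > 0$ sufficiently small, our proposed representation takes the form

\[
P(M_{\delta} > y(\delta)) =
\begin{cases}
\exp\left(-\theta_{\alpha}^{\delta}y(\delta) + r_{\alpha}(\delta)\right) + o\left(\delta^{\alpha-\varepsilon}\right) & \text{if } y(\delta) = o\left(\delta^{-b}\right) \text{ for } b = 1,\\[4pt]
\int_{y(\delta)}^{\infty} P(X_1 > u)\,du + o\left(y(\delta)^{\alpha+1}\right) & \text{if } y(\delta) = o\left(\delta^{-b}\right) \text{ for } b > 1.
\end{cases} \tag{5}
\]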

The constants $\theta_{\alpha}^{\delta}$ and $r_{\alpha}(\delta)$ correspond to natural approximations for $\theta^{\delta}$ and $r(\delta)$ respectively; their form is discussed in detail below. The case $y(\delta) = O(\delta^{-b})$ for $b > 1$ is derived under the additional assumption that the increments possess regularly varying tails, although the technical details are not discussed in this dissertation (see Blanchet, Olvera-Cravioto, and Glynn (2004) for additional detail on this case). It suffices to remark in the present discussion that representation (5) generalizes the scaled Cramer-Lundberg representation (3) and reconciles our proposed representation with previous Cramer-Lundberg type asymptotics developed for fixed values of $\delta$ (see Embrechts, Klüppelberg and Mikosch (1997) p. 39). In our development here, we will focus only on the “diffusion” region of the space, namely, $y(\delta) = O(\delta^{-1})$.

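In order to understand the nature of the constants $\theta_{\alpha}^{\delta}$ and $r_{\alpha}(\delta)$ let us analyze the elements describing (3). Using the implicit function theorem, it is possible to develop an approximation for $\theta^{\delta}$ in terms of

\[
\theta_{\alpha}^{\delta} \triangleq 2\delta + \sum_{0\le j\le\alpha} \frac{\xi_{j+2}\,\delta^{j+2}}{(j+2)!} = \theta^{\delta} + o\left(\delta^{\lfloor\alpha\rfloor}\right), \tag{6}
\]

where $\xi_2 = 8EX_1^{3}/3$, and $\xi_j$ depends on the first $j+1$ moments of $X_1$. Also, it turns out that $r(\delta)$ can be computed explicitly in terms of Woodroofe's (1979) integral form

\[
r(\delta) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\theta^{\delta}}{\left(\theta^{\delta}+i\lambda\right)i\lambda}\,\log\left(\frac{\phi\left(\theta^{\delta}\right) - e^{-\delta i\lambda}\phi\left(\theta^{\delta}+i\lambda\right)}{-i\left(\phi'\left(\theta^{\delta}\right) - \delta\phi\left(\theta^{\delta}\right)\right)\lambda}\right)d\lambda, \tag{7}
\]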


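see also Siegmund (1985) p. 176. It is not hard to verify, using a dominated convergence argument and Proposition 8.44 of Breiman (1992), that if

\[
r_{\alpha}(\delta) \triangleq \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\theta_{\alpha}^{\delta}}{\left(\theta_{\alpha}^{\delta}+i\lambda\right)i\lambda}\,\log\left(\frac{\gamma_{\alpha}\left(\theta_{\alpha}^{\delta}\right) - e^{-\delta i\lambda}\sum_{k\le\alpha+1} g^{(k)}(\lambda)\left(\theta_{\alpha}^{\delta}\right)^{k}/k!}{-i\left(\gamma_{\alpha}'\left(\theta_{\alpha}^{\delta}\right) - \delta\gamma_{\alpha}\left(\theta_{\alpha}^{\delta}\right)\right)\lambda}\right)d\lambda \tag{8}
\]

with $\gamma_{\alpha}(\cdot)$ defined as

\[
\gamma_{\alpha}(\theta) = 1 + \sum_{0\le j\le\alpha} \frac{\theta^{j+2}EX^{j+2}}{(j+2)!},
\]

then,

\[
r(\delta) = r_{\alpha}(\delta) + o\left(\delta^{\alpha}\right).
\]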
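The truncated generating function $\gamma_{\alpha}$ only needs the moments of order $2, \dots, \lfloor\alpha\rfloor + 2$. The following sketch evaluates it from a table of moments and compares it with an exact moment generating function in a case where the latter exists. The test distribution $X = \xi - 1$ with $\xi$ unit-rate exponential (central moments $1, 2, 9$ of orders $2, 3, 4$), the function names and the dictionary layout are assumptions made for illustration; the heavy tailed setting of this chapter has no exact $\phi$ to compare against.

```python
import math


def gamma_alpha(theta, moments, alpha):
    """Evaluate 1 + sum_{0 <= j <= alpha} theta^(j+2) E[X^(j+2)] / (j+2)!.

    `moments[k]` is assumed to hold E[X^k]; the mean is taken to be zero,
    so no linear term appears.
    """
    total = 1.0
    for j in range(int(math.floor(alpha)) + 1):
        k = j + 2
        total += theta ** k * moments[k] / math.factorial(k)
    return total


# Illustrative light tailed check case: X = Exp(1) - 1.
EXP_MOMENTS = {2: 1.0, 3: 2.0, 4: 9.0}


def phi_exp(theta):
    """Exact moment generating function of Exp(1) - 1, valid for theta < 1."""
    return math.exp(-theta) / (1.0 - theta)


if __name__ == "__main__":
    theta = 0.02
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, gamma_alpha(theta, EXP_MOMENTS, alpha) - phi_exp(theta))
```

For small $\theta$ the discrepancy shrinks as more moments are included, which is the behavior the estimate $r(\delta) = r_{\alpha}(\delta) + o(\delta^{\alpha})$ relies on.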

In view of the fact that for θδα and rα (δ) to be meaningful only finitely many moments

of X1 are required to exist, the previous estimates together with (3) suggest the

natural scaled Cramer-Lundberg representation provided. Summarizing, the main

result of this chapter is the following.

Theorem 1 Suppose that $E|X_1|^{\alpha+3} < \infty$ for $\alpha > 0$, and that the distribution of $X_1$ is strongly non-lattice. Then,

\[
P(M_{\delta} > x/\delta) = \exp\left(-\theta_{\alpha}^{\delta}x/\delta + r_{\alpha}(\delta)\right) + o\left(\delta^{\alpha-\varepsilon}\right) \tag{9}
\]

as $\delta \searrow 0$ for $\varepsilon > 0$ sufficiently small and $x > 0$ fixed.

Remark 1 As we shall see, the slack term $\varepsilon > 0$ comes from an estimate involving Spitzer identities and the Wiener-Hopf factorization (see Proposition 3). In other words, if we could set $\varepsilon = 0$ in Proposition 3, then Theorem 1 would hold assuming only $E|X_1|^{\alpha+3} < \infty$ for $\alpha \ge 0$ with an error of order $o(\delta^{\alpha})$.

Remark 2 We will also see that Theorem 1 could have been formulated in a more robust form as follows.


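Theorem 2 (Robust form) Let $\mathcal{G}$ be the class of random variables $Y$ such that

i) $\sup_{Y\in\mathcal{G}} E|Y|^{\alpha+3} < \infty$ for all $\alpha > 0$.

ii) The distribution of $Y$ equals the distribution of $X_1$ on $[-1/\delta, 1/\delta]$.

iii) $X_1$ is strongly non-lattice and $EY = o(\delta^{\alpha})$.

Then, for $\varepsilon > 0$ sufficiently small and each $x > 0$ fixed,

\[
P\left(M_{\delta}^{Y} > x/\delta\right) = \exp\left(-\theta_{\alpha}^{\delta}x/\delta + r_{\alpha}(\delta)\right) + o\left(\delta^{\alpha-\varepsilon}\right),
\]

as $\delta \searrow 0$ (uniformly in $Y \in \mathcal{G}$) where $M_{\delta}^{Y}$ is the all time maximum of the random walk $S_n = Y_1 + \dots + Y_n - n\delta$, and the $Y_i$'s are iid rv's members of class $\mathcal{G}$.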

3.2 Connection to Corrected Diffusion Approximations

The approximation suggested by Theorem 1 is closely related to so-called “corrected

diffusion approximations” (CDA’s). These approximations are developed in the form

of asymptotic expansions in powers of δ > 0. These asymptotic expansions follow the

spirit of Edgeworth expansions for the central limit theorem and provide parametric

information (in δ > 0) about the distribution of the whole time maximum of ran-

dom walk. CDA’s for the distribution of Mδ were introduced by Siegmund (1979).

Assuming light tailed increments, Siegmund (1979) developed an expansion that corrects the diffusion approximation (4) up to an error of order $o(\delta^{2})$. Chang and Peres

(1997) obtained a complete asymptotic expansion for Gaussian random walks and,

as we have seen, a complete asymptotic expansion for general strongly non-lattice

increments with exponential moments was developed in the second chapter of this

dissertation.

A first order CDA (corrected diffusion approximation) to (4) in the case of heavy tailed increments was proposed by Hogan (1986). In particular, assuming that $E|X_1|^{5} < \infty$, and under some integrability conditions on the characteristic function of $X_1$ (which in particular imply the continuity of $X_1$) Hogan showed that

\[
P(M_{\delta} > x/\delta) = \exp(-2x)\left(1 + \delta\,\frac{4xEX_1^{3}}{3} - 2\delta\beta\right) + o(\delta).
\]

The constant $\beta$ was computed by Siegmund (1979) as

\[
\beta = \frac{1}{6}EX_1^{3} - \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{1}{\theta^{2}}\,\mathrm{Re}\log\left\{2(1-g(\theta))/\theta^{2}\right\}d\theta. \tag{10}
\]

Hogan’s strategy consists, essentially, in applying direct Fourier inversion to the char-

acteristic function of Mδ. His method of proof does not seem to extend directly to

higher order correction terms.

A more convenient representation for Hogan's approximation (which is guaranteed to give only non-negative values) can be written as

\[
P(M_{\delta} > x/\delta) \approx \exp\left(-2x\left(1 - 2\delta EX_1^{3}/3\right) - 2\delta\beta\right). \tag{11}
\]
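As an illustration of how (10) and (11) can be evaluated, the sketch below computes $\beta$ for standard normal increments, for which $g(\theta) = e^{-\theta^{2}/2}$ and $EX_1^{3} = 0$. The Gaussian case is chosen only because its characteristic function is explicit (it is light tailed, so it is not the setting of this chapter); the quadrature scheme, the cut-off and the function names are assumptions of this sketch. The value obtained is roughly 0.583.

```python
import math


def siegmund_beta_gaussian(upper=40.0, n=200_000):
    """Approximate beta in (10) for standard normal increments.

    By symmetry the integral over the real line is twice the integral over
    (0, inf). The midpoint rule is used on (0, upper); beyond `upper` the
    integrand equals log(2 / t^2) / t^2 up to negligible terms, whose
    integral is available in closed form.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        # 2 (1 - g(t)) / t^2 with g(t) = exp(-t^2 / 2); expm1 avoids
        # cancellation for small t.
        ratio = -2.0 * math.expm1(-0.5 * t * t) / (t * t)
        total += math.log(ratio) / (t * t)
    body = total * h
    tail = (math.log(2.0) - 2.0 * math.log(upper) - 2.0) / upper
    return -(body + tail) / math.pi


def hogan_tail(x, delta, beta, third_moment=0.0):
    """Right hand side of (11) for given x, delta, beta and E[X^3]."""
    exponent = -2.0 * x * (1.0 - 2.0 * delta * third_moment / 3.0)
    return math.exp(exponent - 2.0 * delta * beta)


if __name__ == "__main__":
    beta = siegmund_beta_gaussian()
    print(beta)
    print(hogan_tail(1.0, 0.1, beta), math.exp(-2.0))
```

For $\delta = 0$ the expression reduces to the diffusion approximation $\exp(-2x)$ of (4); for $\delta > 0$ and symmetric increments the correction lowers the estimate.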

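In order to recover Hogan's approximation (11) from (9) note that (using the same technique as in the proof of Theorem 3 in Chapter 2 of this dissertation),

\begin{align}
r_{\alpha}(\delta) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\theta_{\alpha}^{\delta}}{\left(\theta_{\alpha}^{\delta}+i\lambda\right)i\lambda}\,\log\left(\frac{2\left(\gamma_{\alpha}\left(\theta_{\alpha}^{\delta}\right) - e^{-\delta i\lambda}\sum_{k\le\alpha+1} g^{(k)}(\lambda)\left(\theta_{\alpha}^{\delta}\right)^{k}/k!\right)}{\lambda\left(\lambda - 2i\left(\phi'\left(\theta^{\delta}\right) - \delta\phi\left(\theta^{\delta}\right)\right)\right)}\right)d\lambda \notag\\
&\sim \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\theta_{\alpha}^{\delta}}{\left(\theta_{\alpha}^{\delta}+i\lambda\right)i\lambda}\,\log\left(2(1-\phi(i\lambda))\lambda^{-2}\right)d\lambda \sim 2\delta\beta. \tag{12}
\end{align}

The estimate (12) was obtained from the expansion $\theta_{\alpha}^{\delta} = 2\delta + \delta^{2}\,8EX_1^{3}/3 + o(\delta^{2})$, which is valid, as we show in Corollary 1 below, as long as $E|X_1|^{4+\alpha} < \infty$ for $\alpha \ge 0$. Approximation (11) can therefore be recovered by combining (12) and the expansion for $\theta_{\alpha}^{\delta}$ into (9). We stress that (9) does not provide a CDA in the parametric sense introduced by Siegmund (1979). Furthermore, the techniques introduced in Chapter 2 do not apply directly to provide an asymptotic expansion of $r_{\alpha}(\delta)$ in powers of the drift $\delta$ under the parameterization utilized here.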


3.3 Technical Development

Throughout the rest of the chapter we will suppose, in addition to the assumptions discussed at the beginning of this chapter, that $E|X_1|^{\alpha+3} < \infty$ for $\alpha > 0$. The strategy that we will pursue follows a truncation argument. Consider the sequence $\bar{X}^{\delta}$ of rv's $\bar{X}_k^{\delta} = X_k 1(|X_k| \le 1/\delta) - \delta$, for $k \ge 1$, and its associated random walk $\bar{S}^{\delta} = (\bar{S}_n^{\delta} : n \ge 0)$ (i.e. $\bar{S}_0^{\delta} = 0$ and $\bar{S}_n^{\delta} = \bar{X}_1^{\delta} + \dots + \bar{X}_n^{\delta}$). The idea is first to develop approximation (9) for the distribution of

\[
\bar{M}_{\delta} = \max_{n\ge 0} \bar{S}_n^{\delta}.
\]

Later, we will show that $P(M_{\delta} > x/\delta)$ and $P(\bar{M}_{\delta} > x/\delta)$ are suitably close.
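The truncation step is simple to state in code. The sketch below applies the map $X_k \mapsto X_k 1(|X_k| \le 1/\delta) - \delta$ to a list of samples; the function name is hypothetical and the sample values are arbitrary.

```python
def truncated_increments(xs, delta):
    """Return X_k * 1(|X_k| <= 1/delta) - delta for each sample X_k.

    Samples larger than 1/delta in absolute value are replaced by zero
    before the drift delta is subtracted, so every output lies in
    [-1/delta - delta, 1/delta - delta].
    """
    cutoff = 1.0 / delta
    return [(x if abs(x) <= cutoff else 0.0) - delta for x in xs]


if __name__ == "__main__":
    print(truncated_increments([0.5, -12.0, 3.0, 4.0], 0.25))
```

Because the truncated increments are bounded, all their exponential moments are finite, which is what allows the light tailed machinery of Section 3.1 to be applied to the truncated walk.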

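Put $\phi_{\delta}(\theta) = E\exp\left(\theta\bar{X}_1^{\delta}\right)$ and set $\psi_{\delta}(\theta) = \log\phi_{\delta}(\theta)$. Note that $\psi_{\delta}'(0) = -\delta + o\left(\delta^{\alpha+2}\right)$; therefore, if $\delta$ is small enough, we can guarantee that there is a strictly positive solution to the equation $\psi_{\delta}\left(\theta_{*}^{\delta}\right) = 0$. A similar argument to that given previously to obtain (1) yields

\[
P\left(\bar{M}_{\delta} > x\right) = \exp\left(-\theta_{*}^{\delta}x\right) E_{\delta}^{*}\exp\left(-\theta_{*}^{\delta}R_{\delta}(x)\right), \tag{13}
\]

where $\tau(x) = \inf\{n \ge 1 : \bar{S}_n^{\delta} > x\}$, $R_{\delta}(x) \triangleq \bar{S}_{\tau(x)}^{\delta} - x$ is the overshoot at level $x$, and

\[
P_{\delta}^{*}(A) = E\left(\exp\left(\theta_{*}^{\delta}\bar{S}_n^{\delta}\right)1_A\right)
\]

for every set $A \in \sigma\left(\bar{X}_1^{\delta}, \dots, \bar{X}_n^{\delta}\right)$ (where $\sigma\left(\bar{X}_1^{\delta}, \dots, \bar{X}_n^{\delta}\right)$ is the sigma-field generated by $\bar{X}_1^{\delta}, \dots, \bar{X}_n^{\delta}$). Renewal theory applied at the strictly increasing ladder heights of the random walk $\bar{S}^{\delta}$ implies that

\[
E_{\delta}^{*}\exp\left(-\theta_{*}^{\delta}R_{\delta}(x)\right) \to E_{\delta}^{*}\exp\left(-\theta_{*}^{\delta}R_{\delta}(\infty)\right)
\]

as $x \to \infty$, for fixed $\delta > 0$. Here, we are interested in applying renewal theory uniformly on $\delta \in (0,\delta_1)$. The next proposition (which is analogous to Lemma 5 of Siegmund (1979)) provides the means for doing so.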

Lemma 1 Let $\mathcal{F}$ be a family of distribution functions supported on $[0,\infty)$. For each $F \in \mathcal{F}$, let $E_F(\cdot)$ be the expectation operator associated to $F \in \mathcal{F}$, and define $E_F g(\tau) \triangleq \int_{[0,\infty)} g(t)F(dt)$ for each continuous and bounded function $g : [0,\infty) \to \mathbb{C}$. Suppose that the family $\mathcal{F}$ is uniformly strongly non-lattice (i.e. the corresponding characteristic functions $\chi_F(\lambda) = E_F\exp(i\lambda\tau)$ satisfy

\[
\inf_{F\in\mathcal{F}}\,\inf_{|\lambda|>\varepsilon} |1 - \chi_F(\lambda)| > 0). \tag{14}
\]

Then, $U_F(t) \triangleq \sum_{n=0}^{\infty} F^{*n}(t)$ satisfies the following.

1. If $\sup_{F\in\mathcal{F}} E_F\exp(\eta X_1) < \infty$ for some $\eta > 0$, then

\[
\sup_{F\in\mathcal{F}} \left| U_F(t) - \frac{t}{E_F\tau} - \frac{E_F\tau^{2}}{2E_F^{2}\tau} \right| = O\left(e^{-at}\right)
\]

as $t \to \infty$ for some $a > 0$.

2. Moreover, if supF∈F EF τε+2 <∞ for ε ≥ 0, then,

supF∈F

¯UF (t)− t

EF τ− EF τ

2

2E2F τ− H

F2 (t)

EF τ 2δ−HF

1 ∗HF1 (t)

¯= o

¡tα+2 log (t)

as $t \to \infty$, where $H_1^{F}(t) = \int_t^{\infty} (1 - F(s))\,ds \,/\, E_F\tau$ and $H_2^{F}(t) = \int_t^{\infty} H_1^{F}(s)\,ds$.

Proof. See Theorem 1 in Chapter 4 of this dissertation.
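A quick sanity check of the two-term expansion in part 1 of Lemma 1 is available when $F$ is exponential, because then the renewal function is known in closed form. The sketch below is an illustration under that assumption only (a single distribution, not a family): it sums the Erlang distribution functions $F^{*n}(t)$ numerically and compares the result with $t/E_F\tau + E_F\tau^{2}/(2E_F^{2}\tau)$. The function names and the number of summed terms are choices of this sketch.

```python
import math


def renewal_function_exp(t, rate=1.0, terms=200):
    """U(t) = sum_{n >= 0} F^{*n}(t) for F exponential with the given rate.

    F^{*n}(t) is an Erlang cdf, which equals P(N >= n) for N Poisson with
    mean rate * t. `terms` should comfortably exceed rate * t.
    """
    lam = rate * t
    total = 1.0            # F^{*0} is a unit mass at zero
    pois = math.exp(-lam)  # P(N = 0)
    cdf = pois             # P(N <= 0)
    for n in range(1, terms + 1):
        total += 1.0 - cdf  # F^{*n}(t) = 1 - P(N <= n - 1)
        pois *= lam / n     # now P(N = n)
        cdf += pois         # now P(N <= n)
    return total


def two_term_expansion(t, mean, second_moment):
    """t / E[tau] + E[tau^2] / (2 E[tau]^2), the leading terms in Lemma 1."""
    return t / mean + second_moment / (2.0 * mean ** 2)


if __name__ == "__main__":
    t = 5.0
    print(renewal_function_exp(t), two_term_expansion(t, 1.0, 2.0))
```

For the unit-rate exponential the two quantities agree exactly (both equal $1 + t$), so the error term in part 1 vanishes in this special case.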

A crucial assumption that must be verified when applying the previous lemma is

the strongly non-lattice condition (14). A key result that we shall use to verify this

assumption repeatedly throughout the rest of this chapter is the so-called Wiener-

Hopf factorization, which we now state without proof (see Theorem 8.3.1 of Asmussen

(2003) for a proof of this classical result).

Lemma 2 (Wiener-Hopf) Suppose that $Y = (Y_j : j \ge 1)$ is a sequence of iid rv's with characteristic function $g(\lambda) \triangleq E\exp(i\lambda Y_1)$. Define $S_n \triangleq Y_1 + \dots + Y_n$ and $S_0 \triangleq 0$. Put $\tau_+ = \inf\{n \ge 0 : S_n > 0\}$ and set $\tau_- = \inf\{n \ge 1 : S_n \le 0\}$. Finally, let $g_+(\lambda) = E(\exp(i\lambda S_{\tau_+}); \tau_+ < \infty)$ and put $g_-(\lambda) = E(\exp(i\lambda S_{\tau_-}); \tau_- < \infty)$. Then,

\[
1 - g(\lambda) = (1 - g_+(\lambda))(1 - g_-(\lambda)).
\]


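With Lemma 1 in hand, we now can provide a detailed asymptotic analysis of $E_{\delta}^{*}\exp\left(-\theta_{*}^{\delta}R_{\delta}(x/\delta)\right)$ as $\delta \searrow 0$ (for fixed $x > 0$), as the following proposition shows.

Proposition 1 There exists $\delta_{*} > 0$ and a function $f_1 : (0,\infty) \to (0,\infty)$, such that $f_1(z) = o\left(z^{-(1+\alpha)}\right)$ as $z \nearrow \infty$ for which

\[
\sup_{0\le\delta\le\delta_{*}} \left| P_{\delta}^{*}\left(R_{\delta}(x) > y\right) - P_{\delta}^{*}\left(R_{\delta}(\infty) > y\right) \right| \le y^{-1}f_1(y)\,x f_1(x) + f_1(x+y)
\]

for $x, y > 0$. Also, if $x = O(1/\Delta)$ we have

\[
\left| E_{\delta}^{*}\exp\left(-\Delta R_{\delta}(x)\right) - E_{\delta}^{*}\exp\left(-\Delta R_{\delta}(\infty)\right) \right| \le o\left(\Delta^{1+\alpha}\right)
\]

as $\Delta \searrow 0$ uniformly in $\delta \in (0,\delta_{*})$.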

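Proof. An analogous result was obtained by Chang (1992) when exponential moments exist. Our argument here follows Chang's argument; we provide the details for completeness. Applying renewal theory at strictly increasing ladder heights we have that

\[
P_{\delta}^{*}\left(R_{\delta}(x) > y\right) = \int_{[0,x)} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right) U_{\delta}^{*}(dt),
\]

where $\tau_+ = \inf\{n \ge 0 : \bar{S}_n^{\delta} > 0\}$ is the first strictly increasing ladder epoch, $\bar{S}_{\tau_+}^{\delta}$ is the first strictly increasing ladder height, and $U_{\delta}^{*}$ is the corresponding renewal measure generated by the strictly increasing ladder heights under the probability measure $P_{\delta}^{*}$. We also know from renewal theory (for fixed $\delta > 0$) that

\begin{align*}
P_{\delta}^{*}\left(R(\infty) > y\right) &= \frac{1}{E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}} \int_y^{\infty} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > t\right)dt\\
&= \frac{1}{E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}} \int_{-\infty}^{x} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right)dt.
\end{align*}

Thus,

\begin{align}
& P_{\delta}^{*}\left(R_{\delta}(x) > y\right) - P_{\delta}^{*}\left(R_{\delta}(\infty) > y\right) \notag\\
&\quad = \int_0^{x} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right)\left(U_{\delta}^{*}(dt) - \frac{dt}{E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}}\right) \tag{15}\\
&\quad\quad + \frac{1}{E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}} \int_{-\infty}^{0} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right)dt. \tag{16}
\end{align}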


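Note that (15) can be written as

\[
\int_0^{x} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right)\varepsilon_{\delta}^{*}(dt),
\]

where

\[
\varepsilon_{\delta}^{*}(t) = U_{\delta}^{*}(t) - \frac{t}{E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}} - \frac{E_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta}\right)^{2}}{2\left(E_{\delta}^{*}\bar{S}_{\tau_+}^{\delta}\right)^{2}}.
\]

Using properties of the convolution and the change of variable $u = x/t$ we obtain

\begin{align}
\int_0^{x} P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} > x + y - t\right)\varepsilon_{\delta}^{*}(dt) &= -\int_0^{x} \varepsilon_{\delta}^{*}(x - t)\,P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} \in y + dt\right) \notag\\
&= -\int_0^{1} \varepsilon_{\delta}^{*}(x - xt)\,P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} \in y + x\,dt\right). \tag{17}
\end{align}

Now, since $E\left(|X_1|^{3+\alpha}\right) < \infty$, $0 \le \bar{S}_{\tau_+}^{\delta} \le 1/\delta$ and $\theta_{*}^{\delta} \sim 2\delta$, we can guarantee that there exists $\delta_1 > 0$ such that

\begin{align*}
\sup_{0\le\delta\le\delta_1} E_{\delta}^{*}\left(\left(\bar{S}_{\tau_+}^{\delta}\right)^{2+\alpha}\right) &= \sup_{0\le\delta\le\delta_1} E\left(\left(\bar{S}_{\tau_+}^{\delta}\right)^{2+\alpha}\exp\left(\theta_{*}^{\delta}\bar{S}_{\tau_+}^{\delta}\right)\right)\\
&\le M \sup_{0\le\delta\le\delta_1} E\left(\left(\bar{S}_{\tau_+}^{\delta}\right)^{2+\alpha}\right) < \infty.
\end{align*}

Let us verify that the laws $P_{\delta}^{*}\left(\bar{S}_{\tau_+}^{\delta} \in ds\right)$ are uniformly strongly non-lattice. First, it is almost immediate to see that the laws $P_{\delta}^{*}\left(\bar{X}_1 \in ds\right)$ are uniformly strongly non-lattice. From Lemma 2 we have that

\[
\frac{1}{2}\left|1 - e^{-\delta i\lambda}E_{\delta}^{*}\exp\left(i\lambda\bar{X}_1\right)\right| \le \left|1 - e^{-\delta i\lambda}E_{\delta}^{*}\exp\left(i\lambda\bar{S}_{\tau_+}^{\delta}\right)\right|.
\]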

The uniform strongly non-lattice assumption can be easily verified from the previous

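inequality. Consequently, we can apply Lemma 1 to conclude, from (17), that

$$\begin{aligned}
\left| \int_0^x P^*_\delta\left(S^\delta_{\tau_+} > x + y - t\right) \varepsilon^*_\delta(dt) \right|
&\le \int_0^1 |\varepsilon^*_\delta(x - xt)|\, P^*_\delta\left(S^\delta_{\tau_+} \in y + x\,dt\right) \\
&\le \int_0^{1/2} |\varepsilon^*_\delta(x - xt)|\, P^*_\delta\left(S^\delta_{\tau_+} \in y + x\,dt\right) + \int_{1/2}^1 |\varepsilon^*_\delta(x - xt)|\, P^*_\delta\left(S^\delta_{\tau_+} \in y + x\,dt\right) \\
&\le o(x^{-\alpha})\, P^*_\delta\left(S^\delta_{\tau_+} \ge y\right) + P^*_\delta\left(S^\delta_{\tau_+} \ge y + x/2\right) \\
&= o(x^{-\alpha})\, o(y^{-2-\alpha}) + o\left((y + x)^{-2-\alpha}\right).
\end{aligned}$$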

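On the other hand, in absolute value, the term (16) equals

$$\frac{1}{E^*_\delta S^\delta_{\tau_+}} \int_{x+y}^\infty P^*_\delta\left(S^\delta_{\tau_+} > t\right) dt = o\left((x + y)^{-1-\alpha}\right).$$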

This yields the first part of this proposition. For the second part, note that

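$$E^*_\delta \exp(-\Delta R^\delta(x)) = \int_0^\infty e^{-u} P^*_\delta\left(R^\delta(x) \le u/\Delta\right) du,$$

thus

$$\left| E^*_\delta \exp(-\Delta R^\delta(x)) - E^*_\delta \exp(-\Delta R^\delta(\infty)) \right| \le \int_0^\infty e^{-u} \left( o\left(u^{-2-\alpha} \Delta^{2+\alpha}\right) o(x^{-\alpha}) + o\left(\Delta^{1+\alpha} (x\Delta + u)^{-1-\alpha}\right) \right) du = o(\Delta^{1+\alpha})$$

as long as $x = O(1/\Delta)$; this provides the second part of the statement.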

We are almost ready to show that our stated approximation (9) is valid for the truncated random walk $S^\delta$. Let us first provide a couple of elementary results describing the asymptotic behavior of $\theta^\delta_*$ and $\phi_\delta(\theta^\delta_*)$ as $\delta \searrow 0$.

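Proposition 2 Let

$$\tilde\phi_\delta(\theta) \triangleq E\left( \exp\left(\theta X_1 1(|X_1| \le 1/\delta)\right) \right).$$

(Observe that $\phi_\delta(\theta) = \exp(-\theta\delta)\, \tilde\phi_\delta(\theta)$.) Then, for all $\theta \in [-M\delta, M\delta]$ with $M > 0$,

$$\left| \tilde\phi_\delta(\theta) - \sum_{1 \le j \le \alpha+3} E X_1^j 1(|X_1| \le 1/\delta) \frac{\theta^j}{j!} \right| \le o(\delta^{\alpha+2}).$$

Furthermore,

$$\left| \tilde\phi_\delta(\theta) - \sum_{1 \le j \le \alpha+3} E X_1^j \frac{\theta^j}{j!} \right| \le o(\delta^{\alpha+2}),$$

for $\theta \in [-M\delta, M\delta]$.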

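Proof. The proof proceeds by expanding, for fixed $\delta$, the function

$$\tilde\phi_\delta(\theta) = \sum_{1 \le j \le \alpha+2} E X_1^j 1(|X_1| \le 1/\delta) \frac{\theta^j}{j!} + E\left( X_1^{\lfloor \alpha+3 \rfloor} \exp\left(\eta X_1 1(|X_1| \le 1/\delta)\right) \right) \frac{\theta^{\lfloor \alpha+3 \rfloor}}{\lfloor \alpha+3 \rfloor !},$$

where $|\eta| \le |\theta| \le M\delta$. Hence,

$$\left| E\left( X_1^{\lfloor \alpha+3 \rfloor} \exp\left(\eta X_1 1(|X_1| \le 1/\delta)\right) \right) \frac{\theta^{\lfloor \alpha+3 \rfloor}}{\lfloor \alpha+3 \rfloor !} \right| \le \frac{(M\delta)^{\lfloor \alpha+3 \rfloor}}{\lfloor \alpha+3 \rfloor !} \exp(M)\, E\left| X_1^{\alpha+3} \right| = o(\delta^{\alpha+2}).$$

The fact that $E X_1^j 1(|X_1| \le 1/\delta) - E X_1^j = o(\delta^{\alpha+3-j})$ can be easily checked; this yields the conclusion of the proposition.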

As a consequence of the previous proposition we obtain the next corollary.

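Corollary 1

$$\left| \theta^\delta_* - \sum_{j \le \alpha+2} \frac{\xi_j}{j!} \delta^j \right| \le o(\delta^{\alpha+1}).$$

The constants $\xi_j$ are computed via the system of linear equations:

$$\sum_{m=0}^n \binom{n}{m} \frac{\kappa_{m+2}}{m+2}\, \xi_{n-m+1} = 0; \quad 0 \le n \le \alpha+1,$$

where $\kappa_j$ is the $j$th cumulant of $X_1$.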

Proof. The interesting case arises when exponential moments fail to exist; in that case $\log \tilde\phi_\delta(1) > \delta$ for all $\delta$ small. Therefore, by strict convexity of $\log \tilde\phi_\delta(\cdot)$, we must have $\theta^\delta_* \le \delta$. This implies that $\theta^\delta_*$ is in the domain in which the expansion of Proposition 2 is valid. The rest of the conclusion follows from the implicit function theorem.

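Proposition 3 If $y(\delta) = O(\delta^{-b})$ for $b \le 1$, and $x > 0$, then

$$P\left(\bar M^\delta > x/\delta\right) = \exp\left(-\theta^\delta_\alpha x/\delta + r_\alpha(\delta)\right) + o(\delta^\alpha).$$

Proof. By Corollary 1 we have

$$P\left(\bar M^\delta > x/\delta\right) = \exp\left(-\theta^\delta_* x/\delta\right) E^*_\delta \exp\left(-\theta^\delta_* R^\delta(x/\delta)\right).$$

On the other hand, Proposition 1 asserts that

$$\left| E^*_\delta \exp\left(-\theta^\delta_* R^\delta(x/\delta)\right) - E^*_\delta \exp\left(-\theta^\delta_* R^\delta(\infty)\right) \right| \le o(\delta^{\alpha+1}),$$

as long as $\theta^\delta_* = O(\delta)$, which holds by virtue of Corollary 1. Now, observe that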

logE∗δ exp¡−θδ∗Rδ (∞)

¢=−12π

Z ∞

−∞

θδ∗¡θδ∗ + iλ

¢iλlog

eφδ

¡θδ∗¢− e−δiθeφδ

¡θδ∗ + iθ

¢−i³eφ0δ ¡θδ∗¢− δeφδ

¡θδ∗¢´

θ

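dθ.

Since

$$\left| g^{(m)}(i\theta) - \tilde\phi^{(m)}_\delta(i\theta) \right| \le 2E\left(|X^m| 1(|X| > 1/\delta)\right) = o(\delta^{\alpha+3-m}),$$

a routine dominated convergence argument (obtained with the aid of Proposition 8.44 of Breiman (1992)) yields

$$\log E^*_\delta \exp\left(-\theta^\delta_* R^\delta(\infty)\right) - r_\alpha(\delta) = o(\delta^\alpha).$$

The proposition is proved by combining these estimates.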

The next step is to show that $P(\bar M^\delta > x/\delta) - P(M^\delta > x/\delta) = o(\delta^\alpha)$. We will do this by taking advantage of a geometric sum representation of the maximum of random walks with negative drift. Specifically, let us write $\tau_+ = \tau(0)$ and $S^\delta_{\tau_+}$ (resp. $\bar\tau_+ = \bar\tau(0)$ and $\bar S^\delta_{\bar\tau_+}$) to denote the first strictly ascending ladder epoch and first strictly ascending ladder height of the random walk $S^\delta$ (resp. $\bar S^\delta$). It is well known that

$$M^\delta \stackrel{D}{=} Z \triangleq \sum_{j=1}^{G(p_\delta)} T_{j,\delta}, \tag{18}$$

where $T_\delta = (T_{j,\delta} : j \ge 1)$ is a sequence of iid rv’s with distribution function given by $P(T_{1,\delta} \le t) = P(S^\delta_{\tau_+} \le t \mid \tau_+ < \infty)$ and $G(p_\delta)$ is geometrically distributed with parameter $p_\delta = P(\tau_+ = \infty)$ (i.e. $P(G(p_\delta) = k) = p_\delta (1 - p_\delta)^k$ for $k \ge 0$). A completely analogous representation is also valid for $\bar M^\delta$, namely

$$\bar M^\delta \stackrel{D}{=} \bar Z \triangleq \sum_{j=1}^{G(\bar p_\delta)} \bar T_{j,\delta}, \tag{19}$$

with an iid sequence $\bar T = (\bar T_{j,\delta} : j \ge 1)$ such that $P(\bar T_{1,\delta} \le t) = P(\bar S^\delta_{\bar\tau_+} \le t \mid \bar\tau_+ < \infty)$ and a parameter $\bar p_\delta = P(\bar\tau_+ = \infty)$ for the geometric rv $G$.

It is natural to expect that if the moments of $M^\delta$ and $\bar M^\delta$ are close, then their corresponding distributions do not differ significantly. The next result (whose proof is given at the end of the section) shows that the moments of $T_{1,\delta}$ and $\bar T_{1,\delta}$ are close as $\delta \searrow 0$ (this implies, in view of representations (18) and (19), that the moments of $\bar M^\delta$ provide good approximations, in some sense, for those of $M^\delta$).

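Theorem 3 For each $\varepsilon > 0$ small enough

$$\bar p_\delta = p_\delta + o(\delta^{\alpha+1-\varepsilon}),$$

$$E(\bar T_{1,\delta}) = E(T_{1,\delta}) + o(\delta^{\alpha-\varepsilon}).$$

Moreover, for $2 \le j \le \alpha + 2$

$$E\left(\bar T^j_{1,\delta}\right) = E\left(T^j_{1,\delta}\right) + o(\delta^{\alpha+2-j-\varepsilon}).$$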

Proof. Given at the end of the section.

Theorem 1 is just an immediate consequence of the next final proposition.

Proposition 4

$$P\left(\bar M^\delta > x/\delta\right) - P\left(M^\delta > x/\delta\right) = o(\delta^\alpha).$$

Proof. Applying Theorem 1.1 of Kalashnikov (1997) (see also Proposition 1 in the fourth chapter of this dissertation for a somewhat shorter argument) we obtain

$$P(M > x/p) = q E q^{N(x/p)}. \tag{20}$$

An argument similar to the one given in the proof of Proposition 1 (by means of the Wiener-Hopf factorization) can be used to easily verify the uniform strong non-latticity (for $\delta > 0$ sufficiently small) of the distributions of both $T_{1,\delta}$ and $\bar T_{1,\delta}$. In addition, note that both $T_{1,\delta}$ and $\bar T_{1,\delta}$ have uniformly (in $\delta > 0$) bounded moments of order $\lfloor \alpha + 2 \rfloor$. Using renewal theory (in its uniform version, as in Lemma 1) we shall obtain, in Theorem 3 of this dissertation’s fourth chapter, asymptotic expansions (as $p \searrow 0$) for $P(Z > x/p)$, which, combined with (20), allows writing

$$P\left(M^\delta > x/\delta\right) = \exp\left(a_\delta(p_\delta)\, x/\delta + b_\delta(p_\delta)\right) + o(\delta^\alpha),$$

where $a_\delta(p_\delta)$ and $b_\delta(p_\delta)$ satisfy

$$a_\delta(p_\delta) = \sum_{k \le \alpha} \frac{a^{(k+1)}_\delta(0)}{(k+1)!}\, p_\delta^{k+1}, \tag{21}$$

$$b_\delta(p_\delta) = \sum_{k \le \alpha} \frac{b^{(k+1)}_\delta(0)}{(k+1)!}\, p_\delta^{k+1}, \tag{22}$$

with $a^{(m)}_\delta(0)$ and $b^{(m)}_\delta(0)$ depending algebraically on the first $m$ and $m+1$ moments respectively of $T_{1,\delta}$ (see Theorem 4 in the fourth chapter). Similarly,

$$P\left(\bar M^\delta > x/\delta\right) = \exp\left(\bar a_\delta(\bar p_\delta)\, x/\delta + \bar b_\delta(\bar p_\delta)\right) + o(\delta^\alpha),$$

where $\bar a_\delta(\bar p_\delta)$ and $\bar b_\delta(\bar p_\delta)$ have representations analogous to (21) and (22) above. This implies, by virtue of Theorem 3, that

$$\delta^{-1}\left(\bar a_\delta(\bar p_\delta) - a_\delta(p_\delta)\right) = o(\delta^{\alpha-\varepsilon}) = \bar b_\delta(\bar p_\delta) - b_\delta(p_\delta),$$

which in turn implies the statement of the proposition.

Proof of Theorem 3. We first estimate $|\bar p_\delta - p_\delta|$. Recall that

pδ = P¡τ δ+ =∞

¢=

∞Xn=1

1

nP¡Sδn > 0

¢=

∞Xn=1

1

nP

µSnn> δ

¶,

similarly

pδ =∞Xn=1

1

nP

µSnn> δ

¶.

Thus,

|pδ − pδ| ≤∞Xn=1

1

nE

¯1

µSnn> δ

¶− 1

µSnn> δ

¶¯=

∞Xn=1

1

nP

µSnn> δ;

Snn≤ δ

¶+

∞Xn=1

1

nP

µSnn> δ;

Snn≤ δ

¶.

Now, fix ε > 0 small and write

∞Xn=1

1

nP

µSnn> δ;

Snn≤ δ

¶=

Xn≤1/δ2+ε0

1

nP

µSnn> δ;

Snn≤ δ

+X

n>1/δ2+ε

1

nP

µSnn> δ;

Snn≤ δ

¶.

Observe that

P

µSnn> δ;

Snn≤ δ

¶≤ P

³nmaxk=1

|Xk| > 1/δ´

= 1− ¡1− F (1/δ)¢n ,where F (x) = P (X > x). Since E

¡|X|3+α¢ < ∞ we have that F (1/δ) = o¡δ3+α

¢.

Thus, we can write

P

µSnn> δ;

Snn≤ δ

¶≤ 1− ¡1− o ¡δ3+α¢¢n .

However,Xn≤1/δ2+ε

1

nP

µSnn> δ;

Snn≤ δ

¶≤ M log

¡1/δ2+ε

¢ ³1− ¡1− o ¡δ3+α¢¢1/δ2+ε´

= M log¡1/δ2+β

¢ ³1− ¡1− δ2+εo

¡δ1+a−ε

¢¢1/δ2+ε´= o

¡δ1+α−ε0

¢for ε0 > ε > 0 small enough. Now, put eψδ (θ) = log eφδ (θ) (recall that eφδ (θ) was

defined in Proposition 2) and use Chernoff’s bound to obtain

P

µSnn> δ;

Snn≤ δ

¶≤ P

µSnn> δ

¶≤ exp

³−n

³δθδ − eψδ

¡θδ¢´´

,

where θδ satisfies the equation eψ0δ ¡θδ¢ = δ (which, can be easily seen to have a

solution for δ > 0 small enough). Hence,Xn>1/δ2+ε

1

nP

µSnn> δ;

Snn≤ δ

¶≤

Xn>1/δ2+ε

1

nexp

³−n

³δθδ − eψδ

¡θδ¢´´

≤exp

³− ¥1/δ2+ε¦ ³δθδ − eψδ

¡θδ¢´´

1− exp³−³δθδ − eψδ

¡θδ¢´´

= o (exp (−r/δε))

for r > 0 (since³δθδ − eψδ

¡θδ¢´ ∼ δ2/2), the previous term is obviously of order

o¡δ1+α+ε0

¢. For the term

Xn>1/δ2+ε

1

nP

µSnn> δ;

Snn≤ δ

¶≤

Xn>1/δ2+ε

1

nP

µSnn> δ

¶,

we first note that (since E |X1|3+α <∞)

P (X1 > x) ≤ P (|X1| > x) ≤ C (1 + x)−(α+3) , V (x)

for some constant C > 0. Corollary 4.2 of Borovkov (2000) implies that

supx≥t√(α+1)n logn

P (maxk≤n Sn > x)nV (x)

≤ 1 + h (t) ,

where h (t)→ 0 as t%∞. In our case n > 1/δ2+ε, thereforex = nδ ≥ δ−(1+ε) ≥ t1δ−(1+5ε/6)

≥ tq(α+ 1)

¡1/δ2+ε

¢log¡1/δ2+ε

¢ ≥ tp(α+ 1)n log n,for large enough (but fixed) constants t1, t > 0. We therefore conclude that there

exists t2 > 0 such thatXn>1/δ2+ε

1

nP

µSnn> δ

¶≤ t2

Xn>1/δ2+ε

V (nδ) = o¡δα+1

¢(is analogous. This gives the estimate pδ = pδ + o

¡δα+1−ε

¢.

Now let us write µ+ (j, δ) = ETj1,δ and µ+ (j, δ) = ET

j

1,δ for j ≤ bα+ 2c. Similarly,we use the symbol µ− (j, δ) (resp. µ− (j, δ)) to denote the jth moment of the first

weakly descending ladder height of the random walk Sδ (resp. the jth moment

of the first weakly descending ladder height of Sδ). Finally, let µj = E (X1 − δ)j

and µj = E³X

δ

1

´j. The Wiener-Hopf factorization (Lemma 2) then asserts that

µ1 = pµ−,1 (and that µ1 = pµ−,1), that is

µ− (j, δ)− µ− (j, δ) =µ1

p+ o¡δα+1−ε

¢ − µ1p

=µ1 − µ1

p+ o¡δα+1−ε

¢ − µ1Ã

1

p+ o¡δα+1−ε

¢ − 1p

!

= o¡δα+1

¢− δ

Ão¡δα+1−ε

¢¡p+ o

¡δα+1−ε

¢¢p

!= o

¡δα−ε

¢. (23)

Also from the Wiener-Hopf factorization we obtain

µ+ (1, δ) =pµ− (2, δ)− µ22µ− (1, δ)

. (24)

Therefore, in order to continue, we need to estimate the difference between $\bar\mu_-(j, \delta)$ and $\mu_-(j, \delta)$. These differences will be estimated via Fourier methods.

Since we are assuming strongly non-lattice we can use the following identity

log

Ã1−E ¡exp ¡∆Sδ

τ−¢¢

−E ¡∆Sδτ−¢ !

=1

Z ∞

−∞

∆¡∆2 + λ2

¢ Re logµ1− eδiθg (−λ)−iδλ¶dλ (25)

− 12π

Z ∞

−∞

∆2¡∆2 + λ2

¢λIm log

µ1− eδiθg (−λ)−iδλ

¶dλ,

for $\Delta > 0$. This identity is almost the same as the one derived via Corollary 8.45 and Theorem 8.51 of Siegmund (1985), which is obtained for the strictly ascending ladder height; however, a straightforward adaptation of Siegmund’s argument shows that the result also holds for the descending ladder height as displayed in (25). An expansion of the left hand side of (25) in powers of $\Delta$ (up to order $\lfloor \alpha + 2 \rfloor$) generates a sequence of coefficients $c_j(\delta)$. Note that the ratios $\mu_-(k, \delta)/\mu_-(1, \delta)$ (for $k \le \lfloor \alpha + 2 \rfloor$) can be recovered from the coefficients $c_j(\delta)$, for $j \le k$, by solving a system of equations (in fact, $(-1)^j c_j(\delta)$ is the $j$th order cumulant of the limiting overshoot of the random walk $-S^\delta$, and $\mu_-(j+1, \delta)/\mu_-(1, \delta)$ is proportional to its $j$th moment).

Hence, we can compute the magnitude of the error between $\bar\mu_-(j+1, \delta)/\bar\mu_-(1, \delta)$ and $\mu_-(j+1, \delta)/\mu_-(1, \delta)$ by estimating $\bar c_j(\delta) - c_j(\delta)$. Consequently, it suffices to study the coefficients in the asymptotic expansion (in powers of $\Delta > 0$) of

log

ESδ

τ−¡1−E exp ¡∆Sδ

τ−¢¢

ESδτ−³1−E exp

³∆S

δ

τ−´´

=1

Z ∞

−∞

∆¡∆2 + λ2

¢ Re logáδ + o ¡δα+2¢¢ ¡1− eiδλg (−λ)¢δ (1− eiδθegδ (−λ))

!dλ (26)

− 12π

Z ∞

−∞

∆2¡∆2 + λ2

¢λIm log

áδ + o

¡δα+2

¢¢ ¡1− eiδλg (−λ)¢

δ (1− eiδθegδ (−λ))!dλ, (27)

where egδ (λ) = E exp (iλX11 (|X1| ≤ 1/δ)). The expansion of the integrals (26) and(27) can be easily obtained using Proposition 2 in the second chapter of this disser-

tation. For instance,

c1 − c1 =1

2

õ− (2, δ)µ− (1, δ)

− µ− (2, δ)µ− (1, δ) + o

¡δα−ε

¢! (using the LHS of (25))

=1

4

õ2 + o

¡δα+2

¢δ + o

¡δα+3

¢ − µ2δ

!(expanding (26) and (27)) (28)

1

Z ∞

−∞

1

λ2Re log

áδ + o

¡δk−1

¢¢ ¡1− eiδλg (−λ)¢

δ (1− eiδθegδ (−λ))!dθ. (29)

Since |g (−λ)− egδ (−λ)| ≤ o ¡δα+3¢, the term (29) is smaller than the term (28), and

it is straightforward to verify that

µ2 + o¡δα+1

¢δ + o

¡δα+2

¢ − µ2δ= o (δα) ;

which implies that

µ− (2, δ)− µ− (2, δ) = o¡δα−ε

¢.

We can continue in this fashion; for example, we observe that the error term in the dif-

ference between c2 (δ) and c2 (δ) is determined by that of the difference between µ3/µ1and µ3/µ1 (because now the coefficient of ∆

2 in the expansion of (26) involves µ3/δ−µ3/

¡δ + o

¡δα+2

¢¢and that of (27) is an integral involving µ2/δ−µ2/

¡δ + o

¡δα+2

¢¢).

So, we obtain that µ− (3, δ) − µ− (3, δ) = o¡δα−1

¢. Similarly, for the difference be-

tween µ− (n, δ) /µ− (1, δ) and µ− (n, δ) /µ− (1, δ) for n > 3, we observe that we must

look at the difference between µn/δ and µn/¡δ + o

¡δα+2

¢¢which yields that

µ− (n, δ)− µ− (n, δ) = o¡δα+2−n

¢.

Furthermore, using the Wiener-Hopf factorization we see that the error in $\bar\mu_+(n, \delta) - \mu_+(n, \delta)$ is determined by that of $\bar\mu_-(n, \delta) - \mu_-(n, \delta)$ which, in particular, implies the statement of the theorem.

Chapter 4

Asymptotic Expansions for

Geometric Sums with Applications

to Defective Renewal Equations

Consider a sequence $X = (X_k : k \ge 1)$ of non-negative independent and identically distributed (iid) random variables (rv’s). Suppose that $X_1$ is strongly non-lattice in the sense that its characteristic function, $g(\lambda) = E \exp(i\lambda X_1)$, satisfies that for every $\varepsilon > 0$

$$\inf_{|\lambda| > \varepsilon} |1 - g(\lambda)| > 0 \tag{1}$$

or, equivalently, that $\limsup_{|\lambda| \to \infty} |g(\lambda)| < 1$ (see Siegmund (1985) p. 176).

Let $M$ be a geometrically distributed random variable independent of $X$. That is,

$$P(M = k) = p(1 - p)^k = p q^k; \quad k \ge 0.$$

Our focus here is on the distribution of

$$S_M \triangleq \sum_{k=1}^M X_k,$$

CHAPTER 4. GEOMETRIC SUMS AND APPLICATIONS 67

($S_M \triangleq 0$ on $M = 0$) when the success probability $p$ of the geometric random variable $M$ is small. The rv $S_M$ is called a geometric sum. Renyi’s theorem for geometric sums of random variables establishes that if $EX_1 < \infty$, then

$$P(pS_M > x) = \exp(-x/E(X_1)) + o(1) \tag{2}$$

as $p \searrow 0$. In this chapter, under the assumption of strongly non-lattice increments,

we develop additional order correction terms (in powers of p) to approximation (2)

(see equation (18) below). These types of expansions are similar in spirit to the

Edgeworth expansions for the central limit theorem. As in Edgeworth expansions,

the existence of certain order moments has to be imposed in order to provide the nth

order correction term. See, for example, Theorem 3.

The rv SM is utilized in many applied probability settings. For example, in

queueing theory, it is well known (by appealing to the ascending ladder heights rep-

resentation for the maximum of random walk) that the steady-state waiting time

distribution of the standard single server queue can be represented as a geometric

sum with non-negative increments (c.f. Asmussen (1987) or Kalashnikov (1997), Sec-

tion 1.3.3). In insurance risk theory, the ruin probability in the renewal model can

also be expressed as a tail probability of a geometric sum with non-negative incre-

ments (see Asmussen (2001) or Kalashnikov (1997), Section 1.3.4). Finally, in the

context of reliability models, the first break-down time of a system that consists of

an operating element, N − 1 unloaded redundant elements and M identical repair

units, can also be expressed as a geometric sum such as SM (refer also to Kalashnikov

(1997), Section 1.3.5). Other applications include program debugging and the total

reward until visiting a rare set in a Markov setting. (See the book on geometric sums

by Kalashnikov (1997) for additional details.)

The setting in which the success probability p is close to zero arises often in appli-

cations. For instance, in the queueing example mentioned in the previous paragraph,

this setting corresponds to the so-called heavy traffic regime in which the server uti-

lization is close to 100%; in the risk insurance context, p close to zero describes the

setting in which the security margin, included in the risk premium received by the

insurance company, is close to zero. Finally, in the reliability example, p close to zero

reflects a setting with a low break-down rate. In several of the examples above, the

distribution of the increments Xk depends on p as well. We must therefore develop

a theory that can handle this dependence. As an important application of the results

developed in this chapter (in particular Theorem 4) recall our results in Chapter 3

on high accuracy approximations for the maximum of random walk with heavy-tailed

increments. These approximations, as we have pointed out repeatedly, are very useful

in some of the applied settings mentioned at the beginning of our discussion (e.g. the

steady-state distribution of the single server queue and the ruin probability in the

insurance context).

In addition to the applications described above, there exists a close connection between so-called defective renewal equations and geometric sums. Indeed, if $a(\cdot)$ satisfies the defective renewal equation

$$a(t) = b(t) + q \int_{[0,t)} a(t - s)\, P(X_1 \in ds), \tag{3}$$

then, in great generality (see Lin and Willmot (2000) p. 152), it follows that

$$a(t) = \frac{1}{p} \int_{[0,t)} b(t - s)\, P(S_M \in ds). \tag{4}$$
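As a quick sanity check of (4), the following Python sketch evaluates its right hand side numerically in a case where everything is explicit. The choices are illustrative assumptions and not part of the original text: $X_1$ is taken to be exponential with rate 1 and $b(t) = P(X_1 > t) = e^{-t}$. In that case $S_M$ has an atom of size $p$ at zero and density $q p e^{-ps}$ on $(0,\infty)$, and one can verify directly that $a(t) = e^{-pt}$ solves (3). The function name and the trapezoidal rule are likewise only for illustration.

```python
import math


def a_via_geometric_sum(t, p, steps=20000):
    """Right hand side of (4) for the assumed case X_1 ~ Exp(1), b(u) = exp(-u).

    S_M has mass p at zero and density q*p*exp(-p*s) on (0, inf), so the
    integral in (4) splits into an atom term plus an ordinary integral,
    approximated here by the trapezoidal rule.
    """
    q = 1.0 - p

    def b(u):
        return math.exp(-u)

    def integrand(s):
        return b(t - s) * q * p * math.exp(-p * s)

    h = t / steps
    integral = 0.5 * (integrand(0.0) + integrand(t))
    integral += sum(integrand(i * h) for i in range(1, steps))
    integral *= h

    atom = p * b(t)  # contribution of P(S_M = 0) = p
    return (atom + integral) / p


if __name__ == "__main__":
    t, p = 2.0, 0.1
    print(a_via_geometric_sum(t, p), math.exp(-p * t))
```

Under these assumptions the two printed numbers should agree up to the quadrature error, which is one way to see that the atom of $S_M$ at zero must be included in the integral over $[0,t)$.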

Equation (4) makes the connection clear between solutions of defective renewal equa-

tions (such as (3)) and the distribution of geometric sums. It turns out that defective

renewal equations such as (3) play an important role in a number of applied prob-

ability settings. A prominent example is insurance risk theory; in particular, the

so-called “expected discounted penalty” at ruin (from which many quantities of in-

terest, including the ruin probability, can be recovered by judicious choices of the

discount rate and the penalty) can be expressed in terms of a defective renewal equation (see Lin and Willmot (2000) p. 162). Many other examples in which defective renewal equations play an important role are described in Feller (1968) p. 188, 216, Resnick (1992) p. 158, and Lin and Willmot (2000) Ch. 9; these examples include Geiger counters, generalized terminating renewal processes, and age dependent branching processes. The setting in which q is close to one in (3) (or, equivalently, p

is close to zero) is common in the application settings described before. For instance,

in the insurance setting it arises in environments of low net profits (which occur in

competitive conditions). For generalized terminating renewal processes, q close to one

corresponds to settings in which the process continues for long periods, and in age

dependent branching processes, q close to one reflects a case in which the population

is less likely to die. This, consequently, motivates developing asymptotics for the

solution a (·) of (3) as p& 0.

In Section 2, we develop the asymptotic expansion in powers of p for P (pSM > x)

(see Theorems 2, 3, and 4). The implications for asymptotic expansions of defective

renewal equations are studied in Section 3 (see Theorem 5).

4.1 Asymptotics for Geometric Sums

We start with a useful representation for the tail probability of a geometric sum. Set $S_n = X_1 + \dots + X_n$ (with $S_0 = 0$) and put $N(t) = \sup\{n \ge 0 : S_n \le t\}$. Observe that, for each non-negative integer $m$, $\{S_m > x\} = \{N(x) < m\}$. Thus, combining the independence between $X$ and $M$ with the fact that $P(M > m) = q^{m+1}$, we can write

$$P(S_M > x) = P(N(x) < M) = E\left(P(N(x) < M \mid X)\right) = q E\left(q^{N(x)}\right).$$

We therefore have shown the next proposition.

Proposition 1

$$P(S_M > x) = q E q^{N(x)}.$$
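The identity in Proposition 1 can be checked by simulation. The sketch below is only an illustration under an assumed increment distribution (exponential with rate 1, chosen because both sides then equal $q e^{-px}$ in closed form); the function name, sample size and seed are hypothetical choices, not taken from the text.

```python
import math
import random


def simulate_identity(p, x, n_samples=100_000, seed=0):
    """Monte Carlo estimates of P(S_M > x) and of q * E[q^N(x)].

    Assumes X_k ~ Exp(1) and P(M = k) = p * q**k for k >= 0.
    """
    rng = random.Random(seed)
    q = 1.0 - p
    hits = 0
    weighted = 0.0
    for _ in range(n_samples):
        # Left side: draw M, then the geometric sum S_M (empty sum is 0).
        m = 0
        while rng.random() < q:
            m += 1
        s_m = sum(rng.expovariate(1.0) for _ in range(m))
        hits += s_m > x

        # Right side: independent renewal path, N(x) = sup{n >= 0 : S_n <= x}.
        n_x = 0
        s = rng.expovariate(1.0)
        while s <= x:
            n_x += 1
            s += rng.expovariate(1.0)
        weighted += q ** n_x
    return hits / n_samples, q * weighted / n_samples


if __name__ == "__main__":
    p, x = 0.2, 3.0
    left, right = simulate_identity(p, x)
    print(left, right, (1.0 - p) * math.exp(-p * x))
```

Note that the second estimator averages the bounded quantity $q^{N(x)}$ rather than an indicator, so it typically has the smaller variance of the two.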

The previous proposition implies that in order to study $P(pS_M > x)$ it suffices to study the behavior of $E q^{N(x/p)}$ for $x > 0$ and small $p > 0$. A renewal theoretic argument yields the following (defective) renewal equation

$$E q^{N(t)} = P(X_1 > t) + q \int_{[0,t)} E q^{N(t-s)}\, P(X_1 \in ds). \tag{5}$$

Note that, if $p > 0$ is small enough and $E \exp(\eta X_1) < \infty$ for some $\eta > 0$, the equation

$$E \exp(\hat\theta X_1) = 1/q \tag{6}$$

has a unique solution $\hat\theta > 0$. Therefore, (5) can be transformed into the non-defective renewal equation

$$e^{\hat\theta t} E q^{N(t)} = e^{\hat\theta t} P(X_1 > t) + \int_{[0,t)} e^{\hat\theta (t-s)} E q^{N(t-s)}\, F_{\hat\theta}(ds),$$

where $F_{\hat\theta}(ds) = q e^{\hat\theta s} P(X_1 \in ds)$. Renewal theory then implies that

$$e^{\hat\theta t} E q^{N(t)} = \int_{[0,t)} e^{\hat\theta (t-s)} P(X_1 > t - s)\, U_{\hat\theta}(ds), \tag{7}$$

where $U_{\hat\theta}(t) = E_{\hat\theta}(N(t) + 1)$, and under $P_{\hat\theta}$ the $X_i$’s are iid with distribution $F_{\hat\theta}$. Using results by Stone (1965) it is not hard to verify that, for fixed but small $p > 0$,

$$\left| e^{\hat\theta t} E q^{N(t)} - \frac{1}{E_{\hat\theta} X_1} \int_{[0,t)} e^{\hat\theta s} P(X_1 > s)\, ds \right| \le K(p) \exp(-a(p)\, t). \tag{8}$$

Since our situation involves sending simultaneously $t \nearrow \infty$ and $p \searrow 0$ we would like the bound on the right hand side of (8) to hold uniformly in $p \in [0, \delta_1]$ for some $\delta_1 > 0$. That is, we would like to show that we can find $a, K \in (0,\infty)$ such that $\sup_{p \in [0,\delta_1]} K(p) \le K < \infty$ and $\inf_{p \in [0,\delta_1]} a(p) \ge a$. The following theorem provides means to obtain these uniform estimates.

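Theorem 1 Let $\mathcal{F}$ be a family of distribution functions supported on $[0,\infty)$. For each $F \in \mathcal{F}$, let $E_F(\cdot)$ be the expectation operator associated to $F$, and define $E_F g(\tau) \triangleq \int_{[0,\infty)} g(t)\, F(dt)$ for each continuous and bounded function $g : [0,\infty) \to \mathbb{C}$. Suppose that the family $\mathcal{F}$ is uniformly strongly non-lattice, i.e. the corresponding characteristic functions $\chi_F(\lambda) = E_F \exp(i\lambda\tau)$ satisfy

$$\inf_{F \in \mathcal{F}} \inf_{|\lambda| > \varepsilon} |1 - \chi_F(\lambda)| > 0. \tag{9}$$

Then, $U_F(t) \triangleq \sum_{n=0}^\infty F^{*n}(t)$ satisfies the following.

1. If $\sup_{F \in \mathcal{F}} E_F \exp(\eta\tau) < \infty$ for some $\eta > 0$, then

$$\sup_{F \in \mathcal{F}} \left| U_F(t) - \frac{t}{E_F \tau} - \frac{E_F \tau^2}{2 E_F^2 \tau} \right| = O\left(e^{-at}\right)$$

as $t \to \infty$ for some $a > 0$.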

2. Moreover, if supF∈F EF τε+2 <∞ for ε ≥ 0, then,

supF∈F

¯UF (t)− t

EF τ− EF τ

2

2E2F τ− H

F2 (t)

EF τ 2δ−HF

1 ∗HF1 (t)

¯= o

¡tα+2 log (t)

¢as t→∞, where HF

1 (t) =R∞t(1− F (s)) ds /EF τ and HF

2 (t) =R∞tHF1 (s) ds.

Proof. Part 1 is essentially Siegmund’s (1979) lemma. Part 2 follows the same steps as in Carlsson (1983); the key assumption is the uniform strongly non-lattice condition (9). The Fourier inversion expressions provided by Carlsson (1983) are the same for each fixed F. In the end, Carlsson’s estimates of the error rate depend on the application of a uniform version of the Riemann-Lebesgue lemma to his equation (11), which can be obtained by following his argument in the presence of the strongly non-lattice assumption imposed.

We now are ready to provide our asymptotic expansion for P (pSM > x) in the

presence of exponential moments.

Theorem 2 Suppose that $X_1$ has a strongly non-lattice distribution and that $\phi(\eta) \triangleq E \exp(\eta X_1) < \infty$ for some $\eta > 0$. Then, for some $a > 0$, and as $x/p \to \infty$,

$$P(pS_M > x) = \exp\left(-x\hat\theta/p + r(p)\right) + O(\exp(-ax/p)), \tag{10}$$

where $\hat\theta$ solves (6) and

$$\exp(r(p)) = \frac{p}{q\, \hat\theta\, \phi'(\hat\theta)} \triangleq c(p). \tag{11}$$

Moreover, both $\hat\theta$ and $r$ are real analytic functions of $p$ at the origin.

Proof. The argument preceding Theorem 1 led us to equation (7). We now verify that the assumptions in Theorem 1 are satisfied. Let us define $g_{\hat\theta}(\lambda) \triangleq E_{\hat\theta} \exp(i\lambda X_1) = q E \exp((i\lambda + \hat\theta) X_1)$. Using the implicit function theorem on (6) it follows easily that $\hat\theta = p/EX_1 + O(p^2)$. As a consequence, the following inequality can be easily derived for all $p > 0$ sufficiently small and some $M_1 \in (0,\infty)$:

$$\left| g_{\hat\theta}(\lambda) - g(\lambda) \right| \le M_1 p.$$

Hence, we conclude that for each $\varepsilon > 0$ it is possible to pick $\delta > 0$ sufficiently small so that

$$\inf_{p \in [0,\delta]} \inf_{|\lambda| > \varepsilon} \left| 1 - g_{\hat\theta}(\lambda) \right| \ge \inf_{p \in [0,\delta]} \inf_{|\lambda| > \varepsilon} |1 - g(\lambda)| - M_1 \delta > 0.$$

Finally, also because $\hat\theta = O(p)$, it is possible to pick $p > 0$ small enough so that $E_{\hat\theta} \exp(\eta X_1) = q E \exp((\eta + \hat\theta) X_1) < \infty$ for some $\eta > 0$. We now can apply Theorem 1 to equation (7) and obtain

$$\left| e^{\hat\theta x/p} E q^{N(x/p)} - \frac{1}{E_{\hat\theta} X_1} \int_0^\infty e^{\hat\theta s} P(X_1 > s)\, ds \right| \le \frac{1}{E_{\hat\theta} X_1} \int_{x/p}^\infty e^{\hat\theta s} P(X_1 > s)\, ds \tag{12}$$

$$\qquad + \left| \frac{1}{E_{\hat\theta} X_1} \int_{[0,x/p)} e^{\hat\theta (x/p - s)} P(X_1 > x/p - s)\, V(ds) \right|, \tag{13}$$

where $V(t)$ is a function that we are introducing here; it corresponds to the expression inside the absolute value in part 1 of Theorem 1, and therefore $|V(t)| = O(e^{-at})$ for some $a > 0$. The integral in (12) is easily seen to be bounded by $K e^{-ax/p}$ for some finite constants $K, a > 0$ (assuming that $p > 0$ is sufficiently small). We just need to analyze the integral in (13). Integration by parts yields

$$\int_{[0,x/p)} e^{\hat\theta (x/p - s)} P(X_1 > x/p - s)\, V(ds) = V(x/p) P(X_1 > 0) - e^{\hat\theta x/p} P(X_1 > x/p) V(0) \tag{14}$$

$$\qquad + \hat\theta e^{\hat\theta x/p} \int_{[0,x/p)} V(s)\, e^{-\hat\theta s} P(X_1 > x/p - s)\, ds \tag{15}$$

$$\qquad + e^{\hat\theta x/p} \int_{[0,x/p)} V(s)\, e^{-\hat\theta s} P(X_1 > x/p - ds). \tag{16}$$

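The absolute value of (14) is also bounded by $K e^{-ax/p}$ for some finite constants $K, a > 0$. For the integral (15) observe that

$$\begin{aligned}
\hat\theta e^{\hat\theta x/p} \left| \int_{[0,x/p)} V(s)\, e^{-\hat\theta s} P(X_1 > x/p - s)\, ds \right|
&\le \hat\theta e^{\hat\theta x/p} \int_{[0,x/2p)} |V(s)|\, e^{-\hat\theta s} P(X_1 > x/p - s)\, ds \\
&\quad + \hat\theta e^{\hat\theta x/p} \int_{[x/2p,x/p)} |V(s)|\, e^{-\hat\theta s} P(X_1 > x/p - s)\, ds \\
&\le \hat\theta e^{\hat\theta x/p} P(X_1 > x/(2p))\, M + \hat\theta e^{\hat\theta x/p} \int_{[x/2p,\infty)} |V(s)|\, ds.
\end{aligned}$$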

Since $\hat\theta = O(p)$ and $X_1$ has exponential moments, we conclude that the previous expression is bounded by $K e^{-ax/p}$ (for appropriate positive constants $K$ and $a$). The treatment for integral (16) is very similar to that of (15). Thus, we conclude that

$$e^{\hat\theta x/p} E q^{N(x/p)} = \frac{\int_0^\infty e^{\hat\theta s} P(X_1 > s)\, ds}{E_{\hat\theta} X_1} + O(\exp(-ax/p)). \tag{17}$$

In order to recover the required expression for $c(p)$, note that

$$E_{\hat\theta} X_1 = q \int_{[0,\infty)} s e^{\hat\theta s} P(X_1 \in ds) = q \phi'(\hat\theta).$$

On the other hand, using integration by parts and the definition of $\hat\theta$, we see that

$$\int_0^\infty e^{\hat\theta s} P(X_1 > s)\, ds = \frac{\phi(\hat\theta) - 1}{\hat\theta} = \frac{p}{q\hat\theta}.$$

Combining the last two identities with (17) and Proposition 1 yields equation (10). The analytic properties of $\hat\theta$ follow directly from the implicit function theorem. It is easy to see that $r(\cdot)$ is well defined at zero (i.e. that the right hand side of (11) is strictly positive when $p$ is close to zero). Moreover, it is almost immediate to verify that $c(p)$ is real analytic at the origin with $c(0) = 1$. This implies the real analyticity of $r$ and the conclusion of the theorem.
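To make the leading term of (10) concrete, the following sketch solves (6) by bisection and evaluates $c(p) \exp(-x\hat\theta/p)$ with $c(p)$ as in (11). It is a minimal illustration, not a procedure given in the text: the caller supplies the moment generating function and its derivative, and the bracket endpoint `hi` is an assumed input that must lie inside the region where the moment generating function is finite and already exceeds $1/q$. For exponential increments with rate 1 one has $\phi(\theta) = 1/(1-\theta)$, so $\hat\theta = p$, $c(p) = q$, and the approximation coincides with the exact value $q e^{-x}$.

```python
import math


def solve_theta(mgf, q, hi, tol=1e-13):
    """Bisection for the root of mgf(theta) = 1/q on (0, hi).

    Assumes mgf is finite and increasing on [0, hi] with mgf(hi) >= 1/q.
    """
    lo = 0.0
    target = 1.0 / q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mgf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tail_approx(x, p, mgf, mgf_prime, hi):
    """Leading term c(p) * exp(-x * theta / p) of (10), with c(p) from (11)."""
    q = 1.0 - p
    theta = solve_theta(mgf, q, hi)
    c = p / (q * theta * mgf_prime(theta))
    return c * math.exp(-x * theta / p)


if __name__ == "__main__":
    # Illustrative assumption: X_1 ~ Exp(1), so mgf(theta) = 1 / (1 - theta).
    mgf = lambda th: 1.0 / (1.0 - th)
    mgf_prime = lambda th: 1.0 / (1.0 - th) ** 2
    p, x = 0.2, 1.0
    print(tail_approx(x, p, mgf, mgf_prime, hi=0.999), (1.0 - p) * math.exp(-x))
```

In the exponential case the error term in (10) vanishes identically, which makes it a convenient test case; for other light-tailed increments the same two functions apply once `mgf`, `mgf_prime` and a valid `hi` are supplied.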

Theorem 2 indicates that

$$\hat\theta(p) = \sum_{k=1}^\infty \frac{\hat\theta^{(k)}(0)}{k!}\, p^k, \quad \text{and} \quad r(p) = \sum_{k=0}^\infty \frac{r^{(k)}(0)}{k!}\, p^k.$$

For notational convenience let us write $\hat\theta^{(k)}(0)/k! = \gamma_k$ and $r^{(k)}(0)/k! = \xi_k$. We know that $\hat\theta^{(1)}(0) = 1/EX_1$ and $r(0) = 0$; the rest of the $\gamma_k$’s and $\xi_k$’s can be easily computed via the implicit function theorem. For instance, matching the coefficients of $p^2$ in (6) and of $p$ in (11) gives $\gamma_2 EX_1 = 1 - EX_1^2/(2E^2X_1)$ and $\xi_1 = 1 - \gamma_2 EX_1 - EX_1^2/E^2X_1$. For completeness we provide a set of recursive equations to compute the $\gamma_k$’s and $\xi_k$’s.

Proposition 2 For $n \ge 1$, the constants $(\gamma_k : 1 \le k \le n)$ can be computed by solving recursively the following set of equations (note that the $k$th equation is linear in $\gamma_k$ and it only depends on the $\gamma_j$’s for $j \le k$):

$$\sum_{m=1}^k \frac{EX_1^m}{m!} \sum_{\{n_1 + \dots + n_m = k - m,\; n_1, \dots, n_m \ge 0\}} \prod_{j=1}^m \gamma_{n_j + 1} = 1, \quad \text{for } 1 \le k \le n.$$

Consequently, the constants $(\xi_k : 0 \le k \le n - 1)$ can be obtained through a Taylor expansion up to order $n$ of the function

$$\tilde r_n(p) = \log\left( \frac{1}{q \sum_{k=1}^n \gamma_k p^{k-1} \sum_{m=0}^{n-1} \left( \sum_{k=1}^n \gamma_k p^k \right)^m EX_1^{m+1}/m!} \right)$$

around $p = 0$. In particular, for $k \le n - 1$, $\xi_k = \tilde r_n^{(k)}(0)/k!$.

Proof. The proof follows directly by applying the implicit function theorem. The details are omitted.

Consequently, the previous proposition provides the means to develop an easily implemented algorithm for computing an asymptotic expansion for the tail probability $P(S_M > x/p)$ in powers of $p$.
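One possible implementation of the recursion for the $\gamma_k$’s is sketched below. It rests on the observation that the $k$th equation of Proposition 2 is the coefficient of $p^k$ in $\phi(\hat\theta(p)) - 1 = p/q$, and that only the $m = 1$ term involves $\gamma_k$. The function names, the use of exact rational arithmetic, and the two test distributions are assumptions made for illustration. The deterministic case $X_1 \equiv 1$ is lattice, so the theorems of this chapter do not apply to it; it is used only to check the algebra, since then $\hat\theta = -\log(1-p)$ and $\gamma_k = 1/k$.

```python
from fractions import Fraction
from math import factorial


def _mul_truncated(a, b, deg):
    """Product of two polynomials (coefficient lists), truncated at degree deg."""
    out = [Fraction(0)] * (deg + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > deg:
                break
            out[i + j] += ai * bj
    return out


def gamma_coefficients(moments, n):
    """Return [gamma_1, ..., gamma_n] given moments[m-1] = E[X_1**m], m = 1..n.

    For each k, the terms with m >= 2 only involve gamma_j with j < k, so
    gamma_k = (1 - sum_{m=2}^k E[X^m]/m! * [p^k] theta^m) / E[X].
    """
    gam = [Fraction(0)] * (n + 1)  # gam[k] = gamma_k; index 0 is unused
    for k in range(1, n + 1):
        theta = gam[: k + 1]       # theta(p) up to degree k, gamma_k still 0
        power = theta              # theta^1
        total = Fraction(0)
        for m in range(2, k + 1):
            power = _mul_truncated(power, theta, k)   # theta^m
            total += Fraction(moments[m - 1]) / factorial(m) * power[k]
        gam[k] = (1 - total) / Fraction(moments[0])
    return gam[1:]


if __name__ == "__main__":
    # Exp(1) increments: E[X^m] = m!, and theta = p exactly.
    print(gamma_coefficients([1, 2, 6, 24], 4))
    # X_1 = 1 (algebra check only): gamma_k = 1/k.
    print(gamma_coefficients([1, 1, 1, 1], 4))
```

Exact fractions are used so that the recursion can be compared against closed forms without rounding; with floating point moments the same code works after replacing `Fraction` by `float`.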

Theorem 2 corrects Renyi’s approximation (2) by providing a full asymptotic expansion in powers of $p$ with an exponential error term. In other words, the last theorem provides rigorous support for the parametric (in $p > 0$) approximation

$$P(S_M > x/p) \approx \exp\left( -x/EX_1 + \sum_{k=1}^\infty p^k \left(\xi_k - \gamma_{k+1} x\right) \right), \tag{18}$$

valid up to an error exponentially small as $p \searrow 0$. It is easy to see that $\gamma_k$ and $\xi_k$ depend on the first $k$ and $(k+1)$ order moments of $X_1$ respectively. This suggests that, if $EX_1^{\alpha+2} < \infty$, say, the approximation

$$P(S_M > x/p) \approx \exp\left( -x/EX_1 + \sum_{k=1}^\alpha p^k \left(\xi_k - \gamma_{k+1} x\right) \right) \tag{19}$$

should be more accurate than (2). Providing rigorous support for approximation (19) in the presence of heavy tails (we say here that a non-negative random variable $X_1$ is heavy tailed if for every $\eta > 0$, $E \exp(\eta X_1) = \infty$) presents an additional mathematical complication. Note that a crucial ingredient in the proof of Theorem 2 is the existence of a root for equation (6). This indicates that the strategy followed in the proof of Theorem 2 is infeasible in the heavy tailed case. Our idea is then to proceed via truncation. Define the sequence $\bar X = (\bar X_k : k \ge 1)$ as $\bar X_k = X_k 1(X_k \le x/p)$ and consider its associated random walk $\bar S = (\bar S_n : n \ge 0)$ (i.e. $\bar S_n = \bar X_1 + \dots + \bar X_n$ with $\bar S_0 = 0$). We first argue that the distribution of $\bar S_M$ is suitably close to that of $S_M$.

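Lemma 1 Suppose that $EX_1^\beta < \infty$ for $\beta \ge 1$, then

$$\left| P(pS_M > x) - P(p\bar S_M > x) \right| = o\left( \frac{p^{\beta-1}}{x^\beta} \right).$$

Proof. Note that

$$\begin{aligned}
\left| P(pS_M > x) - P(p\bar S_M > x) \right|
&\le p \sum_{k=0}^\infty q^k P\left(S_k > x/p;\ \bar S_k \le x/p\right) + p \sum_{k=0}^\infty q^k P\left(\bar S_k > x/p;\ S_k \le x/p\right) \\
&\le 2p \sum_{k=0}^\infty q^k P\left( \max_{j \le k} X_j > x/p \right) = 2p \sum_{k=0}^\infty q^k \left( 1 - \left(1 - o\left((p/x)^\beta\right)\right)^k \right).
\end{aligned}$$

On the other hand,

$$\left(1 - o\left((p/x)^\beta\right)\right)^k = 1 - k\, o\left((p/x)^\beta\right) + \frac{k(k-1)}{2} (1 - \eta_k)^{k-2}\, o\left((p/x)^{2\beta}\right),$$

where $|\eta_k| \le o((p/x)^\beta)$. Hence, we can write

$$\left| P(pS_M > x) - P(p\bar S_M > x) \right| \le 2p \sum_{k=0}^\infty q^k k\, o\left((p/x)^\beta\right) + 2p \sum_{k=0}^\infty q^k k^2\, o\left((p/x)^{2\beta}\right) = o\left( \frac{p^{\beta-1}}{x^\beta} \right) + o\left( \frac{p^{2\beta-2}}{x^{2\beta}} \right) = o\left( \frac{p^{\beta-1}}{x^\beta} \right).$$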

We now would like to study $P(p\bar S_M > x)$ just as we did in Theorem 2. Theorem 1 can also be applied here to obtain a suitable approximation for $P(p\bar S_M > x)$, as our next result shows.

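Theorem 3 Assume that the distribution of $X_1$ is strongly non-lattice. Also, suppose that

$$EX_1^{2+\alpha} < \infty$$

for $\alpha \ge 0$. Then,

$$P(pS_M > x) = \exp\left(-x\hat\theta_\alpha/p + r_\alpha(p)\right) + o(p^\alpha)$$

as $p \searrow 0$, where

$$\hat\theta_\alpha = p/EX_1 + \sum_{1 \le k \le \alpha} \gamma_{k+1} p^{k+1}, \quad \text{and} \quad r_\alpha(p) = \sum_{k \le \alpha} \xi_k p^k$$

(so that the exponent agrees with (19)), and the $\gamma_k$’s and $\xi_k$’s are defined recursively via Proposition 2.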

Proof. Let N (t) = sup{n ≥ 0 : Sn ≤ t}, then, by virtue of Proposition 1 andLemma 1 it suffices to compute EqN(x/p). Following similar steps as in the proof of

Theorem 2 we obtain

EqN(x/p) =

Z[0,x/p)

eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)Uθ (ds) . (20)

The elements in equation (20) are indicated next. First, θ is the solution to the

equation

φ¡θ¢, E exp

¡θX1

¢=1

q,

which clearly exists if p is small enough. In order to describe Uθ (s) define Pθ(·) via

Pθ (B) = qnE¡exp

¡θSn

¢; 1 (B)

¢,

for every B in the sigma-field σ¡X1, .., Xn

¢. Next, we will show that

V (s) , Uθ (s)− s

EθX1

− EθX2

1

2E2θX1

−R∞t

R∞sPθ¡X1 > u

¢duds

EX21

, (21)

where,¯V (s)

¯= o

¡s−(α+1)

¢as s%∞ uniformly in p > 0 small enough. The previous

expression follows from Theorem 1, as we now illustrate. (Note that the term V in

(21) includes the last two terms in the right hand side of the equation in the part 2

of Theorem 1.) Observe that gp (λ) , Eθ exp¡iλX1

¢= qEθ exp((iλ+ θ)X1) satisfies¯

gp (λ)−E exp (iλX1)¯ ≤ ¯

gp (λ)−E exp¡iλX1

¢¯+ o

¡pα+2

¢≤ p

¯E exp

¡iλX1

¢¯+ θEX1 + o

¡pα+2

¢= O (p) .

This implies that gp (·) satisfies the uniform strongly non-lattice condition (14). On

the other hand, since θ = O (p), we have that for all p > 0 small enough

EθXα+2

1 = qE exp¡θX1

¢X

α+2

1 ≤MEXα+2

1 < MEXα+21 <∞.

This justifies the validity of representation (21). Furthermore, (21) implies that

EqN(x/p)

=

Z[0,x/p)

eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)EθX1

ds (22)

+

Z[0,x/p)

eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)EθX

2

1

Z ∞

s

Pθ¡X1 > u

¢duds (23)

+

Z[0,x/p)

eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)V (ds) . (24)

Let us denote by I1, I2, and I3 the expressions (22), (23), and (24) respectively. We

first show that I3 = o (pα+1). To see this, we use integration by parts, the triangle

inequality and the fact that θ = O (p) to obtain

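$$
\begin{aligned}
|I_3| &\le |V(x/p)| + M_1 \left| \int_{[0,x/p)} V(s)\, d\!\left[ e^{-\theta s} P(X_1 > x/p-s;\ X_1 \le x/p) \right] \right| \\
&\le |V(x/p)| + M_1 \left| \int_{[0,x/p)} V(s)\, e^{-\theta s}\, P(X_1 > x/p - ds;\ X_1 \le x/p) \right| \\
&\quad + M_1 \left| \int_{[0,x/p)} V(s)\, e^{-\theta s}\, P(X_1 > x/p-s;\ X_1 \le x/p)\, ds \right|. \qquad (25)
\end{aligned}
$$

The term $|V(x/p)| = o(p^{\alpha+1})$. Now, observe that

$$
\begin{aligned}
\left| \int_{[0,x/p)} V(s)\, e^{-\theta s}\, P(X_1 > x/p - ds;\ X_1 \le x/p) \right|
&\le \left| \int_{[0,x/(2p))} V(s)\, e^{-\theta s}\, P(X_1 > x/p - ds;\ X_1 \le x/p) \right| \\
&\quad + \left| \int_{[x/(2p),x/p)} V(s)\, e^{-\theta s}\, P(X_1 > x/p - ds;\ X_1 \le x/p) \right| \\
&\le K_2\, P(X_1 > x/(2p)) + K_1 \max_{1/2 \le u \le 1} |V(ux/p)| \qquad (26)\\
&= o(p^{\alpha+2}) + o(p^{\alpha+1}) = o(p^{\alpha+1}),
\end{aligned}
$$

for some constants $K_1$ and $K_2$. The integral in (25) follows the same lines as (26).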

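For $I_2$ we have

$$I_2 = \frac{1}{E_\theta X_1^2} \int_{[0,x/p)} e^{\theta(x/p-s)}\, P(X_1 > x/p-s) \int_s^\infty P_\theta(X_1 > u)\, du\, ds + o(p^{\alpha+1}). \tag{27}$$

A parallel argument to that given for $I_3$ shows that

$$\frac{1}{E_\theta X_1^2} \int_{[0,x/p)} e^{\theta(x/p-s)}\, P(X_1 > x/p-s) \int_s^\infty P_\theta(X_1 > u)\, du\, ds = o(p^{\alpha}),$$

which yields

$$I_2 = o(p^{\alpha}).$$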


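Finally, we analyze $I_1$:

$$
\begin{aligned}
I_1 + o(p^{\alpha+1}) &= \frac{1}{E_\theta X_1} \int_0^{x/p} e^{\theta u} P(X_1 > u)\, du = \frac{1}{\theta E_\theta X_1} \int_0^{x/p} P(X_1 \ge u)\, d e^{\theta u} \\
&= o(p^{\alpha+1}) - \frac{1}{\theta E_\theta X_1} + \frac{1}{\theta E_\theta X_1} \int_0^{x/p} e^{\theta u} P(X_1 \in du) \\
&= \frac{1 - E \exp(\theta X_1)}{\theta E(\exp(\theta X_1) X_1)} + o(p^{\alpha+1}).
\end{aligned}
$$

Lastly, the implicit function theorem yields that

$$\frac{p}{q^2 \theta \phi'(\theta)} = \frac{1 - E \exp(\theta X_1)}{q \theta E(\exp(\theta X_1) X_1)} = \exp\Bigl( \sum_{k \le \alpha} p^k \xi_k \Bigr) + o(p^{\alpha}),$$

and

$$\theta = \sum_{k \le \alpha} p^k \gamma_k + o(p^{\alpha}).$$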

This concludes the proof of the theorem.

Remark Note that (26) and (27) combined with Lemma 1 indicate that, in principle, it is possible to develop an approximation for $P(pS_M > x)$ up to an error of order $o(p^{\alpha+1})$ given by

$$P(pS_M > x) \approx \exp(-x\theta/p) \left( \frac{p}{q^2 \theta \phi'(\theta)} + I_2 \right).$$

This approximation, however, involves explicitly computing $I_2$ and $\theta$, which may be cumbersome in practice.

As we indicated at the beginning of this chapter, in many application settings the increment distributions are actually changing with $p$. In this context, it is desirable to develop approximations similar to those provided in the previous theorems. Fortunately, Theorem 1 also provides a means to deal with the typical situations that arise in practice. To fix ideas, consider a family of probability measures $\mathcal{P} = \{P_p,\ p \in [0,\delta]\}$ for some $\delta > 0$. Suppose that, under each $P_p$, the random variables $(X_k : k \ge 1)$ form an iid sequence. Also, assume that the distribution of $X_1$ is uniformly strongly non-lattice with respect to $\mathcal{P}$ (i.e. the characteristic functions $g_p(\lambda) = E_p \exp(i\lambda X_1)$ satisfy condition (9)). In addition, suppose that one of the following conditions holds:

A) for some $\eta > 0$, $\sup_{0 \le p \le \delta} E_p \exp(\eta X_1) < \infty$, or

B) $\sup_{0 \le p \le \delta} E_p X_1^{2+\alpha} < \infty$, for some $\alpha \ge 0$.

Under this set of assumptions, we have the following analogue to Theorems 2 and 3.

Theorem 4 Assume that the family $P_p$, $p \in [0,\delta]$, is uniformly strongly non-lattice (see equation (9)). If condition A) above holds, then there exist constants $K_1, K_2 > 0$ such that for $p > 0$ small

$$\left| P_p(pS_M > x) - \exp(-\theta^* x/p + r_p(p)) \right| \le K_1 \exp(-K_2 x/p),$$

where $\theta^* = \theta^*(p)$ solves $\phi_p(\theta^*) \triangleq E_p \exp(\theta^* X_1) = 1/q$ and

$$\exp(r_p(p)) = \frac{p}{q^2 \theta^* \phi_p'(\theta^*)}.$$

Moreover, $\theta^*(p) = \sum_{k=1}^{\infty} p^k \gamma_k(p)$ and $r_p(p) = \sum_{k=1}^{\infty} p^k \xi_k(p)$ (where the $\gamma_k(p)$'s and $\xi_k(p)$'s depend on the first $k$ and $(k+1)$ moments of $X_1$ under $P_p$ respectively). Finally, if condition B) is in force, then

$$\left| P_p(pS_M > x) - \exp\Bigl( -x/E_p X_1 + \sum_{k \le \alpha} p^k (\xi_k - \gamma_{k+1} x) \Bigr) \right| = o(p^{\alpha}).$$

Proof. The proof parallels the arguments given in Theorems 2 and 3 using Theorem 1. The details are omitted.

Remark Note that the γk’s and the ξk’s also depend on p. The previous result

would yield the desired asymptotic expansion assuming that the problem at hand has

enough structure, so that an asymptotic expansion of ξk’s and γk’s can be obtained.

The expansion for the distribution of the all time maximum of a random walk with

small negative drift given in Chapter 1 of this dissertation, provides an example in

which the previous result would have been applicable.
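
The leading term in Theorem 4 under condition A) can be evaluated numerically once a moment generating function is available. The following Python sketch is only an illustration and is not part of the original development: it assumes a fixed increment distribution whose moment generating function and derivative are supplied as the hypothetical callables `mgf` and `mgf_prime`, and a hypothetical bracket `hi` below which the root lies. It finds $\theta^*$ by bisection and returns $\exp(-\theta^* x/p)\, p / (q^2 \theta^* \phi'(\theta^*))$.

```python
import math


def solve_theta(mgf, q, hi, tol=1e-12):
    """Bisection for the root of mgf(theta) = 1/q on (0, hi).

    Assumes mgf is continuous and increasing on [0, hi), mgf(0) = 1,
    and mgf exceeds 1/q somewhere below hi (hi itself is never evaluated).
    """
    lo, target = 0.0, 1.0 / q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mgf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tail_leading_term(x, p, mgf, mgf_prime, hi):
    """Leading term exp(-theta* x / p) * p / (q^2 theta* phi'(theta*))."""
    q = 1.0 - p
    theta = solve_theta(mgf, q, hi)
    prefactor = p / (q * q * theta * mgf_prime(theta))
    return math.exp(-theta * x / p) * prefactor
```

For exponential increments with unit mean, $\phi(\theta) = 1/(1-\theta)$, so $\theta^* = p$ and the prefactor equals one; the sketch then returns $\exp(-x)$.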


4.2 Asymptotics of Defective Renewal Equations

As we discussed at the beginning of the chapter, in many applied probability settings one often deals with defective renewal equations, which are integral equations that can be written as

$$a(t) = b(t) + q \int_{[0,t)} a(t-s)\, P(X_1 \in ds),$$

where $q = 1 - p \in (0,1)$ and $b$ is a given function for which we shall assume certain regularity properties (see Theorem 5). As an application of our developments in Section 2, we provide means to obtain an asymptotic expansion for $a(\cdot)$ as $p \searrow 0$.
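
A defective renewal equation of this form can also be solved numerically on a grid, which gives a baseline against which an asymptotic expansion can be compared. The Python sketch below is an illustration under stated assumptions, not a method taken from the text: the function name, the arguments `b` and `cdf`, and the first-order scheme (which places the mass of each cell at its right endpoint) are all hypothetical choices made for simplicity.

```python
import math


def solve_defective_renewal(b, cdf, q, t_max, h):
    """Approximate a(t) = b(t) + q * int_[0,t) a(t-s) F(ds) on the grid 0, h, 2h, ...

    The mass F((k-1)h, kh] is placed at the point kh (a first-order scheme).
    Returns the list [a(0), a(h), ..., a(n*h)] with n = round(t_max / h).
    """
    n = int(round(t_max / h))
    f = [cdf(k * h) - cdf((k - 1) * h) for k in range(1, n + 1)]
    a = []
    for i in range(n + 1):
        conv = sum(a[i - k] * f[k - 1] for k in range(1, i + 1))
        a.append(b(i * h) + q * conv)
    return a
```

As a check, with unit-mean exponential increments and $b(t) = e^{-t}$ the exact solution is $a(t) = e^{-pt}$, and the grid solution should agree with it up to an error of order $h$.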

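Theorem 5 Suppose that the distribution of $X_1$ is strongly non-lattice and that $E X_1^{2+\alpha} < \infty$. In addition, suppose that $b$ is right-continuous with left limits, has finite variation and $|b|(t) \le g(t)$ with $\int_0^\infty t^{\alpha+1} g(t)\, dt < \infty$. Finally, let us write, for $j \le \alpha + 1$, $b_j = \int_0^\infty t^j b(t)\, dt$. Then, as $p \searrow 0$,

$$a(t/p) = \exp\bigl( -t \hat\theta_\alpha / p \bigr)\, d(p) + o(p^{\alpha}),$$

where

$$d(p) = \frac{b_0 + \sum_{k \le \alpha} b_k \hat\theta_\alpha^k / k!}{q \left( E X_1 + \sum_{k \le \alpha} \hat\theta_\alpha^k\, E X_1^{k+1} / k! \right)}.$$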

Proof. First we note that if $\bar a(\cdot)$ satisfies

$$\bar a(t) = b(t) + q \int_{[0,t)} \bar a(t-s)\, P(\bar X_1 \in ds), \tag{28}$$

where $\bar X_1 = X_1 1(X_1 < 1/p)$, then, by applying Laplace transforms we can verify that $\bar a$ satisfies (see Theorem 9.1.1 of Lin and Willmot (2000))

$$\bar a(t) = \frac{1}{p} \int_{[0,t)} b(t-s)\, P(\bar S_M \in ds).$$

Therefore

$$
\begin{aligned}
\bar a(t/p) - a(t/p) &= \frac{1}{p} \int_{[0,t/p)} b(t/p-s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right) \\
&= \frac{1}{p} \int_{[0,t/(2p))} b(t/p-s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right) \qquad (29)\\
&\quad + \frac{1}{p} \int_{[t/(2p),t/p)} b(t/p-s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right). \qquad (30)
\end{aligned}
$$

Let $J_1$ and $J_2$ be the integrals in (29) and (30) respectively. Now, since $b(t) = o(t^{-(\alpha+2)})$ and $b$ is right continuous with left limits, it is not hard to see that

$$\max_{1/2 \le u \le 1} |b(ut/p)| = o(p^{\alpha+2}).$$

Thus, it follows that

$$\frac{1}{p} \left| \int_{[0,t/(2p))} b(t/p-s)\, P(\bar S_M \in ds) \right| \le \frac{1}{p} \max_{1/2 \le u \le 1} |b(ut/p)| = o(p^{\alpha+1}),$$

which implies that $J_1 = o(p^{\alpha})$. For $J_2$ we can use integration by parts to obtain

$$
\begin{aligned}
|J_2| &= \frac{b(0)}{p} \left( P(p\bar S_M \le t) - P(pS_M \le t) \right) + \left| \frac{b(t/p)}{p} \right| \left( P(p\bar S_M \le t/2) - P(pS_M \le t/2) \right) \\
&\quad + \int_{[1/2,1)} \frac{1}{p} \left| P(\bar S_M \le st/p) - P(S_M \le st/p) \right| \, |b|(t/p - ds).
\end{aligned}
$$

From Lemma 1 and the fact that $\int_{[0,\infty)} |b|(ds) < \infty$ we can easily obtain that $J_2 = o(p^{\alpha+1})$. The rest of the argument follows just as in the proof of Theorem 3, by finding a root for the equation $E \exp(\theta \bar X_1) = 1/q$, transforming (28) into a non-defective renewal equation and applying Theorem 1.

As a final remark we note that a straightforward generalization of the previous

theorem can be obtained in a completely analogous setting as the one described in

Theorem 4. As an application of the previous results we consider a couple of examples

in insurance risk theory and queueing theory.


Example 1 (Perturbed ruin model) Consider the case of the classical ruin model perturbed by a diffusion introduced by Dufresne and Gerber (1991). That is, suppose that the risk process is a Levy process of the form

$$R(t) = x + ct - S(t) + \sigma B(t); \quad t \ge 0,$$

where $S(\cdot)$ represents the aggregate claim process, which follows a compound Poisson process with Poisson parameter $\lambda$ and increments (claims) $Y = (Y_k : k \ge 1)$; $x$ represents the initial reserve, $c$ is a constant premium rate satisfying $c > \lambda EY$, and $\sigma B(\cdot)$ is a Brownian motion independent of $S$ with diffusion coefficient equal to $\sigma$ (i.e. instantaneous variance equal to $\sigma^2$). The term involving the Brownian motion $B$ represents noise that may incorporate non-systematic fluctuations in the composition of the insurance portfolio, measurement errors, etc. We are interested in computing the probability of eventual ruin in this model. Note that this model cannot be reduced directly to the standard renewal model discussed at the beginning of the chapter because, in this case, ruin can occur between claim arrivals. In order to apply Theorem 5, let us introduce some additional notation. Let $Z$ be a rv having the equilibrium distribution generated by $Y$, that is

$$P(Z \le z) = \frac{1}{EY} \int_0^z P(Y > y)\, dy.$$

Also, define $p = 1 - \lambda EY/c$ and $q = 1 - p$, and $V = Z + \sigma^2 W/(2c)$, where $W$ is exponentially distributed with mean one and $Z$ and $W$ are independent. Finally, let $\tau(x) = \inf\{t \ge 0 : R(t) < 0\}$, and note that ruin occurs if and only if $\{\tau(x) < \infty\}$. Dufresne and Gerber (1991) proved that if $P(\tau(x) < \infty) = a(x)$, then

$$a(x) = q P(V > x) + p P\bigl( W > 2cx/\sigma^2 \bigr) + q \int_0^x a(x-y)\, P(V \in dy).$$

In this context, $p$ close to zero and $x$ large are reasonable assumptions, hence Theorem 5 can be directly applied here to provide asymptotics for $a(x/p)$ as $p \to 0$. In particular, for $j \ge 0$, it is easy to verify that

$$b_j = \frac{1}{j+1} \left( q E V^{j+1} + p \bigl( \sigma^2/(2c) \bigr)^{j+1} (j+1)! \right),$$

and that

$$E Z^j = \frac{E Y^{j+1}}{(j+1) EY}, \qquad E W^j = j!.$$

These expressions, combined with Proposition 2 and Theorem 5, provide all the necessary means to compute the desired asymptotic expansion. For instance, assuming that $E Z^4 < \infty$ (which implies that $E Y^3 < \infty$), we obtain that

$$a(x/p) = \exp\bigl( -x/EV + 1/2\, \bigl( 1 - EV^2 / (2 E^2 V) \bigr)\, p \bigr)\, d(p) + o(p),$$

where

$$d(p) = \frac{(q EV + p \sigma^2/(2c)) + (q EV^2 + \sigma^4/(2c^2))\, p/EV}{q\, (EV + p EV^2/EV)},$$

and

$$EV = EY^2/(2EY) + \sigma^2/(2c),$$

$$EV^2 = EY^3/(3EY) + \sigma^4/(2c^2) + \sigma^2 EY^2/(2c EY).$$

Note that these asymptotics correspond to corrected diffusion approximations for the

present model.
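
The moment formulas for $V$ in Example 1 are simple to evaluate from the first three claim moments. The Python sketch below is an illustration only; the function name and its argument names are hypothetical, and it computes just $p$, $EV$ and $EV^2$ as written in the example.

```python
def perturbed_ruin_moments(ey, ey2, ey3, lam, c, sigma):
    """Return (p, EV, EV^2) for V = Z + sigma^2 W / (2c) as in Example 1.

    ey, ey2, ey3 are the first three moments of the claim size Y,
    lam is the Poisson rate, c the premium rate, sigma the diffusion coefficient.
    """
    p = 1.0 - lam * ey / c
    ev = ey2 / (2.0 * ey) + sigma ** 2 / (2.0 * c)
    ev2 = (ey3 / (3.0 * ey)
           + sigma ** 4 / (2.0 * c ** 2)
           + sigma ** 2 * ey2 / (2.0 * c * ey))
    return p, ev, ev2
```

For unit-mean exponential claims ($EY = 1$, $EY^2 = 2$, $EY^3 = 6$) with $\lambda = 1$, $c = 4$ and $\sigma = 2$, this gives $p = 0.75$, $EV = 1.5$ and $EV^2 = 3.5$, which agrees with expanding $E(Z + \sigma^2 W/(2c))^2$ directly.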

Example 2 (M/G/c waiting time) A standard (first-come first-served) M/G/c queue can be described as follows. Customers arrive according to a Poisson process with rate $\lambda$. The system is composed of $c$ servers and a buffer of infinite size. The amount of time required to serve each customer is described by a sequence $V = (V_j : j \ge 1)$ of non-negative iid random variables independent of the arrival process. The stability condition of this queue requires less than 100% utilization, which is expressed via $\rho = \lambda EV/c < 1$. Just as in the standard M/G/1 case, the so-called equilibrium distribution of the service time sequence, namely $H(t) = \int_0^t P(V > s)\, ds / EV$, plays an important role in describing the steady-state distribution of the waiting time sequence (excluding service), say $W = (W_n : n \ge 1)$, of this queueing system. In particular (see Van Hoorn (1984)), it turns out that if $a(t) = P(W_\infty > t \mid W_\infty > 0)$, then

$$a(t) = (1-\rho)\, (1 - H(t))^c + \rho\, (1 - H(tc)) + \rho \int_0^t a(t-s)\, dH(sc),$$

and

$$P(W_\infty > 0) = \frac{\dfrac{(\lambda EV)^{c-1}}{(c-1)!}\, \rho}{(1-\rho) \sum_{j=0}^{c-1} (\lambda EV)^j / j! + (\lambda EV)^c / c!}.$$

Therefore, as a straightforward application of Theorem 5, we can develop corrected

diffusion approximations (in the spirit of Chapter 2 of this dissertation) for the steady-

state waiting time of the M/G/c queue.
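
The delay probability $P(W_\infty > 0)$ in Example 2 is a finite sum and can be computed directly. The following Python sketch is an illustration; the function name and argument names are hypothetical, and it evaluates only the displayed formula.

```python
import math


def prob_wait(lam, ev, c):
    """P(W_inf > 0) from Example 2, with offered load a = lam * ev and rho = a / c."""
    a = lam * ev
    rho = a / c
    if rho >= 1.0:
        raise ValueError("stability requires rho = lam * ev / c < 1")
    numerator = a ** (c - 1) / math.factorial(c - 1) * rho
    denominator = ((1.0 - rho) * sum(a ** j / math.factorial(j) for j in range(c))
                   + a ** c / math.factorial(c))
    return numerator / denominator
```

With a single server the expression reduces to $\rho$, and with $c = 2$ and $\lambda EV = 1$ it equals $1/3$.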

Chapter 5

Approximations for the

Distribution of Infinite Horizon

Discounted Rewards

For $t \ge 0$, let $\Lambda(t)$ be a real-valued random variable representing the cumulative reward associated with running a system over $[0,t]$. In the presence of a stochastic inflation rate, the infinite horizon discounted reward takes the form

$$D = \int_{[0,\infty)} \exp(-\Gamma(t-))\, d\Lambda(t),$$

where $\Gamma = (\Gamma(t) : t \ge 0)$ is a real-valued process representing the cumulative inflation to time $t$. An enormous literature exists within the performance modeling and

stochastic control communities that focuses on computing and/or optimizing the ex-

pected infinite-horizon discounted reward, namely E (D). Our focus, in this paper,

is on the development of approximations for the distribution of the random variable

(r.v.) D (and not just its expected value).

As we shall see in Section 2, the distribution of the random variable D plays a key role in a number of different application contexts. Since, clearly, computing the exact distribution of D is, in general, very difficult, the emphasis in this paper is on the development of approximations. All of our approximations are rigorously supported by limit theorems that are valid in the asymptotic regime in which the “inflation rate” is small.

Study of approximations for the distribution of D can be traced back to the early

seventies. Gerber (1971) established a Central Limit Theorem (CLT), as well as its

Berry-Esséen companion, for

$$D = \sum_{k=0}^{\infty} \exp(-\alpha k)\, X_k,$$

in the case of a (small) deterministic discount rate $\alpha$ and iid rewards $(X_k)_{k \ge 0}$. Whitt

(1972) obtained more general central limit theorems for D, also under the assumption

of deterministic interest rates. The aim of Whitt’s paper was to establish discounted

stochastic limit theorems based on postulating a functional limit theorem for the

(undiscounted) reward process (in our notation, Λ).
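
For the discrete-time sum with deterministic discount rate $\alpha$ and iid rewards, the first two moments are available in closed form from the geometric series, which makes the small-$\alpha$ scaling visible. The Python sketch below is an illustration; the function name and arguments are hypothetical, and it only assumes that the rewards have finite mean and variance.

```python
import math


def discounted_sum_moments(mean, var, alpha):
    """Mean and variance of D = sum_{k>=0} exp(-alpha*k) X_k for iid X_k.

    Uses sum_k exp(-alpha*k) = 1/(1 - exp(-alpha)) and
    sum_k exp(-2*alpha*k) = 1/(1 - exp(-2*alpha)).
    """
    mean_d = mean / (-math.expm1(-alpha))
    var_d = var / (-math.expm1(-2.0 * alpha))
    return mean_d, var_d
```

As $\alpha$ decreases, $\alpha E D$ approaches the mean reward and $\alpha\, \mathrm{Var}(D)$ approaches half the reward variance, which is the scaling behind a central limit theorem for $D$.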

The stochastic discount rate has also been widely studied. Pollack and Siegmund

(1985) computed the distribution of D in the case in which Γ follows a Brownian

motion with negative drift and Λ (t) = t; see also Dufresne (1990). The distribution

of D has also been computed explicitly by Gjessing and Paulsen (1997) in some other

particular cases in which both Γ and Λ follow particular types of Levy processes.

Computing the distribution of D in complete generality is clearly unfeasible. And

even in Markovian settings, such as those previously described, the type of integro-

differential equations that arise (see Gjessing and Paulsen (1997) and Yor (2001)) are

challenging to solve both analytically and numerically. Hence, our goal is to provide

approximations to D that hold in great generality and require relatively “easy-to-

obtain” information for their implementation.

It is important to recognize that D arises as the stationary distribution of certain

processes that have been well studied in the context of time series analysis (such

as AR and ARCH processes). By properly scaling certain types of auto-regressive processes, Nelson (1990) obtained sample-path weak convergence results to a Gaussian Ornstein-Uhlenbeck process as the sampling frequency increases. More recently, Forniari and Mele (1997) extended Nelson’s results to cover more general types of non-linear ARCH and GARCH time series models. From the time series analysis perspective, the central limit theorem (CLT) derived here in Section 4 is related to the convergence of the stationary distributions of auto-regressive type models to that of an Ornstein-Uhlenbeck process (namely a Gaussian law). One of the contributions of Whitt (1972) is to

show that weak convergence of properly scaled processes Γ and Λ in the standard

Skorohod topology is not enough to guarantee weak convergence of a suitably scaled

and normalized version of D. Thus, the general weak convergence analysis at the

level of stationary distributions in auto-regressive processes does not follow directly

from previous results in the literature. Our laws of large numbers (LLNs) and CLTs

derived in Section 4 hold in great generality (in particular, in the cases considered by

Nelson (1990) and Forniari and Mele (1997)), hence our results complement previous

analysis on the structure of auto-regressive processes.

Some results similar in spirit to our results in Section 4 have been obtained by

Kushner (1984) and Benveniste, Metiver, and Priouret (1990) in the context of

stochastic approximation algorithms, more precisely the so-called least mean squares

(LMS) algorithm, which gives rise to a linear stochastic recursive equation of order

one. Although these results hold in the vector valued case, the dependence assump-

tions imposed are stronger than ours and are only given in discrete time, which is

not convenient for some of the applications discussed in the next section (e.g. finance

and insurance). Also in the context of stochastic approximation algorithms, Bucklew,

Kurtz, and Sethares (1993) analyzed weak convergence (on compact sets) of processes

following certain stochastic recursive equations that give rise to stationary distributions related to D. As in the previous discussion regarding the time series setting, this

type of analysis does not directly imply weak convergence of stationary distributions.

In this paper, we not only complement previous results in the literature (such as

those discussed in the context of time series analysis) by providing rigorous general

statements that support LLNs and CLTs at the level of stationary distributions, but

also provide new approximations and further refinements for the LLN’s and CLT’s

previously indicated. The approximations proposed take the form of Edgeworth ex-

pansions and large deviation principles (LDP’s), and can typically be implemented

at a modest computational cost (see, for example, (7) and (12)). The assumptions

under which these results are derived are stated at the beginning of the corresponding

sections.


The rest of the paper is organized as follows. As we indicated before, our first

approximation takes the form of a law of large numbers (LLN) and is derived in

Section 3. In Section 4, we provide a central limit theorem (CLT) correction to

the LLN derived in Section 3. The approximations developed in Sections 3 and 4

are shown to be valid under very general settings. Under additional assumptions,

refinements to the CLT are introduced in Section 5. These refined approximations

take the form of Edgeworth expansions and are provided in both the discrete and

the continuous time settings. Finally, under exponential tail conditions on Λ and Γ,

general large deviation principles (LDP’s) are given in Section 6, where sharp large deviation asymptotics are discussed as well.

5.1 Motivating Examples

The distribution of D plays a key role in a number of different application contexts. In the world of finance and pension funds, D is called a “perpetuity” (see Embrechts, Klüppelberg, and Mikosch (1997)). As an example of how D arises in pension funds we mention Dufresne (1990), who proposed a model, based on perpetuities, for computing the value of a pension fund. He argued that the value at time $t$ can be expressed as

$$V(t) = \int_{-\infty}^{t} \exp\left( -\int_s^t \gamma(u)\, du \right) \lambda(s)\, ds,$$

where $(\gamma(t), \lambda(t) : -\infty < t < \infty)$ is stationary and ergodic, with $0 < E\gamma(t) < \infty$ and $E \log(1 + |\lambda(t)|) < \infty$. The processes $\gamma(t)$ and $\lambda(t)$ depend on the parameters

that serve to characterize the pension fund (i.e. benefit payments, actuarial liability,

net premium, and rate of return). As explained in Dufresne (1990), the distribution

of the value process plays an important role in risk management, as it serves to

compute critical rates ensuring that the fund is being managed in a balanced manner

with respect to its actuarial liabilities; see Dufresne (1990) and Bédard and Dufresne

(2001) for additional detail on pension funding. The random variable V (t) can be

recast as a special case of D, so that our results apply directly.

The random variable D also arises in non-pension fund insurance settings. Consider a company that receives premiums at a rate of $p$ dollars per unit time, and pays out claims according to the random process $A(\cdot)$. If $\gamma(t)$ represents the rate of return on the invested risk reserve at time $t$, the risk reserve $R(t)$ evolves according to the equation

$$dR(t) = \gamma(t) R(t)\, dt + (p\, dt - dA(t)),$$

subject to the initial condition $R(0) = r_0$. Harrison (1977) shows that the ruin probability $P(\inf_{t \ge 0} R(t) < 0)$ can be computed in terms of D (for Λ and Γ suitably

defined) when γ is deterministic. Paulsen (1998) extends this result to the case of

stochastic γ (·); see also Nyrhinen (2001). Thus, the key to calculating such ruin

probabilities is computing the distribution of D.

As we mentioned before, it turns out that D also plays a major role in the theory

of ARCH processes. This class of time series is widely used within the statistics and

econometrics communities, and has been employed to model log-returns, exchange

rates, inflation, and many other financial and economic time series; see Campbell,

Lo and Mackinlay (1999), Shephard (1996), Mills (1993) and Wilkie (1986). An

ARCH(1) model satisfies the stochastic recursion

$$Y_{n+1} = A_n + B_{n+1} Y_n,$$

where the sequence $((A_i, B_i) : i \ge 1)$ is iid (independent and identically distributed). Under mild stability conditions (see, for example, Kesten (1973), Verbaat (1979),

Goldie (1991), Embrechts and Goldie (1994)), this Markov chain has a stationary

distribution. This stationary distribution is a special case of D.
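
The stochastic recursion above is easy to iterate, which gives a quick way to sample approximately from its stationary distribution. The Python sketch below is an illustration under a simplifying assumption: it draws one pair per step from a user-supplied sampler (the hypothetical callable `sample_ab`) rather than tracking the index offset between $A_n$ and $B_{n+1}$, and all names in it are hypothetical.

```python
import random


def iterate_recursion(sample_ab, n_steps, y0=0.0):
    """Iterate y <- a + b * y for n_steps, drawing (a, b) from sample_ab() at each step."""
    y = y0
    for _ in range(n_steps):
        a, b = sample_ab()
        y = a + b * y
    return y


def example_sampler(rng):
    """Hypothetical sampler: a ~ Exp(1) and b ~ Uniform(0, 0.9), drawn independently."""
    return lambda: (rng.expovariate(1.0), rng.uniform(0.0, 0.9))
```

A call such as `iterate_recursion(example_sampler(random.Random(0)), 1000)` returns one approximate draw from the stationary law; with constant coefficients $a = 1$, $b = 1/2$ the iteration converges to the fixed point $2$.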

We also note several other application settings in which the distribution of D

arises as a central object. Goldie and Grübel (1996) describe its relevance to complex-

ity theory (in the context of sorting algorithms related to “Quicksort”) and analytic

number theory. Carmona, Petit, and Yor (2001) describe several other applications

arising in mathematical physics and finance.

Apart from the financial Whitt (1972) also reports two application contexts in

which our limit theorems may have potential important implications. and, second,

the dynamic programming context, in which the approximations derived may be used

in developing stochastic criteria and sensitivity analysis for small interest rates, see

Whitt (1972) for additional detail.


As stated earlier, our work is intended to provide approximations to the distribu-

tion of D. Of particular importance (in view of the above applications) is the setting

in which the “interest rate” corresponding to Γ is small. Our theorems establish that

our approximations become asymptotically exact as the “interest rate” goes to zero.

5.2 Law of Large Numbers

We assume throughout this chapter (except in some cases explicitly indicated) that for each $t \ge 0$, $(\Lambda(s) : 0 \le s \le t)$ is of bounded variation. We also require $\Lambda$ and $(\Gamma(s) : s \ge 0)$ to be right continuous functions with left limits (RCLL). (Note that we do not require bounded variation for $\Gamma$.) Let $|\Lambda|(t)$ be the total variation of $\Lambda$ over $[0,t]$, and suppose that $|\Lambda|$ satisfies:

$$\lim_{t \to \infty} \frac{1}{t} |\Lambda|(t) < \infty \quad \text{a.s.} \tag{1}$$

We further assume that:

A1 There exist deterministic constants $\lambda \in \mathbb{R}$ and $\gamma \in (0,\infty)$ such that:

$$\Gamma(t) = \gamma t + o_p(t),$$

$$\Lambda(t) = \lambda t + o_p(t),$$

where $o_p(t)$ means that for every $c > 0$,

$$\sup_{0 \le t \le c} \left| \frac{o_p(t\beta)}{\beta} \right| = o(1) \quad \text{as } \beta \to \infty.$$

Our first proposed approximation for D takes the form

$$D \overset{D}{\approx} \lambda/\gamma. \tag{2}$$

Here, $\overset{D}{\approx}$ means “has approximately the same distribution as”, and is intended to hold no rigorous mathematical meaning. The relation (2) should be merely interpreted as a statement of a proposed approximation.


Of course, given such an approximation, it is important to identify conditions

under which the approximation can be guaranteed to be good. We shall argue that

the approximation (2) tends to be good when the discount rate γ is small. To

make this statement mathematically rigorous, we shall introduce a parameter α that

will control the magnitude of the discount rate. We will show that as $\alpha \searrow 0$, the approximation (2) becomes asymptotically valid. More precisely, let

$$D(\alpha) = \int_{[0,\infty)} \exp(-\alpha \Gamma(t-))\, \Lambda(dt).$$

For $D(\alpha)$, the approximation (2) takes the form

$$D(\alpha) \overset{D}{\approx} \lambda/(\alpha\gamma). \tag{3}$$

The following theorem shows that the approximation (3) becomes accurate as $\alpha \searrow 0$.

Theorem 1 Under A1,

$$\alpha D(\alpha) \to \frac{\lambda}{\gamma} \quad \text{a.s. as } \alpha \searrow 0.$$
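
The law of large numbers in Theorem 1 can be checked by simulation in a simple special case. The Python sketch below is an illustration under assumptions that are not in the text: it takes $\Gamma(t) = \gamma t$ deterministic and $\Lambda$ a compound Poisson process with exponential reward sizes, so that $\lambda$ equals the arrival rate times the mean reward, and it truncates the integral once the discount factor is negligible. The function name, its arguments and the `cutoff` parameter are hypothetical.

```python
import math
import random


def discounted_reward(alpha, gamma, rate, mean_reward, rng, cutoff=40.0):
    """Simulate D(alpha) = sum_k exp(-alpha*gamma*T_k) * X_k.

    T_k are the arrival times of a Poisson process with the given rate,
    X_k are iid exponential rewards with the given mean, and the sum is
    truncated once alpha*gamma*t exceeds cutoff (discount below exp(-cutoff)).
    """
    horizon = cutoff / (alpha * gamma)
    t, total = 0.0, 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            break
        total += math.exp(-alpha * gamma * t) * rng.expovariate(1.0 / mean_reward)
    return total
```

With rate 1, mean reward 1 and $\gamma = 1$, the product $\alpha D(\alpha)$ should be close to $\lambda/\gamma = 1$ when $\alpha$ is small, with fluctuations of order $\alpha^{1/2}$.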

Note that the “law of large numbers” (LLN) offered by Theorem 1 does not assume that the instantaneous discount rate is non-negative (i.e. that $\Gamma$ is non-decreasing). The lack of such an assumption introduces some technical complications in our proofs. We prove (3) by replacing $\Gamma$ with the non-decreasing function

$$\bar\Gamma(t) \triangleq \sup\{\Gamma(s) : 0 \le s \le t\}.$$

The hope is that

$$\bar D(\alpha) = \int_{[0,\infty)} \exp\bigl( -\alpha \bar\Gamma(t-) \bigr)\, \Lambda(dt)$$

then has a behavior similar to that of $D(\alpha)$ when $\alpha$ is small. Theorem 1 can be established by proving the corresponding law of large numbers for $\bar D(\alpha)$. Thus, the proof of Theorem 1 follows from Propositions 1 and 2 below.

Proposition 1 Assume A1. Then,

$$\alpha \bigl( D(\alpha) - \bar D(\alpha) \bigr) \to 0 \quad \text{a.s.}$$

as $\alpha \searrow 0$.

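Proof. Observe that

$$
\begin{aligned}
D(\alpha) &= \int_0^\infty \exp(-\alpha \Gamma(t-))\, \Lambda(dt) \\
&= \int_0^\infty \int_{\alpha\Gamma(t-)}^{\infty} \exp(-u)\, du\, \Lambda(dt) \\
&= \int_0^\infty \left( \int_{\alpha\bar\Gamma(t-)}^{\infty} \exp(-u)\, du + \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du \right) \Lambda(dt) \\
&= \int_0^\infty \int_{\alpha\bar\Gamma(t-)}^{\infty} \exp(-u)\, du\, \Lambda(dt) + \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \\
&= \bar D(\alpha) + \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt),
\end{aligned}
$$

where $\bar\Gamma(t) = \sup\{\Gamma(s) : 0 \le s \le t\}$ and $\bar D(\alpha) = \int_{[0,\infty)} \exp(-\alpha\bar\Gamma(t-))\, \Lambda(dt)$. Therefore, we must show that

$$\left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \to 0 \quad \text{a.s. as } \alpha \searrow 0,$$

but

$$\left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \le \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, |\Lambda|(dt).$$

Now the right hand side of the last inequality is equal to

$$
\begin{aligned}
&\int_0^\infty \alpha \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u - \alpha\Gamma(t/\alpha))\, du\, |\Lambda|\!\left( \frac{dt}{\alpha} \right) \\
&\quad = \int_0^\infty \alpha \exp(-\gamma t + \alpha o_p(t/\alpha)) \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du\, |\Lambda|\!\left( \frac{dt}{\alpha} \right) \\
&\quad = \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt),
\end{aligned}
$$

where

$$h(t,\alpha) = \exp\left( -\frac{\gamma}{2} t + \alpha o_p(t/\alpha) \right) \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du,$$

and

$$\mu_\alpha(dt) = \alpha \exp\left( -\frac{\gamma}{2} t \right) |\Lambda|\!\left( \frac{dt}{\alpha} \right).$$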

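Observe that

$$0 \le \sup_{\alpha, t \ge 0} h(t,\alpha) \le M.$$

Moreover, it is not hard to verify that $h(t,\alpha) \to 0$ as $\alpha \to 0$ uniformly on compact intervals. Also, notice that

$$
\begin{aligned}
\mu_\alpha(0,\infty) &= \int_0^\infty \alpha \exp\left( -\frac{\gamma}{2} t \right) |\Lambda|\!\left( \frac{dt}{\alpha} \right) = \int_0^\infty \alpha \exp\left( -\frac{\gamma}{2} t \alpha \right) |\Lambda|(dt) \\
&= \int_0^\infty e^{-u}\, \alpha\, |\Lambda|\!\left( \frac{2u}{\alpha\gamma} \right) du \\
&= \int_0^\infty e^{-u/2} e^{-u/2}\, \alpha\, |\Lambda|\!\left( \frac{2u}{\alpha\gamma} \right) du.
\end{aligned}
$$

Since $|\Lambda|$ satisfies (1), we have for some $B > 0$,

$$0 \le \int_0^\infty e^{-u/2}\, \alpha\, |\Lambda|\!\left( \frac{2u}{\alpha\gamma} \right) du \le B.$$

Thus, we have that for all $\alpha > 0$,

$$0 \le \mu_\alpha(0,\infty) \le B < \infty.$$

And, if we fix $\varepsilon > 0$, then there exists $C > 0$ such that $\mu_\alpha(C,\infty) \le \varepsilon$, and such that

$$\sup_{t \in [0,C]} h(t,\alpha) \le \varepsilon$$

for $\alpha$ small enough. This implies that if $\alpha$ is small,

$$\int_0^\infty h(t,\alpha)\, \mu_\alpha(dt) \le \varepsilon (B + M).$$

Since $\varepsilon$ was arbitrary, we deduce that

$$\left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \le \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt) \to 0,$$

as claimed.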

To prove Proposition 2, we need to recall the following definition of the generalized inverse of a non-decreasing function.

Definition 1 Let $\Gamma : \mathbb{R}_+ \to \mathbb{R}$ be non-decreasing and RCLL (right continuous with left limits). Then we define $\Gamma^{-1}$ as

$$\Gamma^{-1}(u) = \inf\{t \ge 0 : \Gamma(t) > u\}.$$
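
For a path observed on a grid, both the running supremum used to make $\Gamma$ non-decreasing and the generalized inverse of Definition 1 reduce to a few lines. The Python sketch below is an illustration; it assumes a right-continuous step function that takes the value `values[i]` on the interval from `times[i]` to `times[i+1]`, and all names are hypothetical.

```python
from bisect import bisect_right


def running_sup(values):
    """Running maximum of a sampled path, i.e. sup{values[j] : j <= i} for each i."""
    out, current = [], float("-inf")
    for v in values:
        current = max(current, v)
        out.append(current)
    return out


def generalized_inverse(times, values, u):
    """inf{t >= 0 : G(t) > u} for a non-decreasing right-continuous step function G.

    G equals values[i] on [times[i], times[i+1]); returns infinity if G never exceeds u.
    """
    i = bisect_right(values, u)  # index of the first value strictly greater than u
    return times[i] if i < len(times) else float("inf")
```

Because `bisect_right` returns the position of the first entry strictly larger than `u`, the strict inequality in the definition is respected at levels where the function is flat.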

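Proposition 2 Assume A1. Then,

$$\alpha \bar D(\alpha) \to \lambda/\gamma \quad \text{a.s.}$$

as $\alpha \searrow 0$, where $\bar D(\alpha) = \int_{[0,\infty)} \exp(-\alpha\bar\Gamma(t-))\, \Lambda(dt)$ and $\bar\Gamma(t) = \sup\{\Gamma(s) : 0 \le s \le t\}$.

Proof.

$$
\begin{aligned}
\alpha \bar D(\alpha) &= \alpha \int_0^\infty \exp(-\alpha\bar\Gamma(t-))\, \Lambda(dt) \\
&= \alpha \int_0^\infty \int_{\alpha\bar\Gamma(t-)}^{\infty} \exp(-u)\, du\, \Lambda(dt) \\
&= \alpha \int_0^\infty \exp(-u)\, \Lambda\bigl( \bar\Gamma^{-1}(u/\alpha) \bigr)\, du.
\end{aligned}
$$

Now,

$$\frac{t}{\bar\Gamma^{-1}(t-)} = \frac{\bar\Gamma\bigl( \bar\Gamma^{-1}(t-) \bigr)}{\bar\Gamma^{-1}(t-)} = \gamma + \frac{o_p\bigl( \bar\Gamma^{-1}(t-) \bigr)}{\bar\Gamma^{-1}(t-)}.$$

Hence

$$\bar\Gamma^{-1}(t) = \frac{t}{\gamma} + o_p(t) \quad \text{a.s.}$$

This implies that

$$\alpha \Lambda\bigl( \bar\Gamma^{-1}(u/\alpha) \bigr) = \alpha \bar\Gamma^{-1}(u/\alpha)\, \frac{\Lambda\bigl( \bar\Gamma^{-1}(u/\alpha) \bigr)}{\bar\Gamma^{-1}(u/\alpha)} \to \frac{u\lambda}{\gamma}.$$

Thus, in order to apply the Dominated Convergence Theorem, it suffices to show that for almost every sample path, we have that

$$\left| \alpha \Lambda\bigl( \bar\Gamma^{-1}(u/\alpha) \bigr) \right| \le H(u,\omega) \in L^1\bigl( e^{-u} du \bigr),$$

for some measurable function $H$. However,

$$\left| \alpha \Lambda\bigl( \bar\Gamma^{-1}(u/\alpha) \bigr) \right| = O(u),$$

which suffices to apply dominated convergence.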


5.3 The Central Limit Theorem

In this section, we assume that Λ and Γ jointly satisfy a “strong approximation

principle”, namely:

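A2 There exists a probability space supporting $(\Lambda, \Gamma)$ and a two-dimensional standard Brownian motion

$$(B_1, B_2) = ((B_1(t), B_2(t)) : t \ge 0)$$

for which deterministic constants $\lambda \in (-\infty,\infty)$ and $\gamma \in (0,\infty)$ and a covariance matrix $\Sigma$ can be found such that

$$\begin{pmatrix} \Gamma(t) \\ \Lambda(t) \end{pmatrix} = \begin{pmatrix} \gamma \\ \lambda \end{pmatrix} t + G \begin{pmatrix} B_1(t) \\ B_2(t) \end{pmatrix} + o_p\bigl( t^{1/2} \bigr) \quad \text{a.s. as } t \to \infty.$$

The entries of the covariance matrix $C = GG^T$ can typically be identified as follows:

$$C_{11} = \lim_{t \to \infty} \frac{1}{t} E(\Lambda(t) - \lambda t)^2,$$

$$C_{12} = \lim_{t \to \infty} \frac{1}{t} E(\Lambda(t) - \lambda t)(\Gamma(t) - \gamma t) = C_{21},$$

$$C_{22} = \lim_{t \to \infty} \frac{1}{t} E(\Gamma(t) - \gamma t)^2.$$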

The strong approximation principle A2 holds in great generality. The prototypical example is a sequence of iid (independent and identically distributed) random variables with finite second moment. Some other cases in which dependence is allowed, and under which the validity of this principle has been proved, are briefly described (along with references) next. (Some relevant references on this topic are Philipp and Stout (1975) and Csörgo and Révész (1981).)

Case 1 (Philipp and Stout (1975) Thm. 4.1) Suppose $X = (X_n : n \ge 1)$ is a strictly stationary sequence of random variables such that $E\bigl( |X_1|^{2+\delta} \bigr) < \infty$ for some $\delta > 0$. Also, assume that $X$ is $\phi$-mixing with

$$\sum_{k=1}^{\infty} \phi^{1/2}(k) < \infty.$$

Then $S(t) = \sum_{k \le t} X_k$ satisfies a strong approximation principle with rate $o_p\bigl( t^{1/2-\lambda} \bigr)$ for some $\lambda > 0$ small.

Case 2 Philipp and Stout’s Thm. 8.1 also provides a strong approximation principle with rate $o_p\bigl( t^{1/2-\lambda} \bigr)$ (for $\lambda > 0$ small) when $X = (X_n : n \ge 1)$ is not necessarily stationary, but $\phi$-strong mixing with

$$\phi(k) \ll k^{-300(1+2\delta)},$$

for some $\delta \in (0,2]$. This result also requires moments of order $2 + \delta$ and other technical assumptions to control the growth of the second moment of the random elements $X_k$.

Case 3 In the context of a positive recurrent irreducible Markov sequence $(\zeta_n)_{n \ge 1}$ with stationary transition probabilities and countable state space, Thm. 10.1 of Philipp and Stout (1975) provides strong approximation principles for the case in which $X_k = f(\zeta_k)$. The results in this case depend on moment conditions of the type described before for the cumulative reward within a cycle of the Markov process.

Case 4 Horvath (1984a, 1984b and 1986) developed strong approximation theorems in the context of vector-valued cases under higher moment conditions, and also for the cases of renewal processes and extended renewal processes. Also, Philipp and Stout (1975) Ch. 12 deals with various types of continuous parameter stochastic processes (e.g. Gaussian increments and mixing increments; the latter case includes Levy processes as a particular case).

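Given A2, we propose the following (refined) Gaussian approximation for D, namely

$$D \overset{D}{\approx} \lambda/\gamma + \bigl( \sigma/\gamma^{1/2} \bigr) N(0,1), \tag{4}$$

where

$$\sigma^2 = \frac{1}{2} \left( C_{11} - \frac{2\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right).$$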

Note that (4) improves upon (2) by providing a normal approximation for the stochas-

tic variability that is present in the r.v. D.

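As for the approximation (2), we claim that (4) is accurate when the discount rate is small. In particular, note that (4) suggests the approximation

$$D(\alpha) \overset{D}{\approx} \lambda/(\alpha\gamma) + \bigl( \sigma/\sqrt{\alpha\gamma} \bigr) N(0,1), \tag{5}$$

where

$$\sigma = \sqrt{\frac{1}{2} \left( C_{11} - 2 \frac{\lambda}{\gamma\alpha} \alpha C_{12} + \frac{\lambda^2}{(\gamma\alpha)^2} \alpha^2 C_{22} \right)} = \sqrt{\frac{1}{2} \left( C_{11} - \frac{2\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right)}.$$

The following central limit theorem (CLT) asserts that the approximation (5) is indeed valid as $\alpha \searrow 0$.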
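
The Gaussian approximation (5) needs only the drift constants and the three covariance entries. The Python sketch below is an illustration; the function name and argument names are hypothetical, and it simply returns the mean and standard deviation of the normal law on the right hand side of (5).

```python
import math


def gaussian_approximation(lam, gamma, c11, c12, c22, alpha):
    """Mean and standard deviation of the normal approximation (5) for D(alpha)."""
    sigma2 = 0.5 * (c11 - 2.0 * lam / gamma * c12 + (lam / gamma) ** 2 * c22)
    mean = lam / (alpha * gamma)
    std = math.sqrt(sigma2 / (alpha * gamma))
    return mean, std
```

As a sanity check under an assumed special case, take $\Gamma(t) = \gamma t$ deterministic (so $C_{12} = C_{22} = 0$) and $\Lambda$ compound Poisson with unit rate and unit-mean exponential rewards (so $\lambda = 1$ and $C_{11} = 2$). With $\gamma = 1$ and $\alpha = 0.001$ the approximation has mean 1000 and variance 1000, which matches the exact mean and variance of $D(\alpha)$ in that case.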

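Theorem 2 If A2 is in force, then

$$\alpha^{-1/2} \left( \alpha D(\alpha) - \frac{\lambda}{\gamma} \right) \Longrightarrow \sigma N(0,1)$$

as $\alpha \searrow 0$, where

$$\sigma^2 = \frac{1}{2\gamma} \left( C_{11} - \frac{2\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right).$$

Again, just as in the case of the LLN derived previously, the strategy is to show that the behavior of the random variable

$$\bar D(\alpha) = \int_{[0,\infty)} \exp\bigl( -\alpha\bar\Gamma(t-) \bigr)\, \Lambda(dt),$$

with $\bar\Gamma(t) = \sup\{\Gamma(s) : 0 \le s \le t\}$, is comparable to that of $D(\alpha)$ for the purposes of approximation (5). This is the aim of Proposition 3 below, whose proof follows using the same technique as in Proposition 1 together with an application of the next Lemma.

Lemma 1 Under A2,

$$\sqrt{\alpha} \bigl( \bar\Gamma(t/\alpha) - \Gamma(t/\alpha) \bigr) \to 0 \quad \text{as } \alpha \to 0$$

uniformly on compact sets.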

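Proof.

$$\left| \sqrt{\alpha} \bigl( \bar\Gamma(t/\alpha) - \Gamma(t/\alpha) \bigr) \right| \le \sqrt{\alpha} \max_{0 \le s \le t/\alpha} \bigl( \gamma(s - t/\alpha) + \Sigma_{2\cdot} (B(s) - B(t/\alpha)) \bigr) + \sqrt{\alpha}\, o_p\bigl( (t/\alpha)^{1/2} \bigr).$$

Observe that

$$
\begin{aligned}
\sqrt{\alpha} \max_{0 \le s \le t/\alpha} \bigl( \gamma(s - t/\alpha) + \Sigma_{2\cdot} (B(s) - B(t/\alpha)) \bigr)
&= \max_{0 \le s \le t} \left( \frac{\gamma(s-t)}{\sqrt{\alpha}} + \sqrt{\alpha}\, \Sigma_{2\cdot} (B(s/\alpha) - B(t/\alpha)) \right) \\
&\le \max_{0 \le s \le t} \bigl( C (s-t)^{1/2+\varepsilon} \alpha^{-\varepsilon} - \gamma(s-t) \alpha^{-1/2} \bigr) \\
&\le \max_{0 \le u \le t} \bigl( C u^{1/2+\varepsilon} \alpha^{-\varepsilon} - \gamma u \alpha^{-1/2} \bigr) \le M \alpha^{1/2} \to 0.
\end{aligned}
$$

The first inequality holds by virtue of the law of the iterated logarithm (LIL), and from the last inequality we can see that the convergence holds uniformly on compact sets.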

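Proposition 3 Under A2,

$$\alpha^{1/2} \bigl( D(\alpha) - \bar D(\alpha) \bigr) \to 0 \quad \text{a.s.}$$

as $\alpha \searrow 0$.

Proof. As in the proof of Proposition 1,

$$D(\alpha) = \bar D(\alpha) + \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt).$$

Hence, we must show

$$\alpha^{1/2} \left| \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \to 0 \quad \text{a.s. as } \alpha \searrow 0.$$

Now, observe that

$$\left| \alpha^{1/2} \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \le \alpha^{1/2} \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, |\Lambda|(dt) = \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt),$$

where

$$h(t,\alpha) = \exp\left( -\frac{\gamma}{2} t + \alpha o_p(t/\alpha) \right) \frac{1}{\sqrt{\alpha}} \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du,$$

and

$$\mu_\alpha(dt) = \alpha \exp\left( -\frac{\gamma}{2} t \right) |\Lambda|\!\left( \frac{dt}{\alpha} \right).$$

Observe that

$$0 \le h(t,\alpha) \le M,$$

and, by virtue of the previous lemma, $h(t,\alpha) \to 0$ as $\alpha \to 0$ uniformly on compact intervals. The rest of the proof follows by repeating the same steps used in Proposition 1.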

In light of the previous proposition, Theorem 2 follows by combining the last result

together with Proposition 4 below. The following lemma will be used in the proof of

Proposition 4.

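Lemma 2 Let $\Sigma$ be a $d$-dimensional vector, let $\tau(t)$ be such that

$$\tau(t) = \gamma t + o_p(t),$$

and suppose that $B(t)$ is a $d$-dimensional Brownian motion. Then,

$$\alpha^{1/2}\int_0^\infty e^{-u}\,\Sigma' B(\tau(u/\alpha))\,du \Rightarrow N(0,\sigma^2),$$

where $\sigma^2 = \frac{1}{2\gamma}\Sigma'\Sigma'^{T}$.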

Proof.

α1/2Z ∞

0

e−uΣ0B (τ (u/α)) du

= α1/2Z t/α

0

e−uΣ0B (τ (u/α)) du+ α1/2Z ∞

t/α

e−uΣ0B (τ (u/α)) du


Observe that¯α1/2

Z ∞

t/α

e−uΣ0B (τ (u/α)) du¯≤ α1/2

Z ∞

t/α

e−uΣ0 |B (τ (u/α))| du

= α−εZ ∞

t/α

Ce−uu1/2+εdu a.s.,

by virtue of the LIL (for all ε > 0 and for some C > 0), and the integral above, goes

to zero faster than αε. Hence, in order to show weak convergence, it suffices to study

(by virtue of Slutsky’s lemma)

α1/2Z t/α

0

e−uΣ0B (τ (u/α)) du = α1/2Z t/α

0

e−uΣ0 (B (τ (u/α))−B (u/γα)) du

+α1/2Z t/α

0

e−uΣ0B (u/γα) du.

We first show that¯¯α1/2

Z t/α

0

e−uΣ0 (B (τ (u/α))−B (u/γα)) du¯¯ =⇒ 0.

Observe that

sup0≤u≤t/α

αe−u |τ (u/α)− µu/α| = sup0≤u≤t/α

e−u |αop (u/α)|

≤ o (1) sup0≤u≤∞

e−uu

therefore, for every δ > 0 we can choose α0 such that if α > α0

P

Ãsup

0≤u≤t/αe−u |τ (u/α)− µu/α| > δ/α

!≤ δ

Let us define

Aδ = {ω : sup0≤u≤t/α

e−u |τ (u/α)− µu/α| ≤ δ/α}

and

Akδ = {ω : supk/α≤u≤(k+1)/α

e−u |τ (u/α)− µu/α| ≤ δ/α},


then

P

ï¯α1/2

Z t/α

0

e−uΣ0 (B (τ (u/α))−B (u/γα)) du¯¯ > ε

!

≤ P

Ãα1/2

Z t/α

0

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε

!

≤ P

Ãα1/2

Z t/α

0

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε;Aδ

!+ δ.

Notice that

P

Ãα1/2

Z t/α

0

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε;Aδ

!

≤btcXk=0

P

Ãα1/2

Z (k+1)/α

k/α

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε

t;Aδ

!

≤btcXk=0

P

Ãα1/2

Z (k+1)/α

k/α

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε

t;Akδ

!.

The kth−term in the last expression is less or equal than

P

Ãα1/2

Z (k+1)/α

k/α

e−uΣ0 sup0≤s≤ 2δ

αek/α

|B (s)| du > ε

2t

!

= P

ÃZ (k+1)/α

k/α

e−u sup0≤s≤2δek/α

|B (s)| du > ε

2tk

!

≤ t

2εce−k/α

¡1− e−1/α¢EÃ sup

0≤s≤2δek/α|B (s)|

!

=t

2εce−k/α

¡1− e−1/α¢Eµ sup

0≤u≤1

√2δek/α |B (s)|

¶=

Mt√δ

εe−k/2α

¡1− e−1/α¢ .


This implies that

P

Ãα1/2

Z t/α

0

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε;Aδ

!

≤ Mt√δ

ε

¡1− e−1/α¢ ∞X

k=0

e−k/2α

=Mt√δ

ε

1− e−1/α1− e−1/2α =

Mt√δ

ε

1− e−1/α1− e−1/2α .

Therefore,

limα→0P

Ãα1/2

Z t/α

0

e−uΣ0 k(B (τ (u/α))−B (u/γα))k du > ε;Aδ

!≤ 2Mt

√δ

ε.

Since δ was arbitrary, we conclude that¯¯α1/2

Z t/α

0

e−uΣ0 (B (τ (u/α))−B (u/γα)) du¯¯ P→ 0,

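in particular the last term goes to zero weakly. Finally, we observe that

$$\alpha^{1/2}\int_0^{t/\alpha} e^{-u}\,\Sigma' B(u/\gamma\alpha)\,du \overset{D}{=} \int_0^{t/\alpha} e^{-u}\,\Sigma' B(u/\gamma)\,du \Longrightarrow \int_0^\infty e^{-u}\,\Sigma' B(u/\gamma)\,du,$$

which is Gaussian, with mean zero and variance (which can be computed using integration by parts and the Ito isometry) $\sigma^2 = \frac{1}{2\gamma}\Sigma'\Sigma'^{T}$.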
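The variance stated at the end of this proof can also be checked directly from the covariance of Brownian motion. The following is only a sketch of that check: it uses Fubini's theorem on the covariance rather than the integration by parts and Ito isometry route mentioned in the text, and the symbols $s^2$ (the scalar multiplying $1/(2\gamma)$ in the variance formula of Lemma 2) and $W$ (the resulting one-dimensional Brownian motion with $\operatorname{Var} W(u) = s^2 u$) are notation introduced here for illustration.

```latex
% Assumption: W is a one-dimensional Brownian motion with Var W(u) = s^2 u,
% so that Cov(W(u/gamma), W(v/gamma)) = s^2 min(u, v) / gamma.
\begin{align*}
\operatorname{Var}\left(\int_0^\infty e^{-u}\,W(u/\gamma)\,du\right)
 &= \int_0^\infty\!\!\int_0^\infty e^{-u-v}\,
    \operatorname{Cov}\bigl(W(u/\gamma),W(v/\gamma)\bigr)\,du\,dv \\
 &= \frac{s^2}{\gamma}\int_0^\infty\!\!\int_0^\infty e^{-u-v}\min(u,v)\,du\,dv \\
 &= \frac{2s^2}{\gamma}\int_0^\infty u\,e^{-u}\left(\int_u^\infty e^{-v}\,dv\right)du
  = \frac{2s^2}{\gamma}\int_0^\infty u\,e^{-2u}\,du
  = \frac{s^2}{2\gamma}.
\end{align*}
```

The factor $1/2$ therefore comes entirely from the double exponential weighting, while the factor $1/\gamma$ comes from the time change $u \mapsto u/\gamma$.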

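Proposition 4 If A2 is in force,

$$\alpha^{-1/2}\left(\alpha D(\alpha) - \frac{\lambda}{\gamma}\right) \Longrightarrow \sigma N(0,1) \quad \text{as } \alpha \to 0,$$

where

$$\sigma^2 = \frac{1}{2\gamma}\begin{bmatrix} 1 & -\frac{\lambda}{\gamma}\end{bmatrix} C \begin{bmatrix} 1 \\ -\frac{\lambda}{\gamma}\end{bmatrix} = \frac{1}{2\gamma}\left(C_{11} - 2\frac{\lambda}{\gamma}C_{12} + \frac{\lambda^2}{\gamma^2}C_{22}\right).$$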


Proof. We write

αD (α)− λ

γ=

Z ∞

0

e−uµαΛ³Γ−1 ³u−

α

´´− λ

γ

¶du.

Let W (α, u) =³αΛ³Γ−1(u/α)

´− λ

γ

´, using the strong approximation assumption

we can see that:

W (α, u) = α

µλ

γu− λ

γ

¶− αλ

γG2·B

³Γ−1 ³u−

α

´´+ αG1·B

³Γ−1 ³u−

α

´´+αop

µ³Γ−1 ³u−

α

´´1/2¶= α

µλ

γu− λ

γ

¶+ αe−uΣ0B

³Γ−1

³u−α

´´du+ αop

µΓ−1

³u−α

´1/2¶,

where

Σ0 =h1 −λ

γ

iG.

Integrating out α−1/2W (α, u), we obtain

α−1/2Z ∞

0

e−uW (α, u) du = I1 (α) + α1/2Z ∞

0

e−uΣ0B³Γ−1(u/α)

´du+ I2 (α) .

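We analyze each of these terms in turn. First, since $\int_0^\infty u e^{-u}\,du = \int_0^\infty e^{-u}\,du = 1$, it is clear that

$$I_1 = \alpha^{-1/2}\int_0^\infty e^{-u}\left(\frac{\lambda}{\gamma}u - \frac{\lambda}{\gamma}\right)du = 0.$$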

Next,

I2 = α−1/2Z ∞

0

e−uαop

µ³Γ−1(u/α)

´1/2¶du =

Z ∞

0

e−uα1/2op

µ³Γ−1(u/α)

´1/2¶du.

Observe that

√αop

µ³Γ−1(u/α)

´1/2¶= u1/2

uΓ−1 (u/α)

op

µ³Γ−1(u/α)

´1/2¶³Γ−1(u/α)

´1/2 → 0 a.s. α→ 0.


We wish to apply dominated convergence. Notice that¯√αop

µ³Γ−1(u/α)

´1/2¶¯= O

¡u1/2

¢ ≤ H (u,ω) ∈ L ¡e−udu¢this implies that I2 (α)→ 0 a.s. Therefore, we obtain that

α−1/2W (α, u)− α1/2Z ∞

0

e−uΣ0B³Γ−1(u/α)

´du⇒ 0,

which combined with the previous lemma and standard converging together results

yields the conclusion of the proposition.

Finally, Theorem 2 is a direct consequence of Propositions 3 and 4.

5.4 Edgeworth Expansion

In this section, we provide refined versions of the approximations given in the previous

sections. The refined approximation takes the form of an Edgeworth expansion for

the distribution of D. We shall derive these approximations in the iid setting for the

discrete time case and under Markovian assumptions for the continuous time case.

More precisely, in the discrete time case, motivated by the applications to ARCH

processes described in Section 2, we consider

$$D = \sum_{k=0}^{\infty} \exp\left(-\sum_{j=0}^{k-1} Z_j\right) X_k,$$

where $(X_k, Z_k)_{k\ge 1}$ is a sequence of iid random vectors satisfying certain assumptions to be described later (see assumptions ED1 to ED4 below); while in the continuous time context, we work with

$$D = \int_0^\infty \exp\left(-\int_0^t \gamma(Y(s))\,ds\right) d\Lambda(t),$$

where $Y = (Y_s : s \ge 0)$ is a suitably defined homogeneous Markov process and $\Lambda$ is a stationary independent increment process. This setting is commonly used in the risk theory example discussed in Section 2 (see Ch. 7 of Asmussen (2001)).


5.4.1 The discrete time setting

In this section, we shall consider the following set of assumptions.

ED1 Assume that $Z_1 \ge 0$, $E(Z_1) = \gamma < \infty$, $E(Z_1^2) = \mu_Z^{(2)} < \infty$, and $E(|Z_1|^3) < \infty$. Let $\sigma_Z^2$ be the variance of $Z_1$ and $\kappa_Z^{(3)}$ its third order cumulant, which can be written as

$$\kappa_Z^{(3)} = \mu_Z^{(3)} - 3\mu_Z^{(2)}\gamma + 2\gamma^3.$$

ED2 Suppose that $X_1$ has non-lattice distribution with $E(X_1) = \lambda$, $Var(X_1) = \sigma_X^2$, and $E(|X_1|^3) < \infty$. Let $E(X_1^3) = \mu_X^{3}$ and write $\kappa_X^{(3)}$ to denote the third order cumulant of $X_1$. In addition, assume that the distribution of $X_1$ given $Z_1$ is non-lattice.

ED3 Suppose that $E(|X_1|^j |Z_1|^k) < \infty$ for $0 < j + k \le 3$ and for $j, k \ge 1$ denote $\mu_{jk} = E(X_1^j Z_1^k)$. Moreover, let us define

$$\delta(\theta, Z_1) = \left|E\left(e^{i\theta X_1} \mid Z_1\right)\right|$$

and assume that

$$\lim_{h\to 0}\ \sup_{\varepsilon \le |\theta| \le 1/\varepsilon} \frac{P(\delta(\theta, Z_1) > 1 - h)}{h} < \infty, \tag{6}$$

for $\varepsilon > 0$.

Condition (6) is technical, and may be seen as a form of strong non-latticity of $X_1$ given $Z_1$. Notice that, in the important special case in which the $X_k$'s are independent of the $Z_k$'s, condition (6) of assumption ED3 is an immediate consequence of ED2. Indeed, if $X_1$ is non-lattice, we have that $\delta(\theta, Z_1) = \delta(\theta) < 1$. Therefore, for all $h > 0$ sufficiently small, $\delta(\theta) < 1 - h$. This implies that the limit in (6) is zero.

As a remark, we also note that, alternatively, the non-negativity of $Z_1$ required in assumption ED1 can be replaced by the existence of exponential moments; we record this observation as our alternative assumption ED1′.

ED1′ Assume that $E\exp(\rho Z_1) < \infty$ for $\rho$ in a vicinity of the origin.


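Under these assumptions, we improve the approximation (4) by providing an Edgeworth expansion for the distribution of $D$ when $(X_k, Z_k)_{k\ge 1}$ is a sequence of i.i.d. random vectors and the discount rate $\gamma$ is small. In particular, by defining

$$\sigma^2 = \frac{1}{2}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right),$$

we can write the approximation proposed as

$$P(D \le y) \approx P\left(N\left(\lambda/\gamma, \sigma^2/\gamma\right) \le y\right) - \sqrt{\gamma}\,\beta_1\,\eta\!\left(\frac{(y - \lambda/\gamma)\sqrt{\gamma}}{\sigma}\right) - \frac{\sqrt{\gamma}}{18}\,\beta_2\,H\!\left(\frac{(y - \lambda/\gamma)\sqrt{\gamma}}{\sigma}\right). \tag{7}$$

The constants $\beta_1$ and $\beta_2$ satisfy

$$\beta_1 = \frac{\mu_Z^{(2)}\lambda}{2\gamma^2\sigma},$$

$$\sigma^3\beta_2 = \kappa_X^{(3)} - 2\kappa_{21}\frac{\lambda}{\gamma} + 3\kappa_{12}\frac{\lambda^2}{\gamma^2} - \frac{3\kappa_{11}}{\gamma}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right) + 3\sigma_Z^2\frac{\lambda}{\gamma^2}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right) - \frac{\kappa_Z^{(3)}\lambda^3}{\gamma^3},$$

with

$$\kappa_{12} = \mu_{12} + \mu_{11} - \mu_Z^{(2)} - 3\gamma\mu_{11} + 2\gamma^2\lambda,$$
$$\kappa_{21} = \mu_{21} + \mu_{11} - \mu_X^{(2)} - 3\lambda\mu_{11} + 2\lambda^2\gamma,$$
$$\kappa_{11} = \mu_{11} - \lambda\gamma = \sigma_{XZ} \triangleq cov(X, Z);$$

and

$$\eta(y) = \frac{1}{\sqrt{2\pi}}\exp\left(-y^2/2\right), \qquad H(y) = \left(y^2 - 1\right)\eta(y).$$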

The application of the approximation (7) requires estimation of the joint moments $\mu_{ij}$, which can be easily done (even non-parametrically) using standard methods. Also, observe that in the case in which the sequences $(X_k)_{k>0}$ and $(Z_k)_{k>0}$ are independent, the constants $\sigma^2$, $\beta_1$ and $\beta_2$ take the simplified form

$$\sigma^2 = \frac{1}{2}\left(\sigma_X^2 + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right)$$

and

$$\beta_1 = \frac{\mu_Z^{(2)}\lambda}{2\gamma^2\sigma}, \qquad \beta_2 = \frac{1}{\sigma^3}\left(\kappa_X^{(3)} + 3\frac{\sigma_Z^2\lambda\sigma_X^2}{\gamma^2} - \frac{\kappa_Z^{(3)}\lambda^3}{\gamma^3} + 3\frac{\sigma_Z^4\lambda^3}{\gamma^4}\right).$$
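To make the independent-case formulas concrete, the following sketch evaluates $\sigma$, $\beta_1$, $\beta_2$ and the right-hand side of approximation (7). It is an illustration only: the function names (`edgeworth_constants`, `edgeworth_cdf`) are hypothetical, the moments are assumed to be supplied by the caller, and the numerical inputs are those of exponential $X_1$ and $Z_1$ with means $\lambda = 1$ and $\gamma = 0.5$, chosen here purely as a test case.

```python
import math


def normal_pdf(x):
    """Standard normal density eta(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def edgeworth_constants(lam, gamma, var_x, var_z, m2_z, k3_x, k3_z):
    """sigma, beta1, beta2 when (X_k) and (Z_k) are independent.

    m2_z is E(Z^2); k3_x and k3_z are third cumulants (caller-supplied).
    """
    sigma = math.sqrt(0.5 * (var_x + lam ** 2 / gamma ** 2 * var_z))
    beta1 = m2_z * lam / (2.0 * gamma ** 2 * sigma)
    beta2 = (k3_x
             + 3.0 * var_z * lam * var_x / gamma ** 2
             - k3_z * lam ** 3 / gamma ** 3
             + 3.0 * var_z ** 2 * lam ** 3 / gamma ** 4) / sigma ** 3
    return sigma, beta1, beta2


def edgeworth_cdf(y, lam, gamma, sigma, beta1, beta2):
    """Right-hand side of approximation (7) evaluated at y."""
    x = (y - lam / gamma) * math.sqrt(gamma) / sigma
    eta = normal_pdf(x)
    hermite = (x * x - 1.0) * eta  # H(x) = (x^2 - 1) eta(x)
    return (normal_cdf(x)
            - math.sqrt(gamma) * beta1 * eta
            - math.sqrt(gamma) / 18.0 * beta2 * hermite)


# Illustrative inputs: exponential X with mean lam, exponential Z with mean gamma.
lam, gamma = 1.0, 0.5
sigma, beta1, beta2 = edgeworth_constants(
    lam, gamma,
    var_x=lam ** 2, var_z=gamma ** 2,
    m2_z=2.0 * gamma ** 2,
    k3_x=2.0 * lam ** 3, k3_z=2.0 * gamma ** 3,
)
approx_at_center = edgeworth_cdf(lam / gamma, lam, gamma, sigma, beta1, beta2)
print(sigma, beta1, beta2, approx_at_center)
```

In this test case the two correction terms have opposite signs at $y = \lambda/\gamma$, so dropping either one changes the value noticeably; that is the kind of sensitivity one may want to inspect before relying on the CLT term alone.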

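In order to understand the nature of approximation (7), we introduce a small scaling parameter $\alpha > 0$ and define

$$D(\alpha) = \sum_{k=0}^{\infty}\exp\left(-\alpha\sum_{j=0}^{k-1}Z_j\right)X_k.$$

Approximation (7) then becomes (since the quantities $\sigma$, $\beta_1$ and $\beta_2$ are not affected by the scaling)

$$P(D(\alpha) \le y) \approx P\left(N\left(\lambda/\alpha\gamma, \sigma^2/\alpha\gamma\right) \le y\right) - \sqrt{\gamma\alpha}\,\beta_1\,\eta\!\left(\frac{(y - \lambda/\gamma\alpha)\sqrt{\gamma\alpha}}{\sigma}\right) - \frac{\sqrt{\gamma\alpha}}{18}\,\beta_2\,H\!\left(\frac{(y - \lambda/\gamma\alpha)\sqrt{\gamma\alpha}}{\sigma}\right). \tag{8}$$

Or, in other words,

$$P\left(\sqrt{\alpha}\left(D(\alpha) - \lambda/\alpha\gamma\right) \le y\right) \approx P\left(N\left(0, \sigma^2/\gamma\right) \le y\right) - \sqrt{\gamma\alpha}\,\beta_1\,\eta\!\left(\frac{\sqrt{\gamma}}{\sigma}y\right) - \frac{\sqrt{\gamma\alpha}}{18}\,\beta_2\,H\!\left(\frac{\sqrt{\gamma}}{\sigma}y\right)$$

with an error of order $o(\sqrt{\alpha})$ (uniformly in $y$). The precise mathematical statement concerning the previous approximations is the content of Theorem 3 below, which provides the first order correction in the Edgeworth expansion for $D(\alpha)$. However, before moving on to Theorem 3, we present a simple example to illustrate the accuracy of the approximations proposed.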

Example 1 Suppose that $X_1 \sim \lambda \exp(1)$ and $Z_1 \sim \gamma \exp(1)$. Under these assumptions it follows (see Gjessing and Paulsen (1997)) that

$$D = \sum_{k=0}^{\infty}\exp\left(-\sum_{j=0}^{k-1}Z_j\right)X_k \sim \lambda\,\Gamma(1/\gamma + 1, 1),$$

where $\Gamma(1/\gamma + 1, 1)$ represents a random variable with gamma distribution with the parameters given. In order to illustrate the numerical fit of the approximation provided, we consider the cases $\lambda = 1$ with $\gamma = .1$ and $\gamma = .5$, respectively. The following graphs compare the CLT and Edgeworth approximations developed against the true distribution of $D$:

[Figure: CLT and Edgeworth Based Approximations. Two panels, "Approximation for D (Exponential Case EZ=.1)" and "Approximation for D (Exponential Case EZ=.5)", each plotting the True, CLT and Edgeworth distribution functions (vertical axis from 0 to 1, horizontal axis from -2.8 to 7.1).]
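The gamma law quoted in Example 1 can be checked by simulation. The sketch below is an illustration under stated assumptions, not the computation behind the graphs: it draws $D$ for $\lambda = 1$, $\gamma = .5$ by summing discounted exponential rewards, truncates each sum once the discount factor falls below a tolerance `tol` (a numerical shortcut introduced here), and compares the sample mean and variance with those of $\lambda\Gamma(1/\gamma + 1, 1)$, namely $\lambda(1/\gamma + 1)$ and $\lambda^2(1/\gamma + 1)$. The function name `simulate_D`, the seed and the sample size are arbitrary choices.

```python
import math
import random


def simulate_D(lam, gamma, rng, tol=1e-12):
    """One draw of D = sum_k exp(-(Z_0 + ... + Z_{k-1})) X_k.

    X_k is exponential with mean lam, Z_k exponential with mean gamma.
    The sum is truncated once the discount factor drops below tol.
    """
    total = 0.0
    cumulative_rate = 0.0  # running sum of the Z_j
    while math.exp(-cumulative_rate) > tol:
        # random.expovariate takes the rate, i.e. 1 / mean.
        total += math.exp(-cumulative_rate) * rng.expovariate(1.0 / lam)
        cumulative_rate += rng.expovariate(1.0 / gamma)
    return total


rng = random.Random(2004)
lam, gamma, n = 1.0, 0.5, 5000
draws = [simulate_D(lam, gamma, rng) for _ in range(n)]

sample_mean = sum(draws) / n
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (n - 1)

# Mean and variance of lam * Gamma(1/gamma + 1, 1).
gamma_mean = lam * (1.0 / gamma + 1.0)
gamma_var = lam ** 2 * (1.0 / gamma + 1.0)

print("mean:", sample_mean, "vs", gamma_mean)
print("variance:", sample_var, "vs", gamma_var)
```

Note that the exact mean exceeds the CLT centering $\lambda/\gamma$ by $\lambda$, which is the shift that the $\beta_1$ term in (7) is meant to capture.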

We now provide the rigorous statement supporting approximation (7).

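Theorem 3 If the set of assumptions ED1 (or ED1′) to ED4 are in force, then

$$P\left(\sqrt{\alpha}\left(D(\alpha) - \frac{\lambda}{\gamma\alpha}\right) \le y\right) = P\left(N\left(0, \frac{\sigma^2}{\gamma}\right) \le y\right) - \sqrt{\alpha}\,\beta_1\,\eta(y) - \frac{\sqrt{\alpha}}{18}\,\beta_2\,\gamma\,H(y) + G_\alpha(y); \tag{9}$$

where $G_\alpha$ represents a signed measure with $G_\alpha^+(\mathbb{R}) + G_\alpha^-(\mathbb{R}) \triangleq \|G_\alpha(dy)\| = o(\sqrt{\alpha})$.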

In order to prove this theorem, we need some preliminary results. As is standard in obtaining Edgeworth expansions via Fourier analytic methods (see Feller (1968) p. 512), one first proceeds to obtain an asymptotic expansion for the cumulant generating function of interest. Hence, our first result provides an asymptotic expansion for $\psi_\alpha(\theta) \triangleq \log E\exp\left(i\theta\alpha^{-1/2}(\alpha D(\alpha) - \lambda/\gamma)\right)$ in powers of $\sqrt{\alpha}$.


Lemma 3 Assume ED1 (or ED1′) to ED3. Then, there exists δ > 0 for which we have that

ψα (θ) =

õ(2)Z λ

2γ2+O (α)

!iθα1/2

+

µ1

2γα

µσ2X − 2

λ

γσXZ +

λ2

γ2σ2Z

¶+O (1)

¶(iθ)2

+

µC3α+O (1)

¶(iθ)3

6α3/2 + o

¡α1/2

¢,

(uniformly in θ ∈ (−δ, δ), δ > 0) where

3γC3 = κ(3)X − 2κ21

λ

γ+ 3κ12

λ2

γ2

+3

µσ2Z

λ

γ2− σXZ

γ

¶µσ2X − 2

λ

γσXZ +

λ2

γ2σ2Z

¶− κ

(3)Z λ3

γ3.

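Proof. The idea is to write

$$\phi_\alpha(\theta) = \exp\left(-i\theta\lambda/(\gamma\sqrt{\alpha})\right)\phi\left(\theta\sqrt{\alpha}, \alpha\right),$$

where $\phi_\alpha(\theta) \triangleq \exp(\psi_\alpha(\theta))$ and $\phi(\theta, \alpha) \triangleq E\exp(i\theta D(\alpha))$. Notice that $\phi(\theta, \alpha)$ satisfies

$$\phi(\theta, \alpha) = E\left(\exp\left(i\theta\left(X_1 + \exp(-\alpha Z_1)D_1(\alpha)\right)\right)\right),$$

with $D_1(\alpha)$ independent of $(X_1, Z_1)$. Thus, we have

$$\begin{aligned}
\phi(\theta, \alpha) &= E\left(\exp\left(i\theta\left(X_1 + \exp(-\alpha Z_1)D_1(\alpha)\right)\right)\right) \\
&= E\left(E\left(\exp\left(i\theta\left(X_1 + \exp(-\alpha Z_1)D_1(\alpha)\right)\right) \mid X_1, Z_1\right)\right) \\
&= E\left(E\left(\exp(i\theta X_1)\,\phi\left(\theta\exp(-\alpha Z_1), \alpha\right) \mid X_1, Z_1\right)\right) \\
&= E\left(\exp(i\theta X_1)\,\phi\left(\theta\exp(-\alpha Z_1), \alpha\right)\right).
\end{aligned}$$

Using the Taylor development for characteristic functions (see Feller (1968) App. Sec. XV.5 and Breiman (1992) Prop. 8.44) applied to $\phi(\theta, \alpha)$ and $\phi_\alpha(\theta)$, together with the moment conditions implied by assumptions ED1 (or ED1′) to ED3, we arrive at the expression stated for $\psi_\alpha(\theta)$.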
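The distributional identity behind this proof, $D(\alpha) \overset{D}{=} X_1 + \exp(-\alpha Z_1)D_1(\alpha)$ with $D_1(\alpha)$ independent of $(X_1, Z_1)$, also pins down the low-order moments of $D(\alpha)$ directly. The sketch below illustrates this in the simplified case where $X_1$ is also independent of $Z_1$ (an assumption made here, not required by the lemma); the function name `perpetuity_mean_var` and the arguments `m1` $= E e^{-\alpha Z_1}$ and `m2` $= E e^{-2\alpha Z_1}$ are hypothetical notation for this illustration.

```python
def perpetuity_mean_var(mean_x, second_moment_x, m1, m2):
    """Mean and variance of D solving D = X + exp(-alpha * Z) * D1 in law.

    Assumes X, Z and D1 are mutually independent, with
    m1 = E exp(-alpha * Z) and m2 = E exp(-2 * alpha * Z).
    Taking expectations of D and of D^2 gives two linear equations:
        E D   = E X + m1 * E D
        E D^2 = E X^2 + 2 * E X * m1 * E D + m2 * E D^2
    """
    mean_d = mean_x / (1.0 - m1)
    second_d = (second_moment_x + 2.0 * mean_x * m1 * mean_d) / (1.0 - m2)
    return mean_d, second_d - mean_d ** 2


# Illustrative inputs: exponential X with mean lam, exponential Z with mean gamma,
# for which E exp(-s * Z) = 1 / (1 + s * gamma).
lam, gamma, alpha = 1.0, 0.5, 1.0
m1 = 1.0 / (1.0 + alpha * gamma)
m2 = 1.0 / (1.0 + 2.0 * alpha * gamma)
mean_d, var_d = perpetuity_mean_var(lam, 2.0 * lam ** 2, m1, m2)
print(mean_d, var_d)
```

For these exponential inputs the mean works out to $\lambda/(\alpha\gamma) + \lambda$, so $\sqrt{\alpha}\,(E D(\alpha) - \lambda/(\gamma\alpha)) = \sqrt{\alpha}\,\lambda$, which agrees with the coefficient $\mu_Z^{(2)}\lambda/(2\gamma^2)$ of $i\theta\alpha^{1/2}$ in Lemma 3 since $\mu_Z^{(2)} = 2\gamma^2$ here.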


Lemma 4 Under assumptions ED1 (or ED1′) to ED4, $\phi(\theta, \alpha) \triangleq E\exp(i\theta D(\alpha))$ satisfies

$$|\phi(\theta, \alpha)| = o\left(\alpha^{1/2}\right)$$

as α → 0 uniformly in θ over compact sets not containing the origin.

Proof. Let φX (θ, Z1) = E¡eiθX1

¯Z1¢, and let Tα = inf{k : Sk > 1/α}. Then,

|φ (θ,α)| =¯¯EÃE

Ãexp

Ãiθ

∞Xk=1

Xk exp (−αSk−1)!¯¯Z!!¯

¯=

¯E¡Π∞k=1φX

¡θe−αSk−1 , Zk

¢¢¯≤ E

¡Π∞k=1

¯φX¡θe−αSk−1, Zk

¢¯¢≤ E

¡ΠTα−1k=1

¯φX¡θe−αSk−1 , Zk

¢¯¢≤ E

¡ΠTα−1k=1 |∆ (θ, Zk)|

¢,

where ∆ (θ, Z1) = sup{|φX (θ∗, Z1)| : |θ∗| > |θe−1|}. Since the distribution of X1given Z1 is non-lattice, we must have that 0 < ∆ (θ, Z1) < 1. So,

|φ (θ,α)| ≤ E¡ΠTα−1k=1 |∆ (θ, Zk)|

¢≤ P

µα

¯Tα − 1

αγ

¯> ε

¶+E

¡ΠTα−1k=1 |∆ (θ, Zk)| ;α |Tα − 1/αγ| ≤ ε

¢≤ P

µα

¯Tα − 1

αγ

¯> ε

¶+E

³|∆ (θ, Z1)|1/α(1/γ−ε)−1

´.

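Since condition ED1 (or ED1′) implies that $0 < EZ_1 < \infty$ and $Var(Z_1) < \infty$, we have that $\left(\alpha^{1/2}\left|T_\alpha - \frac{1}{\alpha\gamma}\right|\right)^2$ is uniformly integrable (see Gut (1988) p. 92). In particular, this implies, using Chebyshev's inequality, that

$$P\left(\alpha\left|T_\alpha - \frac{1}{\alpha\gamma}\right| > \varepsilon\right) = O(\alpha).$$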

Finally, if we choose ε > 0 small enough so that c , 1/γ − ε > 0, we must show (for

θ not in a neighborhood of the origin) that

E³|∆ (θ, Z1)|c/α

´= o

¡√α¢.


Let W = − log (|∆ (θ, Z1)|) and β = c/α. Then,

E³|∆ (θ, Z1)|β

´= E (exp (−βW )) =

Z ∞

0

exp (−u)P (u/β > W ) du.

Thus,

βE³|∆ (θ, Z1)|β

´=

Z ∞

0

exp (−u)βP (u/β > W ) du.

Fix ε > 0 and write

βE³|∆ (θ, Z1)|β

´=

Z ε

0

exp (−u) βP (u/β > W ) du

+

Z ∞

ε

u exp (−u)β/uP (u/β > W ) du (10)

≤ βP (ε/β > W ) +

Z ∞

ε

u exp (−u)β/uP (u/β > W ) du.

We want to apply Fatou’s Lemma in the form

limβ−→∞

Z ∞

ε

u exp (−u)β/uP (u/β > W ) du

≤Z ∞

ε

limβ−→∞u exp (−u)β/uP (u/β > W ) du.

In order to do this, we must show that

0 ≤ β/uP (u/β > W ) ≤M

for some $M > 0$, for $u \in [\varepsilon, \infty)$ and $\beta$ large. So, by right continuity and the existence of left limits, it suffices to show that

$$\lim_{h\to 0}\frac{P(h > W)}{h} < \infty.$$

But

limh−→0P (h > W )

h= limh−→0

P (h > − log (|∆ (θ, Z1)|))h

= limh−→0P (exp(−h) < |∆ (θ, Z1)|)

h

= limh−→0P (|∆ (θ, Z1)| > 1− h)

h<∞,


by virtue of assumption ED4. This is what we require in order to apply Fatou's lemma. Consequently, we have

$$\lim_{\beta\to\infty}\beta\,E\left(|\Delta(\theta, Z_1)|^\beta\right) < \infty,$$

which implies

$$\lim_{\beta\to\infty}\sqrt{\beta}\,E\left(|\Delta(\theta, Z_1)|^\beta\right) = 0,$$

and this is what we needed to conclude the proof of the lemma.

We are now ready to prove Theorem 3.

Proof of Theorem 3. The proof of this theorem follows closely the steps of Feller (1968) p. 512. To simplify the notation, let us consider $E(X_1) = 0$ and $E(X_1^2) = 2\gamma$ and the $X_k$'s independent of the $Z_k$'s (as we shall see from the proof, these are just simplifying assumptions and the adaptation of the present proof is straightforward using the corresponding local expansion given in Lemma 3). Let

γ (θ) = bG (θ) = e−θ2/2µ1 + (iθ)3κ(3)X

18γ

√α

¶. Esséen’s lemma applies here since

G (x) = Φ (x)− κ(3)X

18

√a¡x2 − 1¢ η (x)

is bounded by some constant C. Also γ (0) = 1 and γ0 (0) = 0. Therefore,

|Fα (x)−G (x)| ≤ 1

π

Z T

−T

1

|θ|¯φ¡√

αθ,α¢− γ (θ)

¯dθ +

24C

πT.

Let T =M/√α, for some M > 0 big. Then, for any δ > 0 small, we have

|Fα (x)−G (x)| ≤ I1 + I2 + I3 +√α24C

πM,

where

I1 =1

π

Z δ/√α

−δ/√α

1

|θ|¯φ¡√

αθ,α¢− γ (θ)

¯dθ,

I2 =1

π

Z M/√α

δ/√α

1

|θ|¯φ¡√

αθ,α¢− γ (θ)

¯dθ,

I3 =1

π

Z δ/√α

−M/√α

1

|θ|¯φ¡√

αθ,α¢− γ (θ)

¯dθ.


Observe that

I2 ≤ 1

π

Z M/√α

δ/√α

1

|θ|¯φ¡√

αθ,α¢¯dθ +

1

π

Z M/√α

δ/√α

1

|θ| |γ (θ)| dθ

=1

π

Z M

δ

1

|θ| |φ (θ,α)| dθ +1

π

Z M/√α

δ/√α

1

|θ| |γ (θ)| dθ.

By virtue of our previous lemma, it is clear that I2 goes to zero faster than√α,

similarly for I3. Thus, we just have to study I1. Let

ζ (θ,α) , log (φ (θ,α)) +θ22γ

2 (1−m (−2α))= log (φ (θ,α)) +

θ2γ

(1−m (−2α))where m (−λ) = E ¡e−λZ1¢. Hence, we can write

I1 =1

π

Z δ/√α

−δ/√α

1

|θ|¯φ¡√

αθ,α¢− γ (θ)

¯dθ

=1

π

Z δ/√α

−δ/√α

1

|θ|¯exp

µζ¡√

αθ,α¢− θ2γ

(1−m (−2α))¶− γ (θ)

¯dθ

=1

π

Z δ/√α

−δ/√α

1

|θ|e−θ2/2

¯¯e³ζ(√αθ,α)− θ2

2 (αγ

(1−m(−2α))−1)´− 1− (iθ)

3 µ3√α

18

¯¯ dθ.

Using Feller (1968), p. 507, we have that for any eβ1 and eβ2 complex numbers,¯eeβ1 − 1− eβ2 ¯ ≤ µ¯eβ1 − eβ2 ¯+ 12eβ22

¶exp (υ) , (11)

where υ ≥ max³¯eβ1 ¯ , ¯eβ2 ¯´ . Given ε > 0, we can choose δ > 0 small enough so that

|θ√α| < δ (as in Feller (1968), p. 507) and¯¯ζ ¡θ√α,α¢− α3/2 (iθ)3 κ

(3)X

3! (1−m (−3α))

¯¯ ≤ ε

θ3α3/2

|(1−m (−3α))| ≤ εKθ3α1/2

for α small enough and some constant K1 independent of α (becauseα3/2κ

(3)X

(1−m(−3α)) is the

cumulant of order 3 for the random variable√αD (α)). At the same time, δ can also

be chosen satisfying ¯ζ¡θ√α,α

¢¯<1

2

γαθ2

(1−m (−2α)) ≤K2

3θ2


for some K2 ≤ 1 for α small enough. Now, δ can be chosen also with the propertythat ¯

¯ α3/2 (iθ)3 κ(3)X

3! (1−m (−3α))

¯¯ < K2

3θ2.

Notice that ¯¯e³ζ(√αθ,α)− θ2

2 (αγ

(1−m(−2α))−1)´− 1− (iθ)

3 κ(3)X

18

¯¯

≤¯¯e³ζ(√αθ,α)− θ2

2 (αγ

(1−m(−2α))−1)´− 1− α3/2 (iθ)3 κ

(3)X

3! (1−m (−3α))

¯¯+¯

¯ α3/2 (iθ)3 κ(3)X

3! (1−m (−3α)) −(iθ)3 κ

(3)X

18

√α

¯¯ ,

and observe that ¯¯ α3/2 (iθ)3 κ

(3)X

3! (1−m (−3α)) −(iθ)3 κ

(3)X

18

√α

¯¯ ≤ √αθ3o (1) .

Finally, we apply inequality (11) with eβ1 = ζ (√αθ,α) − θ2

2

³αγ

(1−m(−2α)) − 1´andeβ2 = α3/2(iθ)3κ

(3)X

3!(1−m(−3α)) for δ > 0 small enough so that

I1 ≤ ε

πκ1√α

Z ∞

−∞θ2e−θ

2/6dθ +α

πK21

Z ∞

−∞e−θ

2/6θ6dθ +

√α

πo (1)

Z ∞

−∞|θ|3 e−θ2/6dθ.

Hence we conclude that

lim supα→0

1√αsupx|Fα (x)−G (x)| ≤ εκ,

for some constant κ. Since ε was arbitrary, this concludes the proof of the theorem.

5.4.2 The continuous time setting

A popular model in the risk theory setting discussed in Section 2 consists of taking the processes Γ and Λ to be two independent Levy processes (i.e., two stationary independent increment processes; see Gjessing and Paulsen (1992)). The stationary independent increment assumption on the risk process Λ has been argued to hold by several authors in the risk theory community (this setting includes the so-called classical risk model; see Asmussen (2001) and Grandell (1991)). On the other hand, in finance, short rate processes are usually modelled as positive functions of a Markov process (typically with mean reverting characteristics). This motivates the following setting in which we develop the desired Edgeworth expansion.

Suppose that $\Lambda = (\Lambda(t) : t \ge 0)$ is a Levy process. In addition, let $Y = (Y(s) : s \ge 0)$ be a homogeneous Markov process taking values in a Polish space $\Xi$ and let $\mathcal{B}(\Xi)$ be the Borel sigma-field in $\Xi$. Let $P(t, y, B)$ ($t \in \mathbb{R}_+$, $y \in \Xi$ and $B \in \mathcal{B}(\Xi)$) be the corresponding transition probability function. Assume that $Y$ satisfies the Feller condition (i.e. $P(t, y, B_\delta(x)) \to 1$ as $t \searrow 0$, for all $\delta > 0$) and that the mapping $y \to E_y f(Y_t)$ is continuous for all $f(\cdot) \in C(\Xi)$ (the space of continuous functions taking values on $\Xi$). Let $A$ be the associated infinitesimal generator of the process $Y$, defined via the relation

$$Af(y) = \lim_{t\downarrow 0}\frac{E_y f(Y(t)) - f(y)}{t},$$

where $f \in C(\Xi)$. The domain $D(A)$ of $A$ is composed of those functions $f \in C(\Xi)$ for which the previous limit exists (uniformly, for all $y \in \Xi$) (see Skorohod, Hoppensteadt and Salehi (2002)). In addition, suppose that $Y(\cdot)$ has right continuous with left limits sample paths and that it is geometrically ergodic (see Kontoyiannis and Meyn (2003), p. 9).

The following set of assumptions are in force throughout this section.

EC1 Λ and Γ are independent and the distribution of Λ (1) is non-lattice.

EC2 Suppose $Y$ is geometrically ergodic (see Kontoyiannis and Meyn (2003) p. 9). Suppose that $\tilde{\gamma}(\cdot) : \Xi \to \mathbb{R}$ is a continuous mapping such that $\tilde{\gamma}(x) > 0$ for all $x \in \Xi$ and define $\Gamma$ as

$$\Gamma(t) = \int_0^t \tilde{\gamma}(Y(s))\,ds.$$

CHAPTER 5. APPROXIMATING DISCOUNTED REWARDS 117

Under EC1 and EC2, we shall provide rigorous support for the approximation

$$P(D \le y) \approx P\left(N\left(\lambda/\gamma, \chi^{(2)}(0)/2\right) \le y\right) - \sqrt{\gamma}\,\frac{\lambda}{\gamma}F(y_0)\,\eta\!\left((y - \lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right) - \frac{\sqrt{\gamma}}{18}\,\chi^{(3)}(0)\,H\!\left((y - \lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right), \tag{12}$$

where (if $\pi(dy)$ denotes the stationary distribution of $Y$) $F$ can be characterized as the solution of the Poisson equation

$$AF = E_\pi\gamma(Y(1)) - \gamma(y),$$

and $\chi(\cdot)$ depends on the log-moment generating function of $\Lambda$ and the Perron-Frobenius eigenvalue associated with the cumulative Markov reward $\Gamma$. More precisely, for every $\theta \in \mathbb{R}$ consider the (unique) solution pair $(u(y, \theta), \psi_\Gamma(\theta))$ (such that $u(y, 0) = 1$) satisfying

$$(Au)(y, \theta) = \left(\psi_\Gamma(\theta) - \theta\tilde{\gamma}(y)\right)u(y, \theta). \tag{13}$$

(Note that the geometric ergodicity guarantees existence and uniqueness of the solution pair $(u, \psi_\Gamma)$; see Kontoyiannis and Meyn (2003).) Let $\psi_\Lambda(i\theta) = \log E\exp(i\theta\Lambda(1))$ (we work with the branch $\{\arg(z) \in [0, 2\pi)\}$ when operating with complex logarithms); then $\chi(i\theta) = -\psi_\Gamma^{-1}(-\psi_\Lambda(i\theta))$ (note that $\chi'(0) = \lambda/\gamma$). Just as in the discrete time case, the approximation (12) will be supported in the context of small interest rates for a suitably parameterized family of discounted rewards. In particular, we shall prove that the approximation

$$P\left(\sqrt{\alpha}\left(D(\alpha) - \lambda/(\gamma\alpha)\right) \le y\right) \approx P\left(N\left(0, \chi^{(2)}(0)/2\right) \le y\right) - \sqrt{\gamma\alpha}\,\frac{\lambda}{\gamma}F(y_0)\,\eta\!\left(y\sqrt{2/\chi^{(2)}(0)}\right) - \frac{\sqrt{\gamma\alpha}}{18}\,\chi^{(3)}(0)\,H\!\left(y\sqrt{2/\chi^{(2)}(0)}\right)$$

holds with an error of order $o(\sqrt{\alpha})$ (uniformly in $y$), where

$$D(\alpha) = \int_0^\infty \exp(-\alpha\Gamma(t))\,d\Lambda(t).$$

(Note that the previous integral can be interpreted, via integration by parts, path by path as a Lebesgue-Stieltjes integral.)


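Theorem 4 Suppose that EC1 and EC2 hold. Then,

$$P\left(\sqrt{\alpha}\left(D(\alpha) - \chi'(0)/\alpha\right) \le y\right) = P\left(N\left(0, \chi^{(2)}(0)/2\right) \le y\right) - \sqrt{\gamma\alpha}\,F(y_0)\,\eta\!\left(y\sqrt{2/\chi^{(2)}(0)}\right) - \frac{\sqrt{\gamma\alpha}}{18}\,\chi^{(3)}(0)\,H\!\left(y\sqrt{2/\chi^{(2)}(0)}\right) + G_\alpha([-\infty, y));$$

where $G_\alpha$ represents a signed measure with $G_\alpha^+(\mathbb{R}) + G_\alpha^-(\mathbb{R}) \triangleq \|G_\alpha\| = o(\sqrt{\alpha})$.

The proof of the previous theorem parallels its corresponding discrete time analogue described in the previous section. We first obtain a local description of $\psi_\alpha(\theta) = \log E\exp\left(i\theta\sqrt{\alpha}\left(D(\alpha) - \lambda/(\gamma\alpha)\right)\right)$.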

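Lemma 5 Under assumptions EC1 and EC2 we have that

$$\psi_\alpha(\theta) = -\frac{\chi^{(2)}(0)}{2}\theta^2 + \sqrt{\alpha}\left(\frac{\chi^{(3)}(0)}{18}(i\theta)^3 - \frac{\lambda}{\gamma}F(y_0)\,i\theta\right) + o\left(\sqrt{\alpha}\right)$$

(uniformly in θ ∈ (−δ, δ), δ > 0).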

Proof. It is known that for every u ∈ D (A) such that inf x∈Ξ|u (x) | > 0 we havethat

Mt (z) =u (Y (t) , θ)

u (Y0, θ)exp

µ−Z t

0

µAu

u

¶(Y (s) , θ) ds

¶(14)

is a Martingale with respect to the filtration generated by Y (see Lemma 2, p. 82

of Skorohod, Hoppensteadt and Salehi (2001)). Since Y is geometrically ergodic it

follows that the generalized eigenvalue problem

(Au) (y, θ) = (ψΓ (θ)− θeγ (y))u (y, θ) , u (y, 0) = 1 (15)

has a unique solution pair (u (y, θ) ,ψΓ (θ)) for every θ ∈ R. In addition, infθ∈Ξ u (y, θ) >0 for all θ ∈ R and ψΓ (·) is a strictly increasing function (since

ψΓ (θ) = limt→∞

1

tlogE exp (θΓ (t)) ).


Observe that the solution to (15) automatically provides the solution to the problem

1eγ (y) (Au) (y, θ) =

µψΓ (θ)eγ (y) − θ

¶u (y, θ)

=

µ−ψ−1Γ (−ν)− νeγ (y)

¶u¡y,ψ−1Γ (−ν)¢ ,

(where ν = −ψΓ (θ)). In addition, Proposition 4.8 of Kontoyiannis and Meyn (2003)

states that for each θ ∈ Ξ, both u (y, ·) and ψΓ (·) are analytic inN ={z ∈ C : |z| ≤ δ}for some δ > 0 (which immediately implies the analyticity of ζ (·) = −ψ−1Γ (−·)) andinf x∈Ξ,z∈N |u (x, z) | > 0. Note that the Markov process eY = ³eY (t) : t ≥ 0´ definedas eY (t) = Y (Γ−1 (t)) is also a geometrically ergodic Markov process with generatoreA = 1eγA (the reason is that eγ being continuous and positive implies infx∈Ξ eγ (x) > 0,which yields that the Lyaponuv bound needed in the definition of geometric ergodic-

ity is immediately satisfied after scaling factors (see Kontoyiannis and Meyn (2003)

p. 9). Therefore, by considering the Markov generator ∂t + eA and the function

u (y,ψΛ(iθe−αt)), (for θ ∈ R with |θ| < δ) in the relation (14) we can build the

Martingales

Mt (iθ) =u³eY (t) ,−χ (iθe−αt)´u (Y0,−χ (iθe−αt)) exp

Z t

0

ψΛ (iθe−αt)eγ ³eY (t)´ dt−

Z t

0

χ¡iθe−αt

¢dt

exp

−α Z t

0

iθe−αtuθ³eY (t) ,−χ (iθe−αt)´

u³eY (t) ,−χ (iθe−αt)´ χ

¡iθe−αt

¢dt

.Note thatMt (iθ) is a bounded martingale (in particular, uniformly integrable). Thus

it possesses a last element M∞ (iθ), which implies that

exp

µZ ∞

0

χ¡iθe−αt

¢dt

¶u (Y0, iθ) = E exp

Z ∞

0

ψΛ (iθe−αt)eγ ³eY (t)´ dt− ξ (α, iθ)

,where

ξ (α, iθ) = α

Z t

0

iθe−αtuθ³eY (t) ,−χ (iθe−αt)´

u³eY (t) ,−χ (iθe−αt)´ χ

¡iθe−αt

¢dt.


Therefore, we conclude that

exp

µZ ∞

0

¡χ¡√

αiθe−αt¢−√αiθe−αtλ/γ¢ dt¶u ¡Y0,−χ ¡√αiθ¢¢

= E exp

Z ∞

0

ψΛ (√αiθe−αt)eγ ³eY (t)´ dt− iθ λ

γ√α− ξ

¡α,√αiθ¢

= E exp

Z ∞

0

ψΛ (√αiθe−αt)eγ ³eY (t)´ dt− iθ λ

γ√α

+ o ¡√α¢ (16)

(uniformly in θ ∈ (−δ, δ)). The previous equality follows from the fact that

Eξ¡α,√αiθ¢=√αiθα

Z ∞

0

e−αtuθ³eY (t) ,−χ (√αiθ)´

u³eY (t) ,−χ (√αiθ)´ χ

¡√αiθe−αt

¢dt

=√αiθ

λ

γEα

Z ∞

0

e−αtuθ³eY (t) , 0´ dt+O (α) ,

and (using Theorem 1) in combination with the bounded convergence theorem) it

follows that

αE

Z ∞

0

e−αtuθ³eY (t) , 0´ dt * Euθ (Y (∞) , 0) = EπF (Y (1)) = 0

(since uθ (y, 0) = F (y)). On the other hand, notice that

E exp (iθD (α)) = E

µE

µexp

µiθ

Z ∞

0

exp (−αΓ (t)) dΛ (t)¶¯

Γ

¶¶= E exp

µZ ∞

0

ψΛ (iθ exp (−αΓ (t))) dt¶

= E exp

Z ∞

0

ψΛ (iθe−αu)eγ ³eY (t)´ du

. (17)

Combining expressions (16) and (17) with a Taylor expansion of $\chi(\cdot)$ and $u(Y_0, \cdot)$ yields the conclusion of the lemma.

The proof of Theorem 4 can be completed along the same lines as in the discrete time case after showing that $\phi(\theta, \alpha) \triangleq E\exp(i\theta D(\alpha))$ goes to zero fast enough for $|\theta| \in (w_0, w_1)$ for any $0 < w_0 < w_1 < \infty$.


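Lemma 6 Suppose that EC1 and EC2 are in force. Then $\phi(\theta, \alpha) \triangleq E\exp(i\theta D(\alpha))$ satisfies

$$\sup_{|\theta| \in (\theta_0, \theta_1)} |\phi(\theta, \alpha)| = o\left(\sqrt{\alpha}\right),$$

for all $0 < \theta_0 < \theta_1 < \infty$.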

Proof. We proceed as in the discrete time case, first we write

|φ (θ,α)| =¯E exp

µZ ∞

0

ψΛ (iθ exp (−αΓ (t))) dt¶¯

≤ E

¯exp

µZ ∞

0

ψΛ (iθ exp (−αΓ (t))) dt¶¯

(note that $\psi_\Lambda(i\cdot)$ is well defined except for at most countably many values; in those cases we can assign the value $-\infty$, and that will not affect the value of the integral above). The proof now follows just as in the discrete time case, by splitting the integral up to $\Gamma^{-1}(1/\alpha)$ and using the non-lattice property of the distribution of $\Lambda(1)$. In fact, since $1/(\alpha\sup_{x\in\Xi}\tilde{\gamma}(x)) \le \Gamma^{-1}(1/\alpha)$, we can actually obtain an exponential rate of convergence instead of the rate $o(\alpha^{1/2})$.

Remarks

a) The assumption that Ξ is compact does not really play an essential role. It was only used to ensure the martingale property of $M_t(i\theta)$ in the proof of Lemma 5. A local description for $\psi_\alpha(i\theta)$ could also have been obtained by computing the moments of $D(\alpha)$, which is relatively easy in the present setting.

b) The independence between Γ and Λ can also be relaxed. For example, one could have assumed that both processes are conditionally independent given another Markov process, say Z, provided that Λ remains a (possibly non-time homogeneous) Levy process satisfying a suitable non-lattice conditional distribution assumption analogous to condition ED3 in the previous subsection.

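c) Following the same ideas as in Lemma 5, a local expansion for $\psi_\alpha(\theta)$ can be obtained for the case in which

$$D(\alpha) = \int_0^\infty \exp\left(-\alpha\int_0^t \tilde{\gamma}(Y(s))\,ds\right)\tilde{\lambda}(Y(t))\,dt$$

(where $\tilde{\lambda}$ is, say, continuous on the compact Polish space $\Xi$). In this case, the corresponding generalized eigenvalue problem takes the form

$$\frac{1}{\tilde{\gamma}}(Au)(y, \theta) = \left(\chi(\theta) - \frac{\tilde{\lambda}(y)}{\tilde{\gamma}(y)}\right)u(y, \theta), \qquad u(y, 0) = 1,$$

and a formal corrected approximation can be written as

$$P(D \le y) \approx P\left(N\left(\lambda/\gamma, \chi^{(2)}(0)/2\right) \le y\right) - \sqrt{\gamma}\,u_\theta(y_0, 0)\,\eta\!\left((y - \lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right) - \frac{\sqrt{\gamma}}{18}\,\chi^{(3)}(0)\,H\!\left((y - \lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right).$$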

The only step required to make the previous approximation rigorous is to show that for all $0 < \theta_0 < \theta_1 < \infty$, $\sup_{|\theta| \in (\theta_0, \theta_1)} |\phi(\theta, \alpha)| = o(\sqrt{\alpha})$ as in Lemma 6. This essentially involves assuming enough structure to ensure strongly non-lattice properties of D. We have chosen Levy processes in our exposition because they provide a convenient framework to easily verify, from the model primitives, the non-lattice conditions that yield the described Edgeworth expansion.

5.5 Large Deviations

To fix ideas, let us begin by considering the same setting under which we derived our LLN in Section 3. In the previous section, we derived accurate approximations for the distribution of D (in the iid setting) for small interest rates when D is close to its typical value (according to the LLN this implies looking at D close to λ/γ). In a number of applications (including those discussed in Section 2 regarding time series analysis and risk theory), one is often interested in computing P (D > x) for x suitably large. In particular, these types of applications motivate interest in the analysis of the tail probability P (D > x) for x >> λ/γ. As we shall see, under certain exponential moment conditions on (Γ,Λ), the approximation proposed here will take the form

$$P(D > x) \approx \exp(-I(x)/\gamma), \tag{18}$$


where $I(x) > 0$ corresponds to the so-called rate function and will typically take the form $I(x) = x\theta^* - \int_0^\infty \chi(\theta^* e^{-s})\,ds$, where $\theta^*$ satisfies $\theta^* x = \chi(\theta^*)$ and $\chi(\cdot)$ is a suitably defined convex function. The goal of this section is to provide, under general conditions, rigorous justification (at least in a rough logarithmic sense) for the previous approximation. In addition, we will also explore, under additional structure, exact asymptotics (also known as precise large deviations).
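As a numerical illustration of the rate function just described, the sketch below solves $\theta^* x = \chi(\theta^*)$ by bisection and evaluates $\int_0^\infty \chi(\theta^* e^{-s})\,ds$ through the substitution $v = \theta^* e^{-s}$, which turns it into $\int_0^{\theta^*}\chi(v)/v\,dv$. Everything specific here is an assumption made for illustration: the function name `rate_function`, the midpoint quadrature, and the quadratic choice $\chi(\theta) = \lambda\theta + \sigma^2\theta^2/2$, for which the same two formulas give $I(x) = (x - \lambda)^2/\sigma^2$ in closed form. The bisection presumes $\chi(0) = 0$, $\chi$ convex, and $x$ above $\chi'(0)$ but within the range of $\chi(\theta)/\theta$.

```python
def rate_function(x, chi, steps=20000):
    """I(x) = x * theta - int_0^inf chi(theta * e^{-s}) ds at theta * x = chi(theta).

    Assumes chi(0) = 0, chi convex, and chi(theta) / theta crosses x for some
    theta > 0, so that g(theta) = chi(theta) / theta - x changes sign.
    """
    def g(theta):
        return chi(theta) / theta - x

    lo, hi = 1e-9, 1.0
    while g(hi) < 0.0:  # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)

    # Midpoint rule for int_0^theta chi(v) / v dv (never evaluates at v = 0).
    h = theta / steps
    integral = sum(chi((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps)) * h
    return x * theta - integral


# Hypothetical example: chi(theta) = lam * theta + sigma2 * theta^2 / 2.
lam, sigma2 = 1.0, 1.0
chi = lambda theta: lam * theta + 0.5 * sigma2 * theta * theta
x = 1.5
numeric = rate_function(x, chi)
closed_form = (x - lam) ** 2 / sigma2
print(numeric, closed_form)
```

The quadratic case is convenient as a sanity check because both the root $\theta^* = 2(x - \lambda)/\sigma^2$ and the integral are available explicitly; for other choices of $\chi$ only the numerical route is available.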

Applications in finance and risk theory motivate study of continuous time processes, including the case in which the processes Γ and Λ take the form

$$\Gamma(t) = \int_0^t \tilde{\gamma}(s)\,ds \quad \text{and} \quad \Lambda(t) = \int_0^t \tilde{\lambda}(s)\,ds,$$

where, for all $s$, $\tilde{\gamma}(s) > 0$ represents the "short rate" process and $\tilde{\lambda}(s)$ represents the reward rate. Also, other applied contexts such as the analysis of ARCH processes in time series motivate study of the discrete time setting, in which

$$\Gamma(t) \triangleq \sum_{k=1}^{\lfloor t\rfloor} Z_k \quad \text{and} \quad \Lambda(t) \triangleq \sum_{k=1}^{\lfloor t\rfloor} X_k,$$

and $(X_k, Z_k)_{k\ge 0}$ is a (typically stationary) sequence of two dimensional random vectors with the property that $Z_k > 0$ for all $k \ge 0$.

In order to provide rigorous justification for the approximation (18), we shall consider

$$\alpha D(\alpha) = \alpha\int_{[0,\infty)}\exp(-\alpha\Gamma(t-))\,d\Lambda(t) = \alpha\int_{[0,\infty)}\exp(-u)\,\Lambda\left(\Gamma^{-1}(u/\alpha)\right)du, \tag{19}$$

and study $P(\alpha D(\alpha) > x)$ for $x > \lambda$. Note that the previous identity holds in general provided that $\Gamma(\cdot)$ is non-decreasing and $\Lambda(\cdot)$ has RCLL sample paths. In other words, (19) may hold even if $\Lambda$ does not have bounded variation. Expression (19) suggests a natural strategy to derive a LDP for $\{\alpha D(\alpha)\}_{\alpha>0}$ as $\alpha \searrow 0$; namely, to apply the contraction principle (under appropriate sample path large deviations assumptions on $\alpha(\Gamma(\cdot/\alpha), \Lambda(\cdot/\alpha))$) to the mapping $\Psi : D[0,\infty) \times D[0,\infty) \to \mathbb{R}$ defined as

$$\Psi(x, y) = \int_0^\infty \exp(-t)\,y\left(x^{-1}(t)\right)dt.$$

Actually, we will follow more or less this idea, although with important modifications arising due to the fact that $\Psi$ is not continuous. Indeed, if we consider the map $\Psi_1$, acting on $D[0,\infty)$ endowed with the Skorohod $J_1$ topology (see Whitt (2001)) and defined as $\Psi_1(x) = \int_0^\infty \exp(-t)\,x(t)\,dt$, then we can see, aside from the fact that $\Psi_1$ is not well defined for every element in $D[0,\infty)$, that $\Psi_1$ is discontinuous at every single point. In order to see this, just consider the sequence of functions $(x_n : n \ge 1)$, defined as $x_n(t) = e^n I(n \le t < n+1)$, and note that $x_n \to 0$ while $\Psi_1(x_n) = 1 - e^{-1}$. (This example was given by Whitt (1972); that $\Psi_1(\cdot)$ is discontinuous at every element of $D[0,\infty)$ follows by linearity of $\Psi_1$.)

The idea, then, is to restrict the domain of $\Psi_1$ to a proper subspace of $D[0,\infty)$, endowed with a finer topology under which $\Psi_1(\cdot)$ is continuous. This idea will be studied in detail in the next subsection, in which we treat the continuous setting. Later, we will return to the discrete setting.

5.5.1 The continuous time setting

We will restrict the domain of $\Psi_1$ to the subspace

\[ L_\beta[0,\infty) \triangleq \Big\{x \in C[0,\infty) : \lim_{t\to\infty}\Big|\frac{x(t)}{t^\beta}\Big| = 0\Big\}, \]

for some $\beta > 0$, with the topology generated by the weighted norm

\[ \|x\|_\beta = \sup_{t\ge 0}\frac{|x(t)|}{1 + t^\beta}. \]

Whitt (1972) proved that $\Psi_1$ is continuous on $\big(L_\beta[0,\infty), \|\cdot\|_\beta\big)$, which suggests using the contraction principle on this space. The following proposition constitutes an intermediate step in this direction.

Proposition 5 Suppose that the family of processes $\alpha(\Gamma(\cdot/\alpha), \Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a LDP on $C[0,\infty) \times C[0,\infty)$ (endowed with the product topology generated by uniform convergence on compact sets, also known as Stone's topology) with a good rate function $I(x, y)$. Then, $R_\alpha(\cdot) = \alpha\Lambda(\Gamma^{-1}(\cdot/\alpha))$ satisfies a LDP on $C[0,\infty)$ (endowed with Stone's topology) with good rate function $I'(z) = \inf\{I(x, y) : z = y \circ x^{-1}\}$.

Proof. This is a direct consequence of the contraction principle (see Theorem 4.2.1, p. 126 of Dembo and Zeitouni (1999)) and the fact that the mapping $(x, y) \to y \circ x^{-1}$ is continuous in the topological spaces described (see Whitt (2001), Theorem 13.2.2, p. 430).

At this point, one may be tempted to invoke, once again, the contraction principle in combination with Proposition 5 to obtain the desired LDP. However, in order to proceed with this program, we must show that the LDP developed in Proposition 5 actually holds on $\big(L_\beta[0,\infty), \|\cdot\|_\beta\big)$ (since, in order to apply the contraction principle, the continuity of $\Psi_1$ must be compatible with the topology under which the original LDP was derived). In order to show the LDP on $\big(L_\beta[0,\infty), \|\cdot\|_\beta\big)$ we will need to show that the random elements $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ are exponentially tight (see Dembo and Zeitouni (1998)). (This type of reasoning parallels similar arguments in the context of weak convergence theory and the important role that tightness plays in this theory.) Recall that a sequence of probability measures $P_n$ is said to be exponentially tight if for every $a > 0$ there exists a compact set $K_a$ such that

\[ \limsup_{n\to\infty}\frac{1}{n}\log P_n(K_a^c) \le -a, \]

or, if the $P_n$'s take values on subsets of a Polish space, then the $P_n$'s are exponentially tight if for every $\varepsilon > 0$ there exists a compact set $K_\varepsilon$ such that, for all $n \ge 1$,

\[ \varepsilon^n > 1 - P_n(K_\varepsilon) \]

(see Zajic (1993) p. 11). In view of these observations, we must characterize exponential tightness in $L_\beta[0,\infty)$. This is the aim of the following lemma.

Lemma 7 Consider a sequence of probability measures $(P_n : n \ge 1)$ on $L_\beta[0,\infty)$ (such that $P_n\{x : x(0) = 0\} = 1$), acting on the Borel sigma-field corresponding to the topology generated by the norm $\|\cdot\|_\beta$. Then, $(P_n : n \ge 1)$ is exponentially tight if and only if $(P_n : n \ge 1)$ is exponentially tight under the (relative) Stone topology and, for each $\delta > 0$,

\[ \limsup_{n\to\infty}\frac{1}{n}\log P_n\Big(x : \sup_{t\ge t_0}\frac{|x(t)|}{t^\beta} > \delta\Big) \to -\infty \quad\text{as } t_0 \nearrow \infty. \tag{20} \]

Proof. Lemma 3.3 of Whitt (1972) establishes that relatively compact sets in $\big(L_\beta[0,\infty), \|\cdot\|_\beta\big)$ are those sets $B$ with compact closure under the relative Stone topology, and satisfying

\[ \lim_{t\to\infty}\sup_{x\in B}\frac{|x(t)|}{t^\beta} = 0. \]

Also, recall that (if $x(0) = 0$ a.s. with respect to each $P_n$) for exponential tightness under Stone's topology, it is necessary and sufficient (see Feng and Kurtz (2000), p. 30) that, for each $\varepsilon, T > 0$,

\[ \limsup_{n\to\infty}\frac{1}{n}\log P_n\big(x : \omega(x, \delta, T) > \varepsilon\big) \to -\infty \quad\text{as } \delta \searrow 0, \tag{21} \]

where $\omega(x, \delta, T)$ is the modulus of continuity of $x$, on the interval $[0, T]$, evaluated at $\delta$. We now show that if conditions (20) and (21) are satisfied, then the sequence $(P_n : n \ge 1)$ is exponentially tight. Pick $\lambda > 0$, choose $\delta_k$ so that

\[ P_n\big(x : \omega(x, \delta_k, T) > 1/k\big) \le e^{-n\lambda}/2^{k+1}, \]

and let $B_k = \{x : \omega(x, \delta_k, T) \le 1/k\}$. Also, pick $t_k$ so that

\[ P_n\Big(x : \sup_{t\ge t_k}\frac{|x(t)|}{t^\beta} > 1/k\Big) \le e^{-n\lambda}/2^{k+1}, \]

and let $C_k = \{x : \sup_{t>t_k}|x(t)|/t^\beta \le 1/k\}$. Consider the closure, $\bar{A}_\lambda$, of $A_\lambda = \cap_k (B_k \cap C_k)$. Note that

\[ 1 - P_n(\bar{A}_\lambda) \le 1 - P_n(A_\lambda) = P_n\big(\cup_k (B_k^c \cup C_k^c)\big) \le e^{-n\lambda}. \]

We claim that $A_\lambda$ is relatively compact (i.e. that $\bar{A}_\lambda$ is compact). To see this, choose $\varepsilon > 0$ and let $k_0 > 1/\varepsilon$. Then, for all $\delta < \delta_{k_0}$ we have that

\[ \sup_{x\in A_\lambda}\omega(x, \delta, T) < \varepsilon. \]

Similarly, for every $T > t_{k_0}$ we have that

\[ \varepsilon > \sup_{x\in A_\lambda}\sup_{t>T}\frac{|x(t)|}{t^\beta}, \]

which implies that

\[ \limsup_{t\to\infty}\sup_{x\in A_\lambda}\frac{|x(t)|}{t^\beta} \le \varepsilon \]

for all $\varepsilon > 0$. Thus, by virtue of the Arzela-Ascoli theorem (see Billingsley (1999) p. 81) and Lemma 3.3 of Whitt (1972), $A_\lambda$ is relatively compact, which concludes the argument for sufficiency. The necessity part is easier and follows just as in Feng and Kurtz (2000) p. 30. Therefore, it is omitted.

With the aid of the previous lemma, the exponential tightness of $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ follows easily.

Lemma 8 Suppose that $\alpha(\Gamma(\cdot/\alpha), \Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a full LDP with rate function $I(x, y)$ (under Stone's topology). (Recall that a full LDP means an LDP with convex good rate function.) Then,

a) The family $(\alpha\Gamma(\cdot/\alpha) - \gamma\cdot,\ \alpha\Lambda(\cdot/\alpha) - \lambda\cdot)_{\alpha>0}$ is exponentially tight in $L_1[0,\infty) \times L_1[0,\infty)$ with the product topology generated by the norm $\|\cdot\|_1$.

b) The class of random elements $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)) - \lambda\cdot/\gamma)_{\alpha>0}$ is exponentially tight in $(L_1[0,\infty), \|\cdot\|_1)$.

Remark The convexity of the rate function does not really play a role in this lemma; only the goodness of the rate function is required.

Proof. For part a), it suffices to show that $\alpha\Gamma(\cdot/\alpha) - \gamma\cdot$ and $\alpha\Lambda(\cdot/\alpha) - \lambda\cdot$ are both exponentially tight in $(L_1[0,\infty), \|\cdot\|_1)$. Since $\alpha\Lambda(\cdot/\alpha)$ satisfies a full LDP in $C[0,\infty)$ (under Stone's topology), which is a topological group (which implies that addition is a continuous operation), it follows from the contraction principle that $\alpha\Lambda(\cdot/\alpha) - \lambda\cdot$ also satisfies a full LDP. Note that $C[0,\infty)$, endowed with Stone's topology, is a Polish space. Thus, the existence of a full LDP guarantees the exponential tightness of $\alpha\Lambda(\cdot/\alpha) - \lambda\cdot$ under Stone's topology (see Dembo and Zeitouni (1999), p. 120 (c)). Therefore, we just have to prove condition (20). Note that for any $0 < a < b < \infty$, the mapping $x \to \sup_{t\in[a,b]}|x(t)/t|$ is continuous (under Stone's topology), which implies that the family $V_\alpha = \sup_{t\in[a,b]}|\alpha\Lambda(t/\alpha)/t - \lambda|$ satisfies an LDP with good rate function $J$, say. Hence, we can write

\begin{align*}
P\Big(\sup_{t>t_0}\Big|\frac{\alpha\Lambda(t/\alpha) - \lambda t}{t}\Big| \ge \delta\Big)
&\le \sum_{k=1}^\infty P\Big(\sup_{t\in t_0[k,k+1]}\Big|\frac{\alpha\Lambda(t/\alpha) - \lambda t}{t}\Big| \ge \delta\Big) \\
&\le \sum_{k=1}^\infty P\Big(\sup_{u = t/(kt_0)\in[1,2]}\Big|\frac{\alpha\Lambda(ukt_0/\alpha) - \lambda ukt_0}{ukt_0}\Big| \ge \delta\Big) \\
&= \sum_{k=1}^\infty \exp\big(-\big(J(\delta) + o_{kt_0/\alpha}(1)\big)kt_0/\alpha\big),
\end{align*}

where the subindex in $o_{kt_0/\alpha}(1)$ has been used just to emphasize that $o_{kt_0/\alpha}(1) \to 0$ as $kt_0/\alpha \to \infty$. So we can choose $k_0$ big enough so that for every $k > k_0$ we have $J(\delta) + o_{kt_0/\alpha}(1) > J(\delta)/2 > 0$. From these estimates it is easy to conclude that

\[ \limsup_{\alpha\to 0}\alpha\log P\Big(\sup_{t>t_0}\Big|\frac{\alpha\Lambda(t/\alpha) - \lambda t}{t}\Big| \ge \delta\Big) \to -\infty \quad\text{as } t_0 \nearrow \infty, \]

which yields, by virtue of Lemma 7, the corresponding exponential tightness for $\alpha\Lambda(\cdot/\alpha) - \lambda\cdot$. The argument for $\alpha\Gamma(\cdot/\alpha) - \gamma\cdot$ is exactly the same and therefore has been omitted. Part b) also proceeds along the same lines as the previous argument, since it follows from Proposition 5 that $\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha))$ satisfies a full LDP under Stone's topology.

We are ready to derive the LDP for $(\alpha D(\alpha))_{\alpha>0}$ in the continuous setting.

Theorem 5 Suppose that the family of processes $\alpha(\Gamma(\cdot/\alpha), \Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a full LDP on $C[0,\infty) \times C[0,\infty)$ (endowed with the corresponding product Stone's topology) with a good rate function $I(x, y)$. Then, $\{\alpha D(\alpha)\}_{\alpha>0}$ satisfies an LDP on $\mathbb{R}$ with good rate function

\[ I(z) = \inf\Big\{I(x, y) : z = \int_0^\infty e^{-t}\,(y \circ x^{-1})(t)\,dt\Big\}. \]

Proof. Proposition 5 combined with the contraction principle tells us that the family of random elements $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)) - \lambda\cdot/\gamma)_{\alpha>0}$ satisfies a full LDP on $C[0,\infty)$. Since the topology generated by the norm $\|\cdot\|_1$ in the subspace $L_1[0,\infty)$ is finer than Stone's topology, and the family is exponentially tight in $(L_1[0,\infty), \|\cdot\|_1)$ by Lemma 8 b), Corollary 4.2.6 of Dembo and Zeitouni (1999) (which is a simple consequence of the inverse contraction principle applied with the identity mapping) applies, yielding that $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)) - \lambda\cdot/\gamma)_{\alpha>0}$ satisfies a full LDP on $(L_1[0,\infty), \|\cdot\|_1)$. Since the mapping $\Psi_1$ is continuous on $(L_1[0,\infty), \|\cdot\|_1)$, we can apply the contraction principle once again, thereby yielding the conclusion of the theorem.

The previous theorem provides rigorous justification for approximation (18) in a very general setting (essentially all those in which functional LDPs for $(\Gamma, \Lambda)$ exist in the space of continuous functions). This includes, for example, the setting in which $\Lambda$ and $\Gamma$ are diffusion processes (see Dembo and Zeitouni (1999) Section 5.6). However, in order for the previous theorem to be useful from an applied standpoint, sufficient conditions must be provided to guarantee the validity of an LDP with good rate function on $C[0,\infty) \times C[0,\infty)$. Fortunately, these types of conditions have been well studied in the literature.

The following set of assumptions taken from Zajic (1993) are useful to guarantee

the existence of a full LDP (more than we actually need), and their validity has been

shown in many different settings (see Zajic (1993) Ch. 3 and Ch. 4).

ACL1 For all $\theta, \eta \in \mathbb{R}$ suppose that

\[ g(\eta, \theta) \triangleq \sup_{s,t}\frac{1}{t}\log E\exp\Big(\eta\int_s^{s+t}\gamma(u)\,du + \theta\int_s^{s+t}|\lambda(u)|\,du\Big) < \infty. \]

In addition, assume that there exists $\varepsilon > 0$ and a pair of functions $(f, h)$ such that $h(\delta) \to 0$ as $\delta \to 0$ and

\[ \lim_{\delta\to 0}\big(\varepsilon\log(\delta) + f(\delta)h(\delta) - \delta g(f(\delta))\big) = \infty. \]

ACL2 If $0 = t_0 < t_1 < \dots < t_m < \infty$ then

\[ W_{\alpha,m} \triangleq \alpha\big((\Gamma(t_i/\alpha) - \Gamma(t_{i-1}/\alpha)),\ (\Lambda(t_i/\alpha) - \Lambda(t_{i-1}/\alpha))\big)_{i=1}^m \]

satisfies a Large Deviations Principle (LDP) on $\mathbb{R}^{2m}$ with good rate function

\[ I_m(z) = \sum_{i=1}^m (t_i - t_{i-1})\, I\Big(\frac{z_i}{t_i - t_{i-1}}\Big), \]

where $I(x_1, x_2)$ is the rate function governing the LDP of $n^{-1}(\Gamma(n), \Lambda(n))$.

The following theorem provides a form of the LDP that is well suited for applications. Define (as in Zajic (1993) p. 9)

\[ \psi(\eta, \theta) = \lim_{n\to\infty}\frac{1}{n}\log E\exp\big(\eta\Gamma(n) + \theta\Lambda(n)\big) < \infty. \]

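Theorem 6 Suppose that assumptions ACL1 and ACL2 are in force. Let $AC_0$ be the set of absolutely continuous functions, defined on $[0,\infty)$, taking values on the real line and vanishing at the origin. Then, if $y > \lambda/\gamma$, we have that

\[ \lim_{\alpha\to 0}\alpha\log P(\alpha D(\alpha) \ge y) = -I(y) \triangleq -\inf_{x\in AC_0}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds : y = \int_0^\infty e^{-s}x(s)\,ds\Big\}, \]

where $\chi(\cdot)$ is defined via $\psi(-\chi(\cdot), \cdot) = 0$. In addition, if there exists $\theta^* = \theta^*(y)$ such that $y\theta^* = \chi(\theta^*)$, then we have

\[ \lim_{\alpha\to 0}\alpha\log P(\alpha D(\alpha) \ge y) = -\sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big) = -\Big(y\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du\Big). \]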

Proof. All we have to do is identify the rate function. Theorem 2.2.2, p. 25, of Zajic (1993) indicates that $\alpha(\Gamma(\cdot/\alpha), \Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a LDP with good rate function

\[ I(x, y) \triangleq \begin{cases} \int_0^\infty \sup_{\eta,\theta}\big(x(s)\eta + y(s)\theta - \psi(\eta, \theta)\big)\,ds & \text{if } x, y \in AC_0, \\ \infty & \text{otherwise.} \end{cases} \]

This implies (combining the results of Puhalskii and Whitt (1997) and Russell (1998)) that $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ satisfies a full LDP with good rate function

\[ J(x) \triangleq \begin{cases} \int_0^\infty \sup_\theta\big(x(s)\theta - \chi(\theta)\big)\,ds & \text{if } x \in AC_0, \\ \infty & \text{otherwise.} \end{cases} \]

This expression, combined with the contraction principle, yields the first part of the theorem. Hence, we only need to show that if $y > \lambda/\gamma$ and $y\theta^* = \chi(\theta^*)$, then

\[ I(y) = \sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big) = y\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du. \]

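First, observe that integration by parts yields

\[ \inf_{x\in AC_0;\ y=\int_0^\infty e^{-s}x(s)ds}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds\Big\} = \inf_{x\in AC_0;\ y=\int_0^\infty e^{-s}x(s)ds}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds\Big\}. \]

Also, note that for every $s \in \mathbb{R}$

\[ \sup_\theta\big(\theta x(s) - \chi(\theta)\big) = \sup_\theta\big(\theta e^{-s}x(s) - \chi(\theta e^{-s})\big). \]

In particular, we have that for $x \in AC_0$ and $y = \int_0^\infty e^{-s}x(s)\,ds$

\begin{align*}
\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds
&= \int_0^\infty \sup_\theta\big(\theta e^{-s}x(s) - \chi(\theta e^{-s})\big)\,ds \\
&\ge \sup_\theta \int_0^\infty \big(\theta e^{-s}x(s) - \chi(\theta e^{-s})\big)\,ds \\
&= \sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big).
\end{align*}

Consequently,

\[ I(y) \ge \sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big). \]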

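Now, if $y > \lambda/\gamma = \chi(0)$, then

\[ \sup_{\theta\ge 0}\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big) \ge 0. \]

On the other hand, for every $\theta > 0$, we have (by making the change of variables $\theta e^{-s} = u$)

\[ \int_0^\infty \chi(\theta e^{-s})\,ds = \int_0^\theta \frac{\chi(u)}{u}\,du. \]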


Therefore, by first order optimality conditions we have that (using the convexity of the rate function)

\[ \sup_{\theta\ge 0}\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big) = y\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du > 0. \]
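The two steps just used, the change of variables and the first order condition, can be written out explicitly. The following is a sketch under assumptions not spelled out in the text, namely that $\chi$ is differentiable and that $\chi(u)/u$ is integrable near the origin.

```latex
\begin{align*}
u = \theta e^{-s}\ (\theta > 0)
  &\implies du = -u\,ds, \qquad s = 0 \mapsto u = \theta, \qquad s \to \infty \mapsto u \to 0, \\
\int_0^\infty \chi(\theta e^{-s})\,ds
  &= \int_0^\theta \frac{\chi(u)}{u}\,du, \\
\frac{d}{d\theta}\Big(y\theta - \int_0^\theta \frac{\chi(u)}{u}\,du\Big)
  &= y - \frac{\chi(\theta)}{\theta} = 0
  \iff y\theta = \chi(\theta).
\end{align*}
```

The stationary point is therefore the value $\theta^*$ with $y\theta^* = \chi(\theta^*)$ that appears in the statement of the theorem.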

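Finally, consider the function $x^*(s)$ such that $\chi(\theta^* e^{-s}) = x^*(s)$ and $x^*(0) = 0$. Note that

\[ \int_0^\infty e^{-s}x^*(s)\,ds = \int_0^\infty e^{-s}\chi(\theta^* e^{-s})\,ds = \frac{-1}{\theta^*}\int_0^\infty d\chi(\theta^* e^{-s}) = \frac{\chi(\theta^*)}{\theta^*} = y. \]

Hence, we have that

\begin{align*}
I(y) &= \inf_{x\in AC_0;\ y=\int_0^\infty e^{-s}x(s)ds}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds\Big\} \\
&\le \int_0^\infty \sup_\theta\big(\theta e^{-s}\chi(\theta^* e^{-s}) - \chi(\theta e^{-s})\big)\,ds \\
&= y\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du = \sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big),
\end{align*}

which yields the conclusion of the theorem.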

Our final result is an exact LDP formulated in the continuous setting for processes with a Markovian structure. We adopt the setting of Subsection 5.2, in which a suitably time homogeneous Markov process $Y = (Y(s) : s \ge 0)$ with generator $A$ was introduced. We also assume that $\Lambda$ is a Levy process independent of $Y$. The desired exact LDP for

\[ \alpha D(\alpha) = \alpha\int_0^\infty \exp(-\alpha\Gamma(t))\,d\Lambda(t) \]

provided in the next theorem gives support to the following approximation (valid when $x \gg \lambda/\gamma$)

\[ P_y(D > x) \approx \frac{\gamma^{1/2}\,u(y_0, -\chi(\theta^*))\exp\big(\theta^*\chi(\theta^*)c(\theta^*)\big)}{\theta^*\sqrt{\pi\chi''(\theta^*)}}\exp(-I(x)/\gamma), \]

where $u(y, \cdot)$ and $\chi''(\cdot)$ are defined as in Subsection 5.2 via the generalized eigenvalue problem (15), $c(\theta^*) = E_{\tilde\pi}\big(u_\theta(\tilde{Y}(1), -\chi(\theta^*))/u(\tilde{Y}(1), -\chi(\theta^*))\big)$ (with $\tilde\pi(dy) = \tilde\gamma(y)\pi(dy)/E_\pi\tilde\gamma(Y(1))$) and

\[ I(x) = x\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du, \]

with $\chi(\theta^*) = \theta^* x$.

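Theorem 7 Suppose that $Y$ is geometrically ergodic and that $\Lambda(1)$ is non-lattice with $\phi_\Lambda(\theta) = E\exp(\theta\Lambda(1)) < \infty$ for all $\theta \in \mathbb{R}$. Then, if $x > \lambda/\gamma$ and $c(\theta^*) = E_{\tilde\pi}\big(u_\theta(\tilde{Y}(1), -\chi(\theta^*))/u(\tilde{Y}(1), -\chi(\theta^*))\big)$ (with $\tilde\pi(dy) = \tilde\gamma(y)\pi(dy)/E_\pi\tilde\gamma(Y(1))$),

\[ \exp(I(x)/\alpha)\,P_y(\alpha D(\alpha) > x) \sim \alpha^{1/2}\,\frac{u(y, -\chi(\theta^*))\exp\big(\theta^*\chi(\theta^*)c(\theta^*)\big)}{\theta^*\sqrt{\pi\chi^{(2)}(\theta^*)}} \quad\text{as } \alpha \searrow 0, \]

where $\chi(\theta) = -\psi_\Gamma^{-1}(-\psi_\Lambda(\theta))$, $\theta^*$ satisfies $\chi(\theta^*) = \theta^* x$ and $u(y, \theta)$ (with $u(y, 0) = 1$) solves the eigenvalue problem

\[ \frac{1}{\tilde\gamma(y)}(Au)(y, \theta) = \Big(-\frac{\psi_\Lambda(\theta)}{\tilde\gamma(y)} + \chi(\theta)\Big)u(y, \theta). \]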

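Proof. Consider the family of probability measures $P_y^*$ defined as

\[ dP_y^* = \exp\big(\theta^* D(\alpha) - \psi(\theta^*, \alpha)\big)\,dP_y, \]

where $\psi(\theta^*, \alpha) = \log E\exp(\theta^* D(\alpha))$. Note that

\[ \exp(I(x)/\alpha)\,P_{y_0}(\alpha D(\alpha) > x) = \exp(I(x)/\alpha)\,E_{y_0}^*\big(1(\alpha D(\alpha) > x)\exp(\psi(\theta^*, \alpha) - \theta^* D(\alpha))\big). \]

Now, observe that (since $\theta^* > 0$)

\[ I(x)/\alpha - x\theta^*/\alpha = -\frac{1}{\alpha}\int_0^{\theta^*}\frac{\chi(u)}{u}\,du = -\int_0^\infty \chi(\theta^* e^{-\alpha s})\,ds. \]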

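On the other hand, from the proof of Lemma 5 we have that for all $\theta \in \mathbb{R}$,

\begin{align*}
&\exp\Big(\int_0^\infty \chi(\theta e^{-\alpha t})\,dt\Big)u(y_0, -\chi(\theta)) \tag{22} \\
&\quad = E\exp\Big(\int_0^\infty \psi_\Lambda\big(\theta e^{-\alpha\Gamma(t)}\big)\,dt - \alpha\int_0^\infty e^{-\alpha u}\,\frac{u_\theta\big(\tilde{Y}(u), -\chi(\theta e^{-\alpha u})\big)\chi(\theta e^{-\alpha u})}{u\big(\tilde{Y}(u), -\chi(\theta e^{-\alpha u})\big)}\,du\Big),
\end{align*}

which implies that

\[ \exp\Big(\psi(\theta, \alpha) - \int_0^\infty \chi(\theta e^{-\alpha s})\,ds\Big) \sim u(Y_0, -\chi(\theta))\exp(c(\theta)) \triangleq \xi(y_0, \theta) \tag{23} \]

as $\alpha \searrow 0$. Therefore, we have that

\[ \exp(I(x)/\alpha)\,P_{y_0}(\alpha D(\alpha) > x) \sim \xi(y_0, \theta^*)\,E_{y_0}^\alpha\big(1(D(\alpha) - x/\alpha > 0)\exp(-\theta^*(D(\alpha) - x/\alpha))\big). \tag{24} \]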

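The strategy is then to develop an Edgeworth expansion for $\sqrt{\alpha}(D(\alpha) - x/\alpha)$ under $E_{y_0}^*$. Using the same steps as in the proof of Lemma 5 we can obtain a description of the local behavior of $\psi_\alpha^*(\theta) \triangleq \log E_{y_0}^*\exp\big(i\theta\sqrt{\alpha}(D(\alpha) - x/\alpha)\big)$. In fact, we can obtain

\[ \psi_\alpha^*(\theta) = -\frac{\theta^2\chi^{(2)}(\theta^*)}{4} + \sqrt{\alpha}\big(c_1 i\theta + c_2(i\theta)^3\big) + o(\sqrt{\alpha}) \]

(uniformly on $\theta \in (-\delta, \delta)$ for some $\delta > 0$). The coefficients $c_1$ and $c_2$ can actually be computed but their values are not relevant for purposes of developing sharp large deviations. The coefficient $\chi^{(2)}(\theta^*)/4$ comes from the development of

\begin{align*}
&\int_0^\infty \big(\chi\big((\sqrt{\alpha}\theta + \theta^*)e^{-\alpha u}\big) - \chi(\theta^* e^{-\alpha u}) - \sqrt{\alpha}e^{-\alpha u}x\theta\big)\,du \\
&\quad = \theta\sqrt{\alpha}\int_0^\infty \big(\chi(\theta^* e^{-\alpha u}) - x\big)e^{-\alpha u}\,du + \theta^2\int_0^\infty \frac{2\alpha\chi^{(2)}(\theta^* e^{-\alpha u})e^{-2\alpha u}}{4}\,du + o(\sqrt{\alpha}). \tag{25}
\end{align*}

Indeed, since

\[ \int_0^\infty \chi(\theta^* e^{-\alpha u})e^{-\alpha u}\,du = -\frac{1}{\theta^*\alpha}\int_0^\infty d\chi(\theta^* e^{-\alpha u}) = \frac{\chi(\theta^*)}{\alpha\theta^*} = \frac{x}{\alpha}, \]

we obtain that the coefficient multiplying $\theta$ in (25) vanishes and, thus, $\psi_\alpha^*(\theta) \sim -\theta^2\chi^{(2)}(\theta^*)/4$ as stated. We also must show that $|\phi^*(\theta, \alpha)| = |E_{y_0}^*\exp(i\theta D(\alpha))| = o(\sqrt{\alpha})$ uniformly on compact sets not containing the origin. The key observation to prove this condition is to note that

\[ \phi^*(\theta, \alpha) = \frac{E\exp\big(\int_0^\infty \psi_\Lambda((i\theta + \theta^*)\exp(-\alpha\Gamma(t)))\,dt\big)}{E\exp\big(\int_0^\infty \psi_\Lambda(\theta^*\exp(-\alpha\Gamma(t)))\,dt\big)}. \]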

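Next, observe that (23) implies that there exists a positive constant $C < \infty$ such that

\[ |\phi^*(\theta, \alpha)| \le C\,\Big|E\exp\Big(\int_0^\infty \big(\psi_\Lambda\big((i\theta + \theta^*)e^{-\alpha\Gamma(t)}\big) - \chi(\theta^* e^{-\alpha t})\big)\,dt\Big)\Big|. \tag{26} \]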

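Now consider the (geometrically ergodic) Markov process $\tilde{Y} = (\tilde{Y}(s) : s \ge 0)$ with generator $\tilde{A} = \frac{1}{\tilde\gamma}A$ (compare the proof of Lemma 5, where this process was introduced). Let us define the probability measure $\tilde{P}$ acting on the sigma-field generated by $\tilde{Y}$ as

\[ d\tilde{P} = M_\infty(\theta^*)\,dP, \]

where $M_\infty(\theta^*)$ is the last element of the bounded martingale $M^* = (M_t(\theta^*) : 0 \le t \le \infty)$ defined as

\begin{align*}
M_t(\theta^*) = {}& \frac{u\big(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\big)}{u\big(Y_0, -\chi(\theta^* e^{-\alpha t})\big)}\exp\Big(\int_0^t \frac{\psi_\Lambda(\theta^* e^{-\alpha t})}{\tilde\gamma(\tilde{Y}(t))}\,dt - \int_0^t \chi(\theta^* e^{-\alpha t})\,dt\Big) \\
&\times \exp\Big(-\alpha\int_0^t \theta^* e^{-\alpha t}\,\frac{u_\theta\big(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\big)}{u\big(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\big)}\,\chi(\theta^* e^{-\alpha t})\,dt\Big).
\end{align*}

(This martingale was also introduced in the proof of Lemma 5, where it has been indicated how the martingale property follows from Lemma 2, p. 82 of Skorohod, Hoppensteadt and Salehi (2001).) Note, therefore, that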

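\begin{align*}
&E\exp\Big(\int_0^\infty \big(\psi_\Lambda\big((i\theta + \theta^*)e^{-\alpha\Gamma(t)}\big) - \chi(\theta^* e^{-\alpha t})\big)\,dt\Big) \\
&\quad = \tilde{E}\exp\Big(\int_0^\infty \psi_\Lambda\big((i\theta + \theta^*)e^{-\alpha\Gamma(t)}\big)\,dt - \int_0^\infty \psi_\Lambda\big(\theta^* e^{-\alpha\Gamma(t)}\big)\,dt\Big)Z(\alpha),
\end{align*}

where $B_1 < |Z(\alpha)| < B_2$ for some constants $0 < B_1 < B_2 < \infty$. This implies, using the bound (26), that

\[ |\phi^*(\theta, \alpha)| \le CB_2\,\tilde{E}\Big|\exp\Big(\int_0^\infty \big(\psi_\Lambda\big((i\theta + \theta^*)e^{-\alpha\Gamma(t)}\big) - \psi_\Lambda\big(\theta^* e^{-\alpha\Gamma(t)}\big)\big)\,dt\Big)\Big|. \]

From this bound it is easy to see that $|\phi^*(\theta, \alpha)| = o(\sqrt{\alpha})$ by noting that for every $\eta_2 \in \mathbb{R}$, $\exp(\psi_\Lambda(i\cdot + \eta_2) - \psi_\Lambda(\eta_2))$ is the characteristic function of $\Lambda(1)$ under the obvious exponential change of measure and that $\delta t \le \Gamma(t) \le Kt$ for positive finite constants $\delta$ and $K$. With these elements on hand, the corresponding Edgeworth expansion for $\sqrt{\alpha}(D(\alpha) - x/\alpha)$ under $E_{y_0}^*$ follows routine steps as in the proof of Theorem 3. Therefore, we obtain that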

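\begin{align*}
&E_{y_0}^\alpha\big(1(D(\alpha) - x/\alpha > 0)\exp(-\theta^*(D(\alpha) - x/\alpha))\big) \\
&\quad = \sqrt{\alpha}\int_0^\infty \frac{\exp(-x)\exp\big(-\alpha x^2/(\chi^{(2)}(\theta^*)\theta^*)\big)}{\theta^*\sqrt{\pi\chi^{(2)}(\theta^*)}}\,dx \\
&\qquad + \sqrt{\alpha}\int_0^\infty \exp(-\theta^* x/\sqrt{\alpha})\,p(x)\exp\big(-x^2/\chi^{(2)}(\theta^*)\big)\,dx \\
&\qquad + \int_0^\infty \exp(-\theta^* x/\sqrt{\alpha})\,G_\alpha(dx),
\end{align*}

where $p(x)$ in the second term above represents a polynomial of degree 3 and $G_\alpha(dx)$ is a signed measure such that $\|G_\alpha(dx)\| = o(\sqrt{\alpha})$. Hence, using the Dominated Convergence Theorem and the stated property on the total variation of $G_\alpha$, we obtain that

\[ \frac{1}{\sqrt{\alpha}}E_{y_0}^\alpha\big(1(D(\alpha) - x/\alpha > 0)\exp(-\theta^*(D(\alpha) - x/\alpha))\big) \to \frac{1}{\theta^*\sqrt{\pi\chi^{(2)}(\theta^*)}}. \]

Combining these estimates with (24) yields the conclusion of the theorem.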

5.5.2 The discrete time setting

The goal now is to obtain the LDP for the discrete time case. The following set of assumptions is analogous to those stated at the end of the previous section, and their validity has been verified in many cases (including under Markovian and strong mixing assumptions; see Zajic (1993), chapters 3 and 4).

ADL1 For each $\theta, \eta \in \mathbb{R}$, suppose that

\[ g(\eta, \theta) \triangleq \sup_{n,k}\frac{1}{n}\log E\exp\Big(\eta\sum_{j=k}^{n+k}Z_j + \theta\sum_{j=k}^{n+k}|X_j|\Big) < \infty. \]

ADL2 If $0 = t_0 < t_1 < \dots < t_m < \infty$ then

\[ W_{\alpha,m} \triangleq \alpha\big((\Gamma(t_i/\alpha) - \Gamma(t_{i-1}/\alpha)),\ (\Lambda(t_i/\alpha) - \Lambda(t_{i-1}/\alpha))\big)_{i=1}^m \]

satisfies an LDP on $\mathbb{R}^{2m}$ with good rate function

\[ I_m(z) = \sum_{i=1}^m (t_i - t_{i-1})\, I\Big(\frac{z_i}{t_i - t_{i-1}}\Big), \]

where $I(x_1, x_2)$ is the rate function governing the LDP of $n^{-1}(\Gamma(n), \Lambda(n))$.

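The strategy here is first to consider a related family of approximating processes $(\tilde\Gamma, \tilde\Lambda)$ defined via

\[ \tilde\Gamma(t) \triangleq \sum_{k=1}^{\lfloor t\rfloor}Z_k + (t - \lfloor t\rfloor)Z_{\lfloor t\rfloor+1}, \qquad \tilde\Lambda(t) \triangleq \sum_{k=1}^{\lceil t\rceil}X_k + (t - \lfloor t\rfloor)X_{\lceil t\rceil+1}. \]

Theorem 2.1.1, p. 19, of Zajic (1993) establishes that $\alpha(\tilde\Gamma(\cdot/\alpha), \tilde\Lambda(\cdot/\alpha))$ satisfies a full LDP under Stone's topology. (Note that $\lceil\cdot\rceil$ is being used here instead of $\lfloor\cdot\rfloor$ in the definition of $\tilde\Lambda$, but it is straightforward to adapt Zajic's estimates to this setting. Also, recall that a full LDP is one that holds with a good and convex rate function. See Dembo and Zeitouni (1999) for the definition of good rate function.) Thus, Theorem 5 applies here, yielding the full LDP for the corresponding normalized infinite horizon discounted reward

\[ \alpha\tilde{D}(\alpha) \triangleq \alpha\Psi\big(\tilde\Gamma_\alpha^{-1}, \tilde\Lambda_\alpha\big) = \alpha\int_{[0,\infty)}\exp(-u)\,\tilde\Lambda_\alpha\big(\tilde\Gamma_\alpha^{-1}(u)\big)\,du. \]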

In view of this observation, the natural step is to show that $\alpha\tilde{D}(\alpha)$ is suitably close to $\alpha D(\alpha)$ (in exponential scale) as $\alpha \searrow 0$. In other words, we would like to show that the families of random variables $\{\alpha\tilde{D}(\alpha)\}_{\alpha>0}$ and $\{\alpha D(\alpha)\}_{\alpha>0}$ are exponentially equivalent, i.e. that for each $\delta > 0$

\[ \limsup_{\alpha\to 0}\alpha\log P\big(\big|\alpha\tilde{D}(\alpha) - \alpha D(\alpha)\big| > \delta\big) = -\infty \]

(see Dembo and Zeitouni (1999), p. 130). With exponential equivalence on hand we would be able to conclude, by virtue of Theorem 4.2.13 of Dembo and Zeitouni (1999), that a full LDP also holds for $\alpha D(\alpha)$ as $\alpha \searrow 0$. We will actually follow this program, but we will utilize a different family of approximating processes. The reason is that the integral structure in the definition of $\tilde{D}(\alpha) = \Psi\big(\tilde\Gamma_\alpha^{-1}, \tilde\Lambda_\alpha\big)$ allows us to take advantage of the nature of the Lebesgue measure to construct a family of approximating processes $\{\bar\Lambda_\alpha\}_{\alpha>0}$ which is more convenient for purposes of proving the exponential equivalence required. We thus define, for each $\alpha > 0$, the continuous process $\bar\Lambda_\alpha$ as

\[ \bar\Lambda_\alpha(t) \triangleq \sum_{k=1}^{\lceil t\rceil}X_k + U_\alpha(t), \]

where

\[ U_\alpha(t) = \frac{t - (\lceil t\rceil - \alpha)}{\alpha}\,X_{\lceil t\rceil+1}\,1\big(t \in [\lceil t\rceil - \alpha, \lceil t\rceil)\big). \]

We now show that $\big(\alpha\tilde\Gamma(\cdot/\alpha), \alpha\bar\Lambda_\alpha(\cdot/\alpha)\big)$ and $\big(\alpha\tilde\Gamma(\cdot/\alpha), \alpha\tilde\Lambda(\cdot/\alpha)\big)$ are equivalent from a large deviations standpoint.

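Lemma 9 The families $\{(\alpha\tilde\Gamma(\cdot/\alpha), \alpha\bar\Lambda_\alpha(\cdot/\alpha))\}_{\alpha>0}$ and $\{(\alpha\tilde\Gamma(\cdot/\alpha), \alpha\tilde\Lambda(\cdot/\alpha))\}_{\alpha>0}$ are exponentially equivalent in $C[0,\infty) \times C[0,\infty)$ under Stone's topology.

Proof. It suffices to show the corresponding exponential equivalence for $\{\bar\Lambda_\alpha\}_{\alpha>0}$ and $\{\tilde\Lambda_\alpha\}_{\alpha>0}$. Recall that Stone's topology is generated by the metric

\[ d_\infty(x, y) = \sum_{k=1}^\infty 2^{-k}\,\frac{d_k(x, y)}{1 + d_k(x, y)}, \]

where

\[ d_T(x, y) = \sup_{0\le t\le T}|x(t) - y(t)| \]

(see Zajic (1993) p. 20). Fix $\delta > 0$ small and choose $k_0 > \lceil -\log(\delta/2)/\log(2)\rceil$. Then, $\sum_{k=k_0}^\infty 2^{-k} < \delta/2$ and, noting that $d_{k_0}\big(\bar\Lambda_\alpha, \tilde\Lambda_\alpha\big) \le \alpha\max_{1\le k\le\lfloor k_0/\alpha\rfloor}|X_k|$, we can write

\begin{align*}
P\big(d_\infty\big(\bar\Lambda_\alpha, \tilde\Lambda_\alpha\big) > \delta\big)
&\le P\big(d_{k_0}\big(\bar\Lambda_\alpha, \tilde\Lambda_\alpha\big) > \delta/2\big) \\
&\le \lfloor k_0/\alpha\rfloor \max_{1\le k\le\lfloor k_0/\alpha\rfloor}P\big(|X_k| > 2^{-1}\delta/\alpha\big) \\
&\le \lfloor k_0/\alpha\rfloor \exp\big(-A2^{-1}\delta/\alpha\big)\max_{k\in\mathbb{N}}E\big(\exp(A|X_k|)\big),
\end{align*}

for every $A > 0$ (by virtue of assumption ADL2). Therefore, we conclude that

\[ \limsup_{\alpha\to 0}\alpha\log P\big(d_\infty\big(\bar\Lambda_\alpha, \tilde\Lambda_\alpha\big) > \delta\big) \le -A2^{-1}\delta. \]

Letting $A \nearrow \infty$ yields the conclusion of the lemma.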

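The same strategy followed in the continuous case can now be applied to the pair $\big(\tilde\Gamma_\alpha(\cdot), \bar\Lambda_\alpha(\cdot)\big)$, as the next proposition summarizes.

Proposition 6 Under assumptions ADL1 and ADL2, the family of random elements $\bar\Lambda_\alpha\big(\tilde\Gamma_\alpha^{-1}(\cdot)\big)$ satisfies a full LDP on the space of continuous functions $C[0,\infty)$ endowed with Stone's topology. Moreover, the corresponding normalized infinite horizon discounted reward

\[ \alpha\bar{D}(\alpha) = \alpha\int_{[0,\infty)}\exp(-u)\,\bar\Lambda_\alpha\big(\tilde\Gamma_\alpha^{-1}(u)\big)\,du \]

satisfies a LDP with good rate function

\[ I(y) = \inf_{x\in AC_0}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds : y = \int_0^\infty e^{-s}x(s)\,ds\Big\}, \]

where $\chi(\cdot)$ is defined via $\psi(-\chi(\cdot), \cdot) = 0$.

Proof. It follows just as Theorem 5.

We are now ready to show that $\alpha\bar{D}(\alpha)$ is suitably close to $\alpha D(\alpha)$ in exponential scale.

Lemma 10 The families $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ and $\{\alpha D(\alpha)\}_{\alpha>0}$ are exponentially equivalent. In other words, for each $\delta > 0$,

\[ \limsup_{\alpha\to 0}\alpha\log P\big(\big|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\big| > \delta\big) = -\infty. \]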

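Proof. Note that $\big\lceil\tilde\Gamma^{-1}(t)\big\rceil = \Gamma^{-1}(t)$ for almost every $t$ with respect to Lebesgue measure. Therefore, it follows that, for almost every $t$,

\[ \alpha\Lambda\big(\Gamma^{-1}(t/\alpha)\big) - \alpha\bar\Lambda_\alpha\big(\tilde\Gamma^{-1}(t/\alpha)\big) = -\alpha U_\alpha\big(\tilde\Gamma^{-1}(t/\alpha)\big). \]

As a result, we have (making the change of variables $\tilde\Gamma(t/\alpha) = u/\alpha$) that

\[ \big|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\big| \le \alpha\int_0^\infty W(u/\alpha)\,du, \]

where

\[ W(u/\alpha) = \exp\big(-\alpha\tilde\Gamma(u/\alpha)\big)\,|U_\alpha(u/\alpha)|\,Z(\lfloor u/\alpha\rfloor + 1). \]

Let us define

\[ V_1 = \alpha\int_0^{t_0}W(u/\alpha)\,du \quad\text{and}\quad V_2 = \alpha\int_{t_0}^\infty W(u/\alpha)\,du, \]

and consider the sets

\begin{align*}
A_1(t_0, \alpha, \varepsilon) &\triangleq \Big\{\omega : \sup_{t>t_0}\big|\big(\alpha\tilde\Gamma(t/\alpha) - \gamma t\big)t^{-1}\big| \le \varepsilon\Big\}, \\
A_2(t_0, \alpha, \varepsilon) &\triangleq \Big\{\omega : \sup_{t>t_0}\big|\big(\alpha\Gamma(t/\alpha) - \gamma t\big)t^{-1}\big| \le \varepsilon\Big\}, \\
A_3(t_0, \alpha, m) &\triangleq \Big\{\omega : \sup_{0\le t\le t_0}\big|\alpha\tilde\Gamma(t/\alpha) - \gamma t\big| \le m\Big\}, \\
A_4(t_0, \alpha, M) &\triangleq \Big\{\omega : \alpha\sum_{k=1}^{\lceil t_0/\alpha\rceil}|X_k| \le M\Big\}, \\
A_5(t_0, \alpha, \varepsilon) &\triangleq \Big\{\omega : \sup_{k>t_0/\alpha}\frac{|X_k|}{k} \le \varepsilon\Big\}.
\end{align*}

For notational convenience, we will drop the arguments in the definitions of $A_j$, $1 \le j \le 5$. Using these definitions, we can write

\[ P\big(\big|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\big| > \delta\big) \le P\Big(\alpha\int_0^\infty W(u/\alpha)\,du > \delta;\ \cap_{k=1}^5 A_k\Big) + \sum_{k=1}^5 P(A_k^c). \]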

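Observe that if we write $K_1 = \exp(-m + \gamma)$ then, on $\cap_{k=1}^5 A_k$, we have

\begin{align*}
V_1 &\le \alpha K_1\int_0^{t_0}|X(\lceil u/\alpha\rceil + 1)|\,Z(\lfloor u/\alpha\rfloor + 1)\,1\big(u/\alpha \in [\lceil u/\alpha\rceil - \alpha, \lceil u/\alpha\rceil)\big)\,du \\
&\le \alpha^3 K_1\sum_{k=1}^{\lceil t_0/\alpha\rceil}|X_{k+1}|Z_k \le \alpha^3 K_1\sum_{k=1}^{\lceil t_0/\alpha\rceil}|X_k|\sum_{k=1}^{\lceil t_0/\alpha\rceil}Z_k \le \alpha^2 K_1 M\sum_{k=1}^{\lceil t_0/\alpha\rceil}Z_k.
\end{align*}

On the other hand, also on $\cap_{k=1}^5 A_k$, and for $t_0(\varepsilon, \gamma)$ suitably large, there exists a positive constant $K(\varepsilon, \gamma) < \infty$ such that

\[ V_2 \le \alpha K(\varepsilon, \gamma). \]

Thus, if $\alpha < \delta/(2K(\varepsilon, \gamma))$, we have that

\[ P\Big(\alpha\int_0^\infty W(u/\alpha)\,du > \delta;\ \cap_{k=1}^5 A_k\Big) \le P\Big(\alpha\sum_{k=1}^{\lceil t_0/\alpha\rceil}Z_k > \delta/(2\alpha K_1 M)\Big). \]

But we know that $\alpha\sum_{k=1}^{\lceil t_0/\alpha\rceil}Z_k$ satisfies an LDP (as $\alpha \to 0$); therefore, we must have that (for fixed $\varepsilon$, $\gamma$ and large but fixed $t_0$)

\[ \alpha\log P\Big(\alpha\int_0^\infty W(u/\alpha)\,du > \delta\Big) \to -\infty \quad\text{as } \alpha \searrow 0. \]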

Now we analyze $P(A_k^c)$ for each $1 \le k \le 5$. First, note that (by Lemma 7), $t_0$ can be chosen so that

\[ \limsup_{\alpha\to 0}\alpha\log P\big(A_1^c(t_0, \alpha, \varepsilon)\big) \le -t_0. \tag{27} \]

Because $(\alpha\Gamma(\cdot/\alpha) - \gamma\cdot)_{\alpha>0}$ satisfies a full LDP on $D[0,\infty)$ endowed with the topology generated by uniform convergence on compact sets (see Theorem 2.2.1 of Zajic (1993)), it follows that the same argument provided for the proof of condition (20) applies in this case as well. This implies that a bound such as (27) also applies for the set $A_2^c$. Observe that

\[ \alpha\log P\big(A_3^c(t_0, \alpha, m)\big) \to -J(m), \]

for some convex good rate function $J(\cdot)$ (by definition of full LDP, see Dembo and Zeitouni (1999)). Now, for $A_4$, we can use Chebyshev's bound to obtain

\[ \alpha\log P\big(A_4^c(t_0, \alpha, M)\big) \le \alpha\Big\lceil\frac{t_0}{\alpha}\Big\rceil\log E\exp\Big(\sum_{k=1}^{\lceil t_0/\alpha\rceil}|X_k|\Big) - M \le \alpha g(0, 1) - M \to -M \quad\text{as } \alpha \searrow 0. \]

Finally, for $A_5$, we have

\begin{align*}
P\big(A_5^c(t_0, \alpha, \varepsilon)\big) &\le \sum_{k>t_0/\alpha}P\Big(\frac{|X_k|}{k} > \varepsilon\Big) \le \sum_{k>t_0/\alpha}\exp(-\varepsilon k)\,E\exp(|X_k|) \\
&\le \frac{\exp(g(0, 1))\exp(-\varepsilon\lceil t_0/\alpha\rceil)}{1 - \exp(-\varepsilon)}.
\end{align*}

Thus, for $\varepsilon > 0$ small but fixed,

\[ \alpha\log P\big(A_5^c(t_0, \alpha, \varepsilon)\big) \to -\varepsilon t_0. \]

Combining the previous estimates, we conclude that

\[ \limsup_{\alpha\to 0}\alpha\log P\big(\big|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\big| > \delta\big) \le -(2 + \varepsilon)t_0 - M - J(m). \]

Since $J(\cdot)$ is a convex good rate function, the quantity on the right hand side tends to $-\infty$ as $m, t_0, M \nearrow \infty$, which yields the proof of the lemma.

We are now in position to identify the rate function required to make practical use of approximation (18) and under which the LDP for $\alpha D(\alpha)$ holds. Define (as in Zajic (1993) p. 9)

\[ \psi(\eta, \theta) = \lim_{n\to\infty}\frac{1}{n}\log E\exp\big(\eta\Gamma(n) + \theta\Lambda(n)\big) < \infty. \]

Theorem 8 Suppose that ADL1 and ADL2 hold. Then, if $y > \lambda/\gamma$,

\[ \lim_{\alpha\to 0}\alpha\log P(\alpha D(\alpha) \ge y) = -\inf_{x\in AC_0}\Big\{\int_0^\infty \sup_\theta\big(\theta x(s) - \chi(\theta)\big)\,ds : y = \int_0^\infty e^{-s}x(s)\,ds\Big\}, \]

where $AC_0$ is the space of absolutely continuous functions, defined on the interval $[0,\infty)$, that vanish at the origin. In addition, if there exists $\theta^* = \theta^*(y)$ such that $y\theta^* = \chi(\theta^*)$, then

\[ \lim_{\alpha\to 0}\alpha\log P(\alpha D(\alpha) \ge y) = -\sup_\theta\Big(y\theta - \int_0^\infty \chi(\theta e^{-s})\,ds\Big) = -\Big(y\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du\Big). \]

Proof. We know that $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ and $\{\alpha D(\alpha)\}_{\alpha>0}$ are exponentially equivalent. On the other hand, Proposition 6 indicates that $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ satisfies a full LDP. Thus, by Theorem 4.2.13, p. 130, of Dembo and Zeitouni (1999), $\{\alpha D(\alpha)\}_{\alpha>0}$ must also satisfy a full LDP with the same rate function. The identification of the rate function follows as in Theorem 6.

The corresponding exact large deviations asymptotic is provided under the iid setting described in Subsection 5.1. Under those conditions, if $x \gg \lambda/\gamma$, we shall provide rigorous justification for the approximation

\[ P(D > x) \approx \frac{\sqrt{\gamma}}{\theta^*\sqrt{\pi\chi''(\theta^*)}}\exp\Big(-\Big(x\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du\Big)\Big/\gamma\Big), \]

where $x\theta^* = \chi(\theta^*)$, $\chi(\cdot)$ satisfies $\psi(-\chi(\cdot), \cdot) = 0$, and $\psi(\eta, \theta) = \log E\exp(\eta Z + \theta X)$. As usual, the approximation will be shown to hold in the regime of small interest rates. That is, we will show that the previous approximation is valid for the discrete time discounted reward $D(\alpha) = \sum_{k=0}^\infty \exp\big(-\alpha\sum_{j=0}^{k-1}Z_j\big)X_k$. The proof of the next theorem follows the same strategy as that of Theorem 7.
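To make the discrete time discounted reward concrete, the sketch below computes a finite-horizon version of $D(\alpha)$ and compares it with the time-changed integral on the right hand side of (19). The function names and the simulated inputs are illustrative assumptions, not taken from the text. On a sample of length $N$ the two quantities differ only by a boundary term $\Lambda(N)\exp(-\alpha\Gamma(N))$, which follows from summation by parts.

```python
import math
import random

def discounted_reward(alpha, X, Z):
    """Finite-horizon sum of exp(-alpha * (Z_1 + ... + Z_{k-1})) * X_k."""
    total, gamma_prev = 0.0, 0.0
    for x, z in zip(X, Z):
        total += math.exp(-alpha * gamma_prev) * x
        gamma_prev += z
    return total

def time_changed_integral(alpha, X, Z):
    """Integral of exp(-u) * Lambda(Gamma^{-1}(u/alpha)) over [0, alpha*Gamma(N)).
    On [alpha*S_{k-1}, alpha*S_k) the integrand is exp(-u) * (X_1 + ... + X_k).
    Also returns the boundary term Lambda(N) * exp(-alpha * Gamma(N))."""
    total, s_prev, lam = 0.0, 0.0, 0.0
    for x, z in zip(X, Z):
        lam += x
        s_next = s_prev + z
        total += lam * (math.exp(-alpha * s_prev) - math.exp(-alpha * s_next))
        s_prev = s_next
    boundary = lam * math.exp(-alpha * s_prev)
    return total, boundary

if __name__ == "__main__":
    rng = random.Random(0)                          # fixed seed, illustrative data
    Z = [rng.uniform(0.5, 1.5) for _ in range(500)]  # positive discount increments
    X = [rng.gauss(1.0, 1.0) for _ in range(500)]    # rewards
    d = discounted_reward(0.05, X, Z)
    integral, boundary = time_changed_integral(0.05, X, Z)
    print(d, integral + boundary)
```

The boundary term vanishes as the horizon grows whenever $\Lambda(N)\exp(-\alpha\Gamma(N)) \to 0$, which recovers the infinite horizon identity.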

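Theorem 9 Suppose that $(X_k, Z_k)_{k\ge 0}$ is an iid sequence of random variables. Suppose that $Z_k > 0$ and that for all $\eta, \theta \in \mathbb{R}$ we have that $E\exp(\theta Z_1 + \eta X_1) < \infty$. In addition, assume that conditions AI2 and AI3 of Subsection 5.1 hold. Let $\chi(\theta)$ be defined as the solution to

\[ \psi(-\chi(\theta), \theta) = 0, \]

where $\psi(\eta, \theta) = \log E\exp(\eta Z_1 + \theta X_1)$. Suppose that $x > \lambda/\gamma$ and let $\theta^*$ be the solution of $x\theta^* = \chi(\theta^*)$. Then,

\[ P(\alpha D(\alpha) > x) \sim \alpha^{1/2}\,\frac{\exp(-I(x)/\alpha)}{\theta^*\sqrt{\pi\chi^{(2)}(\theta^*)}} \quad\text{as } \alpha \searrow 0, \]

with $I(x) = x\theta^* - \int_0^{\theta^*}\frac{\chi(u)}{u}\,du$.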
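The recipe for the rate function (solve $\psi(-\chi(\theta), \theta) = 0$, then $x\theta^* = \chi(\theta^*)$, then evaluate $I(x)$) can be sketched numerically. The example below is an illustration under assumptions that are not in the text: $Z \equiv 1$ and $X \sim N(\lambda, \sigma^2)$ with hypothetical values `LAM` and `SIGMA`, for which $\chi(\theta) = \lambda\theta + \sigma^2\theta^2/2$ and $I(x) = (x - \lambda)^2/\sigma^2$ in closed form. The technical conditions of Theorem 9 are not checked here; only the rate function is computed.

```python
import math

LAM, SIGMA = 1.0, 0.5   # illustrative mean and standard deviation of X (assumed)

def psi(eta, theta):
    """log E exp(eta*Z + theta*X) for Z = 1 and X ~ N(LAM, SIGMA^2)."""
    return eta + LAM * theta + 0.5 * SIGMA**2 * theta**2

def bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_nonpos = f(lo) <= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) <= 0) == f_lo_nonpos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi(theta):
    """Solve psi(-c, theta) = 0 for c."""
    return bisect(lambda c: psi(-c, theta), -1e4, 1e4)

def theta_star(x):
    """Positive solution of x * theta = chi(theta), assuming x > LAM."""
    return bisect(lambda th: chi(th) - x * th, 1e-8, 50.0)

def rate(x, n=2000):
    """I(x) = x*theta_star - integral of chi(u)/u over (0, theta_star], midpoint rule."""
    th = theta_star(x)
    h = th / n
    integral = sum(chi((i + 0.5) * h) / ((i + 0.5) * h) for i in range(n)) * h
    return x * th - integral

if __name__ == "__main__":
    x = 1.5
    print(theta_star(x), rate(x), (x - LAM) ** 2 / SIGMA**2)
```

In this Gaussian case the numerical value of `rate(x)` can be compared directly with the closed form $(x - \lambda)^2/\sigma^2$; for other choices of `psi` only the root-finding brackets would need adjusting.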

Bibliography

[1] Adler, J., Feldman, R., and Taqqu, M. (Editors.) A Practical Guide to Heavy

Tails: Statistical Techniques and Applications. Birkhauser. Boston.

[2] Asmussen, S. (1987) Applied Probability and Queues. Wiley. New York.

[3] Asmussen, S. (2001) Ruin Probabilities. World Scientific. Singapore.

[4] Asmussen, S. (2003) Applied Probability and Queues. Springer-Verlag. New York.

[5] Asmussen, S., and Binswanger, K. (1997) Simulation of ruin probabilities for

subexponential claims. Astin Bulletin 27, 297-318.

[6] Asmussen, S., and Hojgaard, B. (1999) Approximations for finite horizon ruin

probabilities in the renewal model. Scand. Act. J. 2, 106-119.

[7] Bahr, B. (1975) Asymptotic ruin probabilities when exponential moments do not

exist. Scand. Act. J., 6-10.

[8] Bédard, D., and Dufresne, D. (2001). Pension funding with moving average rates

of return. Scand. Actuarial Journal 101: 1-17.

[9] Benveniste, A., Metiver, M., and Priouret, P. (1990) Adaptive algorithms and

stochastic approximations. Springer-Verlag. New York.

[10] Billingsley, P. (1999) Convergence of probability measures. Wiley. New York.

[11] Blanchet, J., Olvera-Cravioto, M. and Glynn, P. (2004) From diffusion to large

deviations for the maximum of random walk. In preparation.


[12] Borovkov, A. (1976) Asymptotic methods in queueing theory. Springer-Verlag.

New York.

[13] Borovkov, A. (2000) Estimates for the distribution of sums and maxima of sums

of random variables without the Cramer condition. Preprint.

[14] Bowers, N., Gerber, H., Hickman, J., Jones, D., and Nesbitt, C. (1997) Actuarial

Mathematics. The Society of Actuaries. Schaumburg, Illinois.

[15] Breiman, L. (1992) Probability. Addison-Wesley. Massachusetts.

[16] Bucklew, K., Kurtz, T., and Sethares, W. (1993) Weak convergence and local

stability properties of fixed step size recursive algorithms. IEEE Transactions on

Information Theory, Vol. 39, No. 3, pp. 966-978.

[17] Butzer, P., and Nessel, R. (1971) Fourier Analysis and Approximation. Vol 1.

Birkhauser Verlag. New York.

[18] Campbell, J., Lo, A., and Mackinlay, C. (1999) The econometrics of financial

markets. Princeton University Press.

[19] Carlsson, H. (1983) Remainder term estimates of the renewal function. Annals of

Probability, Vol. 11, No. 1, 143-157.

[20] Carmona, P., Petit, F., and Yor, M. (2001) Exponential functionals of Lévy

processes. O. Barndorff-Nielsen, T. Mikosch and S. Resnick (eds.) Lévy processes:

theory and applications. 41-55, Birkhauser.

[21] Chang, J. (1992) On moments of the first ladder height of random walks with

small drift. Ann. of App Prob. 2, 714-738.

[22] Chang, J., and Peres, Y. (1997) Ladder heights, Gaussian random walks and the

Riemann zeta function. Annals of Probability 25, 787-802.

[23] Csörgo, M. and Révész, P. (1981) Strong Approximations in Probability and

Statistics. Academic Press.


[24] Dembo, A., and Zeitouni, O. (1998) Large deviations techniques and applications.

Springer-Verlag. New York.

[25] Dufresne, D., (1990) The distribution of a perpetuity, with applications to risk

theory and pension funding. Scandinavian Actuarial Journal.

[26] Dufresne, F., and Gerber, H. (1991) Risk theory for the compound Poisson pro-

cess that is perturbed by diffusion. Insurance Math. Econom. 51-59

[27] Embrechts, P., and Goldie, C. (1994) Perpetuities and random equations. In:

Mandl, P., Huskova, M. (eds.) Asymptotic Statistics. Proceedings of the 5th

Prague Symposium, 75-86. Physica-Verlag.

[28] Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997) Modelling extreme events

with applications to insurance and finance. Springer-Verlag. New York.

[29] Embrechts, P., and Veraverbeke, N. (1982) Estimates for the probability of ruin

with special emphasis on the possibility of large claims. Insurance: Mathematics

and Economics, 1, 55-72.

[30] Feller, W. (1978) An Introduction to Probability Theory and Its Applications II.

Wiley. New York.

[31] Feng, J., and Kurtz, T. (2000) Large deviations for stochastic processes. Preprint.

[32] Fornari, F., and Mele, A. (1997) Weak convergence and distributional

assumptions for a general class of non-linear ARCH models. Econometric Reviews, 16

(2), 205-227.

[33] Gaier, J., Grandits, P., and Schachermayer, W. (2003) Asymptotic ruin

probabilities and optimal investment. Ann. of Appl. Prob. Vol. 13.

[34] Gerber, H. (1971) The discounted central limit theorem and its Berry-Esséen

analogue. Ann. of Math. Stat. Vol. 42, No. 1, 389-392.

[35] Gjessing, H., and Paulsen, J. (1997) Present value distributions with applications

to ruin theory and stochastic equations, St. Pr. and Appl. Vol. 71, 123-144.

[36] Goldie, C. (1991) Implicit renewal theory and tails of solutions of random

equations. Ann. Appl. Probab. 126-166.

[37] Goldie, C. and Grübel, R. (1996) Perpetuities with Thin Tails. Adv. Appl. Prob.

28, 463-480.

[38] Grandell, J. (1991) Aspects of Risk Theory. Springer-Verlag. New York.

[39] Gut, A. (1988) Stopped random walks. Springer-Verlag. New York.

[40] Harrison, M. (1977) Ruin problems with compounding assets. St. Pr. and Appl.

1977. 5, 67-79.

[41] Hogan, M. (1986) Comment on ‘Corrected Diffusion Approximations in Certain

Random Walk Problems’. J. Appl. Probab. 23, 89-96.

[42] Horvath, L. (1984a) Strong approximation of renewal processes. Stochastic

Process. Appl. 18, No. 1, 127-138.

[43] Horvath, L. (1984b) Strong approximation of extended renewal processes. Ann.

Probab. 12, No. 4, 1149-1166.

[44] Horvath, L. (1986) Strong approximations of renewal processes and their

applications. Acta Math. Hungar. 47, No. 1-2, 13-28.

[45] Kalashnikov, V. (1997) Geometric Sums: Bounds for Rare Events with Applica-

tions. Kluwer. Dordrecht, The Netherlands.

[46] Kesten, H. (1973) Random difference equations and renewal theory for products

of random matrices. Acta Math. 131, 207-248.

[47] Kiefer, J. and Wolfowitz, J. (1956) On the characteristics of the general queueing

process with applications to random walks. Trans. Amer. Math. Soc. 78, 1-18.

[48] Kingman, J. (1963) Ergodic properties of continuous time Markov processes and

their discrete skeletons. Proc. London Math. Soc. 13, 593-604.

[49] Kontoyiannis, I., and Meyn, S. (2003) Spectral theory and limit theorems for

geometrically ergodic Markov processes. Ann. Appl. Probab. 13, 304-362.

[50] Kontoyiannis, I., and Meyn, S. (2004) Large deviations asymptotics and the

spectral theory of multiplicatively regular Markov processes. Preprint.

[51] Kushner, H. (1984) Approximation and weak convergence methods for random

processes. MIT Press Series in Signal Processing, Optimization and Control,

Cambridge, Massachusetts.

[52] Lai, T. (1976) Asymptotic moments of random walks with applications to ladder

variables and renewal theory. Annals of Probability. 4, 51-66.

[53] Lindley, D. (1952) The theory of a queue with a single-server. Proc. Cambr.

Philos. Soc. 48, 277-289.

[54] Lin, S., and Willmot, G. (2000) Lundberg approximations for compound

distributions with insurance applications. Springer-Verlag. New York.

[55] Mills, T. (1993) The econometric modelling of financial time series. Cambridge

University Press.

[56] Nelson, D. (1990) ARCH models as diffusion approximations. Journal of Econo-

metrics. 45, 7-38.

[57] Nyrhinen, H. (1999) On the ruin probabilities in a general economic environment.

St. Pr. and Appl. 83, 318-330.

[58] Nyrhinen, H. (2001) Finite and infinite time ruin probabilities in a stochastic

economic environment. St. Pr. and Appl. 92, 265-285.

[59] Paulsen, J. (1993) Risk theory in a stochastic economic environment, St. Pr. and

Appl. Vol. 46, 327-361.

[60] Paulsen, J. (1998) Sharp conditions for certain ruin in a risk process with stochas-

tic return on investments, St. Pr. and Appl. Vol. 75, 135-148.

[61] Pollak, M. and Siegmund, D. (1985) A diffusion and its applications to detecting

a change in the drift of Brownian motion. Biometrika, 72, 267-280.

[62] Philipp, W., and Stout, W. (1975) Almost Sure Invariance Principles for

Partial Sums of Weakly Dependent Random Variables. Providence, R.I.: American

Mathematical Society.

[63] Puhalskii, A., and Whitt, W. (1997) Functional large deviation principles for

first-passage-time processes. Ann. of Appl. Prob. Vol 7, No. 2, 362-381.

[64] Rudin, W. (1987) Real and Complex Analysis. McGraw-Hill. New York.

[65] Russell, R. (1998) The large deviations of random time changes. Ph.D. thesis.

Trinity College, Dublin.

[66] Shephard, N. (1996) Statistical aspects of ARCH and stochastic volatility. In:

Cox, D.R., Hinkley, D.V. and Barndorff-Nielsen, O.E. (eds) Likelihood, Time

Series with Econometric and Other Applications. Chapman and Hall.

[67] Skorokhod, A., Hoppensteadt, F., and Salehi, H. (2001) Random perturbation

methods. Springer-Verlag. New York.

[68] Siegmund, D. (1979) Corrected diffusion approximations in certain random walk

problems. Adv. Appl. Prob. 11, 701-719.

[69] Siegmund, D. (1985) Sequential Analysis. Springer-Verlag. New York.

[70] Spitzer, F. (1964) Principles of Random Walk. Van Nostrand. New York.

[71] Stone, C. (1965) On characteristic functions and renewal theory. Trans. Amer.

Math. Soc., Vol. 120, 327-342.

[72] Van Hoorn, M. H. (1984) Algorithms and Approximations for Queueing

Systems. Center for Mathematics and Computer Science. Amsterdam,

The Netherlands.

[73] Vervaat, W. (1979) On a stochastic difference equation and a representation of

non-negative infinitely divisible random variables. Adv. Appl. Probab. 11, 750-

783.

[74] Whitt, W. (2001) Stochastic-Process Limits. Springer-Verlag. New York.

[75] Wilkie, A. (1986) A Stochastic investment model for actuarial use. Transactions

of The Faculty of Actuaries. 39, 341,

[76] Willinger, W., Taqqu, M., Leland, W., and Wilson, D. (1995) Self-similarity in

high-speed packet traffic: Analysis and Modeling of Ethernet Traffic

Measurements. Statistical Science. Vol. 10, 67-85.

[77] Woodroofe, M. (1979) Repeated likelihood ratio tests. Biometrika. 66, 453-463.

[78] Woodroofe, M. (1982) Non-Linear Renewal Theory in Sequential Analysis. Soci-

ety for Industrial and Applied Mathematics. Philadelphia.

[79] Yor, M. (2001) Interpretations in terms of Brownian and Bessel meanders of the

distribution of a subordinated perpetuity. O. Barndorff-Nielsen, T. Mikosch and

S. Resnick (eds.) Lévy processes: theory and applications. 41-55, Birkhauser.

[80] Zajic, T. (1993) Large deviations for sample path processes and applications.

Stanford Ph.D. dissertation O.R.