LIMIT THEOREMS AND APPROXIMATIONS WITH
APPLICATIONS TO INSURANCE RISK AND
QUEUEING THEORY
a dissertation
submitted to the
department of management science and engineering
and the committee on graduate studies
of stanford university
in partial fulfillment of the requirements
for the degree of
doctor of philosophy
Jose H. Blanchet
August 2004
I certify that I have read this dissertation and that, in
my opinion, it is fully adequate in scope and quality as a
dissertation for the degree of Doctor of Philosophy.
Peter W. Glynn (Principal Adviser)
I certify that I have read this dissertation and that, in
my opinion, it is fully adequate in scope and quality as a
dissertation for the degree of Doctor of Philosophy.
Nicholas Bambos
I certify that I have read this dissertation and that, in
my opinion, it is fully adequate in scope and quality as a
dissertation for the degree of Doctor of Philosophy.
David O. Siegmund
Approved for the University Committee on Graduate
Studies.
Acknowledgements
First, I want to thank God for giving me the opportunity of living all these wonderful
experiences at Stanford together with my beloved wife, Citlalli. Thanks, Lalli, for
being extremely supportive and being always interested and willing to listen to my
ideas. I consider your support and help throughout the completion of this step in my
academic life extremely valuable!
My advisor, Professor Peter Glynn, has been a constant source of encouragement
and support. I had the fortune of enjoying a rich academic experience at Stanford and,
obviously, my interactions with Professor Glynn have played a crucial role in making
my Stanford experience so enjoyable. The example of Professor Glynn as researcher,
teacher and advisor is something that I treasure as one of the most important lessons
that I am keeping as a part of my learning experience.
I am grateful to the members of both the reading and examination committees
(Professors Bambos, Diaconis, Glynn, Siegmund and Van Roy), for taking the time
to read this dissertation and provide valuable feedback through useful conversations
and interesting questions. In particular, thanks to Professor Siegmund for the useful
comments that he provided during several insightful discussions. I also want to
acknowledge the support that I received, in particular at early stages of my Ph.D.
work, from Professor David Luenberger.
Naturally, I want to thank my parents and brothers, especially my mother and
sisters, Rocío and Roxanna, for being extremely supportive and loving! And, finally, I
would like to express my gratitude to my friends and colleagues at Stanford, who have
contributed to make my overall experience at Stanford wonderful in every dimension.
Thanks!
Contents
Acknowledgements
1 Introduction
2 Corrected Diffusion Approximations
2.1 The Main Results
2.2 Short-time Asymptotics for the Cauchy Process
2.3 Reduction to Cauchy Process’ Asymptotics
2.4 An Asymptotic Expansion for $I(\theta, b)$
2.5 Expansions for $r(\Delta)$ and $E_\theta R^k(\infty)$
2.5.1 The Expansion for $r(\Delta)$
2.5.2 The Expansion for $E_\theta R(\infty)^k$ as $\theta \searrow 0$
2.6 Technical Proofs
3 Cramer-Lundberg with Heavy Tails
3.1 A Cramer-Lundberg Representation
3.2 Connection to Corrected Diffusion Approximations
3.3 Technical Development
4 Geometric Sums and Applications
4.1 Asymptotics for Geometric Sums
4.2 Asymptotics of Defective Renewal Equations
5 Approximating Discounted Rewards
5.1 Motivating Examples
5.2 Law of Large Numbers
5.3 The Central Limit Theorem
5.4 Edgeworth Expansion
5.4.1 The discrete time setting
5.4.2 The continuous time setting
5.5 Large Deviations
5.5.1 The continuous time setting
5.5.2 The discrete time setting
Bibliography
Chapter 1
Introduction
This dissertation focuses on the development of limit theorems and approximations
for several performance measures that play an important role in a great variety of
applied disciplines including: Insurance Risk Theory, Queueing Theory, Statistical
Sequential Analysis, and Time Series Analysis, among others. To be more precise, let
us utilize the insurance setting as a vehicle to provide a unified overview of the types
of results that are developed in the subsequent chapters of this dissertation.
When dealing with the contingent nature of the insurance business, risk managers
take advantage of stochastic models and tools that are used to effectively assess
the risk of insurance portfolios (see Bowers et al. (1997)). A popular model, widely
used in the insurance community to analyze collective risk, is the so-called
renewal model (see Bowers et al. (1997) p. 432 and Asmussen (2001) Ch. 5). The
renewal model assumes that the claims arrive according to a renewal process, with
independent and identically distributed (iid) inter-arrival times. It is also assumed that
the claim sizes are represented by a sequence of iid non-negative random variables
(rv’s), independent of the arrival process. Finally, the model specifies a constant
(aggregated) premium rate, which is received by the insurance company. A fundamental
quantity in the risk analysis of insurance portfolios is the so-called ruin probability
or probability of bankruptcy. Of course, if the premium rate charged is less than or
equal to the equilibrium pay-out rate, then the law of large numbers (LLN) implies that
the company will go bankrupt eventually with probability one. Consequently, insurance
companies would typically charge a positive “safety loading” in addition to the
equilibrium pay-out rate. Note, however, that in competitive environments, one would
typically expect insurance companies to charge small safety loadings to their customers.
The first portion of this dissertation addresses the problem of understanding the
probability of eventual ruin, parametrically in the premium rate, under low safety
loading environments. This problem, in turn, involves studying the mathematical
structure of a random walk with small negative drift. Indeed, the time to bankruptcy
in the renewal model can be represented as the first hitting time to a certain level
(which is just the initial reserve level of the company) for a random walk that has a
negative drift proportional to the safety loading. As a result, ruin occurs in finite
time if the maximum of a random walk with negative drift ever exceeds this level,
or, equivalently, if the corresponding first hitting time to this level is finite.
Consequently, the aforementioned insurance problem motivates the parametric analysis of
the distribution of the maximum of a random walk with small negative drift.
Incidentally, the distribution of the all-time maximum of a random walk with negative
drift corresponds to the steady-state waiting time distribution (excluding service)
of the single server queue (which is one of the most fundamental models in the theory
of queues). As in the insurance setting discussed previously, the underlying random
walk would often have close to zero drift, which translates into the so-called
heavy traffic regime that is widely used in the modern analysis of queueing systems.
Heavy traffic analysis is often done through diffusion approximations. In fact, as we
shall see in Chapter 2, our parametric analysis of the distribution of the maximum
of a random walk with close to zero drift corrects the natural diffusion approximation
based on Brownian motion (which provides a crude “first order” approximation
to the distribution of the all-time maximum of the random walk). Corrected diffusion
approximations (CDA’s) for the distribution of the maximum of a random walk were
introduced by Siegmund (1979). Siegmund’s second order correction to the standard
Brownian approximation was motivated by applications in Statistical Sequential
Analysis, specifically by applications related to the proper design of statistical tests
that run up to a suitably defined first hitting time of an underlying random walk. The
theory presented in Chapter 2 extends the development initiated by Siegmund (1979) and
subsequent results in Statistical Sequential Analysis (see, for example, Chang (1992)
and Chang and Peres (1997)).
The previous discussion presents some examples of applied disciplines that can
potentially benefit from the results in the second chapter of this dissertation. Of course,
in some of these disciplines, stylized features arising from modeling considerations
and statistical analysis of the data may give rise to additional technical complications
that must be addressed. For example, in the insurance setting described before, it
turns out that, in several branches of the insurance business (such as property
insurance), heavy tailed structure (in particular, claim sizes that do not have exponential
moments) seems to be an appropriate modeling feature to consider. (Other examples
are discussed in Chapter 3 below.) Unfortunately, techniques (such as exponential
changes of measure) that are extremely useful in the analysis of light tailed systems
(i.e. assuming the existence of exponential moments) do not extend to the heavy tailed
case. For instance, again coming back to the insurance arena, the corrected diffusion
approximation by Siegmund (1979), and the extension provided in Chapter 2 of this
dissertation, rely on light tailed techniques. Another approximation for the ruin
probability, which is typically very powerful in light tailed settings, is the celebrated
Cramer-Lundberg approximation. It turns out that, in the light tailed case, the CDA
presented in Chapter 2 and the Cramer-Lundberg approximation are intimately
connected. Due to its success in applications involving light tailed characteristics,
analogous forms of the Cramer-Lundberg approximation have been developed
to cover a large class of heavy tailed claims (more precisely, subexponential claims,
see Embrechts, Klüppelberg and Mikosch (1997)). These extensions to heavy tailed
contexts are developed for large values of the initial reserve and fixed safety loading,
and typically perform poorly for the values of the initial reserve encountered in
practical applications (see Embrechts, Klüppelberg and Mikosch (1997) p. 54). In
Chapter 3, we introduce a new interpretation of the Cramer-Lundberg approximation
for heavy tailed claims under the low safety loading asymptotic regime. In this
dissertation (specifically in Chapter 3) we focus only on the proposed Cramer-Lundberg
type of approximation in diffusion scale, which is related to the CDA presented in
Chapter 2. Thus, in simple terms, Chapter 3 provides a new Cramer-Lundberg type
of approximation for heavy tailed claims, interpreted in a low safety loading
asymptotic regime, that seems to perform well in practical applications. (See Asmussen
and Binswanger (1997), who analyzed a related approximation provided by Hogan
(1986), which is discussed in Chapter 3 of this dissertation.) The approximation
provided in Chapter 3 blends accurate approximations in diffusion scale with standard
Cramer-Lundberg asymptotics for large values of the reserve in a coherent way; this
parallels the relationship between the CDA of Chapter 2 and the Cramer-Lundberg
asymptotic in the light tailed case.
As was mentioned before, the analysis of stochastic systems with heavy tailed
characteristics gives rise to technical complications due to the fact that standard
light tailed techniques are infeasible. In order to deal with the problem of providing
accurate approximations for the probability of bankruptcy in heavy tailed contexts,
we developed new techniques that, in particular, are applied to obtain the results
described in the previous paragraph. These new techniques are presented in Chapter
4 of this dissertation. More precisely, Chapter 4 develops asymptotic expansions
of so-called random geometric sums (or geometric convolutions) when the success
parameter of the geometric random variable is close to zero. The direct connection to
the ruin problem and the distribution of the maximum of a random walk comes from
a well known representation of the all-time maximum of a random walk as the sum of a
geometric number of iid positive random variables. The techniques developed in Chapter 4
have implications beyond the ruin problem previously discussed. In particular, as we
shall see, asymptotic expansions of geometric sums are closely related to so-called
defective renewal equations. As we discuss in Chapter 4, these types of integral
equations arise naturally in many areas of applied probability (including queueing
theory and insurance risk theory). The asymptotics developed for geometric sums are
then used to obtain expansions for defective renewal equations that are close to being
proper. Again, this asymptotic regime arises repeatedly in queueing and insurance.
For instance, as we shall see in Chapter 4, these results are useful in the development
of corrected heavy traffic approximations for M/G/c queueing models and in the
analysis of generalizations of classical renewal risk models.
Finally, it should be recognized that investments may play an important role in
the bankruptcy of insurance companies. Indeed, it follows that if one introduces
investment effects in the risk reserve, the probability of bankruptcy can be expressed in
terms of the distribution of a so-called perpetuity or infinite horizon discounted reward
(see Asmussen (2001) Ch. 7). This motivates the theme of the last chapter of this
dissertation, namely Chapter 5. Specifically, in Chapter 5 we develop approximations
for the distribution of infinite horizon discounted rewards. The theory provided in
Chapter 5 is developed, just as in the previous chapters, in a “low profit environment”,
which again is natural in many application settings (such as the insurance context
that we have been emphasizing). In particular, we develop central limit theorems,
laws of large numbers, Edgeworth expansions and large deviation principles (rough
and exact) for the distribution of perpetuities under low interest rates. As we shall
also discuss in Chapter 5, these approximations are relevant not only to the insurance
ruin problem, but also to other applied disciplines (including time series analysis and
finance).
Chapter 2
Corrected Diffusion Approximation
for the Maximum of Random Walk
Let $(X_n : n \ge 1)$ be a sequence of independent and identically distributed (iid) random
variables (rv’s), and let $S = (S_n : n \ge 0)$ be its associated random walk (so that $S_0 = 0$
and $S_n = X_1 + \cdots + X_n$ for $n \ge 1$). In this chapter, we focus on the development of
high accuracy approximations to the distribution of the maximum r.v.
\[ M = \max\{S_n : n \ge 0\}. \]
Clearly, $-\mu \triangleq EX_1$ must typically be negative in order that $M$ be finite-valued. The
distribution of $M$ is of importance in a number of different disciplines.
For $x > 0$, $\{M > x\} = \{\tau(x) < \infty\}$, where $\tau(x) = \inf\{n \ge 1 : S_n > x\}$, so
that computing the tail of $M$ is equivalent to computing a level crossing probability
for the random walk $S$. Because of this level crossing interpretation, the tail of $M$
is of great interest to both the sequential analysis and risk theory communities. In
particular, in the setting of insurance risk, $P(\tau(x) < \infty)$ is the probability that an
insurer will face ruin in finite time (when the insurer starts with initial reserve $x$ and
is subjected to iid claims over time); see, for example, Asmussen (2000).
The distribution of $M$ also arises in the analysis of the single most important model
in queueing theory, namely the single-server queue. If the inter-arrival and service
times for successive customers are iid with a mean arrival rate less than the mean
service rate, then $W = (W_n : n \ge 0)$ is a positive recurrent Markov chain on $[0, \infty)$,
where $W_n$ is the waiting time (exclusive of service) for customer $n$. If $W_\infty$ is a random
variable having the stationary distribution of $W$, then Kiefer and Wolfowitz (1956)
showed that $W_\infty$ has the distribution of $M$ for an appropriately defined random walk.
As a consequence, computing the distribution of $M$ is of fundamental importance to
queueing theorists.
Since $W$ is a positive recurrent Markov chain, the distribution of $M$ can be
computed as the solution to the equation describing the stationary distribution of $W$.
This linear integral equation is known as Lindley’s equation (see Lindley (1952)) and
is of Wiener-Hopf type; it is challenging to solve, both analytically and numerically.
As a result, approximations are frequently employed instead. One important such
approximation holds as $\mu \searrow 0$. This asymptotic regime corresponds in risk theory to
the setting in which the “safety loading” is small (i.e. the premium charged is close to
the typical pay-out for claims) and in queueing theory to the “heavy traffic” setting
in which the server is utilized close to 100% of the time. Thus, this asymptotic regime
is of great interest from an applications standpoint. Kingman (1963) showed that the
approximation
\[ P(M > x) \approx \exp\left(-2\mu x/\sigma^2\right) \tag{1} \]
is valid as $\mu \searrow 0$, where $\sigma^2 = \mathrm{Var}(X_1)$. (A more precise statement of this result
will be given in Section 2.1.) Because the right hand side of (1) is the exact value of
the level crossing probability for the natural Brownian approximation to the random
walk $S$, (1) is often called the diffusion approximation to the distribution of $M$.
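As a hedged illustration of (1), one can compare it with a case in which the tail of M is known in closed form. The sketch below assumes the M/M/1 queue, which is not treated in the text: increments X = V − T, with V exponential with rate `nu` (service) and T exponential with rate `lam` (inter-arrival), for which the stationary waiting time satisfies P(M > x) = (lam/nu) exp(−(nu − lam) x). The function names and parameter values are illustrative only.

```python
import math


def exact_tail(x, lam, nu):
    """Exact P(M > x) for X = V - T, V ~ Exp(rate nu), T ~ Exp(rate lam), lam < nu."""
    return (lam / nu) * math.exp(-(nu - lam) * x)


def diffusion_tail(x, lam, nu):
    """Approximation (1): exp(-2*mu*x/sigma^2), with -mu = E[X] and sigma^2 = Var(X)."""
    mu = 1.0 / lam - 1.0 / nu              # magnitude of the negative drift
    sigma2 = 1.0 / lam**2 + 1.0 / nu**2    # variance of V - T
    return math.exp(-2.0 * mu * x / sigma2)


nu = 1.0
for lam in (0.5, 0.9, 0.99):
    x = 1.0 / (nu - lam)  # level scaled so that the exact exponent stays equal to -1
    ratio = diffusion_tail(x, lam, nu) / exact_tail(x, lam, nu)
    print(f"lam={lam:.2f}  x={x:7.1f}  approx/exact={ratio:.4f}")
```

In this example the ratio of the approximation to the exact tail moves toward one as `lam` approaches `nu`, which is the heavy-traffic regime described above; for moderate utilization the discrepancy is substantial.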
As with any such approximation, there are applications for which (1) delivers
poor results. Siegmund (1979) therefore proposed a so-called “corrected diffusion
approximation” that reflects information in the increment distribution beyond the
mean and variance. This corrected diffusion approximation computes the next term in
the asymptotic (as $\mu \searrow 0$) beyond that given by the right hand side of (1). The main
result in this chapter (Theorem 1) is a development of the full asymptotic expansion
initiated by Siegmund. We compute all the terms in the asymptotic expansion for
general random walks with increments having exponential moments; see Section 2.5 for
details on the calculation of the relevant coefficients in the expansion. Our theorem
can be viewed as a non-Gaussian counterpart to the corresponding expansion provided
recently by Chang and Peres (1997) for Gaussian random walks. As perhaps expected,
the mathematical approach followed here is quite different from that used by Chang
and Peres.
As is well known in the literature, there is a close connection between such
corrections and asymptotic expansions for the moments of the ascending ladder height
random variables associated with the random walk. Theorem 2 establishes an asymptotic
expansion for the mean of the first strict ascending ladder height for random
walks with light-tailed symmetric and continuous increments. As indicated in Section
2.5, this permits one to develop asymptotic expansions for all the moments of the
ascending ladder heights (and for the limiting overshoot induced by the associated
renewal process); see also Theorem 4.
This chapter is organized as follows. The main results are described in Section
2.1. A key connection to asymptotic expansions for the “short-time” behavior of the
Cauchy process is made in Section 2.2. Section 2.3 shows how all the integrals required
for our asymptotic expansion can be reduced to the short-time asymptotics of Section
2.2. Section 2.4 provides rigorous support for the remaining details in the argument
used to compute the coefficients in the expansion. Section 2.5 summarizes the
computation of the coefficients, and discusses an expansion related to the moments
of the strict ascending ladder height. Any proof that does not immediately follow the
statement of its result can be found in our final section, namely Section 2.6.
2.1 The Main Results
To state our main results, we adopt the parameterization utilized by Siegmund (1979).
We assume throughout this chapter that the $X_i$’s have exponential moments, so that
$E\exp(\theta X_1) < \infty$ for $\theta$ in a neighborhood containing the origin. For such $\theta$, define
\[ \psi(\theta) = \log E\exp(\theta X_1). \]
Then, for each such $\theta$, we can define the probability measure $P_\theta$ having the property
that for $n \ge 0$,
\[ P_\theta(A) = E\left(\exp(\theta S_n - n\psi(\theta))\, 1_A\right) \]
for $A \in \sigma(S_j : 0 \le j \le n)$. As is well known, $S$ is again a random walk with iid
increments under $P_\theta$, having common increment distribution
\[ P_\theta(X_1 \in dx) = \exp(\theta x - \psi(\theta))\, P(X_1 \in dx) \]
for $x \in \mathbb{R}$ (with mean $E_\theta X_1 = \psi'(\theta)$ and variance $\mathrm{Var}_\theta(X_1) = \psi''(\theta)$). Without
any loss of generality, assume that $EX_1 = 0$ and $\mathrm{Var}(X_1) = 1$. Since $\psi(\cdot)$ is strictly
convex on its domain of finiteness, $E_\theta X_1 < 0$ for $\theta < 0$. Thus, $P_\theta$ induces a random
walk with negative drift when $\theta < 0$. We therefore focus on corrected approximations
to $P_\theta(M > x)$ as $\theta \nearrow 0$.
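The exponential change of measure above can be checked numerically on a concrete increment distribution. The following minimal sketch assumes X = E − 1 with E a unit-rate exponential (so EX = 0 and Var(X) = 1, as in the normalization above); this example distribution, the function names and the quadrature settings are assumptions made for illustration and do not come from the text. For this choice ψ(θ) = −θ − log(1 − θ) for θ < 1.

```python
import math


def psi(theta):
    """log E exp(theta*X) for X = E - 1, E ~ Exp(1); finite for theta < 1."""
    return -theta - math.log1p(-theta)


def tilted_mean_var(theta, upper=60.0, n=200_000):
    """Mean and variance of X under P_theta, by midpoint quadrature of the
    tilted density exp(theta*x - psi(theta)) * exp(-(x + 1)) on (-1, upper)."""
    h = (upper + 1.0) / n
    c = psi(theta)
    m0 = m1 = m2 = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * h
        w = math.exp(theta * x - c - (x + 1.0)) * h
        m0 += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean * mean


theta = -0.2
mass, mean, var = tilted_mean_var(theta)
print(mass, mean, theta / (1.0 - theta))   # tilted mean versus psi'(theta)
print(var, 1.0 / (1.0 - theta) ** 2)       # tilted variance versus psi''(theta)
```

For negative `theta` the tilted mean is negative, matching the statement that the tilted measure induces a random walk with negative drift.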
A key step in the analysis of $P_\theta(M > x)$ is the judicious application of Wald’s
likelihood ratio identity; see, for example, Siegmund (1985), p. 13. For $\theta_0$ in some
interval of the form $(-\eta, 0)$, there exists a positive $\theta_1$ such that $\psi(\theta_0) = \psi(\theta_1)$.
Set $\Delta = \theta_1 - \theta_0$. Note that parameterizing in terms of $\Delta$ is essentially equivalent
to parameterizing in terms of $\theta_0$ (or in terms of the drift $\mu = -\psi'(\theta_0)$). The
likelihood ratio identity then asserts that
\[ P_{\theta_0}(\tau(x) < \infty) = E_{\theta_1}\exp\left(-(\theta_1 - \theta_0) S_{\tau(x)}\right) = \exp(-(\theta_1 - \theta_0)x)\, E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(x)), \tag{2} \]
where $R(x) = S_{\tau(x)} - x$ is the so-called “overshoot” at level $x$.
Suppose now that $X_1$ is strongly non-lattice, in the sense that for each $\delta > 0$,
\[ \inf_{|\lambda| > \delta} |1 - g(\lambda)| > 0, \tag{3} \]
where $g(\lambda) = E\exp(i\lambda X_1)$ is the characteristic function of $X_1$ (under $P_0$). Applying
renewal theory to the random walk at strictly increasing ladder epochs then
establishes that
\[ E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(x)) \to E_{\theta_1}\exp(-(\theta_1 - \theta_0) R(\infty)) \tag{4} \]
as $x \to \infty$.
Siegmund (1979) showed that the renewal theorem can be applied uniformly for
$\Delta < \eta$ (see also Chang (1992)). Hence, (2) yields
\[ P_{\theta_0}(M > x) = \exp(-\Delta x)\, E_{\theta_1}\exp(-\Delta R(\infty)) + o(\exp(-(\Delta + r)x)) \tag{5} \]
for some $r > 0$ (uniformly in $\theta_0 > -\eta/2$). In insurance risk theory, $\Delta$ is called
the “adjustment coefficient” and the quantity $E_{\theta_1}\exp(-\Delta R(\infty))$ is known as the
Cramer-Lundberg constant (cf. Asmussen (2001)).
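The conjugate pair (θ0, θ1) and the adjustment coefficient ∆ = θ1 − θ0 are easy to compute numerically once ψ is available. The sketch below is a minimal example under an assumed increment distribution, X = E − 1 with E a unit-rate exponential (mean zero, unit variance), for which ψ(θ) = −θ − log(1 − θ) on θ < 1; the distribution, the bisection bracket and the function names are illustrative assumptions rather than part of the original development.

```python
import math


def psi(theta):
    """log-moment generating function of X = E - 1, E ~ Exp(1); finite for theta < 1."""
    return -theta - math.log1p(-theta)


def dpsi(theta):
    """psi'(theta), the mean of X under the tilted measure P_theta."""
    return theta / (1.0 - theta)


def conjugate_root(theta0, iterations=200):
    """Bisection for theta1 in (0, 1) with psi(theta1) = psi(theta0), for theta0 < 0.
    psi is increasing on (0, 1), so the root is unique there."""
    target = psi(theta0)
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if psi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for theta0 in (-0.1, -0.01):
    theta1 = conjugate_root(theta0)
    delta = theta1 - theta0
    drift = -dpsi(theta0)  # mu = -psi'(theta0)
    print(f"theta0={theta0}  theta1={theta1:.6f}  Delta={delta:.6f}  mu/(Delta/2)={drift / (delta / 2):.4f}")
```

In this example the ratio of the drift µ to ∆/2 approaches one as θ0 increases to zero, which is the sense in which parameterizing by ∆, by θ0 or by µ is essentially equivalent.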
Relation (5) may alternatively be written as
\[ P_{\theta_0}(\Delta M > x) = \exp(-x)\, E_{\theta_1}\exp(-\Delta R(\infty)) + o(\exp(-rx/\Delta)), \tag{6} \]
where $o(\exp(-rx/\Delta))$ is uniform in $\theta_0 > -\eta/2$. Note that $\exp(-x)$ is precisely the
level crossing probability of level $x/\Delta$ for a Brownian motion with drift $-\Delta/2$ and
unit variance. Since $E_{\theta_0} X_1 \sim -\Delta/2$ as $\theta_0 \nearrow 0$, (6) provides rigorous support for
the diffusion approximation (1). Furthermore, a correction to the diffusion
approximation described at the beginning of this chapter can be obtained by developing an
asymptotic expansion for $E_{\theta_1}\exp(-\Delta R(\infty))$.
Siegmund (1979) obtained his corrected diffusion approximation by showing that
\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \exp(-\Delta\beta_1) + o\left(\Delta^2\right) \tag{7} \]
as $\Delta \downarrow 0$, where $\beta_1$ can be computed explicitly as
\[ \beta_1 = \frac{1}{6} EX_1^3 - \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{1}{\lambda^2}\, \mathrm{Re}\log\{2(1 - g(\lambda))/\lambda^2\}\, d\lambda. \tag{8} \]
Note that by computing the single integral (8), Siegmund’s corrected diffusion
approximation to the distribution of $M$ provides a parametric approximation that is
valid for all random walks having negative drift sufficiently close to zero. Such
parametric approximations are convenient in many application settings (e.g., in studying
the behavior of a queue when utilization is close to 100%).
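To make the integral in (8) concrete, the sketch below evaluates β1 numerically for standard Gaussian increments, for which g(λ) = exp(−λ²/2) is real and EX³ = 0. The choice of Gaussian increments, the truncation point `L`, the step count and the function name are assumptions made for this example. Since the integrand is even in λ, the integral over the real line is twice the integral over (0, ∞); beyond `L` the factor 1 − g(λ) is replaced by 1 and the remaining tail is integrated in closed form.

```python
import math


def beta1_gaussian(L=40.0, n=200_000):
    """beta_1 from (8) for N(0, 1) increments, via the midpoint rule on (0, L)
    plus a closed-form tail on (L, infinity)."""
    h = L / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        one_minus_g = -math.expm1(-0.5 * lam * lam)  # 1 - g(lam), computed stably
        total += math.log(2.0 * one_minus_g / (lam * lam)) / (lam * lam) * h
    # Tail: integral over (L, inf) of log(2 / lam^2) / lam^2, where g(lam) is negligible.
    tail = (math.log(2.0) - 2.0 * math.log(L) - 2.0) / L
    half_line = total + tail
    third_moment_term = 0.0  # E X^3 / 6 vanishes for symmetric increments
    # (1 / (2*pi)) * integral over R equals (1 / pi) * integral over (0, inf).
    return third_moment_term - half_line / math.pi


print(beta1_gaussian())
```

The computed value is close to 0.583, which agrees with the constant usually quoted for the Gaussian random walk in the sequential analysis literature. Under (7), the corrected approximation then replaces the level x in exp(−∆x) by x + β1.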
Our main theorem shows that there is a full asymptotic expansion for
\[ r(\Delta) \triangleq \log E_{\theta_1}\exp(-\Delta R(\infty)). \]
Theorem 1 Suppose that $X_1$ has exponential moments and is strongly non-lattice.
Then, $r(\cdot)$ (initially defined on $[0, \upsilon)$ for $\upsilon > 0$) admits an analytic extension on a
neighborhood of the origin in the complex plane.
Remark An immediate consequence of Theorem 1 and the implicit function
theorem is that the Cramer-Lundberg constant, namely $\exp(r(\Delta(\theta_0)))$, initially defined
for all $\theta_0 < 0$ sufficiently close to zero, admits an analytic extension on a disc
containing the origin in the complex plane.
According to Theorem 1,
\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \exp\left(\sum_{n=1}^{\infty} \beta_n \Delta^n\right), \tag{9} \]
where $\beta_1$ is given by (8) and $\beta_2 = 0$. (This latter equality follows from the fact that
the error term in (7) is $o(\Delta^2)$.) Obviously, in order for (9) to be useful from an
applied standpoint, we need a means of numerically computing the $\beta_n$’s. This issue
is discussed in Section 2.5. We establish there that the $\beta_n$’s can be successively
computed via a finite number of one-dimensional integrations reminiscent of the integral
appearing in (8). Thus, the $\beta_n$’s can easily be computed, thereby yielding cheaply
computable high-order parametric corrections to the diffusion approximation (1).
The argument above also permits us to establish asymptotic expansions for certain
ladder height quantities. As noted earlier, renewal theory applies to the random walk
when sampled at strictly increasing ladder epochs. The renewal theorem invoked
above actually establishes that
\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right)}{\Delta E_{\theta_1} S_{\tau_+}}, \tag{10} \]
where $\tau_+ = \inf\{n \ge 1 : S_n > 0\}$ is the first (strict) increasing ladder epoch (see
Asmussen (1987)). In view of (2), it follows that
\[ 1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right) = P_{\theta_0}(\tau_+ = \infty). \tag{11} \]
Random walk duality (see, for example, p. 173 of Siegmund (1985)) implies that
\[ P_{\theta_0}(\tau_+ = \infty) = 1/E_{\theta_0}\tau_-, \tag{12} \]
where $\tau_- = \inf\{n \ge 1 : S_n \le 0\}$. If the $X_i$’s are symmetric rv’s with common
continuous distribution function, $\Delta = 2\theta_1$ and $E_{\theta_0}\tau_- = E_{\theta_1}\tau_+$. Furthermore, (10) to
(12) then imply that
\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{1}{2\theta_1 \left(E_{\theta_1} S_{\tau_+}\right)\left(E_{\theta_1}\tau_+\right)}. \]
In view of Wald’s identity, we then obtain the relation
\[ E_{\theta_1}\exp(-\Delta R(\infty)) = \frac{\psi'(\theta_1)}{2\theta_1 \left(E_{\theta_1} S_{\tau_+}\right)^2}. \]
As a consequence, Theorem 1 then yields a full asymptotic expansion for the expected
ladder height $E_{\theta_1} S_{\tau_+}$. We record this result as our Theorem 2.
Theorem 2 Assume that $X_1$ has exponential moments and is symmetric with a
continuous distribution function. Then,
\[ E_{\theta_1} S_{\tau_+} = \sqrt{\frac{\psi'(\theta_1)}{2\theta_1}}\, \exp\left(-\frac{1}{2}\sum_{m=0}^{\infty} \beta_{2m+1} (2\theta_1)^{2m+1}\right). \]
Given our above argument, the only remaining issue in proving Theorem 2 is
establishing that $\beta_{2n} = 0$ for $n \ge 1$ in the presence of symmetry. This fact is proven
in Section 2.6.
The most important device that we use to prove Theorems 1 and 2 is a convenient
representation for $r(\Delta)$. This representation is a key idea in our mathematical
development. To introduce our representation, put $\phi(\theta) = E\exp(\theta X_1)$ for $\theta \in \mathbb{R}$ and,
for $z \in \mathbb{C}$, set $\gamma(z) = E\exp(zX_1)$. Note that $\phi$ is finite-valued on a neighborhood $\mathcal{N}$
of the origin and $\gamma$ is analytic on the strip $\{x + iy : x \in \mathcal{N}, y \in \mathbb{R}\}$. For non-negative
$\theta \in \mathcal{N}$ and $b \in \mathbb{R}$, put
\[ \rho(\theta, b) = \log E_\theta\exp(-bR(\infty)). \]
Note $r(\Delta) = \rho(\theta_1, \Delta)$, where $\theta_1 = \theta_1(\Delta) > \theta_0(\Delta) = \theta_0$ is such that $\psi(\theta_1(\Delta)) =
\psi(\theta_0(\Delta))$. Woodroofe (1979) showed that
\[ \rho(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda}\, \log\left(\frac{\gamma(\theta) - \gamma(\theta + i\lambda)}{-i\phi'(\theta)\lambda}\right) d\lambda; \tag{13} \]
see also Corollary 8.45 and Theorem 8.51 of Siegmund (1985). While (13) is
convenient for many purposes, it presents difficulties in the current circumstances because
of the singularity (in the logarithm) that arises when $\theta \searrow 0$. The following
representation for $\rho(\theta, b)$ is free of such singularities.
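Theorem 3 Suppose $X_1$ has exponential moments and is strongly non-lattice. Then,
for non-negative $\theta \in \mathcal{N}$ and $b > 0$,
\[ \rho(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda}\, \log\left(\frac{2(\gamma(\theta) - \gamma(\theta + i\lambda))}{\lambda(\lambda - 2i\phi'(\theta))}\right) d\lambda. \tag{14} \]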
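Siegmund’s computation of $\beta_1$ takes advantage of the fact that the first order
behavior of $r(\Delta)$ should match that of
\[ s(\Delta) = \log E_0\exp(-\Delta R(\infty)). \tag{15} \]
Since $s(\Delta) = \rho(0, \Delta)$, Theorem 3 implies that
\[ s(\Delta) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda}\, \log\left(2(1 - g(\lambda))\lambda^{-2}\right) d\lambda; \tag{16} \]
see also p. 226 of Siegmund (1985). We proceed to analyze $\rho(\theta, b)$ by writing $\rho(\theta, b) =
s(b) + I(\theta, b)$. In view of both Theorem 3 and (16),
\[ I(\theta, b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{-b}{(b + i\lambda)\, i\lambda}\, \log\left(\frac{\lambda(\gamma(\theta) - \gamma(\theta + i\lambda))}{(\lambda - 2i\phi'(\theta))(1 - g(\lambda))}\right) d\lambda. \tag{17} \]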
In the next sections, we develop asymptotics, as $b \searrow 0$, appropriate to the
integrals arising in (16) and (17). Such asymptotics can be used to provide asymptotic
expansions for the moments (or, equivalently, the cumulants) of the limiting
overshoot r.v. $R(\infty)$ under $P_\theta$ as $\theta \searrow 0$. Specifically, for $n \ge 1$, let
\[ \kappa_n(\theta) = (-1)^n \left.\frac{\partial^n}{\partial b^n}\rho(\theta, b)\right|_{b=0}. \]
Theorem 4 Assume that $X_1$ has exponential moments and is strongly non-lattice.
Then (for all $n \ge 1$) $\kappa_n(\cdot)$, initially defined on $[0, \upsilon)$ for $\upsilon > 0$, can be extended to be
an analytic function throughout a disc in the complex plane containing the origin.
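An important implication of Theorem 4 is that it can be directly applied to obtain
complete asymptotics for the steady-state mean of the waiting time sequence, namely
$E_{\theta_0} M$ $(= E_{\theta_0} W_\infty)$. In particular, Siegmund (1979) shows (see also Theorem 6.7, p.
275, of Asmussen (1987)) that
\[
\begin{aligned}
E_{\theta_0} M &= \frac{E_{\theta_0}\left(S_{\tau_+} \,\middle|\, \tau_+ < \infty\right)}{P_{\theta_0}(\tau_+ = \infty)}
= \frac{E_{\theta_1} S_{\tau_+}\exp\left(-\Delta S_{\tau_+}\right)}{1 - E_{\theta_1}\exp\left(-\Delta S_{\tau_+}\right)} \\
&= \frac{E_{\theta_1}(1 - R(\infty))\exp(-\Delta R(\infty))}{\Delta E_{\theta_1}\exp(-\Delta R(\infty))}
= \frac{1}{\Delta} + \frac{1}{\Delta}\frac{\partial}{\partial b}\rho(\theta_1, \Delta).
\end{aligned}
\tag{18}
\]
Thus, since
\[ \frac{\partial}{\partial b}\rho(\theta, b) = \sum_{m=0}^{\infty} (-1)^m \kappa_{m+1}(\theta)\frac{b^m}{m!}, \]
it follows that Theorem 4 can be applied directly to provide the full asymptotic
expansion for $E_{\theta_0} M$. Indeed, our analysis in Sections 2.2 to 2.4 yields an asymptotic
expansion for $\kappa_n(\cdot)$ around zero which in turn implies the expansion
\[ E_{\theta_0} M = \frac{1}{\Delta} + \sum_{m=0}^{n}\sum_{j=0}^{n-m} (-1)^m \kappa^{(j)}_{m+1}(0)\, \frac{\theta_1(\Delta)^j}{j!}\, \frac{\Delta^m}{m!} + O\left(\Delta^{n+1}\right) \]
valid for all $n \ge 0$. The explicit computation of the derivatives $\kappa^{(j)}_{m+1}(0)$, for $j, m \ge 0$,
is discussed in Section 2.5.2.
Finally, the analytic extension of $\kappa_n(\cdot)$ and $r(\cdot)$ is a consequence of the following
result.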
Proposition 1 If $X_1$ has exponential moments and a strongly non-lattice distribution,
then $I(\cdot)$ (defined as in (17) on a domain containing $[0, \upsilon) \times [0, \upsilon)$ with $\upsilon > 0$) can
be analytically extended throughout a disc containing the origin in $\mathbb{C} \times \mathbb{C}$.
Moreover, with the aid of Theorem 1 it follows easily (from (18) and the implicit
function theorem) that $\Delta E_{\theta_0} M$ (initially defined for $\theta_0 < 0$) can be analytically
extended (as a function of $\Delta(\theta_0)$) in a neighborhood of the origin in the complex
plane.
2.2 Short-time Asymptotics for the Cauchy Process
The approach described in Section 2.1 suggests computing an asymptotic expansion
for $r(\Delta)$ by developing appropriate expansions for $s(\Delta)$ and $I(\theta_1, \Delta)$. In this
section, we will show how asymptotics for $s(\Delta)$ can be obtained. Section 2.3 shows how
asymptotics for $I(\theta, \Delta)$ (and, as a result, also for $I(\theta_1(\Delta), \Delta)$) can be reduced to
the types of integrals considered here.
Since s (b) is real for b positive, it follows that the integral of the imaginary part
of (16) must vanish. Hence, s (b) equals the integral of the real part of (16), so that
s (b) =1
2π
Z ∞
−∞
b
b2 + λ2Re log
¡2 (1− g (λ))λ−2¢ dλ (19)
− 12π
Z ∞
−∞
b¡b2 + λ2
¢λIm log (1− g (λ)) dλ.
Both of the above integrals take the form
\[
K(b,f)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^{2}+\lambda^{2}}f(\lambda)\,d\lambda
=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{1+\lambda^{2}}f(\lambda b)\,d\lambda \tag{20}
\]
for suitably defined $f$. Note that if $Y=(Y(t):t\ge 0)$ is a standard Cauchy process (so that $Y(1)$ is distributed as a standard Cauchy r.v.), $K(t,f)$ can then be represented as
\[
K(t,f)=\frac{1}{2}E\left(f(Y(t))\mid Y(0)=0\right).
\]
Hence, representing K (t, f) as a power series in t is equivalent to the development of
short-time asymptotics of the Cauchy process. Such asymptotics are also of general
analytical interest, because of their relevance to Fourier analysis. Integrals of the
type (20) are closely related to “approximate identities of the Fejer type”; see p. 31
of Butzer (1971).
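The integral form (20) can be checked numerically. The sketch below is an illustration only: the helper name `cauchy_kernel_integral`, the midpoint rule, and the test function $f(\lambda)=1/(1+\lambda^{2})$ are assumptions made for this example and do not come from the original development. The substitution $\lambda=t\tan u$ turns (20) into $(1/2\pi)\int_{-\pi/2}^{\pi/2}f(t\tan u)\,du$, and for this particular $f$ a partial-fraction computation gives the closed form $K(t,f)=1/(2(1+t))$.

```python
import math

def cauchy_kernel_integral(t, f, n=4000):
    """Approximate K(t, f) = (1/2pi) * integral of t/(t^2+lam^2) * f(lam) over the real line.

    With lam = t*tan(u) the kernel and the Jacobian cancel, so
    K(t, f) = (1/2pi) * integral of f(t*tan(u)) for u in (-pi/2, pi/2).
    The integral in u is evaluated with the midpoint rule on n cells.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        u = -math.pi / 2 + (k + 0.5) * h
        total += f(t * math.tan(u))
    return total * h / (2 * math.pi)

def f(lam):
    # Illustrative test function (an assumption): even, bounded, smooth at zero.
    return 1.0 / (1.0 + lam * lam)

for t in (0.05, 0.3, 1.0):
    approx = cauchy_kernel_integral(t, f)
    exact = 1.0 / (2.0 * (1.0 + t))  # closed form valid for this particular f only
    print(t, approx, exact)
```

As $t$ decreases the computed values approach $f(0)/2=1/2$, which is the zeroth-order term of the short-time expansion discussed next.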
Let $\mathcal{L}$ be the space of functions $f:\mathbb{R}\to\mathbb{C}$ for which $E|f(Y(1))|$ is finite and for which $f$ is infinitely differentiable at zero. For $f:\mathbb{R}\to\mathbb{C}$ let $\bar f$ be the symmetrization of $f$ defined via $\bar f(x)=(f(x)+f(-x))/2$. The following result provides our short-time asymptotic expansion for $K(t,f)$.
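Proposition 2 Suppose $f$ belongs to $\mathcal{L}$. Then, $K(\cdot,f)$ is infinitely differentiable at the origin and
\[
K^{(n)}(0,f)=
\begin{cases}
(-1)^{n/2}\,\bar f^{(n)}(0)/2 & n \text{ even},\\[4pt]
(-1)^{(n-1)/2}\,n!\,\dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty}\left(T_{(n+1)/2}\bar f\right)(\lambda)\,d\lambda & n \text{ odd},
\end{cases}
\]
where, for $j\ge 0$, $T_j$ acts on even functions in $\mathcal{L}$ as
\[
T_jf(\lambda)=\frac{f(\lambda)-\sum_{k=0}^{j-1}f^{(2k)}(0)\lambda^{2k}/(2k)!}{\lambda^{2j}}.
\]
Furthermore, the family of linear operators $(T_n:n\ge 0)$ is a commutative semigroup, so that $T_{n+m}=T_nT_m$ for $m,n\ge 0$.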
Remark Note that the even derivatives of $f$ match those of $\bar f(\cdot)$. One might therefore be tempted to write the derivatives of $K(\cdot,f)$ in terms of integrals of $T_jf$ rather than $T_j\bar f$. The problem is that $T_jf$ typically has a singularity at the origin, unless the odd derivatives of $f$ at zero vanish. As a consequence, the integrals defining the derivative of $K(\cdot,f)$ may diverge if they were defined directly in terms of $f$. To avoid this, we use the symmetrization $\bar f$.
Proof of Proposition 2. The fact that $T_n$ is a linear operator, and that the family forms a commutative semigroup, is straightforward. To obtain the formula for the derivatives of $K(\cdot,f)$ at the origin, note that $K(\cdot,f)=K(\cdot,\bar f)$, where $\bar f$ is the symmetrization of $f$ given by $\bar f(\cdot)=(f(\cdot)+f(-\cdot))/2$. Furthermore, if $f\in\mathcal{L}$ then $\bar f$ is also in $\mathcal{L}$. Observe that the Dominated Convergence Theorem implies that
\[
K(t,\bar f)\to \bar f(0)/2
\]
as $t\searrow 0$. This motivates writing
\[
K(t,\bar f)=\bar f(0)/2+\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^{2}+\lambda^{2}}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda.
\]
Since $E\left|\bar f(Y(1))\right|$ is finite, it follows that the above integrand is uniformly dominated by an integrable function for $|\lambda|$ bounded away from zero. On the other hand, $\bar f(\lambda)-\bar f(0)=O(\lambda^{2})$ as $\lambda\to 0$, so the integrand is also uniformly (in $t$) dominated for $|\lambda|$ small. Hence, the Dominated Convergence Theorem yields the conclusion that
\[
K(t,f)=\bar f(0)/2+\frac{t}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^{2}}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda+o(t)
\]
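as $t\to 0$. In fact,
\begin{align*}
K(t,f)&=\bar f(0)/2+\frac{t}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^{2}}\left(\bar f(\lambda)-\bar f(0)\right)d\lambda
-\frac{t^{2}}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^{2}+\lambda^{2}}\,\frac{\bar f(\lambda)-\bar f(0)}{\lambda^{2}}\,d\lambda\\
&=\bar f(0)/2+\frac{t}{2\pi}\int_{-\infty}^{\infty}\left(T_1\bar f\right)(\lambda)\,d\lambda-t^{2}K\left(t,T_1\bar f\right). \tag{21}
\end{align*}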
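If we apply (21) recursively to $K(\cdot,T_1\bar f)$, $K(\cdot,T_2\bar f)$, ... we find that $K(t,f)$ satisfies
\[
K(t,f)=\sum_{j=0}^{n}(-1)^{j}\left(\frac{t^{2j}\left(T_j\bar f\right)(0)}{2}+\frac{t^{2j+1}}{2\pi}\int_{-\infty}^{\infty}\left(T_{j+1}\bar f\right)(\lambda)\,d\lambda\right)+(-1)^{n+1}t^{2(n+1)}K\left(t,T_{n+1}\bar f\right),
\]
yielding the result.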
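As a concrete check of the recursion (21) and of the expansion it produces, consider the test function $f(\lambda)=1/(1+\lambda^{2})$. This choice is an assumption made purely for illustration; $f$ is even, so it coincides with its symmetrization, and $K(t,f)$ is available in closed form by partial fractions.

```latex
\begin{align*}
K(t,f) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{t}{t^{2}+\lambda^{2}}\,\frac{1}{1+\lambda^{2}}\,d\lambda
        = \frac{1}{2(1+t)}
        = \frac12-\frac t2+\frac{t^{2}}{2}-\frac{t^{3}}{2}+\cdots \qquad (0\le t<1),\\[4pt]
(T_1 f)(\lambda) &= \frac{f(\lambda)-f(0)}{\lambda^{2}} = -\frac{1}{1+\lambda^{2}},
\qquad
(T_2 f)(\lambda) = \frac{(T_1 f)(\lambda)-(T_1 f)(0)}{\lambda^{2}} = \frac{1}{1+\lambda^{2}},\\[4pt]
\frac{f(0)}{2} &= \frac12,
\qquad
\frac{t}{2\pi}\int_{-\infty}^{\infty}(T_1 f)(\lambda)\,d\lambda = -\frac t2,
\qquad
-\,t^{2}\,\frac{(T_1 f)(0)}{2} = \frac{t^{2}}{2},
\qquad
-\frac{t^{3}}{2\pi}\int_{-\infty}^{\infty}(T_2 f)(\lambda)\,d\lambda = -\frac{t^{3}}{2}.
\end{align*}
```

The four terms on the last line are the $j=0$ and $j=1$ contributions of the expansion, and they reproduce the first four Taylor coefficients of $1/(2(1+t))$.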
With Proposition 2 in hand, our asymptotic expansion for s (∆) follows immedi-
ately.
2.3 Reducing the Analysis to Cauchy Process Short-time Asymptotics
As we discussed earlier in Section 2, the backbone of our asymptotic analysis for
r (∆) is given by the relation ρ (θ, b) = s (b)+ I (θ, b). In Section 3, we studied how to
develop asymptotics for s (b). In this section, we will study how to reduce the analysis
of the remaining term I (θ, b) to that already studied in Section 3. Recall that
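\[
I(\theta,b)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-v(\theta,\lambda)\right)d\lambda,
\]
where
\[
v(\theta,\lambda)=\frac{\lambda}{\lambda-2i\phi'(\theta)}\left(1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)}\right).
\]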
A natural strategy to now follow is to express the logarithm as a power series in $v(\theta,\lambda)$, followed by an expansion for $v$ as
\[
v(\theta,\lambda)=\sum_{n=0}^{\infty}v_n(i\lambda)\frac{\theta^{n}}{n!}. \tag{22}
\]
One could then apply Proposition 2 (as for (19)) to the real and imaginary parts in each of the resulting integrals that would appear as coefficients for $\theta^{n}$. However, the expansion (22) requires that the function $v$ be expressible as a joint power series in non-negative powers of $\theta$ and $\lambda$. Unfortunately, the presence of the term $(\lambda-2i\phi'(\theta))$ in the denominator of $v$ precludes the existence of such a joint power series.
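To avoid this difficulty we write $v$ as
\[
v(\theta,\lambda)=\frac{\lambda H(\theta,\lambda)}{\lambda-2i\phi'(\theta)},
\]
so that
\[
H(\theta,\lambda)=1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)}.
\]
The function $H(\cdot)$ is well behaved because the term $2i\phi'(\theta)/\lambda$ controls the behavior of $(\gamma(\theta)-\gamma(\theta+i\lambda))(1-g(\lambda))^{-1}$ as $\lambda\searrow 0$. As a consequence, $H(\cdot)$ can be smoothly defined at $\lambda=0$ via the relation $H(\theta,0)=1-\phi''(\theta)$. Our next result describes the analytic structure of $H(\cdot)$.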
Proposition 3 Let $D_{\eta/2}\triangleq\{z\in\mathbb{C}:|z|<\eta/2\}$ and, for $(z_1,z_2)\in D_{\eta/2}\times\left(D_{\eta/2}\cup\mathbb{R}\right)$, put
\[
H(z_1,z_2)=1-\frac{2i\gamma'(z_1)}{z_2}-\frac{\gamma(z_1)-\gamma(z_1+iz_2)}{1-\gamma(iz_2)}.
\]
Then, for every $z_1\in D_{\eta/2}$, the function $H(z_1,\cdot)$ is analytic on $D_{\eta/2}$. Similarly, for every $z_2\in D_{\eta/2}\cup\mathbb{R}$, the function $H(\cdot,z_2)$ is analytic on $D_{\eta/2}$. Finally, $H(z_1,\lambda)$ can be represented as an absolutely and uniformly convergent series, for $\lambda\in\mathbb{R}$ and $z_1\in D_{\eta/2}$, namely
\[
H(z_1,\lambda)=\sum_{k=1}^{\infty}h_k(i\lambda)\frac{z_1^{k}}{k!}, \tag{23}
\]
where $h_k(i\lambda)\triangleq\left(\gamma^{(k)}(i\lambda)-\mu_k\right)/(1-g(\lambda))-\left(2i\mu_{k+1}/\lambda\right)$. In particular, this implies that
\[
\sup_{\lambda\in\mathbb{R}}|H(z_1,\lambda)|\to 0
\]
as $z_1\to 0$.
Remark Note that the function $\tilde H(z_1,z_2)\triangleq H(z_1,z_2)-H(z_1,0)=H(z_1,z_2)-1+\gamma''(z_1)$ satisfies the same properties stated for $H(\cdot)$ in Proposition 3 with $\tilde h_k(i\lambda)\triangleq h_k(i\lambda)+\mu_{k+2}$; this follows from the analyticity of $\gamma(\cdot)$ and the fact that $\gamma''(0)=1$. Moreover, observe that completely analogous analytic properties apply to the function $\tilde G(z_1,z_2)=(\gamma''(z_1))^{-1}\tilde H(z_1,z_2)$ defined on $D_{\eta/2}\times\left(D_{\eta/2}\cup\mathbb{R}\right)$.
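Note that $|\lambda/(\lambda-2i\phi'(\theta))|=|\lambda|\left(\lambda^{2}+(2\phi'(\theta))^{2}\right)^{-1/2}\le 1$. It follows from Proposition 3 that for $r>0$ small enough,
\[
\sup_{\theta\in(0,r)}\ \sup_{\lambda\in\mathbb{R}}|v(\theta,\lambda)|<1.
\]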
Therefore, for all $0<\theta<r$, we can proceed to expand $\log(1-v)$ in powers of $v$ and formally integrate each term in the obtained expansion to express $I(\theta,b)$ in terms of integrals of the form
\[
J_k(a,b,f)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\left(\frac{i\lambda}{a+i\lambda}\right)^{k}f(i\lambda)\,d\lambda, \tag{24}
\]
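where $a,b>0$, $f(i\cdot)\in\mathcal{L}$ and $k\ge 0$. Because $J_0(a,b,f)\triangleq J_0(b,f)$ can be written as
\begin{align*}
J_0(b,f)={}&\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^{2}+\lambda^{2}}\left(\operatorname{Re}f(i\lambda)-\lambda^{-1}\operatorname{Im}f(i\lambda)\right)d\lambda\\
&+\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{b}{b^{2}+\lambda^{2}}\left(\operatorname{Im}f(i\lambda)+\lambda^{-1}\operatorname{Re}f(i\lambda)\right)d\lambda, \tag{25}
\end{align*}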
it follows that asymptotics for J0 can be computed in terms of asymptotics for the
K-type integrals that are subject of Proposition 2. In view of the development leading
to (24), a key to our asymptotic expansion for I (θ, b) is therefore the reduction of
integrals Jk (a, b, f) for k ≥ 1 to integrals of the form J0 (b, f). A key identity in
establishing this reduction step is the following.
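Lemma 1 Suppose that $a,b\ge 0$. Then, for $m,n\ge 0$,
\[
\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{(i\lambda)^{m+1}}{(a+i\lambda)^{m+n+1}}\,d\lambda=0.
\]
Furthermore,
\[
\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)\,i\lambda}\log(1+ai\lambda)\,d\lambda=0.
\]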
Proof. For $a,b>0$, let the function of a complex variable $f(\cdot)$ be defined as
\[
f(z)=\frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}.
\]
Consider the contour (in the clockwise direction) $C(r)=C_1(r)+C_2(r)$, where $C_1(r)=\{re^{i\tau}:-\pi\le\tau\le 0\}$ and $C_2(r)=\{\lambda:\lambda\in[-r,r]\}$. Since $f$ is (complex) analytic on $\operatorname{Im}(z)\le 0$, Cauchy's theorem yields
\[
\frac{1}{2\pi}\int_{C(r)}\frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}\,dz=0.
\]
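This, in turn, implies that
\begin{align*}
\frac{1}{2\pi}\int_{-r}^{r}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{(i\lambda)^{m+1}}{(a+i\lambda)^{m+n+1}}\,d\lambda
&=\frac{-1}{2\pi}\int_{C_1(r)}\frac{-b}{(b+iz)\,iz}\,\frac{(iz)^{m+1}}{(a+iz)^{m+n+1}}\,dz\\
&=\frac{-1}{2\pi}\int_{-\pi}^{0}\frac{b\,(ir)^{m+1}e^{(m+1)\tau i}}{(b+ire^{\tau i})(a+ire^{\tau i})^{m+n+1}}\,d\tau.
\end{align*}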
Letting $r\to\infty$, we obtain (by virtue of dominated convergence) the first part of the lemma. For the second part, let us define
\[
f_1(a)=\frac{1}{2\pi}\int_{-\infty}^{\infty}-\left((1+i\lambda)\,i\lambda\right)^{-1}\log(1+ai\lambda)\,d\lambda.
\]
A routine dominated convergence argument, combined with our previous analysis, shows that
\[
f_1'(a)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)(1+ai\lambda)}\,d\lambda=0.
\]
The proof of the lemma is completed by observing that $f_1(a)\to 0$ as $a\searrow 0$.
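The first identity of Lemma 1 lends itself to a quick numerical sanity check. The sketch below is illustrative only: the function names, the midpoint rule after the substitution $\lambda=\tan u$, and the sample values of $a$, $b$, $m$, $n$ are assumptions of this example. One factor $i\lambda$ is cancelled between numerator and denominator before evaluating, so the integrand has no removable singularity at the origin.

```python
import math

def integrate_real_line(g, n=4000):
    """Midpoint rule for the integral of g over the real line, using lam = tan(u)."""
    h = math.pi / n
    total = 0j
    for k in range(n):
        u = -math.pi / 2 + (k + 0.5) * h
        # d(lam) = du / cos(u)^2
        total += g(math.tan(u)) / math.cos(u) ** 2
    return total * h

def lemma1_integral(a, b, m, n):
    """(1/2pi) * integral of -b/((b+i lam) i lam) * (i lam)^(m+1) / (a+i lam)^(m+n+1)."""
    def integrand(lam):
        il = 1j * lam
        # -b/((b+il)*il) * il^(m+1) simplifies to -b * il^m / (b+il)
        return -b * il ** m / ((b + il) * (a + il) ** (m + n + 1))
    return integrate_real_line(integrand) / (2 * math.pi)

for (m, n) in [(0, 0), (1, 0), (2, 1)]:
    print(m, n, abs(lemma1_integral(0.7, 1.3, m, n)))
```

The printed magnitudes should be at the level of rounding error, in line with the contour argument: the integrand is analytic in the lower half-plane and decays at least like $|\lambda|^{-2}$.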
Let $\mathcal{L}_0$ be the subspace of $\mathcal{L}$ (recall the definition of $\mathcal{L}$ preceding Proposition 2) for which $f(0)=0$. Also, for $f\in\mathcal{L}$, let $\tilde f(\cdot)=f(\cdot)-f(0)$ $(\in\mathcal{L}_0)$. We are now ready to offer a proposition that reduces the evaluation of the integrals $J_k(a,b,f)$ for $k\ge 1$ to that of integrals such as $J_0(b,f)$, thereby permitting the application of the short-time asymptotics of Section 3.
Proposition 4 Suppose that $f\in\mathcal{L}_0$. Then, for $k\ge 1$ and $n\ge 0$,
\[
J_k(a,b,f)=J_0\left(b,\sum_{j=0}^{n}\binom{k+j-1}{j}(-a)^{j}\,\tilde T_j\tilde f\right)+b\,o(a^{n}), \tag{26}
\]
where the linear operator $\tilde T_j$ ($j\ge 0$) acts on functions $\tilde f(i\cdot)\in\mathcal{L}_0$ as
\[
\left(\tilde T_j\tilde f\right)(i\lambda)=\frac{\tilde f(i\lambda)-\sum_{m=1}^{j}\tilde f^{(m)}(0)\,(i\lambda)^{m}/m!}{(i\lambda)^{j}}.
\]
Moreover, the family of operators $(\tilde T_j:j\ge 0)$ constitutes a commutative semigroup, so that $\tilde T_m\tilde T_n=\tilde T_{m+n}$.
Remark As for Proposition 2, one might be tempted to express the right-hand side of (26) in terms of $f$ rather than $\tilde f$. However, $\tilde T_jf$ is generally non-integrable with respect to the kernel that defines $J_0$. Finally, note that, if all integrals are interpreted in terms of Cauchy principal value, one can apply Proposition 4 directly to functions that do not vanish at the origin by defining $J_0(b,f)=J_0(b,f(\cdot)-f(0))+f(0)/2$.
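Proof of Proposition 4. That $(\tilde T_j:j\ge 0)$ is a family of linear operators forming a commutative semigroup is immediate. By virtue of Lemma 1, it follows that
\[
J_m(a,b,f)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\left(\frac{i\lambda}{a+i\lambda}\right)^{m}\tilde f(i\lambda)\,d\lambda.
\]
Observe that $\tilde f(i\cdot)$ is now in the domain of the operators $\tilde T_n$, $n\ge 1$. On the other hand, we can write
\begin{align*}
J_m\left(a,b,\tilde f\right)={}&\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\tilde f(i\lambda)\,d\lambda\\
&+\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\tilde f(i\lambda)\left(\left(\frac{i\lambda}{a+i\lambda}\right)^{m}-1\right)d\lambda. \tag{27}
\end{align*}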
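Note that
\[
\left(\frac{i\lambda}{a+i\lambda}\right)^{m}-1=-\sum_{k=1}^{m}\binom{m}{k}\frac{a^{k}(i\lambda)^{m-k}}{(a+i\lambda)^{m}}.
\]
Once again, by appealing to Lemma 1 and to the definition of $\tilde T_k\tilde f$, it follows that, for $m\ge k\ge 1$,
\[
a^{k}J_m\left(a,b,\tilde T_k\tilde f\right)\triangleq\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\,\frac{a^{k}(i\lambda)^{m-k}}{(a+i\lambda)^{m}}\,\tilde f(i\lambda)\,d\lambda.
\]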
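Combining this observation with (27), we obtain
\[
J_m(a,b,f)=J_m\left(a,b,\tilde f\right)=J_0\left(b,\tilde f\right)-\sum_{k=1}^{m}\binom{m}{k}a^{k}J_m\left(a,b,\tilde T_k\tilde f\right). \tag{28}
\]
The recursive relation (28) can now be expressed in operator form as
\[
J_m\left(a,b,\tilde f\right)=J_0\left(b,\tilde f\right)+J_m\left(a,b,\left(1-\left(1+a\tilde T\right)^{m}\right)\tilde f\right).
\]
(Here, we have used the semigroup property of the family of operators $\tilde T_m$.) Iterating the previous expression, we arrive at
\begin{align*}
J_m(a,b,f)=J_m\left(a,b,\tilde f\right)
&=\sum_{k=0}^{n}J_0\left(b,\left(1-\left(1+a\tilde T\right)^{m}\right)^{k}\tilde f\right)
+J_m\left(a,b,\left(1-\left(1+a\tilde T\right)^{m}\right)^{n+1}\tilde f\right)\\
&=J_0\left(b,\sum_{j=0}^{n}\binom{m+j-1}{j}(-a)^{j}\,\tilde T_j\tilde f\right)+b\,o(a^{n}), \tag{29}
\end{align*}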
where the last equality in (29) has been obtained by using the semigroup property of the operators $\tilde T_m$ and by noting that the coefficient of $a^{j}\tilde T_j$ in (29) (for $j\le n$) must match that of $x^{j}$ in the formal expansion of
\[
p(x)=\frac{1-\left(1-(1+x)^{m}\right)^{n+1}}{1-\left(1-(1+x)^{m}\right)}=\frac{1}{(1+x)^{m}}+O\left(x^{n+1}\right).
\]
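That the error term in (29) is $b\,o(a^{n})$ comes from the fact that $aJ_m(a,b,\tilde f)=b\,o(1)$, as $a\searrow 0$, as can be seen as follows:
\begin{align*}
\left|aJ_m\left(a,b,\tilde f\right)\right|
&=\left|\frac{a}{2\pi}\int_{-\infty}^{\infty}\frac{-b\,(i\lambda)^{m-1}\tilde f(i\lambda a)}{(b+i\lambda a)(1+i\lambda)^{m}}\,d\lambda\right|\\
&\le\frac{b}{2\pi}\int_{-\infty}^{\infty}\left|\frac{\tilde f(i\lambda a)}{\lambda(1+i\lambda)}\right|d\lambda=b\,o(1),
\end{align*}
where the last step follows by a dominated convergence argument. This concludes the proof of the proposition.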
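The coefficient matching behind the last equality in (29) can be verified with exact integer arithmetic. The sketch below is an illustration under stated assumptions: the function names and the truncated-polynomial representation (coefficient lists indexed by degree) are choices of this example, not of the original text. It checks that the partial geometric sum $\sum_{k=0}^{n}(1-(1+x)^{m})^{k}$ agrees with $(1+x)^{-m}$ up to degree $n$, whose coefficients are $(-1)^{j}\binom{m+j-1}{j}$.

```python
from math import comb

def poly_mul(p, q, deg):
    """Product of two coefficient lists, truncated at degree `deg`."""
    out = [0] * (deg + 1)
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j > deg:
                break
            out[i + j] += pi * qj
    return out

def geometric_partial_sum(m, n):
    """Coefficients (degrees 0..n) of sum_{k=0}^{n} (1 - (1+x)^m)^k."""
    # y(x) = 1 - (1+x)^m has no constant term, so y^k only affects degrees >= k.
    y = [0] + [-comb(m, i) for i in range(1, n + 1)]
    total = [0] * (n + 1)
    power = [1] + [0] * n  # y^0
    for _ in range(n + 1):
        total = [s + c for s, c in zip(total, power)]
        power = poly_mul(power, y, n)
    return total

def negative_binomial_coeffs(m, n):
    """Coefficients of x^j, j = 0..n, in the expansion of (1+x)^(-m)."""
    return [(-1) ** j * comb(m + j - 1, j) for j in range(n + 1)]

print(geometric_partial_sum(3, 5))
print(negative_binomial_coeffs(3, 5))
```

Both lists coincide, which is the statement that the coefficient of $a^{j}\tilde T_j$ in (29) equals $(-1)^{j}\binom{m+j-1}{j}$ for $j\le n$.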
Proposition 4, combined with our development for $K(t,\cdot)$ in Section 3, provides all the elements required to develop asymptotic expansions for integrals of the form
Jm (a, b, f). Since, as discussed earlier at a formal level, I (θ, b) can be expressed as a
sum of terms such as Jm (a, b, f), it follows that the whole asymptotic analysis of r (∆)
and ρ (θ, b) can be reduced to that of Section 3. A complete rigorous justification for
this representation for I (θ, b) is one of the main issues discussed in Section 5.
2.4 An Asymptotic Expansion for I (θ, b)
In Sections 3 and 4, we have developed the tools required to obtain asymptotic ex-
pansions, in powers of b, for s (b) and I (θ, b). We have done this by showing that
the problem can be reduced to short-time asymptotics for the Cauchy process. The
purpose of this section is to make rigorous the expansion for I (θ, b), in powers of θ,
that was outlined in Section 4.
Noting the important role that functions vanishing at the origin play in Proposition 4, it seems appropriate to define
\begin{align*}
\tilde H(\theta,\lambda)&\triangleq H(\theta,\lambda)-H(\theta,0)=H(\theta,\lambda)-1+\phi''(\theta)\\
&=\sum_{k=1}^{\infty}\tilde h_k(i\lambda)\frac{\theta^{k}}{k!}, \tag{30}
\end{align*}
where $\tilde h_k(i\lambda)\triangleq\left(\gamma^{(k)}(i\lambda)-\mu_k\right)/(1-g(\lambda))-\left(2i\mu_{k+1}/\lambda\right)+\mu_{k+2}$ is such that $\tilde h_k(0)=0$. The next proposition shows how a simplified expression for $I(\theta,b)$ in terms of $\tilde H$ can be obtained.
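Proposition 5 Define $\Psi(\theta)=2\phi'(\theta)/\phi''(\theta)$. Then,
\[
I(\theta,b)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{\phi''(\theta)^{-1}\lambda\,\tilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)}\right)d\lambda. \tag{31}
\]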
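Proof. Just note that
\begin{align*}
\log\left(1-v(\theta,\lambda)\right)
&=\log\left(1-\frac{\lambda\,\tilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)\phi''(\theta)}-\frac{\lambda\left(1-\phi''(\theta)\right)}{\lambda-i\Psi(\theta)\phi''(\theta)}\right)\\
&=\log\left(\frac{i\lambda/\Psi(\theta)+1}{i\lambda\phi''(\theta)/\Psi(\theta)+1}\right)
+\log\left(1-\frac{\phi''(\theta)^{-1}\lambda\,\tilde H(\theta,\lambda)}{\lambda-i\Psi(\theta)}\right).
\end{align*}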
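Thus, (31) follows from Lemma 1 by noting that
\begin{align*}
&\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{i\lambda/\Psi(\theta)+1}{i\lambda\phi''(\theta)/\Psi(\theta)+1}\right)d\lambda\\
&\qquad=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-1}{(1+i\lambda)\,i\lambda}\log\left(\frac{i\lambda b/\Psi(\theta)+1}{i\lambda b\,\phi''(\theta)/\Psi(\theta)+1}\right)d\lambda=0.
\end{align*}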
Additional simplifications reduce the complexity of the expansion for I (θ, b). In
particular, the expression for the integral J0 (b, f) simplifies when it is known that
J0 (b, f) is real; see (25). Fortunately, our analysis of I (θ, b) gives rise to such real-
valued J0 (b, f)’s. To establish this result, we introduce the following family of func-
tions.
Definition A function $f:\mathbb{R}\to\mathbb{C}$ is said to have the "parity property" if $\operatorname{Re}f(i\cdot)$ and $\operatorname{Im}f(i\cdot)$ are even and odd functions respectively. The class of functions possessing the parity property will be denoted by $\mathcal{P}$.
Note that if $f(i\cdot)$ is in the domain of $J_0(b,\cdot)$ and $f$ possesses the parity property, then we must have that $\operatorname{Im}J_0(b,f)=0$ (since it corresponds to an integral on the real line of an odd integrable function). The family of functions enjoying the parity property has certain closure characteristics that will be useful for the rest of our development. These closure properties are discussed in the next proposition.
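Proposition 6 The class $\mathcal{P}$ of functions forms an algebra on $\mathbb{R}$ (i.e. a vector space on $\mathbb{R}$ that is closed under product of functions). In addition, if $f\in\mathcal{P}$, then $1/f(\cdot)$ (defined on its domain of finiteness) also possesses the parity property. Finally, if $f$ is in the domain of $\tilde T$ and has the parity property, then $\tilde Tf\in\mathcal{P}$.
Proof. Certainly $\mathcal{P}$ constitutes a vector space on $\mathbb{R}$ and it is almost immediate that $\tilde T$ preserves the parity property. Now, if $f_1,f_2\in\mathcal{P}$, then $\operatorname{Re}(f_1f_2)=\operatorname{Re}(f_1)\operatorname{Re}(f_2)-\operatorname{Im}(f_1)\operatorname{Im}(f_2)$ must clearly be even. Similarly, $\operatorname{Im}(f_1f_2)$ must be odd, which implies that $f_1f_2\in\mathcal{P}$. Finally, note that
\[
\frac{1}{f}=\frac{\operatorname{Re}(f)}{\operatorname{Re}(f)^{2}+\operatorname{Im}(f)^{2}}-i\,\frac{\operatorname{Im}(f)}{\operatorname{Re}(f)^{2}+\operatorname{Im}(f)^{2}},
\]
which immediately implies that $\operatorname{Re}1/f$ and $\operatorname{Im}1/f$ are even and odd functions respectively and thus $1/f\in\mathcal{P}$.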
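The closure properties of Proposition 6 are easy to probe numerically, because the parity property of $f$ is equivalent to the symmetry $F(-\lambda)=\overline{F(\lambda)}$ for $F(\lambda)=f(i\lambda)$. The following sketch is an illustration only: the helper `has_parity_property`, the sample grid, and the two test functions (characteristic functions of an Exp(1) variable and of a point mass at 2) are assumptions of this example.

```python
import cmath

def has_parity_property(F, points, tol=1e-12):
    """Check on sample points that Re F is even and Im F is odd,
    i.e. that F(-lam) equals the complex conjugate of F(lam)."""
    return all(abs(F(-lam) - F(lam).conjugate()) <= tol for lam in points)

# F(lam) plays the role of f(i*lam). Two illustrative members of the class:
def F1(lam):
    # E exp(i*lam*X) for X ~ Exp(1)
    return 1.0 / (1.0 - 1j * lam)

def F2(lam):
    # E exp(i*lam*X) for X identically equal to 2
    return cmath.exp(2j * lam)

def product(lam):
    return F1(lam) * F2(lam)

def reciprocal(lam):
    return 1.0 / F1(lam)

def counterexample(lam):
    # Imaginary part is even rather than odd, so the property fails.
    return 1j + 0.0 * lam

points = [0.1 * k for k in range(1, 50)]
print(has_parity_property(product, points))
print(has_parity_property(reciprocal, points))
print(has_parity_property(counterexample, points))
```

The product and the reciprocal pass the check, as the proposition asserts, while the constant function $i$ does not.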
We now present the main result of this section, which yields an expansion for
I (θ, b) in powers of θ and coefficients involving only integrals of the form J0 (b, f)
with f satisfying the parity property.
Proposition 7 For $k,m\ge 1$, let the coefficient multiplying $\theta^{k}$ in the power series representation of $\tilde G(\theta,\lambda)^{m}\triangleq\left(\phi''(\theta)^{-1}\tilde H(\theta,\lambda)\right)^{m}$ be defined as $\tilde g_{k,m}(i\lambda)$. Then, $\tilde g_{k,m}(\cdot)\in\mathcal{P}$ can be recursively computed via
\[
\tilde g_{k,m+1}(i\lambda)=\sum_{n=0}^{k}\tilde g_{n+1,m}(i\lambda)\,\tilde g_{k-n,1}(i\lambda).
\]
Consider $b>0$ and let $\chi(\theta)=-\Psi(\theta)/\theta$. Then,
\[
I(\theta,b)=\sum_{m=1}^{n}\theta^{m}\sum_{j=0}^{m-1}\chi(\theta)^{j}J_0\left(b,E_{j,m}\right)+b\,o(\theta^{n}), \tag{32}
\]
where $E_{j,m}(i\lambda)$, defined for $0\le j\le m-1$ and $m\ge 1$ as
\[
E_{j,m}=-\sum_{k=0}^{m-j-1}\frac{1}{m-j-k}\binom{m-k-1}{j}\tilde T_j\,\tilde g_{k,m-j-k},
\]
satisfies the parity property.
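Proof. Since $\gamma(i\lambda)=E_0\cos(\lambda X_1)+iE_0\sin(\lambda X_1)$, it follows that $\gamma^{(k)}(i\lambda)-\mu_k\in\mathcal{P}$, as does the function $1-\gamma(i\lambda)$. By the closure properties described in Proposition 6, we may easily conclude that $\tilde g_{k,1}\in\mathcal{P}$. A second application of Proposition 6 shows that $\tilde g_{k,m}\in\mathcal{P}$ and $E_{j,m}\in\mathcal{P}$. The recursive expression provided for $\tilde g_{k,m}$ follows from standard convolution operations of power series. For $n\ge 1$, define
\[
\tilde G_n(\theta,\lambda)\triangleq\sum_{k=1}^{n}\tilde g_{k,1}(i\lambda)\frac{\theta^{k}}{k!}
\]
and
\[
I_n(\theta,b)\triangleq\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{\tilde G_n(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right)d\lambda.
\]
Note that
\begin{align*}
&\log\left(1-\frac{\tilde G(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right)-\log\left(1-\frac{\tilde G_n(\theta,\lambda)\lambda}{\lambda-i2\phi'(\theta)}\right)\\
&\qquad=\log\left(1+\frac{\lambda}{\lambda-i2\phi'(\theta)}\cdot\frac{\tilde G_n(\theta,\lambda)-\tilde G(\theta,\lambda)}{1-\tilde G_n(\theta,\lambda)\lambda\left(\lambda-i2\phi'(\theta)\right)^{-1}}\right).
\end{align*}
On the other hand, from the remark following Proposition 3 and because $\log(1+z)=z(1+\varepsilon(z))$ for $z\in\mathbb{C}$, where $|\varepsilon(z)|\le|z|$ for $|z|\le 1/2$ (see Proposition 8.46, Breiman (1992)), we can see that there exists a constant $B>0$ such that
\[
\left|I(b,\theta)-I_n(b,\theta)\right|\le\frac{B}{2\pi}\int_{-\infty}^{\infty}\frac{b\left|\tilde G_n(\theta,\lambda)-\tilde G(\theta,\lambda)\right|}{\left(b^{2}+\lambda^{2}\right)^{1/2}\left(\lambda^{2}+(2\phi'(\theta))^{2}\right)^{1/2}}\,d\lambda.
\]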
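Essentially by making the change of variables $u=\lambda\theta$ we then see that for all $\theta\in(0,\delta)$ for some $\delta>0$ we have
\[
\left|I(b,\theta)-I_n(b,\theta)\right|\le\frac{Bb\,\theta^{n}}{2\pi}\int_{-\infty}^{\infty}\frac{\left|\tilde G_n(\theta,\lambda\theta)-\tilde G(\theta,\lambda\theta)\right|}{\theta^{n+1}|\lambda|\left(\lambda^{2}+1\right)^{1/2}}\,d\lambda.
\]
It follows easily from the previous inequality and the Dominated Convergence Theorem that
\[
I(b,\theta)-I_n(b,\theta)=b\,o(\theta^{n}).
\]
Using the expansion of $\log(1+z)$ at $z=0$ and a similar dominated convergence argument, we can write
\begin{align*}
I_n(b,\theta)&=\frac{-1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\sum_{m=1}^{n}\frac{1}{m}\left(\frac{i\lambda}{i\lambda+\Psi(\theta)}\right)^{m}\tilde G_n(\theta,\lambda)^{m}\,d\lambda+b\,o(\theta^{n})\\
&=\frac{-1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\sum_{m=1}^{n}\frac{1}{m}\left(\frac{i\lambda}{i\lambda+\Psi(\theta)}\right)^{m}\sum_{k=0}^{n-m}\theta^{k+m}\tilde g_{k,m}(i\lambda)\,d\lambda+b\,o(\theta^{n}). \tag{33}
\end{align*}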
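Using Proposition 4 and (33), we obtain that
\begin{align*}
I(\theta,b)
&=-\sum_{m=1}^{n}\theta^{m}\sum_{k=0}^{m-1}J_{m-k}\left(\Psi(\theta),b,\frac{\tilde g_{k,m}}{m-k}\right)+b\,o(\theta^{n})\\
&=-\sum_{m=1}^{n}\theta^{m}\sum_{k=0}^{m-1}J_0\left(b,\sum_{j=0}^{n}\binom{m-k+j-1}{j}(-\Psi(\theta))^{j}\,\tilde T_j\frac{\tilde g_{k,m}}{m-k}\right)+b\,o(\theta^{n})\\
&=-\sum_{m=1}^{n}\theta^{m}\sum_{j=0}^{m-1}\theta^{j}\chi(\theta)^{j}J_0\left(b,\sum_{k=0}^{m-1}\binom{m-k+j-1}{j}\tilde T_j\frac{\tilde g_{k,m}}{m-k}\right)+b\,o(\theta^{n})\\
&=\sum_{m=1}^{n}\theta^{m}\sum_{j=0}^{m-1}\chi(\theta)^{j}J_0\left(b,-\sum_{k=0}^{m-j-1}\binom{m-k-1}{j}\tilde T_j\frac{\tilde g_{k,m}(i\lambda)}{m-j-k}\right)+b\,o(\theta^{n}), \tag{34}
\end{align*}
which yields the desired conclusion.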
In view of the previous result, an explicit expression for the coefficients in the expansion for $J_0(\cdot,f)$, when $f$ satisfies the parity property, deserves special attention. Providing such explicit expressions is the aim of the next proposition.
Proposition 8 Suppose that $f(i\cdot)\in\mathcal{L}_0$ has the parity property. Then, $J_0(\cdot,f)$ is infinitely differentiable at zero and
\[
J_0^{(n)}(0,f)=
\begin{cases}
(-1)^{n/2}\left(f_{RE}^{(n)}(0)-\dfrac{n!}{2\pi}\displaystyle\int_{-\infty}^{\infty}\left(T_{n/2+1}f_{IM}\right)(\lambda)\,d\lambda\right) & n \text{ even},\\[8pt]
(-1)^{(n+1)/2}\left(\dfrac{f_{IM}^{(n+1)}(0)}{n+1}-\dfrac{n!}{2\pi}\displaystyle\int_{-\infty}^{\infty}\left(T_{(n+1)/2}f_{RE}\right)(\lambda)\,d\lambda\right) & n \text{ odd},
\end{cases} \tag{35}
\]
where $f_{IM}(i\lambda)=\operatorname{Im}f(i\lambda)\,\lambda^{-1}$ and $f_{RE}(i\lambda)=\operatorname{Re}f(i\lambda)$.
Proof. The proof follows by a direct application of Proposition 2 combined with the fact that $\operatorname{Im}J_0(b,f)=0$.
We close this section with some remarks that clarify how the expansion just de-
rived for I (θ, b) can alternatively be viewed through the prism of a formal operator
expansion. The analytic properties stated in Proposition 1 provide rigorous justifica-
tion for the expansions outlined next. First, we note that if θ > 0 is small enough
and b > 0, we can formally write
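\[
I(\theta,b)=-\sum_{k=1}^{\infty}\frac{1}{k}J_k\left(\Psi(\theta),b,\phi''(\theta)^{-k}\tilde H^{k}(\theta,\cdot)\right). \tag{36}
\]
Formally interpreting $\left(1+a\tilde T\right)^{-m}$ as
\[
\left(1+a\tilde T\right)^{-m}=\sum_{k=0}^{\infty}\binom{m+k-1}{k}(-a)^{k}\,\tilde T^{k},
\]
in combination with the expansion (26) developed for $J_k(a,b,f)$ and equality (36), allows us to write
\[
I(\theta,b)=-\sum_{k=1}^{\infty}\frac{1}{k}J_0\left(b,\phi''(\theta)^{-k}\left(1+\Psi(\theta)\tilde T\right)^{-k}\tilde H^{k}(\theta,\cdot)\right).
\]
If we introduce the convention that for commutative operators $B_1(\theta)$, $B_2(\theta)$ and functions $F_1(\theta,\cdot)$, $F_2(\theta,\cdot)$, expressions of the form $B_1(\theta)F_1(\theta,\cdot)B_2(\theta)F_2(\theta,\cdot)$ (or any permutation of this form) are always interpreted as
\[
\left(B_1(\theta)B_2(\theta)\right)\left(F_1(\theta,\cdot)F_2(\theta,\cdot)\right),
\]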
then we can write
\[
I(\theta,b)=J_0\left(b,\log\left(1-\phi''(\theta)^{-1}\left(1+\Psi(\theta)\tilde T\right)^{-1}\tilde H(\theta,\cdot)\right)\right). \tag{37}
\]
Expression (37) provides a convenient shorthand notation for the expansion of $I(\theta,b)$, in powers of $\theta$ and with coefficients in terms of integrals of the form $J_0(b,\cdot)$. In addition, note that, in order to recover the coefficients in the expansion for $I(\theta_1(\cdot),\cdot)$, one can apply formal differentiation to (37) in both arguments $\theta$ and $b$ (always having in mind that (37) is just a formalism representing a certain asymptotic expansion). Hence, for example, one can obtain the first term in the expansion for $I(\theta_1(\cdot),\cdot)$ as
\[
\partial_\Delta I(\theta_1(\Delta),\Delta)\big|_{\Delta=0}=\partial_\theta I(0,0)\,\partial_\Delta\theta_1(0)+\partial_b I(0,0),
\]
where the formal derivatives applied to (37) must be interpreted using the formal
operator convention introduced earlier. Thus, for example, if B (θ) is an operator of
the form
B (θ) =∞Xk=0
bkθkeT kk,
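applied to a function $F(\theta,\lambda)=\sum f_k(i\lambda)\theta^{k}/k!$, we interpret the formal derivative $\partial_\theta\log(1-B(\theta)F(\theta,\cdot))$ as
\[
\partial_\theta\log\left(1-B(\theta)F(\theta,\cdot)\right)=-\partial_\theta B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}F(\theta,\cdot)-B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}\partial_\theta F(\theta,\cdot),
\]
where
\[
\partial_\theta B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}F(\theta,\cdot)=\sum_{k=0}^{\infty}\left(\partial_\theta B(\theta)\,B(\theta)^{k}\right)F(\theta,\cdot)^{k+1},
\]
and, similarly,
\[
B(\theta)\left(1-B(\theta)F(\theta,\cdot)\right)^{-1}\partial_\theta F(\theta,\cdot)=\sum_{k=0}^{\infty}B(\theta)^{k+1}\left(F(\theta,\cdot)^{k}\,\partial_\theta F(\theta,\cdot)\right).
\]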
Thus, it is possible to combine this formalism with the expansion
\[
J_0(b,f)=\sum_{n=1}^{m}J_0^{(n)}(0,f)\,b^{n}/n!+O\left(b^{m+1}\right)
\]
to recover the coefficients in the expansion for $I(\theta_1(\cdot),\cdot)$ in powers of $\Delta$.
2.5 Expansions for $r(\Delta)$ and $E_\theta R^{k}(\infty)$
In previous sections, we developed all the elements required to rigorously compute a full asymptotic expansion for $r(\cdot)$ in powers of $\Delta$. In the first part of this section, as a summary, we indicate how the developments obtained in the previous three sections can be applied to provide an asymptotic expansion for $r(\cdot)$ in powers of $\Delta$. In view of the level of complexity in the computation of the constants $\beta_n$, the description in this section is intended to provide guidance for an easy-to-design practical implementation in a computational package such as Mathematica or Matlab. An efficient implementation of the procedure will appear elsewhere. In the second part of this section, also as a direct consequence of the analysis in the previous sections, we will develop a rigorous asymptotic expansion for the cumulants of $R(\infty)$ under $P_\theta$ in powers of $\theta$.
2.5.1 The Expansion for r (∆)
An algorithm for computing βk for k ≤ n proceeds as follows:
1. Expand s (∆) up to terms of order O (∆n+1) using Proposition 8.
2. Similarly, expand the functions $J_0(\cdot,E_{j,m})$ up to terms $O(\Delta^{n-m})$ with $0\le j\le m-1$ and $1\le m\le n$. This also can be done by applying Proposition 8, since $E_{j,m}$ has the parity property.
3. Finally, the terms obtained can be combined with an expansion for θ1 (∆) up
to terms of order O (∆n+1). Such an expansion can be easily obtained using the
implicit function theorem and therefore is omitted.
Observe that the previous algorithm provides an asymptotic expansion for $r(\cdot)$ in powers of $\Delta$. However, because of Theorem 1, we actually have that this asymptotic expansion converges absolutely in a neighborhood of the origin.
As a simple application of the previous expansion, we show that $\beta_2=0$.
Proposition 9 Suppose that $X_1$ has exponential moments and is strongly non-lattice. Then
\[
r(\Delta)=-\Delta\beta_1+O\left(\Delta^{3}\right).
\]
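Proof. We only need to show that $\beta_2=0$. Note that by virtue of Proposition 8, the coefficient multiplying $\Delta^{2}$ in the expansion of $s(\Delta)$ equals
\[
s_2=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{\lambda^{2}}\left(\frac{\operatorname{Im}\log(1-g(\lambda))}{\lambda}-\mu_3\right)d\lambda-\left(\mu_4/12-\mu_3^{2}/18\right).
\]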
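In order to show that $\beta_2=0$ it suffices to show that $\theta_1J(\Delta,E_{0,1})\sim-\Delta^{2}s_2$ or (since $J(\Delta,E_{j,m})=O(\Delta)$, $\theta_1/2\sim\Delta$ and $\phi''(\theta_1)\sim 1$), that $\Delta J(\Delta,E_{0,1})\sim-2\Delta^{2}s_2$, where
\begin{align*}
\Delta J(\Delta,E_{0,1})
&=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-\Delta}{(\Delta+i\lambda)\,i\lambda}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda\\
&=\frac{1}{\pi}\int_{0}^{\infty}\frac{\Delta^{2}}{\Delta^{2}+\lambda^{2}}\operatorname{Re}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda \tag{38}\\
&\quad-\frac{\Delta}{\pi}\int_{0}^{\infty}\frac{\Delta^{2}}{\left(\Delta^{2}+\lambda^{2}\right)\lambda}\operatorname{Im}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)d\lambda. \tag{39}
\end{align*}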
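Note that $g'(\lambda)=i\gamma'(i\lambda)$ and that $\operatorname{Im}\log\left(2\lambda^{-2}\right)=0$; hence, we can write
\[
\operatorname{Re}\left(\frac{\gamma'(i\lambda)}{1-g(\lambda)}-\frac{2i}{\lambda}+\mu_3\right)=-\operatorname{Im}\frac{d}{d\lambda}\left(\log\left(2(1-g(\lambda))\lambda^{-2}\right)-\mu_3i\lambda\right),
\]
which implies, using integration by parts, that the integral in (38) equals
\begin{align*}
&\frac{-1}{\pi}\int_{0}^{\infty}\frac{2\lambda\Delta^{2}}{\left(\Delta^{2}+\lambda^{2}\right)^{2}}\operatorname{Im}\left(\log\left(2(1-g(\lambda))\lambda^{-2}\right)-\mu_3i\lambda\right)d\lambda\\
&\qquad\sim-\frac{\Delta^{2}}{\pi}\int_{0}^{\infty}\frac{2}{\lambda^{2}}\left(\frac{\operatorname{Im}\log(1-g(\lambda))}{\lambda}-\mu_3\right)d\lambda, \tag{40}
\end{align*}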
where (40) has been obtained using dominated convergence and simple manipulations. It follows from Proposition 2 and a first order asymptotic expansion of $E_{0,1}(i\lambda)$ that (39) equals $-\Delta^{2}\left(\mu_3^{2}/9-\mu_4/6\right)$. Combining this last estimate together with (40) into (38) and (39) yields $\Delta J(\Delta,E_{0,1})\sim-2\Delta^{2}s_2$, which is exactly what we wanted to show to conclude that $\beta_2=0$.
2.5.2 The Expansion for $E_\theta R(\infty)^{k}$ as $\theta\searrow 0$
We shall provide asymptotics for $E_\theta R(\infty)^{k}=E_\theta\left(S_{\tau_+}^{k}\right)/\left(k!\,E_\theta\left(S_{\tau_+}\right)\right)$ via the cumulants $(\kappa_j(\theta):j\ge k)$ of $R(\infty)$ under $P_\theta$. In particular, these estimates yield the proof of Theorem 4 stated in Section 2. The idea is to develop an asymptotic expansion, in powers of $b$, for $s(b)$ and $I(\theta,b)$ respectively and to match coefficients in the expression
\begin{align*}
\rho(\theta,b)&=-\kappa_1(\theta)b+\kappa_2(\theta)b^{2}/2-\kappa_3(\theta)b^{3}/3!+\dots\\
&=s(b)+I(\theta,b). \tag{41}
\end{align*}
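To make the structure of (41) concrete, the following worked example assumes that $\rho(\theta,b)$ denotes the cumulant-type function $\log E_\theta\exp(-bR(\infty))$ (its definition lies earlier in the chapter and is only assumed here), and uses a hypothetical law for $R(\infty)$, namely a unit-rate exponential, chosen only because its cumulants are explicit.

```latex
\begin{align*}
E\exp(-bR) &= \frac{1}{1+b}
\quad\Longrightarrow\quad
\log E\exp(-bR) = -\log(1+b) = -b+\frac{b^{2}}{2}-\frac{b^{3}}{3}+\cdots,\\[4pt]
\kappa_k &= (k-1)! \quad\Longrightarrow\quad
-\kappa_1 b+\kappa_2\frac{b^{2}}{2}-\kappa_3\frac{b^{3}}{3!}+\cdots
= -b+\frac{b^{2}}{2}-\frac{2\,b^{3}}{6}+\cdots .
\end{align*}
```

The two series agree term by term, which is the pattern exploited in (41): once $s(b)+I(\theta,b)$ is expanded in powers of $b$, the coefficient of $b^{k}$ determines $\kappa_k(\theta)$ up to the factor $(-1)^{k}/k!$.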
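In order to perform this task, we will take advantage of Proposition 7 as follows: first let us define, for $k\ge 1$, $\alpha_{k,j,m}=J_0^{(k)}(0,E_{j,m})/k!$ (which can be explicitly computed via Proposition 8). With this notation, we can write, for $l,n\ge 1$,
\begin{align*}
I(\theta,b)&=\sum_{m=1}^{n}\theta^{m}\sum_{j=0}^{m-1}\chi(\theta)^{j}\left(\sum_{k=1}^{l}\alpha_{k,j,m}b^{k}+O\left(b^{l+1}\right)\right)+b\,o(\theta^{n})\\
&=\sum_{k=1}^{l}b^{k}\sum_{m=1}^{n}\sum_{j=0}^{m-1}\theta^{m}\chi(\theta)^{j}\alpha_{k,j,m}+\theta\,O\left(b^{l+1}\right)+b\,o(\theta^{n}).
\end{align*}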
Therefore, we obtain that, for all $s,n\ge 1$, $\kappa_s(\theta)$ satisfies
\[
\kappa_s(\theta)=(-1)^{s}\left(\kappa_s(0)+s!\sum_{m=1}^{n}\sum_{j=0}^{m-1}\theta^{m}\chi(\theta)^{j}\alpha_{s,j,m}\right)+O\left(\theta^{n+1}\right).
\]
Consequently, $\kappa_n(\cdot)$ is an infinitely differentiable function at $\theta=0$ and for $m\ge 0$ and $n\ge 1$ we have
\[
\frac{\kappa_n^{(m)}(0)}{n!}=(-1)^{n}\frac{\kappa_n(0)}{n!}+\sum_{s=0}^{m-1}\sum_{j=0}^{m-1-s}\chi_{s,j}\,\alpha_{n,j,m-s},
\]
where, for $n,j\ge 1$, $\chi_{n,j}$ is the coefficient multiplying $\theta^{n}$ in the expansion for $\chi(\theta)^{j}$. In particular, the $\chi_{n,j}$ can be computed recursively as
\[
\chi_{k,j+1}=\sum_{n=0}^{k}\chi_{n,j}\,\chi^{(k-n)}(0)/(k-n)!,
\]
with $\chi_{n,1}=\chi^{(n)}(0)/n!$.
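The recursion for the coefficients of $\chi(\theta)^{j}$ is a truncated Cauchy product, and can be sketched as follows. The function name `series_power_coeffs`, the dictionary layout, and the test series (the Taylor coefficients of $1/(1-\theta)$) are assumptions made for this illustration; the actual coefficients of $\chi$ depend on $\phi$ and are not computed here.

```python
def series_power_coeffs(chi, j_max):
    """Return table[j][k] = coefficient of theta^k in chi(theta)^j, for j = 1..j_max.

    `chi` lists the Taylor coefficients chi^{(k)}(0)/k! for k = 0..K.
    Row j+1 is the Cauchy product of row j with row 1, truncated at degree K.
    """
    K = len(chi) - 1
    table = {1: list(chi)}
    for j in range(1, j_max):
        prev = table[j]
        table[j + 1] = [
            sum(prev[n] * chi[k - n] for n in range(k + 1))
            for k in range(K + 1)
        ]
    return table

# Illustrative input: 1/(1 - theta) = 1 + theta + theta^2 + ...
table = series_power_coeffs([1, 1, 1, 1, 1], 3)
print(table[2])  # coefficients of 1/(1 - theta)^2 up to degree 4
print(table[3])  # coefficients of 1/(1 - theta)^3 up to degree 4
```

Truncating every row at the same degree keeps the cost at $O(K^{2})$ per power, which is adequate for the low orders needed when only the first few $\beta_n$ are wanted.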
2.6 Technical Proofs
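Proof of Theorem 3. Using Lemma 1, we can add
\[
0=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{b}{(b+i\lambda)\,i\lambda}\log\left(1+\frac{i\lambda}{2\phi'(\theta)}\right)d\lambda
\]
to expression (13) for $\rho(\theta,b)$ to obtain
\begin{align*}
\rho(\theta,b)&=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{-i\phi'(\theta)\lambda\left(1+i\lambda/(2\phi'(\theta))\right)}\right)d\lambda\\
&=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(\frac{2\left(\gamma(\theta)-\gamma(\theta+i\lambda)\right)}{\lambda\left(\lambda-2i\phi'(\theta)\right)}\right)d\lambda,
\end{align*}
yielding the conclusion of the theorem.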
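Proof of Proposition 3. It follows immediately, by a Taylor series expansion of $\gamma(\cdot)$, that a series representation for $H$ can be written (for fixed $\lambda$ and $\theta$ such that $0<|\lambda|+|\theta|<\eta$) as
\begin{align*}
H(\theta,\lambda)&=1-\frac{2i\phi'(\theta)}{\lambda}-\frac{\gamma(\theta)-\gamma(\theta+i\lambda)}{1-g(\lambda)}\\
&=1-\frac{2i}{\lambda}\sum_{k=1}^{\infty}\mu_{k+1}\frac{\theta^{k}}{k!}-\frac{1}{1-g(\lambda)}\sum_{k=0}^{\infty}\left(\mu_k-\gamma^{(k)}(i\lambda)\right)\frac{\theta^{k}}{k!}\\
&=\sum_{k=1}^{\infty}h_k(i\lambda)\frac{\theta^{k}}{k!}.
\end{align*}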
In fact, the functions $h_k(i\cdot)$ can be analytically extended throughout the disc $D_{\eta/2}=\{z\in\mathbb{C}:|z|<\eta/2\}$. This is easily seen as follows: recall that $\gamma(\cdot)$ (and therefore $\gamma^{(k)}(\cdot)$) are analytic on $N$ (defined in Section 2). Also, observe that $1-\gamma(iz)\sim z^{2}/2$ and $\gamma^{(k)}(iz)-\mu_k\sim iz\mu_{k+1}$ as $z\to 0$. Thus, $\left(\gamma^{(k)}(iz)-\mu_k\right)/(1-\gamma(iz))$ possesses a simple pole at 0 with residue equal to $2i\mu_{k+1}$, which implies that the natural extension of $h_k$ defined as
\[
h_k(iz)=\frac{\gamma^{(k)}(iz)-\mu_k}{1-\gamma(iz)}-\frac{2i\mu_{k+1}}{z}
=\frac{\left(\gamma^{(k)}(iz)-\mu_k\right)z-2i\mu_{k+1}\left(1-\gamma(iz)\right)}{\left(1-\gamma(iz)\right)z}
\]
is analytic on $D_{\eta/2}$. Now, by virtue of the maximum principle (see, for example, Rudin (1987), p. 253) we have that if $\delta>0$ is suitably small,
\[
\sup_{|z|\le\delta}|h_k(iz)|\le\sup_{|z|=\delta}|h_k(iz)|.
\]
Since $\gamma(z)$ is a non-constant analytic function defined on $D_{\eta/2}$ (which is an open set and thus has an accumulation point), then $1-\gamma(z)$ has an isolated zero at $z=0$. Thus, it is possible to choose $\delta>0$ in such a way that
\[
\inf_{|z|=\delta}|1-\gamma(iz)|>\varepsilon>0,
\]
for some $\varepsilon>0$. Consequently,
\[
\sup_{|z|\le\delta}|h_k(iz)|\le\sup_{|z|=\delta}|h_k(iz)|\le\frac{1}{\varepsilon\delta}\sup_{|z|=\delta}\left|\left(\gamma^{(k)}(iz)-\mu_k\right)z+2\mu_{k+1}\left(1-\gamma(iz)\right)\right|.
\]
Observe that, for $|z|<\eta/2$, $\gamma^{(k)}(z)=E_0\left(X^{k}\exp(zX)\right)$. Therefore, if $z=x+iy$, with $|z|=\delta$,
\[
\left|\gamma^{(k)}(iz)\right|\le E_0\left(|X|^{k}\left|\exp(izX)\right|\right)=E_0\left(|X|^{k}\left|\exp(yX)\right|\right)\le E_0\left(|X|^{k}\exp(\delta|X|)\right).
\]
A similar bound can be obtained for $\gamma(z)$ and we can conclude that there exists $B>0$ such that
\[
\sup_{|z|\le\delta}|h_k(iz)|\le B\left(E_0\left(|X|^{k}\left(\exp(\delta|X|)+1\right)\right)+E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\right).
\]
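Now, suppose that $\delta<\eta/2$. Then, if $z_1\in D_{\eta/2}$, we can define
\[
B\sum_{k=1}^{\infty}E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\frac{z_1^{k}}{k!}\triangleq N_1(z_1)
\]
in such a way that the previous series converges absolutely and uniformly on $D_{\eta/2}$. Similarly, we can define
\begin{align*}
B\sum_{k=1}^{\infty}E_0\left(|X|^{k}\left(\exp(\delta|X|)+1\right)\right)\frac{z_1^{k}}{k!}
&=BE_0\left(\sum_{k=1}^{\infty}\frac{|X|^{k}z_1^{k}}{k!}\left(\exp(\delta|X|)+1\right)\right)\\
&=BE_0\left(\left(\exp(z_1|X|)-1\right)\left(\exp(\delta|X|)+1\right)\right)\triangleq N_2(z_1).
\end{align*}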
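Note that, for $j=1,2$, $N_j(z_1)\to 0$ as $z_1\to 0$. On the other hand, since $g(\lambda)$ is strongly non-lattice, we have that
\[
\sup_{|\lambda|\ge\delta}|h_k(i\lambda)|=\sup_{|\lambda|\ge\delta}\left|\frac{\gamma^{(k)}(i\lambda)-\mu_k}{1-g(\lambda)}-\frac{2i\mu_{k+1}}{\lambda}\right|\le B\left(E_0\left(|X|^{k}\right)+E_0\left(|X|^{k+1}\right)\right),
\]
if $B<\infty$ is big enough. The previous estimates imply that there exist constants $0<M_k\le B\left(E_0\left(|X|^{k}\left(\exp(\delta|X|)+1\right)\right)+E_0\left(|X|^{k+1}\right)\left(1+E_0\exp(\delta|X|)\right)\right)$ such that
\[
\sup_{z_2\in\mathbb{R}\cup D_{\eta/2}}|h_k(iz_2)|\le M_k
\]
and $\left|\sum_{k=1}^{\infty}M_k\frac{z_1^{k}}{k!}\right|\le\sum_{k=1}^{\infty}\left|M_k\frac{z_1^{k}}{k!}\right|<\infty$ for $z_1\in D_{\eta/2}$. Thus, using the Weierstrass M test, we obtain the validity of (23). Finally, the invoked Weierstrass M test combined with the analytic functions convergence theorem (see Theorem 10.28, p. 214, of Rudin (1987)) yields the analyticity of $H(z_1,\cdot)$ on $\mathbb{R}\cup D_{\eta/2}$ (for $z_1\in D_{\eta/2}$) and similarly for $H(\cdot,z_2)$ on $D_{\eta/2}$ (for $z_2\in\mathbb{R}\cup D_{\eta/2}$).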
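Proof of Proposition 1. We start by writing
\[
I(\theta,b)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{-b}{(b+i\lambda)\,i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda.
\]
The strategy will be to study this integral on $\{|\lambda|<\delta\}$ and $\{|\lambda|\ge\delta\}$ separately (where $\delta>0$ is some convenient small number to be characterized later):
\begin{align*}
I(\theta,b)={}&-\frac{1}{2\pi}\int_{-\delta}^{\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda \tag{42}\\
&-\frac{1}{2\pi}\int_{|\lambda|\ge\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda. \tag{43}
\end{align*}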
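Let us define $I_A(\theta,b)$ and $I_B(\theta,b)$ as (42) and (43) respectively. Suppose that $0<b<\delta<\eta/2$. By making $u=b\lambda$, we can write
\[
I_A(\theta,b)=-\frac{1}{2\pi}\int_{-\delta}^{\delta}\frac{b}{(b+i\lambda)}\,\frac{1}{i\lambda}\log\left(1-\frac{H(\theta,\lambda)\lambda}{\lambda-2\phi'(\theta)i}\right)d\lambda.
\]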
Let $C=\{w\in\mathbb{C}:|w|\le\delta\}\cap\{\operatorname{Im}(w)\le 0\}$, and observe that by virtue of Proposition 3, we can pick $\delta_1>0$ in such a way that for all $0<\theta<\delta_1$ the function
\[
f_1(w)=\frac{b}{(b+iw)}\,\frac{1}{iw}\log\left(1-\frac{H(\theta,w)w}{w-2\phi'(\theta)i}\right)
\]
is analytic on $C$. Thus, applying Cauchy's theorem to the contour enclosing $C$ we obtain
\begin{align*}
I_A(\theta,b)
&=-\frac{1}{2\pi}\int_{-\pi}^{0}\frac{b}{(b+i\delta e^{i\lambda})}\,\frac{i\delta e^{i\lambda}}{i\delta e^{i\lambda}}\log\left(1-\frac{H\left(\theta,\delta e^{i\lambda}\right)\delta e^{i\lambda}}{\delta e^{i\lambda}-2\phi'(\theta)i}\right)d\lambda\\
&=\frac{1}{2\pi}\int_{-\pi}^{0}\frac{ib\delta^{-1}e^{-i\lambda}}{1-ib\delta^{-1}e^{-i\lambda}}\log\left(1-\frac{H\left(\theta,\delta e^{i\lambda}\right)}{1-i2\phi'(\theta)\delta^{-1}e^{-i\lambda}}\right)d\lambda. \tag{44}
\end{align*}
The equality (44) has been obtained by simple algebraic manipulations. Observe that the previous expression in combination with Proposition 3 and the analyticity of the function $\phi'(\theta)$ ($\sim 0$) at zero immediately gives that $I_A(\theta,b)$ can be represented as an absolutely convergent double power series in $\theta$ and $b$ on the set $0<|\theta|+|b|<\delta_2$ for some $\delta_2>0$. Indeed, if we pick $\delta_2$ small enough, it is possible to provide an explicit power series representation for $I_A(\theta,b)$ by using the expansion of $\log(1-w)$ at $w=0$ in combination with the series representation (23) for the function $H(\theta,\lambda)$ derived in Proposition 3 and a Taylor expansion of $(1-w)^{-1}$ around $w=0$.
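The analysis of $I_B(\theta,b)$ is easier,
\[
I_B(\theta,b)=\frac{1}{2\pi}\int_{|\lambda|\ge\delta}\frac{b}{\left(1-b\lambda^{-1}\right)}\,\frac{1}{\lambda^{2}}\log\left(1-\frac{H(\theta,\lambda)}{1-i2\phi'(\theta)\lambda^{-1}}\right)d\lambda.
\]
Hence, in order to show that $I_B(\cdot)$ can be written as an absolutely convergent double power series in a neighborhood of the origin, it suffices to show (by Fubini's theorem) that
\[
\int_{|\lambda|\ge\delta}\sum_{k,j,m\ge 0}\binom{m+k}{k}\frac{b^{j+1}\left(2\left(E\left(\exp(\theta|X|)-1-|X|\right)\right)\right)^{m}}{(k+1)\,|\lambda|^{j+2+m}}\left(\sum_{s=1}^{\infty}|h_s(i\lambda)|\frac{\theta^{s}}{s!}\right)^{k+1}d\lambda
\]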
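is finite for all non-negative $\theta$ and $b$ such that $\theta+b<\delta_3$ for some $\delta_3>0$. But this fact follows easily from Proposition 3. First note, by the change of variables $\lambda=u\delta$, that the previous expression equals
\[
\int_{|u|\ge 1}\sum_{k,j,m\ge 0}\binom{m+k}{k}\frac{b^{j+1}\left(2\left(E\left(\exp(\theta|X|)-1-|X|\right)\right)\right)^{m}}{\delta^{j+m+1}(k+1)\,|u|^{j+2+m}}\left(\sum_{s=1}^{\infty}|h_s(iu\delta)|\frac{\theta^{s}}{s!}\right)^{k+1}du.
\]
Now pick $\delta_3$ small enough so that $0<\max\left(b,2\left(E\left(\exp(\theta|X|)-1-|X|\right)\right)\right)<\delta_3<\delta$ (if $\theta+b<\delta_3$), and use Proposition 3 to conclude that $\delta_3$ can be chosen so that $\sum_{s=1}^{\infty}|h_s(iu\delta)|\frac{\theta^{s}}{s!}<c<1-\delta_3/\delta$. Therefore, we can bound the previous sum by
\[
\int_{|u|\ge 1}\sum_{k,j,m\ge 0}\binom{m+k}{k}\frac{(\delta_3/\delta)^{j+m+1}}{(k+1)\,|u|^{2}}\,c^{k+1}\,du
\le\frac{2}{3}\,\frac{1}{1-\delta_3/\delta}\left|\log\left(1-\frac{c}{1-\delta_3/\delta}\right)\right|<\infty.
\]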
The conclusions obtained for both $I_A(\cdot)$ and $I_B(\cdot)$ indicate that for all $0\le\theta,b\le\upsilon$ (for some $\upsilon>0$) $I(\theta,b)$ can be written as
\[
I(\theta,b)=\sum_{j,k\ge 1}\theta^{j}b^{k}I_{jk},
\]
where the previous series converges absolutely on the specified region on $\theta$ and $b$. The previous expression provides the natural analytic extension of $I(\cdot)$ on $D_\upsilon^{2}=\{(z_1,z_2)\in\mathbb{C}\times\mathbb{C}:|z_1|+|z_2|<\upsilon\}$.
CHAPTER 2. CORRECTED DIFFUSION APPROXIMATIONS 38
Proof of Theorem 1. Since
\[
\exp(s(\Delta)) = \frac{1 - E_0\bigl(\exp(-\Delta S_{\tau_+})\bigr)}{\Delta\, E_0\bigl(S_{\tau_+}\bigr)} = E_0\bigl(\exp(-\Delta R(\infty))\bigr),
\]
the analytic extension of the term s (∆) follows from that of the right hand side, which
comes from the fact that S_{τ+} has exponential moments (see Asmussen (1987)). Thus,
since r (∆) = s (∆) + I (θ1 (∆), ∆), we just have to analyze I (θ1 (∆), ∆). However,
from the implicit function theorem, we know that θ1 (·) is analytic in a neighborhood
of the origin; thus, the analytic functions convergence theorem (see Theorem 10.28,
p. 214, of Rudin (1987)) combined with Theorem 1 yields the desired conclusion.
Proof of Theorem 4. From Theorem 1, we know that for 0 ≤ θ, b ≤ υ (for
some υ > 0)
\[
I(\theta, b) = \sum_{j=1}^{\infty} b^{j} I_{\cdot,j}(\theta),
\]
where each function I_{·,j} (θ) can be expanded in an absolutely convergent power series for
0 ≤ θ ≤ υ, and thus can be analytically extended throughout a neighborhood of the
origin in the complex plane. But,
\[
\rho(\theta, b) = -\kappa_1(\theta)\, b + \kappa_2(\theta)\, b^{2}/2 - \kappa_3(\theta)\, b^{3}/3! + \dots = s(b) + I(\theta, b),
\]
where s (·) is (real) analytic at zero. Hence, the conclusion of the Theorem follows
immediately by matching coefficients.
Next, we show that if the distribution of X1 is symmetric then for n ≥ 1, β2n = 0.
Proof of Theorem 2. As we discussed before, all that we need to show is that
β2n = 0. We have shown that an absolutely convergent power series representation
is possible for r (∆) when ∆ is small, thus it suffices to show that if 0 < ∆ < δ
(where δ > 0 is suitably small), then an asymptotic expansion for r (∆) is given in
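odd powers of ∆ only. Using the integral expression (14), integrating on |λ| ≤ δ and
|λ| > δ we can write
\begin{align}
r(\Delta) &= \frac{1}{2\pi} \int_{|\lambda| < \delta} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + i\lambda))}{\lambda\,(\lambda - 2i\phi'(\theta_1))} \right) d\lambda \tag{45} \\
&\quad + \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + i\lambda))}{\lambda\,(\lambda - 2i\phi'(\theta_1))} \right) d\lambda. \tag{46}
\end{align}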
Define by A (∆) and B (∆) the integrals appearing in expressions (45) and (46) re-
spectively. We first analyze A (∆). Using a similar argument as in the proof of
Theorem 1, we see that
\[
A(\Delta) = \frac{1}{2\pi} \int_{C_1} \frac{\Delta}{(\Delta + iz)\, iz} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iz))}{z\,(z - 2i\phi'(\theta_1))} \right) dz,
\]
where the trajectory C1 is defined as C1 = {δe^{iλ} : λ ∈ [0, −π)}. Also, define the
trajectory C2 = {δe^{iλ} : λ ∈ [−π, 0)}. The proof of the theorem will be complete if we
show that A (∆) is an odd function. That is, we must show that A (∆) = −A (−∆).
Note that
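\begin{align}
-A(-\Delta) &= -\frac{1}{2\pi} \int_{C_1} \frac{-\Delta}{(-\Delta + iz)\, iz} \log\left( \frac{2\,(\gamma(-\theta_1) - \gamma(-\theta_1 + iz))}{z\,(z - 2i\phi'(-\theta_1))} \right) dz \notag \\
&= \frac{1}{2\pi} \int_{C_2} \frac{-\Delta}{(\Delta + iw)\, iw} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iw))}{w\,(w - 2i\phi'(\theta_1))} \right) dw. \tag{47}
\end{align}
Equality (47) was obtained by making the change of variables −w = z and using that
γ (θ1) and φ′ (θ1) are even and odd functions of θ1 respectively. In view of (47), in
order to show that A (∆) = −A (−∆), it suffices to show that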
\[
0 = \frac{1}{2\pi} \int_{C} \frac{\Delta}{(\Delta + iw)\, iw} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iw))}{w\,(w - 2i\phi'(\theta_1))} \right) dw,
\]
where C = C1 + C2 is the contour corresponding to the circle with radius δ. Now,
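\begin{align}
&\frac{1}{2\pi} \int_{C} \frac{\Delta}{(\Delta + iw)\, iw} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iw))}{w\,(w - 2i\phi'(\theta_1))} \right) dw \notag \\
&\quad = \frac{1}{2\pi} \int_{-C} \frac{\Delta}{w\,(w - i\Delta)} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iw))}{w\,(w - i\Delta)} \right) dw \tag{48} \\
&\qquad + \frac{1}{2\pi} \int_{-C} \frac{\Delta}{w\,(w - i\Delta)} \log\left( \frac{w - i\Delta}{w - i2\phi'(\theta_1)} \right) dw. \tag{49}
\end{align}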
We will show that both terms (48) and (49) vanish. We first consider (49). For
γ ∈ [0, 1] and a ∈ [−δ, δ], define f (γ) as
\[
f(\gamma) = \frac{1}{2\pi} \int_{-C} \frac{\Delta}{w\,(w - i\Delta)} \log(\gamma w - ia)\, dw.
\]
Using residue calculus (see Rudin (1987), p. 224) it is easy to see that f (0) = 0. A
standard dominated convergence argument yields
\[
f'(\gamma) = \frac{1}{2\pi} \int_{-C} \frac{\Delta}{(w - i\Delta)(\gamma w - ia)}\, dw = 0,
\]
where the previous integral has again been evaluated using residue calculus. As a
result, we obtain that f (1) = 0. Applying these considerations with a = ∆ and
a = 2φ0 (θ1) shows that the integral in (49) equals zero. We also can apply residue
calculus to evaluate (48) directly as follows. Consider
\[
f_1(w) = \frac{\Delta}{w\,(w - i\Delta)} \log\left( \frac{2\,(\gamma(\theta_1) - \gamma(\theta_1 + iw))}{w\,(w - i\Delta)} \right).
\]
Using the change of variables w = h + i∆ and the definition of ∆ = θ1 − θ0 with
γ (θ1) = γ (θ0) we can evaluate the residue of f1 at w = i∆ as Residue(f1; i∆) =
−i log (−2γ′ (θ0) /∆). We also can obtain Residue(f1; 0) = i log (2γ′ (θ1) /∆). Therefore,
using residue calculus we obtain that the integral in (48) equals
\[
-i \log\bigl(-2\gamma'(\theta_0) / (2\gamma'(\theta_1))\bigr) = -i \log\bigl(\gamma'(\theta_1)/\gamma'(\theta_1)\bigr) = 0,
\]
since in the case of symmetric distributions γ′ (λ) is odd and θ1 = −θ0.
Finally, we analyze B (∆). Note that
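\begin{align}
B(\Delta) &= \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda} \log\bigl( 2\,(1 - g(\lambda))\, \lambda^{-2} \bigr)\, d\lambda \tag{50} \\
&\quad + \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{-\Delta}{(\Delta + i\lambda)\, i\lambda} \log\left( 1 - \frac{\lambda H(\theta_1, \lambda)}{\lambda - 2i\phi'(\theta_1)} \right) d\lambda. \tag{51}
\end{align}
Let B1 (∆) and B2 (∆) be defined as (50) and (51) respectively. Since X1 is symmetric,
it follows that log (2 (1 − g (λ)) λ^{−2}) is real. As a result, we obtain, just by integrating
the real and imaginary parts of the integrand in B1 (∆),
\[
B_1(\Delta) = \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{\Delta}{\Delta^{2} + \lambda^{2}} \log\bigl( 2\,(1 - g(\lambda))\, \lambda^{-2} \bigr)\, d\lambda. \tag{52}
\]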
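Expression (52) yields an asymptotic expansion in odd powers of ∆ for B1 (∆). Again,
integrating the real and imaginary parts in B2 (∆) we obtain
\begin{align}
B_2(\Delta) &= \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{\Delta}{\Delta^{2} + \lambda^{2}}\, \mathrm{Re} \log\left( 1 - \frac{\lambda H(\theta_1, \lambda)}{\lambda - 2i\phi'(\theta_1)} \right) d\lambda \tag{53} \\
&\quad - \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{\Delta^{2}}{(\Delta^{2} + \lambda^{2})\, \lambda}\, \mathrm{Im} \log\left( 1 - \frac{\lambda H(\theta_1, \lambda)}{\lambda - 2i\phi'(\theta_1)} \right) d\lambda. \tag{54}
\end{align}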
The previous identity for B2 (∆) is obtained by observing that the integral
of the imaginary part must vanish. This occurs because for all θ1 small the
function log (1 − iλH (θ1, λ) / (iλ + 2φ′ (θ1))) satisfies the parity property, which can
be verified by observing that, since γ (iλ) = E0 cos (λX) + iE0 sin (λX), it follows that
hk (iλ) ∈ P; also, using Proposition 6, we obtain that iλ/ (iλ + 2φ′ (θ1)) satisfies the
parity property. Therefore, the closure properties proved in Proposition 6 together
with an expansion of the logarithm yield that log (1 − iλH (θ1, λ) / (iλ + 2φ′ (θ1))) ∈ P,
which justifies (53) and (54). For notational convenience let us define
\[
C(\theta_1, \lambda) = \sum_{k=1}^{\infty} h_{2k}(i\lambda)\, \theta_1^{2k} / (2k)! \tag{55}
\]
and
\[
D(\theta_1, \lambda) = -i \sum_{k=1}^{\infty} h_{2k-1}(i\lambda)\, \theta_1^{2k-1} / (2k-1)!, \tag{56}
\]
where hk (iλ) = (γ^{(k)} (iλ) − µk)/(1 − γ (iλ)) − 2iµ_{k+1}/λ. Since the distribution of X1
is symmetric we have that γ (iλ) is even and real. Moreover, we also have that hk (iλ)
is even if and only if k is even. We also can see that Re (H (θ, λ)) ≜ C (θ, λ) and
Im (H (θ, λ)) ≜ D (θ, λ) are even and odd functions of both θ and λ (meaning that
for every θ ∈ (−η/2, η/2) fixed, C (θ, ·) is even and, similarly, for each λ ∈ R, C (·, λ)
is also even on (−η/2, η/2), say). Using this notation, we can write
\begin{align}
\frac{\lambda H(\theta_1, \lambda)}{\lambda - 2\phi'(\theta_1)\, i} &= \frac{\lambda^{2} C(\theta_1, \lambda) - 2\phi'(\theta_1)\, \lambda D(\theta_1, \lambda)}{\lambda^{2} + (2\phi'(\theta_1))^{2}} \tag{57} \\
&\quad + i\, \frac{2\phi'(\theta_1)\, \lambda C(\theta_1, \lambda) + \lambda^{2} D(\theta_1, \lambda)}{\lambda^{2} + (2\phi'(\theta_1))^{2}}. \tag{58}
\end{align}
Let us define C (θ1, λ) and D (θ1, λ) as the real and imaginary parts of
λH (θ1, λ) / (λ − 2φ′ (θ1) i), respectively, as indicated in the corresponding expressions
(57) and (58). Since λH (θ1, λ) / (λ − 2φ′ (θ1) i) satisfies the parity property, C (θ1, λ)
and D (θ1, λ) are even and odd functions in both arguments θ1 and λ. By symmetry
of the distribution of X1 we have that ∆ = 2θ1; also as a consequence of symmetry,
2φ′ (θ1) is an odd (real) analytic function of θ1 at the origin, which implies that
(2φ′ (θ1))^2 is even. Hence, using the expansion of log (1 − z) at z = 0 in expressions
(53) and (54) (justified by virtue of Proposition 3), we see that an asymptotic
expansion for the integral (53) involves expanding expressions of the form
\[
K(\theta_1)\, \frac{1}{2\pi} \int_{|\lambda| \ge \delta} \frac{\Delta}{\Delta^{2} + \lambda^{2}}\, C(\theta_1, \lambda)^{k}\, D(\theta_1, \lambda)^{2m}\, d\lambda \tag{59}
\]
where K (θ1) is an even function of θ1 which is also (real) analytic at the origin. This
implies in view of (55) to (58) and the properties of 2φ0 (θ1) discussed before, that an
asymptotic expansion for (59) must be given in odd powers of ∆ only, which must be
also the case for the integral in (53). The treatment for the integral (54) is completely
analogous and also yields an asymptotic expansion in odd powers of ∆. This yields
the conclusion of the theorem.
Chapter 3
The Cramer-Lundberg Theorem in
the Presence of Heavy Tails
Let S = (Sn : n ≥ 0) be the random walk generated by the sequence X = (Xn : n ≥ 1)
of independent and identically distributed random variables (iid rv’s) with EX1 = 0
and EX1^2 = 1 (so that S0 = 0 and Sn = X1 + ... + Xn for n ≥ 1). Assume that the
Xi’s are strongly non-lattice, in the sense that g (λ) ≜ E exp (iλX1) satisfies, for each
ε > 0,
\[
\inf_{|\lambda| > \varepsilon} |1 - g(\lambda)| > 0.
\]
Or, in other words, that lim_{|λ|→∞} |g (λ)| < 1 (see Siegmund (1985), p. 176).
Let us introduce a small location parameter δ > 0 representing the drift of the
random walk. More precisely, let us consider a parametric family of random walks,
S^δ = (S^δ_n : n ≥ 0), generated by the sequence X^δ = (Xn − δ : n ≥ 1), so that
\[
S^{\delta}_n = S_n - n\delta.
\]
We shall focus on developing highly accurate approximations, of Cramer-Lundberg
type, for the distribution of
\[
M_\delta = \max_{n \ge 0} S^{\delta}_n
\]
in the presence of heavy tailed increments. For our purposes here, we say that X has
“heavy tails” if for all θ ≠ 0, E exp (θ |X|) = ∞.
Driven by a number of important applications in several disciplines, a great deal of
effort has been put into understanding the distributional properties of Mδ. The book
by Asmussen (2003) provides a detailed account of several important application
settings in which the distribution of Mδ plays a major role. Most notably we mention:
insurance risk theory, in which P (Mδ > x) is the probability of eventual ruin of an
insurer that faces iid claims and possesses initial reserve x; queueing theory, in which
the waiting time sequence (excluding service) in the single-server queue, under iid
inter-arrival and processing times, and first-come first-served service discipline, turns
out to converge in distribution to Mδ (see Kiefer and Wolfowitz (1956)); and sequential
analysis, in which the tail probability P (Mδ > x) can be interpreted as the power of
a one-sided sequential probability ratio test (see Siegmund (1985)).
Many problems in applied probability motivate study of models with heavy tails.
For instance, in certain lines of the insurance business, such as fire insurance,
statistical evidence suggests that claim sizes generally exhibit heavy tailed behavior (see,
for example, p. 436 of Bowers et al (1997) and Embrechts, Klüppelberg and Mikosch
(1997)). Queueing theory also gives rise to heavy-tails. For example, when mod-
eling data traffic in communication networks, evidence has been found suggesting
that exponential tail features (present in traditional models of data traffic) are not
compatible with empirical observations (see Adler, Feldman and Taqqu (1998), and
Willinger et al (1995)). Therefore, developing asymptotic analysis for systems with
heavy tail characteristics is an important applied problem.
Computing the exact distribution of Mδ (either numerically or analytically) under
general increment distributions is well known to be a challenging problem. Essentially,
it entails solving a Wiener-Hopf type equation known as Lindley’s equation (see
Lindley (1952)). This integral equation corresponds to the equation describing the
stationary distribution of the positive recurrent Markov chain W_{n+1} = (Wn + Xn − δ)^+.
Consequently, most of the literature has been focused on developing approximations
and numerical algorithms for computing the distribution of Mδ. One of the most
popular approximations is based on the so-called Cramer-Lundberg asymptotic
formula (see equation (1) below). This formula was initially developed for light tailed
increment distributions (i.e. E exp (η |X1|) < ∞ for η in a neighborhood of the ori-
gin). The Cramer-Lundberg approximation is a celebrated result in insurance risk and
queueing theory (see Asmussen (2001, 2003) and Grandell (1991)), and it is widely
accepted that it tends to perform very well in practice (see the discussion in Asmussen
(2001) and Grandell (1991)). This performance can be explained via the exponential
rate of convergence that actually holds in many practical applications, see equation
(2) below.
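The Markov chain behind Lindley’s equation is straightforward to iterate numerically. The following minimal Python sketch is not part of the original development; the function name `lindley_waiting_times` and the Gaussian sample increments are illustrative assumptions. It iterates W_{n+1} = (Wn + Xn − δ)^+ from W0 = 0.

```python
from typing import Iterable, List


def lindley_waiting_times(increments: Iterable[float], delta: float) -> List[float]:
    """Iterate W_{n+1} = max(W_n + X_n - delta, 0) starting from W_0 = 0.

    `increments` plays the role of X_1, X_2, ... and `delta` is the drift.
    The name and signature are hypothetical, chosen for illustration only.
    """
    w = 0.0
    path = [w]
    for x in increments:
        w = max(w + x - delta, 0.0)  # positive part (.)^+
        path.append(w)
    return path


if __name__ == "__main__":
    import random

    rng = random.Random(0)  # fixed seed so the run is reproducible
    delta = 0.1
    # Assumed increments: standard normal (mean 0, variance 1), i.e. light tailed.
    xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
    path = lindley_waiting_times(xs, delta)
    print("last waiting time:", path[-1])
```

For small δ the chain spends long stretches away from zero, which is one way to see why direct simulation of the stationary law becomes expensive in the heavy traffic regime.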
As we shall see in Section 2, the Cramer-Lundberg representation for P (Mδ > x)
in the case of light tailed increments can be interpreted in a “scaled” form as a
function of δ > 0 only (i.e. we allow x = y (δ) = O(δ^{−b}) for b ≥ 1 as δ ↘ 0, see
(2) and (3) below). The case of y (δ) = O(δ^{−1}) (as δ ↘ 0) is of great interest, since
it corresponds to the so-called diffusion scale (see equation (4)). With this scaled
interpretation we can see that the Cramer-Lundberg approximation has an error that
is exponentially small as δ ↘ 0 (or, equivalently, y (δ) ↗ ∞). Note that the case δ
close to zero is encountered often in practice. For instance, in the queueing setting
described before, δ ≈ 0 corresponds to the so-called heavy traffic regime in which the
server is busy close to 100% of the time (this terminology actually motivated the title
of this chapter). In insurance risk theory, δ close to zero implies that the premium
charged is close to the typical pay-out for claims (in the language of risk theorists,
the “safety loading” is small). Furthermore, our scaled form of the Cramer-Lundberg
representation allows us to obtain a corresponding heavy tailed version (assuming
E |X1|^{3+α} < ∞ for α > 0) of the standard Cramer-Lundberg approximation that provides
a good fit (as δ ↘ 0) at essentially every region of the quantile space (see (5)). In
particular, the error obtained is of polynomial form in δ, at a rate that depends on
the number of moments available. Although we state the complete form of our scaled
Cramer-Lundberg representation (see (5)), we focus only on the diffusion region of the
space (i.e. y (δ) = O(δ^{−1})), which yields the most important result of this chapter,
namely, Theorem 1. The details for the case y (δ) = O(δ^{−b}) for b > 1 are given in
Blanchet, Olvera-Cravioto and Glynn (2004).
Initial forms of heavy tailed Cramer-Lundberg asymptotics for P (Mδ > x) were
given by Bahr (1975) and Borovkov (1976) in the context of the so-called classical risk
model or (equivalently) the single-server queue with Poisson arrivals. Further
generalizations were developed by Embrechts and Veraverbeke (1982). These heavy tailed
versions of the Cramer-Lundberg approximation tend to perform well only at very
large quantile values (see Asmussen and Binswanger (1997), and also the discussion
in Embrechts, Klüppelberg and Mikosch (1997) p. 54). The approximations provided
in this chapter (in particular, see Theorem 1) are intended to yield good fit in more
“typical” values of the distribution (i.e. on the region x = y (δ) = O(δ^{−1}), which
corresponds to the diffusion scale). For large quantiles (i.e. x = y (δ) = O(δ^{−b}) for
b ≥ 1) our approximations match earlier results mentioned above.
A closely related approximation, of the type of so-called “corrected diffusion
approximations” (CDA’s), has been tested in practical applications by Asmussen and
Binswanger (1997) and shows satisfactory performance. This first order CDA was
developed by Hogan (1986). As we shall see, Theorem 1 not only allows one to
strengthen and recover Hogan’s CDA but it also significantly reduces the error of the
diffusion approximation (see (4) below) as δ ↘ 0.
Section 2 introduces our “scaled” Cramer-Lundberg representation and discusses
our main results (see Theorem 1) using ideas from the light tailed case. Section 3
studies the connection between our proposed representation and corrected diffusion
approximations. The technical development is given in Section 4.
3.1 A Cramer-Lundberg Representation
As we mentioned previously, the so-called Cramer-Lundberg asymptotic formula was
initially developed for light tailed random walks. In particular, suppose that there
exists a positive solution θ^δ to the equation
\[
\phi(\theta^{\delta}) = \exp(\theta^{\delta} \delta),
\]
where φ (θ) ≜ E exp (θX1). For x > 0 define τ (x) = inf{n ≥ 1 : S^δ_n > x}. Since
{τ (x) < ∞} = {Mδ > x}, the fundamental identity of sequential analysis establishes
that
\[
P(M_\delta > x) = P(\tau(x) < \infty) = \exp(-\theta^{\delta} x)\, E_{\theta^{\delta}} \exp\bigl( -\theta^{\delta} ( S^{\delta}_{\tau(x)} - x ) \bigr),
\]
where
\[
P_{\theta^{\delta}}(A) = E\bigl( \exp( \theta^{\delta} (S_n - n\delta) )\, 1_A \bigr)
\]
for every set A ∈ σ (X1, ..., Xn) (where σ (X1, ..., Xn) is the sigma-field generated by
X1, ..., Xn). The “overshoot” R (x) ≜ S^δ_{τ(x)} − x can be interpreted as the residual
life time of the embedded renewal process generated by the strictly ascending ladder
heights of S^δ. The standard Cramer-Lundberg asymptotic is then obtained by
applying renewal theory at strictly ascending ladder heights, yielding
\[
P(M_\delta > x) \sim \exp\bigl( -\theta^{\delta} x + r(\delta) \bigr) \tag{1}
\]
as x → ∞, where r (δ) = log E exp (−θ^δ R (∞)).
Moreover, since we are assuming strongly non-lattice increment distributions, a
result by Stone (1965) on rates of convergence in renewal theory guarantees an
exponential rate of convergence in (1). In particular, the Cramer-Lundberg representation
\[
P(M_\delta > x) = \exp\bigl( -\theta^{\delta} x + r(\delta) \bigr) + O\bigl( e^{-ax} \bigr) \tag{2}
\]
holds for some a > 0 (see Asmussen (2003) p. 196). It turns out that the exponential
rate of convergence in (2) is uniform in δ > 0 (see Lemma 5 of Siegmund (1979) or
Lemma 1 below), allowing us to write (see Chang (1992)) the following scaled
Cramer-Lundberg representation for P (Mδ > y (δ)):
\[
P(M_\delta > y(\delta)) = \exp\bigl( -\theta^{\delta} y(\delta) + r(\delta) \bigr) + O\bigl( e^{-a y(\delta)} \bigr), \tag{3}
\]
which is valid for some a > 0 (uniformly on δ ∈ [0, δ1] for some δ1 > 0) and y (δ) =
O(δ^{−b}) for b > 0. Of special importance is the case in which b = 1 (i.e. y (δ) =
O(δ^{−1})). Using the implicit function theorem it is easy to see that θ^δ = 2δ + O(δ^2);
we therefore can recover, from (3), Kingman’s (1963) diffusion approximation
\[
P(M_\delta > x/\delta) \approx \exp(-2x) + o(1), \tag{4}
\]
valid as δ ↘ 0, for x > 0.
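The relation θ^δ = 2δ + O(δ^2) can be checked numerically in light tailed examples. The sketch below is an illustration under stated assumptions, not the method used in this chapter: it solves log φ(θ) = θδ by bisection for two assumed increment laws with mean 0 and variance 1 (standard normal, and a centered unit exponential). The names `lundberg_root`, `gaussian_log_mgf` and `centered_exponential_log_mgf` are hypothetical.

```python
import math
from typing import Callable


def gaussian_log_mgf(theta: float) -> float:
    """log E exp(theta X) for X standard normal."""
    return 0.5 * theta * theta


def centered_exponential_log_mgf(theta: float) -> float:
    """log E exp(theta X) for X = E - 1 with E ~ Exp(1); valid for theta < 1."""
    return -theta - math.log1p(-theta)


def lundberg_root(log_mgf: Callable[[float], float], delta: float,
                  lo: float, hi: float, tol: float = 1e-12) -> float:
    """Bisection for the positive root of log_mgf(theta) - theta * delta = 0.

    The caller must supply a bracket with f(lo) < 0 < f(hi); theta = 0 is
    always a trivial root and is deliberately excluded by the bracket.
    """
    def f(theta: float) -> float:
        return log_mgf(theta) - theta * delta

    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("bracket must satisfy f(lo) < 0 < f(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    delta = 0.05
    for name, log_mgf in [("gaussian", gaussian_log_mgf),
                          ("centered exponential", centered_exponential_log_mgf)]:
        root = lundberg_root(log_mgf, delta, lo=delta, hi=4.0 * delta)
        print(name, "root:", root, "2*delta:", 2.0 * delta)
```

In the Gaussian case the root is exactly 2δ; in the skewed case it differs from 2δ by a term of order δ^2, in line with the implicit function theorem argument.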
In this chapter, we introduce a heavy tailed version of the scaled Cramer-Lundberg
representation (3). In particular, if E |X1|^{3+α} < ∞ for α > 0, then, for each ε > 0
sufficiently small, our proposed representation takes the form
\[
P(M_\delta > y(\delta)) =
\begin{cases}
\exp\bigl( -\theta^{\delta}_{\alpha}\, y(\delta) + r_\alpha(\delta) \bigr) + o\bigl( \delta^{\alpha - \varepsilon} \bigr) & \text{if } y(\delta) = o\bigl( \delta^{-b} \bigr) \text{ for } b = 1, \\[4pt]
\dfrac{1}{\delta} \displaystyle\int_{y(\delta)}^{\infty} P(X_1 > u)\, du + o\bigl( y(\delta)^{\alpha + 1} \bigr) & \text{if } y(\delta) = o\bigl( \delta^{-b} \bigr) \text{ for } b > 1.
\end{cases} \tag{5}
\]
The constants θ^δ_α and rα (δ) correspond to natural approximations for θ^δ and r (δ)
respectively; their form is discussed in detail below. The case y (δ) = O(δ^{−b}) for
b > 1 is derived under the additional assumption that the increments possess
regularly varying tails, although the technical details are not discussed in this dissertation
(see Blanchet, Olvera-Cravioto, and Glynn (2004) for additional detail on this case).
It suffices to remark in the present discussion that representation (5) generalizes the
scaled Cramer-Lundberg representation (3) and reconciles our proposed
representation with previous Cramer-Lundberg type asymptotics developed for fixed values of
δ (see Embrechts, Klüppelberg and Mikosch (1997) p. 39). In our development here,
we will focus only on the “diffusion” region of the space, namely, y (δ) = O(δ^{−1}).
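In order to understand the nature of the constants θ^δ_α and rα (δ) let us analyze the
elements describing (3). Using the implicit function theorem, it is possible to develop
an approximation for θ^δ in terms of
\[
\theta^{\delta}_{\alpha} \triangleq 2\delta + \sum_{0 \le j \le \alpha} \frac{\xi_{j+2}\, \delta^{j+2}}{(j+2)!} = \theta^{\delta} + o\bigl( \delta^{\lfloor \alpha \rfloor} \bigr), \tag{6}
\]
where ξ2 = 8EX1^3/3, and ξj depends on the first j + 1 moments of X1. Also, it turns
out that r (δ) can be computed explicitly in terms of Woodroofe’s (1979) integral
form
\[
r(\delta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{-\theta^{\delta}}{(\theta^{\delta} + i\lambda)\, i\lambda} \log\left( \frac{\phi(\theta^{\delta}) - e^{-\delta i \lambda} \phi(\theta^{\delta} + i\lambda)}{-i\, \bigl( \phi'(\theta^{\delta}) - \delta \phi(\theta^{\delta}) \bigr)\, \lambda} \right) d\lambda, \tag{7}
\]
see also Siegmund (1985) p. 176. It is not hard to verify, using a dominated
convergence argument and Proposition 8.44 of Breiman (1992), that if
\[
r_\alpha(\delta) \triangleq \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{-\theta^{\delta}_{\alpha}}{(\theta^{\delta}_{\alpha} + i\lambda)\, i\lambda} \log\left( \frac{\gamma_\alpha(\theta^{\delta}_{\alpha}) - e^{-\delta i \lambda} \sum_{k \le \alpha + 1} g^{(k)}(\lambda)\, (\theta^{\delta}_{\alpha})^{k} / k!}{-i\, \bigl( \gamma_\alpha'(\theta^{\delta}_{\alpha}) - \delta \gamma_\alpha(\theta^{\delta}_{\alpha}) \bigr)\, \lambda} \right) d\lambda \tag{8}
\]
with γα (·) defined as
\[
\gamma_\alpha(\theta) = 1 + \sum_{0 \le j \le \alpha} \frac{\theta^{j+2}\, E X^{j+2}}{(j+2)!},
\]
then,
\[
r(\delta) = r_\alpha(\delta) + o(\delta^{\alpha}).
\]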
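The polynomial γα only requires finitely many moments, which is the point of the construction. As a minimal sketch (the function name `gamma_alpha` and the list-based interface are assumptions made for illustration), it can be evaluated from the moments EX^2, EX^3, ... as follows; the sum runs over integers 0 ≤ j ≤ α, so ⌊α⌋ + 1 moments are needed.

```python
import math
from typing import Sequence


def gamma_alpha(theta: float, moments: Sequence[float]) -> float:
    """Evaluate 1 + sum_j theta^(j+2) * E X^(j+2) / (j+2)!.

    `moments[j]` is assumed to hold E X^(j+2) for j = 0, 1, ..., floor(alpha),
    so `moments[0]` is the second moment. Name and interface are hypothetical.
    """
    return 1.0 + sum(
        m * theta ** (j + 2) / math.factorial(j + 2)
        for j, m in enumerate(moments)
    )


if __name__ == "__main__":
    # Assumed example: standard normal moments E X^2 = 1, E X^3 = 0, E X^4 = 3,
    # for which the full moment generating function exp(theta^2 / 2) exists.
    theta = 0.1
    approx = gamma_alpha(theta, [1.0, 0.0, 3.0])
    exact = math.exp(0.5 * theta * theta)
    print("gamma_alpha:", approx, "exact mgf:", exact)
```

In a heavy tailed setting the exact moment generating function used for comparison here does not exist, while `gamma_alpha` remains well defined.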
In view of the fact that for θ^δ_α and rα (δ) to be meaningful only finitely many moments
of X1 are required to exist, the previous estimates together with (3) suggest the
natural scaled Cramer-Lundberg representation provided. Summarizing, the main
result of this chapter is the following.
Theorem 1 Suppose that E |X1|^{α+3} < ∞ for α > 0, and that the distribution of X1
is strongly non-lattice. Then,
\[
P(M_\delta > x/\delta) = \exp\bigl( -\theta^{\delta}_{\alpha}\, x/\delta + r_\alpha(\delta) \bigr) + o\bigl( \delta^{\alpha - \varepsilon} \bigr) \tag{9}
\]
as δ ↘ 0 for ε > 0 sufficiently small and x > 0 fixed.
Remark 1 As we shall see, the slack term ε > 0 comes from an estimate involving
Spitzer identities and the Wiener-Hopf factorization (see Proposition 3). In other
words, if we could set ε = 0 in Proposition 3, then Theorem 1 would hold assuming
only E |X1|^{α+3} < ∞ for α ≥ 0 with an error of order o (δ^α).
Remark 2 We will also see that Theorem 1 could have been formulated in
a more robust form as follows.
Theorem 2 (Robust form) Let G be the class of random variables Y such that
i) sup_{Y∈G} E |Y|^{α+3} < ∞ for all α > 0.
ii) The distribution of Y equals the distribution of X1 on [−1/δ, 1/δ].
iii) X1 is strongly non-lattice and EY = o (δ^α).
Then, for ε > 0 sufficiently small and each x > 0 fixed,
\[
P\bigl( M^{Y}_{\delta} > x/\delta \bigr) = \exp\bigl( -\theta^{\delta}_{\alpha}\, x/\delta + r_\alpha(\delta) \bigr) + o\bigl( \delta^{\alpha - \varepsilon} \bigr),
\]
as δ ↘ 0 (uniformly in Y ∈ G), where M^Y_δ is the all time maximum of the random
walk Sn = Y1 + ... + Yn − nδ, and the Yi’s are iid rv’s that are members of class G.
3.2 Connection to Corrected Diffusion Approximations
The approximation suggested by Theorem 1 is closely related to so-called “corrected
diffusion approximations” (CDA’s). These approximations are developed in the form
of asymptotic expansions in powers of δ > 0. These asymptotic expansions follow the
spirit of Edgeworth expansions for the central limit theorem and provide parametric
information (in δ > 0) about the distribution of the whole time maximum of ran-
dom walk. CDA’s for the distribution of Mδ were introduced by Siegmund (1979).
Assuming light tailed increments, Siegmund (1979) developed an expansion that cor-
rects the diffusion approximation (4) up to an error of order o¡δ2¢. Chang and Peres
(1997) obtained a complete asymptotic expansion for Gaussian random walks and,
as we have seen, a complete asymptotic expansion for general strongly non-lattice
increments with exponential moments was developed in the second chapter of this
dissertation.
A first order CDA (corrected diffusion approximation) to (4) in the case of heavy
tailed increments was proposed by Hogan (1986). In particular, assuming that
E |X1|^5 < ∞, and under some integrability conditions on the characteristic function of X1 (which
in particular imply the continuity of X1), Hogan showed that
\[
P(M_\delta > x/\delta) = \exp(-2x) \left( 1 + \delta\, \frac{4x\, E X_1^{3}}{3} - 2\delta\beta \right) + o(\delta).
\]
The constant β was computed by Siegmund (1979) as
\[
\beta = \frac{1}{6} E X_1^{3} - \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{1}{\theta^{2}}\, \mathrm{Re} \log\bigl\{ 2\,(1 - g(\theta)) / \theta^{2} \bigr\}\, d\theta. \tag{10}
\]
Hogan’s strategy consists, essentially, in applying direct Fourier inversion to the
characteristic function of Mδ. His method of proof does not seem to extend directly to
higher order correction terms.
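Formula (10) is easy to evaluate numerically once g is known. The sketch below is an illustration only and makes several assumptions not taken from the text: standard normal increments (so g (θ) = exp (−θ^2/2) and EX1^3 = 0), Simpson's rule on a truncated range, and an analytic tail correction obtained by dropping the exponentially small term. The names `integrand` and `beta_gaussian` are hypothetical.

```python
import math


def integrand(t: float) -> float:
    """t^(-2) * log(2 * (1 - g(t)) / t^2) with g(t) = exp(-t^2 / 2)."""
    if t < 1e-3:
        # Series near zero: the integrand tends to -1/4 as t -> 0.
        return -0.25 + t * t / 96.0
    # -expm1(-t^2/2) equals 1 - exp(-t^2/2) without cancellation error.
    return math.log(-2.0 * math.expm1(-0.5 * t * t) / (t * t)) / (t * t)


def beta_gaussian(upper: float = 50.0, n: int = 100_000) -> float:
    """Approximate (10) for standard normal increments.

    Simpson's rule on [0, upper] (n must be even) plus a closed-form tail,
    using that for large t the integrand is (log 2 - 2 log t) / t^2 up to an
    exponentially small error.
    """
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    body = total * h / 3.0
    tail = (math.log(2.0) - 2.0 * math.log(upper) - 2.0) / upper
    # E X^3 = 0, and by evenness (1/2pi) * integral over R = (1/pi) * integral over (0, inf).
    return -(body + tail) / math.pi


if __name__ == "__main__":
    print("beta for Gaussian increments:", beta_gaussian())
```

Under these assumptions the result should come out close to 0.583, the value usually quoted for Gaussian random walks.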
A more convenient representation for Hogan’s approximation (which is guaranteed
to give only non-negative values) can be written as
\[
P(M_\delta > x/\delta) \approx \exp\bigl( -2x\, ( 1 - 2\delta E X_1^{3}/3 ) - 2\delta\beta \bigr). \tag{11}
\]
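To see how the two forms of Hogan’s approximation relate, the following small sketch evaluates both. It is illustrative only: the function names are hypothetical and the parameter values in the example are arbitrary assumptions, not data from the dissertation.

```python
import math


def hogan_first_order(x: float, delta: float, ex3: float, beta: float) -> float:
    """exp(-2x) * (1 + delta * 4 x E X^3 / 3 - 2 delta beta), without the o(delta) term."""
    return math.exp(-2.0 * x) * (1.0 + delta * (4.0 * x * ex3 / 3.0 - 2.0 * beta))


def hogan_exponential(x: float, delta: float, ex3: float, beta: float) -> float:
    """exp(-2x (1 - 2 delta E X^3 / 3) - 2 delta beta), the form in (11)."""
    return math.exp(-2.0 * x * (1.0 - 2.0 * delta * ex3 / 3.0) - 2.0 * delta * beta)


if __name__ == "__main__":
    # Arbitrary illustrative inputs (assumptions): x = 1, E X^3 = 2, beta = 0.5.
    for delta in (0.1, 0.01, 0.001):
        a = hogan_first_order(1.0, delta, 2.0, 0.5)
        b = hogan_exponential(1.0, delta, 2.0, 0.5)
        print(delta, a, b, abs(a - b))
```

The two expressions differ by a term of order δ^2, and the exponential form is positive for every input, whereas the first-order form can turn negative when δβ is not small.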
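In order to recover Hogan’s approximation (11) from (9) note that (using the same
technique as in the proof of Theorem 3 in Chapter 2 of this dissertation),
\begin{align}
r_\alpha(\delta) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{-\theta^{\delta}_{\alpha}}{(\theta^{\delta}_{\alpha} + i\lambda)\, i\lambda} \log\left( \frac{2 \bigl( \gamma_\alpha(\theta^{\delta}_{\alpha}) - e^{-\delta i \lambda} \sum_{k \le \alpha + 1} g^{(k)}(\lambda)\, (\theta^{\delta}_{\alpha})^{k} / k! \bigr)}{\lambda\, \bigl( \lambda - 2i\, ( \phi'(\theta^{\delta}) - \delta \phi(\theta^{\delta}) ) \bigr)} \right) d\lambda \notag \\
&\sim \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{-\theta^{\delta}_{\alpha}}{(\theta^{\delta}_{\alpha} + i\lambda)\, i\lambda} \log\bigl( 2\,(1 - \phi(i\lambda))\, \lambda^{-2} \bigr)\, d\lambda \sim 2\delta\beta. \tag{12}
\end{align}
The estimate (12) was obtained from the expansion θ^δ_α = 2δ + δ^2 8EX1^3/3 + o(δ^2),
which is valid, as we show in Corollary 1 below, as long as E |X1|^{4+α} < ∞ for α ≥ 0.
Approximation (11) can therefore be recovered by combining (12) and the expansion
for θ^δ_α into (9). We stress that (9) does not provide a CDA in the parametric sense
introduced by Siegmund (1979). Furthermore, the techniques introduced in Chapter
2 do not apply directly to provide an asymptotic expansion of rα (δ) in powers of the
drift δ under the parameterization utilized here.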
3.3 Technical Development
Throughout the rest of the chapter we will suppose, in addition to the assumptions
discussed at the beginning of this chapter, that E |X1|^{α+3} < ∞ for α > 0. The
strategy that we will pursue follows a truncation argument. Consider the sequence
X̄^δ of rv’s X̄^δ_k = Xk 1 (|Xk| ≤ 1/δ) − δ, for k ≥ 1, and its associated random walk
S̄^δ = (S̄^δ_n : n ≥ 0) (i.e. S̄^δ_0 = 0 and S̄^δ_n = X̄^δ_1 + ... + X̄^δ_n). The idea is first to develop
approximation (9) for the distribution of
\[
\bar{M}_\delta = \max_{n \ge 0} \bar{S}^{\delta}_n.
\]
Later, we will show that P (Mδ > x/δ) and P (M̄δ > x/δ) are suitably close.
Put φδ (θ) = E exp (θX̄1) and set ψδ (θ) = log φδ (θ). Note that ψ′δ (0) = −δ +
o(δ^{α+2}); therefore, if δ is small enough, we can guarantee that there is a strictly
positive solution to the equation ψδ (θ^δ_∗) = 0. A similar argument to that given
previously to obtain (1) yields
\[
P\bigl( \bar{M}_\delta > x \bigr) = \exp\bigl( -\theta^{\delta}_{*} x \bigr)\, E^{*}_{\delta} \exp\bigl( -\theta^{\delta}_{*} R_\delta(x) \bigr), \tag{13}
\]
where τ (x) = inf{n ≥ 1 : S̄^δ_n > x}, Rδ (x) ≜ S̄^δ_{τ(x)} − x is the overshoot at level x, and
\[
P^{*}_{\delta}(A) = E\bigl( \exp\bigl( \theta^{\delta}_{*} \bar{S}^{\delta}_n \bigr)\, 1_A \bigr)
\]
for every set A ∈ σ (X̄^δ_1, ..., X̄^δ_n) (where σ (X̄^δ_1, ..., X̄^δ_n) is the sigma-field generated
by X̄^δ_1, ..., X̄^δ_n). Renewal theory applied at the strictly increasing ladder heights of
the random walk S̄^δ implies that
\[
E^{*}_{\delta} \exp\bigl( -\theta^{\delta}_{*} R_\delta(x) \bigr) \to E^{*}_{\delta} \exp\bigl( -\theta^{\delta}_{*} R_\delta(\infty) \bigr)
\]
as x → ∞, for fixed δ > 0. Here, we are interested in applying renewal theory
uniformly on δ ∈ (0, δ1). The next lemma (which is analogous to Lemma 5 of
Siegmund (1979)) provides the means for doing so.
Lemma 1 Let F be a family of distribution functions supported on [0, ∞). For each
F ∈ F, let EF (·) be the expectation operator associated to F ∈ F, and define
EF g (τ) ≜ ∫_{[0,∞)} g (t) F (dt) for each continuous and bounded function g : [0, ∞) → C.
Suppose that the family F is uniformly strongly non-lattice (i.e. the corresponding
characteristic functions χF (λ) = EF exp (iλτ) satisfy
\[
\inf_{F \in \mathcal{F}}\, \inf_{|\lambda| > \varepsilon} |1 - \chi_F(\lambda)| > 0). \tag{14}
\]
Then, UF (t) ≜ Σ_{n=0}^{∞} F^{∗n} (t) satisfies the following.
1. If sup_{F∈F} EF exp (ηX1) < ∞ for some η > 0, then
\[
\sup_{F \in \mathcal{F}} \left| U_F(t) - \frac{t}{E_F \tau} - \frac{E_F \tau^{2}}{2 E_F^{2} \tau} \right| = O\bigl( e^{-at} \bigr)
\]
as t → ∞ for some a > 0.
2. Moreover, if supF∈F EF τε+2 <∞ for ε ≥ 0, then,
supF∈F
¯UF (t)− t
EF τ− EF τ
2
2E2F τ− H
F2 (t)
EF τ 2δ−HF
1 ∗HF1 (t)
¯= o
¡tα+2 log (t)
¢as t→∞, where HF
1 (t) =R∞t(1− F (s)) ds /EF τ and HF
2 (t) =R∞tHF1 (s) ds.
Proof. See Theorem 1 in Chapter 4 of this dissertation.
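Part 1 of Lemma 1 can be sanity-checked on a single distribution for which everything is explicit. The sketch below is illustrative and assumes F is exponential with a given rate, a choice not made in the text; in that case F^{∗n} (t) is the probability that a Poisson variable with mean rate·t is at least n, and the linear approximation t/EF τ + EF τ^2/(2EF^2 τ) is exact. The function names are hypothetical.

```python
import math


def renewal_function_exponential(t: float, rate: float = 1.0,
                                 max_terms: int = 10_000) -> float:
    """U_F(t) = sum_{n >= 0} F^{*n}(t) for F = Exp(rate), summed term by term."""
    lam = rate * t
    pmf = math.exp(-lam)  # P(N = 0) for N ~ Poisson(lam)
    cdf = pmf             # P(N <= 0)
    total = 1.0           # F^{*0}(t) = 1 for t >= 0
    for n in range(1, max_terms):
        term = 1.0 - cdf  # F^{*n}(t) = P(N >= n) = 1 - P(N <= n - 1)
        if term < 1e-16:
            break
        total += term
        pmf *= lam / n    # now P(N = n)
        cdf += pmf        # now P(N <= n)
    return total


def linear_renewal_approximation(t: float, m1: float, m2: float) -> float:
    """t / m1 + m2 / (2 m1^2), with m1 and m2 the first two moments of F."""
    return t / m1 + m2 / (2.0 * m1 * m1)


if __name__ == "__main__":
    rate = 2.0
    m1, m2 = 1.0 / rate, 2.0 / rate ** 2  # moments of Exp(rate)
    for t in (0.5, 3.0, 10.0):
        print(t, renewal_function_exponential(t, rate),
              linear_renewal_approximation(t, m1, m2))
```

For other light tailed laws the two columns would differ by an exponentially small amount, which is the content of part 1; the exponential case is only a convenient check because the error vanishes identically.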
A crucial assumption that must be verified when applying the previous lemma is
the strongly non-lattice condition (14). A key result that we shall use to verify this
assumption repeatedly throughout the rest of this chapter is the so-called Wiener-
Hopf factorization, which we now state without proof (see Theorem 8.3.1 of Asmussen
(2003) for a proof of this classical result).
Lemma 2 (Wiener-Hopf) Suppose that Y = (Yj : j ≥ 1) is a sequence of iid rv’s
with characteristic function g (λ) ≜ E exp (iλY1). Define Sn ≜ Y1 + ... + Yn and
S0 ≜ 0. Put τ+ = inf{n ≥ 0 : Sn > 0} and set τ− = inf{n ≥ 1 : Sn ≤ 0}. Finally, let
g+ (λ) = E (exp (iλS_{τ+}); τ+ < ∞) and put g− (λ) = E (exp (iλS_{τ−}); τ− < ∞). Then,
\[
1 - g(\lambda) = (1 - g_+(\lambda))\,(1 - g_-(\lambda)).
\]
With Lemma 1 in hand, we now can provide a detailed asymptotic analysis of
E∗δ exp (−θ^δ_∗ Rδ (x/δ)) as δ ↘ 0 (for fixed x > 0), as the following proposition shows.
Proposition 1 There exists δ∗ > 0 and a function f1 : (0, ∞) → (0, ∞), such that
f1 (z) = o(z^{−(1+α)}) as z ↗ ∞ for which
\[
\sup_{0 \le \delta \le \delta^{*}} \bigl| P^{*}_{\delta}( R_\delta(x) > y ) - P^{*}_{\delta}( R_\delta(\infty) > y ) \bigr| \le y^{-1} f_1(y)\, x f_1(x) + f_1(x + y)
\]
for x, y > 0. Also, if x = O (1/∆) we have
\[
\bigl| E^{*}_{\delta} \exp( -\Delta R_\delta(x) ) - E^{*}_{\delta} \exp( -\Delta R_\delta(\infty) ) \bigr| \le o\bigl( \Delta^{1+\alpha} \bigr)
\]
as ∆ ↘ 0 uniformly in δ ∈ (0, δ∗).
Proof. An analogous result was obtained by Chang (1992) when exponential
moments exist. Our argument here follows Chang’s argument, we provide the details
for completeness. Applying renewal theory at strictly increasing ladder heights we
have that
P ∗δ¡Rδ (x) > y
¢=
Z[0,x)
P ∗δ³Sδ
τ+> x+ y − t
´U∗δ (dt) ,
where τ+ = inf{n ≥ 0 : Sδ
n > 0} is the first strictly increasing ladder epoch, Sδ
τ+ is the
first strictly increasing ladder height, and U∗δ is the corresponding renewal measure
generated by the strictly increasing ladder heights under the probability measure P ∗δ .
We also know from renewal theory (for fixed δ > 0) that
P ∗δ¡R (∞) > y¢ =
1
E∗δSδ
τ+
Z ∞
y
P ∗δ³Sδ
τ+> t´dt
=1
E∗δSδ
τ+
Z x
−∞P ∗δ³Sδ
τ+> x+ y − t
´dt.
Thus,
P ∗δ¡Rδ (x) > y
¢− P ∗δ ¡Rδ (∞) > y¢
=
Z x
0
P ∗δ³Sδ
τ+> x+ y − t
´ÃU∗δ (dt)−
dt
E∗δSδ
τ+
!(15)
+1
E∗δSδ
τ+
Z 0
−∞P ∗δ³Sδ
τ+> x+ y − t
´dt. (16)
Note that (15) can be written asZ x
0
P ∗δ³Sδ
τ+> x+ y − t
´ε∗δ (dt) ,
where
ε∗δ (t) = U∗δ (t)−
t
E∗δSδ
τ+
−E∗δ³Sδ
τ+
´22E∗δS
δ
τ+
.
Using properties of the convolution and the change of variable u = x/t we obtainZ x
0
P ∗δ³Sδ
τ+> x+ y − t
´ε∗δ (dt)
= −Z x
0
ε∗δ (x− t)P ∗δ³Sδ
τ+∈ y + dt
´= −
Z 1
0
ε∗δ (x− xt)P ∗δ³Sδ
τ+∈ y + xdt
´. (17)
Now, since E¡|X1|3+α¢ < ∞, 0 ≤ Sδ
τ+ ≤ 1/δ and θδ∗ ∼ 2δ, we can guarantee thatthere exists δ1 > 0 such that
sup0≤δ≤δ1
E∗δ
µ³Sδ
τ+
´2+α¶= sup
0≤δ≤δ1E
µ³Sδ
τ+
´2+αexp
³θδ∗S
δ
τ+
´¶≤ M sup
0≤δ≤δ1E
µ³Sδ
τ+
´2+α¶<∞.
Let us verify that the laws P ∗δ³Sδ
τ+ ∈ ds´are uniformly strongly non-lattice. First,
it is almost immediate to see that the laws P ∗δ¡X1 ∈ ds
¢are uniformly strongly non-
lattice. From Lemma 2 we have that
1
2
¯1− e−δiλE∗δ exp
¡iλX1
¢¯ ≤ ¯1− e−δiλE∗δ exp³iλSδ
τ+
´¯.
The uniform strongly non-lattice assumption can be easily verified from the previous
inequality. Consequently, we can apply Lemma 1 to conclude, from (17), that¯Z x
0
P ∗δ³Sδ
τ+> x+ y − t
´ε∗δ (dt)
¯≤
Z 1
0
|ε∗δ (x− xt)|P ∗δ³Sδ
τ+∈ y + xdt
´≤
Z 1/2
0
|ε∗δ (x− xt)|P ∗δ³Sδ
τ+∈ y + xdt
´+Z 1
1/2
|ε∗δ (x− xt)|P ∗δ³Sδ
τ+∈ y + xdt
´≤ o
¡x−α
¢P ∗δ³Sδ
τ+≥ y
´+ P ∗δ
³Sδ
τ+≥ y + x/2
´= o
¡x−α
¢o¡y−2−α
¢+ o
¡(y + x)−2−α
¢.
On the other hand, the term (16) equals
1
E∗δSδ
τ+
Z ∞
x+y
P ∗δ³Sδ
τ+> t´dt = o
¡(x+ y)−1−α
¢.
This yields the first part of this proposition. For the second part, note that
E∗δ exp¡−∆Rδ (x)
¢=
Z ∞
0
e−uP ∗δ¡Rδ (x) ≤ u/∆
¢du,
thus ¯E∗δ exp
¡−∆Rδ (x)¢−E∗δ exp ¡−∆Rδ (∞)
¢¯≤
Z ∞
0
e−u¡o¡u−2−α∆2+α
¢o¡x−α
¢+ o
¡∆1+α (x∆+ u)−1−α
¢¢du = o
¡∆1+α
¢as long as x = O (1/∆), this provides the second part of the statement.
We are almost ready to show that our stated approximation (9) is valid for the
truncated random walk Sδ. Let us just provide, a couple of elementary results de-
scribing the asymptotic behavior of θδ∗ and φδ
¡θδ∗¢as δ & 0.
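Proposition 2 Let
\[
\tilde{\phi}_\delta(\theta) \triangleq E\bigl( \exp( \theta X_1 1(|X_1| \le 1/\delta) ) \bigr).
\]
(Observe that φδ (θ) = exp (−θδ) φ̃δ (θ).) Then, for all θ ∈ [−Mδ, Mδ] with M > 0
\[
\left| \tilde{\phi}_\delta(\theta) - \sum_{1 \le j \le \alpha + 3} E X_1^{j} 1(|X_1| \le 1/\delta)\, \frac{\theta^{j}}{j!} \right| \le o\bigl( \delta^{\alpha + 2} \bigr).
\]
Furthermore,
\[
\left| \tilde{\phi}_\delta(\theta) - \sum_{1 \le j \le \alpha + 3} E X_1^{j}\, \frac{\theta^{j}}{j!} \right| \le o\bigl( \delta^{\alpha + 2} \bigr),
\]
for θ ∈ [−Mδ, Mδ].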
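Proof. The proof proceeds by expanding for fixed δ the function
\[
\tilde{\phi}_\delta(\theta) = \sum_{1 \le j \le \alpha + 2} E X_1^{j} 1(|X_1| \le 1/\delta)\, \frac{\theta^{j}}{j!} + E\bigl( X_1^{\lfloor \alpha + 3 \rfloor} \exp( \eta X_1 1(|X_1| \le 1/\delta) ) \bigr)\, \frac{\theta^{\lfloor \alpha + 3 \rfloor}}{\lfloor \alpha + 3 \rfloor !},
\]
where |η| ≤ |θ| ≤ Mδ. Hence,
\[
\left| E\bigl( X_1^{\lfloor \alpha + 3 \rfloor} \exp( \eta X_1 1(|X_1| \le 1/\delta) ) \bigr)\, \frac{\theta^{\lfloor \alpha + 3 \rfloor}}{\lfloor \alpha + 3 \rfloor !} \right| \le \frac{M \delta^{\lfloor \alpha + 3 \rfloor}}{\lfloor \alpha + 3 \rfloor !} \exp(M)\, E\bigl| X_1^{\alpha + 3} \bigr| = o\bigl( \delta^{\alpha + 2} \bigr).
\]
The fact that EX1^j 1 (|X1| ≤ 1/δ) − EX1^j = o(δ^{α+3−j}) can be easily checked; this
yields the conclusion of the proposition.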
As a consequence of the previous proposition we obtain the next corollary.
Corollary 1
\[
\left| \theta^{\delta}_{*} - \sum_{j \le \alpha + 2} \frac{\xi_j}{j!}\, \delta^{j} \right| \le o\bigl( \delta^{\alpha + 1} \bigr).
\]
The constants ξj are computed via the system of linear equations:
\[
\sum_{m=0}^{n} \binom{n}{m} \frac{\kappa_{m+2}}{m + 2}\, \xi_{n - m + 1} = 0; \qquad 0 \le n \le \alpha + 1,
\]
where κj is the jth cumulant of X1.
Proof. The interesting case arises when exponential moments fail to exist; in
that case log φ̃δ (1) > δ for all δ small. Therefore, by strict convexity of log φ̃δ (·), we
must have θ^δ_∗ ≤ δ. This implies that θ^δ_∗ is in the domain in which the expansion of
Proposition 2 is valid. The rest of the conclusion follows from the implicit function
theorem.
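Proposition 3 If $y(\delta) = O(\delta^{-b})$ for $b \le 1$, and $x > 0$, then
\[
P\bigl(\bar M_\delta > x/\delta\bigr) = \exp\bigl(-\theta^\delta_\alpha x/\delta + r_\alpha(\delta)\bigr) + o(\delta^\alpha).
\]
Proof. By Corollary 1 we have
\[
P\bigl(\bar M_\delta > x/\delta\bigr) = \exp\bigl(-\theta^\delta_* x/\delta\bigr)\, E^*_\delta \exp\bigl(-\theta^\delta_* R^\delta(x/\delta)\bigr).
\]
On the other hand, Proposition 1 asserts that
\[
\bigl|E^*_\delta \exp\bigl(-\theta^\delta_* R^\delta(x/\delta)\bigr) - E^*_\delta \exp\bigl(-\theta^\delta_* R^\delta(\infty)\bigr)\bigr| \le o\bigl(\delta^{\alpha+1}\bigr),
\]
as long as $\theta^\delta_* = O(\delta)$, which holds by virtue of Corollary 1. Now, observe that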
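\[
\log E^*_\delta \exp\bigl(-\theta^\delta_* R^\delta(\infty)\bigr)
= -\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\theta^\delta_*}{\bigl(\theta^\delta_* + i\lambda\bigr)\, i\lambda}\,
\log \frac{\tilde\varphi_\delta\bigl(\theta^\delta_*\bigr) - e^{-\delta i\lambda}\,\tilde\varphi_\delta\bigl(\theta^\delta_* + i\lambda\bigr)}{-i\bigl(\tilde\varphi'_\delta\bigl(\theta^\delta_*\bigr) - \delta\tilde\varphi_\delta\bigl(\theta^\delta_*\bigr)\bigr)\lambda}\,d\lambda.
\]
Since
\[
\bigl|g^{(m)}(i\theta) - \tilde\varphi^{(m)}_\delta(i\theta)\bigr| \le 2E\bigl(|X^m|\, 1(|X| > 1/\delta)\bigr) = o\bigl(\delta^{\alpha+3-m}\bigr),
\]
a routine dominated convergence argument (obtained with the aid of Proposition 8.44
of Breiman (1992)) yields
\[
\log E^*_\delta \exp\bigl(-\theta^\delta_* R^\delta(\infty)\bigr) - r_\alpha(\delta) = o(\delta^\alpha).
\]
The proposition is proved by combining these estimates.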
The next step is to show that $P(\bar M_\delta > x/\delta) - P(M_\delta > x/\delta) = o(\delta^\alpha)$. We will
do this by taking advantage of a geometric sum representation of the maximum of
random walks with negative drift. Specifically, let us write $\tau_+ = \tau(0)$ and $S^\delta_{\tau_+}$ (resp.
$\bar\tau_+ = \bar\tau(0)$ and $\bar S^\delta_{\bar\tau_+}$) to denote the first strictly ascending ladder epoch and first
strictly ascending ladder height of the random walk $S^\delta$ (resp. $\bar S^\delta$). It is well known
that
\[
M_\delta \overset{D}{=} Z \triangleq \sum_{j=1}^{G(p_\delta)} T_{j,\delta}, \tag{18}
\]
where $T^\delta = (T_{j,\delta} : j \ge 1)$ is a sequence of iid rv's with distribution function
given by $P(T_{1,\delta} \le t) = P(S^\delta_{\tau_+} \le t \mid \tau_+ < \infty)$ and $G(p_\delta)$ is geometrically distributed
with parameter $p_\delta = P(\tau_+ = \infty)$ (i.e. $P(G(p_\delta) = k) = p_\delta(1 - p_\delta)^k$ for $k \ge 0$). A
completely analogous representation is also valid for $\bar M_\delta$, namely
\[
\bar M_\delta \overset{D}{=} \bar Z \triangleq \sum_{j=1}^{G(\bar p_\delta)} \bar T_{j,\delta}, \tag{19}
\]
with an iid sequence $\bar T = (\bar T_{j,\delta} : j \ge 1)$ such that $P(\bar T_{1,\delta} \le t) = P(\bar S^\delta_{\bar\tau_+} \le t \mid \bar\tau_+ < \infty)$
and a parameter $\bar p_\delta = P(\bar\tau_+ = \infty)$ for the geometric rv $G$.
It is natural to expect that if the moments of $\bar M_\delta$ and $M_\delta$ are close, then their
corresponding distributions do not differ significantly. The next result (whose proof
is given at the end of the section) shows that the moments of $T_{1,\delta}$ and $\bar T_{1,\delta}$ are close
as $\delta \searrow 0$ (this implies, in view of representations (18) and (19), that the moments of
$\bar M_\delta$ provide good approximations, in some sense, for those of $M_\delta$).
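Theorem 3 For each $\varepsilon > 0$ small enough
\[
\bar p_\delta = p_\delta + o\bigl(\delta^{\alpha+1-\varepsilon}\bigr), \qquad
E\bigl(\bar T_{1,\delta}\bigr) = E(T_{1,\delta}) + o\bigl(\delta^{\alpha-\varepsilon}\bigr).
\]
Moreover, for $2 \le j \le \alpha+2$
\[
E\bigl(\bar T^{\,j}_{1,\delta}\bigr) = E\bigl(T^{\,j}_{1,\delta}\bigr) + o\bigl(\delta^{\alpha+2-j-\varepsilon}\bigr).
\]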
Proof. Given at the end of the section.
Theorem 1 is just an immediate consequence of the next final proposition.
Proposition 4
\[
P\bigl(\bar M_\delta > x/\delta\bigr) - P(M_\delta > x/\delta) = o(\delta^\alpha).
\]
Proof. Applying Theorem 1.1 of Kalashnikov (1997) (see also Proposition 1 in
the fourth chapter of this dissertation for a somewhat shorter argument) we obtain
\[
P(M > x/p) = qEq^{N(x/p)}. \tag{20}
\]
An argument similar to the one given in the proof of Proposition 1 (by means
of the Wiener-Hopf factorization) can be used to easily verify the uniform strong
non-latticity (for $\delta > 0$ sufficiently small) of the distributions of both $T_{1,\delta}$ and $\bar T_{1,\delta}$. In
addition, note that both $T_{1,\delta}$ and $\bar T_{1,\delta}$ have uniformly (in $\delta > 0$) bounded moments of
order $\lfloor\alpha+2\rfloor$. Using renewal theory (in its uniform version, as in Lemma 1) we shall
obtain, in Theorem 3 of this dissertation's fourth chapter, asymptotic expansions (as
$p \searrow 0$) for $P(Z > x/p)$, which, combined with (20), allow writing
\[
P(M_\delta > x/\delta) = \exp\bigl(a_\delta(p_\delta)x/\delta + b_\delta(p_\delta)\bigr) + o(\delta^\alpha),
\]
where $a_\delta(p_\delta)$ and $b_\delta(p_\delta)$ satisfy
\[
a_\delta(p_\delta) = \sum_{k \le \alpha} \frac{a^{(k+1)}_\delta(0)}{(k+1)!}\,p_\delta^{k+1}, \tag{21}
\]
\[
b_\delta(p_\delta) = \sum_{k \le \alpha} \frac{b^{(k+1)}_\delta(0)}{(k+1)!}\,p_\delta^{k+1}, \tag{22}
\]
with $a^{(m)}_\delta(0)$ and $b^{(m)}_\delta(0)$ depending algebraically on the first $m$ and $m+1$ moments
respectively of $T_{1,\delta}$ (see Theorem 4 in the fourth chapter). Similarly,
\[
P\bigl(\bar M_\delta > x/\delta\bigr) = \exp\bigl(\bar a_\delta(\bar p_\delta)x/\delta + \bar b_\delta(\bar p_\delta)\bigr) + o(\delta^\alpha),
\]
where $\bar a_\delta(\bar p_\delta)$ and $\bar b_\delta(\bar p_\delta)$ have representations analogous to (21) and (22) above. This
implies, by virtue of Theorem 3, that
\[
\delta^{-1}\bigl(\bar a_\delta(\bar p_\delta) - a_\delta(p_\delta)\bigr) = o\bigl(\delta^{\alpha-\varepsilon}\bigr) = \bar b_\delta(\bar p_\delta) - b_\delta(p_\delta),
\]
which in turn implies the statement of the proposition.
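Proof of Proposition 3. We first estimate $|\bar p_\delta - p_\delta|$. Recall that
\[
p_\delta = P\bigl(\tau^\delta_+ = \infty\bigr) = \sum_{n=1}^{\infty} \frac{1}{n} P\bigl(S^\delta_n > 0\bigr) = \sum_{n=1}^{\infty} \frac{1}{n} P\Bigl(\frac{S_n}{n} > \delta\Bigr),
\]
similarly
\[
\bar p_\delta = \sum_{n=1}^{\infty} \frac{1}{n} P\Bigl(\frac{\bar S_n}{n} > \delta\Bigr).
\]
Thus,
\[
|\bar p_\delta - p_\delta| \le \sum_{n=1}^{\infty} \frac{1}{n} E\Bigl|1\Bigl(\frac{\bar S_n}{n} > \delta\Bigr) - 1\Bigl(\frac{S_n}{n} > \delta\Bigr)\Bigr|
= \sum_{n=1}^{\infty} \frac{1}{n} P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr)
+ \sum_{n=1}^{\infty} \frac{1}{n} P\Bigl(\frac{S_n}{n} > \delta;\ \frac{\bar S_n}{n} \le \delta\Bigr).
\]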
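Now, fix $\varepsilon > 0$ small and write
\[
\sum_{n=1}^{\infty} \frac{1}{n} P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr)
= \sum_{n \le 1/\delta^{2+\varepsilon}} \frac{1}{n} P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr)
+ \sum_{n > 1/\delta^{2+\varepsilon}} \frac{1}{n} P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr).
\]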
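Observe that
\[
P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr) \le P\Bigl(\max_{k=1}^{n} |X_k| > 1/\delta\Bigr) = 1 - \bigl(1 - \bar F(1/\delta)\bigr)^n,
\]
where $\bar F(x) = P(X > x)$. Since $E(|X|^{3+\alpha}) < \infty$ we have that $\bar F(1/\delta) = o(\delta^{3+\alpha})$.
Thus, we can write
\[
P\Bigl(\frac{\bar S_n}{n} > \delta;\ \frac{S_n}{n} \le \delta\Bigr) \le 1 - \bigl(1 - o\bigl(\delta^{3+\alpha}\bigr)\bigr)^n.
\]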
However,Xn≤1/δ2+ε
1
nP
µSnn> δ;
Snn≤ δ
¶≤ M log
¡1/δ2+ε
¢ ³1− ¡1− o ¡δ3+α¢¢1/δ2+ε´
= M log¡1/δ2+β
¢ ³1− ¡1− δ2+εo
¡δ1+a−ε
¢¢1/δ2+ε´= o
¡δ1+α−ε0
¢for ε0 > ε > 0 small enough. Now, put eψδ (θ) = log eφδ (θ) (recall that eφδ (θ) was
defined in Proposition 2) and use Chernoff’s bound to obtain
P
µSnn> δ;
Snn≤ δ
¶≤ P
µSnn> δ
¶≤ exp
³−n
³δθδ − eψδ
¡θδ¢´´
,
where θδ satisfies the equation eψ0δ ¡θδ¢ = δ (which, can be easily seen to have a
solution for δ > 0 small enough). Hence,Xn>1/δ2+ε
1
nP
µSnn> δ;
Snn≤ δ
¶≤
Xn>1/δ2+ε
1
nexp
³−n
³δθδ − eψδ
¡θδ¢´´
≤exp
³− ¥1/δ2+ε¦ ³δθδ − eψδ
¡θδ¢´´
1− exp³−³δθδ − eψδ
¡θδ¢´´
= o (exp (−r/δε))
for r > 0 (since³δθδ − eψδ
¡θδ¢´ ∼ δ2/2), the previous term is obviously of order
o¡δ1+α+ε0
¢. For the term
Xn>1/δ2+ε
1
nP
µSnn> δ;
Snn≤ δ
¶≤
Xn>1/δ2+ε
1
nP
µSnn> δ
¶,
we first note that (since E |X1|3+α <∞)
P (X1 > x) ≤ P (|X1| > x) ≤ C (1 + x)−(α+3) , V (x)
for some constant C > 0. Corollary 4.2 of Borovkov (2000) implies that
supx≥t√(α+1)n logn
P (maxk≤n Sn > x)nV (x)
≤ 1 + h (t) ,
where h (t)→ 0 as t%∞. In our case n > 1/δ2+ε, thereforex = nδ ≥ δ−(1+ε) ≥ t1δ−(1+5ε/6)
≥ tq(α+ 1)
¡1/δ2+ε
¢log¡1/δ2+ε
¢ ≥ tp(α+ 1)n log n,for large enough (but fixed) constants t1, t > 0. We therefore conclude that there
exists t2 > 0 such thatXn>1/δ2+ε
1
nP
µSnn> δ
¶≤ t2
Xn>1/δ2+ε
V (nδ) = o¡δα+1
¢(is analogous. This gives the estimate pδ = pδ + o
¡δα+1−ε
¢.
Now let us write µ+ (j, δ) = ETj1,δ and µ+ (j, δ) = ET
j
1,δ for j ≤ bα+ 2c. Similarly,we use the symbol µ− (j, δ) (resp. µ− (j, δ)) to denote the jth moment of the first
weakly descending ladder height of the random walk Sδ (resp. the jth moment
of the first weakly descending ladder height of Sδ). Finally, let µj = E (X1 − δ)j
and µj = E³X
δ
1
´j. The Wiener-Hopf factorization (Lemma 2) then asserts that
µ1 = pµ−,1 (and that µ1 = pµ−,1), that is
µ− (j, δ)− µ− (j, δ) =µ1
p+ o¡δα+1−ε
¢ − µ1p
=µ1 − µ1
p+ o¡δα+1−ε
¢ − µ1Ã
1
p+ o¡δα+1−ε
¢ − 1p
!
= o¡δα+1
¢− δ
Ão¡δα+1−ε
¢¡p+ o
¡δα+1−ε
¢¢p
!= o
¡δα−ε
¢. (23)
Also from the Wiener-Hopf factorization we obtain
µ+ (1, δ) =pµ− (2, δ)− µ22µ− (1, δ)
. (24)
Therefore, in order to continue, we need to estimate the difference between $\bar\mu_-(j,\delta)$
and $\mu_-(j,\delta)$. These differences will be estimated via Fourier methods.
Since we are assuming that the increments are strongly non-lattice, we can use the following identity
log
Ã1−E ¡exp ¡∆Sδ
τ−¢¢
−E ¡∆Sδτ−¢ !
=1
2π
Z ∞
−∞
∆¡∆2 + λ2
¢ Re logµ1− eδiθg (−λ)−iδλ¶dλ (25)
− 12π
Z ∞
−∞
∆2¡∆2 + λ2
¢λIm log
µ1− eδiθg (−λ)−iδλ
¶dλ,
for $\Delta > 0$. This identity is almost the same as the one derived via Corollary 8.45
and Theorem 8.51 of Siegmund (1985), which is obtained for the strictly ascending
ladder height; however, a straightforward adaptation of Siegmund's argument shows
that the result also holds for the descending ladder height as displayed in (25). An
expansion of the left hand side of (25) in powers of $\Delta$ (up to order $\lfloor\alpha+2\rfloor$) generates
a sequence of coefficients $c_j(\delta)$. Note that the ratios $\mu_-(k,\delta)/\mu_-(1,\delta)$ (for $k \le \lfloor\alpha+2\rfloor$)
can be recovered from the coefficients $c_j(\delta)$, for $j \le k$, by solving a system of
equations (in fact, $(-1)^j c_j(\delta)$ is the $j$th order cumulant of the limiting overshoot of
the random walk $-S^\delta$, and $\mu_-(j+1,\delta)/\mu_-(1,\delta)$ is proportional to its $j$th moment).
Hence, we can compute the magnitude of the error between $\bar\mu_-(j+1,\delta)/\bar\mu_-(1,\delta)$
and $\mu_-(j+1,\delta)/\mu_-(1,\delta)$ by estimating $\bar c_j(\delta) - c_j(\delta)$. Consequently, it suffices to
study the coefficients in the asymptotic expansion (in powers of $\Delta > 0$) of
log
ESδ
τ−¡1−E exp ¡∆Sδ
τ−¢¢
ESδτ−³1−E exp
³∆S
δ
τ−´´
=1
2π
Z ∞
−∞
∆¡∆2 + λ2
¢ Re logáδ + o ¡δα+2¢¢ ¡1− eiδλg (−λ)¢δ (1− eiδθegδ (−λ))
!dλ (26)
− 12π
Z ∞
−∞
∆2¡∆2 + λ2
¢λIm log
áδ + o
¡δα+2
¢¢ ¡1− eiδλg (−λ)¢
δ (1− eiδθegδ (−λ))!dλ, (27)
where egδ (λ) = E exp (iλX11 (|X1| ≤ 1/δ)). The expansion of the integrals (26) and(27) can be easily obtained using Proposition 2 in the second chapter of this disser-
tation. For instance,
c1 − c1 =1
2
õ− (2, δ)µ− (1, δ)
− µ− (2, δ)µ− (1, δ) + o
¡δα−ε
¢! (using the LHS of (25))
=1
4
õ2 + o
¡δα+2
¢δ + o
¡δα+3
¢ − µ2δ
!(expanding (26) and (27)) (28)
1
2π
Z ∞
−∞
1
λ2Re log
áδ + o
¡δk−1
¢¢ ¡1− eiδλg (−λ)¢
δ (1− eiδθegδ (−λ))!dθ. (29)
Since |g (−λ)− egδ (−λ)| ≤ o ¡δα+3¢, the term (29) is smaller than the term (28), and
it is straightforward to verify that
µ2 + o¡δα+1
¢δ + o
¡δα+2
¢ − µ2δ= o (δα) ;
which implies that
µ− (2, δ)− µ− (2, δ) = o¡δα−ε
¢.
We can continue in this fashion; for example, we observe that the error term in the dif-
ference between c2 (δ) and c2 (δ) is determined by that of the difference between µ3/µ1and µ3/µ1 (because now the coefficient of ∆
2 in the expansion of (26) involves µ3/δ−µ3/
¡δ + o
¡δα+2
¢¢and that of (27) is an integral involving µ2/δ−µ2/
¡δ + o
¡δα+2
¢¢).
So, we obtain that µ− (3, δ) − µ− (3, δ) = o¡δα−1
¢. Similarly, for the difference be-
tween µ− (n, δ) /µ− (1, δ) and µ− (n, δ) /µ− (1, δ) for n > 3, we observe that we must
look at the difference between µn/δ and µn/¡δ + o
¡δα+2
¢¢which yields that
µ− (n, δ)− µ− (n, δ) = o¡δα+2−n
¢.
Furthermore, using the Wiener-Hopf factorization we see that the error in $\bar\mu_+(n,\delta) - \mu_+(n,\delta)$
is determined by that of $\bar\mu_-(n,\delta) - \mu_-(n,\delta)$ which, in particular, implies
the statement of the proposition.
Chapter 4
Asymptotic Expansions for
Geometric Sums with Applications
to Defective Renewal Equations
Consider a sequence $X = (X_k : k \ge 1)$ of non-negative independent and identically
distributed (iid) random variables (rv's). Suppose that $X_1$ is strongly non-lattice in
the sense that its characteristic function, $g(\lambda) = E\exp(i\lambda X_1)$, satisfies that for every
$\varepsilon > 0$
\[
\inf_{|\lambda| > \varepsilon} |1 - g(\lambda)| > 0 \tag{1}
\]
or, equivalently, that $\limsup_{|\lambda| \to \infty} |g(\lambda)| < 1$ (see Siegmund (1985) p. 176).
Let $M$ be a geometrically distributed random variable independent of $X$. That
is,
\[
P(M = k) = p(1-p)^k = pq^k; \quad k \ge 0.
\]
Our focus here is on the distribution of
\[
S_M \triangleq \sum_{k=1}^{M} X_k,
\]
CHAPTER 4. GEOMETRIC SUMS AND APPLICATIONS 67
($S_M \triangleq 0$ on $M = 0$) when the success probability $p$ of the geometric random variable
$M$ is small. The rv $S_M$ is called a geometric sum. Renyi's theorem for geometric sums
of random variables establishes that if $EX_1 < \infty$, then
\[
P(pS_M > x) = \exp(-x/E(X_1)) + o(1) \tag{2}
\]
as $p \searrow 0$. In this chapter, under the assumption of strongly non-lattice increments,
we develop additional order correction terms (in powers of $p$) to approximation (2)
(see equation (18) below). These types of expansions are similar in spirit to the
Edgeworth expansions for the central limit theorem. As in Edgeworth expansions,
the existence of moments of a certain order has to be imposed in order to provide the $n$th
order correction term. See, for example, Theorem 3.
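Approximation (2) can be checked numerically in a case where the tail of the geometric sum is computable. The sketch below is only an illustration under an assumed increment law that is not taken from the text: $X_1$ is Erlang(2) with unit rate, so that $EX_1 = 2$ and $S_k$ is Gamma$(2k,1)$, which gives $P(S_k > t) = P(\mathrm{Poisson}(t) \le 2k-1)$. The function names and the truncation level `k_max` are hypothetical choices.

```python
import math

def poisson_cdf_table(t, n_max):
    """Return [P(Poisson(t) <= j) for j = 0..n_max], built iteratively."""
    pmf = math.exp(-t)
    total = 0.0
    cdf = []
    for j in range(n_max + 1):
        if j > 0:
            pmf *= t / j
        total += pmf
        cdf.append(min(total, 1.0))
    return cdf

def geometric_sum_tail(t, p, k_max=4000):
    """P(S_M > t) when P(M = k) = p q^k and X_1 is Erlang(2, rate 1) (assumed law).

    S_k is Gamma(2k, 1), hence P(S_k > t) = P(Poisson(t) <= 2k - 1).
    The series over k is truncated at k_max.
    """
    q = 1.0 - p
    cdf = poisson_cdf_table(t, 2 * k_max)
    return sum(p * q**k * cdf[2 * k - 1] for k in range(1, k_max + 1))

def renyi_error(x, p, mean=2.0):
    """Absolute error of exp(-x / EX_1) as an approximation to P(p S_M > x)."""
    return abs(geometric_sum_tail(x / p, p) - math.exp(-x / mean))

for p in (0.1, 0.05, 0.01):
    print(p, renyi_error(1.0, p))
```

In this example the error shrinks roughly in proportion to $p$, which is the kind of first-order discrepancy that the correction terms developed in this chapter are meant to remove.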
The rv SM is utilized in many applied probability settings. For example, in
queueing theory, it is well known (by appealing to the ascending ladder heights rep-
resentation for the maximum of random walk) that the steady-state waiting time
distribution of the standard single server queue can be represented as a geometric
sum with non-negative increments (c.f. Asmussen (1987) or Kalashnikov (1997), Sec-
tion 1.3.3). In insurance risk theory, the ruin probability in the renewal model can
also be expressed as a tail probability of a geometric sum with non-negative incre-
ments (see Asmussen (2001) or Kalashnikov (1997), Section 1.3.4). Finally, in the
context of reliability models, the first break-down time of a system that consists of
an operating element, N − 1 unloaded redundant elements and M identical repair
units, can also be expressed as a geometric sum such as SM (refer also to Kalashnikov
(1997), Section 1.3.5). Other applications include program debugging and the total
reward until visiting a rare set in a Markov setting. (See the book on geometric sums
by Kalashnikov (1997) for additional details.)
The setting in which the success probability p is close to zero arises often in appli-
cations. For instance, in the queueing example mentioned in the previous paragraph,
this setting corresponds to the so-called heavy traffic regime in which the server uti-
lization is close to 100%; in the risk insurance context, p close to zero describes the
setting in which the security margin, included in the risk premium received by the
insurance company, is close to zero. Finally, in the reliability example, p close to zero
reflects a setting with a low break-down rate. In several of the examples above, the
distribution of the increments Xk depends on p as well. We must therefore develop
a theory that can handle this dependence. As an important application of the results
developed in this chapter (in particular Theorem 4) recall our results in Chapter 3
on high accuracy approximations for the maximum of random walk with heavy-tailed
increments. These approximations, as we have pointed out repeatedly, are very useful
in some of the applied settings mentioned at the beginning of our discussion (e.g. the
steady-state distribution of the single server queue and the ruin probability in the
insurance context).
In addition to the applications described above, there exists a close connection
between so-called defective renewal equations and geometric sums. Indeed, if $a(\cdot)$
satisfies the defective renewal equation
\[
a(t) = b(t) + q\int_{[0,t)} a(t-s)\,P(X_1 \in ds), \tag{3}
\]
then, in great generality (see Lin and Willmot (2000) p. 152), it follows that
\[
a(t) = \frac{1}{p}\int_{[0,t)} b(t-s)\,P(S_M \in ds). \tag{4}
\]
Equation (4) makes the connection clear between solutions of defective renewal equa-
tions (such as (3)) and the distribution of geometric sums. It turns out that defective
renewal equations such as (3) play an important role in a number of applied prob-
ability settings. A prominent example is insurance risk theory; in particular, the
so-called “expected discounted penalty” at ruin (from which many quantities of in-
terest, including the ruin probability, can be recovered by judicious choices of the
discount rate and the penalty) can be expressed in terms of a defective renewal equation
(see Lin and Willmot (2000) p. 162). Many other examples in which defective
renewal equations play an important role are also described in Feller (1968) p. 188,
216, Resnick (1992) p. 158, and Lin and Willmot (2000) Ch. 9; these examples include
Geiger counters, generalized terminating renewal processes, and age dependent
branching processes. The setting in which $q$ is close to one in (3) (or, equivalently, $p$
is close to zero) is common in the application settings described before. For instance,
in the insurance setting it arises in environments of low net profits (which occur in
competitive conditions). For generalized terminating renewal processes, q close to one
corresponds to settings in which the process continues for long periods, and in age
dependent branching processes, q close to one reflects a case in which the population
is less likely to die. This, consequently, motivates developing asymptotics for the
solution a (·) of (3) as p& 0.
In Section 2, we develop the asymptotic expansion in powers of p for P (pSM > x)
(see Theorems 2, 3, and 4). The implications for asymptotic expansions of defective
renewal equations are studied in Section 3 (see Theorem 5).
4.1 Asymptotics for Geometric Sums
We first start with a useful representation for the tail probability of a geometric sum.
Set $S_n = X_1 + \dots + X_n$ (with $S_0 = 0$) and put $N(t) = \sup\{n \ge 0 : S_n \le t\}$. Observe
that, for each non-negative integer $m$, $\{S_m > x\} = \{N(x) < m\}$. Thus, combining
the independence between $X$ and $M$ with the fact that $P(M > m) = q^{m+1}$, we can
write
\[
P(S_M > x) = P(N(x) < M) = E\bigl(P(N(x) < M \mid X)\bigr) = qE\bigl(q^{N(x)}\bigr).
\]
We therefore have shown the next proposition.
Proposition 1
\[
P(S_M > x) = qEq^{N(x)}.
\]
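As a sanity check of Proposition 1, both sides of the identity can be evaluated independently in a special case. The sketch below assumes (this is not in the text) that $X_1$ is exponential with unit rate, so that $S_k$ is Gamma$(k,1)$ and $N(x)$ is Poisson with mean $x$; in that case both sides reduce to $q e^{-px}$. The function names and truncation levels are hypothetical.

```python
import math

def poisson_pmf_table(x, n_max):
    """Return [P(Poisson(x) = n) for n = 0..n_max]."""
    pmf = [math.exp(-x)]
    for n in range(1, n_max + 1):
        pmf.append(pmf[-1] * x / n)
    return pmf

def tail_by_conditioning_on_m(x, p, k_max=2000):
    """Left side: P(S_M > x) = sum_k p q^k P(S_k > x), exponential(1) increments."""
    q = 1.0 - p
    pmf = poisson_pmf_table(x, k_max)
    total, cdf = 0.0, 0.0
    for k in range(1, k_max + 1):
        cdf += pmf[k - 1]            # cdf = P(Poisson(x) <= k - 1) = P(S_k > x)
        total += p * q**k * cdf
    return total

def tail_by_renewal_count(x, p, n_max=2000):
    """Right side: q * E[q^N(x)], with N(x) Poisson(x) for exponential(1) increments."""
    q = 1.0 - p
    pmf = poisson_pmf_table(x, n_max)
    return q * sum(q**n * pmf[n] for n in range(n_max + 1))

x, p = 3.0, 0.2
print(tail_by_conditioning_on_m(x, p), tail_by_renewal_count(x, p))
```

The two functions use different decompositions (conditioning on $M$ versus on the renewal count), so their agreement is a genuine check of the identity rather than a restatement of it.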
The previous proposition implies that in order to study $P(pS_M > x)$ it suffices
to study the behavior of $Eq^{N(x/p)}$ for $x > 0$ and small $p > 0$. A renewal theoretic
argument yields the following (defective) renewal equation
\[
Eq^{N(t)} = P(X_1 > t) + q\int_{[0,t)} Eq^{N(t-s)}\,P(X_1 \in ds). \tag{5}
\]
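Equation (5) can also be solved numerically on a grid, which gives a way to see the defective renewal structure at work. The sketch below is a first-order discretization chosen for simplicity, not a method from the text; it assumes the increment distribution has no atom at zero. It is tested on exponential(1) increments (an assumed example), for which $N(t)$ is Poisson$(t)$ and the exact solution is $Eq^{N(t)} = e^{-pt}$.

```python
import math

def solve_defective_renewal(b, cdf, q, t_max, h):
    """Approximate a(t) = b(t) + q * int_[0,t) a(t - s) F(ds) on the grid 0, h, 2h, ...

    On each cell ((j-1)h, jh] the unknown a(t - s) is frozen at a(t - jh),
    which makes the scheme explicit and first-order accurate in h.
    `cdf` is the distribution function F of the increments (assumed atomless at 0).
    """
    n = int(round(t_max / h))
    dF = [cdf(j * h) - cdf((j - 1) * h) for j in range(1, n + 1)]
    a = [b(0.0)]
    for i in range(1, n + 1):
        conv = sum(a[i - j] * dF[j - 1] for j in range(1, i + 1))
        a.append(b(i * h) + q * conv)
    return a

p = 0.1
grid_solution = solve_defective_renewal(
    b=lambda t: math.exp(-t),           # P(X_1 > t) for exponential(1) increments
    cdf=lambda t: 1.0 - math.exp(-t),   # F(t) for exponential(1) increments
    q=1.0 - p,
    t_max=5.0,
    h=0.005,
)
print(grid_solution[-1], math.exp(-p * 5.0))
```

Because $q < 1$, discretization errors are damped geometrically rather than accumulated, so even this crude scheme stays within about $qh$ of the exact solution in this example.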
Note that, if $p > 0$ is small enough and $E\exp(\eta X_1) < \infty$ for some $\eta > 0$, the equation
\[
E\exp\bigl(\hat\theta X_1\bigr) = 1/q \tag{6}
\]
has a unique solution $\hat\theta > 0$. Therefore, (5) can be transformed into the non-defective
renewal equation
\[
e^{\hat\theta t}Eq^{N(t)} = e^{\hat\theta t}P(X_1 > t) + \int_{[0,t)} e^{\hat\theta(t-s)}Eq^{N(t-s)}\,F_{\hat\theta}(ds),
\]
where $F_{\hat\theta}(ds) = qe^{\hat\theta s}P(X_1 \in ds)$. Renewal theory then implies that
\[
e^{\hat\theta t}Eq^{N(t)} = \int_{[0,t)} e^{\hat\theta(t-s)}P(X_1 > t-s)\,U_{\hat\theta}(ds), \tag{7}
\]
where $U_{\hat\theta}(t) = E_{\hat\theta}(N(t) + 1)$, and under $P_{\hat\theta}$ the $X_i$'s are iid with distribution $F_{\hat\theta}$.
Using results by Stone (1965) it is not hard to verify that, for fixed but small $p > 0$,
\[
\Bigl|e^{\hat\theta t}Eq^{N(t)} - \frac{1}{E_{\hat\theta}X_1}\int_{[0,t)} e^{\hat\theta s}P(X_1 > s)\,ds\Bigr| \le K(p)\exp(-a(p)t). \tag{8}
\]
Since our situation involves sending simultaneously $t \nearrow \infty$ and $p \searrow 0$ we would like
the bound on the right hand side of (8) to hold uniformly in $p \in [0, \delta_1]$ for some
$\delta_1 > 0$. That is, we would like to show that we can find $a, K \in (0, \infty)$ such that
$\sup_{p \in [0,\delta_1]} K(p) \le K < \infty$ and $\inf_{p \in [0,\delta_1]} a(p) \ge a$. The following theorem provides
means to obtain these uniform estimates.
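Theorem 1 Let $\mathcal F$ be a family of distribution functions supported on $[0, \infty)$. For
each $F \in \mathcal F$, let $E_F(\cdot)$ be the expectation operator associated to $F \in \mathcal F$, and define
$E_F g(\tau) \triangleq \int_{[0,\infty)} g(t)\,F(dt)$ for each continuous and bounded function $g : [0, \infty) \to \mathbb C$.
Suppose that the family $\mathcal F$ is uniformly strongly non-lattice (i.e. the corresponding
characteristic functions $\chi_F(\lambda) = E_F\exp(i\lambda\tau)$ satisfy
\[
\inf_{F \in \mathcal F}\ \inf_{|\lambda| > \varepsilon} |1 - \chi_F(\lambda)| > 0 \tag{9}
\]
for every $\varepsilon > 0$). Then, $U_F(t) \triangleq \sum_{n=0}^{\infty} F^{*n}(t)$ satisfies the following.
1. If $\sup_{F \in \mathcal F} E_F\exp(\eta X_1) < \infty$ for some $\eta > 0$, then
\[
\sup_{F \in \mathcal F}\Bigl|U_F(t) - \frac{t}{E_F\tau} - \frac{E_F\tau^2}{2E_F^2\tau}\Bigr| = O\bigl(e^{-at}\bigr)
\]
as $t \to \infty$ for some $a > 0$.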
2. Moreover, if supF∈F EF τε+2 <∞ for ε ≥ 0, then,
supF∈F
¯UF (t)− t
EF τ− EF τ
2
2E2F τ− H
F2 (t)
EF τ 2δ−HF
1 ∗HF1 (t)
¯= o
¡tα+2 log (t)
¢as t→∞, where HF
1 (t) =R∞t(1− F (s)) ds /EF τ and HF
2 (t) =R∞tHF1 (s) ds.
Proof. Part 1 is essentially Siegmund's (1979) lemma. Part 2 follows the same
steps as in Carlsson (1983); the key assumption is the uniform strongly non-lattice
condition (9). The Fourier inversion expressions provided by Carlsson (1983) are the
same for each fixed $F$. In the end, Carlsson's estimates of the error rate depend on
the application of a uniform version of the Riemann-Lebesgue lemma to his equation
(11), which can be obtained by following his argument in the presence of the strongly
non-lattice assumption imposed.
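Part 1 of Theorem 1 can be illustrated for a single distribution where the renewal function is available in closed form. The sketch below assumes (as an example not taken from the text) that $F$ is Erlang(2) with unit rate, so $E_F\tau = 2$, $E_F\tau^2 = 6$, and $F^{*n}(t) = P(\mathrm{Poisson}(t) \ge 2n)$. In this case $U_F(t) = t/2 + 3/4 + e^{-2t}/4$, so the remainder after the linear term is exactly exponentially small. Function names and the truncation level are hypothetical.

```python
import math

def renewal_function_erlang2(t, n_max=400):
    """U_F(t) = sum_{n>=0} F^{*n}(t) for F = Erlang(2, rate 1) (assumed example).

    F^{*n} is the Gamma(2n, 1) distribution function, and
    P(Gamma(2n, 1) <= t) = P(Poisson(t) >= 2n); the n = 0 term equals 1.
    """
    pmf = [math.exp(-t)]
    for j in range(1, 2 * n_max + 1):
        pmf.append(pmf[-1] * t / j)
    # survival[m] = P(Poisson(t) >= m), accumulated from the far tail inward
    survival = [0.0] * (2 * n_max + 2)
    for m in range(2 * n_max, -1, -1):
        survival[m] = survival[m + 1] + pmf[m]
    return 1.0 + sum(survival[2 * n] for n in range(1, n_max + 1))

def linear_approximation(t, mean=2.0, second_moment=6.0):
    """t / E[tau] + E[tau^2] / (2 E[tau]^2), the two leading terms in Theorem 1."""
    return t / mean + second_moment / (2.0 * mean**2)

for t in (1.0, 2.0, 5.0):
    print(t, renewal_function_erlang2(t) - linear_approximation(t))
```

The theorem itself is about uniformity of the $O(e^{-at})$ bound over a family $\mathcal F$; a single-distribution check like this one only illustrates the shape of the expansion.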
We are now ready to provide our asymptotic expansion for $P(pS_M > x)$ in the
presence of exponential moments.
Theorem 2 Suppose that $X_1$ has a strongly non-lattice distribution and that $\varphi(\eta) \triangleq E\exp(\eta X) < \infty$
for some $\eta > 0$. Then, for some $a > 0$, and as $x/p \to \infty$,
\[
P(pS_M > x) = \exp\bigl(-x\hat\theta/p + r(p)\bigr) + O(\exp(-ax/p)), \tag{10}
\]
where $\hat\theta$ solves (6) and
\[
\exp(r(p)) = \frac{p}{q\hat\theta\varphi'\bigl(\hat\theta\bigr)} \triangleq c(p). \tag{11}
\]
Moreover, both $\hat\theta$ and $r$ are real analytic functions of $p$ at the origin.
Proof. The argument preceding Theorem 1 led us to equation (7). We now
verify that the assumptions in Theorem 1 are satisfied. Let us define
$g_{\hat\theta}(\lambda) \triangleq E_{\hat\theta}\exp(i\lambda X_1) = qE\exp\bigl((i\lambda + \hat\theta)X_1\bigr)$. Using the implicit function theorem on (6)
it follows easily that $\hat\theta = p/EX_1 + O(p^2)$. As a consequence, the following inequality
can be easily derived for all $p > 0$ sufficiently small and some $M_1 \in (0, \infty)$:
\[
\bigl|g_{\hat\theta}(\lambda) - g(\lambda)\bigr| \le M_1 p.
\]
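Hence, we conclude that for each $\varepsilon > 0$ it is possible to pick $\delta > 0$ sufficiently small
so that
\[
\inf_{p \in [0,\delta]}\ \inf_{|\lambda| > \varepsilon} \bigl|1 - g_{\hat\theta}(\lambda)\bigr| \ge \inf_{p \in [0,\delta]}\ \inf_{|\lambda| > \varepsilon} |1 - g(\lambda)| - M_1\delta > 0.
\]
Finally, also because $\hat\theta = O(p)$, it is possible to pick $p > 0$ small enough so that
$E_{\hat\theta}\exp(\eta X_1) = qE\exp\bigl((\eta + \hat\theta)X_1\bigr) < \infty$ for some $\eta > 0$. We can now apply
Theorem 1 to equation (7) and obtain
\[
\Bigl|e^{\hat\theta x/p}Eq^{N(x/p)} - \frac{1}{E_{\hat\theta}X_1}\int_0^\infty e^{\hat\theta s}P(X_1 > s)\,ds\Bigr|
\le \frac{1}{E_{\hat\theta}X_1}\int_{x/p}^{\infty} e^{\hat\theta s}P(X_1 > s)\,ds \tag{12}
\]
\[
\qquad + \Bigl|\frac{1}{E_{\hat\theta}X_1}\int_{[0,x/p)} e^{\hat\theta(x/p-s)}P(X_1 > x/p - s)\,V(ds)\Bigr|, \tag{13}
\]
where $V(t)$ is a function that we are introducing here; it corresponds to the expression inside
the absolute value in part 1 of Theorem 1, and therefore $|V(t)| = O(e^{-at})$ for some $a > 0$. The integral
in (12) is easily seen to be bounded by $Ke^{-ax/p}$ for some finite constants $K, a > 0$
(assuming that $p > 0$ is sufficiently small). We just need to analyze the integral in
(13). Integration by parts yields
\[
\int_{[0,x/p)} e^{\hat\theta(x/p-s)}P(X_1 > x/p - s)\,V(ds)
= V(x/p)P(X_1 > 0) - e^{\hat\theta x/p}P(X_1 > x/p)V(0) \tag{14}
\]
\[
\qquad + \hat\theta e^{\hat\theta x/p}\int_{[0,x/p)} V(s)e^{-\hat\theta s}P(X_1 > x/p - s)\,ds \tag{15}
\]
\[
\qquad + e^{\hat\theta x/p}\int_{[0,x/p)} V(s)e^{-\hat\theta s}P(X_1 > x/p - ds). \tag{16}
\]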
The absolute value of (14) is also bounded by $Ke^{-ax/p}$ for some finite constants
$K, a > 0$. For the integral (15) observe that
\[
\begin{aligned}
\hat\theta e^{\hat\theta x/p}\Bigl|\int_{[0,x/p)} V(s)e^{-\hat\theta s}P(X_1 > x/p - s)\,ds\Bigr|
&\le \hat\theta e^{\hat\theta x/p}\int_{[0,x/2p)} |V(s)|e^{-\hat\theta s}P(X_1 > x/p - s)\,ds \\
&\quad + \hat\theta e^{\hat\theta x/p}\int_{[x/2p,x/p)} |V(s)|e^{-\hat\theta s}P(X_1 > x/p - s)\,ds \\
&\le \hat\theta e^{\hat\theta x/p}P(X_1 > x/(2p))M + \hat\theta e^{\hat\theta x/p}\int_{[x/2p,\infty)} |V(s)|\,ds.
\end{aligned}
\]
Since $\hat\theta = O(p)$ and $X_1$ has exponential moments, we conclude that the previous
expression is bounded by $Ke^{-ax/p}$ (for appropriate positive constants $K$ and $a$). The
treatment for integral (16) is very similar to that of (15). Thus, we conclude that
\[
e^{\hat\theta x/p}Eq^{N(x/p)} = \frac{\int_0^\infty e^{\hat\theta s}P(X_1 > s)\,ds}{E_{\hat\theta}X_1} + O(\exp(-rx/p)). \tag{17}
\]
In order to recover the required expression for $c(p)$, note that
\[
E_{\hat\theta}X_1 = q\int_{[0,\infty)} se^{\hat\theta s}P(X_1 \in ds) = q\varphi'\bigl(\hat\theta\bigr).
\]
On the other hand, using integration by parts and the definition of $\hat\theta$, we see that
\[
\int_0^\infty e^{\hat\theta s}P(X_1 > s)\,ds = \frac{\varphi\bigl(\hat\theta\bigr) - 1}{\hat\theta} = \frac{p}{q\hat\theta}.
\]
Substituting the last two identities into (17) and using Proposition 1 yields equation (10).
The analytic properties of $\hat\theta$ follow directly from the implicit function theorem. It is
easy to see that $r(\cdot)$ is well defined at zero (i.e. that the right hand side of (11) is
strictly positive when $p$ is close to zero). Moreover, it is almost immediate to verify
that $c(p)$ is real analytic at the origin with $c(0) = 1$. This implies the real analyticity
of $r$ and the conclusion of the theorem.
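The statement of Theorem 2 can be illustrated in a case where $\hat\theta$ and $c(p)$ are explicit. The sketch below assumes Erlang(2) increments with unit rate (an example not taken from the text), for which $\varphi(\theta) = (1-\theta)^{-2}$, so (6) gives $\hat\theta = 1 - \sqrt{q}$ and $\varphi'(\hat\theta) = 2q^{-3/2}$. The exact tail is computed from the series $\sum_k pq^k P(S_k > t)$ and compared with $c(p)e^{-\hat\theta t}$ from (10) and (11); the names and the truncation level are hypothetical.

```python
import math

def exact_tail(t, p, k_max=4000):
    """P(S_M > t) for Erlang(2, rate 1) increments, via P(S_k > t) = P(Poisson(t) <= 2k-1)."""
    q = 1.0 - p
    pmf = math.exp(-t)
    cdf = [pmf]
    for j in range(1, 2 * k_max):
        pmf *= t / j
        cdf.append(min(cdf[-1] + pmf, 1.0))
    return sum(p * q**k * cdf[2 * k - 1] for k in range(1, k_max + 1))

def exponential_approximation(t, p):
    """c(p) * exp(-theta_hat * t), with theta_hat solving (6) for the assumed Erlang(2) law."""
    q = 1.0 - p
    theta_hat = 1.0 - math.sqrt(q)               # root of (1 - theta)^(-2) = 1/q
    phi_prime = 2.0 / (1.0 - theta_hat) ** 3     # derivative of (1 - theta)^(-2)
    c = p / (q * theta_hat * phi_prime)          # exp(r(p)) as in (11)
    return c * math.exp(-theta_hat * t)

p = 0.05
for x in (0.1, 0.5, 1.0):
    t = x / p
    print(x, exact_tail(t, p) - exponential_approximation(t, p))
```

In this example the discrepancy equals a second exponential term with rate $1 + \sqrt{q}$, which is consistent with the $O(\exp(-ax/p))$ remainder in (10).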
Theorem 2 indicates that
\[
\hat\theta(p) = \sum_{k=1}^{\infty} \frac{\hat\theta^{(k)}(0)}{k!}\,p^k, \quad \text{and} \quad r(p) = \sum_{k=0}^{\infty} \frac{r^{(k)}(0)}{k!}\,p^k.
\]
For notational convenience let us write $\hat\theta^{(k)}(0)/k! = \gamma_k$ and $r^{(k)}(0)/k! = \xi_k$. We
know that $\hat\theta^{(1)}(0) = 1/EX_1$ and $r(0) = 0$; the rest of the $\gamma_k$'s and $\xi_k$'s can be easily
computed via the implicit function theorem. For instance, $2\gamma_2 = 1 - EX_1^2/(2E^2X_1)$
and $\xi_1 = 1 - \gamma_2EX_1 - EX_1^2/(2E^2X_1)$. For completeness we provide a set of recursive
equations to compute the $\gamma_k$'s and $\xi_k$'s.
Proposition 2 For $n \ge 1$ and each $k \le n$, the constants $(\gamma_k : 1 \le k \le n)$ can be
computed by solving recursively the following set of equations (note that the $k$th equation
is linear in $\gamma_k$ and it only depends on the $\gamma_j$'s for $j \le k$):
\[
\sum_{m=1}^{k} \frac{EX_1^m}{m!} \sum_{\{n_1 + \dots + n_m = k-m,\ n_1,\dots,n_m \ge 0\}}\ \prod_{j=1}^{m} \gamma_{n_j+1} = 1, \quad \text{for } 1 \le k \le n.
\]
Consequently, the constants $(\xi_k : 0 \le k \le n-1)$ can be obtained through a Taylor
expansion up to order $n$ of the function
\[
\tilde r_n(p) = \log\left(\frac{1}{q\,\sum_{k=1}^{n} \gamma_k p^{k-1}\ \sum_{m=0}^{n-1} \bigl(\sum_{k=1}^{n} \gamma_k p^k\bigr)^m EX_1^{m+1}/m!}\right)
\]
around $p = 0$. In particular, for $k \le n-1$, $\xi_k = \tilde r_n^{(k)}(0)/k!$.
Proof. The proof follows directly by applying the implicit function theorem. The
details are omitted.
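The recursion for the $\gamma_k$'s in Proposition 2 amounts to matching the coefficient of $p^k$ in $\varphi(\hat\theta(p)) - 1 = p/q = p + p^2 + \dots$, where the inner sum over $n_1 + \dots + n_m = k - m$ is the coefficient of $p^k$ in $\hat\theta(p)^m$. A minimal sketch of this computation in exact rational arithmetic follows; the function names are hypothetical, and the Erlang(2) test case ($EX_1^m = (m+1)!$, for which $\hat\theta(p) = 1 - \sqrt{1-p}$) is an assumed example rather than one from the text.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b, order):
    """Multiply two power series (coefficient lists), keeping terms up to p^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def gamma_coefficients(moments, n):
    """Return [gamma_1, ..., gamma_n] given moments[m-1] = E X_1^m for m = 1..n."""
    a = [Fraction(moments[m - 1]) / factorial(m) for m in range(1, n + 1)]
    theta = [Fraction(0)] * (n + 1)      # theta[j] holds gamma_j; theta[0] = 0
    for k in range(1, n + 1):
        # Coefficient of p^k in sum_{m>=2} a_m * theta(p)^m; it only involves
        # gamma_1..gamma_{k-1} because theta[k] is still zero at this point.
        power = theta[:]
        known = Fraction(0)
        for m in range(2, k + 1):
            power = poly_mul(power, theta, k)
            known += a[m - 1] * power[k]
        # The k-th equation is linear in gamma_k, with coefficient E X_1.
        theta[k] = (1 - known) / a[0]
    return theta[1:]

print(gamma_coefficients([2, 6, 24, 120], 4))   # Erlang(2, rate 1) moments
print(gamma_coefficients([1, 2, 6], 3))         # exponential(1) moments
```

The $\xi_k$'s could be obtained in the same style by expanding $\tilde r_n(p)$ as a power series, which is not done here.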
Consequently, the previous theorem provides the means to develop an algorithm
that can be implemented easily for computing an asymptotic expansion for the tail
probability $P(S_M > x/p)$ in powers of $p$.
Theorem 2 corrects Renyi's approximation (2) by providing a full asymptotic
expansion in powers of $p$ with an exponential error term. In other words, the last
theorem provides rigorous support for the parametric (in $p > 0$) approximation
\[
P(S_M > x/p) \approx \exp\Bigl(-x/EX_1 + \sum_{k=1}^{\infty} p^k\bigl(\xi_k - \gamma_{k+1}x\bigr)\Bigr), \tag{18}
\]
valid up to an error exponentially small as $p \searrow 0$. It is easy to see that $\gamma_k$ and $\xi_k$
depend on the first $k$ and $(k+1)$ order moments of $X_1$ respectively. This suggests
that, if $EX_1^{\alpha+2} < \infty$, say, the approximation
\[
P(S_M > x/p) \approx \exp\Bigl(-x/EX_1 + \sum_{k=1}^{\alpha} p^k\bigl(\xi_k - \gamma_{k+1}x\bigr)\Bigr) \tag{19}
\]
should be more accurate than (2). Providing rigorous support for approximation (19)
in the presence of heavy tails (we say here that a non-negative random variable $X_1$ is
heavy tailed if for every $\eta > 0$, $E\exp(\eta X_1) = \infty$) presents an additional mathematical
complication. Note that a crucial ingredient in the proof of Theorem 2 is the existence
of a root for equation (6). This indicates that the strategy followed in the proof of
Theorem 2 is infeasible in the heavy tailed case. Our idea is then to proceed via
truncation. Define the sequence $\bar X = (\bar X_k : k \ge 1)$ as $\bar X_k = X_k 1(X_k \le x/p)$ and
consider its associated random walk $\bar S = (\bar S_n : n \ge 0)$ (i.e. $\bar S_n = \bar X_1 + \dots + \bar X_n$ with
$\bar S_0 = 0$). We first argue that the distribution of $S_M$ is suitably close to that of $\bar S_M$.
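Lemma 1 Suppose that $EX_1^\beta < \infty$ for $\beta \ge 1$, then
\[
\bigl|P(pS_M > x) - P\bigl(p\bar S_M > x\bigr)\bigr| = o\Bigl(\frac{p^{\beta-1}}{x^\beta}\Bigr).
\]
Proof. Note that
\[
\begin{aligned}
\bigl|P(pS_M > x) - P\bigl(p\bar S_M > x\bigr)\bigr|
&\le p\sum_{k=0}^{\infty} q^k P\bigl(\bar S_k > x/p;\ S_k \le x/p\bigr) + p\sum_{k=0}^{\infty} q^k P\bigl(S_k > x/p;\ \bar S_k \le x/p\bigr) \\
&\le 2p\sum_{k=0}^{\infty} q^k P\Bigl(\max_{j \le k} X_j > x/p\Bigr) = 2p\sum_{k=0}^{\infty} q^k\Bigl(1 - \bigl(1 - o\bigl((p/x)^\beta\bigr)\bigr)^k\Bigr).
\end{aligned}
\]
On the other hand,
\[
\bigl(1 - o\bigl((p/x)^\beta\bigr)\bigr)^k = 1 - k\,o\bigl((p/x)^\beta\bigr) + \frac{k(k-1)}{2}(1 - \eta_k)^{k-2}\,o\bigl((p/x)^{2\beta}\bigr),
\]
where $|\eta_k| \le o\bigl((p/x)^\beta\bigr)$. Hence, we can write
\[
\begin{aligned}
\bigl|P(pS_M > x) - P\bigl(p\bar S_M > x\bigr)\bigr|
&\le 2p\sum_{k=0}^{\infty} q^k k\,o\bigl((p/x)^\beta\bigr) + 2p\sum_{k=0}^{\infty} q^k k^2\,o\bigl((p/x)^{2\beta}\bigr) \\
&= o\Bigl(\frac{p^{\beta-1}}{x^\beta}\Bigr) + o\Bigl(\frac{p^{2\beta-2}}{x^{2\beta}}\Bigr) = o\Bigl(\frac{p^{\beta-1}}{x^\beta}\Bigr).
\end{aligned}
\]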
We would now like to study $P(p\bar S_M > x)$ just as we did in Theorem 2. Theorem
1 can also be applied here to obtain a suitable approximation for $P(p\bar S_M > x)$, as
our next result shows.
Theorem 3 Assume that the distribution of $X_1$ is strongly non-lattice. Also, suppose
that
\[
EX_1^{2+\alpha} < \infty
\]
for $\alpha \ge 0$. Then,
\[
P(pS_M > x) = \exp\bigl(-x\hat\theta_\alpha/p + r_\alpha(p)\bigr) + o(p^\alpha)
\]
as $p \searrow 0$, where
\[
\hat\theta_\alpha = p/EX_1 + \sum_{k \le \alpha} \gamma_k p^k, \quad \text{and} \quad r_\alpha(p) = \sum_{k \le \alpha} \xi_k p^k
\]
and the $\gamma_k$'s and $\xi_k$'s are defined recursively via Proposition 2.
Proof. Let $\bar N(t) = \sup\{n \ge 0 : \bar S_n \le t\}$; then, by virtue of Proposition 1 and
Lemma 1, it suffices to compute $Eq^{\bar N(x/p)}$. Following similar steps as in the proof of
Theorem 2 we obtain
\[
Eq^{\bar N(x/p)} = \int_{[0,x/p)} e^{\bar\theta(x/p-s)}P(X_1 > x/p - s;\ X_1 \le x/p)\,U_{\bar\theta}(ds). \tag{20}
\]
The elements in equation (20) are indicated next. First, $\bar\theta$ is the solution to the
equation
\[
\bar\varphi\bigl(\bar\theta\bigr) \triangleq E\exp\bigl(\bar\theta\bar X_1\bigr) = \frac{1}{q},
\]
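which clearly exists if $p$ is small enough. In order to describe $U_{\bar\theta}(s)$ define $P_{\bar\theta}(\cdot)$ via
\[
P_{\bar\theta}(B) = q^n E\bigl(\exp\bigl(\bar\theta\bar S_n\bigr);\ 1(B)\bigr),
\]
for every $B$ in the sigma-field $\sigma(\bar X_1, \dots, \bar X_n)$. Next, we will show that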
V (s) , Uθ (s)− s
EθX1
− EθX2
1
2E2θX1
−R∞t
R∞sPθ¡X1 > u
¢duds
EX21
, (21)
where,¯V (s)
¯= o
¡s−(α+1)
¢as s%∞ uniformly in p > 0 small enough. The previous
expression follows from Theorem 1, as we now illustrate. (Note that the term V in
(21) includes the last two terms in the right hand side of the equation in the part 2
of Theorem 1.) Observe that gp (λ) , Eθ exp¡iλX1
¢= qEθ exp((iλ+ θ)X1) satisfies¯
gp (λ)−E exp (iλX1)¯ ≤ ¯
gp (λ)−E exp¡iλX1
¢¯+ o
¡pα+2
¢≤ p
¯E exp
¡iλX1
¢¯+ θEX1 + o
¡pα+2
¢= O (p) .
This implies that gp (·) satisfies the uniform strongly non-lattice condition (14). On
the other hand, since θ = O (p), we have that for all p > 0 small enough
EθXα+2
1 = qE exp¡θX1
¢X
α+2
1 ≤MEXα+2
1 < MEXα+21 <∞.
This justifies the validity of representation (21). Furthermore, (21) implies that
EqN(x/p)
=
Z[0,x/p)
eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)EθX1
ds (22)
+
Z[0,x/p)
eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)EθX
2
1
Z ∞
s
Pθ¡X1 > u
¢duds (23)
+
Z[0,x/p)
eθ(x/p−s)P (X1 > x/p− s;X1 ≤ x/p)V (ds) . (24)
Let us denote by I1, I2, and I3 the expressions (22), (23), and (24) respectively. We
first show that I3 = o (pα+1). To see this, we use integration by parts, the triangle
inequality and the fact that θ = O (p) to obtain
|I3| ≤¯V (x/p)
¯+M1
¯Z[0,x/p)
V (s) de−θsP (X1 > x/p− s;X1 ≤ x/p)¯
≤ ¯V (x/p)
¯+M1
¯Z[0,x/p)
V (s) e−θsP (X1 > x/p− ds;X1 ≤ x/p)¯
+M1
¯Z[0,x/p)
V (s) e−θsP (X1 > x/p− s;X1 ≤ x/p) ds¯. (25)
The term¯V (x/p)
¯= o (pα+1). Now, observe that¯Z[0,x/p)
V (s) e−θsP (X1 > x/p− ds;X1 ≤ x/p)¯
≤¯Z[0,x/2p)
V (s) e−θsP (X1 > x/p− ds;X1 ≤ x/p)¯
+
¯Z[x/2p,x/p)
V (s) e−θsP (X1 > x/p− ds;X1 ≤ x/p)¯
≤ K2P (X1 > x/ (2p)) +K1 max1/2≤u≤1
¯V (ux/p)
¯(26)
= o¡pα+2
¢+ o
¡pα+1
¢= o
¡pα+1
¢,
for some constants K1 and K2. The integral in (25) follows the same lines as (26).
For I2 we have
I2 =1
EθX2
1
Z[0,x/p)
eθ(x/p−s)P (X1 > x/p− s)Z ∞
s
Pθ¡X1 > u
¢duds+ o
¡pα+1
¢. (27)
A parallel argument to that given for I3 shows that
1
EθX2
1
Z[0,x/p)
eθ(x/p−s)P (X1 > x/p− s)Z ∞
s
Pθ¡X1 > u
¢duds = o (pα) ,
which yields
I2 = o (pα) .
Finally, we analyze I1
I1 + o¡pα+1
¢=
1
EθX1
Z x/p
0
eθuP (X1 > u) du =1
θEθX1
Z x/p
0
P (X1 ≥ u) deθu
= o¡pα+1
¢− 1
θEθX1
+1
θEθX1
Z x/p
0
eθuP (X1 ∈ du)
=
¡1−E exp ¡θX1
¢¢θE¡exp
¡θX1
¢X1
¢ + o ¡pα+1¢ .Lastly, the implicit function theorem yields that
p
q2θφ0 ¡θ¢ = ¡
1− E exp ¡θX1
¢¢qθE
¡exp
¡θX1
¢X1
¢ = expÃXk≤α
pkξk
!+ o (pα) ,
and
θ =Xk≤α
pkγk + o (pα) .
This concludes the proof of the theorem.
Remark Note that (26) and (27) combined with Lemma 1 indicate that, in principle,
it is possible to develop an approximation for $P(pS_M > x)$ up to an error of
order $o(p^{\alpha+1})$ given by
\[
P(pS_M > x) \approx \exp\bigl(-x\bar\theta/p\bigr)\Bigl(\frac{p}{q^2\bar\theta\bar\varphi'\bigl(\bar\theta\bigr)} + I_2\Bigr).
\]
This approximation, however, involves computing $I_2$ and $\bar\theta$ explicitly, which may be
cumbersome in practice.
As we indicated at the beginning of this chapter, in many application settings the
increment distributions are actually changing with $p$. In this context, it is desirable
to develop approximations similar to those provided in the previous theorems. Fortunately,
Theorem 1 also provides a means to deal with the typical situations that arise
in practice. To fix ideas, consider a family of probability measures $\mathcal P = \{P_p,\ p \in [0, \delta]$
for some $\delta > 0\}$. Suppose that, under each $P_p$, the random variables $(X_k : k \ge 1)$
form an iid sequence. Also, assume that the distribution of $X_1$ is uniformly strongly
non-lattice with respect to $\mathcal P$ (i.e. the characteristic functions $g_p(\lambda) = E_p\exp(i\lambda X_1)$
satisfy condition (9)). In addition, suppose that one of the following conditions holds:
A) for some $\eta > 0$, $\sup_{0 \le p \le \delta} E_p\exp(\eta X_1) < \infty$ or
B) $\sup_{0 \le p \le \delta} E_pX_1^{2+\alpha} < \infty$, for some $\alpha \ge 0$.
Under this set of assumptions, we have the following analogue to Theorems 2 and
3.
Theorem 4 Assume that the family $P_p$, $p \in [0, \delta]$ is uniformly strongly non-lattice
(see equation (9)). If condition A) above holds, then there exist constants $K_1, K_2 > 0$
such that for $p > 0$ small
\[
\bigl|P_p(pS_M > x) - \exp(-\theta_* x/p + r_p(p))\bigr| \le K_1\exp(-K_2x/p),
\]
where $\theta_* = \theta_*(p)$ solves $\varphi_p(\theta_*) \triangleq E_p\exp(\theta_*X_1) = 1/q$ and
\[
\exp(r_p(p)) = \frac{p}{q^2\theta_*\varphi'_p(\theta_*)}.
\]
Moreover, $\theta_*(p) = \sum_{k=1}^{\infty} p^k\gamma_k(p)$ and $r_p(p) = \sum_{k=1}^{\infty} p^k\xi_k(p)$ (where the $\gamma_k(p)$'s and
$\xi_k(p)$'s depend on the first $k$ and $(k+1)$ moments of $X_1$ under $P_p$ respectively).
Finally, if condition B) is in force, then
\[
\Bigl|P_p(pS_M > x) - \exp\Bigl(-x/E_pX_1 + \sum_{k \le \alpha} p^k\bigl(\xi_k - \gamma_{k+1}x\bigr)\Bigr)\Bigr| = o(p^\alpha).
\]
Proof. The proof parallels the arguments given in Theorems 2 and 3 using Theorem 1.
The details are omitted.
Remark Note that the γk’s and the ξk’s also depend on p. The previous result
would yield the desired asymptotic expansion assuming that the problem at hand has
enough structure, so that an asymptotic expansion of ξk’s and γk’s can be obtained.
The expansion for the distribution of the all time maximum of a random walk with
small negative drift given in Chapter 1 of this dissertation, provides an example in
which the previous result would have been applicable.
4.2 Asymptotics of Defective Renewal Equations
As we discussed at the beginning of the chapter, in many applied probability settings
one often deals with defective renewal equations, which are integral equations that
can be written as
\[ a(t) = b(t) + q \int_{[0,t)} a(t-s)\, P(X_1 \in ds), \]
where $q = 1 - p \in (0,1)$ and $b$ is a given function for which we shall assume certain regularity properties (see Theorem 5). As an application of our developments in
Section 2, we provide means to obtain asymptotic expansions for $a(\cdot)$ as $p \searrow 0$.
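As an illustration of what a defective renewal equation looks like numerically, the sketch below solves it on a grid with a simple first-order discretization. This is not the asymptotic method of this section, only a brute-force check. The test case (X1 ~ Exp(1) and b(t) = exp(−t)) is an assumption chosen because the exact solution, a(t) = exp(−pt), can be verified by direct substitution.

```python
import numpy as np


def solve_defective_renewal(b, cdf, q, t_max, h):
    """First-order discretization of a(t) = b(t) + q * int_[0,t) a(t-s) F(ds)."""
    n = int(round(t_max / h))
    grid = h * np.arange(n + 1)
    d_f = np.diff(cdf(grid))          # d_f[j-1] = F(j h) - F((j-1) h)
    b_vals = b(grid)
    a = np.empty(n + 1)
    a[0] = b_vals[0]
    for i in range(1, n + 1):
        # Convolution sum_{j=1}^{i} a[i-j] * d_f[j-1] over already computed values.
        a[i] = b_vals[i] + q * np.dot(a[i - 1::-1], d_f[:i])
    return grid, a


# Hypothetical test case: X1 ~ Exp(1), b(t) = exp(-t); exact solution is exp(-p t).
p = 0.1
grid, a = solve_defective_renewal(
    b=lambda t: np.exp(-t),
    cdf=lambda t: 1.0 - np.exp(-t),
    q=1.0 - p,
    t_max=5.0,
    h=0.01,
)
print(a[-1], np.exp(-p * grid[-1]))
```

The discretization error is of order h, so the grid step should be small relative to the scale on which a varies.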
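Theorem 5 Suppose that the distribution of $X_1$ is strongly non-lattice and that $E X_1^{2+\alpha} < \infty$. In addition, suppose that $b$ is right-continuous with left limits, has finite variation and $|b|(t) \le g(t)$ with $\int_0^{\infty} t^{\alpha+1} g(t)\,dt < \infty$. Finally, let us write, for $j \le \alpha + 1$, $b_j = \int_0^{\infty} t^j b(t)\,dt$. Then, as $p \searrow 0$,
\[ a(t/p) = \exp\left(-t\,\hat\theta_\alpha / p\right) d(p) + o(p^{\alpha}), \]
where
\[ d(p) = \frac{b_0 + \sum_{k \le \alpha} b_k \hat\theta_\alpha^{k} / k!}{q \left( E X_1 + \sum_{k \le \alpha} \hat\theta_\alpha^{k}\, E X_1^{k+1} / k! \right)}. \]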
Proof. First we note that if $\bar a(\cdot)$ satisfies
\[ \bar a(t) = b(t) + q \int_{[0,t)} \bar a(t-s)\, P(\bar X_1 \in ds), \tag{28} \]
where $\bar X_1 = X_1 1(X_1 < 1/p)$, then, by applying Laplace transforms we can verify that $\bar a$ satisfies (see Theorem 9.1.1 of Lin and Willmot (2000))
\[ \bar a(t) = \frac{1}{p} \int_{[0,t)} b(t-s)\, P(\bar S_M \in ds). \]
Therefore
\[
\begin{aligned}
\bar a(t/p) - a(t/p)
&= \frac{1}{p} \int_{[0,t/p)} b(t/p - s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right) \\
&= \frac{1}{p} \int_{[0,t/2p)} b(t/p - s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right) \qquad (29) \\
&\quad + \frac{1}{p} \int_{[t/2p,\,t/p)} b(t/p - s) \left( P(\bar S_M \in ds) - P(S_M \in ds) \right). \qquad (30)
\end{aligned}
\]
Let $J_1$ and $J_2$ be the integrals in (29) and (30) respectively. Now, since $b(t) = o(t^{-(\alpha+2)})$ as $t \to \infty$ and $b$ is right continuous with left limits, it is not hard to see that
\[ \max_{1/2 \le u \le 1} |b(ut/p)| = o(p^{\alpha+2}). \]
Thus, it follows that
\[ \frac{1}{p} \left| \int_{[0,t/2p)} b(t/p - s)\, P(\bar S_M \in ds) \right| \le \frac{1}{p} \max_{1/2 \le u \le 1} |b(ut/p)| = o(p^{\alpha+1}), \]
which implies that $J_1 = o(p^{\alpha})$. For $J_2$ we can use integration by parts to obtain
\[
\begin{aligned}
|J_2| &= \frac{b(0)}{p} \left( P(p\bar S_M \le t) - P(pS_M \le t) \right)
+ \left| \frac{b(t/p)}{p} \right| \left( P(p\bar S_M \le t/2) - P(pS_M \le t/2) \right) \\
&\quad + \int_{[1/2,1)} \frac{1}{p} \left| P(\bar S_M \le st/p) - P(S_M \le st/p) \right| \, |b|(t/p - ds).
\end{aligned}
\]
From Lemma 1 and the fact that $\int_{[0,\infty)} |b|(ds) < \infty$ we can easily obtain that $J_2 = o(p^{\alpha+1})$. The rest of the argument follows just as in the proof of Theorem 3, by finding
a root for the equation $E \exp(\theta \bar X_1) = 1/q$, transforming (28) into a non-defective
renewal equation and applying Theorem 1.
As a final remark we note that a straightforward generalization of the previous
theorem can be obtained in a completely analogous setting as the one described in
Theorem 4. As an application of the previous results we consider a couple of examples
in insurance risk theory and queueing theory.
Example 1 (Perturbed ruin model) Consider the case of the classical ruin
model perturbed by a diffusion introduced by Dufresne and Gerber (1991). That is,
suppose that the risk process is a Lévy process of the form
\[ R(t) = x + ct - S(t) + \sigma B(t); \quad t \ge 0, \]
where $S(\cdot)$ represents the aggregate claim process, which follows a compound Poisson
process with Poisson parameter $\lambda$ and increments (claims) $Y = (Y_k : k \ge 1)$; $x$ represents the initial reserve, $c$ is a constant premium rate satisfying $c > \lambda EY$,
and $\sigma B(\cdot)$ is a Brownian motion independent of $S$ with diffusion coefficient equal to $\sigma$ (i.e. instantaneous variance equal to $\sigma^2$). The term involving the Brownian
motion $B$ represents noise that may incorporate non-systematic fluctuations in the
composition of the insurance portfolio, measurement errors, etc. We are interested in
computing the probability of eventual ruin in this model. Note that this model cannot
be reduced directly to the standard renewal model discussed at the beginning of
the chapter because, in this case, ruin can occur between claim arrivals. In order to
apply Theorem 5, let us introduce some additional notation. Let $Z$ be a rv having the
equilibrium distribution generated by $Y$, that is
\[ P(Z \le z) = \frac{1}{EY} \int_0^z P(Y > y)\, dy. \]
Also, define $p = 1 - \lambda EY/c$ and $q = 1 - p$, and $V = Z + \sigma^2 W/(2c)$, where $W$ is
exponentially distributed with mean one and $Z$ and $W$ are independent. Finally, let
$\tau(x) = \inf\{t \ge 0 : R(t) < 0\}$, and note that ruin occurs if and only if $\tau(x) < \infty$. Dufresne and Gerber (1991) proved that if $P(\tau(x) < \infty) = a(x)$, then
\[ a(x) = q P(V > x) + p P\left(W > 2cx/\sigma^2\right) + q \int_0^x a(x-y)\, P(V \in dy). \]
In this context, $p$ close to zero and $x$ large are reasonable assumptions, hence
Theorem 5 can be directly applied here to provide asymptotics for $a(x/p)$ as
$p \to 0$. In particular, for $j \ge 0$, it is easy to verify that
\[ b_j = \frac{1}{j+1} \left( q\, EV^{j+1} + p \left( \sigma^2/(2c) \right)^{j+1} (j+1)! \right), \]
and that
\[ EZ^j = \frac{EY^{j+1}}{(j+1)\, EY}, \qquad EW^j = j!. \]
These expressions, combined with Proposition 2 and Theorem 5, provide all the necessary
means to compute the desired asymptotic expansion. For instance, assuming
that $EZ^4 < \infty$ (which implies that $EY^3 < \infty$), we obtain that
\[ a(x/p) = \exp\left( -x/EV + \tfrac{1}{2} \left( 1 - EV^2 / \left( 2 E^2 V \right) \right) p \right) d(p) + o(p), \]
where
\[ d(p) = \frac{\left( q\,EV + p\sigma^2/(2c) \right) + \left( q\,EV^2 + \sigma^4/(2c^2) \right) p/EV}{q \left( EV + p\,EV^2/EV \right)}, \]
and
\[ EV = EY^2/(2EY) + \sigma^2/(2c), \]
\[ EV^2 = EY^3/(3EY) + \sigma^4/\left(2c^2\right) + \sigma^2 EY^2/(2c\,EY). \]
Note that these asymptotics correspond to corrected diffusion approximations for the
present model.
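The moment formulas for V at the end of Example 1 can be checked mechanically from the representation V = Z + σ²W/(2c). The helper below is a small sketch; the function name and the numerical inputs (exponential claims with mean one, σ = 1, c = 2) are illustrative assumptions.

```python
def perturbed_ruin_moments(ey1, ey2, ey3, sigma, c):
    """Return (EV, EV^2) for V = Z + sigma^2 W / (2c), with Z and W independent.

    ey1, ey2, ey3 are the first three moments of the claim size Y.
    """
    s = sigma**2 / (2.0 * c)        # scale of the exponential component
    ez1 = ey2 / (2.0 * ey1)         # E Z   (equilibrium distribution of Y)
    ez2 = ey3 / (3.0 * ey1)         # E Z^2
    ev1 = ez1 + s                   # uses E W = 1
    ev2 = ez2 + 2.0 * ez1 * s + 2.0 * s**2   # uses E W^2 = 2
    return ev1, ev2


# Hypothetical inputs: Y ~ Exp(1), so EY = 1, EY^2 = 2, EY^3 = 6.
ev1, ev2 = perturbed_ruin_moments(1.0, 2.0, 6.0, sigma=1.0, c=2.0)
print(ev1, ev2)
```

Expanding the square of Z + σ²W/(2c) term by term gives the same three summands as the closed-form expression for EV² stated in the example.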
Example 2 (M/G/c waiting time) A standard (first-come first-served) M/G/c
queue can be described as follows. Customers arrive according to a Poisson process
with rate $\lambda$. The system is composed of $c$ servers and a buffer of infinite size. The
amount of service time required by each customer is described by a sequence $V = (V_j : j \ge 1)$
of non-negative iid random variables independent of the arrival process.
The stability condition of this queue requires less than 100% utilization, which is expressed
via $\rho = \lambda EV/c < 1$. Just as in the standard M/G/1 case, the so-called equilibrium
distribution of the service time sequence, namely $H(t) = \int_0^t P(V > s)\,ds / EV$,
plays an important role in describing the steady-state distribution of the waiting time sequence
$W = (W_n : n \ge 1)$ (excluding service) of this queueing system. In particular (see Van Hoorn (1984)), it turns out that if $a(t) = P(W_\infty > t \mid W_\infty > 0)$, then
\[ a(t) = (1-\rho)\,(1 - H(t))^c + \rho\,(1 - H(tc)) + \rho \int_0^t a(t-s)\, dH(sc), \]
and
\[ P(W_\infty > 0) = \frac{\dfrac{(\lambda EV)^{c-1}}{(c-1)!}\,\rho}{(1-\rho) \sum_{j=0}^{c-1} (\lambda EV)^j / j! + (\lambda EV)^c / c!}. \]
Therefore, as a straightforward application of Theorem 5, we can develop corrected
diffusion approximations (in the spirit of Chapter 2 of this dissertation) for the steady-state
waiting time of the M/G/c queue.
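The expression for P(W∞ > 0) in Example 2 is straightforward to evaluate. The sketch below transcribes it directly; the function name and the numerical inputs are assumptions made for illustration. For c = 1 the formula reduces to ρ, which gives a simple sanity check.

```python
from math import factorial


def prob_wait_positive(lam, ev, c):
    """Evaluate the delay probability formula of Example 2 for c servers."""
    a = lam * ev                  # offered load, lambda * EV
    rho = a / c                   # utilization
    if rho >= 1.0:
        raise ValueError("stability requires rho < 1")
    numerator = a ** (c - 1) / factorial(c - 1) * rho
    denominator = (1.0 - rho) * sum(a**j / factorial(j) for j in range(c)) \
        + a**c / factorial(c)
    return numerator / denominator


# Hypothetical inputs.
print(prob_wait_positive(lam=0.7, ev=1.0, c=1))   # reduces to rho for c = 1
print(prob_wait_positive(lam=1.0, ev=1.0, c=2))
```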
Chapter 5
Approximations for the
Distribution of Infinite Horizon
Discounted Rewards
For t ≥ 0, let Λ (t) be a real-valued random variable representing the cumulative
reward associated with running a system over [0, t]. In the presence of a stochastic
inflation rate, the infinite horizon discounted reward takes the form
\[ D = \int_{[0,\infty)} \exp\left(-\Gamma(t-)\right) d\Lambda(t), \]
where $\Gamma = (\Gamma(t) : t \ge 0)$ is a real-valued process representing the cumulative inflation to time $t$. An enormous literature exists within the performance modeling and
stochastic control communities that focuses on computing and/or optimizing the expected
infinite-horizon discounted reward, namely $E(D)$. Our focus, in this paper,
is on the development of approximations for the distribution of the random variable
(r.v.) $D$ (and not just its expected value).
As we shall see in Section 2, the distribution of the random variable D plays a
key role in a number of different application contexts. Since, clearly, computing
the exact distribution of D is, in general, very difficult, the emphasis in this paper
is on the development of approximations. All of our approximations are rigorously
supported by limit theorems that are valid in the asymptotic regime in which the
CHAPTER 5. APPROXIMATING DISCOUNTED REWARDS 87
“inflation rate” is small.
Study of approximations for the distribution of D can be traced back to the early
seventies. Gerber (1971) established a Central Limit Theorem (CLT), as well as its
Berry-Esséen companion, for
\[ D = \sum_{k=0}^{\infty} \exp(-\alpha k)\, X_k, \]
in the case of a (small) deterministic discount rate $\alpha$ and iid rewards $(X_k)_{k \ge 0}$. Whitt
(1972) obtained more general central limit theorems for D, also under the assumption
of deterministic interest rates. The aim of Whitt’s paper was to establish discounted
stochastic limit theorems based on postulating a functional limit theorem for the
(undiscounted) reward process (in our notation, Λ).
The stochastic discount rate has also been widely studied. Pollack and Siegmund
(1985) computed the distribution of D in the case in which Γ follows a Brownian
motion with negative drift and Λ (t) = t; see also Dufresne (1990). The distribution
of D has also been computed explicitly by Gjessing and Paulsen (1997) in some other
particular cases in which both Γ and Λ follow particular types of Lévy processes.
Computing the distribution of D in complete generality is clearly infeasible. And
even in Markovian settings, such as those previously described, the type of integro-
differential equations that arise (see Gjessing and Paulsen (1997) and Yor (2001)) are
challenging to solve both analytically and numerically. Hence, our goal is to provide
approximations to D that hold in great generality and require relatively “easy-to-
obtain” information for their implementation.
It is important to recognize that D arises as the stationary distribution of certain
processes that have been well studied in the context of time series analysis (such
as AR and ARCH processes). By properly scaling certain types of auto-regressive
processes, Nelson (1990) obtained sample-path weak convergence results to a Gaussian
Ornstein-Uhlenbeck process as the sampling frequency increases. More recently, Forniari
and Mele (1997) extended Nelson’s results to cover more general types of non-linear
ARCH and GARCH time series models. From the time series analysis perspective, the
central limit theorem (CLT) derived here in Section 4 is related to the convergence
of the stationary distributions of auto-regressive type models to that of Ornstein-
Uhlenbeck (namely a Gaussian law). One of the contributions of Whitt (1972) is to
show that weak convergence of properly scaled processes Γ and Λ in the standard
Skorohod topology is not enough to guarantee weak convergence of a suitably scaled
and normalized version of D. Thus, the general weak convergence analysis at the
level of stationary distributions in auto-regressive processes does not follow directly
from previous results in the literature. Our laws of large numbers (LLNs) and CLTs
derived in Section 4 hold in great generality (in particular, in the cases considered by
Nelson (1990) and Forniari and Mele (1997)), hence our results complement previous
analysis on the structure of auto-regressive processes.
Some results similar in spirit to our results in Section 4 have been obtained by
Kushner (1984) and Benveniste, Metiver, and Priouret (1990) in the context of
stochastic approximation algorithms, more precisely the so-called least mean squares
(LMS) algorithm, which gives rise to a linear stochastic recursive equation of order
one. Although these results hold in the vector valued case, the dependence assump-
tions imposed are stronger than ours and are only given in discrete time, which is
not convenient for some of the applications discussed in the next section (e.g. finance
and insurance). Also in the context of stochastic approximation algorithms, Bucklew,
Kurtz, and Sethares (1993) analyzed weak convergence (on compact sets) of processes
following certain stochastic recursive equations that give rise to stationary distributions
related to D. As in the previous discussion regarding the time series setting, this
type of analysis does not directly imply weak convergence of stationary distributions.
In this paper, we not only complement previous results in the literature (such as
those discussed in the context of time series analysis) by providing rigorous general
statements that support LLNs and CLTs at the level of stationary distributions, but
also provide new approximations and further refinements for the LLNs and CLTs
previously indicated. The approximations proposed take the form of Edgeworth ex-
pansions and large deviation principles (LDP’s), and can typically be implemented
at a modest computational cost (see, for example, (7) and (12)). The assumptions
under which these results are derived are stated at the beginning of the corresponding
sections.
The rest of the paper is organized as follows. As we indicated before, our first
approximation takes the form of a law of large numbers (LLN) and is derived in
Section 3. In Section 4, we provide a central limit theorem (CLT) correction to
the LLN derived in Section 3. The approximations developed in Sections 3 and 4
are shown to be valid under very general settings. Under additional assumptions,
refinements to the CLT are introduced in Section 5. These refined approximations
take the form of Edgeworth expansions and are provided in both the discrete and
the continuous time settings. Finally, under exponential tail conditions on Λ and Γ,
general large deviation principles (LDP’s) are given in Section 6. Sharp large
deviation asymptotics are also discussed in that section.
5.1 Motivating Examples
The distribution of D plays a key role in a number of different application contexts.
In the world of finance and pension funds, D is called a “perpetuity” (see Embrechts,
Klüppelberg, and Mikosch (1997)). As an example of how D arises in pension funds we
mention Dufresne (1990), who proposed a model, based on perpetuities, for computing
the value of a pension fund. He argued that the value at time t can be expressed as
\[ V(t) = \int_{-\infty}^{t} \exp\left( -\int_s^t \gamma(u)\, du \right) \lambda(s)\, ds, \]
where $(\gamma(t), \lambda(t) : -\infty < t < \infty)$ is stationary and ergodic, with $0 < E\gamma(t) < \infty$ and $E \log(1 + |\lambda(t)|) < \infty$. The processes $\gamma(t)$ and $\lambda(t)$ depend on the parameters
that serve to characterize the pension fund (i.e. benefit payments, actuarial liability,
net premium, and rate of return). As explained in Dufresne (1990), the distribution
of the value process plays an important role in risk management, as it serves to
compute critical rates ensuring that the fund is being managed in a balanced manner
with respect to its actuarial liabilities; see Dufresne (1990) and Bédard and Dufresne
(2001) for additional detail on pension funding. The random variable V (t) can be
recast as a special case of D, so that our results apply directly.
The random variable D also arises in non-pension fund insurance settings. Consider
a company that receives premiums at a rate of p dollars per unit time, and pays
out claims according to the random process $A(\cdot)$. If $\gamma(t)$ represents the rate of return on the invested risk reserve at time $t$, the risk reserve $R(t)$ evolves according to the
equation
\[ dR(t) = \gamma(t) R(t)\, dt + (p\, dt - dA(t)), \]
subject to the initial condition $R(0) = r_0$. Harrison (1977) shows that the ruin
probability $P(\inf_{t \ge 0} R(t) < 0)$ can be computed in terms of D (for Λ and Γ suitably
defined) when γ is deterministic. Paulsen (1998) extends this result to the case of
stochastic $\gamma(\cdot)$; see also Nyrhinen (2001). Thus, the key to calculating such ruin
probabilities is computing the distribution of D.
As we mentioned before, it turns out that D also plays a major role in the theory
of ARCH processes. This class of time series is widely used within the statistics and
econometrics communities, and has been employed to model log-returns, exchange
rates, inflation, and many other financial and economic time series; see Campbell,
Lo and Mackinlay (1999), Shephard (1996), Mills (1993) and Wilkie (1986). An
ARCH(1) model satisfies the stochastic recursion
\[ Y_{n+1} = A_n + B_{n+1} Y_n, \]
where the sequence $((A_i, B_i) : i \ge 1)$ is iid (independent and identically distributed). Under mild stability conditions (see, for example, Kesten (1973), Verbaat (1979),
Goldie (1991), Embrechts and Goldie (1994)), this Markov chain has a stationary
distribution. This stationary distribution is a special case of D.
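A short simulation can make the link between the recursion and its stationary law tangible. In the sketch below the distributions of the coefficients (exponential additive terms with mean one, and multiplicative terms uniform on (0.5, 0.9), drawn independently of each other) are assumptions chosen only for illustration. Under these assumptions the stationary mean is E A / (1 − E B), which the long-run average of the simulated path should approach.

```python
import numpy as np


def simulate_recursion(n, rng):
    """Simulate Y <- A + B * Y with iid, mutually independent coefficients."""
    a = rng.exponential(1.0, size=n)      # additive terms, mean 1 (illustrative)
    b = rng.uniform(0.5, 0.9, size=n)     # multiplicative terms, mean 0.7 (illustrative)
    y = np.empty(n)
    state = 0.0
    for k in range(n):
        state = a[k] + b[k] * state
        y[k] = state
    return y


rng = np.random.default_rng(0)
path = simulate_recursion(200_000, rng)
stationary_mean = 1.0 / (1.0 - 0.7)       # E A / (1 - E B) under the assumptions above
print(path[1000:].mean(), stationary_mean)
```

Because the coefficients are drawn independently here, the index shift between the additive and multiplicative terms in the recursion plays no role in the simulation.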
We also note several other application settings in which the distribution of D
arises as a central object. Goldie and Grübel (1996) describe its relevance to complex-
ity theory (in the context of sorting algorithms related to “Quicksort”) and analytic
number theory. Carmona, Petit, and Yor (2001) describe several other applications
arising in mathematical physics and finance.
Apart from the financial context, Whitt (1972) also reports two application contexts in
which our limit theorems may have potentially important implications. The second of these is
the dynamic programming context, in which the approximations derived may be used
in developing stochastic criteria and sensitivity analysis for small interest rates; see
Whitt (1972) for additional detail.
As stated earlier, our work is intended to provide approximations to the distribu-
tion of D. Of particular importance (in view of the above applications) is the setting
in which the “interest rate” corresponding to Γ is small. Our theorems establish that
our approximations become asymptotically exact as the “interest rate” goes to zero.
5.2 Law of Large Numbers
We assume throughout this chapter (except in some cases explicitly indicated) that
for each $t \ge 0$, $(\Lambda(s) : 0 \le s \le t)$ is of bounded variation. We also require $\Lambda$ and
$(\Gamma(s) : s \ge 0)$ to be right continuous functions with left limits (RCLL). (Note that we do not require bounded variation for $\Gamma$.) Let $|\Lambda|(t)$ be the total variation of $\Lambda$ over $[0,t]$, and suppose that $|\Lambda|$ satisfies:
\[ \lim_{t \to \infty} \frac{1}{t}\, |\Lambda|(t) < \infty \quad \text{a.s.} \tag{1} \]
We further assume that:
A1 There exist deterministic constants $\lambda \in \mathbb{R}$ and $\gamma \in (0,\infty)$ such that:
\[ \Gamma(t) = \gamma t + o_p(t), \qquad \Lambda(t) = \lambda t + o_p(t), \]
where $o_p(t)$ means that for every $c > 0$,
\[ \sup_{0 \le t \le c} \left| \frac{o_p(t\beta)}{\beta} \right| = o(1) \quad \text{as } \beta \to \infty. \]
Our first proposed approximation for $D$ takes the form
\[ D \overset{D}{\approx} \lambda/\gamma. \tag{2} \]
Here, $\overset{D}{\approx}$ means “has approximately the same distribution as”, and is intended to hold
no rigorous mathematical meaning. The relation (2) should be interpreted merely
as a statement of a proposed approximation.
Of course, given such an approximation, it is important to identify conditions
under which the approximation can be guaranteed to be good. We shall argue that
the approximation (2) tends to be good when the discount rate γ is small. To
make this statement mathematically rigorous, we shall introduce a parameter α that
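will control the magnitude of the discount rate. We will show that as $\alpha \searrow 0$, the
approximation (2) becomes asymptotically valid. More precisely, let
\[ D(\alpha) = \int_{[0,\infty)} \exp\left(-\alpha \Gamma(t-)\right) \Lambda(dt). \]
For $D(\alpha)$, the approximation (2) takes the form
\[ D(\alpha) \overset{D}{\approx} \lambda / (\alpha\gamma). \tag{3} \]
The following theorem shows that the approximation (3) becomes accurate as $\alpha \searrow 0$.
Theorem 1 Under A1,
\[ \alpha D(\alpha) \to \frac{\lambda}{\gamma} \quad \text{a.s. as } \alpha \searrow 0. \]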
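The law of large numbers in Theorem 1 can be illustrated by simulation. The model in the sketch below is an assumption made purely for illustration: Λ is a compound Poisson process with unit arrival rate and exponential jumps of mean λ, and Γ(t) = γt + σB(t), which is not monotone. The integral is truncated at a horizon where the deterministic part of the discount factor is negligible.

```python
import numpy as np


def scaled_discounted_reward(alpha, lam, gam, sigma, rng, horizon_factor=30.0):
    """Return one sample of alpha * D(alpha) under an illustrative model.

    Assumed model (not from the text): Lambda is compound Poisson with unit
    arrival rate and Exp jumps of mean lam; Gamma(t) = gam * t + sigma * B(t).
    """
    # Truncate where exp(-alpha * gam * t) has decayed to exp(-horizon_factor).
    t_end = horizon_factor / (alpha * gam)
    n = rng.poisson(t_end)
    times = np.sort(rng.uniform(0.0, t_end, size=n))
    jumps = rng.exponential(lam, size=n)
    # Sample the Brownian motion at the jump times via independent increments.
    dt = np.diff(times, prepend=0.0)
    bm = np.cumsum(rng.normal(0.0, np.sqrt(dt)))
    gamma_path = gam * times + sigma * bm   # Gamma is continuous: Gamma(t-) = Gamma(t)
    return alpha * np.sum(np.exp(-alpha * gamma_path) * jumps)


rng = np.random.default_rng(1)
lam, gam = 2.0, 0.5
estimate = scaled_discounted_reward(alpha=1e-3, lam=lam, gam=gam, sigma=1.0, rng=rng)
print(estimate, lam / gam)
```

Repeating the call with smaller values of `alpha` should concentrate the samples more tightly around λ/γ, at the cost of a longer simulated horizon.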
Note that the “law of large numbers” (LLN) offered by Theorem 1 does not assume
that the instantaneous discount rate is non-negative (i.e. that Γ is non-decreasing).
The lack of such an assumption introduces some technical complications in our proofs.
We prove (3) by replacing $\Gamma$ with the non-decreasing function
\[ \bar\Gamma(t) \triangleq \sup\{\Gamma(s) : 0 \le s \le t\}. \]
The hope is that
\[ \bar D(\alpha) = \int_{[0,\infty)} \exp\left(-\alpha \bar\Gamma(t-)\right) \Lambda(dt) \]
then has a behavior similar to that of $D(\alpha)$ when $\alpha$ is small. Theorem 1 can be
established by proving the corresponding law of large numbers for $\bar D(\alpha)$. Thus, the
proof of Theorem 1 follows from Propositions 1 and 2 below.
Proposition 1 Assume A1. Then,
\[ \alpha \left( D(\alpha) - \bar D(\alpha) \right) \to 0 \quad \text{a.s.} \]
as $\alpha \searrow 0$.
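Proof. Observe that
\[
\begin{aligned}
D(\alpha) &= \int_0^\infty \exp(-\alpha\Gamma(t-))\, \Lambda(dt)
= \int_0^\infty \int_{\alpha\Gamma(t-)}^{\infty} \exp(-u)\, du\, \Lambda(dt) \\
&= \int_0^\infty \left( \int_{\alpha\bar\Gamma(t-)}^{\infty} \exp(-u)\, du + \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du \right) \Lambda(dt) \\
&= \bar D(\alpha) + \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt).
\end{aligned}
\]
Therefore, we must show that
\[ \left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \to 0 \quad \text{a.s. as } \alpha \searrow 0, \]
but
\[ \left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \le \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, |\Lambda|(dt). \]
Now the right hand side of the last inequality is equal to
\[
\begin{aligned}
&\int_0^\infty \alpha \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u - \alpha\Gamma(t/\alpha))\, du\, |\Lambda|\left(\frac{dt}{\alpha}\right) \\
&\quad = \int_0^\infty \alpha \exp\left(-\gamma t + \alpha o_p(t/\alpha)\right) \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du\, |\Lambda|\left(\frac{dt}{\alpha}\right) \\
&\quad = \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt),
\end{aligned}
\]
where
\[ h(t,\alpha) = \exp\left(-\frac{\gamma}{2} t + \alpha o_p(t/\alpha)\right) \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du, \]
and
\[ \mu_\alpha(dt) = \alpha \exp\left(-\frac{\gamma}{2} t\right) |\Lambda|\left(\frac{dt}{\alpha}\right). \]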
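Observe that
\[ 0 \le \sup_{\alpha,\, t \ge 0} h(t,\alpha) \le M. \]
Moreover, it is not hard to verify that $h(t,\alpha) \to 0$ as $\alpha \to 0$ uniformly on compact
intervals. Also, notice that
\[
\begin{aligned}
\mu_\alpha(0,\infty) &= \int_0^\infty \alpha \exp\left(-\frac{\gamma}{2} t\right) |\Lambda|\left(\frac{dt}{\alpha}\right)
= \int_0^\infty \alpha \exp\left(-\frac{\gamma}{2} t\alpha\right) |\Lambda|(dt) \\
&= \int_0^\infty e^{-u}\, \alpha\, |\Lambda|\left(\frac{2u}{\alpha\gamma}\right) du
= \int_0^\infty e^{-u/2} e^{-u/2}\, \alpha\, |\Lambda|\left(\frac{2u}{\alpha\gamma}\right) du.
\end{aligned}
\]
Since $|\Lambda|$ satisfies (1), we have for some $B > 0$,
\[ 0 \le \int_0^\infty e^{-u/2}\, \alpha\, |\Lambda|\left(\frac{2u}{\alpha\gamma}\right) du \le B. \]
Thus, we have that for all $\alpha > 0$,
\[ 0 \le \mu_\alpha(0,\infty) \le B < \infty. \]
If we fix $\varepsilon > 0$, then there exists $C > 0$ such that $\mu_\alpha(C,\infty) \le \varepsilon$, and such that
\[ \sup_{t \in [0,C]} h(t,\alpha) \le \varepsilon \]
for $\alpha$ small enough. This implies that if $\alpha$ is small,
\[ \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt) \le \varepsilon (B + M). \]
Since $\varepsilon$ was arbitrary, we deduce that
\[ \left| \alpha \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \le \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt) \to 0, \]
as claimed.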
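To prove Proposition 2, we need to recall the following definition of the generalized
inverse of a non-decreasing function.
Definition 1 Let $\bar\Gamma : \mathbb{R}_+ \to \mathbb{R}$ be non-decreasing and RCLL (right continuous with left
limits). Then we define $\bar\Gamma^{-1}$ as
\[ \bar\Gamma^{-1}(u) = \inf\{t \ge 0 : \bar\Gamma(t) > u\}. \]
Proposition 2 Assume A1. Then,
\[ \alpha \bar D(\alpha) \to \lambda/\gamma \quad \text{a.s.} \]
as $\alpha \searrow 0$.
Proof.
\[
\begin{aligned}
\alpha \bar D(\alpha) &= \alpha \int_0^\infty \exp(-\alpha\bar\Gamma(t-))\, \Lambda(dt)
= \alpha \int_0^\infty \int_{\alpha\bar\Gamma(t-)}^{\infty} \exp(-u)\, du\, \Lambda(dt) \\
&= \alpha \int_0^\infty \exp(-u)\, \Lambda\left(\bar\Gamma^{-1}(u/\alpha)\right) du.
\end{aligned}
\]
Now,
\[ \frac{t}{\bar\Gamma^{-1}(t-)} = \frac{\bar\Gamma\left(\bar\Gamma^{-1}(t-)\right)}{\bar\Gamma^{-1}(t-)} = \gamma + \frac{o_p\left(\bar\Gamma^{-1}(t-)\right)}{\bar\Gamma^{-1}(t-)}. \]
Hence
\[ \bar\Gamma^{-1}(t) = \frac{t}{\gamma} + o_p(t) \quad \text{a.s.} \]
This implies that
\[ \alpha \Lambda\left(\bar\Gamma^{-1}(u/\alpha)\right) = \alpha \bar\Gamma^{-1}(u/\alpha)\, \frac{\Lambda\left(\bar\Gamma^{-1}(u/\alpha)\right)}{\bar\Gamma^{-1}(u/\alpha)} \to \frac{u\lambda}{\gamma}. \]
Thus, in order to apply the Dominated Convergence Theorem, it suffices to show that
for almost every sample path, we have that
\[ \left| \alpha \Lambda\left(\bar\Gamma^{-1}(u/\alpha)\right) \right| \le H(u,\omega) \in L^1\left(e^{-u}\, du\right) \]
for some measurable function $H$. However,
\[ \left| \alpha \Lambda\left(\bar\Gamma^{-1}(u/\alpha)\right) \right| = O(u), \]
which suffices to apply dominated convergence.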
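The running supremum and its generalized inverse from Definition 1 are easy to compute for a path sampled on a grid. The following sketch assumes the path is given as arrays of times and values; the function name and the sample path are hypothetical. It returns the first grid time at which the running supremum strictly exceeds the level u, or infinity if that never happens on the grid.

```python
import numpy as np


def generalized_inverse(times, gamma_values, u):
    """inf{t : running_sup(t) > u} for a path sampled on a grid."""
    running_max = np.maximum.accumulate(gamma_values)        # running supremum on the grid
    idx = np.searchsorted(running_max, u, side="right")      # first index with value > u
    return times[idx] if idx < len(times) else np.inf


# Hypothetical sampled path that is not monotone.
times = np.array([0.0, 1.0, 2.0, 3.0])
gamma_values = np.array([0.0, 1.0, 0.5, 2.0])
print(generalized_inverse(times, gamma_values, 0.5))
print(generalized_inverse(times, gamma_values, 1.0))
print(generalized_inverse(times, gamma_values, 2.0))
```

The `side="right"` argument matters: it implements the strict inequality in the definition, so a level that is attained but not exceeded maps to the next time the running supremum increases.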
5.3 The Central Limit Theorem
In this section, we assume that Λ and Γ jointly satisfy a “strong approximation
principle”, namely:
A2 There exists a probability space supporting $(\Lambda,\Gamma)$ and a two-dimensional standard
Brownian motion $(B_1, B_2) = ((B_1(t), B_2(t)) : t \ge 0)$ for which deterministic constants $\lambda \in (-\infty,\infty)$ and $\gamma \in (0,\infty)$ and a matrix $G$ can be found such that
\[ \begin{pmatrix} \Gamma(t) \\ \Lambda(t) \end{pmatrix} = \begin{pmatrix} \gamma \\ \lambda \end{pmatrix} t + G \begin{pmatrix} B_1(t) \\ B_2(t) \end{pmatrix} + o_p\left(t^{1/2}\right) \quad \text{a.s. as } t \to \infty. \]
The entries of the covariance matrix $C = GG^T$ can typically be identified as
follows:
\[
\begin{aligned}
C_{11} &= \lim_{t\to\infty} \frac{1}{t} E(\Lambda(t) - \lambda t)^2, \\
C_{12} &= \lim_{t\to\infty} \frac{1}{t} E(\Lambda(t) - \lambda t)(\Gamma(t) - \gamma t) = C_{21}, \\
C_{22} &= \lim_{t\to\infty} \frac{1}{t} E(\Gamma(t) - \gamma t)^2.
\end{aligned}
\]
The strong approximation principle A2 holds in great generality. The prototypical
example is a sequence of iid (independent and identically distributed) random variables
with finite second moment. Some other cases in which dependence is allowed,
and for which the validity of this principle has been proved, are briefly described
(along with references) next. (Some relevant references on this topic are Philipp and
Stout (1975) and Csörgő and Révész (1981).)
Case 1 (Philipp and Stout (1975) Thm. 4.1) Suppose $X = (X_n : n \ge 1)$ is a strictly stationary sequence of random variables such that $E\left(|X_1|^{2+\delta}\right) < \infty$ for some $\delta > 0$.
Also, assume that $X$ is $\phi$-mixing with
\[ \sum_{k=1}^{\infty} \phi^{1/2}(k) < \infty; \]
then $S(t) = \sum_{k \le t} X_k$ satisfies a strong approximation principle with rate $o_p\left(t^{1/2-\lambda}\right)$ for some $\lambda > 0$ small.
Case 2 Philipp and Stout’s Thm. 8.1 also provides a strong approximation
principle with rate $o_p\left(t^{1/2-\lambda}\right)$ (for $\lambda > 0$ small) when $X = (X_n : n \ge 1)$ is not
necessarily stationary, but $\phi$-strong mixing with
\[ \phi(k) \ll k^{-300(1+2\delta)}, \]
for some $\delta \in (0,2]$. This result also requires moments of order $2 + \delta$ and other
technical assumptions to control the growth of the second moment of the random
elements $X_k$.
Case 3 In the context of a positive recurrent irreducible Markov sequence $(\zeta_n)_{n \ge 1}$ with stationary transition probabilities and countable state space, Thm. 10.1 of
Philipp and Stout (1975) provides strong approximation principles for the case in
which $X_k = f(\zeta_k)$. The results in this case depend on moment conditions of the
type described before for the cumulative reward within a cycle of the Markov process.
Case 4 Horvath (1984a, 1984b and 1986) developed strong approximation theorems
in the context of vector valued cases under higher moment conditions, and
also for the cases of renewal processes and extended renewal processes. Also, Philipp
and Stout (1975) Ch. 12 deals with various types of continuous parameter stochastic
processes (e.g. Gaussian increments and mixing increments; the latter case includes
Lévy processes as a particular case).
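Given A2, we propose the following (refined) Gaussian approximation for $D$,
namely
\[ D \overset{D}{\approx} \lambda/\gamma + \frac{\sigma}{\gamma^{1/2}}\, N(0,1), \tag{4} \]
where
\[ \sigma^2 = \frac{1}{2} \left( C_{11} - 2\frac{\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right). \]
Note that (4) improves upon (2) by providing a normal approximation for the stochastic
variability that is present in the r.v. $D$.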
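As for the approximation (2), we claim that (4) is accurate when the discount rate
is small. In particular, note that (4) suggests the approximation
\[ D(\alpha) \overset{D}{\approx} \frac{\lambda}{\alpha\gamma} + \frac{\sigma}{\sqrt{\alpha\gamma}}\, N(0,1), \tag{5} \]
where
\[ \sigma = \sqrt{\frac{1}{2} \left( C_{11} - 2\frac{\lambda}{\gamma\alpha} \alpha C_{12} + \frac{\lambda^2}{(\gamma\alpha)^2} \alpha^2 C_{22} \right)} = \sqrt{\frac{1}{2} \left( C_{11} - 2\frac{\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right)}. \]
The following central limit theorem (CLT) asserts that the approximation (5) is indeed
valid as $\alpha \searrow 0$.
Theorem 2 If A2 is in force, then
\[ \alpha^{-1/2} \left( \alpha D(\alpha) - \frac{\lambda}{\gamma} \right) \Longrightarrow \sigma N(0,1) \]
as $\alpha \searrow 0$, where
\[ \sigma^2 = \frac{1}{2\gamma} \left( C_{11} - 2\frac{\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right). \]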
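The asymptotic variance in Theorem 2 is a quadratic form in the covariance matrix C, which makes it simple to evaluate once C, λ and γ are available. The sketch below uses hypothetical helper names and numerical inputs, and follows the convention of the text that C11 refers to Λ and C22 to Γ. The second helper turns the limit theorem into the Gaussian approximation for D(α) itself.

```python
import numpy as np


def clt_variance(C, lam, gam):
    """sigma^2 of Theorem 2: (1 / (2 gam)) * v C v^T with v = (1, -lam / gam)."""
    v = np.array([1.0, -lam / gam])
    return float(v @ np.asarray(C, dtype=float) @ v) / (2.0 * gam)


def gaussian_approximation(C, lam, gam, alpha):
    """Mean and standard deviation of the normal approximation to D(alpha)."""
    mean = lam / (alpha * gam)
    std = np.sqrt(clt_variance(C, lam, gam) / alpha)
    return mean, std


# Hypothetical inputs: uncorrelated reward and discount fluctuations.
C = [[8.0, 0.0], [0.0, 1.0]]
print(clt_variance(C, lam=2.0, gam=0.5))
print(gaussian_approximation(C, lam=2.0, gam=0.5, alpha=1e-3))
```

The standard deviation returned by `gaussian_approximation` coincides with the factor multiplying N(0,1) in (5), since dividing the bracketed expression in (5) by 2αγ gives the same quantity.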
Again, just as in the case of the LLN derived previously, the strategy is to show
that the behavior of the random variable
\[ \bar D(\alpha) = \int_{[0,\infty)} \exp\left(-\alpha\bar\Gamma(t-)\right) \Lambda(dt) \]
is comparable to that of $D(\alpha)$ for the purposes of approximation (5). This is the aim
of Proposition 3 below, whose proof follows using the same technique as in Proposition
1 together with an application of the next Lemma.
Lemma 1 Under A2,
\[ \sqrt{\alpha} \left( \bar\Gamma(t/\alpha) - \Gamma(t/\alpha) \right) \to 0 \quad \text{as } \alpha \to 0 \]
uniformly on compact sets.
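Proof.
\[ \left| \sqrt{\alpha} \left( \bar\Gamma(t/\alpha) - \Gamma(t/\alpha) \right) \right| \le \sqrt{\alpha} \max_{0 \le s \le t/\alpha} \left( \gamma(s - t/\alpha) + \Sigma_{2\cdot} \left( B(s) - B(t/\alpha) \right) \right) + \sqrt{\alpha}\, o_p\left( (t/\alpha)^{1/2} \right). \]
Observe that
\[
\begin{aligned}
&\sqrt{\alpha} \max_{0 \le s \le t/\alpha} \left( \gamma(s - t/\alpha) + \Sigma_{2\cdot} \left( B(s) - B(t/\alpha) \right) \right) \\
&\quad = \max_{0 \le s \le t} \left( \frac{\gamma(s-t)}{\sqrt{\alpha}} + \sqrt{\alpha}\, \Sigma_{2\cdot} \left( B(s/\alpha) - B(t/\alpha) \right) \right) \\
&\quad \le \max_{0 \le s \le t} \left( C (t-s)^{1/2+\varepsilon} \alpha^{-\varepsilon} - \gamma (t-s) \alpha^{-1/2} \right) \\
&\quad \le \max_{0 \le u \le t} \left( C u^{1/2+\varepsilon} \alpha^{-\varepsilon} - \gamma u \alpha^{-1/2} \right) \le M \alpha^{1/2} \to 0.
\end{aligned}
\]
The first inequality holds by virtue of the law of the iterated logarithm (LIL), and from
the last inequality we can see that the convergence holds uniformly on compact sets.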
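The final bound in the proof of Lemma 1 rests on the fact that the maximum of C u^(1/2+ε) α^(−ε) − γ u α^(−1/2) over u scales like α^(1/2): substituting u = αv factors out α^(1/2). The sketch below checks this scaling numerically on a grid. The constants C, γ, ε and t are arbitrary illustrative choices and are not values from the text.

```python
import numpy as np


def max_drift_gap(alpha, C=1.0, gam=1.0, eps=0.1, t=1.0, n=200_001):
    """Grid maximum of C u^(1/2+eps) alpha^(-eps) - gam u alpha^(-1/2) over [0, t]."""
    u = np.linspace(0.0, t, n)
    return np.max(C * u ** (0.5 + eps) * alpha ** (-eps) - gam * u * alpha ** (-0.5))


alphas = [1e-1, 1e-2, 1e-3]
ratios = [max_drift_gap(a) / np.sqrt(a) for a in alphas]
print(ratios)   # roughly constant if the maximum is of order alpha^(1/2)
```

The ratios stay close to a common constant, which plays the role of M in the proof for these particular choices of C, γ and ε.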
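Proposition 3 Under A2,
\[ \alpha^{1/2} \left( D(\alpha) - \bar D(\alpha) \right) \to 0 \quad \text{a.s.} \]
as $\alpha \searrow 0$.
Proof. As in the proof of Proposition 1,
\[ D(\alpha) = \bar D(\alpha) + \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt). \]
Hence, we must show
\[ \alpha^{1/2} \left| \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right| \to 0 \quad \text{a.s. as } \alpha \searrow 0. \]
Now, observe that
\[
\alpha^{1/2} \left| \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, \Lambda(dt) \right|
\le \alpha^{1/2} \int_0^\infty \int_{\alpha\Gamma(t-)}^{\alpha\bar\Gamma(t-)} \exp(-u)\, du\, |\Lambda|(dt)
= \int_0^\infty h(t,\alpha)\, \mu_\alpha(dt),
\]
where
\[ h(t,\alpha) = \exp\left(-\frac{\gamma}{2} t + \alpha o_p(t/\alpha)\right) \frac{1}{\sqrt{\alpha}} \int_0^{\alpha\bar\Gamma(t/\alpha-) - \alpha\Gamma(t/\alpha-)} \exp(-u)\, du, \]
and
\[ \mu_\alpha(dt) = \alpha \exp\left(-\frac{\gamma}{2} t\right) |\Lambda|\left(\frac{dt}{\alpha}\right). \]
Observe that
\[ 0 \le h(t,\alpha) \le M, \]
and, by virtue of the previous lemma, $h(t,\alpha) \to 0$ as $\alpha \to 0$ uniformly on compact intervals.
The rest of the proof follows by repeating the same steps used in Proposition
1.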
In light of the previous proposition, Theorem 2 follows by combining the last result
together with Proposition 4 below. The following lemma will be used in the proof of
Proposition 4.
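Lemma 2 Let $\Sigma'$ be a $d$-dimensional vector and $\tau(t)$ such that
\[ \tau(t) = \frac{t}{\gamma} + o_p(t) \]
and suppose that $B(t)$ is a $d$-dimensional Brownian motion. Then,
\[ \alpha^{1/2} \int_0^\infty e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du \Rightarrow N\left(0, \sigma^2\right), \]
where $\sigma^2 = \frac{1}{2\gamma} \Sigma' \Sigma'^{T}$.
Proof.
\[
\alpha^{1/2} \int_0^\infty e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du
= \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du + \alpha^{1/2} \int_{t/\alpha}^\infty e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du.
\]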
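Observe that
\[
\left| \alpha^{1/2} \int_{t/\alpha}^\infty e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du \right|
\le \alpha^{1/2} \int_{t/\alpha}^\infty e^{-u}\, \|\Sigma'\|\, |B(\tau(u/\alpha))|\, du
\le \alpha^{-\varepsilon} \int_{t/\alpha}^\infty C e^{-u} u^{1/2+\varepsilon}\, du \quad \text{a.s.},
\]
by virtue of the LIL (for all $\varepsilon > 0$ and for some $C > 0$), and the integral above goes
to zero faster than $\alpha^{\varepsilon}$. Hence, in order to show weak convergence, it suffices to study
(by virtue of Slutsky’s lemma)
\[
\begin{aligned}
\alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' B(\tau(u/\alpha))\, du
&= \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' \left( B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right) du \\
&\quad + \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' B(u/\gamma\alpha)\, du.
\end{aligned}
\]
We first show that
\[ \left| \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' \left( B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right) du \right| \Longrightarrow 0. \]
Observe that (here $\mu = 1/\gamma$)
\[
\sup_{0 \le u \le t/\alpha} \alpha e^{-u} \left| \tau(u/\alpha) - \mu u/\alpha \right| = \sup_{0 \le u \le t/\alpha} e^{-u} \left| \alpha o_p(u/\alpha) \right|
\le o(1) \sup_{0 \le u \le \infty} e^{-u} u;
\]
therefore, for every $\delta > 0$ we can choose $\alpha_0$ such that if $\alpha < \alpha_0$,
\[ P\left( \sup_{0 \le u \le t/\alpha} e^{-u} \left| \tau(u/\alpha) - \mu u/\alpha \right| > \delta/\alpha \right) \le \delta. \]
Let us define
\[ A_\delta = \Big\{ \omega : \sup_{0 \le u \le t/\alpha} e^{-u} \left| \tau(u/\alpha) - \mu u/\alpha \right| \le \delta/\alpha \Big\} \]
and
\[ A_\delta^k = \Big\{ \omega : \sup_{k/\alpha \le u \le (k+1)/\alpha} e^{-u} \left| \tau(u/\alpha) - \mu u/\alpha \right| \le \delta/\alpha \Big\}, \]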
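then
\[
\begin{aligned}
&P\left( \left| \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' \left( B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right) du \right| > \varepsilon \right) \\
&\quad \le P\left( \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \varepsilon \right) \\
&\quad \le P\left( \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \varepsilon;\ A_\delta \right) + \delta.
\end{aligned}
\]
Notice that
\[
\begin{aligned}
&P\left( \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \varepsilon;\ A_\delta \right) \\
&\quad \le \sum_{k=0}^{\lfloor t \rfloor} P\left( \alpha^{1/2} \int_{k/\alpha}^{(k+1)/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \frac{\varepsilon}{t};\ A_\delta \right) \\
&\quad \le \sum_{k=0}^{\lfloor t \rfloor} P\left( \alpha^{1/2} \int_{k/\alpha}^{(k+1)/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \frac{\varepsilon}{t};\ A_\delta^k \right).
\end{aligned}
\]
The $k$th term in the last expression is less than or equal to
\[
\begin{aligned}
&P\left( \alpha^{1/2} \int_{k/\alpha}^{(k+1)/\alpha} e^{-u}\, \|\Sigma'\| \sup_{0 \le s \le \frac{2\delta}{\alpha} e^{k/\alpha}} |B(s)|\, du > \frac{\varepsilon}{2t} \right) \\
&\quad = P\left( \int_{k/\alpha}^{(k+1)/\alpha} e^{-u} \sup_{0 \le s \le 2\delta e^{k/\alpha}} |B(s)|\, du > \frac{\varepsilon}{2tk} \right) \\
&\quad \le \frac{t}{2\varepsilon c}\, e^{-k/\alpha} \left( 1 - e^{-1/\alpha} \right) E\left( \sup_{0 \le s \le 2\delta e^{k/\alpha}} |B(s)| \right) \\
&\quad = \frac{t}{2\varepsilon c}\, e^{-k/\alpha} \left( 1 - e^{-1/\alpha} \right) E\left( \sup_{0 \le u \le 1} \sqrt{2\delta e^{k/\alpha}}\, |B(u)| \right)
= \frac{M t \sqrt{\delta}}{\varepsilon}\, e^{-k/2\alpha} \left( 1 - e^{-1/\alpha} \right).
\end{aligned}
\]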
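This implies that
\[
\begin{aligned}
&P\left( \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \varepsilon;\ A_\delta \right) \\
&\quad \le \frac{M t \sqrt{\delta}}{\varepsilon} \left( 1 - e^{-1/\alpha} \right) \sum_{k=0}^{\infty} e^{-k/2\alpha}
= \frac{M t \sqrt{\delta}}{\varepsilon}\, \frac{1 - e^{-1/\alpha}}{1 - e^{-1/2\alpha}}.
\end{aligned}
\]
Therefore,
\[ \lim_{\alpha \to 0} P\left( \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \|\Sigma'\|\, \left\| B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right\| du > \varepsilon;\ A_\delta \right) \le \frac{2 M t \sqrt{\delta}}{\varepsilon}. \]
Since $\delta$ was arbitrary, we conclude that
\[ \left| \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' \left( B(\tau(u/\alpha)) - B(u/\gamma\alpha) \right) du \right| \overset{P}{\to} 0; \]
in particular the last term goes to zero weakly. Finally, we observe that
\[ \alpha^{1/2} \int_0^{t/\alpha} e^{-u}\, \Sigma' B(u/\gamma\alpha)\, du \overset{D}{=} \int_0^{t/\alpha} e^{-u}\, \Sigma' B(u/\gamma)\, du \Longrightarrow \int_0^\infty e^{-u}\, \Sigma' B(u/\gamma)\, du, \]
which is Gaussian, with mean zero and variance (which can be computed using integration
by parts and the Ito isometry) $\sigma^2 = \frac{1}{2\gamma} \Sigma' \Sigma'^{T}$.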
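The limiting variance at the end of the proof can also be checked by quadrature in the scalar case, using only the covariance of Brownian motion: the variance of the integral of exp(−u) B(u/γ) equals the double integral of exp(−u−v) min(u, v)/γ. The sketch below evaluates this double integral with the trapezoid rule; taking the vector to be the scalar 1, the grid size and the truncation point are assumptions for illustration.

```python
import numpy as np


def limit_variance(gam, u_max=40.0, n=2001):
    """Trapezoid-rule value of the double integral of e^{-u-v} min(u, v) / gam."""
    u = np.linspace(0.0, u_max, n)
    w = np.exp(-u)
    w[0] *= 0.5
    w[-1] *= 0.5                           # trapezoid end-point weights
    w *= u[1] - u[0]
    cov = np.minimum.outer(u, u) / gam     # Cov(B(u/gam), B(v/gam)) = min(u, v) / gam
    return float(w @ cov @ w)


gam = 0.5
print(limit_variance(gam), 1.0 / (2.0 * gam))
```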
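Proposition 4 If A2 is in force,
\[ \alpha^{-1/2} \left( \alpha \bar D(\alpha) - \frac{\lambda}{\gamma} \right) \Longrightarrow \sigma N(0,1) \quad \text{as } \alpha \to 0, \]
where
\[ \sigma^2 = \frac{1}{2\gamma} \begin{bmatrix} 1 & -\frac{\lambda}{\gamma} \end{bmatrix} C \begin{bmatrix} 1 \\ -\frac{\lambda}{\gamma} \end{bmatrix} = \frac{1}{2\gamma} \left( C_{11} - 2\frac{\lambda}{\gamma} C_{12} + \frac{\lambda^2}{\gamma^2} C_{22} \right). \]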
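Proof. We write
\[ \alpha \bar D(\alpha) - \frac{\lambda}{\gamma} = \int_0^\infty e^{-u} \left( \alpha \Lambda\left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right) \right) - \frac{\lambda}{\gamma} \right) du. \]
Let $W(\alpha,u) = \alpha \Lambda\left( \bar\Gamma^{-1}(u/\alpha) \right) - \frac{\lambda}{\gamma}$. Using the strong approximation assumption
we can see that:
\[
\begin{aligned}
W(\alpha,u) &= \left( \frac{\lambda}{\gamma} u - \frac{\lambda}{\gamma} \right) - \alpha \frac{\lambda}{\gamma}\, G_{2\cdot} B\left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right) \right) + \alpha\, G_{1\cdot} B\left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right) \right)
+ \alpha\, o_p\left( \left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right) \right)^{1/2} \right) \\
&= \left( \frac{\lambda}{\gamma} u - \frac{\lambda}{\gamma} \right) + \alpha\, \Sigma' B\left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right) \right) + \alpha\, o_p\left( \bar\Gamma^{-1}\left( \tfrac{u-}{\alpha} \right)^{1/2} \right),
\end{aligned}
\]
where
\[ \Sigma' = \begin{bmatrix} 1 & -\frac{\lambda}{\gamma} \end{bmatrix} G. \]
Integrating out $\alpha^{-1/2} W(\alpha,u)$, we obtain
\[ \alpha^{-1/2} \int_0^\infty e^{-u} W(\alpha,u)\, du = I_1(\alpha) + \alpha^{1/2} \int_0^\infty e^{-u}\, \Sigma' B\left( \bar\Gamma^{-1}(u/\alpha) \right) du + I_2(\alpha). \]
We analyze each of these terms one by one. First, it is clear that
\[ I_1 = \alpha^{-1/2} \int_0^\infty e^{-u} \left( \frac{\lambda}{\gamma} u - \frac{\lambda}{\gamma} \right) du = 0. \]
Next,
\[ I_2 = \alpha^{-1/2} \int_0^\infty e^{-u}\, \alpha\, o_p\left( \left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2} \right) du = \int_0^\infty e^{-u}\, \alpha^{1/2}\, o_p\left( \left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2} \right) du. \]
Observe that
\[ \sqrt{\alpha}\, o_p\left( \left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2} \right) = u^{1/2} \sqrt{\frac{\alpha}{u}\, \bar\Gamma^{-1}(u/\alpha)}\; \frac{o_p\left( \left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2} \right)}{\left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2}} \to 0 \quad \text{a.s. as } \alpha \to 0. \]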
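We wish to apply dominated convergence. Notice that
\[ \left| \sqrt{\alpha}\, o_p\left( \left( \bar\Gamma^{-1}(u/\alpha) \right)^{1/2} \right) \right| = O\left( u^{1/2} \right) \le H(u,\omega) \in L\left( e^{-u}\, du \right); \]
this implies that $I_2(\alpha) \to 0$ a.s. Therefore, we obtain that
\[ \alpha^{-1/2} \int_0^\infty e^{-u} W(\alpha,u)\, du - \alpha^{1/2} \int_0^\infty e^{-u}\, \Sigma' B\left( \bar\Gamma^{-1}(u/\alpha) \right) du \Rightarrow 0, \]
which combined with the previous lemma and standard converging together results
yields the conclusion of the proposition.
Finally, Theorem 2 is a direct consequence of Propositions 3 and 4.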
5.4 Edgeworth Expansion
In this section, we provide refined versions of the approximations given in the previous
sections. The refined approximation takes the form of an Edgeworth expansion for
the distribution of D. We shall derive these approximations in the iid setting for the
discrete time case and under Markovian assumptions for the continuous time case.
More precisely, in the discrete time case, motivated by the applications to ARCH processes described in Section 2, we consider
\[
D = \sum_{k=0}^{\infty}\exp\left(-\sum_{j=0}^{k-1}Z_j\right)X_k,
\]
where $(X_k,Z_k)_{k\ge 1}$ is a sequence of iid random vectors satisfying certain assumptions to be described later (see assumptions ED1 to ED4 below); while in the continuous time context, we work with
\[
D = \int_0^\infty \exp\left(-\int_0^t \gamma(Y(s))\,ds\right)d\Lambda(t),
\]
where $Y = (Y_s : s\ge 0)$ is a suitably defined homogeneous Markov process and $\Lambda$ is a stationary independent increment process. This setting is commonly used in the risk theory example discussed in Section 2 (see Ch. 7 of Asmussen (2001)).
5.4.1 The discrete time setting
In this section, we shall consider the following set of assumptions.
ED1 Assume that $Z_1\ge 0$, $E(Z_1)=\gamma<\infty$, $E(Z_1^2)=\mu_Z^{(2)}<\infty$, and $E(|Z_1|^3)<\infty$. Let $\sigma_Z^2$ be the variance of $Z_1$ and $\kappa_Z^{(3)}$ its third order cumulant, which can be written as
\[
\kappa_Z^{(3)} = \mu_Z^{(3)} - 3\mu_Z^{(2)}\gamma + 2\gamma^3.
\]
ED2 Suppose that $X_1$ has non-lattice distribution with $E(X_1)=\lambda$, $\operatorname{Var}(X_1)=\sigma_X^2$, and $E(|X_1|^3)<\infty$. Let $E(X_1^3)=\mu_X^{(3)}$ and write $\kappa_X^{(3)}$ to denote the third order cumulant of $X_1$. In addition, assume that the distribution of $X_1$ given $Z_1$ is non-lattice.
ED3 Suppose that $E\bigl(|X_1|^j|Z_1|^k\bigr)<\infty$ for $0<j+k\le 3$ and, for $j,k\ge 1$, denote $\mu_{jk}=E\bigl(X_1^jZ_1^k\bigr)$. Moreover, let us define
\[
\delta(\theta,Z_1) = \left|E\bigl(e^{i\theta X_1}\,\big|\,Z_1\bigr)\right|
\]
and assume that
\[
\lim_{h\to 0}\ \sup_{\varepsilon\le|\theta|\le 1/\varepsilon}\frac{P\bigl(\delta(\theta,Z_1)>1-h\bigr)}{h}<\infty, \tag{6}
\]
for $\varepsilon>0$.
Condition (6) is technical, and may be seen as a form of strong non-latticity of $X_1$ given $Z_1$. Notice that, in the important special case in which the $X_k$'s are independent of the $Z_k$'s, assumption ED3 is an immediate consequence of ED2. Indeed, if $X_1$ is non-lattice, we have that $\delta(\theta,Z_1)=\delta(\theta)<1$. Therefore, for all $h>0$ sufficiently small, $\delta(\theta)<1-h$. This implies that the limit in (6) is zero.
As a remark, we also note that, alternatively, the non-negativity of $Z_1$ required in assumption ED1 can be replaced by the existence of exponential moments. We record this observation as our alternative assumption ED1'.
ED1' Assume that $E\exp(\rho Z_1)<\infty$ for $\rho$ in a vicinity of the origin.
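Under these assumptions, we improve the approximation (4) by providing an Edgeworth expansion for the distribution of $D$ when $(X_k,Z_k)_{k\ge 1}$ is a sequence of i.i.d. random vectors and the discount rate $\gamma$ is small. In particular, by defining
\[
\sigma^2 = \frac{1}{2}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right),
\]
we can write the approximation proposed as
\begin{align*}
P(D\le y) \approx{}& P\bigl(N(\lambda/\gamma,\sigma^2/\gamma)\le y\bigr) - \sqrt{\gamma}\,\beta_1\,\eta\!\left((y-\lambda/\gamma)\frac{\sqrt{\gamma}}{\sigma}\right) \tag{7}\\
&- \frac{\sqrt{\gamma}}{18}\,\beta_2\,H\!\left((y-\lambda/\gamma)\frac{\sqrt{\gamma}}{\sigma}\right).
\end{align*}
The constants $\beta_1$ and $\beta_2$ satisfy
\[
\beta_1 = \frac{\mu_Z^{(2)}\lambda}{2\gamma^2\sigma},
\]
\begin{align*}
\sigma^3\beta_2 ={}& \kappa_X^{(3)} - 2\kappa_{21}\frac{\lambda}{\gamma} + 3\kappa_{12}\frac{\lambda^2}{\gamma^2} - \frac{3\kappa_{11}}{\gamma}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right)\\
&+ 3\sigma_Z^2\frac{\lambda}{\gamma^2}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right) - \frac{\kappa_Z^{(3)}\lambda^3}{\gamma^3},
\end{align*}
with
\begin{align*}
\kappa_{12} &= \mu_{12} + \mu_{11} - \mu_Z^{(2)} - 3\gamma\mu_{11} + 2\gamma^2\lambda,\\
\kappa_{21} &= \mu_{21} + \mu_{11} - \mu_X^{(2)} - 3\lambda\mu_{11} + 2\lambda^2\gamma,\\
\kappa_{11} &= \mu_{11} - \lambda\gamma = \sigma_{XZ} \triangleq \operatorname{cov}(X,Z);
\end{align*}
and
\[
\eta(y) = \frac{1}{\sqrt{2\pi}}\exp\bigl(-y^2/2\bigr), \qquad H(y) = \bigl(y^2-1\bigr)\eta(y).
\]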
The application of the approximation (7) requires estimation of the joint moments $\mu_{ij}$, which can be easily done (even non-parametrically) using standard methods. Also, observe that in the case in which the sequences $(X_k)_{k>0}$ and $(Z_k)_{k>0}$ are independent, the constants $\sigma^2$, $\beta_1$ and $\beta_2$ take the simplified form
\[
\sigma^2 = \frac{1}{2}\left(\sigma_X^2 + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right)
\]
and
\[
\beta_1 = \frac{\mu_Z^{(2)}\lambda}{2\gamma^2\sigma}, \qquad
\beta_2 = \frac{1}{\sigma^3}\left(\kappa_X^{(3)} + 3\frac{\sigma_Z^2\lambda\sigma_X^2}{\gamma^2} - \frac{\kappa_Z^{(3)}\lambda^3}{\gamma^3} + 3\frac{\sigma_Z^4\lambda^3}{\gamma^4}\right).
\]
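In order to understand the nature of approximation (7), we introduce a small scaling parameter $\alpha>0$ and define
\[
D(\alpha) = \sum_{k=0}^{\infty}\exp\left(-\alpha\sum_{j=0}^{k-1}Z_j\right)X_k.
\]
Approximation (7) then becomes (since the quantities $\sigma$, $\beta_1$ and $\beta_2$ are not affected by the scaling)
\begin{align*}
P(D(\alpha)\le y) \approx{}& P\bigl(N(\lambda/(\alpha\gamma),\sigma^2/(\alpha\gamma))\le y\bigr) - \sqrt{\gamma\alpha}\,\beta_1\,\eta\!\left((y-\lambda/(\gamma\alpha))\frac{\sqrt{\gamma\alpha}}{\sigma}\right) \tag{8}\\
&- \frac{\sqrt{\gamma\alpha}}{18}\,\beta_2\,H\!\left((y-\lambda/(\gamma\alpha))\frac{\sqrt{\gamma\alpha}}{\sigma}\right).
\end{align*}
Or, in other words,
\[
P\bigl(\sqrt{\alpha}\,(D(\alpha)-\lambda/(\alpha\gamma))\le y\bigr) \approx P\bigl(N(0,\sigma^2/\gamma)\le y\bigr) - \sqrt{\gamma\alpha}\,\beta_1\,\eta\!\left(\frac{\sqrt{\gamma}}{\sigma}y\right) - \frac{\sqrt{\gamma\alpha}}{18}\,\beta_2\,H\!\left(\frac{\sqrt{\gamma}}{\sigma}y\right)
\]
with an error of order $o(\sqrt{\alpha})$ (uniformly in $y$). The precise mathematical statement concerning the previous approximations is the content of Theorem 3 below, which provides the first order correction in the Edgeworth expansion for $D(\alpha)$. However, before moving on to Theorem 3, we present a simple example to illustrate the accuracy of the approximations proposed.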
Example 1 Suppose that $X_1\sim\lambda\exp(1)$ and $Z_1\sim\gamma\exp(1)$. Under these assumptions it follows (see Gjessing and Paulsen (1997)) that
\[
D = \sum_{k=0}^{\infty}\exp\left(-\sum_{j=0}^{k-1}Z_j\right)X_k \sim \lambda\,\Gamma(1/\gamma+1,1),
\]
where $\Gamma(1/\gamma+1,1)$ represents a random variable with gamma distribution with the parameters given. In order to illustrate the numerical fit of the approximation provided, we consider the case in which $\lambda=1$, with $\gamma=.1$ and $\gamma=.5$ respectively. The following graphs compare the CLT and Edgeworth approximations developed against the true distribution of $D$:
[Figure: CLT and Edgeworth Based Approximations. Two panels, "Approximation for D (Exponential Case EZ=.1)" and "Approximation for D (Exponential Case EZ=.5)", each plotting the True, CLT and Edgeworth distribution functions (vertical axis from 0 to 1, horizontal axis from -2.8 to 7.1).]
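The comparison in Example 1 can be reproduced numerically. The following is a minimal sketch, not the code used to produce the original graphs: it assumes that $X_1$ and $Z_1$ are independent (so the simplified constants for the independent case apply), that $1/\gamma$ is an integer (so the gamma distribution function has the closed Erlang form), and all function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def std_normal_pdf(z):
    """Standard normal density, written eta(z) in the text."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def erlang_cdf(x, shape):
    """P(Gamma(shape, 1) <= x) for a positive integer shape."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0  # k = 0 term of sum_{k < shape} x^k / k!
    for k in range(1, shape):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total

def constants_exponential_case(lam, gam):
    """sigma, beta1, beta2 for independent X ~ lam*Exp(1), Z ~ gam*Exp(1)."""
    var_x, var_z = lam ** 2, gam ** 2            # variances
    k3_x, k3_z = 2.0 * lam ** 3, 2.0 * gam ** 3  # third cumulants
    m2_z = 2.0 * gam ** 2                        # E[Z^2]
    sigma = math.sqrt(0.5 * (var_x + lam ** 2 / gam ** 2 * var_z))
    beta1 = m2_z * lam / (2.0 * gam ** 2 * sigma)
    beta2 = (k3_x + 3.0 * var_z * lam * var_x / gam ** 2
             - k3_z * lam ** 3 / gam ** 3
             + 3.0 * var_z ** 2 * lam ** 3 / gam ** 4) / sigma ** 3
    return sigma, beta1, beta2

def clt_cdf(y, lam, gam):
    """Leading term of (7): P(N(lam/gam, sigma^2/gam) <= y)."""
    sigma, _, _ = constants_exponential_case(lam, gam)
    return std_normal_cdf((y - lam / gam) * math.sqrt(gam) / sigma)

def edgeworth_cdf(y, lam, gam):
    """Right-hand side of approximation (7)."""
    sigma, beta1, beta2 = constants_exponential_case(lam, gam)
    z = (y - lam / gam) * math.sqrt(gam) / sigma
    eta = std_normal_pdf(z)
    hermite_term = (z * z - 1.0) * eta  # H(z) in the text
    return (std_normal_cdf(z)
            - math.sqrt(gam) * beta1 * eta
            - math.sqrt(gam) / 18.0 * beta2 * hermite_term)

def true_cdf(y, lam, gam):
    """Exact P(D <= y), using D ~ lam * Gamma(1/gam + 1, 1)."""
    shape = round(1.0 / gam) + 1  # assumes 1/gam is an integer
    return erlang_cdf(y / lam, shape)

if __name__ == "__main__":
    for gam in (0.1, 0.5):
        print(f"gamma = {gam}")
        for y in (1.0, 3.0, 8.0, 11.0, 14.0):
            print(f"  y={y:5.1f}  true={true_cdf(y, 1.0, gam):.4f}"
                  f"  clt={clt_cdf(y, 1.0, gam):.4f}"
                  f"  edgeworth={edgeworth_cdf(y, 1.0, gam):.4f}")
```

In this exponential case the constants reduce to $\sigma=\lambda$, $\beta_1=1$ and $\beta_2=6$, so the first correction shifts the mean from $\lambda/\gamma$ to $\lambda/\gamma+\lambda$ and the second accounts for the skewness of the gamma law.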
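We now provide the rigorous statement supporting approximation (7).
Theorem 3 If the set of assumptions ED1 (or ED1') to ED4 are in force, then
\begin{align*}
P\left(\sqrt{\alpha}\left(D(\alpha)-\frac{\lambda}{\gamma\alpha}\right)\le y\right) ={}& P\left(N\left(0,\frac{\sigma^2}{\gamma}\right)\le y\right) - \sqrt{\alpha}\,\beta_1\,\eta(y) \tag{9}\\
&- \frac{\sqrt{\alpha}}{18}\,\beta_2\,\gamma\,H(y) + G_\alpha(y);
\end{align*}
where $G_\alpha$ represents a signed measure with $G_\alpha^+(\mathbb{R})+G_\alpha^-(\mathbb{R}) \triangleq \|G_\alpha(dy)\| = o(\sqrt{\alpha})$.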
In order to prove this theorem, we need some preliminary results. As is standard in obtaining Edgeworth expansions via Fourier analytic methods (see Feller (1968) p. 512), one first proceeds to obtain an asymptotic expansion for the cumulant generating function of interest. Hence, our first result provides an asymptotic expansion for $\psi_\alpha(\theta) \triangleq \log E\exp\bigl(i\theta\alpha^{-1/2}(\alpha D(\alpha)-\lambda/\gamma)\bigr)$ in powers of $\sqrt{\alpha}$.
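Lemma 3 Assume ED1 (or ED1') to ED3. Then, there exists $\delta>0$ for which we have that
\begin{align*}
\psi_\alpha(\theta) ={}& \left(\frac{\mu_Z^{(2)}\lambda}{2\gamma^2}+O(\alpha)\right)i\theta\,\alpha^{1/2}\\
&+ \left(\frac{1}{2\gamma\alpha}\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right)+O(1)\right)\frac{(i\theta)^2}{2}\,\alpha\\
&+ \left(\frac{C_3}{\alpha}+O(1)\right)\frac{(i\theta)^3}{6}\,\alpha^{3/2} + o\bigl(\alpha^{1/2}\bigr),
\end{align*}
(uniformly in $\theta\in(-\delta,\delta)$, $\delta>0$) where
\begin{align*}
3\gamma C_3 ={}& \kappa_X^{(3)} - 2\kappa_{21}\frac{\lambda}{\gamma} + 3\kappa_{12}\frac{\lambda^2}{\gamma^2}\\
&+ 3\left(\sigma_Z^2\frac{\lambda}{\gamma^2}-\frac{\sigma_{XZ}}{\gamma}\right)\left(\sigma_X^2 - 2\frac{\lambda}{\gamma}\sigma_{XZ} + \frac{\lambda^2}{\gamma^2}\sigma_Z^2\right) - \frac{\kappa_Z^{(3)}\lambda^3}{\gamma^3}.
\end{align*}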
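Proof. The idea is to write
\[
\phi_\alpha(\theta) = \exp\bigl(-i\theta\lambda/(\gamma\sqrt{\alpha})\bigr)\,\phi\bigl(\theta\sqrt{\alpha},\alpha\bigr),
\]
where $\phi_\alpha(\theta)\triangleq\exp(\psi_\alpha(\theta))$ and $\phi(\theta,\alpha)\triangleq E\exp(i\theta D(\alpha))$. Notice that $\phi(\theta,\alpha)$ satisfies
\[
\phi(\theta,\alpha) = E\bigl(\exp\bigl(i\theta(X_1+\exp(-\alpha Z_1)D_1(\alpha))\bigr)\bigr),
\]
with $D_1(\alpha)$ independent of $(X_1,Z_1)$. Thus, we have
\begin{align*}
\phi(\theta,\alpha) &= E\bigl(\exp\bigl(i\theta(X_1+\exp(-\alpha Z_1)D_1(\alpha))\bigr)\bigr)\\
&= E\bigl(E\bigl(\exp\bigl(i\theta(X_1+\exp(-\alpha Z_1)D_1(\alpha))\bigr)\,\big|\,X_1,Z_1\bigr)\bigr)\\
&= E\bigl(E\bigl(\exp(i\theta X_1)\,\phi(\theta\exp(-\alpha Z_1),\alpha)\,\big|\,X_1,Z_1\bigr)\bigr)\\
&= E\bigl(\exp(i\theta X_1)\,\phi(\theta\exp(-\alpha Z_1),\alpha)\bigr).
\end{align*}
Using the Taylor development for characteristic functions (see Feller (1968) App. Sec. XV.5 and Breiman (1992) Prop. 8.44) applied to $\phi(\theta,\alpha)$ and $\phi_\alpha(\theta)$, together with the moment conditions implied by assumptions ED1 (or ED1') to ED3, we arrive at the expression stated for $\psi_\alpha(\theta)$.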
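To illustrate how the coefficients in Lemma 3 arise, the first-order term can be recovered from the mean alone. The sketch below is a hedged worked example rather than part of the original proof: it assumes $E D(\alpha)<\infty$ and introduces the shorthand $m(-\alpha)\triangleq E e^{-\alpha Z_1}$. Taking expectations in the distributional identity $D(\alpha)\overset{D}{=}X_1+e^{-\alpha Z_1}D_1(\alpha)$, with $D_1(\alpha)$ independent of $(X_1,Z_1)$, gives:

```latex
\begin{align*}
E D(\alpha) &= \lambda + m(-\alpha)\,E D(\alpha)
  \quad\Longrightarrow\quad
  E D(\alpha) = \frac{\lambda}{1-m(-\alpha)},\\
1-m(-\alpha) &= \alpha\gamma - \frac{\alpha^2}{2}\mu_Z^{(2)} + O(\alpha^3)
  \qquad\text{(using } E|Z_1|^3<\infty\text{)},\\
E D(\alpha) &= \frac{\lambda}{\alpha\gamma}
  \left(1-\frac{\alpha\,\mu_Z^{(2)}}{2\gamma}+O(\alpha^2)\right)^{-1}
  = \frac{\lambda}{\alpha\gamma} + \frac{\mu_Z^{(2)}\lambda}{2\gamma^2} + O(\alpha),\\
E\left[\alpha^{-1/2}\left(\alpha D(\alpha)-\frac{\lambda}{\gamma}\right)\right]
 &= \alpha^{1/2}\,\frac{\mu_Z^{(2)}\lambda}{2\gamma^2} + O\bigl(\alpha^{3/2}\bigr).
\end{align*}
```

The last line agrees with the coefficient of $i\theta\,\alpha^{1/2}$ in the expansion of $\psi_\alpha(\theta)$, since the first cumulant is the mean.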
Lemma 4 Under assumptions ED1 (or ED1') to ED4, $\phi(\theta,\alpha)\triangleq E\exp(i\theta D(\alpha))$ satisfies
\[
|\phi(\theta,\alpha)| = o\bigl(\alpha^{1/2}\bigr)
\]
as $\alpha\to 0$, uniformly in $\theta$ over compact sets not containing the origin.
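Proof. Let $\phi_X(\theta,Z_1)=E\bigl(e^{i\theta X_1}\,\big|\,Z_1\bigr)$, and let $T_\alpha=\inf\{k: S_k>1/\alpha\}$. Then,
\begin{align*}
|\phi(\theta,\alpha)| &= \left|E\left(E\left(\exp\left(i\theta\sum_{k=1}^{\infty}X_k\exp(-\alpha S_{k-1})\right)\,\Bigg|\,Z\right)\right)\right|\\
&= \left|E\left(\prod_{k=1}^{\infty}\phi_X\bigl(\theta e^{-\alpha S_{k-1}},Z_k\bigr)\right)\right|\\
&\le E\left(\prod_{k=1}^{\infty}\bigl|\phi_X\bigl(\theta e^{-\alpha S_{k-1}},Z_k\bigr)\bigr|\right)\\
&\le E\left(\prod_{k=1}^{T_\alpha-1}\bigl|\phi_X\bigl(\theta e^{-\alpha S_{k-1}},Z_k\bigr)\bigr|\right)\\
&\le E\left(\prod_{k=1}^{T_\alpha-1}|\Delta(\theta,Z_k)|\right),
\end{align*}
where $\Delta(\theta,Z_1)=\sup\{|\phi_X(\theta^*,Z_1)|: |\theta^*|>|\theta e^{-1}|\}$. Since the distribution of $X_1$ given $Z_1$ is non-lattice, we must have that $0<\Delta(\theta,Z_1)<1$. So,
\begin{align*}
|\phi(\theta,\alpha)| &\le E\left(\prod_{k=1}^{T_\alpha-1}|\Delta(\theta,Z_k)|\right)\\
&\le P\left(\alpha\left|T_\alpha-\frac{1}{\alpha\gamma}\right|>\varepsilon\right) + E\left(\prod_{k=1}^{T_\alpha-1}|\Delta(\theta,Z_k)|;\ \alpha\,|T_\alpha-1/(\alpha\gamma)|\le\varepsilon\right)\\
&\le P\left(\alpha\left|T_\alpha-\frac{1}{\alpha\gamma}\right|>\varepsilon\right) + E\left(|\Delta(\theta,Z_1)|^{\frac{1}{\alpha}(1/\gamma-\varepsilon)-1}\right).
\end{align*}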
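Since condition ED1 (or ED1') implies that $0<EZ_1<\infty$ and $\operatorname{Var}(Z_1)<\infty$, we have that $\bigl(\alpha^{1/2}\bigl|T_\alpha-\frac{1}{\alpha\gamma}\bigr|\bigr)^2$ is uniformly integrable (see Gut (1988) p. 92). In particular, this implies, using Chebyshev's inequality, that
\[
P\left(\alpha\left|T_\alpha-\frac{1}{\alpha\gamma}\right|>\varepsilon\right) = O(\alpha).
\]
Finally, if we choose $\varepsilon>0$ small enough so that $c\triangleq 1/\gamma-\varepsilon>0$, we must show (for $\theta$ not in a neighborhood of the origin) that
\[
E\bigl(|\Delta(\theta,Z_1)|^{c/\alpha}\bigr) = o\bigl(\sqrt{\alpha}\bigr).
\]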
Let $W=-\log(|\Delta(\theta,Z_1)|)$ and $\beta=c/\alpha$. Then,
\[
E\bigl(|\Delta(\theta,Z_1)|^{\beta}\bigr) = E(\exp(-\beta W)) = \int_0^\infty \exp(-u)\,P(u/\beta>W)\,du.
\]
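The integral representation above is a standard identity for nonnegative random variables; the following short derivation is included as a sketch of the step, using only that $W>0$ (which holds here because $0<\Delta(\theta,Z_1)<1$) and Tonelli's theorem to exchange expectation and integration.

```latex
\begin{align*}
e^{-\beta W} &= \int_{\beta W}^{\infty} e^{-u}\,du
              = \int_0^\infty e^{-u}\,\mathbf{1}\{u>\beta W\}\,du,\\
E\bigl(e^{-\beta W}\bigr)
             &= \int_0^\infty e^{-u}\,E\bigl(\mathbf{1}\{u/\beta>W\}\bigr)\,du
              = \int_0^\infty e^{-u}\,P(u/\beta>W)\,du .
\end{align*}
```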
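Thus,
\[
\beta E\bigl(|\Delta(\theta,Z_1)|^{\beta}\bigr) = \int_0^\infty \exp(-u)\,\beta\,P(u/\beta>W)\,du.
\]
Fix $\varepsilon>0$ and write
\begin{align*}
\beta E\bigl(|\Delta(\theta,Z_1)|^{\beta}\bigr) &= \int_0^{\varepsilon}\exp(-u)\,\beta\,P(u/\beta>W)\,du + \int_{\varepsilon}^{\infty}u\exp(-u)\,\frac{\beta}{u}\,P(u/\beta>W)\,du \tag{10}\\
&\le \beta P(\varepsilon/\beta>W) + \int_{\varepsilon}^{\infty}u\exp(-u)\,\frac{\beta}{u}\,P(u/\beta>W)\,du.
\end{align*}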
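We want to apply Fatou's Lemma in the form
\[
\lim_{\beta\to\infty}\int_{\varepsilon}^{\infty}u\exp(-u)\,\frac{\beta}{u}\,P(u/\beta>W)\,du \le \int_{\varepsilon}^{\infty}\lim_{\beta\to\infty}u\exp(-u)\,\frac{\beta}{u}\,P(u/\beta>W)\,du.
\]
In order to do this, we must show that
\[
0\le\frac{\beta}{u}\,P(u/\beta>W)\le M
\]
for some $M>0$, for $u\in[\varepsilon,\infty)$ and $\beta$ large. So, by right continuity and the existence of left limits, it suffices to show that
\[
\lim_{h\to 0}\frac{P(h>W)}{h}<\infty.
\]
But
\begin{align*}
\lim_{h\to 0}\frac{P(h>W)}{h} &= \lim_{h\to 0}\frac{P\bigl(h>-\log(|\Delta(\theta,Z_1)|)\bigr)}{h}\\
&= \lim_{h\to 0}\frac{P\bigl(\exp(-h)<|\Delta(\theta,Z_1)|\bigr)}{h}\\
&= \lim_{h\to 0}\frac{P\bigl(|\Delta(\theta,Z_1)|>1-h\bigr)}{h}<\infty,
\end{align*}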
by virtue of assumption ED4. This is what we require in order to apply Fatou's lemma. Consequently, we have
\[
\lim_{\beta\to\infty}\beta E\bigl(|\Delta(\theta,Z_1)|^{\beta}\bigr)<\infty,
\]
which implies
\[
\lim_{\beta\to\infty}\sqrt{\beta}\,E\bigl(|\Delta(\theta,Z_1)|^{\beta}\bigr)=0,
\]
and this is what we needed to conclude the proof of the lemma.
We are now ready to prove Theorem 3.
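Proof of Theorem 3. The proof of this theorem follows closely the steps of Feller (1968) p. 512. To simplify the notation, let us consider $E(X_1)=0$ and $E(X_1^2)=2\gamma$, and the $X_k$'s independent of the $Z_k$'s (as we shall see from the proof, these are just simplifying assumptions and the adaptation of the present proof is straightforward using the corresponding local expansion given in Lemma 3). Let
\[
\gamma(\theta) = \widehat{G}(\theta) = e^{-\theta^2/2}\left(1+\frac{(i\theta)^3\kappa_X^{(3)}}{18\gamma}\sqrt{\alpha}\right).
\]
Esseen's lemma applies here since
\[
G(x) = \Phi(x) - \frac{\kappa_X^{(3)}}{18}\sqrt{\alpha}\,\bigl(x^2-1\bigr)\eta(x)
\]
is bounded by some constant $C$. Also $\gamma(0)=1$ and $\gamma'(0)=0$. Therefore,
\[
|F_\alpha(x)-G(x)| \le \frac{1}{\pi}\int_{-T}^{T}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\gamma(\theta)\right|d\theta + \frac{24C}{\pi T}.
\]
Let $T=M/\sqrt{\alpha}$, for some large $M>0$. Then, for any small $\delta>0$, we have
\[
|F_\alpha(x)-G(x)| \le I_1+I_2+I_3+\sqrt{\alpha}\,\frac{24C}{\pi M},
\]
where
\begin{align*}
I_1 &= \frac{1}{\pi}\int_{-\delta/\sqrt{\alpha}}^{\delta/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\gamma(\theta)\right|d\theta,\\
I_2 &= \frac{1}{\pi}\int_{\delta/\sqrt{\alpha}}^{M/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\gamma(\theta)\right|d\theta,\\
I_3 &= \frac{1}{\pi}\int_{-M/\sqrt{\alpha}}^{-\delta/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\gamma(\theta)\right|d\theta.
\end{align*}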
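Observe that
\begin{align*}
I_2 &\le \frac{1}{\pi}\int_{\delta/\sqrt{\alpha}}^{M/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)\right|d\theta + \frac{1}{\pi}\int_{\delta/\sqrt{\alpha}}^{M/\sqrt{\alpha}}\frac{1}{|\theta|}\,|\gamma(\theta)|\,d\theta\\
&= \frac{1}{\pi}\int_{\delta}^{M}\frac{1}{|\theta|}\,|\phi(\theta,\alpha)|\,d\theta + \frac{1}{\pi}\int_{\delta/\sqrt{\alpha}}^{M/\sqrt{\alpha}}\frac{1}{|\theta|}\,|\gamma(\theta)|\,d\theta.
\end{align*}
By virtue of our previous lemma, it is clear that $I_2$ goes to zero faster than $\sqrt{\alpha}$, and similarly for $I_3$. Thus, we just have to study $I_1$. Let
\[
\zeta(\theta,\alpha) \triangleq \log(\phi(\theta,\alpha)) + \frac{\theta^2\,2\gamma}{2\,(1-m(-2\alpha))} = \log(\phi(\theta,\alpha)) + \frac{\theta^2\gamma}{1-m(-2\alpha)},
\]
where $m(-\lambda)=E\bigl(e^{-\lambda Z_1}\bigr)$. Hence, we can write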
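\begin{align*}
I_1 &= \frac{1}{\pi}\int_{-\delta/\sqrt{\alpha}}^{\delta/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\phi\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\gamma(\theta)\right|d\theta\\
&= \frac{1}{\pi}\int_{-\delta/\sqrt{\alpha}}^{\delta/\sqrt{\alpha}}\frac{1}{|\theta|}\left|\exp\left(\zeta\bigl(\sqrt{\alpha}\theta,\alpha\bigr)-\frac{\theta^2\gamma}{1-m(-2\alpha)}\right)-\gamma(\theta)\right|d\theta\\
&= \frac{1}{\pi}\int_{-\delta/\sqrt{\alpha}}^{\delta/\sqrt{\alpha}}\frac{1}{|\theta|}\,e^{-\theta^2/2}\left|e^{\zeta(\sqrt{\alpha}\theta,\alpha)-\frac{\theta^2}{2}\left(\frac{\alpha\gamma}{1-m(-2\alpha)}-1\right)}-1-\frac{(i\theta)^3\mu_3\sqrt{\alpha}}{18}\right|d\theta.
\end{align*}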
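Using Feller (1968), p. 507, we have that for any complex numbers $\tilde\beta_1$ and $\tilde\beta_2$,
\[
\left|e^{\tilde\beta_1}-1-\tilde\beta_2\right| \le \left(\bigl|\tilde\beta_1-\tilde\beta_2\bigr|+\tfrac{1}{2}\bigl|\tilde\beta_2\bigr|^2\right)\exp(\upsilon), \tag{11}
\]
where $\upsilon\ge\max\bigl(|\tilde\beta_1|,|\tilde\beta_2|\bigr)$. Given $\varepsilon>0$, we can choose $\delta>0$ small enough so that $|\theta\sqrt{\alpha}|<\delta$ (as in Feller (1968), p. 507) and
\[
\left|\zeta\bigl(\theta\sqrt{\alpha},\alpha\bigr)-\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}\right| \le \varepsilon\,\frac{\theta^3\alpha^{3/2}}{|1-m(-3\alpha)|} \le \varepsilon K_1\theta^3\alpha^{1/2}
\]
for $\alpha$ small enough and some constant $K_1$ independent of $\alpha$ (because $\frac{\alpha^{3/2}\kappa_X^{(3)}}{1-m(-3\alpha)}$ is the cumulant of order 3 for the random variable $\sqrt{\alpha}D(\alpha)$). At the same time, $\delta$ can also be chosen satisfying
\[
\left|\zeta\bigl(\theta\sqrt{\alpha},\alpha\bigr)\right| < \frac{1}{2}\,\frac{\gamma\alpha\theta^2}{1-m(-2\alpha)} \le \frac{K_2}{3}\theta^2
\]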
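for some $K_2\le 1$ and $\alpha$ small enough. Now, $\delta$ can also be chosen with the property that
\[
\left|\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}\right| < \frac{K_2}{3}\theta^2.
\]
Notice that
\begin{align*}
&\left|e^{\zeta(\sqrt{\alpha}\theta,\alpha)-\frac{\theta^2}{2}\left(\frac{\alpha\gamma}{1-m(-2\alpha)}-1\right)}-1-\frac{(i\theta)^3\kappa_X^{(3)}}{18}\sqrt{\alpha}\right|\\
&\qquad\le \left|e^{\zeta(\sqrt{\alpha}\theta,\alpha)-\frac{\theta^2}{2}\left(\frac{\alpha\gamma}{1-m(-2\alpha)}-1\right)}-1-\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}\right| + \left|\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}-\frac{(i\theta)^3\kappa_X^{(3)}}{18}\sqrt{\alpha}\right|,
\end{align*}
and observe that
\[
\left|\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}-\frac{(i\theta)^3\kappa_X^{(3)}}{18}\sqrt{\alpha}\right| \le \sqrt{\alpha}\,\theta^3\,o(1).
\]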
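Finally, we apply inequality (11) with $\tilde\beta_1=\zeta(\sqrt{\alpha}\theta,\alpha)-\frac{\theta^2}{2}\left(\frac{\alpha\gamma}{1-m(-2\alpha)}-1\right)$ and $\tilde\beta_2=\frac{\alpha^{3/2}(i\theta)^3\kappa_X^{(3)}}{3!\,(1-m(-3\alpha))}$ for $\delta>0$ small enough, so that
\[
I_1 \le \frac{\varepsilon}{\pi}K_1\sqrt{\alpha}\int_{-\infty}^{\infty}\theta^2e^{-\theta^2/6}\,d\theta + \frac{\alpha}{\pi}K_1^2\int_{-\infty}^{\infty}e^{-\theta^2/6}\theta^6\,d\theta + \frac{\sqrt{\alpha}}{\pi}\,o(1)\int_{-\infty}^{\infty}|\theta|^3e^{-\theta^2/6}\,d\theta.
\]
Hence we conclude that
\[
\limsup_{\alpha\to 0}\frac{1}{\sqrt{\alpha}}\sup_x|F_\alpha(x)-G(x)| \le \varepsilon\kappa,
\]
for some constant $\kappa$. Since $\varepsilon$ was arbitrary, this concludes the proof of the theorem.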
5.4.2 The continuous time setting
A popular model in the risk theory setting discussed in Section 2 consists of considering the processes $\Gamma$ and $\Lambda$ as two independent Levy processes (i.e. two stationary independent increment processes, see Gjessing and Paulsen (1992)). The stationary independent increment assumption of the risk process $\Lambda$ has been argued to hold by several authors in the risk theory community (this setting includes the so-called classical risk model, see Asmussen (2001) and Grandell (1991)). On the other hand, in finance, short rate processes are usually modelled as positive functions of a Markov process (typically with mean reverting characteristics). This motivates the following setting, in which we develop the desired Edgeworth expansion.
Suppose that $\Lambda=(\Lambda(t): t\ge 0)$ is a Levy process. In addition, let $Y=(Y(s): s\ge 0)$ be a homogeneous Markov process taking values in a Polish space $\Xi$ and let $\mathcal{B}(\Xi)$ be the Borel sigma-field in $\Xi$. Let $P(t,y,B)$ ($t\in\mathbb{R}_+$, $y\in\Xi$ and $B\in\mathcal{B}(\Xi)$) be the corresponding transition probability function. Assume that $Y$ satisfies the Feller condition (i.e. $P(t,y,B_\delta(x))\to 1$ as $t\searrow 0$, for all $\delta>0$) and that the mapping $y\to E_yf(Y_t)$ is continuous for all $f(\cdot)\in C(\Xi)$ (the space of continuous functions on $\Xi$). Let $A$ be the associated infinitesimal generator of the process $Y$, defined via the relation
\[
Af(y) = \lim_{t\downarrow 0}\frac{E_yf(Y(t))-f(y)}{t},
\]
where $f\in C(\Xi)$. The domain $D(A)$ of $A$ is composed of those functions $f\in C(\Xi)$ for which the previous limit exists (uniformly, for all $y\in\Xi$) (see Skorohod, Hoppensteadt and Salehi (2002)). In addition, suppose that $Y(\cdot)$ has right continuous with left limits sample paths and that it is geometrically ergodic (see Kontoyiannis and Meyn (2003), p. 9).
The following set of assumptions is in force throughout this section.
EC1 $\Lambda$ and $\Gamma$ are independent and the distribution of $\Lambda(1)$ is non-lattice.
EC2 Suppose $Y$ is geometrically ergodic (see Kontoyiannis and Meyn (2003) p. 9). Suppose that $\tilde\gamma(\cdot):\Xi\to\mathbb{R}$ is a continuous mapping such that $\tilde\gamma(x)>0$ for all $x\in\Xi$ and define $\Gamma$ as
\[
\Gamma(t) = \int_0^t\tilde\gamma(Y(s))\,ds.
\]
Under EC1 and EC2, we shall provide rigorous support for the approximation
\begin{align*}
P(D\le y) \approx{}& P\bigl(N(\lambda/\gamma,\chi^{(2)}(0)/2)\le y\bigr) - \sqrt{\gamma}\,\frac{\lambda}{\gamma}\,F(y_0)\,\eta\!\left((y-\lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right)\\
&- \frac{\sqrt{\gamma}}{18}\,\chi^{(3)}(0)\,H\!\left((y-\lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right), \tag{12}
\end{align*}
where (if $\pi(dy)$ denotes the stationary distribution of $Y$) $F$ can be characterized as the solution of the Poisson equation
\[
AF = E_\pi\gamma(Y(1))-\gamma(y),
\]
and $\chi(\cdot)$ depends on the log-moment generating function of $\Lambda$ and the Perron-Frobenius eigenvalue associated with the cumulative Markov reward $\Gamma$. More precisely, for every $\theta\in\mathbb{R}$ consider the (unique) solution pair $(u(y,\theta),\psi_\Gamma(\theta))$ (such that $u(y,0)=1$) satisfying
\[
(Au)(y,\theta) = \bigl(\psi_\Gamma(\theta)-\theta\tilde\gamma(y)\bigr)u(y,\theta). \tag{13}
\]
(Note that the geometric ergodicity guarantees existence and uniqueness of the solution pair $(u,\psi_\Gamma)$, see Kontoyiannis and Meyn (2003).) Let $\psi_\Lambda(i\theta)=\log E\exp(i\theta\Lambda(1))$ (we work with the branch $\{\arg(z)\in[0,2\pi)\}$ when operating with complex logarithms); then $\chi(i\theta)=-\psi_\Gamma^{-1}(-\psi_\Lambda(i\theta))$ (note that $\chi'(0)=\lambda/\gamma$). Just as in the discrete time case, the approximation (12) will be supported in the context of small interest rates for a suitably parameterized family of discounted rewards. In particular, we shall prove that the approximation
\begin{align*}
P\bigl(\sqrt{\alpha}\,(D(\alpha)-\lambda/(\gamma\alpha))\le y\bigr) \approx{}& P\bigl(N(0,\chi^{(2)}(0)/2)\le y\bigr) - \sqrt{\gamma\alpha}\,\frac{\lambda}{\gamma}\,F(y_0)\,\eta\!\left(y\sqrt{2/\chi^{(2)}(0)}\right)\\
&- \frac{\sqrt{\gamma\alpha}}{18}\,\chi^{(3)}(0)\,H\!\left(y\sqrt{2/\chi^{(2)}(0)}\right)
\end{align*}
holds with an error of order $o(\sqrt{\alpha})$ (uniformly in $y$), where
\[
D(\alpha) = \int_0^\infty\exp(-\alpha\Gamma(t))\,d\Lambda(t).
\]
(Note that the previous integral can be interpreted, via integration by parts, path by path as a Lebesgue-Stieltjes integral.)
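Theorem 4 Suppose that EC1 and EC2 hold. Then,
\begin{align*}
P\bigl(\sqrt{\alpha}\,(D(\alpha)-\chi'(0)/\alpha)\le y\bigr) ={}& P\bigl(N(0,\chi^{(2)}(0)/2)\le y\bigr) - \sqrt{\gamma\alpha}\,F(y_0)\,\eta\!\left(y\sqrt{2/\chi^{(2)}(0)}\right)\\
&- \frac{\sqrt{\gamma\alpha}}{18}\,\chi^{(3)}(0)\,H\!\left(y\sqrt{2/\chi^{(2)}(0)}\right) + G_\alpha([-\infty,y));
\end{align*}
where $G_\alpha$ represents a signed measure with $G_\alpha^+(\mathbb{R})+G_\alpha^-(\mathbb{R})\triangleq\|G_\alpha\|=o(\sqrt{\alpha})$.
The proof of the previous theorem parallels that of its discrete time analogue described in the previous section. We first obtain a local description of $\psi_\alpha(\theta)=\log E\exp\bigl(i\theta\sqrt{\alpha}\,(D(\alpha)-\lambda/(\gamma\alpha))\bigr)$.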
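Lemma 5 Under assumptions EC1 and EC2 we have that
\[
\psi_\alpha(\theta) = -\frac{\chi^{(2)}(0)}{2}\theta^2 + \sqrt{\alpha}\left(\frac{\chi^{(3)}(0)}{18}(i\theta)^3-\frac{\lambda}{\gamma}F(y_0)\,i\theta\right) + o\bigl(\sqrt{\alpha}\bigr)
\]
(uniformly in $\theta\in(-\delta,\delta)$, $\delta>0$).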
Proof. It is known that for every $u\in D(A)$ such that $\inf_{x\in\Xi}|u(x)|>0$ we have that
\[
M_t(z) = \frac{u(Y(t),\theta)}{u(Y_0,\theta)}\exp\left(-\int_0^t\left(\frac{Au}{u}\right)(Y(s),\theta)\,ds\right) \tag{14}
\]
is a martingale with respect to the filtration generated by $Y$ (see Lemma 2, p. 82 of Skorohod, Hoppensteadt and Salehi (2001)). Since $Y$ is geometrically ergodic, it follows that the generalized eigenvalue problem
\[
(Au)(y,\theta) = \bigl(\psi_\Gamma(\theta)-\theta\tilde\gamma(y)\bigr)u(y,\theta), \qquad u(y,0)=1 \tag{15}
\]
has a unique solution pair $(u(y,\theta),\psi_\Gamma(\theta))$ for every $\theta\in\mathbb{R}$. In addition, $\inf_{y\in\Xi}u(y,\theta)>0$ for all $\theta\in\mathbb{R}$ and $\psi_\Gamma(\cdot)$ is a strictly increasing function (since
\[
\psi_\Gamma(\theta) = \lim_{t\to\infty}\frac{1}{t}\log E\exp(\theta\Gamma(t))\ ).
\]
Observe that the solution to (15) automatically provides the solution to the problem
\begin{align*}
\frac{1}{\tilde\gamma(y)}(Au)(y,\theta) &= \left(\frac{\psi_\Gamma(\theta)}{\tilde\gamma(y)}-\theta\right)u(y,\theta)\\
&= \left(-\psi_\Gamma^{-1}(-\nu)-\frac{\nu}{\tilde\gamma(y)}\right)u\bigl(y,\psi_\Gamma^{-1}(-\nu)\bigr),
\end{align*}
(where $\nu=-\psi_\Gamma(\theta)$). In addition, Proposition 4.8 of Kontoyiannis and Meyn (2003) states that for each $y\in\Xi$, both $u(y,\cdot)$ and $\psi_\Gamma(\cdot)$ are analytic in $N=\{z\in\mathbb{C}: |z|\le\delta\}$ for some $\delta>0$ (which immediately implies the analyticity of $\zeta(\cdot)=-\psi_\Gamma^{-1}(-\cdot)$) and $\inf_{x\in\Xi,\,z\in N}|u(x,z)|>0$. Note that the Markov process $\tilde Y=(\tilde Y(t): t\ge 0)$ defined as $\tilde Y(t)=Y(\Gamma^{-1}(t))$ is also a geometrically ergodic Markov process with generator $\tilde A=\frac{1}{\tilde\gamma}A$ (the reason is that $\tilde\gamma$ being continuous and positive implies $\inf_{x\in\Xi}\tilde\gamma(x)>0$, which yields that the Lyapunov bound needed in the definition of geometric ergodicity is immediately satisfied after scaling factors; see Kontoyiannis and Meyn (2003) p. 9). Therefore, by considering the Markov generator $\partial_t+\tilde A$ and the function $u(y,\psi_\Lambda(i\theta e^{-\alpha t}))$ (for $\theta\in\mathbb{R}$ with $|\theta|<\delta$) in the relation (14), we can build the martingales
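\begin{align*}
M_t(i\theta) ={}& \frac{u\bigl(\tilde Y(t),-\chi(i\theta e^{-\alpha t})\bigr)}{u\bigl(Y_0,-\chi(i\theta e^{-\alpha t})\bigr)}\exp\left(\int_0^t\frac{\psi_\Lambda(i\theta e^{-\alpha s})}{\tilde\gamma(\tilde Y(s))}\,ds-\int_0^t\chi\bigl(i\theta e^{-\alpha s}\bigr)\,ds\right)\\
&\times\exp\left(-\alpha\int_0^t i\theta e^{-\alpha s}\,\frac{u_\theta\bigl(\tilde Y(s),-\chi(i\theta e^{-\alpha s})\bigr)}{u\bigl(\tilde Y(s),-\chi(i\theta e^{-\alpha s})\bigr)}\,\chi\bigl(i\theta e^{-\alpha s}\bigr)\,ds\right).
\end{align*}
Note that $M_t(i\theta)$ is a bounded martingale (in particular, uniformly integrable). Thus it possesses a last element $M_\infty(i\theta)$, which implies that
\[
\exp\left(\int_0^\infty\chi\bigl(i\theta e^{-\alpha t}\bigr)\,dt\right)u(Y_0,i\theta) = E\exp\left(\int_0^\infty\frac{\psi_\Lambda(i\theta e^{-\alpha t})}{\tilde\gamma(\tilde Y(t))}\,dt-\xi(\alpha,i\theta)\right),
\]
where
\[
\xi(\alpha,i\theta) = \alpha\int_0^\infty i\theta e^{-\alpha t}\,\frac{u_\theta\bigl(\tilde Y(t),-\chi(i\theta e^{-\alpha t})\bigr)}{u\bigl(\tilde Y(t),-\chi(i\theta e^{-\alpha t})\bigr)}\,\chi\bigl(i\theta e^{-\alpha t}\bigr)\,dt.
\]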
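Therefore, we conclude that
\begin{align*}
&\exp\left(\int_0^\infty\bigl(\chi\bigl(\sqrt{\alpha}\,i\theta e^{-\alpha t}\bigr)-\sqrt{\alpha}\,i\theta e^{-\alpha t}\lambda/\gamma\bigr)\,dt\right)u\bigl(Y_0,-\chi(\sqrt{\alpha}\,i\theta)\bigr)\\
&\qquad= E\exp\left(\int_0^\infty\frac{\psi_\Lambda(\sqrt{\alpha}\,i\theta e^{-\alpha t})}{\tilde\gamma(\tilde Y(t))}\,dt-i\theta\frac{\lambda}{\gamma\sqrt{\alpha}}-\xi\bigl(\alpha,\sqrt{\alpha}\,i\theta\bigr)\right)\\
&\qquad= E\exp\left(\int_0^\infty\frac{\psi_\Lambda(\sqrt{\alpha}\,i\theta e^{-\alpha t})}{\tilde\gamma(\tilde Y(t))}\,dt-i\theta\frac{\lambda}{\gamma\sqrt{\alpha}}\right)+o\bigl(\sqrt{\alpha}\bigr) \tag{16}
\end{align*}
(uniformly in $\theta\in(-\delta,\delta)$). The previous equality follows from the fact that
\begin{align*}
E\xi\bigl(\alpha,\sqrt{\alpha}\,i\theta\bigr) &= \sqrt{\alpha}\,i\theta\,\alpha\int_0^\infty e^{-\alpha t}\,\frac{u_\theta\bigl(\tilde Y(t),-\chi(\sqrt{\alpha}\,i\theta)\bigr)}{u\bigl(\tilde Y(t),-\chi(\sqrt{\alpha}\,i\theta)\bigr)}\,\chi\bigl(\sqrt{\alpha}\,i\theta e^{-\alpha t}\bigr)\,dt\\
&= \sqrt{\alpha}\,i\theta\,\frac{\lambda}{\gamma}\,E\,\alpha\int_0^\infty e^{-\alpha t}u_\theta\bigl(\tilde Y(t),0\bigr)\,dt+O(\alpha),
\end{align*}
and (using Theorem 1 in combination with the bounded convergence theorem) it follows that
\[
\alpha E\int_0^\infty e^{-\alpha t}u_\theta\bigl(\tilde Y(t),0\bigr)\,dt \to E\,u_\theta(Y(\infty),0) = E_\pi F(Y(1)) = 0
\]
(since $u_\theta(y,0)=F(y)$). On the other hand, notice that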
\begin{align*}
E\exp(i\theta D(\alpha)) &= E\left(E\left(\exp\left(i\theta\int_0^\infty\exp(-\alpha\Gamma(t))\,d\Lambda(t)\right)\,\Bigg|\,\Gamma\right)\right)\\
&= E\exp\left(\int_0^\infty\psi_\Lambda\bigl(i\theta\exp(-\alpha\Gamma(t))\bigr)\,dt\right)\\
&= E\exp\left(\int_0^\infty\frac{\psi_\Lambda(i\theta e^{-\alpha u})}{\tilde\gamma(\tilde Y(u))}\,du\right). \tag{17}
\end{align*}
Combining expressions (16) and (17) with a Taylor expansion of $\chi(\cdot)$ and $u(Y_0,\cdot)$ yields the conclusion of the lemma.
The proof of Theorem 4 can be completed along the same lines as in the discrete time case after showing that $\phi(\theta,\alpha)\triangleq E\exp(i\theta D(\alpha))$ goes to zero fast enough for $|\theta|\in(w_0,w_1)$, for any $0<w_0<w_1<\infty$.
Lemma 6 Suppose that EC1 and EC2 are in force. Then $\phi(\theta,\alpha)\triangleq E\exp(i\theta D(\alpha))$ satisfies
\[
\sup_{|\theta|\in(\theta_0,\theta_1)}|\phi(\theta,\alpha)| = o\bigl(\sqrt{\alpha}\bigr),
\]
for all $0<\theta_0<\theta_1<\infty$.
Proof. We proceed as in the discrete time case. First we write
\begin{align*}
|\phi(\theta,\alpha)| &= \left|E\exp\left(\int_0^\infty\psi_\Lambda\bigl(i\theta\exp(-\alpha\Gamma(t))\bigr)\,dt\right)\right|\\
&\le E\left|\exp\left(\int_0^\infty\psi_\Lambda\bigl(i\theta\exp(-\alpha\Gamma(t))\bigr)\,dt\right)\right|
\end{align*}
(note that $\psi_\Lambda(i\cdot)$ is well defined except for at most countably many values; in those cases we can assign the value $-\infty$, and that will not affect the value of the integral above). The proof now follows just as in the discrete time case, by splitting the integral up to $\Gamma^{-1}(1/\alpha)$ and using the non-lattice property of the distribution of $\Lambda(1)$. In fact, since $1/(\alpha\sup_{x\in\Xi}\tilde\gamma(x))\le\Gamma^{-1}(1/\alpha)$, we can actually obtain an exponential rate of convergence instead of the rate $o\bigl(\alpha^{1/2}\bigr)$.
Remarks
a) The assumption that $\Xi$ is compact does not really play an essential role. It was only used to ensure the martingale property of $M_t(i\theta)$ in the proof of Lemma 5. A local description for $\psi_\alpha(i\theta)$ could also have been obtained by computing the moments of $D(\alpha)$, which is relatively easy in the present setting.
b) The independence between $\Gamma$ and $\Lambda$ can also be relaxed. For example, one could have assumed that both processes are conditionally independent given another Markov process, say $Z$, provided that $\Lambda$ remains a possibly non-time homogeneous Levy process with a suitable non-lattice conditional distribution assumption analogous to condition ED3 in the previous subsection.
c) Following the same ideas as in Lemma 5, a local expansion for $\psi_\alpha(\theta)$ can be obtained for the case in which
\[
D(\alpha) = \int_0^\infty\exp\left(-\alpha\int_0^t\tilde\gamma(Y(s))\,ds\right)\tilde\lambda(Y(t))\,dt
\]
(where $\tilde\lambda$ is, say, continuous on the compact Polish space $\Xi$). In this case, the corresponding generalized eigenvalue problem takes the form
\[
\frac{1}{\tilde\gamma}(Au)(y,\theta) = \left(\chi(\theta)-\frac{\tilde\lambda(y)}{\tilde\gamma(y)}\right)u(y,\theta), \qquad u(y,0)=1,
\]
and a formal corrected approximation can be written as
\begin{align*}
P(D\le y) \approx{}& P\bigl(N(\lambda/\gamma,\chi^{(2)}(0)/2)\le y\bigr) - \sqrt{\gamma}\,u_\theta(y_0,0)\,\eta\!\left((y-\lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right)\\
&- \frac{\sqrt{\gamma}}{18}\,\chi^{(3)}(0)\,H\!\left((y-\lambda/\gamma)\sqrt{2/\chi^{(2)}(0)}\right).
\end{align*}
The only step required to make the previous approximation rigorous is to show that for all $0<\theta_0<\theta_1<\infty$, $\sup_{|\theta|\in(\theta_0,\theta_1)}|\phi(\theta,\alpha)|=o(\sqrt{\alpha})$ as in Lemma 6. This essentially involves assuming enough structure to ensure strongly non-lattice properties of $D$. We have chosen Levy processes in our exposition because they provide a convenient framework to easily verify, from the model primitives, the non-lattice conditions that yield the described Edgeworth expansion.
5.5 Large Deviations
To fix ideas, let us begin by considering the same setting under which we derived our LLN in Section 3. In the previous section, we derived accurate approximations for the distribution of $D$ (in the iid setting) for small interest rates when $D$ is close to its typical value (according to the LLN, this means looking at $D$ close to $\lambda/\gamma$). In a number of applications (including those discussed in Section 2 regarding time series analysis and risk theory), one is often interested in computing $P(D>x)$ for $x$ suitably large. In particular, these types of applications motivate interest in the analysis of the tail probability $P(D>x)$ for $x\gg\lambda/\gamma$. As we shall see, under certain exponential moment conditions on $(\Gamma,\Lambda)$, the approximation proposed here will take the form
\[
P(D>x) \approx \exp(-I(x)/\gamma), \tag{18}
\]
where $I(x)>0$ corresponds to the so-called rate function and will typically take the form $I(x)=x\theta^*-\int_0^\infty\chi(\theta^*e^{-s})\,ds$, where $\theta^*$ satisfies $\theta^*x=\chi(\theta^*)$ and $\chi(\cdot)$ is a suitably defined convex function. The goal of this section is to provide, under general conditions, rigorous justification (at least in a rough logarithmic sense) for the previous approximation. In addition, we will also explore, under additional structure, exact asymptotics (also known as precise large deviations).
Applications in finance and risk theory motivate the study of continuous time processes, including the case in which the processes $\Gamma$ and $\Lambda$ take the form
\[
\Gamma(t) = \int_0^t\tilde\gamma(s)\,ds \quad\text{and}\quad \Lambda(t) = \int_0^t\tilde\lambda(s)\,ds,
\]
where, for all $s$, $\tilde\gamma(s)>0$ represents the "short rate" process and $\tilde\lambda(s)$ represents the reward rate. Also, other applied contexts such as the analysis of ARCH processes in time series motivate the study of the discrete time setting, in which
\[
\Gamma(t) \triangleq \sum_{k=1}^{\lfloor t\rfloor}Z_k \quad\text{and}\quad \Lambda(t) \triangleq \sum_{k=1}^{\lfloor t\rfloor}X_k,
\]
and $(X_k,Z_k)_{k\ge 0}$ is a (typically stationary) sequence of two dimensional random vectors with the property that $Z_k>0$ for all $k\ge 0$.
In order to provide rigorous justification for the approximation (18), we shall consider
\begin{align*}
\alpha D(\alpha) &= \alpha\int_{[0,\infty)}\exp(-\alpha\Gamma(t-))\,d\Lambda(t)\\
&= \alpha\int_{[0,\infty)}\exp(-u)\,\Lambda\bigl(\Gamma^{-1}(u/\alpha)\bigr)\,du, \tag{19}
\end{align*}
and study $P(\alpha D(\alpha)>x)$ for $x>\lambda$. Note that the previous identity holds in general provided that $\Gamma(\cdot)$ is non-decreasing and $\Lambda(\cdot)$ has RCLL sample paths. In other words, (19) may hold even if $\Lambda$ does not have bounded variation. Expression (19) suggests a natural strategy to derive a LDP for $\{\alpha D(\alpha)\}_{\alpha>0}$ as $\alpha\searrow 0$; namely, to apply the contraction principle (under appropriate sample path large deviations assumptions on $\alpha(\Gamma(\cdot/\alpha),\Lambda(\cdot/\alpha))$) to the mapping $\Psi: D[0,\infty)\times D[0,\infty)\to\mathbb{R}$ defined as
\[
\Psi(x,y) = \int_0^\infty\exp(-t)\,y\bigl(x^{-1}(t)\bigr)\,dt.
\]
Actually, we will follow more or less this idea, although with important modifications arising due to the fact that $\Psi$ is not continuous. Indeed, if we consider the map $\Psi_1$, acting on $D[0,\infty)$ endowed with the Skorohod $J_1$ topology (see Whitt (2001)) and defined as $\Psi_1(x)=\int_0^\infty\exp(-t)\,x(t)\,dt$, then we can see, aside from the fact that $\Psi_1$ is not well defined for every element in $D[0,\infty)$, that $\Psi_1$ is discontinuous at every single point. In order to see this, just consider the sequence of functions $(x_n: n\ge 1)$ defined as $x_n(t)=e^nI(n\le t<n+1)$, and note that $x_n\to 0$ while $\Psi_1(x_n)=1-e^{-1}$. (This example was given by Whitt (1972); that $\Psi_1(\cdot)$ is discontinuous at every element of $D[0,\infty)$ follows by linearity of $\Psi_1$.)
The idea, then, is to restrict the domain of $\Psi_1$ to a proper subspace of $D[0,\infty)$, endowed with a finer topology under which $\Psi_1(\cdot)$ is continuous. This idea will be studied in detail in the next subsection, in which we treat the continuous setting. Later, we will return to the discrete setting.
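The counterexample $x_n(t)=e^nI(n\le t<n+1)$ can be checked numerically. The sketch below is illustrative only (the function names are hypothetical): it evaluates $\Psi_1(x_n)$ both in closed form and by a midpoint rule, and records that $x_n$ vanishes on any fixed compact interval $[0,T]$ once $n>T$, which is the sense in which $x_n\to 0$.

```python
import math

def psi1_closed_form(n):
    """Psi_1(x_n) = int_n^{n+1} e^{-t} e^{n} dt, evaluated exactly."""
    return math.exp(n) * (math.exp(-n) - math.exp(-(n + 1)))

def psi1_midpoint(n, steps=20000):
    """Midpoint-rule approximation of the same integral over [n, n+1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = n + (i + 0.5) * h
        total += math.exp(n - t) * h  # e^{-t} * x_n(t), with x_n(t) = e^n here
    return total

def sup_on_compact(n, horizon):
    """sup of |x_n(t)| over 0 <= t <= horizon; zero as soon as n > horizon."""
    return math.exp(n) if n <= horizon else 0.0

if __name__ == "__main__":
    target = 1.0 - math.exp(-1.0)
    for n in (1, 5, 10, 20):
        print(n, psi1_closed_form(n), psi1_midpoint(n), target,
              sup_on_compact(n, horizon=8.0))
```

The integral stays at $1-e^{-1}$ for every $n$, even though the functions are eventually zero on each compact set, so $\Psi_1(x_n)$ does not converge to $\Psi_1(0)=0$.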
5.5.1 The continuous time setting
We will restrict the domain of $\Psi_1$ to the subspace
\[
L_\beta[0,\infty) \triangleq \left\{x\in C[0,\infty): \lim_{t\to\infty}\left|\frac{x(t)}{t^\beta}\right|=0\right\},
\]
for some $\beta>0$, with the topology generated by the weighted norm
\[
\|x\|_\beta = \sup_{t\ge 0}\frac{|x(t)|}{1+t^\beta}.
\]
Whitt (1972) proved that $\Psi_1$ is continuous on $\bigl(L_\beta[0,\infty),\|\cdot\|_\beta\bigr)$, which suggests using the contraction principle on this space. The following proposition constitutes an intermediate step in this direction.
Proposition 5 Suppose that the family of processes $\alpha(\Gamma(\cdot/\alpha),\Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a LDP on $C[0,\infty)\times C[0,\infty)$ (endowed with the product topology generated by the uniform convergence on compact sets, also known as Stone's topology) with a good rate function $I(x,y)$. Then, $R_\alpha(\cdot)=\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha))$ satisfies a LDP on $C[0,\infty)$ (endowed with Stone's topology) with good rate function $I'(z)=\inf\{I(x,y): z=y\circ x^{-1}\}$.
Proof. This is just a direct consequence of the contraction principle (see Theorem 4.2.1, p. 126 of Dembo and Zeitouni (1999)) and the fact that the mapping $(x,y)\to y\circ x^{-1}$ is continuous in the topological spaces described (see Whitt (2001), Theorem 13.2.2, p. 430).
At this point, one may be tempted to invoke, once again, the contraction principle in combination with Proposition 5 to obtain the desired LDP. However, in order to proceed with this program, we must show that the LDP developed in Proposition 5 actually holds on $\bigl(L_\beta[0,\infty),\|\cdot\|_\beta\bigr)$ (since, in order to apply the contraction principle, the continuity of $\Psi_1$ must be compatible with the topology under which the original LDP was derived). In order to show the LDP on $\bigl(L_\beta[0,\infty),\|\cdot\|_\beta\bigr)$, we will need to show that the random elements $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ are exponentially tight (see Dembo and Zeitouni (1998)). (This type of reasoning parallels similar arguments in the context of weak convergence theory and the important role that tightness plays in this theory.) Recall that a sequence of probability measures $P_n$ is said to be exponentially tight if for every $a>0$ there exists a compact set $K_a$ such that
\[
\lim_{n\to\infty}\frac{1}{n}\log P_n(K_a^c)\le -a,
\]
or, if the $P_n$'s take values on subsets of a Polish space, then the $P_n$'s are exponentially tight if for $\varepsilon>0$, there exists a compact set $K_\varepsilon$ such that, for all $n\ge 1$,
\[
\varepsilon^n > 1-P_n(K_\varepsilon)
\]
(see Zajic (1993) p. 11). In view of these observations, we must characterize exponential tightness in $L_\beta[0,\infty)$. This is the aim of the following lemma.
Lemma 7 Consider a sequence of probability measures $(P_n: n\ge 1)$ on $L_\beta[0,\infty)$ (such that $P_n\{x: x(0)=0\}=1$) and acting on the Borel sigma-field corresponding to the topology generated by the norm $\|\cdot\|_\beta$. Then, $(P_n: n\ge 1)$ is exponentially tight if and only if $(P_n: n\ge 1)$ is exponentially tight under the (relative) Stone topology, and for each $\delta>0$
\[
\lim_{n\to\infty}\frac{1}{n}\log P_n\left(x: \sup_{t\ge t_0}\frac{|x(t)|}{t^\beta}>\delta\right)\to-\infty \quad\text{as } t_0\nearrow\infty. \tag{20}
\]
Proof. Lemma 3.3 of Whitt (1972) establishes that relatively compact sets in $\bigl(L_\beta[0,\infty),\|\cdot\|_\beta\bigr)$ are those sets $B$ with compact closure under the relative Stone topology, and satisfying
\[
\lim_{t\to\infty}\sup_{x\in B}\frac{|x(t)|}{t^\beta}=0.
\]
Also, recall that (if $x(0)=0$ a.s. with respect to each $P_n$) for exponential tightness under Stone's topology, it is necessary and sufficient (see Feng and Kurtz (2000), p. 30) that, for each $\varepsilon,T>0$,
\[
\lim_{n\to\infty}\frac{1}{n}\log P_n\bigl(x: \omega(x,\delta,T)>\varepsilon\bigr)\to-\infty \quad\text{as } \delta\searrow 0, \tag{21}
\]
where $\omega(x,\delta,T)$ is the modulus of continuity of $x$, on the interval $[0,T]$, evaluated at $\delta$. We now show that if conditions (20) and (21) are satisfied, then the sequence $(P_n: n\ge 1)$ is exponentially tight. Pick $\lambda>0$, choose $\delta_k$ so that
\[
P_n\bigl(x: \omega(x,\delta_k,T)>1/k\bigr)\le e^{-n\lambda}/2^{k+1},
\]
and let $B_k=\{x: \omega(x,\delta_k,T)\le 1/k\}$. Also, pick $t_k$ so that
\[
P_n\left(x: \sup_{t\ge t_k}\frac{|x(t)|}{t^\beta}>1/k\right)\le e^{-n\lambda}/2^{k+1},
\]
and let $C_k=\{x: \sup_{t>t_k}|x(t)|/t^\beta\le 1/k\}$. Consider the closure, $\overline{A}_\lambda$, of $A_\lambda=\cap_k(B_k\cap C_k)$. Note that
\[
1-P_n\bigl(\overline{A}_\lambda\bigr)\le 1-P_n(A_\lambda)=P_n\bigl(\cup_k(B_k^c\cup C_k^c)\bigr)\le e^{-n\lambda}.
\]
We claim that $A_\lambda$ is relatively compact (i.e. that $\overline{A}_\lambda$ is compact). To see this, choose $\varepsilon>0$ and let $k_0>1/\varepsilon$. Then, for all $\delta<\delta_{k_0}$ we have that
\[
\sup_{x\in A_\lambda}\omega(x,\delta,T)<\varepsilon.
\]
Similarly, for every $T>t_{k_0}$ we have that
\[
\varepsilon>\sup_{x\in A_\lambda}\sup_{t>T}\frac{|x(t)|}{t^\beta},
\]
which implies that
\[
\lim_{t\to\infty}\sup_{x\in A_\lambda}\frac{|x(t)|}{t^\beta}\le\varepsilon
\]
for all $\varepsilon>0$. Thus, $A_\lambda$ is relatively compact by virtue of the Arzela-Ascoli theorem (see Billingsley (1999) p. 81) and Lemma 3.3 of Whitt (1972), which concludes the argument for sufficiency. The necessity part is easier and follows just as in Feng and Kurtz (2000) p. 30. Therefore, it is omitted.
With the aid of the previous lemma, the exponential tightness of $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ follows easily.
Lemma 8 Suppose that $\alpha(\Gamma(\cdot/\alpha),\Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a full LDP with rate function $I(x,y)$ (under Stone's topology). (Recall that a full LDP means an LDP with convex good rate function.) Then,
a) The family $(\alpha\Gamma(\cdot/\alpha)-\gamma\cdot,\ \alpha\Lambda(\cdot/\alpha)-\lambda\cdot)_{\alpha>0}$ is exponentially tight in $L_1[0,\infty)\times L_1[0,\infty)$ with the product topology generated by the norm $\|\cdot\|_1$.
b) The class of random elements $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha))-\lambda\cdot/\gamma)_{\alpha>0}$ is exponentially tight in $(L_1[0,\infty),\|\cdot\|_1)$.
Remark The convexity of the rate function does not really play a role in this lemma; only the goodness of the rate function is required.
Proof. For part a), it suffices to show that $\alpha\Gamma(\cdot/\alpha) - \gamma\cdot$ and $\alpha\Lambda(\cdot/\alpha) - \gamma\cdot$ are both exponentially tight in $(L_1[0,\infty), \|\cdot\|_1)$. Since $\alpha\Lambda(\cdot/\alpha)$ satisfies a full LDP in $C[0,\infty)$ (under Stone's topology), which is a topological group (which implies that addition is
a continuous operation), it follows from the contraction principle that $\alpha\Lambda(\cdot/\alpha) - \gamma\cdot$ also satisfies a full LDP. Note that $C[0,\infty)$, endowed with Stone's topology, is a Polish space. Thus, the existence of a full LDP guarantees the exponential tightness
of $\alpha\Lambda(\cdot/\alpha) - \gamma\cdot$ (see Dembo and Zeitouni (1999), p. 120 (c)). Therefore, we just
have to prove condition (20). Note that for any $0 < a < b < \infty$, the mapping $x \mapsto \sup_{t \in [a,b]} |x(t)/t|$ is continuous (under Stone's topology), which implies that the family $V_\alpha = \sup_{t \in [a,b]} |\alpha\Lambda(t/\alpha)/t - \gamma|$ satisfies an LDP with good rate function $J$, say. Hence, we can write
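\[ P\left(\sup_{t > t_0} \left|\frac{\alpha\Lambda(t/\alpha) - \gamma t}{t}\right| \ge \delta\right) \le \sum_{k=1}^{\infty} P\left(\sup_{t \in t_0[k,k+1]} \left|\frac{\alpha\Lambda(t/\alpha) - \gamma t}{t}\right| \ge \delta\right) \]
\[ \le \sum_{k=1}^{\infty} P\left(\sup_{u = t/(k t_0) \in [1,2]} \left|\frac{\alpha\Lambda(u k t_0/\alpha) - \gamma u k t_0}{u k t_0}\right| \ge \delta\right) = \sum_{k=1}^{\infty} \exp\left(-\left(J(\delta) + o_{k t_0/\alpha}(1)\right) k t_0/\alpha\right), \]
where the subindex in $o_{k t_0/\alpha}(1)$ has been used just to emphasize that $o_{k t_0/\alpha}(1) \to 0$
as $k t_0/\alpha \to \infty$. So we can choose $k_0$ big enough so that for every $k > k_0$ we have
$J(\delta) + o_{k t_0/\alpha}(1) > J(\delta)/2 > 0$. From these estimates it is easy to conclude that
\[ \lim_{\alpha \to 0} \alpha \log P\left(\sup_{t > t_0} \left|\frac{\alpha\Lambda(t/\alpha) - \gamma t}{t}\right| \ge \delta\right) \to -\infty \quad \text{as } t_0 \nearrow \infty, \]
which yields, by virtue of Lemma 7, the corresponding exponential tightness for
$\alpha\Lambda(\cdot/\alpha) - \gamma\cdot$. The argument for $\alpha\Gamma(\cdot/\alpha) - \gamma\cdot$ is exactly the same and therefore has been omitted. Part b) also proceeds along the same lines as the previous argument,
since it follows from Proposition 5 that $\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha))$ satisfies a full LDP under Stone's topology.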
We are ready to derive the LDP for (αD (α))α>0 in the continuous setting.
Theorem 5 Suppose that the family of processes $\alpha(\Gamma(\cdot/\alpha),\Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies a full LDP on $C[0,\infty) \times C[0,\infty)$ (endowed with the corresponding product Stone's topology) with a good rate function $I(x,y)$. Then, $\{\alpha D(\alpha)\}_{\alpha>0}$ satisfies an LDP on $\mathbb{R}$ with good rate function
\[ I(z) = \inf\left\{ I(x,y) : z = \int_0^{\infty} e^{-t} \left(y \circ x^{-1}\right)(t)\, dt \right\}. \]
Proof. Proposition 5 combined with the contraction principle tells us that the
family of random variables $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)) - \lambda\cdot/\gamma)_{\alpha>0}$ satisfies a full LDP on $C[0,\infty)$. Since the product topology generated by the norm $\|\cdot\|_1$ in the subspace $L_1[0,\infty)$ is finer than Stone's topology, Corollary 4.2.6 of Dembo and Zeitouni (1999) (which is
a simple consequence of the inverse contraction principle applied with the identity
mapping) applies, yielding that $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)) - \lambda\cdot/\gamma)_{\alpha>0}$ satisfies a full LDP on $(L_1[0,\infty), \|\cdot\|_1)$. Since the mapping $\Psi_1$ is continuous on $(L_1[0,\infty), \|\cdot\|_1)$, we can apply the contraction principle once again, thereby yielding the conclusion of the
theorem.
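As a sanity check on the constraint appearing in the rate function of Theorem 5, the functional $z = \int_0^\infty e^{-t}(y \circ x^{-1})(t)\,dt$ can be evaluated numerically. The sketch below is illustrative only: the helper `discounted_functional`, the truncation horizon, and the linear paths $x(t) = \gamma t$, $y(t) = \lambda t$ with $\gamma = 2$, $\lambda = 3$ are assumptions chosen for the example.

```python
import math


def discounted_functional(y, x_inverse, horizon=60.0, steps=60000):
    """Trapezoidal approximation of int_0^horizon exp(-t) * y(x_inverse(t)) dt."""
    h = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-t) * y(x_inverse(t))
    return total * h


gamma_rate, lambda_rate = 2.0, 3.0  # hypothetical drifts
z = discounted_functional(lambda s: lambda_rate * s, lambda t: t / gamma_rate)
```

For these linear paths the functional equals $\lambda/\gamma$, which is the threshold above which the theorems of this section describe the tail of $\alpha D(\alpha)$.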
The previous theorem provides rigorous justification for approximation (18) in
a very general setting (essentially all those in which functional LDPs for $(\Gamma,\Lambda)$ exist in
the space of continuous functions). This includes, for example, the setting in which $\Lambda$
and $\Gamma$ are diffusion processes (see Dembo and Zeitouni (1999) Section 5.6). However,
in order for the previous theorem to be useful from an applied standpoint, sufficient
conditions must be provided to guarantee the validity of an LDP with good rate
function on $C[0,\infty) \times C[0,\infty)$. Fortunately, these types of conditions have been well studied in the literature.
The following set of assumptions taken from Zajic (1993) are useful to guarantee
the existence of a full LDP (more than we actually need), and their validity has been
shown in many different settings (see Zajic (1993) Ch. 3 and Ch. 4).
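ACL1 For all $\theta, \eta \in \mathbb{R}$ suppose that
\[ g(\eta,\theta) \triangleq \sup_{s,t} \frac{1}{t} \log E \exp\left(\eta \int_s^{s+t} \gamma(u)\, du + \theta \int_s^{s+t} |\lambda(u)|\, du\right) < \infty. \]
In addition, assume that there exists $\varepsilon > 0$ and a pair of functions $(f,h)$ such
that $h(\delta) \to 0$ as $\delta \to 0$ and
\[ \lim_{\delta \to 0} \left(\varepsilon \log(\delta) + f(\delta) h(\delta) - \delta g(f(\delta))\right) = \infty. \]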
ACL2 If $0 = t_0 < t_1 < \dots < t_m < \infty$ then
\[ W_{\alpha,m} \triangleq \alpha\left(\left(\Gamma(t_i/\alpha) - \Gamma(t_{i-1}/\alpha)\right), \left(\Lambda(t_i/\alpha) - \Lambda(t_{i-1}/\alpha)\right)\right)_{i=1}^{m} \]
satisfies a Large Deviations Principle (LDP) on $\mathbb{R}^{2m}$ with good rate function
\[ I_m(z) = \sum_{i=1}^{m} (t_i - t_{i-1})\, I\left(\frac{z_i}{t_i - t_{i-1}}\right), \]
where $I(x_1,x_2)$ is the rate function governing the LDP of $n^{-1}(\Gamma(n),\Lambda(n))$.
The following theorem provides a form of the LDP that is well suited for applications.
Define (as in Zajic (1993) p. 9)
\[ \psi(\eta,\theta) = \lim_{n \to \infty} \frac{1}{n} \log E \exp\left(\eta\Gamma(n) + \theta\Lambda(n)\right) < \infty. \]
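Theorem 6 Suppose that assumptions ACL1 and ACL2 are in force. Let $AC_0$ be
the set of absolutely continuous functions, defined on $[0,\infty)$, taking values on the real line and vanishing at the origin. Then, if $y > \lambda/\gamma$, we have that
\[ \lim_{\alpha \to 0} \alpha \log P(\alpha D(\alpha) \ge y) = -I(y) \triangleq -\inf_{x \in AC_0} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds : y = \int_0^{\infty} e^{-s} x(s)\, ds \right\}, \]
where $\chi(\cdot)$ is defined via $\psi(-\chi(\cdot),\cdot) = 0$. In addition, if there exists $\theta^* = \theta^*(y)$
such that $y\theta^* = \chi(\theta^*)$, then we have
\[ \lim_{\alpha \to 0} \alpha \log P(\alpha D(\alpha) \ge y) = -\sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right) = -\left(y\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du\right). \]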
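Proof. All we have to do is identify the rate function. Theorem 2.2.2, p. 25, of Zajic (1993) indicates that $\alpha(\Gamma(\cdot/\alpha),\Lambda(\cdot/\alpha))_{\alpha>0}$ satisfies an LDP with good rate function
\[ I(x,y) \triangleq \begin{cases} \int_0^{\infty} \sup_{\eta,\theta} \left(\dot{x}(s)\eta + \dot{y}(s)\theta - \psi(\eta,\theta)\right) ds & \text{if } x, y \in AC_0, \\ \infty & \text{otherwise.} \end{cases} \]
This implies (combining the results of Puhalskii and Whitt (1997) and Russell (1998))
that $(\alpha\Lambda(\Gamma^{-1}(\cdot/\alpha)))_{\alpha>0}$ satisfies a full LDP with good rate function
\[ J(x) \triangleq \begin{cases} \int_0^{\infty} \sup_{\theta} \left(\dot{x}(s)\theta - \chi(\theta)\right) ds & \text{if } x \in AC_0, \\ \infty & \text{otherwise.} \end{cases} \]
This expression, combined with the contraction principle, yields the first part of the
theorem. Hence, we only need to show that if $y > \lambda/\gamma$ and $y\theta^* = \chi(\theta^*)$, then
\[ I(y) = \sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right) = y\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du. \]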
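First, observe that integration by parts yields
\[ \inf_{x \in AC_0;\ y = \int_0^{\infty} e^{-s} x(s)\, ds} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds \right\} = \inf_{x \in AC_0;\ y = \int_0^{\infty} e^{-s} \dot{x}(s)\, ds} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds \right\}. \]
Also, note that for every $s \in \mathbb{R}$
\[ \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) = \sup_{\theta} \left(\theta e^{-s} \dot{x}(s) - \chi\left(\theta e^{-s}\right)\right). \]
In particular, we have that for $x \in AC_0$ and $y = \int_0^{\infty} e^{-s} \dot{x}(s)\, ds$
\[ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds = \int_0^{\infty} \sup_{\theta} \left(\theta e^{-s} \dot{x}(s) - \chi\left(\theta e^{-s}\right)\right) ds \]
\[ \ge \sup_{\theta} \int_0^{\infty} \left(\theta e^{-s} \dot{x}(s) - \chi\left(\theta e^{-s}\right)\right) ds = \sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right). \]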
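Consequently,
\[ I(y) \ge \sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right). \]
Now, if $y > \lambda/\gamma = \chi'(0)$, then
\[ \sup_{\theta \ge 0} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right) \ge 0. \]
On the other hand, for every $\theta > 0$, we have (by making the change of variables
$\theta e^{-s} = u$)
\[ \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds = \int_0^{\theta} \frac{\chi(u)}{u}\, du. \]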
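The change of variables $\theta e^{-s} = u$ can be checked numerically. The sketch below assumes, purely for illustration, a quadratic $\chi(u) = \mu u + \sigma^2 u^2/2$ with hypothetical parameters; this choice is not taken from the text, and the trapezoidal helper is a generic quadrature rule rather than anything specific to the argument.

```python
import math

MU, SIGMA2 = 0.5, 2.0  # hypothetical parameters of the illustrative chi


def chi(u):
    return MU * u + 0.5 * SIGMA2 * u * u


def trapezoid(f, a, b, steps=40000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


theta = 1.3
# Left-hand side: integral over s, truncated at s = 40 (the integrand decays like exp(-s)).
lhs = trapezoid(lambda s: chi(theta * math.exp(-s)), 0.0, 40.0)
# Right-hand side: chi(u)/u extends continuously to u = 0 with value MU.
rhs = trapezoid(lambda u: chi(u) / u if u > 0 else MU, 0.0, theta)
closed_form = MU * theta + SIGMA2 * theta ** 2 / 4
```

For this quadratic $\chi$ both integrals equal $\mu\theta + \sigma^2\theta^2/4$, so `lhs`, `rhs`, and `closed_form` should agree up to quadrature error.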
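Therefore, by first order optimality conditions we have that (using the convexity of
the rate function)
\[ \sup_{\theta \ge 0} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right) = y\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du > 0. \]
Finally consider the function $x^*(s)$ such that $\chi'(\theta^* e^{-s}) = \dot{x}^*(s)$ and $x^*(0) = 0$. Note
that
\[ \int_0^{\infty} e^{-s} \dot{x}^*(s)\, ds = \int_0^{\infty} e^{-s} \chi'\left(\theta^* e^{-s}\right) ds = \frac{-1}{\theta^*} \int_0^{\infty} d\chi\left(\theta^* e^{-s}\right) = \frac{\chi(\theta^*)}{\theta^*} = y. \]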
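Hence, we have that
\[ I(y) = \inf_{x \in AC_0;\ y = \int_0^{\infty} e^{-s} \dot{x}(s)\, ds} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds \right\} \le \int_0^{\infty} \sup_{\theta} \left(\theta e^{-s} \chi'\left(\theta^* e^{-s}\right) - \chi\left(\theta e^{-s}\right)\right) ds \]
\[ = y\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du = \sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right), \]
which yields the conclusion of the theorem.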
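To illustrate the variational formula just derived, consider again a hypothetical quadratic $\chi(\theta) = \mu\theta + \sigma^2\theta^2/2$ (an assumption made only for this example). Then $\int_0^{\theta} \chi(u)/u\, du = \mu\theta + \sigma^2\theta^2/4$, the condition $y\theta^* = \chi(\theta^*)$ gives $\theta^* = 2(y-\mu)/\sigma^2$, and $I(y) = (y-\mu)^2/\sigma^2$. The sketch compares this closed form with a brute-force grid search over $\theta$.

```python
MU, SIGMA2 = 0.5, 2.0  # hypothetical parameters of the illustrative chi


def objective(theta, y):
    """y*theta - int_0^theta chi(u)/u du for chi(u) = MU*u + SIGMA2*u^2/2."""
    return y * theta - (MU * theta + SIGMA2 * theta ** 2 / 4)


def rate_by_grid(y, theta_max=10.0, steps=100000):
    """Brute-force maximization of the objective over a grid on [0, theta_max]."""
    return max(objective(i * theta_max / steps, y) for i in range(steps + 1))


y = 1.5
theta_star = 2 * (y - MU) / SIGMA2   # solves y*theta = chi(theta)
rate_closed = (y - MU) ** 2 / SIGMA2
rate_grid = rate_by_grid(y)
```

The grid value and the closed form should coincide here because the maximizer $\theta^* = 1$ lies on the grid; for a general $\chi$ the grid search only gives a lower bound on the supremum.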
Our final result is an exact LDP formulated in the continuous setting for processes
with a Markovian structure. We adopt the setting of Subsection 5.2, in which a
suitably time homogeneous Markov process $Y = (Y(s) : s \ge 0)$ with generator $A$ was introduced. We also assume that $\Lambda$ is a Lévy process independent of $Y$. The desired
exact LDP for
\[ \alpha D(\alpha) = \alpha \int_0^{\infty} \exp(-\alpha\Gamma(t))\, d\Lambda(t) \]
provided in the next theorem gives support to the following approximation (valid
when $x \gg \lambda/\gamma$)
\[ P_y(D > x) \approx \gamma^{1/2}\, \frac{u(y_0, -\chi(\theta^*)) \exp\left(\theta^* \chi(\theta^*) c(\theta^*)\right)}{\theta^* \sqrt{\pi \chi''(\theta^*)}} \exp\left(-I(x)/\gamma\right), \]
where $u(y,\cdot)$ and $\chi''(\cdot)$ are defined as in Subsection 5.2 via the generalized eigenvalue problem (15), $c(\theta^*) = E_{\tilde{\pi}}\left(u_\theta(\tilde{Y}(1), -\chi(\theta^*)) / u(\tilde{Y}(1), -\chi(\theta^*))\right)$ (with $\tilde{\pi}(dy) = \tilde{\gamma}(y)\pi(dy)/E_\pi \tilde{\gamma}(Y(1))$) and
\[ I(x) = x\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du, \]
with $\chi(\theta^*) = \theta^* x$.
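Theorem 7 Suppose that $Y$ is geometrically ergodic and that $\Lambda(1)$ is non-lattice
with $\phi_\Lambda(\theta) = E \exp(\theta\Lambda(1)) < \infty$ for all $\theta \in \mathbb{R}$. Then, if $x > \lambda/\gamma$ and $c(\theta^*) = E_{\tilde{\pi}}\left(u_\theta(\tilde{Y}(1), -\chi(\theta^*)) / u(\tilde{Y}(1), -\chi(\theta^*))\right)$ (with $\tilde{\pi}(dy) = \tilde{\gamma}(y)\pi(dy)/E_\pi \tilde{\gamma}(Y(1))$),
\[ \exp(I(x)/\alpha) P_y(\alpha D(\alpha) > x) \sim \alpha^{1/2}\, \frac{u(y, -\chi(\theta^*)) \exp\left(\theta^* \chi(\theta^*) c(\theta^*)\right)}{\theta^* \sqrt{\pi \chi^{(2)}(\theta^*)}} \quad \text{as } \alpha \searrow 0, \]
where $\chi(\theta) = -\psi_\Gamma^{-1}(-\psi_\Lambda(\theta))$, $\theta^*$ satisfies $\chi(\theta^*) = \theta^* x$ and $u(y,\theta)$ (with $u(y,0) = 1$)
solves the eigenvalue problem
\[ \frac{1}{\tilde{\gamma}(y)} (Au)(y,\theta) = \left(\frac{-\psi_\Lambda(\theta)}{\tilde{\gamma}(y)} + \chi(\theta)\right) u(y,\theta). \]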
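Proof. Consider the family of probability measures $P^*_y$ defined as
\[ dP^*_y = \exp\left(\theta^* D(\alpha) - \psi(\theta^*,\alpha)\right) dP_y, \]
where $\psi(\theta^*,\alpha) = \log E \exp(\theta^* D(\alpha))$. Note that
\[ \exp(I(x)/\alpha) P_{y_0}(\alpha D(\alpha) > x) = \exp(I(x)/\alpha) E^*_{y_0}\left(1(\alpha D(\alpha) > x) \exp\left(\psi(\theta^*,\alpha) - \theta^* D(\alpha)\right)\right). \]
Now, observe that (since $\theta^* > 0$)
\[ I(x)/\alpha - x\theta^*/\alpha = -\frac{1}{\alpha} \int_0^{\theta^*} \frac{\chi(u)}{u}\, du = -\int_0^{\infty} \chi\left(\theta^* e^{-\alpha s}\right) ds. \]
On the other hand, from the proof of Lemma 5 we have that for all $\theta \in \mathbb{R}$,
\[ \exp\left(\int_0^{\infty} \chi\left(\theta e^{-\alpha t}\right) dt\right) u(y_0, -\chi(\theta)) = E \exp\left(\int_0^{\infty} \psi_\Lambda\left(\theta e^{-\alpha\Gamma(t)}\right) dt - \alpha \int_0^{\infty} \frac{e^{-\alpha u}\, u_\theta\left(\tilde{Y}(u), -\chi(\theta e^{-\alpha u})\right) \chi(\theta e^{-\alpha u})}{u\left(\tilde{Y}(u), -\chi(\theta e^{-\alpha u})\right)}\, du\right). \tag{22} \]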
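This implies that
\[ \exp\left(\psi(\theta,\alpha) - \int_0^{\infty} \chi\left(\theta e^{-\alpha s}\right) ds\right) \sim u(Y_0, -\chi(\theta)) \exp(c(\theta)) \triangleq \xi(y_0,\theta) \tag{23} \]
as $\alpha \searrow 0$. Therefore, we have that
\[ \exp(I(x)/\alpha) P_{y_0}(\alpha D(\alpha) > x) \sim \xi(y_0,\theta^*)\, E^{\alpha}_{y_0}\left(1(D(\alpha) - x/\alpha > 0) \exp\left(-\theta^*(D(\alpha) - x/\alpha)\right)\right). \tag{24} \]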
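The strategy is then to develop an Edgeworth expansion for $\sqrt{\alpha}(D(\alpha) - x/\alpha)$ under
$E^*_{y_0}$. Using the same steps as in the proof of Lemma 5 we can obtain a description of
the local behavior of $\psi^*_\alpha(\theta) \triangleq \log E^*_{y_0} \exp\left(i\theta\sqrt{\alpha}(D(\alpha) - x/\alpha)\right)$. In fact, we can obtain
\[ \psi^*_\alpha(\theta) = -\frac{\theta^2 \chi^{(2)}(\theta^*)}{4} + \sqrt{\alpha}\left(c_1 i\theta + c_2 (i\theta)^3\right) + o\left(\sqrt{\alpha}\right) \]
(uniformly on $\theta \in (-\delta,\delta)$ for some $\delta > 0$). The coefficients $c_1$ and $c_2$ can actually
be computed but their values are not relevant for purposes of developing sharp large
deviations. The coefficient $\chi^{(2)}(\theta^*)/4$ comes from the development of
\[ \int_0^{\infty} \left(\chi\left((\sqrt{\alpha}\theta + \theta^*) e^{-\alpha u}\right) - \chi\left(\theta^* e^{-\alpha u}\right) - \sqrt{\alpha} e^{-\alpha u} x\theta\right) du \]
\[ = \theta\sqrt{\alpha} \int_0^{\infty} \left(\chi'\left(\theta^* e^{-\alpha u}\right) - x\right) e^{-\alpha u}\, du + \theta^2 \int_0^{\infty} \frac{2\alpha \chi^{(2)}(\theta^* e^{-\alpha u}) e^{-2\alpha u}}{4}\, du + o\left(\sqrt{\alpha}\right). \tag{25} \]
Indeed, since
\[ \int_0^{\infty} \chi'\left(\theta^* e^{-\alpha u}\right) e^{-\alpha u}\, du = -\frac{1}{\theta^* \alpha} \int_0^{\infty} d\chi\left(\theta^* e^{-\alpha u}\right) = \frac{\chi(\theta^*)}{\alpha\theta^*} = \frac{x}{\alpha}, \]
we obtain that the coefficient multiplying $\theta$ in (25) vanishes and, thus, $\psi^*_\alpha(\theta) \sim -\theta^2 \chi^{(2)}(\theta^*)/4$ as stated. We also must show that $|\phi^*(\theta,\alpha)| = |E^*_{y_0} \exp(i\theta D(\alpha))| = o(\sqrt{\alpha})$ uniformly on compact sets not containing the origin. The key observation to
prove this condition is to note that
\[ \phi^*(\theta,\alpha) = \frac{E \exp\left(\int_0^{\infty} \psi_\Lambda\left((i\theta + \theta^*) \exp(-\alpha\Gamma(t))\right) dt\right)}{E \exp\left(\int_0^{\infty} \psi_\Lambda\left(\theta^* \exp(-\alpha\Gamma(t))\right) dt\right)}. \]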
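Next, observe that (23) implies that there exists a positive constant $C < \infty$ such that
\[ |\phi^*(\theta,\alpha)| \le C \left| E \exp\left(\int_0^{\infty} \left(\psi_\Lambda\left((i\theta + \theta^*) e^{-\alpha\Gamma(t)}\right) - \chi\left(\theta^* e^{-\alpha t}\right)\right) dt\right) \right|. \tag{26} \]
Now consider the (geometrically ergodic) Markov process $\tilde{Y} = (\tilde{Y}(s) : s \ge 0)$ with generator $\tilde{A} = \frac{1}{\tilde{\gamma}} A$ (compare the proof of Lemma 5, where this process was introduced).
Let us define the probability measure $\tilde{P}$ acting on the sigma-field generated by $\tilde{Y}$ as
\[ d\tilde{P} = M_\infty(\theta^*)\, dP, \]
where $M_\infty(\theta^*)$ is the last element of the bounded martingale $M^* = (M_t(\theta^*) : 0 \le t \le \infty)$ defined as
\[ M_t(\theta^*) = \frac{u\left(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\right)}{u\left(Y_0, -\chi(\theta^* e^{-\alpha t})\right)} \exp\left(\int_0^{t} \frac{\psi_\Lambda(\theta^* e^{-\alpha t})}{\tilde{\gamma}(\tilde{Y}(t))}\, dt - \int_0^{t} \chi\left(\theta^* e^{-\alpha t}\right) dt\right) \]
\[ \times \exp\left(-\alpha \int_0^{t} \theta^* e^{-\alpha t}\, \frac{u_\theta\left(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\right)}{u\left(\tilde{Y}(t), -\chi(\theta^* e^{-\alpha t})\right)}\, \chi\left(\theta^* e^{-\alpha t}\right) dt\right). \]
(This martingale was also introduced in the proof of Lemma 5, where it has been
indicated how the martingale property follows from Lemma 2, p. 82 of Skorohod,
Hoppensteadt and Salehi (2001)). Note, therefore, that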
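\[ E \exp\left(\int_0^{\infty} \left(\psi_\Lambda\left((i\theta + \theta^*) e^{-\alpha\Gamma(t)}\right) - \chi\left(\theta^* e^{-\alpha t}\right)\right) dt\right) = \tilde{E} \exp\left(\int_0^{\infty} \psi_\Lambda\left((i\theta + \theta^*) e^{-\alpha\Gamma(t)}\right) dt - \int_0^{\infty} \psi_\Lambda\left(\theta^* e^{-\alpha\Gamma(t)}\right) dt\right) Z(\alpha), \]
where $B_1 < |Z(\alpha)| < B_2$ for some constants $0 < B_1 < B_2 < \infty$. This implies, using the bound (26), that
\[ |\phi^*(\theta,\alpha)| \le C B_2\, \tilde{E} \left| \exp\left(\int_0^{\infty} \left(\psi_\Lambda\left((i\theta + \theta^*) e^{-\alpha\Gamma(t)}\right) - \psi_\Lambda\left(\theta^* e^{-\alpha\Gamma(t)}\right)\right) dt\right) \right|. \]
From this bound it is easy to see that $|\phi^*(\theta,\alpha)| = o(\sqrt{\alpha})$ by noting that for every $\eta_2 \in \mathbb{R}$, $\exp\left(\psi_\Lambda(i\cdot + \eta_2) - \psi_\Lambda(\eta_2)\right)$ is the characteristic function of $\Lambda(1)$ under the
obvious exponential change of measure and that $\delta t \le \Gamma(t) \le Kt$ for positive finite constants $\delta$ and $K$. With these elements on hand, the corresponding Edgeworth
expansion for $\sqrt{\alpha}(D(\alpha) - x/\alpha)$ under $E^*_{y_0}$ follows routine steps as in the proof of
Theorem 3. Therefore, we obtain that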
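\[ E^{\alpha}_{y_0}\left(1(D(\alpha) - x/\alpha > 0) \exp\left(-\theta^*(D(\alpha) - x/\alpha)\right)\right) = \sqrt{\alpha} \int_0^{\infty} \frac{\exp(-x) \exp\left(-\alpha x^2 / \left(\chi^{(2)}(\theta^*) \theta^*\right)\right)}{\theta^* \sqrt{\pi \chi^{(2)}(\theta^*)}}\, dx \]
\[ + \sqrt{\alpha} \int_0^{\infty} \exp\left(-\theta^* x/\sqrt{\alpha}\right) p(x) \exp\left(-x^2/\chi^{(2)}(\theta^*)\right) dx + \int_0^{\infty} \exp\left(-\theta^* x/\sqrt{\alpha}\right) G_\alpha(dx), \]
where $p(x)$ in the second term above represents a polynomial of degree 3 and $G_\alpha(dx)$
is a signed measure such that $\|G_\alpha(dx)\| = o(\sqrt{\alpha})$. Hence, using the Dominated
Convergence Theorem and the stated property on the total variation of $G_\alpha$, we obtain
that
\[ \frac{1}{\sqrt{\alpha}} E^{\alpha}_{y_0}\left(1(D(\alpha) - x/\alpha > 0) \exp\left(-\theta^*(D(\alpha) - x/\alpha)\right)\right) \to \frac{1}{\theta^* \sqrt{\pi \chi^{(2)}(\theta^*)}}. \]
Combining these estimates with (24) yields the conclusion of the theorem.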
5.5.2 The discrete time setting
The goal now is to obtain the LDP for the discrete time case. The following set of
assumptions are analogous to those stated at the end of the previous section, and
their validity has been verified in many cases (including under Markovian and strong
mixing assumptions; see Zajic (1993), chapters 3 and 4).
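ADL1 For each $\theta, \eta \in \mathbb{R}$, suppose that
\[ g(\eta,\theta) \triangleq \sup_{n,k} \frac{1}{n} \log E \exp\left(\eta \sum_{j=k}^{n+k} Z_j + \theta \sum_{j=k}^{n+k} |X_j|\right) < \infty. \]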
ADL2 If $0 = t_0 < t_1 < \dots < t_m < \infty$ then
\[ W_{\alpha,m} \triangleq \alpha\left(\left(\Gamma(t_i/\alpha) - \Gamma(t_{i-1}/\alpha)\right), \left(\Lambda(t_i/\alpha) - \Lambda(t_{i-1}/\alpha)\right)\right)_{i=1}^{m} \]
satisfies an LDP on $\mathbb{R}^{2m}$ with good rate function
\[ I_m(z) = \sum_{i=1}^{m} (t_i - t_{i-1})\, I\left(\frac{z_i}{t_i - t_{i-1}}\right), \]
where $I(x_1,x_2)$ is the rate function governing the LDP of $n^{-1}(\Gamma(n),\Lambda(n))$.
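The strategy here is first to consider a related family of approximating processes $(\tilde{\Gamma}, \tilde{\Lambda})$ defined via
\[ \tilde{\Gamma}(t) \triangleq \sum_{k=1}^{\lfloor t \rfloor} Z_k + (t - \lfloor t \rfloor) Z_{\lfloor t \rfloor + 1}, \]
\[ \tilde{\Lambda}(t) \triangleq \sum_{k=1}^{\lceil t \rceil} X_k + (t - \lfloor t \rfloor) X_{\lceil t \rceil + 1}. \]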
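The two interpolated processes can be written down directly from their definitions. The sketch below is a literal transcription under the assumption that the sequences are stored in Python lists with `Z[k-1]` holding $Z_k$ and `X[k-1]` holding $X_k$; the function names and the sample data are hypothetical.

```python
import math


def gamma_tilde(t, Z):
    """sum_{k=1}^{floor(t)} Z_k + (t - floor(t)) * Z_{floor(t)+1}, with Z[k-1] = Z_k."""
    n = math.floor(t)
    return sum(Z[:n]) + (t - n) * Z[n]


def lambda_tilde(t, X):
    """sum_{k=1}^{ceil(t)} X_k + (t - floor(t)) * X_{ceil(t)+1}, with X[k-1] = X_k."""
    n_up, n_down = math.ceil(t), math.floor(t)
    return sum(X[:n_up]) + (t - n_down) * X[n_up]


Z = [1.0, 2.0, 3.0, 4.0]   # hypothetical Z_1, ..., Z_4
X = [1.0, 2.0, 3.0, 4.0]   # hypothetical X_1, ..., X_4
g = gamma_tilde(2.5, Z)    # 1 + 2 + 0.5 * 3
l = lambda_tilde(2.5, X)   # 1 + 2 + 3 + 0.5 * 4
```

Note that the list passed in must contain at least $\lfloor t \rfloor + 1$ entries for `gamma_tilde` and $\lceil t \rceil + 1$ entries for `lambda_tilde`.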
Theorem 2.1.1, p. 19, of Zajic (1993) establishes that $\alpha(\tilde{\Gamma}(\cdot/\alpha), \tilde{\Lambda}(\cdot/\alpha))$ satisfies
a full LDP under Stone's topology. (Note that $\lceil\cdot\rceil$ is being used here instead of $\lfloor\cdot\rfloor$ in the definition of $\tilde{\Lambda}$, but it is straightforward to adapt Zajic's estimates in this setting. Also, recall that a full LDP is one that holds with a good and convex rate
function. See Dembo and Zeitouni (1999) for the definition of good rate function.)
Thus, Theorem 5 applies here yielding the full LDP for the corresponding normalized
infinite horizon discounted reward
\[ \alpha\tilde{D}(\alpha) \triangleq \alpha\Psi\left(\tilde{\Gamma}_\alpha^{-1}, \tilde{\Lambda}_\alpha\right) = \alpha \int_{[0,\infty)} \exp(-u)\, \tilde{\Lambda}_\alpha\left(\tilde{\Gamma}_\alpha^{-1}(u)\right) du. \]
In view of this observation, the natural step is to show that $\alpha\tilde{D}(\alpha)$ is suitably close to $\alpha D(\alpha)$ (in exponential scale) as $\alpha \searrow 0$. In other words, we would like to show
that the families of random variables $\{\alpha\tilde{D}(\alpha)\}_{\alpha>0}$ and $\{\alpha D(\alpha)\}_{\alpha>0}$ are exponentially equivalent (i.e. that for each $\delta > 0$
\[ \lim_{\alpha \to 0} \alpha \log P\left(\left|\alpha\tilde{D}(\alpha) - \alpha D(\alpha)\right| > \delta\right) = -\infty, \]
see Dembo and Zeitouni (1999), p. 130). With exponential equivalence on hand
we would be able to conclude, by virtue of Theorem 4.2.13 of Dembo and Zeitouni
(1999), that a full LDP also holds for $\alpha D(\alpha)$ as $\alpha \searrow 0$. We will actually follow
this program but we will utilize a different family of approximating processes. The
reason is that the integral structure in the definition of $\tilde{D}(\alpha) = \Psi(\tilde{\Gamma}_\alpha^{-1}, \tilde{\Lambda}_\alpha)$ allows
us to take advantage of the nature of the Lebesgue measure to construct a family of
approximating processes $\{\bar{\Lambda}_\alpha\}_{\alpha>0}$ which is more convenient for purposes of proving the exponential equivalence required. We, thus, define for each $\alpha > 0$ the continuous
process $\bar{\Lambda}_\alpha$ as
\[ \bar{\Lambda}_\alpha(t) \triangleq \sum_{k=1}^{\lceil t \rceil} X_k + U_\alpha(t), \]
where
\[ U_\alpha(t) = \frac{\lceil t \rceil \left(t - (\lceil t \rceil - \alpha)\right)}{\alpha}\, X_{\lceil t \rceil + 1}\, 1\left(t \in [\lceil t \rceil - \alpha, \lceil t \rceil)\right). \]
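We now show that $(\alpha\tilde{\Gamma}(\cdot/\alpha), \alpha\bar{\Lambda}_\alpha(\cdot/\alpha))$ and $(\alpha\tilde{\Gamma}(\cdot/\alpha), \alpha\tilde{\Lambda}(\cdot/\alpha))$ are equivalent
from a large deviations standpoint.
Lemma 9 The families $\{(\alpha\tilde{\Gamma}(\cdot/\alpha), \alpha\bar{\Lambda}_\alpha(\cdot/\alpha))\}_{\alpha>0}$ and $\{(\alpha\tilde{\Gamma}(\cdot/\alpha), \alpha\tilde{\Lambda}(\cdot/\alpha))\}_{\alpha>0}$
are exponentially equivalent in $C[0,\infty) \times C[0,\infty)$ under Stone's topology.
Proof. It suffices to show the corresponding exponential equivalence for $\{\bar{\Lambda}_\alpha\}_{\alpha>0}$
and $\{\tilde{\Lambda}_\alpha\}_{\alpha>0}$. Recall that Stone's topology is generated by the metric
\[ d_\infty(x,y) = \sum_{k=1}^{\infty} 2^{-k}\, \frac{d_k(x,y)}{1 + d_k(x,y)}, \]
where
\[ d_T(x,y) = \sup_{0 \le t \le T} |x(t) - y(t)| \]
(see Zajic (1993) p. 20). Fix $\delta > 0$ small and choose $k_0 > \lceil -\log(\delta/2)/\log(2) \rceil$. Then,
$\sum_{k=k_0}^{\infty} 2^{-k} < \delta/2$ and, noting that $d_k(\bar{\Lambda}_\alpha, \tilde{\Lambda}_\alpha) \le \alpha \max_{1 \le k \le \lfloor t/\alpha \rfloor} |X_k|$, we can
write
\[ P\left(d_\infty(\bar{\Lambda}_\alpha, \tilde{\Lambda}_\alpha) > \delta\right) \le P\left(d_{k_0}(\bar{\Lambda}_\alpha, \tilde{\Lambda}_\alpha) > \delta/2\right) \le \lfloor k_0/\alpha \rfloor \max_{1 \le k \le \lfloor k_0/\alpha \rfloor} P\left(|X_k| > 2^{-1}\delta/\alpha\right) \]
\[ \le \lfloor k_0/\alpha \rfloor \exp\left(-A 2^{-1}\delta/\alpha\right) \max_{k \in \mathbb{N}} E\left(\exp\left(A 2^{-1}\delta/\alpha\, |X_k|\right)\right), \]
for every $A > 0$ (by virtue of assumption ADL2). Therefore, we conclude that
\[ \lim_{\alpha \to 0} \alpha \log P\left(d_\infty(\bar{\Lambda}_\alpha, \tilde{\Lambda}_\alpha) > \delta\right) = -A 2^{-1}\delta. \]
Letting $A \nearrow \infty$ yields the conclusion of the lemma.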
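The metric $d_\infty$ used in the proof can be approximated by truncating the series and replacing each supremum by a maximum over a grid. The following sketch is only illustrative; the function names, the grid size, and the number of retained terms are assumptions. Since each summand is bounded by $2^{-k}$, dropping the terms beyond `terms` changes the value by at most $2^{-\text{terms}}$.

```python
def sup_distance(x, y, T, grid=1000):
    """Grid approximation of d_T(x, y) = sup_{0 <= t <= T} |x(t) - y(t)|."""
    return max(abs(x(i * T / grid) - y(i * T / grid)) for i in range(grid + 1))


def stone_metric(x, y, terms=30):
    """Partial sum of sum_k 2^-k d_k/(1 + d_k); the omitted tail is at most 2^-terms."""
    total = 0.0
    for k in range(1, terms + 1):
        d_k = sup_distance(x, y, float(k))
        total += 2.0 ** (-k) * d_k / (1.0 + d_k)
    return total


# Hypothetical paths: two constants at distance 1, so every d_k equals 1.
one = lambda t: 1.0
zero = lambda t: 0.0
d = stone_metric(one, zero)
```

For two constant paths at distance 1 every term equals $2^{-k}/2$, so the truncated sum is close to $1/2$.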
The same strategy followed in the continuous case can now be applied to the pair $(\tilde{\Gamma}_\alpha(\cdot), \bar{\Lambda}_\alpha(\cdot))$, as the next proposition summarizes.
Proposition 6 Under assumptions ADL1 and ADL2, the family of random elements
$\bar{\Lambda}_\alpha(\tilde{\Gamma}_\alpha^{-1}(\cdot))$ satisfies a full LDP on the space of continuous functions $C[0,\infty)$ endowed
with Stone's topology. Moreover, the corresponding normalized infinite horizon discounted
reward
\[ \alpha\bar{D}(\alpha) = \alpha \int_{[0,\infty)} \exp(-u)\, \bar{\Lambda}_\alpha\left(\tilde{\Gamma}_\alpha^{-1}(u)\right) du \]
satisfies an LDP with good rate function
\[ I(y) = \inf_{x \in AC_0} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds : y = \int_0^{\infty} e^{-s} x(s)\, ds \right\}, \]
where $\chi(\cdot)$ is defined via $\psi(-\chi(\cdot),\cdot) = 0$.
Proof. It follows just as Theorem 5.
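We are now ready to show that $\alpha\bar{D}(\alpha)$ is suitably close to $\alpha D(\alpha)$ in exponential
scale.
Lemma 10 The families $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ and $\{\alpha D(\alpha)\}_{\alpha>0}$ are exponentially equivalent. In other words, for each $\delta > 0$,
\[ \lim_{\alpha \to 0} \alpha \log P\left(\left|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\right| > \delta\right) = -\infty. \]
Proof. Note that $\lceil \tilde{\Gamma}^{-1}(t) \rceil = \Gamma^{-1}(t)$ for almost every $t$ with respect to Lebesgue
measure. Therefore, it follows that, for almost every $t$,
\[ \alpha\Lambda\left(\Gamma^{-1}(t/\alpha)\right) - \alpha\bar{\Lambda}_\alpha\left(\tilde{\Gamma}^{-1}(t/\alpha)\right) = -\alpha U_\alpha\left(\tilde{\Gamma}^{-1}(t/\alpha)\right). \]
As a result, we have (making the change of variables $\tilde{\Gamma}(t/\alpha) = u/\alpha$) that
\[ \left|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\right| \le \alpha \int_0^{\infty} W(u/\alpha)\, du, \]
where
\[ W(u/\alpha) = \exp\left(-\alpha\tilde{\Gamma}(u/\alpha)\right) |U_\alpha(u/\alpha)|\, Z(\lfloor u/\alpha \rfloor + 1). \]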
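Let us define
\[ V_1 = \alpha \int_0^{t_0} W(u/\alpha)\, du \quad \text{and} \quad V_2 = \alpha \int_{t_0}^{\infty} W(u/\alpha)\, du, \]
and consider the sets
\[ A_1(t_0,\alpha,\varepsilon) \triangleq \left\{\omega : \sup_{t > t_0} \left|\left(\alpha\tilde{\Gamma}(t/\alpha) - \gamma t\right) t^{-1}\right| \le \varepsilon\right\}, \]
\[ A_2(t_0,\alpha,\varepsilon) \triangleq \left\{\omega : \sup_{t > t_0} \left|\left(\alpha\Gamma(t/\alpha) - \gamma t\right) t^{-1}\right| \le \varepsilon\right\}, \]
\[ A_3(t_0,\alpha,m) \triangleq \left\{\omega : \sup_{0 \le t \le t_0} \left|\alpha\tilde{\Gamma}(t/\alpha) - \gamma t\right| \le m\right\}, \]
\[ A_4(t_0,\alpha,M) \triangleq \left\{\omega : \alpha \sum_{k=1}^{\lceil t_0/\alpha \rceil} |X_k| \le M\right\}, \]
\[ A_5(t_0,\alpha,\varepsilon) \triangleq \left\{\omega : \sup_{k > t_0/\alpha} \frac{|X_k|}{k} \le \varepsilon\right\}. \]
For notational convenience, we will drop the arguments in the definitions of $A_j$,
$1 \le j \le 5$. Using these definitions, we can write
\[ P\left(\left|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\right| > \delta\right) \le P\left(\alpha \int_0^{\infty} W(u/\alpha)\, du > \delta;\ \cap_{k=1}^{5} A_k\right) + \sum_{k=1}^{5} P(A_k^c). \]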
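Observe that if we write $K_1 = \exp(-m + \gamma)$ then, on $\cap_{k=1}^{5} A_k$, we have
\[ V_1 \le \alpha K_1 \int_0^{t_0} |X(\lceil u/\alpha \rceil + 1)|\, Z(\lfloor u/\alpha \rfloor + 1)\, 1\left(u/\alpha \in [\lceil u/\alpha \rceil - \alpha, \lceil u/\alpha \rceil)\right) du \]
\[ \le \alpha^3 K_1 \sum_{k=1}^{\lceil t_0/\alpha \rceil} |X_{k+1}| Z_k \le \alpha^3 K_1 \sum_{k=1}^{\lceil t_0/\alpha \rceil} |X_k| \sum_{k=1}^{\lceil t_0/\alpha \rceil} Z_k \le \alpha^2 K_1 M \sum_{k=1}^{\lceil t_0/\alpha \rceil} Z_k. \]
On the other hand, also on $\cap_{k=1}^{5} A_k$, and for $t_0(\varepsilon,\gamma)$ suitably large, there exists a positive constant $K(\varepsilon,\gamma) < \infty$ such that
\[ V_2 \le \alpha K(\varepsilon,\gamma). \]
Thus, if $\alpha < \delta/(2K(\varepsilon,\gamma))$, we have that
\[ P\left(\alpha \int_0^{\infty} W(u/\alpha)\, du > \delta;\ \cap_{k=1}^{5} A_k\right) \le P\left(\alpha \sum_{k=1}^{\lceil t_0/\alpha \rceil} Z_k > \delta/(2\alpha K_1 M)\right). \]
But we know that $\alpha \sum_{k=1}^{\lceil t_0/\alpha \rceil} Z_k$ satisfies an LDP (as $\alpha \to 0$); therefore, we must have
that (for fixed $\varepsilon$, $\gamma$ and large but fixed $t_0$)
\[ \alpha \log P\left(\alpha \int_0^{\infty} W(u/\alpha)\, du > \delta\right) \to -\infty \quad \text{as } \alpha \searrow 0. \]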
Now we analyze $P(A_k^c)$ for each $1 \le k \le 5$. First, note that (by Lemma 7), $t_0$ can be chosen so that
\[ \lim_{\alpha \to 0} \alpha \log P\left(A_1^c(t_0,\alpha,\varepsilon)\right) \le -t_0. \tag{27} \]
Because $(\alpha\Gamma(\cdot/\alpha) - \gamma\cdot)_{\alpha>0}$ satisfies a full LDP on $D[0,\infty)$ endowed with the topology generated by uniform convergence on compact sets (see Theorem 2.2.1 of Zajic
(1993)), it follows that the same argument provided for the proof of condition (20)
applies in this case as well. This implies that a bound such as (27) also applies for
the set $A_2^c$. Observe that
\[ \alpha \log P\left(A_3^c(t_0,\alpha,m)\right) \to -J(m), \]
for some convex good rate function $J(\cdot)$ (by definition of full LDP, see Dembo and Zeitouni (1999)). Now, for $A_4$, we can use Chebyshev's bound to obtain
\[ \alpha \log P\left(A_4^c(t_0,\alpha,M)\right) \le \alpha \left\lceil \frac{t_0}{\alpha} \right\rceil \log E \exp\left(\sum_{k=1}^{\lceil t_0/\alpha \rceil} |X_k| - M\right) \le \alpha g(0,1) - M \to -M \quad \text{as } \alpha \searrow 0. \]
Finally, for $A_5$, we have
\[ P\left(A_5^c(t_0,\alpha,\varepsilon)\right) \le \sum_{k > t_0/\alpha} P\left(\frac{|X_k|}{k} > \varepsilon\right) \le \sum_{k > t_0/\alpha} \exp(-\varepsilon k)\, E \exp(|X_k|) \le \frac{\exp(g(0,1)) \exp\left(-\varepsilon \lceil t_0/\alpha \rceil\right)}{1 - \exp(-\varepsilon)}. \]
Thus, for $\varepsilon > 0$ small but fixed,
\[ \alpha \log P\left(A_5^c(t_0,\alpha,\varepsilon)\right) \to -\varepsilon t_0. \]
Combining the previous estimates, we conclude that
\[ \lim_{\alpha \to 0} \alpha \log P\left(\left|\alpha\bar{D}(\alpha) - \alpha D(\alpha)\right| > \delta\right) \le -(2 + \varepsilon) t_0 - M - J(m). \]
Since $J(\cdot)$ is a convex good rate function, the quantity on the right-hand side
tends to $-\infty$ as $m, t_0, M \nearrow \infty$, which completes the proof of the lemma.
We are now in position to identify the rate function required to make practical
use of approximation (18) and under which the LDP for $\alpha D(\alpha)$ holds. Define (as in
Zajic (1993) p. 9)
\[ \psi(\eta,\theta) = \lim_{n \to \infty} \frac{1}{n} \log E \exp\left(\eta\Gamma(n) + \theta\Lambda(n)\right) < \infty. \]
Theorem 8 Suppose that ADL1 and ADL2 hold. Then, if $y > \lambda/\gamma$,
\[ \lim_{\alpha \to 0} \alpha \log P(\alpha D(\alpha) \ge y) = -\inf_{x \in AC_0} \left\{ \int_0^{\infty} \sup_{\theta} \left(\theta \dot{x}(s) - \chi(\theta)\right) ds : y = \int_0^{\infty} e^{-s} x(s)\, ds \right\}, \]
where $AC_0$ is the space of absolutely continuous functions, defined on the interval
$[0,\infty)$, that vanish at the origin. In addition, if there exists $\theta^* = \theta^*(y)$ such that
$y\theta^* = \chi(\theta^*)$, then
\[ \lim_{\alpha \to 0} \alpha \log P(\alpha D(\alpha) \ge y) = -\sup_{\theta} \left(y\theta - \int_0^{\infty} \chi\left(\theta e^{-s}\right) ds\right) = -\left(y\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du\right). \]
Proof. We know that $\{\alpha D(\alpha)\}_{\alpha>0}$ and $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ are exponentially equivalent. On the other hand, Proposition 6 indicates that $\{\alpha\bar{D}(\alpha)\}_{\alpha>0}$ satisfies a full LDP. Thus, by Theorem 4.2.13, p. 130, of Dembo and Zeitouni (1999), $\{\alpha D(\alpha)\}_{\alpha>0}$ must also satisfy a full LDP with the same rate function. The identification of the rate function
follows as in Theorem 6.
The corresponding exact large deviations asymptotic is provided under the iid
setting described in Subsection 5.1. Under those conditions, if $x \gg \lambda/\gamma$, we shall
provide rigorous justification for the approximation
\[ P(D > x) \approx \frac{\sqrt{\gamma}}{\theta^* \sqrt{\pi \chi''(\theta^*)}} \exp\left(-\left(x\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du\right) \Big/ \gamma\right), \]
where $x\theta^* = \chi(\theta^*)$, $\chi(\cdot)$ satisfies $\psi(-\chi(\cdot),\cdot) = 0$, and $\psi(\eta,\theta) = \log E \exp(\eta Z + \theta X)$.
As usual, the approximation will be shown to hold in the regime of small interest
rates. That is, we will show that the previous approximation is valid for the discrete
time discounted reward $D(\alpha) = \sum_{k=0}^{\infty} \exp\left(-\alpha \sum_{j=0}^{k-1} Z_j\right) X_k$. The proof of the next
theorem follows the same strategy as that of Theorem 7.
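The discrete time discounted reward can be evaluated for a finite sample by accumulating the discount exponent term by term. The sketch below truncates the infinite series at the length of the supplied sequences; the function name and the deterministic test data are assumptions made for illustration.

```python
import math


def discounted_reward(alpha, Z, X):
    """Truncated D(alpha) = sum_{k>=0} exp(-alpha * sum_{j<k} Z_j) * X_k.

    Z[k] and X[k] hold Z_k and X_k; the series is cut at len(X) terms.
    """
    total, accumulated = 0.0, 0.0
    for z_k, x_k in zip(Z, X):
        total += math.exp(-alpha * accumulated) * x_k
        accumulated += z_k  # running sum Z_0 + ... + Z_k used by the next term
    return total


# Hypothetical deterministic check: Z_k = X_k = 1 gives a geometric series.
value = discounted_reward(0.1, [1.0] * 2000, [1.0] * 2000)
```

With $Z_k = X_k = 1$ the sum is geometric, so the result can be compared with $(1 - e^{-\alpha n})/(1 - e^{-\alpha})$ for $n$ retained terms.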
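Theorem 9 Suppose that $(X_k, Z_k)_{k \ge 0}$ is an iid sequence of random variables. Suppose
that $Z_k > 0$ and that for all $\eta, \theta \in \mathbb{R}$ we have that $E \exp(\theta Z_1 + \eta X_1) < \infty$. In addition, assume that conditions AI2 and AI3 of Subsection 5.1 hold. Let $\chi(\theta)$ be
defined as the solution to
\[ \psi(-\chi(\theta),\theta) = 0, \]
where $\psi(\eta,\theta) = \log E \exp(\eta Z_1 + \theta X_1)$. Suppose that $x > \lambda/\gamma$ and let $\theta^*$ be the
solution of $x\theta^* = \chi(\theta^*)$. Then,
\[ P(\alpha D(\alpha) > x) \sim \alpha^{1/2}\, \frac{\exp(-I(x)/\alpha)}{\theta^* \sqrt{\pi \chi^{(2)}(\theta^*)}} \quad \text{as } \alpha \searrow 0, \]
with $I(x) = x\theta^* - \int_0^{\theta^*} \frac{\chi(u)}{u}\, du$.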
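The right-hand side of the asymptotic in Theorem 9 can be evaluated numerically once $\chi$ is available. The following is a minimal sketch, not the procedure used in the text: it assumes $\chi$ is supplied as a Python callable, finds $\theta^*$ by bisection, computes $I(x)$ by a midpoint rule, and approximates $\chi^{(2)}(\theta^*)$ by a central difference. The quadratic $\chi$ and all parameter values are hypothetical, and the sketch makes no claim about how accurate the approximation is for any particular model.

```python
import math


def solve_theta_star(chi, x, lo=1e-8, hi=50.0, iters=200):
    """Bisection for the positive root of chi(theta) = theta * x."""
    f = lambda th: chi(th) - th * x
    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("no sign change on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate(chi, x, theta_star, steps=20000):
    """I(x) = x*theta_star - int_0^theta_star chi(u)/u du, by the midpoint rule."""
    h = theta_star / steps
    integral = sum(chi((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps)) * h
    return x * theta_star - integral


def tail_approximation(chi, x, alpha, eps=1e-3):
    """sqrt(alpha) * exp(-I(x)/alpha) / (theta_star * sqrt(pi * chi''(theta_star)))."""
    theta_star = solve_theta_star(chi, x)
    chi2 = (chi(theta_star + eps) - 2.0 * chi(theta_star) + chi(theta_star - eps)) / eps ** 2
    value = math.sqrt(alpha) * math.exp(-rate(chi, x, theta_star) / alpha)
    return value / (theta_star * math.sqrt(math.pi * chi2))


MU, SIGMA2 = 0.5, 2.0  # hypothetical parameters
chi = lambda th: MU * th + 0.5 * SIGMA2 * th * th
approx = tail_approximation(chi, x=1.5, alpha=0.1)
```

For this quadratic $\chi$ one has $\theta^* = 1$, $I(1.5) = 1/2$ and $\chi^{(2)} = 2$, so the returned value can be checked against $\sqrt{0.1}\, e^{-5}/\sqrt{2\pi}$.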
Bibliography
[1] Adler, J., Feldman, R., and Taqqu, M. (Editors.) A Practical Guide to Heavy
Tails: Statistical Techniques and Applications. Birkhauser. Boston.
[2] Asmussen, S. (1987) Applied Probability and Queues. Wiley. New York.
[3] Asmussen, S. (2001) Ruin Probabilities. World Scientific. Singapore.
[4] Asmussen, S. (2003) Applied Probability and Queues. Springer-Verlag. New York.
[5] Asmussen, S., and Binswanger, K. (1997) Simulation of ruin probabilities for
subexponential claims. Astin Bulletin 27, 297-318.
[6] Asmussen, S., and Hojgaard, B. (1999) Approximations for finite horizon ruin
probabilities in the renewal model. Scand. Act. J. 2, 106-119.
[7] Bahr, B. (1975) Asymptotic ruin probabilities when exponential moments do not
exist. Scand. Act. J., 6-10.
[8] Bédard, D., and Dufresne, D. (2001). Pension funding with moving average rates
of return. Scand. Actuarial Journal 101: 1-17.
[9] Benveniste, A., Métivier, M., and Priouret, P. (1990) Adaptive algorithms and
stochastic approximations. Springer-Verlag. New York.
[10] Billingsley, P. (1999) Convergence of probability measures. Wiley. New York.
[11] Blanchet, J., Olvera-Cravioto, M. and Glynn, P. (2004) From diffusion to large
deviations for the maximum of random walk. In preparation.
[12] Borovkov, A. (1976) Asymptotic methods in queueing theory. Springer-Verlag.
New York.
[13] Borovkov, A. (2000) Estimates for the distribution of sums and maxima of sums
of random variables without the Cramer condition. Preprint.
[14] Bowers, N., Gerber, H., Hickman, J., Jones, D., and Nesbitt, C. (1997) Actuarial
Mathematics. The Society of Actuaries. Schaumburg, Illinois.
[15] Breiman, L. (1992) Probability. Addison-Wesley. Massachusetts.
[16] Bucklew, K., Kurtz, T., and Sethares, W. (1993) Weak convergence and local
stability properties of fixed step size recursive algorithms. IEEE Transactions on
Information Theory, Vol. 39, No. 3, pp. 966-978.
[17] Butzer, P., and Nessel, R. (1971) Fourier Analysis and Approximation. Vol 1.
Birkhauser Verlag. New York.
[18] Campbell, J., Lo, A., and Mackinlay, C. (1999) The econometrics of financial
markets. Princeton University Press.
[19] Carlsson, H. (1983) Remainder term estimates of the renewal function. Annals of
Probability, Vol. 11, No. 1, 143-157.
[20] Carmona, P., Petit, F., and Yor, M. (2001) Exponential functionals of Lévy
processes. O. Barndorff-Nielsen, T. Mikosch and S. Resnick (eds.) Lévy processes:
theory and applications. 41-55, Birkhauser.
[21] Chang, J. (1992) On moments of the first ladder height of random walks with
small drift. Ann. of App Prob. 2, 714-738.
[22] Chang, J., and Peres, Y. (1997) Ladder heights, Gaussian random walks and the
Riemann zeta function. Annals of Probability 25, 787-802.
[23] Csörgo, M. and Révész, P. (1981) Strong Approximations in Probability and
Statistics. Academic Press.
[24] Dembo, A., and Zeitouni, O. (1998) Large deviations techniques and applications.
Springer-Verlag. New York.
[25] Dufresne, D., (1990) The distribution of a perpetuity, with applications to risk
theory and pension funding. Scandinavian Actuarial Journal.
[26] Dufresne, F., and Gerber, H. (1991) Risk theory for the compound Poisson pro-
cess that is perturbed by diffusion. Insurance Math. Econom. 51-59
[27] Embrechts, P., and Goldie, C. (1994) Perpetuities and random equations. In:
Mandl, P., Huskova, M. (eds.) Asymptotic Statistics. Proceedings of the 5th
Prague Symposium, 75-86. Physica-Verlag.
[28] Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997) Modelling extreme events
with applications to insurance and finance. Springer-Verlag. New York.
[29] Embrechts, P., and Veraverbeke, N. (1982) Estimates for the probability of ruin
with special emphasis on the possibility of large claims. Insurance: Mathematics
and Economics, 1, 55-72.
[30] Feller, W. (1978) An Introduction to Probability Theory and Its Applications II.
Wiley. New York.
[31] Feng, J., and Kurtz, T. (2000) Large deviations for stochastic processes. Preprint.
[32] Fornari, F., and Mele, A. (1997) Weak convergence and distributional assumptions
for a general class of non-linear ARCH models. Econometric Reviews, 16
(2), 205-227.
[33] Gaier, J. Grandits, P. and Schachermayer, W. (2003) Asymptotic ruin probabil-
ities and optimal investment. Ann. of Appl. Prob. Vol. 13.
[34] Gerber, H. (1971) The discounted central limit theorem and its Berry-Esséen
analogue. Ann. of Math. Stat. Vol. 42 , No. 1, 389-392.
[35] Gjessing, H., and Paulsen, J. (1997) Present value distributions with applications
to ruin theory and stochastic equations, St. Pr. and Appl. Vol. 71, 123-144.
[36] Goldie, C. (1991) Implicit renewal theory and tails of solutions of random equa-
tions. Ann. Appl. Probab. 126-166,
[37] Goldie, C. and Grübel, R. (1996) Perpetuities with Thin Tails. Adv. Appl. Prob.
28, 463-480.
[38] Grandell, J. (1991) Aspects of Risk Theory. Springer-Verlag. New York.
[39] Gut, A. (1988) Stopped random walks. Springer-Verlag. New York.
[40] Harrison, M. (1977) Ruin problems with compounding assets. St. Pr. and Appl.
1977. 5, 67-79.
[41] Hogan, M. (1986) Comment on ‘Corrected Diffusion Approximations in Certain
Random Walk Problems’. J. Appl. Probab. 23, 89-96.
[42] Horvath, L. (1984a) Strong approximation of renewal processes. Stochastic Process.
Appl. 18, No. 1, 127-138.
[43] Horvath, L. (1984b) Strong approximation of extended renewal processes. Ann.
Probab. 12, No. 4, 1149-1166.
[44] Horvath, L. (1986) Strong approximations of renewal processes and their applications.
Acta Math. Hungar. 47, No. 1-2, 13-28.
[45] Kalashnikov, V. (1997) Geometric Sums: Bounds for Rare Events with Applica-
tions. Kluwer. Dordrecht, The Netherlands.
[46] Kesten, H. (1973) Random difference equations and renewal theory for products
of random matrices. Acta Math. 131, 207-248.
[47] Kiefer, J. and Wolfowitz, J. (1956) On the characteristics of the general queueing
process with applications to random walks. Trans. Amer. Math. Soc. 78, 1-18.
[48] Kingman, J. (1963) Ergodic properties of continuous time Markov processes and
their discrete skeletons. Proc. London Math. Soc. 13, 593-604.
[49] Kontoyiannis, I., and Meyn, S. (2003) Spectral theory and limit theorems for
geometrically ergodic Markov processes. Ann. Appl. Probab. 13, 304-362.
[50] Kontoyiannis, I., and Meyn, S. (2004) Large deviations asymptotics and the
spectral theory of multiplicatively regular Markov processes. Preprint.
[51] Kushner, H. (1984) Approximation and weak convergence methods for random
processes. MIT Press Series in Signal Processing, Optimization and Control,
Cambridge, Massachusetts.
[52] Lai, T. (1976) Asymptotic moments of random walks with applications to ladder
variables and renewal theory. Annals of Probability. 4, 51-66.
[53] Lindley, D. (1952) The theory of a queue with a single-server. Proc. Cambr.
Philos. Soc. 48, 277-289.
[54] Lin, S., and Willmot, G. (2000) Lundberg approximations for compound distributions
with insurance applications. Springer-Verlag. New York.
[55] Mills, T. (1993) The econometric modelling of financial time series. Cambridge
University Press.
[56] Nelson, D. (1990) ARCH models as diffusion approximations. Journal of Econo-
metrics. 45, 7-38.
[57] Nyrhinen, H. (1999) On the ruin probabilities in a general economic environment.
St. Pr. and Appl. 83, 318-330.
[58] Nyrhinen, H. (2001) Finite and infinite time ruin probabilities in a stochastic
economic environment. St. Pr. and Appl. 92, 265-285.
[59] Paulsen, J. (1993) Risk theory in a stochastic economic environment, St. Pr. and
Appl. Vol.46, 327-361
[60] Paulsen, J. (1998) Sharp conditions for certain ruin in a risk process with stochas-
tic return on investments, St. Pr. and Appl. Vol.75, 135-148.
[61] Pollak, M. and Siegmund, D. (1985) A diffusion and its applications to detecting
a change in the drift of Brownian motion. Biometrika, 72, 267-280.
[62] Philipp, W., and Stout, W. (1975) Almost Sure Invariance Principles for Partial
Sums of Weakly Dependent Random Variables. Providence, R.I.: American
Mathematical Society.
[63] Puhalskii, A., and Whitt, W. (1997) Functional large deviation principles for
first-passage-time processes. Ann. of Appl. Prob. Vol 7, No. 2, 362-381.
[64] Rudin, W. (1987) Real and Complex Analysis. McGraw-Hill. New York.
[65] Russell, R. (1998) The large deviations of random time changes. Ph.D. thesis.
Trinity College, Dublin.
[66] Shephard, N. (1996) Statistical aspects of ARCH and stochastic volatility. In:
Cox, D.R., Hinkley, D.V. and Barndorff-Nielsen, O.E. (eds) Likelihood, Time
Series with Econometric and Other Applications. Chapman and Hall.
[67] Skorokhod, A., Hoppensteadt, F., and Salehi, H. (2001) Random perturbation
methods. Springer-Verlag. New York.
[68] Siegmund D. (1979) Corrected diffusion approximations in certain random walk
problems. Adv. Appl. Prob. 11, 701-719.
[69] Siegmund D. (1985) Sequential Analysis. Springer-Verlag. New York.
[70] Spitzer, F. (1964) Principles of Random Walk. Van Nostrand. New York.
[71] Stone, C. (1965) On characteristic functions and renewal theory. Trans. Amer.
Math. Soc., Vol. 120, 327-342
[72] Van Hoorn, M. H. (1984) Algorithms and Approximations for Queueing Systems.
Center for Mathematics and Computer Science. Amsterdam, The Netherlands.
[73] Vervaat, W. (1979) On a stochastic difference equation and a representation of
non-negative infinitely divisible random variables. Adv. Appl. Probab. 11, 750-783.
[74] Whitt, W. (2001) Stochastic-Process Limits. Springer-Verlag. New York.
[75] Wilkie, A. (1986) A Stochastic investment model for actuarial use. Transactions
of The Faculty of Actuaries. 39, 341,
[76] Willinger, W., Taqqu, M., Leland, W., and Wilson, D. (1995) Self-similarity in
high-speed packet traffic: Analysis and Modeling of Ethernet Traffic Measurements.
Statistical Science. Vol. 10, 67-85.
[77] Woodroofe, M. (1979) Repeated likelihood ratio tests. Biometrika. 66, 453-463.
[78] Woodroofe, M. (1982) Non-Linear Renewal Theory in Sequential Analysis. Soci-
ety for Industrial and Applied Mathematics. Philadelphia.
[79] Yor, M. (2001) Interpretations in terms of Brownian and Bessel meanders of the
distribution of a subordinated perpetuity. O. Barndorff-Nielsen, T. Mikosch and
S. Resnick (eds.) Lévy processes: theory and applications. 41-55, Birkhauser.
[80] Zajic, T. (1993) Large deviations for sample path processes and applications.
Stanford Ph.D. dissertation O.R.