Constrained Portfolio Optimization

DISSERTATION of the University of St. Gallen,

Graduate School of Business Administration, Economics, Law and Social Sciences (HSG)

to obtain the title of Doctor of Economics

submitted by

Stephan Muller

from

Germany

Approved on the application of

Prof. Dr. Heinz Muller

and

Prof. Dr. Freddy Delbaen

Dissertation no. 3030

Adag Copy AG, Zurich 2005

The University of St. Gallen, Graduate School of Business

Administration, Economics, Law and Social Sciences (HSG) hereby

consents to the printing of the present dissertation, without hereby

expressing any opinion on the views herein expressed.

St. Gallen, January 20, 2005

The President:

Prof. Dr. Peter Gomez

Acknowledgments

When I started my thesis project, I only had the vague idea of doing research

on the general equilibrium theory in financial markets. Despite this vague

project statement, my thesis advisor Heinz Muller not only accepted me as a

doctoral candidate, but also took a great interest in my thesis from the very

beginning. It was through his guidance that I abandoned the general equi-

librium research project, and concentrated my study on constrained portfolio

optimization. With hindsight I can say that if this had been the only advice

he had given to me, it would have already served me very well. But over time,

he has made so many contributions to my thinking that this thesis would not

have taken the current shape without his effort. I sincerely want to thank him

for all the time he has devoted to my work.

After it had become clear what my thesis subject would be, Heinz Muller

suggested asking Freddy Delbaen to be my co-advisor. With pride I thank

Freddy Delbaen for immediately accepting and supporting me at several deci-

sive points. I still very much remember how he once gave me a lucid lecture

on martingale differences at his office. This lecture would have shattered my hope

of proving a certain result, had he not pointed me in the right

direction at the end of it.

Many other people have added to my thesis. Over the years, I benefited

from insightful discussions, comments and suggestions from Roger Baumann,

Patrick Coggi, Wolfgang Drobetz, Constantin Filitti, Lars Jaeger, Tilman

Keese, Michael Schurle, Daniel Seiler, Stefan Wittmann, and Alexandre Ziegler.

Both Roger Baumann and Evelyn Ribi proof-read parts of my thesis. I further

express my gratitude to Alex Keel and Heinz Muller for letting me work at

the Department of Mathematics and Statistics. I stayed for more than five

years, first as a teaching and then as a research assistant. While I was working

there, I wore out three office room mates: Albert Gabriel, Christian Bach, and

Christina Jockle. I certainly hope that I was as pleasant a room mate to them

as they were to me. I also want to express my special thanks to David Schiess

for re-inflaming my enthusiasm for playing soccer. During parts of my thesis

I was working at Vescore Solutions, St. Gallen, and Partners Group, Zug. All

the people I met at these places were very supportive of my thesis project.

Zug, January 2005 Stephan Muller

Contents

Acknowledgments I

Notation VII

1 Portfolio Optimization (General Case) 1

1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 The Financial Market Model . . . . . . . . . . . . . . . . . . . 4

1.3 Superhedging of Contingent Claims . . . . . . . . . . . . . . . . 7

1.4 A General Existence Result . . . . . . . . . . . . . . . . . . . . 11

1.5 The Optimal Wealth Process . . . . . . . . . . . . . . . . . . . 16

1.6 First-Order Conditions . . . . . . . . . . . . . . . . . . . . . . . 17

1.7 Related Research . . . . . . . . . . . . . . . . . . . . . . . . . . 20

1.8 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2 Portfolio Optimization (Time-Additive) 29

2.1 Complete Market Portfolio Optimization . . . . . . . . . . . . . 30

2.1.1 The Unconstrained Dynamic and Static Problems . . . 30

2.1.2 A Verification Theorem for the Static Problem . . . . . 35

2.1.3 Equivalence of Dynamic and Static Problems . . . . . . 37

2.1.4 A Verification Theorem for the Dynamic Problem . . . 40

2.1.5 Existence of an Optimal Solution . . . . . . . . . . . . . 40

2.1.6 Examples (Unconstrained Brownian Market) . . . . . . 58

2.2 Introduction to Constrained Optimization . . . . . . . . . . . . 65

2.2.1 The Constrained Dynamic Problem . . . . . . . . . . . 65

2.2.2 Existence of an Optimal Solution . . . . . . . . . . . . . 67

2.2.3 First-Order Conditions . . . . . . . . . . . . . . . . . . . 70

2.2.4 Examples (Constrained Brownian Market) . . . . . . . . 79

3 Duality Approach (Time-Additive) 83

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

3.2 Unconstrained Problem . . . . . . . . . . . . . . . . . . . . . . 85

3.3 Constraints, but no Consumption . . . . . . . . . . . . . . . . . 88

3.4 The General Case with Consumption . . . . . . . . . . . . . . . 95

3.5 Extensions and Ramifications . . . . . . . . . . . . . . . . . . . 103

3.5.1 0 in Constraint Set . . . . . . . . . . . . . . . . . . . . . 104

3.5.2 Stochastic Income . . . . . . . . . . . . . . . . . . . . . 104

3.5.3 Negative Wealth . . . . . . . . . . . . . . . . . . . . . . 105

3.5.4 Other Utility Functions . . . . . . . . . . . . . . . . . . 106

3.5.5 Various Extensions . . . . . . . . . . . . . . . . . . . . . 108

4 Optimal Portfolios 111

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

4.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 112

4.1.2 Previous Work . . . . . . . . . . . . . . . . . . . . . . . 113

4.2 Model and Standing Assumptions . . . . . . . . . . . . . . . . . 117

4.3 Optimal Portfolios for Ito Processes . . . . . . . . . . . . . . . 119

4.3.1 Cone Constraints . . . . . . . . . . . . . . . . . . . . . . 119

4.3.2 Closed Constraints . . . . . . . . . . . . . . . . . . . . . 121

4.4 Extensions and Ramifications . . . . . . . . . . . . . . . . . . . 127

A A General Semimartingale Model 129

A.1 Stochastic Setting . . . . . . . . . . . . . . . . . . . . . . . . . 130

A.2 Topological Properties . . . . . . . . . . . . . . . . . . . . . . . 136

A.3 Dual Characterization . . . . . . . . . . . . . . . . . . . . . . . 140

A.3.1 Portfolio-Proportion Processes . . . . . . . . . . . . . . 141

A.3.2 Portfolio Processes . . . . . . . . . . . . . . . . . . . . . 159

B Convex Analysis and Duality 165

B.1 Kramkov / Schachermayer’s Duality Result . . . . . . . . . . . 165

B.2 Generalized Lagrangians . . . . . . . . . . . . . . . . . . . . . . 167

B.3 A Stochastic Optimization Problem . . . . . . . . . . . . . . . 174

C Various Proofs 179

C.1 Proof of Several Results in Section 2.1 . . . . . . . . . . . . . . 179

C.1.1 Proof of Theorem 2.1.12 . . . . . . . . . . . . . . . . . . 179

C.1.2 Proof of Corollary 2.1.14 . . . . . . . . . . . . . . . . . . 182

C.1.3 Proof of Lemma 2.1.20 . . . . . . . . . . . . . . . . . . . 183

C.2 A Distributional Property of Ito Processes . . . . . . . . . . . . 184

C.3 Solution to a SDE . . . . . . . . . . . . . . . . . . . . . . . . . 186

C.4 A Simple Comparison Theorem . . . . . . . . . . . . . . . . . . 189

Bibliography 190

Notation

Numbers after the explanation refer to pages where the concept is defined.

We follow two principles:

(i) Use the conventional notation of the field.

(ii) Use different alphabets / different character sets for different mathe-

matical objects.

If the two rules contradict each other, then we will usually prefer the

conventional notation.

Functions

B(·) bequest function, state-dependent,

time-additive terminal utility function

IA indicator function of set A

u(·, ·) utility function for contingent claims

u(·) utility function for initial wealth

U(·, ·) state-dependent, time-additive

utility function

Measures

λ measure on (subsets of) the real line I,

usually the Lebesgue measure, 4, 130

λ⊗ P product measure

P ‘true’ probability measure, 4

Qm equivalent (local) martingale measure, 4, 132

Integrals

$\int_0^t f(s)\,ds$ Lebesgue integral with respect to λ, 131

$\int f\,d\mu$ integral with respect to measure µ, 131

$\int_0^{\cdot} \xi_s \cdot dS_s$ (vector) stochastic integral, 131

$\int_{0+}^{\cdot} \xi_s \cdot dS_s$ see Remark A.1.2

Stochastic Processes

are adapted (convention)

ASK(Q) upper variation process, 142

c, (ct), c(·) consumption process, 6, 135

c minimal consumption, 31

π, (πt), π(·) portfolio-proportion process, 5, 134

S, (St), S(·) (vector-valued) semimartingale, “risky” assets,

“stocks”, 4, 130

$(1/S_t)$ see p. 132

$(\xi_t S_t)$ see p. 132

$\int_0^{\cdot} \xi_s \cdot dS_s$ vector stochastic integral, 131

W, (Wt),W (·) (discounted) wealth process, 5, 133, 135

ξ, (ξt), ξ(·) (admissible) portfolio process, 5

x,x(·) explicit matrix notation, 132

Z,Z(·) Brownian motion, 59

Sets

A(S, W0), Aπ(S, W0), admissible combinations of (ξ, c), (π, c), 135

AK(S, W0), AKπ (c,W0) constrained admissible combinations of

(ξ, c), (π, c), 135

B[0, T ] Borel σ-algebra on [0, T ]

C°, C°° polar, bipolar of a set, 137

CK attainable contingent claims, 143

cl(C) closure of set C

cone(C) cone generated by C, 167

conv(C) convex hull of C

F σ-field of a probability space, 4

F1 ⊗F1 product σ-field

F(t) filtration, 4, 130

FZN (t) augmented Brownian filtration, 59

I subset of the real line, index of time

int(C) interior of C

K constrained set

La(S, W0) admissible portfolio processes, 133

Lπ(S) portfolio-proportion processes, 134

La,π(S) (admissible) portfolio-proportion

processes, 134

M(S) set of equivalent (local) martingale

measures, 4, 132

M(SK) equivalent measures, 142

P predictable σ-algebra, 131

Prog progressive σ-algebra, 11

R partial ordering, subset of X× X, 168

IR real numbers

IR+, IR+0, IR−, IR−0 positive, non-negative, negative,

non-positive real numbers, 130

S attainable portfolio(-proportion)

processes, 4, 132

SK constrained attainable portfolio

(-proportion) processes, 142

WK(W0) attainable wealth processes, 142

YK “state-price densities”, 143

YK sequential “closure” of YK, 144

Spaces

X,Y,Z real linear spaces

X×Y product spaces

X′,Y′,Z′ topological duals of real linear spaces

(Ω,F , P) probability space, 4

(L0(Ω,F , P), dP), (L0(P), dP) space of random variables, 137

(L0+(Ω,F , P), dP), (L0+(P), dP) space of non-negative random variables, 137

(Lp(Ω,F , P), dP), (Lp(P), dP) space of p-integrable random variables

(L(S), dS) the space of S-integrable, predictable

processes, 4, 136

(I × Ω,P, λ⊗ P) measure space of predictable processes, 131

Varia

a ≡ b a equivalent to b

dom(f) domain of function f

ess-sup(C) essential supremum of C

f > g inequality for functions f, g, 131

f+, f− max(f, 0), max(−f, 0)

f−1 inverse of function f

N number of risky assets

T maximal element of I, 4, 130

Chapter 1

Portfolio Optimization

(General Case)

1.1 Introduction

Portfolio optimization is a cornerstone of modern finance theory:

(i) It relates the theory of financial markets to mainstream microeconomic

theory by showing that pricing in financial markets is just a special case of

utility optimization. To be more specific, the arbitrage pricing paradigm

of mathematical finance turns out to be a special case of the paradigm of

relative pricing by the marginal rate of substitution (see Remark 3.3.5).

(ii) It is of practical importance due to its applications. To name just two:

optimal portfolio choice for an institutional investor, and pricing in in-

complete markets.

For these reasons, a lot of work on this subject has been done over the last

fifty years. Indeed, modern finance starts with the discovery that there is a

tradeoff between risk and return in holding financial assets (Markowitz, 1952;

Roy, 1952). To model the return of a portfolio of financial assets, we intro-

duce random variables. These random variables depend on the choice of an

investor, namely on her choice to hold certain assets. Given a goal, we can

try to find the optimal portfolio choice, i.e. the portfolio choice that maximizes

or minimizes her goal. Markowitz postulates that the portfolio returns at the

end of the period are normally distributed random variables, and that the goal

is to minimize the risk (measured by the variance of the portfolio) given an

expected portfolio return. This leads to a convex problem solved in Markowitz

(1952). Roy gets similar results using the Chebyshev bound as the risk mea-

sure. Despite their elegance, these approaches to portfolio optimization have

several shortcomings: they are static approaches; the assumption about the

distribution of the asset prices is questionable given empirical facts; and the

goal seems somewhat simplistic.

The approach used today was pioneered by Samuelson (1969) and Merton

(1969). They suggest modeling the risk of a portfolio and the portfolio choice

by stochastic processes. They also propose rather general utility functions.

We follow Samuelson’s and Merton’s lead throughout this thesis. This chapter

tackles their problem in a very general setting. The stochastic processes of

the risky assets are general semimartingales, the constraints are convex and

can be state-dependent, and the utility function is quasiconcave, upper semi-

continuous and nondecreasing. We prove existence of an optimal strategy and

characterize the optimal solution.

The outline of this chapter is as follows: Section 1.2 clarifies the notation.

The next section, Section 1.3, discusses a superhedging result. We thereby

characterize all contingent claims that are attainable using dynamic trading

strategies. Section 1.4 proves the existence of an optimal solution for a very

general constrained portfolio optimization problem. This existence result shows

that the portfolio optimization problem is well-defined and leads to a series of

questions that are the main topic of this thesis. They concern the structure of

the optimal solution. A first answer to these questions is given in Section 1.5,

which characterizes the optimal wealth process. Section 1.6 sketches the idea

that is behind traditional first-order conditions to further describe the solution.

And in Section 1.7 we review previous research. Finally, Section 1.8 discusses

the structure of the remaining chapters of this thesis.

We adopt the semimartingale model, the most general model that allows for

a sensible definition (Frittelli, 1997; Delbaen and Schachermayer, 1994, Theo-

rem 7.2). A common alternative is the Brownian market model (Karatzas and

Shreve, 1998). However, this setting comes with the extra burden of a cum-

bersome notation, and most proofs can be easily extended to the more general

semimartingale model. What is more, many proofs for the Brownian market

rely on the assumption that one can observe the filtration of the underlying

Brownian motion (as opposed to the filtration generated by the observed asset

prices). Since these filtrations are not the same — unless we are in a com-

plete market Markovian world — this seems to be an audacious assumption.

Therefore, we use the semimartingale setting and assume that the information

is given exogenously by a filtration. Where necessary, we will make additional

assumptions concerning the filtration. That said, it is clear that Brownian

motion is a constant source of inspiration, and all examples use this model.


Since the topic is at the interface of finance and mathematics, we try to do

justice to both fields. Thus propositions are given with mathematical accuracy;

assumptions however are chosen with the economic rationale in mind. We try

to use economic concepts to justify such assumptions.

One word of advice might be necessary for the reader not so familiar with

the market model used, or portfolio optimization in general. This chapter is

intended to set the scene. It is therefore a little bit eclectic. The reader not

familiar with this model will find more details in Appendix A.1. As for the

portfolio optimization problem, this chapter assumes that the reader knows

what the portfolio optimization problem is about. If this is not the case, it

might be better to read Section 2.1.1 before tackling Section 1.4.

1.2 The Financial Market Model

We use the common setting of mathematical finance (see Appendix A for a more

detailed discussion). (Ω,F , P) is a probability space with a right-continuous

filtration F(t) satisfying the usual hypotheses, S a N -dimensional, locally

bounded semimartingale, and time is denoted by I ⊂ IR+0 , with 0 ∈ I always.

Since we can embed a discrete semimartingale in a continuous setting (e.g.

Cherny and Shiryaev, 2001, Remark in Section 7.1), we will usually think of

I as an interval. We always assume S > 0 almost surely to avoid cumbersome

notation. The measure λ is a suitable measure on I, say the Lebesgue measure

(see Footnote 4 on p. 130 for more on this). T is the maximum element of

I. We assume that T is finite, but note that all results can be extended

to the infinite case with minor modifications. We also assume existence of

a probability measure Qm equivalent to P such that all processes in the set

$$\mathcal{S} = \left\{ X \ge 0 \text{ a.s.} : X_t = X_0 + \int_{0+}^{t} \xi_s \cdot dS_s,\ \xi \in L(S) \right\}$$

are local martingales (an

equivalent local martingale measure); here L(S) is the space of all S-integrable

N -dimensional, predictable processes. The set of all equivalent local martingale

measures is denoted by M(S).

In order to limit the notation, all processes defined or taken as given are

adapted. For stochastic integrals and processes defined by conditional expec-

tations, right-continuous versions will be chosen. All other properties of a

stochastic process will be stated. To streamline the exposition, we will also

write $\int_0^t f(s)\,ds$ instead of $\int_0^t f(s)\,d\lambda(s)$.

Let W0 ∈ IR+0 and ξ ∈ L(S). A (discounted) wealth process is defined by

$$W \triangleq W_0 + \int_{0+}^{\cdot} \xi_s \cdot dS_s \quad \forall\, t \in I \quad P\text{-a.s.} \tag{1.1}$$

We call a predictable portfolio process ξ admissible if WT exists and

$$W_t \ge 0 \quad \forall\, t \in I \quad P\text{-a.s.} \tag{1.2}$$

for the wealth process (1.1). We write La(S, W0) for the set of all admissible

processes.

If W ≥ 0, it is sometimes convenient to use the portfolio-proportion process

π instead. The latter is defined by π0 = 0,¹

$$\pi_t \triangleq \frac{1}{W_{t-}}\, \xi_t S_{t-}\, I_{\{W_{t-}>0\}} \quad \forall\, t \in I \setminus \{0\} \quad P\text{-a.s.} \tag{1.3}$$

$L^{\pi}(S) \triangleq \{\pi \text{ defined by (1.3)},\ \xi \in L(S)\}$ is the set of all integrable portfolio-proportion

processes. $L^{a,\pi}(S) \triangleq \{\pi \text{ defined by (1.3)},\ \xi \in L^{a}(S, W_0)\}$ is the set

of all portfolio-proportion processes generated by admissible processes.² Using

this notation (1.1) can be rewritten as

$$W = W_0\, \mathcal{E}\!\left( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} \right) \quad Q\text{-a.s.}$$

$\mathcal{E}(\cdot)$ is the Doléans-Dade exponential (roughly speaking, $\mathcal{E}(X)$ is the unique

solution to the stochastic differential equation $W_t = 1 + \int_{0+}^{t} W_{s-}\,dX_s$). Occasionally,

we call $dS_t / S_{t-}$ the return process.
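In discrete time the Doléans-Dade exponential reduces to a running product of one-period gross returns, which makes the defining integral equation easy to check numerically. The following is a minimal sketch under that discrete-time assumption; the function names and the simulated return increments are illustrative and do not appear in the thesis.

```python
import random


def stochastic_exponential(increments):
    """Discrete-time Doleans-Dade exponential: E(X)_n = prod_{k<=n} (1 + dX_k)."""
    path = [1.0]
    for dx in increments:
        path.append(path[-1] * (1.0 + dx))
    return path


def integral_form(increments):
    """Solve W_n = 1 + sum_{k<=n} W_{k-1} dX_k by forward recursion."""
    path = [1.0]
    running_sum = 0.0
    for dx in increments:
        running_sum += path[-1] * dx  # W_{k-1} dX_k uses the left limit W_{k-1}
        path.append(1.0 + running_sum)
    return path


random.seed(0)
# Hypothetical daily return increments dX_k (assumption, not data from the thesis).
dX = [random.gauss(0.0005, 0.01) for _ in range(250)]

product_path = stochastic_exponential(dX)
integral_path = integral_form(dX)
max_gap = max(abs(a - b) for a, b in zip(product_path, integral_path))
```

Both constructions describe the same path, so `max_gap` should differ from zero only by floating-point rounding. Multiplying the path by W0 gives the wealth of an investor whose portfolio return increments are dX.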

In reality, there is quite often another source for changes in the value of a

portfolio, namely consumption.

¹ For the precise meaning of the notation see Appendix A.1.

² Here we write La,π(S) instead of La,π(S, W0) because it is easy to see that a portfolio-proportion

process π is independent of W0 ≥ 0.


1.2.1 Definition (Consumption Process). A consumption (density) process is

a progressively measurable non-negative process c, satisfying $\int_0^T c(s)\,ds < \infty$

almost surely.

1.2.2 Remark. It is often preferred to use C, an optional, increasing process

(with C(0) = 0) that captures total consumption up to time t. Then (A.4)

would for example read

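$$\begin{aligned} W_t &= W_0 + \int_{0+}^{t} \xi_s \cdot dS_s - \int_0^t dC(s) \\ &= W_0 + \int_{0+}^{t} \xi_s \cdot dS_s - C(t) \quad \forall\, t \in I \quad P\text{-a.s.} \end{aligned}$$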

It is clear that $C \triangleq \int_0^{\cdot} c(s)\,ds$ is adapted and continuous, hence optional; i.e.

our approach is slightly less general than the one using the process C. The

advantage of C is that consumption can happen in gulps. For this chapter,

nothing material would change if we used C instead. However, with a view

towards the time-additive case, we stick to the slightly less general definition.

See Bank (2000) for a complete discussion. Also see Remark 1.4.4 and Remark

2.1.11 for other generalizations along these lines.

1.2.3 Definition (Wealth Process). A wealth process is a stochastic process

(Wt) with a representation

$$W_t = W_0 + \int_{0+}^{t} \xi_s \cdot dS_s - \int_0^t c(s)\,ds \quad \forall\, t \in I \quad P\text{-a.s.} \tag{1.4}$$

Here ξ ∈ L(S) and c is a consumption process.

If W ≥ 0, it will sometimes be convenient to write this equation with

respect to portfolio-proportion processes. This yields the stochastic differential

equation

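$$\begin{aligned} W_t &= W_0\, \mathcal{E}\!\left( \int_{0+}^{t} \pi_s \cdot \frac{dS_s}{S_{s-}} - \int_{0+}^{t} \frac{c(s)}{W_{s-}}\, d\bigl(s\, I_{\{W_{s-}>0\}}\bigr) \right) \\ &= W_0 + \int_{0+}^{t} W_{s-}\, \pi_s \cdot \frac{dS_s}{S_{s-}} - \int_{0+}^{t} c(s)\,ds \end{aligned}$$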

almost surely, if S > 0 almost surely. By Theorem C.3.1, the solution to this

equation is almost surely

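$$W_t = I_{\{W_{t-}>0\}}\, \mathcal{E}\!\left( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} \right)_{t} \left( W_0 - \int_0^t \frac{c(s)}{\mathcal{E}\!\left( \int_{0+}^{\cdot} \pi_u \cdot \frac{dS_u}{S_{u-}} \right)_{s-}}\, ds \right), \tag{1.5}$$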

provided there are no arbitrage opportunities in the market (implying Ws =

0 ⇒ Wt = 0 ∀ t ≥ s).
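The structure of (1.5) can be checked in a discrete-time analogue: wealth equals the stochastic exponential of the portfolio return times initial wealth net of consumption, where each unit of consumption is deflated by the exponential just before it is withdrawn. The sketch below assumes a simple convention that is not taken from the thesis (consume at the start of each period, then invest the remainder) and uses simulated portfolio returns; it also assumes wealth stays positive, so the indicator in (1.5) equals one.

```python
import random


def wealth_recursion(w0, returns, consumption, dt):
    """W_k = (W_{k-1} - c_k dt) * (1 + r_k): consume first, then invest."""
    w = w0
    for r, c in zip(returns, consumption):
        w = (w - c * dt) * (1.0 + r)
    return w


def wealth_closed_form(w0, returns, consumption, dt):
    """Discrete analogue of (1.5): W_n = E_n * (W_0 - sum_k c_k dt / E_{k-1})."""
    e = 1.0           # E_{k-1}, the stochastic exponential before period k
    discounted = 0.0  # running sum of c_k dt / E_{k-1}
    for r, c in zip(returns, consumption):
        discounted += c * dt / e
        e *= 1.0 + r
    return e * (w0 - discounted)


random.seed(1)
n_steps = 250
dt = 1.0 / n_steps
# Hypothetical inputs: portfolio returns per step and a constant consumption rate.
portfolio_returns = [random.gauss(0.0003, 0.01) for _ in range(n_steps)]
consumption_rate = [5.0] * n_steps

w_rec = wealth_recursion(100.0, portfolio_returns, consumption_rate, dt)
w_cf = wealth_closed_form(100.0, portfolio_returns, consumption_rate, dt)
```

The two numbers agree up to rounding because dividing the recursion by the exponential turns it into a plain sum of deflated consumption, which is the discrete counterpart of the integral in (1.5).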

1.2.4 Definition (Admissible). We call a combination of a portfolio process

ξ and a consumption process c admissible if constraint (1.2) holds for a wealth

process (Wt) and WT exists; A(S, W0) is the set of all admissible pairs of a

predictable process ξ and a consumption process c. We write (ξ, c) ∈ A(S, W0)

for such a pair. For K ⊂ L(S), we write AK(S, W0) for all (ξ, c) ∈ A(S, W0)

with ξ ∈ K. Aπ(S, W0), AKπ(S, W0) are defined accordingly, and each (π, c) ∈ Aπ(S, W0), (π, c) ∈ AKπ(S, W0) respectively, is called admissible, too.

It seems natural to assume that an individual cannot make arbitrary losses

but is bounded by a constant (a finite credit line), which for convenience we

take to be zero. There are however very good reasons to consider more gen-

eral definitions of admissible processes. We make some remarks on the merits

of such approaches and the subtleties of the definition of admissible trading

strategies in Section 3.5.3.

1.2.5 Convention. Throughout this thesis, for (ξ, c) ∈ AK(S, W0) ((π, c) ∈ AKπ(S, W0)), W will be the process of Definition 1.2.3 ((1.5) respectively).

We rely on the reader’s ability to recognize the relevant (ξ, c), as long as

there is no ambiguity.

1.3 Superhedging of Contingent Claims

The portfolio optimization problem is the problem of finding the combination

of a consumption process and a portfolio(-proportion) process that maximizes


utility from consumption and terminal wealth, given a certain level of initial

wealth W0. To find a solution, we first characterize all combinations of a

consumption process and a terminal wealth level that are attainable. The

result is well-known: contingent claims are attainable if and only if they satisfy

a budget constraint like (1.6) below.

1.3.1 Definition (Attainable). We call a combination of an F-measurable

random variable X and a consumption process c attainable for W0 if there

exists an admissible (ξ, c) ∈ A(S, W0) (equivalently (π, c) ∈ Aπ(S, W0)) with

WT ≥ X almost surely for the wealth process W . We call the combination

K-attainable for W0, if there exists (ξ, c) ∈ AK(S, W0) (equivalently (π, c) ∈ AKπ(S, W0)) satisfying this inequality.

Roughly speaking, (c,X) is attainable if there exists a ξ (π respectively)

such that we can consume c and still have no less wealth than X at terminal

date T for a given initial wealth W0 (so-called superhedging). Occasionally, we

will call (c,X) a contingent claim. If no misunderstanding is possible, we will

simply write “attainable” instead of “(K-)attainable for W0”. The program for

proving existence of an optimal solution is now straightforward:

(i) Characterize the set of attainable contingent claims (this section).

(ii) Solve a “static” analogue to the portfolio optimization problem (next

section).

(iii) Prove that the static solution leads to a solution of the portfolio opti-

mization problem.

Pliska (1982) introduced this basic idea, nowadays known as the martingale

method. Similar approaches will be used time and again, see Section 2.1.3 and

Chapter 3. As for characterizing the attainable contingent claims, we need

some additional notation, which we will now introduce. We will first introduce

the notation with respect to the portfolio-proportion process and later add the

definitions for the portfolio processes.

Let K ⊂ La,π (S) be closed with respect to the semimartingale topology

dS (see Appendix A.2, (A.6), p. 136). Further assume that 0 ∈ K and that

K is convex in the following sense: if β, γ ∈ K, then αβ + (1 − α)γ ∈ K for any one-dimensional predictable process α such that 0 ≤ α ≤ 1. For

the remainder of this chapter, we will always assume that K satisfies these

assumptions. Consider the family of semimartingales

$$\mathcal{S}^K \triangleq \left\{ \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} : \pi \in K \right\}.$$

K is the constraint the portfolio-proportion process must satisfy. Let M(SK)

be the set of all probability measures Q equivalent to P such that the upper

variation process ASK(Q) exists (see Definition A.3.5, p. 142, and the discussion

thereafter). Roughly speaking, ASK(Q) is the smallest increasing process, such

that $\mathcal{E}\!\left( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} \right) / \mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)$ is a Q-supermartingale for all π ∈ K.

1.3.2 Remark. If K is linear, then M(SK) is the set of all equivalent local

martingale measures, and the market is complete if M(SK) is a singleton

and incomplete if it contains more than one probability measure.

If K is a cone, then M(SK) is the set of all equivalent local

supermartingale measures. In both cases, ASK(Q) ≡ 0 ∀ Q ∈ M(SK).

Observe that M(SK) is not empty. By assumption the market does not

allow for arbitrage, and therefore there exists an equivalent local martingale

measure. For this measure Qm, $\mathcal{E}\!\left( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} \right)$ is a local martingale bounded

from below, hence a supermartingale. ASK(Qm) = 0 almost surely follows.

Now we are in the position to give the main superhedging result. It will help

us to find a “static” equivalent to the dynamic portfolio optimization problem.

1.3.3 Proposition. With the notation of this section:

(i) Suppose

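$$\sup_{Q \in \mathcal{M}(\mathcal{S}^K)} E_Q\!\left[ \frac{X}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_T} + \int_0^T \frac{c(s)}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_s}\, ds \right] \le W_0 \tag{1.6}$$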


for some X ∈ L0+(P) and a consumption process c. Then there exists a

portfolio-proportion process π with (π, c) ∈ AKπ (S, W0) such that for the

wealth process (Wt)t∈I defined by (1.5) WT ≥ X holds almost surely.

(ii) Conversely, if (π, c) ∈ AKπ (S, W0), then

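$$\sup_{Q \in \mathcal{M}(\mathcal{S}^K)} E_Q\!\left[ \frac{W_T}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_T} + \int_0^T \frac{c(s)}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_s}\, ds \right] \le W_0.$$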

Proof. Appendix A.3, Proposition A.3.14. A simpler version of this proposition

will be proven in Lemma 2.1.20.
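A toy example may help to see what the budget constraint says. The sketch below assumes a hypothetical one-period market with three states, one stock, no portfolio constraints and no consumption, so that the upper variation process vanishes and (1.6) reduces to the statement that the supremum of the expected claim over all martingale measures does not exceed W0. All numbers and function names are illustrative assumptions, not taken from the thesis. The code computes the dual value (supremum over martingale measures) and the primal value (cheapest superhedge) separately.

```python
from fractions import Fraction as F

# Hypothetical one-period, three-state market in discounted prices.
S0 = F(1)
S1 = [F(1, 2), F(1), F(2)]          # terminal stock price in states d, m, u
X = [max(s - 1, F(0)) for s in S1]  # claim to superhedge: call struck at 1


def martingale_measure(t):
    """Return (q_d, q_m, q_u) with E_Q[S1] = S0; probabilities for 0 <= t <= 1/3."""
    return [2 * t, 1 - 3 * t, t]


def expectation(q, payoff):
    return sum(qi * xi for qi, xi in zip(q, payoff))


def dual_value():
    # E_Q[X] is linear in t, so the supremum over the open interval (0, 1/3)
    # equals the maximum over the two endpoints of its closure.
    return max(expectation(martingale_measure(t), X) for t in (F(0), F(1, 3)))


def primal_value(grid=300):
    # Smallest W0 with W0 + xi * (S1 - S0) >= X in every state,
    # searched over stock holdings xi = k / grid.
    best = None
    for k in range(0, 2 * grid + 1):
        xi = F(k, grid)
        w0 = max(x - xi * (s - S0) for x, s in zip(X, S1))
        if best is None or w0 < best[0]:
            best = (w0, xi)
    return best


superhedge_cost, hedge_ratio = primal_value()
sup_over_measures = dual_value()
```

In this example both values coincide, which is the content of the proposition in its simplest form: the claim is attainable from initial wealth W0 exactly when W0 is at least the supremum of its expected value over the martingale measures. The supremum is approached but not attained, because the limiting measure assigns zero mass to one state and is therefore not equivalent to P.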

A similar result for portfolio processes is also true.

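1.3.4 Proposition. Let K ⊂ La(S) be closed with respect to dS. Further

assume that 0 ∈ K and that K is convex in the following sense: if β, γ ∈ K, then

αβ + (1 − α)γ ∈ K for any one-dimensional predictable process α such that 0 ≤ α ≤ 1. Consider the family of semimartingales

$$\mathcal{S}^K \triangleq \left\{ \int_{0+}^{\cdot} \xi_s \cdot dS_s : \xi \in K \right\}.$$

Let M(SK) and ASK(Q) be as in Definition A.3.5. Set $\mathcal{M}_n(\mathcal{S}^K) \triangleq \{ Q \in \mathcal{M}(\mathcal{S}^K) : A^{\mathcal{S}^K}(Q)_T \le n \text{ a.s.} \}$ and $\mathcal{M}_b(\mathcal{S}^K) \triangleq \bigcup_{n \ge 1} \mathcal{M}_n(\mathcal{S}^K)$.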

(i) Suppose

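$$\sup_{Q \in \mathcal{M}_b(\mathcal{S}^K)} E_Q\!\left[ X + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \right] \le W_0 \tag{1.7}$$

for some X ∈ L0+(P) and a consumption process c. Then there exists a

portfolio process ξ with (ξ, c) ∈ AK(S, W0) such that for the wealth process

(Wt)t∈I defined by (1.4) WT ≥ X holds almost surely.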

(ii) Conversely, if (ξ, c) ∈ AK(S, W0), then

$$\sup_{Q \in \mathcal{M}_b(\mathcal{S}^K)} E_Q\!\left[ W_T + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \right] \le W_0.$$

Proof. Proposition A.3.20.

This completes the characterization of attainable contingent claims. Let us

now turn to step (ii) of our little program, namely solving a static version of

the portfolio optimization problem.

1.4 A General Existence Result for the Portfolio Optimization Problem

Consider an investor investing in the financial market and consuming a fraction

of her wealth over time. To evaluate the success of her investment strategy, she

uses a utility function u, which assigns a real number to a given combination

of consumption and terminal wealth. We need some additional notation.

Take as given the measure space (I × Ω × Ω,Prog × F , λ ⊗ P ⊗ P), where

Prog is the progressive σ-algebra. A utility function is then a mapping u :

L0(I×Ω×Ω,Prog×F , λ⊗P⊗P) → IR. Given (π, c) ∈ AKπ(S, W0), u(c,WT ) is

the utility assigned to (π, c); here, WT is the terminal wealth (see (1.5)).³ By

assumption, the investor wants to maximize u. Under additional assumptions,

we will prove existence of (π∗, c∗) ∈ AKπ (S, W0), which maximizes u. The case

of optimal portfolio processes is completely analogous.

Before we do so, let us first consider the static problem. We prove existence

of an optimal solution to the static problem, completing thereby step (ii) of

our little program. The theorem is essentially due to Levin (1976). Bank (2000)

seems to have been the first to use Komlós's theorem (Theorem B.3.4) in its proof. Similar

results can be found in Foldes (1978) and Cuoco (1997).

1.4.1 Proposition. With the notation of this chapter, let u : L0(I × Ω × Ω, Prog ⊗ F , λ⊗ P⊗ P) → IR be quasiconcave (see Remark B.2.12) and upper

semicontinuous with respect to convergence in probability. Define

$$C^{\pi}_K(W_0) \triangleq \{ (c, X) : c \ge 0,\ X \ge 0,\ \text{(1.6) holds} \} \quad \text{and} \quad C_K(W_0) \triangleq \{ (c, X) : c \ge 0,\ X \ge 0,\ \text{(1.7) holds} \}.$$

Then there exists (c∗, X∗) ∈ CK(W0) which maximizes u on CK(W0). Similarly,

there exists (c∗, X∗) ∈ CπK(W0) which maximizes u on CπK(W0).

³ There is no generality gained or lost in incorporating WT explicitly in the utility function.

A utility function u(c) would do just as well. We consider u(c, WT ), instead, because it is common to differentiate between running consumption c and terminal wealth / consumption WT . One frequently thinks of terminal utility as utility due to a bequest motive.

Proof. It is easy to check that CπK(W0) is convex and bounded from below by

0. From Föllmer and Kramkov (1997, Lemma 2.1), ASK(Q)t < ∞ for all t ∈ I

almost surely, which implies that (c,X) ∈ CπK(W0) is finite almost surely, i.e.

CπK(W0) ⊂ L0+(I × Ω × Ω, Prog ⊗ F , λ⊗P⊗P). Furthermore, CπK(W0) is closed

with respect to convergence in measure. Indeed, L0(I × Ω × Ω, Prog ⊗ F , λ⊗P⊗P) is first countable, if we use the pseudometric induced by convergence in

measure as the topology. Hence sequential reasoning suffices (e.g. Schechter,

1997, Exercise 15.34 b). Let (cn, Xn) ∈ CπK(W0) be a sequence converging to

(c,X) in measure. Passing to a subsequence if necessary, we can assume that

this convergence is λ ⊗ P ⊗ P almost surely. Fatou’s lemma — in its version

for random variables taking values in [0,∞] (see Lemma 15.2 in Bauer, 1992;

Rogers and Williams, 1994a, Chapter 2, Lemma 8.2, and Note after Chapter

2, (2.5)), since (c,X) might take the value ∞ — then implies

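$$\begin{aligned} & E_Q\!\left[ \frac{X}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_T} + \int_0^T \frac{c(s)}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_s}\, ds \right] \\ &\quad \le \liminf_{n \to \infty} E_Q\!\left[ \frac{X_n}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_T} + \int_0^T \frac{c_n(s)}{\mathcal{E}\!\left( A^{\mathcal{S}^K}(Q) \right)_s}\, ds \right] \le W_0 \end{aligned}$$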

for all Q ∈M(SK). This proves (c,X) ∈ CπK(W0). Now, Theorem B.3.5 ensures

existence of an optimal solution. The proof that CK(W0) is closed with respect

to convergence in probability is the same, if we observe that for Q ∈ Mb(SK),

ASK(Q)T is bounded by some constant; i.e. we can apply the Fatou lemma.
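To make the static problem concrete, consider a deliberately simple special case that is not treated at this point in the thesis: a finite-state complete market with a single martingale measure Q, no consumption, and expected logarithmic utility of terminal wealth (concave, hence quasiconcave). The budget set is then the set of non-negative claims X whose Q-expectation is at most W0, and the first-order condition gives the maximizer X* = W0 · dP/dQ. The probabilities and names below are hypothetical; the sketch compares the candidate optimum with randomly drawn claims that exhaust the same budget.

```python
import math
import random

p = [0.3, 0.5, 0.2]  # hypothetical "true" probabilities P
q = [0.4, 0.4, 0.2]  # hypothetical unique martingale measure Q
W0 = 10.0


def expected_log_utility(claim):
    return sum(pi * math.log(x) for pi, x in zip(p, claim))


def cost(claim):
    """Left-hand side of the budget constraint, E_Q[X]."""
    return sum(qi * x for qi, x in zip(q, claim))


# First-order condition for log utility: X*_i = W0 * p_i / q_i.
optimal_claim = [W0 * pi / qi for pi, qi in zip(p, q)]


def random_feasible_claim(rng):
    """Random positive claim rescaled so that it costs exactly W0."""
    raw = [rng.uniform(0.1, 5.0) for _ in p]
    scale = W0 / cost(raw)
    return [scale * x for x in raw]


rng = random.Random(1)
best_alternative = max(
    expected_log_utility(random_feasible_claim(rng)) for _ in range(1000)
)
optimal_utility = expected_log_utility(optimal_claim)
```

None of the sampled claims can beat `optimal_utility`, in line with the existence statement. In the general setting of the proposition no such closed form is available, which is why the proof relies on closedness of the budget set and a compactness-type argument instead.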

1.4.2 Remark. A word of caution is necessary concerning Proposition 1.4.1.

Existence of a supremum of u on CπK(W0) is trivial, since we allow u to take

the value ∞. In order to ensure that u is finite, we need additional assump-

tions. Upper semicontinuity is the crucial assumption ensuring that the optimal

(c∗, X∗) exists and can be approximated by a sequence. Otherwise, it might

be possible that the limit of the approximating sequence of (cn, Xn) does not

exist. It then exhibits extreme behavior, basically inducing the individual to

gamble with some of her fortune (see Kramkov and Schachermayer, 1999, Note

5.2). For the time-additive case, the one most frequently studied, upper semi-

continuity is usually achieved only indirectly. We defer a thorough discussion

to Section 2.1.5.


Let us now turn to step (iii) of our little program, namely the portfolio opti-

mization problem. The proof of existence of an optimal solution below holds for

a very general setting. The theorem is more general than the existence results

in Cuoco (1997); Bank (2000); Mnif and Pham (2001). Cuoco (1997) consid-

ers the Brownian motion case and time additive utility functions. He uses a

power-growth condition to ensure uniform integrability (which is a special case

of the utility function used below, see Section 2.2.2), and has a boundedness

assumption on ASK(Q) (Cuoco, 1997, Assumption 3 on p. 40). Bank (2000)

only tackles cone constraints and Hindy-Huang-Kreps utility functions. That

Bank’s utility functions are a special case can be seen almost immediately by

comparing his result to the one below. Mnif and Pham (2001) do not consider

consumption, and again employ a power-growth condition to ensure uniform

integrability. They consider only portfolio processes. All these results are spe-

cial cases of the following simple theorem. Another generalization is in the

direction of quasiconcave utility functions (instead of concave ones). This is

not only of academic interest, as discussed in Section 3.5.4, where we will also

present frequently used classes of utility functions that fit in this framework.

1.4.3 Theorem (Existence of an Optimal Solution). With the notation and

assumptions of Proposition 1.4.1, suppose that u is also nondecreasing (i.e. if

(c1, X1) ≥ (c2, X2) almost surely, then u(c1, X1) ≥ u(c2, X2)).

Then there exists (ξ∗, c∗) ∈ AK(S, W0) such that u(c∗,W ∗T ) ≥ u(c,WT )

for all (ξ, c) ∈ AK(S, W0), where W ∗,W are the respective wealth processes.

Similarly, there exists (π∗, c∗) ∈ AKπ (S, W0) such that u(c∗,W ∗T ) ≥ u(c,WT )

for all (π, c) ∈ AKπ (S, W0).

Proof. From Proposition 1.4.1, there exists a contingent claim (c∗, X∗) maximizing u on CπK(W0). The superhedging result, Proposition 1.3.3, ensures existence of π∗ with (π∗, c∗) ∈ AKπ(S, W0) such that the associated wealth process satisfies W∗T ≥ X∗ almost surely. Since u is nondecreasing by assumption, u(c∗, W∗T) ≥ u(c∗, X∗).

On the other hand, from the second part of Proposition 1.3.3, (c, WT) ∈ CπK(W0) for all (π, c) ∈ AKπ(S, W0). This implies u(c, WT) ≤ u(c∗, X∗) ≤ u(c∗, W∗T) for all admissible (π, c).


The proof for portfolio processes is completely analogous.

In a certain sense, this seems to be the most general existence result that is

possible. However, some generalizations are still feasible. Let us consider first

extensions of the definition of the consumption process.

1.4.4 Remark. As in Cuoco (1997); Mnif and Pham (2001), the assumption

that c and X are non-negative is not necessary for the existence result. Indeed,

it suffices that π ∈ La,π(S) or ξ ∈ La(S) and

$$
W_0 + \int_{0+}^T \xi_s \cdot dS_s - \int_0^T c(s)\,ds,
$$

$$
W_0 + \int_{0+}^T \pi_s W_s \cdot \frac{dS_s}{S_{s-}} - \int_0^T c(s)\,ds
$$

are bounded from below by some constant (see Section 3.5.3 for the subtleties

of the definition of “admissible” processes). Then superhedging is still possible,

and we only have to use (ii)(a) of Theorem B.3.5 in the proof of Proposition

1.4.1, where the Y in Theorem B.3.5 is for example defined by the equivalent

martingale measure assumed to exist, i.e. Y = dQm/dP. We need however a

boundedness assumption in the proof of Proposition 1.4.1 since we can no

longer apply the Fatou lemma directly. Most authors do this by ensuring that

{u(c,X) : (c,X) ∈ CπK(W0)} is uniformly integrable (Bank, 2000, Assumption

2.1). We refer to Section 2.1.5 for a discussion of conditions to ensure uni-

form integrability. Negative consumption has a natural interpretation as net

consumption, i.e. consumption minus labor income. That is, the existence re-

sult also covers the case of stochastic income or endowment. However with

a nontrivial income process, wealth may become negative, and the portfolio-

proportion process is no longer defined.

1.4.5 Remark. Other extensions are also possible. u need not be nondecreas-

ing, if we can throw away money. Allowing for a consumption process C as in

Remark 1.2.2 is also relatively straightforward. We have to replace the progres-

sive σ-algebra with the optional one (see Bank, 2000, in the case of incomplete

markets with cone constraints). To extend the results to T infinite, we have


to do some additional work along the lines of Bank (2000, Remark 2.4): let

(c∗n,W ∗n) be an optimal solution on the interval [0, n], show that the sequence

of optimal solutions converges to a (c∗,W ∗∞), and prove that this is the optimal

solution for T = ∞. We omit the details. Convex terminal wealth constraints

can easily be incorporated. And a large investor model is basically only a refor-

mulation of this model (Mnif and Pham, 2001, Example 3.4). See Section 3.5

for further extensions, including the possibility of American type constraints

on the wealth process, and replacing the assumption 0 ∈ K.

1.4.6 Remark. The terms “static” and “dynamic” problem can be explained as

follows. In the static problem, the individual buys a contingent claim (c∗, X∗)

that maximizes her utility subject to the budget constraint (1.6) or (1.7), and

holds the claim until the end. For the solution of the dynamic problem, she

invokes a dynamic trading strategy π∗ or ξ∗ that requires her to adjust her

portfolio weights at any instant.

There is one obstacle associated with Theorem 1.4.3; namely, it is some-

times hard to establish upper semicontinuity with respect to convergence in

probability. We therefore state a variant of the theorem relying on a weaker

type of upper semicontinuity, and hence being more general.

1.4.7 Corollary. With the assumptions and notations of Theorem 1.4.3, sup-

pose that u, instead of being upper semicontinuous with respect to convergence

in probability, is upper semicontinuous in the following sense: for every se-

quence (cn, Xn) ∈ CK(W0) (or CπK(W0)) converging to some (c,X) almost

surely, u(c,X) ≥ lim supn→∞ u(cn, Xn) holds.

Then the conclusions of Theorem 1.4.3 remain true.

Proof. The reader can check that this property suffices for Theorem B.3.5 (see Remark B.3.6), and then also for Theorem 1.4.3, to be true.

Although the existence result answers one important question — it proves

existence of an optimal solution subject to portfolio constraints for a very

general setting — it leaves open several other ones:


(i) Is the solution unique?

(ii) Can we characterize the solution further?

(iii) What do (π∗, c∗), (ξ∗, c∗) look like?

Proving that a solution is unique is usually straightforward. It follows immedi-

ately, if u is strictly concave, and is not of major concern to us. Characterizing

the solution further is the topic of the next two sections.

1.5 The Optimal Wealth Process

In this and the next section, we will characterize the optimal solution. This

section will present a well-known stochastic control result that characterizes the

optimal wealth process, and the next section discusses first-order conditions for

the optimal solution.

1.5.1 Proposition. Under the assumptions of Theorem 1.4.3 or Corollary

1.4.7, the optimal wealth process (W ∗) can be chosen to satisfy

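$$
W^*_t = \operatorname*{ess\,sup}_{Q \in \mathcal{M}(S_K)} \mathcal{E}(A^{S_K}(Q))_t \, E_Q\left[\frac{W^*_T}{\mathcal{E}(A^{S_K}(Q))_T} + \int_t^T \frac{c^*(s)}{\mathcal{E}(A^{S_K}(Q))_s}\,ds \,\middle|\, \mathcal{F}(t)\right]
$$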

in the case of portfolio-proportion processes.

In the case of portfolio processes, let St(Q) be the set of stopping times with

values in [t, T ] such that ASK(Q)τ − ASK(Q)t is bounded for all τ ∈ St(Q).

Then (W ∗) can be chosen to satisfy

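$$
W^*_t = \operatorname*{ess\,sup}_{Q \in \mathcal{M}_b(S_K),\ \tau \in \mathcal{S}_t(Q)} \left\{ E_Q\left[\left(X + \int_t^T c(s)\,ds\right) 1_{\{\tau = T\}} - A^{S_K}(Q)_\tau \,\middle|\, \mathcal{F}(t)\right] + A^{S_K}(Q)_t \right\}.
$$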


Proof. Observe first that we can assume the constraints to be binding in (1.6)

and (1.7) (with c = c∗ and X = W ∗T ) since the utility function is nondecreasing

in Theorem 1.4.3. The first part follows from Lemma A.3.16 and the proof of

Proposition A.3.14, (i); and the second part is Corollary A.3.19.

Two remarks are in order concerning this result, which is well known for Ito processes (see Karatzas and Shreve, 1998, Chapter 6, for details and references) and cone-constrained semimartingales (e.g. Karatzas and Zitkovic, 2003; Mnif and Pham, 2001).

The first remark relates this result to the static solution. Suppose that we

have applied our little program and first solved the static equivalent of the

portfolio optimization problem. Let (c∗, X∗) be the optimal solution to the

static problem. Then the proposition can also be understood as characterizing

the optimal wealth process belonging to this static solution, if we replace W ∗T

by X∗ in the right-hand sides of the two equations above.

The second remark asks the natural question whether there exists a Q∗

that attains the essential supremum. This is a tricky question. As discussed

in Appendix A.3, the answer is no, in general. Although we will not do so at

the moment, we can enlarge the set M(SK) so that the answer for portfolio-

proportion processes and this enlarged set is yes. Appendix A.3.2 seems to

indicate that the same cannot be said for portfolio processes. Bellini and

Frittelli (2002, Theorem 1.1) show that the measure Q∗ exists in almost all rel-

evant cases with cone constraints (see also Karatzas and Shreve, 1998, Chapter

6 for such a result in the Ito case with arbitrary constraints on the portfolio-

proportion process).
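To see why the supremum over martingale measures need not be attained, the following minimal Python sketch works through a one-period, three-state market. The market, its prices, and the helper names (`martingale_measure`, `price`) are illustrative assumptions and do not come from the thesis; the interest rate is set to zero and there are no portfolio constraints, so the upper variation term vanishes.

```python
from fractions import Fraction as F

# Hypothetical one-period market (illustrative numbers only):
# the stock costs 1 today and is worth 2, 1 or 1/2 tomorrow; the interest rate is zero.
S0 = F(1)
S1 = [F(2), F(1), F(1, 2)]            # up, middle, down
X = [max(s - 1, F(0)) for s in S1]    # payoff of a call option with strike 1

def martingale_measure(q_down):
    """Solve 2*q_up + q_mid + q_down/2 = 1 and q_up + q_mid + q_down = 1 for a given q_down."""
    q_up = q_down / 2
    q_mid = 1 - q_up - q_down
    return [q_up, q_mid, q_down]

def price(q):
    """Expected payoff of the claim X under the measure q."""
    return sum(qi * xi for qi, xi in zip(q, X))

# Equivalent martingale measures correspond to q_down in the open interval (0, 2/3).
grid = [F(2, 3) * F(k, 1000) for k in range(1, 1000)]
prices = [price(martingale_measure(qd)) for qd in grid]

# The boundary measure q_down = 2/3 puts zero mass on the middle state,
# so it is not equivalent to the physical measure.
supremum = price(martingale_measure(F(2, 3)))

print(float(max(prices)), float(supremum))
```

In this toy example every equivalent martingale measure on the grid prices the call strictly below 1/3, while the value 1/3 is only reached by a boundary measure that assigns probability zero to the middle state.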

1.6 First-Order Conditions

Let us quickly make some comments on first-order conditions for the optimal

solution. We will keep the discussion at an informal level. Giving precise for-

mulations would not be too difficult, but cumbersome. What is more, as we will


discuss later on, at this level of abstraction the results do not lead to substan-

tial insights. In the following, we only consider the case of portfolio processes,

the case of portfolio-proportion processes being completely analogous.

To start with, the static optimization problem is one of maximizing a real-

valued function, subject to the real-valued constraint (1.7). Therefore it is

natural to use a Lagrangian approach. Let (c∗, X∗) be an optimal solution

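to the static optimization problem, i.e. u(c∗, X∗) ≥ u(c,X) for all (c,X) ∈ CK(W0). Consider the function

$$
f(\alpha, y) \triangleq u(c^* + \alpha\Delta c, X^* + \alpha\Delta X) - y\left(\sup_{Q \in \mathcal{M}_b(S_K)} E_Q\left[X^* + \alpha\Delta X - A^{S_K}(Q)_T + \int_0^T \big(c^*(s) + \alpha\Delta c(s)\big)\,ds\right] - W_0\right).
$$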

Then we can formally look at the limit $\lim_{\alpha \downarrow 0} \frac{f(\alpha, y) - f(0, y)}{\alpha}$. Given some assumptions and considerations (proper differentiability; ensuring that we can interchange taking limits and finding the supremum; taking care of (c∗ + α∆c, X∗ + α∆X) ∈ CK(W0) for small enough α; and so on), we know that this limit exists, is less than or equal to zero, and is zero for a properly chosen Lagrange multiplier y∗. That is, we have the equation:

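$$
\lim_{\alpha \downarrow 0} \frac{u(c^* + \alpha\Delta c, X^* + \alpha\Delta X) - u(c^*, X^*)}{\alpha} = y^* \sup_{Q \in \mathcal{M}_b(S_K)} E_Q\left[\Delta X + \int_0^T \Delta c(s)\,ds\right].
$$

Suppose now that there actually exists some Q∗ attaining the supremum (which is not true in general as we have already discussed). Then we have the first-order condition

$$
\lim_{\alpha \downarrow 0} \frac{u(c^* + \alpha\Delta c, X^* + \alpha\Delta X) - u(c^*, X^*)}{\alpha} = y^* E_{Q^*}\left[\Delta X + \int_0^T \Delta c(s)\,ds\right]. \tag{1.8}
$$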

Knowing standard microeconomic theory, it is natural to interpret the con-

straints (1.6) or (1.7) as budget constraints. Then Q∗ can be considered to


be a state price density⁴. Hence, the result relates the marginal utility to the

state price density, just as expected from microeconomic theory.

Despite its elegance, there are some drawbacks to this analysis. Clearly,

the first-order condition does not help much if we want to find (c∗, X∗). What

is more, finding Q∗ is also an open issue here. And an interpretation of this

equation is hard, too.

The picture changes for the special case of time-additive utility. Indeed, if

$$
u(c, X) = E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right]
$$

for some properly defined functions U, B, then (1.8) reads

$$
E_P\left[B'(X^*)\Delta X + \int_0^T U'(s, c^*(s))\Delta c(s)\,ds\right] = y^* E_{Q^*}\left[\Delta X + \int_0^T \Delta c(s)\,ds\right],
$$

which yields the simple and insightful first-order conditions

$$
U'(t, c^*(t)) = y^* E_P\left[\frac{dQ^*}{dP} \,\middle|\, \mathcal{F}(t)\right], \tag{1.9a}
$$

$$
B'(X^*) = y^* \frac{dQ^*}{dP}. \tag{1.9b}
$$

These first-order conditions, sometimes also called the stochastic Euler equa-

tions, characterize the optimal solution very well. They also yield a lot of

additional useful results. Therefore, we will devote a whole chapter to making

this sketch precise, namely Chapter 2.
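As a rough illustration of how a condition of the form (1.9b) pins down the optimal claim, here is a minimal Python sketch for a finite-state market. Everything specific in it is an assumption made for the example and is not taken from the thesis: three states, a complete unconstrained market with a single martingale measure Q∗, zero interest rate, no running consumption, and power utility B(x) = x^(1−γ)/(1−γ).

```python
# Hypothetical three-state market (all numbers are illustrative assumptions).
p = [0.5, 0.3, 0.2]      # physical probabilities P
q = [0.4, 0.3, 0.3]      # unique martingale measure Q* (complete market, zero interest rate)
W0 = 100.0               # initial wealth
gamma = 2.0              # relative risk aversion in B(x) = x**(1 - gamma) / (1 - gamma)

Z = [qi / pi for qi, pi in zip(q, p)]   # density dQ*/dP, state by state

# Invert B'(x) = x**(-gamma) in the first-order condition: X* = (y* Z)**(-1/gamma).
# The multiplier y* is pinned down by the binding budget constraint E_Q*[X*] = W0.
norm = sum(qi * zi ** (-1.0 / gamma) for qi, zi in zip(q, Z))
y_star = (norm / W0) ** gamma
X_star = [(y_star * zi) ** (-1.0 / gamma) for zi in Z]

def expected_utility(X):
    """Expected terminal utility E_P[B(X)] for the assumed power utility."""
    return sum(pi * x ** (1.0 - gamma) / (1.0 - gamma) for pi, x in zip(p, X))

def budget(X):
    """Cost of the claim X under the assumed martingale measure."""
    return sum(qi * x for qi, x in zip(q, X))

print(X_star, budget(X_star), expected_utility(X_star))
```

Because the assumed utility is strictly concave, any other claim with the same cost yields a lower expected utility, which can be checked numerically by shifting wealth between two states along a budget-neutral direction.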

In Chapter 2 and Chapter 3, we do not only characterize the solution.

We also give a problem dual to the portfolio optimization problem, which is

sometimes easier to solve. It amounts to finding (the proper generalization of)

Q∗. A similar theory for other utility functions than time-additive ones is often

feasible, too (see Section 3.5.4).

Let us finally comment briefly on finding (π∗, c∗), (ξ∗, c∗), by far the hardest

part of all. Although we already undertake first steps in this direction in

Chapter 2 and Chapter 3, a thorough discussion must be deferred until Chapter 4. There, we consider the special case of Ito processes (or slightly more general Levy processes).

⁴ The naming stems from a partial equilibrium analysis we will not dwell on. We simply call such measures (or suitable generalizations) state price densities, whenever there is a situation where we can, in a certain sense, equate them to the marginal utility. Other common names are stochastic discount factor or pricing kernel.

1.7 Related Research

For the moment, let us pause with the development of the theory, and review

some of the literature instead. As far as existence of an optimal solution is concerned, Corollary 1.4.7 seems to be the most general result possible.

The generality is possible because we have directly tackled the primal problem.

Many other papers concerned with optimal portfolio choice use a duality ap-

proach (a very good example is Karatzas and Shreve, 1998, Chapter 6): they

establish duality of the primal problem to a dual one and then prove existence

of an optimal solution to the dual problem. Results on the primal problem in

less general settings have been obtained by Mnif and Pham (2001) for semi-

martingales with arbitrary constraints, but without consumption, Bank (2000)

for a semimartingale with cone constraints, and Cuoco (1997) for constrained

Ito processes satisfying a boundedness assumption (Cuoco, 1997, Assumption

3 on p. 40). Foldes (1978) also proves existence of an optimal solution for a

rather general, albeit unconstrained, problem. Nearly all papers cited below

prove existence and usually uniqueness of optimal solutions for their specific

setting. Contrary to tackling the primal problem directly, these papers employ

results from the theory of Markov processes and Bellman’s principle to prove

existence of an optimal solution, if they do not use duality techniques.

Characterizing the optimal solution turns out to be more difficult, however.

From the previous section, the portfolio optimization problem can be consid-

ered to be a problem of maximizing u subject to the budget constraint (1.6)

or (1.7). Knowing the microeconomic utility maximization problem — or any

other optimization problem — we have conjectured that the marginal rate of

substitution must be equal to a properly defined state price density. This is

not the case in general. Indeed, much of the literature and much of this thesis

is concerned with the question of what assumptions are necessary to establish

the stochastic Euler equation (or something approximate).


There are two answers to this question. The first is that under mild techni-

cal assumptions the marginal rate of substitution is the limit of a sequence of

state price densities (compare the results in Chapter 2). Such results are im-

plicit in several papers. In a semimartingale world without constraints or cone

constraints and time-additive utility the result is explicitly stated in Kramkov

and Schachermayer (1999); Karatzas and Zitkovic (2003). Cuoco (1997) proves

it for Ito processes with constraints, and so do Mnif and Pham (2001) for semi-

martingales. See also He and Pearson (1991b); Karatzas, Lehoczky, Shreve,

and Xu (1991); Shreve and Xu (1992a). Cvitanic, Schachermayer, and Wang

(2001) study the limit of a sequence of state price densities. As for the second

answer, with stronger assumptions many authors prove that the marginal rate

of substitution is actually equal to the state price density. See Chapter 2.

After this tour d’horizon, let us review the most important results in the

literature. We categorize these results by the type of utility functions used,

then by the constraints used, and finally by the stochastic process used. We

do not strive for generality concerning breadth and depth of this literature

review, but only give the first reference that seems relevant in our context. For

example, many results have been first proven on the time interval [0, T ] for

special processes, and then extended to [0,∞) and more general processes. For

these cases we only cite the first appearance of such results. We also do not

care about the method used to achieve a result. For a more complete literature

review, we refer to Karatzas and Shreve (1998, Chapters 3.11 and 6.9).

To start with, we quickly recall the landmark results concerning the un-

constrained problem with time-additive utility. If we model the risky return of

assets by a discrete time process (Samuelson, 1969), Markov Brownian motion

(Merton, 1969, 1971), Ito processes (Pliska, 1986; Cox and Huang, 1989, 1991;

Karatzas, Lehoczky, and Shreve, 1987), or even semimartingales (Kramkov and

Schachermayer, 1999; Karatzas and Zitkovic, 2003), and if we model the risk /

return tradeoff with the help of time-additive concave utility functions, we can

prove existence and uniqueness of an optimal solution. Provided some minor


assumptions hold, the optimal solution equals the state price density, and by

and large follows from Lagrange multiplier theory (Bismut, 1975). Concerning

optimal portfolios, some progress has been achieved. Merton (1969, 1971) cal-

culated optimal portfolios for the class of HARA utility functions and Geomet-

ric Brownian motion. He showed that the optimal portfolio rules translate into

investing in mutual funds. These mutual funds hold stocks in a fixed propor-

tion (the portfolio-proportion process remains constant). His results for HARA

utility have been extended by many authors to ever more general Ito processes,

but optimal portfolio rules for other utility functions remain scant. The only

notable exception seems to be the mean-variance problem (e.g. Richardson,

1989; Schweizer, 1992). However, except for Korn and Trautmann (1995), all

these papers imply a positive probability to end up with negative wealth and

therefore do not fit into our setting.
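The constant portfolio proportion mentioned above can be made concrete with a short Python sketch. It assumes the textbook special case of a single risky asset following geometric Brownian motion with drift `mu` and volatility `sigma`, a riskless rate `r`, and power utility with relative risk aversion `gamma`; in that case the well-known Merton fraction is (mu − r)/(gamma·sigma²). The function names and the numbers are hypothetical.

```python
def merton_fraction(mu, r, sigma, gamma):
    """Constant fraction of wealth held in the risky asset (assumed power-utility setting)."""
    if sigma <= 0 or gamma <= 0:
        raise ValueError("sigma and gamma must be positive")
    return (mu - r) / (gamma * sigma ** 2)

def objective(pi, mu, r, sigma, gamma):
    """Risk-adjusted growth rate; the Merton fraction maximizes this quadratic in pi."""
    return r + pi * (mu - r) - 0.5 * gamma * (pi * sigma) ** 2

# Illustrative parameters: 8% drift, 2% interest, 20% volatility, risk aversion 3.
pi_star = merton_fraction(mu=0.08, r=0.02, sigma=0.20, gamma=3.0)
print(pi_star)
```

The fraction does not depend on wealth or time, which is the sense in which the portfolio-proportion process stays constant in this special case.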

The unconstrained theory by and large extends to constraints on the port-

folio process and terminal wealth if the utility functions are time-additive. He

and Pearson (1991a,b); Karatzas et al. (1991); Shreve and Xu (1992a) tackle

the case of short-sale constraints or incomplete markets; arbitrary convex con-

straints on the portfolio-proportion process are considered by Cvitanic and

Karatzas (1992). All these papers prove existence and uniqueness of an opti-

mal solution for Ito-processes and general utility functions. Again, for HARA

utility we can often find solutions for the portfolio rules, too. Existence and

uniqueness of an optimal solution in a general semimartingale world with con-

straints / incomplete markets can be found in Mnif and Pham (2001); Karatzas

and Zitkovic (2003). For Ito processes, Korn and Trautmann (1995) extend the

theory to constraints on terminal wealth. Proving existence and uniqueness

is straightforward. Finding optimal portfolios is not. Korn and Trautmann

(1995) use their result to consider the trading strategies for the continuous-

time mean-variance problem. With additional short-sale constraints, a similar

result can be found in Li, Zhou, and Lim (2002).

Although time-additive utility functions are widely used, they have drawn

some criticism. If they are state-dependent, they have almost no normative


power, since almost anything is possible. And if they are not state-dependent,

they possess some highly unrealistic properties. First of all, time-additive util-

ity functions are firmly founded in the von Neumann-Morgenstern expected

utility theory. Empirically, the independence axiom is often violated (e.g. the

Allais Paradox; see Mas-Colell, Whinston, and Green, 1995, Chapter 6). Fur-

thermore, several observed phenomena cannot easily be accounted for, most

notably the equity premium puzzle. Another critique focuses on the linkage

between consumption at one moment in time and intertemporal consumption.

It is argued that investors should be indifferent between minor alterations in

consumption at every time and the timing of the consumption plan, something

that cannot be achieved with time-additive utility. A related critique, well-

known from macroeconomic theory, focuses on the fact that the intertemporal

elasticity of substitution and the risk aversion at any time are not modeled

separately with a time-additive approach. Finally, an individual with time-

additive utility is (in a certain sense) indifferent with respect to the timing of

resolution of uncertainty. Put differently, knowledge reducing the individual’s

uncertainty does not increase her utility unless it changes her optimal strategy.

Hence, preferences for information in the sense of Kreps and Porteus (1978)

cannot be studied in this framework. State-independent time-additive utility

also lacks the ability to model several research questions of independent interest. To name just a few: habit formation, subjective beliefs, multiple priors

(Knightian uncertainty, Epstein and Wang, 1994), and uncertainty about the

asset pricing model.

Different authors have therefore come up with alternatives to overcome

some deficiencies of the state-independent time-additive utility function, with-

out losing all of its tractability. Hindy, Huang, and Kreps (1992) suggest incor-

porating a function of total consumption up to now into the utility function in

order to capture the satisfaction derived from previous consumption. An exten-

sion of Hindy, Huang, and Zhu (1997) also accounts for habit formation. Bank

(2000) proves existence and uniqueness of an optimal solution for these kinds

of utility functions in a general semimartingale setting. Bank’s results allow


for cone constraints and incomplete markets. For exponential Levy processes

of the state price density, Bank (2000) also characterizes the optimal solution.

Investors using Hindy et al. (1992, 1997) utility functions are indifferent be-

tween minor alterations in consumption at a fixed time and the timing of the

consumption plan.

Instead of incorporating total consumption up to t as a “state”, one can also

completely disentangle intertemporal elasticity of substitution and risk aversion

(recursive utility, see Epstein and Zin, 1989; Duffie and Epstein, 1992; Camp-

bell and Viceira, 2002, for an introduction). Recursive utility — in continuous

time frequently also called stochastic differential utility — uses a stochastic

integral equation to link future utility to present utility. Such utility functions

also make it possible to study preferences for intertemporal resolution of (con-

sumption) risk (Skiadas, 1998; Lazrak and Quenez, 2003, Section 5). Lazrak

and Quenez (2003) give a generalization of recursive utility, allowing for ambi-

guity, model risk, and asymmetry in risk aversion. Schroder and Skiadas (2003)

discuss properties of the optimal solution in a Brownian model with constraints

and generalized recursive utility. They present first-order conditions of opti-

mality that turn out to be constrained forward-backward stochastic differential

equations, but sometimes reduce to backward stochastic differential equations.

Another special “utility function” is the goal to maximize the growth rate of

a portfolio (Hakansson, 1970). Closely related are universal portfolios (Cover,

1991; Jamshidian, 1991; Korn, 1997, Chapters 6.1 and 6.2): if the horizon is

“large”, universal portfolios perform approximately as well as the best con-

stantly re-balanced portfolio. As we will see in Chapter 4, constantly re-

balanced portfolios play a special role in portfolio optimization for Brownian

models. Therefore, this is an attractive property. However, it is highly dubi-

ous that a large-horizon property like this should be factored into a portfolio

decision (Merton and Samuelson, 1974). Finally, a large class of “utility func-

tions” comprises maximizing the probability to beat or track a given target, be

this target deterministic or a stochastic benchmark (e.g. Browne, 1999; Korn,

1997, Chapter 6.3).
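For readers unfamiliar with constantly re-balanced portfolios, the following Python sketch shows the mechanism on a stylized example. The market (cash plus one asset that alternately doubles and halves) and the function name `crp_wealth` are assumptions chosen for illustration; the sketch does not implement a universal portfolio, only the benchmark it is compared against.

```python
def crp_wealth(weights, price_relatives):
    """Terminal wealth per unit invested when rebalancing to `weights` every period."""
    wealth = 1.0
    for relatives in price_relatives:
        # Portfolio return over one period is the weighted average of the price relatives.
        wealth *= sum(w * x for w, x in zip(weights, relatives))
    return wealth

# Hypothetical market: asset 0 is cash, asset 1 alternately doubles and halves.
market = [(1.0, 2.0), (1.0, 0.5)] * 10

buy_and_hold_stock = crp_wealth((0.0, 1.0), market)   # all wealth stays in the risky asset
half_half = crp_wealth((0.5, 0.5), market)            # rebalanced to 50/50 each period

print(buy_and_hold_stock, half_half)
```

In this example neither asset grows on its own, yet the 50/50 re-balanced portfolio gains a factor 1.125 over every two periods.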


Portfolio optimization is by now such an extensive field that many highly

relevant aspects can barely be mentioned. The topics not covered in this thesis

include stochastic income (Cuoco, 1997; Cvitanic et al., 2001; Mnif and Pham,

2001; Karatzas and Zitkovic, 2003), constraints on the wealth process (Korn

and Trautmann, 1995; Korn, 1997; Mnif and Pham, 2001), negative wealth

(Schachermayer, 2001), transaction costs and taxation (Deelstra, Pham, and

Touzi, 2001; Kamizono, 2001, 2003; Bouchard and Mazliak, 2003), insider trad-

ing or different information sets (Duffie and Huang, 1986; Amendinger, 2000),

preferences for information (Kreps and Porteus, 1978; Lazrak and Quenez,

2003), large investors (Mnif and Pham, 2001), insecurity about the parameters

of the assets’ process, several consumption goods (Deelstra et al., 2001; Kami-

zono, 2001, 2003), and many more. Good starting points for first results and

references for any of these topics are often Korn (1997); Karatzas and Shreve

(1998). Finding optimal portfolios is only discussed partly in this thesis (see

Chapter 4 for a survey on this). Again, we refer to Korn (1997); Karatzas

and Shreve (1998) for more. Also there are other methods to characterize and

find optimal portfolios than analytical techniques, most prominently numerical

ones (Filitti, 2004, for a recent survey) and approximate solutions (see Merton

and Samuelson, 1974; Campbell and Viceira, 2002, and the references therein).

1.8 Outlook

The structure of the thesis is as follows: this chapter has set the scene by in-

troducing the financial market used, and by proving existence of a solution to

a very general constrained portfolio optimization problem. As already men-

tioned, although this existence result is nice, we want to know more about the

properties of the solution. To this end, we have to add additional assump-

tions. In Chapter 2 and Chapter 3 we therefore assume time-additive utility

functions. Then we can characterize optimal wealth and optimal consumption

completely. After we have characterized the optimal solution, it is natural to

ask what the optimal portfolio rules look like. This is the topic of Chapter 4

for a Brownian model (or slightly more general models).


To be a bit more specific, Chapter 2 takes one step back. It presents

the portfolio optimization problem for the time-additive case starting with

well-known results and slowly progressing to the more general results. Sec-

tion 2.1 starts with the simplest possible setting: a complete market, where

all contingent claims are attainable. It first rigorously introduces the static

and dynamic portfolio optimization problems for time-additive utility func-

tions (Section 2.1.1). For the static problem, applying a Lagrange multiplier

approach is straightforward (see Section 2.1.2). Hence, if a solution exists, it

must satisfy standard Lagrange conditions; and if a set of candidate solutions

satisfies these conditions, it is a solution to the static optimization problem.

Amongst others, we get the result that the stochastic Euler equation holds for

the static problem, i.e. the optimal solution can be expressed with the help of

the state price density.

Since the market is complete, the superhedging result turns out to be sim-

pler than in Section 1.3. Although it is just a special case of the propositions

in Section 1.3, we prove it explicitly in Section 2.1.3. The proof makes the

basic structure of the equivalence of the dynamic and the static solution more

transparent. Just as in Section 1.4, we use this equivalence to characterize

the optimal solution of the dynamic portfolio optimization problem in Sec-

tion 2.1.4. As a consequence, the stochastic Euler equation must hold for the

complete dynamic case, too, provided a solution exists.

Up to this point, all the results are achieved under the additional assump-

tion that a solution exists (and satisfies certain assumptions). It is natural to

ask what sufficient conditions for the existence of a solution might look like.

Using Corollary 1.4.7, this is answered in Section 2.1.5. There, several fre-

quently employed sufficient conditions are related to one another. It is proven

that these conditions basically ensure upper semicontinuity of u. Actually, the

sufficient conditions of Section 2.1.5 are true for the incomplete market case,

too, since the basic Theorem 1.4.3 is so. The last subsection of Section 2.1 —

Section 2.1.6 — introduces the Brownian market as an example of the theory.

This example will serve us well throughout the thesis.


The next major step is to consider the constrained case in Section 2.2.

The basic problem is introduced in Section 2.2.1. By way of example for the

Brownian market we show that in some cases it is possible to find optimal solu-

tions easily if we use our knowledge from the complete market (Section 2.2.4).

If matters are not so easy, then we can nevertheless characterize the optimal

solution, provided it exists. This is done in Section 2.2.3, where we make pre-

cise what we have discussed in Section 1.6. We find that an approximate Euler

equation holds for the constrained case, too. Combining this with Section 2.1.5

yields sufficient conditions for the existence of such a solution in Section 2.2.2.

The solution in the constrained case is thereby characterized completely, too.

From a theoretical point of view this seems to be all that can be said

about portfolio optimization for the time-additive case. There is however one

drawback: actually finding optimal portfolios and characterizing the optimal

solution is hard, since Section 2.2.3 employs sets of stochastic processes that are

not straightforward to calculate. As an alternative that often leads to a simpler

optimization problem we therefore consider the dual problem to the (primal)

portfolio optimization problem in Chapter 3. We again start by considering the

unconstrained problem (Section 3.2), subsequently relaxing assumptions to the

general case of both incomplete markets and / or constraints (Sections 3.3 and

3.4). We not only prove existence of the solution to the portfolio optimization

problem (again), but also relate this primal problem to a dual one, which is

sometimes simpler to solve. Existence of a solution to the dual problem is

proven, and a useful characterization of the optimal solution to the primal

problem is obtained. As always, we use the Brownian market as an example.

By far the most difficult question in the field of portfolio optimization is

the one concerning the structure of optimal portfolio rules. Calculating such

rules can only be done for very few utility functions. Instead of calculating

some portfolio rules, Chapter 4 tries to characterize optimal portfolio rules in a

Brownian market (or a slightly more general market driven by a Lévy process).

As it turns out, the optimal portfolio $\pi^*$ satisfies $\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}p(\cdot)$

for some measurable $\mathbb{R}$-valued process $\alpha(\cdot)$ and some process $p(\cdot)$. We show

that this is a geometric property of the model that only fails if we “hit” the

border of the constraint set.

The thesis comes with several Appendices. The Appendices serve only one

purpose: to enhance the readability of the thesis. To this end, they assemble

known results, prove less important additional facts, and give lengthy, but not

very insightful proofs of results in the main text. The model of a financial

market employed is discussed in depth in Appendix A, especially Appendix

A.1. We refer to this section for the generality of the model used and in case of

any ambiguities. Appendix A.2 gives some topological results on sets of ran-

dom variables and stochastic processes. And Appendix A.3 uses these results

and very deep Optional Decomposition theorems to prove the superhedging

theorems that are so important to relating the static to the dynamic solution.

Appendix B cites and proves results from convex analysis and duality the-

ory. For the reader’s convenience we reproduce a duality result by Kramkov

and Schachermayer (1999) in Appendix B.1. Appendix B.2 assembles facts

from the theory of generalized Lagrangians, as they can be found in most textbooks

on this topic. Finally, Appendix B.3 proves existence for some rather

general stochastic optimization problems. These results are important for the

proof of the basic existence Theorem 1.4.3, and indirectly also for the proofs

in Chapter 3 and Appendix B.1.

To enhance the readability of the main body of the text, (steps of) proofs

that are not important for the comprehension are given in Appendix C. These

include the proofs of the applications of Generalized Lagrangian theory in Sec-

tion 2.1.2, several tedious measurability considerations in Chapter 4, a solution

to an inhomogeneous SDE (Appendix C.3), and a tailor-made comparison the-

orem (Appendix C.4).

Chapter 2

Portfolio Optimization for Time-Additive Utility Functions


2.1 Complete Market Portfolio Optimization

This chapter considers a concrete example of a utility function u, namely a

time-additive utility function. For this class of utility functions, we can give

sufficient conditions for the upper semicontinuity of u, that are easily verifiable,

ensuring thereby existence of an optimal solution. These sufficient conditions

are well-known. We only contribute to the theory by showing that basically all

conditions used in the literature are nothing else but conditions ensuring the

upper semicontinuity of u. Our second contribution to the theory consists of first-order

conditions. Such first-order conditions are standard in a complete market. In

a constrained market, they are rarer. By tackling the primal problem directly,

we can give first-order conditions for both constrained portfolio processes and

constrained portfolio-proportion processes.

We present the theory first for an unconstrained and subsequently for a

constrained market. Progressing from the special to the general case has the

disadvantage of repetition, but the advantage of greater transparency. We try

to make use of this advantage by using simpler techniques to prove the simpler

case. The drawback is that we prove several things twice.

2.1.1 The Unconstrained Dynamic and Static Problems

This subsection presents the unconstrained portfolio optimization problem.

The term “unconstrained” stems from the fact that the problem does not re-

quire additional constraints beyond the ones necessary for a sensible model

specification. To start with, we need a definition of the time-additive utility

function (e.g. Karatzas and Shreve, 1998).

2.1.1 Definition (Time-Additive Utility Function). A time-additive utility function $B : \mathbb{R} \times \Omega \mapsto [-\infty, \infty)$ is, for almost all $\omega \in \Omega$, quasiconcave, nondecreasing, and upper semicontinuous in its first argument, and $B(x)$ is, for any $x \in \mathbb{R}$, $\mathcal{F}(t)$-measurable for some $t \in I$. Further, the set $\mathrm{dom}(B) = \{x \in \mathbb{R} : B(x) > -\infty \ P\text{-a.s.}\} \subset [0, \infty)$ is not empty. Alternatively, $B \equiv 0$ is also called a utility function.


2.1.2 Convention. Whenever we write $B'$ in the text, we will implicitly assume that $B$ is differentiable and that $B'$ is positive, continuous, and strictly decreasing on the interior of dom(B), with $\lim_{x \to \infty} B'(x) = 0$ almost surely. Here, differentiation in $B'$ is taken with respect to the first argument in a natural “point-wise” manner. Since $B'$ is strictly decreasing, $B$ is in particular strictly concave on the interior of dom(B).

A nonempty dom(B) ensures that we do not study properties of the empty

set. And allowing a time-additive utility function to be identically zero enables

us to tackle the three related problems of optimal consumption, optimal

terminal wealth, and optimal consumption / terminal wealth at the same time

(see Remark 2.1.6 below). Since we are only interested in time-additive utility

functions throughout this chapter, we simply use the term “utility function”

instead of “time-additive utility function” for this and the next chapter. We

will comment on the relationship of this definition of a utility function to our

more general one of the previous chapter in Section 2.1.4 below.

Observe also that the utility function as defined above is state-dependent,

i.e. it depends on ω ∈ Ω. We call a utility function state-independent (or not

state-dependent) if B(·, ω1) = B(·, ω2) for (almost) all ω1, ω2 ∈ Ω. To alleviate

the notation we follow the usual practice of dropping the dependence of B on

ω, unless it is of special interest to us.

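Let $\bar{x} \triangleq \inf\{x \in \mathbb{R} : B(x) > -\infty \ P\text{-a.s.}\}$ for some utility function $B \neq 0$; then the strictly decreasing continuous function $B' : (\bar{x}, \infty) \mapsto (0, B'(\bar{x}))$, where we define $B'(\bar{x}) \triangleq \lim_{\gamma \downarrow 0} B'(\bar{x} + \gamma)$, has a strictly decreasing continuous “inverse” $B'^{-1} : (0, B'(\bar{x})) \mapsto (\bar{x}, \infty)$, defined by $B'(B'^{-1}(x)) = x$ almost surely. We set $B'^{-1}(x) \triangleq \bar{x}$ for $B'(\bar{x}) \le x \le \infty$, so that $B'^{-1} : (0, \infty] \mapsto [\bar{x}, \infty)$ is continuous and finite.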
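To illustrate this construction, the following minimal Python sketch uses the utility B(x) = log(1 + x) on [0, ∞). This choice is an assumption made only for the example and does not come from the text, and the function names are hypothetical. Its derivative B′(x) = 1/(1 + x) has the finite right limit 1 at the lower end 0 of dom(B), so the extended “inverse” is constant for all arguments of at least 1.

```python
import math

# Illustrative assumption: B(x) = log(1 + x) for x >= 0 and -infinity otherwise,
# so the lower end of dom(B) is 0 and B'(x) = 1 / (1 + x).
X_LOW = 0.0


def marginal_utility(x):
    """B'(x) for the illustrative utility B(x) = log(1 + x)."""
    return 1.0 / (1.0 + x)


def inverse_marginal_utility(y):
    """Extended inverse of B' on (0, infinity].

    For 0 < y < B'(X_LOW) this is the ordinary inverse of B'.
    For y >= B'(X_LOW), including y = infinity, it returns X_LOW.
    """
    if y <= 0:
        raise ValueError("the inverse is only defined for y > 0")
    if y >= marginal_utility(X_LOW):
        return X_LOW
    return 1.0 / y - 1.0


interior_value = inverse_marginal_utility(0.25)     # ordinary inverse
boundary_value = inverse_marginal_utility(2.0)      # truncated at X_LOW
infinite_value = inverse_marginal_utility(math.inf) # truncated at X_LOW
```

The truncated branch corresponds to the case in which the lower bound of dom(B) is binding, which is where the multipliers of the verification theorem below become active.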

Let $U : I \times \mathbb{R} \times \Omega \mapsto [-\infty, \infty)$ be a mapping such that the random variable $U(t, \cdot) : \mathbb{R} \times \Omega \mapsto [-\infty, \infty)$ is a utility function for each $t \in I$, and $U(\cdot, x)$ is measurable with respect to the progressive σ-algebra Prog. If $U(t, \cdot)$ is not almost surely zero, set $\bar{c}(t) \triangleq \inf\{c \in \mathbb{R} : U(t, c) > -\infty \ P\text{-a.s.}\}$; otherwise, set $\bar{c}(t) \triangleq 0$. We assume that $\bar{c}(\cdot)$ is continuous in $t$ with values in $[0, \infty)$.


2.1.3 Convention. Whenever we (implicitly) assume that $U'$ exists, we further assume that $U$ and $U'$ are continuous for almost all $(t, c)$ on $\{(t, c) \in I \times (0, \infty) : c > \bar{c}(t)\}$. As before, we define $U'(t_0, \bar{c}(t_0))$ by $U'(t_0, \bar{c}(t_0)) \triangleq \lim_{\gamma \downarrow 0} U'(t_0, \bar{c}(t_0) + \gamma)$ throughout the text.

For each fixed $t \in I$, the derivative $U'(t, \cdot)$ of the utility function $U(t, \cdot) \neq 0$ has an “inverse” $U'^{-1}(t, \cdot)$ satisfying $U'(t, U'^{-1}(t, x)) = x$ for $U'(t, \bar{c}(t)) > x > 0$, and $U'^{-1}(t, x) \triangleq \bar{c}(t)$ for $U'(t, \bar{c}(t)) \le x \le \infty$. Even more holds:

2.1.4 Lemma. The function $U'^{-1}(\cdot, \cdot)$ is almost surely jointly continuous on $I \times (0, \infty]$.

Proof. Karatzas and Shreve (1998, Lemma 3.5.8).

Now we have everything in place to consider the optimization problem an investor is concerned with. Suppose an investor has initial wealth $W_0 > 0$ (for $W_0 = 0$ there is nothing to do, and we therefore assume $W_0 > 0$ throughout the thesis) and wishes to maximize her total utility by investing in the financial market. Then she is confronted with the

2.1.5 Problem (Unconstrained Dynamic Problem). Solve¹

$$u(W_0) = \sup_{(\xi, c) \in \mathcal{A}(S, W_0)} E_P\left[\int_0^T U(s, c(s))\,ds + B(W_T)\right]. \tag{2.1}$$

Here, B is a utility function, and U is as described before Convention 2.1.3.

The utility function B in (2.1) is often considered as (derived) utility from

terminal wealth, and then called bequest. We will also refer to it as terminal

utility function. And U captures the individual’s utility from consumption,

sometimes called the instant utility function, or the running reward function.

2.1.6 Remark. If we set $U \equiv 0$, we consider the problem of optimal terminal wealth. Similarly, if $B \equiv 0$, we solve the problem of optimal consumption. And if neither is identically 0, the problem of optimal consumption / terminal wealth is solved.

¹ We use the symbol “u” for two different objects throughout the thesis: for utility functions defined on the space of contingent claims as in Section 1.4; and for utility functions (see Remark 3.2.2) defined on initial wealth as e.g. in this definition.


2.1.7 Remark. Problem 2.1.5 is formulated with respect to the portfolio process,

and not with respect to the portfolio-proportion process. As long as there are

no constraints on the portfolio process or the portfolio-proportion process,

this is without loss of generality. Equation (1.3) allows us to translate from

one formulation to the other. We therefore only give results for the portfolio

process or the portfolio-proportion process in Section 2.1.1 — whatever suits

us better.

As in Section 1.4 there exists a static problem closely related to Problem

2.1.5 that we will now discuss. To this end let $Q^m \in \mathcal{M}(S)$ be a fixed equivalent

local martingale measure throughout the rest of this section. The restriction

to a single measure $Q^m \in \mathcal{M}(S)$ will be justified below. With

the same notation as in Problem 2.1.5, consider now the

2.1.8 Problem (Unconstrained Static Problem). Solve

$$u_s(W_0) = \sup_{(c, X)} E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right] \quad \text{s.t.} \quad E_{Q^m}\left[\int_0^T c(s)\,ds + X\right] \le W_0. \tag{2.2}$$

Here, $c$ is a consumption process, and $X \in L^0_+(Q^m)$.²

2.1.9 Remark. Clearly, $X \in L^1_+(Q^m)$, since $E_{Q^m}[X] \le W_0$. We have chosen to write $X \in L^0_+(Q^m)$ because $L^0_+(Q^m)$ is invariant under a change to an equivalent measure, and in order to prepare for the general case in the next section. In this section, we will frequently use the equivalent condition $X \in L^1_+(Q^m)$ without further mention. Similarly, $c(\cdot) \in L^1_+(\lambda \otimes Q^m)$.

In Problem 2.1.5 the individual chooses a portfolio process and a consumption process in order to maximize utility. By contrast, in Problem 2.1.8 the individual selects a contingent claim³ in addition to the consumption process, subject to a budget constraint, in order to maximize her utility. Hence, in Problem 2.1.5 the individual invokes a dynamic investment strategy. She continually has to adjust her portfolio weights. But in Problem 2.1.8, she buys a contingent claim and holds it. Therefore one calls the first problem “dynamic” and the second one “static” (see Remark 1.4.6).

² $L^p_+(\mu) \triangleq \{x \in L^p(\mu) : x \ge 0 \ \mu\text{-a.s.}\}$ (the positive orthant) for a measure $\mu$ and $0 \le p \le \infty$.

³ A contingent claim $X$ is an asset that pays out a certain amount at time $T$ depending on the state of the world $\omega$, i.e. an $\mathcal{F}$-measurable random variable.

There is one difficulty associated with Problem 2.1.5 and Problem 2.1.8:

the integrals might not be defined. We therefore impose the following

Standing Assumption. For a problem with a time-additive utility function

there always exists W0 > 0 with u(W0) < ∞.

2.1.10 Remark. The Standing Assumption has a long history in portfolio op-

timization. We are saying that the opportunity to trade in a financial market

does not increase an investor’s utility arbitrarily. The condition implies via-

bility (Kreps, 1981, p. 20) of the financial market in the sense of Bellini and

Frittelli (2002, Definition 4.1). For unbounded utility functions, the assump-

tion implies the existence of an equivalent local martingale measure, i.e. the

market does not allow for arbitrage (Bellini and Frittelli, 2002, Theorem 4.1).

From a technical point of view, the Standing Assumption ensures that the integral is always defined, possibly being $-\infty$. If $U, B$ are concave, the Standing Assumption also implies that $u(W_0) < \infty$ for all $W_0 > 0$. There is an alternative to this assumption: allow the function $u$ to take the value $\infty$, but restrict the set of possible strategies to trading strategies for which the inequality

$$E_P\left[\int_0^T U^-(s, c(s))\,ds + B^-(W_T)\right] < \infty$$

holds. Then the integral

$$E_P\left[\int_0^T U(s, c(s))\,ds + B(W_T)\right]$$

in Problem 2.1.5 is defined (possibly being $\infty$). This is the approach taken by

Karatzas and Shreve (1998). Cuoco (1997, p. 39) uses a combination of these

two approaches where one of the two assumptions must hold. The results

are not dependent on the different approaches. Only the notation has to be

adjusted accordingly.


2.1.11 Remark. The problem of optimal terminal wealth is just a special case of optimal consumption: if we replace $\int_0^\cdot c(s)\,ds$ with $\int_0^\cdot c(s)\,df(s)$ and $\int_0^\cdot U(s, c(s))\,ds$ with $\int_0^\cdot U(s, c(s))\,df(s)$ for an increasing function $f : I \mapsto \mathbb{R}^+_0$ with $f(0) = 0$ (and change the other integrals accordingly), then for the case of optimal consumption as discussed so far, we choose $f(t) = t$, and for the case of optimal terminal wealth, we choose $f(t) = I_{\{T\}}(t)$.

2.1.2 A Verification Theorem for the Static Problem

We nevertheless solve the traditional problem of optimal consumption / terminal wealth. In light of the results presented in Appendix B.2, the static Problem 2.1.8 quite often has a straightforward solution. If we set $Y_T = \frac{dQ^m}{dP}$ (the Radon-Nikodym density) and $Y_t = E_P[Y_T \mid \mathcal{F}(t)]$ for all $t \in I$, we have the following theorem:

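2.1.12 Theorem (Verification Result). Let $c^*(\cdot) : I \times \Omega \mapsto \mathbb{R}^+_0$ be a consumption process, and let $X^* : \Omega \mapsto \mathbb{R}$, $y_1 \in \mathbb{R}^+_0$, $y_2 \in L^\infty_+(\lambda \otimes Q^m)$, and $y_3 \in L^\infty_+(Q^m)$ be given. Suppose that

$$W_0 \ge E_{Q^m}\left[\int_0^T c^*(s)\,ds + X^*\right] \tag{2.3a}$$

$$c^*(t) \ge \bar{c}(t) \quad \forall\, t \in I \quad Q^m\text{-a.s.} \tag{2.3b}$$

$$X^* \ge \bar{x} \quad Q^m\text{-a.s.} \tag{2.3c}$$

$$0 = y_1\left(W_0 - E_{Q^m}\left[\int_0^T c^*(s)\,ds + X^*\right]\right) + E_{Q^m}\left[\int_0^T y_2(s)\,(c^*(s) - \bar{c}(s))\,ds\right] + E_{Q^m}\left[y_3\,(X^* - \bar{x})\right] \tag{2.4}$$

and

$$U'(t, c^*(t)) = (y_1 - y_2(t))\,Y_t \quad \forall\, t \in I \quad Q^m\text{-a.s.} \tag{2.5a}$$

$$B'(X^*) = (y_1 - y_3)\,Y_T \quad Q^m\text{-a.s.} \tag{2.5b}$$

Then $(c^*(\cdot), X^*)$ is an optimal solution to Problem 2.1.8.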


Conversely, if $(c^*(\cdot), X^*)$ is an optimal solution to Problem 2.1.8 for which (2.3) holds and $U', B'$ exist, then there exist $y_1 \in \mathbb{R}^+_0$, $y_2 \in L^\infty_+(\lambda \otimes Q^m)$, and $y_3 \in L^\infty_+(Q^m)$ such that (2.4) is satisfied. Then $(c^*(\cdot), X^*)$ maximizes the Lagrangian

$$L(X, c) \triangleq E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right] - y_1 E_{Q^m}\left[\int_0^T c(s)\,ds + X\right] + E_{Q^m}\left[\int_0^T y_2(s)\,c(s)\,ds\right] + E_{Q^m}[y_3 X]. \tag{2.6}$$

Proof. See Appendix C.1.1.

2.1.13 Remark. The above verification result is formally only valid for the prob-

lem of optimizing both consumption and terminal wealth. If we consider the

related problems of optimizing terminal utility only (utility from consumption

only), one can set U ≡ 0 (B ≡ 0 respectively), see Remark 2.1.6. Similarly, set

U ′−1 ≡ 0 (B′−1 ≡ 0 respectively) in the following corollary.

It might not be immediately clear that the c∗(·) of Theorem 2.1.12 is actu-

ally progressively measurable. But this is settled in the next corollary, giving

an explicit characterization for c∗(·), X∗.

2.1.14 Corollary (Characterization of Optimal Consumption and Terminal Wealth). Let the setting be as in Theorem 2.1.12. Then

$$c^*(t) = U'^{-1}(t, y_1 Y_t) \quad \forall\, t \in I \quad Q^m\text{-a.s.} \tag{2.7a}$$

$$X^* = B'^{-1}(y_1 Y_T) \quad Q^m\text{-a.s.} \tag{2.7b}$$

and $y_1$ is a solution to

$$W_0 = E_{Q^m}\left[\int_0^T c^*(s)\,ds + X^*\right]. \tag{2.8}$$

Proof. See Appendix C.1.2.
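As a numerical illustration of this characterization, the following Python sketch treats the pure terminal wealth problem (U ≡ 0) on a finite state space. The power utility B(x) = x^γ/γ with γ < 1, γ ≠ 0, the two-state probabilities, and the function name are all assumptions chosen for this example and are not taken from the text. For this utility the lower bound on terminal wealth is never binding, so the terminal wealth is B′⁻¹ applied to the multiplier times the density, and the multiplier follows from the budget equation in closed form.

```python
def optimal_terminal_wealth(p, q, w0, gamma):
    """Static terminal wealth problem with B(x) = x**gamma / gamma.

    p     : physical probabilities of the states
    q     : probabilities under the equivalent martingale measure
    w0    : initial wealth
    gamma : power parameter, gamma < 1 and gamma != 0 (illustrative assumption)
    Returns the multiplier y1 and the optimal terminal wealth per state.
    """
    density = [qi / pi for pi, qi in zip(p, q)]  # Y_T = dQ/dP, state by state
    exponent = 1.0 / (gamma - 1.0)               # B'^{-1}(y) = y**exponent
    # Budget equation: E_Q[(y1 * Y_T)**exponent] = w0, solved for y1.
    norm = sum(qi * yi ** exponent for qi, yi in zip(q, density))
    y1 = (w0 / norm) ** (gamma - 1.0)
    x_star = [(y1 * yi) ** exponent for yi in density]
    return y1, x_star


p = [0.5, 0.5]
q = [0.4, 0.6]
gamma = 0.5
y1, x_star = optimal_terminal_wealth(p, q, w0=1.0, gamma=gamma)

# Budget check: expected terminal wealth under Q should equal initial wealth.
budget = sum(qi * xi for qi, xi in zip(q, x_star))
# First-order check: B'(X*) - y1 * Y_T should vanish in every state.
foc_gaps = [xi ** (gamma - 1.0) - y1 * qi / pi for xi, pi, qi in zip(x_star, p, q)]
```

In this sketch the state that is cheap under the martingale measure relative to its physical probability receives the larger payoff, which is the qualitative content of the first-order condition.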


2.1.15 Remark. This is just the well-known Kuhn-Tucker theorem applied to

our setting. That is, we have proven the sufficiency of certain first-order con-

ditions for our problem, and also established that any optimal solution must

satisfy these conditions.

There are sufficient conditions to ensure existence and essential uniqueness

of an optimal solution. We will learn about some in Section 2.1.5.

2.1.3 Equivalence of Dynamic and Static Problems

If we could reduce Problem 2.1.5 to Problem 2.1.8, this would greatly facilitate

the solution. In a more general framework, this already was the topic of Section

1.3. We will now discuss an assumption under which such a reduction is much simpler.

The assumption leads to a technique known as the martingale method (Bismut,

1975; Kreps, 1979, cited in Pliska, 1986; Pliska, 1982, 1986; Cox and Huang,

1989, 1991; Karatzas et al., 1987). The fundamental idea is the following

(Pliska, 1982):

(i) Characterize the set of all possible terminal wealth outcomes given the

admissible set of portfolio processes.

(ii) Optimize the static problem (with the additional restriction that the

terminal wealth lies within the set of possible terminal wealth outcomes).

(iii) Find the portfolio process that attains the optimal terminal wealth.
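The three steps above can be sketched in the simplest complete market, a one-period binomial model with zero interest rate. The model, the logarithmic utility of terminal wealth, the numbers, and the function name are assumptions made for this illustration only. Step (i) is trivial here because every claim whose price under the martingale measure equals the initial wealth is attainable; step (ii) uses the known closed form for logarithmic utility; step (iii) solves the two replication equations for the number of shares.

```python
def martingale_method_binomial(w0, s0, up, down, p_up):
    """Martingale method in a one-period binomial market (zero interest rate).

    w0   : initial wealth
    s0   : initial stock price
    up   : gross return of the stock in the up state
    down : gross return of the stock in the down state
    p_up : physical probability of the up state
    Utility of terminal wealth is assumed to be B(x) = log(x).
    """
    # (i) Unique martingale measure: the stock price must be a Q-martingale.
    q_up = (1.0 - down) / (up - down)
    if not 0.0 < q_up < 1.0:
        raise ValueError("the market admits arbitrage for these parameters")

    # (ii) Static problem: for log utility, X* = w0 * dP/dQ in each state.
    x_up = w0 * p_up / q_up
    x_down = w0 * (1.0 - p_up) / (1.0 - q_up)

    # (iii) Replication: w0 + shares * (S_T - s0) = X* in both states.
    shares = (x_up - x_down) / (s0 * (up - down))
    return q_up, (x_up, x_down), shares


q_up, (x_up, x_down), shares = martingale_method_binomial(
    w0=100.0, s0=10.0, up=1.2, down=0.9, p_up=0.5
)

# Terminal wealth produced by the trading strategy in each state.
wealth_up = 100.0 + shares * (10.0 * 1.2 - 10.0)
wealth_down = 100.0 + shares * (10.0 * 0.9 - 10.0)
```

With more states or more periods, step (iii) becomes a larger linear system or a backward recursion, and in continuous time it is the genuinely hard step.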

In general, the first step is quite hard. Hence Assumption 2.1.18 below. We

need a definition first, namely the definition of the predictable representation

property for vector stochastic integrals (e.g. Cherny and Shiryaev, 2001, Defi-

nition 1.4).

2.1.16 Definition (Predictable Representation Property). Given a probability space $(\Omega, \mathcal{F}, Q^m)$, a semimartingale $S$ has the predictable representation property for the filtration $\{\mathcal{F}(t)\}_{t \in I}$ if for every local martingale $M$ there is a predictable process $\xi$ and a constant $M_0 \in \mathbb{R}$ such that

$$M_t = M_0 + \int_{0+}^t \xi_s \cdot dS_s \quad \forall\, t \in I \quad Q^m\text{-a.s.}$$

2.1.17 Remark. There are numerous examples of combinations of a filtra-

tion and a stochastic process having this predictable representation property.

The best known certainly is the Brownian motion B for the smallest right-

continuous and complete filtration with respect to which B is adapted (Protter,

1990, Chapter 4, Corollary 2 to Theorem 42). Another example is the finite

market model (namely the Cox-Ross-Rubinstein model as a special case) often

encountered in finance (see Elliott and Kopp, 1999, Chapters 4.2, 4.3; Pliska,

1982). Further examples (e.g. the compensated Poisson process for the Poisson

filtration) rely on the notion of extremal or standard measures for local martin-

gales (Bichteler, 2002, Chapters 4.2, 4.6; Liptser and Shiryaev, 1989, Chapter

4.8; Protter, 1990, Chapter 4.3; Revuz and Yor, 1999, Chapter 5.4).

2.1.18 Assumption (Predictable Representation Property). The semimartingale $S$ has the predictable representation property for $(\Omega, \mathcal{F}, Q^m)$.

It can be shown that this assumption is equivalent to the assumption that $Q^m$ is the only measure in $\mathcal{M}(S)$, i.e. $\mathcal{M}(S) = \{Q^m\}$. We refer to Cherny and Shiryaev (2001, Theorem 1.5) for a discussion.

2.1.19 Remark. We can give an economic meaning to Assumption 2.1.18. Indeed, we call a market dynamically complete if for every contingent claim (see Footnote 3 on p. 33) $X \in L^1(Q^m)$ with $X \ge c$ almost surely for some constant $c \in \mathbb{R}$, there exists a portfolio process $\xi \in L(S)$ and a constant $X_0 \in \mathbb{R}$ with

$$X = X_0 + \int_{0+}^T \xi_s \cdot dS_s \quad Q^m\text{-a.s.}$$

This means that in a dynamically complete market every contingent claim can be replicated with the help of a dynamic trading strategy by any individual investor, provided she is endowed with enough initial wealth. We call a contingent claim attainable if it can be replicated. Obviously, Assumption 2.1.18 is a sufficient assumption for a market to be dynamically complete. It can be shown that $E_{Q^m}[X]$ is the price of the contingent claim in the market equilibrium. See Karatzas and Shreve (1998, Chapters 1 and 2).

The definition of a dynamically complete market is from Karatzas and

Shreve (1998), Definition 1.6.1. If we want to prove equivalence of a dynam-

ically complete market and Assumption 2.1.18, we need a more sophisticated

definition of a dynamically complete market (Cherny and Shiryaev, 2001, De-

finition 1.3).

We now establish equivalence of the static and the dynamic optimization

problem. The result is a special case of ideas presented in Section 1.4.

2.1.20 Lemma (Equivalence of Dynamic and Static Solutions). Suppose that Assumption 2.1.18 holds. Let $(\xi_D, c_D)$ be a solution to Problem 2.1.5. Then there exists a solution $(X_S, c_S)$ to Problem 2.1.8 (where the utility functions and initial wealth $W_0$ are the same) such that the following equalities hold:

$$c_D(t) = c_S(t) \quad \forall\, t \in I \quad P\text{-a.s.}$$

$$W_T = X_S \quad P\text{-a.s.}$$

Conversely, if $(X_S, c_S)$ is a solution to Problem 2.1.8, then there exists a solution $(\xi_D, c_D)$ to Problem 2.1.5 such that the above equalities hold.

Proof. See Appendix C.1.3.

2.1.21 Remark. Let $(c, X)$ be a combination of a consumption process and a contingent claim. We have then proven that there exists $(\xi, c) \in \mathcal{A}(S, W_0)$ with

$$W_t = W_0 + \int_{0+}^t \xi_s \cdot dS_s - \int_0^t c(s)\,ds \quad \forall\, t \in I \quad P\text{-a.s.}$$

a wealth process satisfying $W_T = X$ almost surely if and only if

$$E_{Q^m}\left[X + \int_0^T c(s)\,ds\right] \le W_0.$$

It therefore suffices to find $(c^*, X^*)$ maximizing $u(c, X) = E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right]$ on $\mathcal{C}(W_0) \triangleq \left\{(c, X) : E_{Q^m}\left[X + \int_0^T c(s)\,ds\right] \le W_0\right\}$. This is a special case of Section 1.4 that again allows us to transform our initial dynamic problem into a static one.

2.1.4 A Verification Theorem for the Dynamic Problem

Combining Lemma 2.1.20 with Theorem 2.1.12 we find the

2.1.22 Theorem (Verification Result). Suppose Assumption 2.1.18 holds and that $(\xi^*, c^*) \in \mathcal{A}(S, W_0)$ is such that $(c^*, W^*_T)$ satisfies Theorem 2.1.12 (and thus solves Problem 2.1.8). Then $(\xi^*, c^*)$ solves Problem 2.1.5.

2.1.23 Remark. The classical solution technique for Problem 2.1.5 is “the other way round”: first, find a solution $(c^*, X^*)$ to Problem 2.1.8, and then a portfolio process $\xi^*$ such that (from the proof of Lemma 2.1.20 or Section 1.5)

$$W_0 + \int_{0+}^t \xi^*_s \cdot dS_s = W^*_t = E_{Q^m}\left[\int_t^T c^*(s)\,ds + X^* \,\Big|\, \mathcal{F}(t)\right].$$

Finding this optimal portfolio process ξ∗ is however not trivial in general. In

some cases, e.g. the finite market model (cf. Elliott and Kopp, 1999, Chapter

4), it is just a matter of solving a system of linear equations. As we will see

in Example 2.1.39, this approach can be generalized to a partial differential

equation sometimes. Another common approach to finding ξ∗ uses Clark’s

formula, see Karatzas and Shreve (1998); Øksendal (1997, Appendix E) for

an introduction with a view towards applications in finance. See also Kallsen

(1998) for further examples.

2.1.5 Existence of an Optimal Solution

In this section, we will prove existence of an optimal solution. Combining this

with Theorem 2.1.22 gives a complete characterization of the optimal solution.

With the end of this subsection we have

(i) sufficient conditions for the existence of an optimal solution (Proposition

2.1.32 and Proposition 2.1.33);


(ii) first-order conditions for the optimal solution (Theorem 2.1.12);

(iii) a dual problem that is sometimes easier to solve (Lemma 2.1.38) — al-

though we will not dwell on this aspect for the moment, but defer a

thorough discussion to Chapter 3.

Besides giving the (n+1)-th proof for the time-additive complete market case,

this section also proves existence for the constrained case (see Remark 2.1.36)

and discusses various sufficient assumptions frequently employed to ensure ex-

istence. The results are given with more general utility functions in mind than

the one defined in Definition 2.1.1 and thereafter. We will comment on the gen-

erality below. Before we start, recall the two different usages of u, see Footnote

1 on p. 32.

The starting point for our discussion is the following specialization of Corollary 1.4.7. For the reader’s convenience, we restate the notion of upper semicontinuity used there: the function $u$ must satisfy $u(c, X) \ge \limsup_{n \to \infty} u(c_n, X_n)$ for any sequence $(c_n, X_n)_{n \ge 1}$ converging to $(c, X)$ almost surely (at least on $\mathcal{C}(W_0)$). Clearly, upper semicontinuity of $u$ with respect to convergence in probability is a special case of this assumption.

2.1.24 Theorem. Suppose that Assumption 2.1.18 holds and suppose further that $u(c, X) \triangleq E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right]$ is upper semicontinuous in the sense of Corollary 1.4.7 on

$$\mathcal{C}(W_0) \triangleq \left\{(c, X) : E_{Q^m}\left[\int_0^T c(s)\,ds + X\right] \le W_0\right\}.$$

Then the optimal solution to Problem 2.1.5 exists.

Proof. Follows immediately from Corollary 1.4.7 in combination with Remark

2.1.21 and the Standing Assumption on p. 34.

2.1.25 Remark. The rest of this section is devoted to establishing sufficient

conditions for upper semicontinuity. Before we do so, we quickly discuss the

minimum assumptions needed for the theorems below to be true. We always


need nondecreasing, quasiconcave functions U,B. We do however not rely on

a “pointwise” definition as in Definition 2.1.1, as long as we are not looking

at U ′, B′; instead it suffices to consider the functions U,B as mappings from

measurable spaces to measurable spaces (e.g., $B : L^0(\Omega, \mathcal{F}, P) \mapsto L^0(\Omega, \mathcal{F}, P)$). They must be upper semicontinuous in the following sense:

for every sequence $(c_n, X_n) \in \mathcal{C}_K(W_0)$ (or $\mathcal{C}^\pi_K(W_0)$) converging almost surely to $(c, X)$, we have almost surely $(U(\cdot, c), B(X)) \ge \limsup_{n \to \infty} (U(\cdot, c_n), B(X_n))$.

Clearly, any utility function as defined in Definition 2.1.1 is upper semicontinuous in this sense. There is, however, one subtlety we have to take care of: if $U, B$ are only quasiconcave and no longer concave, then $u(W_0) < \infty$ for some $W_0 > 0$ no longer implies $u(W_0) < \infty$ for all $W_0 > 0$. Therefore the Standing Assumption on p. 34 must hold for the concrete $W_0$ of interest. For the rest of this section, we formulate the results with these more general utility functions in mind, where applicable (i.e. where $U', B'$ are not needed).

In the next corollaries, we will answer the question concerning sufficient assumptions for upper semicontinuity in Theorem 2.1.24. Recall that we call a set $\mathcal{B} \subset L^1(P)$ uniformly integrable if $\lim_{\alpha \to \infty} \sup_{f \in \mathcal{B}} \int_{\{|f| \ge \alpha\}} |f|\,dP = 0$.⁴ As it turns out, $u$ is upper semicontinuous in the sense of Corollary 1.4.7 if and only if a certain set is uniformly integrable.

2.1.26 Lemma. $u(\cdot, \cdot)$ is upper semicontinuous in the sense of Corollary 1.4.7, and hence the optimal solution exists, if $(U^+(\cdot, \cdot), B^+(\cdot))$ is uniformly $\lambda \otimes P \otimes P$-integrable with respect to the σ-algebra $\mathrm{Prog} \times \mathcal{F}(T)$ for every sequence $(c_n, X_n)_{n \ge 1}$ in $\mathcal{C}(W_0)$ converging almost surely; equivalently, if every sequence $(U^+(\cdot, c_n(\cdot)), B^+(X_n))_{n \ge 1}$ is weakly compact.⁵

⁴ For a standard definition and a proper discussion of uniform integrability see e.g. Chapter 21 in Bauer (1992); Neveu (1964, Chapter 2.5).

⁵ Weakly compact means that for every sequence $(\zeta_n)$ there exists a subsequence with $\lim_{n \to \infty} E_P[\zeta_n \eta] = E_P[\zeta \eta]$ for any bounded random variable $\eta$.


Suppose now that $U, B$ are utility functions in the sense of Definition 2.1.1. If for every sequence $(c_n, X_n) \in \mathcal{C}_K(W_0)$ (or $\mathcal{C}^\pi_K(W_0)$) converging almost surely to $(c, X)$ we have $(U^+(\cdot, c), B^+(X)) = \lim_{n \to \infty} (U^+(\cdot, c_n), B^+(X_n))$ almost surely (sequential continuity), then both statements hold “if and only if”.

Proof. We first show that uniform integrability implies upper semicontinuity. Let $((c_n, X_n))_{n \ge 1}$ be a sequence in $\mathcal{C}(W_0)$ converging to $(c, X)$ almost surely. The assumption of upper semicontinuity of $U, B$ and the Fatou lemma for uniformly integrable random variables (Theorem 1.2 in Liptser and Shiryaev, 2000; Bauer, 1992, Exercise 21.6) imply

$$u(c, X) = E_P\left[B(X) + \int_0^T U(s, c(s))\,ds\right] \ge \limsup_{n \to \infty} E_P\left[B(X_n) + \int_0^T U(s, c_n(s))\,ds\right] = \limsup_{n \to \infty} u(c_n, X_n),$$

which is upper semicontinuity of $u$.

The assertion concerning weak compactness immediately follows from the

Dunford-Pettis Compactness Criterion (Liptser and Shiryaev, 2000, Theorem

1.7) and the Standing Assumption on p. 34.

For the converse, we use the following characterization of uniform integrability (Bauer, 1992, Exercise 21.6):

Take as given a sequence $(f_n)_{n \ge 1}$ in $L^1(\Omega, \mathcal{F}, P)$. Suppose that $\int \limsup_{n \to \infty} f_n\,dP < \infty$. Then $(f_n)_{n \ge 1}$ is uniformly integrable if and only if for all $A \in \mathcal{F}$ the inequality $\limsup_{n \to \infty} \int_A f_n\,dP \le \int_A \limsup_{n \to \infty} f_n\,dP$ holds.

Before we start, we note that the assumption $u(W_0) < \infty$ in combination with Problem 2.1.5 and the assumptions of the lemma imply

$$u^+(c, X) \triangleq E_P\left[B^+(X) + \int_0^T U^+(s, c(s))\,ds\right] < \infty,$$

and from the assumptions, this function must be upper semicontinuous, too. Upper semicontinuity gives us $\limsup_{n \to \infty} u^+(c_n, X_n) \le u^+(c, X)$ for a converging sequence. Furthermore, the assumption concerning $U^+, B^+$ implies the equality $u^+(c, X) = E_P\left[\lim_{n \to \infty} B^+(X_n) + \int_0^T \lim_{n \to \infty} U^+(s, c_n)\,ds\right] < \infty$. It remains to show that $\limsup_{n \to \infty} \int_A B^+(X_n)\,dP \le \int_A \lim_{n \to \infty} B^+(X_n)\,dP$. To see this, note that $I_A B^+(X_n) = B^+(X_n I_A) - B^+(0) I_{\Omega \setminus A}$. The facts that $B^+(\cdot) \ge 0$ and $u(W_0) < \infty$ (by the Standing Assumption on p. 34) imply that $B^+(0)$ is finite. Thus showing that $\limsup_{n \to \infty} \int B^+(X_n I_A)\,dP \le \int \lim_{n \to \infty} B^+(X_n I_A)\,dP$ suffices to complete the proof. But this follows from the upper semicontinuity of $u$, the sequential continuity of $B^+$, and the convergence of $(X_n I_A)$. A similar reasoning can be applied to $U^+$ to complete the proof.

Hence, ensuring existence of an optimal solution amounts to ensuring uni-

form integrability. We now turn to one condition directly on u to ensure uni-

form integrability. This is our first sufficient condition for the existence of an

optimal solution.

2.1.27 Corollary. With the notations and assumptions of Theorem 2.1.24, suppose that $\lim_{W_0 \to \infty} \frac{u(W_0)}{W_0} = 0$; let $U, B$ be utility functions in the sense of Definition 2.1.1, and suppose there exists a nondecreasing function $\underline{U} : \mathbb{R} \mapsto \mathbb{R}$ such that $U \ge \underline{U}$, $B \ge \underline{U}$ almost surely.


Proof. Before we start, we observe that without loss of generality we can assume that there exists $x_0 \in \mathbb{R}$ with $\underline{U}(x_0) \ge 0$. If this were not the case, $\{\underline{U}(x) : x \in \mathbb{R}\}$ would be bounded from above, say by $k$, and we could simply add an upper bound $k$ to $B, U, \underline{U}$ to get the desired existence of $x_0$. This would change neither our optimization problem nor the assumption. This implies that for $X_0 \triangleq \operatorname{ess\,inf}\{X > 0 : B(X) \ge 0\}$ and $c_0(\cdot) \triangleq \operatorname{ess\,inf}\{c(\cdot) > 0 : U(\cdot, c(\cdot)) \ge 0\}$ we have $E_{Q^m}\left[X_0 + \int_0^T c_0(s)\,ds\right] < \infty$. The last inequality follows from $U \ge \underline{U}$, $B \ge \underline{U}$ and $\underline{U}(x_0) \ge 0$ for some $x_0 \in \mathbb{R}$, hence

$$E_{Q^m}\left[X_0 + \int_0^T c_0(s)\,ds\right] \le E_{Q^m}\left[x_0 + \int_0^T x_0\,ds\right] \le x_0(1 + T).$$

Lemma 2.1.26 implies that we have to prove uniform integrability. To this end, we adapt a proof from Kramkov and Schachermayer (2003, Lemma 1), and do so by contradiction. Let $((c_n, X_n))_{n \ge 1}$ in $\mathcal{C}(W_0)$ be a sequence such that $(U^+(\cdot, \cdot), B^+(\cdot))$ is not uniformly integrable for this sequence.

Then we can find some constant $\alpha > 0$ and a disjoint sequence $A_n \in \mathrm{Prog}$ such that (with hopefully obvious notation) for all $n \ge 1$

$$E_P\left[B^+(X_n) I_T(A_n) + \int_0^T U^+(s, c_n(s)) I_s(A_n)\,ds\right] \ge \alpha.$$

Define $X^g_n \triangleq X_0 + \sum_{k=1}^n X_k I_T(A_k)$ and $c^g_n(\cdot) \triangleq c_0(\cdot) + \sum_{k=1}^n c_k(\cdot) I_\cdot(A_k)$. We estimate $E_{Q^m}\left[X^g_n + \int_0^T c^g_n(s)\,ds\right] \le E_{Q^m}\left[X_0 + \int_0^T c_0(s)\,ds\right] + nW_0$, i.e. $(c^g_n, X^g_n) \in \mathcal{C}\left(nW_0 + E_{Q^m}\left[X_0 + \int_0^T c_0(s)\,ds\right]\right)$, and with the concavity of $U, B$

$$E_P\left[B(X^g_n) + \int_0^T U(s, c^g_n(s))\,ds\right] \ge n\alpha.$$

Therefore

$$\limsup_{W_0 \to \infty} \frac{u(W_0)}{W_0} \ge \limsup_{n \to \infty} \frac{E_P\left[B(X^g_n) + \int_0^T U(s, c^g_n(s))\,ds\right]}{nW_0 + E_{Q^m}\left[X_0 + \int_0^T c_0(s)\,ds\right]} \ge \frac{\alpha}{W_0},$$

contradicting the assumption.

2.1.28 Remark. If $u$ is differentiable, the condition $\lim_{W_0 \to \infty} \frac{u(W_0)}{W_0} = 0$ is equivalent to $\lim_{W_0 \to \infty} u'(W_0) = 0$.

Let us now turn to conditions directly on U,B that ensure uniform integra-

bility. For example, uniform integrability holds for a power-growth condition.

This condition is often employed, see Cox and Huang (1991); Karatzas et al.

(1991); Cuoco (1997); Bank (2000); Mnif and Pham (2001) or Karatzas and

Shreve (1998, Remark 3.6.8, Equations (3.6.18) and (3.6.20)) for a textbook

reference. If this condition holds, we do not have to establish the Standing

Assumption on p. 34, since it is an immediate consequence.

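2.1.29 Corollary (Power-Growth). With the notations and assumptions of Theorem 2.1.24, suppose that $U(t, x) \le k_1 + k_2 x^\gamma$ and $B(x) \le k_1 + k_2 x^\gamma$ almost surely for constants $k_1, k_2 \ge 0$, $1 > \gamma$ and all $x > 0$. Assume further that $\frac{dP}{dQ^m} \in L^p(\Omega, \mathcal{F}, P)$ for some $p > \frac{\gamma}{1 - \gamma}$.

More generally, let $\tilde{p} = \frac{p}{(1 + p)\gamma}$, $Z \in L^{\tilde{p}}(\Omega, \mathcal{F}, P)$, and some $x_0 > \bar{x}$, $c_0 > \bar{c}$ with $B(x_0) \in L^{\tilde{p}}(\Omega, \mathcal{F}, P)$, $U(t, c_0(t)) \in L^{\tilde{p}}(\Omega, \mathcal{F}, P)$, $B(x) \le Z + k_2 x^\gamma$ for all $x \ge x_0$, and $U(t, x) \le Z + k_2 x^\gamma$ for all $x \ge c_0(t)$.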


Proof. Note that γ ≤ 0 implies a uniform bound since U,B are nondecreasing;

therefore we can safely assume 1 > γ > 0. By Lemma 2.1.26, it suffices to

prove that (U+(·, ·), B+(·)) is uniformly λ⊗P⊗P-integrable in C. For the proof, recall de La Vallée Poussin's characterization of uniform integrability (see Dellacherie and Meyer, 1975, p. 38).

$\mathcal{B} \subset L^1(\Omega, \mathcal{F}, P)$ is uniformly integrable if and only if there exists a function $b : \mathbb{R}_+ \to \mathbb{R}_+$ with $\lim_{x\to\infty} \frac{b(x)}{x} = \infty$ and $\sup_{f\in\mathcal{B}} \int b \circ |f|\, dP < \infty$.

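Hence it suffices to prove that

\[
E_P\left[ \left( B^+(X) \right)^{\tilde p} + \int_0^T \left( U^+(s, c(s)) \right)^{\tilde p} ds \right] < \infty
\]

for some $\tilde p > 1$. Using the power-growth condition and changing measure, we have to show that

\[
E_{Q^m}\left[ \frac{dP}{dQ^m} (X^\gamma)^{\tilde p} + \int_0^T \frac{dP}{dQ^m} (c(s)^\gamma)^{\tilde p}\, ds \right] < \infty .
\]

To this end, set $\tilde p = \frac{p}{(1+p)\gamma}$, so that $\tilde p > 1$ precisely when $p > \frac{\gamma}{1-\gamma}$, and $\gamma \tilde p = \frac{p}{1+p} < 1$. Hölder's inequality for the first summand implies

\[
E_{Q^m}\left[ \frac{dP}{dQ^m} X^{\gamma \tilde p} \right]
\le \left( E_{Q^m}[X] \right)^{\gamma \tilde p}
\left( E_{Q^m}\left[ \left( \frac{dP}{dQ^m} \right)^{\frac{1}{1-\gamma \tilde p}} \right] \right)^{1-\gamma \tilde p} < \infty ,
\]

since $E_{Q^m}[X] \le W_0$ and

\[
E_{Q^m}\left[ \left( \frac{dP}{dQ^m} \right)^{\frac{1}{1-\gamma \tilde p}} \right]
= E_P\left[ \left( \frac{dP}{dQ^m} \right)^{\frac{\gamma \tilde p}{1-\gamma \tilde p}} \right]
= E_P\left[ \left( \frac{dP}{dQ^m} \right)^{p} \right] < \infty ,
\]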


by assumption. This completes the proof for the first assertion for B. As for

the second assertion, simply use the estimate for x > x0, and the fact that

B+(x) ≤ B+(x0) ∈ Lp(Ω,F , P) for x < x0. Similar reasoning for U completes

the proof. The proof is standard, see e.g. Karatzas et al. (1991, Section 5) or

Cuoco (1997, Lemma B.4). Mnif and Pham (2001, Lemma 4.3) seem to have

given the extension to the state-dependent version first.
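The exponent bookkeeping in this proof can be checked numerically. The following is a minimal sketch using exact rational arithmetic; the function names `holder_exponent` and `density_exponent` and the sample values of γ and p are illustrative assumptions, not part of the original argument. It confirms that the exponent set in the proof exceeds 1 for p above the threshold γ/(1−γ), and that Hölder's inequality followed by the change of measure returns the exponent p on dP/dQm.

```python
from fractions import Fraction


def holder_exponent(p, gamma):
    """p_tilde = p / ((1 + p) * gamma), the integrability exponent set in the proof."""
    return p / ((1 + p) * gamma)


def density_exponent(p_tilde, gamma):
    """Exponent of dP/dQ^m under P after Hoelder's inequality and the change of measure."""
    return gamma * p_tilde / (1 - gamma * p_tilde)


gamma = Fraction(1, 2)            # illustrative power-growth exponent, 0 < gamma < 1
threshold = gamma / (1 - gamma)   # p must exceed this value (here: 1)
p = Fraction(3, 2)                # illustrative integrability order of dP/dQ^m

p_tilde = holder_exponent(p, gamma)
print(p_tilde, p_tilde > 1)                   # 6/5 True
print(density_exponent(p_tilde, gamma) == p)  # True

# At the threshold itself the exponent degenerates to exactly 1.
print(holder_exponent(threshold, gamma))      # 1
```

The strict inequality p > γ/(1−γ) is therefore what leaves room for an exponent strictly greater than 1 in de La Vallée Poussin's criterion.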

We summarize and extend the results of the previous corollaries in the

following proposition, which presents the most common sufficient conditions

for the existence of an optimal solution. We start with an assumption.

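2.1.30 Assumption (Inada Conditions). Suppose $\underline{x} \equiv 0$, $\underline{c} \equiv 0$ (see footnote 6) and that $U, B$ satisfy the uniform Inada conditions: there exist (strictly) decreasing functions $K_1, K_2 : \mathbb{R}_+ \to \mathbb{R}$ with almost surely

\[
K_1(x) \le U'(t, x) \le K_2(x) \quad \text{for all } t \in I ,
\]
\[
K_1(x) \le B'(x) \le K_2(x) ,
\]
\[
\lim_{x\to 0} K_1(x) = \lim_{x\to 0} K_2(x) = \infty ,
\]
\[
\lim_{x\to\infty} K_1(x) = \lim_{x\to\infty} K_2(x) = 0 ,
\]

and

\[
\limsup_{x\to\infty} \frac{K_1(x)}{K_2(x)} < \infty .
\]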

Furthermore t 7→ U(t, 1) is bounded almost surely, and so is B(1).

2.1.31 Remark. We can show that there exist continuously differentiable, state-independent utility functions $\underline{U}(x)$, $\overline{U}(x)$ such that $\underline{U}(x) \le U(t, x) \le \overline{U}(x)$ for all $t \in I$ almost surely. For example, set

\[
\underline{U}(x) =
\begin{cases}
\operatorname{ess\,inf}_{\Omega} \inf_{t\in I} U(t, 1) + \int_1^x K_1(z)\, dz & x \ge 1 \\
\operatorname{ess\,inf}_{\Omega} \inf_{t\in I} U(t, 1) - \int_x^1 K_2(z)\, dz & \text{else}
\end{cases}
\]

and then smooth out around 1 taking into account $B(1)$ (Karatzas and Zitkovic, 2003, Proposition 3.5). We need both $\underline{U}$ and $\overline{U}$ in Lemma 2.1.38.

Footnote 6: See the discussion after Definition 2.1.1. This assumption is only there for convenience.


2.1.32 Proposition (Sufficient Conditions I). With the notations and as-

sumptions of Theorem 2.1.24, if any of the following conditions is met, then

the optimal solution to Problem 2.1.5 exists.

(i) U+, B+ are uniformly bounded by some integrable random variable, i.e.

for all x ∈ IR+, t ∈ I we have U+(t, x) < X and B+(x) < X for some

X ∈ L1(Ω,F , P) almost surely.

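(ii) $U(t, x) \le k_1 + k_2 x^\gamma$ and $B(x) \le k_1 + k_2 x^\gamma$ almost surely for constants $k_1, k_2 \ge 0$, $1 > \gamma > 0$, $x > 0$, and $\frac{dP}{dQ^m} \in L^p(\Omega, \mathcal{F}, P)$ for some $p > \frac{\gamma}{1-\gamma}$. More generally (as in Corollary 2.1.29), let $\tilde p = \frac{p}{(1+p)\gamma}$, $Z \in L^{\tilde p}(\Omega, \mathcal{F}, P)$, and some $x_0 > \underline{x}$, $c_0 > \underline{c}$ with $B(x_0) \in L^{\tilde p}(\Omega, \mathcal{F}, P)$, $U(t, c_0(t)) \in L^{\tilde p}(\Omega, \mathcal{F}, P)$, $B(x) \le Z + k_2 x^\gamma$ for all $x \ge x_0$, and $U(t, x) \le Z + k_2 x^\gamma$ for all $x \ge c_0(t)$.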

(iii) There exists $k > 0$, $y_0 > 0$, and $1 > \gamma > 0$ such that $(U')^{-1}(\cdot, y) \le k y^{-\gamma}$ and $(B')^{-1}(y) \le k y^{-\gamma}$ almost surely for all $y \le y_0$. Furthermore, there

exists x0 > 0, such that U(·, x0), B(x0) are uniformly bounded from above

by an integrable random variable (i.e. there exists an X ∈ L1(Ω,F , P)

such that U(t, x0) < X,B(x0) < X for all t ∈ I almost surely).

(iv) There exists a function $K : \mathbb{R}_+ \to \mathbb{R}$ and a constant $x_0 > 0$ such that $K(x) \ge U'(\cdot, x)$, $K(x) \ge B'(x)$ almost surely, $\int_{x_0}^{\infty} K(x)\, dx < \infty$, and

U(·, x0), B(x0) are uniformly bounded from above by an integrable ran-

dom variable (i.e. there exists an X ∈ L1(Ω,F , P) such that U(t, x0) <

X, B(x0) < X for all t ∈ I almost surely).

Proof. Condition (i) is an immediate consequence of Lemma 2.1.26 and ele-

mentary integration theory (Bauer, 1992, Chapter 21, Beispiel 3). Condition

(ii) is just Corollary 2.1.29. We will show that (iii) ⇒ (iv) ⇒ (i).

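(iii) ⇒ (iv): Since $B'$ is decreasing, $(B')^{-1}(y) \le k y^{-\gamma}$ implies $y \ge B'(k y^{-\gamma})$. Setting $x = k y^{-\gamma}$, we have $k^{1/\gamma} x^{-1/\gamma} \ge B'(x)$; and this is (iv), because $\frac{1}{\gamma} > 1$ makes $K(x) = k^{1/\gamma} x^{-1/\gamma}$ integrable at infinity.

(iv) ⇒ (i): On noting that $B(x) = B(x_0) + \int_{x_0}^x B'(z)\, dz$ it follows immediately from the assumptions that $B(\cdot)$ is uniformly bounded by an integrable random variable, and so is $U(\cdot, \cdot)$.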
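As an illustration of the chain (iii) ⇒ (iv) ⇒ (i), consider the deterministic utility $B(x) = -1/x$ on $x > 0$. This choice is an assumption made here for concreteness and does not appear in the text; the constants $k = 1$ and $\gamma = \tfrac12$ are read off from it.

```latex
\begin{align*}
B'(x) &= x^{-2}, \qquad (B')^{-1}(y) = y^{-1/2} \le k\,y^{-\gamma}
  \quad\text{with } k = 1,\ \gamma = \tfrac12, \\
K(x) &= k^{1/\gamma} x^{-1/\gamma} = x^{-2}, \qquad
  \int_{x_0}^{\infty} K(x)\,dx = \frac{1}{x_0} < \infty, \\
B(x) &= B(x_0) + \int_{x_0}^{x} B'(z)\,dz \le -\frac{1}{x_0} + \frac{1}{x_0} = 0
  \quad\text{for } x \ge x_0 .
\end{align*}
```

The resulting bound $B \le 0$ is the uniform upper bound required in (i).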


2.1.33 Proposition (Sufficient Conditions II). With the notations and as-

sumptions of Proposition 2.1.32 suppose that Assumption 2.1.30 holds. Then

also the following conditions are sufficient for the existence of an optimal so-

lution to Problem 2.1.5.

(v) There is x0 ≥ 0, 0 < α < 1, and β > 1 such that B′(βx) ≤ αB′(x) and

U ′(·, βx) ≤ αU ′(·, x) for all x > x0 almost surely.

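(vi) $\lim_{x\to\infty} \inf_{t\in I} U(t, x) > 0$, $\lim_{x\to\infty} B(x) > 0$ almost surely, and there exists a constant $\alpha < 1$, such that

\[
\limsup_{x\to\infty} \left( \sup_{t\in I} \frac{x U'(t, x)}{U(t, x)} \right) < \alpha ,
\qquad
\limsup_{x\to\infty} \frac{x B'(x)}{B(x)} < \alpha .
\]

(vii) $v(y) \triangleq E_P\left[ \int_0^T \tilde U\left( s, y E_P\left[ \frac{dQ^m}{dP} \,\Big|\, \mathcal{F}(s) \right] \right) ds + \tilde B\left( y \frac{dQ^m}{dP} \right) \right] < \infty$ for all $y > 0$; here $\tilde B(y) \triangleq \sup_{x>0} [B(x) - xy]$ is the convex dual, and $\tilde U$ is defined accordingly.

(viii) $U(t, x) \le k_1 + k_2 x^\gamma$ and $B(x) \le k_1 + k_2 x^\gamma$ almost surely for constants $k_1, k_2 \ge 0$, $1 > \gamma > 0$, and

\[
\sup_{(\xi, c)\in A(S, W_0)} E_P\left[ \int_0^T k_2 (c(s))^\gamma\, ds + k_2 (W(T))^\gamma \right] < \infty .
\]

(ix) $U(t, x) \le \overline{U}(t, x)$ and $B(x) \le \overline{B}(x)$ almost surely; $\overline{U}, \overline{B}$ are utility functions satisfying Assumption 2.1.30, any of the conditions (v), (vi) or (vii), and

\[
\sup_{(\xi, c)\in A(S, W_0)} E_P\left[ \int_0^T \overline{U}(s, c(s))\, ds + \overline{B}(W(T)) \right] < \infty .
\]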

Proof. We prove that (v) ⇒ (vii), (vi) ⇒ (vii) and show then that (vii)

implies Corollary 2.1.27. The condition (viii) ⇒ (ix), and sufficiency of (ix) is

a consequence of (v) ⇒ (vii), (vi) ⇒ (vii), or directly (vii).

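(v) ⇒ (vii): Lemma 2.1.38 implies existence of some $y_0$ such that $-\infty < v(y_0) < \infty$. Since $v(y)$ is decreasing in $y$, it suffices to show that $v(y) < \infty \Rightarrow v(ky) < \infty$ for some, and then all $0 < k < 1$. It is shown in Lemma 2.1.38 that $v(y) > -\infty$ for all $y > 0$ (this is also a direct consequence of the assumption, see Karatzas and Shreve, 1998, Remark 3.6.9). Given the assumptions we can show that $\tilde B(y) = B\left( (B')^{-1}(y) \right) - y (B')^{-1}(y)$. Since $y (B')^{-1}(y) > 0$ it suffices to show that

\[
E_P\left[ B\left( (B')^{-1}\left( y_0 \frac{dQ^m}{dP} \right) \right) \right] < \infty
\]

implies for some $k < 1$

\[
E_P\left[ B\left( (B')^{-1}\left( k y_0 \frac{dQ^m}{dP} \right) \right) \right] < \infty .
\]

On setting $x = (B')^{-1}(y)$ and using the assumption $B'(\beta x) \le \alpha B'(x)$, we find the inequality $(B')^{-1}(\alpha y) \le \beta (B')^{-1}(y)$. Choosing $k = \alpha$, we have the estimate

\[
E_P\left[ B\left( (B')^{-1}\left( \alpha y_0 \frac{dQ^m}{dP} \right) \right) \right]
\le E_P\left[ B\left( \beta (B')^{-1}\left( y_0 \frac{dQ^m}{dP} \right) \right) \right] .
\]

If we consider $b(\beta) \triangleq B\left( \beta (B')^{-1}\left( y_0 \frac{dQ^m}{dP} \right) \right)$, we have $b(\beta) \le b(1) + b'(1)(\beta - 1)$ from the concavity of $B$. From $v(y_0) < \infty$ we find $E_P[b(1)] < \infty$; and $v(y_0) > -\infty$ combined with $b'(1) = y_0 \frac{dQ^m}{dP} (B')^{-1}\left( y_0 \frac{dQ^m}{dP} \right)$ leads to $E_P[b'(1)] < \infty \Rightarrow E_P[b(\beta)] < \infty$. A similar reasoning for $U$ establishes the claim.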

(vi) ⇒ (vii): Again, by Lemma 2.1.38, it suffices to show that $v(y) < \infty \Rightarrow v(ky) < \infty$ for $0 < k < 1$. To this end, we use the following facts that are easy to establish: $\tilde B(y) = B\left( (B')^{-1}(y) \right) - y (B')^{-1}(y)$ and $\tilde B'(y) = -(B')^{-1}(y)$ (e.g. Karatzas and Shreve, 1998, Lemma 3.4.3). Using this and the assumption, we estimate

\[
\tilde B(y) = B(-\tilde B'(y)) + y \tilde B'(y)
> -\frac{1}{\gamma} \tilde B'(y)\, B'(-\tilde B'(y)) + y \tilde B'(y)
= \frac{\gamma - 1}{\gamma}\, y \tilde B'(y)
\]

for some $\alpha < \gamma < 1$ and all $y < \bar y$ for some small enough $\bar y > 0$. Upon setting $f(k) \triangleq \tilde B(ky)$, $g(k) \triangleq k^{-\frac{\gamma}{1-\gamma}} \tilde B(y)$, this can be written as $f'(1) > g'(1)$. Since $f(1) = g(1)$ it follows from continuity that there exists some $\bar x < 1$ with $f(k) < g(k)$ for all $k \in [\bar x, 1)$. To show that this inequality is indeed true for all $k \in (0, 1)$, suppose that this is not the case and let $\hat x < \bar x$ be the maximal element in $[0, \bar x]$ such that $f(\hat x) \ge g(\hat x)$. The same reasoning as before shows that $f'(\hat x) > g'(\hat x)$, and we conclude (again by continuity) that there exists some $\bar x > k > \hat x$ with $f(k) \ge g(k)$, contradicting the definition of $\hat x$. To sum up, we have shown that $\tilde B(ky) \le k^{-\frac{\gamma}{1-\gamma}} \tilde B(y)$ for all $0 < k < 1$. Since we can establish this for $U$, too, a straightforward estimate implies the inequality (see Karatzas and Zitkovic, 2003, Lemma A.4, for the complete proof).


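(vii) ⇒ Corollary 2.1.27: From Lemma 2.1.38, $v(y) = \sup_{x>0} [u(x) - xy]$. Hence $\frac{u(W_0)}{W_0} \le \frac{v(y)}{W_0} + y$ for every $y > 0$, so the assumption $v(y) < \infty$ for all $y > 0$ gives $\limsup_{W_0\to\infty} \frac{u(W_0)}{W_0} \le y$ for every $y > 0$ and therefore $\lim_{W_0\to\infty} \frac{u(W_0)}{W_0} = 0$. It now follows from Corollary 2.1.27 that (vii) is a sufficient condition.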

(viii): is a special case of (ix).

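(ix) ⇒ (vii): The portfolio optimization problem with utility functions $\overline{U}, \overline{B}$ satisfies all assumptions necessary (especially the Standing Assumption on p. 34) to see that

\[
\overline{v}(y) \triangleq E_P\left[ \int_0^T \tilde{\overline{U}}\left( s, y E_P\left[ \frac{dQ^m}{dP} \,\Big|\, \mathcal{F}(s) \right] \right) ds + \tilde{\overline{B}}\left( y \frac{dQ^m}{dP} \right) \right] < \infty
\]

either by (v), (vi) or directly by (vii). And $v(y) \le \overline{v}(y)$ completes the proof.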

To complete the proof, we have to establish Lemma 2.1.38. This will be

done at the end of this subsection.
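The convex dual appearing in condition (vii) can be made concrete for a power utility. The sketch below assumes B(x) = x^γ/γ with 0 < γ < 1 (an illustrative choice, not prescribed by the text) and hypothetical function names. It compares the identity B̃(y) = B((B′)⁻¹(y)) − y(B′)⁻¹(y) used in the proof with the closed form ((1−γ)/γ) y^(−γ/(1−γ)) and with a brute-force maximization of B(x) − xy on a grid.

```python
import math


def power_utility(x, gamma):
    """B(x) = x**gamma / gamma for x > 0 (illustrative utility, 0 < gamma < 1)."""
    return x ** gamma / gamma


def inverse_marginal(y, gamma):
    """(B')^{-1}(y) for B'(x) = x**(gamma - 1)."""
    return y ** (-1.0 / (1.0 - gamma))


def dual_via_inverse_marginal(y, gamma):
    """Dual computed as B((B')^{-1}(y)) - y * (B')^{-1}(y)."""
    x_star = inverse_marginal(y, gamma)
    return power_utility(x_star, gamma) - y * x_star


def dual_closed_form(y, gamma):
    """Closed form ((1 - gamma) / gamma) * y**(-gamma / (1 - gamma))."""
    return (1.0 - gamma) / gamma * y ** (-gamma / (1.0 - gamma))


def dual_grid(y, gamma, x_max=50.0, steps=20_000):
    """Brute-force sup over a grid of x in (0, x_max] of B(x) - x*y."""
    best = -math.inf
    for i in range(1, steps + 1):
        x = x_max * i / steps
        best = max(best, power_utility(x, gamma) - x * y)
    return best


y, gamma = 0.5, 0.5
print(dual_via_inverse_marginal(y, gamma))
print(dual_closed_form(y, gamma))
print(dual_grid(y, gamma))
```

For these values the maximizer is x = 4, and all three computations agree on a dual value of 2 up to floating-point error.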

2.1.34 Remark (Discussion of Assumptions). As already discussed, conditions

(i) and (ii) are common. Condition (iii) is used in Karatzas and Shreve

(1998, Equations (3.6.18) and (3.6.19)), and (iv) is a generalization of (iii).

Assumption 2.1.30 is a standard assumption in microeconomic optimization

problems. Pliska (1986) introduced it to portfolio optimization. It is clear that

these conditions can be generalized to arbitrary $\underline{c}$, $\underline{x}$ other than 0. If the Inada

conditions hold, any optimal solution must satisfy the first-order conditions of

Theorem 2.1.12. Condition (v) is frequently employed (see e.g. Assumption

4.3 in Karatzas et al., 1991; Karatzas and Shreve, 1998, Equation (3.4.16)).

Kramkov and Schachermayer (1999) suggest condition (vi).

\[
\frac{x B'(x)}{B(x)} = \frac{dB(x) / B(x)}{dx / x}
\]

is the elasticity of the utility function $B$, i.e. the relative change of utility per relative change of terminal wealth, or the marginal change $B'(x)$ divided by the average change $\frac{B(x)}{x}$. Hence, (vi) is a constraint on the asymptotic elasticity of the utility function. If $B$ is twice differentiable, the condition is equivalent to (apply l'Hôpital's rule)

\[
\lim_{x\to\infty} -\frac{x B''(x)}{B'(x)} > 0 ,
\]

i.e. the relative risk aversion is bounded away from zero. A closely related

assumption is used in Mnif and Pham (2001, Assumption 5.1). An asymptotic


elasticity less than 1 precludes extreme gambling behavior, where a rich investor gambles with part of her fortune because she is risk-neutral in the limit (Kramkov and Schachermayer, 1999, Section 5). Kramkov and Schachermayer (2003)

discuss (vii). In a slightly different context, this condition can also be found e.g.

in Karatzas et al. (1991, Assumption 11.2) or Mnif and Pham (2001, Theorem

5.1) (textbook: Karatzas and Shreve, 1998, Assumption 3.6.1 and Equation

(6.5.2)). The fact that condition (vii) is sufficient in Proposition 2.1.33 is a first glimpse of the duality method used in Chapter 3 to characterize the optimal solutions. Finally, we observe that (viii) is a very handy variant of the power-growth condition. It is straightforward to establish or check using the sufficient conditions of Theorem 2.1.12. Amongst others, such a condition can be found in Karatzas et al. (1991, Equations (5.3), (5.4) and Remark 11.9). (ix) is just a generalization of (viii). The Standing Assumption on p. 34 is a consequence of (viii) or (ix), and need not be established separately.

The conditions are equivalent to or weaker than the conditions used by

papers decomposing the nonlinear Hamilton-Jacobi-Bellman equation into a

linear partial differential equation (e.g. conditions (3.2), (4.8), (4.16), (5.6),

and (5.11) in Karatzas et al., 1987).
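The elasticity xB′(x)/B(x) discussed in this remark is easy to evaluate numerically. The sketch below is an illustration under assumed utilities (a power utility and the logarithm) with a hypothetical helper `elasticity` based on a central finite difference; none of these names come from the text.

```python
import math


def elasticity(utility, x, h=1e-6):
    """Approximate x * B'(x) / B(x) using a central difference with relative step h."""
    derivative = (utility(x * (1 + h)) - utility(x * (1 - h))) / (2 * x * h)
    return x * derivative / utility(x)


def power_utility(x, gamma=0.5):
    """B(x) = x**gamma / gamma; its elasticity equals gamma for every x."""
    return x ** gamma / gamma


# Power utility: the elasticity is constant and equal to gamma = 0.5 < 1.
for x in (1e2, 1e4, 1e6):
    print(x, elasticity(power_utility, x))

# Logarithmic utility: the elasticity is 1 / log(x), which tends to 0.
for x in (1e2, 1e4, 1e6):
    print(x, elasticity(math.log, x))
```

Both utilities have asymptotic elasticity strictly below 1, so they are candidates for condition (vi); for the power utility the limit is exactly the exponent γ.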

2.1.35 Remark (Discussion of Relations between Assumptions). We can mix

the assumptions: e.g. B satisfies a power-growth condition and U a condition

like (vi). Indeed, it is even true that a state-dependent utility might satisfy one

assumption on a measurable subset of Ω and another assumption on another

subset. This immediately follows from the additivity of the integral. As a

special case, the sufficient conditions also hold if U ≡ 0 or B ≡ 0, i.e. the

cases of consumption only and terminal wealth only. This also implies that

in (iii) γ = 1 is feasible, if the assumptions of (vi) hold. The case γ = 1

leads to k1 ≥ xB′(x). Now either B is bounded from above almost surely,

or $\limsup_{x\to\infty} \frac{x B'(x)}{B(x)} = 0$. And a similar reasoning applies to U. As to the

relation between condition (ii) and (vii), we note that (ii) ⇒ (vii), but we

cannot conclude without Assumption 2.1.30, that v(y) = supx>0[u(x) − xy],

which is needed to prove the sufficiency of (vii). And even if Assumption


2.1.30 and a power-growth condition hold, we still need an assumption like in (viii) or (ix) to establish $v(y) < \infty$. Therefore (ii) settles cases that are not covered by (vii). On the other hand, (vii) does not require a power-growth condition. And even if a power-growth condition holds (e.g. if (vi) holds), (vii) implies the weaker condition $\frac{dP}{dQ^m} \in L^p(\Omega, \mathcal{F}, P)$ for some $p \ge \frac{\gamma}{1-\gamma}$ instead of $p > \frac{\gamma}{1-\gamma}$. The same is true for (viii). The proof of Proposition 2.1.33 shows that (v) or (vi) ⇒ (vii). The converse implications are not true in

general (Lemma 6.5 in Kramkov and Schachermayer, 1999, 2003). Therefore,

(vii) is more general. We also note that (v) ⇒ (vi), if U,B are uniformly

bounded from below by a constant (Kramkov and Schachermayer, 1999, Lemma

6.5). And (vi) implies a power-growth condition (Kramkov and Schachermayer,

1999). Note also that (i), (iii), (iv), (v) and (vi) are conditions for which only

properties of the utility functions are relevant. That is, they hold irrespective

of the concrete market, as long as the Standing Assumption on p. 34 is true.

In contrast, (ii), (vii), (viii) and (ix) are conditions for a specific market.

2.1.36 Remark. All the conditions can be easily extended to the constrained

case. The only situation where this would not have been straightforward is

Lemma 2.1.38, which therefore is already proven for the constrained case. To

be more specific, we can replace C(W0) by any other subset CK(W0) throughout

this subsection, provided it is convex, closed and solid. We need a ‘dual set’

YK that is convex and closed with respect to Fatou convergence. The ‘duality’

between the two sets must be as in Proposition A.3.14. Then Corollary 1.4.7 still applies, and Corollary 2.1.27 and Lemma 2.1.26 remain true with the obvious modifications. Therefore, Corollary 1.4.7 still ensures existence of an optimal solution. To sum up, with the obvious modifications in notation everything in this section is true for the constrained case, too. More generally, it

is true for any two ‘dual’ sets CK(W0) and YK, having the necessary properties,

namely convexity and a certain closure property.

We finish the subsection with the missing lemma that fills in the gap of

Proposition 2.1.33. This lemma will be given with the more general setting of

constrained portfolio optimization in mind. Therefore, we use slightly different


notation that will be explained as part of the proof. The lemma can safely be

skipped upon first reading. Its full meaning will only become obvious in the

context of the constrained case.

2.1.37 Remark. The proof only works for arbitrary constraints on portfolio-

proportion processes. As discussed in Appendix A.3.2, the case of portfolio

processes is more involved. If $\{ A^{S_K}(Q)_T : Q \in \mathcal{M}_b(S_K) \}$ is uniformly bounded,

the proof is true with minor modifications. As a special case of this, we do not

have to modify the lemma and its proof at all for the case of cone constraints

or incomplete markets. The general case is however an open issue.

2.1.38 Lemma. With the notation of Proposition 2.1.33, suppose Assumption 2.1.30 holds. Set

\[
v_K(y) \triangleq \inf_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T \tilde U(s, yY_s)\, ds + \tilde B(yY_T) \right] ,
\]

where $\mathcal{Y}_K$ is defined in (A.8) on p. 144.

Then $v_K(y) = \sup_{x>0} [u_K(x) - xy]$, $v_K(y) > -\infty$ for all $y > 0$ and $v_K(y_0) < \infty$ for some $y_0 > 0$. Furthermore, there exists $Y^* \in \mathcal{Y}_K$ with

\[
v_K(y) = E_P\left[ \int_0^T \tilde U(s, yY^*_s)\, ds + \tilde B(yY^*_T) \right] . \tag{2.9}
\]

Proof. We start with the proof that $v_K(y) = \sup_{x>0} [u_K(x) - xy]$, where $v_K(y) \triangleq \inf_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T \tilde U(s, yY_s)\, ds + \tilde B(yY_T) \right]$. For the proof, we need the full power of Proposition A.3.14. Using this proposition, the dynamic portfolio optimization problem can be translated into a static one:

\[
u_K(W_0) \triangleq \sup_{(c, X)\in C^\pi_K(W_0)} E_P\left[ \int_0^T U(s, c(s))\, ds + B(X) \right] ,
\]

where

\[
C^\pi_K(W_0) \triangleq \left\{ (c, X) : c \ge 0,\ X \ge 0,\ \sup_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T c(s) Y_s\, ds + X Y_T \right] \le W_0 \right\} .
\]

We set $C^n_K(W_0) \triangleq \{ (c, X) \in C^\pi_K(W_0) : c(\cdot) \le n,\ X \le n \text{ a.s.} \}$, $\tilde B^n(y) \triangleq \sup_{0<x\le n} [B(x) - xy]$, and define $\tilde U^n$ analogously. Alaoglu's theorem (Schechter, 1997, Theorems 28.29 (UF28)) ensures that $C^n_K(W_0)$ is compact in the weak-* topology $\sigma(L^\infty, L^1)$. Note also that $(U^+(\cdot, \cdot), B^+(\cdot))$ is uniformly bounded on $C^n_K(W_0)$ by $\overline{U}(n)$ (see Remark 2.1.31), i.e. upper semicontinuous by Lemma 2.1.26. Since $C^n_K(W_0)$, $\mathcal{Y}_K$ are both convex, we can apply the Minimax theorem (Millar, 1983, p. 92):

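\begin{align*}
v^n_K(y) &\triangleq \inf_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T \tilde U^n(s, yY_s)\, ds + \tilde B^n(yY_T) \right] \\
&= \inf_{Y\in\mathcal{Y}_K} \sup_{(c, X)\in C^n_K(W_0)} E_P\left[ \int_0^T \big( U(s, c(s)) - y c(s) Y_s \big)\, ds + B(X) - y X Y_T \right] \\
&= \sup_{(c, X)\in C^n_K(W_0)} \inf_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T \big( U(s, c(s)) - y c(s) Y_s \big)\, ds + B(X) - y X Y_T \right] \\
&= \sup_{(c, X)\in C^n_K(W_0)} \left\{ E_P\left[ \int_0^T U(s, c(s))\, ds + B(X) \right] - y \sup_{Y\in\mathcal{Y}_K} E_P\left[ \int_0^T c(s) Y_s\, ds + X Y_T \right] \right\} \\
&= \sup_{(c, X)\in C^n_K(W_0)} E_P\left[ \int_0^T U(s, c(s))\, ds + B(X) \right] - y W_0
\end{align*}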

for 'large enough' $n$ so that the constraint is binding (e.g. $n > W_0$), where the first equality follows from pointwise optimization. Hence, $\lim_{n\to\infty} v^n_K(y) = \sup_{x>0} [u_K(x) - xy]$, and from the definition $v^n_K(y) \le v_K(y)$. To complete the proof of $v_K(y) = \sup_{x>0} [u_K(x) - xy]$, it remains to show $\lim_{n\to\infty} v^n_K(y) \ge v_K(y)$ for the nondecreasing sequence. There is nothing to show if $\lim_{n\to\infty} v^n_K(y) = \infty$, and we will see below that $v^n_K(y) > -\infty$; hence we will assume for the moment that this limit actually exists in $\mathbb{R}$. Consider a sequence $(Y^m)_{m\ge 1}$ with

\[
\lim_{n\to\infty} v^n_K(y) = \lim_{n\to\infty} E_P\left[ \int_0^T \tilde U^n(s, yY^n_s)\, ds + \tilde B^n(yY^n_T) \right] .
\]

Since $\tilde B$, $\tilde U$ are nonincreasing in $Y^m$ (Karatzas and Shreve, 1998, Lemma 4.3), we can and will assume that the $Y^m$ are maximal elements of $\mathcal{Y}_K$ (see Lemma A.3.11). For later use we show that

\[
\lim_{n\to\infty} v^n_K(y) = \liminf_{n\to\infty} \sup_{m\ge n} E_P\left[ \int_0^T \tilde U^n(s, yY^m_s)\, ds + \tilde B^n(yY^m_T) \right] . \tag{2.10}
\]

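Indeed, using that $\tilde U^n$, $\tilde B^n$ are non-decreasing in $n$, we get the estimate

\begin{align*}
\lim_{n\to\infty} v^n_K(y)
&\le \liminf_{n\to\infty} \sup_{m\ge n} E_P\left[ \int_0^T \tilde U^n(s, yY^m_s)\, ds + \tilde B^n(yY^m_T) \right] \\
&\le \liminf_{n\to\infty} \sup_{m\ge n} E_P\left[ \int_0^T \tilde U^m(s, yY^m_s)\, ds + \tilde B^m(yY^m_T) \right]
= \lim_{n\to\infty} v^n_K(y) .
\end{align*}

Here, the last equality is true since $v^n_K(y)$ is nondecreasing in $n$, i.e. the supremum and the limit are the same.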

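As a first step towards showing $\lim_{n\to\infty} v^n_K(y) \ge v_K(y)$, we will prove that $\left( (\tilde U^n)^-(\cdot, \cdot), (\tilde B^n)^-(\cdot) \right)$ is uniformly integrable on $\mathcal{Y}_K$. This also shows that $\lim_{n\to\infty} v^n_K(y) > -\infty$, as stated above. Choose the function $\underline{U}$ as in Remark 2.1.31. Define $\tilde{\underline{U}}^n(y) \triangleq \sup_{0<x\le n} [\underline{U}(x) - xy]$. Set $f^n \triangleq \left( -\tilde{\underline{U}}^n \right)^{-1}$. L'Hôpital's rule and the definition of $\tilde{\underline{U}}^n$ give

\[
\lim_{x\to\infty} \frac{f^n(x)}{x}
= \lim_{y\to\infty} \frac{y}{-\tilde{\underline{U}}^n(y)}
= -\lim_{y\to\infty} \frac{1}{\frac{d}{dy} \tilde{\underline{U}}^n(y)}
= \infty .
\]

$\underline{U}(x) \ge 0$ for all $x > 0$ implies $\tilde{\underline{U}}^n(y) \ge 0$ for all $y > 0$. This together with $\tilde U^n \ge \tilde{\underline{U}}^n$ and $\tilde B^n \ge \tilde{\underline{U}}^n$ already is uniform integrability of $\left( (\tilde U^n)^-(\cdot, \cdot), (\tilde B^n)^-(\cdot) \right)$. Therefore we can and will for the moment assume that there exists $x > 0$ such that $\underline{U}(x) < 0$. Now either $\tilde{\underline{U}}^n(y) < 0$ for all $y > 0$, hence $\left( \tilde{\underline{U}}^n \right)^- = -\tilde{\underline{U}}^n$, and we have the estimate

\begin{align*}
& E_P\left[ \int_0^T f^n\left( (\tilde{\underline{U}}^n)^-(s, Y_s) \right) ds + f^n\left( (\tilde{\underline{B}}^n)^-(Y_T) \right) \right] \\
&\quad = E_P\left[ \int_0^T f^n\left( -\tilde{\underline{U}}^n(Y_s) \right) ds + f^n\left( -\tilde{\underline{U}}^n(Y_T) \right) \right] \le 1 + T < \infty ;
\end{align*}

or there exist $y_1 > y_2 > 0$ with $\tilde{\underline{U}}^n(y_1) < 0$ and $\tilde{\underline{U}}^n(y_2) > 0$. Continuity of $\tilde{\underline{U}}^n$ then implies that there exists $y_0 > 0$ with $\tilde{\underline{U}}^n(y_0) = 0 \Rightarrow f^n(0) < \infty$. We have the estimate

\begin{align*}
& E_P\left[ \int_0^T f^n\left( (\tilde{\underline{U}}^n)^-(s, Y_s) \right) ds + f^n\left( (\tilde{\underline{B}}^n)^-(Y_T) \right) \right] \\
&\quad = E_P\left[ \int_0^T f^n\left( -\tilde{\underline{U}}^n(Y_s) \right) ds + f^n\left( -\tilde{\underline{U}}^n(Y_T) \right) \right] + (1 + T) f^n(0) \\
&\quad \le (1 + T)(1 + f^n(0)) < \infty .
\end{align*}

Since $\tilde U^n \ge \tilde{\underline{U}}^n$ and $\tilde B^n \ge \tilde{\underline{U}}^n$, uniform integrability of $\left( (\tilde U^n)^-(\cdot, \cdot), (\tilde B^n)^-(\cdot) \right)$ follows from the de La Vallée Poussin Theorem.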

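Given uniform integrability, proving $\lim_{n\to\infty} v^n_K(y) \ge v_K(y)$ is immediate. To start with, let $\hat Y^n \in \operatorname{conv}(Y^n, Y^{n+1}, \dots)$ be a sequence Fatou-converging to some $Y^*$ (see Lemma A.2.6). Since we have chosen the $Y^n$ for $n \ge 1$ to be maximal elements of $\mathcal{Y}_K$, it follows from Lemma A.3.11 that $Y^* \in \mathcal{Y}_K$. We can therefore estimate

\begin{align*}
\lim_{n\to\infty} v^n_K(y)
&= \liminf_{n\to\infty} \sup_{m\ge n} E_P\left[ \int_0^T \tilde U^n(s, yY^m_s)\, ds + \tilde B^n(yY^m_T) \right] \\
&\ge \liminf_{n\to\infty} E_P\left[ \int_0^T \tilde U^n(s, y\hat Y^n_s)\, ds + \tilde B^n(y\hat Y^n_T) \right] \\
&\ge E_P\left[ \int_0^T \liminf_{n\to\infty} \tilde U^n(s, y\hat Y^n_s)\, ds + \liminf_{n\to\infty} \tilde B^n(y\hat Y^n_T) \right] \\
&\ge E_P\left[ \int_0^T \liminf_{n\to\infty} \liminf_{m\to\infty} \tilde U^n(s, y\hat Y^m_s)\, ds + \liminf_{n\to\infty} \liminf_{m\to\infty} \tilde B^n(y\hat Y^m_T) \right] \\
&= E_P\left[ \int_0^T \tilde U(s, yY^*_s)\, ds + \tilde B(yY^*_T) \right] \ge v_K(y) . \tag{2.11}
\end{align*}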

We have used the convexity of $\tilde U$, $\tilde B$ and the Fatou Lemma (see Liptser and Shiryaev, 2000, Theorem 1.1.2) in the first two inequalities. The third inequality is obvious, and the equality follows from the continuity of the functions $\tilde U^n$, $\tilde B^n$, the almost sure convergence of the sequence $(\hat Y^n)$ (Lemma A.2.5), and $\tilde U^n \uparrow \tilde U$, $\tilde B^n \uparrow \tilde B$. The last inequality is a direct consequence of the definition of $v_K(y)$ and $Y^* \in \mathcal{Y}_K$. See Karatzas and Zitkovic (2003) for further details.

This completes the proof of $v_K(y) = \lim_{n\to\infty} v^n_K(y) = \sup_{x>0} [u_K(x) - xy]$. By the same reasoning as we have shown uniform integrability of $\tilde B^n$, $\tilde U^n$ (simply drop the 'n'), we can also show uniform integrability of $\tilde B$, $\tilde U$, from which $v_K(y) > -\infty$ is immediate. $v_K(y_0) < \infty$ follows from the equality $v_K(y) = \sup_{x>0} [u_K(x) - xy]$ and the fact that $u_K(W_0) < \infty$ implies $u_K(x) < \infty$ for all $x > 0$ (due to the concavity of the utility functions).

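Finally, it is a consequence of the definition of $v^n_K(y)$ and $Y^* \in \mathcal{Y}_K$ that

\[
v^n_K(y) \le E_P\left[ \int_0^T \tilde U^n(s, yY^*_s)\, ds + \tilde B^n(yY^*_T) \right] .
\]

Combining this with (2.11) and $\lim_{n\to\infty} v^n_K(y) = v_K(y)$ yields

\[
v_K(y) = \lim_{n\to\infty} v^n_K(y) = E_P\left[ \int_0^T \tilde U(s, yY^*_s)\, ds + \tilde B(yY^*_T) \right] ,
\]

which proves (2.9).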

2.1.6 Examples (Unconstrained Brownian Market)

The best-known example for the predictable representation property is the

Brownian market model, see Merton (1969); Pliska (1986); Cox and Huang

(1989, 1991); Karatzas et al. (1987). Standard textbooks are Elliott and

Kopp (1999, Chapter 10); Karatzas and Shreve (1998, Chapter 3); or Korn

(1997, Chapter 3). Since we have already seen a slight generalization of the

basic idea, we can limit ourselves to a streamlined exposition. It gives an

idea of the economic intuition behind the above setting. The interested reader

should also track how the introduction of a risk-free rate process r changes our

established notation slightly but not fundamentally.

2.1.39 Example. In this example a financial market consists of

(i) a complete probability space (Ω,F , P);


(ii) a finite, positive constant T , called the terminal time;

(iii) an $N$-dimensional Brownian motion $\{ Z(t), \mathcal{F}^Z_N(t);\ 0 \le t \le T \}$ on the probability space $(\Omega, \mathcal{F}, P)$, where the filtration $\{ \mathcal{F}^Z_N(t) \}_{0\le t\le T}$ is the augmentation by the null sets of the natural filtration $\{ \mathcal{F}^Z(t) \}_{0\le t\le T}$ and $\mathcal{F}^Z_N(T) = \mathcal{F}$ holds;

(iv) a progressively measurable (see footnote 7) risk-free rate process $r(\cdot) \ge 0$ with $\int_0^T r(s)\, ds < \infty$ $P$-a.s.;

(v) a progressively measurable, $N$-dimensional mean rate of return process $\mu(\cdot)$ satisfying $\int_0^T \|\mu(s)\|\, ds < \infty$ $P$-a.s.;

(vi) a progressively measurable, $(N \times N)$-matrix-valued volatility process $\sigma(\cdot)$ satisfying

\[
\sum_{i=1}^N \sum_{j=1}^N \int_0^T \sigma_{ij}^2(s)\, ds < \infty \quad P\text{-a.s.},
\]

and being nonsingular for Lebesgue-a.e. $t \in [0, T]$;

(vii) a vector of positive, constant initial asset prices $S(0) = (S_1(0), S_2(0), \dots, S_N(0))'$, $S_i(0) > 0\ \forall\, i \in \{1, \dots, N\}$. Additionally, there exists one asset with $S_0(0) = 1$.

For ease of exposition we assume that

(i) there exists a constant $M > 0$, such that $r(t) \le M$, $\|\mu(t)\| \le M\ \forall\, t \in [0, T]$ $P$-a.s.;

(ii) all eigenvalues of $\sigma(\cdot)$ are bounded from above and away from zero; sufficient is the existence of a constant $\varepsilon > 0$, such that $\zeta' \sigma(t) \sigma'(t) \zeta \ge \varepsilon \|\zeta\|^2$ for all $\zeta \in \mathbb{R}^N$ and $t \in [0, T]$ almost surely (uniform ellipticity).

Footnote 7: We only use progressively measurable processes since they are most frequently used in Brownian market models. We could instead just as easily use predictable processes. See Protter (2001, Remark on page 177).


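Amongst others, these assumptions imply that there exists a progressively measurable, $N$-dimensional market price of risk process $\theta(\cdot)$: $\theta(t) \triangleq \sigma(t)^{-1} (\mu(t) - r(t)\mathbf{1})$ $P$-a.s. $\forall\, t \in [0, T]$, for which $\int_0^T \|\theta(s)\|^2\, ds < \infty$ $P$-a.s. and

\[
E_P\left[ \exp\left\{ -\int_0^T \theta'(s)\, dZ(s) - \frac{1}{2} \int_0^T \|\theta(s)\|^2\, ds \right\} \right] = 1 .
\]

See Karatzas and Shreve (1998, Theorem 1.4.2), and note that

\[
\int_0^T \|\theta(s)\|^2\, ds
\le \int_0^T \|\sigma(s)^{-1}\|^2 \left[ \|\mu(s)\| + \|\delta(s)\| + r(s) \right]^2 ds
\le 9M^2 \int_0^T \|\sigma(s)^{-1}\|^2\, ds \le c
\]

for some $c \in \mathbb{R}_+$, where the last inequality follows from uniform ellipticity and Karatzas and Shreve (1991, 5.8.1). From the Novikov condition (Karatzas and Shreve, 1991, Chapter 3.5.D) $E_P\left[ \exp\left\{ -\int_0^T \theta'(s)\, dZ(s) - \frac{1}{2} \int_0^T \|\theta(s)\|^2\, ds \right\} \right] = 1$; i.e. $\left( \exp\left\{ -\int_0^t \theta'(s)\, dZ(s) - \frac{1}{2} \int_0^t \|\theta(s)\|^2\, ds \right\} \right)_t$ is a martingale.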

These assumptions are not strictly necessary for the theory below to work.

Indeed, for the case of portfolio optimization, we can almost always do equally

well with local martingales. But the martingale assumption simplifies matters.

Similarly, we can easily drop the assumption of invertibility of the matrix-

valued process σ(·), and replace the inverse with the pseudo-inverse throughout

the exposition. The pseudo-inverse also allows for non-square matrix processes.

We could therefore do without the assumption that the number of assets and

the dimension of the Brownian motion are the same. The notation would

however become considerably more tedious (compare Remark 4.3.5 on p. 122

for details).

The price processes of the risky assets satisfy the following stochastic dif-

ferential equations:

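\[
dS_i(t) = S_i(t) \left[ \mu_i(t)\, dt + \sum_{j=1}^N \sigma_{ij}(t)\, dZ^{(j)}(t) \right]
\quad \forall\, t \in [0, T],\ i = 1, \dots, N \tag{2.12}
\]

where $Z^{(j)}(\cdot)$ is the $j$-th component of the Brownian motion $Z(\cdot)$. Equivalently

\[
S_i(t) = S_i(0) \exp\left\{ \int_0^t \left[ \mu_i(s) - \frac{1}{2} \sum_{j=1}^N \sigma_{ij}^2(s) \right] ds + \int_0^t \sum_{j=1}^N \sigma_{ij}(s)\, dZ^{(j)}(s) \right\}
\quad \forall\, t \in [0, T],\ i = 1, \dots, N. \tag{2.13}
\]

In addition to these $N$ risky assets, there exists one "instantaneously risk-free" asset $S_0(\cdot)$:

\[
dS_0(t) = S_0(t)\, r(t)\, dt \quad \forall\, t \in [0, T] \tag{2.14}
\]

with the solution

\[
S_0(t) = \exp\left\{ \int_0^t r(s)\, ds \right\} \quad \forall\, t \in [0, T]. \tag{2.15}
\]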

One can think of this asset as a money account with continuous accrual.
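For constant coefficients the closed-form prices (2.13) and (2.15) can be evaluated directly. The following is a minimal sketch under that simplifying assumption; the function names `risky_price` and `money_account`, and the idea of passing the realized value of Z(t) as an argument, are illustrative choices and not part of the model description.

```python
import math


def risky_price(s0, mu, sigma_row, z_t, t):
    """Price S_i(t) from (2.13), assuming constant mu_i and a constant row sigma_i.

    sigma_row holds sigma_i1, ..., sigma_iN and z_t holds the realized
    Brownian components Z^(1)(t), ..., Z^(N)(t).
    """
    drift = (mu - 0.5 * sum(s * s for s in sigma_row)) * t
    noise = sum(s * z for s, z in zip(sigma_row, z_t))
    return s0 * math.exp(drift + noise)


def money_account(r, t):
    """Value S_0(t) from (2.15), assuming a constant rate r and S_0(0) = 1."""
    return math.exp(r * t)


# One risky asset driven by a two-dimensional Brownian motion (values assumed).
price = risky_price(s0=100.0, mu=0.05, sigma_row=[0.2, 0.1], z_t=[0.3, -0.5], t=1.0)
discounted = price / money_account(r=0.03, t=1.0)
print(price, discounted)
```

Dividing by the money account gives the discounted price, which is the quantity that takes over the role of the semimartingale of the general theory.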

The role of the semimartingale $S$ of the general theory above is now played by $S_t = \frac{1}{S_0(t)} S(t)$ with $S(t) = (S_i(t))_{i=1,\dots,N}$. In this case the Radon-Nikodym density for the unique measure $Q^m \in \mathcal{M}(S)$ with respect to the measure $P$ can be calculated explicitly. Indeed, let

\[
Y_T \triangleq \exp\left\{ -\int_0^T \theta'(s)\, dZ(s) - \frac{1}{2} \int_0^T \|\theta(s)\|^2\, ds \right\} \tag{2.16}
\]

be given. It is not hard to see that

\[
Y_t = \exp\left\{ -\int_0^t \theta'(s)\, dZ(s) - \frac{1}{2} \int_0^t \|\theta(s)\|^2\, ds \right\}
= 1 - \int_0^t Y_s\, \theta'(s)\, dZ(s) \tag{2.17}
\]

is a continuous $P$-martingale for our setting (a consequence of the Novikov condition, cf. e.g. Karatzas and Shreve, 1991, Chapter 3.5.D, and Itô's Formula). Hence $Q^m$ is a probability measure, characterized by $Q^m(A) = \int_A Y_T\, dP$, $(A \in \mathcal{F})$. We can show that $S_t = \frac{1}{S_0(t)} S(t)$ is a martingale for the measure $Q^m$ with the help of the Radon-Nikodym theorem (see e.g. Karatzas and Shreve, 1998, Chapter 1).


Since the predictable representation property holds for Brownian motion

(cf. Remark 2.1.17), solving the portfolio optimization problem is straightfor-

ward: solving the static problem first results in the simple problem of finding

a solution to Corollary 2.1.14; then, to find the optimal portfolio process one

has to integrate a function by quadrature or alternatively solve an additional

linear PDE (see Cox and Huang, 1989; Karatzas et al., 1987, Proposition 7.6

for a similar approach), for which numerical solution techniques are available (see footnote 8).

Provided that the conditions of Theorem 2.1.12 hold, we can make direct

use of the first order conditions (2.5). From (2.7) we have

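\begin{align*}
c^*(t) &= (U')^{-1}(y_1 Y_t, t) \quad \forall\, t \in I \quad Q^m\text{-a.s.} \\
X^* &= (B')^{-1}(y_1 Y_T) \quad Q^m\text{-a.s.}
\end{align*}

as the candidate optimal solutions. From Section 1.5 we know that

\[
W^*_t = E_{Q^m}\left[ \int_0^T c^*(s)\, ds + X^* \,\Big|\, \mathcal{F}^Z_N(t) \right]
\]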

almost surely. Getting an explicit characterization of (W ∗t )t∈I as an integral

with respect to Brownian motion is difficult. One possible approach is to

use the generalized Clark-Ocone formula (see e.g. Ocone and Karatzas, 1991;

Øksendal, 1997, Chapter 5).

But assume that U and B are not state-dependent, and that all processes

involved are Markovian. Substituting c∗, X∗, it follows that W ∗t = F (t, Yt)

for some real-valued function F (x1, x2), which we can calculate by quadrature

— at least theoretically. An alternative is the solution of a partial differential

equation, see Cox and Huang (1989) or Karatzas and Shreve (1998, Chapter

3.8, especially Theorem 8.12).

2.1.40 Remark. Ocone and Karatzas (1991) find for the setting of Example 2.1.39 above a characterization similar to that in Corollary 2.1.14 under some mild additional assumptions. This so-called feedback form of the optimal solution reads

\begin{align*}
c^*(t) &= (U')^{-1}\left( t, f(t, W^*(t)) \right) \quad \forall\, t \in I \quad Q^m\text{-a.s.} \\
\pi^*(t) &= -\left[ \sigma(t) \sigma'(t) \right]^{-1} \left[ \mu(t) + \delta(t) - r(t)\mathbf{1} \right] \frac{f(t, W^*(t))}{f_W(t, W^*(t))\, W^*(t)}
\end{align*}

given a function $f$ for which Ocone and Karatzas give an explicit characterization ($f_W$ being the derivative with respect to the second argument). Again, we have to solve a linear PDE.

Footnote 8: Merton (1969, 1971) showed that solving the dynamic optimization problem results in two algebraic equations and a nonlinear PDE in the Markovian case (and is, even numerically, rather difficult to solve in general, to say the least). See Merton (1969, 1971) or Chapter 3.3 in Korn (1997) for examples and some comments on the solvability.

As a special case of the example above we have the following result (Merton,

1969, 1971). It should further clarify the advantages of the static solution.

2.1.41 Example. This example demonstrates the solution technique indicated in Remark 2.1.23 and Example 2.1.39. Assume that the processes $\mu, \sigma, r$ in Example 2.1.39 are constant, and $W_0 > 0$. Set the utility function $U \equiv 0$ (no consumption) and let $B = \exp(-dT)\hat B$, where $d \ge 0$ is the subjective discount rate, and

\[
\hat B(x) =
\begin{cases}
\frac{x^{1-k}}{1-k} & x > 0 \\
\lim_{x\downarrow 0} \frac{x^{1-k}}{1-k} & x = 0 \\
-\infty & x < 0
\end{cases}
\]

for a constant $k > 0$, $k \ne 1$ (i.e. CRRA). Hence, $(B')^{-1}(y) = \exp\left( -\frac{1}{k} dT \right) y^{-\frac{1}{k}}$ for $y > 0$, and since $\lim_{y\downarrow 0} B'(y) = \infty$, one finds $X^* > \underline{x} = 0$ almost surely, if $W_0 > 0$. From (2.7), (2.8) of Corollary 2.1.14, we have

\[
W^*(T) = \exp\left( -\frac{1}{k} dT \right) (Y_T\, y_1)^{-\frac{1}{k}} ,
\]

and $y_1$ is a solution to

\[
W_0 = \exp\left( -\frac{1}{k} dT \right) E_{Q^m}\left[ (Y_T\, y_1)^{-\frac{1}{k}} \right]
\]

with $Y$ given by (2.17). Solving for $y_1$ leaves us with

\[
W^*(T) = \frac{W_0}{E_{Q^m}\left[ Y_T^{-\frac{1}{k}} \right]}\, Y_T^{-\frac{1}{k}} , \tag{2.20}
\]


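hence (substituting $Y_T$ and using the definition of $W^*$)

\begin{align*}
W^*(T) &= W_0 + \int_0^T W^*(s)\, \pi^{*\prime}(s) [\mu - r\mathbf{1}]\, ds + \int_0^T W^*(s)\, \pi^{*\prime}(s)\, \sigma\, dZ(s) \\
&= \frac{W_0}{E_P\left[ Y_T^{1-\frac{1}{k}} \right]} \exp\left\{ \frac{1}{2k} \|\theta\|^2 T + \frac{1}{k} \theta' Z(T) \right\} .
\end{align*}

As $Y_T^{1-\frac{1}{k}}$ clearly is log-normally distributed, it is straightforward to calculate

\[
E_P\left[ Y_T^{1-\frac{1}{k}} \right] = \exp\left\{ \frac{1}{2} \left[ \left( \frac{1}{k} - 1 \right) + \left( \frac{1}{k} - 1 \right)^2 \right] \|\theta\|^2 T \right\} .
\]

Plugging this into the right side, using Itô's Formula and subtracting $W_0$ on both sides, we end up with

\begin{align*}
& \int_0^T W^*(s)\, \pi^{*\prime}(s) [\mu - r\mathbf{1}]\, ds + \int_0^T W^*(s)\, \pi^{*\prime}(s)\, \sigma\, dZ(s) \\
&\quad = \int_0^T W^*(s)\, \frac{1}{k} \|\theta\|^2\, ds + \int_0^T W^*(s)\, \frac{1}{k} \theta'\, dZ(s) .
\end{align*}

Now since $\theta = \sigma^{-1} [\mu - r\mathbf{1}]$, we see immediately (comparing the finite variation part and the martingale part) that $\pi^*(t) = \pi^* = \frac{1}{k} (\sigma\sigma')^{-1} [\mu - r\mathbf{1}]$. It is easy to check that the solution found is optimal (satisfies Theorem 2.1.22).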
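The constant proportion found above is simple to compute. The sketch below evaluates π* = (1/k)(σσ′)⁻¹[µ − r1] for two risky assets; the function name `merton_weights`, the restriction to a 2×2 volatility matrix, and all numerical values are illustrative assumptions rather than part of the example.

```python
def merton_weights(mu, r, sigma, k):
    """Return pi* = (1/k) (sigma sigma')^{-1} (mu - r 1) for a constant 2x2 sigma.

    mu is the list of mean rates of return, r the risk-free rate,
    sigma the 2x2 volatility matrix and k > 0 the CRRA parameter.
    """
    # Covariance matrix a = sigma sigma'.
    a = [[sum(sigma[i][l] * sigma[j][l] for l in range(2)) for j in range(2)]
         for i in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ValueError("sigma sigma' must be nonsingular")
    excess = [mu[0] - r, mu[1] - r]
    # Solve a x = excess using the explicit inverse of a 2x2 matrix.
    x0 = (a[1][1] * excess[0] - a[0][1] * excess[1]) / det
    x1 = (-a[1][0] * excess[0] + a[0][0] * excess[1]) / det
    return [x0 / k, x1 / k]


# Two uncorrelated assets (all numbers assumed for illustration).
weights = merton_weights(mu=[0.08, 0.12], r=0.02,
                         sigma=[[0.2, 0.0], [0.0, 0.4]], k=2.0)
print(weights)
```

With a diagonal volatility matrix each weight reduces to (µ_i − r)/(k σ_i²), so a higher risk aversion parameter k scales every position down proportionally.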

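2.1.42 Example (Continued from Example 2.1.41). With the same notation as before, suppose that $U(\cdot, t) = \exp(-dt)\hat B(\cdot)$ for all $t \in I$, instead of $U \equiv 0$. Doing the same calculations as before, we find $\pi^*(t) = \frac{1}{k} (\sigma\sigma')^{-1} [\mu - r\mathbf{1}]$, again. By the same reasoning (compare (2.20)), we find

\begin{align*}
W^*(T) &= \frac{\exp\left( -\frac{1}{k} dT \right) W_0}{\exp\left( -\frac{1}{k} dT \right) E_{Q^m}\left[ Y_T^{-\frac{1}{k}} \right] + \int_0^T \exp\left( -\frac{1}{k} ds \right) E_{Q^m}\left[ Y_s^{-\frac{1}{k}} \right] ds}\; Y_T^{-\frac{1}{k}} \\
c^*(t) &= \frac{\exp\left( -\frac{1}{k} dt \right) W_0}{\exp\left( -\frac{1}{k} dT \right) E_{Q^m}\left[ Y_T^{-\frac{1}{k}} \right] + \int_0^T \exp\left( -\frac{1}{k} ds \right) E_{Q^m}\left[ Y_s^{-\frac{1}{k}} \right] ds}\; Y_t^{-\frac{1}{k}} .
\end{align*}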


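We can further simplify this equation upon calculating $E_{Q^m}\left[ Y_t^{-\frac{1}{k}} \right]$ and integrating out $s$:

\begin{align*}
W^*(T) &= \frac{\exp\left( -\frac{1}{k} dT \right) \left( -\frac{1}{k} d + \frac{1}{2} \frac{1}{k} \left( 1 + \frac{1}{k} \right) \|\theta\|^2 \right) W_0}{\exp\left\{ -\frac{1}{k} dT + \frac{1}{2} \frac{1}{k} \left( 1 + \frac{1}{k} \right) \|\theta\|^2 T \right\} - 1}\; Y_T^{-\frac{1}{k}} \\
c^*(t) &= \frac{\exp\left( -\frac{1}{k} dt \right) \left( -\frac{1}{k} d + \frac{1}{2} \frac{1}{k} \left( 1 + \frac{1}{k} \right) \|\theta\|^2 \right) W_0}{\exp\left\{ -\frac{1}{k} dT + \frac{1}{2} \frac{1}{k} \left( 1 + \frac{1}{k} \right) \|\theta\|^2 T \right\} - 1}\; Y_t^{-\frac{1}{k}} .
\end{align*}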

Clearly, all the examples above can be extended to time-dependent contin-

uous parameters.

2.2 Introduction to Constrained Optimization

Constrained portfolio optimization considers the problem of maximizing ex-

pected utility from terminal wealth and consumption by selecting an admissible

process from Aπ(S, W0) or A(S, W0) that satisfies certain constraints. Section

2.2.1 introduces the problem setting. Section 2.2.2 adapts the existence results

of Section 2.1.5 to the constrained case. This only leads to changes in notation.

Section 2.2.3 then presents first-order conditions along the lines of Section 1.6.

Finally, Section 2.2.4 gives some examples.

2.2.1 The Constrained Dynamic Problem

By and large, we can divide possible constraints into two categories. The first

category comprises constraints on the wealth process (e.g. wealth must not fall

below a certain threshold, wealth must be “close enough” to a benchmark),

whereas the second category consists of constraints on the portfolio holdings.

Such constraints are often formulated with respect to the portfolio-proportion

process. Typical examples are (amongst many others)

(i) prohibition on holding certain assets ($\pi^{(i)} \equiv 0$ for $i \in K \subset \{1, 2, \dots, N\}$, $\lambda \otimes Q$-almost surely);

(ii) short-selling constraints ($\pi^{(i)} \ge -\alpha_i$ for $i \in K \subset \{1, 2, \dots, N\}$ and some $\alpha_i \in \mathbb{R}^+_0$ almost surely);

(iii) borrowing constraints ($\sum_{i=1}^N \pi^{(i)}_t \le \alpha$ for some $\alpha \in \mathbb{R}^+$ and for all t ∈ I almost surely);

(iv) limitation on the number of stocks held, e.g. not more than 5% of all stocks outstanding ($\xi^{(i)} \le \alpha_i$ for $i \in K \subset \{1, 2, \dots, N\}$ and some $\alpha_i \in \mathbb{R}^+_0$ almost surely);

(v) constraint on the wealth invested in a certain asset, e.g. not more than 5% in asset i ($\pi^{(i)} \le \alpha_i$ for some $\alpha_i \in \mathbb{R}^+_0$ almost surely), or not more than a given amount in asset i ($\xi^{(i)} S^{(i)} \le \alpha_i$);

(vi) portfolio-insurance constraint ($W_t = W_0 + \int_0^t \xi_s \cdot dS_s \ge \alpha$ or $W_t \ge \frac{\alpha}{S^{(0)}(t)}$ for some $W_0 \ge \alpha > 0$) (see Section 3.5.5).

These constraints, which can obviously be combined and might depend on time t ∈ I or on the state of the world ω ∈ Ω, are all special cases of the following general constrained problems. Here, we formulate the constraints either with respect to portfolio-proportion processes or with respect to portfolio processes. Indeed, one can easily combine these two problems, as we will discuss later on (see Section 3.5.5). Recall that we implicitly assume S > 0 to avoid technical issues with portfolio-proportion processes, and that K ⊂ L(S) is called convex if β, γ ∈ K implies αβ + (1 − α)γ ∈ K for any one-dimensional predictable process α such that 0 ≤ α ≤ 1. See Appendix A.2 for the topological properties (especially the semimartingale metric $d_S$, (A.6) on p. 136) and Definition A.3.6 and the remarks thereafter for the type of convexity used. We denote by $\mathcal{A}^K_\pi(S, W_0) \triangleq \{(\pi, c) \in \mathcal{A}_\pi(S, W_0) : \pi \in K\}$ the set of admissible constrained portfolio strategies, and define $\mathcal{A}^K(S, W_0)$ similarly (see Definition 1.2.4).
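To make the constraint sets concrete, the sketch below encodes constraints of type (i), (ii), (iii) and (v) from the list above as a membership test for a single proportion vector, and forms convex combinations of feasible vectors. It is a deliberately simplified illustration: it works with constant vectors at one point in time rather than predictable processes, and the function names and parameter layout are hypothetical.

```python
def in_K(pi, banned=(), short_limits=None, borrowing_cap=None, caps=None):
    """Return True if the proportion vector pi satisfies all given constraints.

    banned        : indices i with pi[i] == 0 required           (type (i))
    short_limits  : {i: alpha_i}, requires pi[i] >= -alpha_i     (type (ii))
    borrowing_cap : alpha, requires sum(pi) <= alpha             (type (iii))
    caps          : {i: alpha_i}, requires pi[i] <= alpha_i      (type (v))
    """
    short_limits = short_limits or {}
    caps = caps or {}
    if any(pi[i] != 0.0 for i in banned):
        return False
    if any(pi[i] < -a for i, a in short_limits.items()):
        return False
    if borrowing_cap is not None and sum(pi) > borrowing_cap:
        return False
    if any(pi[i] > c for i, c in caps.items()):
        return False
    return True


def convex_combination(beta, gamma, a):
    """a * beta + (1 - a) * gamma for a constant weight 0 <= a <= 1."""
    return [a * b + (1.0 - a) * g for b, g in zip(beta, gamma)]


if __name__ == "__main__":
    spec = dict(banned=(2,), short_limits={0: 0.5}, borrowing_cap=1.5, caps={1: 0.6})
    beta, gamma = [0.4, 0.5, 0.0], [-0.3, 0.2, 0.0]
    print(in_K(beta, **spec), in_K(gamma, **spec))
    print(in_K(convex_combination(beta, gamma, 0.25), **spec))
    print(in_K([0.0, 0.0, 0.0], **spec))  # investing everything in the riskless asset
```

Each constraint is linear in $\pi$, so the resulting set is closed and convex, and it contains 0 whenever the bounds $\alpha_i$ and $\alpha$ are non-negative.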

2.2.1 Problem (Constrained Dynamic Problem). Solve

\[ u_K(W_0) = \sup_{(\pi, c) \in \mathcal{A}^K_\pi(S, W_0)} E_P\left[\int_0^T U(s, c(s))\,ds + B(W(T))\right], \tag{2.21} \]

where K is a closed, convex subset of the space $L^\pi(S)$, and 0 ∈ K.


The assumption 0 ∈ K implies that we can invest our total wealth in the

riskless asset. We refer to Section 3.5.1 for a discussion of 0 /∈ K.

We can also consider this problem with respect to portfolio processes.

2.2.2 Problem (Constrained Dynamic Problem). Solve

\[ u_K(W_0) = \sup_{(\xi, c) \in \mathcal{A}^K(S, W_0)} E_P\left[\int_0^T U(s, c(s))\,ds + B(W(T))\right], \]

where K is a closed, convex subset of the space L(S), and 0 ∈ K.

The first question coming to mind quite naturally is the one concerning

existence of optimal solutions to such portfolio optimization problems. This is

tackled in the next section.

2.2.2 Existence of an Optimal Solution

We have already proven existence of an optimal solution to these problems in Section 2.1.5 (see Remark 2.1.36). For the reader's convenience and later reference, we quickly adapt the notation to the constrained case, where necessary. The proofs remain by and large unchanged, requiring only the obvious changes in notation.

Before we start, recall again the two different usages of u, see Footnote 1 on p. 32. We will stick to the same convention for $u_K$. Similarly, we retain the Standing Assumption on p. 34 with the obvious changes in notation. For the reader's convenience, we again state the notion of upper semicontinuity: the function u must satisfy $u(c, X) \ge \limsup_{n\to\infty} u(c_n, X_n)$ for any sequence $(c_n, X_n)_{n\ge1}$ converging to (c, X) almost surely. The existence result then is:

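2.2.3 Theorem. Suppose that $u_K(c, X) \triangleq E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right]$ is upper semicontinuous in the sense of Corollary 1.4.7 on

\[ \mathcal{C}^\pi_K(W_0) \triangleq \left\{(c, X) : \sup_{Q \in \mathcal{M}(S_K)} E_Q\left[\frac{X}{\mathcal{E}(A^{S_K}(Q))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{S_K}(Q))_s}\,ds\right] \le W_0\right\}. \]

Then an optimal solution to Problem 2.2.1 exists. Similarly, if $u_K(c, X) \triangleq E_P\left[\int_0^T U(s, c(s))\,ds + B(X)\right]$ is upper semicontinuous in the sense of Corollary 1.4.7 on

\[ \mathcal{C}_K(W_0) \triangleq \left\{(c, X) : \sup_{Q \in \mathcal{M}^b(S_K)} E_Q\left[X + \int_0^T c(s)\,ds - A^{S_K}(Q)_T\right] \le W_0\right\}, \]

then an optimal solution to Problem 2.2.2 exists.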


Proof. Follows immediately from Corollary 1.4.7, Proposition 1.3.3 or Propo-

sition 1.3.4, and the Standing Assumption on p. 34.

The first sufficient condition for the existence of an optimal solution is

simply Corollary 2.1.27:

2.2.4 Corollary. With the notations and assumptions of Theorem 2.2.3, suppose that $\lim_{W_0\to\infty} \frac{u_K(W_0)}{W_0} = 0$; let U, B be utility functions in the sense of Definition 2.1.1, and suppose there exists a non-decreasing function U : IR → IR

such that U ≥ U,B ≥ U almost surely.

Then the optimal solution to Problem 2.2.1 or Problem 2.2.2 exists.

Proof. Corollary 2.1.27.

Another set of sufficient conditions follows from Proposition 2.1.32. For the proposition, we use the set $\mathcal{M}^b(S_K)(W_0)$ (see Proposition 1.3.4 on p. 10).

2.2.5 Proposition (Sufficient Conditions I). With the notations and assump-

tions of Theorem 2.2.3, if any of the following conditions is met, then the

optimal solution to Problem 2.2.1 or Problem 2.2.2 exists.

(i) U+, B+ are uniformly bounded, i.e. for all x ∈ IR+, t ∈ I we have

U+(t, x) < k and B+(x) < k for some k < ∞ almost surely.

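(ii) $U(t, x) \le k_1 + k_2 x^{\gamma}$ and $B(x) \le k_1 + k_2 x^{\gamma}$ for some $k_1, k_2 \ge 0$, $1 > \gamma > 0$, $x > 0$, and $\frac{dP}{dQ} \in L^p(\Omega, \mathcal{F}, P)$ for some $p > \frac{\gamma}{1-\gamma}$ and some $Q \in \mathcal{M}^b(S_K)(W_0)$.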


with B(x0) ∈ Lp(Ω,F , P), U(t, c0(t)) ∈ Lp(Ω,F , P), B(x) ≤ Z + k2xγ for

all x ≥ x0, and U(t, x) ≤ Z + k2xγ for all x ≥ c0(t).


(iii) There exist k > 0, $y_0 > 0$, and 1 > γ > 0 such that $(U')^{-1}(\cdot, y) \le k y^{-\gamma}$ and $(B')^{-1}(y) \le k y^{-\gamma}$ almost surely for all $y \le y_0$. Furthermore, for each x > 0, U(·, x), B(x) are uniformly bounded (i.e. there exists a $k_x < \infty$ such that $U(t, x) < k_x$, $B(x) < k_x$ for all t ∈ I almost surely).

(iv) There exists a function $K : \mathbb{R}^+ \to \mathbb{R}$ and a constant $x_0 > 0$ such that $K(x) \ge U'(\cdot, x)$, $K(x) \ge B'(x)$ almost surely, $\int_{x_0}^{\infty} K(x)\,dx < \infty$, and $U(\cdot, x_0)$, $B(x_0)$ are uniformly bounded (i.e. there exists a k < ∞ such that $U(t, x_0) < k$, $B(x_0) < k$ for all t ∈ I almost surely).

Proof. Proposition 2.1.32. Only (ii) needs some additional work. Just as in Corollary 2.1.29, we get the estimate

\[ E_Q\left[\frac{dP}{dQ} X^{\gamma p}\right] \le \left(E_Q[X]\right)^{\gamma p} \left(E_Q\left[\left(\frac{dP}{dQ}\right)^{\frac{1}{1-\gamma p}}\right]\right)^{1-\gamma p}. \]

Since $Q \in \mathcal{M}^b(S_K)(W_0)$, we have $A^{S_K}(Q)_T \le n$ for some n ≥ 0. We can therefore conclude that $E_Q[X] \le W_0 + n$ for the case of portfolio processes, and $E_Q[X] \le \exp(n) W_0$ in the case of portfolio-proportion processes. In both cases, we find $E_Q\left[\frac{dP}{dQ} X^{\gamma p}\right] < \infty$.

We have proven the second set of conditions in Proposition 2.1.33 only for

portfolio-proportion processes. The reason is that we have only proven Lemma

2.1.38 for portfolio-proportion processes. As indicated in Remark 2.1.37, a similar

result should hold for portfolio processes under certain circumstances, and then

the following proposition is also true for portfolio processes.

2.2.6 Proposition (Sufficient Conditions II). With the notations and assump-

tions of Theorem 2.2.3, suppose that the Inada conditions of Assumption 2.1.30

hold. Then any of the following conditions is sufficient for the optimal solution

to Problem 2.2.1 to exist.

(v) There is x0 ≥ 0, 0 < α < 1, and β > 1 such that B′(βx) ≤ αB′(x) and

U ′(·, βx) ≤ αU ′(·, x) for all x > x0 almost surely.


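(vi) $\lim_{x\to\infty} \inf_{t\in I} U(t, x) > 0$, $\lim_{x\to\infty} B(x) > 0$ almost surely, and there exists a constant γ < 1 such that

\[ \limsup_{x\to\infty}\left(\sup_{t\in I} \frac{x U'(t, x)}{U(t, x)}\right) < \gamma, \qquad \limsup_{x\to\infty} \frac{x B'(x)}{B(x)} < \gamma. \]

(vii) $v(y) \triangleq \inf_{Y \in \mathcal{Y}_K} E_P\left[\int_0^T \tilde{U}(s, y Y_s)\,ds + \tilde{B}(y Y_T)\right] < \infty$ for all y > 0; here $\tilde{B}(y) \triangleq \sup_{x>0}\{B(x) - xy\}$ is the convex dual, and $\tilde{U}$ is defined accordingly.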

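(viii) $U(t, x) \le k_1 + k_2 x^{\gamma}$ and $B(x) \le k_1 + k_2 x^{\gamma}$ for some $k_1, k_2 \ge 0$, $1 > \gamma > 0$, and

\[ \sup_{(\xi, c) \in \mathcal{A}(S, W_0)} E_P\left[\int_0^T k_2 (c(s))^{\gamma}\,ds + k_2 (W(T))^{\gamma}\right] < \infty. \]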

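(ix) $U(t, x) \le \bar{U}(t, x)$ and $B(x) \le \bar{B}(x)$ almost surely; $\bar{U}$, $\bar{B}$ are utility functions satisfying Assumption 2.1.30, any of the conditions (v), (vi) or (vii), and

\[ \sup_{(\xi, c) \in \mathcal{A}(S, W_0)} E_P\left[\int_0^T \bar{U}(s, c(s))\,ds + \bar{B}(W(T))\right] < \infty. \]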

Proof. Proposition 2.1.33.

We conclude this section with the not very surprising observation that Re-

mark 2.1.34, Remark 2.1.35, and Remark 2.1.25 are true in the constrained

case, too. Note also that the optimal solution, provided it exists, satisfies the

stochastic control result of Section 1.5.

2.2.3 First-Order Conditions

Let us now turn to first-order conditions. This section strives to make the

discussion of Section 1.6 precise and extend the result of Section 2.1.2. Since


we need slightly different approaches to prove the results for portfolio processes

and portfolio-proportion processes, we give two separate propositions.

Throughout the section, we assume that an optimal solution to Problem 2.2.1 or Problem 2.2.2 exists and write $c^*$ for the optimal consumption process, and $W^*_T$ for the optimal terminal wealth. Since the utility functions are non-decreasing and differentiable, the constraint for the static problem is binding (compare the proof of Corollary 2.1.14); i.e.

\[ \sup_{Q \in \mathcal{M}(S_K)} E_Q\left[\frac{W^*_T}{\mathcal{E}(A^{S_K}(Q))_T} + \int_0^T \frac{c^*(s)}{\mathcal{E}(A^{S_K}(Q))_s}\,ds\right] = W_0 \]

or

\[ \sup_{Q \in \mathcal{M}^b(S_K)} E_Q\left[W^*_T + \int_0^T c^*(s)\,ds - A^{S_K}(Q)_T\right] = W_0. \]

We also assume that for some 1 > γ > 0

\[ E_P\left[\int_0^T U'(s, \gamma c^*(s))\,c^*(s)\,ds + B'(\gamma W^*_T)\,W^*_T\right] < \infty. \tag{2.22} \]

Usually, this assumption can easily be checked. For example, it is true for CRRA utility by the Standing Assumption on p. 34. This assumption implies that $f(\gamma) \triangleq E_P\left[\int_0^T U(s, \gamma c^*(s))\,ds + B(\gamma W^*_T)\right]$ is differentiable for some 1 > γ > 0, and we can interchange differentiation and integration (Bauer, 1992, Lemma 16.2). The following two propositions are extensions of Cuoco (1997, Proposition 2).
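The CRRA claim can be made explicit with a short computation. The following is a sketch under the assumption that $B(x) = \frac{x^{1-k}}{1-k}$ and $U(t, x) = \exp(-d\,t)\,B(x)$ as in Example 2.1.42; it shows that the integrand in (2.22) is a constant multiple of the utility itself.

```latex
\begin{align*}
B'(\gamma W^*_T)\, W^*_T
  &= (\gamma W^*_T)^{-k}\, W^*_T
   = \gamma^{-k} (W^*_T)^{1-k}
   = \gamma^{-k} (1-k)\, B(W^*_T), \\
U'(s, \gamma c^*(s))\, c^*(s)
  &= \gamma^{-k} (1-k)\, U(s, c^*(s)), \\
E_P\!\left[\int_0^T U'(s, \gamma c^*(s))\, c^*(s)\, ds + B'(\gamma W^*_T)\, W^*_T\right]
  &= \gamma^{-k} (1-k)\, E_P\!\left[\int_0^T U(s, c^*(s))\, ds + B(W^*_T)\right].
\end{align*}
```

Under this assumption, (2.22) holds for every $\gamma \in (0, 1)$ as soon as the optimal expected utility is finite.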

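2.2.7 Proposition. Let $(\pi^*, c^*)$ be an optimal solution to Problem 2.2.1, and suppose that (2.22) holds for some 1 > γ > 0. Set

\[ \mathcal{Y}_K \triangleq \left\{\left(\frac{1}{\mathcal{E}(A^{S_K}(Q))_t}\, E_P\left[\frac{dQ}{dP}\,\Big|\,\mathcal{F}(t)\right]\right)_{t\in I} : Q \in \mathcal{M}(S_K)\right\}. \]

Then there exists a sequence $(Y^n)_{n\ge1}$ with $Y^n \in \mathcal{Y}_K$ and a $y \in \mathbb{R}^+_0$ with

\[ \lim_{n\to\infty} \left(U'(\cdot, c^*(\cdot)) - y Y^n_\cdot\right) c^*(\cdot) = 0 \tag{2.23} \]

in $L^1(\lambda \otimes P)$ and almost surely. Similarly

\[ \lim_{n\to\infty} \left(B'(W^*_T) - y Y^n_T\right) W^*_T = 0 \tag{2.24} \]

in $L^1(P)$ and almost surely.

The sequence $(Y^n)_{n\ge1}$ can be chosen such that

\[ \lim_{n\to\infty} E_P\left[W^*_T Y^n_T + \int_0^T c^*(s)\, Y^n_s\,ds\right] = W_0, \tag{2.25} \]

and such that a convex combination of it converges to a non-negative, càdlàg supermartingale $Y^*$ in the Fatou sense. We have

\[ \left(U'(\cdot, c^*(\cdot)) - y Y^*_\cdot\right) c^*(\cdot) = 0 \tag{2.26} \]

\[ \left(B'(W^*_T) - y Y^*_T\right) W^*_T = 0 \tag{2.27} \]

almost surely.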

Proof. Throughout the proof, we will work with the static equivalent to the dynamic problem; i.e. we will consider $(c^*, W^*_T)$ to be a contingent claim and maximize $u_K(c, X)$ on the set $\mathcal{C}^\pi_K(W_0)$ (compare Theorem 2.2.3). Since the utility function is nondecreasing, $(c^*, W^*_T)$ must be an optimal solution to the static problem by a standard argument similar to Theorem 1.4.3. Let δ > 0 be arbitrary. Define

\[ \mathcal{Y}^\delta_K \triangleq \left\{Y \in \mathcal{Y}_K : E_P\left[\int_0^T c^*(s)\, Y_s\,ds + W^*_T Y_T\right] > \frac{W_0}{1+\delta}\right\}. \]

This set is nonempty, because the constraint is binding. For the rest of the proof, we use the semi-normed space

\[ L^1(\lambda \otimes P \otimes P, \mathrm{Prog} \otimes \mathcal{F}) \triangleq \left\{(c, X) : c \text{ progressively measurable, } X \text{ measurable, } \|(c, X)\|_1 \triangleq E_P\left[\int_0^T |c(s)|\,ds + |X|\right] < \infty\right\}. \]

With these conventions, define the convex set $A \triangleq \{(\alpha Y c^*, \alpha Y_T W^*_T) : \alpha \in \mathbb{R}^+_0, Y \in \mathcal{Y}^\delta_K\}$. From

\[ \alpha E_P\left[\int_0^T c^*(s)\, Y_s\,ds + W^*_T Y_T\right] \le \alpha W_0 \]

we see that $A \subset L^1(\lambda \otimes P \otimes P)$. Using the closure cl(A) in $L^1(\lambda \otimes P \otimes P)$, it therefore makes sense to consider the closed convex set $B \triangleq \{(U'(\cdot, c^*(\cdot))\,c^*(\cdot) - Y_\cdot,\; B'(W^*_T)\,W^*_T - X) : (Y, X) \in \mathrm{cl}(A)\}$. From (2.22), $B \subset L^1(\lambda \otimes P \otimes P)$. We first want to show that (0, 0) ∈ B, and do so by contradiction.


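Suppose therefore that this is not the case. By a variant of the Hahn-Banach Theorem (Schechter, 1997, 28.4 (HB20)) we conclude that there exists $(\phi_1, \phi_2) \in L^\infty(\lambda \otimes P \otimes P)$ with

\[ E_P\left[\int_0^T U'(s, c^*(s))\,c^*(s)\,\phi_1(s)\,ds + B'(W^*_T)\,W^*_T\,\phi_2\right] - \alpha E_P\left[\int_0^T c^*(s)\,Y_s\,\phi_1(s)\,ds + W^*_T Y_T\,\phi_2\right] > 0 \]

for all α ≥ 0 and all $Y \in \mathcal{Y}^\delta_K$. Since α is arbitrary, this implies for all $Y \in \mathcal{Y}^\delta_K$

\[ E_P\left[\int_0^T U'(s, c^*(s))\,c^*(s)\,\phi_1(s)\,ds + B'(W^*_T)\,W^*_T\,\phi_2\right] > 0 \tag{2.28} \]

and

\[ 0 \ge E_P\left[\int_0^T c^*(s)\,Y_s\,\phi_1(s)\,ds + W^*_T Y_T\,\phi_2\right]. \tag{2.29} \]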

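We will use the last inequality to define a contingent claim $(c^\varepsilon, X^\varepsilon)$ for $\varepsilon \in (0, \delta \wedge (1-\gamma))$, with γ from (2.22). We will then find a contradiction to (2.28). To start with, set

\[ (c^\varepsilon, X^\varepsilon) = \left(c^* + \varepsilon c^* \frac{\phi_1}{\|(\phi_1, \phi_2)\|_\infty},\; W^*_T + \varepsilon W^*_T \frac{\phi_2}{\|(\phi_1, \phi_2)\|_\infty}\right). \]

We want to show that this is actually a contingent claim (i.e. non-negative) and attainable (see Remark 2.1.19). Using $\frac{\phi_1}{\|(\phi_1, \phi_2)\|_\infty} \ge -1$ and $\frac{\phi_2}{\|(\phi_1, \phi_2)\|_\infty} \ge -1$, we find $(c^\varepsilon, X^\varepsilon) \ge (1-\varepsilon)(c^*, W^*_T)$, proving non-negativity. As for attainability, we have to show that $E_P\left[\int_0^T c^\varepsilon(s)\,Y_s\,ds + X^\varepsilon Y_T\right] \le W_0$ holds for all $Y \in \mathcal{Y}_K$. For $Y \in \mathcal{Y}^\delta_K$ this follows from (2.29):

\[ E_P\left[\int_0^T c^\varepsilon(s)\,Y_s\,ds + X^\varepsilon Y_T\right] = E_P\left[\int_0^T c^*(s)\,Y_s\,ds + W^*_T Y_T\right] + \frac{\varepsilon}{\|(\phi_1, \phi_2)\|_\infty} E_P\left[\int_0^T c^*(s)\,Y_s\,\phi_1(s)\,ds + W^*_T Y_T\,\phi_2\right] \le W_0. \]

And for $Y \in \mathcal{Y}_K \setminus \mathcal{Y}^\delta_K$ we estimate with $1 \ge \frac{\phi_1}{\|(\phi_1, \phi_2)\|_\infty}$, $1 \ge \frac{\phi_2}{\|(\phi_1, \phi_2)\|_\infty}$, and therefore $(1+\delta)(c^*, W^*_T) \ge (c^\varepsilon, X^\varepsilon)$,

\[ E_P\left[\int_0^T c^\varepsilon(s)\,Y_s\,ds + X^\varepsilon Y_T\right] \le (1+\delta)\, E_P\left[\int_0^T c^*(s)\,Y_s\,ds + X^* Y_T\right] \le W_0. \]

Hence $(c^\varepsilon, X^\varepsilon)$ is an attainable contingent claim, i.e. in $\mathcal{C}^\pi_K(W_0)$.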

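Let us now turn to the contradiction of (2.28). Note that from $(c^\varepsilon, X^\varepsilon) \ge (1-\varepsilon)(c^*, W^*_T) \ge \gamma (c^*, W^*_T)$ and the properties of B we have

\[ \frac{|B(X^\varepsilon) - B(W^*_T)|}{\varepsilon} \le B'(\gamma W^*_T)\, \frac{|X^\varepsilon - W^*_T|}{\varepsilon} \le B'(\gamma W^*_T)\, W^*_T \]

and a similar estimate for $c^\varepsilon$, U. Then (2.22), dominated convergence and the optimality of $(c^*, W^*_T)$ imply the desired contradiction

\[ \frac{1}{\|(\phi_1, \phi_2)\|_\infty} E_P\left[\int_0^T U'(s, c^*(s))\,c^*(s)\,\phi_1(s)\,ds + B'(W^*_T)\,W^*_T\,\phi_2\right] = \lim_{\varepsilon\downarrow0} \frac{E_P\left[\int_0^T \left(U(s, c^\varepsilon(s)) - U(s, c^*(s))\right) ds + B(X^\varepsilon) - B(W^*_T)\right]}{\varepsilon} \le 0. \]

We therefore conclude that (0, 0) ∈ B.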

To sum up, we have shown that there exists a sequence $(y_n)_{n\ge1}$ in $\mathbb{R}^+_0$ and a sequence $(Y^n)_{n\ge1}$ in $\mathcal{Y}^\delta_K$ such that $(y_n Y^n c^*, y_n Y^n_T W^*_T)$ converges to $(U'(\cdot, c^*(\cdot))\,c^*(\cdot), B'(W^*_T)\,W^*_T)$ in $L^1(\lambda \otimes P \otimes P)$. What is more, the sequence $(y_n)_{n\ge1}$ must be bounded since $\|(Y^n c^*, Y^n_T W^*_T)\|_1 > \frac{W_0}{1+\delta}$ from the definition of $\mathcal{Y}^\delta_K$ and $\|(U'(\cdot, c^*(\cdot))\,c^*(\cdot), B'(W^*_T)\,W^*_T)\|_1 = \lim_{n\to\infty} y_n \|(Y^n_\cdot c^*(\cdot), Y^n_T W^*_T)\|_1$. Taking a subsequence if necessary, we can therefore assume $\lim_{n\to\infty} y_n = y$ for some $y \in \mathbb{R}^+_0$. The estimate

\[ \|(y Y^n c^*, y Y^n_T W^*_T) - (y_n Y^n c^*, y_n Y^n_T W^*_T)\|_1 = |y - y_n|\, \|(Y^n_\cdot c^*(\cdot), Y^n_T W^*_T)\|_1 \le |y - y_n|\, W_0 \]

shows that $(y Y^n c^*, y Y^n_T W^*_T)$ converges to $(U'(\cdot, c^*(\cdot))\,c^*(\cdot), B'(W^*_T)\,W^*_T)$ in $L^1(\lambda \otimes P \otimes P)$, too. Hence we have completed the proof of (2.23) and (2.24), where for the "almost sure" statements we take a subsequence if necessary.

Equation (2.25) is clear since δ was arbitrary. As for $Y^*$, by Lemma A.2.6 we can choose a sequence $(\hat{Y}^n)_{n\ge1}$ with $\hat{Y}^n \in \mathrm{conv}(Y^n, Y^{n+1}, \dots)$ converging to some non-negative càdlàg supermartingale $Y^*$. It is easy to see that (2.23) and (2.24) hold almost surely for the sequence $(\hat{Y}^n)_{n\ge1}$ if they do so for the sequence $(Y^n)_{n\ge1}$. By Lemma A.2.5 this yields almost surely

\[ 0 = \limsup_{n\to\infty} \left(U'(t, c^*(t)) - y \hat{Y}^n_t\right) c^*(t) = U'(t, c^*(t))\,c^*(t) - y\,c^*(t) \liminf_{n\to\infty} \hat{Y}^n_t = U'(t, c^*(t))\,c^*(t) - y\,c^*(t)\,Y^*_t \]

for all but countably many t; and this is (2.26). The same reasoning and $\liminf_{n\to\infty} \hat{Y}^n_T = Y^*_T$ almost surely by Lemma A.2.5 gives (2.27).

The case of portfolio processes is similar. However, as detailed in Appendix A.3.2, properly enlarging the set for portfolio processes is an open issue. This leads to a somewhat weaker result. To be more precise, the fact that $\{A^{S_K}(Q) : Q \in \mathcal{M}^b(S_K)\}$ is not necessarily bounded leads to the result that the sequence $(y_n)_{n\ge1}$ in (2.30) and (2.31) might have a divergent subsequence. The first part of the equivalent formulation to Proposition 2.2.7 therefore reads as follows:

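2.2.8 Proposition. Let $(\xi^*, c^*)$ be an optimal solution to Problem 2.2.2, and suppose that (2.22) holds for some 1 > γ > 0.

Then there exists a sequence $(Q_n)_{n\ge1}$ with $Q_n \in \mathcal{M}^b(S_K)$ and a sequence $(y_n)_{n\ge1}$ in $\mathbb{R}^+_0$ such that

\[ \lim_{n\to\infty} \left(U'(\cdot, c^*(\cdot)) - y_n E_P\left[\frac{dQ_n}{dP}\,\Big|\,\mathcal{F}(\cdot)\right]\right) c^*(\cdot) = 0 \tag{2.30} \]

in $L^1(\lambda \otimes P)$ and almost surely. Similarly

\[ \lim_{n\to\infty} \left(B'(W^*_T) - y_n \frac{dQ_n}{dP}\right) W^*_T = 0 \tag{2.31} \]

in $L^1(P)$ and almost surely.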

Proof. Much of the proof is similar to the case of portfolio-proportions. We are therefore brief and refer to the portfolio-proportion case for more details. In the following we will abuse notation for a càdlàg martingale Y and write $Y \in \mathcal{M}^b(S_K)$ if there exists a $Q \in \mathcal{M}^b(S_K)$ with $Y_t = E_P\left[\frac{dQ}{dP}\,\big|\,\mathcal{F}(t)\right]$ for all t ∈ I almost surely.


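Define the convex set $A \triangleq \{(\alpha c^* Y, \alpha W^*_T Y_T) : \alpha \in \mathbb{R}^+_0, Y \in \mathcal{M}^b(S_K)\}$. From

\[ E_P\left[\int_0^T c^*(s)\,Y_s\,ds + W^*_T Y_T - m(Q)\right] \le E_Q\left[\int_0^T c^*(s)\,ds + W^*_T - A^{S_K}(Q)_T\right] \le W_0 \]

we see that $A \subset L^1(\lambda \otimes P \otimes P)$. Here m(Q) is a constant with $A^{S_K}(Q)_T \le m(Q)$ almost surely (such constants exist by the definition of $\mathcal{M}^b(S_K)$). Consider the closed convex set $B \triangleq \{(U'(\cdot, c^*(\cdot))\,c^*(\cdot) - Y_\cdot,\; B'(W^*_T)\,W^*_T - X) : (Y, X) \in \mathrm{cl}(A)\}$. We want to show that (0, 0) ∈ B by contradiction.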

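Using the same reasoning as before, we find $(\phi_1, \phi_2) \in L^\infty(\lambda \otimes P \otimes P)$ with

\[ E_P\left[\int_0^T U'(s, c^*(s))\,c^*(s)\,\phi_1(s)\,ds + B'(W^*_T)\,W^*_T\,\phi_2\right] > 0 \tag{2.32} \]

and for all $Q \in \mathcal{M}^b(S_K)$

\[ 0 \ge E_Q\left[\int_0^T c^*(s)\,\phi_1(s)\,ds + W^*_T\,\phi_2\right]. \tag{2.33} \]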

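Define a contingent claim $(c^\varepsilon, X^\varepsilon)$ for $\varepsilon \in (0, 1-\gamma)$, with γ from (2.22), by

\[ (c^\varepsilon, X^\varepsilon) = \left(c^* + \varepsilon c^* \frac{\phi_1}{\|(\phi_1, \phi_2)\|_\infty},\; W^*_T + \varepsilon W^*_T \frac{\phi_2}{\|(\phi_1, \phi_2)\|_\infty}\right). \]

We find $(c^\varepsilon, X^\varepsilon) \ge \gamma (c^*, W^*_T)$, proving non-negativity. And $E_Q\left[\int_0^T c^\varepsilon(s)\,ds + X^\varepsilon - A^{S_K}(Q)_T\right] \le W_0$ holds for all $Q \in \mathcal{M}^b(S_K)$ by (2.33):

\[ E_Q\left[\int_0^T c^\varepsilon(s)\,ds + X^\varepsilon - A^{S_K}(Q)_T\right] = E_Q\left[\int_0^T c^*(s)\,ds + X^* - A^{S_K}(Q)_T\right] + \frac{\varepsilon}{\|(\phi_1, \phi_2)\|_\infty} E_Q\left[\int_0^T c^*(s)\,\phi_1(s)\,ds + W^*_T\,\phi_2\right] \le W_0. \]

The rest of the contradiction to (2.32) is completely parallel to the portfolio-proportion case; see there for details. We conclude that (0, 0) ∈ B, and this is (2.30) and (2.31).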


Recall that in the case of portfolio-proportion processes, to sharpen the result we needed to prove the convergence of a subsequence of $(y_n)_{n\ge1}$. In the case of portfolio processes, however, we can only prove that the sequence $(y_n)_{n\ge1}$ has a convergent subsequence if we use additional assumptions. Here is a frequently employed one (Cuoco, 1997; Mnif and Pham, 2001).

2.2.9 Lemma. With the notation of Proposition 2.2.8, if for some constant k ≥ 0 we have $A^{S_K}(Q) \le k$ for all $Q \in \mathcal{M}^b(S_K)$, the sequence $(y_n)_{n\ge1}$ can be chosen to be $y_n = y$ for some y ≥ 0.

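Proof. For arbitrary δ > 0 such that $\frac{W_0}{1+\delta} - \delta k > 0$, define

\[ \mathcal{M}^b_\delta(S_K) \triangleq \left\{Q \in \mathcal{M}^b(S_K) : E_Q\left[W^*_T + \int_0^T c^*(s)\,ds - A^{S_K}(Q)_T\right] > \frac{W_0}{1+\delta} - \delta k\right\}. \]

Define the convex set $A \triangleq \{(\alpha c^* Y, \alpha W^*_T Y_T) : \alpha \in \mathbb{R}^+_0, Y \in \mathcal{M}^b_\delta(S_K)\}$. As before we see that $A \subset L^1(\lambda \otimes P \otimes P)$. We want to show by contradiction that $(0, 0) \in B \triangleq \{(U'(\cdot, c^*(\cdot))\,c^*(\cdot) - Y_\cdot,\; B'(W^*_T)\,W^*_T - X) : (Y, X) \in A\}$.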

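To this end, take some $(\phi_1, \phi_2) \in L^\infty(\lambda \otimes P \otimes P)$ for which

\[ E_P\left[\int_0^T U'(s, c^*(s))\,c^*(s)\,\phi_1(s)\,ds + B'(W^*_T)\,W^*_T\,\phi_2\right] > 0 \tag{2.34} \]

and for all $Q \in \mathcal{M}^b_\delta(S_K)$

\[ 0 \ge E_Q\left[\int_0^T c^*(s)\,\phi_1(s)\,ds + W^*_T\,\phi_2\right]. \tag{2.35} \]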

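Define a contingent claim $(c^\varepsilon, X^\varepsilon)$ for $\varepsilon \in (0, \delta \wedge (1-\gamma))$ by

\[ (c^\varepsilon, X^\varepsilon) = \left(c^* + \varepsilon c^* \frac{\phi_1}{\|(\phi_1, \phi_2)\|_\infty},\; W^*_T + \varepsilon W^*_T \frac{\phi_2}{\|(\phi_1, \phi_2)\|_\infty}\right). \]

Non-negativity of $(c^\varepsilon, X^\varepsilon)$ follows as above. As for attainability, for $Q \in \mathcal{M}^b_\delta(S_K)$ we find from (2.35):

\[ E_Q\left[\int_0^T c^\varepsilon(s)\,ds + X^\varepsilon - A^{S_K}(Q)_T\right] = E_Q\left[\int_0^T c^*(s)\,ds + X^* - A^{S_K}(Q)_T\right] + \frac{\varepsilon}{\|(\phi_1, \phi_2)\|_\infty} E_Q\left[\int_0^T c^*(s)\,\phi_1(s)\,ds + W^*_T\,\phi_2\right] \le W_0. \]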

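And for $Q \in \mathcal{M}^b(S_K) \setminus \mathcal{M}^b_\delta(S_K)$ we get the estimate

\begin{align*}
E_Q\left[\int_0^T c^\varepsilon(s)\,ds + X^\varepsilon - A^{S_K}(Q)_T\right]
&\le E_Q\left[\int_0^T (1+\delta)\,c^*(s)\,ds + (1+\delta)\,X^* - A^{S_K}(Q)_T\right] \\
&= E_Q\left[\int_0^T c^*(s)\,ds + X^* - A^{S_K}(Q)_T\right] + \delta E_Q\left[\int_0^T c^*(s)\,ds + X^*\right] \\
&\le \frac{W_0}{1+\delta} + \delta E_Q\left[\int_0^T c^*(s)\,ds + X^* - k\right] \\
&\le \frac{W_0}{1+\delta} + \delta E_Q\left[\int_0^T c^*(s)\,ds + X^* - A^{S_K}(Q)_T\right] \le W_0.
\end{align*}

The rest of the proof (the contradiction to (2.34), proving that we can choose $y_n = y$, and so on) is the same as in the portfolio-proportion process case.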

It is clear that we get the full power of Proposition 2.2.7 if there actually ex-

ists a convergent subsequence (yn)n≥1. This is the topic of the last proposition

of this section.


2.2.10 Proposition. If the sequence $(y_n)_{n\ge1}$ of Proposition 2.2.8 has a convergent subsequence, the sequence $(Q_n)_{n\ge1}$ can be chosen such that

\[ \lim_{n\to\infty} E_{Q_n}\left[W^*_T + \int_0^T c^*(s)\,ds - A^{S_K}(Q_n)_T\right] = W_0, \tag{2.36} \]

and such that the martingale $\left(E_P\left[\frac{dQ_n}{dP}\,\big|\,\mathcal{F}(t)\right]\right)_{t\in I}$ converges to a non-negative, càdlàg supermartingale $Y^*$ in the Fatou sense. We have

\[ \left(U'(\cdot, c^*(\cdot)) - y Y^*_\cdot\right) c^*(\cdot) = 0 \tag{2.37} \]

\[ \left(B'(W^*_T) - y Y^*_T\right) W^*_T = 0 \tag{2.38} \]

almost surely.

Proof. Proving (2.36), (2.37), (2.38), and the existence of Y ∗ is just as in the

proof for portfolio-proportion processes. The only difference is that the convex

combination defines an element in Mb(SK).

2.2.11 Remark. If $c^* > 0$ and $W^*_T > 0$, we find $c^*(\cdot) = (U')^{-1}(\cdot, y Y^*_\cdot)$ and $W^*_T = (B')^{-1}(y Y^*_T)$ almost surely, thereby generalizing Corollary 2.1.14.

2.2.4 Examples (Constrained Brownian Market)

This section tries to apply the previous discussion to the Brownian model of

Section 2.1.6. The case of portfolio-proportions has been discussed in textbooks

already (Karatzas and Shreve, 1998, Chapter 6), and gets most of the attention

in the examples below and in Chapter 3. Therefore we first concentrate on

portfolio processes. Consider the model of Section 2.1.6 and assume that ξ(t) ∈ K for all t ∈ I almost surely; here $K \subset \mathbb{R}^N$ is closed and convex with 0 ∈ K. For any $(\xi, c) \in \mathcal{A}^K(S, W_0)$, the wealth process then reads

\[ W_t = W_0 + \int_0^t \xi'(s)\,(\mu(s) - r(s)\mathbf{1})\,ds + \int_0^t \xi'(s)\,\sigma(s)\,dZ(s). \]


Let Q be an arbitrary probability measure equivalent to P, and $Z^Q$ be a Q-Brownian motion. By the Martingale Representation Theorem

\[ \frac{dQ}{dP} = \mathcal{E}\left(-\int_0^T \left(\theta(s) + \sigma(s)^{-1}\nu(s)\right)' dZ(s)\right) \]

for some process ν(·) such that $\int_0^T \|\sigma(s)^{-1}\nu(s)\|^2\,ds < \infty$ (here as always $\theta(\cdot) = \sigma(\cdot)^{-1}(\mu(\cdot) - r(\cdot)\mathbf{1})$). Girsanov's Theorem implies the Doob-Meyer decomposition

\[ W_t = W_0 + \int_0^t \xi'(s)\,\sigma(s)\,dZ^Q(s) - \int_0^t \xi'(s)\,\nu(s)\,ds. \]

Up to now, Q was arbitrary. But from the discussion in Appendix A.3, $Q \in \mathcal{M}^b(S_K)$ if and only if there exists some process $\varpi < \infty$ such that $\varpi(\cdot) \ge -\xi'(\cdot)\nu(\cdot)$ for all ξ with ξ(t) ∈ K for all t ∈ I. Since 0 ∈ K, $\varpi \ge 0$, and

\[ A^{S_K}(Q)_t = \int_0^t \sup_{\xi(s) \in K} \left(-\xi'(s)\,\nu(s)\right) ds. \]
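The integrand $\sup_{\xi \in K}(-\xi'\nu)$ is the support function of K evaluated at $-\nu$, and for simple constraint sets it is available in closed form. The sketch below evaluates it for a box and for the non-negative orthant, and approximates $A^{S_K}(Q)_t$ by a Riemann sum. The choice of constraint sets, the discretization and all names are illustrative assumptions.

```python
import math


def support_box(nu, lower, upper):
    """sup of -xi' nu over the box prod_i [lower_i, upper_i].

    The box should satisfy lower_i <= 0 <= upper_i so that 0 lies in K.
    The supremum is attained coordinate-wise at one of the two endpoints.
    """
    return sum(max(-lo * n, -hi * n) for n, lo, hi in zip(nu, lower, upper))


def support_nonneg_orthant(nu):
    """sup of -xi' nu over xi >= 0: zero if nu >= 0 componentwise, else +infinity."""
    return 0.0 if all(n >= 0.0 for n in nu) else math.inf


def penalty_A(nu_path, dt, support):
    """Left Riemann sum for int_0^t sup_{xi in K} (-xi' nu(s)) ds."""
    return sum(support(nu) * dt for nu in nu_path)


if __name__ == "__main__":
    nu = (0.1, -0.2)
    box = lambda v: support_box(v, (-1.0, -1.0), (2.0, 2.0))
    print(box(nu))
    print(support_nonneg_orthant(nu))
    print(penalty_A([nu] * 10, 0.1, box))
```

For a bounded set such as the box, the support function is finite for every ν. For a cone such as the orthant, it is finite only when ν lies in a restricted set, so only those ν yield a finite bound $\varpi$.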

The case of portfolio-proportion processes is largely similar. We refer to

Karatzas and Shreve (1998, Chapter 6) for a proper discussion. What differs

however is that actually finding optimal portfolios is considerably simpler in

many cases of practical relevance. This is the topic of the following examples.

2.2.12 Example (Continued from Example 2.1.41). Use the setting of Example 2.1.41 and suppose that an investor wants to maximize utility subject to the additional constraint that the portfolio-proportion process π(t) is in a non-empty closed convex set $K_N \subset \mathbb{R}^N$ for almost all t ∈ [0, T]. One can show that for this setting a portfolio-proportion process can only be optimal if it is constant (e.g. Karatzas and Shreve, 1998, Chapter 6.6; or Muller, 2000; see also Example 2.1.41). Hence we can set

\[ K_T = \left\{X : X = W_0 \exp\left(\pi'(\mu - r\mathbf{1})\,T - \tfrac{1}{2}\pi'\sigma\sigma'\pi\,T + \pi'\sigma Z(T)\right),\; \pi \in K_N\right\} \]

and simply add the constraint $X \in K_T$ to Problem 2.1.8. Reformulating, we thus have the following very simple optimization problem (Muller, 2000):

\begin{align*}
\sup_{X \in K_T} E_P\left[\frac{X^{1-k}}{1-k}\right]
&= \max_{\pi \in K_N} E_P\left[\frac{W_0^{1-k}}{1-k} \exp\left\{(1-k)\left(\pi'(\mu - r\mathbf{1})\,T - \tfrac{1}{2}\pi'\sigma\sigma'\pi\,T + \pi'\sigma Z(T)\right)\right\}\right] \\
&= \max_{\pi \in K_N} \frac{W_0^{1-k}}{1-k} \exp\left\{(1-k)\left(\pi'(\mu - r\mathbf{1})\,T - \tfrac{1}{2}\pi'\sigma\sigma'\pi\,T\right) + \tfrac{1}{2}(1-k)^2\,\pi'\sigma\sigma'\pi\,T\right\},
\end{align*}

i.e., the optimal portfolio-proportion process can be found by solving the quadratic program $\max_{\pi \in K_N} \left(\frac{2}{k}\pi'[\mu - r\mathbf{1}] - \pi'\sigma\sigma'\pi\right)$.
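When $K_N$ is a box, the quadratic program can be solved numerically with a few lines of code. The sketch below uses projected gradient ascent, a generic technique chosen here for brevity and not a method taken from the text; the step size rule, the iteration count and all names are assumptions.

```python
def solve_box_qp(excess, cov, k, lower, upper, iters=5000):
    """Maximize (2/k) pi' excess - pi' cov pi over the box [lower, upper].

    excess : vector mu - r 1
    cov    : matrix sigma sigma' (symmetric positive definite)
    Uses projected gradient ascent with a fixed step 1 / (2 trace(cov)),
    which is safe because the largest eigenvalue is at most the trace.
    """
    n = len(excess)
    step = 1.0 / (2.0 * sum(cov[i][i] for i in range(n)))
    # Start from the projection of 0 onto the box
    pi = [min(max(0.0, lower[i]), upper[i]) for i in range(n)]
    for _ in range(iters):
        grad = [2.0 / k * excess[i]
                - 2.0 * sum(cov[i][j] * pi[j] for j in range(n))
                for i in range(n)]
        # Gradient step followed by coordinate-wise projection onto the box
        pi = [min(max(pi[i] + step * grad[i], lower[i]), upper[i])
              for i in range(n)]
    return pi


if __name__ == "__main__":
    # No short-selling with a diagonal covariance matrix
    print(solve_box_qp([0.06, -0.03], [[0.04, 0.0], [0.0, 0.09]], 2.0,
                       [0.0, 0.0], [10.0, 10.0]))
    # Wide box: the constraint is inactive and the unconstrained solution results
    print(solve_box_qp([0.06, 0.03], [[0.04, 0.01], [0.01, 0.09]], 2.0,
                       [-10.0, -10.0], [10.0, 10.0]))
```

When the box is wide enough to be inactive, the result should agree with the unconstrained solution $\frac{1}{k}(\sigma\sigma')^{-1}[\mu - r\mathbf{1}]$ of Example 2.1.41. For general closed convex $K_N$ the projection step would have to be replaced accordingly.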

2.2.13 Remark. We can extend this result: let $0 = T_0 < T_1 < \dots < T_n = T$, $T_i \in \mathbb{R}^+$ for i = 1, ..., n (or even stopping times converging to T), and let $K_i^{[T_i, T_{i+1})} \subset \mathbb{R}^N$ be convex and closed, i = 0, ..., n − 1. Suppose the constraints are such that $\pi(t) \in K_i^{[T_i, T_{i+1})}$ for all $t \in [T_i, T_{i+1})$ almost surely. Then we can find the optimal portfolio-proportion process by solving the n problems $\max_{\pi \in K_i^{[T_i, T_{i+1})}} \left(\frac{2}{k}\pi'[\mu - r\mathbf{1}] - \pi'\sigma\sigma'\pi\right)$, i = 0, ..., n − 1. What is more, if µ(t), r(t) and σ(t) depend on the time t, but are constant on the intervals $[T_i, T_{i+1})$, the reasoning still works as before. Among other things, we can therefore reduce the initial optimization problem for the special utility function considered here to $\max_{\pi(t) \in K(t)} \left(\frac{2}{k}\pi'(t)[\mu(t) - r(t)\mathbf{1}] - \pi'(t)\sigma(t)\sigma'(t)\pi(t)\right)$, as long as K(t), µ(t), r(t) and σ(t) are deterministic functions.

2.2.14 Example (Continued from Example 2.2.12). Let us consider the special case where we are only allowed to hold the first M assets, M < N; i.e. $K_N \triangleq \{\pi \in \mathbb{R}^N : \pi^{(i)} = 0,\; i = M+1, \dots, N\}$ (see Karatzas et al., 1991). Suppose further that the volatility matrix σ has the following special structure

\[ \sigma = \begin{pmatrix} \sigma_M & 0 \\ 0 & \sigma_{N-M} \end{pmatrix}, \]

where $\sigma_M$ is an invertible square matrix of dimension M (this is the case if σ is a diagonal matrix); write similarly

\[ \mu = \begin{pmatrix} \mu_M \\ \mu_{N-M} \end{pmatrix} \quad\text{and}\quad Z = \begin{pmatrix} Z_M \\ Z_{N-M} \end{pmatrix}. \]

Thus $K_T$ of Example 2.2.12 can be written as

\[ K_T = \left\{X : X = W_0 \exp\left(\pi_M'(\mu_M - r\mathbf{1})\,T - \tfrac{1}{2}\pi_M'\sigma_M\sigma_M'\pi_M\,T + \pi_M'\sigma_M Z_M(T)\right)\right\}. \]

But then it is an immediate consequence of Example 2.1.41 that the optimal portfolio-proportion process is given by $\pi^*_M = \frac{1}{k}(\sigma_M\sigma_M')^{-1}[\mu_M - r\mathbf{1}]$.

2.2.15 Example (Continued from Example 2.2.12). Suppose that σ is a diagonal matrix and that $\pi^{(i)} \ge 0$ (no short-selling). A simple argument as in the previous example shows that the optimal portfolio-proportion process is given by

\[ \pi = \begin{pmatrix} \dfrac{1}{k}\dfrac{(\mu^{(1)} - r)^+}{\sigma_{1,1}^2} \\ \vdots \\ \dfrac{1}{k}\dfrac{(\mu^{(N)} - r)^+}{\sigma_{N,N}^2} \end{pmatrix}, \]

where $\sigma_{i,i}$ is the ith diagonal element of σ.
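The closed form is easy to evaluate and to check against the quadratic program of Example 2.2.12. The following sketch computes the weights and compares the objective value with a coarse grid of other feasible portfolios; the parameter values and function names are illustrative assumptions.

```python
def no_short_sale_weights(mu, r, sigma_diag, k):
    """pi_i = (1/k) (mu_i - r)^+ / sigma_ii^2 for a diagonal volatility matrix."""
    return [max(m - r, 0.0) / (k * s * s) for m, s in zip(mu, sigma_diag)]


def objective(pi, mu, r, sigma_diag, k):
    """(2/k) pi'(mu - r 1) - pi' sigma sigma' pi for diagonal sigma."""
    return sum(2.0 / k * p * (m - r) - (s * p) ** 2
               for p, m, s in zip(pi, mu, sigma_diag))


if __name__ == "__main__":
    mu, r, sigma_diag, k = [0.08, 0.01], 0.02, [0.2, 0.3], 2.0
    best = no_short_sale_weights(mu, r, sigma_diag, k)
    print(best, objective(best, mu, r, sigma_diag, k))
    # Coarse grid of feasible (non-negative) portfolios for comparison
    grid = [i * 0.05 for i in range(41)]
    print(max(objective([a, b], mu, r, sigma_diag, k) for a in grid for b in grid))
```

Because σ is diagonal, the objective separates across assets, which is why the positive part can be taken coordinate by coordinate. An asset with a negative excess return is simply not held.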

Chapter 3

A Duality Approach for

Time-Additive Utility


3.1 Introduction

This chapter presents a very elegant duality approach (Shreve and Xu, 1992a,b;

He and Pearson, 1991b; Karatzas, Lehoczky, Sethi, and Shreve, 1986; Karatzas

et al., 1987, 1991; Cvitanic and Karatzas, 1992; Cvitanic et al., 2001; Kramkov

and Schachermayer, 1999; Karatzas and Zitkovic, 2003). We refer to Karatzas

and Shreve (1998), Chapters 3, 5, 6, for an exposition of the Brownian motion

case. Throughout this chapter, we continue to use the notation of Chapter 2.

The duality approach does not lead to additional insights concerning existence and the basic structure of optimal solutions. But it has several advantages. The first advantage lies in the introduction of a second (dual) problem that is unconstrained and therefore often simpler and more elegant to solve, and that directly leads to the solution of the initial problem. The second

advantage stems from some additional insights in the structure of the opti-

mal solution, namely certain properties of the Lagrange multiplier. For these

reasons, the duality approach is one of the most commonly used in portfolio

optimization problems.

Because the duality approach does not lead to major additional insights, and

because Mnif and Pham (2001) have already tackled the problem of portfolio

processes, we stick to the portfolio-proportion problem. If we wanted to add

portfolio processes, we would need some additional assumptions along the lines

of Remark 2.1.37.

For propædeutical reasons most of the time we will consider the case of opti-

mal terminal wealth only. Indeed, the main idea is not different for the general

case, but the technical details are more involved than in the terminal wealth

case. Therefore we start with two sections on the terminal wealth problem.

The first section presents the special case where $\mathcal{M}(S) = \{Q^m\}$, mainly to introduce duality theory to our setting. The next section will add constraints to

the portfolio process and terminal wealth. Finally, we will consider the initial

problem in full generality. For the first two sections, we need an additional

assumption.


3.1.1 Assumption. Let B be a utility function with

(i) dom(B) = [0,∞).

(ii) limx↓0 B′(x) = ∞ (i.e. B satisfies the Inada condition).

(iii) B is not state-dependent.

3.1.2 Remark. The assumption that B is not state-dependent is needed to make

use of Kramkov and Schachermayer (1999), Theorem 3.2 directly without much

ado. Their result carries however over to the state-dependent case with only

minor changes. Indeed, we prove this for the case with consumption in Section

3.4.

We recall the definition of the convex dual (e.g. Karatzas and Shreve, 1998, Definition 3.4.2 and Lemma 3.4.3).

3.1.3 Definition (Convex Dual). The convex dual of B(·) is given by $\tilde{B}(y) = \sup_{x>0}\{B(x) - xy\}$, y > 0.

As a rule, if U is a utility function, then $\tilde{U}$ will denote its convex dual for the remainder of this section.
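For the CRRA utility $B(x) = \frac{x^{1-k}}{1-k}$ the convex dual is available in closed form, $\tilde{B}(y) = \frac{k}{1-k}\, y^{-(1-k)/k}$, with the supremum attained at $x = y^{-1/k}$. The sketch below compares this formula with a direct numerical maximization by ternary search; the choice of utility, the search bracket and the function names are assumptions made for illustration.

```python
def crra(x, k):
    """CRRA utility B(x) = x^(1-k) / (1-k), k > 0, k != 1."""
    return x ** (1.0 - k) / (1.0 - k)


def convex_dual_closed_form(y, k):
    """Closed form (k / (1-k)) * y^(-(1-k)/k) of sup_{x>0} { B(x) - x y }."""
    return k / (1.0 - k) * y ** (-(1.0 - k) / k)


def convex_dual_numeric(y, k, iters=200):
    """Maximize the concave map x -> B(x) - x*y by ternary search."""
    # Bracket chosen to contain the maximizer y^(-1/k)
    lo, hi = 0.0, 10.0 * y ** (-1.0 / k) + 1.0

    def f(x):
        return crra(x, k) - x * y

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))


if __name__ == "__main__":
    print(convex_dual_closed_form(2.0, 0.5), convex_dual_numeric(2.0, 0.5))
    print(convex_dual_closed_form(2.0, 2.0), convex_dual_numeric(2.0, 2.0))
```

Ternary search is adequate here because $x \mapsto B(x) - xy$ is strictly concave on $(0, \infty)$; it is used only as an independent check of the formula.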

3.2 The Unconstrained Problem without Consumption for a Unique Measure

For this section we will assume that $\mathcal{M}(S) = \{Q^m\}$. Note that this is the case of a complete market as in Section 2.1.1, and recall that it suffices to consider either the problem of portfolio processes, or the problem of portfolio-proportion processes, since both are equivalent without constraints (see Remark 2.1.7). As already said, we will tackle the portfolio-proportion case. Given our additional assumption, Problem 2.1.5 can be rephrased:

\[ u(W_0) = \sup_{\pi \in L^{a,\pi}(S)} E_P[B(W_T)]. \tag{3.1} \]


As in Section 1.4 and Section 2.1, the dynamic problem translates to a static one, namely

\[ u_s(W_0) = \sup_{X \in L^0_+(Q^m)} E_P[B(X)] \quad \text{s.t.} \quad E_{Q^m}[X] \le W_0, \]

since we can show (compare Appendix A.3, also Proposition 1.3.3, and Remark 2.1.21) that a wealth process is admissible if and only if $E_{Q^m}[W_T] \le W_0$.

This is a problem of finding a contingent claim that maximizes an investor's utility subject to a budget constraint. The Lagrangian belonging to this static problem is:

\[ L(X, y) \triangleq E_P[B(X)] - y\left(E_{Q^m}[X] - W_0\right) = E_P\left[B(X) - y\,\frac{dQ^m}{dP}\,X\right] + y W_0. \]

As usual, the saddle point property of Lagrangian theory tells us that we should consider the problem

\[ \inf_{y>0}\left(\sup_{X \in L^0_+(Q^m)} L(X, y)\right), \]

which hopefully leads us to the optimal solution for our initial problem:

\[ u(W_0) = \inf_{y>0}\left(\sup_{X \in L^0_+(Q^m)} E_P\left[B(X) - y\,\frac{dQ^m}{dP}\,X\right] + y W_0\right). \]

Arguing heuristically, we see from the last equation that maximizing the Lagrangian can be performed for every ω ∈ Ω separately. On noting that

\[ \tilde{B}\left(y\,\frac{dQ^m}{dP}(\omega)\right) = \sup_{x>0}\left\{B(x) - y\,\frac{dQ^m}{dP}(\omega)\,x\right\}, \]

we are quite naturally led to defining

\[ v(y) \triangleq E_P\left[\tilde{B}\left(y\,\frac{dQ^m}{dP}\right)\right] \]

and conjecturing that $u(x) = \inf_{y>0}[v(y) + xy]$ and $v(y) = \sup_{x>0}[u(x) - xy]$, i.e. the functions are conjugate. The latter can be proven using the Minimax theorem (see Millar, 1983, p. 92, and the proof of Lemma 2.1.26).


It is relatively easy to show that v(·) has certain desirable properties using

well-known results from convex analysis (e.g. Rockafellar, 1970). Since u(·) and

v(·) are conjugate, these properties immediately carry over to u(·), and then

ensure existence and uniqueness of the optimal solution. This is made precise

in the following theorem.

3.2.1 Theorem. Set v(y) 4= EP

[B(y dQm

dP

)]and suppose that Assumption

3.1.1 holds. Then

(i) u(x) < ∞∀ x > 0, and v(y) < ∞ for y > 0 sufficiently large. Letting

y04= infy > 0 : v(y) < ∞, the function v(·) is continuously differen-

tiable and strictly convex on (y0,∞). Defining x04= limy↓y0 −v′(y) the

function u(·) is continuously differentiable on (0,∞) and strictly concave

on (0, x0). The value functions u(·) and v(·) are conjugate: for y > 0

v(y) = supx>0

[u(x)− xy],

and for x > 0

u(x) = infy>0

[v(y) + xy].

(ii) limx↓0 u′(x) = ∞ and limy→∞ v′(y) = 0.

(iii) Suppose $W_0 < x_0$; then the optimal terminal wealth is given by $W^*_T = B'^{-1}\bigl(y_{W_0}\frac{dQ^m}{dP}\bigr)$ for $y_{W_0} > y_0$, where $W_0$ and $y_{W_0}$ are related via $y_{W_0} = u'(W_0)$, and the optimal wealth process $(W^*_t)$ is a uniformly integrable $Q^m$-martingale.

(iv) For $0 < W_0 < x_0$ and $y_{W_0} > y_0$ we have

\[
u'(W_0) = E_P\Bigl[\frac{W^*_T\, B'(W^*_T)}{W_0}\Bigr], \tag{3.2}
\]

\[
v'(y_{W_0}) = E_P\Bigl[\frac{dQ^m}{dP}\,\tilde B'\Bigl(y_{W_0}\frac{dQ^m}{dP}\Bigr)\Bigr].
\]

Proof. By and large, the result follows from the general result in the constrained case below (see Kramkov and Schachermayer, 1999, Theorem 2.0 for the complete proof).


3.2.2 Remark. It is interesting to compare this result with the verification theorem of the previous section. Not surprisingly, the characterization of the optimal terminal wealth $W^*_T = B'^{-1}\bigl(y_{W_0}\frac{dQ^m}{dP}\bigr)$ is similar to the one in Theorem 2.1.22 in connection with Corollary 2.1.14, (2.7b). The difference is that we get the fairly explicit characterization $y_{W_0} = u'(W_0)$ for the “Lagrange multiplier”, as compared to the transcendental algebraic equation (2.8).
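To make the statements of Theorem 3.2.1 concrete, here is a short worked case. It assumes logarithmic utility B(x) = ln x and a finite expectation of ln(dQm/dP); neither assumption is made in the text, and the shorthand Z for the density is introduced only for this illustration.

```latex
% Worked example under the stated assumptions:
% B(x) = \ln x, \quad Z := dQ^m/dP, \quad E_P[|\ln Z|] < \infty.
\begin{align*}
\tilde B(y) &= \sup_{x>0}\,[\ln x - xy] = -\ln y - 1, \\
v(y) &= E_P[\tilde B(yZ)] = -\ln y - 1 - E_P[\ln Z], \\
u(x) &= \inf_{y>0}\,[v(y) + xy] = \ln x - E_P[\ln Z]
        \quad (\text{attained at } y = 1/x), \\
y_{W_0} &= u'(W_0) = \frac{1}{W_0}, \qquad
W^*_T = B'^{-1}(y_{W_0} Z) = \frac{W_0}{Z}, \\
E_{Q^m}[W^*_T] &= E_P\Bigl[Z\,\frac{W_0}{Z}\Bigr] = W_0, \qquad
E_P\Bigl[\frac{W^*_T\,B'(W^*_T)}{W_0}\Bigr] = \frac{1}{W_0} = u'(W_0).
\end{align*}
```

In this case v is finite for every y > 0, the budget constraint binds, and the last identity is exactly (3.2).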

3.3 Constraints, but no Consumption

Actually, a more general theorem holds (Kramkov and Schachermayer, 1999).

We shall give a minor extension of their result that solves the constrained problem.

As in the unconstrained case, we can rephrase Problem 2.2.1 for the case

of U ≡ 0:

\[
u_K(W_0) = \sup_{\pi \in K} E_P[B(W_T)], \tag{3.3}
\]

K ⊂ La,π(S). As before, we are looking for a static equivalent to the above

dynamic problem. The first step therefore is to characterize all attainable

terminal wealth outcomes X such that there exists a super-replicating wealth

process W (i.e. a wealth process that is admissible, satisfies all constraints and

for which WT ≥ X holds almost surely).

Let CK(W0) denote all attainable contingent claims given initial wealth W0

(for a precise definition, see the proof below). It turns out that a final wealth X lies in CK(W0) if and only if EP[XY] ≤ W0 for all Y ∈ DK, where the set DK is, roughly speaking, an analogue to the set of densities of all equivalent local (super-)martingale measures.1

1As the careful reader might note, Theorem A.3.7 only proves the ‘if and only if’ for a subset of DK. Since this subset does not possess the properties we need to prove the theorems of this section (namely closedness, convexity, solidity), the enlarged set DK is used instead. The Bipolar Theorem A.2.2 plays the key role in proving that the enlarged set possesses the desired properties.


Using this result we can give a static equivalent to (3.3):

\[
u_{K,s}(W_0) = \sup_{X \in L^0_+(Q^m)} E_P[B(X)]
\quad \text{s.t.} \quad \sup_{Y \in D_K} E_P[XY] \le W_0. \tag{3.4}
\]

Hence we are faced with the task of choosing the contingent claim X that

maximizes utility subject to a budget constraint. Just as in the previous sec-

tion, we set up a Lagrangian:

\[
L(X, y) \triangleq E_P[B(X)] - y\Bigl(\sup_{Y \in D_K} E_P[XY] - W_0\Bigr) = \inf_{Y \in D_K} E_P[B(X) - yYX] + yW_0.
\]

Comparing this to the Lagrangian of the previous section, we are led to defining

\[
v_K(y) = \inf_{Y \in D_K} E_P\bigl[\tilde B(yY_T)\bigr].
\]

As the set DK has certain desirable properties (namely closedness and con-

vexity), we can again show that uK(·) and vK(·) are conjugate, i.e. uK(x) =

infy>0[vK(y) + xy] and vK(y) = supx>0[uK(x) − xy]. It is straightforward to

characterize the function vK(·) using known results from convex analysis. This

allows us to prove existence and uniqueness for the initial problem.

We will now make this precise. To this end, consider the family of semi-

martingales

\[
S_K \triangleq \Bigl\{ \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} : \pi \in K \Bigr\},
\]

and let $\mathcal{M}(S_K)$ be the class of all probability measures $Q$ equivalent to $P$ such that the upper variation process $A^{S_K}(Q)$ exists (Definition A.3.5 and the

discussion thereafter). Observe that this set is not empty since by assumption

the market does not allow for arbitrage, and therefore there does exist an

equivalent local martingale measure Qm.

Define YK as in (A.8) on p. 144. We find that Kramkov / Schachermayer’s

result for incomplete markets holds verbatim for the constrained case, too:


3.3.1 Theorem. Suppose Assumption 3.1.1 holds, and set

\[
v_K(y) \triangleq \inf_{Y \in \mathcal{Y}_K} E_P\bigl[\tilde B(yY_T)\bigr]. \tag{3.5}
\]

Then

(i) $u_K(x) < \infty$ for all $x > 0$, and there exists $y_0 > 0$ such that $v_K(y)$ is finitely valued for $y > y_0$. We have for $y > 0$

\[
v_K(y) = \sup_{x>0}\,[u_K(x) - xy],
\]

and for $x > 0$

\[
u_K(x) = \inf_{y>0}\,[v_K(y) + xy].
\]

(ii) $u_K(\cdot)$ is continuously differentiable on $(0, \infty)$ and $v_K(\cdot)$ is strictly convex on $\{y > 0 : v_K(y) < \infty\}$. Further $\lim_{x \downarrow 0} u'_K(x) = \infty$ and $\lim_{y \to \infty} v'_K(y) = 0$.

(iii) If $v_K(y) < \infty$, then the optimal solution $Y^* \in \mathcal{Y}_K$ to (3.5) exists and is unique.

3.3.2 Theorem. Assume that $v_K(y) < \infty$ for all $y > 0$ in addition to the assumptions of Theorem 3.3.1. Then we also have

(i) $v_K(\cdot)$ is continuously differentiable on $(0, \infty)$, $u'_K(\cdot)$ and $-v'_K(\cdot)$ are strictly decreasing, and satisfy $\lim_{x \to \infty} u'_K(x) = 0$ and $\lim_{y \downarrow 0} -v'_K(y) = \infty$.

(ii) The optimal solution $\pi^* \in K$ to (3.3) exists, and $W^*$ is unique. If $Y^* \in \mathcal{Y}_K$ is the optimal solution to (3.5) at the point $y_{W_0} = u'_K(W_0)$, we have the relation

\[
W^*_T = B'^{-1}(y_{W_0} Y^*_T).
\]

The process $(W^* Y^*)$ is a uniformly integrable martingale.

(iii)

\[
u'_K(W_0) = E_P\Bigl[\frac{W^*_T\, B'(W^*_T)}{W_0}\Bigr], \tag{3.6}
\]

\[
v'_K(y_{W_0}) = E_P\bigl[Y^*_T\, \tilde B'(y_{W_0} Y^*_T)\bigr].
\]


Proof of Theorem 3.3.1 and Theorem 3.3.2. We match the above versions to

the respective abstract versions of Theorem B.1.2 and Theorem B.1.3. Recall

first our Standing Assumption on p. 34, and consider the sets

\[
C_K(x) = \bigl\{ X \in L^0_+(\Omega, \mathcal{F}, P) : X \le W_T \text{ for a wealth process } W \text{ with initial wealth } W_0 = x,\ \pi \in K \bigr\}
\]

and

\[
D_K(y) = \bigl\{ Y \in L^0_+(\Omega, \mathcal{F}, P) : \exists\, Y^K \in \mathcal{Y}_K : Y \le y\,Y^K_T \bigr\}.
\]

Set $C_K = C_K(1)$ and $D_K = D_K(1)$, hence $C_K(x) = xC_K$ and $D_K(y) = yD_K$ ($x > 0$, $y > 0$). To prove the theorems, we need to show that $C_K$, $D_K$ satisfy Assumption B.1.1. It is clear that $1 \in C_K$ since $0 \in K$, and the bidual equalities $C_K = D_K^{\circ}$ and $D_K = C_K^{\circ}$ are proven in Lemma A.3.13. Consequently, we can apply Theorem B.1.2 and Theorem B.1.3. Finally, uniform integrability of $(W^* Y^*)$ follows from $W^* Y^* \ge 0$ (Protter, 1990, Chapter 1, Theorem 13).

The next corollary is in the spirit of He and Pearson (1991b), Lemma 2.

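3.3.3 Corollary. Under the assumptions of Theorem 3.3.2 and with the same notation, suppose that $Q^* \in \mathcal{M}(S_K)$, where $Q^*(A) \triangleq \int_A Y^*_T\,dP$ for all $A \in \mathcal{F}$. Set

\[
u^Q_{K,s}(W_0) = \sup_{X \in L^0_+(Q)} E_P[B(X)]
\quad \text{s.t.} \quad E_Q\Bigl[\frac{X}{\mathcal{E}\bigl(A^{S_K}(Q)\bigr)_T}\Bigr] \le W_0; \tag{3.7}
\]

then

\[
u_K(W_0) = \inf_{Q \in \mathcal{M}(S_K)} u^Q_{K,s}(W_0) = u^{Q^*}_{K,s}(W_0).
\]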

Proof. Let $X^*$ be the optimal solution to the static problem (3.4); then it is obvious that

\[
E_Q\Bigl[\frac{X^*}{\mathcal{E}\bigl(A^{S_K}(Q)\bigr)_T}\Bigr] \le W_0
\]

must hold for all $Q \in \mathcal{M}(S_K)$, i.e. $u_K(W_0) \le \inf_{Q \in \mathcal{M}(S_K)} u^Q_{K,s}(W_0)$. It remains to show equality. To see this, note first that for $Y^*_T$ to define a probability measure $Q^*$, it must necessarily be true that $\mathcal{E}\bigl(A^{S_K}(Q^*)\bigr)_T \equiv 1$, i.e. $A^{S_K}(Q^*) \equiv 0$. It is then a consequence of Theorem 2.1.12 and Remark 2.1.13 that the optimal solution of Theorem 3.3.2 is also optimal for

\[
u^{Q^*}_{K,s}(W_0) = \sup_{X \in L^0_+(Q^*)} E_P[B(X)]
\quad \text{s.t.} \quad E_{Q^*}[X] \le W_0.
\]

Indeed, simply set $X^* = B'^{-1}(y_{W_0} Y^*_T)$. Because $X^* > x = 0$ almost surely, we must choose $y_3 \equiv 0$ (using (2.4)), and then from (2.5b) $y_1 = y_{W_0}$. The identity $u^{Q^*}_{K,s}(W_0) = u_K(W_0)$ now follows easily from the Verification Theorem 2.1.12.

3.3.4 Remark. The corollary gives us an idea of the economic rationale that lies

beneath Theorem 3.3.2. If Q∗ of Corollary 3.3.3 exists, we can minimize the

maximum possible utility of all fictitiously complete markets (3.7), Q ∈M(SK).

The idea is to add for each Q ∈ M(SK) assets to the market to make the

market complete (and arbitrage-free). Then we choose Q∗ in such a way that

there is no desire to hold the additional assets and to violate the constraints.

We have thereby reduced the dynamic problem to finding the minimum of

a family of static problems. The idea of fictitious market completion was

introduced by He and Pearson (1991b); Karatzas et al. (1991); Cvitanic and

Karatzas (1992). Q∗ is often called the minimax martingale measure. Bellini

and Frittelli (2002, Theorem 1.1) show that the minimax martingale measure

exists in almost all relevant cases for a cone constrained market. For example,

given our assumptions, it exists if supx>0 uK(x) = ∞.

3.3.5 Remark. As Kramkov and Schachermayer (1999) observe (see remarks after Theorem 2.2), if $Q^*$ of Corollary 3.3.3 exists, we get from (3.6) (and from (3.2)) the following pricing formula for a European-style contingent claim $X$ (an $\mathcal{F}$-measurable random variable):

\[
p(X) = E_P\Bigl[X\,\frac{B'(W^*_T)}{u'_K(W_0)}\Bigr].
\]

Substituting $y_{W_0} = u'_K(W_0)$ and $W^*_T = B'^{-1}(y_{W_0} Y^*_T)$ it follows that $p(X) = E_{Q^*}[X]$. Hence we have the result that the investor prices European-style


contingent claims according to the marginal rate of substitution for different

states of the world. This well-known result of microeconomic theory very nicely

extends the theory of Arrow / Debreu to our setting (see Duffie and Huang,

1985; Davis, 1997, on this). Note also that the marginal rate of substitution is

a martingale under this assumption, a fact first observed in Foldes (1978) in a

slightly different setting. And even if we are not that lucky and Q∗ ∉ M(SK),

then we can remedy this situation and use W ∗t as the new numeraire (again,

cf. Kramkov and Schachermayer, 1999).
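The substitution step in the pricing formula of Remark 3.3.5 can be written out line by line. The sketch below only uses the relations stated in Theorem 3.3.2 and the definition of Q∗ in Corollary 3.3.3, and it assumes, as in the remark, that Q∗ exists.

```latex
% Spelling out p(X) = E_{Q^*}[X], using y_{W_0} = u'_K(W_0),
% W^*_T = B'^{-1}(y_{W_0} Y^*_T) and dQ^*/dP = Y^*_T.
\begin{align*}
p(X) &= E_P\Bigl[X\,\frac{B'(W^*_T)}{u'_K(W_0)}\Bigr]
      = E_P\Bigl[X\,\frac{B'\bigl(B'^{-1}(y_{W_0} Y^*_T)\bigr)}{y_{W_0}}\Bigr] \\
     &= E_P\Bigl[X\,\frac{y_{W_0} Y^*_T}{y_{W_0}}\Bigr]
      = E_P[X\,Y^*_T]
      = E_{Q^*}[X].
\end{align*}
```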

Duality theory replaces the constrained problem with an unconstrained one.

To solve this unconstrained problem, we must know what M(SK) looks like. In

the following examples, we therefore specialize the results above to gain some

intuition. We start with a special case of He and Pearson (1991b).

3.3.6 Example (Continued from Example 2.1.39). This example will give us

an explicit characterization of M(SK) for the case of an incomplete market

and / or short-sale constraints. Let the setting be as in Example 2.1.39. Let

N > 2, and suppose as an example that π(1) ≥ 0 almost surely (no short-

sales of asset 1), and π(2) = 0 (no trading of asset 2, incomplete market), i.e.

K = {π ∈ La,π(S) : π(1) ≥ 0, π(2) = 0 a.s.}. By Remark 1.3.2, we know that

M(SK) is the set of all equivalent supermartingale measures and ASK(Q) ≡ 0.

Define

\[
Y^\nu_t \triangleq \exp\Bigl\{ \int_0^t \bigl(\nu'(s) - \theta'(s)\bigr)\,dZ(s) - \frac{1}{2}\int_0^t \|\nu(s) - \theta(s)\|^2\,ds \Bigr\}
= \mathcal{E}\Bigl( \int_0^{\cdot} \bigl(\nu'(s) - \theta'(s)\bigr)\,dZ(s) \Bigr)_t,
\]

where we assume that $E_P[Y^\nu_T] = 1$ for a predictable process $(\nu(t))$. That implies that we can define a measure equivalent to $P$ with the help of $Y^\nu_T$. In order to ensure that it is actually a supermartingale measure, we have to check that $W_t Y^\nu_t$ is a supermartingale. To this end, apply Ito’s Formula to $W_t Y^\nu_t$:

\[
\frac{d(W_t Y^\nu_t)}{W_t Y^\nu_t} = \pi'(t)\sigma(t)\,dZ(t) + \bigl(\nu'(t) - \theta'(t)\bigr)\,dZ(t) + \pi'(t)\sigma(t)\nu(t)\,dt.
\]

For $W_t Y^\nu_t$ to be a supermartingale for any $\pi \in K$ it follows that $\pi'(t)\sigma(t)\nu(t) \le 0$ for any $\pi \in K$, i.e. $\sigma_1(t)\nu(t) \le 0$, $\sigma_i(t)\nu(t) = 0$, $i = 3, \dots, N$, for all $t \in [0, T]$, where $\sigma_i(t)$ denotes the $i$th row of $\sigma(t)$.

Hence, let $\Upsilon \triangleq \{\nu \text{ $N$-dimensional predictable} : \sigma_1(t)\nu(t) \le 0,\ \sigma_i(t)\nu(t) = 0,\ i = 3, \dots, N,\ \forall\, t \in [0, T] \text{ almost surely},\ E_P[Y^\nu_T] = 1\}$. We have just shown

that WtYνt is a supermartingale for any wealth process Wt with π ∈ K. On

the other hand, it follows from the Girsanov theorem (e.g. Revuz and Yor,

1999, Chapter 2, Theorem 2.2) that for any measure Q equivalent to P, there

exists an N -dimensional predictable process γ, such that the Radon-Nikodym

derivative of Q with respect to P is

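\[
\frac{dQ}{dP} = \mathcal{E}\Bigl( \int_0^{\cdot} \gamma'_s\,dZ(s) \Bigr)_T.
\]

Therefore, $\{Q^\nu \triangleq \int Y^\nu_T\,dP : \nu \in \Upsilon\} = \mathcal{M}(S_K)$, and we can apply Theorem 3.3.1 and Theorem 3.3.2. Furthermore, if the infimum of (3.7) exists in $\mathcal{M}(S_K)$, then we can also apply Corollary 3.3.3.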
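The pointwise conditions that define Υ in Example 3.3.6 are easy to test for a given volatility matrix. The following is a minimal sketch for a single time point; the function names and the tolerance are hypothetical, and the integrability requirement EP[Y νT] = 1 is deliberately not checked.

```python
from typing import List, Sequence


def sigma_times_nu(sigma: Sequence[Sequence[float]], nu: Sequence[float]) -> List[float]:
    """Return sigma(t) nu(t); entry i is the product of row i of sigma with nu."""
    return [sum(s * v for s, v in zip(row, nu)) for row in sigma]


def satisfies_pointwise_constraints(sigma: Sequence[Sequence[float]],
                                    nu: Sequence[float],
                                    tol: float = 1e-12) -> bool:
    """Check the conditions of Example 3.3.6 at one time point.

    Asset 1 (index 0) may not be sold short, so sigma_1 nu <= 0 is required.
    Asset 2 (index 1) is not traded, so its row imposes no condition.
    Assets 3..N (indices 2..N-1) are unconstrained, so sigma_i nu = 0 is required.
    """
    drift = sigma_times_nu(sigma, nu)
    if drift[0] > tol:
        return False
    return all(abs(entry) <= tol for entry in drift[2:])
```

With the identity matrix as volatility, `nu = (-0.2, 0.7, 0.0)` passes, while any `nu` with a positive first or a non-zero third component fails.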

3.3.7 Remark. For the case of incomplete markets, He and Pearson (1991b)

give a quasi-linear PDE for a certain class of (Markov-)problems. See their

Theorems 7 and 8 for details. Sadly, if we consider short-selling constraints, as

in the example above, this PDE turns into a free boundary problem, and thus

finding an (explicit or numeric) solution becomes a non-trivial task. See Pham

(2002) for such a problem and its solution.

As a concrete example we reproduce the result of Example 2.2.14:

3.3.8 Example (Continued from Example 2.2.14). Let the setting be as in Ex-

ample 2.2.14, but with 0 < k < 1 for the exponent of the utility function B.

Then $A^{S_K}(Q) \equiv 0$ (see Remark 1.3.2) and $\tilde B(y) = \frac{k}{1-k}\,y^{-\frac{1-k}{k}}$. The first step to a general solution is to characterize the set of equivalent supermartingale measures. Just as in Example 3.3.6, we find

\[
\Upsilon = \bigl\{\nu \text{ $N$-dimensional predictable} : \sigma_i \nu(t) = 0,\ i = 1, \dots, M,\ \forall\, t \in [0, T] \text{ a.s.},\ E_P[Y^\nu_T] = 1 \bigr\}
= \bigl\{\nu : \nu^{(i)}(t) = 0,\ i = 1, \dots, M,\ \forall\, t \in [0, T] \text{ a.s.},\ E_P[Y^\nu_T] = 1 \bigr\}
\]

(compare Example 3.3.6), where

\[
Y^\nu_t = \exp\Bigl\{ \int_0^t \bigl(\nu'(s) - \theta'\bigr)\,dZ(s) - \frac{1}{2}\int_0^t \|\nu(s) - \theta\|^2\,ds \Bigr\}.
\]


Given this characterization of equivalent supermartingale measures, we can

solve the dual problem:

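\[
v_K(y) = \inf_{\nu \in \Upsilon} E_P\bigl[\tilde B(yY^\nu_T)\bigr].
\]

From the definition of $\tilde B(\cdot)$, it is clear that the minimum is attained if we set $\nu^{*(i)}(t) = \theta^{(i)}$ for all $t \in [0, T]$, $i = M + 1, \dots, N$. For this $\nu^*$ it is straightforward to evaluate (using the notation of Example 2.2.14)

\[
W^*_T = B'^{-1}\bigl(y_{W_0} Y^{\nu^*}_T\bigr) = y_{W_0}^{-1/k}\,\bigl(Y^{\nu^*}_T\bigr)^{-1/k}.
\]

From Theorem 3.3.2, especially $y_{W_0} = u'_K(W_0)$ and (3.6), we have

\[
y_{W_0} = E_P\Bigl[\frac{\bigl(y_{W_0} Y^{\nu^*}_T\bigr)^{-1/k}\, y_{W_0} Y^{\nu^*}_T}{W_0}\Bigr].
\]

Solving for $y_{W_0}$ and substituting yields

\[
W^*_T = \frac{W_0}{E_P\bigl[(Y^{\nu^*}_T)^{1-\frac{1}{k}}\bigr]}\,\bigl(Y^{\nu^*}_T\bigr)^{-\frac{1}{k}},
\]

and this is just (2.20), so that we again find $\pi^* = \frac{1}{k}(\sigma_M \sigma'_M)^{-1}[\mu_M - r\mathbf{1}]$.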
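The closing formula for π∗ can be evaluated directly. The following is a minimal sketch for M = 2 tradable assets with constant coefficients; the function name, the restriction to two assets and the sample numbers are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def constrained_merton_weights(mu_m: Sequence[float], r: float,
                               sigma_m: Sequence[Sequence[float]],
                               k: float) -> Tuple[float, float]:
    """Evaluate pi* = (1/k) (sigma_M sigma_M')^{-1} (mu_M - r 1) for M = 2.

    mu_m    : drifts of the two tradable assets
    r       : riskless rate
    sigma_m : 2x2 volatility block sigma_M
    k       : relative risk aversion (k > 0)
    """
    (a, b), (c, d) = sigma_m
    # Entries of the symmetric matrix sigma_M sigma_M'.
    s11 = a * a + b * b
    s12 = a * c + b * d
    s22 = c * c + d * d
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("sigma_M sigma_M' must be positive definite")
    e1 = mu_m[0] - r
    e2 = mu_m[1] - r
    # Explicit inverse of a 2x2 matrix applied to the excess-return vector.
    pi1 = (s22 * e1 - s12 * e2) / (k * det)
    pi2 = (-s12 * e1 + s11 * e2) / (k * det)
    return pi1, pi2
```

For uncorrelated assets with volatilities 20% and 30%, excess returns 4% and 9%, and k = 2, each weight equals 0.5, which is the usual excess return over variance, scaled by 1/k.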

3.3.9 Remark. Example 2.2.15 can be reproduced easily, too. This time we set

\[
\nu^{*(i)}(t) = -\bigl(\theta^{(i)}\bigr)^{-} \quad \forall\, t \in [0, T],\ i = 1, \dots, N.
\]

3.4 The General Case with Consumption

Since we are not only interested in optimizing with respect to utility from

terminal wealth, but also with respect to utility from consumption, we need

a generalization formulated by Mnif and Pham (2001); Karatzas and Zitkovic

(2003). Again, we could simply invoke a different Optional Decomposition

theorem from the one used in the original article. We will, however, prove a more

general version than Karatzas and Zitkovic (2003), using a slightly different

technique and a more general condition.


Define uK(W0) as in Problem 2.2.1. The idea for finding a solution is just

the same as in the two previous sections. That is, as a first step we must

characterize the combination of all consumption / terminal wealth pairs that

are attainable given initial wealth W0. As it turns out (compare Appendix

A.3, also Proposition 1.3.3, Proposition 1.3.4, and Remark 2.1.21), $(\pi, c) \in \mathcal{A}^K_\pi(S, W_0)$ is a candidate optimal solution if and only if the budget constraint

\[
E_P\Bigl[W_T Y_T + \int_0^T c(s)\,Y_s\,ds\Bigr] \le W_0
\]

holds for all $Y \in \mathcal{Y}_K$ with $\mathcal{Y}_K$ defined as in (A.8). Thus, again, we have found a static analogue that we can solve more easily.2

Proceeding as in the previous sections, we set up a Lagrangian for the static

formulation and are led to defining vK(·) by

\[
v_K(y) \triangleq \inf_{Y \in \mathcal{Y}_K} E_P\Bigl[\int_0^T \tilde U(s, yY_s)\,ds + \tilde B(yY_T)\Bigr]. \tag{3.8}
\]

Using the closedness of YK and the Minimax theorem, it is then possible to

prove that uK(·) and vK(·) are conjugate. From the properties of YK and the

definition of the convex dual, we can prove existence of a solution to the dual

problem just as before. Existence of a solution to the primal problem then

follows easily.

3.4.1 Theorem. Suppose Assumption 2.1.30 holds, and vK(y) < ∞ for all

y > 0. Then

(i) uK(x) < ∞ for all x > 0. We have for y > 0

vK(y) = supx>0

[uK(x)− xy],

and for x > 0

uK(x) = infy>0

[vK(y) + xy].

2Again, Theorem A.3.7 only proves the ‘if and only if’ for a subset of YK. Since this subset does not possess the necessary properties, we enlarge the set.


(ii) $u_K(\cdot)$ is continuously differentiable on $(0, \infty)$ and strictly concave; $v_K(\cdot)$ is continuously differentiable and strictly convex. Further $\lim_{x \downarrow 0} u'_K(x) = \lim_{y \downarrow 0} -v'_K(y) = \infty$, and $\lim_{x \to \infty} u'_K(x) = \lim_{y \to \infty} v'_K(y) = 0$.

(iii) The optimal solution Y ∗ ∈ YK to (3.8) exists and is unique.

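(iv) The optimal solution $(\pi^*, c^*(\cdot))$ to Problem 2.2.1 exists and $c^*(\cdot)$, $W^*$ are unique. If $Y^* \in \mathcal{Y}_K$ is the optimal solution to (3.8) at the point $y_{W_0} = u'_K(W_0)$, we have the relation

\[
c^*(t) = U'^{-1}(t, y_{W_0} Y^*_t) \quad \forall\, t \in I,\ P\text{-a.s.},
\qquad
W^*_T = B'^{-1}(y_{W_0} Y^*_T).
\]

(v)

\[
u'_K(W_0) = E_P\Bigl[\frac{\int_0^T c^*(s)\,U'(s, c^*(s))\,ds + W^*_T\, B'(W^*_T)}{W_0}\Bigr],
\]

\[
v'_K(y_{W_0}) = E_P\Bigl[\int_0^T Y^*_s\, \tilde U'(s, y_{W_0} Y^*_s)\,ds + Y^*_T\, \tilde B'(y_{W_0} Y^*_T)\Bigr].
\]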

Proof. As already mentioned, we prove this result using slightly different tech-

niques than Karatzas and Zitkovic (2003). In order to avoid repetition, we only

sketch the proof and refer to previous proofs for the details. We omit the proof

of (ii), since, given (i), this is a messy, but not very insightful application of

the monotone and dominated convergence theorems to strictly convex / concave

functions, similar in spirit to the one used in the proof of Theorem 2.1.12. The

interested reader can consult Karatzas and Zitkovic (2003, Proposition A.6 and

Lemma A.7) for the proof of the properties of v′K. The properties of u′K then

follow from (i).

(i) and (iii) follow from Lemma 2.1.38 and a straightforward application of

the bidual relationships.

Proposition 2.1.33 (vii) and Lemma 2.1.38 prove the existence of $(\pi^*, c^*(\cdot))$ in (iv), since $v_K(y) = \sup_{x>0}[u_K(x) - xy]$, see also Remark 2.1.36. Strict concavity again implies uniqueness. This in turn proves $u_K(x) = \inf_{y>0}[v_K(y) + xy]$, which completes the proof of (i). The characterization of $c^*$, $W^*_T$ follows from the bidual relations of $B$, $\tilde B$, $U$, $\tilde U$, $u_K$, $v_K$, the differentiability of all the functions, the assumption $y_{W_0} = u'_K(W_0)$ and the uniqueness of the optimal results. See also Proposition 2.2.7.

(v) follows from this characterization on noting that

\[
E_P\Bigl[\int_0^T c^*(s)\,Y^*_s\,ds + W^*_T\, Y^*_T\Bigr] \le W_0.
\]

In this inequality, indeed equality must hold for otherwise we could increase

utility as B∗ is strictly increasing, contradicting the fact that W ∗T is optimal.

The bidual relations then also prove the characterization of v′K.

3.4.2 Remark. If the market is complete, the theorem naturally extends and

verifies the verification result of the previous chapter. It also verifies the results

of the two previous sections.

We can extend Corollary 3.3.3, too:

3.4.3 Corollary. Under the assumptions of Theorem 3.4.1 and with the same notation, suppose that $Q^* \in \mathcal{M}(S_K)$, where $Q^*(A) \triangleq \int_A Y^*_T\,dP$ for all $A \in \mathcal{F}$. Set

\[
u^Q_{K,s}(W_0) = \sup_{(c, X)} E_P\Bigl[\int_0^T U(s, c(s))\,ds + B(X)\Bigr]
\quad \text{s.t.} \quad E_Q\Bigl[\int_0^T c(s)\,ds + X\Bigr] \le W_0, \tag{3.9}
\]

where $c$ is a consumption process, and $X \in L^0_+(Q)$; then

\[
u_K(W_0) = \inf_{Q \in \mathcal{M}(S_K)} u^Q_{K,s}(W_0),
\]

and the infimum is attained at $Q^*$.

Proof. Completely analogous to the proof of Corollary 3.3.3.

For sufficient conditions for vK(y) < ∞, see Proposition 2.2.6. In the next

example we consider the duality approach for the problem of both consumption

and terminal wealth.


3.4.4 Example (Continued from Example 2.1.39). The problem of Theorem 3.3.1, Theorem 3.3.2, Corollary 3.3.3, and Theorem 3.4.1 is that $\int Y^*_T\,dP \in \mathcal{M}(S_K)$ need not be valid even for well-behaved utility functions like B(x) =

ln(x) and well-behaved stochastic processes like continuous martingales (Kram-

kov and Schachermayer, 1999, Example 5.1). Therefore, in Theorem 3.3.1,

Theorem 3.3.2, Theorem 3.4.1 we use YK instead of M(SK). Karatzas et al.

(1986, 1987, 1991); Cvitanic and Karatzas (1992) employ the same idea and

combine it with the fictitious market completion technique pioneered by He

and Pearson (1991b); Karatzas et al. (1991). Textbook references with many

examples are Chapter 6 in Karatzas and Shreve (1998); Korn (1997, Chap-

ters 4.4 and 4.5). A comprehensive treatment of this setting for the case of

terminal wealth only can be found in Cvitanic (1999). Karatzas and Zitkovic

(2003, Section 4.1) treat the case of consumption / terminal wealth.

We turn to Problem 2.2.1 in the setting of Example 2.1.39. Assume c(t) = 0, x = 0, dom(U) = [0,∞), U′(t, 0) = ∞ for all t ∈ I, and define the constrained set by $K = \{\pi \in L_\pi(S) : \pi_t \in K_N\ \forall\, t \in I\}$ for some closed and convex $K_N \subset \mathbb{R}^N$ with $0 \in K_N$. It follows that $K$ is convex and closed. Consider the support function $\zeta(\nu) \triangleq \sup_{\pi \in K_N}(-\pi'\nu)$, $\nu \in \mathbb{R}^N$, a positively homogeneous and subadditive function (see Chapter 1 in Castaing and Valadier, 1977; Rockafellar, 1970, Section 13, on the support function). Since $0 \in K_N$, $\zeta(\cdot) \ge 0$, and further $\pi \in K_N \Leftrightarrow \zeta(\nu) + \pi'\nu \ge 0\ \forall\, \nu \in \{\nu \in \mathbb{R}^N : \zeta(\nu) < \infty\}$. Define

\[
\Upsilon = \Bigl\{ \nu \text{ predictable} : E_P\Bigl[\int_0^T \|\nu(s)\|^2\,ds\Bigr] < \infty,\ E_P\Bigl[\int_0^T \zeta(\nu(s))\,ds\Bigr] < \infty \Bigr\}.
\]

For each $\nu \in \Upsilon$, replace $r(\cdot)$ with $r^\nu(\cdot) \triangleq r(\cdot) + \zeta(\nu(\cdot))$ and $\mu(\cdot)$ with $\mu^\nu(\cdot) \triangleq \mu(\cdot) + \nu(\cdot) + \zeta(\nu(\cdot))$. Then we can define $S^\nu_i$ in analogy to (2.13), $i = 1, \dots, N$, and $S^\nu_0$ in analogy to (2.15). Similarly, we can define $\theta^\nu(\cdot) \triangleq \sigma(\cdot)^{-1}(\mu^\nu(\cdot) - r^\nu(\cdot)\mathbf{1})$, $Y^\nu_t$ by (2.17), and a measure $Q^\nu$ with the help of (2.16). The reader should however be careful here, as $Q^\nu$ need not be a probability measure (it might have total mass less than 1). To sum up, we have constructed a fictitious completion of the market, given a $\nu \in \Upsilon$. We will identify each of these completions by $\nu$. Note that $\{Y^\nu : \nu \in \Upsilon\} \subset \mathcal{Y}_K$, and the set is indeed maximal (compare Example 3.2 in Mnif and Pham, 2001; Karatzas and Zitkovic, 2003, Proposition 4.1).

For each of these markets define (compare (3.8))

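\[
v^\nu(y) \triangleq E_P\Bigl[\int_0^T \tilde U\Bigl(s, y\,\frac{Y^\nu_s}{S^\nu_0(s)}\Bigr)\,ds + \tilde B\Bigl(y\,\frac{Y^\nu_T}{S^\nu_0(T)}\Bigr)\Bigr]
\]

and

\[
u^\nu(W_0) \triangleq \sup_{(\pi, c) \in \mathcal{A}_\pi(S^\nu, W_0)} E_P\Bigl[\int_0^T U\bigl(s, S^\nu_0(s)\,c(s)\bigr)\,ds + B\bigl(S^\nu_0(T)\,W_T\bigr)\Bigr].
\]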

Set

\[
\Upsilon_0 = \bigl\{ \nu \in \Upsilon : v^\nu(y) < \infty\ \forall\, y \in (0, \infty),\ -\tfrac{d}{dy}v^\nu(y) < \infty\ \forall\, y \in (0, \infty) \bigr\}.
\]

Suppose uK(W0) < ∞ for Problem 2.2.1. Then there exists ν∗ ∈ Υ0 such

that uK(W0) = uν∗(W0) = infν∈Υ uν(W0). Consequently, the optimal solution

c∗(·),W ∗,π∗ exists and is the unique solution to the fictitious completion of

the market ν∗ as defined above. Indeed, c∗(·),W ∗ can be characterized as

in Corollary 2.1.14 for the respective complete market ν∗, and the optimal

portfolio-proportion process π∗ can be found by the techniques described in

Remark 2.1.23 and Example 2.1.39. It satisfies ζ(ν∗(t)) + π∗′(t)ν∗(t) = 0

almost surely ∀ t ∈ I. Furthermore, vν∗(y) = infν∈Υ vν(y).
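The support function ζ and the adjusted coefficients of Example 3.4.4 have simple closed forms for the most common constraint sets. The sketch below covers two assumed choices of KN, the non-negative orthant (no short sales) and a box containing the origin; the function names are hypothetical, and the adjusted drift follows the definition of µν given in the example.

```python
import math
from typing import List, Sequence, Tuple


def support_no_short_sales(nu: Sequence[float]) -> float:
    """zeta(nu) = sup_{pi >= 0} (-pi' nu): 0 if every nu_i >= 0, +infinity otherwise."""
    return 0.0 if all(v >= 0.0 for v in nu) else math.inf


def support_box(nu: Sequence[float], lower: Sequence[float],
                upper: Sequence[float]) -> float:
    """zeta(nu) for K_N = prod_i [lower_i, upper_i] with lower_i <= 0 <= upper_i.

    The supremum of -pi_i * nu_i over an interval is attained at an endpoint.
    """
    return sum(max(-lo * v, -up * v) for v, lo, up in zip(nu, lower, upper))


def adjusted_coefficients(r: float, mu: Sequence[float], nu: Sequence[float],
                          zeta_value: float) -> Tuple[float, List[float]]:
    """Return (r^nu, mu^nu) with r^nu = r + zeta(nu), mu^nu = mu + nu + zeta(nu)."""
    return r + zeta_value, [m + v + zeta_value for m, v in zip(mu, nu)]
```

In the box case ζ is finite everywhere, so every square-integrable ν is a candidate; in the no-short-sale case only componentwise non-negative ν give a finite ζ, and for those the riskless rate of the fictitious market is unchanged.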

3.4.5 Remark. Under additional regularity conditions one could go further and

extend the example to (even random) closed convex sets KN (t) (confer Cvitanic

and Karatzas, 1992, Section 16.3).

3.4.6 Example. We finish this section with an extensive example, which is ba-

sically a unification and extension of all the previous examples. To make it

tractable, we start with some repetition. Throughout the example we will use

some vigorous hand-waving: we assume that all operations are justified without

checking the details.


To start with, we assume the Brownian market of Example 2.1.39. To find a

solution to this optimal consumption problem, we first solve the dual problem

(3.8). The first step to a general solution of this problem is to characterize the

equivalent supermartingale measures. Suppose for the sake of convenience that

ASK(Q) ≡ 0 (see Remark 1.3.2). Then we can characterize the equivalent

measures with the help of the densities

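\[
Y^\nu_t \triangleq \exp\Bigl\{ \int_0^t \bigl(\nu'(s) - \theta'(s)\bigr)\,dZ(s) - \frac{1}{2}\int_0^t \|\nu(s) - \theta(s)\|^2\,ds \Bigr\}
= \mathcal{E}\Bigl( \int_0^{\cdot} \bigl(\nu'(s) - \theta'(s)\bigr)\,dZ(s) \Bigr)_t,
\]

where we assume that $E_P[Y^\nu_T] = 1$ for a predictable process $(\nu(t))$ (see Example 3.3.6). In order to ensure that the density actually defines a supermartingale measure, we have to check that $W_t Y^\nu_t$ is a supermartingale. To this end, apply Ito’s Formula to $W_t Y^\nu_t$:

\[
\frac{d(W_t Y^\nu_t)}{W_t Y^\nu_t} = \pi'(t)\sigma(t)\,dZ(t) + \bigl(\nu'(t) - \theta'(t)\bigr)\,dZ(t) + \pi'(t)\sigma(t)\nu(t)\,dt.
\]

For $W_t Y^\nu_t$ to be a supermartingale for $\pi \in K$, it follows that $\pi'(t)\sigma(t)\nu(t) \le 0$ for any $\pi \in K$. Therefore $\mathcal{Y}_K = \{(Y^\nu_t) : \pi'(t)\sigma(t)\nu(t) \le 0\ \forall\, \pi \in K\}$.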

To get a more concrete result let us, as in Example 2.2.14, consider the

special case where we are only allowed to hold the first M assets, M < N ;

i.e. $K \triangleq \{\pi : \pi^{(i)}(t) = 0,\ i = M + 1, \dots, N\ \forall\, t \in I \text{ a.s.}\}$ (any of the other constraints considered previously can be tackled just as easily). Furthermore, the volatility matrix has the following block structure:

\[
\sigma = \begin{pmatrix} \sigma_M & 0 \\ 0 & \sigma_{N-M} \end{pmatrix}.
\]

Then $\mathcal{Y}_K = \{(Y^\nu_t) : \nu^{(i)} = 0,\ i = 1, 2, \dots, M\}$.

Having characterized the set of equivalent supermartingale measures needed in (3.8), we need to calculate the convex duals $\tilde U(t, y)$ and $\tilde B(y)$ (see Definition 3.1.3). To do so, we must assume a specific utility function. In our case, it will be the usual one from Example 2.1.41. We define a function

\[
\bar B(x) \triangleq
\begin{cases}
\dfrac{x^{1-k}}{1-k} & x > 0, \\[1ex]
\lim_{x \downarrow 0} \dfrac{x^{1-k}}{1-k} & x = 0, \\[1ex]
-\infty & x < 0,
\end{cases}
\]

$k > 0$, $k \ne 1$, and consider the CRRA case: the individual maximizes utility from running consumption and terminal wealth. The utility function for consumption is $U(c, t) \triangleq \exp\{-d\,t\}\,\bar B(c)$, and the bequest function is $B(x) \triangleq \exp\{-d\,T\}\,\bar B(x)$, where $d$ is the subjective discount rate.

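We are now in the position to calculate the convex duals $\tilde U(t, y)$ and $\tilde B(y)$ (see Definition 3.1.3). For the utility function considered here, it is easily seen that

\[
\tilde B(y) = B\bigl(B'^{-1}(y)\bigr) - y\,B'^{-1}(y) = \frac{k}{1-k}\, y^{-\frac{1-k}{k}} \exp\Bigl\{-\frac{1}{k}\,d\,T\Bigr\}.
\]

Similarly, $\tilde U(t, y) = \frac{k}{1-k}\, y^{-\frac{1-k}{k}} \exp\{-\frac{1}{k}\,d\,t\}$. Using this, (3.8) reads

\[
v_K(y) = \inf_{\substack{\nu^{(i)}(t) = 0, \\ i = 1, 2, \dots, M}} \frac{k}{1-k}\, y^{-\frac{1-k}{k}}\, E_P\Bigl[\int_0^T \exp\Bigl\{-\frac{1}{k}\,d\,s\Bigr\} \bigl(Y^\nu_s\bigr)^{-\frac{1-k}{k}}\,ds + \exp\Bigl\{-\frac{1}{k}\,d\,T\Bigr\} \bigl(Y^\nu_T\bigr)^{-\frac{1-k}{k}}\Bigr].
\]

Clearly, it suffices to minimize $\frac{k}{1-k}\,E_P\bigl[(Y^\nu_t)^{-\frac{1-k}{k}}\bigr]$ for each $t$. And this is done if $\nu^{(i)}(t) = \theta^{(i)}(t)$ for $i = M + 1, M + 2, \dots, N$. It follows from Theorem 3.4.1 that

\[
c^*(t) = \Bigl( y_0 \exp\{d\,t\} \exp\Bigl\{ -\int_0^t \theta'_M(s)\,dZ_M(s) - \frac{1}{2}\int_0^t \|\theta_M(s)\|^2\,ds \Bigr\} \Bigr)^{-\frac{1}{k}},
\]

\[
W^*(T) = \Bigl( y_0 \exp\{d\,T\} \exp\Bigl\{ -\int_0^T \theta'_M(s)\,dZ_M(s) - \frac{1}{2}\int_0^T \|\theta_M(s)\|^2\,ds \Bigr\} \Bigr)^{-\frac{1}{k}},
\]

where for any vector $a$, the vector $a_M$ consists of the first $M$ elements of $a$.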

On the other hand, we know that $W^*(T)$ is a solution to a stochastic differential equation. And this solution is given by (compare Theorem C.3.1, or Karatzas and Shreve, 1991, Problem 5.6.15)

\begin{align*}
W^*(T) = {} & W_0 \exp\Bigl\{ \int_0^T \pi'_M(s)[\mu_M(s) - r(s)\mathbf{1}_M]\,ds - \int_0^T \tfrac{1}{2}\,\pi'_M(s)\sigma_M(s)\sigma'_M(s)\pi_M(s)\,ds + \int_0^T \pi'_M(s)\sigma_M(s)\,dZ_M(s) \Bigr\} \\
& - \int_0^T c^*(s) \exp\Bigl\{ \int_s^T \pi'_M(u)[\mu_M(u) + \delta_M(u) - r(u)\mathbf{1}_M]\,du - \int_s^T \tfrac{1}{2}\,\pi'_M(u)\sigma_M(u)\sigma'_M(u)\pi_M(u)\,du + \int_s^T \pi'_M(u)\sigma_M(u)\,dZ_M(u) \Bigr\}\,ds,
\end{align*}

where we have adjusted $\sigma_M$. Substituting $c^*(\cdot)$ and comparing the Brownian motion parts of the two representations of $W^*(T)$, we find $\pi(\cdot) = \frac{1}{k}\,\sigma'^{-1}_M(\cdot)\,\theta(\cdot) = \frac{1}{k}\,(\sigma_M(\cdot)\sigma'_M(\cdot))^{-1}[\mu_M(\cdot) - r(\cdot)\mathbf{1}]$. What remains to be done is to calculate $y_0$. This is very easy from the characterizations of $c^*$ and $W^*$ above, and the relation

\[
E_P\Bigl[\int_0^T c^*(s)\,Y^\nu_s\,ds + W^*_T\,Y^\nu_T\Bigr] = W_0,
\]

see the proof of (v) in Theorem 3.4.1.
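The last step, calculating y0 from the binding budget constraint, can be sketched as follows. The sketch assumes constant coefficients, so that ‖θM‖ and d are numbers, and it uses c∗(s) = (y0 exp{d s} Y ν∗s)^(−1/k) from Theorem 3.4.1 (iv) together with the lognormal moment EP[(Y ν∗s)^p] = exp{p(p − 1)‖θM‖² s/2}. Under these assumptions EP[c∗(s)Y ν∗s] = y0^(−1/k) exp{−a s} with a = d/k + (k − 1)‖θM‖²/(2k²). All function names are hypothetical.

```python
import math


def effective_rate(theta_norm: float, d: float, k: float) -> float:
    """a = d/k + (k-1) * |theta_M|^2 / (2 k^2), the decay rate of E_P[c*(s) Y_s]."""
    return d / k + (k - 1.0) * theta_norm ** 2 / (2.0 * k ** 2)


def budget_factor(a: float, horizon: float) -> float:
    """int_0^T exp(-a s) ds + exp(-a T), with the limit T + 1 as a -> 0."""
    integral = horizon if abs(a) < 1e-12 else (1.0 - math.exp(-a * horizon)) / a
    return integral + math.exp(-a * horizon)


def multiplier_y0(w0: float, theta_norm: float, d: float, k: float,
                  horizon: float) -> float:
    """Solve y0^(-1/k) * budget_factor = W0 for y0."""
    a = effective_rate(theta_norm, d, k)
    return (budget_factor(a, horizon) / w0) ** k


def budget_by_quadrature(y0: float, theta_norm: float, d: float, k: float,
                         horizon: float, steps: int = 2000) -> float:
    """Recompute the left-hand side of the budget relation by the trapezoid rule."""
    a = effective_rate(theta_norm, d, k)
    scale = y0 ** (-1.0 / k)
    h = horizon / steps
    values = [scale * math.exp(-a * i * h) for i in range(steps + 1)]
    consumption_part = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    terminal_part = scale * math.exp(-a * horizon)
    return consumption_part + terminal_part
```

Feeding the output of `multiplier_y0` back into `budget_by_quadrature` should reproduce W0 up to discretization error, which is a cheap consistency check on the exponents.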

3.5 Extensions and Ramifications

Most of the exposition throughout the thesis is streamlined to get a readable

account of the major aspects of portfolio optimization. We now conclude the

first three chapters with various extensions that are not covered in the main

part. They have not been introduced to the main part to keep the notation at

bay. We only sketch what is possible and has been done by other authors, and

refer to these sources for details on such extensions, where necessary.


3.5.1 0 in Constraint Set

The assumption 0 ∈ K implies that we are always allowed to invest our total

wealth in the riskless asset. This assumption is only restrictive in cases where

there is a constraint requiring to be invested in non-hedgeable risky assets to

some degree. The most prominent example is the asset / liability setting in

insurance, where we always have to “invest” in the liability.

If 0 /∈ K, basically two problems arise in general. The first one is that W > 0

can no longer be ensured for the case of portfolio processes. And the second

one is that constraints like π ≤ 5% might not be enforceable; put differently, the

set CK(W0) or CπK(W0) could be empty. The reason is that there are stochastic

components of our wealth process that are beyond our control. Hence, we need

an assumption that gives us at least some control on what can happen. From

a technical point of view, we need a nonincreasing lower bound in the set SK, a

role played by 0 until now (Follmer and Kramkov, 1997, proof to Proposition

5.2). Such assumptions have been used in Mnif and Pham (2001, Section 2,

especially (H0) and (H1)). Even with such an assumption, we still have the

problem of W ≤ 0 on some set.

3.5.2 Stochastic Income

Another possible reason for negative wealth is stochastic income. Up to now, we

have always assumed that c ≥ 0, i.e. (net) consumption is non-negative. If we

allow for c < 0, we can interpret the consumption process as net consumption

(or endowment), i.e. consumption minus income. The theory of portfolio opti-

mization can be extended to cope with stochastic consumption (e.g. Cvitanic

et al., 2001; Karatzas and Zitkovic, 2003; Mnif and Pham, 2001). We need

however some additional integrability conditions. That might add consider-

able complexity. If net consumption is bounded from below by some constant

however, matters are again straightforward. See also Remark 1.4.4.


3.5.3 Negative Wealth

There are several different approaches to handle negative wealth. The simplest

one just uses another lower bound for the wealth process than zero. This is

nothing but a “coordinate transformation”, and therefore adds no extra layer

of complexity.

A slightly more involved approach requires that the terminal wealth $W_T = W_0 + \int_{0+}^T \xi_s \cdot dS_s - \int_0^T c(s)\,ds$ is bounded from below by a constant less than zero. In order to avoid doubling strategies, an additional assumption is needed; e.g. that the gains process $\int_{0+}^{\cdot} \xi_s\,dS_s$ is uniformly bounded from below, i.e. for example ξ ∈ La(S, W0). However, in the presence of a no-arbitrage assumption like the one used throughout the thesis, this directly reduces to the case

of the first approach. The reason is that a lower bound on terminal wealth

and a no-arbitrage assumption together induce a lower bound on the complete

wealth process. For a proper discussion of what a weakening of the no-arbitrage

assumption implies, see Section 3.5.5 below.

Whereas the first two approaches work for “portfolio-proportion processes”

(that is, after a “coordinate transformation”) and portfolio processes, the third

and fourth approach are only applicable to portfolio processes, since portfolio-

proportion processes are no longer well-defined with negative wealth. The third

approach still requires that all wealth processes are bounded from below by a

constant that depends on the wealth process. Slightly abusing notation, we

would say that (ξ, c) is admissible if $(\xi, c) \in \bigcup_{W_0 > 0} \mathcal{A}^K(S, W_0)$ holds. For such

models, superhedging results still hold true, and therefore the existence results

still work. A characterization of the optimal solution along the lines of Section

2.2.3 should be feasible. But enlarging the dual sets is an open issue in the

general constraint setting. Therefore the more advanced results of Section 2.2.3

and Chapter 3, that rely on such an enlargement, do not simply carry over.

We would have to introduce additional assumptions here (like Assumption 3

on p. 40 in Cuoco, 1997), or restrict attention to cone constraints; see also the

remarks at the end of Appendix A.3.2 beginning on p. 162. A related approach

requires that terminal wealth is in L2(Ω,F , P), or that it is uniformly integrable

(e.g. Duffie, 2001, Chapter 9).


The last and most involved approach does not limit the properties of the

wealth process. Then the admissible trading strategies must be carefully

chosen. But this is well beyond the scope of this thesis (Delbaen, Grandits,

Rheinlander, Samperi, Schweizer, and Stricker, 2002; Schachermayer, 2002).

3.5.4 Other Utility Functions

Until now, we have considered two types of utility functions, namely one de-

fined on the space of attainable pairs (c,X) in Chapter 1; and then, as a first

specialization of this, the classical (state-dependent) time-additive utility func-

tion in Chapter 2. This specialization allows us to get first-order conditions

using Lagrange Multiplier Theory (compare Section 2.2.3).

In principle, both the state-dependent, time-additive utility functions and

the even more general ones of the current chapter are flexible enough to capture

all kinds of different behavior of individual investors. However, they are by

no means parsimonious, which makes them next to intractable when it comes

to e.g. finding optimal portfolios. And state-independent, time-additive utility

functions suffer from several shortcomings (compare Section 1.7). Therefore,

different authors have come up with various suggestions.

One such extension is the class of utility functions where utility is history-dependent.
To name just a few papers, such utility functions are discussed in Sundaresan

(1989); Constantinides (1990); Detemple and Zapatero (1991); Hindy et al.

(1992, 1997). Roughly speaking, history-dependent utility functions are utility

functions u with utility given by u(C,X), where C is the total consumption

until now (as compared to the running consumption c). To be a little bit more

precise, one such utility function (Hindy et al., 1992) could be defined by

\[ u(C,X) \triangleq E_P\left[\int_0^T U(s, V_s(C))\,ds + B(X, V_T(C))\right] \]

where

\[ V_t(C) \triangleq \eta \exp\left\{-\int_0^t d(s)\,ds\right\} + \int_0^t d(s)\exp\left\{-\int_0^s d(u)\,du\right\} dC(s). \]


Here V is the individual’s level of satisfaction from previous consumption,

which depends on η > 0 and a discount factor $d : I \to \mathbb{R}_0^+$. It is not too

hard to show that this setting by and large fits into the general definition of

u. Since our assumptions concerning admissible consumption processes c imply

that the set $\{C : C = \int_0^{\cdot} c(s)\,ds\}$ is not closed, we have to use a slightly different

space of consumption processes if we actually want to prove existence of an

optimal solution. For example, the space of all non-negative, nondecreasing,

progressive processes C suffices (see Remark 1.2.2). We refer to Hindy et al.

(1992, 1997); Bank (2000, Section 1.2.2) for details. Using this we easily get

existence results for the constrained case along the lines of Theorem 1.4.3. As

in the time-additive case, establishing upper semicontinuity is the only critical

aspect to prove existence of an optimal solution to such a Hindy-Huang-Kreps

utility function in the constrained case. And it is not very surprising that again

a power-growth condition suffices (Corollary 2.1.29; Section 2.2 in Bank, 2000).

Bank (2000, Section 2.3) also shows how to get first-order conditions for such

utility functions by arguments akin to the ones used in Section 1.6. It should

be possible to extend such arguments to the constrained case as in Section

2.2.3. And calculating optimal portfolio rules is also feasible for markets like

the Brownian market (e.g. using the fictitious market completion, see Remark

3.3.4 on p. 92 and Hindy and Huang, 1993; Bank, 2000, Section 4.2).

Another class of utility functions that has drawn substantial interest is

the class of recursive utility (Epstein and Zin, 1989; Duffie and Epstein, 1992;

Campbell and Viceira, 2002; Lazrak and Quenez, 2003; Schroder and Skiadas,

2003). One example of such a utility function can be defined as follows. For

given (c,X) consider the (stochastic) integral equation

\[ V_t(c,X) = E_P\left[\int_t^T U(s, c(s), V_s(c,X))\,ds + B(X) \,\Big|\, \mathcal{F}(t)\right]. \]

Then we set u(c,X) = V0(c,X). This specification implies that future (ex-

pected) utility influences current utility — hence the name recursive utility. If

we assume that U,B are concave and increasing in both their arguments, we

can intuitively see that u is so, too. This is easily seen to be true by backward


induction, if I is finite. For the case of Ito processes, such a proof (indeed the

proof that Vt possesses these properties) can be found in El Karoui, Peng, and

Quenez (1997).
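The backward induction mentioned above for a finite time set can be sketched as follows. This is only an illustration under assumptions that are not part of the text: a recombining binomial tree with up-probability `p_up`, a discrete aggregator step of the form V_t = E[V_{t+1} | F(t)] + U(c_t, E[V_{t+1} | F(t)]) dt, and a hypothetical aggregator `aggregator` that is increasing in both arguments. The function and variable names are made up for this sketch.

```python
import math

def recursive_utility(consumption, terminal_wealth, U, B, dt, p_up=0.5):
    """Backward induction for V_0 on a recombining binomial tree.

    consumption[t][k] is consumption at step t in the node with k up-moves,
    terminal_wealth[k] is X in the terminal node with k up-moves.
    """
    n_steps = len(consumption)
    values = [B(x) for x in terminal_wealth]  # V_T = B(X)
    for t in range(n_steps - 1, -1, -1):
        stepped = []
        for k in range(t + 1):
            # conditional expectation of next-period utility in node (t, k)
            continuation = p_up * values[k + 1] + (1.0 - p_up) * values[k]
            # running utility depends on consumption and on future utility
            stepped.append(continuation + U(consumption[t][k], continuation) * dt)
        values = stepped
    return values[0]

def aggregator(c, v):
    # hypothetical U(c, v): concave in c, increasing in both arguments
    return math.sqrt(c) + 0.1 * v

low = [[1.0], [1.0, 1.0]]     # consumption plan on a two-step tree
high = [[2.0], [2.0, 2.0]]    # uniformly higher consumption plan
wealth = [1.0, 4.0, 9.0]      # terminal wealth X in the three end nodes
v_low = recursive_utility(low, wealth, aggregator, math.sqrt, dt=0.5)
v_high = recursive_utility(high, wealth, aggregator, math.sqrt, dt=0.5)
```

Because each backward step is increasing in both consumption and continuation utility, the higher consumption plan yields the larger value of u(c,X) = V_0(c,X), which is the monotonicity claimed for u in the finite case.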

In order to get existence results, we need upper semicontinuity of u, and it

seems not too hard to find power-growth conditions on U,B that actually achieve

that, since the conditional expectation operator is contractive. We therefore

find that the existence result carries over to the recursive utility setting with

constraints. First-order conditions are however much more complicated. We

refer to Schroder and Skiadas (2003).

Finally, the extension to quasi-concave utility functions is not only of aca-

demic interest (see Section 1.4 and Remark 2.1.25). There are applications in

economics where “utility functions” have jumps or kinks (e.g. Zellweger, 2003).

3.5.5 Various Extensions

American constraints — i.e. constraints on the whole wealth process — can

be introduced as in Mnif and Pham (2001). Mnif and Pham basically modify

the admissible set to cope with such constraints. We need some additional

integrability conditions. Then all of the theory is true as before. We refer to

Mnif and Pham (2001) for details.

If the constraints are only on terminal wealth (European constraints), then

our step from the dynamic to the static problem works as before. The con-

straints simply translate into some additional Lagrange multipliers. We leave

the details to the reader, and refer to Korn and Trautmann (1995) or Korn

(1997, Chapter 4.2).

Basically, the assumption that no arbitrage is possible is not needed; it just
facilitates several proofs. Without no arbitrage, the results concerning the
portfolio-proportion process are still valid; the existence result for portfolio
processes and the first-order conditions remain true with some straightforward
modifications (Mnif and Pham, 2001, Proposition 4.1); but duality results fail
since we can no longer simply extend the set (unless we impose additional
boundedness assumptions that allow for a proper enlargement of the relevant set).


But the no-arbitrage assumption is a natural assumption. As extensively

discussed in Remark 2.1.10, it is almost unavoidable to assume no arbitrage.

And if there are arbitrage opportunities, an investor will trade infinitely large

positions in this arbitrage opportunity, unless a portfolio constraint prohibits

that. In this case, she will trade the maximum allowed, basically increasing

the initial wealth W0 by exploiting the arbitrage opportunity.

If an investor faces constraints on the portfolio process and the portfolio

proportion process at the same time, we still can use the same theory. The

existence result can be used by intersecting the sets CK and CπK for portfolio

process and the portfolio-proportion process. A characterization of the optimal

solution along the lines of Section 2.2.3 should be feasible, too. Duality results

and the more advanced results of Section 2.2.3 are an open issue unless we add

an additional boundedness-assumption since the set YK for portfolio processes

is involved. We again refer to the remarks at the end of Appendix A.3.2.

Until now, we have always assumed that there is one consumption good,

namely money. If, to the contrary, the utility of an individual does not only

depend on the wealth process as a whole, but also on the individual portfolio

processes, most of the theory still is valid. We need however a more refined

definition of a utility function. In an unconstrained setting, this is the topic of

Bouchard and Mazliak (2003); Kamizono (2003).

The prototypical case where the utility depends on all portfolio processes

is transaction costs. In order to get existence results along the lines of this

chapter, we need superhedging inequalities to characterize the set of candidate

optimal solutions. Such inequalities can be found in Kabanov and Stricker

(2002). Duality results again need the necessary closure properties. Bouchard

and Mazliak (2003) prove such results in the unconstrained case. We also refer

to Deelstra et al. (2001); Kamizono (2001, 2003). Extending these results to

the constrained case would require considerable work, but should be feasible.

Chapter 4

Optimal Portfolios in the

Brownian Model


4.1 Introduction

4.1.1 Motivation

The previous chapters were devoted to the question whether an optimal strat-

egy for the portfolio optimization problem exists and whether this optimal

strategy is unique. We were able to prove existence for a rather general setting

using a method known as the martingale method in combination with dual-

ity theory. Reassuring as such existence and uniqueness results may be, they

usually do not tell us how to actually calculate optimal portfolios. Indeed,

the existence of optimal portfolios relies on Optional Decomposition theorems,

and these theorems only assert existence of certain portfolio processes. It is by

no means straightforward to find explicit characterizations for these portfolio

processes. Even if we restrict attention to the Markovian Brownian market,

matters are not obvious. Only in some special cases, most prominently the com-

plete Brownian market, can we get a linear PDE and boundary conditions that

make it feasible to calculate optimal portfolios directly from Martingale Rep-

resentation Theorems and Malliavin calculus (compare Ocone and Karatzas,

1991; Øksendal, 1997).

In the Markovian Brownian market, the dynamic programming approach

is an alternative to the martingale method for the portfolio problem. By using

the viscosity solution technique (Fleming and Soner, 1993), the dynamic pro-

gramming approach makes it possible to study rather general problem settings.

However, it still leads to a nonlinear and non-degenerate PDE (see Fleming and

Soner, 1993, Chapter 4.3; Merton, 1969, 1971; Korn, 1997, Chapter 3.3). In

the unconstrained case, numerical methods are feasible, although not trivial

(Filitti, 2004; Kushner and Dupois, 2001; Fleming and Soner, 1993, Chapter

9). Even better, we can decompose the nonlinear Hamilton-Jacobi-Bellman

equation into linear PDEs (Karatzas et al., 1987; Cox and Huang, 1989, 1991),

elegantly so, if we use duality theory (see e.g. Karatzas and Shreve, 1998, Theo-

rem 3.12). But if we face the problem of constrained optimization, this usually


does not help much: the problem is a free boundary problem, although it some-

times is possible to transform the initial PDE to a semilinear one with simpler

boundaries. In short: calculating optimal portfolios explicitly is nontrivial.

Even worse, due to the structure of the problem, numerical methods quite
often do not lead to an accurate solution in due time. Indeed, we need a
considerable reduction of the complexity for guessing optimal solutions or studying

numerical schemes.

Therefore, this chapter tries to characterize optimal portfolio weights. To

achieve this goal, we require more structure than inherent in the general semi-

martingale model. We will restrict attention to the most prominent semi-

martingale, namely the Brownian market model, and certain, closely related

jump-diffusion models. The topic of this chapter is not to calculate optimal

portfolios, but to substantially reduce the complexity of the problem, by show-

ing that the portfolio optimization problem in the Brownian model is essentially

a “one-dimensional” problem. An N -dimensional problem can be reduced to a

“one-dimensional” one. Indeed, finding optimal portfolios reduces to finding an

optimal IR-valued process α(·). Having found this process, optimal portfolios

are either on the boundary of the constraint set, or fully determined by α(·) (Theorem 4.3.6). They follow from max π′(·)(µ(·) − r1) subject to π(·) ∈ K and π′(·)σ(·)σ′(·)π(·) = α(·), where K is the constraint set.

4.1.2 Previous Work

As far as calculating optimal portfolio weights for the constrained problem is

concerned, a lot of work has already been done. We will now discuss some of

these results. Before we do so, we emphasize that there are important differ-

ences between most papers available and this chapter. Other papers usually

start by discussing existence and uniqueness of an optimal solution. Only after

having achieved this (usually in a rather general setting), they try to char-

acterize the optimal solution. That is, optimal portfolios are a “by-product”

of these papers. By contrast, this chapter is not concerned with existence
and uniqueness of an optimal strategy, but focuses on the question of what an
optimal solution must look like, provided it exists. In this subsection, we will

summarize the main results concerning the qualitative structure of optimal

portfolios found in other research. Comprehensive surveys of the more general

question of existence and uniqueness in constrained portfolio optimization for

Ito processes can be found in Karatzas and Shreve (1998); Cvitanic (1999).

The following literature survey concentrates on constrained optimal portfolios,

and discusses only landmark results of unconstrained optimal portfolios. We

refer to Korn (1997); Karatzas and Shreve (1998); Liu (1999) for a thorough

literature review in the unconstrained case. We first consider the special case

of time-separable utility functions.

Khanna and Kulldorff (1999) is the only paper the author is aware of
that is similar in spirit insofar as it only cares about candidate optimal
portfolio processes. In a world of Ito processes and arbitrary utility functions,

they show that for cone constraints a mutual fund theorem holds, i.e. we can

assume that an optimal portfolio-proportion process is of the form
$\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}p(\cdot)$ for some measurable $\mathbb{R}$-valued process α(·), and some
process p(·) that is a solution to a quadratic problem. Consequently, the
$\mathbb{R}^N$-valued problem has been reduced to an $\mathbb{R}$-valued one. Extending their

result to arbitrary constraints requires additional assumptions. For the spe-

cial case of a CRRA utility function and constant coefficients (more generally,

time-dependent, deterministic coefficients), Muller (2000) proves such a result

using a similar reasoning as Khanna and Kulldorff (1999) (compare Example

2.2.12 and Remark 2.2.13, p. 80n).

Khanna and Kulldorff (1999); Muller (2000) basically employ geometric

reasoning and some elementary stochastic properties to prove their results.

More frequently used is the martingale method, usually in combination with

the duality method (Shreve and Xu, 1992a,b). A constrained problem can be

transformed to a family of unconstrained problems by adding assets to make

the market fictitiously complete, an observation first made in He and Pearson

(1991a,b); Karatzas et al. (1991); Cvitanic and Karatzas (1992) (see Remark

3.3.4, Example 3.3.6 and Karatzas and Shreve, 1998, Chapter 6). For each


unconstrained problem, we can solve the PDE to find the optimal wealth or

expected utility process in the Brownian model. As a result, we get optimal

portfolios for the fictitiously complete markets. One of these processes mini-

mizes the expected utility, i.e. is the least favorable fictitious completion. This

is the optimal constrained wealth or expected utility process.

In a world with Ito processes, He and Pearson (1991b) use this method to
calculate optimal portfolio weights for the log-utility case without consumption
and with short-sale constraints as $\pi^*(\cdot) = \max([\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1}], 0)$.
This result is extended in Cvitanic and Karatzas (1992) to more general
constraints with consumption. Given some closed convex set $K_N \subset \mathbb{R}^N$, the
constraint that $\pi(t) \in K_N$ for all t leads to a quadratic form (Cvitanic and
Karatzas, 1992, Equation (14.1)). An optimal portfolio can be written as
$\pi^*(\cdot) = [\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1} + \beta(\cdot)]$ for some deterministic process β(·),
which follows from a pointwise minimization. Cvitanic and Karatzas (1992)
also consider several specific constraints like rectangular constraints and
constraints on borrowing, where they actually calculate β(·). The process β(·)
stems from the fictitious market completion as described in Remark 3.3.4 (see

also Examples 3.3.6, 3.3.8, and 3.4.6).
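The short-sale formula for log utility can be checked numerically in a simple case. The sketch below rests on assumptions that are not in the text: constant coefficients, a diagonal matrix σσ′ (so that the pointwise problem decouples across assets), and the standard fact that a log investor maximizes the growth rate π′(µ − r1) − ½π′σσ′π pointwise. Projected gradient ascent is used only as a convenient illustration; the function name, step size and iteration count are arbitrary choices.

```python
def log_optimal_no_short(excess, cov, step=1.0, iterations=2000):
    """Maximize pi'excess - 0.5 * pi'cov pi subject to pi >= 0.

    Projected gradient ascent; `step` has to be small relative to the
    largest eigenvalue of `cov` for the iteration to converge.
    """
    n = len(excess)
    pi = [0.0] * n
    for _ in range(iterations):
        # gradient of the growth rate: excess - cov * pi
        grad = [excess[i] - sum(cov[i][j] * pi[j] for j in range(n))
                for i in range(n)]
        # gradient step followed by projection onto the short-sale constraint
        pi = [max(0.0, p + step * g) for p, g in zip(pi, grad)]
    return pi

excess = [0.04, -0.02]                 # mu - r for two assets
cov = [[0.04, 0.0], [0.0, 0.09]]       # diagonal sigma sigma'
numeric = log_optimal_no_short(excess, cov)
# componentwise formula max([sigma sigma']^{-1}[mu - r 1], 0) in the diagonal case
closed_form = [max(0.0, excess[i] / cov[i][i]) for i in range(2)]
```

In this example the second asset has a negative excess return and is not held, while the first asset receives the unconstrained log-optimal weight.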

While feasible in theory, and quite successfully applied to some problems,

the technique just described does not always lead to satisfactory solutions. An-

other powerful technique, dynamic programming, is not only applicable in the

unconstrained case (Merton, 1969, 1971), but also in the constrained setting.

It requires however that the processes are Markovian. Cvitanic and Karatzas

(1992) use this technique for the CRRA utility case with utility from consump-

tion and terminal wealth and a closed convex set KN ⊂ IRN as constraint.

They prove that optimal portfolios follow again from a quadratic form (an

excellent summary of Cvitanic and Karatzas, 1992 can be found in Karatzas

and Shreve, 1998, Chapter 6). Indeed, an optimal portfolio can be written as

$\pi^*(\cdot) = a[\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1} + \beta(\cdot)]$ for some constant a and some
deterministic process β(·), which follows from a pointwise minimization. For this
result to hold, the constraint set is again $\pi(t) \in K_N$ for some closed convex set
$K_N \subset \mathbb{R}^N$, and the coefficient processes µ(·), σ(·), r(·) have to be deterministic
and continuous. For more general utility functions, they give optimal portfolios
in feedback form. As it turns out, $\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1} + \beta(\cdot)]$
for some measurable $\mathbb{R}$-valued process α(·) and some deterministic process
β(·), which follows from a pointwise minimization.

Independently, Fleming and Zariphopoulou (1991) arrive at a similar result
for the case of constant coefficients and short-selling constraints, under
slightly different assumptions concerning utility functions (their paper has a
much wider scope, since it allows the borrowing rate to differ from the lending
rate), and again consider the CRRA case as an example. And Zariphopoulou (1994); Vila and

Zariphopoulou (1997) discuss the case of borrowing constraints with constant

coefficients and arrive at similar results for infinitely lived agents and more

general utility functions than in Cvitanic and Karatzas (1992).

All the papers discussed so far have in common that the coefficients are de-

terministic. However, Markovian processes can have more general coefficients.

Generalizing the dynamic programming approach, Pham (2002) allows for the

coefficients µ(·) and σ(·) to depend on another Markov process Yt, driven by

a Brownian motion independent of Z(·). Pham (2002) assumes CRRA utility

and some closed convex set $K_N \subset \mathbb{R}^N$ as constraint set. And again, the
optimal portfolio can be written as $\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1} + \beta(\cdot)]$ for
some measurable $\mathbb{R}$-valued process α(·). This time however, the process β(·)
is no longer deterministic, but depends on Yt. Nevertheless, it follows from a
semilinear PDE and a pointwise minimization.

In a seemingly unrelated paper, Li et al. (2002) consider a mean-variance

portfolio selection problem with short-selling constraints and deterministic

coefficients, i.e. the problem of minimizing $E_P[(W_T - E_P[W_T])^2]$ subject to
short-selling constraints and to the constraint that $E_P[W_T] = a$ for some
constant $a \in \mathbb{R}$. This problem can be translated into one where one maximizes

a quadratic utility function (Section 2 in Li et al., 2002; Korn, 1997, Chapter

4.3). That is, it closely resembles a problem with HARA utility, and therefore

it should by now come as no surprise that the optimal portfolios can be written


as $\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1}]$ for some measurable $\mathbb{R}$-valued process

α(·). This is a special case of mean-variance hedging in incomplete markets

(Korn and Trautmann, 1995; Richardson, 1989; Schweizer, 1992, the latter two

papers do not fit in our framework since wealth may become negative). It

should be obvious that the result is only interesting when the constraints are

binding, for otherwise, we could completely “hedge the contingent claim a”.

For utility functions other than time-additive utility functions, results on

constrained optimal portfolio rules are scarce (for a survey in the unconstrained

setting see Campbell and Viceira, 2002, Chapter 5). It suffices to say that all

results available in the literature have the same structure of optimal portfo-

lios as above (see Schroder and Skiadas, 2003, for recursive utility). It seems

however straightforward to extend some results from the unconstrained to the

constrained setting, using the technique of the fictitious market completion;

but we will not dwell on the details.

To sum up, all results have in common that $\pi^*(\cdot) = \alpha(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}p(\cdot)$ for some measurable $\mathbb{R}$-valued process α(·), and some process p(·). The rest

of this chapter is devoted to showing that this is actually a geometric property

of the model that fails only if we “hit” the constraint.

4.2 Model and Standing Assumptions

This section summarizes the model and the standing assumptions that hold

throughout the chapter.

We consider the usual frictionless Brownian market on the time interval

I = [0, T] (see Example 2.1.39 for more details). That is, $(\Omega, \mathcal{F}, P)$ is a
complete probability space, and $Z = (Z^{(1)}, Z^{(2)}, \dots, Z^{(N)})$ is an N-dimensional Brownian
motion with Z(0) = 0 almost surely. Contrary to Example 2.1.39, we allow for

an arbitrary filtration (F(t))t∈I satisfying the usual hypotheses.

The price process for the risky assets satisfies the equation

\[ dS_i(t) = S_i(t)\left[\mu_i(t)\,dt + \sum_{j=1}^{N} \sigma_{ij}(t)\,dZ^{(j)}(t)\right] \quad \forall\, t \in [0, T],\ i = 1, 2, \dots, M, \]


µ(·) being the mean rate of return process and σ(·) being the volatility process,

and the risk-free rate process is denoted by r(·). We assume that the respective

assumptions of Example 2.1.39 concerning µ(·), σ(·), and r(·) hold.¹ The

wealth process (Wt) is given by

\[ W_t = W_0 + \int_0^t W(s)\pi'(s)[\mu(s) - r(s)\mathbf{1}]\,ds + \int_0^t W(s)\pi'(s)\sigma(s)\,dZ(s) - \int_0^t c(s)\,ds; \]

here π(·) is the portfolio-proportion process, and c(·) the consumption process.

The individual maximizes utility from consumption and terminal wealth.

She chooses the optimal portfolio-proportion process π∗(·) subject to certain

constraints K ⊂ La,π(S) to be specified later, and the optimal consumption

process c∗(·). Contrary to Definition 1.2.1, consumption c(·) may become neg-

ative. This allows for an easy interpretation of the consumption process as

net-consumption (or stochastic endowment), i.e. the difference between (labor)

income and gross consumption.
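To make the wealth dynamics concrete, the following sketch simulates one path with an Euler–Maruyama scheme. It is only an illustration under simplifying assumptions that are not part of the model: a single risky asset, constant µ, r, σ and π, and a constant net-consumption rate c. The function name and all parameter values are hypothetical, and the scheme does not enforce the admissibility requirement W(t) ≥ 0.

```python
import math
import random

def simulate_wealth(w0, pi, mu, r, sigma, c, horizon, n_steps, rng):
    """Euler-Maruyama path of dW = W*pi*(mu - r) dt + W*pi*sigma dZ - c dt."""
    dt = horizon / n_steps
    w = w0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        w += w * pi * (mu - r) * dt + w * pi * sigma * dz - c * dt
    return w

rng = random.Random(0)
# terminal wealth of one simulated path with 50% in the risky asset
terminal = simulate_wealth(1.0, 0.5, 0.07, 0.02, 0.2, 0.01, 1.0, 1000, rng)
```

A negative value of `c` corresponds to net income, in line with the interpretation of c(·) as net consumption given above.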

As for the maximization problem, we consider a setting that is basically the

same problem as in Chapter 1. There are two differences: terminal wealth can

only influence utility by its distribution; and we do not assume risk aversion

of any kind. To make this clear, we use a slightly different notation. Consider

the measurable space (I × Ω,Prog), where Prog is the progressive σ-algebra.

Suppose that $u : I \times \Omega \to \mathbb{R}$ is Prog/B[0, T]-measurable, and that u is
nondecreasing in the following sense: c1(·) ≥ c2(·) a.s. ⇒ u(c1(·)) ≥ u(c2(·)) for
two consumption processes c1(·), c2(·). $B : \mathbb{R} \to \mathbb{R}$ is nondecreasing and Borel
measurable.² Then the individual chooses c(·), π(·) ∈ K so as to maximize

u(c(·)) + EP [B(W (T ))]. We still assume that c(·),π(·) must be admissible in

the sense of Definition 1.2.4, i.e. W(t) ≥ 0 for the corresponding wealth
process. This clearly nests Problem 2.2.1. As for the constraints, we choose K, a
closed, convex subset of the space $L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P)$ with the metric of
convergence in probability. Compared to the definition of the constrained set
in the previous chapters, this is a more general setting. Closure with respect
to semimartingale topology implies closure with respect to convergence in
probability. Recall that K is called convex if β, γ ∈ K implies αβ + (1−α)γ ∈ K for
any one-dimensional progressively measurable³ process α such that 0 ≤ α ≤ 1.

¹ As in Example 2.1.39, less restrictive assumptions suffice. Indeed, we only have to ensure that the integrals $\int_0^t \pi'(s)[\mu(s) - r(s)\mathbf{1}]\,ds$ and $\int_0^t \pi'(s)\sigma(s)\,dZ(s)$ exist.

² Measurability of u and B is only needed where we combine the results of this section with the results of the previous one. For most of this chapter, it is irrelevant.

4.3 Optimal Portfolios for Ito Processes

What can we say about the optimal portfolios for this rather general setting?

Surprisingly, quite a lot can be said, as Khanna and Kulldorff (1999) have

shown. Therefore the first subsection is devoted to presenting their result.

Due to the fact that we use a different technique to prove it, our version of

Khanna and Kulldorff’s result requires slightly less restrictive assumptions.

4.3.1 Cone Constraints

Suppose that K is a cone, i.e. π ∈ K ⇒ απ ∈ K for any constant α ≥ 0. Set

$\theta(\cdot) \triangleq \sigma(\cdot)^{-1}(\mu(\cdot) - r(\cdot)\mathbf{1})$. Given this setting, Khanna and Kulldorff (1999)

prove the following theorem.

4.3.1 Theorem. Let $(\pi_*(\cdot), c_*(\cdot))$ be a candidate optimal solution to the
portfolio optimization problem, and $W_*(\cdot)$ be the wealth process for this solution.
Then there is a $(\pi^*(\cdot), c^*(\cdot))$ with

\[ u(c^*(\cdot)) + E_P[B(W^*(T))] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $W^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c^*(\cdot))$. Here

\[ \pi^*(\cdot) = \alpha(\cdot)\,\sigma'(\cdot)^{-1}p^*(\cdot), \]

\[ c^*(\cdot) = c_*(\cdot) + W_*(\cdot)\,(\pi^*(\cdot) - \pi_*(\cdot))'(\mu(\cdot) - r(\cdot)\mathbf{1}), \]

where

\[ \alpha(\cdot) \triangleq \sqrt{\frac{\pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)}{p^{*\prime}(\cdot)p^*(\cdot)}} \]

is a one-dimensional progressively measurable process, and $p^*(\cdot)$ a solution to
the quadratic programming problem

\[ \min_{p(\cdot) \in K} \sum_{i=1}^{N} \left(p^{(i)}(\cdot) - \theta^{(i)}(\cdot)\right)^2. \]

³ Footnote 7 on p. 59 explains the use of progressive measurability instead of predictability.

Proof. This is a special case of Theorem 4.3.6 and Corollary 4.3.8, see Re-

mark 4.3.9. The only thing that is not an immediate consequence of Corollary

4.3.8 is the existence of $p^*(\cdot)$. But this follows from an optimization in
finite-dimensional vector spaces. See Khanna and Kulldorff (1999, Theorem 3).
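The quadratic programming problem in Theorem 4.3.1 is a pointwise projection of θ onto the cone K. As a minimal sketch, assume (purely for illustration, this choice is not made in the text) that K is the nonnegative orthant at each (t, ω); the projection is then a componentwise truncation at zero, and the scale factor α follows from the variance rate of the initial portfolio. All names and numbers below are hypothetical.

```python
import math

def project_onto_orthant(theta):
    """Solve min over p >= 0 of sum_i (p_i - theta_i)^2 componentwise."""
    return [max(0.0, x) for x in theta]

def squared_distance(p, theta):
    return sum((a - b) ** 2 for a, b in zip(p, theta))

def scale_factor(variance_rate, p):
    """alpha = sqrt(variance_rate / p'p); requires p != 0."""
    return math.sqrt(variance_rate / sum(x * x for x in p))

theta = [0.5, -0.2]                    # market price of risk at one (t, omega)
p_star = project_onto_orthant(theta)   # projection onto the cone
alpha = scale_factor(0.09, p_star)     # 0.09: variance rate of the initial portfolio
```

For other cones the projection generally has no closed form and would have to be computed with a quadratic programming routine.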

4.3.2 Remark. If $\mathcal{F}(t) = \mathcal{F}^Z_N(t)$, then α can be written as $\alpha(t, W^*_t)$ for some
Borel-measurable function $\alpha : I \times \mathbb{R}_0^+ \to \mathbb{R}$, where $(W^*_t)$ is the corresponding
optimal wealth process.

Here is a version where we keep consumption unchanged.

4.3.3 Corollary. Let $(\pi_*(\cdot), c_*(\cdot))$ be a candidate optimal solution to the
portfolio optimization problem, and $W_*(\cdot)$ be the wealth process for this solution.
Then there is a $(\pi^*(\cdot), c_*(\cdot))$ with

\[ u(c_*(\cdot)) + E_P[B(W^*(T))] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $W^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c_*(\cdot))$. Here

\[ \pi^*(\cdot) = \alpha(\cdot)\,\sigma'(\cdot)^{-1}p^*(\cdot), \]

where

\[ \alpha(\cdot) \triangleq \sqrt{\frac{\pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)}{p^{*\prime}(\cdot)p^*(\cdot)}} \]

is a one-dimensional progressively measurable process.

Proof. See Corollary 4.3.7 and Corollary 4.3.8.


4.3.2 Closed Constraints

Given a portfolio-proportion process $\pi_*(\cdot)$, the key insight from Khanna and
Kulldorff (1999) is that we can find a portfolio-proportion process $\pi^*(\cdot)$
satisfying $\pi^{*\prime}(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1}) \ge \pi_*'(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1})$, but
$\pi^{*\prime}(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi^*(\cdot) = \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)$ almost surely.
In words: there exists a portfolio-proportion process with higher drift, but the
same “volatility process”. We therefore can use the higher drift to consume more
now, and end up with the same distribution of terminal wealth. Any individual
preferring more consumption to less will therefore prefer $\pi^*(\cdot)$. The next lemma
will characterize this $\pi^*(\cdot)$. It is

an extension of a similar lemma given in Khanna and Kulldorff (1999).

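4.3.4 Lemma. Let $K \subset L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P)$ be closed, $\pi_*(\cdot) \in K$ an
N-dimensional portfolio-proportion process. Then there exists an N-dimensional
portfolio-proportion process $\pi^*(\cdot) \in K$ such that

(i) $\pi^*(\cdot)$ is on the boundary of K, or $\pi^{*\prime}(\cdot) = \alpha(\cdot)\theta'(\cdot)\sigma(\cdot)^{-1}$, where

\[ \theta(\cdot) \triangleq \begin{cases} \sigma^{-1}(\cdot)[\mu(\cdot) - r(\cdot)\mathbf{1}] & \text{on } C^c, \\ \mathbf{1} & \text{on } C, \end{cases} \]

with $C \triangleq \{(t, \omega) : [\mu(\cdot) - r(\cdot)\mathbf{1}] = 0\}$ and

\[ \alpha(\cdot) \triangleq \sqrt{\frac{\pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)}{\theta'(\cdot)\theta(\cdot)}}. \]

If $\lambda \otimes P(C) = 0$, the solution is unique almost surely.

(ii) $\pi^{*\prime}(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1}) \ge \pi_*'(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1})$ almost surely;

(iii) $\pi^{*\prime}(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi^*(\cdot) = \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)$ almost surely.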

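Proof. Suppose that $\pi_*(\cdot)$ is in the interior of K (otherwise there is nothing to
prove). Define

\[ D \triangleq \{\pi(\cdot) \in L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P) : \pi'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi(\cdot) \le \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot) \text{ a.s.}\}. \]

Pointwise optimization of $h(\pi(\cdot)) \triangleq \pi'(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1})$ on D gives us
$\tilde\pi'(\cdot) = \alpha(\cdot)\theta'(\cdot)\sigma(\cdot)^{-1}$ with $h(\tilde\pi(\cdot)) = \sup_{\pi(\cdot) \in D} h(\pi(\cdot))$ (see also Remark B.3.2).
If $\tilde\pi(\cdot) \in K$, we are done with $\pi^*(\cdot) = \tilde\pi(\cdot)$. Otherwise, let

\[ B \triangleq \{\pi : \pi'(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1}) \ge \pi_*'(\cdot)(\mu(\cdot) - r(\cdot)\mathbf{1}) \text{ a.s.}\}. \]

Now $G \triangleq B \cap \partial D$ is closed with respect to convergence in probability ($\partial D$ is
the boundary of D). Since $\pi_*(\cdot), \tilde\pi(\cdot) \in G$, $\pi_*(\cdot) \in K$, but $\tilde\pi(\cdot) \notin K$, there
must be points from the boundary of K in G ∩ K. This proves the lemma.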
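For the unconstrained branch of Lemma 4.3.4, properties (ii) and (iii) can be verified numerically. The sketch below assumes a hypothetical market with two assets, a constant invertible 2×2 volatility matrix and a nonzero excess return vector (so that the set C is empty); the helper names are made up for this example. It rescales the direction σ′⁻¹θ so that the variance rate of the initial portfolio is preserved.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def variance_rate(sigma, pi):
    exposure = matvec(transpose(sigma), pi)   # sigma' pi
    return dot(exposure, exposure)            # pi' sigma sigma' pi

def improve_portfolio(sigma, excess, pi_initial):
    """Return alpha * (sigma')^{-1} theta with the variance rate of pi_initial."""
    theta = matvec(inverse_2x2(sigma), excess)            # market price of risk
    alpha = math.sqrt(variance_rate(sigma, pi_initial) / dot(theta, theta))
    direction = matvec(inverse_2x2(transpose(sigma)), theta)
    return [alpha * x for x in direction]

sigma = [[0.2, 0.0], [0.1, 0.3]]
excess = [0.04, 0.06]          # mu - r 1
pi_initial = [0.5, 0.2]
pi_improved = improve_portfolio(sigma, excess, pi_initial)
```

By the Cauchy–Schwarz inequality the drift of `pi_improved` equals the square root of the variance rate times the norm of θ, which is the largest drift attainable at that variance rate.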

4.3.5 Remark. Up to now, we have always assumed that σ−1(·) exists almost

surely. This assumption is not necessary at all. It is not even necessary that
σ(·) is a square matrix. To see this, let M be the number of stocks, and N be the
number of Brownian motions, i.e. σ(t) is an M × N matrix. If M > N, then we

can drop at least M − N linearly dependent rows of σ(t), and the respective

stocks. In an arbitrage-free market, this does not change anything. Hence,

without loss of generality, we can assume that M ≤ N , and that σ(t) has

rank M (see Karatzas and Shreve, 1998, Remark 1.4.10, for a more refined

argument). It then suffices to substitute the pseudo-inverse $\sigma'(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}$
for the inverse $\sigma^{-1}(\cdot)$ (see Luenberger, 1969, Chapter 6.11, for a definition of
the pseudo-inverse), i.e. $\theta(\cdot) = \sigma'(\cdot)[\sigma(\cdot)\sigma'(\cdot)]^{-1}[\mu(\cdot) - r(\cdot)\mathbf{1}]$. However, the
solution is no longer necessarily unique, even if $\lambda \otimes P(C) = 0$. To keep the
notation simple, we will nevertheless continue to assume that $\sigma^{-1}(\cdot)$ exists almost

surely, and rely on the reader’s ability to make the necessary generalizations.
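The substitution of the pseudo-inverse can be illustrated with a small example. The sketch assumes a hypothetical market with M = 2 stocks and N = 3 Brownian motions and a constant volatility matrix of full row rank; the function name and the numbers are made up. It computes θ = σ′[σσ′]⁻¹[µ − r1] and checks that σθ reproduces the excess returns.

```python
def market_price_of_risk(sigma, excess):
    """theta = sigma' (sigma sigma')^{-1} excess for a 2 x N matrix sigma."""
    n = len(sigma[0])
    # Gram matrix sigma sigma' (2 x 2), invertible if sigma has rank 2
    gram = [[sum(sigma[i][k] * sigma[j][k] for k in range(n)) for j in range(2)]
            for i in range(2)]
    (a, b), (c, d) = gram
    det = a * d - b * c
    # y = (sigma sigma')^{-1} excess via the explicit 2 x 2 inverse
    y = [(d * excess[0] - b * excess[1]) / det,
         (-c * excess[0] + a * excess[1]) / det]
    # theta = sigma' y
    return [sum(sigma[i][k] * y[i] for i in range(2)) for k in range(n)]

sigma = [[0.2, 0.1, 0.0],
         [0.0, 0.1, 0.3]]      # 2 stocks, 3 Brownian motions
excess = [0.04, 0.06]          # mu - r 1
theta = market_price_of_risk(sigma, excess)
# sigma * theta should give back the excess returns
reproduced = [sum(sigma[i][k] * theta[k] for k in range(3)) for i in range(2)]
```

Among all vectors θ with σθ = µ − r1, the pseudo-inverse selects the one of minimal Euclidean norm; other solutions differ from it by elements of the kernel of σ.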

By Lemma 4.3.4 we have a portfolio process with a higher drift than the

initial one, but the same “variance”. We can use this portfolio process to

consume more now, and still have the same distribution of terminal wealth. The

technique of the proof differs slightly from the one by Khanna and Kulldorff

(1999), however. Instead of relying on the assumption of the existence of

a unique weak solution, we use properties of the stochastic integral and the

existence of a unique strong solution.

4.3.6 Theorem. Let $(\pi_*(\cdot), c_*(\cdot))$ be a candidate optimal solution to the
optimization problem with a closed constraint set $K \subset L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P)$, and
$W_*(\cdot)$ be the respective wealth process. Then there exists $(\pi^*(\cdot), c^*(\cdot))$ with

\[ u(c^*(\cdot)) + E_P[B(W^*(T))] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $W^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c^*(\cdot))$.
$\pi^*(\cdot)$ is either on the (relative) boundary of the constrained set K, or
$\pi^{*\prime}(\cdot) = \alpha(\cdot)\theta'(\cdot)\sigma(\cdot)^{-1}$ for some progressively measurable process α(·), and

\[ c^*(\cdot) = c_*(\cdot) + W_*(\cdot)\,(\pi^*(\cdot) - \pi_*(\cdot))'(\mu(\cdot) - r(\cdot)\mathbf{1}). \]

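Proof. Let $\pi^*(\cdot)$ be as in Lemma 4.3.4. Then $c^*(\cdot)$ is progressively
measurable, and $c^*(\cdot) \ge c_*(\cdot)$ is immediate, hence $u(c^*(\cdot)) \ge u(c_*(\cdot))$. Suppose we
have shown that $W_*(t)$ and $W^*(t)$ have the same distribution for all t ∈ I.
Then $W^*(t) \ge 0$ almost surely, which is the admissibility of $(\pi^*(\cdot), c^*(\cdot))$, and
$E_P[B(W^*(T))] = E_P[B(W_*(T))]$.

What remains to be shown is the equivalence in distribution of $W_*(t)$ and
$W^*(t)$. By definition

\[ \begin{aligned} W_*(t) &= W_*(0) + \int_0^t W_*(s)\pi_*'(s)(\mu(s) - r(s)\mathbf{1})\,ds + \int_0^t W_*(s)\pi_*'(s)\sigma(s)\,dZ(s) - \int_0^t c_*(s)\,ds \\ &= W_*(0) + \int_0^t W_*(s)\pi^{*\prime}(s)(\mu(s) - r(s)\mathbf{1})\,ds + \int_0^t W_*(s)\pi_*'(s)\sigma(s)\,dZ(s) - \int_0^t c^*(s)\,ds. \end{aligned} \]

The solution to this stochastic differential equation is given by (Corollary C.3.2)

\[ W_*(t) = \zeta^f_t \zeta^i_t \left[W_*(0) - \int_0^t \frac{c^*(s)}{\zeta^f_s \zeta^i_s}\,ds\right], \]

where

\[ \zeta^f_t \triangleq \exp\left\{\int_0^t \pi^{*\prime}(s)(\mu(s) - r(s)\mathbf{1}) - \tfrac{1}{2}\pi_*'(s)\sigma(s)\sigma'(s)\pi_*(s)\,ds\right\}, \]

\[ \zeta^i_t \triangleq \exp\left\{\int_0^t \pi_*'(s)\sigma(s)\,dZ(s)\right\}. \]

From Lemma 4.3.4,

\[ \zeta^f_t = \exp\left\{\int_0^t \pi^{*\prime}(s)(\mu(s) - r(s)\mathbf{1}) - \tfrac{1}{2}\pi^{*\prime}(s)\sigma(s)\sigma'(s)\pi^*(s)\,ds\right\} \]

almost surely. From standard results $\zeta^i_t$ and

\[ \tilde\zeta^i_t \triangleq \exp\left\{\int_0^t \pi^{*\prime}(s)\sigma(s)\,dZ(s)\right\} \]

have the same distribution (cf. Lemma C.2.1). This and that $\zeta^i_t / \zeta^i_s$ and $\tilde\zeta^i_t / \tilde\zeta^i_s$ have
the same law, conditional on $\mathcal{F}(s)$ (Lemma C.2.1), imply that $W_*(t)$ and

\[ W^*(t) \triangleq \zeta^f_t \tilde\zeta^i_t \left[W_*(0) - \int_0^t \frac{c^*(s)}{\zeta^f_s \tilde\zeta^i_s}\,ds\right] \]

have the same distribution. Since the latter is the solution to the stochastic
differential equation

\[ W^*(t) = W^*(0) + \int_0^t W^*(s)\pi^{*\prime}(s)(\mu(s) - r(s)\mathbf{1})\,ds + \int_0^t W^*(s)\pi^{*\prime}(s)\sigma(s)\,dZ(s) - \int_0^t c^*(s)\,ds, \]

this completes the proof.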

Instead of keeping the distribution of terminal wealth the same, we could

use a comparison theorem. Then we can keep the consumption process the

same and change terminal wealth.

4.3.7 Corollary. Let $(\pi_*(\cdot), c_*(\cdot))$ be a candidate optimal solution to the
optimization problem with a closed constraint set $K \subset L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P)$, and
$W_*(\cdot)$ be the respective wealth process. Then there exists $(\pi^*(\cdot), c_*(\cdot))$ with

\[ u(c_*(\cdot)) + E_P[B(W^*(T))] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $W^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c_*(\cdot))$.
$\pi^*(\cdot)$ is either on the (relative) boundary of the constrained set K, or
$\pi^{*\prime}(\cdot) = \alpha(\cdot)\theta'(\cdot)\sigma(\cdot)^{-1}$ for some progressively measurable process α(·).


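Proof. With the same notation as in the proof of Theorem 4.3.6, set

\[ W_*(t) = W_*(0) + \int_0^t W_*(s)\,\pi_*'(s)\bigl(\mu(s) - r(s)\mathbf{1}\bigr)\,ds + \int_0^t W_*(s)\,\pi_*'(s)\sigma(s)\,dZ(s) - \int_0^t c_*(s)\,ds \]

and

\[ \widetilde{W}(t) = W_*(0) + \int_0^t \widetilde{W}(s)\,\pi^{*\prime}(s)\bigl(\mu(s) - r(s)\mathbf{1}\bigr)\,ds + \int_0^t \widetilde{W}(s)\,\pi_*'(s)\sigma(s)\,dZ(s) - \int_0^t c_*(s)\,ds. \]

Using similar arguments as before, $\widetilde{W}(t)$ has the same distribution as the solution to

\[ W^*(t) = W_*(0) + \int_0^t W^*(s)\,\pi^{*\prime}(s)\bigl(\mu(s) - r(s)\mathbf{1}\bigr)\,ds + \int_0^t W^*(s)\,\pi^{*\prime}(s)\sigma(s)\,dZ(s) - \int_0^t c_*(s)\,ds. \]

Standard comparison results then imply that $W_*(t) \le \widetilde{W}(t)$ almost surely (Theorem C.4.1), and therefore

\[ E_P[B(W^*(t))] = E_P\bigl[B(\widetilde{W}(t))\bigr] \ge E_P[B(W_*(t))]. \]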

Under additional assumptions, satisfied, for example, if the constraint is a

cone, a static optimization problem can be solved to find a better solution.

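4.3.8 Corollary. Let $(\pi_*(\cdot), c_*(\cdot))$ be a candidate optimal solution to the portfolio optimization problem with constraint set $K \subset L^0(I \times \Omega, \mathrm{Prog}, \lambda \otimes P)$ closed and convex. Suppose that $K$ is such that we can optimize point-wise. Let $W_*(\cdot)$ be the wealth process for this solution. Let $\pi^*(\cdot)$ be a solution to

\[
\begin{aligned}
\max_{\pi}\quad & \pi'(\cdot)\bigl(\mu(\cdot) - r\mathbf{1}\bigr) \\
\text{s.t.}\quad & \pi(\cdot) \in K, \\
& \pi'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi(\cdot) \le \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot),
\end{aligned}
\]

and suppose that $\pi^{*\prime}(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi^*(\cdot) = \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot)$ almost surely. Then

\[ u(c_*(\cdot)) + E_P[B(W^*(T))] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $W^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c_*(\cdot))$. Similarly

\[ u(c^*(\cdot)) + E_P\bigl[B(\widetilde{W}^*(T))\bigr] \ge u(c_*(\cdot)) + E_P[B(W_*(T))], \]

where $\widetilde{W}^*(\cdot)$ is the wealth process belonging to $(\pi^*(\cdot), c^*(\cdot))$ and

\[ c^*(\cdot) = c_*(\cdot) + W_*(\cdot)\,\bigl(\pi^*(\cdot) - \pi_*(\cdot)\bigr)'\bigl(\mu(\cdot) - r(\cdot)\mathbf{1}\bigr). \]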

Proof. Note that

\[ D_K \triangleq \bigl\{ \pi(\cdot) \in K : \pi'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi(\cdot) \le \pi_*'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi_*(\cdot) \text{ a.s.} \bigr\} \]

has an element $\pi^*(\cdot) \in D_K$ that maximizes $\pi'(\mu(\cdot) - r(\cdot)\mathbf{1})$ point-wise, i.e. the optimization problem is well-defined. The rest then follows just as in the previous results, if we use the fact that the processes have the same “variance”.

4.3.9 Remark. Rewrite the static optimization problem in Corollary 4.3.8 as

\[
\begin{aligned}
\max_{\pi}\quad & \pi'(\cdot)\bigl(\mu(\cdot) - r\mathbf{1}\bigr) \\
\text{s.t.}\quad & \pi(\cdot) \in K, \\
& \pi'(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi(\cdot) \le \alpha(\cdot)
\end{aligned}
\]

for some progressively measurable process $\alpha(\cdot) \ge 0$. That means finding an optimal $(\pi^*(\cdot), c^*(\cdot))$ can be decomposed into three different steps:

(i) solve the static problem to find an optimal $\pi_\alpha(\cdot)$, conditional on $\alpha(\cdot)$;

(ii) find an optimal $\alpha^*_c(\cdot)$ conditional on $c(\cdot)$;

(iii) find the optimal $c^*(\cdot)$, i.e. the optimal combination of running consumption and terminal wealth.

Although this procedure is not immediately useful for finding explicit optimal solutions (finding $\alpha^*_c(\cdot)$ remains difficult), it should prove useful for numerical schemes. Instead of solving a PDE, we can solve a sequence of static problems. One advantage of this procedure, however, is that it shows that $\mathbb{R}^N$-valued problems are just as easy to solve as $\mathbb{R}$-valued ones.

The only obstacle is the assumption that $\pi^{*\prime}(\cdot)\sigma(\cdot)\sigma'(\cdot)\pi^*(\cdot) = \alpha(\cdot)$. This assumption can only be guaranteed for very special constraints. Examples are the cone constraints of the previous section. Other examples are constraint sets with shapes of balls, spheres, some triangles, and so on.
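Step (i) of the decomposition can be illustrated in the simplest special case. The sketch below assumes $K = \mathbb{R}^N$ (no portfolio constraint beyond the variance bound), constant coefficients at a single point in time, and an invertible volatility matrix; the function name and its arguments are hypothetical. Under these assumptions the Cauchy-Schwarz inequality gives the maximizer in closed form, proportional to $(\sigma\sigma')^{-1}(\mu - r\mathbf{1})$ and scaled so that the variance bound binds.

```python
import numpy as np

def max_excess_return(mu_excess, sigma, alpha):
    """Maximize pi' mu_excess subject to pi' (sigma sigma') pi <= alpha.

    Assumes K = R^N, sigma invertible, mu_excess != 0 and alpha >= 0.
    The maximizer is sqrt(alpha / (m' C^{-1} m)) * C^{-1} m with C = sigma sigma'.
    """
    cov = sigma @ sigma.T
    direction = np.linalg.solve(cov, mu_excess)       # C^{-1} m
    scale = np.sqrt(alpha / (mu_excess @ direction))  # makes the bound bind
    return scale * direction
```

For a genuinely constrained set $K$ the static problem generally has no closed form and would have to be handed to a numerical solver; the sketch only shows that the variance constraint is binding at the optimum, which is the assumption discussed in the remark.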

4.4 Extensions and Ramifications

There are many other occasions where the reasoning used throughout this chapter works equally well. For example, the proofs work not only for portfolio-proportion processes, but also for portfolio processes. We leave the details to the reader. We also observe that the theorems remain true for $B \equiv 0$. Therefore all the results can be extended to infinitely lived agents, possibly without utility from bequest.

From the problem setting it should be clear that, besides time-additive specifications, most specifications of a recursive utility function, or history-dependent utility functions, fit into this framework. For utility functions such as those of Hindy et al. (1992, 1997), we have to use a slightly different consumption process $C$ (Section 3.5.4), but this poses no major problems. The assumption that $c$ (or $C$) may become negative implies that the case of a stochastic endowment process is also covered.

We can clearly apply the same logic to an Asset / Liability model. We split the liability into a hedgeable and a non-hedgeable part, and consider the non-hedgeable part as (another) portfolio constraint. This also shows that certain state-dependent utility functions are feasible, too. If the state-dependent utility function can be written as $\widetilde{B}(W_T) = B(Y W_T)$ for some state-independent utility function $B$, and the random variable $Y$ is an attainable contingent claim, the reasoning extends to this setting, too.

The results concerning the structure of optimal portfolios still hold if we

are faced with constraints on the distribution of terminal wealth, e.g. VaR-type

constraints, or constraints like the one that we must be almost surely above a

given threshold.

A minor modification in the direction of more general processes than Brownian motion is possible. Let $L$ be a one-dimensional, symmetric Lévy process (Bertoin, 1996, Chapter 2.1). Then we can substitute $Z$ with $\widetilde{L} \triangleq Z + L$, and all of the theory remains true, except for proofs using comparison theorems (e.g. Corollary 4.3.7). The somewhat strange definition of $\widetilde{L}$ ensures that Lemma C.2.1 is still valid.

Appendix A

A General Semimartingale Model

There is an abundance of excellent literature in the field of mathematical finance (Björk, 1998; Elliott and Kopp, 1999; Kallianpur and Karandikar, 2000;

Karatzas and Shreve, 1998; Protter, 2001). We can refer the interested reader

to any of these sources for an in-depth discussion of the models used in this

thesis. The appendix therefore aims at unifying the notation by presenting and

discussing the model used. It is not suitable as an introduction. The thesis

requires a good deal of stochastic calculus. We use Protter (1990) as a refer-

ence. Other excellent books are Bichteler (2002); Liptser and Shiryaev (1989);

Rogers and Williams (1994a,b). As a shorthand introduction to stochastic in-

tegration consult Kallianpur and Karandikar (2000). Most of these books do

not extensively cover integration of vector-valued semimartingales; see Cherny

and Shiryaev (2001) for a discussion.

Before we start, recall our convention that all processes defined or taken as

given are adapted. All other properties of a stochastic process will be stated

(but see Footnote 6 on p. 131).


A.1 Stochastic Setting

On a probability space $(\Omega, \mathcal{F}, P)$, let $S = (S_t)_{t \in I}$ be a vector-valued, locally bounded¹ semimartingale² for the right-continuous filtration $(\mathcal{F}(t))_{t \in I}$.³ We always assume $S > 0$ almost surely to avoid technicalities. Here, we write $I \subset \mathbb{R}^+_0$, with $0 \in I$ (note that the convention is that $\mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\}$, $\mathbb{R}^+_0 = \{x \in \mathbb{R} : x \ge 0\}$, with $\mathbb{R}^-$, $\mathbb{R}^-_0$ defined accordingly), e.g. $I = [0, T]$ for some $T \in \mathbb{R}^+$, or $I = \{1, 2, 3, \dots, T\}$. By convention we denote by $T$ the maximal element of $I$. Since we can embed a discrete stochastic process on $\{1, 2, 3, \dots, T\}$ in the interval $[0, T]$ (see e.g. Cherny and Shiryaev, 2001, Remark after the proof of Theorem 1.5 in Section 7.1), and similarly treat the utility functions, we will usually think of $I$ as an interval. We will always assume $T$ finite but note that everything should hold for $T$ infinite (modulo some integrability conditions). For simplicity, we also assume that $\mathcal{F}(0)$ is almost trivial and contains all the $P$-null sets (i.e. $(\mathcal{F}(t))_{t \in I}$ satisfies the usual hypotheses), and that $\mathcal{F}(T) = \mathcal{F}$.

We denote by $\lambda$ a suitable measure on $I$, say the Lebesgue measure if $[0, T]$ is an interval of $\mathbb{R}^+_0$, or the counting measure in case of a discrete set of time points.⁴ Without loss of generality, we make the standing assumption that $\lambda(\{0\}) = 0$. Any integral of the type $\int_0^t f(s)\,ds$ is to be understood in the Lebesgue sense with respect to the measure $\lambda$. We take the stand (where applicable) that $\int f\,d\mu$ is defined for $f \in L^0(\mu)$ if either $f^+ \in L^1(\mu)$ or $f^- \in L^1(\mu)$; i.e., we are perfectly happy if $\int f\,d\mu = \infty$ ($-\infty$ respectively).

¹ With some extra effort it should be possible to extend the results to the general (not locally bounded) case, if we are willing to introduce the notion of a sigma-martingale, also known as a martingale transform or a “semimartingale de la classe $(\Sigma_m)$”. See Cherny and Shiryaev (2001, Section 5) for a rigorous discussion. Usually, locally bounded semimartingales are sufficiently general: they include all continuous processes and all càdlàg processes with uniformly bounded jumps.

² We will not use any special notation for e.g. a semimartingale $M : I \times \Omega \mapsto \mathbb{R}$ and its vector-valued equivalent $M : I \times \Omega \mapsto \mathbb{R}^N$. Instead, we rely on the reader's ability to distinguish between a semimartingale, i.e. an $\mathbb{R}$-valued process, and its vector-valued counterpart, depending on the context.

³ To streamline notation, we will often drop the subscript $t \in I$, and simply write $(S_t)$. We will write $S$ wherever we “think more in terms of” a mapping $S : I \times \Omega \mapsto \mathbb{R}^N$. Instead $(S_t)$ is used if we “think more in terms of” a process. $S_t : \Omega \mapsto \mathbb{R}^N$ stands for the respective $\mathcal{F}(t)$-measurable random variable, and $S(t, \omega) \in \mathbb{R}^N$ for a concrete realization, $t \in I$, $\omega \in \Omega$.

⁴ We are a little bit short on details as to what a “suitable measure” is. Note however that if $\lambda$ is diffuse, everything is perfect. Finite mass for a finite number of points of the interval is also acceptable, as long as $(\Omega, \mathcal{F}, P)$ is non-atomic. In the main body of text this is implicitly done by treating utility from terminal wealth separately with the help of a bequest function. This is equivalent to using a measure assigning finite mass to $T$. The results can be extended to the non-atomic case except for a few exceptions.

We will use the notation $f > 0$ for any functional $f : X \mapsto \mathbb{R}^N$ to mean $(f(x))^{(i)} > 0$ for all $x \in X$, $i = 1, \dots, N$, and similarly define the other inequalities ‘≥’, ‘<’, ‘≤’. Thus for a semimartingale $S$, $S > 0$ means $S^{(i)}(t, \omega) > 0$ for all $(t, \omega) \in I \times \Omega$, and $S_t > 0$ means $S^{(i)}(t, \omega) > 0$ for all $\omega \in \Omega$, $i = 1, \dots, N$. For two functions $c^*(\cdot) : I \times \Omega \mapsto \mathbb{R}$, $c(\cdot) : I \times \Omega \mapsto \mathbb{R}$, $c^* \ge c$ means $c^*(t, \omega) - c(t, \omega) \ge 0$ for all $(t, \omega) \in I \times \Omega$, and so on. If we qualify any of these inequalities by saying that it holds only “almost surely”, the natural measure ($\lambda \otimes P$ or $P$) for that situation is understood to be taken.

Finally, we denote by $\mathcal{P}$ the predictable σ-algebra, i.e. the smallest σ-algebra making all adapted processes with càglàd paths measurable. Then $(I \times \Omega, \mathcal{P}, \lambda \otimes P)$ is a finite measure space, since $T < \infty$.

Let $\xi \in L(S)$ be a predictable process that serves as an integrand for the semimartingale $S$ (i.e. is an $S$-integrable $N$-dimensional process), where we denote by $L(S) \subset L^0(I \times \Omega, \mathcal{P}, \lambda \otimes P)$ the vector space (Protter, 1990, Chapter 4, Theorem 16) of all $S$-integrable predictable processes. To highlight the fact that the semimartingale $S$ and the predictable process $\xi$ are $\mathbb{R}^N$-valued, we will use the symbol ‘·’, which stands for the inner product. Thus $\int_0^{\cdot} \xi_s \cdot dS_s$ should be read as a vector stochastic integral⁵ that can be understood as $\int_0^{\cdot} \sum_{i=1}^N \xi^{(i)}_s\,dS^{(i)}_s$.⁶ This notation is not standard. However, it is true in the case where the vector stochastic integral is indeed a Lebesgue integral. It is also true if the integrals $\int_0^{\cdot} \xi^{(i)}_s\,dS^{(i)}_s$ all exist; in this case we have

\[ \int_0^{\cdot} \sum_{i=1}^N \xi^{(i)}_s\,dS^{(i)}_s = \sum_{i=1}^N \int_0^{\cdot} \xi^{(i)}_s\,dS^{(i)}_s. \]

Therefore it seems natural to use this notation in general.

⁵ We always use vector stochastic integrals, but are a little bit lax on occasion with respect to notation. We can allow ourselves this little slip, since throughout this thesis we only need closure with respect to one-dimensional semimartingales (the one exception being the fundamental theorems of asset pricing we sometimes use, but which are not at the core of our interest). We refer to Cherny and Shiryaev (2001) for a proper discussion of these issues.

⁶ We will always select a right-continuous version of the process $\int_0^{\cdot} \xi_s \cdot dS_s$. The same applies wherever we define a process by $X_{\cdot} = E_P[X \mid \mathcal{F}(\cdot)]$ for some random variable $X \in L^1(P)$. As for integration with respect to $\mathbb{R}^N$-valued processes, see Cherny and Shiryaev (2001); Bichteler (2002, Chapter 3).

We will occasionally use explicit matrix notation, and then write $x'(t)y(t)$ for a matrix multiplication of two processes for example. Where this is done, we will always set the involved variables in a bold face. As there is no rule without an exception, given two $N$-dimensional processes $S$, $\xi$ we will write $\bigl(\frac{1}{S_t}\bigr)$ for the process

\[ \Bigl( \frac{1}{S^{(1)}_t}, \frac{1}{S^{(2)}_t}, \dots, \frac{1}{S^{(N)}_t} \Bigr) \]

and $(\xi_t S_t)$ for

\[ \bigl( \xi^{(1)}_t S^{(1)}_t,\ \xi^{(2)}_t S^{(2)}_t,\ \dots,\ \xi^{(N)}_t S^{(N)}_t \bigr). \]

For example,

\[ \int_0^t \pi_s \cdot \frac{dS_s}{S_{s-}} = \int_0^t \pi_s \cdot \Bigl( \frac{1}{S_{s-}}\,dS_s \Bigr) \]

should be read as a vector stochastic integral which resembles

\[ \int_0^t \sum_{i=1}^N \pi^{(i)}_s\,\frac{1}{S^{(i)}_{s-}}\,dS^{(i)}_s. \]

We further assume existence of a probability measure Qm equivalent to $P$ such that all processes in the set

\[ \mathcal{S} = \Bigl\{ X \ge 0 \text{ a.s.} : X_t = X_0 + \int_{0+}^t \xi_s \cdot dS_s,\ \xi \in L(S) \Bigr\} \]

are local martingales (an equivalent local martingale measure). Such a measure Qm exists as long as the financial market offers no free lunches in a properly defined way (see Delbaen and Schachermayer, 1997, and the references therein for details). We denote by $\mathcal{M}(S)$ the space of all equivalent local martingale measures, and clearly $\mathcal{M}(S) \ne \emptyset$ by assumption.

A.1.1 Remark. Many authors define $\mathcal{M}(S)$ to be the set of equivalent measures such that the process $S$ is a local martingale. As is by now well known, the two definitions coincide. Clearly for $Q \in \mathcal{M}(S)$, the process $S$ must be a local martingale, as we can simply choose $\xi \equiv 1$. The converse implication follows from Émery (1980, Corollaire 3.5). But our definition has the additional advantage that it allows for a straightforward extension to the case of portfolio processes with additional constraints. For example, let $K \subset L(S)$ be a convex cone. Then we define $\mathcal{M}(\mathcal{S}^K)$ to be the set of all equivalent measures such that all processes in the set

\[ \mathcal{S}^K = \Bigl\{ X \ge 0 \text{ a.s.} : X_t = X_0 + \int_{0+}^t \xi_s \cdot dS_s,\ \xi \in K \Bigr\} \]

are local supermartingales, indeed supermartingales. Later on, we will extensively use a similar definition.

Now that we have discussed the basics, let us turn to the model of the financial market. A (“discounted”) wealth process is defined by

\[ W \triangleq W_0 + \int_{0+}^{\cdot} \xi_s \cdot dS_s \quad P\text{-a.s.} \tag{A.1} \]

where $W_0 \in \mathbb{R}^+$.

A.1.2 Remark. From the definition of the stochastic integral $I_t = \int_0^t \xi_s \cdot dS_s$, it follows that $I_0 = \xi_0 \cdot S_0$. However this means that we have to bother about the contribution of the integral at 0 for our discounted wealth processes. We do so by writing

\[ \int_{0+}^t \xi_s \cdot dS_s = I_t - \xi_0 \cdot S_0 = \int_0^t \xi_s \cdot dS_s - \xi_0 \cdot S_0 \]

to denote integration over $(0, t]$. If $S_0 = 0$ holds (e.g. in the Brownian motion case) we can omit this subtlety, something we freely do without further mention. In principle the same applies to Stieltjes integrals, which are only a special case of the above definition (Protter, 1990, Chapter 4, Theorem 26). But the reader can check that throughout the text $S_0 = 0$ holds for the integrator of any integral that can be interpreted in the usual Stieltjes way.

We call a predictable portfolio process⁷ $\xi \in L(S)$ admissible if $W_T$ exists and $W_t$ is bounded from below by zero, i.e.

\[ W_t \ge 0 \quad \forall\, t \in I \quad P\text{-a.s.} \tag{A.2} \]

for the wealth process (A.1). We denote the set of all admissible processes $\xi$ by $L_a(S, W_0)$. The restriction that $W_t$ be non-negative is both rational from an economic point of view (nobody can have negative wealth, since the minimum is certainly having nothing, as long as the individual does not possess a stochastic endowment process against which she can borrow) and sufficient to prevent arbitrage possibilities from so-called doubling strategies (see e.g. Karatzas and Shreve, 1998, Example 1.2.3).

⁷ In this thesis, a portfolio process captures the number of stocks held. Other authors (e.g. Karatzas and Shreve, 1998) call the money invested in a certain stock, i.e. the process $\xi S$ with our notation, a portfolio process.

In mathematical finance, one thinks of “dSt” as the absolute return —

maybe discounted by the risk-free rate — of holding one unit of the risky

assets S over the “next instant” (we will omit the qualification “risky” from

now on). The number of assets held is captured by the admissible process ξ.

It is sometimes convenient to use the portfolio-proportion process $\pi$ instead. The latter is defined by

\[ \pi_t = \frac{1}{W_{t-}}\,\xi_t S_{t-}\, I_{\{W_{t-} > 0\}} \quad \forall\, t \in I \setminus \{0\} \quad P\text{-a.s.} \tag{A.3} \]

$L_\pi(S) \triangleq \{\pi \text{ defined by (A.3)},\ \xi \in L(S)\}$ is the set of all integrable portfolio-proportion processes, and the set of all portfolio-proportion processes generated by admissible processes is $L_{a,\pi}(S) \triangleq \{\pi \text{ defined by (A.3)},\ \xi \in L_a(S, W_0)\}$. Using this notation (A.1) can be rewritten as

\[ W = W_0\,\mathcal{E}\Bigl( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} \Bigr) \quad P\text{-a.s.} \]

$\mathcal{E}(\cdot)$ is the Doléans-Dade exponential (cf. Protter, 1990, Chapter 2.8, and recall that $\mathcal{E}(X)$ is a solution to the stochastic differential equation $Z = 1 + \int Z_{s-}\,dX_s$ with initial value $Z_0 = 1$; any solution to this equation coincides with $\mathcal{E}(X)$ on the set $\{(\omega, t) : \mathcal{E}(X)_{t-} \ne 0\}$). As opposed to the absolute return process $(S_t)$, $\bigl(\frac{dS_t}{S_{t-}}\bigr)$ is called the relative return process (or simply return process) of the assets.
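In discrete time the equivalence between the share-based wealth equation (A.1) and the stochastic-exponential form is an elementary identity, which the following sketch checks numerically. It is only an illustration under assumptions not made in the text: time is discrete, `prices` is a hypothetical array of strictly positive asset prices with one row per date, and `pi` holds the proportions chosen at the start of each period.

```python
import numpy as np

def wealth_from_shares(w0, prices, pi):
    """Share-based recursion: W_{k+1} = W_k + xi_k . (S_{k+1} - S_k),
    with xi_k = pi_k * W_k / S_k, i.e. (A.3) solved for the number of shares."""
    w = w0
    for k in range(len(pi)):
        xi = pi[k] * w / prices[k]
        w = w + xi @ (prices[k + 1] - prices[k])
    return w

def wealth_from_exponential(w0, prices, pi):
    """Discrete stochastic exponential: W_n = W_0 * prod_k (1 + pi_k . dS_k / S_k)."""
    rel_returns = (prices[1:] - prices[:-1]) / prices[:-1]
    return w0 * np.prod(1.0 + np.sum(pi * rel_returns, axis=1))
```

The product over periods is the discrete counterpart of $\mathcal{E}(\cdot)$: each factor $1 + \pi_k \cdot \Delta S_k / S_k$ is one step of the equation $Z = 1 + \int Z_{s-}\,dX_s$.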

It is important to note that we do not use constraints like $\sum_{i=1}^N \pi^{(i)}_t = 1$ for all $t \in I$ almost surely, or $\sum_{i=1}^N \xi^{(i)}_t S^{(i)}_t = W_t$ for all $t \in I$ almost surely. What is more, we also do not use a “riskless asset”. The reason is mere convenience. Provided we can change the numeraire (see Delbaen and Schachermayer, 1995, on this), there basically exists a one-to-one correspondence between using the so-called “discounted” wealth process without such constraints and a “riskless asset”, and the “normal” wealth process with constraints (cf. also Example 2.1.39, Example 2.1.41, and Karatzas and Shreve, 1998, especially Chapter 3).

In reality, however, there is quite often a net outflow of money, namely consumption, a progressively measurable non-negative process $c$ satisfying $\int_0^T c(s)\,ds < \infty$ almost surely. With this source for change in wealth the wealth process now reads

\[ W \triangleq W_0 + \int_{0+}^{\cdot} \xi_s \cdot dS_s - \int_0^{\cdot} c(s)\,ds \quad P\text{-a.s.} \tag{A.4} \]

Constraint (A.2) must still hold for this wealth process and $W_T$ must still exist. We denote by $A(S, W_0)$ the set of all pairs of a predictable process $\xi$ and a consumption process $c$ such that constraint (A.2) is fulfilled for (A.4). We call $(\xi, c)$ admissible if $(\xi, c) \in A(S, W_0)$. For $K \subset L(S)$, we write $A^K(S, W_0)$ for all $(\xi, c) \in A(S, W_0)$ with $\xi \in K$. $A_\pi(S, W_0)$, $A^K_\pi(S, W_0)$ are defined accordingly, and each $(\pi, c) \in A_\pi(S, W_0)$, $(\pi, c) \in A^K_\pi(S, W_0)$ respectively, is called admissible, too. The wealth process is then given by

\[ W = W_0\,\mathcal{E}\Bigl( \int_{0+}^{\cdot} \pi(s) \cdot \frac{dS_s}{S_{s-}} - \int_{0+}^{\cdot} \frac{c(s)}{W_{s-}}\,d\bigl(s\,I_{\{W_{s-} > 0\}}\bigr) \Bigr) \quad P\text{-a.s.} \tag{A.5} \]

By Theorem C.3.1, the solution to this stochastic differential equation is

\[ W = I_{\{W_- > 0\}}\,\mathcal{E}\Bigl( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_-} \Bigr) \Biggl[ W_0 - \int_0^{\cdot} \frac{c(s)}{\mathcal{E}\bigl( \int_{0+}^{\cdot-} \pi_s \cdot \frac{dS_s}{S_-} \bigr)}\,ds \Biggr], \]

almost surely, provided there are no arbitrage opportunities in the market. Indeed, no arbitrage in a properly defined way implies the existence of an equivalent probability measure Qm such that (A.4) and (A.5) are local supermartingales. But then $W_s = 0 \Rightarrow W_t = 0$ for all $t \ge s$, since $W \ge 0$ (see Revuz and Yor, 1999, Chapter 2, Proposition 3.4).


A.2 Topological Properties

We will now discuss some topological properties of the spaces involved. Whenever we refer to topological properties we will identify members of an equivalence class⁸ (i.e., we will only consider quotient spaces) to simplify the notation.

Recall that we denote by $L(S)$ the vector space of all $S$-integrable predictable processes (the space of all possible trading strategies, not necessarily admissible). This space does not depend on the measure ($P$ or any equivalent measure $Q$) chosen (Protter, 1990, Chapter 4.2, Theorem 25). By Mémin (1980), the space $(L(S), d_S)$ is a complete metric space, where the distance $d_S : L(S) \times L(S) \mapsto \mathbb{R}^+_0$ is given by:

\[ d_S(\xi^1, \xi^2) = d_E\Bigl( \int_0^T \xi^1_s \cdot dS_s,\ \int_0^T \xi^2_s \cdot dS_s \Bigr). \tag{A.6} \]

Here $d_E$ is the Émery distance, which makes the space of all semimartingales a complete, metrizable, but not locally convex space (Émery, 1979). The topology is also known as the semimartingale topology. The distance is given by

\[ d_E(S^1, S^2) = \sup_{|h| \le 1} \sum_{n \ge 1} 2^{-n}\,E_P\Bigl[ \min\Bigl( \Bigl| \int_0^{T \wedge n} h_s\,d(S^1_s - S^2_s) \Bigr|,\ 1 \Bigr) \Bigr], \]

where the supremum is taken over the set of all predictable processes bounded by one. The topology is linear (Schaefer, 1999, Chapter 1, Theorem 6.1 and the discussion thereafter), and $(L(S), d_S)$ is a topological vector space, hence an F-space.⁹ Finally we note that the semimartingale topology does not depend on the choice of an equivalent measure, a consequence of the Closed Graph theorem and the fact that the semimartingale topology is finer than the topology of uniform convergence on compacts in probability (Cherny and Shiryaev, 2001, Lemma 4.9).

⁸ The equivalence relation is always almost sure equality.

⁹ Often authors do not distinguish between an F-space and a Fréchet space, sometimes assuming that an F-space is locally convex (Schaefer, 1999, page 49), sometimes not (Yosida, 1980, Chapter 1.9, Definition 1). We define an F-space as a complete metrizable topological vector space (Kalton, Peck, and Roberts, 1984; Schechter, 1997, 26.2); if the space happens to be locally convex, too, we will say it is a Fréchet space (Schechter, 1997, 26.14).


We topologise the space of all contingent claims (all possible terminal wealth outcomes) $L^0(P)$ with the obvious metric of convergence in probability (see e.g. Aliprantis and Border, 1999, Chapter 12, Theorem 40):

\[ d_P : L^0(P) \times L^0(P) \mapsto \mathbb{R}^+_0, \qquad d_P(f_1, f_2) = E_P\Bigl[ \frac{|f_1 - f_2|}{1 + |f_1 - f_2|} \Bigr]. \]

Again, the space $(L^0(P), d_P)$ is an F-space.
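On a finite probability space the metric $d_P$ is a weighted sum and can be computed directly. The sketch below is only an illustration under that assumption; the function name and the vector `p` of point probabilities are hypothetical. It also shows why $d_P$ captures convergence in probability rather than convergence in mean: a large deviation on a set of small probability contributes little.

```python
import numpy as np

def d_prob(f1, f2, p):
    """d_P(f1, f2) = E_P[ |f1 - f2| / (1 + |f1 - f2|) ] on a finite sample space,
    where p[k] is the probability of outcome k."""
    diff = np.abs(np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float))
    return float(np.sum(np.asarray(p, dtype=float) * diff / (1.0 + diff)))

# A claim paying 1000 on an event of probability 0.001 is close to zero in d_P,
# although its L^1 distance from zero equals 1.
p = np.array([0.001, 0.999])
f = np.array([1000.0, 0.0])
print(d_prob(f, np.zeros(2), p))
```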

A.2.1 Remark. We need some well-known facts about the space (L0(P), dP)

(cf. Aliprantis and Border, 1999, Chapters 12.10, 12.11; Kalton et al., 1984,

Chapter 2.2):

(i) The space $(L^0(P), d_P)$ does not change if we switch to an equivalent measure $Q$. Of all $(L^p(P), \|\cdot\|_p)$-spaces ($0 < p \le \infty$), only $(L^\infty(P), \|\cdot\|_\infty)$ possesses this property, too.¹⁰ This is the reason why $(L^0(P), d_P)$ is especially useful in mathematical finance.

(ii) By a theorem first proven by Nikodym, (L0(P), dP) has a trivial (topo-

logical) dual if the probability space is non-atomic. Hence (L0(P), dP) is

not a locally convex space in general (Schechter, 1997, 26.16).

(iii) (L0(P), dP) is not locally bounded (in the topological sense) (Schechter,

1997, Example 27.8 b).

The last two properties are certainly not very satisfying since they prevent

the use of most of the theory that works so nicely for locally convex spaces

(Schaefer, 1999, Chapter 2).

The next very useful theorem will enable us to handle the formidable task of applying general results from functional analysis to the F-space $(L^0(Q), d_Q)$. Let us recall the concept of a polar first (Schaefer, 1999, p. 125). For $\emptyset \ne C \subset L^0_+(\Omega, \mathcal{F}, P) \triangleq \{X \in L^0(\Omega, \mathcal{F}, P) : X \ge 0 \text{ a.s.}\}$ the polar $C^\circ$ is defined by

\[ C^\circ = \bigl\{ h \in L^0_+(\Omega, \mathcal{F}, P) : E_P[gh] \le 1 \ \forall\, g \in C \bigr\}.^{11} \]

The bipolar $C^{\circ\circ}$ is the polar of the polar, i.e. $C^{\circ\circ} = (C^\circ)^\circ$. It does not change under a switch to an equivalent probability measure. Also remember that a set $C \subset L^0_+(\Omega, \mathcal{F}, P)$ is called solid if $f \in C$, $0 \le g \le f$ implies $g \in C$ (Schaefer, 1999, p. 209). Given these definitions, one can show that

¹⁰ However, there exists an isometric isomorphism between $(L^1(P), d_P)$, $(L^1(Q), d_Q)$ if $P$, $Q$ are equivalent; it is given by $f \mapsto f\,\frac{dP}{dQ}$ from $(L^1(P), d_P)$ to $(L^1(Q), d_Q)$.

A.2.2 Theorem. For $\emptyset \ne C \subset L^0_+(\Omega, \mathcal{F}, P)$ the polar $C^\circ$ is a convex, solid and closed subset of $(L^0_+(\Omega, \mathcal{F}, P), d_P)$. The bipolar $C^{\circ\circ}$ is the smallest subset of $L^0_+(\Omega, \mathcal{F}, P)$ containing $C$ that is convex, solid and closed with respect to $d_P$.

Proof. Brannath and Schachermayer (1999, Theorem 1.3).
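On a finite probability space the polar can be checked element by element, which gives a feeling for the definition. The sketch below is an illustration only: the finite sample space, the generating set `C` and the helper name are assumptions introduced here, not part of the cited theorem.

```python
import numpy as np

def in_polar(h, C, p, tol=1e-12):
    """Check h in C° = {h >= 0 : E_P[g h] <= 1 for all g in C}
    on a finite sample space with point probabilities p."""
    h = np.asarray(h, dtype=float)
    if np.any(h < 0):
        return False
    return all(np.sum(p * np.asarray(g, dtype=float) * h) <= 1.0 + tol for g in C)

# Two equally likely states; C consists of the claims (2, 0) and (0, 2).
p = np.array([0.5, 0.5])
C = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
print(in_polar([1.0, 1.0], C, p))   # E_P[g h] = 1 for both g
print(in_polar([1.1, 0.0], C, p))   # E_P[g h] = 1.1 for g = (2, 0)
```

In this example the polar is the square $[0, 1]^2$, and the bipolar consists of all $g \ge 0$ with $\tfrac{1}{2}(g_1 + g_2) \le 1$, which is the solid convex hull of `C`, in line with Theorem A.2.2.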

A.2.3 Remark. Ergo, if $C$ is already convex, solid and closed, then $C = C^{\circ\circ}$. As we have observed (cf. Remark A.2.1), $L^0(P)$ is not locally convex in general. Hence the standard proof of the Bipolar theorem (e.g. Schaefer, 1999, Chapter, Theorem 1.5), which relies on separating hyperplanes for locally convex spaces, cannot be applied. But the Hahn-Banach theorem, which is all we need for the proof to work, requires only an ordered vector space (see Schechter, 1997, 12.34).¹² Brannath and Schachermayer (1999) use this fact and the lattice structure of $L^0(P)$ to prove their version of the bipolar theorem. A similar method will be used in Appendix B.3.

For later use we recall the concept of Fatou convergence, the stochastic process analogue to almost sure convergence:

A.2.4 Definition (Fatou Convergence). Let $(X^n)_{n \ge 1}$ be a sequence of stochastic processes uniformly bounded from below. The sequence $(X^n)_{n \ge 1}$ is Fatou convergent (on $\tau$) to a process $X^*$, if there exists a dense subset $\tau$ of $I$ such that

\[ X^*_t = \limsup_{s \downarrow t,\ s \in \tau}\ \limsup_{n \to \infty} X^n_s = \liminf_{s \downarrow t,\ s \in \tau}\ \liminf_{n \to \infty} X^n_s \]

almost surely for all $t \in [0, T]$. We write $\lim_{n \to \infty} X^n = X^*$ for such limits.

For supermartingales, Fatou convergence simplifies to:

A.2.5 Lemma. Let $(X^n)_{n \ge 1}$ be a sequence of càdlàg supermartingales which are uniformly bounded from below such that $X^n_0 \le 0$ ($n \ge 1$), and Fatou converging to some $X^*$. Then there is a countable subset $\tau \subset I \setminus \{T\}$ such that for $t \in I \setminus \tau$, we have $X^*_t = \liminf_{n \to \infty} X^n_t$.

Proof. Žitković (2002, Lemma 8).

¹¹ The definition of a polar can cause some difficulties. Our definition is the “German” one as introduced by Köthe and used by Castaing and Valadier (1977); Schaefer (1999). Bourbaki defines a polar as $C^{B} = \{h \in L^0_-(\Omega, \mathcal{F}, P) : E_P[gh] \ge -1 \ \forall\, g \in C\}$; it is clear that $C^\circ = -C^{B}$. Other authors call the set $C^{Abs} = \{h \in L^0(\Omega, \mathcal{F}, P) : |E_P[gh]| \le 1 \ \forall\, g \in C\}$ a polar (normally called the absolute polar).

¹² There are many formulations of a Hahn-Banach Theorem. If we want extensions to be continuous linear functionals in the Hahn-Banach Extension Theorem (Theorem 5.40 in Aliprantis and Border, 1999; Schechter, 1997, 12.34, VHB2), then an F-space must be locally convex, i.e. a Fréchet space (Kalton et al., 1984, Chapter 4). Here, we consider weaker versions of the Theorem.

This lemma also teaches that if (Xn)n≥1 converges almost surely λ ⊗ P,

then Fatou convergence and almost sure convergence coincide. The usefulness

of Fatou convergence for our purpose also stems from the following lemma.

A.2.6 Lemma. (i) Let $(X^n)_{n \ge 1}$ be a sequence of càdlàg supermartingales which are uniformly bounded from below such that $X^n_0 = 0$ ($n \ge 1$). Let $\tau$ be a dense countable subset of $I$. Then there is a sequence $Y^n \in \mathrm{conv}(X^n, X^{n+1}, \dots)$, $n \ge 1$, and a càdlàg supermartingale $Y$ such that $Y_0 \le 0$ and $(Y^n)_{n \ge 1}$ is Fatou convergent on $\tau$ to $Y$.

(ii) Let $(A^n)_{n \ge 1}$ be a sequence of càdlàg non-decreasing processes such that $A^n_0 = 0$, $n \ge 1$. Then there is a sequence $B^n \in \mathrm{conv}(A^n, A^{n+1}, \dots)$, and an increasing process $B$, possibly taking the value $+\infty$, such that $(B^n)_{n \ge 1}$ is Fatou convergent on $\tau$ to $B$.

Proof. Föllmer and Kramkov (1997, Lemma 5.2).

A.2.7 Remark. Among other things, it is a direct consequence of the lemma that any sequence of supermartingales satisfying the conditions of the lemma and converging in the Fatou sense necessarily converges to a càdlàg supermartingale (after a modification if necessary). Indeed, if $(X^n)_{n \ge 1}$ already converges, then $(Y^n)_{n \ge 1}$ must converge to the same process.


A.3 Characterization of Admissible Processes

We can now give a dual characterization of admissible portfolios, which relies on an Optional Decomposition result. For the general semimartingale setting, the latter was proven by Kramkov (1996); Föllmer and Kabanov (1998); Föllmer and Kramkov (1997), from where we take the theory. Consult these sources for details.

Before we start, we assemble various facts from elementary measure theory

and the theory of stochastic processes, which we will freely use in the remainder.

A.3.1 Lemma. Let $X$ be a local supermartingale with respect to $(\mathcal{F}(t))_{t \in I}$ on $(\Omega, \mathcal{F}, P)$, $\xi \in L(X)$ be locally bounded, and $\xi \ge 0$. Then $\int_{0+}^{\cdot} \xi_s\,dX_s$ is a local supermartingale.

Proof. Stopping if necessary, let $X = M - A$ be the Doob-Meyer decomposition of $X$ with $M$ a local martingale and $A$ a nondecreasing, natural process (Protter, 1990, Chapter 3, Theorem 7). By Protter (1990, Chapter 4, Theorem 29), $\int_{0+}^{\cdot} \xi_s\,dM_s$ is a local martingale, and clearly $\int_{0+}^{\cdot} \xi_s\,dA_s$ is nondecreasing. Consequently, $\int_{0+}^{\cdot} \xi_s\,dX_s = \int_{0+}^{\cdot} \xi_s\,dM_s - \int_{0+}^{\cdot} \xi_s\,dA_s$ is a local supermartingale.

For ease of notation, we use the following shortcut notation for a stopping time. Let $Y$ be a non-negative, càdlàg semimartingale, and consider the stopping time $\tau \triangleq \inf\{t \ge 0 : Y_t = 0\}$, where $\tau = \infty$ on $\{Y_T > 0\}$; then we set $\tau^- = T$ on $\{Y_T > 0\}$.

A.3.2 Lemma. Given a non-negative càdlàg supermartingale $\mathcal{E}(Y)$ with respect to $(\mathcal{F}(t))_{t \in I}$ on $(\Omega, \mathcal{F}, P)$, let $\tau \le \inf\{t \ge 0 : \mathcal{E}(Y)_t = 0\}$ a.s. be a stopping time; then $Y^\tau$ is a local supermartingale.

Proof. Doob's optional sampling theorem (Rogers and Williams, 1994a, Theorem 77.5) ensures that $\mathcal{E}(Y)^\tau$ is a supermartingale. From the definition of the stochastic exponential, $\mathcal{E}(Y)^\tau_t = 1 + \int_{0+}^t \mathcal{E}(Y)^\tau_{s-}\,dY^\tau_s$. This implies $Y^\tau_t = \int_{0+}^t \frac{1}{\mathcal{E}(Y)^\tau_{s-}}\,d\mathcal{E}(Y)^\tau_s$. $\mathcal{E}(Y)^\tau_-$ is càglàd, and hence also locally bounded. It follows from Lemma A.3.1 that $Y^\tau$ is a local supermartingale.


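A.3.3 Lemma. Let $Y$ be a progressive process, and $\tau_1, \tau_2, \tau_3$ be three stopping times. Then $Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}}$ is $\mathcal{F}(\tau_2)$-measurable, and

\[ E_P\bigl[ Y_{\tau_3} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_1) \bigr] = E_P\bigl[ E_P[ Y_{\tau_3} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_2) ] \mid \mathcal{F}(\tau_1) \bigr]. \]

Proof. If $A \in \mathcal{F}(\tau_1 \vee \tau_2)$, then $A \cap \{\tau_1 \le \tau_2\} \in \mathcal{F}(\tau_2)$ (Rogers and Williams, 1994a, Chapter II, Lemma 73.4 (iii)), hence $\mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\} \subset \mathcal{F}(\tau_2)$. By Rogers and Williams (1994a, Chapter II, Lemma 73.11) $Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}}$ is $\mathcal{F}(\tau_1)$-measurable. If $A \in \mathcal{F}(\tau_1)$, then the definition of the integral implies $\int_A Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}}\,dP = \int_{A \cap \{\tau_1 \le \tau_2\}} Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}}\,dP$. We conclude that $Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}} = E_P[Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_1)] = E_P[Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\}]$, i.e. $Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}}$ is $\mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\}$-measurable. And using $\mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\} \subset \mathcal{F}(\tau_2)$, we find $Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}} = E_P[Y_{\tau_1} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_2)]$.

As for the second part, $E_P[Y_{\tau_3} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_1)] = E_P[Y_{\tau_3} I_{\{\tau_1 \le \tau_2\}} \mid \mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\}]$, and $\mathcal{F}(\tau_1) \cap \{\tau_1 \le \tau_2\} \subset \mathcal{F}(\tau_2)$. The result thus follows from elementary properties of the conditional expectation.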

We conclude this section with a property of the Doléans-Dade exponential.

A.3.4 Lemma. Let $Y$ be a càdlàg semimartingale with $Y_0 = 0$, and $C$ be a càdlàg predictable process of bounded variation with $C_0 = 0$. Then

(i) $\mathcal{E}(Y)\,\mathcal{E}(C) = \mathcal{E}\bigl( \int_0^{\cdot} (1 + \Delta C)\,dY + C \bigr)$.

(ii) Given $Y$, $C$, set $X = \int_0^{\cdot} (1 + \Delta C)\,dY$; we find $\mathcal{E}(X + C) = \mathcal{E}(Y)\,\mathcal{E}(C)$ and $Y = \int_0^{\cdot} \frac{1}{1 + \Delta C}\,dX$.

Proof. (i) Use Yor's formula (Protter, 1990, Chapter 2, Theorem 37) and the definition of a bracket process.

(ii) Apply (i).
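The step hidden in the proof of (i) can be written out as follows. This is a sketch of the standard argument rather than a quotation from the cited source; it uses Yor's formula in the form $\mathcal{E}(X)\mathcal{E}(Y) = \mathcal{E}(X + Y + [X, Y])$ and the fact that the bracket with a predictable process of bounded variation reduces to the sum of the common jumps.

```latex
\begin{align*}
\mathcal{E}(Y)\,\mathcal{E}(C)
  &= \mathcal{E}\bigl(Y + C + [Y, C]\bigr)
  && \text{(Yor's formula)} \\
[Y, C]
  &= \sum_{0 < s \le \cdot} \Delta Y_s\,\Delta C_s
   = \int_0^{\cdot} \Delta C_s\,dY_s
  && \text{($C$ predictable, of bounded variation)} \\
Y + [Y, C]
  &= \int_0^{\cdot} (1 + \Delta C_s)\,dY_s .
\end{align*}
```

Substituting the last line into the first gives (i). In discrete time the statement reduces to the elementary identity $(1 + y)(1 + c) = 1 + (1 + c)\,y + c$ applied to each pair of increments.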

A.3.1 Portfolio-Proportion Processes

We first give the results for portfolio-proportion processes, and start with some

definitions.


A.3.5 Definition (Upper Variation Process). Let SK be a family of cadlag

semimartingales which are bounded from below with initial value S0 = 0, and

such that 0 ∈ SK. Denote by M(SK) the class of all probability measures

Q equivalent to P with the following property: there exists (for Q fixed) a

nondecreasing predictable process A such that S−A is a local supermartingale

for any S ∈ SK. A cadlag nondecreasing predictable process ASK(Q) will

be called upper variation process of SK, if it is minimal with respect to this

property, i.e. if for any other process A with this property, A − ASK(Q) is a

nondecreasing process.

Under the additional assumption that SK is predictably convex, it is one of

the key assertions of the mentioned literature that the upper variation process

ASK(Q) exists for any Q ∈ M(SK); furthermore, it is finite (Follmer and

Kramkov, 1997, Lemma 2.1).

A.3.6 Definition (Predictably Convex). We call $\mathcal{S}^K$ predictably convex, if for any predictable process $\alpha$ such that $0 \le \alpha \le 1$ we have $\int \alpha\,dX^{(1)} + \int (1 - \alpha)\,dX^{(2)} \in \mathcal{S}^K$ for all $X^{(1)}, X^{(2)} \in \mathcal{S}^K$.

Returning to our securities market, let K ⊂ La,π (S) be closed with respect

to dS . Further assume that 0 ∈ K and that K is convex in the following

sense: if β, γ ∈ K, then αβ + (1−α)γ ∈ K for any one-dimensional predictable

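process $\alpha$ such that $0 \le \alpha \le 1$. Consider the predictably convex family of semimartingales$^{13}$

$$\mathcal{S}^K = \left\{ \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} : \pi \in K \right\}$$

and set

$$\mathcal{W}^K(W_0) \triangleq \left\{ W \ge 0 : W_t = W_0\,\mathcal{E}\left( \int_{0+}^{t} \pi_s \cdot \frac{dS_s}{S_{s-}} - C_t \right),\ \pi \in K,\ C \text{ a non-negative, nondecreasing, cadlag process} \right\}. \qquad \text{(A.7)}$$

Note that all admissible wealth processes in the sense of Appendix A.1 are in $\mathcal{W}^K(W_0)$ (cf. Remark 1.2.2).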

$^{13}$ Note that any semimartingale $S$ in $\mathcal{S}^K$ is a mapping $I \times \Omega \to \mathbb{R}$, i.e. not vector-valued, and $S_0 = 0$.

A.3.7 Theorem. Let $W$ be a non-negative cadlag process. Then the following statements are equivalent:

(i) $W \in \mathcal{W}^K(W_0)$.

(ii) For all $Q \in \mathcal{M}(\mathcal{S}^K)$ the process $W / \mathcal{E}(A^{\mathcal{S}^K}(Q))$ is a supermartingale under $Q$.

Proof. Follmer and Kramkov (1997, Theorem 4.2), using $\mathcal{M}(\mathcal{S}^K) \neq \emptyset$.

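A.3.8 Remark. If

$$\mathcal{Y}^K \triangleq \left\{ \frac{1}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}\, E_P\left[ \frac{dQ}{dP} \,\Big|\, \mathcal{F}(\cdot) \right] : Q \in \mathcal{M}(\mathcal{S}^K) \right\},$$

then a non-negative process $W \in \mathcal{W}^K(W_0)$ for some $W_0 > 0$ if and only if $WY$ is a $P$-supermartingale for any process $Y \in \mathcal{Y}^K$. Any $Y \in \mathcal{Y}^K$ is a $P$-supermartingale, and $E_P[Y_t] \le 1$, $t \in I$.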
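The last claim of the remark can be checked in two lines. In the sketch below, $Z_t = E_P[dQ/dP \mid \mathcal{F}(t)]$ denotes the density process and $A$ abbreviates the upper variation process of $Q$; both abbreviations are introduced here only for illustration. The sketch assumes that $A$ is nondecreasing with $A_0 = 0$, so that $\mathcal{E}(A)$ is nondecreasing with $\mathcal{E}(A)_0 = 1$.

```latex
\begin{align*}
E_P[Y_t \mid \mathcal{F}(s)]
  &= E_P\!\left[ \frac{Z_t}{\mathcal{E}(A)_t} \,\Big|\, \mathcal{F}(s) \right]
   \le \frac{1}{\mathcal{E}(A)_s}\, E_P[Z_t \mid \mathcal{F}(s)]
   = \frac{Z_s}{\mathcal{E}(A)_s} = Y_s, \qquad s \le t, \\
E_P[Y_t] &\le E_P[Y_0] = E_P[Z_0] = 1 .
\end{align*}
```

The inequality uses only $Z_t \ge 0$ and $\mathcal{E}(A)_t \ge \mathcal{E}(A)_s$; the martingale property of $Z$ gives the following equality.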

Follmer and Kramkov (1997) prove another useful result:

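A.3.9 Theorem. Under the assumptions of this subsection, let $X \ge 0$ be an $\mathcal{F}$-measurable random variable. Assume that

$$W_0 \triangleq \sup_{Q \in \mathcal{M}(\mathcal{S}^K)} E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} \right] < \infty.$$

Then there exists a non-negative cadlag process $W \in \mathcal{W}^K(W_0)$ such that $W_T \ge X$ almost surely. Further, $W$ is minimal with respect to this property (in the sense that for any other process $\tilde{W} \in \mathcal{W}^K(W_0)$ with $\tilde{W}_T \ge X$, we have $\tilde{W} \ge W$). It holds that

$$W_t = \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K)} \left( \mathcal{E}(A^{\mathcal{S}^K}(Q))_t\, E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} \,\Big|\, \mathcal{F}(t) \right] \right).$$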

Proof. Follmer and Kramkov (1997, Proposition 4.3).

The two theorems enable us to prove the

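A.3.10 Lemma. Set

$$\mathcal{C}^K \triangleq \left\{ X \in L^0_+(\Omega, \mathcal{F}, P) : X \le W_T \text{ a.s. for an admissible wealth process } W,\ W_0 = 1,\ \pi \in K \right\}.$$

Then $\mathcal{C}^K$ is convex, solid and closed (for $d_P$); we have $\mathcal{C}^K = (\mathcal{D}^K)^{\circ}$, where $\mathcal{D}^K = \{ Y_T : Y \in \mathcal{Y}^K \}$.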


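Proof. That $\mathcal{C}^K$ is convex, solid and closed follows from Theorem A.2.2 as soon as we have established $\mathcal{C}^K = (\mathcal{D}^K)^{\circ}$.

$\mathcal{C}^K \subset (\mathcal{D}^K)^{\circ}$: Let $X \in \mathcal{C}^K$ be arbitrary. From the definition of $\mathcal{C}^K$, $0 \le X Y_T \le W_T Y_T$ almost surely for all $Y \in \mathcal{Y}^K$. Theorem A.3.7 (see also Remark A.3.8) implies that $E_P[X Y_T] \le E_P[W_T Y_T] \le W_0 Y_0 \le 1$ for all $Y \in \mathcal{Y}^K$, hence $X \in (\mathcal{D}^K)^{\circ}$.

$(\mathcal{D}^K)^{\circ} \subset \mathcal{C}^K$: For $X \in (\mathcal{D}^K)^{\circ}$ we have $E_P[X Y_T] \le 1$ for all $Y \in \mathcal{Y}^K$, i.e.

$$\sup_{Q \in \mathcal{M}(\mathcal{S}^K)} E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} \right] \le 1.$$

Assume that $X \neq 0$ on some subset with positive measure (otherwise there is nothing to prove). By Theorem A.3.9, there exists a process $W \in \mathcal{W}^K(W_0)$ for some $0 < W_0 \le 1$, such that $W_T \ge X$. Replace $W$ by the process $\tilde{W}$ where $C \equiv 0$, but $\pi$ unchanged; then $\tilde{W} \in \mathcal{W}^K(W_0)$ and $\tilde{W}_T \ge W_T \ge X$ (Lemma A.3.4 (ii)). Since it is clear that $\frac{1}{W_0} \tilde{W}_T \ge \tilde{W}_T \ge X$, and $\frac{1}{W_0} \tilde{W}$ is an admissible wealth process, we find $X \in \mathcal{C}^K$ as desired.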

Finally, we need some technical lemmata.

A.3.11 Lemma. Set

$$\tilde{\mathcal{Y}}^K \triangleq \left\{ Y \ge 0 : Y \le \lim_{n \to \infty} Y^n \text{ in the Fatou sense a.s.},\ Y^n \in \operatorname{conv}(\mathcal{Y}^K) \right\}. \qquad \text{(A.8)}$$

Then $\tilde{\mathcal{Y}}^K$ is convex. Each maximal element, i.e. each element $Y^*$ such that there exists no other element $Y \in \tilde{\mathcal{Y}}^K$ with $Y \ge Y^*$ almost surely and $Y > Y^*$ on a set with positive probability, is a cadlag supermartingale. Each maximal element $Y^*$ has a representation $Y^* = \lim_{n \to \infty} Y^n$ for $Y^n \in \operatorname{conv}(\mathcal{Y}^K)$.

Let $(Y^n)$ be a sequence of maximal elements. Let $X^n$ be a sequence of elements in $\operatorname{conv}(Y^n)$, Fatou-converging to $X^*$. Then $X^*$ is a cadlag supermartingale, and $X^* \in \tilde{\mathcal{Y}}^K$.

Proof. In the following, all (in)equalities hold $\lambda \otimes P$-almost surely unless otherwise stated. We use Lemma A.2.5 to ensure this where necessary. By the same lemma, we can assume that the sequence converges for $T$ $P$-almost surely.

We first prove convexity. For the moment, let us assume that $Y_1, Y_2$ are two maximal elements that have a representation $Y_i = \lim_{n \to \infty} Y_i^n$ for $i = 1, 2$ and $Y_i^n \in \operatorname{conv}(\mathcal{Y}^K)$. Using Lemma A.2.5, we can write $Y_i = \liminf_{n \to \infty} Y_i^n$. From the properties of the limes inferior, we find $\alpha Y_1 + (1 - \alpha) Y_2 \le \liminf_{n \to \infty} \left( \alpha Y_1^n + (1 - \alpha) Y_2^n \right)$ almost surely; and $Y_i^n = \liminf_{m \to \infty} \sum_{j=1}^{k(m)} \beta(n, i, j) Y_j$ for $Y_j \in \mathcal{Y}^K$ from the definition of $\tilde{\mathcal{Y}}^K$. Again using the inequality of the limes inferior, we therefore get

$$\alpha Y_1 + (1 - \alpha) Y_2 \le \liminf_{n \to \infty} \liminf_{m \to \infty} \sum_{j=1}^{k(m)} \bigl( \alpha \beta(n, 1, j) + (1 - \alpha) \beta(n, 2, j) \bigr) Y_j \le \liminf_{n \to \infty} \sum_{j=1}^{k(n)} \bigl( \alpha \beta(n, 1, j) + (1 - \alpha) \beta(n, 2, j) \bigr) Y_j.$$

This is convexity, since $\sum_{j=1}^{k(n)} \bigl( \alpha \beta(n, 1, j) + (1 - \alpha) \beta(n, 2, j) \bigr) Y_j \in \operatorname{conv}(\mathcal{Y}^K)$. The inequalities above and the definition of $\tilde{\mathcal{Y}}^K$ also show that each maximal element $Y^*$ must have a representation $Y^* = \lim_{n \to \infty} Y^n$ for $Y^n \in \operatorname{conv}(\mathcal{Y}^K)$. This also justifies our initial assumption concerning $Y_1, Y_2$. The fact that each maximal element is a cadlag supermartingale then follows from Lemma A.2.6, Remark A.2.7 and the definition of $\mathcal{Y}^K$. This completes the proof of the first part.

That $X^*$ is a cadlag supermartingale follows again from Lemma A.2.6 and Remark A.2.7. And $X^* \in \tilde{\mathcal{Y}}^K$, since each $Y^m = \liminf_{n \to \infty} Y^n$ with $Y^n \in \operatorname{conv}(\mathcal{Y}^K)$, hence $X^* \le \liminf_{m \to \infty} Y^m \le \liminf_{n \to \infty} X^n$ for some sequence $X^n \in \operatorname{conv}(\mathcal{Y}^K)$, again using "diagonalization" as above.

This enables us to prove two lemmata used in Chapter 3.

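A.3.12 Lemma. Set

$$\tilde{\mathcal{D}}^K \triangleq \left\{ Y \in L^0_+(\Omega, \mathcal{F}, P) : \exists\, Y^K \in \tilde{\mathcal{Y}}^K \text{ with } Y \le Y^K_T \right\}.$$

Then $\tilde{\mathcal{D}}^K$ is convex, solid and closed with respect to $(L^0(P), d_P)$.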

Proof. That $\tilde{\mathcal{D}}^K$ is convex and solid is trivial. It remains to show that $\tilde{\mathcal{D}}^K$ is closed. To this end, let $(g_n)$ be a sequence in $\tilde{\mathcal{D}}^K$ converging to $g \neq 0$ in $d_P$ (for $g \equiv 0$ almost surely there is nothing to prove). By the triangle inequality, any sequence $(\hat{g}_n)$ with $\hat{g}_n \in \operatorname{conv}(g_n, g_{n+1}, \ldots)$, $n \ge 1$, converges to $g$, too.

Let $(Y^n)$ be a sequence in $\tilde{\mathcal{Y}}^K$ with $Y^n_T \ge g_n$ for all $n \ge 1$. By Lemma A.2.6, there exists a sequence of supermartingales $\hat{Y}^n \in \operatorname{conv}(Y^n, Y^{n+1}, \ldots)$, $n \ge 1$, which is Fatou convergent to a process $Y$ on a dense countable set $\tau \subset I$, where we assume $T \in \tau$ (Lemma A.2.5). $\hat{Y}^n \in \tilde{\mathcal{Y}}^K$ from the definition of $\tilde{\mathcal{Y}}^K$. By an appropriate choice of $\hat{g}_n$, we can also assume that $\hat{Y}^n_T \ge \hat{g}_n$, i.e. $Y_T \ge g$ in the limit.

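A.3.13 Lemma. With the notation of Lemma A.3.10 and Lemma A.3.12 we have the bidual equalities $\mathcal{C}^K = (\tilde{\mathcal{D}}^K)^{\circ}$ and $\tilde{\mathcal{D}}^K = (\mathcal{C}^K)^{\circ}$. It is also true that $(\mathcal{D}^K)^{\circ\circ} = \tilde{\mathcal{D}}^K = (\tilde{\mathcal{D}}^K)^{\circ\circ}$.

Proof. It is clear that $\mathcal{D}^K \subset \tilde{\mathcal{D}}^K$, and $\mathcal{C}^K = (\mathcal{D}^K)^{\circ}$ is shown in Lemma A.3.10. By Theorem A.2.2, we also have that $\mathcal{C}^K$ is convex, solid and closed, and thus $\mathcal{C}^K = (\mathcal{C}^K)^{\circ\circ}$.

By the Fatou lemma and the definition of $\tilde{\mathcal{D}}^K$, $E_P[\liminf_{n \to \infty} Y^n W_T] \le \liminf_{n \to \infty} E_P[Y^n W_T] \le 1$ for $Y^n \in \mathcal{D}^K$; hence $\tilde{\mathcal{D}}^K \subset (\mathcal{C}^K)^{\circ} = (\mathcal{D}^K)^{\circ\circ}$. That $\tilde{\mathcal{D}}^K$ is convex, solid and closed is shown in Lemma A.3.12. As $\mathcal{D}^K \subset \tilde{\mathcal{D}}^K$ we conclude from Theorem A.2.2 that $(\mathcal{D}^K)^{\circ\circ} \subset (\tilde{\mathcal{D}}^K)^{\circ\circ} = \tilde{\mathcal{D}}^K$, which implies $(\mathcal{D}^K)^{\circ\circ} \subset (\tilde{\mathcal{D}}^K)^{\circ\circ} = \tilde{\mathcal{D}}^K \subset (\mathcal{C}^K)^{\circ} = (\mathcal{D}^K)^{\circ\circ}$, i.e. $\tilde{\mathcal{D}}^K = (\mathcal{C}^K)^{\circ}$, and, using $\mathcal{C}^K = (\mathcal{C}^K)^{\circ\circ}$, $\mathcal{C}^K = (\tilde{\mathcal{D}}^K)^{\circ}$, as desired.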

Let us now turn to proving the key result of this subsection. The proposition is true for both the sets $\mathcal{Y}^K$ and $\tilde{\mathcal{Y}}^K$. Indeed, the proof shows that

$$\sup_{Y \in \mathcal{Y}^K} E_P\left[ X Y_T + \int_0^T c(s) Y_s\,ds \right] = \sup_{Y \in \tilde{\mathcal{Y}}^K} E_P\left[ X Y_T + \int_0^T c(s) Y_s\,ds \right].$$

This is a crucial property. Whereas the set $\mathcal{Y}^K$ is the "natural" set for this proposition, it lacks certain desirable properties, most notably convexity and a certain closure property. We need convexity for the Minimax theorem in Lemma 2.1.38, and closure for proving that a certain element exists. This is the reason for introducing $\tilde{\mathcal{Y}}^K$.

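A.3.14 Proposition. With the notation of this subsection:

(i) Suppose

$$\sup_{Y \in \tilde{\mathcal{Y}}^K} E_P\left[ X Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0 \qquad \text{(A.9)}$$

or

$$\sup_{Y \in \mathcal{Y}^K} E_P\left[ X Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0. \qquad \text{(A.10)}$$

Then there exists a portfolio-proportion process $\pi$ with $(\pi, c) \in \mathcal{A}^K_{\pi}(S, W_0)$ such that for the wealth process $(W_t)_{t \in I}$ defined by (A.5) $W_T \ge X$ almost surely holds.

(ii) Conversely, if $(\pi, c) \in \mathcal{A}^K_{\pi}(S, W_0)$, then

$$\sup_{Y \in \mathcal{Y}^K} E_P\left[ W_T Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0$$

and

$$\sup_{Y \in \tilde{\mathcal{Y}}^K} E_P\left[ W_T Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0.$$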

The rest of this subsection is devoted to proving this result. We need several

lemmata first. We start with one concerning the structure of M(SK).

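A.3.15 Lemma. With the notation of this subsection, for $i = 1, 2$ let $Q_i \in \mathcal{M}(\mathcal{S}^K)$, and let $\tau$ be a stopping time with values in $[0, T]$. Define stochastic processes $(Y^i)$ with the help of the density processes $Y^i_t = E_P\left[ \frac{dQ_i}{dP} \mid \mathcal{F}(t) \right]$. Define a measure $Q$ with the help of the process $(Y)$ given by

$$Y_t \triangleq \begin{cases} Y^1_t & t < \tau \\ Y^2_t\, \dfrac{Y^1_\tau}{Y^2_\tau} & t \ge \tau. \end{cases}$$

Equivalently $Y_t = Y^1_{t \wedge \tau} + (Y^2_t - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t \ge \tau\}}$.

Then $Q \in \mathcal{M}(\mathcal{S}^K)$ and $A^{\mathcal{S}^K}(Q)$ is given by

$$A^{\mathcal{S}^K}(Q)_t \triangleq \begin{cases} A^{\mathcal{S}^K}(Q_1)_t & t < \tau \\ A^{\mathcal{S}^K}(Q_2)_t - \bigl( A^{\mathcal{S}^K}(Q_2)_\tau - A^{\mathcal{S}^K}(Q_1)_\tau \bigr) & t \ge \tau, \end{cases}$$

and $A^{\mathcal{S}^K}(Q)_t = A^{\mathcal{S}^K}(Q_1)_{t \wedge \tau} + \bigl( A^{\mathcal{S}^K}(Q_2)_t - A^{\mathcal{S}^K}(Q_2)_\tau \bigr) 1_{\{t \ge \tau\}}$.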
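That the case-by-case definitions and the one-line formulas in the lemma agree can be verified directly. The following check is a sketch; it writes $A(\cdot)$ for the upper variation process for brevity and uses that $Y^2_\tau > 0$ because $Q_2$ is equivalent to $P$.

```latex
\begin{align*}
t < \tau:&\quad
  Y^1_{t\wedge\tau} + (Y^2_t - Y^2_\tau)\,\frac{Y^1_\tau}{Y^2_\tau}\,1_{\{t \ge \tau\}}
  = Y^1_t, \\
t \ge \tau:&\quad
  Y^1_\tau + (Y^2_t - Y^2_\tau)\,\frac{Y^1_\tau}{Y^2_\tau}
  = Y^2_t\,\frac{Y^1_\tau}{Y^2_\tau}, \\
t \ge \tau:&\quad
  A(Q_1)_\tau + A(Q_2)_t - A(Q_2)_\tau
  = A(Q_2)_t - \bigl(A(Q_2)_\tau - A(Q_1)_\tau\bigr).
\end{align*}
```

In particular both $Y$ and the pasted upper variation process are continuous across $\tau$ in the sense that the two branches coincide at $t = \tau$.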

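Proof. We first show that $Y$ is the density process of a probability measure. For the proof, we freely use Lemma A.3.3 without specific reference. From $Y_t = Y^1_{t \wedge \tau} + (Y^2_t - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t \ge \tau\}}$, and the calculation for $t_1 < t_2$

$$\begin{aligned}
E_P[Y_{t_2} \mid \mathcal{F}(t_1)]
&= Y^1_{t_1 \wedge \tau} + E_P\left[ (Y^2_{t_2} - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_2 \ge \tau\}} \,\Big|\, \mathcal{F}(t_1) \right] \\
&= Y^1_{t_1 \wedge \tau} + E_P\left[ (Y^2_{t_2} - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_1 < \tau\}} 1_{\{t_2 \ge \tau\}} \,\Big|\, \mathcal{F}(t_1) \right] + E_P\left[ (Y^2_{t_2} - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_1 \ge \tau\}} \,\Big|\, \mathcal{F}(t_1) \right] \\
&= Y^1_{t_1 \wedge \tau} + E_P\left[ \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_1 < \tau\}} 1_{\{t_2 \ge \tau\}} E_P\left[ (Y^2_{t_2} - Y^2_\tau) \mid \mathcal{F}(\tau) \right] \,\Big|\, \mathcal{F}(t_1) \right] + (Y^2_{t_1} - Y^2_\tau) \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_1 \ge \tau\}} \\
&= Y_{t_1}
\end{aligned}$$

we conclude that $Y$ is a $P$-martingale, and hence $Q$ a probability measure.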

We now want to show that $\int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)$ is a local $Q$-supermartingale for all $\pi \in K$. Stopping if necessary, it suffices to prove that this process is a $Q$-supermartingale. Here we can and will choose the sequence of stopping times such that $\int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q_i)$ is a $Q_i$-supermartingale, and such that $A^{\mathcal{S}^K}(Q_i)$ is bounded from above (Liptser and Shiryaev, 1989, Chapter 1.6, Lemma 1). We want to show that

$$E_Q\left[ \int_{0+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_2} \,\Big|\, \mathcal{F}(t_1) \right] \le \int_{0+}^{t_1} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1}. \qquad \text{(A.11)}$$

Since this estimation is straightforward, but tedious, we split it into several steps. We continue to use Lemma A.3.3.

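(i) On $\{t_1 < \tau\}$ the left-hand side of (A.11) can be split into two summands:

(a) The first summand is

$$1_{\{t_1 < \tau\}} E_Q\left[ \int_{0+}^{t_2 \wedge \tau} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_2 \wedge \tau} \,\Big|\, \mathcal{F}(t_1) \right].$$

Note that $E_P[Y_T \mid \mathcal{F}(\tau)] = Y^1_\tau = E_P[Y^1_T \mid \mathcal{F}(\tau)]$, and that everything else is $\mathcal{F}(\tau)$-measurable ($A^{\mathcal{S}^K}(Q)_{t \wedge \tau} = A^{\mathcal{S}^K}(Q_1)_{t \wedge \tau}$ from the definition). Hence, taking $\tau$-conditional expectations in the bracket, we can switch from $Q$ to $Q_1$ on $\{t_1 < \tau\}$, and this summand equals

$$\begin{aligned}
1_{\{t_1 < \tau\}} E_{Q_1}\left[ \int_{0+}^{t_2 \wedge \tau} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q_1)_{t_2 \wedge \tau} \,\Big|\, \mathcal{F}(t_1) \right]
&\le 1_{\{t_1 < \tau\}} \left( \int_{0+}^{t_1 \wedge \tau} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q_1)_{t_1 \wedge \tau} \right) \\
&= 1_{\{t_1 < \tau\}} \left( \int_{0+}^{t_1 \wedge \tau} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1 \wedge \tau} \right) \\
&= 1_{\{t_1 < \tau\}} \left( \int_{0+}^{t_1} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1} \right),
\end{aligned}$$

where the inequality follows from Doob's Optional Sampling theorem (Rogers and Williams, 1994a, Theorem 77.5), since $A^{\mathcal{S}^K}(Q)$ is bounded.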

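(b) The second summand is

$$1_{\{t_1 < \tau\}} E_Q\left[ \int_{(t_2 \wedge \tau)+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q)_{t_2} - A^{\mathcal{S}^K}(Q)_{t_2 \wedge \tau} \bigr) \,\Big|\, \mathcal{F}(t_1) \right].$$

The expectation is 0 on the set $\{t_2 < \tau\}$. Furthermore, $A^{\mathcal{S}^K}(Q)_t - A^{\mathcal{S}^K}(Q)_{t \wedge \tau} = \bigl( A^{\mathcal{S}^K}(Q_2)_t - A^{\mathcal{S}^K}(Q_2)_\tau \bigr) 1_{\{t \ge \tau\}}$ from $A^{\mathcal{S}^K}(Q)_{t \wedge \tau} = A^{\mathcal{S}^K}(Q_1)_{t \wedge \tau}$. Therefore, we find, using the definition of $Q$, that the second summand equals

$$1_{\{t_1 < \tau\}} E_P\left[ \frac{Y^1_\tau}{Y^2_\tau} 1_{\{t_2 \ge \tau\}} E_{Q_2}\left[ \int_{(t_2 \wedge \tau)+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q_2)_{t_2} - A^{\mathcal{S}^K}(Q_2)_\tau \bigr) \,\Big|\, \mathcal{F}(\tau) \right] \,\Big|\, \mathcal{F}(t_1) \right] \le 0,$$

where the inequality follows from the $Q_2$-supermartingale property of $\int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q_2)$.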


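Adding these two summands yields the supermartingale inequality on $\{t_1 < \tau\}$:

$$1_{\{t_1 < \tau\}} E_Q\left[ \int_{0+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_2} \,\Big|\, \mathcal{F}(t_1) \right] \le 1_{\{t_1 < \tau\}} \left( \int_{0+}^{t_1} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1} \right).$$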

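(ii) For $t \ge \tau$, we have $A^{\mathcal{S}^K}(Q)_t = A^{\mathcal{S}^K}(Q_2)_t - \bigl( A^{\mathcal{S}^K}(Q_2)_\tau - A^{\mathcal{S}^K}(Q_1)_\tau \bigr)$. Hence on $\{t_1 \ge \tau\}$ the left-hand side of (A.11) equals

$$\begin{aligned}
& 1_{\{t_1 \ge \tau\}} \left( \int_{0+}^{t_1} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1} + E_Q\left[ \int_{t_1+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q)_{t_2} - A^{\mathcal{S}^K}(Q)_{t_1} \bigr) \,\Big|\, \mathcal{F}(t_1) \right] \right) \\
&= 1_{\{t_1 \ge \tau\}} \left( \int_{0+}^{t_1} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q)_{t_1} + E_Q\left[ \int_{t_1+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q_2)_{t_2} - A^{\mathcal{S}^K}(Q_2)_{t_1} \bigr) \,\Big|\, \mathcal{F}(t_1) \right] \right).
\end{aligned}$$

Using that $t_1 \ge \tau$, i.e. $\frac{Y^1_\tau}{Y^2_\tau}$ is $\mathcal{F}(t_1)$-measurable on $\{t_1 \ge \tau\}$, and $Y_T = Y^2_T \frac{Y^1_\tau}{Y^2_\tau}$, we find

$$\begin{aligned}
& 1_{\{t_1 \ge \tau\}} E_Q\left[ \int_{t_1+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q_2)_{t_2} - A^{\mathcal{S}^K}(Q_2)_{t_1} \bigr) \,\Big|\, \mathcal{F}(t_1) \right] \\
&= 1_{\{t_1 \ge \tau\}} \frac{Y^1_\tau}{Y^2_\tau} E_{Q_2}\left[ \int_{t_1+}^{t_2} \pi_s \cdot \frac{dS_s}{S_{s-}} - \bigl( A^{\mathcal{S}^K}(Q_2)_{t_2} - A^{\mathcal{S}^K}(Q_2)_{t_1} \bigr) \,\Big|\, \mathcal{F}(t_1) \right] \le 0,
\end{aligned}$$

since $\int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} - A^{\mathcal{S}^K}(Q_2)$ is a $Q_2$-supermartingale. This gives us the supermartingale (in-)equality on the set $\{t_1 \ge \tau\}$.

Combining (i) and (ii) completes the proof.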

We use this lemma to prove a stochastic control lemma. The proof is well

known (e.g. El Karoui and Quenez, 1995; Follmer and Kramkov, 1997; Delbaen,


2003). It is basically the Snell envelope of Mertens (1972); Dellacherie and

Meyer (1980).

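A.3.16 Lemma. Suppose

$$\sup_{Y \in \mathcal{Y}^K} E_P\left[ X Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0$$

for some $X \in L^0_+(P)$ and a consumption process $c$. Then there exists an optional, cadlag stochastic process $W$ such that almost surely

$$W_t = \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K)} \mathcal{E}(A^{\mathcal{S}^K}(Q))_t\, E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right].$$

$\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ and $\frac{W - \int_0^{\cdot} c(s)\,ds}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ are non-negative cadlag $Q$-supermartingales for all $Q \in \mathcal{M}(\mathcal{S}^K)$.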

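Proof. Define a collection of random variables

$$W_t \triangleq \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K)} \mathcal{E}(A^{\mathcal{S}^K}(Q))_t\, E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right], \qquad \text{(A.12)}$$

indexed by $t \in I$. From the properties of the essential supremum, $W_t$ exists and is $\mathcal{F}(t)$-measurable.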

Fix $Q \in \mathcal{M}(\mathcal{S}^K)$ arbitrarily. First we show that $\frac{W_t}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t}$ and $\frac{W_t - \int_0^t c(s)\,ds}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t}$ satisfy $Q$-supermartingale-type inequalities. Afterwards, we prove that there exists a progressively measurable cadlag stochastic process $W$ that is for each $t \in I$ almost surely equal to the collection of random variables defined above. To this end, it suffices to show that $t \mapsto E_Q[W_t]$ is right-continuous in $t$.

Before we do so, we observe that we can assume that $\mathcal{E}(A^{\mathcal{S}^K}(Q))_t$ is uniformly bounded since a nondecreasing predictable process is locally bounded (e.g. Liptser and Shiryaev, 1989, Chapter 1.6, Lemma 1). If it is not bounded, then stop the process and walk through the proof for this stopped process. This shows that the processes in question are local $Q$-supermartingales. And since they are bounded from below by 0, they are indeed $Q$-supermartingales by the conditional version of Fatou's Lemma (Rogers and Williams, 1994b, Chapter 4, Remark 14.4). Hence let us assume that $\mathcal{E}(A^{\mathcal{S}^K}(Q))_t$ is bounded.

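Supermartingale property: To start with, we define the set

$$\mathcal{M}_t(\mathcal{S}^K) \triangleq \left\{ \tilde{Q} \in \mathcal{M}(\mathcal{S}^K) : E_Q\left[ \frac{d\tilde{Q}}{dQ} \,\Big|\, \mathcal{F}(s) \right] = 1 \text{ for } s \in [0, t] \right\}.$$

We have $A^{\mathcal{S}^K}(\tilde{Q}) = A^{\mathcal{S}^K}(Q)$ on $[0, t]$. Furthermore

$$\mathcal{E}(A^{\mathcal{S}^K}(Q))_{t_1}\, \frac{1}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t} = \frac{1}{\mathcal{E}(A^{\mathcal{S}^K}(Q) I_{\{\cdot > t_1\}})_t}$$

for $t > t_1$. Using this, $A^{\mathcal{S}^K}(\tilde{Q})_s = A^{\mathcal{S}^K}(Q)_s$ for $s \in [0, t]$ and $\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)$, and Lemma A.3.15, it follows from (A.12) that we can write

$$\frac{W_t}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t} = \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right].$$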

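To prove the supermartingale inequality, suppose we have shown that the conditional expectation operator and ess-sup commute. Then for $u < t$

$$\begin{aligned}
E_Q\left[ \frac{W_t}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t} \,\Big|\, \mathcal{F}(u) \right]
&= \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_Q\left[ E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right] \,\Big|\, \mathcal{F}(u) \right] \\
&= \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(u) \right] \\
&\le \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_u(\mathcal{S}^K)} E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee u}}\,ds \,\Big|\, \mathcal{F}(u) \right] \\
&= \frac{W_u}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_u}.
\end{aligned}$$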


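For the second equality we use elementary probability theory (e.g. Musiela and Rutkowski, 1997, Lemma A.0.4) and $\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)$ to find for $s \le t$

$$E_{\tilde{Q}}[Z \mid \mathcal{F}(s)] = \frac{E_Q[Z\,(d\tilde{Q}/dQ) \mid \mathcal{F}(s)]}{E_Q[(d\tilde{Q}/dQ) \mid \mathcal{F}(s)]} = E_Q\left[ Z\,(d\tilde{Q}/dQ) \mid \mathcal{F}(s) \right].$$

For the inequality we observe that $\mathcal{M}_t(\mathcal{S}^K) \subset \mathcal{M}_u(\mathcal{S}^K)$ and $\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee u} \le \mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee t}$.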
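The way the Bayes rule and the tower property combine in the second equality can be spelled out as follows. This is a sketch: $\zeta$ stands for the non-negative random variable inside the inner conditional expectation, $\tilde{Q}$ for a measure in $\mathcal{M}_t(\mathcal{S}^K)$ and $D = d\tilde{Q}/dQ$ for its density with respect to the fixed measure $Q$; these symbols are introduced here for illustration only.

```latex
\begin{align*}
E_Q\bigl[\, E_{\tilde{Q}}[\zeta \mid \mathcal{F}(t)] \,\big|\, \mathcal{F}(u) \bigr]
  &= E_Q\bigl[\, E_Q[\zeta D \mid \mathcal{F}(t)] \,\big|\, \mathcal{F}(u) \bigr]
     && \text{(Bayes rule, } E_Q[D \mid \mathcal{F}(t)] = 1\text{)} \\
  &= E_Q[\zeta D \mid \mathcal{F}(u)]
     && \text{(tower property, } u < t\text{)} \\
  &= E_{\tilde{Q}}[\zeta \mid \mathcal{F}(u)]
     && \text{(Bayes rule, } E_Q[D \mid \mathcal{F}(u)] = 1\text{)} .
\end{align*}
```

Both applications of the Bayes rule rely on the normalisation $E_Q[D \mid \mathcal{F}(s)] = 1$ for $s \le t$, which is exactly the defining property of $\mathcal{M}_t(\mathcal{S}^K)$.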

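By the same reasoning

$$\begin{aligned}
E_Q\left[ \frac{W_t - \int_0^t c(s)\,ds}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t} \,\Big|\, \mathcal{F}(u) \right]
&= \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_t^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_s}\,ds \,\Big|\, \mathcal{F}(u) \right] \\
&\le \operatorname*{ess\,sup}_{\tilde{Q} \in \mathcal{M}_u(\mathcal{S}^K)} E_{\tilde{Q}}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_u^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_s}\,ds \,\Big|\, \mathcal{F}(u) \right] \\
&= \frac{W_u - \int_0^u c(s)\,ds}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_u}.
\end{aligned}$$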

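It remains to show that the conditional expectation operator and the ess-sup commute. Before we prove this, we define for notational convenience four $\mathcal{F}(t)$-measurable random variables. Assume given $Q_i \in \mathcal{M}_t(\mathcal{S}^K)$ and define

$$Z_i \triangleq E_{Q_i}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q_i))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q_i))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right]$$

and

$$Z^c_i \triangleq E_{Q_i}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q_i))_T} + \int_t^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q_i))_s}\,ds \,\Big|\, \mathcal{F}(t) \right]$$

for $i = 1, 2$. It is known (Striebel, 1975, Theorem A.2.2) that the two operations commute for non-negative integrable random variables, if for $Q_1, Q_2 \in \mathcal{M}_t(\mathcal{S}^K)$ there always exist $Q_*, Q^c_* \in \mathcal{M}_t(\mathcal{S}^K)$ with almost surely

$$E_{Q_*}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q_*))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q_*))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right] = Z_1 \vee Z_2$$

and

$$E_{Q^c_*}\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q^c_*))_T} + \int_t^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q^c_*))_s}\,ds \,\Big|\, \mathcal{F}(t) \right] = Z^c_1 \vee Z^c_2$$

respectively. Since $Z_i = Z^c_i + \int_0^t \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t}\,ds$, we prove only the first equality. This is achieved if we define the measure $Q_*$ with the help of the density $\frac{dQ_1}{dP} I_{\{Z_1 > Z_2\}} + \frac{dQ_2}{dP} (1 - I_{\{Z_1 > Z_2\}})$. By Lemma A.3.17, $Q_* \in \mathcal{M}_t(\mathcal{S}^K)$ and

$$A^{\mathcal{S}^K}(Q_*)_s = A^{\mathcal{S}^K}(Q_1)_s\, E_P\left[ I_{\{Z_1 > Z_2\}} \mid \mathcal{F}(s) \right] + A^{\mathcal{S}^K}(Q_2)_s \left( 1 - E_P\left[ I_{\{Z_1 > Z_2\}} \mid \mathcal{F}(s) \right] \right).$$

Using this and $A^{\mathcal{S}^K}(Q_1)_s = A^{\mathcal{S}^K}(Q_2)_s = A^{\mathcal{S}^K}(Q)_s$ for $s \le t$ implies the desired equality.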

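Cadlag property: As $\mathcal{E}(A^{\mathcal{S}^K}(Q))$ is cadlag by assumption, it suffices to prove that $\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ is cadlag; the other case then follows immediately since $\int c(s)\,ds$ is continuous. Indeed, it suffices that

$$t \mapsto E_Q\left[ \frac{W_t}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t} \right]$$

is right-continuous (Liptser and Shiryaev, 2000, Theorem 3.1). Using that the expectation operator and ess-sup commute, we have to show that for a sequence $t_n \downarrow t$ the sequence $\sup_{\tilde{Q} \in \mathcal{M}_{t_n}(\mathcal{S}^K)} E_{\tilde{Q}}[Z_{t_n}]$ converges to $\sup_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}[Z_t]$, where

$$Z_{t_n} \triangleq \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(\tilde{Q}))_{s \vee t_n}}\,ds.$$

The inequality $\lim_{n \to \infty} \sup_{\tilde{Q} \in \mathcal{M}_{t_n}(\mathcal{S}^K)} E_{\tilde{Q}}[Z_{t_n}] \le \sup_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}[Z_t]$ follows immediately from the fact that $A^{\mathcal{S}^K}(\tilde{Q})$ is non-decreasing and $\mathcal{M}_{t_n}(\mathcal{S}^K) \subset \mathcal{M}_t(\mathcal{S}^K)$ (i.e. the inequality $\sup_{\tilde{Q} \in \mathcal{M}_{t_n}(\mathcal{S}^K)} E_{\tilde{Q}}[Z_{t_n}] \le \sup_{\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)} E_{\tilde{Q}}[Z_t]$).

We only have to prove the converse inequality. To this end, let $\varepsilon > 0$ be given, and fix $\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)$. Then it follows from the right-continuity of $A^{\mathcal{S}^K}(\tilde{Q})$ and dominated convergence that $\lim_{t_n \downarrow t} E_{\tilde{Q}}[Z_{t_n}] = E_{\tilde{Q}}[Z_t]$. Hence there exists some $n(\varepsilon)$ such that $E_{\tilde{Q}}[Z_{t_n}] + \varepsilon \ge E_{\tilde{Q}}[Z_t]$ for all $n \ge n(\varepsilon)$. Furthermore, from the definition of $\mathcal{M}_t(\mathcal{S}^K)$ and the right-continuity of the filtration, there exists some $n(\tilde{Q})$ with $\tilde{Q} \in \mathcal{M}_{t_n}(\mathcal{S}^K)$ for all $n \ge n(\tilde{Q})$. This yields the inequality $\sup_{\tilde{Q}' \in \mathcal{M}_{t_n}(\mathcal{S}^K)} E_{\tilde{Q}'}[Z_{t_n}] + \varepsilon \ge E_{\tilde{Q}}[Z_{t_n}] + \varepsilon \ge E_{\tilde{Q}}[Z_t]$ for all $n \ge n(\tilde{Q}) \vee n(\varepsilon)$. And $\tilde{Q}$ was arbitrary: $\lim_{n \to \infty} \sup_{\tilde{Q}' \in \mathcal{M}_{t_n}(\mathcal{S}^K)} E_{\tilde{Q}'}[Z_{t_n}] + \varepsilon \ge E_{\tilde{Q}}[Z_t]$ for all $\tilde{Q} \in \mathcal{M}_t(\mathcal{S}^K)$, which implies the desired inequality.

Clearly, the cadlag process $W$ is optional.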


To complete the proof, we prove a structural property of the set Mt(SK).

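A.3.17 Lemma. With the notation of the proof of Lemma A.3.16, suppose that $Q_i \in \mathcal{M}_t(\mathcal{S}^K)$, $i = 1, 2$, and let $Z$ be an $\mathcal{F}(t)$-measurable random variable. Then $Q_* \in \mathcal{M}_t(\mathcal{S}^K)$, if $Q_*$ is defined by $\frac{dQ_*}{dP} \triangleq I_{\{Z \ge 0\}} \frac{dQ_1}{dP} + I_{\{Z < 0\}} \frac{dQ_2}{dP}$. The upper variation process is given by

$$A^{\mathcal{S}^K}(Q_*)_s \triangleq \begin{cases} A^{\mathcal{S}^K}(Q)_s & s < t \\ I_{\{Z \ge 0\}} A^{\mathcal{S}^K}(Q_1)_s + I_{\{Z < 0\}} A^{\mathcal{S}^K}(Q_2)_s & s \ge t; \end{cases}$$

$A^{\mathcal{S}^K}(Q_*)_s = A^{\mathcal{S}^K}(Q_1)_s E_P[I_{\{Z \ge 0\}} \mid \mathcal{F}(s)] + A^{\mathcal{S}^K}(Q_2)_s E_P[I_{\{Z < 0\}} \mid \mathcal{F}(s)]$ holds.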

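Proof. $Q_*$ is a probability measure. Indeed,

$$E_P\left[ \frac{dQ_*}{dP} \right] = E_P\left[ I_{\{Z \ge 0\}} E_P\left[ \frac{dQ_1}{dP} \,\Big|\, \mathcal{F}(t) \right] + I_{\{Z < 0\}} E_P\left[ \frac{dQ_2}{dP} \,\Big|\, \mathcal{F}(t) \right] \right] = 1$$

if we observe that $E_P\left[ \frac{dQ_1}{dP} \mid \mathcal{F}(t) \right] = E_P\left[ \frac{dQ_2}{dP} \mid \mathcal{F}(t) \right]$ from the definition of $\mathcal{M}_t(\mathcal{S}^K)$.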

As usual, stopping if necessary, we assume that all processes are supermartingales. We want to show that for a cadlag semimartingale $W$, the process $W - A^{\mathcal{S}^K}(Q_*)$ is a $Q_*$-supermartingale, provided that $W - A^{\mathcal{S}^K}(Q_i)$, $i = 1, 2$, are. Then the fact that $A^{\mathcal{S}^K}(Q_*)$ is the upper variation process follows by contradiction from the observation that the $A^{\mathcal{S}^K}(Q_i)$ are the upper variation processes. Finally, using that $A^{\mathcal{S}^K}(Q_*)_s = A^{\mathcal{S}^K}(Q_i)_s = A^{\mathcal{S}^K}(Q)_s$ for $s < t$, the equivalence of the two definitions of $A^{\mathcal{S}^K}(Q_*)_s$ is immediate, $Z$ being $\mathcal{F}(t)$-measurable.

To prove the supermartingale property, we consider three cases to get a supermartingale inequality for $E_{Q_*}\left[ W_u - A^{\mathcal{S}^K}(Q_*)_u \mid \mathcal{F}(s) \right]$. In the following, we use Musiela and Rutkowski (1997, Lemma A.0.4) freely.

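(i) $u > s \ge t$: since $Z$ is $\mathcal{F}(s)$-measurable, we find

$$E_{Q_*}\left[ W_u - A^{\mathcal{S}^K}(Q_*)_u \mid \mathcal{F}(s) \right] = I_{\{Z \ge 0\}} \frac{E_P\left[ (W_u - A^{\mathcal{S}^K}(Q_1)_u) \frac{dQ_1}{dP} \mid \mathcal{F}(s) \right]}{E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(s) \right]} + I_{\{Z < 0\}} \frac{E_P\left[ (W_u - A^{\mathcal{S}^K}(Q_2)_u) \frac{dQ_2}{dP} \mid \mathcal{F}(s) \right]}{E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(s) \right]}.$$

Furthermore

$$\begin{aligned}
E_P\left[ (W_u - A^{\mathcal{S}^K}(Q_i)_u) \frac{dQ_i}{dP} \,\Big|\, \mathcal{F}(s) \right]
&= \frac{E_{Q_i}\left[ W_u - A^{\mathcal{S}^K}(Q_i)_u \mid \mathcal{F}(s) \right]}{E_{Q_i}\left[ \frac{dP}{dQ_i} \mid \mathcal{F}(s) \right]} \\
&\le \frac{W_s - A^{\mathcal{S}^K}(Q_i)_s}{E_{Q_i}\left[ \frac{dP}{dQ_i} \mid \mathcal{F}(s) \right]} \\
&= \frac{E_{Q_i}\left[ W_s - A^{\mathcal{S}^K}(Q_i)_s \mid \mathcal{F}(s) \right]}{E_{Q_i}\left[ \frac{dP}{dQ_i} \mid \mathcal{F}(s) \right]} \\
&= E_P\left[ (W_s - A^{\mathcal{S}^K}(Q_i)_s) \frac{dQ_i}{dP} \,\Big|\, \mathcal{F}(s) \right].
\end{aligned}$$

Combining the last two equations yields

$$\begin{aligned}
E_{Q_*}\left[ W_u - A^{\mathcal{S}^K}(Q_*)_u \mid \mathcal{F}(s) \right]
&\le I_{\{Z \ge 0\}} \frac{E_P\left[ (W_s - A^{\mathcal{S}^K}(Q_1)_s) \frac{dQ_1}{dP} \mid \mathcal{F}(s) \right]}{E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(s) \right]} + I_{\{Z < 0\}} \frac{E_P\left[ (W_s - A^{\mathcal{S}^K}(Q_2)_s) \frac{dQ_2}{dP} \mid \mathcal{F}(s) \right]}{E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(s) \right]} \\
&= E_{Q_*}\left[ W_s - A^{\mathcal{S}^K}(Q_*)_s \mid \mathcal{F}(s) \right] = W_s - A^{\mathcal{S}^K}(Q_*)_s,
\end{aligned}$$

as desired.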

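(ii) $u \ge t > s$: simply write

$$E_{Q_*}\left[ W_u - A^{\mathcal{S}^K}(Q_*)_u \mid \mathcal{F}(s) \right] = E_{Q_*}\left[ E_{Q_*}\left[ (W_u - A^{\mathcal{S}^K}(Q_*)_u) \mid \mathcal{F}(t) \right] \,\Big|\, \mathcal{F}(s) \right],$$

and apply step (i) above for the special case $s = t$. This reduces step (ii) to step (iii) below.

(iii) $t \ge u > s$: taking $\mathcal{F}(u)$-conditional expectations first, and using the definition of $\mathcal{M}_t(\mathcal{S}^K)$, especially $E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(u) \right] = E_P\left[ \frac{dQ_*}{dP} \mid \mathcal{F}(s) \right] = 1$ and $A^{\mathcal{S}^K}(Q_*)_u = A^{\mathcal{S}^K}(Q)_u$, yields

$$\begin{aligned}
E_{Q_*}\left[ W_u - A^{\mathcal{S}^K}(Q_*)_u \mid \mathcal{F}(s) \right]
&= E_P\left[ (W_u - A^{\mathcal{S}^K}(Q_*)_u)\, E_P\left[ \frac{dQ_*}{dP} \,\Big|\, \mathcal{F}(u) \right] \,\Big|\, \mathcal{F}(s) \right] \\
&= E_Q\left[ (W_u - A^{\mathcal{S}^K}(Q)_u) \mid \mathcal{F}(s) \right] \\
&\le W_s - A^{\mathcal{S}^K}(Q)_s = W_s - A^{\mathcal{S}^K}(Q_*)_s.
\end{aligned}$$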


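Finally, we want to prove that $A^{\mathcal{S}^K}(Q_*)$ is an upper variation process and do so by contradiction. Suppose that there exists a candidate upper variation process $A$ such that $A \le A^{\mathcal{S}^K}(Q_*)$ and $A < A^{\mathcal{S}^K}(Q_*)$ on a set $B$ with positive $\lambda \otimes P$ measure. Without loss of generality assume that $B \cap ([0, T] \times \{Z \ge 0\})$ has positive measure. Then $\frac{dQ_{**}}{dP} \triangleq I_{\{Z \ge 0\}} \frac{dQ_*}{dP} + I_{\{Z < 0\}} \frac{dQ_1}{dP}$ defines a measure $Q_{**}$ and $Q_{**} = Q_1$. Now if $A^{\mathcal{S}^K}(Q_{**})_s \triangleq A^{\mathcal{S}^K}(Q_*)_s E_P[I_{\{Z \ge 0\}} \mid \mathcal{F}(s)] + A^{\mathcal{S}^K}(Q_1)_s E_P[I_{\{Z < 0\}} \mid \mathcal{F}(s)]$, then $A^{\mathcal{S}^K}(Q_{**})$ is a candidate upper variation process for $Q_{**} = Q_1$, $A^{\mathcal{S}^K}(Q_{**}) \le A^{\mathcal{S}^K}(Q_1)$ and $A^{\mathcal{S}^K}(Q_{**}) < A^{\mathcal{S}^K}(Q_1)$ on a set with positive probability. This is a contradiction to the definition of $A^{\mathcal{S}^K}(Q_1)$.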

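Proof of Proposition A.3.14. (i) (A.9) implies (A.10). Let us assume (A.10) holds. As in Lemma A.3.16, define a progressively measurable cadlag process $(\hat{W}_t)$ by

$$\hat{W}_t = \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K)} \mathcal{E}(A^{\mathcal{S}^K}(Q))_t\, E_Q\left[ \frac{X}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_T} + \int_0^T \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s \vee t}}\,ds \,\Big|\, \mathcal{F}(t) \right].$$

From Lemma A.3.16, $\frac{\hat{W}_t}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_t}$ is a $Q$-supermartingale. By Theorem A.3.7, there exists some non-negative, nondecreasing, optional process $C$ and some $\hat{\pi} \in K$ such that $\hat{W}_t = \hat{W}_0\, \mathcal{E}\left( \int_{0+}^{t} \hat{\pi}_s \cdot \frac{dS_s}{S_{s-}} - C_t \right)$. Replacing $X$ by some larger random variable $\hat{X}$ if necessary, we can and will assume in the following that $C \equiv 0$ (Lemma A.3.4 and $\hat{W} \ge 0$).

Define a process $W$ by $W \triangleq \hat{W} - \int_0^{\cdot} c(s)\,ds$ and a portfolio-proportion process $\pi \triangleq \hat{\pi}\, \frac{\hat{W}}{W}\, 1_{\{W > 0\}}$. Note that the process $(W)$ is well-defined: it is adapted, cadlag and progressively measurable, since both $\hat{W}$ and $\int_0^{\cdot} c(s)\,ds$ are. Substituting, we find the stochastic differential equation

$$W = W_0\, \mathcal{E}\left( \int_{0+}^{\cdot} \pi_s \cdot \frac{dS_s}{S_{s-}} - \int_0^{\cdot} \frac{c(s)}{W_{s-}}\, d\bigl( s\, I_{\{W_{s-} > 0\}} \bigr) \right).$$

Lemma A.3.16 proves that $\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))} = \frac{\hat{W} - \int_0^{\cdot} c(s)\,ds}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ is a supermartingale. $\pi \in K$ now follows from Theorem A.3.7, and $W_T = \hat{X} \ge X$ from the definition. We conclude $(\pi, c) \in \mathcal{A}^K_{\pi}(S, W_0)$.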


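(ii) We start by proving that for $Y \in \mathcal{Y}^K$ arbitrary

$$E_P\left[ W_T Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0,$$

where $W$ is the wealth process associated with $(\pi, c) \in \mathcal{A}^K_{\pi}(S, W_0)$. From Theorem A.3.7, $WY$ is a non-negative cadlag supermartingale. Let $\tau \triangleq \inf\{t \ge 0 : W_t Y_t = 0\}$ be the stopping time (Rogers and Williams, 1994a, Lemma 75.1) when this process hits zero. From Revuz and Yor (1999, Chapter 2, Proposition 3.4), $W_t Y_t = 0$ on $[\tau, T]$; and from the definition of $c$ and $W$, $c = 0$ on $[\tau, T]$. Hence, it suffices to prove

$$E_P\left[ W_\tau Y_\tau + \int_0^\tau c(s) Y_s\,ds \right] \le W_0. \qquad \text{(A.13)}$$

From the definition, $W_t = W_0 + \int_{0+}^{t} W_{s-}\,dX_s - \int_0^t c(s)\,ds$, where $X_t = \int_{0+}^{t} \pi_s \cdot \frac{dS_s}{S_{s-}}$. On $[0, \tau]$ we find $W = \mathcal{E}(X) \left( W_0 - \int_{0+}^{\cdot} \frac{c(s)}{\mathcal{E}(X)_{s-}}\,ds \right)$ (Theorem C.3.1). By Theorem A.3.7 and Doob's optional sampling theorem (Rogers and Williams, 1994a, Chapter II, Theorem 77.5) the processes $\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ and $\frac{\mathcal{E}(X)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$ are (cadlag) $Q$-supermartingales on $[0, \tau]$. Using Lemma A.3.4, we could easily find a process $\hat{Y}$ such that $\mathcal{E}(\hat{Y}) = \frac{\mathcal{E}(X)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))}$. Since $\mathcal{E}(X) > 0$ on $[0, \tau)$ and $A^{\mathcal{S}^K}(Q) < \infty$ almost surely (Follmer and Kramkov, 1997, Lemma 2.1), $\mathcal{E}(\hat{Y}) > 0$ on $[0, \tau)$, and we conclude from Lemma A.3.2 that $\hat{Y}^\tau$ is a local supermartingale.

Using all this, we write

$$\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))} = \mathcal{E}(\hat{Y}) \left( W_0 - \int \frac{1}{\mathcal{E}(\hat{Y})_{s-}}\, \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,ds \right)$$

on $[0, \tau]$. This is the solution to the stochastic differential equation

$$\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))} = W_0 + \int \frac{W_{s-}}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,d\hat{Y}_s - \int \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,ds.$$

Since $\hat{Y}$ is a local supermartingale, it follows from Lemma A.3.1 that

$$\frac{W}{\mathcal{E}(A^{\mathcal{S}^K}(Q))} + \int \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,ds = W_0 + \int \frac{W_{s-}}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,d\hat{Y}_s$$

is a local supermartingale on $[0, \tau]$. This is bounded from below by 0, and we can apply the Fatou lemma to find

$$E_Q\left[ \frac{W_\tau}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_\tau} + \int_{0+}^{\tau} \frac{c(s)}{\mathcal{E}(A^{\mathcal{S}^K}(Q))_{s-}}\,ds \right] \le W_0.$$

$A^{\mathcal{S}^K}(Q)$ being nondecreasing, this implies (A.13).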

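It remains to show that

$$\sup_{Y \in \tilde{\mathcal{Y}}^K} E_P\left[ W_T Y_T + \int_0^T c(s) Y_s\,ds \right] \le W_0.$$

To this end, let $Y \in \tilde{\mathcal{Y}}^K$ be arbitrary. Without loss of generality, we can assume that $Y \in \tilde{\mathcal{Y}}^K$ is maximal in the sense of Lemma A.3.11. By definition, there exists a sequence $(Y^n)_{n > 0}$, $Y^n \in \mathcal{Y}^K$, with $Y = \liminf_{n \to \infty} \sum_{k=1}^{l(n)} \beta(k, n) Y^n$ almost surely, and especially $Y_T = \liminf_{n \to \infty} \sum_{k=1}^{l(n)} \beta(k, n) Y^n_T$ (Lemma A.2.5 and the definition of $\tilde{\mathcal{Y}}^K$). Here $\beta(k, n) \ge 0$ and $\sum_k \beta(k, n) = 1$. Using the Fatou lemma, we therefore get the estimate

$$\begin{aligned}
E_P\left[ W_T Y_T + \int_0^T c(s) Y_s\,ds \right]
&= E_P\left[ W_T \liminf_{n \to \infty} \sum_{k=1}^{l(n)} \beta(k, n) Y^n_T + \int_0^T c(s) \liminf_{n \to \infty} \sum_{k=1}^{l(n)} \beta(k, n) Y^n_s\,ds \right] \\
&\le \liminf_{n \to \infty} \sum_{k=1}^{l(n)} \beta(k, n)\, E_P\left[ W_T Y^n_T + \int_0^T c(s) Y^n_s\,ds \right] \le W_0.
\end{aligned}$$

This completes the proof.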

A.3.2 Portfolio Processes

Using the “additive” versions in Follmer and Kramkov (1997), we can prove

the analogue proposition with portfolio processes. Let us very quickly sketch

this and rewrite the notation first. Let K ⊂ La (S) be closed with respect to

dS . Further assume that 0 ∈ K and that K is convex in the following sense:


if β, γ ∈ K, then αβ + (1 − α)γ ∈ K for any one-dimensional predictable

process α such that 0 ≤ α ≤ 1. Consider the predictably convex family of

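semimartingales $\mathcal{S}^K = \left\{ \int_{0+}^{\cdot} \xi_s \cdot dS_s : \xi \in K \right\}$. Let $\mathcal{M}(\mathcal{S}^K)$ and $A^{\mathcal{S}^K}(Q)$ be as in Definition A.3.5. Set $\mathcal{M}^n(\mathcal{S}^K) \triangleq \{ Q \in \mathcal{M}(\mathcal{S}^K) : A^{\mathcal{S}^K}(Q)_T \le n \text{ a.s.} \}$ and $\mathcal{M}^b(\mathcal{S}^K) \triangleq \cup_{n \ge 1} \mathcal{M}^n(\mathcal{S}^K)$. For the proof, we need an analogue to Theorem A.3.9: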

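A.3.18 Proposition. With the notation of this subsection, let $\mathcal{S}_t(Q)$ be the set of stopping times with values in $[t, T]$ such that $A^{\mathcal{S}^K}(Q)_\tau - A^{\mathcal{S}^K}(Q)_t$ is bounded for all $\tau \in \mathcal{S}_t(Q)$. Suppose that

$$\sup_{Q \in \mathcal{M}(\mathcal{S}^K)}\ \sup_{\tau \in \mathcal{S}_0(Q)} E_Q\left[ X 1_{\{\tau = T\}} - A^{\mathcal{S}^K}(Q)_\tau \right] < \infty$$

for some random variable $X$. Then there exists $\xi \in K$ with

$$W_0 + \int_0^t \xi_s \cdot dS_s \ge \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K),\ \tau \in \mathcal{S}_t(Q)} \left( E_Q\left[ X 1_{\{\tau = T\}} - A^{\mathcal{S}^K}(Q)_\tau \mid \mathcal{F}(t) \right] + A^{\mathcal{S}^K}(Q)_t \right).$$

Proof. Follmer and Kramkov (1997, Proposition 4.2).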

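A.3.19 Corollary. With the notation of this subsection, suppose that

$$\sup_{Q \in \mathcal{M}^b(\mathcal{S}^K)} E_Q\left[ X - A^{\mathcal{S}^K}(Q)_T \right] < \infty$$

for some random variable $X$. Then there exists $\xi \in K$ with

$$W_0 + \int_0^t \xi_s \cdot dS_s \ge \operatorname*{ess\,sup}_{Q \in \mathcal{M}(\mathcal{S}^K),\ \tau \in \mathcal{S}_t(Q)} \left( E_Q\left[ X 1_{\{\tau = T\}} - A^{\mathcal{S}^K}(Q)_\tau \mid \mathcal{F}(t) \right] + A^{\mathcal{S}^K}(Q)_t \right).$$

Proof. Since $\mathcal{M}^b(\mathcal{S}^K) \subset \mathcal{M}(\mathcal{S}^K)$, and $T \in \mathcal{S}_0(Q)$ for $Q \in \mathcal{M}^b(\mathcal{S}^K)$, we have

$$\sup_{Q \in \mathcal{M}^b(\mathcal{S}^K)} E_Q\left[ X - A^{\mathcal{S}^K}(Q)_T \right] \le \sup_{Q \in \mathcal{M}(\mathcal{S}^K)}\ \sup_{\tau \in \mathcal{S}_0(Q)} E_Q\left[ X 1_{\{\tau = T\}} - A^{\mathcal{S}^K}(Q)_\tau \right].$$

To prove the corollary, we show that the converse inequality is true, too. To this end, fix $Q \in \mathcal{M}(\mathcal{S}^K)$, $\tau \in \mathcal{S}_0(Q)$. Let $Y$ be the density process, i.e. $Y_t = E_P\left[ \frac{dQ}{dP} \mid \mathcal{F}(t) \right]$, and let $Y^m$ be the density process of the equivalent local martingale measure, i.e. $Y^m_t = E_P\left[ \frac{dQ^m}{dP} \mid \mathcal{F}(t) \right]$. Define a measure $Q^b$ by $\int Y^b_T\,dP$ with the help of the process $Y^b$, specified by

$$Y^b_t = \begin{cases} Y_t & t < \tau \\ Y^m_t\, \dfrac{Y_\tau}{Y^m_\tau} & t \ge \tau. \end{cases}$$

From Lemma A.3.15$^{14}$, $Q^b \in \mathcal{M}(\mathcal{S}^K)$ with the upper variation process $A^{\mathcal{S}^K}(Q^b)$ given by $A^{\mathcal{S}^K}(Q^b)_t = A^{\mathcal{S}^K}(Q)_{t \wedge \tau}$. Using the equality $E_{Q^b}\left[ X - A^{\mathcal{S}^K}(Q^b)_T \right] = E_Q\left[ I_{\{\tau = T\}} X - A^{\mathcal{S}^K}(Q)_T \right] + E_{Q^m}\left[ I_{\{\tau < T\}} X \right]$ and $A^{\mathcal{S}^K}(Q)_T = A^{\mathcal{S}^K}(Q)_\tau$ this gives the inequality

$$E_Q\left[ X 1_{\{\tau = T\}} - A^{\mathcal{S}^K}(Q)_\tau \right] \le E_{Q^b}\left[ X - A^{\mathcal{S}^K}(Q^b)_T \right],$$

proving the claim, since $Q \in \mathcal{M}(\mathcal{S}^K)$, $\tau \in \mathcal{S}_0(Q)$ was arbitrary.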

We now prove the key result of this section (see also Mnif and Pham (2001),

Proposition 4.1).

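A.3.20 Proposition. With the notation of this subsection:

(i) Suppose

$$\sup_{Q \in \mathcal{M}^b(\mathcal{S}^K)} E_Q\left[ X + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \right] \le W_0. \qquad \text{(A.14)}$$

Then there exists a portfolio process $\xi$ with $(\xi, c) \in \mathcal{A}^K(S, W_0)$ such that for the wealth process $(W_t)_{t \in I}$ defined by (A.4) $W_T \ge X$ almost surely holds.

(ii) Conversely, if $(\xi, c) \in \mathcal{A}^K(S, W_0)$, then

$$\sup_{Q \in \mathcal{M}^b(\mathcal{S}^K)} E_Q\left[ W_T + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \right] \le W_0.$$

$^{14}$ Strictly speaking, a version of this lemma for portfolio processes. But this only amounts to a change of notation.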


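Proof. (i): Corollary A.3.19 proves that (A.14) implies existence of some $\xi \in K$ such that

$$W_0 + \int_{0+}^{t} \xi_s \cdot dS_s \ge \operatorname*{ess\,sup}_{Q \in \mathcal{M}^b(\mathcal{S}^K)} \left( E_Q\left[ X + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \,\Big|\, \mathcal{F}(t) \right] + A^{\mathcal{S}^K}(Q)_t \right)^{+}.$$

Using $A^{\mathcal{S}^K}(Q^m) \equiv 0$ this implies almost surely

$$W_t = W_0 + \int_{0+}^{t} \xi_s \cdot dS_s - \int_0^t c(s)\,ds \ge E_{Q^m}\left[ X + \int_t^T c(s)\,ds \,\Big|\, \mathcal{F}(t) \right] \ge 0$$

as desired.

(ii): Follmer and Kramkov (1997, Theorem 4.1) (the version for portfolio processes of Theorem A.3.7) show that $\left( W_0 + \int_{0+}^{t} \xi_s \cdot dS_s - A^{\mathcal{S}^K}(Q)_t \right)_{t \in I}$ is a local $Q$-supermartingale for all $Q \in \mathcal{M}^b(\mathcal{S}^K)$. By the definition of $\mathcal{M}^b(\mathcal{S}^K)$, this local $Q$-supermartingale is bounded from below by the constant $-n$ for some $n$, hence a $Q$-supermartingale; using $W_0 + \int_{0+}^{T} \xi_s \cdot dS_s = W_T + \int_0^T c(s)\,ds$, this implies the inequality.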
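The final step of part (ii) can be written out as a short chain. The sketch below assumes that the upper variation process starts at $A^{\mathcal{S}^K}(Q)_0 = 0$, which the text does not state explicitly at this point.

```latex
\begin{align*}
E_Q\Bigl[ W_T + \int_0^T c(s)\,ds - A^{\mathcal{S}^K}(Q)_T \Bigr]
  &= E_Q\Bigl[ W_0 + \int_{0+}^{T} \xi_s \cdot dS_s - A^{\mathcal{S}^K}(Q)_T \Bigr]
     && \text{(wealth equation at } T\text{)} \\
  &\le W_0 - A^{\mathcal{S}^K}(Q)_0 = W_0
     && \text{(supermartingale property)} .
\end{align*}
```

Taking the supremum over $Q \in \mathcal{M}^b(\mathcal{S}^K)$ on the left-hand side then gives the bound claimed in (ii).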

There is no equivalent set to $\tilde{\mathcal{Y}}^K$ of Proposition A.3.14 in Proposition A.3.20. The reason is that it is not yet known how to enlarge the set $\mathcal{M}^b(\mathcal{S}^K)$ for portfolio processes with constraints (see also Mnif and Pham, 2001, p. 167). To give an idea of why the author was unable to enlarge the set properly: the set $\{ A^{\mathcal{S}^K}(Q)_T : Q \in \mathcal{M}^b(\mathcal{S}^K) \}$ is not bounded from above in general. This means that we cannot simply consider $\operatorname{cl}(\mathcal{M}^b(\mathcal{S}^K))$. Then, $A^{\mathcal{S}^K}(Y)_T$ for $Y \in \operatorname{cl}(\mathcal{M}^b(\mathcal{S}^K))$ would not necessarily be bounded, no matter how we define it. It does not help to use $\operatorname{cl}(\mathcal{M}^n(\mathcal{S}^K))$ as in the proof, since $\cup_{n \ge 1} \operatorname{cl}(\mathcal{M}^n(\mathcal{S}^K))$ is not closed in general. And a localization argument does not help either for the simple reason that $\{ A^{\mathcal{S}^K}(Q)_\tau : Q \in \mathcal{M}^b(\mathcal{S}^K) \}$ is not bounded in general.

This said, we observe that enlarging is straightforward if $\{ A^{\mathcal{S}^K}(Q)_T : Q \in \mathcal{M}^b(\mathcal{S}^K) \}$ is bounded by a constant. Then we can enlarge the set $\mathcal{M}^b(\mathcal{S}^K)$ using Lemma A.2.6, (i), as before. And Lemma A.2.6, (ii), enables us to find suitable upper variation processes for this enlarged set. We omit the details, but note that some authors assume that the upper variation process is bounded (e.g. Cuoco, 1997, Assumption 3). There are basically two prototypical situations that come to mind where $\{ A^{\mathcal{S}^K}(Q)_T : Q \in \mathcal{M}^b(\mathcal{S}^K) \}$, or slightly more generally $\{ A^{\mathcal{S}^K}(Q)_{\tau_n} : Q \in \mathcal{M}^b(\mathcal{S}^K) \}$ for a reducing sequence of stopping times $(\tau_n)_{n \ge 1}$, is actually bounded by a constant. The first is if $S$ is locally of finite variation (as is the case in discrete models), and $K$ is bounded (e.g. Shirakawa, 1994); and the second are cone constraints, where $A^{\mathcal{S}^K}(Q)_T = 0$ for all $Q \in \mathcal{M}^b(\mathcal{S}^K)$. The latter case is already covered by portfolio-proportion constraints, since cone constraints are the same for portfolio processes and portfolio-proportion processes. Karatzas and Zitkovic (2003) also discuss this case.$^{15}$ Cuoco (1997, p. 40) gives some other examples for Ito processes. However, there are examples where the upper variation process is unbounded, for example if $S$ is of unbounded local variation (e.g. an Ito process) and $K$ is bounded (compare Cuoco, 1997, Remark on p. 42).

$^{15}$ Cone constraints are not the same if wealth may become negative, for then the portfolio-proportion process is not defined. But the mathematics do not change in this case and the results are unaltered, as we can easily enlarge the set (see Karatzas and Zitkovic, 2003, and Section 3.5.3).

Appendix B

Convex Analysis and

Duality

B.1 Kramkov / Schachermayer's Duality Result

For the reader’s convenience we reproduce a duality result by Kramkov and

Schachermayer (1999).

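B.1.1 Assumption. Let $\mathcal{C}, \mathcal{D}$ have the following properties

(i) $\mathcal{C} \subset L^0_+(\Omega, \mathcal{F}, P)$, $\mathcal{D} \subset L^0_+(\Omega, \mathcal{F}, P)$.

(ii) $\mathcal{C} = \mathcal{D}^{\circ}$ and $\mathcal{D} = \mathcal{C}^{\circ}$.

(iii) $1 \in \mathcal{C}$.

Note that $\mathcal{C}, \mathcal{D}$ are convex, solid and closed in $d_P$ by Theorem A.2.2. We set $\mathcal{C}(x) = \{ xg : g \in \mathcal{C} \}$ for $x \in \mathbb{R}_+$ and define $\mathcal{D}(y)$ accordingly. Let $U$ be a utility function with $\operatorname{dom}(U) = [0, \infty)$ and $\lim_{x \downarrow 0} U'(x) = \infty$ almost surely (i.e. $U$ satisfies the Inada condition). Further assume that $U$ is not state-dependent.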


Then the convex dual is given by $\tilde{U}(y) = \sup_{x > 0} \{ U(x) - xy \}$ (e.g. Karatzas and Shreve, 1998, Definition 3.4.2 and Lemma 3.4.3). Finally, consider

$$u(x) = \sup_{X \in \mathcal{C}(x)} E_P[U(X)], \qquad \text{(B.1)}$$

and

$$v(y) = \inf_{Y \in \mathcal{D}(y)} E_P[\tilde{U}(Y)]. \qquad \text{(B.2)}$$

As it is clear that u(·),−v(·) are concave, we may (and do) define the right-

continuous derivatives u′(·), v′(·). Then we have the following

B.1.2 Theorem. Assume that C,D are as in Assumption B.1.1 and that

u(x0) < ∞ for some x0 > 0. Then

(i) $u(x) < \infty$ for all $x > 0$, and there exists $y_0 > 0$ such that $v(y)$ is finitely valued for $y > y_0$. We have for $y > 0$

\[ v(y) = \sup_{x>0}\,[u(x) - xy], \]

and for $x > 0$

\[ u(x) = \inf_{y>0}\,[v(y) + xy]. \]

(ii) $u(\cdot)$ is continuously differentiable on $(0,\infty)$ and $v(\cdot)$ is strictly convex on $\{y > 0 : v(y) < \infty\}$. Further $\lim_{x \downarrow 0} u'(x) = \infty$ and $\lim_{y \to \infty} v'(y) = 0$.

(iii) If v(y) < ∞ then the optimal solution Y ∗ ∈ D(y) to (B.2) exists and is

unique.

Proof. Kramkov and Schachermayer (1999, Theorem 3.1).
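The conjugacy relations in part (i) can be checked numerically in the simplest deterministic setting. The following is a minimal sketch, assuming logarithmic utility U(x) = log x and the special case in which C consists of the nonnegative random variables bounded by 1 and D of the nonnegative random variables with expectation at most 1; under this assumption one can verify that u coincides with U and v with the convex dual of U. The function names and the grid are hypothetical and only serve the illustration.

```python
import math

def U(x):
    """Logarithmic utility (an illustrative choice, not prescribed by the theorem)."""
    return math.log(x)

def U_dual(y):
    """Closed-form convex dual sup_{x>0} [log x - x*y] = -log y - 1, attained at x = 1/y."""
    return -math.log(y) - 1.0

# Hypothetical grid on (0, 20] used to approximate the sup and the inf.
GRID = [k / 1000.0 for k in range(1, 20001)]

def dual_numeric(y):
    """Grid approximation of sup_{x>0} [U(x) - x*y]."""
    return max(U(x) - x * y for x in GRID)

def primal_from_dual(x):
    """Grid approximation of inf_{y>0} [U_dual(y) + x*y], which should recover U(x)."""
    return min(U_dual(y) + x * y for y in GRID)

if __name__ == "__main__":
    for y in (0.5, 1.0, 2.0):
        print(y, dual_numeric(y), U_dual(y))
    for x in (0.5, 1.0, 3.0):
        print(x, primal_from_dual(x), U(x))
```

Both columns printed for each argument should agree up to the grid resolution, which is the numerical counterpart of v(y) = sup [u(x) − xy] and u(x) = inf [v(y) + xy] in this special case.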

B.1.3 Theorem. Assume that v(y) < ∞∀ y > 0 almost surely in addition to

the assumptions of Theorem B.1.2. Then we also have

(i) $v(\cdot)$ is also continuously differentiable on $(0,\infty)$, $u'(\cdot)$ and $-v'(\cdot)$ are strictly decreasing and satisfy $\lim_{x \to \infty} u'(x) = 0$ and $\lim_{y \downarrow 0} -v'(y) = \infty$.

(ii) The optimal solution $X^* \in C(x)$ to (B.1) exists and is unique. If $Y^* \in D(y)$ is the optimal solution to (B.2), where $y = u'(x)$, we have the dual relation

\[ X^* = (U')^{-1}(Y^*) \]

and

\[ E_P[X^* Y^*] = xy. \]

(iii)

\[ u'(x) = E_P\!\left[\frac{X^* U'(X^*)}{x}\right], \qquad v'(y) = E_P\!\left[\frac{Y^* \tilde{U}'(Y^*)}{y}\right]. \]

Proof. Kramkov and Schachermayer (2003, Theorem 4).

B.2 Generalized Lagrangians

This section summarizes some important facts on Generalized Lagrange Mul-

tiplier rules for normed spaces. Such results can be found in any textbook

on the topic, e.g. Luenberger (1969); Jahn (1996), from where we have taken

the theory. As usual, we will not strive for the most general result (e.g., most

results could be easily proven for star-shaped sets instead of convex sets), but

use results that suit our purpose. We need some basic facts on cones first.

B.2.1 Definition (Cone, Dual Cone). A cone is a nonempty subset C of a real linear space X such that x ∈ C, α ≥ 0 ⇒ αx ∈ C. Let X′ be the dual¹ of X; then the dual cone of C is C′ = {l ∈ X′ : l(x) ≥ 0 ∀ x ∈ C}.

Let S be a nonempty subset of a real linear space X. The set cone(S) = {αx : α ≥ 0, x ∈ S} is the cone generated by S. We call a set C a convex cone if C is both a cone and a convex set.

1 Wherever we use the term dual in this subsection, we mean the topological dual (i.e. the space of all continuous linear functionals).


B.2.2 Definition (Partial Ordering). Let X be a real linear space, and let

x, y, z ∈ X be arbitrary. A partial ordering R is a nonempty subset of X × X,

and one writes x ≤ y, if (x, y) ∈ R, whenever the following properties hold:

(i) x ≤ x.

(ii) x ≤ y, y ≤ z ⇒ x ≤ z.

(iii) w ≤ x, y ≤ z ⇒ w + y ≤ x + z.

(iv) x ≤ y, α ∈ IR+ ⇒ αx ≤ αy.

We use the convention that y ≥ x means x ≤ y.

B.2.3 Definition (Ordered Vector Space). A (partially) ordered vector space

is a real linear space X equipped with a partial ordering ≤.

B.2.4 Proposition. If C is a convex cone in a real linear space X, then a partial ordering is defined by x ≤ y if y − x ∈ C for (x, y) ∈ X × X. Conversely, if ≤ is a partial ordering on X, the set C = {x ∈ X : 0_X ≤ x} is a convex cone.

Proof. Note that $x, y \in C \Rightarrow x + y = 2(\tfrac{1}{2}x + \tfrac{1}{2}y) \in C$ by convexity and Definition B.2.1; also, $0_X \in C$. The proof is thus immediate.

B.2.5 Definition (Ordering Cone). A convex cone characterizing the partial

ordering on an ordered vector space is called an ordering cone.

For later use we recall the concept of a Fréchet derivative.

B.2.6 Definition (Fréchet Derivative). Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be normed spaces, $S \subset X$ nonempty, and $f : S \to Y$ be some function. We say that $f'(\bar{x})$ is a Fréchet derivative of $f$ at $\bar{x}$ if $f'(\bar{x}) : X \to Y$ is a bounded linear map satisfying

\[ \lim_{\delta \downarrow 0} \; \sup_{x \in S:\, 0 < \|x - \bar{x}\|_X < \delta} \frac{\|f(x) - f(\bar{x}) - f'(\bar{x})(x - \bar{x})\|_Y}{\|x - \bar{x}\|_X} = 0. \]

The following closely related concept of a derivative is also used:


B.2.7 Definition (Directional Derivative). Let $(Y, \|\cdot\|_Y)$ be a real normed space, $X$ be a real linear space, $S \subset X$ nonempty, and $f : S \to Y$ be a given mapping. If for $x \in S$, $d \in X$ the limit

\[ f'(x)(d) = \lim_{\delta \downarrow 0} \frac{1}{\delta}\bigl(f(x + \delta d) - f(x)\bigr) \]

exists in $Y$, $f'(x)(d)$ is called the directional derivative of $f$ at $x$ in the direction $d$. $f$ is said to have a (directional) derivative $f'(x) : X \to Y$ at $x$ if $f'(x)(d) \in Y$ exists in every direction $d \in X$.²

Clearly, if the Frechet derivative exists, then the directional derivative ex-

ists in all “possible” directions (caveat: to make a theorem of this statement,

some additional assumptions concerning the set S and the point x are needed;

however, this statement is certainly true if S = X — see the proof of Jahn,

1996, Theorem 3.13). For convex functions, the directional derivative exists (in

every direction), see Theorem B.2.9. This is a generalization of the standard

result that for a convex function f : IR 7→ IR both of the one-sided derivatives

exist for all x ∈ IR (see Schechter, 1997, Theorem 25.25).

B.2.8 Definition (Convex Mapping). Let X be a real linear space, Y an

ordered vector space, S ⊂ X nonempty and convex, and f : S → Y be a given

mapping. f is called convex, if f(δx + (1 − δ)y) ≤ δf(x) + (1 − δ)f(y) for all

x, y ∈ S, δ ∈ [0, 1].

We do not assume that (Y,≤) is a chain, here. We only assume that if

x, y ∈ S, then f(δx + (1− δ)y) and δf(x) + (1− δ)f(y) are comparable.

B.2.9 Theorem. Let X be a real linear space, and let f : X → L0(Ω,F , P) (alternatively f : X → IR) be a convex functional. Then at every x ∈ X and in

every direction d ∈ X the directional derivative f ′(x)(d) exists almost surely.

2 If the directional derivative exists in every direction, some authors use the term “Gâteaux” derivative, while others add some additional restrictions to the definition of a Gâteaux derivative.

Proof (Adapted from Jahn, 1996, Theorem 3.4). We first prove the theorem to be true for $f : X \to L^0(\Omega, \mathcal{F}, P)$. For arbitrary $x, d \in X$, define the function $\varphi : \mathbb{R}_+ \to L^0(P)$ by

\[ \varphi(\delta) \triangleq \frac{1}{\delta}\bigl(f(x + \delta d) - f(x)\bigr). \]

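Given $0 < \delta_1 \le \delta_2$, we calculate using the convexity of $f$,

\begin{align*}
\delta_1 \varphi(\delta_1) = f(x + \delta_1 d) - f(x)
 &= f\Bigl(\frac{\delta_1}{\delta_2}(x + \delta_2 d) + \frac{\delta_2 - \delta_1}{\delta_2}\,x\Bigr) - f(x) \\
 &\le \frac{\delta_1}{\delta_2} f(x + \delta_2 d) + \frac{\delta_2 - \delta_1}{\delta_2} f(x) - f(x) \\
 &= \frac{\delta_1}{\delta_2}\bigl(f(x + \delta_2 d) - f(x)\bigr) = \delta_1 \varphi(\delta_2),
\end{align*}

that is, $0 < \delta_1 \le \delta_2 \Rightarrow \varphi(\delta_1) \le \varphi(\delta_2)$. We conclude that $\varphi$ is monotonically increasing. Because of the convexity of $f$ and $x = \frac{1}{1+\delta}(x + \delta d) + \frac{\delta}{1+\delta}(x - d)$ we also find for all $\delta > 0$

\[ f(x) \le \frac{1}{1+\delta} f(x + \delta d) + \frac{\delta}{1+\delta} f(x - d), \]

implying $f(x) - f(x - d) \le \varphi(\delta)$.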

It follows from this inequality that the set $\{\varphi(\delta) : \delta > 0\}$ is bounded from below by a random variable. By Theorem B.3.3, a greatest lower bound $z$ exists; that is, the set $\{\varphi(\delta) : 1 \ge \delta > 0\} \cup \{z\}$ is order complete. And since $\varphi(\delta)$ is monotonically increasing in $\delta$, the definition of “lim inf” and “lim sup” (e.g. Schechter, 1997, 7.44) shows that $\liminf_{\delta \downarrow 0} \varphi(\delta) = \limsup_{\delta \downarrow 0} \varphi(\delta) = z$. Therefore, from Theorem 7.45 in Schechter (1997), $\varphi(\delta)$ is (order) convergent to $z$ as $\delta \downarrow 0$. By Aliprantis and Border (1999, Lemma 7.16) order convergence and almost sure convergence are equivalent. Thus, we have shown that $\lim_{\delta \downarrow 0} \varphi(\delta) = f'(x)(d)$ exists.

For f : X 7→ IR, the first part of the proof remains unchanged, but we can

greatly simplify the second paragraph. The details are left to the reader.
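The two facts used in the proof, monotonicity of the difference quotient and its lower bound f(x) − f(x − d), can be observed numerically for a real-valued convex function. The following is a minimal sketch, assuming the illustrative function f(x) = |x| + x² (which has a kink at 0) and the point x = 0; the names are hypothetical.

```python
def f(x):
    """A convex function on R with a kink at 0 (illustrative choice)."""
    return abs(x) + x * x

def phi(x, d, delta):
    """Difference quotient (f(x + delta*d) - f(x)) / delta from the proof."""
    return (f(x + delta * d) - f(x)) / delta

x = 0.0
deltas = [2.0 ** (-k) for k in range(20)]   # 1, 1/2, 1/4, ... decreasing to 0

for d in (1.0, -1.0):
    values = [phi(x, d, dl) for dl in deltas]
    lower_bound = f(x) - f(x - d)
    # phi decreases as delta decreases, and never drops below the bound
    assert all(a >= b for a, b in zip(values, values[1:]))
    assert all(v >= lower_bound for v in values)
    print(d, values[-1], lower_bound)
```

In both directions the quotient tends to 1, so the directional derivative exists in every direction although f is not differentiable at 0; in particular f′(0)(−1) differs from −f′(0)(1), so the directional derivative need not be linear in d.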

B.2.10 Corollary. Under the assumptions of Theorem B.2.9, let S ⊂ X be

nonempty and convex, (X, ‖·‖X) a normed space, and f : S → L0(Ω,F , P) (alternatively f : S → IR) be a convex functional. Then at every x ∈ int(S)

and in every direction d ∈ X the directional derivative f ′(x)(d) exists almost

surely.


Proof. Since $x \in \mathrm{int}(S)$, for some small enough $\bar{\delta} > 0$, $x + \delta d \in S$ for all $\delta < \bar{\delta}$, and the proof of Theorem B.2.9 shows again that the limit exists.

Two concepts of “convexity” rely on the definition of a derivative:

B.2.11 Definition (C-quasiconvex). Let $S$ be a nonempty subset of a real linear space $X$, and let $C$ be a nonempty subset of a real normed space $(Y, \|\cdot\|_Y)$. Let $f : S \to Y$ be a given mapping; let $f$ have a directional derivative at $\bar{x} \in S$ in every direction $x - \bar{x}$, $x \in S$. The mapping $f$ is $C$-quasiconvex at $\bar{x}$ if for all $x \in S$:

\[ f(x) - f(\bar{x}) \in C \;\Rightarrow\; f'(\bar{x})(x - \bar{x}) \in C. \]

B.2.12 Remark. A more conventional definition of quasiconvexity for a function f : X → IR is the following (Aliprantis and Border, 1999, Chapter 5, Definition 5.23): A function f is quasiconvex if f(αx + (1 − α)y) ≤ max{f(x), f(y)} for all x, y ∈ X and all 0 ≤ α ≤ 1. The two definitions coincide for this special case. Accordingly, for a quasiconcave function, the definition goes: A function f is quasiconcave if f(αx + (1 − α)y) ≥ min{f(x), f(y)} for all x, y ∈ X and all 0 ≤ α ≤ 1.
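The conventional definition is strictly weaker than convexity. As a quick sketch, assuming the illustrative function f(x) = √|x| on the real line, the max-inequality of the remark holds on a grid while the convexity inequality fails; the function names and grids are hypothetical.

```python
import math

def f(x):
    """sqrt(|x|): quasiconvex on R but not convex (illustrative choice)."""
    return math.sqrt(abs(x))

def is_quasiconvex_on(points, alphas, tol=1e-12):
    """Check f(a*x + (1-a)*y) <= max{f(x), f(y)} on a finite grid."""
    return all(f(a * x + (1 - a) * y) <= max(f(x), f(y)) + tol
               for x in points for y in points for a in alphas)

def is_convex_on(points, alphas, tol=1e-12):
    """Check f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) on a finite grid."""
    return all(f(a * x + (1 - a) * y) <= a * f(x) + (1 - a) * f(y) + tol
               for x in points for y in points for a in alphas)

POINTS = [k / 4.0 for k in range(-8, 9)]     # grid on [-2, 2]
ALPHAS = [k / 10.0 for k in range(11)]       # 0, 0.1, ..., 1

print(is_quasiconvex_on(POINTS, ALPHAS), is_convex_on(POINTS, ALPHAS))
```

A grid check of this kind can only refute a property, not prove it; here convexity is refuted, for example, by x = 0, y = 1 and α = 1/2.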

B.2.13 Definition (Pseudo-convexity). Let $(X, \|\cdot\|_X)$ be a real linear space, $S \subset X$ nonempty, and $(Y, \|\cdot\|_Y)$ be a normed and ordered vector space. A given functional $f : S \to Y$ with a directional derivative at $\bar{x}$ in every direction $d = x - \bar{x}$, $x \in S$, is called pseudo-convex at $\bar{x}$ if for all $x \in S$:

\[ f'(\bar{x})(x - \bar{x}) \ge 0 \;\Rightarrow\; f(x) - f(\bar{x}) \ge 0. \]

B.2.14 Theorem. Given the assumptions of Definition B.2.13, let $S \subset X$ be a nonempty subset, and let $f : S \to L^0(\Omega, \mathcal{F}, P)$ be a convex functional. Then $f$ is pseudo-convex at $\bar{x}$ for arbitrary $\bar{x} \in \mathrm{int}(S)$. The result does not change if we require all inequalities to hold only almost surely.

Proof (Adapted from Jahn, 1996, Theorem 4.17). Let $x \in S$ be given. For some small enough $\bar{\delta} < 1$, $\bar{x} + \delta(x - \bar{x}) \in S$ for all $\delta < \bar{\delta}$. From the convexity of $f$, $f(\bar{x} + \delta(x - \bar{x})) = f(\delta x + (1 - \delta)\bar{x}) \le \delta f(x) + (1 - \delta) f(\bar{x})$, i.e.

\[ f(x) \ge f(\bar{x}) + \frac{1}{\delta}\bigl(f(\bar{x} + \delta(x - \bar{x})) - f(\bar{x})\bigr). \]

On letting $\delta \downarrow 0$ and applying Corollary B.2.10,

\[ f(x) - f(\bar{x}) \ge f'(\bar{x})(x - \bar{x}). \]

Hence, $f(x) - f(\bar{x}) \ge 0$ if $f'(\bar{x})(x - \bar{x}) \ge 0$, as asserted.

We are now in the position to present the main results of this appendix. The

rationale behind the theorems is as follows: given a (pseudo-)convex mapping

f, which is also C-quasiconvex, where C ≜ {x : x ≥ 0}, a point x∗ is a minimum of f (i.e. f(x) ≥ f(x∗) ∀ x ∈ S), if and only if f′(x∗)(x − x∗) ≥ 0 ∀ x ∈ S. This

is an almost immediate consequence of the definitions of C-quasiconvexity and

pseudo-convexity. The first theorem therefore uses this very fact (indeed, more

general versions of it hold).

B.2.15 Theorem. Let $S$ be a nonempty subset of a real linear space, and let $f : S \to L^0(\Omega, \mathcal{F}, P)$ (alternatively $f : S \to \mathbb{R}$) be a convex functional at some $\bar{x} \in \mathrm{int}(S)$. Then $\bar{x}$ is a minimal point of $f$ on $S$ if and only if $f'(\bar{x})(x - \bar{x}) \ge 0$ for all $x \in S$.

Proof. Suppose $f'(\bar{x})(x - \bar{x}) \ge 0$ for all $x \in S$. By Theorem B.2.14, $f$ is pseudo-convex at $\bar{x}$, i.e. $f(x) \ge f(\bar{x})$ for all $x \in S$ from the Definition B.2.13 of pseudo-convexity.

Suppose now that $f(x) \ge f(\bar{x})$ for all $x \in S$. Then for arbitrary $x \in S$ and some small enough $\bar{\delta} > 0$, $\bar{x} + \delta(x - \bar{x}) \in S$ for all $\delta < \bar{\delta}$ (since $\bar{x}$ is in the interior of $S$). And by assumption $f(\bar{x} + \delta(x - \bar{x})) \ge f(\bar{x})$; therefore $f'(\bar{x})(x - \bar{x}) = \lim_{\delta \downarrow 0} \frac{1}{\delta}\bigl(f(\bar{x} + \delta(x - \bar{x})) - f(\bar{x})\bigr) \ge 0$ as desired.

The second theorem is a constrained version of the same idea.

B.2.16 Theorem. Let $\hat{S}$ be a nonempty subset of a real linear space $X$, $(Y, \|\cdot\|_Y)$ a partially ordered real normed space with ordering cone $C$, and $(Z, \|\cdot\|_Z)$ a real normed space. Let $f : \hat{S} \to \mathbb{R}$ be a functional, and $g : \hat{S} \to Y$, $h : \hat{S} \to Z$ be mappings. Moreover, let the constraint set $S = \{x \in \hat{S} : g(x) \in -C,\ h(x) = 0_Z\}$ be nonempty. Consider $x^* \in S$, and let $f, g, h$ have a directional derivative at $x^*$. Let $C', Z'$ be the duals of $C, Z$. Assume that there are linear functionals $u \in C'$, $v \in Z'$ with

\[ \bigl(f'(x^*) + u \circ g'(x^*) + v \circ h'(x^*)\bigr)(d) \ge 0, \tag{B.3} \]

where $d = x - x^*$ ($\forall x \in \hat{S}$), and

\[ u(g(x^*)) = 0. \tag{B.4} \]

Then $x^*$ is a minimal point of $f$ on

\[ \tilde{S} = \bigl\{x \in \hat{S} : g(x) \in -C + \mathrm{cone}(g(x^*)) - \mathrm{cone}(g(x^*)),\ h(x) = 0_Z\bigr\} \tag{B.5} \]

if and only if the mapping

\[ (f, g, h) : \hat{S} \to \mathbb{R} \times Y \times Z \tag{B.6} \]

is $\hat{C}$-quasiconvex at $x^*$ where

\[ \hat{C} = \mathbb{R}_- \times \bigl(-C + \mathrm{cone}(g(x^*)) - \mathrm{cone}(g(x^*))\bigr) \times \{0_Z\}. \tag{B.7} \]

Proof. Jahn (1996, Theorem 5.14).

Theorem B.2.16 gives sufficient conditions for an optimal solution. As for the necessary conditions, we have the following theorem.

B.2.17 Theorem. With the same notation as in Theorem B.2.16, assume additionally that $X, Z$ are real Banach spaces, and that the ordering cone $C$ has a nonempty interior. Let $x^* \in S$ be a minimal point of $f$ on $S$. Let $f$ and $g$ be Fréchet differentiable at $x^*$. Let $h$ be Fréchet differentiable at a neighborhood of $x^*$, let $h'(\cdot)$ be continuous at $x^*$, and let the image set $h'(x^*)(X)$ be closed. Then there are real numbers $a \ge 0$ and linear functionals $u \in C'$, $v \in Z'$ with $(a, u, v) \ne (0, 0_{C'}, 0_{Z'})$,

\[ \bigl(a f'(x^*) + u \circ g'(x^*) + v \circ h'(x^*)\bigr)(d) \ge 0, \]

where $d = x - x^*$ ($\forall x \in \hat{S}$), and

\[ u(g(x^*)) = 0. \]

If in addition to the above assumptions, the Kurcyusz-Robinson-Zowe regularity assumption

\[ \begin{pmatrix} g'(x^*) \\ h'(x^*) \end{pmatrix} \mathrm{cone}\bigl(\hat{S} - \{x^*\}\bigr) + \mathrm{cone}\begin{pmatrix} C + \{g(x^*)\} \\ \{0_Z\} \end{pmatrix} = Y \times Z \]

holds, then $a > 0$.

Proof. Jahn (1996, Theorem 5.3).

In order to apply Theorem B.2.16 we need two simple lemmata for C-

quasiconvexity at x.

B.2.18 Lemma. Let $S$ be a nonempty convex subset of a real linear space, and let $f : S \to \mathbb{R}$ be a pseudo-convex (alternatively: convex) functional having a directional derivative at some $\bar{x} \in \mathrm{int}(S)$ in every direction $x - \bar{x}$ with arbitrary $x \in S$. Then $f$ is $\mathbb{R}_-$-quasiconvex at $\bar{x}$.

Proof. Since by Theorem B.2.14 $f$ is pseudo-convex if $f$ is convex, we only have to treat the pseudo-convex case. But from Definition B.2.13, $f(x) - f(\bar{x}) < 0 \Rightarrow f'(\bar{x})(x - \bar{x}) < 0$, and this is $\mathbb{R}_-$-quasiconvexity at $\bar{x}$.

B.2.19 Lemma. Let $S_1 \subset X_1, S_2 \subset X_2, \dots, S_n \subset X_n$ be nonempty subsets of the real linear spaces $X_1, X_2, \dots, X_n$, and let $Y_1, Y_2, \dots, Y_n$ be real normed spaces ($n \in \mathbb{N}$). Suppose $f_i : S_i \to Y_i$ is $C_i$-quasiconvex at $\bar{x}_i$ ($1 \le i \le n$). Then $f : \bigotimes_{i=1}^n S_i \to \bigotimes_{i=1}^n Y_i$ defined by $f = \otimes_{i=1}^n f_i$ is $C$-quasiconvex at $\bar{x}$, where $C = \times_{i=1}^n C_i$ and $\bar{x} = (\bar{x}_1, \bar{x}_2, \dots, \bar{x}_n)$.

Proof. Obvious.

B.3 A Stochastic Optimization Problem

Although Theorem B.2.16 is rather general, it cannot be applied to stochastic optimization problems of the following kind: maximize f : L0(Ω,F , P) → L0(Ω,F , P) subject to certain constraints, where (Ω,F , P) is a probability space or, slightly more generally, a finite measure space. Before we can give a

solution, some questions have to be answered. For example, what is a “maxi-

mum” in this context? And does this maximum exist as a measurable random

variable?

For two reasons, the answers require a nontrivial extension of the results of

the previous section’s deterministic optimization problems. Firstly, the solution

needs more infinite-dimensional mathematics, and secondly, the measurability

issue is lurking behind every corner. Since the author is not aware of any

textbook tackling these issues with the generality necessary for our purpose

(although there are many papers using similar results, beginning with Foldes,

1978), we will present a simple existence result tailor-made for our purpose.

The main idea of the proof is well-known (Bank, 2000, Theorem 2.1). See also

Cuoco (1997, Appendix B).

We will not strive for any kind of generality, but look for the simplest

possible result. The machinery required to prove this simple version is largely

hidden behind two theorems which we recall for the reader’s convenience.

B.3.1 Definition (Essential Supremum, Essential Infimum, Essential Maximum, Essential Minimum). Let C ⊂ L0(Ω,F , P) be nonempty, where (Ω,F , P) is a given probability space. The essential supremum of C, denoted by ess-sup C, is an X∗ ∈ L0(Ω,F , P) satisfying:

(i) X ≤ X∗ almost surely for all X ∈ C, and

(ii) if Y is a random variable satisfying X ≤ Y almost surely for all X ∈ C, then X∗ ≤ Y almost surely.

If X∗ can be chosen such that X∗ ∈ C, then X∗ is called an essential maximum.

The essential infimum and the essential minimum are defined analogously.

B.3.2 Remark. In general supX∈D f(X) ≥ ess-supX∈D f(X). The inequality

can be strict for elementary examples, even if there exists an X∗ that is measur-

able and attains the supremum in supX∈D f(X) (see Striebel, 1975, p. 204 for

an example). If however X∗ ∈ D then clearly supX∈D f(X) = ess-supX∈D f(X)

and we are done.
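A standard illustration of the strict inequality (a textbook example, not necessarily the one given in Striebel, 1975) takes Ω = [0, 1] with Lebesgue measure and the family of indicator functions of single points. Taking f to be the identity map is an assumption made here for simplicity.

```latex
\[
  D = \bigl\{\, X_t = \mathbf{1}_{\{t\}} : t \in [0,1] \,\bigr\}, \qquad
  \sup_{t \in [0,1]} X_t(\omega) = 1 \quad \text{for every } \omega \in [0,1],
\]
\[
  \text{but } X_t = 0 \ \text{almost surely for each } t, \ \text{so that} \quad
  \operatorname{ess\text{-}sup}_{t \in [0,1]} X_t = 0 < 1 .
\]
```

The pointwise supremum is taken over an uncountable family and picks up a different null set for each t, whereas the essential supremum ignores each of these null sets, in line with the countable sup property of Theorem B.3.3.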


B.3.3 Theorem. Let $C \subset L^0(\Omega, \mathcal{F}, P)$ be nonempty. Then $X^* = \mathrm{ess\text{-}sup}(C)$ and $X_* = \mathrm{ess\text{-}inf}(C)$ exist and are unique almost surely, taking values in $[-\infty, +\infty]$. $X^*$ ($X_*$) can be represented as the supremum (infimum) almost surely of some countable subcollection of $C$ (countable sup property). That means, $L^0(\Omega, \mathcal{F}, P)$ is a Dedekind complete vector lattice (a Dedekind complete Riesz space).

Proof. Schechter (1997, 21.42).

If we allow elements of the space L0(Ω,F , P) to take values in [−∞,+∞],

the space is a complete lattice.3 For the remainder of this section, we will do so.

Basic operations, inequalities and definitions following from inequalities (e.g.

convexity) will be extended to the cases ±∞ following the usual conventions.

Theorem B.3.3 is true for σ-finite spaces, but we will have no use for this.

B.3.4 Theorem (Komlós). Let $(Z_n)_{n \ge 1}$ be a sequence in $L^1(\Omega, \mathcal{F}, P)$ such that $\sup_n \|Z_n\|_1 < \infty$, or a sequence in $L^0_+(\Omega, \mathcal{F}, P)$. Then there exists an increasing sequence of natural numbers $(m_n)_{n \ge 1}$, and a $Z \in L^1(\Omega, \mathcal{F}, P)$ for which

\[ \frac{1}{n} \sum_{i=1}^{n} Z_{m_i} \xrightarrow{\;n \to \infty\;} Z \quad \text{a.s.} \tag{B.8} \]

(B.8) remains true if the sequence $(m_n)_{n \ge 1}$ is replaced by any of its subsequences.

Proof. A short proof for the L1(Ω,F , P) case can be found in Trautner (1990).

Von Weizsacker (2000) proves the L0+(Ω,F , P) case.
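The role of the subsequence in Komlós' theorem can be seen in a toy example. The sketch below assumes a hypothetical three-point sample space and the bounded, non-convergent sequence Z_n = (−1)^n W; it compares the Cesàro means along all indices with those along the even indices.

```python
def cesaro_means(seq):
    """Running averages (1/n) * (z_1 + ... + z_n), taken componentwise."""
    sums = [0.0] * len(seq[0])
    means = []
    for n, z in enumerate(seq, start=1):
        sums = [s + v for s, v in zip(sums, z)]
        means.append([s / n for s in sums])
    return means

# Hypothetical three-point sample space; a random variable is a list of 3 values.
W = [1.0, 2.0, 5.0]
# Z_n = (-1)^n * W for n = 1, ..., 2000: bounded in L^1 but not convergent.
Z = [[(-1) ** n * w for w in W] for n in range(1, 2001)]

full = cesaro_means(Z)          # Cesaro means along m_n = n
even = cesaro_means(Z[1::2])    # Cesaro means along m_n = 2n (Z_2, Z_4, ...)

print(full[-1])   # close to [0, 0, 0]
print(even[-1])   # equal to W
```

Along m_n = n the averages tend to 0, but the subsequence of even indices averages to W, so m_n = n does not have the stability property stated after (B.8). The choice m_n = 2n does: every further subsequence is constant and averages to the same limit W.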

3 Some authors do not make a distinction between the terms “complete” and “Dedekind complete” (e.g. Aliprantis and Border, 1999, Definition 7.2). A partially ordered set X is Dedekind complete if every nonempty subset bounded above has a least upper bound. A partially ordered set is complete if every subset has a supremum in X (see Schechter, 1997, 4.13).

We now give the main results of this section. They are in the spirit of a (generalized) Weierstraß theorem, and it would not be hard to prove slightly more general results (see also Jahn, 1996, Theorems 2.3 and 2.12). Recall that the Weierstraß theorem states that a continuous function defined on a compact set in a normed space has a maximum and a minimum (e.g. Luenberger, 1969, Chapter 2.13), or, slightly more general, that an upper semicontinuous function attains a maximum (Aliprantis and Border, 1999, Theorem 2.32 and Theorem 2.40). Unfortunately, in infinite-dimensional spaces, compactness is

a rather severe restriction. One way often employed to work around this prob-

lem is to use weak or weak-* compactness in connection with well-known re-

sults. Weak compactness is too restrictive for our current purpose. By the

Dunford-Pettis Compactness Criterion (Liptser and Shiryaev, 2000, Theorem

1.6), this would be equivalent to the uniform integrability of the set involved.

In the special case of time-additive utility functions, our more general result

simplifies to weak compactness of a certain set (Lemma 2.1.26). The weak-*

topology is used in Levin (1976), and also in Foldes (1978). It is often used as

a step towards proving duality results (e.g. Karatzas and Zitkovic, 2003, also

Lemma 2.1.38). We will use a slightly different approach, here. We use the

lattice structure and the Dedekind completeness (Theorem B.3.3) in combina-

tion with the “compactness” implied by Komlos’ Subsequence Theorem B.3.4.

See also Bank (2000); Cuoco (1997) and recall the notion of a quasiconcave

function (see Remark B.2.12).

B.3.5 Theorem. Given a set D ⊂ L0(Ω,F , P), assume that

(i) D is convex and closed with respect to convergence in probability, and

(ii) (a) there exists Y ∈ L0(Ω,F , P), Y > 0 a.s., with supX∈D‖XY ‖1 < ∞,

or

(b) D is bounded from below by some constant.

Let f : D → IR be upper semicontinuous with respect to convergence in probability and quasiconcave. Then there exists X∗ ∈ D such that f(X) ≤ f(X∗) = sup_{X∈D} f(X) for all X ∈ D.

Proof. From the properties of $\mathbb{R}$, $\sup\{f(X) : X \in D\}$ exists, possibly being $+\infty$. Furthermore, there exists a nondecreasing sequence $(f(X_n))_{n \ge 1}$ in $\mathbb{R}$ with $X_n \in D$ and $\lim_{n \to \infty} f(X_n) = \sup\{f(X) : X \in D\}$. Applying Komlós's Theorem B.3.4 to the sequence $(X_n Y)_{n \ge 1}$ in the case of (ii)(a), or to $(X_n)_{n \ge 1}$ directly in the case of (ii)(b) ensures an increasing sequence $(m_n)_{n \ge 1}$ and an $X^* \in D$, for which

\[ X^*_n(j) = \frac{1}{n} \sum_{i=1}^{n} X_{m_{i+j}} \xrightarrow{\;n \to \infty\;} X^* \]

almost surely, hence also in probability. Here, $j \ge 0$ is arbitrary. What is more,

\[ \lim_{n \to \infty} f(X_n) \ge f(X^*) = \liminf_{j \to \infty} f\Bigl(\lim_{n \to \infty} X^*_n(j)\Bigr) \ge \liminf_{j \to \infty} \limsup_{n \to \infty} f\bigl(X^*_n(j)\bigr), \]

$f$ being upper semicontinuous in probability and $X^* \in D$.

We have to show that $\liminf_{j \to \infty} \limsup_{n \to \infty} f(X^*_n(j)) \ge \lim_{n \to \infty} f(X_n)$, which also shows that the limits exist. We do so by proving that for $m$ arbitrary there always exists a $j_0$ such that for $j \ge j_0$ the inequality $f(X^*_n(j)) \ge f(X_m)$ holds. Since we have chosen the sequence $(f(X_n))$ to be nondecreasing, there exists $j_0$ with $f(X_{m_j}) \ge f(X_m)$ for $j \ge j_0$ (e.g. $j_0 = m$). Then quasiconcavity and again the fact that $(f(X_n))$ is nondecreasing implies $f(X^*_n(j)) \ge f(X_{m_j}) \ge f(X_m)$ for $j \ge j_0$. This establishes the desired inequality and proves the theorem, since the choice of $j$ does not influence the convergence.

In Levin (1976), the proof is somewhat different. There the weak-* closure

of D is considered, which is weak-* compact. Standard arguments show

existence for this more general set. This result then leads to the solution of the

initial problem by projection.

B.3.6 Remark. Upper semicontinuity with respect to convergence in probability

is used in Theorem B.3.5 for mainly aesthetic reasons. Studying the proof,

we see that a weaker assumption suffices: for every sequence (Xn)n≥1 in D converging to some X almost surely the following upper semicontinuity-type

property holds: f(X) ≥ lim supn→∞ f(Xn).

Appendix C

Various Proofs

C.1 Proof of Several Results in Section 2.1

C.1.1 Proof of Theorem 2.1.12

To prove the first part, we apply Theorem B.2.16 and use the notation therein

to make things easier. Note that $X^* \in L^1_+(Q)$ and $c^*, c \in L^1_+(\lambda \otimes Q)$ by (2.3). Let

\[ \hat{S} \triangleq \Bigl\{ (c, X) \in L^1_+(\lambda \otimes Q) \times L^1_+(Q) : c \text{ a consumption process},\ E_P\Bigl[\int_0^T U^-(s, c(s))\,ds + B^-(X)\Bigr] < \infty \Bigr\}, \tag{C.1} \]

and hence $\hat{S}$ is a convex set from the concavity of a utility function and the definition of $U^-(\cdot,\cdot)$, $B^-(\cdot)$, and $(c^*, X^*) \in \hat{S}$. From now on, we will identify all members of an equivalence class (i.e. we will no longer write “almost surely”, and so on) to keep the notation simpler, and we will further add $\omega$ to make the dependence clear.

Consider the mappings $f : \hat{S} \to \mathbb{R}$, $g : \hat{S} \to \mathbb{R} \times L^1(\lambda \otimes Q) \times L^1(Q)$ given by

\[ f(c, X) = -E_P\Bigl[\int_0^T U(s, c(s,\omega), \omega)\,ds + B(X(\omega), \omega)\Bigr], \]

\[ g(c, X) = \begin{pmatrix} E_Q\bigl[\int_0^T c(s,\omega)\,ds + X(\omega)\bigr] - W_0 \\ \underline{c}(s,\omega) - c(s,\omega) \\ \underline{x} - X(\omega) \end{pmatrix}. \]

Finally, simply define $h : \hat{S} \to Z$ by $h \triangleq 0_Z$ for an arbitrary space $(Z, \|\cdot\|_Z)$, i.e. we ignore the function $h$. Set $C = \mathbb{R}^+_0 \times L^1_+(\lambda \otimes Q) \times L^1_+(Q)$; hence the dual cone (with respect to the p-norm) $C'$ can be identified with $\mathbb{R}^+_0 \times L^\infty_+(\lambda \otimes Q) \times L^\infty_+(Q)$ (see Definition B.2.1 and Aliprantis and Border, 1999, Theorem 12.28). What is more, the constraint set $S = \{x \in \hat{S} : g(x) \in -C,\ h(x) = 0_Z\}$ is not empty by assumption, since $(c^*, X^*) \in S$.

We know that f ′(c∗, X∗) exists from the definition of a utility function and

Corollary B.2.10, and g′(c∗, X∗) trivially as g is affine. Setting d = (dc, dX)

with dc = c − c∗ and dX = X − X∗, (c,X) ∈ S arbitrary, we see that

(c∗ + δdc, X∗ + δdX) = ((1− δ) c∗ + δc, (1− δ) X∗ + δX) ∈ S for 1 ≥ δ ≥ 0.

We therefore have

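\begin{align*}
f'(c^*, X^*)(d) &= \lim_{\delta \downarrow 0} \frac{1}{\delta}\bigl(f(c^* + \delta d_c, X^* + \delta d_X) - f(c^*, X^*)\bigr) \\
&= -\lim_{\delta \downarrow 0} \frac{1}{\delta} E_P\Bigl[\int_0^T \bigl(U(s, c^*(s,\omega) + \delta d_c(s,\omega), \omega) - U(s, c^*(s,\omega), \omega)\bigr)\,ds \\
&\qquad\qquad + B(X^*(\omega) + \delta d_X(\omega), \omega) - B(X^*(\omega), \omega)\Bigr] \\
&= -E_P\Bigl[\int_0^T d_c(s,\omega)\,U'(s, c^*(s,\omega), \omega)\,ds + d_X(\omega)\,B'(X^*(\omega), \omega)\Bigr] < \infty.
\end{align*}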

To see this, first note that the last expectation exists. Indeed, for the first summand we find that $d_c(t,\omega)\,U'(t, c^*(t,\omega), \omega) = d_c(t,\omega)\,(y_1 - y_2(t,\omega))\,Y_t(\omega)$ by (2.5a); from the definition of $d_c = c - c^* \in L^1(\lambda \otimes Q)$, $y_1 \in \mathbb{R}^+_0$, $y_2 \in L^\infty_+(\lambda \otimes Q)$, and $Y_t = E_P\bigl[\frac{dQ}{dP} \big| \mathcal{F}(t)\bigr]$ it now follows that the first summand of the integral exists. A similar argument shows that the expectation of the second summand exists. This shows that the last inequality is true. From the concavity of $U(\cdot,\cdot)$ we also have that $r(\delta) \triangleq \frac{1}{\delta}\bigl[U(t, c^*(t,\omega) + \delta d_c(t,\omega), \omega) - U(t, c^*(t,\omega), \omega)\bigr] \le d_c(t,\omega)\,U'(t, c^*(t,\omega), \omega)$ almost surely for all $\delta > 0$, and $r(\delta)$ is almost surely nondecreasing as $\delta \downarrow 0$ (Schechter, 1997, 12.15 (F)). Obviously, the same argument is true for $B(\cdot)$. Hence, the result is a consequence of the monotone convergence theorem.

Before we calculate g′, let us observe first that

\begin{align*}
E_Q\Bigl[\int_0^T c(s)\,ds\Bigr] &= E_P\Bigl[\int_0^T c(s)\,ds\; Y_T\Bigr] = \int_0^T E_P[c(s)\,Y_T]\,ds \\
&= \int_0^T E_P[c(s)\,Y_s]\,ds = E_P\Bigl[\int_0^T c(s)\,Y_s\,ds\Bigr]
\end{align*}

for c(·) ∈ L1+(λ ⊗ Q). We are using EP [c(t)YT ] = EP [EP [c(t)YT |F(t)]] =

EP [c(t)EP [YT |F(t)]] = EP [c(t)Yt] and Tonelli’s theorem (Bauer, 1992, Satz

23.6). Calculating g′ is thus easy:

\[ g'(c^*, X^*)(d) = \begin{pmatrix} E_P\bigl[\int_0^T d_c(s,\omega)\,Y_s(\omega)\,ds + d_X(\omega)\,Y_T(\omega)\bigr] \\ -d_c(s,\omega) \\ -d_X(\omega) \end{pmatrix}. \]

Suppose, (2.5) holds; then (f ′ (c∗, X∗) + u g′ (c∗, X∗)) (d) = 0 follows, where

u ∈ C′ is defined by

\[ u(x, d_c, d_X) = y_1 x + E_Q\Bigl[\int_0^T y_2(s,\omega)\,d_c(s,\omega)\,ds\Bigr] + E_Q[y_3(\omega)\,d_X(\omega)] \]

for arbitrary x ∈ IR, dc ∈ L1(λ⊗Q), and dX ∈ L1(Q); hence (B.3) of Theorem

B.2.16 is fulfilled. (B.4) is equal to (2.4).

In order to establish the theorem it remains to show the C-quasiconvexity of

the mapping (f, g, h), where C is defined as in (B.7). But this is a consequence

of Lemma B.2.19 in connection with Lemma B.2.18: indeed f is a convex

function S 7→ IR, hence IR−-quasiconvex by Lemma B.2.18. Since g is an affine


mapping (an affine mapping is for any set C trivially C-quasiconvex), the result

follows from Lemma B.2.19.

Just as we have applied Theorem B.2.16 to prove the first part of the theorem, we can apply Theorem B.2.17 to prove the second part. Note that by Remark 2.1.9 $X^* \in L^1_+(Q)$ and $c^*, c \in L^1_+(\lambda \otimes Q)$, which are real Banach spaces, and recall that the topological duals of $L^1(\lambda \otimes Q)$ and $L^1(Q)$ can be identified with $L^\infty(\lambda \otimes Q)$ and $L^\infty(Q)$. We omit the details. (2.6) then follows

from Theorem B.2.15.

C.1.2 Proof of Corollary 2.1.14

$c^*(\cdot)$ really is progressively measurable since $(U')^{-1}$ is jointly continuous (cf. Lemma 2.1.4) and $Y$ is càdlàg. Hence (2.7) follows from (2.5) of Theorem 2.1.12, since $Y_t > 0$ for all $t \in I$ almost surely. Indeed, (2.4) is equivalent to

\begin{align}
0 &= y_1 \Bigl(W_0 - E_Q\Bigl[\int_0^T c^*(s)\,ds + X^*\Bigr]\Bigr) \tag{C.2a} \\
0 &= E_Q\Bigl[\int_0^T y_2(s)\,\bigl(c^*(s) - \underline{c}(s)\bigr)\,ds\Bigr] \tag{C.2b} \\
0 &= E_Q\bigl[y_3\,(X^* - \underline{x})\bigr] \tag{C.2c}
\end{align}

because all the summands are non-negative by assumption. Let an arbitrary $A \subset \{(t,\omega) \in I \times \Omega : c^*(t,\omega) > \underline{c}(t,\omega)\}$ be measurable and suppose $(\lambda \otimes Q)(A) > 0$; then from (C.2b), $y_2(t) = 0$ almost surely on $A$ since the integrand is non-negative. From this and (2.5a), (2.7a) follows on $\{(t,\omega) \in I \times \Omega : c^*(t,\omega) > \underline{c}(t,\omega)\}$. On $\Omega \setminus \{(t,\omega) \in I \times \Omega : c^*(t,\omega) > \underline{c}(t,\omega)\}$, $c^*(t) = \underline{c}(t)$ holds almost surely. Consequently, $y_2(t) \ge 0$, $U'(t, c^*(t)) = U'(t, \underline{c}(t)) = (y_1 - y_2(t))\,Y_t \le y_1 Y_t$. From (2.5a) and the definition of the inverse $c^*(t) = \underline{c}(t) = (U')^{-1}(t, (y_1 - y_2(t))\,Y_t) = (U')^{-1}(t, y_1 Y_t)$ and the result follows. Similarly, (C.2c) and (2.5b) imply (2.7b).

Finally, (2.8) is a consequence of (2.3a). To see this, suppose

\[ W_0 > \tilde{W}_0 = E_Q\Bigl[\int_0^T c^*(s)\,ds + X^*\Bigr] \]

for an optimal solution $c^*(\cdot), X^*$. Then replace $X^*$ with $\tilde{X}^* = X^* + W_0 - \tilde{W}_0$, and thus $\tilde{X}^* > \underline{x}$ since $X \ge \underline{x}$ and $W_0 - \tilde{W}_0 > 0$, i.e. $\tilde{X}^*$ is in the interior of $\mathrm{dom}(B)$ almost surely. Because $B' > 0$ almost surely on the interior of $\mathrm{dom}(B)$, $B$ is strictly increasing on the interior of $\mathrm{dom}(B)$.

\[ E_P\Bigl[\int_0^T U(s, c^*(s))\,ds + B(\tilde{X}^*)\Bigr] > E_P\Bigl[\int_0^T U(s, c^*(s))\,ds + B(X^*)\Bigr] \]

follows and this is a contradiction to the assumption that $c^*(\cdot), X^*$ is optimal. Hence, we can always assume that equality holds in (2.3a) for an optimal solution.

C.1.3 Proof of Lemma 2.1.20

Note first that the Standing Assumption on p. 34 ensures that the solutions are finite. That the combination of $c^D$ and the $W_T$ is a candidate solution to Problem 2.1.8 is almost immediate since $W_t \ge 0$ holds by constraint (1.2), and $c^D(t) \ge 0$ by assumption (and also necessarily for an optimal solution). From the definition of a local martingale measure,

\[ \int_0^t c^D(s)\,ds + W_t = W_0 + \int_{0+}^t \xi^D_s \cdot dS_s \]

is a $Q^m$-local martingale bounded from below, hence a supermartingale by the Fatou Lemma (Rogers and Williams, 1994a, Lemma 4.14.3), and

\[ E_{Q^m}\Bigl[\int_0^T c^D(s)\,ds + W_T\Bigr] \le W_0 \]

follows. It remains to show that there does not exist a superior solution to

Problem 2.1.8.

Suppose therefore that this is not the case and let $(c^S, X^S)$ be a superior solution to Problem 2.1.8. We show that this would imply existence of a solution to Problem 2.1.5 superior to $(\xi^D, c^D)$, which is a contradiction. Consider the $Q^m$-martingale

\[ M_t = E_{Q^m}\Bigl[\int_0^T c^S(s)\,ds + X^S \,\Big|\, \mathcal{F}(t)\Bigr]. \]

By the predictable representation property (see Definition 2.1.16), there is a predictable process $\xi$ such that

\[ M_t = M_0 + \int_{0+}^t \xi_s \cdot dS_s \quad \forall\, t \in I \quad Q^m\text{-a.s.}, \]

where without loss of generality we can assume that $M_0 = W_0$. One ends up with the following wealth process

\[ W^S_t = W_0 + \int_{0+}^t \xi_s \cdot dS_s - \int_0^t c^S(s)\,ds \quad \forall\, t \in I \quad Q^m\text{-a.s.} \]

for the pair $(\xi, c^S)$. From the definition $W_T = X^S$ (i.e. $W_T$ exists), and clearly

\[ E_P\Bigl[\int_0^T U(s, c^S(s))\,ds + B(W^S_T)\Bigr] > E_P\Bigl[\int_0^T U(s, c^D(s))\,ds + B(W^D_T)\Bigr] \]

by assumption. To prove a contradiction it therefore remains to show that (1.2) is satisfied for $W^S_t$, i.e. $(\xi, c^S) \in \mathcal{A}(S, W_0)$. But

\begin{align*}
W^S_t &= W_0 + \int_{0+}^t \xi_s \cdot dS_s - \int_0^t c^S(s)\,ds \\
&= M_t - \int_0^t c^S(s)\,ds \\
&= E_{Q^m}\Bigl[\int_t^T c^S(s)\,ds + X^S \,\Big|\, \mathcal{F}(t)\Bigr] \ge 0.
\end{align*}

Here the inequality stems from the fact that $\int_t^T c^S(s)\,ds \ge 0$, $X^S \ge 0$ $Q^m$-a.s. (the conditional expectation operator is a positive linear operator).

The proof of the converse is immediate in light of the above.

C.2 A Distributional Property of Ito Processes

C.2.1 Lemma. With the notation of Theorem 4.3.6, let $\mu(\cdot)$ be an $\mathbb{R}$-valued predictable process, and let $\gamma(\cdot), \tilde{\gamma}(\cdot)$ be $\mathbb{R}^N$-valued predictable processes. Suppose that $\gamma'(\cdot)\gamma(\cdot) = \tilde{\gamma}'(\cdot)\tilde{\gamma}(\cdot)$ almost surely. Define

\[ \xi(t) \triangleq \int_0^t \mu(s)\,ds + \int_0^t \gamma'(s)\,dZ(s), \]

\[ \tilde{\xi}(t) \triangleq \int_0^t \mu(s)\,ds + \int_0^t \tilde{\gamma}'(s)\,dZ(s). \]

Then $\xi(t) - \xi(s)$, $\tilde{\xi}(t) - \tilde{\xi}(s)$, $s \le t$ have the same distribution conditional on $\mathcal{F}(s)$.

Proof. We give the proof in two steps. The first step reduces the problem

to one where all processes involved are one-dimensional. The second gives a

simple proof for this one-dimensional setting.

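Step 1 (reduction to a one-dimensional problem): Define $\hat{\gamma}(\cdot) \triangleq \sqrt{\gamma'(\cdot)\gamma(\cdot)}$,

\[ \gamma_{\mathrm{inv}}(\cdot) \triangleq \begin{cases} \dfrac{1}{\hat{\gamma}(\cdot)} & \text{if } \hat{\gamma}(\cdot) > 0 \\ 0 & \text{else,} \end{cases} \]

and

\[ \hat{Z}(t) \triangleq \Bigl(\int_0^t \gamma_{\mathrm{inv}}(s)\,\gamma'(s)\,dZ(s) + \int_0^t \bigl(1 - \gamma_{\mathrm{inv}}(s)\,\hat{\gamma}(s)\bigr)\,d\bar{Z}(s)\Bigr)_{t \in I}. \]

Then it can easily be checked using Lévy's characterization theorem that $(\hat{Z}(t))$ is a one-dimensional Wiener process, provided $(\bar{Z}(t))$ is one, independent of $(Z(t))$ (e.g. Theorem 4.3.6 in Revuz and Yor, 1999; Protter, 1990, Chapter 2, Theorem 38). A calculation yields almost surely

\[ \xi(t) - \int_0^t \mu(s)\,ds - \int_0^t \hat{\gamma}(s)\,d\hat{Z}(s) = 0. \]

Hence $(\xi(t))$ and $\bigl(\int_0^t \mu(s)\,ds + \int_0^t \hat{\gamma}(s)\,d\hat{Z}(s)\bigr)$ are modifications; being right-continuous, they are indistinguishable (Protter, 1990, Chapter 1, Theorem 2).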

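Step 2 (proof of the one-dimensional problem): Assume that all processes involved are one-dimensional. Then by assumption $\gamma(\cdot)^2 = \bar\gamma(\cdot)^2$, i.e. $A(t) \triangleq \{\gamma(t) = \bar\gamma(t)\}$ is $\mathcal{F}(t)$-measurable and $A^c(t) = \{\gamma(t) = -\bar\gamma(t)\}$. From the properties of the stochastic integral with respect to Brownian motion, $\xi(t)$ and $\bar\xi(t)$ have the same distribution. Indeed, this is easily seen to be true if $\gamma(\cdot)$ is simple. Using suggestive notation

\[
\begin{aligned}
F_n(c) &\triangleq P\Bigl( X_n \triangleq \int_0^t \gamma(s)\,dZ(s) \le c \Bigr)
= P\Bigl( \sum_{i=1}^n \gamma_i (Z_{t_{i+1}} - Z_{t_i}) \le c \Bigr)\\
&= \int\!\cdots\!\int_{c_1,\dots,c_{n-1} \in \mathbb{R}} P\bigl(\gamma_1 (Z_{t_2} - Z_{t_1}) = c_1\bigr)\, P\bigl(\gamma_2(c_1)(Z_{t_3} - Z_{t_2}) = c_2 - c_1\bigr) \cdots\\
&\qquad\qquad P\Bigl(\gamma_n(c_1,\dots,c_{n-1})(Z_{t_{n+1}} - Z_{t_n}) \le c - \sum_{i=1}^{n-1} c_i\Bigr)\,dc_1 \dots dc_{n-1}\\
&= \int\!\cdots\!\int_{c_1,\dots,c_{n-1} \in \mathbb{R}} P\bigl(\bar\gamma_1 (Z_{t_2} - Z_{t_1}) = c_1\bigr)\, P\bigl(\bar\gamma_2(c_1)(Z_{t_3} - Z_{t_2}) = c_2 - c_1\bigr) \cdots\\
&\qquad\qquad P\Bigl(\bar\gamma_n(c_1,\dots,c_{n-1})(Z_{t_{n+1}} - Z_{t_n}) \le c - \sum_{i=1}^{n-1} c_i\Bigr)\,dc_1 \dots dc_{n-1}\\
&= P\Bigl( \bar X_n \triangleq \int_0^t \bar\gamma(s)\,dZ(s) \le c \Bigr) \triangleq \bar F_n(c).
\end{aligned}
\]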

Therefore, for simple $\gamma(\cdot)$, the distributions of $(X_n)$, $(\bar X_n)$ are the same. By standard theory, any integral with respect to Brownian motion can be approximated by a sequence of simple integrals. Here $(X_n)$, $(\bar X_n)$ converge in $L^2(\Omega, P, \mathcal{F})$, hence also in distribution.¹ We conclude that the distributions of the limits must be the same.

¹ Convergence in distribution is often also called “weak”, or in a slightly different context also “vague” convergence (Bauer, 1992, Chapter 30). Throughout the thesis, however, we use the term “weak convergence” in its topological meaning.

C.3 Solution to a SDE

We consider the stochastic differential equation

\[
X_t = -C_t + X_0 + \int_{0+}^t X_{s-}\,dZ_s. \tag{C.3}
\]

This stochastic differential equation has been studied extensively (see Protter, 1990, Chapter 5.9). If C is a semimartingale, Yoeurp and Yor (1977) seem to have solved this equation first (cited in Bichteler, 2002 and Protter, 1990).² The solution is a special case of Jaschke (2003).

We will calculate a solution for the situation of interest to us.

C.3.1 Theorem. Let $(C_t)_{t \in I}$ be a continuous and nondecreasing process with $X_0 - C_0 > 0$. Then a unique solution to (C.3) exists.

Suppose that for this unique solution, $X \ge 0$ holds almost surely. Let $\tau \triangleq \sup\{t \in I : X_{t-} > 0\}$ and suppose further that $X_t = 0$ on the set $\{t \ge \tau\}$. Then the unique solution is given by

\[
X_t = I_{\{X_{t-} > 0\}}\,\mathcal{E}(Z)_t \left( X_0 - \int_{0+}^{t \wedge \tau} \frac{1}{\mathcal{E}(Z)_{s-}}\,dC_s \right).
\]

Proof. The unique solution to this stochastic differential equation exists (Protter, 1990, Chapter 5, Theorem 7). We also observe that $\tau = \inf\{t : X_t \le 0\}$ is a stopping time (e.g. Protter, 1990, Chapter 1, Theorem 4). Since $X \ge 0$, $\tau = \inf\{t : X_t = 0\}$. Furthermore, we find $X_{t \vee \tau} = 0$ and $C_{t \vee \tau} = C_t$. To avoid tedious notation, we can therefore limit all calculations to the set $\{t < \tau\}$, something we will do.

To solve (C.3), we use the standard technique (variation of constants), i.e. we conjecture a solution $X = \mathcal{E}(Z)Y$, and assume that $(Y_t)_{t \in I}$ is continuous and nonincreasing. Integration by parts yields

\[
dX_t = \mathcal{E}(Z)_{t-}\,dY_t + Y_{t-}\,d\mathcal{E}(Z)_t + d[\mathcal{E}(Z), Y]_t.
\]

$Y$ continuous implies $[\mathcal{E}(Z), Y]_t = Y_0 + [\mathcal{E}(Z), Y]^c_t$ (Protter, 1990, Chapter 2, Theorem 23(i)). Since $Y$ is of finite variation, $[\mathcal{E}(Z), Y]^c_t = 0$ follows (Bichteler, 2002, Exercise 3.8.12). Using $Y_{t-}\,d\mathcal{E}(Z)_t = Y_{t-}\mathcal{E}(Z)_{t-}\,dZ_t = X_{t-}\,dZ_t$, we find

\[
dX_t = \mathcal{E}(Z)_{t-}\,dY_t + X_{t-}\,dZ_t.
\]

On the other hand, $X$ is by assumption a solution to (C.3):

\[
dX_t = -dC_t + X_{t-}\,dZ_t.
\]

Equating the last two equations, we have

\[
\mathcal{E}(Z)_{t-}\,dY_t = -dC_t.
\]

From (C.3), $\Delta X_t = X_{t-}\Delta Z_t$; hence if $\Delta Z_t \le -1$, then $X_{t-} > 0 \Rightarrow X_t \le 0$, contradicting $X > 0$ on $[0, \tau)$. We therefore find $\Delta Z_t > -1$ on $\{t < \tau\}$, i.e. $\mathcal{E}(Z)_{t-} > 0$. This leaves us with

\[
Y_t = X_0 - \int_{0+}^t \frac{1}{\mathcal{E}(Z)_{s-}}\,dC_s,
\]

which justifies our assumptions concerning $Y$ and completes the proof.

² The author could not get hold of this paper.

Theorem C.3.1 is a special case of Jaschke (2003, Theorem 1). In a more general setting, Jaschke proves, using different techniques, that the solution of the stochastic differential equation is

\[
X_t = \mathcal{E}(Z)_t \left\{ X_0 - \int_{0+}^t \frac{1}{\mathcal{E}(Z)_{s-}}\, d\Bigl( C_s - [C,Z]^c_s - \sum_{0 \le u \le s} \frac{\Delta C_u \Delta Z_u}{1 + \Delta Z_u} \Bigr) \right\}.
\]

Since $C$ is continuous and of finite variation, $[C,Z]^c = 0$ and $\Delta C = 0$, and this is Theorem C.3.1.
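The role of the jump correction in the general formula can be seen in a pure-jump, discrete-time analogue, which is an illustrative assumption and not the continuous setting of Theorem C.3.1. If C and Z only move at grid points, with C_0 = 0 and every ΔZ different from −1, equation (C.3) becomes the recursion X_k = X_{k−1}(1 + ΔZ_k) − ΔC_k, the stochastic exponential is the running product of the factors 1 + ΔZ_j, and the continuous covariation term vanishes. The following sketch, with the hypothetical function names `solve_recursively` and `solve_by_formula`, evaluates both sides.

```python
def solve_recursively(x0, d_c, d_z):
    """Iterate X_k = X_{k-1} * (1 + dZ_k) - dC_k, the pure-jump form of (C.3)."""
    x = x0
    for dc, dz in zip(d_c, d_z):
        x = x * (1.0 + dz) - dc
    return x


def solve_by_formula(x0, d_c, d_z):
    """Evaluate the closed form including the jump correction term.

    The stochastic exponential is E_k = prod_{j <= k} (1 + dZ_j). Each jump
    of C contributes (dC - dC * dZ / (1 + dZ)) / E_{k-1} to the integral.
    """
    e_prev = 1.0      # value of the stochastic exponential just before the jump
    integral = 0.0
    for dc, dz in zip(d_c, d_z):
        if 1.0 + dz == 0.0:
            raise ValueError("the formula requires dZ != -1")
        integral += (dc - dc * dz / (1.0 + dz)) / e_prev
        e_prev *= 1.0 + dz
    return e_prev * (x0 - integral)


if __name__ == "__main__":
    jumps_c = [0.1, 0.0, 0.3, 0.05]
    jumps_z = [0.2, -0.5, 0.1, 1.0]
    print(solve_recursively(2.0, jumps_c, jumps_z))
    print(solve_by_formula(2.0, jumps_c, jumps_z))
```

Both functions agree up to rounding for the sample jumps. Dropping the correction term in `solve_by_formula` would break the agreement whenever C and Z jump together, which is exactly the case excluded by the continuity of C in Theorem C.3.1.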

C.3.2 Corollary. Given the setting of Theorem C.3.1, suppose that $\mathcal{E}(Z) > 0$ almost surely (e.g. if $Z$ is a continuous semimartingale). Then

\[
X_t = \mathcal{E}(Z)_t \left( X_0 - \int_{0+}^t \frac{1}{\mathcal{E}(Z)_{s-}}\,dC_s \right).
\]

Proof. In the proof of Theorem C.3.1, the assumption $X_{t-} > 0$ was only needed to ensure $\mathcal{E}(Z)_{t-} > 0$.

If Z is a continuous semimartingale, the corollary is a special case of Protter

(1990, Chapter 5, Theorem 52). For Brownian motion, the result follows from

Karatzas and Shreve (1991, Problem 5.6.15).
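For the Brownian case, the closed form of the corollary can be compared with a direct Euler discretization of (C.3) along one simulated path. The sketch below rests on assumptions made only for illustration: Z = σW for a scalar Brownian motion W, so that the stochastic exponential is exp(σW_t − σ²t/2), and C_t = rate · t. The function name and all parameter values are hypothetical, and the agreement holds only up to discretization error.

```python
import math
import random


def compare_euler_with_closed_form(x0=1.0, rate=0.1, sigma=0.2,
                                   horizon=1.0, n_steps=2000, seed=1):
    """Return (Euler value, closed-form value) of X at the horizon."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sd = math.sqrt(dt)
    x_euler = x0
    w = 0.0           # Brownian path W
    integral = 0.0    # left-point approximation of int_0^t 1/E(Z)_s dC_s
    for k in range(n_steps):
        t = k * dt
        stoch_exp = math.exp(sigma * w - 0.5 * sigma * sigma * t)
        integral += rate * dt / stoch_exp
        dw = rng.gauss(0.0, sd)
        # Euler step for dX = -dC + X dZ with dC = rate*dt and dZ = sigma*dW.
        x_euler += x_euler * sigma * dw - rate * dt
        w += dw
    stoch_exp_end = math.exp(sigma * w - 0.5 * sigma * sigma * horizon)
    x_closed = stoch_exp_end * (x0 - integral)
    return x_euler, x_closed


if __name__ == "__main__":
    euler_value, closed_value = compare_euler_with_closed_form()
    print(euler_value, closed_value)
```

Refining the grid (larger `n_steps`) should bring the two values closer together, since both the Euler scheme and the left-point rule for the integral converge as the step size shrinks.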


C.4 A Simple Comparison Theorem

The following comparison theorem is tailor-made for our purpose. The proof is standard (see e.g. Chapter 9, Theorem 3.7, in Revuz and Yor, 1999; Protter, 1990, Theorem 5.54). The theorem is essentially due to Ikeda and Watanabe (Rogers and Williams, 1994b, Theorem 43.1).

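C.4.1 Theorem. Let $(Z(t), \mathcal{F}(t))_{t \in I}$ be an $N$-dimensional Brownian motion on $(\Omega, \mathcal{F}, P)$. For $\mu_1, \mu_2, \sigma$ progressively measurable and $c$ progressive, define

\[
\begin{aligned}
X_1(t) &\triangleq X_0 + \int_0^t X_1(s)\mu_1(s)\,ds + \int_0^t X_1(s)\sigma'(s)\,dZ(s) + \int_0^t c(s)\,ds,\\
X_2(t) &\triangleq X_0 + \int_0^t X_2(s)\mu_2(s)\,ds + \int_0^t X_2(s)\sigma'(s)\,dZ(s) + \int_0^t c(s)\,ds.
\end{aligned}
\]

Suppose that $X_2 \ge 0$ and $\mu_1 \ge \mu_2$ almost surely. Then $X_1(t) \ge X_2(t)$ almost surely for all $t \in I$.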

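Proof. $\mathcal{E}\bigl(\int_0^\cdot \sigma'(s)\,dZ(s)\bigr) > 0$. Set

\[
U \triangleq X_1 - X_2, \qquad
V \triangleq \int_0^\cdot \sigma'(s)\,dZ(s), \qquad
W \triangleq \int_0^\cdot \bigl( X_1(s)\mu_1(s) - X_2(s)\mu_2(s) \bigr)\,ds.
\]

Then $U(t) = W(t) + \int_0^t U(s)\,dV(s)$, hence (Protter, 1990, Theorem 5.52)

\[
\begin{aligned}
\frac{U}{\mathcal{E}(V)} &= \int_0^\cdot \frac{1}{\mathcal{E}(V(s))}\,dW(s)\\
&= \int_0^\cdot \frac{X_1(s)\mu_1(s) - X_2(s)\mu_2(s)}{\mathcal{E}(V(s))}\,ds\\
&= \int_0^\cdot \frac{X_2(s)(\mu_1(s) - \mu_2(s)) + U(s)\mu_1(s)}{\mathcal{E}(V(s))}\,ds.
\end{aligned}
\]

This is a simple integral equation that can be solved pathwise, i.e. for each $\omega \in \Omega$:

\[
\frac{U}{\mathcal{E}(V)} = \int_0^\cdot \exp\Bigl\{ \int_s^\cdot \mu_1(u)\,du \Bigr\} \frac{X_2(s)(\mu_1(s) - \mu_2(s))}{\mathcal{E}(V(s))}\,ds \ge 0.
\]

Since $\mathcal{E}(V) > 0$, it follows that $U \ge 0$ almost surely. This completes the proof.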
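The ordering asserted by the comparison theorem can be observed on a simulated path. The following sketch makes simplifying assumptions that are not in the theorem: a one-dimensional Brownian motion, constant coefficients, a nonnegative constant inflow c, and an Euler discretization. The function name `simulate_pair` and all parameter values are hypothetical. Both processes are driven by the same Brownian increments and differ only in their drift rates.

```python
import math
import random


def simulate_pair(x0=1.0, mu1=0.08, mu2=0.03, sigma=0.2, inflow=0.05,
                  horizon=1.0, n_steps=1000, seed=2):
    """Euler paths of X1 and X2 driven by the same Brownian increments.

    Returns a list of (x1, x2) pairs, one per grid point.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sd = math.sqrt(dt)
    x1 = x2 = x0
    path = [(x1, x2)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        # Same noise and same inflow; only the drift rate differs.
        x1 = x1 * (1.0 + mu1 * dt + sigma * dw) + inflow * dt
        x2 = x2 * (1.0 + mu2 * dt + sigma * dw) + inflow * dt
        path.append((x1, x2))
    return path


if __name__ == "__main__":
    final_x1, final_x2 = simulate_pair()[-1]
    print(final_x1, final_x2)
```

As long as every growth factor 1 + μ dt + σ ΔW stays positive, which holds here for any realistic draw, the discrete scheme preserves the ordering X1 ≥ X2 ≥ 0 step by step, mirroring the pathwise argument of the proof.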

Bibliography

C. D. Aliprantis and K. C. Border. Infinite Dimensional Analysis. Springer,

Berlin, second edition, 1999.

J. Amendinger. Martingale representation theorems for initially enlarged

filtrations. Stochastic Processes and their Applications, 89:101–116, 2000.

P. Bank. Singular Control of Optional Random Measures: Stochastic

Optimization and Representation Problems Arising in the Microeconomic Theory

of Intertemporal Consumption Choice. PhD thesis, Humboldt University of

Berlin, 2000.

H. Bauer. Maß- und Integrationstheorie. Walter de Gruyter, Berlin, second

edition, 1992.

F. Bellini and M. Frittelli. On the existence of minimax martingale measures.

Mathematical Finance, 12:1–21, 2002.

J. Bertoin. Levy Processes, volume 121 of Cambridge Tracts in Mathematics.

Cambridge University Press, Cambridge, 1996.

K. Bichteler. Stochastic Integration with Jumps, volume 89 of Encyclopedia of

Mathematics and its Applications. Cambridge University Press, Cambridge,

2002.

J.-M. Bismut. Growth and optimal intertemporal allocations of risks. Journal

of Economic Theory, 10:239–287, 1975.


T. Bjørk. Arbitrage Theory in Continuous Time. Oxford University Press,

New York, 1998.

B. Bouchard and L. Mazliak. A multidimensional bipolar theorem in

$L^0(\mathbb{R}^d; \Omega; \mathcal{F}; P)$. Stochastic Processes and their Applications, 107:213–231,

2003.

W. Brannath and W. Schachermayer. A bipolar theorem for $L^0_+(\Omega, \mathcal{F}, P)$.

Seminaire de Probabilites, XXXIII:349–354, 1999.

S. Browne. Beating a moving target: optimal portfolio strategies for

outperforming a stochastic benchmark. Finance and Stochastics, 3:275–294, 1999.

J. Y. Campbell and L. M. Viceira. Strategic Asset Allocation. Clarendon

Lectures in Economics. Oxford University Press, Oxford, 2002.

C. Castaing and M. Valadier. Convex Analysis and Measurable Multifunctions,

volume 580 of Lecture Notes in Mathematics. Springer, Berlin, 1977.

A. S. Cherny and A. N. Shiryaev. Vector stochastic integrals and the

fundamental theorems of asset pricing. Transactions of the French-Russian A. M.

Liapunov Institute, 3:5–37, 2001.

G. M. Constantinides. Habit formation: a resolution of the equity premium

puzzle. Journal of Political Economy, 93:519–543, 1990.

T. Cover. Universal portfolios. Mathematical Finance, 1:1–29, 1991.

J. C. Cox and C.-F. Huang. Optimal consumption and portfolio policies when

asset prices follow a diffusion process. Journal of Economic Theory, 49:33–83,

1989.

J. C. Cox and C.-F. Huang. A variational problem arising in financial

economics. Journal of Mathematical Economics, 20:465–487, 1991.

D. Cuoco. Optimal consumption and equilibrium prices with portfolio

constraints and stochastic income. Journal of Economic Theory, 72:33–73, 1997.


J. Cvitanic. Theory of portfolio optimization in markets with frictions. In

Handbook of Mathematical Finance. Cambridge University Press, Cambridge,

1999.

J. Cvitanic and I. Karatzas. Convex duality in constrained portfolio

optimization. The Annals of Applied Probability, 2:767–818, 1992.

J. Cvitanic, W. Schachermayer, and H. Wang. Utility maximization in

incomplete markets with random endowment. Finance and Stochastics, 5:259–272,

2001.

M. H. A. Davis. Option pricing in incomplete markets. In M. A. H.

Dempster and S. R. Pliska, editors, Mathematics of Derivative Securities, pages

216–226. Cambridge University Press, Cambridge, 1997.

G. Deelstra, H. Pham, and N. Touzi. Dual formulation of the utility

maximization problem under transaction costs. The Annals of Applied Probability,

11:1353–1383, 2001.

F. Delbaen. The structure of m-stable sets and in particular of the set of risk

neutral measures. Eidgenossische Technische Hochschule, Zurich, 2003.

F. Delbaen and W. Schachermayer. A general version of the fundamental

theorem of asset pricing. Mathematische Annalen, 300:463–520, 1994.

F. Delbaen and W. Schachermayer. The no-arbitrage property under a change

of numeraire. Stochastics, 53:213–226, 1995.

F. Delbaen and W. Schachermayer. Non-arbitrage and the fundamental

theorem of asset pricing: Summary of main results. Proceedings of Symposia in

Applied Mathematics, 00:1–10, 1997.

F. Delbaen, P. Grandits, T. Rheinlander, D. J. Samperi, M. Schweizer, and

C. Stricker. Exponential hedging and entropic penalties. Mathematical

Finance, 12:99–123, 2002.


C. Dellacherie and P. A. Meyer. Probabilites et Potentiel: Chapitres I a IV.

Hermann, Paris, 1975.

C. Dellacherie and P. A. Meyer. Probabilites et Potentiel: Chapitres V a VIII.

Hermann, Paris, 1980.

J. B. Detemple and F. Zapatero. Asset prices in an exchange economy with

habit formation. Econometrica, 59:1633–1657, 1991.

D. Duffie. Dynamic Asset Pricing Theory. Princeton University Press,

Princeton, third edition, 2001.

D. Duffie and L. G. Epstein. Stochastic differential utility. Econometrica, 60:

353–394, 1992.

D. Duffie and C.-F. Huang. Implementing Arrow-Debreu equilibria by

continuous trading of few long-lived securities. Econometrica, 55:1337–1356, 1985.

D. Duffie and C.-F. Huang. Multiperiod security markets with differential

information. Journal of Mathematical Economics, 15:283–303, 1986.

N. El Karoui and M. Quenez. Dynamic programming and pricing of contingent

claims in an incomplete market. SIAM Journal of Control and Optimization,

33:29–66, 1995.

N. El Karoui, S. Peng, and M. Quenez. Backward stochastic differential

equations in finance. Mathematical Finance, 7:1–71, 1997.

R. J. Elliott and P. E. Kopp. Mathematics of Financial Markets. Springer

Finance. Springer, New York, 1999.

M. Emery. Une topologie sur l’espace des semimartingales. In Seminaire de

Probabilites XIII, volume 721 of Lecture Notes in Mathematics, pages 260–280.

Springer, 1979.

M. Emery. Compensation de processus a variation finie non localement

integrables. In Seminaire de Probabilites XIV, volume 784 of Lecture Notes

in Mathematics, pages 152–160. Springer, Berlin, 1980.


L. G. Epstein and T. Wang. Intertemporal asset pricing under Knightian

uncertainty. Econometrica, 61:283–322, 1994.

L. G. Epstein and S. Zin. Substitution, risk aversion and the temporal

behavior of asset returns: a theoretical framework. Econometrica, 57:937–969,

1989.

C. A. Filitti. Portfolio Selection in Continuous Time: Analytical and

Numerical Methods. PhD thesis, University of St.Gallen, 2004.

W. H. Fleming and H. M. Soner. Controlled Markov Processes and Viscosity

Solutions, volume 25 of Applications of Mathematics. Springer, Berlin, 1993.

W. H. Fleming and T. Zariphopoulou. An optimal investment /

consumption model with borrowing. Mathematics of Operations Research, 16:802–822,

1991.

L. Foldes. Optimal saving and risk in continuous time. The Review of

Economic Studies, 45:39–65, 1978.

H. Follmer and Y. M. Kabanov. Optional decomposition and Lagrange

multipliers. Finance and Stochastics, 2:69–81, 1998.

H. Follmer and D. O. Kramkov. Optional decomposition under constraints.

Probability Theory and Related Fields, 109:1–25, 1997.

M. Frittelli. Semimartingales and asset pricing under constraints. In M. A. H.

Dempster and S. R. Pliska, editors, Mathematics of Derivative Securities,

pages 255–268. Cambridge University Press, Cambridge, 1997.

N. Hakansson. Optimal investment and consumption strategies under risk for

a class of utility functions. Econometrica, 38:587–607, 1970.

H. He and N. D. Pearson. Consumption and portfolio policies with incomplete

markets and short-sale constraints: The finite-dimensional case. Mathematical

Finance, 1:1–10, 1991a.


H. He and N. D. Pearson. Consumption and portfolio policies with incomplete

markets and short-sale constraints: The infinite-dimensional case. Journal of

Economic Theory, 54:259–304, 1991b.

A. Hindy and C.-F. Huang. Optimal consumption and portfolio rules with

durability and local substitution. Econometrica, 61:85–121, 1993.

A. Hindy, C.-F. Huang, and D. M. Kreps. On intertemporal preferences in

continuous time: the case of certainty. Journal of Mathematical Economics,

21:401–440, 1992.

A. Hindy, C.-F. Huang, and S. Zhu. Optimal consumption and portfolio

rules with durability and habit formation. Journal of Economic Dynamics &

Control, 21:525–550, 1997.

J. Jahn. Introduction to The Theory of Nonlinear Optimization. Springer,

Berlin, second edition, 1996.

F. Jamshidian. Asymptotically optimal portfolios. Mathematical Finance, 1:

131–150, 1991.

S. Jaschke. A note on the inhomogeneous linear stochastic differential

equation. Insurance: Mathematics and Economics, 32:461–464, 2003.

Y. M. Kabanov and C. Stricker. Hedging of contingent claims under

transaction costs. Technical report, 2002.

G. Kallianpur and R. L. Karandikar. Introduction to Option Pricing Theory.

Birkhauser, Boston, 2000.

J. Kallsen. Semimartingale Modelling in Finance. PhD thesis, Albert-

Ludwigs-Universitat Freiburg i. Br., 1998.

N. J. Kalton, N. T. Peck, and J. W. Roberts. An F-space Sampler. Cambridge

University Press, Cambridge, 1984.

K. Kamizono. Hedging and Optimization under Transaction Costs. PhD

thesis, Columbia University, 2001.


K. Kamizono. Multivariate utility maximization under transaction costs.

Technical report, Faculty of Economics, Nagasaki University, 2003.

I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus,

volume 113 of Graduate Texts in Mathematics. Springer, New York, second

edition, 1991.

I. Karatzas and S. E. Shreve. Methods of Mathematical Finance, volume 39

of Applications of Mathematics. Springer, Berlin, 1998.

I. Karatzas and G. Zitkovic. Optimal consumption from investment and

random endowment in incomplete semimartingale markets. Annals of Probability,

31(4):1821–1858, 2003.

I. Karatzas, J. P. Lehoczky, S. P. Sethi, and S. E. Shreve. Explicit solution

of a general consumption / investment problem. Mathematics of Operations

Research, 11:261–294, 1986.

I. Karatzas, J. P. Lehoczky, and S. E. Shreve. Optimal portfolio and

consumption decisions for a “small investor” on a finite horizon. SIAM Journal

of Control and Optimization, 25:1557–1586, 1987.

I. Karatzas, J. P. Lehoczky, S. E. Shreve, and G.-L. Xu. Martingale and

duality methods for utility maximization in an incomplete market. SIAM

Journal of Control and Optimization, 29:702–730, 1991.

A. Khanna and M. Kulldorff. A generalization of the mutual fund theorem.

Finance and Stochastics, 3:167–185, 1999.

R. Korn. Optimal Portfolios: Stochastic Models for Optimal Investment and

Risk Management in Continuous Time. World Scientific, Singapore, 1997.

R. Korn and S. Trautmann. Continuous-time portfolio optimization under

terminal wealth constraints. Mathematical Methods of Operations Research,

42:69–92, 1995.


D. O. Kramkov. Optional decomposition of supermartingales and hedging

contingent claims in incomplete security markets. Probability Theory and

Related Fields, 105:459–479, 1996.

D. O. Kramkov and W. Schachermayer. Necessary and sufficient conditions

in the problem of optimal investment in incomplete markets. The Annals of

Applied Probability, 13:1504–1516, 2003.

D. O. Kramkov and W. Schachermayer. The asymptotic elasticity of

utility functions and optimal investment in incomplete markets. The Annals of


D. M. Kreps. Three essays on capital markets. Technical Report 499, Graduate

School of Business, Stanford University, 1979.

D. M. Kreps. Arbitrage and equilibrium in economies with infinitely many

commodities. Journal of Mathematical Economics, 8:15–35, 1981.

D. M. Kreps and E. L. Porteus. Temporal resolution of uncertainty and

dynamic choice theory. Econometrica, 46:185–200, 1978.

H. J. Kushner and P. Dupuis. Numerical Methods for Stochastic Control

Problems in Continuous Time, volume 24 of Applications of Mathematics.

Springer, Berlin, second edition, 2001.

A. Lazrak and M. Quenez. A generalized stochastic differential utility.

Mathematics of Operations Research, 28(1):154–180, 2003.

V. Levin. Extremal problems with convex functionals that are lower

semicontinuous with respect to convergence in measure. Doklady Mathematics, 16:

1384–1388, 1976.

X. Li, X. Y. Zhou, and A. E. B. Lim. Dynamic mean-variance portfolio

selection with no-shorting constraints. SIAM Journal of Control and Optimization,

40:1540–1555, 2002.


R. S. Liptser and A. N. Shiryaev. Statistics of Random Processes: I General

Theory, volume 5 of Applications of Mathematics. Springer, Berlin, second

edition, 2000.

R. S. Liptser and A. N. Shiryaev. Theory of Martingales. Kluwer, Dordrecht,

1989.

J. Liu. Portfolio Selection in Stochastic Environments. PhD thesis, Stanford

University, 1999.

D. G. Luenberger. Optimization by Vector Space Methods. John Wiley &

Sons, Inc., New York, 1969.

H. M. Markowitz. Portfolio selection. The Journal of Finance, 7:77–91, 1952.

A. Mas-Colell, M. D. Whinston, and J. R. Green. Microeconomic Theory.

Oxford University Press, Oxford, 1995.

J. Memin. Espaces de semi martingales et changement de probabilite. Zur

Wahrscheinlichkeitstheorie und Verwandte Gebiete, 52:9–39, 1980.

J.-F. Mertens. Theorie des processus stochastiques generaux; applications aux

surmartingales. Zur Wahrscheinlichkeitstheorie und Verwandte Gebiete, 22:

45–68, 1972.

R. C. Merton. Lifetime portfolio selection under uncertainty: The continuous-

time case. Review of Economics and Statistics, 51:247-257. In Continuous-

Time Finance Merton (1990), pages 97–119.

R. C. Merton. Optimum consumption and portfolio rules in a continuous-time

model. Journal of Economic Theory, 3:373-413. In Continuous-Time Finance

Merton (1990), pages 120–165.

R. C. Merton. Continuous-Time Finance. Blackwell Publishers, Cambridge

(Massachusetts), 1990.


R. C. Merton and P. A. Samuelson. Generalized mean-variance tradeoffs for

best perturbation corrections to approximate portfolio decisions. The Journal

of Finance, 29:27–40, 1974.

P. W. Millar. The minimax principle in asymptotic statistical theory. In

P. L. Hennequin, editor, Ecole d’Ete de Probabilites de Saint Flour XI - 1981,

volume 976 of Lecture Notes in Mathematics, pages 75–265, Springer, 1983.

M. Mnif and H. Pham. Stochastic optimization under constraints. Stochastic

Processes and their Applications, 93:149–180, 2001.

H. H. Muller. A simple method for a class of portfolio models and asset

liability models in continuous time. University of St.Gallen, Department of

Mathematics and Statistics, 2000.

M. Musiela and M. Rutkowski. Martingale Methods in Financial Modelling,

volume 36 of Applications of Mathematics. Springer, Berlin, 1997.

J. Neveu. Bases mathematiques du calcul des probabilites. Masson et Cie.,

120, bd Saint-Germain, Paris, VIe, 1964.

D. L. Ocone and I. Karatzas. A generalized Clark representation formula,

with application to optimal portfolios. Stochastics and Stochastics Reports,

34:187–220, 1991.

B. Øksendal. An introduction to Malliavin calculus with applications to

economics. Technical Report 3/96, Norwegian School of Economics and Business

Administrations, 1997.

H. Pham. Smooth solutions to optimal investment models with stochastic

volatilities and portfolio constraints. Applied Mathematics & Optimization,

46:55–78, 2002.

S. R. Pliska. A discrete time stochastic decision model. In W. H. Fleming

and L. G. Gorostiza, editors, Advances in Filtering and Optimal Stochastic

Control: Proceedings of The IFIP-WG 7/1 Working Conference, volume 42


of Lecture Notes in Control and Information Sciences, pages 290–304, Berlin,

1982. Springer.

S. R. Pliska. A stochastic calculus model of continuous trading: Optimal

portfolios. Mathematics of Operations Research, 11:371–382, 1986.

P. E. Protter. A partial introduction to financial asset pricing theory.

Stochastic Processes and their Applications, 91:169–203, 2001.

P. E. Protter. Stochastic Integration and Differential Equations, volume 21 of

Applications of Mathematics. Springer, Berlin, 1990.

D. Revuz and M. Yor. Continuous Martingales and Brownian Motion, volume

293 of Grundlehren der mathematischen Wissenschaft. Springer, Berlin, third

edition, 1999.

H. R. Richardson. A minimum variance result in continuous trading portfolio

optimization. Management Science, 35:1045–1055, 1989.

R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematical

Series. Princeton University Press, Princeton, 1970.

L. C. G. Rogers and D. Williams. Diffusions, Markov Processes and

Martingales: Foundations. Cambridge Mathematical Library. Cambridge University

Press, Cambridge, second edition, 1994a.

L. C. G. Rogers and D. Williams. Diffusions, Markov Processes and

Martingales: Ito Calculus. Cambridge Mathematical Library. Cambridge University

Press, Cambridge, second edition, 1994b.

A. D. Roy. Safety first and the holding of assets. Econometrica, 20:431–449,

1952.

P. A. Samuelson. Lifetime portfolio selection by dynamic programming.

Review of Economics and Statistics, 51:239–246, 1969.


W. Schachermayer. Optimal investment in incomplete financial markets when

wealth may become negative. The Annals of Applied Probability, 11:694–734,

2001.

W. Schachermayer. How potential investments may change the optimal

portfolio for the exponential utility. Technical report, Vienna University of

Technology, 2002.

H. H. Schaefer. Topological Vector Spaces, volume 3 of Graduate Texts in

Mathematics. Springer, Berlin, second edition, 1999.

E. Schechter. Handbook of Analysis and Its Foundations. Academic Press,

San Diego, 1997.

M. Schroder and C. Skiadas. Optimal lifetime consumption-portfolio

strategies under trading constraints and generalized recursive preferences.

Stochastic Processes and their Applications, 108:155–202, 2003.

M. Schweizer. Mean-variance hedging for general claims. The Annals of


H. Shirakawa. Optimal consumption and portfolio selection with incomplete

markets and upper and lower bound constraints. Mathematical Finance, 4:

1–24, 1994.

S. E. Shreve and G.-L. Xu. A duality method for optimal consumption and

investment under short-selling constraints: I. General market coefficients. The

Annals of Applied Probability, 2:87–112, 1992a.

S. E. Shreve and G.-L. Xu. A duality method for optimal consumption and

investment under short-selling constraints: II. Constant market coefficients.

The Annals of Applied Probability, 2:314–328, 1992b.

C. Skiadas. Recursive utility and preferences for information. Economic

Theory, 12:293–312, 1998.

C. Striebel. Optimal Control of Discrete Time Stochastic Systems, volume 110

of Lecture Notes in Economic and Mathematical Systems. Springer, Berlin,

1975.

S. M. Sundaresan. Intertemporally dependent preferences and the volatility

of consumption and wealth. The Review of Financial Studies, 2:73–89, 1989.

R. Trautner. A new proof of the Komlos-Revesz-theorem. Probability Theory

and Related Fields, 84:281–287, 1990.

J.-L. Vila and T. Zariphopoulou. Optimal consumption and portfolio choice

with borrowing constraints. Journal of Economic Theory, 77:402–431, 1997.

H. von Weizsacker. Komlos’ subsequence theorem in $L^0_+$. Technical report,

Arbeitsgruppe Stochastik und reelle Analysis, Universitat Kaiserslautern, 2000.

C. Yoeurp and M. Yor. Espace orthogonal a une semimartingale; applications.

1977.

K. Yosida. Functional Analysis, volume 123 of Grundlehren der mathematis-

chen Wissenschaften. Springer, Berlin, sixth edition, 1980.

T. Zariphopoulou. Consumption-investment models with constraints. SIAM

Journal of Control and Optimization, 32:59–85, 1994.

O. Zellweger. Risk tolerance of institutional investors. PhD thesis, University

of St.Gallen, 2003.

G. Zitkovic. A filtered version of a bipolar theorem of Brannath and

Schachermayer. Journal of Theoretical Probability, 15:41–61, 2002.

Curriculum Vitae

Education

11/98–04/05   University of St. Gallen: Doctorate studies in mathematical finance   St. Gallen, CH

11/93–10/98   University of St. Gallen: Licentiate in financial economics   St. Gallen, CH

9/95–3/98   University of Hagen: Undergraduate studies in mathematics   Hagen, D

Professional Experience

03/04–   Partners Group: Associate hedge fund investment management   Zug, CH

08/98–03/04   University of St. Gallen: Teaching assistant and research assistant   St. Gallen, CH

9/99–2/02   Vescore Solutions: Developer of econometric models for strategic asset allocation   St. Gallen, CH

9/97–2/98   Pictet & Cie.: Internship institutional asset management   Zurich, CH

7/97–8/97   Morgan Stanley: Internship fixed income sales   Frankfurt, D

11/96–1/97   Commerzbank: Internship equity research   Frankfurt, D
