a course on the theory of competitive financial markets · 2020. 3. 8. · a course on the theory...

204
A Course on the Theory of Competitive Financial Markets Costis Skiadas

Upload: others

Post on 18-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A Course on the Theory ofCompetitive Financial Markets

Costis Skiadas

Preliminary draft. This is copyrighted material, do not distribute without the instructor’s permission.November 27, 2020
Page 2: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

PRELIMINARY AND INCOMPLETE DRAFTCopyright © 2020 by Constantinos N. Skiadas

Kellogg School of Management, Northwestern UniversityAll Rights Reserved.

Page 3: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

To Robin

Page 4: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Preface

This text is the distillation of material I have taught to doctoral stu-dents at Northwestern University for over two decades. I have relied onlectures for broader context and on this text for a self-contained state-ment of a theory with classical roots in Walrasian competitive analysis,extended by Arrow and Debreu to include time and uncertainty andfurther developed in Finance to emphasize the role of arbitrage ar-guments and the tools of modern stochastic analysis. There is equalemphasis on sound economics and well-motivated methodology.

The progression of topics can be thought of as increasing in scopeand decreasing in realism. Arbitrage arguments are presented first, fol-lowed by characterizations of optimality and competitive equilibrium.Arbitrage arguments utilize the assumption that the market does notallow incremental cash flows that are desirable in the narrow sense ofarbitrage. Optimality is then introduced as a refinement of the no-arbitrage assumption, where the notion of a desirable cash flow is en-larged through preferences and the idea of an arbitrage is extended toallow for multi-agent transactions through profitable market-makingopportunities. Restrictions on preferences are initially minimal, em-phasizing the fact that competitive equilibrium notions are robust topreference structure, and are gradually strengthened in order to ex-press ideas of how market prices and optimal consumption/portfoliochoice relate to preferences for smoothing across time and states ofthe world. Auxiliary results on utility representations are collectedin Appendix A, which includes results that, to my knowledge, havepreviously only been available in research paper form.

On the methodological side, a self-contained introduction to proba-bilistic methods starts with a rigorous treatment on a finite informationtree and concludes with an introduction to the continuous-time theory,which omits several technical details but leverages the thorough un-derstanding of the tools on a finite tree. In an approximate numericalsense, the continuous-time model is presented as a simplified specialcase of a high-frequency finite-information model. Tools like martin-gale representations, Girsanov change-of-measure arguments, the Itocalculus, forward and backward stochastic differential equations arehopefully demystified in this way, providing an entry point to a litera-ture which is typically obfuscated by the requirements of set-theoreticrigor. The optimality and equilibrium theory emphasizes a unified

4

Page 5: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

PREFACE 5

geometric viewpoint and convex analysis methods, making this coursecomplementary to a macroeconomics course emphasizing dynamic pro-graming methods.

The text can form the basis for either a fast-paced quarter-longcourse (with some material omitted) or a more relaxed semester-longcourse. The end-of-chapter exercises are an integral part of the book,with detailed solutions available to instructors. As for required back-ground, the most important ingredient is graduate-level maturity inabsorbing economic and mathematical ideas. Although the material isformally self-contained, a background on basic linear algebra is essen-tial and some prior exposure to introductory probability theory andmicroeconomics is helpful. Prior to this course, students are asked toread Appendix B on convex analysis, with emphasis on a geometricunderstanding of the results rather than the details of the providedrigorous proofs.

This text is consistent in approach with my older book (Skiadas[2009]), but differs in some significant ways (as well as in numeroussmaller improvements not listed here). The treatment has been stream-lined around a central conceptual development and has been extendedto include an introduction to continuous-time methodology, resultingis a shorter but more panoramic and tightly integrated book. The ma-terial is directly presented in a dynamic setting, as opposed to the moretraditional order of first considering the static theory. The probabilis-tic foundations are pedagogically interweaved into the main material,as opposed to a disconnected appendix. Some of the older book’s ma-terial that is not essential to the main narrative has been omitted orbeen relegated to the exercises, which are better integrated to the maintext, classroom tested, and with available detailed solutions. Some ofthe peripheral theory on preferences has been significantly refined andextended and pulled into Appendix A. The overview of convex analysisin Appendix B has also been both tightened and extended (most no-tably, in the treatment of super-differentials and some subtler technicalarguments on how convexity can substitute for compactness).

This text has benefitted from the challenge of trying to present anoverview of some fairly difficult technical material in my introductorydoctoral class over just ten weeks. I thank Ariel Lanza, Huidi Lin,Raul Riva, Matheus Sampaio, Brandon Zborowski and Weijia Zhao forcorrections.

Evanston IL, November 2020

Page 6: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Contents

Preface 4

Chapter 1. Market and Arbitrage Pricing 81.1. Uncertainty and information 81.2. Market and arbitrage 131.3. Trading and pricing of financial contracts 171.4. Present-value functions 201.5. Options and dominant choice 251.6. Trading strategies 321.7. Money market account and returns 361.8. Exercises 39

Chapter 2. Probabilistic Methods in Arbitrage Pricing 442.1. Probability basics 442.2. Beta pricing and frontier returns 502.3. State-price densities 552.4. Equivalent martingale measures 592.5. Predictable representations 662.6. Independent increments and the Markov property 732.7. A glimpse of the continuous-time theory 762.8. Brownian market example 842.9. Exercises 93

Chapter 3. Optimality and Equilibrium Pricing 1013.1. Preferences and optimality 1013.2. Equilibrium 1063.3. Utility functions and optimality 1113.4. Dynamic consistency and recursive utility 1193.5. Scale invariant recursive utility 1283.6. Equilibrium with scale invariant recursive utility 1343.7. Optimal consumption and portfolio choice 1403.8. Recursive utility and optimality in continuous time 1443.9. Exercises 156

Appendix A. Additive Utility Representations 164A.1. Utility representations of preferences 164A.2. Additive utility representations 166A.3. Concave additive representations 168A.4. Scale/translation invariant representations 170

6

Page 7: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

CONTENTS 7

A.5. Expected utility representations 172A.6. Expected utility and risk aversion 174

Appendix B. Elements of Convex Analysis 178B.1. Inner product space 178B.2. Basic topological concepts 183B.3. Convexity 186B.4. Projections on convex sets 189B.5. Supporting hyperplanes and (super)gradients 193B.6. Optimality conditions 197

Bibliography 200

Page 8: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

CHAPTER 1

Market and Arbitrage Pricing

An arbitrage is a trade that results in a positive incremental cashflow, that is, some inflow at some time in some contingency, but nopossible outflow. This chapter introduces a formal notion of a mar-ket and lays the foundations for pricing arguments based on the as-sumption of the lack of arbitrage opportunities. Essential notationrelating to trading strategies and associated budget equations is alsointroduced. Throughout this text, we use set-theoretic notation com-mon in graduate-level mathematics and write ≡ to mean “equal bydefinition.”

1.1. Uncertainty and informationWe begin with a formal representation of uncertainty and a common

information stream that is available to market participants over a finitetime horizon.

A time is an element of the set 0, 1, . . . , T, for a positive integerT that is fixed throughout. One of a finite number of possible statesof the world, or contingencies, is realized by the terminal time T .These states are represented by the elements of a finite set Ω, thestate space. We are not yet concerned with the likelihood of any onestate occurring, but every contingency represented by a state in Ω ispossible and every relevant contingency is represented by a state in Ω.

The subsets of Ω are called events. A partition of Ω is a set ofmutually exclusive nonempty events whose union is Ω. Time-t infor-mation is represented by a partition F0

t of Ω. All that is known attime t is what element of F0

t contains the state realized at time T . Attime 0 it is only known that the state to be revealed at time T is anelement of Ω and therefore F0

0 = Ω. At time T the state is revealedand therefore F0

T = ω | ω ∈ Ω.

Example 1.1.1. Information is generated by observing the outcomeof a coin toss at each time t = 1, 2, . . . , T . Let us encode heads with1 and tails with −1. A state is a finite sequence ω = (ω1, . . . , ωT ),where ωt ∈ 1,−1, and the state space is the cartesian product Ω ≡1,−1T . At time t > 0 the first t coin toss outcomes ω1, . . . , ωt ∈1,−1 have been observed and it is therefore known that the state isan element of the event ω ∈ Ω | ω1 = ω1, . . . , ωt = ωt. The partitionF0t is the set of all these events as (ω1, . . . , ωt) ranges over 1,−1t. ♦

8

Page 9: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.1. UNCERTAINTY AND INFORMATION 9

As in the preceding example, we assume perfect recall: If at sometime the state is known to belong to a partition element, the sameremains true at all subsequent times. More formally, we assume that ifu > t, the partition F0

u is a refinement of the partition F0t , meaning

that every event in F0t is the union of events in F0

u .It is mathematically convenient to also define the sets

(1.1.1) Ft =F | F is a union of elements of F0

t

, t = 0, . . . , T.

An event F belongs to Ft if and only if at time t it is known whetherF contains the state to be revealed at time T .

Example 1.1.2. Let Ω ≡ 1, 2, 3, 4 and F01 ≡ 1, 2 , 3 , 4.

Then F1 = ∅, 1, 2 , 3 , 4 , 1, 2, 3 , 1, 2, 4 , 3, 4 ,Ω. Supposestate 1 is realized at time T ≡ 2. At time 1 it is known that the stateis either 1 or 2. From that it can be inferred whether every one of theevents in F1 contains 1 or not, and these are all the events about whichsuch a claim can be made. ♦

Every Ft is an algebra of events, meaning that it contains ∅ andΩ, and it is closed relative to the formation of Boolean set opera-tions: For all A,B ∈ Ft, the union A ∪ B ≡ ω | ω ∈ A or ω ∈ B,intersection A ∩ B ≡ ω | ω ∈ A and ω ∈ B, and set differenceA \ B ≡ ω | ω ∈ A and ω /∈ B are all elements of Ft. In particular,for all F ∈ Ft, the complement F c ≡ Ω \F is an element of Ft. Thisdefinition of an algebra is of course redundant. For example, sinceA ∩ B = (Ac ∪Bc)c and A \ B = A ∩ Bc, an algebra (of events) isany nonempty set of events that is closed with respect to the formationof unions and complements. Besides providing convenient notation,algebras are key in formulating this text’s theory in ways that can beinterpreted in infinite state-space extensions, where algebras are notgenerated by partitions. Here we will take full advantage of the par-tition representation, even though most results will be stated in waysthat are amenable to generalization.

The intersection of an arbitrary collection of algebras is also analgebra. (The reader can construct a simple example showing thatthe union of two algebras is not necessarily an algebra.) The algebraσ (S) generated by a set of events S is the intersection of all algebrasthat include S. It is straightforward to verify that Ft = σ(F0

t ) for allt. Conversely, F0

t can be recovered from Ft as the set of nonemptyelements of Ft that do not have a nonempty proper subset in Ft.

Note that F0u is a refinement of F0

t if and only if Ft ⊆ Fu. Thismotivates the definition of a filtration as a time-indexed sequenceF0,F1, . . . ,FT of algebras of events, abbreviated to Ft, such thatu ≥ t implies Fu ⊇ Ft. We can therefore equivalently specify theinformation primitive of our model as a filtration Ft satisfying(1.1.2) F0 = ∅,Ω and FT = 2Ω (the set of all subsets of Ω) .

Page 10: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.1. UNCERTAINTY AND INFORMATION 10

(Ω, 0)

(ω3, ω4, 1)

(ω1, ω2, 1)

state ω4

state ω3

state ω2

state ω1

Figure 1.1.1. Information tree of Example 1.1.1 withT = 2. Each state ωi can be identified with the terminalspot (ωi , 2).

As illustrated in Figure 1.1.1, a filtration Ft can be thoughtof as an information tree, whose nodes correspond to what we call“spots.” Formally, a spot of the filtration Ft is a pair (F, t) wheret ∈ 0, . . . , T and F ∈ F0

t . The root of the information tree corre-sponds to the initial spot (Ω, 0). A terminal spot takes the form(ω , T ), where ω ∈ Ω, and can therefore be identified with the stateω as well as the unique path on the information tree from the ini-tial spot to the given terminal spot. A nonterminal spot (F, t− 1),t ∈ 1, . . . , T, has immediate successor spots (F0, t) , . . . , (Fd, t),where F0, . . . , Fd are the elements of F0

t whose union is F . The spot(F, t− 1) can be thought of as the set of paths on the information treefrom in the initial spot to every terminal spot corresponding to a statein F .

Economic quantities such as cash flows and prices will be repre-sented by stochastic processes, which are time-indexed sequences ofrandom variables that are consistent with the information structurejust introduced. We now formalize these notions and introduce relatednotation.

A random variable is a function of the form x : Ω → R. Astochastic process, or simply process, can equivalently be definedas a time-indexed sequence (x0, x1, . . . , xT ) of random variables or as afunction of the form x : Ω× 0, 1, . . . , T → R, where xt (ω) = x (ω, t)for ω ∈ Ω and t ∈ 0, . . . , T. On occasion we write x (t) instead of xt.The function x (ω, ·) : 0, . . . , T → R, for any fixed ω ∈ Ω, is a pathof the process x.

As is common in probability theory, we identify a scalar α and theprocess that is identically equal to α. For example, x + α denotes theprocess that takes the value x (ω, t)+α at state ω and time t. Sums andproducts of processes are defined point-wise: (x+ yz) (ω, t) ≡ x (ω, t)+y (ω, t) z (ω, t). We write x ≤ y to mean x (ω, t) ≤ y (ω, t) for all (ω, t),and analogously for any other relation. A process is said to be strictlypositive if it is valued in (0,∞).

Page 11: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.1. UNCERTAINTY AND INFORMATION 11

Analogous conventions apply to random variables (which can afterall be viewed as processes with T = 0). Random variables are oftenused to define events in terms of predicates, as in ω ∈ Ω | x (ω) ≤ α.In such cases, we simplify the notation by eliminating the state variable,as in x ≤ α.

Another useful piece of notation is that of an indicator function1A of a set A, which takes the value 1 on A and 0 on the complementof A in the implied domain. For example, if A is an event, then 1A isthe random variable

1A (ω) =

1, if ω ∈ A;0, if ω /∈ A.

If A ⊆ Ω × 0, . . . , T, 1A denotes the process defined as above, butwith (ω, t) in place of ω.

Suppose information is represented by a given underlying filtrationFt on Ω satisfying (1.1.2). If a process is to represent an observedquantity, it cannot reveal more information than implied by the pos-tulated filtration. For example, if T = 1 and Ω = 0, 1, the processx0 (ω) = x1 (ω) = ω is not consistent with the information structure,because observation of the realization of x0 at time zero reveals thestate ω at time zero. To formalize this type of informational con-straint, we first introduce the key notion of measurability with respectto an algebra.

A random variable x is said to be measurable with respect to analgebra A, or A-measurable, if x ≤ α ∈ A for every α ∈ R. If A isgenerated by the partition A0 = A1, . . . , An, then x is A-measurableif and only if it can be expressed as x =

∑ni=1 αi1Ai

, where αi ∈ R is aconstant value x takes on Ai.

The set of A-measurable random variables, which we denote byL (A), is a linear subspace of RΩ that is also closed relative to nonlinearcombinations of its elements, in the following sense, where f (x1, . . . , xn)denotes the random variable that maps ω to f (x1 (ω) , . . . , xn (ω)).

Proposition 1.1.3. If x1, . . . , xn ∈ L (A) then f (x1, . . . , xn) ∈L (A) for all f : Rn → R, and (x1, . . . , xn) ∈ S ∈ A for all S ⊆ Rn.

Proof. If each xi is constant over every element of A0, then sois f (x1, . . . , xn). Letting f (x1, . . . , xn) = 2 × 1(x1,...,xn)∈S shows that(x1, . . . , xn) ∈ S = f (x1, . . . , xn) ≤ 1 ∈ A.

We can now formalize the requirement that a process respects thegiven information structure with the notion of adaptedness. The pro-cess x is said to be adapted (to the underlying filtration) if xt ∈ L (Ft)for every time t. Since F0 = ∅,Ω and FT = 2Ω, the initial value ofan adapted process is constant, while its terminal value can be anyrandom variable.

Page 12: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.1. UNCERTAINTY AND INFORMATION 12

The space of all adapted processes, which we denote by L, can beidentified with the Euclidean space R1+N , where 1 + N is the totalnumber of spots. To see how, consider any adapted process x. For anyspot (F, t) , the random variable xt is constant over F, taking a valuethat we denote by x (F, t) (that is, x (ω, t) = x (F, t) for all ω ∈ F ).If F0

t = F1, . . . , Fn , then xt =∑n

i=1 x (Fi, t) 1Fi. One can therefore

regard x as an assignment of a real number to every spot of the infor-mation tree. The set of strictly positive adapted processes is denotedby L++ and is analogously identified with R1+N

++ .Related to the notion of an adapted process is that of a stopping

time defined as a function of the formτ : Ω → 0, 1, . . . , T ∪ ∞ ,

provided that τ ≤ t ∈ Ft for every time t. The last restriction isequivalent to the adaptedness of the indicator process 1τ≤t, whichtakes the value zero prior to the (random) time τ , and the value onefrom time τ on (which on the event τ = ∞ is never). A stoppingtime, or corresponding indicator process, announces the (first) arrivalof an event which is consistent with the information stream encodedin the underlying filtration. For example, if x is an adapted process,then the first time that xt ≥ 1 defines a stopping time (with the value∞ being assigned on the event that x remains valued below one). Onthe other hand, the first time that x reaches its path maximum is notgenerally a stopping time. For any process x and stopping time τ , therandom variable xτ or x (τ) is defined by letting xτ (ω) = x (ω, τ (ω)),with the convention x (ω,∞) = 0.

In applications, it is common to specify the filtration Ft as theinformation revealed by given processes representing observable quan-tities. To formally define this way of constructing the informationstream, we first define a notion of information revealed by a given setof random variables.

The algebra generated by a set S of random variables is the inter-section of all algebras relative to which every x ∈ S is measurable, andis denoted by σ (S). If S = x1, . . . , xn, σ (x1, . . . , xn) ≡ σ (S) is thesame as the algebra generated by the partition of all nonempty events ofthe form (x1, . . . , xn) = α, where α ∈ Rn. We interpret σ (x1, . . . , xn)as the information that can be inferred by observing the realization ofthe random variables x1, . . . , xn. Any other variable whose realizationis revealed by this information must be determined as a function of therealization of (x1, . . . , xn):

Proposition 1.1.4. Given any random variables x1, . . . , xn, a ran-dom variable y is σ (x1, . . . , xn)-measurable if and only if there exists afunction f : Rn → R such that y = f (x1, . . . , xn).

Proof. That f (x1, . . . , xn) is σ (x1, . . . , xn)-measurable follows fromProposition 1.1.3. Conversely, suppose y is σ (x1, . . . , xn)-measurable

Page 13: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.2. MARKET AND ARBITRAGE 13

and let (x1 (ω) , . . . , xn (ω)) | ω ∈ Ω = α1, . . . , αm ⊆ Rn. The al-gebra σ (x1, . . . , xn) is generated by the partition A1, . . . , Am, whereAi ≡ (x1, . . . , xn) = αi. Let y =

∑mj=1 βi1Ai

, where βi is the con-stant value of y on Ai. Selecting any function f : Rn → R such thatβi = f (αi) results in y = f (x1, . . . , xn).

A filtration is said to be generated by given processes B1, . . . , Bd,where each Bi

0 is a constant, ifFt = σ

(B1

s , . . . , Bds | s = 1, . . . , t

), t = 1, . . . , T.

Writing B =(B1, . . . , Bd

)′, it follows from Proposition 1.1.4 that inthis case a process x is adapted if and only if x0 is constant and forevery time t > 0, there exists some function f (t, ·) : Rd×t → R suchthat

x (ω, t) = f (t, B (ω, 1) , . . . , B (ω, t)) , ω ∈ Ω.

In other words, xt is a function of the path of B up to time t.

1.2. Market and arbitrageA financial market can be thought of as a set of net incremental

cash flows that are generated by trading financial contracts such asbonds, stocks, futures, options and swaps. For example, the purchaseof a share of a stock at some spot generates the cash flow consisting ofminus the stock price at the given spot followed by the dividend streamresulting from holding the share on the information subtree rooted atthe given spot. Effectively, a market facilitates the exchange of fundsacross spots, be it saving, borrowing, hedging or speculation, or somecombination thereof that cannot be cleanly categorized. We implic-itly assume that the terms of market exchanges are set competitively,meaning that there is a large number of market participants each ofwhom has negligible bargaining power, but who collectively influencemarket-clearing prices. A more formal definition of a market and re-lated concepts follows.

Definition 1.2.1. A cash flow is any adapted process. A marketis any linear subspace X of the space L of all cash flows. A cash flow xis said to be traded in X if x ∈ X. A cash flow c is an arbitrage if0 6= c ≥ 0. A market X is arbitrage-free if it contains no arbitragecash flow.

All cash flows are specified in some implicit unit of account thatis fixed throughout. A market X represents the set of all net incre-mental cash flows that are available to market participants, typicallyby following some trading strategy over time, a notion that is formallydefined later in this chapter. From the perspective of time zero, anagent can use the market X to modify a cash flow c to c + x for anyx ∈ X. Of course, the agent will add a traded cash flow x to c only if

Page 14: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.2. MARKET AND ARBITRAGE 14

c+x is preferred to c. An arbitrage cash flow x satisfies x (F, t) ≥ 0 forevery spot (F, t), and x (F, t) > 0 for some spot (F, t). An agent whoprefers more to less prefers c + x to c, for every c and every arbitragecash flow x. If X contains an arbitrage, such an agent cannot findany c optimal given X. In a competitive equilibrium, to be definedformally in Chapter 3, agents follow optimal plans and therefore themarket cannot contain arbitrage opportunities.

A market X is assumed to be a linear subspace reflecting the un-derlying assumption of perfectly competitive markets with no transac-tion costs or trading constraints. Traded cash flows can be reversed,scaled arbitrarily and combined. The discussion of market equilibriumin Chapter 3 provides an argument as to why market forces work torelax binding market constraints. Yet, frictions such as moral hazard,asymmetric information, search or information processing costs, con-centrated market power and phenomena of market panics and runscan work to impede this process of increasingly complete markets andwork against the definition of traded cash flows as a linear subspace.1This being a first course, we will mostly confine ourselves to the caseof linear markets, which is why linearity is part of a market’s defini-tion. Besides its pedagogical value, the perfect case is of practical valueas an approximation in thinking about highly liquid and competitivemarkets in standardized financial contracts as long as the limitationsof such an approximation are well recognized.

Fixing a reference market X in the background, we make a technicaldistinction between traded cash flows, which are the elements of X,and marketed cash flows, which are cash flows that can be obtained inthe market given sufficient time-zero cash. Recall that 1Ω×0 denotesthe process that takes the value one at spot zero and the value zeroeverywhere else on the information tree.

Definition 1.2.2. The cash flow c is marketed (in the marketX at time zero) if c − α1Ω×0 ∈ X for some α ∈ R. The market iscomplete if every cash flow is marketed.

A marketed cash flow c can be expressed as c = α1Ω×0 + x forsome traded cash flow x. The scalar α represents a time-zero price of c,which is unique if and only if 1Ω×0 /∈ X, a condition known as thelaw of one price. Clearly, an arbitrage-free market satisfies the lawof one price but the converse is not generally true. For expositionalsimplicity, we will assume that the market is arbitrage-free, even wherethe law of one price would suffice.

1This text’s conceptual framework and several of its arguments extend to allowfor exogenous trading constraints, resulting in shapes of X that are not linearsubspaces. From an economic viewpoint, the more interesting aspect of tradingconstraints, however, is their endogenous source in equilibrium.

Page 15: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.2. MARKET AND ARBITRAGE 15

Definition 1.2.3. Assuming the market is arbitrage-free, the (time-zero) present value of a marketed cash flow c is the unique scalar αsuch that c− α1Ω×0 ∈ X.

Proceeding under the assumption that X is arbitrage-free, notethat the set of marketed cash flows is the linear span of X and 1Ω×0,whose dimension is one more than that of X. The function that mapsevery marketed cash flow to its present value is linear and positive,where positivity means that the present value of every arbitrage cashflow is (strictly) positive. The set X is the kernel of this function,that is, the set of marketed cash flows of zero present value. If themarket is complete, every cash flow is marketed and therefore has auniquely defined present value. The dimension of X in this case is N ,where 1 + N is the total number of spots on the information tree. Inthe following section we will show that if the market is not complete,then the present value function on the set of marketed cash flows canbe extended to all of L while retaining linearity and positivity, whichleads to some useful mathematical representations of the present-valuefunction. Such an extension is not unique, however.

We have defined the market from the perspective of time zero. Amarket XF,t can also be defined analogously from the perspective ofany other spot (F, t) as a subset of

LF,t ≡x ∈ L | x = x1F×t,...,T

,

which is the set of cash flows that can only take nonzero values on thesubtree rooted at spot (F, t). Analogously to Definition 1.2.2, we have:

Definition 1.2.4. A cash flow c is marketed at spot (F, t) in themarket XF,t if c1F×t,...,T−α1F×t ∈ XF,t for some α ∈ R. The marketXF,t is complete at (F, t) if every cash flow in LF,t is marketed at (F, t)in XF,t.

The following proposition introduces assumptions that allow us toconstruct XF,t in terms of the the time-zero market X.

Proposition 1.2.5. Suppose that the (time-zero) market X isarbitrage-free, and for some spot (F, t), the set XF,t satisfies:

(1) (adaptedness) XF,t ⊆ LF,t.(2) (dynamic consistency) XF,t ⊆ X.(3) (liquidity) Every x ∈ X is marketed at (F, t) in XF,t.

Then XF,t = X ∩ LF,t.

Proof. By adaptedness and dynamic consistency, XF,t ⊆ X∩LF,t.To show the reverse inclusion, suppose that x ∈ X ∩LF,t. By liquidity,there exist y ∈ XF,t and α ∈ R such that x = x1F×t,...,T = α1F×t+y.By dynamic consistency, y ∈ X and therefore α1F×t = x − y ∈ X.Since X is arbitrage-free, α = 0 and x = y, and therefore x ∈ XF,t.

Page 16: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.2. MARKET AND ARBITRAGE 16

The Proposition’s assumptions have simple interpretations. Theelements of XF,t represent cash flows available to a market participantat spot (F, t) through trading over the information subtree rooted at(F, t). Such cash flows are naturally viewed as elements of LF,t. Theidea behind dynamic consistency is that at time zero a trader can haveany cash flow x in XF,t by making a contingent plan to carry out thetransactions that result in x if (F, t) materializes. The cash flow x iseffectively also available to the agent at time zero, and must thereforebe an element of the time-zero market X. Finally, liquidity, in thenarrow technical sense used here, means that if at time zero the agentstarts following a plan generating the cash flow x ∈ X, then at anyfuture spot the agent can liquidate all positions and cancel all remainingcash flows. For example, suppose there is no uncertainty, T = 2 andat time zero the agent buys a bond for 99 (units of account) that pays100 at time two, thus generating the cash flow x = (−99, 0, 100). In aliquid market, the cash flow (0, p,−100) is traded at time one for someprice p, in other words, the same bond continues to be traded at someprice.

A dynamic market specifies a market XF,t for every spot (F, t),with X = XΩ,0 being the time-zero market, and is liquid if it satis-fies the third condition of Proposition 1.2.5. Under the assumptions ofProposition 1.2.5, it must be the case that XF,t = X ∩ LF,t and there-fore X specifies the entire liquid dynamic market. Not every time-zeromarket is consistent with a liquid dynamic market, however, since theliquidity requirement of Proposition 1.2.5 must be satisfied. This mo-tivates the following definition.

Definition 1.2.6. The market X is liquid if for every spot (F, t),every x ∈ X is marketed at spot (F, t) in the market X ∩ LF,t.

The preceding definition abuses terminology in the interest of sim-plicity; liquidity is really a property of a dynamic market. It is entirelyconsistent to consider a dynamic market XF,t, where XΩ,0 = X isa market satisfying the property of Definition 1.2.6, while for somenon-zero spot (F, t), XF,t violates the liquidity property of Proposi-tion 1.2.5, for example, XF,t = 0. Whenever we specify a liquidmarket X, however, it is implicitly assumed that at each spot (F, t),the market XF,t = X ∩LF,t is available, and therefore a correspondingliquid dynamic market XF,t is specified by X.

Liquidity requires that an initially marketed cash flow continues tobe marketed, but it does not require that every cash flow is marketed.On the other hand, an arbitrage-free complete market is liquid (or,more precisely, implies a liquid dynamic market) as a consequence ofthe following observation.

Proposition 1.2.7. Suppose X is an arbitrage-free complete mar-ket. Then X ∩ LF,t is complete at (F, t).

Page 17: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.3. TRADING AND PRICING OF FINANCIAL CONTRACTS 17

Proof. Consider any cash flow c. Using the assumption that X iscomplete, pick scalars α and β such that −α1Ω×0 + c1F×t,...,T ∈ Xand −β1Ω×0+1F×t ∈ X. Since X is arbitrage-free, β > 0. Linearityof X then implies that c1F×t,...,T − (α/β) 1F×t ∈ X. Therefore, c ismarketed at (F, t) in X ∩ LF,t.

1.3. Trading and pricing of financial contractsWhile market participants are ultimately interested in the incremen-tal cash flows of the market X, they must undertake certain actionsto generate these cash flows. In theory, every traded cash flow couldbe implemented as a buy-and-hold portfolio in contracts generatingcash flows that form a linear basis of X. The problem with this ap-proach is that if there are 1 + N spots, the dimension of a completemarket is N , which typically rises exponentially as the number T ofperiod increases. For example, if each spot has at least two immediatesuccessors, then N > 2T+1 (why?). Moderate values of T imply as-tronomically large values for N . Such a large number of competitivelytraded basic contracts is implausible even as a rough approximationof reality. A key insight is that a small number of contracts can im-plement a high-dimensional market provided these same contracts canbe traded at every spot of the filtration. Postponing a more formalstatement and proof of this claim, in this section we define contractsand we discuss their relationship to a market and some straightforwardimplications of the no-arbitrage assumption for contract pricing.

We use the term “contract” in a narrow formal sense to mean adividend stream together with a price process indicating at what pricethe dividend stream can be traded at every spot.

Definition 1.3.1. A contract is a pair (δ, V ) of adapted processessatisfying the convention

(1.3.1) δ0 ≡ 0 and δT ≡ VT .

The process δ is the contract’s dividend process and V is the con-tract’s value process or cum-dividend price process. The con-tract’s ex-dividend price process is

S ≡ V − δ.

We use the word “dividend” in a generalized sense to mean any cashflow that is the result of holding a long position in the contract up totime T and liquidating at time T . In applications, such payments maycorrespond to coupon payments and a face-value payment at maturityin the case of bonds, dividends prior to time T and the sale price cumdividend at time T in the case of a stock (whose actual dividend streamcan extend beyond time T ), net cash settlements in the case of a swap,and so on. A contract entitles the owner (or long position) to a single

Page 18: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.3. TRADING AND PRICING OF FINANCIAL CONTRACTS 18

dividend stream. An extension to include options, where the ownercan select from a set of dividend streams, is discussed in Section 1.5.

The owner of a contract (δ, V ) receives the dividend payment δt attime t. Contracts can be bought or sold at every spot. The contract’stime-t value Vt represents, by convention, a cum-dividend price. Ifthe contract is bought at time t, either the buyer pays the seller Vtand receives the time-t dividend δt, or the buyer pays the seller theex-dividend price St and the time-t dividend goes to the seller. Thedividend convention (1.3.1) is made for notational simplicity and entailsno loss of generality within the scope of the model presented here. Onecan think of the condition δT = VT as reflecting the implicit assumptionthat the contract is liquidated at the terminal date T , resulting in thepayment VT = ST + δT . It makes no difference within the model howthe value VT is split between ST and δT or how V0 is split between S0

and δ0; for simplicity, we set ST = 0 and δ0 = 0.

Definition 1.3.2. A contract (δ, V ) is traded at spot (F, t) in themarket X if(1.3.2) x = −V 1F×t + δ1F×t,...,T ∈ X.

The cash flow x is generated by buying the contract at spot (F, t),while −x is generated by selling the contract at the same spot. Thecontract (δ, V ) is traded (in X) if it is traded at every spot.

We proceed taking as given a reference arbitrage-free market X.When we say that a contract is traded (at some spot), it is implied thatthe contract is traded in the reference market X. It is an immediateconsequence of the definitions that a cash flow δ is marketed in X if andonly if there is a contract (δ, V ) that is traded at time zero, in whichcase V0 is the present value of δ. Note that this simple statement reliescritically on the convention VT = δT . The essence of the conclusionis that V0 is the present value of the dividends paid out prior to theterminal date plus the present value of the contract’s terminal (cum-dividend) value.

Remark 1.3.3. The trades generating an arbitrage as a conse-quence of the violation of a claimed arbitrage pricing relationship areinstructive in that they suggest potential ways in which the pricingrelationship can be violated in realistic applications due to frictionsleft out of the formal model. For a simple example, suppose the con-tracts (δ, V ) and (δ, V ′) are both traded, but V0 < V

′0 . The arbitrage(

V′0 − V0

)1Ω×0 results by selling the second contract and buying the

first one. Suppose the latter represents a security, like a US treasurybond, that can be used for the purpose of posting collateral, thus facil-itating other trades. If collateral in this sense is scarce in equilibrium,the present value of the cash dividend stream δ may not present ade-quate compensation for parting with the security and the value V0 may

Page 19: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.3. TRADING AND PRICING OF FINANCIAL CONTRACTS 19

well exceed the present value of δ. In the presence of a well-functioningcompetitive lease market in the security, V0 equals the present value ofthe equilibrium lease payments resulting from lending out the security.The extent to which these lease payments exceed the dividends δ re-flects the equilibrium value of holding an additional unit of inventory. ♦

Returning to our current idealized formal setup, the preceding ob-servations can be applied from the perspective of any other spot, andtherefore for every traded contract (δ, V ), the value V (F, t) is a func-tion of the restriction of δ on F×t, . . . , T, which is the present value ofδ from the perspective of spot (F, t). Thus, in an arbitrage-free market,any two traded contracts (δ, V 1) and (δ, V 2) with a common dividendprocess δ (and therefore common terminal value V 1

T ≡ δT ≡ V 2T ) must

also have a common value process: V 1 = V 2. It is also worth notingthat buying the contract (δ, V ) at spot (F, t− 1) and selling it at eachof its immediate successor spots generates the cash flow

(1.3.3) x = −S1F×t−1 + V 1F×t,

where S ≡ V − δ. Therefore, if the contracts (δ1, V 1) and (δ2, V 2) aretraded and V 1

t = V 2t on some F ∈ Ft−1, then S1

t−1 = S2t−1 on F . As a

corollary, if (δ1, V 1) and (δ2, V 2) are traded in an arbitrage-free marketand V 1

t = V 2t for all t > 0, then the two contracts are identical.

We have defined what it means for a contract to be traded in a givenmarket. Conversely, trading in a given set of contracts implements amarket, which can be succinctly defined as follows.

Definition 1.3.4. The market implemented by a set of contractsis the intersection (and hence smallest relative to inclusion) market inwhich all contracts in the given set are traded.

Suppose C is a set of contracts, let X be the market implementedby C, and let X0 be the set of all cash flows of the form (1.3.2) forevery (δ, V ) ∈ C and spot (F, t). The set span(X0) of all finite linearcombinations of elements of X0 is a market that clearly includes Xand in which every contract in C is traded. Therefore, X = span(X0).This construction also makes it clear that X is liquid (why?). Anelement of span(X0) can be thought of as a contingent plan to buy orsell contracts at various spots, in other words, a trading strategy. Acontract that is not in C but is traded in the arbitrage-free market Xis synthetic in C, meaning that it is generated by a trading strategyin contracts in C. Section 1.6 discusses trading strategies, syntheticcontracts and associated budget equations more systematically. In thefollowing chapter, we will see that the minimum number of contractsimplementing a complete market is equal to the maximum number ofimmediate successor spots to each spot.

Page 20: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.4. PRESENT-VALUE FUNCTIONS 20

1.4. Present-value functionsA dual approach to arbitrage pricing utilizes the notion of a present-

value function, formally defined below in terms of a reference marketX that is taken as given throughout this section. Recall that a linearfunctional on a vector space is a real-valued linear function whosedomain is the entire vector space. A linear functional is positive if itassigns a positive value to every vector x such that 0 6= x ≥ 0.

Definition 1.4.1. A (time-zero) present-value function (for themarket X) is a positive linear functional Π on L such that Π(x) ≤ 0for all x ∈ X and Π(1Ω×0) = 1.

A present-value function Π specifies a time-zero value for every cashflow c, marketed or not, which is positive if c is an arbitrage cash flow.The essential restriction that Π is nonpositive on X is equivalent toΠ(x) = 0 for all x ∈ X, since X is a linear subspace. (In extensionswith trading constraints, X is no longer a linear subspace, but Defi-nition 1.4.1 is still valid, and the present value of a traded cash flowcan be strictly negative.) The requirement Π(1Ω×0) = 1 is merely anormalization.

The properties of a present-value function Π combine to determinethe value Π(c) of a marketed cash flow c as the present value of cin the sense of Definition 1.2.3. Suppose the scalar α is such thatc − α1Ω×0 ∈ X. The existence of Π implies that X is arbitrage-free (why?) and therefore α is unique. The fact that Π vanishes onX implies that Π

(c− α1Ω×0

)= 0. The linearity of Π implies that

Π(c) = αΠ(1Ω×0). Finally, the normalization assumption results inΠ(c) = α. Another way of stating this conclusion is that if the contract(δ, V ) is traded at time zero and Π is a present-value function, thenV0 = Π(δ).

If the market X is arbitrage-free and complete, letting Π(c) equalthe (unique) present value of c defines a function Π : L → R thatis easily confirmed to be a present-value function. Therefore, everycomplete arbitrage-free market admits a unique present-value function.Conversely, every present-value function Π for X defines a unique com-plete market for which Π is a present value function; it is the kernelof Π, that is, the set c ∈ L | Π(c) = 0. If X is complete, then thekernel of Π is equal to X. If X is incomplete, then the kernel of Πis a proper superset of X and can be thought of as an arbitrage-freecompletion of X.

The relationship between the market X and a present-value func-tion Π has a geometric interpretation in the space R1+N , where 1+N isthe total number of spots on the information tree. Recall that the setL of adapted processes can be identified with R1+N , since an adaptedprocess x is an assignment of a scalar x (F, t) to every spot (F, t). Usingthis identification, we endow L with the usual Euclidean inner product

Page 21: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.4. PRESENT-VALUE FUNCTIONS 21

in R1+N , denoted

x · y ≡∑

all spots (F,t)

x (F, t) y (F, t) .

Definition 1.4.2. A state-price process is any adapted processp with p0 > 0 such that

Π(c) ≡ 1

p0p · c, c ∈ L,

defines a present-value function Π. In this case, we say that p rep-resents Π. An Arrow cash flow2 is a cash flow of the form 1F×t,where (F, t) is a spot.

Every cash flow is a linear combination of the 1 + N Arrow cashflows, reflecting the identification of L and R1+N . If Π is a present-value function and we define p (F, t) = Π(1F×t) for every spot (F, t),then Π(c) = p · c for all c ∈ L (why?). Therefore, every present-valuefunction can be represented by a state-price process. (In mathemat-ical terms, the Arrow cash flows are a linear basis of L and p is theRiesz representation of Π.) If p is a state-price process representingthe present-value function Π, then p (F, t) = p0Π(1F×t) for every spot(F, t), and therefore p is strictly positive, it is uniquely determined byΠ up to a positive scaling factor, and it represents relative present valueof Arrow cash flows: For all spots (F, t), (G, s),

p (F, t)

p (G, s)=

Π(1F×t)

Π(1G×s).

For a geometric interpretation of a state-price process p representingΠ, note that, since Π(x) = 0 for all x ∈ X, p is orthogonal to X, andsince Π is positive, p lies in R1+N

++ (the interior of the positive orthantof R1+N). The set of all state-price processes representing Π can beidentified with a directed half line in R1+N

++ . Inspection of Figure 1.4.1suggests that such a direction exists if and only if

X ∩ R1+N+ = 0 ,

which is another way of saying that X is arbitrage-free. If, as in Fig-ure 1.4.1, X is complete and therefore N -dimensional, there is onlyone orthogonal-to-X direction within R1+N , and therefore the corre-sponding present-value function must be unique. If X is incomplete,the dimension of X is at most N − 1, leaving at least one dimension

2The term, which is not entirely standard, is in recognition of Arrow [1953](translated to English in Arrow [1963]), who together with Debreu [1959] providedthe modern conceptual framework for incorporating uncertainty in classical compet-itive equilibrium theory. The more common term “Arrow-Debreu security” refersto a claim to an Arrow cash flow. The (relative) time-zero prices of Arrow-Debreusecurities are also known as Arrow-Debreu prices, corresponding to the notion of astate-price process here.

Page 22: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.4. PRESENT-VALUE FUNCTIONS 22

spot 0

spot 1

market X

p or Π

Figure 1.4.1. Example of a market and present-valuefunction with N = 2. The shaded region (including theaxes but not the origin) is the set of arbitrage cash flows.The market X does not cut into the shaded region if andonly if the half line defined by the orthogonal vector plies in the interior of the shaded region.

along which p can be rotated within R1+N++ while maintaining orthogo-

nality to X, which suggests that an incomplete market admits multiplepresent-value functions. For an example, extend Figure 1.4.1 by intro-ducing a spot-2 orthogonal axis, while maintaining the market X as aline. Rotating the orthogonal-to-X vector p around X while stayingwithin the interior of the positive orthant traces out all present-valuefunctions relative to X. A rigorous proof of these informal insightsfollows.3

Theorem 1.4.3. A present-value function exists if and only if Xis arbitrage-free; and is unique if and only if the market X is complete.

Proof. (Existence) If Π is a present-value function, then Π(x) ≤ 0for all x ∈ X. If c is an arbitrage, then Π(c) > 0 and therefore c cannot

3The equivalence of the lack of arbitrage opportunities and the existence of apresent-value function is an example of the so-called theorems of the alternative inconvex analysis, exposited in Chapter 1 of Stoer and Witzgall [1970]. In the currentfinite-dimensional financial market context, the result is due to Ross [1978]. It wasextended to infinite-dimensional spaces by Yan [1980] and Kreps [1981]. In general,under infinitely many states, the result requires the exclusion of a stronger notionof arbitrage opportunities than mere positivity. A notable exception is the case ofa market generated by finitely many assets in discrete time, as shown by Dalanget al. [1990] (with simplified proofs given by Schachermayer [1992] and Kabanov andKramkov [1994]). Extension to models with continuous-time trading are reviewedby Delbaen and Schachermayer [2006].

Page 23: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.4. PRESENT-VALUE FUNCTIONS 23

be in X. For the converse, we will show that a state-price processexists assuming only that X is a closed convex cone that containsno arbitrage cash flow, a generality that will be useful later on, asdiscussed in Remark 1.4.4 below. We fix an enumeration 0, 1, . . . , N ofall spots, where (Ω, 0) is spot zero, and we write x = (x0, x1, . . . , xN)for the element of R1+N corresponding to the adapted process x. (Theconflicting notation xt, where t is a time, is not used in this proof.)We use the Euclidean inner product on L, which takes the familiarform x · y =

∑Nn=0 xnyn. A state-prices process p satisfies p · x ≤ 0 for

all x ∈ X and p · c > 0 for every arbitrage c, and therefore separatesthe convex sets X and R1+N

+ \ 0. This suggests that the existenceof p follows from the separating hyperplane theorem, except for thesubtlety that p · c > 0 for every arbitrage c only implies that 0 6= p ≥ 0,not necessarily that p ∈ R1+N

++ , as required of a state-price process. Toovercome this difficulty, we instead separate the compact convex set∆ ≡ x ∈ R1+N

+ |∑

n xn = 1 from X. Since X is arbitrage-free,X ∩∆ = ∅. By the Projection Theorem B.4.1, the function that mapseach point of ∆ to its projection on X is continuous. Since norms arecontinuous, the function that maps each point of ∆ to its least distancefrom X is also continuous (as well as convex) and therefore achieves aminimum over ∆ by Proposition B.2.6 (or Proposition B.3.3). Thereis, therefore, a pair (x, y) ∈ X×∆ such that x is the projection of y onX and y is the projection of x on ∆. By the Projection Theorem B.4.1,−p ≡ x − y supports X at x and therefore p · x ≤ p · x for all x ∈ X.Since X is a cone, p · x = 0 and p · x ≤ 0 for all x ∈ X. Similarly,p ≡ y − x supports ∆ at y, and therefore p · y ≥ p · y for all y ∈ ∆.Since p · x = 0, p · y = p · p > 0. Therefore, p · y > 0 for all y ∈ ∆,which implies that p is strictly positive. A corresponding present-valuefunction is defined by Π(c) = p · c.

(Uniqueness) Suppose Π is a present-value function. Let M denotethe set of marketed cash flows, spanned by the elements of X and thecash flow 1Ω×0, which we identify with the vector 10 ≡ (1, 0, . . . , 0)in R1+N . We have already shown that Π(c) is uniquely determinedfor all c ∈ M . If X is complete, then M = R1+N and Π is uniquelydetermined. Suppose instead that X is incomplete, and fix any c ∈R1+N \ M . Taking an orthogonal projection, let c = m + z, wherem ∈ M and z 6= 0 is orthogonal to M and therefore also to X and 10.Let p ∈ R1+N

++ be the Riesz representation of Π, that is, Π(c) = p · c forall c ∈ R1+N . Consider any scalar α 6= 0 such that pα ≡ p+αz ∈ R1+N

++ .Since z is orthogonal to X and 10, pα · x = 0 for all x ∈ X andpα · 10 = p0 · 10 = 1. Therefore, Πα (c) = pα · c defines a continuum ofdistinct present-value functions as α ranges over all scalars such thatpα ∈ R1+N

++ .

Page 24: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.4. PRESENT-VALUE FUNCTIONS 24

Remark 1.4.4. The preceding proof shows the existence of a present-value function assuming only that X is a closed convex cone, notnecessarily a linear subspace. This generality is utilized in our laterdiscussion of dominant choice. It also allows an easy extension of The-orem 1.4.3 to a constrained market X, which we define as a closedconvex set of cash flows such that 0 ∈ X and there exists ε > 0 suchthat for all x ∈ X, 0 < ‖x‖ < ε implies (ε/‖x‖)x ∈ X. In words, everytrade that is small in the sense that its norm is no more than ε can bescaled up so that its norm is ε. This allows for larger position limits aswell as short-sale constraints (since x ∈ X need not imply −x ∈ X).Let us show that a constrained market X is arbitrage-free if and onlyif a present-value function exists. First note that X is arbitrage-freeif and only if the cone C ≡ sx | s ∈ R+, x ∈ X generated by X isarbitrage-free. Since C is a closed convex cone, the proof of Theo-rem 1.4.3 applies. That C is convex is immediate. Closure of C followsfrom the assumption that trades smaller than ε can be scaled up tohave norm ε. The interested reader can prove this by confirming thatC ∩X = C ∩

x ∈ R1+N | ‖x‖ ≤ ε

. ♦

We have so far defined present-value functions from the perspectiveof time zero. A present-value function ΠF,t : LF,t → R can be definedanalogously from the perspective of every other spot (F, t). SupposeXF,t ⊆ LF,t is the set of traded cash flows from the perspective of spot(F, t), and the dynamic consistency assumption XF,t ⊆ X is satisfied(as in Proposition 1.2.5). Then the restriction of a time-zero present-value function Π to LF,t is a positive linear functional on LF,t satisfyingΠ(x) ≤ 0 for all x ∈ XF,t. After normalization so that ΠF,t(1F×t) = 1,we arrive to the following definition.

Definition 1.4.5. The spot-(F, t) present-value function inducedby the time-zero present value function Π is the function

(1.4.1) ΠF,t (c) =Π(c1F×t,...,T

)Π(1F×t

) , c ∈ L.

Note that we have defined ΠF,t over the entire domain L, while onlyits restriction ΠF,t : LF,t → R is meaningful as a spot-(F, t) present-value function. This is merely a convenience. The following propositionshows that for an arbitrage-free liquid market the preceding definitioncovers all spot-(F, t) present-value functions.

Proposition 1.4.6. Suppose the market X is arbitrage-free andliquid and ΠF,t is a positive linear functional on LF,t such that ΠF,t (x) ≤0 for every x ∈ X ∩LF,t and ΠF,t(1F×t) = 1. Then ΠF,t is induced bysome time-zero present value function.

Proof. Fix any time-zero present-value function Π and define thefunction Π : L → R by letting Π(c) = Π (c), where c is the cash flow

Page 25: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 25

obtained from c after replacing the restriction of c on F × t, . . . , Twith a single payment at spot (F, t) equal to ΠF,t (c), that is, c ≡c− c1F×t,...,T + ΠF,t (c) 1F×t. Then Π is a positive linear functionalthat satisfies Π(1Ω×0) = 1. We now show that Π vanishes on Xand is therefore a present-value function that induces ΠF,t. Givenany x ∈ X, the liquidity assumption allows us to find y ∈ X ∩ LF,tsuch that (x− y) 1F×t+1,...,T = 0 and therefore ΠF,t (x− y) 1F×t =

(x− y) 1F×t,...,T, which in turn implies Π(x− y) = Π (x− y) = 0.Moreover, since y ∈ LF,t and ΠF,t (y) = 0, we also have Π(y) = 0.Therefore, Π(x) = Π (x− y) + Π (y) = 0.

Unless otherwise indicated, our focus is on liquid markets and thenotation ΠF,t will refer to the conditioned version of Π, as defined byequation (1.4.1).

We conclude this section with some simple but essential observa-tions on the relationship between present-value functions and contractpricing.

Definition 1.4.7. The positive linear functional Π : L → R is saidto price the contract (δ, V ) if for every spot (F, t),(1.4.2) V (F, t) = ΠF,t (δ) ,

where ΠF,t denotes the spot-(F, t) present-value function induced by Π.

Proposition 1.4.8. (a) In every market, all present-value func-tions price all traded contracts.

(b) A positive linear functional on L is a present-value function forthe market implemented by a set of contracts if and only if it pricesevery contract in the given set.

Proof. (a) Set to zero the present value of the cash flow (1.3.3)resulting from buying a traded contract at a given spot.

(b) Let X0 be defined as in last section’s last paragraph. A positivelinear functional vanishes on span(X0) if and only if it vanishes on X0

if and only if it prices every contract defining X0.

1.5. Options and dominant choiceAn option generalizes our earlier notion of a contract by allowing

the owner to choose any dividend process within a specified set of cashflows O. For example, suppose O represents the possible cash flowsresulting from a finite opportunity set of projects available to a firm.Shareholder owners of the firm may disagree on what is the best projectto undertake. Assuming shareholders only care about the cash flowgenerated by a project and every cash flow in O can be sold in a com-plete market, then shareholders who would disagree absent a marketcan agree on a project that is assigned the highest price by the market.Extending this idea, we will see that in the presence of a sufficiently

Page 26: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 26

complete arbitrage-free market, a present-value maximizing choice isdominant in that it is optimal given the market for every shareholder.Moreover, assuming the market is liquid, a choice that is dominant attime zero remains dominant as uncertainty unfolds and therefore thereis no incentive to deviate from an initially selected dominant choice.The existence of a dominant choice in turn leads to the pricing of atraded option. Another important application of these arguments is tostandardized traded financial options such as American call and putoptions.

We assume throughout that a market X is available and that O isa nonempty set of cash flows.

Definition 1.5.1. A cash flow δ∗ ∈ O is dominant (in O givenX) if for every δ ∈ O, there exists some x ∈ X such that δ∗ + x ≥ δ.

A dominant choice is optimal for every agent who is not averse toadditional income at some spot (or can freely dispose of income). Givenany set of agents I, suppose agent i ∈ I finds δi optimal in O. If δ∗ ∈ Ois dominant, then there exist trades xi ∈ X such that δ∗ + xi ≥ δi.Agent i is therefore at least as well off selecting δ∗ instead of δi and atthe same time entering the trade xi. In this sense, all agents in I agreeon the optimality of δ∗, even though the way they use the market totransform δ∗ can differ.

Theorem 1.5.2. Suppose the market X is arbitrage-free and O isa nonempty set of cash flows. Then δ∗ ∈ O is dominant in O if andonly if Π(δ∗) = max Π(δ) | δ ∈ O for every present-value function Πfor X.

Proof. Suppose δ∗ is dominant. For any δ ∈ O, we can writeδ∗ + x ≥ δ for some x ∈ X, and therefore Π(δ∗) = Π (δ∗ + x) ≥ Π(δ)for every present-value function Π.

For the converse, it is instructive, although formally redundant,to first consider a simple argument for a complete market X. Supposeδ∗ ∈ O maximizes the unique present-value function Π over O. Since Xis complete, δ∗ = Π(δ∗) 10+y∗ and δ = Π(δ) 10+y for some y∗, y ∈ X.Letting x = y−y∗ ∈ X, we have δ∗+x = Π(δ∗) 10+y ≥ Π(δ) 10+y = δ,which proves the dominance of δ∗.

More generally, suppose X is arbitrage-free but not necessarily com-plete and δ∗ ∈ O is not dominant. We will show that there exists somepresent-value function that δ∗ does not maximize. As in the proof ofTheorem 1.4.3, we identify L with R1+N . Since δ∗ is not dominant,there exists δ ∈ O such that δ∗ − δ + x /∈ R1+N

+ for all x ∈ X. Withx∗ ≡ δ∗ − δ, the set X∗ = x+ αx∗ | x ∈ X, α ∈ R+ is a closed con-vex cone that contains no arbitrage. By Remark 1.4.4, there existsp ∈ R1+N

++ such that p · x ≤ 0 for all x ∈ X∗. Such a vector p is astate-price process that satisfies p · x∗ ≤ 0. By construction, x∗ /∈ X.

Page 27: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 27

Let x∗ = x + z, where x ∈ X and z is nonzero and orthogonal to X.Pick ε > 0 small enough so that pε ≡ p − εz is a state-price processsuch that pε · x∗ = −εz · z < 0. The present-value function Π definedby pε satisfies Π(x∗) < 0 and therefore Π(δ∗) < Π(δ).

Corollary 1.5.3. Suppose the market is arbitrage-free, the set ofcash flows O is compact (for example, finite), and every element of Ois marketed. Then a dominant choice in O exists.

If the option contains non-marketed cash flows, there may not be adominant choice, even if O is compact. For a trivial example, supposeT = 1, there is no uncertainty, O = (0, 1) , (1, 0) and X = 0. Nei-ther element of O dominates the other. Every positive linear functionalis a present-value function for the trivial market X = 0, and eachelement of O maximizes some present value.

The preceding theorem characterizes dominance from the perspec-tive of time zero. Suppose now that the option holder does not haveto commit to an initial selection. Will there be an incentive to deviatefrom a previously made dominant selection in the face of new infor-mation? To address this issue we refine the definition of an option toincorporate the idea that the option holder can deviate from a previouscash flow choice at any time. As before, let O represent a set of cashflows available to the option holder at time zero. For every cash flowδ and spot (F, t), the set of cash flows in O that are equal to δ up tobut not including spot (F, t) is

(1.5.1) OF,t (δ) ≡δ ∈ O | δ = δ on F × 0, . . . , t− 1

,

Selecting δ ∈ OΩ,0 = O at time zero and switching to δ ∈ OF,t (δ)

at spot (F, t) is equivalent to selecting δ + (δ − δ)1F×t,...,T at timezero, which must therefore also belong to O if the option holder doesnot have to commit at time zero. This motivates the following formaldefinition.

Definition 1.5.4. An option is a set O ⊆ L such that for everyspot (F, t), δ ∈ O and δ ∈ OF,t (δ) implies δ + (δ − δ)1F×t,...,T ∈ O.

Example 1.5.5. (American call) An American call is the right butnot the obligation to receive a specified traded security in exchange fora specified cash amount, known as the strike, at most once any timeup to a given maturity date. We can formally view an American callas a special case of an option in the sense of Definition 1.5.4. Givena maturity τ ∈ 1, . . . , T − 1, let T denote the set of all stoppingtimes that are valued in 0, . . . , τ ∪ ∞. For τ ∈ T , let 1[τ ] denotethe process whose value at (ω, t) is one if τ (ω) = t and zero otherwise.An American option with payoff process D ∈ L is the option

O =D1[τ ] | τ ∈ T

, where D∞ ≡ 0.

Page 28: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 28

Selecting cash flow D1[τ ] is referred to as exercising the Americanoption at time τ . An American call on some underlying traded con-tract with ex-dividend processes S is the American option with payoffprocess D = S −K, where the scalar K is the strike.

Suppose that the underlying contract, which we henceforth referto as the stock, does not pay any dividends up to and including thematurity date τ . Suppose also that for every time prior to τ there isa way to save K (units of account) and have K for sure at time τplus potentially some non-negative interest. (Section 1.7 defines moreformally a money-market account that can implement such savings,provided the interest rate process is non-negative.) In this case, it isa dominant choice to not exercise the American call prior to maturity.To see why, suppose the option holder is considering exercising theoption at some spot (F, t) with t < τ . The option holder is at leastas well off keeping the option alive to maturity, shorting the stock andsaving K. Exercising at spot (F, t) results in an inflow S (F, t)−K andnothing thereafter. The alternative strategy also results in an inflowS (F, t)−K at spot (F, t), but now the option holder still has the calloption, a saved amount that is sufficient to pay the strike at maturity,and a short position on the underlying stock. At maturity, if the stockis worth more than K, the option holder can spend K from savings,exercise the call option and close out the short position. If on the otherhand the stock price is less than K at maturity, the option holder endsup with K − Sτ plus any additional interest on savings. Put together,relative to exercising at spot (F, t), the alternative strategy results inthe additional time-τ payoff of (K − Sτ )

+ (which is the payoff of aEuropean put option) plus any interest on the strike.

The argument fails if the stock pays sufficiently high dividends,since the benefit of collecting dividends can outweigh the costs of earlyexercise. This tradeoff between immediate payoff and the benefit ofweighting in order to condition actions on future information, as well aspossibly collect interest on costs associated with exercising the option,captures an essential intuition behind the optimal exercise of Americanoptions. We return to this example in the following chapter usinga dual approach based on present-value maximization and Jensen’sinequality. ♦

Fixing a reference option O and an arbitrage-free and liquid (butpossibly incomplete) market X, we now extend the notion of domi-nance to apply from the perspective of any given spot (F, t), where itis implicit that dominance is relative to the spot-(F, t) market XF,t ≡X ∩ LF,t.

Definition 1.5.6. The cash flow δ in O is dominant at spot (F, t)

if for all δ in OF,t (δ), there exists some x ∈ XF,t such that δ + x ≥ δon F × t, . . . , T.

Page 29: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 29

Theorem 1.5.7 can now be extended as follows. Recall that every(time-zero) present value function Π induces a spot-(F, t) present valuefunction ΠF,t, defined by (1.4.1).

Theorem 1.5.7. Given an arbitrage-free liquid market, the follow-ing conditions are equivalent, for all δ∗ ∈ O.

(1) δ∗ is dominant at every spot.(2) δ∗ is dominant (at spot zero).(3) Π(δ∗) = maxδ∈O Π(δ) for every present-value function Π.(4) For every present-value function Π and spot (F, t),

(1.5.2) ΠF,t (δ∗) = max

ΠF,t(δ) | δ ∈ OF,t (δ)

.

Proof. (1 =⇒ 2) If δ∗ is dominant, it is dominant at spot (Ω, 0).(2 ⇐⇒ 3) See Theorem 1.5.2.(3 =⇒ 4) Suppose that ΠF,t (δ) > ΠF,t (δ

∗) for some present-valuefunction Π, spot (F, t) and δ ∈ OF,t (δ

∗). Let δ ≡ δ∗+(δ − δ∗) 1F×t,...,T ∈O. Using equation (1.4.1) with c = δ − δ∗, we conclude that Π(δ) >Π(δ∗).

(4 =⇒ 1) Since the market is assumed liquid, Proposition 1.4.6implies that every spot-(F, t) present-value function ΠF,t can be com-puted by (1.4.1) in terms of some present-value function Π. Thereforeδ∗ is dominant at (F, t) by the argument of Theorem 1.5.2 applied toXF,t.

Suppose now that an option to be bought or sold. The option buyerpays the option price, commonly referred to as the premium, to theoption seller (or “writer”), who is then obligated to deliver the cash flowselected by the buyer. Assuming the existence of a marketed dominantchoice at time zero, as well as a dominant choice at every other spotgiven any exercise history, we will show that the premium that is con-sistent with the absence of arbitrage opportunities is the present valueof a dominant cash flow. The only subtlety in this argument is thata potential arbitrageur that writes an option must be able to hedgepotentially suboptimal choices by the option buyer.

We continue taking as given the arbitrage-free market X and anoption O with time-zero4 premium p. We analyze the buying andselling of an option separately in order to account for the fact that onlythe owner decides how to exercise the option. If p is sufficiently low,then an arbitrage can be created by buying the option and selectinga dominant cash flow. A buyer of the option pays the premium p,can select any cash flow δ in O and can trade in the market X, thus

4To keep the notation simple, we consider the trading of the option O at timezero only. For every δ ∈ O, the trading of the option OF,t (δ) at spot (F, t) can beanalyzed by applying the same arguments on the subtree rooted at (F, t).

Page 30: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 30

generating a cash flow of the form

(1.5.3) − p1Ω×0 + δ + x, δ ∈ O, x ∈ X.

Excluding such an arbitrage implies a lower bound on the option pre-mium.

Proposition 1.5.8. Suppose the market is arbitrage-free and δ∗ isa dominant cash flow in O that is marketed with present value p∗. Thenp ≥ p∗ if and only if there is no arbitrage of the form (1.5.3).

Proof. Choose x∗ ∈ X such that δ∗ = p∗1Ω×0 + x∗. If p < p∗,then the cash flow −p1Ω×0+ δ

∗−x∗ = (p∗ − p) 1Ω×0 is an arbitrage.Conversely, suppose p ≥ p∗ and let c ≡ −p1Ω×0+ δ+x for any δ ∈ Oand x ∈ X. The dominance of δ∗ implies that δ∗ + y ≥ δ for somey ∈ X. Therefore,

c ≤ −p∗1Ω×0 + (δ∗ + y) + x = x∗ + x+ y ∈ X.

Since X is arbitrage-free, c is not an arbitrage.

If the option premium is higher than the present value p∗ of thedominant choice, one may wish to sell the option towards an arbitrage.A potential difficulty in this case is that the option buyer cannot beassumed to select the dominant cash flow δ∗. Given a slightly strongerversion of the assumption of Proposition 1.5.8, however, we can stillshow that if p > p∗ then an arbitrage is possible that involves writingthe option and hedging all possible choices by the option buyer. A keyaspect of the arbitrageur’s hedging strategy is that it can be imple-mented without knowledge of the option holder’s future choices. Weformalize this type of informational restriction as follows.

Definition 1.5.9. An O-adapted strategy is a mapping h thatassigns to each nonterminal spot (F, t) a function hF,t : O → XF,t suchthat δ = δ on F × 0, . . . t implies hF,t (δ) = hF,t(δ).

Given an O-adapted strategy h, we think of hF,t (δ) as the incrementaltrade that the option writer must enter at spot (F, t) in order to hedgeall possible future choices by the option buyer who has selected δ upto and including spot (F, t). To simplify the notation, for every δ ∈ Oand time t < T , let the function ht (δ) : Ω → X be defined by

(1.5.4) ht (δ) (ω) ≡ hF,t (δ) for all ω ∈ F.

An option seller who receives the premium p at time zero must deliverwhatever cash flow δ in O is selected by the option buyer. If the optionseller follows the hedging strategy h, the resulting cash flow is

(1.5.5) p1Ω×0 − δ +∑T−1

t=0ht (δ) .

Page 31: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.5. OPTIONS AND DOMINANT CHOICE 31

Excluding every arbitrage of this type implies an upper bound forthe option premium, which together with the lower bound of Propo-sition 1.5.8 pins down the option premium as the present value of adominant cash flow.

Proposition 1.5.10. Suppose the market is arbitrage-free and δ∗

is a dominant cash flow in O that is marketed with present value p∗.Suppose further that for every δ ∈ O and spot (F, t), there exists a cashflow in OF,t (δ) that is dominant at (F, t). Then p ≤ p∗ if and only ifthere exists no O-adapted strategy h such that the cash flow (1.5.5) isan arbitrage for every δ ∈ O.

Proof. The “only if” part can be shown similarly to the corre-sponding part of Proposition 1.5.8. Conversely, we assume p > p∗ andshow the existence of an O-adapted strategy h such that the cash flow(1.5.5) is an arbitrage for all δ ∈ O. Given δ ∈ O, select δF,t ∈ OF,t (δ)to be dominant at spot (F, t) and let δt ≡

∑ni=1 δ

Fi,t1Fi×0,...,T, whereF1, . . . , Fn is the partition generating Ft. The definition of an optionimplies that δt ∈ OF,t (δ) for every spot (F, t). Using the dominance ofδ∗, select x ∈ X such that δ∗+x ≥ δ1 and let h0 (δ) ≡ −p∗1Ω×0+δ

∗+x.The value h0 (δ) depends on δ only through the value δ0, since the choiceof x has this property. At time zero the arbitrageur sells the option,receiving the premium p, buys the cash flow δ∗ for a price p∗ and entersthe trade x, which together with δ∗ dominates δ1. After paying δ0 = δ10to the option buyer, the arbitrageur faces the cash flow

c0 ≡ p1Ω×0 + h0 (δ)− δ1Ω×0 ≥ (p− p∗) 1Ω×0 + δ11Ω×1,...,T.

Proceeding inductively, suppose that after all transactions prior to timet ∈ 1, . . . , T − 1, the arbitrageur faces an overall cash flow

ct−1 ≥ (p− p∗) 1Ω×0 + δt1Ω×t,...,T.

Using the dominance of δF,t at spot (F, t), choose hF,t (δ) ∈ XF,t so that

δt1F×t,.....T + hF,t (δ) ≥ δt+11F×t,.....T.

The arbitrageur can choose δt+11F×t,...,T, and therefore hF,t (δ), havingobserved only δ1F×0,...,t. At time t the arbitrageur enters the tradeht (δ), defined in (1.5.4), and pays out δt to the option holder, resultingin the new cash flow

ct ≡ ct−1 + ht (δ)− δ1Ω×t ≥ (p− p∗) 1Ω×0 + δt+11Ω×t+1,...,T.

At time T the arbitrageur pays out δT to the option holder, resultingin the overall arbitrage cash flow cT−1−δ1Ω×T ≥ (p− p∗) 1Ω×0. Therecursive construction of the cash flows ct implies that cT−1−δ1Ω×T isequal to (1.5.5), and we have therefore produced an O-adapted strategyh such that (1.5.5) defines an arbitrage for all δ ∈ O.

Page 32: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.6. TRADING STRATEGIES 32

1.6. Trading strategiesAs we saw in Section 1.2, the market implemented by a given set of

contracts can be described as the set of cash flows generated by tradingstrategies. A trading strategy essentially specifies a contingent tradefor every spot of the information tree. This section provides a moresystematic discussion of trading strategies and associated notation andterminology, which is tailored to the application of probabilistic meth-ods introduced in the following chapter.

Throughout this section, we take as given the contracts

(1.6.1)(δ1, V 1

), . . . ,

(δJ , V J

),

with corresponding ex-dividend price processes Sj = V j − δj. Tradingstrategies in these contracts are time-indexed sequences of portfoliosrepresenting positions over each period. Period t ∈ 1, . . . , T is thetime interval whose beginning is time t− 1 and whose end is time t. Aperiod-t portfolio

(θ1t , . . . , θ

Jt

)is formed at the beginning of period t,

where θjt represents a number of shares in contract j. By convention, weset θj0 = 0. In general, it is important to keep track of what quantitiesare known at the beginning of the period and what quantities are onlyknown at the end of the period. To emphasize this distinction, we definea process x to be predictable if xt is Ft−1-measurable, for every timet > 0, and x0 is constant. Thus if x is adapted, we can only claim thatxt is known at the end of period t, while if x is predictable, we knowthat xt is revealed at the beginning of period t. We let P denote the setof all predictable processes (relative to the given underlying filtration)and we set

P0 ≡ x ∈ P | x0 = 0 .The sequence (θj0, θ

j1, . . . , θ

jT ) of positions in contract j specified by a

trading strategy defines an element of P0.

Definition 1.6.1. A trading strategy in the contracts (1.6.1)is a J-dimensional row vector whose entries are elements of P0. Thetrading strategy

(θ1, . . . , θJ

)generates the cash flow x, where

(1.6.2) xt =∑

jθjtV

jt − θjt+1S

jt , t < T ; xT =

∑jθjTV

jT .

In the budget equation (1.6.2), one can think of xt as the netof the time-t value

∑j θ

jtV

jt of the period-t portfolio and the time-t

cost∑

j θjt+1S

jt of forming the period-(t+ 1) portfolio. Since θ0 = 0,

the budget equation implies that x0 = −∑

j θj1S

j0, which is the initial

payment required to start the strategy. The final payment xT equalsthe portfolio’s time-T liquidation value.

In Section 1.2 we defined the set implemented by the contracts (1.6.1)as the smallest market in which every (δj, V j) is traded.

Page 33: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.6. TRADING STRATEGIES 33

Proposition 1.6.2. The market implemented by the contracts (1.6.1)is the set of all cash flows generated by all trading strategies in thesecontracts.

Proof. Let X be the market implemented by the contracts (1.6.1)and let X ′ be the set of all cash flows generated by all trading strategiesin these contracts. We are to show that X = X ′. In the closingparagraph of Section 1.2 we saw that the X = span (X0), where X0

is the set of every −V j1F×t + δj1F×t,...,T, where (F, t) is a spotand j ∈ 0, . . . , J. One can easily check that every cash flow in X0 isgenerated by a traded strategy. SinceX ′ is a linear subspace, this showsthat X ⊆ X ′. For the converse inclusion, we must show that the cashflow x generated by a trading strategy θ is a linear combination of cashflows in X0. Note that x =

∑j x

j, where xjt = θjtVjt − θjt+1S

jt for t < T

and xjT = θjTVjT . It is therefore, sufficient to show that xj ∈ span (X0)

for every j. This can be shown by induction in the number of trades ofθj, defined as the number of spots where θj changes value. The detailsare left to the interested reader.

The budget equation (1.6.2) is more conveniently expressed in termsof gain processes, using a process transform operator that directly corre-sponds to stochastic integrals in continuous-time versions of the theoryand facilitates the application of martingale methods introduced in thefollowing chapter. Prior to stating this form of the budget equation,we digress briefly to introduce some useful notation.

For any process x, the lagged process x− is defined byx− (0) = x (0) , x− (t) = x (t− 1) , t = 1, . . . , T.

Clearly, the process x is adapted if and only if x− is predictable. Theincrements process of x is the process ∆x = x−x−, or equivalently,

∆x0 = 0, ∆xt = xt − xt−1, t = 1, . . . , T.

The integral5 x • y of the process x with respect to the process y isthe process

(x • y)0 = 0, (x • y)t =∑t

s=1xs∆ys, t = 1, . . . , T.

We denote by t the process that counts time: t (t) = t for every timet. Therefore, for every process x,

(x • t)0 = 0, (x • t)t =∑t

s=1xs, t = 1, . . . , T.

The gain process Gj of contract (δj, V j) is defined by

Gjt = V j

t +∑s<t

δjs = Sjt +∑s≤t

δjs, t = 0, . . . , T.

5The bullet notation for an integral is more commonly found in the more ad-vanced literature on stochastic analysis. See, for example, Jacod and Shiryaev[2003].

Page 34: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.6. TRADING STRATEGIES 34

For times t > s, the increment Gjt − Gj

s represents the total gain (orloss if negative) resulting from purchasing the jth contract at time sand selling it at time t. If

(θ1, . . . , θJ

)is a trading strategy, then θj •Gj

represents total gains from trading contract j up to time t.

Proposition 1.6.3. The trading strategy(θ1, . . . , θJ

)generates the

cash flow x if and only if

(1.6.3)∑

jθjV j = −x− • t +

∑jθj •Gj,

∑jθjTV

jT = xT .

Proof. Let W =∑

j θjV j. Since ∆Gj

t = V jt − Sjt−1, the budget

equation (1.6.2) can be written as

∆Wt = −xt−1 +∑j

θjt∆Gjt , t ≤ T ; WT = xT ,

which is equivalent to (1.6.3).

Matrix notation helps us simplify expressions such as (1.6.3), so letus take a moment to introduce some associated notation. For any setZ, we write Zm×n for the set of m-by-n matrices whose entries areelements of Z. For example, trading strategies are elements of P1×J

0 .Unless explicitly specified otherwise (as we did for trading strategies),vectors of processes are assumed to be column vectors and we write Zn

rather than Zn×1. If Z is the set of all (adapted, predictable) processes,then Zn is the set of n-dimensional (adapted, predictable) processes.If Z is a set of processes, we typically use superscripts to index vectorsand matrices. For x ∈ Ln×m and y ∈ Lm×l, the process x • y ∈ Ln×l isdefined by the analog of the usual matrix multiplication formula:

(x • y)ij =∑m

k=1xik • ykj.

With these conventions in place, we define the J-dimensional pro-cesses

(1.6.4) δ ≡

δ1

...δJ

and V ≡

V 1

...V J

,

as well as(1.6.5) S ≡ V − δ and G ≡ S + δ • t = V + δ− • t.The budget equation (1.6.3) can be restated more succinctly as(1.6.6) θV = −x− • t + θ •G, θTVT = xT .

Other versions of the budget equation are obtained by a change ofthe unit of account (also known as a change of numeraire). Supposethe strictly positive adapted process π represents a unit conversionfactor at every spot. A cash flow or price process x expressed in the

Page 35: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.6. TRADING STRATEGIES 35

original unit of account becomes πx in the new unit of account. Thegain process G in the new units becomes

Gπ ≡ πV + (πδ)− • t.

Clearly, changing units should not affect the validity of a budget equa-tion, resulting in the following version, whose proof is a simple exercise.

Proposition 1.6.4. For all π ∈ L++, the trading strategy θ gener-ates the cash flow x if and only if

(1.6.7) πθV = − (πx)− • t + θ •Gπ, θTVT = xT .

Given the J contracts (1.6.1) and the vector notation (1.6.4), wewrite (δ, V ) to refer to these contracts and we write X (δ, V ) for themarket that is implemented by trading in (δ, V ). A trading strategy θin (δ, V ) defines a new, synthetic contract, which can be thought of asa share in a fund following strategy θ.

Definition 1.6.5. The trading strategy θ ∈ P1×J0 , generating the

cash flow x, defines the synthetic contract (δθ, V θ), where

(δθ0, Vθ0 ) = (0, θ1V0) , (δθt , V

θt ) = (xt, θtVt) , t = 1, . . . , T.

A contract is synthetic in (δ, V ) if it is of the form (δθ, V θ) for sometrading strategy θ.

Note that the ex-dividend price process Sθ = V θ − δθ is given by

(1.6.8) Sθt−1 = θtSt−1, t = 1, . . . , T ; SθT = 0,

and the gain process of the contract (δθ, V θ) is given by

(1.6.9) Gθ ≡ V θ + δθ− • t = V θ0 + θ •G,

as can be seen by rearranging the budget equation (1.6.6).Consider now the trading strategies θ1, . . . , θm in the original con-

tracts (1.6.1) and let α ∈ Pm0 be a trading strategy in the synthetic

contracts (δθi , V θi), i = 1, . . . ,m, generating the cash flow x. One canthink of α as a trading strategy in m funds, where fund i follows trad-ing strategy θi. The same cash flow x can be generated by the tradingstrategy θ =

∑mi=1 αiθi in the original contracts (δ, V ). This argument

justifies the following observations.

Proposition 1.6.6. The market implemented by any synthetic con-tracts in (δ, V ) is a subset of the market X (δ, V ) implemented by (δ, V ).The market implemented by (δ, V ) and any number of synthetic con-tracts in (δ, V ) is X (δ, V ).

Dividend processes of synthetic contracts correspond to the cashflows that are marketed in X (δ, V ):

Page 36: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.7. MONEY MARKET ACCOUNT AND RETURNS 36

Proposition 1.6.7. A cash flow c is marketed in X (δ, V ) if andonly if there exists a trading strategy θ in (δ, V ) such that

ct = δθt , t = 1, . . . , T.

In particular, the market X (δ, V ) is complete if and only if every cashflow c with c0 = 0 is the dividend process of some contract that issynthetic in (δ, V ).

Proof. Recall that c is marketed in X (δ, V ) if and only if thereexists x ∈ X (δ, V ) such that ct = xt for t > 0. By the definitionof X (δ, V ) and a synthetic contract, the latter condition is equivalentto the existence of a trading strategy in (δ, V ) such that ct = δθt fort > 0.

Finally, we relate synthetic contracts to traded contracts. Note thatin order to conclude that a traded contract is synthetic, the assumptionthat the market is arbitrage-free is essential.

Proposition 1.6.8. Every contract that is synthetic in (δ, V ) istraded in X (δ, V ). Conversely, if the market X (δ, V ) is arbitrage-free,every contract that is traded in X (δ, V ) is synthetic in (δ, V ).

Proof. Buying a contract (δ∗, V ∗) that is synthetic in (δ, V ) at anyspot generates a cash flow in X (δ∗, V ∗), which, by Proposition 1.6.6, isa subset of X (δ, V ) . Therefore, a synthetic contract in (δ, V ) is tradedin X (δ, V ) .

Conversely, suppose that (δ∗, V ∗) is traded in X (δ, V ) and let xbe the cash flow generated by buying (δ∗, V ∗) at time zero. Sincex ∈ X (δ, V ), there exists a trading strategy θ in (δ, V ) that generates x.Since the time-zero dividend of every contract is assumed to be zero byconvention, it follows that δ∗ = δθ. Assuming the market is arbitrage-free, it must also be the case that V ∗ = V θ.

1.7. Money market account and returnsA special type of contract we call a money-market account im-

plements single-period default-free borrowing and lending: A unit ofaccount invested at time t− 1 pays 1 + rt at time t, where the interestrate rt is determined at time t − 1 and is therefore Ft−1-measurable.In practice, a loan can be made default-free by the posting of suffi-cient collateral, an important aspect of real-world markets that is notmodeled here. The rate rt is often referred to in the literature as the(period-t) risk-free rate. The rate rt applies to a loan over a singleperiod and is therefore a short-term interest rate and the predictableprocess r (with the convention r0 = 0) is a short-term interest rateprocess, a term we abbreviate to short-rate process. More precisely, weadopt the following terminology.

Page 37: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.7. MONEY MARKET ACCOUNT AND RETURNS 37

Definition 1.7.1. A money-market account (MMA) is a con-tract (δ0, V 0), with ex-dividend price process S0 = V 0 − δ0, such thatfor some r ∈ P0,

S0t−1 = 1 and V 0

t = 1 + rt, t = 1, . . . , T.

The predictable process r is the account’s (interest) rate process.Given a reference market X, a process r ∈ P0 is a short-rate processif r is the rate process of an MMA that is traded in X.

The following observations can be verified by the reader. We calltwo contracts equivalent if each is synthetic in the other.

Proposition 1.7.2. Suppose the market is arbitrage-free and r isa short-rate process. Then r is unique and 1 + r is strictly positive.Moreover, every traded contract (δ0, V 0) whose value process V 0 ispredictable and strictly positive is equivalent to an MMA and satisfies

(1.7.1) V 0t

S0t−1

= 1 + rt, S0 ≡ V 0 − δ0, t = 1, . . . , T.

In applications where the market is implemented by a given finiteset of contracts, it is common to assume that one of these contractsis an MMA. In all such applications, we label the contracts generatingthe market as(1.7.2)

(δ0, V 0

),(δ1, V 1

), . . . ,

(δJ , V J

),

where contract zero is the MMA, with rate process r and gain processG0. Since δ00 = r0 = 0, we have(1.7.3) V 0 = 1 + r, δ0− = r−, δ0T = VT , G0 = 1 + r • t.

Of course, all of last section’s results apply to this case after asimple relabeling of the contracts, which affects the form of the budgetequation. We adopt the matrix notation (1.6.4) and (1.6.5), where(δ, V ) and S and G refer to contracts 1, . . . , J and exclude contractzero. A trading strategy in the 1 + J contracts (1.7.2) is denoted as(

θ0, θ)∈ P × P1×J ,

where θ0t represents the ex-dividend value in the MMA at the beginningof period t > 0, and θ is a trading strategy in the remaining J contracts.Adapting the budget equation (1.6.6) to this notation, it follows thatthe trading strategy (θ0, θ) generates cash flow x if and only if(1.7.4) θ0V 0 + θV =

(θ0r − x−

)• t + θ •G, θ0TV

0T + θTVT = xT .

In applications where portfolios values can be assumed to stay pos-itive it is common to focus on returns rather than prices. The returnprocess Rj associated with contract j is defined by

(1.7.5) Rj0 = 1, Rj

t ≡V jt

Sjt−1

, t = 1, . . . , T,

Page 38: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.7. MONEY MARKET ACCOUNT AND RETURNS 38

provided that every Sjt−1 is nowhere zero, which we assume for theremainder of this section. We refer to Rj

t as the period-t return ofcontract j. Note that if r is the market’s short-rate process, then

R0 ≡ 1 + r.

The return process Rθ of a trading strategy(θ0, θ) is defined as thereturn process of the corresponding synthetic contract, which we de-note by

(δθ, V θ

), instead of the more cumbersome (δ(θ

0,θ), V (θ0,θ)).Letting Sθ ≡ V θ − δθ, the return process Rθ is well defined providedthe denominator in the following definition is nowhere zero:

Rθ0 ≡ 1, Rθ

t ≡V θt

Sθt−1

, t = 1, . . . , T.

The excess return process of the trading strategy (θ0, θ) is the differ-ence Rθ −R0.

Portfolio returns can be more parsimoniously represented in termsof returns and portfolio weights, provided of course all relevant returnsare well defined. Consider any trading strategy (θ0, θ) such that thetime-(t− 1) ex-dividend portfolio value

Sθt−1 = θ0t + θtSt−1

is nowhere zero. Associated with (θ0, θ) is a row vector

ψ =(ψ1, . . . , ψJ

)∈ P1×J

0 ,

defined by

(1.7.6) ψj0 = 0, ψjt ≡θjtS

jt−1

Sθt−1

, t = 1, . . . , T.

At the beginning period t, which is time t − 1, ψjt represents the pro-portion of the portfolio’s ex-dividend value Sθt−1 that is allocated tocontract j. The vector ψt omits the proportion allocated to the MMA,which can be computed as ψ0

t ≡ 1 −∑J

j=1 ψjt . We will refer to ψt,

which can be any element of L (Ft−1)1×J , as a (period-t) portfolio al-

location, and to ψ, which can be any element of P1×J0 , as a portfolio

allocation policy. The period-t return Rθt can be computed entirely

in terms of ψt, which we therefore also denote, abusing notation, byRψt :

(1.7.7) Rθt ≡ Rψ

t ≡ R0t +

J∑j=1

ψjt(Rjt −R0

t

).

In the following chapter we will discuss notions of optimal portfolioallocations.

Page 39: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.8. EXERCISES 39

1.8. ExercisesExercise 1 This exercise reviews some simple arbitrage arguments.

Assume that there is a single period (T = 1) and therefore the filtrationconsists of spot zero and N time-one spots. In an arbitrage-free marketX, you can trade a stock, which is a contract with time-zero price S(a scalar) and time-one payoff V (a random variable or element ofRN), and a money-market account (MMA) whose single-period rateis r (a scalar). Buying the stock at time zero generates the cash flow(−S, V ) ∈ X and selling (or shorting) the stock generates the cash flow(S,−V ) ∈ X. Investing a unit of account in the MMA generates thecash flow (−1, 1 + r) ∈ X and borrowing a unit of account from theMMA generates the cash flow (1,− (1 + r)) ∈ X. Note that since themarket is arbitrage-free, 1 + r > 0.

(a) (Forward pricing) A forward contract for delivery of the stockat time one is a contract whose time-zero price is by definition zero andwhose time-one payoff takes the form V −K for a scalar K, which isthe contract’s delivery price. The long contract position generatesthe cash flow (0, V −K) and the short contract position generates thecash flow (0, K − V ). The forward contract is traded inX if these cashflows are in X, in which case K defines the stock’s forward price F .Notice that the delivery price K is part of the contract definition. Theunique value of K such that (0, V −K) ∈ X defines F . How can youcreate a synthetic forward using the stock and the MMA? What is theimplied relationship between S and F assuming the forward contract istraded? What is an explicit arbitrage (assuming the forward is traded)if the claimed relationship between S and F is violated? How do youranswers change if the stock cannot be shorted,6 that is, if (−S, V ) ∈ Xbut X is only a cone, not a linear subspace, and (S,−V ) /∈ X.

(b) (Put-call parity) Entering a long forward contract generates apositive payoff on the event V > K and a negative payoff on the eventV < K. A trader entering a long forward position has the right toreceive the positive payoff and the obligation to pay the negative payoff.A (European) call option (on the stock), or just call, with strike Kis defined by the same payoff but without the obligation. The call’spayoff is therefore (V −K)+. Starting with a short forward contractand removing the obligation part results in the definition of the payoff(K − V )+ of a (European) put option (on the stock), or just put,with strike K. A call or put option, with payoff V o, is traded in X if(−So, V o) for a necessarily unique scalar So, which defines the option’sprice or premium. The cash flow (−So, V o) is generated by buying

6An arbitrageur that has a positive inventory of shares of the stock can incre-mentally sell some and therefore (S,−V ) ∈ X. A short-sale constraint binds if thearbitrageur has no share inventory and cannot borrow shares. The economics ofthe implied lease market dictate the cost of shorting the stock.

Page 40: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.8. EXERCISES 40

(or going long) the option and the cash flow (So,−V o) is generated byselling (or going short or writing) the option. Given a commitmentto pay V (resp. receive V ) at time one, buying the call (resp. put) is aform of insurance, paying a premium So at time zero in return for theright to pay K rather than V on the event V > K (resp. receive Krather than V on the event V < K).

Suppose the call and the put, both with strike K, are traded, withrespective premia Sc and Sp. How can you use them to syntheticallycreate the forward contract of part (a) with delivery priceK? Assumingthis forward contract is traded, and therefore F = K is the forwardprice of the stock, what is the implied relationship (known as put-callparity) between Sc − Sp and F? Suppose the relationship is violated.What is an explicit arbitrage using the options, the stock and the MMA(but not directly a forward)? How do your answers change if the stockcannot be shorted?

(c) (Simple binomial pricing) Specialize the setting further by as-suming there are only two states (N = 2). Assume that the stockreturn V /S takes the values 1 + u and 1 + d, where u > d > −1. Themarket X is implemented by trading in the stock and the MMA. Whatare necessary and sufficient conditions on the parameters r, u and dfor X to be arbitrage-free? Proceeding under the assumption that X isarbitrage-free, show that X is complete and compute the correspond-ing present-value function. What is the premium Sc of a (European)call according to this present-value function? Compute a replicatingportfolio, that is, a portfolio in the stock and the MMA whose time-one payoff is the same as that of the call option. Finally, confirm thatthe time-zero value of the replicating portfolio is consistent with yourearlier call premium calculation.

Exercise 2 Suppose the reference market X is arbitrage-free andcomplete. Explain why for every cash flow c, there is a unique scalarΠ(c) such that c − Π(c) 1Ω×0 ∈ X. Then show that the functionΠ : L → R so defined is a present-value function.

Exercise 3 Suppose the reference market is arbitrage-free. Showthat a cash flow is marketed if and only if it is assigned the same valueby every present-value function. As a corollary, show that a cash flow istraded if and only if it is assigned the value zero by every present-valuefunction.

Exercise 4 Fix a reference complete market with correspondingpresent-value function Π, which induces a spot-(F, t) present-valuefunction ΠF,t for every spot (F, t) (see Definition 1.4.5).

(a) Show that if Π prices the contract (δ, V ) (see Definition 1.4.7)then V uniquely solves the backward recursion

V (F, t) = δ (F, t) + ΠF,t

(V 1F×t+1

), V = δT .

Page 41: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.8. EXERCISES 41

(b) Consider an American option with payout process D as definedin Example 3.6.7, whose notation we use here. For every stoppingtime τ , let δτ ≡ D1[τ ] and define the adapted process V τ by lettingV τ (F, t) ≡ ΠF,t (δ

τ ) for every spot (F, t). (Note that, by construction,Π prices the contract (δτ , V τ ) and part (a) applies.) Define the adaptedprocess V ∗ by letting, for every spot (F, t),

V ∗ (F, t) ≡ max V τ (F, t) | τ ∈ T .

Show that V ∗ uniquely solves the backward recursionV ∗ (F, t) = max

D (F, t) ,ΠF,t

(V ∗1F×t+1

), V ∗

T = D+T .

Label each spot (F, t) red if V ∗ (F, t) = D (F, t) and green otherwise.A state ω corresponds to a path from spot zero to a terminal spot.Define τ ∗ (ω) to be the time of the first red spot along the path ω, withthe convention τ ∗ (ω) = ∞ if ω is a sequence of green spots only. Showthat the stopping time τ ∗ so defined is dominant (in the sense thatD1[τ∗] is a dominant cash flow choice for the option).

Exercise 5 (a) Show thatx • (y • z) = (xy) • z, x, y, z ∈ L.

(This is sometimes called the associate property of stochastic integralsand applies in more general stochastic settings.)

(b) Given contracts (δ, V ) ∈ LJ×2 , the trading strategies θi, . . . , θmdefine corresponding synthetic contracts

(δθi , V θi

), i = 1, . . . ,m. Let

α = (α1, . . . , αm) be a trading strategy in the m synthetic contracts,generating cash flow x. Use the corresponding budget equation in theform∑m

i=1αiV

θi = −x− • t +∑m

i=1αi •Gθi ,

∑m

i=1αiTV

θiT = xT ,

and part (a) to verify that the cash flow x is also generated by thetrading strategy θ ≡

∑mi=1 αiθi in the original contracts (δ, V ).

Exercise 6 Give an example of contracts (δ, V ) ∈ LJ×2 and acontract (δ∗, V ∗) that is not a synthetic contract in (δ, V ), yet it istraded in X (δ, V ).

Exercise 7 (Binomial replication) This exercise extends Exercise1c to the multi-period case, using the notation U ≡ 1+u and D ≡ 1+d.Suppose Ω = 0, 1T and the filtration Ft is generated by the processb, where

b0 ≡ 0, bt (ω) ≡ ωt, ω = (ω1, . . . , ωT ) ∈ Ω, t ∈ 1, . . . , T .

The process Z is specified by a given initial value Z0 ∈ (0,∞) and therecursion

ZtZt−1

≡ btU + (1− bt)D, t = 1, . . . , T,

Page 42: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.8. EXERCISES 42

for given constants U > D > 0. Note that the process Z also gener-ates the filtration Ft, since (Z1, . . . , Zt) and (b1, . . . , bt) are mutuallyuniquely determined path by path.

(a) The recombining tree is the graph whose nodes are all possi-ble values of Zt/Z0, and whose arrows connect a value z off Zt−1/Z0 tothe corresponding possible values zU and zD of Zt/Z0. For example,here is the recombining tree for T = 3:

1

D

U

D2

DU

U2

D3

D2U

DU2

U3

Note that spots correspond to the paths on the recombining tree. Forexample, there are three ways to go from the time-zero node to nodeDU2, corresponding to the fact that there are three spots that coa-lesce to form the node. While the spot “remembers” how we got there(there is only one path to each spot), the node “forgets” (there can bemany paths to each node). How many nodes are there and how manyspots? How do these numbers increase with T? What is their order ofmagnitude for T = 100?

(b) Assume that the market is arbitrage-free and is implemented bytwo contracts: an MMA (δ0, V 0) with a constant rate process r > −1,and a stock (δ, V ) with ex-dividend price process S ≡ V − δ, specifiedin terms of a constant dividend yield y > −1 by

(1.8.1) St−1 ≡ Zt−1 and Vt ≡ (1 + y)Zt, t = 1, . . . T.

As always, the convention is δ0 ≡ 0 and δT ≡ VT . Explain (withouttoo much formalism) why

U (1 + y) > 1 + r > D (1 + y) .

(c) Consider a contract (δ∗, V ∗) that pays no dividends prior to T ,that is, δ∗− = 0. Assume that the contract’s price can be expressedas V ∗

t = ft (Zt) for some functions ft : Nt → (0,∞). Let (θ0, θ) be

Page 43: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

1.8. EXERCISES 43

a trading strategy in the contracts (δ0, V 0) and (δ, V ), defining thesynthetic contract (δθ, V θ). Show that if (δ∗, V ∗) = (δθ, V θ), then

θ0t = gt (Zt−1) and θt = ht (Zt−1) , t = 1, . . . , T,

for functions gt, ht : Nt−1 → R, for which you should be able to giveexplicit formulas in terms of ft and the model parameters. Finally,provide a recursive algorithm for computing ft.

Page 44: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

CHAPTER 2

Probabilistic Methods in Arbitrage Pricing

The arbitrage-pricing theory of Chapter 1 postulates an exhaustiveset of possible states but makes no use of any probabilities over thesestates. This chapter introduces probabilistic representations of valu-ation rules consistent with a given arbitrage-free market, which areuseful in developing theoretical and computational methodology andessential in empirical applications.

2.1. Probability basicsIn this section we review some essential probabilistic concepts and

notation. To last chapter’s primitives of a finite state space Ω and afiltration on this state space, we add a reference probability (measure)on the subsets of Ω, that is, a function P : 2Ω → [0, 1] such thatP (Ω) = 1 and for all events A and B,

A ∩B = ∅ =⇒ P (A ∪B) = P (A) + P (B) .

We assume that P has full support: P (A) > 0 if A 6= ∅.The expectation or mean of a random variable x (under P ) is

E [x] ≡∑ω∈Ω

x (ω)P (ω) =∑

α∈x(ω)|ω∈Ω

αP (x = α) .

The function E : RΩ → R so defined is the expectation operatorrelative to P ; it is the unique linear functional on RΩ that is positive(E [x] > 0 if 0 6= x ≥ 0) and satisfies E [1A] = P (A) for every event A.We often omit excessive parentheses, as in Ex = E [x] and P [x ≤ α] =P (x ≤ α). The demeaned version of a random variable x is denotedx ≡ x−Ex. The covariance of two random variables x, y is the scalar(2.1.1) cov [x, y] ≡ E [xy] = E [xy]− E [x]E [y] .

We can then define the variance var [x] ≡ cov [x, x], the standarddeviation stdev [x] ≡

√var [x], and provided x and y have positive

variance, the correlation coefficient

corr [x, y] ≡ cov [x, y]stdev [x] stdev [y] .

The random variables x and y are uncorrelated if cov [x, y] = 0.Two uncorrelated random variables can be nontrivially determined bythe same random source. For example, if Ω = −1, 0, 1 and each state

44

Page 45: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.1. PROBABILITY BASICS 45

is assigned the same probability, then the identity random variablex (ω) = ω is uncorrelated with its square x2 (ω) = |ω| . If f (x) andg (y) are uncorrelated for all functions f, g : R → R, then the randomvariables x, y are said to be stochastically independent, or justindependent where there is no risk of confusion with other typesof independence (like linear independence). By virtue of Proposition1.1.4, the independence of x and y is really a property of the algebrasσ(x) and σ (y). Define two algebras A and B to be (stochastically)independent if every random variable in L (A) is uncorrelated withevery random variable in L (B). Then x and y are independent if andonly if σ(x) and σ (y) are independent.

Proposition 2.1.1. For all algebras A and B and correspondingpartitions A0 and B0, the following are equivalent conditions.

(1) A and B are independent.(2) P (A ∩B) = P (A)P (B) for all A ∈ A and B ∈ B.(3) P (A ∩B) = P (A)P (B) for all A ∈ A0 and B ∈ B0.

Proof. Since the covariance operator is linear in each of its argu-ments and every random variable is a linear combination of indicatorfunctions, A and B are independent if and only if 1A and 1B are un-correlated for all A ∈ A and B ∈ B (resp. A ∈ A0 and B ∈ B0). Sincecov (1A, 1B) = P (A ∩B)− P (A)P (B), this proves the equivalence ofthe first and second (resp. third) conditions.

More generally, the random variables x1, . . . , xn are said to be(stochastically) independent if for each i ∈ 1, . . . , n, the algebrasσ (xi) and σ (xj | j 6= i) are independent.1

Proposition 2.1.2. For all random variables x1, . . . , xn, the fol-lowing are equivalent conditions.

(1) x1, . . . , xn are stochastically independent.(2) σ (x1, . . . , xi−1) and σ (xi) are independent for i = 2, . . . , n.(3) E

∏ni=1 fi (xi) =

∏ni=1Efi (xi) for all f1, . . . , fn : R → R.

(4) P [⋂ni=1 xi ∈ Si] =

∏ni=1P [xi ∈ Si] for all S1, . . . , Sn ⊆ R.

Proof. That (1) implies (2) is immediate. We show (3) assuming(2) by induction in n. For n = 2, the claim is valid by definition.Suppose it is true for n − 1 variables. For all fi : R → R, (2) im-plies that

∏n−1i=1 fi (xi) and fn (xn) are uncorrelated and therefore the

expectation of their product equals the product of their expectation,which together with the inductive hypothesis gives (3). To show that

1This is not the same as pairwise independence. For example, suppose x1 andx2 are independent random variables, each taking the values +1 and −1 with equalprobability. (As in Example 1.1.1 with T = 2, xi (ω) = ωi and equal probabilityfor each state.) Then the random variables x1, x2, x1x2 are not independent eventhough any two of them are independent.

Page 46: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.1. PROBABILITY BASICS 46

(3) implies (4), set fi (xi) = 1xi∈Si. Finally, we show (1) assuming(4) holds. The partition generating σ (x1) is the set of all nonemptyevents of the form A = x1 = α1, α1 ∈ R, and the partition gen-erating σ (x2, . . . , xn) is the set of the nonempty events of the formB = x2 = α2, . . . , xn = αn, α2, . . . , αn ∈ R. For such A and B, (4)implies P (A ∩B) =

∏ni=1P [xi = αi] and P (B) =

∏ni=2P [xi = αi].

Therefore P (A ∩B) = P [A]P [B]. By Proposition 2.1.1, this provesthat σ (x1) is stochastically independent of σ (x2, . . . , xn). The sameargument applies for any permutation of the x1, . . . , xn, completing theproof.

Uncorrelatedness and stochastic independence can equivalently bethought of as orthogonality conditions in the space L =

x | x ∈ RΩ

of

zero-mean random variables, with the covariance inner product 〈x | y〉 =cov[x, y] and induced norm ‖x‖ =

√〈x | y〉 = stdev [x]. The random

variables x, y are uncorrelated if and only if x and y are orthogonalin L. Similarly, the algebras A and B are independent if and only ifL (A) ≡ x | x ∈ L (A) and L (B) are orthogonal in L. The Cauchy-Schwarz inequality in this context states that |corr [x, y]| ≤ 1 withequality holding if and only if x = αy or y = αx for some scalar α.

The conditional probability P (· | B) : 2Ω → [0, 1] given a positiveprobability event B is defined by Bayes’ rule:

(2.1.2) P (A | B) =P (A ∩B)

P (B).

The expectation operator corresponding to P (· | B) is denoted by E [· | B]and is easily seen to satisfy

(2.1.3) E [x | B] =E [x1B]

P (B), for all x ∈ RΩ.

The conditional expectation operator (under P ) given the algebraB generated by the partition B1, . . . , Bn is defined by

(2.1.4) E [x | B] =n∑i=1

E [x | Bi] 1Bi, for all x ∈ RΩ.

The following proposition shows that the random variable E [x | B] isthe best estimate of x given information B, in the sense of minimizingexpected squared error.

Proposition 2.1.3. For all algebras B, and random variables x,y, the following are equivalent conditions, provided y is B-measurable.

(1) y = E [x | B].(2) E[(x− y)2] ≤ E[(x− z)2] for all z ∈ L (B).(3) E [yz] = E [xz] for all z ∈ L (B).(4) Ey = Ex and corr [x− y, z] = 0 for all z ∈ L (B).(5) E [y1B] = E [x1B] for all B ∈ B.

Page 47: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.1. PROBABILITY BASICS 47

Proof. Condition (2) is a norm-minimization problem in the vec-tor space of random variables with the inner product 〈x | y〉 = E [xy].The y ∈ L (B) satisfying condition (2) is the projection of x ontoL (B). By the orthogonal projection theorem (Corollary B.4.2), such ay ∈ L (B) is equivalently characterized by the orthogonality condition〈x | y − z〉 = 0 for all z ∈ L (B), which is condition (3). The equiva-lence of (3) and (4) is immediate from the definitions. The equivalenceof (3) and (5) is also straightforward, given that every z ∈ L (B) isa linear combination of indicator functions of events in B. Finally,let B0 ≡ B1, . . . , Bn denote the partition generating B. Since theindicator of any B ∈ B is the sum of random variables of the form1Bi

, condition (5) is equivalent to its version resulting after replacingB with B0, whose equivalence to condition (1) follows easily from thedefinitions.

Remark 2.1.4. In infinite state space extensions of the theory,definition (2.1.4) is not generally meaningful and a version of the lastproposition’s orthogonality condition forms the basis for the usual text-book definition of a conditional expectation. The uniqueness claim inthe last proposition relies on the full-support assumption on P . Ingeneral, any two random variables y and y′ that are conditional expec-tations of x given B must satisfy P [y = y′] = 1 but can take arbitraryvalues on any B ∈ B such that P (B) = 0. We will have no needfor this generality here, but the technicality becomes unavoidable ininfinite state space extensions of the theory.

Corollary 2.1.5. The algebras A and B are stochastically inde-pendent if and only if E [x | B] = E [x] for all x ∈ L (A).

Proof. Recall that A and B are independent if and only if L (A)

and L (B) are orthogonal in L under the covariance inner product,which is in turn equivalent to the requirement that the projection ofevery x ∈ L (A) onto L (B) is zero. By Proposition 2.1.3, the lastcondition can be restated as E [x | B] = 0 for all x ∈ L (A).

The important law of iterated expectations states that(2.1.5) B ⊆ A =⇒ E [E [x | A] | B] = E [x | B] ,for every random variable x and algebras A and B. Indeed if B ⊆ A,L (B) is a linear subspace of L (A), and therefore projecting x on L (B)is equivalent to first projecting x on L (A) and then further projectingon L (B), which translates to E [x | B] = E [E [x | A] | B].

Another important property of conditional expectations, stated be-low, is an immediate consequence of identities (2.1.3) and (2.1.4) in thecurrent context. The following less direct proof is instructive, however,in illustrating how the orthogonality characterization of expectationscan be used in ways that also apply in infinite state-space extensionsof the theory.

Page 48: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.1. PROBABILITY BASICS 48

Proposition 2.1.6. For every random x and algebra B, if b ∈L (B), then E [bx | B] = bE [x | B].

Proof. Fix any b ∈ L (B) and let y = bE [x | B]. We use thecharacterization of condition 2 of Proposition 2.1.3 twice, first to justifythe middle equality in

E [(bx) z] = E [x (bz)] = E [E [x | B] (bz)] = E [yz] , for all z ∈ L (B) ,

and then to conclude that since y ∈ L (B), the above condition impliesy = E [bx | B].

The conditional expectation of a random variable x given therandom variables y1, . . . , yn is defined by

E [x | y1, . . . , yn] = E [x | σ (y1, . . . , yn)] ,

which is a function of (y1, . . . , yn) in the sense of Proposition 1.1.4.Thus E [x | y1, . . . , yn] is the projection of x onto L (σ (y1, . . . , yn)),which is the linear space of all random variables of the form f (y1, . . . , yn)for arbitrary f : Rn → R. This contrasts with a simple linear re-gression,2 where x is projected on the smaller subspace of all randomvariables of the same form but with f restricted to be linear.

As in the last chapter, throughout this chapter we take as givenan underlying filtration (F0,F1, . . . ,FT ) on Ω, abbreviated to Ft,with F0 = ∅,Ω and FT = 2Ω. We write Lt ≡ L (Ft) for the set ofall Ft-measurable random variables, Et [x] or Etx for the conditionalexpectation E [x | Ft], and covt for the conditional covariance given Ft,defined as in (2.1.1) but with Et in place of E. Conditional variances,standard deviations and correlation coefficients given Ft are definedand denoted analogously.

A martingale (under P ) is any adapted process M such that

Mt = EtMs for all s > t.

By the law of iterated expectations (2.1.5), an adapted process M isa martingale if and only if Et−1 [∆Mt] = 0 for all t > 0 if and onlyif Mt = Et [MT ] for all t. We let M denote the set of all martingalesand M0 ≡ M ∈ M |M0 = 0, which is the set of all zero-meanmartingales since a martingale M is in M0 if and only if EMt = 0for all t.

The following simple observation has far-reaching implications forfinance interpretations as well as martingale theory in general, includ-ing the extension of the theory of stochastic integrals in more generalsettings.

2A prominent case where the two projections coincide is when (x, y1, . . . , yn)has a Gaussian distribution, the definition of which requires an infinite state space.

Page 49: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.1. PROBABILITY BASICS 49

Proposition 2.1.7. For every martingale M and predictable pro-cess ϕ, the integral ϕ •M is a martingale.3

Proof. If ϕt ∈ Lt−1, then Et−1 [∆ (ϕ •M)t] = Et−1 [ϕt∆Mt] =ϕtEt−1 [∆Mt] = 0.

Example 2.1.8. In the context of Example 1.1.1, assume that Prepresents a fair coin: P (ω) = 2−T for all ω ∈ Ω. A gambler bet-ting on a coin toss gains a dollar if the outcome is heads and loses adollar otherwise. Contingent on state ω, a gambler betting on the firstt coin tosses gains (or loses if negative) Bt (ω) = ω1 + ω2 + · · · + ωt.Letting B0 = 0, this defines an adapted process B whose increments∆B1, . . . ,∆BT are stochastically independent and generate the under-lying filtration. Since the algebra Ft−1 = σ (∆B1, . . . ,∆Bt−1) is inde-pendent of (the algebra generated by) ∆Bt, Et−1∆Bt = E∆Bt = 0 andtherefore B ∈ M0. A gambler who starts betting at time t and quits ata later time u can expect zero total gains: Et [Bu −Bt] = 0. A naturalquestion is whether the gambler could beat the odds by following someclever strategy, wagering ϕt ∈ Lt dollars on the tth coin toss at timet−1. With the convention ϕ0 = 0, such a strategy defines an element ϕof P0 and the process M = ϕ •B defines the corresponding cumulativegains. By Proposition 2.1.7, M is also a zero-mean martingale. There-fore, a gambler following strategy ϕ from time t up to a later time umust again expected zero total gains: Et [Mu −Mt] = 0. In this sense,the gambler cannot beat the odds. If on the other hand the gamblerwere allowed to bet indefinitely, then the following doubling strategywould beat the odds: Bet one dollar, if heads stop, otherwise bet twodollars plus one on a second coin toss, if heads stop, otherwise bet fourdollars plus one on a third coin toss, and so on. If allowed to playindefinitely, eventually heads is bound to show up resulting in a gainof one dollar. Of course, losses can be staggering in the meantime andany finite borrowing limit is enough to rule out the strategy. Here thistype of doubling strategy is ruled out by the finite number of periods,but it is something one has to technically deal with in infinite-horizonor continuous-time models. ♦

Every adapted process x has a unique Doob decomposition:(2.1.6) x = x0 + xp +M, xp ∈ P0, M ∈ M0.

To see why, note that (2.1.6) implies ∆x = ∆xp +∆M and therefore,∆xpt = Et−1∆xt, t = 1, . . . , T.

Conversely, if xp is recursively defined by the last equation and xp0 = 0,then xp ∈ P0 and x − xp is a martingale. The process xp is known asthe compensator of x.

3Note that it is not enough to assume that ϕ is adapted for ϕ • M to bea martingale, a fact that is the main motivation behind the introduction of thenotion of a predictable process.

Page 50: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.2. BETA PRICING AND FRONTIER RETURNS 50

2.2. Beta pricing and frontier returnsIn Section 1.7 we defined excess returns as the difference between

the returns of a traded contract and that of a traded (default-free)money market account (MMA). In this section we show that in anarbitrage-free market expected excess returns are proportional to thereturn’s covariance with a traded return that is characterized by theproperty of minimizing variance given its expected value. The resultingexpression for expected returns is known as a beta pricing equation.

Beta pricing is a characterization of single-period traded cash flows;it is not about the specific contracts implementing the market anda version of the theory can be formulated with or without a tradedMMA. For expositional simplicity, however, we adopt the setting ofSection 1.7. We take as given an arbitrage-free market that is im-plemented by the 1 + J contracts (1.7.2), where Sjt−1 6= 0 everywhere(meaning at all states) and therefore returns are well-defined in (1.7.5).Contract zero is an MMA with rate process r ∈ P0. The correspondingperiod-t return is

R0t = 1 + rt ∈ Lt−1.

A period-t portfolio allocation ψt =(ψ1t , . . . , ψ

Jt

)∈ L1×J

t−1 results in theportfolio return

Rψt ≡ R0

t +J∑j=1

ψjt(Rjt −R0

t

).

The set of period-t traded returns is

Rt ≡Rψt | ψt ∈ L1×J

t−1

.

A key observation is that returns can be mixed: For all R1t , R

2t ∈ Rt and

αt ∈ Lt−1, (1− αt)R1t + αtR

2t ∈ Rt. To avoid trivialities, we assume

throughout that for every time t, there exists some Rt ∈ Rt such thatvart−1 [Rt] > 0 everywhere.

Definition 2.2.1. The traded return R∗t ∈ Rt is a (minimum vari-

ance) frontier return if for all Rt ∈ Rt,Et−1Rt = Et−1R

∗t ⊆ vart−1[Rt] ≥ vart−1[R

∗t ] .

The property of being a frontier return can also be defined spot byspot: R∗

t ∈ Rt is a frontier return at spot (F, t− 1) if for all Rt ∈ Rt,E [Rt | F ] = E [R∗

t | F ] =⇒ var [Rt | F ] ≥ var[R∗t | F ].

Since for all F ∈ Ft−1, Rt, R∗t ∈ Rt implies Rt1F + R∗

t 1F c ∈ Rt, itfollows that R∗

t is a frontier return if and only if it is a frontier returnat every time-(t− 1) spot. Similarly, the following characterization offrontier returns, as well as the entire discussion of the remainder of thissection, applies separately spot by spot.

Page 51: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.2. BETA PRICING AND FRONTIER RETURNS 51

Lemma 2.2.2. R∗t ∈ Rt is a frontier return if and only if for all

Rt ∈ Rt, Et−1 [Rt −R∗t ] = 0 ⊆ covt−1[Rt −R∗

t , R∗t ] = 0.

Proof. Fix any spot (F, t− 1) and let (Fi, t), i = 0, 1, . . . , d, de-note its immediate successor spots. On R1+d, we use the inner product

〈x | y〉 ≡d∑i=0

xiyiP [Fi | F ] .

Let RF,t ≡ (R (F0, t) , . . . , R (Fd, t)) | R ∈ Rt and define the linearmanifold M of all z ∈ RF,t such that

∑i (zi − z∗i )P [Fi | F ] = 0, where

z∗ ≡ (R∗ (F0, t) , . . . , R∗ (Fd, t)). The fact that R∗

t is a frontier returnimplies that ‖z‖ ≥ ‖z∗‖ for all z ∈ M . In other words, z∗ is theorthogonal projection of the zero vector onto M . By the orthogonalprojection theorem (Corollary B.4.2), z∗ is characterized by the orthog-onality condition 〈z − z∗ | z∗〉 = 0 for all z ∈M , which can be restatedas, for all Rt ∈ Rt, E[Rt−R∗

t | F ] = 0 implies E[(Rt −R∗t )R

∗t | F ] = 0.

The lemma’s claim follows. We can use the above lemma to determine all frontier allocations,

that is, all portfolio allocations generating frontier returns. Let1 + µit ≡ Et−1R

it and Σij

t ≡ covt−1

[Ri, Rj

], i, j = 1, . . . , J,

and define the column vector µt ≡(µ1t , . . . , µ

Jt

)′ and the J × J sym-metric matrix Σt ≡ [Σij

t ], which is positive semidefinite (at every time-(t− 1) spot). For simplicity, we assume that Σt is full rank and there-fore positive definite. The assumption of a full-rank Σt is easily seen tobe equivalent to the assumption that the contracts are everywherenon-redundant, meaning that conditionally on any spot (F, t− 1),there is no j ∈ 0, 1, . . . , J such that the return Ri

t on F can be gen-erated by an allocation in the remaining contracts. By Lemma 2.2.2,an allocation ψ∗

t generates a frontier return R∗t if and only if for every

period-t allocation ψt,(ψ − ψ∗

t ) (µt − rt) = 0 implies (ψ − ψ∗t ) Σtψ

∗′t = 0.

In other words, whenever ψt−ψ∗t is orthogonal to µt− rt, it is also or-

thogonal to Σtψ∗′t (all conditionally on each time-(t− 1) spot). There-

fore, µt−rt is collinear to Σtψ∗′t in the sense that there exists αt ∈ Lt−1

such that αt (µt − rt) = Σtψ∗′t . Rearranging, we can parametrically

describe all frontier allocations by(2.2.1) ψ∗

t = αt (µt − rt)′ Σ−1

t , αt ∈ Lt−1.

Note that if ψ∗t is any nonzero frontier allocation, then every other

frontier allocation takes the form αtψ∗t for some αt ∈ Lt−1, a fact

known as two-fund separation. The term reflects the idea that everyfrontier return can be achieved by allocating a value proportion αt in afund allocated according to ψ∗

t and the rest in a fund that is an MMA.

Page 52: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.2. BETA PRICING AND FRONTIER RETURNS 52

As we will see shortly, two-fund separation is valid even without theassumption that the contracts are everywhere non-redundant.

The following is this section’s central result, showing that the fron-tier returns other than the MMA return are exactly the returns relativeto which beta-pricing is possible.

Proposition 2.2.3. For all R∗t ∈ Rt such that vart−1 [R

∗t ] > 0

everywhere, the following two conditions are equivalent:(1) R∗

t is a frontier return.(2) For all Rt ∈ Rt,

(2.2.2) Et−1Rt −R0t =

covt−1[R∗t , Rt]

vart−1 [R∗t ]

(Et−1R

∗t −R0

t

)and for some Rt ∈ Rt, Et−1Rt 6= R0

t everywhere.Proof. (1 =⇒ 2) Suppose R∗

t ∈ Rt is a frontier return and there-fore 0 = vart−1 [R

0t ] ≥ vart−1 [R

∗t ] on the event Et−1R

∗t = R0

t. Since,vart−1 [R

∗t ] > 0 everywhere, the event Et−1R

∗t = R0

t is empty.Given any Rt ∈ Rt, define Rt ∈ Rt by letting

Rt ≡ R∗t +Rt −R0

t onEt−1Rt = R0

t

, and

Rt ≡ R0t +

Et−1R∗t −R0

t

Et−1Rt −R0t

(Rt −R0t ) on

Et−1Rt 6= R0

t

.

By construction, Et−1

[Rt −R∗

t

]= 0 and therefore, by Lemma 2.2.2,

covt−1[Rt −R∗t , R

∗t ] = 0, which expands to equation (2.2.2).

(2 =⇒ 1) Conversely, suppose the second condition holds, whichclearly implies that Et−1R

∗t 6= R0

t everywhere. We show that R∗t is a

frontier return by verifying the orthogonality condition of Lemma 2.2.2.Consider any Rt ∈ Rt such that Et−1 [Rt −R∗

t ] = 0. Applying the betapricing equation and canceling out the term Et−1R

∗t −R0

t on each side,we conclude that covt−1[Rt −R∗

t , R∗t ] = 0.

The so-called beta-pricing equation (2.2.2) has been of considerableinterest in empirical work, since it suggests that the slope coefficientof a linear regression (commonly denoted by β) can be used to explainexpected excess returns. In practice, we can only identify a frontierreturn with error. We therefore have to consider the beta pricing equa-tion relative to some proxy return Rp

t = R∗t + εt, where the error εt

is judged to be small. It is an instructive exercise to show that anarbitrarily small value of Et−1ε

2t is consistent with the existence of a

traded return whose beta with respect to Rpt is arbitrarily different

from its beta with respect to R∗. The basic idea is that leverage (thatis, borrowing from the MMA to invest in a risky portfolio) can arbi-trarily amplify any discrepancy between the two betas. This pitfall canbe avoided if we instead focus on excess returns normalized by theirstandard deviation.

Page 53: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.2. BETA PRICING AND FRONTIER RETURNS 53

The time-(t− 1) Sharpe ratio of a period-t return Rt is the ratio

(2.2.3) St−1[Rt] ≡Et−1Rt −R0

t

stdevt−1 [Rt],

provided the denominator is nowhere zero. We adopt the conventionthat St−1[Rt] ≡ 0 on the event vart−1 [Rt] = 0. The beta-pricingequation (2.2.2) can be restated in terms of Sharpe ratios as

(2.2.4) St−1 [Rt] = corrt−1[R∗t , Rt]St−1 [R

∗t ] .

This version of the beta-pricing equation is robust to replacing R∗t with

a highly correlated proxy Rpt , in the following sense.4

Proposition 2.2.4. For all R∗t , R

pt , Rt ∈ Rt, equation (2.2.4) im-

plies its approximate version:

(2.2.5) St−1[Rt] = corrt−1[Rpt , Rt]St−1 [R

pt ] + ϵt−1,

where|ϵt−1| ≤ 2 |St−1 [R

∗t ]| |1− corrt−1 [R

∗t , R

pt ]| .

Proof. We adopt the notation of the proof of Lemma 2.2.2, sincethe argument relates to a single step of the information tree following agiven spot (F, t− 1). Analogously to z∗, let z ≡ (R (F0, t) , . . . , R (Fd, t))and zp ≡ (Rp (F0, t) , . . . , R

p (Fd, t)). We write S [z] for the value ofSt−1[Rt] on F , and analogously for S [z∗] and S [zp]. The vectorsz, z∗, zp can be thought of as random variables on 0, 1, . . . , d. Forany such random variable x, we use the notation

x ≡ x−∑

i xiP [Fi | F ]stdev [x] ,

where the standard deviation is relative to the probability assigningmass P [Fi | F ] to state i. Note that 〈x | x〉 = 1 and 〈x | y〉 = corr [x, y].Let 1 − δ ≡ 〈z∗ | zp〉, which is the value of corrt−1 [R

∗t , R

pt ] on F , and

let ϵ ≡ S [z]−〈zp | z〉 S [Rp], which is the value of ϵt−1 on F , as definedby (2.2.5). Condition (2.2.4) implies that S [z] = 〈z∗ | z〉 S [z∗] andS [zp] = (1− δ)S [z∗]. Therefore,

ϵ2

S [z∗]2= 〈z∗ − (1− δ) zp | z〉2

≤ 〈z∗ − (1− δ) zp | z∗ − (1− δ) zp〉2 ,

where the last inequality follows by the Cauchy-Schwarz inequality.Expanding the last term, we find that it equals (2δ − δ2)

2. Thereforeϵ2 ≤ S [z∗]2 (2δ)2, which is the claimed bound.

4While this section’s material is standard textbook material, to my knowledge,Proposition 2.2.4 first appeared in Skiadas [2009].

Page 54: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.2. BETA PRICING AND FRONTIER RETURNS 54

The set of frontier traded returns other than the MMA return arethe set of the traded returns of maximum absolute Sharpe ratio. Westate this claim more formally below, using the convention that inde-terminate Sharpe ratios are assigned the value zero.

Proposition 2.2.5. R∗t ∈ Rt is a frontier return if and only if

|St−1 [R∗t ]| ≥ |St−1[Rt]| for all Rt ∈ Rt.

Proof. The “only if” part is a corollary of Proposition 2.2.3, ascan easily be seen by taking absolute values on both sides of equation(2.2.4) and using the fact that the absolute correlation is less thanone. (Alternatively, one can show the claim more directly from thedefinition of frontier returns.) The converse is immediate from thedefinitions.

Sharpe ratios are invariant to positions in the MMA. For any αt ∈Lt−1, a mix of αt of a portfolio allocation ψt and 1 − αt in the MMAresults in a return that has the same absolute Sharpe ratio as Rψ

t :

(2.2.6)∣∣∣St−1[R

0t + αt(R

ψt −R0

t )]∣∣∣ = ∣∣∣St−1[R

ψt ]∣∣∣ .

Moreover, as we vary αt we trace out all (conditional) mean-standarddeviation pairs (αtEt−1R

ψt , |αt| stdevt−1[R

ψt ]) of traded returns that are

consistent with the given absolute Sharpe ratio value. In particular,if Rψ = R∗ is a frontier return other than the risk-free return R0

t ,varying αt traces out all traded returns with maximum absolute Sharperatio, that is, all frontier returns. We therefore obtain once again thetwo-fund separation result introduced earlier, but without the non-redundancy assumption. If the contracts are assumed to be everywherenon-redundant, we can use expression (2.2.1) to compute the maximumsquared Sharpe ratio as

(2.2.7) St−1 [R∗t ]

2 = (µt − rt)′Σ−1

t (µt − rt) .

The returns corresponding to frontier returns with positive Sharperatio, or equivalently positive (conditional) expected returns, are knownas (conditionally) mean-variance efficient, since besides minimizingvariance given expected return, they also achieve a maximum expectedreturn given the variance of the return (all conditionally beginning-of-period information). Since the seminal contribution of Markowitz[1952], mean-variance efficiency has played a prominent role as a cri-terion for portfolio choice. The criterion is clearly limited in that itis myopic (only focuses on single-period returns) and uses variance asa measure of portfolio risk. A more sophisticated theory of portfo-lio choice, albeit with its own limitations, is developed in Chapter 3.Mean-variance efficiency resurfaces as a building block of optimal port-folios in the more sophisticated theory for a useful class of return dy-namics in high trading frequency. So the concept is more robust than

Page 55: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.3. STATE-PRICE DENSITIES 55

this section’s discussion suggests. Its empirical implementation, how-ever, has its own serious limitations, which are beyond the scope of thistext. Ultimately, the theory of mean-variance efficiency mainly servesas a parsimonious model that highlights the benefits of diversification.

2.3. State-price densitiesA state price process was defined in Section 1.2 as a representation

of a present-value function. A state price density process is essentiallythe same object, but with its values expressed as a density relative toa given reference probability. This simple construct opens the door forthe use of probabilistic methods.

We continue to take as given a market X in the usual stochasticsetting, consisting of a filtration Ft on the finite state space Ω, whereF0 = ∅,Ω and FT = 2Ω, and a full-support probability P on FT . Astate-price density can be defined as an adapted process π such that astate-price process p is well-defined by letting p (F, t) = π (F, t)P (F )for every spot (F, t). The following equivalent definition bypasses refer-ence to a state-price process and extends directly to infinite state-spacesettings.

Definition 2.3.1. A state-price density process or SPD (rel-ative to the probability P ) is any π in L++ such that a present-valuefunction Π : L → R is well-defined by

(2.3.1) Π(c) =1

π0E[∑T

t=0πtct

], c ∈ L.

In this case, π is said to represent Π.

Proposition 2.3.2. Every present-value function admits an SPDrepresentation, which is unique up to positive scaling.

Proof. Given a present-value function Π, let π denote its (unique)Riesz representation in L with the inner product 〈x | y〉 = E

∑Tt=0 xtyt.

The positivity of Π implies that π ∈ L++, and therefore π is an SPDrepresenting Π. Conversely, if π is an SPD representing Π, then π/π0is the Riesz representation of Π.

A present-value function Π, which is defined from the perspective oftime zero, induces a spot-(F, t) present-value function ΠF,t (Definition1.4.5) for every other spot (F, t). If the SPD π represents Π, then

(2.3.2) ΠF,t (c) =1

π (F, t)E

[T∑u=t

πucu | F

].

This link between conditional valuation and conditional expectationturns out to be methodologically quite useful.

Page 56: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.3. STATE-PRICE DENSITIES 56

In Proposition 1.4.8 we saw that if Π is a present-value functionand (δ, V ) is a traded contract, then Π prices the contract in the sensethat

V (F, t) = ΠF,t (δ) for every spot (F, t) .

If π is an SPD representing Π, then the same pricing condition can beexpressed as

(2.3.3) Vt =1

πtEt[∑T

u=tπuδu

], t = 0, . . . , T − 1.

In this case we say that π prices the contract (δ, V ). The followingvariant of the argument used in Proposition 1.4.8 shows the pricingcondition (2.3.3) directly in a way that extends to infinite sate-spacesettings, where expression (2.3.2) is not generally meaningful. Givenany time t and F ∈ Ft, set to zero the present value of the cash flow(1.3.2) generated by buying the contract at time t on the event F andholding it to time T to find

E [πtVt1F ] = E[(∑T

u=tπuδu

)1F

].

Equation (2.3.3) follows after applying Proposition 2.1.3.The second part of Proposition 1.4.8 can also be restated in terms

of SPDs:

Proposition 2.3.3. A process π ∈ L++ is an SPD for the marketimplemented by a set of contracts if and only if it prices every contractin the given set.

The following link of pricing to martingales plays a central method-ological role, especially in technically more advanced incarnations of thetheory.

Proposition 2.3.4. A process π ∈ L++ prices the contract (δ, V )if and only if Gπ ≡ πV + (πδ)− • t is a martingale.

Proof. Multiply equation (2.3.3) by πt and add∑t−1

u=0 πuδu onboth sides to find Gπ

t = Et [GπT ]. This calculation can be reversed.

Remark 2.3.5. The preceding martingale condition characterizesan SPD for a market implemented by given contracts. Suppose that themarket X is implemented by contracts (δj, V j), j = 1, . . . , J , and letδ =

(δ1, . . . , δJ

)′ and V =(V 1, . . . , V J

)′. Write G for the correspond-ing column vector of gain processes. Then, by the last two propositions,π ∈ L++ is an SPD if and only if Gπ is a martingale, meaning thateach Gjπ = πV j + (πδj)− • t is a martingale. For the synthetic con-tract generated by a trading strategy θ, we have Gθπ = π0V

θ0 + θ •Gπ,

which is a martingale if Gπ is a martingale, since θ is predictable. Thisobservation merely reflects the fact that an SPD prices every tradedcontract, and every synthetic contract is traded. ♦

Page 57: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.3. STATE-PRICE DENSITIES 57

Yet another useful way of expressing the fact that an SPD π pricesthe contract (δ, V ) is in the recursive form

(2.3.4) St−1 = Et−1

[πtπt−1

Vt

], t = 1, . . . , T,

where S ≡ V − δ. The equivalence of this condition to (2.3.3) followsfrom the law of iterated expectations (2.1.4). Alternatively, equation(2.3.4) can be rearranged to Et−1 [∆G

πt ] = 0, which is one of the equiv-

alent ways of stating the martingale property of Gπ.If St−1 6= 0 everywhere, then the return Rt = Vt/St−1 is well-defined

and recursion (2.3.4) implies

(2.3.5) 1 = Et−1

[πtπt−1

Rt

], t = 1, . . . , T.

Conversely, the validity of the pricing restriction (2.3.5) for every re-turn R of a traded contract implies that π prices an arbitrary tradedcontract (δ, S), provided that at every nonterminal spot there existsat least some traded contract with non-zero ex-dividend price. If theevent F ≡ St−1 = 0 is empty, then clearly (2.3.5) with Rt = Vt/St−1

implies (2.3.4). What if F is nonempty? Equation (2.3.5) cannot beapplied directly to the contract (δ, V ). Instead, apply (2.3.5) twice—once to any traded contract with a well-defined return on F , and onceto the return of a portfolio that holds this same contract as well asthe contract (δ, V ) whose ex-dividend price vanishes on F . Solving theresulting pair of equations gives the desired pricing condition (2.3.4).

Applying the pricing equation (2.3.5) to a traded MMA with rateprocess r results in an expression that gives r as a function of an SPD:

(2.3.6) 1

1 + rt= Et−1

[πtπt−1

], t = 1, . . . , T.

This equation can be rearranged to

rt = −Et−1

[∆πtπt−1

]+ εt, where εt ≡

r2t1 + rt

.

If a period in the model represents a sufficiently short time interval,then εt is numerically negligible and the period-t risk-free interest ratert is approximately equal to minus the expected growth rate of the SPDπt over the period, conditionally on the beginning-of-period informa-tion. In this chapter’s final section we will see that this approximationbecomes an exact relationship in continuous-time, where rt representsa continuously compounded interest rate.

Suppose now that π is an SPD, an MMA is traded and therefore theshort-rate process r is given in terms of π as just described. The con-tract (δ, V ) is said to be priced risk neutrally if St−1 = Et−1Vt/ (1 + rt),a relationship that is valid for an MMA. For a general contract, weexpect that the ex-dividend price St must be adjusted relative to the

Page 58: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.3. STATE-PRICE DENSITIES 58

risk neutral valuation to reflect the uncertainty of the end-of-periodpayoff Vt. The appropriate risk adjustment can be written precisely byrearranging equation (2.3.4) to

(2.3.7) St−1 −Et−1Vt1 + rt

= covt−1

[πtπt−1

, Vt

]= covt−1

[∆πtπt−1

, Vt

].

In words, the risk adjustment relative to the risk-neutral valuation isequal to the covariance of the end-of-period value with the SPD growthrate conditionally on the beginning-of-period information.

Assuming the return Rt = Vt/St−1 is well defined, the precedingrelationship is often expressed as a restriction on the excess returnRt −R0

t , where R0 ≡ 1 + r. Using the definition of covariance and thepricing restrictions (2.3.5) and (2.3.6), we find

(2.3.8) Et−1Rt −R0t = −R0

t covt−1

[πtπt−1

, Rt

].

To make the connection to last section’s beta-pricing pricing notion,we need to project πt/πt−1 to the space of traded returns. In thecontext of Section 2.2, we can write πt/πt−1 = R∗

t + ϵt, where R∗t is

a period-t traded return and Et−1 [ϵtRt] = 0 for every period-t tradedreturn Rt (and therefore Et−1ϵt = 0). The existence of a (unique) suchdecomposition follows by a simple projection argument conditionally ateach time-(t− 1) spot, using the inner product of the proof of Lemma2.2.2. Substituting into equation (2.3.8) and dividing the resultingequation by its special case with R∗ in place of R, we obtain the betapricing equation

Et−1Rt −R0t =

covt−1[R∗t , Rt]

vart−1 [R∗t ]

(Et−1R

∗t −R0

t

).

The return R∗t maximizes correlation with πt/πt−1 conditionally on

time-(t− 1) information among all period-t traded returns; a claimthat reduces to the Cauchy-Schwarz inequality in the geometry usedto define R∗

t and ϵt as the components of an orthogonal projection ofπt/πt−1.

Another noteworthy consequence of (2.3.8) is the inequality

(2.3.9) St−1[Rt]2 ≤ (1 + rt)

2 vart−1

[πtπt−1

].

To show it, write the covt−1 of equation (2.3.8) in terms of corrt−1 anduse the fact that the latter is valued in [−1, 1]. Given the short rate rt,the conditional variance of an SPD growth rate places an upper boundon the maximum conditional squared Sharpe ratio of a traded return(an expression for which was derived at the end of the last section).This fact is known as the Hansen-Jagannathan bound.5

5Named after the contribution of Hansen and Jagannathan [1991].

Page 59: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 59

2.4. Equivalent martingale measuresEquivalent martingale measures are another methodologically use-

ful way of representing present-value functions. Prior to introducingthe concept, let us fix, for the entire section, a reference arbitrage-freemarket X and underlying full-support probability P . Suppose that π isan SPD relative to these primitives, as defined in the last section. Sup-pose also, for now, that an MMA is traded in X, with correspondingshort rate process r. Consider any nonterminal spot (F, t− 1), withimmediate successor spots (Fi, t), i = 0, . . . , d, and define the positivescalars Q (Fi | F ) by

(2.4.1) π (Fi, t)

π (F, t− 1)=

1

1 + r (F, t)

Q (Fi | F )P (Fi | F )

, i = 0, . . . , d.

Pricing of the MMA, as in equation (2.3.6), gives

(2.4.2) 1

1 + r (F, t)=

d∑i=0

π (Fi, t)

π (F, t− 1)P (Fi | F ) ,

and therefore∑d

i=0Q (Ft | F ) = 1. In other words, Q (Fi | F ) can bethought of as a positive transition probability from spot (F, t− 1) tospot (Fi, t). For every path on the information tree, which correspondsto a state revealed at the terminal date T , these transitions probabili-ties can be multiplied through, thus defining a full-support probabilityQ, which is the Equivalent Martingale Measure (EMM) correspondingto π. In fact, this construction applies without the assumption thatan MMA is traded by taking equation (2.4.2) to be the definition ofr (F, t).

We write EQ for the expectation operator relative to Q, simplify-ing the conditional expectation notation EQ [x | Ft ] to EQt [x] or EQt x.(Omitting the superscript implies the default reference probability:E = EP .) Identity (2.4.1) can be used to price any traded contract(δ, V ), with S = V − δ, resulting in

(2.4.3) St−1 =EQt−1Vt1 + rt

, t = 1, . . . , T.

This pricing relationship is often referred to as risk-neutral pricing,6although the term is misleading in terms of economic content; the riskassociated with the payoff Vt is priced in equation (2.4.3) just as itwould by the SPD π. In fact, Q can be thought of as representing apure (conditional) price of risk over a single period in the sense thatif at spot (F, t− 1) one enters a forward contract to exchange a time-tfixed payment fi,t for a contingent unit of account that is paid if and

6The idea of risk-neutral pricing already appears in Arrow [1971] and Drèze[1971], and it is exploited in option pricing by Cox and Ross [1976]. The term“equivalent martingale measure” is due to Harrison and Kreps [1979].

Page 60: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 60

only if spot (Fi, t) materializes, then fi,t = Q (Fi | F ). In other words,Q (Fi | F ) is the one-period ahead spot-(F, t− 1)-forward price of aunit payment at spot (Fi, t). The following example shows the role ofthe conditional expectation EQt−1 as a one-period-ahead forward pricingoperator more formally.

Example 2.4.1. Apply the pricing equation (2.4.3) for an assumedtraded contract (δ, V ) = (D − f,D − f), where D is an adapted pro-cess and f is a predictable process, with f0 = D0 = 0. For everyt > 0, we interpret Dt as the time-t value of some asset and ft as thetime-(t− 1) forward price of the asset for time-t delivery. Buying thecontract at time t − 1 and selling it at time t results in the net cashflow x, where xu = 0 for u 6= t and xt = Dt − ft. From the perspectiveof time t − 1, the trade is equivalent to entering a forward contractfor time-t delivery of the asset at the forward price ft, whose valueis determined at time t − 1. The pricing equation (2.4.3) in this casereduces to ft = EQt−1Dt, t = 1, . . . , T . Note that this argument doesnot apply in general if we extend the delivery of the asset to two ormore periods, since the interest r can vary stochastically from periodto period.

The development so far relies critically on the finite informationtree and the arbitrary use of a state-price density. We now introducean equivalent but more direct EMM definition that extends readily toinfinite state-space versions of the theory. Q denotes the set of allfull-support probability measures on 2Ω.

Definition 2.4.2. A discount process is a predictable strictlypositive process ρ such that ρ0 = 1. An equivalent martingale mea-sure7 (EMM) is a probability Q ∈ Q relative to which there existsa predictable SPD. An EMM-discount pair is a (Q, ρ) ∈ Q × Psuch that ρ is both a discount process and an SPD relative to Q.The present-value function represented by this pair is defined byΠ(c) = EQ

[∑Tt=0 ρtct

], c ∈ L.

Last section’s results on pricing using state-price densities all applyto an EMM-discount pair (Q, ρ), with simplifications resulting from thefact that ρ is predictable. For example, if an MMA with rate process ris traded, then the pricing equation (2.3.6) with P replaced by Q and

7The terms “equivalent” and “martingale measure” have their origin in prob-ability theory. A probability (measure) Q is said to be equivalent to P ifQ (A) = 0 ⇐⇒ P (A) = 0 for every event A. In our setting, for any P ∈ Q,the set of all equivalent-to-P probabilities on 2Ω is Q. A martingale measure isa probability relative to which a given set of processes is a set of martingales. Inthe current context, such a set is that of the properly discounted gain processes ofall traded contracts, as discussed at the end of this section.

Page 61: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 61

π replaced by ρ results in 1/(1 + rt) = ρt/ρt−1 and therefore

(2.4.4) r = −∆ρ

ρand ρt =

t∏u=0

1

1 + ru, t = 1, . . . , T.

In general, given discount process ρ, we refer to the r ∈ P0 defined bythe first equation in (2.4.4) as the rate process implied by ρ, andgiven r ∈ P0, we refer to the ρ defined by ρ0 = 1 and the second equa-tion in (2.4.4) as the discount process implied by r. The precedingdiscussion shows

Proposition 2.4.3. If r is the market’s short-rate process and(Q, ρ) is an EMM-discount pair, then r is the rate process implied by ρ.

Whether an MMA is traded or not, an EMM-discount pair (Q, ρ)prices a traded contract according to (2.4.3), where r is the rate processimplied by ρ, as can be seen by formally replacing P by Q and π byρ in equation (2.3.4). Last section’s other pricing relationships cansimilarly be stated in terms of an EMM-discount pair.

Let us now establish a correspondence between SPDs relative to Pand EMM-discount pairs, defined by the requirement that they rep-resent the same present-value function. Not surprisingly, this corre-spondence is the same as that of equations (2.4.1) and (2.4.2), thusestablishing the equivalence of the EMM notion of Definition 2.4.2 andthat of this section’s opening discussion. As preparation, we reviewtwo generally useful probabilistic constructs, starting with a change-of-measure formula.

Associated with every probability Q ∈ Q is the density8 dQ/dP ,which is the strictly positive unit-mean random variable

dQ

dP(ω) =

Q (ω)P (ω)

, ω ∈ Ω,

and the conditional density process

(2.4.5) ξt = Et[dQ

dP

], t = 0, 1, . . . , T.

Since FT = 2Ω, ξT = dQ/dP and therefore ξ determines Q and con-versely. By the law of iterated expectations (2.1.5), ξ is a unit-meanstrictly positive martingale (under P ).

Lemma 2.4.4. For every Q ∈ Q and adapted process x,(2.4.6) EQ [xt] = E [ξtxt] , t = 0, . . . , T.

Proof. For every random variable z, we have

EQz =∑ω∈Ω

z (ω)Q (ω) =∑ω∈Ω

Q (ω)P (ω)

z (ω)P (ω) = E[dQ

dPz

].

8Also known as the Radon-Nikodym derivative of Q with respect to P .

Page 62: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 62

The law of iterated expectations and the fact that xt ∈ Lt can now beused to argue that

EQxt = EEt[dQ

dPxt

]= E

[Et[dQ

dP

]xt

]= E [ξtxt] .

Remark 2.4.5. For every spot (F, t), the last lemma with xt = 1F

implies that Q (F ) = EQ1F = E [ξt1F ] = ξ (F, t)P (F ). Therefore,the value ξ (F, t) that ξt takes on the event F is the likelihood ratioQ (F ) /P (F ), which can be thought of as the ratio of the Q-probabilityof the path on the information tree leading from time zero to spot(F, t) to the P -probability of the same path. The random variableξT = dQ/dP has the same interpretation for terminal spots, which areof the form (ω , t), ω ∈ Ω.

Remark 2.4.6. We saw that a Q ∈ Q defines the density dQ/dP,which is strictly positive and satisfies E [dQ/dP ] = 1. Conversely, astrictly positive random variable Z such that EZ = 1 defines a uniqueQ ∈ Q such that Z = dQ/dP ; it is given by Q (F ) = E [Z1F ] for everyevent A.

The second probabilistic construction we need to relate SPDs toEMMs is the following multiplicative version of the Doob decomposi-tion. P++ denotes the set of all strictly positive predictable processes,and M denotes the set of all martingales under P .

Lemma 2.4.7. Every strictly positive adapted process π admits aunique decomposition of the form(2.4.7) π = π0ρξ, ρ ∈ P++, ξ ∈ M, ξ0 = 1.

Proof. Given π ∈ L++, let ρ ∈ P++ be defined recursively by

(2.4.8) ρ0 = 1 and ρtρt−1

=Et−1πtπt−1

, t = 1, . . . , T.

Define the process ξ so that π = π0ρξ and therefore ξ0 = 1. Thesecond equality in (2.4.8) is equivalent to the martingale propertyξt−1 = Et−1ξt. This proves the existence of decomposition (2.4.7) andalso its uniqueness, since the martingale property of ξ implies that ρmust be given by (2.4.8).

The last two lemmas together reveal the relationship between anSPD π representing a present-value function Π and a correspondingEMM-discount pair (Q, ρ) also representing Π: If dQ/dP = ξT and ρand ξ are as in decomposition (2.4.7), then for every cash flow c,

Π(c) =1

π0E[∑T

t=0πtct

]= E

[∑T

t=0ρtξtct

]= EQ

[∑T

t=0ρtct

].

Here is a more detailed statement of this conclusion.

Page 63: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 63

Proposition 2.4.8. Suppose π is an SPD representing the present-value function Π. Let ρ be defined by (2.4.8), and let

(2.4.9) Q (F ) = E [ξT1F ] , F ∈ FT , where ξT ≡ 1

ρT

πTπ0.

Then (Q, ρ) is an EMM-discount pair representing Π. Conversely,suppose (Q, ρ) is an EMM-discount pair representing the present-valuefunction Π. Let ξ denote the conditional density process of Q relativeto P , and let π0 be an arbitrary positive scalar. Then π = π0ρξ is anSPD representing Π.

Corollary 2.4.9. For every present-value function Π, there existsa unique Q in Q such that (Q, ρ) is an EMM-discount pair representingΠ for some (necessarily unique) discount process ρ.

The relationship π = π0ρξ of the preceding proposition connectsthe EMM notion of Definition 2.4.2 to the earlier construction of equa-tions (2.4.1) and (2.4.2). To see how, consider any nonterminal spot(F, t− 1) with immediate successor spots (Fi, t), i = 0, . . . , d. Bayes’rule and Remark 2.4.5 imply

(2.4.10) Q (Fi | F )P (Fi | F )

=Q (Fi) /Q (F )

P (Fi) /P (F )=

ξ (Fi, t)

ξ (F, t− 1), i = 0, . . . , d.

Therefore, if r is the rate process implied by ρ, then equation (2.4.1)applied to all nonterminal spots is the same as π/π− = (ρξ) / (ρξ)−,which is an equivalent recursive expression of the condition π = π0ρξ.

At the heart of the connection between pricing in terms of an SPDand pricing in terms of an EMM is a probabilistic change-of-measureformula, such as Lemma 2.4.4. We now review two other versions ofthe change-of-measure formula, each offering another insight on therelationship between pricing in terms of SPDs and EMMs.

We have encountered the pricing of risk in terms of covariances inequation (2.3.7), and in terms of an expectation under an EMM inequation (2.4.3). The connection between the two is made directly bythe following conditional change-of-measure formula, which on a finiteinformation tree is an immediate consequence of equation (2.4.10). Theresult applies in more general stochastic settings, however, where anexpression like (2.4.10) is not meaningful. The following alternativeargument applies to such settings and offers a chance to practice formalproperties of conditional expectations.

Lemma 2.4.10. For every Q ∈ Q, V ∈ L and time t > 0,

(2.4.11) EQt−1Vt = Et−1

[ξtξt−1

Vt

]= Et−1Vt + covt−1

[ξtξt−1

, Vt

],

where ξ is the conditional density process of Q relative to P .

Page 64: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 64

Proof. Let y = Et−1 [ξtVt] /ξt−1. For all z ∈ Lt−1, Lemma 2.4.4,the conditional expectation properties of Proposition 2.1.6 and the factthat yz ∈ Lt−1 imply

EQ [yz] = E [ξtyz] = E [Et−1 [ξtyz]] = E [Et−1 [ξt] yz] = E [ξt−1yz]

= E [Et−1 [ξtVt] z] = E [Et−1 [ξtVtz]] = E [ξtVtz] = EQ [Vtz] .

By Proposition 2.1.3, it follows that y = EQt−1Vt. The second equationin (2.4.11) follows from the definition of the conditional covariance andthe fact that Et−1ξt/ξt−1 = 1.

Now insert the SPD factorization π = π0ρξ, where ρ is a discountprocess with implied rate process r and ξ is a strictly positive unit-meanmartingale, in the pricing equation (2.3.7) to find

(2.4.12) St−1 =1

1 + rt

(Et−1Vt + covt−1

[ξtξt−1

, Vt

]).

This is the same as equation (2.4.3) if Q is the probability defined byξT = dQ/dP, since the term in parentheses is equal to EQt−1Vt by theLemma just shown.

Yet another way in which state-pricing is expressed is through themartingale property of properly discounted gain processes. The con-nection between EMMs and SPDs in those terms is made through aprobabilistic result on the martingale property after a change of mea-sure reviewed below. For simplicity, let us assume that the market X isimplemented by 1 + J contracts, where contract zero is an MMA, justas described in Section 1.7, whose notation we adopt here. Considerany π ∈ L++ with decomposition (2.4.7), where ξ is the conditionaldensity process of Q ∈ Q. As an SPD, π prices the MMA if andonly if the discount process ρ and the short rate process r are relatedby (2.4.4), a condition we henceforth assume. By Proposition 2.3.3,π is an SPD if and only if it prices the remaining J contracts withunderlying probability P , if and only if ρ prices the same J contractswith underlying probability Q. Therefore, by Proposition 2.3.4, π is anSPD if and only if Gπ ≡ πV +(πδ)− • t is a martingale (under P ); andQ is an EMM if and only if9 Gρ ≡ ρV + (ρδ)− • t is a Q-martingale;that is, a martingale under Q. The fact that π is an SPD if and only ifQ is an EMM corresponds to the purely probabilistic fact that Gπ is aP -martingale if and only if Gρ is a Q-martingale, which can be showndirectly as a corollary of Lemma 2.4.10.

9It is this equivalence that justifies the term “martingale measure” for Q. The“equivalent” qualification comes from probability theory, where two probabilitiesare said to be equivalent if they vanish on exactly the same events. Since P isassumed to have full support, Q is the set of probabilities that are equivalent toP . In infinite state-space settings, the definition of full support does not apply asgiven for the finite case. For example, under the uniform distribution on the unitinterval, every finite or countable event has zero probability.

Page 65: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.4. EQUIVALENT MARTINGALE MEASURES 65

Example 2.4.11. (American call) This example illustrates howsome basic facts from martingale theory can provide a dual argu-ment showing the American call late exercise result of Example 1.5.5.Throughout this example, we restrict the time horizon to be the Amer-ican call maturity τ (rather than T ), and we let T denote the set ofevery stopping time valued in 0, . . . , τ. From Section 2.1, recall thatevery adapted process has a unique Doob decomposition, whose pre-dictable part defines its compensator. A submartingale is an adaptedprocess x with an increasing compensator xp, meaning ∆xpt ≥ 0 or,equivalently, Et−1∆xt ≥ 0 (at every state and time t > 0). We will usetwo basic facts about a submartingale x.

Fact 1: τ ∈ T implies Exτ ≥ Exτ .

Proof. Define the martingale M ≡ x − xp and given any τ ∈ T ,the process ϕt = 1τ<t≤τ (with 1∅ ≡ 0). Since ϕ is predictable, ϕ•M isa zero-mean martingale and 0 = E (ϕ •M)T = E [Mτ −Mτ ]. ThereforeEMτ = EMτ and, since xp is increasing, Exτ ≥ E [Mτ + xpτ ] = Exτ .

Fact 2: Suppose the function f : R → R is increasing and convex.Then f (x) is also a submartingale.

Proof. The submartingale property Et−1∆xt ≥ 0 and the assump-tion that f is increasing imply f (Et−1xt) ≥ f (xt−1). The claim istherefore a corollary of Jensen’s inequality: Et−1f (xt) ≥ f (Et−1xt).Using only the convexity of f , note that the right derivative of f atEt−1xt defines dt−1 ∈ Lt−1 as the slope of a supporting line of the graphof f at Et−1xt and therefore f (xt) ≥ f (Et−1xt) + dt−1 (xt − Et−1xt).Applying Et−1 on both sides results in Jensen’s inequality.

Consider now the American call of Example 1.5.5 with the under-lying stock paying no dividends up to τ . Assume also that there is atraded MMA with a non-negative interest rate process, which meansthat the corresponding (predictable) discount process ρ is decreasing(s > t implies ρs ≤ ρt). For every EMM Q, we will show that τmaximizes the corresponding present value:

EQ[ρτ (Sτ −K)+

]= max

τ∈TEQ[ρτ (Sτ −K)+

].

By Theorem 1.5.2, this proves that not exercising the call prior to ma-turity is a dominant choice. Fixing the reference EMM Q, let us selectthe underlying probability P to equal Q. The no-dividends assumptionimplies that ρS is a martingale (up to the assumed time-horizon τ),and since ρK is a predictable decreasing process, ρS − ρK is a sub-martingale. The function that takes the positive part is convex andincreasing. By Fact 2, x ≡ ρ (S −K)+ is a submartingale, and byFact 1, Exτ ≥ Exτ . ♦

Page 66: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 66

2.5. Predictable representationsIn linear algebra it is often convenient to introduce a linear basis and

refer to vectors by their representation relative to the reference basis. Inwhat is essentially the same idea applied at each node of the informationtree, in this section, we introduce processes representing risk sourcesthat span all uncertainty. We show how every adapted process canbe represented in terms of its exposure to these risk sources, known asvolatility, and we discuss how the market prices volatility. The resultingscaffolding is especially useful as we pass from high-frequency modelsto the continuous-time limit. Unless otherwise indicated, expectations,variances, covariances and the martingale property are all relative to agiven underlying full-support probability P . Recall that M0 denotesthe set of zero-mean martingales and P0 denotes the set of predictableprocesses that take the value zero at time zero.

The essential idea behind predictable representations is easiest tosee in the one-period case, with a single time-zero spot and 1+ d time-one spots. A random variable in this case is any vector in R1+d. Letthe column vector b =

(b1, . . . , bd

)′ list the elements of an orthonormalbasis of the linear subspace of all zero-mean random variables with thecovariance inner product. By construction, the bi are uncorrelated toeach other and they each have zero mean and unit variance. In thissimple case, an M ∈ M0 can be represented as M1 = ∆M1 = σb forsome row vector σ ∈ R1×d, necessarily given by σ = E [∆M1b

′].The same construction can be applied over the single period follow-

ing any non-terminal spot on the filtration Ft, for any time horizon T .For simplicity, we assume that every nonterminal spot has exactly 1+dimmediate successor spots. We call such a filtration uniform withspanning number 1+d. Focusing on any nonterminal spot (F, t− 1)and its immediate successor spots (Fi, t), i = 0, . . . , d, we apply theearlier single-period construction conditionally on F . Adding the sub-script F, t, we end up with the vector bF,t = (b1F,t, . . . , b

dF,t)

′, where thebiF,t are random variables taking the value zero outside F and suchthat E[ biF,t | F ] = 0, E[ (biF,t)2 | F ] = 1, and E[ biF,tb

jF,t | F ] = 0 for

i 6= j. For every M ∈ M0, ∆Mt = σF,tbF,t on the event F , whereσF,t = E[ ∆Mtb

′F,t | F ]. The bases bF,t are conveniently stitched to-

gether by defining the process B = (B1, . . . , Bd)′, where Bi0 = 0 and

∆Bit1F = biF,t for every nonterminal spot (F, t− 1).The fact that the biF,t have conditionally zero mean is equivalent to

B ∈ Md0.

The conditional second moment conditions satisfied by the biF,t areequivalent to the condition that B is dynamically orthonormal:

(2.5.1) Et−1

[∆Bi

t∆Bjt

]=

1 if i = j,0 if i 6= j,

t = 1, . . . , T.

Page 67: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 67

The spot-by-spot representations of the increments of a martingale Mbecome

(2.5.2) M = σ •B, σ ∈ P1×d0 ,

where necessarily σt = Et−1 [∆Mt∆B′t] . Expression (2.5.2) is known

as the predictable martingale representation of M (relative toB), and B is said to have the martingale representation property.Combining the latter with the Doob decomposition of an adapted pro-cess, as defined in Section 2.1, we have the following representationresult.

Proposition 2.5.1. The underlying filtration is uniform with span-ning number 1+d if and only if it is generated by a B ∈ Md

0 satisfyingthe conditional orthonormality condition (2.5.1). In this case everyadapted process x can be uniquely represented in the form

(2.5.3) x = x0 + µ • t + σ •B, µ ∈ P0, σ ∈ P1×d0 .

The processes µ and σ can be computed in terms of x by

(2.5.4) µt = Et−1 [∆xt] and σt = Et−1 [∆xt∆B′t] , t = 1, . . . , T.

Proof. We have already shown the existence of a conditionallyorthonormal B ∈ Md

0 with the martingale representation property.Representation (2.5.3) results from the Doob decomposition of x withµ = ∆xp. To show that B generates the underlying filtration, we mustshow that for every time t, the realization of B1, . . . , Bt reveals therealized time-t spot. We argue by induction. For t = 0, the claim isvacuously true. Suppose it is true for t− 1 and ω is the realized state,corresponding to spots (F, t− 1) and (G, t), where ω ∈ G ⊆ F . Theinductive hypothesis means that (F, t− 1) is the unique spot that isconsistent with Bs (ω) for s < t. Let x be the adapted process thattakes the value zero everywhere except for x (G, t) = 1. Representa-tion (2.5.3) of x means that we can write ∆xt = µt + σt∆Bt, whereµt and σt take a constant value on F , revealed by Bs (ω) for s < t. Ifin addition we know B (G, t), then the value x (G, t) is also revealed,and hence the identity of the spot (G, t). This completes the inductiveproof that B generates the underlying filtration. Finally, the necessityof the condition that each nonterminal spot has exactly d + 1 imme-diate successor spots follows by a dimensionality argument. Considerany spot (F, t− 1) with immediate successors (G0, t) , . . . , (Gk, t). Byrepresentation (2.5.3), the set of Ft-measurable random variables thattake the value zero outside F and have zero conditional expectationgiven F is a d-dimensional linear space. The same linear space canbe identified with the set of every vector (x (G0, t) , . . . , x (Gk, t)) sat-isfying

∑i x (Gi, t)P [Gi | F ] = 0, which is k-dimensional. Therefore

k = d.

Page 68: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 68

We henceforth fix a reference B ∈ Md0 having the properties of

the preceding proposition. We call such a martingale a dynamicallyorthonormal basis of M0 and the representation (2.5.3) as the pre-dictable representation or the dynamics of x (relative to B). Theprocesses µ and σ are, respectively, the drift and volatility of x.Clearly, an adapted process is a martingale if and only if its drift iszero.

Consider now a state price density π and corresponding decomposi-tion π = π0ρξ, where ρ is a (predictable) discount process and ξ is theconditional density process of the EMM Q associated with π. We areinterested in the predictable representation of π, how it relates to thedynamics of ρ and ξ, and finally how it can be used to price a tradedcontract. We saw in the last section that if there is a traded MMAthen the short rate process is r = −∆ρ/ρ. Let r ∈ P0 be defined bythis equation, whether an MMA is traded or not. Simple algebra showsthat

∆π

π−=

ρ

ρ−

(∆ρ

ρ+

∆ξ

ξ−

)= − 1

1 + r

(r − ∆ξ

ξ−

).

The last term leads us to the process (1/ξ−) • ξ, which is a zero-meanmartingale (why?). Define the process η by the corresponding uniquepredictable representation

(2.5.5) 1

ξ−• ξ = −η′ •B, η ∈ Pd

0 .

The state-price dynamics can then be expressed as

(2.5.6) ∆π

π−= − 1

1 + r(r + η′∆B) .

By Proposition 2.5.1 and Lemma 2.4.10,

(2.5.7) ηt = −Et−1

[ξtξt−1

∆Bt

]= −EQt−1 [∆Bt] .

Equivalently,(2.5.8) B + η • t is a Q-martingale.This conclusion is an instance of what in a more general setting isknown as the Girsanov-Lenglart theorem.

Conversely, the process η determines the probability Q, since rep-resentation (2.5.5) can be restated as the recursion ξ/ξ− = 1 − η′∆B,which can be multiplied through, starting with ξ0 = 1, to obtain

(2.5.9) ξt =t∏

s=0

(1− η′s∆Bs) , t = 0, 1, . . . , T.

In particular, η determines ξT = dQ/dP and hence Q. Not everyη ∈ Pd

0 defines a Q ∈ Q in this manner, however, since ξ must be

Page 69: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 69

strictly positive. In fact, the preceding correspondence between η andQ defines a bijection between Q and the set(2.5.10) H =

η ∈ Pd

0 : 1− η′∆B ∈ L++

.

In equations (2.4.4) of the last section, we saw that every rate pro-cess r ∈ P0 implies a unique discount process ρ (Definition 2.4.2) andvice versa. The preceding construction shows that every Q ∈ Q de-fines a unique η ∈ H by letting ηt = −EQt−1 [∆Bt], and conversely,every η ∈ H defines a Q ∈ Q where dQ/dP = ξT is defined by (2.5.9).With Q ∈ Q and η ∈ H so related, we refer to η as the predictablerepresentation of Q, and to Q as the probability represented by η.We also say that the pair (Q, ρ) implies a pair (η, r) ∈ H × P0 andvice versa.

Definition 2.5.2. The process η ∈ H is a market price of riskprocesses if it represents an EMM Q, in which case, we call η the marketprice of risk process implied by the EMM Q.

In the last two sections, we saw a number of equivalent expressionsof what it means for an SPD π or corresponding EMM-discount pair(Q, ρ) to price a given contract (δ, V ). One of those is that Gρ =ρV + (ρδ)− • t is a Q-martingale. Let us now formulate the samecondition in terms of the implied (η, r) ∈ H×P0 and the gain processpredictable representation(2.5.11) G ≡ V + δ− • t ≡ V0 + µG • t + σG •B.A direct calculation of the increment ∆Gρ shows that

Gρ =(ρ(µG − rS− − σGη

))• t +

(ρσG

)• (B + η • t) .

By (2.5.8), Gρ is a Q-martingale if and only if the first term, whichis the drift of Gρ under Q, is zero. Summarizing, we have shown that(Q, ρ) prices the contract (δ, V ) if and only if(2.5.12) µG = rS− + σGη.

In words, the conditional expected gain over one period is the risk-free interest on the beginning-of-period ex-dividend value plus a riskadjustment that is measured by the volatility σG of the gain processpriced by η.

We arrived at the pricing condition (2.5.12) through the martin-gale property of Gρ under Q, an approach that has the advantage ofgeneralizing to continuous-time formulations. The reader may wish toexplore more direct links between the pricing of a contract and equa-tion (2.5.12), for example, through the pricing condition (2.4.12). Ingeneral, the pricing of a contract corresponds to a backward recursionon the information tree, corresponding to the requirement that currentvalue be equal to the present value of all future dividends up to sometime plus the present value of the contract value at that time. That

Page 70: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 70

condition (2.5.12) represents a backward recursion becomes apparentonce we note that Gt−1 = Et−1Gt − µGt and σGt = Et−1 [Gt∆Bt]. Thusknowing the end-of-period value Gt gives us σGt , which in turn givesus µGt by (2.5.12), and finally gives us Gt−1. In general, any conditionthat specifies the drift as a function of the volatility represents a back-ward recursion. This simple insight is particularly useful in gainingintuition behind continuous-time formulations, where the notion of asingle-period backward recursion is not directly available.

Assuming St−1 is nonzero everywhere, pricing condition (2.5.12) isoften expressed in terms of the return dynamics

(2.5.13) ∆Gt

St−1

= µRt + σRt ∆Bt, µR ∈ P0, σR ∈ P1×d0 .

In this case, equation (2.5.12) can be restated as

(2.5.14) µR − r = σRη,

which can more directly be viewed as a corollary of the expected ex-cess returns expression (2.3.8) and the SPD dynamics (2.5.6). Thus(conditional) expected excess returns relative to the risk-free rate overa single period are explained by exposure to the risk sources ∆Bt asmeasured by the return volatility σR. In the literature, this type ofpricing is also known as factor pricing, with ∆Bt being the (risk) fac-tors, the volatility representing (conditional) factor loadings, and ηtrepresenting the factor prices.

In arbitrage-pricing applications it is common to assume that themarket is implemented by exogenously specified contracts. Let usadopt the setting of Section 1.7, where X is implemented by 1 + Jcontracts, with contract zero being an MMA defining the short-rateprocess r and implied discount process ρ. Modifying our earlier no-tation, let δ =

(δ1, . . . , δJ

)′ and define V,G ∈ LJ analogously, withG having dynamics (2.5.11), where µG ∈ PJ

0 and σG ∈ PJ×d0 . Given

Q ∈ Q with predictable representation η ∈ H, we know that Q is anEMM if and only if it prices contracts 1 through J , a condition thatis in turn equivalent to η satisfying equation (2.5.12). Therefore themarket is arbitrage-free if and only if (2.5.12) admits a solution η ∈ H,and such a solution is unique if and only if the market is complete. Ofcourse, the computational details of the determination of η depends onthe details of the specification, which typically includes a Markovianstructure, as discussed in the following section.

Remark 2.5.3. The existence of some η ∈ P0 satisfying equa-tion (2.5.12) does not exclude arbitrage opportunities; the positivitypart of the definition of H is essential. The existence of of some η ∈ P0

satisfying equation (2.5.12) is equivalent to the weaker condition

(2.5.15) for all θ ∈ P1×J0 , θσG = 0 =⇒ θ

(µG − rS−

)= 0.

Page 71: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 71

To see the necessity of this condition, use an orthogonal decomposi-tion at each spot to write µG − rS− = σGη + ε′, for some η ∈ Pd

0

and ε ∈ P1×J0 such that εσG = 0 and therefore ε

(µG − rS−

)= 0.

Since ε(µG − rS−

)= εε′, it follows that ε = 0, proving (2.5.15). The

converse claim is immediate. Condition (2.5.15) states that every trad-ing strategy with zero volatility must produce the same returns as theMMA. In other words, there is no MMA arbitrage. ♦

For a typical arbitrage-pricing application, consider a market makerwho can trade in the 1+ J contracts implementing the market X, andsells contract (δ∗, V ∗) to a customer at time zero. We assume that Xis arbitrage-free and contract (δ∗, V ∗) is synthetic in X, meaning thatthere exists a trading strategy (θ0, θ) such that (δ∗, V ∗) =

(δθ, V θ

). (By

Proposition1.6.8, the contract is synthetic if and only if it is traded.)The customer could in principle bypass the market maker by followingtrading strategy (θ0, θ), but the idea is that the customer is less qual-ified to carry out the necessary transactions, for example, because oflack of sophistication, market access or time. In a perfectly competi-tive, frictionless market-making market, the market maker charges V ∗

0

for the contract at time zero10 and delivers the cash flow δ∗. The mar-ket maker entirely hedges the liability risk resulting from this sale bypurchasing back the synthetic contract

(δθ, V θ

). The market maker’s

pricing model consists of the short-rate process r and the gain or returndynamics specified earlier, which imply a market price of risk process ηand associated EMM Q. This data can be used to recursively computeV ∗ as the present value process of δ∗. Given V ∗, the market maker cancompute the hedging strategy in two steps. The position in the MMAis dictated by the replication condition V ∗ = V θ, which is the fact thatthe (cum-dividend) value in the MMA plus the value of the remainingpositions in the J contracts must equal the value of the sold contract.Therefore,

(2.5.16) θ0t =V ∗t − θtVt1 + rt

, t = 1, . . . , T ; θ00 = 0.

The remaining trading strategy θ is computed to eliminate volatilityperiod by period. To see why, define the gain-process predictable rep-resentation(2.5.17) G∗ ≡ V ∗ + δ∗− • t ≡ V ∗

0 + µ∗ • t + σ∗ •Band use the budget equation in the form of Proposition 1.6.3 to find(2.5.18) V ∗ − θV = predictable term +

(σ∗ − θσG

)•B.

10In practice, the market maker would charge a small spread over V ∗0 reflecting

trading and other business costs and possibly barriers to entry by other marketmakers. The assumption here is that for any individual trade such a spread isnegligible, although the same spread multiplied by a large number of transactionsmay not be negligible.

Page 72: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.5. PREDICTABLE REPRESENTATIONS 72

By (2.5.16), V ∗ − θV is predictable and therefore

(2.5.19) σ∗ = θσG,

which is assumed to have a solution in θ. (In the simplest formulation,σG is invertible.)

Assuming the market is arbitrage-free, a corollary of the preced-ing construction is that every synthetic contract is uniquely gener-ated by a trading strategy if and only if the rows of the J × d matrixσG (ω, t) are linearly independent for all (ω, t) ∈ Ω × 1, . . . , T . Inthis case, we say that the contracts (δ1, V 1) , . . . ,

(δJ , V J

)are dynam-

ically independent. Another corollary, via Proposition 1.6.7, is thatthe market is complete if and only if the rank of σG (ω, t) is d for all(ω, t) ∈ Ω × 1, . . . , T. In particular, the MMA (δ0, V 0) togetherwith the dynamically independent contracts (δ1, V 1) , . . . ,

(δJ , V J

)im-

plement a complete market if and only if J = d and σG (ω, t) is invert-ible for all (ω, t) ∈ Ω×1, . . . , T . In this case, for every j ∈ 1, . . . , J,the contract (δj, V j) is necessarily everywhere risky, meaning thatfor every time t > 0 and every nonempty event F ∈ Ft, the randomvariable V j

t 1F is not Ft−1-measurable.We close with an argument that leverages this section’s apparatus

to easily show that every complete market on the given filtration can beimplemented by 1 + d contracts. In contrast, if we were to implementa complete market by forming portfolios at time zero only, as manycontracts as there are non-time-zero spots would be required, a numberthat rises exponentially in T if d > 0, rendering the assumption of aperfectly competitive market implausible.

Theorem 2.5.4. Suppose the underlying filtration is uniform withspanning number 1+d. An arbitrage-free market is complete if and onlyif it can be implemented by a money-market account and d dynamicallyindependent everywhere risky contracts.

Proof. The “if” part follows from the preceding discussion. Con-versely, suppose the market X is complete and arbitrage-free, and let(Q, ρ) be the corresponding unique EMM-discount pair implying theshort-rate process r and market price of risk process η. Market com-pleteness implies that a contract is traded (in X) if and only if it ispriced by (Q, ρ) (why?). The MMA (δ0, V 0) is defined by r in (1.7.3).Clearly, (δ0, V 0) is priced by (Q, ρ) and is therefore traded. Definethe everywhere risky contracts (δ, V ) ∈ Ld×2 by letting δ− = 0 andGρ = ρV = B + η • t. By (2.5.7), Gρ is a Q-martingale and there-fore, by Proposition 2.3.4, (Q, ρ) prices the contracts (δ, V ) , which aretherefore traded. The market implemented by the contracts (δ0, V 0)and (δ, V ) is included in X and is complete and is therefore equal toX.

Page 73: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.6. INDEPENDENT INCREMENTS AND THE MARKOV PROPERTY 73

2.6. Independent increments and the Markov propertyOn a uniform filtration with spanning number 1 + d, where d > 0,

the number of spots rises exponentially with the number of periods.A spot-by-spot backward recursion that determines some value Vt interms of Vt+1 may be conceptually trivial, but computationally infea-sible as t increases due to an explosion in the number of spots. Inempirical applications, another concern is that a parameter can de-pend in too many ways on the entire history path, leading to dataoverfitting. These issues are commonly handled by introducing a so-called Markovian structure, where the entire history is summarized bya relatively low-dimensional state vector.

We review the basic idea in last section’s setting, where the underly-ing filtration is generated by the d-dimensional dynamically orthonor-mal basis B, where d > 0, relative to the full-support probability P .We further assume, throughout this section, that B has independentincrements (relative to P ), meaning that for every time t > 0 andfunction f : Rd → R,

(2.6.1) Et−1 [f (∆Bt)] = E [f (∆Bt)] .

Note that by Proposition 1.1.4 and Corollary 2.1.5, B has independentincrements if and only if for every time t > 0, the algebras σ (∆Bt) andFt−1 = σ (∆B1, . . . ,∆Bt) are stochastically independent. While condi-tion (2.6.1) is all we need to apply the independent-increments propertyin this section, Proposition 2.1.2 explains the terminology: B has inde-pendent increments if and only if the random variables ∆B1, . . . ,∆BT

are stochastically independent. Example 2.1.8 gives a prototypical in-dependent increments process. As in that example, an independentincrements process B such that E∆Bt = 0 for all t > 0 is necessar-ily martingale, but an arbitrary martingale need not have independentincrements.

Recall that a time-t spot corresponds to a specific realization of theentire history of B up to that spot. There are (1 + d)t such histories.The idea is to hypothesize that at each time-t there is a k-dimensionalrandom variable Zt that summarizes all that is relevant of the historyof B up to t, where k is a computationally manageable positive integer.Fixing an initial value Z0 ∈ Rk, we assume that Z is the k-dimensionaladapted process defined recursively by

(2.6.2) ∆Zt = µZt (Zt−1) + σZt (Zt−1)∆Bt, t = 1, . . . , T,

for given functions µZt : Rk → Rk and σZt : Rk → Rk×d. This construc-tion guarantees that Z is a Markov process (relative to P ), meaningthat for every time t > 0 and function f : Rk → R,

(2.6.3) Et−1f (Zt) = E [f (Zt) | Zt−1] .

Page 74: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.6. INDEPENDENT INCREMENTS AND THE MARKOV PROPERTY 74

The Markov property (2.6.3) formalizes the idea that every statisticof Zt can be calculated knowing only Zt−1 rather than the entire pathof B up to time t − 1. Note that equation (2.6.3) holds if and only ifEt−1f (Zt) is σ (Zt−1)-measurable.

The claim that Z is a Markov process relative to P is a corollaryof the following more general result.

Lemma 2.6.1. Suppose B has independent increments relative toP , and Q ∈ Q has a conditional density process ξt = Et [dQ/dP ] thatsatisfies

∆ξtξt−1

= −ηt (Zt−1)′ ∆Bt, ξ0 = 1,

for some ηt : Rk → Rd×1, t = 1, . . . T . Then Z is a Markov processrelative to Q.

Proof. Consider any time t > 0, vector z ∈ Rk and functionh : Rk → R. Using the conditional change of measure formula ofLemma 2.4.10 and recursion (2.6.2) for Z, note that on the eventZt−1 = z, the conditional expectation EQt−1 [h (Zt)] is equal to

Et−1

[(1− η (z)′t∆Bt

)h(z + µZt (z) + σZt (z)∆Bt

)].

Since ∆Bt is independent of Ft−1, the above quantity is constant onZt−1 = z and therefore so is EQt−1 [h (Zt)]. This proves that EQt−1 [h (Zt)]is σ (Zt−1)-measurable.

For each time t, the set of possible time-t Markov states is Nt ≡Zt (ω) : ω ∈ Ω. For x ∈ L, we abuse notation and write xt = xt (Zt)to mean that there exists a function xt : Nt → R such that xt (ω) =xt (Zt (ω)) for all ω ∈ Ω, or, equivalently, that xt is σ (Zt)-measurable.Similarly, if x is a predictable process, we write xt = xt (Zt−1) to expressthe condition that there exists a function xt : Nt−1 → R such thatxt (ω) = xt (Zt−1 (ω)) for all ω ∈ Ω (a condition that is equivalent tothe σ (Zt−1)-measurability of xt). Even though xt denotes two separatequantities, the meaning is clear from the context.

With these conventions, suppose that the market is arbitrage-freeand is implemented by the MMA and J contracts, as specified in thelast section, with the additional restriction that for every time t > 0,(rt, δt, Vt, µ

Gt , σ

Gt

)=(rt (Zt−1) , δ (Zt) , Vt (Zt) , µ

Gt (Zt−1) , σ

Gt (Zt−1)

),

and therefore St ≡ Vt − δt = St (Zt). A market price of risk processη ∈ H, defining an EMM Q, is specified as a predictable solution toequation (2.5.12) such that 1− η′∆B is strictly positive (with η0 = 0).The reader can verify that in this context, we can select η ∈ H to satisfyηt = ηt (Zt−1), t = 1, . . . , T . Assuming such a selection, it follows, fromLemma 2.6.1, that Z is a Markov process under Q, as well as P .

As in the last section, we consider the pricing and replication of atraded contract (δ∗, V ∗), with the added assumption that δ∗t = δ∗t (Zt)

Page 75: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.6. INDEPENDENT INCREMENTS AND THE MARKOV PROPERTY 75

for all t, and V ∗T = V ∗

T (ZT ) (which is redundant by the conventionδ∗T = V ∗

T ). Pricing the contract recursively using the EMM Q and theMarkov property of Z relative to Q gives

V ∗t−1 = δ∗t−1 (Zt−1) +

EQ [V ∗t (Zt) | Zt−1]

1 + rt (Zt−1)= V ∗

t−1 (Zt−1) .

Therefore, V ∗t = V ∗

t (Zt) for all t. Using the notation (2.5.17) forthe corresponding gain process, we also have µ∗

t = µt (Zt−1) and σ∗t =

σ∗t (Zt−1) (why?). A trading strategy that replicates the contract (δ∗, V ∗)

can be selected to have an analogous Markovian structure. We saw inthe last section that a trading strategy (θ0, θ) such that (δ∗, V ∗) =(δθ, V θ) can be constructed by letting θ0t be defined by (2.5.16) whereθt solves σ∗ (Zt−1) = θtσ

Gt (Zt−1). Select such a θ of the form θt =

θt (Zt−1). Let S∗ ≡ V ∗−δ∗ and S ≡ V −δ and substitute the identitiesV ∗t = S∗

t−1 + ∆G∗t and Vt = St−1 + ∆Gt in expression (2.5.16) for θ0t .

Using the dynamics of G and G∗, it follows that θ0t = θ0t (Zt−1), where

θ0t (z) =S∗t−1 (z) + µ∗

t (z)− θt (z)(St−1 (z) + µGt (z)

)1 + rt (z)

.

This type of Markovian scaffolding can be easily extended to ourearlier discussion of option pricing given the existence of a dominantchoice. The following example illustrates the basic idea.

Example 2.6.2 (American call). We revisit the American call ofExamples 1.5.5 and 2.4.11, but with the underlying stock potentiallypaying dividends. Assuming the above Markovian structure and nota-tion convention, the option’s value process and an associated optimalexercise time can determined by solving the backward recursion

V ∗t (z) = max

St (z)−K,

EQ[V ∗t+1 (Zt+1) | Zt = z

]1 + rt+1 (z)

, V ∗

τ+1 (z) ≡ 0,

where z ranges over all possible values of the Markov state. The ideais that at time t and Markov state z, assuming the option has notbeen already exercised, it is optimal to exercise the option and collectSt (z)−K if the maximum is achieved by the first term and it is optimalto keep the option alive if the maximum is achieved by the second term.Define the stopping time

τ ∗ ≡ min t | V ∗t (Zt) = St (Zt)−K

and for any stopping time τ ∈ T , let V τ be the present value process(under the EMM Q) of the dividend process δτt ≡ (St −K) 1τ=t.It is left as an exercise to verify (using a backward-in-time inductiveargument) that V τ

t ≤ V ∗t , with equality if τ = τ ∗. This confirms

that τ ∗ maximizes present value and is therefore optimal, and that V ∗

represents the option value process in the sense of Section 1.5. ♦

Page 76: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 76

2.7. A glimpse of the continuous-time theoryWe have so far taken the unit of time to coincide with a period,

which implies that the time horizon T is the same as the number ofperiods N . For this section, we fix the time horizon T in a given timeunit, which we call a year, and discuss approximations and associatedsimplifications that arise as the number of periods N becomes verylarge. Continuous-time models of interest are idealized representationsof such large discrete models in the sense that the quantitative pre-dictions of the discrete and continuous models are very close providedN is large enough. Theoretical support for this claim is provided bylimit theorems as N goes to infinity. The rigorous details of boththe continuous-time limiting model and arguments of convergence arehighly technical and well beyond the scope of this presentation. More-over, infinite models bring into scope set-theoretic esoteric aspects ofthe theory that are far removed from applications. An introductionfounded on this chapter’s finite information tree analysis should helpdemystify some of the continuous-time tools, as well as provide a ba-sis for prioritizing aspects of the continuous-time theory that are mostrelevant for economic theory.

While one can embed an arbitrary sequence of discrete risks intoa continuous-time model, tractability benefits derive from the assump-tion that uncertainty evolves as a sequence of risks that are in a sensesmall or infinitesimal in the continuous-time limit. The fundamentalbuilding blocks for constructing such high frequency sequences of smallrisks are two types of stochastic process: Brownian motion and thePoisson process. The incremental change of either type of process overan infinitesimal time interval represents a small risk, but for oppos-ing reasons. For Brownian motion, a change is certain to happen, butthe magnitude of the change is infinitesimal. For a Poisson process,the change is fixed at a unit amount, but the probability of changeis infinitesimal. Mixing of Brownian motions and Poisson processescan be used to construct an arbitrary Lévy process,11 which can bethought of as a representation of an arbitrary sequence of independentidentically distributed small risks in high frequency. Lévy processesare further extended to classes of so-called semimartingales,12 wherethe conditional moments of each small risk can be time-varying andpath dependent, presenting a rich palette for formulating statisticallytestable stochastic models. This being a pedagogical introduction, wefocus on the one-dimensional Brownian case.

11Cinlar [2010] provides a broad introduction to Probability theory that in-cludes Lévy processes. Applebaum [2004] provides an overview of Lévy processeswith a focus on stochastic integration.

12See Jacod and Shiryaev [2003] and Jacod and Protter [2012].

Page 77: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 77

The set of times for this section is 0, h, 2h, 3h, . . . , Nh, whereh ≡ T/N represents the time length of each period in years. The ideais to fix T and consider versions of the model as N goes to infinity and hgoes to zero. We proceed informally with a given value of N , but withthe understanding that N is big in the following sense. Suppose h =10−6 years and we are only interested in computing quantities to a six-decimal precision. We therefore think of h as small but not negligibleand quantities of higher order, like h1.5 = 10−9 and h2 = 10−12, asnegligible. On the other hand

√h = 10−3 is 1000 times bigger than h

and definitely not negligible. Note that as h goes to zero,√h becomes

infinitely larger than h.The underlying filtration Fnh | n = 0, 1, . . . , N is defined analo-

gously to Example 1.1.1, where information available at time t = nhcan be thought of as being the outcome of n consecutive coin tosses.Equivalently, we assume that the underlying filtration is generated bya process B, where B0 ≡ 0, and over every period the increment of Bcan take one of two possible values. As in Section 2.5, we fix an un-derlying full-support probability P , relative to which B is a zero-meanmartingale. Moreover, we assume that v ≡ Et

[(Bt+h −Bt)

2] is thesame for all t = nh, and B is normalized so that E [B2

T ] = T . SinceBT =

∑Nn=1∆Bnh, where ∆Bnh ≡ Bnh −B(n−1)h,

E[B2T

]=

N∑n=1

E[(∆Bnh)

2]+ 2∑m<n

E [∆Bmh∆Bnh] .

For m < n, E [∆Bmh∆Bnh] = E [∆BmhEmh∆Bnh] = 0. ThereforeT = E [B2

T ] = Nv or v = T/N = h.Another assumption we make requires that, loosely speaking, the

increment Bt+h − Bt is small, so that in the limit the paths of B arecontinuous. For example, if the two scenarios following each spot areequally likely,13 the fact that Bt+h−Bt has zero conditional mean andconditional second moment equal to h implies that Bt+h−Bt takes thepossible values ±

√h, and therefore (Bt+h −Bt)

2 = h. More generally,we assume that for every smooth function f : R → R, the (informallystated) second-order Taylor approximation is valid at all states:

(2.7.1) f (Bt+h)−f (Bt) ≈ f ′ (Bt) (Bt+h −Bt)+1

2f ′′ (Bt) (Bt+h−Bt)

2.

Using this approximation, we can now compute the limiting distri-bution of B (as N → ∞ with T fixed). We assume familiarity withthe normal (or Gaussian) distribution and its characteristic function

13A probabilistically small risk we are excluding is one where Bt+h −Bt takesthe value 1− h with probability h and the value −h with probability 1− h. In thiscase, the limiting process is Bt = Ct − t, where Ct is a Poisson process with unitarrival rate.

Page 78: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 78

(or Fourier transform). Consider approximation (2.7.1) withf (x) ≡ exp (iθx) ≡ cos (θx) + i sin (θx) , i ≡

√−1.

The fact that f is complex-valued presents no problem, since the ap-proximation can be applied on the real and imaginary parts sepa-rately. Fixing any time s, we compute ms (t) ≡ Esf (Bt) for t ∈ [s, T ],which defines the characteristic function of the distribution of Bt con-ditionally on time-s information. By the law of iterated expectations,Es [Bt+h −Bt] = 0 and Es [(Bt+h −Bt)

2] = h. Applying Es on bothsides of (2.7.1) and using the fact that f ′′ (x) = −θ2f (x), we have

ms (t+ h)−ms (t)

h≈ −θ

2

2ms (t) .

For infinitesimal h, we conclude that the derivative of log (ms (t)) withrespect to t equals −θ2/2. Since ms (s) = f (Bs), we can integrate froms to t to find that in the limit14 as N → ∞,

Es exp (iθ (Bt −Bs)) = exp(−θ

2

2(t− s)

).

This is the characteristic function of a normal distribution with meanzero and variance t − s, which must therefore be the distribution ofBt−Bs conditionally on time-s information. Coupled with the fact thatin the limit the paths of B are continuous, we are led to the usual15 def-inition of (one-dimensional16) Standard Brownian Motion (SBM),or Wiener process, as an independent-increments stochastic processwith continuous paths that start at zero, and whose increment over anytime interval of length ∆ is normally distributed with mean zero andvariance ∆.

To get a sense of what the paths of SBM look like, consider firstthe finite filtration case with the probability P assigning equal massto every path of B, which as we noted, implies that over every periodthe increment of B is ±

√h, where h = T/N . The sum N

√h of the

absolute value of all these increments along a path of B is known asthe path’s total variation, which in this case blows up as N → ∞.On the other hand, the sum Nh of the square of the increments of B

14The heuristic argument just given can be converted to a rigorous version,but omits an important sense in which the probability distribution of the paths ofwhat we call martingale bases converge to SBM. A rigorous statement and proofof a more general convergence result for continuous martingales can be found inthe exposition of Whitt [2007], while more elaborate extensions to semimartingalelimits that allow for Poisson-type jumps is the subject of Jacod and Shiryaev [2003].

15Mörters and Peres [2010] and Revuz and Yor [1999] are some excellent ac-counts of Brownian motion and its fascinating properties, some of which are infor-mally discussed in this section.

16If the spanning number of the finite filtrations were any positive integer 1+d,the limiting process would be a d-dimensional vector of mutually stochasticallyindependent SBMs.

Page 79: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 79

along a path is known as the path’s quadratic variation, which inthis case is equal to T no matter how big N is. These observations arereflected in properties of the paths of a SBM B. Every path of B iscontinuous but has infinite total variation (and hence infinite length)but quadratic variation equal to T . Omitting rigorous definitions, wewrite the quadratic variation property as

∫ t0(dBs)

2 = t for all t ∈ [0, T ],or (dBt)

2 = dt. Since t = var (Bt), the variance of Bt is perfectlyrevealed by its time-series estimate along any path of B up to time t, nomatter how small t is. On the other hand, paths of B do not reveal themean of Bt, which can only be estimated in an infinite-horizon versionof the model through the law of large numbers: limt→∞Bt/t = 0 withprobability one. Brownian paths can also be shown to be nowheredifferentiable and have further intricate structure not covered here.These properties may cause some suspicion as to whether SBM existsas a rigorous mathematical object. It’s existence, however, has beenestablished in a variety of ways within the conventional set-theoreticfoundations of real analysis.

A key assumption in our heuristic derivation of the Brownian mo-tion distribution has been that B is a zero-mean martingale satisfyingEt[(Bt+h −Bt)

2] = h or, equivalently, B2t − t = Et

[B2t+h − (t+ h)

].

This insight is reflected in Lévy’s characterization of SBM: A pro-cess B with continuous paths17 starting at zero is SBM if and only ifboth Bt and B2

t − t are martingales. In fact, for the “if” part, it issufficient to assume that Bt and B2

t − t are local martingales. Thedistinction between a martingale and a local martingale is not presentin the finite information setting, but is essential in the continuous-timelimit. A martingale over the continuous time interval [0, T ] is anadapted stochastic process M such that E |Mt| < ∞ for every time t,and s < t implies EsMt = Ms. A local martingale is an adaptedprocess M for which there exists an increasing sequence of stoppingtimes τ1 ≤ τ2 ≤ · · · converging to T with probability one and suchthat for every n, the process M is a martingale up to time τn (mean-ing that the process that equals Mt on t ≤ τn and Mτn on t > τnis a martingale). On a three-date infinite state-space setting, we maythink of an adapted process M whose increment ∆ ≡M2−M1 satisfiesE1 |∆| < 0 and E1∆ = 0, but from the perspective of time zero, wehave E |∆| = ∞, which leads to an example18 of a local martingale that

17The continuity of paths is essential here. Suppose instead we applied ananalogous argument for a process Bt = Ct − t, where the paths of C are right-continuous and constant except for jumps of size +1. Think of Ct as counting thenumber of arrivals over [0, t]. Then the martingale property of B characterizes Cas a Poisson process with unit arrival rate. This is Watanabe’s characterization ofPoisson processes.

18See Example 1.49 of Chapter 1 in Jacod and Shiryaev [2003].

Page 80: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 80

is not a martingale. This type of example fully encapsulates the dis-tinction between a martingale and a local martingale in discrete time,19

but the issue is more subtle in the continuous-time limit.The fact that the paths of B have infinite length implies that an

integral σ • B cannot be defined path by path in a conventional way,even if σ is assumed to be bounded. A key insight of Ito’s calculus isthat the integral σ•B must be defined as a genuinely stochastic integral,taking into account the entire filtration and the requirement that σ ispredictable. (In the Brownian continuous-time limit, predictability ofσ can be thought of as the requirement that σ is adapted, since thedistinction of the information at time t and t − dt is negligible. Forexample, the SBM B is predictable, while none of the BN are.) Wherein discrete time we would recursively express the relationship M = σ•Bas ∆Mt = σt∆Bt, in continuous time, we write dMt = σtdBt, wherethe differential notation dMt can be thought of as standing for theinfinitesimal increment Mt −Mt−dt, and analogously for dBt. We alsowrite Mt =

∫ t0σsdBs, t ∈ [0, T ]. Where in the finite state-space model

σ •B is always a martingale, in the current context we can only claimthat σ •B is a local martingale. As the heuristic (dMt)

2 = σ2t (dBt)

2 =σ2t dt suggests, in order for the process M ≡ σ •B to be well-defined as

a process of finite quadratic variation, it is necessary that the processQt ≡

∫ t0σ2t dt is well-defined and finite, in which case, Q is exactly the

quadratic variation process of M .Ito’s calculus corresponds to the limiting case of our earlier approx-

imations in terms of h, which can be stated as the exact relationships(2.7.2) (dBt)

2 = dt and dBtdt = (dt)2 = 0,

while the Taylor approximation (2.7.1), for twice continuously differ-entiable f : R → R, becomes

(2.7.3) df (Bt) = f ′ (Bt) dBt +1

2f ′′ (Bt) dt.

For example, for f (x) = x2, we find dB2t = 2BtdBt + dt, or

(2.7.4) B2t − t = 2

∫ t

0

BtdBt.

SinceB is predictable, the above integral defines a local martingale. Fora SBM B, we already know that B2

t − t is a martingale. The argumentleading to the above identity, however, applies to any local martingaleB with continuous paths and quadratic variation (dBt)

2 = dt. ByLévy’s characterization, any such local martingale starting at zero mustnecessarily be a SBM.

The preceding argument leads to an interesting insight on the re-lationship between volatility and time. Consider the local martingale

19See Proposition 1.64 of Chapter 1 in Jacod and Shiryaev [2003].

Page 81: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 81

Mt ≡∫ t0σtdBt, for predictable σ, defining the volatility of M . The pro-

cess M is close to being a Brownian motion—it is a local martingalewith continuous paths, but it does not necessarily have unit volatility:(dMt)

2 = σ2t dt. Imagine now a movie of M which at time t is played at

speed σ2t . Think of t as real time for the duration [0, T ] of the movie.

Think of u as time shown on a clock within the movie. Both times areset to zero at the beginning of the movie. At real time t, the clock inthe movie shows u =

∫ t0σ2sds, which is the quadratic variation of M up

to time t. With u and t so related, let Wu =Mt. For example, if W rep-resents a contract’s gain process from the perspective of a character inthe movie, M represents the same gain process from the perspective ofthe viewer. The movie character observes that W is a local martingalewith continuous paths that satisfies (dWu)

2/du = (dMt)2 /σ2

t dt = 1,and therefore can correctly claim that W is SBM.

To get a glimpse of phenomena that arise in the continuous-timelimit but are not present in a finite-tree model, suppose

∫ T0σ2sds = ∞

(for example, σt = (T − t)−1/2). The movie is the complete biographyof an immortal god, who correctly argues that with probability oneW eventually hits the value one. One way to see this is through thereflection principle for SBM. For each (continuous) path ω of W thatcrosses 1 before time t, there is an equally likely path ω that coincideswith ω up the first time t1 that ω hits 1, and thereafter it is the verticalreflection of ω relative to the horizontal line through 1, that is, fors > t1, ωs ≡ 1 − (ωs − 1). By construction, ωt < 1 if and only ifωt > 1. Both ω and ω have the property that they cross 1. Conversely,every path that crosses 1 before time t belongs to such a pair ω, ω.Therefore, the probability that W crosses 1 by time t is twice theprobability of the event Wt > 1 =

Wtt

−1/2 > t−1/2

, a probabilitythat converges to one as t → ∞ since Wtt

−1/2 is normally distributedwith zero mean and unit variance. Equivalently, the stopping timeτ defined as the first time that W takes the value one is finite withprobability one. But then the viewer of the movie must also concludethat there is a [0, T ]-valued stopping time τ such that Mτ = 1 withprobability one. Doob’s optional stopping theorem states that if M isa zero-mean martingale, then EMτ = 0 for every stopping time τ thatis bounded by some deterministic time. Within the movie, we havean example of a zero-mean martingale W and a stopping time τ suchthat EWτ = 1, which is consistent with Doob’s theorem, since τ is notbounded—it can take an arbitrarily long time for W to hit one. Themovie viewer’s stopping time τ , however, is bounded and therefore itmust be the case that M is a local martingale that is not a martingale.

In a market interpretation, we can think of σ as a trading strategyin a contract with gain process B, while within the movie the god holdsone share in a contract whose gain process is W . The cumulative gains

Page 82: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 82

of σ at real time t match the cumulative gains of the god’s strategy atgod time u =

∫ t0σ2sds. In a finite model, the expected gains from trad-

ing a contract whose gain process is a zero-mean martingale are zero.The god who holds the contract until W hits one is guaranteed a unitexpected gain. Likewise, the trading strategy σ can guarantee a unitgain by time T . Both strategies are a form of arbitrage. Within themovie, the arbitrage disappears if we place a martingale lower boundon how negative the god’s wealth can become before liquidation. Al-ternatively, the arbitrage disappears if we assume the god has a finiteexpected lifetime after all. In real time, an appropriate lower boundon wealth eliminates the arbitrage, and that’s the modeling choice wewill adopt in the following section. Limiting the god’s expected hori-zon corresponds to the square integrability condition E

∫ T0σ2sds < ∞,

which implies that M ≡ σ•B is a zero-mean finite-variance martingale.Conversely, every such martingale M can be represented as M ≡ σ •Bfor some square integrable predictable σ.

A natural type of process used to represent prices in a Brownianmodel is an Ito process, defined as a process of the form

xt = x0 +

∫ t

0

µsds+

∫ t

0

σsdBs or dxt = µtdt+ σtdBt,

for adapted processes µ and σ, respectively referred to as the driftand volatility of x, where it is assumed that the integrals

∫ T0|µt| dt

and∫ T0σ2t dt are well-defined and finite with probability one. The Ito

decomposition of x into a drift and a volatility term is unique (mod-ulo some zero-probability event technicalities we will ignore). To seewhy, suppose dxt = µtdt + σtdBt = µtdt + σtdBt. Then (µt − µt) dt +(σt − σt) dBt = 0. Squaring this expression and using the Ito calcu-lus (2.7.2) results in (σt − σt)

2 dt = 0 and therefore σ = σ (essen-tially), which in turn implies µ = µ (essentially). Alternatively, we canheuristically, but correctly, think of the drift and volatility as beingdetermined by Et−dt [dxt] = µtdt and covt−dt [dxt, dBt] = σtdt. For anypredictable process θ, the integral θ • x can be expressed as∫ t

0

θsdxs =

∫ t

0

θsµsds+

∫ t

0

θsσsdBs or θtdxt = θtµtdt+ θtσtdBt,

provided∫ T0|θtµt| dt and

∫ T0(θtσt)

2 dt are well-defined and finite withprobability one.

The type of second-order Taylor approximation that led to identity(2.7.1) can be applied to to an arbitrary Ito process x and associatedtime-dependent function f : [0, T ] × R → R that is continuously dif-ferentiable with respect to time and twice continuously differentiablewith respect to the second argument:

(2.7.5) df (t, xt) =∂f (t, xt)

∂tdt+

∂f (t, xt)

∂xdxt +

1

2

∂2f (t, xt)

∂x2(dxt)

2 ,

Page 83: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.7. A GLIMPSE OF THE CONTINUOUS-TIME THEORY 83

where (dxt)2 = σ2

t dt, according to the Ito multiplication rules (2.7.2).This is known as Ito’s formula (or lemma) and extends to multidi-mensional Ito processes (and beyond that, to general semimartingales).We will not go further in this direction here, except state an importantspecial case, known as the integration by parts formula: If x and yare Ito processes, then d (xtyt) = ytdxt+xtdyt+dxtdyt. (For a heuristicproof, expand the product (xt−dt + dxt) (yt−dt + dyt) and use the con-tinuity of x which implies that xt−dt = xt, and analogously for y.) Bythe Ito multiplication rules (2.7.2), if x and y have volatility σx and σy,respectively, then dxtdyt = σxt σ

yt dt. For example, for x = y = B, we

recover20 identity (2.7.4). For strictly positive x and y, the followingversion of integration by parts is often more convenient to use:

d (xtyt)

xtyt=dxtxt

+dytyt

+dxtxt

dytyt.

We close this section with some comments on the interchange oflimits and expectations which will arise in our discussion of state pric-ing in the following section. Consider first the example of the statespace Ω = (0, 1] with the underlying probability P being the uni-form distribution: 0 < a < b ≤ 1 implies P ((a, b]) = b − a. Thesequence of random variables Xn (ω) ≡ n1ω≤1/n converges to 0 forevery ω ∈ Ω, yet EXn = 1 for all n. Unlike the finite Ω case, wecannot freely interchange limits and expectation. An essential positiveresult in this regard, which is not far from the way expectations aredefined, is the monotone convergence theorem: For all [0,∞)-valued random variables X,X1, X2, . . . , if Xn ↑ X then EXn ↑ EX(where Xn ↑ X means that, with probability one, Xn+1 ≥ Xn for alln, and limn→∞Xn = X). The conclusion applies even if EX = ∞. Wewill use the monotone convergence theorem through Fatou’s lemma:

Lemma 2.7.1 (Fatou). If Xn ≥ 0 for all n, then

lim infn

EXn ≥ E lim infn

Xn.

Proof. Let Yn = infk≥nXk and note that 0 ≤ Yn ↑ Y ≡ lim infnXn.For all m ≥ n, EXm ≥ EYn and therefore infm≥n EXm ≥ EYn. Takingthe limit as n→ ∞ and using the monotone convergence theorem, weconclude that lim infn EXn ≥ EY .

20Integration by parts can be used inductively to show that every polynomialin (t, x) satisfies Ito’s formula, which can then form the basis for proving the Itoformula as stated above. The idea is to stop the process x before it leaves a compactinterval and then uniformly approximate f and the relevant derivatives on thatinterval by polynomials.

Page 84: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 84

2.8. Brownian market exampleAs a simple example, in this section we discuss an arbitrage-free

market with the underlying filtration generated by SBMB over the con-tinuous time interval [0, T ]. In other words, the information availableat time t is the realization of the path of B up to time t. The processB is one-dimensional, which means that the filtration can be thoughtof as a high-frequency limiting version of the binomial filtration of Ex-ample 1.1.1. The extension to multi-dimensional B is mainly a matterof notation and will not be given here. The arguments that follow arefor the most part similar to finite-information counterparts we haveencountered already, with simplifications resulting due to small-riskapproximations which are codified as exact infinitesimal relationshipsby the Ito calculus. As in the last section, we omit some mathematicaldetails. For example, the mere definition of the information generatedby the SBM B requires notions of σ-algebras and measurability, aswell as an underlying probability measure that defines the expectationoperator E. The hope is that once the main economic arguments areestablished, and the formalities are tied to finite-tree constructs, themore advanced literature that includes all these mathematical detailswill make a lot more sense.

A general continuous-time cash flow on the given filtration can bedefined analogously to the way cumulative distribution functions areused to specify probability distributions on the real line. This beinga simple introduction, we instead consider a special type of market inwhich all cash flows consist of just two lump-sum payments at times 0and T . More precisely, a cash flow is any pair (c0, cT ) ∈ R×L2, whereL2 is the set of every random variable cT such that E [c2T ] < ∞ (withthe usual convention of identifying any two random variables whosedifference has zero mean and variance). If x is a traded cash flow, thetime-0 payment x0 is used to purchase an initial portfolio of value −x0,which is then updated through a trading strategy up to time T , whenthe terminal portfolio value xT is liquidated. Every trading strategyis assumed to be self-financing, meaning that no cash is generated orinjected after the initial purchase and prior to the terminal liquidation.

The market is implemented by two contracts: a money-market ac-count and a stock. The money-market account (MMA) has a valueprocess identically equal to one unit of account, let’s call it a “dollar,”and a continuously compounded constant interest rate r ∈ R. Adollar in the MMA at time t generates interest rdt at time t + dt.Equivalently, a dollar invested in the MMA at time zero with all in-terest continuously reinvested, results in a time-t account balance ofert dollars. The stock has a prices process S which follows geometricBrownian motion with drift: St = S0 exp (at+ σBt), for given con-stants a ∈ R and S0, σ ∈ (0,∞). The assumption that σ is positive is

Page 85: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 85

just a convention, since the implications of the model are invariant toreplacing B with the SBM −B. The stock’s dividend yield is a con-stant y ∈ R, meaning that a share purchased at time t pays ydt stockshares (or yStdt dollars) in dividends at time t+ dt. Equivalently, if ashare of the stock is purchased at time zero and all dividends are rein-vested in the stock, then the time-t stock position is eyt shares, whichis worth eytSt dollars. (Note that by this definition, we could havecalled r the dividend yield of the MMA.) Since interest and dividendsare paid out continuously, the cum and ex-dividend price processes foreach contract are the same. Ito’s formula (2.7.5) implies that S is anIto process satisfying

(2.8.1) dStSt

+ ydt = µdt+ σdBt, µ ≡ a+ y +1

2σ2.

A trading strategy takes the form of a pair of predictable processes(θ0, θ), where θ0t is the time-t dollar balance in the MMA and θt is thethe number of stock shares held at time t. The gain processes for thetwo contracts are defined analogously to the finite case by

G0t ≡ 1 + rt and Gt ≡ St +

∫ t

0

ySudu.

The trading strategy (θ0, θ) is self-financing if it satisfies the budgetequation expressing the fact that the portfolio value is equal to theinitial portfolio value plus accumulated gains from trading:

(2.8.2) θ0t + θtSt = θ00 + θ0S0 +

∫ t

0

θ0udG0u +

∫ t

0

θudGu,

where dG0t = rdt and dGt = dSt + yStdt.

The trading strategy (θ0, θ) is admissible if it is self-financing (im-plying the integrals of the budget equation are well-defined) and thereexist constants θ0 and θ such that, with probability one, the followingsolvency constraint is satisfied:(2.8.3) θ0ert + θeytSt + θ0t + θtSt ≥ 0, for all t ∈ [0, T ] .

The constraint can be thought of as a simple form of a collateral re-quirement. A brokerage carries out your trades, but must ensure thatat all times your total account balance can cover your trading losses.At time zero, you are asked to deposit θ0 dollars in the MMA and θshares of the stock, with interest and dividends reinvested in the re-spective contracts. The solvency constraint requires that the net valueof all your positions with the brokerage remains positive. While redun-dant in the finite state-space model, it is needed here in some form inorder to preclude arbitrage trades based on increasingly larger bets ofthe type discussed in the last section. (As suggested by last section’sdiscussion, a variant of this example replaces the solvency constraintwith a square integrability restriction on admissible trading strategies.

Page 86: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 86

Either approach supports the example of Black-Scholes option pricingand hedging presented below.)

The budget equation should be invariant relative to a change of theunit of account implied by the strictly positive Ito process π. At time ta dollar is the same as πt of the new unit of account. The correspondinggain processes in the new unit are

G0πt ≡ πt +

∫ t

0

rπudu and Gπt ≡ πtSt +

∫ t

0

yπuSudu.

The integration by parts formula implies that the budget equation (2.8.2)can be equivalently stated as

(2.8.4) θ0t πt + θtπtSt = θ00π0 + θ0π0S0 +

∫ t

0

θ0udG0πu +

∫ t

0

θudGπu.

The solvency constraint under the new units is inequality (2.8.3) mul-tiplied through by πt, and is therefore also invariant under this type ofchange of a unit of account.

We say that the cash flow x = (x0, xT ) is traded if there exists anadmissible trading strategy (θ0, θ) such that x0 = − (θ00 + θ0S0) andxT = θ0T +θTST . A state price density (SPD) process in this contextis a strictly positive Ito process π such that πT ∈ L2 and for everytraded cash flow (x0, xT ),(2.8.5) π0x0 + E [πTxT ] ≤ 0.

Note that the assumption xT , πT ∈ L2 ensures that E |πTxT | < ∞thanks to the Cauchy-Schwarz inequality.

Based on the analysis for the finite case, we can guess the stateprice dynamics in terms of the short-rate process r and the market-price-of-risk process η, both of which are constant in this example:

(2.8.6) dπtπt

= −rdt− ηdBt, where η ≡ µ− r

σ.

In integrated form, the proposed SPD is

(2.8.7) πt = π0e−rtξt, ξt = exp

(−η

2

2t− ηBt

).

To confirm that this form of π indeed follows the claimed dynamics,apply integration by parts and Ito’s formula, which implies that ξ solves

(2.8.8) dξtξt

= −ηdBt, ξ0 = 1.

This Ito decomposition implies that ξ is a local martingale. Relaxingtemporarily the assumption that η is constant over time and states, itmay well be the case that ξ is not a martingale. The local martingaleproperty of ξ together with Fatou’s lemma can be used to show that ξ isa supermartingale: s > t implies Etξs ≤ ξt, and ξ is a martingale if andonly if EξT = 1, in which case it defines a probability Q analogously to

Page 87: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 87

the finite case by ξT = dQ/dP . Here η is constant and the martingaleproperty of ξ can be easily confirmed from (2.8.7) using the fact thatB has independent and normally distributed increments.

As in the finite case, the coefficients of the Ito expansion of π aredetermined by the requirement that G0π and Gπ are local martingales.To see that, suppose dπt/πt = atdt+btdBt and use integration by parts:

dG0πt

πt= (at + r) dt+ btdBt,

dGπt

πtSt= (µ+ at + σbt) dt+ (σ + bt) dBt.

Setting the drift term in the first equation to zero is equivalent tosetting at = −r, and given that, setting the drift term in the secondequation to zero is equivalent to setting bt = −η. In the followingproposition we prove that π is a state-price density by showing thatthe local martingale property of G0π and Gπ implies the state-pricedensity property of π.

Proposition 2.8.1. For any π0 ∈ (0,∞), the process π definedby (2.8.6) is a state-price density.

Proof. Suppose the trade (x0, xT ) is generated by the admissibletrading strategy (θ0t , θt), satisfying the solvency constraint (2.8.3). Let

Wt ≡ θ0ert + θeytSt and Wt ≡ Wt + θ0t + θtSt.

We remarked earlier that ξt = πtert is a martingale. Similarly, applying

integration by parts to ζt = πteytSt, we find that dζt/ζt = (σ − η) dBt,

which implies that ζ is a martingale (for the same reason that ξ is amartingale). The local martingale property of G0π and Gπ, the budgetequation (2.8.4), and the fact that πtWt is a martingale together implythat πW is a local martingale. There exists, therefore, an increasingsequence of stopping times τn that converges to T with probability onesuch that πW is a martingale up to time τn and therefore(2.8.9) π0W0 = E [πτnWτn ] , n = 1, 2, . . .

By the solvency constraint, πtWt ≥ 0. By Fatou’s Lemma 2.7.1,

π0W0 = lim infn→∞

E [πτnWτn ] ≥ E[lim infn→∞

πτnWτn

]= E [πTWT ] .

Subtracting the martingale identity π0W0 = E[πT WT

]on both sides

and using the fact that W0 = W0 − x0 and WT = WT + xT , we obtain(2.8.5).

The equivalent martingale measure (EMM) argument of the finitecase extends to the current context. Using the fact that ξ is a strictlypositive unit-mean martingale, we define the probability Q throughξT = dQ/dP , where the latter is the density of Q with respect to P .The Girsanov-Lenglart theorem alluded to in Section 2.5 is known as

Page 88: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 88

simply Girsanov’s theorem in the Brownian setting: BQt ≡ Bt+ ηt is a

Q-martingale and therefore, by Lévy’s characterization of SBM, BQ isa SBM under the probability Q.

Remark 2.8.2. Girsanov’s theorem for Brownian motion can berelated to the functional form of the standard normal density func-tion ϕ as follows. Let Φ (x) ≡

∫ x−∞ ϕ (y) dy denote the standard nor-

mal cumulative distribution function. Girsanov’s theorem implies thatΦ (x) = Q[BQ

1 ≤ x]. We use expression (2.8.7) for ξ1 and the changeof measure formula EQz = E [zξ1] for every F1-measurable boundedrandom variable z to conclude that

Φ (x) = E[1BQ

1 ≤xξ1]=

∫ x

−∞ϕ (y − η) e−

η2

2−η(y−η)dy.

Equivalently, ϕ (y) = ϕ (y − η) exp (η2/2− ηy). For η = y, this givesthe standard normal density as

ϕ (y) = ϕ (0) exp(−y

2

2

),

where ϕ (0) is set to (2π)−1/2 so that∫∞−∞ ϕ (y) = 1. Conversely, given

the functional form of ϕ, the above steps can be reversed to confirmthat Φ (x) = Q[BQ

1 ≤ x]. The argument can be easily extended toconfirm that for all times s > t, BQ

s −BQt has a normal distribution of

zero mean and variance s− t, independently of Ft, all under Q. SinceBQ has continuous paths starting at zero, this confirms that BQ is aSBM under Q. ♦

The stock price dynamics can be equivalently written as(2.8.10) dSt = (r − y)Stdt+ σStdB

Qt .

Repeating the above analysis with the EMM Q as the underlying prob-ability, BQ in place of B, and r in place of µ, results in the state-pricingcondition x0+EQ

[e−rTxT

]≤ 0 for every traded cash flow x. If ±x are

both traded, we have the pricing identity

(2.8.11) − x0 = E[πTπ0xT

]= EQ

[e−rTxT

].

The last equation also follows from the change of measure formulaEQxT = E [(dQ/dP )xT ], since πT = e−rTdQ/dP . (These argumentsapply with more general predictable r and η, with e−

∫ T0 rtdt in place of

e−rT and BQt ≡ Bt +

∫ t0ηsds, a SBM under Q.)

Example 2.8.3 (Forward pricing). Suppose a forward contract fordelivery of one share of the stock at time T is traded at time zero,with corresponding forward price F0,T ∈ R. The cash flow (x0, xT )generated by entering a long position in the forward contract is givenby x0 = 0 and xT = ST − F0,T . The same traded cash flow can be

Page 89: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 89

generated by borrowing S0e−yT from the MMA to purchase e−yT stock

shares at time zero, continuously rolling over the loan and reinvestingall dividends, resulting in a time-T long position of one stock share anda short dollar position −S0e

(r−y)T in the MMA. Barring arbitrage, itfollows that(2.8.12) F0,T = S0e

(r−y)T .

This expression can also be dually derived using the pricing restric-tion (2.8.11) with x0 = 0 and xT = ST − F0,T to conclude thatF0,T = EQST (a conclusion which critically depends on the assump-tion that r is constant). To show directly that the two expressions forF0,T are consistent, suppose F0,T is given by (2.8.12) and note that

(2.8.13) ST = F0,T exp(−σ

2

2T + σBQ

T

).

The exponential factor has unit mean under Q for the same reason thatξT as defined in (2.8.7) has unit mean under P . ♦

The replication and arbitrage-pricing argument of the last exam-ple follows a line of reasoning that was introduced in Section 2.5 andextends more generally to the current context. For simplicity, we con-sider a contract, we call the “star” contract, whose value process is V ∗

and whose only dividend is a terminal lump-sum payment V ∗T ∈ L2.

(In the preceding forward contract example, V ∗T ≡ ST − F0,T .) Using

the EMM Q, let(2.8.14) V ∗

t ≡ EQt[e−r(T−t)V ∗

T

],

which defines the Q-martingale V ∗t e

−rt thanks to the law of iterated ex-pectations. Omitting some technical details on integrability conditions,a predictable martingale representation theorem implies the existenceof a predictable process σ∗ such that

V ∗t e

−rt = EQt[e−rTV ∗

T

]= V ∗

0 +

∫ t

0

e−ruσ∗udB

Qu .

Integration by parts leads to the Ito decomposition(2.8.15) dV ∗

t = rV ∗t dt+ σ∗

t dBQt = (rV ∗

t + σ∗t η) dt+ σ∗

t dBt.

This equation can be read as a condition that specifies the drift of V ∗t

as a function of its volatility σ∗t . In our discussion of equation (2.5.12)

in Section 2.5, we saw that in the finite model such a restriction isequivalent to a backward recursion on the information tree. Equa-tion (2.8.15) can be thought of as the continuous-time version of sucha backward recursion, applied dt by dt backward in time, starting withthe given terminal value V ∗

T . Mathematically, (2.8.15) is an example ofa backward stochastic differential equation (BSDE) to be solved jointlyin V ∗ and σ∗ given the terminal value V ∗

T . The fact that the drift termin (2.8.15) is linear in V ∗ and σ∗ makes this a linear BSDE, which is

Page 90: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 90

why we have (subject to some technical integrability requirements) aclosed-form solution for V ∗ in (2.8.14)

Analogously to a construction in Section 2.5, we can think of thestar contract as a synthetic contract, generated by the trading strategy(θ0, θ), where(2.8.16) θ0t + θtSt = V ∗

t and θtStσ = σ∗t .

The corresponding budget equation is satisfied, since, using (2.8.10),θ0t dG

0t + θtdGt = rV ∗

t dt+ σ∗t dB

Qt = dV ∗

t .

Unlike the finite case of Section 2.5, we must further verify the solvencyconstraint. Moreover, in order to arrive to an arbitrage pricing equa-tion, as opposed to an inequality, we must confirm that both ± (θ0, θ)are admissible. For instance, the condition is obviously satisfied for theforward contract of Example 2.8.3, as well as the case of a Europeancall reviewed below.

Example 2.8.4 (European call and the Black-Scholes formula).Suppose the star contract is a European call option on the stock withstrike K ∈ (0,∞) and maturity T , and therefore V ∗

T ≡ (ST −K)+. Thesolvency constraint is satisfied for both the long and short position(why?), and therefore the time-zero option price, also known as theoption premium, is V ∗

0 = e−rTEQ[(ST −K)+]. By identity (2.8.13),the latter can be written as the following expression for the forwardcall premium per unit strike:

(2.8.17) p ≡ V ∗0 e

rT

K= EQ

[(exp

(m− v2

2+ v

BQT√T

)− 1

)+],

where v ≡ σ√T and m ≡ log (F0,T/K), with the forward price F0,T

given in (2.8.12). Let Φ (x) ≡ (2π)−1/2 ∫ x−∞ exp (−y2/2) dy denote the

standard Gaussian cumulative distribution of function. Since the latteris the distribution of BQ/

√T under Q, the expectation in (2.8.17) can

be written as(2.8.18) p = emΦ

(mv

+v

2

)− Φ

(mv

− v

2

).

This is the Black and Scholes [1973] formula for pricing the Europeancall. ♦

The Markovian structure introduced in Section 2.6 is also presentin the current market model, with the stock price S being the Markovstate. The analog of the forward recursion (2.6.2) are the Ito dynam-ics (2.8.1) of the stock price S, given the initial value S0. This isan example of a (forward) stochastic differential equation (SDE). In amore general formulation, an SDE can represent the Ito dynamics ofan underlying Markov process Z as

dZt = a (t, Zt) dt+ b (t, Zt) dBt, given Z0 ∈ R,

Page 91: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 91

with the functions a, b : [0, T ] × R → R appropriately restricted toensure a unique solution. Here Z = S, a (t, Z) = µZ and b (t, Z) = σZ,which allows a closed-form solution.

We henceforth assume that V ∗T = fT (ST ) for given fT : R+ → R.

Suppose the (suitably regular) function f : [0, T ]×R+ → R solves thepartial differential equation (PDE)

(2.8.19) ∂f

∂t+ (r − y)S

∂f

∂S+

1

2σ2S2 ∂f

2

∂S2= rf, f (T, ·) = fT .

By Ito’s lemma, it follows that

V ∗t = f (t, St) and σ∗

t =∂f (t, St)

∂SStσ

defines a solution to the linear BSDE (2.8.15). This is an illustration ofa close connection between BSDEs and corresponding PDEs, which al-lows one to leverage the rich theoretical and computational insights ofthe PDE literature. (For example, PDE (2.8.19) can be transformed tothe classical heat equations in Physics.) Conversely, probabilistic argu-ments in the BSDE literature can help shed light on the correspondingPDE. Given f , we can express the trading strategy defined in (2.8.16)as a function of the Markov state: θ0t = θ0 (t, St) and θt = θ (t, St),where

(2.8.20) θ (t, S) =∂f (t, S)

∂S, θ0 (t, S) = f (t, S)− θ (t, S)S.

Example 2.8.5 (European call replication). Suppose the star con-tract is the European call of the last example, and therefore fT (S) ≡(S −K)+. Expressing the Black-Scholes equation in the form V ∗

0 =f (0, S0) results in an expression for f (0, ·). The Markov property im-plies that f (t, ·) is identical to f (0, ·) after replacing T with T −t. Theresult is f (t, S) = θ0 (t, S) + θ (t, S)S, where

θ0 (t, S) = −e−r(T−t)KΦ

(log (S/K) + (r − y) (T − t)

σ√T − t

− σ√T − t

2

),

θ (t, S) = e−y(T−t)Φ

(log (S/K) + (r − y) (T − t)

σ√T − t

+σ√T − t

2

).

One can then easily check that these two function satisfy (2.8.20) andtherefore define the trading strategy that implements the European callas a synthetic contract in the MMA and the stock. ♦

In the more applied literature, the notation (Θ,∆,Γ) is used forthe triple (∂f/∂t, ∂f/∂S, ∂f 2/∂S2). A market maker who sells thestar contract can use f to price the contract, and buy ∆ stock sharesto hedge the position, an activity known as delta hedging. Γ containsinformation on how aggressively the market maker must trade to main-tain the hedging position as a consequence of changes in the underlyingstock price. By Ito’s lemma, (Θ,∆,Γ) encapsulates a snapshot of the

Page 92: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.8. BROWNIAN MARKET EXAMPLE 92

value V ∗t+dt = V ∗

t + Θdt +∆dSt + (Γ/2) (dSt)2 a short time interval in

the future as a function of the stock price, and PDE (2.8.19) expressesthe fact that V ∗

t = e−rdtEQt V ∗t+dt (why?). Notice that ∆ represents a

directional exposure that delta-hedging seeks to cancel out, while Γrepresents an exposure to volatility that can be a source of error inthe delta hedging strategy. In theory, trading sufficiently frequentlymakes the error negligible, but in practice the market maker faces smalltransaction costs and may wish to control the overall Γ to reduce theneed for frequent rebalancing, a practice known as gamma hedging.Moreover, what we called the star contract, may be a portfolio of con-tracts with the same Markov structure, for example, various optionswith different characteristics on the same stock. Since the derivativesdefining Θ, ∆ and Γ are linear, these parameters can be easily added upacross portfolio positions, allowing the market maker to quickly assessthe effect of new trades on the current portfolio.

Page 93: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 93

2.9. ExercisesExercise 1 Consider the setting of Section 2.2 with T = 1. Since

there is only one period, we drop all time subscripts. (The model canbe thought of as a single period on the information tree conditional ona given beginning-of-period spot.) Recall that the market is arbitrage-free, R is the set of all traded returns, and R0 ∈ R is the (zero-variance)MMA return. Assume that the market is not priced risk neutrally:There exists some R ∈ R such that ER 6= R0. Finally, assume that thepositive-variance return R∗ ∈ R maximizes the Sharpe ratio (relativeto R0) over R.

(a) Give a direct argument showing that R∗ must be a minimumvariance frontier return.

(b) Fix any positive-variance return R ∈ R such that R−R∗ is cor-related with R∗ (that is, cov [R−R∗, R∗] 6= 0). On the plane, considerthe locus

H ≡ (stdev [(1− w)R∗ + wR] , E [(1− w)R∗ + wR]) | w ∈ Ras well as the line L that connects h∗ ≡ (stdev [R∗] ,ER∗) to (0, R0). Forevery point (s,m) ∈ H with s > 0, the slope of the line that connects(0, R0) to (s,m) is the Sharpe ratio (m−R0) /s of the correspondingreturn. Since the latter is maximized by R∗ over all of R and h∗ lieson H, it follows that H lies below L on the plane and it touches L ath∗. Since H is smooth, it must then be the case that H is tangent toL at h∗. Compute the slope of the tangent of H at h∗ and show thatthe tangency condition is equivalent to the beta pricing equation

ER−R0 =cov [R,R∗]

var [R∗]

(ER∗ −R0

).

(c) How would the arguments of parts (a) and (b) change if R∗ wereinstead assumed to minimize the Sharpe ratio over all traded returns?Finally, explain why the minimum Sharpe ratio case applied to R∗ isessentially the same as the maximum Sharpe ratio case applied to asuitably defined symmetric traded return.

Exercise 2 This exercise outlines a version of the theory of betapricing without a traded money-market account. As in Section 2.2, theargument applies separately over a single period conditionally on eachnonterminal spot. To simplify notation (and without loss of generality)we assume there is a single period (T = 1), so every cash flow x is a pairof a scalar x0 and a random variable x1. We fix throughout a referencemarket, which is arbitrage-free and therefore every marketed cash flowhas a uniquely defined present value. Call the random variable x amarketed payoff if (0, x) is a marketed cash flow, in which case Π(x)denotes the present value of the cash flow (0, x). This defines a linearfunctional Π on the set of marketed payoffs. (In terms of the present-value notation of Chapter 1, Π(x) can be thought of as shorthand

Page 94: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 94

for Π((0, x)).) The market is implemented by J ≥ 2 contracts withwell-defined and linearly independent returns R1, . . . , RJ . Fixing anunderlying full-support probability, let Σ = [Σij] denote the returnvariance-covariance matrix: Σij ≡ cov (Ri, Rj), i, j ∈ 1, . . . , J. Aportfolio allocation is any ψ = (ψ1, . . . , ψJ) ∈ RJ such that

∑j ψ

j =

1, generating the return Rψ ≡∑

j ψjRj. The set of traded returns is

the linear manifold R ≡Rψ | ψ ∈ RJ

. Assume throughout that

there is no traded money-market account: var [R] > 0 for all R ∈ R.Let LR denote the linear span of R.

(a) Explain why LR equals the set of marketed payoffs, and x/Π (x) ∈R for every x ∈ LR such that Π(x) 6= 0.

(b) Verify that covariance defines an inner product for the vectorspace LR. Treat LR as an inner product space with the covarianceinner product for the remainder of this exercise.

(c) Let xΠ denote the Riesz representation of the present-value func-tional Π in LR:

xΠ ∈ LR and Π(x) = cov[xΠ, x

]for all x ∈ LR.

Explain why Π(xΠ)> 0 and therefore RΠ is well defined by

RΠ ≡ xΠ

Π(xΠ)∈ R.

Use Proposition B.1.6 to derive closed-form expressions, in terms of Σ,for the portfolio allocation that generates RΠ and the variance of RΠ.

(d) Show that var[RΠ]= min var [R] | R ∈ R by using an or-

thogonal projection argument to argue that the minimum exists, isunique and is characterized by the orthogonality condition RΠ ⊥ R −RΠ for all R ∈ R, or equivalently,

for all R ∈ R, cov[RΠ, R

]= var

[RΠ].

Verify that RΠ satisfies this condition and is therefore the traded returnof least variance. Also explain why the same conclusion can be reachedby applying Corollary B.4.7.

(e) The return R∗ ∈ R is a frontier return if

var [R∗] = min var [R] | ER = ER∗, R ∈ R .

Give a geometric interpretation of the property as an orthogonal pro-jection and (briefly) explain why R∗ ∈ R is a frontier return if andonly if R∗ ⊥ R − R∗ for every traded return R such that ER = ER∗.Call the market degenerate if all traded returns have the same mean,that is, there exists an m such that ER = m for all R ∈ R. What isthe set of frontier returns if the market is degenerate? Show that if themarket is not degenerate, then the set of frontier returns is a line inLR that contains and is orthogonal to RΠ.

Page 95: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 95

Hint: The idea is to show that given a frontier return R∗ 6= RΠ andany given mean m, there is a point on the line through R∗ and RΠ thatis the projection of zero to the linear manifold R ∈ R | ER = m. Therequisite orthogonality condition should follow from the correspondingorthogonality condition for R∗ and the fact that RΠ is orthogonal toR. This is the hard way—we will encounter an easier approach below.

(f) Let xE be the Riesz representation of E restricted to LR, definedby the requirements:

xE ∈ LR and Ex = cov[xE, x

]for all x ∈ LR.

Use Proposition B.1.6 to derive closed-form expressions, in terms of Σand

(ER1, . . . ,ERJ

), for the row vector βE such that xE =

∑j β

Ej R

j

as well as Π(xE).

(g) Prove that the market is degenerate if and only if there ex-ists a scalar m such that xE = mxΠ. Assume that the market is notdegenerate for the remainder of this exercise.

(h) Use Corollary B.4.7 to show that R∗ ∈ R is a frontier returnif and only if it takes the form αxΠ + βxE for scalars α and β. Usethis fact to give another proof that the set of frontier returns is a linein LR.

(i) Fix an arbitrary frontier return R∗ other than RΠ and show thatthere exists a unique frontier return R0 that is uncorrelated with R∗.Give a geometric interpretation of this result. How is R0 positionedrelative to RΠ and R∗ on the frontier line? How does R0 behave as R∗

approaches RΠ?(j) Suppose that R∗ is a frontier return and R0 is the unique frontier

return that is uncorrelated with R∗. Show that

E[R−R0

]=

cov [R∗, R]

var [R∗]E[R∗ −R0

], R ∈ R.

Conversely, show that if this beta pricing equation holds for someR∗, R0 ∈ R, then R∗ and R0 are uncorrelated, R∗ is necessarily afrontier return, and R0 can always be selected to be a frontier return.

Exercise 3 Assume the setting of Section 2.2 with a single period(T = 1). The (arbitrage-free) market is implemented by 1 + J lin-early independent contracts: an MMA with return R0 ≡ 1 + r andJ risky contracts with returns R1, . . . , RJ . Define 1 + µj ≡ ERj

and Σij ≡ cov [Ri, Rj], and write µ ≡(µ1, . . . , µJ

), a row vector,

and Σ ≡ [Σij], a J × J matrix. A portfolio allocation is anyrow vector ψ =

(ψ1, . . . , ψJ

)∈ RJ , generating the return Rψ ≡

R0+∑

j ψj (Rj −R0). The set of traded returns is the linear manifold

R ≡Rψ | ψ ∈ RJ

. As in Exercise 2, the set of marketed payoffs is

LR ≡ span (R) and for every x ∈ LR , we write Π(x), rather thanΠ((0, x)), for the present value of x. Moreover, by the argument ofExercise 2(a), x/Π (x) ∈ R for every x ∈ LR such that Π(x) 6= 0.

Page 96: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 96

(a) Explain why covariance is not an inner product in LR. Instead,for the remainder of this exercise assume LR is an inner product spacewith the inner product 〈x | y〉 ≡ E [xy]. As a notational convention,exponentiation takes precedence over expectation, and therefore theimplied squared norm of x is Ex2 ≡ E [x2].

(b) Let xΠ denote the Riesz representation of Π in LR:

xΠ ∈ LR and Π(x) = E[xΠx

]for all x ∈ LR.

Define the corresponding return

RΠ ≡ xΠ

Π(xΠ)∈ R.

Use Proposition B.1.6 to derive closed-form expressions, in terms of r,µ and Σ, for the portfolio allocation that generates RΠ.

Hint: Show and use the fact that xΠ−ExΠ is the Riesz representa-tion of the present-value functional in the linear subspace

x− Ex | x ∈ LR.

(c) Use an orthogonal projection argument to show that

E(RΠ)2

= minER2 | R ∈ R

.

What is the associated orthogonality condition satisfied by RΠ?(d) Show that the set of frontier returns is

αR0 + (1− α)RΠ | α ∈ R.

(e) Call the market degenerate if all traded returns have the samemean. Show that the set of frontier returns is a point if and only if themarket is degenerate, and otherwise it is a line.

(f) Arguing directly from the definition of frontier returns, showthat the set of frontier returns other than R0 is exactly the set oftraded returns of maximum absolute Sharpe ratio.

(g) The locus of all pairs (stdev [R∗] ,ER∗) asR∗ ranges over all fron-tier returns is known as the minimum-variance frontier. Describeand plot the minimum-variance frontier, and specify geometrically theexact location of the point

(stdev

[RΠ],ERΠ

).

Exercise 4 Consider the setting of Section 2.2. To simplify nota-tion, assume there is a single period (T = 1), which entails no loss ofgenerality since the argument of this exercise can be made over a singleperiod conditionally on any nonterminal spot. In Proposition 2.2.3 weencountered the beta of a traded return R with respect to the tradedreturn R∗:

β ≡ cov[R∗, R]

var [R∗].

Suppose that in an empirical implementation of the beta-pricing equa-tion of Proposition 2.2.3, a proxy R∗ + ε is used instead of a “true”

Page 97: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 97

frontier return R∗, where the error ε is judged to be small. The corre-sponding beta is

βε ≡ cov[R∗ + ε,R]

var [R∗ + ε].

Give a simple example illustrating the claim that an arbitrarily smallvalue of Eε2 is consistent with an arbitrarily large value of |βε − β|.

Exercise 5 (a) Show that an adapted process M is a martingaleif and only if M0 = EMτ for every stopping time τ : Ω → 0, . . . , T.(Note that τ is not allowed the value ∞.)

(b) Suppose the market is implemented by the contracts (δ, V ) ∈LJ ×LJ . Show that a strictly positive adapted process π is an SPD ifand only if for every finite stopping time τ : Ω → 0, . . . , T,

V0 =1

π0E

[τ−1∑t=0

πtδt + πτVτ

].

Exercise 6 (Binomial pricing) This is a continuation of Exercise 7of Chapter 1, whose setting and notation is assumed here. We also usethe notation R0 ≡ 1 + r, Y ≡ 1 + y and

(2.9.1) q ≡ (R0/Y )−D

U −D.

The condition of part (b) of Exercise 7, which is implied by the no-arbitrage assumption, is equivalent to q ∈ (0, 1). Assume that thiscondition holds for the remainder of this exercise.

(a) Compute all EMM-discount pairs for the given market. Is themarket complete and why? Explain why the condition 1 + r ∈ (0,∞)and q ∈ (0, 1) is sufficient for the market to be arbitrage-free.

(b) For the remainder of this exercise, assume (Q, ρ) is an EMM-discount pair. Is Z a Markov process under Q and why?

(c) Consider a contract (δ∗, V ∗) that is specified in terms of thepayoff function fT : NT → R by

δ∗T = V ∗T = fT (ZT ) and δ∗− = 0.

Use an EMM to show a pricing relationship of the form V ∗t = ft (Zt),

where the functions ft : Nt → (0,∞) are computed recursively, back-ward in time, starting with the known terminal function fT . Confirmthat your result is consistent with that of Exercise 7.

(d) What is the order of magnitude of the number of operationsneeded to compute V ∗

0 in part (c)? How does that compare to thenumber of spots on the information tree? What key assumptions makethis type of computational efficiency possible?

(e) Postulate an underlying probability P that is defined in termsof the constant p ∈ (0, 1) by

P (ω) = pN(ω) (1− p)T−N(ω) , ω ∈ Ω,

Page 98: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 98

where N ≡∑T

t=1 bt. The process B is defined recursively by

B0 = 0, ∆Bt = bt

√1− p

p− (1− bt)

√p

1− p, t = 1, . . . , T.

Verify that B is a dynamic orthonormal basis under P , and specify theset of all dynamically orthonormal bases in terms of B.

(f) Compute coefficients α and β so that∆Z

Z−= α + β∆B.

Be as specific as you can, given the primitives of the model.(g) Define the processes µR and σR by the return representation

∆G

S−= µR + σR∆B, µR ∈ P0, σR ∈ P1×d

0 .

Derive formulas for µR and σR in terms of α, β and Y .(h) Assume that Q = P η is an EMM, where η ∈ H. Compute

EQt−1 [∆Gt/St−1] and use the fact that η = −EQt−1∆Bt to conclude that

η =µR − r

σR.

Give an expression for η in terms of p and q, and use it together withthe definition of ∆B to confirm that

1− η∆B =q

pb+

1− q

1− p(1− b) ,

and that the conditional density process ξt = Et [dQ/dP ] is given by

ξt =

(q

p

)Nt(1− q

1− p

)t−Nt

, Nt =t∑

u=0

bu.

How can you use this result to recover the EMM expression you derivedin part (a)?

Exercise 7 Assume the stochastic setting of Section 2.7.(a) Suppose xt = x0 + αt + βBt, for constant α, β ∈ R. Using

Ito’s lemma to derive the Ito decomposition of exp (xt), and then usethis decomposition to compute E exp (xt). You can use without proofthe facts that in this context, the expectation E and integral

∫ t0

canbe interchanged, and the local martingale part in the Ito expansionof exp (xt) is in fact a martingale. Jensen’s inequality requires thatthe ratio E exp (xt) / exp (Ext) is greater than one. Explain how yourcalculation quantifies this ratio.

(b) Suppose xt is a strictly positive Ito process and α ∈ R. UseIto’s calculus to give expressions for d logxt and dxαt /x

αt as a function

of dxt/xt. Show how these expressions simplify when x is geometricBrownian motion with drift: dxt/xt = µdt+σdBt for constant µ and σ.

Page 99: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 99

Exercise 8 Use integration by parts to show the equivalence ofthe budget equation (2.8.2) and the budget equation (2.8.4) after thechange of unit of account implied by the strictly positive Ito process π.

Exercise 9 This exercise provides some insight on the relation-ship between the Brownian model of Section 2.8 and a high-frequencyversion of the binomial model of Exercise 5.

(a) As preparation, you will prove a special version of the stronglaw of large numbers. Suppose xn,N , n = 1, 2, . . . N ; N = 1, 2, . . . ,are i.i.d. (that is, identically distributed and stochastic independent)random variables. These are defined on a common state space with agiven probability measure P and corresponding expectation operatorE. Assume that for all n,N , Exn,N = 0 and E

[x2n,N

]= 1. Moreover,

assume that there exists a constant C such that for all n,N , |xn,N | ≤ Ceverywhere. (The argument you are about to give applies if the lastassumption is weakened to E

[x4n,N

]< ∞, but we do not need this

generality here.) Define the averages

xN ≡ 1

N

N∑n=1

xn,N , N = 1, 2, . . .

Prove that the random variable∑∞

N=1 x4N has finite expectation and

explain why it must then be true that limN→∞ xN = 0 with probabilityone. You can use the fact that E

∑∞N=1 x

4N =

∑∞N=1 E [x4N ], which is a

consequence of the monotone convergence theorem.(b) For each N = 1, 2, . . . , consider a version of the model of Ex-

ercise 5 (introduced in Exercise 7 of Chapter 1), where instead of nor-malizing the time unit to correspond to one period, we assume thatT = 1 and there are N periods. The time length of each period istherefore 1/N . We wish to select the remaining parameters so that theBrownian model of Section 2.8 is approximated as N gets large (with-out actually proving a convergence result). The transition probabilityp is the same for all N . Suppose zn,N , n = 1, 2, . . . , N ; N = 1, 2, . . .are i.i.d. random variables (under some given probability P on somecommon state space), and

p = P

[zn,N =

√1− p

p

]= 1− P

[zn,N = −

√p

1− p

].

The martingale basis for the N th model is BN , where

(2.9.2) BN0 = 0, BN

nN−BN

n−1N

=

√1

Nzn,N , n = 1, . . . , N.

The stock return parameters µR and σR are specified in terms of con-stants µ and σ by µR = µ/N and σR = σ/

√N . The stock cumulative

Page 100: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

2.9. EXERCISES 100

return process RN is given by

(2.9.3) RNnN−RN

n−1N

≡S n

Ne

yN

Sn−1N

− 1 = µ1

N+ σ

√1

Nzn,N , RN

0 ≡ 1.

The interest rate process of the MMA of the N th model is analogouslyscaled, so that one dollar invested in the account at time (n− 1) /Nbecomes 1 + r/N at time n/N . The constant parameters (r, µ, σ) donot vary with N . Using part (a), show that with probability one,

limN→∞

N∑n=1

(RN

nN−RN

n−1N

)2= σ2.

This shows that no matter what the probability p ∈ (0, 1) is, and givenany required level of precision, we can take N large enough so that thequadratic variation of RN equals σ2 up to the required precision. In thecontinuous-time limit, where R is an Ito process, this fact correspondsto the quadratic variation calculation (dRt)

2 = σ2dt.(c) Explain why in the finite binomial model the choice of the pa-

rameter p ∈ (0,∞) is irrelevant for the pricing of the European calloption. Also explain why in the Brownian model, the value of a (or µ)is irrelevant for the pricing of the European call option. (The absenceof µ in the Black-Scholes formula reflects this fact. Here you are askedto provide a deeper reason for this absence.)

(d) Given the insight of part (c), set p = 1/2. Recall that thestock return St/St−h over every nonterminal period is either U ≡ UN

or D ≡ DN , where the superscript N has been added to emphasizethe dependence of these parameters on N . (Technically speaking, theconvention ST = 0 requires us to slightly modify this statement, butthe last period becomes infinitesimal in the limit and does not matter.)What are suitable expressions for UN and DN in terms of the parame-ters (r, y, σ) so that, as N → ∞, the binomial model approximates theBrownian model of Section 2.8 with a = 0. Provide a brief explanationof your claim, without proving anything formal.

Page 101: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

CHAPTER 3

Optimality and Equilibrium Pricing

We have so far discussed markets that contain no cash flows thatare desirable in the sense of arbitrage. In this chapter, we expand thenotion of a desirable cash flow by introducing preferences. The absenceof traded desirable cash flows relative to a reference consumption plandefines the plan’s optimality and the related problem of optimal port-folio selection. Multi-agent optimality together with a market-clearingcondition defines competitive equilibrium.

3.1. Preferences and optimalityWe adopt the setting of Chapter 1, with information unfolding over

times 0, . . . , T , defining periods 1, . . . , T and 1 + K spots. There isno information at time zero and the state is revealed at time T . Aconsumption plan is an adapted process. The set of all consumptionplans is therefore L, which can be identified with R1+K . A consump-tion plan represents an agent’s total contingent consumption at eachspot. A cash flow available in a market can be incrementally addedto a given consumption plan to modify it to a preferred consumptionplan. In reality, consumption consists of bundles of multiple goods. Byassuming that consumption is one-dimensional at every spot, we areimplicitly taking relative spot prices of goods as given and we measureconsumption in some unit of account.

A consumption set is a set C of consumption plans such thatfor all c ∈ C, if x is an arbitrage, then c + x ∈ C. Given c ∈ C, wewish to specify a set D (c) with the interpretation that every x ∈ D (c)represents a desirable cash flow in the sense that c+x is a consumptionplan that is strictly preferred to c from the perspective of time zero.We impose some minimal requirements on preferences, listed below.Further structure will be imposed as needed later on.

Definition 3.1.1. A preference correspondence is a functionD whose domain, denoted dom (D), is a consumption set, and suchthat for all c ∈ dom (D), D (c) is a set of cash flows satisfying

• admissibility: x ∈ D (c) implies c+ x ∈ dom (D).• irreflexivity: 0 /∈ D (c).• continuity: D (c) is open.• monotonicity: For every arbitrage cash flow y, if x = 0 orx ∈ D (c) then x+ y ∈ D (c).

101

Page 102: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.1. PREFERENCES AND OPTIMALITY 102

The first two properties state that c + x must be an admissibleconsumption plan in order to be strictly preferred to c, and c cannotbe strictly preferred to itself. The third condition states that if c+x isstrictly preferred to c then there is a sufficiently small tolerance ϵ > 0such that ‖x − x′‖ < ϵ implies that c + x′ is also strictly preferred toc. Finally, the fourth condition means that the agent always strictlyprefers to increase consumption at some spot, provided consumption isnot reduced at any other spot.

We henceforth fix a reference market X, relative to which we for-mulate notions of individual and allocational optimality.

Definition 3.1.2. A consumption plan c is optimal for the pref-erence correspondence D given X if c ∈ dom (D) and X ∩ D (c) = ∅.

The first part of the optimality condition requires that c is an ad-missible consumption plan, while the second part states that there isno trade that results in a desirable incremental cash flow relative to c.Since, by monotonicity, D (c) contains all arbitrage cash flows, opti-mality of c implies the no-arbitrage condition X ∩ R1+K

+ = 0.Individual optimality extends usefully to a notion of allocational

optimality as follows. Fix a positive integer I. Informally, we thinkof i ∈ 1, . . . , I as labeling agents representing potential market par-ticipants. A consumption allocation is a tuple c =

(c1, . . . , cI

),

with the interpretation that ci is a consumption plan for agent i. Apreference profile is a preference correspondence tuple

(D1, . . . ,DI

),

with Di representing the admissible consumption plans and prefer-ences of agent i. The sum of the Di is the correspondence D ondom (D) ≡

∏Ii=1 dom (Di) defined by

(3.1.1) D (c) ≡I∑i=1

Di(ci)≡

I∑i=1

xi | xi ∈ Di(ci)

for all i.

Definition 3.1.3. An allocation c =(c1, . . . , cI

)is optimal for

the preference profile(D1, . . . ,DI

)given the market X if c ∈ dom (D)

and X ∩ D (c) = ∅.

The existence of some x ∈ X∩D (c) can be thought of as a profitablemarket-making opportunity. Suppose x =

∑Ii=1 x

i where xi ∈ Di (ci).A market maker can offer xi to each agent i in exchange for somepositive amount, since agent i strictly prefers to add xi to ci. Themarket maker can then simply trade away the aggregate cash flow x,since x ∈ X. The net result is a positive profit for the market maker,a type of arbitrage that is not in X but can be obtained by usingX and simultaneously contracting with the I agents in an incentivecompatible way. Optimality of the allocation c given X means thereare no market-making opportunities of this sort.

Page 103: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.1. PREFERENCES AND OPTIMALITY 103

The monotonicity and continuity properties of preference corre-spondences imply that the optimality of an allocation c is equivalentto the apparently stronger condition of optimality of the allocation forevery subset of the agents.

Proposition 3.1.4. The allocation(c1, . . . , cI

)is optimal for the

preference profile(D1, . . . ,DI

)given X if and only if for every A ⊆

1, . . . , I, the allocation (ci)i∈A is optimal for (Di)i∈A given X.

Proof. Suppose x =∑

i∈A xi ∈ X, where xi ∈ Di (ci) for all

i ∈ A ⊆ 1, . . . , I. For i /∈ A let xi = 0. The idea is to increase everyxi at the expense of a single agent α ∈ A by a sufficiently small amountso that agent α still finds the resulting incremental cash flow desirable.More formally, let y be any arbitrage cash flow, say y = 1, and fix anyagent α ∈ A. Given that Dα (cα) is an open set, choose scalar ϵ > 0small enough so that xα ≡ xα − (1− I) ϵy ∈ Dα (cα), and for everyagent i 6= α let xi ≡ xi + ϵy. By preference monotonicity, xi ∈ D (ci)

for all i, and by construction∑I

i=1 xi = x ∈ X. Therefore,

(c1, . . . , cI

)is not optimal given X. The converse is immediate.

Since individual optimality is the same as the degenerate case ofsingle-agent allocational optimality, it follows that allocational opti-mality implies individual optimality.

Corollary 3.1.5. If(c1, . . . , cI

)is optimal for

(D1, . . . ,DI

)given

X, then for all i ∈ 1, . . . , I, ci is optimal for Di given X.We proceed under the assumption that c represents either a con-

sumption plan of a single agent with preference correspondence D,or an allocation c =

(c1, . . . , cI

)for I agents with preference profile(

D1, . . . ,DI). In the latter case, D denotes the sum of the Di as

specified in (3.1.1) with dom (D) ≡∏I

i=1 dom (Di). In the followingdiscussion, any statement of optimality of c is understood to be for Dif c is a consumption plan and for

(D1, . . . ,DI

)if c is an allocation.

With these conventions, we introduce a dual notion of optimality thatwill be key in relating optimality given a market to optimality given abudget constraint in terms of present values.

Definition 3.1.6. For any linear functional Π : L → R, theconsumption plan or allocation c is Π-optimal if c ∈ dom(D) andx ∈ D (c) implies Π(x) > 0.

Since D (c) contains every arbitrage-cash flow, if c is Π-optimal,then Π is necessarily positive (Π(x) > 0 if x is an arbitrage). Aconsequence of this is that Π-optimality is the same as optimality giventhe complete market for which Π is the unique present-value function.

Lemma 3.1.7. A consumption plan or allocation is Π-optimal if andonly if it is optimal given the market XΠ ≡ x ∈ L | Π(x) = 0.

Page 104: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.1. PREFERENCES AND OPTIMALITY 104

Proof. Suppose c is not Π-optimal and therefore Π(x) ≤ 0 forsome x ∈ D (c). Then the constant cash flow δ = −Π(x) is either zeroor an arbitrage and therefore x + δ ∈ D (c) ∩ XΠ, which implies c isnot optimal given XΠ. The converse is immediate.

The lemma allows us to apply Proposition 3.1.7 and its corollary toΠ optimality. Therefore, if an allocation c =

(c1, . . . , cI

)is Π-optimal,

then ci is Π-optimal for every agent i. The fact that the Π in thisstatement is the same for every agent will be key.

The duality between optimality given a market and Π-optimalityfor a present-value function Π has the geometric structure of the dualitybetween an arbitrage-free market and a present-value function (Theo-rem 1.4.3). The set D (c) enlarges the set of arbitrage cash flows andthe condition that Π(x) is positive for all x ∈ D (c) strengthens thecondition that Π is positive. A present-value function Π (strictly) sep-arates the market from the set of arbitrage cash flows, while a presentvalue Π such that c is Π-optimal separates the market from D (c). Thispicture is the basis for the following proposition, where we take the ref-erence market X as given, and we say that the consumption plan orallocation c is optimal to mean that c is optimal given the marketX. Note that the second part assumes convexity of D (c), which in themulti-agent case is implied by the convexity of every Di (c).

Proposition 3.1.8. The following are true for all c ∈ dom(D).(1) If c is Π-optimal for some present-value function Π, then c is

optimal.(2) If c is optimal and D (c) is convex, then there exists a present-

value function Π such that c is Π-optimal.(3) Suppose the market complete and Π is the unique present-value

function. Then c is optimal if and only if it is Π-optimal.

Proof. (1) If x ∈ D (c), then Π(x) > 0 and therefore x /∈ X.(2) Suppose c is optimal and D (c) is convex. Viewing the space of

adapted processes as R1+K with the Euclidean inner product, the sepa-rating hyperplane theorem (Corollary B.5.3) implies that there exists anonzero vector p ∈ R1+K such that p · x ≤ 0 for all x ∈ X and p · y ≥ 0for all y ∈ D (c). Suppose that p ·x = 0 for some x ∈ D (c). Since D (c)is open, there is ϵ > 0 such that x − ϵp ∈ D (c), leading to the absur-dity 0 ≤ p · (x− ϵp) < 0. Therefore, p · x > 0 for all x ∈ D (c), whichimplies that p is strictly positive (since D (c) contains every arbitrage).It follows that Π(x) = p · x/p0 defines a present-value function suchthat c is Π-optimal.

(3) This is Lemma 3.1.7 with X = XΠ.

A preference correspondence D has been defined from the perspec-tive of time zero. A preference correspondence DF,t can be analogously

Page 105: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.1. PREFERENCES AND OPTIMALITY 105

defined from the perspective of spot (F, t), with DF,t (c) ⊆ LF,t for ev-ery c ∈ C. Given a corresponding spot-(F, t) market XF,t ⊆ LF,t, a keyobservation is that if XF,t ⊆ X and DF,t (c) ⊆ D (c), then X∩D (c) = ∅implies XF,t ∩ DF,t (c) = ∅. This condition states that if c is optimalfrom the perspective of spot zero, then it is also optimal from the per-spective of spot (F, t). The condition guarantees that an agent thatselects an optimal consumption plan c at time zero has no incentive todeviate from that choice as uncertainty unfolds. In discussing Proposi-tion 1.2.5, we saw that XF,t ⊆ X is a dynamic consistency assumptionon the market: If an incremental cash flow x is available in the marketat spot (F, t), then it is also available at time zero, since the agent canmake contingent plans to carry out whatever trades generate x if spot(F, t) is realized. Analogously, DF,t (c) ⊆ D (c) is a dynamic con-sistency assumption on preferences: If from the perspective of spot(F, t), consumption plan c+ x is strictly preferred to c for some incre-mental cash flow x ∈ LF,t, then the agent at time zero anticipates thispreference contingent on spot (F, t), the only contingency on which xis non-zero, and therefore decides that c + x is strictly preferred to cfrom the perspective of time zero as well.

A useful way of thinking about dynamic choice is as if the deci-sion maker is multiple agents, one for every spot. Spot-(F, t) agenthas preferences represented by DF,t and selects trades in XF,t. It isnot hard to envision situations where dynamic consistency is violated.For example, suppose that the time-zero copy of the agent considers acontingent trade at spot (F, t) optimal, let’s say a trade to rebalanceinto a falling stock market, yet, the spot-(F, t) copy of the same agent,faced with immediate market losses becomes afraid and stays out ofthe market. (As former heavyweight world champion Mike Tyson putit, “Everyone has a plan ’till they get punched in the mouth.”) Perhapsthe spot-zero agent has poor foresight and would have chosen differ-ently given a better understanding of the future self. Alternatively,the spot-zero agent may be well aware of the role of future tempta-tions and may therefore seek to commit1 to a plan at time zero in away that cannot be reversed at spot (F, t). The strategic interactionamong copies of the same agent can become complex (and interesting).Here we bypass this complexity, by removing any conflict among copiesof the same agent. Dynamic consistency ensures that what is best forthe spot-zero copy of the agent is also best for the spot-(F, t) agent,and is therefore sufficient to only consider time-zero optimal decisions.Analogous reasoning applies to the notions of allocational optimalityand equilibrium, which we therefore only discuss from the perspectiveof time zero.

1The concept is discussed by Strotz [1957], who quotes from Homer’s Odyssey.Odysseus, being well aware his future self cannot resist the song of the Sirens,instructs his crew to tie him to the mast.

Page 106: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.2. EQUILIBRIUM 106

3.2. EquilibriumIn order to introduce a simple notion of (competitive) equilibrium,

we formally define an agent to be a pair of a preference correspondenceand a consumption plan, called the agent’s endowment. We take asprimitive the I agents

(3.2.1)(D1, e1

), . . . ,

(DI , eI

).

The initial allocation(e1, . . . , eI

)can be modified to a new allocation

c ≡(c1, . . . , cI

)through access to a market X.

Definition 3.2.1. An allocation c is X-feasible if ci − ei ∈ X forall i, and clears the market if

∑Ii=1 c

i = e, where e ≡∑I

i=1 ei is the

aggregate endowment. An equilibrium is a pair (X, c), where Xis a market, c is an X-feasible allocation that clears the market, andfor all i, ci is optimal for Di given X.

Equilibrium is commonly formulated in the literature in terms ofcontracts implementing the market, whose dividend processes are givenand whose prices are set in equilibrium. The following example out-lines the relationship of this approach to the equilibrium notion justintroduced.

Example 3.2.2 (Contract-market equilibrium). Consider J con-tracts with given dividend processes δ ≡

(δ1, . . . , δJ

)′. A contract-market equilibrium is a corresponding vector of value processesV ≡ (V 1, . . . , V J)′ and trades θ ≡ (θ1, . . . , θJ) in the contracts (δ, V )such that

∑i θ

i = 0 and θi is an optimal trading strategy for agenti, in the sense that if θi generates cash flow xi and ci ≡ ei + xi, thenX (δ, V ) ∩ Di (ci) = ∅. Recall that X (δ, V ) denotes the market im-plemented by (δ, V ). The last optimality condition, therefore, statesthat there is no trading strategy in (δ, V ) that agent i desires to addto θi. If (V, θ) is a contract-market equilibrium, then

∑i θ

i = 0 im-plies

∑i x

i = 0, and therefore (X (δ, V ) , c) is an equilibrium. In theconverse direction, suppose that (X (δ, V ) , c) is an equilibrium and foreach agent i, let θi be a trade that finances ci, meaning that θi gen-erates a cash flow xi ∈ X (δ, V ) such that ci = ei + xi. While it isclear that ci is optimal for agent i given the market X (δ, V ) and that∑

i xi = 0, it is not necessary that

∑i θ

i = 0, since there are potentiallymore than one trading strategies generating a given traded cash flow.It is possible, however, to choose θ =

(θ1, . . . , θJ

)so that

∑i θ

i = 0,and therefore so that (V, θ) is a contract-market equilibrium. The trickis to pick θi to finance ci for i > 1, and then define θ1 = −

∑i>1 θ

i.The fact that c clears the market implies that θ1 finances c1, and wehave the desired contract-market equilibrium. ♦

Page 107: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.2. EQUILIBRIUM 107

Market equilibrium as defined here is closely related to the classicalWalras [1874] notion of a competitive exchange equilibrium in the ab-sence of time and uncertainty. For example, given an initial allocationof bread and wine, a competitive exchange equilibrium is a price foreach good and a new allocation that is resource feasible (no bread orwine is created) and individually optimal (agents cannot do better byselling some wine to buy more bread or vice versa). A key insight ofArrow [1953, 1963] and Debreu [1959] was that the same notion canbe applied in contexts with time and uncertainty by reinterpreting thenotion of a good, in the example wine or bread, to be what we havecalled Arrow cash flows (also known as Arrow-Debreu securities). Bun-dles of goods correspond to cash flows and prices of goods correspondto state prices. A state price vector p can always be identified withthe corresponding linear functional Π(c) = p · c, which leads to thefollowing equilibrium notion.

Definition 3.2.3. An Arrow-Debreu equilibrium is a pair (Π, c)of a linear functional Π : L → R and an allocation c ≡

(c1, . . . , cI

)such

that∑I

i=1 ci ≤ e and for all i, Π(ci) ≤ Π(ei) and

(3.2.2) c− ci ∈ Di(ci)

=⇒ Π(c) > Π(ei).

The following equivalent form of this definition will be useful forour purposes.

Lemma 3.2.4. The pair (Π, c) is an Arrow-Debreu equilibrium ifand only if Π : L → R is a positive linear functional, c clears themarket, and for all i, ci is Π-optimal for Di and Π(ci) = Π (ei).

Proof. Suppose (Π, c) is an Arrow-Debreu equilibrium and x isan arbitrage. For every scalar ϵ > 0, ϵx ∈ Di (ci) and therefore

Π(ci)+ ϵΠ(x) = Π

(ci + ϵx

)> Π

(ei)≥ Π

(ci).

This implies that Π(x) > 0 and, since x can be any arbitrage, thatΠ is positive. Letting ϵ ↓ 0, we conclude that Π(ci) = Π (ei) for all i.Individual Π-optimality and market clearing follow. The converse claimis immediate.

A positive linear functional Π on L is the present-value function forthe complete market XΠ ≡ x ∈ L | Π(x) = 0. By Proposition 3.1.8and the above lemma, it follows that (Π, c) is an Arrow-Debreu equilib-rium if and only if

(XΠ, c

)is an equilibrium. How does an equilibrium

(X, c) relate to an Arrow-Debreu equilibrium if X is not complete? ByProposition 3.1.8, assuming convex preferences, individual optimalityimplies Π-optimality for some present-value function Π, which neednot be common for all agents, unless the entire allocation c is optimalgiven the market. This leads us to the notion of an effectively competemarket equilibrium.

Page 108: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.2. EQUILIBRIUM 108

Definition 3.2.5. An equilibrium (X, c) is effectively completeif the allocation c is optimal for

(D1, . . . ,DI

)given the market X.

As in the last section, we define D ≡∑I

i=1Di and note that D (c) isconvex if every Di (ci) is convex. The following claims are a consequenceof Proposition 3.1.8 and our earlier discussion.

Proposition 3.2.6. The following are true for any market X.(1) If the allocation c is X-feasible and there exists a present-value

function Π for X such that (Π, c) is an Arrow-Debreu equilib-rium, then (X, c) is an effectively complete market equilibrium.

(2) Suppose D (c) is convex. If (X, c) is an effectively complete-market equilibrium, then there exists a present-value functionΠ for X such that (Π, c) is an Arrow-Debreu equilibrium.

(3) Suppose the market X is complete market with (necessarilyunique) present-value function Π. Then (X, c) is an equilib-rium if and only if (Π, c) is an Arrow-Debreu equilibrium.

Corollary 3.2.7. Suppose (X, c) is an equilibrium. If there existsexists a complete market X ⊇ X such that

(X, c

)is an equilibrium,

then (X, c) is an effectively complete market equilibrium. The converseclaim is also true if D (c) is convex.

We conclude this section with two highly stylized examples of (ef-fectively complete) equilibrium pricing that have played an importantrole in the early development of asset pricing theory. The first oneis a version of the CAPM (Capital Asset Pricing Model), which is amyopic, single-period model in which covariance with the market port-folio explains expected excess returns relative to a traded (credit-risk-free) money market account. The second example, which works just aswell in the multi-period case, is one in which individual optimality canbe characterized as the optimality of the aggregate endowment for asuitably constructed “representative” agent. Representative-agent ar-guments of this type have been used extensively in the literature tojustify the formulation of simple single-agent equilibrium models thatrelate aggregate consumption to market risk premia and interest rates.Both examples rely on special preference and endowment structure anda high degree of similarity among agents.

Example 3.2.8. (CAPM) There is a single period (T = 1) and anunderlying full-support probability, which can be thought of as repre-senting common beliefs among the I agents. Consider an equilibrium(X, c) with a traded money-market account (MMA) defining the shortrate r and return R0 ≡ 1 + r. The set of traded returns is

R ≡ −x1/x0 | x ∈ X, x0 6= 0 .Each endowment ei is assumed to be marketed, with present valuewi (which is uniquely defined since X is arbitrage-free). The present

Page 109: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.2. EQUILIBRIUM 109

value of the aggregate endowment e is therefore w ≡∑I

i=1wi. The

market return Rm is the return of the traded contract whose payoffis the time-one aggregate endowment e1. Since e1 has present valuew − e0, Rm ≡ e1/ (w − e0) ∈ R. To ensure that the market return iswell-defined and has positive variance, we assume that(3.2.3) var [e1] > 0 and e0 6= w.

The CAPM states that Rm is a beta-pricing return:

(3.2.4) ER−R0 =cov (R,Rm)

var (Rm)

(ERm −R0

), R ∈ R.

We will show the necessity of the CAPM with ERm 6= R0 under theadditional key assumption that all agents are variance averse:

Di (c) ⊇ x ∈ L | (x0,Ex1) = (0, 0) and var [c1 + x1] < var [c1] .Moreover, assuming preference transitivity, we will show that the equi-librium is necessarily effectively complete.

Ignoring technicalities, the essential argument is simple given thebeta-pricing discussion of Section 2.2. Because of the assumptions ofcommon beliefs, marketed endowments and variance aversion, in equi-librium, agents sell their endowments and buy plans on the minimum-variance frontier. Such frontier plans are characterized by two-fundseparation: All agents can finance their equilibrium consumption byholding the MMA and the same frontier risky investment. The key isthat these positions can be added up. By market clearing, it followsthat the market return takes the same form and must therefore alsobe a frontier return. The CAPM equation (3.2.4) then follows fromProposition 2.2.3.

We proceed with the technical details, starting with a review ofminimum-variance analysis in the current context. On the space ofrandom variables, we use the inner product 〈x | y〉 = E [xy] withinduced norm ‖ · ‖. Let xΠ1 denote the Riesz representation of thepresent-value functional on X1 ≡ x1 | x ∈ X, which is the uniquexΠ1 ∈ X1 such that −x0 = 〈xΠ1 | x1〉 for all x ∈ X. Define xΠ ∈ Xby letting −xΠ0 = 〈xΠ1 | xΠ1 〉, a positive number, since xΠ1 6= 0 by theMMA pricing equation 〈xΠ1 | R0〉 = 1. The return RΠ ≡ −xΠ1 /xΠ0is therefore well-defined. Call x ∈ X a frontier cash flow if var [x1]minimizes var [y1] as y ranges over X subject to the constraints y0 = x0and Ey1 = Ex1, a condition that can be equivalently stated as

‖x (1) ‖ = min‖z‖ | 〈xΠ1 | z〉 = −x0, 〈1 | z〉 = Ex1, z ∈ X1

.

By Corollary B.4.7, x ∈ X is a frontier cash flow if and only if x1 ∈span

(xΠ1 , 1

)= span

(RΠ, R0

).

The main CAPM argument follows. By the marketed-endowmentassumption, ci = wi1Ω×0 + xi, where wi ∈ R and xi ∈ X. By theoptimality of ci for agent i and variance-aversion, xi is a frontier cash

Page 110: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.2. EQUILIBRIUM 110

flow and therefore xi1 = aiRΠ + biR0 for some ai, bi ∈ R. Let a ≡∑

i ai

and b ≡∑

i bi. Adding up over all agents and using market clearing

and the regularity assumption (3.2.3), we find(3.2.5) e = w1Ω×0 + x, where x1 = aRΠ + bR0 and a 6= 0.

By the same regularity assumption, x0 6= 0 and the market returnRm has positive variance. This shows that (−1, Rm) is a frontier cashflow and therefore Rm is a frontier return, in the sense that for allR ∈ R, ER = ERm implies var [R] ≤ var [Rm]. As in Lemma 2.2.2,this means that Rm is the projection of zero onto the linear manifoldRm ≡ R ∈ R | ER = ERm, a condition that is equivalent to theorthogonality of Rm to Rm, which is in turn equivalent to cov [R,Rm] =var [Rm] for all R ∈ Rm. The CAPM equation (3.2.4) with ERm 6= R0

now follows exactly as in the (1 =⇒ 2) part of Proposition 2.2.3.Finally, we show that (X, c) is an effectively complete market equi-

librium, assuming preference transitivity: if y1 ∈ Di (ci) and y2 ∈Di (ci + y1) then y1 + y2 ∈ Di (ci). Suppose yi ∈ Di (ci) for everyagent i and

∑i y

i ∈ X. We will show that this violates individualoptimality and hence cannot happen in equilibrium. Let yi ∈ X bedefined by the requirement that yi1 is the projection of yi1 onto X1.Write yi = yi + (δi, εi), where δi ∈ R and εi is orthogonal to X1. Since1 ∈ X1, Eεi = 0. By variance aversion and preference transitivity,(3.2.6) yi +

(δi, 0

)∈ Di

(ci).

Let y ≡∑

i yi, y ≡

∑i y

i and ε ≡∑

i εi. Since ε = y1 − y1 ∈ X1

and ε is also orthogonal to X1, ε = 0 and therefore∑

i δi1Ω×0 =

y − y ∈ X. Since X is arbitrage-free,∑

i δi = 0. If for some i, δi < 0,

preference monotonicity and (3.2.6) implies yi ∈ Di (ci), which violatesthe optimality of ci. Therefore,

∑i δi = 0 and δi ≥ 0 for all i, which

implies δi = 0 for all i. Equation (3.2.6) reduces to yi ∈ Di (ci), whichagain violates the optimality of ci. ♦

Example 3.2.9. (Representative-agent pricing) The following is anexample of representative-agent pricing based on the assumed scale in-variance (also known as homotheticity) of preferences. Other variationsare explored in the exercises.

We will characterize an equilibrium (X, c) for agents who are spec-ified in terms of a fixed reference agent (D0, e0) with dom (D0) ≡ L++.A key assumption is that D0 is scale invariant (SI):

D0 (sc) = sD0 (c) for all s ∈ (0,∞) and c ∈ dom(D0),

where sD0 (c) ≡ sx | x ∈ D0 (c). For i = 1, . . . , I, agent (Di, ei) isdefined in terms of the parameters (bi, wi, xi) ∈ L × (0,∞)×X by

Di (c) = D0(c− bi

)for all c ∈ dom

(Di)≡ bi + L++,

andei ≡ bi + wie0 + xi.

Page 111: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 111

The consumption plan bi can be thought of as a subsistence plan, thescalar wi as the agent’s wealth above subsistence measured in multiplesof the reference plan e0, and xi as an endowed traded cash flow. Forexample, if e0 = 1Ω×0 and X is arbitrage-free, then the present valueof the endowment in excess of subsistence, ei − bi, is equal to wi. Wedo not assume, however, that e0 is marketed. Note that the aggregateendowment can be written as

e ≡∑

iei = b+ we0 + x,

where b ≡∑

i bi, w ≡

∑iw

i and x ≡∑

i xi. Let us now define the

allocation c resulting from first allocating to all agents their subsistenceplans bi and then allocating the remaining aggregate endowment e− bin proportion to each agent’s wealth wi above subsistence:

ci ≡ bi +wi

w(e− b) , i = 1, . . . , I.

The representative agent (D, e), whose endowment is the aggregateendowment, is defined by the preference correspondence

D (c) ≡ D0 (c− b) for all c ∈ dom (D) ≡ b+ L++.

The claim is that (X, c) is an equilibrium if and only if the ag-gregate endowment e is optimal for the representative agent given themarket X, that is, X ∩ D (e) = ∅. To verify this claim, one can easilycheck, using the definition of endowments and the allocation, that theallocation c is X-feasible and clears the market. Moreover, using thepreference and allocation definitions and the scale-invariance of D0, wehave

(3.2.7) 1

wiDi(ci)= D0

(ci − bi

wi

)= D0

(e− b

w

)=

1

wD (e) ,

and therefore X ∩ Di (ci) = ∅ if and only if X ∩ D (e) = ∅.Finally, assuming (X, c) is an equilibrium and D (e) is convex, we

show that (X, c) is an effectively complete market equilibrium. Sup-pose that yi ∈ Di (ci) for all i and y ≡

∑i y

i. From (3.2.7), wehave yi ≡ (w/wi) y

i ∈ D (e), and therefore, by the convexity of D (e),y =

∑i (w

i/w) yi ∈ D (e). Since e is an optimal consumption plan forthe representative agent, it follows that y /∈ X. ♦

3.3. Utility functions and optimalityIn the remainder of this chapter we introduce more structured for-

mulations based on utility representations of preferences. In this sec-tion we review general characterizations of optimality and equilibriumusing utility functions, which are further specialized in the followingsection in a way that will allow us to relate time preferences andrisk aversion to equilibrium pricing as well as optimal consumption

Page 112: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 112

and portfolio choice. For simplicity, we assume throughout that ev-ery agent’s consumption set is C ≡ L++, although the theory clearlyapplies more generally and some of the exercises assume specificationsthat require different consumption sets. We define utility functions tobe continuous and monotone, which is not standard in the literature,but is convenient for our purposes. The existence of a utility represen-tation is characterized in Section A.1.

Definition 3.3.1. A utility function is a continuous function ofthe form U : C → R that is increasing: for all c ∈ C, if x is anarbitrage, then U (c+ x) > U (c). The function U : C → R is a utilityrepresentation of the preference correspondence D if dom (D) = Cand x ∈ D (c) is equivalent to U (c+ x) > U (c), in which case U is saidto represent D. Two utility functions are ordinally equivalent ifthey represent the same preference correspondence. A utility functionU is normalized if U (U (c)) = U (c) for all c. The compensationfunction of D is the function

U (c) ≡ inf s ∈ R | s− c ∈ D (c) , c ∈ dom (D) .

Note that the utility functions U and U on C are ordinally equiva-lent if and only if U = f U for a (strictly) increasing function f fromthe range of U onto the range of U . If D admits a utility representationU , the compensation function U is the unique normalized utility thatis ordinally equivalent to U ; it can be expressed as U = ϕ−1 U whereϕ (s) ≡ U (s), s ∈ (0,∞). While the numerical value U (c) in isolationis meaningless, U (c) represents a per-period payment (in the assumedunit of account) of an annuity that is equally desirable as c.

A preference correspondence D is convex if D (c) is a convex setfor every c ∈ dom (D), a condition that can be thought of as preferencefor consumption smoothing across spots. Concavity of a utility func-tion clearly implies the convexity of the preference correspondence itrepresents. The converse claim is not generally true—a convex prefer-ence correspondence admitting a utility representation may admit noconcave utility representation.2 The following example shows that theconverse is true for scale-invariant preferences.

Example 3.3.2. (Scale-invariant preferences) Suppose the prefer-ence correspondence D is scale invariant (SI):

D (sc) = sD (c) for all s ∈ (0,∞) and c ∈ L++.

The compensation function U of D is homogeneous of degree one:U (sc) = sU (c) for all s ∈ (0,∞) and c ∈ L++.

2For example, U (x, y) ≡ x +√

x2 + y represents a convex preference corre-spondence on R2

++, but there is no strictly increasing function f such that f Uis concave. This is shown by Reny [2013] based on an idea outlined by Aumann[1975].

Page 113: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 113

It follows that a preference correspondence admitting a utility repre-sentation is SI if and only if it admits a utility representation that ishomogeneous of degree one.3 If D is both SI and convex, then U isnecessarily concave. Since U is homogeneous of degree one, concavityof U is equivalent to its super-additivity:(3.3.1) U (x+ y) ≥ U (x) + U (y) for all x, y ∈ L++.

To show super-additivity given convexity of D, fix any x, y ∈ L++ anddefine α, β ∈ R++ so that

S ≡ U (x+ y) = U (αx) = U (βy) ,

that is, α ≡ S/U (x) and β ≡ S/U (y). Let p ≡ α/ (α + β). For everypositive integer n, we have

S = U (αx) < U

(αx+

1

n

), S = U(αx) = U (βy) < U

(βy +

1

n

),

and therefore S < U ((1− p)αx+ pβy + n−1), since D (αx) is convex.Letting n go to infinity, we have

S ≤ U ((1− p)αx+ pβy) =Sαβ

α + β.

Therefore, α−1 + β−1 ≤ 1, which rearranges to (3.3.1). ♦Let us now fix a reference market X and a preference correspon-

dence D with utility representation U . We call a consumption plan coptimal if it is optimal for D given X. On the space of all adaptedprocesses, we use the inner product

(3.3.2) 〈x | y〉 = ET∑t=0

xtyt, x, y ∈ L.

In applications, U is commonly assumed to be differentiable, in whichcase a simple characterization of optimality can be given in terms ofthe gradient vector ∇U (c), defined by

〈∇U (c) | x〉 = limϵ↓0

U (c+ ϵx)− U (c)

ϵ, x ∈ L.

Example 3.3.3. (Gradient of additive utility) Suppose the utilitytakes the additive form U (c) = EQ

∑t ut (ct), where each ut : (0,∞) →

R is a differentiable function andQ is a full-support probability that canbe thought of as expressing beliefs. Let ξ denote the conditional densityprocess ξt = EtdQ/dP . Lemma 2.4.4 implies U (c) = E

∑t ut (ct) ξt and

therefore ∇U (c)t = u′t (ct) ξt. For example, Section A.4 shows that ifU represents scale-invariant preferences, then ut must take a power orlogarithmic form, which implies that the utility function is necessarilydifferentiable. ♦

3In the literature, preferences that are representable by a homogeneous-of-degree-one utility function are commonly referred to as “homothetic.”

Page 114: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 114

spot 0

spot 1

market X

∇U(c)

Figure 3.3.1. The dotted line is the boundary of theset D (c) = x | c+ x ∈ L++, U (c+ x) > U (c), whichincludes the shaded region of all arbitrage cash flows.The gradient ∇U (c) is orthogonal to the dotted line andpoints to the direction of steepest utility gain. Optimal-ity means that X does not intersect D (c) and is there-fore tangent to the dotted line, which in turn implies that∇U (c) is orthogonal to the market.

The role of the utility gradient for optimality is illustrated in Fig-ure 3.3.1 (which extends Figure 1.4.1) and is formally shown in the fol-lowing proposition. Note that the assumed strict positivity of ∇U (c)is not implied by the monotonicity of U . (Set ut (c) = (c− 1)3 in Ex-ample 3.3.3 and compute ∇U (1).) The reader can show, however, thatif U is concave, then the fact that U is monotone does imply the strictpositivity of ∇U (c).

Proposition 3.3.4. Suppose c ∈ C and the gradient vector ∇U (c)exists and is strictly positive. If c is optimal, then ∇U (c) is a state-price density. Conversely, if U is concave and ∇U (c) is a state-pricedensity, then c is optimal.

Proof. Suppose c is optimal. Given any x ∈ X, we use the factthat C is open to select ε > 0 so that c + αx ∈ C for all α ∈ [0, ε].The function f (α) = U (c+ αx), α ∈ [0, ε], is maximized at zero andtherefore has a nonpositive right derivative at zero: 〈∇U (c) | x〉 ≤ 0.By assumption, ∇U (c) ∈ L++ and therefore ∇U (c) is a state-pricedensity. Conversely, suppose that U is concave and π ≡ ∇U (c) is astate-price density. For all x ∈ X such that c + x ∈ C, the gradientinequality and the fact that 〈π | x〉 ≤ 0 imply that U (c+ x) ≤ U (c)+〈π | x〉 ≤ U (c).

Page 115: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 115

In cases where the utility U is known to be concave but differentia-bility cannot be guaranteed, optimality at c ∈ C can be characterizedin terms of the superdifferential

∂U (c) = π ∈ L | c+ x ∈ C =⇒ U (c+ x) ≤ U (c) + 〈π | x〉 .

As shown in Section B.5, assuming U is concave, ∂U (c) is nonempty,and d = ∇U (c) if and only if ∂U (c) = d.

Proposition 3.3.5. Consider any c ∈ C. If there exists a state-price density π ∈ ∂U (c), then c is optimal. Conversely, if c is optimaland U is concave, then there exists a state-price density π ∈ ∂U (c).

Proof. Suppose π ∈ ∂U (c) is a state-price density. For all x ∈ Xsuch that c + x ∈ C, 〈π | x〉 ≤ 0 and therefore U (c+ x) ≤ U (c) +〈π | x〉 ≤ U (c). This proves the optimality of c. Conversely, suppose cis optimal and U is concave. Consider the convex sets in R1+K × R:

A ≡ (x, α) | x ∈ X, U (c) < α ,B ≡ (y, β) | c+ y ∈ C, β ≤ U (c+ y) .

If (x, α) ∈ A ∩ B, then U (c) < α ≤ U (c+ x) for some x ∈ X, contra-dicting the optimality of c. Note also that (0, U (c)) is in the closureof both sets. Therefore, by the separating hyperplane Theorem B.5.3,there exists some nonzero (p, r) ∈ R1+K × R such that

(x, α) ∈ A =⇒ p · x+ rα ≤ rU (c) ,(3.3.3)(y, β) ∈ B =⇒ p · y + rβ ≥ rU (c) .(3.3.4)

If r > 0, condition (3.3.4) is violated by taking β to minus infinity.If r = 0, and therefore p 6= 0, condition (3.3.4) is violated by takingy = −εp for ε > 0 small enough so that c + y ∈ C. Therefore, r < 0and, after rescaling, we can set r = −1. Given this normalization, con-dition (3.3.3) implies that p · x ≤ 0 for all x ∈ X and condition (3.3.4)implies that p ∈ ∂U (c). Since U is increasing, p is necessarily strictlypositive.

As we saw in Proposition 3.1.8, optimality given the market isclosely related to the notion of Π-optimality for a present-value func-tion Π. For example, suppose the market is arbitrage-free and completeand therefore the present-value function Π exists and is unique. In thiscase, if an agent is endowed with a plan e ∈ L++, then w ≡ Π(e)represents the agent’s time-zero wealth, and a plan c maximizes utilitygiven the budget constraint Π(c) ≤ w. Let π denote the correspond-ing state-price density with π0 = 1, so that Π(c) = 〈π | c〉 for all c.The following proposition relates the scaling factor λ that makes λπ amember of ∂U (c) to the marginal value of the wealth level w.

Page 116: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 116

Proposition 3.3.6. Suppose Π is a linear functional on L withRiesz representation π ∈ L++ and(3.3.5) V (w) ≡ sup U (c) | Π(c) ≤ w, c ∈ L++ , w ∈ R++.

Then for all w ∈ R++ and c ∈ L++ such that Π(c) = w:(1) If λπ ∈ ∂U (c) for some λ ∈ (0,∞), then c is Π-optimal and

λ ∈ ∂V (w).(2) Suppose U is concave and c is Π-optimal. Then V is concave,

∂V (w) 6= ∅ and λπ ∈ ∂U (c) for all λ ∈ ∂V (w). Assumingfurther that ∇U (c) exists, then V is differentiable at w and

(3.3.6) λ = V ′ (w) = 〈∇U (c) | x〉 for all x ∈ L such that Π(x) = 1.

Proof. The result is a corollary of Theorem B.6.2 and Lemma B.6.3.Condition 2 of Theorem B.6.2 in this context reduces to λπ ∈ ∂U (c)for some λ ∈ (0,∞), where the positivity of λ is a consequence of themonotonicity of U . To show condition (3.3.6), note that if U is con-cave and ∇U (c) exists for a Π-optimal c, then (by Theorem B.5.5)∂U (c) = ∇U (c) and therefore λ ∈ ∂V (w) implies ∇U (c) = λπ.This proves that ∂V (w) is a singleton and therefore λ = V ′ (w). Fi-nally, if Π(x) = 1, then 〈∇U (c) | x〉 = λ 〈π | x〉 = λ. Informally speaking, condition (3.3.6) can be thought of in terms ofan optimizing agent who has initial financial wealth w and pays Π(c)to consume c. Suppose c is optimal, and therefore V (w) = U (c) andΠ(c) = w (since U is increasing, the wealth constraint Π(c) ≤ w mustbind). Condition (3.3.6) means that if the initial wealth w is increasedby a small amount δ, the resulting optimal utility value V (w + δ) isapproximately equal to U (c+ δx), where x is any unit-cost cash flow.The intuition is that once all but the last penny of one’s wealth hasbeen optimally allocated, then what one does with the last penny is ofno quantitative significance. This sort of intuition applies more broadlyto a class of so-called envelope theorems and is particularly importantin interpreting observed choices involving monetary amounts that aretrivial in comparison to one’s total wealth. Condition (3.3.6) is also thejustification for the term marginal value of wealth for the Lagrangemultiplier λ of the budget constraint Π(c) ≤ w.

Example 3.3.7. In the context of Proposition 3.3.6, if U is homo-geneous of degree one, then V (w) = V (1)w and the marginal value ofwealth λ = V ′ (w) does not depend on w. ♦

As in Section 3.1, individual optimality characterizations extendto allocational optimality given the market. To see, how, let us fix areference preference correspondence profile

(D1, . . . ,DI

)and assume

that each Di has a utility representation U i. We call an allocationc ∈ Cn optimal given X if it is optimal for

(D1, . . . ,DI

)given X.

Thus an allocation c is optimal given X if and only if for every other

Page 117: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 117

allocation c+x ∈ Cn, if U i (ci + xi) > U i (ci) for all i, then∑

i xi /∈ X.

Moreover, by Proposition 3.1.4, c is optimal given X if and only if forall c+x ∈ Cn, if U i (ci + xi) ≥ U i (ci) for all i and U i (ci + xi) > U i (ci)for some i, then

∑i x

i /∈ X. Allocational optimality given the trivialmarket 0 is known as Pareto optimality. Note that allocationaloptimality given any market X implies Pareto optimality.

Proposition 3.3.4 extends to allocational optimality as follows.

Proposition 3.3.8. Suppose c is an allocation such that each gra-dient ∇U i (ci) exists and is strictly positive. If c is optimal given X,then there exists a state-price density π and some µ ∈ RI

++ such that

(3.3.7) π = µi∇U i(ci), i = 1, . . . , I.

Conversely, assuming every U i is concave, if π is a state-price densitysuch that (3.3.7) holds for some µ ∈ RI

++, then c is optimal given X.

Proof. Suppose c ∈ Cn is optimal given X. Then x = 0 maxi-mizes U i (ci + x) subject to the constraint U j (cj − x) ≥ U j (cj) as xranges over the set of cash flows such that ci + x, c − x ∈ C. Ap-plying Theorem B.6.4 results in condition (3.3.7) for some µ ∈ RI

++

and π ∈ L. Since allocational optimality implies individual optimality,Proposition 3.3.4 implies that π is a state-price vector. To show theconverse, assuming utility concavity, consider any ci+xi ∈ C such thatx ≡

∑i x

i ∈ X. Multiply the gradient inequality U i (ci + xi) ≤ U i (ci)+〈∇U i (ci) | xi〉 by µi, add up over i, and use condition (3.3.7) and thefact that 〈π | x〉 = 0 to conclude that

∑i µiU

i (ci + xi) ≤∑

i µiUi (ci).

Since µ ∈ RI++, it is not the case that U i (ci + xi) > U i (ci) for all i.

The reader is encouraged to attempt a visualization of the preced-ing result based on Figure 3.3.1, but with two agents and a commonmarket X. The less obvious case is when X is incomplete, which canbe visualized by adding a third orthogonal axis, while preserving Xas a line. In this case, ∇U1 (c1) and ∇U2 (c2) can both be orthogonalto X without satisfying the collinearity condition (3.3.7). Individualoptimality implies that both gradients are state-price densities, but nottheir collinearity. Allocational optimality given X implies Pareto op-timality, which in turn implies the gradient collinearity. To visualizethe last claim, note that if a cash flow xi forms an acute angle with∇U i (ci), then for all sufficiently small ϵ > 0, ϵxi is strictly desirable foragent i. The non-collinearity of the gradients allows us to choose suchxi that sum up to zero, implying the violation of Pareto optimality.

A useful construct for characterizing effectively complete marketequilibria is the so-called central planner, defined in terms of theparameter µ ∈ RI

++ as the agent whose endowment is the aggregateendowment e, and whose preferences are represented by the utility

Page 118: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.3. UTILITY FUNCTIONS AND OPTIMALITY 118

function Uµ : L++ → R, where

(3.3.8) Uµ (c) ≡ sup∑I

i=1µiU

i(ci)|∑I

i=1ci ≤ c, ci ∈ L++

.

For scalar µi, we use the notation µi∂Ui (ci) ≡ µid | d ∈ ∂U i (ci).

Proposition 3.3.9. Suppose U i is concave for all i. For every X-feasible allocation c, (X, c) is an effectively complete market equilibriumif and only if there exists µ ∈ RI

++ such that e =∑

i ci is optimal for the

central planner given X and c solves the maximization problem definingUµ (e). In this case, there exists a state-price density π for X such that

π ∈ ∂Uµ (e) =⋂I

i=1µi∂U

i(ci),

and if the gradient vectors ∇U i (ci) exist, then ∇Uµ (e) also exists andπ = ∇Uµ (e) = µi∇U i

(ci), i = 1, . . . , I.

Proof. Since every U i is assumed to be concave, Uµ is also concaveand ∂Uµ (e) 6= ∅ (by Theorem B.5.4). Apply Theorem B.6.2 to (3.3.8)with c = e, F (δ) = Uµ (e+ δ) and λ = π. The theorem’s secondcondition in this application can be stated as

Uµ (e) =∑I

i=1max

µiU

i(ci)−⟨π | ci − ei

⟩| c ∈ LI++

, π ∈ L++,

where strict positivity of π follows from the assumption that every U i

is increasing. The ith term of this sum is maximized by ci if and onlyif π ∈ µi∂U

i (ci). Therefore, for every µ ∈ RI++ and allocation c such

that∑

i ci = e, Uµ (e) =

∑Ii=1 µiU

i (ci) and π ∈ ∂Uµ (e) both hold ifand only if π ∈

⋂Ii=1 µi∂U

i (ci).Suppose (X, c) is an effectively complete market equilibrium. By

Proposition 3.2.6, there exists a present-value function Π such that(Π, c) is an Arrow-Debreu equilibrium. Given a state price density πrepresenting Π, the Π-optimality of ci for agent i implies that thereexists λi > 0 such that λiπ ∈ ∂U i (ci). The argument of the firstparagraph with µi ≡ 1/λi shows that Uµ (e) =

∑i µiU

i (ci) and π ∈∂Uµ (e) and therefore, by Proposition 3.3.5, e is optimal for the centralplanner given X. Reversing these steps yields the converse. The claiminvolving gradients is a consequence of Theorem B.5.5(3).

Example 3.3.10. Suppose that every agent’s utility takes the formU i (c) = E

∑t u

it (ct), where each uit : (0,∞) → R is a concave differen-

tiable function. For every vector of agent weights µ ∈ RI++, the utility

Uµ defined in (3.3.8) is of the same form with i = µ, where

uµt (x) ≡

∑I

i=1µiu

it (xi) |

∑i

xi ≤ x, xi ∈ (0,∞)

.

If (X, c) is an effectively complete market equilibrium, then there existsµ ∈ RI

++ such that ∇Uµ (e) is a state-price vector and πt = uµ′t (et)

Page 119: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 119

defines a corresponding state-price density, which is a deterministicfunction of the aggregate endowment at every spot. ♦

While the above example suggests that an additive utility struc-ture has some analytical advantages, the following example highlightsa limitation of additive utility in capturing aversion to risk associatedwith the possible persistence of outcomes.

Example 3.3.11. Suppose the random variables ϵ1, . . . , ϵT are in-dependent and identically distributed. For example, think of each ϵt astaking the value +1 or −1, depending on the outcome of a coin toss.(Mathematically, let ϵt (ω) = ωt in Example 1.1.1 and assign equalprobability to every state ω as in Example 2.1.8.) Consider two con-sumption plans a and b with some common initial value a0 = b0 > 1and such that for every time t > 0, at = a0+ ϵ1 and bt = b0+ ϵt. In thecoin-toss interpretation, consumption under a is either permanently in-creased or permanently decreased depending on the outcome of a singlecoin toss at time one, while consumption under b is increased or de-creased in every period, depending on a new coin toss for every period.While both a and b result in the same expected consumption in everyperiod, there is a clear sense in which a is riskier than b. Yet, every util-ity of the form U = E

∑t ut, for any functions ut : (0,∞) → R, must

necessarily assign the same value to both a and b, since U =∑

t Eutand Eut (at) = Eut (bt) for all t. ♦

3.4. Dynamic consistency and recursive utilityIn this section we show that the dynamic consistency of prefer-

ences admitting a utility representation leads to a recursive relation-ship of utilities on the information tree. The resulting utility classincludes additive specifications such as expected discounted utility.Example 3.3.11 gave a first indication of the inadequacy of additiveutilities. We will further see in this section that additive utilities im-pose an overly strong relationship between time preferences in the ab-sence of risk and attitudes toward risk. This limitation motivates abroader class of recursive utilities, which allow a partial separation oftime preferences and attitudes toward risk. Recursive utility is used inthe remainder of this chapter to discuss equilibrium pricing as well asoptimal consumption and portfolio choice.

As at the end of Section 3.1, we consider an agent whose prefer-ences at spot (F, t) are represented by a preference correspondence DF,t

whose domain C is common across all spots. Throughout the rest ofthis text, we restrict attention to the case in which DF,t has a utilityrepresentation and is forward looking, meaning that for all a, b ∈ C,if a = b on F × t, . . . , T then DF,t (a) = DF,t (b). In other words,spot-(F, t) preferences do not depend on past or unrealized consump-tion. While this assumption rules out potentially important aspects

Page 120: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 120

of preferences (such as habit formation), it is methodologically appro-priate to start with the simpler case and layer complexity on top asneeded.4 A corresponding utility function UF,t : C → R must also beforward looking: for all consumption plans a, b,

a = b on F × t, . . . , T =⇒ UF,t (a) = UF,t (b) ,

which allows us to consistently define UF,t(c1F×t,...,T

)≡ UF,t (c) for all

c ∈ C, thus extending the domain of UF,t to CF,t ≡c1F×t,...,T | c ∈ C

.

We call the function UF,t increasing on CF,t if for every c ∈ C andarbitrage x ∈ LF,t, UF,t (c+ x) > UF,t (c).

Definition 3.4.1. A spot-(F, t) utility function is any contin-uous and forward-looking function UF,t : C → R that is increasingon CF,t; it is said to represent DF,t if x ∈ DF,t (c) is equivalentto UF,t (c+ x) > UF,t (c). A dynamic utility is a function of theform U : C → L such that for every spot (F, t), a spot-(F, t) utilityUF,t : C → R is well-defined by UF,t (c) ≡ U (c) (F, t), c ∈ C.

We refer to U (c) as the utility process of c and write Ut (c) forits time-t value. Since U0 (c) = UΩ,0 (c) for all c, we write U0 for thetime-zero utility function UΩ,0 : C → R.

Two dynamic utilities U and U are ordinally equivalent if for ev-ery spot (F, t), the utilities UF,t and UF,t represent the same preferencecorrespondence. The dynamic utility U is normalized if U (α) = α forall α ∈ (0,∞), and therefore UF,t (c) = UF,t (UF,t (c)) for every c ∈ CF,t,meaning that at spot (F, t) an agent whose preferences are representedby UF,t is indifferent between c and a constant annuity with paymentUF,t (c) in every period. The discussion of Section 3.3 can be appliedto each UF,t to show that every dynamic utility has a unique ordinallyequivalent normalized version. A property of a dynamic utility U isordinal if its validity is equivalent to its validity for every dynamicutility that is ordinally equivalent to U , and is therefore effectively aproperty of the preference correspondences represented by U at everyspot. We next introduce the ordinal conditions of dynamic consistencyand the irrelevance of current consumption for risk aversion, whichtogether characterize recursive utility.

A dynamic utility U dynamically consistent if for every spotspot (F, t) and all a, b ∈ CF,t,

UF,t (a) > UF,t (b) =⇒ U0 (a) > U0 (b) .

The latter condition corresponds to the dynamic consistency assump-tion DF,t (c) ⊆ D (c) ≡ DΩ,0 (c) first encountered and discussed at the

4For example, Schroder and Skiadas [2002] provide a way to mechanicallyextend the forward looking formulation to incorporate a linear form of habitformation.

Page 121: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 121

end of Section 3.1. The assumed continuity and monotonicity of utili-ties allows the following characterization:

Lemma 3.4.2. U is dynamically consistent if and only ifUF,t (a) ≥ UF,t (b) ⇐⇒ U0 (a) ≥ U0 (b) , for all a, b ∈ CF,t.

Proof. Suppose U is dynamically consistent and UF,t (a) ≥ UF,t (b).Then for all ε ∈ (0,∞), UF,t (a+ ε) > UF,t (b) and therefore U0 (a+ ε) >U0 (b). It follows that U0 (a) ≥ U0 (b) by continuity. The remainingclaims are immediate.

Remark 3.4.3. The lemma implies that if U is dynamically con-sistent, then every ordinal property of UF,t can be equivalently statedas an ordinal property of U0. In particular, U0 uniquely defines a nor-malized version of UF,t. ♦

Consider now any nonterminal spot (F, t) and let(3.4.1) (F0, t+ 1) , . . . , (Fd, t+ 1)

denote its immediate successor spots. The number d can vary from spotto spot, although in typical applications it is a constant throughoutthe filtration. Assuming U is normalized, a consequence of dynamicconsistency is that the spot-(F, t) utility remains the same if for each iwe replace the restriction of c on the subtree rooted on (Fi, t+ 1) withan annuity whose payments are all equal to the (normalized) utilityvalue UFi,t+1 (c). More formally, we have

Lemma 3.4.4. Suppose the dynamic utility U is a normalized anddynamically consistent. Given any c ∈ CF,t, let c ∈ CF,t be definedby letting c(F, t) = c (F, t) and c = UFi,t+1 (c) on Fi × t+ 1, . . . , T.Then UF,t (c) = UF,t (c).

Proof. Let xi ≡ (UFi,t+1 (c)− c) 1Fi×t+1,...,T. Adding xi to c re-places c on the subtree rooted at (Fi, t+ 1) with an annuity whosepayment is UFi,t+1 (c). We use induction in k to show that

(3.4.2) UF,t (c) = UF,t

(c+

∑k

i=0xi

), k = 0, 1, . . . , d.

For k = 0, the claim is trivially true. For the inductive step, assumeUF,t (c) = UF,t (b), where b = c +

∑k−1i=0 xi, with b = c for k = 0.

The utility normalization implies that UFk,t+1 (c) = UFk,t+1 (c+ xk).Since c equals b on the subtree rooted at (Fk, t+ 1), UFk,t+1 (b) =UFk,t+1 (b+ xk). By dynamic consistency, U0 (b) = U0 (b+ xk), andtherefore UF,t (b) = UF,t (b+ xk). The last equation combined with theinductive hypothesis gives UF,t (c) = UF,t (b+ xk), proving (3.4.2).

The Lemma’s conclusion is equivalent to the existence of a functionΦF,t : (0,∞)2+d → R such that(3.4.3) UF,t (c) = ΦF,t (c (F, t) , UF0,t+1 (c) , . . . , UFd,t+1 (c)) , c ∈ C.

Page 122: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 122

The functions ΦF,t, as (F, t) ranges over all nonterminal spots, specifya backward recursion on the information tree, which starts with theterminal values UT (c) (equal to cT if U is normalized) and computesUF,t (c) in terms of spot-(F, t) consumption and the already computedend-of-period utility values. Let us call generalized recursive utilityany dynamic utility that admits such a recursive specification, which isan ordinal property (why?). One can easily check that every generalizedrecursive utility is dynamically consistent. We have therefore shown:

Lemma 3.4.5. A dynamic utility is generalized recursive utility ifand only if it is dynamically consistent.

We call a (non-normalized) dynamic utility U additive if it takesthe form Ut (c) = Et

∑Ts=t us (cs), for functions ut : (0,∞) → R that

are increasing and continuous and hence invertible, and an expecta-tion operator E under some underlying full-support probability. Everyadditive utility is dynamically consistency and is therefore generalizedrecursive utility. As pointed out at the end of last section, utilityadditivity has some analytical advantages but also some significantlimitations. Besides the issue illustrated in Example 3.3.11, prefer-ences under uncertainty represented by additive utilities are entirelydetermined by preferences in the absence of uncertainty. To clarifyand show this claim, first note that for every consumption plan that isdeterministic (that is, each ct is constant across states), an additivetime-zero utility reduces to U0 (c) =

∑t ut (ct). As a corollary of the

uniqueness Theorem A.2.4, we have the following conclusion:Proposition 3.4.6. Any two additive utilities that are ordinally

equivalent when restricted to deterministic plans are necessarily ordi-nally equivalent over all consumption plans.

In order to achieve a partial separation of time preferences andattitudes toward risk, we consider a normalized generalized recursiveutility (3.4.3) such that for all x, x′, y0, . . . , yd, y ∈ (0,∞),

ΦF,t (x, y0, . . . , yd) = ΦF,t (x, y, . . . , y) ⇐⇒Φ (x′, y0, . . . , yd) = ΦF,t (x

′, y, . . . , y) .(3.4.4)This is an ordinal property of UF,t (or U0 given dynamic consistency)since, by Lemma 3.4.4,

(3.4.5) ΦF,t (x, y0, . . . , yd) = UF,t

(x1F×t +

∑iyi1Fi×t+1,...,T

).

Given x, the value ΦF,t (x, y0, . . . , yd) can be thought of as a utilityvalue of a single-period uncertain payoff (y0, . . . , yd) ∈ (0,∞)1+d, withcorresponding certainty equivalent value y ≡ νF,t (y0, . . . , yd) definedimplicitly by any of the equivalent equations of condition (3.4.4), in-dependently of the value of the value x. From the perspective of spot(F, t), an agent whose preferences are represented by UF,t is indifferent

Page 123: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 123

between the uncertain annuity∑

i yi1Fi×t+1,...,T and the certain an-nuity that pays νF,t (y0, . . . , yd) in every period, independently of spot(F, t) consumption. The lower this payment is, the more risk aversethe agent. We say that a dynamic utility U satisfies the irrelevanceof current consumption for risk aversion to mean that condition(3.4.4) holds at every nonterminal spot (F, t), where ΦF,t is defined by(3.4.5). If we also define(3.4.6) fF,t (x, y) ≡ ΦF,t (x, y, . . . , y) , x, y ∈ (0,∞) ,

then ΦF,t (x, y0, . . . , yd) = fF,t (x, νF,t (y0, . . . , yd)) and (3.4.3) becomes(3.4.7) UF,t (c) = fF,t (c (F, t) , νF,t (UF0,t+1 (c) , . . . , UFd,t+1 (c))) .

By recursive utility we mean a dynamic utility that satisfies a re-cursion of this form for some functions fF,t : (0,∞)2 → (0,∞) andνF,t : (0,∞)1+d → (0,∞), as (F, t) ranges over all nonterminal spots.

Lemma 3.4.5 together with the above argument lead to the followingmain conclusion.

Proposition 3.4.7. A dynamic utility is recursive utility if andonly if it is dynamically consistent and satisfies the irrelevance of cur-rent consumption for risk aversion.

Whether a given dynamic utility is recursive is an ordinal propertyof the utility and is therefore valid if and only if the normalized versionof the utility is recursive. Note that the recursive utility defined by(3.4.7) is normalized if and only if the functions fF,t and νF,t in recursion(3.4.7) are also normalized:

for all s ∈ (0,∞) , fF,t (s, s) = s and νF,t (s, . . . , s) = s.

Unless otherwise indicated, in the remainder of this chapter we willwork with normalized dynamic utilities only. We use the terms con-ditional aggregator and conditional certainty equivalent (CE)to mean continuous, increasing and normalized functions of the formfF,t : (0,∞)2 → (0,∞) and νF,t : (0,∞)1+d → (0,∞), respectively.

Our earlier construction of recursion (3.4.7) makes precise the sensein which the conditional aggregator fF,t and conditional CE νF,t repre-sent single-period time preferences and risk aversion from the perspec-tive of spot (F, t). Suppose U ′ is another normalized recursive utilitywith spot-(F, t) conditional aggregator f ′

F,t and conditional CE ν ′F,t. Byidentity (3.4.6), fF,t = f ′

F,t if and only if UF,t and U ′F,t are equal when

restricted to consumption plans of the form x1F×t + y1F×t+1,...,T(which is an ordinal property by normalization). In this sense, fF,t rep-resents time preferences over a single period. A complete separationbetween time preferences and risk attitudes is not generally possible.We can make the more modest claim, however, that given single-periodtime preferences, the conditional CE represents single-period risk atti-tudes. We call the conditional CE ν ′F,t more risk averse than νF,t if

Page 124: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 124

ν ′F,t ≤ νF,t, meaning that ν ′F,t (z) ≤ νF,t (z) for all z ∈ (0,∞)1+d. As-suming that fF,t = f ′

F,t, making ν ′F,t more risk averse than νF,t meansthat from the perspective of spot (F, t), the constant annuity that isof equal utility as the contingent annuity

∑i yi1Fi×t+1,...,T makes a

lower payment ν ′F,t (y0, . . . , yd) under U ′F,t compared to the payment

νF,t (y0, . . . , yd) under UF,t. In this sense, risk aversion toward single-period payoffs is higher under U ′

F,t than UF,t.

Example 3.4.8. The functional form of νF,t can be founded onany static theory of choice under uncertainty, the most common beingexpected utility theory: νF,t (z) = u−1 (

∑i u (zi)P [Fi | F ]) for some

probability P and increasing continuous function u : (0,∞) → R.Let us temporarily assume that νF,t has the above expected utilityspecification and similarly ν ′F,t (z) = u′−1 (

∑i u

′ (zi)P [Fi | F ]), withthe same probability P . By Theorem A.6.1, ν ′F,t ≤ νF,t if and only ifu′ = ϕu for a concave function ϕ : u (0,∞) → R. Section A.6 discussesvarious conditions that express the idea that νF,t is risk averse in anabsolute sense, all of which are equivalent to the concavity of u. ♦

Recall that Lt denotes the set of all Ft-measurable random vari-ables. Let L++

t denote the set of strictly positive elements of Lt. Justas it is convenient to represent the conditional expectations E[· | F ] forevery spot (F, t) by an operator Et, it will be convenient to representconditional CEs as operators from L++

t+1 to L++t by letting

(3.4.8) vt (z) (ω) ≡ vF,t (z) ≡ νF,t (z (F0) , . . . z (Fd)) , ω ∈F, z ∈ L++t+1,

where z (Fi) denotes the value of the random variable z on the event Fi.For instance, for the expected-utility conditional CE specification ofExample 3.4.8, vt = u−1Etu, meaning vt (z) = u−1 (Etu (z)) for allz ∈ L++

t+1. For conditional aggregators, we write

(3.4.9) fF,t (x, y) ≡ f (ω, t, x, y) ≡ ft (ω, x, y) ω ∈F, x, y ∈ (0,∞) .

As with random variables, the state variable ω is typically elided. Theseconventions allow us to write recursion (3.4.7) in the more succinct form

(3.4.10) Ut (c) = ft (ct, vt (Ut+1 (c))) , UT (c) = cT .

Example 3.4.9. Consider the additive utility Ut = Et∑T

s=t us.The corresponding normalized version is Ut (c) ≡ ϕ−1

t Ut (c), whereϕt (z) ≡

∑Ts=t us (z), z ∈ (0,∞). The utility U recursive, as can be

seen starting with the recursion Ut (c) = ut (ct)+EtUt+1 (c) and endingwith a recursion of the form (3.4.10), where both f and v are defined interms of the functions ut, thus tying time preferences to risk aversionas discussed earlier. Rather than go over the general case, the point ismade more clearly in the special case where, for some β ∈ (0, 1) and

Page 125: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 125

u : (0,∞) → R,

(3.4.11) Ut (c) = Et

[T−1∑s=t

βs−tu (cs) +βT−t

1− βu (cT )

].

To motivate the utility weight for terminal consumption, define thehypothetical infinite-horizon plan c by letting ct = ct for t < T andct = cT for all t ≥ T . Then Ut (c) = Et

∑∞s=t β

s−tu (cs). The normalizedversion U of U satisfies recursion (3.4.10) with

(3.4.12) ft (ω, x, y) ≡ u−1 ((1− β)u (x) + βu (y)) , vt ≡ u−1Etu.

Since preferences over deterministic consumption plans determine f ,they also determine u and hence v, which captures attitudes towardrisk. This illustrates the more general conclusion of Proposition 3.4.6.By relaxing the additivity assumption, we can use a different functionu in specifying f and v, thus freeing the conditional CE specificationfrom any assumptions on preferences in the absence of uncertainty. ♦

Motivated by the streamlined notation of recursion (3.4.10), wehenceforth adopt the following terminology.

An aggregator f is a mapping that assigns to every state ω andnon-terminal time t a normalized, increasing and continuous function

f (ω, t, ·) : (0,∞)2 → (0,∞) ,

such that the process (ω, t) 7→ f (ω, t, x, y) is predictable, for all (x, y).We say that f is state (resp. time) independent if f (ω, t, ·) does notvary with the state ω (resp. time t). For instance, the aggregator ofExample 3.4.9 is state and time independent. For every spot (F, t), theaggregator f implies a conditional aggregator fF,t defined in (3.4.9).

A certainty equivalent (CE) υ is a mapping that assigns to eachnonterminal time t a continuous function

υt : L++t+1 → L++

t

that is normalized (vt (α) = α for all α ∈ (0,∞)) and for all z ∈ L++t+1,

the value of υt (z) at spot (F, t), which we denote by υF,t (z), is anincreasing function of the restriction5 of z on F . Given this condition,we consistently extend the domain of vF,t by letting

υF,t (z) ≡ υF,t (z1F ) , z ∈ L++t+1.

For every spot (F, t), the CE v implies a conditional CE νF,t definedin (3.4.8).

Using this terminology, we restate the definition of recursive utilityas it applies to a normalized dynamic utility.

5More formally, this means that for all x, y ∈ L++t+1, if x1F = y1F then

υt (x) 1F = υt (y) 1F , and if x1F ≥ y1F 6= x1F then υt (x) 1F > υt (y) 1F .

Page 126: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 126

Definition 3.4.10. A normalized dynamic utility U is recursiveutility if there exist an aggregator f and a CE υ such that for everyc ∈ C, the process U (c) solves the backward recursion (3.4.10).

If f is state independent and c is a deterministic consumptionplan, then the utility process U (c) is also deterministic and thereforevt (Ut+1 (c)) = Ut+1 (c). A state-independent aggregator, therefore, de-termines and is determined by preferences over deterministic consump-tion plans. For two normalized recursive utilities U and U ′ sharing thestate independent aggregator f and corresponding CEs v and v′, thereader can show that v′ is more risk averse than v in the sense thatv′t ≤ vt for all t, if and only if U ′

0 is more risk averse than U0 in thesense that U ′

0 ≤ U0.We conclude this section with a recursive utility gradient calculation

that is essential in formulating optimality conditions. As in Section 3.3,the gradient is defined relative to the inner product (3.3.2), where E isthe expectation operator relative to some underlying full-support prob-ability that is fixed throughout. We will derive a gradient expression interms of the CE derivative, which we define using the conditional normnotation ‖h‖2t ≡ Et [h2], h ∈ Lt+1, as follows: The derivative of theCE υ is a mapping κ that assigns to each t ∈ 1, . . . , T and z ∈ L++

t+1

a random variable κt+1 (z) ∈ Lt+1 such that for all z + h ∈ L++t+1,

υt (z + h) = υt (z) + Et [κt+1 (z)h] + εt (h) ‖h‖t,

where εt (h) is small for small h in the sense that for all ϵ > 0, thereexists δ > 0 such that ‖h‖t < δ implies |εt (h)| < ϵ.

A CE is differentiable if it has a derivative. For a more concreteexpression of the derivative, consider the spot-(F, t) conditional CEνF,t implied by v, and let zi denote the value z (Fi) that z takes at theimmediate successor spot (Fi, t+ 1) of (F, t). Writing κFi,t+1 (z) for thevalue of κt+1 (z) at (Fi, t+ 1), we have

κFi,t+1 (z)P [Fi | F ] =∂νF,t (z0, . . . , zd)

∂zi.

A derivative of v corresponds to the derivative of νF,t in the multivari-ate calculus sense. By a standard calculus result, if the above partialderivatives of νF,t exist and are continuous for all nonterminal spots(F, t), then κ is the derivative of v.

Example 3.4.11. Suppose υt = u−1Etu for some continuously dif-ferentiable increasing function u : (0,∞) → R. Then the derivative κof υ exists and is given by

κt (z) =u′ (z)

u′ (υt−1 (z)), t = 1, . . . , T,

where u′ denotes the derivative of u. ♦

Page 127: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.4. DYNAMIC CONSISTENCY AND RECURSIVE UTILITY 127

The gradient of a recursive utility is computed in the followingproposition, where ∂ft/∂c and ∂ft/∂v denote the partial derivativesof the time-t aggregator ft with respect to its consumption and CEarguments, respectively.

Proposition 3.4.12. Suppose U is a recursive utility with aggre-gator f and CE υ. Suppose also that ft (ω, ·) is differentiable for everystate ω and time t < T , and υ has derivative κ. Given a referenceconsumption plan c, let the processes λ and E be defined by

(3.4.13) λt ≡∂ft∂c

(ct, υt (Ut+1 (c))) , t < T, λT = 1,

(3.4.14) E0 ≡ 1,EtEt−1

≡ ∂ft−1

∂v(ct−1, υt−1 (Ut (c)))κt (Ut (c)) .

Then for every adapted process x and time t,

(3.4.15) limϵ↓0

Ut (c+ ϵx)− Ut (c)

ϵ= Et

[∑T

s=t

EsEtλsxs

],

and therefore∇U0 (c) = Eλ.

Proof. Given x ∈ R1+K , define ϕt (ϵ) ≡ Ut (c+ ϵx) for all suffi-ciently small ϵ ∈ R. The left-hand-side in (3.4.15) defines the derivativeϕ′t (0). The utility recursion implies

ϕt (ϵ) = ft (ct + ϵxt, υt (ϕt+1 (ϵ))) .

Differentiating at ϵ = 0 and using the chain rule, we have

ϕ′t (0) = λtxt +

∂ft∂v

(c, υt (Ut+1 (c)))Et[κt+1 (ϕt+1 (0))ϕ

′t+1 (0)

].

Letting Vt ≡ ϕ′t (0) and δt ≡ λtxt, the recursion can be restated as

Vt = δt +1

EtEt [Et+1Vt+1] , VT = xT .

As discussed in Section 2.3, this recursion expresses the fact that E(viewed as a state-price density) prices the contract (δ, V ), a conditionthat can be restated as equation (3.4.15).

Remark 3.4.13. Suppose that at time zero an agent with an endow-ment e maximizes the utility U0 given a complete market with present-value function Π. If c is the agent’s optimal consumption plan, thenw ≡ Π(e) = Π (c) is the agent’s time-zero wealth. Proposition 3.3.6expresses the value λ0 of the above proposition as the marginal value ofwealth V ′ (w). The same argument can be applied from the perspectiveof any other spot. For this reason, we will refer to λ as a marginalvalue of wealth process.

Page 128: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 128

3.5. Scale invariant recursive utilityThe optimality and equilibrium theory to follow assumes a recur-

sive utility that represents scale invariant preferences in the sense ofExamples 3.2.9 and 3.3.2. The result is a simple theory that allows an-alytical insights while making not entirely unreasonable assumptionson how preferences and portfolio choices scale with wealth. With thismotivation, we analyze a scale invariant (SI) normalized recursive util-ity U with aggregator f and CE v (Definition 3.4.10). By dynamicconsistency and Lemma 3.4.4, U is SI (meaning UF,t represents SI pref-erences for every spot (F, t)) if and only if U0 is SI. Moreover, for everyspot (F, t), since UF,t is assumed normalized, it is SI if and only if it ishomogenous of degree one: For every consumption plan c,

s ∈ (0,∞) =⇒ UF,t (sc) = sUF,t (c) .

A conditional aggregator or CE can also be viewed as a normalizedutility function and is therefore defined to be SI if it is homogeneousof degree one. We call an aggregator f (resp. CE v) SI if for everynon-terminal spot (F, t), the implied conditional aggregator fF,t (resp.conditional CE νF,t) is SI. Given last section’s construction of recursiveutility, it is straightforward to verify that a recursive utility is SI if andonly if the corresponding aggregator and conditional CE are both SI.

Example 3.5.1 (Epstein-Zin-Weil utility). By Theorem A.4.3, theexpected utility CE of Example 3.4.8 is SI if and only if

(3.5.1) vt = u−1γ Etuγ, where uγ (x) ≡

x1−γ − 1

1− γ(with u1 ≡ log ),

for a coefficient of relative risk aversion (CRRA) γ. Analogously,assuming an additive utility over deterministic plans (characterized byTheorem A.2.3), the aggregator is SI and state and time independentif and only if it takes the form(3.5.2) ft (ω, c, υ) = u−1

δ ((1− β)uδ (c) + βuδ (υ)) ,

for parameters β ∈ (0, 1) and δ ∈ R, with uδ defined in (3.5.1). Thenormalized recursive utility specified in terms of the constant param-eters β, δ and γ by the CE (3.5.1) and aggregator (3.5.2) is known asEpstein-Zin-Weil (EZW) utility.6 (All our later results with EZWutility apply with minor changes to parameters that are predictableprocesses, which can be thought of as the result of applying Theo-rem A.4.3 separately to each conditional aggregator fF,t and CE νF,t.)The constant parameters (β, δ) of EZW utility determine and are deter-mined by the utility of deterministic consumption plans. Given (β, δ),increasing γ increases risk aversion. EZW utility reduces to expecteddiscounted utility if and only if γ = δ, in which case, the ordinally

6Named after the contributions of Epstein and Zin [1989] and Weil [1989, 1990].

Page 129: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 129

equivalent utility U = (1− β)−1 uδ U takes the additive form (3.4.11)of Example 3.4.9 with u = uδ. Of course, on deterministic consump-tion plans, U takes the same additive form, with Et omitted, no matterwhat the value of γ is.

In the additive specification (3.4.11), the parameter β determinesboth the weight on the terminal term and the discounting of everyother term. Normalizing the more general additive utility

(3.5.3) Ut (c) = Et

[T−1∑s=t

bs−tuδ (cs) +bT−t

1− βuδ (cT )

],

for some constant b > 0, not necessarily equal to β, yields the EZWform with a time-dependent parameter β. Alternatively, assumingδ 6= 1, utility (3.5.3) becomes EZW utility (with constant parame-ters) after a change of the unit of account. Let us call the original unita “bushel” and the new unit a “dollar,” and let u represent the unitconversion process: at time t one bushel equals ut dollars. For everyconsumption plan c in bushels, let cu ≡ cu denote the same consump-tion plan in dollars, and let Uu (cu) ≡ U (c). Clearly, Uu representsthe same preferences over consumption plans in dollars as U does overconsumption plans in bushels. For the specific unit conversion choiceut ≡ (1 + α)t, where α solves β (1 + α)1−δ = b, Uu

t (cu) is a positive

affine transformation of

Et

[T−1∑s=t

βs−t(cus)

1−δ

1− δ+βT−t

1− β

(cuT )

1−δ

1− δ

], (δ 6= 1) ,

and therefore the normalized version Uu of Uu is EZW utility with con-stant parameters β and γ = δ. For example, if one is interested in max-imizing U0 (c) subject to 〈π | c〉 ≤ π0w for some SPD π, all expressed inbushels, one can change the units to dollars and equivalently maximizethe EZW utility Uu (cu) subject to 〈πu | cu〉 ≤ πu

0w, where πu ≡ u−1π.Note that πu and π differ only in their respective implied short-rateprocesses ru and r, which are related by (1 + ru) = (1 + α) (1 + r). ♦

At the center of the equilibrium and optimality theory to followis the utility gradient expression Eλ of Proposition 3.4.12, with addedregularity implied by scale invariance that we now review. It is a gen-eral property of the gradient of a homogeneous-of-degree-one functionalthat it is homogeneous of degree zero and satisfies the so-called Eulerequation: The functional’s rate of change at a vector x in the directionof x equals the functional’s value at x. The proof is a matter of ob-serving simple properties of the relevant difference quotient. Below westate and prove this claim in the language of SI CEs.

Page 130: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 130

Lemma 3.5.2. Suppose υ is an SI CE with derivative κ. Then forall processes s, U ∈ L++,

κt (st−1Ut) = κt (Ut) and υt−1 (Ut) = Et−1 [κt (Ut)Ut] .

Proof. Since the derivative κ of υ is in particular a directionalderivative, we have the identity

Et−1 [κt (st−1Ut)xt] = limϵ↓0

υt−1 (st−1Ut + ϵst−1xt)− υt−1 (st−1Ut)

ϵst−1

= Et−1 [κt (Ut)xt] ,

where the second equality follows from the fact that υ is SI and there-fore the term st−1 can be factored out in the numerator. Since xt can beany Ft-measurable random variable, κt (st−1Ut) = κt (Ut). For xt = Utand st−1 = 1, scale invariance allows us to factor out (1 + ϵ) in thenumerator of the above difference quotient, for all ϵ > 0, and concludethat it must equal υt−1 (Ut).

The Euler equation for homogeneous functions applied to SI recur-sive utility gives the following result.

Lemma 3.5.3. Suppose U is SI recursive utility and c is a con-sumption plan. Assume that U satisfies the smoothness assumptions ofProposition 3.4.12, and therefore π ≡ Eλ is the utility gradient of U0

at c, where λ and E are defined by (3.4.13) and (3.4.14), respectively.Then(3.5.4) Ut (c) = λtWt, t = 0, . . . , T,

where

(3.5.5) Wt ≡ Et[∑T

s=t

πsπtcs

].

Proof. Let x = c in (3.4.15) and note that the left-hand sideequals Ut (c).

Remark 3.5.4. The preceding lemma is entirely a statement aboutthe utility function—no market is specified. Suppose, however, that theconsumption plan c is optimal given a complete market for an agentmaximizing a utility function of the form specified in the lemma. ByProposition 3.1.8, π is also an SPD, and therefore Wt represents thetime t market value of the remaining consumption plan ct, . . . , cT ; thatis, the agent’s wealth just prior to time-t, which is used to finance allremaining consumption. Equation (3.5.4) in such a context states thatthe agent’s utility at a time-t spot is proportional to the agent’s wealthat the given spot, with the constant of proportionality being the mar-ginal value of wealth. The relationship reflects the idea that, because ofthe homogeneity of the agent’s budget constraint and utility function,scaling up the agent’s wealth at an optimum scales up proportionatelythe agent’s consumption and utility, while preserving optimality. ♦

Page 131: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 131

For readability, we henceforth write c to denote either a consump-tion plan or a dummy variable representing a consumption value, andwe write υ to denote either a CE or a dummy variable representing aconditional CE value. An SI aggregator can be written as

(3.5.6) ft (ω, c, υ) = υgt

(ω,c

υ

), c, υ ∈ (0,∞) ,

where gt (ω, x) ≡ ft (ω, x, 1), which leads us to the following terminol-ogy and SI aggregator characterization.

Definition 3.5.5. A proportional aggregator is a mapping thatassigns to each time t < T a function gt : Ω× (0,∞) → (0,∞), wheregt (·, x) is Ft-measurable for all x ∈ (0,∞), and for all ω ∈ Ω, thefunction gt (ω, ·) is continuous and increasing, gt (ω, 1) = 1, and themapping x 7→ gt (ω, x) /x is decreasing. A proportional aggregator gconcave or differentiable if gt (ω, ·) has the respective property forevery (ω, t).

Lemma 3.5.6. An aggregator f is SI if and only if it takes theform (3.5.6) for some proportional aggregator g, in which case f isconcave if and only if g is concave.

Proof. The first part of the proposition is a straightforward con-sequence of the definitions. If f is concave, then clearly so is g. Con-versely, suppose f is given by (3.5.6) for a concave proportional aggre-gator g. We fix the reference pair (ω, t) and abuse notation by writingf and g for the functions ft (ω, ·) and gt (ω, ·). Consider any pair ofdistinct points (c1, υ1) and (c2, υ2) in (0,∞)2 and let (c, υ) denote theirsum. Concavity of g implies

g( cυ

)≥ υ1

υ1 + υ2g

(c1

υ1

)+

υ2

υ1 + υ2g

(c2

υ2

).

Multiplying through by υ shows that f (c, υ) > f (c1, υ1) + f (c2, υ2),which implies concavity of f , since f is assumed to be homogeneous ofdegree one.

The terminology and notation for proportional aggregators is anal-ogous to that for aggregators. Thus we say that g is state (resp. time)independent if gt (ω, ·) does not vary ω (resp. t). We also omit thestate variable in expressions like the utility recursion

Ut (c) = υt (Ut+1 (c)) gt

(ct

υt (Ut+1 (c))

), t < T, UT = cT .

We next introduce some useful transformations of a proportionalaggregator g that is from now on assumed to be differentiable, withg′t (ω, ·) denoting the derivative of gt (ω, ·). The elasticity h of g isdefined for every time t < T by

(3.5.7) ht (x) ≡d log gt (x)d logx =

xg′t (x)

gt (x), x ∈ (0,∞) .

Page 132: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 132

Since x, gt (x) and g′t (x) are all positive, so is ht (x). The fact thatgt (x) /x is decreasing is equivalent to

ht (x) ∈ (0, 1) , x ∈ (0,∞) .

Note also that since gt (1) = 1, we have g′t (1) = ht (1) ∈ (0, 1). Thefunctions g′t and ht arise naturally in the following calculations.

Lemma 3.5.7. Under the same assumptions as Lemma 3.5.3,

(3.5.8) λt = g′t (xt) and ctWt

= ht (xt) , for all t < T,

where Wt is defined in (3.5.5) and

(3.5.9) xt ≡ct

υt (Ut+1 (c)).

Proof. The expression for λt follows from the definition (3.4.13) ofλ and identity (3.5.6) defining the proportional aggregator g. To com-pute the ratio ct/Wt, we use the identity U (c) = λW from Lemma 3.5.3together with the definitions of x and h, and the utility recursion :

ctWt

=ctg

′t (xt)

Ut (c)=

ctg′t (xt)

υt (Ut+1 (c)) gt (xt)= ht (xt) .

Another transformation of gt that is going to be useful in the theoryof SI pricing to follow is defined as

(3.5.10) qt (x) ≡gt (x)− xg′t (x)

g′t (x)x ∈ (0,∞) .

The following lemma shows that the function qt arises naturally incomputing the growth of the gradient of an SI recursive utility.

Lemma 3.5.8. Given the same assumptions as those of Proposi-tion 3.4.12, suppose further that U is SI recursive utility with propor-tional aggregator g, and let qt be defined by (3.5.10). Then the gradientπ ≡ ∇U0 (c) solves the recursion

(3.5.11) π0 = λ0,πtπt−1

= qt−1 (xt−1)κt (Ut (c))λt, t = 1, . . . , T,

where xt is defined in (3.5.9), and λ is given in terms of x in (3.5.8)with λT = 1.

Proof. The claim is a corollary of Proposition 3.4.12 and the ob-servation that

qt (x) =∂ft (c, υ) /∂v

∂ft (c, υ) /∂c, x ≡ c

υ∈ (0,∞) .

Page 133: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.5. SCALE INVARIANT RECURSIVE UTILITY 133

The elasticity of g should not be confused with the elasticity ofintertemporal substitution, or EIS for short, a term that is widely usedin the literature. In the current context, the EIS is defined as theinverse of the elasticity of q:

(3.5.12) 1

EISt (x)≡ d log qt (x)

d logx , x ∈ (0,∞) .

The following example shows that a constant EIS corresponds to theSI aggregator form (3.5.2) of Example 3.5.1.

Example 3.5.9. Suppose the proportional aggregator g implies aconstant EIS and let

δ ≡ 1

EIS ∈ R.

Define the parameter β by

1− β ≡ g′t (1) = ht (1) ∈ (0, 1) .

Integrating (3.5.12) and using ht (x) = 1/ (1 + qt (x) /x), we have

(3.5.13) qt (x) =βxδ

1− βand ht (x) =

(1− β)x1−δ

(1− β)x1−δ + β.

Finally, integrating (3.5.7) and using the fact that gt (1) = 1 results inthe proportional aggregator gt (x) ≡ ft (x, 1), where f is the aggrega-tor (3.5.2) of the EZW specification of Example (3.5.1). ♦

The constant-EIS proportional aggregator is an example of whatwe will call a regular proportional aggregator, which is defined belowand used in simplifying some of the optimality theory to follow.

Definition 3.5.10. The proportional aggregator g is regular if itis differentiable and for every ω ∈ Ω, the derivative g′t (ω, ·) : (0,∞) →(0,∞) is decreasing and satisfies

(3.5.14) limx↓0

g′t (ω, x) = ∞ and limx→∞

g′t (ω, x) = 0.

In the remainder of this section we assume that g is a regular pro-portional aggregator and we follow the usual notational conventionof omitting the implied state variable ω. In this case, the derivativeg′t is invertible and therefore the corresponding right inverse functionIt : Ω× (0,∞) → (0,∞) is well-defined by

(3.5.15) g′t (It (λ)) = λ, λ ∈ (0,∞) .

Analogously to c and v, we abuse notation by using the same sym-bol for a process, λ in this case, and a dummy variable representingpossible values of the process. The definition of It is motivated bythe relationship λt = g′t (xt) of Lemma 3.5.7, which is equivalent toxt = It (λt) .

Page 134: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 134

The regularity assumption on g implies that gt is a (strictly) concavefunction. The convex dual7 of gt is the function g∗t : Ω × (0,∞) →(0,∞) defined for every λ ∈ (0,∞) (and implied state) by(3.5.16) g∗t (λ) ≡ max

x∈(0,∞)(gt (x)− λx) = gt (It (λ))− λIt (λ) .

To gain some geometric understanding of this quantity, draw the graphof gt on the real plain and a line of slope λ that intersects the graph.Raise the line as high as possible while intersecting the graph andmaintaining the slope λ. At the highest such point, the line is tangentto the graph and intersects the vertical axis at g∗t (λ).

Note that after the change of variables

λ ≡ g′t

( cυ

)and x ≡ c

v,

(3.5.17) ∂ft (c, υ)

∂v= g∗t (λ) and qt (x) =

g∗t (λ)

λ.

An observation that will be useful later on is that the last equationuniquely determines qt (x) given λ:

Lemma 3.5.11. Assuming the proportional aggregator g is regular,for every time t < T and q ∈ (0,∞) , there exists a unique λ ∈ (0,∞)such that q = g∗t (λ) /λ.

Proof. Plot the graph of the concave function gt and note that asthe slope λ = g′t (x) of the tangent line at x decreases, the interceptg∗t (λ) with the vertical axis increases. Therefore, g∗t is a (strictly)decreasing function. Letting the tangent line’s slope go to zero, observethat g∗t (0+) = gt (∞) > gt (1) = 1. Consider now the graph of g∗t on theplane and, for any given q ∈ (0,∞), the line L = (λ, λq) : λ ∈ (0,∞) .The graph of g∗t is downward sloping, it is above L near the verticalaxis and therefore crosses L at exactly one point, which defines theunique λ ∈ (0,∞) such that q = g∗t (λ) /λ.

3.6. Equilibrium with scale invariant recursive utilityProposition 3.4.12 gives the formula Eλ for the gradient of recur-

sive utility at a given reference consumption plan c. If c is optimalfor some agent given a market, then π ≡ Eλ is also an equilibriumstate-price density (SPD). Since an SPD prices all traded assets, wehave a link between consumption and market prices, which is the ba-sis for so-called consumption-based asset pricing models. A usefulway of thinking about equilibrium price restrictions is in terms of theperiod-t intertemporal marginal rate of substitution (IMRS)πt/πt−1, where π is the utility gradient at a given reference consump-tion plan c. If c is optimal, the IMRS places recursive restrictions on

7In more classical terms, if we define ft (x) ≡ −gt (x) and f∗t (x∗) ≡ g∗t (−x∗),

then f∗t is the Legendre transform of ft, also known as the convex conjugate of ft.

Page 135: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 135

every traded (or synthetic) contract. For example, as we saw in Sec-tion 2.3, assuming the contract contract (δ, V ) has well-defined returnsRt = Vt/ (Vt−1 − δt−1), its pricing by π can be stated as

(3.6.1) Et−1

[πtπt−1

Rt

]= 1.

Applied to a traded money-market account with rate process r ∈ P0,we have the IMRS mean restriction

Et−1

[πtπt−1

]=

1

1 + rt.

Given the latter, the Hansen-Jagannathan bound (2.3.9) implies a lowerbound on the IMRS variance.

Remark 3.6.1. In interpreting the IMRS, it is important to keepin mind that it is defined relative to the underlying probability P . Ifπ is an SPD under P and P is another full-support probability, thenthe corresponding SPD under P is given, through Lemma 2.4.4, asπt = πtEt[dP/dP ], where E denotes expectation under P . ♦

In the remainder of this section we elaborate on equilibrium pric-ing restrictions under the assumption of SI recursive utility, which isconsistent with the representative-agent argument of Example 3.2.9.The theory presented is agnostic toward this interpretation—the con-sumption plan c can be either an individual investor’s plan or a rep-resentative agent’s plan. Expectations, the SPD property and relatedquantities whose definition depends on an implicit underlying proba-bility are all relative to a given full-support probability P .

A formula for the equilibrium IMRS for SI recursive utility wasestablished in Lemma 3.5.8. We now show how the IMRS can bedetermined by taking as input consumption growth only. As a pre-liminary example, consider an agent maximizing the additive utility ofExample 3.4.9 under the assumption of scale invariance, which impliesthe EZW specification of Example 3.5.1 with γ = δ. The utility gradi-ent at a reference consumption plan c can be computed spot by spot,resulting in the IMRS

(3.6.2) πtπt−1

= β

(ctct−1

)−δ

, (assuming γ = δ).

The IMRS in this case is entirely determined by the consumptiongrowth process on the information tree. The following propositionshows that for a more general SI recursive utility the mapping fromconsumption growth to IMRS is less direct, requiring the solution ofa backward recursion on the information tree. Although not explic-itly stated, the proof shows that the process x corresponds to theconsumption-to-CE ratio of equation (3.5.9).

Page 136: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 136

Proposition 3.6.2 (IMRS and consumption growth). Suppose Uis SI recursive utility with differentiable proportional aggregator g andCE υ with derivative κ. Given a consumption plan c, let the process xbe defined by the backward recursion

(3.6.3) xt−1 = υt−1

(gt (xt)

xt

ctct−1

)−1

, t = 1, . . . , T ; xT = 1.

Then the process π ∈ L++ is the gradient of U0 at c if and only ifπ0 = g′0 (x0) and

(3.6.4) πtπt−1

= qt−1 (xt−1)κt

(gt (xt)

xt

ctct−1

)g′t (xt) , t = 1, . . . , T,

where the functions qt are defined by (3.5.10).

Proof. With some abuse of notation, let U ≡ U (c). The definitionxt ≡ ct/υt (Ut+1) and the recursion defining the utility function imply

Ut = υt (Ut+1) gt (xt) = ct−1gt (xt)

xt

ctct−1

.

Substituting the last expression for Ut in xt−1 = ct−1/υt−1 (Ut) andusing the homogeneity of υt−1 to cancel out ct−1 results in recursion(3.6.3). The same expression for Ut and the fact that κt is homogeneousof degree zero (Lemma 3.5.2) result in the identity

(3.6.5) κt (Ut) = κt

(gt (xt)

xt

ctct−1

).

Lemma 3.5.8 completes the proof. Example 3.6.3. For the constant-CRRA CE (3.5.1), Example 3.4.11

and equation (3.6.3) imply that the middle factor of the IMRS expres-sion (3.6.4) can be written as

κt

(gt (xt)

xt

ctct−1

)=

(xtxt−1

)γ (gt (xt)

ctct−1

)−γ

.

For SI recursive utility, pricing in terms consumption growth isclosely related to pricing in terms of the market return. To elabo-rate, we henceforth fix a reference SI recursive utility U with a regularproportional aggregator (Definition 3.5.10). Given a consumption planc ∈ L++, suppose π is the gradient of U0 at c. The following propositionrelates the corresponding IMRS to the quantity

(3.6.6) Mt ≡Wt

Wt−1 − ct−1

, where Wt ≡ Et[∑T

s=t

πsπtcs

].

In the representative-agent pricing context of Example 3.2.9, c is theaggregate endowment and M is the market return process.

Page 137: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 137

Proposition 3.6.4 (IMRS and market returns). Suppose U is SIrecursive utility whose CE υ has derivative κ and whose proportionalaggregator g is regular, with convex dual g∗ as defined in (3.5.16).Given any c, π ∈ L++, let the process M be defined by (3.6.6). Thenthe process λ ∈ L++ is uniquely defined as the solution to the followingbackward recursion, which also defines the process q along the way:

(3.6.7) qt−1 =g∗t−1 (λt−1)

λt−1

=1

υt−1 (λtMt), λT = 1.

Finally, π is a gradient of U0 at c if and only if

(3.6.8) πtπt−1

= qt−1κt (λtMt)λt, t = 1, . . . , T, π0 = λ0.

Proof. Suppose π is the gradient of U0 at c, and let U ≡ U (c).By Lemma 3.5.3 and the definition of Mt,(3.6.9) Ut = λtWt = (Wt−1 − ct−1)λtMt.

By Lemma 3.5.7, λt = g′t (xt), where xt denotes the consumption-to-CE ratio (3.5.9). Given this fact, the combination of identities (3.5.10)and (3.5.17) proves the first equality in (3.6.7) with qt = qt (xt). Theremainder of (3.6.7) is a consequence of the following string of equalities

qt−1 = xt−11− ht−1 (xt−1)

ht−1 (xt−1)=

ct−1

υt−1 (Ut)

Wt−1 − ct−1

ct−1

=1

υt−1 (λtMt).

The first equality follows from the definition of ht and qt in (3.5.7) and(3.5.10), the second equality follows from the identity ht (xt) = ct/Wt

of Lemma 3.5.7, and the last equality follows by inserting expres-sion (3.6.9) for Ut and simplifying using the homogeneity of υt−1. Thatthe process λ uniquely solves (3.6.7) is shown in Lemma 3.5.11.

The IMRS expression (3.6.8) follows from Lemma 3.5.8, expres-sion (3.6.9) for Ut, and the fact that κt is homogeneous of degree zero(Lemma 3.5.2). Conversely, the IMRS expression uniquely specifiesthe process π, which must therefore be equal to the (unique) gradientof U0 at c.

Example 3.6.5. In addition to the assumptions of Proposition 3.6.4,suppose that υt = u−1

γ Etuγ, where uγ is the power or logarithmicfunction (3.5.1) for some CRRA γ > 0. The CE derivative calcula-tion of Example 3.4.11 and equation (3.6.7) imply that κt (λtMt) =(qt−1λtMt)

−γ. IMRS expression (3.6.8) in this case reduces to

(3.6.10) πtπt−1

= (qt−1λt)1−γM−γ

t ,

with q and λ given by (3.6.7). Note that

(3.6.11) γ = 1 implies πtπt−1

=1

Mt

. ♦

Page 138: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 138

Consider now the consumption growth to market return ratio

(3.6.12) Φt ≡ctct−1

1

Mt

=

(1

ϱt−1

− 1

)ϱt, ϱt ≡

ctWt

,

where the second equation follows from the definition of M in (3.6.6).For a regular proportional aggregator g, the processes x and λ are re-lated by λt = g′t (xt) and xt = It (λt). By Lemma 3.5.7, ϱt = ht (xt) andtherefore the ratio Φ is entirely determined by either x or λ. There-fore, given x or λ, consumption growth and market returns carry thesame information, which explains why their role is interchangeable inPropositions 3.6.2 and 3.6.4. In both cases, x or λ is determined by abackward recursion. For EZW utility with non-unit EIS, the fact thatwe can invert the identity ϱt = ht (xt) means that the IMRS expressioncan be computed jointly in terms of market returns and consumptiongrowth, without having to solve a backward recursion. The followingproposition shows that the resulting IMRS expression8 is the geometricaverage of expression (3.6.2) for the additive (γ = δ) case, and expres-sion (3.6.11) for the unit CRRA (γ = 1) case.

Proposition 3.6.6 (EZW pricing with non-unit EIS). Suppose Uis the EZW utility of Example 3.5.1 for some CRRA γ and inverse-EISparameter δ 6= 1. Fix any consumption plan c and let Mt be defined by(3.6.6). Then π ∈ L++ is the gradient of U0 at c if and only if

(3.6.13) πtπt−1

=

(ctct−1

)−δ)ϕ(

1

Mt

)1−ϕ

, ϕ ≡ 1− γ

1− δ.

Proof. Once again, we apply Lemma 3.5.8 and we show thatthe IMRS expression (3.5.11) reduces to the claimed expression. ByLemma 3.5.7, ϱt ≡ ct/Wt = ht (xt), where xt is the consumption-to-CEratio. Here ht is given by equation (3.5.13), and we can therefore invertthe last equation to compute xt as a function of ϱt:

xt =

1− β

ϱt1− ϱt

)1/(1−δ)

.

Using the expression for qt in (3.5.13), we then compute

(3.6.14) qt (xt) =

1− β

)1/(1−δ)(ϱt

1− ϱt

)δ/(1−δ).

Similarly, using the fact that gt (x) =((1− β)x1−δ + β

)1/(1−δ), we find

(3.6.15) λt = g′t (xt) = (1− β)1/(1−δ)(1

ϱt

)δ/(1−δ).

8The proposition’s proof shows the more general IMRS expression (3.6.17),which applies for any, not necessarily additive, differentiable SI CE. Another gen-eralization is given in Skiadas [2009] for a constant CRRA CE but any regularproportional aggregator g for which h is invertible.

Page 139: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.6. EQUILIBRIUM WITH SCALE INVARIANT RECURSIVE UTILITY 139

The last two centered equations and equation (3.6.12) together imply

(3.6.16) qt−1 (xt−1)λt = β1/(1−δ)Φ−δ/(1−δ)t .

Finally, we claim that

κt (Ut (c)) = κt (λtMt) = κt

−δ/(1−δ)t Mt

).

Since κt is homogeneous of degree zero (Lemma 3.5.2), the first equa-tion follows from equation (3.6.9) (just as in the proof of Proposi-tion 3.6.4), and the second equation follows from equation (3.6.16).The last two displays and IMRS expression (3.5.11) result in

(3.6.17) π0 = λ0 and πtπt−1

= β1/(1−δ)Φ−δ/(1−δ)t κt

−δ/(1−δ)t Mt

).

We will also use the identity

(3.6.18) υt−1

−δ/(1−δ)t Mt

)= β−1/(1−δ),

To show it, let qt = qt (xt) and recall from Proposition 3.6.4 (andits proof) that qt−1 = 1/υt−1 (λtMt) and therefore υt−1 (qt−1λtMt) =1 (since υ is SI). Substituting expression (3.6.16) for qt−1λt resultsin (3.6.18).

All results so far apply for any differentiable SI CE. For a constantCRRA CE, κt (z) = (z/υt−1 (z))

−γ. Applying this expression withz = Φ

−δ/(1−δ)t Mt in the IMRS expression (3.6.17) and using (3.6.18)

the claimed IMRS expression (3.6.13) follows.

Example 3.6.7 (unit EIS). The key to the proof of the precedingproposition is the invertibility of the identity ϱt = ht (xt) of Lemma 3.5.7,which does not hold in the unit-EIS case gt (x) = x1−β, β ∈ (0, 1), whereht (x) = 1− β. Equation (3.6.12) in this case gives

ctct−1

= βMt (assuming unit EIS). ♦

This section’s results attribute all IMRS variability to stochastic con-sumption growth and/or market returns. Another source of IMRSvariability can be hard-wired into preferences, for example, throughstochastic parameters β, γ, δ, or beliefs (Remark 3.6.1). There aremany other possible sources of IMRS variability that do not fit thissection’s formalism, including agent heterogeneity, non-tradeability oflabor income due to moral hazard concerns, collateral constraints andassociated leverage dynamics, other institutional constraints, marketpanics or runs, limited ability to model the future leading to model re-visions violating dynamic consistency, and trading patterns driven bynarrative and influence dynamics that are hard to explain in a Bayesianframework.

Page 140: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.7. OPTIMAL CONSUMPTION AND PORTFOLIO CHOICE 140

3.7. Optimal consumption and portfolio choiceConsider an agent (D, e) with access to an arbitrage-free market X.

In abstract terms, in this section we discuss, under special assumptions,the problem of finding a trade x ∈ X such that e + x is optimal forD given X. We specialize this problem by assuming that preferenceshave an SI recursive utility representation,(3.7.1) e ≡ w10×Ω, w ∈ (0,∞) ,

and the market X is implemented by 1 + J contracts as specified inSection 1.7, where V j

t is strictly positive for every time t > 0 andcontract j. Since the market is arbitrage-free, Sjt−1 and Rj

t ≡ V jt /S

jt−1

are also strictly positive. Contract zero is a money-market account(MMA) with rate process r and return process R0 ≡ 1 + r.

Recall that a trading strategy (θ0, θ) defines in (1.7.6) a correspond-ing portfolio allocation policy ψ =

(ψ1, . . . , ψJ

), and equation (1.7.7)

defines the notation Rψ in terms of ψ and the contract returns Rj. Theagent enters period t with financial wealth Wt−1, consumes ct−1 and in-vests the remainder according to the allocation ψt earning a return Rψ

t

for the period. The wealth process W must therefore satisfy(3.7.2) W0 = w, Wt = (Wt−1 − ct−1)R

ψt , WT = cT .

The pair (c,W ) is related to the synthetic contract (δθ, V θ) bySθt−1 = Wt−1 − ct−1, Wt = V θ

t , ct = δθt , t = 1, . . . , T.

The middle equation in (3.7.2) is therefore simply the claim Rθ = Rψ

of equation (1.7.7). The agent effectively spends w− c0 at time zero topurchase the synthetic contract (δθ, V θ) and subsequently consumes alldividends. Since in an arbitrage-free market every synthetic contractis traded, the contract (δθ, V θ) is priced by every SPD π, which iscondition (3.5.5) and the reason we have used the same notation W inboth instances.

More parsimoniously, we take as primitive the contract returns andthe agent’s initial wealth and express the optimal decision using thefollowing terminology and notation.

Definition 3.7.1. A consumption allocation policy is a (0, 1)-valued adapted process ϱ such that ϱT = 1. A portfolio allocationpolicy is a process ψ =

(ψ1, . . . , ψJ

)∈ P1×J

0 . An allocation policyis a pair (ϱ, ψ) of a consumption policy and a trading policy. A wealthprocess is a strictly positive adapted process. The allocation policy(ϱ, ψ) generates the wealth process W defined recursively by(3.7.3) W0 = w, Wt = Wt−1 (1− ϱt−1)R

ψt ,

in which case the allocation policy (ϱ, ψ) is said to finance the con-sumption plan c ≡ ϱW . An allocation policy is optimal if it financesan optimal consumption plan.

Page 141: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.7. OPTIMAL CONSUMPTION AND PORTFOLIO CHOICE 141

The section’s main result follows. It applies to a concave SI re-cursive utility, where concavity is a consequence of the assumptionthat both the CE and the proportional aggregator are concave (viaLemma 3.5.6). For simplicity, we assume a differentiable CE v andthat the period-t portfolio allocation can be any member of the set

(3.7.4) Ψt ≡ψt ∈ L1×J

t−1 | Rψt is strictly positive

.

The result remains true9 if v is only assumed to be concave and Ψt is anarbitrary spot-dependent nonempty convex subset of the set in (3.7.4).

Theorem 3.7.2. Assume the reference agent has endowment (3.7.1)and preferences represented by an SI recursive utility U with a regularproportional aggregator g and a concave differentiable CE v. Let h,I and g∗ be defined in (3.5.7), (3.5.15) and (3.5.16), respectively. Anallocation policy (ϱ, ψ) is optimal if and only if it can be computed bythe following recursive procedure:

(1) (Initialization) Set λT = ϱT = 1 and t = T .(2) (Recursive step) Given λt, select ψt ∈ Ψt such that

(3.7.5) υt−1(λtRψt ) = max

p∈Ψt

υt−1 (λtRpt ) ,

let λt−1 ∈ L++t−1 be the unique solution to

(3.7.6) λt−1

g∗t−1 (λt−1)= υt−1(λtR

ψt ),

and set ϱt−1 = ht−1 (It−1 (λt−1)).(3) (Loop) While t > 1, decrease t by one and repeat (2).

Assuming (ϱ, ψ) is optimal and finances c, the process λ generated bythis algorithm is the marginal price of wealth process defined by (3.4.13),and U (c) = λW , where W is the wealth process generated by (ϱ, ψ).

Proof. Let us first observe that condition (3.7.5) is equivalent to

(3.7.7) Et−1

[κt(λtR

ψ)λt(Rjt −R0

t

)]= 0, j = 1, . . . , J,

where κ is the derivative of v. The condition sets to zero the partialderivatives of the value being maximized in (3.7.5) with respect to eachαit. Since the CE υ is assumed to be concave, the resulting optimalitycondition (3.7.7) is necessary and sufficient for (3.7.5).

We henceforth assume that the allocation policy (ϱ, ψ) generatesthe wealth process W and finances the consumption plan c ≡ ϱW ,which by the budget equation in the form (3.7.2) implies that

(3.7.8) Rψt =

Wt

Wt−1 − ct−1

.

9For details and a proof of this claim see Skiadas [2013a].

Page 142: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.7. OPTIMAL CONSUMPTION AND PORTFOLIO CHOICE 142

Necessity: Suppose c is optimal with corresponding marginal valueof wealth process λ. Then π ≡ ∇U0 (c) is an SPD, and therefore

(3.7.9) Et−1

[πtπt−1

Rjt

]= 1, j = 0, 1, . . . , J,

which in turn implies

(3.7.10) Et−1

[πtπt−1

Rψt

]= 1, t = 1, . . . , T.

Substituting expression (3.7.8) for Rψ into (3.7.10) shows that π pricesthe contract (W, c), except for the irrelevant technicality that c0 6= 0,which is easily finessed by redefining the time zero contract value tobe W0 − c0. The result is that W satisfies (3.5.5), and we can there-fore apply Proposition 3.6.4 with M = Rψ. Therefore λ uniquely solvesrecursion (3.7.6) and πt/πt−1 is given by equation (3.6.8), which in con-junction with the state-pricing condition (3.7.9) implies the optimalitycondition (3.7.7), and hence the optimal-portfolio equation (3.7.5). Theexpression for ϱ follows from Lemma 3.5.7.

Sufficiency: Conversely, suppose that the processes λ and q areconstructed by the Theorem’s algorithm and let π be defined by therecursion (3.6.8) but with Rψ in place of M . By the construction of qand π, and Lemma 3.5.2, we have

1 = qt−1υt−1(λtRψt ) = qt−1Et−1

[κt(λtR

ψt )]= Et−1

[πtπt−1

Rψt

].

Therefore, equation (2.4.2) is satisfied, which as we saw earlier impliesthat W satisfies (3.5.5) and Proposition 3.6.4 applies with M = Rψ. Itfollows that λ is the marginal value of wealth process (since it uniquelysolves (3.7.6)), π = ∇U0 (c) and U (c) = λW (by Lemma 3.5.3).Since (3.7.7) is satisfied by construction, so is the first equation in (3.7.9).The latter together with (3.7.7) implies the second equation in (3.7.9).Since π prices every contract implementing the market, it is an SPD,which proves the optimality of c.

Example 3.7.3 (Unit EIS and optimal consumption policy). Sup-pose that the proportional aggregator of the unit-EIS form of Exam-ple 3.5.9 with δ = 1. In this case, ht (x) = 1−β for all x, and thereforethe optimal consumption-to-wealth ratio is also a constant, ϱt = 1−β,for every specification of the conditional CE. As we saw in Exam-ple 3.5.1, if the conditional CE is an expected utility one with unitCRRA, then the recursive utility is ordinally equivalent to expecteddiscounted logarithmic utility. By moving away from the expected-discounted utility framework, we can vary risk aversion while retain-ing the simplifying assumption of a constant consumption-to-wealthratio. ♦

Page 143: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.7. OPTIMAL CONSUMPTION AND PORTFOLIO CHOICE 143

Example 3.7.4 (Markovian formulation). Suppose that the con-tracts implementing the market have the Markovian structure of Sec-tion 2.6, which is defined in terms of an underlying martingale basiswith stochastically independent increments. In this case, the processλ and the optimal allocation policy can be expressed as functions ofthe Markov state: λt = λ (t, Zt), ϱt = ϱ (t, Zt) and ψt = ψ (t, Zt−1),with the abuse of notation defined in Section 2.6. Analogously to thearbitrage-pricing application of Section 2.6, the significance of the Mar-kovian formulation is that the recursive formula determining ψt andλt−1 in terms of λt need be evaluated only for every possible value ofthe Markov state Zt−1 rather than every time-(t− 1) spot, which candramatically reduce the problem’s computational complexity. ♦

Example 3.7.5 (Deterministic marginal value of wealth). Supposethat the short-rate process r is deterministic and period-t excess re-turns Rj

t − R0t are stochastically independent of Ft−1, for every t > 0.

In this case, a backward induction shows that the marginal-value-of-wealth process λ at the optimum is deterministic. As a consequence,the optimal allocation policy (ϱ, ψ) is also deterministic, with the opti-mal time-t portfolio weights determined as the solution to the myopicproblem ψt ∈ arg max υt−1 (R

pt ) : p ∈ Ψt. ♦

The algorithm of Theorem 3.7.2 produces a solution provided thereis an optimal portfolio solution to problem (3.7.5). We show belowthat this is indeed the case for an expected-utility SI CE with CRRAγ > 0. Problem (3.7.5) in this case can be equivalently expressed as

(3.7.11) ψt ∈ arg maxp∈Ψt

EQt−1

[(Rp

t )1−γ − 1

1− γ

],

provided we define the probability Q to have the conditional densityprocess ξ determined by the recursion

(3.7.12) ξtξt−1

=λ1−γt

Et−1[λ1−γt ]

, ξ0 = 1.

The reason for this is the change-of-measure formula of Lemma 2.4.10.

Example 3.7.6 (Unit CRRA and optimal portfolio choice). Theoptimal portfolio problem with unit CRRA (γ = 1) reduces to

ψt ∈ arg maxp∈Ψt

Et−1 log (Rpt ) .

In this case, the optimal portfolio weights are the same as for a myopicagent who maximizes conditional expected logarithmic utility over asingle period. In contrast to Example 3.7.5, the myopic portfolio ruleis optimal even if the marginal-value-of-wealth process is stochastic.Referring back to Example 3.7.3, recall that a unit EIS implies a myopicconsumption policy. The intersection of that example and the present

Page 144: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 144

one is the case of expected discounted logarithm utility, in which casethe entire optimal policy (ϱ, ψ) is myopic. ♦

Proposition 3.7.7. For an SI CE with positive CRRA γ, the op-timal portfolio problem has a solution and therefore the algorithm ofTheorem 2.4.10 produces a solution.

Proof. The argument applies over a single period conditionallyon each time-(t− 1) spot. We therefore assume, without loss of gener-ality, that T = 1 and the underlying probability P coincides with theprobability Q specified above. We omit time subscripts and let u (x) ≡(x1−γ − 1) / (1− γ). Suppose first that γ ≥ 1. Given any SPD (1, π),define the compact set A ≡ z ∈ L++ | Eu (z) ≥ u (1 + r) , E [πz] = 1and the closed set B ≡

Rp | p ∈ R1×J. Since 1+r ∈ B and E [πz] = 1

for all z ∈ B, the optimal portfolio problem is equivalent to maximiz-ing the continuous function Eu over the compact set A∩B. Existenceof a maximum follows by Proposition B.2.6. If γ ∈ (0, 1), A as definedabove is not compact. We instead define A to be the set of every [0,∞)-valued random variable z such that E [πz] = 1. Arguing as above, weconclude there is an optimal portfolio choice, provided we verify thatthe optimizing value of z is strictly positive. This follows from the factthat the marginal value z−γ becomes infinite at zero and therefore asufficiently small deviation away from zero is always improving.

3.8. Recursive utility and optimality in continuous timeWe conclude with an introduction to recursive utility and optimal

consumption/portfolio choice in the Brownian setting of Section 2.8.Technicalities aside, the solution is simpler than last section’s discretecounterpart thanks to the quadratic approximations encoded in the Itocalculus. We omit several mathematical details in order to convey theessential argument without requiring knowledge of probability theorybeyond what has already been discussed in this text. As in Section 2.8,the underlying filtration is generated by a single SBM B over the timehorizon [0, T ], defined on some underlying state space (say, the set ofcontinuous paths), with some probability (specifying the likelihood ofevery path) with expectation operator E. The extension to a filtrationgenerated by multiple independent SBMs is straightforward and mainlya matter of introducing appropriate vector notation.

A consumption plan is a strictly positive adapted process c (satis-fying a suitable integrability restriction for utility processes and relatedquantities to be well defined). As in the discrete case, we differentiatebetween consumption over a single period, which in the current contextis ctdt over [t, t + dt], and terminal lump-sum consumption cT . Stateprice densities and utility gradients are defined in terms of the inner

Page 145: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 145

product

〈x | y〉 = E[∫ T

0

xtytdt+ xTyT

].

To formulate recursive utility in continuous time, we fix a referenceconsumption plan c and, abusing notation, we let U ≡ U (c) denotethe corresponding normalized utility process with Ito decomposition

dUt = αUt dt+ βUt dBt, UT = cT .

As we saw in Sections 2.5 and 2.8, expressing αUt as a function of(Ut, β

Ut

)can be thought of as a backward in time recursion, which

is formally expressed as a backward stochastic differential equation(BSDE) to be solved jointly in

(Ut, β

Ut

). This is the way we can specify

recursive utility directly in continuous time.The heuristic argument that follows can be applied to a broader

class of recursive utilities, but for concreteness we assume the expected-utility CE vt = u−1Etu for some continuous and increasing functionu : (0,∞) → R, with Arrow-Pratt coefficient of absolute risk aversionA ≡ −u′′/u′. Ito’s lemma implies

u (Ut+dt) = u (Ut) + u′ (Ut)

(αUt − A (Ut)

2

(βUt)2)

dt+ u′ (Ut) dBt.

Apply the conditional expectation Et on both sides, which eliminatesthe Brownian term, and then apply u−1 on both sides, followed by afirst-order Taylor expansion on the right-hand side. The result is theArrow-Pratt CE approximation10

vt (Ut+dt) = Ut +

(αUt − A (Ut)

2

(βUt)2)

dt.

We can heuristically rewrite the Arrow-Pratt approximation as

vt (Ut+dt) = EtUt+dt −A (Ut)

2vart [Ut+dt] .

As we will see, this is the key in concluding that if a myopic optimalportfolio in a scale-invariant formulation is justified, the optimal port-folio has to be mean-variance efficient in the maximum-Sharpe-ratiosense of Section 2.2 over every infinitesimal time interval.

The aggregator of the recursive utility can also be specified moregenerally, but again for concreteness, we assume

(3.8.1) uδ (Ut) =(1− e−βdt

)uδ (ct) + e−βdtuδ (vt (Ut+dt)) , UT = cT ,

where β > 0 is a constant (not to be confused with βU). For now,uδ : (0,∞) → R can be any increasing and continuously differentiablefunction. Inserting the Arrow-Pratt CE approximation in this recursion

10Named after the contributions of Arrow [1965, 1971] and Pratt [1964].

Page 146: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 146

and using a first-order Taylor expansion results in an expression for αUtin terms of

(Ut, β

Ut

)corresponding to the BSDE

(3.8.2) dUt = −(f (ct, Ut)−

A (Ut)

2

(βUt)2)

dt+ βUt dBt, UT = cT ,

wheref (c, U) ≡ β

uδ (c)− uδ (U)

u′δ (U).

We proceed assuming the BSDE has a unique solution,11 and the so-lution is increasing in c. Analogously to the discrete case, this sec-tion’s entire analysis applies with only cosmetic changes if the param-eters f and A are state and time dependent (provided f (ω, t, c, U) andA (ω, t, U) as functions of (ω, t) are predictable processes).

Remark 3.8.1. The quadratic term in the drift of BSDE (3.8.2) canbe eliminated by passing to the ordinally equivalent utility Vt = u (Ut).Letting φ ≡ uδ u−1, recursion (3.8.1) can be restated asφ (Vt) =

(1− e−βdt

)φ (u (ct)) + e−βdtφ (Et (Vt+dt)) , VT = u (cT ) ,

This is of the same form as (3.8.1) after substituting u (ct) for ct, φ foruδ, V for u (Ut) and finally the risk-neutral CE Et for vt. The argumentleading to (3.8.2) in this case leads to the BSDE12

(3.8.3) dVt = −ϕ (ct, Vt) dt+ βVt dBt, VT = u (cT ) ,

where ϕ (c, v) ≡ β (φ (c)− φ (v)) /φ′ (v). Alternatively, the equivalenceof BSDEs (3.8.2) and (3.8.3) can be shown as an application of Ito’slemma. Assuming sufficient integrability conditions, BSDE (3.8.3) canequivalently be expressed in the form

Vt = Et[∫ T

t

ϕ (cs, Vs) ds+ VT

],

which should be thought of as a fixed-point problem in V . ♦

In the remainder of this section, we concentrate on the special caseof an SI U . Since U is normalized, that means U is homogeneous ofdegree one, and f must also have this property: f (c, U) = Ug (c/U).The function g is the continuous-time counterpart of the proportionalaggregator. Thanks to Theorem A.4.3, additivity of the aggregator

11The existence/uniqueness theory for nonlinear BSDEs begins, independently,with Duffie and Epstein [1992a] (who treated BSDEs of the reduced form of Re-mark 3.8.1) and Pardoux and Peng [1990] (who allowed for a more general depen-dence on the volatility). These papers imposed conditions that are violated in thissection’s main application with EZW utility. For the latter, the relevant BSDEfoundations were developed by Schroder and Skiadas [1999] and Xing [2017].

12In the continuous-time counterpart of Exercise 6, given in Skiadas [1998], theconcavity or convexity of ϕ corresponds to monotonicity of preferences for informa-tion, or what in a different setting Kreps and Porteus [1978] termed preferences forthe timing for resolution of uncertainty.

Page 147: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 147

coupled with scale invariance means that there exist constants β andδ (the inverse of the EIS) such that

(3.8.4) g = βuδ where uδ (x) ≡x1−δ − 1

1− δ, with u1 = log .

Similarly, the assumption of an SI expected-utility CE implies thatu = uγ for a constant coefficient of relative risk aversion γ, that is,A (U)U = γ. Preference monotonicity implies β > 0. We also assumestrict concavity of both uδ and uγ, and therefore γ, δ > 0. LettingβUt = σUt Ut, the utility BSDE becomes

(3.8.5) dUtUt

= −(g

(ctUt

)− γ

2

(σUt)2)

dt+ σUt dBt, UT = cT .

For concreteness, we proceed under the assumption that g is definedby (3.8.4), corresponding to the continuous-time version of the EZWutility of Example 3.5.1. Once again, the analysis that follows appliesmore generally to a (sufficiently regular) increasing and concave g, andboth g and γ can be state and time dependent (subject to the usualadaptedness restriction). Assuming that g and γ are deterministicallows us to make the claim that g determines and is determined bythe utility of deterministic consumption plans, and given g, increasingγ increases risk aversion.

The following is a continuous-time analog to Example 3.5.1.

Example 3.8.2. For any b ∈ R and β ∈ (0,∞), let

(3.8.6) Ut (c) ≡ Et[∫ T

t

e−b(s−t)uδ (cs) ds+e−b(T−t)

βuδ (cT )

].

One way of applying this section’s theory to the dynamic utility Uis to allow g to be time dependent, which does not affect any of thearguments that follow. Specifically, suppose the dynamic utility U isdefined by BSDE (3.8.5) with

g ≡ βtuδ where 1

βt≡∫ T

t

e−b(s−t)ds+e−b(T−t)

β.

An exercise in Ito’s lemma then shows that Ut = β−1t uδ (Ut).

Alternatively, suppose α ∈ R satisfies b = β − (1− δ)α and thedynamic utility U is defined by the BSDE

(3.8.7) dUtUt

= −(α + βuδ

(ctUt

)− γ

2

(σUt)2)

dt+ σUt dBt, Ut = cT .

Another exercise in Ito’s lemma shows that

Ut =uδ(Ut)

β− α

β

∫ T

t

e−b(s−t)ds,

Page 148: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 148

and therefore Ut is ordinally equivalent to the additive utility (3.8.6).Strictly speaking, U is of the EZW form if and only if α = 0, or equiva-lently b = β, but the remainder of this section applies just as well withg ≡ α+ βuδ for any α. Alternatively, the EZW form is recovered aftera suitable change of the unit account. Analogously to Example 3.5.1,we change the unit of account from a “bushel” to a “dollar,” where attime t one bushel equals ut ≡ eαt dollars. A consumption plan c inbushels is consumption plan cu ≡ cu in dollars, and Uu (cu) ≡ uU (c)defines a dynamic utility Uu representing the same preferences over con-sumption plans in dollars as U does over consumption plans in bushels.Integration by parts shows that BSDE (3.8.7) is equivalent to BSDE

dUut

Uut

= −(βuδ

(cut

Uut

)− γ

2

(σUt)2)

dt+ σUt dBt, Uut = cu

T ,

which is the normalized EZW form. As we will see in Remark 3.8.4below, on the market side, changing units from bushels to dollars cor-responds to adjusting the short rate process from r to r + α. ♦

Consider now an agent13 whose preferences are represented by thecontinuous-time SI recursive utility just defined, who has some initialfinancial wealth w > 0 and no other income, and who has access tothe financial market of Section 2.8, but with r, a, y, σ, and thereforeµ, representing general predictable processes, rather than constants(subject to omitted technical integrability restrictions). Allowing moregeneral market parameters will be useful in our discussion of how opti-mal portfolios relate to mean-variance efficiency. The agent can use wto purchase an initial portfolio in the MMA and the stock and can re-balance the account over time, provided the account balance Wt stayspositive at every time t. Unlike the trading strategies of Section 2.8,which were assumed to be self-financing, the agent can withdraw cashfrom the account at a time-t rate ct for all t < T , followed by a ter-minal lump-sum payment cT = WT , thus converting the initial wealthw = W0 to a consumption plan c. The agent seeks to do so in a waythat maximizes the utility U0 (c). By dynamic consistency, maximiza-tion of time-zero utility implies the maximization of utility at everyfuture time. It is therefore sufficient to solve the agent’s problem fromthe perspective of time zero. A useful way of thinking of the agent’sconsumption and trading strategy is as the time-zero purchase of the

13The rest of this section is based on Schroder and Skiadas [2003], which in-cludes trading constraints, more general recursive utilities, as well as multiple assetsand sources of risk, all in the context of an SI formulation with possibly incompletemarkets but tradeable income. Non-tradeable income is inconsistent with scaleinvariance. Schroder and Skiadas [2005] give a version of the argument that in-cludes non-tradeable income based on translation invariance (which removes wealtheffects).

Page 149: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 149

synthetic contract (c,W ), where c is the dividend process generated bythe contract and W is the contract’s cum-dividend price process.

The reader would have no difficulty extending the self-financingbudget equation of Section 2.8 to include this type of contract. In-stead, we equivalently describe the budget equation in terms of wealthallocations, analogously to (3.7.3). An allocation policy is a pair(ϱ, ψ) of predictable processes, where for every time t < T , ϱt ∈ (0,∞)represents the consumption rate as a proportion of wealth and ψt is theproportion of wealth that is allocated to the stock, with the remainderallocated to the MMA. We also assume, as part of an allocation policydefinition, that ϱT ≡ 1 (corresponding to the assumption cT = WT )and that the following version of the budget equation is a well-formedSDE uniquely determining the strictly positive wealth process Wgenerated by (ϱ, ψ):

(3.8.8) W0 = w,dWt

Wt

= (rt − ϱt) dt+ ψt

(dStSt

+ ytdt

).

The resulting consumption plan financed by the allocation policy(ϱ, ψ) is c ≡ ϱW . We call a consumption plan feasible if it is financedby some allocation policy, and optimal if it maximizes U0 among allfeasible consumption plans. We wish to determine an allocation policythat is optimal in that it finances an optimal consumption plan.

Given the reference consumption plan c, the relevant market X (c)is the set of all x such that c + x is feasible consumption plan. Asin Section 2.8, let the strictly positive Ito process π be defined, forarbitrary π0 > 0, by the SDE

(3.8.9) dπ

π= −rtdt− ηtdB, where ηt ≡

µt − rtσt

.

In a variant of Proposition 2.8.1, we will show that π is an SPD forX (c) by analyzing the linear BSDE(3.8.10) dWt = −

(ct − rtWt − ηtσ

W)dt+ σWt dBt, WT = cT ,

which results from entering c = ϱW into the budget equation (3.8.8)and using the return dynamics (2.8.1) and the definition of η in (3.8.9).BSDE (3.8.10) expresses the heuristic backward recursion

Wt = Et[ct +

πt+dtπt

Wt+dt

], WT = cT .

In the finite model, such a recursion states that the present-value func-tion represented by π prices the contract (c,W ) and Wt is the time-tpresent value of c. The latter conclusion is not guaranteed in con-tinuous time. As we saw in the last chapter, in continuous time, wehave to entertain the possibility that the synthetic contract (c,W ) isimplemented by a “doubling” strategy (in a generalized sense), wherewealth can be created from nothing, or a reverse doubling strategy,

Page 150: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 150

where wealth is guaranteed to be destroyed. The fact that W is re-quired to stay positive precludes doubling strategies, but not reversedoubling strategies. As a consequence Wt should be at least as greatas the time-t present value of c. A condition that limits in a suitablesense how big W can get rules out reverse doubling strategies, allowingthe conclusion that Wt equals the time-t present value of c.

Lemma 3.8.3. Suppose(W,σW

)solves BSDE (3.8.10). Then

(3.8.11) Wt ≥1

πtEt[∫ T

t

πscsds+ πT cT

], t ∈ [0, T ] .

Moreover, if E [supt πtWt] <∞, the inequality holds as an equality.Proof. Given the dynamics (2.8.6) for π and (3.8.10) for W , in-

tegration by parts implies that d (πtWt) = −πtctdt+ dMt, where M isa local martingale. Let τn be a sequence of stopping times such thatτn ↑ T with probability one and M is a martingale up to time τn, andtherefore Mt = EtMτn on τn ≥ t, which in turn implies

πtWt = Et[∫ τn

0

πscsds+ πτnWτn

]on τn ≥ t .

As n → ∞, the term Et[∫ τntπscsds

]converges to Et

[∫ Ttπscsds

]by

the monotone convergence theorem (for conditional expectations). Wealso know that πτnWτn converges to πTWT = πT cT . Inequality (3.8.11)follows by Fatou’s lemma (for conditional expectations). If we fur-ther know that E [supt πtWt] < ∞, then limn Et [πτnWτn ] = Et [πTWT ]and (3.8.11) becomes an equality thanks to Lebesgue’s dominated con-vergence theorem (for conditional expectations), another fundamentalresult in the theory of integration.14

Preference monotonicity implies that if the consumption plan c isa candidate to be optimal, it should not be financed by a reverse dou-bling strategy. A sufficient condition for this to be true is last lemma’sintegrability condition, which we would have to verify in a mathemat-ically complete formulation. The integrability condition allows us toconfirm the SPD property of π as we set out to do.

Lemma 3.8.4. Suppose an allocation policy generates the wealthprocess W and finances the consumption plan c. If E [supt πtWt] <∞,then π is an SPD relative to X (c), that is, 〈π | x〉 ≤ 0 whenever c+ xis a feasible consumption plan.

Proof. Suppose c+x is a feasible consumption. By the last lemma,we have π0w ≥ 〈π | c + x〉 and π0w = 〈π | c〉. Subtracting the latterfrom the former, we find 0 ≥ 〈π | x〉.

14For all the referenced conditional expectation limit results, see, for example,Chapter 10 of Dudley [2002].

Page 151: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 151

Given the last lemma, our optimality verification argument willbe exactly as in Proposition 3.3.4, based on concavity and the SPDproperty of the utility gradient.

Remark 3.8.5 (Change of unit of account). As in Example 3.8.2,consider a change of the unit of account where consumption plan c in“bushels” is consumption plan cu = cu in “dollars.” If π is an SPDin bushels, then uπ is the same SPD in dollars. Maximizing U0 (c)subject to 〈π | c〉 ≤ wπ0 (as we are effectively doing in the solutionmethod that follows) is equivalent to maximizing Uu

0 (cu) subject to

〈uπ | cu〉 ≤ w (uπ)0, where Uu (cu) ≡ U (c). In Example 3.8.2, ut ≡ eαt

and the only difference between π and πu is in their respective impliedshort rate processes r and r+α. To see that apply integration by partsto πu and compare the resulting Ito expansion to (3.8.9). ♦

Technical aspects aside, the utility gradient calculation is analogousto Proposition 3.3.4 for the discrete case. Given any feasible directionx, the idea is to use the utility BSDEs to write the Ito expansionof U (c+ ϵx) − U (c), and then linearize the Ito coefficients, assumingsmall ϵ, resulting in a linear BSDE of the form we used earlier to pricethe contract (c,W ). The following remark, which can be skipped ona first reading, outlines the general pattern and associated gradientinequality.15

Remark 3.8.6. Omitting technical details, suppose that for a dif-ferentiable function F , the process Y ≡ Y (c) solves the BSDE(3.8.12) dYt = −Ft (ct, Yt, Zt) dt+ ZtdBt, YT = FT (cT ) .

Fixing c, we use the notation (Y, Z) ≡ (Y (c) , Z (c)) and

λt =∂Ft (ct, Yt, Zt)

∂c, FY (t) ≡ ∂F (ct, Yt, Zt)

∂Y, FZ (t) ≡

∂F (ct, Yt, Zt)

∂Z.

Then (omitting technical requirements),

(3.8.13) limϵ↓0

Yt (c+ ϵx)− Yt (c)

ϵ= Et

[∫ T

t

EsEtλsxsds+

ETEtλTxT

],

where E solves the SDEdEtEt

= FY (t) dt+ FZ (t) dBt, E0 = 1.

15The recursive utility gradient calculation and its use in this chapter originatedin my doctoral thesis Skiadas [1992]. At that time, my advisor Darrell Duffiehad just pioneered continuous-time recursive utility with Larry Epstein in Duffieand Epstein [1992a], and asset pricing applications using dynamic programmingMarkovian methods in Duffie and Epstein [1992b]. My idea was to characterizeoptimality in a static way in terms of gradients without a Markovian structure.These initial results appeared in Duffie and Skiadas [1994]. I continued this workat Northwestern with Mark Schroder, who was a doctoral student at the time,leading to the theory on which this section is based.

Page 152: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 152

Assuming F is concave, we outline a proof of the gradient inequality

(3.8.14) Y0 (c+ x) ≤ Y0 (c) + 〈Eλ | x〉.

Let y ≡ Y (c+ x) − Y (c) and z ≡ Z (c+ x) − Z (c). Omitting timeindices, the gradient inequality for F gives

0 ≤ D ≡ −F (c+ x, Y + y, Z + z) + F (c, Y, Z) + λx+ FY y + FZz.

Subtracting the BSDE for Y (c) from the BSDE for Y (c+ x), we get

dy = − (λx−D + FY y + FZz) dt+ zdB, yT = xT .

This is a linear BSDE of the form (3.8.10), with (y, Fcx−D,FY , FZ)corresponding to (W, c,−r,−η). Given sufficient regularity, the argu-ment of Lemma 3.8.3 in this context implies that y0 = 〈E | λx − D〉.Since both E and D are positive, y0 ≤ 〈E | λx〉 = 〈Eλ | x〉, which isthe claimed gradient inequality. ♦

We continue taking as given the reference consumption plan c andassociated utility process U = U (c) determined by BSDE (3.8.5) withg defined in (3.8.4). Although we have used x for a variety of roles, inthe remainder of this section it represents the consumption to utilityratio, in terms of which we define the process λ:

(3.8.15) xt ≡ctUt

and λt ≡ g′ (xt) = βx−δt for t < T ; λT = 1.

The convex dual g∗ is defined by (3.5.16), and therefore

(3.8.16) g∗ (λt) = g (xt)− g′ (xt)xt.

Remark 3.8.6 applies to BSDE (3.8.5) by letting (Y, Z) =(U,UσU

)and F (c, Y, Z) ≡ Y g (c/Y ) − (γ/2)Y (Z/Y )2. As a consequence, thegradient of U0 is Eλ, where E solves the SDE

dEtEt

=

(g∗ (λt) +

1

2γ(σUt)2)

dt− γσUt dBt, E0 = 1.

The assumed concavity of g implies the concavity of F , as can be seenby applying Lemma 3.5.6 to each of the additive terms defining F . Thisleads to the gradient inequality and the following verification argument.

Lemma 3.8.7. Suppose the allocation policy (ϱ, ψ) finances the planc and π ≡ Eλ is an SPD for the market X (c). Then (ϱ, ψ) is optimal.

Proof. Suppose the plan c + x is feasible. The gradient inequal-ity (3.8.14) with Y = U and π ≡ Eλ gives U0 (c+ x) ≤ U0 (c)+ 〈π | x〉.The fact that x ∈ X (c) implies 〈π | x〉 ≤ 0. The two inequalitiestogether imply that U0 (c+ x) ≤ U0 (c).

Page 153: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 153

To apply this lemma, suppose (ϱ, ψ) finances c and π ≡ Eλ solvesSDE (3.8.9) with π0 = λ0, and is therefore an SPD for X (c) byLemma 3.8.4. The Euler equation for homogeneous functions gives

(3.8.17) Ut = limϵ↓0

Ut (c+ ϵx)− Ut (c)

ϵ=λtπtEt[∫ T

t

πscsds+ πT cT

].

The first equation follows from the fact Ut is homogeneous of degree one(so ϵ cancels out) and the second equation follows from identity (3.8.13)with Y = U and Eλ = π. Provided E [supt πtWt] < ∞, Lemma 3.8.3implies that the right-hand side in (3.8.17) equals λtWt. Using inte-gration by parts, we therefore have the key identities

(3.8.18) U = λW and dU

U=dλ

λ+dW

W+dλ

λ

dW

W.

The first equation and (3.8.15) allow us to compute the optimal con-sumption policy as a function of λt:

(3.8.19) ϱt ≡ctWt

= λtctUt

= λtxt = β1/δλ1−1/δt , t < T.

As in the discrete case, the determination of the optimal portfolioallocation policy can be performed jointly with a backward recursionthat determines λ. To see how, we start with the notation

dλtλt

≡ µλt dt+ σλt dBt.

The budget equation (3.8.8) and (3.8.18) imply(3.8.20) σU = σλ + ψσ.

On the other hand, integration by parts applied to π = Eλ givesdπ

π=

(g∗ (λt) +

1

2γ(σUt)2

+ µλt − γσUt σλ

)dt−

(γσUt − σλt

)dBt.

Matching coefficients with the Ito decomposition (3.8.9) for dπ/π andsubstituting expression (3.8.20) for σU , we find

−µλ = r + g∗ (λ) +γ

2

((ψσ)2 −

(σλ)2)

, ψσ =1

γ

(η + (1− γ)σλ

).

These two equations are the counterpart of the joint recursive deter-mination of λ and ψ in the second step of the algorithm of Theo-rem 3.8.21. The Arrow-Pratt CE approximation leads to the closedform solution for ψσ, which can now be substituted into the expressionfor µλ. The result is a stand-alone BSDE for λ as summarized in thefollowing solution method. The claimed optimal allocation rule is ob-tained by substituting η = (µ− r) /σ in the above expression for ψσ,and EIS ≡ 1/δ in expression (3.8.19) for ϱ. Optimality of the proposedsolution can be verified by essentially reversing our earlier steps andapplying Lemma 3.8.7.

Page 154: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 154

Solution method: Determine(λ, σλ

)by solving the BSDE

(3.8.21) dλtλt

= −(r + g∗ (λt)−

γ

2Qt

(σλt))dt+ σλt dBt, λT = 1,

where Qt (z) ≡ z2 − (γ−1ηt − (1− γ−1) z)2. The optimal allocation

policy (ϱ, ψ) is given by ϱT = 1 and, for t < T ,

(3.8.22) ϱt = βEISλ1−EISt and ψt =

1

γ

µt − rtσ2t

,

where µt ≡ µt + (1− γ)σλt σt.

Example 3.8.8 (Unit EIS). Assuming unit EIS (δ = 1), (3.8.22)implies the optimal consumption allocation rate ϱt = β. To relatethis to Example 3.7.3, let us write β and ϱ for the parameters β andϱ in last section’s discrete context. If we heuristically think of theinfinitesimal time interval [t, t+ dt] as a single discrete period, thenβ = e−βdt = 1−βdt and ϱt = ϱtdt. Therefore, the conclusion ϱt = 1− βof Example 3.7.3 corresponds to ϱt = β in the current context. ♦

Example 3.8.9 (Unit CRRA). Assuming γ = 1, µt = µt and theoptimal portfolio allocation in (3.8.22) reduces to ψt = (µt − rt) /σ

2t .

This is a mean-variance efficient portfolio in the sense of Section 2.2(with Σt = σ2

t ), whose single-period analysis can be heuristically ap-plied over an infinitesimal time interval [t, t+ dt]. In the discrete Ex-ample 3.7.6, we concluded that the optimal portfolio is myopic. Herewe can add the insight of mean-variance efficiency thanks to the Arrow-Pratt approximation of a logarithmic CE. ♦

We encountered another example of a myopic optimal portfolio inExample 3.7.5, which in the current context would correspond to de-terministic r, µ and σ. For simplicity, in the following example we takethese parameters to be constant and derive a closed-form solution. Asin the last example, the combination of a myopic portfolio rule andthe Arrow-Pratt CE approximation lead to a mean-variance efficientoptimal portfolio.

Example 3.8.10 (Constant investment opportunity set). Supposethat rt = r, µt = µ, σt = σ, and therefore ηt = η ≡ (µ− r) /σ, areall deterministic constants. BSDE (3.8.21) has a deterministic solution(σλ = 0

), implying the mean-variance efficient optimal portfolio16

ψt =η

γσ=

1

γ

µ− r

σ2.

16This optimal portfolio allocation (with extensions) was first shown for theadditive case (γ = δ) by Merton [1969, 1971], whose work has been seminal forthe theory of dynamic optimal consumption and portfolio choice. Merton used theHamilton-Jacobi-Bellman approach, which is also applicable in the current setting.

Page 155: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.8. RECURSIVE UTILITY AND OPTIMALITY IN CONTINUOUS TIME 155

For δ 6= 1, the optimal ϱ in (3.8.22) is

ϱt = p(1 +

(pβ−1/δ − 1

)e−p(T−t)

)−1, t < T, ϱT = 1,

where p ≡ (1− δ−1)(r + (2γ)−1 η2

)+ δ−1β. To show this, set σλ = 0

in BSDE (3.8.21), thus reducing it to an ordinary differential equation(ODE), which after the change of variable zt ≡ (λt/β)

(1−δ)/δ becomesdzt = (pzt − β) dt or d (e−ptzt) = (β/p) de−ptdt. Integrating from t toT , results in

λt =

(β1/δ

p+

(1− β1/δ

p

)e−p(T−t)

)δ/(1−δ).

For δ = 1, the optimal consumption policy is given in Example 3.8.8and, like ψ, does not depend on λ. One may still wish to compute λ inorder to verify the solution and also determine U = λW . This is easilydone since logλt satisfies an easy-to-solve linear ODE.

Finally, note that the above solution applies to the expected dis-counted utility of Example (3.8.2), and more generally to any recursiveutility specified by BSDE (3.8.5) with g = α+βuδ, by simply replacingr with r+ α while keeping η and ψ = η/γσ the same (implying that µchanges to µ+ α). This can be seen either directly by tracing the roleof adding α to g, or through Remark 3.8.5 after the change of the unitof account described in Example (3.8.2). ♦

The more general expression for the optimal portfolio ψt in (3.8.22)allows for deviations from mean-variance efficiency due to a stochasti-cally varying shadow-price-of-wealth process. One way of understand-ing its structure in terms of the discrete theory is through the obser-vation that for the CE vt = u−1

γ Etuγ, the optimal portfolio rule canbe expressed as (3.7.11), where Q is the probability with the condi-tional density process ξ defined by recursion (3.7.12). Heuristically,if the discrete time period is the time interval [t, t+ dt], it is nothard to see, using Ito’s lemma, that recursion (3.7.12) corresponds todξt/ξt = − (1− γ)σλt dBt. As we saw in Section 2.8, Girsanov’s the-orem implies that the drift of the stock’s cum-dividend return, whichis µ under P , becomes µt ≡ µt + (1− γ)σλt σt under Q. The claimedoptimal portfolio in (3.8.22) is therefore mean-variance efficient underthe probability Q. Just as in the last two examples under P , this isa consequence of the Arrow-Pratt CE approximation in the myopicoptimal portfolio problem (3.7.11) under Q.

Examples in which σλ is stochastic can be formulated in termsof Markovian formulations (more naturally with multiple sources ofrisk), where the BSDE solutions can be related to corresponding PDEsolutions, in a generalization of the argument that related the linearBSDE (2.8.15) to PDE (2.8.19).

Page 156: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 156

3.9. ExercisesExercise 1 Assume the context of the CAPM equilibrium of Ex-

ample (3.2.8), including preference transitivity. Provide a dual proofthat the equilibrium is effective complete by showing that there ex-ists a complete market X ⊇ X such that (X, c) is an equilibrium andapplying Corollary 3.2.7.Hint: Define the complete market X by setting to zero the presentvalue of all time-one payoffs that are orthogonal to all traded time-onepayoffs.

Exercise 2 This exercise outlines a variant of the representative-agent argument of Example 3.2.9 that includes an example of a CAPMequilibrium. To do so, we modify definition of a consumption setand Definition 3.2.9 of a preference correspondence by weakening themonotonicity requirement to introduce an upper bound on allowableconsumption: For every preference correspondence D, there exists aconsumption plan b ∈ L such that dom (D) = c ∈ L | c < b, and forevery arbitrage cash flow y, if x = 0 or x ∈ D (c) then x + y ∈ D (c),provided x+ y < b. (The notation c < b means c (ω, t) < b (ω, t) for all(ω, t).) The technical motivation behind this definition is to allow fora quadratic utility representation, as in this exercise’s final part, whichimplies variance aversion and the CAPM pricing equation. Quadraticutility is increasing only up to a maximum. The idea is to only con-sider equilibria in which the constraint that consumption is below thatmaximum is non-binding and therefore the nature of the utility formabove the maximum is irrelevant. We proceed with a more generalspecification that emphasizes the aggregation argument.

Agent preferences are specified in terms of a given scale-invariantpreference D0 with dom (D0) ≡ c ∈ L | c < 0. For every i ∈ 1, . . . , I,the preference correspondence of agent (Di, ei) is specified in terms ofbi ∈ L by dom (Di) ≡ c ∈ L | c < bi and Di (c) ≡ D0 (c− bi). Fur-ther assume that for all i, there exist vi, wi ∈ R such that

ei − wi1Ω×0 ∈ X, bi − vi1Ω×0 ∈ X and wi < vi,

and let b ≡∑

i bi, e ≡

∑ei, v ≡

∑i v

i, w ≡∑

iwi. The allocation

c ≡(c1, . . . , cI

)is defined by

ci ≡ bi − vi − wi

v − w(b− e) , i = 1, . . . , I.

Assume that e < b and therefore ci < bi for all i by construction.(a) Show that the allocation c is market-clearing and X-feasible,

assuming X is arbitrage-free.(b) Show that (X, c) is an equilibrium if and only if e is an opti-

mal consumption plan for the representative agent (D, e), defined bydom (D) ≡ c ∈ L | c < b and D (c) ≡ D0 (c− b).

Page 157: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 157

(c) Show that if (X, c) is an equilibrium and D (e) is convex, then(X, c) is an effectively complete market equilibrium.

(d) Specialize the setting by assuming that T = 1 and D0 has thequadratic utility representation U0 (c) = −c20 − βE [c21], where E is ex-pectation under a given full-support probability and β is a positivescalar. Assume further that the parameters b1, . . . , bI are (determin-istic) constants. Verify that the agent preferences are variance aversein the sense of Example 3.2.8. Adding the assumptions of a tradedmoney-market account and the regularity condition var [e1] > 0 ande0 6= w, all the CAPM assumptions of Example 3.2.8 are satisfied. Inthis context, give an alternative derivation by the CAPM beta-pricingequation (3.2.4) by using Proposition 3.3.4 and the optimality of theaggregate endowment for the representative agent.Hint: Construct a SPD whose time-one value is an affine function ofthe market return.

Exercise 3 This exercise outlines a variant of the representative-agent pricing argument of Example 3.2.9, where preferences are as-sumed to be translation invariant instead of scale invariant. (As shownin Appendix A, an additive utility representing a translation-invariantpreference correspondence necessarily takes an exponential form, whichin the expected-utility case corresponds to constant absolute risk aver-sion.) Every preference correspondence in this exercise is assumed tohave as its domain the entire consumption set L. Agents are specifiedin terms of a fixed reference agent (D0, e0). The key assumption is thatD0 is translation invariant:

D0 (c) = D0 (c+ θ) for all θ ∈ R.

For i = 1, . . . , I, agent (Di, ei) is specified in terms of the parameters(αi, θi, xi) ∈ (0,∞)× R×X by

Di (c) ≡ αiD0( cαi

), ei ≡ αie0 + θi + xi.

Let α ≡∑I

i=1 αi, θ ≡

∑Ii=1 θ

i, x ≡∑I

i=1 xi, e ≡

∑Ii=1 e

i, and definethe representative agent (D, e) by

D (c) = αD0( cα

), e = αe0 + θ + x.

The allocation c =(c1, . . . , cI

)is defined by

(3.9.1) ci = θi +αi

α(e− θ) .

(a) Show that (X, c) is an equilibrium if and only if the consumptionplan e is optimal for the representative agent given the market X,that is, X ∩ D (e) = ∅.

(b) Suppose (X, c) is an equilibrium and D (e) is convex. Show that(X, c) is an effectively complete market equilibrium.

Page 158: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 158

(c) Fix an underlying full-support probability and assume that forsome β ∈ (0, 1), every Di has the utility representation

U i (c) = E

[−αi

T∑t=0

βt exp(− ctαi

)].

Verify that the above preference assumptions are satisfied and specifya state-price density π that corresponds to the utility gradient of therepresentative agent at the aggregate endowment.

(d) Consider a sequence of the model of part (c) parameterized bythe time-horizon T = 1, 2, . . . , and take as given an infinite filtrationFt | t = 1, 2, . . . and corresponding adapted process e = (e0, e1, . . . ).The information and endowment of the T th model is the restrictionof the respective quantity over the time set 0, 1, . . . , T, that is, thefiltration is Ft | t = 1, 2, . . . , T and the endowment is (e0, e1, . . . , eT ).All other parameters are common across all models. The endowmentprocess is strictly positive and follows the dynamics et = et−1 (1 + gt),where the random variable gt is stochastically independent of Ft−1 andtakes the value ε ∈ (0, 1) with probability p ∈ (0.5, 1) and −ε withprobability 1−p. Assume that m ≡ p log (1 + ε)+(1− p) log (1− ε) >0. What is the limit of the equilibrium short rate rt as t → ∞. Youcan use the law of large numbers from probability theory, which impliesthat t−1 log (et/e0) = t−1

∑ts=1 log (1 + gt) converges (with probability

one) to the mean of log (1 + gt), which is m > 0, and therefore et → ∞as t→ ∞ (with probability one).

Exercise 4 Assume a single period (T = 1) and fix an underlyingfull-support probability throughout. An agent’s time-one consumptionis restricted to take values in a non-empty open interval D ⊆ R, whichis agent specific. Given K states, DK denotes the set of all D-valuedrandom variables, and C2

++ (D) denotes the set of all strictly increasingand twice-continuously differentiable functions of the form u : D → R,with the first two derivatives denoted u′ and u′′, respectively. Utilitiesover time-one consumption are assumed to be of the expected-utilityform Eu (c), c ∈ DK , where u ∈ C2

++ (D). The corresponding coeffi-cient of absolute risk aversion is the function Au : D → R definedby

Au (w) ≡ −u′′ (w)

u′ (w), w ∈ D.

(a) Suppose that u, u ∈ C2++ (D). Show that Au = Au if and only

if there exist a ∈ (0,∞) and b ∈ R such that u = au + b. Then useTheorem A.2.4 to show that u and u represent the same preference, inthe sense that Eu (x) > Eu (y) ⇐⇒ Eu (x) > Eu (y) for all x, y ∈ DK ,if and only if Au = Au.

Page 159: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 159

(b) We call u HARA with coefficients (α, β) ∈ R if u ∈ C2++ (D)

withD ≡ w | α + βw > 0 and Au (w) ≡ 1

α + βw.

The parameter β is the coefficient of cautiousness. (For β < 0,consumption is bounded above, just as in Exercise 3.9.) Show that uis HARA with coefficients (α, β) if and only if there exist a ∈ (0,∞)and b ∈ R such that

(3.9.2) au (w) + b =

1

β−1(α + βw)(β−1)/β , if β 6= 0 and β 6= 1;

log (α + w) , if β = 1;−α exp (−w/α) , if β = 0 and α > 0.

(c) Explain why u HARA implies that u is strictly concave.Exercise 5 This exercise, which has Exercise 4 as a prerequisite,

formulates and analyzes a representative-agent equilibrium for agentswho maximize HARA expected utility. The formulation nests Exam-ple 3.2.9 and Exercises 2 and 3 for the expected utility case. 17

Assume T = 1 and fix an underlying full-support probability overK states. For simplicity, agents can only trade in a forward market tomodify their time-one consumption. Taken as given is a column vectorV ≡ (V1, . . . , VJ)

′, where Vj ∈ RK represents the time-one value of someasset that can be traded in a forward market. A portfolio is a rowvector θ ≡ (θ1, . . . , θJ), where θj ∈ R represents a number of forwardcontracts in asset j. Given a forward-price vector f ≡ (f1, . . . , fJ)

′ ∈RJ×1, the portfolio θ generates the cash flow (0, θ (V − f)). Since alltime-zero cash flows are zero, we define the market (given f) by

Xf ≡θ (V − f) | θ ∈ R1×J .

Each agent i ∈ 1, . . . , I has a (time-one) endowment of the formei ≡ ai + biV, ai ∈ R, bi ∈ R1×J ,

and maximizes the utility Eui over time-one consumption, where uiis HARA with coefficients (αi, β). Note that E and β are the sameacross agents. The set of admissible (time-one) consumption plans foragent i is DK

i , where Di ≡ w | αi + βw > 0. The consumption planci ∈ DK

i is optimal for agent i given f if there is no x ∈ Xf suchthat ci + x ∈ DK

i and Eui (ci + x) > Eui (ci). An allocation is anyelement of DK

1 × · · · × DKI . The allocation c ≡

(c1, . . . , cI

)is said

to be f-feasible if ci − ei ∈ Xf for all i, and market-clearing if∑i ci = e. An equilibrium is a pair (f, c) of a forward-price vector

f and an f -feasible and market-clearing allocation c such that for alli ∈ 1, . . . , I, ci is optimal for agent i given f .

17An extension with asymmetric information is given in DeMarzo and Skiadas[1998, 1999].

Page 160: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 160

Define the aggregate parameters

α ≡I∑i=1

αi, a ≡I∑i=1

ai, b ≡I∑i=1

bi, e ≡I∑i=1

ei = a+ bV.

The representative agent is endowed with the aggregate endow-ment e and maximizes the expected utility Eu, where u is HARA withcoefficients (α, β). The set of admissible consumption plans for therepresentative agent is therefore DK , where D ≡ w | α + βw > 0.Assume that for every agent i, ei ∈ DK

i and therefore e ∈ DK .(a) Fix any forward-price vector f such that Xf is arbitrage-free

(meaning Xf ∩ RK+ = 0). Show that

(3.9.3) ci ≡ ai + bif +αi + β (ai + bif)

α + β (a+ bf)b (V − f) , i = 1, . . . , I,

defines an allocation c ∈ DK1 ×· · ·×DK

I , which is f -feasible and market-clearing, and there exist scalars λif > 0 such that

λifu′i

(ci)= u′ (e) , i = 1, . . . , I.

(b) Show that for all f ∈ RJ×1, the aggregate endowment e isoptimal for the representative agent given f (that is, there is no x ∈ Xf

such that e+ x ∈ DK and Eu (e+ x) > Eu (e)) if and only if

(3.9.4) f =1

Eu′ (e)E [u′ (e)V ] .

(c) Show that equations (3.9.4) and (3.9.3) define an equilibrium(f, c).

(d) Show that the equilibrium of part (c) is an effectively com-plete market equilibrium: for all x1, . . . , xI such that ci + xi ∈ DK

i , ifEui (ci + xi) ≥ Eui (ci) for all i and Eui (ci + xi) > Eui (ci) for some i,then

∑Ii=1 x

i /∈ Xf . You can base your argument on Proposition 3.3.8,but you should give a proof from first principles.

(e) Show that the equilibrium of part (c) is unique. Hint: If f ispart of some equilibrium and the market-clearing allocation c is Paretooptimal and f -feasible, then (f, c) is an equilibrium.

(f) Assume β ≡ −1. Given the equilibrium of part (c), call R atraded forward return if there exists a portfolio θ such that θf 6= 0and R = θV /θf . Assume that a + bf 6= 0 and var [e] > 0, and definethe forward return of the aggregate endowment:

Rm ≡ a+ bV

a+ bf=

e

forward price of e.

Assuming thatR0 is a forward traded return that is uncorrelated toRm,show that every traded forward return R satisfies the CAPM equation

E[R−R0

]=

cov [R,Rm]

var [Rm]E[Rm −R0

].

Page 161: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 161

Exercise 6 (Preferences for the Timing of Resolution of Uncer-tainty) In this course preferences are defined over consumption planstaking the information tree (filtration) as given. In settings in which theinformation structure is not fixed, it is natural to consider preferencesnot only over consumption but also over information.18 In particular,an agent may have preferences for earlier or later resolution of uncer-tainty about future consumption. In the context of Example 3.3.11,given the consumption plan b, an agent with preferences for early reso-lution of uncertainty would prefer to have all T coin tosses announcedat time zero, rather than wait to find out the outcome of the tth cointoss at time t.

More formally, let C be a (non-empty) set of consumption plans andlet Φ be the set of every filtration Ft : 0, . . . , T satisfying F0 = Ω, ∅and FT = 2Ω. We consider a (non-normalized) utility function V0 (·)over the set of all pairs (c, Ft) ∈ C × Φ such that c is adapted toFt. Taking as primitive an underlying probability with expectationoperator E and the functions Ft : R2 → R, t = 0, . . . , T − 1 andFT : R → R, assume that V0 (c, Ft) is the initial value of the processV that solves the backward recursion

Vt = Ft (ct,E [Vt+1 | Ft]) , t = 0, . . . , T − 1; VT = FT (cT ) .

We say that the utility function V0 (·) expresses preferences for earlierresolution of uncertainty if for all (c, F1

t ) and (c, F2t ) in its domain,

F1t ⊆ F2

t for all t implies V0(c,F1t

)≤ V0

(c,F2t

).

Preferences for late resolution of uncertainty are defined analogously,with the last inequality reversed.

(a) Use Jensen’s inequality to show that if the functions Ft areconvex (resp. concave) in their second (utility) argument for all t < T ,then V0 (·) expresses preferences for earlier (resp. later) resolution ofuncertainty.

(b) Suppose the choice of Ft corresponds to the functional form ofexpected discounted utility. What does that imply about preferencesover information?

(c) Suppose the utility process associated with (c, Ft) is of theEZW form of Example (3.5.1), with CRRA γ and inverse EIS δ. Char-acterize preferences for the timing of resolution of uncertainty in termsof the relative value of the parameters γ and δ. You can use withoutproof the fact that if u is of the HARA form (3.9.2) and β ∈ (0, 1), thenthe function u−1 ((1− β)u (x) + βu (y)) is concave if u is concave, and

18The notion of preferences for the timing of resolution of uncertainty appearsin Kreps and Porteus [1978] using a formalism based on preferences over probabilitydistribution. The approach without probabilities in this exercise is from Skiadas[1998].

Page 162: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 162

convex if u is convex. (This follows by the argument of Example 3.2.9for the power-or-logarithmic case and a similar argument applies forthe exponential case.)

Exercise 7 In this problem assume scale-invariant recursive utilitywith a proportional aggregator that takes the unit-EIS form gt (x) =x1−β for some β ∈ (0, 1).

(a) Specialize the result of Proposition 3.6.2 (pricing and consump-tion growth) to this case.

(b) Specialize the result of Proposition 3.6.4 (pricing and marketreturns) to this case.

(c) Show that the procedures for computing π in parts (a) and (b)can be easily derived from each other, given the relationship betweenconsumption growth and market returns of Example 3.6.7 (unit EIS).

(d) Further specialize the results of parts (a) and (b) by assumingunit-EIS Epstein-Zin-Weil utility. (Provide as specific expressions asyou can.)

Exercise 8 An alternative approach to proving Theorem 3.7.2 onoptimal consumption and portfolio choice is based on the ideas of dy-namic programming.19 While it is not difficult to show the completeTheorem 3.7.2, in this problem you are only asked to use a dynamicprogramming argument to prove the sufficiency part: If (ϱ, ψ) is con-structed according to the Theorem’s algorithm, then (ϱ, ψ) finances anoptimal consumption plan. Also, for notational brevity, assume theproportional aggregator gt = g is time independent.

Define a value function to be a mapping V : (0,∞) → L++ thatassigns to each w ∈ (0,∞) a strictly positive adapted process V (w).For each spot (F, t), think of V (w) (F, t) as the optimal utility level atspot (F, t) given total wealth w at that spot. (In a typical application,the dependence on (F, t) is through the value Z (F, t) of some Markovprocess Z on that spot, and the value function is therefore definedas a function of wealth and the Markov state. Here, Z (F, t) can bethought of as being the spot itself, which can be identified with theentire history leading to that spot.)

Write Vt (w) for the random variable that takes the value V (w) (F, t)on F for every spot (F, t). The corresponding Bellman equation is

Vt−1 (w) = maxϱt−1,ψt

f(ϱt−1w, υt−1

(Vt((1− ϱt−1)wR

ψt

))),

Proceed in the following steps.

19Richard Bellman coined the term “dynamic programming” in 1950 at theRAND corporation as a marketing ploy in getting his research agenda past hisboss, and later secretary of defense, Wilson. As recounted by Bellman’s colleagueand coauthor Dreyfus [2002], Wilson was no fan of mathematical research.

Page 163: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

3.9. EXERCISES 163

(a) What exactly is the mathematical content of the Bellman equa-tion at each spot? What is the set over which maximization occurs ateach spot? Be as precise and specific as you can.

(b) Conjecture the functional form of V (w) and explain your rea-soning.

(c) Show that the algorithm of Theorem 3.7.2 produces a solutionto the Bellman equation.

(d) Provide what is known as a verification argument, that is, showthat given the right terminal condition, a solution to the Bellman equa-tion defines an optimal policy. You should explain how the Bellmanequation defines the optimal policy and then prove optimality usingonly the budget equation, the utility definition, and the Bellman equa-tion.

Page 164: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

APPENDIX A

Additive Utility Representations

This appendix presents some basic results on the representationof preferences by additive utility functions. Simple ordinal assump-tions, mainly separability and continuity, are enough for the existenceof an additive utility representation. Additional ordinal restrictions,like preference convexity, scale invariance or translation invariance,are shown to have strong implications for the structure of an additiveutility representation. The appendix concludes with a discussion ofrisk aversion in the context of expected utility.

A.1. Utility representations of preferences

In this appendix we discuss preferences over the set C ≡ (ℓ,∞)N

for some integer N ≥ 2 and constant ℓ ∈ [−∞, 1). We treat a constantα ∈ (ℓ,∞) as an element of C by identifying it with (α, α, . . . , α). FromChapter 3, we adopt Definition 3.1.1 of a preference correspondence andDefinition 3.3.1 of a utility (but without the restriction C = L++). Apreference correspondence D defines a corresponding binary relation on C by letting a b ⇐⇒ a − b ∈ D (b). (Formally speaking,a binary relation on C is a subset of C × C and the notationa b means (a, b) ∈.) We henceforth use the term preference tomean any binary relation on C defined as above by some preferencecorrespondence D, in which case we say that a utility U : C → Rrepresents if it represents D, that is, for all a, b ∈ C,

a b ⇐⇒ U (a) > U (b) .

All preferences in this appendix are assumed to have a utility rep-resentation, but we use the preference notation where we wish toemphasize that a property is ordinal and therefore not dependent onany particular choice of a utility representation. In the remainder ofthis section, we review necessary and sufficient conditions for a prefer-ence to admit a utility representation. The argument is simplified byour standing assumption of monotone preferences.1

Let the relation on C be defined by a b ⇐⇒ not b a.Clearly, admits a utility representation U if and only if admits autility representation U in the sense (A.1.1) of the following result.

1Debreu [1983] (based on a 1954 working paper) shows the existence of a contin-uous utility representation of a continuous, complete and transitive binary relation,without monotonicity.

164

Page 165: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.1. UTILITY REPRESENTATIONS OF PREFERENCES 165

Theorem A.1.1. For any binary relation on C, there exists autility function U : C → R such that

(A.1.1) a b ⇐⇒ U (a) ≥ (b) .

if and only if is• total: for all a, b ∈ C, either a b or b a.• transitive: a b and b c implies a c.• increasing: for all a, b ∈ C, if b 6= a ≥ b then not b a.• continuous: for every sequence (an, bn) such that an bn,

if limn→∞ (an, bn) = (a, b) ∈ C × C, then a b.

Proof. The “only if” part is immediate from the definitions. Con-versely, suppose that is total, transitive, increasing and continuous.On C, define the function U (c) ≡ inf α ∈ R | α c and the relationa ∼ b if and only if both a b and b a. We proceed in six steps.

Step 1. U (c) ∼ c. Choose any sequence in αn in R that convergesto U (c) and αn c for all n. Since is continuous, U (c) c andU (c) = min α ∈ R | α c. For all n = 1, 2, . . . , it is not the casethat U (c)− n−1 c, and since is total, c U (c)− n−1. Since iscontinuous, c U (c). This proves that U (c) ∼ c.

Step 2. a b ⇐⇒ U (a) ≥ U (b). Since is transitive, if a b,then α ∈ R | α a ⊆ α ∈ R | α b, and therefore U (a) ≥ U (b)by the definition of U . Conversely, since is monotone, U (a) ≥ U (b)implies that U (a) U (b). By Step 1, a U (a) and U (b) b. Since is transitive, it follows that a b.

Step 3. U is increasing. Suppose a, b ∈ C and b 6= a ≥ b. Since is total and increasing, a b and not b a. By Step 2, U (a) ≥ U (b)and not U (b) ≥ U (a), and therefore U (a) > U (b).

Step 4. For all u ∈ (ℓ,∞), U (u) = u. Since is total and increas-ing, for all α, u ∈ (ℓ,∞), α u if and only if α ≥ u. The claim nowfollows by the definition of U .

Step 5. For all u ∈ R, if u ∼ c, then u = U (c). By Step 1, c ∼ U (c).Since is transitive, u ∼ U (c). By Steps 2 and 4, it then follows thatu = U (c).

Step 6. U is continuous. Consider any sequence cn in C converg-ing to c ∈ C. Suppose first that U (cn) converges to some u. SinceU (cn) ∼n cn for all n and is continuous, it follows that u ∼ c andtherefore u = U (c) by Step 5. Consider now the general case, wherewe do not know a-priori that U (cn) converges. Since cn → c, thereexist a, b ∈ (ℓ,∞) such that c and all components of cn are valued inthe interval [a, b] and therefore U (c) and U (cn) are all valued in thecompact set [U (a) , U (b)] (which equals [a, b], although that does notmatter here). It follows that every subsequence of U (cn) has a con-vergent further subsequence, which as shown earlier must converge toU (c). This proves that U (cn) also converges to U (c).

Page 166: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.2. ADDITIVE UTILITY REPRESENTATIONS 166

A.2. Additive utility representations

The rest of this appendix discusses preferences on (ℓ,∞)N , whereℓ ∈ [−∞, 1), that admit an additive utility representation in the fol-lowing sense.

Definition A.2.1. A utility U : (ℓ,∞)N → R is additive if thereexist functions Un : (ℓ,∞) → R such that

(A.2.1) U (x) =∑N

n=1Un (xn) , x ∈ (ℓ,∞)N .

Given any x, y ∈ (ℓ,∞)N and A ⊆ 1, . . . , N, xAy denotes theelement of (ℓ,∞)N defined by

(xAy)n =

xn, if n ∈ A;yn, if n /∈ A.

Definition A.2.2. A preference on (ℓ,∞)N is separable if

(A.2.2) xAz yAz ⇐⇒ xAz yAz,

for all x, y, z, z ∈ (ℓ,∞)N and A ⊆ 1, . . . , N.

A preference that admits an additive utility representation is clearlyseparable. The following remarkable theorem gives a converse. It is aspecial case of Debreu’s additive representation theorem,2 but it cap-tures the deeper aspects of Debreu’s theorem. The proof would takeus too far afield and will not be given here.

Theorem A.2.3 (Existence of Additive Representations). SupposeN > 2 and is a preference on (ℓ,∞)N that admits some utilityrepresentation. Then admits an additive utility representation if andonly if it is separable.

The assumption N > 2 is critical; separability is not sufficient forthe existence of an additive representation if N = 2. The uniquenesspart of Debreu’s theorem (which also applies if N = 2) is stated andproved below. The result has important ramifications for the discus-sion of the limitations of additive utilities in Chapter 3, as well as ourlater parametric characterization of scale or translation invariant addi-tive preferences, and the associated foundation of expected utility withconstant relative or absolute risk aversion.

2Debreu [1983] characterizes continuous additive utility representations in away that includes Theorems A.2.3 and A.2.4. Debreu’s theorems are part of abroader theory of measurement, which is reviewed in the monographs of Krantzet al. [1971] and Narens [1985]. These authors present an algebraic theory thatgeneralizes Debreu’s topological results (see also Wakker [1988]). A detailed proofof Debreu’s theorem can be found in Wakker [1989].

Page 167: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.2. ADDITIVE UTILITY REPRESENTATIONS 167

Theorem A.2.4 (Uniqueness of Additive Representations). For alladditive utilities U and U on (ℓ,∞)N , where N ≥ 2, the following twoconditions are equivalent:

(1) U and U are ordinally equivalent: For all x, y ∈ (ℓ,∞)N ,

U (x) > U (y) ⇐⇒ U (x) > U (y) .

(2) U and U are related by a positive affine transformation:There exist a ∈ (0,∞) and b ∈ RN such that

Un = aUn + bn, n = 1, . . . , N.

Proof. That (2) =⇒ (1) is immediate. We show the converse,assuming that ℓ = −∞. This is without loss of generality: If ℓ > −∞,then apply the result for ℓ = −∞ to the utility functions that mapz ∈ RN to

∑n Un (ℓ+ ezn) and

∑n Un (ℓ+ ezn), respectively.

Consider any ordinally equivalent additive utilities U and U onRN that satisfy, for all n ∈ 1, . . . , N,

(A.2.3) Un (0) = Un (0) = 0 and U1 (1) = U1 (1) = 1.

The claim is that Un = Un for all n, which proves (1) =⇒ (2),since any additive utility on RN can be made to satisfy normaliza-tion (A.2.3) after a positive affine transformation. Fixing an arbitraryn ∈ 2, . . . , N, L ∈ (−∞, 0) and scalar ∆ such that L+∆ > 1, definethe functions f, g : [0, 1] → R by

f (z) ≡ U1 (L+ z∆)− U1 (L)

U1 (L+∆)− U1 (L), g (z) ≡ Un (L+ z∆)− Un (L)

U1 (L+∆)− U1 (L).

Define also f and g by putting a tilde over f , g and every instance of Uin the above display. Applying Lemma A.2.5 immediately following thisproof to these functions, it follows that U1 (x) = U1 (x) and Un (x) =Un (x) for all x ∈ [L,L+∆]. This proves that Un = Un, since everyx ∈ R is in an interval of the form [L,L+∆] ⊃ [0, 1].

Lemma A.2.5. Suppose the functions f, g, f , g : [0, 1] → R areincreasing and continuous, and satisfy

(A.2.4) f (0) = g (0) = f (0) = g (0) = 0 and f (1) = f (1) = 1,

Suppose also that for all x, y, z, w ∈ [0, 1],

(A.2.5) f (x)+g (y) = f (z)+g (w) ⇐⇒ f (x)+g (y) = f (z)+g (w) .

Then f = f and g = g.

Proof. LetN be any positive integer such that 2−N < g (1). Givenany n ∈ N,N + 1, . . . , define xnk ∈ [0, 1] and yn ∈ (0, g (1)) by

(A.2.6) f (xnk) = k2−n, k = 0, 1, . . . , 2n, and g (yn) = 2−n.

Page 168: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.3. CONCAVE ADDITIVE REPRESENTATIONS 168

Note that xn0 = 0 and xn2n = 1. Since g (0) = 0, we havef (xnk) + g (0) = f

(xnk−1

)+ g (yn) , k = 1, . . . , 2n.

By assumption (A.2.5), it is also true that(A.2.7) f (xnk) + g (0) = f

(xnk−1

)+ g (yn) , k = 1, . . . , 2n.

Since g (0) = f (0) = 0, it follows that

1 = f (1) =∑2n

k=1f (xnk)− f

(xnk−1

)= 2ng (yn) .

This proves that g (yn) = 2−n, which together with (A.2.7) shows thatf (xnk) = k2−n for k > 0. Comparing this conclusion to (A.2.6), wehave proved that the functions f−1 and f−1 are equal on the set Dn =k2−n : k = 0, . . . , 2n, for all n ≥ N . Since the set

⋃n≥N Dn is dense

in [0, 1] and the functions f−1 and f−1 are continuous, it follows thatf−1 = f−1 and therefore f = f .

To show that g = g, we apply the same argument with (F,G, F , G)

in place of (f, g, f , g), where

F (z) ≡ g (z)

g (1), F (z) ≡ g (z)

g (1), G (z) ≡ f (z)

g (1), G (z) ≡ f (z)

g (1).

The conclusion F = F implies that g = ag for some a ∈ (0,∞). Chooseany ε, δ > 0 such that f (ε) = g (δ), and therefore f (ε) + g (0) =f (0) + g (δ) (by (A.2.4)). By (A.2.5), it must also be the case thatf (ε) = g (δ). Since f = f and g = ag, this shows that a = 1 andtherefore g = g, completing the proof.

A.3. Concave additive representations

A preference on (ℓ,∞)N is convex if the set x : x y is convexfor every y ∈ (ℓ,∞)N . Preference convexity is a key component of thistext’s theory as it expresses preference for smoothing across time andstates. As pointed out in Section 3.3, a convex preference admittinga utility representation may admit no concave utility representation.This is also true of additive representations: U (x) ≡ ex1 − e−x2 de-fines an additive utility on (0,∞)2 that represents a convex preference(uniquely up to positive affine transformations). On the other hand,we have the following positive and rather surprising result.3

Theorem A.3.1. Suppose the preference on (ℓ,∞)N is convexand it admits an additive utility representation U : (ℓ,∞)N → R. Thenat least N − 1 of the functions Un defined in (A.2.1) are concave.

Proof. The theorem follows from Lemma A.3.3 below. 3The proof of Theorem A.3.1 is based on Yaari [1977], who credited Koopmans

with a different proof, as well as Gorman for the twice-differentiable case, in bothcases in unpublished papers.

Page 169: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.3. CONCAVE ADDITIVE REPRESENTATIONS 169

In fact, the argument that follows proves a stronger result in thatour standing monotonicity assumption on preferences plays no role andthe interval (ℓ,∞) can be replaced by any other interval. The remain-der of this section shows two lemmas that lead to the proof of Theo-rem A.3.1. These are fairly technical and can be omitted.

The following lemma is stated a little more generally than neededin this section. We will use it again in our discussion of expected utilityand risk aversion.

Lemma A.3.2. Suppose the continuous function f : (ℓ,∞) → R isnot concave and p ∈ (0, 1). Then there exist some x∗ ∈ (ℓ,∞) andsmall enough scalar ε > 0 such that for all δ ∈ (0, ε),(A.3.1) f (x∗) < pf (x∗ + (1− p) δ) + (1− p) f (x∗ − pδ) .

Proof. Since f is not concave, there exist x0, x1 ∈ (ℓ,∞) andα ∈ (0, 1) such that f (xα) < (1− α) f (x0) + αf (x1), where xα ≡(1− α)x0 + αx1 and x0 < x1. Let h : (ℓ,∞) → R be the affinefunction whose graph is the straight line through the points (x0, f (x0))and (x1, f (x1)). Inequality (A.3.1) is satisfied by the given function fif and only if it is satisfied by the function f − h. Replacing f byf − h, we proceed under the assumption that f (x0) = f (x1) = 0 andtherefore f (xα) < 0. Let K = argmin f (x) : x ∈ [x0, x1], a compactset (since f is continuous), and let x∗ = minK. Clearly, x∗ ∈ (x0, x1)and f (x∗) < 0. Choose ε > 0 small enough so that x∗ + (1− p) εand x∗ − pε are both valued in [x0, x1]. The facts that x∗ minimizesf on [x0, x1] and x∗ is the least point in [x0, x1] with this propertyrespectively imply, for all δ ∈ (0, ε), the inequalities

f (x∗) ≤ f (x∗ + (1− p) δ) and f (x∗) < f (x∗ − pδ) ,

which in turn imply (A.3.1). Lemma A.3.3. Suppose that the functions f, g : (ℓ,∞) → R are

continuous and neither of them is concave. Then there exists some(x∗, y∗) such that the set

(x, y) ∈ (ℓ,∞)2 | f (x) + g (y) > f (x∗) + g (y∗)

is not convex (with the convention that the empty set is convex).Proof. Let x∗ and ε be selected in terms of f as in Lemma A.3.2

with p = 1/2, and let y∗ and ε be analogous quantities for g. (Notethat we can choose the same ε in both cases.) Clearly, the validity ofthe lemma’s claim does not change if f and g are modified by adding aconstant. Replacing f with f−f (x∗) and g with g−g (y∗), we proceedunder the assumption that(A.3.2) f (x∗) = g (y∗) = 0.

Letting ϵ ≡ ε/2, condition (A.3.1) can be restated as(A.3.3) f (x∗ + a) + f (x∗ − a) > 0 for all a ∈ (−ϵ, ϵ) .

Page 170: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.4. SCALE/TRANSLATION INVARIANT REPRESENTATIONS 170

This implies that x∗ is not a local maximizer of f . Therefore, the setIf = f (x∗ + a) | a ∈ (−ϵ, ϵ)

is an interval (since f is continuous) that includes a subinterval of theform [0, δ) for some δ > 0. Analogously, we have(A.3.4) g (y∗ + b) + g (y∗ − b) > 0 for all b ∈ (−ϵ, ϵ) ,and Ig includes a subinterval [0, δ) for some δ > 0. It follows that If∩Igis nonempty, which is to say that there exist a and b in (−ϵ, ϵ) such thatf (x∗ + a) = g (y∗ + b). The last equality in combination with (A.3.3)and (A.3.4) implies

f (x∗ − a) + g (y∗ + b) > 0 and f (x∗ + a) + g (y∗ − b) > 0.

We have proved that S ≡ (x, y) : f (x) + g (y) > 0 contains the points(x∗ − a, y∗ + b) and (x∗ + a, y∗ − b), whose midpoint is (x∗, y∗). By as-sumption (A.3.2), (x∗, y∗) /∈ S and therefore S is not convex.

A.4. Scale/translation invariant representationsIn this section we show that a scale invariance property of an ad-

ditive utility implies a unique power or logarithmic functional form,while a translation invariance property of an additive utility impliesa unique exponential form.4 As we will see in the following section,these results can be interpreted as providing ordinal foundations forexpected utility with constant relative risk aversion in the case of scale-invariant preferences, and constant absolute risk aversion in the caseof translation-invariant preferences.

The main argument relies on Theorem A.2.4 on the uniqueness ofadditive representations and the following technical result.5

Lemma A.4.1. Suppose that the function f : R → R satisfies(A.4.1) f (x+ y) = f (x) + f (y) for all x, y ∈ R,and there exists some nonempty open interval on which f is bounded.Then f (x) = f (1)x for all x ∈ R.

Proof. We are to show that the function δ (x) ≡ f (x) − f (1)xvanishes on the entire real line. Note that δ (1) = 0 and δ (x+ y) =δ (x) + δ (y) for all x, y ∈ R. Iterating the last equation, we haveδ (1) =

∑ni=1 δ (1/n), and therefore δ (1/n) = 0, for every positive in-

teger n. Similarly, for positive integers m and n, we have δ (m/n) =∑mi=1 δ (1/n) = 0. Therefore, δ vanishes on the set of positive rational

numbers. Since δ (0 + 0) = δ (0) + δ (0), it also vanishes at zero, and4To my knowledge, the purely ordinal characterizations of Theorems A.4.3

and A.4.2, which do not assume differentiability of the utility function, first ap-pear in Skiadas [2013b]. An earlier version in Skiadas [2009] assumes that theutility is continuously differentiable in some arbitrarily small neighborhood.

5For some historical context and related results, see Aczél [2006].

Page 171: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.4. SCALE/TRANSLATION INVARIANT REPRESENTATIONS 171

since δ (0) = δ (r) + δ (−r), it also vanishes on the set of negative ra-tionals. The function δ inherits from f the property that it is boundedon some nonempty open interval (a, b). Given any x ∈ R, we can finda rational r such that x + r ∈ (a, b), and since δ (x) = δ (x+ r), itfollows that δ is bounded on the entire real line. Finally, for any realx, the set of all δ (nx) = nδ (x) as n ranges over the positives integersremains bounded only if δ (x) = 0.

The preference on RN is said to be translation invariant ifx y and t ∈ R =⇒ x+ t y + t.

Theorem A.4.2. Suppose the preference on RN admits an ad-ditive utility representation (Definition A.2.1). Then is translationinvariant if and only if it admits a utility representation of the form

(A.4.2) U (x) =∑N

n=1wn

1− exp (−αxn)α

, x ∈ RN ,

for unique α ∈ R and w1, . . . , wN ∈ (0, 1) such that∑

nwn = 1. Theconvention for α = 0 is that (1− exp (−αx)) /α is equal to x, which isthe limit as α → 0.

Proof. Suppose U is an additive utility representing the transla-tion invariant preference . Without loss of generality, we assume thatUn (0) = 0 for all n = 1, . . . , N (otherwise, replace Un by Un − Un (0)).For any t ∈ R, the translation invariance of implies that U (x+ t) asa function of x ∈ (0,∞)N defines another additive utility representationof . Since, by Theorem A.2.4, an additive representation is uniqueup to a positive affine transformation and we assumed Un (0) = 0, itfollows that there exists a function a : R → (0,∞) such that(A.4.3) Un (s+ t) = Un (s) a (t) + Un (t) , s, t ∈ R, n = 1, . . . , N.

If a is identically equal to one, then Lemma A.4.1 implies that Un (x) =wnx, x ∈ R, for necessarily positive constants wn, which can be as-sumed to add up to one after positively scaling the utility, resulting inrepresentation (A.4.2) with α = 0. Suppose instead that a (y) 6= 1 forsome y. Equation (A.4.3) applied with (s, t) = (x, y) and again with(s, t) = (y, x) implies that(A.4.4) Un (x) = wn (1− a (x)) , x ∈ R,where wn ≡ Un (y) / (1− a (y)) 6= 0 (since Un is increasing). Equa-tion (A.4.3) with s = 1 implies that a is the difference of two increas-ing functions and therefore bounded on some open interval. Substi-tuting expression (A.4.4) for Un into equation (A.4.3) and simplifying,we find that log a satisfies the assumptions of Lemma A.4.1. There-fore log a (x) = −αx for some scalar α, and equation (A.4.4) becomesthe claimed representation (A.4.2) after positive scaling. The converseclaim is immediate.

Page 172: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.5. EXPECTED UTILITY REPRESENTATIONS 172

The preference on (0,∞)N is scale invariant ifx y and s ∈ (0,∞) =⇒ sx sy.

Theorem A.4.3. Suppose the preference on (0,∞)N admits anadditive utility representation (Definition A.2.1). Then is scale in-variant if and only if it admits a utility representation of the form

(A.4.5) U (x) =∑N

n=1wnx1−γn − 1

1− γ, x ∈ (0,∞)N ,

for unique γ ∈ R and w1, . . . , wN ∈ (0, 1) such that∑

nwn = 1. Theconvention for γ = 1 is that (x1−γ − 1) / (1− γ) is equal to logx, whichis the limit as γ → 1.

Proof. Using the notation exp x = (expx1, . . . , expxN), definethe preference exp on RN by

x exp y ⇐⇒ expx exp y.Note that exp is translation invariant if and only if is scale invariant.The result follows from Theorem A.4.2 applied to exp.

A.5. Expected utility representationsThe rest of this appendix is on expected utility, which is a special

type of additive utility. We henceforth refer to 1, . . . , N as the statespace and regard every x ∈ (ℓ,∞)N as a (state-contingent) payoff.An agent expresses preferences over such payoffs, represented by thepreference (relation) , not knowing what state will be realized. Astrictly positive probability is any vector P = (P1, . . . , Pn) ∈ (0, 1)N

such that∑

n Pn = 1. An expected utility is a pair (P, u) of astrictly positive probability P and an increasing and continuous func-tion u : (ℓ,∞) → R; it is said to represent the preference ifU ≡

∑Nn=1 Pnu is an (additive) utility representation of . Last sec-

tion’s scale or translation invariant additive utilities are examples ofexpected utility. As in those examples, the preference uniquelydetermines an expected utility representation up to a positive affinetransformation.

Theorem A.5.1. If (P, u) and (P , u) are expected utilities repre-senting the same preference, then P = P and there exist constantsa ∈ (0,∞) and b ∈ R such that u = au+ b.

Proof. By Theorem A.2.4, there exist a ∈ (0,∞) and b1, . . . , bn ∈R such that Pnu = aPnu+ bn for all n. Adding up over all n, we obtainu = au + b, where b ≡

∑n bn. Therefore, Pn (au+ b) = aPnu + bn

or (Pn − Pn)au = bn − Pnb. The right-hand side of the last equationis a constant but the left-hand side is an increasing function, unlessPn = Pn.

Page 173: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.5. EXPECTED UTILITY REPRESENTATIONS 173

In the remainder of this section we formulate a property of wecall state independence and we refine the additive representation The-orem A.2.3 (where N > 2) by showing that is both separable andstate independent if and only if it admits an expected utility repre-sentation.6 The reader not interested in the details of this claim cansafely proceed to the following section. It is worth noting, however,that in this context, last section’s results show that state independenceis implied by scale or translation invariance.

We formulate the state-independence condition in terms of the in-difference relation ∼ associated with the preference , defined by

x ∼ y ⇐⇒ (not x y) and (not y x) .

We write 1n for the random variable that is the indicator of n, thatis, 1nn = 1 and 1nk = 0 for all k 6= n. As usual, we identify a scalarw ∈ (ℓ,∞) and the constant payoff (w,w, . . . , w).

Fixing any distinct states m,n ∈ 1, . . . , N, suppose the agentis indifferent between the constant payoff w and the same payoff butwith the contingent value at state m increased by z ∈ (0,∞) and thecontingent value at state n decreased by y ∈ (0, w − ℓ). We expressthis indifference with the notation

(z;m+) =w (y;n−) ⇐⇒ w ∼ w + z1m − y1n.

We express the agent’s indifference between increasing the constantpayoff w by z ∈ (0,∞) at state m or by x ∈ (0,∞) at state n as

(z;m+) =w (x;n+) ⇐⇒ w + z1m ∼ w + x1n.

We write (x;n+) =mw (y;n−) if there exists some z ∈ (0,∞) such that

both (z;m+) =w (y;n−) and (z;m+) =w (x;n+), which provides a sensein which an increase by x at state n is of the same utility magnitudeas a decrease by y at state n. We write (x;n+) 6=m

w (y;n−) if thereexists some z ∈ (0,∞) such that (z;m+) =w (y;n−) but it is not thecase that (z;m+) =w (x;n+). We call state independent if forall w ∈ (ℓ,∞) and distinct states m,n, n′ ∈ 1, . . . , N, there existno7 x ∈ (0,∞) and y ∈ (0, w − ℓ) such that

(x;n′

+

)=mw

(y;n′

−)

but(x;n+) 6=m

w (y;n−).6Preferences over probability distributions that admit an expected-utility repre-

sentation were first characterized by von Neumann and Morgenstern [1944]. Incor-porating a subjective view of probability in the tradition of Ramsey [1926], Savage[1954] developed an axiomatic foundation for expected utility with a nonatomicprobability that is uniquely determined by preferences over mappings from statesto a set of consequences. Anscombe and Aumann [1963] offered an alternativefoundation for expected utility with subjective beliefs that utilizes objective prob-abilities to calibrate subjective beliefs. The approach presented here is a variant ofthat in Wakker [1984, 1988, 1989] and is based on Skiadas [1997, 2009].

7Theorem A.5.2 and its proof remain valid if we define state independence tomean that for all distinct m,n ∈ 2, . . . , N, there exists ϵ > 0 such that for allx, y ∈ (0, ϵ), it is not the case that (x; 1+) =

mw (y; 1−) and (x;n+) 6=m

w (y;n−).

Page 174: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.6. EXPECTED UTILITY AND RISK AVERSION 174

Theorem A.5.2. Suppose that N > 2 and is a preference thatadmits a utility representation. Then admits an expected utilityrepresentation if and only if it is both separable and state independent.

Proof. The “only if” part is straightforward. Conversely, sup-pose is separable and state independent. By Theorem A.2.3, isrepresented by an additive utility U =

∑Nn=1 Un. For every state n,

define the open interval In ≡ Un (x) | x ∈ (ℓ,∞) and the functionfn : I1 7→ In such that Un (x) = fn (U1 (x)) for all x ∈ (ℓ,∞). We willshow that each fn is affine, that is, there exist an ∈ (0,∞) and bn ∈ Rsuch that fn (U1) = anU1+bn. An expected utility representation (P, u)results by letting u = U1 and Pn = an/

∑n an.

Note that every fn is increasing and surjective, and therefore con-tinuous. To show that fn is affine, we show that for all α ∈ I1, thereexists ε > 0 such that for all δ ∈ (0, ε),(A.5.1) fn (α + δ)− fn (α) = fn (α)− fn (α− δ) .

By Lemma A.3.2 with p = 1/2, this condition implies that both fnand −fn are concave, and therefore fn is affine. Fix any α ∈ I1 andlet w ∈ (ℓ,∞) be defined by U1 (w) = α. Consider any distinct statesm,n ∈ 2, . . . , N. By the utility continuity, there exists a sufficientlysmall ε > 0 such that for all δ ∈ (0, ε) there exist positive scalars x, y,z, z′ that solve the equations

δ = U1 (w + x)− U1 (w) = U1 (w)− U1 (w − y) ,

δ = Um (w + z′)− Um (w) ,

Um (w + z)− Um (w) = Un (w)− Un (w − y) .

The first three equalities imply that (x; 1+) =mw (y; 1−), and the last

one that (z;m+) =w (y;n−). The state independence assumption thenrequires that (z;m+) =w (x;n+), and therefore Un (w + x)− Un (w) =Um (w + z)−Um (w). This proves that Un (w + x)−Un (w) = Un (w)−Un (w − y), which in turn implies (A.5.1).

A.6. Expected utility and risk aversionConcavity of an expected utility corresponds to risk aversion and

the more concave the utility is the more risk averse the representedpreference. In this section, we make precise and prove this claim. As iscustomary in the related literature, we use the term “more risk averse”to mean “no less risk averse,” and “risk averse” to mean “risk averseor risk-neutral.”

Consider a preference on (ℓ,∞)N with expected utility repre-sentation (P, u). The corresponding certainty equivalent (CE) vis the unique utility representation of satisfying v (w) = w for allw ∈ (ℓ,∞). Let E denote the expectation operator under P , defined byEz ≡

∑Nn=1 Pnzn, z ∈ RN . The CE v : (ℓ,∞)N → (ℓ,∞) is necessarily

Page 175: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.6. EXPECTED UTILITY AND RISK AVERSION 175

given as v = u−1Eu, meaning v (x) = u−1 (Eu (x)) for all x ∈ (ℓ,∞)N .The agent is indifferent between x and a constant payoff v (x). Forgiven probability P , a lower CE value v (x) represents higher risk aver-sion, since, given the same odds, the agent is willing to settle for alower sure payoff in place of x. An expected utility CE v = u−1Eu istherefore more risk averse than v if v (x) ≤ v (x) for all x ∈ (ℓ,∞)N ,a condition we abbreviate to v ≤ v.

Theorem A.6.1. Suppose (P, u) and (P, u) are expected utilities,with corresponding CEs v ≡ u−1Eu and v ≡ u−1Eu. Then v ≤ v if andonly if u = f u for a concave function f : u (ℓ,∞) → R.

Proof. Define the (increasing) function f : u (ℓ,∞) → R suchthat u = f u. If f is concave, then v ≤ v by Jensen’s inequality:

u (v (x)) = f (Eu (x)) ≥ Ef u (x) = u (v (x)) , x ∈ (ℓ,∞)N .

Conversely, we assume that v ≤ v and f is not concave and showa contradiction. Let p ≡ P1 > 0. By Lemma A.3.2, there existsw ∈ (ℓ,∞) and ε > 0 such that for all δ ∈ (0, ε),(A.6.1) f (u (w)) < pf (u (w) + (1− p) δ) + (1− p) f (u (w)− pδ) .

Pick δ > 0 small enough so that x ∈ (ℓ,∞)N is well-defined byu (x1) = u (w) + (1− p) δ, u (xn) = u (w)− pδ, n ≥ 2,

which implies that w = v (x), in contradiction to inequality (A.6.1),which states that u (w) < Eu (x) and therefore w < v (x) ≤ v (x).

Corollary A.6.2. For every expected utility (P, u), u is concaveif and only if u−1Eu (x) ≤ Ex for all x ∈ (0,∞)N .

The corollary’s condition states that v is more risk averse than therisk-neutral CE E; it provides an absolute sense in which we can call risk averse. Another reasonable definition of absolute risk aversionis the convexity of the preference , which can be thought of as pref-erence for diversification. By Theorem A.3.1, is convex if and onlyif u is concave, and therefore the two notions of absolute risk aver-sion coincide. A third notion of absolute risk aversion, which we adoptas a definition, states that whenever rejects adding a risky payoffto a sure payoff, it also rejects adding every scaled-up version of thesame risky payoff. We therefore define to be risk averse if for allw ∈ (ℓ,∞) and x ∈ (ℓ− w,∞)N ,(A.6.2) w w + x and θ ∈ (1,∞) =⇒ w w + θx.

This condition gives a sense of aversion to a specific risk x. Implicit inexpected utility is a sense in which risk aversion is risk-source indepen-dent,8 which leads to the equivalence of risk aversion to the concavity

8For a specification of scale-invariant preferences with source-dependent riskaversion, see Skiadas [2013b, 2015].

Page 176: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.6. EXPECTED UTILITY AND RISK AVERSION 176

of u, and hence the equivalence of all three notions of absolute riskaversion just introduced: risk aversion, preferences for diversification,and more risk averse than risk neutral.

Theorem A.6.3. A preference with expected utility representation(P, u) is risk averse if and only if u is concave.

Proof. Suppose u is concave, w ∈ (ℓ,∞) and θ ∈ (1,∞). Sincew + x is a convex combination of w and w + θx, if Eu (w + θx) ≥u (w) then Eu (w + x) ≥ u (w), which is the contrapositive of condi-tion (A.6.2) and therefore proves that is risk averse.

Conversely, suppose is risk averse. We show that u is concaveassuming N = 2, which entails no loss in generality (why?). We do soby confirming that u−1Eu ≤ E and applying Corollary A.6.2. Supposeinstead that for some x ∈ (ℓ,∞)2, u−1Eu (x) > Ex. Define µ ≡ Ex andx ≡ x− µ, and therefore(A.6.3) Eu (µ+ x) > u (µ) and Ex = 0.

We claim that this condition implies that u is convex on (µ− ϵ, µ+ ϵ)for some ϵ > 0. Suppose not. Then, by Lemma A.3.2 applied to −u,for all n = 1, 2, . . . , there exist µn ∈ (µ− 1/n, µ+ 1/n) and δn ∈ (0, 1)such that u (µn) > Eu (µn + δnx), and u (µn) > Eu (µn + x) by riskaversion. Since u is continuous and limn→∞ µn = µ, u (µ) ≥ Eu (µ+ x)in contradiction to (A.6.3). We can therefore select ϵ > 0 such thatu is convex on (µ− ϵ, µ+ ϵ). Since u is also (strictly) increasing, itfollows9 that the derivative u′ exists and is positive at all but at mostcountably many points of (µ− ϵ, µ+ ϵ). Since u is continuous, in con-dition (A.6.3), we can slightly perturb µ if necessary to some value wsuch that u′ (w) exists and is positive, while it is still the case thatEu (w + x) > u (µ). We can then slightly decrease x to x so thatEu (w + x) ≥ u (w), Ex < 0 and xn 6= 0 for every state n. Risk aver-sion implies that for all δ ∈ (0, 1), Eu (w + δx) ≥ u (w) and therefore

E[u (w + δx)− u (w)

δxx

]≥ 0.

Letting δ ↓ 0 gives u′ (w)Ex ≥ 0, which contradicts the fact thatu′ (w) > 0 and Ex < 0 by construction.

9Suppose u is strictly increasing and convex on an interval (a, b) and w ∈ (a, b).For all small enough δ > 0, define the ordered slopes

∆+ (δ) ≡ u (w + δ)− u (w)

δ≥ u (w)− u (w − δ)

δ≡ ∆− (δ) .

As δ ↓ 0, ∆+ (δ) monotonically decreases to a limit defining the right derivativeu′+ (w), and ∆− (δ) monotonically increases to a limit defining the left derivative

u′− (w). Clearly, u′

+ (w) > 0 and u′+ (w) ≥ u′

− (w). The intervals(u′− (w) , u′

+ (w))

are non-overlapping as w ranges over (a, b) and therefor at most countably many ofthem are nonempty. This proves that for all except at most countably many valuesof w ∈ (a, b) , u′ (w) = u′

− (w) = u′+ (w) > 0.

Page 177: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

A.6. EXPECTED UTILITY AND RISK AVERSION 177

Consider an expected utility (P, u), where u belongs to the set C2 oftwice continuously differentiable real-valued functions on the real line.The scale-or-translation invariant cases of Section 3.8 are examples.More generally, every expected utility is “close” to a smooth version inthe sense of the following remark (which is not used anywhere else).

Remark A.6.4. Suppose that a payoff x ∈ (ℓ,∞)N is perceivedby the agent as x + ε, where ε is an arbitrarily small noise term thatis stochastically independent of x and continuously distributed over acompact interval with density fε ∈ C2. The expected utility of x canbe thought of as the reduced form of the expected utility of x+ ε:

Eu (x) ≡ Euε (x+ ε) ≡N∑n=1

Pn

∫Ruε (xn + y) fε (y) dy.

We can therefore define the function u : (ℓ,∞) → R as

u (z) ≡∫Ruε (z + y) fε (y) dy =

∫Ruε (x) fε (x− z) dx.

For continuous but otherwise arbitrary uε, the fact that fε ∈ C2 has acompact support implies that u ∈ C2. ♦

The Arrow-Pratt coefficient of absolute risk aversion asso-ciated with u ∈ C2 is the function

(A.6.4) Au ≡ −u′′

u′.

Section 3.8 presents our central motivation for introducing this func-tion, which is the approximation of the expected utility of small Brown-ian risks. Just as the CE u−1Eu, Au is invariant to positive affine trans-formations of u and determines u up to a positive affine transformation.(To see the latter claim, write (A.6.4) as Au (x) dx = −d logu′ (x) andintegrate twice.) Given P , the function Au is therefore uniquely as-sociated with the corresponding CE u−1Eu and hence the preferencerelation represented by (P, u). Note that the TI formulation of Theo-rem A.6.4 corresponds to the constant absolute risk aversion assump-tion Au (x) ≡ α, x ∈ R, and the SI formulation of Theorem A.6.4 corre-sponds to the constant relative risk aversion assumption xAu (x) ≡ γ,x ∈ (0,∞). The function x 7→ xAu (x) is known as the coefficient ofrelative risk aversion.

By Theorem A.6.4, for u ∈ C2, the expected utility (P, u) is riskaverse if and only if u′′ ≤ 0 if and only if Au ≥ 0. The inequalities areto be interpreted as holding on the entire domain, for example, Au ≥ 0means Au (x) ≥ 0 for all x ∈ (ℓ,∞). By Theorem A.6.4, for u, u ∈ C2,the expected utility (P, u) is more risk averse than the expected utility(P, u) (in the sense that u−1Eu ≤ u−1Eu) if and only if Au ≥ Au. Toconfirm this claim, note that if u = f u then Au = Au +

(Af u

)u′

and f is concave if and only if Af ≥ 0.

Page 178: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

APPENDIX B

Elements of Convex Analysis

This appendix reviews some basic linear algebra and convex analysisconcepts. The emphasis is on the finite-dimensional case, but from aperspective that is amenable to infinite-dimensional extensions.1

B.1. Inner product spaceThe analysis of this appendix is all carried out relative to an inner

product space, which is a vector space together with an inner product.In this section we define these terms and discuss some of their basicproperties.

All vector spaces in this text are defined relative to the field R ofthe real numbers, also referred to as scalars. A vector or linear spaceis a triple of a set X, whose elements are called vectors or points, ascaling operation assigning a vector αx to each (α, x) ∈ R × X, andan addition operation assigning a vector x+y to each (x, y) ∈ X×X,provided the following restrictions hold for all x, y, z ∈ X and α, β ∈ R:(α + β)x = αx + βx and α (x+ y) = αx + αy; (αβ)x = α (βx) and1x = x; x+y = y+x and (x+ y)+z = x+(y + z); there is a (necessarilyunique) vector 0, called the zero vector, such that 0 + x = x for allx, and for each vector x there is a (necessarily unique) vector −x suchthat x + (−x) = 0. The difference between two vectors is defined asx− y ≡ x+ (−y). (As in the main text, ≡ means equal by definition.)

It is customary to refer to a vector space X, while it is implied thatscaling and addition operations are also specified along with X. Twovector spaces Xand X are isomorphic if there is a bijection (one-to-one and onto mapping) ϕ : X → X that respects the addition andscaling operations: ϕ (x+ y) = ϕ (x) + ϕ (y) and ϕ (sx) = sϕ (x) for allx, y ∈ X and s ∈ R. Note that addition and scaling on the left-handside of these equations refer to operations in X and those on the right-hand side refer to operations in X. One can think of X as being thesame vector space as X, but with each vector x given a unique newlabel ϕ (x). We refer to ϕ as a vector space isomorphism.

1Treatments of finite-dimensional convex optimization theory include Rockafel-lar [1970], Bertsekas [2003] and Boyd and Vandenberghe [2004], the latter provid-ing an introduction to modern computational optimization algorithms. Infinite-dimensional extensions can be found in Bonnans and Shapiro [2000], Dunford andSchwartz [1988], Ekeland and Témam [1999] and Luenberger [1969].

178

Page 179: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.1. INNER PRODUCT SPACE 179

A canonical example of a vector space is d-dimensional Euclideanspace, for positive integer d, defined as the Cartesian product Rd withcoordinate-wise scaling and addition operations: (αx+ y)i = αxi + yifor every i ∈ 1, . . . , d, α ∈ R and x, y ∈ Rd. The set L of all ran-dom variables on some state space Ω is another example of a vectorspace, where a random variable is defined as a real-valued functionon Ω and scaling and addition are defined state-wise: (αx+ y) (ω) =αx (ω) + y (ω) for all ω ∈ Ω, α ∈ R and x, y ∈ L. If Ω = ω1, . . . , ωd,then L is isomorphic to Rd.

We henceforth fix a reference vector space X. Unless otherwisespecified, a vector is an element of X. A linear subspace of X isany subset Y of X that is closed with respect to scaling and addition:αx+ y ∈ Y for all α ∈ R and x, y ∈ Y . Equivalently, a linear subspaceof X is a subset of X that is a vector space in its own right, with thesuitably restricted addition and scaling operations inherited from X.

A linear combination of the vectors x1, . . . , xn is any vector of theform α1x1 + · · ·+ αnxn for scalar αi. (The term linear combinationwill always refer to the linear combination of finitely many vectors.) Ifall the αi are zero, we call the linear combination trivial. The linearspan of a set of vectors S, denoted span (S), is the intersection ofall linear subspaces that include S, which is the same as the set ofall linear combinations of elements of S. We say that span (S) is thelinear subspace generated or spanned by S or the elements of S. IfS = x1, . . . , xn, we also write span (x1, . . . , xn) for span (S).

A set of vectors S is linearly independent if every linear com-bination of elements of S that is also in S is trivial, a condition thatis equivalent to the unique representation of each element of span (S)as a linear combination of elements of S, and is also equivalent to thenonexistence of an element x of S that can be expressed as a linearcombination of elements in S \ x.

A basis of X is a linearly independent set of vectors that spansX. A vector space is finite-dimensional if it has a finite basis andinfinite-dimensional otherwise. Every basis of a finite-dimensionalvector space has the same number of elements, called the space’s di-mension. The vector space 0 has, by definition, dimension zero.

If X has finite dimension d, we represent a basis B1, . . . , Bd ofX as a column matrix B ≡ (B1, . . . , Bd)

′ and we write σx for the rowvector in Rd that represents the point x in X relative to the givenbasis B :

x = σxB ≡d∑i=1

σxi Bi.

The mapping x 7→ σx defines a vector space isomorphism from X to Rd,which shows that every finite-dimensional vector space is isomorphicto a Euclidean space.

Page 180: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.1. INNER PRODUCT SPACE 180

A functional is a function of the form f : X → R. The functionalf is linear if f (αx+ βy) = αf (x) + βf (y) for all x, y ∈ X andα, β ∈ R. Given a finite basis B = (B1, . . . , Bd)

′ and a linear functionalf , we use the notation f (B) ≡ (f (B1) , . . . , f (Bd))

′. The single vectorf (B) in Rd determines the entire function f, since x = σxB impliesf (x) = σxf (B) = σx · f (B), where the dot of the last expressiondenotes the Euclidean inner product:

(B.1.1) x · y =d∑i=1

xiyi, x, y ∈ Rd.

The mapping (x, y) 7→ x · y is bilinear (linear in each of x and ykeeping the other fixed), symmetric (x · y = y ·x), and positive definite(x · x ≥ 0, with equality holding if and only if x = 0). Abstractingaway these properties, we define a (real) inner product 〈· | ·〉 on thevector space X as a mapping that assigns to each (x, y) ∈ X × X areal number 〈x | y〉 such that for all x, y, z ∈ X and α ∈ R,

• 〈αx+ y | z〉 = α 〈x | z〉+ 〈y | z〉 .• 〈x | y〉 = 〈y | x〉.• 〈x | x〉 ≥ 0, with equality holding if and only if x = 0.

An inner product space is a vector space together with an innerproduct on this space. Two inner product spaces X and X (each withan implied inner product) are said to be isomorphic if there is avector space isomorphism ϕ : X → X that preserves inner products:〈x | y〉 = 〈ϕ (x) | ϕ (y)〉, where the left-hand side refers to the X innerproduct and the right-hand side refers to the X inner product.

Example B.1.1. Every symmetric positive definite matrix Q ∈Rd×d defines an inner product on Rd by letting 〈x | y〉 = xQy′, wherex, y ∈ Rd are treated as row vectors. The Euclidean inner product isobtained if Q is the identity matrix. ♦

Example B.1.2. Suppose X is any finite-dimensional vector spaceand B is any basis of X. Then an inner product is defined by letting〈x | y〉 = σx ·σy, where x = σxB and y = σyB. Later, we will establishthe more interesting fact that given any inner product on X, we canselect the basis B so that this example’s representation is valid. ♦

Example B.1.3. Let E denote the expectation operator relative toa probability on a finite state space Ω, where every state is assigned anonzero probability. On the vector space L of all random variables onΩ, the inner product 〈x | y〉 = E [xy] renders L an inner product spacethat is isomorphic to a Euclidean inner product space (why?). Notethat for zero-mean x, y ∈ L, 〈x | y〉 = cov [x, y], but covariance is notan inner product on all of L (why?). ♦

Page 181: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.1. INNER PRODUCT SPACE 181

The norm induced by the inner product 〈· | ·〉 is the function‖·‖ : X → R defined by(B.1.2) ‖x‖ =

√〈x | x〉, x ∈ X.

Here ‖x‖ denotes the norm of x, which is the value the norm ‖·‖assigns to x. Note that an inner-product isomorphism also preservesnorms, a property known as isometry. Conversely, every isometryalso preserves inner products, since(B.1.3) ‖x+ y‖2 = ‖x‖2 + 2 〈x | y〉+ ‖y‖2.Vectors x and y are said to be orthogonal if 〈x | y〉 = 0. Inspection ofidentity (B.1.3) shows that the vectors x and y are orthogonal if andonly if they satisfy the Pythagorean identity

‖x+ y‖2 = ‖x‖2 + ‖y‖2 .We have so far treated inner products purely algebraically. For

geometric intuition, we visualize ‖x‖ as the length of the vector x,and the orthogonality condition 〈x | y〉 = 0 as x and y forming a rightangle. For any vector x, we write x ≡ ‖x‖−1x for the unique positivelyscaled version of x whose norm is one. The inner product 〈x | y〉 isthe value of the scalar α that minimizes ‖y − αx‖, a condition thatdefines y ≡ 〈x | y〉 x as the (orthogonal) projection of the point yonto span (x) = span (x). One can visualize y as the point of theline span (x) that is closest to the point y. To justify this claim, useidentity (B.1.3) and the fact that ‖x‖ = 1 to compute

‖y − αx‖2 = ‖y‖2 − ‖y‖2 + (α− 〈x | y〉)2 .The quadratic is clearly minimized for α = 〈x | y〉 and the correspond-ing minimum is equal to ‖y − y‖2 = ‖y‖2 − ‖y‖2. The latter is aPythagorean identity that is equivalent to the orthogonality of y − yto y and therefore to x.

The very useful Cauchy-Schwarz inequality results from the simpleobservation that the nonzero vectors x, y and 〈x | y〉 x form an orthog-onal triangle whose hypotenuse y has unit length, and therefore eachof its other sides has length less than one. In particular, |〈x | y〉| ≤ 1.The following proposition elaborates on this conclusion. Two vectorsx and y are said to be collinear if either x = αy or y = αx for someα ∈ R.

Proposition B.1.4. All vectors x, y satisfy the Cauchy-Schwarzinequality

|〈x | y〉| ≤ ‖x‖‖y‖,and the triangle inequality(B.1.4) ‖x+ y‖ ≤ ‖x‖+ ‖y‖ .The Cauchy-Schwarz inequality holds as an equality if and only if xand y are collinear.

Page 182: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.1. INNER PRODUCT SPACE 182

Proof. Omitting trivial cases, we assume x and y are nonzero.Direct calculation using the bilinearity of an inner product and the factthat 〈y | y〉 = 1 shows that x − 〈x | y〉 y is orthogonal to y. Using thepositive definiteness of inner products and the Pythagorean identity,we conclude that 0 ≤ ‖x− 〈x | y〉 y‖2 = 1 − 〈x | y〉2, with equalityholding if and only if x = 〈x | y〉 y if and only if x and y are collinear.We showed 〈x | y〉2 ≤ 1, which is equivalent to the Cauchy-Schwarzinequality. Identity (B.1.3) and the Cauchy-Schwarz inequality implythe triangle inequality.

Example B.1.5. In Example B.1.3, consider the covariance innerproduct on the vector space x ∈ L | Ex = 0. The norm of x is itsstandard deviation and the Cauchy-Schwarz inequality corresponds tothe fact that the correlation coefficient of two random variables lies inthe interval [−1, 1]. ♦

We henceforth take as given the inner product 〈· | ·〉, rendering Xan inner product space. Every x ∈ X defines a corresponding linearfunctional f (y) = 〈x | y〉, y ∈ X, in which case, we say that x is the(necessarily unique) Riesz representation of f .

In the finite-dimensional case, the inner product and Riesz repre-sentations can be expressed concretely in terms of a linear basis. Tosimplify notation, we extend the inner products to matrices of vectors,using the usual matrix addition and multiplication rules. In particular,given a column matrix of vectors B = (B1, . . . , Bd)

′ and any vector x,we write 〈x | B′〉 ≡ (〈x | B1〉 , . . . , 〈x | Bd〉), and 〈B | B′〉 for the Grammatrix of B, defined as the d×d matrix whose (i, j) entry is 〈Bi | Bj〉.

Proposition B.1.6. Suppose B = (B1, . . . , Bd)′ is a basis of X

and for every x ∈ X, the row vector σx is defined by x = σxB. Thenthe Gram matrix 〈B | B′〉 is symmetric and positive definite and theinner product in X can be expressed as(B.1.5) 〈x | y〉 = σx 〈B | B′〉σy′, x, y ∈ X.

The Riesz representation of every linear functional f exists and isgiven by f (B)′ 〈B | B′〉−1B, and therefore σx = 〈x | B′〉 〈B | B′〉−1 forall x ∈ X.

Proof. Bilinearity of the inner product implies the inner productrepresentation (B.1.5). By the symmetry of the inner product, 〈B | B′〉is a symmetric matrix. By the positive definiteness of the inner prod-uct, for every row vector α ∈ Rd, α 〈B | B′〉α′ = 〈αB | αB〉 ≥ 0,with equality holding if and only if αB = 0 if and only if α = 0(since B is a basis). This proves that 〈B | B′〉 is a positive defi-nite matrix (and therefore invertible). If f is a linear function, thenf (y) = f (B)′ σy′ and 〈x | y〉 = σx 〈B | B′〉σy′. Therefore, x is theRiesz representation of f if and only if σx = f (B)′ 〈B | B′〉−1. In thiscase, σx = 〈x | B′〉 〈B | B′〉−1 , since f (Bi) = 〈x | Bi〉 for all i.

Page 183: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.2. BASIC TOPOLOGICAL CONCEPTS 183

A corollary is that every d-dimensional inner product space is iso-morphic to the inner product space of Example B.1.1. The isomor-phism is the mapping x 7→ σx, and the matrix Q of Example B.1.1 isthe Gram matrix 〈B | B′〉. The basis B is said to be orthonormalif the corresponding Gram matrix is an identity matrix, which meansthe basis elements all have unit norm and are pairwise orthogonal. Thelast paragraph of Section B.4 shows that every finite-dimensional innerproduct space admits an orthonormal basis and is therefore isomorphicto a Euclidean inner-product space.

B.2. Basic topological conceptsThroughout this section X is an inner product space with induced

norm ‖ · ‖, which we use to define convergence and related basic topo-logical concepts. The function ‖·‖ : X → R is positively homogeneous:‖αx‖ = |α| ‖x‖ for all α ∈ R and x ∈ X; positive definite: ‖x‖ ≥ 0,with equality holding if and only if x = 0; and satisfies the triangleinequality (B.1.4), which can be equivalently stated as(B.2.1) |‖x‖ − ‖y‖| ≤ ‖x− y‖ for all x, y ∈ X.

These three properties more broadly define what is known as a normon a vector space, which may or may not be induced by an innerproduct. For every r ∈ R+ and x ∈ X, the open ball with center xand radius r is the set

B (x; r) ≡ y ∈ X | ‖y − x‖ < r .A sequence xn = (x1, x2, . . . ) of points in X converges to the

limit x ∈ X if for every ε > 0, there exists an integer Nε such thatn > Nε implies xn ∈ B (x; ε). In this case, the sequence xn is said tobe convergent. A function f : D → R, where D ⊆ X, is continuousat x ∈ D if for any sequence xn in D converging to x, the sequencef (xn) converges to f (x). Note that f : D → R is continuous atx ∈ D if and only if for all ε > 0, there exists some δ > 0 (depending onε) such that y ∈ D ∩ B (x; δ) implies |f (y)− f (x)| < ε. The functionf is continuous if it is continuous at every point of its domain D.

Inequality (B.2.1) shows that the underlying norm is a continuousfunction. The inner product 〈· | ·〉 is also continuous, in the followingsense.

Proposition B.2.1. Suppose xn and yn are sequences in Xconverging to x and y, respectively. Then 〈xn | yn〉 converges to〈x | y〉.

Proof. By the triangle and Cauchy-Schwarz inequalities, we have|〈xn | yn〉 − 〈x | y〉| = |〈x− xn | y − yn〉+ 〈xn − x | y〉+ 〈x | yn − y〉|

≤ ‖x− xn‖ ‖y − yn‖+ ‖xn − x‖ ‖y‖+ ‖x‖ ‖yn − y‖ .Letting n go to infinity completes the proof.

Page 184: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.2. BASIC TOPOLOGICAL CONCEPTS 184

In a finite-dimensional space, a linear functional f on X is neces-sarily continuous, since it has a Riesz representation z and therefore|f (x)− f (y)| ≤ ‖z‖ ‖x− y‖ by the Cauchy-Schwarz inequality. Incontrast, the following example shows that a linear functional on aninfinite-dimensional space may not be continuous.

Example B.2.2. Let X be the inner product space of every se-quence xn in R such that xn = 0 for all but finitely many val-ues of n, with the inner product 〈x | y〉 =

∑n xnyn. The functional

f (x) =∑

n nxn is linear but not continuous (why?). ♦

A sequence xn of points in X is Cauchy if for every ε > 0, thereexists an integer N such that m > n > N implies xm ∈ B (xn; ε). Asubset S of X is complete if every Cauchy sequence in S convergesto a limit that is an element of S. A subset of a complete set iscomplete if and only if it is closed. The triangle inequality implies thatevery convergent sequence is Cauchy. The following example showsthat the converse need not be true for an arbitrary inner product space.Intuitively, a Cauchy sequence should converge to something, but ifthat something is not within the space X, then the sequence is notconvergent. As we will see shortly, difficulties of the sort do not arisein finite-dimensional spaces.

Example B.2.3. Let X = C [0, 1] be the (infinite-dimensional)vector space of all continuous functions on the unit interval with theinner product 〈x | y〉 =

∫ 1

0x (t) y (t) dt. The sequence xn defined by

xn (t) = 1/ (1 + nt), t ∈ [0, 1], is Cauchy but does not converge in X. ♦

If X is finite-dimensional, the convergence or Cauchy property ofa sequence is equivalent to the respective property of the sequence’scoordinates relative to any given basis.

Proposition B.2.4. Suppose B = (B1, . . . , Bd)′ is a basis of X

and σn =(σ1n, . . . , σ

dn

)∈ Rd for n = 1, 2, . . . The sequence σnB is

Cauchy (resp. converges to σB) if and only if the scalar sequence σinis Cauchy (resp. converges to σi) for every coordinate i ∈ 1, . . . , d.

Proof. Suppose σin is Cauchy for each i. The triangle inequalityimplies

‖σnB − σmB‖ ≤∑

i

∣∣σin − σim∣∣ ‖Bi‖ ,

from which it follows that σnB is Cauchy. Conversely, supposeσnB is Cauchy. Noting the identity σn = 〈σnB | y〉, where y =B′ 〈B | B′〉−1, apply the Cauchy-Schwarz inequality to obtain∣∣σin − σim

∣∣ = |〈σnB − σmB | yi〉| ≤ ‖σnB − σmB‖ ‖yi‖ .

Therefore σin is Cauchy. The claims in parentheses follow by thesame argument, with σ in place of σm.

Page 185: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.2. BASIC TOPOLOGICAL CONCEPTS 185

A Hilbert space is any inner product space that is complete (rela-tive to the norm induced by the inner product). One of the fundamentalproperties of the real line is that it is complete. Given this fact, the lastproposition implies that every finite-dimensional inner product space isa Hilbert space. Another consequence of Proposition B.2.4 is that fora finite-dimensional vector space, the convergence or Cauchy propertyof a sequence, and the continuity of a function are all properties thatdo not depend on the choice of an underlying inner product (or norm,as we will see shortly).

A subset S of X is (sequentially) compact if every sequence in Shas a subsequence that converges to a vector in S. This definition imme-diately implies that a compact set is complete and therefore closed. Acompact set S is also bounded, meaning that sup ‖s‖ : s ∈ S <∞.If S were unbounded, there would exist a sequence sn in S such that‖sn‖ > n for all n, which precludes the existence of a convergent sub-sequence. On the real line, a closed bounded interval is compact sinceit can be sequentially subdivided in halves that contain infinitely manypoints of a given sequence, leading to the selection of subsequence thatis Cauchy and therefore convergent (by the completeness of the realline). By virtue of Proposition B.2.4, this type of argument can beextended to Rd by applying it to each coordinate sequentially. Sinceevery finite-dimensional space is isomorphic to Rd, we have

Proposition B.2.5. In a finite-dimensional space, a set is compactif and only if it is closed and bounded.

The claim is not true if the assumption of finite dimensionality isremoved. For example, consider l2, the Hilbert space of all square-summable sequences of real numbers with the inner product 〈x | y〉 =∑∞

n=1 x (n) y (n). The sequence en in l2, where en (n) = 1 anden (k) = 0 for all k 6= n, satisfies ‖en‖ = 1 for all n, but en hasno convergent subsequence. In fact, it can be shown that the closureof the unit ball is not compact in every infinite-dimensional normedspace.

Our interest in compactness is mainly due to the following result,showing that a continuous function achieves its supremum and infimumover a compact set.

Proposition B.2.6. Suppose that S is a compact subset of X andthe function f : S → R is continuous. Then there exist s∗, s∗ ∈ S suchthat f (s∗) ≥ f (s) ≥ f (s∗) for all s ∈ S.

Proof. Let sn be a sequence such that limn f (sn) = sup f. Bythe compactness of S, there exists a subsequence of sn convergingto some s∗ ∈ S. Since f is continuous, f (s∗) = sup f and thereforef (s∗) ≥ f (s) for all s ∈ S. The same argument applied to −f com-pletes the proof.

Page 186: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.3. CONVEXITY 186

Proposition B.2.6 implies that in a finite-dimensional space all normsare topologically equivalent, in the following sense. Suppose X is finite-dimensional and consider any other norm ‖·‖∗ on X (not necessarilyinduced by any inner product). Since ‖·‖∗ is convex, it is continuouson the open ball B (0; 2) and therefore achieves a minimum and a max-imum over the closure of B (0; 1). By the homogeneity of ‖·‖∗, thereexist constants k and K such that k ‖x‖ ≤ ‖x‖∗ ≤ K ‖x‖ for all x ∈ X.Therefore, for a finite-dimensional vector space, all norms define thesame notion of convergence and hence the same compact sets. The situ-ation is quite different with infinite-dimensional spaces, where differentnorms can define dramatically different notions of convergence.

We conclude this section with an overview of associated topologicalconcepts and notation. Let S be any subset of X. The closure of S isthe set of

S = x | B (x; ε) ∩ S 6= ∅ for all ε > 0 ,which is the same as the set of every vector x for which there existssome sequence in S that converges to x. The set S is closed if S = S,or equivalently, if every convergent sequence in S converges to a pointin S. The interior of S is the set

S0 = x | B (x; ε) ⊂ S for some ε > 0 .The boundary of S is the set S \ S0. The set S is open if S = S0,or equivalently, if its complement X \ S is closed. The set of all opensubsets of X is known as the (norm) topology of X. The followingproperties of open and closed sets can be verified as an exercise. Theempty set and X are both open and closed. The union of finitely manyclosed sets is closed, and the intersection of finitely many open sets isopen. Arbitrary intersections of closed sets are closed, and arbitraryunions of open sets are open. The closure of a set is the intersectionof all its closed supersets, and the interior of a set is the union of allits open subsets. Continuity of a function or compactness of a set aretopological properties, in the sense that they depend on the underlyingnorm only through the topology they define. This fact will not beused or shown here, but can be found in textbooks that include anintroduction to general topology including normed spaces (for example,Dudley [2002]).

B.3. ConvexityThe purpose of this section is to introduce the central notions of

convexity and concavity that underly the analysis of the rest of thisappendix, as well as some basic topological implications of convexityassumptions. As always, the reference inner product space X is takenas given.

A set C ⊆ X is convex if the line segment connecting any twopoints in C lies entirely within C: for all x, y ∈ C, α ∈ (0, 1) implies

Page 187: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.3. CONVEXITY 187

αx + (1− α) y ∈ C. Two important special types of convex set thatarise in this text are convex cones and linear manifolds. A subset Cof X is a cone if for all x ∈ C, α ∈ R+ implies αx ∈ C. A cone C isconvex if and only if x, y ∈ C implies x+y ∈ C (and a convex cone C isa linear subspace if and only if x ∈ C implies −x ∈ C). A set M ⊆ X isa linear manifold if the line through any two points in M lies entirelywithin M : for all x, y ∈ M and α ∈ R implies αx + (1− α) y ∈ M .Clearly, every linear manifold is convex. Given any subset S of Xand vector x, we write x + S = S + x = x+ s | s ∈ S to denotethe translation of S by x. Note that if M is a linear manifold andx is any point of M , then M − x is a linear subspace. Conversely,the translation of any linear subspace is a linear manifold. ThereforeM ⊆ X is a linear manifold if and only if it takes the form M = x+Lfor some vector x and linear subspace L. In this case, the dimensionof M is the dimension of L. If L is finite-dimensional, it is spannedby some basis B= (B1, . . . , Bd)

′ and M = x ∈ X | 〈B | x〉 = b forsome b ∈ Rd. Conversely, every set of this form is a linear manifold.A convex set may or may not be closed, but the situation is less clearwith linear manifolds. By Lemma B.2.4, a finite-dimensional linearsubspace is necessarily complete and therefore closed if X is complete.In contrast, an infinite-dimensional linear subspace may not be closedeven if the space X is complete.2

A function f : D → R, where D ⊆ X, is concave if D is aconvex set and for all x, y ∈ D, α ∈ (0, 1) implies f (αx+ (1− α) y) ≥αf (x)+ (1− α) f (y). This property means that on a two-dimensionalplot of f along the line segment connecting x and y, the graph of f liesabove the line segment, and this is true for all x, y ∈ D. The functionf is convex if −f is concave. Note that a functional f : X → Ris linear if and only if it is both concave and convex. We saw in thelast section that a linear functional is necessarily continuous if X isfinite-dimensional. This fact generalizes as follows.

Theorem B.3.1. Suppose that X is finite dimensional, and x ∈ Xis in the interior of the domain of a concave function f . Then f iscontinuous at x.

Proof. Since every finite-dimensional inner product space is iso-morphic to a Euclidean space, we assume that X = Rd with the Eu-clidean inner product. Choose α, β ∈ Rd such that αi < xi < βi for alli and [α, β] ≡

∏di=1 [αi, βi] is included in the domain of f . A point in

[α, β] is extreme if all its coordinates are of the form αi or βi. (Forexample, for d = 2, [α, β] is a rectangle and its extreme points are the

2Let X is be the space of every sequence x = xn that is square-summable(that is,

∑n x

2n < ∞) with the inner product 〈x, y〉 =

∑n xnyn. This space can

be shown to be complete, but the linear subspace of every sequence x such thatxn = 0 for all but finitely many values of n is not closed.

Page 188: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.3. CONVEXITY 188

corners.) Concavity of f implies that for every y ∈ [α, β], there existsan extreme point y ∈ [α, β] such that f (y) ≥ f (y). (To see that, thinkof a plot of f (y) on [αi, βi] as a function of its ith coordinate and notethat the minimum is achieved at one of the endpoints.) Since [α, β]has finitely many extreme points, f is minimized by one of them. Letb denote the minimum value, and let r > 0 be small enough so thatB (x; r) ⊆ [α, β]. Fixing any y ∈ B (x; r), let u ≡ (y − x) / ‖y − x‖ andϕ (α) ≡ f (x+ αu) for all α ∈ [−r, r]. We then have the inequalities

K ≡ ϕ (0)− b

r≥ ϕ (0)− ϕ (−r)

r

≥ ϕ (‖y − x‖)− ϕ (0)

‖y − x‖≥ ϕ (r)− ϕ (0)

r≥ b− ϕ (0)

r= −K.

The fact that ϕ is bounded below by b justifies the first and last in-equalities. The three middle expressions represent slopes that decreasefrom left to right since ϕ : [−r, r] → R is concave. This proves that|f (y)− f (x)| ≤ K ‖y − x‖ for all y ∈ B (x; r).

Convexity assumptions in combination with completeness can playthe role that compactness did in the last section, even in the infinite-dimensional case, where compact sets are harder to come by. Givena sequence xn, we write conv (xn, xn+1, . . . ) for the set of all finitelinear combinations of the form

∑i αixni

, where αi > 0,∑

i αi = 1 andni ≥ n for all i. Equivalently, conv (xn, xn+1, . . . ) is the intersection ofall convex sets that contain the points xn, xn+1, . . . .

Lemma B.3.2. Suppose xn is a bounded sequence that lies in acomplete convex subset of X. Then there exists a convergent sequenceyn such that yn ∈ conv (xn, xn+1, . . . ) for all n.

Proof. Since the sequence xn is bounded,

Ln ≡ inf ‖y‖ | y ∈ conv (xn, xn+1, . . . )

is finite, increasing in n, and converges to the finite limit M ≡ supn Lnas n→ ∞. Choose yn ∈ conv (xn, xn+1, . . . ) such that ‖yn‖ < Ln+1/nfor all n. We will show that the sequence yn is Cauchy and henceconvergent, since xn is assumed to lie in a complete convex set. Givenany ϵ > 0, choose N such that 1/N < ϵ and Ln > M − ϵ for all n > N .For all m > n > N , we have (1/2) (ym + yn) ∈ conv (xn, xn+1, . . . ) andtherefore ‖(1/2) (ym + yn)‖ > M − ϵ and

‖ym − yn‖2 = 2‖ym‖2 + 2‖yn‖2 − ‖ym + yn‖2

< 2 (Lm + 1/m)2 + 2 (Ln + 1/n)2 − 4 (M − ϵ)2

< 4 (M + ϵ)2 − 4 (M − ϵ)2 = 16Mϵ.

The Cauchy property of yn follows.

Page 189: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.4. PROJECTIONS ON CONVEX SETS 189

The following result is a corollary of Proposition B.2.6 in the finite-dimensional case, but is remarkable in that it applies equally well tothe infinite-dimensional case and relies on the last lemma rather thancompactness.

Proposition B.3.3. Suppose the nonempty set S is convex, com-plete and bounded, and the function f : X → R is concave over Sand continuous at every point of S. Then there exists y ∈ S such thatf (y) = max f (s) | s ∈ S.

Proof. Choose the sequence xn in S so that f (xn) convergesto M ≡ sup f (x) | x ∈ S. Let yn be as in the last lemma, con-verging to y ∈ S (since S is closed). Given any ϵ > 0, choose N largeenough so that n > N implies f (xn) > M − ϵ. Fixing n > N , supposethat yn =

∑i αixni

for finitely many αi > 0 such that∑

i αi = 1 andni ≥ n. Concavity of f implies that

M ≥ f (yn) ≥∑i

αif (xni) >

∑i

αi (M − ϵ) =M − ϵ.

This shows that the sequence f (yn) converges to M and also con-verges to f (y), since f is continuous at y. Therefore, M = f (y).

B.4. Projections on convex setsIn Section B.1 we characterized the projection of a vector onto a

line by an orthogonality condition. As the geometric intuition suggests,this argument generalizes to projections on convex sets. The vector xSis defined to be a projection of a vector x on a set S ⊆ X if xS ∈ Sand ‖x− s‖ ≥ ‖x− xS‖ for all s ∈ S. In words, xS is a point ofS that minimizes the distance between x and every point in S. Aprojection xS on S may not exist (for example, take X to be the realline, S = (0, 1) and x = 2), but as shown in this section, assumingS is convex, if xS exists it is unique, and if S is complete (Cauchysequences in S converge in S) then xS does exist. Our central concernis the dual characterization of a projection, which makes use of thefollowing condition: The vector x supports the set S at s ∈ X if

(B.4.1) 〈x | s〉 = inf 〈x | s〉 | s ∈ S .

This definition does not require that s ∈ S, which is useful in a latersection, but if s ∈ S, then x supports S at s if and only if 〈x | s− s〉 ≥ 0for all s ∈ S. The latter inequality can be visualized as the requirementthat the vectors x and s− s form an acute angle.

The central result on projections on convex sets follows. Note thatthe third part on existence applies in particular if X is a Hilbert spaceand S is closed, since then S is necessarily complete.

Page 190: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.4. PROJECTIONS ON CONVEX SETS 190

Theorem B.4.1. (Projection Theorem) Suppose S is a convex sub-set of the inner product space X. Then the following are true for allx, y ∈ X.

(1) The vector xS is a projection of x on S if and only if xS ∈ Sand xS − x supports S at xS.

(2) If xS is a projection of x on S and yS is a projection of y onS, then ‖xS − yS‖ ≤ ‖x− y‖. Therefore, if a projection of xonto S exists, it is unique.

(3) If S is complete, then the projection of x on S exists.Proof. 1. Fix xS, s ∈ S and define xα = xS + α (s− xS) ∈ S

for all α ∈ [0, 1]. Then ‖x− xα‖2 = ‖x− xS‖2 + α2 ‖s− xS‖2 +2α 〈xS − x | s− xS〉. The claim follows from the observation that as αranges over [0, 1], this quadratic is minimized at α = 0 if and only if〈xS − x | s− xS〉 ≥ 0 (why?).

2. Let δ ≡ y − x and δS ≡ yS − xS. By part 1, 〈xS − x | δS〉 ≥ 0and 〈y − yS | δS〉 ≥ 0. Adding up the two inequalities, we conclude〈δ − δS | δS〉 ≥ 0 and therefore

‖δ‖2 = ‖δ − δS‖2 + ‖δS‖2 + 2 〈δ − δS | δS〉 ≥ ‖δS‖2 .Applying this argument with x = y proves the uniqueness claim.

3. The projection of x on S exists if and only if the projection ofx on the intersection of S and a sufficiently large closed ball aroundx exists. We can therefore assume that S is bounded and the claimfollows from Proposition B.3.3. The remainder of this section elaborates on the important special caseof projections on linear manifolds. A vector x is said to be orthogonalto a linear manifold M if x is orthogonal to y− z for all y, z ∈M . Theorthogonal to M subspace, denoted by M⊥, is the linear subspaceof all vectors that are orthogonal to M . Note that M⊥ = (x+M)⊥

for every x ∈ X, and a vector supports M at some point if and only ifit is orthogonal to M . Theorem B.4.1 applied to linear manifolds gives

Corollary B.4.2. (Orthogonal Projection Theorem) Suppose Mis a linear manifold in X and x ∈ X. A point xM is the projection ofx on M if and only if xM ∈ M and x − xM ∈ M⊥. Assume furtherthat M is a complete (for example, finite-dimensional). Then x hasa unique decomposition of the form x = xM + n, xM ∈ M , n ∈ M⊥.Finally, if M = x + L for a (necessarily complete) linear subspace L,then3 M⊥⊥ = L.

Proof. We show the last claim only, since the rest is immediatefrom Theorem B.4.1. Also immediate is the fact that M⊥ = L⊥ andL ⊆ L⊥⊥ (whether L is complete or not). Assuming that L is complete,

3Although not needed in this text, it is not hard to show that for an arbitrarylinear subspace L of a Hilbert space, L⊥⊥ = L.

Page 191: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.4. PROJECTIONS ON CONVEX SETS 191

we now show that L⊥⊥ ⊆ L. Consider any x ∈ L⊥⊥ and let x = xL+n,where xL ∈ L and n ∈ L⊥, and therefore 〈x | n〉 = 〈xL | n〉 + 〈n | n〉and 〈xL | n〉 = 0. Since x ∈ L⊥⊥, we also have 〈x | n〉 = 0. The lastthree equalities imply that 〈n | n〉 = 0 and therefore x = xL ∈ L.

The following example demonstrates what can go wrong if M is notcomplete (and therefore infinite-dimensional).

Example B.4.3. The space C [0, 1] of all continuous functions ofthe form x : [0, 1] → R is a Hilbert space under the inner product〈x | y〉 =

∫ 1

0x (t) y (t) dt. It it existed, the projection of the zero vector

onto the linear manifold M ≡ x ∈ C [0, 1] | x (0) = 1 would minimize∫ 1

0x (t)2 dt subject to the constraint x (0) = 1. Such a minimum does

not exist within C[0, 1]. ♦Corollary B.4.2 implies the following useful relationship between

orthogonal projections and Riesz representations.Proposition B.4.4. Suppose that f (y) = 〈x | y〉 for all y ∈ X and

fL is the restriction of f on the linear subspace L. The vector xL isthe Riesz representation of fL in L if and only if it is the projection ofx on L.

The existence of orthogonal projections on complete linear sub-spaces implies that in any Hilbert space the Riesz representation ofa continuous linear functional exists. The argument is redundant inthe finite-dimensional case, since the claim was established in Proposi-tion B.1.6, but still worth reviewing.

Theorem B.4.5. In a Hilbert space, a linear functional is contin-uous if and only if it has a (necessarily unique) Riesz representation.

Proof. Suppose f is a continuous linear functional. The null sub-space N = x : f (x) = 0 is closed and therefore complete (since X isassumed complete). If X = N , the claim is trivial. Otherwise, pickany z ∈ X such that f (z) 6= 0, let zN be the projection of z on Nand define y = f (z − zN)

−1 (z − zN) . Then y is orthogonal to N andsatisfies f (y) = 1. For any x ∈ X, the fact that x−f (x) y ∈ N impliesthat 〈x− f (x) y | y〉 = 0, and therefore f (x) = 〈x | y〉 / 〈y | y〉 . Thevector 〈y | y〉−1 y is the Riesz representation of f.

Projections on finite-dimensional subspaces can be expressed bysimple formulas in terms of a given basis.

Proposition B.4.6. Suppose L is a linear subspace of X and B =(B1, . . . , Bd)

′ is a basis of L. For every x ∈ X, the vector(B.4.2) xL = 〈x | B′〉 〈B | B′〉−1

B

is the projection of x on L and x− xL is the projection of x on(B.4.3) L⊥ = y ∈ X | 〈B | y〉 = 0 .

Page 192: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.4. PROJECTIONS ON CONVEX SETS 192

Proof. The projection expression (B.4.2) can be viewed as a corol-lary of Proposition (B.1.6) (why?). For a more direct argument, letxL = σxLB, where σxL = 〈x | B′〉 〈B | B′〉−1. Then x− xL ∈ L⊥, sincefor all y = σyB,

〈xL | y〉 = σxL 〈B | B′〉σy′ = 〈x | B′〉 〈B | B′〉−1 〈B | B′〉σy′ = 〈x | y〉 .

By Corollary B.4.2, xL is the projection of x on L. Since L ⊆ L⊥⊥,x− (x− xL) = xL ∈ L⊥⊥ and x−xL ∈ L⊥, and therefore x−xL is theprojection of x on L⊥.

A useful application is to optimization problems of the form

min ‖x‖ | x ∈ X, 〈B | x〉 = b (b ∈ Rd).

The minimizing value of x is the projection of the zero vector onto thelinear manifold of all x such that 〈B | x〉 = b, which is characterizedbelow.

Corollary B.4.7. Suppose B = (B1, . . . , Bd)′ is a column ma-

trix of linearly independent vectors and M = x ∈ X | 〈B | x〉 = bfor some column vector b ∈ Rd. Then M⊥ = span (B1, . . . , Bd) andb′ 〈B | B′〉−1B is the projection of the zero vector on M .

Proof. Fix any m ∈ M and note that M −m is the linear sub-space L⊥ of equation (B.4.3). Therefore, M⊥ = L⊥⊥ = L = span (B).The point 0M is the projection of the zero vector on M if and only ifm− 0M is the projection of m on L⊥. We can therefore apply Propo-sition B.4.6 to conclude that the projection 0M of 0 on M exists and0M = 〈m | B′〉 〈B | B′〉−1B. Since 〈B | m〉 = b, the result follows.

Recall (from Section B.1) that a basis B is said to be orthonormalif the corresponding Gram matrix is an identity matrix. Another usefulapplication of Proposition B.4.6 is the construction of an orthonormalbasis, which is the basis for the claim that every finite-dimensional innerproduct space is isomorphic to a Euclidean space. We call a basis Borthogonal if its elements are pairwise orthogonal: 〈Bi | Bj〉 = 0 fori 6= j. Clearly, B is orthonormal if and only if it is orthogonal andnormalized in the sense that ‖Bi‖ = 1 for all i. Every orthogonal basiscan be normalized by scaling. If the linear subspace L has a finiteorthogonal basis B, then 〈B | B′〉 is diagonal and formula (B.4.2) forthe projection of x on L reduces to

xL =n∑i=1

〈x | Bi〉〈Bi | Bi〉

Bi.

Assuming X is finite-dimensional, this equation can be used to recur-sively construct an orthogonal basis of X, a process known as Gram-Schmidt orthogonalization. Start with any nonzero vector B1 in

Page 193: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.5. SUPPORTING HYPERPLANES AND (SUPER)GRADIENTS 193

X. Given n orthogonal vectors B1, . . . , Bn such thatL = span (B1, . . . , Bn) 6= X,

choose any x ∈ X \ L and define Bn+1 = x − xL, where xL is theprojection of x on L. If X is d-dimensional, the recursive constructionterminates for n = d. Normalizing the resulting orthogonal basis provesthat every finite-dimensional inner product space has an orthonormalbasis.4

B.5. Supporting hyperplanes and (super)gradientsA hyperplane H is a linear manifold whose orthogonal subspace

is of dimension one. If H⊥ is spanned by the vector y and α = 〈y | x〉 ,where x is any point in H, then H = x | 〈y | x〉 = α. Conversely, forevery nonzero vector y and scalar α, the preceding expression definesa hyperplane. The set x | 〈y | x〉 ≥ α defines a half-space whoseboundary is H in the direction of the orthogonal-to-H vector y. Recallthat the vector y is said to support the set S at s ∈ X if

〈y | s〉 = inf 〈y | s〉 | s ∈ S .This condition can be visualized as the inclusion of the closure of S inthe half-space x | 〈y | x〉 ≥ α, where α ≡ 〈y | s〉, with S touching Hat s. It is intuitively compelling that one should be able to support aconvex set at any point of its boundary. The supporting hyperplanetheorem shows that this is indeed true for a finite-dimensional space.

Theorem B.5.1. (Supporting Hyperplane Theorem) Suppose S is aconvex subset of the finite-dimensional space X. Then for every vectorx that is not in the interior of S, there exists a nonzero vector y suchthat 〈y | x〉 ≤ 〈y | s〉 for all s ∈ S. If x is on the boundary of S, then ysupports S at x.

Proof. Since x is not interior, we can construct a sequence xn ofvectors such that xn /∈ S for all n and xn → x as n→ ∞. For each n, letsn be the projection of xn on S and define yn = (sn − xn) / ‖sn − xn‖.The dual characterization of projections gives 〈yn | sn〉 ≤ 〈yn | s〉 forall s ∈ S. By the second part of Theorem B.4.1, the sequence snconverges to the projection s of x on S. The sequence yn lies inthe closure of the unit ball, which is compact, and therefore we canextract a subsequence ynk

that converges to some vector y of unitnorm. By the continuity of inner products, 〈y | s〉 ≤ 〈y | s〉 for alls ∈ S. If x is on the boundary of S, then x = s is the limit ofsome sequence sn in S. Therefore, 〈y | sn〉 converges to 〈y | s〉,which shows (B.4.1). If x is not on the boundary of S, then ynk

4The Gram-Schmidt orthogonalization procedure described here is of theoreti-

cal value, but is not numerically stable when implemented as an algorithm—betteralternatives exist. A complete discussion of this point is given in Meyer [2004].

Page 194: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.5. SUPPORTING HYPERPLANES AND (SUPER)GRADIENTS 194

converges to y = (s− x) / ‖s− x‖. Since 〈y | s− x〉 > 0, it followsthat 〈y | x〉 < 〈y | s〉 ≤ 〈y | s〉 for all s ∈ S.

The following example shows that the preceding theorem is notvalid in the infinite-dimensional case.5

Example B.5.2. Suppose X = l2, the space of square summablesequences with the inner product 〈x | y〉 =

∑∞n=1 xnyn. Let S equal

the positive cone x ∈ X | xn ≥ 0 for alln and pick any s ∈ S suchthat sn > 0 for all n. There is no nonzero vector that supports S at s(why?). ♦

Corollary B.5.3. (Separating Hyperplane Theorem) Suppose Aand B are convex subsets of the finite-dimensional inner-product spaceX. If A ∩B = ∅, then there exists a nonzero y ∈ X such that(B.5.1) inf

a∈A〈y | a〉 ≥ sup

b∈B〈y | b〉 .

Proof. Since A ∩B = ∅, the zero vector is not inS ≡ A−B = a− b | a ∈ A, b ∈ B .

By the supporting hyperplane theorem, there exists a nonzero y ∈ Xsuch that 〈y | s〉 ≥ 0 for all s ∈ S and therefore 〈y | a〉 ≥ 〈y | b〉 for alla ∈ A and b ∈ B.

Another useful application of the supporting hyperplane theoremarises when the convex set being supported is defined as the spacebelow the graph of a concave function, leading to the concept of asupergradient, which for a smooth function corresponds to a directionalderivative. Consider any function f : D → R, where D ⊆ X. Thevector y is a supergradient of f at x ∈ D if it satisfies the gradientinequality:

f (x+ h) ≤ f (x) + 〈y | h〉 for all h such that x+ h ∈ D.

The superdifferential of f at x, denoted by ∂f (x), is the set of allsupergradients of f at x. The supergradient property can be visualizedas a support condition in the space X × R with the inner product

〈(x1, α1) | (x2, α2)〉 = 〈x1 | x2〉+ α1α2, xi ∈ X, αi ∈ R.

The subgraph of f is the set(B.5.2) sub (f) = (x, α) ∈ D × R | α ≤ f (x) .

5The set S of the example has an empty interior relative to the topology definedby the induced norm (why?). If S is assumed to have a non-empty interior, thenthe validity of the Supporting Hyperplane Theorem is restored in any Hilbert space(as well as extensions), but this version of the result is of limited practical use,because in infinite-dimensional spaces, so many convex sets of interest have emptyinteriors.

Page 195: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.5. SUPPORTING HYPERPLANES AND (SUPER)GRADIENTS 195

The preceding definitions imply that(B.5.3) y ∈ ∂f (x) ⇐⇒ (y,−1) supports sub (f) at (x, f (x)) .

Theorem B.5.4. In a finite-dimensional inner product space, thesuperdifferential of a concave function at an interior point of its domainis nonempty, convex and compact.

Proof. Suppose f : D → R is concave and x ∈ D0. By thesupporting hyperplane theorem, sub (f) is supported by some nonzero(y,−β) ∈ X × R at (x, f (x)) :

〈y | x〉 − βf (x) = min 〈y | z〉 − βα | α ≤ f (z) , z ∈ D, α ∈ R .Since the left-hand side is finite, it follows that β ≥ 0. If β = 0, then ysupports D at x, which contradicts the assumption that x is an interiorpoint of D. Therefore, β > 0 and y/β ∈ ∂f (x). This proves that ∂f (x)is nonempty. It follows easily from the definitions that ∂f (x) is alsoconvex and closed. Finally, we show that ∂f (x) is bounded, utilizingthe finite dimensionality assumption. We assume x = 0, which entailsno loss of generality (why?). Theorem B.3.1 implies that f is boundedbelow over some ball centered at zero. Let ε > 0 and K ∈ R be suchthat for all z ∈ X, ‖z‖ = ε implies z ∈ D0 and f (z) > K. It followsthat for every y ∈ ∂f (0),

K < f

(− ε

‖y‖y

)≤ f (0) +

⟨y | − ε

‖y‖y

⟩= f (0)− ε ‖y‖ ,

which results in the bound ‖y‖ < (f (0)−K) /ε. This proves that∂f (x) bounded. Since it is also closed, it is compact.

Assuming it exists, the directional derivative of f at x in thedirection h is denoted and defined by

f ′ (x;h) = limα↓0

f (x+ αh)− f (x)

α.

The gradient of f at x is said to exist if f ′ (x;h) exists for every h ∈ Xand the functional f ′ (x; ·) is linear. In this case, the Riesz representa-tion of the linear functional f ′ (x; ·) is the gradient of f at x, denotedby ∇f (x). Therefore, when it exists, the gradient ∇f (x) is character-ized by the restriction f ′ (x;h) = 〈∇f (x) | h〉 for all h ∈ X.

Theorem B.5.5. Suppose D is a convex subset of the finite-dimensionalinner product space X, the function f : D → R is concave and x is anyinterior point of D. Then the directional derivative f ′ (x;h) exists forall h ∈ X and the following are true:

(1) ∂f (x) = y ∈ X | 〈y | h〉 ≥ f ′ (x;h) for all h ∈ X.(2) f ′ (x;h) = min 〈y | h〉 | y ∈ ∂f (x) for every h ∈ X.(3) The gradient ∇f (x) exists if and only if ∂f (x) is a singleton,

in which case ∂f (x) = ∇f (x).

Page 196: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.5. SUPPORTING HYPERPLANES AND (SUPER)GRADIENTS 196

Proof. Fix any h ∈ X, let A = α ∈ R | x+ αh ∈ D0 (an openinterval) and define the concave function ϕ : A → R by ϕ (α) =f (x+ αh)− f (x). Note that f ′ (x;h) = ϕ′

+ (0) (the right derivative ϕat zero). For each α ∈ A, let ∆(α) = ϕ (α) /α, which is the slope of theline segment on the plane connecting the origin to the point (α, ϕ (α)).Consider any decreasing sequence αn of positive scalars in A. Con-cavity of ϕ implies that the corresponding sequence of slopes ∆(αn)is increasing. Moreover, fixing any ε > 0 such that x − εh ∈ D0, wehave ∆(αn) ≤ ∆(−ε) for all n. Being increasing and bounded, thesequence ∆(αn) converges. This proves that the (monotone) limitf ′ (x;h) = limα↓0∆(α) exists and is finite.

(1) If y ∈ ∂f (x), the gradient inequality implies that 〈y | h〉 ≥∆(α) for all positive α ∈ A. Letting α ↓ 0, it follows that 〈y | h〉 ≥f ′ (x;h). Conversely, suppose that y /∈ ∂f (x) and therefore ∆(1) =f (x+ h) − f (x) > 〈y | h〉 for some h. We saw earlier that ∆(α)increases to f ′ (x;h) as α ↓ 0 and therefore f ′ (x;h) ≥ ∆(1) > 〈y | h〉for some h.

(2) In light of part 1, the claim of part 2 follows if given h ∈ X, wecan produce some y ∈ ∂f (x) such that 〈y | h〉 = f ′ (x;h). The gradientinequality ϕ (α) ≤ ϕ′

+ (0)α for all α ∈ A (equivalently, f (x+ αh) ≤f (x) + αf ′ (x;h) for all α ∈ A) can be restated as the condition thatthe line segment

L = (x+ αh, f (x) + αf ′ (x;h)) : α ∈ A ⊆ X × R

does not intersect the interior of the subgraph of f as defined in (B.5.2).By the separating hyperplane theorem (Corollary B.5.3), there existsnonzero (p, β) ∈ X × R that separates the sets L and sub (f):

infα∈A

〈p | x+ αh〉+ β (f (x) + αf ′ (x;h)) ≥ sup(x,α)∈sub(f)

〈p | x〉+ βα.

Since the right-hand side must be finite, β > 0. It follows that theright-hand side is at least as large as 〈p | x〉 + βf (x), which is alsoobtained as the expression on the left-hand side with α = 0. Since0 ∈ A0, the coefficient of α on the left-hand side must vanish: 〈p | h〉+βf ′ (x;h) = 0. Therefore, f ′ (x;h) = 〈y | h〉, where y = −p/β, and theseparation condition reduces to the gradient inequality defining thecondition y ∈ ∂f (x).

(3) If ∂f (x) = δ, then part 2 implies that f ′ (x;h) = 〈δ | h〉for all h ∈ X and therefore δ = ∇f (x). Conversely, if the gradientexists, then part1 implies that 〈∇f (x)− y | h〉 ≤ 0 for all h ∈ X andy ∈ ∂f (x). Letting h = ∇f (x) − y, it follows that y = ∇f (x) ify ∈ ∂f (x).

Page 197: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.6. OPTIMALITY CONDITIONS 197

B.6. Optimality conditionsThe characterization of optima is a main application of convex anal-

ysis in economics. We already saw an example with the projection the-orem. In this section, we focus on the optimization problem definingthe function F : Rn → [−∞,+∞], where(B.6.1) F (δ) ≡ sup f (x) | g (x) ≤ δ, x ∈ D , δ ∈ Rn,

with the convention inf ∅ = −∞, for given set D ⊆ X and functionsf : D → R and g : D → Rn, for some positive integer n. Since δ canbe absorbed in the specification of g, it is sufficient to consider the caseδ = 0. To do so, however, it is useful to consider the entire function F .

Assuming F (0) is finite, the superdifferential of F at zero is∂F (0) ≡ λ ∈ Rn | F (δ)− F (0) ≤ λ · δ for all δ ∈ Rn ⊆ Rn

+,

where the inclusion is due to the fact that δ ≥ 0 implies F (δ) ≥ F (0).It is closely related to the Lagrangian function

L (x, λ) ≡ f (x)− λ · g (x) , x ∈ C, λ ∈ Rn.

Lemma B.6.1. Assume F (0) is finite. For all λ ∈ Rn+, λ ∈ ∂F (0)

if and only if F (0) = supx∈D L (x, λ).

Proof. In Rn×R with the Euclidean inner product, λ ∈ ∂F (0) ifand only if (λ,−1) supports at (0, F (0)) the set A of all (a, α) ∈ Rn×Rsuch that α < F (a) (why?). Similarly, F (0) = supx∈D L (x, λ) if andonly if (λ,−1) supports at (0, F (0)) the set B of all (b, β) ∈ Rn×R forwhich there exists x ∈ D such that g (x) ≤ b and α < F (x) (why?).Finally, note that A = B.

Assuming that the supremum defining F (0) is a maximum and that∂F (0) is nonempty, the following main result provides a way of con-verting the constrained optimization problem defining F (0) to a cor-responding unconstrained problem. The intuitive idea is that throughthe condition λ ∈ ∂F (0), the parameter λ, known as a Lagrangemultiplier, provides an appropriate pricing of the constraint g ≤ 0.If the constraint gi (x) ≤ 0 is slack, then the corresponding price ofthe constraint must be zero. This leads to the so-called complemen-tary slackness condition λ · g (x) = 0, which, given the inequalitiesg (x) ≤ 0 and λ ≥ 0, is equivalent to gi (x) < 0 =⇒ λi = 0.

Theorem B.6.2. Suppose that x ∈ D, g (x) ≤ 0 and λ ∈ Rn. Thenthe following two conditions are equivalent:

(1) f (x) = F (0) and λ ∈ ∂F (0).(2) L (x, λ) = maxy∈D L (y, λ) , λ · g (x) = 0, λ ≥ 0.

Proof. Suppose condition 1 holds. It has already been noted thatλ ∈ ∂F (0) implies λ ≥ 0. Since g (x) ≤ 0, L (x, λ) ≥ f (x) = F (0).By the last lemma, F (0) ≥ L (x, λ). Therefore, L (x, λ) = f (x) and

Page 198: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.6. OPTIMALITY CONDITIONS 198

λ · g (x) = 0. Conversely, condition 2 implies that f (x) = L (x, λ) ≥L (y, λ) ≥ f (y) for all y such that g (y) ≤ 0. Therefore, f (x) = F (0)and condition 1 follows by the last lemma. Assuming the existence of a maximum, the preceding optimality con-ditions are applicable if and only if ∂F (0) is nonempty. The followinglemma gives convexity-based sufficient conditions that are easy to checkif they can be satisfied.

Lemma B.6.3. Suppose that D and g are convex, f is concave, F (0)is finite and there exists x ∈ D such that gi (x) < 0 for every i. Then∂F (0) 6= ∅.

Proof. By the supporting hyperplane theorem, the convex setsub(F ) is supported at (0, F (0)) by some nonzero (λ,−α), and there-fore υ ≤ F (δ) implies −αF (0) ≤ λ · δ − αυ for all δ ∈ Rn. If α < 0,setting δ = 0 leads to a contradiction. The Slater condition ensuresthat F (δ) > −∞ for all sufficiently small δ. Therefore α > 0 andλ/α ∈ ∂F (0).

The optimality conditions given so far are global in that they utilizethe structure of the objective and constraint functions on their entiredomain. Another type of optimality condition is local in that it pro-vides implications of the fact that infinitesimal feasible perturbationsfrom a reference optimum cannot improve the objective. The simplestinstance of such an argument gives necessary optimality conditions foran unconstrained maximum. Suppose that x is interior to D and thegradient of f at x exists. If f (x) = maxy∈D f (y) then ∇f (x) = 0.To see why, for every y ∈ X, consider the function on a small enoughneighborhood of zero on the real line mapping α to f (x+ αy). Sincethis function is maximized for α = 0, its derivative at zero, ∇f (x) · y,must equal zero, and since y is arbitrary, ∇f (x) = 0. In the conversedirection, in order for the local condition ∇f (x) = 0 to imply the globalcondition f (x) = maxy∈D f (y), we need a global regularity condition.For example, concavity of f works via the gradient inequality.

The preceding argument can be applied in particular to the functionL (·, λ) of the second condition of Theorem B.6.2. Assuming x ∈ D0

and the existence of ∇f (x) and ∇g (x), L (x, λ) = maxy∈D L (y, λ)implies ∇f (x) = λ · ∇g (x), and the converse is true if D and g areconvex and f is concave. We are therefore led to consider the Kuhn-Tucker conditions:(B.6.2) ∇f (x) = λ · ∇g (x) , λ · g (x) = 0, λ ≥ 0.

The sufficiency of the Kuhn-Tucker conditions for optimality underconvexity assumptions is covered by Theorem B.6.2 and the gradientinequality. A necessity argument via Theorem B.6.2 and Lemma B.6.3requires redundant global convexity assumptions. Instead, we can de-rive the necessity of the Kuhn-Tucker conditions for optimality with

Page 199: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

B.6. OPTIMALITY CONDITIONS 199

the following local argument. Note that even though convexity makesno appearance in the theorem’s statement, the separating hyperplanetheorem is at the core of its proof.

Theorem B.6.4. Suppose that the gradients of f and g at x ∈ D0

exist, and there exists some vector h such that 〈∇gi (x) | h〉 < 0 for all i.If F (x) = f (x) and g (x) ≤ 0, then the Kuhn-Tucker conditions (B.6.2)hold for some λ ∈ Rn.

Proof. Define the convex set A of all (a, α) ∈ Rn × R such thatai < 0 for all i and α > 0. Also define the convex set B of all (b, β) ∈Rn × R for which there exists h ∈ X such that

g (x) + 〈∇g (x) | h〉 ≤ b and 〈∇f (x) | h〉 ≥ β.

Optimality of x implies that for all h ∈ X, if gi (x) + 〈∇gi (x) | h〉 <0 for all i, then 〈∇f (x) | h〉 ≤ 0 (why?). Therefore A∩B = ∅. By theseparating hyperplane theorem (Corollary B.5.3), there exists nonzero(−λ, µ) ∈ Rn × R such that

(a, α) ∈ A =⇒ −λ · a+ µα ≥ 0,(B.6.3)(b, β) ∈ B =⇒ −λ · b+ µβ ≤ 0.(B.6.4)

Condition (B.6.3) implies that λ, µ ≥ 0. The case µ = 0 is ruled out bythe regularity assumption on the gradient of g. After proper scaling,we can therefore assume that µ = 1. Condition (B.6.4) with µ = 1implies that

−λ · (g (x) + 〈∇g (x) | h〉) + 〈∇f (x) | h〉 ≤ 0 for all h ∈ X,

which together with the inequalities λ ≥ 0 and g (x) ≤ 0 results in theKuhn-Tucker conditions (B.6.2).

This preceding result excludes equality constraints where gi (x) ≤ 0and gi (x) ≥ 0 are both binding, but the ideas extend easily to this case.For example, consider the simple case of maximization of f over thelinear manifold M = x ∈ X | 〈B | x〉 = b, where B = (B1, . . . , Bd)

is a column matrix of vectors and b ∈ Rd. Assume that x is interiorto D and the gradient of f at x exists. Suppose that x is optimalin the sense that f (x) = maxy∈M∩D f (y). Then for every y suchthat 〈B | y〉 = 0, x + αy ∈ M for all α ∈ R. Setting the derivativeof α 7→ f (x+ αy) at zero equal to zero, we conclude that ∇f (x)is orthogonal to all y ⊥ span (B), and therefore ∇f (x) ∈ span (B).The latter is a necessary local optimality condition. By virtue of thegradient inequality, it is also a global sufficient optimality conditionunder the assumption that x ∈ M ∩D, D is convex and f is concave.The argument extends to equality constraints of the form g (x) = 0,with ∇g (x) playing the role of B in the preceding argument, but wehave no need for this extension in this text.

Page 200: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Bibliography

J. Aczél. Lectures on Functional Equations and Their Applications.Dover Publications, Mineola, NY, 2006. 170

F. J. Anscombe and R. J. Aumann. A definition of subjective proba-bility. Annals of Mathematical Statistics, 34:199–205, 1963. 173

D. Applebaum. Lévy Processes and Stochastic Calculus. CambridgeUniversity Press, Cambridge, U.K., 2004. 76

K. J. Arrow. Le rôle des valeurs boursiéres pour la repartition la meil-lure des risques. Econométrie, Colloques Internationaux du CentreNational de la Recherche Scientifique, 40:41–47, 1953. 21, 107

K. J. Arrow. The role of securities in the optimal allocation of riskbearing. Review of Economic Studies, 31:91–96, 1963. 21, 107

K. J. Arrow. Aspects of the Theory of Risk Bearing. Yrjo JahnssoninSaatio, Helsinki, 1965. 145

K. J. Arrow. Essays in the Theory of Risk Bearing. North Holland,London, 1971. 59, 145

R. J. Aumann. Values of markets with a continuum of traders. Econo-metrica, 43(4):611–646, 1975. 112

D. P. Bertsekas. Convex Analysis and Optimization. Athena Scientific,Belmont, MA, 2003. 178

F. Black and M. Scholes. The pricing of options and corporate liabili-ties. Journal of Political Economy, 3:637–654, 1973. 90

J. F. Bonnans and A. Shapiro. Perturbation Analysis of OptimizationProblems. Springer-Verlag, New York, 2000. 178

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge Uni-versity Press, Cambridge, United Kingdom, 2004. 178

E. Cinlar. Probability and Stochastics. Springer-Verlag, New York,2010. 76

J. Cox and S. Ross. The valuation of options for alternative stochasticprocesses. Journal of Financial Economics, 3:145–166, 1976. 59

R. Dalang, A. Morton, and W. Willinger. Equivalent martingalemeasures and no-arbitrage in stochastic securities market models.Stochastics and Stochastic Reports, 29:185–201, 1990. 22

G. Debreu. Theory of Value. Cowles Foundation Monograph, YaleUniversity Press, New Haven, 1959. 21, 107

G. Debreu. Mathematical Economics: Twenty Papers of Gerard De-breu. Cambridge University Press, New York, 1983. 164, 166

200

Page 201: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Bibliography 201

F. Delbaen and W. Schachermayer. The Mathematics of Arbitrage.Springer-Verlag, New York, 2006. 22

P. DeMarzo and C. Skiadas. Aggregation, determinacy, and informa-tional efficiency for a class of economies with asymmetric informa-tion. Journal of Economic Theory, 80:123–152, 1998. 159

P. DeMarzo and C. Skiadas. On the uniqueness of fully informativerational expectations equilibria. Economic Theory, 13:1–24, 1999.159

S. Dreyfus. Richard Bellman on the birth of dynamic programming.Operations Research, 50(1), 2002. 162

J. Drèze. Market allocation under uncertainty. European EconomicReview, 15:133–165, 1971. 59

R. M. Dudley. Real Analysis and Probability. Cambridge UniversityPress, New York, 2002. 150, 186

D. Duffie and L. G. Epstein. Stochastic differential utility. Economet-rica, 60:353–394, 1992a. 146, 151

D. Duffie and L. G. Epstein. Asset pricing with stochastic differentialutility. Review of Financial Studies, 5:411–436, 1992b. 151

D. Duffie and C. Skiadas. Continuous-time security pricing: A utilitygradient approach. Journal of Mathematical Economics, 23:107–131,1994. 151

N. Dunford and J. T. Schwartz. Linear Operators, Part I, GeneralTheory. Wiley, 1988. 178

I. Ekeland and R. Témam. Convex Analysis and Variational Problems.SIAM, Philadelphia, PA, 1999. 178

L. Epstein and S. Zin. Substitution, risk aversion, and the temporalbehavior of consumption and asset returns: A theoretical framework.Econometrica, 57:937–969, 1989. 128

L. P. Hansen and R. Jagannathan. Implications of security market datafor models of dynamic economies. Journal of Political Economy, 99:225–262, 1991. 58

M. J. Harrison and D. M. Kreps. Martingale and arbitrage in multi-period securities markets. Journal of Economic Theory, 20:381–408,1979. 59

J. Jacod and P. Protter. Discretization of Processes. Springer VerlagBerlin Heidelberg, 2012. 76

J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes.Springer-Verlag, Berlin Heidelberg, second edition, 2003. 33, 76, 78,79, 80

Y. M. Kabanov and D. O. Kramkov. No-arbitrage and equivalent mar-tingale measure: An elementary proof of the Harrison-Pliska theo-rem. Theory of Probability and its Applications, 39:523–527, 1994.22

D. H. Krantz, R. D. Luce, P. Suppes, and A. Tversky. Foundations ofMeasurement, volume I. Academic Press, Inc., San Diego, 1971. 166

Page 202: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Bibliography 202

D. Kreps and E. Porteus. Temporal resolution of uncertainty and dy-namic choice theory. Econometrica, 46:185–200, 1978. 146

D. M. Kreps. Arbitrage and equilibrium in economies with infinitelymany commodities. Journal of Mathematical Economics, 8:15–35,1981. 22

D. G. Luenberger. Optimization by Vector Space Methods. Wiley, NewYork, 1969. 178

H. Markowitz. Portfolio selection. Journal of Finance, 7:77–91, 1952.54

R. C. Merton. Lifetime portfolio selection under uncertainty: Thecontinuous time case. Review of Economics and Statistics, 51:247–257, 1969. 154

R. C. Merton. Optimum consumption and portfolio rules in acontinuous-time model. Journal of Economic Theory, 3:373–413,1971. Erratum 6 (1973): 213–214. 154

C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, 2004.193

P. Mörters and Y. Peres. Brownian Motion. Cambridge UniversityPress, New York, 2010. 78

L. Narens. Abstract Measurement Theory. MIT Press, Cambridge, MA,1985. 166

E. Pardoux and S. Peng. Adapted solution of a backward stochasticdifferential equation. Systems and Control Letters, 14:55–61, 1990.146

J. W. Pratt. Risk aversion in the small and in the large. Econometrica,32:122–136, 1964. 145

F. P. Ramsey. Truth and probability. In H. E. Kyburg, Jr. and H. E.Smokler, editors, Studies in Subjective Probability (1980). Robert E.Krieger Publishing Company, New York, 1926. 173

P. J. Reny. A simple proof of the nonconcavifiability of functions withlinear not-all-parallel contour sets. Journal of Mathematical Eco-nomics, 49:506–508, 2013. 112

D. Revuz and M. Yor. Continuous Martingales and Brownian Motion.Springer, New York, third edition, 1999. 78

R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press,Princeton, New Jersey, 1970. 178

S. A. Ross. A simple approach to the valuation of risky streams. Journalof Business, 51:453–475, 1978. 22

L. J. Savage. The Foundations of Statistics. Dover Publications (1972),New York, 1954. 173

W. Schachermayer. A Hilbert-space proof of the fundamental theoremof asset pricing. Insurance Mathematics and Economics, 11:249–257,1992. 22

M. Schroder and C. Skiadas. Optimal consumption and portfolio selec-tion with stochastic differential utility. Journal of Economic Theory,

Page 203: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Bibliography 203

89:68–126, 1999. 146M. Schroder and C. Skiadas. An isomorphism between asset pricing

models with and without linear habit formation. Review of FinancialStudies, 15:1189–1221, 2002. 120

M. Schroder and C. Skiadas. Optimal lifetime consumption-portfoliostrategies under trading constraints and generalized recursive pref-erences. Stochastic Processes and Their Applications, 108:155–202,2003. 148

M. Schroder and C. Skiadas. Lifetime consumption-portfolio choiceunder trading constraints and nontradeable income. Stochastic Pro-cesses and their Applications, 115:1–30, 2005. 148

C. Skiadas. Advances in the Theory of Choice and Asset Pricing. PhDthesis, Stanford University, 1992. 151

C. Skiadas. Subjective probability under additive aggregation of con-ditional preferences. Journal of Economic Theory, 76:242–271, 1997.173

C. Skiadas. Recursive utility and preferences for information. EconomicTheory, 12:293–312, 1998. 146

C. Skiadas. Asset Pricing Theory. Princeton Univ. Press, Princeton,NJ, 2009. 5, 53, 138, 170, 173

C. Skiadas. Scale-invariant asset pricing and consumption/portfoliochoice with general attitudes toward risk and uncertainty. Mathe-matics and Financial Economics, 7:431–456, 2013a. 141

C. Skiadas. Scale-invariant uncertainty-averse preferences and source-dependent constant relative risk aversion. Theoretical Economics, 8:59–93, 2013b. 170, 175

C. Skiadas. Dynamic choice with constant source-dependent relativerisk aversion. Economic Theory, 3:393–422, 2015. 175

J. Stoer and C. Witzgall. Convexity and Optimization in Finite Di-mensions. Springer-Verlag, Berlin, Germany, 1970. 22

R. H. Strotz. Myopia and inconsistency in dynamic utility maximiza-tion. Review of Economic Studies, 23:165–180, 1957. 105

J. von Neumann and O. Morgenstern. Theory of Games and EconomicBehavior. Princeton University Press, Princeton, NJ, 1944. 173

P. P. Wakker. Cardinal coordinate independence for expected utility.Journal of Mathematical Psychology, 28:110–117, 1984. 173

P. P. Wakker. The algebraic versus the topological approach to additiverepresentations. Journal of Mathematical Psychology, 32:421–435,1988. 166, 173

P. P. Wakker. Additive Representations of Preferences. Kluwer, Dor-drecht, The Netherlands, 1989. 166, 173

L. Walras. Eléments d’Economie Pure. Corbaze, Lausanne. Englishtranslation: Elements of Pure Economics, R. D. Irwin, Homewood,IL (1954), 1874. 107

Page 204: A Course on the Theory of Competitive Financial Markets · 2020. 3. 8. · A Course on the Theory of Competitive Financial Markets Costis Skiadas. Preliminary and Incomplete March

Bibliography 204

P. Weil. The equity premium puzzle and the risk-free rate puzzle.Journal of Monetary Economics, 24:401–421, 1989. 128

P. Weil. Non-expected utility in macroeconomics. The Quarterly Jour-nal of Economics, 105:29–42, 1990. 128

W. Whitt. Proofs of the martingale FCLT. Probability Surveys, 4:268–302, 2007. 78

H. Xing. Consumption investment optimization with epstein-zine util-ity in incomplete markets. Finance and Stochastics, 21:227–262,2017. 146

M. E. Yaari. A note on separability and quasiconcavity. Econometrica,45:1183–1186, 1977. 168

J. A. Yan. Caracterisation d’une class d’ensembles convexes de l1 ouh1. Lecture Notes in Mathematics, 784:220–222, 1980. 22