econ 2021 - financial economics ikkasa/ec2021_lecture3r.pdf · 2018-09-06 · note, in economics we...

ECON 2021 - FINANCIAL ECONOMICS I

Lecture 3 – Stochastic Processes & Stochastic Calculus

September 24, 2018

KASA ECON 2021 - FINANCIAL ECONOMICS I

STOCHASTIC PROCESSES

Asset prices, asset payoffs, investor wealth, and portfoliostrategies can all be viewed as stochastic processes.

Definition: A stochastic process is a time-indexedsequence of random variables.

Stochastic Processes vs. Random Variables

(paths) (real numbers)

A stochastic process must be regarded as a single unit. Itis a random variable, but it belongs to a different ‘space’,i.e., a function space, and we write it using two arguments:X(t, ω).


A realization is an entire path, not a number, and thePopulation is the set of all possible realizations.

Time

Price

ω1

ω2

ω3

Note, in economics we observe only one realization. ‘Lawof Large Numbers’ and ‘Central Limit Theorem’ argumentsare still applicable if observations over time provideenough new information about ensemble averages.


Stationary Stochastic Process: All joint distributions are independentof time (e.g., initial conditions have ‘worn off’)

f(xt, xt+1, · · ·xt+j) indpt. of t ∀j

Markov Process: A restriction on the conditional distribution function:

f(future|present, past) = f(future|present)

or Prob(xt+1|xt, xt−1, · · ·xt−k) = prob(xt+1|xt) ∀k ≥ 1 (i.e., nopath dependence).

Whether a process is Markov depends on the dimensionality of thestate.Example: St = α1St−1 + α2St−2 appears non-Markov. However, ifwe define Zt = (St, St−1), then Zt is Markov.

A process can be Markov but nonstationary. Example:xt = λxt−1 + εt with |λ| ≥ 1 (Why?)

If a process is stationary and (low-dimensional) Markov, then‘conventional’ statistical methods can be applied.


MARTINGALES

The preeminent stochastic process in finance is a martingale. Amartingale satisfies the following restriction:

E[Xt+1|Ft] = Xt

where Ft = Information available at time-t. Hence, a martingale isnonstationary.

There are 2 key ingredients to any martingale:

1 A family of information sets, Ft (called a ‘filtration’). A process may be amartingale w.r.t. one filtration but not another.

2 A probability measure, used to define conditional expectations. A process may bea martingale w.r.t. one measure but not another. We exploit this often later on.

A martingale is a mathematical formalization of a fair game. Betting onthe outcomes of an unbiased coin generates a martingale wealthprocess.


A martingale only restricts the first moment. It should be distinguishedfrom a random walk. A random walk is a martingale, but not viceversa. (Exercise: Provide an example of a martingale that is not arandom walk).

If E[Xt+1|Ft] > Xt the process is called a submartingale. IfE[Xt+1|Ft] < Xt it is a supermartingale.

Note that if we define prices to include dividends we can write theFund. Eq. as follows:

MtPt = Et[Mt+∆ · Pt+∆]

where with time-additive utility Mt+∆/Mt = e−ρ∆U ′(Ct+∆)/U ′(Ct)

This says that SDF-scaled prices follow a martingale. We areinterested in what happens when ∆→ 0 and trading becomescontinuous. The continuous-time limit of a (continuous) martingale is aBrownian motion process.


CONTINUOUS- VS. DISCRETE-TIME

The time index in a stochastic process can either be discrete (integers)or continuous (real numbers).

Is time “really” discrete or continuous? [Who knows, ask a physicist].

In economics, the choice between them should be based solely onmathematical and computational convenience. Use whichever iseasier for the problem at hand!

The fact is, in many asset pricing problems, continuous-time is easier.This is especially true in option pricing. Besides, in today’s world ofcomputerized, algorithmic trading, continuous-time is becomingincreasingly accurate even from a descriptive standpoint!

The easiest way to think about a continuous-time stochastic process isas a limit of a discrete-time process. However, there are dangers indoing this, and we must be careful in how we take limits.


WIENER PROCESS (BROWNIAN MOTION)

Defining a discrete-time process is simple - Just cumulate i.i.d. shocks.The Wold Representation Theorem tells us this is a completely generalway to define a (linear) discrete-time process.Example: xt = ρxt−1 + εt |ρ| < 1 εt ∼ i.i.d.⇒ xt =

∑∞j=0 ρ

jεt−j

Likewise, defining a deterministic continuous-time process is alsosimple - Just write down a differential equationExample: dx

dt= a · x

Here’s the question - Can we define a stochastic continuous-timeprocess by just adding on an i.i.d. shock to a differential equation, aswe did in the discrete-time case? Does the following eq. make sense?

dx

dt= a · x+ ε

Answer: Not really, but in a way yes.


It turns out that defining a continuous-time i.i.d. process is a bit tricky,and leads to some surprising results.

We will see that the mathematically correct way to define a stochasticdifferential equation is in terms of the following integral equation:

xt = x0 +

∫ t

0

axsds+

∫ t

0

dWs

where Wt is a Wiener Process, and the 2nd integral is a new type ofintegral, called an Ito Integral.

The Wiener process can be viewed as a continuous-time limit of thefollowing random walk process

xt = xt−1 + εt εt ∼ i.i.d. N(0, 1)

⇒ xt+T = xt +∑T

j=1 εt+j

where the step-size shrinks to zero in a very particular way as the timebetween steps shrinks to zero.


Note 3 features of the above random walk:1 E(xt+T |xt) = xt∀T or equivalently, E(∆xt+k|xt) = 0 (Markov, with indpt.,

zero mean increments).2 Var(xt+T |xt) = T (Variance increases linearly with number of periods).3 Although the unconditional distribution is undefined (since the process is

nonstationary), all conditional distributions are Gaussian.

We want to derive a continuous-time process with these sameproperties. In particular, we want the variance to depend on thesample length, T , but not on the number of steps.

Unless the step-sizes go to zero slower than the step-lengths, the lawof large numbers would cause the variance to shrink to zero.

In analogy to the Wold Representation Theorem, all continuoussample path continuous-time processes can be constructed asfunctions of the Wiener process. (If you want to allow jumps, then youmust consider a wider class of innovations, called Levy Processes).


BINOMIAL TREE

Let’s consider a slight generalization of a random walk, which allows the process to driftup or down on average over time.

To visualize the path of this process, consider the following binomial tree:

P

q

P2

2pq

q2

X0

X0 + Δh

X0 + 2Δh

X0 -- Δh

X0 -- 2Δh

t=0

Type equation here.

t = Δt t = 2Δt

This depicts the path of the following r.w. (with drift) process:

xt+∆t = xt + εt+∆t εt = ∆h with prob. p

= −∆h with prob. q


This is called a binomial tree because each step is the outcome of aBernoulli trial, so the path of the process (sum of the steps) is abinomial r.v.

By definition, ∆t→ 0 when defining a continuous-time process. Thequestion is, What happens to ∆h as ∆t→ 0?

Clearly, if the process is to be continuous, then ∆h→ 0 as ∆t→ 0.But how fast?

Moments:

E(∆x) = (p− q)∆hE[(∆x)2] = p(∆h)2 + q(−∆h)2 = (∆h)2

var(∆x) = E[(∆x)2]− [E(∆x)]2 = [1− (p− q)2](∆h)2 = 4pq(∆h)2


A continuous-time interval of length T can be divided into n = T∆t

discrete time-steps. Since each step is independent, we have

E(xT − x0) =T

∆t(p− q)∆h = T (p− q)

∆h

∆t(1)

var(xT − x0) =T

∆t· 4pq(∆h)2 = T · 4pq

(∆h)2

∆t(2)

Suppose ∆h ∼ ∆t (∆h goes to 0 at the same rate as ∆t). Then notice from (2) thatvar(xT − x0)→ 0. This is just the Law of Large Numbers. However, we wantvar(xT − x0) ∼ T . To get this, we assume

∆h = σ√

∆t

This implies (∆h)2

∆t→ σ2. If we also let

p =1

2

[1 +

α

σ

√∆t]

q =1

2

[1−

α

σ

√∆t]

then, as ∆t→ 0, the above binomial dist. converges to a Normal dist. (ie, Central LimitTheorem) and we find

xT − x0 → N(αT, σ2T ) Note, pq = 14(1− α2

σ2 ∆t)→ 14

This is what we want!KASA ECON 2021 - FINANCIAL ECONOMICS I

Comments & Questions:1 Notice that

∆x

∆t∼

∆h

∆t∼√

∆t

∆t∼

1√

∆t→∞ as ∆t→ 0

2 This means that the continuous-time limit process, x(t), is not differentiable. However, itis continuous. (Can you visualize a continuous, but nowhere differentiable process?)

3 Divide the continuous interval [0, T ] into discrete subintervalst0 = 0 < t1 < · · · < tn = T . Define the pth-Variation of the Wiener process as

V p = lim∆tk→0

n∑k=1

|Xtk −Xtk−1|p

Using the above results, one can show V 1 =∞, V 2 = T , and V p = 0 for p ≥ 3.

4 Thus, the ‘total variation’ or ‘length’ of the Wiener process is infinite, even over arbitrarilyshort time intervals. However, its ‘quadratic variation’ is finite, and equal to the length ofthe time interval. This reflects a trade-off between the number of steps, and the size ofeach step. As the number of steps goes to∞, the size of each step doesn’t go to zerofast enough to keep the sum of the products finite, but the squared step sizes do.

5 The fact that variations beyond order 2 equal 0 will be important when developing ournew stochastic calculus rules.

6 It turns out that the Wiener process is not special in this regard. All continuousmartingales have V 1 =∞ and V 2 = T (except for the degenerate case of aconstant).


What exactly does it mean for a discrete-time process to ‘converge’ to a continuous-timeprocess. (They live in different ‘spaces’!). We just showed their means and variancesconverged. What about their sample paths?

How do we know a random process is continuous? (It’s random!)

If Wt is not differentiable, in what sense (if any) can we use its innovations as theunderlying i.i.d. shocks driving a stochastic analog of a differential equation?

In discrete-time, we move interchangebly between sums and differences:

xt = xt−1 + εt discrete ‘differential eq.’

xt =t∑j=0

εj discrete ‘integral eq.’

This correspondence is like a discrete version of the Fundamental Theorem of Calculus:F (b)− F (a) =

∫ ba F′(x)dx.

Does a similar result apply to stochastic continuous-time processes? It’s not obvious,since dW

dtdoes not exist.


SUMMARY SO FAR

A Wiener process (or standard Brownian motion) is a continuous-timeprocess having the following 3 properties:

1 Continuous sample paths2 Stationary i.i.d. increments3 Wt ∼ N(0, t) ∀t [W0 = 0]

Therefore, over any discrete time interval, ∆t, we can write

∆Wt = εt√

∆t εt ∼ N(0, 1)

Notation: As ∆t→ 0, we write dW = ε√dt

⇒ E[dW ] = E[ε ·√dt] = 0

E[(dW )2] = E[ε2 · dt] = dt

More generally, we can define a Brownian motion with drift, µ, andvolatility, σ, as follows

xt = x0 + µ · t+ σWt


Upon first encountering this material, one sometimes gets theimpression that Wiener processes and Brownian motion are differentobjects, i.e., that Wiener processes are in some sense ‘more general’.A famous theorem of P. Levy shows that this is not the case, and thatwe can think of them as being synonymous.

Theorem: If a continuous-time process, xt, has continuous samplepaths with stationary i.i.d. increments, then it must be a Brownianmotion.That is, continuity + stationary i.i.d. increments⇒ NormalityHence, the 3rd part of the previous definition was not really required.

Implication: If the data indicate a process is non-Gaussian, then itcannot be a Wiener process. (However, you might be able to convert itinto one with a suitable transformation).


A CaveatBefore proceeding, we need to be aware of an important caveat to allour results. Consider the following silly (but useful) example:

Example: Let U by a r.v. which is uniformly distributed on [0, 1], andconsider the following 2 random processes on the continuous timeinterval [0, 1]:

Xt = 0 for all t ∈ [0, 1] Yt =

1 if t = U0 otherwise

Since Pr(U = t) = 0 ∀t, Xt and Yt have the same distributions.However, Xt and Yt are different processes. In particular,Pr(Xt = 0 ∀t) = 1 and Pr(Yt = 0 ∀t) = 0.

Implications:1 Since we are dealing with random things, there are no guarantees.2 All our statements about the properties of a random process mean they hold

‘almost everywhere’ or ‘almost surely’.3 In this way, we can say the above 2 processes are equal. (They only differ on a

set of measure zero).4 This qualifies our earlier claim about the continuity of a Wiener process - More

precisely, The sample paths of a Wiener process are almost surely continuous


WEAK CONVERGENCE/DONSKER’S THEOREM

In what sense do the sample paths of a discrete-time random walkconverge to those of a Wiener process?

In statistics, you learn that there are many ways we can define theconvergence of a random variable. One of them is:


Convergence in Distribution:Let Xn be a sequence of random variables with CDF Fn(Xn). We say that Xn converges indistribution to a r.v. X with CDF F (X) if limn→∞ |Fn(Xn)− F (X)| = 0 at all continuitypoints of F (X).

This is a statement about the prob. dists. of Xn and X, not the values of Xn and X

Weak convergence can be interpreted as a function space analog of convergence indistribution (ie, it applies to stochastic processes rather than random variables).

Definition: Let (Ω,F) be a ‘measure space’, where Ω is a set of continuous functions, F be aσ-algebra of its subsets, and µn be a sequence of prob. measures on (Ω,F). Then µnconverges weakly to µ (denoted µn ⇒ µ) if for each continuous, bounded function f we have∫f(x)µn(dx)→

∫f(x)µ(dx)

Equivalently, if we let Xn be the random process associated with µn (ie,µn(A) = Pr(Xn ∈ A)), then weak convergence implies Ef(Xn)→ Ef(x).

Theorem (Donsker): Let εt ∼ i.i.d. with Et = 0 and Eε2t = 1. LetXT = ε1 + ε2 + · · ·+ εT be the T th partial sum. Form the following linear continuous-timeinterpolation of XT on the interval [0, 1]

Wnt =

1√nXm if t = m

n(0 ≤ m ≤ n)

linear if t ∈ [mn, m+1

n)

Then, as n→∞, Wnt ⇒Wt, where Wt is a Wiener process.


Comments:

Note that no assumption was made about the distribution of εt.

Believe it or not, this is really a useful and practical result. Why?Because the function f can be used to define many useful propertiesof a path (e.g., the maximum value reached before time T ).

Donsker’s Theorem then tells us that the sample path properties of arandom walk can be approximated by those of a Wiener process (andvice versa). This means computer programs can be used toapproximate the path properties of Wt (or, going the other way)analytically derived results for Wt approximate those for Xt.


BACKWARD & FORWARD EQUATIONS

Markov processes are fully characterized by their transitionprobabilities (and an initial condition).

Example: A discrete-state/discrete-time process (called a MarkovChain) is characterized by its transition probability matrix, Pij .

A standard Wiener process is Markov, and from our results so far, itstransition probabilities are given by:

f(y, t|x, s) =1√

2π(t− s)exp

−(y − x)2

2(t− s)

= prob. Wt will be at y at t given it starts at x at s < t

Taking derivatives, one can verify that f is the solution of the following2 (partial) differential equations:

∂f

∂s= −

1

2

∂2f

∂x2Backward Eq.

∂f

∂t=

1

2

∂2f

∂y2Forward Eq.


The backward eq. conditions on a future time and value, anddescribes how the expectation of this value changes with differentinitial conditions. It is solved ‘backwards’. Option pricing uses thebackward eq., since we know the value at expiration if we know theterminal price of the underlying asset.

The forward eq. conditions on an initial distribution, and describes howthis distribution evolves over time. It is solved ‘forwards’. It can be usedto find the long-run stationary distribution of a stochastic process.

Mathematically, the backward and forward eqs. are duals (adjoints) ofeach other.

The forward eq. describes the ‘diffusion’ of particles subject to randommolecular motion. It is sometimes called the ‘heat eq.’. The backwardeq. is just the heat eq. with time running backwards.

Later we derive more general versions of these equations using adifferent method, based on Ito’s Lemma.


‘WHITE NOISE’ & STOCHASTIC DIFF. EQS.Discrete-time processes are built from discrete-time i.i.d. random variables, εt

Xt =

propagation mech.︷︸︸︷α1Xt−1 + α2Xt−2 +

impulse︷︸︸︷εt

It would be nice if the same were true for cont.-time processes

The ‘sum’ of a cont.-time i.i.d. process makes sense. That’s what a Wiener process is:Wt =

∫ t0 εsds.

Problem: We can’t differentiate this to get: dWdt

= ε. Wt is too eratic. The stochasticanalog of the Fundamental Theorem of Calculus breaks down. We need to develop newcalculus rules (known as ‘stochastic calculus’).

For now, I will just alert you to the fact that there is a mathematically sophisticated way ofdefining the above derivative by appealing to the notion of generalized functions, whichare motivated by the Dirac δ-function

δ(x) =

+∞ x = 00 x 6= 0

∫ ∞−∞

δ(x)dx = 1

⇒∫∞−∞ f(x)δ(x)dx = f(0)

Engineers & physicists use these all the time. Using δ-functions we can say that dWdt

“exists”, and engineers call it “white noise”.


STOCHASTIC DIFFERENTIAL EQUATIONS

Since dWdt

does not exist (in a conventional sense), we approachSDEs via the following integral eq.

Xt = X0 +

∫ t

0

µ(Xs)ds+

∫ t

0

σ(Xs)dWs

The first integral is a conventional Riemann integral. The 2nd integralrequires special treatment due to the erratic behavior of Wt.

If the integrand σ(·) is ‘well behaved’ (eg, independent of W ) then the2nd integral can be evaluated ‘as usual’. For example, if σ is constant,we have

∫ t

0σdWs = σWt (since W0 = 0). However, if the integrand

is random (as if often is in finance) then we must be careful.

Note, since dW is random, we must remember that the integral is arandom variable, so when discussing convergence of discrete sumswe must adopt a probabilistic notion of convergence.


THE ITO INTEGRAL

Let’s start with an example. Suppose we want to evaluate∫ t

0

WsdWs

Consider the following two Riemann sum approximations

I1 = lim∆tk→0

n−1∑k=0

Wtk [Wtk+1−Wtk ]

I2 = lim∆tk→0

n−1∑k=0

Wtk+1[Wtk+1

−Wtk ]

The only difference here is that I1 evaluates the integrand at theleft-hand endpoint of each interval whereas I2 evaluates at the rightendpoint. If W were a deterministic function of time, both wouldconverge to the same result: 1

2W 2

t (since W0 = 0). [To show this use the result∑nk=1 k = (1/2)n(n+ 1)]


When Wt is random, however, it matters where we evaluate the integrand. Using thefact that Wt has independent mean zero increments and the law of iteratedexpectations, we can see that E(I1) = 0. At the same time, we can easily see thatE(I2) = t.

The Ito integral is defined by the assumption that we evaluate at the left endpoint. Aslong as the integrand is ‘nonanticipating’ (ie, measurable w.r.t. info available at the leftendpoint) and ‘square integrable’ (to make sure sums converge) this will ensure that theIto integral is a martingale.

Evaluating at the left endpoint also makes sense economically. It allows us to useintegrals to sum up continuous-time trading profits when agents must choose portfoliosbefore ‘future’ prices are known.

In this simple example, we could actually evaluate the limit of the Riemann sum by bruteforce. However, it is easier to use the fact that by construction the Ito integral is amartingale. Since E(W 2

t ) = t, we can infer that∫ t0 WsdWs = 1

2(W 2

t − t).

More precisely, remembering that the integral is random, one can show that I1converges in mean-square to 1

2(W 2

t − t). (That is, the expected squared deviationsbetween the two goes to zero as the intervals go to zero).


ITO’S LEMMA

As in regular calculus, one rarely evaluates stochastic integrals usingRiemann sums. Instead, we look for an anti-derivative. Notsurprisingly, taking derivatives of a random process also producessome differences from classical calculus.

Ito’s lemma shows us how to take derivatives of functions of a Wienerprocess. It can be interpreted as a stochastic analog of the ChainRule. It has 3 main uses in financial economics

1 It allows us to calculate expected continuation values, E[dV ], in continuous-timedynamic programming problems. We use DP to study optimalconsumption/portfolio problems in various settings.

2 In asset pricing problems, the SDF process is a function of underlying macrovariables that follow Ito processes [eg., Mt = e−δtU ′(Ct)]. Ito’s lemmaenables us to infer the stochastic properties of the SDF, given the stochasticproperties of the underlying macro variables.

3 Derivatives prices are functions the underlying price, C(S, t), where S follows anIto process. To calculate its value we use Ito’s lemma.


Rather than proceed in the abstract, let’s consider an example.Suppose Xt follows the Brownian motion with drift process

dX = µ · dt+ σdW

Remember, we cannot write: dXdt

= µ+ σ dWdt

. The above equation issimply shorthand notation for the integral

Xt = X0 + µ

∫ t

0

ds+ σ

∫ t

0

dWs

(Since σ is constant here, no special treatment of the integral isrequired).

Suppose we have some function, F (t,X), which depends on X (andt), and we want to approximate dF = total differential. Normally, a1st-order approximation would be:

dF ≈∂F

∂xdX +

∂F

∂tdt


However, since dX ∼√dt, we must go out to 2nd order in the Taylor series to pick-up

all the first-order terms in dt

dF ≈∂F

∂xdX +

∂F

∂tdt+

1

2

∂2F

∂X2(dX)2 +

1

2

∂2F

∂t2(dt)2 +

∂F

∂X∂tdX · dt

Note that the final two terms are of order higher than dt, and so can be dropped.However, note that

(dX)2 = µ2(dt)2 + 2µσdt · dW + σ2(dW )2

≈ σ2dt

This gives us Ito’s Lemma:

dF =∂F

∂XdX +

∂F

∂tdt+

1

2σ2 ∂

2F

∂X2dt

=

(µ ·

∂F

∂X+

1

2σ2 ∂

2F

∂X2+∂F

∂t

)dt+ σ

∂F

∂XdW


The extra 12σ2 ∂2F

∂X2 dt term in Ito’s Lemma can be interpreted as a Jensen’s inequalitycorrection, reflecting the fact that the expectation of a nonlinear function 6= the nonlinearfunction of the expectation. As always, its importance is increasing in the variance. Notethat if F is concave, the Ito/Jensen correction reduces the drift of the process, while if Fis convex, it increases the drift.

Examples:1 Suppose dX = µ · dt+ σdW and define Z = F (X) = eX . Note,

Fx = eX = Z

Fxx = eX = Z

Ft = 0

Plugging into Ito’s lemma we get

dZ = dF = Fx · dX +1

2σ2Fxxdt

= Z(µdt+ σdW ) +1

2σ2Zdt

= (µ+1

2σ2)Z · dt+ σZdW

This example is used all the time in financial economics. [Z is called geometric Brownian motion]

2 Suppose dX = µX · dt+ σX · dW and define Z = F (X) = log(X). Note,

Fx = 1/X

Fxx = −1/X2

Ft = 0


Plugging into Ito’s lemma we get

dZ = dF = Fx · dX +1

2σ2X2(−1/X2)dt

=1

X(µXdt+ σXdW )−

1

2σ2dt

= (µ−1

2σ2) · dt+ σdW

This example is also used all the time in financial economics.

Suppose dCC

= µdt+ σdW , and consider the SDF processM = F (t, C) = e−δtU ′(C). Ito’s lemma allows us to infer the following process forthe SDF

dM = Ftdt+ FcdC +1

2σ

2C

2Fccdt

= −δe−δtU′(C)dt+ e−δt

U′′

(C)[µCdt+ σCdW ] +1

2σ

2C

2e−δt

U′′′

(C)dt

⇒ dMM

=[−δ +

(CU′′

U′

)µ+ 1

2σ2(C2U′′′

U′

)]dt+ σ

(CU′′

U′

)dW

Once again, this result is used often in financial economics.


SOLUTIONS OF SDES

Consider the SDE

dX = µ(t,X)dt+ σ(t,X)dW (3)

What does it mean to ‘solve’ this equation? Unlike solving adeterministic ODE, we cannot expect to find a unique path thatsatisfies the equation. Instead, we look for a stochastic process whosestatistical properties replicate those of equation (3).

One possible strategy is to express it in its more appropriate integralform: Xt = X0 +

∫ t

0µ(s,Xs)ds+

∫ t

0σ(s,Xs)dWs and attempt to

evaluate the integrals.

In practice, it is more common to search for a function, Xt = F (t,X),such that when Ito’s Lemma is applied to it, we get back the same drift,µ(t,X), and diffusion, σ(t,X), coefficients as in eq. (3).


Since the drift and diffusion coefficients fully characterize the statistical properties of aprocess, doing this will ‘solve’ the equation.

However, there is a subtlety here that goes back to our discussion of martingales. Weknow that Ito integrals are martingales, and that martingales are always defined relativeto an information set. Let Ft be the filtration generated by the Brownian motion in eq.(3). Then if our solution is Ft-adapted, we have what is called a ‘strong solution’. If it isadapted to some other information set, we have a ‘weak solution’. The statisticalproperties are the same either way, but our assumptions about the available informationcould be different.

Example:Suppose we want to solve for the geometric Brownian motion process

dX

X= µdt+ σdW (4)

where µ and σ are constants. It’s not clear how we should evaluate the Ito integral∫XsdWs,

since we don’t know what X is yet! So let’s use the indirect/Ito’s lemma approach. Let’s positthe form

Xt = X0e(µ− 1

2σ2)t+σWt (5)

Applying Ito’s lemma to this, one can verify that the resulting drift is µX and the resultingdiffusion coefficient is σX. Hence, eq. (5) ‘solves’ eq. (4). If the Wt process in (5) is the sameas in (4), then we have a strong solution.


AN EXISTENCE THEOREM

In the case of deterministic ODEs there are well known conditions that ensure existence(and uniqueness) of solutions. (Before looking for something, it’s useful to know that itexists)

The following theorem provides conditions for the existence and uniqueness of solutionsto SDEs.

Theorem: Assume the drift and diffusion coefficients are measureable functions satisfying thefollowing two conditions

|µ(t,X)|+ |σ(t,X)| ≤ C1(1 + |X|) ∀X, t ∈ [0, T ]

|µ(t,X)− µ(t, Y )|+ |σ(t,X)− σ(t, Y )| ≤ C2|X − Y | ∀X, Y, t ∈ [0, T ]

where C1 and C2 are given constants. Then the SDE

dX = µ(t,X)dt+ σ(t,X)dW

has a unique square-integrable solution.

The 1st condition is called a linear growth condition. The 2nd condition is called aLipschitz condition.

These are sufficient conditions, and are often too strict for finance applications.However, one case is noteworthy - if the drift and diffusion coefficients are indpt. of t andaffine functions of X, then this theorem tells us a unique solution exists.


LOCAL MARTINGALES

Earlier I claimed that Ito integrals produce martingales. That is,

Mt = M0 +

∫ t

0

θsdWs

is a martingale as long as θt is adapted to the filtration generated byWt, (Ft).

According to the Martingale Representation Theorem, this is acompletely general way to define martingales, i.e., all (continuous)martingales adapted to Ft are Ito integrals for some θt process.

However, it turns out this is not quite correct. Ito integrals only definelocal martingales. Without further restrictions on θt, there is noguarantee they are global martingales. A sufficient condition thatensures a local martingale is a global martingale is the following

E

[∫ T

0

θ2sds

]<∞


DOUBLING STRATEGIES

A good way to see the relevance of this distinction is to consider thefollowing doubling strategy:

Consider an infinite sequence of discrete dates within some interval [0, T ]. For example,0 = t0 < t1 < · · · < T with tn = nT/(n+ 1) as n→∞. Suppose you can bet on thetoss of a fair coin at each tn, winning if it shows heads. Suppose your initial bet is $1, and youdouble your bet each time until you win. If you win, you quit. From the Law of Large Numbers,you win with probability 1, and E[WT ] = W0 + 1 6= W0. Hence, your wealth is not amartingale, even though each bet is fair and the expected (local) change in your wealth is 0.

What’s going on here? How can the sum of a bunch of zeros add up to1? Three things are crucial: (1) You can place an infinite number ofbets, (2) Your wealth is allowed to become unboundedly negative, and(3) You are allowed to place an infinitely large bet. Relax any of theseand your strategy is a loser: E(WT ) < W0. The only way you cangenerate a positive global mean from a sequence mean zero bets is ifthere is some probability of an infinite loss.

Lesson: In continuous-time finance, ruling out arbitrage requiresplacing (weak) restrictions on an agent’s portfolio or wealth.


econ 2021 - financial economics ikkasa/ec2021_lecture3r.pdf · 2018-09-06 · note, in economics we...

Documents