Perturbation and Symmetry Techniques Applied to Finance

A Dissertation submitted to obtain the title of Doctor of Philosophy (Dr. rer. pol.)

from the Frankfurt School of Finance & Management. Specialty: Quantitative Finance

Submitted by: Stephen Taylor

Dissertation Advisor: Jan Vecer

March 2010

Declaration of Authorship: I hereby certify that, unless otherwise indicated in the text or references, or acknowledged, this dissertation is entirely the product of my own work.

Stephen Taylor


Contents

1. Introduction
2. Background Material
2.1. Probability Theory and Stochastic Processes
2.2. Stochastic Differential Equations
2.3. Financial Applications of Stochastic Differential Equations
2.4. Riemannian Geometry
2.5. Finance/Geometry Relation
2.6. Heat Kernel Expansion Formula
3. Local Volatility Model Applications
3.1. Black-Scholes-Merton Example
3.2. Constant Elasticity of Variance Model
3.3. Quadratic Local Volatility Model
3.4. Cubic Local Volatility Model
3.5. Affine-Affine Short Rate Model
3.6. Generalized CEV Model
4. Two Dimensional Stochastic Volatility Models
4.1. Geometry of a Class of Two Dimensional Stochastic Volatility Models
4.2. Geometry of Extended Stochastic Volatility Models
4.3. The History of SABR Approximations with Examples
4.4. Heat Kernel Methods for the SABR Model
4.5. Paulot's Conversion Technique
4.6. Lee's Moment Formula
4.7. The Heston Model
4.8. General Class of Stochastic Volatility Models
4.9. Higher Dimensions and Limitations
4.10. An Alternative Perturbation Theory for Stochastic Volatility
5. Symmetries of Finance Equations
5.1. Introduction to Symmetry Analysis
5.2. Correspondence with the Standard Theory
5.3. Symmetry Analysis of the Black-Scholes-Merton Equation
5.4. Symmetries of the Classical Asian Equation
5.5. Symmetries of Non Standard Asian Equations
5.6. Symmetries of New Asian Equations
References


1. Introduction

A significant portion of the mathematical finance literature is devoted to constructing exact pricing formulas for a variety of contingent claims whose underlying assets are assumed to evolve according to a specified stochastic process. In particular, the Black-Scholes-Merton [18], [119], [120] pricing formula for a European call option on an asset which undergoes lognormal dynamics has served as a cornerstone for much of the development of modern pricing theory. Although the lognormal model has long been a guide for gaining insight into how derivative prices depend on market parameters, it is often too simplistic for practical use and has many drawbacks. Specifically, asset prices generally do not follow lognormal dynamics, and one is typically unable to calibrate this model in a manner that is consistent with current market option prices.

These issues, amongst others, led Dupire [48] to consider the more general class of local volatility models. A local volatility model allows an asset process to evolve by a one-dimensional Ito process, where the diffusion coefficient function is chosen so that the model's present-day European call prices agree with current market quotes. Unfortunately, one cannot determine the transition density for a general Ito process in closed form (doing so is equivalent to constructing an explicit formula, up to quadrature, for a call option). There are only a handful of local volatility models where it is possible to derive an expression for the transition density given by a composition of elementary and special functions. Constructing such a solution is equivalent to solving a single linear parabolic equation of one spatial and one temporal variable. Generally, as the functional form of the diffusion coefficient becomes more complicated, the prospect of finding an exact solution to the associated transition density equation diminishes.

In lieu of exact formulas, there are several widely used approximate pricing techniques. For example, tree pricing, Monte Carlo, and finite difference methods are currently the most popular approximation methods used in practice. Alternatively, one can attempt to construct explicit approximation formulas for the transition density of a local volatility process that generically will only be valid in certain model parameter regimes. A variety of approximation techniques have been applied to one-dimensional local volatility models (cf. Corielli et al. [39], Cheng et al. [31], Kristensen and Mele [97], and Ekstrom and Tysk [52], for example). The advantage of such formulas is that they are considerably more computationally efficient than their previously mentioned counterparts and, at times, retain qualitative model information in their functional form. The main drawback of these approximations is that they can be inaccurate for certain model parameters, and a thorough error analysis against an established approximation method is usually required prior to real-world use.

The literature regarding explicit approximation formulas, which are generically constructed using different types of perturbation theory, has been growing over the past several years. In particular, heat kernel perturbation theory has proven to be effective in creating some of the most accurate approximation formulas to date. Roughly speaking, heat kernel perturbation theory utilizes a Taylor series expansion ansatz in time together with a geometrically motivated prefactor which is chosen to simplify subsequent calculations. The heat kernel ansatz is most naturally stated using the language of differential geometry. Hagan and Lesniewski [78], [79], [80], [106] were the first to introduce differential geometric methods in finance. Henry-Labordere [98], [99], [101] used heat kernel methods to construct implied volatility formulas for local volatility models, the SABR model, and the SABR Libor Market Model. More recently, Gatheral et al. [64] have also considered heat kernel expansions in the context of local volatility models. Improving upon the work of Henry-Labordere, Paulot [134] constructed (to our knowledge) the most accurate explicit implied volatility formula for the SABR model. Forde [58, 59] has made this work rigorous and extended it to a larger class of stochastic volatility models. We refer the reader to Medvedev's work [114] for a list of several references to the finance perturbation theory literature.

There are two main advantages of heat kernel methods over other forms of perturbation theory. First, the only error in a formula constructed solely using a heat kernel ansatz is due to a Taylor series assumption in time. In particular, an expression for a transition density constructed using heat kernel perturbation theory will increase in accuracy as time decreases. Consequently, implied volatility smile approximation formulas constructed from a heat kernel expansion tend to be more accurate at out-of-the-money strikes than others constructed with alternative formalisms that are perturbative in both time and strike. Second, the heat kernel ansatz solves its associated parabolic equation exactly to zeroth order in time. Thus, in some sense, one does not carry along zeroth order complexities when trying to establish a first order correction for the transition density of a process.

Heat kernel perturbation theory is not without its limitations. Even in the case of two-dimensional models, the heat kernel ansatz cannot, in general, be expressed explicitly. This is due to the fact that one cannot obtain an explicit distance function on a generic two-dimensional geometry. There are other techniques in differential equations which one can use to attempt to reduce the complexity of a given pricing equation. One such technique is Lie symmetry analysis.

Lie symmetry analysis provides a means to reduce the dimension of a partial differential equation by using the symmetry of the equation to construct a natural coordinate system in which it takes a simpler form. This can be particularly useful in finance, as one can reduce the dimension of a given pricing equation and then numerically simulate the reduced equation at a significantly faster speed than the original.

We study the symmetry groups of several equations relevant to finance, including the heat equation, the Black-Scholes-Merton equation, the classical Asian equation in the lognormal model, and several related Asian type equations. We examine both the heat and Black-Scholes-Merton equations in order to demonstrate our method for computing symmetry groups of partial differential equations. We then apply this method to the standard Asian equation and find many interesting symmetries which allow one to reduce the equation to a lower dimensional problem.

In particular, we first review the construction of the symmetry group of the Asian equation (cf. Glasgow and Taylor [154]) and show that there are several ways one can reduce this equation by one spatial dimension. We next consider several additional equations which can be constructed from the Asian equation by a change of numeraire. Finally, we examine several relations between these different equations.

This work is organized in the following manner: In section 2, we provide basic conventions and background and review the construction of the heat kernel ansatz. Specifically, we first review topics in probability theory, stochastic differential equations, and Riemannian geometry which we will need throughout the remainder of the text. In addition, we review the heat kernel ansatz for a partial differential equation and then demonstrate how it can be used to construct explicit approximation formulas for the transition density of stochastic processes. Such formulas can be used in conjunction with integral approximation techniques to enable one to construct explicit formulas for path independent derivatives.

In section 3, we state the explicit form of the heat kernel ansatz in one dimension. We then apply this ansatz to several local volatility models. We first consider the Black-Scholes-Merton model, where we show that one can use the heat kernel method to construct the exact solution of the European call equation whose underlying evolves according to lognormal dynamics. Thus we provide a novel, although somewhat roundabout, way of solving the Black-Scholes equation. Next, we examine the CEV local volatility model. We construct an n-th order approximation formula for this model and comment on convergence issues related to the resulting formula as well as their implications for the related formulas for the SABR model. Next, we look at the quadratic local volatility model. Here, we are able to derive the exact transition density for this model by using heat kernel perturbation theory to solve the relevant pricing PDE to arbitrarily high order and then inverting a related Taylor series to construct an exact solution. We then turn to three new local volatility models called the cubic local volatility model, the affine-affine model, and the generalized CEV model. We are able to accurately approximate the transition densities of each of these models in certain model parameter regimes.

In section 4, we turn our attention to two-dimensional stochastic volatility models. We first consider a correspondence between stochastic volatility models and related constructions in differential geometry. We next focus on the SABR model. This stochastic volatility model is widely used in industry across asset classes. It is popular because Hagan et al. [77] constructed an explicit implied volatility formula for European call options in terms of SABR model parameters. However, this formula is known to degenerate for certain parameter values, which include strike values for options that are highly out of the money. There are other methods to construct implied volatility approximation formulas for this model. Currently, the single most accurate explicit implied volatility formula for this model was constructed by using heat kernel perturbation theory. We review, and partially extend, the work of Paulot [134], in which the heat kernel method is applied to the SABR model. Next, we show that the resulting approximation method satisfies Lee's moment formula. After this, we look at a similar construction in the Heston model and examine a class of stochastic volatility models to which one can also apply heat kernel methods in order to construct explicit implied volatility formulas. Lastly, we consider an alternative type of perturbation for stochastic volatility models and make connections between these and certain constructs in geometry.

Finally, in section 5, we show how Lie symmetry analysis can be applied to pricing problems in finance. We first review the subject as well as our method for computing the symmetry group of a given differential equation. We then demonstrate this method in the case of the heat and Black-Scholes-Merton equations. Next, we turn to the two-dimensional Asian equation whose underlying evolves according to lognormal dynamics, and compute its full symmetry group. This leads to several dimensional reductions of the Asian equation which can be used to speed up the pricing of Asian options. We next analyze the symmetry groups of several other Asian type equations which are related to the original equation via a change of numeraire.

Acknowledgements: There are many people who have helped me throughout my stay in academia that I would like to acknowledge. In chronological order, I would like to thank Ronald Selby for teaching me calculus at my secondary school. I thank Michael Dorff and Eric Hirschmann for their time in advising both of my Master's theses and teaching me about complex analysis, geometry, and relativity. Next, I would like to thank Scott Glasgow for teaching me many things about differential equations, helping me understand the research process, and introducing me to the world of mathematical finance. I would also like to thank my past graduate student colleagues Aaron Petersen and Marcelo Disconzi for many fruitful discussions about a variety of topics.

There are also many people at Bloomberg who have helped me shape portions of this work. I thank Fabio Mercurio for playing a significant role by suggesting research topics as well as teaching me many things about interest rate models. I would like to thank all my colleagues in Bruno Dupire's Quantitative Finance Research group for many useful comments related to this work, as well as for continually teaching me many things on a variety of topics in real-world finance.

Finally, I would like to thank my dissertation advisor Jan Vecer for educating me about how numeraire techniques can be applied in finance, often in a manner which allows one to considerably simplify pricing problems. I would also like to thank him for suggesting many research questions regarding Asian options and taking the time to serve as my advisor. Also, I would like to thank the Frankfurt School for providing me with the means to complete my doctoral work.

I would lastly like to thank my parents for encouraging me to pursue higher education as well as instilling in me a strong desire to continually learn new things.


2. Background Material

We first summarize background material in probability theory that we will use throughout the remainder of the text. Next, we recall relevant topics related to stochastic differential equations and Riemannian geometry. We then combine these results to construct the heat kernel expansion ansatz for differential equations.

2.1. Probability Theory and Stochastic Processes. We first review basic aspects of probability theory and stochastic processes which will be utilized throughout the remainder of the text. In particular, we discuss Brownian motion, stochastic processes constructed from Brownian motion, and the probability and transition density functions associated to these stochastic processes. Our aim is not to be rigorous or thorough, and we defer to Borodin-Salminen [20], Evans [53], Jones [90], Joshi [91], Oksendal [130], Rudin [142], and Shreve [144] for deeper discussions of these topics.

We start with the notion of a probability space. Probability spaces are the fundamental background objects upon which stochastic processes are defined. A probability space is a measure space $(\Omega, \mathcal{F}, \mathbb{P})$ where $\Omega$ is a set called the event space, $\mathcal{F}$ is a sigma algebra over $\Omega$, and $\mathbb{P}$ is a probability measure, i.e. a measure on $\Omega$ that satisfies the additional condition $\mathbb{P}(\Omega) = 1$. Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a real-valued one-dimensional random variable is a function $X : \Omega \to \mathbb{R}$ that is $(\mathcal{F}, \mathcal{B}(\mathbb{R}))$-measurable; that is to say, $X^{-1}(B) \in \mathcal{F}$ for every $B \in \mathcal{B}(\mathbb{R})$, where we let $\mathcal{B}(\mathbb{R})$ denote the collection of Borel subsets of $\mathbb{R}$. For any random variable $X$, there is an associated cumulative distribution function $F_X : \mathbb{R} \to [0, 1]$ defined on intervals $[a, b] \subset \mathbb{R}$ by $F_X([a, b]) = \mathbb{P}(a \le X \le b)$. If there is a function $p : \mathbb{R} \to \mathbb{R}^+$ such that

(2.1) $F_X([a, b]) = \int_a^b p(x)\, dx$

for any $a, b \in \mathbb{R}$ with $a \le b$, then $p(x)$ is called the density function (or probability density function) associated to $X$. If $X$ is a standard normal random variable ($X \sim N(0, 1)$), then we denote its density and distribution functions by

(2.2) $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ and $\Phi(x) = \int_{-\infty}^{x} \phi(u)\, du$.

If a random variable $X$ admits a density function $p$, then one is able to compute many important quantities associated to $X$ by evaluating Riemann integrals which involve $p$. In particular, the expected value functional is defined by

(2.3) $\mu(X) = E[X] \equiv \int_\Omega X(\omega)\, d\mathbb{P}(\omega) = \int_{\mathbb{R}} x\, p(x)\, dx$.

The moments of $X$ are also given by similar expressions. In particular, one can compute the n-th central moment of $X$ by evaluating

(2.4) $\mu_n(X) = E[(X - E X)^n] = \int_{\mathbb{R}} (x - \mu)^n p(x)\, dx$.

Here $\mu_0 = 1$, $\mu_1 = 0$, and $\mu_2 = \mathrm{Var}(X)$, where $\mathrm{Var}(X)$ is the variance of $X$; also, the third and fourth central moments, after normalization by the appropriate power of the standard deviation, define the skewness and kurtosis of $X$.
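As a quick numerical illustration of (2.1) through (2.4), the following Python sketch evaluates the standard normal density and distribution function and approximates central moments with a trapezoid rule. The function names, the truncation of the real line to [-12, 12], and the step count are illustrative assumptions and are not taken from the text.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate(g, lo, hi, steps=20000):
    """Composite trapezoid rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, steps):
        total += g(lo + k * h)
    return total * h


def central_moment(density, n, lo=-12.0, hi=12.0):
    """n-th central moment of a density; the real line is truncated to [lo, hi] (an assumption)."""
    mean = integrate(lambda x: x * density(x), lo, hi)
    return integrate(lambda x: (x - mean) ** n * density(x), lo, hi)
```

For the standard normal density one expects the zeroth, second, and fourth central moments to be close to 1, 1, and 3, and the integral of `phi` over an interval to agree with the corresponding difference of `Phi` values.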

Stochastic processes are collections of random variables on a fixed probability space. More specifically, a stochastic process $X_t$ is a continuum of random variables $\{X_t \mid t \in [a, b] \subset \mathbb{R}\}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The only stochastic processes we consider will be constructed from Brownian motions (Wiener processes) and deterministic quantities. Thus we will not enter into the realm of Levy processes or general semimartingales. A Brownian motion is a stochastic process $W_t$ on a time interval $\mathcal{T} = [0, T]$ which initially takes the value $W_0 = 0$, has almost surely continuous paths, and has independent, normally distributed increments: for $s, t \in \mathcal{T}$ with $s < t$, $W_t - W_s \sim N(0, t - s)$, so that, conditional on $W_s$, $W_t \sim N(W_s, t - s)$. Given a Brownian motion $W_t$, a filtration $\mathcal{F}(t)$ for $W_t$ is a continuum of sigma algebras such that $\mathcal{F}(t_1) \subset \mathcal{F}(t_2)$ for $t_1 < t_2$, $W_t$ is $\mathcal{F}(t)$-measurable, and the increments $W_u - W_t$ for $u > t$ are independent of $\mathcal{F}(t)$. A filtration can be viewed as a refinement of a sigma algebra over time, and $\mathcal{F}(t)$ can be interpreted as the totality of all information known at time $t$, which monotonically increases as time progresses. We will use the notation $E[X \mid \mathcal{F}(s)]$ to represent the conditional expected value of a process $X$ given the information available at time $t = s$.

Brownian motions have many properties which make them somewhat amenable processes to work with. One of these is the martingale property $E[W(t) \mid \mathcal{F}(s)] = W(s)$ for $s \le t$, which demonstrates that a Brownian motion has no tendency on average to rise or fall from its known level at time $s$ and has trivial expected value if $s = 0$. If we let $W_t$ evolve from $t = 0$ to $t = t_1$ and find it has value $W_{t_1} = w_1$, then, conditional on this information, $W_t \sim N(w_1, t - t_1)$ for all $t > t_1$. Note that the variance of $W_t$ grows like $t - t_1$ in time (so its standard deviation grows like $\sqrt{t - t_1}$); hence, a Brownian motion is diffusive in nature.

We can construct new processes from Brownian motions by rescaling $W_t$ and summing it together with a deterministic function. For example, let $f_t = f(t)$ be a measurable function and consider the process $a_t = f_t + \sigma W_t$ for some constant $\sigma$. Then $\sigma$ rescales the variance of the Brownian motion and $f_t$ introduces a deterministic drift term. Viewed from time zero, the expected value of $a_t$ is just that of $f_t$, i.e. $E(a_t) = E(f_t) = f_t$. One can choose values of $\sigma$ in $a_t$ to suitably adjust the inherent randomness of the process. For $\sigma \to 0$, the process will be mostly deterministic and accurately approximate $f(t)$, whereas when $\sigma$ becomes large, graphs of the process will become increasingly randomly perturbed about the graph of $f$.
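The construction of a deterministic function plus a rescaled Brownian motion can be sketched numerically. The Python fragment below samples paths of such a process on a uniform time grid by summing independent Gaussian increments, then estimates the mean and variance of the terminal value. The function names, the grid, and the seed handling are illustrative assumptions; for a drift function f and scale sigma, one expects the terminal mean to be near f(t) and the terminal variance near sigma squared times t.

```python
import math
import random


def simulate_drifted_bm(f, sigma, t_end, n_steps, rng):
    """Return one sampled path of a_t = f(t) + sigma * W_t on a uniform grid."""
    dt = t_end / n_steps
    w = 0.0  # W_0 = 0
    path = [f(0.0)]
    for i in range(1, n_steps + 1):
        # Brownian increments over disjoint intervals are independent N(0, dt).
        w += rng.gauss(0.0, math.sqrt(dt))
        path.append(f(i * dt) + sigma * w)
    return path


def terminal_mean_and_variance(f, sigma, t_end, n_steps, n_paths, seed=0):
    """Sample mean and unbiased sample variance of the terminal value a_{t_end}."""
    rng = random.Random(seed)
    finals = [simulate_drifted_bm(f, sigma, t_end, n_steps, rng)[-1]
              for _ in range(n_paths)]
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / (n_paths - 1)
    return mean, var
```

Taking sigma close to zero in this sketch reproduces the graph of f almost exactly, while larger values scatter the sampled paths more widely around it.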

We now wish to construct another important family of stochastic processes called Ito processes, which are fundamental objects in continuous time finance and also the most general processes that we consider. Let $\Pi = \{t_0, \ldots, t_n\}$ be a partition of the interval $[0, t]$ with $t_i < t_{i+1}$, $t_0 = 0$, and $t_n = t$, and for a given deterministic process $\sigma_t$, i.e. a function, consider the stochastic process

(2.5) $I_n(t) = \sum_{i=1}^{n} \sigma_{t_{i-1}} (W_{t_i} - W_{t_{i-1}})$,

where we note that $\sigma$ is evaluated at the left endpoints of the subintervals of $\Pi$. The left endpoint evaluation of the integrand is important as it makes the process non-anticipative by construction. In fact, many of the useful properties of Ito processes which make them desirable tools for financial modeling stem from this aspect of their definition.

Taking a limit of any refining sequence of partitions that simultaneously satisfy $\|\Pi\| \to 0$ and $n \to \infty$ allows us to construct a new unique (modulo measure zero sets) stochastic process

(2.6) $I_t = \lim_{\|\Pi\| \to 0} \sum_{i=1}^{n} \sigma_{t_{i-1}} (W_{t_i} - W_{t_{i-1}}) \equiv \int_0^t \sigma(s)\, dW_s$.

The process $I_t$ is called an Ito integral, and it is an example of an Ito process $I_t$ defined by

(2.7) $I_t = I_0 + \int_0^t \mu(s)\, ds + \int_0^t \sigma(s)\, dW(s)$,

where $\mu \in L^1[0, t]$ and $\sigma \in L^2[0, t]$ are deterministic functions; here $L^1$ and $L^2$ denote the usual $L^p$ function spaces. Ito processes are the basic dynamical objects used to model the evolution of continuous time financial quantities.
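The left-endpoint sums of (2.5) are straightforward to evaluate numerically. The sketch below, with hypothetical function names and a uniform partition chosen purely for illustration, draws Brownian increments and accumulates the sum for a deterministic integrand. For a deterministic integrand the limiting Ito integral is a mean-zero Gaussian whose variance is the integral of the squared integrand (the Ito isometry), which gives a simple sanity check.

```python
import math
import random


def ito_left_sum(sigma, t_end, n_steps, rng):
    """Left-endpoint sum of sigma(t_{i-1}) * (W_{t_i} - W_{t_{i-1}}) on a uniform grid."""
    dt = t_end / n_steps
    total = 0.0
    for i in range(n_steps):
        # The integrand is evaluated at the left endpoint t_i = i * dt,
        # before the increment over [t_i, t_{i+1}] is drawn (non-anticipative).
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += sigma(i * dt) * dw
    return total


def sample_mean_and_variance(sigma, t_end, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of the mean and variance of the left-endpoint sum."""
    rng = random.Random(seed)
    draws = [ito_left_sum(sigma, t_end, n_steps, rng) for _ in range(n_paths)]
    mean = sum(draws) / n_paths
    var = sum((x - mean) ** 2 for x in draws) / (n_paths - 1)
    return mean, var
```

With the integrand sigma(s) = s on [0, 1], the estimated variance should be close to 1/3, up to discretization and sampling error.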


Now let $S$ denote an Ito process which starts at an initial value $S_t$ at time $t$ and evolves to some future value $S_T$ at time $T > t$. Then we will let $p(t, S_t, T, S_T)$ denote the probability transition density of $S$. Here $p$ gives the probability density of $S_t$ evolving to $S_T$. For example, if $S$ is a Brownian motion, then $p$ is a normalized Gaussian centered at $S_t$. Transition densities are important in finance since, if one can obtain an exact form for the transition density of an Ito process, then one can price a call option on an asset which evolves by this process up to quadrature. We will see later that the transition density for an Ito process can always be described as the solution of a parabolic partial differential equation.

2.2. Stochastic Differential Equations. Working with Ito processes written in the form of equation (2.7) quickly becomes quite tedious. Stochastic differential equations are alternative representations of Ito processes, in a manner analogous to the way that any linear Volterra integral equation of the first kind can be converted to an associated first order differential equation (this should not be taken literally, as we lose no regularity in the SDE case). For $i = 1, \ldots, n$, let $x^i_t$ be $n$ Ito processes defined by

(2.8) $x^i_t = x^i_0 + \int_0^t b^i(s, x_s)\, ds + \int_0^t \sum_{j=1}^{m} \sigma^i_j(s, x_s)\, dW^j_s$

where $x^i_0$ are the initial values of $x^i_t$, $x_s$ represents the collection of all the $x^i_s$, the functions $b^i : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}$ and $\sigma^i_j : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^+$ are measurable and have suitable regularity for the process to be well-defined, and $W^j$ are $m$ Brownian motions with correlation matrix $\rho$. We can formally write this system of Ito processes as a system of stochastic differential equations (SDEs)

(2.9) $dx^i_t = b^i(t, x_t)\, dt + \sum_{j=1}^{m} \sigma^i_j(t, x_t)\, dW^j_t$.

There are two ways in which these equations are coupled together. The first is that the coefficient functions $b^i$ and $\sigma^i_j$ for the $i$-th equation can depend on the other $x^j_t$, which is exactly how one couples systems of deterministic first order differential equations. Secondly, the Brownian motions are coupled via $E[dW^j_t\, dW^i_t] = \rho_{ij}\, dt$, where $\rho_{ij}$ is a symmetric positive definite correlation matrix. Note that SDEs can only be first order by construction. However, we can effectively think of such systems as higher order linear equations in the same sense that second order linear ordinary differential equations can be decomposed into systems of first order equations.
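The coupling of Brownian motions through a correlation matrix can be illustrated in the two-dimensional case. The sketch below builds a pair of correlated increments from two independent standard normal draws, which is the 2 x 2 instance of a Cholesky factorization of the correlation matrix. The function names and the restriction to two factors are assumptions made for illustration only.

```python
import math
import random


def correlated_increments(rho, dt, rng):
    """Return (dW1, dW2) over a step dt with E[dW1 * dW2] = rho * dt."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    dw1 = math.sqrt(dt) * z1
    # Mix in an independent draw so that dW2 keeps variance dt.
    dw2 = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return dw1, dw2


def estimated_correlation(rho, dt, n_samples, seed=0):
    """Monte Carlo estimate of E[dW1 * dW2] / dt, which should be close to rho."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        dw1, dw2 = correlated_increments(rho, dt, rng)
        total += dw1 * dw2
    return total / (n_samples * dt)
```

Feeding such increments into the driving terms of a two-factor system reproduces the second kind of coupling described above, while the first kind enters through the coefficient functions.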

There is a somewhat weak standard existence/uniqueness theorem for systems of SDEs, which is given by the following.

Theorem 2.1. [53, p.86] (Evans) Let $\mu : \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ and $\sigma : \mathbb{R}^n \times [0, T] \to \mathbb{R}^{m \times n}$ be continuous Lipschitz functions with Lipschitz constant $L$ that satisfy the linear growth conditions

(2.10) $|\mu(x, t)| \le L(1 + |x|)$, $|\sigma(x, t)| \le L(1 + |x|)$

for all $t \in [0, T]$ and $x \in \mathbb{R}^n$. Let $x_0$ be an $\mathbb{R}^n$-valued random variable with $E(|x_0|^2)$


possibly dangerous, assumption that there is indeed a well-defined unique solution of the SDE systems underlying stochastic processes. There are a variety of more specialized results for SDEs similar in spirit to this theorem.

If we know that a system of SDEs admits a unique solution, we often want to understand the associated dynamics of functions of the underlying stochastic processes. Ito's lemma provides us with this information:

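Lemma 2.2. [53, p. 66] (Evans) Consider an $n$-dimensional system of SDEs

(2.12) $dx^i = \mu^i(t, x)\, dt + \sum_{j=1}^{m} \sigma^i_j(t, x)\, dW^j$

where $\mu \in L^1(0, T)$ and $\sigma \in L^2(0, T)$. Let $u : \mathbb{R}^n \times [0, T] \to \mathbb{R}$ be twice continuously differentiable in $x$ and once continuously differentiable in $t$. Define a process $Y(t) = u(x(t), t)$. Then $Y$ has dynamics given by

(2.13) $dY = \left( \frac{\partial Y}{\partial t} + \sum_{i} \mu^i \frac{\partial Y}{\partial x^i} + \frac{1}{2} \sum_{i,j,k} \sigma^i_k \sigma^j_k \frac{\partial^2 Y}{\partial x^i \partial x^j} \right) dt + \sum_{i,j} \sigma^i_j(x, t) \frac{\partial Y}{\partial x^i}\, dW^j$.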

We note this can be written in the somewhat more appealing form

(2.14) $dY = \frac{\partial Y}{\partial t}\, dt + \sum_{i}^{n} \frac{\partial Y}{\partial x^i}\, dx^i + \frac{1}{2} \sum_{i,j}^{n} \frac{\partial^2 Y}{\partial x^i \partial x^j}\, dx^i dx^j$,

where one can substitute equation (2.12) into this equation and use the formal rules $dt^2 = dt\, dW^i = 0$, $dW^i dW^j = \delta_{ij}\, dt$ (which follow from properties of the Ito integral; see Evans [53]) to derive equation (2.13). We will often employ this computational convenience when utilizing Ito's lemma in future calculations. We now comment on how SDEs can be used to model the evolution of financial quantities.
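The formal rule that the square of a Brownian increment behaves like dt can be checked numerically in the simplest case Y = W^2, for which (2.14) gives dY = 2W dW + dt. The sketch below, with hypothetical names and a uniform grid as assumptions, accumulates the left-endpoint sum of 2W dW and the sum of squared increments along one sampled path. Their total reproduces W^2 at the terminal time exactly by telescoping, while the sum of squared increments should be close to the elapsed time.

```python
import math
import random


def ito_square_check(t_end, n_steps, seed=0):
    """Return (W_T, left-endpoint sum of 2 W dW, sum of dW^2) along one sampled path."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    w = 0.0
    ito_sum = 0.0          # discrete analogue of the integral of 2 W dW
    quad_variation = 0.0   # discrete analogue of the integral of (dW)^2
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += 2.0 * w * dw   # integrand uses W before the increment is added
        quad_variation += dw * dw
        w += dw
    return w, ito_sum, quad_variation
```

An ordinary chain rule would omit the quadratic variation term, so the gap between the squared terminal value and the left-endpoint sum is precisely the dt correction supplied by Ito's lemma.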

2.3. Financial Applications of Stochastic Differential Equations. There are myriad ways in which SDEs are used for modeling purposes in finance. We are interested in studying their use in modeling the evolution of equities, exchange rates, interest rates, etc., which will allow us to price contingent claims (derivative contracts) whose underlying assets evolve according to a specified SDE model.

A market model (or model for short) is a collection of $n$ Ito processes $x^i$ which are specified by a system of SDEs

(2.15) $dx^i_t = \mu^i(x, t)\, dt + \sum_{j=1}^{m} \sigma^i_j(x, t)\, dW^j$,

where the $x^i_t$ have initial values $x^i_0$ and $\mu^i$, $\sigma^i_j$ are chosen appropriately so that the system admits a unique solution. Also, the Brownian motions satisfy $E[dW^i dW^j] = \rho_{ij}\, dt$, where $E$ is defined with respect to the probability measure $\mathbb{P}$ under which the $W^j$ are Brownian. In addition to this system of equations, there is a separate SDE in a market model given by

(2.16) $dB_t = r_t B_t\, dt$, $B_0 = 1$.

Here $B_t$ is interpreted as a process which models the growth of one unit of currency according to a risk free rate $r_t$. The risk free rate $r_t$ can be taken to be constant, or modeled by another stochastic process (see for instance Brigo and Mercurio [23, Ch 3]), and is assumed not to depend on the $x^i$, since we will assume that the $x^i$ represent only a few asset prices whose values will not significantly impact the global risk-free rate. Equation (2.16) can be integrated exactly. In particular,

(2.17) $B(t) = \exp\left( \int_0^t r(s)\, ds \right)$.

An important quantity associated with $B(t)$ is the discount factor or zero coupon bond price at time $t = 0$, given by $D(0, t) = B(t)^{-1}$. The zero coupon bond curve encodes the market's current view on future expected values of the risk free rate. It allows us to compute the present day value of an asset whose value is known at time $t$ by multiplication. In particular, if an asset has value $C(t)$ at some future time $t > 0$, then at the present time $t = 0$, it has present day value $D(0, t) C(t)$. We will typically assume that the yield curve is flat.
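For a deterministic rate function, the bank account (2.17) and the associated discount factor reduce to a one-dimensional integral. The following sketch approximates that integral with a trapezoid rule; the function names and the step count are illustrative assumptions. For a flat curve with rate r the discount factor reduces to exp(-r t).

```python
import math


def bank_account(rate, t, steps=1000):
    """B(t) = exp(integral of rate(s) over [0, t]), using a composite trapezoid rule."""
    h = t / steps
    integral = 0.5 * (rate(0.0) + rate(t))
    for k in range(1, steps):
        integral += rate(k * h)
    return math.exp(integral * h)


def discount_factor(rate, t, steps=1000):
    """D(0, t) = 1 / B(t) for a deterministic rate function."""
    return 1.0 / bank_account(rate, t, steps)


def present_value(rate, t, future_value):
    """Present day value D(0, t) * C(t) of an amount known at time t."""
    return discount_factor(rate, t) * future_value
```

For example, `present_value(lambda s: 0.05, 2.0, 100.0)` discounts 100 units of currency received in two years at a flat five percent continuously compounded rate.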

The $x^i$ are typically interpreted as market observables, and are commonly taken to model asset prices, short rates, forward rates, or foreign exchange rates. We will call $\mu^i$ the drift associated to $x^i$ and $\sigma^i_j$ the instantaneous volatility associated with the $j$-th Brownian motion of $x^i$.

In particular, we will usually be interested in the case of European derivatives. These are contracts that depend on a market model's assets and that can only be exercised at some fixed future date $T > 0$. Given a market model, we can define a derivative which depends on underlying assets $x^i$ by a payoff function $C_T = C(x_t)$, which may depend on the paths of the $x^i$ (which is the case for Asian and other exotic options). Our task then is to find the fair value $C_t$ of the contract at prior times $t \in [0, T)$.

The purpose of pricing theory is not to attempt to predict a single evolution of asset prices. Rather, one of the main questions it attempts to address is: given a market model, i.e. a collection of assumed dynamics for $n$ assets and a risk free interest rate, what is the fair value of a specified derivative whose underlying asset(s) evolve according to the specified market-model dynamics? A market model is typically characterized by several market parameters including (but not limited to) the volatility of the underlying, a mean reversion level, a mean reversion speed, or an exponent which controls the relative growth of the underlying. Before pricing a given contract, one needs to choose values for these model parameters. This process is called calibration and, in the context of equities, is performed by forcing model call prices to agree with current market prices.

There are a variety of SDEs one can use to model the evolution of the underlying financial observables. One simple example, which is often used in a variety of contexts in practice due to its robust analytical tractability, is to assume that the price process $S(t)$ evolves according to lognormal dynamics

(2.18) $dS = \mu S\, dt + \sigma S\, dW$

where $\mu$ and $\sigma$ are constants and $S(0) = S_0$. However, this two parameter family of models often does not simulate realistic price evolution. For instance, one can see from a histogram of the log of the daily returns from the S&P Index that the associated density is not normal. We can extend this to the more general class of time dependent local volatility models with drift

(2.19) $dS = \mu(S, t)\, dt + \sigma(S, t)\, dW$,

which are now parameterized by the choice of two functions $\mu(S, t)$, $\sigma(S, t)$. In practice, one can choose these functions to be consistent with prevailing market conditions, in order to attempt to construct a more accurate approximation of the distributional properties of the price process (this process is called calibration). There is no need to limit the stochastic part of this SDE to being a Brownian motion. In fact, one can consider jump processes, among other constructions, in order to better approximate the true price process.
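A one-dimensional SDE with state and time dependent drift and diffusion coefficients, as in (2.19), can be simulated with the Euler-Maruyama scheme, which replaces dt and dW by a finite time step and a Gaussian increment. The sketch below is a minimal illustration rather than a production scheme; the function names, the uniform grid, and the coefficient functions passed in are assumptions. With lognormal coefficients the sample mean of the terminal value should be close to the initial value times exp(mu T).

```python
import math
import random


def euler_maruyama(mu, sigma, s0, t_end, n_steps, rng):
    """Terminal value of one Euler-Maruyama path of dS = mu(S, t) dt + sigma(S, t) dW."""
    dt = t_end / n_steps
    s, t = s0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Coefficients are frozen at the start of each step.
        s += mu(s, t) * dt + sigma(s, t) * dw
        t += dt
    return s


def mean_terminal_value(mu, sigma, s0, t_end, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[S_T] under the discretized dynamics."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += euler_maruyama(mu, sigma, s0, t_end, n_steps, rng)
    return total / n_paths
```

Swapping in a different diffusion function, for instance one proportional to a power of S, changes the model without changing the simulation loop, which is one reason this family of models is convenient to experiment with numerically.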

In order to construct a formula for the price of a generic European derivative, we first give a heuristic definition of the fundamental concept of arbitrage. A market model is said to be arbitrage free if there is no trading strategy one can execute on a portfolio of assets $x^i(t)$ of trivial initial value which has positive probability of having a positive value at some future time $T > 0$ and zero probability of having a negative future value. Alternatively, a market model is arbitrage free if there is no way to make an instantaneous risk free profit by trading in market assets; more colloquially, there is no free lunch. In other words, a market has no arbitrage if there is no way to definitively profit at a future time by trading assets in a portfolio with trivial initial value. Although real markets are not completely arbitrage free, such an assumption approximately holds for efficient markets, and we will always operate under this assumption. The theory of arbitrage has a long history, and we refer the reader to the standard reference of Delbaen and Schachermayer [42] for a more detailed discussion.

The fundamental theorem of asset pricing will allow us to construct a pricing formula for derivatives in arbitrage free markets. We state it here in a simplified form.

Theorem 2.3. (Fundamental Theorem of Asset Pricing) Consider a market model where the $x^i_t$ are defined for $t \in [0, T]$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then the market model is arbitrage free iff there exists a probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F})$, equivalent to $\mathbb{P}$, such that the discounted assets $\tilde{x}^i_t \equiv D(0, t)\, x^i_t$ are martingales with respect to $\mathbb{P}^*$.

The discounted asset dynamics under P have trivial drift and as a result the measureis called the risk-neutral measure. Given such a measure, we can represent the price of aEuropean derivative by using the following lemma:

Lemma 2.4. Given a risk neutral measureP for a market model, then for each t[0, T),there exists a unique pricet for a European derivativeCt with underlying assetsxi given by

(2.20) t = EP(D(t, T)CT

|Ft).

See Brigo and Mercurio [23, Chpt. 2] for an explanation of these theorems as well asoriginal references.

Here EP is the conditional expectation operator with respect to the risk neutral measuregiven all available information Ftat timetandD(t, T) is a discount factor given by D(t, T) =D(0, T)/D(0, t).

Computing exact prices of derivatives in generic market models is a challenging task. In fact, only a handful of analytic pricing formulas are known, and most of these exist only for simple derivatives with correspondingly simple, low dimensional market dynamics. Often one either has to resort to Monte Carlo methods (Glasserman [71], Kloeden and Platen [96]) or numerical PDE techniques (Andersen and Piterbarg [2], Duffie [47]) to accurately price derivatives.

To demonstrate this theory, we now give a well known, but very important, example of pricing a European call option in the case of log-normal underlying dynamics. Suppose a discounted asset $S_t$ evolves by log-normal dynamics for $t \in [0,T]$, under the risk neutral measure, given by

(2.21) $dS_t = \sigma S_t\,dW_t, \quad S(0) = S_0,$

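for some volatility constant $\sigma$. Then we can use Ito's lemma to compute

(2.22) $d(\ln S) = \frac{1}{S}\,dS - \frac{1}{2S^2}\,dS^2 = \sigma\,dW_t - \frac{\sigma^2}{2}\,dt,$

and integrate over $[0,t]$ for $t < T$ to find

(2.23) $S_t = S_0\exp\left(-\frac{\sigma^2}{2}t + \sigma W_t\right).$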

We now wish to price a European call on the asset $S$ with strike $K$. This is a contract that allows its holder to purchase the asset $S$ priced in dollars at time $T$ for $K$ dollars. The payoff function for such a call is given by $C_T = \max(S_T - K, 0)$, and at time $t = 0$, our pricing equation in the case of zero interest rates gives

(2.24) $C(0,S_0) = E\left[[S_T - K]^+\right].$

Now since $W_T$ is a $N(0,T)$ random variable under the risk neutral measure, we can evaluate the call price by computing the integral

(2.25) $C(0,S_0) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\left[S_0\exp\left(-\frac{\sigma^2}{2}T + \sigma\sqrt{T}\,x\right) - K\right]^+\exp\left(-\frac{x^2}{2}\right)dx$

(2.26) $= S_0\Phi(d_1) - K\Phi(d_2),$

where here $\Phi$ is the standard normal distribution function and

(2.27) $d_i = \frac{\ln(S_0/K) + (-1)^{i-1}\sigma^2 T/2}{\sigma\sqrt{T}},$

which is the well known Black-Scholes-Merton pricing formula for a European call option. We denote this function by

(2.28) $C_{BSM}(\sigma, S_0, K, T) = S_0\Phi(d_1) - K\Phi(d_2).$

This formula, although not viable for real-world pricing, contains a considerable amount of structure which one can use to build intuition for how call options should be priced. One such important property of the formula is that the Vega

(2.29) $\frac{\partial C_{BSM}}{\partial\sigma} = S_0\sqrt{T}\,\Phi'(d_1(\sigma)) > 0.$

Thus the BSM formula is monotonically increasing in the volatility and therefore there is an injective correspondence between prices of European call options and BSM volatility constants $\sigma$.
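As a quick numerical illustration of (2.28) and (2.29), the following Python sketch evaluates the zero-rate call price and compares the Vega with a central finite difference. The function names (`norm_cdf`, `norm_pdf`, `d1`, `bsm_call`, `bsm_vega`) and the sample parameter values are illustrative assumptions and do not come from the text.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density Phi'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1(sigma, s0, strike, maturity):
    """d_1 from (2.27)."""
    return (math.log(s0 / strike) + 0.5 * sigma ** 2 * maturity) / (
        sigma * math.sqrt(maturity)
    )


def bsm_call(sigma, s0, strike, maturity):
    """Zero-interest-rate call price (2.28)."""
    d_1 = d1(sigma, s0, strike, maturity)
    d_2 = d_1 - sigma * math.sqrt(maturity)
    return s0 * norm_cdf(d_1) - strike * norm_cdf(d_2)


def bsm_vega(sigma, s0, strike, maturity):
    """Derivative of the call price with respect to sigma, (2.29)."""
    return s0 * math.sqrt(maturity) * norm_pdf(d1(sigma, s0, strike, maturity))


if __name__ == "__main__":
    # Illustrative (assumed) inputs: sigma, S_0, K, T.
    sigma, s0, strike, maturity = 0.2, 100.0, 105.0, 1.0
    bump = 1e-5
    fd_vega = (
        bsm_call(sigma + bump, s0, strike, maturity)
        - bsm_call(sigma - bump, s0, strike, maturity)
    ) / (2.0 * bump)
    print(bsm_call(sigma, s0, strike, maturity))
    print(bsm_vega(sigma, s0, strike, maturity), fd_vega)
```

The two printed Vega values should agree to several digits, which is a simple way to check the sign and scaling of (2.29).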

If one were to consider more complicated dynamics for the asset(s) and needed to price a European call on such asset(s), it may not be possible to integrate the corresponding system of SDEs exactly, let alone evaluate the call option expectation (not to mention price more complicated derivatives). Suppose we have such a model and label it model A. From risk neutral pricing theory, we know there exists a unique price for a call at time zero given by $C^A_0$ given that the underlying market is complete. Now suppose that model A depends on $n$ parameters $a_1, \ldots, a_n$. The Black implied volatility of this model is a function $\sigma = \sigma(a_i)$ such that $C^A_0 = C_{BSM}(\sigma)$. Note this is unique by monotonicity of $C_{BSM}(\sigma)$. Thus if we can construct a Black implied volatility function for a model, we can compute prices of European calls in this model simply by evaluating $C_{BSM}(\sigma)$. One of our aims will be to construct accurate Black implied volatility functions for complicated models.
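Because $C_{BSM}$ is strictly increasing in $\sigma$, the implied volatility can be recovered from a price by any bracketing root finder. The sketch below uses plain bisection; it is only one possible numerical inversion, and the function names, the search bracket `[lo, hi]`, and the stand-in "model A" price are assumptions made for illustration.

```python
import math


def bsm_call(sigma, s0, strike, maturity):
    """Zero-interest-rate Black-Scholes-Merton call price."""
    vol = sigma * math.sqrt(maturity)
    d_1 = (math.log(s0 / strike) + 0.5 * vol * vol) / vol
    d_2 = d_1 - vol

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d_1) - strike * cdf(d_2)


def implied_vol(price, s0, strike, maturity, lo=1e-6, hi=5.0, tol=1e-12):
    """Solve bsm_call(sigma) = price for sigma by bisection.

    Bisection is valid because the price is strictly increasing in sigma.
    The bracket [lo, hi] is an assumed search range, not part of the text.
    """
    low_price = bsm_call(lo, s0, strike, maturity)
    high_price = bsm_call(hi, s0, strike, maturity)
    if not low_price <= price <= high_price:
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bsm_call(mid, s0, strike, maturity) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Stand-in for a price produced by some other model ("model A").
    model_price = bsm_call(0.3, 100.0, 110.0, 0.5)
    print(implied_vol(model_price, 100.0, 110.0, 0.5))
```

Bisection is chosen here for robustness rather than speed; a Newton iteration using the Vega would typically converge faster but needs a safeguard for deep in- or out-of-the-money strikes.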

Now in the BSM model, the volatility is assumed to be constant. In reality, the volatility is known to depend on both strike and maturity. In particular, the function $\sigma(K)$ is known as the smile and is not constant in the majority of derivatives markets. A large literature regarding the smile has developed over the past decade. Smile modeling was first considered in the equity and foreign-exchange setting by Dupire in [48], [49], [50], in the context of local volatility models. In addition, one can assume the instantaneous volatility functions of local volatility models are stochastic (the resulting model is called a stochastic volatility model). Stochastic volatility models allow us to give an alternative, although more complicated, way to model the smile as well. We will consider both cases of local volatility and stochastic volatility models later on.

We finally turn to another very important theorem in mathematical finance that allows us to evaluate expectations by solving associated deterministic parabolic equations; this provides an alternative way to price derivatives. The Feynman-Kac theorem states:

Theorem 2.5. (Feynman-Kac [94]) Let $C(t,x) \in C^{(1,2)}$ and let $r_t > 0$ be a continuous interest rate process. Then assuming $x^i$ and $r$ are given by a market model that admits a unique solution, the price of a generic derivative

(2.30) $C(t,x) = E\left[\exp\left(-\int_t^T r_s\,ds\right)C(T) \,\Big|\, \mathcal{F}_t\right]$

is given as the solution of the backward parabolic deterministic equation

(2.31) $-\partial_t C(t,x) = LC(t,x) - r_t C(t,x)$

with terminal data given by $C(T,x) = C(T)$ where $L$ is an elliptic operator given by

(2.32) $L = \mu^i\frac{\partial}{\partial x^i} + \frac{1}{2}\sum_{i,j}^{n}\sum_{k}^{m}\sigma^i_k\sigma^j_k\frac{\partial^2}{\partial x^i\partial x^j}.$

The Feynman-Kac formula provides a link between SDE and PDE techniques for derivatives pricing. In particular, one can evaluate the expectation (2.30) by performing a Monte Carlo simulation, or by numerically solving the differential equation (2.31). However there are several drawbacks to these approaches. In addition to many numerical issues that arise in pricing by Monte Carlo or numerical PDE simulation, one major drawback is that both of these techniques generally become increasingly computationally intensive as the complexity of the underlying SDEs grows. The particular type of approximation method we will use requires the language of differential geometry. We now review a few pertinent aspects of this subject prior to utilizing them to approximate pricing PDEs.

We will focus on PDE related pricing methods. Specifically, we will attempt to approximate the solution of equations of the form (2.31) using perturbation theory methods. This will result in approximation formulas which will provide a means for fast valuation of derivatives in complex market models relative to numerical PDE and Monte Carlo techniques.
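For reference, the Monte Carlo route mentioned above can be sketched in a few lines for the log-normal example, where the terminal asset value can be sampled exactly and the at-the-money price has a closed form to compare against. The function names, path count, and seed below are assumptions chosen for illustration only.

```python
import math
import random


def mc_call(sigma, s0, strike, maturity, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[(S_T - K)^+] for dS = sigma S dW, zero rates."""
    rng = random.Random(seed)
    drift = -0.5 * sigma ** 2 * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # Exact sample of the terminal value S_T = S_0 exp(-sigma^2 T / 2 + sigma W_T).
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - strike, 0.0)
    return total / n_paths


def atm_call_exact(sigma, s0, maturity):
    """Closed form for K = S_0 and zero rates: S_0 (2 Phi(sigma sqrt(T) / 2) - 1)."""
    return s0 * math.erf(sigma * math.sqrt(maturity) / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    print(mc_call(0.2, 1.0, 1.0, 1.0), atm_call_exact(0.2, 1.0, 1.0))
```

The statistical error of the estimate decays only like the inverse square root of the number of paths, which is one reason closed-form approximations are attractive for fast valuation.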

2.4. Riemannian Geometry. One of our main objectives is to study the interaction between SDEs and certain constructions in Riemannian geometry. Before we can consider these constructions, we need to review some aspects of Riemannian geometry. We refer the reader to Cheeger and Ebin [30], Do Carmo [44], Do Carmo [45], Frankel [55], Hassani [82], Jose [92], Milnor [122], Petersen [135], Nakahara [126], for more in depth discussions of the subject.

In the following we will let $M$ be a $C^\infty$ differentiable manifold with tangent bundle $TM$ and let $g : TM \times TM \to \mathbb{R}$ be a positive definite symmetric (0, 2) tensor, i.e. a Riemannian metric on $M$ (see Warner [167] for basic manifold theory definitions). The pair $(M, g)$ is called a Riemannian manifold. Typically, $M$ is taken to have non-trivial topology and in fact most of the fundamental results of Riemannian geometry explore the relationship between the topology of $M$ and the space of admissible metrics $g$ that can be defined upon $M$. However, all of our analysis will be local, and thus without loss of generality, we will always take $M$ to be an embedded graph in a Euclidean space, although we seldom take $g$ to be the standard Euclidean metric.



We also will let $E$ be a smooth vector bundle over $M$ and let $\Gamma(E)$ denote the space of smooth sections of $E$. We do not define a vector bundle here and refer the interested reader to the standard references Milnor [121], Steenrod [149] for more details. A connection $\nabla^A : \Gamma(E) \to \Gamma(T^*M \otimes E)$ on $E$ is a linear map which for any $v \in \Gamma(E)$ and $f \in C^\infty(M)$ satisfies a Leibniz rule

(2.33) $\nabla^A(fv) = df \otimes v + f\,\nabla^A v,$

where here $d$ is exterior differentiation. We will only be interested in local properties of vector bundles and since every vector bundle is locally trivial, we lose no generality in assuming that $E = M \times \mathbb{R}^n$ (in particular we are only interested in line bundles, i.e. bundles with $n = 1$). If we let $e_i$ be the standard basis elements of $\mathbb{R}^n$, then we can represent a generic section $v$ of $E$ by $v(x) = v^i(x)e_i$, and the Leibniz rule requires that

(2.34) $\nabla^A v = dv^i \otimes e_i + v^i\,\nabla^A e_i.$

Now $\nabla^A e_i \in \Gamma(T^*M \otimes E)$ so there is a matrix $A^j_i$ of one forms $A^j_i = A^j_{ki}\,dx^k$ such that $\nabla^A e_i = A^j_i \otimes e_j$. In particular, we have

(2.35) $\nabla^A v = (dv^i + v^j A^i_j) \otimes e_i,$

and the connection acts on components of the section like

(2.36) $\nabla^A_k v^i = \partial_k v^i + A^i_{kj}v^j.$

We can write the connection operator more conveniently in local coordinates as

(2.37) $\nabla^A_k = \partial_k + A_k$, where $A_k = A^i_{kj}$,

where in the case of a line bundle the $A_k$ are just 1-by-1 matrices of one forms, or simply the local coordinate components of one forms.

We now consider two examples of connections, in particular the Levi-Civita connection on the tangent bundle of $(M, g)$ and an Abelian connection on a line bundle over $(M, g)$. The fundamental theorem of Riemannian geometry states:

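Theorem 2.6. (Do Carmo [45], Petersen [135]) Given a Riemannian manifold $(M, g)$, there is a unique connection $\nabla$ defined on the tangent bundle $TM$ that is torsion free, i.e.

(2.38) $\nabla_X Y - \nabla_Y X = [X, Y]$

and compatible with the metric ($\nabla g = 0$). This connection is called the Levi-Civita connection.

We denote the Levi-Civita connection by $\nabla$ (without using a superscript) and its associated connection coefficients by $\Gamma^i_{jk}$ which are called Christoffel symbols. The Levi-Civita connection acts locally on the components of a generic tensor $T : \otimes^r TM \otimes^q T^*M \to \mathbb{R}$ according to

(2.39) $\nabla_i T^{a\cdots b}{}_{c\cdots d} = \partial_i T^{a\cdots b}{}_{c\cdots d} + T^{e\cdots b}{}_{c\cdots d}\Gamma^a_{ie} + \cdots + T^{a\cdots e}{}_{c\cdots d}\Gamma^b_{ie} - T^{a\cdots b}{}_{e\cdots d}\Gamma^e_{ic} - \cdots - T^{a\cdots b}{}_{c\cdots e}\Gamma^e_{id}.$

In particular, the metric compatibility condition demands that

(2.40) $0 = \nabla_k g_{ij} = \partial_k g_{ij} - \Gamma^l_{ki}g_{lj} - \Gamma^l_{kj}g_{il},$

which can be inverted to produce a formula for the Christoffel symbols (which are not tensors),

(2.41) $\Gamma^a_{bc} = \frac{1}{2}g^{ad}\left(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}\right).$

These Christoffel symbols are symmetric in their lower two indices $\Gamma^a_{bc} = \Gamma^a_{cb}$.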
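Formula (2.41) is straightforward to evaluate numerically. The sketch below approximates the metric derivatives by central differences and applies (2.41) to the two dimensional hyperbolic upper half-plane metric $g = (dx^2 + dy^2)/y^2$, for which the nonzero symbols are known to be $\Gamma^x_{xy} = \Gamma^x_{yx} = -1/y$, $\Gamma^y_{xx} = 1/y$ and $\Gamma^y_{yy} = -1/y$. The function names and the step size are illustrative assumptions, and the matrix inverse is hard-coded for two dimensions.

```python
def hyperbolic_metric(point):
    """Metric components g_ab of the upper half-plane at point = [x, y]."""
    _, y = point
    return [[1.0 / y ** 2, 0.0], [0.0, 1.0 / y ** 2]]


def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def christoffel(metric, point, h=1e-5):
    """Christoffel symbols gamma[a][b][c] = Gamma^a_{bc} from (2.41) in 2D."""
    n = len(point)
    # dg[d][a][b] approximates the partial derivative of g_ab in direction d.
    dg = []
    for d in range(n):
        fwd = list(point)
        bwd = list(point)
        fwd[d] += h
        bwd[d] -= h
        g_plus, g_minus = metric(fwd), metric(bwd)
        dg.append(
            [[(g_plus[a][b] - g_minus[a][b]) / (2.0 * h) for b in range(n)]
             for a in range(n)]
        )
    g_inv = inverse_2x2(metric(point))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    g_inv[a][d] * (dg[b][d][c] + dg[c][d][b] - dg[d][b][c])
                    for d in range(n)
                )
    return gamma


if __name__ == "__main__":
    symbols = christoffel(hyperbolic_metric, [0.3, 2.0])
    print(symbols[0][0][1], symbols[1][0][0], symbols[1][1][1])
```

The symmetry in the lower indices noted above can be checked directly by comparing `gamma[a][b][c]` with `gamma[a][c][b]`.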



We will mostly be concerned with connections on line bundles. In this case, the connection coefficients are given by $\Gamma^i_{jk} + A^i_{jk}$, so using our shorthand notation, $\nabla^A$ acts on the components of a vector $v^i$ like

(2.42) $\nabla^A_j v^i = \partial_j v^i + \Gamma^i_{jk}v^k + A_j v^i = \nabla_j v^i + A_j v^i,$

where here $A_j$ are the components of a covector field. We are free to choose the $A_j$ in specifying a connection on a line bundle. If one takes this action of $\nabla^A_j$ as axiomatic, then one need not be concerned with the previous vector bundle discussion.

Now given a connection $\nabla^A$, we can also consider the curvature operator associated with $\nabla^A$, denoted by $(\nabla^A)^2 = \nabla^A \circ \nabla^A$. This operator acts on a section $v$ like

(2.43) $(\nabla^A)^2 v = Fv$, where $F = dA + A \wedge A$.

The matrix valued curvature two form $F$ is given locally componentwise by

(2.44) $F_{ij} = \partial_i A_j - \partial_j A_i + [A_i, A_j].$

In the case of the Levi-Civita connection, this produces the well known local coordinate formula for the Riemann curvature tensor of $g$ (where we use the sign convention of Wald [165]),

(2.45) $R_{ijk}{}^{l} = \partial_j\Gamma^l_{ik} - \partial_i\Gamma^l_{jk} + \Gamma^a_{ik}\Gamma^l_{aj} - \Gamma^a_{jk}\Gamma^l_{ai}.$

Next we define the components of the Ricci curvature $R_{ij}$ and the scalar curvature $R$ by taking contractions of the Riemann tensor

(2.46) $R_{ij} = \sum_k R_{ikj}{}^{k}, \quad R = g^{ij}R_{ij}.$

We need to use all of these quantities to compute approximations for PDE associated to different market models. We will mostly be interested in one and two dimensional models in which case the Riemann, Ricci, and scalar curvature tensors all have only one independent component which is trivial in one dimension or the Gauss curvature in two dimensions. In three dimensions, the Ricci tensor has six independent components, and the Riemann tensor is completely determined by the Ricci tensor. Only in four or higher dimensions does the Riemann tensor contain more information than Ricci, e.g. it has twenty independent components in dimension four whereas the Ricci tensor only has ten.

Another important geometric notion that we need to utilize is the concept of the distance function associated to a Riemannian metric. In particular, given two points $p, q \in M$, we define the distance function $d(p, q) : M \times M \to \mathbb{R}^+$. To do this, let $\mathcal{C}$ be the space of all $C^1$ curves joining $p$ to $q$ and for $\gamma \in \mathcal{C}$ denote the length functional $l : \mathcal{C} \to \mathbb{R}^+$ given by

(2.47) $l[\gamma] = \int_p^q\sqrt{g_{ij}\,\dot\gamma^i(t)\,\dot\gamma^j(t)}\,dt,$

where here $t$ is a coordinate that parameterizes $\gamma$ by arclength. The distance function is then defined by

(2.48) $d(p, q) = \inf_{\gamma\in\mathcal{C}}\{l[\gamma]\}.$

One can reduce the computation of the infimum to solving a system of ODEs by employing the standard variational calculus technique of demanding the first variation of $l$ vanish, which is equivalent to solving the Euler-Lagrange equations associated with the length Lagrangian. The result is an associated system of differential equations for the components of the tangent vector to a geodesic joining $p$ to $q$. The quasi-linear ODE system that one needs to solve to find the minimizing paths are the geodesic equations,

(2.49) $\ddot\gamma^i + \Gamma^i_{jk}\dot\gamma^j\dot\gamma^k = 0, \quad \gamma^i(0) = \gamma^i_0.$

These equations generally form a coupled system of quasi-linear ODE and thus local existence and uniqueness of the system is guaranteed by standard ODE theorems. There is no global existence and uniqueness result for geodesics joining two given points. To understand why, note that on the sphere there are infinitely many geodesics joining any two antipodal points and hence uniqueness is violated.

There are cases where one can guarantee global existence/uniqueness of the geodesic equation. For instance, in the case $(M, g)$ has negative sectional curvature and trivial fundamental group, it follows from the Hadamard-Cartan theorem Petersen [135, p.162] that geodesics are globally and uniquely defined on $(M, g)$. Many cases of geometries we consider will satisfy this constraint, in particular the case of hyperbolic geometry.
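To make the geodesic equations (2.49) concrete in the hyperbolic case, the sketch below integrates them with a classical fourth order Runge-Kutta scheme for the upper half-plane metric $g = (dx^2 + dy^2)/y^2$, where they read $\ddot{x} = 2\dot{x}\dot{y}/y$ and $\ddot{y} = (\dot{y}^2 - \dot{x}^2)/y$. The integrator, step count, and function names are assumptions made for illustration; the known unit-speed solutions $(0, e^t)$ and $(\tanh t, 1/\cosh t)$ serve as reference curves.

```python
import math


def geodesic_rhs(state):
    """First order form of (2.49) for the metric (dx^2 + dy^2) / y^2."""
    x, y, vx, vy = state
    return (vx, vy, 2.0 * vx * vy / y, (vy * vy - vx * vx) / y)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""

    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate_geodesic(x0, y0, vx0, vy0, arclength, steps=1000):
    """Follow the geodesic from (x0, y0) with initial velocity (vx0, vy0)."""
    state = (x0, y0, vx0, vy0)
    h = arclength / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


if __name__ == "__main__":
    # Vertical unit-speed geodesic: after arclength 1 it reaches (0, e).
    print(integrate_geodesic(0.0, 1.0, 0.0, 1.0, 1.0))
    # Horizontal start: the geodesic follows the unit semicircle.
    print(integrate_geodesic(0.0, 1.0, 1.0, 0.0, 1.0))
```

The vertical case also illustrates the distance function: the points $(0, 1)$ and $(0, e)$ are at hyperbolic distance $\ln(e/1) = 1$, which matches the arclength travelled.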

We now turn our attention to discussing the relationship between SDEs and differential geometry.

2.5. Finance/Geometry Relation. We now review the construction of the pricing equation for a general path independent derivative whose payoff depends on $n$ generic time homogeneous Ito processes. We then review how this equation can be related to a geometric heat equation, which will motivate the heat kernel expansion ansatz. Finally, we write the ansatz out explicitly in the one dimensional setting.

For $i = 1, \ldots, n$, let $x^i_t \equiv x^i(t)$ be $n$ Ito processes which evolve according to a system of coupled stochastic differential equations (SDEs)

(2.50) $dx^i_t = \mu^i(x_t)\,dt + \sigma^i_j(x_t)\,dW^j, \quad x^i(0) = x^i_0,$

where $j = 1, \ldots, n$, $x_t = \{x^1_t, \ldots, x^n_t\}$ denotes a function's dependence on potentially all the $x^i_t$, $W^i$ are $n$ Brownian motions with covariance matrix $\rho^{ij}$, i.e. $E[dW^i dW^j] = \rho^{ij}\,dt$, and $\mu^i$, $\sigma^i_j$ are suitably regular functions. We assume these dynamics are risk neutral and that $E$ is the expectation operator with respect to (w.r.t.) the risk neutral measure. We will also sometimes write $x^i = x^i_t$ for short when no contextual conflicts are present. The coefficient functions of this SDE system do not have explicit time dependence; we will always assume that this holds and note that the application of heat kernel expansions to models whose instantaneous volatility function depends explicitly on time remains an open area of research.

Let $F(x_t, t)$ be a pricing function for a contingent claim with a path independent payoff function $F(x_T, T)$ for some fixed $T > t > 0$. Then Ito's lemma requires that $F$ evolves according to

(2.51) $dF = \frac{\partial F}{\partial t}\,dt + \frac{\partial F}{\partial x^i}\,dx^i + \frac{1}{2}\frac{\partial^2 F}{\partial x^i\partial x^j}\,dx^i dx^j,$

where here we adopt the Einstein summation convention where repeated indices indicate implicit summation.

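Moreover, since we take the above SDEs to be risk neutral, the discounted price process $e^{r\tau}F(x_t)$ (assuming a flat discounting curve), where $\tau = T - t$, is a martingale. Therefore

(2.52) $d\left(e^{r\tau}F(x_t, t)\right) = \left[e^{r\tau}F_t - re^{r\tau}F\right]dt + e^{r\tau}F_i\,dx^i + \frac{1}{2}e^{r\tau}F_{ij}\,dx^i dx^j$

(2.53) $= e^{r\tau}\left[(F_t - rF)\,dt + F_i(\mu^i dt + \sigma^i_j dW^j) + \frac{1}{2}F_{ij}(\mu^i dt + \sigma^i_k dW^k)(\mu^j dt + \sigma^j_l dW^l)\right]$

(2.54) $= e^{r\tau}\left[F_t - rF + F_i\mu^i + \frac{1}{2}F_{ij}\sigma^i_k\sigma^j_l\rho^{kl}\right]dt + e^{r\tau}F_i\sigma^i_j\,dW^j,$

where here we use subscripts on $F$ to indicate partial differentiation with respect to the local coordinate functions, i.e. $F_i \equiv \partial_{x^i}F$. Now let $\sigma^{ij} \equiv \sigma^i_k\sigma^j_l\rho^{kl}$ be a positive definite volatility matrix. Since the above process is a martingale, it must be driftless, and consequently $F$ satisfies a backwards Fokker-Planck-Kolmogorov equation

(2.55) $F_t - rF + \mu^i\frac{\partial F}{\partial x^i} + \frac{1}{2}\sigma^{ij}\frac{\partial^2 F}{\partial x^i\partial x^j} = 0.$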

Alternatively, since we are restricting to the class of path independent payoff functions, given the values of the underlying processes $x_t$ at time $t$ and a filtration $\mathcal{F}_t$, we can represent $F$ according to

(2.56) $F(x_t) = E\left[e^{-r\tau}F(x_T) \mid \mathcal{F}_t\right] = e^{-r\tau}\int_{\mathbb{R}^n}F(x_T)\,p(T, x_T | t, x_t)\,dx_T,$

where here $p(T, x_T | t, x_t)$ is the joint transition density of the underlying processes $x^i$, i.e. it represents the probability density that $x$ will evolve to $x = x_T$ at time $T$ given that the $x^i$ had initial values $x_t$ at time $t$.

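Next, we want to construct a partial differential equation (PDE) for $p$ by computing $F_t$, $F_i$, and $F_{ij}$ from the representation (2.56) and substituting the results into equation (2.55). Differentiating with respect to $t$, we find that

(2.57) $F_t(x_t) = e^{-r\tau}\int_{\mathbb{R}^n}F(x_T)\,p_t(x_T)\,dx_T + re^{-r\tau}\int_{\mathbb{R}^n}F(x_T)\,p(x_T)\,dx_T$

(2.58) $= e^{-r\tau}\int_{\mathbb{R}^n}F(x_T)\,p_t(x_T)\,dx_T + rF(x_t).$

We denote the spatial coordinates by $x^i = x^i_t$ so that the pricing PDE takes the form

(2.59) $F_t - rF + \mu^i\frac{\partial F}{\partial x^i_t} + \frac{1}{2}\sigma^{ij}\frac{\partial^2 F}{\partial x^i_t\partial x^j_t} = 0.$

The gradient and Hessian of $F$ just act on $p$; after substituting them into equation (2.59), we find

(2.60) $0 = e^{-r\tau}\int_{\mathbb{R}^n}\left[p_t(x_T) + \mu^i\frac{\partial p(x_T)}{\partial x^i_t} + \frac{1}{2}\sigma^{ij}\frac{\partial^2 p(x_T)}{\partial x^i_t\partial x^j_t}\right]F(x_T)\,dx_T.$

This must hold for any payoff function $F$, so it is equivalent to a backwards parabolic PDE for the probability transition density

(2.61) $0 = p_t + \mu^i\frac{\partial p}{\partial x^i_t} + \frac{1}{2}\sigma^{ij}\frac{\partial^2 p}{\partial x^i_t\partial x^j_t}, \quad p(\tau = 0, x, x_t) = \delta(x - x_t),$

which is also subject to the spatial density boundary condition $p(x) \to 0$ as $|x_t - x^i| \to \infty$ (c.f. Evans [54] for background on parabolic equations). Here $\delta(x - x_t)$ is the delta function on $\mathbb{R}^n$ centered at the point $x_t$ and expressed in Euclidean coordinates.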

One can convert this PDE into a forward equation by expressing it in terms of $\tau$,

(2.62) $\frac{\partial p}{\partial\tau} = \mu^i\frac{\partial p}{\partial x^i_t} + \frac{1}{2}\sigma^{ij}\frac{\partial^2 p}{\partial x^i_t\partial x^j_t}, \quad p(0, x, x_t) = \delta(x - x_t).$

We seek to approximate solutions to equations of this form using heat kernel perturbation theory. In particular, we make an ansatz for $p$ which solves this equation exactly to zeroth order in $\tau$ in hope of allowing us to simplify subsequent perturbative computations for higher order correction terms. There are many references for this construction (see e.g. Arnold [10], Avramidi [12], Labordere [98], Paulot [134]). The reader uninterested in the details of the construction may skip to equation (2.95) where we summarize the one dimensional form of the heat kernel ansatz that will be used later in our examples.

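In order to motivate the heat kernel ansatz, we first review a correspondence between elliptic operators on $\mathbb{R}^n$ and connections on line bundles over a Riemannian manifold $(M, g)$; here $M$ is a $C^\infty$ manifold (which can roughly be interpreted as a smooth subset of a Euclidean space) and $g$ is a smooth set of symmetric positive definite matrices, indexed by points in $M$, which contains all necessary information related to computing distances on $M$. Consider an elliptic operator

(2.63) $L = \frac{1}{2}\sigma^{ij}\partial_i\partial_j + \mu^i\partial_i,$

on $\mathbb{R}^n$. We can represent $L$ by an equivalent operator of the form

(2.64) $L = \Delta^A - Q = g^{ij}\nabla^A_i\nabla^A_j - Q,$

on a line bundle $\mathcal{L}$ over $M$ where here $\nabla^A_i$ is a connection which can be decomposed as $\nabla^A_i = \nabla_i + A_i$ where $\nabla_i$ is the Levi-Civita connection associated to $g$ and $A_i$ are the components of a real-valued section of the cotangent bundle of $M$, i.e. $A \in \Gamma(T^*M)$, and $Q$ is a section of $\mathrm{End}\,\mathcal{L} \simeq \mathcal{L} \otimes \mathcal{L}^*$ (c.f. Avramidi [12]). All of our analysis will be local, and we will only require the fact that $\nabla^A_i$ acts on a function $p : M \to \mathbb{R}$ according to

(2.65) $\nabla^A_i p = (\nabla_i + A_i)p = \partial_i p + A_i p,$

and on the components $v_j$ of a covector $v \in T^*(M)$ like

(2.66) $\nabla^A_i v_j = \partial_i v_j - \Gamma^k_{ij}v_k + A_i v_j$, where $\Gamma^k_{ij} = \frac{1}{2}g^{km}\left(\partial_i g_{mj} + \partial_j g_{mi} - \partial_m g_{ij}\right)$

are the Christoffel symbols associated with $g$. In particular, in local coordinates, we can compute

(2.67) $Lp = g^{ij}(\nabla_i + A_i)(\nabla_j + A_j)p - Qp = g^{ij}(\nabla_i + A_i)(\partial_j p + A_j p) - Qp$

(2.68) $= g^{ij}\left[\partial_i\partial_j p - \Gamma^k_{ij}\partial_k p + p\,(\partial_i A_j - \Gamma^k_{ij}A_k) + A_j\partial_i p + A_i\partial_j p + A_iA_j p\right] - Qp$

(2.69) $= g^{ij}\left[\partial_i\partial_j p + 2A_j\partial_i p - \Gamma^k_{ij}\partial_k p + \left(\partial_i A_j - \Gamma^k_{ij}A_k + A_iA_j\right)p\right] - Qp.$

We now can identify the diffusion and advection terms of this expression with those of equation (2.63) to find that

(2.70) $\frac{1}{2}\sigma^{ij} = g^{ij}, \quad \mu^i = 2g^{ij}A_j - g^{jk}\Gamma^i_{jk}, \quad 0 = g^{ij}\left(\partial_i A_j - \Gamma^k_{ij}A_k + A_iA_j\right) - Q.$

Through these equations, we can express an elliptic operator either by a choice of $(\sigma^{ij}, \mu^i)$, or equivalently, by specifying a triple $(g_{ij}, A_i, Q)$. Specifically, they can be inverted in order to write the geometric quantities in terms of the financial ones,

(2.71) $g^{ij} = \frac{1}{2}\sigma^{ij}, \quad A_k = \frac{1}{2}\left(g_{ik}\mu^i + g_{ik}g^{jm}\Gamma^i_{jm}\right), \quad Q = g^{ij}\left(\partial_i A_j - \Gamma^k_{ij}A_k + A_iA_j\right).$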
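In one dimension the dictionary (2.71) reduces to a handful of scalar formulas, which the following sketch evaluates with central differences. It is only a numerical illustration under stated assumptions: the function name `geometric_data_1d`, the step size `h`, and the log-normal test coefficients $\sigma^{xx} = \sigma^2 x^2$, $\mu = rx$ are choices made here, for which one expects $A = (r/\sigma^2 - 1/2)/x$ and $Q = \frac{\sigma^2}{2}(r/\sigma^2 - 1/2)^2$.

```python
def geometric_data_1d(sigma2, mu, x, h=1e-4):
    """Return (g_xx, Gamma, A, Q) at x from (2.71) in one dimension.

    sigma2(x) is the diffusion coefficient sigma^{xx} and mu(x) the drift;
    derivatives are approximated by central differences of width h.
    """

    def g_inv(z):
        # Inverse metric g^{xx} = sigma^{xx} / 2.
        return 0.5 * sigma2(z)

    def g(z):
        # Metric g_xx is the reciprocal of g^{xx} in one dimension.
        return 1.0 / g_inv(z)

    def gamma(z):
        # Single Christoffel symbol: (1/2) g^{xx} d(g_xx)/dx.
        return 0.5 * g_inv(z) * (g(z + h) - g(z - h)) / (2.0 * h)

    def a(z):
        # Connection component A_x = (1/2) g_xx (mu + g^{xx} Gamma).
        return 0.5 * g(z) * (mu(z) + g_inv(z) * gamma(z))

    da = (a(x + h) - a(x - h)) / (2.0 * h)
    q = g_inv(x) * (da - gamma(x) * a(x) + a(x) ** 2)
    return g(x), gamma(x), a(x), q


if __name__ == "__main__":
    r, sigma = 0.05, 0.3  # illustrative (assumed) parameter values
    print(geometric_data_1d(lambda z: (sigma * z) ** 2, lambda z: r * z, 1.7))
```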

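We now give two operator identities that will prove useful later. The first concerns the $A$-Laplacian $\Delta^A$ defined by,

(2.72) $\Delta^A p \equiv g^{ij}(\nabla_i + A_i)(\nabla_j + A_j)p = g^{ij}\left(\nabla_i\partial_j p + \nabla_i(A_j p) + A_i\partial_j p + A_iA_j p\right)$

(2.73) $= \Delta_g p + g^{ij}\left(A_j\partial_i p + p\,(\partial_i A_j - \Gamma^k_{ij}A_k) + A_i\partial_j p + A_iA_j p\right)$

(2.74) $= \Delta_g p + 2g^{ij}A_i\partial_j p + g^{ij}\left[\partial_i A_j - \Gamma^k_{ij}A_k + A_iA_j\right]p = \Delta_g p + 2g^{ij}A_i\partial_j p + Qp,$

which can be expressed more concisely as

(2.75) $(\Delta^A - Q)p = \Delta_g p + 2g^{ij}A_i\partial_j p.$

The second identity involves the Levi-Civita Laplacian (Laplace-Beltrami operator) $\Delta_g$ and is expressed in local coordinates by

(2.76) $\Delta_g p = \frac{1}{\sqrt{g}}\partial_i\left(\sqrt{g}\,g^{ij}\partial_j p\right) = g^{ij}\left(\partial_i\partial_j p - \Gamma^k_{ij}\partial_k p\right),$

where here $g$ is the determinant of the metric.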

2.6. Heat Kernel Expansion Formula. With the above tools at hand, we now can construct the heat kernel ansatz. First, consider a general second order elliptic differential operator on $\mathbb{R}^n$ expressed in terms of $\nabla^A_i$ given by

(2.77) $L = g^{ij}\nabla^A_i\nabla^A_j - Q = \frac{1}{\sqrt{g}}(\partial_i + A_i)\left[\sqrt{g}\,g^{ij}(\partial_j + A_j)\right] - Q.$

The heat equation $\partial_\tau p = Lp$ can be solved exactly to zeroth order in $\tau$. We assume that $p$ is given by this zeroth order solution multiplied by an arbitrary function $\Omega$, where we must have $\Omega(0, x, x') = 1$ for consistency. The resulting expression for $p$ is called the heat kernel expansion and for $x, x' \in \mathbb{R}^n$, is given by

(2.78) $p(\tau, x, x') = \frac{\sqrt{g(x')}}{(4\pi\tau)^{n/2}}\,P(x, x')\,\Delta^{1/2}(x, x')\exp\left(-\frac{d^2(x, x')}{4\tau}\right)\Omega(\tau, x, x'),$

where here $d(x, x')$ is the distance function from $x$ to $x'$ associated with the metric $g$, and

(2.79) $P(x, x') = \exp\left(-\int_C A\right) = \exp\left(-\int_x^{x'}A_i\,dx^i\right).$

Here $C$ is a minimizing oriented geodesic from $x = x(0)$ to $x' = x(t)$ parametrized by arclength. In the one dimensional setting, such a geodesic always exists, although this issue is more subtle in higher dimensions c.f. Forde [58]. Finally,

(2.80) $\Delta(x, x') = \frac{1}{\sqrt{g(x)g(x')}}\det\left(-\frac{1}{2}\frac{\partial^2 d^2(x, x')}{\partial x^i\partial x'^j}\right),$

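is known as the van-Vleck-Morette determinant. If we substitute this ansatz into the previous equation, then after simplification, we find that $\Omega$ must satisfy

(2.81) $\left(\partial_\tau + \frac{1}{\tau}(\nabla^i\sigma)\nabla_i - P^{-1}\Delta^{-1/2}L\,\Delta^{1/2}P\right)\Omega(\tau, x, x') = 0,$

with initial condition $\Omega(0, x, x') = 1$, where $\sigma(x, x') = d(x, x')^2/2$ (not to be confused with a volatility coefficient). Now assume that $\Omega$ is given by a formal power series in $\tau$,

(2.82) $\Omega(\tau, x, x') = \sum_{k=0}^{\infty}a_k(x, x')\tau^k.$

Next, substitute this series into equation (2.81) to find

(2.83) $0 = \sum_{k=0}^{\infty}ka_k\tau^{k-1} + \sum_{k=0}^{\infty}\left[(\nabla^i\sigma)\nabla_i a_k\right]\tau^{k-1} - \sum_{k=0}^{\infty}P^{-1}\Delta^{-1/2}L\,\Delta^{1/2}P\,a_k\tau^k$

(2.84) $= \sum_{k=1}^{\infty}ka_k\tau^{k-1} + \sum_{k=0}^{\infty}\left[(\nabla^i\sigma)\nabla_i a_k\right]\tau^{k-1} - \sum_{k=1}^{\infty}P^{-1}\Delta^{-1/2}L\,\Delta^{1/2}P\,a_{k-1}\tau^{k-1}$

(2.85) $= (\nabla^i\sigma)(\nabla_i a_0)\tau^{-1} + \sum_{k=1}^{\infty}\left[ka_k + (\nabla^i\sigma)\nabla_i a_k - P^{-1}\Delta^{-1/2}L\,\Delta^{1/2}P\,a_{k-1}\right]\tau^{k-1}.$

Now the coefficients of the different powers of $\tau$ must vanish identically. The initial condition $\Omega(0, x, x') = 1$ together with $\nabla_i a_0 = 0$ require that $a_0 = 1$. We find that the rest of the $a_k$ are given by a recursive hierarchy of differential equations

(2.86) $ka_k + d\,\frac{\partial a_k}{\partial d} - P^{-1}\Delta^{-1/2}L\,\Delta^{1/2}P\,a_{k-1} = 0,$

where here we use $\nabla_i\sigma = d\,\nabla_i d$ and $(\nabla^i\sigma)\nabla_i = d\,\frac{\partial}{\partial d}$. One can integrate this system to find an iterative formula for the $a_k$,

(2.87) $a_k(x, x') = \frac{1}{d^k}\int_C d^{k-1}P^{-1}(x, x')\,\Delta^{-1/2}L\,\Delta^{1/2}P(x, x')\,a_{k-1}, \quad k \geq 1.$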

The goal of heat kernel perturbation theory is to attempt to evaluate or approximate (usually by the tractable diagonal coefficients $a_k(x, x)$) the $a_k$ integrals in a manner such that the resulting explicit form for $p$ approximates the true solution of the heat equation associated with (2.63) to a high degree of accuracy for a desired domain of model parameters. Computing $a_k$ exactly is generally only possible in the simplest geometries $(M, g)$ for dimensions $n \geq 2$. However, when $n = 1$, these reduce to integrals over $\mathbb{R}$ and are calculable in a wide variety of models.

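We now restrict our attention to the case of one dimensional models. Specifically, we consider a driftless local volatility model of the form

(2.88) $dS_t = C(S_t)\,dW, \quad S(0) = S_0,$

where we take $C : \mathbb{R}^+ \to \mathbb{R}$ to be at least a $C^2$ function and uniformly positive. The transition density equation for this model is given by

(2.89) $\partial_\tau p = \frac{1}{2}C(\xi)^2\,\partial_\xi^2 p.$

Applying the PDE/geometry correspondence, note that the single component of the inverse metric is given by $g^{\xi\xi} = C(\xi)^2/2$. Thus the metric and the square root of its determinant are just $g_{\xi\xi} = 2/C(\xi)^2$ and $\sqrt{g(\xi)} = \sqrt{2}/C(\xi)$. Using this, we can compute the single Christoffel symbol

(2.90) $\Gamma^\xi_{\xi\xi} = \frac{1}{2}g^{\xi\xi}\partial_\xi g_{\xi\xi} = -\frac{1}{C(\xi)}\frac{\partial C(\xi)}{\partial\xi},$

which is just the component of the one form $-d\ln C(\xi)$. Next, we find $A_\xi = \frac{1}{2}g_{\xi\xi}g^{\xi\xi}\Gamma^\xi_{\xi\xi} = \frac{1}{2}\Gamma^\xi_{\xi\xi}$ and

(2.91) $P(\xi, S) = \exp\left(-\int_\xi^S A_{\xi'}\,d\xi'\right) = \exp\left(\int_\xi^S d\ln\sqrt{C(\xi')}\right).$

Evaluating the integral, we find that $P(\xi, S) = \sqrt{C(S)/C(\xi)}$, which in turn implies $P(S, \xi) = \sqrt{C(\xi)/C(S)}$. We can further compute

(2.92) $P^{-1}\Delta_g P + 2P^{-1}g^{\xi\xi}A_\xi\partial_\xi P = \frac{1}{8}\left(2CC'' - (C')^2\right).$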



Next, define a coordinate $s = \sqrt{2}\int^S\frac{du}{C(u)}$ which parameterizes geodesics on $(\mathbb{R}, g)$ by arclength. Changing coordinates allows us to see that

(2.93) $g_{\xi\xi} = \left(\frac{\partial s}{\partial\xi}\right)^2 g_{ss} = \frac{2}{C(\xi)^2}\,g_{ss},$

from which we note that the line element is given by

(2.94) $ds^2 = g_{\xi\xi}\,d\xi^2,$

i.e. in the $s$ coordinate $g$ is just the standard Euclidean metric on $\mathbb{R}$. In particular, this implies that $\Delta(S, \xi) = 1$ which considerably simplifies the computation of the $a_k$.

We now summarize the heat kernel ansatz in one dimension in a form that is expressed solely in terms of operators and functions on $\mathbb{R}$, namely,

(2.95) $p(\tau, \xi, S) = \sqrt{\frac{C(\xi)}{2\pi\tau\,C(S)^3}}\exp\left(-\frac{d^2(S, \xi)}{4\tau}\right)\sum_{k=0}^{\infty}a_k(S, \xi)\tau^k,$

where $a_0 = 1$ and the distance function $d$ is

(2.96) $d(S, \xi) = \sqrt{2}\int_\xi^S\frac{du}{C(u)}.$

Actually, the true distance function associated to $g$ is given by taking the absolute value of the above; however we omit this for simplicity as the sign of $d$ will not affect the ansatz. The $a_k$ are given by the integrals

(2.97) $a_k(S, \xi) = \frac{1}{d^k}\int_\xi^S d^{k-1}\Big[P(S, \xi)^{-1}\sqrt{g(\xi)}\,\Delta_g\big(P(S, \xi)\,a_{k-1}(S, \xi)\big)$

(2.98) $\qquad + 2P(S, \xi)^{-1}\sqrt{g(\xi)}\,g^{\xi\xi}A_\xi\,\partial_\xi\big(P(S, \xi)\,a_{k-1}(S, \xi)\big)\Big]\,d\xi.$

We can represent the first heat kernel coefficient in the following convenient way,

(2.99) $a_1(S, \xi) = \frac{\sqrt{2}}{4d}\int_\xi^S\left(C''(u) - \frac{(C'(u))^2}{2C(u)}\right)du.$

When it is possible to evaluate this integral, which can be done for a wide range of functions $C$, then one can insert the result into the $a_k$ formula and attempt to compute $a_2$. Although one can typically compute the $a_1$ integral exactly, computing higher $a_k$ is potentially a difficult task. If we are able to compute $a_k$ for a given local volatility model, we will say that we have constructed a $k$-th order approximation formula. We now turn to several examples starting with the CEV model.
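When the integrals in (2.96) and (2.99) are not available in closed form they can be approximated by quadrature. The sketch below does this with a composite Simpson rule and assembles the first order density $p \approx \sqrt{C(\xi)/(2\pi\tau C(S)^3)}\,e^{-d^2/4\tau}(1 + a_1\tau)$. The function names, the quadrature rule, and the requirement that the caller supplies $C$, $C'$ and $C''$ are assumptions made for illustration; for the driftless log-normal choice $C(u) = \sigma u$ the integrals give $d = \frac{\sqrt{2}}{\sigma}\ln(S/\xi)$ and $a_1 = -\sigma^2/8$, which provides a check.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def distance(c, s, xi):
    """Signed distance (2.96): sqrt(2) times the integral of 1/C from xi to s."""
    return math.sqrt(2.0) * simpson(lambda u: 1.0 / c(u), xi, s)


def first_coefficient(c, dc, d2c, s, xi):
    """First heat kernel coefficient (2.99); requires s != xi."""
    d = distance(c, s, xi)
    integral = simpson(lambda u: d2c(u) - dc(u) ** 2 / (2.0 * c(u)), xi, s)
    return math.sqrt(2.0) / (4.0 * d) * integral


def first_order_density(c, dc, d2c, tau, xi, s):
    """Ansatz (2.95) truncated after the a_1 term."""
    d = distance(c, s, xi)
    a1 = first_coefficient(c, dc, d2c, s, xi)
    prefactor = math.sqrt(c(xi) / (2.0 * math.pi * tau * c(s) ** 3))
    return prefactor * math.exp(-d * d / (4.0 * tau)) * (1.0 + a1 * tau)


if __name__ == "__main__":
    sigma = 0.4  # illustrative (assumed) log-normal volatility

    def vol(u):
        return sigma * u

    print(first_coefficient(vol, lambda u: sigma, lambda u: 0.0, 1.3, 1.0))
    print(first_order_density(vol, lambda u: sigma, lambda u: 0.0, 0.5, 1.0, 1.3))
```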

Before proceeding to applications of this approximation, we now briefly comment on how we will use the heat kernel expansion formula for purposes of transition density estimation and derivative pricing. If $x^i$ are $n$ stochastic processes with initial values $x^i_0$, then we have seen that the transition probability density $p(x^i, x^i_0)$ is determined by solving a second order linear parabolic equation with delta function initial data. We will use the heat kernel expansion to estimate the solution of this PDE initial value problem. We can then use an expression for the transition density to price a generic derivative by integrating the derivative's payoff function against the transition density. In some instances, evaluation of this integral is not possible, and we can turn to a Laplace/steepest descent approximation.



3. Local Volatility Model Applications

We now consider heat kernel expansions in the context of one-dimensional local volatility models. The underlying geometry of these models will always be Euclidean when we express the model dynamics in an appropriate local coordinate system. Thus the distance function is easy to determine and the Van-Vleck determinant will always be unity. We can even calculate the $a_i$ heat kernel series coefficient functions exactly in the case of several one-dimensional models. We then comment on the errors associated with this approximation procedure.

We start by considering the Black-Scholes-Merton model in order to illustrate these perturbative methods in a simple context. We then turn to the CEV and quadratic local volatility models. In each case, we are able to compute all the heat kernel coefficients. We then construct first and second order expansion formulas for the transition density for three new local volatility models which we call the cubic, affine-affine, and generalized CEV models.

3.1. Black-Scholes-Merton Example. We now consider the simple case of the one dimensional lognormal Black-Scholes-Merton (BSM) model [18], [119]. We can compute all quantities involved in the heat equation exactly in this model and in fact can even invert the series in the heat kernel expansion to produce the known analytic expression for the BSM transition density function thus providing an alternative way to solve the BSM equation (see Andreasen et al. [6] for a variety of ways to solve the BSM equation).

The risk neutral dynamics are given by

(3.1) $dS(t) = rS(t)\,dt + \sigma S(t)\,dW(t),$

where $r$ is the risk-free interest rate and $\sigma > 0$ is a volatility constant. Thus $\mu^S = rS$ and $\sigma^{SS} = \sigma^2S^2$, so that the transition probability density function $p(S, T, S_0, t) = p(\tau, S_0, S)$ must satisfy

(3.2) $\frac{\partial p}{\partial\tau} = rS_0\frac{\partial p}{\partial S_0} + \frac{\sigma^2}{2}S_0^2\frac{\partial^2 p}{\partial S_0^2}, \quad p(0, S_0, S) = \delta(S - S_0).$

Now this equation can be solved exactly by making appropriate coordinate changes to reduce it to the heat equation or by using Fourier transform techniques and its solution is given by

(3.3) $p(S, T, S_0, t) = \frac{1}{S\sigma\sqrt{2\pi\tau}}\exp\left(-\frac{\left(\log(S/S_0) - (r - \sigma^2/2)\tau\right)^2}{2\sigma^2\tau}\right).$

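We now compute the quantities that go into the heat kernel expansion formula which approximates the solution of this PDE.

To ease notation, let $\xi = S_0$. First, we know that the inverse metric is given by $g^{\xi\xi} = \frac{1}{2}\sigma^{\xi\xi} = \frac{\sigma^2\xi^2}{2}$, and hence $g_{\xi\xi} = 2/(\sigma^2\xi^2)$. Now we define a new coordinate $s(\xi) = \frac{\sqrt{2}}{\sigma}\int_S^\xi d\ln u$ such that $\partial s/\partial\xi = \sqrt{2}/(\sigma\xi)$. In this coordinate, the metric takes the form $g_{ss} = (\partial\xi/\partial s)^2g_{\xi\xi} = 1$, i.e. it is just the standard Euclidean metric and hence $s$ is an arc-length coordinate; that is to say, $s$ parameterizes geodesics on $(\mathbb{R}, g)$ by arclength. We can thus compute the distance function associated to this metric immediately to be

(3.4) $d(x_1, x_2) = s(x_2) - s(x_1) = \frac{\sqrt{2}}{\sigma}(\ln x_2 - \ln x_1) = \frac{\sqrt{2}}{\sigma}\ln\frac{x_2}{x_1},$

where we take $x_1 < x_2$. From the form of $d(s_0, s) = s - s_0$, it is immediate that $\Delta(s_0, s) = 1$. Since this does not depend on local coordinates, we have that $\Delta(\xi, S) = 1$ as well. We finally need to compute $P$ and $a_k$, for which we need to know $A$; we also compute $Q$:

(3.5) $A_\xi = \frac{1}{2}g_{\xi\xi}\mu^\xi + \frac{1}{2}\Gamma^\xi_{\xi\xi} = \left(\frac{r}{\sigma^2} - \frac{1}{2}\right)\frac{1}{\xi},$

(3.6) $Q = g^{\xi\xi}\left(\partial_\xi A_\xi - \Gamma^\xi_{\xi\xi}A_\xi + A_\xi^2\right) = \frac{\sigma^2}{2}\left(\frac{r}{\sigma^2} - \frac{1}{2}\right)^2,$

where we use the fact that $\Gamma^\xi_{\xi\xi} = \frac{1}{2}g^{\xi\xi}\partial_\xi g_{\xi\xi} = -1/\xi$. Using this we can compute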

(3.7) $P(\xi, S) = \exp\left(-\int_\xi^S A_{\xi'}\,d\xi'\right) = \exp\left(-\left(\frac{r}{\sigma^2} - \frac{1}{2}\right)\int_\xi^S\frac{1}{\xi'}\,d\xi'\right) = \left(\frac{S}{\xi}\right)^{\frac{1}{2} - \frac{r}{\sigma^2}},$

where we used the formula

(3.8) $P(x, x') = \exp\left(-\int_{C(x, x')}A_i\,dx^i\right),$

where $C(x, x')$ is the arc length parametrized oriented geodesic starting at $x$ and ending at $x'$. Note that the ordering on the arguments of $P$ is important. It turns out that we get the same result if we compute $a_1(S, \xi)$ instead of $a_1(\xi, S)$ in the BSM case, however in more complicated models this does not hold. We will stick to the convention that the first argument of the $a_k$ should always be the spatial variables of the relevant probability density PDE and the second should be an initial data constant.

Now we are interested in solving the BSM PDE

(3.9) $\partial_\tau p = r\xi\,\partial_\xi p + \frac{\sigma^2}{2}\xi^2\,\partial_\xi^2 p = L_\xi\,p(\tau, \xi, x),$

where we denote the elliptic part of the BSM density operator by $L_\xi$ to illustrate that $\xi$ is the spatial variable of the operator.

We have to reverse the arguments of our previous heat kernel formula to compensate for the fact that $\xi$ is the initial asset price and find that the relevant heat kernel expansion formula is given by

(3.10) $p(\tau, \xi, S) = \frac{\sqrt{g(S)}}{(4\pi\tau)^{n/2}}\,P(S, \xi)\,\Delta^{1/2}\exp\left(-\frac{d^2(S, \xi)}{4\tau}\right)\sum_{k=0}^{\infty}a_k(S, \xi)\tau^k.$

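Now note that

(3.11) $$P(S, \alpha) = \exp\left(\int_\alpha^S A_\alpha\,d\alpha\right) = \exp\left(-\int_S^\alpha A_\alpha\,d\alpha\right) = P(\alpha, S)^{-1}.$$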

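Now we want to compute

(3.12) $$a_1(S, \alpha) = \frac{1}{d}\int_C P^{-1}(S, \alpha)(\Delta_A - Q)P(S, \alpha)\,ds = \frac{1}{d}\int_C \left[P^{-1}\Delta_g P + 2P^{-1}g^{\alpha\alpha}A_\alpha\partial_\alpha P\right]ds,$$

where here $C$ is the geodesic that joins $\alpha$ to $S$ and $P = P(S, \alpha)$ in the above. We now compute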

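(3.13) $$P^{-1}(S, \alpha)\Delta_g P(S, \alpha) = P^{-1}\frac{1}{\sqrt{g}}\partial_\alpha\left(\sqrt{g}\,g^{\alpha\alpha}\partial_\alpha P\right) = \frac{\sigma^2}{2}\left(\frac{1}{2} - \frac{r}{\sigma^2}\right)^2 = Q.$$

Next we compute

(3.14) $$2P^{-1}(S, \alpha)\,g^{\alpha\alpha}A_\alpha\partial_\alpha P(S, \alpha) = -\sigma^2\left(\frac{1}{2} - \frac{r}{\sigma^2}\right)^2 = -2Q,$$

which when combined with the previous yields

(3.15) $$a_1(S, \alpha) = \frac{1}{d}\int_0^d [Q - 2Q]\,ds = \frac{1}{d}\int_S^\alpha (-Q)\frac{\sqrt{2}}{\sigma\alpha}\,d\alpha = -Q.$$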



Similarly, we can use the formula for $a_k$ to find that

(3.16) $$a_k(S, \alpha) = \frac{1}{d^k}\int_0^d s^{k-1}\frac{(-Q)^k}{(k-1)!}\,ds = \frac{(-Q)^k}{k!} = \frac{(-1)^k(2r - \sigma^2)^{2k}}{k!\,2^{3k}\sigma^{2k}}.$$

Thus we can sum the $a_k$ exactly by recognizing them as the Taylor series coefficients of an exponential function

(3.17) $$\sum_{k=0}^{\infty} a_k\tau^k = \sum_{k=0}^{\infty}\frac{(-Q\tau)^k}{k!} = e^{-Q\tau}.$$

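We can now put these pieces together in our heat kernel formula to find that the density function is given by

(3.18) $$p(\tau, \alpha, S) = \frac{\sqrt{g(S)}}{\sqrt{4\pi\tau}}\,P(S, \alpha)\exp\left(-\frac{d(\alpha, S)^2}{4\tau}\right)\sum_k a_k\tau^k$$

(3.19) $$= \frac{1}{S\sigma\sqrt{2\pi\tau}}\exp\left(\left(\frac{r}{\sigma^2} - \frac{1}{2}\right)\ln\frac{S}{\alpha}\right)\exp\left(-\frac{\ln(S/\alpha)^2}{2\sigma^2\tau}\right)e^{-Q\tau}$$

(3.20) $$= \frac{1}{S\sigma\sqrt{2\pi\tau}}\exp\left(-\frac{[\ln(S/\alpha) - (r - \sigma^2/2)\tau]^2}{2\sigma^2\tau}\right),$$

which is exactly the same transition density function that we found before. We now consider examples to determine how accurate different order approximations are compared to the exact solution of the model in order to build intuition for more complex models to come. In order to achieve this we need to define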

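(3.21) $$p_n(\tau, \alpha, S) = \frac{1}{S\sigma\sqrt{2\pi\tau}}\exp\left(\left(\frac{r}{\sigma^2} - \frac{1}{2}\right)\ln\frac{S}{\alpha} - \frac{\ln^2(S/\alpha)}{2\sigma^2\tau}\right)\sum_{k=0}^{n}\frac{(-Q)^k}{k!}\tau^k.$$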

We first note that when one considers graphs of the $p_n$ for typical market values of the parameters $\sigma$ and $r$, i.e. $\sigma \in [0, 0.5]$ and $r \in [0, 0.1]$, it is hard to distinguish the graph of $p_0$ from the exact solution for standard maturity times unless one considers minuscule codomain scales, due to the relatively small size of the correction terms in the perturbation series (or the correspondingly highly accurate nature of the leading terms of the approximation). We thus first consider a plot using a reasonable interest rate value of $r = 0.05$ along with a high volatility $\sigma = 0.8$. In addition, we take a unit initial asset price $S_0 = 1$ and $\tau = 1$ and plot $p_0$ through $p_4$ along with the exact solution in Figure 1. There only appear to be two graphs (even though we are plotting six curves) in the figure. The zeroth order correction $p_0$ is represented by the greatest graph and $p_1, \ldots, p_4$ as well as the exact solution are depicted in the second graph; if one zooms in very closely, one can see where the graph corresponding to the first order approximation differs from the higher order graphs. Thus the $p_i$, for $i > 1$, approximate the exact solution very accurately, to the point that they are indistinguishable in our plot.

Figure 1. This is a plot of $p_0, \ldots, p_4$ from equation (3.21) and the exact density $p$ from equation (3.3) for $\alpha = 1$, $\tau = 1$, $\sigma = 0.8$, and $r = 0.05$.

There are a few notable statistical features of these density functions. First we note that if we let $I_n = \int_{\mathbb{R}_+} p_n\,dS$, then we can numerically integrate to find that $I_0 = 1.05861$, $I_1 = 0.998315$, $I_2 = 1.00003$, and $I_3 = 1.00000$. The $I_n$ give an estimate of how fast the $p_n$ are converging to a valid density function, since such a function must integrate to one. Now let $X$ be the random variable that has a density function given by $p$, i.e. the exact solution, and let $X_n$ be random variables which have $p_n$ as pseudo-density functions (we use the qualifier pseudo here since none of the $p_n$ can in fact be density functions, as they do not integrate to unity; we do not normalize the $p_n$ to fix this). We first note that $EX_0 = 1.11288$, which overestimates the mean of $X$, as is expected by inspection of the plots. Next, we find that $EX_1 = 1.0495$, which is now an underestimate. This is due to the fact that the terms in the heat kernel expansion formula alternate in sign, and thus we expect an alternating over/under-estimation of the mean. This is confirmed by further considering $EX_2 = 1.05130$ and $EX_3 = 1.05127$. To five decimal places, all the higher $p_n$ give the same value as $EX_3$, so $EX = 1.05127$, which is a reasonable expected value given that we have specified the risk neutral dynamics at a five percent annual risk free rate.
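The masses $I_n$ and means $EX_n$ quoted above can be reproduced with a short numerical sketch. The Python below evaluates the partial sums of (3.21) and integrates them with a trapezoid rule; the function names, the substitution $u = \ln S$, the integration window, and the step count are illustrative assumptions, not part of the original computation.

```python
import math

def bsm_partial_density(n, tau, alpha, S, sigma, r):
    """Partial sum p_n of the heat kernel expansion, equation (3.21)."""
    c = r / sigma**2 - 0.5                             # exponent r/sigma^2 - 1/2
    Q = (2.0 * r - sigma**2) ** 2 / (8.0 * sigma**2)   # Q from (3.6)
    x = math.log(S / alpha)
    lead = math.exp(c * x - x * x / (2.0 * sigma**2 * tau))
    lead /= S * sigma * math.sqrt(2.0 * math.pi * tau)
    series = sum((-Q * tau) ** k / math.factorial(k) for k in range(n + 1))
    return lead * series

def pseudo_moment(n, power, tau, alpha, sigma, r, lo=-15.0, hi=15.0, steps=20000):
    """Trapezoid rule for the integral of S**power * p_n(S) over S > 0.

    The substitution u = ln S (chosen here for numerical convenience)
    turns dS into S du, hence the extra factor of S below.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        S = math.exp(lo + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * S ** (power + 1) * bsm_partial_density(n, tau, alpha, S, sigma, r)
    return total * h

# Parameters of Figure 1: alpha = 1, tau = 1, sigma = 0.8, r = 0.05.
masses = [pseudo_moment(n, 0, 1.0, 1.0, 0.8, 0.05) for n in range(4)]  # I_0 .. I_3
means = [pseudo_moment(n, 1, 1.0, 1.0, 0.8, 0.05) for n in range(4)]   # EX_0 .. EX_3
print(masses)
print(means)
```

Because $p_n$ differs from the exact density only by the factor $e^{Q\tau}\sum_{k \le n}(-Q\tau)^k/k!$, the same numbers also follow in closed form, which gives an independent check on the quadrature.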

The over/under estimation pattern is brought out more clearly in our next graph. In Figure 2, we again plot $p_0, \ldots, p_4$ together with the exact solution, but this time take an unusually high interest rate $r = 0.4$ together with a standard volatility $\sigma = 0.3$ while still keeping $S_0 = \tau = 1$.

Here $p_0$ is the greatest curve, $p_1$ the least, $p_2$ the second greatest, and $p_3$ the second least; $p_4$ and the exact solution are plotted in the middle and cannot be distinguished. Here we find $EX = 1.49182$.

We note that our sequence of pseudo-densities $p_n$ converges uniformly to the exact density at every point in their domains. We will see in our next example that this is not the case for more general models.

Figure 2. This is a plot of $p_0, \ldots, p_4$ from equation (3.21) and the exact density $p$ from equation (3.3) for $\alpha = 1$, $\tau = 1$, $\sigma = 0.3$, and $r = 0.4$.

Finally, we note that we can tell from the form of (3.16) how well the approximation will work for different values of the model parameters $(r, \sigma)$. In particular, note that

(3.22) $$|a_k| = \frac{(2r - \sigma^2)^{2k}}{k!\,2^{3k}\sigma^{2k}}.$$

Thus when $2r \approx \sigma^2$ we see the higher order correction terms become small very quickly. Note that the $a_k$ are generically small for reasonable choices of interest rates and volatilities. Also, if one were to apply this model in a highly volatile market where interest rates are correspondingly high, i.e. $\sigma = 0.5$ and $r = 0.12$, we would still have a very accurate approximation even to low order in the $a_i$. Lastly, note that one can apply the ratio test to the $a_i$ to see that the series $\sum_i a_i$ is indeed convergent. We will see in the next section that this is a special property of BSM dynamics that does not hold in the CEV setting, where the associated $a_i$ series is in fact always divergent except when it reduces to the BSM case.
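To make the size of the coefficients in (3.22) concrete, the following sketch tabulates $|a_k|$ and the successive ratios $|a_{k+1}/a_k| = Q/(k+1)$ for the three parameter pairs mentioned in the text. The function name and the choice to print five coefficients are assumptions made for illustration.

```python
import math

def bsm_coefficient(k, sigma, r):
    """Heat kernel coefficient a_k = (-Q)^k / k! from equation (3.16)."""
    Q = (2.0 * r - sigma**2) ** 2 / (8.0 * sigma**2)
    return (-Q) ** k / math.factorial(k)

# (sigma, r) pairs taken from Figure 1, Figure 2 and the high-volatility remark.
for sigma, r in [(0.8, 0.05), (0.3, 0.4), (0.5, 0.12)]:
    coeffs = [bsm_coefficient(k, sigma, r) for k in range(5)]
    ratios = [abs(coeffs[k + 1] / coeffs[k]) for k in range(4)]  # equals Q / (k + 1)
    print(sigma, r, [abs(a) for a in coeffs], ratios)
```

The ratios shrink like $1/(k+1)$, which is the ratio-test argument for convergence in the BSM case.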

We finally consider a slightly more global representation of the error associated with this approximation in Figure 3. Here we fix $\alpha = 1$, $T = 1$, and $r = 0.05$ and plot the difference between our first order approximation and the exact solution, namely $p - p_1$, for small asset values $S \in [0, 2]$ and the full range of realistic volatility values $\sigma \in [0, 1]$.

Note the error is generically small except in the case where $\sigma$ is large and $S$ is small. There is also a spike in the error in a neighborhood around $(S, \sigma) = (1, 0.15)$. This spike appears somewhat generic to all terms, and the error in this region is significantly reduced by taking higher order approximations to the density.

3.2. Constant Elasticity of Variance Model. We now consider a generalization of the previous BSM model called the Constant Elasticity of Variance (CEV) model (see Brecher and Lindsay [22] for a survey). The CEV model is a local volatility model and is widely used in many financial contexts for smile modeling. The model dynamics are given by

(3.23) $$dS = \sigma S^\beta\,dW, \qquad S(0) = S_0,$$

where we fix $\beta \in (0, 1)$ and $\sigma > 0$. This model can be thought of as the natural interpolation between the normal Bachelier ($\beta \to 0$) and lognormal Black-Scholes-Merton ($\beta \to 1$) models.

We note that when $\beta = 1$, one can compute the heat kernel coefficients $a_k$ exactly and in fact is able to invert the heat kernel expansion series and recover the exact well known transition density for a lognormal process. Note that this SDE has no drift. The presence of a drift term significantly complicates the perturbation theory involved in computing an expansion formula. We will consider adding a mean reversion/drift term when we later investigate a generalized CEV model.

The solutions of the CEV SDE fall into two classes depending on whether $\beta \in (0, 1/2)$ or $\beta \in [1/2, 1)$. In the former case, it was shown by Feller [56] that the level $S = 0$ is attainable and one needs to specify whether this boundary is absorbing (meaning that the process remains trivial after hitting the zero level) or reflecting. If $\beta \in [0.5, 1)$, then the boundary is always absorbing. Also, if $\beta > 1$, the zero level is not attainable so there is no need to consider a spatial boundary condition.

Figure 3. This is a first order absolute error plot of $p - p_1$ where $p_1$ is given by (3.21) and the exact transition density $p$ by equation (3.3). Here $S \in [0, 2]$ and $\sigma \in [0, 1]$.

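We now compute the geometric quantities that are relevant to this model. First we note $\sigma(\alpha)^2 = \sigma^2\alpha^{2\beta}$ where $\alpha = S_0$ and $\mu = 0$, so we wish to approximate the solution of the transition density PDE

(3.24) $$\partial_\tau p = \frac{1}{2}\sigma^2\alpha^{2\beta}\partial_\alpha^2 p, \qquad p(0, \alpha, S) = \delta(S - \alpha).$$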

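Now in the case of an absorbing spatial boundary condition at the $S = 0$ level, this PDE has an exact solution given by

(3.25) $$p(\tau, S, \alpha) = \frac{S^{\frac{1}{2} - 2\beta}\sqrt{\alpha}}{(1 - \beta)\sigma^2\tau}\exp\left(-\frac{S^{2(1-\beta)} + \alpha^{2(1-\beta)}}{2(1 - \beta)^2\sigma^2\tau}\right)I_{\frac{1}{2(1-\beta)}}\left(\frac{(S\alpha)^{1-\beta}}{(1 - \beta)^2\sigma^2\tau}\right),$$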

where here $I_\nu(x)$ is the modified Bessel function of the first kind defined by

(3.26) $$I_\nu(x) = \left(\frac{x}{2}\right)^{\nu}\sum_{k=0}^{\infty}\frac{(x^2/4)^k}{k!\,\Gamma(\nu + k + 1)},$$

and $\Gamma(x)$ is the standard interpolation of the factorial function, commonly called the Gamma function, and is defined for positive real values according to

(3.27) $$\Gamma(x) = \int_{\mathbb{R}_+} u^{x-1}e^{-u}\,du.$$
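The exact density (3.25) can be evaluated directly from the series (3.26). The sketch below does so in plain Python; the term-by-term recursion, the stopping tolerance, the midpoint quadrature, and the upper integration limit are assumptions chosen for illustration, and the series is only suitable for moderate Bessel arguments (a scaled library routine would be preferable for large ones).

```python
import math

def bessel_i(nu, x, tol=1e-16, max_terms=2000):
    """Modified Bessel function I_nu(x) summed from the series (3.26).

    Terms follow t_{k+1} = t_k * (x^2/4) / ((k+1) * (nu+k+1)), which avoids
    forming large powers and factorials separately.
    """
    q = 0.25 * x * x
    term = 1.0 / math.gamma(nu + 1.0)
    total = term
    for k in range(max_terms):
        term *= q / ((k + 1.0) * (nu + k + 1.0))
        total += term
        if term < tol * total:
            break
    return (0.5 * x) ** nu * total

def cev_exact_density(tau, S, alpha, sigma, beta):
    """Transition density (3.25) of the driftless CEV model, absorbing at S = 0."""
    one = 1.0 - beta
    scale = one * one * sigma * sigma * tau
    nu = 1.0 / (2.0 * one)
    prefactor = S ** (0.5 - 2.0 * beta) * math.sqrt(alpha) / (one * sigma * sigma * tau)
    exponent = -(S ** (2.0 * one) + alpha ** (2.0 * one)) / (2.0 * scale)
    return prefactor * math.exp(exponent) * bessel_i(nu, (S * alpha) ** one / scale)

def cev_moment(power, tau, alpha, sigma, beta, upper=60.0, steps=6000):
    """Midpoint rule for the integral of S**power * p over (0, upper)."""
    h = upper / steps
    return h * sum(
        ((i + 0.5) * h) ** power * cev_exact_density(tau, (i + 0.5) * h, alpha, sigma, beta)
        for i in range(steps)
    )

# Parameters used in the CEV figures: alpha = 1, T = 10, sigma = 0.3, beta = 0.6.
mass = cev_moment(0, 10.0, 1.0, 0.3, 0.6)  # below one: some probability is absorbed at S = 0
mean = cev_moment(1, 10.0, 1.0, 0.3, 0.6)  # close to alpha: the absorbed process is a martingale
print(mass, mean)
```

The mass falling short of one while the mean stays at $\alpha$ is the numerical signature of the absorbing boundary.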

One can verify this solution directly by substitution and use of the modified Bessel function identity

(3.28) $$\frac{\partial I_\nu(x)}{\partial x} = \frac{1}{2}\left(I_{\nu-1}(x) + I_{\nu+1}(x)\right),$$



which can be iterated to compute the required second derivative formula. Note that $p$ is not analytic at the point $\tau = 0$. Also for certain parameter values, $\int p\,dS$



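and that $s(S) = 0$ and $s(\alpha) = d(S, \alpha)$. Now

(3.38) $$a_1(S, \alpha) = \frac{\beta(\beta - 2)\sigma^2}{8d}\int_0^d \alpha^{2(\beta - 1)}\,ds = \frac{\sqrt{2}\,\sigma\beta(\beta - 2)}{8d}\int_S^\alpha \alpha^{\beta - 2}\,d\alpha = \frac{\beta(\beta - 2)\sigma^2}{8}(\alpha S)^{\beta - 1}.$$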

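Next we can compute

(3.39) $$a_2 = \frac{\beta(\beta - 2)(3\beta - 4)(3\beta - 2)\sigma^4}{128}(\alpha S)^{2\beta - 2}.$$

Moreover, after a long calculation we find that

(3.40) $$\frac{a_{n+1}}{a_n} = \frac{\sigma^2}{8(n + 1)}\left[(2n + 1)\beta - (2n + 2)\right]\left[(2n + 1)\beta - 2n\right](\alpha S)^{\beta - 1}.$$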

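Putting the pieces back together, we find that

(3.41) $$p(\tau, \alpha, S) = \frac{\sqrt{g(S)}}{\sqrt{4\pi\tau}}\,P(S, \alpha)\exp\left(-\frac{d(\alpha, S)^2}{4\tau}\right)\sum_k a_k\tau^k$$

(3.42) $$= \frac{1}{\sigma S^\beta\sqrt{2\pi\tau}}\left(\frac{S}{\alpha}\right)^{-\beta/2}\exp\left(-\frac{(S^{1-\beta} - \alpha^{1-\beta})^2}{2\sigma^2\tau(1 - \beta)^2}\right)\sum_{k=0}^{\infty} a_k\tau^k.$$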

Just as in the previous BSM example, we will label the $k$-th partial sum of the above formula by $p_k$. We now comment on the non-convergence of this series. Note that if we apply the ratio convergence test, we find

(3.43) $$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \infty,$$

except in the BSM case where $\beta = 1$. Hence the series will always diverge for $\beta \neq 1$. One might expect this to invalidate the use of these perturbation methods for the CEV model. However, for $\beta \approx 1$, they approximate the true density quite well when the series is truncated after the first few $a_k$, as we will demonstrate in a few examples. If one takes $k$ large, however, one finds that this approximation becomes increasingly inaccurate in parameter regimes of increasing size.
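The growth of the ratio in (3.43) is easy to see numerically. The sketch below builds the coefficients from $a_0 = 1$ and the recursion (3.40) and evaluates the partial sums of (3.42); the function names and the sample parameters are assumptions for illustration, and the code requires $\beta \neq 1$ only in the density (the coefficient recursion itself also covers $\beta = 1$).

```python
import math

def cev_coefficients(n_max, alpha, S, sigma, beta):
    """Coefficients a_0 .. a_{n_max} built from a_0 = 1 and the ratio (3.40)."""
    base = (alpha * S) ** (beta - 1.0)
    coeffs = [1.0]
    for n in range(n_max):
        ratio = (sigma**2 / (8.0 * (n + 1))
                 * ((2 * n + 1) * beta - (2 * n + 2))
                 * ((2 * n + 1) * beta - 2 * n)
                 * base)
        coeffs.append(coeffs[-1] * ratio)
    return coeffs

def cev_partial_density(n, tau, alpha, S, sigma, beta):
    """Partial sum p_n of the expansion (3.42); assumes beta != 1."""
    dist = (S ** (1.0 - beta) - alpha ** (1.0 - beta)) / (1.0 - beta)
    lead = (alpha / S) ** (beta / 2.0) / (sigma * S**beta * math.sqrt(2.0 * math.pi * tau))
    lead *= math.exp(-dist**2 / (2.0 * sigma**2 * tau))
    coeffs = cev_coefficients(n, alpha, S, sigma, beta)
    return lead * sum(a * tau**k for k, a in enumerate(coeffs))

# alpha = S = 1, sigma = 0.3, beta = 0.6, tau = 10: watch |a_{n+1} tau / a_n| grow with n.
a = cev_coefficients(51, 1.0, 1.0, 0.3, 0.6)
for n in (2, 10, 20, 50):
    print(n, abs(a[n + 1] / a[n]) * 10.0)
```

For these parameters the printed quantity eventually exceeds one, so adding further terms makes the partial sums worse rather than better, which is the behaviour of an asymptotic rather than a convergent series.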

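We demonstrate this with plots of two examples, but first provide an alternative approximation formula from Labordere [98]. In Labordere [98, p.132], this approximation formula for the transition density is given by

(3.44) $$p_{HL}(\tau, S, \alpha) = \frac{S^{-\beta}}{\sqrt{2\pi\sigma^2\tau}}\exp\left(-\frac{1}{2\sigma^2\tau}\left(\int_{S_0}^{S}\frac{1}{u^\beta}\,du\right)^2\right)\left(\frac{\alpha}{S}\right)^{\beta/2}$$

(3.45) $$\times\left[1 + \frac{1}{8}S_{av}^{2\beta - 2}\,\beta(\beta - 2)\,\sigma^2\tau + \frac{\beta(\beta - 2)(3\beta - 2)(3\beta - 4)}{128}S_{av}^{4\beta - 4}(\sigma^2\tau)^2\right].$$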

This is a second order formula where the heat kernel coefficients $a_1$ and $a_2$ have not been computed exactly but rather are approximated by their simpler diagonal values. In particular, the $a_i$ are approximated by $a_i(\alpha, S) = a_i(S_{av}, S_{av})$ where $S_{av} = (S + \alpha)/2$, which takes advantage of the simple form of the diagonal heat kernel coefficients, which in the one dimensional CEV case are given by

(3.46) $$a_1(S, S) = Q(S), \qquad a_2(S, S) = \frac{1}{2}\left(Q(S)^2 + \frac{\Delta_g Q(S)}{3}\right).$$
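The two second order formulas share the same leading factor and differ only in where the coefficients are evaluated. The sketch below makes this explicit by comparing (3.44)-(3.45) with the second partial sum of (3.42); the function names and the factoring into a common leading term are assumptions for illustration, and the code assumes $\beta \neq 1$ and $S_0 = \alpha$.

```python
import math

def cev_leading_term(tau, S, alpha, sigma, beta):
    """Factor common to (3.42) and (3.44): metric, transport and Gaussian terms."""
    geodesic = (S ** (1.0 - beta) - alpha ** (1.0 - beta)) / (1.0 - beta)  # integral of u^(-beta)
    gauss = math.exp(-geodesic**2 / (2.0 * sigma**2 * tau))
    return S ** (-beta) / math.sqrt(2.0 * math.pi * sigma**2 * tau) * gauss * (alpha / S) ** (beta / 2.0)

def second_order_bracket(tau, point, sigma, beta):
    """1 + a_1 tau + a_2 tau^2, with `point` standing in for (alpha S)^(beta - 1)."""
    a1 = beta * (beta - 2.0) * sigma**2 / 8.0 * point
    a2 = beta * (beta - 2.0) * (3.0 * beta - 2.0) * (3.0 * beta - 4.0) * sigma**4 / 128.0 * point**2
    return 1.0 + a1 * tau + a2 * tau**2

def labordere_density(tau, S, alpha, sigma, beta):
    """p_HL from (3.44)-(3.45): coefficients frozen at S_av = (S + alpha) / 2."""
    s_av = 0.5 * (S + alpha)
    return cev_leading_term(tau, S, alpha, sigma, beta) * second_order_bracket(
        tau, s_av ** (2.0 * beta - 2.0), sigma, beta)

def off_diagonal_density(tau, S, alpha, sigma, beta):
    """p_2 from (3.42): coefficients (3.38)-(3.39) evaluated at (S, alpha)."""
    return cev_leading_term(tau, S, alpha, sigma, beta) * second_order_bracket(
        tau, (alpha * S) ** (beta - 1.0), sigma, beta)

for S in (0.2, 0.5, 1.0, 2.0):
    print(S, labordere_density(10.0, S, 1.0, 0.3, 0.6), off_diagonal_density(10.0, S, 1.0, 0.3, 0.6))
```

At $S = \alpha$ the two brackets coincide, since $S_{av}^{2\beta - 2} = (\alpha S)^{\beta - 1}$ there; away from that point the arithmetic mean in $S_{av}$ and the geometric mean in $(\alpha S)^{1/2}$ separate, which is where the two approximations differ.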

We now consider two examples of plots of the $p_n$. First we will set the model parameters to $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$. In Figure 4, the greatest graph is the approximation from Labordere [98]; the other graphs show the exact transition density, our zeroth order correction, and our first order correction.

Figure 4. Here we plot $p$ from equation (3.25) together with $p_{HL}$ from equation (3.45) alongside our approximations $p_0$ and $p_1$ from equation (3.42) for the parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$.

We note that all the graphs are virtually indistinguishable for $S \in [0.5, 5]$. For small values $S \in [0.1, 0.5]$, our first and second order approximations are closer to the exact transition density function than $p_{HL}$. Now for very low values of $S$, all the approximations break down. This fact is brought out more clearly in Figure 5, where the middle exponential type graph is the exact solution and the greatest graph is again $p_{HL}$. The other plots in decreasing order correspond to $p_1$, $p_2$, $p_3$, $p_5$, and $p_{10}$ respectively. Here we can see on the small scale $S \in [0, 0.1]$ that the approximation formulas are becoming increasingly inaccurate. We demonstrate the degeneration of the $p_k$ in Figure 6.

Here, in increasing order of the graphs, we plot $p_1$, $p_2$, $p_3$, $p_5$, $p_{10}$, $p_{20}$, $p_{30}$, $p_{40}$, and $p_{50}$. Thus as $n$ grows large, the associated $p_n$ approximations degenerate. Also note that $p_{n+1} < p_n$ pointwise, which can straightforwardly be deduced from the form of equation (3.42). In addition, it seems that whenever we find a point where the transition density is accurately approximated by some $p_n$, say at $S = S_0$, then the associated approximating function also gives an accurate approximation of the exact transition density for all $S > S_0$. One could exploit this fact in practice, for instance by using a Monte Carlo method to check the accuracy of the approximation at some $S$ value; if one establishes that the approximation is indeed good, it can then be safely used for larger $S$ values.

We now turn to one additional example where we keep all the model parameters fixed but let $\beta = 0.3$. In Figure 7, we plot the exact solution in the top red graph, and the second order approximation $p_{HL}$ in green. Then the decreasing graphs are given by the plots of $p_0$, $p_1$, $p_2$, $p_3$, $p_5$, $p_{10}$, and $p_{20}$. Note again that the $p_n$ degenerate as $n \to \infty$, and $p_2$ gives a more accurate second order approximation of the exact solution than $p_{HL}$.

Lastly, we provide a somewhat more global description of the error encountered in our CEV approximation in Figure 8. Here we again take $\alpha = 1$, $T = 10$, and $\sigma = 0.3$. We then plot the difference between the exact solution and our second order approximation, $p - p_2$, for $S \in [0, 0.5]$ and $\beta \in [0.001, 1]$.

Figure 5. Here we plot $p$ from equation (3.25) together with $p_{HL}$ from equation (3.45) alongside our approximations $p_1$, $p_2$, $p_3$, $p_5$, and $p_{10}$ from equation (3.42) for the parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$.

Figure 6. Here we plot $p$ from equation (3.25) together with $p_{HL}$ from equation (3.45) alongside our approximations $p_1$, $p_2$, $p_3$, $p_5$, $p_{10}$, $p_{20}$, $p_{40}$, and $p_{50}$ from equation (3.42) for the parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$.


Figure 7. Here we plot $p$ from equation (3.25) together with $p_{HL}$ from equation (3.45) alongside our approximations $p_0$, $p_1$, $p_2$, $p_3$, $p_5$, $p_{10}$, and $p_{20}$ from equation (3.42) for the parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.3$.

Figure 8. This is a second order absolute error plot of $p - p_2$, where $p_2$ is given by (3.42) and the exact transition density $p$ is defined by equation (3.25). Here $S \in [0, 0.5]$ and $\beta \in [0.001, 1]$ with fixed model parameters $\alpha = 1$, $T = 10$, and $\sigma = 0.3$.

Note that as we increase $\beta$, the graph tends to the zero plane for all $S$. In particular, the only places where trouble seems to arise are for small values of $\beta$ (this appears to be independent of $S$) and for simultaneous small values of $S$ and low values of $\beta$, e.g. $\beta \in [0, 0.5]$.



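We finally note that it is possible to derive these results in a slightly simpler fashion. We can simplify the CEV SDE by changing variables. In particular, we define $x(t) = S(t)^{2(1-\beta)}/(\sigma^2(1 - \beta)^2)$. Then using Itô's lemma, we find that

(3.47) $$dx = \frac{\partial x}{\partial S}\,dS + \frac{1}{2}\frac{\partial^2 x}{\partial S^2}\,dS^2 = \frac{1 - 2\beta}{1 - \beta}\,dt + 2\sqrt{x}\,dW \equiv \delta\,dt + 2\sqrt{x}\,dW.$$

The process $x$ is known as a $\delta$-Bessel process. Note that with $\delta = (1 - 2\beta)/(1 - \beta)$, for $\beta \in (-\infty, 1/2)$ we have $\delta \in (0, 2)$, and $\delta$ decreases monotonically as a function of $\beta$. For $\beta \in (1/2, 1)$, $\delta \in (-\infty, 0)$ is monotonically decreasing. For $\beta \in (1, \infty)$, $\delta$ is decreasing from $\infty$ to $2$.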

The associated equation for the transition probability density is given by

(3.48) $$\partial_\tau p = 2\xi\,\partial_\xi^2 p + \delta\,\partial_\xi p,$$

where we again set $\xi = x_0$ in order to simplify our notation. We will consider this process for arbitrary $\delta$.

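We identify $\mu = \delta$ and $\sigma^2 = 4\xi$. Thus $g^{\xi\xi} = 2\xi$ and $g_{\xi\xi} = 1/(2\xi)$. Again we change variables by defining an arclength coordinate $s(\xi) = \frac{1}{\sqrt{2}}\int_x^\xi \frac{du}{\sqrt{u}}$ which measures the distance we are away from $x$. In this coordinate $g_{ss} = (\partial\xi/\partial s)^2 g_{\xi\xi} = 1$, and thus we have $\Delta(\xi, x) = 1$.

Note that $s(x) = 0$ and $s(\xi) = d(x, \xi)$, so $s$ parameterizes a geodesic by arclength starting from $x$ and ending at $\xi = x_0$, i.e. it parametrizes the curve backwards. We next compute $\Gamma = \frac{1}{2}g^{\xi\xi}\partial_\xi g_{\xi\xi} = -1/(2\xi)$, from which we see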

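(3.49) $$A_\xi = \frac{1}{2}g_{\xi\xi}\mu + \frac{1}{2}\Gamma = \frac{\delta - 1}{4\xi},$$

where we let $\gamma = -(\delta - 1)/4$. Next we can compute

(3.50) $$P(\xi, x) = \exp\left(\int_x^\xi A_\xi\,d\xi\right) = \left(\frac{x}{\xi}\right)^{\gamma},$$

so that $P(x, \xi) = (\xi/x)^{\gamma}$. Now we compute

(3.51) $$P^{-1}(x, \xi)\Delta_g P(x, \xi) = P^{-1}\frac{1}{\sqrt{g}}\partial_\xi\left(\sqrt{g}\,g^{\xi\xi}\partial_\xi P\right) = \frac{\gamma(2\gamma - 1)}{\xi},$$

(3.52) $$P^{-1}\,2g^{\xi\xi}A_\xi\partial_\xi P = -\frac{4\gamma^2}{\xi}.$$

From the above form of $s$, we note that the distance function is given by

(3.53) $$d(x_1, x_2) = \sqrt{2}\left(\sqrt{x_2} - \sqrt{x_1}\right),$$

so

(3.54) $$s = \sqrt{2}\left(\sqrt{\xi} - \sqrt{x}\right), \qquad \sqrt{\xi} = \sqrt{x} + \frac{s}{\sqrt{2}}, \qquad ds = \frac{\sqrt{2}}{2\sqrt{\xi}}\,d\xi.$$

Thus we can compute

(3.55) $$a_1(x, \xi) = \frac{1}{d}\int_{C(x,\xi)} P^{-1}(x, \xi)\,L\,P(x, \xi)\,ds = -\frac{\gamma(1 + 2\gamma)}{\sqrt{x\xi}}.$$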
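As a consistency check between the two coordinate systems, the sketch below evaluates the first coefficient in the Bessel coordinates and compares it with (3.38) in the original CEV coordinates. It assumes $\delta = (1 - 2\beta)/(1 - \beta)$ and $\gamma = (1 - \delta)/4$, takes the integrand along the geodesic to be the sum of (3.51) and (3.52), and uses hypothetical function names and a midpoint rule chosen for illustration.

```python
import math

def cev_a1(alpha, S, sigma, beta):
    """First coefficient in the original CEV coordinates, equation (3.38)."""
    return beta * (beta - 2.0) * sigma**2 / 8.0 * (alpha * S) ** (beta - 1.0)

def to_bessel_coordinate(S, sigma, beta):
    """Change of variables x = S^(2(1-beta)) / (sigma^2 (1-beta)^2)."""
    return S ** (2.0 * (1.0 - beta)) / (sigma**2 * (1.0 - beta) ** 2)

def bessel_a1(x, xi, delta):
    """Closed form of (3.55), assuming gamma = (1 - delta) / 4."""
    gam = (1.0 - delta) / 4.0
    return -gam * (1.0 + 2.0 * gam) / math.sqrt(x * xi)

def bessel_a1_numeric(x, xi, delta, steps=2000):
    """Midpoint rule for (1/d) * integral over the geodesic of the sum of (3.51) and (3.52)."""
    gam = (1.0 - delta) / 4.0
    d = math.sqrt(2.0) * (math.sqrt(xi) - math.sqrt(x))       # distance (3.53)
    h = d / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        point = (math.sqrt(x) + s / math.sqrt(2.0)) ** 2      # arclength parametrization (3.54)
        total += (gam * (2.0 * gam - 1.0) - 4.0 * gam**2) / point
    return total * h / d

sigma, beta, alpha, S = 0.3, 0.6, 1.3, 0.7
delta = (1.0 - 2.0 * beta) / (1.0 - beta)
x, xi = to_bessel_coordinate(S, sigma, beta), to_bessel_coordinate(alpha, sigma, beta)
print(cev_a1(alpha, S, sigma, beta), bessel_a1(x, xi, delta), bessel_a1_numeric(x, xi, delta))
```

Under these assumptions the three printed numbers agree, which is the sense in which the Bessel coordinates reproduce the CEV coefficient $a_1$.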

It is possible to proceed further with these computations to rederive the $a_i$ in the CEV case.

We finally comment on the potential implications that these results may have for the SABR model. In [134], Paulot has computed an explicit expression for $a_0$ and an analogous expression for $a_1$ which involves a numerical integration (although with a substantial computational effort, one may be able to construct an explicit formula for $a_1$). He then constructs three approximations for the implied volatility smile; to our knowledge, these are currently the best explicit approximation formulas for the implied volatilities of highly out of the money options in the SABR model. In an example, Paulot demonstrates that his second order formula degenerates for options with very low strike to a much greater degree than his first order approximation. Since the SABR model reduces to the CEV model as the volvol constant
