A Practical Implementation of the Heath–Jarrow–Morton Framework

Final-year project (Proyecto fin de carrera)
Escuela Técnica Superior de Ingeniería (ICAI)
Universidad Pontificia Comillas, Madrid

Author: Juan Monge Liaño
Supervisors: François Friggit, Maria Teresa Martínez
Collaborators: Luis Martí, Anh Tuan NGO

Madrid, June 2007



Contents

1. Introduction
1.1 Exotic Options
1.2 History
1.3 Models
1.4 HJM
1.5 Document Structure
1.6 Special Acknowledgements
1.7 Project Aims

2. Stochastic Calculus
2.1 Introduction
2.2 Markov Process
2.3 Martingale
2.4 Brownian Motion
2.5 Stochastic Differential Equation
2.6 Risk Neutral Probability
2.7 Solving Stochastic Differential Equations
2.8 Ito's Lemma
2.9 Stochastic Integral
2.10 Girsanov's Theorem
2.11 Martingale Representation Theorem
2.12 Major Stochastic Differential Equations

3. Historical Models
3.1 The Black-Scholes Model
3.2 Beyond Black
3.3 Lognormal Classic Black
3.4 Normal Black
3.5 Black Shifted
3.6 Local Volatility - Dupire's Model
3.7 Stochastic Volatility
3.8 SABR

4. Interest Rate Models
4.1 Rendleman and Bartter Model
4.2 Ho-Lee Model
4.3 Black-Derman-Toy Model
4.4 Vasicek Model
4.5 Cox-Ingersoll-Ross Model
4.6 Black-Karasinski Model
4.7 Hull-White Model
4.8 Conclusions

5. Interest Rate Products
5.1 Discount Factors
5.2 Zero-Coupon Bond
5.3 Interest Rate Compounding
5.4 Present Value (PV)
5.5 Internal Rate of Return (IRR)
5.6 Bond Yield to Maturity (YTM)
5.7 Coupon Rate
5.8 Interest Rates
5.9 Forward Rates
5.10 Instantaneous Forward Rate

6. More Complex Derivative Products
6.1 Calls and Puts
6.2 Forward
6.3 Future
6.4 FRA
6.5 FRA Forward
6.6 Caplet
6.7 Cap
6.8 Swap
6.9 Swaption

7. HJM
7.1 Introduction
7.2 Model Origins
7.3 The HJM Development
7.4 The r_t in the HJM Approach

8. Santander HJM
8.1 How to Choose the γ?
8.2 One Factor
8.3 Model Implementation
8.4 Controlled Correlation
8.5 Tangible Parameter Explanation

9. Numerical Methods
9.1 Discretisation
9.2 Monte Carlo
9.3 Tree Diagrams
9.4 PDE Solvers

10. Calibration
10.1 Algorithm
10.2 Calibration in Detail
10.3 Best Fit or Not Best Fit?
10.4 Newton-Raphson
10.5 Iteration Algorithm

11. Graphical Understanding
11.1 Dynamics of the Curve
11.2 HJM Problematics
11.3 Attempted Solutions
11.4 3D Surface Algorithm

12. HJM 3 Strikes
12.1 Exponential
12.2 Mean Reversion
12.3 Square Root Volatility
12.4 Pilipovic
12.5 Logarithmic
12.6 Taylor Expansion
12.7 Graphical Note
12.8 Results

13. Analytic Approximation
13.1 Formula Development
13.2 Step 1
13.3 Second Method
13.4 Step 2
13.5 Swaption Valuation
13.6 Approximation Conclusion
13.7 Alternative Point of Calculation
13.8 Two Factors
13.9 Use of 'No Split'

14. Analytic Approximation Results
14.1 1 Factor Model
14.2 Analytic Approximation Jacobian
14.3 2 Factor Analytic Approximation
14.4 Final Considerations on the Analytic Approximation
14.5 Conclusions and Further Developments
14.6 Analytic Approximation Peculiarities

15. Calibration Set Interpolation Matrix
15.1 Initial Data
15.2 Former Approach Analysis
15.3 2 Strikes
15.4 Graphical Representation

16. Interest Rate Volatilities: Stripping Caplet Volatilities from Cap Quotes
16.1 Introduction
16.2 Stripping Caplet Volatility Methods
16.3 Previous Santander Approach for 6 Month Caplets
16.4 Linear Cap Interpolation
16.5 Quadratic Cap Interpolation
16.6 Cubic Spline Interpolation
16.7 Natural Splines
16.8 Parabolic Run-out Spline
16.9 Cubic Run-out Spline
16.10 Constrained Cubic Splines
16.11 Functional Interpolation
16.12 Constant Caplet Volatilities
16.13 Piecewise Linear Caplet Volatility Method
16.14 Piecewise Quadratic
16.15 The Algorithm
16.16 On the Problem of Extracting 6M Caplets from Market Data

17. SABR
17.1 Detailed SABR
17.2 Dynamics of the SABR: Understanding the Parameters

18. Result Analysis
18.2 SABR Results
18.3 3D Analysis
18.4 Algorithm
18.5 Future Developments

19. Summary and Conclusions

20. References

Figure index

Fig. 2.1. Stochastic Volatility Dynamics
Fig. 2.2. Linear Stochastic Differential Equation dynamics
Fig. 2.3. Geometric Stochastic Differential Equation dynamics
Fig. 2.4. Square Root Stochastic Differential Equation dynamics
Fig. 3.1. Only the present call value is relevant to compute its future price. Any intermediate time-periods are irrelevant
Fig. 3.2. Term Structure of Vanilla Options
Fig. 3.3. Flat lognormal Black Volatilities
Fig. 3.4. Normal and lognormal Black-Scholes model comparison: a) price vs strike b) Black volatility vs strike
Fig. 3.5. Alpha skew modelling
Fig. 3.6. Market data smile
Fig. 3.7. Smiles at different maturities
Fig. 3.8. Implied volatility σB(K,f) if the forward price decreases from f0 to f (solid line)
Fig. 3.9. Implied volatility σB(K,f) if the forward price increases from f0 to f (solid line)
Fig. 3.10. Future volatility scenarios for different assets
Fig. 3.11. Future volatility scenarios for different strikes of a same underlying asset
Fig. 5.1. Future value of money
Fig. 5.2. Discount factor
Fig. 5.3. Bond curve dynamics
Fig. 5.4. Bond curve for different maturities
Fig. 5.5. Relating discount rates
Fig. 6.1. Investor's profit from buying a European call option: option price = $5; strike K = $60
Fig. 6.2. Vendor's profit from selling a European call option: option price = $5; strike = $60
Fig. 6.3. Investor's profit from buying a European put option: option price = $7; strike = $90
Fig. 6.4. Profit from writing a European put option: option price = $7; strike = $90
Fig. 6.5. Forward contract: future expected value versus real future value
Fig. 6.6. Future contract: future expected value versus real future value
Fig. 6.7. FRA payoffs
Fig. 6.8. FRA future's payoffs
Fig. 6.9. Caplet payoffs
Fig. 6.10. Cap payoffs
Fig. 6.11. Swap payoffs
Fig. 8.1. Example of lack of correlation between variables belonging to a unique Brownian motion
Fig. 8.2. HJM dynamics for a lognormal model: flat
Fig. 8.3. HJM dynamics for a normal model: skew
Fig. 8.4. HJM dynamics for alpha parameters greater than 1
Fig. 8.5. Dynamics for a correlation = 1 amongst interest rates
Fig. 8.6. Allowing for de-correlation among different interest rates
Fig. 8.7. Typical vanilla dynamics for different maturities
Fig. 8.8. Smile to skew deformation with maturity
Fig. 8.9. Sigma parameter: global volatility level
Fig. 8.10. Alpha parameter: skew
Fig. 8.11. Stochastic Volatility: smile creation
Fig. 9.1. Call future scenarios generation
Fig. 9.2. Normally distributed variable generation from random numbers in the (0,1) interval
Fig. 9.3. Binomial tree
Fig. 9.4. Trinomial tree
Fig. 9.5. Non-recombining binomial tree
Fig. 9.6. Binomial tree probabilities
Fig. 9.7. Recombining binomial tree
Fig. 9.8. Recombining trinomial tree
Fig. 9.9. PDE mesh and boundary conditions
Fig. 9.10. First PDE algorithm steps
Fig. 10.1. Calibration Process: Vanilla Products
Fig. 10.2. Calibration Process: Exotic Pricing
Fig. 10.3. Analogy between a cancellable swap and an inverse swap
Fig. 10.4. Decomposition of an exotic into time periods
Fig. 10.5. Decomposition of an exotic into vanillas fixing at T and with different maturities
Fig. 10.6. Schematic calibration matrix representation
Fig. 10.7. Initial Variation before first Fixing
Fig. 10.8. First Row Interpolated Data
Fig. 10.9. Inexact fit: minimum square method
Fig. 10.10. Exact fit
Fig. 10.11. Anomaly in exact fit
Fig. 10.12. Anomaly in minimum square method
Fig. 10.13. Newton-Raphson Iterations
Fig. 10.14. Newton-Raphson Iterations with a constant Jacobian
Fig. 10.15. Calibration Jacobian
Fig. 10.16. Jacobian calculation Iterations
Fig. 10.17. Detailed Calibration Algorithm: Jacobian computation
Fig. 11.1. HJM model: sigma vs price dynamics with different alpha parameters
Fig. 11.2. HJM model: alpha vs price dynamics with different sigma parameters
Fig. 11.3. HJM Monte Carlo model price surface
Fig. 11.4. HJM Monte Carlo two-dimensional solution
Fig. 11.5. HJM Monte Carlo two-dimensional solution intersection for two vanilla products
Fig. 11.6. Model implications on taking a) very close strikes b) distant strikes
Fig. 11.7. Solution dynamics with a) variation in market price b) variations in strike
Fig. 11.8. Convergence of the algorithm
Fig. 11.9. Solution Duplicity
Fig. 11.10. No HJM Monte Carlo solution intersection
Fig. 11.11. HJM Monte Carlo surface does not descend sufficiently so as to create a solution curve
Fig. 11.12. Graphic surface generation algorithm
Fig. 14.1. Analytic approximation at high strikes
Fig. 14.2. Analytic approximation at distant strikes
Fig. 14.3. Analytic approximation acting as a tangent 'at the money'
Fig. 14.4. Analytic approximation presents difficulties in adjusting to the curve at distant strikes
Fig. 14.5. Analytic approximation corrected in sigma at high strikes
Fig. 14.6. Analytic approximation corrected in sigma for low strikes
Fig. 14.7. Analytic approximation with a varying sigma correction
Fig. 14.8. HJM Monte Carlo slopes and solution curve
Fig. 14.9. Analytic approximation slopes and solution curve
Fig. 14.10. Close-up on HJM Monte Carlo's slopes and solution curve
Fig. 14.11. Close-up on analytic approximation's slopes and solution curve
Fig. 14.12. HJM Monte Carlo versus analytic approximation solving solution duplicities
Fig. 14.13. HJM Monte Carlo presents no solution curve intersection
Fig. 14.14. Analytic approximation solving a case with no HJM Monte Carlo solution intersection
Fig. 14.15. HJM Monte Carlo first vanilla presenting a solution curve
Fig. 14.16. HJM Monte Carlo second vanilla does not descend sufficiently
Fig. 14.17. Analytic approximation presents a solution for the first vanilla
Fig. 14.18. Analytic approximation also presents a solution for the second troublesome vanilla
Fig. 14.19. HJM Monte Carlo versus analytic approximation for a two-dimensional view of the previous cases
Fig. 15.1. Strike Interpolation
Fig. 15.2. Strike Interpolation
Fig. 15.3. Vertical extrapolation no longer flat
Fig. 15.4. Surface Deformation in Horizontal Extrapolation
Fig. 15.5. Swaption Vertical Extrapolation stays the same
Fig. 15.6. New Circular Solution Intersection
Fig. 16.1. Market Cap Quotes
Fig. 16.2. Cap decomposition into other caps and capforwards
Fig. 16.3. Capforward decomposition into two unknown caplets
Fig. 16.4. Each cap is made up of a number of caplets of unknown volatility
Fig. 16.5. 2 Cap Interpolation
Fig. 16.6. Forward caps related to the caplets
Fig. 16.7. Optimisation algorithm for interpolation in maturities
Fig. 16.8. Cap market quotes: flat cap difference under the 2 year barrier
Fig. 16.9. Creation of the 6 month caplets from 3 month cap market quotes: flat cap difference under the 2 year barrier
Fig. 16.10. Decomposition of a six month caplet into two 3 month caplets
Fig. 17.1. Caplet current market behaviour
Fig. 17.2. beta = 0 skew imposition, rho smile imposition
Fig. 17.3. beta = 1 flat imposition, rho smile imposition
Fig. 17.4. Indistinguishable smile difference on calibrating with different beta parameters β = 0 and β = 1
Fig. 17.5. Constructing the caplet 'at the money' volatilities
Fig. 18.1. Cap flat market quotes versus interpolated flat market quotes
Fig. 18.2. Caplet interpolated volatilities using linear and quadratic approaches
Fig. 18.3. Cap interpolated volatilities using linear and cubic spline approaches
Fig. 18.4. Cap interpolation between natural and constrained cubic splines
Fig. 18.5. Caplet interpolated volatilities using linear and linear cap approaches
Fig. 18.6. Caplet interpolated volatilities using linear approaches
Fig. 18.7. Caplet interpolated volatilities using cubic spline, and an optimisation algorithm using quadratic approaches
Fig. 18.8. SABR short maturity caplet smile
Fig. 18.9. SABR long maturity caplet smile
Fig. 18.10. SABR very short 6 month maturity: sharp smile
Fig. 18.11. SABR short maturity caplet smile inexact correction: very irregular smile
Fig. 18.12. Difference in general smile and 'at the money' level
Fig. 18.13. Smile correction towards the 'at the money' level
Fig. 18.14. Initial linear interpolated caplet volatility surface
Fig. 18.15. Cubic spline caplet volatility surface
Fig. 18.16. SABR smooth interpolated smile surface with cubic spline
Fig. 18.17. Irregular smile for both linear interpolation and cubic spline, whereas SABR presents a much smoother outline
Fig. 18.18. Smile bump at the 'at the money' level in linear cap interpolation; maturity of 1.5 years
Fig. 18.19. Caplet 3M to 6M smile for a maturity of 1.5 years
Fig. 18.20. Comparisons in cubic and 3M to 6M adjustments
Fig. 18.21. SABR strike adjustment
Fig. 18.22. SABR at the money strike adjustment, β = 1
Fig. 18.23. SABR β = 0, normal skew; maturity 1 year
Fig. 18.24. SABR comparisons between long and short maturities, varying the β and the weights
Fig. 18.25. Weighted SABR, β = 1, maturity 8Y
Fig. 18.26. Caplet volatility surface construction algorithm
Fig. 18.27. Cap market quotes

Table index

Table 4.1. Normal or lognormal models with a mean reversion
Table 5.1. Construction of forward rates from bonds
Table 6.1. Swap payoff term structure
Table 10.1. Exotic product risk decomposition
Table 10.2. Ideal Vanilla calibration matrix: all data available
Table 10.3. Vanilla calibration matrix: market quoted data
Table 12.1. Mean Reversion Stochastic Volatility Summary
Table 12.2. Mean Reversion Stochastic Volatility Summary
Table 12.3. Square Root Stochastic Volatility Summary: 10 and 20 packs
Table 12.4. Logarithmic Stochastic Volatility Summary
Table 14.1. Approximation increases calibration speed by a factor of 5
Table 14.2. Approximation increases calibration speed by a factor of 8
Table 15.1. Horizontal interpolation, vertical extrapolation
Table 15.2. Summary of the differences between vertical and horizontal extrapolation, split and no split
Table 15.3. Results obtained through vertical extrapolation
Table 15.4. Results obtained through horizontal extrapolation
Table 16.1. Cap market quotes: flat cap difference under the 2 year barrier
Table 17.1. Different dates and strikes with which to model products with similar needs to the HJM


1. Introduction

During the past two decades, derivatives have become increasingly important in the world of finance. A derivative can be defined as a financial instrument whose value depends on (derives from) the values of other, more basic underlying variables. Very often the variables underlying derivatives are the prices of traded assets.

Some major developments have occurred in the theoretical understanding of how derivative asset prices are determined, and how these prices change over time. Three major steps in the theoretical revolution led to the use of advanced mathematical methods.

• The arbitrage theorem, sometimes called the 'Fundamental Theorem of Finance', gives the formal conditions under which 'arbitrage' profits can or cannot exist. This major development permitted the calculation of the arbitrage-free price of any 'new' derivative product.

• The Black-Scholes model (1973) used the method of arbitrage-free pricing. Its ideas are the basis of most discussions in mathematical finance. The paper Black and Scholes published was also influential because of the technical steps introduced in obtaining a closed-form formula for option prices.

• Equivalent martingale measures dramatically simplified and generalised the original approach of Black and Scholes.

With the above tools, a general method could be used to price any derivative product. Hence arbitrage-free prices could be obtained under more realistic conditions.

But how close are the market prices of options really to those predicted by Black-Scholes? Do traders genuinely use Black-Scholes when determining a price for an option? Are the probability distributions of asset prices really lognormal? What further developments have been carried out since 1973?


In fact, traders do still use the Black-Scholes model, but not exactly in the way that Black and Scholes originally intended. This is because they allow the volatility used for the pricing of an option to depend on its strike price and its time to maturity. As a result, a plot of the implied volatility as a function of strike price gives rise to what is known as a volatility smile. This suggests that traders do not make the same assumptions as Black and Scholes. Instead, they assume that the probability distribution of the underlying asset price has a heavier left tail and a less heavy right tail than the lognormal distribution.

Often, traders also use a volatility term structure, in which the implied volatility of an option depends on its life. When volatility smiles and volatility term structures are combined, they produce a volatility surface. This defines implied volatility as a function of both strike price and time to maturity.
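To make the notion of implied volatility concrete, the following sketch inverts the Black-Scholes call formula numerically, recovering the volatility that reproduces each quoted price. This is purely illustrative and not part of the thesis code; the spot, rate, maturity, and the strike/price table are all made-up example values.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    # Black-Scholes price of a European call option.
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-8):
    # Bisection on sigma: bs_call is increasing in sigma, so we can
    # bracket the volatility that matches the quoted price.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical market quotes across strikes for one maturity: the
# recovered implied volatilities, plotted against K, trace out a smile.
S, T, r = 100.0, 1.0, 0.02
quotes = {80: 22.5, 90: 14.8, 100: 8.6, 110: 4.6, 120: 2.4}
smile = {K: implied_vol(p, S, K, T, r) for K, p in quotes.items()}
```

Repeating this inversion across maturities as well as strikes yields the volatility surface described above.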

1.1 Exotic Options

Derivatives covered at the beginning of the 1980s were what we call plain vanilla products. These have standard, well-defined properties and trade actively. Their prices or implied volatilities are quoted by exchanges or by brokers on a regular basis. One of the exciting aspects of the over-the-counter derivatives market is the number of non-standard (or exotic) products that have been created by financial engineers. Although they are usually a relatively small part of its portfolio, these exotic products are important to a bank because they are generally much more profitable than plain vanilla products.

These exotic products are options with rules governing the payoffs that are more complicated than those of standard options. Types of exotic options typically include packages, non-standard American options, forward start options, chooser options, barrier options, binary options, lookback options and Asian options.


1.2 History

Models using stochastic differential equations for the modelling of stock markets arise from the search for a function capable of reproducing the historical behaviour of prices. That is, a function that presents the same spiky, irregular form as market stock quotes, and thus reflects the randomness of the dynamics. The behaviour itself is said to be fractal because any rescaling of the temporal axis always yields the same irregular price form.
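The spiky, scale-similar behaviour described above is exactly what the simplest of these models, geometric Brownian motion, is built to reproduce. A minimal simulation sketch, with all parameter values invented for illustration:

```python
import random

def gbm_path(s0, mu, sigma, dt, n_steps, seed=42):
    # Exact simulation of geometric Brownian motion,
    #   dS = mu*S*dt + sigma*S*dW,
    # stepping with S_{t+dt} = S_t * exp((mu - sigma^2/2)*dt + sigma*dW),
    # where each dW is drawn from N(0, dt).
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)
        growth = (mu - 0.5 * sigma**2) * dt + sigma * dw
        path.append(path[-1] * 2.718281828459045 ** growth)
    return path

# One year of hypothetical daily prices: the path shows the irregular,
# self-similar wiggles discussed in the text at every time scale.
path = gbm_path(s0=100.0, mu=0.05, sigma=0.2, dt=1 / 252, n_steps=252)
```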

The origins of much of financial mathematics trace back to a dissertation published in 1900 by Louis Bachelier. In it he proposed to model the movement of stock prices with a diffusion process, or what was later to be called a Wiener, or Brownian motion, process. However, he assumed that the stock prices were Gaussian rather than log-normal as in the Black-Scholes model. It was not until five years later, in Einstein's seminal paper outlining the theory of Brownian motion, that the concepts of Brownian motion and differentials were finally formalised.

For most of the century, the mathematical and financial branches of Bachelier's work evolved independently of each other. On the mathematical side, influenced by Bachelier's work, Kiyoshi Ito went on to develop stochastic calculus, which would later become an essential tool of modern finance.

On the financial side, Bachelier's work was largely lost for more than half a century. It wasn't until 1965 that an American called Paul Samuelson resurrected Bachelier's work and extended his ideas to include an exponential (or geometric) form rather than arithmetic Brownian motion.

After this it wasn't long until the two separate branches of work (both stemming from Bachelier) were reunited, when Black, Scholes and Merton wrote down their famous equation for the price of a European call and put option in 1973, work for which the surviving members received the Nobel Prize for economics in 1997.


1.3 Models

Models based on the original Black-Scholes assumptions and the numerical procedures they require are relatively straightforward. However, when tackling exotic options, there is no simple way of calculating the volatility that should be input from the volatility smile applicable to plain vanilla options.

Further, a simplistic approach such as that of the Black-Scholes model assumes that individual spot rates move independently of one another in a random fashion. This is perhaps acceptable in the abstract, but it is not in accord with the observation that rates for adjacent maturities tend to move together.

A number of alternative new models have since been introduced to attempt to solve this problem. The parameters of these models can be chosen so that they are approximately consistent with the volatility smile observed in the market.

These models have come to be known as the 'traditional models' for pricing interest rate options: the Hull-White model, the Vasicek model, the Cox-Ingersoll-Ross model and the Black-Karasinski model, to name but a few. Their main difference with respect to previous models is the incorporation of a description of how interest rates change through time. For this reason, they involve the building of a term structure, typically based on the short-term interest rate r. These models are very robust and can be used in conjunction with any set of initial zero rates. The main advantage of these methods lies in the possibility of specifying r_t as the solution to a stochastic differential equation (SDE). This allows us, through Markov theory, to work with the associated partial differential equation (PDE) and to subsequently derive a rather simple formula for bond prices. This makes them widely suited for valuing instruments such as caps, European bond options and European swap options. However, they have some limitations. They need a complicated diffusion model to realistically reconstruct the volatility observed in forward rates. In addition, despite being able to provide a perfect fit to the volatilities observed in the market today, there is no way of controlling their future volatilities.
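As a concrete instance of the 'simple formula for bond prices' that these short-rate models yield, the Vasicek model dr = a(b - r)dt + σ dW has a closed-form zero-coupon bond price. The sketch below is illustrative only, with hypothetical parameter values:

```python
import math

def vasicek_bond_price(r0, a, b, sigma, tau):
    # Closed-form zero-coupon bond price P(0, tau) under Vasicek
    # dynamics dr = a*(b - r)*dt + sigma*dW, written P = A * exp(-B*r0).
    B = (1.0 - math.exp(-a * tau)) / a
    A = math.exp((B - tau) * (a * a * b - 0.5 * sigma * sigma) / (a * a)
                 - sigma * sigma * B * B / (4.0 * a))
    return A * math.exp(-B * r0)

# Hypothetical parameters: note that the entire discount curve below is
# driven by the single short rate r0, which is precisely the limitation
# discussed in the text.
curve = [vasicek_bond_price(r0=0.03, a=0.1, b=0.05, sigma=0.01, tau=t)
         for t in (1, 2, 5, 10)]
```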

Furthermore, they all lead to the same drawback when solving interest rate products: the fact that they use only one explanatory variable (the short interest rate r_t) to construct a model for the entire market. The use of a single r_t proves insufficient to realistically model the market curve, which appears to be dependent on all the rates and their different time intervals. It constrains the model to move randomly up and down, with all the rates moving together by the same amount according to the random motions in the unique short rate r. Consequently, these models cannot be used for valuing interest rate derivatives such as American-style swap options, callable bonds and structured notes, as they introduce arbitrage possibilities.

1.4 HJM

The most straightforward solution to the above problem should include the use of more explanatory variables: long and medium term rates. That is, we could perhaps consider a model in which we would use one representative short-term rate, a medium-term rate, and finally a long-term interest rate. The Heath-Jarrow-Morton framework arises as the most complete application of the suggested approach. It chooses to include the entire forward rate curve as a theoretically infinite-dimensional state variable. Unlike other models, this model can match the volatility structure observed today in the market, as well as at all future times.

The LIBOR market model also provides an approach that gives the user complete freedom in choosing the volatility term structure. We will not cover the study of this model in the present text, but only state here the advantages that it presents over the HJM model. Firstly, it is developed in terms of the forward rates that determine the pricing of caps, rather than in terms of instantaneous forward rates. Secondly, it is relatively easy to calibrate to the prices of caps or European swap options.

Both the HJM and LIBOR market models have the serious disadvantage that they cannot be represented as recombining trees. In practice, this means that they must be implemented using Monte Carlo simulations.
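A toy sketch of what such a Monte Carlo implementation looks like: a one-factor HJM with a constant volatility function, where the whole discretised forward curve is evolved under the no-arbitrage drift. This is deliberately minimal and not the implementation studied later in the text; the initial curve and all parameters are invented for illustration.

```python
import random

def hjm_step(fwd, tenors, t, sigma, dt, dw):
    # One Euler step of a one-factor HJM forward curve under the
    # risk-neutral measure. The no-arbitrage drift is
    #   mu(t, T) = sigma(t, T) * integral_t^T sigma(t, u) du,
    # which for a constant volatility reduces to sigma^2 * (T - t).
    new = []
    for f, T in zip(fwd, tenors):
        drift = sigma * sigma * (T - t)
        new.append(f + drift * dt + sigma * dw)
    return new

rng = random.Random(7)
tenors = [0.5, 1.0, 2.0, 5.0, 10.0]      # maturities T of f(t, T)
fwd = [0.03, 0.032, 0.035, 0.04, 0.042]  # hypothetical initial curve
dt, sigma, t = 1 / 252, 0.01, 0.0
for _ in range(252):                      # one year of daily steps
    dw = rng.gauss(0.0, dt ** 0.5)        # one shock drives the curve
    fwd = hjm_step(fwd, tenors, t, sigma, dt, dw)
    t += dt
```

Because each simulated curve depends on its entire shock history, paths do not recombine, which is why pricing proceeds by averaging payoffs over many such simulated paths.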


1.5 Document Structure

The project is divided into two parts. The first is essentially an introduction to mathematical engineering methods in finance. The second focuses on a much more practical implementation of a concrete model, the HJM, and discusses in depth certain problems that can surface within this framework.

The text approaches the mathematics behind continuous-time finance informally. Such an approach may be found imprecise by a technical reader. We simply hope that the informal treatment provides enough intuition about some of these difficult concepts to compensate for this shortcoming.

The text is directed towards a reader with minimal background in finance, but who is willing to embark on the adventure of getting acquainted with this unknown domain. A strong background in calculus or stochastic processes is not needed, as the concepts are built from scratch within the text. However, previous knowledge in these fields will certainly be helpful, and it is necessary that the reader be comfortable with the use of mathematics as a method of deduction and problem solving. Hence, the text is designed for individuals who have a technical background roughly equivalent to a bachelor's degree in engineering, mathematics or science. The language of financial markets is largely mathematical, and some aspects of the subject can only be expressed in mathematical terms. It is hoped that practitioners in financial markets, as well as beginning graduate students, will find the text useful.

Chapter 2 introduces the mathematics of derivative assets assuming that time passes continuously. Consequent to this assumption, information is also revealed continuously, and decision makers may face instantaneous changes in random news. Hence technical tools for pricing derivative products require ways of handling random variables over infinitesimal time intervals. The mathematics necessary for dealing with such random variables is known as stochastic calculus.

Chapters 3 and 4 discuss classical approaches to the modelling of the term structure of interest rates. Learning the differences between the assumptions, the basic philosophies, and the practical implementations that one can adopt in each case is an important step towards understanding the valuation of interest-sensitive instruments. Further, it enables us to understand the limitations of these models, and why the HJM framework could be used as a good alternative.

Chapters 5 and 6 deal with the basic building blocs of financial derivatives:

bonds, interest rates, options and futures. These are truly the foundations necessary

for modelling the term structure of interest rates. With these, we proceed to introduce

the more complicated classes of derivatives such as swaps or swaptions. We conclude

by showing how these complicated products can be decomposed into a number of

simpler derivatives. This decomposition results extremely practical.

The purpose of the second part of the text is principally to provide an

introduction to one way in which the HJM framework can be implemented. It sets out

in Chapter 7 by providing a broad vision of the framework itself before discussing the

more technical aspects. The text focuses on the practical implementation of the

framework developed in the Banco Santander. We analyse the assumptions,

hypotheses, and development of the framework through Chapter 8, becoming more

and more involved in practical discussions, until diverting to solve the specific

problematics that the model faced. In this part of the text, we have attempted to

introduce each problematic situation firstly under a broad theoretical scope.

Following, we proceed to explain the practical implementation which we decided on,

and the results which they yielded. We will often suggest alternatives for the different

methods that we implement, or offer ideas to face further developments that could

stem from our studies.

We continue in Chapters 9 and 10 with the practical implementation of the HJM.

Chapter 9 is a very brief introduction to the various numerical techniques that are

available for solving stochastic differential equations: specifically, Monte Carlo

integration, tree diagrams, and partial differential equation solvers. We deem it

necessary to acquire an understanding of the calibration procedure itself, and how we

obtain the precise parameters that the model requires for the pricing of more complex

exotic products. For this reason, Chapter 10 attempts to provide a broad discussion of

how the parameters are extracted from simple vanilla products. We discuss in detail

the advantages of the Newton–Raphson root solver in our approach. A very

interesting discussion is also presented contrasting two opposing lines of thought: the use of


exact calibration methods versus the use of overdetermined systems that centre on

error minimisation.

Chapter 11 discusses a mathematical tool that proves extremely useful for

visually analysing the HJM iterative process before it converges towards its final

parameter values. It is also in this section where we begin to encounter certain

calibration flaws in the model that we will need to tackle in future chapters.

Chapter 12 is the first genuinely experimental chapter. It discusses the need for a

three parameter HJM model, and continues to analyse a set of possible formulations

that the third parameter, i.e. the volatility of volatilities, could present. The chapter

concludes with an analysis of specific calibrations performed with the three strike

model, and examines the possible causes of the failure to calibrate for long maturities.

In particular it points directly at a possible flaw in the caplet – swaption calibrations.

The analytic approximation of Caplet and Swaption prices is a huge achievement

that is presented in Chapter 13. As its name indicates, it is an approximate

formulation of the HJM model that enables us to reduce calibration computation times

drastically. The exhaustive mathematics of the approximation are studied in that

chapter. We then proceed in Chapter 14 to analyse the specific results attained

through the various alternatives. We conclude the chapter by selecting the most

advantageous alternatives, and implementing them at the trader’s desk.

Chapter 15 is a first attempt to solve the calibration problems that the HJM was

facing. It analyses the process by which the vanilla products for any specific

calibration are selected, and more specifically, it studies how the ‘missing’ data is

completed in the corresponding calibration matrices. We reach back briefly to Chapter

11 to analyse this problem by using the mathematical tool developed there. We end up

realising that the selection of our interpolation and extrapolation methods can have a

drastic impact on the model solution surface itself.

Chapter 16 picks up the loose ends left behind in Chapter 12 and tackles the

caplet problems as a cause of the failure of the three-strike HJM. The caplet

quotes obtained from market cap data prove incomplete, and the consistent

interpolation techniques needed to extract the necessary values can be far from simple. We

provide a broad description for the main interpolation and optimisation algorithms

available, but show how even so, we are incapable of doing away with certain

irregularities in the caplet volatility surface. As a direct result of this, we continue to


the introduction of the SABR in Chapter 17. This is a stochastic volatility model that

centres on smoothing out the smile obtained by the previous interpolation

techniques. The model is discussed in depth, presenting its parameters and dynamic

characteristics.

Chapter 18 is the down-to-earth study of the results obtained in the previous two

chapters. It compares the exact interpolation methods with the inexact optimisations

and the SABR algorithm, showing how the latter succeed in achieving a smoother

smile at the expense of losing precision.


1.6 Special Acknowledgements

Several people provided comments and helped during the process of revising the

document. I thank François Friggit for his invaluable discussions on mathematical

finance and models. I thank him also for being such a great director, for his endless

fantastic humour and support. I thank Maria Teresa Martínez for her advice, her

mathematical introductory text, and her help working out many technicalities in this

project. I would also like to thank Luis Martí in particular for having a word of help in

practically everything I did. I cannot imagine what I would have done without him. I

congratulate him for his work and thank him immensely for his friendship. Having

had the possibility to work together with Anh Tuan NGO in the creation of the

analytic approximation has proved a marvellous experience. I thank him for his help,

patience and friendship. I would finally like to thank all the other quant team

members of the Banco Santander who have made my internship during the current year so

formidable. The team is truly marvellous. I thank in particular Álvaro Serrano, Miguel

Pardo, Souahil, Cristina, Lamine, Álvaro Baillo, Pedro Franco, Jorge, Alberto, Carlos,

Manuel, and the rest of the team.

1.7 Project Aims

We briefly outline the aims with which we initiated the current project.

Setting out with the realisation that we cannot successfully approach the HJM

model without a consistent background in mathematical finance, the first part of the

project was directed towards building up this knowledge. Immediately after this, we

proceeded directly to a practical analysis of the implemented model with direct

exposure to the relevant problems. The project was thus aimed at:

• An introduction to pricing derivative models

• The study of existing theoretical models: Black Scholes, Hull White, Heath Jarrow

Morton amongst others.

• The analysis of the internal Banco Santander HJM approach.


• Development of an understanding of the various programming techniques in use:

· Microsoft Visual Studio .NET 2003: programming framework in C++; compiler,

linker and assembler

· Python IL interface

· Murex: valuation platform for credit, equity and interest rate products

· Internal tools created and programmed by the BSCH quant team

· Visual Basic

• Acquaintance with the Hull White framework already implemented and running

in the Banco Santander, as a simplified model serving as prior knowledge to the

Heath Jarrow Morton framework.

• Detailed comprehension of the Heath Jarrow Morton framework: Identification of

specific market examples where the HJM calibration fails to compute;

examination of the cases. Analysis of the market data related to the above cases:

verification of whether the failure to compute is due to anomalies in the market.

• Graphical representation and analysis of the correlation between the price and

each of the volatility parameters. Comprehension of the minima obtained in the

Price versus Alpha graphical representations. Isolation and examination of

specific examples.

• Resolution of the impossibility to correctly model real market situations in which

the true price lies below the minimum value of our model. The following two

preliminary approaches will be considered:

· Adjustment of the dependence between HJM volatility and the parameters.

· Modification of the parameters themselves: selection of an appropriate

statistical distribution other than the lognormal or normal distributions.

• Development and implementation of an analytic approximate formula so as to

reduce the time involved in the calibration process.

• Analysis and resolution of the existing HJM problems: possible caplet matrix

stripping problem. Possible failure in interpolation techniques.


2. Stochastic Calculus

2.1 Introduction

Stochastic calculus handles random variables over infinitesimal time intervals,

in continuous time. In it, we can no longer apply Riemann integrals; these are

instead replaced by Ito integrals.

A stochastic process is a random function, i.e. a function of the form f(t, W) that is

therefore time dependent and that is driven by the variable W, which represents a

source of randomness.

In deterministic or classical calculus, the future value of any function f(t) is

always known by simply calculating

f_t = f_0 + ∫_0^t df

The above requires ft to be differentiable. If f is stochastic as in the case of

Brownian motion, it is not differentiable almost anywhere. Moreover, we do not know

the value of ft for any future instant of time. We only know the probability density

that our function will follow. We can simply write:

f_t = f_0 + ∫_0^t α(s) ds + ∫_0^t β(s) dW_s

The above function is composed of two main processes: a term representing the

random motion or dispersion, which is stochastic; and a mean drift term that may


be constant over time, increasing or decreasing. Moreover, it can in turn either be

deterministic or stochastic depending on whether the slope term α is also stochastic.

Fig. 2.1. Stochastic Volatility Dynamics

We are in this way left with what is known as an Ornstein Uhlenbeck Process:

df = α(t) dt + β(t) dW_t      (2.1)

where the first term is known as the drift, and the second as the diffusion.

Before proceeding any further with the Ornstein Uhlenbeck Process or any other

more elaborate setting, we must first grasp a set of important concepts that we will

now proceed to introduce.

2.2 Markov Process

In a non-mathematical approach, a Markov process is one in which the future

probability that a variable X attains a specific value depends only on the latest

observation F̃_t available at present. All other previous (or historical) information sets

F̃_{t−1}, …, F̃_1 are considered irrelevant for the given probability calculation.



Mathematically, a Markov Process is therefore a discrete time process

X_1, X_2, …, X_t, … with a joint probability distribution function F(x_1, x_2, …, x_t) that verifies

P(X_{t+s} ≤ a | F̃_t) = P(X_{t+s} ≤ a | X_t)      (2.2)

where F̃_t is the information set on which we construct the probability.

2.3 Martingale

We will say that an asset S_t is a martingale if its future expected value (calculated

at time t over an information set F̃) is equal to its current value. A martingale is

always defined with respect to a probability P. This is:

E_t^P[S_{t+U}] = S_t      (2.3)

2.3.1 Martingale Properties

Martingales possess a number of very interesting properties that we will come

across time and time again. The main properties that martingales verify are:

1. The random variable S_t is completely defined given the information at time t, F̃_t

2. Unconditional finite forecasts: E_t^P[S_{t+U}] < ∞ almost everywhere

3. Future variations are completely unpredictable, meaning there are no trends:

E_t^P[S_{t+U} − S_t] = E_t^P[S_{t+U}] − S_t = S_t − S_t = 0      (2.4)

4. A continuous square integrable martingale is a martingale with a finite second

order moment:

E_t^P[S_t²] < ∞ almost everywhere      (2.5)

Stochastic processes can behave as:


• Martingales: if the trajectory has no trends or periodicities

• Submartingale: for processes that increase on average. They verify:

E_t^P[S_{t+U}] ≥ S_t almost everywhere      (2.6)

• Supermartingale: for processes that decrease on average. They verify:

E_t^P[S_{t+U}] ≤ S_t almost everywhere      (2.7)

In general, assets are therefore submartingales because they are not completely

unpredictable but instead, are expected to increase over time. It is important to notice

that they can be converted to martingales using risk neutral probability if there is

absence of arbitrage.

There are two main methods of converting non martingales to martingales:

2.3.2 Doob Meyer Decomposition:

involves adding or subtracting the expected trend from the process, thus

leaving only the random (martingale) component.

For example, in the case of right continuous time submartingales St, we can split

St into the sum of a unique right continuous martingale Mt and an increasing process

A_t that is measurable with respect to F̃_t:

S_t = M_t + A_t      (2.8)

In general we will normalize our martingales so that their first order moment is

equal to 0. In most cases, our martingales will also have continuous trajectories. For

such martingales that in addition are square integrable – see (2.5) -, we can define their

quadratic variation process as the unique increasing process given by the Doob-Meyer

decomposition of the positive submartingale M². In other words, it is the unique

increasing process ⟨M⟩_t such that M² − ⟨M⟩_t is a martingale, normalized by ⟨M⟩_0 = 0.

In particular, E(M_t²) = E(⟨M⟩_t).


Quadratic Variation Process:

⟨M⟩_t = lim_{Π→0} Σ_{i=1}^{n} ( M_{t_i} − M_{t_{i−1}} )²      (2.9)

where Π = sup_i (t_i − t_{i−1}) and the limit is taken in probability.
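The convergence in (2.9) is easy to check numerically. The sketch below, assuming a standard Brownian motion on [0, 1] with an illustrative partition size, sums squared increments and recovers ⟨W⟩_T ≈ T; it also shows that the first variation blows up as the mesh is refined.

```python
import numpy as np

# Approximate the quadratic variation of a Brownian path on [0, T] by
# summing squared increments over a fine partition. As the mesh shrinks,
# the sum converges (in probability) to T, illustrating <W>_t = t.
# The seed and partition size are illustrative choices, not from the text.
rng = np.random.default_rng(0)
T, n = 1.0, 200_000
dW = rng.normal(0.0, np.sqrt(T / n), size=n)   # Brownian increments
quad_var = np.sum(dW**2)                        # quadratic variation estimate
first_var = np.sum(np.abs(dW))                  # first variation: diverges with n
print(quad_var)    # close to T = 1
print(first_var)   # large, of order sqrt(n)
```

This also illustrates property 3 above: the first variation grows without bound, while the quadratic variation settles to a well defined value.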

2.3.3 First Approach to Girsanov’s Theorem

The transformation of a probability distribution using Girsanov’s Theorem

involves simply changing the probability space in which we operate, P → P̃, so as to

convert non-martingales to martingales:

E_t^P[S_{t+U}] > S_t   →   E_t^P̃[S_{t+U}] = S_t      (2.10)

Continuous martingales have a further set of specific properties:

1. As we increase the partitioning of the interval [0,T] the S_t's get closer.

Therefore P( |S_{t_i} − S_{t_{i−1}}| > ε ) → 0

2. They must have nonzero variance: P( Σ_{i=1}^{n} (S_{t_i} − S_{t_{i−1}})² > 0 ) = 1. The quadratic

variation therefore converges to a positive well defined random variable.

3. The first variation diverges: V¹ = Σ_{i=1}^{n} |S_{t_i} − S_{t_{i−1}}| → ∞      (2.11)

4. All variations of greater order than V² tend to 0, and so contain no useful

information. Thus the martingale is completely defined by its first two moments.


2.4 Brownian Motion

A Brownian motion, also known as a Wiener process, is a continuous process W_t whose

increments ∆W_t are normally distributed, i.e. they are Gaussian.

2.4.1 Characteristics:

1. W0 = 0

2. Continuous square integrable function in time

3. Independent increments dW: for times t < U < V,

(W_U − W_t) is independent of (W_V − W_U) and independent of (W_t − W_0) = W_t.

This implies that any model based on a Brownian motion parameter will be

independent of all that has occurred in the past. Only future events are of any

concern.

4. dW_t² = dt

5. Follows a Normal probability distribution, under the probability space Ω:

W_U − W_T ~ N(0, U − T)      (2.12)

P(W_U − W_T ∈ A) = ∫_A (1 / √(2π(U − T))) e^{−x² / (2(U − T))} dx      (2.13)

The mean or expectation of any given increment in the Brownian variable is

µ = E(W_U − W_T) = 0

The variance is the time interval itself: Variance = U − T = ∫_T^U dt

This is easily deduced following the subsequent logic.


E_T(W_U) = E_T(W_T + W_U − W_T) = E_T(W_T) + E_T(W_U − W_T),   U > T      (2.14)

where we have initially defined E(W_U − W_T) = 0, and where we know

E_T(W_T) = W_T, as it is calculated for the present instant of time at which all data is

known. Therefore:

E_T(W_U) = W_T,   U > T      (2.15)

This is precisely the martingale property introduced above: a variable whose

future expectation is exactly equal to its current value. A Brownian motion is therefore

always a martingale.
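A small Monte Carlo check of these increment properties can be sketched as follows, with illustrative times T = 0.5, U = 1.3 and an illustrative sample size: the increment has zero mean and variance U − T, so E_T(W_U) = W_T on average.

```python
import numpy as np

# Monte Carlo check of the Brownian increment properties: for T < U,
# W_U - W_T ~ N(0, U - T), and hence E_T(W_U) = W_T (martingale property).
# Times, seed and sample size are illustrative choices.
rng = np.random.default_rng(1)
n_paths = 500_000
T, U = 0.5, 1.3
W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)        # W_T ~ N(0, T)
incr = rng.normal(0.0, np.sqrt(U - T), size=n_paths)   # independent increment
W_U = W_T + incr
print(incr.mean())           # ~ 0
print(incr.var())            # ~ U - T = 0.8
print(np.mean(W_U - W_T))    # ~ 0: the expected future value equals W_T
```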

2.5 Stochastic Differential Equation

We will start by analysing the simplest stochastic differential equation possible.

Recall that we have already encountered it under the name of an Ornstein Uhlenbeck

process:

df = α(t) dt + β(t) dW_t      (2.16)

where α(t) is known as the drift term, and β(t) as the diffusion. It is the diffusion

term, through the Brownian motion, that introduces the randomness in this dynamics.

We will start by taking both α(t) and β(t) as being deterministic, as this is the

simplest scenario that we can possibly imagine. In such a case, the solution to the

integral form of the stochastic differential equation is relatively straightforward

• ∫_0^t α(s) ds has a classic, deterministic solution, or at its worst, can be solved

numerically through a Riemann integral, as

lim_{n→∞} Σ_{k=1}^{n} α(t_{k−1}) (t_k − t_{k−1})      (2.17)


• ∫_0^t β(s) dW_s is probabilistic; it follows a Normal (Gaussian) distribution, which we

can represent as having a zero mean and a variance that is easily calculated, thus

leaving

∫_0^t β(s) dW_s ~ N( 0, ∫_0^t β²(s) ds )

where we have used dW² = dt.
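The two statements above can be verified by simulation. The sketch below assumes illustrative deterministic coefficients α(s) = 0.1 and β(s) = 0.2·s (not taken from the text) and checks that f_t is Gaussian with mean f_0 + ∫α ds and variance ∫β² ds.

```python
import numpy as np

# Simulate f_t = f_0 + ∫ α(s) ds + ∫ β(s) dW_s with deterministic α, β and
# check the resulting mean and variance against the formulas in the text.
# α(s) = 0.1 and β(s) = 0.2 s are illustrative choices, as are seed/sizes.
rng = np.random.default_rng(2)
n_paths, n_steps, t = 50_000, 200, 2.0
dt = t / n_steps
s = np.linspace(0.0, t, n_steps, endpoint=False)   # left endpoints
alpha = 0.1 * np.ones(n_steps)
beta = 0.2 * s
f0 = 1.0
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
f_t = f0 + np.sum(alpha * dt) + dW @ beta          # Euler sums of both integrals
mean_theory = f0 + 0.1 * t                          # = 1.2
var_theory = np.sum(beta**2) * dt                   # Riemann sum of ∫ β²(s) ds
print(f_t.mean(), f_t.var())
```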

2.6 Risk Neutral Probability

The difficulty when dealing with real probability is the fact that each investor

has a different inherent risk profile. This means that each will have a different

willingness to invest.

Imagine that we have a number of investors and each one of them faces the

following situation:

Each investor starts off with $100,000. He stands a 10% chance that, if he

invests, his return will be $1M, whereas he faces a 90% chance that his return will

be $0.

As each investor has a different risk profile, they will each price the product

differently; or in other words, they will each be willing to pay a different price with

respect to the $100,000 stated to enter this 'roulette' game. The reader himself may be

willing to enter the game at a price of only $10,000, whereas another, more risk-averse

investor may find $90,000 a more suitable price for the game. Thus we realize that it is too

arbitrary to associate a price to a product if we rely on real probability.

The risk neutral probability arises as one in which the risk profile is

homogeneous. That is, all assets present the same instantaneous yield throughout

time, and are all infinitely liquid. They must therefore all present the same r_t.


Under the risk neutral probability, any tradable asset follows the dynamics

dS = r_t S dt + σ(t) dW_t^P      (2.18)

where σ is known as the volatility process associated to the asset S.

Here r_t is the instantaneous rate defined in the discount factor section. It is time

dependent, but independent of the underlying asset S. We can see clearly in the above

formula that dS is a stochastic process, as it depends on dW_t^P.

In this risk neutral probability space, if we were to select another asset S_2, then it

too would present the same instantaneous yield r_t, despite the fact that it would

probably follow a different random process, dW_t^{2,P}.
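A direct consequence of (2.18) is that the discounted asset e^{−rT} S_T has expectation S_0 under the risk neutral measure. The sketch below checks this by Monte Carlo for the lognormal special case σ(t) = σ·S_t (the geometric Brownian motion of Section 2.12.2), with illustrative values of r, σ, T and S_0.

```python
import numpy as np

# Risk-neutral martingale check: with dS = r S dt + σ S dW under the
# risk-neutral measure, E[e^{-rT} S_T] = S_0. We use the exact lognormal
# solution S_T = S_0 exp((r - σ²/2)T + σ W_T). Parameters are illustrative.
rng = np.random.default_rng(3)
n_paths = 1_000_000
S0, r, sigma, T = 100.0, 0.03, 0.25, 1.0
W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
discounted = np.exp(-r * T) * S_T
print(discounted.mean())   # ~ S0 = 100
```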

2.7 Solving Stochastic Differential Equations

In multiple occasions, we must resort to a change in variable so as to simplify the

more complex stochastic differential equations, attempting to transform them back

into the ideal Ornstein Uhlenbeck formulation seen before (2.1). For this reason we use

two principal methods: Ito’s Lemma and the Girsanov Theorem.

2.8 Ito’s Lemma

Ito's Lemma is the stochastic version of the chain rule in classical calculus. We must first

realize that whereas partial derivatives are valid for stochastic calculus (just as they

are for classical calculus), total derivatives and the chain rule itself are no longer

applicable. That is, considering f(t, S_t):

Partial derivatives: valid

f_t = ∂f/∂t,   f_S = ∂f/∂S

Total derivatives: not valid

df = (∂f/∂t) dt + (∂f/∂S) dS_t


Chain rule: not valid

df/dt = ∂f/∂t + (∂f/∂S) (dS_t/dt)

Let S = S(t, W_t) where W_t is a Brownian process. Let us recall the stochastic

differential equation:

dS_t = α_t dt + σ_t dW_t      (2.19)

If we perform a Taylor expansion on the function f(t, S) with respect to the two

variables S and t, then

f_t = f_{t−1} + (∂f/∂t) dt + (∂f/∂S) dS_t + ½ (∂²f/∂t²) (dt)² + ½ (∂²f/∂S²) (dS_t)²

+ (∂²f/∂S∂t) dt dS_t + R      (2.20)

Replacing with our diffusion equation, we obtain

df = (∂f/∂t) dt + (∂f/∂S)(α_t dt + σ_t dW_t) + ½ (∂²f/∂t²) (dt)²

+ ½ (∂²f/∂S²)(α_t dt + σ_t dW_t)² + (∂²f/∂S∂t) dt (α_t dt + σ_t dW_t) + R      (2.21)

In general, with Taylor expansions we decide to at least maintain the first order

terms, both in S and in t. However, we must be particularly careful with this

simplification:

• Our variable t is deterministic, meaning that as with classic Taylor expansions, we

can ignore powers greater than or equal to the second order. S_t however is a

random process as it depends on dW, where we must recall that dW² = dt. This is

equivalent to a first order term in t and thus cannot be ignored. Therefore, from

(2.20) and (2.21) we retain all the first order elements, which are

df = (∂f/∂t) dt + (∂f/∂S) dS + ½ σ_t² (∂²f/∂S²) dt      (2.22)


The first two terms correspond to a classical differentiation. The last term is

known as Ito’s term, where we have already replaced dW2 by dt. We can now replace

in (2.22) the diffusion dS of the initial function (2.19), obtaining:

df = (∂f/∂t) dt + (∂f/∂S)(α_t dt + σ_t dW_t) + ½ σ_t² (∂²f/∂S²) dt

= ( ∂f/∂t + α_t ∂f/∂S + ½ σ_t² ∂²f/∂S² ) dt + σ_t (∂f/∂S) dW_t      (2.23)

Ito’s Lemma is mainly used when applying a change in variable to differential

equations. It takes a function f(t, S), that depends on both time and the stochastic

function S and then writes the diffusion of the new variable f(t, S) in terms of S and the

old function’s diffusion dynamics.

Ito’s formula can also be used to evaluate integrals. The general procedure is the

following:

· Guess a form for f(W_t, t)

· Apply Ito’s Lemma to obtain the standard differential equation

· Integrate both sides of the equation

· Rearrange the resulting equation

We proceed to provide a simple practical example: Suppose that we would like

to solve ∫_0^t W_s dW_s. We guess f(t, W) = W²/2, so that naively d(W²/2) = W dW.

Applying Ito’s Lemma to f (here α = 0 and σ = 1):

df = W dW + ½ dt

Integrating and then rearranging:

∫ W dW = ∫ df − ½ ∫ dt

∫ W dW = f − ½ t

Substituting f again we obtain

∫_0^t W_s dW_s = W_t²/2 − t/2
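The identity just derived can be verified numerically by comparing the non-anticipating Ito sum Σ W_{t_{i−1}} ΔW_i with the closed form on simulated paths; the step count and seed below are illustrative choices.

```python
import numpy as np

# Numerical check of the Ito integral identity ∫_0^t W dW = W_t²/2 - t/2.
# The Ito sum uses W at the LEFT endpoint of each interval; this is what
# produces the extra -t/2 term relative to classical calculus.
rng = np.random.default_rng(4)
n_paths, n_steps, t = 4_000, 1_000, 1.0
dt = t / n_steps
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])  # W at left endpoints
ito_sum = np.sum(W_left * dW, axis=1)                    # Ito integral estimate
closed_form = 0.5 * W[:, -1] ** 2 - 0.5 * t
err = np.mean(np.abs(ito_sum - closed_form))
print(err)   # small discretisation error, shrinking with n_steps
```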

2.9 Stochastic Integral

The stochastic integral is nothing more than the integrated form of Ito’s equation (2.22):

∫_0^t df = ∫_0^t ( ∂f/∂t + α ∂f/∂S + ½ σ² ∂²f/∂S² ) ds + ∫_0^t σ (∂f/∂S) dW_s      (2.24)

which, rearranged, solves the stochastic integral in terms of deterministic,

temporal terms:

∫_0^t σ (∂f/∂S) dW_s = f(S_t, t) − f(S_0, 0) − ∫_0^t ( ∂f/∂t + α ∂f/∂S + ½ σ² ∂²f/∂S² ) ds      (2.25)

2.10 Girsanov’s Theorem

Girsanov’s Theorem is generally used when we seek to change the probability

space in which we work. Thus, if we have a Brownian variable under the probability

measure P and want to transform it to the probability measure Q, we then simply

perform:

dW^P = dW^Q + λ(P, Q) dt      (2.26)


The result is particularly useful when seeking to change the drift term in an

equation. However, it cannot simplify the diffusion term.

df = α dt + β dW^P = α dt + β ( dW^Q + λ(P, Q) dt )

= ( α + βλ ) dt + β dW^Q      (2.27)
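The drift change in (2.27) is easy to see in simulation: build Brownian increments under Q, shift them by λ dt to obtain the P-increments, and observe that the sample drift of f becomes α + βλ. All parameter values below are illustrative.

```python
import numpy as np

# Sanity check of (2.27): writing dW^P = dW^Q + λ dt turns the dynamics
# df = α dt + β dW^P into df = (α + βλ) dt + β dW^Q. Simulating with
# Q-Brownian increments, the sample drift of f is α + βλ. Values illustrative.
rng = np.random.default_rng(5)
n_paths, n_steps, T = 50_000, 100, 1.0
dt = T / n_steps
alpha, beta, lam = 0.05, 0.4, 0.3
dW_Q = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
dW_P = dW_Q + lam * dt                         # Girsanov change of variable
f_T = np.sum(alpha * dt + beta * dW_P, axis=1)
print(f_T.mean() / T)   # ~ α + βλ = 0.17 under Q
```

Note that only the drift moves: the sample standard deviation of f_T stays β√T under either measure, matching the remark that Girsanov cannot simplify the diffusion term.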

2.10.1 Radon Nikodym Derivative:

The Radon–Nikodym derivative states that dP̃ = ξ(Z_t) dP. This implies that the two are equivalent

probability measures if and only if it is possible to go back and forth between the two

measures, thus

dP̃ = ξ(Z_t) dP   ↔   dP = ξ(Z_t)^{−1} dP̃      (2.28)

This also implies that

P(dZ) > 0   ↔   P̃(dZ) > 0      (2.29)

The two can only exist if P assigns a zero probability at the same time as P̃.

The two can only exist if P assigns a 0 probability at the same time as P~

Thus with this new knowledge about the Radon Nikodym derivative we can

revise what we had said about Girsanov’s Theorem. We can now say:

Girsanov’s Theorem states that given a Wiener process W_t, we can always

multiply this process by a probability distribution ξ(Z_t) to convert it to a different

Wiener process W̃_t. The two Brownians will be related by

dW̃_t = dW_t − S_t dt      (2.30)

and their probability measures by

dP̃(W_t) = ξ(W_t) dP(W_t)      (2.31)

Notice that ξ(Z_t) is a martingale with E[ξ(Z_t)] = 1, and the product

S_t · ξ(Z_t) is now also a martingale.


Notice also that both W̃_t and W_t are Brownian motions, meaning that they have 0

mean. However, they are related by the expression dW̃_t = dW_t − S_t dt. How can this

be possible? The answer lies in the fact that W̃_t has 0 mean under its probability

measure P̃, and W_t under its respective measure P. Therefore, they do not

simultaneously present 0 means with respect to one common probability measure.

We define the random process ξ(Z_t) as the stochastic exponential of S_t with

respect to W:

ξ(Z_t) = exp( ∫_0^t S_u dW_u − ½ ∫_0^t S_u² du ),   t ∈ [0, T]      (2.32)

(Notice that ξ(Z_0) = 1.)

There are several conditions that must always hold true:

• S_t must be known exactly, given the information set F̃_t.

• S_t must not vary much: E[ e^{∫_0^t S_u² du} ] < ∞

But these conditions do not yet assure us that ξ(Z_t) is a martingale. A condition

that is sufficient for ξ(Z_t) to satisfy the hypothesis of Girsanov’s theorem is to

assume

E[ e^{½ ∫_0^t S_u² du} ] < ∞      (2.33)

With this, ξ(Z_t) is a martingale if S is deterministic.
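For a deterministic integrand the normalisation E[ξ(Z_t)] = 1 in (2.32) can be checked directly by Monte Carlo. The sketch below uses S_u = 0.5·cos(u), an illustrative deterministic choice not taken from the text.

```python
import numpy as np

# Check that the stochastic exponential (2.32) with a deterministic
# integrand S_u has expectation 1, consistent with ξ being a martingale
# started at ξ(Z_0) = 1. Integrand, seed and sizes are illustrative.
rng = np.random.default_rng(6)
n_paths, n_steps, T = 200_000, 400, 2.0
dt = T / n_steps
stoch_int = np.zeros(n_paths)   # running ∫ S_u dW_u per path
var_int = 0.0                   # running ∫ S_u² du (deterministic)
for i in range(n_steps):
    S_u = 0.5 * np.cos(i * dt)                  # deterministic integrand
    stoch_int += S_u * rng.normal(0.0, np.sqrt(dt), size=n_paths)
    var_int += S_u**2 * dt
xi_T = np.exp(stoch_int - 0.5 * var_int)
print(xi_T.mean())   # ~ 1
```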


2.10.2 Novikov condition:

The above conditions imply that ξ(Z_t) is a square integrable martingale with

respect to the probability P.

Proof

We proceed now to provide a demonstration for the deterministic scenario. For a

more generic approach, see [Karatzas 1988].

Given that S is deterministic, the increment ∫_s^t S_u dW_u is Gaussian with variance ∫_s^t S_u² du, so

E[ ξ(Z_t) | F̃_s ] = e^{−½ ∫_0^t S_u² du} E[ e^{∫_0^t S_u dW_u} | F̃_s ]

= e^{−½ ∫_0^t S_u² du} e^{∫_0^s S_u dW_u} e^{½ ∫_s^t S_u² du}      (2.34)

If the last term is finite, then

E[ ξ(Z_t) | F̃_s ] = e^{∫_0^s S_u dW_u − ½ ∫_0^s S_u² du} = ξ(Z_s)      (2.35)

which is exactly the martingale property.

2.11 Martingale Representation Theorem

Recall that from the Doob Meyer Decomposition, we were able to convert any

asset price St into a martingale by simply separating it into a right continuous

martingale component M_t and an increasing process A_t that was measurable under the

information set F̃_t. This was

S_t = M_t + A_t

Let us consider now that we have

M_{t_k} = M_{t_0} + Σ_{i=1}^{k} H_{t_{i−1}} ( Z_{t_i} − Z_{t_{i−1}} )      (2.36)

or equivalently, its integral form

M_t = M_0 + ∫_0^t H_u dZ_u      (2.37)


• H_{t_{i−1}} is any random variable adapted to F̃_{t_{i−1}}. Each H_{t_{i−1}} is constant because it is

entirely known at time t_{i−1}.

• Z_{t_i} − Z_{t_{i−1}} is any martingale with respect to F̃_t and P. The increments are unknown

and unpredictable.

Given the above conditions, M_{t_k} is also a martingale with respect to F̃_t.

Proof:

If we calculate the expectation of the above (2.36), then

E_{t_0}[ M_{t_k} ] = E_{t_0}[ M_{t_0} ] + E_{t_0}[ Σ_{i=1}^{k} H_{t_{i−1}} ( Z_{t_i} − Z_{t_{i−1}} ) ]      (2.38)

M_{t_0} is known at time t_0, meaning that its expectation is its own value. The same is

true for H_{t_{i−1}} when we apply the expectation operator on it for time t_{i−1}. It too can

therefore exit the operator:

E_{t_0}[ M_{t_k} ] = M_{t_0} + Σ_{i=1}^{k} H_{t_{i−1}} · E_{t_{i−1}}[ Z_{t_i} − Z_{t_{i−1}} ]      (2.39)

Now from the martingale properties (2.4) we know that E_{t_{i−1}}[ Z_{t_i} − Z_{t_{i−1}} ] = 0. We

are therefore left with

E_{t_0}[ M_{t_k} ] = M_{t_0}      (2.40)

which, according to the definition of a martingale (2.3), implies that M_{t_k} is itself a

martingale.

We can therefore re-write the Doob-Meyer martingale decomposition as:

S_t = M_t + A_t = ( M_0 + ∫_0^t H_s dZ_s ) + ∫_0^t α_s ds      (2.41)

This formulation is what is known as the martingale representation theorem.


The martingale component M_t represents the degree of deviation about a given

trend, and is composed of H, a function adapted to M_s (that is in turn adapted to F̃_s),

Z_s, which is already a martingale given F̃_s and P, and a constant component M_0.

The trend A_t is obtained by the integration of α_t, which is a known, measurable

process given F̃_s.

2.12 Major Stochastic Differential Equations

We shall now proceed to give a brief outline of the major existing stochastic

differential equations. Many of the more complex developments can be grouped or

transformed into these more simple formulations.

2.12.1 Linear constant coefficients

Parameters: µ drift, σ diffusion

This model has the following well known stochastic differential equation:

dS_t = µ dt + σ dW_t^P      (2.42)

whose solution is S_t = S_0 + µ t + σ W_t      (2.43)

It is entirely defined by the measures

mean: E[dS_t] = µ dt

Variance(dS_t) = σ² dt

σ is called the normal volatility of the process.


Fig. 2.2. Linear Stochastic Differential Equation dynamics

The variable S_t fluctuates around a straight line whose slope is µ.

The fluctuations do not increase over time, and have no systematic jumps. The

variations are therefore entirely random.

2.12.2 Geometric Brownian Motion

It is the basis of the Black Scholes model, which as we shall see, has the following

stochastic differential equation:

dS_t = µ S_t dt + σ S_t dW_t^P      (2.44)

S_t = S_0 e^{(µ − ½σ²)t + σ W_t}      (2.45)

σ is called the volatility of the process

Variance(dS_t) = σ² S_t² dt

S_t has an exponential growth rate µ

Its random fluctuations increase with time, i.e. its variance increases with time

The ratio of change has constant parameters: dS_t / S_t = µ dt + σ dW_t^P

Fig. 2.3. Geometric Stochastic Differential Equation dynamics



2.12.3 Square Root

This is the same as the Black Scholes model only changing the variance to a

square root form as its name implies

dS_t = µ S_t dt + σ √S_t dW_t^P      (2.46)

Variance(dS_t) = σ² S_t dt

−= tt SdSVariance σ

St has an exponential growth rate µ

Its random fluctuations are much smaller than in the Black Scholes model.

S0

Fig. 2.4. Square Root Stochastic Differential Equation dynamics

2.12.4 Mean Reverting Process:

dS_t = λ(µ − S_t) dt + σ S_t dW_t^P      (2.47)

As St falls below the mean µ, the term (µ- St) becomes positive and so tends to

make dSt more positive still, attempting to ‘revert’ the dynamics towards its mean

trend. σ is still called the volatility of the process.

The speed of this reversion is defined by the parameter λ.

The Ornstein Uhlenbeck process as seen in (2.1) is therefore a particular case of this mean

reverting process.
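The reversion mechanism can be seen in a short Euler simulation. The sketch below uses the additive-noise (Ornstein–Uhlenbeck) variant with illustrative parameters λ, µ, σ; starting well below the mean, the paths are pulled back towards µ and settle at the stationary variance σ²/(2λ).

```python
import numpy as np

# Euler simulation of a mean-reverting process: when S falls below μ the
# drift λ(μ - S) is positive and pulls S back towards the mean. The
# additive-noise form and all parameter values are illustrative choices.
rng = np.random.default_rng(7)
n_paths, n_steps, T = 50_000, 1_000, 10.0
dt = T / n_steps
lam, mu, sigma = 2.0, 1.0, 0.3
S = np.full(n_paths, 0.2)                      # start well below the mean
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    S = S + lam * (mu - S) * dt + sigma * dW   # Euler step
print(S.mean())   # ~ μ = 1: the initial gap has been erased
print(S.var())    # ~ σ²/(2λ) = 0.0225, the stationary variance
```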


2.12.5 Stochastic Volatility

This consists in using a volatility parameter that is itself time dependent and

random:

dS_t = µ S_t dt + σ_t S_t dW_t^P

dσ_t = λ(σ_0 − σ_t) dt + β σ_t dZ_t^P      (2.48)

If σ_t = σ(t, S_t) is deterministic then we have a local volatility model.

The particularity of the above model lies in the fact that the volatility depends on

a different Brownian motion Z. Thus a new source of stochasticity is introduced into

the model that is different from that which we previously had: W, which before was

used to exclusively determine the underlying.

In the above, we have created a dynamics where the volatility has a mean

reverting process. Any other diffusion equation could also be considered, with the

simple addition of a stochastic volatility in the last term.
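The two-factor system (2.48) can be sketched with a simple Euler scheme, drawing W and Z as independent Brownian motions; all parameter values below (µ, λ, σ_0, β) are illustrative, and the floor at zero on σ is a purely numerical safeguard.

```python
import numpy as np

# Euler simulation of the stochastic-volatility system (2.48): the asset S
# is driven by W, while its volatility σ follows its own mean-reverting
# dynamics driven by an independent Brownian Z. Parameters illustrative.
rng = np.random.default_rng(8)
n_paths, n_steps, T = 50_000, 400, 1.0
dt = T / n_steps
mu, lam, sigma0, beta = 0.0, 3.0, 0.2, 0.5
S = np.full(n_paths, 100.0)
sig = np.full(n_paths, sigma0)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    dZ = rng.normal(0.0, np.sqrt(dt), size=n_paths)   # second noise source
    S = S * (1.0 + mu * dt + sig * dW)
    sig = np.maximum(sig + lam * (sigma0 - sig) * dt + beta * sig * dZ, 0.0)
print(sig.mean())   # ~ σ0 = 0.2: volatility reverts to its long-run level
print(S.mean())     # ~ 100 with μ = 0
```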


3. Historical Models

We will now present the historical development of the major models that have been used in the world of finance. We will follow a simple approach, stating the principal characteristics of each and discussing their flaws. Our aim here is to show the logical development of ideas from one model to the next: how each model builds on the previous one and attempts to solve its intrinsic problems. We hope that with this 'time-travel' the reader will arrive at the date on which this project was written, and recognise our project's developments as an intuitive 'next step' to what had previously been done.

3.1 The Black Scholes Model

The model was developed in the early 1970s by Fischer Black, Myron Scholes and Robert Merton. It was a major breakthrough in the pricing of stock options, and it has since had a huge influence on the way that traders price and hedge options. It was also pivotal to the growth and success of financial engineering in the 1980s and 1990s. In 1997 the importance of the model was recognised when Robert Merton and Myron Scholes were awarded the Nobel Prize for economics. Fischer Black had sadly died in 1995; he would otherwise undoubtedly also have been a recipient of the prize.

The equation was developed under the assumption that the price fluctuations of

the underlying security of an option can be described by an Ito process. Let the price S

of an underlying security be governed by a geometric Brownian motion process over a

time interval [0, T]. Then the price may be described as

dS_t = r S_t\,dt + \beta(t)\,dW_t^P   (3.1)

where W is a standard Brownian motion (or a Wiener Process). r is the risk

neutral rate of a risk free asset (a bond) over [0,T]. The value of the bond satisfies the

well known dynamics dB = rBdt. It is the theory behind the rate r that truly won

Black, Scholes and Merton the Nobel Prize.


We also impose that \beta(t) = \sigma S_t. Thus, we have

dS_t = r S_t\,dt + \sigma S_t\,dW_t^P   (3.2)

We cannot solve the above dynamics directly because the diffusion term is not deterministic: S_t is stochastic, despite the fact that \sigma is constant and deterministic in a first simplistic approach.

As previously shown, in stochastic calculus we seek to transform our expression into an Ornstein Uhlenbeck expression. Thus we seek

\frac{dS_t}{S_t} = r\,dt + \sigma\,dW_t^P   (3.3)

3.1.1 Solving the PDE

The easiest setting to tackle is therefore that in which r is deterministic and \sigma is constant. Then applying Ito's Lemma, we have

dg = \frac{\partial g}{\partial t}\,dt + \frac{\partial g}{\partial S}\,dS_t + \frac{1}{2}\,\frac{\partial^2 g}{\partial S^2}\,\sigma^2 S_t^2\,dt   (3.4)

where

g(S_t, t) = \mathrm{Ln}\,S_t   (3.5)

\frac{\partial g}{\partial t} = 0, \qquad \frac{\partial g}{\partial S} = \frac{1}{S_t}, \qquad \frac{\partial^2 g}{\partial S^2} = -\frac{1}{S_t^2}

d\mathrm{Ln}\,S_t = 0\,dt + \frac{1}{S_t}\,dS_t - \frac{1}{2}\,\sigma^2 S_t^2\,\frac{1}{S_t^2}\,dt   (3.6)

Substituting dS_t we now have

d\mathrm{Ln}\,S_t = \frac{1}{S_t}\left(r S_t\,dt + \sigma S_t\,dW_t^P\right) - \frac{1}{2}\,\sigma^2\,dt   (3.7)


Integrating from 0 to T and exponentiating,

S_T = S_0\; e^{\int_0^T r_t\,dt \;-\; \frac{\sigma^2}{2}T \;+\; \sigma W_T^P}   (3.8)

Having obtained a solution to the diffusion of the asset dS_t, we can now price a Call of strike K and yield r. But first, we must realize the following: as a direct consequence of the expression

dS_t = r S_t\,dt + \beta(t)\,dW_t^P   (3.9)

we know that the expectation of any such tradable asset is

S_t = E_t^P\left[ S_U\; e^{-\int_t^U r_s\,ds} \right]   (3.10)

Proof

d\mathrm{Ln}\,S_s = r\,ds + \frac{\beta(s)}{S_s}\,dW_s^P - \frac{\beta^2(s)}{2 S_s^2}\,ds   (3.11)

S_U = S_t\; e^{\int_t^U r\,ds \;+\; \int_t^U \frac{\beta(s)}{S_s}\,dW_s^P \;-\; \int_t^U \frac{\beta^2(s)}{2 S_s^2}\,ds}   (3.12)

S_U\; e^{-\int_t^U r\,ds} = S_t\; e^{\int_t^U \frac{\beta(s)}{S_s}\,dW_s^P \;-\; \int_t^U \frac{\beta^2(s)}{2 S_s^2}\,ds}   (3.13)

Taking expectations,

E_t^P\left[ S_U\; e^{-\int_t^U r\,ds} \right] = S_t\; E_t^P\left[ e^{\int_t^U \frac{\beta(s)}{S_s}\,dW_s^P \;-\; \int_t^U \frac{\beta^2(s)}{2 S_s^2}\,ds} \right] = S_t   (3.14)

The exponent is Gaussian, and the second integral is exactly half its variance; the exponential is therefore a martingale, so its conditional mean is 1.


3.1.2 Pricing a Call by Black Scholes1

Let us consider now the pricing of a Call Option. A Call gives the buyer of the

option the right, but not the obligation, to buy an agreed quantity of a particular

commodity or financial instrument (the underlying instrument) from the seller of the

option at a certain time (the expiration date) and for a certain price (the strike price).

The seller (or "writer") is obligated to sell the commodity or financial instrument

should the buyer so decide. The buyer pays a fee (called a premium) for this right. It

may be useful for the reader to jump momentarily to Chapter 6.1 at this point for a

detailed description of a Call option.

Suppose that we are holding a call option and that at time T the price of the

underlying asset is ST < K. In this case, we would not be interested in buying the asset

at price K. Thus we would not exercise the option, and our profit from this contract

would be 0. On the other hand, if ST > K, we would be ready to buy the asset at the

lower price K from the unfortunate underwriter of our call option, and then go on to

the market to sell the share of our underlying, so as to make a profit of ST - K.

Thus, at time T, the expected benefit obtained from the call option would be

C_0 = E_0^P\left[ e^{-\int_0^T r_t\,dt}\,(S_T - K)^+ \right]   (3.15)

where (S_T - K)^+ = \max[S_T - K,\, 0].

Note that the exponential is always greater than zero, so it can be taken inside the positive-part operator:

C_0 = E_0^P\left[ \left( e^{-\int_0^T r_t\,dt}\, S_T - K\, e^{-\int_0^T r_t\,dt} \right)^+ \right]   (3.16)

Let us define the indicator function¹

1_{S_T > K} = 1 \ if\ S_T > K, \qquad 1_{S_T > K} = 0 \ if\ S_T < K

Then

C_0 = E_0^P\left[ e^{-\int_0^T r_t\,dt}\, S_T\; 1_{S_T > K} \right] - E_0^P\left[ K\, e^{-\int_0^T r_t\,dt}\; 1_{S_T > K} \right]   (3.17)

¹ Note that we could also have derived Black's formula by using the equivalent martingale measure \tilde P.

Now consider the second term above. The exponential (deterministic) is independent of S_T, so it can be extracted from the expectation since its value is well known:

E_0^P\left[ K e^{-\int_0^T r_t\,dt}\; 1_{S_T > K} \right] = K e^{-\int_0^T r_t\,dt} \int_K^{\infty} dP(S_T) = K e^{-\int_0^T r_t\,dt}\; \mathrm{prob}(S_T > K)   (3.18)

Analysing the first term, we see that we are left with

E_0^P\left[ e^{-\int_0^T r_t\,dt}\, S_T\; 1_{S_T > K} \right] = e^{-\int_0^T r_t\,dt}\, E_0^P\left[ S_T\; 1_{S_T > K} \right] = e^{-\int_0^T r_t\,dt} \int_K^{\infty} S_T\, dP(S_T)   (3.19)

Let us substitute S_T with the formula that we derived previously in (3.8):

S_T = S_0\; e^{\int_0^T r_t\,dt \;-\; \frac{\sigma^2}{2}T \;+\; \sigma W_T^P}

We then have

e^{-\int_0^T r_t\,dt} \int_K^{\infty} S_T\, dP(S_T) = e^{-\int_0^T r_t\,dt} \int_K^{\infty} S_0\; e^{\int_0^T r_t\,dt \;-\; \frac{\sigma^2}{2}T \;+\; \sigma W_T^P}\, dP   (3.20)

Now notice that

S_T > K \;\Longleftrightarrow\; S_T\, e^{-\int_0^T r_t\,dt} > K\, e^{-\int_0^T r_t\,dt}   (3.21)

Once again substituting S_T,

S_0\; e^{\int_0^T r_t\,dt \;-\; \frac{\sigma^2}{2}T \;+\; \sigma W_T^P} > K \;\Longleftrightarrow\; \mathrm{Ln}\,S_0 - \frac{\sigma^2}{2}T + \sigma W_T^P > \mathrm{Ln}\,K - \int_0^T r_t\,dt   (3.22)

The property of Brownian motions seen in Chapter 2.4 lies in the fact that they can be split into a normal and a temporal component. Thus if W_T^P = U\sqrt{T} with U \sim N(0,1), then

S_T > K \;\Longleftrightarrow\; U > \frac{\mathrm{Ln}\frac{K}{S_0} - \int_0^T r_t\,dt + \frac{\sigma^2}{2}T}{\sigma\sqrt{T}}   (3.23)

We shall call

d_0 = \frac{\mathrm{Ln}\frac{K}{S_0} - \int_0^T r_t\,dt + \frac{\sigma^2}{2}T}{\sigma\sqrt{T}}   (3.24)

Evidently, the exercise condition is therefore U > d_0. At this point we have both the first and second terms in expression (3.17), and can substitute them into the equation to find the call's price discounted to present, assuming a deterministic rate r:

C_0 = S_0\, e^{-\int_0^T r_t\,dt} \int_{d_0}^{\infty} \frac{1}{\sqrt{2\pi}}\; e^{\int_0^T r_t\,dt \,-\, \frac{\sigma^2}{2}T \,+\, \sigma U\sqrt{T}}\; e^{-\frac{U^2}{2}}\, dU \;-\; K e^{-\int_0^T r_t\,dt}\; \mathrm{prob}(U > d_0)

= S_0 \int_{d_0}^{\infty} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{(U - \sigma\sqrt{T})^2}{2}}\, dU \;-\; K e^{-\int_0^T r_t\,dt}\, N(-d_0)

= S_0 \int_{d_0 - \sigma\sqrt{T}}^{\infty} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{X^2}{2}}\, dX \;-\; K e^{-\int_0^T r_t\,dt}\, N(-d_0)

= S_0\, N(\sigma\sqrt{T} - d_0) \;-\; K e^{-\int_0^T r_t\,dt}\, N(-d_0)

= S_0\, N(d_1) \;-\; K e^{-\int_0^T r_t\,dt}\, N(d_2)   (3.25)

where d_1 = \sigma\sqrt{T} - d_0 and d_2 = -d_0 = d_1 - \sigma\sqrt{T}.

It is important to notice that the price of the Call depends only on

• its price today (t_0)

• the probability distribution of S_T at its maturity T

• the discount factors

Therefore, all that occurs in between these two dates is irrelevant to us.
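The closed form (3.25) is straightforward to implement. The sketch below assumes a constant rate r, so that \int_0^T r_t\,dt = rT; the function names are our own.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, k, r, sigma, T):
    """Eq. (3.25): C_0 = S_0 N(d_1) - K e^{-rT} N(d_2), with constant r."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma * sigma) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)
```

For example, black_scholes_call(100, 100, 0.05, 0.2, 1.0) returns roughly 10.45, a standard textbook value for an at-the-money one-year call.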


Fig. 3.1. Only the present call value is relevant to compute its future price. Any intermediate time-periods are irrelevant

3.2 Beyond Black

The Black Scholes model presents several difficulties that it cannot surmount. As

we have seen, in theory it has a single volatility σ for every K and maturity T.

This however, is not what is perceived by the traders in the markets. If we were to

set up a matrix of prices for K vs T for a given underlying asset, and perform the

inverse transformation of the Black Scholes formula so as to find their corresponding

σ, we would discover that their σ is not unique, and instead varies with K and T.

Therefore, the two main problems presented by the Black Scholes’ model are:

• Smile: for any given maturity T, there are different Black implied volatilities σBlack

for different strikes K.

• Term Structure: for any given strike K, there are different Black implied

volatilities σBlack for different maturities T.



Fig. 3.2. Term structure of vanilla options: Black implied volatility \sigma_{Black} vs strike, showing a short-maturity smile and a long-maturity skew

The lack of a unique volatility as assumed by Black does not mean that the Black

Scholes model is useless to our case. As stated initially, the model has triggered an

enormous amount of research and revolutionised the practice of finance. Further, we

have learnt in the above development that we can successfully use the Black Scholes

model as a ‘translator’ between option prices and their σBlack. Both have parallel

dynamics, thus, any movement in prices will result in a similar movement in the

asset's σBlack. We will explore the advantage of this property in the following

discussion.

3.2.1 Term Structure

For a given strike K, we may have the following structure: a volatility \sigma_1 quoted over [0, T_1], with Variance = \sigma_1^2 T_1; and for a different product, a volatility \sigma_2 quoted over [0, T_2], with Variance = \sigma_2^2 T_2.

As stated previously, Black's implied volatility is not unique, therefore \sigma \neq \sigma_1 \neq \sigma_2.


3.2.2 Time Dependent Black

We can solve this inconvenience by modelling \sigma as the mean value of the variance accumulated between 0 and T. This is simply obtained by replacing, in the original equation,

\sigma^2 T \;\to\; \int_0^T \sigma^2(t)\,dt   (3.26)

Thus we have a deterministic \sigma(t), yielding

dS_t = r S_t\,dt + \sigma(t) S_t\,dW_t^P   (3.27)

which is known as 'Time Dependent Black'. We now have a unique \sigma(t), so can write

Var_1 = \int_0^{T_1} \sigma^2(t)\,dt

Var_2 = \int_0^{T_2} \sigma^2(t)\,dt = \int_0^{T_1} \sigma^2(t)\,dt + \int_{T_1}^{T_2} \sigma^2(t)\,dt = Var_1 + Var_{12}
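The additivity of variance above gives a direct way to bootstrap the piecewise volatility over (T_1, T_2] from two quoted Black volatilities. A minimal sketch (the function name is ours):

```python
import math

def forward_vol(sigma1, T1, sigma2, T2):
    """Piecewise-constant volatility over (T1, T2] implied by two Black
    volatilities, using additivity of variance:
        sigma2^2 * T2 = sigma1^2 * T1 + sigma_fwd^2 * (T2 - T1)."""
    var_fwd = (sigma2 * sigma2 * T2 - sigma1 * sigma1 * T1) / (T2 - T1)
    if var_fwd < 0.0:
        raise ValueError("total variance must be non-decreasing in maturity")
    return math.sqrt(var_fwd)
```

A flat term structure (sigma1 = sigma2) returns the same volatility, while an upward-sloping one returns a forward volatility above both quotes.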

3.2.3 Smile

We cannot use a similar change as in the previous case and convert our equation to dS_t = r S_t\,dt + \sigma(K) S_t\,dW_t^P. This is because we do not know the strike K at which the asset will be sold in the market.


3.3 Lognormal Classic Black

We have so far seen that we can write the diffusion of the Black Scholes model as

dS_t = r S_t\,dt + \sigma S_t\,dW_t^P   (3.28)

The second term is not normal: it includes a Brownian motion (which is normal), but also the term S_t, which itself follows a stochastic diffusion. We can rewrite the above as

\frac{dS_t}{S_t} = r\,dt + \sigma\,dW_t^P   (3.29)

We call it a lognormal model because the term \frac{dS_t}{S_t} is subsequently transformed through Ito into d\mathrm{Ln}\,S_t. Therefore it is the log of S_t that is normal, as its diffusion term depends only on the Brownian parameter and a constant sigma.

As the diffusion is constant in the lognormal model, when it is represented in a strike vs \sigma_{Black} graph, we obtain a flat curve:

Fig. 3.3. Flat lognormal Black volatilities (\sigma_{Black} vs strike K)


3.4 Normal Black

We could attempt to transform the probability distribution that our equation follows to a different form. Thus by taking

dS_t = r S_t\,dt + \sigma\,dW_t^P   (3.30)

the second term now consists of a constant volatility attached to a Gaussian Brownian motion. This diffusion term is already normal, without having to transform it to a lognormal version. The above is commonly rewritten as

dS_t = r S_t\,dt + \sigma S_0\,dW_t^P   (3.31)

where we have included a constant term S_0 so that the magnitude of the volatility is comparable to that in the classic Black Scholes model.

The previous formulation can be transformed to its lognormal version so as to compare its dynamics with the classic lognormal Black Scholes model:

\frac{dS_t}{S_t} = r\,dt + \sigma\,\frac{S_0}{S_t}\,dW_t^P   (3.32)

Fig. 3.4. Normal and lognormal Black Scholes model comparison: a) call price vs strike (almost indistinguishable); b) Black volatility vs strike (lognormal flat, normal skew)

The comparison between the two models must be made under the classic \sigma_{Black}. As seen above, if we were to compare the normal and lognormal models in terms of their Call prices, we would find it difficult to perceive any difference. Thus we realise the utility of using \sigma_{Black}: it allows us to compare and clearly distinguish models that are otherwise indistinguishable under their price measures.

This matches up with our previous section’s discussion in which we questioned the

utility of calculating σBlack if we knew that the Black Scholes model could not correctly

model varying local volatilities.

The main problem with the normal Black model is the fact that it imposes a slope

that doesn’t always coincide with that which is observed in real markets. Thus a

logical continuation to the model is proposed.

3.5 Black Shifted

dS_t = r S_t\,dt + \sigma\big(\alpha S_t + (1-\alpha) S_0\big)\,dW_t^P   (3.33)

This model allows for a variety of slopes ranging between the skewed normal and the flat lognormal version. The parameter \alpha acts as the weight associated to each of the two models. Market data shows that the general level of volatility is imposed at the money (ATM), and is basically independent of the \alpha parameter.

Fig. 3.5. Alpha skew modelling: \sigma_{Black} vs strike K, from the normal skew (\alpha = 0) to the flat lognormal (\alpha = 1)

The interpretation of this new model is best made when analysed from the classic lognormal \sigma_{Black} perspective:

\frac{dS_t}{S_t} = r\,dt + \sigma\left(\alpha + (1-\alpha)\frac{S_0}{S_t}\right)dW_t^P   (3.34)

Just as we had expected, we see that by varying the alpha parameter we are capable of adjusting our model to any true market slope.
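The diffusion coefficient in (3.34) can be read as a spot-dependent effective lognormal volatility. A minimal sketch, with an illustrative function name of our own:

```python
def shifted_black_effective_vol(s, s0, sigma, alpha):
    """Effective lognormal volatility in eq. (3.34):
       sigma * (alpha + (1 - alpha) * S0 / S).
       alpha = 1 recovers the flat lognormal model; alpha = 0 the normal skew."""
    return sigma * (alpha + (1.0 - alpha) * s0 / s)
```

For alpha < 1 the effective volatility rises as S falls below S_0, producing the downward-sloping skew discussed above.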

3.6 Local Volatility - Dupire's Model

\frac{dS_t}{S_t} = r\,dt + \sigma(t, S_t)\,dW_t^P   (3.35)

where \sigma(t, S_t) acts as the (local) volatility multiplying dW_t^P.

Recall that in Black's model there is a one-to-one relationship between the price of a European option and the volatility parameter \sigma_{Black}. This was seen clearly in Fig. 3.3, where the lognormal Black model was a flat straight line, constant at all strikes K. Consequently, option prices are often quoted by stating the implied volatility \sigma_{Black}: the unique value of the volatility which yields the option's market price when used in Black's model. In theory, the volatility \sigma_{Black} in Black's model is a constant, but as we have already stated, in practice options with different strikes K require different volatilities \sigma_{Black} to match their market prices. For example, for a unique maturity, market data may present this form:

Fig. 3.6. Market data smile


Handling these market skews and smiles correctly is critical to fixed income and

foreign exchange desks, since these desks usually have large exposures across a wide

range of strikes. Yet the inherent contradiction of using different volatilities for

different options makes it difficult to successfully manage these risks using Black’s

model.

The development of local volatility models by Dupire and by Derman and Kani was a major advance in handling smiles and skews. Local volatility models are self-consistent, arbitrage-free, and can be calibrated to precisely match observed market

smiles and skews. Currently these models are the most popular way of managing

smile and skew risk. However, as we shall discover, the dynamic behaviour of smiles

and skews predicted by local volatility models is exactly opposite to the behaviour

observed in the marketplace: when the price of the underlying asset decreases, local

volatility models predict that the smile shifts to higher prices; when the price

increases, these models predict that the smile shifts to lower prices. In reality, asset

prices and market smiles move in the same direction.

This contradiction between the model and the marketplace tends to de-stabilize

the delta and vega hedges derived from local volatility models, and often these

hedges perform worse than the naive Black-Scholes’ hedges.

3.6.1 Detailed Comparison

We will now advance to derive a more detailed comparison between the

traditional Black model and the Dupire model.

Consider a European call option on an asset A with exercise date tex, settlement

date tset , and strike K. If the holder exercises the option on tex, then on the settlement

date tset he receives the underlying asset A and pays the strike K. To derive the value

of the option, define \hat F(t) to be the forward price of the asset for a forward contract that matures on the settlement date t_{set}, and define f = \hat F(0) to be today's forward


price. Also let B(t) be the discount factor for date t; that is, let B(t) be the value today of

1 unit of currency to be delivered on date t. Martingale pricing theory asserts that under the “usual

conditions,” there is a measure, known as the forward measure, under which the

value of a European option can be written as the expected value of the payoff. The

value of a call option is

V_{call} = B(t_{set})\; E\big[ (\hat F(t_{ex}) - K)^+ \,\big|\, \mathcal{F}_0 \big]   (3.36)

and the value of the corresponding European put is

V_{put} = B(t_{set})\; E\big[ (K - \hat F(t_{ex}))^+ \,\big|\, \mathcal{F}_0 \big]   (3.37)

V_{put} = V_{call} + B(t_{set})(K - f)   (3.38)

(Refer to the Discount Factor section in Chapter 5.1 to learn that a future payoff

can be discounted to its present value by continuous compounding, which can be

equivalently expressed as a bond maturing at the future date).

Here the expectation E is over the forward measure, and \mathcal{F}_0 can be interpreted

as “given all information available at t = 0.” Martingale pricing theory also shows that

the forward price \hat F(t) is a Martingale under this measure, so the Martingale representation theorem shows that \hat F(t) obeys

d\hat F(t) = C(t, *)\,dW, \qquad \hat F(0) = f   (3.39)

for some coefficient C(t, *), where dW is a Brownian motion in this measure. The coefficient C(t, *) may be deterministic or random, and may depend on any information that can be resolved by time t.

This is as far as the fundamental theory of arbitrage free pricing goes. In

particular, one cannot determine the coefficient C(t,*) on purely theoretical grounds.

Instead one must postulate a mathematical model for C (t, * ).


European swaptions fit within an identical framework (refer to the swaption description in Chapter 6.9). Consider a European swaption with exercise date t_{ex} and fixed rate (strike) K. Let \hat R_s(t) be the swaption's forward swap rate as seen at date t, and let \hat R_s(0) = R_0 be the forward swap rate as seen today. The value of a payer swaption is

V_{pay} = \sum_{i=0}^{n-1} m B(t; T_i)\; E\big[ (\hat R_s(t_{ex}) - K)^+ \,\big|\, \mathcal{F}_0 \big]   (3.40)

and the value of a receiver swaption is

V_{rec} = \sum_{i=0}^{n-1} m B(t; T_i)\; E\big[ (K - \hat R_s(t_{ex}))^+ \,\big|\, \mathcal{F}_0 \big]   (3.41)

V_{rec} = V_{pay} + \sum_{i=0}^{n-1} m B(t; T_i)\; (K - R_0)   (3.42)

Here the level \sum_{i=0}^{n-1} m B(t; T_i) is today's value of the annuity, which is a known quantity, and E is the expectation. The forward swap rate \hat R_s(t) is a Martingale in this measure, so once again

d\hat R_s(t) = C(t, *)\,dW, \qquad \hat R_s(0) = R_0   (3.43)

where dW is a Brownian motion. As before, the coefficient C(t, *) may be deterministic or random, and cannot be determined from fundamental theory. Apart from notation, this is identical to the framework provided by the previous equations for European calls and puts. Caplets and floorlets can also be included in this picture, since they are just one-period payer and receiver swaptions.

3.6.2 Black’s model and implied volatilities.

To go any further requires postulating a model for the coefficient C(t, *). We saw in previous sections that Black postulated that the coefficient C(t, *) is \sigma_B \hat F(t), where the volatility \sigma_B is a constant. The forward price \hat F(t) is then a geometric Brownian motion:

d\hat F(t) = \sigma_B \hat F(t)\,dW, \qquad \hat F(0) = f   (3.44)

Evaluating the expected values in equations (3.36) and (3.37) under this model then yields Black's formula,

V_{call} = B(t_{set})\big( f\, N(d_1) - K\, N(d_2) \big)

V_{put} = V_{call} + B(t_{set})(K - f)   (3.45)

where

d_{1,2} = \frac{\log(f/K) \pm \frac{1}{2}\,\sigma_B^2\, t_{ex}}{\sigma_B \sqrt{t_{ex}}}   (3.46)

for the price of European calls and puts, as is well-known. All parameters in

Black’s formula are easily observed, except for the volatility σBlack .

An option's implied volatility is the value of \sigma_{Black} that needs to be used in Black's formula so that this formula matches the market price of the option. Since the call (and put) prices in equations (3.45) are increasing functions of \sigma_{Black}, the volatility \sigma_{Black} implied by the market price of an option is unique. Indeed, in many markets it is standard practice to quote prices in terms of the implied volatility \sigma_{Black}; the option's dollar price is then recovered by substituting the agreed-upon \sigma_{Black} into Black's formula.

The derivation of Black’s formula presumes that the volatility σBlack is a constant

for each underlying asset A. However, the implied volatility needed to match market

prices nearly always varies with both the strike K and the time-to-exercise tex .


Fig. 3.7. Smiles at different maturities

Changing the volatility σBlack means that a different model is being used for the

underlying asset for each K and tex.

3.6.3 Local volatility models.

In an insightful work, Dupire essentially argued that Black was too bold in setting

the coefficient C(t, *) to \sigma_B \hat F(t). Instead one should only assume that C is Markovian: C = C(t, \hat F). Re-writing C(t, \hat F) as \sigma_{loc}(t, \hat F)\,\hat F then yields the "local volatility model," where the forward price of the asset is

d\hat F = \sigma_{loc}(t, \hat F)\,\hat F\,dW, \qquad \hat F(0) = f   (3.47)

in the forward measure. Dupire argued that instead of theorizing about the

unknown local volatility function \sigma_{loc}(t, \hat F), one should obtain it directly from the

marketplace by “calibrating” the local volatility model to market prices of liquid

European options.


3.6.4 Calibration

In calibration, one starts with a given local volatility function \sigma_{loc}(t, \hat F), and evaluates

V_{call} = B(t_{set})\; E\big[ (\hat F(t_{ex}) - K)^+ \,\big|\, \mathcal{F}_0 \big]   (3.48)

V_{put} = V_{call} + B(t_{set})(K - f)   (3.49)

to obtain the theoretical prices of the options; one then varies the local volatility function \sigma_{loc}(t, \hat F) until these theoretical prices match the actual market prices of the options for each strike K and exercise date t_{ex}. In practice, liquid markets usually exist only for options with specific exercise dates t_{ex}^1, t_{ex}^2, t_{ex}^3, \dots; for example, for 1m, 2m, 3m, 6m, and 12m from today. Commonly the local volatilities \sigma_{loc}(t, \hat F) are taken to be piecewise constant in time:

\sigma_{loc}(t, \hat F) = \sigma_{loc}^1(\hat F) \quad for\ t < t_{ex}^1

\sigma_{loc}(t, \hat F) = \sigma_{loc}^j(\hat F) \quad for\ t_{ex}^{j-1} < t < t_{ex}^j, \ \ j = 2, 3, \dots, J

\sigma_{loc}(t, \hat F) = \sigma_{loc}^J(\hat F) \quad for\ t > t_{ex}^J

One first calibrates \sigma_{loc}^1(\hat F) to reproduce the option prices at t_{ex}^1 for all strikes K, then calibrates \sigma_{loc}^2(\hat F) to reproduce the option prices at t_{ex}^2 for all K, and so forth.
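The piecewise-constant-in-time structure above is easy to represent in code. The following sketch (names are ours) assembles a local volatility function from per-bucket slices, extending the last slice beyond the final exercise date:

```python
import bisect

def make_piecewise_local_vol(exercise_dates, vol_slices):
    """Piecewise-constant-in-time local volatility: vol_slices[0] applies
    for t up to the first exercise date, vol_slices[j] on the j-th bucket,
    and the last slice beyond the final exercise date.
    Each slice is a function F -> sigma_loc(F)."""
    def sigma_loc(t, F):
        j = bisect.bisect_left(exercise_dates, t)   # first date >= t
        j = min(j, len(vol_slices) - 1)             # extend the last slice
        return vol_slices[j](F)
    return sigma_loc
```

Calibration then amounts to choosing each slice so that the model reproduces the quoted option prices at the corresponding exercise date.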

(This calibration process can be greatly simplified by solving for the prices of European options under the local volatility model (3.47) to (3.49); from these prices one obtains explicit algebraic formulas for the implied volatility of the local volatility model.)

Once \sigma_{loc}(t, \hat F) has been obtained by calibration, the local volatility model is a

single, self-consistent model which correctly reproduces the market prices of calls

(and puts) for all strikes K and exercise dates tex without “adjustment.” Prices of exotic

options can now be calculated from this model without ambiguity. This model yields


consistent delta and vega risks for all options, so these risks can be consolidated across

strikes. Finally, perturbing f and re-calculating the option prices enables one to

determine how the implied volatilities change with changes in the underlying asset

price. Thus, the local volatility model provides a method of pricing and hedging

options in the presence of market smiles and skews. It is perhaps the most popular

method of managing exotic equity and foreign exchange options.

Unfortunately, the local volatility model predicts the wrong dynamics of the

implied volatility curve, which leads to inaccurate and often unstable hedges. Local

volatility models predict that the market smile/skew moves in the opposite direction

with respect to the price of the underlying asset. This is opposite to typical market

behaviour, in which smiles and skews move in the same direction as the underlying.

Fig. 3.8. Implied volatility σB(K,f) if forward price decreases from f0 to f (solid line)

Fig. 3.9. Implied volatility σB(K,f) if forward price increases from f0 to f (solid line).



Let us explain in a little more detail this calculation for the future projection of

local volatilities. Given that we know the σ(t, St) up until the present time instant t,

how do we construct the model up to time t + dt?

At t we know the price of the asset St. We can also calculate for it a set of possible

future scenarios, each with their respective probabilities of occurring. Thus we have

the corresponding probability distribution at time t + dt for all the values that the local

volatility could possibly take.

Fig. 3.10. Future asset volatility scenarios for a given asset

By analysing for a given asset the continuum of market prices for every strike at

time t, and projecting their scenarios into the future, it is possible to discover for each

price the corresponding probability distribution at time t + dt. Thus, on obtaining the

probability distribution, we can then simply invert the Black Scholes formula so as to

obtain each σBlack for each of the product’s strikes.

Fig. 3.11. Future volatility scenarios for different strikes of the same underlying asset


Each strike has a particular σBlack at a given date T, just as is shown by the market

quoted data. The same procedure can be repeated at every desired maturity T,

therefore also obtaining a σBlack for a continuum of times. Any other desired σBlack can

simply be calculated by interpolating between two known σBlack derived from the

market data. Note that the method used here for interpolation will have a great

impact on the final value and smile of the various σBlack.

Proof:

Let us set out by writing the expression for a Call through a typical stochastic diffusion equation, i.e. one composed of a risk neutral drift and a stochastic component; applying Ito's Lemma then yields

dC = r C\,dt + C\,\sigma(t, S_t)\,dW_t^P \quad\xrightarrow{Ito}\quad dC = \frac{\partial C}{\partial t}\,dt + \frac{\partial C}{\partial S}\,dS + \frac{1}{2}\,\frac{\partial^2 C}{\partial S^2}\,\sigma^2(t, S_t)\, S_t^2\,dt   (3.50)

From before, we also know that dS_t = r S_t\,dt + \sigma(t, S_t)\, S_t\,dW_t^P. Thus, replacing, we obtain

dC = \left( \frac{\partial C}{\partial t} + r S_t\,\frac{\partial C}{\partial S} + \frac{1}{2}\,\frac{\partial^2 C}{\partial S^2}\,\sigma^2(t, S_t)\, S_t^2 \right) dt + \frac{\partial C}{\partial S}\,\sigma(t, S_t)\, S_t\,dW_t^P   (3.51)

We can apply here the fact that the Call payoff is (S - K)^+, which depends on S and K only through the difference S - K: for a known fixed payoff, if the underlying asset S increases, the strike must consequently decrease. This implies the following relationships between the given variables:

\frac{\partial C}{\partial S} = -\frac{\partial C}{\partial K}, \qquad \frac{\partial^2 C}{\partial S^2} = \frac{\partial^2 C}{\partial K^2}   (3.52)

Similarly,

\frac{\partial C}{\partial t} = -\frac{\partial C}{\partial T}   (3.53)

Ignoring the Brownian term in dW, we obtain

dC = \left( -\frac{\partial C}{\partial T} - r S_t\,\frac{\partial C}{\partial K} + \frac{1}{2}\,\sigma^2(t, S_t)\, S_t^2\,\frac{\partial^2 C}{\partial K^2} \right) dt   (3.54)

All of these quantities are known from the market data with respect to every strike and maturity, as stated before. The above relation can therefore easily be solved, obtaining a value for \sigma^2(t, S_t) at each (K, T).

Thus, given a continuum of market prices with respect to the strike K and the maturity T, it is possible to construct a unique solution for \sigma(t, S_t). The model is known as Local Volatility because it associates a local \sigma to each pair (K, T).
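The extraction of \sigma^2 at each (K, T) can be sketched numerically by differentiating a call-price surface with finite differences, using the standard form of Dupire's relation under deterministic rates. The function name and grid spacings below are illustrative assumptions of ours.

```python
import math

def dupire_local_vol(call_price, K, T, r=0.0, dK=0.5, dT=1e-3):
    """Local volatility at (K, T) from a call-price surface C(K, T),
    via central finite differences on Dupire's relation:
        sigma_loc^2(K, T) = (dC/dT + r*K*dC/dK) / (0.5 * K^2 * d2C/dK2).
    `call_price` is any callable (K, T) -> price."""
    dC_dT = (call_price(K, T + dT) - call_price(K, T - dT)) / (2.0 * dT)
    dC_dK = (call_price(K + dK, T) - call_price(K - dK, T)) / (2.0 * dK)
    d2C_dK2 = (call_price(K + dK, T) - 2.0 * call_price(K, T)
               + call_price(K - dK, T)) / (dK * dK)
    return math.sqrt((dC_dT + r * K * dC_dK) / (0.5 * K * K * d2C_dK2))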

3.6.5 Problems

The model correctly simulates the market smile at any strike and for any maturity

T starting from the present date t0. However, it also presents a set of intrinsic

problems:

1. The computation is much slower as the algorithm advances by infinite time

intervals dt which can consist in periods of time composed by days or weeks. We saw

that the Black Scholes model could jump directly from the data in time t0 to the

probability distribution at maturity T (say 1, 10 or 30 years in the future) without the

need for any intermediate steps.

2. Experience has demonstrated that although the model is extremely accurate for

evaluating products that start today and end at time T, it is not accurate for forward

products that start at a future time U and end at T. The dynamics of the products

obtained through the Dupire model results in an almost flat σBlack vs K curve, whereas

market data show a smile.

3. The Dupire model implies that the volatility at a given time depends only on

the underlying ST whereas experience has demonstrated that this dependence is not

really constant. This implies the need of a further parameter to introduce another

stochastic source into our expression

The above concerns have taken the quant world to step back from the Dupire

model when analysing future products- and this has meant the need to return to the

Page 73: HJM Framework

A Practical Implementation of the Heath – Jarrow – Morton Framework

59

models that we were previously discussing, i.e. the normal and lognormal Black

Scholes models, or the Black shifted, although with very subtle modifications.

3.7 Stochastic Volatility

P

tt

t VdWrdtS

dS += (3.55)

P

ttt dZgdtfdV += (3.56)

In this model we assume that the volatility itself follows a stochastic process.

Notice that the Brownian motion driving the volatility process is different from the

asset’s Wiener process. This is the approach followed to introduce a new variable of

stochasticity different from dW, as discussed previously. Both Brownian motions may

be correlated

Pt

Ptt dZdWdt ,=ρ (3.57)

Note also that ft represents the drift term of the volatility V, whereas gt represents

the volatility of the volatility, otherwise referred to as Vol of Vol.

The advantage of this new model is that before, all the stochasticity derived

from S, and in turn from W. Now we have two sources of stochasticity, W and Z,

which, if correlated with ρ = 1, give way to the previously

discussed normal and lognormal models.
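As an illustration (not from the original text), the dynamics (3.55)–(3.56) can be sketched with a minimal Euler discretisation; the lognormal form of (3.55), the constant drift f and vol-of-vol g, and the volatility floor are assumptions of this sketch:

```python
import math
import random

def simulate_stoch_vol(s0, v0, r, f, g, rho, dt, n_steps, seed=0):
    """Euler sketch of dS = r*S*dt + V*S*dW and dV = f*dt + g*dZ,
    where corr(dW, dZ) = rho (constant f, g assumed for simplicity)."""
    rng = random.Random(seed)
    s, v = s0, v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dw = math.sqrt(dt) * z1
        # Cholesky construction of a Brownian increment correlated with dw
        dz = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        s += r * s * dt + v * s * dw
        v = max(v + f * dt + g * dz, 0.0)  # crude floor: keep volatility non-negative
    return s, v
```

Setting ρ = ±1 collapses the two Brownian motions into one, recovering the single-factor models discussed above.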

This approach has given way to the models known as:

• Heston 1994

• SABR 1999

Current lines of thought suggest that the path to follow include a combination of

local volatility versus stochastic volatility.


3.7.1 Black shifted with Stochastic Volatility

Developed around 2002–2003, this model combines the shifted diffusion with a stochastic volatility:

dS_t = r S_t dt + V_t (α S_t + (1 − α) S_0) dW_t^P

dV_t = f_t dt + g_t dZ_t^P     (3.58)

3.7.2 Local volatility versus simple Stochastic Volatility

dS_t = r S_t dt + V_t σ(t, S_t) dW_t^P

dV_t = λ (V_0 − V_t) dt + g_0 dZ_t^P     (3.59)

where the stochastic volatility includes a mean reversion term, and where the

volatility of volatilities is a constant.

3.8 SABR

Most markets experience both relatively quiescent and relatively chaotic periods.

This suggests that volatility is not constant, but is itself a random function of time.

Respecting the preceding discussion, the unknown coefficient V(t,·) is chosen to be

α S^β, where the "volatility" α is itself a stochastic process. Choosing the simplest

reasonable process for α now yields the "stochastic-αβρ model," which has become

known as the SABR model. In this model, the forward price and volatility are

dF̂ = α̂ F̂^β dW_1,    F̂(0) = f

dα̂ = ν α̂ dW_2,    α̂(0) = α     (3.60)

under the forward measure, the two processes are correlated

dW_1 dW_2 = ρ dt     (3.61)

The SABR model has the virtue of being the simplest stochastic volatility model

which is homogenous in F and α. We find that the SABR model can be used to

accurately fit the implied volatility curves observed in the marketplace for any single

exercise date tex. More importantly, it predicts the correct dynamics of the implied


volatility curves. This makes the SABR model an effective means to manage the smile

risk in markets where each asset only has a single exercise date; these markets include

the swaption and caplet/floorlet markets.

The SABR model may or may not fit the observed volatility surface of an asset

which has European options at several different exercise dates; such markets include

foreign exchange options and most equity options. Fitting volatility surfaces requires

the dynamic SABR model or other stochastic volatility models.

In the SABR model there are two special cases. If we analyse (3.60) in a little more depth,

we notice that for β = 1 we obtain our previous stochastic lognormal Black model,

and for β = 0 we obtain the stochastic normal Black model. For a deeper

understanding, refer to the SABR study in the Caplet Stripping Section 17.
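As a sketch (not from the original text), the SABR dynamics (3.60)–(3.61) can be simulated by Monte Carlo under the forward measure; the Euler step for F̂, the exact lognormal step for α̂, the positivity floor, and all parameter values below are illustrative assumptions:

```python
import math
import random

def sabr_paths(f0, alpha0, beta, nu, rho, t, n_steps, n_paths, seed=0):
    """Monte Carlo sketch of dF = alpha*F**beta*dW1, d(alpha) = nu*alpha*dW2,
    with dW1 dW2 = rho dt.  Returns terminal forwards for all paths."""
    rng = random.Random(seed)
    dt = t / n_steps
    out = []
    for _ in range(n_paths):
        f, a = f0, alpha0
        for _ in range(n_steps):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            dw1 = math.sqrt(dt) * z1
            dw2 = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
            f = max(f + a * f ** beta * dw1, 1e-12)       # Euler step, floored at ~0
            a *= math.exp(nu * dw2 - 0.5 * nu ** 2 * dt)  # exact lognormal step
        out.append(f)
    return out
```

Since the forward is driftless under the forward measure, the sample mean of the terminal forwards should stay close to the initial forward f0.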

Finally, it is worthwhile noting that the SABR model predicts that whenever the

forward price f changes, the implied volatility curve shifts in the same direction and

by the same amount as the price f. This predicted dynamics of the smile matches

market experience, and so is a great advance over the Dupire model which was

inconsistent at this point.


4. Interest Rate Models

As seen, Black can be used for simple vanilla options such as Caps, European

Swaptions,... The Black-Scholes model is commonly used for the pricing of equity

assets where the model takes on a deterministic rate r. The model replicates the

evolution of its underlying asset through the use of a drift and a diffusion parameter.

However a difficulty arises when interest rate models are constructed using this

technique as the interest rate curve r is non deterministic and so can lead to arbitrage

solutions.

If we were to apply the Black Scholes formulation to forward interest rates, we

would obtain a lognormal model with a constant volatility σ. Under the risk neutral

forward probability, we would have

dF(t,T,U) = 0 dt + σ F(t,T,U) dW_t^P

which is extremely simple. Applying Girsanov's Theorem and inverting the

forward rates in a local volatility approach, we would need a different underlying

each time, because each forward rate has a different Black volatility. This is a

characteristic that Black's model does not support: Black requires a unique volatility.

A solution to the problem that we haven't discussed could be to consider the

different forward rates as a basket of different products in equity, each being

lognormal with its associated volatility. Thus we would obtain a matrix of

correlations between them.

This approach gives way to arbitrage opportunities, creating a forward curve that

always increases with time. According to market data, this does not always

occur; hence the need for specific interest rate models.

Hence, in previous approaches, interest rate models avoid assigning a short rate

by specifying it at every time and state. Although this is a good and practical method,

an alternative is to specify the short rate as a process defined by an Ito equation. This

allows us to work in continuous time.


In this approach we specify that the instantaneous short rate r satisfies an

equation of the type

dr = µ(r, t) dt + σ(r, t) dW     (4.1)

where W(t) is a standardised Wiener process in the risk-neutral world. Given an

initial condition r(0), the equation defines a stochastic process r(t).

Many such models have been proposed as being good approximations to actual

interest rate processes. We list a few of the best known short rate models:

4.1 Rendleman and Bartter model

dr = m r dt + σ r dW     (4.2)

This model copies the standard geometric Brownian motion model used for stock

dynamics. It leads to lognormal distributions of future short rates. It is now, however,

rarely advocated as a realistic model of the short rate process.

4.2 Ho-Lee model

dr = θ(t) dt + σ dW     (4.3)

This is the continuous-time limit of the Ho-Lee model. The function θ(t) is

chosen so that the resulting forward rate curve matches the current term structure. A

potential difficulty with the model is that r(t) may be negative for some t.


4.3 Black Derman Toy model

dLn r = θ(t) dt + σ dW     (4.4)

This is virtually identical to the Ho-Lee model, except that the underlying

variable is Ln r rather than r. Using Ito’s Lemma, it can be transformed to the

equivalent form

dr = (θ(t) + ½σ²) r dt + σ r dW     (4.5)

4.4 Vasicek Model

dr = a(b − r) dt + σ dW     (4.6)

The model has the feature of mean reversion in that it tends to be pulled to the

value b. Again, it is possible for r(t) to be negative, but this is less likely than in other

models because of the mean reversion effect. Indeed, if there were no stochastic term

(that is, if σ = 0), then r would decrease if it were above b and it would increase if it

were below b. This feature of mean reversion is considered to be quite important by

many researchers and practitioners since it is felt that interest rates have a natural

‘home’ of about 5% and that if rates differ widely from this home value there is a

strong tendency to move back to it.

4.5 Cox Ingersoll and Ross model

dr = a(b − r) dt + σ √r dW     (4.7)

In this model not only does the drift have a mean reversion, but the stochastic

term is multiplied by √r, implying that the variance of the process increases as the

rate r itself increases.


4.6 Black Karasinski model

dLn r = (θ − a Ln r) dt + σ dW     (4.8)

This is the Black Derman Toy model with mean reversion

4.7 Hull White Model

dr_t = (θ_t − a r_t) dt + σ_t dW_t^P     (4.9)

where θ, σ and a are deterministic.

r is a normal (Gaussian) variable with a mean reversion term. This mean reversion

within the drift parameter allows for a more static evolution of the interest rates, a

property that is historically consistent.

Notice that the above equation can be solved analytically.

However, there are two main problems connected to this model:

• The first is the fact that the model is normal by definition, thus always yields a

positive probability for values of r < 0. This is not such a great concern if the

product we try to model depends on high interest rates.

• Secondly, the model gives a correlation of practically 1 between the different long

and short term interest rates. This means that any movement we try to reproduce

in our interest rate curve will result in an equal translation across the entire curve.

Thus the flexibility required to bend the curve at different points, as occurs in

reality, is unavailable here. A possible solution would be to take a short term

interest rate into consideration in our model, and at the same time a long term

interest rate. This however requires taking two Brownian motions that are not

correlated.


4.7.1 BK

To solve the normality of the problem, a lognormal Hull White model can be

proposed, which follows a diffusion equation of the form:

dLn r_t = (θ_t − a Ln r_t) dt + σ_t dW_t^P     (4.10)

and where θ, σ and a are deterministic. The above is known as the BK model. This

equation however has no analytical solution, meaning it must be solved through

mathematical approximations such as tree diagrams or PDE solvers (Monte Carlo is

not applicable). This results in more difficult and time consuming calibrations.

4.7.2 BDT

is a solution that greatly simplifies the calculations. In it, we take the previous

equation for BK, (4.10), imposing a(t) = ±σ'(t)/σ(t):

dLn r_t = (θ_t ± (σ'(t)/σ(t)) Ln r_t) dt + σ_t dW_t^P     (4.11)

However, the previously stated problem that was characterised by uniform

translations in the interest rate curve is still recurrent with this form, i.e. there is still a

unique interest rate that must be selected to define an entire curve.

A summary of the models following similar approaches is given in Table 4.1.

Table 4.1 Normal or lognormal models with a mean reversion


4.8 Conclusions

All of these models are referred to as ‘single factor models’ because they each

depend on a single Wiener process W. There are other models that are ‘multifactor’

which depend on two or more underlying Wiener processes.

All the single factor models lead to the same drawback: the fact that they use only

one explanatory variable (the short interest rate r_t) to construct a model for the entire

market. The main advantage of these methods lies in the possibility of specifying r_t as

a solution to a Stochastic Differential Equation (SDE). This allows us, through Markov

theory, to work with the associated Partial Differential Equation (PDE) and to

subsequently derive a rather simple formula for bond prices. The disadvantage: the

need for a complicated diffusion model to realistically reconstruct the volatility

observed in forward rates. In addition, the use of a single r_t proves insufficient to

model the market curve, which appears to be dependent on all the rates and their

different time intervals.

The most straightforward solution should include the use of more explanatory

variables: long and medium term rates. That is, we could perhaps consider a model in

which we would use one representative short term rate, a middle term rate, and

finally a long term interest rate. The Heath Jarrow Morton framework arises as the

most complete application of the suggested approach. It chooses to include the entire

forward rate curve as a theoretically infinite dimensional state variable. This model

will be described in Chapter 7.


5. Interest Rate Products

Interest rate derivatives are instruments whose payoffs are dependent in some

way on the level of interest rates. In 1973, the market for calls on stocks began. By

the 1980s, public debt exploded, causing a huge increase in the number of swaps being

traded. By the end of the 1980s there was already a market for options on

swaps, and a whole range of new products (swaptions, FRAs, caps, caplets, ...)

was developed to meet the particular needs of end users. These began trading based

particularly on the LIBOR and EURIBOR rates: the London Interbank Offered Rate and

its European equivalent.

Interest rate derivatives are more difficult to value than equity and foreign

exchange derivatives for four main reasons:

1. The behaviour of an individual interest rate is more complicated than that of a

stock price or exchange rate.

2. The product valuation generally requires the development of a model to

describe the behaviour of the entire zero-coupon yield curve

3. The volatilities of different points on the yield curve are different

4. Interest rates are used for discounting as well as for defining the payoff

obtained from the derivative.

5.1 Discount Factors

The money account, bank account or money-market account process is the

process that describes the values of a (local) riskless investment, where the profit is

continuously compounded at the risk free rate existent in the market at every

moment.


Let B(t) be the value of a bank account at time t ≥ 0. Let us assume that the bank

account evolves according to the following differential equation:

dB*(t) = r_t B*(t) dt     (5.1)

B*(t, t) = 1 because we assume that today we invest one unit of currency, u.c.

This value is not yet stochastic because today we know the exact value of the bank

account.

Thus, integrating, we obtain B*(t, T) = e^{∫_t^T r_s ds}, where r_t is a positive (usually

stochastic) function of time. The above gives us the return at a future time T obtained

from the bank when we invest 1 unit of currency (u.c.) on day t.

Fig. 5.1. Future value of money

As the opposite, backward operation is more common, we define

B*(t, T) = e^{∫_t^T r_s ds} = 1/B(t, T)     (5.2)

The inverse operation is the discount of money from a future date T to the present t,

at the bank's riskless rate. This is defined therefore as B(t, T) = e^{−∫_t^T r_s ds} and is known

as the stochastic discount factor. It is used to bring prices from the future to the present.

Fig. 5.2. Discount factor
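As an illustration (not from the original text), with a constant short rate the growth factor and the discount factor of (5.2) are exact reciprocals; the function names and the constant-rate assumption are purely illustrative:

```python
import math

def bank_account(r, t, T):
    """Growth factor B*(t,T) = exp(integral of r) for a constant short rate r."""
    return math.exp(r * (T - t))

def discount_factor(r, t, T):
    """Discount factor B(t,T) = exp(-integral of r) = 1 / B*(t,T),
    specialised to a constant rate."""
    return math.exp(-r * (T - t))
```

Compounding forward and then discounting back over the same period returns exactly one unit of currency.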


5.2 Zero-coupon bond

The zero coupon bond at time t with maturity T > t, denoted by B(t,T) is the value

of a contract that starts at time t and guarantees the payment of 1 u.c. at maturity T.

Its price is the future unit of currency, discounted to present. As we do not know

what the future discount rate will be, we can only calculate the bond's expected price.

B(t, T) = E_t^P[ e^{−∫_t^T r_s ds} ]     (5.3)

This is simply the extension of the martingale property

S_t = E_t^P[ S_T e^{−∫_t^T r_s ds} ]

where, according to our definition of a zero coupon bond, we set S_T = 1 u.c. Remember that

all martingales have the above property; a martingale is any tradable asset that, under the

risk neutral probability, follows a diffusion of the form

dS_t = r_t S_t dt + σ(t) S_t dW_t^P     (5.4)

where W_t^P is a Brownian motion, r_t the instantaneous risk-free rate, and S_0 a

constant.

5.3 Interest Rate Compounding

Refers to the profit that one makes by loaning money to a bank at a given time t,

and receiving it back at time T increased by a factor r_t. The factor r_t is the interest rate,

and represents how much you earn to compensate for the fact that you have been

unable to access or invest your money during the period [t, T].

We proceed to describe the different forms of interest rates through a simple

numerical example.

Imagine that we start off with a sum of S = 100$, and decide to invest it at an

annual interest rate of r = 10% = 0.1


5.3.1 Annual Compounding:

100 · 1.1 = 110$, thus mathematically S(1 + r_t)

5.3.2 Semi annual compounding:

r = 5% every 6 months. The difference with respect to the previous case is that you can

reinvest the profits every 6 months.

6M: 100 · 1.05 = 105$

12M: 105 · 1.05 = 110.25$, equivalent to 100 · 1.05² annually.

2Y: 100 · 1.05⁴ = 121.55$

S(1 + r_t/m)^m  →  S(1 + r_t/m)^{m·n} over n years     (5.5)

5.3.3 Continuous Compounding:

is the limit when the rate is compounded at every instant of time:

lim_{m→∞} S(1 + r_t/m)^{m·n} = S e^{r_t·n}     (5.6)
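The compounding conventions (5.5)–(5.6) can be checked numerically with this short sketch (helper names are illustrative, not from the original text):

```python
import math

def compound(s, r, m, years):
    """Value of s after `years` at annual rate r compounded m times per year."""
    return s * (1 + r / m) ** (m * years)

def compound_continuous(s, r, years):
    """Continuous-compounding limit of the formula above."""
    return s * math.exp(r * years)
```

For S = 100$ and r = 10%, annual compounding gives 110$ after one year, semi-annual gives 121.55$ after two years, and continuous compounding always dominates any finite m.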

5.4 Present Value PV

Is obtained by bringing all future cash flows to present using predicted or

constant interest rates:

PV = Σ S_t (1 + r/m)^{−m·n}, or with continuous compounding, PV = Σ S_t e^{−r·n}     (5.7)


5.5 Internal Rate of Return IRR

The IRR is the rate that makes the PV = 0:

Σ S_t (1 + IRR/m)^{−m·n} = 0, or with continuous compounding, Σ S_t e^{−IRR·n} = 0     (5.8)

The higher the IRR, the more desirable the investment will be (provided that it is

larger than the bank’s interest rate r).

The IRR is the main method to evaluate an investment together with its NPV.
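For cash flows at arbitrary times there is no closed form for the IRR, so it must be found numerically. A bisection sketch over the continuously compounded version of (5.8) follows (an illustration, not from the original text; the bracketing interval is an assumption):

```python
import math

def irr(cashflows, times, lo=-0.99, hi=10.0, tol=1e-10):
    """Solve sum_i cf_i * exp(-IRR * t_i) = 0 for IRR by bisection.
    Assumes the PV changes sign on [lo, hi]."""
    def pv(rate):
        return sum(cf * math.exp(-rate * t) for cf, t in zip(cashflows, times))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # keep the half-interval on which the PV changes sign
        if pv(lo) * pv(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Investing 100$ today and receiving 110$ in one year gives a continuously compounded IRR of ln(1.1) ≈ 9.53%.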

5.6 Bond Yield (to Maturity) YTM

Is the internal rate of return IRR at the current price. It is always quoted on an

annual basis. The bond pays its face value F at maturity, with C/m coupon payments at

each time period in between. Thus its PV is the face value brought to present, summed

with each coupon also brought to present.

PV = F/(1 + IRR/m)^n + Σ_{k=1}^{n} (C/m)/(1 + IRR/m)^k     (5.9)

If there are n periods of coupons left, then

PV = F/(1 + IRR/m)^n + (C/m)/(1 + IRR/m) + ... + (C/m)/(1 + IRR/m)^n     (5.10)

There exists an inverse relationship between interest rates and bond prices. If the

IRR increases, then the PV of the bond price decreases.
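The pricing formula (5.10) and the inverse price/yield relationship can be sketched as follows (the helper name is illustrative, not from the original text):

```python
def bond_pv(face, annual_coupon, m, n_periods, irr):
    """Present value of a bond paying coupons C/m for n periods plus the face
    value at maturity, discounted at a rate `irr` compounded m times a year."""
    per = 1 + irr / m
    pv = face / per ** n_periods                       # face value brought to present
    pv += sum((annual_coupon / m) / per ** k           # each coupon brought to present
              for k in range(1, n_periods + 1))
    return pv
```

When the coupon rate equals the IRR the bond prices at par, and raising the IRR lowers the PV, as stated above.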


5.7 Coupon Rate

It is possible to calculate the most representative points on the present value

curve for any given bond. Imagine that F = 100$, C = 10$ :

· if IRR = 0 then PV = F + Σ_{k=1}^{n} C/m

· if Coupon = YTM, the price is constant at F

· if IRR → ∞ then PV → 0

Fig. 5.3. Bond curve dynamics

5.7.1 Time to Maturity

For each curve we can calculate three characteristic points, as before, analysing

how they are influenced by the maturity

· if IRR = 0 then PV = F + Σ_{k=1}^{n} C/m ; thus if you increase the maturity T, more

coupons are paid and the present value is greater


· if YTM = Coupon then PV = F ; thus all curves pass through the same point

· if IRR → ∞ then PV → 0

Fig. 5.4. Bond curve for different maturities

As maturity increases, the curve becomes steeper, becoming also more sensitive

to variations in interest rates.

5.7.2 Using Bonds to Determine Zero Rates:

Paying no Coupons: using the interest rate compounding formula

Principal = 100$; Time to Maturity: 3M; Price: 97.5$

You earn (100 − 97.5)/97.5 = 2.56% every 3M.

In 1Y you would earn 2.56% · 4 = 10.24% per year (note that you do not reinvest, so

cannot do 1.0256⁴).

An easier way to see this is by solving 97.5 = 100 e^{−r_{3M}·1/4}, giving r_{3M} = 10.127%


Paying Coupons: once the short-term rates have been calculated, we can use the

short-term bonds. Having calculated the zero interest rate curve paying no coupons,

we have

3M: 10.127%; 6M: 10.496%; 1Y: 10.536%

• Let us imagine our bond is defined by having

Principal = 100; Time to Maturity: 1.5Y; Price: 96$

Annual coupons paid every 6M = 8$ (thus every 6M we receive 4$)

The price will be all the cash flows discounted to present:

price = C e^{−r_{6M}·0.5} + C e^{−r_{1Y}·1} + (principal + C) e^{−r_{1.5Y}·1.5}

96 = 4 e^{−0.10496·0.5} + 4 e^{−0.10536·1} + 104 e^{−r_{1.5Y}·1.5}  →  r_{1.5Y} = 10.681%     (5.11)
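The two bootstrapping steps of this section can be sketched as follows (an illustration, not from the original text; function names are assumptions, and the market rates are those quoted in the example above):

```python
import math

def zero_rate_from_discount_bond(price, face, maturity):
    """Continuously compounded zero rate implied by a zero-coupon bond,
    e.g. 97.5 = 100 * exp(-r * 0.25)."""
    return -math.log(price / face) / maturity

def bootstrap_next_rate(price, coupon_cf, coupon_times, final_cf, final_t, known_rates):
    """Discount the earlier coupons with the already-known zero rates,
    then invert the last exponential for the new zero rate."""
    pv_coupons = sum(cf * math.exp(-r * t)
                     for cf, t, r in zip(coupon_cf, coupon_times, known_rates))
    return -math.log((price - pv_coupons) / final_cf) / final_t

r3m = zero_rate_from_discount_bond(97.5, 100.0, 0.25)
r18m = bootstrap_next_rate(96.0, [4.0, 4.0], [0.5, 1.0], 104.0, 1.5,
                           [0.10496, 0.10536])
```

These reproduce the values worked out by hand above: r3m close to 10.127% and r18m close to 10.681%.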

5.8 Interest Rates

Most of the interest rates we will deal with are yearly-based rates. In general, the

times that appear are dates whose length we specify as (T, U). That is, we convert the

interval into a yearly basis as

m = m(T, U) = (U − T) / 1 year

The day-count fraction assumed for the product can vary depending on the

reference used. We state here the most common: act/act, act/365, act/360, etc.

Further, each currency has a particular calendar to be considered when

calculating the difference U − T. The convention usually taken is that of accounting

for national holidays, bank holidays, etc.

The simple forward rate contracted at t for the period [T; U] is


L(t, T, U) = (1/m(T, U)) · (B(t, T)/B(t, U) − 1)     (5.12)

This can easily be obtained following the previous discussions we have

considered. That is:

Fig. 5.5. Relating discount rates

Therefore e^{∫_t^U r_s ds} = e^{∫_t^T r_s ds} · e^{∫_T^U L_s ds}. L is clearly the forward rate,

and therefore the rate between two future time periods, whereas r is the known rate

between today t and a future time period.

If we do not use continuous compounding but instead use annual compounding

(without reinvestment), the former then becomes

e^{∫_t^U r_s ds} = e^{∫_t^T r_s ds} · (1 + L·m)     (5.13)

Not to be mistaken with e^{∫_t^U r_s ds} = e^{∫_t^T r_s ds} · (1 + L/m)^m, which would

involve the reinvestment of the benefits obtained. Recall that the above can be rewritten as

e^{−∫_t^U r_s ds} = e^{−∫_t^T r_s ds} · (1 + L·m)^{−1}

which in terms of the discount factors is B(t, U) = B(t, T) · (1 + L·m)^{−1}. Solving for L

we obtain the simple forward rate:


L(t, T, U) = (1/m(T, U)) · (B(t, T)/B(t, U) − 1)     (5.14)
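Formula (5.14) is a one-liner in code. As an illustrative sanity check (not from the original text), with a flat continuously compounded curve the simple forward over [T, U] must equal (e^{r·m} − 1)/m:

```python
def simple_forward_rate(b_t_T, b_t_U, m):
    """Simple forward rate L(t,T,U) = (1/m) * (B(t,T)/B(t,U) - 1)
    from two zero-coupon bond (discount factor) prices."""
    return (b_t_T / b_t_U - 1.0) / m
```

For instance, with r = 5% flat, T = 1Y and U = 1.5Y, the 6M forward is (e^{0.025} − 1)/0.5 ≈ 5.06%.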

The EURIBOR (Europe Inter-Bank Offer Rate, for the European zone and fixed in

Frankfurt) and the LIBOR (London Inter-Bank Offer Rate, fixed in London) are simple

forward rates. They are an average of the rates at which certain banks (risk rated

AAA) are willing to borrow money with specified maturities (usually 3M, 6M, 1Y).

The rates L(t,T,U) are unknown (modelled as stochastic) as they fluctuate up until the

true fixing date T*. The Libor rate is usually fixed sometime before the payment

period starts. In the Euro zone, the fixing T* is made two business days (2bd) before T,

and in London, it is made the same day, thus T = T*.

5.8.1 Overnight market rates:

Are forward rates fixed each day for a one day period starting that day, meaning

they have the form O(t; T; U) with t = T, U = t + 1bd. Therefore O(t; t; t + 1bd)

5.8.2 The tomorrow-next market rates:

Are forward rates for a one day period, i.e they are fixed on one day and applied

on the following. They are of the form TN(t; T; U) with T = t + 1bd, U = t + 2bd.

Therefore TN(t, t+1bd; t+2bd)

We will now proceed to demonstrate that the market interest rate curve can only

exist if there exists a discount factor curve:

Observe that from the expressions, in the first equation B(t,t) = 1.

O(t, T, U) = (1/m(t, t+1bd)) · (1/B(t, t+1bd) − 1)

TN(t, T, U) = (1/m(t+1bd, t+2bd)) · (B(t, t+1bd)/B(t, t+2bd) − 1)     (5.15)

For each of the two equations independently, we can recover at time t the values

of the zero-coupon curve B(t; T) for T = t + 1bd, and T=t+2bd given the market rates


for O(t; T; U) and TN(t; T; U). For longer T's, we use Libor rates to obtain the values of

B(t,T), since their usual fixing date is simpler, of the form t = T. Thus, from the

different Libor rates we obtain B(t,U) with U = T + 3M, T + 6M, T + 1Y, typically. For

longer values of U we need to use swap forward rates.

5.9 Forward Rates

Forward interest rates are those implied by current (continuously compounding)

zero coupon rates.

Year    Zero Rate per Year (%)    Forward Rate (%)

1       10.0                      –
2       10.5                      11.0
3       10.8                      11.4
4       11.0                      11.6

Table 5.1 Construction of forward rates from bonds

S_0 e^{r_{T2}·T2} = S_0 e^{r_{T1}·T1} · e^{r_f·(T2 − T1)}, and solving,

r_f = (r_{T2}·T2 − r_{T1}·T1) / (T2 − T1)

We can therefore calculate the forward rate for the 2Y:

100 e^{0.105·2} = 100 e^{0.10·1} · e^{r_f·1}  →  r_f = 11%

And analogously, we can calculate the forward rate for 4Y as:

100 e^{0.11·4} = 100 e^{0.10·1} · e^{0.11·1} · e^{0.114·1} · e^{r_f·1}  →  r_f = 11.6%

Or 100 e^{0.11·4} = 100 e^{0.108·3} · e^{r_f·1}  →  r_f = 11.6%


5.10 Instantaneous forward rate

The instantaneous forward rate is the risk free rate of return which we may have

on an investment over the infinitesimal interval [T, T+dT] if the contract is made at t.

Thus it is natural to view f(t, T) as an estimate of the future short rate r(T).

Recall, assuming m(T, U) = U − T, that we had

L(t, T, U) = (1/m(T, U)) · (B(t, T)/B(t, U) − 1) = −(1/B(t, U)) · (B(t, U) − B(t, T))/(U − T)     (5.16)

If we assume U and T to be very close, then

lim_{U→T} L(t, T, U) = −lim_{U→T} (1/B(t, U)) · (B(t, U) − B(t, T))/(U − T)

= −(1/B(t, T)) · ∂B(t, T)/∂T = −∂ log B(t, T)/∂T     (5.17)

because ∂ log B(t, T)/∂T = (1/B(t, T)) · ∂B(t, T)/∂T.

We therefore define the instantaneous forward rate with maturity T as

F(t, T) = −∂ log B(t, T)/∂T     (5.18)

We have

B(t, U)/B(t, T) = e^{−∫_T^U F(t,u) du}     (5.19)

If instead of T we were to take t, where B(t, t) = 1, we would have

B(t, U) = e^{−∫_t^U F(t,u) du}     (5.20)

But remember now from (5.3) that we also had the expression for the zero

coupon bond

B(t, U) = E^P[ e^{−∫_t^U r_s ds} ]     (5.21)


Therefore, by making U tend to t in (5.21), the expectation becomes known at t.

Having done this, we can now directly equate the expressions (5.20) and (5.21),

obtaining

r_t = F(t, t)     (5.22)
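Definition (5.18) can be approximated numerically by a central difference on log B(t,T) built from a zero curve (an illustrative sketch, not from the original text; the user-supplied curve function and the step size are assumptions):

```python
def inst_forward(zero_rate_fn, t, T, h=1e-5):
    """Finite-difference sketch of F(t,T) = -d log B(t,T) / dT, with
    B(t,T) = exp(-R(T)*(T - t)) built from a continuously compounded
    zero curve R(.) supplied by the caller."""
    def log_b(U):
        return -zero_rate_fn(U) * (U - t)
    # central difference of the log-bond price in the maturity direction
    return -(log_b(T + h) - log_b(T - h)) / (2 * h)
```

For a flat curve the instantaneous forward equals the zero rate, and for a linearly increasing curve R(U) = a + bU the forward is a + 2bU, consistent with r_t = F(t, t) at U = t.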


6. More Complex Derivative Products

The LIBOR is the London InterBank Offered Rate. We will let L(t,T,U) denote

the forward LIBOR, seen from time t. That is, the Libor rate that will exist over a future

period ranging from T to U. Thus the spot LIBOR rate is given as L(T,T,U). This rate

fixes at time T, such that 1$ invested at this rate pays 1 + mL(T,T,U) at maturity U. The

maturity U is generally expressed in terms of fractions of years, such that a 3 month

LIBOR will have an m = 0.25.

6.1 Calls and Puts

6.1.1 Call

A European call option (respectively a put option) with strike or exercise price K

and maturity or exercise date T on the underlying asset St, is a contract that gives to its

holder the right, but not the obligation, to buy (respectively to sell) one share of the

asset at the fixed time T. The underwriter of the option has the obligation, if the holder

decides to exercise the option, to sell (buy) the share of the asset. The option can be

exercised exclusively at time T.

The buyer of a call option wants the price of the underlying instrument to rise in

the future so that he can ‘call it’ from the vendor at the cheaper pre-established price

and then go on to sell it in the market at a profit. The seller either expects that the

underlying's price will not rise, or is willing to give up some of the upside (profit) from a

price rise in return for (a) the premium (paid immediately) plus (b) retaining the

opportunity to make a gain up to the strike price.

A European call option allows the holder to exercise the option (i.e., to buy) only

on the option expiration date. An American call option allows exercise at any time

during the life of the option.

The price is known as the exercise price or strike price. The final date is the

expiration or maturity.


Fig. 6.1. Investor’s profit on buying a European call option: Option price= 5$;

Strike K= 60$

In the above, consider an investor who buys a European call option with strike

60$ to buy a given stock whose current price is 58$. The price to purchase this option

is 5$. The option exercises in 5 months.

If in 5 months the stock price is above 60$, say 75$, the option will be exercised

and the investor will buy the stock at 60$. He can immediately sell the share in the

market for 75$, thus gaining 15$ (ignoring the initial 5$ cost of the call). If the price of the

stock falls below 60$, the investor will clearly not exercise the call option to buy at 60$.

The call’s payoff is therefore

Payoff = max(S_T − K; 0)     (6.1)

The above figure shows the investor's net profit or loss on a call option. It is

important to notice that the investor can lose money even if he does not exercise the

purchase, because he has paid the initial price or premium of 5$.

Notice the subtlety that even if the stock price rises to 62$, which is above the

strike price of 60$, the investor still incurs a loss. This is because he gains 2$

from selling the stock, but initially paid 5$ to enter the transaction.

Nevertheless, although in the graph the investor is below the horizontal axis and

therefore in a situation of loss, he should still exercise the option. In this way his loss is

only 3$. If he does not exercise, his loss is the entire 5$ that he spent to enter the

contract.

The above calculation (6.1) still holds valid as the call's terminal value. Although

we are not taking the initial price into consideration, we have seen that irrespective of

the initial price, we will always exercise if the final price is greater than the strike.

Fig. 6.2. Vendor's profit on selling a European call option: Option price = 5$;

Strike = 60$

From the vendor’s point of view, he will only gain a profit (the price at which the

call was sold) if the option is not exercised. We see his profits in the above graph. As

soon as the stock’s price rises above 60$, he is losing money because he is selling at

60$ an asset that is worth more than 60$ in the market. His payoff is the exact opposite

of the investor’s:

Payoff = −max(S_T − K; 0) = min(K − S_T; 0)     (6.2)

Note: an option is said to be “at the money” if the strike price of the option equals

the market price of the underlying security.


6.1.2 Put

A call option gives the holder the right to buy the underlying asset. A put option

gives the holder the right to sell the underlying asset by a certain date at a certain

price. The holder of a put option therefore hopes that the asset's price will decrease, so

that he will be able to 'put it' to the counterparty at the higher pre-established price.

Imagine an investor who buys a European put option with a strike price of 90$.

Imagine the current stock price is 85$, and that the option price is 7$.

Fig. 6.3. Investor’s profit on buying a European put option: Option price= 7$; Strike =

90$

If the share price drops to 75$, the buyer earns 15$ (ignoring the initial cost)

because he can exercise the option to sell at 90$. If the price instead rises above 90$,

the investor will clearly not exercise the option to sell at 90$, and will instead go

directly to the market where he can sell the stock at its true, higher price. The put’s

payoff is

Payoff = max(K − S_T; 0)     (6.3)

Conversely, the person who writes the put can at most win the premium paid when

the investor entered the contract, which happens if the stock's price goes up. If it

declines, he will be forced to buy the stock at a price above its market quote, so he

loses money. His payoff is

Payoff = −max(K − S_T; 0) = min(S_T − K; 0)     (6.4)


Fig. 6.4. Profit from writing a European put option: Option price= 7$; Strike = 90$

6.1.3 Present value of an option

We already derived in the Black Scholes section (Chapter 3.1.2) a formula to

calculate the present value of a call which at a future date gives the final payoffs that

we have seen above. We simply recall the results that were derived.

Call = S_0\,e^{-qT}\,N(d_1) - K\,e^{-rT}\,N(d_2)    (6.5)

Put = K\,e^{-rT}\,N(-d_2) - S_0\,e^{-qT}\,N(-d_1)    (6.6)

\ln\frac{S_0\,e^{-qT}}{K} = \ln\frac{S_0}{K} - qT    (6.7)

d_1 = \frac{\ln\frac{S_0}{K} + \left(r - q + \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}}    (6.8)

d_2 = \frac{\ln\frac{S_0}{K} + \left(r - q - \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}    (6.9)
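A minimal Python sketch of the pricing formulas (6.5)–(6.9), using math.erf for the normal CDF; the numerical inputs are purely illustrative.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes(s0, k, r, q, sigma, t):
    """European call and put with dividend yield q, eqs (6.5)-(6.9)."""
    d1 = (log(s0 / k) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    call = s0 * exp(-q * t) * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)
    put = k * exp(-r * t) * norm_cdf(-d2) - s0 * exp(-q * t) * norm_cdf(-d1)
    return call, put

call, put = black_scholes(s0=100.0, k=100.0, r=0.05, q=0.0, sigma=0.2, t=1.0)
# Put-call parity: call - put = S0 e^{-qT} - K e^{-rT}
print(call, put, call - put - (100.0 - 100.0 * exp(-0.05)))
```

The last printed number is the put-call parity residual, which should be numerically zero.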


6.2 Forward

A forward contract on a commodity is an agreement between two parties to purchase or sell a specific amount of an underlying commodity at a defined price, at a determined future date. It is therefore a security that settles a future price, irrespective of any possible future variations in that asset’s price. In a forward contract, all the payments take place at time T. The buyer is said to be long, whereas the seller is said to be short.

This is best understood with a graphical representation:

Fig. 6.5. Forward contract: future expected value versus real future value

St is the asset’s current price. F(t, ST) is today’s expected future price of the asset at

time T, and so is the price K at which the vendor sells his asset today to be delivered

in the future.

ST is the real future price of the asset, and is only known at time T.

In the above example, the vendor sold above the real future price so earns the

difference (K–ST). Obviously the buyer only enters such a contract if he thinks the

asset is going to be more expensive in the future, so he secures a fixed future price K.

6.2.1 Mathematically

A forward contract on an asset ST made at t has the following cash flows:

• The holder of the contract pays at time T a deterministic amount K that was

settled before at t.


• He receives a stochastic (floating) amount F(T,ST) = ST which is the

future value of that underlying asset he has bought. Nothing is paid or

received at t.

The forward price K is determined at time t so that it equals the future expected

price of the asset ST.

Forward_t = e^{-r(T-t)}\,E_t^P\big[S_T - K\big] = 0

K = S_t\,e^{r(T-t)}    (6.10)
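Equation (6.10) fixes the delivery price. The value identity Forward_t = S_t − K e^{−r(T−t)} used below follows from it by taking the discounted risk-neutral expectation, so this snippet is only a sanity check with illustrative numbers.

```python
from math import exp

def forward_price(s_t, r, tau):
    """Fair delivery price from eq (6.10): K = S_t e^{r (T-t)}."""
    return s_t * exp(r * tau)

def forward_value(s_t, k, r, tau):
    """Value at t of a long forward struck at K: S_t - K e^{-r (T-t)}."""
    return s_t - k * exp(-r * tau)

k = forward_price(100.0, 0.04, 0.5)
print(k)                                   # 100 e^{0.02}
print(forward_value(100.0, k, 0.04, 0.5))  # ~0: the contract is fair at inception
```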

6.2.2 Call option on a forward

A call option over a forward contract would be the possibility of entering the

forward agreement at a time 0 < t < maturity. One would enter only if he were to

receive a floating amount that were greater than the fixed amount K paid.

CallForward_t = e^{-r(T-t)}\,\tilde{E}^P\big[(S_T - K)^+ \,\big|\, F_t\big]    (6.11)

6.3 Future

Futures contracts are quoted in organised markets, giving them great liquidity and low transaction costs.

A futures contract on an asset ST made at t with delivery date T is very much like

a forward contract, with some differences:


Fig. 6.6. Future contract: future expected value versus real future value

• St is the asset’s current price.

• F(t, ST) is today’s expected future price of the asset, and so is the price K at which

the vendor sells his asset to be delivered in the future.

• ST is the real future price of the asset, and is only known at time T.

• F(ti, ST) is the evolution of that expected future value as we advance through time.

The expected future value right at the end must equal the price of the underlying

at that point, thus F(T, T, ST) = ST.

The buyer, instead of paying the difference at the end T, now pays it continually

at every ti as the difference in F with respect to the previous date. If ever this

difference increases (moves upwards in Fig. 6.6), he receives this amount from the

vendor instead of having to pay for it.

In the previous example, the seller sold above the real price, so earns the

difference. In practice, the exchange of the asset at T for its real price at T needn’t be

done. Instead only the cash is exchanged.

6.3.1 Mathematically

Mathematically we can give the following explanation:

In forwards, the price of K at time t was the expected future cost of the asset F(t,

ST). Now with futures, this is substituted by a continuous stochastic payment process

F(t, T, ST) made continuously over the time interval (t,T] in such a way that the

contract value be continually zero. There is no obligation to interchange the claim ST

and the payment F(T,T, ST). The contract is achieved by the following agreement:

During each time interval (t_{i-1}, t_i), t ≤ t_1 ≤ t_2 ≤ … ≤ T, the holder of the contract pays F(t_{i-1}, T, S_T) − F(t_i, T, S_T). If the payment is negative, he will receive the money

from the counterparty. Therefore

F(t,T,S_T) = \tilde{E}^P\big[S_T \,\big|\, F_t\big] = S_t\,e^{r(T-t)}    (6.12)


6.3.2 Call option on a future

One can also formulate call options on futures contracts. For instance, consider a futures contract on S_{T_1}, with delivery date T_1, and a call option on F(T, T_1, S_{T_1}) with exercise price K and exercise date T < T_1. The payment at T of this option is thus (F(T, T_1, S_{T_1}) − K)^+. From this, we obtain that the price of the call option on the future is

CallFuture_t = e^{-r(T-t)}\,\tilde{E}^P\big[(F(T,T_1,S_{T_1}) - K)^+ \,\big|\, F_t\big]

= e^{-r(T-t)}\,\tilde{E}^P\big[(S_T\,e^{r(T_1-T)} - K)^+ \,\big|\, F_t\big]    (6.13)

Thus, proceeding as above in the call option case, and by using Black Scholes, we

get

CallFuture_t = e^{-r(T-t)}\big[F(t,T_1,S_{T_1})\,N(d_1) - K\,N(d_2)\big]    (6.14)

with

d_{1,2} = \frac{\log\frac{F(t,T_1,S_{T_1})}{K} \pm \frac{\sigma_{T \times T_1}^2}{2}(T-t)}{\sigma_{T \times T_1}\sqrt{T-t}}    (6.15)

6.4 FRA

A FRA, or forward rate agreement, is an agreement between two counterparties in which one party pays a fixed flow of income (the fixed leg) and in return receives a variable flow of income (the floating leg).

Let us take the case where we pay a fixed leg in 3 months at a 3% annual interest rate, fixed today.

In return we receive a 3 month EURIBOR rate fixed at maturity (in T = U = 3M)


Fig. 6.7. FRA payoffs

We will therefore be paying 0.03 · 0.25 = 0.75%

By definition this contract has a NPV = 0

6.5 FRA Forward

A FRA forward is an agreement between two counterparties in which one party pays a fixed flow of income (the fixed leg) against a variable flow of income (the floating leg) received in return. The fixed rate is settled at the present date t = 0, whereas the variable leg is set by the LIBOR value at the fixing date T, over a time period up until U. The difference between the two rates at T is exchanged at U.

Fig. 6.8. FRA future’s payoffs

Mathematically, the interchange of flows at time U is then

Nm\,(K - L(T,U))    (6.16)

where N is the nominal and L(T,U) the floating rate. The value of such a contract

discounted to time t, assuming N = 1 for simplicity, is:


FRA_t = \tilde{E}^P\Big[e^{-\int_t^U r_s\,ds}\,m\big(K - L(T,U)\big) \,\Big|\, F_t\Big]

= \tilde{E}^P\Big[B(T,U)\,e^{-\int_t^T r_s\,ds}\,m\big(K - L(T,U)\big) \,\Big|\, F_t\Big]    (6.17)

remembering that the LIBOR forward rate L(t,T,U) is defined as

L(t,T,U) = \frac{1}{m(T,U)}\left(\frac{B(t,T)}{B(t,U)} - 1\right)    (6.18)

then

FRA_t = \tilde{E}^P\Big[e^{-\int_t^T r_s\,ds}\,\big(K m\,B(T,U) - 1 + B(T,U)\big) \,\Big|\, F_t\Big]    (6.19)

And because bonds can be expressed as

B(t,T) = \tilde{E}_t^P\Big[e^{-\int_t^T r_s\,ds}\Big]    (6.20)

we can rewrite the above as

FRA_t = Km\,\tilde{E}^P\Big[e^{-\int_t^T r_s\,ds}B(T,U)\,\Big|\,F_t\Big] - \tilde{E}^P\Big[e^{-\int_t^T r_s\,ds}\,\Big|\,F_t\Big] + \tilde{E}^P\Big[e^{-\int_t^T r_s\,ds}B(T,U)\,\Big|\,F_t\Big]

= Km\,\tilde{E}^P\Big[e^{-\int_t^U r_s\,ds}\,\Big|\,F_t\Big] - B(t,T) + \tilde{E}^P\Big[e^{-\int_t^U r_s\,ds}\,\Big|\,F_t\Big]

= Km\,B(t,U) - B(t,T) + B(t,U)    (6.21)

Let us set as an example a FRA Forward contract 3M x 15M.

We are obliged to pay, at time U = 15M, the rate fixed today.

In return we receive at time U the LIBOR rate L(T,U). This quantity is variable as

we cannot know the value of the LIBOR lasting 1Y and starting in 3M until T = 3M

itself.

In reality, what is exchanged is the difference between the two rates, multiplied

by the notional over which the exchange was agreed.
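With (6.18) and (6.21), a FRA can be valued directly from two discount factors. The discount factors below are illustrative; the check confirms that a FRA struck at the forward LIBOR has zero value.

```python
def libor_forward(b_t_T, b_t_U, m):
    """Forward LIBOR L(t,T,U) from discount factors, eq (6.18)."""
    return (b_t_T / b_t_U - 1.0) / m

def fra_value(k, m, b_t_T, b_t_U):
    """Eq (6.21): FRA_t = K m B(t,U) - B(t,T) + B(t,U)."""
    return k * m * b_t_U - b_t_T + b_t_U

b_T, b_U, m = 0.99, 0.955, 1.0          # illustrative B(t,T), B(t,U), 1-year accrual
l_fwd = libor_forward(b_T, b_U, m)
print(l_fwd)                            # forward rate implied by the curve
print(fra_value(l_fwd, m, b_T, b_U))    # ~0: struck at the forward LIBOR, the FRA is fair
```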


6.6 Caplet

A caplet is the option of entering a FRA contract: that is, the option of entering the FRA at time T (at a premium), if the variable leg is greater than the fixed leg (and thus compensates for the premium paid).

Therefore, a caplet with maturity T and strike K will have the following

payoff: at time U the holder of the caplet receives:

Caplet_T = m\,\big(L(T,T,U) - K\big)^+    (6.22)

Note that the caplet expires at time T, but the payoff is received at the end of the

accrual period, i.e. at time U. The payoff is day-count adjusted. The liabilities of the

holder of this caplet are always bounded above by the strike rate K, and clearly if

interest rates increase, the value of the caplet increases, so that the holder benefits

from rising interest rates.

By the usual arguments, the price of this caplet is given by the discounted risk-

adjusted expected payoff. If P (t, T) : T ≥ t represents the observed term structure of

zero-coupon bond prices at time t, then the price of the caplet is given by

Caplet_t = m\,B(t;U)\,E_t\big[(L(T,T,U) - K)^+\big]    (6.23)

In this equation, the only random term is the future spot LIBOR, L(T,T,U). The

price of the caplet therefore depends on the distributional assumptions made on L(T,

T, U ). One of the standard models for this is the Black model. According to this

model, for each maturity T, the risk-adjusted relative changes in the forward LIBOR

L(t, T, U ) are normally distributed with a specified constant volatility σT , i.e.

\frac{dL(t,T,U)}{L(t,T,U)} = \sigma_T\,dW_t    (6.24)

This implies a lognormal distribution for L(T, T, U ), and under this modelling

assumption the price of the T – maturity caplet is given by

\varsigma(t) = m\,B(t;U)\,E^P\big[(L(T,T,U) - K)^+\big]    (6.25)

\varsigma(t) = m\,B(t;U)\,\big[L(t,T,U)\,N(d_1^T) - K\,N(d_2^T)\big]    (6.26)


d_{1,2}^T = \frac{\log\frac{L(t,T,U)}{K} \pm \frac{\sigma_T^2}{2}(T-t)}{\sigma_T\sqrt{T-t}}    (6.27)

N(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-\frac{u^2}{2}}\,du    (6.28)
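Equations (6.26)–(6.28) translate into a short Black caplet pricer; the inputs below are illustrative.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    """Standard normal CDF, eq (6.28), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_caplet(b_t_U, m, l_fwd, k, sigma, t_expiry):
    """Black caplet price, eqs (6.26)-(6.27): m B(t;U) [L N(d1) - K N(d2)]."""
    d1 = (log(l_fwd / k) + 0.5 * sigma ** 2 * t_expiry) / (sigma * sqrt(t_expiry))
    d2 = d1 - sigma * sqrt(t_expiry)
    return m * b_t_U * (l_fwd * norm_cdf(d1) - k * norm_cdf(d2))

# Illustrative inputs: a caplet expiring in 6 months on a semi-annual LIBOR
price = black_caplet(b_t_U=0.97, m=0.5, l_fwd=0.03, k=0.03, sigma=0.2, t_expiry=0.5)
print(price)
```

Lowering the strike raises the price, as expected for a call on the rate.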

Fig. 6.9. Caplet payoffs

6.6.1 Caplet as a Put Option on a Zero-Coupon Bond

A caplet is a call option on an interest rate, and since bond prices are inversely

related to interest rates, it is natural to be able to view a caplet as a put option on a

zero coupon bond. Specifically, the payoff of a caplet at time U is

Caplet_U = m\,\big(L(T,T,U) - K\big)^+    (6.29)

This payoff is received at time T + 1Y/m, where m is the number of payments in a

year. The LIBOR rate prevalent over the accrual period [T, U ] is L(T, T, U ). It follows

that, at time T, the price of the caplet is annually discounted from U to T as was shown

in (5.13) as

Caplet_T = \frac{1}{1 + m\,L(T,T,U)}\,m\,\big(L(T,T,U) - K\big)^+    (6.30)

The price of a zero-coupon bond, on the other hand, is expressed as

B(T;T,U) = \frac{1}{1 + m\,L(T,T,U)}    (6.31)

from which it follows that


Caplet_T = m\,B(T;T,U)\left(\frac{1}{m}\left(\frac{1}{B(T;T,U)} - 1\right) - K\right)^+ = (1 + mK)\left(\frac{1}{1 + mK} - B(T;T,U)\right)^+    (6.32)

This is just 1 + mK units of a put option on the T + 1Y/m – maturity zero-coupon bond with strike (1 + mK)^{-1}. Thus a caplet is a put option on a zero-coupon bond. A cap, therefore, is a basket of put options on zero-coupon bonds of various maturities.
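The equivalence in (6.30)–(6.32) is easy to verify numerically: the caplet payoff discounted from U back to T coincides with the payoff of 1 + mK zero-coupon-bond puts for any realised LIBOR.

```python
def caplet_payoff_at_T(l, k, m):
    """Caplet payoff discounted from U back to T, eq (6.30)."""
    return m * max(l - k, 0.0) / (1.0 + m * l)

def zcb_put_payoff_at_T(l, k, m):
    """1 + mK puts on the zero-coupon bond struck at 1/(1+mK), eq (6.32);
    the bond is worth B(T;T,U) = 1/(1+mL) at T, eq (6.31)."""
    bond = 1.0 / (1.0 + m * l)
    strike = 1.0 / (1.0 + m * k)
    return (1.0 + m * k) * max(strike - bond, 0.0)

for l in (0.01, 0.03, 0.05, 0.10):  # realised LIBOR scenarios
    print(caplet_payoff_at_T(l, 0.03, 0.5), zcb_put_payoff_at_T(l, 0.03, 0.5))
```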

6.7 Cap

A cap is a sum of caplets: that is, the option at every fixing T_i of entering that specific caplet from T_i to U_i = T_{i+1}. Each caplet has an equal duration of 1 year/m up until its corresponding U_i. More specifically, a cap is a collection (or strip) of caplets, each of which is a call option on the LIBOR level at a specified date in the future.

To construct a standard cap on the 1year/m – maturity LIBOR with strike K and

maturity U, we proceed as follows. Suppose we are currently at time t. Starting with

U, we proceed backwards in steps of length 1year/m. Let n be the number of

complete periods of length 1year/m between t and U. Thus, we get a set of times

T_0 = t + δ

T_1 = T_0 + 1Y/m

T_2 = T_1 + 1Y/m = T_0 + 2·1Y/m

T_n = U = T_0 + n·1Y/m

We now construct the portfolio of n caplets, struck at K, with maturities T0, T1,…,

Tn-1, called the fixing dates or caplet maturity dates. The payment dates are thus T1,

T2,…, Tn. The cap is then just equal to the sum of the prices of the strip of caplets. We

will now calculate this strip:

If ζi(t) denotes the price at time t of a caplet with maturity date Ti (and payment

date Ui = Ti+1), then the price of the cap is


\varsigma(t,T) = \sum_{i=0}^{n-1} \varsigma_i(t) = \sum_{i=0}^{n-1} m\,B(t;U_i)\,E_t\big[(L(T_i,T_i,U_i) - K)^+\big]    (6.33)

Applying Black Scholes

\varsigma(t,T) = \sum_{i=0}^{n-1} \varsigma_i(t) = \sum_{i=0}^{n-1} m\,B(t;U_i)\,\big[L(t,T_i,U_i)\,N(d_1^{T_i}) - K\,N(d_2^{T_i})\big]    (6.34)

The only quantity that cannot be directly observed in this pricing formula is the

set of forward rate volatilities, σTi for each caplet. Thus

\varsigma(t,T) = \varsigma(t,T,\sigma_{T_0},\sigma_{T_1},\ldots,\sigma_{T_{n-1}})

As a given set of forward rate volatilities produces a unique price, if we can find a

single number σ such that

\varsigma(t,T,\sigma_{T_0},\sigma_{T_1},\ldots,\sigma_{T_{n-1}}) = \varsigma(t,T,\sigma,\sigma,\ldots,\sigma)

then this σ is called the implied or Black volatility for the U – maturity cap.

The market’s observed prices of caps of various maturities are inverted

numerically to obtain a term structure of Black volatilities, and these implied

volatilities are then quoted on the market itself.
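This inversion can be sketched as follows: a cap price is built as the sum of Black caplets, eq (6.34), and a flat (Black) volatility is recovered from it by bisection, exploiting the fact that the cap price is increasing in volatility. All market data below are illustrative.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_caplet(b, m, l, k, sigma, t):
    """Black caplet, eqs (6.26)-(6.27)."""
    d1 = (log(l / k) + 0.5 * sigma ** 2 * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return m * b * (l * norm_cdf(d1) - k * norm_cdf(d2))

def cap_price(caplets, k, vols):
    """Eq (6.34): the cap is the sum of its caplets. Each entry of `caplets`
    holds (B(t;U_i), m_i, L(t,T_i,U_i), T_i)."""
    return sum(black_caplet(b, m, l, k, s, t) for (b, m, l, t), s in zip(caplets, vols))

def implied_flat_vol(caplets, k, target, lo=1e-4, hi=5.0, tol=1e-10):
    """Invert the cap price for a single Black volatility by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cap_price(caplets, k, [mid] * len(caplets)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative strip of three semi-annual caplets and per-caplet 'market' vols
caplets = [(0.99, 0.5, 0.030, 0.5), (0.97, 0.5, 0.032, 1.0), (0.95, 0.5, 0.034, 1.5)]
target = cap_price(caplets, 0.03, [0.22, 0.20, 0.18])
flat = implied_flat_vol(caplets, 0.03, target)
print(flat)   # a single vol reproducing the cap price, between 0.18 and 0.22
```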

Fig. 6.10. Cap payoffs


6.7.1 Floor

A floor is a strip of floorlets, each of which is a put option on the LIBOR level at a given future date. The pricing and hedging of floors is exactly complementary to the treatment of caps. The price of a floor with a structure similar to the plain vanilla cap discussed before is given by:

\phi(t,T) = \sum_{i=0}^{n-1} \phi_i(t) = \sum_{i=0}^{n-1} m\,B(t;U_i)\,\big[K\,N(-d_2^{T_i}) - L(t,T_i,U_i)\,N(-d_1^{T_i})\big]    (6.35)

Just as a caplet is a put option on a pure discount bond, similarly a floorlet is a

call option on such a bond. The hedging instruments for floors are the same as for

caps, except that positions are reversed since the holder of a floor benefits from falling

interest rates. The owner of a floor is always long in the market and long in vega, i.e.

benefits from rising volatility. The value of a floor also increases with maturity, as the

number of put options increases.

6.7.2 Put-Call Parity

Consider a caplet ζ and a floorlet Ф, each maturing at time T, with the same strike

rate K. Let us construct a portfolio

π = ζ − Ф

The payoff from this portfolio, received at the end of the accrual period, is

\pi_T = m\big[(L(T_i,T_i,U_i) - K)^+ - (K - L(T_i,T_i,U_i))^+\big] = m\big[L(T_i,T_i,U_i) - K\big]    (6.36)

This is just a cash flow from a payers swap. Thus we have the following version

of the put-call parity appropriate for caps and floors:

Cap – Floor = Payer Swap


6.8 Swap

A swap is an agreement between two counterparties made at t = 0, in which one party pays a fixed coupon rate K on every fixing (the fixed leg) and in return receives a variable floating rate (the floating leg) defined over LIBOR rates. It is thus an exchange of a series of payments at regular intervals for a specified period of time. The payments are based on a notional underlying principal. There is only one option, to enter or not to enter the agreement.

The fixed rate K is settled at the present date t = 0, whereas the variable leg is set

by the LIBOR value at each fixing date Ti, lasting a time period until Ui - thus L(Ti, Ui).

The difference between the two rates is exchanged at Ui, after being multiplied by a

notional. Note that every maturity has Ui = Ti+1

The lifetime of the swap is called its tenor. An investor is said to hold a payer

swap if he pays the fixed leg and receives the floating; an investor is said to hold a

receiver swap if the reverse is true. The time-table for payments can be represented

schematically as follows (assuming a unit notional).

Time   Fixed Coupon   Floating Coupon           Cashflow
T_0    0              0                         0
T_1    mK             mL(T_0,T_0,U)             m(K − L(T_0,T_0,U))
:      :              :                         :
T_n    mK             mL(T_{n-1},T_{n-1},U)     m(K − L(T_{n-1},T_{n-1},U))

Table 6.1 Swap payoff term structure

Consider the position of the holder of a payer swap. The value of the payer swap

at time t < T0 (the first LIBOR fixing date) is given by

V_n(t,T) = \sum_{i=0}^{n-1} m\,B(t;T_{i+1})\,\big[L(t,T_i,U_i) - K\big]    (6.37)

where K is the fixed coupon rate. To give this swap zero initial value, we can set


K = R(t,T_0,T_n) = \frac{\sum_{i=0}^{n-1} B(t;T_{i+1})\,L(t,T_i,U_i)}{\sum_{i=0}^{n-1} B(t;T_{i+1})}    (6.38)

This rate R(t, T0, Tn) which gives a zero value to a swap starting at time T0 and

expiring after n payments at Tn, is called the forward par swap rate for a swap with

tenor Tn − T0. The spot starting par swap rate, or just the swap rate, is the fixed coupon

rate that gives zero initial value to a swap starting at the present time. This is denoted

by R(t, t, T) for a swap with tenor T − t.

Fig. 6.11. Swap payoffs

6.8.1 Swap – Another Approach

A swap with notional N and time schedule τ = (T0; T1; T2,…,Tn) is a contract that

interchanges payments of two legs:

The fixed leg pays at times T1; T2,…,Tn a fixed annual rate

N m_i K,   with i = 1,…,n,  m_i = m(T_{i-1}; T_i)

The floating leg pays at times U1,…, Um that are most probably different from Ti,

although U0 = T0, Um = Tn. The payments are based on the Libor forward rates,

resetting at Ti. That is, seen from time t ≤ T0,

N m_i L(t; U_{i-1}; U_i),   with i = 1,…,m,  m_i = m(U_{i-1}; U_i)

The basis or day count convention for the floating leg is usually act/360, and for

the fixed leg there are many usual choices. There are different classes of swaps,

according to what is done with the fixed leg. If it is paid, we call it a payer swap, and a

receiver swap if the fixed leg is received. Let us think, for example, of a payer swap.


According to

S_t = \tilde{E}_t^P\Big[e^{-\int_t^T r_s\,ds}\,S_T\Big]

the value of the payer swap will be the conditional expectation of the discounted payoff of the product. Thus, the value (at t) of our payer swap is then the value of the fixed leg minus the value of the floating leg, that is

V_t^{s} = V_t^{f} - V_t^{fl}    (6.39)

For simplicity, let us assume N = 1. For the fixed leg, by the linearity of the

conditional expectation and the definition of B(t; T), its value is

V_t^{f} = \tilde{E}^P\Big[\sum_{i=1}^{n} e^{-\int_t^{T_i} r_s\,ds}\,K m_i \,\Big|\, F_t\Big] = \sum_{i=1}^{n} B(t,T_i)\,K m_i    (6.40)

To value the floating leg, let us assume that there are no differences in the time

schedules Ti and Ui as was mentioned previously. Therefore, its value is

V_t^{fl} = \sum_{i=1}^{n} \tilde{E}^P\Big[e^{-\int_t^{T_i} r_s\,ds}\,m_i\,L(t,T_{i-1},T_i) \,\Big|\, F_t\Big]

= \sum_{i=1}^{n} m_i\,L(t,T_{i-1},T_i)\,B(t,T_i)

= \sum_{i=1}^{n} \left(\frac{B(t,T_{i-1})}{B(t,T_i)} - 1\right) B(t,T_i)

= \sum_{i=1}^{n} \big(B(t,T_{i-1}) - B(t,T_i)\big)

= B(t,T_0) - B(t,T_n)    (6.41)

With different time schedules, the formula is less concise. Nevertheless, it is often said that a floating leg valued as above is valued “at par”. Thus, the value of the payer swap is

V_t^{s} = K\sum_{i=1}^{n} B(t,T_i)\,m_i - \big(B(t,T_0) - B(t,T_n)\big)    (6.42)


The market swap rate associated to a swap with a time schedule τ = (T_0; T_1; T_2,…,T_n) and a day-count convention m is the rate of the fixed leg that makes the contract fair (price 0) at time t: that is, such that the fixed leg and the floating leg have the same price. Solving for K = S_{τ,m} with V_t^s = 0, we obtain

S_{\tau,m}(t) = \frac{B(t,T_0) - B(t,T_n)}{\sum_{i=1}^{n} B(t,T_i)\,m_i}    (6.43)

Of course, we can therefore rewrite the initial formula in terms of Sτ,m as

V_t^{s} = \big(K - S_{\tau,m}(t)\big)\sum_{i=1}^{n} B(t,T_i)\,m_i    (6.44)

Observe that by using the values of the curve already obtained from the forward cash rates, together with these market swap rates (which are market data), we can obtain the values of the zero-coupon bond B(t,T) for T = 3M, 6M, 1Y, etc. For values of the curve with maturities T that are not exactly these values, we use interpolation (usually log-linear interpolation, since heuristically B(0,T) ~ e^{-rT}, where the unknown is r). Such a recursive method of extracting information is often called bootstrapping.
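A minimal sketch of (6.42)–(6.44) on an illustrative discount curve; at the par swap rate the swap value is zero by construction.

```python
def par_swap_rate(discounts, accruals):
    """Eq (6.43): S = (B(t,T0) - B(t,Tn)) / sum_i B(t,Ti) m_i,
    with discounts = [B(t,T0), ..., B(t,Tn)] and accruals = [m_1, ..., m_n]."""
    annuity = sum(b * m for b, m in zip(discounts[1:], accruals))
    return (discounts[0] - discounts[-1]) / annuity

def payer_swap_value(k, discounts, accruals):
    """Eq (6.42): V = K sum_i B(t,Ti) m_i - B(t,T0) + B(t,Tn)."""
    annuity = sum(b * m for b, m in zip(discounts[1:], accruals))
    return k * annuity - discounts[0] + discounts[-1]

discounts = [1.0, 0.97, 0.94, 0.91, 0.885]   # illustrative semi-annual curve
accruals = [0.5, 0.5, 0.5, 0.5]
s = par_swap_rate(discounts, accruals)
print(s)
print(payer_swap_value(s, discounts, accruals))   # ~0 at the par rate, eq (6.44)
```

In a bootstrap, the same relation is solved the other way round: given quoted par rates S, the unknown is the last discount factor B(t,T_n).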

6.9 Swaption

A swaption is the option of entering a Swap contract at a future time T. There are

two types:

A Payer swaption allows its holder to exercise into a payer swap at maturity U,

thus agreeing to pay a fixed quantity and receive floating cash flows for a specified

period of time, called the swap tenor.

A receiver swaption allows its holder to exercise into a receiver swap, paying floating and receiving fixed rate payments for a specified time.

At maturity, the holder of a swaption can exercise into swaps having several

possible tenors. For this reason, swaptions must specify not just the expiry time T of

the option, but also the tenor T* of the swap resulting from exercise. Thus swaption

prices, volatilities etc are quoted on a matrix.


6.9.1 Payoff Structure

Consider a T × T* payer swaption. Let T = T0, T1, ..., Tn−1 be the fixing dates and

T1, T2, ..., Tn = T* the cash flow dates for the swap. If K is the fixed swap coupon and m

the constant accrual factor between fixing and payment dates, then the value at time t

of the payer swap is

V(t, T \times T^*) = \sum_{i=0}^{n-1} m\,B(t;T_{i+1})\,\big[L(t,T_i,U_i) - K\big]    (6.45)

The payoff from the payer swaption is therefore given by

Payoff(t, T \times T^*) = V(T, T^*)^+    (6.46)

If we consider a forward-starting par swap, then the coupon rate K is given by

R(t, T, T*). Substituting this rate into the payer swap underlying the swaption, in

(6.45) we get

V(t, T \times T^*) = \big(R(t,T,T^*) - K\big)\sum_{i=0}^{n-1} m\,B(t;T_{i+1})    (6.47)

The payoff from the swaption becomes

Payoff = \Big(\big(R(t,T,T^*) - K\big)\sum_{i=0}^{n-1} m\,B(t;T_{i+1})\Big)^+

= \big(R(t,T,T^*) - K\big)^+ \sum_{i=0}^{n-1} m\,B(t;T_{i+1})

= \big(R(t,T,T^*) - K\big)^+ \times PV01    (6.48)

The summation factor \sum_{i=0}^{n-1} m\,B(t;T_{i+1}) is called the PV01, and represents the present value of a basis point paid on the swap cash flow dates. Thus a payer swaption is just a call option on the forward swap rate, with strike K. Similarly, a receiver swaption is a put option on the forward swap rate.


6.9.2 Pricing a Swaption

The Black model is the market standard for pricing swaptions. In fact, it is curious to note that the Black model started being used by the market itself, and it was not until later that a theory was elaborated to justify its application. The two crucial assumptions in the model are, firstly, that the forward swap rate R(t,T,T^*) is driven by a zero-drift geometric Brownian motion

\frac{dR(t,T,T^*)}{R(t,T,T^*)} = \sigma_{T \times T^*}\,dW_t    (6.49)

and secondly, that the discounting is constant. This implies that the PV01 does

not change through time.

Now, by using exactly the same calculations as for vanilla caplets, it is easy to see

that the price of a payer swaption is given by

Payoff_{payer}(t; T \times T^*) = B(t;T) \times PV01 \times \tilde{E}\big[(R(t,T,T^*) - K)^+\big]

= B(t;T) \times PV01 \times \big[R(t,T,T^*)\,N(d_1) - K\,N(d_2)\big]    (6.50)

d_{1,2} = \frac{\log\frac{R(t,T,T^*)}{K} \pm \frac{\sigma_{T \times T^*}^2}{2}(T-t)}{\sigma_{T \times T^*}\sqrt{T-t}}    (6.51)

The only unobservable is the forward rate volatility σT×T* . However, there is a 1 to

1 correspondence between this volatility and the resultant price of a swaption, and

this fact is used to invert observed swaption market prices and obtain a matrix of flat

implied volatilities, also known as Black or lognormal volatilities.

A receiver swaption is similar to a payer swaption, except that it can be expressed

as a put option on the forward starting swap rate. The price of an otherwise identical

receiver swaption is given by

Payoff_{receiver}(t; T \times T^*) = B(t;T) \times PV01 \times \tilde{E}\big[(K - R(t,T,T^*))^+\big]

= B(t;T) \times PV01 \times \big[K\,N(-d_2) - R(t,T,T^*)\,N(-d_1)\big]    (6.52)
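The payer and receiver formulas can be sketched together. Here the PV01 input is assumed to carry the discounting already, so no extra B(t;T) factor appears, and all numbers are illustrative. The final line checks the put-call parity of section 6.9.3.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def swaption_black(r_fwd, k, sigma, t, pv01, payer=True):
    """Black swaption prices in the spirit of eqs (6.50)-(6.52); PV01 is
    assumed already discounted, so the B(t;T) factor is omitted here."""
    d1 = (log(r_fwd / k) + 0.5 * sigma ** 2 * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    if payer:
        return pv01 * (r_fwd * norm_cdf(d1) - k * norm_cdf(d2))
    return pv01 * (k * norm_cdf(-d2) - r_fwd * norm_cdf(-d1))

pv01, r_fwd, k, sigma, t = 1.85, 0.035, 0.03, 0.20, 1.0   # illustrative inputs
payer = swaption_black(r_fwd, k, sigma, t, pv01, payer=True)
receiver = swaption_black(r_fwd, k, sigma, t, pv01, payer=False)
# Put-call parity: payer - receiver = PV01 (R - K)
print(payer - receiver - pv01 * (r_fwd - k))   # ~0
```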


6.9.3 Put-Call Parity for Swaptions

Let the PayoffReceiver (t, T × T*) be a receiver swaption with strike K and the

PayoffPayer (t, T × T*) be an identical payer swaption. Consider the portfolio

Payoff (t, T × T*) = PayoffPayer (t, T × T*) - PayoffReceiver (t, T × T*)

It is simple to verify that

Payoff(t, T × T*) = PV01 × (R(T,T,T*) − K)

Or written in words

Payer Swaption – Receiver Swaption = Payer Swap

By no-arbitrage, the value of the portfolio (the Payoff) must be equal to the value

of a payer swap. This relationship can be used as an alternative to direct integration

when finding the price of a receiver swaption. The ATM strike is defined to be the

value of K which makes the values of the payer and receiver swaptions equal. Put-call

parity now implies that this must be the same rate that gives a forward-starting swap,

zero value. In other words, the ATM strike is simply equal to the forward starting par

swap rate R(t, T, T*).

Chapter 7 HJM

7. HJM

7.1 Introduction

The Heath-Jarrow-Morton framework is a general framework to model the

evolution of interest rates (forward rates in particular). It describes the behaviour of

the future price (in t) of a zero coupon bond B(t,T) paying 1 unit of currency at time T.

The framework originates from the studies of D. Heath, Robert A. Jarrow and A. Morton in the late 1980s; refer to “Bond pricing and the term structure of interest rates: a new methodology” (1987), working paper, Cornell University, and “Bond pricing and the term structure of interest rates: a new methodology” (1989), working paper, Cornell University.

The Heath, Jarrow and Morton term structure model provides a consistent

framework for the pricing of interest rate derivatives. The model is directly calibrated

to the currently observed yield curve, and is complete in the sense that it does not

involve the market price of interest rate risk, something which was a feature of the

early generation of interest rate models, such as Vasicek (1977) and Cox, Ingersoll and

Ross (1985).

The key aspect of HJM techniques lies in the recognition that the drifts of the no-

arbitrage evolution of certain variables can be expressed as functions of their

volatilities and the correlations among themselves. In other words, no drift estimation

is needed. Models developed according to the HJM framework are different from the

so called short-rate models in the sense that HJM-type models capture the full

dynamics of the entire forward rate curve, while the short-rate models only capture

the dynamics of a point on the curve (the short rate). In practice however, we will not

work with a complete, absolutely continuous discount curve B(t,T), but will instead

construct our curve based on discrete market quotes, and will then extrapolate the

data to make it continuous.


Given the zero-coupon curve B(t,T), there exists a forward rate F(t,u) such that

dF(t,T) = \mu(t,T)\,dt + \sigma(t,T)\,dW_t^P    (7.1)

This dynamics is the foundation on which the HJM model is constructed.
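As an illustration of (7.1), the sketch below evolves a discrete forward curve with a single Brownian driver under the simplest assumption of a constant volatility σ(t,T) = σ, for which the no-arbitrage drift takes the form μ(t,T) = σ²(T − t); both the drift choice and all inputs are assumptions made for this example only.

```python
import random
from math import sqrt

def simulate_forward_curve(f0, sigma, tenors, t_end, n_steps, seed=0):
    """Euler scheme for eq (7.1) with constant volatility and the
    (assumed) no-arbitrage drift mu(t,T) = sigma^2 (T - t)."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    f, t = list(f0), 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt(dt))   # one Brownian increment drives the whole curve
        f = [fi + sigma ** 2 * (T - t) * dt + sigma * dw for fi, T in zip(f, tenors)]
        t += dt
    return f

tenors = [1.0, 2.0, 5.0, 10.0]
f0 = [0.030, 0.032, 0.035, 0.040]
curve = simulate_forward_curve(f0, 0.01, tenors, 0.5, 100)
print(curve)   # a shocked curve that stays close to f0 for this small volatility
```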

7.2 Model Origins

There are two basic arbitrage relationships that derive from the bond pricing

equation:

1. B(t,T) = E_t^P\Big[e^{-\int_t^T r_s\,ds}\Big], associated with the spot rates (classical)

2. B(t,T) = e^{-\int_t^T F(t,s)\,ds}, associated with the instantaneous forward rates (HJM)

All existing models start from one or another and follow the same general

procedure:

They start with a set of bond prices B(t,T) that are reasonably arbitrage free, and

use either of the previous two arbitrage relationships to go backwards in time so as to

determine a model for either the spot rate rt or for the set of forward rates F(t,s)

depending on the arbitrage relationship selected.

As both relations hold under the no arbitrage conditions, the models obtained are

risk adjusted i.e. they are valid under the risk neutral measure P.

The aim behind the creation of these rt or F(t,s) models is to then perform the

inverse path. That is, to use the models developed to price interest rate derivatives

other than bonds.

1. Classical methods use the first relationship. They try to extract from the set of

bonds B(t,T) a risk adjusted model for the spot rate rt, using an assumption on the

Markovness of rt.


2. The HJM approach uses the second relationship. It obtains as a result the

arbitrage free dynamics of ‘d’ dimensional instantaneous forward rates F(t,s). It

requires no spot rate modelling, and what is more, it demonstrates that the spot rate rt

is in general not Markov.

7.3 The HJM Development

The HJM model starts off by exploiting the risk neutral relationship. Imagine

that we have a pair of arbitrage free zero coupon bonds B(t,T), and B(t,U), and let

F(t,T,U) be the default-free forward interest rate contracted at time t, starting at T and

ending at maturity U. For simplicity, we will assume that we have no discount factor ‘m’. As seen in the Libor section (Chapter 5.1), we can write the arbitrage-free relationship:

\frac{B(t,T)}{B(t,U)} = 1 + F(t,T,U)    (7.2)

We are thus relating two different bonds (and therefore their two different

dynamics) through a unique forward rate F. This means that the bond’s arbitrage

relations will be directly built into the forward rate dynamics.

The question that logically follows is: which forward rate to use? We have already

seen that there exist both a continuously compounded model for instantaneous rates,

F(t,T), and a discrete expression of the form F(t,T,U). The above clearly makes use of

the discrete approach, and leads to the BGM models created by the work of Brace,

Gatarek and Musiela.

In contrast, the original approach used by HJM was to model the continuously

compounded instantaneous forward rates, F(t,T), where as we saw previously in 5.3.3

that

B(t,T) = e^{-\int_t^T F(t,s)\,ds}    (7.3)

With the above, the arbitrage relationship between interest rate bonds now

becomes:


\frac{B(t,T)}{B(t,U)} = e^{\int_T^U F(t,s)\,ds}    (7.4)

We will continue our introduction to the HJM model along the lines of the

original HJM approach.

Notice that there is no expectation operator in the above, since the F(t,s) are all

forward rates observed at the current time t, beginning at a future date s, and lasting

an infinitesimal time ds. For simplicity we will now adopt Bt = B(t,T).

The HJM model, following Black, assumes that a typical bond in a risk neutral

environment follows the stochastic differential equation:

dB_t = r_t B_t\,dt + \sigma(t,T,B_t)\,B_t\,dW_t    (7.5)

Where rt is the risk-free instantaneous spot rate and is therefore equal for all

bonds or assets. The noteworthy part of this model is the fact that it uses a unique Brownian motion. Indeed, it is quite tedious to demonstrate that the Brownian motion does not depend on the maturity T. We will not enter into the details of this development here, but will simply underline once more the importance of the fact that we are able to take a unique Brownian driver.

From the equation above (7.4), we could rearrange the expression so as to have

\log\frac{B(t,T)}{B(t,U)} = \int_T^U F(t,s)\,ds \qquad (7.6)

This can be rewritten in terms of non-instantaneous forward rates, for a small interval \Delta, as:

\log B(t,T) - \log B(t,T+\Delta) = F(t;T,T+\Delta)\,\bigl((T+\Delta)-T\bigr) \qquad (7.7)

by applying Ito’s Lemma:

d\log B(t,T) = \frac{1}{B(t,T)}\,dB(t,T) + \frac{1}{2}\left(-\frac{1}{B(t,T)^2}\right)\sigma(t,T,B_t)^2\,B(t,T)^2\,dt \qquad (7.8)


we can replace our diffusion expression for dB in the above

d\log B(t,T) = \left[r_t - \frac{1}{2}\,\sigma(t,T,B_t)^2\right]dt + \sigma(t,T,B_t)\,dW_t^P \qquad (7.9)

Similarly, for \log B(t,T+\Delta) we can write

d\log B(t,T+\Delta) = \left[r_t - \frac{1}{2}\,\sigma(t,T+\Delta,B_t)^2\right]dt + \sigma(t,T+\Delta,B_t)\,dW_t^P \qquad (7.10)

It is important to realize that the drift terms rt are the same in both cases, because

we are considering a risk neutral scenario. This is the same argument that is applied in

the Black Scholes derivation.

The drift term is unknown, but we can use a trick to eliminate it: subtracting the two equations,

d\log B(t,T) - d\log B(t,T+\Delta) = \frac{1}{2}\left[\sigma(t,T+\Delta,B_t)^2 - \sigma(t,T,B_t)^2\right]dt + \left[\sigma(t,T,B_t) - \sigma(t,T+\Delta,B_t)\right]dW_t^P \qquad (7.11)

From before we had

F(t;T,T+\Delta) = \frac{\log B(t,T) - \log B(t,T+\Delta)}{\Delta}

so that, dividing (7.11) by \Delta,

dF(t;T,T+\Delta) = \frac{\sigma(t,T+\Delta,B_t)^2 - \sigma(t,T,B_t)^2}{2\Delta}\,dt + \frac{\sigma(t,T,B_t) - \sigma(t,T+\Delta,B_t)}{\Delta}\,dW_t^P \qquad (7.12)

Now the above can be considered a derivative as \Delta \to 0. Recall that

\frac{\partial f}{\partial x} = \lim_{\Delta\to 0}\frac{f(x+\Delta) - f(x)}{\Delta}

We can therefore rewrite the first term in (7.12) as:

\lim_{\Delta\to 0}\frac{\sigma(t,T+\Delta,B_t)^2 - \sigma(t,T,B_t)^2}{2\Delta} = \sigma(t,T,B_t)\,\frac{\partial\sigma(t,T,B_t)}{\partial T} \qquad (7.13)

for the second term we have

\lim_{\Delta\to 0}\frac{\sigma(t,T+\Delta,B_t) - \sigma(t,T,B_t)}{\Delta} = \frac{\partial\sigma(t,T,B_t)}{\partial T} \qquad (7.14)

and \lim_{\Delta\to 0} dF(t;T,T+\Delta) = dF(t,T).

We therefore end up with

dF(t,T) = \sigma(t,T,B_t)\,\frac{\partial\sigma(t,T,B_t)}{\partial T}\,dt - \frac{\partial\sigma(t,T,B_t)}{\partial T}\,dW_t^P \qquad (7.15)

where σ are the bond price volatilities (which are generally quoted by the market

itself). We therefore, need only solve the above to attain the HJM forward rate model.

Note lastly that the above corresponds to a diffusion model for F(t,T) of the form

dF(t,T) = \sigma(t,T,B_t)\,b(t,T)\,dt - b(t,T)\,dW_t^P \qquad (7.16)

where the partial derivative is collected under the term b(t,T) = \partial\sigma(t,T,B_t)/\partial T

7.4 The rt in the HJM Approach

We will now demonstrate that through the HJM approach, there is no need to

model a diffusion process for rt, that may be inaccurate. Instead, we can directly derive

the spot rates from our instantaneous forward rates F(t,T), by simply realising that:

r_t = F(t,t) \qquad (7.17)

That is, the spot rate corresponds to the nearest infinitesimal forward loan starting at time t (recall (5.22)).

Now by integrating our forward model derived in (7.16), this is

dF(t,T) = \sigma(t,T,B_t)\,b(t,T)\,dt - b(t,T)\,dW_t^P \qquad (7.18)


we obtain

F(t,T) = F(0,T) + \int_0^t \sigma(s,T,B_s)\,b(s,T)\,ds - \int_0^t b(s,T)\,dW_s^P \qquad (7.19)

Remember that we had

b(t,T) = \frac{\partial\sigma(t,T,B_t)}{\partial T}, \quad\text{meaning}\quad \sigma(s,T,B_s) = \int_s^T b(s,u)\,du \qquad (7.20)

F(t,T) = F(0,T) + \int_0^t b(s,T)\left(\int_s^T b(s,u)\,du\right)ds - \int_0^t b(s,T)\,dW_s^P \qquad (7.21)

So if we now select T = t, our expression for the forward rate becomes an

expression for the spot rate:

r_t = F(t,t) = F(0,t) + \int_0^t b(s,t)\left(\int_s^t b(s,u)\,du\right)ds - \int_0^t b(s,t)\,dW_s^P \qquad (7.22)

The forward rates are biased estimators of future spot rates under the risk free

measure.

Proof

Let us demonstrate this by taking the conditional expectation of a future spot rate

r_\tau, with \tau > t. Then

E_t^P\left[r_\tau\right] = E_t^P\left[F(t,\tau)\right] + E_t^P\left[\int_t^\tau b(s,\tau)\left(\int_s^\tau b(s,u)\,du\right)ds\right] - E_t^P\left[\int_t^\tau b(s,\tau)\,dW_s^P\right] \qquad (7.23)

The last term is 0, as Brownian increments have zero expectation. F(t,\tau) is known at time t, so it comes out of the expectation. We are thus left with

E_t^P\left[r_\tau\right] = F(t,\tau) + E_t^P\left[\int_t^\tau b(s,\tau)\left(\int_s^\tau b(s,u)\,du\right)ds\right] \qquad (7.24)

meaning F(t,\tau) \neq E_t^P\left[r_\tau\right].

The HJM model exploits the arbitrage relationship between forward rates and bond prices, eliminating the need to model the expected rate of change of the spot rate.
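A minimal numerical sketch of the two results above: the forward curve is evolved under the no-arbitrage drift of (7.21), and the spot rate is read off the diagonal as r_t = F(t,t) per (7.17). The constant forward-rate volatility b, the flat initial curve and the function name are illustrative assumptions, not part of the model:

```python
import numpy as np

# Euler simulation of dF(t,T) = b * (integral of b from t to T) dt - b dW,
# the drift-restricted HJM dynamics; the spot rate is the diagonal F(t,t).
def simulate_hjm(F0=0.03, b=0.01, dt=0.01, horizon=5.0, seed=0):
    rng = np.random.default_rng(seed)
    T_grid = np.arange(0.0, horizon, dt)     # common grid of times/maturities
    F = np.full(len(T_grid), F0)             # flat initial curve F(0, T) = F0
    spot = [F[0]]                            # r_0 = F(0, 0)
    for i in range(1, len(T_grid)):
        t = T_grid[i]
        # drift = b * integral_t^T b du = b^2 (T - t), clamped at expired maturities
        drift = b * b * np.maximum(T_grid - t, 0.0)
        dW = rng.normal(0.0, np.sqrt(dt))    # one-factor: same shock for all T
        F = F + drift * dt - b * dW
        spot.append(F[i])                    # r_t = F(t, t)
    return np.array(spot)

rates = simulate_hjm()
```

The whole curve is driven by a single Brownian increment per step, mirroring the unique Brownian source underlined earlier.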


8. Santander HJM

In the Heath Jarrow Morton framework, we assume the bond prices follow the

subsequent dynamics:

\frac{dB(t,T)}{B(t,T)} = r_t\,dt + \Gamma(t,T)\,dW_t^P \qquad (8.1)

The first term r_t\,dt is constructed based on the risk neutral probability P which

was already explained in detail in the mathematical section 2.6. Remember that it

implies that all assets, bonds and securities should present an equal internal rate of

return. It is thus independent of all bond prices B(t,T), avoiding any arbitrage

possibilities.

The second term \Gamma(t,T)\,dW_t^P follows the Heath Jarrow Morton model. The specific formulation chosen for the diffusion term \Gamma(t,T) has been invented and developed by the Banco Santander quant team itself.

The initial equation we have set out with would require an infinite set of Brownians, one for every t, especially if the product we model were constructed with

numerous fixings. However, market observation of real data suggests that the curve

obtained through the above equation only experiences translations and rotations

around the 2 to 3 year mark, and presents an important smile which we must be

capable of modelling. These are the only transformations which we need to be capable

of representing, and therefore, we will not be needing hundreds of Brownian sources

of chaos. Instead, we create our HJM model based on a finite number of Brownian

sources.

\Gamma(t,T)\,dW_t^P = \sum_{j=1}^{N}\gamma_j(t,T)\,dW_t^P \qquad (8.2)

Where \gamma_j is the volatility of the discount factor B(t,T) for a particular instant.


As can be seen by the N index, we introduce a calibration set of N products that

each introduce a particular source of risk or stochasticity. It is up to the trader to

decide which set of N vanilla products will best reproduce the risk that he is trying to

model for his exotic product. In the above notation, each ‘j’ will correspond to a

particular maturity ‘T’.

However, the above time partitioning of the volatilities is fixed for our HJM

model, and independent of any product that we decide to calibrate. This implies that

it is we who define the series of time intervals 0= t0 < t1 < t2 < … < tN <… <∞ that we

will be considering, not the product.

From observation of historical data, we decide also that the Brownian motion term dW_t^P is independent of the bond maturities T. Bond price dynamics B(t,T) seem, historically, to all behave in the same way for a common T.

In the Banco Santander we set out to model \Gamma using the following criteria. We have already stated that the model must be capable of reproducing the smile of exotic products. We require that it be able to provide de-correlation between forward Libor rates, so as to maintain a degree of flexibility between movements in different parts of the forward rate curve. But most important of all, we require that the model be as simple as possible, i.e., that it should have the minimal number of parameters capable of reproducing the above characteristics. This will enable calibrations to be as rapid as possible.

8.1 How to choose the γ?

Our choice for the different \gamma_j(T) will determine the bond price distribution that we obtain. We start by seeing how this parameter has been chosen in other models.

The main historical developments in this field can be grouped under the BGM

methodology. Developed in 1995 – 1996, it builds on the construction of a form for the

γ that is consistent with the lognormality of a quasi Black-Scholes model, and that


takes into account a set of particular forward Libor rates, selected depending on their

maturities.

We have also already seen that another main stream of thought was to take the

zero coupon rates R(t,T), where B(t,T) = e^{-R(t,T)(T-t)}.

8.2 One Factor

Being one factor refers to the fact that the dynamics we attempt to reproduce are

modelled through a unique parameter, which will be the global volatility σ in our

case.

8.2.1 One-factor quasi log-normal:

\gamma_j(T) = \sigma(t,T_j)\,\log B(t,T_{j+1}) \qquad (8.3)

Proof

The above is derived from the following: recall that

B(t,T) = e^{-R(t,T)(T-t)} \;\;\rightarrow\;\; R(t,T) = \frac{\log B(t,T)}{t-T} \qquad (8.4)

Applying Ito, we obtain

dR(t,T) = \frac{1}{t-T}\left[\frac{1}{B(t,T)}\,dB(t,T) - \frac{1}{2}\,\Gamma(t,T)^2\,dt\right] - \frac{\log B(t,T)}{(t-T)^2}\,dt \qquad (8.5)

Replacing our diffusion equation \frac{dB(t,T)}{B(t,T)} = r_t\,dt + \Gamma(t,T)\,dW_t^P in the previous, and also replacing \log B(t,T) with the expression in (8.4), we obtain

dR(t,T) = \frac{1}{t-T}\left[r_t\,dt + \Gamma(t,T)\,dW_t^P - \frac{1}{2}\,\Gamma(t,T)^2\,dt - R(t,T)\,dt\right] \qquad (8.6)

and so regrouping terms in dt and dW

dR(t,T) = \frac{1}{t-T}\left[r_t - \frac{\Gamma(t,T)^2}{2} - R(t,T)\right]dt + \frac{\Gamma(t,T)}{t-T}\,dW_t^P \qquad (8.7)

From market data analysis, we realize that the dynamics of bonds in general,

follows a lognormal term structure. As we have seen in (8.4), B(t,T) is directly related

to R(t,T). Thus if our bonds follow a lognormal behaviour, so must our rate dynamics.

Therefore, if we impose that our model be log-normal, then we must impose that

the volatility term be linear with R(t,T). This is

( , )

\frac{\Gamma(t,T)}{t-T} = \sigma(t,T)\,R(t,T) \qquad (8.8)

in such a way that the log of our dynamics be normal

\log R(t,T) \;\rightarrow\; \frac{dR(t,T)}{R(t,T)} = (\ldots)\,dt + \sigma(t,T)\,dW_t^P \qquad (8.9)

As can be seen, the Brownian term associated with log R(t,T) is now normal. Since

we have log R(t,T), we say that our version is lognormal.

With this imposition, we obtain from (8.8) that

\Gamma(t,T) = \sigma(t,T)\,(t-T)\,R(t,T) \qquad (8.10)

And as we had from (8.4)

R(t,T) = \frac{\log B(t,T)}{t-T} \qquad (8.11)

Then

\Gamma(t,T) = \sigma(t,T)\,\log B(t,T) \qquad (8.12)

Note that the volatility of the rate R is a deterministic function σ(t; T) to be

calibrated to market data. More concretely, we will use swaption and caplet prices to

obtain information about σ, and thus about Γ. In reality we propose a piecewise

constant version of Γ with regards to t, so that R is only log-normal by parts, meaning

that it is quasi log-normal.

As we have seen in section 3.3, log-normality of R means that we have a

relatively flat Black's implied volatility smile. We have already seen that other models

consider normality instead and we have also seen in Fig. 3.4 that it implies a negative

skew for their associated Black's implied volatility smile.

The term log B(t,T) is Markovian, meaning that the next step of the process depends only on its current value.

8.2.2 One-factor normal

The development of a normal model is completely analogous to the above. The

only difference lies at the moment of imposing the model we want to follow. Instead

of

\frac{\Gamma(t,T)}{t-T} = \sigma(t,T)\,R(t,T)

we now simply impose

\frac{\Gamma(t,T)}{t-T} = \sigma(t,T)

This is a normal model, i.e. the Brownian term is independent of R(t,T). Since (t-T) is a deterministic term, we can include it within the volatility term, which is itself deterministic and also time dependent. Therefore:

\Gamma(t,T) = (t-T)\cdot\sigma(t,T) = \tilde\sigma(t,T) \qquad (8.13)

In fact, in our Santander model, we will extract from the \tilde\sigma(t,T) a deterministic, time dependent part, leaving

\Gamma(t,T) = \sigma(t,T)\,\log\frac{B(0,T)}{B(0,t)} \qquad (8.14)


8.3 Model Implementation

1st Approach:

Suppose a generic HJM setting of the form,

\frac{dB(t,T)}{B(t,T)} = r_t\,dt + \Gamma(t,T)\,dW_t^P \qquad (8.15)

where

\Gamma(t,T) = \sigma(t,T)\,\log B(t,T) \qquad (8.16)

For simplicity we have not included the time dependency, thus

R(t,T) = \log B(t,T) \qquad (8.17)

We can then write

dR(t,T) = d\log B(t,T) = \frac{1}{B(t,T)}\,dB(t,T) - \frac{1}{2}\,\frac{1}{B(t,T)^2}\,\Gamma(t,T)^2\,B(t,T)^2\,dt \qquad (8.18)

recalling \frac{dB(t,T)}{B(t,T)} = r_t\,dt + \Gamma(t,T)\,dW_t^P, then:

d\log B(t,T) = r_t\,dt + \Gamma(t,T)\,dW_t^P - \frac{1}{2}\,\Gamma(t,T)^2\,dt \qquad (8.19)

d\log B(t,T) = \left[r_t - \frac{\Gamma(t,T)^2}{2}\right]dt + \Gamma(t,T)\,dW_t^P \qquad (8.20)

Our main difficulty here is the term rt, which is risk neutral. This means that it

must show the same internal rate of return for any two assets. This must also apply

for any two assets that are separated in time. We can therefore examine two bonds of

different maturities, and subtract them to eliminate the term rt.

d\left[\log B(t,T) - \log B(t,U)\right] = \frac{\Gamma(t,U)^2 - \Gamma(t,T)^2}{2}\,dt + \left[\Gamma(t,T) - \Gamma(t,U)\right]dW_t^P \qquad (8.21)

The easiest way to implement this model is through a MonteCarlo approach

(refer to section 9.2). For this, we will need to generate a number of paths between

every interval ti and ti+1. As mentioned previously, in the HJM model we only

examine the particular bonds and maturities Ti that are of our interest.

Integrating the previous equation, we obtain:

\frac{B(T_{i+1},T)}{B(T_{i+1},U)} = \frac{B(T_i,T)}{B(T_i,U)}\;\exp\!\left(\int_{T_i}^{T_{i+1}}\frac{\Gamma(t,U)^2 - \Gamma(t,T)^2}{2}\,dt + \int_{T_i}^{T_{i+1}}\left[\Gamma(t,T) - \Gamma(t,U)\right]dW_t^P\right) \qquad (8.22)

Not all of the above components are completely determined since we do not

possess information on the parameter values at future times:

• The drift term \int_{T_i}^{T_{i+1}}\frac{\Gamma(t,U)^2 - \Gamma(t,T)^2}{2}\,dt should be integrated, using \gamma_j(T) = \sigma(t,T_j)\,\log B(t,T_{j+1})

• The term \int_{T_i}^{T_{i+1}}\left[\Gamma(t,T) - \Gamma(t,U)\right]dW_t^P is a stochastic Brownian term, and so relatively easy to calculate: it integrates to give a 0 mean, and so we must only calculate its variance.

2nd Approach:

Develops on the previous idea, but is now non Markovian. We therefore change

our initial approach

\Gamma(t,T) = \sigma(t,T)\,\log B(t,T) \qquad (8.23)

to \Gamma(t,U) = \sigma(t,U)\,\log B(T_i,U) \quad \forall t\in[T_i,T_{i+1}] \qquad (8.24)

In the previous approach, we did not know the future values at time t. Now,

instead, we evaluate our data at the beginning of our time step, where the value is

already known, and where only Ti+1 is left to determine.

Ti are therefore model time steps, strictly associated to the model itself and not to

the product.


B(T_i,U) is now no longer Markovian. That is, future steps no longer depend only on the immediately preceding point.

Integrating now from Ti to Ti+1 we obtain:

\log\frac{B(T_{i+1},V)}{B(T_{i+1},U)} - \log\frac{B(T_i,V)}{B(T_i,U)} = \int_{T_i}^{T_{i+1}}\frac{\Gamma(t,U)^2 - \Gamma(t,V)^2}{2}\,dt + \int_{T_i}^{T_{i+1}}\left[\Gamma(t,V) - \Gamma(t,U)\right]dW_t^P \qquad (8.25)

where \Gamma(t,V) = \sigma(t,V)\,\log B(T_i,V), and where \log B(T_i,\cdot) is no longer time dependent, so can be extracted from the integral:

\log\frac{B(T_{i+1},V)}{B(T_{i+1},U)} - \log\frac{B(T_i,V)}{B(T_i,U)} = \frac{\log^2 B(T_i,U)}{2}\int_{T_i}^{T_{i+1}}\sigma(t,U)^2\,dt - \frac{\log^2 B(T_i,V)}{2}\int_{T_i}^{T_{i+1}}\sigma(t,V)^2\,dt + \log B(T_i,V)\int_{T_i}^{T_{i+1}}\sigma(t,V)\,dW_t^P - \log B(T_i,U)\int_{T_i}^{T_{i+1}}\sigma(t,U)\,dW_t^P \qquad (8.26)

At Ti we already know all the values for B(Ti, _), and all the σ are also known and

deterministic. We are only left with the need to generate the stochastic integrals

\int_{T_i}^{T_{i+1}}\sigma(t,V)\,dW_t^P \qquad (8.27)

for the maturities V that are of our interest, and which correspond to the fixing

dates at which the cash flows are exchanged.

Notice that in this approach we only have one Brownian motion. This does not

necessarily imply that all our elements \int_{T_i}^{T_{i+1}}\sigma(t,V)\,dW_t^P be perfectly correlated. Imagine for example that we have \sigma_1(t,V)\,dW and \sigma_2(t,V)\,dW, both with the same dW. Then if the individual \sigma_i follow


[Fig. 8.1. Example of lack of correlation between variables belonging to a unique Brownian motion: \sigma_1(s) is non-zero only before time t, while \sigma_2(s) is non-zero only after it]

then \sigma_1(s)\,dW depends only on W before t, and \sigma_2(s)\,dW depends only on W after t. Therefore

\text{Covariance} = \int\sigma_1\,\sigma_2\,dt = 0

We notice however, that to simulate the integrals \int_{T_i}^{T_{i+1}}\sigma(t,V)\,dW_t^P, we require

the same number of Brownians Wt as the number of maturities Vi that we want for

each step. Thus we are dependent upon the form of σ(t,V) between any two dates Ti

and Ti+1. We therefore decide to look for the simplest possible expression for σ.

3rd Approach:

We make the hypothesis that σ(t,U) = σ(Ti,U), which appears to be the simplest

form for numeric generation. We therefore have:

\Gamma(t,U) = \sigma(T_i,U)\,\log B(T_i,U) \quad \forall t\in[T_i,T_{i+1}] \qquad (8.28)

Notice that we select, as we did for the bonds, a known Ti for our σ(Ti,U), that is,

at the beginning of the interval t. We do this because there is no strong reason that

would suggest we should take any other t, and because by selecting a known Ti,

numerically, everything becomes much easier.


Notice also that Γ is still stochastic for every Ti, as it changes value between each

Ti stochastically. However, \Gamma is piecewise constant on every interval [T_i, T_{i+1}].

Constructing on the formulation that we had developed in our previous

approach, we can now also extract the constant and known σ(Ti,U) from the integrals:

\log\frac{B(T_{i+1},V)}{B(T_{i+1},U)} - \log\frac{B(T_i,V)}{B(T_i,U)} = \frac{\log^2 B(T_i,U)}{2}\int_{T_i}^{T_{i+1}}\sigma(T_i,U)^2\,dt - \frac{\log^2 B(T_i,V)}{2}\int_{T_i}^{T_{i+1}}\sigma(T_i,V)^2\,dt + \log B(T_i,V)\int_{T_i}^{T_{i+1}}\sigma(T_i,V)\,dW_t^P - \log B(T_i,U)\int_{T_i}^{T_{i+1}}\sigma(T_i,U)\,dW_t^P \qquad (8.29)

Since the \sigma(T_i,\cdot) are constant over the interval, they too come out of the integrals:

\log\frac{B(T_{i+1},V)}{B(T_{i+1},U)} - \log\frac{B(T_i,V)}{B(T_i,U)} = \frac{\left[\log B(T_i,U)\,\sigma(T_i,U)\right]^2 - \left[\log B(T_i,V)\,\sigma(T_i,V)\right]^2}{2}\int_{T_i}^{T_{i+1}}dt + \left[\log B(T_i,V)\,\sigma(T_i,V) - \log B(T_i,U)\,\sigma(T_i,U)\right]\int_{T_i}^{T_{i+1}}dW_t^P \qquad (8.30)

Evaluating the remaining integrals:

\log\frac{B(T_{i+1},V)}{B(T_{i+1},U)} - \log\frac{B(T_i,V)}{B(T_i,U)} = \frac{\left[\log B(T_i,U)\,\sigma(T_i,U)\right]^2 - \left[\log B(T_i,V)\,\sigma(T_i,V)\right]^2}{2}\,(T_{i+1}-T_i) + \left[\log B(T_i,V)\,\sigma(T_i,V) - \log B(T_i,U)\,\sigma(T_i,U)\right]\left(W^P_{T_{i+1}} - W^P_{T_i}\right) \qquad (8.31)

At this point therefore, we can summarize that:

· We have a set of log B terms which are constant in the interval ∆t

· We have obtained a set of σ which are also constant in ∆t

Now from market data, as previously mentioned, we know that zero coupon

bonds must be globally log-normal. In our model, we have the term \left[\Gamma(t,U) - \Gamma(t,V)\right]dW_t^P, with ∆Γ independent of ∆log B.


Instantaneously therefore, we have constructed a model that is constant during

∆t, and so is lognormal. Globally however, the model still presents stochasticity for

∆Γ.

Notice finally that for any form of constant Γ (where we take the left Ti value), if

the particular case occurs in which Γ(t,U) = Γ(Ti,U), our approximation then becomes

exact.
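A minimal sketch of one Monte Carlo step of the frozen-coefficient update (8.31): over [T_i, T_{i+1}] both log B(T_i, ·) and σ(T_i, ·) are frozen at their left-endpoint values, so only the Brownian increment is random. All numerical values below are illustrative, not the bank's implementation:

```python
import numpy as np

# One step of the log bond-ratio update: drift and diffusion coefficients are
# constant over the interval, only W(T_{i+1}) - W(T_i) is stochastic.
def step_log_ratio(logB_V, logB_U, sig_V, sig_U, dt, dW):
    gU = logB_U * sig_U                 # Gamma(T_i, U), frozen on [T_i, T_{i+1}]
    gV = logB_V * sig_V                 # Gamma(T_i, V)
    drift = 0.5 * (gU**2 - gV**2) * dt  # deterministic part of (8.31)
    return drift + (gV - gU) * dW       # increment of log[B(., V) / B(., U)]

rng = np.random.default_rng(1)
dlog = step_log_ratio(logB_V=np.log(0.95), logB_U=np.log(0.90),
                      sig_V=0.15, sig_U=0.18, dt=0.25,
                      dW=rng.normal(0.0, np.sqrt(0.25)))
```

Setting dW = 0 isolates the deterministic drift, which is a convenient sanity check when implementing the scheme.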

4th Approach: Shifted Black

To this point, we have developed a log-normal model:

[Fig. 8.2. HJM dynamics for a lognormal model: flat Black volatility across strikes K]

following dS = \sigma S\,dW

And a normal model:

Fig. 8.3. HJM dynamics for a normal model: skew


following dS = \lambda\,dW, where \lambda is a constant.

Typically the implied volatilities quoted by market data form a smile that is

somewhat in between both normal and lognormal models. As shown by the following

equation, we choose a parameter α to interpolate between the more or less flat curve

presented by the quasi log-normal version and the negative slope curve produced by

the normal (Gaussian) version.

dS = (\ldots)\,dt + (\sigma S + \lambda)\,dW \qquad (8.32)

where the term σS is log-normal, and the term λ is normal.

We can rewrite the above slightly differently so as to better understand its

dynamics. We therefore replace the constant λ by a known S0 and insert a factor of

interpolation α that allows us to modify the slope between a lognormal model for α =

1 and a normal model for α= 0.

dS = (\ldots)\,dt + \sigma\left[\alpha S + (1-\alpha)S_0\right]dW_t \qquad (8.33)

Our Black shifted model therefore now becomes

\Gamma(t,T) = \sigma(T_i,U)\left[\alpha(T_i,U)\,\log B(T_i,U) + \left(1-\alpha(T_i,U)\right)\log\frac{B(0,U)}{B(0,T_i)}\right] \quad \forall t\in[T_i,T_{i+1}] \qquad (8.34)

we will refer to σ as the general volatility level, and to α as the skew. Both are

now entirely deterministic functions.
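A minimal Euler sketch of the shifted dynamics above, showing how α interpolates between the lognormal model (α = 1) and the normal model (α = 0). Parameters, the omitted drift and the function name are illustrative:

```python
import numpy as np

# Euler paths for dS = sigma * (alpha * S + (1 - alpha) * S0) dW.
# alpha = 1 gives lognormal (flat smile) dynamics, alpha = 0 normal (skewed).
def shifted_paths(S0=0.04, sigma=0.2, alpha=0.5, n_steps=100, n_paths=1000,
                  dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    S = np.full(n_paths, S0)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        S = S + sigma * (alpha * S + (1.0 - alpha) * S0) * dW  # driftless step
    return S

terminal = shifted_paths()
```

With no drift term the discretised process is a martingale, so the sample mean of the terminal values should stay close to S0 for any α.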


5th Approach:

Notice firstly that if we take α > 1 we obtain

[Fig. 8.4. HJM dynamics for alpha parameters greater than 1: Black volatility against strike K]

We realize that our model is constrained by two limiting values

• if α < 0 there is a limiting maximum that we can never touch

• if α > 1 there is a limiting minimum that we can never attain

We decide that we would like to be able to access the entire range of prices with

our model, and still maintain 0 < α < 1. So as to not restrict ourselves, we decide to

include a new parameter that will enable us to attain very steep slopes - both negative

and positive.

\Gamma(t,U) = \sigma(T_i,U)\cdot V(T_i,U)\cdot\left[\alpha(T_i,U)\,\log B(T_i,U) + \left(1-\alpha(T_i,U)\right)\log\frac{B(0,U)}{B(0,T_i)}\right] \qquad (8.35)

We are now multiplying our previous volatility by a new term V(Ti,U). The

question is, what expression must this V(Ti,U) undertake. We have already seen a

number of ideas in the model section of Stochastic Volatility. There, the simplest


possible expression was suggested as the SABR formulation. As we will see further

on, the analysis of stochastic ‘volatilities of volatilities’ is an important part of the

development undergone in this project.

Other Approaches

Another alternative would be to consider the mathematical formulation that is

currently being used as a first order Taylor expansion, and to extend it for instance to

a second order expansion. This would imply that instead of

dS = (\ldots)\,dt + \sigma\left[\alpha S + (1-\alpha)S_0\right]dW = (\ldots)\,dt + \sigma\left[S_0 + \alpha(S_t - S_0)\right]dW \qquad (8.36)

we would now include a new calibration parameter λ, obtaining an expression of

the form:

dS = (\ldots)\,dt + \left[\sigma\left(S_0 + \alpha(S - S_0)\right) + \lambda(S - S_0)^2\right]dW \qquad (8.37)

Other alternatives include a different form of interpolation between the normal and lognormal forms. Instead of performing a linear interpolation as mentioned earlier, we could perform a geometric interpolation. For instance, instead of

\alpha(T_i,U)\,\log B(T_i,U) + \left(1-\alpha(T_i,U)\right)\log\frac{B(0,U)}{B(0,T_i)} \qquad (8.38)

we could consider using

\left[\log B(T_i,U)\right]^{\alpha(T_i,U)}\left[\log\frac{B(0,U)}{B(0,T_i)}\right]^{1-\alpha(T_i,U)} \qquad (8.39)


8.4 Controlled correlation

In the models we have seen up until now, the correlation structure among the

bond prices is always implicit. A way to control this correlation structure is by

changing W_t^P in the model to a two dimensional Brownian motion Z_t = (Z_t^1, Z_t^2), and to therefore consider a vector-valued \Gamma(t,T) given by

\Gamma(t,T) = \begin{pmatrix}\Gamma_1(t,T)\\ \Gamma_2(t,T)\end{pmatrix} = \sum_{j\ge 0}\gamma_{j+1}(T)\,\chi_{(t_j,t_{j+1}]}(t)\begin{pmatrix}\cos\theta_{j+1}(t,T)\\ \sin\theta_{j+1}(t,T)\end{pmatrix} \qquad (8.40)

We can think of this as

\Gamma(t,T)\,dW_t^P \;\rightarrow\; \sum_{j\ge 0}\gamma_{j+1}(T)\,\chi_{(t_j,t_{j+1}]}(t)\left(\cos\theta_{j+1}(t,T)\,dZ_t^1 + \sin\theta_{j+1}(t,T)\,dZ_t^2\right) \qquad (8.41)

This modification could be included in any of the versions above, and would give

us one more model parameter θ(t; T).

The insertion of the two factors therefore provides an element of de-correlation between each of the different interest rate terms r_t, defined by their maturities T_i. Without this modification, an increase in the short term interest rates would necessarily result in a similar increase in the long term rates, implying a correlation very close to 1, and thus would always lead to vertical displacements of the entire curve.

Instead, with the inclusion of a de-correlation term, we can allow each interest

rate to vary differently in time, allowing for evolutions from flat curves, to positive

gradients, to other more complex interest rate functions.
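A small sketch of this de-correlation mechanism: with a two-dimensional Brownian motion, the shock assigned to each maturity is cos θ dZ¹ + sin θ dZ², so the instantaneous correlation between two maturities equals the cosine of the angle difference. The maturities and angles below are illustrative:

```python
import numpy as np

# Shock for maturity j is cos(theta_j) dZ1 + sin(theta_j) dZ2, so the
# correlation between maturities i and j is cos(theta_i - theta_j).
theta = {2.0: 0.0, 30.0: np.pi / 3}          # maturity (years) -> angle

rng = np.random.default_rng(2)
dZ = rng.normal(0.0, 1.0, size=(2, 100000))  # independent unit-variance factors
shock = {T: np.cos(a) * dZ[0] + np.sin(a) * dZ[1] for T, a in theta.items()}

implied_corr = np.corrcoef(shock[2.0], shock[30.0])[0, 1]
target_corr = np.cos(theta[2.0] - theta[30.0])   # cos(pi/3) = 0.5
```

Choosing θ equal for all maturities recovers the one-factor case (correlation 1, purely vertical curve moves), while spreading the angles lets short and long rates move apart.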


[Fig. 8.5. Rate curve evolution in time for a correlation = 1 amongst interest rates]

[Fig. 8.6. Rate curve evolution in time allowing for de-correlation among different interest rates]


8.5 Tangible Parameter Explanation

To this point, we have seen that the diffusion of the stochastic volatility Γ has

been modelled through 3 parameters and an element of de-correlation. We seek now

to gain a deeper understanding of how they truly behave within our model. Let us

recall that we had:

\Gamma(t,U) = \sigma(T_i,U)\cdot V(T_i,U)\cdot\left[\alpha(T_i,U)\,\log B(T_i,U) + \left(1-\alpha(T_i,U)\right)\log\frac{B(0,U)}{B(0,T_i)}\right] \qquad (8.42)

where we could for example take the simplest possible form of stochastic

volatility, V as:

V(t,T) = e^{\gamma(t,T)\,Z_t} \qquad (8.43)

Our parameters are therefore:

σ : global volatility level

α : skew or slope

γ : smile, i.e. the volatility of volatilities (Vol of Vol)

We shall refer to our HJM framework as being 2-factor whenever we pursue a

two dimensional approach in the modelling of the Brownian motion.

dW(t) \;\rightarrow\; \sin\theta(T_i,U)\,dW_1(t) + \cos\theta(T_i,U)\,dW_2(t) \qquad (8.44)

We seek to be able to model products that present the following behaviour:


[Fig. 8.7. Typical vanilla dynamics for different maturities: Black volatility against strike shows a smile for short maturities and a skew for long maturities]

In the 3 dimensional view below, we have attempted to go further than the 2 dimensional representation. Note that the time scale tends to more recent dates as we look into the 3D view, meaning that the product shows the greatest smiles for the most recent dates, and greater skew for very long maturities.

The representations that we have brought here are indeed entirely schematic and

extremely smooth. We refer the reader to later chapters such as the SABR Caplet

surface in section 18.2 to see how real market quotes produce much more irregular

surfaces, and how it is only after an adequate ‘massaging’ process that the data is

finally smoothened out into the below form.

[Fig. 8.8. Smile to skew deformation with maturity: 3D surface of Black volatility against strike and time]

8.5.1 Impact of Sigma

The sigma parameter represents the global volatility level of the vanilla product

that we analyse. This is, it represents a sort of measure of the stochastic deviation that

the product can suffer with respect to its average drift trend. The sigma is a parameter

that is generally, very closely related to the ‘at the money’ Black’s volatility i.e. to the

product’s volatility at its forward value. This is because it is this point, amongst all

other possible vanilla strikes, which defines the overall level of the smile presented.

[Fig. 8.9. Sigma parameter: global volatility level. Black volatility against strikes for sigma = 16%, 18% and 20%]

Indeed, we see that the principal effect of the sigma is to set a global volatility level. The higher we set this sigma, the higher the curve rises on a Black volatility scale, and therefore also in price, since the two measures are directly correlated. Note that typical values for sigma are within the range of 8 – 20%. Note also that we have constructed the above for a lognormal case, which is why the slope is flat: α = 1. If we were to have constructed the same with α = 0, we would have obtained a set of skewed graphs that would be, once again, vertically displaced in Black volatility with respect to each other.



8.5.2 Impact of Alpha:

[Fig. 8.10. Alpha parameter: skew. Black volatility against strikes for Alpha = 40%, 80% and 130%]

We see from the above behaviour that the alpha clearly acts as a slope or skew parameter. That is, as we make alpha tend towards 1, it increases the weighting on the lognormal component of the volatility, thus tending towards a flat slope. If instead we make alpha tend towards 0, we revert towards a normal model, presenting a clearly defined skew, as shown by the negative slope above. Other values are allowed, but their interpretation is not so clear.


8.5.3 Impact of the Stochastic Volatility

[Fig. 8.11. Stochastic Volatility: smile creation. Black volatility against strikes for VolOfVol = 20%, 40% and 60%]

As can be seen above, the stochastic volatility successfully produces the sought-after smile effect, increasing Black's volatility for very high and very low strikes. This is because the stochastic volatility attributes greater weights to the more extreme values in strikes, thus having a much more pronounced effect at either end.
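A sketch of this smile-creation mechanism, assuming the simple lognormal volatility factor V = e^{γZ} of (8.43): calls are priced as a mixture of Black prices over random volatility draws and then inverted back to implied volatility. All parameter values are illustrative, not calibrated:

```python
import numpy as np
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_call(F, K, sigma, T):
    """Undiscounted Black (1976) call price on a forward F."""
    d1 = (log(F / K) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    return F * norm_cdf(d1) - K * norm_cdf(d1 - sigma * sqrt(T))

def implied_vol(price, F, K, T, lo=1e-4, hi=3.0):
    for _ in range(60):                     # bisection: price increases in sigma
        mid = 0.5 * (lo + hi)
        if black_call(F, K, mid, T) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

F, T, sigma, gamma = 0.04, 1.0, 0.16, 0.4   # gamma = vol of vol
rng = np.random.default_rng(3)
V = np.exp(gamma * rng.normal(size=20000) - 0.5 * gamma**2)  # E[V] = 1

smile = {}
for K in (0.02, 0.04, 0.06):
    price = np.mean([black_call(F, K, sigma * v, T) for v in V])
    smile[K] = implied_vol(price, F, K, T)  # mixture lifts the wings
```

The implied volatilities at the low and high strikes come out above the at-the-money value, reproducing the smile shape described above.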


9. Numerical Methods

In this section our intention is that of introducing the various numerical engines

that are available when attempting to solve a stochastic differential equation. We will

see that there are three principal alternatives: MonteCarlo simulations, tree diagrams,

and partial differential equation solvers.

We set out with a model of the form: dS_t = r_t S_t\,dt + \beta_t\,dW_t^P

We have already seen that both rt and βt can be stochastic.

There are also three main methods or approaches that can be considered when

analysing a product and its associated model.

• An analytical formula- such as the case of Black Scholes, that can be solved

directly

• Semi analytic formulas – can be solved almost completely using a direct approach,

apart from an occasional integral which must be performed numerically although

without any further complications

• All other combinations of models and products require a numerical computation

in order to achieve a solution. Such numerical approaches always require some

means of discretisation.


9.1 Discretisation

Theoretical models consider:

• ∞ time intervals- i.e. continuous time

• ∞ possible paths that the underlying asset can take between two discretised time

intervals t and t + dt.

Any model in practice can only deal with

• a finite number of future time steps

• a finite number of possible paths between any two time intervals

Anything different from this would require an infinite calculation time.

As stated initially, there are three main approaches by which a discretised model

can be tackled. These are:

• MonteCarlo simulations

• Tree Diagrams

• PDE (Partial Differential Equations) Solvers

We set out in this introduction using the simplest option we can conceive: a call

as our asset of study.


[Fig. 9.1. Call future scenarios generation: possible paths P1–P4 of the underlying between today and maturity T, with terminal payoff (S_T − K) above the strike K]

We will ignore all that has occurred in the past, starting from the present time

instant t = 0, and estimating the future values that our call can take. It is in this future

projection that the mathematical engine comes into play. It acts as a numerical

resource generating future scenarios and identifying all the different paths that an

asset can possibly follow before arriving at the future date.

After this generation, the next step is to equip each of the paths with its own

outcome probability. For each trajectory, we will analyse the gains or losses that we have incurred as investors.

The engine then proceeds to calculate the sum over all the possible outcomes,

weighting them with respect to their probabilities.

9.2 MonteCarlo

In the field of financial mathematics, Monte Carlo methods often give better results for high-dimensional integrals, converging to the solution more quickly than numerical integration methods, requiring less memory and being easier to program. The advantage that Monte Carlo methods offer increases with the dimension of the problem.


In our model’s scope, we must first determine the number of points where the engine will have to stop and construct a probability distribution for the product. A more complex product with fixings at numerous intervals will logically require the engine to perform the previously explained weighting of probabilities at a large number of points. The required calculation dates are almost always defined by the product itself: they tend to be imposed by the fixings or maturities at which the product must be evaluated. Note that a specific model can also impose the need for certain time steps, as in the case of our HJM.

Through the MonteCarlo approach, a finite number of paths is chosen, say n = 10,000. For each, we must construct a possible evolution of the underlying asset S. But recall that the asset follows \( dS_t = r_t S_t\,dt + \beta_t\,dW_t^P \), which depends on a Brownian motion. We must therefore firstly construct our model’s Brownian variables.

For this reason, a random number generator is used to create n variables x_n with 0 < x_n < 1 (excluding both 0 and 1). Each of these random numbers x_n is easily transformed into a Gaussian variable φ_n by using an inverse Normal transformation.

That is, if we consider the cumulative frequency diagram for a Normal distribution, we have that for any number 0 < x_n < 1

Fig. 9.2. Normally distributed variable generation from random numbers in the (0,1) interval


we can write that \( x_i = N(\varphi_i) = \operatorname{Prob}(\varphi < \varphi_i) \). We can therefore perform an inverse transformation to find \( \varphi_i = N^{-1}(x_i) \).

At this point we have already established the foundations for the construction of

our Brownian variables:

Remember that we had already seen that \( dW_i = W_{T_k} - W_{T_{k-1}} = \sqrt{T_k - T_{k-1}}\;\varphi_i \)

Meaning that any Brownian motion could be decomposed into its temporal

component and its Normal (Gaussian) component. However, we also know that

Brownian motions satisfy \( W(0) = 0 \), with \( T_0 = 0 \).

This allows us to calculate every individual Brownian value:

\[ W_{T_1} - W_{T_0} = W_{T_1} - 0 = W_{T_1} = \sqrt{T_1 - T_0}\;\varphi_i = \sqrt{T_1}\;\varphi_i \tag{9.1} \]
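As an illustrative sketch (an addition, not part of the original text), the inverse-transform step of Fig. 9.2 and equation (9.1) can be written in Python using only the standard library; the path count n, the horizon T_1 and the seed are arbitrary choices:

```python
import math
import random
from statistics import NormalDist

def gaussians_from_uniforms(n, seed=0):
    """Map n uniform draws x in (0,1) through the inverse normal CDF,
    phi = N^{-1}(x), as in Fig. 9.2."""
    rng = random.Random(seed)
    return [NormalDist().inv_cdf(rng.random()) for _ in range(n)]

def first_brownian_values(T1, n, seed=0):
    """Equation (9.1): W_{T1} = sqrt(T1 - T0) * phi_i, with T0 = 0, W(0) = 0."""
    root = math.sqrt(T1)
    return [root * phi for phi in gaussians_from_uniforms(n, seed)]

w = first_brownian_values(T1=0.5, n=50_000)
# the sample mean of W_{T1} is close to 0, the sample variance close to T1
```

With 50,000 draws the sample moments reproduce the decomposition of the Brownian into its temporal and Gaussian components.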

This process must be repeated for each of the i = 1,…, n paths.

Subsequently we can calculate the following path step at T2 by simply generating

another set of random variables yi from T1 to T2. All the yi must be independent from

x_i. Thus \( V_i = N^{-1}(y_i) \), with

\[ W_{T_2} - W_{T_1} = \sqrt{T_2 - T_1}\;V_i \tag{9.2} \]

In the above equality, all the terms are known except for \( W_{T_2} \), which we must solve for.

Returning to our diffusion equation:

\[ dS_t = r_t S_t\,dt + \beta_t\,dW_t^P \tag{9.3} \]

we see that we can apply a similar approach to calculate the value of our asset at

every time interval.

S(0) is known at present for T0=0. Then


\[ S_{T_1} - S_{T_0} = \left[ dS \right]_{T_0,T_1}^{\,i} \tag{9.4} \]

but we know \( \left[ dS \right]_{T_0,T_1} \) from our diffusion equation, since it can be rewritten as

\[ \left[ dS \right]_{T_0,T_1} = r_t S_{T_0}\,(T_1 - T_0) + \beta\,\big( W_{T_1}^P - W_{T_0}^P \big) \tag{9.5} \]

where all the Brownians have been previously calculated. We can therefore directly solve for the asset’s future price:

\[ S_{T_1} = S(0) + r_t S_{T_0}\,(T_1 - T_0) + \beta\,\big( W_{T_1}^P - W_{T_0}^P \big) \tag{9.6} \]

Note that we had said that we would generate n paths. We will consider each

path as having an equal probability of occurring, therefore a probability of 1/n. This is

equivalent to taking a simple random sample of all the possible trajectories.

Finally, for each path i of the product, we will have to evaluate the final product’s

payoff, F(Tk)i ; that is, how much we will receive or pay for the product in that given

scenario. What in fact must be analysed is the value today of that product in its future

scenario. We must therefore discount to today all of its possible cash flows at each of

their different payoff fixing dates:

\[ A_i = \sum_k F(T_k)_i \; e^{-\int_0^{T_k} r(s)\,ds} \tag{9.7} \]

We repeat this valuation for each path scenario i = 1, …, n, analysing each of the payoffs at every time interval k, that is, at T_0 = 0, …, T_n.

The final price value that we select for our product is the weighted average of all

the product values in each path scenario i, with respect to the probability of each path

occurring (as stated, this was 1/n for each). Therefore we end up with

\[ \text{product value} = \frac{1}{n} \sum_i A_i \tag{9.8} \]
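The whole MonteCarlo loop above can be sketched in Python (standard library only). This is a minimal illustration, not the thesis implementation: it assumes a constant short rate r and constant β, a single payoff at maturity, and illustrative values for the path count, the step count and the strike:

```python
import math
import random
from statistics import NormalDist

def mc_price(payoff, S0=100.0, r=0.05, beta=20.0, T=1.0,
             steps=25, n=10_000, seed=42):
    """MonteCarlo price for dS_t = r S_t dt + beta dW_t^P (constant r, beta).

    Each of the n equally probable paths is built from uniforms mapped
    through N^{-1}; the payoff F(T) is evaluated on each path, discounted
    by e^{-rT} as in (9.7) with constant r, and the 1/n-weighted average
    of (9.8) is returned."""
    rng = random.Random(seed)
    inv = NormalDist().inv_cdf
    dt = T / steps
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(n):
        S = S0
        for _ in range(steps):
            S += r * S * dt + beta * sq * inv(rng.random())  # Euler step
        total += math.exp(-r * T) * payoff(S)                # discounted A_i
    return total / n                                          # (1/n) sum_i A_i

call = mc_price(lambda S: max(S - 100.0, 0.0))
put = mc_price(lambda S: max(100.0 - S, 0.0))
# with common paths, call - put should sit close to S0 - K e^{-rT}
```

Since both prices are computed from the same seed, put–call parity gives a quick sanity check on the discretisation.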


9.2.1 Limitations

The main limitation of a MonteCarlo approach is the fact that it does not work well for forward-dependent products, whereas it works extremely well for backward-dependent products. That is, if we settle at a future date (T_k)_i within our path scenario horizon, we have only one unique path i leading to that point. This means that we can easily evaluate the product at that node’s previous conditions, since it is completely determined by that one path. However, if we had to generate a forward evaluation of the same product, that is, if we had to find the value of the product that were to evolve from that point onwards, we would need to perform a calculation of the sort:

\[ V_{T_k} = E_{T_k}^{P}\!\left[ V_{T_q}\; e^{-\int_{T_k}^{T_q} r(s)\,ds} \right] \tag{9.9} \]

The value at a future time Tk depends on an expected value of these future

developments. But in the MonteCarlo analysis we only have one path leaving the point being studied, so we cannot really perform an average over a unique path.

A solution could be envisioned as an algorithm that would create a further 10,000

paths leaving from each evaluation node. We cannot realistically consider a

MonteCarlo operating in this way. It would definitely enable us to calculate the future

expected value that we came across previously, as we would now have a set of paths

over which to integrate. However, the proposed method would cause our calculation

time to explode exponentially.

MonteCarlo is therefore only useful for past-dependent products.

9.3 Tree Diagrams

The basic idea behind any tree diagram used in financial analysis is very similar

to that used for the MonteCarlo. Its aim is to generate a number of possible paths

along which the asset of study can evolve towards future scenarios. Each path is then

weighted with a specific probability depending on the possibility that the asset

follows that route. The algorithm ends at a predetermined date at which the payoff of


the product is evaluated. At this point, just as in the MonteCarlo case, the probability

distribution for each possible outcome scenario is computed. The mean expected

payoff is then taken as the final value for our product. This must subsequently be

discounted back to present so as to know its trading value today. The number of

branches that sprout from each node can be chosen at will. In general the simplest

forms are the binomial and trinomial variations, although other multinomial

extensions can easily be constructed.

Binomial – 2 branches:

Fig. 9.3. Binomial tree

Trinomial – 3 branches:

Fig. 9.4. Trinomial tree


Each of these may or may not have a symmetric setting for their branches, and

may or may not use a symmetric distribution for the probabilities to be assigned to

each of the possible routes.

9.3.1 Non Recombining Tree Diagram:

Analogous to the MonteCarlo simulation, each node here has a unique path leading to it, and a number n of branches leading away from it. As with the MonteCarlo discussion, this alternative is impossible to implement, as it rapidly diverges towards an infinite number of simulations: the binomial case, for instance, generates 2^k paths after k time steps, and the trinomial case 3^k.

Fig. 9.5. Non recombining binomial tree

9.3.2 Recombining Tree Diagrams:

Compared with the previous forms of tree diagrams, the recombining tree is the only viable alternative for a practical implementation. It allows the algorithm to reach the same node via several different paths. The binomial case is no longer exponential in the number of nodes created: it has only k + 1 nodes at the k-th time step. In the trinomial case the number is slightly greater, with 2k + 1 nodes at each time step.


It is important to note that in a binomial tree, we reduce the infinite paths that can

stem from any single node to only two possibilities. We must assign to each of these branches a particular probability, which we will denote Prob_up and Prob_down. It is

evident that the sum of these two probabilities must equal one. In fact, we have three

equations that enable us to evaluate each of the two probabilities. These are:

• The zeroth moment: \( \sum \text{prob} = 1 \)

• The first moment: \( \text{Node}_{T_k} = E_{T_k}\!\left[ \text{Node}_{T_{k+1}} \right] \)

• The second moment: discrete variance = theoretical variance

The three equations above define a unique Gaussian variable. Note that a normal distribution is defined entirely by its moments M_0, M_1 and M_2.

In fact, what we have here is a set of three equations and four unknowns:

Prob_up, Prob_down, S_{1,up}, S_{1,down}

Fig. 9.6. Binomial tree probabilities

It is therefore necessary to fix one of the four parameters in order to solve for the other three. Typically, we seek a symmetry of the form Prob_up = 0.5, Prob_down = 0.5.
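As a sketch of this moment-matching step (an added illustration; constant r and β over the step, and a discretely compounded mean S(1 + rΔt), are assumptions of the sketch):

```python
import math

def moment_matched_branches(S, r, beta, dt):
    """With Prob_up = Prob_down = 0.5 fixed, solve for the remaining two
    unknowns S_up, S_down from the first two moments of
    dS = r S dt + beta dW over one step:
        mean      m = S (1 + r dt)
        variance  v = beta**2 * dt"""
    m = S * (1.0 + r * dt)
    s = math.sqrt(beta * beta * dt)
    return m + s, m - s  # S_up, S_down

S_up, S_down = moment_matched_branches(S=100.0, r=0.05, beta=20.0, dt=0.25)
# 0.5 * S_up + 0.5 * S_down reproduces the mean m, and
# 0.5 * (S_up - m)**2 + 0.5 * (S_down - m)**2 reproduces the variance v
```

Fixing the two probabilities at 0.5 leaves exactly two unknowns for the two remaining moment equations, which is why the system becomes solvable.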


Fig. 9.7. Recombining binomial tree

In the trinomial recombining tree we would obtain the following graphical

representation:

Fig. 9.8. Recombining trinomial tree
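A recombining binomial tree can be coded in a few lines. The sketch below (an addition for illustration) uses the lognormal Black–Scholes dynamics dS = r S dt + σ S dW and the classic Cox–Ross–Rubinstein parametrisation, for which the tree recombines; it is not the document’s normal-dynamics model:

```python
import math

def crr_call_price(S0, K, r, sigma, T, steps):
    """Recombining binomial (Cox-Ross-Rubinstein) tree for a European call
    under lognormal dynamics dS = r S dt + sigma S dW. At step k the tree
    holds only k + 1 nodes; pricing rolls the payoff back through the tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)   # risk-neutral up-probability
    disc = math.exp(-r * dt)
    # terminal payoffs at the steps + 1 recombined nodes
    values = [max(S0 * u**j * d**(steps - j) - K, 0.0)
              for j in range(steps + 1)]
    for _ in range(steps):                 # backward induction to t = 0
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]

price = crr_call_price(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, steps=200)
```

With 200 steps the tree price should sit close to the Black–Scholes closed form, while the node count grows only linearly in the number of steps.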

The tree diagram algorithm has properties that are the complete opposite of those

presented by the MonteCarlo algorithm. A tree diagram turns out to be very good for

future-dependent products, since there is no longer the problem of a unique path


leaving a particular node. Thus, a probability can be computed when calculating a future expectation by taking into account the numerous paths that emerge from any given node. In contrast, the tree method is not good with backward-dependent products, as any node (T_k, S_k)_i cannot be traced back through a particular path, since it possesses a number of possible routes.

The main problem arises when we are faced with products that are both forward- and backward-dependent. Neither of the two previous methods can realistically be applied to this kind of situation and still yield suitable results.

In the 1990s, the Hull–White model developed a series of tricks to avoid such difficulties in some of its products. These, however, remain limited, and inefficient in the case where the products have many payoff dates.

Between 1995 and 2000, a series of techniques was created enabling the MonteCarlo method to tackle future-dependent products such as American contracts. From 2002 onwards, development has been directed towards the Longstaff–Schwartz approach. A further idea that seems promising, but that has never reached a concrete form, is the Willow tree.

9.4 PDE Solvers

This involves an entirely different approach from that of the two previous methods. It no longer deals with the probabilities and Brownian motions necessary to solve the diffusion equations encountered. Instead, it solves deterministic partial differential equations, eliminating all the stochasticity of the problem. In one of its variations, the method is a more generic approach that encompasses the recombining tree diagram. Indeed, it is capable of incorporating tree diagrams as a specific sub-case in which the PDE mesh is generated in a triangular form.

The basic development of the equations used can be considered as follows. Let us consider the basic stochastic differential equation of any tradable asset. This is:


\[ dS_t = r_t S_t\,dt + \beta_t\,dW_t^P \tag{9.10} \]

By applying Ito’s lemma to the product’s payoff, which we shall denote V(t, S_t), then

\[
\begin{aligned}
dV &= \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS_t + \frac{1}{2}\,\beta_t^2\,\frac{\partial^2 V}{\partial S^2}\,dt \\
   &= \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\left( r_t S_t\,dt + \beta_t\,dW_t^P \right) + \frac{1}{2}\,\beta_t^2\,\frac{\partial^2 V}{\partial S^2}\,dt \\
   &= \left( \frac{\partial V}{\partial t} + r_t S_t\,\frac{\partial V}{\partial S} + \frac{1}{2}\,\beta_t^2\,\frac{\partial^2 V}{\partial S^2} \right) dt + \beta_t\,\frac{\partial V}{\partial S}\,dW_t^P
\end{aligned}
\tag{9.11}
\]

Under the risk neutral probability, every asset must have the same average yield.

This means that the payoff can also be written generically as

\[ dV_t = r_t V_t\,dt + \gamma_t\,dW_t^P \tag{9.12} \]

Since both the product and its payoff must have the same yield, we can therefore

equate the two drift terms through the Black Scholes formula:

\[ r_t V_t = \frac{\partial V}{\partial t} + r_t S_t\,\frac{\partial V}{\partial S} + \frac{1}{2}\,\beta_t^2\,\frac{\partial^2 V}{\partial S^2} \tag{9.13} \]

With this procedure we have now eliminated the Brownian component of the

equation, and with it, all the probability distributions that it implies. They are still

present in the equation, but implicitly, behind the terms V, S, and β.

Having done this, we proceed to construct a mesh for the PDE solver. This can be

of varying forms, and we shall outline briefly here just the simplest method so as to

achieve a general understanding of the procedure to be followed.


[Figure: (t, S) mesh with time points 0, t_1, t_2, …, T; terminal nodes S_{T,1}, S_{T,2}, S_{T,3}, … carry payoffs V_{T,i} = S_{T,i} − K, and the lower boundary imposes V = 0.]

Fig. 9.9. PDE mesh and boundary conditions

Note that we have written the underlying on the vertical axis as reference, but we

will in fact be dealing with the underlying’s payoff V, represented on the right hand

side of our graph as another vertical axis.

We can impose certain boundary conditions to restrict our mesh. For example, if the value of S_t drops too far, it is clear that our product, priced V (proportional to (S − K)^+), will not be profitable, meaning that the option will not be exercised, and its value will drop to 0. In this way we have already imposed the lower

horizontal boundary for our mesh. In addition, we can also exclude any value of S

which is excessively high.

Moreover, at maturity, we have a fixed date which constrains us by imposing a

vertical limit. It is itself divided into all the possible future values that ST,i can take-

imposing our mesh divisions. We can therefore construct all the corresponding

payoffs for these different asset prices ST,i as:

\[ V_{T_i} = E\!\left[ e^{-rT} \left( S_{T_i} - K \right) \right] \tag{9.14} \]

where, for t = 0, we know that V = S_0, thus defining our final vertical boundary at the left of the above mesh.

The resemblance to the tree diagram is now very clear. The above method allows us to start at a future node T, where the value of the product is already known, and work backwards in time using the discrete equation


\[ r_t V_t = \frac{\partial V}{\partial t} + r_t S_t\,\frac{\partial V}{\partial S} + \frac{1}{2}\,\beta_t^2\,\frac{\partial^2 V}{\partial S^2} \tag{9.15} \]

Let us start for instance at the top left hand corner:

[Figure: mesh nodes k − 1, k, k + 1 in time and i − 1, i, i + 1 in S, with boundary conditions along the edges.]

Fig. 9.10. First PDE algorithm steps

The procedure chosen to discretise the different terms in the equation from this

point onwards can be extremely diverse. For instance, we can take, according to

Taylor expansions:

\[ \frac{\partial V}{\partial t} \;\to\; \frac{V_{t_{k+1}} - V_{t_k}}{t_{k+1} - t_k} \tag{9.16} \]

\[ \frac{\partial V}{\partial S} \;\to\; \frac{V_{i+1}(t_k) - V_{i-1}(t_k)}{S_{i+1}(t_k) - S_{i-1}(t_k)} \tag{9.17} \]

\[ \frac{\partial^2 V}{\partial S^2} \;\to\; \frac{V_{i+1}(t_k) - 2\,V_{i}(t_k) + V_{i-1}(t_k)}{\big( S_{i+1}(t_k) - S_{i}(t_k) \big)\cdot\big( S_{i}(t_k) - S_{i-1}(t_k) \big)} \tag{9.18} \]

The success of the PDE solver depends mainly on the mesh used and the way in which we choose to discretise. We will not discuss further the advantages and disadvantages of explicit versus implicit discretisations, or of other methods that could have been stated; we simply name the Crank–Nicolson technique as a very broadly used method in this domain.
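A minimal explicit-scheme sketch of the procedure above (an added illustration, not the thesis implementation: constant r and β are assumed, and the grid sizes, S_max and stability check are illustrative choices):

```python
import math

def pde_call_price(S0=100.0, K=100.0, r=0.05, beta=20.0, T=1.0,
                   S_max=300.0, M=150, N=400):
    """Explicit finite-difference solver for
        r V = dV/dt + r S dV/dS + (1/2) beta^2 d2V/dS2
    on a uniform (t, S) mesh for a European call, assuming constant r, beta.
    Boundaries: V = 0 at S = 0, V = S_max - K e^{-r tau} at S = S_max,
    terminal condition V(T, S) = max(S - K, 0)."""
    dS, dt = S_max / M, T / N
    assert beta * beta * dt / (dS * dS) <= 1.0, "explicit scheme unstable"
    V = [max(i * dS - K, 0.0) for i in range(M + 1)]  # payoff at t = T
    for n in range(N):                                # march backwards in time
        tau = (n + 1) * dt                            # time left to maturity
        new = [0.0] * (M + 1)
        new[M] = S_max - K * math.exp(-r * tau)
        for i in range(1, M):
            S = i * dS
            dVdS = (V[i + 1] - V[i - 1]) / (2.0 * dS)
            d2VdS2 = (V[i + 1] - 2.0 * V[i] + V[i - 1]) / (dS * dS)
            # V(t) = V(t + dt) + dt * (r S V_S + 0.5 beta^2 V_SS - r V)
            new[i] = V[i] + dt * (r * S * dVdS
                                  + 0.5 * beta * beta * d2VdS2 - r * V[i])
        V = new
    return V[int(round(S0 / dS))]

price = pde_call_price()
```

Under these dynamics S_T is exactly Gaussian, so the output can be cross-checked against the closed-form Bachelier-type price; the explicit scheme requires β²Δt/ΔS² to stay bounded for stability, which is what the assertion enforces.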


10. Calibration

Calibration is the process by which the output of a measurement instrument is

adjusted to agree with the value of the applied standard, within a specified accuracy.

The final objective behind our calibration procedure is to be capable of pricing an

exotic interest rate product whose market value is unknown. To do so, we decompose

the complex product into a set of simple, market traded vanilla products. Thus, we

expect the combination of these simple products to be capable of replicating the

complex product’s behaviour. These simple liquid products have market quoted

prices.

The following step is to create an HJM model capable of correctly pricing these

simple products. For this, we adjust our HJM’s model parameters until its output

price for each of these plain vanilla products coincides with their true market value.

Having reached this point, we have therefore obtained a set of parameters which

correctly model the entire group of vanilla products. Further, we expect that with

these parameters, our HJM model will also be capable of modelling a more complex

exotic product which is in some manner a combination of these simple products.

Thus, inputting into our HJM model the calibrated parameters, the exotic

product’s characteristics and the market rates, we obtain a final price for our exotic

product.


10.1 Algorithm

We will now continue to explain in further detail how the overall algorithm

works. We start by presenting a simple flowchart with the main processes involved.

[Flowchart: the market rates, the N liquid vanilla products and the model parameters α, σ, γ, θ feed the PRICER; if the N model prices do not match the N market prices, the parameters are modified (Newton–Raphson) and the loop repeats; once they match, the model parameters α, σ, γ, θ are exported.]

Fig. 10.1. Calibration Process: Vanilla Products

As we have already stated, the aim of the calibration is to be able to price a new exotic financial product. For this we require:

1. The current interest rates taken from the market, including spot rates, dividends,…

2. The second element we take from the market is the set of characteristics of n products. These must include a complete description of the various payoffs and cash flows they generate at each of their possible fixings. The products should be similar to the new product we want to price. In this vanilla product selection, we stress

Page 165: HJM Framework

A Practical Implementation of the Heath – Jarrow – Morton Framework

151

• That the products should be very liquid, i.e. that their market price is exact and reliable (plain vanillas)

• That the products must have a very similar risk profile to the exotic product we attempt to evaluate.

The calibration group is configured so that it can be selected automatically or

manually by the trader himself.

In general we choose to use vanilla products, which are simple to model, and that

play the role during the calibration of giving us the market’s point of view of the risks

in our exotic product.

3. The last elements are the input parameters needed for our model to generate the pricing of a product. In the case of Black Scholes, we have seen that there is only one input parameter, the product’s σ_Black. In the HJM model, we have seen that the input data can be σ and α in our two strike model, with an additional λ in the case of the three strike model. The model parameters are responsible for simulating the different sources of risk in a product. These risks can include variations in the underlying price, movements in the interest rate curves, …

(There are situations in which the parameter of a particular model becomes so widely used that traders begin to talk of the products themselves in terms of that specific parameter; consequently, the market itself creates a value at which the parameter of each product trades. Such is the case of σ_Black, for instance, and this is now also occurring with the SABR model.)

With the above input data, we are now ready to proceed with our calibration

process. We must first select a set of initial guesses for our model parameters, typically between 0 and 1.

We then proceed to test each of the n vanilla products by comparing their market

price with the model price that our HJM pricer algorithm generates for them. If they

are different, it means that our HJM model, with its set of initial parameters, does not

correctly reproduce the market. We must modify our initial parameter values and


repeat the procedure, once again comparing the real market prices with those

generated by our model. We continue our iterations until we find a set of parameters

for which our model correctly prices the n vanilla products.

With these parameters, we are ready to tackle the exotic product. We use these

final parameters, the market interest rates and the characteristics of our new exotic

product to enter our HJM algorithm and price the exotic product.

[Flowchart: the market rates, the exotic product’s characteristics and the calibrated model parameters α, σ, γ, θ feed the PRICER, which outputs the exotic product price.]

Fig. 10.2. Calibration Process: Exotic Pricing
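The iterative loop of Fig. 10.1 can be sketched for a single parameter. Everything in this added example is an illustrative stand-in: the "model" is a plain Black–Scholes call rather than the HJM pricer, and the market quote is synthetic:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def model_price(S0, K, r, T, sigma):
    """Stand-in pricer: a Black-Scholes call (the real PRICER is the HJM model)."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma * sigma) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * nd.cdf(d1) - K * math.exp(-r * T) * nd.cdf(d2)

def calibrate(market_price, S0, K, r, T, sigma0=0.5, tol=1e-10):
    """Newton-Raphson loop of Fig. 10.1: compare model and market price,
    bump the parameter, reprice, and iterate until they agree."""
    sigma, h = sigma0, 1e-5
    for _ in range(100):
        diff = model_price(S0, K, r, T, sigma) - market_price
        if abs(diff) < tol:
            break
        slope = (model_price(S0, K, r, T, sigma + h)
                 - model_price(S0, K, r, T, sigma - h)) / (2 * h)
        sigma -= diff / slope              # Newton-Raphson update
    return sigma

target = model_price(100.0, 100.0, 0.05, 1.0, 0.20)     # synthetic market quote
sigma_cal = calibrate(target, 100.0, 100.0, 0.05, 1.0)  # recovers 0.20
```

The same bump-and-reprice Newton step generalises to the parameter vector α, σ, γ, θ, with the scalar derivative replaced by a Jacobian.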

10.1.1 An example of a vanilla product set selection

Let us consider a 10Y callable swap which is callable every year. This means that

each year, we have the option of cancelling the swap. We shall consider this our

complex exotic product.

The product, from a quant’s point of view, is equivalent to entering an inverse

swap so as to cancel the incoming cash flows. Thus, to cancel the callable swap on the

ninth year would be equivalent to having an option, in nine years’ time, to enter a 1Y swap.

[Figure: timeline from 0Y to 10Y comparing a call of the swap on the 9th year with entering a 1Y swap on the 9th year.]

Fig. 10.3. Analogy cancellable swap and inverse swap


The same can be said for all the other fixings, meaning that we can model the option to cancel in 8 years as the option to enter a 2 year swap starting in 8 years.

Thus the risk of our exotic product is modelled here by decomposition into a list of nine simple vanilla products that are considered very liquid and can be taken directly from the market:

Starting Date   Length
9Y              1Y
8Y              2Y
7Y              3Y
 :               :
1Y              9Y

Table 10.1. Exotic product risk decomposition

10.2 Calibration in Detail

For any product calibration, the trader must select a range of simple vanilla

products which he believes will adequately model his exotic product’s risk profile.

The vanilla products that we will use will commonly consist of caplets or swaptions

constructed over their corresponding forward rates.

Say for example that an exotic product has a determined fixing at time T that

depends on the 3 month EURIBOR. The trader may decide therefore to incorporate

this particular risk into his analysis by using a caplet based on the 3 month EURIBOR

whose maturity coincides with T. If instead, the trader needed to model a 1 year

EURIBOR risk at this date T, he would probably decide to use a swaption instead of a

caplet, since the swaptions are constructed over the 1 year EURIBOR. That he should

choose to use a swaption or a caplet on the 1 year EURIBOR over the same time

period would depend on a more subtle analysis that we will not enter here.

The HJM Santander model is constructed following an entirely generic approach.

We realize that an exotic product with a global maturity of U_N years may be replicated


using a wide range of intermediate vanilla products with maturities Ui ≤ UN and of

varying life spans. Therefore, the approach that gives the trader the greatest freedom

is to allow him to select any of these for his calibration.

The first thing that the trader must do is to decide on the minimum time intervals

that his product will depend on. The minimum possibility is a 3 month scenario

between each date, meaning we would proceed to divide the global maturity UN into

3 month intervals. Other options would be to have 6 month or yearly intervals.

[Figure: timeline from settlement t through the first fixing T to the global maturity U_N, divided into the chosen time intervals.]

Fig. 10.4. Decomposition of an exotic into time periods

The next step is to select the specific vanilla products for the calibration. These could have a starting or fixing time T at any of the dates U_i into which we have divided our time space, and could have a maturity U_j > T, with an upper limit equal to the product’s global maturity, thus U_j ≤ U_N. The range of all the possible vanilla products that we can incorporate into our calibration set therefore takes the form of a matrix. Below we present all the possible combinations for a given exercise date T_i.


Fig. 10.5. Decomposition of an exotic into vanillas fixing at T and with different maturities

Yet the above decomposition can equally be performed for any T_i satisfying T < T_i < U_N. The global exotic product with maturity U_N can therefore be composed of any of the product combinations presented in the matrix. Note that each cell within the matrix represents a vanilla product that fixes at its corresponding row date T, and ends at its corresponding column maturity U_i.

        U0            U1            U2            U3
T0   V(t, T0, U0)  V(t, T0, U1)  V(t, T0, U2)  V(t, T0, U3)
T1   V(t, T1, U0)  V(t, T1, U1)  V(t, T1, U2)  V(t, T1, U3)
T2   V(t, T2, U0)  V(t, T2, U1)  V(t, T2, U2)  V(t, T2, U3)
T3   V(t, T3, U0)  V(t, T3, U1)  V(t, T3, U2)  V(t, T3, U3)

Table 10.2. Ideal Vanilla calibration matrix: all data available

In practice, however, taking into account the entire matrix to calibrate our model parameters would be excessively time-consuming. It would nevertheless be the ideal goal to attain in the future.

In general, we decide instead to calibrate the most representative areas of the

above matrix. These are, firstly, the diagonal, representing vanilla products that start

at Ti and end immediately 3, 6 or 12 months after depending on the vanilla product

we are using.

A further addition that we can allow ourselves to perform is to calibrate with the

end column. This is, to take into account vanilla products that start at each of the

possible time intervals Ti and whose maturity is the product’s global maturity UN.

Thus we have the subsequent matrix appearance:

        U0            U1            U2            U3
T0   V(t, T0, U0)  V(t, T0, U1)  V(t, T0, U2)  V(t, T0, U3)
T1   V(t, T1, U0)  V(t, T1, U1)  V(t, T1, U2)  V(t, T1, U3)
T2   V(t, T2, U0)  V(t, T2, U1)  V(t, T2, U2)  V(t, T2, U3)
T3   V(t, T3, U0)  V(t, T3, U1)  V(t, T3, U2)  V(t, T3, U3)

Table 10.3. Vanilla calibration matrix: market quoted data

Or schematically:

[Figure: schematic matrix with rows T_0 … T_N and columns U_0 … U_N.]

Fig. 10.6. Schematic calibration matrix representation

Evidently, the above is not sufficient for our calibration. We complete the rest of

our matrix through interpolation and extrapolation between the known values. Thus,

we will refer to the end column and the diagonal as the Target Parameters in our

calibration set, whereas we will consider the rest of the matrix as being composed of Dependent Parameters.

Note that there is a further extrapolation that we had not mentioned earlier. This

is the initialisation of a set of values that we will also be using. For example, if our first

possible fixing is at date T, we must be aware nevertheless that as traders, we agreed

to enter this product at a previous value date t. There is definitely a time interval

between t and the first fixing T, in which the market is not static, and whose

fluctuations can affect our product before we even start the first fixing.

[Figure: timeline with value date t, first fixing T and maturity U; the product settles at t, before the first fixing.]

Fig. 10.7. Initial Variation before first Fixing

These fluctuations correspond to time intervals which would be included as the

first rows in our matrix. They too must be extrapolated. We really have:


[Figure: the calibration matrix of Fig. 10.6 with additional first rows for the dates between t and the first fixing T.]

Fig. 10.8. First Row Interpolated Data

Although for simplicity we will ignore this representation and use the former.

We must distinguish a further characteristic before we can proceed to the analysis

of results. When we are calibrating in a 1 strike model, we need to calibrate with only

one product for each matrix position, whereas in a two strike model, in each cell we

must consider two products with different strikes. Further, a 1 strike model only

calibrates with the end column, whereas a 2 strike model calibrates with both the

column and the diagonal.

10.3 Best Fit or not Best Fit?

The following aims to discuss the criteria required to determine whether we are satisfied with the approximation between model and market prices, and thus are ready to end the iterative process.

10.3.1 Minimum Square Error. 1st Method:

Given a model of M parameters and a calibration set of N liquid products:

• If N < M, the problem has an infinite set of solutions: it lacks sufficient information to be solved uniquely.

• If N ≥ M, the problem can be solved by applying a least-squares calculation, where we solve


\[ F = \min \sum_i \left( \text{Market price}_i - \text{Model price}_i \right)^2 \tag{10.1} \]

The main advantage of this method is the fact that the calibration is performed

over an excess of information. This means that there always exists a solution.

The main problem, in contrast, is that it is time-consuming, owing to the minimisation algorithm required. Further, minimisation algorithms can prove imprecise, finding local minima rather than the absolute minimum of the problem. We must also realise that we seek

Market prices = Model prices

This means that we seek a minimum of exactly 0 for the difference

Market prices - Model prices

This cannot be guaranteed by our algorithm, which simply provides a minimal difference between the two quantities; this difference can be arbitrarily large.

In addition, when we arrive at our final solution curve, we cannot know which points are exact, i.e. which points are supplied by the market data and which are approximations; we have thus truly lost information. Projecting N-dimensional data onto an M-dimensional problem space results in a loss of information:

[Figure: σ_Black as a function of strike K at a fixed maturity (t_0, T); the fitted curve passes through only one of the market data points.]

Fig. 10.9. Inexact fit: minimum square method


Above, our approximation curve only coincides with one of the market data

points, so realistically speaking, can only be considered exact at that precise point, and

is therefore an approximation for all other regions.
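A toy version of the minimum-square-error method illustrates this inexact fit (an added sketch: the model is a one-parameter, M = 1, Black–Scholes pricer, the N = 3 "market" quotes are synthetic and deliberately noisy, and the minimiser is a crude bracketing search rather than a production algorithm):

```python
import math
from statistics import NormalDist

nd = NormalDist()

def bs_call(S0, K, r, T, sigma):
    """One-parameter stand-in model: Black-Scholes call."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma * sigma) * T) / (sigma * math.sqrt(T))
    return S0 * nd.cdf(d1) - K * math.exp(-r * T) * nd.cdf(d1 - sigma * math.sqrt(T))

def mse(sigma, quotes):
    """Equation (10.1): F = sum_i (Market price_i - Model price_i)^2."""
    return sum((mkt - bs_call(100.0, K, 0.05, 1.0, sigma)) ** 2
               for K, mkt in quotes)

def minimise(quotes, lo=0.05, hi=1.0, rounds=60):
    """Crude bracketing search for the minimum of F (assumes F is unimodal);
    a production calibrator would use something like Levenberg-Marquardt."""
    for _ in range(rounds):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if mse(m1, quotes) < mse(m2, quotes):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# N = 3 synthetic quotes generated from sigma = 0.25 plus a little noise:
# with M = 1 parameter the fit can only be a least-squares compromise.
quotes = [(90.0, bs_call(100.0, 90.0, 0.05, 1.0, 0.25) + 0.05),
          (100.0, bs_call(100.0, 100.0, 0.05, 1.0, 0.25)),
          (110.0, bs_call(100.0, 110.0, 0.05, 1.0, 0.25) - 0.05)]
sigma_star = minimise(quotes)
```

The minimised objective stays strictly positive: with more quotes than parameters, the fit is a compromise, which is precisely the loss of information discussed above.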

One of the greatest problems encountered using this procedure is the

introduction of noise in the calculation of the Greeks.

The Greeks are the variations of a product's value (its derivatives) with respect to its market parameters. All the different types of risk in the pricing of a product can be measured by these instruments, the Greeks. With them, it is possible to hedge against the risks in a very simple manner.

The Greeks are vital tools in risk management. Each Greek (with the exception of

theta - see below) represents a specific measure of risk in owning an option, and

option portfolios can be adjusted accordingly ("hedged") to achieve a desired

exposure.

As a result, a desirable property of a model of a financial market is that it allows

for easy computation of the Greeks. The Greeks in the Black-Scholes model are very

easy to calculate and this is one reason for the model's continued popularity in the

market.

The delta measures sensitivity to price. The ∆ of an instrument is the

mathematical derivative of the value function with respect to the underlying price,

$$\Delta = \frac{\partial V}{\partial S} \qquad (10.2)$$

The gamma measures second order sensitivity to price. The Γ is the second

derivative of the value function with respect to the underlying price,

$$\Gamma = \frac{\partial^2 V}{\partial S^2} \qquad (10.3)$$

The speed measures third order sensitivity to price. The speed is the third

derivative of the value function with respect to the underlying price,


$$\text{speed} = \frac{\partial^3 V}{\partial S^3} \qquad (10.4)$$

The vega, which is not a Greek letter (ν, nu is used instead), measures sensitivity

to volatility. The vega is the derivative of the option value with respect to the volatility

of the underlying,

$$\nu = \frac{\partial V}{\partial \sigma} \qquad (10.5)$$

The term kappa, κ, is sometimes used instead of vega, and some trading firms

also use the term tau, τ.

The theta measures sensitivity to the passage of time. Θ is the negative of the

derivative of the option value with respect to the amount of time to expiry of the

option,

$$\Theta = -\frac{\partial V}{\partial T} \qquad (10.6)$$

The rho measures sensitivity to the applicable interest rate. The ρ is the derivative

of the option value with respect to the risk free rate,

$$\rho = \frac{\partial V}{\partial r} \qquad (10.7)$$

Less commonly used, the lambda λ is the percentage change in option value per

change in the underlying price, or

$$\lambda = \frac{1}{V}\frac{\partial V}{\partial S} \qquad (10.8)$$

It is the logarithmic derivative.

The vega gamma or volga measures second order sensitivity to implied volatility.

This is the second derivative of the option value with respect to the volatility of the

underlying,

$$\text{volga} = \frac{\partial^2 V}{\partial \sigma^2} \qquad (10.9)$$


The vanna measures cross-sensitivity of the option value with respect to change

in the underlying price and the volatility,

$$\text{vanna} = \frac{\partial^2 V}{\partial S\,\partial \sigma} \qquad (10.10)$$

This can also be interpreted as the sensitivity of delta to a unit change in

volatility.

The delta decay, or charm, measures the time decay of delta,

$$\text{charm} = \frac{\partial \Delta}{\partial T} = \frac{\partial^2 V}{\partial S\,\partial T} \qquad (10.11)$$

This can be important when hedging a position over a weekend.

The colour measures the sensitivity of the charm, or delta decay to the underlying

asset price,

$$\text{colour} = \frac{\partial^3 V}{\partial S^2\,\partial T} \qquad (10.12)$$

It is the third derivative of the option value: twice with respect to the underlying asset price and once with respect to time.

For our particular model, if we want to calculate:

$$\frac{\partial\,\text{price}}{\partial \sigma_{Black}} \approx \frac{\text{price}_1 - \text{price}_2}{\sigma_{Black,1} - \sigma_{Black,2}} \qquad (10.13)$$

we simply evaluate our product's price once to obtain price1. This will have an inherent error E1 due to the square mean minimisation used to obtain the solution curve. To calculate price2, we slightly shift the market σBlack that we input into our calibration set, and then re-price our product. We must note that with such a variation, we generate a new minimisation curve that will most probably have a different minimisation error, E2, at the point we are studying. Such variations are responsible for the noise generated in the Greeks we obtain.
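The effect can be sketched as follows, assuming a Black (1976) call as the product and two made-up calibration residuals E1 and E2 left behind by the two minimisation runs; all figures are illustrative. The bump-and-reprice Greek then picks up an error (E2 − E1)/dσ, which can dwarf the true sensitivity when the bump is small.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_call(F, K, sigma, T):
    # Black (1976) call on a forward F (discounting omitted for simplicity).
    d1 = (log(F / K) + 0.5 * sigma * sigma * T) / (sigma * sqrt(T))
    return F * norm_cdf(d1) - K * norm_cdf(d1 - sigma * sqrt(T))

F, K, T = 100.0, 100.0, 1.0
sigma, d_sigma = 0.20, 0.01

# Illustrative calibration residuals: each calibration run leaves a different
# small error on the repriced product (values are made up).
E1, E2 = 0.003, -0.004

vega_clean = (black_call(F, K, sigma + d_sigma, T)
              - black_call(F, K, sigma, T)) / d_sigma
vega_noisy = ((black_call(F, K, sigma + d_sigma, T) + E2)
              - (black_call(F, K, sigma, T) + E1)) / d_sigma

# The noise on the Greek is (E2 - E1) / d_sigma: small pricing errors are
# amplified by the division by the bump size.
noise = vega_noisy - vega_clean
```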


10.3.2 Exact Fit. 2nd Method:

Despite the fact that the market contains an excess of products, we seek to have N

= M. That is, for every maturity T, we shall select the same number of market products as there are unknown parameters in our model. The difficulty in this procedure is to correctly select the pertinent calibration set, yet this is where the skill of the trader lies. Thus,

with this model, we can assure that the value at any of the selected strikes is precise.

For example, for a model of five parameters, we would select five products:

[Figure: σBlack vs Strike K at fixed maturity (t0,T)]

Fig. 10.10. Exact fit

Between points, the model may or may not be exact, but at the specific strikes K

set by the trader, we can be sure of the results obtained.
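Under the same kind of toy setting (a hypothetical linear model price a + b·K with M = 2 parameters), the exact-fit method selects N = M = 2 quotes and solves a square linear system instead of running a minimisation, so the chosen strikes are repriced exactly.

```python
import numpy as np

# Hypothetical model price: a + b * K, with M = 2 parameters (a, b).
# Exact fit: select exactly N = M = 2 market quotes and solve the square system.
strikes = np.array([3.0, 5.0])
market = np.array([1.0, 0.75])

A = np.column_stack([np.ones_like(strikes), strikes])
a, b = np.linalg.solve(A, market)   # unique solution, no minimisation needed

fitted = a + b * strikes
# At the selected strikes, the model reprices the market exactly.
max_error = float(np.max(np.abs(fitted - market)))
```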

1st Effect:

A further advantage of the best fit method lies in the fact that a real anomaly within market data can be faithfully taken into account by our model, whereas the same

anomaly would be evened out by the square mean method:


[Figure: σBlack vs Strike K at fixed maturity (t0,T)]

Fig. 10.11. Anomaly in exact fit

The square mean method in contrast, would produce:

[Figure: σBlack vs Strike K at fixed maturity (t0,T)]

Fig. 10.12. Anomaly in minimum square method

2nd Effect:

The use of the minimum squares method introduces noise in the calculation of

the Greeks, not allowing for any form of control over the residual error created.

The best fit method, however, allows us to calculate sensitivities through differentials or finite difference methods. It will only present sensitivities towards the factors included in our calibration set.

Thus, we must include in our calibration set all the sensitivities that we

specifically want to take into consideration. For example, if we want our three


parameter calibration to present several sensitivities, we will not construct it with three identical vanilla products, but instead, for example, with a vanilla at the money, a risk product and a butterfly. Each of these is aimed at bringing a characteristic sensitivity into our model.

3rd Effect:

We save considerable time by analysing only the pertinent products rather than an indefinite range of them. Computation time is of utmost importance in our working environment.

10.4 Newton Raphson

We seek: Model Prices = Market Prices.

We use the Newton Raphson algorithm to obtain the solution to the above

equation. The algorithm is among the fastest of the standard optimisation algorithms. Its main problems arise when the surface is not smooth, and when the initial point of calculation is far from the final solution. The former is not an issue in our case, and we will show that our first guess is generally a good starting approximation.

Fig. 10.13. Newton Raphson Iterations


The Newton Raphson procedure is simple and well known. It seeks

Model Prices - Market Prices = 0

As seen in the above figure, a first guess σ1 is used to construct the model price,

P1. The slope at this point is calculated as $m_1 = \frac{\partial P}{\partial \sigma}$. This is used to construct a straight line of equal slope through the point (σ1, P1). Its intersection with the horizontal axis provides the next point of calculation,

$$\sigma_2 = \sigma_1 - \frac{P_1}{m_1}$$

The method is easily generalised to M parameters by substituting the slope with the Jacobian $J = \frac{\partial P_i}{\partial \sigma_j}$, giving

$$\sigma_2 = \sigma_1 - J^{-1}\cdot P_1$$

The difficulty in this procedure lies in the need to calculate the inverse of the Jacobian, something that can prove not only numerically difficult but also time consuming.
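The scalar iteration can be sketched as follows, with a Black call as a hypothetical calibration product and a made-up market quote; the slope m is obtained by a finite difference, and the update is the one described above.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_call(F, K, sigma, T):
    d1 = (log(F / K) + 0.5 * sigma * sigma * T) / (sigma * sqrt(T))
    return F * norm_cdf(d1) - K * norm_cdf(d1 - sigma * sqrt(T))

F, K, T = 100.0, 100.0, 1.0
market_price = 7.97   # illustrative market quote

sigma = 0.30          # first guess sigma_1
for _ in range(50):
    P = black_call(F, K, sigma, T) - market_price   # P_i = model - market
    if abs(P) < 1e-10:
        break
    d_sig = 1e-6
    # slope m_i = dP/dsigma by finite difference
    m = (black_call(F, K, sigma + d_sig, T) - black_call(F, K, sigma, T)) / d_sig
    sigma = sigma - P / m   # sigma_{i+1} = sigma_i - P_i / m_i
```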

10.4.1 Simplifications of the algorithm.

1st Simplification:

The Newton Raphson method can only be used where the slope is smooth. If ever $\frac{\partial P}{\partial \sigma}$ is not smooth, we can create what is called a buffer parameter, λ. Typical forms are $\lambda = \int \sigma^2\,dt$ or $\lambda = \int e^{\sigma^2}\,dt$. The transformation allows for a smooth $\frac{\partial P}{\partial \lambda}$, to which the Newton Raphson algorithm can now be applied successfully.


2nd Simplification:

We do not really need an extremely accurate Jacobian. That is, for a smooth curve, we can use a constant initial Jacobian in all iterations to reach the final solution σ*. Although the algorithm then needs more steps, we avoid the time-consuming calculation of a new Jacobian at each step, and thus, comparatively, greatly reduce the computation time.
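A sketch of this constant-Jacobian (chord) variant in the same spirit, with a Black call and a made-up market quote: the slope is computed once at the first guess σ1 and reused at every step.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_call(F, K, sigma, T):
    d1 = (log(F / K) + 0.5 * sigma * sigma * T) / (sigma * sqrt(T))
    return F * norm_cdf(d1) - K * norm_cdf(d1 - sigma * sqrt(T))

F, K, T = 100.0, 100.0, 1.0
market_price = 7.97   # illustrative market quote

sigma = 0.30
d_sig = 1e-6
# Compute the slope (the "Jacobian") once at the first guess and reuse it.
m0 = (black_call(F, K, sigma + d_sig, T) - black_call(F, K, sigma, T)) / d_sig

steps = 0
for _ in range(100):
    P = black_call(F, K, sigma, T) - market_price
    if abs(P) < 1e-10:
        break
    sigma -= P / m0   # same m0 in every step: cheaper per step, more steps
    steps += 1
```

Because the slope of the price in σ changes little in this example, the fixed slope costs only a few extra iterations, which is exactly the trade-off described above.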

[Figure: Market Price – Model Price vs parameter, with iterates σ1, σ2, σ3, σ4, σ5, σ6, … converging to σ*]

Fig. 10.14. Newton Raphson Iterations with a constant Jacobian

A further solution would be to use an analytic, approximate Jacobian at each

point. Making it analytical would render it very rapid, and would also avoid the need

to calculate the numeric Jacobian at each step. This is what we have developed in

section 12 and called an analytic approximation. In fact, the analytic approximation

calculates an approximate solution using its own approximate Jacobians, and once it

arrives at its solution, the algorithm transforms into a MonteCarlo approach that

nevertheless still uses the last Jacobian that the analytic approximation calculated.

The argument that the use of an analytic approximation is less time consuming than the calculation of a new Jacobian each time relies on the following logic:

Page 181: HJM Framework

A Practical Implementation of the Heath – Jarrow – Morton Framework

167

Without an analytic approximation, each time we go through one whole iteration

within the calibration process we must perform the pricing n+1 times: We call on the

pricer a first time during the calibration so as to calculate the model price for the n

liquid vanilla products based on the first guess model parameters. If we then have to

iterate because these model prices do not coincide with the market prices, we must

call on the pricer a further n times. This occurs because we must generate a Jacobian to

proceed with the slope method of the Newton Raphson algorithm.

[Figure: n × n Jacobian matrix dPrice_i/dParam_j, with the N prices P1 … Pn along one axis and the N parameters λ1 … λn along the other]

Fig. 10.15. Calibration Jacobian

We calculate the Jacobian by varying or ‘bumping’ the first of the n parameters λ1

by a differential amount dλ1. We then go through the pricer again, calculating all the

new prices Pi + dP|λ1. Thus we obtain the first row in our Jacobian matrix:

$$\frac{\partial P_i}{\partial \lambda_1} = \frac{\left(P_i + dP_i\right) - P_i}{\left(\lambda_1 + d\lambda_1\right) - \lambda_1} \qquad (10.14)$$


[Flow chart: market rates and N liquid vanilla products feed the pricer; each parameter λ1 is bumped and the prices Pi recalculated; Newton Raphson modifies the model parameters α, σ, γ, θ until the N model prices match the N market prices, at which point the model parameters are exported]

Fig. 10.16. Jacobian calculation Iterations

We repeat the process n times, once for each of the parameters λi. With the old prices Pi and parameters λi, and with the new parameters λi + dλ and prices Pi + dP, the

Jacobian is then fully computed.

With an analytic approximation therefore, we would avoid calling on the pricer

n+1 times. Instead we would use a unique Jacobian and simply require a few more

iterations to arrive at the final result.
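The bumping procedure can be sketched with a toy two-parameter "pricer" (an invented function, not the HJM pricer); following the convention of Fig. 10.15, rows of the Jacobian are indexed by the parameter and columns by the price.

```python
import numpy as np

# Hypothetical "pricer": maps n = 2 model parameters to n = 2 product prices.
def pricer(lam):
    l1, l2 = lam
    return np.array([l1 ** 2 + l2, l1 * l2])

lam = np.array([1.5, 0.5])
d_lam = 1e-6

base = pricer(lam)               # first pricer call: the n prices P_i
J = np.zeros((2, 2))
for j in range(2):               # n further calls, one per bumped parameter
    bumped = lam.copy()
    bumped[j] += d_lam
    # Row j of the Jacobian, as in (10.14): (P_i(lam + dlam_j) - P_i(lam)) / dlam_j
    J[j] = (pricer(bumped) - base) / d_lam

# Analytic Jacobian of the toy pricer, rows indexed by parameter:
J_exact = np.array([[2 * lam[0], lam[1]],
                    [1.0,        lam[0]]])
```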


10.5 Iteration Algorithm

Recalling the initial calibration flow chart, we now present the iteration's right-hand side in more detail:

[Flow chart: starting from market rates and N vanilla products, the Newton Raphson loop bumps each of the n parameters λ_i in turn (λ_i = λ_i + Δ_i), reprices through the pricer, stores the model prices, and builds the new parameters λ_i at step j+1 until Market Price = Model Price, at which point the n parameters λ_i are exported]

Fig. 10.17. Detailed Calibration Algorithm: Jacobian computation


11. Graphical Understanding

We now seek to truly understand how well our HJM model operates, and a direct

method to achieve this is by reproducing it graphically. By doing so, we will be able to

see where calibration solutions lie, where the first guesses are taken, and how directly

our algorithm converges towards its solution. We also hope that any specific cases

where our algorithm finds difficulties in converging, or does not converge at all, will

become apparent during this phase.

We set out to analyse the two strike HJM model. Recall that the calibration

parameters for such a formulation were simply σ and α. The initial idea is to generate

the space Ω consisting of all the prices that the HJM model generates for each pair (σ,

α), for a determined vanilla product. We then compare this three dimensional surface

with the true market price of the vanilla product. Wherever the HJM model price

coincides with the market price, we would have a valid solution for the parameters

(σ,α).

We develop this idea a step further, and set our z axis directly as:

HJM Model Price – Market Price

This is simply a vertical displacement of the curve along the z axis, since the

market price for a given vanilla product is a constant. What we achieve through this

transformation, is to be able to directly observe our (σ, α) solutions as the intersection

of our surface with the horizontal plane. That is, we have a valid solution whenever

HJM Model Price – Market Price = 0

Note that, as discussed in the Black Scholes chapter, the use of market prices or of Black volatilities on our vertical axis makes no difference, as the two are directly related. We

must also be aware of the fact that each valuation must be made for a product with a

defined maturity and a definite strike.

We will not enter the details of the programming algorithm involved in this

procedure yet. For the interested reader, refer to the algorithm at the end of this

section, Fig. 11.12.


Our space Ω will consist of the parameters to calibrate along the horizontal plane.

Notice that we can intuitively guess the form that this surface will have, by analysing

the 2 dimensional behaviour of the individual parameters.

We must not confuse the behaviour of the model parameters in this section with

those in the HJM section. Here, we are representing price versus model parameters. In

previous graphs, we were analysing prices versus strikes, and seeing how our

parameters transformed those curves.

The sigma parameter in the two strike model presents a characteristic curve of the

form:

[Plot: price vs Sigma [%] from 0 to 0.25, one curve per Alpha ∈ {−60, −20, 0, 20, 40}]

Fig. 11.1. HJM model: sigma vs price dynamics with different alpha parameters

It is a monotonically increasing curve. We see that its slope varies depending on

the value of alpha.

The alpha parameter presents a characteristic convexity. Its global level varies

vertically in price depending on the value of sigma.


[Plot: price vs Alpha [%] from −65 to 35, one curve per Sigma ∈ {0.04, 0.08, 0.12, 0.16, 0.2}]

Fig. 11.2. HJM model: alpha vs price dynamics with different sigma parameters

By combining the two, we obtain a 3D surface Ω of the form:

[3D surface: Model Price − Market Price over the (Sigma, Alpha) plane]

Fig. 11.3. HJM MonteCarlo model price surface

The possible (σ, α) solutions are those in which Model Price = Market Price,

therefore where the surface passes through 0. This is depicted above as the limiting

curve between the blue and purple regions.


[Plot: Sigma vs Alpha [%], HJM solution curves for K = 3.35% and K = 4.45%]

Fig. 11.4. HJM MonteCarlo two dimensional solution

Notice however that we obtain an entire curve of possible solutions, meaning that

we have an infinite set of possibilities to select our (σ, α) pair from. As in any

mathematical problem with two variables to solve, it is clear that we need two

equations to fully determine the system. We must therefore perform each calibration

in pairs of vanilla products (that have the same time schedules, and different strikes),

so that the final solution set (σ, α) is that which satisfies both curves, that is, their intersection.

Fig. 11.5. HJM MonteCarlo two dimensional solution intersection for two vanilla products

Notice also that the pair of vanilla products must be selected with the same

maturity, but different strike. It is left to the trader to decide which specific pair of


vanilla products to select at each maturity and tenor, so as to model the risk that he

seeks.

Depending on his selection, the trader will be taking into account a tangent to the volatility curve at a specific point, or will be reproducing a broad modelling of the slope.

Fig. 11.6. Model implications on taking a) very close strikes b) distant strikes

We immediately see here that a trader trying to capture a curvature, i.e. a smile,

will be unable to do so with the selection of only two strike positions. A third strike

would be required, and consequently a third parameter γ, the 'Volatility of Volatilities', would have to be inserted into our HJM formulation to completely

determine the system of 3 equations and 3 unknowns.

11.1 Dynamics of the curve

Increasing the market prices, or equivalently, the σBlack results in a downward

vertical displacement of the entire surface Ω. According to our z axis definition we are

now subtracting a greater amount in the second term of the equation

HJM Model Price – Market Price

This has the effect on our 2D graph of displacing our solution curves upwards in

the sigma axis. (Do not confuse our HJM model's sigma with the σBlack that is related to the product price through the Black Scholes formula.)

Variation of the strike K in turn appears to be a re-scaling of the solution curve.

Its effect is to increase the overall size of the curve and shift it towards smaller values

of alpha, maintaining a more or less constant right-end value.



[Figure: solution curves displaced from σBlack1 to σBlack2, and re-scaled from strike K1 to K2]

Fig. 11.7. Solution dynamics with a) variation in market price b) variations in strike

11.2 HJM Problematics

Of specific interest to us is to analyse the cases in which our HJM algorithm does

not find a concrete solution. We have encountered four main cases of concern.

11.2.1 Lack of Convergence

There exists a solution but the algorithm does not converge properly towards it.



[Plot: Sigma vs Alpha [%] solution curves for K = 3.35% and K = 3.45%, with the 1st guess marked in a region of no convergence]

Fig. 11.8. Convergence of the algorithm

We realise that this problem can be solved easily and directly through two main

alternatives, each being equally valid.

a. Selecting a first guess which is closer to the true solution. This will be one of

the driving forces to develop an analytic approximation.

b. Increasing the number of MonteCarlo simulations so as to have a more robust

convergence towards the true solution. The only drawback with this alternative is the

increased computation time required.

11.2.2 Duplicity

There exists a duplicity of solutions.

The observant reader will have already noticed this in the previous graph, or could have foreseen this difficulty from the fact that the alpha curve has a concave form. We present here a more evident case of solution duplicity.


[Plot: Sigma vs Alpha [%] solution curves for K = 4.35% and K = 4.45%, intersecting twice]

Fig. 11.9. Solution Duplicity

In general we intend our model to have a logical set of parameters. Remember

that the alpha represents the weight that we attribute to the normality or log-normality of our model. Thus we expect our set of alpha values to range within [0,1].

We notice that one of our solutions always lies within this range, whereas a second

solution sometimes appears that can greatly exceed it. To impose that our model does

not converge towards solutions of the form α = 5, we could either code a restrictive

algorithm, or we can simply impose a first guess with an alpha within this range.

Better still, we could bring our first guess very close to the valid solution to

ensure that the algorithm converges towards it. (See the analytic approximation in

Section 12).


11.2.3 No solution

No curve intersection, and thus no valid pair of parameters.

[Plot: Sigma vs Alpha [%] solution curves for two K = 4.45% products that never intersect]

Fig. 11.10. No HJM MonteCarlo solution intersection

11.2.4 No curve whatsoever

This occurs whenever our HJM model is not flexible enough to descend below the

horizontal axis. In other words, the model is never capable of equating its model

prices to the real market prices with any possible combination of parameters (σ, α).

The surface remains asymptotic to the horizontal axis.


[3D surface "Surface Flexibility - Vanilla 2": Model − Market Price over the (Sigma, Alpha [%]) plane, remaining strictly above zero]

Fig. 11.11. HJM MonteCarlo surface does not descend sufficiently so as to create a solution curve

11.3 Attempted Solutions

A first idea that we tested to probe the cause of this inability to calibrate was to

analyse the gradients. We define the gradient in a σBlack versus strike K graph as

$$\frac{\partial \sigma_{Black}}{\partial K}$$

We realise that the model has particular difficulties when trying to calibrate products that exhibit a strong smile, i.e. that are very convex, or that have strong changes in their slope. We analysed two principal cases:

· Whether the algorithm was incapable of calibrating due to the magnitude

(steepness) of the gradient.

· Whether the algorithm was incapable of calibrating depending on the

particular strike K location where the gradient was calculated i.e., whether the


algorithm found it particularly more difficult to calibrate when it was far from

the ‘at the money’ strike position.

No conclusive evidence was found supporting either of the two suggestions.

A second idea tested was to analyse the actual plain vanilla products that we

were calibrating. We found that calibrations of similar products, for example, a

swaption pair, could be solved by the HJM model. If instead we calibrated

different products together, such as a swaption and a capfloor pair, the HJM would be

unable to converge to a suitable solution. We noticed also that there is a clear

difference in slopes between these two types of products. On average, a capfloor’s

slope has a value of around -1, whereas a Swaption has a slope of -2.

The above sparks off two possibilities:

The first is an examination of the actual cap market data so as to examine if the

caplets are being correctly derived from their cap market quotes. This will be

discussed in detail in a later chapter- Section 16.

A second option would be to try and normalize the caplet volatility measure. We

have noticed that the joint calibration of swaptions (based on the 1Y LIBOR) and 1Y

caplets (based on the 1Y LIBOR) does yield results. However, when we try to calibrate

the same swaptions with 3 month or 6 month caplets (based on the 6M LIBOR),

immediate problems arise. A possible alternative would be to transform these caplets

into an equivalent annual measure, and to calibrate this fictitious annual measure

with the swaption measure. See [BM].


11.4 3D Surface Algorithm

We follow 4 main steps in the algorithm that constructs the 3 dimensional surface

and that then evaluates the 2 dimensional solutions for α and σ.

1. Selection of the maturity i at which we will evaluate a particular pair of

products

2. Mesh generation: that is, we define the horizontal plane of α's and σ's over

which to construct our surface.

3. For each of the two products q, we generate for every mesh node (α, σ) the 3D

surface as

Model Price(α, σ) – Market Price

4. We search for the (α, σ) solutions. This is where the surface passes through the

plane z = 0. For this, we check every node on the horizontal mesh and see if its

corresponding price difference (z value) experiences a change in sign with respect to

any adjacent node. We check both the horizontally and the vertically adjacent nodes. Any solution found is stored.
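The four steps can be sketched as follows, with an invented price-difference function whose zero set is a known curve (the unit circle), so that the sign-change search of step 4 can be verified against the true solution curve.

```python
import numpy as np

# Toy stand-in for "Model Price(alpha, sigma) - Market Price": its true
# solution curve is sigma^2 + alpha^2 = 1 (purely illustrative).
def price_diff(alpha, sigma):
    return sigma ** 2 + alpha ** 2 - 1.0

alphas = np.linspace(-1.5, 1.5, 61)   # L alpha nodes
sigmas = np.linspace(0.0, 1.5, 31)    # LL sigma nodes
M = price_diff(alphas[:, None], sigmas[None, :])   # M[L, LL]: mesh of z values

solutions = []
# Vertical check: sign change along alpha for each fixed sigma column.
for ll in range(M.shape[1]):
    for l in range(1, M.shape[0]):
        if M[l, ll] * M[l - 1, ll] < 0:
            solutions.append((0.5 * (alphas[l] + alphas[l - 1]), sigmas[ll]))
# Horizontal check: sign change along sigma for each fixed alpha row.
for l in range(M.shape[0]):
    for ll in range(1, M.shape[1]):
        if M[l, ll] * M[l, ll - 1] < 0:
            solutions.append((alphas[l], 0.5 * (sigmas[ll] + sigmas[ll - 1])))

# Every exported (alpha, sigma) pair should lie close to the unit circle.
max_dev = max(abs(a * a + s * s - 1.0) for a, s in solutions)
```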


[Flow chart: set the (α, σ) parameter mesh (L = n α points, LL = m σ points); for each of the two products q of the selected pair, compute MarketPrice − ModelPrice at each mesh node into a 3D matrix; then scan each σ column (vertical check, M_{L,LL} · M_{L−1,LL} < 0) and each α row (horizontal check, M_{L,LL} · M_{L,LL−1} < 0) for sign changes, and export each interpolated (α, σ) solution]

Fig. 11.12. Graphic surface generation algorithm


12. HJM 3 Strikes

The HJM three strike model, as stated initially, aims at being able to capture the

smile perceived by the market through the addition of a new parameter. The

peculiarity of this new component lies in the fact that it is itself stochastic. We will see

that the new parameter, which we will call the volatility of volatilities, has its own

diffusion dynamics that is governed by a different Brownian motion, Z.

We will now proceed to analyse the main alternatives that we experimentally devised and developed here. Notice, as we advance, how they all present a direct

relationship with the typical normal and lognormal stochastic differential diffusion

equations. This is why, upon integrating, they all evolve towards an exponential form.

We have provided their mathematical development below:

$$\Gamma(t,T) = \sigma(t,T)\cdot V(t,T)\cdot\left[\alpha(t,T)\log B(t,T) + \left(1-\alpha(t,T)\right)\log\frac{B(0,T)}{B(0,t)}\right] \qquad (12.1)$$

where V(t,T) introduces the new stochastic term, Z. We will describe several

choices of V tried in HJM Santander.

12.1 Exponential

This is one of the simplest stochastic volatility models that we shall tackle. It involves a single volatility of volatilities parameter, combined with the new Brownian motion itself.

$$V(t,T) = e^{\gamma(t,T)Z_t} \qquad (12.2)$$

The above is selected in particular so that we ensure

$$E^P\left[V(t,T)\right] = 1 \qquad (12.3)$$


A logical extension of the presented model would be

$$V(t,T) = e^{-\frac{1}{2}\gamma^2(t,T)\,t + \gamma(t,T)Z_t} \qquad (12.4)$$
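A quick Monte Carlo sanity check (illustrative parameter values) that the compensated form (12.4) has unit expectation: the drift term compensates the lognormal convexity of the exponential.

```python
import numpy as np

rng = np.random.default_rng(0)

gamma, t = 0.3, 1.0          # illustrative volatility-of-volatility and horizon
n_paths = 400_000

Z = rng.standard_normal(n_paths) * np.sqrt(t)    # Z_t ~ N(0, t)
V = np.exp(-0.5 * gamma ** 2 * t + gamma * Z)    # equation (12.4)

# Should be close to 1: the -1/2 gamma^2 t drift makes V a unit-mean variable.
mean_V = float(V.mean())
```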

Clearly, recalling chapter 3.1, we can obtain the above from a Black formulation of

the diffusion for the new Brownian motion. Imagine that we have a dynamics of the

form

$$dV(t,T) = \beta\,V(t,T)\,dt + \gamma(t,T)\,V(t,T)\,dZ_t \qquad (12.5)$$

$$\frac{dV_t}{V_t} = \beta\,dt + \gamma_t\,dZ_t \qquad (12.6)$$

Let us impose a change in variable

$$X = \ln\left(V_t\right), \qquad dX = d\ln\left(V_t\right) \qquad (12.7)$$

Applying Ito

$$dX = 0 + \frac{1}{V_t}\,dV_t - \frac{1}{2}\,\frac{1}{V_t^2}\,\gamma^2 V_t^2\,dt \qquad (12.8)$$

replacing the term in dV with our initial diffusion equation

$$d\ln\left(V_t\right) = dX = \beta\,dt + \gamma\,dZ_t - \frac{1}{2}\gamma^2\,dt \qquad (12.9)$$

and now integrating

$$V_t = V_0\,e^{\int\left(\beta - \frac{1}{2}\gamma_t^2\right)dt + \int\gamma_t\,dZ_t} \qquad (12.10)$$

Assuming γ(t,T) to be piecewise constant, we could extract it from the integral,

obtaining

$$V_t = V_0\,e^{\left(\beta - \frac{1}{2}\gamma^2\right)t + \gamma Z_t} \qquad (12.11)$$

If we do not make the above assumption, we must solve the second integral using

Brownian motion properties, as

$$\int \gamma_t\,dZ_t = \sqrt{\operatorname{Variance}\left(\int \gamma_t\,dZ_t\right)}\;\frac{Z_t}{\sqrt{t}} = \sqrt{\int \gamma_t^2\,dt}\;\frac{Z_t}{\sqrt{t}} \qquad (12.12)$$

Which leaves us with

$$V_t = V_0\,e^{\int\left(\beta - \frac{1}{2}\gamma_t^2\right)dt + \sqrt{\int \gamma_t^2\,dt}\,\frac{Z_t}{\sqrt{t}}} \qquad (12.13)$$

We will see that this procedure is common throughout all the possibilities

attempted. That is to say, we could always use a more simplistic approach rather

than a complete integration. In general we had particular difficulties in implementing

the complete integrals. For this reason we commonly resorted to piecewise constant

assumptions. Another successful solution to avoid the integral of a squared parameter

was to approximate:

$$\int \gamma_t^2\,dt \approx \left(\int \gamma_t\,dt\right)^2$$

12.2 Mean Reversion

We simply add here a term of mean reversion to the previously developed

dynamics in (12.5).

$$dV(t,T) = \beta\left(\theta - V(t,T)\right)dt + \gamma(t,T)\,dZ_t \qquad (12.14)$$

We now must apply a change in variable to attain the solution. Let

$$x_t = V_t\,e^{\int_0^t \beta_s\,ds} \qquad (12.15)$$

Then by Ito

$$dx_t = e^{\int_0^t \beta_s\,ds}\,dV_t + V_t\,\beta_t\,e^{\int_0^t \beta_s\,ds}\,dt + 0 \qquad (12.16)$$

$$dx_t = e^{\int_0^t \beta_s\,ds}\left(dV_t + \beta_t V_t\,dt\right) \qquad (12.17)$$


And substituting our diffusion equation in dV we obtain

$$dx_t = e^{\int_0^t \beta_s\,ds}\left(\beta\left(\theta - V(t,T)\right)dt + \gamma(t,T)\,dZ_t + \beta_t V_t\,dt\right) \qquad (12.18)$$

Leaving

$$dx_t = e^{\int_0^t \beta_s\,ds}\left(\beta\theta\,dt + \gamma(t,T)\,dZ_t\right) \qquad (12.19)$$

Integrating between t and 0 we have

$$x_t - x_0 = \int_0^t e^{\int_0^u \beta_s\,ds}\left(\beta\theta\,du + \gamma(u,T)\,dZ_u\right) \qquad (12.20)$$

We can substitute now our change in variable for x to return to a formulation in

V:

$$V_t\,e^{\int_0^t \beta_a\,da} - V_0 = \int_0^t e^{\int_0^u \beta_s\,ds}\left(\beta\theta\,du + \gamma(u,T)\,dZ_u\right) \qquad (12.21)$$

$$V_t = V_0\,e^{-\int_0^t \beta_s\,ds} + e^{-\int_0^t \beta_a\,da}\int_0^t e^{\int_0^u \beta_s\,ds}\left(\beta\theta\,du + \gamma(u,T)\,dZ_u\right) \qquad (12.22)$$

Let us take βt = β, then

$$V_t = V_0\,e^{-\beta t} + \beta\theta\,e^{-\beta t}\int_0^t e^{\beta u}\,du + e^{-\beta t}\int_0^t e^{\beta u}\,\gamma(u,T)\,dZ_u \qquad (12.24)$$

If we assume γ(t,T) to be piecewise constant, we can extract it from the

integration, thus


$$V_t = V_0\,e^{-\beta t} + \theta\left(1 - e^{-\beta t}\right) + \gamma(t,T)\,e^{-\beta t}\int_0^t e^{\beta u}\,dZ_u \qquad (12.25)$$

Where, using the Brownian motion properties, we can calculate $\int_0^t e^{\beta u}\,dZ_u$ as

$$\int_0^t e^{\beta u}\,dZ_u = \sqrt{\operatorname{Variance}\left(\int_0^t e^{\beta u}\,dZ_u\right)}\cdot\frac{Z_t}{\sqrt{t}} = \sqrt{\frac{1}{2\beta}\left(e^{2\beta t} - 1\right)}\cdot\frac{Z_t}{\sqrt{t}} \qquad (12.26)$$

Leaving

$$V_t = V_0\,e^{-\beta t} + \theta\left(1 - e^{-\beta t}\right) + \gamma(t,T)\,e^{-\beta t}\sqrt{\frac{1}{2\beta}\left(e^{2\beta t} - 1\right)}\cdot\frac{Z_t}{\sqrt{t}} \qquad (12.27)$$

Notice that if we do not consider γ(t,T) piecewise constant, we would have

$$\int_0^t e^{\beta u}\gamma(u,T)\,dZ_u = \sqrt{\operatorname{Variance}\left(\int_0^t e^{\beta u}\gamma(u,T)\,dZ_u\right)}\cdot\frac{Z_t}{\sqrt{t}} = \sqrt{\int_0^t e^{2\beta u}\gamma^2(u,T)\,du}\cdot\frac{Z_t}{\sqrt{t}} \qquad (12.28)$$

Leaving

$$V_t = V_0\,e^{-\beta t} + \theta\left(1 - e^{-\beta t}\right) + e^{-\beta t}\sqrt{\int_0^t e^{2\beta u}\gamma^2(u,T)\,du}\cdot\frac{Z_t}{\sqrt{t}} \qquad (12.29)$$
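A small Monte Carlo sketch of these mean-reverting dynamics (illustrative parameters, Euler discretisation of (12.14)): the sample mean can be compared with the exponential relaxation of the mean towards θ implied by the solution above.

```python
import numpy as np

rng = np.random.default_rng(1)

beta, theta, gamma = 1.0, 1.0, 0.2     # illustrative parameter values
V0, t, n_steps, n_paths = 1.5, 1.0, 200, 50_000
dt = t / n_steps

V = np.full(n_paths, V0)
for _ in range(n_steps):
    dZ = rng.standard_normal(n_paths) * np.sqrt(dt)
    V = V + beta * (theta - V) * dt + gamma * dZ   # Euler step of (12.14)

# Exact mean of the mean-reverting dynamics (take expectations, E[Z_t] = 0):
mean_exact = V0 * np.exp(-beta * t) + theta * (1.0 - np.exp(-beta * t))
mean_mc = float(V.mean())
```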

12.3 Square Root Volatility

We simply state the idea of the diffusion for this form; we did not, in the end, implement the expression below:

$$dv(t,T) = \beta\left(\theta - v(t,T)\right)dt + \gamma(t,T)\sqrt{v(t,T)}\,dZ_t \qquad (12.30)$$


12.4 Pilipovic

Notice that the expression below is very similar to the one that we stated in the

mean reversion section, (12.14). The main difference is simply the inclusion of the

stochastic volatility of volatilities term V in the diffusion (second term). Previously, it

had only been included in the drift term.

tdZTtVTtdtTtVTtdV ),(),()),((),( γθβ +−= (12.31)

A solution to the above equation was discovered by Dragana Pilipovic as

∫+

∫= ∫

+−+−−

dueVetVt dZsddsdZtddt ss

t

sss

t

t

βθγγβγγβ

0

2

1

2

1

0

2

0

2

)0()( (12.32)

To demonstrate that this is a solution, we will apply Ito as proof. For this we must calculate each term in:

dV(t) = (∂V(t)/∂t) dt + (∂V(t)/∂Z) dZ_t + ½ (∂²V(t)/∂Z²) dt    (12.33)

We therefore calculate each of the above partial derivatives as:

∂V(t)/∂t = −(β + ½γ_t²) [ V(0) e^{−βt − ½∫_0^t γ_s² ds + ∫_0^t γ_s dZ_s} + βθ ∫_0^t e^{−β(t−u) − ½∫_u^t γ_s² ds + ∫_u^t γ_s dZ_s} du ] + βθ    (12.34)

∂V(t)/∂t = −βV(t) − ½γ_t² V(t) + βθ    (12.35)

∂V(t)/∂Z = γ_t [ V(0) e^{−βt − ½∫_0^t γ_s² ds + ∫_0^t γ_s dZ_s} + βθ ∫_0^t e^{−β(t−u) − ½∫_u^t γ_s² ds + ∫_u^t γ_s dZ_s} du ] = γ_t V(t),   ∂²V(t)/∂Z² = γ_t² V(t)    (12.36)

By Ito, applying (12.33) and now substituting the above, we see that the complex terms cancel out, bringing us back to the original diffusion equation.

dV(t) = ( −βV(t) − ½γ_t² V(t) + βθ ) dt + ½γ_t² V(t) dt + γ_t V(t) dZ_t = β(θ − V(t)) dt + γ_t V(t) dZ_t    (12.37)

For a constant β, and piecewise constant γ_t, our solution would be

V(t) = V(0) e^{−βt − ½γ²t + γZ_t} + βθ ∫_0^t e^{−β(t−u) − ½γ²(t−u) + γ(Z_t − Z_u)} du    (12.38)

where the last term is of particular difficulty, and could numerically be approached as Σ_i e^{γ_i Z_{t_i}} (t_i − t_{i−1})
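As a sketch of the numerical treatment of (12.38), assuming a Brownian path already sampled on a discrete grid, the du-integral can be approximated by a left Riemann sum (this discretization and the function name are our own illustration, not the thesis code):

```python
import math

def pilipovic_path_value(V0, beta, theta, gamma, times, Z):
    """Evaluate the Pilipovic solution (12.38) for constant beta, gamma.
    times: increasing grid with times[0] = 0; Z: Brownian path on that grid.
    The du-integral is approximated by a left Riemann sum."""
    t, Zt = times[-1], Z[-1]
    decay = -(beta + 0.5 * gamma ** 2)
    first = V0 * math.exp(decay * t + gamma * Zt)
    integral = sum(
        math.exp(decay * (t - times[i]) + gamma * (Zt - Z[i])) * (times[i + 1] - times[i])
        for i in range(len(times) - 1))
    return first + beta * theta * integral
```

With gamma = 0 and a fine grid this reproduces the deterministic mean-reverting path V(0)e^{−βt} + θ(1 − e^{−βt}), which is how we would sanity-check such a discretization.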

12.5 Logarithmic

Note that this is still a very similar format to that which we had in our mean reversion model. We have simply replaced the variable V by log V.

d log V(t) = β(θ − log V(t)) dt + γ(t,T) dZ_t    (12.39)

Let us convert the above to a simpler form through

X(t) = log V(t),   dX(t) = d(log V(t))    (12.40)

Then the initial equation is now written as

dX(t) = β(θ − X(t)) dt + γ(t,T) dZ_t    (12.41)

For which the solution, as we saw previously, for a constant β, was

X_t = X_0 e^{−βt} + θ(1 − e^{−βt}) + γ(t,T) e^{−βt} ∫_0^t e^{βu} dZ_u    (12.42)

Now undoing our initial change in variable:

X(t) = log V(t),   dX(t) = d(log V(t))    (12.43)

so

log V_t = (log V_0) e^{−βt} + θ(1 − e^{−βt}) + γ(t,T) e^{−βt} ∫_0^t e^{βu} dZ_u    (12.44)

Where the integral can be treated as a Brownian term

∫_0^t e^{βu} dZ_u = √( Variance( ∫_0^t e^{βu} dZ_u ) ) · Z_t/√t = √( (1/(2β))(e^{2βt} − 1) ) · Z_t/√t    (12.45)

V_t = V_0^{e^{−βt}} · e^{θ(1 − e^{−βt}) + γ(t,T) e^{−βt} √( (1/(2β))(e^{2βt} − 1) ) · Z_t/√t}    (12.46)

12.6 Taylor expansion

Another alternative that has been examined is the transformation of the curvature into a Taylor expansion. Recall that we had

Γ(t,T) = σ(t,T) ( α(t,T) log B(t,T) + (1 − α(t,T)) log( B(0,T)/B(0,t) ) )    (12.47)

let us note

x_0 = log( B(0,T)/B(0,t) ),   x = log B(t,T)    (12.48)

we can then re-write the above as

Γ(t,T) = σ_t ( α_t(x − x_0) + x_0 )    (12.49)

We could imagine a more accurate extension of the above as a second order Taylor expansion

Γ(t,T) = σ_t [ x_0 + α_t(x − x_0) − γ(x − x_0)² ]    (12.50)

The above works relatively well, yet only up to maturities of around 12 years. We do not have a strong reason to abandon this formulation, other than the fact that we need to narrow down our range of alternatives, and have thus decided to use the volatility of volatility dynamics method instead, as it gives good results for products with longer maturities. The behaviour itself depends on the market data. For example, for the USD, the curvature approach has already been implemented and appears to work better than the volatility of volatilities approach.

12.7 Graphical Note

Following our discussion on the two strike model, it seems evident that in the three strike model we will now need three vanilla products, calibrated jointly, to attain a unique set of parameters for a given maturity. Remember how before, we obtained the solution in the intersection of two curves. Now, the solution will be obtained in the unique intersection of three surfaces.

Visually this is much more complex than before. The only way in which it could be represented would be to fix one of the model parameters, and plot the remaining two against a vertical axis consisting of 'Model Price – Market Price'. We have not pursued this possibility any further.

12.8 Results

We will now proceed to summarize the results obtained by comparing how each of the former expressions for the volatility of volatilities performs.

It is worthwhile noting that in many of the previous formulations, we saw several new parameters apart from the volatility of volatilities. These include for instance the β and θ, whose values we have decided to hardcode into our algorithms. An alternative or possible future approach could consist in also calibrating these parameters. For the meantime, we have manually adjusted them, searching for those for which our calibration works best.

Notice in the following set of results that we have tended to perform a first simplistic approach setting these additional parameters to default values of 0 or 1. We have then proceeded to perform a more detailed analysis, searching for their optimal values and exploring to what extent they improved the calibration. We have noted these as 'extended' results.

We state that all the formulae were exhaustively tested over 20 year products so as to evaluate at what point the calibration failed. None of the former expressions were capable of successfully completing the calibration process. The results presented below were obtained for 10 pack simulations: that is, 10,000 MonteCarlo paths. The more packs we added, the more difficult the HJM 3 factor algorithm found it to advance.

12.8.1 Case 1, Exponential (12.2):

1st Case                          Limit              β
Normal                            13Y, 14Y Caplet
Extended piecewise integration    14Y, 15Y Caplet    -0.05
exact integration                 2Y, 3Y             0.1 to -0.1
squared integral                  14Y, 15Y Caplet    1 to -0.1

Table 12.1. Mean Reversion Stochastic Volatility Summary

This first formulation turned out to be one of the most successful. Notice that the column headed 'Limit' represents the time span up until which the algorithm successfully calibrated. We have also included a column β which shows the range of values for this parameter in which the above formula successfully calibrated to the specified date.

A more careful analysis of the results obtained confirms the following:

· The algorithm seems to fail because the change in the gamma parameter for long products is too drastic.

· The extended version with beta appears much more flexible.

· The squared integral method allows for the greatest range of parameter alternatives whilst still reaching the same final length (in years) of calibration.

12.8.2 Case 2, Mean Reversion (12.14):

2nd Case             Limit
integral squared     2Y, 3Y
(integral) squared   5Y, 6Y

Table 12.2. Mean Reversion Stochastic Volatility Summary

12.8.3 Case 3, Square Root (12.30):

3rd Case    Limit
            2Y, 3Y
            5Y, 6Y

Table 12.3. Square Root Stochastic Volatility Summary – 10 and 20 packs

12.8.4 Case 5, Logarithmic (12.39):

5th Case    Limit                β           θ
            15Y, 20Y Swaption    1           -0.7 to -0.001
            15Y, 16Y Caplet      1.5         -0.03 to -0.035
            16Y, 17Y Caplet      2.8 to 3    -0.02 to -0.25

Table 12.4. Logarithmic Stochastic Volatility Summary

Clearly, this fifth formulation is the one that has been most successful, reaching one or two years further in the calibration than the 1st case formulation. Nevertheless, we were still incapable of calibrating the vanilla products to the final maturity of 20 years.

Having reached this point, we began to consider the possibility that perhaps it was not the 'Volatility of Volatilities' expression that was causing the failure to calibrate. We postulated the hypothesis that the key to the problem could perhaps be located elsewhere.

Indeed, we found a surprising feature when performing these tests. All the above expressions were obtained from joint calibrations of both Swaptions and Caplets, where the caplets were based on the 1 year EURIBOR. Since we had calibrated up until 16 years, we expected to find that our MonteCarlo would at least be able to calibrate any product under the 15 year mark. Contrary to our expectations, we found that there was a certain range of products that our model was incapable of calibrating, even in the case of shorter maturities. These were, specifically, joint calibrations of Swaptions and Caplets in which the Caplets were not based on the 1 year EURIBOR, but instead, on the 3 month or 6 month EURIBOR.

An important problem is the fact that in the cap market, forward rates are mostly semi-annual, whereas those entering the forward-swap-rate expressions are typically annual rates. Therefore, when considering both markets at the same time, we may have to reconcile volatilities of semi-annual forward rates and volatilities of annual forward rates. Otherwise, when we treat them as exact equals, the above calibration problems occur. And perhaps it is incorrect even to treat the swaptions and caplets both based on the 1 year EURIBOR as equals. This may be the underlying reason why we do not calibrate successfully.

This problem is what gives rise to a deeper study of the joint calibration. We will first analyse the possibility that the caplet market data themselves might be erroneous. This will involve an analysis of how the caplets are stripped from their corresponding cap values. This has been developed in detail in the Caplet Stripping Section 16. Once this has been implemented, we should ideally return to the 3 Strike analysis to see whether any improvements are obtained.

If this turns out not to be the true problem, we will perhaps need to somehow normalise the data of both caplets and swaptions. For this we refer the reader to [BM].


13. Analytic approximation

What we refer to as an analytic approximation is simply an approximate

formulation of the HJM framework. In other words, it is an approximation of the

model’s dynamics yielding a price for Swaptions which can be described through an

analytical formula that is very simple to implement numerically.

The need for an approximate formula arises from the fact that the HJM can be

very costly time-wise. It generally involves a huge number of path simulations which

can weigh heavily on any computer and even on a grid network. As seen in the

calibration section, every additional iteration requires n + 1 additional pricings so as

to calculate all the elements in the Jacobian.

If instead we were to work with an analytical approximation, we would stand in

a far stronger position due to the following main reasons:

The HJM model starts off its calibration with an arbitrary first guess for its parameters, which it then refines through a Newton-Raphson algorithm. This type of algorithm is characterised by converging badly and very slowly if the first guess is very far from the true solution. A huge leap forward in computation time would be achieved if the analytical approximation were to provide us with a good starting point.

Calculating a numerical Jacobian as is done in the calibration process is extremely

costly. We have already seen that we can freeze the model’s Jacobian so as to iterate in

more steps but without having to recalculate the Jacobian at each step. We could

further reduce calibration time if we were to use an analytical approximation Jacobian

that could be calculated mathematically and not through a finite difference ‘bumping’

procedure.

We therefore set out with the aim of deriving an approximate formula of a

swaption for the Santander HJM model.


As shall be demonstrated, we will start off by making an assumption for the dynamics of the forward swap rate

dS(t) = σ(t) [ α(t)S(t) + (1 − α(t))S(0) ] dW_t^P    (13.1)

We will use this formulation as an analytic approximation to the exact HJM expression. From this approximation we will derive its time-dependent parameters α(t), σ(t). The exact expressions for these will cover the central part of our research. We will subsequently need a formula to relate our new approximate expression's σ(t) and α(t) to the HJM model's σ_i(t) and α_i(t) (with i = 0, ..., n).

Secondly, we will use the technique of "averaging" developed by Piterbarg to convert our time dependent parameters α(t), σ(t) into a diffusion with time-independent parameters α, σ. This will convert our approximate formula into the form

dS(t) = σ [ αS(t) + (1 − α)S(0) ] dW_t^P    (13.2)

We will simply state the formulation to be used in this section, but will not analyse it in any further depth.

13.1 Formula Development

Let us recall two pivotal expressions in our HJM model:

Γ(t,T) = σ(t,T) ( α(t,T) log B(t,T) + (1 − α(t,T)) log( B(0,T)/B(0,t) ) )    (13.3)

dR(t,T) = (...) dt + σ(t,T) ( α(t,T)R(t,T) + (1 − α(t,T))R_f(t,T) ) dW_t^P    (13.4)

The first expression implies the second. Indeed, write

dB(t,T)/B(t,T) = r_t dt + Γ(t,T) dW_t^P    (13.5)

B(t,T) = e^{−R(t,T)(T−t)}    (13.6)

R(t,T) = −log B(t,T) / (T − t)    (13.7)

Applying Ito to R(t,T) we obtain:

dR(t,T) = −(1/(T−t)) (dB(t,T)/B(t,T)) + (1/(2(T−t))) Γ² dt − (1/(T−t)²) log B dt    (13.8)

dR(t,T) = (1/(T−t)) ( −r dt − Γ dW + ½Γ² dt ) + (1/(T−t)) R(t,T) dt    (13.9)

Separating into temporal and Brownian components

dR(t,T) = (1/(T−t)) ( R(t,T) − r + ½Γ² ) dt − (Γ/(T−t)) dW    (13.10)

Replacing the Γ in the Brownian part with (13.3)

dR(t,T) = (...) dt − (1/(T−t)) σ(t,T) ( α(t,T) log B(t,T) + (1 − α(t,T)) log( B(0,T)/B(0,t) ) ) dW    (13.11)

And applying the relationship between bonds and rates in (13.7)

dR(t,T) = (...) dt + (1/(T−t)) σ(t,T) ( α(t,T)R(t,T)(T−t) + (1 − α(t,T))R_f(t,T)(T−t) ) dW    (13.12)

where we have decided to call R_f the zero coupon rate forward

B(0,T)/B(0,t) = e^{−R_f(t,T)(T−t)}    (13.13)

13.1.1 Swaption Measure

Let us consider our receiver swaption with strike K and time schedule τ = {U_0, U_1, ..., U_n}. We are thus left with:

dR_i(t) = (...) dt + σ_i(t) ( α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ) dW_t^P    (13.14)

Where R_i(t) = R(t, U_i)

The swap rate forward of such a swap at time t is:

S(t) = ( B(t,U_0) − B(t,U_n) ) / ( Σ_{i=1}^n m_i B(t,U_i) )    (13.15)
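For concreteness, the swap rate forward (13.15) maps directly to code. This is a small sketch with assumed inputs (a list of discount factors and accrual fractions), not part of the original implementation:

```python
def swap_rate_forward(B, m):
    """Swap rate forward (13.15):
    S(t) = (B(t,U_0) - B(t,U_n)) / sum_{i=1..n} m_i B(t,U_i).
    B: discount factors B(t,U_0), ..., B(t,U_n); m: accruals m_1, ..., m_n."""
    annuity = sum(mi * Bi for mi, Bi in zip(m, B[1:]))
    return (B[0] - B[-1]) / annuity
```

On a flat continuously-compounded curve B(t,U) = e^{−r(U−t)} with annual accruals, S collapses to e^r − 1, the flat annually-compounded rate, which is a quick consistency check.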

13.2 Step 1

Under its annuity measure S(t) is a martingale. From the previous equation and under the HJM model, S(t) presents the following dynamics:

dS(t) = Σ_{i=0}^n (∂S(t)/∂R_i(t)) σ_i(t) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] dW_t^P    (13.16)

Proof:

By applying a multidimensional form of Ito to (13.15), which in turn can be rewritten as

S(t) = ( e^{−R(t,U_0)(U_0−t)} − e^{−R(t,U_n)(U_n−t)} ) / ( Σ_{i=1}^n m_i e^{−R(t,U_i)(U_i−t)} )    (13.17)

through the use of

B(t,T) = e^{−(T−t)R(t,T)}    (13.18)

We have thus obtained a multidimensional form of the more simplistic Ito equation

dS(t) = (∂S/∂R) dR + (∂S/∂t) dt + ½ (∂²S/∂R²) σ² dt    (13.19)

Imposing that S_t is a martingale, we can say that it follows a driftless process, meaning that all the terms in dt should cancel out. In this way we reach that

dS(t) = (∂S/∂R) dR    (13.20)

Where we have dR from before. Thus, by simply replacing dR and realizing that we have differentiated with respect to each of the R_i that we take from our continuous rate curve, we have

dS(t) = Σ_{i=0}^n (∂S(t)/∂R_i(t)) σ_i(t) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] dW_t^P

approximation, on the dynamics of the swap rate forward such that:

[ ] PtdWSttStttdS )0())(1()()()()( αασ −+= (13.21)

This is the approximation of the dynamics of St that we choose to use.

We will now need a formula to relate our new approximate expression’s σ(t) and

α(t) to the true HJM model’s σi(t) and αi(t) (with i = 0, ..., n)

Equations (13.16) and (13.21) give us the dynamics for S(t). The first is the HJM

formulation and the second is our newly developed approximation. We must

therefore find a suitable relationship between them.

Page 214: HJM Framework

Chapter 13 Analytic approximation

200

13.2.1 First Method

To get from (13.16) to (13.21), we impose two conditions. The first is that the two

equations should be equivalent along the forward path. Suppose Ri(t) = Rfi (t)

ni ,...,0∈∀ then S(t) = S(0). In fact, when Ri(t)=Rfi(t) we have),0(

),0(),(

tB

TBTtB = ,

since

),()(),( TtRtTeTtB −−= (13.22)

))(,(

),0(

),0( tTTtR f

etB

TB −−= (13.23)

Hence

∑∑==

−=

−=

n

i

ii

n

n

iii

n

tB

UBm

tB

UB

tB

UB

UtBm

UtBUtBtS

1

0

1

0

),0(

),0(),0(

),0(

),0(

),0(

),(

),(),()( (13.24)

By definition

)0(),0(

),0(),0()(

1

0 SUBm

UBUBtS

n

iii

n =−

=∑

=

(13.25)

So S(t) = S(0)

From this first condition we obtain, rewriting our two expressions: i.e. the

swaption rate forward dynamics from the HJM and the approximations standpoints:

[ ] Pt

Pt dWStdWSttStttdS )0()()0())(1()()()()( σαασ =−+= (13.26)

[ ] ∑∑== ∂

∂=−+∂∂=

n

i

if

ii

n

i

Pti

fiiii

i

tRttR

SdWtRttRtt

tR

tSdSt

00

)()()(

)0()())(1()()()(

)(

)()0()( σαασσ

(13.27)

Both the HJM and the approximate formulation must be equivalent. We can equate them, and solve for our approximation's σ(t):

σ(t) S(0) = Σ_{i=0}^n (∂S(0)/∂R_i) σ_i(t) R_f^i(t)    (13.28)

Which we write in a simplified manner as

σ(t) = Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) = Σ_{i=0}^n q_i(t) σ_i(t)    (13.29)

where in order to be able to solve, we must freeze the parameter R_i, this is

R_i = R_f^i,  i = 1, ..., n

And where

q_i(t) = (∂S(0)/∂R_i) (R_f^i(t)/S(0))    (13.30)
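The weights q_i(t) of (13.30) require the sensitivities ∂S(0)/∂R_i, which can be obtained in practice by bumping each rate. The sketch below assumes bonds of the form (13.18) and illustrative inputs; the finite-difference bump is our own choice, not necessarily the thesis implementation:

```python
import math

def q_weights(Rf, U, t, m):
    """Weights q_i of (13.30): q_i = (dS(0)/dR_i) * Rf_i / S(0).
    dS/dR_i is computed by finite-difference bumping of rate i;
    bonds follow B = e^{-R (U_i - t)} as in (13.18)."""
    def S(R):
        B = [math.exp(-Ri * (Ui - t)) for Ri, Ui in zip(R, U)]
        return (B[0] - B[-1]) / sum(mi * Bi for mi, Bi in zip(m, B[1:]))
    S0, h = S(Rf), 1e-6
    q = []
    for i in range(len(Rf)):
        bumped = list(Rf)
        bumped[i] += h
        q.append((S(bumped) - S0) / h * Rf[i] / S0)
    return q

def sigma_approx(q, sigmas):
    # sigma(t) = sum_i q_i(t) sigma_i(t)    (13.29)
    return sum(qi * si for qi, si in zip(q, sigmas))
```

The bump size h = 1e-6 is an assumption; in a production setting one would balance truncation against rounding error when choosing it.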

Thus, at this point, we have achieved an expression for our approximate σ(t) that is a function of parameters that are all known at present. It is important to notice that we will use the technique of freezing R_i to its R_f^i value in this and all subsequent approximation alternatives.

We proceed now to calculate an expression for α(t), having decided that the slope should agree along the forward path as well. We intuitively identify α(t) with the slope or skew of our HJM model, as has been seen in the HJM Section 8.5.2. Thus, ∀j, we analyse the slope by differentiating:

∂/∂R_j(t) ( dS(t) ) = ∂/∂R_j(t) ( Σ_{i=0}^n (∂S(t)/∂R_i(t)) σ_i(t) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] )

= Σ_{i=0}^n σ_i(t) (∂²S(t)/∂R_i(t)∂R_j(t)) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] + Σ_{i=0}^n (∂S(t)/∂R_i(t)) σ_i(t) ∂/∂R_j(t)[ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ]    (13.31)

where the derivative of (∂S(t)/∂R_i(t)) σ_i(t) α_i(t) R_i(t) with respect to R_j(t) is non-zero only when i = j. Thus in the last term we eliminate the "i" index and replace it by "j":

∂/∂R_j(t) ( dS(t) ) = Σ_{i=0}^n (∂²S(t)/∂R_i(t)∂R_j(t)) σ_i(t) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] + (∂S(t)/∂R_j(t)) σ_j(t) α_j(t)    (13.32)

And separately, we analyse the same slope in our approximate formulation. We have

∂/∂R_j(t) ( σ(t)[ α(t)S(t) + (1 − α(t))S(0) ] ) = σ(t)α(t) (∂S(t)/∂R_j(t))    (13.33)

σ does not differentiate with respect to R as it is made up of terms at time 0, and terms in R_f:

q_i(t)σ_i(t) = (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t)    (13.34)

Equating the HJM model's slopes and the approximation formula's slopes along the forward path, we obtain, ∀j,

Σ_{i=0}^n (∂²S(0)/∂R_i∂R_j) σ_i(t) R_f^i(t) + (∂S(0)/∂R_j) σ_j(t)α_j(t) = σ(t)α(t) (∂S(0)/∂R_j)    (13.35)

Version 1

Ignoring the second order derivatives and thus just taking a first order approach

σ(t)α(t) (∂S(0)/∂R_j) = σ_j(t)α_j(t) (∂S(0)/∂R_j)    (13.36)

Thus

σ(t)α(t) = σ_j(t)α_j(t)    (13.37)

Version 2

Σ_{i=0}^n (∂²S(0)/∂R_i∂R_j) σ_i(t) R_f^i(t) + (∂S(0)/∂R_j) σ_j(t)α_j(t) = σ(t)α(t) (∂S(0)/∂R_j)    (13.38)

This problem with second order considerations does not, in general, have a solution. We reformulate it in the least-squares sense: finding α(t) such that:

min_{α(t)} Σ_{j=0}^n ( σ(t)α(t) − σ_j(t)α_j(t) )²    (13.39)

13.3 Second Method

To equate the approximation and the HJM model, we impose two conditions: that the lognormal and normal terms should both independently be equal. This means that as we have

dS(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) [ α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ] dW_t^P    (13.40)

and

dS(t) = σ(t) [ α(t)S(t) + (1 − α(t))S(0) ] dW_t^P    (13.41)

then equating the lognormal and normal components

σ(t)α(t) S(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) R_i(t)    (13.42)

σ(t)(1 − α(t)) S(0) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)(1 − α_i(t)) R_f^i(t)    (13.43)

These equations should also agree along the forward path at R_i(t) = R_f^i(t) and S(t) = S(0). Solving the above equations, we obtain:

σ(t) = Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) = Σ_{i=0}^n q_i(t)σ_i(t)    (13.44)

α(t) = ( Σ_{i=0}^n (∂S(0)/∂R_i) σ_i(t)α_i(t) R_i(t) ) / ( Σ_{i=0}^n (∂S(0)/∂R_i) σ_i(t) R_f^i(t) )    (13.45)
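The second-method α(t) of (13.45) is a simple ratio of weighted sums. A minimal sketch follows; the sensitivity inputs dSdR are assumed precomputed (e.g. by bumping, as for σ(t)), and the rates are frozen at their R_f^i values:

```python
def alpha_approx(dSdR, sigmas, alphas, Rf):
    """Second-method alpha (13.45): a weighted composition of the alpha_i,
    with weights dS(0)/dR_i * sigma_i(t) * R_i(t), rates frozen at Rf_i."""
    num = sum(d * s * a * r for d, s, a, r in zip(dSdR, sigmas, alphas, Rf))
    den = sum(d * s * r for d, s, r in zip(dSdR, sigmas, Rf))
    return num / den
```

If all α_i coincide, the ratio returns that common value, as a weighted average should.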

Note that the formula for σ(t) is the same as the one derived from the first method. α(t) is now a weighted composition of all the α_i(t).

Note that another possible approach would have been to equate the terms in α(t) and those independent of α(t). We immediately encounter a problem if we pursue this approach, for we obtain

σ(t) S(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) R_i(t)    (13.46)

σ(t)α(t) ( S(t) − S(0) ) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) )    (13.47)

the first equation yields the same solution for σ(t) as in all the previous cases, but the second gives a problem of a division by 0 when S(t) = S(0):

α(t) = ( Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) ) ) / ( σ(t)( S(t) − S(0) ) )    (13.48)

We will see in the results section that the final formulation we retain for our

algorithm is that which is provided by the second method.


13.4 Step 2

Following Piterbarg (and we will give no further details), we have:

ᾱ = ∫_0^T w(t) α(t) dt    (13.49)

Where

∫_0^T w(t) dt = 1    (13.50)

How to choose w(t) is crucial. Piterbarg suggests that

w(t) = σ²(t) v²(t) / ∫_0^T σ²(t) v²(t) dt    (13.51)

With

v²(t) = ∫_0^t σ²(s) ds    (13.52)

Another test may be for example:

ᾱ = (1/(T²/2)) ∫_0^T t α(t) dt    (13.53)

For σ, we always choose the following:

σ̄² = (1/T) ∫_0^T σ²(t) dt    (13.54)


13.5 Swaption Valuation

We will now analyse how to adapt our approximate formulation to the valuation of a simple receiver swaption. We will make our derivation as generic as possible so that either of the two specific methods for the σ(t) and α(t) parameters derived can be equally applied. We shall see how we simply convert our approximation dynamics into a geometric Black Scholes form which we will then be able to solve in a straightforward manner.

A receiver swaption can be expressed as:

V_t(K) = K Σ_{i=1}^n m_i B(t;U_i) + B(t;U_n) − B(t;U_0) = Σ_{i=1}^n m_i B(t;U_i) [ K − S(t) ]    (13.55)

Where

S(t) = ( B(t,U_0) − B(t,U_n) ) / ( Σ_{i=1}^n m_i B(t,U_i) )    (13.56)

If we take N(t) = Σ_{i=1}^n m_i B(t,U_i) as the numeraire, S(t) will be a martingale under this probability. So we have that the price of a receiver swaption is:

Swaption(0; K) = Σ_{i=1}^n m_i B(0;U_i) E[ (K − S(t))⁺ ]    (13.57)

Under its annuity measure, our approximate formula yields

dS(t) = σ [ αS(t) + (1 − α)S(0) ] dW_t^P    (13.58)

Performing a change in variable

X(t) = S(t) + ((1 − α)/α) S(0)

We have dX(t) = dS(t), which we can replace for

dX(t) = dS(t) = σ [ αS(t) + (1 − α)S(0) ] dW_t^P    (13.59)

and again replacing S(t) we obtain

dX(t) = σ ( α( X(t) − ((1 − α)/α)S(0) ) + (1 − α)S(0) ) dW_t^P    (13.60)

Leaving

dX(t) = ασ X(t) dW_t^P    (13.61)

We have thus arrived at a simple geometric standard differential equation to which we can directly apply the Black Scholes formula

Swaption(0; K) = Σ_{i=1}^n m_i B(0;U_i) E^P[ (K' − X(t))⁺ ]    (13.62)

with

K' = K + ((1 − α)/α) S(0)

Applying the Black Scholes formula, we have:

Swaption(0; K) = Σ_{i=1}^n m_i B(0;U_i) [ K'N(−d₂) − X(0)N(−d₁) ]    (13.63)

where

d₁ = ( Ln( X(0)/K' ) + ½α²σ²T ) / ( ασ√T )    (13.64)

And

d₂ = d₁ − ασ√T    (13.65)
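Formulas (13.62)–(13.65) amount to a shifted-lognormal Black put. A self-contained sketch follows; the annuity is passed in as a single precomputed number and all inputs are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def receiver_swaption(annuity, S0, K, sigma, alpha, T):
    """Receiver swaption via (13.62)-(13.65):
    X(0) = S(0) + (1-alpha)/alpha * S(0), K' = K + (1-alpha)/alpha * S(0),
    price = annuity * (K' N(-d2) - X(0) N(-d1))."""
    shift = (1.0 - alpha) / alpha * S0
    X0, Kp = S0 + shift, K + shift
    v = alpha * sigma * math.sqrt(T)
    d1 = (math.log(X0 / Kp) + 0.5 * v * v) / v
    d2 = d1 - v
    return annuity * (Kp * norm_cdf(-d2) - X0 * norm_cdf(-d1))
```

With α = 1 the shift vanishes and we recover the plain lognormal Black put on S; small α pushes the dynamics towards the normal model, consistent with the skew interpretation of α(t) above.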

13.6 Approximation Conclusion

We have implemented and tested the two methods extensively for up to 25 year calibrations with a maximum of 100,000 simulations. One strike calibrations build on the same form for σ(t), which always works well, provided that the HJM model also finds a solution. The difference between the two methods lies therefore in the α(t) implementation. The first approach presents difficulties with certain calibrations. Despite giving good Jacobians, it requires many MonteCarlo iterations. The second method proves much more robust. Furthermore, its formulas for σ(t) and α(t) are very simple. We favour them in particular because they both seem to be weighted averages of the σ_i(t) and α_i(t).

The line of research followed to this point in the development of an approximate formulation seems to be completely compatible with an extension to the 3 Strikes model.

13.7 Alternative Point of Calculation

We now attempt to calculate our approximate formula at a different point from the original idea of taking R_i(t) = R_f^i(t).

Previously, we had arranged for S(t) = S(0), i.e. always 'at the money'. We will now examine the possibility of imposing this equality at a different point. We had:

S(t) = ( B(t,U_0) − B(t,U_n) ) / ( Σ_{i=1}^n m_i B(t,U_i) ) = ( e^{−R_0(t)(U_0−t)} − e^{−R_n(t)(U_n−t)} ) / ( Σ_{i=1}^n m_i e^{−R_i(t)(U_i−t)} )    (13.66)

S(0) = ( B(0,U_0) − B(0,U_n) ) / ( Σ_{i=1}^n m_i B(0,U_i) ) = ( e^{−R_f^0(t)(U_0−t)} − e^{−R_f^n(t)(U_n−t)} ) / ( Σ_{i=1}^n m_i e^{−R_f^i(t)(U_i−t)} )    (13.67)

Imagine that we search for

R_i*(t) = R_f^i(t),  i ≠ 0
R_0*(t) = R_f^0(t) + ε

then

S*(t) = ( e^{−(R_f^0(t)+ε)(U_0−t)} − e^{−R_f^n(t)(U_n−t)} ) / ( Σ_{i=1}^n m_i e^{−R_f^i(t)(U_i−t)} )    (13.68)

Imagine that we now want S(t) = S*(t). Then by dividing the previous equations we obtain

S(0)/S*(t) = ( e^{−R_f^0(t)(U_0−t)} − e^{−R_f^n(t)(U_n−t)} ) / ( e^{−(R_f^0(t)+ε)(U_0−t)} − e^{−R_f^n(t)(U_n−t)} ) = ( B(0,U_0) − B(0,U_n) ) / ( B(0,U_0) e^{−ε(U_0−t)} − B(0,U_n) )    (13.69)

B(0,U_0) e^{−ε(U_0−t)} = (S*(t)/S(0)) B(0,U_0) + ( 1 − S*(t)/S(0) ) B(0,U_n)    (13.70)

ε = −(1/(U_0−t)) Ln( S*(t)/S(0) + ( 1 − S*(t)/S(0) ) B(0,U_n)/B(0,U_0) )    (13.71)
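The shift ε of (13.71) is a one-liner; the sketch below uses illustrative argument names for the two bonds and the target swap rate:

```python
import math

def epsilon_shift(S_star, S0, B0_U0, B0_Un, U0, t):
    """Shift epsilon of (13.71): the bump on R_f^0 that moves the swap rate
    forward from S(0) to the target S*(t)."""
    ratio = S_star / S0
    return -math.log(ratio + (1.0 - ratio) * B0_Un / B0_U0) / (U0 - t)
```

As noted below, S*(t) = S(0) gives ε = 0 and recovers the original model; a target below S(0) requires a positive bump on R_f^0.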

Note that if we take S*(t) = S(0), this model yields ε = 0, which brings us back to the model we had initially.

We will show in the results section that this approximation method in ε yields optimal results for calibrations that are performed at the money. This appears to be quite logical, as the general level of volatility σ is best defined at the money, and so calibrating it at any other point S*(t) does not seem as appropriate.

13.8 Two Factors

The development of an analytic approximation for the two factor HJM model is completely analogous to the one factor case. Its HJM formulation is expressed as

dS(t) = Σ_{i=0}^n (∂S(t)/∂R_i(t)) σ_i(t) ( α_i(t)R_i(t) + (1 − α_i(t))R_f^i(t) ) ( sin θ_i(t) dW_1^P(t) + cos θ_i(t) dW_2^P(t) )    (13.72)

Notice that the only real difference with respect to the one factor model is the sine and cosine coefficients which have been included at the end of the expression with respect to the two different Brownian motions.

Because our model produces a skew, we can make an assumption for the dynamics of the swap rate forward (once again, analogous to what had been previously developed), and simply add sine and cosine coefficients with respect to the two different Brownian motions.

dS(t) = σ(t) [ α(t)S(t) + (1 − α(t))S(0) ] ( sin θ(t) dW_1^P(t) + cos θ(t) dW_2^P(t) )    (13.73)

Solving the above equations by freezing R_i(t) = R_f^i(t) and S(t) = S(0), we obtain:

σ(t) = √[ ( Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) cos θ_i(t) )² + ( Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) sin θ_i(t) )² ]    (13.74)
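Relation (13.74) combines the one factor weights with the angles θ_i. As a sketch, with the q_i of (13.30) assumed precomputed:

```python
import math

def sigma_two_factor(q, sigmas, thetas):
    """Two factor sigma(t) of (13.74):
    sqrt((sum q_i s_i cos th_i)^2 + (sum q_i s_i sin th_i)^2)."""
    c = sum(qi * si * math.cos(th) for qi, si, th in zip(q, sigmas, thetas))
    s = sum(qi * si * math.sin(th) for qi, si, th in zip(q, sigmas, thetas))
    return math.hypot(c, s)
```

When all θ_i coincide, the two sums collapse and σ(t) reduces to |Σ q_iσ_i|, the one factor expression up to sign.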

The above can also be attained by following a parallel approach. Separation of α-dependent and α-independent terms gives

σ(t) S(t) sin θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) R_i(t) sin θ_i(t)    (13.75)

σ(t) S(t) cos θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) R_i(t) cos θ_i(t)    (13.76)

σ(t)α(t) ( S(t) − S(0) ) sin θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) ) sin θ_i(t)    (13.77)

σ(t)α(t) ( S(t) − S(0) ) cos θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) ) cos θ_i(t)    (13.78)

and by applying trigonometry to the first two equations (13.75) and (13.76)

σ(t) = √[ ( Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) cos θ_i(t) )² + ( Σ_{i=0}^n (∂S(0)/∂R_i) (R_f^i(t)/S(0)) σ_i(t) sin θ_i(t) )² ]    (13.79)

Dividing the first two equations we also obtain

tan θ(t) = ( Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) R_i(t) sin θ_i(t) ) / ( Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t) R_i(t) cos θ_i(t) )    (13.80)

Recall however that this methodology presented a difficulty when solving for alpha in the one factor case. Here we are faced with the same problem. Squaring the last two equations (13.77) and (13.78) to eliminate sines and cosines, the expression obtained can be solved for α(t). However, it involves a division by (S(t) − S(0)) in the denominator, which yields 0 for S(t) = S(0) and thus makes the ratio explode towards infinity. Further, this is impossible to solve for as S(t) is stochastic:

α²(t) = ( Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) ) sin θ_i(t) )² / ( σ²(t)( S(t) − S(0) )² ) + ( Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) ( R_i(t) − R_f^i(t) ) cos θ_i(t) )² / ( σ²(t)( S(t) − S(0) )² )    (13.81)

We seek now to find alpha through a different approach. If we proceed as in the one factor case, we can firstly equate our two expressions through their Brownian motions, and secondly, we can equate them further, as we already did before, via their normality and lognormality:

σ(t)α(t) S(t) sin θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) R_i(t) sin θ_i(t)    (13.82)

σ(t)α(t) S(t) cos θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)α_i(t) R_i(t) cos θ_i(t)    (13.83)

σ(t)(1 − α(t)) S(0) sin θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)(1 − α_i(t)) R_f^i(t) sin θ_i(t)    (13.84)

σ(t)(1 − α(t)) S(0) cos θ(t) = Σ_{i=0}^n (∂S(t)/∂R_i) σ_i(t)(1 − α_i(t)) R_f^i(t) cos θ_i(t)    (13.85)

The main problem we encounter at this stage is that we have three unknowns, σ(t), α(t) and θ(t), but four equations.

The addition of a fifth trigonometric relationship must be approached carefully as

it involves squares and roots that enforce the sign on some of our parameters. We add

$$\sin^2\theta(t)+\cos^2\theta(t)=1\quad(13.86)$$

The system is clearly over-determined, and a preferential choice of one combination of solutions over another is not evident. We proceed to derive a range of alternatives, which we subsequently tested.

We could solve for alpha in the first two equations (13.82) and (13.83), obtaining

$$\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)}\qquad\text{or}\qquad\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)}\quad(13.87)$$

The problem is that we do not know which one of the two to use.

We could attempt some sort of mean:

$$\alpha(t)=\frac{1}{2}\left[\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)}+\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)}\right]\quad(13.88)$$

This, however, proves not to work well.

If instead we relate (13.82) and (13.83) through trigonometry

$$\alpha(t)=\frac{1}{\sigma(t)\,S(t)}\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2}\quad(13.89)$$

$$\alpha(t)=\frac{\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2}}{\dfrac{S(t)}{S(0)}\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}}\quad(13.90)$$

We find that the main problem with this approach is that the square root constrains our solutions for α to the positive half-plane, whereas we have seen in the one factor model that α can be both positive and negative.

We also attempted to use the expression for alpha that was developed in the one factor case (13.45), that is, an expression that is theta independent.

$$\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}}\quad(13.91)$$

Surprisingly enough, we found that, although inconsistent with the two factor

formulas developed, this expression for alpha proved extremely effective.


We do realize however, that as in the one factor case, α turns out to be a mean of

all the αi. We could therefore attempt to use other averages which cannot be derived

mathematically from the above equations, such as:

$$\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)+\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)+\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)}\quad(13.92)$$

We have found that we obtain even better results with an expression of the form

$$\alpha(t)=\left[\frac{\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2}{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}\right]^{1/2}\quad(13.93)$$

Indeed, the former proves to be one of our favourite candidates for the analytic approximation, its main drawback clearly being that it cannot be derived from the initial equations. This means that an extension of the analytic approximation to the three strike scenario would rely more on the quant's insight in constructing an appropriate mean for alpha than on a logical follow-through of mathematical formulas.

We therefore decide to persist in our search for a more logical expression.

If we decide instead to take the last two equations (13.84) and (13.85), we would

find:

$$\sigma^2(t)\left(1-\alpha(t)\right)^2S(0)^2=\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2\quad(13.94)$$


$$\alpha(t)=1-\frac{\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}}{\sigma(t)\,S(0)}\quad(13.95)$$

$$\alpha(t)=1-\frac{\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}}{\sqrt{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}}\quad(13.96)$$

However, this method restricts our values of α, forcing them to be smaller than one, which we saw in the one factor case does not always hold true.

We finally come across the best solution: one that attempts to encompass all four of the initial equations, and that is not 'intuitively' guessed and constructed by the quant as a mean.

Let us start by developing the above expression (13.94) concerning the last two

equations (13.84) and (13.85)

$$\sigma^2(t)\left(1-\alpha(t)\right)^2S(0)^2=\sigma^2(t)\left(\alpha^2(t)-2\alpha(t)+1\right)S(0)^2=E^2+F^2\quad(13.97)$$

where

$$E=\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t),\qquad F=\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)$$

Let us recall that a similar approach with the first two equations gave


$$\sigma^2(t)\,\alpha^2(t)\,S^2(t)=C^2+D^2\quad(13.98)$$

where

$$C=\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t),\qquad D=\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)$$

Therefore

$$-2\sigma^2(t)\,\alpha(t)=\frac{E^2+F^2}{S(0)^2}-\sigma^2(t)\,\alpha^2(t)-\sigma^2(t)=\frac{E^2+F^2}{S(0)^2}-\frac{C^2+D^2}{S(0)^2}-\sigma^2(t)\quad(13.99)$$

where in the last step S(t) has been frozen to S(0).

Remember that we had taken a different expression for σ

$$\sigma(t)=\frac{1}{S(0)}\sqrt{A^2+B^2}\quad(13.100)$$

where

$$A=\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t),\qquad B=\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)$$

Leaving

$$\alpha(t)=\frac{1}{2\sigma^2(t)}\left(\sigma^2(t)+\frac{C^2+D^2}{S(0)^2}-\frac{E^2+F^2}{S(0)^2}\right)\quad(13.101)$$

The above simplifies to

$$\alpha(t)=\frac{AC+BD}{A^2+B^2}\quad(13.102)$$

Or, in its extended full version, where, as always, in order to solve we must freeze the R_i to R_i^f:


$$\alpha(t)=\frac{\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)+\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)}{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}\quad(13.103)$$

Of all the alternatives, this last expression proved to be the fastest in calibrations, was consistent with the mathematical developments, and always calibrated whenever a solution also existed through MonteCarlo simulation of the HJM.
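The selected expression (13.102), α(t) = (AC + BD)/(A² + B²), is cheap to evaluate once the per-factor parameters are known. A minimal numerical sketch, assuming the reconstruction given here and using hypothetical inputs (all names and values are illustrative):

```python
import math

def alpha_projection(sig, alp, R, Rf, dS_dR, dS0_dR, theta):
    """alpha(t) = (A*C + B*D)/(A^2 + B^2): A, B collect the frozen-rate
    ('normal') sums, C, D the rate-dependent ('lognormal') sums."""
    A = sum(s * rf * d0 * math.cos(th) for s, rf, d0, th in zip(sig, Rf, dS0_dR, theta))
    B = sum(s * rf * d0 * math.sin(th) for s, rf, d0, th in zip(sig, Rf, dS0_dR, theta))
    C = sum(s * a * r * d * math.cos(th) for s, a, r, d, th in zip(sig, alp, R, dS_dR, theta))
    D = sum(s * a * r * d * math.sin(th) for s, a, r, d, th in zip(sig, alp, R, dS_dR, theta))
    return (A * C + B * D) / (A * A + B * B)

# Sanity check: if every alpha_i equals the same value a, and the rates
# and sensitivities are frozen (R = Rf, dS/dR = dS(0)/dR), the
# projection returns a itself.
sig = [0.2, 0.25]
alp = [0.7, 0.7]
Rf = [0.04, 0.05]
dS0 = [0.5, 0.5]
theta = [0.1, 0.6]
a = alpha_projection(sig, alp, Rf, Rf, dS0, dS0, theta)
assert abs(a - 0.7) < 1e-12
```

The degenerate check above is also a useful unit test for any implementation: the projection must behave as a true weighted mean of the α_i.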

13.9 Use of ‘No Split’

To this point we have always been considering split processes. That is to say, we

have always been considering joint calibrations of two vanilla products at a time. As

examined previously, this was necessary so as to obtain an intersection of the two

solution curves.

A ‘no split’ process is a bulk calibration procedure. We no longer calibrate the

vanilla products by pairs of equal maturity, but instead take the entire range of

maturities and calibrate them together.

This procedure is much more computationally time consuming, as it is much more difficult for the algorithm to converge with so many parameters considered at once.

The analytic approximation however is capable of arriving at a rapid solution

when it deals with so many products at once. And the more surprising fact is that if

we use the analytic approximation’s solution as a first guess to then perform a no split


calibration through the HJM MonteCarlo process, we find that the calibration becomes much faster. In other words, the no split MonteCarlo was previously extremely tedious on its own; with this new procedure, in which the no split MonteCarlo starts from a no split approximation solution, we achieve much more rapid results than with the equivalent split approximation followed by a split MonteCarlo.

We have further noticed that there are specific cases in which, with the no split procedure and the analytic approximation, we are capable of solving calibrations performed exclusively on caplets, whereas identical calibrations using the split method find no solution: refer to the Analytic Approximation Results (Section 14) and the Calibration Set Interpolation Matrix section.
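The two-stage idea, solving the cheap approximation first and starting the expensive no split MonteCarlo calibration from that solution, can be sketched generically. The objective functions below are stand-ins, not the actual pricers:

```python
def newton_1d(f, x0, tol=1e-10, max_iter=100):
    """Plain Newton iteration with a numerical derivative; returns the
    root and the number of iterations used."""
    x, h = x0, 1e-6
    for n in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, n
        dfx = (f(x + h) - fx) / h
        x -= fx / dfx
    return x, max_iter

cheap = lambda x: x**2 - 2.0            # stand-in analytic approximation
costly = lambda x: x**2 - 2.0 + 1e-4*x  # stand-in MonteCarlo objective

# Stage 1: solve the cheap proxy from a crude initial guess.
guess, _ = newton_1d(cheap, 10.0)
# Stage 2: polish with the costly objective, warm-started at the proxy root.
root, iters_warm = newton_1d(costly, guess)
_, iters_cold = newton_1d(costly, 10.0)
assert iters_warm <= iters_cold  # the warm start saves costly iterations
```

The saving reported in the text comes precisely from stage 2 needing far fewer of the expensive MonteCarlo iterations when started near the solution.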


14. Analytic Approximation Results

14.1 1 Factor Model

We now proceed to analyse the results obtained from the tests performed on the

analytic approximation. Both of the proposed methods in the previous section work

well on a wide range of tests. However, the first approach for alpha rapidly ceases to

calibrate past the 5 to 10 year maturity mark.

Our great achievement lies in the fact that the second approximation proves always capable of calibrating whenever the HJM MonteCarlo is also capable of converging for a given data set. Even if no exact MonteCarlo solution exists, we find that the analytic approximation still often provides us with a result.

Therefore, having decided on the second method as our final expression for alpha

and sigma for our analytic approximation, we continue to use the graphical analysis

tool developed earlier in the project. This will give us good visual confirmation of how

the analytic approximation is performing, and will strengthen our understanding of how it works.

We proceed to compare the HJM MonteCarlo solutions with the analytic

approximation solutions to confirm their similarity. Thus we will verify that the

analytic approximation solution is a very good first guess for the HJM.


14.1.1 Testing

The analytic approximation was submitted to an exhaustive series of tests. In

these, we attempted to make sure that the analytic approximation responded correctly

to any possible scenario. We therefore tested how it reacted at different maturities and

at different strikes.

We note firstly that the analytic approximation acts more or less as a tangent to

the real HJM MonteCarlo solution curve. Further, the analytic approximation always

shows monotonic behaviour, whereas the MonteCarlo solution clearly does not.

[Figure: "Approximation at High Strikes", Sigma plotted against Alpha [%]; curves: MC K = 0.052, MC K = 0.05, Approximation K = 0.052, Approximation K = 0.05]

Fig. 14.1. Analytic approximation at high strikes

The first important thing to notice is therefore that the analytic approximation acts as a tangent in the region where the final specific solution is achieved, which makes it useful for our study, as it adapts well to the MonteCarlo simulation.


[Figure: "Analytic Approximation Dynamics", Sigma plotted against Alpha [%]; curves: MC K = 0.03, MC K = 0.06, Analytic Approximation K = 0.03, Analytic Approximation K = 0.06]

Fig. 14.2. Analytic approximation at distant strikes

The next thing we must state is that the analytic approximation is not always a perfect tangent. Notice in the above graph how it adapts well to either side of the HJM MonteCarlo hump. However, for 'at the money' values, that is, at the maximum point of the MonteCarlo solution curve in the graph below, the tangent should be flat. Instead, the analytic approximation behaves only approximately as a tangent rather than as a curve with a unique point of contact. Further, its gradient is not unique, as would be the case for a real tangent.


[Figure: "Approximation Tangent 'at the money'", Sigma plotted against Alpha [%]; curves: MC ATM K = 4.2%, MC ATM K = 4.3%, Approximation ATM K = 4.2%, Approximation ATM K = 4.3%]

Fig. 14.3. Analytic approximation acting as a tangent 'at the money'

The need to test further approximations arises explicitly because the visual similarity between the analytic approximation and the HJM MonteCarlo curves is not exact (as can be seen in the figure below). For certain strikes that are very far from 'at the money', the analytic approximation can look quite different.

Note that we proceed to investigate other possibilities simply to see if any further

optimisation can be achieved. But we must state confidently that the approximation at

this level already calibrates extremely rapidly. Despite the difference in slopes,

because the analytic approximation’s gradient is much more pronounced, it actually

converges more rapidly towards the final solution than other better-fitted alternatives. Further, despite the slope difference, the two solutions (MonteCarlo and analytic approximation) remain extremely close together.


[Figure: "Approximation at Distant Strikes", Sigma plotted against Alpha [%]; curves: MC K = 0.03, MC K = 0.04, Approximation K = 0.03, Approximation K = 0.04]

Fig. 14.4. Analytic approximation presents difficulties in adjusting to the curve at distant strikes

14.1.2 Use of the Epsilon Approximation.

As stated in the previous chapter, the epsilon approximation allows us to alter the point of study. However, from the tests performed on both the epsilon-adjusted and non-adjusted formulations, we can conclude the following:

Calibrating at the money is a consistent, robust approach, as it settles on a very good average value for the volatility level.

Calibrating at the precise strike under consideration is very unstable. This is

because, as we have to calibrate two products at different strikes, on calibrating we

must continually jump from one of their strikes to the other’s. This is a source of

instability.

The logical next approach would be to calibrate at a fixed intermediate point

between the two products’ strike. Having a fixed position reduces instability, and

proves to generate very good results. However the improvement over calibrating ‘at

the money’ is not substantial. Furthermore, instability can still arise because the

calibration set can still have different pairs of strikes for different pairs of products.


That is, at different maturities the average strike also fluctuates, introducing instability and making the calibration process more difficult.

We therefore decide to maintain the calculations performed ‘at the money’ and so

do not pursue the epsilon approach any further.

14.1.3 Sigma and Alpha adjusting

Further corrections were performed on the alpha and the sigma parameters.

These were hard-coded and tested manually, with no logic behind them other than trial and error, checking visually whether the graphical output resembled the MonteCarlo.

We found that inserting factors in front of the alpha expression added no further

improvements.

Adjustments in the sigma on the other hand could improve the fit.

Indeed, only an extremely small adjustment to the factor was needed to produce a noticeable visual difference in the analytic approximation results. See below how a

constant factor could greatly improve fits for large strikes, but at the same time

impoverish the adjustment at low strikes.

[Figure: "Epsilon Correction at High Strikes", Sigma plotted against Alpha [%]; curves: MC K = 0.06, MC K = 0.03, Approximation K = 0.06, Approximation K = 0.03]

Fig. 14.5. Analytic approximation corrected in sigma at high strikes

Similarly, modifying this factor we could obtain a good adjustment at low strikes,

but would then lose accuracy at high strikes:

[Figure: "Epsilon Correction at Low Strikes", Sigma plotted against Alpha [%]; curves: MC K = 0.06, MC K = 0.03, Approximation K = 0.06, Approximation K = 0.03]

Fig. 14.6. Analytic approximation corrected in sigma for low strikes

A constant factor therefore could not be used, and so we created a factor that varies with the strike.

14.1.4 Global Adjustment Factor

$$\sigma^*=\sigma\cdot\left(0.99+0.04\,\frac{FWD-K}{FWD}\right)\quad(14.1)$$

The above is an example of a good adjustment factor for the calibration set that

we were considering. See below how visually, the adjustment is slightly enhanced.

The main thing to notice here is the fact that we now obtain a very good fit at both

high and low strike values.
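Applying the strike-dependent factor is straightforward. A minimal sketch, assuming the form of (14.1) reconstructed here; the coefficients 0.99 and 0.04 are the ones quoted for this particular calibration set:

```python
def adjusted_sigma(sigma, strike, forward, a=0.99, b=0.04):
    """Strike-dependent correction sigma* = sigma*(a + b*(FWD - K)/FWD).

    a and b are the values quoted for the calibration set studied in
    the text; other sets were found to need slightly different values."""
    return sigma * (a + b * (forward - strike) / forward)

# At the money (K = FWD) the correction reduces to the constant a.
assert abs(adjusted_sigma(0.2, 0.04, 0.04) - 0.2 * 0.99) < 1e-15
# Below the forward the multiplier rises above a; above it, it falls.
assert adjusted_sigma(0.2, 0.03, 0.04) > adjusted_sigma(0.2, 0.05, 0.04)
```

With (FWD − K)/FWD varying over the strikes tested, the multiplier stays in the narrow band around one mentioned later (roughly 0.97 to 1.01).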


[Figure: "Sigma Factor Varying with Strike", Sigma plotted against Alpha [%]; curves: MC K = 0.06, MC K = 0.03, Approximation K = 0.06, Approximation K = 0.03]

Fig. 14.7. Analytic approximation with a varying sigma correction

Despite the evident visual improvement, we must state the following. The

computational time difference between using the plain analytic approximation and

this improved method is very slight, amounting to a difference of only one or two

analytic approximation iterations. The subsequent HJM iteration counts are generally

equal between the two methods (recall that it is these HJM iterations that account for

the principal time consumption in the process).

There is a further statement that we must clearly point out in case any future

developments are pursued along this line. The improvement factor is specific for a

given maturity. That is, the close-to-perfect fit that we have achieved with the above

formulation is specific to a caplet fixing 6 years after settlement and expiring three

months after that. Tests performed on products with different maturities turned out to

require slightly different sigma adjustment factors.

If this were to be pursued further, the final factor would therefore necessarily

have to be a function of

F(fixing, maturity, strike, forward)
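A hypothetical interface for such a factor might look as follows. The coefficients, and their dependence on fixing and maturity, are placeholders: the text only establishes which arguments the function must take:

```python
def sigma_factor(fixing, maturity, strike, forward):
    """Hypothetical general correction factor F(fixing, maturity, strike,
    forward). The constants below are placeholders; in a full treatment
    they would themselves be functions of (fixing, maturity)."""
    a, b = 0.99, 0.04
    return a + b * (forward - strike) / forward

# At the money the placeholder factor reduces to the constant a.
atm_factor = sigma_factor(6.0, 6.25, 0.04, 0.04)
assert abs(atm_factor - 0.99) < 1e-15
```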


14.1.5 Use of the qi normalisation approach.

We attempted a final approximation that initially appeared reasonable. Notice how we have always been altering the sigma factor by minimal amounts: according to the final formula (14.1), these values typically range between 0.97 and 1.01. Furthermore, we realised that the sigma weighting was not exact. That is, recalling its expression from the previous section:

$$\sigma(t)=\sum_{i=0}^{n}q_i(t)\,\sigma_i(t),\qquad q_i(t)=\frac{R_i^f(t)}{S(0)}\,\frac{\partial S(0)}{\partial R_i^f(t)}\quad(14.2)$$

$$\alpha(t)=\sum_{i=0}^{n}p_i(t)\,\alpha_i(t),\qquad p_i(t)=\frac{q_i(t)\,\sigma_i(t)}{\sum_{i=0}^{n}q_i(t)\,\sigma_i(t)}\quad(14.3)$$

$$\sum_{i=0}^{n}q_i(t)\approx 1\quad(14.4)$$

$$\sum_{i=0}^{n}p_i(t)=1\quad(14.5)$$

Now the alpha term is an exact average but we see that the sigma is not. We

therefore decided to normalise all sigma terms using the following expression:

$$\tilde q_i(t)=\frac{q_i(t)}{\sum_{i=0}^{n}q_i(t)},\qquad \sigma(t)=\sum_{i=0}^{n}\tilde q_i(t)\,\sigma_i(t)\quad(14.6)$$

Results conclusively signalled that this approach worsened calibrations.
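The weighting scheme of (14.2) to (14.5), under the reconstruction given here, can be sketched as follows. The inputs are hypothetical; q_i uses the frozen rates R_i^f and the frozen sensitivities ∂S(0)/∂R_i^f:

```python
def sigma_alpha_weights(sig_i, alpha_i, Rf_i, dS0_dRf_i, S0):
    """q_i = (R_i^f / S(0)) * dS(0)/dR_i^f ; sigma(t) = sum q_i sigma_i.
    p_i renormalises by sigma so that alpha(t) = sum p_i alpha_i is an
    exact weighted average (sum p_i = 1), whereas sum q_i is only ~ 1."""
    q = [rf * d / S0 for rf, d in zip(Rf_i, dS0_dRf_i)]
    sigma = sum(qi * si for qi, si in zip(q, sig_i))
    p = [qi * si / sigma for qi, si in zip(q, sig_i)]
    alpha = sum(pi * ai for pi, ai in zip(p, alpha_i))
    return sigma, alpha, q, p

sigma, alpha, q, p = sigma_alpha_weights(
    [0.2, 0.3], [0.4, 0.8], [0.04, 0.05], [0.6, 0.45], 0.045)
assert abs(sum(p) - 1.0) < 1e-12   # p is an exact weighting by construction
assert 0.4 <= alpha <= 0.8         # so alpha is a true mean of the alpha_i
```

Replacing q by the normalised weights of (14.6) is a one-line change (divide each q_i by sum(q)); as the text reports, that variant conclusively worsened calibrations.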

14.2 Analytic Approximation Jacobian

Another important factor to take into account at this stage is the Jacobian, i.e. the slopes that the analytic approximation generates. Recall that one of the aims of our

analytic approximation was to be able to use its analytical Jacobian instead of having

to recalculate it numerically through the HJM calibration process. This would greatly


reduce computation times. A first, straightforward method to confirm the similarity in

Jacobians is by comparing the slopes graphically.

From a distant perspective we can see below that the behaviour of the two surfaces can appear to be quite different. Note that the analytic approximation's solution is monotonic whereas the HJM has a more peculiar form. However, in the region of interest, around the solution curve, the slopes are actually very similar.

[Figure: 3-D surface of Model - Market Price against Sigma and Alpha [%], titled "HJM MonteCarlo Slopes", with the solution curve marked]

Fig. 14.8. HJM MonteCarlo slopes and solution curve


[Figure: 3-D surface of Model - Market Price against Sigma and Alpha [%], titled "Analytic Approximation Slopes", with the solution curve marked]

Fig. 14.9. Analytic approximation slopes and solution curve

If we were to make a closer inspection of the region of interest where the final

solution occurs, we would find something of the following form:

[Figure: close-up of the 3-D surface of Model - Market Price against Sigma and Alpha [%], titled "HJM MonteCarlo Slopes", with the solution curve marked]

Fig. 14.10. Close-up on HJM MonteCarlo's slopes and solution curve


[Figure: close-up of the 3-D surface of Model - Market Price against Sigma and Alpha [%], titled "Analytic Approximation Slopes", with the solution curve marked]

Fig. 14.11. Close-up on analytic approximation's slopes and solution curve

We see that our well known solution curves are still the intersection of the Ω

surface of prices with the horizontal axis. Once again, we distinguish the analytic

approximation as the tangent of the HJM MonteCarlo curve. Yet now, with the 3

dimensional view, we are capable of appreciating the differences between the slopes:

there is a slight difference in concavity, and the HJM solution flattens out below the

horizontal axis much sooner than the analytic approximation.

Nevertheless, the similarity is sufficient to enable us to use the analytic

approximation’s Jacobian as the HJM model’s Jacobian in any iteration process within

the specified region.
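Reusing the approximation's slope inside the HJM iteration amounts to a quasi-Newton scheme: the residual is evaluated with the costly model while the Jacobian comes from the cheap proxy. A one-dimensional stand-in sketch (the real system iterates on the pair (sigma, alpha), and the objectives below are not the actual pricers):

```python
def solve_with_proxy_jacobian(costly_f, proxy_df, x0, tol=1e-9, max_iter=50):
    """Newton-type iteration: value from the costly model, slope from
    the cheap proxy. Converges as long as the proxy slope is close
    enough to the true one in the region of interest."""
    x = x0
    for _ in range(max_iter):
        fx = costly_f(x)
        if abs(fx) < tol:
            return x
        x -= fx / proxy_df(x)
    raise RuntimeError("no convergence")

costly_f = lambda x: x**3 - 8.0          # stand-in HJM price difference
proxy_df = lambda x: 3.0 * x * x + 0.1   # slightly 'wrong' analytic slope

root = solve_with_proxy_jacobian(costly_f, proxy_df, 3.0)
assert abs(root - 2.0) < 1e-6
```

The mild concavity mismatch seen in the plots plays the role of the small slope error above: convergence survives it provided the iteration stays inside the region where the two surfaces agree.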

14.3 2 Factor Analytic Approximation

We set out in our analysis to test all of the possible alternatives for alpha. Recall

that the expression for sigma was the same in all cases. Many of the candidates

dropped out straight away because they were unable to advance at all in any given

calibration. The remaining candidates were thus:


$$\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)+\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)+\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)}\quad(14.7)$$

$$\alpha(t)=\left[\frac{\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)^2}{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}\right]^{1/2}\quad(14.8)$$

$$\alpha(t)=1-\frac{\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\cos\theta_i(t)\right)\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)+\left(\sum_{i=0}^{n}\sigma_i(t)\left(1-\alpha_i(t)\right)R_i^f(t)\,\frac{\partial S(t)}{\partial R_i(t)}\sin\theta_i(t)\right)\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)}{\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\cos\theta_i(t)\right)^2+\left(\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}\sin\theta_i(t)\right)^2}\quad(14.9)$$

$$\alpha(t)=\frac{\sum_{i=0}^{n}\sigma_i(t)\,\alpha_i(t)\,R_i(t)\,\frac{\partial S(t)}{\partial R_i(t)}}{\sum_{i=0}^{n}\sigma_i(t)\,R_i^f(t)\,\frac{\partial S(0)}{\partial R_i(t)}}\quad(14.10)$$

These four alternatives were all capable of calibrating with relative ease any small

set with maturities reaching 15 years. To differentiate which of the candidates we

would finally select, we were thus forced to submit them to more extreme calibrations.

When testing on 18 year calibrations with 34 Swaptions and 15 correlations, the

third and fourth of the above formulas, (14.9) and (14.10), already proved to be much

faster than the rest. The third was capable of calibrating without splitting, in 4


iterations, whereas the first two alternatives required up to 17 iterations and were therefore much more time consuming. The fourth expression also proved capable of calibrating, although in 5 iterations.

We made a critical final test that proved extremely difficult to calibrate. It

involved 66 Swaptions and 15 correlations. In this final test, only the third expression

and the fourth expression were capable of performing the calibration.

As we found no further difference between the two, we decided to select the third

expression for two principal reasons:

• It performed faster by two to three iterations in all tests performed.

• It was mathematically consistent with the formulae derived for the analytic

approximation, and so did not rely on a quant’s intuitive mean weighting

approach.

14.4 Final Considerations on the Analytic approximation

So far we have seen graphical and numerical comparisons between the HJM and

our analytic approximation and confirmed their similarities. The critical analysis that

remains therefore is to determine whether we really do achieve the principal goal of

our project, that is, whether we really do significantly reduce calibration times.

1 Factor. Calibration Set: 56 Swaptions

HJM MonteCarlo: 94 s
Analytic approximation + HJM MonteCarlo without split: 56 s
Analytic approximation + HJM MonteCarlo with split: 17 s

Table 14.1. Approximation increases calibration speed by a factor of 5


2 Factors. Calibration Set: 66 Swaptions, 15 Correlations

HJM MonteCarlo: 9 min 53 s
Analytic approximation + HJM MonteCarlo: 1 min 12 s

Table 14.2. Approximation increases calibration speed by a factor of 8

14.5 Conclusions and Further Developments

The analytic approximation appears to work extremely well. It reduces

calibration times by a factor of 3 to 10. These results are truly extraordinary. We

would like to note that the analytic approximation has already successfully been

implemented within the Banco Santander, and is being used on a daily basis by the

interest rate traders. Future developments must centre on an analytic approximation

for the 3 strike model. Initially, this would appear to involve a relatively intuitive

extrapolation of the two strikes analytic approximation methods presented above.

14.6 Analytic approximation Peculiarities

Recall that there existed three cases in which the HJM MonteCarlo failed to produce results. We find, in contrast, that the analytic approximation, because of its characteristics, and in particular because it is monotonic, manages to surmount these difficulties.

The first problem stated in this document was the duplicity of solutions encountered. We found that when these situations arose during calibration, our analytic approximation always selected the correct solution. That is, it always selected

the solution with an alpha closest to the [0,1] interval. Recall that the alpha was a

weighting parameter that allowed us to choose between a lognormal (α = 1) and a


normal model (α = 0). It therefore seems unreasonable that we should select a value of

6 for alpha. This would imply something of the form: “we are six times a lognormal

model”.

[Figure: "Solution Duplicity", Sigma plotted against Alpha [%]; curves: MC K = 2.5%, MC K = 6.5%, Approximation K = 2.5%, Approximation K = 6.5%; annotations mark the second MC solution and the common MC and analytic approximation solution]

Fig. 14.12. HJM MonteCarlo versus analytic approximation solving solution duplicities

Therefore, by using the analytic approximation as a first guess, we condition the

HJM MonteCarlo to start very close to the desired solution. In this way we avoid the

possibility that it erroneously converges to the alternative solution.

Another of the problems we had encountered was the case in which the HJM MonteCarlo solution curves were encompassed one inside the other. This inevitably led to a nonexistent intersection, meaning that there was no valid solution, i.e. no valid pair of model parameters that could simultaneously satisfy both conditions imposed by the two vanilla products.


[Figure: "HJM MonteCarlo Solution Curves", Sigma plotted against Alpha [%]; curves: MC ATM K = 4.5%, MC K = 3%]

Fig. 14.13. HJM MonteCarlo presents no solution curve intersection

Because the analytic approximation is monotonic, we do find an intersection of its solutions:

[Figure: "Solution Curves", Sigma plotted against Alpha [%]; curves: MC ATM K = 4.5%, MC K = 3%, Approximation ATM K = 4.5%, Approximation K = 3%; the analytic approximation solution is marked]

Fig. 14.14. Analytic approximation solving a case with no HJM MonteCarlo solution intersection


We are as yet unclear whether the solutions provided by the approximate model should be considered valid, with our HJM model therefore revised, or whether they too are incorrect. Whichever of the two we finally decide upon, the HJM model clearly still needs further corrections, as it should be able to calibrate given a reasonable set of input data. The fact that it is incapable is a problem we must examine further.

Recall now the final problem encountered with HJM calibrations. There were situations in which the HJM MonteCarlo's price surface Ω was incapable of descending below the horizontal axis. This lack of flexibility implied that we were never capable of equating the model and market prices. We can analyse this situation in more depth at this point in our study:

[Figure: 3-D surface of Model - Market Price against Sigma and Alpha [%], titled "Surface Flexibility - Vanilla 1", with the solution curve marked]

Fig. 14.15. HJM MonteCarlo first vanilla presenting a solution curve


[Figure: Surface Flexibility - Vanilla 2, Model - Market Price vs Sigma and Alpha [%]; the surface remains above the horizontal axis]

Fig. 14.16. HJM MonteCarlo second vanilla does not descend sufficiently

Note that we continue calibrating in pairs: although the price surface of one product does descend sufficiently, that of the other remains asymptotic to the horizontal axis. No solution curve is achieved.

When we continue with this scenario onto an analytic approximation calibration,

we find the following:


[Figure: Analytic Approximation - Vanilla 1, Model - Market Price vs Sigma and Alpha [%], with a solution curve]

Fig. 14.17. Analytic approximation presents a solution for the first vanilla

[Figure: Analytic Approximation - Vanilla 2, Model - Market Price vs Sigma and Alpha [%], with a solution curve]

Fig. 14.18. Analytic approximation also presents a solution for the second troublesome vanilla


Surprisingly, we find a solution with both vanilla products. On a two dimensional plane, this is equivalent to:

[Figure: Solution Curves, Sigma vs Alpha [%]; MC ATM K = 4%, Approximation ATM K = 4%, Approximation K = 1%, with the analytic approximation solution marked]

Fig. 14.19. HJM MonteCarlo versus analytic approximation for a two dimensional view of the previous cases

Here we have also included the single HJM MonteCarlo vanilla product that we managed to calibrate.


15. Calibration Set Interpolation Matrix

In this section we search for a possible solution to two main problems encountered during our calibration tests:

• The failure to calibrate when our calibration set was solely composed of caplets.

• The failure to calibrate when using a joint set of caplets and swaptions.

15.1 Initial Data

We have discovered (as presented in the results of this section, 15.4) that the specific interpolation process we choose for our matrix can have a powerful impact on the convergence of the calibration itself. Further, the extrapolation of our calibration set can even transform the solution surface Ω that we had by now grown so accustomed to. This peculiarity has urged us to seek the best possible solution.

15.2 Former approach analysis

Initially, we performed a horizontal linear interpolation within the triangle defined by the Target Parameters, and a linear extrapolation outside it. Recall what was discussed in Section 10.2.

15.2.1 1 Factor: (only Caplets)

In the 1 factor scenario the data is not interpolated but instead extrapolated

constantly from the Target Parameters.


[Figure: calibration matrix with strikes U0…Ui…UN and maturities T0…Ti…TN; region A above the diagonal, region B below it]

Fig. 15.1. Strike Interpolation

From the tests performed, we have concluded that the data above the diagonal is

the most influential to our calibration process. This calibration works best when the

interpolation is performed vertically in this region A, irrespective of whether B is

vertical or horizontal. We proceed to the two factor scenario taking both A and B

vertically.

15.2.2 Tests performed

The one factor calibration was tested successfully up to maturities of 20 years,

with three month caplets and with a frequency of 3 months. Other less drastic

scenarios with fewer fixings and 6 month or 1 year caplets were also successfully

overcome. Thus, the one factor model with vertical extrapolation is in this way now

capable of successfully calibrating caplets.

15.3 2 Strikes

In the 2 strikes scenario, the data was formerly interpolated horizontally within the Target Parameter triangle and extrapolated horizontally outside it. Following on from the improvement in caplet calibration achieved through vertical extrapolation, and because the current approach was proving incapable of calibrating caplets, we proceeded to implement a horizontal interpolation within the Target Parameter triangle and a vertical extrapolation outside it.
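As a rough illustration of the scheme just described, the sketch below fills a parameter matrix by interpolating horizontally between known entries of each row and extrapolating vertically elsewhere. The function name, the dictionary of Target Parameters, and the nearest-row copy rule are illustrative assumptions, not the actual implementation.

```python
# Illustrative sketch: horizontal interpolation between known entries of a
# row, vertical extrapolation (copy from the nearest filled row) elsewhere.
def fill_parameter_matrix(known, n_rows, n_cols):
    """known: dict {(row, col): value} of Target Parameters."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for (r, c), v in known.items():
        grid[r][c] = v
    # horizontal interpolation within each row between known entries
    for r in range(n_rows):
        cols = [c for c in range(n_cols) if grid[r][c] is not None]
        for c in range(n_cols):
            if grid[r][c] is None and cols and cols[0] < c < cols[-1]:
                lo = max(k for k in cols if k < c)
                hi = min(k for k in cols if k > c)
                w = (c - lo) / (hi - lo)
                grid[r][c] = (1 - w) * grid[r][lo] + w * grid[r][hi]
    # vertical extrapolation outside: copy nearest filled value in column
    for c in range(n_cols):
        for r in range(n_rows):
            if grid[r][c] is None:
                filled = [k for k in range(n_rows) if grid[k][c] is not None]
                if filled:
                    nearest = min(filled, key=lambda k: abs(k - r))
                    grid[r][c] = grid[nearest][c]
    return grid
```

Because each empty row is filled from the values already present in its column, a change in the extrapolation direction propagates to every subsequently derived entry, which is exactly the sensitivity discussed above.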


[Figure: calibration matrix with strikes U0…Ui…UN and maturities T0…Ti…TN; interpolation inside the Target Parameter triangle, extrapolation outside it]

Fig. 15.2. Strike Interpolation

[Table of interpolated and extrapolated parameter values; the numeric matrix is not reproducible from the extraction]

Table 15.1 Horizontal interpolation, vertical extrapolation

We obtain slightly different results compared to the formerly implemented

interpolation method.

This occurs because the extrapolated data has a slight influence on the

interpolated data, as each entire row of the matrix is used to compute the subsequent

row. A variation in any of the components in the row therefore has an effect on the

following one.

We found the following main results regarding whether we were capable of calibrating or not. The improvement obtained through vertical extrapolation is extremely clear from the table below.


Calibrates?

  Vanilla Product     Strike  Split     Proxy / MC   Vertical         Horizontal
  Caplet              1K      Split     Proxy + MC   no               no
                                        MC           yes              no
                              No Split  Proxy + MC   yes              no
                                        MC           yes (very slow)  no
                      2K      No Split  Proxy + MC   yes              no
                                        MC           yes (very slow)  no
                              Split     Proxy + MC   no               no
                                        MC           no               no
  Caplet + Swaption   1K                             yes              yes
                      2K                MC           fails            fails

Table 15.2 Summary table of the differences between vertical and horizontal extrapolation, split and no split

However, we clearly see two main difficulties in the above. Firstly, the joint calibration of caplets and swaptions remains an unresolved problem.

Secondly, the new interpolation method solves many of the problems previously encountered in caplet calibration, but we realize that it is only effective when it operates as a 'no split' process, that is, taking the entire set of vanilla products and calibrating them together at once. This requires an extremely long and tedious computation, which is greatly accelerated through the use of the analytic approximation developed. However, it poses important difficulties if we are to extend the approach to the three strike model, as there is as yet no analytic approximation there to speed up the calculations. Calibrations with three strikes and 'no split' will be extremely slow.


15.4 Graphical representation

We observe further consequences when changing the form of interpolation used. Recall the graphical surface Ω created earlier. We now obtain a different form for that same surface, one that tips upwards again at the left end of the graph below, where before it was almost horizontally asymptotic.

[Figure: Vertical Extrapolation, Model - Market Price vs Sigma and Alpha [%]; the surface tips upwards at the left end]

Fig. 15.3. Vertical extrapolation no longer flat

This deformation becomes increasingly drastic. Note that now, the intersection

solution with the horizontal axis has actually evolved towards a circular form.


[Figure: Horizontal Extrapolation, Model - Market price vs Sigma and Alpha [%]]

Fig. 15.4. Surface Deformation in Horizontal Extrapolation

We note that this is exclusively a caplet characteristic. When we use the same vertical interpolation approach to calibrate swaptions, the well-known surface Ω appears once again.

[Figure: Swaptions Vertical Extrapolation, Model - Market Price vs Sigma and Alpha [%]; the usual surface Ω]

Fig. 15.5. Swaption Vertical Extrapolation stays the same


The transformation of the model price space brings with it a first important consequence: the solution curve is no longer a curve with a 'hump' but has evolved towards a circular form. This implies that on calibrating two caplets, the intersection of these circles will always generate a duplicity of solutions, except in the particular case in which one circle is perfectly tangent to the other.

[Figure: Vertical Extrapolation, Sigma vs Alpha [%]; intersecting circular solution curves for K = 4.15% and K = 3%]

Fig. 15.6. New Circular Solution Intersection

Thus the method introduces a new duplicity of solutions, this time in sigma, whereas previously it existed only in alpha. The same question, however, remains: which is the correct solution?

The change in the model price surface has two further implications.


Firstly, the convergence of a Newton-Raphson algorithm on this type of surface is always much more direct, as all lines of greatest slope head directly towards the same minimum, i.e. the bottom of the convexity.

Secondly, we would like to state something important: the HJM ends its calibration process when the error between its model value and the market value falls below a certain level. We realize that perhaps this level should be decreased somewhat further, because within the range of currently admitted model errors there is sufficient margin for a noticeable variation in the alpha parameter.

See below a comparison between two valid solutions, one obtained through the new caplet surface and the other through the traditional horizontal extrapolation. The alpha parameter in particular is substantially different (a 30% difference), despite the fact that the HJM accepts both solutions; see the relative error.

Theoretically, the parameter generating a smaller error is more accurate. This

does not mean that the horizontal extrapolation is better. It simply implies that

perhaps one further iteration in the vertical extrapolation method should be

considered before submitting a final parameter value.

15.4.1 Vertical Extrapolation

ITERATION 0:

  #  Type   T        U        Value
  1  SIGMA  145.479  171.233   0.116056
  2  ALPHA  145.479  171.233  -0.27304

  #  MarketPrice  ModelPrice  Relative error
  1  579.465      579.511     0.045526  OK
  2  0.205302     0.206877    0.157482  OK

Table 15.3 Results obtained through vertical extrapolation


15.4.2 Horizontal Extrapolation

ITERATION 0:

  #  Type   T        U        Value
  1  SIGMA  119.726  145.753   0.115939
  2  ALPHA  119.726  145.753  -0.34993

  #  MarketPrice  ModelPrice  Relative error
  1  495.114      495.118     0.004044  OK
  2  0.10758      0.107367    0.021235  OK

Table 15.4 Results obtained through horizontal extrapolation


16. Interest Rate Volatilities: Stripping Caplet Volatilities

from cap quotes

16.1 Introduction

This section describes how to create a set of caplet volatilities that are to be

derived from the market quoted cap volatilities. We will analyse two principal

approaches. The first involves an interpolation between cap prices to then extract the

caplet volatilities. The second method involves a direct interpolation amongst the

caplets themselves.

The input data we require are the interest rate curve, the cap volatility matrix, and future option prices. With these, we must be able to compute the caplet forward volatilities for any given tenor (according to market conventions: 3M for the US dollar, 6M for the Eurozone, etc.) and any given strike. The study is of particular interest because the caplet volatilities are critical in the calibration of exotic products. We present below the market quoted caps that are typically used as inputs when deriving the corresponding caplets.

Fig. 16.1. Market Cap Quotes


We generally follow the subsequent procedure: taking the market cap volatilities, we first interpolate along the strike direction so as to create all the strike values necessary for our calibration. After this, for each strike we interpolate along the maturity direction (in either cap volatilities or caplet volatilities, depending on the approach), thus creating values at the regular 6 month intervals already mentioned.

In general, we can use linear cap or constant caplet volatility interpolation as a

first approach before continuing onto the more complex functional form

optimisations.

A further step is to fit the sets of volatilities that have been calculated to a given

type of smile interpolator. This will require that we carry out the strike interpolation

at the end of the process rather than at the beginning so as to achieve an optimal smile

behaviour.

There are two principal methods of interpolation which we must first distinguish:

1. Functional Form:

we use a simple deterministic function to interpolate between points. We can

further distinguish two variations here:

Global interpolation:

These methods rely on constructing a single equation that fits all the data points.

The equation is usually a high degree polynomial equation that results in a smooth

curve. However, they are usually not well suited for engineering applications, as they

are prone to severe oscillations and overshoots, which we attempt to avoid specifically

here.


Piecewise interpolation.

These methods rely on constructing a polynomial of low degree between each

pair of known data points. If a first degree polynomial is used, it is called linear

interpolation. Second and third degree polynomials are called quadratic and cubic

splines respectively. The higher the degree of the spline, the smoother the resulting

curve. Splines of degree m will have continuous derivatives up to a degree of m-1 at

the data points.

2. Stochastic Form:

This is the second possible form of interpolation. It implies using a stochastic model such as CEV or SABR. We will enter into the details of the SABR approach later.

16.1.1 Strike Interpolation

The first, 'non smile' interpolation between strikes is performed as simply as possible: a direct linear interpolation, as the first step in the previously described process.

$$V = \frac{V_i\,(K_{i+1} - K) + V_{i+1}\,(K - K_i)}{K_{i+1} - K_i}$$
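As a minimal sketch, assuming strikes are quoted in ascending order, the linear strike interpolation above can be written as (the function name and bracket search are illustrative):

```python
# Linear interpolation of volatilities between quoted strikes, as in the
# formula above: V = [V_i (K_{i+1} - K) + V_{i+1} (K - K_i)] / (K_{i+1} - K_i)
def interp_strike(K, strikes, vols):
    """strikes sorted ascending; vols are the quoted volatilities."""
    for i in range(len(strikes) - 1):
        if strikes[i] <= K <= strikes[i + 1]:
            num = vols[i] * (strikes[i + 1] - K) + vols[i + 1] * (K - strikes[i])
            return num / (strikes[i + 1] - strikes[i])
    raise ValueError("strike outside quoted range")
```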

16.2 Stripping Caplet Volatility Methods

It may be useful at this point to refer back to the financial product description of a

cap (Section 6.7). Recall that a cap is constructed as a sum of caplets.

The price of a cap on the 6 month EURIBOR starting at T and maturing at U is

given by the market by means of a unique flat caplet volatility. This means that the

price of the cap must be computed as the sum of the prices of all 6 month caplets

between T and U, whose volatility must be set as the unique flat caplet volatility

specified by the market.


However, these flat caplets are simply a construction mechanism to obtain the

market's cap value easily. This does not mean that the true caplets in the

specified time period should all have flat volatilities. Instead, it simply imposes that

the sum of all the caplets in the interval should yield the same price as the sum of all

the flat caplets in that interval.

Thus there is a great degree of freedom when we try to analyse the true market of

caplets. We have the liberty of imposing any price we wish on the given group, so

long as their sum equals the market cap. For calibration purposes, we will seek to

construct these caplets so that their volatilities combine to produce a monotonic,

smooth curve.

16.3 Previous Santander Approach for 6 month caplets

The main problem in any Caplet Stripping method lies in the fact that there is not

enough information to be able to extract a unique caplet volatility. Ideally, if we had

two adjacent market cap quotes that were 6 months apart, then we could construct

$$\mathrm{cap}(t,T,U_2) - \mathrm{cap}(t,T,U_1) = \mathrm{forwardcap}(t,U_1,U_2) = \mathrm{caplet}(t,\,U_1,\,U_1+\delta) \tag{16.1}$$

where δ = 6 months.

16.3.1 TYPE I

[Figure: timeline t, T, U1, U2; Cap(t,T,U2) = Cap(t,T,U1) + CapForward(t,U1,U2), with CapForward(t,U1,U2) = Caplet(t,U1,U2)]

Fig. 16.2. Cap decomposition into other caps and capforwards


In this way, the resulting forward cap would be exactly equal to the caplet, so

would also have the same price. Applying the Black Scholes formula, we would be

able to extract the caplet’s volatility from its known price.

However, this is not generally the case. More commonly, two adjacent cap quotes

are separated in maturities by more than six months. In addition, the separation

between cap quotes for LIBOR markets ranges between 12 months and 5 years. In

these cases, the difference between the cap prices (i.e. the forward cap) is equal to the

sum of at least two or more different caplets.

$$\mathrm{cap}(t,T,U_2) - \mathrm{cap}(t,T,U_1) = \mathrm{forwardcap}(t,U_1,U_2) = \sum_{i=1}^{n} \mathrm{caplet}\big(t,\,U_1+(i-1)\,\delta,\,U_1+i\,\delta\big) \tag{16.2}$$

There is no additional equation to determine the price to be attributed to any

individual caplet, only to their sum. Therefore, a hypothesis must be made at this

stage so as to decide on how these prices should be distributed among the caplets.

16.3.2 TYPE II

[Figure: timeline t, T, U1, U1+δ, U2; CapForward(t,U1,U2) = Caplet(t,U1,U1+δ) + Caplet(t,U1+δ,U2)]

Fig. 16.3. Capforward decomposition into two unknown caplets

We see clearly in the above that there is no additional information available to

choose a specific price for any of the caplets that combine to form the capforward.


16.4 Linear Cap Interpolation

In the Banco Santander model, to ease calculations, a first approach was simply to linearly interpolate the market cap volatilities so as always to fall within the first case. That is, given the example below, where there would be an excess of caplets to determine, we linearly create the necessary caps that enable us to return to a simple TYPE I situation.

[Figure: timeline T, U1, U1+δ, U1+2δ, U1+3δ, U2 with caps of volatilities σCap1, σCap2 and the caplets of unknown volatility between them]

Fig. 16.4. Each cap is made up of a number of caplets of unknown volatility

We construct each cap by linearly interpolating the volatility

$$\sigma_{cap}(t,T,U_1+i\,\delta) = \sigma_{cap}(t,T,U_1) + \frac{\sigma_{cap}(t,T,U_2) - \sigma_{cap}(t,T,U_1)}{U_2 - U_1}\,\big(U_1 + i\,\delta - U_1\big) \tag{16.3}$$

Thus we only have

[Figure: cap σCap1 and interpolated cap σCap2, differing by the single caplet σCaplet1 between U1 and U1+δ]

Fig. 16.5. 2 Cap Interpolation

Then we can easily solve each caplet as was stated in the TYPE I approach:

$$\mathrm{cap}(t,T,U_1+i\,\delta) - \mathrm{cap}(t,T,U_1+(i-1)\,\delta) = \mathrm{caplet}\big(t,\,U_1+(i-1)\,\delta,\,U_1+i\,\delta\big) \tag{16.4}$$


Let us specify the above calculations for the case in which i = 1 and δ = 6M. We would then have

$$\mathrm{cap}(t,T,U_1+\delta) - \mathrm{cap}(t,T,U_1) = \mathrm{caplet}(t,\,U_1,\,U_1+\delta) \tag{16.5}$$

where the only unknown is the caplet volatility $\sigma_{caplet}(t,U_1,U_1+\delta)$ that we seek to calculate. We will now specify how each of the above terms is obtained.

The second term in (16.5)

$$\mathrm{cap}(t,T,U_1) = m\,B(t;U_1)\,\big[L(t,T,U_1)\,N(d_1) - K\,N(d_2)\big] \tag{16.6}$$

is the market cap, and is a sum of caplets with the cap’s flat market quoted

volatility. As we only have i = 1, the cap is directly equal to the caplet, with

$$d_{1,2} = \frac{\ln\dfrac{L(t,T,U_1)}{K} \pm \dfrac{\sigma_{cap}^2(t,T,U_1)}{2}\,(T-t)}{\sigma_{cap}(t,T,U_1)\,\sqrt{T-t}} \tag{16.7}$$
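Equations (16.6) and (16.7) can be sketched as a generic Black caplet pricer. The discount factor B, forward rate L, accrual factor m, flat volatility sigma and time to fixing tau are passed in directly; this is a simplified illustration, not the bank's pricing library.

```python
# Black price of a single caplet, equations (16.6)-(16.7), using only the
# standard normal CDF built from math.erf.
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_caplet(m, B, L, K, sigma, tau):
    """m: accrual factor, B: discount to payment, L: forward, tau: T - t."""
    d1 = (log(L / K) + 0.5 * sigma * sigma * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return m * B * (L * norm_cdf(d1) - K * norm_cdf(d2))
```

As the volatility shrinks to zero the price converges to the discounted intrinsic value m·B·(L − K), a useful sanity check.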

The first term in equation (16.5) has had its volatility interpolated so that the time span between the newly interpolated cap and the previous market cap is exactly the single caplet we seek. This interpolation has been done as:

$$\sigma_{cap}(t,T,U_1+\delta) = \sigma_{cap}(t,T,U_1) + \frac{\sigma_{cap}(t,T,U_2) - \sigma_{cap}(t,T,U_1)}{U_2 - U_1}\,\delta \tag{16.8}$$

Thus we have the expression for the cap as

$$\mathrm{cap}(t,T,U_1+\delta) = \sum_{i=0}^{n-1} m\,B(t;\,T_{i+1})\,\big[L(t,\,T_i,\,T_{i+1})\,N(d_1^{\,i}) - K\,N(d_2^{\,i})\big] \tag{16.9}$$

where the caplet fixing dates are $T_i = T + i\,\delta$ and each caplet carries the interpolated flat volatility:

$$d_{1,2}^{\,i} = \frac{\ln\dfrac{L(t,\,T_i,\,T_{i+1})}{K} \pm \dfrac{\sigma_{cap}^2(t,T,U_1+\delta)}{2}\,(T_i-t)}{\sigma_{cap}(t,T,U_1+\delta)\,\sqrt{T_i-t}} \tag{16.10}$$


The interpolated cap is therefore the sum of the previous cap and an additional caplet with the interpolated flat volatility $\sigma_{cap}(t,T,U_1+\delta)$.

Finally, the last term in equation (16.5) verifies

$$\mathrm{caplet}(t,U_1,U_1+\delta) = m\,B(t;U_1+\delta)\,\big[L(t,U_1,U_1+\delta)\,N(d_1) - K\,N(d_2)\big] \tag{16.11}$$

$$d_{1,2} = \frac{\ln\dfrac{L(t,U_1,U_1+\delta)}{K} \pm \dfrac{\sigma_{caplet}^2(t,U_1,U_1+\delta)}{2}\,(U_1-t)}{\sigma_{caplet}(t,U_1,U_1+\delta)\,\sqrt{U_1-t}} \tag{16.12}$$

The only element we do not know from all the above equations is $\sigma_{caplet}(t,U_1,U_1+\delta)$. We will need to solve for this caplet's Black volatility, typically via Newton-Raphson iterations.

For the second caplet we would construct

$$\mathrm{cap}(t,T,U_1+2\delta) - \mathrm{cap}(t,T,U_1+\delta) = \mathrm{caplet}(t,\,U_1+\delta,\,U_1+2\delta) \tag{16.13}$$

and analogously for the rest.

The problem with this form of stripping is that the resulting values of $\sigma_{caplet}$ produce a curve, as a function of maturity, that is not smooth at all (see Fig. 18.2). The 'bumps' present an important difficulty for calibration algorithms that operate on these caplet volatilities. A smoother fit is consequently required.
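The whole TYPE I stripping loop can be sketched end to end. This is a simplified illustration under assumed conventions: a common 6M schedule with precomputed discounts and forwards per caplet, and bisection in place of the Newton-Raphson iterations mentioned above.

```python
# Sketch of caplet stripping: each step prices the difference of two
# adjacent (interpolated) caps and solves the marginal caplet's Black
# volatility by bisection. Schedule handling is an illustrative assumption.
from math import log, sqrt, erf

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black(m, B, L, K, sigma, tau):
    d1 = (log(L / K) + 0.5 * sigma * sigma * tau) / (sigma * sqrt(tau))
    return m * B * (L * N(d1) - K * N(d1 - sigma * sqrt(tau)))

def cap_price(fixings, K, sigma):
    """fixings: list of (tau, m, B, L) per caplet; sigma: flat cap vol."""
    return sum(black(m, B, L, K, sigma, tau) for tau, m, B, L in fixings)

def strip_caplet_vols(fixings, K, flat_vols):
    """flat_vols[i] is the flat vol of the cap covering fixings[: i + 1]."""
    caplet_vols, prev_price = [], 0.0
    for i, flat in enumerate(flat_vols):
        price = cap_price(fixings[: i + 1], K, flat)
        target = price - prev_price        # forward cap = one caplet
        lo, hi = 1e-6, 5.0                 # bisection on the Black vol
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if cap_price([fixings[i]], K, mid) > target:
                hi = mid
            else:
                lo = mid
        caplet_vols.append(0.5 * (lo + hi))
        prev_price = price
    return caplet_vols
```

When the flat cap volatilities are all equal, the stripped caplet volatilities simply reproduce that flat level, which makes a convenient consistency check.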


16.5 Quadratic Cap Interpolation

The procedure we will follow is completely analogous to the former one. That is, we have a set of market quoted caps constructed on flat cap volatilities. Once again we are going to interpolate between two known market quotes so as to obtain an intermediate cap volatility from which we can easily extract the caplet we are searching for.

A quadratic fit requires three parameters to completely define the parabola. We

use up two degrees of freedom by setting the parabola to pass through the two known

points (σi and σi-1) defined by the market quoted flat cap volatilities. We have absolute

freedom to impose the third point. As we have a greater density of information at the

beginning of the curve, we decide therefore that it is more useful to take our third

point as the volatility σi-2. Another approach is to use this degree of freedom to impose

continuity in the function's slopes, a condition of the form $f_j'(T_{i-1}) = f_{j-1}'(T_i)$.

For the first method, we use:

$$f(T) = A\,T^2 + B\,T + C \tag{16.14}$$

with coefficients

$$A = \frac{\dfrac{f_i - f_{i-1}}{T_i - T_{i-1}} - \dfrac{f_{i-1} - f_{i-2}}{T_{i-1} - T_{i-2}}}{T_i - T_{i-2}} \tag{16.15}$$

$$B = \frac{f_i - f_{i-1}}{T_i - T_{i-1}} - A\,(T_i + T_{i-1}) \tag{16.16}$$

$$C = f_{i-1} - A\,T_{i-1}^2 - B\,T_{i-1} \tag{16.17}$$

16.6 Cubic Spline Interpolation

The idea behind this method remains the same as before. The only difference is that now we interpolate between caps using cubic functions. The motivation behind this decision is that quadratic functions introduce concavities and convexities to which our caplet transformation is very sensitive. Indeed, despite the


fact that either of the two previous methods apparently produces very smooth flat cap volatility curves (see Fig. 18.1), we note that the resulting caplet curves show pronounced irregularities wherever two parabolas or straight lines join (see Fig. 18.2), all the more so the more the slopes differ at that point. We thus attempt to solve this problem by means of a cubic fit.

As its name indicates, we fit a series of unique cubic polynomials between each of

the data points, with the stipulation that the curve obtained be continuous and appear

smooth. The fundamental idea behind cubic spline interpolation is based on the

engineer’s tool used to draw smooth curves through a number of points, which is

where the method derives its name from. This spline consists of weights attached to a

flat surface at the points to be connected. A flexible strip is then bent across each of

these weights, resulting in a pleasingly smooth curve.

The mathematical spline is similar in principle. The points, in this case, are

numerical data. The weights are the coefficients on the cubic polynomials used to

interpolate the data. These coefficients ‘bend’ the line so that it passes through each of

the data points without any erratic behaviour or breaks in continuity.

We fit a piecewise function of the form

$$S(x) = \begin{cases} s_1(x) & \text{if } x_1 \le x < x_2 \\ s_2(x) & \text{if } x_2 \le x < x_3 \\ \quad\vdots \\ s_{n-1}(x) & \text{if } x_{n-1} \le x < x_n \end{cases}$$

where si(x) is a third degree polynomial defined by

$$s_i(x) = a_i\,(x-x_i)^3 + b_i\,(x-x_i)^2 + c_i\,(x-x_i) + d_i \quad \text{for } i = 1, 2, \ldots, n-1.$$

The first and second derivatives of these n−1 equations are

$$s_i'(x) = 3a_i\,(x-x_i)^2 + 2b_i\,(x-x_i) + c_i$$
$$s_i''(x) = 6a_i\,(x-x_i) + 2b_i$$

for i = 1, 2, ..., n−1.

The curve must verify the following four conditions:

1. The piecewise function S(x) will interpolate all data points (xi,yi).

2. S(x) will be continuous on the interval [x1, xn]


3. S’(x) will be continuous on the interval [x1, xn]

4. S’’(x) will be continuous on the interval [x1, xn]

Since the piecewise function S(x) will interpolate all of the data points, we can

conclude that

$$S(x_i) = y_i \quad\Longrightarrow\quad y_i = a_i(x_i-x_i)^3 + b_i(x_i-x_i)^2 + c_i(x_i-x_i) + d_i = d_i$$

for each i = 1, 2, ..., n−1.

Because property 2 imposes that the function be continuous, then at the junction

of two piecewise cubic curves we have

$$s_{i-1}(x_i) = s_i(x_i) = d_i$$

We also know that

$$s_{i-1}(x) = a_{i-1}(x-x_{i-1})^3 + b_{i-1}(x-x_{i-1})^2 + c_{i-1}(x-x_{i-1}) + d_{i-1} \tag{16.18}$$

so we have

$$d_i = a_{i-1}(x_i-x_{i-1})^3 + b_{i-1}(x_i-x_{i-1})^2 + c_{i-1}(x_i-x_{i-1}) + d_{i-1}$$

for i = 1, 2, ..., n−1.

16.6.1 Analysing the slopes

$$s_i'(x) = 3a_i\,(x-x_i)^2 + 2b_i\,(x-x_i) + c_i, \qquad s_i'(x_i) = c_i \tag{16.19}$$

Applying the third condition of continuous slopes, we also have

$$s_{i-1}'(x_i) = s_i'(x_i)$$

so

$$s_{i-1}'(x_i) = 3a_{i-1}(x_i-x_{i-1})^2 + 2b_{i-1}(x_i-x_{i-1}) + c_{i-1} \tag{16.20}$$


$$c_i = 3a_{i-1}(x_i-x_{i-1})^2 + 2b_{i-1}(x_i-x_{i-1}) + c_{i-1} \quad \text{for } i = 1, 2, \ldots, n-1.$$

Now considering the second derivatives:

$$s_i''(x) = 6a_i\,(x-x_i) + 2b_i, \qquad s_i''(x_i) = 2b_i \tag{16.21}$$

And since the second derivatives must also be continuous, we impose

$$s_{i-1}''(x_i) = s_i''(x_i) = 2b_i \tag{16.22}$$

$$2b_i = s_{i-1}''(x_i) = 6a_{i-1}(x_i-x_{i-1}) + 2b_{i-1} \quad \text{for } i = 1, 2, \ldots, n-1.$$

For simplification, let us assume constant intervals $\Delta x = x_i - x_{i-1}$, and let us write $s_i'' = s''(x_i) = 2b_i$.

Then we can re-write all the coefficients in terms of these parameters as:

$$a_i = \frac{s_{i+1}'' - s_i''}{6\,\Delta x}, \qquad b_i = \frac{s_i''}{2}, \qquad c_i = \frac{y_{i+1}-y_i}{\Delta x} - \frac{\Delta x\,(2\,s_i'' + s_{i+1}'')}{6}, \qquad d_i = y_i \tag{16.23}$$

Therefore, substituting these into the slope-continuity relation $c_{i+1} = 3a_i\,\Delta x^2 + 2b_i\,\Delta x + c_i$ and simplifying, we obtain

$$s_i'' + 4\,s_{i+1}'' + s_{i+2}'' = \frac{6\,(y_i - 2y_{i+1} + y_{i+2})}{\Delta x^2} \quad \text{for } i = 1, 2, \ldots, n-2.$$


This leads to a matrix formulation:

$$\begin{pmatrix} 1 & 4 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 4 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 4 & 1 \end{pmatrix} \begin{pmatrix} s_1'' \\ s_2'' \\ \vdots \\ s_n'' \end{pmatrix} = \frac{6}{\Delta x^2} \begin{pmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-2} - 2y_{n-1} + y_n \end{pmatrix}$$

Note that this system has n−2 rows and n columns, and is therefore under-determined. In order to generate a unique cubic spline, two other conditions must be

imposed upon the system.

Historically, the most commonly used boundary conditions have been:

16.7 Natural splines

This first spline type stipulates that the second derivative be equal to zero at the endpoints: $s_1'' = s_n'' = 0$. This results in the spline extending as a straight line outside the endpoints. The first and last columns of the matrix can therefore be eliminated, as they correspond to $s_1'' = s_n'' = 0$. This leaves an (n−2) × (n−2) system, which determines the remaining solutions $s_2''$ through $s_{n-1}''$. The spline is now unique.
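A minimal sketch of the natural spline solve, assuming the constant spacing Δx used above: the reduced (n−2)×(n−2) tridiagonal system with the [1 4 1] stencil is solved with the standard Thomas algorithm (a generic solver, not part of the text).

```python
# Natural cubic spline: solve s'' at the interior knots from the
# tridiagonal system s''_i + 4 s''_{i+1} + s''_{i+2} = 6 (y_i - 2y_{i+1}
# + y_{i+2}) / dx^2, with s''_1 = s''_n = 0 (natural boundary).
def natural_spline_second_derivs(y, dx):
    n = len(y)
    rhs = [6.0 * (y[i] - 2.0 * y[i + 1] + y[i + 2]) / dx**2
           for i in range(n - 2)]
    # Thomas algorithm: diagonal 4, sub- and super-diagonal 1
    c, d = [0.0] * (n - 2), [0.0] * (n - 2)
    for i in range(n - 2):
        denom = 4.0 - (c[i - 1] if i > 0 else 0.0)
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - (d[i - 1] if i > 0 else 0.0)) / denom
    s = [0.0] * n                      # endpoints stay at zero
    for i in range(n - 3, -1, -1):     # back substitution
        s[i + 1] = d[i] - c[i] * s[i + 2]
    return s
```

For data lying on a straight line, all second differences vanish and the routine returns zero curvature everywhere, as expected.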


$$\begin{pmatrix} 4 & 1 & 0 & \cdots & 0 \\ 1 & 4 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 4 & 1 \\ 0 & \cdots & 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} s_2'' \\ s_3'' \\ \vdots \\ s_{n-1}'' \end{pmatrix} = \frac{6}{\Delta x^2} \begin{pmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-2} - 2y_{n-1} + y_n \end{pmatrix}$$

16.8 Parabolic Run out Spline

The parabolic run-out spline imposes the conditions on the second derivatives at the endpoints that

$$s_1'' = s_2'', \qquad s_n'' = s_{n-1}''$$

The result of this condition is a curve that becomes parabolic at the endpoints.

This type of cubic spline is useful for periodic and exponential data.

$$\begin{pmatrix} 5 & 1 & 0 & \cdots & 0 \\ 1 & 4 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 4 & 1 \\ 0 & \cdots & 0 & 1 & 5 \end{pmatrix} \begin{pmatrix} s_2'' \\ s_3'' \\ \vdots \\ s_{n-1}'' \end{pmatrix} = \frac{6}{\Delta x^2} \begin{pmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-2} - 2y_{n-1} + y_n \end{pmatrix}$$


16.9 Cubic Run out Spline

This last type of spline has the most extreme endpoint behaviour. It assigns

$$s_1'' = 2s_2'' - s_3'', \qquad s_n'' = 2s_{n-1}'' - s_{n-2}''$$

This causes the curve to degrade to a single cubic curve over the last two

intervals, rather than two separate functions.

$$\begin{pmatrix} 6 & 0 & 0 & \cdots & 0 \\ 1 & 4 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 4 & 1 \\ 0 & \cdots & 0 & 0 & 6 \end{pmatrix} \begin{pmatrix} s_2'' \\ s_3'' \\ \vdots \\ s_{n-1}'' \end{pmatrix} = \frac{6}{\Delta x^2} \begin{pmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-2} - 2y_{n-1} + y_n \end{pmatrix}$$

For our particular case, we have decided to implement a modified spline method.

16.10 Constrained Cubic Splines

The principle behind the proposed constrained cubic spline is to prevent

overshooting by sacrificing smoothness. This is achieved by eliminating the

requirement for equal second order derivatives at every point (condition 4) and

replacing it with specified first order derivatives.

Thus, similar to traditional cubic splines, the proposed constrained cubic splines

are constructed according to the previous equations, but substituting the second order

derivative with a specified fixed slope at every point.

$$
s'_{i-1}(x_i) = s'_i(x_i) = s'(x_i)
\tag{16.24}
$$


The calculation of the slope becomes the key step at each point. Intuitively we

know the slope will lie between the slopes of the adjacent straight lines, and should

approach zero if the slope of either line approaches zero or changes sign.

$$
s'(x_i) =
\begin{cases}
\dfrac{2}{\dfrac{x_{i+1}-x_i}{y_{i+1}-y_i} + \dfrac{x_i-x_{i-1}}{y_i-y_{i-1}}} & \\[2ex]
0 & \text{if the slope changes sign at } x_i
\end{cases}
\tag{16.25}
$$

for i =1, 2, ..., n - 1

For the boundary conditions, we must impose two further conditions. We shall

construct here a generic approach this time where the intervals ∆x need not be

constant any longer. Thus we will use:

$$
s_i(x) = a_i x^3 + b_i x^2 + c_i x + d_i
\tag{16.26}
$$

Applying a natural spline constraint of the form s''_1 = s''_n = 0, we obtain

$$
s'(x_0) = \frac{3\,(y_1 - y_0)}{2\,(x_1 - x_0)} - \frac{s'(x_1)}{2},
\qquad
s'(x_n) = \frac{3\,(y_n - y_{n-1})}{2\,(x_n - x_{n-1})} - \frac{s'(x_{n-1})}{2}
\tag{16.27}
$$

As the slope at each point is known, it is no longer necessary to solve a system of

equations. Each spline can be calculated based on the two adjacent points on each

side.

$$
\begin{aligned}
s''(x_{i-1}) &= -\frac{2\left(s'(x_i) + 2\,s'(x_{i-1})\right)}{x_i - x_{i-1}} + \frac{6\,(y_i - y_{i-1})}{(x_i - x_{i-1})^2} \\[1ex]
s''(x_i) &= \frac{2\left(2\,s'(x_i) + s'(x_{i-1})\right)}{x_i - x_{i-1}} - \frac{6\,(y_i - y_{i-1})}{(x_i - x_{i-1})^2}
\end{aligned}
\tag{16.28}
$$


$$
\begin{aligned}
a_i &= \frac{s''(x_i) - s''(x_{i-1})}{6\,(x_i - x_{i-1})} \\[1ex]
b_i &= \frac{x_i\, s''(x_{i-1}) - x_{i-1}\, s''(x_i)}{2\,(x_i - x_{i-1})} \\[1ex]
c_i &= \frac{(y_i - y_{i-1}) - b_i\,(x_i^2 - x_{i-1}^2) - a_i\,(x_i^3 - x_{i-1}^3)}{x_i - x_{i-1}} \\[1ex]
d_i &= y_{i-1} - a_i\, x_{i-1}^3 - b_i\, x_{i-1}^2 - c_i\, x_{i-1}
\end{aligned}
\tag{16.29}
$$

This modified cubic spline interpolation method has been implemented in our flat

cap volatility interpolation. The main benefits of the proposed constrained cubic

spline are:

• It is a relatively smooth curve;

• It never overshoots intermediate values;

• Interpolated values can be calculated directly without solving a system of

equations;

• The actual parameters (ai, bi, ci and di) for each of the cubic spline equations can

still be calculated. This permits an analytical integration of the data.
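A minimal sketch of the constrained spline just described, combining the slope rule (16.25), the boundary slopes (16.27), the interval second derivatives (16.28) and the coefficients (16.29); the function name and interface are illustrative:

```python
def constrained_spline(xs, ys):
    """Constrained cubic spline: returns a callable interpolant.

    Specified first derivatives at the knots replace the usual equal
    second-derivative condition, so no linear system needs to be solved.
    xs is assumed strictly increasing.
    """
    n = len(xs)
    # Interior slopes (16.25): zero whenever either adjacent chord slope
    # vanishes or the slope changes sign.
    sp = [0.0] * n
    for i in range(1, n - 1):
        d1 = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
        d2 = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        sp[i] = 0.0 if d1 * d2 <= 0.0 else 2.0 / (1.0 / d1 + 1.0 / d2)
    # Boundary slopes (16.27), from the natural condition s'' = 0 at the ends.
    sp[0] = 1.5 * (ys[1] - ys[0]) / (xs[1] - xs[0]) - 0.5 * sp[1]
    sp[-1] = 1.5 * (ys[-1] - ys[-2]) / (xs[-1] - xs[-2]) - 0.5 * sp[-2]

    coeffs = []
    for i in range(1, n):
        h = xs[i] - xs[i - 1]
        dy = ys[i] - ys[i - 1]
        # Second derivatives at the interval endpoints (16.28).
        sppL = -2.0 * (sp[i] + 2.0 * sp[i - 1]) / h + 6.0 * dy / h**2
        sppR = 2.0 * (2.0 * sp[i] + sp[i - 1]) / h - 6.0 * dy / h**2
        # Cubic coefficients (16.29).
        a = (sppR - sppL) / (6.0 * h)
        b = (xs[i] * sppL - xs[i - 1] * sppR) / (2.0 * h)
        c = (dy - b * (xs[i]**2 - xs[i - 1]**2)
             - a * (xs[i]**3 - xs[i - 1]**3)) / h
        d = ys[i - 1] - c * xs[i - 1] - b * xs[i - 1]**2 - a * xs[i - 1]**3
        coeffs.append((a, b, c, d))

    def f(x):
        i = 0
        while i < n - 2 and x > xs[i + 1]:
            i += 1
        a, b, c, d = coeffs[i]
        return ((a * x + b) * x + c) * x + d
    return f
```

Note how a flat data segment produces zero knot slopes, so the interpolant stays flat there instead of overshooting.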

16.11 Functional Interpolation

We present here the most direct approaches to caplet volatility stripping. Their principal difference with respect to the previous method lies in the fact that the interpolation here is performed between caplet volatilities, whereas the previous approach interpolated between cap volatilities.


16.12 Constant Caplet Volatilities.

This is the most basic approach for stripping cap volatilities. We will always use either this method or the previous cap linear interpolation as the first guess in our optimisation algorithm when proceeding to use more complex interpolation methods. Note that the algorithm we construct here requires only a simple one-dimensional root solver; it can be solved, for example, by a one-dimensional Newton-Raphson.

• Let $\{cap_i(\Sigma_i)\}_{i=1}^{n}$ be the set of market quoted caps for a given strike K.

• Let $\{\sigma_{caplet,i}\}_{i=1}^{m}$ be the constant caplet volatilities that we are trying to calculate.

We perform the stripping by what is commonly known as a bootstrapping

mechanism. We have the same formula as we had before, where each cap price “i” is

constructed as a sum of caplet prices with the flat cap volatility:

$$
cap(0,T_i)(\Sigma_i) = \sum_{j=1}^{n_i} caplet(t_{j-1}, t_j)(\Sigma_i)
\tag{16.30}
$$

and where we construct each caplet as

$$
caplet(t_{j-1}, t_j) = m\, B(0, t_j)\left[\, F\, N(d_1) - K\, N(d_2) \,\right]
\tag{16.31}
$$

$$
d_{1,2} = \frac{\ln\!\left(\dfrac{F}{K}\right) \pm \dfrac{1}{2}\,\Sigma_j^2\, t_{j-1}}{\Sigma_j \sqrt{t_{j-1}}}
\tag{16.32}
$$

We are now noting the forward LIBOR rate as F for simplicity.

We can compute the forward caps as:

$$
forwardcap_{(T_{i-1},T_i)}(\Sigma_i, \Sigma_{i-1}) = cap(0,T_i)(\Sigma_i) - cap(0,T_{i-1})(\Sigma_{i-1})
\tag{16.33}
$$


Remember that the problem with these forward caps was that they could encompass several caplets of unknown volatility.

[Diagram: Cap(0,T_{i-1})(Σ_{i-1}) and Cap(0,T_i)(Σ_i) on a timeline, with CapForward(T_{i-1},T_i)(Σ_i, Σ_{i-1}) spanning the caplets (Caplet 1 on (U_1, U_1+δ), Caplet 2, ...) between T_{i-1} and T_i.]

Fig. 16.6. Forward caps related to the caplets

So as to find the piecewise constant volatilities, we need to solve the following equation for a unique, constant $\sigma_{caplet,i}$:

$$
forwardcap_{(T_{i-1},T_i)}(\Sigma_i, \Sigma_{i-1}) = \sum_{j=1}^{n_i} caplet(t_{j-1}, t_j)(\sigma_{caplet,i})
\tag{16.34}
$$

16.13 Piecewise Linear Caplet Volatility Method

We now no longer use a constant $\sigma_{caplet,i}$ for the group of caplets that form the capforward. Instead, we impose a linear relationship between them. This really only gives us one more degree of freedom, as a line is entirely defined by two of its points.

Our constraint is still that the caplets in a given interval sum to give the capforward price derived from the market caps. However, these caplets that we construct are now related linearly.

Because we seek a smooth curve, we start by imposing that the end node of the caplets forming one capforward(i-1) (the value $\sigma_{caplet,i-1,j}$) coincide with the starting node ($\sigma_{caplet,i,0}$) of the first caplet that forms the following capforward(i). As we have N cap volatilities and only N+1 degrees of freedom, we are left with just one free parameter to fit.


We can impose an exact fit, i.e., impose the first volatility, $\sigma_{caplet,1,0}$, such that at each interval we satisfy the condition

$$
forwardcap_{(T_{i-1},T_i)}(\Sigma_i, \Sigma_{i-1}) = forwardcap_{(T_{i-1},T_i)}(\sigma_{caplet,i,j}) = \sum_{j=1}^{n_i} caplet(t_{j-1}, t_j)(\sigma_{caplet,i,j})
\tag{16.35}
$$

As a result we obtain a very unstable functional form that is not smooth at all (see Fig. 18.2).

We could instead consider a non exact fit, in which we would seek to smoothen

the curve at the expense of allowing for small differences in the previous equation.

This is, we seek to minimize the difference between successive changes in slopes,

and minimize at the same time the difference between each Capforward price and the

sum of each set of caplet prices.

We will note the slopes as

$$
\beta_i = \frac{\sigma_i - \sigma_{i-1}}{T_i - T_{i-1}}
\tag{16.36}
$$

$$
F(\sigma_0, \ldots, \sigma_N) = \sum_{i=1}^{N} w_i \Bigl( forwardcap_{(T_{i-1},T_i)}(\sigma_{i-1}, \sigma_i) - forwardcap_{(T_{i-1},T_i)}(\Sigma_{i-1}, \Sigma_i) \Bigr)^2 + \lambda \sum_{i=2}^{N} \left( \beta_i - \beta_{i-1} \right)^2
\tag{16.37}
$$

We can take for example λ = 10^{-4} and w_i = 1/T_i. We would like to point out at this

stage that an inexact fit implies that the resulting prices that we will obtain for the

caplets will not sum to give the market quoted cap values. We are therefore giving up

precision at the expense of greater smoothness in the curve. We do not believe

therefore that an inexact fit method should be pursued if we are attempting to

accurately portray the market.


16.14 Piecewise Quadratic

The approach is very similar to the previous one, with the only difference that at

each interval where a capforward is calculated, we now impose a quadratic

relationship between all the caplets instead of a linear relationship. We will

characterise each of these functional forms with the values at the end points, and with

the mid point of each interval. This is useful as the mid point normally coincides with

the value of a specific caplet that we have to calculate. For example, given a

capforward lasting one year, we will have to divide it into two equal 6 month caplets,

meaning that one of them will coincide with the mid point 6 month caplet that we

must calculate anyway.

Another reason is the fact that we will need the mid point anyway for the

computation of the slopes in each interval.

Evidently, for continuity, we impose that the last caplet volatility in each

quadratic function coincides with the first caplet volatility of the following quadratic

function. In addition, we have an extra midpoint to calculate at each interval. We

therefore have N cap volatilities, and 2N+1 degrees of freedom among the midpoints

‘m’ and end points ‘i’ in the quadratic curve.

As we said, for continuity the two points that we must define at each interval are

related to the previous point via a quadratic function

$$
\begin{aligned}
f_i &= f_{i-1} + C_1\,(T_i - T_{i-1}) + C_2\,(T_i - T_{i-1})^2 \\
f_{im} &= f_{i-1} + 0.5\,C_1\,(T_i - T_{i-1}) + 0.25\,C_2\,(T_i - T_{i-1})^2
\end{aligned}
\tag{16.38}
$$

Where we have

$$
C_2 = \frac{2\left( f_i - 2 f_{im} + f_{i-1} \right)}{\left( T_i - T_{i-1} \right)^2},
\qquad
C_1 = \frac{f_i - f_{i-1}}{T_i - T_{i-1}} - C_2\left( T_i - T_{i-1} \right)
\tag{16.39}
$$

And where the slopes are now


$$
\beta_{im} = \frac{\sigma_{im} - \sigma_{i-1}}{T_{im} - T_{i-1}},
\qquad
\beta_i = \frac{\sigma_i - \sigma_{im}}{T_i - T_{im}}
\tag{16.40}
$$

As with the linear fit, we can also perform an inexact fit, that is

$$
F(\sigma_0, \ldots, \sigma_{2N+1}) = \sum_{i=1}^{N} w_i \Bigl( forwardcap_{(T_{i-1},T_i)}(\sigma_i) - forwardcap_{(T_{i-1},T_i)}(\Sigma_i) \Bigr)^2 + \lambda \sum_{i=2}^{2N+1} \left( \beta_i - \beta_{i-1} \right)^2
\tag{16.41}
$$

where we allow for differences between the forward cap and its corresponding

sum of caplets. Otherwise, we can impose an exact fit in which we only minimize the

difference between slopes. This would yield:

$$
F(\sigma_0, \ldots, \sigma_{2N+1}) = \sum_{i=2}^{2N+1} \left( \beta_i - \beta_{i-1} \right)^2
\tag{16.42}
$$
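Recovering C1 and C2 from the end values and the midpoint value of an interval, as in (16.38)-(16.39), is a two-line computation; a small illustrative sketch (names are our own):

```python
def quad_coeffs(f_prev, f_mid, f_end, T_prev, T_end):
    """Recover C1, C2 of (16.39) from the interval's end values f_prev, f_end
    and its midpoint value f_mid, as parameterised in (16.38)."""
    dT = T_end - T_prev
    C2 = 2.0 * (f_end - 2.0 * f_mid + f_prev) / dT**2
    C1 = (f_end - f_prev) / dT - C2 * dT
    return C1, C2
```

Plugging C1, C2 back into (16.38) reproduces both the midpoint and the end value exactly, which is what makes the midpoint a convenient extra degree of freedom.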


16.15 The Algorithm

[Flowchart: for every strike K, the market flat cap volatilities σCaps are interpolated in strike; forward cap prices are computed from σflat; a first guess of the caplet volatilities σCaplets is formed; from the candidate σCaplets the slopes βi and the forward cap prices are recomputed; the combined quadratic errors Σ(βi - βi-1)² (weighted by λ) and Σ(FWDCapi^flat - FWDCapi^caplets)² (weighted by w) drive the optimiser, which modifies the first guess until the combined error is acceptable, at which point the σCaplets are exported.]

Fig. 16.7. Optimisation algorithm for interpolation in maturities


16.16 About the problem of extracting 6M Caplets from Market data.

Maturity [years] \ Strike [%]    1.50    1.75    2.00
 1.0                             30.6    26.2    21.4
 1.5                             27.0    23.8    20.4
 2.0                             26.3    23.3    20.2
 3.0                             25.6    23.4    21.1
 4.0                             25.4    23.4    21.3
 5.0                             24.9    23.1    21.1
 6.0                             24.5    22.8    21.1

Table 16.1. Cap market quotes: flat cap difference under 2 year barrier

The caps that are available to us directly from the market are not constructed over

the same underlying forward LIBOR rates. More specifically, the data that we have

available is constructed over the three month LIBOR for quoted caps with a maturity

of up until two years, and all subsequent caps quoted with longer maturities are

constructed over the six month LIBOR forward rate. Note moreover that the starting

date of our data is no longer always six months past the valuation date. It remains so for caps whose maturities exceed 2 years; however, for the data quoted over the 3 month LIBOR (i.e. with maturities of less than 2 years), the starting date is the third month after the value date. This means for example that the first

market data quoted with a maturity of one year is really constructed from three

caplets, with starting and end dates: (3M,6M), (6M,9M), (9M,1Y).

[Diagram: flat caplets on the 3M EURIBOR over the intervals 3M-6M, 6M-9M, 9M-12M, 12M-1Y3M, 1Y3M-1Y6M.]

Fig. 16.8. Cap market quotes: flat cap difference under 2 year barrier

Further, we have to obtain from the 3 month quotes the equivalent six month

caplet starting on the sixth month so as to be consistent with the rest of the data.


[Diagram: sought-for caplets on the 6M EURIBOR, covering the intervals 6M-12M, 12M-1Y6M, 1Y6M-2Y from the sixth month onwards.]

Fig. 16.9. Creation of the 6 month caplets from 3 month cap market quotes: flat cap difference under 2 year barrier

We shall try to develop a method to extract a measure of the six month caplets

from the 3 month data. The most direct approach would be to assume that the

volatility of the six month cap is equivalent to the σflat(L3M). Mathematically this would

mean:

$$
cap^{3M}(0,T_i)(\Sigma_i^{3M}) = \sum_{j=1}^{n_i} caplet(t_{j-1}, t_j)(\Sigma_i^{3M})
\tag{16.43}
$$

But we would then be using

$$
cap_i(0,T^{3M})(\Sigma_i^{3M}) = cap_i(0,T^{6M})(\Sigma_i^{6M})
\tag{16.44}
$$

which is clearly wrong.

Instead, we decide to construct a cap6M using the following procedure:

Consider three instants of time, 0 < S < T < U, all six-months spaced. Assume also that we are dealing with a 'Swaption x 1' and with S and T expiry six month caplets.

[Diagram: timeline t, S, T, U, with Caplet3M(F1) spanning (S, T), Caplet3M(F2) spanning (T, U), and Caplet6M(F) spanning (S, U).]

Fig. 16.10. Decomposition of a six month caplet into two 3 month caplets

Here we have noted F as the forward rate applicable to each caplet, and (S, T, U) are each separated by three month intervals. Note that both F1


and F2 are related to the three month LIBOR, whereas we are trying to

construct an F related to the six month LIBOR rate.

The algebraic relationship between F, F1 and F2 is easily derived expressing all

forward rates in terms of zero-coupon-bond prices.

$$
F_1 = 2\left( \frac{B(t,S)}{B(t,T)} - 1 \right)
\tag{16.45}
$$

$$
F_2 = 2\left( \frac{B(t,T)}{B(t,U)} - 1 \right)
\tag{16.46}
$$

$$
F = \frac{B(t,S)}{B(t,U)} - 1
\tag{16.47}
$$

Notice that we can rewrite the latter in terms of the previous two:

$$
F = \frac{B(t,S)}{B(t,U)} - 1 = \frac{B(t,S)}{B(t,T)} \cdot \frac{B(t,T)}{B(t,U)} - 1 = \frac{F_1}{2} + \frac{F_2}{2} + \frac{F_1 F_2}{4}
\tag{16.48}
$$

Let us now apply Ito to the formulae for F1 and F2, where we are only really

concerned with the Brownian terms:

$$
\begin{aligned}
dF_1(t) &= (\ldots)\,dt + \sigma_1(t)\, F_1(t)\, dZ_1(t) \\
dF_2(t) &= (\ldots)\,dt + \sigma_2(t)\, F_2(t)\, dZ_2(t) \\
dZ_1(t)\, dZ_2(t) &= \rho\, dt
\end{aligned}
\tag{16.49}
$$

The quantity ρ is the ‘infra correlation’ between the ‘inner rates’ F1 and F2. By

differentiation

$$
dF(t) = \frac{\partial F}{\partial t}\,dt + \sum_{i=1}^{2} \frac{\partial F}{\partial F_i}\, dF_i + \frac{1}{2} \sum_{i,j=1}^{2} \frac{\partial^2 F}{\partial F_i\, \partial F_j}\, dF_i\, dF_j
\tag{16.50}
$$

$$
dF(t) = (\ldots)\,dt + \sigma_1(t)\, F_1(t) \left( \frac{1}{2} + \frac{F_2}{4} \right) dZ_1(t) + \sigma_2(t)\, F_2(t) \left( \frac{1}{2} + \frac{F_1}{4} \right) dZ_2(t)
\tag{16.51}
$$


$$
dF(t) = (\ldots)\,dt + \sigma_1(t) \left( \frac{F_1}{2} + \frac{F_1 F_2}{4} \right) dZ_1(t) + \sigma_2(t) \left( \frac{F_2}{2} + \frac{F_1 F_2}{4} \right) dZ_2(t)
\tag{16.52}
$$

Taking variances on both sides of the above, conditional on the information

available at time t we have

$$
\begin{aligned}
\sigma(t)^2\, F(t)^2 ={}& \sigma_1(t)^2 \left( \frac{F_1(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right)^2 + \sigma_2(t)^2 \left( \frac{F_2(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right)^2 \\
&+ 2\rho\, \sigma_1(t)\, \sigma_2(t) \left( \frac{F_1(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right) \left( \frac{F_2(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right)
\end{aligned}
\tag{16.53}
$$

Let us name

$$
u_1(t) = \frac{1}{F(t)} \left( \frac{F_1(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right),
\qquad
u_2(t) = \frac{1}{F(t)} \left( \frac{F_2(t)}{2} + \frac{F_1(t) F_2(t)}{4} \right)
\tag{16.54}
$$

We can then rewrite the former as

$$
\sigma(t)^2 = u_1(t)^2\, \sigma_1(t)^2 + u_2(t)^2\, \sigma_2(t)^2 + 2\rho\, \sigma_1(t)\, \sigma_2(t)\, u_1(t)\, u_2(t)
\tag{16.55}
$$

We decide to introduce a deterministic approximation by freezing all F’s (and

therefore u’s) at their time-zero value:

$$
\sigma_{approx}(t)^2 = u_1(0)^2\, \sigma_1(t)^2 + u_2(0)^2\, \sigma_2(t)^2 + 2\rho\, \sigma_1(t)\, \sigma_2(t)\, u_1(0)\, u_2(0)
\tag{16.56}
$$

(16.56)

Now recall that F is the particular (one-period) swap rate underlying the ‘S x 1’

swaption, whose (squared) Black’s swaption volatility is therefore

$$
\begin{aligned}
\left( v_S^{Black} \right)^2 &= \frac{1}{S} \int_0^S \sigma_{approx}(t)^2\, dt \\
&= \frac{1}{S}\, u_1(0)^2 \int_0^S \sigma_1(t)^2\, dt + \frac{1}{S}\, u_2(0)^2 \int_0^S \sigma_2(t)^2\, dt + \frac{2\rho\, u_1(0)\, u_2(0)}{S} \int_0^S \sigma_1(t)\, \sigma_2(t)\, dt
\end{aligned}
\tag{16.57}
$$


The first integral can be inputted directly as a market caplet volatility.

$$
v_{S\text{-}caplet}^2 = \frac{1}{S} \int_0^S \sigma_1(t)^2\, dt
\tag{16.58}
$$

The second and third integrals in contrast require some form of parametric

assumption on the instantaneous volatility structure of rates in order to be computed.

The simplest solution is to assume that forward rates have constant volatilities. In

such a case

$$
\frac{1}{S} \int_0^S \sigma_2(t)^2\, dt \approx \frac{1}{S} \int_0^S v_{T\text{-}caplet}^2\, dt = v_{T\text{-}caplet}^2
\tag{16.59}
$$

The third integral becomes:

$$
\frac{1}{S} \int_0^S \sigma_1(t)\, \sigma_2(t)\, dt \approx \frac{1}{S} \int_0^S v_{S\text{-}caplet}\, v_{T\text{-}caplet}\, dt = v_{S\text{-}caplet}\, v_{T\text{-}caplet}
\tag{16.60}
$$

Under this assumption we finally get:

$$
v_{approx}^2 = u_1(0)^2\, v_{S\text{-}caplet}^2 + u_2(0)^2\, v_{T\text{-}caplet}^2 + 2\rho\, u_1(0)\, u_2(0)\, v_{S\text{-}caplet}\, v_{T\text{-}caplet}
\tag{16.61}
$$
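The final combination (16.61) can be sketched as follows, under the accrual conventions of (16.45)-(16.48); the function name and the choice of inputs are illustrative:

```python
from math import sqrt

def six_month_vol(vS, vT, F1, F2, rho):
    """Deterministic ('freezing') approximation (16.61): the composite caplet
    vol from the two inner caplet vols vS, vT, the time-zero inner forwards
    F1, F2 and their 'infra correlation' rho."""
    F = F1 / 2.0 + F2 / 2.0 + F1 * F2 / 4.0      # composite forward (16.48)
    u1 = (F1 / 2.0 + F1 * F2 / 4.0) / F          # frozen weights (16.54)
    u2 = (F2 / 2.0 + F1 * F2 / 4.0) / F
    var = u1**2 * vS**2 + u2**2 * vT**2 + 2.0 * rho * u1 * u2 * vS * vT
    return sqrt(var)
```

With rho = 1 and equal inner vols the result collapses to vS multiplied by (u1 + u2), which is only marginally above vS since the cross term F1 F2 / 4 is second order in the rates.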


17. SABR

At this point in our study, we have effectively smoothed our curve along the maturity direction; that is, we have solved the irregularities in the term structure of our caplet volatility surface. More specifically, and possibly more recognisably visually, we have interpolated along the vertical direction in the market quoted matrix:

Maturity [years] \ Strikes [%]
         1.50   1.75   2.00   2.25   2.50   3.00   3.50   4.00   5.00   6.00   7.00   8.00   10.00
 1.00   30.60  26.20  21.40  15.70  12.00  10.70  12.20  14.00  17.50  21.00  23.50  25.40  27.10
 1.50   27.00  23.80  20.40  17.70  15.70  14.00  15.80  17.00  18.70  20.10  21.20  22.20  26.10
 2.00   26.30  23.30  20.20  18.00  16.70  15.10  16.40  17.30  18.80  20.00  21.20  22.20  26.10
 3.00   25.60  23.40  21.10  19.50  18.60  16.90  17.50  18.10  19.30  20.60  21.70  22.60  24.40
 4.00   25.40  23.40  21.30  19.90  19.20  17.60  17.90  18.30  19.30  20.40  21.40  22.30  23.90
 5.00   24.90  23.10  21.10  19.90  19.30  17.90  17.90  18.10  18.90  19.80  20.70  21.50  23.10
 6.00   24.50  22.80  21.10  19.90  19.40  18.00  17.90  18.00  18.60  19.30  20.00  20.70  22.40
 7.00   24.20  22.60  21.00  19.90  19.40  18.10  17.80  17.80  18.20  18.80  19.50  20.10  21.80
 8.00   24.00  22.40  20.90  19.90  19.40  18.10  17.70  17.60  17.90  18.40  18.90  19.50  21.20
 9.00   23.80  22.20  20.80  19.80  19.30  18.00  17.60  17.40  17.50  17.90  18.40  18.90  20.50
10.00   23.60  22.10  20.70  19.70  19.20  17.90  17.40  17.20  17.20  17.50  17.90  18.30  19.70
12.00   23.00  21.60  20.30  19.30  18.80  17.50  17.00  16.60  16.40  16.60  16.90  17.30  18.40
15.00   22.50  21.10  19.90  18.90  18.30  17.10  16.50  16.10  15.70  15.70  16.00  16.30  17.20
20.00   21.60  20.30  19.20  18.10  17.60  16.40  15.70  15.20  14.80  14.80  15.00  15.30  16.20
25.00   20.80  19.60  18.50  17.50  17.00  15.90  15.20  14.70  14.20  14.30  14.50  14.90  15.70
30.00   20.10  18.90  17.80  17.00  16.60  15.40  14.70  14.20  13.80  13.90  14.30  14.60  15.50

Table 17.1. Market quoted flat cap volatilities [%] by maturity [years] (rows) and strike [%] (columns)

This does not mean however that our curve will be smooth along a horizontal

strike direction. In fact, we have not modified these values at all with respect to the

original market values, and therefore obtain market derived smile volatilities that can

be extremely irregular:


[Chart: caplet Black volatility smiles against strike K [%], for maturities of 0.5, 1.5, 8 and 28 years.]

Fig. 17.1. Caplet current market behaviour

The requirement of a smooth smile is essential for calibration. Of particular

importance is the small peak observed for very short maturities around the ‘at the

money’ value (strike = 3%).

As is done with the market swaption matrix, we decide to execute an inexact

SABR interpolation along strikes to smoothen the curve. It is inexact because we will

modify slightly the market values in exchange for an increase in smoothness.


17.1 Detailed SABR

The SABR method is a stochastic volatility model used here to interpolate LIBOR forward rate volatilities. The model takes a given vector of strikes $\{K_i\}_{i=1}^{n}$ and a vector of Black volatilities $\{\sigma_i\}_{i=1}^{n}$, i.e., a horizontal row from the above matrix. The fit is done in the sense of nonlinear least squares, using:

$$
\begin{aligned}
d\hat{F} &= \hat{\alpha}\, \hat{F}^{\beta}\, dW_1, \qquad \hat{F}(0) = f \\
d\hat{\alpha} &= \nu\, \hat{\alpha}\, dW_2, \qquad \hat{\alpha}(0) = \alpha
\end{aligned}
\tag{17.1}
$$

$$
dW_1\, dW_2 = \rho\, dt
\tag{17.2}
$$

There are four parameters in the model, (α, β, ρ, ν), although we will soon see

that two of these can be set beforehand.

The SABR model uses the price of a European option given by Black’s formula:

$$
\begin{aligned}
V_{call} &= B(t_{set}) \left[ f\, N(d_1) - K\, N(d_2) \right] \\
V_{put} &= V_{call} + B(t_{set})\,(K - f)
\end{aligned}
\tag{17.3}
$$

$$
d_{1,2} = \frac{\log\!\left( \dfrac{f}{K} \right) \pm \dfrac{1}{2}\, \sigma_B^2\, t_{ex}}{\sigma_B \sqrt{t_{ex}}}
\tag{17.4}
$$

It then derives the below expression, where the implied volatility σB(f,K) is given

by singular perturbation techniques. We will not enter the specifics here, but simply

state the expressions:


$$
\sigma_B(K,f) = \frac{\alpha}{(fK)^{(1-\beta)/2} \left[ 1 + \frac{(1-\beta)^2}{24} \log^2\!\frac{f}{K} + \frac{(1-\beta)^4}{1920} \log^4\!\frac{f}{K} + \ldots \right]} \cdot \frac{z}{x(z)} \cdot \left[ 1 + \left( \frac{(1-\beta)^2}{24} \frac{\alpha^2}{(fK)^{1-\beta}} + \frac{\rho\beta\nu\alpha}{4\,(fK)^{(1-\beta)/2}} + \frac{2-3\rho^2}{24}\, \nu^2 \right) t_{ex} + \ldots \right]
\tag{17.5}
$$

Where

$$
z = \frac{\nu}{\alpha}\,(fK)^{(1-\beta)/2} \log\frac{f}{K}
\tag{17.6}
$$

and

$$
x(z) = \log\left[ \frac{\sqrt{1 - 2\rho z + z^2} + z - \rho}{1 - \rho} \right]
\tag{17.7}
$$

For the special case of 'at the money' options, where f = K, the formula simplifies to

$$
\sigma_{ATM} = \sigma_B(f,f) = \frac{\alpha}{f^{1-\beta}} \left[ 1 + \left( \frac{(1-\beta)^2}{24} \frac{\alpha^2}{f^{2-2\beta}} + \frac{\rho\beta\alpha\nu}{4 f^{1-\beta}} + \frac{2-3\rho^2}{24}\, \nu^2 \right) t_{ex} + \ldots \right]
\tag{17.8}
$$

Implementing the SABR model for vanilla options is very easy, since once this

formula is programmed, we just need to send the options to a Black pricer.

The complexity of the formula is needed for accurate pricing. Omitting the last

line of (17.5), for example, can result in a relative error that exceeds three per cent in

extreme cases. Although this error term seems small, it is large enough to be required

for accurate pricing. The omitted terms "+ ⋯" are much, much smaller.
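A minimal sketch of the implied-volatility formula (17.5)-(17.7), including the at-the-money limit (17.8); the function name and the small-z cutoff are our own choices:

```python
from math import log, sqrt

def sabr_black_vol(f, K, t_ex, alpha, beta, rho, nu):
    """Lognormal SABR implied volatility, eq. (17.5)-(17.7).
    The ATM limit K -> f is handled via the series z/x(z) -> 1."""
    logfk = log(f / K)
    fkb = (f * K) ** ((1.0 - beta) / 2.0)         # (fK)^((1-beta)/2)
    # Denominator series in log^2 and log^4 (first line of 17.5)
    denom = fkb * (1.0 + (1.0 - beta)**2 / 24.0 * logfk**2
                   + (1.0 - beta)**4 / 1920.0 * logfk**4)
    z = (nu / alpha) * fkb * logfk                # (17.6)
    if abs(z) < 1e-7:
        zx = 1.0                                   # z/x(z) -> 1 as z -> 0
    else:
        x = log((sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        zx = z / x
    # Time-dependent correction (last line of 17.5)
    corr = 1.0 + ((1.0 - beta)**2 * alpha**2 / (24.0 * fkb**2)
                  + rho * beta * nu * alpha / (4.0 * fkb)
                  + (2.0 - 3.0 * rho**2) * nu**2 / 24.0) * t_ex
    return alpha / denom * zx * corr
```

Setting K = f reproduces the ATM expression (17.8), since the log terms vanish and z/x(z) tends to one.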

The function being minimised in this procedure is

$$
\sum_{i=1}^{n} w_i \Bigl( \mathrm{blackToNormal}(\sigma_i, K_i) - \mathrm{sabrToNormal}(\alpha, \beta, \rho, \nu, K_i) \Bigr)^2
\tag{17.9}
$$


That is, we are minimising the difference between the correct market quotes at

specific strikes, and the quotes that the SABR model provides at those same strikes.

The price we are willing to pay for this adjustment is set by the weights in the

minimisation algorithm that we use. We are currently taking weights of the form:

$$
w_i = \frac{1}{1 + (K_i - F)^2}
\tag{17.10}
$$

17.2 Dynamics of the SABR: understanding the parameters

$$
\begin{aligned}
d\hat{F} &= \hat{\alpha}\, \hat{F}^{\beta}\, dW_1, \qquad \hat{F}(0) = f \\
d\hat{\alpha} &= \nu\, \hat{\alpha}\, dW_2, \qquad \hat{\alpha}(0) = \alpha
\end{aligned}
\tag{17.11}
$$

There are two special cases to which we must pay special attention: β = 1, representing a stochastic lognormal model (flat), $d\hat{F} = \hat{\alpha}\hat{F}\,dW_1$; and β = 0, representing a stochastic normal model (skew), $d\hat{F} = \hat{\alpha}\,dW_1$. On top of these curves we have the superimposed smile, as can be seen in the below graph.

Notice that the σB(f,f) at the money traces the dotted line known as the backbone.

Fig. 17.2. beta = 0 skew imposition, rho smile imposition


Fig. 17.3. beta = 1 flat imposition, rho smile imposition

Let us consider a simplified version of the SABR valid when K is not too far from

the current forward f.

$$
\sigma_B(K,f) = \frac{\alpha}{f^{1-\beta}} \left[ 1 - \frac{1}{2}\left( 1 - \beta - \rho\lambda \right) \log\frac{K}{f} + \frac{1}{12}\left( (1-\beta)^2 + (2 - 3\rho^2)\lambda^2 \right) \log^2\frac{K}{f} + \ldots \right]
\tag{17.12}
$$

where

$$
\lambda = \frac{\nu}{\alpha}\, f^{1-\beta}
\tag{17.13}
$$

The main term in the above is $\sigma_B(f,f) = \alpha / f^{1-\beta}$, which represents what we have

called the backbone. This is almost entirely determined by the exponent β, with the

exponent β = 0 (a stochastic Gaussian model) giving a steeply downward sloping

backbone, and the exponent β = 1 giving a nearly flat backbone.

The second term, $-\tfrac{1}{2}\left(1-\beta-\rho\lambda\right)\log\frac{K}{f}$, represents the overall skew: the slope of the implied volatility with respect to the strike K. We can decompose it into two principal components:

1. The beta skew:

$$
-\frac{1}{2}\left( 1 - \beta \right) \log\frac{K}{f}
\tag{17.14}
$$


is downward sloping, since 0 ≤ β ≤ 1. It arises because the "local volatility" $\alpha f^{\beta}/f = \alpha f^{\beta-1}$ is a decreasing function of the forward price.

2. The vanna skew:

$$
\frac{1}{2}\,\rho\lambda \log\frac{K}{f}
\tag{17.15}
$$

is the skew caused by the correlation between the volatility and the asset price.

Typically the volatility and asset price are negatively correlated, so on average, the

volatility α would decrease (increase) when the forward f increases (decreases). It thus

seems unsurprising that a negative correlation ρ causes a downward sloping vanna

skew.

The last term, $\frac{1}{12}\left[ (1-\beta)^2 + (2-3\rho^2)\lambda^2 \right] \log^2\frac{K}{f}$, also contains two parts:

The first part:

$$
\frac{1}{12}\left( 1 - \beta \right)^2 \log^2\frac{K}{f}
\tag{17.16}
$$

appears to be a smile (quadratic) term, but it is dominated by the downward

sloping beta skew, and, at reasonable strikes K, it just modifies this skew somewhat.

The second part:

$$
\frac{1}{12}\left( 2 - 3\rho^2 \right) \lambda^2 \log^2\frac{K}{f}
\tag{17.17}
$$

is the smile induced by the volga (vol-gamma) effect. Physically this smile arises

because of “adverse selection”: unusually large movements of the forward F happen

more often when the volatility α increases, and less often when α decreases, so strikes

K far from the money represent, on average, high volatility environments.


17.2.1 Fitting market data.

The exponent β and correlation ρ affect the volatility smile in similar ways. They

both cause a downward sloping skew in σB (K, f ) as the strike K varies. From a single

market snapshot of σB (K, f ) as a function of K at a given f, it is difficult to distinguish

between the two parameters.

Fig. 17.4. Indistinguishable smile difference on calibrating with different beta parameters β = 0 and β = 1

Note that there is no substantial difference in the quality of the fits, despite the

presence of market noise. This matches our general experience: market smiles can be

fit equally well with any specific value of β. In particular, β cannot be determined by

fitting a market smile.

Suppose for the moment that the exponent β is known or has been selected. The

exponent β can be determined from historical observations of the “backbone” or

selected from “aesthetic considerations”. Selecting β from “aesthetic” or other ‘a

priori’ considerations usually results in β = 1 (stochastic lognormal), β = 0 (stochastic

normal), or β = 1/2 (stochastic CIR) models. We will see however that in our

particular SABR construction, the beta parameter has a much greater impact than

what we initially expected.


With β given, fitting the SABR model is a straightforward procedure. Simply, for

every particular date (i.e. row) in our caplet matrix, we seek a unique pair ρ, ν with

which we satisfy the SABR equations, and for which we minimize the equation set out

initially in (17.9). The alpha can be fitted analytically in general.

$$
\sum_{i=1}^{n} w_i \Bigl( \mathrm{blackToNormal}(\sigma_i, K_i) - \mathrm{sabrToNormal}(\alpha, \beta, \rho, \nu, K_i) \Bigr)^2
$$

Now, the three parameters α, ρ, and ν have clearly different effects on the curve:

• The parameter α mainly controls the overall height of the curve, and is almost equal to σATM;

• The correlation ρ controls the curve's skew;

• The vol of vol ν controls how much smile the curve exhibits.

Because of the widely separated roles these parameters play, the fitted parameter values tend to be very stable.

It is usually more convenient to use the at-the-money volatility σATM together with β, ρ, and ν as the SABR parameters, instead of the original parameters α, β, ρ, ν. The parameter α is then found whenever needed by inverting

$$
\sigma_{ATM} = \sigma_B(f,f) = \frac{\alpha}{f^{1-\beta}} \left[ 1 + \left( \frac{(1-\beta)^2}{24} \frac{\alpha^2}{f^{2-2\beta}} + \frac{\rho\beta\alpha\nu}{4 f^{1-\beta}} + \frac{2-3\rho^2}{24}\, \nu^2 \right) t_{ex} + \ldots \right]
\tag{17.18}
$$

This inversion is numerically easy since the bracketed t_ex term is small. With this parameterization, fitting the SABR model requires fitting ρ and ν to the implied volatility curve, with σATM given by the market and β = 1 selected.

In many markets, the ATM volatilities need to be updated frequently, say once or

twice a day, while the smiles and skews need to be updated infrequently, say once or

twice a month. With the new parameterization, σATM can be updated as often as

needed, with ρ, ν (and β) updated only as needed. In general the parameters ρ and ν


are very stable (β is initially assumed to be a given constant), and need to be re-fit only

every few weeks. This stability may be because the SABR model reproduces the usual

dynamics of smiles and skews. In contrast, the at-the-money volatility σATM , or,

equivalently the α, may need to be updated every few hours in fast-paced markets.
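Recovering α from σATM as in (17.18) can be done by a simple fixed-point iteration, precisely because the bracketed t_ex term is small; a sketch with illustrative names:

```python
def alpha_from_atm(sigma_atm, f, t_ex, beta, rho, nu, n_iter=50):
    """Invert the ATM relation (17.18) for alpha by fixed-point iteration:
    alpha <- sigma_atm * f^(1-beta) / corr(alpha). Since the correction term
    is close to one, the iteration contracts quickly."""
    fb = f ** (1.0 - beta)
    alpha = sigma_atm * fb       # zeroth-order guess: sigma_ATM = alpha / f^(1-beta)
    for _ in range(n_iter):
        corr = 1.0 + ((1.0 - beta)**2 * alpha**2 / (24.0 * fb**2)
                      + rho * beta * nu * alpha / (4.0 * fb)
                      + (2.0 - 3.0 * rho**2) * nu**2 / 24.0) * t_ex
        alpha = sigma_atm * fb / corr
    return alpha
```

A direct cubic-root solve of (17.18) in α would also work; the iteration is simply the least code.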

17.2.2 σATM Selection

For our specific caplet case, the ‘at the money’ volatility with which we set the

global volatility level is obtained as the volatility corresponding to the caplet’s

forward. For a caplet with a value date t, fixing date T and maturity U, the forward

strike is simply:

$$
K_{FWD} = \frac{1}{m}\left( \frac{B(t,T)}{B(t,U)} - 1 \right)
\tag{17.19}
$$

Thus, since at this stage, we will typically have a caplet matrix (either linear or

cubic spline interpolated), we can extract the corresponding σATM by interpolating along a row at the above strike. Graphically, this is equivalent to:

[Chart: constrained cubic spline through the Black vols of the 1 year maturity, 6 month caplet row; reading the curve at the forward strike of 4.55% gives σATM = 0.0899.]

Fig. 17.5. Constructing the caplet 'at the money' volatilities


The above must be performed for each of the maturities. Subsequently, the entire caplet matrix, the forwards FWD_i, the σATM,i, the set of dates (value date t, fixing dates T_i and maturities U_i) and the tenors (6 months in our case) are used as inputs for the SABR surface construction.
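The σATM selection itself reduces to computing the forward strike (17.19) and reading the interpolated caplet vol row at that strike; in this sketch plain linear interpolation stands in for the constrained cubic spline of the text, and all names are illustrative:

```python
def atm_vol_for_caplet(B_tT, B_tU, m, strikes, vols):
    """sigma_ATM selection (section 17.2.2): compute the caplet's forward
    strike via (17.19), K_FWD = (B(t,T)/B(t,U) - 1)/m, and interpolate the
    vol row (strikes, vols) at that strike. strikes must be increasing."""
    k_fwd = (B_tT / B_tU - 1.0) / m
    if k_fwd <= strikes[0]:
        return vols[0]
    for i in range(1, len(strikes)):
        if k_fwd <= strikes[i]:
            w = (k_fwd - strikes[i - 1]) / (strikes[i] - strikes[i - 1])
            return (1.0 - w) * vols[i - 1] + w * vols[i]
    return vols[-1]                 # flat extrapolation beyond the grid
```

Flat extrapolation outside the quoted strike range is our own convention here; the text does not prescribe one.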

Some final considerations:

In most markets there is a strong smile for short-dated options which relaxes as

the time-to-expiry increases; this is exactly what we observe in Fig. 18.14 with the

caplet market. Consequently the volatility of volatilities ν is large for short dated

options and smaller for long-dated options, regardless of the particular underlying.

The SABR model predicts that whenever the forward price f changes, the implied

volatility curve shifts in the same direction and by the same amount as the price f.

This predicted dynamics of the smile matches market experience.

If β < 1, the “backbone” is downward sloping, so the shift in the implied volatility

curve is not purely horizontal. Instead, this curve shifts up and down as the at-the-

money point traverses the backbone.


18. Result Analysis

18.1.1 Flat Cap Volatilities

These are the market data which we possess to begin with. We seek to interpolate from the caps the necessary intermediate maturity values to be able to perform the subsequent caplet interpolation procedure. The first thing that we must notice is the fact that all interpolation methods produce intermediate points that are basically indistinguishable within the cap curve. This apparently suggests that they will also yield very similar caplet curves. We will see that this is not the general case.

[Chart: flat cap volatility against maturity [years] from 0 to 30; market quotes versus natural spline and constrained spline interpolations.]

Fig. 18.1. Cap flat market quotes versus interpolated flat market quotes


18.1.2 Linear Cap Interpolation versus Quadratic Cap Interpolation

The Linear Cap interpolation method is that which was already implemented in

the Banco Santander (the green line below). We will use it as the standard of

comparison with respect to all other methods tested. Notice how it presents very

noticeable ‘bumps’ throughout the entire curve, suggesting that we will encounter

future difficulties when using this data.

The first thing we must realize is the actual location of the irregularities. A careful

examination with respect to the previous cap graph from which the caplets are

derived shows that the jumps produced in the below graph coincide exactly with

sharp changes in the gradient of Fig. 18.1. This apparently suggests that for

smoothness in the caplet values, we require minimal slope variations in our cap

graph.

[Figure: 'Cap Linear and Quadratic Interpolation'; caplet volatilities vs maturity (years); series: Cap Quadratic, Cap Linear]

Fig. 18.2. Caplet interpolated volatilities using linear and quadratic approaches

Quadratic formulae tend to show a high degree of convexity or concavity between their interpolation points. No matter how small these effects may be in the cap graph, it is evident from the above that the caplet transformation is sensitive to them.

18.1.3 Cubic Spline interpolation

This is the next logical step. The increase in the order of the polynomial used clearly translates into a better interpolation of the caps. As a result, the caplets are smoother. Below we show a particularly unfavourable case that occurs at a strike


value of K = 4%. Notice that we still have undulations, but no longer the abrupt spikes

characteristic in the linear method.

[Figure: 'Cap Cubic Spline Interpolation'; caplet volatility vs maturity (years); series: Cap Linear, Cap Cubic Spline]

Fig. 18.3. Cap interpolated volatilities using linear and cubic spline approaches

[Figure: 'Cap Cubic Spline Interpolation'; caplet volatility vs maturity (years); series: Cap Cubic Spline Natural, Cap Cubic Spline Constrained]

Fig. 18.4. Cap interpolation between natural and constrained cubic splines

The second graph is an analysis of the two different spline methods we

implemented. Remember that we seek the smoothest changes in slope possible in our

cap interpolation, with as little overshooting as possible. This is specifically what is

achieved through the constrained spline method. Although the differences are small, a

more careful examination of the above curves shows that the amplitude of the

oscillations in the natural spline are always greater than those in the constrained

spline.
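For illustration, a natural cubic spline of the kind compared above can be built with SciPy's `CubicSpline`; the cap quotes below are toy values, not the market data of Fig. 18.1:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Toy flat cap volatility quotes (maturity in years -> Black vol); values are illustrative.
maturities = np.array([1.0, 2.0, 3.0, 5.0, 10.0, 15.0, 20.0, 30.0])
cap_vols   = np.array([0.175, 0.186, 0.183, 0.175, 0.160, 0.152, 0.148, 0.145])

# bc_type='natural' imposes zero second derivative at both ends,
# matching the "natural spline" curve discussed above.
spline = CubicSpline(maturities, cap_vols, bc_type='natural')

# Intermediate maturities needed for the subsequent caplet stripping (every 6 months).
grid = np.arange(1.0, 30.0 + 0.5, 0.5)
interpolated = spline(grid)
```

The constrained spline we compare against is a custom variant that damps overshooting; it is not available off the shelf in SciPy and would have to be implemented separately.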


18.1.4 Piecewise Constant Caplet Interpolation

[Figure: 'Caplet Constant Interpolation'; caplet volatility vs maturity (years); series: Caplet Constant (1st point free), Cap linear interpolation, Caplet Constant (1st point fixed)]

Fig. 18.5. Caplet interpolated volatilities using linear and linear cap approaches

We now turn to our second method: interpolation between caplets. Once again

implemented within the bank. The piecewise constant interpolation for each set of

caps is clearly not a smooth solution and has been presented mostly for a visual

confirmation of what the theory predicted.

It does help however in understanding the range of caplets we must interpolate

at each time interval. It is clearly visible that we interpolate in groups of two up until

10 years, then in two groups of around 5 caplets each until 15 years, and then in vast

groups of 10 caplets apiece until 30 years.

18.1.5 Piecewise Linear Caplet Interpolation

[Figure: 'Caplet Linear Interpolation'; caplet volatility vs maturity (years); series: Cap linear, Caplet linear]


Fig. 18.6. Caplet interpolated volatilities using linear approaches

The improvement here is clear with respect to the piecewise constant. However, we still attain no improvement with respect to the cap linear interpolation.

18.1.6 Piecewise Quadratic Caplet Interpolation: first guess cubic spline

[Figure: 'Caplet Quadratic Interpolation'; caplet volatility vs maturity (years); series: Cap Cubic Spline, Caplet Quadratic]

Fig. 18.7. Caplet interpolated volatilities using cubic spline, and an optimisation

algorithm using quadratic approaches

The optimisation algorithm here can prove extremely time consuming. Moreover, the solution towards which it converges is still extremely similar to the results obtained through the cap spline interpolation.

We therefore decide to implement the spline interpolator. It is simpler, less time consuming, and equally efficient in smoothing out the caplet function.

18.2 SABR Results

Our curve is now smooth along maturities. We return now to the analysis of the

interpolation across different strikes. Recall that we had particular difficulties,

especially at very short or extremely long maturities. Through the SABR interpolation

method, we managed to greatly reduce the fluctuations present within the caplet

Black volatilities.


[Figure: 'SABR'; Black vol vs strike K (%); series: Market caplets, SABR]

Fig. 18.8. SABR short maturity caplet smile

Above, we have taken a set of different strikes for a six month caplet with a fairly near exercise date, starting in five years' time. We see that without the SABR, the spline interpolates the data reasonably well, but still shows a few irregularities. The SABR curve proves to be a lot smoother, and the fit is relatively close.

[Figure: 'SABR'; Black vol vs strike K (%); series: Market Caplets, SABR]

Fig. 18.9. SABR long maturity caplet smile


Here, at a long maturity (28 years) for a six month caplet, we see that the extrapolated data is extremely irregular. This is mainly because the market only quotes caps with 25 year or 30 year maturities, so interpolating six month caplets across these five year gaps can degenerate rapidly. The SABR proves to be an extremely good approximation.

[Figure: 'SABR Smile'; Black vol vs strikes; series: Linear Interpolated Caps, SABR Interpolation]

Fig. 18.10. SABR very short 6 month maturity: sharp smile

For very short maturities the smile can be quite sharp. The SABR tends to widen it slightly and smooth the curvature on the slopes. Notice that there is a displacement between the 'at the money' levels; we shall discuss this characteristic later.

[Figure: 'SABR Smile'; Black vol vs strike (%); series: Linear Cap Interpolation, SABR Interpolation]

Fig. 18.11. SABR short maturity caplet smile inexact correction: very irregular smile


The SABR also prevents overshoots (we typically see these occurring at low strikes), maintaining the same global volatility level and capturing the general smile convexity.

The SABR that we have implemented has a particular feature: the flexibility to perform an independent smile evaluation followed by a global volatility level evaluation.

We usually have a set of data points that reproduce the general form of the smile. However, it is also common to have an independent market quoted value for the 'at the money' volatility. The two are not necessarily obtained from the same set of data, so they may sometimes not be coherent. See below how the data points are inconsistent: we obtain a general smile that does not pass through the 'at the money' point.

[Figure: 'Smile Fitting'; Black vol vs strike (%); the fitted general smile misses the quoted 'at the money' point]

Fig. 18.12. Difference in general smile and ‘at the money’ level

For this reason, we use one set of data (without the ‘at the money’) to generate the

smile, and we then displace the curve vertically, thus forcing it to pass through the

desired ‘at the money’ value.


[Figure: 'Smile Fitting'; Black vol vs strike (%); the corrected smile passes through the 'at the money' point]

Fig. 18.13. Smile correction towards ‘at the money level’

18.3 3D Analysis

If we were to analyse the form of the entire caplet volatility surface, we would see

in a summarized way, the improvements achieved at each phase of the interpolation

process.

18.3.1 Interpolation in Maturities

[Figure: 'Caplet Volatility Surface - Linear Cap Interpolation'; Black vol vs maturity (years) and strike]


Fig. 18.14. Initial linear interpolated caplet volatility surface

[Figure: 'Caplet Volatility Surface - Cubic Smile Interpolation'; Black vol vs maturity (years) and strike]

Fig. 18.15. Cubic Spline caplet volatility surface

[Figure: 'Caplet Volatility Surface - SABR'; Black vol vs maturity (years) and strike]

Fig. 18.16. SABR smooth interpolated smile surface with cubic spline

We see from a comparison of the above three caplet volatility surfaces that the cubic spline successfully smooths the surface along the maturity axis. Notice how the surface is particularly irregular under linear interpolation. Notice also that the SABR surface may have slightly different (and, we may say, better) values along the maturity


axis, despite using the same cubic spline algorithm as Fig. 18.11, because an SABR fitting has been performed afterwards.

18.3.2 Smile Analysis

[Figure: two surfaces, 'Linear or Cubic Smile Interpolation' and 'SABR Smile'; Black vol vs maturity (years) and strike]

Fig. 18.17. Irregular Smile for both linear interpolation and cubic spline, whereas

SABR presents a much smoother outline

We clearly see from the above two graphs that we have successfully achieved the smile interpolation that we set out for. The visual comparison is self-explanatory.


18.3.3 Detailed Analysis

While working with our HJM model, we came across an issue of particular interest. The caplet volatility surface that the bank was generating presented a particularly annoying bump 'at the money', which was causing immense difficulties in the product calibration process. The anomaly tended to occur especially among smiles with low maturities, often between one and two years. We discovered that it occurred at a strike around 4,5%, which coincided exactly with a column of strikes that was being artificially created through linear interpolation. Avoiding this linear interpolation between strikes already eliminated a great portion of the bump.

The implementation of a constrained cubic spline cap interpolation yielded the pink graph below. As can be seen, the cubic spline improves the smile tremendously. However, it does not really solve the problem, for it seems simply to displace the bump further towards lower strikes.

[Figure: "Current 'Bump'"; Black vol vs strike (%); series: Linear cap interpolation, constrained cubic interpolation]

Fig. 18.18. Smile bump ‘at the money level’ in linear cap interpolation; maturity of 1,5

years

That the anomaly should be located below 2,5% is extremely convenient for our purposes. It means that for strikes larger than this, the entire curve appears very smooth and thus valid. Furthermore, discussions with traders have provided the necessary insight to confirm that strikes lower than 2,5% are not liquid enough,


and so are not even traded. This implies that traders already tend to reject the problematic region below 2,5%, meaning that our anomaly would not come into play.

A final factor supporting the view that the strange 'elbow' below K = 2,25% is of little importance is that these values will no longer be quoted by REUTERS (from which we currently obtain our data), meaning that the entire problem would disappear.

A further discussion on the above subject could extend into analysing what the

SABR smoothing produces in the above situation, and what the ‘conversion from 3

months to six months’ algorithm yields. Recall what was discussed in section 16.16

where the 3 month quotes were converted to artificial 6 month quotes. Remember also

how this only affected the data between the 1 year and 2 year time-periods. All other

dates coincided exactly with the cubic spline general method.

The main difference we observe among the graphs below is that the 3 month version also smooths out the bump, but does so at a higher volatility level; i.e., it provides a smooth smile almost tangent to the top of the bump. Notice that its 'elbow' is also displaced.

[Figure: 'Smile Comparison'; Black vol vs strike (%); series: linear cap interpolation, constrained cubic spline, 3M to 6M cubic spline]

Fig. 18.19. Caplet 3M to 6M smile for a maturity of 1,5 years

From a quant’s perspective, we cannot really choose which of the spline curves is

more correct visually. It is necessary for the curves to be implemented and used in


calibrations in order to see whether the prices they yield for certain products coincide

with the market prices.

We present below the variations amongst the three curves for the 1,5 and 2 year

maturities. Beyond this, cubic and ‘3M to 6M’ are identical. Notice that in these cases

the 3 month correction lies below the other two. Thus, the correction seems to present

greater fluctuations in volatility level than the other two curves.

[Figure: two panels, 'Maturity 1Y' and 'Maturity 2Y'; Black vol vs strike (%); series: Linear cap interpolation, constrained cubic interpolation, 3M to 6M]

Fig. 18.20. Comparisons in cubic and 3M to 6M adjustments

18.3.4 The SABR Strike Adjustment

A detailed analysis of the SABR curve’s dynamics proved that the adjustment

was not perfect. There are several important features to notice when applying an

SABR. Recall that we had implemented an algorithm that fitted the SABR curve to a

specific volatility level. This level must be taken at a point of interest and relevance.

Below, we present a case in which it was taken for a strike of 3,08%. It is clear that

the adjustment of the SABR to the cubic spline curve is exact at the corresponding

volatility level of 0,158. However, the adjustment in other areas of the graph, especially around the 'at the money' region and beyond, is a lot worse.


[Figure: 'SABR adjusted at K = 3,08'; Black volatility vs strike (%); series: linear cap interpolation, cubic spline, SABR adjusted at K = 3,08]

Fig. 18.21. SABR strike adjustment

[Figure: 'SABR adjusted at K = 4,5'; Black volatility vs strike (%); series: Linear cap interpolation, cubic spline, SABR adjusted at the money]

Fig. 18.22. SABR at the money strike adjustment, β = 1

Above in Fig. 18.22, for a strike at the money, the general level is much better

achieved. However we notice a possible problematic ‘undershoot’ in the ‘at the

money’ region.

It would be necessary, as we said previously, to analyse through calibrations

whether the above is in fact a correct, viable option, or whether the SABR is producing

a form that is too pronounced.

Aiming now to analyse the flexibility of the SABR model, we proceeded to vary some of its parameters to see whether it is possible to modify at will the form


of the curve. Our initial intention is to avoid the undershoot and to resemble as best

as possible the cubic spline interpolation curve, avoiding however its peculiar elbow

for low strikes.

Before proceeding any further, it is necessary to state clearly that the SABR is not

an exact method, whereas the cubic spline is. Thus, the SABR does not satisfy the

condition that

\[
\mathrm{ForwardCap} = \sum_i \mathrm{Caplet}_i
\]

but instead, constructs its caplets on a best fit minimisation procedure that gives

more importance to the fitting of a smooth curve.
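This exactness condition, that a cap price is the sum of its caplet prices, is what the stripping procedure enforces. A schematic sketch of extracting one caplet as the price difference of two consecutive caps under Black's formula (all numbers below are illustrative, not market data):

```python
import math

def black_caplet_price(F, K, vol, T, tau, df):
    """Black (lognormal) price of a caplet on forward F, fixing at T, accrual tau."""
    if vol * math.sqrt(T) < 1e-12:
        return df * tau * max(F - K, 0.0)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    d1 = (math.log(F / K) + 0.5 * vol * vol * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return df * tau * (F * N(d1) - K * N(d2))

def cap_price(caplet_vols, forwards, K, fixings, tau, dfs):
    """A cap is the sum of its caplets: Cap = sum_i Caplet_i."""
    return sum(black_caplet_price(F, K, v, T, tau, df)
               for F, v, T, df in zip(forwards, caplet_vols, fixings, dfs))

# Toy data: two caps sharing all caplets except the last one.
K, tau = 0.04, 0.5
fixings  = [0.5, 1.0, 1.5]
forwards = [0.042, 0.043, 0.044]
dfs      = [0.98, 0.96, 0.94]
caplet_vols = [0.18, 0.175, 0.17]

short_cap = cap_price(caplet_vols[:2], forwards[:2], K, fixings[:2], tau, dfs[:2])
long_cap  = cap_price(caplet_vols,     forwards,     K, fixings,     tau, dfs)

# The capforward price is the difference of the two cap prices,
# and equals the price of the extra caplet exactly.
capforward = long_cap - short_cap
extra_caplet = black_caplet_price(forwards[2], K, caplet_vols[2], fixings[2], tau, dfs[2])
```

Inverting Black's formula on `capforward` then recovers the caplet volatility; a SABR fit, by contrast, only approximates this identity.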

18.3.5 The Beta Parameter

A first idea was to refer back to the SABR model and to realise that the β

parameter which we initially thought of as playing a minor role, could have a much

greater impact in our graphs than what we initially expected. Recall that the SABR

model was:

\[
d\hat F = \hat\alpha\,\hat F^{\beta}\,dW_1, \qquad \hat F(0) = f
\]
\[
d\hat\alpha = \nu\,\hat\alpha\,dW_2, \qquad \hat\alpha(0) = \alpha
\tag{18.1}
\]

This means that β = 1 yields a flat lognormal model, whereas β = 0 yields a normal skew model.

Notice how in Fig. 18.22 we use a flat lognormal model. As a consequence, the SABR can only adapt to the caplet points with a very pronounced curvature that undershoots. A more skewed model is capable of solving the problem: see below with β = 0 how the problem is already greatly reduced.
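The smile shapes discussed here come from the standard Hagan et al. asymptotic implied-volatility expansion for model (18.1). A compact sketch of that formula (the parameters in the example are illustrative, not calibrated values):

```python
import math

def sabr_black_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. lognormal implied vol for the SABR model of Eq. (18.1)."""
    if abs(F - K) < 1e-12:                         # at-the-money limit
        fk = F ** (1.0 - beta)
        term = (((1 - beta) ** 2 / 24.0) * alpha ** 2 / fk ** 2
                + 0.25 * rho * beta * nu * alpha / fk
                + (2.0 - 3.0 * rho ** 2) * nu ** 2 / 24.0)
        return alpha / fk * (1.0 + term * T)
    logfk = math.log(F / K)
    fkb = (F * K) ** ((1.0 - beta) / 2.0)
    z = (nu / alpha) * fkb * logfk
    xz = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
    denom = fkb * (1.0 + ((1 - beta) ** 2 / 24.0) * logfk ** 2
                   + ((1 - beta) ** 4 / 1920.0) * logfk ** 4)
    term = (((1 - beta) ** 2 / 24.0) * alpha ** 2 / fkb ** 2
            + 0.25 * rho * beta * nu * alpha / fkb
            + (2.0 - 3.0 * rho ** 2) * nu ** 2 / 24.0)
    return (alpha / denom) * (z / xz) * (1.0 + term * T)

# beta = 1 (flat lognormal) vs beta = 0 (normal skew); parameters are hypothetical.
smile_ln = [sabr_black_vol(0.045, k, 1.5, 0.16, 1.0, -0.2, 0.4) for k in (0.03, 0.045, 0.06)]
smile_n  = [sabr_black_vol(0.045, k, 1.5, 0.16 * 0.045, 0.0, -0.2, 0.4) for k in (0.03, 0.045, 0.06)]
```

Note that for β = 0 the α parameter is a normal (absolute) volatility, which is why it is rescaled by the forward in the example.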


[Figure: 'SABR, β = 0'; Black vol vs strike (%); series: linear cap interpolation, constrained cubic spline, 3M to 6M cubic spline]

Fig. 18.23. SABR β = 0, normal skew; maturity 1 year

However, for long maturities that present a very flat smile, the lognormal model

turns out to be much more adequate than the skew. See below for a detailed analysis.

18.3.6 SABR Weights

Recall that we had initially implemented the weights as

\[
w_i = \frac{1}{1+\left(K_i - F\right)^2}
\tag{18.2}
\]

This implies that we give greater weights to the values that are closest to the ‘at

the money’ strike, whereas more distant values do not have such a great impact in

determining the form of the SABR. We will see now that maintaining the β parameter

and simply modifying the weights enables us to decide in which area we want to

stress our SABR. We can also choose to consider all areas as being equally important.

We shall therefore compare the above weighting scheme with the homogeneous form wi = 1.
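The effect of the two weighting schemes can be sketched with a weighted least-squares fit. A toy quadratic smile stands in for the SABR curve here, and all data are hypothetical:

```python
import numpy as np

# Fit a toy quadratic smile a + b*(K-F) + c*(K-F)**2 to caplet vols under the
# two weighting schemes discussed: w_i = 1/(1+(K_i-F)^2) versus w_i = 1.
F = 4.5                                          # forward strike level, in percent
strikes = np.array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5])
vols = np.array([0.24, 0.20, 0.17, 0.155, 0.16, 0.17, 0.185, 0.21])

def fit_smile(weights):
    # Weighted least squares: scale the design matrix and targets by sqrt(w).
    X = np.vstack([np.ones_like(strikes), strikes - F, (strikes - F) ** 2]).T
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], vols * sw, rcond=None)
    return coef

w_atm = 1.0 / (1.0 + (strikes - F) ** 2)         # Eq. (18.2): stress near-ATM points
w_flat = np.ones_like(strikes)                    # homogeneous weighting

coef_atm, coef_flat = fit_smile(w_atm), fit_smile(w_flat)

def smile(coef, K):
    return coef[0] + coef[1] * (K - F) + coef[2] * (K - F) ** 2
```

By construction, the ATM-weighted fit achieves a weighted residual no larger than the homogeneous fit does under the same weights, i.e. it concentrates accuracy near the 'at the money' strike at the expense of the wings.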


18.3.7 Global Analysis

We present below the cases that we believe best summarize the range of tests performed. They analyse the adequacy of the different parameters within the SABR model in fitting the caplet volatility smile.

[Figure: four panels, 'SABR β=1, 1,5Y, w=1'; 'SABR β=1, 1,5Y, weighted'; 'SABR β=0,5, 1,5Y, w=1'; 'SABR β=0, 1,5Y, weighted'; Black vol vs strike (%)]

Fig. 18.24. SABR comparisons between long and short maturities, varying the β and

the weights

We see clearly that for short maturities the caplet curve has a pronounced skew, meaning that the normal skew model β = 0 adapts better, with less undershoot. As for the weighting, the weighted parameters seem to act better for short maturities.


For the flat smiles at long maturities, both β curves are capable of adapting relatively well, although the β = 1 flat lognormal tends to be slightly better. The greatest impact is achieved by varying the weights attributed to each region. Below, we see that the uniformly weighted curves envelop the market curve from below.

[Figure: two panels, 'SABR β=1, 8Y, w=1' and 'SABR β=1, 8Y, weighted'; Black vol vs strike (%); series include linear cap interpolation and constrained cubic]

Fig. 18.25. Weighted SABR, β = 1, Maturity 8Y

We conclude from the above study that an even weighting of wi = 1 seems more adequate for our caplet volatility surface. Furthermore, the ideal adjustment does not seem to consist of a unique β parameter. Instead, skewed models with β = 0 are more adequate for short maturities, whereas flat lognormal models tend to be better for long maturities.

We could consider implementing this variation of β in our final model, or perhaps even attempt to calibrate our model for the best possible β.


18.4 Algorithm

We now outline the algorithm that was finally implemented. From all the above alternatives, we decided to perform a cubic spline cap interpolation on the initial cap data along the maturity direction; other interpolation methods, or even the optimisation algorithms, can alternatively be selected at this stage.

From these interpolated exact caps we easily construct the interpolated capforwards, and from these extract the corresponding (and therefore also exact) caplets. This can be done with or without a 3 month to 6 month adjustment.

If the user wants to smooth the curve further, he has the option of selecting an SABR fit, where he is also free to choose the beta factor and weights he prefers. From this, the final caplet matrix is recalculated.


[Flowchart: market flat σ caps (strikes K x maturities T) -> maturity interpolation (TYPE I: cubic spline, quadratic, linear; TYPE II: quadratic optimisation, linear optimisation, piecewise constant) -> optional 0,5 to 2Y '3M to 6M' caplet adjustment -> capforward creation -> caplets extracted -> final cubic spline σ caplet matrix; if an SABR is selected, an SABR fit for each maturity (caplet matrix row) yields the final SABR σ caplet matrix]

Fig. 18.26. Caplet volatility surface construction algorithm


18.5 Future Developments

Along this line, we have not closed off any of the alternatives, but have left all the SABR parameters and interpolation or optimisation algorithms available to the end trader. Depending on his calibrations and results, an optimum set of parameters could be selected for final implementation. From our initial tests, however, it seems that the best combinations always include cubic splines. These could be sufficient on their own if an exact calculation is desired. For further smoothness, we have found an SABR to be necessary, ideally with constant weights w = 1, and with the beta parameter varying from β = 0 for short maturities to β = 1 for long maturities.

There is a further possibility of including extra data in the caplet volatility matrix. We did not have time to implement the algorithm, but it is the logical next step if such information is to be added to the adjustment.

REUTERS and other market sources often add an independent column to the cap matrix that we have used so far, containing 'at the money' caplet quotes for each maturity. The difficulty in incorporating them into the matrices already used is that they do not correspond to a single unique strike; instead, each maturity has its own strike.

We present here the procedure that we recommend following.

CAPS, with strikes K across and fixing/end maturities down:

Fixing End | K=1,5 | K=2 | K=3 | K=4 | K=5 | K=6 | ATM Vol
0,5 1 | σ(U=1, K=1,5) | σ(U=1, K=2) | σ(U=1, K=3) | σ(U=1, K=4) | σ(U=1, K=5) | σ(U=1, K=6) | σ1 ATM at K1
0,5 1,5 | σ(U=1,5, K=1,5) | σ(U=1,5, K=2) | σ(U=1,5, K=3) | σ(U=1,5, K=4) | σ(U=1,5, K=5) | σ(U=1,5, K=6) | σ1,5 ATM at K1,5
0,5 2 | σ(U=2, K=1,5) | σ(U=2, K=2) | σ(U=2, K=3) | σ(U=2, K=4) | σ(U=2, K=5) | σ(U=2, K=6) | σ2 ATM at K2
0,5 3 | σ(U=3, K=1,5) | σ(U=3, K=2) | σ(U=3, K=3) | σ(U=3, K=4) | σ(U=3, K=5) | σ(U=3, K=6) | σ3 ATM at K3
0,5 4 | σ(U=4, K=1,5) | σ(U=4, K=2) | σ(U=4, K=3) | σ(U=4, K=4) | σ(U=4, K=5) | σ(U=4, K=6) | σ4 ATM at K4

Fig. 18.27. Cap market quotes

1. We are going to perform a stepwise construction. The first idea consists of taking the first row of the cap market quotes (0,5 to 1Y). We know that for these, we can

either perform a three month to six month adjustment, or directly take them as being

equal to the six month caplets. With our row of caplets, we also have our ATM cap


volatility with its particular strike. For this first step, we can also consider the ATM

cap as being equal to the caplet volatility.

Caps = Caplets = Capforwards

0,5 1 | σ(U=1, K=1,5) | σ(U=1, K=2) | σ(U=1, K=3) | σ(U=1, K=4) | σ(U=1, K=5) | σ(U=1, K=6) | σ1 ATM at K1

With all this data and constructing the forwards and σATM necessary for the SABR,

we can proceed to create our SABR smile for this first row. (Notice that the σATM for

the cap and the σATM for the caplet are two separate entities. The first is quoted by the

market and is used as an additional point in our curve. The caplet’s ATM is calculated

directly from its corresponding forward, as stated in 17.2.2).

[Figure: SABR smile fitted to the first caplet row; Black vol vs strike (%)]

2. For the second row in the cap matrix (0,5Y to 1,5Y), we can no longer consider

the caps and caplets as having equal volatilities. Now, using our current row and the

previous cap row we can construct the corresponding cap forward as their difference.

\[
\mathrm{CapForward} = \mathrm{Cap}_{0{,}5 \to 1{,}5Y} - \mathrm{Cap}_{0{,}5 \to 1Y}
\]

Cap:

0,5 1 | σ(U=1, K=1,5) | σ(U=1, K=2) | σ(U=1, K=3) | σ(U=1, K=4) | σ(U=1, K=5) | σ(U=1, K=6)
0,5 1,5 | σ(U=1,5, K=1,5) | σ(U=1,5, K=2) | σ(U=1,5, K=3) | σ(U=1,5, K=4) | σ(U=1,5, K=5) | σ(U=1,5, K=6) | σ1,5 ATM at K1,5

CapForward = Caplet (from subtracting prices and inverting Black-Scholes):

1 1,5 | σ(U=1,5, K=1,5) | σ(U=1,5, K=2) | σ(U=1,5, K=3) | σ(U=1,5, K=4) | σ(U=1,5, K=5) | σ(U=1,5, K=6)

This must be directly equal to the corresponding caplets, as the capforward is

constructed over a unique 6 month interval.

We cannot, however, construct the K1,5 ATM capforward by subtracting the ATM cap from the previous row's ATM 1Y cap, as they correspond to different strikes (K1



and K1,5 respectively). To construct the K1,5 ATM capforward, we need to extract the previous row's cap at strike K1,5Y by interpolation (where we have included the previous row's ATM in the data):

Fixing End | 1,5 | 2 | 3 | K1 ATM | 4 | 5 | 6
0,5 1 | σ(U=1, K=1,5) | σ(U=1, K=2) | σ(U=1, K=3) | σ1 ATM at K1 | σ(U=1, K=4) | σ(U=1, K=5) | σ(U=1, K=6)

Now, with this interpolated σ(U=1, K=K1,5Y), the ATM capforward is constructed by subtracting the two caps with the same K1,5 ATM strike.

The caplets are then created and exported to be smoothed through the SABR fit.

[Figure: SABR smile for the 1,5Y caplet row; Black vol vs strike (%)]

3. The 0,5Y to 2Y cap row is analogous to the former.

4. The following row proves to be different. The 0,5Y to 3Y cap row would produce capforwards composed of two 6 month caplets. Instead, we must take all the previous cap rows and use a cubic spline interpolator to generate an intermediate, fictitious 2,5Y cap row.

Caps:

0,5 2 | σ(U=2, K=1,5) | σ(U=2, K=2) | σ(U=2, K=3) | σ(U=2, K=4) | σ(U=2, K=5) | σ(U=2, K=6) | σ2 ATM at K2
0,5 3 | σ(U=3, K=1,5) | σ(U=3, K=2) | σ(U=3, K=3) | σ(U=3, K=4) | σ(U=3, K=5) | σ(U=3, K=6) | σ3 ATM at K3

(cubic spline cap interpolation generates the intermediate 2,5Y row)


0,5 2 | σ(U=2, K=1,5) | σ(U=2, K=2) | σ(U=2, K=3) | σ(U=2, K=4) | σ(U=2, K=5) | σ(U=2, K=6)
0,5 2,5 | σ(U=2,5, K=1,5) | σ(U=2,5, K=2) | σ(U=2,5, K=3) | σ(U=2,5, K=4) | σ(U=2,5, K=5) | σ(U=2,5, K=6)
0,5 3 | σ(U=3, K=1,5) | σ(U=3, K=2) | σ(U=3, K=3) | σ(U=3, K=4) | σ(U=3, K=5) | σ(U=3, K=6)

With it we can extract a first intermediate forward cap volatility (2 to 2,5Y), from which the 2,5Y maturity caplets can easily be extracted. The 3Y maturity caplets are

now constructed as

\[
\mathrm{CapForward}_{2 \to 3Y} = \mathrm{Caplet}_{2 \to 2{,}5Y} + \mathrm{Caplet}_{2{,}5 \to 3Y}
\]

We obtain the Caplets

2 2,5 | σ(U=2,5, K=1,5) | σ(U=2,5, K=2) | σ(U=2,5, K=3) | σ(U=2,5, K=4) | σ(U=2,5, K=5) | σ(U=2,5, K=6)
2 3 | σ(U=3, K=1,5) | σ(U=3, K=2) | σ(U=3, K=3) | σ(U=3, K=4) | σ(U=3, K=5) | σ(U=3, K=6)

The difficulty now arises with the extra ATM cap volatility to be included. Notice that it has a different strike, K3Y, from the previous row's ATM strike K2Y. We interpolate an intermediate strike for the artificially created 2,5Y caps:

\[
K_{2{,}5Y\,\mathrm{ATM}} = \frac{K_{2Y\,\mathrm{ATM}} + K_{3Y\,\mathrm{ATM}}}{2}
\]

The corresponding σU=2,5 K2,5Y is obtained by interpolating in the new cap row:

0,5 2,5 | σ(U=2,5, K=1,5) | σ(U=2,5, K=2) | σ(U=2,5, K=3) | σ(U=2,5, K=4) | σ(U=2,5, K=5) | σ(U=2,5, K=6)

With this artificial strike, we must also now interpolate in the 0,5 to 2Y cap row to

extract the corresponding cap quote.

Fixing End | 1,5 | 2 | 3 | K2 ATM | 4 | 5 | 6
0,5 2 | σ(U=2, K=1,5) | σ(U=2, K=2) | σ(U=2, K=3) | σ2 ATM at K2 | σ(U=2, K=4) | σ(U=2, K=5) | σ(U=2, K=6)

With the newly obtained σ(U=2, K=K2,5Y) and the previous σ(U=2,5, K=K2,5Y), the capforward at the K2,5Y strike can be created, and from the subtracted prices the caplet can be constructed.




Analogously, the 3Y maturity ATM cap requires that we interpolate in the 2,5Y

cap row the corresponding cap for the 3Y ATM strike.

0,5 2,5 | σ(U=2,5, K=1,5) | σ(U=2,5, K=2) | σ(U=2,5, K=3) | σ(U=2,5, K=4) | σ(U=2,5, K=5) | σ(U=2,5, K=6)

With it, we can then construct the capforward at this 3Y ATM strike and from it,

extract the caplets.


The two rows of caplets can now be sent to the SABR algorithm to construct their smoothed smiles.

2 2,5 | σ(U=2,5, K=1,5) | σ(U=2,5, K=2) | σ(U=2,5, K=3) | σ(U=2,5, K=4) | σ(U=2,5, K=5) | σ(U=2,5, K=6) | σ2,5 ATM at K2,5Y
2 3 | σ(U=3, K=1,5) | σ(U=3, K=2) | σ(U=3, K=3) | σ(U=3, K=4) | σ(U=3, K=5) | σ(U=3, K=6) | σ3 ATM at K3Y

[Figure: SABR smiles for the 2,5Y and 3Y caplet rows; Black vol vs strike (%)]

In fact, the above could have been performed not stepwise but as a bulk treatment. Notice that the SABR creation is independent of the rest of the process, as we never reuse the data it provides. Therefore, we could simply have constructed the entire caplet cubic spline matrix, incorporating a new final column with a maturity-dependent ATM strike. This would be an exact matrix on which a further SABR could be constructed if necessary.

CAPLETS                STRIKES                                                                                            ATM
Fixing  End   K=1,5            K=2            K=3            K=4            K=5            K=6            K ATM      Vol ATM
0,5     1     σ(U=1; K=1,5)    σ(U=1; K=2)    σ(U=1; K=3)    σ(U=1; K=4)    σ(U=1; K=5)    σ(U=1; K=6)    K1Y ATM    σ(U=1; K1Y ATM)
1       1,5   σ(U=1,5; K=1,5)  σ(U=1,5; K=2)  σ(U=1,5; K=3)  σ(U=1,5; K=4)  σ(U=1,5; K=5)  σ(U=1,5; K=6)  K1,5Y ATM  σ(U=1,5; K1,5Y ATM)
1,5     2     σ(U=2; K=1,5)    σ(U=2; K=2)    σ(U=2; K=3)    σ(U=2; K=4)    σ(U=2; K=5)    σ(U=2; K=6)    K2Y ATM    σ(U=2; K2Y ATM)
2       2,5   σ(U=2,5; K=1,5)  σ(U=2,5; K=2)  σ(U=2,5; K=3)  σ(U=2,5; K=4)  σ(U=2,5; K=5)  σ(U=2,5; K=6)  K2,5Y ATM  σ(U=2,5; K2,5Y ATM)
2,5     3     σ(U=3; K=1,5)    σ(U=3; K=2)    σ(U=3; K=3)    σ(U=3; K=4)    σ(U=3; K=5)    σ(U=3; K=6)    K3Y ATM    σ(U=3; K3Y ATM)
3       4     σ(U=4; K=1,5)    σ(U=4; K=2)    σ(U=4; K=3)    σ(U=4; K=4)    σ(U=4; K=5)    σ(U=4; K=6)    K4Y ATM    σ(U=4; K4Y ATM)


19. Summary and Conclusions

In this project we studied the Heath-Jarrow-Morton framework, seeking to optimise its speed of calibration and its robustness through the use of approximate formulas and other alternative methods. We also set out to examine the degree of control that Banco Santander's implemented model had over its solutions, and whether or not those solutions were unique.

The first main problem encountered was the framework's failure to calibrate when attempting to model rates whose time to maturity exceeded five years. A crucial objective was to identify and resolve the cause of these errors.

There also appeared to be specific cases in which the HJM program ceased to

calibrate. These cases were isolated so as to examine whether the problem was due to

an anomaly in the market data, or whether it was due to an internal error of the

program.

Lastly, the dependence between price and our model's sigma parameter proved to be the cause of a limiting value in the model's implied volatility surface. Whenever the true market-quoted prices lay below this boundary, we found it impossible for our model to replicate them. The two initially suggested solutions were:

· a modification of the volatility term of the Γ diffusion function for each bond price and, more specifically, a variation of its relationship with the α parameter controlling the log-normality of the distributions;

· a new choice of the underlying statistical distribution for the volatility-of-volatility parameter λ, in place of the currently used lognormal distribution.

It was the second alternative that was finally selected. After several rounds of analysis, we found nothing drastically wrong with the theoretical approach followed in the implementation of the HJM framework. As a


result of this, we concluded that the problems being raised were simply the consequence of an incomplete model. Evidently, a two-parameter framework such as the one proposed could never capture a volatility smile by taking only two strike positions; this proved sufficient only for creating skew characteristics. As a result, we immediately set out to construct a three-strike model.

The three-strike enhanced framework is entirely analogous to the former, but incorporates a stochastic volatility parameter that introduces a new source of randomness into the former dynamics. Indeed, the modification allows us to capture the implied smile, but at a price: calibrations become ever more tedious, and fail to price exotic products whose maturities extend beyond 16 years. Hence it turned out that the two-factor model was not itself incomplete, but was simply foreshadowing the same problems that would later be encountered in the three-strike model.

At this point, the entire model and calibration procedure was called into question in order to isolate the particular flaw from which the process was suffering. A first attempt was to modify the plain-vanilla extrapolation techniques. The chapter dealing with this aspect proves extremely interesting from an optimisation algorithm's point of view: it shows a complete deformation of the model volatility surface. That a simple extrapolation technique should behave so differently depending solely on the derivative products taken as inputs raised the hypothesis that perhaps the caplet and swaption input data were inherently inconsistent by construction.

Before deciding, as a possible final measure, to forbid their joint calibration, a careful analysis showed that even calibrations composed solely of caplets were having difficulties. The problem was successfully resolved through the selection of an adequate interpolation and extrapolation technique, as mentioned earlier. However, the combined calibration with swaptions still proved unsuccessful. We decided to tackle the problem from its core foundations: that is, to examine in depth the entire caplet process, from the very input of the data into our model straight through to the final results. We found that a possible cause of the problem could be located at the very beginning of the entire process.


Caplet stripping procedures from their corresponding cap quotes are anything but trivial. A very simplistic linear interpolation approach was being used at Banco Santander. However, this was proving largely insufficient for the subsequent calibration procedures. Indeed, it was introducing immense noise at all points at which data was being interpolated, both in the maturity term structure and in the strike implied volatility smiles.

We analysed two principal techniques for the extraction of caplet quotes. The first approach is based on the direct interpolation of cap quotes, constructing capforwards whose duration is exactly that of the desired 6- or 3-month caplets. Hence each caplet is exactly equivalent to the capforward itself, and so can be obtained directly. This first approach is exact and depends solely on the interpolation technique implemented. We found that among these, the most successful was a constrained cubic spline, which was capable of producing term structures just as smooth as those of quadratic optimisation algorithms.
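A minimal sketch of a constrained cubic spline in the spirit of [Kruger 2005]: slopes are taken as harmonic means of adjacent secant slopes and zeroed at local extrema, so each cubic piece cannot overshoot its data. The term-structure points below are hypothetical:

```python
import numpy as np

def constrained_slopes(x, y):
    """Constrained-spline slopes: harmonic mean of the adjacent secant
    slopes, set to zero at local extrema to prevent overshoot."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sec = np.diff(y) / np.diff(x)                # secant slopes
    m = np.zeros_like(x)
    for i in range(1, len(x) - 1):
        if sec[i - 1] * sec[i] > 0:              # same sign: harmonic mean
            m[i] = 2.0 / (1.0 / sec[i - 1] + 1.0 / sec[i])
        # opposite signs (local max/min): slope stays 0
    m[0] = 1.5 * sec[0] - 0.5 * m[1]             # one-sided end conditions
    m[-1] = 1.5 * sec[-1] - 0.5 * m[-2]
    return m

def eval_constrained_spline(x, y, xq):
    """Piecewise-cubic Hermite evaluation using the constrained slopes."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m = constrained_slopes(x, y)
    i = np.clip(np.searchsorted(x, xq) - 1, 0, len(x) - 2)
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]

# Illustrative caplet-volatility term structure (hypothetical points).
T = [0.5, 1.0, 1.5, 2.0, 3.0]
v = [0.22, 0.21, 0.195, 0.185, 0.18]
print(eval_constrained_spline(T, v, 2.5))
```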

Our second alternative was to obtain the individual caplets by creating a smooth evolution of caplet values with which to construct the corresponding capforwards. These capforwards no longer had 3- or 6-month durations, but were derived directly from the existing market cap quotes. Thus, the interpolation algorithm was no longer performed among caps but within the caplets themselves that made up each capforward.

This text presents a wide range of optimisation algorithms to tackle the fitting of caplets to their corresponding capforward. We provide both exact-fit analyses and best-fit minimisation approaches. Notably, quadratic optimisation techniques proved just as efficient as cubic spline methods, yielding almost identical results. We state here once again, linking with the discussion of 'best fit versus exact fit' presented in the text, that we have always tended to follow the exact-fit procedures.

The final cubic spline that was implemented did away with the sharp features in the term-structure dynamics. Further, it eliminated a particular 'bump' anomaly that was present in the short-maturity volatility smiles. However, a deeper analysis showed that the smile generated was not as smooth as desired for a successful exotic calibration procedure. As a result, the SABR interpolator was implemented.


In this project we have exhaustively analysed the alternatives that the SABR stochastic interpolator provides. Despite the fact that it is an inexact least-squares approach, the variations it produces from any particular caplet market quote are minimal. We examined the effects of varying each of the parameters within the model, finally arriving at an optimal combination for our caplet volatility surface. This involved a dynamic beta parameter that shifted from a skewed normal model at short maturities to a flat lognormal model at longer maturities. We also examined the effect of varying the weighting scheme attributed to the different regions of the volatility smile. Traditional approaches had commonly used a stronger weighting for the central 'at the money' values. We found, however, that this was not necessarily the optimal strategy, and that to capture the pronounced curvature present in short-maturity smiles it was often necessary to attribute an equally strong weighting to the most extreme strike values. Without this modification, the smile was often incapable of curving up sufficiently at its end points.
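The weighting experiment can be sketched with Hagan's lognormal SABR approximation and a weighted least-squares fit. The smile data below is synthetic (generated from assumed parameters, not market quotes), and the vector `w` is where equal versus ATM-heavy weighting would be varied:

```python
import numpy as np
from scipy.optimize import least_squares

def sabr_black_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal SABR implied-volatility approximation."""
    correction = (1 + ((1 - beta)**2 * alpha**2 / (24 * (F * K)**(1 - beta))
                       + rho * beta * nu * alpha / (4 * (F * K)**((1 - beta) / 2))
                       + (2 - 3 * rho**2) * nu**2 / 24) * T)
    if abs(np.log(F / K)) < 1e-10:            # ATM limit: z/x(z) -> 1
        return alpha / F**(1 - beta) * correction
    logFK = np.log(F / K)
    FKb = (F * K)**((1 - beta) / 2)
    z = (nu / alpha) * FKb * logFK
    xz = np.log((np.sqrt(1 - 2 * rho * z + z * z) + z - rho) / (1 - rho))
    denom = FKb * (1 + (1 - beta)**2 / 24 * logFK**2
                   + (1 - beta)**4 / 1920 * logFK**4)
    return alpha / denom * (z / xz) * correction

# Synthetic short-maturity caplet smile (hypothetical parameters).
F, T, beta = 0.045, 1.0, 0.5
strikes = np.array([0.015, 0.02, 0.03, 0.045, 0.06, 0.08])
market = np.array([sabr_black_vol(F, K, T, 0.06, beta, -0.25, 0.45)
                   for K in strikes])

# Equal weight on wings and ATM, as advocated in the text for strongly
# curved short-maturity smiles (an ATM-heavy scheme would down-weight
# the extreme strikes instead).
w = np.ones_like(strikes)

def resid(p):
    a, rho, nu = p
    model = np.array([sabr_black_vol(F, K, T, a, beta, rho, nu)
                      for K in strikes])
    return w * (model - market)

fit = least_squares(resid, x0=[0.05, 0.0, 0.3],
                    bounds=([1e-6, -0.999, 1e-6], [1.0, 0.999, 5.0]))
print(fit.x)
```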

During the development of the project, this precise calculation became critical for the bank's interest-rate analysts. The 'bump' in the most relevant region of the smile was a problem that required an immediate solution. The program developed in this project successfully dealt with it, and has been passed on directly to traders as a direct solution. It will be immediately ported from its Visual Basic environment to C++ so as to implement it at Banco Santander Madrid, as well as at its headquarters in New York and Latin America.

Another major objective of this project was to arrive at an analytical approximate formula that would make use of, and correctly forecast, the three volatility parameters within the global volatility expression Γ: α, σ, λ. The task was entirely experimental and mathematical, and encompassed a great part of our work. An initial proposition was found under the assumption that the dynamics of the forward swap rate followed

$$dS(t) = \sigma(t)\left[\alpha(t)\,S(t) + \bigl(1-\alpha(t)\bigr)\,S(0)\right]dW_t^{P} \qquad (19.1)$$

However, the above was not in the least sufficient to create a consistent approximation. Moreover, specific analytic expressions had to be found for the two principal volatility parameters, alpha and sigma. As is clear from


the above formulation, our initial approach only attempted to reproduce the two-strike, skewed HJM dynamics, leaving the development of a three-strike model for a later stage.
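A minimal Euler Monte Carlo sketch of the dynamics (19.1), with σ and α held constant for illustration (the model's actual parameters are time-dependent, and all numbers below are assumptions):

```python
import numpy as np

# Euler simulation of dS = sigma * (alpha*S + (1-alpha)*S0) dW  (eq. 19.1):
# alpha = 1 gives lognormal dynamics, alpha = 0 normal dynamics, and
# intermediate values produce the skew discussed in the text.
rng = np.random.default_rng(0)
S0, sigma, alpha, T = 0.04, 0.25, 0.5, 2.0
n_steps, n_paths = 100, 200_000
dt = T / n_steps

S = np.full(n_paths, S0)
for _ in range(n_steps):
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)
    S += sigma * (alpha * S + (1 - alpha) * S0) * dW

print(S.mean())  # driftless dynamics: the mean stays near S0
```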

As is demonstrated in the text, we were faced with a very wide range of alternatives for the parameters. In general, all formulations converged to a unique expression for sigma. However, we were not as fortunate with the alpha parameter. For this, we ended up selecting a weighted version which not only proved to be the simplest alternative mathematically, but also turned out to be the most effective in successfully performing calibrations.

The next development consisted in extending the analytic approximation to a two-factor scenario. That is, we incorporated a multifactor approach into the previous model through the use of two Brownian variables correlated in terms of a common theta parameter through sines and cosines. This extension appeared relatively simple at first, but the introduction of the new variables gave rise to an over-determined set of equations. Strangely, we once again had a common sigma parameter that was attained through essentially any procedure we decided to undertake. Its implementation in the one-strike two-factor models proved extremely potent and accurate. The alpha, on the other hand, was an entirely different matter. Every possible approach to the extraction of an analytic formula for alpha from the original equations yielded an entirely different expression. Moreover, most of these either left out the vast majority of the equations, and so were consistent with only one or two of them, or else were directly incapable of producing adequate calibration results.

We finally narrowed our analysis down to two main expressions: one consistent with all the analytic expressions, and the other simply a weighted mean of the various alpha expressions, the consequence of an intuitive insight with no mathematical backing. Further, we had the lingering alternative of simply using the one-factor expression for alpha, inputting it directly into the two-factor model as if it were completely independent of the two different Brownian terms.

Of course, it was the mathematically consistent formulation that finally yielded the best results, and that proved capable of solving the longest and most troublesome two-factor calibrations.


The analytic approximation formula, once found, was immediately implemented in the trader systems. It enables these systems to bypass the time-consuming Monte Carlo or tree simulations by instead providing a first guess of the final HJM solution parameters. This first guess is so close to the final exact solution that the subsequent Monte Carlo algorithm need only perform two or three further iterations before arriving at it. What is more, the analytic approach also generates an approximate Jacobian expression that we can substitute for the Monte Carlo's. This proves a great advantage in computation time, as the Monte Carlo Jacobian is calculated through finite-difference bumping techniques that are extremely slow. With the analytic Jacobian we need not recompute these Jacobians, as it directly provides the necessary approximate slopes.
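The saving can be illustrated with a chord-type Newton iteration: the Jacobian is computed once (here by finite differences, standing in for the closed-form expression) and then frozen, so no further bumping is needed. The two-parameter `model_prices` function is a hypothetical stand-in, not the HJM pricer:

```python
import numpy as np

def model_prices(p):
    """Stand-in for the Monte Carlo pricer: maps (sigma, alpha) to two
    hypothetical calibration targets (illustrative only)."""
    s, a = p
    return np.array([s * (1 + 0.5 * a), s * np.exp(0.2 * a)])

target = model_prices(np.array([0.20, 0.60]))  # quotes we want to match

def fd_jacobian(f, p, h=1e-6):
    """Central finite-difference Jacobian (the expensive 'bumping' step)."""
    J = np.empty((2, 2))
    for j in range(2):
        dp = np.zeros(2); dp[j] = h
        J[:, j] = (f(p + dp) - f(p - dp)) / (2 * h)
    return J

p = np.array([0.15, 0.40])        # first guess from the analytic approximation
J = fd_jacobian(model_prices, p)  # computed ONCE, then frozen
for it in range(50):              # chord iterations: no Jacobian re-bumping
    r = model_prices(p) - target
    if np.max(np.abs(r)) < 1e-12:
        break
    p -= np.linalg.solve(J, r)
print(it, p)
```

With a frozen Jacobian convergence is only linear rather than quadratic, but each iteration avoids the costly re-bumping, which is where the overall time saving comes from.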

As a result, the implementation of the analytic approximation at Banco Santander has successfully reduced by a factor of ten the time traders need to spend on each exotic product's calibration. This optimisation of the calculation process results in a much friendlier tool for traders, who ideally need immediate prices for the exotic products they operate with. Reducing the calibration's duration thus permits them to be more efficient, allowing them to analyse many more exotic products in the same time.


20. References

[Björk 2004] Björk, T., “Arbitrage Theory in Continuous Time”, Oxford Finance, Oxford University Press, 2004.

[Black 1976] Black, F., "The Pricing of Commodity Contracts", Journal of Financial Economics, 1976.

[Bachert 2006] Bachert, P., Gatarek, D., Maksymiuk, R., "The Libor Market Model in Practice", Wiley Finance, London, 2006.

[Brigo 2001] Brigo, D., Mercurio, F., "Interest Rate Models: Theory and Practice", Springer Finance, Berlin, 2001.

[Cole 1968] Cole, J. D., "Perturbation Methods in Applied Mathematics", Ginn – Blaisdell, 1968

[Cole 1985] Cole, J. D., Kevorkian, J., "Perturbation Methods in Applied Mathematics", Springer - Verlag, 1985

[Chow 1978] Chow, Y. S., Teicher, H. "Probability Theory. Independence, Interchangeability, Martingales"., Springer-Verlag, New York, 1978.

[Dupire 1994] Dupire, B., "Pricing with a smile", Risk, 1994

[Dupire 1997] Dupire, B., "Pricing and Hedging with Smiles", in Mathematics of Derivative Securities, Cambridge University Press, Cambridge, 1997.

[Derman 1994] Derman, E., Kani, I., "Riding on a Smile", Risk, 1994

[Hull 1989] Hull, J. C., "Options, Futures, & Other Derivatives", Prentice-Hall International, New Jersey, 1989.

[Heston 1993] Heston, S. L., "A closed-form solution for options with stochastic volatility with applications to bond and currency options", The Review of Financial Studies, 1993

[Hull 1987] Hull, J. C., White, A., "The pricing of options on assets with stochastic volatilities", J. of Finance, 1987.


[Ikeda 1981] Ikeda, N., Watanabe, S. "Stochastic differential equations and diffusion processes", North-Holland, Amsterdam, 1981.

[Karatzas 1988] Karatzas, I., Shreve, S., "Brownian motion and stochastic calculus", Graduate Texts in Maths., 113, Springer-Verlag, Berlin, 1988.

[Lewis 2000] Lewis, A., "Option Valuation Under Stochastic Volatility", Financial Press, 2000

[Musiela 1997] Musiela, M., Rutkowski, M., "Martingale Methods in Financial Modelling", Springer, 1997.

[Neftci 1996] Neftci, S. N., "Introduction to the Mathematics of Financial Derivatives", Academic Press, United States of America, 1996.

[Nievergelt 1993] Nievergelt, Y., "Splines in Single and Multivariable Calculus", Lexington, MA, 1993.

[Øksendal 1998] Øksendal, B., "Stochastic Differential Equations", Springer, 1998.

[Wilmott 2000] Wilmott, P., "Paul Wilmott on Quantitative Finance", John Wiley and Sons, 2000

Papers

[Karoui 2003] El Karoui, N., "Couverture des risques dans les marchés financiers", École Polytechnique, 2003.

[Kruger 2005] Kruger, C. J. C., "Constrained Cubic Spline Interpolation for Chemical Engineering Applications", 2005

[Hagan 2004] Hagan, P., Konikov, M., "Interest Rate Volatility Cube: Construction and Use", 2004

[Hagan 2002] Hagan, P., Kumar, D., Lesniewski, A., Woodward, D., "Managing Smile Risk", 2002.

[Hagan 1998] Hagan, P., Woodward, D., "Equivalent Black Volatilities", 1998.

[Martinez 2005] Martinez, M. T., "Interest Rate Bible: Notes of some discussions with Monsieur F. Friggit", 2005.

[Mamon 2004] Mamon, R. S., "Three Ways to Solve for Bond Prices in the Vasicek Model", 2004


[McKinley 2000] McKinley, S., Levine, M., "Cubic Spline Interpolation", 2000.

[Sen 2001] Sen, S., "Interest Rate Options", 2001