
Experts in numerical algorithms and HPC services

What's New in Mathematical Optimisation from NAG

Jan Fiala, Benjamin Marteau


Nonlinear programming: active set versus interior point methods

Overview

Sequential quadratic programming

Interior point methods

Illustration on a few examples

Mixed integer nonlinear optimisation

Semidefinite programming

Sample applications in finance

Coming next

Large-scale linear programming

Derivative free solver for calibration

Working with customers


Nonlinear optimisation

Problems of the form:

    min_{x ∈ R^n}  f(x)
    subject to  h_k(x) = 0,  k = 1, ..., m_e
                g_k(x) ≤ 0,  k = 1, ..., m_i

Two different approaches:

Sequential quadratic programming: an active set method, based on Gill et al., Stanford University

Interior point method, based on Wächter and Biegler, Carnegie Mellon University


Formalisation of the problem

Karush-Kuhn-Tucker (KKT) optimality conditions:

Stationarity condition
    ∇f(x) + Σ_{k=1}^{m_e} λ_k ∇h_k(x) + Σ_{k=1}^{m_i} μ_k ∇g_k(x) = 0

Primal feasibility condition
    h(x) = 0
    g(x) ≤ 0

Dual feasibility condition
    μ_k ≥ 0,  k = 1, ..., m_i

Complementarity condition
    μ_k g_k(x) = 0,  k = 1, ..., m_i
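The KKT system above is easy to check numerically. The following sketch (plain NumPy, purely illustrative, not NAG code) verifies stationarity and primal feasibility for the toy problem min x1^2 + x2^2 subject to x1 + x2 = 1, whose solution x* = (0.5, 0.5) has multiplier λ = -1:

```python
import numpy as np

# Problem: min x1^2 + x2^2  subject to  h(x) = x1 + x2 - 1 = 0.
# At the solution x* = (0.5, 0.5):  grad f(x*) = (1, 1), grad h(x*) = (1, 1),
# so the stationarity condition holds with lambda = -1.

def grad_f(x):
    return 2.0 * x

def grad_h(x):
    return np.ones_like(x)

x_star = np.array([0.5, 0.5])
lam = -1.0

stationarity = grad_f(x_star) + lam * grad_h(x_star)  # should be the zero vector
primal = x_star.sum() - 1.0                           # h(x*) should be zero

print(stationarity, primal)  # [0. 0.] 0.0
```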


Two approaches to tackle these equations

The complementarity condition is problematic due to its combinatorial nature.

Two distinct strategies:

An SQP solver guesses which constraints are binding

An IPM perturbs the equation


Sequential quadratic programming

Definition: an inequality constraint k is said to be active at x if it is binding (g_k(x) = 0).

SQP methods iteratively build the set of active constraints by solving quadratic programs:

Initialisation: choose a first estimate of the solution x_0, build a quadratic model of the objective around x_0, and take a first guess at the set of active constraints

Iteration k:

Solve the quadratic program, warm started by the active set estimate

Update x_{k+1} and the set of active constraints

Build a new quadratic model around x_{k+1}
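The iteration above can be seen in action with any SQP implementation. The sketch below uses SciPy's SLSQP (a classic SQP code, unrelated to NAG's e04vh) on a toy problem where the single inequality constraint is active at the solution:

```python
import numpy as np
from scipy.optimize import minimize

# Toy NLP:  min (x1 - 1)^2 + (x2 - 2)^2  subject to  x1 + x2 <= 2.
# The unconstrained optimum (1, 2) violates the constraint, so the
# constraint is active at the solution and x* = (0.5, 1.5).

objective = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

# SciPy's 'ineq' convention is fun(x) >= 0, i.e. 2 - x1 - x2 >= 0.
cons = [{"type": "ineq", "fun": lambda x: 2.0 - x[0] - x[1]}]

res = minimize(objective, x0=np.zeros(2), method="SLSQP", constraints=cons)
print(res.x)  # close to [0.5, 1.5]
```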


A few characteristics of SQP methods

Perform lots of inexpensive iterations

Work on the null space of the constraints

The more active constraints there are, the cheaper the iterations are

As a consequence, SQP methods scale very well to large NLP

problems with a high number of constraints.



Interior point methods

If one tries to solve the KKT system directly, the complementarity condition turns out to be problematic. Therefore, an IPM iteration can be:

Relax the complementarity condition (μ_k g_k(x) = −ν with ν > 0)

Perform one Newton iteration towards the solution of the relaxed KKT system

Update the current solution estimate and the relaxation parameter ν

Interior point methods aim at finding a sequence of points converging to the solution that satisfy the constraints strictly.
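A minimal log-barrier sketch illustrates the relaxation idea in one dimension; the barrier subproblem is the standard way to realise the perturbed complementarity condition (this is an illustration of the principle only, with all names invented here; production IPMs such as e04st use a primal-dual formulation and much more machinery):

```python
import numpy as np

# Barrier sketch for  min (x - 2)^2  subject to  g(x) = x - 1 <= 0.
# Replace the constraint by a log barrier: phi(x) = (x - 2)^2 - nu*log(1 - x),
# minimise phi by damped Newton for a decreasing sequence of nu.
# As nu -> 0 the barrier minimisers approach the solution x* = 1.

def newton_on_barrier(x, nu, iters=50):
    for _ in range(iters):
        grad = 2.0 * (x - 2.0) + nu / (1.0 - x)
        hess = 2.0 + nu / (1.0 - x) ** 2
        step = grad / hess
        # damp the step so the iterate stays strictly feasible (x < 1)
        while x - step >= 1.0:
            step *= 0.5
        x -= step
    return x

x = 0.0                                  # strictly feasible start
for nu in [1.0, 0.1, 0.01, 1e-4, 1e-6]:  # warm-start each barrier subproblem
    x = newton_on_barrier(x, nu)
print(x)  # close to 1.0, approached from the interior
```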


A few characteristics of Interior Point methods

Perform a few expensive iterations

In the absence of constraints, behave as a Newton method

As a consequence, Interior Point methods scale very well to large

NLP problems with a small number of constraints.



Illustration on a few highly constrained problems

Problems were selected from the CUTEr test set.

Name      | vars  | constrs | e04vh (SQP) time (s) | e04st (IPM) time (s)
MINC44    | 1113  | 1033    | 0.28                 | 7.60
READING8  | 2002  | 1000    | 9.78                 | 251.12
NCVXQP6   | 10000 | 7500    | 3.60                 | 613.38
MADSSCHJ  | 201   | 398     | 0.34                 | 5.51


Illustration on a few weakly constrained problems

Problems selected from the CUTEr test set.

Name     | vars  | constrs | e04vh (SQP) time (s) | e04st (IPM) time (s)
JIMACK   | 3549  | 0       | 542.42               | 8.12
OSORIO   | 10201 | 202     | 303.00               | 0.78
TABLE8   | 1271  | 72      | 3.80                 | 0.04
OBSTCLBL | 10000 | 1       | 40.84                | 0.50

The number of constraints is not the only factor...



Other characteristics

IPM (e04st) advantages:

Efficient on unconstrained or loosely constrained problems

Can exploit 2nd derivatives

Efficient also for quadratic problems

Better use of multi-core architectures

New and simpler interface

Infeasibility detection

SQP (e04vh) advantages:

Efficient on highly constrained problems

Can capitalize on a good initial point

Stays feasible with respect to the linear constraints throughout the optimization

Usually better results on pathological problems

Usually requires fewer function evaluations

Allows warm starting


Mixed integer nonlinear optimisation

Problems of the form:

    min_{x ∈ R^n, y ∈ Z^m}  f(x, y)
    subject to  l ≤ c(x, y) ≤ u

x: continuous variables
y: integer variables

SQP with branch-and-cut techniques

Ordinal variables

Does not require model evaluations at fractional values of the integer variables
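The branching idea can be sketched in a few lines. The code below runs plain branch-and-bound (the cutting planes that make it branch-and-cut are omitted) on a tiny integer NLP, with each continuous relaxation solved by SciPy's L-BFGS-B; it illustrates the concept only and has nothing to do with the NAG implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Toy integer NLP:
#   min (y1 + y2 - 3.4)^2 + (y1 - 1.3)^2,   y integer,  0 <= y <= 5.
# Each node solves the continuous relaxation over its box; a fractional
# variable is branched on with floor/ceil bounds.

f = lambda y: (y[0] + y[1] - 3.4) ** 2 + (y[0] - 1.3) ** 2

def branch_and_bound(bounds, best=(np.inf, None)):
    x0 = [(lo + hi) / 2.0 for lo, hi in bounds]
    res = minimize(f, x0, bounds=bounds, method="L-BFGS-B")
    if not res.success or res.fun >= best[0]:
        return best                              # prune: dominated node
    frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
    if not frac:                                 # relaxed solution is integral
        return (res.fun, np.round(res.x))
    i = frac[0]                                  # branch on a fractional var
    lo, hi = bounds[i]
    for lo_c, hi_c in ((lo, np.floor(res.x[i])), (np.ceil(res.x[i]), hi)):
        if lo_c <= hi_c:
            child = list(bounds)
            child[i] = (lo_c, hi_c)
            best = branch_and_bound(child, best)
    return best

val, y_opt = branch_and_bound([(0.0, 5.0), (0.0, 5.0)])
print(val, y_opt)  # optimum y = (1, 2) with value 0.25
```

Note that simple rounding of the continuous optimum (1.3, 2.1) gives (1, 2) here, but branch-and-bound provides the guarantee: dominated boxes are pruned against the incumbent rather than guessed away.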


Some characteristics

It might be necessary to use integer variables in an optimization model, for example:

Cardinality constraints

Decision logic between variables (e.g. constraints only present if a certain variable is nonzero)

Variables can only take values inside a predefined set

...

Included in NAG, Mark 25 as h02da. Based on Schittkowski et al., University of Bayreuth.


Semidefinite Programming (SDP)

Linear Programming (LP):

well-known, well-researched

convex (local → global)

strong theoretical properties

but only linear

Extensions:

NLP: but some nice properties are lost (e.g., convexity, duality theory)

SDP: retain the theory, change the geometry

add a matrix inequality: a symmetric matrix is positive semidefinite (all eigenvalues are nonnegative)

highly nonlinear

notation: A(x) ⪰ 0



Semidefinite Programming (SDP) formulation

LP → SDP → BMI-SDP

    min_{x ∈ R^n}  c^T x
    subject to  l_B ≤ Bx ≤ u_B,  l_x ≤ x ≤ u_x,
                A(x) = A_0 + Σ_{i=1}^n x_i A_i ⪰ 0

A_i are given symmetric matrices

A(x) is linear in x; LMI = linear matrix inequality

with a special choice of the A_i, A(x) can be a matrix variable X

LP → SDP → BMI-SDP

    min_{x ∈ R^n}  c^T x + (1/2) x^T H x
    subject to  l_B ≤ Bx ≤ u_B,  l_x ≤ x ≤ u_x,
                A(x) = A_0 + Σ_{i=1}^n x_i A_i + Σ_{i,j=1}^n x_i x_j Q_ij ⪰ 0

further (quadratic) extension

BMI = bilinear matrix inequalities

unique to NAG, included in Mark 26 as e04sv

in collaboration with Kočvara et al., University of Birmingham

Semidefinite Programming (SDP) Applications?

SDP = a special tool

It's there when you need it!

very powerful concept

matrix constraints might not appear naturally ⇒ reformulations, relaxations

structural optimization, chemical engineering, combinatorial optimization, statistics, control and system theory, polynomial optimization, ...

spark interest

Warning: I am not a quant!



SDP Applications in Finance

positive semidefinite requirement appears directly:

construction of a correlation/covariance matrix

nearest correlation matrix (with constraints)

robust (worst-case) portfolio optimization

calibration of volatility structure for Libor market swaptions

eigenvalue optimization (min/max eigenvalue/singular value, matrix condition number, nuclear norm as a heuristic for rank minimization, ...)

risk management: limit the Γ of your portfolio

relaxations:

many relaxations of (NP-hard) combinatorial problems

Asian option pricing bounds (?)

reformulations:

polynomial nonnegativity ↔ matrix inequality (e.g., interpolation by nonnegative splines)

Lyapunov stability of ODEs ... in finance?


Nearest Correlation Matrix (with Constraints)

    min_X  Σ_{i,j=1}^n (X_ij − H_ij)^2
    subject to  X_ii = 1,  i = 1, ..., n
                X ⪰ 0

correlation matrix = symmetric positive semidefinite matrix with unit diagonal

H approximate correlation matrix

X new (true) correlation matrix closest to H in the Frobenius norm

do not use SDP on the vanilla NCM problem due to algorithmic complexity; the special solvers in chapter G02 are preferable


Possible new constraints:

fix elements: X_ij = H_ij for some i, j

element-wise bounds: l_ij ≤ X_ij ≤ u_ij

smallest eigenvalue constraint: X ⪰ λ_min I, where λ_min is given

limit the condition number: λ_max I ⪰ X ⪰ λ_min I, λ_max ≤ κ λ_min, where κ is given and λ_min, λ_max are new variables

Possible different objective:

weight elements: Σ W_ij (X_ij − H_ij)^2

consider portfolio VaR_α: −λ Z_α^2 w^T D X D w + Σ (X_ij − H_ij)^2, where D holds the deviations (d_ii = σ_i), w is the asset allocation and λ is a weighting factor

Full control over the formulation!
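For the vanilla problem (no extra constraints), the slides recommend the dedicated G02 solvers over a general SDP solver. As a lightweight illustration of what such a solver computes, the sketch below implements Higham's alternating-projections method in plain NumPy (an assumption-laden teaching sketch, not the NAG implementation):

```python
import numpy as np

# Higham's alternating-projections method for the nearest correlation
# matrix: alternately project onto the PSD cone and the unit-diagonal set,
# with a Dykstra-style correction so the iteration converges to the
# Frobenius-norm-nearest correlation matrix.

def nearest_correlation(H, max_iter=500, tol=1e-10):
    Y = H.copy()
    dS = np.zeros_like(H)
    for _ in range(max_iter):
        R = Y - dS                           # apply Dykstra correction
        w, V = np.linalg.eigh(R)
        P = (V * np.maximum(w, 0.0)) @ V.T   # project onto the PSD cone
        dS = P - R
        Y_prev, Y = Y, P.copy()
        np.fill_diagonal(Y, 1.0)             # project onto unit diagonal
        if np.linalg.norm(Y - Y_prev, "fro") < tol:
            break
    return Y

# Classic indefinite example (eigenvalues 1 + sqrt(2), 1, 1 - sqrt(2)):
H = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
X = nearest_correlation(H)
print(X)
```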

Robust Portfolio Optimization

mean-variance analysis is often very sensitive to the data

are the nominal µ (expected returns) and Σ (covariance) correct?

robust EF = limit the sensitivity of the results by incorporating an uncertainty model on the parameters

choose the solution in the worst-case scenario (see Boyd '07):

    min  (µ − r1 + λ)^T Σ^{-1} (µ − r1 + λ)
    subject to  Fµ ≥ 0
                |µ_i − µ̄_i| ≤ α_1 |µ̄_i|,  i = 1, ..., n
                |1^T µ − 1^T µ̄| ≤ α_2 |1^T µ̄|
                |Σ_ij − Σ̄_ij| ≤ β_1 |Σ̄_ij|,  i, j = 1, ..., n
                ‖Σ − Σ̄‖_F ≤ β_2 ‖Σ̄‖_F
                Σ ⪰ 0
                λ ≥ 0


Calibration of volatility structure

How to extract correlation information from market option prices?

assume a LIBOR market model with covariance structure X and swap weights Ω = ww^T

under some assumptions, swaption prices are given by the Black-Scholes formula with volatility parameter σ = Tr(ΩX)

Task: calibrate X to observed swaption market prices:

    find X
    subject to  Tr(ΩX) = σ
                X ⪰ 0

where the σ are the observed swaption implied vols (one equation per quoted swaption)

Calibration of volatility structure cont.

The correlation X in the previous feasibility problem is not unique, so one can choose an objective:

min or max the price of some other option: min/max Tr(ΩX)

norm of X: min ‖X‖

smoothness: min ‖∆X‖

robustness via the Bid/Ask spread: max t s.t. σ_Bid + t ≤ Tr(ΩX) ≤ σ_Ask − t

rank of X as a heuristic via the nuclear norm of X

Risk management: How to construct a positive Γ portfolio?

assume an existing portfolio Π of derivatives/exotics on underlyings S_i:

    Π = F(S_1, ..., S_n)

Π must be risk managed; usual Delta hedging: ∂Π/∂S = 0

but Delta hedging only works for very small movements in the underlyings; for larger ones we would like to keep a positive (or small) Γ, as

    dΠ = (∂Π/∂S)^T δS + (1/2) δS^T (∂²Π/∂S²) δS + ···

to construct a positive Γ: buy x_i units of a vanilla option p_i on S_i and y_i of the underlying S_i:

    min_{x,y}  Σ x_i p_i(S_i) + y_i S_i
    subject to  ∂²F/∂S² + diag(x_i ∂²p_i/∂S_i²) ⪰ 0
                ∂F/∂S_i + x_i ∂p_i/∂S_i + y_i = 0,  i = 1, ..., n



Coming next: new LP solver

NAG = the Amazon of optimization

(be a one-stop shop for all you need in optimization)

Constant evolution of the library:

based on our roadmap

customers' requests

latest research & collaborations

... ongoing hard work

New LP solver:

new solver for large-scale LP problems

based on an interior point method (IPM)

filling the gap

significant speed-up
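The bounded LP form used earlier on the slides can be tried with any IPM-based LP code. The sketch below uses SciPy's linprog with the HiGHS interior-point backend (no relation to the forthcoming NAG solver) on a tiny instance:

```python
import numpy as np
from scipy.optimize import linprog

# Small LP in the form  min c^T x  s.t.  l_B <= Bx <= u_B,  l_x <= x <= u_x.
# linprog takes one-sided rows, so the two-sided row constraint is split
# into  Bx <= u_B  and  -Bx <= -l_B.

c = np.array([-1.0, -2.0])
B = np.array([[1.0, 1.0]])
l_B, u_B = np.array([0.0]), np.array([3.0])

A_ub = np.vstack([B, -B])
b_ub = np.concatenate([u_B, -l_B])

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0.0, 2.0), (0.0, 2.0)],
              method="highs-ipm")
print(res.x, res.fun)  # x = [1, 2], objective -5
```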


Coming next: DFO for calibration

Standard data-fitting (calibration) problem:

given observed data [t_i, y_i] and a model f(·; x) depending on model parameters x

Task: find x to fit the data as closely as possible, typically in the least-squares sense: min_x Σ (y_i − f(t_i; x))²

Additional requirements:

small number of parameters (< 100)

black-box model, no derivatives available

possibly expensive and/or inaccurate function evaluations

typically a reasonable starting point; a small improvement is sufficient

⇒ finite differences shouldn't be used!

New derivative-free optimization (DFO) solver exploiting the problem structure (the only one of its kind!)
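As a baseline for the calibration setup above, the sketch below fits a two-parameter black-box model with Nelder-Mead, a generic derivative-free method. It deliberately ignores the least-squares structure, which is exactly the inefficiency the structure-exploiting solver described here addresses; model and data are synthetic:

```python
import numpy as np
from scipy.optimize import minimize

# Calibration setup: observed data (t_i, y_i), black-box model f(t; x),
# fit the parameters x in the least-squares sense without derivatives.

t = np.linspace(0.0, 1.0, 20)
x_true = np.array([2.0, -1.5])
model = lambda t, x: x[0] * np.exp(x[1] * t)   # treated as a black box
y = model(t, x_true)                           # synthetic observations

residual_sq = lambda x: np.sum((y - model(t, x)) ** 2)

x0 = np.array([1.5, -1.0])                     # reasonable starting point
res = minimize(residual_sq, x0, method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-12})
print(res.x)  # close to [2.0, -1.5]
```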



Working with customers

Sometimes an out-of-the-box solution is not sufficient!

Is it possible to speed up the solver?

Does the model fit the solver?

Can a special problem structure be exploited?

The NAG Mathematical Optimization Consultancy is ready to help!

choice and tuning of the solver

adjustments to the model

bespoke solver development


Examples of optimisation projects

Energy & Commodities Trading Co.: The client's model was demonstrating unusual behaviour: a significant memory footprint and slow convergence. Analysis of the model showed that a more suitable equivalent reformulation was available. When the model was adjusted, the solver performed as expected.

Financial Services Software Vendor: An extended site visit to the client allowed us to discuss the client's problem in detail and helped to identify, and fix, a weak point which was causing convergence issues.

Financial Brokerage Co.: The client wanted a class of problems to be solved within a prescribed time limit. After an initial assessment of the problem, a possible solution was identified using recent research from Stanford University. A bespoke solution was delivered during a short consulting engagement. The new solver drastically improved the performance, so that even bigger problems could be considered by the client.
