Lec 21: Marquardt Method


  • Slide 1/29

  • Slide 2/29

    Optimization Methods

    One-Dimensional Unconstrained Optimization
    - Golden-Section Search
    - Quadratic Interpolation
    - Newton's Method

    Multi-Dimensional Unconstrained Optimization
    - Non-gradient or direct methods
    - Gradient methods

  • Slide 3/29

    Summary of Newton's Method

    One-dimensional optimization:
    - At the optimum: f'(x) = 0
    - Newton's method: x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)}

    Multi-dimensional optimization:
    - At the optimum: \nabla f(x) = 0
    - Newton's method: x_{i+1} = x_i - H_i^{-1} \nabla f(x_i)

    H_i is the Hessian matrix (the matrix of 2nd partial derivatives) of f
    evaluated at x_i, so H_i^{-1} \nabla f(x_i) is the multi-dimensional
    analogue of f''(x_i)^{-1} f'(x_i).
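    As a concrete illustration (my own sketch, not from the slides), here is the
    multi-dimensional update in MATLAB on an assumed test function
    f(x) = x(1)^2 + 2*x(2)^2:

    % Minimal sketch of the Newton update (assumed test function).
    gradf = @(x) [2*x(1); 4*x(2)];   % gradient of the assumed f
    H     = [2 0; 0 4];              % its (constant) Hessian
    x = [3; -2];                     % arbitrary starting point
    x = x - H \ gradf(x)             % x_{i+1} = x_i - H^{-1} * grad f(x_i)
    % For a quadratic f the Hessian is constant and one step lands exactly on
    % the optimum (0,0); in general the step is repeated until convergence.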

  • Slide 4/29

    Newton's Method

    • This method converges in a quadratic fashion.

    • It may diverge if the starting point is not close enough to the optimum point.

    • It is very costly to evaluate H^{-1}.

    x_{i+1} = x_i - H_i^{-1} \nabla f(x_i)

  • Slide 5/29

    Marquardt Method

    Idea

    • When the guessed point is far away from the optimum point, use the
    Steepest Ascent method (also called Cauchy's method).

    • As the guessed point gets closer and closer to the optimum point,
    gradually switch to Newton's method.

    • In any given problem it is not known whether the chosen initial point is
    far from the minimum or close to the minimum.

    • So, we need a method that takes advantage of both.

  • Slide 6/29

    Marquardt Method

    The Marquardt method achieves this objective by modifying the Hessian
    matrix H in Newton's method in the following way:

    x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i), \quad \text{where } \tilde{H}_i = H_i + \alpha_i I

    • Initially, set α_0 to a huge number.

    • Decrease the value of α_i in each iteration.

    • When x_i is close to the optimum point, make α_i zero (or close to zero).
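    A minimal MATLAB sketch of this modified iteration (my illustration on an
    assumed quadratic test function; the halving rule for α is one simple
    choice, since the slides only say "decrease"):

    % Sketch of the Marquardt-modified Newton iteration (assumed test data).
    gradf = @(x) [2*x(1); 4*x(2)];     % gradient of an assumed quadratic f
    H     = [2 0; 0 4];                % its Hessian
    x     = [3; -2];
    alpha = 100;                       % huge initial alpha: acts like steepest descent
    for i = 1:20
        Htilde = H + alpha*eye(2);     % H~ = H + alpha*I
        x = x - Htilde \ gradf(x);     % x_{i+1} = x_i - H~^{-1} * grad f(x_i)
        alpha = alpha/2;               % shrink alpha: gradually becomes Newton's method
    end
    disp(x)                            % approaches the minimum (0,0)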

  • Slide 7/29

    Marquardt Method

    When α_i is large:

    \tilde{H}_i = H_i + \alpha_i I \approx \alpha_i I
    \;\Rightarrow\; x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i) \approx x_i - \frac{1}{\alpha_i} \nabla f(x_i)

    This is the Steepest Ascent method (i.e., move in the direction of the gradient).

    When α_i is close to zero:

    \tilde{H}_i = H_i + \alpha_i I \approx H_i
    \;\Rightarrow\; x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i) \approx x_i - H_i^{-1} \nabla f(x_i)

    This is Newton's method.

  • Slide 8/29

  • Slide 9/29

    EXERCISE 3.4.4: Marquardt Method

    Consider the Himmelblau function. Minimize it using Marquardt's method:

    f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

    Step 1: To ensure proper convergence, a large value of M (= 100) is usually
    chosen. Also, take the initial point x(0) = (0, 0)^T and the termination
    parameter ε1 = 10^{-3}. We also set the iteration counter k = 0 and the
    parameter λ(0) = 100.

    Step 2: The derivative at this point is calculated as (-14, -22)^T.

    Step 3: Since the derivative is not small, we go to Step 4.

  • Slide 10/29

    EXERCISE 3.4.4: Marquardt Method

    3D graph of the Himmelblau function (figure).
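    The surface shown on this slide can be reproduced with a few lines of
    MATLAB, analogous to the contour program on the next slide (my
    reconstruction; the slide itself contains only the figure):

    % MATLAB sketch to draw the 3D surface of the Himmelblau function
    [X,Y] = meshgrid(0:.1:5);
    Z = (X.*X + Y - 11).^2 + (X + Y.*Y - 7).^2;
    surf(X, Y, Z); shading interp; colormap(jet);
    xlabel('x'); ylabel('y'); zlabel('f(x,y)');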

  • Slide 11/29

    EXERCISE 3.4.4: Marquardt Method

    Contour graph (the minimum point is marked on the figure):

    % MATLAB program to draw the contour of the function
    [X,Y] = meshgrid(0:.1:5);
    Z = (X.*X + Y - 11).^2 + (X + Y.*Y - 7).^2;
    contour(X, Y, Z, 150); colormap(jet);

  • Slide 12/29

    EXERCISE 3.4.4: Marquardt Method

    f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

    \frac{\partial f}{\partial x} = 2(x^2 + y - 11)(2x) + 2(x + y^2 - 7) = 4x^3 + 4xy - 42x + 2y^2 - 14

    \frac{\partial f}{\partial y} = 2(x^2 + y - 11) + 2(x + y^2 - 7)(2y) = 2x^2 + 4y^3 + 4xy - 26y - 22

    \frac{\partial^2 f}{\partial x^2} = 12x^2 + 4y - 42; \quad
    \frac{\partial^2 f}{\partial y^2} = 12y^2 + 4x - 26; \quad
    \frac{\partial^2 f}{\partial x \partial y} = 4x + 4y

    H = \begin{bmatrix} 12x^2 + 4y - 42 & 4x + 4y \\ 4x + 4y & 12y^2 + 4x - 26 \end{bmatrix}

    At (0, 0): \nabla f = (-14, -22)^T and H = \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix}.
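    These expressions are easy to check with MATLAB's Symbolic Math Toolbox (a
    verification sketch, not part of the original slides):

    % Verify the gradient and Hessian of the Himmelblau function at (0,0)
    syms x y
    f = (x^2 + y - 11)^2 + (x + y^2 - 7)^2;
    g = gradient(f, [x y]);         % symbolic gradient
    H = hessian(f, [x y]);          % symbolic Hessian
    subs(g, [x y], [0 0])           % returns (-14, -22)^T
    subs(H, [x y], [0 0])           % returns [-42 0; 0 -26]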

  • Slide 13/29

    EXERCISE 3.4.4: Marquardt Method

    At (0, 0): \nabla f = (-14, -22)^T and H = \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix}.

    Step 4:

    s(0) = -[H + λ(0) I]^{-1} \nabla f(x(0))
         = -\left( \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix} + 100 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} -14 \\ -22 \end{bmatrix}
         = -\begin{bmatrix} 58 & 0 \\ 0 & 74 \end{bmatrix}^{-1} \begin{bmatrix} -14 \\ -22 \end{bmatrix}
         = \begin{bmatrix} 14/58 \\ 22/74 \end{bmatrix} = \begin{bmatrix} 0.241 \\ 0.297 \end{bmatrix}

    Thus the new point is x(1) = x(0) + s(0):

    x(1) = (0, 0)^T + (0.241, 0.297)^T = (0.241, 0.297)^T
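    Numerically, s(0) is just the solution of one 2x2 linear system; a quick
    MATLAB check (my sketch):

    % Compute the first Marquardt step s(0) at x(0) = (0,0)
    g0     = [-14; -22];                 % grad f at (0,0)
    H0     = [-42 0; 0 -26];             % Hessian at (0,0)
    lambda = 100;
    s0 = -(H0 + lambda*eye(2)) \ g0      % = (14/58, 22/74)^T = (0.241, 0.297)^T
    x1 = [0; 0] + s0                     % new point x(1)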

  • Slide 14/29

    EXERCISE 3.4.4: Marquardt Method

    • STEP 5: The function value at the point x(1) is f(x(1)) = 157.79, which is
    smaller than f(x(0)) = 170. Thus we move to the next step.

    • STEP 6: We now set a new λ = 100/2 = 50. This has the effect of shifting
    from Cauchy's method toward Newton's method. We now set k = 1.

    • This completes one iteration of the Marquardt algorithm.

  • Slide 15/29

    EXERCISE 3.4.4: Marquardt Method

    We now continue from the point x(1) = (0.241, 0.297)^T; the function value
    at this point is f(x(1)) = 157.79.

    Step 2: The derivative at this point is calculated as (-23.60, -29.21)^T.

    Step 3: Since the termination criteria are not met, we go to Step 4.

    Step 4: At the point (0.241, 0.297):

    \frac{\partial f}{\partial x} = 4x^3 + 4xy - 42x + 2y^2 - 14 = -23.6033

    \frac{\partial f}{\partial y} = 2x^2 + 4y^3 + 4xy - 26y - 22 = -29.2147

    H = \begin{bmatrix} 12x^2 + 4y - 42 & 4x + 4y \\ 4x + 4y & 12y^2 + 4x - 26 \end{bmatrix}
      = \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix}

  • Slide 16/29

    EXERCISE 3.4.4: Marquardt Method

    At x(1) = (0.241, 0.297)^T: \nabla f = (-23.64, -29.21)^T and
    H = \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix}.

    Step 4:

    s(1) = -[H + λ(1) I]^{-1} \nabla f(x(1))
         = -\left( \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix} + 50 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} -23.64 \\ -29.21 \end{bmatrix}
         = -\begin{bmatrix} 9.885 & 2.152 \\ 2.152 & 26.246 \end{bmatrix}^{-1} \begin{bmatrix} -23.64 \\ -29.21 \end{bmatrix}
         = \begin{bmatrix} 2.738 \\ 1.749 \end{bmatrix}

    Thus the new point is x(2) = x(1) + s(1):

    x(2) = (0.241, 0.297)^T + (2.738, 1.749)^T = (2.98, 2.045)^T

  • Slide 17/29

    EXERCISE 3.4.4: Marquardt Method

    • STEP 5: The function value at the point x(2) is f(x(2)) = 0.033, which is
    much smaller than f(x(1)) = 157.79. Thus we move to the next step.

    • STEP 6: We now set a new λ = 50/2 = 25. This has the effect of shifting
    further from Cauchy's method toward Newton's method. We now set k = 2.

    • This completes one more iteration of the Marquardt algorithm.

  • Slide 18/29

    EXERCISE 3.4.4: Marquardt Method

    • This process continues until the termination criteria are satisfied.

    • One more iteration shows that x(3) = (2.994, 2.005)^T and the function
    value f(x(3)) = 0.001.

    • So, we can stop here, and the optimum is x(3).

    • One difficulty of the Marquardt method is that the Hessian matrix has to
    be estimated at every iteration.
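    Putting the steps together, the whole procedure fits in a short MATLAB loop
    (my own implementation of the steps described in this exercise; for brevity
    it omits the Step-5 check that the function value actually decreased):

    % Marquardt method on the Himmelblau function (illustrative sketch)
    gradf = @(v) [4*v(1)^3 + 4*v(1)*v(2) - 42*v(1) + 2*v(2)^2 - 14; ...
                  2*v(1)^2 + 4*v(2)^3 + 4*v(1)*v(2) - 26*v(2) - 22];
    hessf = @(v) [12*v(1)^2 + 4*v(2) - 42,  4*v(1) + 4*v(2); ...
                  4*v(1) + 4*v(2),          12*v(2)^2 + 4*v(1) - 26];
    x = [0; 0];  lambda = 100;  eps1 = 1e-3;  M = 100;
    for k = 0:M
        g = gradf(x);
        if norm(g) < eps1, break; end           % Steps 2-3: termination test
        s = -(hessf(x) + lambda*eye(2)) \ g;    % Step 4: Marquardt direction
        x = x + s;                              % new point x(k+1)
        lambda = lambda/2;                      % Step 6: shift toward Newton's method
    end
    disp(x')                                    % approaches the minimum near (3, 2)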

  • Slide 19/29

    Conjugate Direction Methods

    Conjugate direction methods can be regarded as standing somewhere between
    steepest descent and Newton's method, having the positive features of both.

    Motivation: there is a desire to accelerate the slow convergence of steepest
    descent while avoiding the expensive evaluation, storage, and inversion of
    the Hessian.

  • Slide 20/29

    Conjugate Gradient Approaches

    • It is similar to the conjugate direction method.

    • Assuming that the objective function is quadratic, conjugate directions
    can be found using first-order derivatives.

    • Idea: calculate the conjugate direction at each point based on the
    gradient, as in the Fletcher-Reeves method below.

    The Fletcher-Reeves Method:

    S_i = \nabla f_i + \frac{\|\nabla f_i\|^2}{\|\nabla f_{i-1}\|^2} S_{i-1}

    This method converges faster than Powell's method.
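    For a quadratic objective the step length along each direction can be
    computed exactly, and Fletcher-Reeves reduces to the classical linear
    conjugate-gradient iteration. A minimal sketch on an assumed two-variable
    quadratic (my example, written for minimization, so r = -grad f; the same
    Fletcher-Reeves ratio appears):

    % Fletcher-Reeves conjugate gradient on an assumed quadratic
    % f(x) = 0.5*x'*A*x - b'*x, so that -grad f(x) = b - A*x.
    A = [4 1; 1 3];  b = [1; 2];       % assumed test problem
    x = [0; 0];
    r = b - A*x;                       % negative gradient at x
    p = r;                             % first direction: along the gradient
    for i = 1:2
        alpha = (r'*r) / (p'*A*p);     % exact step length for a quadratic
        x = x + alpha*p;
        rnew = r - alpha*A*p;
        beta = (rnew'*rnew) / (r'*r);  % Fletcher-Reeves ratio |grad_i|^2/|grad_{i-1}|^2
        p = rnew + beta*p;             % next conjugate direction
        r = rnew;
    end
    disp(x')                           % solves A*x = b in at most 2 steps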

  • Slide 21/29

    Example on various functions

    • Determine whether the stationary point of each of the following quadratic
    functions is a local maximum, a local minimum, or a saddle point:

    (i) f(x) = x^2 - 2x + 100
    (ii) f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2
    (iii) f(x, y) = (x - 2)^2 - (y - 3)^2
    (iv) f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10

    • A point x* is a stationary point iff
      - f'(x*) = 0 (if f is a function of one variable)
      - \nabla f(x*) = 0 (if f is a function of more than one variable)

  • Slide 22/29

    Example – Solution

    (i) f(x) = x^2 - 2x + 100

    f'(x) = 2x - 2 = 0 \Rightarrow x = 1
    f''(1) = 2 (+ve) \Rightarrow x = 1 is a local minimum.

    (ii) f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2

    \partial f/\partial x = 2y - 2.5x; \quad \partial f/\partial y = 2x + 1.5 - 4y

    Setting \nabla f = 0, we have

    2y - 2.5x = 0
    2x + 1.5 - 4y = 0

    Solving the system yields x = 0.5 and y = 0.625.

    We still have to test whether the point is a local maximum, a local minimum,
    or a saddle point (continued on the next page …).

  • Slide 23/29

    Example – Solution (Continued)

    (ii) (… continued)

    One way to test whether a point is a local maximum, a local minimum, or a
    saddle point is to use the Hessian matrix.

    f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2

    \partial f/\partial x = 2y - 2.5x; \quad \partial f/\partial y = 2x + 1.5 - 4y

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 \end{bmatrix}
      = \begin{bmatrix} -2.5 & 2 \\ 2 & -4 \end{bmatrix}

    |H| = (-2.5)(-4) - (2)(2) = 6

    Since \partial^2 f/\partial x^2 < 0 and |H| > 0, the point (0.5, 0.625) is a
    local maximum.

  • Slide 24/29

    Example – Solution (Continued)

    (iii) f(x, y) = (x - 2)^2 - (y - 3)^2

    \partial f/\partial x = 2(x - 2) = 2x - 4; \quad \partial f/\partial y = -2(y - 3) = -2y + 6

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 \end{bmatrix}
      = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}

    Since \partial^2 f/\partial x^2 > 0 but |H| = -4 < 0 (i.e., H is
    indefinite), the stationary point is a saddle point.

  • Slide 25/29

    Example – Solution (Continued)

    (iv) f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10

    \partial f/\partial x = 2x + y - 2z; \quad
    \partial f/\partial y = x + 2y - 3z; \quad
    \partial f/\partial z = -2x - 3y + 2z

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y & \partial^2 f/\partial x\partial z \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 & \partial^2 f/\partial y\partial z \\ \partial^2 f/\partial z\partial x & \partial^2 f/\partial z\partial y & \partial^2 f/\partial z^2 \end{bmatrix}
      = \begin{bmatrix} 2 & 1 & -2 \\ 1 & 2 & -3 \\ -2 & -3 & 2 \end{bmatrix}

    We need to test whether H is positive definite, negative definite, or
    neither in order to tell whether the stationary point is a local maximum, a
    local minimum, or a saddle point.

    (continued on the next page …)

  • Slide 26/29

    Example – Solution (Continued)

    (iv) (… continued from the previous slide)

    We can verify whether a matrix is positive definite by checking whether the
    determinants of all its upper-left corner sub-matrices are positive.

    Forward elimination:

    \begin{bmatrix} 2 & 1 & -2 \\ 1 & 2 & -3 \\ -2 & -3 & 2 \end{bmatrix}
    \Rightarrow
    \begin{bmatrix} 2 & 1 & -2 \\ 0 & 1.5 & -2 \\ 0 & 0 & -4/1.5 \end{bmatrix}

    |H_1| = 2 > 0; \quad |H_2| = (2)(2) - (1)(1) = 3 > 0; \quad |H| = (2)(1.5)(-4/1.5) = -8 < 0

    Since H is neither positive definite nor negative definite (i.e.,
    indefinite), the stationary point is a saddle point.
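    The definiteness test can also be run numerically, either from the leading
    principal minors used above or from the eigenvalues (a quick MATLAB check,
    not from the slides):

    % Check the definiteness of H from part (iv)
    H = [2 1 -2; 1 2 -3; -2 -3 2];
    minors = [H(1,1), det(H(1:2,1:2)), det(H)]   % 2, 3, -8: not all positive
    eig(H)                                       % mixed-sign eigenvalues => indefinite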

  • Slide 27/29

    Let us do more exercises

    For each of the following points, determine whether it is a local maximum, a
    local minimum, a saddle point, or not a stationary point of

    f(x, y) = x^3 + y^3 - 3xy

    1. (0, 0)

    2. (1, 0)

    3. (-1, -1)

    4. (1, 1)

  • Slide 28/29

    Exercise: solution

    f(x, y) = x^3 + y^3 - 3xy

    \partial f/\partial x = 3x^2 - 3y; \quad \partial f/\partial y = 3y^2 - 3x; \quad
    H = \begin{bmatrix} 6x & -3 \\ -3 & 6y \end{bmatrix}

    1. At (0, 0): \nabla f = (0, 0)^T and H = \begin{bmatrix} 0 & -3 \\ -3 & 0 \end{bmatrix}.
       Since |H| = -9 < 0 (H is indefinite), (0, 0) is a saddle point.

    2. At (1, 0): \nabla f = (3, -3)^T \neq 0. Thus (1, 0) is not a stationary point.

    3. At (-1, -1): \nabla f = (6, 6)^T \neq 0. Thus (-1, -1) is not a stationary point.

    4. At (1, 1): \nabla f = (0, 0)^T and H = \begin{bmatrix} 6 & -3 \\ -3 & 6 \end{bmatrix}.
       Since \partial^2 f/\partial x^2 = 6 > 0 and |H| = (6)(6) - (-3)(-3) = 27 > 0,
       (1, 1) is a local minimum.
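    The same classification can be confirmed with a short symbolic computation
    (a verification sketch assuming the Symbolic Math Toolbox):

    % Classify the candidate points of f(x,y) = x^3 + y^3 - 3xy
    syms x y
    f = x^3 + y^3 - 3*x*y;
    g = gradient(f, [x y]);  H = hessian(f, [x y]);
    pts = [0 0; 1 0; -1 -1; 1 1];
    for k = 1:size(pts,1)
        gk = double(subs(g, [x y], pts(k,:)));
        Hk = double(subs(H, [x y], pts(k,:)));
        fprintf('(%2g,%2g): |grad| = %5.2f, det H = %5.1f\n', ...
                pts(k,1), pts(k,2), norm(gk), det(Hk));
    end
    % (0,0):  grad = 0, det H = -9        -> saddle point
    % (1,0) and (-1,-1): grad ~= 0        -> not stationary points
    % (1,1):  grad = 0, det H = 27, f_xx = 6 > 0 -> local minimum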

  • Slide 29/29

    Summary

    • Gradient – what it is and how to derive it

    • Hessian matrix – what it is and how to derive it

    • How to test whether a point is a maximum, a minimum, or a saddle point

    • Steepest Ascent Method vs. Conjugate-Gradient Approach vs. Newton's Method