Lec 21: Marquardt Method
-
Optimization Methods
• One-Dimensional Unconstrained Optimization
  – Golden-Section Search
  – Quadratic Interpolation
  – Newton's Method
• Multi-Dimensional Unconstrained Optimization
  – Non-gradient (direct) methods
  – Gradient methods
-
Summary of Newton's Method

One-dimensional optimization:
• At the optimum: f'(x) = 0
• Newton's method: x_{i+1} = x_i − f'(x_i) / f''(x_i)

Multi-dimensional optimization:
• At the optimum: ∇f(x) = 0
• Newton's method: x_{i+1} = x_i − H_i⁻¹ ∇f(x_i)

H_i is the Hessian matrix (the matrix of 2nd partial derivatives) of f evaluated at x_i; H_i⁻¹ ∇f(x_i) plays the role of f''(x_i)⁻¹ f'(x_i) from the one-dimensional case.
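As a concrete illustration of the multi-dimensional update (a minimal sketch, not code from the lecture; newton_opt, gradf, and hessf are illustrative names):

% Minimal sketch of the multi-dimensional Newton iteration (newton_opt.m).
% gradf and hessf are assumed function handles for the gradient and Hessian.
% e.g. newton_opt(@(x) 2*x, @(x) 2, 5, 50, 1e-8) minimizes f(x) = x^2.
function x = newton_opt(gradf, hessf, x, maxit, tol)
    for k = 1:maxit
        g = gradf(x);
        if norm(g) < tol, break; end   % near a stationary point: stop
        x = x - hessf(x) \ g;          % x_{i+1} = x_i - H_i^{-1} * grad f(x_i)
    end
end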
-
Newton's Method
• The method converges quadratically.
• It may diverge if the starting point is not close enough to the optimum point.
• It is very costly to evaluate H⁻¹.

x_{i+1} = x_i − H_i⁻¹ ∇f(x_i)
-
Marquardt Method
Idea
• When the guessed point is far away from the optimum point, use the Steepest Ascent method (Cauchy's method).
• As the guessed point gets closer and closer to the optimum point, gradually switch to Newton's method.
• In any given problem it is not known whether the chosen initial point is far from the minimum or close to the minimum.
• So, we need a method that takes advantage of both.
-
Marquardt Method
The Marquardt method achieves this objective by modifying the Hessian matrix H in Newton's method in the following way:

x_{i+1} = x_i − H̃_i⁻¹ ∇f(x_i),  where H̃_i = H_i + α_i I

• Initially, set α_0 to a huge number.
• Decrease the value of α_i in each iteration.
• When x_i is close to the optimum point, make α_i zero (or close to zero).
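A minimal MATLAB sketch of this iteration (not the lecture's code; the toy objective, starting point, initial α = 10⁴, and the halving schedule are assumptions for illustration):

% Sketch of the Marquardt iteration on an assumed toy problem:
% minimize f(x) = x1^2 + 2*x2^2 (gradf/hessf written by hand).
gradf = @(x) [2*x(1); 4*x(2)];
hessf = @(x) [2 0; 0 4];
x = [5; 5];  alpha = 1e4;               % alpha_0: a huge number
for k = 1:100
    g = gradf(x);
    if norm(g) < 1e-3, break; end
    Htilde = hessf(x) + alpha*eye(2);   % H~ = H + alpha*I
    x = x - Htilde \ g;                 % x_{i+1} = x_i - H~^{-1} * grad f(x_i)
    alpha = alpha / 2;                  % decrease alpha each iteration
end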
-
Marquardt Method
When α_i is large:

H̃_i = H_i + α_i I ≈ α_i I  ⇒  x_{i+1} = x_i − (α_i I)⁻¹ ∇f(x_i) ≅ x_i − (1/α_i) ∇f(x_i)

Steepest Ascent method (i.e., move in the direction of the gradient).

When α_i is close to zero:

H̃_i = H_i + α_i I ≈ H_i  ⇒  x_{i+1} = x_i − H̃_i⁻¹ ∇f(x_i) ≅ x_i − H_i⁻¹ ∇f(x_i)

Newton's method.
-
EXERCISE 3.4.4: Marquardt Method
Consider the Himmelblau function. Minimize it using Marquardt's method:

f(x, y) = (x² + y − 11)² + (x + y² − 7)²

Step 1: To help ensure proper convergence, a large value of M, the maximum number of iterations, is usually chosen; here M = 100. Also choose an initial point x(0) = (0, 0)^T and the termination parameter ε1 = 10⁻³. We also set the iteration counter k = 0 and the parameter λ(0) = 100.
Step 2: The derivative at this point is calculated as (−14, −22)^T.
Step 3: Since the derivative is not small, we go to Step 4.
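A quick numerical check of Step 2 (a sketch, not part of the lecture; the central-difference step h is an assumption):

% Central-difference check of the gradient at x(0) = (0, 0).
f = @(x, y) (x.^2 + y - 11).^2 + (x + y.^2 - 7).^2;
h = 1e-6;
g = [(f(h, 0) - f(-h, 0)) / (2*h);    % df/dx at (0,0) -> -14
     (f(0, h) - f(0, -h)) / (2*h)]    % df/dy at (0,0) -> -22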
-
EXERCISE 3.4.4: Marquardt Method
3D Graph
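The original slide shows a 3D graph here; a short MATLAB snippet along these lines (an assumption, not the original code) reproduces the surface:

% 3D surface of the Himmelblau function.
[X, Y] = meshgrid(-5:0.1:5);
Z = (X.^2 + Y - 11).^2 + (X + Y.^2 - 7).^2;
surf(X, Y, Z, 'EdgeColor', 'none');
xlabel('x'); ylabel('y'); zlabel('f(x,y)');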
-
EXERCISE 3.4.4: Marquardt Method
% MATLAB program to draw the contour of the function
[X, Y] = meshgrid(0:0.1:5);
Z = (X.^2 + Y - 11).^2 + (X + Y.^2 - 7).^2;
contour(X, Y, Z, 150);
colormap(jet);

Contour graph; the minimum point is near (3, 2).
-
EXERCISE 3.4.4: Marquardt Method
f(x, y) = (x² + y − 11)² + (x + y² − 7)²

∂f/∂x = 2(x² + y − 11)(2x) + 2(x + y² − 7) = 4x³ + 4xy − 42x + 2y² − 14
∂f/∂y = 2(x² + y − 11) + 2(x + y² − 7)(2y) = 2x² + 4y³ + 4xy − 26y − 22

∂²f/∂x² = 12x² + 4y − 42;  ∂²f/∂y² = 12y² + 4x − 26;  ∂²f/∂x∂y = 4x + 4y

H = [ 12x² + 4y − 42    4x + 4y         ]
    [ 4x + 4y           12y² + 4x − 26  ]

At (0, 0): ∇f = (−14, −22)^T and H = [ −42.00  0 ; 0  −26.0 ].
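The derivation can be double-checked with MATLAB's Symbolic Math Toolbox (a sketch; assumes the toolbox is available):

% Symbolic check of the gradient and Hessian at (0, 0).
syms x y
f = (x^2 + y - 11)^2 + (x + y^2 - 7)^2;
g = gradient(f, [x, y]);          % symbolic gradient
H = hessian(f, [x, y]);           % symbolic Hessian
double(subs(g, [x, y], [0, 0]))   % -> (-14; -22)
double(subs(H, [x, y], [0, 0]))   % -> [-42 0; 0 -26]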
-
EXERCISE 3.4.4: Marquardt Method
At x(0) = (0, 0): ∇f = (−14, −22)^T and H = [ −42.00  0 ; 0  −26.0 ].

s(0) = −[H(x(0)) + λ(0) I]⁻¹ ∇f(x(0))
     = −( [ −42  0 ; 0  −26 ] + 100 [ 1  0 ; 0  1 ] )⁻¹ (−14, −22)^T
     = −[ 58  0 ; 0  74 ]⁻¹ (−14, −22)^T
     = −(1/4292) [ 74  0 ; 0  58 ] (−14, −22)^T
     = (0.241, 0.297)^T

Thus the new point is x(1) = x(0) + s(0):
x(1) = (0, 0)^T + (0.241, 0.297)^T = (0.241, 0.297)^T
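The same step carried out in MATLAB (a sketch mirroring the arithmetic above):

% First Marquardt step for the Himmelblau example.
g0 = [-14; -22];                    % gradient at x(0) = (0, 0)
H0 = [-42 0; 0 -26];                % Hessian at x(0)
lambda0 = 100;
s0 = -(H0 + lambda0*eye(2)) \ g0;   % s(0) = -[H + lambda*I]^{-1} * grad f
x1 = [0; 0] + s0                    % x(1) = (0.2414, 0.2973)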
-
EXERCISE 3.4.4: Marquardt Method
• STEP 5: The function value at x(1) is f(x(1)) = 157.79, which is smaller than that at x(0), f(x(0)) = 170. Thus we move to the next step.
• STEP 6: We now set a new λ = 100/2 = 50. This has the effect of switching from Cauchy's method toward Newton's method. We now set k = 1.
• This completes one iteration of the Marquardt algorithm.
-
EXERCISE 3.4.4: Marquardt Method
The current point is x(1) = (0.241, 0.297)^T, and the function value there is f(x(1)) = 157.79.
Step 2: The derivative at this point is calculated as (−23.60, −29.21)^T.
Step 3: Since the termination criteria are not met, we go to Step 4.
Step 4: At this point the Hessian is given as:
f(x, y) = (x² + y − 11)² + (x + y² − 7)²

At the point (0.241, 0.297):
∂f/∂x = 4x³ + 4xy − 42x + 2y² − 14 = −23.603
∂f/∂y = 2x² + 4y³ + 4xy − 26y − 22 = −29.2147

H = [ 12x² + 4y − 42    4x + 4y         ] = [ −40.115    2.152  ]
    [ 4x + 4y           12y² + 4x − 26  ]   [   2.152  −23.754  ]
-
EXERCISE 3.4.4: Marquardt Method
At x(1) = (0.241, 0.297): ∇f = (−23.64, −29.21)^T and H = [ −40.115  2.152 ; 2.152  −23.754 ].

s(1) = −[H(x(1)) + λ(1) I]⁻¹ ∇f(x(1))
     = −( [ −40.115  2.152 ; 2.152  −23.754 ] + 50 [ 1  0 ; 0  1 ] )⁻¹ (−23.64, −29.21)^T
     = −[ 9.885  2.152 ; 2.152  26.246 ]⁻¹ (−23.64, −29.21)^T
     = (2.738, 1.749)^T

Thus the new point is x(2) = x(1) + s(1):
x(2) = (0.241, 0.297)^T + (2.738, 1.749)^T = (2.98, 2.045)^T
-
EXERCISE 3.4.4: Marquardt Method
• STEP 5: The function value at x(2) is f(x(2)) = 0.033, which is much smaller than that at x(1), f(x(1)) = 157.79. Thus we move to the next step.
• STEP 6: We now set a new λ = 50/2 = 25. This continues the gradual switch from Cauchy's method to Newton's method. We now set k = 2.
• This completes one more iteration of the Marquardt algorithm.
-
EXERCISE 3.4.4: Marquardt Method
• This process continues until the termination criterion is satisfied.
• One more iteration shows that x(3) = (2.994, 2.005)^T and the function value is f(x(3)) = 0.001.
• So we can stop here, and the optimum is x(3).
• One difficulty of the Marquardt method is that the Hessian matrix must be estimated at every iteration.
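Putting the example together, a compact MATLAB sketch of the whole run (not the lecture's code; it applies the λ-halving rule unconditionally and omits Step 5's check that f actually decreased):

% Marquardt method on the Himmelblau function, x(0) = (0,0), lambda(0) = 100.
gradf = @(x) [4*x(1)^3 + 4*x(1)*x(2) - 42*x(1) + 2*x(2)^2 - 14;
              2*x(1)^2 + 4*x(2)^3 + 4*x(1)*x(2) - 26*x(2) - 22];
hessf = @(x) [12*x(1)^2 + 4*x(2) - 42,  4*x(1) + 4*x(2);
              4*x(1) + 4*x(2),          12*x(2)^2 + 4*x(1) - 26];
x = [0; 0];  lambda = 100;
for k = 1:100                                 % M = 100
    g = gradf(x);
    if norm(g) <= 1e-3, break; end            % termination parameter eps1
    x = x - (hessf(x) + lambda*eye(2)) \ g;   % Marquardt step
    lambda = lambda / 2;                      % shift toward Newton's method
end
x                                             % ends near the minimizer (3, 2)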
-
Conjugate Direction Methods
Conjugate direction methods can be regarded as being somewhere between steepest descent and Newton's method, having the positive features of both.

Motivation: we want to accelerate the slow convergence of steepest descent while avoiding the expensive evaluation, storage, and inversion of the Hessian.
-
Conjugate Gradient Approaches
• It is similar to the conjugate direction method.
• Assuming that the objective function is quadratic, conjugate directions can be found using first-order derivatives.
• Idea: calculate the conjugate direction at each point based on the gradient (the Fletcher-Reeves method):

S_i = ∇f_i + (‖∇f_i‖² / ‖∇f_{i−1}‖²) S_{i−1}

This method converges faster than Powell's method.
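A minimal sketch of Fletcher-Reeves in descent form on a small quadratic (the matrix A, vector b, and exact line search are assumptions; the lecture's formula is the ascent form of the same update):

% Fletcher-Reeves conjugate gradient on f(x) = 0.5*x'*A*x - b'*x.
A = [4 1; 1 3];  b = [1; 2];
x = [0; 0];
g = A*x - b;  s = -g;                 % first direction: steepest descent
for k = 1:20
    alpha = -(g'*s) / (s'*A*s);       % exact line search for a quadratic
    x = x + alpha*s;
    gnew = A*x - b;
    if norm(gnew) < 1e-10, break; end
    beta = (gnew'*gnew) / (g'*g);     % ||grad_i||^2 / ||grad_{i-1}||^2
    s = -gnew + beta*s;               % conjugate direction update
    g = gnew;
end
x                                     % solves A*x = b in <= 2 iterations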
-
Example on various functions
• Determine whether the stationary point of each of the following quadratic functions is a local maximum, local minimum, or saddle point:

(i)   f(x) = x² − 2x + 100
(ii)  f(x, y) = 2xy + 1.5y − 1.25x² − 2y²
(iii) f(x, y) = (x − 2)² − (y − 3)²
(iv)  f(x, y, z) = x² + y² + z² − 2xz + xy − 3yz + 10

• A point x* is a stationary point iff
  – f'(x*) = 0 (if f is a function of one variable)
  – ∇f(x*) = 0 (if f is a function of more than one variable)
-
Example – Solution
(i) f(x) = x² − 2x + 100
    f'(x) = 2x − 2 = 0  ⇒  x = 1
    f''(1) = 2 (+ve)  ⇒  x = 1 is a local minimum.

(ii) f(x, y) = 2xy + 1.5y − 1.25x² − 2y²
    ∂f/∂x = 2y − 2.5x;  ∂f/∂y = 2x + 1.5 − 4y
    Setting ∇f = 0, we have
        2y − 2.5x = 0
        2x − 4y + 1.5 = 0
    Solving the system yields x = 0.5 and y = 0.625.

We still have to test whether the point is a local maximum, minimum, or saddle point (continue next page …)
-
Example – Solution (Continue)

One way to test whether a point is a local maximum, local minimum, or saddle point is to use the Hessian matrix.

(ii) (… continue)
f(x, y) = 2xy + 1.5y − 1.25x² − 2y²;  ∂f/∂x = 2y − 2.5x;  ∂f/∂y = 2x + 1.5 − 4y

H = [ ∂²f/∂x²    ∂²f/∂x∂y ] = [ −2.5    2 ]
    [ ∂²f/∂y∂x   ∂²f/∂y²  ]   [    2   −4 ]

|H| = (−2.5)(−4) − (2)(2) = 10 − 4 = 6

Since |H| > 0 and ∂²f/∂x² < 0, the point (0.5, 0.625) is a local maximum.
-
Example – Solution (Continue)

(iii) f(x, y) = (x − 2)² − (y − 3)²
∂f/∂x = 2(x − 2) = 2x − 4;  ∂f/∂y = −2(y − 3) = −2y + 6
Setting ∇f = 0 gives the stationary point (2, 3).

H = [ ∂²f/∂x²    ∂²f/∂x∂y ] = [ 2    0 ]
    [ ∂²f/∂y∂x   ∂²f/∂y²  ]   [ 0   −2 ]

Since ∂²f/∂x² > 0 but |H| < 0 (i.e., H is indefinite), the stationary point is a saddle point.
-
Example – Solution (Continue)

(iv) f(x, y, z) = x² + y² + z² − 2xz + xy − 3yz + 10
∂f/∂x = 2x + y − 2z;  ∂f/∂y = 2y + x − 3z;  ∂f/∂z = 2z − 2x − 3y

H = [ ∂²f/∂x²    ∂²f/∂x∂y   ∂²f/∂x∂z ]   [  2    1   −2 ]
    [ ∂²f/∂y∂x   ∂²f/∂y²    ∂²f/∂y∂z ] = [  1    2   −3 ]
    [ ∂²f/∂z∂x   ∂²f/∂z∂y   ∂²f/∂z²  ]   [ −2   −3    2 ]

Need to test whether H is positive definite, negative definite, or neither, in order to tell whether the stationary point is a local maximum, local minimum, or saddle point.
(continue next page …)
-
Example – Solution (Continue)
(iv) (… continue from previous slide)

We can verify whether a matrix is positive definite by checking whether the determinants of all its upper-left corner sub-matrices are positive:

|2| = 2 > 0;  | 2  1 ; 1  2 | = (2)(2) − (1)(1) = 3 > 0

Forward elimination on H:

[  2    1   −2 ]      [ 2   1     −2     ]
[  1    2   −3 ]  ⇒  [ 0   1.5   −2     ]
[ −2   −3    2 ]      [ 0   0     −4/1.5 ]

|H| = (2)(1.5)(−4/1.5) = −8 < 0

Since H is neither positive definite nor negative definite (i.e., indefinite), the stationary point is a saddle point.
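The same classification can be done in MATLAB from the Hessian's eigenvalues (a sketch; for symmetric matrices the eigenvalue-sign test is equivalent to the leading-minor test):

% Classify the stationary point of (iv) from the Hessian's eigenvalues.
H = [2 1 -2; 1 2 -3; -2 -3 2];
e = eig(H);
if all(e > 0)
    disp('positive definite -> local minimum');
elseif all(e < 0)
    disp('negative definite -> local maximum');
else
    disp('indefinite -> saddle point');   % this H has mixed-sign eigenvalues
end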
-
Let us do more exercises
For each of the following points, determine whether it is a local maximum, local minimum, saddle point, or not a stationary point of

f(x, y) = x³ + y³ − 3xy

1. (0, 0)
2. (1, 0)
3. (−1, −1)
4. (1, 1)
-
Exercise – Solution

f(x, y) = x³ + y³ − 3xy
∂f/∂x = 3x² − 3y;  ∂f/∂y = 3y² − 3x

H = [ 6x   −3 ]
    [ −3   6y ]

At (0, 0): ∇f = (0, 0)^T = 0 and H = [ 0  −3 ; −3  0 ].
Since ∇f = 0 and |H| = −9 < 0, (0, 0) is a saddle point.

At (1, 0): ∇f = (3, −3)^T ≠ 0. Thus (1, 0) is not a stationary point.

At (−1, −1): ∇f = (6, 6)^T ≠ 0. Thus (−1, −1) is not a stationary point.

At (1, 1): ∇f = (0, 0)^T = 0 and H = [ 6  −3 ; −3  6 ].
Since ∇f = 0, |H| = 36 − 9 = 27 > 0, and ∂²f/∂x² = 6 > 0, (1, 1) is a local minimum.
-
Summary
• Gradient – what it is and how to derive it
• Hessian matrix – what it is and how to derive it
• How to test whether a point is a maximum, minimum, or saddle point
• Steepest Ascent method vs. Conjugate-Gradient approach vs. Newton's method