Lec 21: Marquardt Method


  • Slide 1/29

  • Slide 2/29

    Optimization Methods

    One-Dimensional Unconstrained Optimization
    - Golden-Section Search
    - Quadratic Interpolation
    - Newton's Method

    Multi-Dimensional Unconstrained Optimization
    - Non-gradient or direct methods
    - Gradient methods

  • Slide 3/29

    Summary of Newton's Method

    One-dimensional optimization:
    - At the optimum: f'(x) = 0
    - Newton's method: x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)}

    Multi-dimensional optimization:
    - At the optimum: \nabla f(x) = 0
    - Newton's method: x_{i+1} = x_i - H_i^{-1} \nabla f(x_i)

    H_i is the Hessian matrix (the matrix of 2nd partial derivatives) of f
    evaluated at x_i, so H_i^{-1} \nabla f(x_i) is the multi-dimensional
    analogue of f''(x_i)^{-1} f'(x_i).
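    As a concrete illustration (my own sketch, not from the slides), here is the
    multi-dimensional update in MATLAB on an assumed test function
    f(x) = x(1)^2 + 2*x(2)^2:

    % Minimal sketch of the Newton update (assumed test function).
    gradf = @(x) [2*x(1); 4*x(2)];   % gradient of the assumed f
    H     = [2 0; 0 4];              % its (constant) Hessian
    x = [3; -2];                     % arbitrary starting point
    x = x - H \ gradf(x)             % x_{i+1} = x_i - H^{-1} * grad f(x_i)
    % For a quadratic f the Hessian is constant and one step lands exactly on
    % the optimum (0,0); in general the step is repeated until convergence.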

  • Slide 4/29

    Newton's Method

    • This method converges in a quadratic fashion.

    • It may diverge if the starting point is not close enough to the optimum point.

    • It is very costly to evaluate H^{-1}.

    x_{i+1} = x_i - H_i^{-1} \nabla f(x_i)

  • Slide 5/29

    Marquardt Method

    Idea

    • When the guessed point is far away from the optimum point, use the
    Steepest Ascent method (also called Cauchy's method).

    • As the guessed point gets closer and closer to the optimum point,
    gradually switch to Newton's method.

    • In any given problem it is not known whether the chosen initial point is
    far from the minimum or close to the minimum.

    • So, we need a method that takes advantage of both.

  • Slide 6/29

    Marquardt Method

    The Marquardt method achieves this objective by modifying the Hessian
    matrix H in Newton's method in the following way:

    x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i), \quad \text{where } \tilde{H}_i = H_i + \alpha_i I

    • Initially, set α_0 to a huge number.

    • Decrease the value of α_i in each iteration.

    • When x_i is close to the optimum point, make α_i zero (or close to zero).
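    A minimal MATLAB sketch of this modified iteration (my illustration on an
    assumed quadratic test function; the halving rule for α is one simple
    choice, since the slides only say "decrease"):

    % Sketch of the Marquardt-modified Newton iteration (assumed test data).
    gradf = @(x) [2*x(1); 4*x(2)];     % gradient of an assumed quadratic f
    H     = [2 0; 0 4];                % its Hessian
    x     = [3; -2];
    alpha = 100;                       % huge initial alpha: acts like steepest descent
    for i = 1:20
        Htilde = H + alpha*eye(2);     % H~ = H + alpha*I
        x = x - Htilde \ gradf(x);     % x_{i+1} = x_i - H~^{-1} * grad f(x_i)
        alpha = alpha/2;               % shrink alpha: gradually becomes Newton's method
    end
    disp(x)                            % approaches the minimum (0,0)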

  • Slide 7/29

    Marquardt Method

    When α_i is large:

    \tilde{H}_i = H_i + \alpha_i I \approx \alpha_i I
    \;\Rightarrow\; x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i) \approx x_i - \frac{1}{\alpha_i} \nabla f(x_i)

    This is the Steepest Ascent method (i.e., move in the direction of the gradient).

    When α_i is close to zero:

    \tilde{H}_i = H_i + \alpha_i I \approx H_i
    \;\Rightarrow\; x_{i+1} = x_i - \tilde{H}_i^{-1} \nabla f(x_i) \approx x_i - H_i^{-1} \nabla f(x_i)

    This is Newton's method.

  • Slide 8/29

  • Slide 9/29

    EXERCISE 3.4.4: Marquardt Method

    Consider the Himmelblau function. Minimize it using Marquardt's method:

    f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

    Step 1: To ensure proper convergence, a large value of M (= 100) is usually
    chosen. Also, take the initial point x(0) = (0, 0)^T and the termination
    parameter ε1 = 10^{-3}. We also set the iteration counter k = 0 and the
    parameter λ(0) = 100.

    Step 2: The derivative at this point is calculated as (-14, -22)^T.

    Step 3: Since the derivative is not small, we go to Step 4.

  • Slide 10/29

    EXERCISE 3.4.4: Marquardt Method

    3D graph of the Himmelblau function (figure).
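    The surface shown on this slide can be reproduced with a few lines of
    MATLAB, analogous to the contour program on the next slide (my
    reconstruction; the slide itself contains only the figure):

    % MATLAB sketch to draw the 3D surface of the Himmelblau function
    [X,Y] = meshgrid(0:.1:5);
    Z = (X.*X + Y - 11).^2 + (X + Y.*Y - 7).^2;
    surf(X, Y, Z); shading interp; colormap(jet);
    xlabel('x'); ylabel('y'); zlabel('f(x,y)');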

  • Slide 11/29

    EXERCISE 3.4.4: Marquardt Method

    Contour graph (the minimum point is marked on the figure):

    % MATLAB program to draw the contour of the function
    [X,Y] = meshgrid(0:.1:5);
    Z = (X.*X + Y - 11).^2 + (X + Y.*Y - 7).^2;
    contour(X, Y, Z, 150); colormap(jet);

  • Slide 12/29

    EXERCISE 3.4.4: Marquardt Method

    f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

    \frac{\partial f}{\partial x} = 2(x^2 + y - 11)(2x) + 2(x + y^2 - 7) = 4x^3 + 4xy - 42x + 2y^2 - 14

    \frac{\partial f}{\partial y} = 2(x^2 + y - 11) + 2(x + y^2 - 7)(2y) = 2x^2 + 4y^3 + 4xy - 26y - 22

    \frac{\partial^2 f}{\partial x^2} = 12x^2 + 4y - 42; \quad
    \frac{\partial^2 f}{\partial y^2} = 12y^2 + 4x - 26; \quad
    \frac{\partial^2 f}{\partial x \partial y} = 4x + 4y

    H = \begin{bmatrix} 12x^2 + 4y - 42 & 4x + 4y \\ 4x + 4y & 12y^2 + 4x - 26 \end{bmatrix}

    At (0, 0): \nabla f = (-14, -22)^T and H = \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix}.
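    These expressions are easy to check with MATLAB's Symbolic Math Toolbox (a
    verification sketch, not part of the original slides):

    % Verify the gradient and Hessian of the Himmelblau function at (0,0)
    syms x y
    f = (x^2 + y - 11)^2 + (x + y^2 - 7)^2;
    g = gradient(f, [x y]);         % symbolic gradient
    H = hessian(f, [x y]);          % symbolic Hessian
    subs(g, [x y], [0 0])           % returns (-14, -22)^T
    subs(H, [x y], [0 0])           % returns [-42 0; 0 -26]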

  • Slide 13/29

    EXERCISE 3.4.4: Marquardt Method

    At (0, 0): \nabla f = (-14, -22)^T and H = \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix}.

    Step 4:

    s(0) = -[H + λ(0) I]^{-1} \nabla f(x(0))
         = -\left( \begin{bmatrix} -42 & 0 \\ 0 & -26 \end{bmatrix} + 100 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} -14 \\ -22 \end{bmatrix}
         = -\begin{bmatrix} 58 & 0 \\ 0 & 74 \end{bmatrix}^{-1} \begin{bmatrix} -14 \\ -22 \end{bmatrix}
         = \begin{bmatrix} 14/58 \\ 22/74 \end{bmatrix} = \begin{bmatrix} 0.241 \\ 0.297 \end{bmatrix}

    Thus the new point is x(1) = x(0) + s(0):

    x(1) = (0, 0)^T + (0.241, 0.297)^T = (0.241, 0.297)^T
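    Numerically, s(0) is just the solution of one 2x2 linear system; a quick
    MATLAB check (my sketch):

    % Compute the first Marquardt step s(0) at x(0) = (0,0)
    g0     = [-14; -22];                 % grad f at (0,0)
    H0     = [-42 0; 0 -26];             % Hessian at (0,0)
    lambda = 100;
    s0 = -(H0 + lambda*eye(2)) \ g0      % = (14/58, 22/74)^T = (0.241, 0.297)^T
    x1 = [0; 0] + s0                     % new point x(1)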

  • Slide 14/29

    EXERCISE 3.4.4: Marquardt Method

    • STEP 5: The function value at the point x(1) is f(x(1)) = 157.79, which is
    smaller than f(x(0)) = 170. Thus we move to the next step.

    • STEP 6: We now set a new λ = 100/2 = 50. This has the effect of shifting
    from Cauchy's method toward Newton's method. We now set k = 1.

    • This completes one iteration of the Marquardt algorithm.

  • Slide 15/29

    EXERCISE 3.4.4: Marquardt Method

    We now continue from the point x(1) = (0.241, 0.297)^T; the function value
    at this point is f(x(1)) = 157.79.

    Step 2: The derivative at this point is calculated as (-23.60, -29.21)^T.

    Step 3: Since the termination criteria are not met, we go to Step 4.

    Step 4: At the point (0.241, 0.297):

    \frac{\partial f}{\partial x} = 4x^3 + 4xy - 42x + 2y^2 - 14 = -23.6033

    \frac{\partial f}{\partial y} = 2x^2 + 4y^3 + 4xy - 26y - 22 = -29.2147

    H = \begin{bmatrix} 12x^2 + 4y - 42 & 4x + 4y \\ 4x + 4y & 12y^2 + 4x - 26 \end{bmatrix}
      = \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix}

  • Slide 16/29

    EXERCISE 3.4.4: Marquardt Method

    At x(1) = (0.241, 0.297)^T: \nabla f = (-23.64, -29.21)^T and
    H = \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix}.

    Step 4:

    s(1) = -[H + λ(1) I]^{-1} \nabla f(x(1))
         = -\left( \begin{bmatrix} -40.115 & 2.152 \\ 2.152 & -23.754 \end{bmatrix} + 50 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} -23.64 \\ -29.21 \end{bmatrix}
         = -\begin{bmatrix} 9.885 & 2.152 \\ 2.152 & 26.246 \end{bmatrix}^{-1} \begin{bmatrix} -23.64 \\ -29.21 \end{bmatrix}
         = \begin{bmatrix} 2.738 \\ 1.749 \end{bmatrix}

    Thus the new point is x(2) = x(1) + s(1):

    x(2) = (0.241, 0.297)^T + (2.738, 1.749)^T = (2.98, 2.045)^T

  • Slide 17/29

    EXERCISE 3.4.4: Marquardt Method

    • STEP 5: The function value at the point x(2) is f(x(2)) = 0.033, which is
    much smaller than f(x(1)) = 157.79. Thus we move to the next step.

    • STEP 6: We now set a new λ = 50/2 = 25. This has the effect of shifting
    further from Cauchy's method toward Newton's method. We now set k = 2.

    • This completes one more iteration of the Marquardt algorithm.

  • Slide 18/29

    EXERCISE 3.4.4: Marquardt Method

    • This process continues until the termination criteria are satisfied.

    • One more iteration shows that x(3) = (2.994, 2.005)^T and the function
    value f(x(3)) = 0.001.

    • So, we can stop here, and the optimum is x(3).

    • One difficulty of the Marquardt method is that the Hessian matrix has to
    be estimated at every iteration.
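    Putting the steps together, the whole procedure fits in a short MATLAB loop
    (my own implementation of the steps described in this exercise; for brevity
    it omits the Step-5 check that the function value actually decreased):

    % Marquardt method on the Himmelblau function (illustrative sketch)
    gradf = @(v) [4*v(1)^3 + 4*v(1)*v(2) - 42*v(1) + 2*v(2)^2 - 14; ...
                  2*v(1)^2 + 4*v(2)^3 + 4*v(1)*v(2) - 26*v(2) - 22];
    hessf = @(v) [12*v(1)^2 + 4*v(2) - 42,  4*v(1) + 4*v(2); ...
                  4*v(1) + 4*v(2),          12*v(2)^2 + 4*v(1) - 26];
    x = [0; 0];  lambda = 100;  eps1 = 1e-3;  M = 100;
    for k = 0:M
        g = gradf(x);
        if norm(g) < eps1, break; end           % Steps 2-3: termination test
        s = -(hessf(x) + lambda*eye(2)) \ g;    % Step 4: Marquardt direction
        x = x + s;                              % new point x(k+1)
        lambda = lambda/2;                      % Step 6: shift toward Newton's method
    end
    disp(x')                                    % approaches the minimum near (3, 2)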

  • Slide 19/29

    Conjugate Direction Methods

    Conjugate direction methods can be regarded as standing somewhere between
    steepest descent and Newton's method, having the positive features of both.

    Motivation: there is a desire to accelerate the slow convergence of steepest
    descent while avoiding the expensive evaluation, storage, and inversion of
    the Hessian.

  • Slide 20/29

    Conjugate Gradient Approaches

    • It is similar to the conjugate direction method.

    • Assuming that the objective function is quadratic, conjugate directions
    can be found using first-order derivatives.

    • Idea: calculate the conjugate direction at each point based on the
    gradient, as in the Fletcher-Reeves method below.

    The Fletcher-Reeves Method:

    S_i = \nabla f_i + \frac{\|\nabla f_i\|^2}{\|\nabla f_{i-1}\|^2} S_{i-1}

    This method converges faster than Powell's method.
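    For a quadratic objective the step length along each direction can be
    computed exactly, and Fletcher-Reeves reduces to the classical linear
    conjugate-gradient iteration. A minimal sketch on an assumed two-variable
    quadratic (my example, written for minimization, so r = -grad f; the same
    Fletcher-Reeves ratio appears):

    % Fletcher-Reeves conjugate gradient on an assumed quadratic
    % f(x) = 0.5*x'*A*x - b'*x, so that -grad f(x) = b - A*x.
    A = [4 1; 1 3];  b = [1; 2];       % assumed test problem
    x = [0; 0];
    r = b - A*x;                       % negative gradient at x
    p = r;                             % first direction: along the gradient
    for i = 1:2
        alpha = (r'*r) / (p'*A*p);     % exact step length for a quadratic
        x = x + alpha*p;
        rnew = r - alpha*A*p;
        beta = (rnew'*rnew) / (r'*r);  % Fletcher-Reeves ratio |grad_i|^2/|grad_{i-1}|^2
        p = rnew + beta*p;             % next conjugate direction
        r = rnew;
    end
    disp(x')                           % solves A*x = b in at most 2 steps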

  • Slide 21/29

    Example on various functions

    • Determine whether the stationary point of each of the following quadratic
    functions is a local maximum, a local minimum, or a saddle point:

    (i) f(x) = x^2 - 2x + 100
    (ii) f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2
    (iii) f(x, y) = (x - 2)^2 - (y - 3)^2
    (iv) f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10

    • A point x* is a stationary point iff
      - f'(x*) = 0 (if f is a function of one variable)
      - \nabla f(x*) = 0 (if f is a function of more than one variable)

  • Slide 22/29

    Example – Solution

    (i) f(x) = x^2 - 2x + 100

    f'(x) = 2x - 2 = 0 \Rightarrow x = 1
    f''(1) = 2 (+ve) \Rightarrow x = 1 is a local minimum.

    (ii) f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2

    \partial f/\partial x = 2y - 2.5x; \quad \partial f/\partial y = 2x + 1.5 - 4y

    Setting \nabla f = 0, we have

    2y - 2.5x = 0
    2x + 1.5 - 4y = 0

    Solving the system yields x = 0.5 and y = 0.625.

    We still have to test whether the point is a local maximum, a local minimum,
    or a saddle point (continued on the next page …).

  • Slide 23/29

    Example – Solution (Continued)

    (ii) (… continued)

    One way to test whether a point is a local maximum, a local minimum, or a
    saddle point is to use the Hessian matrix.

    f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2

    \partial f/\partial x = 2y - 2.5x; \quad \partial f/\partial y = 2x + 1.5 - 4y

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 \end{bmatrix}
      = \begin{bmatrix} -2.5 & 2 \\ 2 & -4 \end{bmatrix}

    |H| = (-2.5)(-4) - (2)(2) = 6

    Since \partial^2 f/\partial x^2 < 0 and |H| > 0, the point (0.5, 0.625) is a
    local maximum.

  • Slide 24/29

    Example – Solution (Continued)

    (iii) f(x, y) = (x - 2)^2 - (y - 3)^2

    \partial f/\partial x = 2(x - 2) = 2x - 4; \quad \partial f/\partial y = -2(y - 3) = -2y + 6

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 \end{bmatrix}
      = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}

    Since \partial^2 f/\partial x^2 > 0 but |H| = -4 < 0 (i.e., H is
    indefinite), the stationary point is a saddle point.

  • Slide 25/29

    Example – Solution (Continued)

    (iv) f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10

    \partial f/\partial x = 2x + y - 2z; \quad
    \partial f/\partial y = x + 2y - 3z; \quad
    \partial f/\partial z = -2x - 3y + 2z

    H = \begin{bmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\partial y & \partial^2 f/\partial x\partial z \\ \partial^2 f/\partial y\partial x & \partial^2 f/\partial y^2 & \partial^2 f/\partial y\partial z \\ \partial^2 f/\partial z\partial x & \partial^2 f/\partial z\partial y & \partial^2 f/\partial z^2 \end{bmatrix}
      = \begin{bmatrix} 2 & 1 & -2 \\ 1 & 2 & -3 \\ -2 & -3 & 2 \end{bmatrix}

    We need to test whether H is positive definite, negative definite, or
    neither in order to tell whether the stationary point is a local maximum, a
    local minimum, or a saddle point.

    (continued on the next page …)

  • Slide 26/29

    Example – Solution (Continued)

    (iv) (… continued from the previous slide)

    We can verify whether a matrix is positive definite by checking whether the
    determinants of all its upper-left corner sub-matrices are positive.

    Forward elimination:

    \begin{bmatrix} 2 & 1 & -2 \\ 1 & 2 & -3 \\ -2 & -3 & 2 \end{bmatrix}
    \Rightarrow
    \begin{bmatrix} 2 & 1 & -2 \\ 0 & 1.5 & -2 \\ 0 & 0 & -4/1.5 \end{bmatrix}

    |H_1| = 2 > 0; \quad |H_2| = (2)(2) - (1)(1) = 3 > 0; \quad |H| = (2)(1.5)(-4/1.5) = -8 < 0

    Since H is neither positive definite nor negative definite (i.e.,
    indefinite), the stationary point is a saddle point.
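    The definiteness test can also be run numerically, either from the leading
    principal minors used above or from the eigenvalues (a quick MATLAB check,
    not from the slides):

    % Check the definiteness of H from part (iv)
    H = [2 1 -2; 1 2 -3; -2 -3 2];
    minors = [H(1,1), det(H(1:2,1:2)), det(H)]   % 2, 3, -8: not all positive
    eig(H)                                       % mixed-sign eigenvalues => indefinite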

  • Slide 27/29

    Let us do more exercises

    For each of the following points, determine whether it is a local maximum, a
    local minimum, a saddle point, or not a stationary point of

    f(x, y) = x^3 + y^3 - 3xy

    1. (0, 0)

    2. (1, 0)

    3. (-1, -1)

    4. (1, 1)

  • Slide 28/29

    Exercise: solution

    f(x, y) = x^3 + y^3 - 3xy

    \partial f/\partial x = 3x^2 - 3y; \quad \partial f/\partial y = 3y^2 - 3x; \quad
    H = \begin{bmatrix} 6x & -3 \\ -3 & 6y \end{bmatrix}

    1. At (0, 0): \nabla f = (0, 0)^T and H = \begin{bmatrix} 0 & -3 \\ -3 & 0 \end{bmatrix}.
       Since |H| = -9 < 0 (H is indefinite), (0, 0) is a saddle point.

    2. At (1, 0): \nabla f = (3, -3)^T \neq 0. Thus (1, 0) is not a stationary point.

    3. At (-1, -1): \nabla f = (6, 6)^T \neq 0. Thus (-1, -1) is not a stationary point.

    4. At (1, 1): \nabla f = (0, 0)^T and H = \begin{bmatrix} 6 & -3 \\ -3 & 6 \end{bmatrix}.
       Since \partial^2 f/\partial x^2 = 6 > 0 and |H| = (6)(6) - (-3)(-3) = 27 > 0,
       (1, 1) is a local minimum.
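    The same classification can be confirmed with a short symbolic computation
    (a verification sketch assuming the Symbolic Math Toolbox):

    % Classify the candidate points of f(x,y) = x^3 + y^3 - 3xy
    syms x y
    f = x^3 + y^3 - 3*x*y;
    g = gradient(f, [x y]);  H = hessian(f, [x y]);
    pts = [0 0; 1 0; -1 -1; 1 1];
    for k = 1:size(pts,1)
        gk = double(subs(g, [x y], pts(k,:)));
        Hk = double(subs(H, [x y], pts(k,:)));
        fprintf('(%2g,%2g): |grad| = %5.2f, det H = %5.1f\n', ...
                pts(k,1), pts(k,2), norm(gk), det(Hk));
    end
    % (0,0):  grad = 0, det H = -9        -> saddle point
    % (1,0) and (-1,-1): grad ~= 0        -> not stationary points
    % (1,1):  grad = 0, det H = 27, f_xx = 6 > 0 -> local minimum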

  • Slide 29/29

    Summary

    • Gradient – what it is and how to derive it

    • Hessian matrix – what it is and how to derive it

    • How to test whether a point is a maximum, a minimum, or a saddle point

    • Steepest Ascent Method vs. Conjugate-Gradient Approach vs. Newton's Method