unconstrained minimization (i)

43
Unconstrained Minimization (I) Lijun Zhang [email protected] http://cs.nju.edu.cn/zlj

Upload: others

Post on 27-Mar-2022

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Unconstrained Minimization (I)

Unconstrained Minimization (I)

Lijun [email protected]://cs.nju.edu.cn/zlj

Page 2: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 3: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 4: Unconstrained Minimization (I)

Basic Terminology

Unconstrained Optimization Problem

is convex always have a domain dom 𝑓 𝐑 , dom 𝑓 βŠ‚ 𝐑

is twice continuously differentiable dom 𝑓 is open, such as 0, ∞

The problem is solvable There exists an optimal point π‘₯βˆ—

inf 𝑓 π‘₯ 𝑓 π‘₯βˆ— π‘βˆ—

Page 5: Unconstrained Minimization (I)

Basic Terminology

Unconstrained Optimization Problem

βˆ— is optimal if and only if

Special cases: a closed-form solution General cases: an iterative algorithm A sequence of points π‘₯ , π‘₯ , … ∈ 𝐝𝐨𝐦 𝑓 with

A minimizing sequence for the problem The algorithm is terminated when

βˆ—

Equivalent

𝑓 π‘₯ β†’ π‘βˆ— as π‘˜ β†’ ∞

𝑓 π‘₯ π‘βˆ— πœ–

Page 6: Unconstrained Minimization (I)

Requirements of Iterative Algorithm

Initial Point A suitable starting point

Sublevel Set is Closed

Satisfied for all if the function is closed

Continuous functions with dom 𝑓 𝐑 Continuous functions with open domains

and 𝑓 π‘₯ β†’ ∞ as π‘₯ β†’ bd dom 𝑓

Page 7: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 8: Unconstrained Minimization (I)

Examples

Convex Quadratic Minimization Problem

Optimality Condition

βˆ— (unique solution)2. If is singular and , any solution

of βˆ— is optimal3. If , no solution, unbound below

βˆ—

Page 9: Unconstrained Minimization (I)

Examples

Convex Quadratic Minimization Problem

3. If , no solution, unbound below π‘ž π‘Ž 𝑏, π‘Ž ∈ β„› 𝑃 , 𝑏 βŠ₯ β„› 𝑃 Let π‘₯ 𝑑𝑏

Page 10: Unconstrained Minimization (I)

Examples

Least-Squares Problem

are problem data

Optimality Condition

Normal Equations

βˆ— βˆ—

βˆ—

Page 11: Unconstrained Minimization (I)

Examples

Unconstrained Geometric Programming

Optimality Condition

No analytical solution An Iterative Algorithm dom 𝑓 𝐑 , any point can be chosen as π‘₯

βˆ—βˆ—

βˆ—

Page 12: Unconstrained Minimization (I)

Examples

Analytic Center of Linear Inequalities

is called as the logarithmic barrier for the inequalities

The solution of this problem is called the analytic center of the inequalities

An Iterative Algorithm π‘₯ must satisfy π‘Ž π‘₯ 𝑏

Page 13: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 14: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

1. A Quadratic Lower Bound

𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯12 𝑦 π‘₯ 𝛻 𝑓 𝑧 𝑦 π‘₯

𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯

Page 15: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

1. A Quadratic Lower Bound

When , reduce to the first-order condition of convex functions

𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯ , βˆ€π‘₯, 𝑦 ∈ 𝑆

Page 16: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

1. A Quadratic Lower Bound

2. A Condition for Suboptimality

𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯ , βˆ€π‘₯, 𝑦 ∈ 𝑆

𝑓 𝑦 min 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯

𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯ , 𝑦 π‘₯

1π‘š 𝛻𝑓 π‘₯

𝑓 π‘₯1

2π‘š 𝛻𝑓 π‘₯

Page 17: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

1. A Quadratic Lower Bound

2. A Condition for Suboptimality

If the gradient is small at π‘₯, then it is nearly optimal

𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯π‘š2 𝑦 π‘₯ , βˆ€π‘₯, 𝑦 ∈ 𝑆

π‘βˆ— 𝑓 π‘₯1

2π‘š 𝛻𝑓 π‘₯ 𝑓 π‘₯ π‘βˆ—1

2π‘š 𝛻𝑓 π‘₯

𝛻𝑓 π‘₯ 2π‘šπœ– β‡’ 𝑓 π‘₯ π‘βˆ— πœ–

Page 18: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

3. An Upper Bound of βˆ—

π‘βˆ— 𝑓 π‘₯βˆ—

𝑓 π‘₯ 𝛻𝑓 π‘₯ π‘₯βˆ— π‘₯π‘š2 π‘₯βˆ— π‘₯

π‘βˆ— 𝛻𝑓 π‘₯ π‘₯βˆ— π‘₯π‘š2 π‘₯βˆ— π‘₯

𝑓 π‘₯ 𝛻𝑓 π‘₯ π‘₯βˆ— π‘₯π‘š2 π‘₯βˆ— π‘₯

Page 19: Unconstrained Minimization (I)

Strong Convexity

is strongly convex on , if

3. An Upper Bound of βˆ—

βˆ—, as 0 The optimal point βˆ— is unique

π‘š2 π‘₯βˆ— π‘₯ 𝛻𝑓 π‘₯ π‘₯βˆ— π‘₯

π‘₯βˆ— π‘₯2π‘š 𝛻𝑓 π‘₯

Page 20: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 21: Unconstrained Minimization (I)

Smoothness

is smooth on , if

1. A Quadratic Upper Bound

𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯12 𝑦 π‘₯ 𝛻 𝑓 𝑧 𝑦 π‘₯

𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯𝑀2 𝑦 π‘₯

Page 22: Unconstrained Minimization (I)

Smoothness

is smooth on , if

1. A Quadratic Upper Bound

2. An Upper Bound of Gradients𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯

𝑀2 𝑦 π‘₯ , βˆ€π‘₯, 𝑦 ∈ 𝑆

𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯𝑀2 𝑦 π‘₯ , 𝑦 π‘₯

1𝑀 𝛻𝑓 π‘₯

𝑓 π‘₯1

2𝑀 𝛻𝑓 π‘₯

min 𝑓 𝑦 min 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯𝑀2 𝑦 π‘₯

Page 23: Unconstrained Minimization (I)

Smoothness

is smooth on , if

1. A Quadratic Upper Bound

2. An Upper Bound of Gradients𝑓 𝑦 𝑓 π‘₯ 𝛻𝑓 π‘₯ 𝑦 π‘₯

𝑀2 𝑦 π‘₯ , βˆ€π‘₯, 𝑦 ∈ 𝑆

π‘βˆ— 𝑓 π‘₯1

2𝑀 𝛻𝑓 π‘₯

12𝑀 𝛻𝑓 π‘₯ 𝑓 π‘₯ π‘βˆ—

Page 24: Unconstrained Minimization (I)

Condition Number

Condition Number of a Matrix

is both strongly convex and smooth

Condition number of

Has a strong effect on the efficiency of optimization methods

πœ…π‘€π‘š

πœ… 𝛻 𝑓 π‘₯

Page 25: Unconstrained Minimization (I)

Condition Number

Geometric Interpretations Width of a convex set , in the

direction where

Minimum width and maximum width of

Condition number of cond 𝐢 is small implies

𝐢 it is nearly spherical

π‘Š 𝐢, π‘ž sup∈

π‘ž 𝑧 inf∈

π‘ž 𝑧

π‘Š inf π‘Š 𝐢, π‘ž , π‘Š sup

π‘Š 𝐢, π‘ž

cond πΆπ‘Šπ‘Š

Page 26: Unconstrained Minimization (I)

Condition Number

Geometric Interpretations -sublevel set of

is both strongly convex and smooth

Condition number of

𝐢 π‘₯ 𝑓 π‘₯ 𝛼 , π‘βˆ— 𝛼 𝑓 π‘₯

cond 𝐢 πœ…π‘€π‘š

𝐡 βŠ† 𝐢 βŠ† 𝐡

𝐡 𝑦 𝑦 π‘₯βˆ— 2 𝛼 π‘βˆ—

𝑀

/

𝐡 𝑦 𝑦 π‘₯βˆ— 2 𝛼 π‘βˆ—

π‘š

/

Page 27: Unconstrained Minimization (I)

Discussions

Parameters and Known only in rare cases Unknown in general

They are conceptually useful The convergence behavior of

optimization algorithms depend on them Characterize the convergence rate

In Practice Estimate their values Design parameter-free algorithms

Page 28: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 29: Unconstrained Minimization (I)

Iterative Methods

A Minimizing Sequence

is the the iteration number is the output of iterative methods is the step or search direction is the step size or step length

Shorthand

Page 30: Unconstrained Minimization (I)

Descent Methods

Descent Methods

Except when is optimal The search direction makes an acute

angle with the negative gradient𝛻𝑓 π‘₯ Ξ”π‘₯ 0

𝑓 π‘₯ 𝑓 π‘₯ 𝛻𝑓 π‘₯ π‘₯ π‘₯

𝛻𝑓 π‘₯ Ξ”π‘₯ 0 β‡’ 𝛻𝑓 π‘₯ π‘₯ π‘₯ 0 β‡’ 𝑓 π‘₯ 𝑓 π‘₯

Page 31: Unconstrained Minimization (I)

Descent Methods

Descent Methods

Except when is optimal The search direction makes an acute

angle with the negative gradient

is called as descent direction

𝛻𝑓 π‘₯ Ξ”π‘₯ 0

Page 32: Unconstrained Minimization (I)

General Descent Method

The AlgorithmGiven a starting point π‘₯ ∈ dom 𝑓Repeat

1. Determine a descent direction Ξ”π‘₯.

2. Line search: Choose a step size 𝑑 0.3. Update: π‘₯ ≔ π‘₯ π‘‘βˆ†π‘₯.

until stopping criterion is satisfied.

Line Search Determine the next iterate along the line

Page 33: Unconstrained Minimization (I)

General Descent Method

The AlgorithmGiven a starting point π‘₯ ∈ dom 𝑓Repeat

1. Determine a descent direction Ξ”π‘₯.

2. Line search: Choose a step size 𝑑 0.3. Update: π‘₯ ≔ π‘₯ π‘‘βˆ†π‘₯.

until stopping criterion is satisfied.

Stopping Criterion

Page 34: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 35: Unconstrained Minimization (I)

Exact Line Search

Minimize along the Ray

The cost of the minimization problem with one variable is low

The minimizer along the ray can be found analytically

Page 36: Unconstrained Minimization (I)

Outline Unconstrained Minimization Problems Basic Terminology Examples Strong Convexity Smoothness

Descent Methods General Descent Method Exact Line Search Backtracking Line Search

Page 37: Unconstrained Minimization (I)

Backtracking Line Search

Most line searches used in practice are inexact Approximately minimize along the ray Just reduce β€˜enough’

Backtracking Line Searchgiven a descent direction βˆ†π‘₯ for 𝑓 at π‘₯ ∈𝐝𝐨𝐦 𝑓, 𝛼 ∈ 0, 0.5 , 𝛽 ∈ 0, 1𝑑 ≔ 1

while 𝑓 π‘₯ 𝑑Δπ‘₯ 𝑓 π‘₯ 𝛼𝑑𝛻𝑓 π‘₯ βˆ†π‘₯, 𝑑 ≔ 𝛽𝑑

Page 38: Unconstrained Minimization (I)

Backtracking Line Search

The line search is called backtracking It starts with unit step size and then

reduces it by the factor

It eventually terminates is a descent direction, i.e., For small enough

𝛼 is the fraction of the decrease in 𝑓 predicted by linear extrapolation that we will accept

𝑑 ≔ 1 , 𝑑 ≔ 𝛽𝑑

𝑓 π‘₯ 𝑑Δπ‘₯ 𝑓 π‘₯ 𝑑𝛻𝑓 π‘₯ βˆ†π‘₯ 𝑓 π‘₯ 𝛼𝑑𝛻𝑓 π‘₯ βˆ†π‘₯

Page 39: Unconstrained Minimization (I)

Backtracking Line Search

Graph InterpretationThe backtracking exit inequality𝑓 π‘₯ 𝑑Δπ‘₯ 𝑓 π‘₯ 𝛼𝑑𝛻𝑓 π‘₯ Ξ”π‘₯holds for 𝑑 ∈ 0, 𝑑

Page 40: Unconstrained Minimization (I)

Backtracking Line Search

Graph InterpretationThe backtracking line search stopswith a step length 𝑑 that satisfies

𝑑 1, or 𝑑 ∈ 𝛽𝑑 , 𝑑

Page 41: Unconstrained Minimization (I)

Backtracking Line Search

Graph InterpretationThe backtracking line search stopswith a step length 𝑑 that satisfies

𝑑 min 1, 𝛽𝑑

Page 42: Unconstrained Minimization (I)

Backtracking Line Search

Require

A Practical Implementation1. Multiply by until 2. Check whether the above inequality

holds is typically chosen between and is often chosen between and

Page 43: Unconstrained Minimization (I)

Summary

Unconstrained Minimization Problems First-order Optimality Condition Strong Convexity and Implications Smoothness and Implications

Descent Methods General Descent Method Exact Line Search Backtracking Line Search