Introduction to constrained optimization - optimality conditions Jussi Hakanen Post-doctoral researcher [email protected] spring 2014 TIES483 Nonlinear optimization



Structure of optimization methods

Typically:
– Constraint handling converts the problem to (a series of) unconstrained problems
– In unconstrained optimization, a search direction is determined at each iteration
– The best solution along the search direction is found with a line search

Constraint handling method → Unconstrained optimization → Line search

Group discussion

Discuss in small groups (3-4) for 20-25 minutes. Each group writes down its answers to the following questions. At the end, we summarize what each group found.

Group discussion: questions

1. What does optimality mean when there are constraints? (Draw figures of different situations.)
2. What kinds of optimality conditions exist for constrained optimization? (Use formulas.)
3. List methods for unconstrained optimization.
– What are their general ideas?
– Why do the methods for unconstrained optimization not work with constraints?

min f(x), s.t. x ∈ S ⊂ Rⁿ

S is the feasible region: it consists of all the points that satisfy the constraints.

Constraints can be e.g.
– Box constraints: xᵢˡ ≤ xᵢ ≤ xᵢᵘ, i = 1, …, n
– Inequality constraints: gⱼ(x) ≤ 0, j = 1, …, m
– Equality constraints: hⱼ(x) = 0, j = 1, …, l

Special case: linear constraints
– Ax ≤ b (i.e., g(x) = Ax − b)
– Ax = b (i.e., h(x) = Ax − b)

[Figure: constrained vs. unconstrained problem. Adopted from Prof. L.T. Biegler (Carnegie Mellon University)]

[Figure: inequality constraints. Adopted from Prof. L.T. Biegler (Carnegie Mellon University)]

[Figure: inequality and equality constraints. Adopted from Prof. L.T. Biegler (Carnegie Mellon University)]

Feasible descent directions

Definition: Let S ⊂ Rⁿ, S ≠ ∅ and x* ∈ cl S. Then
D = {d ∈ Rⁿ | d ≠ 0 and x* + αd ∈ S ∀α ∈ (0, δ)}
for some δ > 0 is the cone of feasible directions of S at x*.
– Each d ∈ D, d ≠ 0, is a feasible direction.

Definition:
F = {d ∈ Rⁿ | f(x* + αd) < f(x*) ∀α ∈ (0, δ)}
for some δ > 0 is the cone of descent directions at x*.

Note: The set F ∩ D = {d ∈ Rⁿ | d ∈ F and d ∈ D} is the cone of feasible descent directions.
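These definitions can be probed numerically: take a small step α along a candidate direction d and test whether the step stays in S (then d ∈ D) and decreases f (then d ∈ F). A minimal Python sketch; the problem used here (f(x) = x₁² + x₂², S = {x | x₁ + x₂ ≥ 1}) is an illustrative assumption, not taken from the slides:

```python
# Numerical probe of the cones D (feasible directions) and F (descent
# directions) at a point x_star. Objective and feasible set are
# illustrative assumptions chosen for this sketch.

def f(x):                      # objective: f(x) = x1^2 + x2^2
    return x[0]**2 + x[1]**2

def feasible(x):               # S = {x | x1 + x2 >= 1}
    return x[0] + x[1] >= 1

def in_D(x_star, d, alpha=1e-4):
    """d is a feasible direction if a small step along d stays in S."""
    x_new = [x_star[i] + alpha * d[i] for i in range(2)]
    return feasible(x_new)

def in_F(x_star, d, alpha=1e-4):
    """d is a descent direction if a small step along d decreases f."""
    x_new = [x_star[i] + alpha * d[i] for i in range(2)]
    return f(x_new) < f(x_star)

x_star = [0.5, 0.5]            # on the boundary x1 + x2 = 1
print(in_D(x_star, [1, 1]))    # points into S        -> True
print(in_D(x_star, [-1, -1]))  # points out of S      -> False
print(in_F(x_star, [-1, -1]))  # decreases f          -> True
```

At x* = (0.5, 0.5) the objective decreases toward the origin, but every such direction leaves S, so F ∩ D = ∅ there; this is consistent with x* being the constrained minimizer of this toy problem.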

A necessary condition

Consider min f(x), s.t. x ∈ S ⊂ Rⁿ, where S ≠ ∅.

Theorem: Let f be differentiable at x* ∈ S. If x* is a local minimizer, then F° ∩ D = ∅, where F° = {d ∈ Rⁿ | ∇f(x*)ᵀd < 0}.
– That is, no descent direction at x* is feasible.

[Figure: unconstrained minimizer and level curves of f. From Miettinen: Nonlinear optimization, 2007 (in Finnish)]

Problem with only inequality constraints

Consider
min f(x), s.t. gᵢ(x) ≤ 0, i = 1, …, m

S = {x ∈ Rⁿ | gᵢ(x) ≤ 0 ∀ i = 1, …, m}, where gᵢ: Rⁿ → R ∀i.

Let I denote the set of active constraints at x*, i.e., I = {i | gᵢ(x*) = 0}.

A necessary condition (for inequality constraints only)

Define G° = {d ∈ Rⁿ | ∇gᵢ(x*)ᵀd < 0 ∀i ∈ I}.

Theorem: Let f and gᵢ be continuously differentiable at x* ∈ S. If x* is a local minimizer, then F° ∩ G° = ∅.

Example

min f(x) = (x₁ − 3)² + (x₂ − 2)²
s.t. x₁² + x₂² ≤ 5, x₁ + x₂ ≤ 3, x₁, x₂ ≥ 0

At x̄ = (9/5, 6/5)ᵀ we have I = {2} and F° ∩ G° ≠ ∅
→ x̄ is not a local minimizer!
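The claim F° ∩ G° ≠ ∅ at x̄ = (9/5, 6/5)ᵀ can be checked by exhibiting a single direction in both cones. The direction d = (3, −4)ᵀ below is an assumption found by hand; any d with ∇f(x̄)ᵀd < 0 and ∇g₂(x̄)ᵀd < 0 would do:

```python
# Verify F° ∩ G° ≠ ∅ at x̄ = (9/5, 6/5) for the example above by
# exhibiting one direction d lying in both cones.
# d = (3, -4) is an assumption (found by hand).

x = (9/5, 6/5)

grad_f  = (2*(x[0] - 3), 2*(x[1] - 2))   # ∇f = (2(x1-3), 2(x2-2))
grad_g2 = (1.0, 1.0)                      # ∇g2 for g2(x) = x1 + x2 - 3

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

d = (3.0, -4.0)
print(dot(grad_f, d))    # ≈ -0.8 < 0  -> d ∈ F°
print(dot(grad_g2, d))   # -1.0 < 0    -> d ∈ G°
```

Both dot products are negative, so d is a feasible-cone descent direction and the necessary condition F° ∩ G° = ∅ fails at x̄.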

[Figure: From Miettinen: Nonlinear optimization, 2007 (in Finnish)]

Notes

If ∇f(x*) = 0, then F° = ∅ and so F° ∩ G° = ∅.
– All critical points satisfy the necessary condition.

Any x* ∈ S with ∇gᵢ(x*) = 0 for some i ∈ I also satisfies the necessary condition, since then G° = ∅.

Fritz John conditions (for inequality constraints only)

Necessary conditions: Let f and gᵢ be continuously differentiable at x* ∈ S. If x* is a local minimizer, then there exist multipliers λ ≥ 0 and μᵢ ≥ 0, i = 1, …, m, such that (λ, μ) ≠ (0, 0) and
1) λ∇f(x*) + Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*) = 0
2) μᵢgᵢ(x*) = 0 for all i = 1, …, m.

The multipliers are usually called Lagrange multipliers.
– min L(x, μ) = f(x) + Σᵢ₌₁ᵐ μᵢgᵢ(x), s.t. x ∈ Rⁿ
Conditions 2) are called complementarity conditions.

Example

[Figure: From Miettinen: Nonlinear optimization, 2007 (in Finnish)]

min f(x) = x₁
s.t. x₂ − (1 − x₁)³ ≤ 0, −x₂ ≤ 0

x* = (1, 0)ᵀ, I = {1, 2}
∇f(x*) = (1, 0)ᵀ, ∇g₁(x*) = (0, 1)ᵀ, ∇g₂(x*) = (0, −1)ᵀ

λ(1, 0)ᵀ + μ₁(0, 1)ᵀ + μ₂(0, −1)ᵀ = (0, 0)ᵀ only if λ = 0!
μ₁ = μ₂ = α (> 0) will do → the necessary conditions are satisfied even though x* is not a local minimizer.
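A quick numerical check of the Fritz John conditions at x* = (1, 0)ᵀ, using λ = 0 and μ₁ = μ₂ = 1 (the choice α = 1 from above):

```python
# Check the Fritz John conditions at x* = (1, 0) for the example above:
# with λ = 0 and μ1 = μ2 = 1, stationarity and complementarity both
# hold, even though x* is not a local minimizer.

x = (1.0, 0.0)

def g1(x):  # g1(x) = x2 - (1 - x1)^3
    return x[1] - (1 - x[0])**3

def g2(x):  # g2(x) = -x2
    return -x[1]

grad_f  = (1.0, 0.0)
grad_g1 = (3*(1 - x[0])**2, 1.0)   # = (0, 1) at x*
grad_g2 = (0.0, -1.0)

lam, mu1, mu2 = 0.0, 1.0, 1.0

# 1) Stationarity: λ∇f + μ1∇g1 + μ2∇g2 = 0
r = tuple(lam*grad_f[i] + mu1*grad_g1[i] + mu2*grad_g2[i]
          for i in range(2))
print(r)                                   # (0.0, 0.0)

# 2) Complementarity: μi gi(x*) = 0
print(mu1*g1(x) == 0, mu2*g2(x) == 0)      # True True
```

The first stationarity component reads λ·1 = 0, which forces λ = 0; this is exactly why the KKT form (λ = 1) cannot hold at this point.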

KKT conditions

Assume that λ > 0, i.e., the gradient of the objective function is always taken into account. Since λ > 0, we can divide by it and assume λ = 1.

For the KKT conditions to hold, some regularity must be assumed of the constraints (constraint qualifications).

Definition: A point x* ∈ S is regular if the gradients of the active constraints ∇gᵢ(x*), i ∈ I, are linearly independent.

KKT conditions (for inequality constraints)

Necessary conditions: Let f and gᵢ be continuously differentiable at a regular x* ∈ S. If x* is a local minimizer, then there exist multipliers μᵢ ≥ 0, i = 1, …, m, such that
1) ∇f(x*) + Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*) = 0
2) μᵢgᵢ(x*) = 0 for all i = 1, …, m.

Note: If the constraints are written as gᵢ(x) ≥ 0, then μᵢ ≤ 0.
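As a concrete check of conditions 1)-2), consider again the earlier example min (x₁ − 3)² + (x₂ − 2)² s.t. x₁² + x₂² ≤ 5, x₁ + x₂ ≤ 3, x₁, x₂ ≥ 0. The candidate x* = (2, 1)ᵀ and multipliers μ = (0, 2, 0, 0) are assumptions found by hand (the first two constraints are both active there); the sketch only verifies that they satisfy the KKT conditions:

```python
# Numerically verify the KKT conditions at x* = (2, 1) for
#   min (x1-3)^2 + (x2-2)^2
#   s.t. x1^2 + x2^2 <= 5, x1 + x2 <= 3, -x1 <= 0, -x2 <= 0.
# Point and multipliers are hand-derived assumptions.

x = (2.0, 1.0)
mu = (0.0, 2.0, 0.0, 0.0)                 # one multiplier per constraint

g = [x[0]**2 + x[1]**2 - 5,               # g1(x) <= 0
     x[0] + x[1] - 3,                     # g2(x) <= 0
     -x[0],                               # g3(x) <= 0
     -x[1]]                               # g4(x) <= 0

grad_f = (2*(x[0] - 3), 2*(x[1] - 2))
grad_g = [(2*x[0], 2*x[1]), (1.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

# 1) Stationarity: ∇f + Σ μi ∇gi = 0
stat = tuple(grad_f[k] + sum(mu[i]*grad_g[i][k] for i in range(4))
             for k in range(2))
print(stat)                               # (0.0, 0.0)

# 2) Complementarity μi gi = 0, plus feasibility and μ >= 0
print(all(abs(mu[i]*g[i]) < 1e-12 for i in range(4)))          # True
print(all(gi <= 1e-12 for gi in g), all(m >= 0 for m in mu))   # True True
```

Since f and all gᵢ here are convex, the sufficient conditions on the next slide then imply that x* = (2, 1)ᵀ is a global minimizer of this example.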

KKT conditions (for inequality constraints)

Sufficient conditions: Let f and gᵢ be continuously differentiable and convex. Consider x* ∈ S. If there exist multipliers μᵢ ≥ 0, i = 1, …, m, such that
1) ∇f(x*) + Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*) = 0
2) μᵢgᵢ(x*) = 0 for all i = 1, …, m,
then x* is a global minimizer.

Note

Alternatively, one can write
−∇f(x*) = Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*)
That is, −∇f(x*) belongs to the cone spanned by the gradients of the active constraints.

Problem with both inequality and equality constraints

Consider
min f(x), s.t. gᵢ(x) ≤ 0, i = 1, …, m, and hⱼ(x) = 0, j = 1, …, l

S = {x ∈ Rⁿ | gᵢ(x) ≤ 0 ∀ i = 1, …, m and hⱼ(x) = 0 ∀ j = 1, …, l}, where gᵢ: Rⁿ → R ∀i and hⱼ: Rⁿ → R ∀j.

KKT conditions

Definition: A point x* ∈ S is regular if the gradients of the active inequality constraints ∇gᵢ(x*), i ∈ I, together with the gradients of the equality constraints ∇hⱼ(x*), j = 1, …, l, are linearly independent.

Necessary conditions: Let f, gᵢ and hⱼ be continuously differentiable at a regular x* ∈ S. If x* is a local minimizer, then there exist multipliers μᵢ ≥ 0, i = 1, …, m, and νⱼ, j = 1, …, l, such that
1) ∇f(x*) + Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*) + Σⱼ₌₁ˡ νⱼ∇hⱼ(x*) = 0
2) μᵢgᵢ(x*) = 0 for all i = 1, …, m.

KKT conditions

Sufficient conditions: Let f and gᵢ be continuously differentiable and convex, and let the hⱼ be affine. Consider x* ∈ S. If there exist multipliers μᵢ ≥ 0, i = 1, …, m, and νⱼ, j = 1, …, l, such that
1) ∇f(x*) + Σᵢ₌₁ᵐ μᵢ∇gᵢ(x*) + Σⱼ₌₁ˡ νⱼ∇hⱼ(x*) = 0
2) μᵢgᵢ(x*) = 0 for all i = 1, …, m,
then x* is a global minimizer.

Example

min f(x) = (x₁ − 3)² + (x₂ − 2)²
s.t. x₁² + x₂² ≤ 5, x₁ + 2x₂ = 4, x₁, x₂ ≥ 0

Formulate the KKT conditions:

2x₁ − 6 + 2μ₁x₁ − μ₂ + ν₁ = 0
2x₂ − 4 + 2μ₁x₂ − μ₃ + 2ν₁ = 0
μ₁(x₁² + x₂² − 5) = 0
−μ₂x₁ = 0
−μ₃x₂ = 0
x₁ + 2x₂ − 4 = 0
μ₁, μ₂, μ₃ ≥ 0
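One consistent solution of this KKT system, obtained by hand (guess the active set {g₁, h₁}, then solve the two stationarity equations), is x* = (2, 1)ᵀ with μ₁ = 1/3, μ₂ = μ₃ = 0, ν₁ = 2/3; the sketch below only verifies that this hand-derived candidate satisfies every equation:

```python
# Verify a hand-derived solution of the KKT system above:
# x* = (2, 1), μ1 = 1/3, μ2 = μ3 = 0, ν1 = 2/3.

x1, x2 = 2.0, 1.0
mu1, mu2, mu3 = 1/3, 0.0, 0.0
nu1 = 2/3

eqs = [
    2*x1 - 6 + 2*mu1*x1 - mu2 + nu1,      # stationarity, x1 component
    2*x2 - 4 + 2*mu1*x2 - mu3 + 2*nu1,    # stationarity, x2 component
    mu1*(x1**2 + x2**2 - 5),              # complementarity, g1
    -mu2*x1,                              # complementarity, g2
    -mu3*x2,                              # complementarity, g3
    x1 + 2*x2 - 4,                        # equality constraint h1
]
print(all(abs(e) < 1e-12 for e in eqs))       # True
print(mu1 >= 0 and mu2 >= 0 and mu3 >= 0)     # True
```

Since f and g₁ are convex and the equality constraint is affine, the sufficient KKT conditions imply that x* = (2, 1)ᵀ is a global minimizer of this problem.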