convex optimization a journey of 60 yearsdimitrib/convex_opt_paths_ahead_s… · lids paths ahead...

14
Convex Optimization A Journey of 60 Years Dimitri P. Bertsekas Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology LIDS Paths Ahead Symposium, 2009

Upload: others

Post on 16-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Convex OptimizationA Journey of 60 Years

Dimitri P. Bertsekas

Department of Electrical Engineering and Computer ScienceMassachusetts Institute of Technology

LIDS Paths Ahead Symposium, 2009

Page 2: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

History and Prehistory

Prehistory: Early 1900s - 1949.Caratheodory, Minkowski, Steinitz, Farkas.Properties of convex sets and functions.

Fenchel - Rockafellar era: 1949 - mid 1980s.Duality theory.Minimax/game theory (von Neumann).(Sub)differentiability, optimality conditions, sensitivity.

Modern era - Paradigm shift: Mid 1980s - present.Nonsmooth analysis (a theoretical/esoteric direction).Algorithms (a practical/high impact direction).A change in the assumptions underlying the field.

Page 3: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Duality

Two different views of the same object.

Example: Dual description of signals.

A union of points An intersection of hyperplanesTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing ProblemWeak Duality q∗ ≤ w∗

minimize w

subject to (0, w) ∈ M,

1

A union of points An intersection of hyperplanesTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing ProblemWeak Duality q∗ ≤ w∗

minimize w

subject to (0, w) ∈ M,

1

Dual description of closed convex sets

A union of points An intersection of hyperplanesTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing ProblemWeak Duality q∗ ≤ w∗

minimize w

subject to (0, w) ∈ M,

1

A union of its points An intersection of halfspacesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

Page 4: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Dual Description of Convex Functions

Define a closed convex function by its epigraph.Describe the epigraph by hyperplanes.Associate hyperplanes with crossing points (the conjugate function).

4

Polyhedral Convexity Template

infx!"n!f(x)! x#y

"= !h(y) (y, 1) Slope = y x

epi(f) w (µ, 1) q(µ)w$ is uniformly distributed in the interval [!1, 1]! ! f!(!) X = x Measurement

(µ, ")

3 5 9 11 13 4 10 1/6

Mean SquaredLeast squares estimate

E[! | X = x]

X = ! + W

M M = epi(p) (0, w$) epi(p)E[!] var(!) Hyperplane {x | y#x = 0}cone({a1, . . . , ar})u v M

(µ, 1)

C1

1

4

Polyhedral Convexity Template

infx!"n!f(x)! x#y

"= !h(y) (y, 1) Slope = y x

f(x) = (c/2)x2

f(x) = |x|

f(x) = !! "

h(y) = (1/2c)y2

epi(f) w (µ, 1) q(µ)w$ is uniformly distributed in the interval [!1, 1]! # f!(#) X = x Measurement

(µ, ")

3 5 9 11 13 4 10 1/6

Mean SquaredLeast squares estimate

E[! | X = x]

1

4

Polyhedral Convexity Template

f(x) = supy!"n

!y#x! h(y)

"

infx!"n

!f(x)! x#y

"= !h(y) (y, 1) Slope = ! x 0

x#y ! h(y)

f(x) = (c/2)x2

f(x) = |x|

f(x) = !x ! "

"/!

" ! !1 1epi(f) w (µ, 1) q(µ)w$ is uniformly distributed in the interval [!1, 1]! # f!(#) X = x Measurement

(µ, ")

3 5 9 11 13 4 10 1/6

1

4

Polyhedral Convexity Template

Indicator FunctionsSupport Functions

(!x, 1)

(!y, 1)

rf (x)

y

f(x) = supy!"n

!y#x! h(y)

"

y#x! h(y)

infx!"n!f(x)! x#y

"= !h(y)

(y, 1)

Slope = 2

!

Slope = 1/2

1

4

Polyhedral Convexity Template

Indicator FunctionsSupport Functions

(!x, 1)

(!y, 1)

rf (x)

y

f(x) = supy!"n

!y#x! h(y)

"

y#x! h(y)

infx!"n!f(x)! x#y

"= !h(y)

(y, 1)

Slope = 2

!

Slope = 1/2

1

infx!"n

{f(x)! x#y} = !f!(y),

State sampling according to Markov chain P

(µ, 0)(a) (b) (c)

(0, 0) X (0, 1) cone(X) conv(X)

x" = !x+(1!!)x C x ! " x S S" x4 f(x) f(z) !f(x)+ (1!!)f(y)0 !"

dom(f)

f!!x + (1! !)y

"C x y z x1 x2 x3 x4 f(x) f(z) !f(x) + (1! !)f(y) 0

x x$

!f(x) + (1! !)f(y) C x y f(x) f(z) z = !x + (1 ! !)y

f(z) + (y ! z)#"f(z) f(z) + (x! z)#"f(z)

#x | f(x) # #

$

x + !(z ! x)

f(x) +f!x + !(z ! x)

"! f(x)

!

e1 e2 e3 e4 yk zk xk xk+1 0

1

Primal DescriptionDual DescriptionValues f(x) Crossing points f∗(y)A union of points An intersection of halfspacesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap Decomposition

1

Primal DescriptionDual DescriptionValues f(x) Crossing points f∗(y)A union of points An intersection of halfspacesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap Decomposition

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

A union of points An intersection of halfspacesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

A union of points An intersection of halfspacesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

1

Page 5: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Fenchel Duality Framework - The Primal Problem

> 0

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

1

Page 6: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Fenchel Primal and Dual Problem Descriptions

> 0

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗2 (−y)

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

1

Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗2 (−y)

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

1

Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗2 (−y)

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

1

Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗2 (−y)

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

1

Page 7: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Fenchel Duality

> 0

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

a1 a2 x 0

(a) (b) (c) Slope λ Slope λ∗ f1(x) −f2(x) h1(λ) + h2(−λ) x∗ f∗ − q∗

cone�{a1, a2, a3}

�{x | a�

jx ≤ 0, j = 1, 2, 3}

D C C∗ y z x H P P C ∩H1 C ∩H1 ∩H2

{y | y�a1 ≤ 0} {y | y�a2 ≤ 0}(0, w∗) w a1 a2 a3 a4 a5 c1 c2 v1 v2 v3

c = µ∗1a1 + µ∗

2a2

β α −1 1 0 N(A) R(A�) D = P ∩ aff(C) M = H ∩ aff(C)

�(g(x), f(x)) | x ∈ X

C = aff(C)∩ (Closed Halfspace Containing C)

M =�(u, w) | there exists x ∈ X such that g(x) ≤ u, f(x) ≤ w

M =�(u, w) | g(x) ≤ u, f(x) ≤ w for some x ∈ C

Separating Hyperplane H that Properly Separates C and D C and Pβ α −1 1 0

h(y) = (1/2c)y2

h(y) =�

0 if |y| ≤ 1∞ if |y| > 1

h(y) =�

β if y = α∞ if y �= α

epi(f) w (µ, 1) q(µ)3 5 9 11 1

3 4 10 1/6cone({a1, . . . , ar})u v M a

(µ, 1)

C1

C2

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

Primal Description

Dual Description

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗1 (y) + f∗2 (−y) f∗2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces minx

�f1(x) + f2(x)

Abstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

1

minx

�f1(x) + f2(x)

�= max

y

�− f�

1 (y)− f�2 (−y)

f�2,Xk

(−λ)

�B(x) ��B(x) S �� < �

Boundary of S

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u,w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing Problem

1

Page 8: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

A More Abstract View of Duality

Despite its elegance, the Fenchel framework is somewhat indirect.From duality of set descriptions, to

duality of functional descriptions, toduality of problem descriptions.

A more direct approach:Start with a set, then to

two simple prototype problems dual to each other.

Avoid functional descriptions (a simpler, less constrained framework).

Page 9: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Min-Common/Max-Crossing Duality

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

0

(a)

Min Common Point w*

Max Crossing Point q*

M

0

(b)

M

_M

Max Crossing Point q*

Min Common Point w*w w

u

0

(c)

S

_M

MMax Crossing Point q*

Min Common Point w*

w

u

u

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x Wk Nk Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Nonvertical Vertical

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Hyperplane

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

Negative Halfspace {x | a!x ! b}Positive Halfspace {x | a!x " b}

a!(C) C C # S" d z x

Hyperplane {x | a!x = b} = {x | a!x = a!x}

x# x f!!x# + (1 $ !)x

"

x x#

x0 $ d x1 x2 x x4 $ d x5 $ d d

x0 x1 x2 x3

a0 a1 a2 a3

f(z)

z

X 0 u w (µ, ") (u, w)µ

"

!u + w

#X(y)/%y%

x M M Wk y C2 C C2k+1 yk AC

C = C + S"

Min Common Point w#

Max Crossing Point q#

Level Sets of f Constancy Space Lf #$k=0Ck Rf

Level Sets of f " ! $1 1(µ, 0) cl(C)

1

w(µ + !µ, !")

(a) (b) (c)

(µ, 1)

C1

C

2

w(µ + !µ, !")

(a) (b) (c)

(µ, 1)

C1

C

2

w(µ + !µ, !")

(a) (b) (c)

(µ, 1)

C1

C

2

Page 10: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Abstract Framework for Duality Analysis

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

A union of points An intersection of hyperplanesAbstract Min-Common/Max-Crossing TheoremsMinimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

f�2,Xk

(−λ)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i ≤ k

f�(λ)

Constant− f�1 (λ) f�

2 (−λ) F ∗2,k(−λ)F ∗

k (λ)

�(g(x), f(x)) | x ∈ X

M =�(u, w) | there exists x ∈ X

Outer Linearization of f

F (x) H(y) y h(y)

supz∈Z

infx∈X

φ(x, z) ≤ supz∈Z

infx∈X

φ(x, z) = q∗ = p(0) ≤ p(0) = w∗ = infx∈X

supz∈Z

φ(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si ⊂ �n,i = 1, . . . ,mIf s ∈ conv(S) then s = s1 + · · · + sm wheresi ∈ conv(Si) for all i = 1, . . . ,m,si ∈ Si for at least m− n− 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q∗ = (cl )p(0) ≤ p(0) = w∗

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common Problem

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality ( MinMax = MaxMin )

Constrained Optimization Duality

Theorems of the Alternative etc

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality ( MinMax = MaxMin )

Constrained Optimization Duality

Abstract Geometric Framework

1

Special choices of M

f!2,Xk

(!!)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i " k

f!(!)

Constant! f!1 (!) f!

2 (!!) F !2,k(!!)F !

k (!)

!(g(x), f(x)) | x # X

"

M =!(u, w) | there exists x # X

Outer Linearization of f

F (x) H(y) y h(y)

supz"Z

infx"X

"(x, z) " supz"Z

infx"X

"(x, z) = q! = p(0) " p(0) = w! = infx"X

supz"Z

"(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si $ %n,i = 1, . . . , mIf s # conv(S) then s = s1 + · · · + sm wheresi # conv(Si) for all i = 1, . . . , m,si # Si for at least m! n! 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q! = (cl )p(0) " p(0) = w!

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing ProblemWeak Duality q! " w!

minimize w

subject to (0, w) # M,

1

Special choices of M

f!2,Xk

(!!)

x1 x2 x3 x4 Slope: xk+1 Slope: xi, i " k

f!(!)

Constant! f!1 (!) f!

2 (!!) F !2,k(!!)F !

k (!)

!(g(x), f(x)) | x # X

"

M =!(u, w) | there exists x # X

Outer Linearization of f

F (x) H(y) y h(y)

supz"Z

infx"X

"(x, z) " supz"Z

infx"X

"(x, z) = q! = p(0) " p(0) = w! = infx"X

supz"Z

"(x, z)

Shapley-Folkman Theorem: Let S = S1 + · · · + Sm with Si $ %n,i = 1, . . . , mIf s # conv(S) then s = s1 + · · · + sm wheresi # conv(Si) for all i = 1, . . . , m,si # Si for at least m! n! 1 indices i.

The sum of a large number of convex sets is almost convexNonconvexity of the sum is caused by a small number (n + 1) of sets

f(x) = (cl )f(x)

q! = (cl )p(0) " p(0) = w!

Duality Gap DecompositionConvex and concave part can be estimated separatelyq is closed and concaveMin Common ProblemMax Crossing ProblemWeak Duality q! " w!

minimize w

subject to (0, w) # M,

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondi!erentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Di!erentials

Values f(x) Crossing points f!(y)

!f!1 (y) f!

1 (y) + f!2 (!y) f!

2 (!y)

Slope y! Slope y

A union of points An intersection of halfspaces

minx

!f1(x) + f2(x)

"= maxy

!f!1 (y) + f!

2 (!y)"

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality ( MinMax = MaxMin )

Constrained Optimization Duality

Abstract Geometric Framework (Set M)

1

Page 11: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

The Modern Era: Duality Coupled with Algorithms

Traditional view: Pre 1990sLPs are solved by simplex method (G. Dantzig view).NLPs are solved by gradient/Newton methods (M. Powell view).Convex programs are special cases of NLPs.

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

Modern view: Post 1990sLPs are often solved by nonsimplex/convex methods.Convex problems are often solved by the same methods as LPs.“Key distinction is not Linear-Nonlinear but Convex-Nonconvex" (Rockafellar)

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Traditional view: Pre 1990s

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

LP CONVEX NLP

Simplex

Gradient/Newton

Duality

Subgradient Cutting plane Interior point Subgradient

LPs are solved by simplex method

NLPs are solved by gradient/Newton methods.

Convex programs are special cases of NLPs.

Modern view: Post 1990s

LPs are often best solved by nonsimplex/convex methods.

Convex problems are often solved by the same methods as LPs.

Nondifferentiability and piecewise linearity are common features.Primal Description

Dual Description

Vertical Distances

Crossing Point Differentials

Values f(x) Crossing points f∗(y)

−f∗1 (y) f∗

1 (y) + f∗2 (−y) f∗

2 (−y)

Slope y∗ Slope y

A union of points An intersection of halfspaces

minx

�f1(x) + f2(x)

�= maxy

�f∗1 (y) + f∗

2 (−y)�

Abstract Min-Common/Max-Crossing Theorems

Minimax Duality (minmax=maxmin)Constrained Optimization DualityTheorems of the Alternative etcTime domain Frequency domain

1

Page 12: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Methodological Trends

Convex programs and LPs connect around duality and large-scalepiecewise linear problems.

New methods, renewed interest in old methods.Interior point methodsSubgradient methodsPolyhedral approximation/cutting plane methodsRegularization/proximal methods

Renewed emphasis on complexity analysis.Nesterov, Nemirovski, and others ...“Optimal algorithms" (e.g., extrapolated gradient methods)

Page 13: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Synergy Between Duality, Algorithms, and Applications

Duality-based decomposition.Large-scale resource allocationLagrangian relaxation, discrete optimizationStochastic programming

Conic programming.Robust optimizationSemidefinite programming

Machine learning.Support vector machinesl1 regularization/Robust regression/Compressed sensingIncremental methods

Page 14: Convex Optimization A Journey of 60 Yearsdimitrib/Convex_Opt_Paths_Ahead_S… · LIDS Paths Ahead Symposium, 2009. History and Prehistory Prehistory:Early 1900s - 1949. Caratheodory,

Speculation - What’s Next?

“It is hard to predict, especially about the future" (N. Bohr)

Very large problems/new applications (?)Problems with network overlays (e.g., smart grids).Huge data sets in machine learning.

New approaches to large size and complexity (?)Approximate dynamic programming paradigm (e.g., LP-based dynamicprogramming).Reduced space approximations.Sampling mechanisms.

Better hardware/better algorithms multiplier effect (?)

A new paradigm shift (?)