
1

Difference of Convex (DC) Decomposition of Nonconvex Polynomials with Algebraic Techniques

Georgina Hall, Princeton, ORFE

Joint work with Amir Ali Ahmadi, Princeton, ORFE

7/13/2015 MOPTA 2015

2

DC Decomposition of Nonconvex Polynomials with Algebraic Techniques

Difference of Convex (DC) programming

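• Problems of the form

  $\min_x f_0(x)$ s.t. $f_i(x) \le 0$, $i = 1, \dots, m$,

where:

• $f_i(x) := g_i(x) - h_i(x)$, $i = 0, \dots, m$,
• $g_i$, $h_i$ are convex.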

3


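Concave-Convex Computational Procedure (CCCP)

• Heuristic for solving DC programming problems.
• Has been used extensively in:
  • machine learning (sparse support vector machines (SVM), transductive SVMs, sparse principal component analysis)
  • statistical physics (minimizing Bethe and Kikuchi free energies).
• Idea:
  • Input: initial point $x_0$ and $f_i = g_i - h_i$, $i = 0, \dots, m$; set $k := 0$.
  • Convexify by linearizing $h_i$ around $x_k$: $f_i^k(x) := g_i(x) - (h_i(x_k) + \nabla h_i(x_k)^T (x - x_k))$, which is convex (convex minus affine).
  • Solve convex subproblem: take $x_{k+1}$ to be the solution of $\min f_0^k(x)$ s.t. $f_i^k(x) \le 0$, $i = 1, \dots, m$.
  • Set $k := k + 1$ and repeat.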
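A short derivation, added here as a sketch, of why a CCCP step cannot increase the objective. It uses only the convexity of $h_i$; the notation $f_i^k$ for the convexified function at iterate $x_k$ is the one used on this slide.

```latex
% h_i convex  =>  h_i lies above its linearization at x_k:
\[
h_i(x) \;\ge\; h_i(x_k) + \nabla h_i(x_k)^T (x - x_k)
\quad\Longrightarrow\quad
f_i(x) \;\le\; f_i^k(x) \ \ \forall x,
\qquad f_i(x_k) = f_i^k(x_k).
\]
% Hence any point feasible for the convex subproblem is feasible for the
% original problem, and, since x_{k+1} minimizes f_0^k over a set containing x_k
% (when x_k is feasible),
\[
f_0(x_{k+1}) \;\le\; f_0^k(x_{k+1}) \;\le\; f_0^k(x_k) \;=\; f_0(x_k).
\]
```

The procedure is therefore a descent method, but it remains a heuristic: the limit point need not be a global minimizer.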

4


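Concave-Convex Computational Procedure (CCCP)

• Toy example: $\min_x f(x)$, where $f(x) = g(x) - h(x)$

• Initial point: $x_0$
• Convexify $f$ to obtain $f^0$
• Minimize $f^0$ and obtain $x_1$
• Reiterate

[Figure: successive iterates $x_0, x_1, x_2, x_3, x_4, \dots, x_\infty$.]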
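The steps of the toy example can be sketched in a few lines of Python. The function used on the slide was not recoverable, so the sketch assumes a hypothetical one, $f(x) = x^4 - 2x^2$ with $g(x) = x^4$ and $h(x) = 2x^2$; for this choice the convex subproblem has a closed-form solution. The name `cccp_toy` is illustrative only.

```python
import math


def cccp_toy(x0, num_iters=6):
    """Run CCCP on the hypothetical f(x) = x**4 - 2*x**2 (g = x**4, h = 2*x**2).

    Returns the list of iterates and the list of objective values.
    """
    f = lambda x: x**4 - 2 * x**2
    xs = [float(x0)]
    for _ in range(num_iters):
        xk = xs[-1]
        # Linearizing h at xk gives the convexified function
        #   f_k(x) = x**4 - (2*xk**2 + 4*xk*(x - xk)).
        # Its minimizer solves 4*x**3 = 4*xk, i.e. x = cube root of xk.
        x_next = math.copysign(abs(xk) ** (1.0 / 3.0), xk)
        xs.append(x_next)
    return xs, [f(x) for x in xs]


iterates, values = cccp_toy(2.0)
```

Starting from $x_0 = 2$, the iterates decrease towards the stationary point $x = 1$ and the objective values are nonincreasing. Starting from $x_0 = 0$, the iteration stays at $0$, a stationary point that is not a minimizer, which illustrates that CCCP is only a heuristic.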

5


CCCP for nonconvex polynomial optimization problems (1/2)

CCCP relies on input functions being given as a difference of convex functions.

We will consider polynomials in $n$ variables and of degree $d$.

• Any polynomial can be written as a difference of convex polynomials.
  • Proof by Wang, Schwing and Urtasun
  • Alternative proof given later in this presentation, as a corollary of a stronger theorem

What if we don’t have access to such a decomposition?

6


CCCP for nonconvex polynomial optimization problems (2/2)

• In fact, any polynomial admits an infinite number of decompositions.

Example

Possible decompositions

Which one would be a natural choice for CCCP?
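One way to see that decompositions are never unique: if $f = g - h$ with $g, h$ convex, then $f = (g + q) - (h + q)$ for any convex $q$. The sketch below checks this on a hypothetical univariate polynomial (not the example from the slide, which was not recoverable), with polynomials stored as coefficient lists in ascending degree. All names are illustrative.

```python
def poly_add(p, q):
    """Add two polynomials given as ascending-degree coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_sub(p, q):
    """Subtract q from p (same representation as poly_add)."""
    return poly_add(p, [-b for b in q])


def shifted_decomposition(g, h, c):
    """Return (g + c*x^2, h + c*x^2); both stay convex when c >= 0."""
    q = [0, 0, c]
    return poly_add(g, q), poly_add(h, q)


# Hypothetical example: f(x) = x^4 - 3x^2 + 2x - 2
F = [-2, 2, -3, 0, 1]
G = [0, 0, 0, 0, 1]   # g(x) = x^4          (convex)
H = [2, -2, 3]        # h(x) = 3x^2 - 2x + 2 (convex)

decomps = [shifted_decomposition(G, H, c) for c in (0, 1, 5)]
# Second derivative of each h, here the constant 2 * (3 + c)
curvatures = [2 * h[2] for _, h in decomps]
```

Every pair in `decomps` is a valid DC decomposition of the same polynomial, but the curvature of $h$ grows with `c`, so the linearization used by CCCP becomes a cruder approximation.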

7


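Picking the “best” decomposition (1/2)

Algorithm

Linearize $h$ around a point $x_k$ to obtain a convexified version of $f$.

Idea

Pick $h$ such that it is as close as possible to affine around $x_k$.

Mathematical translation

Minimize the curvature of $h$ ($H_h$ is the Hessian of $h$).

At a point $x_0$:

  $\min_{g,h} \lambda_{\max}(H_h(x_0))$ s.t. $f = g - h$, $g, h$ convex

Over a region $\Omega$:

  $\min_{g,h} \max_{x \in \Omega} \lambda_{\max}(H_h(x))$ s.t. $f = g - h$, $g, h$ convex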
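To make the curvature measure concrete, the sketch below evaluates $\lambda_{\max}(H_h(x))$ at a point for two decompositions of a hypothetical bivariate polynomial, $f(x_1, x_2) = x_1^4 + x_2^4 - 4 x_1 x_2$. The polynomial, both decompositions, and the function names are assumptions made for illustration; the largest eigenvalue of a symmetric 2x2 matrix is computed in closed form.

```python
import math


def lambda_max_sym2(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2.0 + math.sqrt(((a - c) / 2.0) ** 2 + b**2)


def curvature_A(x1, x2):
    # Decomposition A: g = x1^4 + x2^4 + 2(x1^2 + x2^2), h = 2(x1 + x2)^2
    # Hessian of h is the constant matrix [[4, 4], [4, 4]].
    return lambda_max_sym2(4.0, 4.0, 4.0)


def curvature_B(x1, x2):
    # Decomposition B: g = 2*x1^4 + x2^4 + 2(x1^2 + x2^2), h = x1^4 + 2(x1 + x2)^2
    # Hessian of h is [[12*x1^2 + 4, 4], [4, 4]].
    return lambda_max_sym2(12.0 * x1**2 + 4.0, 4.0, 4.0)


at_origin = (curvature_A(0.0, 0.0), curvature_B(0.0, 0.0))
at_e1 = (curvature_A(1.0, 0.0), curvature_B(1.0, 0.0))
```

Both decompositions have the same curvature at the origin, but at $(1, 0)$ decomposition B is strictly worse, which is why the "best" decomposition depends on the point (or region) considered.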

8


Picking the “best” decomposition (2/2)

Theorem: Finding the “best” decomposition of a degree-4 polynomial over a box is NP-hard.

Proof idea: Reduction from the problem of testing convexity of quartic polynomials, which is NP-hard (Ahmadi, Olshevsky, Parrilo, Tsitsiklis).

The same is likely to hold for the point version, but we have been unable to prove it.

How can we efficiently find such a decomposition?

9


Convex relaxations for DC decompositions (1/6)

SOS, DSOS, SDSOS polynomials (Ahmadi, Majumdar)

• Families of nonnegative polynomials.

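Type | Characterization | Testing membership
Sum of squares (sos) | $\exists q_i$ polynomials s.t. $p = \sum_i q_i^2$ | SDP
Scaled diagonally dominant sum of squares (sdsos) | $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} (\beta_{ij}^+ m_i + \gamma_{ij}^+ m_j)^2 + \sum_{i,j} (\beta_{ij}^- m_i - \gamma_{ij}^- m_j)^2$, with $m_i, m_j$ monomials, $\alpha_i \ge 0$ | SOCP
Diagonally dominant sum of squares (dsos) | $p = \sum_i \alpha_i m_i^2 + \sum_{i,j} \beta_{ij}^+ (m_i + m_j)^2 + \sum_{i,j} \beta_{ij}^- (m_i - m_j)^2$, with $m_i, m_j$ monomials, $\alpha_i, \beta_{ij}^+, \beta_{ij}^- \ge 0$ | LP

dsos $\Rightarrow$ sdsos $\Rightarrow$ sos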

10



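DSOS-convex, SDSOS-convex, SOS-convex polynomials

Definitions:
• $p$ is dsos-convex if $y^T H_p(x) y$ is dsos (LP).
• $p$ is sdsos-convex if $y^T H_p(x) y$ is sdsos (SOCP).
• $p$ is sos-convex if $y^T H_p(x) y$ is sos (SDP).

$p$ convex $\Leftrightarrow$ $H_p(x) \succeq 0, \forall x$ $\Leftrightarrow$ $y^T H_p(x) y \ge 0, \forall x, y$ $\Leftarrow$ $y^T H_p(x) y$ sos/sdsos/dsos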
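A small worked example of these definitions, on a polynomial chosen here for illustration (it does not appear on the slides): $p(x_1, x_2) = x_1^4 + x_2^4$. A nonnegative combination of squares of monomials is in particular dsos, so the Hessian form below certifies dsos-convexity directly.

```latex
\[
H_p(x) = \begin{pmatrix} 12 x_1^2 & 0 \\ 0 & 12 x_2^2 \end{pmatrix},
\qquad
y^T H_p(x)\, y = 12\,(x_1 y_1)^2 + 12\,(x_2 y_2)^2 .
\]
% With monomials m_1 = x_1 y_1, m_2 = x_2 y_2 and weights alpha_1 = alpha_2 = 12 >= 0,
% the form y^T H_p(x) y is dsos, so p is dsos-convex, hence also
% sdsos-convex, sos-convex and convex.
```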

11


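Convex relaxations for DC decompositions (3/6)

Comparison of these sets on a parametric family of polynomials:

[Figure: three panels, for $c = -0.5$, $c = 0$ and $c = 1$, each showing in the $(a, b)$ plane the sets of parameters for which the polynomial is dsos-convex, sdsos-convex, and sos-convex = convex.]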

12


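(*) $Q$ is diagonally dominant (dd) $\Leftrightarrow$ $Q_{ii} \ge \sum_{j \ne i} |Q_{ij}|$ for all $i$.

(**) $Q$ is scaled diagonally dominant (sdd) $\Leftrightarrow$ $\exists D \succ 0$ diagonal, s.t. $DQD$ is dd.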
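A minimal sketch of the two matrix conditions, assuming the standard definitions (dd: each diagonal entry is at least the sum of the absolute values of the off-diagonal entries in its row; sdd: some positive diagonal scaling makes the matrix dd). The function names are illustrative, and the scaling for the sdd check is supplied by hand rather than searched for, since finding it in general requires an optimization solver.

```python
def is_dd(Q):
    """Check diagonal dominance of a square matrix given as a list of rows."""
    n = len(Q)
    return all(
        Q[i][i] >= sum(abs(Q[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


def scale(Q, d):
    """Return D*Q*D for D = diag(d)."""
    n = len(Q)
    return [[d[i] * Q[i][j] * d[j] for j in range(n)] for i in range(n)]


def is_dd_after_scaling(Q, d):
    """Certify that Q is sdd using a given positive scaling d."""
    return all(di > 0 for di in d) and is_dd(scale(Q, d))


Q = [[1, 2], [2, 5]]   # positive definite, but the first row violates dd
d = [9, 4]             # hand-picked scaling: D*Q*D = [[81, 72], [72, 80]]
```

Here `Q` is not dd but is sdd, which shows that sdd matrices form a strictly larger set than dd matrices.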


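How to use these concepts to do DC decomposition at a point $x_0$?

Original problem

  $\min \lambda_{\max}(H_h(x_0))$ s.t. $f = g - h$, $g, h$ convex

  $\Leftrightarrow$

  $\min t$ s.t. $tI - H_h(x_0) \succeq 0$, $f = g - h$, $g, h$ convex

Relaxation 1: sos-convex (SDP)

  $\min t$ s.t. $tI - H_h(x_0) \succeq 0$, $f = g - h$, $g, h$ sos-convex

Relaxation 2: sdsos-convex (SOCP + “small” SDP)

  $\min t$ s.t. $tI - H_h(x_0) \succeq 0$, $f = g - h$, $g, h$ sdsos-convex

Relaxation 3: dsos-convex (LP + “small” SDP)

  $\min t$ s.t. $tI - H_h(x_0) \succeq 0$, $f = g - h$, $g, h$ dsos-convex

Relaxation 4: sdsos-convex + sdd (SOCP)

  $\min t$ s.t. $tI - H_h(x_0)$ sdd (**), $f = g - h$, $g, h$ sdsos-convex

Relaxation 5: dsos-convex + dd (LP)

  $\min t$ s.t. $tI - H_h(x_0)$ dd (*), $f = g - h$, $g, h$ dsos-convex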
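To illustrate what replacing the psd constraint by a dd constraint costs, the sketch below fixes a Hessian $H$ (in the relaxations $h$ is itself a decision variable; it is frozen here for illustration) and compares the smallest $t$ with $tI - H$ dd against $\lambda_{\max}(H)$. For fixed $H$ the dd condition reads $t \ge H_{ii} + \sum_{j \ne i} |H_{ij}|$ for every row, a Gershgorin-type bound. The matrix and the function names are assumptions.

```python
import math


def smallest_t_dd(H):
    """Smallest t such that t*I - H is diagonally dominant."""
    n = len(H)
    return max(
        H[i][i] + sum(abs(H[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


def t_minus_H_is_dd(H, t):
    """Check the dd condition on t*I - H row by row."""
    n = len(H)
    return all(
        t - H[i][i] >= sum(abs(H[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


def lambda_max_sym2(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2.0 + math.sqrt(((a - c) / 2.0) ** 2 + b**2)


H = [[16.0, 4.0], [4.0, 4.0]]       # hypothetical Hessian of h at the point
t_dd = smallest_t_dd(H)             # value of the dd-constrained problem
t_psd = lambda_max_sym2(16.0, 4.0, 4.0)  # value of the psd-constrained problem
```

The dd value is an upper bound on $\lambda_{\max}(H)$, obtained with linear constraints only; this is the trade-off between Relaxations 3 and 5.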

13



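Can any polynomial be written as the difference of two dsos/sdsos/sos-convex polynomials?

Lemma about cones: Let $K \subseteq E$ be a full-dimensional cone ($E$ any vector space). Then any $v \in E$ can be written as $v = k_1 - k_2$, with $k_1, k_2 \in K$.

Proof sketch: Let $k$ be in the interior of $K$. Then $\exists \alpha < 1$ such that

  $k' := (1 - \alpha) v + \alpha k \in K$

  $\Leftrightarrow$ $v = \frac{1}{1 - \alpha} k' - \frac{\alpha}{1 - \alpha} k$,

with $k_1 := \frac{1}{1 - \alpha} k' \in K$ and $k_2 := \frac{\alpha}{1 - \alpha} k \in K$.

[Figure: the cone $K$ inside $E$, with the points $k$, $k'$ and $v$.]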
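The proof sketch is constructive, and can be tried on a simple cone. The sketch below assumes $K$ is the nonnegative orthant in $\mathbb{R}^n$ with interior point $k = (1, \dots, 1)$ (an illustrative choice, not the cone of dsos-convex polynomials used in the talk) and searches for $\alpha$ of the form $1 - 2^{-j}$. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def split_in_orthant(v, max_halvings=60):
    """Write v = k1 - k2 with k1, k2 in the nonnegative orthant.

    Follows the proof sketch: k is the all-ones vector, and alpha < 1 is
    pushed towards 1 until k' = (1 - alpha) * v + alpha * k lies in the cone.
    """
    v = [Fraction(x) for x in v]
    k = [Fraction(1)] * len(v)
    for j in range(1, max_halvings + 1):
        alpha = 1 - Fraction(1, 2**j)
        k_prime = [(1 - alpha) * vi + alpha * ki for vi, ki in zip(v, k)]
        if all(x >= 0 for x in k_prime):
            k1 = [x / (1 - alpha) for x in k_prime]
            k2 = [alpha * ki / (1 - alpha) for ki in k]
            return alpha, k1, k2
    raise ValueError("no suitable alpha found within max_halvings")


alpha, k1, k2 = split_in_orthant([3, -5, 0])
```

For `v = [3, -5, 0]` the search stops at `alpha = 7/8`, and `k1 - k2` recovers `v` with both parts entrywise nonnegative.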

14



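Theorem: Any polynomial can be written as the difference of two dsos-convex polynomials.

Corollary: The same holds for sdsos-convex, sos-convex and convex polynomials.

Proof idea:
• Need to show that the set of dsos-convex polynomials is a full-dimensional cone.
• “Obvious” choices do not work.
• Induction on $n$: for $n = 2$, take a polynomial with coefficients $a_0, a_1, \dots, a_{d/4}$ satisfying

  $a_1 = 1$, $a_{k+1} = \left( \frac{d - 2k}{2k + 2} \right) a_k$, $k = 1, \dots, \frac{d}{4} - 1$,

  $a_0 > \frac{2(d - 2)}{d(d - 1)} + \frac{d}{4(d - 1)} a_{d/4}$.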
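The coefficient recursion can be evaluated exactly for a given degree. The sketch below assumes the formulas on this slide read $a_1 = 1$, $a_{k+1} = \frac{d - 2k}{2k + 2} a_k$ for $k = 1, \dots, d/4 - 1$, and $a_0 > \frac{2(d-2)}{d(d-1)} + \frac{d}{4(d-1)} a_{d/4}$; this reading of the garbled slide is an assumption, as are the function name and the requirement that $d$ be a multiple of 4.

```python
from fractions import Fraction


def proof_coefficients(d):
    """Return ([a_1, ..., a_{d/4}], strict lower bound on a_0) for degree d.

    Assumes d is a positive multiple of 4 and the recursion as read above.
    """
    if d <= 0 or d % 4 != 0:
        raise ValueError("d is assumed to be a positive multiple of 4")
    a = [Fraction(1)]  # a_1
    for k in range(1, d // 4):
        a.append(Fraction(d - 2 * k, 2 * k + 2) * a[-1])  # a_{k+1}
    a0_bound = Fraction(2 * (d - 2), d * (d - 1)) + Fraction(d, 4 * (d - 1)) * a[-1]
    return a, a0_bound


coeffs_8, bound_8 = proof_coefficients(8)
```

Any $a_0$ strictly above the returned bound satisfies the stated inequality; the output only evaluates the recursion and does not by itself certify dsos-convexity.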

15


Comparing the different relaxations (1/4)

• Impact of relaxations on solving

  $\min t$ s.t. $tI - H_h(x_0)$ psd/sdd/dd, $f = g - h$, $g, h$ dsos/sdsos/sos-convex

  for random $f$.

Type of relaxation | Time (s) | Opt value | Time (s) | Opt value | Time (s) | Opt value
dsos-convex + dd | 1.05 | 17578.54 | 2.79 | 21191.55 | 20.80 | 168327.89
dsos-convex + psd | 1.19 | 15855.77 | 3.19 | 19426.13 | 25.36 | 146847.73
sdsos-convex + sdd | 1.21 | 1089.41 | 5.17 | 1962.64 | 34.66 | 7936.57
sdsos-convex + psd | 1.21 | 1069.79 | 5.29 | 1957.03 | 39.43 | 7935.72
sos-convex + psd (MOSEK) | 2.02 | 193.07 | 93.74 | 317.63 | -- | --
sos-convex + psd (SeDuMi) | 11.48 | 193.06 | 10324.12 | 317.63 | -- | --

Computer: 8 GB RAM, 2.40 GHz processor

16


Comparing the different relaxations (2/4)

• Iterative decomposition algorithm implemented for the unconstrained problem $\min_x f(x)$:
  • Decompose $f$ using one of the relaxations at point $x_k$.
  • Minimize the convexified $f$ using an SDP subroutine [Lasserre; de Klerk and Laurent].
  • Repeat from the new point.

[Figure: bar chart of the objective value (vertical axis from -250000 to 0) for the relaxations DSOS DD, DSOS PSD, SDSOS SDD, SDSOS PSD, SOS PSD.]

• Value of the objective after 3 mins.
• Algorithm given above.
• 5 different relaxations used
• Random $f$
• Average over 25 iterations
• Solver: MOSEK

17


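Comparing the different relaxations (3/4)

• Constrained case: $\min_{x \in B} f(x)$, where $B$ is a ball of radius $R$.
• Single decomposition vs. iterative decomposition vs. one min-max decomposition

Single decomposition
• Decompose once at $x_0$.
• Relaxation: minimize the curvature of $h$ at $x_0$ s.t. $f = g - h$, $g, h$ sdsos-convex.
• Minimize the convexified $f$.

Iterative decomposition
• Decompose at a point $x_k$.
• Relaxation: minimize the curvature of $h$ at $x_k$ s.t. $f = g - h$, $g, h$ sdsos-convex.
• Minimize the convexified $f$, then repeat from the new point.

One min-max decomposition
• Decompose over $B$. What relaxation to use?
• Original problem: $\min_{g,h} \max_{x \in B} \lambda_{\max}(H_h(x))$ s.t. $f = g - h$, $g, h$ convex
• Equivalent formulation: $\min t$ s.t. $tI - H_h(x) \succeq 0$ for all $x \in B$, $f = g - h$, $g, h$ convex
• First relaxation: same problem with $g, h$ sdsos-convex
• Second relaxation: the condition over $B$ is replaced by an sos condition, with $g, h$ sdsos-convex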

18


Comparing the different relaxations (4/4)

• Constrained case: single decomposition vs. iterative decomposition vs. min-max decomposition

• Value of the objective after 3 mins.
• Algorithms described above.
• Random $f$
• Radius $R$: random integer between 100 and 400.
• Average over 200 iterations

[Figure: bar chart of the objective value (vertical axis from -16000 to 4000) for single decomposition, iterative decomposition, and min-max decomposition.]

19


Main messages

• To apply CCCP to polynomial optimization, a DC decomposition is needed. Choice of decomposition impacts convergence speed.

• Not computationally tractable to find “best” decomposition.

• Efficient convex relaxations based on the concepts of dsos-convex (LP), sdsos-convex (SOCP), and sos-convex (SDP) polynomials.

• Dsos-convex and sdsos-convex scale to a larger number of variables.

20

Thank you for listening

Questions?
