cloud branching - 2014 mixed integer programming workshop · cloud branching mip workshop, ohio...

Cloud BranchingMIP workshop, Ohio State University, 23/Jul/2014

Timo Berthold Gerald GamrathXpress Optimization Team Zuse Institute Berlin

Domenico SalvagninUniversita degli Studi di Padova

© 2014 Fair Isaac Corporation.This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.

MIP & Branching

min cT xs.t . Ax ≤ b

x ∈ ZI≥0 ×RC

≥0

Mixed Integer Program:. linear objective &

constraints. integer variables. continuous variables

Branching for MIP:. improve dual bound. fractional variables. the optimum of the LP

relaxation

xi ≤ bx∗i c ∨ xi ≥ dx∗i e for i ∈ I and x∗i /∈ Z

© 2014 Fair Isaac Corporation.

MIP & Branching

min cT xs.t . Ax ≤ b

x ∈ ZI≥0 ×RC

≥0

Mixed Integer Program:. linear objective &

constraints. integer variables. continuous variables

Branching for MIP:. improve dual bound. fractional variables. the optimum of the LP

relaxation

xi ≤ bx∗i c ∨ xi ≥ dx∗i e for i ∈ I and x∗i /∈ Z© 2014 Fair Isaac Corporation.

Degeneracy

naıvely:. 1 optimal solution,

determined by nconstraints/bounds

at a second thought:. 1 optimal solution,

k > n tight constraints

but really:. an optimal polyhedron

Goal of this talk:branch on a set (a cloud) of solutions


How degenerated is the typical MIP?

Nonbasic columns (incl. slacks) with reduced costs 0.0

0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0

40

80

120

. MIPLIB 3, MIPLIB2003, MIPLIB2010, Cor@l (462 instances)

. only 12.5% not dual degenerated

. dichotomy: either few degeneracy or a lot of

. presolving/cuts do not change this picture by much


Two questions

Goal of this talk:branch on a cloud of solutions

1. How do we get extra optimal solutions?2. How do we use this extra knowledge?


Generating cloud points

1. How do we get extra optimal solutions?

. restrict LP to optimal face

. min/max each variable (OBBT)

. feasibility pump objective(pump-reduce [Achterberg2010])

=

x1

x2

stalling is cheap!







=

x1= 0.4x2= 0.55

stalling is cheap!







min x1

=

x1= 0.4x2= 0.55

stalling is cheap!







max x1

=

x1∈ [0.4,1.0]

x2∈ [0.55,0.7]

stalling is cheap!







min x2

=

x1∈ [0.4,1.0]

x2∈ [0.0,0.7]

stalling is cheap!







max x2

=

x1∈ [0.4,1.0]

x2∈ [0.0,1.0]

stalling is cheap!







max x2

=

x1∈ [0.4,1.0]

x2∈ [0.0,1.0]

intervals instead of values

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣

=

x1= 0.4x2= 0.55

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣ = −x1 + x2

x1= 0.4x2= 0.55

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣ = −x1 + x2

x1∈ [0.4,0.55]

x2∈ [0.55,0.9]

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣ = +x1 + x2

x1∈ [0.4,0.55]

x2∈ [0.55,0.9]

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣ = +x1 + x2

x1∈ [0.4,0.9]

x2∈ [0.55,1.0]

stalling is cheap!







min ∆(x , x) =∑∣∣xj − xj

∣∣ = +x1 + x2

x1∈ [0.4,0.9]

x2∈ [0.55,1.0]

heuristic intervals

stalling is cheap!


Branching rules in MIP

2. How do we use this extra knowledge?

1. Pseudocost branching [BenichouEtAl1971]

. try to estimate LP values, based on history information

. effective, cheap, but weak in the beginning

. combine with strong branching (reliability, hybrid)

2. Strong branching [ApplegateEtAl1995]

. solve tentative LP relaxations for some candidates

. effective w.r.t. number of nodes, expensive w.r.t. time

3. Most infeasible branching. often referred to as a simple standard rule. computationally as bad as random branching!


Exploiting cloud points (pseudocost branching)

2.1 How do we use this extra knowledge?

z = cT x

bx?j c x?

j dx?j e

∆↓

∆↑

lj uj

z = cT x

bx?j c x?

j dx?j e

∆−j

∆+j

lj uj

∆−j ∆+j

. pseudocost update. ς+

j = ∆↑

dx?j e−x?

jand ς−j = ∆↓

x?j −bx

?j c

. pseudocost-based estimation. ∆+

j = Ψ+j (dx?

j e − x?j ) and ∆−j = Ψ−j (x?

j − bx?j c)




z = cT x

bx?j c x?

j dx?j e

∆↓

∆↑

lj uj

z = cT x

bx?j c x?

j dx?j e

∆−j

∆+j

lj uj

∆−j ∆+j


j = ∆↑

dx?j e−x?

j. . . better: ς+

j = ∆↑

dx?j e−uj


j = Ψ+j (dx?

j e − x?j ) and ∆−j = Ψ−j (x?

j − bx?j c)




z = cT x

bx?j c x?

j dx?j e

∆↓

∆↑

lj uj

z = cT x

bx?j c x?

j dx?j e

∆−j

∆+j

lj uj

∆−j ∆+j


j = ∆↑

dx?j e−x?

j. . . better: ς+

j = ∆↑

dx?j e−uj


j = Ψ+j (dx?

j e − x?j ) . . . better: ∆+

j = Ψ+j (dx?

j e − uj)


Observation

LemmaLet x? be an optimal solution of the LP relaxation at a givenbranch-and-bound node and bx?

j c ≤ lj ≤ x?j ≤ uj ≤ dx?

j e. Then

1. for fixed ∆↑ and ∆↓, it holds that ς+j ≥ ς+

j and ς−j ≥ ς−j ,respectively;

2. for fixed Ψ+j and Ψ−j , it holds that ∆+

j ≤ ∆+j and ∆−j ≤ ∆−j ,

respectively.


Exploiting cloud points (strong branching)


Full strong branching:. solves 2·#frac. var’s many LPs. uses product of improvement values as branching score

. improvement on both sides better than on one

Benefit of cloud intervals:. frac. var. gets integral in cloud point one LP spared. cloud branching acts as a filter. new frac. var.’s new candidates (one side known)Idea: Use 3-partition of branching candidates


Exploiting cloud points (most inf branching)


Most infeasible branching. branch on variable with maximal fractionality

min(x?j − bx

?j c, dx

?j e − x?

j ). computationally as bad as random branching!. maybe because x?

j is “random” within cloud interval?

bx?i c

x?i

dx?i e

li ui

bx?j c

x?j

dx?j e

lj uj

Most infeasible with cloud intervals:. select xi with maximal min(|y − z| : y ∈ [li ,ui ], z ∈ Z). direction in which the optimal face has a small diameter





min(x?j − bx

?j c, dx

?j e − x?



bx?i c x?

i dx?i e

li ui

bx?j c x?

j dx?j e

lj uj






min(x?j − bx

?j c, dx

?j e − x?



bx?i c

x?i

dx?i eli ui bx?

j c

x?j

dx?j elj uj



Experimental environment

. implementation within SCIP

. “pristine” settings: No cuts, no heurs, DFS

. cutoff = optimum branch to prove optimality

MIPLIB. MIPLIB 3.0, 2003, 2010. industry and academics. 168 benchmark instances

Cor@l. huge collection of 350 instances. many combinatorial ones. mainly collected from NEOS server

Focus: Tree size reduction by cloud information


Cloud intervals at root node

Cloud statistics:

. interval size:. average: 0.56. g. mean: 0.25

. success rate:. average: 65%. g. mean: 39%

. LP iter/success:. average: 172.15. g. mean: 31.30

. From now on, exclude: No objective, solved at root, no dualdegeneracy at root, highly symmetric

. leaves 158 instances in testbed

Note: Did not use cloud points for anything else (heuristics, cuts)


Computational results

Pseudocost branching

pscost cloud

solved 118 94time (all opt) 24.7 83.4nodes (all opt) 1945 1441

. OBBT takes a lot of time!

. fewer nodes⇔ better branching choices

. more reliable estimates



Most infeasible branching

mostinf cloud

solved 88 81time (all opt) 27.7 58.8nodes (all opt) 2553 1414

. running time doubles

. but: enumerational effort (=nodes) nearly halved

Better than random! :-)


Cloud strong branching algorithm

. frac. var. gets integral in cloud point one LP sparedIdea: Use 3-partition F2,F1,F0 of branching candidates

. improvements in both directions for F2 disregard F1 ∪ F0

. improvement in one direction for F2 ∪ F1 disregard F0

. alternatively: Only use F2

. stop pump-reduce procedure when new cloud point does notimply new integral bound



Full strong branching. full test sets, default settings, heuristic pump-reduce

strong branch cloud branchTest set Nodes Time (s) Nodes Time (s)

MIPLIB 691 72.0 661 68.2COR@L 593 157.3 569 118.3

. little less nodes

. 5.5% faster on MIPLIB (few affected instances)

. 30% faster on COR@L


Wrap up

Conclusion:. exploit knowledge of alternative relaxation optima. mostinf: −44% nodes, pscost: −26%, hybrid: −13%. full strong branching: −30% running time

Outlook:. faster generation of cloud intervals?. predictions when cloud branching is worthwhile?. cloud points from alternative relaxations (MINLP!). SB: nonchimerical + cloud + propagation = ?


Commercial break

Xpress Optimization Suite 7.7 came out in April 2014

Highlights:. parallel dual simplex (mini iterations). MIP presolving w.r.t. objective. Mosel support for robust constraints and uncertainty. improved performance on MISOCP

XPRESS MOSEK GUROBI CPLEX SCIP

1.0 1.24 2.85 5.84 16(MI)SOCP Benchmark http://plato.asu.edu/ftp/socp.html (17/Jun/2014)

Coming soon: Cloud branching


Thank You

Timo [email protected]

© 2014 Fair Isaac Corporation.This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.

cloud branching - 2014 mixed integer programming workshop · cloud branching mip workshop, ohio...

Documents