
A New Approach to the Maximum-Flow Problem

Andrew V. Goldberg, Robert E. Tarjan

Presented by Andrew Guillory

Outline

Background
Definitions
Push-Relabel Algorithm
Correctness / Termination Proofs
Sequential Implementation
Dynamic Tree Implementation

Maximum Flow Problem

Classic problem in operations research

Many problems reduce to max flow
  Maximum cardinality bipartite matching
  Maximum number of edge-disjoint paths
  Minimum cut (Max-Flow Min-Cut Theorem)

Machine learning applications
  Structured Prediction, Dual Extragradient and Bregman Projections (Taskar, Lacoste-Julien, Jordan, JMLR 2006)
  Local Search for Balanced Submodular Clusterings (Narasimhan, Bilmes, IJCAI 2007)

Relation to Optimization

Special case of submodular function minimization

Special case of linear programming

Integer edge capacities permit integer maximum flows (constructive proof)

History of Algorithms

Augmenting-path-based algorithms
  Ford-Fulkerson (1962) O(mU)
  Edmonds-Karp (1969) O(nm²)
  … O(n³), O(nm log n), O(nm log U)

Push-relabel-based algorithms
  Goldberg (1985) O(n³)
  Goldberg and Tarjan (1986) O(nm log(n²/m))
  Ahuja and Orlin O(nm + n² log U)



Definitions

Graph G = (V, E) with |V| = n and |E| = m

G is a flow network if it has
  a source s and a sink t
  a capacity c(v,w) for each edge (v,w) in E
  c(v,w) = 0 for (v,w) not in E

Definitions (continued)

A flow f on G is a real-valued function on vertex pairs satisfying
  f(v,w) <= c(v,w) for all (v,w)
  f(v,w) = -f(w,v)
  ∑_u f(u,v) = 0 for all v in V - {s,t}

The value of a flow, |f|, is ∑_v f(v,t)

A maximum flow is a flow of maximum value

Definitions (continued again)

A preflow f on G is a real-valued function on vertex pairs satisfying
  f(v,w) <= c(v,w) for all (v,w)
  f(v,w) = -f(w,v)
  ∑_u f(u,v) >= 0 for all v in V - {s}

Flow excess e(v) = ∑_u f(u,v)

Intuition: flow into a vertex can exceed flow out
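As a concrete check of the flow and preflow definitions, the following Python sketch computes the excess e(v) and tests the three conditions. It assumes f and c are stored as dictionaries keyed by vertex pairs, with missing pairs meaning 0. The function names and the small path network are illustrative choices, not taken from the paper.

```python
def excess(f, v):
    """e(v): sum of f(u, v) over all u (net flow into v)."""
    return sum(val for (u, w), val in f.items() if w == v)


def is_preflow(f, c, vertices, s):
    """Check the capacity, antisymmetry and nonnegative-excess conditions."""
    capacity_ok = all(val <= c.get(pair, 0) for pair, val in f.items())
    antisym_ok = all(f.get((w, v), 0) == -val for (v, w), val in f.items())
    excess_ok = all(excess(f, v) >= 0 for v in vertices if v != s)
    return capacity_ok and antisym_ok and excess_ok


def is_flow(f, c, vertices, s, t):
    """A flow is a preflow with zero excess everywhere except s and t."""
    return is_preflow(f, c, vertices, s) and all(
        excess(f, v) == 0 for v in vertices if v not in (s, t))


# Hypothetical path network s -> a -> b -> t.
vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
# Saturating the edge out of s gives a preflow that is not yet a flow.
preflow = {("s", "a"): 3, ("a", "s"): -3}
# One unit routed along the whole path is a flow of value 1.
flow = {("s", "a"): 1, ("a", "s"): -1, ("a", "b"): 1, ("b", "a"): -1,
        ("b", "t"): 1, ("t", "b"): -1}
```

Here `preflow` leaves an excess of 3 at vertex a, so it passes `is_preflow` but not `is_flow`, while `flow` passes both.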



Intuition

Starting with a preflow, push excess flow closer towards sink

If excess flow cannot reach sink, push it backwards to source

Eventually, preflow becomes a flow and in fact the maximum flow

Residual Graph

Residual capacity r_f(v,w) of a vertex pair is c(v,w) - f(v,w)

If v has positive excess and (v,w) has positive residual capacity, we can push δ = min(e(v), r_f(v,w)) units of flow from v to w

Edge (v,w) is saturated if r_f(v,w) = 0

Residual graph G_f = (V, E_f), where E_f is the set of residual edges (v,w) with r_f(v,w) > 0
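A minimal sketch of the residual capacity and the residual edge set, assuming the same dictionary representation for c and f (missing pairs count as 0). The helper names and the sample preflow are hypothetical.

```python
def residual(c, f, v, w):
    """Residual capacity of the pair (v, w): c(v, w) - f(v, w)."""
    return c.get((v, w), 0) - f.get((v, w), 0)


def residual_edges(c, f, vertices):
    """All ordered pairs with positive residual capacity."""
    return {(v, w) for v in vertices for w in vertices
            if v != w and residual(c, f, v, w) > 0}


vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
# Preflow that saturates (s, a); antisymmetry gives f(a, s) = -3.
f = {("s", "a"): 3, ("a", "s"): -3}
edges = residual_edges(c, f, vertices)
```

Note that the saturated edge (s,a) drops out of the residual graph while its reverse (a,s) appears with residual capacity 3, which is what later allows excess to be pushed back to the source.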

Labeling

A valid labeling is a function d from vertices to nonnegative integers such that
  d(s) = n
  d(t) = 0
  d(v) <= d(w) + 1 for every residual edge (v,w)

If d(v) < n, d(v) is a lower bound on the distance from v to the sink in the residual graph

If d(v) >= n, d(v) - n is a lower bound on the distance from v to the source in the residual graph
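The validity conditions translate directly into a small check. This sketch assumes the residual edges are supplied as a set of ordered pairs and the labels as a dictionary; the function name and sample data are hypothetical.

```python
def is_valid_labeling(d, residual_edges, s, t, n):
    """d(s) = n, d(t) = 0, and d(v) <= d(w) + 1 on every residual edge."""
    return (d[s] == n and d[t] == 0
            and all(d[v] <= d[w] + 1 for v, w in residual_edges))


# Residual edges of a 4-vertex path after saturating the edge out of s.
residual_edges = {("a", "s"), ("a", "b"), ("b", "t")}
initial = {"s": 4, "a": 0, "b": 0, "t": 0}
too_high = {"s": 4, "a": 2, "b": 0, "t": 0}  # violates d(a) <= d(b) + 1
```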

Push Operation

Push(v,w)

Precondition: v is active (e(v) > 0), r_f(v,w) > 0, and d(v) = d(w) + 1

Action: push δ = min(e(v), r_f(v,w)) units of flow from v to w
  f(v,w) = f(v,w) + δ; f(w,v) = f(w,v) - δ
  e(v) = e(v) - δ; e(w) = e(w) + δ
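The Push operation can be sketched as a function that checks the precondition and then applies the four updates. The dictionaries c, f, e and d and the sample state are assumptions made for illustration.

```python
def push(c, f, e, d, v, w):
    """Apply Push(v, w) in place and return the amount pushed."""
    r = c.get((v, w), 0) - f.get((v, w), 0)
    if not (e[v] > 0 and r > 0 and d[v] == d[w] + 1):
        raise ValueError("Push(v, w) is not applicable")
    delta = min(e[v], r)
    f[(v, w)] = f.get((v, w), 0) + delta
    f[(w, v)] = f.get((w, v), 0) - delta
    e[v] -= delta
    e[w] += delta
    return delta


# Hypothetical state: path s -> a -> b -> t, edge (s, a) saturated, d(a) = 1.
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
f = {("s", "a"): 3, ("a", "s"): -3}
e = {"s": -3, "a": 3, "b": 0, "t": 0}
d = {"s": 4, "a": 1, "b": 0, "t": 0}
pushed = push(c, f, e, d, "a", "b")
```

In this state the push is saturating: the residual capacity of (a,b) is 1, smaller than the excess 3 at a, so a stays active afterwards.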

Relabel Operation

Relabel(v)

Precondition: v is active (e(v) > 0) and r_f(v,w) > 0 implies d(v) <= d(w)

Action: d(v) = min{d(w) + 1 | (v,w) in E_f}
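A sketch of Relabel under the same assumed dictionary representation: it collects the residual edges leaving v, checks that none of them admits a push, and raises d(v) to the smallest admissible value. The names and the sample state are hypothetical.

```python
def relabel(c, f, e, d, v, vertices):
    """Apply Relabel(v) in place and return the new label."""
    out = [w for w in vertices
           if w != v and c.get((v, w), 0) - f.get((v, w), 0) > 0]
    if not (e[v] > 0 and all(d[v] <= d[w] for w in out)):
        raise ValueError("Relabel(v) is not applicable")
    d[v] = min(d[w] + 1 for w in out)
    return d[v]


# Hypothetical state: (s, a) and (a, b) are saturated, so the only
# residual edge leaving a points back to the source.
vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
f = {("s", "a"): 3, ("a", "s"): -3, ("a", "b"): 1, ("b", "a"): -1,
     ("b", "t"): 1, ("t", "b"): -1}
e = {"s": -3, "a": 2, "b": 0, "t": 1}
d = {"s": 4, "a": 1, "b": 1, "t": 0}
new_label = relabel(c, f, e, d, "a", vertices)
```

The new label is d(s) + 1 = 5, which is at least n: from this point on the remaining excess at a can only travel back towards the source.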

Generic Push-Relabel Algorithm

Start from an initial preflow

While there is an active vertex
  Choose an active vertex v
  Apply Push(v,w) for some w, or Relabel(v) if no push applies
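The generic loop can be sketched in a few lines of Python. This is an illustrative, unoptimized version under stated assumptions: capacities are a dictionary keyed by vertex pairs, the initial preflow saturates every edge out of s, labels start at d(s) = n and 0 elsewhere, and the first active vertex in list order is always chosen. None of these names come from the paper.

```python
def push_relabel(vertices, c, s, t):
    """Generic push-relabel; returns (flow value, flow dict, labels)."""
    f = {}
    e = {v: 0 for v in vertices}
    d = {v: 0 for v in vertices}
    d[s] = len(vertices)

    def r(v, w):
        return c.get((v, w), 0) - f.get((v, w), 0)

    def send(v, w, delta):
        f[(v, w)] = f.get((v, w), 0) + delta
        f[(w, v)] = f.get((w, v), 0) - delta
        e[v] -= delta
        e[w] += delta

    # Initial preflow (assumption): saturate every edge leaving s.
    for w in vertices:
        if c.get((s, w), 0) > 0:
            send(s, w, c[(s, w)])

    def active():
        return [v for v in vertices if v not in (s, t) and e[v] > 0]

    while active():
        v = active()[0]
        for w in vertices:
            if r(v, w) > 0 and d[v] == d[w] + 1:
                send(v, w, min(e[v], r(v, w)))  # Push(v, w)
                break
        else:
            # No push applies, so Relabel(v) does.
            d[v] = min(d[w] + 1 for w in vertices if r(v, w) > 0)
    return e[t], f, d


vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
value, f, d = push_relabel(vertices, c, "s", "t")
```

On this path network the bottleneck is the edge (a,b), so the maximum flow value is 1; the two surplus units are pushed back to s after a is relabeled above n.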

Example

Flow network: a path s → a → b → t with capacities c(s,a) = 3, c(a,b) = 1, c(b,t) = 2 (n = 4). The intermediate vertices are unnamed on the slides and are called a and b here. Labels are listed as (d(s), d(a), d(b), d(t)) and edges as flow/capacity for (s,a), (a,b), (b,t).

  1. Flow network: edges 0/3, 0/1, 0/2
  2. Initial preflow / labeling: labels (4, 0, 0, 0), edges 3/3, 0/1, 0/2
  3. Select an active vertex: a
  4. Relabel active vertex: labels (4, 1, 0, 0)
  5. Push excess from active vertex: edges 3/3, 1/1, 0/2
  6. Select an active vertex: b
  7. Relabel active vertex: labels (4, 1, 1, 0)
  8. Push excess from active vertex: edges 3/3, 1/1, 1/2
  9. Select an active vertex: a
  10. Relabel active vertex: labels (4, 5, 1, 0)
  11. Push excess from vertex a back to s: edges 1/3, 1/1, 1/2
  12. Maximum flow: labels (4, 5, 1, 0), edges 1/3, 1/1, 1/2



Correctness

Lemma 2.1 If f is a preflow, d is a valid labeling, and v is active, either push or relabel is applicable to v

Lemma 3.1 The algorithm maintains a valid labeling d

Theorem 3.2 A flow is maximum iff there is no path from s to t in G_f (Ford and Fulkerson [7])

Correctness (continued)

Lemma 3.3 If f is a preflow and d is a valid labeling for f, there is no path from s to t in G_f

Proof by contradiction
  A simple path s = v0, v1, …, vl = t in G_f has l <= n - 1 edges and implies that
  d(s) <= d(v1) + 1 <= d(v2) + 2 <= … <= d(t) + l = l < n
  which contradicts d(s) = n

Correctness (continued)

Theorem 3.4 If the algorithm terminates with a valid labeling, the preflow is a maximum flow
  If the algorithm terminates, all vertices in V - {s,t} have zero excess (the preflow is a flow)
  By Lemma 3.3 the sink is not reachable from the source
  By Theorem 3.2 the flow is maximum

Termination

Lemma 3.5 If f is a preflow and v is an active vertex, then the source is reachable from v in G_f
  Let S be the set of vertices reachable from v in G_f, and suppose s is not in S
  For every u, w with w in S and u not in S, r_f(w,u) = 0, so f(u,w) <= 0
  ∑_{w in S} e(w) = ∑_{u in V, w in S} f(u,w)
    = ∑_{u not in S, w in S} f(u,w) + ∑_{u in S, w in S} f(u,w)
    = ∑_{u not in S, w in S} f(u,w) <= 0
  (the sum over pairs inside S vanishes by antisymmetry)
  Since e(w) >= 0 for every w in S, e(w) = 0 for all w in S, which contradicts e(v) > 0

Lemma 3.6 A vertex's label never decreases

Termination (continued)

Lemma 3.7 At any time the label of any vertex is at most 2n - 1
  Only active vertex labels are changed
  Active vertices can reach s (Lemma 3.5)
  A simple path v = v0, v1, …, vl = s in G_f has l <= n - 1 edges and implies that
  d(v) <= d(v1) + 1 <= d(v2) + 2 <= … <= d(s) + l <= n + n - 1 = 2n - 1


Lemma 3.8 There are at most 2n² relabeling operations
  Only the labels of vertices in V - {s,t} may be changed
  Each of these n - 2 labels can only increase, and is at most 2n - 1
  At most (2n - 1)(n - 2) <= 2n² relabelings


Lemma 3.9 The number of saturating pushes is at most 2nm
  For any pair (v,w), d(w) must increase by at least 2 between saturating pushes from v to w
  Similarly, d(v) must increase by at least 2 between saturating pushes from w to v
  d(v) + d(w) >= 1 on the first saturating push
  d(v) + d(w) <= 4n - 3 on the last
  At most 2n - 1 saturating pushes per edge


Lemma 3.10 The number of nonsaturating pushes is at most 4n²m
  Φ = ∑_v d(v), summed over active vertices v
  Each nonsaturating push causes Φ to decrease by at least 1
  The total increase in Φ from saturating pushes is at most (2n - 1) 2nm
  The total increase in Φ from relabeling is at most (2n - 1)(n - 2)
  Φ is 0 initially and 0 at termination
  So the number of nonsaturating pushes is at most (2n - 1) 2nm + (2n - 1)(n - 2) <= 4n²m

Termination

Theorem 3.11 The algorithm terminates after O(n²m) push and relabel operations

Total number of operations =
  # nonsaturating pushes
  + # saturating pushes
  + # relabeling operations
  <= 4n²m + 2nm + 2n² = O(n²m)



Implementation

At each step select an active vertex and apply either Push or Relabel

Problem: Determining which operation to perform and in the case of Push finding a residual edge

Solution: For each vertex maintain a list of edges which touch that vertex and a current edge

Push/Relabel Operation

Push/Relabel(v)

Precondition: v is active

Action:

If Push(v,w) is applicable to current edge (v,w) then Push(v,w)

Else if (v,w) is not the last edge advance current edge

Else reset the current edge and Relabel(v)
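One way to realize the Push/Relabel step with per-vertex edge lists and a current-edge pointer is sketched below. The structures `adj` (neighbors of each vertex, in both edge directions) and `cur` (index of the current edge) are assumed names, and the function returns which of the three branches it took so the behavior can be inspected.

```python
def push_relabel_step(v, adj, cur, c, f, e, d):
    """One Push/Relabel(v) step; returns 'push', 'advance' or 'relabel'."""
    w = adj[v][cur[v]]
    r = c.get((v, w), 0) - f.get((v, w), 0)
    if r > 0 and d[v] == d[w] + 1:
        delta = min(e[v], r)
        f[(v, w)] = f.get((v, w), 0) + delta
        f[(w, v)] = f.get((w, v), 0) - delta
        e[v] -= delta
        e[w] += delta
        return "push"
    if cur[v] < len(adj[v]) - 1:
        cur[v] += 1  # current edge is not the last: advance
        return "advance"
    cur[v] = 0  # last edge: reset the current edge and relabel
    d[v] = min(d[x] + 1 for x in adj[v]
               if c.get((v, x), 0) - f.get((v, x), 0) > 0)
    return "relabel"


# Hypothetical state on the path s -> a -> b -> t with (s, a) saturated.
adj = {"s": ["a"], "a": ["s", "b"], "b": ["a", "t"], "t": ["b"]}
cur = {v: 0 for v in adj}
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
f = {("s", "a"): 3, ("a", "s"): -3}
e = {"s": -3, "a": 3, "b": 0, "t": 0}
d = {"s": 4, "a": 0, "b": 0, "t": 0}
trace = [push_relabel_step("a", adj, cur, c, f, e, d) for _ in range(4)]
```

Each step costs O(1) apart from the scan inside a relabel, which is the point of keeping a current edge: the edge list of v is traversed once per relabeling of v rather than once per push.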

Push/Relabel Operation

Lemma 4.1 The push/relabel operation does a relabeling only when relabeling is applicable

Theorem 4.2 The push/relabel implementation runs in O(nm) time plus O(1) time per nonsaturating push operation

O(n³) bound

We can select vertices in arbitrary order

Certain vertex selection strategies give O(n³) bounds
  Maximum distance method (proved here)
  First-in, first-out method (proved in paper)
  Wave method

Maximum distance method

At each step, select the active vertex with maximum distance d(v)


Theorem The maximum distance method performs at most 4n³ nonsaturating pushes
  Consider D = max_x d(x), where x ranges over active vertices
  D only increases because of relabeling
  D increases at most 2n² times
  D starts at 0 and ends nonnegative
  D changes at most 4n² times
  There is at most one nonsaturating push per node per value of D

Theorem The maximum distance method runs in time O(n³) using the push/relabel implementation
  Follows from the previous theorem and Theorem 4.2
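The selection rule itself is simple to state in code. The sketch below uses a linear scan over the vertices, which is an illustrative simplification; practical implementations commonly keep active vertices in buckets indexed by label so that selection is cheaper. The function name and sample data are hypothetical.

```python
def select_max_distance(e, d, s, t):
    """Return an active vertex with the largest label, or None if none is active."""
    active = [v for v in e if v not in (s, t) and e[v] > 0]
    if not active:
        return None
    return max(active, key=lambda v: d[v])


d = {"s": 4, "a": 5, "b": 1, "t": 0}
both_active = {"s": -3, "a": 2, "b": 1, "t": 0}
only_b = {"s": -1, "a": 0, "b": 1, "t": 0}
done = {"s": -1, "a": 0, "b": 0, "t": 1}
```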

First-In First-Out Method

Discharge()

Precondition: Queue is not empty

Action: Push/Relabel the vertex v at the front of the queue until e(v) = 0 or d(v) increases
  If w becomes active during the Push/Relabel, add w to the back of the queue
  If v is still active, add v to the back of the queue
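Putting Discharge together with the current-edge Push/Relabel step gives a first-in, first-out sketch like the one below. It is a simplified illustration, not the paper's code: it assumes dictionary-based capacities, an initial preflow that saturates the edges out of s, and edge lists built from the capacity dictionary.

```python
from collections import deque


def fifo_push_relabel(vertices, c, s, t):
    """FIFO push-relabel; returns (flow value, flow dict, labels)."""
    # Edge list per vertex: neighbors joined by an edge in either direction.
    adj = {v: [] for v in vertices}
    for v, w in c:
        if w not in adj[v]:
            adj[v].append(w)
        if v not in adj[w]:
            adj[w].append(v)
    f = {}
    e = {v: 0 for v in vertices}
    d = {v: 0 for v in vertices}
    d[s] = len(vertices)
    cur = {v: 0 for v in vertices}
    queue = deque()

    def r(v, w):
        return c.get((v, w), 0) - f.get((v, w), 0)

    def send(v, w, delta):
        f[(v, w)] = f.get((v, w), 0) + delta
        f[(w, v)] = f.get((w, v), 0) - delta
        e[v] -= delta
        if e[w] == 0 and w not in (s, t):
            queue.append(w)  # w becomes active
        e[w] += delta

    for w in adj[s]:  # initial preflow (assumption): saturate edges out of s
        if c.get((s, w), 0) > 0:
            send(s, w, c[(s, w)])

    while queue:
        v = queue.popleft()
        old_label = d[v]
        # Discharge: repeat Push/Relabel until e(v) = 0 or d(v) increases.
        while e[v] > 0 and d[v] == old_label:
            w = adj[v][cur[v]]
            if r(v, w) > 0 and d[v] == d[w] + 1:
                send(v, w, min(e[v], r(v, w)))
            elif cur[v] < len(adj[v]) - 1:
                cur[v] += 1
            else:
                cur[v] = 0
                d[v] = min(d[x] + 1 for x in adj[v] if r(v, x) > 0)
        if e[v] > 0:
            queue.append(v)  # still active: back of the queue
    return e[t], f, d


vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
value, f, d = fifo_push_relabel(vertices, c, "s", "t")
```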


Lemma 4.3 The number of passes over the queue is at most 4n²
  Proof very similar to the proof of the O(n³) bound for the maximum distance method

Corollary 4.4 The number of nonsaturating pushes is at most 4n³
  One per vertex per pass

Theorem 4.5 The first-in, first-out method runs in O(n³) time
  Follows from Corollary 4.4 and Theorem 4.2



Dynamic Tree Implementation

Intuition: Maintain trees such that connections between child nodes and parent nodes correspond to edges in the residual graph which permit push operations

Send flow up branches of trees

Queue contains trees with active roots

Send Operation

Send(v)

Precondition: v is active

Action:

While v is not the root of its tree and e(v) > 0

Send flow up the tree from v

Cut the tree along the bottleneck edge(s)
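The effect of Send can be illustrated with a naive forest of parent pointers. This is only a sketch of the operation's behavior: it walks the tree path explicitly, so it does not achieve the O(log k) cost of a real dynamic tree structure, and it tracks tree-edge residual capacities without updating an underlying flow. The names `parent` and `res` are assumptions.

```python
def send(v, parent, res, e):
    """Send excess from v toward its tree root, cutting saturated edges.

    parent[x] is x's tree parent (None for a root); res[x] is the residual
    capacity of the tree edge (x, parent[x]).
    """
    while parent[v] is not None and e[v] > 0:
        # Walk from v up to the root of its tree.
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        below_root = path[:-1]
        delta = min(e[v], min(res[x] for x in below_root))
        for x in below_root:
            res[x] -= delta
        e[v] -= delta
        e[path[-1]] += delta
        # Cut every bottleneck (saturated) edge on the path.
        for x in below_root:
            if res[x] == 0:
                parent[x] = None


# Hypothetical tree a -> b -> t with residual capacities 2 and 1.
parent = {"a": "b", "b": "t", "t": None}
res = {"a": 2, "b": 1}
e = {"a": 3, "b": 0, "t": 0}
send("a", parent, res, e)
```

In this run the first round delivers one unit to t and cuts (b,t); the second round delivers one unit to b and cuts (a,b), leaving one unit of excess at each of a and b.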

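Example

Path s → a → b → t with capacities c(s,a) = 3, c(a,b) = 2, c(b,t) = 1 and labels (d(s), d(a), d(b), d(t)) = (4, 2, 1, 0) throughout. The intermediate vertices are unnamed on the slides and are called a and b here. Edges are listed as flow/capacity for (s,a), (a,b), (b,t).

  1. Preflow / labeling: edges 3/3, 0/2, 0/1
  2. Residual graph: r_f(a,s) = 3, r_f(a,b) = 2, r_f(b,t) = 1; excess at a = 3
  3. Dynamic tree over residual graph: tree edges (a,b) with residual 2 and (b,t) with residual 1; excess at a = 3
  4. Select active vertex: a; excess = 3
  5. Send flow up tree: residuals become 1 on (a,b) and 0 on (b,t); excess at a = 2
  6. Cut along bottleneck edges: (b,t) is cut; excess at a = 2
  7. Send flow up tree: residual on (a,b) becomes 0; excess at a = 1
  8. Cut along bottleneck edges: (a,b) is cut; excess at a = 1, excess at b = 1
  9. New preflow / labeling: edges 3/3, 2/2, 1/1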

Tree-Push/Relabel Operation

Tree-Push/Relabel(v)

Precondition: v is the root of a tree and active

Action:
  1) If Push is applicable to the current edge (v,w):
    1a) If v's and w's trees can be combined without making a tree larger than size k, make w v's parent and Send(v)
    1b) Else Push(v,w) and Send(w)
  2) Else
    2a) If (v,w) isn't the last edge, advance the current edge
    2b) Else cut v's children out of the tree, relabel v, and reset the current edge


Lemma 5.1 The dynamic tree algorithm runs in O(nm log k) time plus O(log k) time per addition of a vertex to the queue
  Trees are kept at most size k by 1a)
  Tree operations take time O(log k)
  Each Tree-Push/Relabel operation takes O(1) tree operations plus O(1) tree operations per cut
  Relabeling takes time O(nm)
  There are O(nm) cuts
  Tree-Push/Relabel is performed O(nm) times plus once per addition to the queue


Lemma 5.2 The number of times a vertex is added to the queue is O(nm + n³/k)
  A vertex is added only after d(v) changes or e(v) increases from zero
  d(v) changes O(n²) times in total
  e(v) is increased only in 1a) or 1b)
  The number of vertices added to the queue in 1a) or 1b) is the number of cuts performed (2nm) plus one per occurrence of each subcase

  Subcase 1a) occurs at most 2nm times (the number of links)
  Subcase 1b) occurs at most 2nm times when it causes a cut and at most 2nm times when the push from v to w is saturating

  An occurrence of subcase 1b) is nonsaturating if it causes neither a cut nor a saturating push from v to w
  In a nonsaturating occurrence of 1b), either v's or w's tree is large (size greater than k/2)
  There are at most 2n/k large trees in the queue at the beginning of a pass
  If the large tree has changed since the beginning of this pass, charge the operation to the cut / link that changed it (at most one per link, 2 per cut, 6nm in total)
  Else charge the operation to that tree (at most 2n/k per pass, 2n² passes, 4n³/k in total)


Theorem 5.3 The dynamic tree algorithm runs in O(nm log(n²/m)) time if k is chosen to be n²/m
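The choice of k can be read off by combining Lemma 5.1 with Lemma 5.2. The short derivation below is a sketch of that arithmetic; T(k) is a name introduced here for the total running time as a function of the tree size limit.

```latex
\begin{align*}
T(k) &= O(nm \log k) + O(\log k)\cdot O\!\left(nm + \frac{n^3}{k}\right)
      = O\!\left(\left(nm + \frac{n^3}{k}\right)\log k\right),\\
k = \frac{n^2}{m} &\;\Longrightarrow\; \frac{n^3}{k} = nm
  \;\Longrightarrow\; T = O\!\left(nm \log \frac{n^2}{m}\right).
\end{align*}
```

Setting k = n²/m balances the two terms nm and n³/k, so neither dominates the bound.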

Closing Comments

Parallel version: discharge all active vertices in parallel (O(n² log n))

Maximum distance method: related work shows O(n²√m) bound

Implementation tricks: global relabeling, gap relabeling
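Global relabeling is only named on the slide. A common formulation, given here as an assumption rather than as the presenter's method, recomputes exact labels by breadth-first search in the residual graph: distance to t for vertices that can reach the sink, and n plus distance to s for the rest. The sketch assumes the current preflow leaves no residual path from s to t, and it assigns 2n - 1 to any vertex that reaches neither s nor t.

```python
from collections import deque


def global_relabel(vertices, c, f, s, t):
    """Recompute labels as exact residual distances (illustrative sketch)."""
    n = len(vertices)

    def r(v, w):
        return c.get((v, w), 0) - f.get((v, w), 0)

    def distances_to(root):
        # Backward BFS: v is one step from w if (v, w) is a residual edge.
        dist = {root: 0}
        queue = deque([root])
        while queue:
            w = queue.popleft()
            for v in vertices:
                if v not in dist and r(v, w) > 0:
                    dist[v] = dist[w] + 1
                    queue.append(v)
        return dist

    to_t = distances_to(t)
    to_s = distances_to(s)
    d = {}
    for v in vertices:
        if v == s:
            d[v] = n
        elif v in to_t:
            d[v] = to_t[v]
        else:
            d[v] = n + to_s.get(v, n - 1)
    return d


vertices = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
# State 1: only (s, a) is saturated, so a and b can still reach t.
f1 = {("s", "a"): 3, ("a", "s"): -3}
# State 2: (a, b) is saturated as well, so a can only reach s.
f2 = {("s", "a"): 3, ("a", "s"): -3, ("a", "b"): 1, ("b", "a"): -1}
d1 = global_relabel(vertices, c, f1, "s", "t")
d2 = global_relabel(vertices, c, f2, "s", "t")
```

In the second state the vertex a jumps straight to label 5, a value the basic algorithm would reach only through relabel steps at a.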

Maximum distance method better than tree version in practice?
