11. Lecture WS 2006/07
Bioinformatics III 1
V11: Max-Flow Min-Cut
V11 continues Chapter 12 in Gross & Yellen, "Graph Theory"
Theorem 12.2.3 [Characterization of Maximum Flow]
Let f be a flow in a network N.
Then f is a maximum flow in network N if and only if there does not exist an
f-augmenting path in N.
Proof: Necessity ($\Rightarrow$) Suppose that f is a maximum flow in network N.
Then by Proposition 12.2.1, there is no f-augmenting path: assuming an f-augmenting path existed, we could construct a flow f' with val(f') > val(f), contradicting the maximality of f.
Proposition 12.2.1 (Flow Augmentation) Let f be a flow in a network N, and let Q be an f-augmenting path with minimum slack $\Delta_Q$ on its arcs. Then the augmented flow f' given by
\[
f'(e) = \begin{cases}
f(e) + \Delta_Q, & \text{if } e \text{ is a forward arc of } Q \\
f(e) - \Delta_Q, & \text{if } e \text{ is a backward arc of } Q \\
f(e), & \text{otherwise}
\end{cases}
\]
is a feasible flow in network N and val(f') = val(f) + $\Delta_Q$.
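The augmentation step of Proposition 12.2.1 can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of arcs and the "forward"/"backward" tags are assumptions made for this sketch, not notation from Gross & Yellen.

```python
def augment(flow, cap, path):
    """Augment `flow` along a quasi-path (sketch of Proposition 12.2.1).

    `path` is a list of (arc, direction) pairs; an arc is a (tail, head)
    tuple and direction is "forward" or "backward" (hypothetical encoding).
    Returns the augmented flow f' and the minimum slack Delta_Q.
    """
    slacks = []
    for arc, direction in path:
        if direction == "forward":
            slacks.append(cap[arc] - flow[arc])  # room left on a forward arc
        else:
            slacks.append(flow[arc])             # flow that can be cancelled
    delta = min(slacks)
    if delta <= 0:
        raise ValueError("path is not f-augmenting")
    new_flow = dict(flow)
    for arc, direction in path:
        new_flow[arc] += delta if direction == "forward" else -delta
    return new_flow, delta


# Toy network s -> v -> t with one unit of flow already routed.
cap = {("s", "v"): 3, ("v", "t"): 2}
flow = {("s", "v"): 1, ("v", "t"): 1}
new_flow, delta = augment(
    flow, cap, [(("s", "v"), "forward"), (("v", "t"), "forward")]
)
```

Here the slacks are 2 and 1, so the path carries one additional unit and the arc from v to t becomes saturated.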
Max-Flow Min-Cut
Sufficiency ($\Leftarrow$) Suppose that there does not exist an f-augmenting path in network N.
Consider the collection of all quasi-paths in network N that begin with source s and have the following property: each forward arc on the quasi-path has positive slack, and each backward arc on the quasi-path has positive flow.
Let $V_s$ be the union of the vertex-sets of these quasi-paths.
Since there is no f-augmenting path, it follows that sink $t \notin V_s$.
Let $V_t = V_N - V_s$.
Then $\langle V_s, V_t \rangle$ is an s-t cut of network N. Moreover, by definition of the sets $V_s$ and $V_t$,
\[
f(e) = \begin{cases}
cap(e), & \text{if } e \in \langle V_s, V_t \rangle \\
0, & \text{if } e \in \langle V_t, V_s \rangle
\end{cases}
\]
Hence, f is a maximum flow, by Corollary 12.1.8. □
Max-Flow Min-Cut
Theorem 12.2.4 [Max-Flow Min-Cut] For a given network, the value of a
maximum flow is equal to the capacity of a minimum cut.
Proof: The s-t cut constructed in the proof of Theorem 12.2.3 has capacity equal
to the maximum flow. □
The outline of an algorithm
for maximizing the flow in
a network emerges from
Proposition 12.2.1 and
Theorem 12.2.3.
Finding an f-Augmenting Path
The idea is to grow a tree of quasi-paths, each starting at source s.
If the flow on each arc of these quasi-paths can be increased or decreased,
according to whether that arc is forward or backward, then an f-augmenting
path is obtained as soon as the sink t is labelled.
A frontier arc is an arc e directed from a labeled endpoint v to an unlabeled
endpoint w.
For constructing an f-augmenting path, the frontier arc e is also allowed to be
backward (directed from vertex w to vertex v), and it can be added to the tree as
long as it has slack $\Delta_e > 0$.
The discussion of f-augmenting paths culminating in the flow-augmenting
Proposition 12.2.1 provides the basis of a vertex-labeling strategy due to Ford
and Fulkerson that finds an f-augmenting path, when one exists.
Their labelling scheme is essentially basic tree-growing.
Terminology: At any stage during tree-growing for constructing an f-augmenting
path, let e be a frontier arc of tree T, with endpoints v and w.
The arc e is said to be usable if, for the current flow f, either
e is directed from vertex v to vertex w and f(e) < cap(e), or
e is directed from vertex w to vertex v and f(e) > 0.
Frontier arcs e1 and e2 are usable if
f(e1) < cap(e1) and f(e2) > 0
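The usability test for a frontier arc can be written as a small predicate. The sketch below assumes arcs are (tail, head) tuples and that `labeled` is the set of vertices already in the tree; both are illustrative choices rather than anything fixed by the text.

```python
def is_usable(arc, labeled, flow, cap):
    """Return True if `arc` is a usable frontier arc for the current flow."""
    tail, head = arc
    if tail in labeled and head not in labeled:
        # forward frontier arc: needs remaining capacity
        return flow[arc] < cap[arc]
    if head in labeled and tail not in labeled:
        # backward frontier arc: needs positive flow to cancel
        return flow[arc] > 0
    return False  # both or neither endpoint labeled: not a frontier arc


cap = {("s", "v"): 2, ("w", "s"): 2}
flow = {("s", "v"): 2, ("w", "s"): 1}
labeled = {"s"}
forward_ok = is_usable(("s", "v"), labeled, flow, cap)   # saturated forward arc
backward_ok = is_usable(("w", "s"), labeled, flow, cap)  # backward arc with flow
```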
Remark From this vertex-labeling scheme, any of the existing f-augmenting paths
could result. But the efficiency of Algorithm 12.2.1 is based on being able to find
„good“ augmenting paths.
If the arc capacities are irrational numbers, then a procedure using the
Ford-Fulkerson labeling scheme might not terminate (strictly speaking, it would
then not be an algorithm).
Finding an f-Augmenting Path
Even when flows and capacities are restricted to be integers, problems
concerning efficiency still exist.
E.g., if each flow augmentation were to increase the flow by only one unit, then
the number of augmentations required for maximization would equal the capacity
of a minimum cut.
Such an algorithm would depend on the size of the arc capacities instead of on
the size of the network.
Finding an f-Augmenting Path
Example: For the network shown below, the arc from vertex v to vertex w has
flow capacity 1, while the other arcs have capacity M, which could be made
arbitrarily large.
If the choice of the augmenting flow path at each iteration were to alternate
between the directed path $\langle s,v,w,t \rangle$ and the quasi-path $\langle s,w,v,t \rangle$, then the flow
would increase by only one unit at each iteration.
Thus, it could take as many as 2M iterations to obtain the maximum flow.
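The 2M bound can be checked with a short simulation. Since the figure is not part of this transcript, the arc set below (s to v, s to w, v to t, w to t with capacity M, and v to w with capacity 1) is an assumption inferred from the two paths named in the example.

```python
def count_bad_augmentations(M):
    """Alternate between path <s,v,w,t> and quasi-path <s,w,v,t>.

    Returns (number of augmentations, final flow value).
    """
    cap = {("s", "v"): M, ("s", "w"): M, ("v", "t"): M,
           ("w", "t"): M, ("v", "w"): 1}
    flow = {arc: 0 for arc in cap}
    # +1 marks a forward arc, -1 a backward arc on the quasi-path
    path_a = [(("s", "v"), +1), (("v", "w"), +1), (("w", "t"), +1)]
    path_b = [(("s", "w"), +1), (("v", "w"), -1), (("v", "t"), +1)]
    iterations = 0
    while True:
        path = path_a if iterations % 2 == 0 else path_b
        delta = min(cap[a] - flow[a] if d > 0 else flow[a] for a, d in path)
        if delta == 0:
            break  # the scheduled path is no longer augmenting
        for a, d in path:
            flow[a] += d * delta
        iterations += 1
    value = flow[("s", "v")] + flow[("s", "w")]
    return iterations, value


iterations, value = count_bad_augmentations(5)
```

Every augmentation is limited by the unit-capacity arc from v to w, so the number of iterations grows with M rather than with the size of the network.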
Finding an f-Augmenting Path
Edmonds and Karp avoid these
problems with this algorithm.
It uses breadth-first search
to find an f-augmenting path
with the least number of arcs.
FFEK algorithm: Ford, Fulkerson, Edmonds, and Karp
Algorithm 12.2.3 combines Algorithms 12.2.1 and 12.2.2
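The algorithm itself appears only as a figure on the slide, so the following is a minimal Python sketch of the idea rather than a transcription of Algorithm 12.2.3: breadth-first tree-growing over usable arcs, augmentation along the shortest f-augmenting path, and, once the sink can no longer be labeled, the set of labeled vertices as $V_s$ from the proof of Theorem 12.2.3. The function name and the dictionary encoding are assumptions.

```python
from collections import deque


def edmonds_karp(cap, s, t):
    """cap maps arcs (u, v) to capacities. Returns (value, flow, Vs)."""
    flow = {arc: 0 for arc in cap}
    adj = {}
    for (u, v) in cap:
        adj.setdefault(u, []).append((v, (u, v), +1))  # arc used forward
        adj.setdefault(v, []).append((u, (u, v), -1))  # arc used backward
    value = 0
    while True:
        # Grow a BFS tree of quasi-paths from s over usable arcs.
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            x = queue.popleft()
            for y, arc, d in adj.get(x, []):
                slack = cap[arc] - flow[arc] if d > 0 else flow[arc]
                if y not in pred and slack > 0:
                    pred[y] = (x, arc, d)
                    queue.append(y)
        if t not in pred:
            # No f-augmenting path: the labeled vertices form Vs.
            return value, flow, set(pred)
        # Trace the path back from t and augment by its minimum slack.
        path = []
        y = t
        while pred[y] is not None:
            x, arc, d = pred[y]
            path.append((arc, d))
            y = x
        delta = min(cap[a] - flow[a] if d > 0 else flow[a] for a, d in path)
        for a, d in path:
            flow[a] += d * delta
        value += delta


M = 1000
cap = {("s", "v"): M, ("s", "w"): M, ("v", "t"): M,
       ("w", "t"): M, ("v", "w"): 1}
value, flow, Vs = edmonds_karp(cap, "s", "t")
cut_capacity = sum(c for (u, v), c in cap.items() if u in Vs and v not in Vs)
```

On the example network with the unit-capacity arc, breadth-first search never routes flow through the arc from v to w, and two augmentations suffice regardless of M.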
Example: The figures illustrate Algorithm 12.2.3.
Shown is the s-t cut with capacity equal to
the current flow, establishing optimality.
FFEK algorithm: Ford, Fulkerson, Edmonds, and Karp
At the end of the final iteration, the arc directed from source s to vertex w and the
arc directed from vertex v to sink t are the only frontier arcs of the tree T,
but neither is usable.
These two arcs form the minimum cut $\langle \{s,x,y,z,v\}, \{w,a,b,c,t\} \rangle$.
This illustrates the s-t cut that was constructed in the proof of theorem 12.2.3.
Determining the connectivity of a graph
In this section, we use the theory of network flows to give constructive proofs of
Menger‘s theorem.
These proofs lead directly to algorithms for determining the edge-connectivity and
vertex-connectivity of a graph.
The strategy to prove Menger‘s theorems is based on properties of certain
networks whose arcs all have unit capacity.
These 0-1 networks are constructed from the original graph.
Determining the connectivity of a graph
Lemma 12.3.1. Let N be an s-t network such that
outdegree(s) > indegree(s),
indegree(t) > outdegree (t), and
outdegree(v) = indegree(v) for all other vertices v.
Then, there exists a directed s-t path in network N.
Proof. Let W be a longest directed trail (trail = walk without repeated edges; path = trail
without repeated vertices) in network N that starts at source s, and let z be its terminal
vertex.
If vertex z were not the sink t, then there would be an arc not in trail W that is directed from
z (since indegree(z) = outdegree(z) ).
But this would contradict the maximality of trail W.
Thus, W is a directed trail from source s to sink t.
If W has a repeated vertex, then part of W determines a directed cycle, which can be
deleted from W to obtain a shorter directed s-t trail.
This deletion step can be repeated until no repeated vertices remain, at which point, the
resulting directed trail is an s-t path. □
Determining the connectivity of a graph
Proposition 12.3.2. Let N be an s-t network such that
outdegree(s) - indegree(s) = m = indegree(t) - outdegree(t),
and outdegree(v) = indegree(v) for all vertices $v \neq s,t$.
Then there exist m arc-disjoint directed s-t paths in network N.
Proof. If m = 1, then there exists an open eulerian directed trail T from source s to
sink t by Theorem 6.1.3.
Review: An eulerian trail in a graph is a trail that contains every edge of that graph.
Theorem 6.1.3. A connected digraph D has an open eulerian trail from vertex x to vertex y if and only if
indegree(x) + 1 = outdegree(x), indegree(y) = outdegree(y) + 1, and all vertices except x and y have equal
indegree and outdegree.
Theorem 1.5.2. Every open x-y walk W is either an x-y path or can be reduced to an x-y path.
Therefore, trail T is either an s-t directed path or can be reduced to an s-t path.
Determining the connectivity of a graph
By way of induction, assume that the assertion is true for m = k, for some $k \geq 1$,
and consider a network N for which the condition holds for m = k +1.
There exists a directed s-t path P by Lemma 12.3.1.
If the arcs of path P are deleted from network N, then the resulting network N - P
satisfies the condition of the proposition for m = k.
By the induction hypothesis, there exist k arc-disjoint directed s-t paths in network
N - P. These k paths together with path P form a collection of k + 1 arc-disjoint
directed s-t paths in network N. □
Basic properties of 0-1 networks
Definition A 0-1 network is a capacitated network whose arc capacities are either 0
or 1.
Proposition 12.3.3. Let N be an s-t network such that cap(e) = 1 for every arc e.
Then the value of a maximum flow in network N equals the maximum number of
arc-disjoint directed s-t paths in N.
Proof: Let f* be a maximum flow in network N, and let r be the maximum number of
arc-disjoint directed s-t paths in N.
Consider the network N* obtained by deleting from N all arcs e for which f*(e) = 0.
Then f*(e) = 1 for all arcs e in network N*.
It follows from the definition that for every vertex v in network N*,
\[
\sum_{e \in Out(v)} f^*(e) = |Out(v)| = outdegree(v)
\quad \text{and} \quad
\sum_{e \in In(v)} f^*(e) = |In(v)| = indegree(v).
\]
Basic properties of 0-1 networks
Thus by the definition of val(f*) and by the conservation-of-flow property,
outdegree(s) - indegree(s) = val(f*) = indegree(t) - outdegree(t)
and outdegree(v) = indegree(v), for all vertices $v \neq s,t$.
By Proposition 12.3.2, there are val(f*) arc-disjoint s-t paths in network N*, and
hence also in N, which implies that $val(f^*) \leq r$.
To obtain the reverse inequality, let $\{P_1, P_2, \ldots, P_r\}$ be a largest collection of arc-disjoint directed s-t paths in N, and consider the function $f: E_N \to \mathbb{R}^+$ defined by
\[
f(e) = \begin{cases}
1, & \text{if some path } P_i \text{ uses arc } e \\
0, & \text{otherwise}
\end{cases}
\]
Then f is a feasible flow in network N, with val(f) = r.
It follows that $val(f^*) \geq r$. □
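Proposition 12.3.3 suggests a direct way to count arc-disjoint paths: run a maximum-flow computation with unit capacities. The sketch below uses a simple depth-first search for augmenting paths instead of the breadth-first variant; the function name and the arc-list input are assumptions for illustration, and the arcs are assumed to be distinct.

```python
def max_arc_disjoint_paths(arcs, s, t):
    """Maximum number of arc-disjoint directed s-t paths (unit capacities)."""
    arcs = list(arcs)
    flow = {a: 0 for a in arcs}
    adj = {}
    for (u, v) in arcs:
        adj.setdefault(u, []).append((v, (u, v), +1))  # forward use
        adj.setdefault(v, []).append((u, (u, v), -1))  # backward use

    def find_path(x, seen):
        # Depth-first search for an f-augmenting quasi-path from x to t.
        if x == t:
            return True
        seen.add(x)
        for y, a, d in adj.get(x, []):
            usable = flow[a] == 0 if d > 0 else flow[a] == 1
            if y not in seen and usable and find_path(y, seen):
                flow[a] += d  # augment by one unit on the way back
                return True
        return False

    count = 0
    while find_path(s, set()):
        count += 1
    return count


arcs = [("s", "v"), ("s", "w"), ("v", "t"), ("w", "t"), ("v", "w")]
n_paths = max_arc_disjoint_paths(arcs, "s", "t")
```

Each successful search raises the flow value by exactly one, so the final count equals val(f*), which by the proposition is the number of arc-disjoint paths.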
Separating Sets and Cuts
Review from §5.3
Let s and t be distinct vertices in a graph G. An s-t separating edge set in G is a
set of edges whose removal destroys all s-t paths in G.
Thus, an s-t separating edge set in G is an edge subset of EG that contains at least
one edge of every s-t path in G.
Definition: Let s and t be distinct vertices in a digraph D.
An s-t separating arc set in D is a set of arcs whose removal destroys all directed
s-t paths in D.
Thus, an s-t separating arc set in D is an arc subset of ED that contains at least one
arc of every directed s-t path in digraph D.
Remark: For the degenerate case in which the original graph or digraph has no
s-t paths, the empty set is regarded as an s-t separating set.
Separating Sets and Cuts
Proposition 12.3.4 Let N be an s-t network such that cap(e) = 1 for every arc e.
Then the capacity of a minimum s-t cut in network N equals the minimum number of
arcs in an s-t separating arc set in N.
Proof: Let $K^* = \langle V_s, V_t \rangle$ be a minimum s-t cut in network N, and let q be the
minimum number of arcs in an s-t separating arc set in N.
Since K* is an s-t cut, it is also an s-t separating arc set. Thus $cap(K^*) \geq q$.
To obtain the reverse inequality, let S be an s-t separating arc set in network N
containing q arcs, and let R be the set of all vertices in N that are reachable from
source s by a directed path that contains no arc from set S.
Then, by the definitions of arc set S and vertex set R, $t \notin R$, which means that
$\langle R, V_N - R \rangle$ is an s-t cut.
Moreover, $\langle R, V_N - R \rangle \subseteq S$. Therefore
\[
\begin{aligned}
cap(K^*) &\leq cap\langle R, V_N - R \rangle && \text{since } K^* \text{ is a minimum } s\text{-}t \text{ cut} \\
&= |\langle R, V_N - R \rangle| && \text{since all capacities are 1} \\
&\leq |S| && \text{since } \langle R, V_N - R \rangle \subseteq S \\
&= q
\end{aligned}
\]
which completes the proof. □
Arc and Edge Versions of Menger’s Theorem Revisited
Theorem 12.3.5 [Arc form of Menger‘s theorem]
Let s and t be distinct vertices in a digraph D. Then the maximum number of arc-
disjoint directed s-t paths in D is equal to the minimum number of arcs in an s-t
separating set of D.
Proof: Let N be the s-t network obtained by assigning a unit capacity to each arc of
digraph D. Then the result follows from Propositions 12.3.3. and 12.3.4., together
with the max-flow min-cut theorem. □
Transcriptional – Gene regulation networks
The machine that transcribes a
gene is composed of perhaps 50
proteins, including RNA
polymerase, the enzyme that
converts DNA code into RNA
code.
A crew of transcription factors
grabs hold of the DNA just above
the gene at a site called the core
promoter, while associated
activators bind to enhancer regions
farther upstream of the gene to rev
up transcription.
http://www.berkeley.edu/news/features/1999/12/09_nogales.html
Working as a tightly knit machine, these
proteins transcribe a single gene into
messenger RNA. The messenger RNA
winds its way out of the nucleus to the
factories that produce proteins, where it
serves as a blueprint for production of a
specific protein.
Transcription in E. coli and in Eukaryotes
Prokaryotes:
- Genes are grouped into operons.
- mRNA may contain the transcript of several genes (poly-cistronic).
- Transcription and translation are coupled. The transcript is translated already during transcription.
- Gene regulation takes place by modification of the transcription rate.
Eukaryotes:
- Genes are not grouped in operons.
- Each mRNA contains only the transcript of a single gene (mono-cistronic).
- Transcription and translation are NOT coupled. Transcription takes place in the nucleus, translation in the cytosol.
- Gene regulation via the transcription rate AND by RNA processing, RNA stability etc.
Promoter prediction in E.coli
To analyze E.coli promoters, one may align a set of promoter sequences by the
position that marks the known transcription start site (TSS) and search for
conserved regions in the sequences.
E.coli promoters are found to contain 3 conserved sequence features
- a region approximately 6 bp long with consensus TATAAT at position -10
- a region approximately 6 bp long with consensus TTGACA at position -35
- a distance between these 2 regions of ca. 17 bp that is relatively constant
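The three features above translate into a very simple scan: look for a near match to TTGACA followed, after a spacer of about 17 bp, by a near match to TATAAT. The sketch below is a toy illustration only; the mismatch tolerance, the allowed spacer lengths, and the function names are assumptions, and the scan does not use the transcription start site.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=1, spacers=(16, 17, 18)):
    """Return (start of -35 box, start of -10 box, spacer length) triples."""
    box35, box10 = "TTGACA", "TATAAT"
    hits = []
    for i in range(len(seq) - 5):
        if hamming(seq[i:i + 6], box35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap  # start of the candidate -10 box
            if j + 6 <= len(seq) and hamming(seq[j:j + 6], box10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


# Synthetic sequence with both consensus boxes separated by 17 bp.
seq = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
hits = find_promoter_candidates(seq)
```

Real promoters deviate from the consensus more often than this tolerance allows, which is one reason weight matrices are preferred over consensus matching.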
Gene regulatory promoter network
In E.coli, 240 transcription factors have been verified that regulate 3000 genes.
Binding site matrices are available for more than 55 E.coli TFs
(Robison et al. 1998)
In S. cerevisiae, genome-wide binding analysis of 106 transcription factors
indicates that more than one-third of the promoter regions that were bound by
regulators were bound by 2 or more regulators.
Highly connected network of transcriptional regulators.
Feasibility of computational motif search?
Computational identification of transcription factor binding sites is difficult
because they consist of short, degenerate sequences that occur frequently by
chance.
The problem is not easy to define (therefore: it is „complex“) because
- the motif is of unknown size
- the motif might not be well conserved between promoters
- the sequences used to search for the motif do not necessarily represent the
complete promoter
- genes with promoters to be analyzed are in many cases grouped together by a
clustering algorithm which has its own limitations.
Strategy 1
Arrival of microarray gene-expression data.
For a group of genes with a similar expression profile (e.g. those that are activated at
the same time in the cell cycle), one may assume that this profile is, at least
partly, caused by and reflected in a similar structure of the regions involved in
transcription regulation.
Search for common motifs in < 1000 base upstream regions.
So far used: detection of single motifs (representing transcription-factor binding
sites) common to the promoter sequences of putatively co-regulated genes.
Better: search for simultaneous occurrence of 2 or more sites at a given distance
interval! The search becomes more sensitive.
Motif identification
A flowchart to illustrate the two
different approaches for motif
identification. We analyzed 800
bp upstream from the translation
start sites of the five genes from
the yeast gene family PHO by
the publicly available systems
MEME (alignment) and RSA
(exhaustive search). MEME was
run on both strands, one
occurrence per sequence mode,
and found the known motif
ranked as second best. RSA
Tools was run with oligo size 6
and noncoding regions as
background, as set by the demo
mode of the system. The well-
conserved heptamer of the
motifs used by MEME to build
the weight matrix is printed in
bold. Ohler, Niemann Trends Gen 17, 2 (2001)
Strategy 2: Exhaustive motif search in upstream regions
Exploit the finding that relevant motifs are often repeated many times,
possibly with small variations, in the upstream region for the regulatory action to
be effective.
Search upstream region for overrepresented motifs
(1) Group genes based on the overrepresented motifs
(2) Analyze sets of genes that share motifs for coregulation in microarray exp.
(3) Consider overrepresented motifs labelling sets of co-regulated genes as
candidate binding sites.
Cora et al. BMC Bioinformatics 5, 57 (2004)
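The first step, finding overrepresented motifs, can be illustrated by comparing k-mer frequencies in a set of upstream regions against a background set. This is a deliberately simplified sketch and not the statistical procedure of Cora et al.; the frequency-ratio criterion, the add-one smoothing, and all names are assumptions.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count all overlapping k-mers in a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def overrepresented(upstream, background, k=6, min_ratio=2.0):
    """Return {k-mer: frequency ratio} for k-mers enriched in `upstream`."""
    fg, bg = kmer_counts(upstream, k), kmer_counts(background, k)
    fg_total, bg_total = sum(fg.values()), sum(bg.values())
    result = {}
    for kmer, c in fg.items():
        fg_freq = c / fg_total
        # add-one smoothing so unseen background k-mers do not divide by zero
        bg_freq = (bg[kmer] + 1) / (bg_total + 1)
        if fg_freq / bg_freq >= min_ratio:
            result[kmer] = fg_freq / bg_freq
    return result


upstream = ["ACGTGACGTGACGTG"]
background = ["ACACACACACACACACACACAC"]
enriched = overrepresented(upstream, background, k=5)
```

A real analysis would assess significance with a proper statistical model rather than a fixed ratio threshold.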
Position-specific weight matrix
Popular approach when a list of genes that share a TF binding motif and a
good multiple sequence alignment are available.
Alignment matrix: lists the number of occurrences of
each letter at each position of an alignment
Hertz, Stormo (1999) Bioinformatics 15, 563
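An alignment matrix and a weight matrix derived from it can be computed as follows. The log-odds form with a pseudocount used here is one common choice and is stated as an assumption, as are the uniform background, the example sites, and the function names.

```python
import math


def alignment_matrix(sites, alphabet="ACGT"):
    """Count occurrences of each letter at each position of aligned sites."""
    length = len(sites[0])
    return {b: [sum(site[i] == b for site in sites) for i in range(length)]
            for b in alphabet}


def weight_matrix(counts, background=None, pseudocount=1.0):
    """Log-odds weights: ln of (smoothed frequency / background frequency)."""
    alphabet = list(counts)
    n = sum(counts[b][0] for b in alphabet)  # number of aligned sites
    background = background or {b: 1.0 / len(alphabet) for b in alphabet}
    length = len(counts[alphabet[0]])
    return {
        b: [math.log(((counts[b][i] + pseudocount * background[b])
                      / (n + pseudocount)) / background[b])
            for i in range(length)]
        for b in alphabet
    }


def score(seq, weights):
    """Sum of position-specific weights for a sequence of matrix length."""
    return sum(weights[b][i] for i, b in enumerate(seq))


sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
counts = alignment_matrix(sites)
weights = weight_matrix(counts)
```

Scoring a candidate site then amounts to summing one weight per position, so a sequence close to the aligned sites scores higher than an unrelated one.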
Position-specific weight matrix
Examples of matrices used by YRSA
http://forkhead.cgb.ki.se/YRSA/matrixlist.html
Exp. Identification of TF binding site: DNase I Footprinting
A protein bound to a specific DNA sequence
will interfere with the digestion of that region by
DNase I.
An end-labelled DNA probe is incubated with a
protein extract or a purified DNA-binding factor.
The unprotected DNA is then partially digested
with DNase I such that on average every DNA
molecule is cut once.
Digestion products are then resolved by
electrophoresis.
Comparison of the DNase I digestion pattern in
the presence and absence of protein allows
the identification of a footprint (protected
region).
[Figure: denaturing PAGE of the digestion products, with the footprint marked]
Gel Shifts: Electrophoretic Mobility Shift Assay (EMSA), Band Shift, Gel retardation assays
A purified protein, or a complex
mixture of proteins (e.g. nuclear or cell extract),
is incubated with a 32P end-labelled DNA fragment
containing the putative protein binding site
(from the promoter region).
Reaction products are then analysed on a non-
denaturing polyacrylamide gel.
The specificity of the DNA-binding protein for
the putative binding site is established by
competition experiments using DNA fragments
or oligonucleotides containing a binding site for
the protein of interest, or other unrelated DNA
sequences.
[Figure: non-denaturing PAGE without and with added protein, showing the free DNA probe and a band with retarded mobility due to protein binding]
http://www.rcsb.org
3D structures of transcription factors
1A02.pdb 1AM9.pdb 1AU7.pdb
1CIT.pdb 1GD2.pdb 1H88.pdb
TFs bind with very
different binding modes.
Some are sensitive
for DNA conformation.
2 TFs bound!
Database for eukaryotic transcription factors: TRANSFAC
BIOBase / TU Braunschweig / GBF
Relational database, 6 flat files:
- FACTOR: interaction of TFs
- SITE: their DNA binding sites
- GENE: the target genes they regulate through these sites
- CELL: factor source
- MATRIX: TF nucleotide weight matrices
- CLASS: classification scheme of TFs
Wingender et al. (1998) J Mol Biol 284, 241
Matys et al. (2003) Nucl Acid Res 31,374
TRANSFAC classification
1 Superclass: basic domains
1.1 Leucine zipper factors (bZIP)
1.2 Helix-loop-helix factors (bHLH)
1.3 bHLH-bZIP
1.4 NF-1
1.5 RF-X
1.6 bHSH
2 Superclass: Zinc-coordinating DNA-binding domains
2.1 Cys4 zinc finger of nuclear receptor type
2.2 diverse Cys4 zinc fingers
2.3 Cys2His2 zinc finger domains
2.4 Cys6 cysteine-zinc cluster
2.5 Zinc fingers of alternating composition
3 Superclass: Helix-turn-helix
4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
5 Superclass: others
http://www.gene-regulation.com/pub/databases/transfac/cl.html
Summary
http://www.gene-regulation.com
Large databases available (e.g. TRANSFAC) with information about promoter sites.
Information verified experimentally.
Microarray data allow searching for common motifs of coregulated genes.
Also possible: common GO annotation etc.
TF binding motifs are frequently overrepresented in the 1000 bp upstream region.
The function of this overrepresentation is not clearly understood.
(Same as in proline-rich recognition sequences.)
Relatively few TFs regulate a large number of genes.
Complex regulatory network: see next lecture(s).
additional slides
Arc and Edge Versions of Menger’s Theorem Revisited
Remark The edge form of Menger's theorem for undirected graphs follows
directly from the next two assertions concerning the relationship between a graph
G and the digraph $\vec{G}$ obtained by replacing each edge e of graph G with a pair
of oppositely directed arcs having the same endpoints as edge e.
Each of these assertions follows directly from the definitions.
Assertion 12.3.6. Let s and t be distinct vertices of a graph G, and let $\vec{G}$ be the
digraph obtained by replacing each edge e of G with a pair of oppositely directed
arcs having the same endpoints as edge e.
Then there is a one-to-one correspondence between the s-t paths in graph G and
the directed s-t paths in digraph $\vec{G}$.
Moreover, two s-t paths in graph G are edge-disjoint if and only if their
corresponding directed s-t paths in digraph $\vec{G}$ are arc-disjoint.
Assertion 12.3.7. Let s and t be distinct vertices of a graph G, and let $\vec{G}$ be defined
as above. Then the minimum number of edges in an s-t separating set of graph G
is equal to the minimum number of arcs in an s-t separating arc set of digraph $\vec{G}$.
Arc and Edge Versions of Menger's Theorem Revisited
Theorem 12.3.8 [Edge form of Menger's theorem]. Let s and t be distinct vertices in
a graph G. Then the maximum number of edge-disjoint s-t paths in G equals the
minimum number of edges in an s-t separating edge set of graph G.
Proof: This is an immediate consequence of Assertions 12.3.6 and 12.3.7, together
with the arc form of Menger's theorem (Theorem 12.3.5).
Review from §5.1 The edge-connectivity $\kappa_e(G)$ is the size of a smallest edge-cut in
graph G.
Definition Let s and t be distinct vertices in a graph G.
The local edge-connectivity between vertices s and t, denoted $\kappa_e(s,t)$, is the
minimum number of edges in an s-t separating edge set in G.
Determining Edge-Connectivity Using Network Flows
Proposition 12.3.9 The edge-connectivity of a graph G is equal to the minimum of
the local edge-connectivities, taken over all pairs of vertices s and t. That is:
\[
\kappa_e(G) = \min_{s,t \in V_G} \kappa_e(s,t)
\]
Proposition 12.3.9 and Theorem 12.3.8 suggest an algorithm for determining the
edge-connectivity $\kappa_e(G)$ of an arbitrary graph G.
The algorithm calculates the local edge-connectivity between each pair of vertices
in G, by solving an appropriate maximum flow problem in the network $\vec{G}$.
In fact, as the next two results show, it is not necessary to calculate the local edge-
connectivity between every pair of vertices.
Determining Edge-Connectivity Using Network Flows
Proposition 12.3.10. Let $\langle V_1, V_2 \rangle$ be a partition-cut of minimum cardinality in a graph
G, and let $v_1$ and $v_2$ be any vertices in $V_1$ and $V_2$, respectively.
Then the edge-connectivity $\kappa_e(G)$ equals the local edge-connectivity $\kappa_e(v_1,v_2)$.
Proof: Suppose that the minimum local edge-connectivity is achieved between
vertices x and y. Then $\kappa_e(G) = \kappa_e(x,y)$ by Proposition 12.3.9.
It suffices to show that $\kappa_e(v_1,v_2) \leq \kappa_e(x,y)$.
Let $\vec{G}$ be the digraph obtained by replacing each edge of graph G with two
oppositely directed arcs.
Then $\vec{G}$ can be regarded as a $v_1$-$v_2$ capacitated network $\vec{G}_{v_1 v_2}$ and as an x-y
capacitated network $\vec{G}_{xy}$, where each arc is assigned unit capacity.
Let K* be a minimum $v_1$-$v_2$ cut in network $\vec{G}_{v_1 v_2}$.
It follows that $cap(K^*) \leq cap\langle V_1, V_2 \rangle$, since the partition-cut $\langle V_1, V_2 \rangle$ corresponds to
a $v_1$-$v_2$ cut in network $\vec{G}_{v_1 v_2}$.
Next, let f* be a maximum flow and $\langle V_x, V_y \rangle$ a minimum x-y cut in the x-y network $\vec{G}_{xy}$,
so that $cap\langle V_x, V_y \rangle = val(f^*)$. Then
\[
\begin{aligned}
\kappa_e(v_1,v_2) &= cap(K^*) && \text{Proposition 12.3.4 and Assertion 12.3.7} \\
&\leq cap\langle V_1, V_2 \rangle && \langle V_1, V_2 \rangle \text{ corresponds to a cut in } \vec{G}_{v_1 v_2} \\
&= |\langle V_1, V_2 \rangle| && \text{all arcs have unit capacity} \\
&\leq |\langle V_x, V_y \rangle| && \langle V_x, V_y \rangle \text{ corresponds to a partition-cut in } G \\
&= cap\langle V_x, V_y \rangle && \text{all arcs have unit capacity} \\
&= val(f^*) && \text{max-flow min-cut} \\
&= \kappa_e(x,y) && \text{Proposition 12.3.3 and Theorem 12.3.5}
\end{aligned}
\]
□
Arc and Edge Versions of Menger’s Theorem Revisited
Corollary 12.3.11 Let s be any vertex in a graph G. Then
\[
\kappa_e(G) = \min_{t \in V_G - \{s\}} \kappa_e(s,t)
\]
Proof: Let $\langle V_1, V_2 \rangle$ be a partition-cut of minimum cardinality, and suppose, without
loss of generality, that vertex $s \in V_1$. There must be some vertex $t \in V_2$ (otherwise,
$E_G = \emptyset$, and the assertion would be trivially true).
By Proposition 12.3.10 it follows that $\kappa_e(G) = \kappa_e(s,t)$. □
The variable $\kappa_e$ used in the next algorithm represents the edge-connectivity of
graph G and is initialized with the sufficiently large positive integer $|E_G|$.
Arc and Edge Versions of Menger’s Theorem Revisited
Algorithm 12.3.1 requires O(n) iterations, and since Algorithm 12.2.3 requires
$O(n|E|^2)$ computations, the overall complexity of Algorithm 12.3.1 is $O(n^2|E|^2)$.
More efficient algorithms exist.
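Algorithm 12.3.1 is shown only as a figure, so the following Python sketch reconstructs the idea from Corollary 12.3.11: fix one vertex s, compute the local edge-connectivity to every other vertex by a unit-capacity maximum flow in the bidirected digraph, and keep the minimum. The function names are assumptions, the graph is assumed to be simple with each edge listed once, and a depth-first augmenting search stands in for Algorithm 12.2.3.

```python
def local_edge_connectivity(edges, s, t):
    """Max number of edge-disjoint s-t paths via unit-capacity max flow."""
    # Replace each edge by two oppositely directed unit-capacity arcs.
    arcs = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    flow = {a: 0 for a in arcs}
    adj = {}
    for (u, v) in arcs:
        adj.setdefault(u, []).append((v, (u, v), +1))
        adj.setdefault(v, []).append((u, (u, v), -1))

    def find_path(x, seen):
        if x == t:
            return True
        seen.add(x)
        for y, a, d in adj.get(x, []):
            usable = flow[a] == 0 if d > 0 else flow[a] == 1
            if y not in seen and usable and find_path(y, seen):
                flow[a] += d
                return True
        return False

    value = 0
    while find_path(s, set()):
        value += 1
    return value


def edge_connectivity(vertices, edges):
    """Minimum over t != s of the local edge-connectivity (Cor. 12.3.11)."""
    vertices = list(vertices)
    s = vertices[0]
    kappa = len(edges)  # initial value |E_G|, as in the text
    for t in vertices[1:]:
        kappa = min(kappa, local_edge_connectivity(edges, s, t))
    return kappa


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
k_cycle = edge_connectivity(range(4), cycle)
```

Only n - 1 maximum-flow computations are needed, which is where the O(n) iteration count in the complexity estimate comes from.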
Using Network Flows to Prove the Vertex Forms of Menger’s Theorem
Construction of Digraph $N_D$ from Digraph D.
Let s and t be any pair of non-adjacent vertices in a digraph D.
The digraph $N_D$ is obtained from digraph D as follows:
- each vertex $x \in V_D - \{s,t\}$ corresponds to two vertices x' and x'' in digraph $N_D$ and
an arc directed from x' to x''.
- each arc in digraph D that is directed from vertex s to a vertex $x \in V_D - \{s,t\}$
corresponds to an arc in $N_D$ directed from s to x'.
- each arc in D that is directed from a vertex $x \in V_D - \{s,t\}$ to vertex t corresponds
to an arc in $N_D$ directed from x'' to t.
- each arc in D that is directed from a vertex $x \in V_D - \{s,t\}$ to a vertex $y \in V_D - \{s,t\}$
corresponds to an arc in $N_D$ directed from x'' to y'.
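The construction above is a vertex-splitting transformation and can be sketched directly. In this illustration x' and x'' are encoded as the tuples (x, "in") and (x, "out"), which is an assumed representation; arcs entering s or leaving t are skipped because the construction does not list them.

```python
def split_vertices(arcs, s, t):
    """Build the arc list of N_D from the arc list of digraph D."""
    def in_copy(x):   # x' receives the arcs that entered x
        return x if x in (s, t) else (x, "in")

    def out_copy(x):  # x'' emits the arcs that left x
        return x if x in (s, t) else (x, "out")

    internal = {x for arc in arcs for x in arc} - {s, t}
    # One arc x' -> x'' per internal vertex.
    new_arcs = [((x, "in"), (x, "out")) for x in sorted(internal)]
    for u, v in arcs:
        if v == s or u == t:
            continue  # not part of the construction (assumption: ignored)
        new_arcs.append((out_copy(u), in_copy(v)))
    return new_arcs


arcs = [("s", "a"), ("a", "b"), ("b", "t")]
nd_arcs = split_vertices(arcs, "s", "t")
```

Because every path through an internal vertex x must use the single arc from x' to x'', arc-disjoint paths in the new digraph correspond to internally disjoint paths in D.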
Using Network Flows to Prove the Vertex Forms of Menger’s Theorem
Review from §5.3: Let s and t be a pair of non-adjacent vertices in a graph G (or
digraph D). An s-t separating vertex set in G (or in D) is a set of vertices whose
removal destroys all s-t paths in G (or all directed s-t paths in D).
Thus, an s-t separating vertex set is a set of vertices that contains at least one
internal vertex of every (directed) s-t path.
Definition Two (directed) s-t paths in a digraph D are internally disjoint if they
have no internal vertices in common.
Relationship between digraphs D and $N_D$
Assertion 12.3.12 There is a one-to-one correspondence between directed s-t paths
in digraph D and directed s-t paths in digraph $N_D$.
Assertion 12.3.13 Two directed s-t paths in D are internally disjoint if and only if
their corresponding directed s-t paths in $N_D$ are arc-disjoint.
Assertion 12.3.14 The maximum number of internally disjoint directed s-t paths in D
is equal to the maximum number of arc-disjoint directed s-t paths in $N_D$.
Assertion 12.3.15 The minimum number of vertices in an s-t separating vertex set
in digraph D is equal to the minimum number of arcs in an s-t separating arc set in
digraph $N_D$.
Relationship between digraphs D and $N_D$
Theorem 12.3.16 [Vertex Form of Menger for Digraphs]
Let s and t be a pair of non-adjacent vertices in a digraph D.
Then the maximum number of internally disjoint directed s-t paths in D is equal to
the minimum number of vertices in an s-t separating vertex set in D.
Proof: This follows from Assertions 12.3.12 through 12.3.15 together with the arc
form of Menger‘s theorem (theorem 12.3.5).
Theorem 12.3.17 [Vertex Form of Menger for Undirected Graphs].
Let s and t be a pair of non-adjacent vertices in a graph G.
Then the maximum number of internally disjoint s-t paths in G is equal to the
minimum number of vertices in an s-t separating vertex set in G.
Proof: This follows from Theorem 12.3.16 and Assertions 12.3.6 and 12.3.7.
Determining Vertex-Connectivity using Network Flow
Review from §5.3: Let s and t be non-adjacent vertices of a connected graph G.
Then the local vertex-connectivity between s and t, denoted $\kappa_v(s,t)$, is the
minimum number of vertices in an s-t separating vertex set.
Lemma 5.3.5. Let G be a connected graph containing at least one pair of non-
adjacent vertices. Then the vertex-connectivity $\kappa_v(G)$ is the minimum of the local
vertex-connectivity $\kappa_v(s,t)$, taken over all pairs of non-adjacent vertices s and t.
The following algorithm, with $O(n|E_G|^3)$ time-complexity, computes the vertex-
connectivity of an n-vertex graph by calculating the local vertex-connectivity
between various pairs of non-adjacent vertices.
As in algorithm 12.3.1, it is not necessary to calculate the local vertex-connectivity
between each pair.