a simpler minimum spanning tree verification algorithm
Post on 02-Jan-2016
25 Views
Preview:
DESCRIPTION
TRANSCRIPT
1
Valerie KingAlgorithmica, 18:263–270, 1997
D94922004 陳冠宇 D96945017 林振慶R96922013 黃于鳴 R96922028 廖鳳玉R96922032 張瀠文 R96922035 曾照涵
2008/05/22
2
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
3
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
4
Valerie KingUniversity of Victoria.Research interests:
Randomized algorithms and data structures.
Applications to networks and computational biology.
Distributed computing.
5
AbstractProblem:
Determine whether a given spanning tree in a graph is a minimal spanning tree.
In 1984, Komlós presented an algorithm:A linear number of comparisons.But nonlinear overhead to determine which
comparisons to make.This paper simplifies Komlós’s algorithm, and gives
a linear time procedure using table lookup in the unit cost RAM model.
6
Related Works(1/2)Tarjan (1979)
Path compression on balanced trees.O(m α(m,n)), almost linear running time.
α(m,n) is a very slowly growing function.
O(m) storage space.Komlós (1984)
O(m) binary comparisons between edge costs.Non-linear time to determine which comparisons to
make.
7
Related Works(2/2)Dixon, Rauch, and Tarjan (1992)
O(m), linear time algorithm.Combines Tarjan’s almost-linear-time algorithm and Komlós’s
algorithm, with a preprocessing and table-look-up method for small subproblems. Decompose the tree into a large subtree and many microtrees. Path compression (Tarjan 1979) is used on the large subtree. The comparison decision tree needed to implement Komlós’s
strategy for each possible microtree is precomputed and stored in a table. Each microtree with its query paths is encoded. The table is used to loop up the appropriate comparisons to make.
Model of computation: binary comparison, addition, substraction on edge cost on unit-cost random-access machine.
8
Definitions(1/3)A graph G = (V, E) with n nodes and m edges.A path of length k from x to y in G is a sequence of
edges {x,v1},{v1,v2},…,{vk-1,y}.
A tree T = (V, E) is a graph such that T is connected and contains no cycles.
T is a spanning tree if a tree is a subgraph of G with the same vertex set as T.
9
Definitions(2/3)G = (V,EG) a graph with w(e) on its edges.
T = (V,ET) is a spanning tree of G.
T is a minimum spanning tree if is minimum among all spanning trees of G.
TEe
ew )(
10
Definitions(3/3)T(x,y): the set of edges in the path in a tree T from
node x to node y.In a tree T, there is a unique path from x to y.
A rooted tree is a tree with a distinguished node called root.
A Full branching tree is a rooted tree with all leaves on the same level and each internal node having at least two children.
B(x,y): the set of edges in the path in a full branching tree B from leaf x to leaf y.
11
Key Observation(1/4)A spanning tree is a minimum spanning tree iff the
weight of each non-tree edge {u, v} is at least the weight of the heaviest edge in the path in the tree between u and v.
}),(},{),({max),(
,},{ edgeeach for minimum is
yxTvuvuwyxw
EEyxT TG
12
Key Observation(2/4)
}),(},{),({max),( ,},{ edgeeach for
minimum is
yxTvuvuwyxwEEyx
T
TG 4 3
52 1 4
12 3
1
1
2
x
y
T(x,y)
2
weighttalsmaller towith
' treespanning a having
},,{ with },{ replacecan we
),,(maxarg},{ where
),,(),( s.t.
},{ edge If
),(},{
T
badc
vuwdc
dcwbaw
EEba
baTvu
TG
a
b
13
Key Observation(3/4)
minimum is
}),(},{),({max),( ,},{ edgeeach For
T
yxTvuvuwyxwEEyx TG
x a
y b
4 3
52 1 4
12 3
1
1
2
),(max),( s.t. },{
sum weight edgesmaller has },{},{ s.t.
),,(},{ edge- treeand
},{ edge-tree-non
MST, anot is )',( If
),(),(vuwyxwEEyx
yxbaT
yxTba
EEyx
EVT
dcTvuTG
TG
14
Key Observation(4/4)These verification methods use this fact.Find the heaviest edge in each such path for each
non-tree edge {u,v} in the graph.Compare the weight of {u,v} to it.
1
2 3 2
3u v
u
3
3
15
Tree Path ProblemFinding the heaviest edges in the paths between
specified pairs of nodes (“query paths”).
4
2
3
1 2
16
Main Ideas(1/2)T is a spanning tree.
A simple O(n) algorithm to construct a full branching tree B with no more than 2n edges.
The weight of the heaviest edge in T(x, y) is the weight of the heaviest edge in B(x, y).
Use the version of the Komlós’s algorithm for full branching trees.Much simpler than his algorithm for general trees.
17
Main Ideas(2/2)Linear time implementation using table lookup of a
few simple functions.Can be constructed in time linear in the size of the
tree.Model of computation: unit random access model
with word size θ(log n) bits.Allow edge weights to be compared, added, or
subtracted at unit cost.
18
Organization• The construction of full branching tree B
• Proof of the property of B
Komlós’s algorithm for determining the maximum weighted edge in each of m paths of a full branching tree
Implementation of Komlós’s algorithm:
• Data structure
• Algorithm
• Details
• Analysis
19
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
20
Full branching tree: rooted tree, leaves in same level, internal nodes have at least two children.
B has no more than 2n edges.O(n) time to construct B.
The weight of the heaviest edge in T(x, y) is the weight of the heaviest edge in B(x, y).
Spanning Tree T Full Branching Tree B
T B
21
Use Borůvka’s Algorithm (1/3) T
Tree B is constructed with node set W and edge set F by adding
nodes and edges to B.Initial phase: For each node v V, create a leaf f(v) of B.
B
22
Adding phase: Let A be the set of blue trees which are joined into one blue tree t in a phase i. Add a new node f(t) to W and add edge to F.
T: B:
Borůvka’s Algorithm (2/3)
1 2 3 6
4 7 8 5 9
}A a allfor |)}(),({ { tfaf
9 9
t4
3 3
t1
66
7
t3
4 4
t2
23
Repeat edge contraction until there is one blue tree.
T B
Borůvka’s Algorithm (3/3)
t
3 3 4 4 6 6 7 9 9
1 2 3 6 4 7 8 5 9
t1 t2t3 t4
t
8 11108
t1
t2t3 t4
81110
24
Borůvka’s Algorithm (3.001/3)Problem 1: Are edge weights in the path from leaf to root
increased?No, the weight of the edge which t1 select may smaller than the edge
weight in Tt1 ,for example:
But the weight of the edge which t1 select must be bigger than the minimal edge weight in Tt1
43
5 2
c
a b
e
d
T
t1 t2
5 2 2
t2
d e
33
t1
a bc
t4 4
B
Tt1
25
3 5
c
a b
535
t1
a b c
Borůvka’s Algorithm (3.002/3)Problem 2: in adding phase, for each blue tree, if we choose the
edge with the maximum weight, not minimal weight, does B still hold the property:For any pair of nodes x and y in T, the weight of the heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
No, for example:
the weight of the heaviest edge in T(a, c) = 3,
the weight of the heaviest edge in B(a, c) = 5.
T B
26
The number of blue trees drops by a factor of at least two after each phase.
B is a full branching tree.
Theorem 1For any pair of nodes x and y in T, the weight of the
heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
Borůvka’s Tree Property (1/5)
27
Claim1: for every edge , there is an edge such that w(e’)≥w(e).
Then a = f(t) for some blue tree t which contains either x or y.Let e’ be the edge in T(x, y) with exactly one endpoint in blue tree
t. Since t had the option of selecting e’, w(e’)≥w(e).
))(),(( yfxfBe),(' yxTe
Borůvka’s Tree Property (2/5)
T
t
e
e’
e’’
Blue tree tContains x
B
e
a f(t)
r
b
f(y)Blue tree t
f(x)
28
Claim2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).e must be selected.Case1: If e is selected by a blue tree t which contains x or y, then
an edge in B(f(x), f(y)) is labeled with w(e).
Borůvka’s Tree Property (3/5)
B
e
t
jy
i
T
Blue tree t contains x
e
a f(t)
r
b
f(y)Blue tree t
f(x)
29
Claim2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).Case2: assume that e is selected by a blue tree i which does not
contain x or y. This blue tree contained one endpoint of e and thus one intermediate node on the path from x to y.
Therefore it is incident to at least two edges on the path. Then e is the heavier of two, giving a contradiction.
Borůvka’s Tree Property (4/5)
e
j
xy
i
T
a f(i)
r
b
f(y)Blue tree i
e
B
30
Claim1: for every edge , there is an edge such that w(e’)≥w(e).
Claim2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).
Theorem: For any pair of nodes x and y in T, the weight of the heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
))(),(( yfxfBe),(' yxTe
Borůvka’s Tree Property (5/5)
31
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
32
MST Verification UsingFull Branching TreeGiven a spanning tree T of a graph G.Construct a full branching tree B from T using
Borůvka’s algorithm.For every non-tree edge e = {u, v} in G, find the
heaviest edge e’ in the query path B(f(u), f(v)), and see if w(e) ≥ w(e’) or not.By Borůvka’s tree property.
If for all non-tree edge e, w(e) ≥ w(e’), then T is a minimum spanning tree of G.
33
Komlós’s Algorithm (1/5)Given a full branching tree with n nodes and m
query paths between pairs of leaves, Komlós’s algorithm can compute the heaviest edge in every query path using only linear number of comparisons.
For each query path (leaf x -> leaf y), break up the path into two half-paths from leaf up to the lowest common ancestor of the pair.
Find the heaviest edge in each half-path and compare the two edges to determine the heaviest edge in the whole path.
34
Komlós’s Algorithm (2/5)A(v): the set of all half paths of every query path
which contain v restricted to the interval [root, v].Let p be the parent of v. A(v|p): the set of paths in
A(v) restricted to the interval [root, p].
a b c d e f
v
r
s
query paths: a->d, c->e, b->fA(v) = {(v->p->s), (v->p)}A(v|p) = {(p->s)}A(p) = {(p->s), (p->s->r)}
p
B
35
Komlós’s Algorithm (3/5)If we know the heaviest edge in each path in A(p), we
can determine the heaviest edge in each path in A(v) through A(v|p).Assume we know the heaviest edge in each path in A(p).Because , we already know the heaviest
edge in each path in A(v|p).To determine the heaviest edge in each path in A(v), we
need only to compare w(v, p) to the heaviest edge in each path in A(v|p).
)()( p A v|pA
e1
e2
e3
e4
v p
36
Komlós’s Algorithm (4/5)Starting from the root, descend level by level and as
each node v encountered, the heaviest edge in each path in A(v) can be determined.
For a query path(x->y), use A(x) and A(y) to determine the heaviest edge in each half path, and compare the two to determine the heaviest edge in the query path.
37
Komlós’s Algorithm (5/5)The ordering of the weights of the heaviest edges in A(p) can
be determined by the length of their respective paths, since for any two paths s and t in A(p), path s includes path t or vice versa.
Compare w(v, p) to each weight in A(v|p) can be done by using binary search.
Komlós shows that the upper bound on the number of comparisons needed to find the heaviest edge in each half path is
n) O(m )) n
nm( O(n A(v)
Tv
loglog
e2
e3
e4
w(e4) ≥ w(e3) ≥ w(e2) ≥ w(e1)
p
w(v, p)
v e1
38
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
39
Node Label and Edge TagNode Label - bits
Leaves: the order of the results of DFS.
Internal nodes: the longest all 0’s suffix in its subtree.
nlog
0100 1000
1000
0000 0001 0010 0011 0100 0101 0111 10000110
0000
0000
0000
0010
40
0010 0011
0010
0110 0111
0110
…Leaf
Internal
0010 0011
0100
0110 01110100…Leaf
Internal
…
Example 1:
Example 2:
Node Label Property
Nodes on the same level won’t possess the same label.
41
Edge tag - O( ) bits
v : the endpoint which is farther from root.
distance(v): v’s distance from root.
i(v): index of the rightmost 1 in v’s label.
< distance(v), i(v) >
Node Label and Edge Tag nloglog
0100 1000
1000
0000 0001 0010 0011 0100 0101 0111 10000110
0000
0000
0000
0010
< 3, 4 >< 3, 0 > < 3, 1 > < 3, 2 > < 3, 1 > < 3, 3 > < 3, 1 > < 3, 2 >
< 2, 0 > < 2, 2 > < 2, 3 > < 2, 4 >
< 1, 0 > < 1, 4 >
< 3, 1 >
2
3
43
LCA(v):A vector with size ith bit of LCA(v) = 1 iff there is a path in A(v) whose
upper endpoint is at distance i from the root For example:
Query paths: {(u, v), (u, w)}A(u) = {(u, a), (u, r)}LCA(u) = 1100
LCA: Lowest Common Ancestor
u wv
p
a
r
nlog
44
BigLists and SmallLists (1/3)
If |A(v)| > wordsize / tagsize , |A(v)| is Big Otherwise, |A(v)| is Small
45
BigLists and SmallLists (2/3)For example:
Query paths: {(u, v), (u, w)}.A(u) = { (u, a), (u, r)}.A1(u) = (a, r), A2(u) = (p, a).
|A(u)| = 2 > wordsize / tagsize = 4 / 4 = 1 |A(u)| is big.
1
2
BigList(u)
<1, 0> (a, r)
<2, 0> (p, a)
u wv
p
a
r
46
BigLists and SmallLists (3/3)For example:
Query path: {(c, d)}.A(c) = {(c, b)}.A1(c) =(c, b).
|A(c)| = 1 ≤ 1 A(c) is small.
c d
b1
SmallList(c)
<3, 2> (c, b)
47
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
48
Goal of the Algorithm Generate bigList(v) or smallList(v) in time
proportional to log|A(v)|.The time spent implementing Komlós’s algorithm at
each node does not exceed the worst case number of comparisons needed at each node.
49
Implementation Details of the Algorithm (1/7)The computation of the LCAs.
Compute all LCAs for each pair of endpoints of the m query paths.
Form the vector LCA(l) for each leaf l.Form the vector LCA(v) for a node at distance i from
the root by ORing together the LCAs of its children and setting the jth bits to 0 for all j i.≧
Compute all LCAs for each pair of endpoints of the m query paths using an algorithm that runs in time
O(n + m).
50
Implementation Details of the Algorithm (2/7)
100
000 000 000 000 000 000 000 000100
010110
100 010110
51
Implementation Details of the Algorithm (3/7)Subword
tagsize bits: loglogn.Swnum
The maximum number of subwords stored in a word.
wordsize/tagsize .
52
Implementation Details of the Algorithm (4/7)selectr
I and J are two strings of r bits. It outputs a list of bits of J which have been selected by I.
<e.g.> select(01010000, 11000000) = (10).
selectSr
Two inputs I and J. I is a string of no more than r bits, no more than swnum of which are 1. J is a list of no more than swnum subwords. It outputs a list of the subwords of J which have been selected by I.
<e.g.> selectS((01), (t1, t2)) = (t2).
53
Implementation Details of the Algorithm (5/7)weightr
It outputs the number of bits of a string set to 1.<e.g.> weight(01010000) = 2 .
indexr
It outputs the 1’s in the r bit vector.<e.g.> index(01100000) = (2, 3) (2, 3 are pointers).
subword1The (i*tagsize)th bit is 1 and the remaining bits are 0,
for i=1,…, swnum.<e.g.> (00100100), where wordsize=8, tagsize=3.
54
Implementation Details of the Algorithm (6/7)To implement these operations, we need to
preprocess a few functions so that we may do table lookup of these functions.
When the size of the input is no greater than logn+c, where c is a constant, the preprocessing time is O(n).
55
Implementation Details of the Algorithm (7/7)We cannot afford to build a table for selectwordsize and
selectSwordsize which takes inputs of 2wordsize bits, since the table would be too large.
However, we can compute these functions as needed in constant time using table lookups of those functions on input size wordsize/2.
A table for all inputs of length r can be built by first building a table for inputs of size r/2, looking up the result for the two halves, and, in constant time, putting the results together to form the entry.
56
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
57
A Top-down ApproachInitially, A(root) = . We proceed down the tree, from the parent p to
each of the children v.Generate the list of the heaviest edges in A(v|p).Compare w({v, p}) to the weights of these edges, by
performing binary search on the list, and insert the tag of {v, p} in the appropriate places to form the list of the heaviest edges in A(v).
Continue until the leaves are reached.
58
An Illustration of the Algorithm
LCA(p)
10
110
0
11
LCA(v)
00
010
0
01
root
pv
Height log ≦ n
t1
t2
t3
t4
t5
t1
t3
t
t
t
Comparison Time:O(log |A(v)|) binary search time
Update Time|A(v)| time at worst
Binary Search
t
A(p) A(v|p) A(v)
59
A Preliminary AnalysisWe got a naïve time algorithm. Observe that only comparisons are
used, but with a non-linear time overhead to maintain the list.
A linear time algorithm for this problem in a pointer machine is not thought possible. (such a result would imply a linear-time algorithm for computing LCA in a pointer machine)
Our Goal: Give a linear time implementation in the unit cost RAM model, in which an O(log n) operation can be done in constant time.
)(|)(A| nmv )(|)(A|log nmv
60
How to Handle a BigList and a SmallListDepending on |A(v)|, our goal is that
if , the overhead of maintaining the
small list of A(v) is O(1) time.
if , the overhead of maintaining the
big list of A(v) is O(log log n) time.
n
nv
loglog
log|)(A|
n
nv
loglog
log|)(A|
)log(log)loglog
loglog(|)(A|log that,Note n
n
nv
)1(|)(A|log Obviously, v
The overall time is therefore dominated by the number of comparisons.
61
Case 1: v is small Goal: Maintaining the list in O(1) time
If p is small, then create smallList(v|p) from smallList(p). L←select(LCA(p),LCA(v)). smallList(v|p) ←select(L, smallList(p) ).
<e.g.> Let LCA(v) = (01001000), LCA(p) = (11000000). Let smallList(p) be (t1, t2). Then L = select(11000000, 01001000) = (01); and smallList(v|p) = selectS((01), (t1, t2)) = (t2).
If p is big, then create smallList(v|p) from LCA(v) and LCA(p). smallList(v|p) ←index(select(LCA(p) , LCA(v)).
<e.g.> Let LCA(p) = (01101110), LCA(v) = (01001000). Then smallList(v|p) = index(select(01101110, 01001000)) = index(10100) = (1, 3).
)1(|)(A|log v
62
Case 1-1: v is small, p is smallroot
pv
Height log ≦ n
LCA(p) = (1 1 0 1 1 0 0 0) (t1, t2 , t3 , t4)
LCA(v) = (0 1 0 1 1 0 0 0)
Select (0 1 1 1)
Retrieve (t2 , t3 , t4)
Binary Search
Substitute (t2 , t , t)
Except Binary Search, all operations can be done in O(1) time.
small
smallt
t
A(p)
A(v|p)
A(v)
63
Case 1-2: v is small, p is bigroot
pv
Height log ≦ n
LCA(p) = (0 1 1 0 1 1 1 0) (t1, t2 , t3 , t4 , t5)
LCA(v) = (0 1 0 0 1 0 0 0)
Select (1 0 1 0 0)
Index (1 , 3)
Binary Search
Substitute (1 , t )
Except Binary Search, all operations can be done in O(1) time.
big
smallt
t
A(p)
A(v|p)
A(v)
64
Case 2: v is bigGoal: Maintaining the list in O(loglog n) time.
If v has a big ancestor. If p ≠a, then create bigList(v|p) from bigList(v|a) and smallList(p).
<e.g.> Let LCA(a) = (01101110), LCA(v) = (00100101), and let bigList(a) be (t1, t2, t3, t4, t5). Thus bigList (v|a) = (t2, t4). Suppose smallList(p) = (2, t’). Then bigList(v|p) = (t2, t’). 2 is pointer to the second item of bigList(a). t’ is the tag of some edge below a in the tree.
If v doesn’t have a big ancestor. bigList(v|p) ← smallList(p).
)log(log|)(A|log nv
65
Case 2-1: v is big, there exists a big ancestor aroot
pv
Height log ≦ n
LCA(a) = (0 1 1 0 1 1 1 0) (t1, t2 , t3 , t4 ,t5)
LCA(v) = (0 0 1 0 0 1 0 1)
Select (0 1 0 1 0)
Retrieve (t2) (t4) ()
Combination (t2 , t4) (2 , t’ ) (t2 , t’ )
Binary Search
Substitute (t2 , t , t)Except Binary Search, all operations can be done in O(log log n) time.
a
small
big
big
t
t
A(v|a)A(p) = A(v|p)A(v|p)
A(a)
A(v)
66
Case 2-2: v is big, no big ancestorroot
pv
Height log ≦ n
LCA(p) = (0 1 1 0 1 0 0 0) (t1, t2 , t3)
Copy (t1, t2 , t3)
Binary Search
Substitute (t1, t , t , t )
Except Binary Search, all operations can be done in O(log log n) time.
small
big
A(v|p) = A(p)
t
t
A(p)
A(v)
67
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
68
Time ComplexityPreparation time:
LCA query problem: O(m + n) time.Table lookup approach for O(log n)-bit operations: O(n)
time.The overhead for each case does not exceed O(log |
A(v)|) time, which leads to an
time for maintaining the list.Compare the heaviest edges in each half-paths with the
weight of each non-tree edge: O(m) time.Overall time complexity is O(m + n) time.
)(|)(A|log nmv
69
OutlineIntroductionBorůvka Tree PropertyKomlós’s Algorithm for a Full Branching TreeImplementation
Data StructuresThe AlgorithmMore DetailsAnalysis
Conclusion and Open Problems
70
Conclusions and Open ProblemsThe paper reduces Komlós’s algorithm to the
simpler case of the full branching tree, and gives an algorithm with linear-time overhead for its implementation in the unit cost RAM.
Some open problems remain:Give a linear time algorithm for the MST verification
problem in a pointer machine.The tree-path problem: Given a static tree, preprocess
it so that one can quickly retrieve the heaviest path in any online query path.
71
Thank You
top related