chapter 9 greedy technique. constructs a solution to an optimization problem piece by piece through...
TRANSCRIPT
Chapter 9 Chapter 9
Greedy TechniqueGreedy Technique
Greedy TechniqueGreedy Technique
Constructs a solution to an Constructs a solution to an optimization problemoptimization problem piece by piece by
piece through a sequence of choices that are:piece through a sequence of choices that are:
feasible feasible - - has to satisfy the problem’s constraints
locally optimal locally optimal - - has to be the best choice among all feasible choices available at that step
Irrevocable Irrevocable – once made, it cannot be changed on subsequent steps of the algorithm
For some problems, yields an optimal solution for every instance.For some problems, yields an optimal solution for every instance.
For most, does not but can be useful for fast approximations.For most, does not but can be useful for fast approximations.
Applications of the Greedy StrategyApplications of the Greedy Strategy
Optimal solutions:Optimal solutions:
• change making for “normal” coin denominationschange making for “normal” coin denominations
• minimum spanning tree (MST)minimum spanning tree (MST)
• single-source shortest paths single-source shortest paths
• simple scheduling problemssimple scheduling problems
• Huffman codesHuffman codes
Approximations:Approximations:
• traveling salesman problem (TSP)traveling salesman problem (TSP)
• knapsack problemknapsack problem
• other combinatorial optimization problemsother combinatorial optimization problems
Change-Making ProblemChange-Making Problem
Given unlimited amounts of coins of denominations Given unlimited amounts of coins of denominations dd1 1 > … > > … > ddm m , ,
give change for amount give change for amount n n with the least number of coinswith the least number of coins
Example: Example: dd1 1 = 25c, = 25c, dd2 2 =10c, =10c, dd3 3 = 5c, = 5c, dd4 4 = 1c and = 1c and n = n = 48c48c
Greedy solution: Greedy solution:
Greedy solution isGreedy solution is optimal for any amount and “normal’’ set of denominationsoptimal for any amount and “normal’’ set of denominations may not be optimal for arbitrary coin denominationsmay not be optimal for arbitrary coin denominations
Minimum Spanning Tree (MST)Minimum Spanning Tree (MST)
Spanning treeSpanning tree of a connected graph of a connected graph GG: a connected acyclic : a connected acyclic subgraph of subgraph of G G that includes all of that includes all of GG’s vertices’s vertices
Minimum spanning treeMinimum spanning tree of a weighted, connected graph of a weighted, connected graph GG: : a spanning tree of a spanning tree of GG of minimum total weight of minimum total weight
Example:Example:
c
db
a6
2
4
3
1
Prim’s MST algorithmPrim’s MST algorithm
Start with tree Start with tree TT11 consisting of one (any) vertex and “grow” consisting of one (any) vertex and “grow”
tree one vertex at a time to produce MST through tree one vertex at a time to produce MST through a series of a series of expanding subtrees Texpanding subtrees T11, T, T22, …, T, …, Tnn
On each iteration, On each iteration, construct Tconstruct Tii+1+1 from T from Ti i by adding vertex by adding vertex
not in not in TTi i that is that is closest to those already in closest to those already in TTii (this is a (this is a
“greedy” step!)“greedy” step!)
Stop when all vertices are includedStop when all vertices are included
ExampleExample
c
db
a4
26
1
3
Notes about Prim’s algorithmNotes about Prim’s algorithm
Proof by induction that this construction actually yields MST Proof by induction that this construction actually yields MST
Needs priority queue for locating closest fringe vertexNeeds priority queue for locating closest fringe vertex
EfficiencyEfficiency
• O(O(nn22)) for weight matrix representation of graph and array for weight matrix representation of graph and array implementation of priority queue implementation of priority queue
• OO((mm log log nn) for adjacency list representation of graph with ) for adjacency list representation of graph with n n vertices and vertices and m m edges and min-heap implementation of edges and min-heap implementation of priority queuepriority queue
InductionInduction
Basis step:Basis step:TT00 consists of a single vertex and hence must be part of any minimum consists of a single vertex and hence must be part of any minimum
spanning tree.spanning tree.
Inductive step:Inductive step:Assume that TAssume that Ti-1 i-1 is part of some minimum spanning tree. Need to prove that is part of some minimum spanning tree. Need to prove that
TTi i , generated from T , generated from Ti-1i-1 by Prim’s algorithm, is also a part of a minimum by Prim’s algorithm, is also a part of a minimum
spanning tree.spanning tree.
Proof by contradiction: assume that no minimum spanning tree of the graph Proof by contradiction: assume that no minimum spanning tree of the graph can contain Tcan contain Tii Let e Let eii =(v, u) be the minimum weight edge in T =(v, u) be the minimum weight edge in T i-1i-1 to a to a
vertex not in Tvertex not in Ti-1i-1 used by Prim’s algorithm to expand T used by Prim’s algorithm to expand T i-1i-1 to T to Tii . By our . By our
assumption, eassumption, eii cannot belong to the minimum spanning tree T. cannot belong to the minimum spanning tree T.
Therefore, if we add eTherefore, if we add e ii to T, a cycle must be formed. to T, a cycle must be formed.
Another greedy algorithm for MST: Kruskal’sAnother greedy algorithm for MST: Kruskal’s
Sort the edges in nondecreasing order of lengthsSort the edges in nondecreasing order of lengths
““Grow” tree one edge at a time to produce MST through Grow” tree one edge at a time to produce MST through a a series of expanding forests Fseries of expanding forests F11, F, F22, …, F, …, Fn-n-11
On each iteration, add the next edge on the sorted list On each iteration, add the next edge on the sorted list unless this would create a cycle. (If it would, skip the edge.)unless this would create a cycle. (If it would, skip the edge.)
ExampleExample
c
db
a
4
26
1
3
Notes about Kruskal’s algorithmNotes about Kruskal’s algorithm
Algorithm looks easier than Prim’s but is harder to Algorithm looks easier than Prim’s but is harder to implement (checking for cycles!)implement (checking for cycles!)
Cycle checking: a cycle is created iff added edge connects Cycle checking: a cycle is created iff added edge connects vertices in the same connected componentvertices in the same connected component
Union-find Union-find algorithms – see section 9.2algorithms – see section 9.2
Kruskal’s Algorithm re-thoughtKruskal’s Algorithm re-thought
View the algorithm’s operations as a progression through a View the algorithm’s operations as a progression through a series of forests series of forests • Containing all of the vertices of a graph andContaining all of the vertices of a graph and
• Some its edgesSome its edges
The initial forest consists of |V| trivial trees, each The initial forest consists of |V| trivial trees, each comprising a single vertex of the graphcomprising a single vertex of the graph• On each iteration the algorithm takes the next edge (u,v) from the On each iteration the algorithm takes the next edge (u,v) from the
sorted list of the graph’s edgessorted list of the graph’s edges
• Finds the trees containing the vertices u and v, andFinds the trees containing the vertices u and v, and
• If these trees are not the same, unites them in a larger tree by If these trees are not the same, unites them in a larger tree by adding the edge (u,v)adding the edge (u,v)
With an efficient sorting algorithm, Kruskal is O(|E| log |E|)With an efficient sorting algorithm, Kruskal is O(|E| log |E|)
Kruskal and Union-FindKruskal and Union-Find
Union-find algorithms for checking whether two vertices Union-find algorithms for checking whether two vertices belong to the same treebelong to the same tree
Partition the n-element set S into a collection of n one-Partition the n-element set S into a collection of n one-element subsets, each containing a different element of Selement subsets, each containing a different element of S
The collection is subjected to a sequence of intermixed The collection is subjected to a sequence of intermixed union and find operationsunion and find operations
ADT:ADT:• makeset(x) creates a one element set {x}makeset(x) creates a one element set {x}
• find(x) returns a subset contain xfind(x) returns a subset contain x
• union (x, y) constructs the union of disjoint subsets Sx and Sy, union (x, y) constructs the union of disjoint subsets Sx and Sy, containing x and y, and adds it to the collection to replace Sx and Sy containing x and y, and adds it to the collection to replace Sx and Sy which are deleted from itwhich are deleted from it
Minimum spanning tree vs. Steiner treeMinimum spanning tree vs. Steiner tree
c
db
a 1
1 1
1
c
db
a
vs
See problem 11 on page 323
Shortest paths – Dijkstra’s algorithmShortest paths – Dijkstra’s algorithm
Single Source Shortest Paths ProblemSingle Source Shortest Paths Problem: Given a weighted : Given a weighted
connected graph G, find shortest paths from source vertex connected graph G, find shortest paths from source vertex ss
to each of the other verticesto each of the other vertices
Dijkstra’s algorithmDijkstra’s algorithm: Similar to Prim’s MST algorithm, with : Similar to Prim’s MST algorithm, with
a different way of computing numerical labels: Among verticesa different way of computing numerical labels: Among vertices
not already in the tree, it finds vertex not already in the tree, it finds vertex uu with the smallest with the smallest sumsum
ddvv + + ww((vv,,uu))
where where
vv is a vertex for which shortest path has been already found is a vertex for which shortest path has been already found on preceding iterations (such vertices form a tree) on preceding iterations (such vertices form a tree)
ddvv is the length of the shortest path from source to is the length of the shortest path from source to vv
w w((vv,,uu) is the length (weight) of edge from ) is the length (weight) of edge from vv to to uu
Dijkstra’s Algorithm (G,l)Dijkstra’s Algorithm (G,l)
G is the graph, s is the start node l is the length G is the graph, s is the start node l is the length of an edgeof an edge
Let S be the set of explored nodesLet S be the set of explored nodes
For each u in S, we store the distance d(u)For each u in S, we store the distance d(u)
Initially S = {s} and d(s) = 0Initially S = {s} and d(s) = 0
While S is not equal to VWhile S is not equal to V
Select a node v not in S with at least one edge Select a node v not in S with at least one edge from S for which d’(v) = min from S for which d’(v) = min e=(u,v):u is in Se=(u,v):u is in S d(u) + l d(u) + lee is as small as possibleis as small as possible
Add v to S and define d(v) = d’(v)Add v to S and define d(v) = d’(v)
EndWhile EndWhile
ExampleExample
d4
Tree vertices Remaining verticesTree vertices Remaining vertices
a(-,0) b(a,3) c(-,∞) d(a,7) e(-,∞)
a
b4
e
37
62 5
c
a
b
d
4c
e
3
7 4
62 5
a
b
d
4c
e
3
7 4
62 5
a
b
d
4c
e
3
7 4
62 5
b(a,3) c(b,3+4) d(b,3+2) e(-,∞)
d(b,5) c(b,7) e(d,5+4)
c(b,7) e(d,9)
e(d,9)
d
a
b
d
4c
e
3
7 4
62 5
Notes on Dijkstra’s algorithmNotes on Dijkstra’s algorithm
Doesn’t work for graphs with negative weightsDoesn’t work for graphs with negative weights
Applicable to both undirected and directed graphsApplicable to both undirected and directed graphs
EfficiencyEfficiency
• O(|V|O(|V|22) for graphs represented by weight matrix and ) for graphs represented by weight matrix and array implementation of priority queuearray implementation of priority queue
• O(|E|log|V|) for graphs represented by adj. lists and min-O(|E|log|V|) for graphs represented by adj. lists and min-heap implementation of priority queueheap implementation of priority queue
Don’t mix up Dijkstra’s algorithm with Prim’s algorithm!Don’t mix up Dijkstra’s algorithm with Prim’s algorithm!
Coding ProblemCoding Problem
CodingCoding: assignment of bit strings to alphabet characters: assignment of bit strings to alphabet characters
CodewordsCodewords: bit strings assigned for characters of alphabet: bit strings assigned for characters of alphabet
Two types of codes:Two types of codes: fixed-length encodingfixed-length encoding (e.g., ASCII) (e.g., ASCII) variable-length encodingvariable-length encoding (e,g., Morse code) (e,g., Morse code)
Prefix-free codesPrefix-free codes: no codeword is a prefix of another codeword: no codeword is a prefix of another codeword
Problem: If frequencies of the character occurrences areProblem: If frequencies of the character occurrences are
known, what is the best binary prefix-free code?known, what is the best binary prefix-free code?
Huffman CodesHuffman Codes
Labeling the tree T*: Labeling the tree T*: • We label all the leaves of depth 1 (if there are any) and label them We label all the leaves of depth 1 (if there are any) and label them
with the highest-frequency letters in any order.with the highest-frequency letters in any order.
• We then take all leaves of depth 2 (if there are any) and label them We then take all leaves of depth 2 (if there are any) and label them with the next highest-frequency letters in any order.with the next highest-frequency letters in any order.
• Continue through the leaves in order of increasing depth Continue through the leaves in order of increasing depth
There is an optimal prefix code, with corresponding tree There is an optimal prefix code, with corresponding tree T* , in which the two lowest-frequency letters are assigned T* , in which the two lowest-frequency letters are assigned to leaves that are siblings in T*. to leaves that are siblings in T*. • Delete them and label their parent with a new letter having the Delete them and label their parent with a new letter having the
combined frequency combined frequency
• Yields an instance with a smaller alphabetYields an instance with a smaller alphabet
Huffman’s AlgorithmHuffman’s AlgorithmTo construct a prefix code for an alphabet S, with given To construct a prefix code for an alphabet S, with given
frequencies:frequencies:If S has two letters thenIf S has two letters then
encode one letter using 0 and other letter using 1encode one letter using 0 and other letter using 1
ElseElse
Let y* and z* be the two lowest-frequency lettersLet y* and z* be the two lowest-frequency letters
Form a new alphabet S’ by deleting y* and z* and replacing Form a new alphabet S’ by deleting y* and z* and replacing them with a new letter them with a new letter of frequency of frequency
f f y*y* + f + f z*z*
Recursively construct a prefix code Recursively construct a prefix code ’ for S’, with tree T’’ for S’, with tree T’
Define a prefix code for S as follows:Define a prefix code for S as follows:
Start with T’Start with T’
Take the leaf labeled Take the leaf labeled and add two children below and add two children below it labeled y* and z*it labeled y* and z*
Endif0………………………………………………………………………………………………………………………………………………………Endif0…………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….…………………………………………………………………………………………………………………………………………………….
Huffman codesHuffman codes
Any binary tree with edges labeled with 0’s and 1’s yields a prefix-free Any binary tree with edges labeled with 0’s and 1’s yields a prefix-free code of characters assigned to its leavescode of characters assigned to its leaves
Optimal binary tree minimizing the expected (weighted average) length of Optimal binary tree minimizing the expected (weighted average) length of a codeword can be constructed as followsa codeword can be constructed as follows
Huffman’s algorithmHuffman’s algorithm
Initialize Initialize nn one-node trees with alphabet characters and the tree weights one-node trees with alphabet characters and the tree weights
with their frequencies.with their frequencies.
Repeat the following step Repeat the following step nn-1 times: join two binary trees with smallest -1 times: join two binary trees with smallest
weights into one (as left and right subtrees) and make its weight equal the weights into one (as left and right subtrees) and make its weight equal the
sum of the weights of the two trees.sum of the weights of the two trees.
Mark edges leading to left and right subtrees with 0’s and 1’s, respectively.Mark edges leading to left and right subtrees with 0’s and 1’s, respectively.
ExampleExample
character Acharacter A B C D B C D __frequency 0.35 0.1 0.2 0.2 0.15frequency 0.35 0.1 0.2 0.2 0.15
codeword 11 100 00 01 101codeword 11 100 00 01 101
average bits per character: 2.25average bits per character: 2.25
for fixed-length encoding: 3for fixed-length encoding: 3
compression ratiocompression ratio: (3-2.25)/3*100% = 25%: (3-2.25)/3*100% = 25%
0.25
0.1
B
0.15
_
0.2
C
0.2
D
0.35
A
0.2
C
0.2
D
0.35
A
0.1
B
0.15
_
0.4
0.2
C
0.2
D
0.6
0.25
0.1
B
0.15
_
0.6
1.0
0 1
0.4
0.2
C
0.2
D0.25
0.1
B
0.15
_
0 1 0
0
1
1
0.25
0.1
B
0.15
_
0.35
A
0.4
0.2
C
0.2
D
0.35
A
0.35
A