TRANSCRIPT
Algorithm Design Techniques: Greedy Algorithms
Introduction
• Algorithm Design Techniques
  – Design of algorithms
  – Algorithms commonly used to solve problems
    • Greedy, Divide and Conquer, Dynamic Programming, Randomized, Backtracking
• General approach
• Examples
• Time and space complexity (where appropriate)
Greedy Algorithms
• Choose the best option during each phase
– Dijkstra, Prim, Kruskal
• Making change
  – Choose the largest bill or coin at each round
  – Does this always work?
• Are there examples where greedy does not work?
Greedy Algorithms
• Must have
  – Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal choice
  – Optimal substructure: an optimal solution to a problem contains optimal solutions to its subproblems
Making Change
• Greedy-choice property
  – The highest denomination coin ≤ n will reside in the solution – if not, it would have to be replaced by two or more smaller coins, which means more coins and is not optimal
  – Is this also true for denominations 1, 7, 10? No: for n = 14, greedy gives 10+1+1+1+1 = 5 coins, while 7+7 = 2 coins (see the sketch below)
• Optimal substructure
  – Solution for (n – highest denomination coin) is optimal
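A minimal sketch (not from the slides; plain Python) of the greedy change-making rule – always take the largest coin that still fits. It is optimal for the US denominations here but not for the hypothetical set 1, 7, 10:

def greedy_change(n, denominations):
    # Repeatedly take the largest denomination that still fits.
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:
            coins.append(d)
            n -= d
    return coins

print(greedy_change(14, [1, 5, 10, 25]))  # [10, 1, 1, 1, 1] – 5 coins, optimal here
print(greedy_change(14, [1, 7, 10]))      # [10, 1, 1, 1, 1] – 5 coins, but 7 + 7 uses only 2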
Scheduling
• Given jobs j1, j2, j3, ..., jn with known running times t1, t2, t3, ..., tn – what is the best way to schedule the jobs to minimize average completion time?
Job Time
j1 15
j2 8
j3 3
j4 10
Scheduling
• Schedule in the given order j1, j2, j3, j4:
  completion times 15, 23, 26, 36
  Average completion time = (15+23+26+36)/4 = 25
• Schedule shortest job first, j3, j2, j4, j1:
  completion times 3, 11, 21, 36
  Average completion time = (3+11+21+36)/4 = 17.75
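A small helper (not from the slides) that reproduces the two averages above by accumulating completion times for a given job order:

def average_completion_time(times):
    # times: job running times in the order they are scheduled
    completed, total = 0, 0
    for t in times:
        completed += t        # completion time of this job
        total += completed    # running sum of completion times
    return total / len(times)

print(average_completion_time([15, 8, 3, 10]))          # given order: 25.0
print(average_completion_time(sorted([15, 8, 3, 10])))  # shortest first: 17.75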
Scheduling
• Greedy-choice property: if the shortest job (j3, 3 time units) does not go first, then each of the y jobs scheduled before it completes 3 time units sooner than under shortest-first, but j3 itself is postponed by the total running time of those y jobs – at least 3y – so the schedule is no better
• Optimal substructure: if shortest job is removed from optimal solution, remaining solution for n-1 jobs is optimal
Optimality Proof
• Total cost of a schedule:

  C = Σ_{k=1}^{N} (N − k + 1) · t_{i_k}
    = t_{i_1} + (t_{i_1} + t_{i_2}) + (t_{i_1} + t_{i_2} + t_{i_3}) + ... + (t_{i_1} + t_{i_2} + ... + t_{i_N})
    = (N + 1) · Σ_{k=1}^{N} t_{i_k} − Σ_{k=1}^{N} k · t_{i_k}

• The first term is independent of the ordering; as the second term increases, the total cost becomes smaller
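As a worked check (my own arithmetic, using the earlier example), take the shortest-first order j3, j2, j4, j1 with N = 4:

  C = 4·3 + 3·8 + 2·10 + 1·15 = 71
  (N + 1) · Σ t_{i_k} − Σ k · t_{i_k} = 5·36 − (1·3 + 2·8 + 3·10 + 4·15) = 180 − 109 = 71

Dividing by N gives 71/4 = 17.75, the average completion time computed above.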
Scheduling
• Suppose there is a job ordering with positions x > y such that t_{i_x} < t_{i_y} (a shorter job scheduled after a longer one)
• Swapping the two jobs (smaller one first) increases the second term, decreasing the total cost
• Show: x · t_{i_x} + y · t_{i_y} < y · t_{i_x} + x · t_{i_y}

  x · t_{i_x} + y · t_{i_y} = y · t_{i_x} + x · t_{i_x} + y · (t_{i_y} − t_{i_x})
                            < y · t_{i_x} + x · t_{i_x} + x · (t_{i_y} − t_{i_x})    (since x > y and t_{i_y} − t_{i_x} > 0)
                            = y · t_{i_x} + x · t_{i_x} + x · t_{i_y} − x · t_{i_x}
                            = y · t_{i_x} + x · t_{i_y}
More Scheduling
• Multiple processor case
  – Algorithm?
More Scheduling
• Multiple processor case
  – Algorithm (see the sketch below):
    • order jobs shortest first
    • schedule jobs round-robin across the processors
• Minimizing final completion time
  – When is this useful?
  – How is this different?
  – Problem is NP-complete!
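A minimal sketch (not from the slides; the job times below are made up) of the multiple-processor rule for minimizing average completion time – sort shortest first, then deal the jobs out round-robin:

def multiprocessor_schedule(times, processors):
    # Returns one list of job times per processor, in scheduled order.
    schedule = [[] for _ in range(processors)]
    for i, t in enumerate(sorted(times)):       # shortest job first
        schedule[i % processors].append(t)      # round-robin assignment
    return schedule

print(multiprocessor_schedule([3, 5, 6, 10, 11, 14, 15, 18, 20], 3))
# [[3, 10, 15], [5, 11, 18], [6, 14, 20]]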
Huffman Codes
• 100 ASCII characters
• Need ⌈log₂ 100⌉ = 7 bits to represent each character
• Large file = lots of bits!
• Would like to reduce number of bits
Huffman Codes
• Idea – encode frequently occurring characters using fewer bits
• Need to make sure all characters are distinguishable
  – 01 = A, 0101 = B
  – 010101 = ?  could be AAA, AB, or BA
• No character code should be a prefix of another character code
Huffman Codes
• Goal: find a full binary tree of minimum cost where the characters are stored in the leaves
• Cost of tree: sum across all characters of the frequency of the character times its depth in the tree
  – frequently occurring characters should be highest in the tree (closest to the root)
Huffman Codes
[Huffman tree diagram – leaves are the characters a, e, i, s, t, space (sp), and newline (nl), at the depths implied by the codes in the table below]
Character Code Frequency Total Bits
a 001 10 30
e 01 15 30
i 10 12 24
s 00000 3 15
t 0001 4 16
space 11 13 26
newline 00001 1 5
total 146
Huffman’s Algorithm
• How do we produce a code? (see the sketch after this list)
  – Maintain a forest of trees
    • weight of a tree is the sum of the frequencies of its leaves
    • start with C single-node trees, one per character – the weight of each is the frequency of that character
  – Until there is only 1 tree
    • choose the 2 trees with the smallest weights and merge them by creating a new root and making each tree a left or right subtree
  – Running time – O(C log C)
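A minimal Python sketch (not from the slides; the names and tree representation are my own) of the merging loop using a binary heap as the priority queue. With the frequencies from the table above it reproduces the 146-bit total (the individual codes may differ from the table, but the total cost is the same):

import heapq

def huffman(frequencies):
    # Forest entries are (weight, tie-breaker, tree); a leaf is just its character.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)          # two smallest-weight trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))  # merge under a new root
        counter += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):              # internal node: recurse left/right
            walk(tree[0], code + "0")
            walk(tree[1], code + "1")
        else:                                    # leaf: record the character's code
            codes[tree] = code or "0"
    walk(heap[0][2], "")
    return codes

freqs = {'a': 10, 'e': 15, 'i': 12, 's': 3, 't': 4, 'sp': 13, 'nl': 1}
codes = huffman(freqs)
print(sum(freqs[c] * len(codes[c]) for c in freqs))   # 146 total bits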
Optimality Proof – Idea
1. The tree must be full
   • if it is not, move a leaf with no sibling up to its parent
2. Least frequent characters are the deepest nodes
   • if not, a node can be swapped with an ancestor
3. Characters at the same depth can be swapped
4. As trees are merged, optimality holds
Optimality Proof – Idea
• Greedy-choice property: given x and y, the two characters with the lowest frequencies in alphabet C, there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit
  – Take an arbitrary optimal prefix code and modify it to make it a tree representing another optimal prefix code such that x and y are sibling leaves of maximum depth
Optimality Proof – Idea
• Optimal substructure:
  – Let C' = C – {x, y} ∪ {z} where f[z] = f[x] + f[y]
  – Let T' be an optimal tree for C'
  – Replace z in T' with an internal node having x and y as children
  – The result is an optimal prefix code for C
Approximate Bin Packing
• N items of sizes s1, s2, ..., sN
• 0 < si <= 1
• Goal: pack the items into the fewest number of bins of size 1
• NP-complete problem, but we can use greedy algorithms to produce solutions not too far from optimal
• Knapsack problem
• Examples?
– Saving data to external media
Example – Optimal Packing
• Input: .2, .5, .4, .7, .1, .3, .8
• Optimal packing uses 3 bins: (.8, .2) (.7, .3) (.5, .4, .1)
On-line vs Off-line
• On-line
  – Process one item at a time
  – Cannot move an item once it is placed
• Off-line
  – Look at all items before placing the first item
On-line Algorithms
• On-line algorithms cannot guarantee an optimal solution
  – Problem: cannot know when the input will end
  – Consider M small items of size ½ − ε followed by M large items of size ½ + ε
  – These fit into M bins with 1 large and 1 small item in each bin
  – To achieve this, the small items (which arrive first) must be placed in M separate bins
  – But if the input turns out to be only the M small items, we have used twice as many bins as necessary
  – There are inputs that force any on-line bin-packing algorithm to use at least 4/3 the optimal number of bins
On-line Bin Packing Algorithms
• Next fit
• First fit
• Best fit
On-line Bin Packing Algorithms
• Next fit (see the sketch below)
  – Algorithm
    • if the item fits in the same bin as the last item placed – put it there
    • else – place it in a new bin
  – (.2, .5) (.4) (.7, .1) (.3) (.8)
  – Running time?
  – Let M be the optimal number of bins required to pack a list I of items. Then next fit never uses more than 2M bins.
    • At most half of the space is wasted, since any two adjacent bins satisfy Bj + Bj+1 > 1
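A minimal sketch (not from the slides) of next fit: only the most recently opened bin is remembered, and a new bin is started whenever the current item does not fit:

def next_fit(items, capacity=1.0):
    bins = [[]]
    remaining = capacity
    for s in items:
        if s > remaining:            # does not fit with the last item(s)
            bins.append([])          # open a new bin
            remaining = capacity
        bins[-1].append(s)
        remaining -= s
    return bins

print(next_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5], [0.4], [0.7, 0.1], [0.3], [0.8]]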
On-line Bin Packing Algorithms
• First fit (see the sketch below)
  – Algorithm
    • scan all bins and place the item in the first bin large enough to hold it
    • if no bin is large enough, create a new bin
  – (.2, .5, .1) (.4, .3) (.7) (.8)
  – Running time?
  – Let M be the optimal number of bins required to pack a list I of items. Then first fit never uses more than ⌈(17/10)·M⌉ bins.
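A minimal sketch (not from the slides) of first fit: scan the existing bins in order and place the item in the first one with enough room; the small tolerance guards against floating-point round-off:

def first_fit(items, capacity=1.0):
    bins, room = [], []               # bin contents and remaining room per bin
    for s in items:
        for i, r in enumerate(room):
            if s <= r + 1e-9:         # first bin that can hold the item
                bins[i].append(s)
                room[i] -= s
                break
        else:                         # no existing bin can hold the item
            bins.append([s])
            room.append(capacity - s)
    return bins

print(first_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5, 0.1], [0.4, 0.3], [0.7], [0.8]]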
On-line Bin Packing Algorithms
• Best fit (see the sketch below)
  – Algorithm
    • scan all bins and place the item in the bin with the tightest fit (the bin that will be fullest after the item is placed there)
    • if no bin is large enough, create a new bin
  – (.2, .5, .1) (.4) (.7, .3) (.8)
  – Running time?
  – Same performance guarantee as first fit.
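A minimal sketch (not from the slides) of best fit: among the bins that can hold the item, choose the one that will be left with the least room:

def best_fit(items, capacity=1.0):
    bins, room = [], []
    for s in items:
        fits = [i for i, r in enumerate(room) if s <= r + 1e-9]
        if fits:
            i = min(fits, key=lambda i: room[i])   # tightest fit
            bins[i].append(s)
            room[i] -= s
        else:                                      # no bin can hold the item
            bins.append([s])
            room.append(capacity - s)
    return bins

print(best_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5, 0.1], [0.4], [0.7, 0.3], [0.8]]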
Off-line Bin Packing
• Sort the items in decreasing order first so the large items are placed early
• Apply the first fit or best fit algorithm (see the sketch below)
  – First fit decreasing – (.8, .2) (.7, .3) (.5, .4, .1)
• Let M be the optimal number of bins required to pack a list I of items. Then first fit decreasing never uses more than (11/9)·M + 4 bins.
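First fit decreasing can be expressed as a one-line wrapper around the first_fit sketch shown earlier (again my own naming, not from the slides):

def first_fit_decreasing(items, capacity=1.0):
    # Sort largest first, then run ordinary first fit.
    return first_fit(sorted(items, reverse=True), capacity)

print(first_fit_decreasing([.2, .5, .4, .7, .1, .3, .8]))
# [[0.8, 0.2], [0.7, 0.3], [0.5, 0.4, 0.1]]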