TRANSCRIPT
Algorithm Design Techniques: Greedy Algorithms
Introduction
• Algorithm Design Techniques
  – Design of algorithms
  – Algorithms commonly used to solve problems
    • Greedy, Divide and Conquer, Dynamic Programming, Randomized, Backtracking
• General approach
• Examples
• Time and space complexity (where appropriate)
Greedy Algorithms
• Choose the best option during each phase
– Dijkstra, Prim, Kruskal
• Making change
  – Choose the largest bill or coin at each round
  – Does this always work?
• Are there examples where greedy does not work?
Greedy Algorithms
• Must have
  – Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal choice
  – Optimal substructure: an optimal solution to a problem contains optimal solutions to its subproblems
Making Change
• Greedy-choice property
  – The highest denomination coin ≤ n will reside in the solution – if not, it would have to be replaced by two or more smaller coins, which means more coins and is not optimal
  – Is this also true for denominations 1, 7, 10? No: for n = 14, greedy gives 10+1+1+1+1 = 5 coins, while 7+7 = 2 coins (see the sketch below)
• Optimal substructure
  – Solution for (n – highest denomination coin) is optimal
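A minimal sketch (not from the slides; plain Python) of the greedy change-making rule – always take the largest coin that still fits. It is optimal for the US denominations here but not for the hypothetical set 1, 7, 10:

def greedy_change(n, denominations):
    # Repeatedly take the largest denomination that still fits.
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:
            coins.append(d)
            n -= d
    return coins

print(greedy_change(14, [1, 5, 10, 25]))  # [10, 1, 1, 1, 1] – 5 coins, optimal here
print(greedy_change(14, [1, 7, 10]))      # [10, 1, 1, 1, 1] – 5 coins, but 7 + 7 uses only 2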
Scheduling
• Given jobs j1, j2, j3, ..., jn with known running times t1, t2, t3, ..., tn – what is the best way to schedule the jobs to minimize average completion time?
Job Time
j1 15
j2 8
j3 3
j4 10
Scheduling
• Schedule in the given order j1, j2, j3, j4:
  completion times 15, 23, 26, 36
  Average completion time = (15+23+26+36)/4 = 25
• Schedule shortest job first, j3, j2, j4, j1:
  completion times 3, 11, 21, 36
  Average completion time = (3+11+21+36)/4 = 17.75
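A small helper (not from the slides) that reproduces the two averages above by accumulating completion times for a given job order:

def average_completion_time(times):
    # times: job running times in the order they are scheduled
    completed, total = 0, 0
    for t in times:
        completed += t        # completion time of this job
        total += completed    # running sum of completion times
    return total / len(times)

print(average_completion_time([15, 8, 3, 10]))          # given order: 25.0
print(average_completion_time(sorted([15, 8, 3, 10])))  # shortest first: 17.75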
Scheduling
• Greedy-choice property: if the shortest job (j3, 3 time units) does not go first, then each of the y jobs scheduled before it completes 3 time units sooner than under shortest-first, but j3 itself is postponed by the total running time of those y jobs – at least 3y – so the schedule is no better
• Optimal substructure: if shortest job is removed from optimal solution, remaining solution for n-1 jobs is optimal
Optimality Proof
• Total cost of a schedule:

  C = Σ_{k=1}^{N} (N − k + 1) · t_{i_k}
    = t_{i_1} + (t_{i_1} + t_{i_2}) + (t_{i_1} + t_{i_2} + t_{i_3}) + ... + (t_{i_1} + t_{i_2} + ... + t_{i_N})
    = (N + 1) · Σ_{k=1}^{N} t_{i_k} − Σ_{k=1}^{N} k · t_{i_k}

• The first term is independent of the ordering; as the second term increases, the total cost becomes smaller
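As a worked check (my own arithmetic, using the earlier example), take the shortest-first order j3, j2, j4, j1 with N = 4:

  C = 4·3 + 3·8 + 2·10 + 1·15 = 71
  (N + 1) · Σ t_{i_k} − Σ k · t_{i_k} = 5·36 − (1·3 + 2·8 + 3·10 + 4·15) = 180 − 109 = 71

Dividing by N gives 71/4 = 17.75, the average completion time computed above.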
Scheduling
• Suppose there is a job ordering with positions x > y such that t_{i_x} < t_{i_y} (a shorter job scheduled after a longer one)
• Swapping the two jobs (smaller one first) increases the second term, decreasing the total cost
• Show: x · t_{i_x} + y · t_{i_y} < y · t_{i_x} + x · t_{i_y}

  x · t_{i_x} + y · t_{i_y} = y · t_{i_x} + x · t_{i_x} + y · (t_{i_y} − t_{i_x})
                            < y · t_{i_x} + x · t_{i_x} + x · (t_{i_y} − t_{i_x})    (since x > y and t_{i_y} − t_{i_x} > 0)
                            = y · t_{i_x} + x · t_{i_x} + x · t_{i_y} − x · t_{i_x}
                            = y · t_{i_x} + x · t_{i_y}
More Scheduling
• Multiple processor case
  – Algorithm?
More Scheduling
• Multiple processor case
  – Algorithm (see the sketch below):
    • order jobs shortest first
    • schedule jobs round-robin across the processors
• Minimizing final completion time
  – When is this useful?
  – How is this different?
  – Problem is NP-complete!
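A minimal sketch (not from the slides; the job times below are made up) of the multiple-processor rule for minimizing average completion time – sort shortest first, then deal the jobs out round-robin:

def multiprocessor_schedule(times, processors):
    # Returns one list of job times per processor, in scheduled order.
    schedule = [[] for _ in range(processors)]
    for i, t in enumerate(sorted(times)):       # shortest job first
        schedule[i % processors].append(t)      # round-robin assignment
    return schedule

print(multiprocessor_schedule([3, 5, 6, 10, 11, 14, 15, 18, 20], 3))
# [[3, 10, 15], [5, 11, 18], [6, 14, 20]]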
Huffman Codes
• 100 ASCII characters
• Need ⌈log₂ 100⌉ = 7 bits to represent each character
• Large file = lots of bits!
• Would like to reduce number of bits
Huffman Codes
• Idea – encode frequently occurring characters using fewer bits
• Need to make sure all characters are distinguishable
  – 01 = A, 0101 = B
  – 010101 = ?  could be AAA, AB, or BA
• No character code should be a prefix of another character code
Huffman Codes
• Goal: find a full binary tree of minimum cost where the characters are stored in the leaves
• Cost of tree: sum across all characters of the frequency of the character times its depth in the tree
  – frequently occurring characters should be highest in the tree (closest to the root)
Huffman Codes
[Huffman tree diagram – leaves are the characters a, e, i, s, t, space (sp), and newline (nl), at the depths implied by the codes in the table below]
Character Code Frequency Total Bits
a 001 10 30
e 01 15 30
i 10 12 24
s 00000 3 15
t 0001 4 16
space 11 13 26
newline 00001 1 5
total 146
Huffman’s Algorithm
• How do we produce a code? (see the sketch after this list)
  – Maintain a forest of trees
    • weight of a tree is the sum of the frequencies of its leaves
    • start with C single-node trees, one per character – the weight of each is the frequency of that character
  – Until there is only 1 tree
    • choose the 2 trees with the smallest weights and merge them by creating a new root and making each tree a left or right subtree
  – Running time – O(C log C)
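A minimal Python sketch (not from the slides; the names and tree representation are my own) of the merging loop using a binary heap as the priority queue. With the frequencies from the table above it reproduces the 146-bit total (the individual codes may differ from the table, but the total cost is the same):

import heapq

def huffman(frequencies):
    # Forest entries are (weight, tie-breaker, tree); a leaf is just its character.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)          # two smallest-weight trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))  # merge under a new root
        counter += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):              # internal node: recurse left/right
            walk(tree[0], code + "0")
            walk(tree[1], code + "1")
        else:                                    # leaf: record the character's code
            codes[tree] = code or "0"
    walk(heap[0][2], "")
    return codes

freqs = {'a': 10, 'e': 15, 'i': 12, 's': 3, 't': 4, 'sp': 13, 'nl': 1}
codes = huffman(freqs)
print(sum(freqs[c] * len(codes[c]) for c in freqs))   # 146 total bits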
Optimality Proof – Idea
1. The tree must be full
   • if it is not, move a leaf with no sibling up to its parent
2. Least frequent characters are the deepest nodes
   • if not, a node can be swapped with an ancestor
3. Characters at the same depth can be swapped
4. As trees are merged, optimality holds
Optimality Proof – Idea
• Greedy-choice property: given x and y, the two characters with the lowest frequencies in alphabet C, there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit
  – Take an arbitrary optimal prefix code and modify it to make it a tree representing another optimal prefix code such that x and y are sibling leaves of maximum depth
Optimality Proof – Idea
• Optimal substructure:
  – Let C' = C – {x, y} ∪ {z} where f[z] = f[x] + f[y]
  – Let T' be an optimal tree for C'
  – Replace z in T' with an internal node having x and y as children
  – The result is an optimal prefix code for C
Approximate Bin Packing
• N items of sizes s1, s2, ..., sN
• 0 < si <= 1
• Goal: pack the items into the fewest number of bins of size 1
• NP-complete problem, but we can use greedy algorithms to produce solutions not too far from optimal
• Knapsack problem
• Examples?
– Saving data to external media
Example – Optimal Packing
• Input: .2, .5, .4, .7, .1, .3, .8
• Optimal packing uses 3 bins: (.8, .2) (.7, .3) (.5, .4, .1)
On-line vs Off-line
• On-line
  – Process one item at a time
  – Cannot move an item once it is placed
• Off-line
  – Look at all items before placing the first item
On-line Algorithms
• On-line algorithms cannot guarantee an optimal solution
  – Problem: cannot know when the input will end
  – Consider M small items of size ½ − ε followed by M large items of size ½ + ε
  – These fit into M bins with 1 large and 1 small item in each bin
  – To achieve this, the small items (which arrive first) must be placed in M separate bins
  – But if the input turns out to be only the M small items, we have used twice as many bins as necessary
  – There are inputs that force any on-line bin-packing algorithm to use at least 4/3 the optimal number of bins
On-line Bin Packing Algorithms
• Next fit
• First fit
• Best fit
On-line Bin Packing Algorithms
• Next fit (see the sketch below)
  – Algorithm
    • if the item fits in the same bin as the last item placed – put it there
    • else – place it in a new bin
  – (.2, .5) (.4) (.7, .1) (.3) (.8)
  – Running time?
  – Let M be the optimal number of bins required to pack a list I of items. Then next fit never uses more than 2M bins.
    • At most half of the space is wasted, since any two adjacent bins satisfy Bj + Bj+1 > 1
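A minimal sketch (not from the slides) of next fit: only the most recently opened bin is remembered, and a new bin is started whenever the current item does not fit:

def next_fit(items, capacity=1.0):
    bins = [[]]
    remaining = capacity
    for s in items:
        if s > remaining:            # does not fit with the last item(s)
            bins.append([])          # open a new bin
            remaining = capacity
        bins[-1].append(s)
        remaining -= s
    return bins

print(next_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5], [0.4], [0.7, 0.1], [0.3], [0.8]]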
On-line Bin Packing Algorithms
• First fit (see the sketch below)
  – Algorithm
    • scan all bins and place the item in the first bin large enough to hold it
    • if no bin is large enough, create a new bin
  – (.2, .5, .1) (.4, .3) (.7) (.8)
  – Running time?
  – Let M be the optimal number of bins required to pack a list I of items. Then first fit never uses more than ⌈(17/10)·M⌉ bins.
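A minimal sketch (not from the slides) of first fit: scan the existing bins in order and place the item in the first one with enough room; the small tolerance guards against floating-point round-off:

def first_fit(items, capacity=1.0):
    bins, room = [], []               # bin contents and remaining room per bin
    for s in items:
        for i, r in enumerate(room):
            if s <= r + 1e-9:         # first bin that can hold the item
                bins[i].append(s)
                room[i] -= s
                break
        else:                         # no existing bin can hold the item
            bins.append([s])
            room.append(capacity - s)
    return bins

print(first_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5, 0.1], [0.4, 0.3], [0.7], [0.8]]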
On-line Bin Packing Algorithms
• Best fit (see the sketch below)
  – Algorithm
    • scan all bins and place the item in the bin with the tightest fit (the bin that will be fullest after the item is placed there)
    • if no bin is large enough, create a new bin
  – (.2, .5, .1) (.4) (.7, .3) (.8)
  – Running time?
  – Same performance guarantee as first fit.
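A minimal sketch (not from the slides) of best fit: among the bins that can hold the item, choose the one that will be left with the least room:

def best_fit(items, capacity=1.0):
    bins, room = [], []
    for s in items:
        fits = [i for i, r in enumerate(room) if s <= r + 1e-9]
        if fits:
            i = min(fits, key=lambda i: room[i])   # tightest fit
            bins[i].append(s)
            room[i] -= s
        else:                                      # no bin can hold the item
            bins.append([s])
            room.append(capacity - s)
    return bins

print(best_fit([.2, .5, .4, .7, .1, .3, .8]))
# [[0.2, 0.5, 0.1], [0.4], [0.7, 0.3], [0.8]]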
Off-line Bin Packing
• Sort the items in decreasing order first so the large items are placed early
• Apply the first fit or best fit algorithm (see the sketch below)
  – First fit decreasing – (.8, .2) (.7, .3) (.5, .4, .1)
• Let M be the optimal number of bins required to pack a list I of items. Then first fit decreasing never uses more than (11/9)·M + 4 bins.
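First fit decreasing can be expressed as a one-line wrapper around the first_fit sketch shown earlier (again my own naming, not from the slides):

def first_fit_decreasing(items, capacity=1.0):
    # Sort largest first, then run ordinary first fit.
    return first_fit(sorted(items, reverse=True), capacity)

print(first_fit_decreasing([.2, .5, .4, .7, .1, .3, .8]))
# [[0.8, 0.2], [0.7, 0.3], [0.5, 0.4, 0.1]]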