[Topic 4] Greedy Methods

Uploaded by shanna-george, 08-Jan-2018

Page 1: [Topic 4] Greedy Methods

151

Page 2:

Algorithm Design Techniques
• Divide-and-Conquer Method
• Dynamic Programming Method
• Greedy Method
• Backtracking Method
• Local Search Method
• Branch-and-Bound Method
• Etc.

152

Page 3:

May 3, 2023, Algorithm Lecture Slides 4

Introduction

• A greedy algorithm arrives at its final solution by, whenever a decision must be made, choosing whatever looks best at that moment.

• Each such choice is locally optimal at the time it is made. However, assembling these locally optimal choices into a final (global) solution does not guarantee that the result is ultimately optimal.

- Does not guarantee a global optimum.

• A greedy algorithm must therefore always be checked to verify whether it yields an optimal solution.

Page 4:


Greedy Algorithm Design Procedure

1. Selection procedure: choose the candidate that looks best (greedy) in the current state and add it to the solution set.

2. Feasibility check: determine whether the new solution set is feasible.

3. Solution check: determine whether the new solution set constitutes an optimal solution.

Page 5:


Example: The Coin Change Problem

• Problem: Minimum number of coins for change. Give change using as few coins as possible.

• Greedy algorithm
- Let the amount of change be x.
- Starting with the highest-valued coin, hand out coins as long as the running total does not exceed x.
- Continue down through the denominations, from highest value to lowest, until the total is exactly x.

• If only the coins currently circulating in Korea are used, this algorithm always gives change with the minimum number of coins. Hence the algorithm is optimal!
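The greedy rule above can be sketched in C (a minimal sketch; the denomination list and the name greedy_change are illustrative, not from the slides):

```c
#include <stdio.h>

/* Denominations in decreasing order of value (assumed: current Korean coins). */
static const int KRW[] = {500, 100, 50, 10, 5, 1};

/* Greedy change-making: repeatedly hand out the highest-valued coin that
 * does not make the running total exceed x; return the number of coins. */
int greedy_change(int x, const int denom[], int n)
{
    int coins = 0, i;
    for (i = 0; i < n; i++) {
        coins += x / denom[i];   /* as many of this coin as fit */
        x %= denom[i];           /* amount still owed */
    }
    return coins;
}
```

For example, greedy_change(730, KRW, 6) hands out 500 + 100 + 100 + 10 + 10 + 10, i.e., six coins.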

Page 6:


A case where the optimal solution is not obtained

• Suppose a 12-won coin is newly issued.
• The algorithm is then no longer guaranteed to give change with the minimum number of coins.
• Example: amount of change = 16
- Greedy method: one 12-won coin = 12 won, four 1-won coins = 4 won
- Number of coins = 5 → not optimal!
- Optimal solution: one 10-won coin, one 5-won coin, and one 1-won coin, i.e., 3 coins.
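The failure can be checked directly. The sketch below (hypothetical coin set {12, 10, 5, 1}; the dynamic-programming counter is added here only for comparison and is not part of the greedy method) shows greedy producing 5 coins for 16 won while the true minimum is 3:

```c
/* Hypothetical coin set with a newly issued 12-won coin, decreasing order. */
static const int DENOM[] = {12, 10, 5, 1};

/* Greedy rule from the slides: largest coin first. */
int greedy_coins(int x)
{
    int coins = 0, i;
    for (i = 0; i < 4; i++) {
        coins += x / DENOM[i];
        x %= DENOM[i];
    }
    return coins;
}

/* Exact minimum by dynamic programming (comparison only; assumes 0 <= x < 64). */
int min_coins(int x)
{
    int best[64], v, i;
    best[0] = 0;
    for (v = 1; v <= x; v++) {
        best[v] = 1 << 30;   /* "infinity" */
        for (i = 0; i < 4; i++)
            if (DENOM[i] <= v && best[v - DENOM[i]] + 1 < best[v])
                best[v] = best[v - DENOM[i]] + 1;
    }
    return best[x];
}
```

For 16 won, greedy_coins(16) returns 5 (one 12 plus four 1s) while min_coins(16) returns 3 (10 + 5 + 1).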

Page 7:

[The Fractional Knapsack Problem] [Neapolitan 4.5.2]

• Problem

• A greedy approach
- Sort the items in nonincreasing order of profit per unit weight.
- Choose the items, possibly partially, one by one until the knapsack is full.

• Example: {w1, w2, w3} = {5, 10, 20}, {p1, p2, p3} = {50, 60, 140}, W = 30
- Choose all of the 1st item: (5, 50)
- Choose all of the 3rd item: (20, 140)
- Choose half of the 2nd item: (10/2, 60/2)

153
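The greedy choice can be sketched as follows (a sketch assuming the items are passed in already sorted by nonincreasing profit per unit weight, as on this slide):

```c
/* Fractional knapsack: items sorted in nonincreasing order of p[i]/w[i].
 * Returns the maximum total profit achievable with capacity W. */
double fractional_knapsack(const int w[], const int p[], int n, int W)
{
    double profit = 0.0;
    int cap = W, i;
    for (i = 0; i < n && cap > 0; i++) {
        if (w[i] <= cap) {                     /* take the whole item */
            profit += p[i];
            cap -= w[i];
        } else {                               /* take the fraction that fits */
            profit += (double)p[i] * cap / w[i];
            cap = 0;
        }
    }
    return profit;
}
```

For the example above, the sorted order is item 1 (50/5 = 10), item 3 (140/20 = 7), item 2 (60/10 = 6), and the call returns 50 + 140 + 30 = 220.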

Page 8:

• Implementation 1
- Sort the items: O(n log n)
- Repeat the choice: O(n)

• Implementation 2
- Put the items in a heap: O(n)
- Repeat the choice: O(k log n)
☞ Could be faster if only a small number of items is needed to fill the knapsack.

The greedy method always finds an optimal solution to the fractional knapsack problem!

• Does the greedy approach always find an optimal solution to the 0-1 knapsack problem?

- Read [Neapolitan 4.5.1].

154

Page 9:

The Greedy Method

• A technique for finding optimal solutions to some optimization problems.
• Strategy: make the choice that appears best at each moment!

- Sort the data items according to some property.

- Scan each item in the sorted list, choosing it if possible.

The hope is to arrive at a globally optimal solution by making locally optimal choices.

• Pros and cons
- Simple and straightforward algorithm design.

- Does not guarantee an optimal solution to every problem (local vs. global optima).

155

Page 10:

Scheduling Problem Example

156

Page 11:

[Scheduling: Minimizing Total Time in the System] [Neapolitan 4.3.1]

• Problem
- Consider a system in which a server is about to serve n clients. Let T = {t1, t2, …, tn} be a set of positive numbers, where ti is the estimated time-to-completion for the ith client. What is the optimal order of service if the total (wait + service) time in the system is to be minimized?

- Examples: a hair stylist with waiting clients, pending operations on a shared hard disk, etc.

• Example: T = {t1, t2, t3} = {5, 10, 4}

Schedule    Total Time in the System

[1, 2, 3]   5 + (5 + 10) + (5 + 10 + 4) = 39

[1, 3, 2]   5 + (5 + 4) + (5 + 4 + 10) = 33

[2, 1, 3]   10 + (10 + 5) + (10 + 5 + 4) = 44

[2, 3, 1]   10 + (10 + 4) + (10 + 4 + 5) = 43

[3, 1, 2] ☞ 4 + (4 + 5) + (4 + 5 + 10) = 32

[3, 2, 1]   4 + (4 + 10) + (4 + 10 + 5) = 37

157

Page 12:

• A naïve approach - Enumerate all possible schedules of service, and select the optimal one. O(n!)

• A greedy approach - Sort T in nondecreasing order to get the optimal schedule. O(n log n)

• Does the greedy approach always find a schedule that minimizes the total time in the system? → Yes, in this case!
- Let S = [s1, s2, …, sn] be an optimal schedule. If the jobs are not scheduled in nondecreasing order, then si > si+1 for at least one i (1 ≤ i ≤ n - 1). Now consider the schedule S’ = [s1, s2, …, si+1, si, …, sn] obtained by interchanging si and si+1. …

• What if there are m servers?

158
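The greedy schedule can be sketched in C (a sketch; sort the service times with qsort and sum the finish times):

```c
#include <stdlib.h>

static int cmp_time(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

/* Serve clients in nondecreasing order of service time and return the
 * total (wait + service) time spent in the system by all n clients. */
int total_time_greedy(int t[], int n)
{
    int total = 0, finish = 0, i;
    qsort(t, n, sizeof(int), cmp_time);
    for (i = 0; i < n; i++) {
        finish += t[i];    /* time at which client i leaves the system */
        total += finish;
    }
    return total;
}
```

For T = {5, 10, 4} this yields 4 + (4 + 5) + (4 + 5 + 10) = 32, matching the optimal schedule [3, 1, 2] in the table above.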

Page 13:

[Scheduling: Minimizing Lateness]

• Problem
- Let J = {1, 2, …, n} be a set of jobs to be served by a single processor.
- The i-th job takes ti units of processing time, and is due at time di.
- When the i-th job starts at time si, its lateness is li = max{0, si + ti - di}.
- Goal: find a schedule S that minimizes the maximum lateness L = max{li}.

• Example
- S = [3, 2, 6, 1, 5, 4] → maximum lateness = 6 (e.g., l1 = 2, l4 = 6)

Job ti di
1   3  6
2   2  8
3   1  9
4   4  9
5   3  14
6   2  15

(ti = processing time, di = due time)

← The schedule S above serves the jobs sorted by ti.

159

Page 14:

• Possible greedy approaches 1. Sort jobs in nondecreasing order of processing time ti: Shortest Jobs First (?)

2. Sort jobs in nondecreasing order of slack di - ti : Smallest Slack-Time First (?)

3. Sort jobs in nondecreasing order of deadline di: Earliest Deadline First (O)

An optimal schedule S = [1, 2, 3, 4, 5, 6] → maximum lateness = 1 (l4 = 1)

Job ti di
1   3  6
2   2  8
3   1  9
4   4  9
5   3  14
6   2  15

160
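Earliest Deadline First can be sketched as follows (a sketch; the struct and function names are illustrative):

```c
#include <stdlib.h>

typedef struct { int t, d; } JOB;   /* processing time, deadline */

static int by_deadline(const void *a, const void *b)
{
    return ((const JOB *)a)->d - ((const JOB *)b)->d;
}

/* Earliest Deadline First: serve jobs in nondecreasing order of deadline
 * and return the maximum lateness of the resulting schedule. */
int edf_max_lateness(JOB jobs[], int n)
{
    int finish = 0, L = 0, i;
    qsort(jobs, n, sizeof(JOB), by_deadline);
    for (i = 0; i < n; i++) {
        finish += jobs[i].t;          /* s_i + t_i */
        if (finish - jobs[i].d > L)   /* l_i = max{0, s_i + t_i - d_i} */
            L = finish - jobs[i].d;
    }
    return L;
}
```

For the six jobs above this returns 1, the maximum lateness of the optimal schedule [1, 2, 3, 4, 5, 6].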

Page 15:

• Correctness
- Facts

1. A schedule produced by this method leaves no idle time on the processor.

2. If a given schedule contains an inversion, then it contains at least two consecutively scheduled jobs that are inverted.

- Here, an inversion is a pair of jobs whose order is reversed with respect to their deadlines.

3. Swapping two consecutive inverted jobs does not increase the maximum lateness.

• Proof

1. Assume S’ is an optimal schedule with the fewest inversions.
2. If S’ has no inversions, it is identical to the schedule produced by the method above.
3. If S’ has an inversion, then swapping a consecutive inverted pair of jobs yields a schedule S’’ that does not increase the maximum lateness, so S’’ is another optimal schedule.

4. But S’’ has fewer inversions than S’, contradicting the assumption about S’. Therefore S’ has no inversions, and hence is identical to the schedule produced by the method above.

161

Page 16:

[Scheduling with Deadlines] [Neapolitan 4.3.2]

• Problem
- Let J = {1, 2, …, n} be a set of jobs to be served.
- Each job takes one unit of time to finish.
- Each job has a deadline and a profit.

• If a job starts before or at its deadline, its profit is obtained. Schedule the jobs so as to maximize the total profit (not every job has to be scheduled).

• Example:

Job Deadline Profit
1   2        30
2   1        35
3   2        25
4   1        40

Schedule    Total Profit
[1, 3]      30 + 25 = 55
[2, 1]      35 + 30 = 65
[2, 3]      35 + 25 = 60
[3, 1]      25 + 30 = 55
[4, 1] ☞    40 + 30 = 70
[4, 3]      40 + 25 = 65

162

Page 17:

• The basic algorithm: a greedy approach - Sort the jobs in non-increasing order by profit. - Scan each job in the sorted list, adding it to the schedule if possible.

• Example

<After sorting by profit>

Job Deadline Profit
1   3        40
2   1        35
3   1        30
4   3        25
5   1        20
6   3        15
7   2        10

- S = EMPTY
- Is S = {1} OK? Yes: S ← {1} ([1])
- Is S = {1, 2} OK? Yes: S ← {1, 2} ([2, 1])
- Is S = {1, 2, 3} OK? No.
- Is S = {1, 2, 4} OK? Yes: S ← {1, 2, 4} ([2, 1, 4] or [2, 4, 1])
- Is S = {1, 2, 4, 5} OK? No.
- Is S = {1, 2, 4, 6} OK? No.
- Is S = {1, 2, 4, 7} OK? No.

163

Page 18:

• Some terminology
- A sequence is called a feasible sequence if every job in the sequence starts by its deadline.
- A set of jobs is called a feasible set if there exists at least one feasible sequence for the jobs in the set.
- A sequence is called an optimal sequence if it is a feasible sequence with maximum total profit.
- A set of jobs is called an optimal set of jobs if there exists at least one optimal sequence for the jobs in the set.

164

Page 19:

Implementation Issues

• A key operation in the greedy approach
- Determine whether a set of jobs S is feasible.
Fact: S is feasible if and only if the sequence obtained by ordering the jobs in S according to nondecreasing deadlines is feasible.
- Ex:
• Is S = {1, 2, 4} OK? [2(1), 1(3), 4(3)] Yes!
• Is S = {1, 2, 4, 7} OK? [2(1), 7(2), 1(3), 4(3)] No!

• An O(n^2) implementation
- Sort the jobs in non-increasing order by profit.
- For each job in the sorted order,
• see if the current job can be scheduled together with the previously selected jobs, using a linked-list data structure;
- if yes, add it to the feasible sequence;
- otherwise, reject it.

Time complexity: when there are i-1 jobs in the sequence,
- at most i-1 comparisons are needed to insert a new job into the sequence, and
- at most i comparisons are needed to check whether the new sequence is feasible.

165
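The O(n^2) procedure can be sketched in C (a sketch using an array in place of the linked list; the names PJOB and schedule_max_profit are illustrative):

```c
#include <stdlib.h>

typedef struct { int deadline, profit; } PJOB;

static int by_profit_desc(const void *a, const void *b)
{
    return ((const PJOB *)b)->profit - ((const PJOB *)a)->profit;
}

/* Greedy scheduling with deadlines: consider jobs in non-increasing order
 * of profit; keep the tentative schedule sorted by nondecreasing deadline,
 * and accept a job only if the job in every slot k (1-based) still meets
 * its deadline. Assumes n <= 64. Returns the total profit. */
int schedule_max_profit(PJOB jobs[], int n)
{
    PJOB seq[64];
    int m = 0, profit = 0, i, k, pos, ok;
    qsort(jobs, n, sizeof(PJOB), by_profit_desc);
    for (i = 0; i < n; i++) {
        /* tentatively insert jobs[i] into seq[] by deadline */
        for (pos = m; pos > 0 && seq[pos-1].deadline > jobs[i].deadline; pos--)
            seq[pos] = seq[pos-1];
        seq[pos] = jobs[i];
        /* feasibility check: the job in slot k must have deadline >= k */
        ok = 1;
        for (k = 0; k <= m; k++)
            if (seq[k].deadline < k + 1) { ok = 0; break; }
        if (ok) {
            m++;
            profit += jobs[i].profit;
        } else {                     /* undo the tentative insertion */
            for (k = pos; k < m; k++)
                seq[k] = seq[k+1];
        }
    }
    return profit;
}
```

For the four-job example on the earlier slide (deadlines 2, 1, 2, 1 and profits 30, 35, 25, 40) this returns 70, matching the optimal schedule [4, 1].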

Page 20:

• Example

Job Deadline Profit
1   1        100
2   6        80
3   3        90
4   3        120
5   5        40
6   4        105
7   1        115
8   2        85
9   4        50

[Figure: successive states of the linked list as the jobs are examined in non-increasing order of profit (4, 7, 6, 1, 3, 8, 2, 9, 5). Jobs 4, 7, 6, 3, 2, and 5 are accepted; jobs 1, 8, and 9 are rejected as infeasible. The final feasible sequence is [7, 4, 3, 6, 5, 2].]

166

Page 21:

• Is the time complexity always O(n^2)?

- What if n >> d_max?
• O(n log n + n d_max)

- What if n >> d_max and n >> k_scanned?
• O(n + k_scanned log n + k_scanned d_max) = O(n)

Is this complexity achievable when a max-heap data structure is employed?

167

Page 22:

Correctness of the greedy method
• See Theorem 4.4.

168

Page 23:

[Huffman Codes] [Neapolitan 4.4]

• Data compression
- Data compression can save storage space for files.
- Huffman coding is just one of many data compression techniques.

• Some terminology
- Binary code
- Codeword
- Fixed-length vs. variable-length binary codes

• Problem
- Given a file, find a binary character code for the characters of the file that represents the file in the fewest bits.

• Example

- Original text file: ababcbbbc
- Huffman codes: a = 10, b = 0, c = 11
- Compressed file: 1001001100011

(Question: is it possible to have a code set where a = 01, b = 0, and c = 11?)

175

Page 24:

Prefix Codes
• No codeword may be a prefix of any other codeword.
- Otherwise, unambiguous decoding is impossible!

Example 1: a = 00, b = 1110, c = 110, d = 01, e = 1111, f = 10

Example 2: a = 00, b = 1100, c = 110, d = 01, e = 1111, f = 10

• Binary trees corresponding to prefix codes
- The code of a character c is the label of the path from the root to c.
- Decoding an encoded file is trivial.

176
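The prefix property is easy to check mechanically (a sketch; is_prefix_code is an illustrative helper, not from the text):

```c
#include <string.h>

/* Returns 1 if no codeword is a prefix of another, 0 otherwise. */
int is_prefix_code(const char *code[], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (i != j && strncmp(code[i], code[j], strlen(code[i])) == 0)
                return 0;   /* code[i] is a prefix of code[j] */
    return 1;
}
```

Applied to the two examples above: Example 1 is a valid prefix code, while in Example 2 the codeword c = 110 is a prefix of b = 1100, so decoding would be ambiguous.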

Page 25:

• Problem
- Given a file F to be encoded with a character set V = {v1, v2, …, vn}, find an optimal prefix binary code with a corresponding binary tree T that minimizes the cost function

bits(T) = freq(v1)·depth(v1) + freq(v2)·depth(v2) + … + freq(vn)·depth(vn)

where freq(vi) is the number of times vi occurs in F, and depth(vi) is the depth of vi in T.

A greedy approach successfully finds an optimal code.

177

Page 26:

Huffman’s Algorithm

• Idea
- Put the rarest characters at the bottom of the tree.

• A greedy approach
- Start from a set of single-node trees, and repeat the following until only one tree is left:

① Pick the two trees u and v with the lowest frequencies.

② Merge them by adding a root node w whose frequency is the sum of those of u and v.

③ Replace u and v by w.

178

Page 27:

Implementation

• Implementation issues
- How can you manage a dynamic set on which the following operations occur frequently?

• Delete the element with the highest priority from the set.

• Insert an element with some priority into the set.

The answer is to use a priority queue.

- A priority queue can be implemented in many ways. Which one would you use?

Representation          Insertion   Deletion

Unordered array         O(1)        O(n)

Unordered linked list   O(1)        O(n)

Sorted array            O(n)        O(1)

Sorted linked list      O(n)        O(1)

Heap                    O(log n)    O(log n)

The answer is to use a priority queue based on a (min-)heap.

179

Page 28:

typedef struct _node {
    char symbol;
    int freq;
    struct _node *left;
    struct _node *right;
} NODE;

NODE *u, *v, *w;
…

for (i = 1; i <= n; i++) {   /* insert the n single-node trees */
    …
}

for (i = 1; i <= n-1; i++) {   /* O(n log n) time in total */
    u = PQ_delete();
    v = PQ_delete();
    w = make_a_new_node();
    w->left = u;
    w->right = v;
    w->freq = u->freq + v->freq;
    PQ_insert(w);
}
w = PQ_delete();   /* w points to the root of the optimal tree. */

180

Page 29:

Correctness of Huffman’s Algorithm

• (Proof by mathematical induction) If the set of trees obtained in the ith step consists of branches of a binary tree corresponding to an optimal code, then the set of trees obtained in the (i+1)st step also consists of branches of a binary tree corresponding to an optimal code.

- (Basis step) When i = 0, each single-node tree is trivially a branch of an optimal tree.
- (Induction step) Suppose the proposition holds after the ith step, let S be the set of trees that exist after the ith step, and let T be the corresponding optimal tree. Let u and v be the roots of the trees combined in the (i+1)st step.
• If u and v are siblings in T, we are done.

• Otherwise, assume that u is at a level in T at least as deep as v, and that w is u’s sibling in T.

- The branch in T rooted at w is one of the trees in S or contains one of those trees as a subtree.

- freq(w) >= freq(v) and depth(w) >= depth(v) in T.
- If we create a new tree T’ by swapping the two branches rooted at v and w, then bits(T’) = bits(T) + (depth(w) - depth(v)) * (freq(v) - freq(w)) <= bits(T). Since T is optimal, bits(T) <= bits(T’), so bits(T’) = bits(T) and T’ is also optimal. Hence the proposition also holds after the (i+1)st step.

What happens if all the steps are done? 181

Page 30:

[Minimum Spanning Tree Algorithms] [Neapolitan 4.1]

• See the Graph Algorithms lecture notes.

182

Page 31:

[Dijkstra’s Single-Source Shortest Path Algorithm] [Neapolitan 4.2]

• See the Graph Algorithms lecture notes.

183

Page 32:

184