Assignment 4. (Due on Friday of Week 14. Drop it in Mail Box 63.)


Upload: serena

Post on 19-Mar-2016


TRANSCRIPT

Page 1: Assignment 4. (Due on Friday of Week 14. Drop it in Mail Box 63 )

Assignment 4. This one is cancelled since there is a solution on the website. A new assignment will be given on Nov. 28. (Due on Friday of Week 14. Drop it in Mail Box 63.)

This time, Professor Yao and I can explain the questions, but we will NOT tell you how to solve the problems.

Question 1. (35 points) Give a polynomial-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers.

Can you give a linear-space algorithm? (Assume that each integer appears once in the input sequence of n numbers.)

Example: Consider the sequence 1, 8, 2, 9, 3, 10, 4, 5. Both 1, 2, 3, 4, 5 and 1, 8, 9, 10 are monotonically increasing subsequences. However, 1, 2, 3, 4, 5 is the longest.

Page 2

Question 2. (65 points) Answer Q A OR Q B.

Q A. Suppose that there are n sequences s1, s2, …, sn on alphabet Σ = {a1, a2, …, am}. Every sequence si is of length m and every letter in Σ appears exactly once in each si.

Design a polynomial-time algorithm to compute the LCS of the n sequences. What is the time complexity of your algorithm?

Q B. Let T be a rooted binary tree, where each internal node in the tree has two children and every node (except the root) in T has a parent. Each leaf in the tree is assigned a letter in Σ = {A, C, G, T}. Consider an edge e in T. Assume that both ends of e are assigned a letter. The cost of e is 0 if the two letters are identical and 1 if they are not. The problem is to assign a letter in Σ to each internal node of T such that the cost of the tree is minimized, where the cost of the tree is the total cost of all edges in the tree. Design a polynomial-time dynamic programming algorithm to solve the problem.

Page 3

NP-Complete Problems

• Polynomial time vs exponential time

– Polynomial time: O(n^k), where n is the input size (e.g., the number of nodes in a graph, the length of strings, etc.) of our problem and k is a constant (e.g., k = 1, 2, 3, etc.).

– Exponential time: 2^n or n^n.

n:   2   10     20          30
2^n: 4   1024   1 million   1000 million

Suppose our computer can solve a problem of size k (i.e., compute 2^k operations) in an hour/week/month. If a new computer is 1024 times faster than ours, then the new computer can solve a problem of size k+10 in the same time. The improvement is very small.

• Hardware improvement is of little use for solving problems that require exponential running time.

• Exponential running time is considered "not efficient".
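The arithmetic behind the "1024 times faster" remark can be checked with a small sketch. The function names and the quadratic comparison are illustrative assumptions, not part of the slides; the budget of operations stands for whatever the computer can do in the fixed time.

```python
import math

def largest_exponential_size(ops_budget):
    """Largest n with 2**n <= ops_budget (for a 2^n-time algorithm)."""
    return ops_budget.bit_length() - 1

def largest_quadratic_size(ops_budget):
    """Largest n with n**2 <= ops_budget (for an n^2-time algorithm)."""
    return math.isqrt(ops_budget)

old_budget = 2 ** 30            # operations the old computer manages in the time limit
new_budget = 1024 * old_budget  # the new computer is 1024 times faster

# Exponential algorithm: the solvable size only grows from k to k + 10.
print(largest_exponential_size(old_budget), largest_exponential_size(new_budget))

# Quadratic algorithm: the solvable size is multiplied by 32.
print(largest_quadratic_size(old_budget), largest_quadratic_size(new_budget))
```

For the polynomial algorithm the speed-up multiplies the solvable size, while for the exponential one it only adds a constant to it.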

Page 4

Story

• All algorithms we have studied so far are polynomial time algorithms.

• Fact: people have not yet found any polynomial time algorithms for some famous problems (e.g., Hamilton Circuit, longest simple path, Steiner trees).

• Question: Do there exist polynomial time algorithms for those famous problems?

• Answer: Nobody knows.

Page 5

Story

• Research topic: Prove that polynomial time algorithms do not exist for those famous problems, e.g., the Hamilton circuit problem.

• You can get the Turing Award if you can give the proof.

• In order to answer the above question, people define two classes of problems, class P and class NP.

• To answer whether P ≠ NP, a rich area, NP-completeness theory, has been developed.

Page 6

Class P and Class NP

• Class P contains those problems that are solvable in polynomial time.

– They are problems that can be solved in O(n^k) time, where n is the input size and k is a constant.

• Class NP consists of those problems that are verifiable in polynomial time.

• What we mean here is that if we were somehow given a solution, then we could verify that the solution is correct in time polynomial in the input size of the problem.

• Example: Hamilton Circuit: given an order of the n distinct vertices (v1, v2, …, vn), we can test whether (v_i, v_{i+1}) is an edge in G for i = 1, 2, …, n-1 and whether (vn, v1) is an edge in G in time O(n) (polynomial in the input size).
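The Hamilton Circuit check described above can be sketched as follows. The function name and the edge-list representation are assumptions made for illustration; the point is only that verifying a proposed vertex order takes one pass over it.

```python
def verify_hamilton_circuit(vertices, edges, order):
    """Check whether `order` lists every vertex once and consecutive
    vertices (including last -> first) are joined by an edge."""
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    n = len(vertices)
    if len(order) != n or set(order) != set(vertices):
        return False
    # Test (v_i, v_{i+1}) for i = 1..n-1 and the closing edge (v_n, v_1).
    return all(
        frozenset((order[i], order[(i + 1) % n])) in edge_set
        for i in range(n)
    )

square = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(verify_hamilton_circuit([1, 2, 3, 4], square, [1, 2, 3, 4]))
print(verify_hamilton_circuit([1, 2, 3, 4], square, [1, 3, 2, 4]))
```

Verifying is cheap; nothing here says how to find a good order in the first place.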

Page 7

Class P and Class NP

• Based on the definitions, P ⊆ NP.

• If we can design a polynomial time algorithm for problem A, then problem A is in P.

• However, if we have not been able to design a polynomial time algorithm for problem A, then there are two possibilities:

1. a polynomial time algorithm does not exist for problem A, or

2. we are not smart.

Open problem: P ≠ NP? Clay $1 million prize.

Page 8

NP-Complete

• A problem X is NP-complete if it is in NP and any problem Y in NP has a polynomial time reduction to X.

– It is the hardest problem in NP.

– If an NP-complete problem can be solved in polynomial time, then any problem in class NP can be solved in polynomial time.

• The first NPC problem is the Satisfiability problem.

– Proved by Cook in 1971, who received the Turing Award for this work.

Page 9

Boolean formula

• A boolean formula f(x1, x2, …, xn), where the xi are boolean variables (either 0 or 1), contains boolean variables and the boolean operations AND, OR and NOT.

• Clause: variables and their negations connected with the OR operation, e.g., (x1 OR NOT x2 OR x5).

• Conjunctive normal form of a boolean formula: contains m clauses connected with the AND operation. Example: (x1 OR NOT x2) AND (x1 OR NOT x3 OR x6) AND (x2 OR x6) AND (NOT x3 OR x5).

– Here we have four clauses.

Page 10

Satisfiability problem

• Input: a conjunctive normal form with n variables, x1, x2, …, xn.

• Problem: find an assignment of x1, x2, …, xn (setting each xi to be 0 or 1) such that the formula is true (satisfied).

• Example: the conjunctive normal form is (x1 OR NOT x2) AND (NOT x1 OR x3).

• The formula is true for the assignment x1=1, x2=0, x3=1.

Note: for n Boolean variables, there are 2^n assignments.

• Testing if formula=1 can be done in polynomial time for any given assignment.

• Finding an assignment that satisfies formula=1 is hard.
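The contrast between testing an assignment and finding one can be sketched in a few lines. The encoding is an assumption made for illustration: a clause is a list of integers, where i stands for xi and -i for NOT xi. The search below simply tries all 2^n assignments; it is a brute-force stand-in, not an efficient method.

```python
from itertools import product

def evaluate(cnf, assignment):
    """Polynomial-time test: is every clause satisfied by `assignment`?
    `assignment` maps variable index -> bool."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, n):
    """Try all 2^n assignments; return a satisfying one or None."""
    for bits in product([False, True], repeat=n):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if evaluate(cnf, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (NOT x1 OR x3), the example from the slide.
cnf = [[1, -2], [-1, 3]]
print(evaluate(cnf, {1: True, 2: False, 3: True}))
print(brute_force_sat(cnf, 3))
```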

Page 11

The First NP-complete Problem

• Theorem: The Satisfiability problem is NP-complete.

– It is the first NP-complete problem.

– S. A. Cook in 1971.

– Won the Turing Award for this work.

• Significance:

– If the Satisfiability problem can be solved in polynomial time, then ALL problems in class NP can be solved in polynomial time.

– If you want to settle P ≠ NP, then you should work on NPC problems such as the Satisfiability problem.

– We can use the first NPC problem, the Satisfiability problem, to show that other problems are also NP-complete.

Page 12

How to show that a problem is NPC?

• To show that problem A is NP-complete, we can

– First find a problem B that has been proved to be NP-complete.

– Show that if problem A can be solved in polynomial time, then problem B can also be solved in polynomial time.

Remarks: Since an NPC problem, problem B, is the hardest in class NP, problem A is also the hardest.

Page 13

Hamilton circuit and Longest Simple Path

• Hamilton circuit: a circuit that uses every vertex of the graph exactly once except for the last vertex, which duplicates the first vertex.

• It was shown to be NP-complete.

• Longest Simple Path:

• Input: a set of nodes V={v1, v2, ..., vn} in a graph, the distance d(vi, vj) between vi and vj, and two nodes u and v. Find a longest simple path from u to v.

• Theorem 2: The longest simple path problem is NP-complete.

Page 14

Theorem 2: The longest simple path (LSP) problem is NP-complete.

Proof:

Hamilton Circuit Problem (HC): Given a graph G=(V, E), find a Hamilton circuit.

We want to show that if we can solve the longest simple path problem in polynomial time, then we can also solve the Hamilton circuit problem in polynomial time.

Design a polynomial time algorithm to solve HC by using an algorithm for LSP.

Step 0: Set the length of each edge in G to be 1.

Step 1: For each edge (u, v) ∈ E, find the longest simple path P from u to v in G.

Step 2: If the length of P is n-1, then by adding edge (u, v) we obtain a Hamilton circuit in G.

Step 3: If no Hamilton circuit is found for any (u, v), then print "no Hamilton circuit exists".

Conclusion:

• If LSP can be solved in polynomial time, then HC can also be solved in polynomial time.

• Since HC was proved to be NP-complete, LSP is also NP-complete.
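Steps 0 to 3 of the reduction can be sketched as below. The function `longest_simple_path` is a hypothetical stand-in for the assumed LSP algorithm: here it is an exponential exhaustive search, used only so the sketch runs. The adjacency-list representation is also an assumption, and the graph is assumed to have at least three vertices.

```python
def longest_simple_path(adj, u, v):
    """Stand-in LSP oracle (exhaustive search, unit edge lengths).
    Returns a longest simple path from u to v as a vertex list, or None."""
    best = None

    def dfs(node, path, visited):
        nonlocal best
        if node == v:
            if best is None or len(path) > len(best):
                best = list(path)
            return
        for nxt in adj[node]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    dfs(u, [u], {u})
    return best

def hamilton_circuit_via_lsp(adj):
    """Steps 1-3: for every edge (u, v), ask the oracle for a longest
    simple u-v path; a path with n-1 edges plus edge (v, u) is a circuit."""
    n = len(adj)
    for u in adj:
        for v in adj[u]:
            path = longest_simple_path(adj, u, v)
            if path is not None and len(path) - 1 == n - 1:
                return path + [u]
    return None  # no Hamilton circuit exists

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
print(hamilton_circuit_via_lsp(square))
```

The reduction itself makes only one oracle call per edge, so it is polynomial whenever the oracle is.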

Page 15

Some basic NP-complete problems

• 3-Satisfiability: Each clause contains at most three variables or their negations.

• Vertex Cover: Given a graph G=(V, E), find a subset V' of V such that for each edge (u, v) in E, at least one of u and v is in V' and the size of V' is minimized.

• Hamilton Circuit: (definition was given before)

• History: Satisfiability → 3-Satisfiability → vertex cover → Hamilton circuit.

• Those proofs are very hard.

• Karp proved the first few NPC problems and received the Turing Award.

Page 16

Approximation Algorithms

• Concepts

• Knapsack

• Steiner Minimum Tree

• TSP

• Vertex Cover

Page 17

Concepts of Approximation Algorithms

Optimization Problem:

The solution of the problem is associated with a cost (value).

We want to maximize the cost or minimize the cost.

Minimum spanning tree and shortest path are optimization problems.

The Euler circuit problem is NOT an optimization problem. (It is a decision problem.)

Page 18

Approximation Algorithm

An algorithm A is an approximation algorithm if, given any instance I, it finds a candidate solution s(I).

How good is an approximation algorithm?

We use the performance ratio to measure the quality of an approximation algorithm.

Page 19

Performance ratio

For a minimization problem, the performance ratio of algorithm A is defined as a number r such that for any instance I of the problem,

A(I)/OPT(I) ≤ r (r ≥ 1),

where OPT(I) is the value of the optimal solution for instance I and A(I) is the value of the solution returned by algorithm A on instance I.

Page 20

Performance ratio

For a maximization problem, the performance ratio of algorithm A is defined as a number r such that for any instance I of the problem,

OPT(I)/A(I)

is at most r (r ≥ 1), where OPT(I) is the value of the optimal solution for instance I and A(I) is the value of the solution returned by algorithm A on instance I.

Page 21

Simplified Knapsack Problem

Given a finite set U of items, a size s(u) ∈ Z+ for each u ∈ U, and a capacity B ≥ max{s(u) : u ∈ U}, find a subset U' ⊆ U such that

Σ_{u ∈ U'} s(u) ≤ B

and such that the above summation is as large as possible. (It is NP-hard.)

Page 22

Ratio-2 Algorithm

1. Sort the u's based on the s(u)'s in increasing order.

2. Select the smallest remaining u until no more u can be added.

3. Compare the total value of the selected items with the item with the largest size, and select the larger one.

Theorem: The algorithm has performance ratio 2.
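The three steps can be sketched as follows. The function name and the representation of items as a plain list of sizes are assumptions for illustration; as in the problem statement, the capacity is assumed to be at least the largest size.

```python
def simplified_knapsack_ratio2(sizes, capacity):
    """Greedy ratio-2 sketch: returns the list of chosen sizes."""
    items = sorted(sizes)            # step 1: increasing order of size
    chosen, total = [], 0
    for s in items:                  # step 2: add smallest remaining items
        if total + s > capacity:
            break                    # no further (larger) item can fit either
        chosen.append(s)
        total += s
    largest = items[-1]              # step 3: compare with the largest item
    return [largest] if largest > total else chosen

print(simplified_knapsack_ratio2([1, 2, 3, 10], 10))
print(simplified_knapsack_ratio2([4, 5, 6], 10))
```

In the second call the greedy prefix fills 9 of 10 units while the optimum (4 + 6) fills 10, which is within the factor 2 stated by the theorem.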

Page 23

Proof

• Case 1: the total of selected items ≥ 0.5B (got it!)

• Case 2: the total of selected items < 0.5B.

– No remaining item left: we get the optimal solution.

– There are some remaining items: the size of the smallest remaining item is > 0.5B. (Otherwise, we can add it in.)

• Selecting the largest item gives ratio 2.

Page 24

The 0-1 Knapsack problem

• The 0-1 knapsack problem:

• N items, where the i-th item is worth v_i dollars and weighs w_i pounds.

– v_i and w_i are integers.

• A thief can carry at most W (integer) pounds.

• How to take as valuable a load as possible?

– An item cannot be divided into pieces.

• The fractional knapsack problem:

• The same setting, but the thief can take fractions of items.

Page 25

Ratio-2 Algorithm

1. Delete the items i with w_i > W.

2. Sort the items in decreasing order based on v_i/w_i.

3. Select the first k items, item 1, item 2, …, item k, such that

w1 + w2 + … + wk ≤ W and w1 + w2 + … + wk + w_{k+1} > W.

4. Compare v_{k+1} with v1 + v2 + … + vk and select the larger one.

Theorem: The algorithm has performance ratio 2.
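A minimal sketch of steps 1 to 4 follows. The function name and the (value, weight) pair representation are assumptions for illustration; weights are assumed to be positive integers, and if every remaining item fits the sketch simply returns all of them.

```python
def knapsack_ratio2(items, W):
    """items: list of (value, weight) pairs. Returns the chosen pairs."""
    # Step 1: delete items that can never fit.
    feasible = [(v, w) for v, w in items if w <= W]
    # Step 2: sort by value density v/w, largest first.
    feasible.sort(key=lambda it: it[0] / it[1], reverse=True)
    # Step 3: take the longest prefix that fits.
    prefix, weight = [], 0
    for v, w in feasible:
        if weight + w > W:
            # Step 4: compare item k+1 with the prefix, keep the larger.
            prefix_value = sum(pv for pv, _ in prefix)
            return [(v, w)] if v > prefix_value else prefix
        prefix.append((v, w))
        weight += w
    return prefix  # everything fits

print(knapsack_ratio2([(60, 10), (100, 20), (120, 30)], 50))
print(knapsack_ratio2([(2, 1), (10, 10)], 10))
```

The second call shows why step 4 matters: the density order alone would keep only the item worth 2.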

Page 26

Proof of ratio 2

• C(opt): the cost of the optimum solution.

• C(fopt): the optimal cost of the fractional version.

1. C(opt) ≤ C(fopt).

2. v1 + v2 + … + vk + v_{k+1} > C(fopt).

3. So, either v1 + v2 + … + vk > 0.5 C(fopt) ≥ 0.5 C(opt) or v_{k+1} > 0.5 C(fopt) ≥ 0.5 C(opt).

• Since the algorithm chooses the larger one of v1 + v2 + … + vk and v_{k+1},

• we know that the cost of the solution obtained by the algorithm is at least 0.5 C(fopt) ≥ 0.5 C(opt).

Page 27

Steiner Minimum Tree

Steiner minimum tree in the plane

• Input: a set of points R (regular points) in the plane.

• Output: a tree with smallest weight which contains all the nodes in R.

• Weight: the weight of an edge connecting two points (x1, y1) and (x2, y2) in the plane is defined as the Euclidean distance

√((x1 − x2)^2 + (y1 − y2)^2)

Page 28

Example: Dark points are regular points.

Page 29

Triangle inequality

Key for our approximation algorithm. For any three points a, b, c in the plane, we have:

dist(a, c) ≤ dist(a, b) + dist(b, c).

Example: (Figure: a triangle with vertices a, b, c and side lengths 3, 4, 5.)

Page 30

Approximation algorithm (Steiner minimum tree in the plane)

Compute a minimum spanning tree for R as the approximation solution for the Steiner minimum tree problem.

How good is the algorithm? (in terms of the quality of the solutions)

Theorem: The performance ratio of the approximation algorithm is 2.
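Computing the minimum spanning tree of the regular points can be sketched with Prim's algorithm on the complete Euclidean graph. The choice of Prim's algorithm and the function name are assumptions; the slides only say that a minimum spanning tree is computed. `math.dist` needs Python 3.8 or later.

```python
import math

def euclidean_mst(points):
    """Prim's algorithm in O(n^2) on points given as (x, y) tuples.
    Returns (list of index pairs forming the tree, total weight)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n     # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges, total = [], 0.0
    for _ in range(n):
        # Pick the cheapest point not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Relax the connection cost of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges, total

corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(euclidean_mst(corners))
```

For the four corners of a unit square the tree has weight 3, whereas a Steiner tree with two extra points weighs about 2.73, so the ratio is well under the bound of 2.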

Page 31

Proof

We want to show that for any instance (input) I, A(I)/OPT(I) ≤ r (r ≥ 1), where A(I) is the cost of the solution obtained from our spanning tree algorithm, and OPT(I) is the cost of an optimal solution.

Page 32

• Assume that T is the optimal solution for instance I. Consider a traversal of T.

(Figure: a traversal of T, with labels 1 to 10.)

• Each edge in T is visited at most twice. Thus, the total weight of the traversal is at most twice the weight of T, i.e.,

w(traversal) ≤ 2w(T) = 2OPT(I). .........(1)

Page 33

• Based on the traversal, we can get a spanning tree ST as follows: (Directly connect two nodes in R based on the visited order of the traversal.)

(Figure: the spanning tree ST obtained from the traversal, with labels 1 to 10.)

From the triangle inequality,

w(ST) ≤ w(traversal) ≤ 2OPT(I). ..........(2)

Page 34

• Inequality (2) says that the cost of the spanning tree ST is less than or equal to twice the cost of an optimal solution.

• So, if we can compute ST, then we can get a solution with cost ≤ 2OPT(I). (Great! But finding ST may also be very hard, since ST is obtained from the optimal solution T, which we do not know.)

• We can find a minimum spanning tree MST for R in polynomial time.

• By the definition of MST, w(MST) ≤ w(ST) ≤ 2OPT(I).

• Therefore, the performance ratio is 2.

Page 35

Story

• The method was known a long time ago. The performance ratio was conjectured to be 2/√3 ≈ 1.1547 (1968).

• Du and Hwang (1990) proved that the conjecture is true.

Page 36

Graph Steiner minimum tree

• Input: a graph G=(V, E), a weight w(e) for each e ∈ E, and a subset R ⊂ V.

• Output: a tree with minimum weight which contains all the nodes in R.

• The nodes in R are called regular points. Note that the Steiner minimum tree could contain some nodes in V-R; the nodes in V-R are called Steiner points.

Page 37

Example: Let G be shown in Figure a. R={a,b,c}. The Steiner minimum tree T={(a,d),(b,d),(c,d)} is shown in Figure b.

(Figure a: the graph G on nodes a, b, c, d, with edge weights 1, 1, 1, 2, 2.)

(Figure b: the Steiner minimum tree on nodes a, b, c, d, with edge weights 1, 1, 1.)

Theorem: The graph Steiner minimum tree problem is NP-complete.

Page 38

Approximation algorithm (Graph Steiner minimum tree)

1. For each pair of nodes u and v in R, compute the shortest path from u to v and assign the cost of the shortest path from u to v as the length of edge (u, v). (This gives a complete graph on R.)

2. Compute a minimum spanning tree for the modified complete graph.

3. Include the nodes in the shortest paths used.
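The three steps can be sketched as below, under several assumptions that are not in the slides: the graph is a connected dictionary of dictionaries `graph[u][v] = weight`, shortest paths are found with Dijkstra's algorithm, the spanning tree is built with Prim's algorithm, and node labels are mutually comparable (for example, all strings). The example graph is hypothetical.

```python
import heapq
import math

def dijkstra(graph, source):
    """Shortest distances and predecessors from `source`."""
    dist, prev = {source: 0}, {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def steiner_tree_approx(graph, terminals):
    """Returns a set of edges (frozensets) connecting all terminals."""
    terminals = list(terminals)
    # Step 1: shortest paths between regular points (complete graph on R).
    info = {t: dijkstra(graph, t) for t in terminals}
    # Step 2: Prim's algorithm on that complete graph.
    in_tree, tree_edges = {terminals[0]}, set()
    while len(in_tree) < len(terminals):
        _, u, v = min(
            (info[u][0][v], u, v)
            for u in in_tree for v in terminals if v not in in_tree
        )
        in_tree.add(v)
        # Step 3: replace edge (u, v) by the shortest path it stands for.
        prev, node = info[u][1], v
        while node != u:
            tree_edges.add(frozenset((prev[node], node)))
            node = prev[node]
    return tree_edges

# Hypothetical graph: regular points a, b, c and one extra node d.
g = {
    "a": {"d": 1, "b": 2},
    "b": {"d": 1, "a": 2, "c": 2},
    "c": {"d": 1, "b": 2},
    "d": {"a": 1, "b": 1, "c": 1},
}
print(steiner_tree_approx(g, ["a", "b", "c"]))
```

The union of the expanded paths may occasionally contain a cycle; removing any cycle edge only lowers the weight, so the bound is unaffected.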

Page 39

Theorem: The performance ratio of this algorithm is 2.

Proof:

We only have to prove that the triangle inequality holds. If

dist(a,c) > dist(a,b) + dist(b,c) ......(3)

then we modify the path from a to c to go a→b→c, which is shorter than dist(a,c). Since dist(a,c) is the length of a shortest path, (3) is impossible.

Page 40

Example II-1

(Figure: the given graph on nodes a, b, c, d, e, f, g with weighted edges.)

Page 41

Example II-2

(Figure: the modified complete graph on nodes a, b, c, d. Edge labels: e/3, e/4, f/2, g/3, f-c-g/5, e-c-g/7.)

Page 42

Example II-3

(Figure: the minimum spanning tree on nodes a, b, c, d. Edge labels: e/3, f/2, g/3.)

Page 43

Example II-4

(Figure: the approximate Steiner tree on nodes a, b, c, d, e, f, g, with edge weights 1, 1, 2, 2, 1, 1.)

Page 44

Approximation Algorithm for TSP with triangle inequality

• Assumption: the triangle inequality holds. That is, d (a, c) ≤ d (a, b) + d (b, c).

• This condition is reasonable, for example, whenever the cities are points in the plane and the distance between two points is the Euclidean distance.

• Theorem: TSP with triangle inequality is also NP-hard.

Page 45

Ratio 2 Algorithm

Algorithm A:

1. Compute a minimum spanning tree. (Figure a)

2. Visit all the cities by traversing twice around the tree. This visits some cities more than once. (Figure b)

3. Shortcut the tour by going directly to the next unvisited city. (Figure c)
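Algorithm A can be sketched for cities given as points in the plane. The use of Prim's algorithm and the function name are assumptions. Walking twice around the tree and skipping already visited cities gives the same order as a preorder walk of the tree, which is what the sketch uses.

```python
import math

def tsp_twice_around_tree(points):
    """Returns (tour as index list ending at the start, tour length, MST weight)."""
    n = len(points)
    # Step 1: minimum spanning tree via Prim's algorithm, rooted at city 0.
    in_tree, best, parent = [False] * n, [math.inf] * n, [-1] * n
    children = [[] for _ in range(n)]
    best[0], mst_weight = 0.0, 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # Steps 2 and 3: preorder walk = twice around the tree with shortcuts.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    tour.append(0)  # return to the start
    length = sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))
    return tour, length, mst_weight

print(tsp_twice_around_tree([(0, 0), (1, 0), (1, 1), (0, 1)]))
```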

Page 46

Example:

(a) A spanning tree

(b) Twice around the tree

(c) A tour with shortcut

Page 47

Proof of Ratio 2

1. The cost of a minimum spanning tree, cost(T), is not greater than opt(TSP), the cost of an optimal TSP tour. (Why? A spanning tree has n-1 edges and a TSP tour has n edges. Deleting one edge from the tour gives a spanning tree, and the minimum spanning tree has the smallest cost among all spanning trees.)

2. The cost of the TSP tour produced by our algorithm is at most 2×cost(T) and thus at most 2×opt(TSP).

Page 48

Center Selection Problem

Problem: Given a set of points V in the plane (or some other metric space), find k points c1, c2, …, ck such that for each v in V,

min {d(v, ci) : i = 1, 2, …, k} ≤ d

and d is minimized.

Page 49

Farthest-point clustering algorithm

Step 1: Arbitrarily select a point in V as c1.

Step 2: Let i=2.

Step 3: Pick a point ci from V − {c1, c2, …, c_{i-1}} to maximize min {|c1 ci|, |c2 ci|, …, |c_{i-1} ci|}.

Step 4: i=i+1.

Step 5: Repeat Steps 3 and 4 until i > k (that is, until ck has been chosen).
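The steps can be sketched for points in the plane as follows. The function name is an assumption, the first point of the list plays the role of the arbitrary c1, and k is assumed to be at most the number of points.

```python
import math

def farthest_point_centers(points, k):
    """Returns (chosen centers, largest distance from a point to its nearest center)."""
    centers = [points[0]]                       # step 1: arbitrary first center
    # Distance from every point to its nearest chosen center so far.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:                     # steps 3-5
        # The point maximizing the minimum distance to the chosen centers.
        idx = max(range(len(points)), key=lambda i: nearest[i])
        centers.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return centers, max(nearest)

pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
print(farthest_point_centers(pts, 2))
```

Keeping the running `nearest` list avoids recomputing the inner minimum from scratch, so each round costs one pass over the points.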

Page 50

Theorem: The farthest-point clustering algorithm has ratio 2.

Proof: Let ci be a point in V that maximizes

δ_i = min {|c1 ci|, |c2 ci|, …, |c_{i-1} ci|}.

We have δ_i ≤ δ_{i-1} for any i.

Since two, say ci and cj (i > j), of the k+1 points must be in the same group (in an opt solution), δ_i ≤ 2 opt.

Thus, δ_{k+1} ≤ 2 opt.

For any v in V, by the definition of δ_{k+1},

min {|c1 v|, |c2 v|, …, |ck v|} ≤ δ_{k+1}.

So the algorithm has ratio 2.

Page 51

Vertex Cover Problem

• Given a graph G=(V, E), find V' ⊆ V with the minimum number of vertices such that for each edge (u, v) ∈ E at least one of u and v is in V'.

• V' is called a vertex cover.

• The problem is NP-hard.

• A ratio-2 algorithm exists for the vertex cover problem.
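The slides do not say which ratio-2 algorithm is meant. One standard candidate, sketched here as an assumption, takes both endpoints of every edge that is not yet covered. The picked edges share no endpoints, and any cover needs at least one endpoint of each, which gives the factor 2.

```python
def vertex_cover_ratio2(edges):
    """Greedy sketch: returns a set of vertices covering every edge."""
    cover = set()
    for u, v in edges:
        # An uncovered edge: take both of its endpoints.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

path = [("a", "b"), ("b", "c"), ("c", "d")]
print(vertex_cover_ratio2(path))
```

On this path the sketch returns all four vertices while {b, c} is optimal, so the factor 2 is attained exactly.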