precept 4: traveling salesman problem, hierarchical clustering...a naive algorithm to find a tsp...

30
Precept 4: Traveling Salesman Problem, Hierarchical Clustering Qian Zhu 2/23/2011

Upload: others

Post on 28-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Precept 4: Traveling Salesman Problem, Hierarchical Clustering

Qian Zhu2/23/2011

Page 2: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Agenda

• Assignment: Traveling salesman problem• Hierarchical clustering

– Example– Comparisons with K-means

Page 3: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

TSP

• TSP: Given the coordinates for a set of vertices,• Find an order of connecting the vertices, such that 1) each

vertex is visited once (except for the start vertex which is visited twice), and 2) the total distance is minimum. Start vertex and end vertex must be the same.

Page 4: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Can you spot a TSP touramong these possible answers?

Page 5: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

TSP

• Smaller problems (n=50): humans can do very well.

• Larger problems: not so much… • Machine: there is no general way to solve TSP.

Computationally hard. – Challenge: find a polynomial time algorithm to

solve this problem, and you will win a million bucks!

Page 6: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

A naive algorithm to find a TSP tour

• Exhaustive search among all possible tours, and pick the minimal tour.

• How many possibilities?– N cities– N x (N-1) x (N-2) x … x 1 – N!

• Obviously not very efficient.

Page 7: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

TSP

• Let’s say no polynomial time algorithm exists to solve this problem.

• We can use heuristic methods to derive a reasonably good answer in reasonable time.

– Heuristics: no guarantee that answer is optimal

– Modern methods can solve much larger TSP (involving millions of nodes) in reasonable time.

Page 8: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Assignment goals

• Use heuristic methods to estimate answer for a computationally hard problem

• Visualize the answer to see how well the heuristics perform.

• Implement a linked list structure to represent a tour in Java.– You cannot use LinkedList from Java standard library.

• We will use two heuristics:• Nearest neighbor• Minimize total tour distance

– Extra credit: implement your own heuristic

Page 9: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Tour representation

• Circular linked list

private class Node{Point P;Node next;

}

Node A

A.p

A.next

size-1

Node A

A.p

A.next

Node B

B.p

B.next

size-2

Inner class: better encapsulation

Page 10: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Tour traversal

• Start from the head node• Iterate through the list until the head node is

reached again.

Page 11: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Adding a node to the listNode A

A.p

A.next

Node B

B.p

B.next

Node C

C.p

C.next

1 2

Add:

Node A

A.p

A.next

Node B

B.p

B.next

Node A

A.p

A.next

Node B

B.p

B.next

Node C

C.p

C.next

To:

Insert here

Page 12: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Nearest neighbor (NN)

• Read in the next point, and add it to the current tour after the point to which it is closest.

t_list = list of points in current tourp_list = list of pointsfor each p in p_list:

x = closest point in t_listadd p to t_list, after x

A C

B

E

Current point

Current tour

Compare E with each node in the tour.Pick the node that is closest to E. Connect E after this node.

Page 13: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Nearest neighbor (NN)

A C

B

E

Current point

Current tour

Let’s say B is closest to E. Add E after E:

A C

B

E

Page 14: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Smallest increase heuristic

• Read in the next point, add it to the current tour after the point where it results in the least possible increase in tour length.

t_list = list of points in current tourp_list = list of pointsfor each p in p_list:

x = point where if p is inserted after x, smallest increase in tour length

add p to t_list, after x

Need to calculate every possible increase in order to know which is smallest.

Page 15: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

How to calculate increase in tour length?

A

C

BE

5

A

C

B

4

E4

Increase in tour length = (4+4) – 5 = 3

Page 16: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Assume…

• The order of inserting nodes is given. (use the order of nodes in the input file)

• Does order matter?– Try randomize the order and note whether you

get a better answer.

Page 17: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Easy problems:– Shortest path problem:

• Assume: vertices and edges are known• Given two vertices. Find the shortest connected path between them.• Algorithm: Dijkstra (O(|V|2))• Application: route planning

– Chinese postman problem:• A mailman wants to deliver mail to a certain neighborhood. The mailman

is lazy, and he wants to find the shortest route through neighborhood, that meets these criteria:

– It is a closed circuit– He needs to go through every street at least once

• Algorithm: Edmond (O(|V|3))• Application: UPS delivery route planning

Edges weighted

Edges weighted

Page 18: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Easy problems:– Minimum spanning tree problem

• Given: a connected graph G with weighted edges.• Find: a connected subgraph that touches every vertex in the graph. The

sum of weight of edges must be minimal.• Algorithm: Prim and Kruskal (O(|E| log |E|))• Application: A cable TV company is laying cable to a new neighborhood. It

is constrained to bury cable only along certain paths. Find a spanning tree that is a subset of those paths that connects every house and incur minimum laying cost. Edges weighted

Page 19: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Easy problems (continued):– Euler Trail problem:

Seven bridges of Konigsberg -

Take a walk through the city that would cross each bridge once and only once.

Is there a solution?

Page 20: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Euler Trail problem (continued):– Let’s convert it to a graph problem:

• Define a trail: a path that visits every edge exactly once.• Open trail: a trail where start != end.• Closed trail: a trail where start = end.• Find an open trail or a closed trail in the graph. Edges unweighted

Can you find an Euler trail in Konigsberg?

Page 21: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Euler Trail problem (continued):– Modern day Konigsberg (after bombing during

WWII destroyed two bridges):

Page 22: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Other problems similar to TSP

• Hard problems:– Find a Hamiltonian circuit (a path that visits every vertex in

the graph exactly once and return to starting vertex)

Edges unweighted

Page 23: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Hierarchical clustering

0.1 0.2 0.3 0.4

-0.4

-0.2

0.0

0.2

0.4

0.6

0.8

x[,1]

x[,2

]

2

3

4

5

1

1 2 3 4

2 0.62

3 0.42 0.19

4 1.32 0.81 0.93

5 1.02 0.48 0.61 0.33

Page 24: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Hierarchical clustering (single-linkage)1 2 3 4

2 0.62

3 0.42 0.19

4 1.32 0.81 0.93

5 1.02 0.48 0.61 0.33

node nodemin distmin

1 3 0.422 3 0.193 2 0.194 5 0.335 4 0.33

Join 2 and 3

1 2 3 4

2 0.62

3 0.42 0.19

4 1.32 0.81 0.93

5 1.02 0.48 0.61 0.33

1 2,3 4

2,3 0.42

4 1.32 0.81

5 1.02 0.48 0.33

Iteration 1

Page 25: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

1 2,3 4

2,3 0.42

4 1.32 0.81

5 1.02 0.48 0.33

node nodemin distmin

1 2,3 0.422,3 1 0.424 5 0.335 4 0.33

1 2,3 4

2,3 0.42

4 1.32 0.81

5 1.02 0.48 0.33

Join 4 and 5

1 2,3

2,3 0.42

4,5 1.02 0.48

Iteration 2

Page 26: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

1 2,3

2,3 0.42

4,5 1.02 0.48

node nodemin distmin

1 2,3 0.422,3 1 0.424,5 2,3 0.48

1 2,3

2,3 0.42

4,5 1.02 0.48

Join 1 and 2,3

1,2,3

4,5 0.48

Iteration 3

Page 27: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

node nodemin distmin

1,2,3 4,5 0.484,5 1,2,3 0.48

1,2,3

4,5 0.48

Join 1,2,3 and 4,5

Final Tree:

{ { {2, 3}, 1}, {4, 5} }4 5

1

2 30.15

0.20

0.25

0.30

0.35

0.40

0.45

Cluster Dendrogram

hclust (*, "single")dist(x)

Hei

ght

Note that the height corresponds to the minimum distance at each iteration

Iteration 4

Page 28: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Exercise

-0.5 0.0 0.5 1.0

-0.5

0.0

0.5

1.0

1.5

x[,1]

x[,2

]

67

8

9

10

3

1

24

5

1 2 3 4 5 6 7 8 9

2 0.46

3 1.02 1.43

4 0.19 0.37 1.06

5 0.76 0.88 0.90 0.62

6 1.37 1.83 0.94 1.52 1.70

7 1.42 1.88 1.00 1.58 1.76 0.06

8 1.92 2.38 1.35 2.07 2.19 0.55 0.50

9 1.23 1.68 0.86 1.38 1.58 0.14 0.19 0.69

10 2.18 2.62 1.78 2.35 2.57 0.87 0.81 0.49 0.99

Pairwise distances

Cluster these points using single-linkage method, using complete-linkage method

Page 29: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Answers9

6 7

8 10

3

5

2

1 4

0.0

0.5

1.0

1.5

2.0

2.5

Cluster Dendrogram

hclust (*, "complete")dist(x)

Hei

ght 5

2

1 4

3

9

6 7

8 10

0.0

0.2

0.4

0.6

0.8

Cluster Dendrogram

hclust (*, "single")dist(x)

Hei

ght

Complete linkage Single linkage

Page 30: Precept 4: Traveling Salesman Problem, Hierarchical Clustering...A naive algorithm to find a TSP tour • Exhaustive search among all possible tours, ... – Shortest path problem:

Hierarchical vs K-means

• Run-time analysis. Which clustering algorithm runs faster?

• What’s the bottleneck step in hierarchical clustering?

• What’s the disadvantage of K-means?• What’s the disadvantage of hierarchical

clustering?