case studies: bin packing & the traveling salesman problem

© 2010 AT&T Intellectual Property. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Intellectual Property. Case Studies: Bin Packing & The Traveling Salesman Problem David S. Johnson AT&T Labs – Research Part II


Page 1: Case Studies:  Bin Packing & The Traveling Salesman Problem

© 2010 AT&T Intellectual Property. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Intellectual Property.

Case Studies: Bin Packing &

The Traveling Salesman Problem

David S. Johnson, AT&T Labs – Research

Part II

Page 2: Case Studies:  Bin Packing & The Traveling Salesman Problem

The Traveling Salesman Problem

Given: Set of cities $\{c_1, c_2, \ldots, c_N\}$.

For each pair of cities $\{c_i, c_j\}$, a distance $d(c_i, c_j)$.

Find: Permutation $\pi: \{1,2,\ldots,N\} \to \{1,2,\ldots,N\}$ that minimizes

$$\sum_{i=1}^{N-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(N)}, c_{\pi(1)})$$
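The objective above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `tour_length`, the 0-based city indices, and the four-point example instance are assumptions, not part of the slides.

```python
import math

def tour_length(perm, dist):
    """Length of the closed tour that visits cities in the order perm.

    perm is a permutation of 0..N-1 (0-based, unlike the slides);
    dist(i, j) returns the distance between cities i and j.
    """
    n = len(perm)
    # The i = n-1 term wraps around: it is the edge from the last city to the first.
    return sum(dist(perm[i], perm[(i + 1) % n]) for i in range(n))

# Hypothetical instance: the corners of a unit square.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]

def euclid(i, j):
    return math.dist(points[i], points[j])
```

Calling `tour_length([0, 1, 2, 3], euclid)` walks the perimeter of the square, while `tour_length([0, 2, 1, 3], euclid)` uses both diagonals and is longer.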

Page 3: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 10

Page 4: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 10

Page 5: Case Studies:  Bin Packing & The Traveling Salesman Problem

Other Types of Instances

• X-ray crystallography
– Cities: orientations of a crystal
– Distances: time for motors to rotate the crystal from one orientation to the other

• High-definition video compression
– Cities: binary vectors of length 64 identifying the summands for a particular function
– Distances: Hamming distance (the number of terms that need to be added/subtracted to get the next sum)

Page 6: Case Studies:  Bin Packing & The Traveling Salesman Problem

No-Wait Flowshop Scheduling

– Cities: Length-4 vectors <c1,c2,c3,c4> of integer task lengths for a given job. Each job consists of tasks that require 4 processors, which must be used in order, and the task on processor i+1 must start as soon as the task on processor i is done.

– Distances: d(c,c’) = Increase in the finish time of the 4th processor if c’ is run immediately after c.

– Note: Not necessarily symmetric: may have d(c,c’) ≠ d(c’,c).

Page 7: Case Studies:  Bin Packing & The Traveling Salesman Problem

How Hard?

• NP-Hard for all the above applications and many more

– [Karp, 1972]

– [Papadimitriou & Steiglitz, 1976]

– [Garey, Graham, & Johnson, 1976]

– [Papadimitriou & Kanellakis, 1978]

– …

Page 8: Case Studies:  Bin Packing & The Traveling Salesman Problem

How Hard?

Number of possible tours:

$N! = 1 \cdot 2 \cdot 3 \cdots (N-1) \cdot N = 2^{\Theta(N \log N)}$

10! = 3,628,800

20! ≈ $2.43 \times 10^{18}$ (2.43 quintillion)

Dynamic Programming Solution:

$O(N^2 2^N) = o(2^{N \log N})$

Page 9: Case Studies:  Bin Packing & The Traveling Salesman Problem

Dynamic Programming Algorithm

• For each subset C’ of the cities containing $c_1$, and each city c ∈ C’, let f(C’,c) = Length of shortest path that is a permutation of C’, starting at $c_1$ and ending at c.

• $f(\{c_1\}, c_1) = 0$

• For x ∉ C’, $f(C' \cup \{x\}, x) = \min_{c \in C'} \left[ f(C',c) + d(c,x) \right]$.

• Optimal tour length = $\min_{c \in C} \left[ f(C,c) + d(c, c_1) \right]$.

• Running time: ~$(N-1)2^{N-1}$ items to be computed, at time N for each = $O(N^2 2^N)$
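The recurrence above can be sketched as follows. This is a minimal illustration, assuming a distance matrix `d` with 0-based indices in which city 0 plays the role of $c_1$; the name `held_karp` and the bitmask encoding of subsets are choices made here, not taken from the slides.

```python
from itertools import combinations

def held_karp(d):
    """Optimal tour length for distance matrix d via the subset DP.

    f[(mask, c)] is the length of the shortest path that starts at city 0,
    visits exactly the cities in mask (a bitmask over cities 1..n-1),
    and ends at city c.
    """
    n = len(d)
    if n == 1:
        return 0
    # Base case: paths 0 -> c that visit a single extra city.
    f = {(1 << c, c): d[0][c] for c in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << c for c in subset)
            for x in subset:
                prev = mask ^ (1 << x)  # same subset without the endpoint x
                f[(mask, x)] = min(f[(prev, c)] + d[c][x]
                                   for c in subset if c != x)
    full = (1 << n) - 2  # all cities except city 0
    # Close the tour by returning to city 0.
    return min(f[(full, c)] + d[c][0] for c in range(1, n))
```

The table holds roughly $(N-1)2^{N-1}$ entries and each takes O(N) work, matching the running time stated on the slide. The sketch also handles asymmetric matrices, since it never assumes d[i][j] = d[j][i].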

Page 10: Case Studies:  Bin Packing & The Traveling Salesman Problem

How Hard?

Number of possible tours:

$N! = 1 \cdot 2 \cdot 3 \cdots (N-1) \cdot N = 2^{\Theta(N \log N)}$

10! = 3,628,800

20! ≈ $2.43 \times 10^{18}$ (2.43 quintillion)

Dynamic Programming Solution:

$O(N^2 2^N)$

$10^2 \cdot 2^{10}$ = 102,400

$20^2 \cdot 2^{20}$ = 419,430,400
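The counts on this slide can be recomputed directly. The helper name `tour_count_vs_dp` is hypothetical, and the constant factor hidden in the O-notation is taken to be 1 purely for illustration.

```python
import math

def tour_count_vs_dp(n):
    """Return (N!, N^2 * 2^N): the number of permutations versus the DP work estimate."""
    return math.factorial(n), n * n * 2 ** n

for n in (10, 20):
    factorial, dp_work = tour_count_vs_dp(n)
    print(f"N = {n}: N! = {factorial:,}   N^2 * 2^N = {dp_work:,}")
```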

Page 11: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 10

Page 12: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 100

Page 13: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 1000

Page 14: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 10000

Page 15: Case Studies:  Bin Packing & The Traveling Salesman Problem

Planar Euclidean Application #1

• Cities:

– Holes to be drilled in printed circuit boards

Page 16: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 17: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 18: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 2392

Page 19: Case Studies:  Bin Packing & The Traveling Salesman Problem

Planar Euclidean Application #2

• Cities:

– Wires to be cut in a “Laser Logic” programmable circuit

Page 20: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 21: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 7397

Page 22: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 33,810

Page 23: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 85,900

Page 24: Case Studies:  Bin Packing & The Traveling Salesman Problem

Standard Approach to Coping with NP-Hardness:

• Approximation Algorithms

– Run quickly (polynomial-time for theory, low-order polynomial time for practice)

– Obtain solutions that are guaranteed to be close to optimal

– For the latter and the TSP, we need the triangle inequality to hold:

d(a,c) ≤ d(a,b) + d(b,c)

Page 25: Case Studies:  Bin Packing & The Traveling Salesman Problem

No Δ-Inequality Danger

• Theorem [Karp, 1972]: Given a graph G = (V,E), it is NP-hard to determine whether G contains a Hamiltonian circuit (a collection of edges that make up a tour).

• Given a graph, construct a TSP instance in which

$$d(c,c') = \begin{cases} 1 & \text{if } \{c,c'\} \in E \\ N 2^N & \text{if } \{c,c'\} \notin E \end{cases}$$

• If a Hamiltonian circuit exists, OPT = N; if not, OPT > $N 2^N$.

• A polynomial-time approximation algorithm that guaranteed a tour of length no more than $2^N \cdot$ OPT would imply P = NP.
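The construction can be checked on tiny graphs. The sketch below builds the distance matrix from an edge list and finds OPT by brute force; the names `tsp_from_graph` and `brute_force_opt` are hypothetical, and brute force is only practical for a handful of cities.

```python
from itertools import permutations

def tsp_from_graph(n, edges):
    """Distance matrix with d = 1 on graph edges and N * 2**N on non-edges."""
    big = n * 2 ** n
    edge_set = {frozenset(e) for e in edges}
    return [[0 if i == j else (1 if frozenset((i, j)) in edge_set else big)
             for j in range(n)] for i in range(n)]

def brute_force_opt(d):
    """Optimal tour length by trying every tour that starts at city 0."""
    n = len(d)
    return min(sum(d[t[i]][t[(i + 1) % n]] for i in range(n))
               for t in ((0,) + p for p in permutations(range(1, n))))

# A 4-cycle has a Hamiltonian circuit; a 4-vertex path does not.
cycle = tsp_from_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
path = tsp_from_graph(4, [(0, 1), (1, 2), (2, 3)])
```

For the cycle, `brute_force_opt` returns N = 4. For the path, every tour must use at least one non-edge, so the optimum exceeds $N 2^N = 64$.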

Page 26: Case Studies:  Bin Packing & The Traveling Salesman Problem

Nearest Neighbor (NN):

1. Start with some city.

2. Repeatedly go next to the nearest unvisited neighbor of the last city added.

3. When all cities have been added, go from the last back to the first.
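The three steps above can be sketched as a straightforward O(N²) routine for points in the plane. The function name, the Euclidean metric, and the index-based tie-breaking are assumptions made for this illustration.

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Return the NN visiting order; the tour closes by returning to start."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        # Nearest unvisited city to the last city added; ties go to the lower index.
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour
```

For example, on the collinear points `[(0, 0), (1, 0), (3, 0), (6, 0)]` with `start=1`, the tour first steps back to city 0 and only then heads right, which hints at how NN can be drawn into long return trips.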

Page 27: Case Studies:  Bin Packing & The Traveling Salesman Problem

Note: By Δ-inequality, an optimal tour need not contain any crossed edges.

[Figure: tour edges AC and BD crossing at point E.]

d(A,D) ≤ d(A,E) + d(E,D)

d(B,C) ≤ d(B,E) + d(E,C)

d(A,D) + d(B,C) ≤ (d(A,E) + d(E,D)) + (d(B,E) + d(E,C))

= (d(A,E) + d(E,C)) + (d(B,E) + d(E,D))

= d(A,C) + d(B,D)

For the Euclidean metric, the inequalities are strict (unless all relevant cities are collinear).

Page 28: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Theorem [Rosenkrantz, Stearns, & Lewis, 1977]:

– There exists a constant c such that, if instance I obeys the triangle inequality, then NN(I) ≤ c·log(N)·Opt(I).

– There exists a constant c’ such that, for all N > 3, there are N-city instances I obeying the triangle inequality for which NN(I) > c’·log(N)·Opt(I).

– For any algorithm A, let

$R_N(A) = \max\{A(I)/OPT(I) : I \text{ is an } N\text{-city instance}\}$

– Then $R_N(NN) = \Theta(\log N)$ (worst-case ratio)

Page 29: Case Studies:  Bin Packing & The Traveling Salesman Problem

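Lower Bound Examples

[Figure: the instance $F_1$ and the recursive construction of $F_{i+1}$ from two copies of $F_i$. Labels in the figure: edge lengths 1 and 1+ε, and distances $L_i/2 + 1$.]

$L_1 = 2$, $L_{i+1} = 2L_i + 2$

Let $L_i$ be the number of edges encountered if we travel step-by-step from the leftmost vertex in $F_i$ to the rightmost.

(NN starts at left, ends in middle)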

Page 30: Case Studies:  Bin Packing & The Traveling Salesman Problem

Set the distance from the rightmost vertex of $F_k$ to the leftmost to 1+ε, and set all non-specified distances to their Δ-inequality minimum.

$$OPT(F_k)/(1+\varepsilon) \le N = L_k + 1 = (2L_{k-1} + 2) + 1 = (2(2L_{k-2} + 2) + 2) + 1 = \cdots = 2^{k-1}L_1 + \sum_{i=0}^{k-1} 2^i = 2^k + 2^k - 1 < 2^{k+1}$$

$\log_2(N) < k+1$

$$NN(F_k) \ge L_k + 1 + \sum_{i=1}^{k-1} 2^{k-i-1} L_i > 2^k + \sum_{i=1}^{k-1} 2^{k-i-1} 2^i = 2^k + (k-1)2^{k-1}$$

$$> k\,2^{k-1} > (k/4)\, OPT(F_k)/(1+\varepsilon) = \Omega(\log(N)\, OPT(F_k))$$

Page 31: Case Studies:  Bin Packing & The Traveling Salesman Problem

Upper Bound Proof (Sketch)

• Observation 1: For symmetric instances with the Δ-inequality, d(c,c’) ≤ OPT(I)/2 for all pairs of cities c,c’.

[Figure: an optimal tour through c and c’, which split it into two paths P1 and P2.]

Δ-inequality ⇒ d(c,c’) ≤ length(P1) and d(c,c’) ≤ length(P2)

⇒ 2d(c,c’) ≤ OPT(I)

Page 32: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Observation 2: For each city c, let L(c) be the length of the edge added when c was the last city in the tour.

For all pairs c,c’, d(c,c’) ≥ min(L(c),L(c’)).

• Proof: Suppose without loss of generality that c is the first of c,c’ to occur in the NN tour. When c was the last city, the next city chosen, say c*, satisfied d(c,c*) ≤ d(c,c’’) for all unvisited cities c’’. Since c’ was as yet unvisited, this implies d(c,c’) ≥ d(c,c*) = L(c).

• Overall Proof Idea: We will partition C into O(logN) disjoint sets $C_i$ such that $\sum_{c \in C_i} L(c) \le OPT(I)$, implying that

$NN(I) = \sum_{c \in C} L(c) = O(\log N)\, OPT(I)$.

Page 33: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Set $X_0 = C$, i = 0 and repeat while $|X_i| > 3$.

• Let $T_i$ be the set of edges in an optimal tour for $X_i$ -- note that $|T_i| = |X_i|$. By the Δ-inequality, the total length of these edges is at most OPT(I).

• Let $T_i'$ be the set of edges in $T_i$ with length greater than $2\,OPT(I)/|T_i|$ -- note that $|T_i'| < |T_i|/2$.

• For each edge e = {c,c’} in $T_i - T_i'$, let f(e) be a city c’’ ∈ {c,c’} such that L(c’’) ≤ d(c,c’).

• Set $C_i = \{f(e) : e \in T_i - T_i'\}$ -- note that

– $\sum_{c \in C_i} L(c) \le \sum_{e \in T_i} length(e) \le OPT(I)$.

– $|C_i| \ge |T_i - T_i'|/2 \ge |T_i|/4 = |X_i|/4$.

• Set $X_{i+1} = X_i - C_i$ -- note that $|X_{i+1}| \le (3/4)|X_i|$.

Page 34: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Given that $|X_{i+1}| \le (3/4)|X_i|$ for i ≥ 0, this process can only continue while $(3/4)^i N > 3$, i.e., while $i \log(3/4) + \log N > \log 3$, or equivalently while $\log N - \log 3 > -i \log(3/4) = i \log(4/3)$.

• Hence the process halts at iteration i*, where

$$i^* \le \frac{\log(N) - \log(3)}{\log(4/3)} = O(\log N)$$

• At this point there are 3 or fewer cities in $X_{i^*+1}$. Put each of them in a separate set $C_{i^*+j}$, $1 \le j \le |X_{i^*+1}|$, for which, by Observation 1, we will have L(c) ≤ OPT(I)/2.

Page 35: Case Studies:  Bin Packing & The Traveling Salesman Problem

Greedy (Multi-Fragment) (GR):

1. Sort the edges, shortest first, and treat them in that order. Start with an empty graph.

2. While the current graph is not a TSP tour, attempt to add the next shortest edge. If it yields a vertex degree exceeding 2 or a cycle of length less than N, delete it.
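A simple way to sketch the two steps above is to sort all N(N-1)/2 edges and use a union-find structure to detect short cycles. The function name, the Euclidean metric, and the union-find bookkeeping are assumptions of this illustration; it presumes N ≥ 3.

```python
import math
from itertools import combinations

def greedy_tour_edges(points):
    """Return the N edges of the Greedy tour as (u, v) index pairs (assumes N >= 3)."""
    n = len(points)
    parent = list(range(n))  # union-find forest over tour fragments

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    degree = [0] * n
    chosen = []
    edges = sorted(combinations(range(n), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    for u, v in edges:
        if degree[u] == 2 or degree[v] == 2:
            continue  # would create a vertex of degree 3
        ru, rv = find(u), find(v)
        if ru == rv and len(chosen) < n - 1:
            continue  # would close a cycle with fewer than N edges
        parent[ru] = rv
        degree[u] += 1
        degree[v] += 1
        chosen.append((u, v))
        if len(chosen) == n:
            break
    return chosen
```

Sorting dominates the cost, giving the O(N² log N) worst-case bound listed later in the slides; the kd-tree version described further on avoids materializing all edges.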

Page 36: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Theorem [Ong & Moore, 1984]: For all instances I obeying the Δ-Inequality, GR(I) = O(logN)·OPT(I).

• Theorem [Frieze, 1979]: There are N-city instances $I_N$ for arbitrarily large N that obey the Δ-Inequality and have $GR(I_N) = \Omega(\log N/\log\log N) \cdot OPT(I_N)$.

Page 37: Case Studies:  Bin Packing & The Traveling Salesman Problem

Nearest Insertion (NI):

1. Start with a 2-city tour consisting of some city and its nearest neighbor.

2. Repeatedly insert the non-tour city that is closest to a tour city (in the location that yields the smallest increase in tour length).
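The two steps above can be sketched directly. This version makes no attempt at efficiency (it rescans all tour cities at each step); the function name and the Euclidean metric are assumptions of the illustration.

```python
import math

def nearest_insertion_tour(points, start=0):
    """Return the Nearest Insertion visiting order for points in the plane."""
    n = len(points)

    def d(i, j):
        return math.dist(points[i], points[j])

    # Step 1: a 2-city tour made of the start city and its nearest neighbor.
    first = min((j for j in range(n) if j != start), key=lambda j: d(start, j))
    tour = [start, first]
    outside = set(range(n)) - set(tour)
    while outside:
        # Step 2a: the non-tour city that is closest to some tour city.
        c = min(outside, key=lambda j: min(d(j, t) for t in tour))
        m = len(tour)
        # Step 2b: the tour edge (tour[k], tour[k+1]) whose replacement by
        # (tour[k], c), (c, tour[k+1]) increases the tour length the least.
        best = min(range(m),
                   key=lambda k: d(tour[k], c) + d(c, tour[(k + 1) % m])
                   - d(tour[k], tour[(k + 1) % m]))
        tour.insert(best + 1, c)
        outside.remove(c)
    return tour
```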

Page 38: Case Studies:  Bin Packing & The Traveling Salesman Problem

• Theorem [Rosenkrantz, Stearns, & Lewis, 1977]: If instance I obeys the triangle inequality, then R(NI) = 2.

• Key Ideas of Upper Bound Proof:

– The minimum spanning tree (MST) for I is no longer than the optimal tour, since deleting an edge from the tour yields a spanning tree.

– The length of the NI tour is at most twice the length of an MST.

[Figure: tour cities A and B and a non-tour city C being inserted between them.]

d(B,C) ≤ d(B,A) + d(A,C) by Δ-inequality, so

d(A,C) + d(B,C) – d(B,A) ≤ 2d(A,C)

(A,C) would be the next edge added by Prim’s minimum spanning tree algorithm

Page 39: Case Studies:  Bin Packing & The Traveling Salesman Problem

Lower Bound Examples

[Figure: example instance whose drawn edges are each labeled 2.]

NI(I) = 2N-2

OPT(I) = N+1

Page 40: Case Studies:  Bin Packing & The Traveling Salesman Problem

Another Approach

• Observation 1: Any connected graph in which every vertex has even degree contains an “Euler Tour” – a cycle that traverses each edge exactly once, which can be found in linear time. [Problem 2: Prove this!]

• Observation 2: If the Δ-inequality holds, then traversing an Euler tour but skipping past previously-visited vertices yields a Traveling Salesman tour of no greater length.

Page 41: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 42: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 43: Case Studies:  Bin Packing & The Traveling Salesman Problem

Obtaining the Initial Graph

• Double MST algorithm (DMST):

– Combine two copies of an MST.

– Theorem [Folklore]: DMST(I) ≤ 2Opt(I).

• Christofides algorithm (CH):

– Combine one copy of an MST with a minimum-length matching on its odd-degree vertices (there must be an even number of them since the total sum of degrees for any graph is even).

– Theorem [Christofides, 1976]: CH(I) ≤ 1.5Opt(I).
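The Double MST idea can be sketched compactly, because an Euler tour of the doubled tree with repeated vertices skipped is exactly a depth-first preorder walk of the MST. The sketch below covers only DMST (Christofides additionally needs a minimum-length matching, which is omitted here); the function name, the Euclidean metric, and the O(N²) Prim implementation are assumptions.

```python
import math

def double_mst_tour(points):
    """Return (tour, mst_length) for the Double MST heuristic."""
    n = len(points)

    def d(i, j):
        return math.dist(points[i], points[j])

    # Prim's algorithm in O(N^2), rooted at city 0.
    in_tree = [False] * n
    best = [math.inf] * n   # best[v] = cheapest known edge from the tree to v
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d(u, v) < best[v]:
                best[v], parent[v] = d(u, v), u
    mst_length = sum(best)

    # Preorder walk = Euler tour of the doubled MST with shortcuts.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour, mst_length
```

Under the triangle inequality the returned tour is no longer than twice `mst_length`, which in turn is at most 2·OPT(I).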

Page 44: Case Studies:  Bin Packing & The Traveling Salesman Problem

Optimal Tour on Odd-Degree Vertices

(No longer than overall Optimal Tour)

Matching M1 + Matching M2 = Optimal Tour

Hence Optimal Matching ≤ min(M1,M2) ≤ OPT(I)/2

Page 45: Case Studies:  Bin Packing & The Traveling Salesman Problem

Can we do better?

• No polynomial-time algorithm is known that has a worst-case ratio less than 3/2 for arbitrary instances satisfying the Δ-inequality.

• Assuming P ≠ NP, no polynomial-time algorithm can do better than 220/219 = 1.004566… [Papadimitriou & Vempala, 2006].

• For Euclidean and related metrics, there exists a polynomial-time approximation scheme (PTAS): For each ε > 0, a polynomial-time algorithm with worst-case ratio ≤ 1+ε. [Arora, 1998][Mitchell, 1999].

Page 46: Case Studies:  Bin Packing & The Traveling Salesman Problem

PTAS Running Times

• [Arora, STOC 1996]: $O(N^{100/\varepsilon})$

• [Arora, JACM 1998]: $O(N (\log N)^{O(1/\varepsilon)})$

• [Rao & Smith, STOC 1998]: $O(2^{(1/\varepsilon)^{O(1)}} N + N \log N)$

• If ε = ½ and O(1) = 10, then $2^{(1/\varepsilon)^{O(1)}} > 2^{1000}$.

Page 47: Case Studies:  Bin Packing & The Traveling Salesman Problem

Performance “In Practice”

• Testbed: Random Euclidean Instances

(Results appear to be reasonably well-correlated with those for our real-world instances.)

Page 48: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 49: Case Studies:  Bin Packing & The Traveling Salesman Problem

Nearest Neighbor

Page 50: Case Studies:  Bin Packing & The Traveling Salesman Problem

Greedy

Page 51: Case Studies:  Bin Packing & The Traveling Salesman Problem

Smart Shortcuts

Page 52: Case Studies:  Bin Packing & The Traveling Salesman Problem

Smart-Shortcut Christofides

Page 53: Case Studies:  Bin Packing & The Traveling Salesman Problem

2-Opt

3-Opt

Smart-Shortcut Christofides

Page 54: Case Studies:  Bin Packing & The Traveling Salesman Problem

Original Tour Result of 2-Opt move

Original Tour Results of 3-Opt moves

Page 55: Case Studies:  Bin Packing & The Traveling Salesman Problem

Lin-Kernighan [Johnson-McGeoch Implementation]: 1.4% off optimal; 10,000,000 cities in 46 minutes at 2.6 GHz

Iterated Lin-Kernighan [J-M Implementation]: 0.4% off optimal; 100,000 cities in 35 minutes at 2.6 GHz

2-Opt [Johnson-McGeoch Implementation]: 4.2% off optimal; 10,000,000 cities in 101 seconds at 2.6 GHz

3-Opt [Johnson-McGeoch Implementation]: 2.4% off optimal; 10,000,000 cities in 4.3 minutes at 2.6 GHz

Page 56: Case Studies:  Bin Packing & The Traveling Salesman Problem

Worst-Case Running Times

• Nearest Neighbor: $O(N^2)$

• Greedy: $O(N^2 \log N)$

• Nearest Insertion: $O(N^2)$

• Double MST: $O(N^2)$

• Christofides: $O(N^3)$

• 2-Opt: $O(N^2 \cdot (\text{number of improving moves}))$

• 3-Opt: $O(N^3 \cdot (\text{number of improving moves}))$

Page 57: Case Studies:  Bin Packing & The Traveling Salesman Problem

K-d trees [Bentley, 1975, 1990]

Page 58: Case Studies:  Bin Packing & The Traveling Salesman Problem

Data Elements

Tree Vertex k:
– Widest Dimension
– Median Value in Widest Dimension
– Index L in Array T of First Point
– Index U in Array T of Last Point
– Leaf?
– Empty?

Array T: Permutation of Point indices

Array H: H[i] = index of tree leaf containing Point i

Implicit:
– Index of Parent = ⌊k/2⌋
– Index of Lower Child = 2k
– Index of Higher Child = 2k+1

Page 59: Case Studies:  Bin Packing & The Traveling Salesman Problem

Operations

• Construct kd-tree (recursively): O(NlogN)

• Delete or temporarily delete points (without rebalancing): O(1)

• Find nearest neighbor to one of points:

“typically” O(logN)

• Given point p and radius r, find all points p’ with d(p,p’) ≤ r (“ball search”):

“typically” O(logN + number of points returned).

Page 60: Case Studies:  Bin Packing & The Traveling Salesman Problem

Finding Nearest Neighbor to Point p

• Initialize M = ∞. Call rnn(H[p],p,M).

• rnn(k,p,M):

– If Vertex k is a leaf,
  • For each point x in vertex k’s list,
    – If dist(p,x) < M, set M = dist(p,x) and Champion = x.

– Otherwise,
  • Let d be vertex k’s widest dimension, m be its median value for that dimension, and $p_d$ the value of p’s coordinate in dimension d.
  • If $p_d$ > m, let NearChild = 2k+1 and FarChild = 2k.
  • Otherwise, NearChild = 2k and FarChild = 2k+1.
  • If NearChild has not been explored, rnn(NearChild,p,M).
  • If $|p_d - m|$ < M, rnn(FarChild,p,M).

– Mark Vertex k as “explored”

Page 61: Case Studies:  Bin Packing & The Traveling Salesman Problem

Nearest Neighbor Algorithm in “O(NlogN)”

• Construct kd-Tree on cities.

• Pick a starting city $c_{\pi(1)}$.

• While there remains an unvisited city, let the current last city be $c_{\pi(i)}$.

– Use the kd-Tree to find the nearest unvisited city to $c_{\pi(i)}$.

– Delete $c_{\pi(i)}$ from the kd-Tree

Page 62: Case Studies:  Bin Packing & The Traveling Salesman Problem

Greedy (Lazily) in “O(NlogN)”

Initialization:

• Construct a kd-tree on the cities. Let G = (C,∅). We will maintain the property that the kd-tree contains only vertices with degree 0 or 1 in G.

• For each city c, let n(c) be its nearest “legal” neighbor. A neighbor is legal if adding edge {c,n(c)} to the current graph does not create

– A vertex of degree 3 or

– A cycle of length less than N.

• Put triples (c,n(c),dist(c,n(c))) in a priority queue, sorted in order of increasing distance.

• For each city c, define otherend(c) = c.

Page 63: Case Studies:  Bin Packing & The Traveling Salesman Problem

Greedy (Lazily) in “O(NlogN)”

While not yet a tour:

• Extract the minimum (c,n(c),d(c,n(c))) triple from the priority queue.

• If {c,n(c)} is a legal edge, add it to G, and if either c or n(c) now has degree 2, delete it from the kd-tree.

• If c has degree 2, continue.

• Otherwise n(c) = otherend(c) and so completes a short cycle.

– Temporarily delete otherend(c) from the kd-Tree.

– Find the new nearest neighbor n’(c) of c.

– Add (c,n’(c),dist(c,n’(c))) to the priority queue.

– Put otherend(c) back in the kd-Tree.

Page 64: Case Studies:  Bin Packing & The Traveling Salesman Problem

“Nearest Insertion” in “O(NlogN)”

• Straightforward if we instead implement the variant in which each new city is inserted next to its nearest neighbor in the tour (“Nearest Addition”) – upper bound proof still goes through, but performance suffers.

• Seemingly no way to avoid $\Omega(N^2)$ if we need to find the best place to insert.

• Essentially as good performance can be obtained using Jon Bentley’s “Nearest Addition+” variant:

– If c is to be inserted and x is the nearest city in the tour to c, choose the best insertion that places x next to a city x’ with d(c,x’) ≤ 2d(c,x).

– Idea: Use ball search to find candidates for x’.

Page 65: Case Studies:  Bin Packing & The Traveling Salesman Problem

Organizing the Search

[Figure: two tour diagrams with cities labeled A, B, and C, showing the candidate 2-Opt move.]

Try each A in turn around the current tour until an improving move is found – take first one found. If none found after all cities tried for A, halt.

Try all C not adjacent to A in the current tour.
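The search organization on this slide (scan positions around the tour, take the first improving move, halt when a full scan finds none) can be sketched as a plain quadratic 2-Opt. This is an illustration only: it uses none of the neighbor-list, kd-tree, or don't-look-bit speedups discussed on the following slides, and the function name and Euclidean metric are assumptions.

```python
import math

def two_opt(points, tour):
    """First-improvement 2-Opt; returns a tour with no improving 2-Opt move."""
    n = len(tour)
    tour = list(tour)

    def d(i, j):
        return math.dist(points[i], points[j])

    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two tour edges share a city
                a, b = tour[i], tour[i + 1]
                c, e = tour[j], tour[(j + 1) % n]
                # Replace edges (a,b) and (c,e) by (a,c) and (b,e) if shorter.
                if d(a, c) + d(b, e) < d(a, b) + d(c, e) - 1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
                    break  # take the first improving move found
            if improved:
                break
    return tour
```

For the unit square `[(0, 0), (1, 0), (1, 1), (0, 1)]`, the crossed tour `[0, 2, 1, 3]` is uncrossed in a single move.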

Page 66: Case Studies:  Bin Packing & The Traveling Salesman Problem

2-Opt in “O(NlogN)”

• Loop on cities

• Nearer neighbors (via kd-trees)

• Accept first rather than best

• Approximation: Bounded Nbr Lists w. stored distances

• Approximation: Don’t Look Bits

Page 67: Case Studies:  Bin Packing & The Traveling Salesman Problem

3-Opt in “NlogN”

• 1st two swaps improve

• Betweenness

• Tour data structure

Page 68: Case Studies:  Bin Packing & The Traveling Salesman Problem

Doubly-linked List

Page 69: Case Studies:  Bin Packing & The Traveling Salesman Problem

2-Level Tree

Page 70: Case Studies:  Bin Packing & The Traveling Salesman Problem

Splay Tree

Page 71: Case Studies:  Bin Packing & The Traveling Salesman Problem

Lin-Kernighan (k-Opt on the cheap)

Page 72: Case Studies:  Bin Packing & The Traveling Salesman Problem

For more on the TSP algorithm performance, see the website for the DIMACS TSP Challenge:

http://www2.research.att.com/~dsj/chtsp/index.html/

Tour Length Normalized Running Time

Comparison: Smart-Shortcut Christofides versus 2-Opt

Page 73: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 1000 “Random Clustered Instance”

Page 74: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 10,000 “Random Clustered Instance”

Page 75: Case Studies:  Bin Packing & The Traveling Salesman Problem

Estimating Running-Time Growth Rate for 2-Opt

[Figure: three plots of normalized running time: Microseconds/N, Microseconds/$N^{1.25}$, and Microseconds/(N log N).]

Page 76: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 77: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 78: Case Studies:  Bin Packing & The Traveling Salesman Problem

Held-Karp (or “Subtour”) Bound

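• Linear programming relaxation of the following formulation of the TSP as an integer program:

• Minimize $\sum_{\text{city pairs } \{c,c'\}} x_{c,c'}\, d(c,c')$

• Subject to

– $\sum_{c' \in C} x_{c,c'} = 2$, for all $c \in C$.

– $\sum_{c \in S,\, c' \in C-S} x_{c,c'} \ge 2$, for all $S \subset C$.

– $x_{c,c'} \in \{0,1\}$, for all pairs $c,c' \in C$.

• Linear programming relaxation: replace the last constraint by $0 \le x_{c,c'} \le 1$.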

Page 79: Case Studies:  Bin Packing & The Traveling Salesman Problem

Percent by which Optimal Tour exceeds Held-Karp Bound

Page 80: Case Studies:  Bin Packing & The Traveling Salesman Problem

Computing the Held-Karp Bound

• Difficulty: Too many “subtour” constraints:

$\sum_{c \in S,\, c' \in C-S} x_{c,c'} \ge 2$, for all $S \subset C$

• Fortunately, if any such constraint is violated by our current solution, we can find such a violated constraint in polynomial time:

• Suppose the constraint for S is violated by solution x. Consider the graph G, where edge {c,c’} has capacity $x_{c,c'}$. For any pair of vertices (u,v), u ∈ S and v ∈ C-S, the maximum flow from u to v is less than 2 (and conversely).

• Consequently, an S yielding a violated inequality can be found using 2N-1 network flow computations, assuming such an inequality exists.

Page 81: Case Studies:  Bin Packing & The Traveling Salesman Problem

Concorde Branch-and-Cut Optimization [Applegate-Bixby-Chvatal-Cook]: Optimum; 1,000 cities in median time 5 minutes at 2.66 GHz

Lin-Kernighan [Johnson-McGeoch Implementation]: 1.4% off optimal; 10,000,000 cities in 46 minutes at 2.6 GHz

Iterated Lin-Kernighan [J-M Implementation]: 0.4% off optimal; 100,000 cities in 35 minutes at 2.6 GHz

Page 82: Case Studies:  Bin Packing & The Traveling Salesman Problem

Concorde

• “Branch-and-Cut” approach exploiting linear programming to determine lower bounds on optimal tour length.

• Based on 30+ years of theoretical developments in the “Mathematical Programming” community.

• Exploits “chained” (iterated) Lin-Kernighan for its initial upper bounds.

• Eventually finds an optimal tour and proves its optimality (unless it runs out of time/space).

• Also can compute the Held-Karp lower bound for very large instances.

• Executables and source code can be downloaded from http://www.tsp.gatech.edu/

Page 83: Case Studies:  Bin Packing & The Traveling Salesman Problem

Geometric Interpretation

-- Points in $\mathbb{R}^{N(N-1)/2}$ corresponding to a tour.

Hyperplane perpendicular to the vector of edge lengths

Optimal Tour

Page 84: Case Studies:  Bin Packing & The Traveling Salesman Problem

Optimal Tour is a point on the convex hull of all tours.

Unfortunately, the LP relaxation of the TSP can be a very poor approximation to the convex hull of tours.

Facet

To improve it, add more constraints (“cuts”)

Page 85: Case Studies:  Bin Packing & The Traveling Salesman Problem

One Facet Class: Comb Inequalities

[Figure: a comb with handle H and teeth $T_1, T_2, T_3, \ldots, T_{s-1}, T_s$.]

Teeth Ti are disjoint, s is odd, all regions contain at least one city.

Page 86: Case Studies:  Bin Packing & The Traveling Salesman Problem

[Figure: a comb with handle H and teeth $T_1, T_2, T_3, \ldots, T_{s-1}, T_s$.]

• For Y the handle or a tooth, let x(Y) be the total value of the edge variables for edges with one endpoint in Y and one outside, when the function x corresponds to a tour.

• By subtour inequalities, we must have x(Y) ≥ 2 for each such Y. It also must be even, which is exploited to prove the comb inequality:

$$x(H) + \sum_{i=1}^{s} x(T_i) \ge 3s + 1$$

Page 87: Case Studies:  Bin Packing & The Traveling Salesman Problem

Branch & Cut

• Use a heuristic to generate an initial “champion” tour and provide an upper bound U ≥ OPT.

• Let our initial “subproblem” consist of an LP with just the inequalities of the LP formulation (or some subset of them).

• Handle subproblems as follows:

Page 88: Case Studies:  Bin Packing & The Traveling Salesman Problem

Branch & Cut

• Keep adding violated inequalities (of various sorts) that you can find, until

– (a) LP Solution value ≥ U. In this case we prune this case and if no other cases are left, our current tour is optimal.

– (b) Little progress is made in the objective function. In this case, for some edge {c,c’} with a fractional value, split into two subproblems, one with $x_{c,c'}$ fixed at 1 (must be in the tour), and one with it fixed at 0 (must not be in the tour).

• If we ever encounter an LP solution that is a tour and has length L’ < U, set U = L’ and let this new tour be the champion. Prune any subproblems whose LP solution exceeds or equals U. If at any point all your children are pruned, prune yourself.

Page 89: Case Studies:  Bin Packing & The Traveling Salesman Problem

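[Figure: example branch-and-cut search tree. Root: Initial LP, U = 100, LB = 90. Branching on $X_{a,b}$ = 0 and $X_{a,b}$ = 1 gives subproblems with LB = 92 and LB = 93. Further branching on $X_{c,d}$ and $X_{a,c}$ gives subproblems with LB = 92, 100, 98, and 97, and branching on $X_{e,a}$ gives LB = 101 and LB = 100. A tour is found: New Opt = 97, so U = 97.]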

Page 90: Case Studies:  Bin Packing & The Traveling Salesman Problem
Page 91: Case Studies:  Bin Packing & The Traveling Salesman Problem

N = 85,900

Current World Record (2006)

Using a parallelized version of the Concorde code, Helsgaun’s sophisticated variant on Iterated Lin-Kernighan, and 2719.5 cpu-days

The optimal tour is 0.09% shorter than the tour DSJ constructed using Iterated Lin-Kernighan in 1991. In 1986, when computers were much slower, we could only give the Laser Logic people a Nearest-Neighbor tour, which was 23% worse, but they were quite happy with it…

Page 92: Case Studies:  Bin Packing & The Traveling Salesman Problem

Running times (in seconds) for 10,000 Concorde runs on random 1000-city planar Euclidean instances (2.66 Ghz Intel Xeon processor in dual-processor PC, purchased late 2002).

Range: 7.1 seconds to 38.3 hours

Page 93: Case Studies:  Bin Packing & The Traveling Salesman Problem

Concorde Asymptotics [Hoos and Stützle, 2009 draft]

• Estimated median running time for planar Euclidean instances:

$0.21 \cdot 1.24194^{\sqrt{N}}$

• Based on

– 1000 samples each for N = 500, 600, …, 2000

– 100 samples each for N = 2500, 3000, 3500, 4000, 4500

– 2.4 GHz AMD Opteron 2216 processors with 1MB L2 cache and 4 GB main memory, running Cluster Rocks Linux v4.2.1.

Actual median for N = 2000: ~57 minutes, for N = 4,500: ~96 hours
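As a sanity check, the fitted curve can be evaluated at the two sizes quoted. The sketch assumes the fit reads $0.21 \cdot 1.24194^{\sqrt{N}}$ with the result in seconds; both the square-root exponent and the unit are inferred here from the quoted ~57-minute median rather than stated on the slide, and the function name is hypothetical.

```python
import math

def concorde_median_seconds(n):
    """Evaluate the assumed fit 0.21 * 1.24194 ** sqrt(N), taken to be in seconds."""
    return 0.21 * 1.24194 ** math.sqrt(n)

for n in (2000, 4500):
    seconds = concorde_median_seconds(n)
    print(f"N = {n}: {seconds / 60:.1f} minutes ({seconds / 3600:.1f} hours)")
```

Under these assumptions the fit lands close to the observed median at N = 2000 and somewhat above it at N = 4,500.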

Page 94: Case Studies:  Bin Packing & The Traveling Salesman Problem

Theoretical Properties of Random Euclidean Instances

Expected optimal tour length for an N-city instance approaches $C\sqrt{N}$ for some constant C as $N \to \infty$. [Beardwood, Halton, and Hammersley, 1959]

Key Open Question:

What is the Value of C?

Page 95: Case Studies:  Bin Packing & The Traveling Salesman Problem

The Early History

• 1959: BHH estimated C ≈ .75, based on hand solutions for a 202-city and a 400-city instance.

• 1977: Stein estimates C ≈ .765, based on extensive simulations on 100-city instances.

• Methodological Problems:

– Not enough data

– Probably not true optima for the data there is

– Misjudges asymptopia

Page 96: Case Studies:  Bin Packing & The Traveling Salesman Problem

Stein: C = .765

BHH: C = .75

Figure from [Johnson, McGeoch, Rothberg, 1996]

Page 97: Case Studies:  Bin Packing & The Traveling Salesman Problem

What is the dependence on N ?

• Expected distance to nearest neighbor proportional to $1/\sqrt{N}$, times N cities yields $\Theta(\sqrt{N})$

• $O(\sqrt{N})$ cities close to the boundary are missing some neighbors, for an added contribution proportional to $(\sqrt{N})(1/\sqrt{N})$, or $\Theta(1)$

• A constant number of cities are close to two boundaries (at the corners of the square), which may add an additional $\Theta(1/\sqrt{N})$

• This yields target function

$OPT/\sqrt{N} = C + a/\sqrt{N} + b/N$

Page 98: Case Studies:  Bin Packing & The Traveling Salesman Problem

Asymptotic Upper Bound Estimates (Heuristic-Based Results Fitted to $OPT/\sqrt{N} = C + a/\sqrt{N}$)

• 1989: Ong & Huang estimate C ≤ .74, based on runs of 3-Opt.

• 1994: Fiechter estimates C ≤ .73, based on runs of “parallel tabu search”

• 1994: Lee & Choi estimate C ≤ .721, based on runs of “multicanonical annealing”

• Still inaccurate, but converging?

• Needed: A new idea.

Page 99: Case Studies:  Bin Packing & The Traveling Salesman Problem

New Idea (1995): Suppress the variance added by the “Boundary Effect” by using Toroidal Instances

• Join left boundary of the unit square to the right boundary, top to the bottom.

Page 100: Case Studies:  Bin Packing & The Traveling Salesman Problem

Toroidal Unit Ball

Page 101: Case Studies:  Bin Packing & The Traveling Salesman Problem

Toroidal Distance Computations
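With the left and right edges of the unit square identified, and likewise top and bottom, the distance between two points is the shortest straight-line distance after wrapping each coordinate. A minimal sketch, assuming points are (x, y) pairs in the unit square and using a hypothetical function name:

```python
import math

def toroidal_distance(p, q):
    """Euclidean distance on the unit torus between points p and q in [0, 1)^2."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    # In each coordinate, going "around the back" may be shorter.
    dx = min(dx, 1.0 - dx)
    dy = min(dy, 1.0 - dy)
    return math.hypot(dx, dy)
```

Two points near opposite corners, such as (0.05, 0.05) and (0.95, 0.95), are far apart in the planar square but close together on the torus.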

Page 102: Case Studies:  Bin Packing & The Traveling Salesman Problem

Toroidal Instance Advantages

• No boundary effects.

• Same asymptotic constant for $E[OPT/\sqrt{N}]$ as for planar instances [Jaillet, 1992] (although it is still only asymptotic).

• Lower empirical variance for fixed N.

Page 103: Case Studies:  Bin Packing & The Traveling Salesman Problem

Toroidal Approaches

1996: Percus & Martin estimate C ≈ .7120 ± .0002.

1996: Johnson, McGeoch, and Rothberg estimate C ≈ .7124 ± .0002.

2004: Jacobsen, Read, and Saleur estimate C ≈ .7119.

Each coped with the difficulty of computing optima in a different way.

With Concorde and today’s computers, we no longer have to.

Page 104: Case Studies:  Bin Packing & The Traveling Salesman Problem

Optimal Tour Lengths: One Million 100-City Instances

[Figure: histogram of optimal tour lengths; horizontal axis ticks at -1e+07, -5e+06, 0, +5e+06.]

Optimal Tour Lengths Appear to Be Normally Distributed

Page 105: Case Studies:  Bin Packing & The Traveling Salesman Problem

With a standard deviation that appears to be independent of N

Optimal Tour Lengths: One Million 1000-City Instances

[Figure: histogram of optimal tour lengths; horizontal axis ticks at -1e+07, -5e+06, 0, +5e+06.]

Page 106: Case Studies:  Bin Packing & The Traveling Salesman Problem

The New Data

• Solver:

– Latest (2003) version of Concorde with a few bug fixes and adaptations for new metrics

• Primary Random Number Generator:

– RngStream package of Pierre L’Ecuyer, described in “An Object-Oriented Random-Number Package with Many Long Streams and Substreams,” Pierre L’Ecuyer, Richard Simard, E. Jack Chen, W. David Kelton, Operations Research 50:6 (2002), 1073-1075

Page 107: Case Studies:  Bin Packing & The Traveling Salesman Problem

Toroidal Instances

Number of Cities                 Number of Instances   OPT   HK
N = 3, 4, …, 49, 50              1,000,000             X     X
N = 60, 70, 80, 90, 100          1,000,000             X     X
N = 200, 300, …, 1,000           1,000,000             X     X
N = 110, 120, …, 1,900           10,000                X     X
N = 2,000                        100,000               X     X
N = 2,000, 3,000, …, 10,000      1,000,000                   X
N = 100,000                      1,000                       X
N = 1,000,000                    100                         X

Page 108: Case Studies:  Bin Packing & The Traveling Salesman Problem

Euclidean Instances

Number of Cities                    Number of Instances   OPT   HK
N = 3, 4, …, 49, 50                 1,000,000             X     X
N = 60, 70, 80, 90, 100             1,000,000             X     X
N = 110, 120, …, 1,000, 2,000       10,000                X     X
N = 1,100, 1,200 …, 10,000          10,000                      X
N = 20,000, 30,000, …, 100,000      10,000                      X
N = 1,000,000                       1,000                       X

Page 109: Case Studies:  Bin Packing & The Traveling Salesman Problem

Standard Deviations

N = 100 N = 1,000

Page 110: Case Studies:  Bin Packing & The Traveling Salesman Problem

99% Confidence Intervals for $OPT/\sqrt{N}$ for Euclidean and Toroidal Instances

Page 111: Case Studies:  Bin Packing & The Traveling Salesman Problem

99% Confidence Intervals for (OPT-HK)/sqrt(N) for Euclidean and Toroidal Instances

Page 112: Case Studies:  Bin Packing & The Traveling Salesman Problem

Least Squares fit for all data from [30,2000]: OPT/sqrt(N) = C + a/N + b/N^2

C = .712401 ± .000005 versus C = .712333 ± .00006
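Because a model such as C + a/N + b/N^2 is linear in its coefficients, the fit reduces to ordinary linear least squares. The sketch below solves the normal equations in pure Python; it is not the procedure used for the slides, the function name is illustrative, and the data are synthetic values generated from made-up coefficients purely to check that the fit recovers them.

```python
def fit_power_series(ns, ys, exponents=(0, 1, 2)):
    """Least-squares fit of y = sum_k coef[k] / N**exponents[k].

    The model is linear in its coefficients, so the normal equations
    (A^T A) x = A^T y can be solved directly.
    """
    rows = [[n ** (-e) for e in exponents] for n in ns]
    k = len(exponents)
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

# Synthetic, noise-free data from made-up coefficients (not the slide data).
ns = [30, 60, 100, 200, 500, 1000, 2000]
ys = [0.7124 + 0.05 / n - 0.3 / n ** 2 for n in ns]
C, a, b = fit_power_series(ns, ys)
```

Passing `exponents=(0, 0.5, 1)` or `(0, 0.5, 1.5)` fits the alternative functional forms discussed on the following slides with the same routine.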

Page 113: Case Studies:  Bin Packing & The Traveling Salesman Problem

What is the right function?

Range of N    Function              C, with confidence interval
[30,2000]     C + a/N + b/N^2       .712401 ± .000005
[100,2000]    C + a/N + b/N^2       .712403 ± .000010
[100,2000]    C + a/N               .712404 ± .000006

Power Series in 1/N – the Percus-Martin Choice

Justification: Expected distance to the kth nearest neighbor is provably such a power series.

Page 114: Case Studies:  Bin Packing & The Traveling Salesman Problem

What is the right function?

Range of N    Function                            C, with confidence interval
[100,2000]    C + a/N^0.5                         .712296 ± .000015
[100,2000]    C + a/N^0.5 + b/N                   .712403 ± .000030
[100,2000]    C + a/N^0.5 + b/N + c/N^1.5         .712424 ± .000080

OPT/sqrt(N) = Power Series in 1/sqrt(N)

Justification: This is what we saw in the planar Euclidean case (although it was caused by boundaries).

Page 115: Case Studies:  Bin Packing & The Traveling Salesman Problem

What is the right function?

Range of N    Function                                C, with confidence interval
[100,2000]    C + a/N^0.5                             .712296 ± .000015
[100,2000]    C + a/N^0.5 + b/N^1.5                   .712366 ± .000022
[100,2000]    C + a/N^0.5 + b/N^1.5 + c/N^2.5         .712385 ± .000040

OPT = (1/sqrt(N)) · (Power Series in 1/N)

Page 116: Case Studies:  Bin Packing & The Traveling Salesman Problem

What is the right function?

Range of N    Function                                C, with confidence interval
[30,2000]     C + a/N + b/N^2                         .712401 ± .000005
[100,2000]    C + a/N + b/N^2                         .712403 ± .000010
[100,2000]    C + a/N                                 .712404 ± .000006
[100,2000]    C + a/N^0.5                             .712296 ± .000015
[100,2000]    C + a/N^0.5 + b/N                       .712403 ± .000030
[100,2000]    C + a/N^0.5 + b/N + c/N^1.5             .712424 ± .000080
[100,2000]    C + a/N^0.5 + b/N^1.5                   .712366 ± .000022
[100,2000]    C + a/N^0.5 + b/N^1.5 + c/N^2.5         .712385 ± .000040

Page 117: Case Studies:  Bin Packing & The Traveling Salesman Problem

Effect of Data Range on Estimate

Ranges: [30,2000], [60,2000], [100,2000], [200,2000], [100,1000]

[Figure: 95% confidence intervals for C, one panel per functional form:]

C + a/n^0.5
C + a/n^0.5 + b/n
C + a/n^0.5 + b/n + c/n^1.5
C + a/n^0.5 + b/n^1.5
C + a/n^0.5 + b/n^1.5 + c/n^2.5
C + a/n + b/n^2 + c/n^3

Page 118: Case Studies:  Bin Packing & The Traveling Salesman Problem

The Winners?

C + a/n + b/n^2 + c/n^3

C = .71240 ± .00002

C = .71240 ± .000005

Page 119: Case Studies:  Bin Packing & The Traveling Salesman Problem

Question: Does the HK-based approach agree?

Page 120: Case Studies:  Bin Packing & The Traveling Salesman Problem

C_HK = .707980 ± .000003

95% confidence interval derived using C + a/N + b/N^2 functional form

Page 121: Case Studies:  Bin Packing & The Traveling Salesman Problem

C - C_HK = .004419 ± .000002

95% confidence interval derived using C + a/N + b/N^2 functional form

Page 122: Case Studies:  Bin Packing & The Traveling Salesman Problem

HK-Based Estimate

C - C_HK = .004419 ± .000002

+ C_HK = .707980 ± .000003

C = .712399 ± .000005

Versus (Conservative) Opt-Based Estimate

C = .712400 ± .000020

Combined Estimate?

C = .71240 ± .00001
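The HK-based estimate above is the sum of the two fitted quantities, with the half-widths of the intervals added. A small sketch of that arithmetic, using the values from the slides (the helper name is illustrative, and adding half-widths linearly is the conservative reading of how the ± .000005 was obtained):

```python
from decimal import Decimal

def add_estimates(x, y):
    """Add two (value, half_width) estimates, summing the half-widths."""
    return x[0] + y[0], x[1] + y[1]

gap = (Decimal("0.004419"), Decimal("0.000002"))   # C - C_HK from the slides
c_hk = (Decimal("0.707980"), Decimal("0.000003"))  # C_HK from the slides
c, err = add_estimates(gap, c_hk)

# Check that the HK-based interval lies inside the opt-based interval.
opt = (Decimal("0.712400"), Decimal("0.000020"))
consistent = (opt[0] - opt[1] <= c - err) and (c + err <= opt[0] + opt[1])
```

Here `c` is 0.712399 and `err` is 0.000005, and the resulting interval falls entirely within the conservative opt-based interval, so the two approaches agree.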

Page 123: Case Studies:  Bin Packing & The Traveling Salesman Problem

OPEN PROBLEM: What function truly describes the data?

Our data suggest OPT/sqrt(N) ≈ .71240 + a/N - b/N^2 + O(1/N^3),

a = .049 ± .004, b = .3 ± .2

(from fits for ranges [60,2000] and [100,2000])

But what about the range [3,30]?
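To get a feel for the suggested form, the sketch below evaluates it at the central values of a and b quoted above and drops the O(1/N^3) term. The function name is illustrative, and the output is only the fitted curve, not measured data.

```python
def suggested_fit(n, c=0.71240, a=0.049, b=0.3):
    """Central estimate of OPT/sqrt(N) from the slide, ignoring O(1/N^3)."""
    return c + a / n - b / n ** 2

for n in (3, 10, 30, 100, 1000):
    print(n, round(suggested_fit(n), 5))
```

For small N the -b/N^2 term dominates the a/N term, so the curve drops below C, which is one reason to ask whether the form can hold on [3,30].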

Page 124: Case Studies:  Bin Packing & The Traveling Salesman Problem

(95% confidence intervals on data) – f(N), 3 ≤ N ≤ 30

Page 125: Case Studies:  Bin Packing & The Traveling Salesman Problem

Fit of a + b/N + c/N^2 + d/N^3 + e/N^4 for [3,30]

95% Confidence Intervals

To date, no good fit of any sort has been found.

Page 126: Case Studies:  Bin Packing & The Traveling Salesman Problem

Problem

• Combinatorial factors for small N may make them unfittable:

– Only one possible tour for N = 3 (expected length of optimal tour can be given in closed form)

– Only 3, 12, 60, 360, … possible tours for N = 4, 5, 6, 7, …, so statistical mechanics phenomena may not yet have taken hold.

• So let’s throw out data for N < 12
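The tour counts follow from the standard formula (N-1)!/2 for the number of distinct tours on N cities with symmetric distances: fix the starting city and identify each tour with its reversal. A quick check, with an illustrative function name:

```python
from math import factorial

def num_tours(n):
    """Number of distinct tours on n >= 3 cities with symmetric distances."""
    return factorial(n - 1) // 2

print([num_tours(n) for n in range(3, 8)])  # [1, 3, 12, 60, 360]
```

At the proposed cutoff, `num_tours(12)` is already 19,958,400.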

Page 127: Case Studies:  Bin Packing & The Traveling Salesman Problem

Fit of a + b/N + c/N^2 + d/N^3 + e/N^4 for [12,2000]

Still Questionable…

Page 128: Case Studies:  Bin Packing & The Traveling Salesman Problem

99.7% confidence intervals on OPT/sqrt(n), 10 ≤ n ≤ 30.

Unexplained Phenomenon: Rise and then Fall

Peaks at N = 17

Page 129: Case Studies:  Bin Packing & The Traveling Salesman Problem

Stop!

Page 130: Case Studies:  Bin Packing & The Traveling Salesman Problem

Applications

• Cities (3 dimensions):

– Orientations of a crystal when doing X-ray crystallography – up to 3000 cities

Page 131: Case Studies:  Bin Packing & The Traveling Salesman Problem

Gnuplot Least Squares fit for the Percus-Martin values of N: OPT/sqrt(N) = (C + a/N + b/N^2)/(1 + 1/(8N))

C = .712234 ± .00017 versus originally claimed C = .7120 ± .0002

Page 132: Case Studies:  Bin Packing & The Traveling Salesman Problem

Least Squares fit for all data from [12,100]: OPT/sqrt(N) = C + a/N + b/N^2

C = .712333 ± .00006 versus C = .712234 ± .00017