A Fuzzy Hopfield-Tank Traveling Salesman Problem Model

WILLIAM J. WOLFE, University of Colorado at Denver, Department of Computer Science and Engineering, Campus Box 109, P.O. Box 173364, Denver, CO 80217-3364, Email: [email protected]

(Received: March 1999; revised: June 1999; accepted: August 1999)

This article describes a Hopfield-Tank model of the Euclidean traveling salesman problem (using Aiyer’s subspace approach) that incorporates a “fuzzy” interpretation of the rows of the activation array. This fuzzy approach is used to explain the network’s behavior. The fuzzy interpretation consists of computing the center of mass of the positive activations in each row. This produces real numbers along the time line, one for each city, and defines a tour in a natural way (referred to as a “fuzzy tour”). A fuzzy tour is computed at each iteration, and examination of such tours exposes fine features of the network dynamics. For example, these tours demonstrate that the network goes through (at least) three phases: “centroid,” “monotonic,” and “nearest-city.” The centroid (initial) phase of the network produces an approximation to the centroid tour (i.e., the tour that orders the cities by central angle with respect to the centroid of the cities). The second phase produces an approximation to the monotonic tour (i.e., the tour that orders the cities by their projection onto the principal axis of the cities). The third phase produces an approximation to a nearest-city tour, that is, a tour whose length is between the tour lengths of the nearest-city-best (best starting city) and nearest-city-worst (worst starting city). Simulations are presented for up to 125 uniformly randomly generated cities.

Subject classifications: Combinatorial optimization.
Other key words: Hopfield-Tank, traveling salesman problem.

The Hopfield-Tank model [2, 13, 16, 17] of the Euclidean traveling salesman problem (TSP) [7, 8, 14] has attracted many researchers because it demonstrates how a highly interconnected network of simple processing units can solve a complex problem. The method is also interesting because it exemplifies a key property of parallel-distributed systems: the emergence of global characteristics from local interactions. [9, 10] It is often difficult to understand how local interactions can lead to a particular global effect, yet the analysis of such effects is crucial to improved understanding of recently developed models of biological, ecological, and economic systems. We take the TSP as a representative problem and show how a fuzzy approach exposes the network’s organizational properties.

We use Aiyer’s subspace method [1] to construct a particular Hopfield-Tank TSP model. Briefly put, the subspace method is based on Hopfield’s [2] original approach but with a more carefully chosen Energy Function. This is a direct consequence of analyzing the orthogonal projection onto the feasible (or valid) space. The result is a network that keeps the activation vector close to the feasible space, which in turn facilitates the interpretation of a final convergent state (i.e., the final state of the network corresponds closely to a permutation matrix). We show that for up to 100 uniformly randomly generated cities, the network produces tours of nearest-city quality (by “nearest-city quality” we mean that the tour length is between the tour lengths of the nearest-city-best (best starting city) and the nearest-city-worst (worst starting city) tours). The focus of this article, however, is to provide an explanation for the network’s behavior, that is, to better understand the underlying mechanisms by which the network chooses a tour.

To provide a finer analysis of the network’s behavior, we introduce a fuzzy interpretation of each row of the activation array. This consists of calculating the center of mass (cm) of the positive activations in each row. These cms are real numbers, and therefore not restricted to the integers that represent the columns of the array. Thus, each row, and therefore each city, is associated with a real number on the time line, and the corresponding sequence is interpreted as the current tour (i.e., the “fuzzy tour”). Such a tour can be computed at any iteration. At the beginning of a run, as might be expected, the fuzzy tours are random, but as the network progresses, various patterns emerge, patterns that correlate with city statistics, such as the centroid and principal components. It is shown that the network goes through three phases:

1. Centroid Phase. The fuzzy tours are approximations to the centroid tour, that is, the tour that orders the cities by central angle with respect to the center of mass of the cities.

2. Monotonic Phase. The fuzzy tours are approximations to the monotonic tour, that is, the tour that orders the cities by their projections onto the principal axis of the cities.

3. Nearest-City Phase. The fuzzy tours are approximations to a nearest-city tour, that is, a tour that starts from an arbitrary city and then chooses nearest neighbors until a tour is constructed.

In simulations the network is initialized with very small random positive activations (less than 10⁻³⁰). Initially, the fuzzy tours are themselves random but they quickly become approximations to the centroid tour. The emergence of the centroid tour is surprising, but it was first suggested by Aiyer. [1] We do not provide a proof that centroid tours must emerge, but it is evident in almost all simulations, with rare exceptions. However, a proof is provided that for small activations the network produces sinusoids of frequency 1 in each row of the activation array, and that the peaks of the sinusoids cluster around 2 columns of the network (half way around the tour from each other). Empirical evidence is then presented that the peaks of the sinusoids organize themselves so that the corresponding fuzzy tour is a close approximation to the centroid tour.

The centroid configuration is just the initial phase, a manifestation of the linear behavior of the network (dominant when the activations are small). The corner-seeking (i.e., nonlinear) tendency of the network ultimately forces the activations to move toward a nearest-city tour. In-between, there is a phase where the fuzzy tours correlate strongly with the monotonic tour. The relationship between the monotonic and nearest-city phases is critical to understanding the network’s overall performance. Unfortunately, it is very difficult to see the correlation between these phases, but we provide strong empirical evidence that the principal axis of the cities plays an important role in tying these two phases together.

1. Network Architecture

Following Hopfield and Tank, [2] an n-city TSP problem is represented by an n × n grid of neurons (cities at possible time stops). A row of the grid represents a city at all time stops; a column of the grid represents a time stop at all cities. The neighboring columns of the grid are connected (symmetrically) with a connection strength of −d(x, y), where d(x, y) is the distance between cities x and y. That is, neuron (i, x) is connected to neurons (i ± 1, y), with wrap around, with connection strength −d(x, y).

Each neuron has an activation (a_ix), and we refer to the n × n array of activations as the activation array and to the n²-long vector obtained from the array (by concatenating the rows) as the activation vector (a). The index “i” indicates the column of the array (i.e., the “stop”), and the index “x” indicates the row of the array (i.e., the “city”). The net input to a neuron is the sum of activations times the connection strengths (there are no external inputs):

    net_ix = Σ_y Σ_k w_{ix,ky} a_ky    (1)

1.1 The Subspace Approach

So far, the network architecture is too simple to be effective. Following Aiyer, [1] we analyze the orthogonal projection into the feasible space. The Hopfield-Tank model defines the feasible space as the affine space containing the n × n matrices whose rows and columns sum to 1 (i.e., doubly stochastic matrices). In our model the feasible space (Σ0) is the space spanned by n × n matrices whose rows and columns sum to zero (Aiyer refers to this as the “zero-sum” space). The Hopfield-Tank model uses an external input to shift (i.e., parallel translation) Σ0 to the affine space containing the doubly stochastic matrices; otherwise the neural architectures are identical.

The orthogonal projection consists of the following calculation:

    a_ix^proj = a_ix + act^avg − row_x^avg − col_i^avg.    (2)

The “act^avg” is the average of all the activations at that iteration; row_x^avg is the average of the activations in the xth row; col_i^avg is the average of the activations in the ith column. After projection, the array (a_ix^proj) has the property that the sum of any row or column is zero and is therefore in the feasible space. [6]
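As a concrete illustration, the following is a minimal sketch of Equation (2) in Python/NumPy (the function name and array layout are ours, not the paper’s):

    import numpy as np

    def project_zero_sum(A):
        """Orthogonal projection of the activation array onto the zero-sum
        (feasible) space, as in Equation (2): add the overall average and
        subtract each row and column average."""
        act_avg = A.mean()                         # average of all activations
        row_avg = A.mean(axis=1, keepdims=True)    # one average per row (city)
        col_avg = A.mean(axis=0, keepdims=True)    # one average per column (stop)
        return A + act_avg - row_avg - col_avg

After this call every row and column of the returned array sums to (numerically) zero, and applying the projection twice gives the same result as applying it once.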

The effectiveness of the projection is a direct consequence of the analysis provided by Aiyer [1] (see also Gee et al. [3] and Aiyer et al. [4]). It helps to realize that ℝ^(n²) decomposes into the following mutually orthogonal subspaces:

    ℝ^(n²) = Σ0 ⊕ Σc ⊕ Σr ⊕ Σ    (3)

where Σ is the one-dimensional space spanned by a constant vector (e.g., a_ij = 1); Σc is spanned by harmonics in the x-direction (e.g., sinusoids parallel to the rows of the neural grid); Σr is spanned by harmonics in the y-direction (e.g., sinusoids parallel to the columns of the neural grid); and Σ0 (the feasible space) is spanned by the two-dimensional harmonics that are orthogonal to Σ, Σc, and Σr. With these definitions the feasible space becomes a linear subspace of ℝ^(n²) (further details are provided in Aiyer, [1] Wolfe et al., [5] and Wolfe and Ulmer [6]).

The projection is introduced here, separate from the rest of the network architecture, for reasons of clarity, but the projection calculation can be easily embedded in the neural architecture introduced above. To do this we add to the network connections: add 1/n² to all connections; add −1/n to all connections between neurons in a common row; add −1/n to all connections between neurons in a common column. Notice that this results in self-connections of −(2n − 1)/n². These additional connections, in effect, allow the activations to be incrementally projected as part of the neural update sequence.

As pointed out by Aiyer, [1] in this model we do not have variables equivalent to the “A, B, C, D” as in the original Hopfield-Tank model. [2] That is, the subspace approach specifies these parameters unambiguously.

Putting these definitions together we get the following network architecture:

    w_{ix,jy} =
      1/n² − 1/n       if y = x, j ≠ i                      (row)
      1/n² − 1/n       if y ≠ x, j = i                      (column)
      1/n² − 2/n       if y = x, j = i                      (self)
      1/n² − d(x, y)   if y ≠ x, j = i + 1 or j = i − 1     (distance)
      1/n²             if j ≠ i − 1, i, i + 1, and y ≠ x    (global)
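The five cases can be assembled into an explicit weight matrix. Here is a hedged sketch in Python/NumPy; the city-major flattening of neuron (i, x) to index x·n + i matches the activation-vector convention above, and all names are ours:

    import numpy as np

    def build_weights(dist):
        """Build the n^2 x n^2 weight matrix from the five cases above.
        dist[x, y] is the distance between cities x and y."""
        n = dist.shape[0]
        W = np.full((n * n, n * n), 1.0 / n**2)          # "global" term everywhere
        for x in range(n):
            for y in range(n):
                for i in range(n):
                    for j in range(n):
                        p, q = x * n + i, y * n + j      # flat (city-major) indices
                        if y == x and j == i:
                            W[p, q] = 1.0 / n**2 - 2.0 / n        # self
                        elif y == x:
                            W[p, q] = 1.0 / n**2 - 1.0 / n        # row
                        elif j == i:
                            W[p, q] = 1.0 / n**2 - 1.0 / n        # column
                        elif j == (i + 1) % n or j == (i - 1) % n:
                            W[p, q] = 1.0 / n**2 - dist[x, y]     # distance (wrap around)
        return W

The modular arithmetic implements the wrap around of the time line; everything that falls through the chain keeps the global value 1/n².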

The energy function is given by:

    E = −(1/2) Σ_i Σ_x Σ_j Σ_y a_ix a_jy w_{ix,jy}
      = −(1/2) [ Σ_i Σ_x Σ_y (−d(x, y)) a_ix (a_{i+1,y} + a_{i−1,y})
               + Σ_i Σ_x Σ_j (−1/n) a_ix a_jx + Σ_i Σ_x Σ_y (−1/n) a_ix a_iy
               + Σ_i Σ_x Σ_j Σ_y (1/n²) a_ix a_jy ]


We have found that substituting d(x, y)/n for d(x, y) in the weights puts more emphasis on seeking the feasible space in the early phase of the network dynamics and leaves the “distance effect” for later (similar to setting D = 1/n and A = B = C = 1 in the original Hopfield-Tank model). Let:

    M = maximum activation = +1;
    m = minimum activation = −1/(n − 1).

Call (a_ix) a feasible state if there is exactly one maximum activation in each row and column, and the rest of the entries are equal to the minimum activation. Feasible states correspond to tours in the usual way. Also notice that with M = +1 and m = −1/(n − 1) the sum of any row or column of a feasible state is 0 (i.e., (a_ix) ∈ Σ0). (See the Appendix for a proof that the feasible states have lower energy when they correspond to shorter tours.)

1.2 Network Dynamics

Assume that we have n neurons, indexed by i = 1, . . . , n, with activations a_i such that m ≤ a_i ≤ M, connection strengths w_ij, and no external inputs. We define the Hopfield-Tank dynamics by the following equations: [2]

    du_i/dt = −λu_i + net_i

where net_i = Σ_j w_ij a_j; a_j = g(u_j); and g(x) = (M − m)/(1 + exp(−x)) + m.

It is easy to show that when λ = 0 these equations can be approximated by the following (see the Appendix for the proof):

    da_i/dt = (a_i − m)(M − a_i) net_i.

Setting λ = 0 is usually a good idea because it represents a decay factor that impedes the convergence of the network to the boundary of the hypercube (see Takefuji [15] for further justification).
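Putting the pieces together, a minimal simulation loop might look as follows. This is a sketch under our own assumptions: the step size, iteration count, and initialization scale are arbitrary choices, and we apply the zero-sum projection explicitly each step instead of embedding it in the weights:

    import numpy as np

    def run_network(dist, iters=1000, step=0.01, seed=0):
        """Iterate da/dt = (a - m)(M - a)net on the activation array.
        Only the distance term is kept in net; the row/column/global terms
        are replaced by an explicit zero-sum projection each iteration."""
        n = dist.shape[0]
        M, m = 1.0, -1.0 / (n - 1)
        rng = np.random.default_rng(seed)
        A = rng.uniform(0.0, 1e-30, size=(n, n))       # tiny random initial state
        for _ in range(iters):
            # distance-weighted input from the two neighboring columns (wrap around)
            net = -dist @ (np.roll(A, 1, axis=1) + np.roll(A, -1, axis=1))
            A = A + step * (A - m) * (M - A) * net
            # incremental projection toward the feasible (zero-sum) space
            A = A + A.mean() - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True)
            A = np.clip(A, m, M)                       # stay inside the hypercube
        return A

A fuzzy tour can then be read out of the returned array (or of intermediate snapshots) with the cm computation of Section 4.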

2. City Statistics

Treating the coordinates of the cities as data points in a two-dimensional feature space, the following statistics are defined: centroid, correlation matrix, principal components, and effective partition. The centroid, correlation matrix, and principal components are standard [12] but the concept of an effective partition is introduced here because it helps explain the network’s behavior.

Let S = {(a_i, b_i) | 0 ≤ a_i, b_i ≤ 1, i = 1, . . . , n} be the set of city coordinates. Let (a_avg, b_avg) be the centroid of the cities:

    a_avg = (Σ_i a_i)/n  and  b_avg = (Σ_i b_i)/n.

The correlation matrix is given by:

    C = (c_ij),  i, j = 1, 2;

where c_11 = (1/n) Σ_i (a_i − a_avg)²; c_12 = c_21 = (1/n) Σ_i (a_i − a_avg)(b_i − b_avg); and c_22 = (1/n) Σ_i (b_i − b_avg)².

Let v_1, v_2 and λ_1, λ_2 be the solution to the eigenvalue problem Cv = λv. Since C is a correlation matrix the eigenvalues are real and nonnegative, and the eigenvectors are orthogonal. Assume λ_1 ≥ λ_2. The eigenvalues, λ_1 and λ_2, measure the variance of the points when projected onto v_1 and v_2, respectively. The direction of v_1 corresponds to the direction with maximum variance. The “principal axis” of the cities is defined by placing v_1 at the centroid.
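These statistics are a few lines of NumPy. A sketch (names ours):

    import numpy as np

    def city_statistics(cities):
        """cities: (n, 2) array of (a_i, b_i) coordinates. Returns the
        centroid, the 2 x 2 correlation matrix C, and the axes v1, v2."""
        centroid = cities.mean(axis=0)
        diffs = cities - centroid
        C = diffs.T @ diffs / len(cities)       # c11, c12 = c21, c22 as above
        evals, evecs = np.linalg.eigh(C)        # eigenvalues in ascending order
        v1 = evecs[:, -1]                       # direction of maximum variance
        v2 = evecs[:, 0]                        # perpendicular (bisector) direction
        return centroid, C, v1, v2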

Let {G1, G2} be a partition of S: G1 ∪ G2 = S, G1 ∩ G2 = ∅. Call it an “effective” partition if it has the property that the sum of the distances from a city in G1 to all other cities in G1 is less than the sum of its distances to the cities in G2, and similarly for a city in G2. That is, {G1, G2} is an effective partition if, for all cities k:

    Σ_{j∈G1} d(j, k) < Σ_{j∈G2} d(j, k)  if k ∈ G1,
    Σ_{j∈G1} d(j, k) > Σ_{j∈G2} d(j, k)  if k ∈ G2.

Such partitions are relatively easy to find. For example, if the cities are uniformly spaced on a circle, any diameter that does not intersect a city defines an effective partition. In general, an effective partition can be created by seeking a local minimum to a specific objective function (see the Appendix for a proof).

A simple 10-city example is provided in Figure 1. For a large number of cities uniformly generated within a unit square, the principal axis will (approximately) lie along one of the diagonals of the square, and the bisector of the principal axis (i.e., v_2) will (approximately) separate the cities into an effective partition.

3. TSP Heuristics: Centroid, Monotonic, Nearest-City, 2-Opt

Using the notation from the previous section, we can now define, precisely, the TSP heuristics that we will use in evaluating the neural network. The centroid and monotonic tours are nonstandard, so we define them here. The nearest-city tour is a standard approach to the TSP, [7, 14] but our use of the terms “nearest-city-best” and “nearest-city-worst” is not common, so we also define these. The 2-Opt is another standard approach to the TSP, [7, 14] and we include its definition for the sake of completion.

Figure 1. A 10-city example (cities: A, B, C, D, E, F, G, H, I, J) of the centroid, principal axis, and an effective partition (cities are labeled “G1” or “G2”). The v_2 axis separates the cities into an effective partition in this case.

3.1 Centroid Tour

For each city i (location (a_i, b_i)), let θ_i = arctan((b_i − b_avg)/(a_i − a_avg)). Sort the cities into the sequence (i_1, i_2, . . . , i_n) by θ_{i_k} ≤ θ_{i_{k+1}}.

3.2 Monotonic Tour

Let v_1 = (v_a, v_b) be the principal axis of the cities. For each city i, let p_i = (a_i − a_avg)v_a + (b_i − b_avg)v_b. Sort the cities into the sequence (i_1, i_2, . . . , i_n) by p_{i_k} ≤ p_{i_{k+1}}.
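Both tours reduce to a sort. A sketch (we use arctan2 in place of arctan so that the central angle lands in the correct quadrant):

    import numpy as np

    def centroid_tour(cities):
        """Order the cities by central angle about the centroid (Section 3.1)."""
        d = cities - cities.mean(axis=0)
        theta = np.arctan2(d[:, 1], d[:, 0])
        return np.argsort(theta)                # city indices in tour order

    def monotonic_tour(cities, v1):
        """Order the cities by projection onto the principal axis v1 (Section 3.2)."""
        p = (cities - cities.mean(axis=0)) @ v1
        return np.argsort(p)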

3.3 Nearest-City Tour

For each possible starting city, construct a tour by choosing the nearest unvisited city for the next city until all the cities are exhausted, and return to the starting city. The length of the shortest of these tours is defined to be the “nearest-city-best” result, and the longest of these tours is defined to be the “nearest-city-worst” result.

3.4 2-Opt

Begin with a random tour. Consider all possible pairs of edges in the tour. For each pair, delete the edges from the tour and add 2 new edges to reconnect the cities into a tour. After checking each edge pair, pick the one that improves the tour the most, and iterate. If none of them improves the tour, stop.
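For completeness, here are sketches of the two standard heuristics as described above (dist is the full distance matrix; names ours):

    import numpy as np

    def tour_length(tour, dist):
        n = len(tour)
        return sum(dist[tour[k], tour[(k + 1) % n]] for k in range(n))

    def nearest_city_tour(dist, start):
        """Greedy tour: repeatedly visit the nearest unvisited city (Section 3.3)."""
        unvisited = set(range(len(dist))) - {start}
        tour = [start]
        while unvisited:
            nxt = min(unvisited, key=lambda c: dist[tour[-1], c])
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour

    def nearest_city_best_worst(dist):
        """Lengths of the best- and worst-starting-city greedy tours."""
        lengths = [tour_length(nearest_city_tour(dist, s), dist)
                   for s in range(len(dist))]
        return min(lengths), max(lengths)

    def two_opt(tour, dist):
        """Best-improvement 2-opt (Section 3.4): swap the edge pair that
        shortens the tour the most; stop when no swap helps."""
        tour, n = list(tour), len(tour)
        while True:
            best_delta, best = 0.0, None
            for i in range(n - 1):
                for j in range(i + 2, n):
                    if i == 0 and j == n - 1:
                        continue                  # these edges share a city
                    a, b, c, d = tour[i], tour[i + 1], tour[j], tour[(j + 1) % n]
                    delta = dist[a, c] + dist[b, d] - dist[a, b] - dist[c, d]
                    if delta < best_delta:
                        best_delta, best = delta, (i, j)
            if best is None:
                return tour
            i, j = best
            tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])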

4. Fuzzy Read Out

Assume that (a_ix) represents the activations, with “i” representing the column and “x” the row of the array. We introduce a “fuzzy” way to read each row of the activation array. We compute the center of mass of the positive activations and treat that as an estimate of where the city represented by that row should be placed along the time line.

Let:

    ⌈a_ix⌉ = a_ix  if a_ix > 0;
    ⌈a_ix⌉ = 0     if a_ix ≤ 0.

For each row (x) of the activation array let:

    cm_x = (Σ_i i⌈a_ix⌉) / (Σ_i ⌈a_ix⌉).

cm_x is undefined when there is no positive activation in the row. Also, to account for “wrap around” on the time line, the cm is computed relative to the position of the peak activation in the row. If i_peak is the column with the largest activation, then the sums are performed from i = i_peak − (n/2 − 1) to i = i_peak + (n/2 − 1) (with a_ix = a_{(i mod n)x} and cm_x = [cm_x mod n]). The results are not significantly affected by summing from i_peak − b to i_peak + b, where b is a small fraction of n. We used this method of computing the cms in all the simulations presented in this article, but it is worth pointing out that a more natural computation (i.e., one that exploits the cyclic, translation invariance, of the system) may be used. That is, similar results can be obtained by defining the cm to be the phase of the first Fourier coefficient:

    cm_x = arctan( (Σ_i a_ix sin(2πi/n)) / (Σ_i a_ix cos(2πi/n)) ).

An important feature of either of these methods is that the cms are real numbers, and therefore not restricted to the integers that represent the columns of the neural array (producing what might be called “sub-pixel” accuracy). The cms are computed for each row and sorted along the time line. Since each cm corresponds to a row, and each row to a city, a tour is easily deduced from the sequence of cms.

    If: cm_{x1} < cm_{x2} < . . . < cm_{xn}

    Fuzzy Tour: x_1 x_2 . . . x_n.
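A sketch of the read out follows: the peak-centered window method used in the simulations, plus the Fourier-phase variant. The use of arctan2 and the rescaling of the phase from radians to column units are our additions (the text gives only the arctan form):

    import numpy as np

    def fuzzy_cms(A):
        """Center of mass of the positive activations in each row, summed over
        a window of width n - 1 centered on the row's peak (wrap around)."""
        n = A.shape[1]
        cms = np.empty(A.shape[0])
        offsets = np.arange(-(n // 2 - 1), n // 2)    # i_peak-(n/2-1) .. i_peak+(n/2-1)
        for x in range(A.shape[0]):
            pos = np.maximum(A[x], 0.0)
            if pos.sum() == 0.0:
                cms[x] = np.nan                       # undefined: no positive entries
                continue
            i_peak = int(np.argmax(A[x]))
            w = pos[(i_peak + offsets) % n]
            cms[x] = (((i_peak + offsets) * w).sum() / w.sum()) % n
        return cms

    def fuzzy_cms_fourier(A):
        """Alternative cm: phase of the first Fourier coefficient of each row,
        rescaled from radians to column units."""
        n = A.shape[1]
        ang = 2.0 * np.pi * np.arange(n) / n
        phase = np.arctan2(A @ np.sin(ang), A @ np.cos(ang))
        return (phase * n / (2.0 * np.pi)) % n

    def fuzzy_tour(A):
        """City order given by sorting the cms along the time line."""
        return np.argsort(fuzzy_cms(A))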

This definition is ambiguous when two or more cms are equal. Equal cms represent the boundary between regions of hyperspace representing different fuzzy tours. In simulations, these boundaries are often crossed, but convergence to states on such boundaries is extremely rare, and usually corresponds to having collocated cities.

The fuzzy approach gives meaning to “interior” states of the hypercube. A tour is now represented by a relatively large region of hyperspace containing, for example, the line from the origin (but not including the origin) to the tour’s hypercube corner (i.e., permutation matrix). For a given tour τ, let R_τ be the region of hyperspace that unambiguously represents τ. R_τ contains hypercube corners corresponding to permutation matrix representations of τ, including the redundant corners associated with rotated tours (same tour, but a different starting city) and reverse tours (same tour, but traversed in the opposite direction). R_τ splits into disjoint subsets:

    R_τ = R_τ¹ ∪ R_τ²

where R_τ¹ represents τ with clockwise orientation, and R_τ² represents τ with counter-clockwise orientation.

It is easy to see that R_τ¹ and R_τ² are disjoint (R_τ¹ ∩ R_τ² = ∅) and each is a connected set. That is, there is no hyperspace point that is in both R_τ¹ and R_τ². This is clear since the order of the cms would have to be completely reversed to move from R_τ¹ to R_τ². The only points in hyperspace that might represent a tour in both R_τ¹ and R_τ² are the points corresponding to all equal cms. These points are ambiguous and, by definition, in neither R_τ¹ nor R_τ². (Points where all the cms are equal are called maximally ambiguous since a small perturbation can cause the point to represent any possible tour.) By a continuous deformation of the activations, however, any point in R_τ¹ can be moved into any other point in R_τ¹, without leaving R_τ¹, and the same is true of R_τ².

Network dynamics can be viewed as moving the activation vector through these regions toward the hypercube boundary. Along the way it “sees” many different tours, and the best of these can be saved as the network output.


Figure 2 shows a 20-city example. The fuzzy tour is computed (at the 500th iteration in this case) and drawn at the bottom of the figure. Notice that when activations for two rows compete for the same column (e.g., cities B and R), the cities are located close to each other, yet the cms unambiguously define a tour (see also cities D and S).

Figure 2. Fuzzy read out for a 20-city example. The squares represent the level of neural activation. The sequence of cms produces the tour at the bottom. Notice that column competition between cities such as B and R is effectively resolved by the fuzzy read out.

The main purpose of the cm calculation is to expose the stages of network behavior. We do not see how to add this computation to the neural architecture without corrupting the symmetric and homogeneous nature of the neurons. However, more neurons could be added to each row to (asymmetrically) accumulate the sums necessary for calculating the cms. This would retain the massively parallel nature of the network if implementation were the goal, but, again, the cms are used here primarily for diagnosis and not necessarily included in the neural architecture.

5. Phase I Dynamics: Centroid Tours

Analysis of a linear approximation to the network dynamics provides vivid insight into the network behavior when the activations are small. Since the activation vector is initialized in the interior of the hypercube (near the origin), and the incremental activation updates are small, the following linear equation approximates the network dynamics in the first stage of updates:

    da/dt ≈ Wa

where W is the n² × n² connection strength matrix.

Based on the properties of W, it will be shown that sinusoids of frequency 1, but arbitrary phase, must emerge in each row of the n × n “activation array” during the first phase of network dynamics. First, it is shown that the connection strength matrix W (n² × n²) is composed of symmetric circulant blocks A_xy (n × n), one for each pair of cities x, y. This implies that the eigenvalues of A_xy are given by the discrete Fourier transform (DFT) of the first row of A_xy, and that the corresponding eigenvectors (n × 1) are sinusoids of all frequencies (see Davis [11] for the properties of circulant matrices).

Let a be the “activation vector.” That is, a is an n²-long column vector whose first n components are the activations of the first row of the array of neurons, and so on. Therefore a can be written:

    a = [ v_1 ]
        [ v_2 ]
        [  .  ]
        [ v_n ]

where v_j is an n × 1 column vector containing the activations of the jth row of neurons. To save space we also use the notation: a = ⟨v_1, v_2, . . . , v_n⟩.

Let W be the connection strength matrix. W can be written as:

    W = [ A_11  A_21  . . .  A_n1 ]
        [ A_12  A_22  . . .  A_n2 ]
        [  .     .            .   ]
        [ A_1n  A_2n  . . .  A_nn ]

where A_xy is the n × n matrix that contains the connection strengths between the xth and yth rows of neurons. It is easy to see that A_xy = A_yx, and each A_xy is a symmetric circulant matrix. [11]

Now we invoke well-known theorems concerning symmetric circulant matrices (see Davis [11] and Wolfe et al. [5]). In particular, we know that sinusoids of any frequency are eigenvectors of A_xy and that the (real) eigenvalues are obtained by taking the DFT of the first row of A_xy.

Theorem 1. The eigenvalues of A_xy are given by:

when x ≠ y:

    λ_0 = (−2)d(x, y),
    λ_p = −1/n − 2d(x, y)cos(2πp/n),  p = 1, . . . , n − 1;

when x = y:

    λ_0 = −1,
    λ_p = −1/n,  p = 1, . . . , n − 1.

Proof. When x ≠ y, the first row of A_xy is:

    1/n² − 1/n, 1/n² − d(x, y), 1/n², . . . , 1/n², 1/n² − d(x, y).

When x = y, the first row of A_xy is:

    1/n² − 2/n, 1/n² − 1/n, . . . , 1/n² − 1/n.

The corresponding DFTs are given by the coefficients in the Theorem. ∎

Thus, except for the “DC” term (p = 0), the eigenvalue with the largest magnitude corresponds to the first harmonic (p = 1). The eigenvalue for the first harmonic is negative, and its magnitude is larger when d(x, y) is larger. In particular, notice that for x ≠ y, λ_1 ≈ −2d(x, y) for large n. So, without loss of generality, we will make the identification λ_1 = −d(x, y) for the rest of the analysis.

Theorem 2. Sinusoids of frequency 1 will emerge in each row of the activation array as the network updates the activations.

Proof. Consider the coupling that occurs between the various A_xy’s as W acts on a:

    (Wa)_k = Σ_j A_jk v_j.

Now let u_j be the first harmonic in the Fourier decomposition of v_j:

    u_j = (sin(2πr/n + φ_j))  (r = 0, . . . , n − 1).

By Theorem 1, the largest magnitude effect that A_jk has on v_j is on the first harmonic:

    A_jk v_j ≈ −d(j, k)u_j.

Therefore (Wa)_k will be a superposition of sinusoids of frequency 1 but with differing amplitudes and phases, which will produce another sinusoid of frequency 1. Thus, at each neural iteration the network dynamics will add a relatively large sinusoid of frequency 1 to each row of the activation array. ∎

Notice that the proof of Theorem 2 implies that analyzing Wa reduces to analyzing Dz, where D is the negative distance matrix (−d(i, j)) and z is an n × 1 column vector with arbitrary complex exponentials in each component:

    z = ⟨z_1, z_2, . . . , z_n⟩,
    z_j = r_j exp(√−1 φ_j).

Theorem 3. Let the cities j and k be the two cities that are the furthest away among all the cities (i.e., cities j, k such that d(j, k) = max_{x,y} d(x, y)). The network dynamics will force the jth and kth rows of the activation array to be 180° out of phase.

Proof. Consider the jth and kth components of Dz:

    (Dz)_j = −Σ_p d(p, j)z_p = . . . − d(k, j)z_k − . . .
    (Dz)_k = −Σ_p d(p, k)z_p = . . . − d(j, k)z_j − . . .

The term −d(k, j)z_k will contribute a large multiple of −z_k to z_j, and the term −d(j, k)z_j will contribute a large multiple of −z_j to z_k. This will cause the phases of z_j and z_k to move in opposite directions (see Figure 3). ∎

From Theorem 3 it is easy to see that, in general, the network dynamics will adjust the phase of each row of the activation array so as to “oppose” the distance-weighted effect of the other rows (see Figure 4).

Figure 3. The addition of −z_k to z_j will cause it to rotate away from z_k. Similarly for the addition of −z_j to z_k.

Figure 4. When the kth row of the activation array is updated, it receives a contribution from all the rows, weighted by the distances d(j, k). The phase of the kth row will be moved away from the sum of the distance-weighted phases of the other rows.

Theorem 4. Let {G1, G2} be an effective partition of the set of cities, and let z have the property that z_k = z if k ∈ G1, and z_k = −z if k ∈ G2. The network dynamics will then multiply each component of z by a positive scalar, and therefore maintain the phase relationships between the rows of the activation array.

334Wolfe

Proof. If k ∈ G1:

    (Dz)_k = ( Σ_{j∈G2} d(j, k) − Σ_{j∈G1} d(j, k) ) z.

And, if k ∈ G2:

    (Dz)_k = ( Σ_{j∈G1} d(j, k) − Σ_{j∈G2} d(j, k) ) (−z).

Thus, each component of z is scaled in proportion to the corresponding difference in sum of distances to G1 and G2, which, by the definition of an effective partition, must be positive. ∎

These Theorems predict that sinusoids of frequency 1 will emerge in the rows of the activation array in the network’s early stages, and Theorem 4 indicates that the sinusoids will group themselves according to an effective partition. That is, rows that correspond to the cities in G1 will be approximately 180° out of phase with rows that correspond to cities in G2. Furthermore, simulations will show that for uniformly randomly generated cities the cities are roughly divided into G1 and G2 by the line that is perpendicular to the principal axis of the cities at the centroid.
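The tendency predicted by Theorems 3 and 4 is easy to observe in a toy numerical experiment (entirely our construction, not part of the paper’s model): iterate z ← z + ε·Dz on random unit phasors, renormalizing so that only the phases evolve, and watch the phases split into two clusters roughly 180° apart:

    import numpy as np

    rng = np.random.default_rng(1)
    cities = rng.uniform(size=(20, 2))
    # D is the negative distance matrix (-d(i, j))
    D = -np.linalg.norm(cities[:, None, :] - cities[None, :, :], axis=2)
    z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=20))   # random phases
    for _ in range(200):
        z = z + 0.05 * (D @ z)
        z = z / np.abs(z)              # keep unit radius; track phases only
    print(np.sort(np.angle(z, deg=True)))   # phases gather into two opposing clusters

In runs of this toy, grouping the cities by which phase cluster they land in tends to reproduce an effective partition, consistent with the discussion above.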

What is left to verify is that the network dynamics will organize the peaks of these sinusoids in such a way that their order corresponds to the centroid tour. This is difficult to prove, so we present only empirical evidence in the next section.

6. Network Performance: 50-City Example

The following simulations validate the theoretical results of the previous sections. In particular, the simulations show that in the network’s initial phase:

• sinusoids of frequency 1 emerge in each row of the array;
• the peaks of the sinusoids are arranged in centroid order;
• the peaks form 2 clusters, centered about 2 columns of the array;
• the clusters are approximately half way around the tour from each other;
• the clusters correspond to an effective partition of the cities.

The simulations also show that the network moves through three phases: centroid, monotonic, and nearest-city. It is difficult to explain, to any degree of satisfaction, the relationship between the phases. It does appear, however, that the monotonic phase creates a biased introduction to the final, nearest-city, phase. Therefore, a better understanding of the monotonic phase might lead to network improvements.

Figures 5A–D show a 50-city example. The city locations are randomly placed in the unit square. The 2-opt, nearest-city-best, nearest-city-worst, centroid, and monotonic tours are computed before running the neural network (see Figure 5A). Notice that the tour lengths are given in the figure and that the 2-opt is the better of the tours, with the nearest-city-best the next best. In terms of tour length, the monotonic and centroid tours are relatively poor. (The monotonic and centroid tours get increasingly worse, in comparison to the 2-opt and nearest-city-best, as the number of cities increases.) To facilitate the interpretation of phase 1, the cities are ordered according to the centroid result before launching the neural network. This makes the effective partition “jump out” of the sketch of the activations.

Figure 5A. A 50-city example. The 2-opt, nearest-city-best, nearest-city-worst, centroid, and monotonic tours are shown. The centroid and principal axis are also drawn in on some tours (2-opt and centroid). Before running the neural network the cities are put in the order of the centroid result.

Figure 5B shows the fuzzy tours, and activations, at iterations 10, 280, and 550. At iteration 280 the network is clearly in the centroid phase: the fuzzy tour is nearly identical to the centroid tour, the activations are sinusoidal across each row, they form 2 clusters, the clusters correspond to an effective partition of the cities, and the partition has one group of cities at each end of the principal axis. (In Figures 5B and 5C the activations are “drawn to scale.” That is, the largest box drawn represents the largest activation currently in the array, which might be a very small activation value, especially in the initial phase.) At iteration 550 the network reaches the monotonic phase: the fuzzy tour correlates strongly with the monotonic tour. These phases emerge in almost all simulations, and the monotonic phase is typically less accurate and briefer than the centroid phase.

Figure 5B. Snapshots of the neural activations and the fuzzy tours at iterations 10, 280, and 550. The centroid and principal axis are also drawn on the tours. At iteration 280 the network is in the centroid phase: the fuzzy tour is almost identical to the centroid tour and the cities have been partitioned into an effective partition, G1 and G2, with G1 on one end of the principal axis and G2 on the other end. The activations (on the right) show the bumps corresponding to the peaks of the sinusoids formed in each row. At iteration 550, the network is in the monotonic phase.

Notice that the clustered sinusoids create closely spaced cms in each of the two groups, yet, as the fuzzy tour shows, there is enough resolution to specify the centroid tour unambiguously. As was noted earlier, equal cms are ambiguous, and the state consisting of all equal cms is maximally ambiguous. Considering how close the cms are in the centroid phase it is not that surprising that the network moves to a significantly different set of tours in the monotonic phase within just a few iterations.

The monotonic phase sets the stage for the nearest-city phase. It appears that it creates a bias that affects the nearest-city phase. Figure 5C shows the final phase of the network: nearest-city tours. In this case the network beat the nearest-city-best result, but notice that the tours in Figure 5C show a tendency to snake along the principal axis. This phenomenon is present in almost all simulations, and it seems to be the source of increasingly poor performance as the number of cities increases.

Figure 5C. The nearest-city phase. The sinusoidal clusters break up and the activations form a pattern consistent with a nearest-city tour. From iteration 655 to 995 the network refines the nearest-city tour to the point where it is competitive with the nearest-city-best (7.66). Notice that these nearest-city tours tend to “snake” around the principal axis.

Figure 5D shows the fuzzy tour lengths for iterations 1 through 1000. The three phases of the network are evident. Notice that the fuzzy tour lengths do not reach the monotonic tour length (Figure 5B shows that the fuzzy tours are only an approximation to the monotonic tour). This appears to be the result of the fact that the centroid phase has partitioned the cities into 2 groups and the monotonic effect is acting on each group separately.

Figure 5D. The fuzzy tour lengths for iterations 1 through 1000. The centroid phase is evident from iterations 100 to 500; the monotonic phase from 500 to 650; and the nearest-city phase from 650 to 1000.

Figure 5D also shows that the fuzzy tour lengths are relatively stable in the middle of the centroid phase, but they are erratic in the monotonic phase, and they vary only slightly in the nearest-city phase. This is typical for the runs we did.

7. Network Performance: 10 to 100 Cities

Figure 6 shows the results of averaging 50 runs for each problem size from 10 to 70 cities. For any neural run we define the “neural result” to be the best of all the fuzzy tours that are encountered in the run. Although we do not keep track of the phase that produced the best tour, in most cases the best tour is produced in the nearest-city phase, but for 10 to 30 cities it is often the case that the centroid phase produces the best tour. For a small number of cities (10–20), the neural result competes with the 2-opt and nearest-city-best. This is correlated with the fact that, for 10 to 20 randomly generated cities, centroid tours also compete very closely with the 2-opt and nearest-city-best tours. That is, the network gets to pick the better of the centroid and nearest-city solutions. Although the neural results reliably fall between the nearest-city-best and nearest-city-worst results, there is some indication that the results are worsening as the number of cities grows, possibly because of the dramatically increasing length of the average centroid and monotonic tours as the number of cities grows.

Figure 6. The plot shows the average of 50 random city runs for city sizes of 10 through 70. The neural results reliably fall between the nearest-city-best and nearest-city-worst results. The neural results, however, show signs of worsening with problem size, possibly because of the significant worsening of the centroid tours as the number of cities increases.

Figure 7 provides additional data summarizing many simulations. It shows the number of times the neural result (i.e., the best of all the fuzzy tours for that run) was better than the 2-opt, nearest-city-best, or nearest-city-worst for 50 instances of city sizes 25, 50, 70, and 100. It is clear that the neural results are slowly degrading, although even at 100 cities the neural result is better than the nearest-city-worst approximately 80% of the time. The degradation in performance appears to be a consequence of the fact that centroid and monotonic tours degrade very rapidly as the number of cities increases.

Figure 7. These histograms show the number of times the neural result was better than the 2-opt, nearest-city-best, or nearest-city-worst for 50 runs of 25, 50, 70, and 100 random cities. There is a noticeable degradation of the neural results as the number of cities increases, but even at 100 cities the neural result beat the nearest-city-best twice (but it was also worse than the nearest-city-worst tour 11 times).

Figure 8 shows two 125-city runs. Notice that the neural tours tend to snake around the diagonal of the city location box (the principal axis is approximately along one of the diagonals). The neural results are better than the nearest-city-worst results approximately 50% of the time for 125 randomly generated cities.

Figure 8. Two 125-city cases, showing the nearest-city-worst and neural tours. The top pair of tours shows the nearest-city-worst (tour length 11.63) and the neural tour (tour length 11.88), while the bottom pair of tours shows another example with tour lengths 12.19 and 11.20. For 125 randomly generated cities the neural tour is shorter than the nearest-city-worst tour about 50% of the time.

8. Special Cases: Circular and Linear City Configurations

8.1 Circular Cities

It is important to realize that the simulations presented above were based on (uniformly) randomly generated cities in a square. The network does much better on circular and linear configurations of cities. For example, for up to 40 cities that are uniformly placed on a circle the network almost always gets the optimal solution. This is attributable to the centroid phase, since it seeks the centroid solution that is optimal in circular cases.

For up to 100 cities equally spaced on a circle the network often finds the optimal solution. Furthermore, a simple variation of the network architecture produces 100% optimal solutions for such cases. That is, add connections to columns beyond one column away from any neuron:

    connect neuron (i, x) to neuron (i ± q, y), for q = 1, . . . , q_max, with wrap around, with a connection strength of −d(x, y).

Thus, q_max is the number of columns away that we connect a given neuron, using the same distance-weighted connections we would use for a single column away. For the simulations presented in the earlier sections q_max was set to 1. In this section we examine the network performance for larger values of q_max.
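In terms of the simulation sketch from Section 1.2, only the distance term changes. A hedged sketch of the widened coupling (q_max and all names are ours):

    import numpy as np

    def distance_net(A, dist, q_max=1):
        """Distance-weighted input with connections to the q_max nearest
        columns on each side; q_max = 1 recovers the original network."""
        neighbors = sum(np.roll(A, q, axis=1) + np.roll(A, -q, axis=1)
                        for q in range(1, q_max + 1))
        return -dist @ neighbors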

When we increase q_max we observe increasingly better performance on circular cities. Figure 9 shows the improvement in performance as q_max ranges from 1 to 3. For larger values of q_max the network found the optimal solution on all simulations (up to 100 cities). Figure 10 shows a 100-city example with q_max = 20. In this case it took only 26 iterations to reach the optimal solution. In almost all cases, with q_max = n/5 the network finds the optimal solution within 50 iterations.

Figure 9. In this experiment the cities were equally spaced on a circle, and ten runs were made for each problem size from 10 to 100 cities. The number of optimal solutions in the ten runs is plotted. For q_max = 1 the network produced all optimal solutions for up to 30 cities; from there performance degrades to only 3 optimal solutions at 100 cities. For q_max = 3, however, the (modified) network produced all optimal solutions except for 1 run at 90 cities.

Figure 10. In this example 100 cities were equally spaced on a circle. The network was initialized with random activations and q_max = 20. After only 26 iterations the network found the optimal solution.

It is relatively easy to see that Theorem 1 can be generalized to the modified architecture, with the result that the eigenvalue with the largest magnitude is now λ_1 ≈ −2q_max d(x, y). This enhances the centroid effect in the first phase of the network and helps the network find optimal solutions for circular configurations.

The effect of this modified architecture on randomly generated cities is still under investigation, but initial simulations indicate that it does not improve overall network performance despite the fact that it improves the performance on circular cities.

8.2 Linear Cities

To test linear configurations of cities we placed cities equally spaced along a line. For up to 100 cities (as far as we tested) the network, without modification (i.e., q_max = 1), produced the optimal solution very quickly (less than 50 iterations) on all the runs that we did (dozens of runs for each problem size from 10 to 100 cities). In this case the monotonic phase of the network provides the good performance, as might be expected since the monotonic tour is optimal. We also modeled “near-linear” city configurations by restricting one of the city coordinates to the range [0.45, 0.55] (see Figure 11). In these cases the neural results were much better than the results shown in Figure 6 for uniformly generated cities in the unit square. This is explained by the fact that the deleterious effects caused by the “snaking” along the principal axis are minimized for near-linear city configurations.

Figure 11. Near-Linear Cities. In these simulations city coordinates (x, y) were restricted to the range: x randomly chosen from [0, 1]; y randomly chosen from [0.45, 0.55]. We refer to such problems as “near-linear.” For each problem size (from 30 to 100 cities) 10 runs were made. The plot shows a comparison of the average results for the 2-opt, nearest-city-best, and neural. The neural results compete with the 2-opt for up to 40 cities and with the nearest-city-best for up to 90 cities. This is a significant improvement over the results shown in Figure 6.

Finally, it is interesting to note that the centroid phase is absent, or very brief, for near-linear cities, and similarly, the monotonic phase is absent for circular cities. It appears that the centroid and monotonic phases express the degree of “circular-ness” and “linear-ness” of a randomly generated set of cities.

9. Conclusions

We have shown that a neural network, using an implementation of Aiyer’s subspace method and a fuzzy approach to reading the rows of the array, can reliably produce nearest-city quality tours for up to 100 randomly generated cities. It has been demonstrated that the fuzzy approach reveals some of the inner workings of the network, exposing centroid, monotonic, and nearest-city phases. It is unclear, however, exactly how the phases affect each other. It appears that the network adopts a bias as it passes through a phase, but it is difficult to correlate a subsequent phase with the previous phase. The principal axis plays a major role in the evolution of the network, and it is clear that cities with a larger projection onto the major axis have larger activations during the centroid phase. It is also clear that the effective partition forms on opposing ends of the principal axis. Furthermore, the nearest-city phase shows a strong tendency to snake around the principal axis. A deeper analysis of these relationships might lead to further improvements in network architecture.

Although further analysis of the 3 network phases may suggest strategies for an improved network, it remains plausible that the network is limited by its nearest-city behavior. That is, in the nearest-city phase the network appears to grow separate small, locally short, paths of cities, but fails to organize them into a good overall tour. This is at least partly because each of these short paths grows with a specific direction, and when they meet up the directions may be incompatible with a good tour. For example, the 2-opt heuristic allows for reversing the direction of any sub-sequence of cities within the current tour. This is the source of much of its power. The network, on the other hand, has no mechanism for such reversals of subpaths, and the number of subpaths that might grow independently appears to increase with the number of cities, a fact that is consistent with degrading performance as the problem size grows.

Finally, there may be some value in applying the fuzzyread out method to other optimization networks, especiallyin cases where the network often converges to ambiguousstates.

Acknowledgments

Professor Ross M. McConnell, of the University of Colorado at Denver, made major contributions to the development of the ideas in this article. Thanks to the anonymous IEEE and INFORMS reviewers who made important observations about earlier versions of this article. Further thanks to Professors Jay Rothman, Gita Alaghband, Harvey Greenberg, and Steve Billups, and students Ed Chang, Frank Ducca, Mike Pitt, Ben Bahr, John Porter, Dan Szabo, Brad Swanson, Ross Thomason, Brian Powell, Donovan Nelson, Joe Gee, Jeff Donovan, Jean Watson, Jonathan McSwain, Jiang Xu, Kenny Simler, Lincoln Carlton, Mike Brockway, Qibin Tao, Thomas Sheperd, Janet Siebert, Robert Keeble, and Libby Pflum of the University of Colorado at Denver, and Steve Sorensen of Hughes Information Technology Systems, for their insightful comments concerning the technical details of this article. And finally, thanks to Cheryl Dwyer for technical support.

Appendix

Let (a_ix) be a feasible state if there is exactly one maximum activation in each row and column, and the rest of the entries are equal to the minimum activation. Feasible states correspond to tours in the usual way. Also notice that with M = +1 and m = −1/(n − 1) the sum of any row or column of a feasible state is 0 (i.e., (a_ix) ∈ Σ0). Now we show that feasible states have lower energy when they correspond to shorter tours.

Theorem A1. Feasible states have energy given by:

    E = −Δ(n − 2)/(n − 1)² + (n/(n − 1))² TL,

where Δ = Σ_x Σ_y d(x, y), and TL is the length of the tour defined by the feasible state.

Proof. Assume that (a_ix) is a feasible state. By direct computation:

    Σ_i Σ_x Σ_j (−1/n) a_ix a_jx + Σ_i Σ_x Σ_y (−1/n) a_ix a_iy + Σ_i Σ_x Σ_j Σ_y (1/n²) a_ix a_jy = 0.

Thus, the energy of a feasible state is given by the first term in the energy equation:

    E = −(1/2) Σ_i Σ_x Σ_y (−d(x, y)) a_ix (a_{i+1,y} + a_{i−1,y}).

Let q be the row that has activation 1 in the ith column of the array (i.e., a_iq = 1), and let y1 and y2 be the rows on the left and right that also have activation 1 (i.e., a_{i−1,y1} = 1 and a_{i+1,y2} = 1). Then we can expand:

    Σ_x Σ_y (−d(x, y)) a_ix (a_{i+1,y} + a_{i−1,y})
      = −( Σ_{z≠y1} m·d(q, z) + Σ_{z≠y2} m·d(q, z) + (d(q, y1) + d(q, y2)) )
        − Σ_{r≠q} ( Σ_{z≠y1} d(r, z)m² + Σ_{z≠y2} d(r, z)m² + m(d(r, y1) + d(r, y2)) )
      = −2ms_q + (m − 1)(d(q, y1) + d(q, y2)) − 2m²(Δ − s_q)
        + m(m − 1)(s_{y1} + s_{y2}) − m(m − 1)(d(q, y1) + d(q, y2))

where s_q is the sum of the distances from q to all other cities. Now summing over all the columns:

    E = −(1/2) Σ_i Σ_x Σ_y (−d(x, y)) a_ix (a_{i+1,y} + a_{i−1,y})
      = −(1/2)( −2m²(n − 1)Δ + m(m − 1)(2Δ) − 2m(m − 1)TL − 2mΔ + 2(m − 1)TL ).

Finally, substituting m = −1/(n − 1):

    E = −Δ(n − 2)/(n − 1)² + (n/(n − 1))² TL. ∎



Theorem A2. The Hopfield-Tank model can be approximated by the equation:

    da_i/dt = (a_i − m)(M − a_i) net_i.

Proof. Setting λ = 0, and solving the equation

    a_j = (M − m)/(1 + exp(−u_j)) + m

for u, gives:

    u_j = ln((a_j − m)/(M − a_j)).

Therefore:

    du_j/dt = [(M − m)/((a_j − m)(M − a_j))] da_j/dt

and therefore:

    da_j/dt = [(a_j − m)(M − a_j)/(M − m)] du_j/dt.

Dropping the scale factor 1/(M − m), and substituting du_j/dt = net_j, gives:

    da_j/dt = (a_j − m)(M − a_j) net_j. ∎

Theorem A3. Given an arbitrary set of cities, an effective partition can be constructed by the following procedure:

1. Arbitrarily assign the cities to a partition {G1, G2}.
2. Loop over the cities:
   a. If a city violates the condition for an effective partition, put it in the other group.
   b. Stop when no city violates the condition.

Proof. Let Score(G1, G2) be the “score” of the partition {G1, G2}, defined as:

    Score(G1, G2) = Σ_{k∈G1} ( Σ_{j∈G1} d(j, k) − Σ_{j∈G2} d(j, k) )
                  + Σ_{k∈G2} ( Σ_{j∈G2} d(j, k) − Σ_{j∈G1} d(j, k) ).

The procedure will move cities between G1 and G2, and in each move the Score will decrease. For example, if city p is in G1 but Σ_{j∈G1} d(j, p) − Σ_{j∈G2} d(j, p) > 0, then moving city p to G2 causes the Score to decrease:

    ΔScore = 2( Σ_{j∈G2} d(j, p) − Σ_{j∈G1} d(j, p) ) < 0.

Therefore, the procedure will continue to decrease the Score until an effective partition is reached. ∎
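The procedure of Theorem A3 is a few lines of code. A sketch (the boolean mask in_g1 marks membership in G1; names ours):

    import numpy as np

    def effective_partition(dist, seed=0):
        """Move any city that violates the effective-partition condition to
        the other group until no violations remain (Theorem A3)."""
        n = len(dist)
        rng = np.random.default_rng(seed)
        in_g1 = rng.integers(0, 2, size=n).astype(bool)     # random initial split
        moved = True
        while moved:
            moved = False
            for k in range(n):
                to_g1 = dist[k, in_g1].sum()     # sum of distances from k to G1
                to_g2 = dist[k, ~in_g1].sum()    # sum of distances from k to G2
                if in_g1[k] and to_g1 > to_g2:   # k is (sum-)closer to G2
                    in_g1[k] = False
                    moved = True
                elif (not in_g1[k]) and to_g2 > to_g1:
                    in_g1[k] = True
                    moved = True
        return in_g1

Each move strictly decreases the Score defined in the proof, so the loop terminates.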

References

1. S.V.B. AIYER, 1991. Solving Combinatorial Optimization Problems Using Neural Networks, CUED/F-INFENG/TR89, Cambridge University Engineering Department, Cambridge, England.

2. J.J. HOPFIELD and D.W. TANK, 1985. ‘Neural’ Computation of Decisions in Optimization Problems, Biological Cybernetics 52, 141–152.

3. A. GEE, S.V.B. AIYER, and R. PRAGER, 1993. An Analytical Framework for Optimizing Neural Networks, Neural Networks 6, 79–97.

4. S.V.B. AIYER, M. NIRANJAN, and F. FALLSIDE, 1990. A Theoretical Investigation in the Performance of the Hopfield Model, IEEE Transactions on Neural Networks 1, 204–215.

5. W.J. WOLFE, J.A. ROTHMAN, E.H. CHANG, W. AULTMAN, and G. RIPTON, 1995. Harmonic Analysis of Homogeneous Networks, IEEE Transactions on Neural Networks 6, 1365–1374.



6. W.J. WOLFE and R. ULMER, 1997. Orthogonal Projections Applied to the Assignment Problem, IEEE Transactions on Neural Networks 8, 774–778.

7. E.L. LAWLER, J. LENSTRA, A. RINNOOY-KAN, and D. SHMOYS, 1985. The Traveling Salesman Problem, John Wiley and Sons, New York.

8. S. LIN and B.W. KERNIGHAN, 1973. An Effective Heuristic Algorithm for the Traveling Salesman Problem, Operations Research 21, 498–516.

9. J.L. MCCLELLAND and D.E. RUMELHART, 1981. An Interactive Activation Model of Context Effects in Letter Perception: Part 1. An Account of Basic Findings, Psychological Review 88, 375–407.

10. D.E. RUMELHART and J.L. MCCLELLAND, 1988. Parallel Distributed Processing, Vol. 1 and 2, MIT Press, Cambridge, MA.

11. P.J. DAVIS, 1979. Circulant Matrices, Wiley, New York.

12. S. HAYKIN, 1999. Neural Networks, Prentice Hall, Englewood Cliffs, NJ.

13. K. SMITH, M. PALANISWAMI, and M. KRISHNAMOORTHY, 1998. Neural Techniques for Combinatorial Optimization with Applications, IEEE Transactions on Neural Networks 9, 1301–1318.

14. E. AARTS and J.K. LENSTRA (eds.), 1997. Local Search in Combinatorial Optimization, John Wiley and Sons, New York.

15. Y. TAKEFUJI, 1992. Neural Network Parallel Computing, Kluwer Academic Publishers, Boston, MA.

16. A. RANGARAJAN, S. GOLD, and E. MJOLSNESS, 1996. A Novel Optimizing Network with Applications, Neural Computation 8, 1041–1060.

17. K. SMITH, 1999. Neural Networks for Combinatorial Optimization: A Review of More Than a Decade of Research, INFORMS Journal on Computing 11, 15–35.
