new policies for the dynamic traveling salesman problem

This article was downloaded by: [University of Chicago Library]On: 12 November 2014, At: 10:13Publisher: Taylor & FrancisInforma Ltd Registered in England and Wales Registered Number: 1072954 Registeredoffice: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Optimization Methods and SoftwarePublication details, including instructions for authors andsubscription information:http://www.tandfonline.com/loi/goms20

New policies for the dynamic travelingsalesman problemGianpaolo Ghiani a , Antonella Quaranta a b & Chefi Triki ba Dipartimento di Ingegneria dell’Innovazione , Universit diLecce , Via per Arnesano, 73100, Lecce, Italyb Dipartimento di Matematica , Universit di Lecce , Via perArnesano, 73100, Lecce, ItalyPublished online: 28 Sep 2007.

To cite this article: Gianpaolo Ghiani , Antonella Quaranta & Chefi Triki (2007) New policies forthe dynamic traveling salesman problem, Optimization Methods and Software, 22:6, 971-983, DOI:10.1080/10556780701550026

To link to this article: http://dx.doi.org/10.1080/10556780701550026

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the“Content”) contained in the publications on our platform. However, Taylor & Francis,our agents, and our licensors make no representations or warranties whatsoever as tothe accuracy, completeness, or suitability for any purpose of the Content. Any opinionsand views expressed in this publication are the opinions and views of the authors,and are not the views of or endorsed by Taylor & Francis. The accuracy of the Contentshould not be relied upon and should be independently verified with primary sourcesof information. Taylor and Francis shall not be liable for any losses, actions, claims,proceedings, demands, costs, expenses, damages, and other liabilities whatsoever orhowsoever caused arising directly or indirectly in connection with, in relation to or arisingout of the use of the Content.

This article may be used for research, teaching, and private study purposes. Anysubstantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

http://www.tandfonline.com/loi/goms20

http://www.tandfonline.com/action/showCitFormats?doi=10.1080/10556780701550026

http://dx.doi.org/10.1080/10556780701550026

http://www.tandfonline.com/page/terms-and-conditions

http://www.tandfonline.com/page/terms-and-conditions

Optimization Methods and SoftwareVol. 22, No. 6, December 2007, 971–983

New policies for the dynamic traveling salesman problem

GIANPAOLO GHIANI†, ANTONELLA QUARANTA†‡ and CHEFI TRIKI*‡

†Dipartimento di Ingegneria dell’Innovazione, Universit di Lecce, Via per Arnesano,73100 Lecce, Italy and

‡Dipartimento di Matematica, Universit di Lecce, Via per Arnesano, 73100 Lecce, Italy

(Received 11 November 2005; revised 4 May 2006; in final 10 January 2007)

The dynamic traveling salesman problem is a variant of the classical traveling salesman problem inwhich service requests arrive in a random fashion. In this paper we propose three new routing policiesminimizing the expected total waiting time of all the demands in the system. In particular, we showthat two policies are asymptotically optimal in heavy traffic, whereas the third one has a constant factorguarantee of two. We also perform extensive simulations in order to compare the three policies undervarious arrival rates.

Keywords: Stochastic and dynamic vehicle routing; Dynamic fleet management

1. Introduction

The purpose of this paper is to describe and assess three new policies for the dynamic travelingsalesman problem (DTSP). The DTSP is a generalization of the classic traveling salesmanproblem (TSP) in which customer requests arrive in an ongoing fashion. Consequently, thesalesman has to replan his route as soon as a new service request arrives. The objective is tominimize the expected total waiting time, i.e. the sum of the expected waiting times of all thecustomer demands.

The DTSP falls under the broad category of real-time fleet management. A large part of thecurrent literature is characterized by algorithms reacting to new requests only once they haveoccurred, while neglecting available stochastic information. Overviews of these problems canbe found in Powell et al. [1], Psaraftis [2,3], Gendreau and Potvin [4], and Ghiani et al. [5].Another line of research, which is particularly relevant to this paper, examines dispatching androuting policies whose performance can be determined analytically if specific assumptions aresatisfied. See, e.g. Bertsimas and Van Ryzin [6,7], where demands are distributed in a boundedarea in the plane and arrival times are modeled as a Poisson process. The authors identifyoptimal policies both in light and heavy traffic cases. Papastavrou [8] describes a routing

*Corresponding author. Email: [email protected]

Optimization Methods and SoftwareISSN 1055-6788 print/ISSN 1029-4937 online © 2007 Taylor & Francis

http://www.tandf.co.uk/journalsDOI: 10.1080/10556780701550026

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14

972 G. Ghiani et al.

policy that performs well both in light and in heavy traffic, whereas Swihart and Papastavrou [9]examine a dynamic pickup and delivery extension. A related area of research, often referredto as stochastic vehicle routing, examines the problems in which demand becomes knownat the beginning of each day. In this context, the problem is to determine, on the basis of aprobabilistic characterization of random data, a solution of least expected cost in which theorder of customers is fixed regardless of the demand realization for a particular day (an a priorisolution). Jaillet [10] introduced the probablisitic TSP, Jaillet and Odoni [11] examined thecapacitated case, and Bertsimas [12] introduced the multi-vehicle stochastic vehicle routingproblem. A survey of the research in this area can be found in Powell et al. [1], Bertsimas andSimchi-Levi [13], and Gendreau et al. [14]. Finally, we note the existence of a relatively recentline of research, known as anticipatory routing, in which general probability distributions areused in order to devise exact or heuristic policies. To our knowledge, four papers take thisapproach. Powell et al. [15] introduce a truckload dispatching problem, and Powell [16]provides formulations, solution methods, as well as numerical results. In these papers, futuredemand forecasts are used to determine which loads should be assigned to the vehicles in atruckload environment to account for forecasted capacity needs in the next period. In Thomasand White [17] a vehicle may serve several requests at a time and may wait for future demandboth at a customer and non-customer locations. Not all requests have to be serviced, and theobjective function to be minimized is the expected value of a combination of travel costs,terminal costs, and revenue generated from a pickup. Bent and Van Hentenryck [18] considera vehicle routing problem where customer locations and service times are random variableswhich are realized dynamically during plan execution. They develop a multiple scenarioapproach which continuously generates plans consistent with past decisions and anticipatingfuture requests. Moreover, there is another area of research related to online optimizationand competitive analysis [19–23]. More specifically, some related problems have been solvedby the so-called IGNORE-strategy [22,24–26], which was proved to be 2.5-competitive forminimizing the makespan for the dial-a-ride problem in graphs [24] and to have performancebounds to minimize the total waiting time in ref. [25]. Moreover, it is proved in ref. [26] thatthis strategy is asymptotically optimal for minimizing the total waiting time on trees in caseof high load.

This paper exploits some of the results exposed in the second stream of research and extendsthem to define new routing policies for the DTSP. The sequel of the paper is organized asfollows. In section 2, we propose three new spatially unbiased policies and compare theirperformance in terms of lower bounds on the optimal solution value. Our computationalexperience is summarized in section 4, followed by some concluding remarks in section 5.

2. New policies for the DTSP

We investigate the DTSP under the hypothesis that the requests occur in a plane. Moreover,we assume that service requests arrive according to a homogeneous Poisson process, and theirlocations are independently and uniformly distributed in a bounded region A having an area �.For the sake of simplicity, the region is assumed to be a square whose median corresponds toa depot where the vehicle is based. Demands are served by a single vehicle that travels at aconstant speed υ. We also suppose that the vehicle operates in heavy traffic condition, i.e. thearrival rate λ → ∞. We assume that there are no constraints on the vehicle’s capacity andthat the service time spent by the vehicle at each location is negligible. Our objective is tominimize the sum of the expected waiting times of all the demands in the system. Let di,i+1

be the time required by the vehicle to move from the ith demand’s location to the (i + 1)-thdemand’s location. Moreover, we denote by Di the service time of the ith request. We also

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14

New policies for DTSP 973

denote by ti the arrival time of demand i, by Ti = Di − ti the ith demand’s waiting time andwith T ≡ limi→∞ E[Ti] its steady state expected value.

In this section, we propose three spatially unbiased policies for the DTSP. A commonfeature of the first two policies is the partitioning of the service region A into a fixed numberof subregions, whereas the third policy uses a simple but efficient insertion method. Moreover,in the light traffic condition, i.e. whenever no customer is waiting in the system, the idle vehicleis repositioned in the stochastic median of the service region and waits until a new demandarrives. More specifically, this last strategy is substantially the stochastic queue median policydescribed in ref. [6] which was proved to be optimal in the light traffic condition.

2.1 The total deferment policy

This policy represents an adaptation of the unbiased policy introduced in ref. [6]. The serviceregion A is partitioned into k ∈ Z+ subregions A1,A2, . . . ,Ak of equal size, each having anarea A such that:

Pr(δ ∈ Ai ) = 1

k(i = 1, . . . , k),

where δ represents a generic demand to be serviced. The vehicle follows an a priori touramong zones arbitrarily designed. Within each subregion, the demands already known at themoment the vehicle enters in the subregion are serviced along an Hamiltonian path. As soonas the vehicle terminates the current path, it goes to the next subregion in the a priori tour.The process is then repeated until no demands are left. While the vehicle is moving alongthe current tour, new service requests may arrive. The locations of these requests may be inthe current subregion (the subregion in which the vehicle is currently operating) or in anothersubregion. In the latter case, these demands will be served as soon as the vehicle arrives intotheir subregion. In the former case, the salesman has to decide whether to serve the newdemands in the current tour or to postpone their service to the next tour. The total deferment(TD) policy implements the second strategy, i.e. it queues the new demands and services themin the next tour. Such a policy has the theoretical property of being asymptotically optimal asshown by the following theorem.

THEOREM 2.1 For the DTSP with a spatial uniform demand distribution, the expected totalsystem time TTD under the TD policy is bounded by:

ηβ2

2

λ�

v2≤ TTD ≤ η

(1 + 1

k

)β2λ�

2v2,

where η is the expected total number of demands in the system waiting for service and β isthe Euclidean TSP constant, which is estimated in [3] to be β ≈ 0.72.

Proof The lower bound was derived in ref. [6] and was proved for an arbitrary spatiallyunbiased policy. For the proof of the upper bound, let b(t) be a random variable representingthe number of demands for service left in subregion As(1 ≤ s ≤ k) when the vehicle crossesover to As+1. Denote by b the steady state expectation value of b(t). Between two visits toAs+1, the vehicle has visited k subregions; therefore, the average number of demands thevehicle finds in As+1 is ν = kb. In the same way, there will be (k − 1)b demands in As+2,and so on. As a result, the steady-state average number of demands waiting for service in the

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


region A will be given by

η = b + 2b + 3b + · · · + kb = k(k + 1)b

2·

Therefore,

b = 2η

k(k + 1),

and the steady-state average number of demands ν the vehicle finds in a subregion verifies:

ν = kb = 2η

k + 1·

Let h1 and h2 be two moments during which the vehicle performs two consecutive crossoversfrom As to As+1. During the elapsed time h2 − h1, the vehicle should have performed k TSPpaths (one for each subregion) and k travel distances lower than or equal to

√5�/k to go

from a subregion to another. (Such a quantity becomes ((√

k + 1/√

k)√

5� for odd values ofk. In the sequel of this paper we consider, for simplicity, only even values of k.) Thus,

h2 − h1 ≤ k

(Lν

v+ 1

v

√5�

k

),

where Lν is the length of an optimal TSP path through the ν points.Let δ be a random generic demand in As+1. The average waiting time for δ at time h2 is

(h2 − h1)/2, since δ is equally likely to arrive at any moment during h2 − h1. Moreover, it hasto wait another period of time equal to Lν/2v + √

(5�/k)/2v in order to achieve the serviceof the ν demands in As+1, since it is equally likely to be serviced at any order among the ν

demands. Therefore, the steady-state expected system time Tδ of demand δ is given by:

Tδ ≤ h2 − h1

2+ 1

2

(Lν

v+ 1

v

√5�

k

)≤ h2 − h1

2+ 1

2

(Lν

v+ 1

v

√5�

k

)

≤ k + 1

2

(Lν

v+ 1

v

√5�

k

)≈ k + 1

2

(β

v

√Aν + 1

v

√5�

k

)

= k + 1

2

(β

v

√�ν

k+ 1

v

√5�

k

)·

This expression has been derived by using a well-known formula to estimate Lν , the lengthof the optimal TSP tour through the ν points and that verifies Lν ≤ Lν . This formula is dueto Beardwood et al. [27] who showed that considering ν points independently and uniformlydistributed in a bounded region A of area A, then:

limν→∞

Lν√ν

= β√

A a.s., (1)

where β ≈ 0.7211, and ‘a.s.’means ‘almost surely’or ‘with probability one’. Indeed, ν → ∞as λ → ∞ because:

ν = 2η

k + 1·

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


In addition, by using Little’s law for which η = λTδ and the lower bound defined in ref. [6]for which

β2

2

λ�

v2≤ Tδ,

we have:

ν = 2λTδ

k + 1≥ β2λ2�

(k + 1)v2·

Thus, ν must be large as λ → ∞ and consequently:

Tδ ≤ k + 1

2

{β

v

√�

k

(2η

k + 1

)+ 1

v

√5�

k

},

or equivalently:

Tδ ≤ k + 1

2

{β

v√

k

√2�λTδ

(k + 1)+ 1

v

√5�

k

}.

By solving for Tδ , we get:

√Tδ ≤

√2(k + 1)β2λ� +

√2(k + 1)β2λ� + 8

√k(k + 1)v

√5�

4v√

k,

that is:

√Tδ ≤

√2(k + 1)β2λ� + 8

√k(k + 1)v

√5�

2v√

k,

from which we get the result:

TTD = ηTδ ≤ η

(1 + 1

k

)β2λ�

2v2.

�

This result implies that, by choosing a large value of k, i.e. for k → ∞, the TD policy resultsto be asymptotically optimal.

2.2 The partial deferment policy

The partial deferment (PD) policy is a variant of the TD policy. The key difference between thetwo policies is the way in which the new demands are handled. When a new demand δ arrivesin the same subregion in which the vehicle is currently operating, the new policy evaluateswhether it is more convenient to serve it immediately or to allocate it to the subsequent tour. Incase this request’s service is delayed, the detour cost is due to the demand’s expected waitingtime before being served. In case of service during the current visit of the subregion, thedetour cost includes the waiting time of the expected demands in the queue during the detour.Mathematically, this can be expressed in terms of maximizing the insertion benefit defined asthe difference between the profit and the cost produced by the detour. The profit is the savingthat occurs from the service of the new demand δ immediately in the current path after, say,

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


demand h. Let w denote by n the random variable representing the number of demands to beserved after δ in the tour in which it is inserted, then we have:

Lη+1 − dh,δ

v,

where η represents the steady-state expectation of n and Lη+1 the length of the optimal TSPpath through the η points plus δ. The cost of the detour is the cost that incurs when the vehicleaccepts to serve the new demand δ in the current tour causing a delay on the other demandsin the queue. It is expressed as:

η

v(dh,δ + dδ,h+1 − dh,h+1).

The benefit is, thus, maximized as:

maxj≥i

{(Lη+1 − dj,δ

v

)− η

v(dj,δ + dδ,j+1 − dj,j+1)

},

which could be, without loss of generality sinceLη+1 does not depend on the index j , written as:

maxj≥i

{(Lη − dj,δ

v

)− η

v, (dj,δ + dδ,j+1 − dj,j+1)

},

where i is the demand that the vehicle is at the point of serving along its current path when δ

arrives and j is a demand already scheduled in the current TSP path after i. In order to expresssuch a benefit, we use again formula [1] that can be written, for η points independently anduniformly distributed in a bounded region A of area �, as:

limη→∞

Lη√η

= β√

� a.s. (2)

Moreover, because of the Little’s law:

η = λLη

v, (3)

and observing that if δ is served immediately in the current tour after, say, demand h, then:

Lη = Lη − dδ,h+1.

Therefore,without loss of generality, we can write

maxj≥i

{(Lη − dj,δ

v

)− η

v, (dj,δ + dδ,j+1 − dj,j+1)

},

or equivalently

maxj≥i

{(Lη − dj,δ

v

)− η

v, (dj,δ + dδ,j+1 − dj,j+1)

},

and thus, the detour benefit becomes:

maxj≥i

{(λβ2�

v2− dj,δ

v

)− λ2β2�

v2(dj,δ + dδ,j+1 − dj,j+1)

}

for large values of η.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


By using an estimation of the benefit, we can define the best allocation of the new demandwithin the current subregion’s TSP path. If the new demand results to be the last demand tobe served in the current subregion TSP path, the policy will make a comparison between theresulting expected total system time to decide whether the demand is accepted in the currenttour or referred to the next a priori designed tour.

Now we will discuss the optimality of such a policy. For this purpose, we need to introducefirst the following result.

LEMMA 2.2 Let η be the total number of demands for service whose locations are indepen-dently and uniformly distributed in a square A. Suppose that π (π < η) demands have beenalready served and denote by n = (η − π) the remaining demands for service. The distribu-tion among the n demands is not uniform anymore. Let n be the length of the optimal TSPtour through these n points and Ln be the length of the optimal TSP tour through the samepoints if they were uniformly distributed, then:

n ≤ Ln.

Proof Steele showed in ref. [28] that if n points, having the same distribution f (x), areindependently but non-uniformly distributed in a square A, then:

limn→∞

n√n

= β

∫A

f (x)1/2 dx a.s.

where f (x) is the density of the absolutely continuous part of the distribution f (x). Moreover,by using Jensen’s inequality [29] for concave functions, it follows that:

∫A

f (x)1/2dx ≤(∫

Af (x) dx

)1/2

≤(∫

Af (x)dx

)1/2

.

Therefore, by combining formula [2] and Steele’s property for n → ∞, we obtain:

n ≈ β√

n

∫A

f (x)1/2 dx ≤ β√

n ≤ β√

�n ≈ Ln.

Now the main result concerning the optimality of the PD policy can be derived. �

THEOREM 2.3 When applying the PD policy, an upper bound on the expected total systemtime TPD is given by:

TPD ≤ η(k + 1)2

k2

β2λ�

2v2

where η is the expected total number of demands in the system waiting for service and β isthe Euclidean TSP constant, which is estimated in [27] to be β ≈ 0.72.

Proof Let b(t) andAs (1 ≤ s ≤ k) defined as above. Let b′(t) be a random variable represent-ing the number of demands that the vehicle accepts during the current TSP path in As . Denoteby b and b′ the steady-state expectation values of b(t) and b′(t), respectively. Thus, the steady-state average number of demands that are left for service in subregion As when the vehiclemoves to As+1 is (b − b′) ≤ b. Between two visits to As+1, the vehicle has visited k subregionsand consequently the average number of demands it finds in As+1 is ν = (b − b′) + (k − 1)b.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


Therefore, the steady-state average number of demands to be served in the region A after awhole tour is:

η = (b − b′) + (b − b′) + b + · · · + (b − b′) + (k − 1)b

= k(b − b′) + k(k − 1)b

2,

and the steady-state average number of demands ν the vehicle finds in a subregion verifies:

η

k≤ ν ≤ 2η

k.

Analogous to the proof of Theorem 2.1, and under the fact that the distribution is not uniformanymore since some of the demands have been already served, the difference h2 − h1 isbounded by:

h2 − h1 ≤ k

(ν+b′

v+ 1

v

√5�

k

),

where ν+b′ ≤ ν+b′ is the length of an optimal TSP path through the ν + b′ points.Let δ be a random generic demand of As+1 belonging to a set composed of the ν demands

to be served in As+1. As remarked in Theorem 2.1, the average waiting time for δ at time h2 is(h2 − h1)/2. Moreover, it has to wait another period of time equal to ν+b′/2v + √

5�/k/2v

in order to finish the service of the (ν + b′) demands in As+1, since it is equally likely tobe serviced at any order among the (ν + b′) demands. Therefore, the steady-state expectedsystem time Tδ of demand δ is given by:

Tδ = h2 − h1

2+ 1

2

(ν+b′

v+ 1

v

√5�

k

)≤ h2 − h1

2+ 1

2

(ν+b′

v+ 1

v

√5�

k

)

≤ k + 1

2

(Lν+b′

v+ 1

v

√5�

k

)≈ k + 1

2

(β

v

√A(ν + b′) + 1

v

√5�

k

)

= k + 1

2

(β

v

√� (ν + b′)

k+ 1

v

√5�

k

).

This expression has been derived by using both formula [2] and Lemma 2.2 in order to estimateLν+b′ . Indeed, (ν + b′) → ∞ as λ → ∞ because:

ν + b′ ≥ η

k+ b′.

In addition, by noting that η = λTδ and by using Theorem 2.1, we have:

ν + b′ ≥ λTδ

k+ b′ ≥ β2λ2�

2kv2+ b′.

Thus, (ν + b′) is a large value as λ → ∞ and consequently:

Tδ ≤ k + 1

2

{β

v

√�

k

(2η

k+ b′

)+ 1

v

√5�

k

},

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


or equivalently:

Tδ ≤ k + 1

2

{β

vk

√2�λTδ + 1

v

√5�

k+ β

v

√�b′

k

}.

By solving for Tδ , we get:

√Tδ ≤

√2(k + 1)2β2λ� +

√2(k + 1)2β2λ� + 8k(k + 1)v

√�k(

√5 + β

√b′)

4kv,

that is;

√Tδ ≤

√2(k + 1)2β2λ� + 8k(k + 1)v

√�k(

√5 + β

√b′)

2kv,

and we get the result:

TPD = ηTδ ≤ η(k + 1)2

k2

β2λ�

2v2. �

This result implies that the PD policy is asymptotically optimal. Indeed, by combiningTheorems 2.1 and 2.3, we obtain:

TPD

TTD≤ β2λη(k + 1)2� + 4k(k + 1)vη

√�k(

√5 + β

√b′)

β2λη�k2.

Therefore, as λ → ∞, the report:

TPD

TTD−→ 1 + 1

k2+ 2

k,

indicating that even the PD policy is optimal for large values of k.

2.3 The insertion policy

Even though the previous two policies have good theoretical properties, they may performpoorly under a finite arrival rate. This behaviour is because the rigid division of the serviceregion into subregions may force the vehicle to postpone the service of a demand close to itsposition just because it belongs to another subregion. This happens, for example, when twodemands are located close to the border of two adjacent subregions but cannot be plannedto be served consecutively within the a priori tour. In order to overcome this drawback, theinsertion (Ins) policy abandons the strategy of partitioning the service region A into subregionsand makes use of a simple insertion technique. The idea behind this policy is to insert a newrequest into the current route by minimizing an insertion cost. In particular, a new request δ isinserted between two already scheduled requests in such a way the expected total system time:

minj≥i

{E[T1] + E[T2] + · · · + E[Tj ] + E[Tδ] + E[Tj+1] + · · · }

is minimized. The following property holds.

THEOREM 2.4 Under heavy traffic, the insertion policy has a constant factor guaranteeof two, i.e.:

TIns

TTD≤ 2, λ −→ ∞,

where TIns is the expected total system time associated to the insertion policy.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


Proof Let δ be a new request. In addition, let η be the steady-state average number of demandsserved during δ’s waiting time. The distribution among the η demands is not uniform becausesome of the demands have already been served. Denote by x the steady-state average numberof demands in the path with an arrival time lower than tδ and by y the steady-state averagenumber of demands with an arrival time greater than or equal to tδ . Notice that by Little’s lawit follows that:

y ≤ λLx

v≤ λLx

v≈ λβ

√�

√x

v, (4)

where we have used [2] to estimate Lx because, as will be shown below, in heavy traffic x hasto be large. Moreover, since δ is equally likely to arrive at any moment during the service ofthe x demands and is equally likely to be serviced at any order among the y demands, thenwe can write:

η = x

2+ y

2≤ x

2+ λβ

√�

√x

2v. (5)

By Lemma 2.2, we obtain that Tδ , the steady-state expected system time of δ, can beexpressed as:

Tδ = (x+y)

2v≤ (x+y)

2v≤ L(x+y)

2v.

By using the fact that if x has to be large in heavy traffic then (x + y) will be large too, we get:

Tδ ≤ β√

�

2v

√x + y ≤ β

√�

2v(√

x + √y),

and by using the relation η = λTδ and equation (4), we obtain:

η ≤ λβ√

�

2v

√x + λβ

√�

2v

√λβ

√�

√x

v, (6)

and, consequently, in heavy traffic we have:

x ≤ λβ√

�

√λβ

√�

√x/

√v

v. (7)

Indeed, by (5) and (6), it follows that:

η ≤ min

⎧⎨⎩x

2+ λβ

√�

√x

2v,λβ

√�

√x

2v+ λβ

√�

2v

√λβ

√�

√x

v

⎫⎬⎭ .

Now, suppose that (7) is true, then we have:

1 ≤ λ3β3�√

�

v3. (8)

But as λ → ∞, the quantity (λ3β3�√

�/v3) → ∞, so (8) is true in heavy traffic and is falsein light traffic since it verifies (λ3β3�

√�/v3) → 0 as λ → 0.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


Therefore, by using (7), we obtain:

x ≤ λ2β2�

v2. (9)

Moreover, relations (5) and (9) ensure that:

Tδ = η

λ≤ β2λ�

v2, (10)

and by Theorem 2.1, we have:

TIns

TTD= ηTδ

TTD≤ 2 for λ −→ ∞.

with η the expected total number of demands in the system. This proof has been based on thehypothesis that x is large. Now we shall proof that this assumption is true. By Theorem (2.1)we have:

η = λTδ ≥ β2λ2�

2v2.

Thus, by using (5), we get:

x

2+ λβ

√�

2v

√x ≥ β2λ2�

2v2

and by solving for x, we obtain x ≥ {(1 − √5)2λ2β2�}/4 which is a large quantity

for λ → ∞. �

3. Computational results

We have compared the three policies under various (finite) arrival rates λ. The service regionA is a 100 × 100 square (that has been divided into four equal subregions when the first twopolicies are used). We have carried out several sets of experiments by generating 40 instanceswith a number of demands varying between 13 and 4500. The demands’arrival times and loca-tions have been generated randomly according to an exponential and a uniform distributions,respectively. The speed v of the vehicle is assumed to be equal to 1. Parameters η and Lη havebeen estimated by using expressions (2) and (3). The route within each subregion has beendefined by using the nearest neighbour heuristic. The algorithms have been implemented in Clanguage and all the experiments have been performed on a Pentium IV computer with 504 MbRAM. The results of our experiments are illustrated in table 1. The first two columns reportthe values of arrival rate λ and the resulting number of demands for service. The third, sixthand 10th columns report the expected total system time obtained with the TD, the PD and theIns policies, respectively. Moreover, the fifth and the eighth columns report the total numberof tours, t , traversed by the vehicle in the first two policies, whereas the fourth, seventh andeleventh columns report the execution time T (s) of the three algorithms, respectively. Finally,the ninth and the 12th columns report the relative improvement of policies PD and Ins withrespect to policy TD, which we denote by I1 and I2, respectively.

The results reported in table 1 show that policies PD and Ins outperform policy TD onmost of the generated instances. Compared with TD, policy PD not only requires a lowernumber of a priori tours but also achieves an average improvement of 5%. Moreover, theresults show clearly that policy Ins outperforms the other two policies. Indeed, the averageimprovement is 28% with respect to policy TD and 24% compared to policy PD. However,

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


Table 1. Experimental results

λ Demand TD T (s) t PD T (S) t I1 Ins T (S) I2

0.125 13 3 551 0 2 2 167 0 2 38 1 948 1 450.125 14 2 899 1 3 2 382 0 3 17 1 604 0 440.125 31 10 666 0 2 8 453 0 2 20 6 468 1 390.125 27 7 002 0 2 5 991 1 2 14 4 922 1 290.125 66 14 640 1 4 7 699 1 3 47 6 671 2 540.125 60 19 717 1 3 16 689 1 3 15 15 391 2 21

0.25 34 9 245 0 2 8 611 0 2 6 6 864 0 250.25 28 7 973 1 2 6 610 0 2 17 5 798 0 270.25 54 20 474 0 2 16 523 0 2 19 15 178 1 250.25 60 24 641 0 3 18 495 0 2 24 16 335 1 330.25 120 64 268 0 3 51 774 0 2 19 49 516 6 220.25 112 55 751 1 3 47 606 0 2 14 39 821 5 280.25 2514 4 261 804 13 9 3 796 442 48 7 10 3 153 404 1805 26

0.355 39 14 956 0 2 11 834 0 2 20 9 243 1 380.355 46 17 755 0 2 11 457 0 2 35 11 895 1 330.355 70 28 453 0 2 27 719 1 2 2 22 978 2 190.355 80 34 753 0 3 26 670 1 2 23 25 400 3 260.355 178 121 928 1 3 103 554 1 2 15 81 333 12 330.355 172 117 917 1 3 89 897 1 2 23 87 502 12 250.355 3418 7 752 157 17 8 7 042 203 46 6 9 5 481 657 4563 29

0.45 54 22 179 0 2 17 409 0 2 21 12 977 2 410.45 105 48 914 1 2 40 716 0 2 16 41 100 5 150.45 216 149 115 1 3 133 818 1 2 10 107 601 16 270.45 211 152 650 1 3 126 415 2 3 17 108 518 17 280.45 4500 12 271 333 22 7 11 307 125 81 5 7 8 868 317 5883 27

0.6 61 26 057 1 2 23 654 0 2 9 18 480 2 290.6 142 92 261 1 3 87 369 1 2 5 70 579 9 230.6 150 84 771 1 3 79 417 1 3 6 77 078 9 90.6 265 199 644 2 3 178 264 2 2 10 152 415 28 230.6 283 231 862 1 2 207 759 2 2 10 170 558 31 26

0.8 112 66 620 1 2 55 231 1 2 17 47 617 5 180.8 191 133 952 1 3 115 309 1 2 13 99 007 14 260.8 413 320 562 2 3 272 694 2 2 14 210 655 62 340.8 355 331 389 2 3 312 815 2 2 5 261 303 49 210.8 4075 12 208 304 21 5 12 117 869 46 3 0.7 8 845 174 3906 27

0.99 247 190 857 1 3 190 995 2 2 0.07 151 228 23 200.99 454 512 846 3 3 431 497 2 2 15 347 267 21 320.99 486 539 236 3 3 512 307 3 2 4 400 237 47 250.99 936 1 435 763 5 3 1 323 193 6 2 7 1 096 028 245 230.99 2117 4 850 611 6 3 4 626 169 15 2 4 3 302 173 1629 310.99 2657 6 042 570 13 4 6 164 919 15 3 1 4 087 637 1443 32

this improvement in the solution quality corresponds to a higher execution time, especiallyfor larger problems.

4. Conclusions

In this paper, we have introduced and evaluated three new policies for the DTSP. The keydifference between the TD and the PD policies is the way of inserting a new demand in thecurrent subregion. On the other hand, the Ins policy makes use of a simple insertion technique.A computational study showed that the Ins policy achieves better results than the others twopolicies but requires a larger computing effort. Moreover, we have proved that the TD andPD policies are asymptotically optimal in heavy traffic, whereas the policy Ins has a constant

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14


factor guarantee of two. There are several issues that are still open for future research. Amongthem, we underline the DTSP under an heterogeneous arrival process.

References

[1] Powell, W.B., Jailet, P. and Odoni, A., 1995, Stochastic and dynamic networks and routing. Handbook in ORand MS (Amsterdam: North-Holland), pp. 141–295.

[2] Psaraftis, H.N., 1988, Dynamic vehicle routing problems. In: B. Golden and A. Assad (Eds) Vehicle Routing:Methods and Studies (North Holland: Elsevier Science Publishers B.V.), pp. 223–248.

[3] Psaraftis, H.N., 1995, Dynamic vehicle routing: status and prospects. Annals of Operations Research, 61,143–164.

[4] Gendreau, M. and Potvin, J.Y., 1998, Dynamic routing and dispatching. In: T.G. Crainic and G. Laporte (Eds)Fleet Management and Logistics (Boston: Kluwer), pp. 115–126.

[5] Ghiani, G., Guerriero, F., Laporte, G. and Musmanno, R., 2003, Real-time vehicle routing: solution concepts,algorithms and parallel computing strategies. European Journal of Operational Research, 151, 1–11.

[6] Bertsimas, D.J. and Van Ryzin, G., 1991, A stochastic and dynamic vehicle routing problem in the Euclideanplane. Operations Research, 39, 601–615.

[7] Bertsimas, D.J. and Van Ryzin, G., 1993, A stochastic and dynamic vehicle routing problem in the Euclideanplane with multiple capacitated vehicles. Operations Research, 41, 60–76.

[8] Papastavrou, J. D., 1996,A stochastic and dynamic routing policy using branching processes with state dependentimmigration. European Journal of Operational Research, 95, 167–177.

[9] Swihart, M.R. and Papastavrou, J.D., 1999, A stochastic and dynamic model for the single vehicle pick-up anddelivery problem. European Journal of Operational Research, 114, 447–464.

[10] Jaillet, P., 1988, A priori solution of the traveling salesman problem in which a random subset of customers arevisited. Operations Research, 36, 929–936.

[11] Jaillet, P. and Odoni, A.R., 1988, The probabilistic vehicle routing problem. In: B.L. Golden and A.A. Assad(Eds) Vehicle Routing: Methods and Studies (Amsterdam; The Netherlands: North-Holland), pp. 293–318.

[12] Bertsimas, D.J., 1992, A vehicle routing problem with stochastic demand. Operations Research, 40, 574–585.[13] Bertsimas, D.J. and Simchi-Levi, D., 1996, A new generation of vehicle routing research: robust algorithms,

addressing uncertainty. Operations Research, 44, 286–303.[14] Gendreau, M., Laporte, G. and Séguin, R., 1996, Stochastic vehicle routing. European Journal of Operational

Research, 88, 3–12.[15] Powell, W.B., Sheffi, Y., Nickerson, K.S., Butterbaugh, K. and Atherton, S., 1988, Maximizing profits for

North American Van Lines’ truckload division: a new framework for pricing and operations. Interfaces, 18(1),21–41.

[16] Powell, W.B., 1996, A stochastic formulation of the dynamic assignment problem, with an application totruckload motor carriers. Transportation Science, 30, 195–219.

[17] Thomas, B.W. and White, C.C., III, 2004,Anticipatory route selection. Transportation Science, 38, 473–487.[18] Bent, R.W. and Van Hentenryck, P., 2004, Scenario-based planning for partially dynamic vehicle routing with

stochastic customers. Operations Research, 52, 977–987.[19] Ausiello, G., Feuerstein, E., Leonardi, S., Stougie, L. and Talamo, M., 2001, Algorithms for the on-line traveling

salesman. Algorithmica, 29, 560–581.[20] Borodin, A. and El-Yaniv, R., 1998, Online Computation and Competitive Analysis (Cambridge University

Press).[21] Grotschel, M., Krumke, S.O. and Rambau, J., 2001, Online Optimization of Large Scale Systems (Berlin,

Heidelberg, New York: Springer).[22] Irani, S., Lu, X. and Regan, A., 2004, On-line algorithms for the dynamic traveling repair problem. Journal of

Scheduling, 7, 243–258.[23] Krumke, S.O., Laura, L., Lipmann, M., Marchetti-Spaccamela,A., De Paepe,W.E., Poensgen, D. and Stougie, L.,

2002, Non-abusiveness helps: an O(1)-competitive algorithm for minimizing the maximum flow time in theonline traveling salesman problem. Proceedings of the 5th International Workshop on Approximation Algorithmsfor Combinatorial Optimization, Vol. 2462, Lecture Notes in Computer Science (Springer), pp. 200–214.

[24] Ascheuer, N., Krumke, S.O. and Rambau, J., 2000, Online dial-a-ride problems: minimizing the completiontime. Proceedings of the 17th International Symposium on Theoretical Aspects of Computer Science, Vol. 1770,Lecture Notes in Computer Science (Springer), pp. 639–650.

[25] Hauptmeier, D., Krumke, S.O. and Rambau, J., 2000, The online dial-a-ride problem under reasonable load.Proceedings of the 4th Italian Conference on Algorithms and Complexity, Vol. 1767, Lecture Notes in ComputerScience (Springer), pp. 125–136.

[26] Hiller, B., 2005, Probabilistic competitive analysis of dial-a-ride problem on trees under high load. ZIB Report05-56, Konrad-Zuse-Zentrum fur Informationstechnik, Berlin.

[27] Beardwood, J., Halton, J. and Hammersley, J., 1959, The shortest path through many points. Proceedings of theCambridge Philosophical Society, 55, 299–327.

[28] Steele, J.M., 1981, Subadditive euclidean functionals and non-linear growth in geometric probability. Annals ofProbability, 9, 365–376.

[29] Birge, J.R. and Louveaux, F.V., 1997, Introduction to Stochastic Programming (New York: Springer-Verlag).

Dow

nloa

ded

by [

Uni

vers

ity o

f C

hica

go L

ibra

ry]

at 1

0:13

12

Nov

embe

r 20

14

new policies for the dynamic traveling salesman problem

Documents