solving the dynamic traveling salesman game problem



Cybernetics and Systems Analysis, Vol. 46, No. 5, 2010

SOLVING THE DYNAMIC TRAVELING SALESMAN GAME PROBLEM

A. A. Belousov,^{a†} Yu. I. Berdyshev,^{b††} A. G. Chentsov,^{b} and A. A. Chikrii^{a‡}

UDC 518.9

Abstract. A game problem of the successive capture of a team of evaders by a single pursuer under conditions of “simple motions” of the players is analyzed. The performance criterion is the total time required to capture all the evaders. It is assumed that the pursuer is guided by the parallel pursuit law. In such a case, the optimal response of each evader is straight-line motion with maximum speed. The original infinite-dimensional problem can therefore be reduced to two finite-dimensional problems.

Keywords: differential game, multi-evader game, order of captures, parallel pursuit.

INTRODUCTION

In differential game theory, there are a number of formalizations that determine the procedure for constructing optimal strategies of the players [1–5]. Because of the high complexity of game problems, consideration is often restricted to sufficient resolvability conditions or, in other words, to obtaining some guaranteed result. For single-pursuer–single-evader problems, which are central in this field, efficient methods have been developed that solve a wide range of problems [1–3]. A natural generalization of this range of problems is pursuit–evasion problems with groups of participants, the so-called problems of group or successive pursuit and of evasion from a group. The study of such game problems began in the mid-1970s. The results on group pursuit and on evasion can be found in [1, 6–8] and in [4, 9], respectively.

The present paper considers a problem of successive pursuit (the dynamic traveling salesman game problem), where the performance criterion is the total time required to capture all the evaders. The problem consists of two inseparably connected parts: enumeration (determining the order of capture) and pursuit for a specified order of service. The first part pertains to the target assignment problem, which is well known to rocket and space engineers and is the most difficult link in the solution of the original problem. Many useful recommendations on this topic can be found in [10], which employs the dynamic programming method to solve routing problems. Related problems are considered in [11]. If the order of capturing the evaders is specified, it is expedient to apply the above-mentioned efficient methods of pursuit. It should be emphasized that the key difficulty is that both parts of the original problem (enumeration and control) are inseparable and have to be solved simultaneously.

Noteworthy are the studies of Maslov and his colleagues [12–14] on the successive pursuit problem. Scientists from St. Petersburg [15] and Ekaterinburg [16] have worked on similar problems. Of special interest is the paper [17], where the order of pursuit is determined by positions.

In its methodology, the present study is closest to [18, 19], where, following the idea stated earlier, the method of resolving functions [1] is applied, which substantiates parallel pursuit based on Minkowski functionals. Thus, when the pursuer employs the algorithm of parallel pursuit for a prescribed order of capture of the evaders, it remains to determine the optimal response of each evader. The dynamics of each player is assumed to be simple, and the pursuer has some advantage over each evader. This allows the initial infinite-dimensional optimization problem to be reduced to two mutually dependent finite-dimensional problems: continuous optimization of the total capture time with respect to the constant controls of the evaders and discrete optimization with respect to the order of capture.

The present paper is related to the studies [1, 10, 12–19] and develops the studies [18–20].

1060-0396/10/4605-0718 © 2010 Springer Science+Business Media, Inc.

^a V. M. Glushkov Institute of Cybernetics, National Academy of Sciences of Ukraine, Kyiv, Ukraine, [email protected]; [email protected].

^b Institute of Mathematics and Mechanics, Ural Branch of the Russian Academy of Sciences, Ekaterinburg, Russia, ^{††}[email protected].

Translated from Kibernetika i Sistemnyi Analiz, No. 5, pp. 40–45, September–October 2010. Original article submitted February 17, 2010.


PROBLEM FORMULATION

Consider a problem in which one pursuer ($x$) successively captures $N$ evaders ($y_i$), which try to avoid the approach. The dynamics of the objects are defined by the differential equations

$$\dot{x} = u, \quad x \in \mathbb{R}^n, \quad x(0) = x_0, \quad \|u\| \le 1,$$

$$\dot{y}_i = v_i, \quad y_i \in \mathbb{R}^n, \quad y_i(0) = y_i^0, \quad \|v_i\| \le \beta_i < 1, \quad i = 1, \dots, N, \tag{1}$$

where the controls $u(\cdot)$ and $v_i(\cdot)$ are Lebesgue measurable functions. In the formulas below, $y_i$ written without a time argument denotes the initial position $y_i^0$ of the $i$th evader. By a capture we mean that the coordinates of the objects coincide, $x(T_i) = y_i(T_i)$, at some instant of time $T_i$. The task of the pursuer is to catch all the evaders as soon as possible; the evaders, in turn, try to prevent this.

Various formalizations of game approach problems with a group objective are presented in [1, 12–15, 18, 19]. As follows from these studies, constructive analytic solutions can be obtained only for games with two evaders. For a greater number of evaders, the optimal behavior of the players can be found only numerically. However, a numerical study is feasible only for small values of $N$, and such problems become intractable for $N$ as small as 5 [18]. The complexity of these problems becomes obvious if we take into account that decisions have to be made in real time. Hence, the methods and algorithms that implement decision making must be fast. In this paper, we propose a method for analyzing the traveling salesman game problem that allows the capture problem (1) to be solved efficiently for a large number of evaders.

APPROACHES FOR SOLVING

The paper [1] provides a detailed description of problem (1) and approaches to its solution. Its main results can be briefly formulated as follows. The order of capture is programmed by the pursuer, i.e., it is determined at the beginning of the game and is not changed later. For a fixed order of capture, the pursuer employs the strategy of parallel pursuit of each successive evader:

$$u(x, y, v) = v - \left( \langle v, e \rangle + \sqrt{\langle v, e \rangle^2 + 1 - \|v\|^2} \right) e, \qquad e = \frac{x - y}{\|x - y\|},$$

where $x$ and $y$ are the positions of the pursuer and the evader at the time the pursuit of this evader starts. Under these assumptions, it is proved in [1] that, first, the pursuer can catch all the evaders for any of their feasible controls (1); second, to maximize the capture time $T_N$, the evaders should move in constant directions with maximum speed ($v_i(t) \equiv v_i$, $t \in [0, \infty)$, $\|v_i\| = \beta_i$, $i = 1, \dots, N$).
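The parallel pursuit law can be illustrated with a minimal Python sketch. It assumes the evader control is written `v` with speed below 1, takes `e` as the unit vector from the evader to the pursuer, and chooses the scalar `mu` so that the pursuer control has unit length; the function name `parallel_pursuit_control` is ours and purely illustrative.

```python
import math

def parallel_pursuit_control(x, y, v):
    """Pursuer control u = v - mu * e with |u| = 1, where e = (x - y) / |x - y|.

    x, y: current positions of pursuer and evader; v: evader velocity, |v| < 1.
    """
    diff = [xi - yi for xi, yi in zip(x, y)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist == 0.0:
        raise ValueError("pursuer and evader coincide: capture has already occurred")
    e = [c / dist for c in diff]
    p = sum(vi * ei for vi, ei in zip(v, e))      # inner product <v, e>
    v2 = sum(vi * vi for vi in v)                 # |v|^2, assumed to be < 1
    mu = p + math.sqrt(p * p + 1.0 - v2)          # positive root of |v - mu e| = 1
    return [vi - mu * ei for vi, ei in zip(v, e)]

# Pursuer at the origin, evader at (1, 0) moving "upward" with speed 0.5.
u = parallel_pursuit_control([0.0, 0.0], [1.0, 0.0], [0.0, 0.5])
```

In this example the pursuer copies the evader's transverse velocity component and spends the rest of its unit speed on closing the distance, so the line of sight keeps its direction.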

Thus, the original game problem, which is in general infinite-dimensional, reduces to two finite-dimensional optimization problems: (i) maximization of the total capture time $T_N(v_1, \dots, v_N)$ for a fixed order on the set of vectors $\{v_i \in \mathbb{R}^n : \|v_i\| = \beta_i,\ i = 1, \dots, N\}$ and (ii) discrete optimization with respect to the choice of the order of capture, which minimizes these maximum capture times.

The present paper studies the first problem, i.e., maximization of the total capture time for a fixed order. The reason is that this problem accounts for most of the difficulties, and how fast and how accurately the whole problem is solved depends mainly on the efficiency of its solution.

As to the second problem, note that it generally involves the enumeration of $N!$ alternative orders of capture. For each such alternative, the total capture time does not depend on the others. Therefore, this part of the solution algorithm parallelizes well, and if it is implemented on a multiprocessor system, the optimization time will decrease almost in inverse proportion to the number of processors.
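The enumeration over orders can be sketched as follows. This is only an outline under stated assumptions: `best_capture_order` and the callback `worst_case_time` are hypothetical names, and the evaluator used in the demonstration is a toy stand-in (motionless targets, unit pursuer speed), not the maximized capture time $T_N$ of the paper.

```python
import itertools
import math

def best_capture_order(n_evaders, worst_case_time):
    """Return (order, time) minimizing worst_case_time over all N! capture orders."""
    best_order, best_time = None, math.inf
    for order in itertools.permutations(range(n_evaders)):
        t = worst_case_time(order)        # each order is evaluated independently
        if t < best_time:
            best_order, best_time = order, t
    return best_order, best_time

# Toy stand-in evaluator (NOT the paper's T_N): targets do not move,
# so the "capture time" is the path length at unit pursuer speed.
start = (0.0, 0.0)
targets = [(3.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

def static_tour_time(order):
    pos, total = start, 0.0
    for i in order:
        total += math.dist(pos, targets[i])
        pos = targets[i]
    return total

order, time_needed = best_capture_order(len(targets), static_tour_time)
```

Because the calls to `worst_case_time` do not share state, the loop could be distributed over worker processes, for example with `concurrent.futures.ProcessPoolExecutor`, which is one way to obtain the near-linear speedup mentioned above.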

OPTIMIZATION PROBLEM

The original game problem (1) for a fixed order of capture is reduced to a constrained optimization problem [21], where the functional and the constraints can be written in the simplest form as follows:

$$T_N = \alpha_N \|x_N - y_N\| \to \max, \tag{2}$$

$$G_k(x_{k-1}, x_k) = \alpha_k \|x_k - y_k\| - \alpha_{k-1} \|x_{k-1} - y_{k-1}\| - \|x_k - x_{k-1}\| = 0, \quad k = 1, \dots, N,$$

$$y_0 = x_0, \quad \alpha_0 = 0, \quad \alpha_k = 1/\beta_k. \tag{3}$$


In this statement, the maximization is with respect to the vectors $x_k \in \mathbb{R}^n$, which designate the points of capture of the $k$th evader:

$$x_k = x_{k-1} + (T_k - T_{k-1}) u_k = y_k + T_k v_k, \quad k = 1, \dots, N, \tag{4}$$

where $T_k$ is the instant of capture of the $k$th evader, $v_k$ is the control of the $k$th evader ($v_k \in \mathbb{R}^n$ and $\|v_k\| = \beta_k$), and $u_k$ is the control of the pursuer on the interval $[T_{k-1}, T_k)$ ($u_k \in \mathbb{R}^n$ and $\|u_k\| = 1$). Noteworthy is the obvious relation $T_k = \alpha_k \|x_k - y_k\|$. Therefore, the functional $T_N$ in (2) determines the total capture time, and constraints (3) reflect the relations among the velocities of the objects, the instants of time, and the geometrical coordinates of the points of capture.
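For given constant evader velocities, the capture instants and capture points can be computed one evader at a time. The sketch below assumes that `evaders` lists pairs of initial position and constant velocity in the capture order, that every evader speed is below 1, and that the pursuer moves at unit speed straight to each capture point; the name `capture_schedule` is illustrative. For each evader, the condition that the pursuer's travel distance equals the elapsed time is a quadratic equation in the capture instant, and the root not smaller than the previous capture instant is taken.

```python
import math

def capture_schedule(x0, evaders):
    """Capture instants T_k and capture points x_k for a fixed capture order.

    evaders: list of (y_k, v_k), initial position and constant velocity, |v_k| < 1.
    """
    x_prev, t_prev = list(x0), 0.0
    times, points = [], []
    for y, v in evaders:
        b = [yi - xi for yi, xi in zip(y, x_prev)]       # y_k - x_{k-1}
        bv = sum(bi * vi for bi, vi in zip(b, v))
        b2 = sum(bi * bi for bi in b)
        c = 1.0 - sum(vi * vi for vi in v)               # 1 - |v_k|^2 > 0
        s = t_prev + bv
        # |y_k + T v_k - x_{k-1}| = T - T_{k-1} squared gives
        # c T^2 - 2 s T + (T_{k-1}^2 - |b|^2) = 0; take the root with T >= T_{k-1}.
        t = (s + math.sqrt(s * s - c * (t_prev * t_prev - b2))) / c
        x = [yi + t * vi for yi, vi in zip(y, v)]        # x_k = y_k + T_k v_k
        times.append(t)
        points.append(x)
        x_prev, t_prev = x, t
    return times, points

# Pursuer at the origin; two evaders on the same ray, both fleeing at speed 0.5.
times, points = capture_schedule(
    (0.0, 0.0),
    [((1.0, 0.0), (0.5, 0.0)), ((3.0, 0.0), (0.5, 0.0))],
)
```

In this collinear example the first capture occurs at time 2 at the point (2, 0), and the second at time 6 at the point (6, 0).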

In fact, the optimization problem (2), (3) amounts to maximizing the function $T_N(v_1, \dots, v_N)$ on the set $\underbrace{S^{n-1} \times \dots \times S^{n-1}}_{N}$ (the Cartesian product of $N$ copies of the $(n-1)$-dimensional sphere). It is required to find the global maximum of this function. Note, however, that computational optimization methods are local, i.e., they find only a local maximum of the function in the vicinity of the initial point of the algorithm. Therefore, as a rule, the optimization algorithm is started from several points (from some sufficiently representative set of initial points), and the best optimum obtained is assumed to be global. In this problem, however, such a general approach runs into a serious difficulty: if $M$ is the number of initial points on each of the $N$ spheres $S^{n-1}$, then the total number of initial points of the optimization algorithm on the constraint set is $M^N$, and this number may be huge for sufficiently large $M$ and $N$.

To overcome this difficulty, it is proposed to apply necessary extremum conditions in the form of the Lagrange multiplier rule [21]. Note that passing from an optimization problem to the solution of a system of nonlinear equations does not generally offer special advantages for the numerical solution. However, in this case, the necessary conditions provide important analytic information about the general form of the extremum points.

NECESSARY EXTREMUM CONDITIONS

The Lagrange multiplier rule [21] for problem (2), (3) can be formulated as follows.

If a vector $(x_1^*, \dots, x_N^*) \in (\mathbb{R}^n)^N$ is a local maximum of functional (2) under constraints (3), then there exists a nonzero vector $(\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N$ such that for the function

$$f(x_1, \dots, x_N, \lambda_1, \dots, \lambda_N) = \alpha_N \|x_N - y_N\| + \sum_{k=1}^{N} \lambda_k G_k(x_{k-1}, x_k)$$

the derivatives satisfy

$$\frac{\partial f}{\partial x_k}(x_1^*, \dots, x_N^*, \lambda_1, \dots, \lambda_N) = 0 \quad \text{for } k = 1, \dots, N.$$

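Let us write these relations in explicit form:

$$\frac{\partial f}{\partial x_k} = \lambda_k \frac{\partial G_k}{\partial x_k} + \lambda_{k+1} \frac{\partial G_{k+1}}{\partial x_k} = \lambda_k \left( \alpha_k \frac{x_k - y_k}{\|x_k - y_k\|} - \frac{x_k - x_{k-1}}{\|x_k - x_{k-1}\|} \right) - \lambda_{k+1} \left( \alpha_k \frac{x_k - y_k}{\|x_k - y_k\|} - \frac{x_{k+1} - x_k}{\|x_{k+1} - x_k\|} \right) = 0, \quad k = 1, \dots, N-1, \tag{5}$$

$$\frac{\partial f}{\partial x_N} = \alpha_N \frac{x_N - y_N}{\|x_N - y_N\|} + \lambda_N \left( \alpha_N \frac{x_N - y_N}{\|x_N - y_N\|} - \frac{x_N - x_{N-1}}{\|x_N - x_{N-1}\|} \right) = 0. \tag{6}$$

By (4), $x_k - y_k = T_k v_k$ with $\|v_k\| = \beta_k$, so that $(x_k - y_k)/\|x_k - y_k\| = \alpha_k v_k$; likewise, $(x_k - x_{k-1})/\|x_k - x_{k-1}\| = u_k$. Considering this, we write relations (5) as follows:

$$\lambda_k (\alpha_k^2 v_k - u_k) = \lambda_{k+1} (\alpha_k^2 v_k - u_{k+1}), \quad k = 1, \dots, N-1. \tag{7}$$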


Note that $\|\alpha_k^2 v_k\| = \alpha_k > 1 = \|u_k\| = \|u_{k+1}\|$; therefore, the vectors in parentheses in (7) are nonzero. Thus, if one of the numbers $\lambda_k$ were equal to zero, then all the other numbers $\lambda_i$ ($i = 1, \dots, N$) would also be zero, which contradicts the Lagrange multiplier rule. Hence, $\lambda_k \ne 0$ for all $k = 1, \dots, N$.

Equation (7) can be rearranged as

$$u_{k+1} = u_k + \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) w_k, \qquad w_k = \alpha_k^2 v_k - u_k \ne 0.$$

Since $\|u_{k+1}\| = \|u_k\| = 1$, this equation has two solutions:

$$u_{k+1} = u_k,$$

$$u_{k+1} = u_k - 2 \langle u_k, e_k \rangle e_k, \qquad e_k = \frac{w_k}{\|w_k\|}. \tag{8}$$

Obviously, the first solution is not suitable: it gives the minimum time for the motion from the point $x_{k-1}$ to $x_{k+1}$. The form of the optimal control (8) was mentioned earlier (for example, in [12]) and is called the rule of reflection from a circle.
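The reflection rule can be checked numerically with a short Python sketch. It assumes that the evader control is written `v_k` with speed `beta_k`, that $\alpha_k = 1/\beta_k$, and that the reflecting direction is $w_k = \alpha_k^2 v_k - u_k$; the function name `reflect_control` is illustrative.

```python
import math

def reflect_control(u_k, v_k, beta_k):
    """Next pursuer direction: u_k reflected in the hyperplane orthogonal to w_k."""
    alpha2 = 1.0 / (beta_k * beta_k)
    w = [alpha2 * vi - ui for vi, ui in zip(v_k, u_k)]   # w_k = alpha_k^2 v_k - u_k
    wn = math.sqrt(sum(c * c for c in w))                # nonzero: |alpha_k^2 v_k| > |u_k|
    e = [c / wn for c in w]
    ue = sum(ui * ei for ui, ei in zip(u_k, e))          # inner product <u_k, e_k>
    return [ui - 2.0 * ue * ei for ui, ei in zip(u_k, e)]

# Pursuer currently heading along (1, 0); the captured evader moved along (0, 0.5).
u_next = reflect_control([1.0, 0.0], [0.0, 0.5], 0.5)
```

Here `u_next` equals (0.6, 0.8): it again has unit length, and the difference between $\alpha_k^2 v_k = (0, 2)$ and `u_next` is parallel to $w_k = (-1, 2)$, which is what the rearranged stationarity condition requires.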

Thus, the necessary extremum conditions make it possible to determine the optimal control $u_{k+1}$ from the given values of $u_k$, $T_k$, and $x_k$ (formulas (8) and (4)). To find the $T_{k+1}$ that corresponds to the control $u_{k+1}$, constraints (3) should be rearranged as

$$\alpha_{k+1} \|x_k + (T_{k+1} - T_k) u_{k+1} - y_{k+1}\| - T_{k+1} = 0.$$

After squaring, this equation is quadratic in $T_{k+1}$, and its roots are

$$T_{k+1} = \frac{\alpha_{k+1}^2}{\alpha_{k+1}^2 - 1} \left( T_k + \langle u_{k+1}, a \rangle \pm \sqrt{\alpha_{k+1}^{-2} \|T_k u_{k+1} + a\|^2 - \|a\|^2 + \langle u_{k+1}, a \rangle^2} \right), \tag{9}$$

where $a = y_{k+1} - x_k$. The greater root should be taken since (other things being equal) it gives a greater capture time. More precisely, it is possible to show formally that if the other root is taken, an appropriate choice of $x_k$ can increase the total capture time $T_N$, the values $x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_N$ remaining the same.

Note that the radicand in (9) may turn out to be negative. This means that for the proposed values of $u_k$, $T_k$, and $x_k$, there is no control $u_{k+1}$ that simultaneously satisfies the necessary optimality conditions (5) and constraints (3).

REDUCING THE OPTIMIZATION PROBLEM TO ONE EQUATION

Let us make two simple remarks. First, the set of points that satisfy constraint (3) for $k = 1$, i.e., $\alpha_1 \|x_1 - y_1\| = \|x_1 - x_0\|$, is a sphere (the so-called Apollonius sphere)

$$\|x_1 - d\| = R, \tag{10}$$

where $d = \dfrac{\alpha_1^2 y_1 - x_0}{\alpha_1^2 - 1}$ and $R = \dfrac{\alpha_1}{\alpha_1^2 - 1} \|y_1 - x_0\|$. Second, it follows from the necessary conditions (6) that the optimal vector $u_N$ should be parallel to the vector $x_N - y_N$. Considering (4), we may conclude that $u_N$ should be parallel to $x_{N-1} - y_N$. To increase the capture time, the $N$th evader should move in the direction from the point $x_{N-1}$ to $y_N$, i.e., directly away from the pursuer. Therefore, to maximize the total capture time, the following equality should hold:

$$\left\langle u_N, \frac{y_N - x_{N-1}}{\|y_N - x_{N-1}\|} \right\rangle = 1.$$

Thus, the necessary optimality conditions reduce the original optimization problem (2), (3) to finding all the roots of one nonlinear equation $F(x) = 0$ defined on the Apollonius sphere (10).
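The center and radius of the Apollonius sphere are easy to verify numerically. The sketch below assumes a pursuer of unit speed starting at `x0` and a first evader starting at `y1` with speed `beta1` below 1; the name `apollonius_sphere` is illustrative, and the demonstration is planar only for convenience.

```python
import math

def apollonius_sphere(x0, y1, beta1):
    """Center d and radius R of the set {x : |x - x0| = |x - y1| / beta1}."""
    alpha2 = 1.0 / (beta1 * beta1)                       # alpha_1^2
    center = [(alpha2 * yi - xi) / (alpha2 - 1.0) for yi, xi in zip(y1, x0)]
    radius = math.sqrt(alpha2) * math.dist(y1, x0) / (alpha2 - 1.0)
    return center, radius

x0, y1, beta1 = (0.0, 0.0), (1.0, 0.0), 0.5
center, radius = apollonius_sphere(x0, y1, beta1)

# Every point of the circle should be reachable by both players at the same instant.
circle = [
    (center[0] + radius * math.cos(t), center[1] + radius * math.sin(t))
    for t in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
]
mismatch = max(abs(math.dist(p, x0) - math.dist(p, y1) / beta1) for p in circle)
```

For these data the center is (4/3, 0) and the radius is 2/3, and `mismatch` stays at the level of rounding error.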


For an arbitrary vector $x$ lying on the Apollonius sphere (10), the function $F(x)$ is defined recursively as follows:

(i) set $x_1 = x$, $T_1 = \|x_1 - x_0\|$, $u_1 = \dfrac{x_1 - x_0}{T_1}$;

(ii) (recursion step) given $x_k$, $T_k$, $u_k$ ($k = 1, \dots, N-1$), find (see (8) and (9))

$$w = \alpha_k^2 \frac{x_k - y_k}{T_k} - u_k, \qquad u_{k+1} = u_k - 2 \left\langle u_k, \frac{w}{\|w\|} \right\rangle \frac{w}{\|w\|},$$

$$a = y_{k+1} - x_k, \qquad D = \alpha_{k+1}^{-2} \|T_k u_{k+1} + a\|^2 - \|a\|^2 + \langle u_{k+1}, a \rangle^2;$$

if $D < 0$, then formally set $F(x) = -1$ and stop; if $D \ge 0$, then continue the process of determining the function:

$$T_{k+1} = \frac{\alpha_{k+1}^2}{\alpha_{k+1}^2 - 1} \left( T_k + \langle u_{k+1}, a \rangle + \sqrt{D} \right), \qquad x_{k+1} = x_k + (T_{k+1} - T_k) u_{k+1};$$

repeat the recursion step for $k = 1, \dots, N-1$;

(iii) $F(x) = \left\langle u_N, \dfrac{y_N - x_{N-1}}{\|y_N - x_{N-1}\|} \right\rangle - 1$.

Among the roots of the equation $F(x) = 0$, choose those that give the maximum value of $T_N$. This value of $T_N$ is the solution of the original optimization problem, and the corresponding controls of the evaders have the form $v_k = \dfrac{x_k - y_k}{T_k}$ ($k = 1, \dots, N$).
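The recursive definition of $F$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' cluster implementation: `ys` and `betas` hold the initial positions and speeds of the evaders in the capture order, $\alpha_k = 1/\beta_k$, the reflecting direction is taken as $w = \alpha_k^2 (x_k - y_k)/T_k - u_k$, and the greater root of the quadratic equation for $T_{k+1}$ is used. The name `residual_F` and the returned tuple are our own choices.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def residual_F(x, x0, ys, betas):
    """Evaluate F at a capture point x of the first evader on the Apollonius sphere.

    Returns (F, times, points); F is set to -1.0 if a radicand D is negative.
    """
    xk = list(x)
    tk = norm([p - q for p, q in zip(xk, x0)])               # T_1 = |x_1 - x_0|
    uk = [(p - q) / tk for p, q in zip(xk, x0)]              # u_1
    times, points = [tk], [xk]
    x_prev = list(x0)                                        # x_{N-1}; equals x_0 if N = 1
    for k in range(len(ys) - 1):                             # recursion step
        alpha2 = 1.0 / betas[k] ** 2
        w = [alpha2 * (p - q) / tk - r for p, q, r in zip(xk, ys[k], uk)]
        wn = norm(w)
        e = [c / wn for c in w]
        ue = dot(uk, e)
        u_next = [p - 2.0 * ue * q for p, q in zip(uk, e)]   # reflection rule
        a = [p - q for p, q in zip(ys[k + 1], xk)]           # a = y_{k+1} - x_k
        alpha2_next = 1.0 / betas[k + 1] ** 2
        ua = dot(u_next, a)
        s = [tk * p + q for p, q in zip(u_next, a)]          # T_k u_{k+1} + a
        D = dot(s, s) / alpha2_next - dot(a, a) + ua * ua
        if D < 0.0:
            return -1.0, times, points
        t_next = alpha2_next / (alpha2_next - 1.0) * (tk + ua + math.sqrt(D))
        x_next = [p + (t_next - tk) * q for p, q in zip(xk, u_next)]
        x_prev, xk, tk, uk = xk, x_next, t_next, u_next
        times.append(tk)
        points.append(xk)
    d = [p - q for p, q in zip(ys[-1], x_prev)]              # y_N - x_{N-1}
    return dot(uk, d) / norm(d) - 1.0, times, points

# Two evaders with speed 0.5; x is the point of the circle with center (4/3, 0)
# and radius 2/3 at polar angle pi/2.
x0 = (0.0, 0.0)
ys = [(1.0, 0.0), (3.0, 0.0)]
betas = [0.5, 0.5]
F_val, times, points = residual_F((4.0 / 3.0, 2.0 / 3.0), x0, ys, betas)
```

Since $F$ is an inner product of two unit vectors minus one, it never exceeds zero, so its roots are exactly the points where it attains the value zero. On a plane one could, for instance, sample the circle by polar angle and refine near the samples where $F$ comes closest to zero; this search strategy is our suggestion and is not prescribed by the text.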

Thus, instead of finding the global maximum of the total time function $T_N(v_1, \dots, v_N)$ on the Cartesian product of $N$ spheres $(S^{n-1})^N$, we propose to search for all the roots of the equation $F(x) = 0$ on a single sphere $S^{n-1}$ of the form (10). This problem appears to be much simpler, even though the function $F$ has a rather complex form and is defined recursively. The main result of the study can be briefly formulated as the following statement.

THEOREM. Solving the traveling salesman game problem (1) with a predetermined order of capture reduces to finding all the roots of the equation $F(x) = 0$ (for the function $F$ constructed as specified) lying on the Apollonius sphere (10).

CONCLUSIONS

The proposed approach to the solution of the dynamic traveling salesman game problem was implemented in software for the multiprocessor cluster system at the V. M. Glushkov Institute of Cybernetics of the NAS of Ukraine. The numerical calculations have shown the high efficiency of the developed algorithm. In the game problem of successive pursuit on a plane, the optimal order of capture of 11 to 12 players was synthesized almost in real time.

REFERENCES

1. A. A. Chikrii, Conflict-Controlled Processes [in Russian], Naukova Dumka, Kyiv (1992).

2. N. N. Krasovskii and A. I. Subbotin, Positional Differential Games [in Russian], Nauka, Moscow (1974).

3. L. S. Pontryagin, Selected Works [in Russian], Vol. 2, Nauka, Moscow (1988).

4. B. N. Pshenichnyi and V. V. Ostapenko, Differential Games [in Russian], Naukova Dumka, Kyiv (1992).

5. A. I. Subbotin and A. G. Chentsov, Guarantee Optimization in Control Game Problems [in Russian], Nauka, Moscow (1981).

6. N. L. Grigorenko, Mathematical Methods of Control of Several Dynamic Processes [in Russian], Izd. MGU, Moscow (1990).

7. A. I. Blagodatskikh and N. N. Petrov, Conflict Interaction of Groups of Controlled Objects [in Russian], Izd. Udm. Gos. Univ., Izhevsk (2009).

8. Yu. S. Ledyaev, “Optimal robust discontinuous feedback control in differential games of group pursuit,” in: Proc. Int. Conf. Differential Equations and Topology, ded. 100th Anniversary of L. S. Pontryagin (Moscow, June 17–22, 2008), Lomonosov State Univ., Moscow (2008), p. 265.

9. A. A. Chikrii, “The problem of avoidance for controlled dynamic objects,” J. Math., Game Theory and Algebra, 7, No. 2/3, 81–94 (1998).

10. A. G. Chentsov, Extremum Routing and Task Allocation Problems: Theory Matters [in Russian], NITz “Regulyarn. Khaotich. Dinamika,” Moscow–Izhevsk (2008).

11. M. F. Kaspshitskaya and V. V. Glushkova, “On the stability of a solution to the traveling salesman problem with moving objects,” Kibernetika, No. 3, 121–122 (1986).

12. E. P. Maslov and E. Ya. Rubinovich, “Differential games with a group objective,” in: Itogi Nauki Tekhn., Ser. Tekhn. Kibernetika, 32, VINITI, Moscow (1991), pp. 32–59.

13. I. I. Shevchenko, Geometry of Alternative Pursuit [in Russian], Izd. Dal’nevost. Univ., Vladivostok (2003).

14. I. I. Shevchenko, Approach Strategies with General Coalitions [in Russian], Izd. Dal’nevost. Univ., Vladivostok (2004).

15. L. A. Petrosyan and V. D. Shiryaev, “Group pursuit of several evaders by one pursuer,” Vestn. LGU, 13, No. 3, 50–57 (1980).

16. Yu. I. Berdyshev, “On a nonlinear problem of sequential control with parameter,” Izv. RAN, Teoriya Sit. Upravl., Issue 3 (2008), pp. 58–63.

17. J. V. Breakwell and P. Hagedorn, “Point capture of two evaders in succession,” J. Optim. Theory and Appl., 27, No. 1, 90–97 (1979).

18. A. A. Chikrii, L. A. Sobolenko, and S. F. Kalashnikova, “A numerical method for the solution of the successive pursuit-and-evasion problem,” Cybernetics, 24, No. 1, 53–59 (1988).

19. A. A. Chikrii and S. F. Kalashnikova, “Pursuit of a group of evaders by a single controlled object,” Cybernetics, 23, No. 4, 437–445 (1987).

20. A. Belousov and A. Chikrii, “On solution of traveling salesman game problem,” in: Abstr. 12th Intern. Symp. on Dynamic Games and Applications (3–6 July 2006, Sophia-Antipolis, France) (2006), p. 31.

21. A. D. Ioffe and V. M. Tikhomirov, Theory of Extremum Problems [in Russian], Nauka, Moscow (1974).
