francesco amigoni, nicola basilico, nicola gatti {amigoni,basilico,ngatti}@elet.polimi.it

F. Amigoni, N. Basilico, N. Gatti, DEI, Politecnico di Milano

Finding the Optimal Strategies in Robotic Patrolling with Adversaries in Topologically-Represented Environments



Robotic Patrolling


A patrolling strategy determines the path followed by the robot, usually by specifying the next cell to move to


Randomized Patrolling Strategies

The patroller should adopt an unpredictable patrolling strategy, randomizing over cells and trying to reduce the intrusion risk (Pita et al., AAMAS08)


Randomized strategy: the robot determines the next cell according to a probability distribution
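
A minimal sketch of such a randomized choice. The cell names and the probabilities are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import random

def next_cell(distribution, rng=random):
    """Draw the next cell from a {cell: probability} mapping."""
    cells = list(distribution)
    weights = [distribution[c] for c in cells]
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return rng.choices(cells, weights=weights, k=1)[0]

# Hypothetical distribution over the cells adjacent to the current one.
distribution = {"north": 0.5, "east": 0.3, "south": 0.2}
rng = random.Random(0)  # seeded only to make the demo repeatable
samples = [next_cell(distribution, rng) for _ in range(1000)]
```

Because every draw is independent of the previous ones, an observer who knows the distribution still cannot tell which cell will be visited next.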


Patrolling Strategies with Adversaries

• Considering a model of the adversary (Agmon et al., AAMAS08, Paruchuri et al., AAMAS08) can provide the patrolling robot a larger expected utility than not considering it, i.e., it can lead to better strategies (Amigoni et al., IAT2008)

• The model of the adversary can include: its preferences over the possible targets, its knowledge about the patroller’s strategy, …


The Problem

The problem we addressed in this work: finding the optimal randomized patrolling strategy in an arbitrary environment while considering a model of the adversary

Our approach applies to environments with arbitrary topology, generalizing (Agmon et al., ICRA08)

[Figure from Agmon et al., ICRA08]


The Basic Patrolling Model

• Time is discrete

• Environment: represented by a directed graph, e.g., a grid of cells or a topological map (Carpin et al., IROS08)

• Single patrolling robot

  • It can move between adjacent nodes

  • It can detect a possible intruder in its current node

• Single intruder

  • It knows the strategy of the patrolling robot, for example because it can observe the patroller’s movements before attempting to intrude

  • It can directly enter any node

• Penetration time d_i is required to successfully complete an intrusion in node i

  • When attempting to penetrate node i at time t, the intruder can be detected at any time in {t, t+1, …, t+d_i}
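
The detection window can be illustrated with a small sketch. The function name, the 4-node cycle, and the deterministic patrol path are hypothetical; only the window {t, …, t+d} comes from the model above.

```python
def intrusion_outcome(patroller_path, target, t, d):
    """Outcome of an intrusion that starts in `target` at time t.

    patroller_path[k] is the node occupied by the patroller at time k.
    The intruder is captured if the patroller is in `target` at any
    time in {t, t+1, ..., t+d}; otherwise the intrusion succeeds.
    """
    window = patroller_path[t : t + d + 1]
    if len(window) < d + 1:
        raise ValueError("path too short to cover the whole window")
    return "captured" if target in window else "intruder wins"

# Hypothetical cycle 0 -> 1 -> 2 -> 3 -> 0, patrolled deterministically.
path = [k % 4 for k in range(12)]

# Patroller is at nodes 3, 0, 1, 2 during the window: node 2 is visited.
print(intrusion_outcome(path, target=2, t=3, d=3))
# Patroller is at nodes 3, 0, 1 during the window: node 2 is missed.
print(intrusion_outcome(path, target=2, t=3, d=2))
```

The deterministic path makes the weakness plain: an intruder who has observed it can always pick a time and a node for which the window is missed, which is why the strategy is randomized.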


The Basic Patrolling Model

Final States

• The intruder enters node i at time t:

  • If the patroller does not visit node i in the interval {t, t+1, …, t+d_i}, the intruder wins

  • Otherwise the intruder is captured and the patroller wins

• The intruder never enters

Utilities

• X_i, Y_i (i ∈ {1, 2, …, 13}): patroller’s and intruder’s utilities when the intruder successfully attacks node i

• X_0, Y_0: patroller’s and intruder’s utilities when the intruder is captured
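
When the intruder enters node i, these utilities combine into expected values as sketched below. The symbol P_c(s, i), the probability that a patroller currently in node s visits node i within the detection window, is introduced here for illustration and does not appear in the slides.

```latex
% P_c(s, i): hypothetical symbol for the capture probability.
\begin{align*}
\mathbb{E}[U_{\text{patroller}}] &= P_c(s,i)\, X_0 + \bigl(1 - P_c(s,i)\bigr)\, X_i,\\
\mathbb{E}[U_{\text{intruder}}]  &= P_c(s,i)\, Y_0 + \bigl(1 - P_c(s,i)\bigr)\, Y_i.
\end{align*}
```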

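[Figure: the 13-node environment (nodes numbered 1 to 13) and an excerpt of the game tree. At each level, lasting 1 time unit, the patroller (P) chooses a move, e.g., move(7), move(10), or move(12), and the intruder (I) chooses wait or enter(i), from enter(1) to enter(13).]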


Objective

The proposed method finds the probability distribution over the patroller’s movements, i.e., given the current node, the probability of moving to each adjacent node



Solving the Game

• Two competing actors: we study their behaviors in a game-theoretical framework

• The patrolling problem can be modeled as a leader-follower game

  • Two players

  • The leader commits to a strategy

  • The follower observes such commitment and acts as a best responder

• Patrolling strategy: A = {α_{i,j}}, where α_{i,j} is the probability of doing move(j) when i is the current node

• The optimal A can be derived by computing the equilibrium of the leader-follower game resorting to a bilevel optimization problem (Conitzer and Sandholm, 2006)
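
The follower's side of the game can be sketched as below. This is only an illustration under stated assumptions: the capture probabilities are hypothetical inputs (in the actual problem they depend on A), the wait action is not modeled, and ties are broken in favor of the leader, a common convention in leader-follower games that the slides do not state. The utility values are those used in the examples of this presentation.

```python
def follower_best_response(capture_prob, X, Y, X0, Y0, tol=1e-9):
    """Pick the intruder action with the highest expected intruder utility.

    capture_prob[(s, i)] is the probability of capture when the intruder
    enters node i while the patroller is in node s.
    Returns ((s, i), intruder_utility, patroller_utility).
    """
    best = None
    for (s, i), p in capture_prob.items():
        u_intruder = p * Y0 + (1 - p) * Y[i]
        u_patroller = p * X0 + (1 - p) * X[i]
        better = best is None or u_intruder > best[1] + tol
        tie = best is not None and abs(u_intruder - best[1]) <= tol
        # ASSUMPTION: ties are broken in favor of the leader (patroller).
        if better or (tie and u_patroller > best[2]):
            best = ((s, i), u_intruder, u_patroller)
    return best

# Hypothetical capture probabilities for two candidate intrusions.
capture_prob = {(5, 1): 0.25, (1, 5): 0.5}
X = {1: 0.8, 5: 0.5}
Y = {1: 0.2, 5: 0.3}
action, u_intruder, u_patroller = follower_best_response(
    capture_prob, X, Y, X0=1.0, Y0=-1.0
)
```

The leader's problem is then to choose A so that the patroller's utility under this best response is as large as possible.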


Solving Algorithm

[Diagram: game model → solving algorithm → optimal patrolling strategy that maximizes the patroller’s expected utility]

Step 1: is there any strategy A such that the game will never end?

• Single bilinear feasibility problem

• If a solution is found, it is the best patrolling strategy and the intruder will never attempt to enter

If the above problem does not admit a solution, Step 2:

• We can safely assume that the game will end, i.e., the intruder will enter

• We compute A such that the patroller’s expected payoff is maximum

• This amounts to solving a bilinear optimization problem for every possible action of the intruder
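
One way to see where the bilinear terms come from is to write the probability that an intrusion succeeds as a recursion over the patroller's moves. The symbol γ is hypothetical and introduced only for this sketch; the recursion assumes that the patroller starts in a node s different from the target z and that the detection window corresponds to d_z moves.

```latex
% \gamma^{h}_{s,j}(z): probability that the patroller, starting from s,
% is in node j after h moves without having visited the target z.
\begin{align*}
\gamma^{1}_{s,j}(z) &= \alpha_{s,j}, & j \neq z,\\
\gamma^{h}_{s,j}(z) &= \sum_{x \neq z} \gamma^{h-1}_{s,x}(z)\,\alpha_{x,j},
  & j \neq z,\ h = 2,\dots,d_z,\\
P_{\text{success}}(s,z) &= \sum_{j \neq z} \gamma^{d_z}_{s,j}(z).
\end{align*}
```

Each step multiplies two unknowns, γ and α, so constraints written on these quantities are bilinear in the decision variables.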


An Example

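X_1 = 0.8, Y_1 = 0.2, d_1 = 7

X_5 = 0.5, Y_5 = 0.5, d_5 = 7

X_0 = 1, Y_0 = -1

X_5 = 0.5, Y_5 = 0.3, d_5 = 7

[Figure: environment annotated with the move probabilities of the computed strategy; labels: 0.226, 0.774, 0.451, 0.344, 0.676, 0.102, 0.096, 0.127, 0.228, 0.898, 0.529, 0.549]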

With this strategy the game never ends, i.e., the intruder will never enter


Another Example

X_1 = 0.8, Y_1 = 0.2, d_1 = 5

X_5 = 0.5, Y_5 = 0.3, d_5 = 4

X_0 = 1, Y_0 = -1

[Figure: environment annotated with the move probabilities of the computed strategy; labels: 1, 0.546, 0.546, 0.546, 0.454, 1, 0.454, 0.454]

With this strategy, the intruder will try to enter cell 1 when the patroller is in cell 5; the expected utility of the patroller is 0.819
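
The stated utility can be checked numerically with a short sketch. It rests on an assumption about the figure, which is not recoverable from the text: the environment is taken to be a corridor of five cells 1-2-3-4-5, in which cells 2, 3, and 4 move toward cell 1 with probability 0.454 and toward cell 5 with probability 0.546, while cells 1 and 5 move inward with probability 1. The function name is hypothetical.

```python
def capture_probability(alpha, start, target, moves):
    """Probability that a patroller following the strategy `alpha`
    from `start` visits `target` within `moves` moves."""
    if start == target:
        return 1.0
    alive = {start: 1.0}  # mass of walks that have not visited target yet
    captured = 0.0
    for _ in range(moves):
        nxt = {}
        for node, mass in alive.items():
            for succ, prob in alpha[node].items():
                if succ == target:
                    captured += mass * prob
                else:
                    nxt[succ] = nxt.get(succ, 0.0) + mass * prob
        alive = nxt
    return captured

# ASSUMPTION: corridor layout and direction of the 0.454 / 0.546 labels.
alpha = {
    1: {2: 1.0},
    2: {1: 0.454, 3: 0.546},
    3: {2: 0.454, 4: 0.546},
    4: {3: 0.454, 5: 0.546},
    5: {4: 1.0},
}
X0, X1, d1 = 1.0, 0.8, 5

# Intruder enters cell 1 while the patroller is in cell 5.
p = capture_probability(alpha, start=5, target=1, moves=d1)
patroller_utility = p * X0 + (1 - p) * X1
print(round(patroller_utility, 3))  # 0.819
```

Under this reading the only way to reach cell 1 in time is to move toward it at every step, so the capture probability is 0.454 cubed, and the resulting utility agrees with the 0.819 reported on the slide.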


Model Extensions

• Augmented sensing capabilities: we introduce the range parameter r

• Synchronized multirobot setting: a single patroller able to sense an arbitrary subset of cells

X_4 = 0.8, Y_4 = 0.4

X_6 = 0.7, Y_6 = 0.5

X_12 = 0.8, Y_12 = 0.4

X_0 = 1, Y_0 = -1

[Figure: expected utility (y-axis, 0.6 to 1) versus penetration time (x-axis, 1 to 11), one curve for each range r = 1, 2, 3, 4]
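
A sketch of how a sensing range could enter the model. The slides do not define the range parameter, so this example assumes, hypothetically, that it is a hop distance on the graph; the function name and the corridor used in the demo are also assumptions.

```python
from collections import deque

def sensed_cells(adjacency, position, r):
    """Cells at most r hops away from `position` (breadth-first search).

    ASSUMPTION: the range is measured in graph hops, and the patroller's
    own cell is at distance 0.
    """
    dist = {position: 0}
    queue = deque([position])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # do not expand beyond the sensing range
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return set(dist)

# Hypothetical corridor of five cells.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(sorted(sensed_cells(adjacency, position=3, r=1)))
```

With such a set, an intruder would count as detected whenever its node falls among the sensed cells during the detection window, rather than only when the patroller stands on it.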


Conclusions and Future Works

• We presented an approach to find optimal randomized patrolling strategies in arbitrary environments with adversaries

• Future works

  • Accounting for the intruder’s movements and limited observation capabilities

  • Extending our framework with multiple non-synchronized patrollers
