


PROTECT: An Application of Computational Game Theory for the Security of the Ports of the United States

Eric Shieh, Bo An, Rong Yang, Milind Tambe
University of Southern California

{eshieh, boa, yangrong, tambe}@usc.edu

Craig Baldwin, Joseph DiRenzo, Ben Maule, Garrett Meyer
United States Coast Guard

{Craig.W.Baldwin, Joseph.DiRenzo, Ben.J.Maule, Garrett.R.Meyer}@uscg.mil

Abstract

Building upon previous security applications of computational game theory, this paper presents PROTECT, a game-theoretic system deployed by the United States Coast Guard (USCG) in the port of Boston for scheduling their patrols. USCG has termed the deployment of PROTECT in Boston a success, and efforts are underway to test it in the port of New York, with the potential for nationwide deployment. PROTECT is premised on an attacker-defender Stackelberg game model and offers five key innovations. First, this system is a departure from the assumption of perfect adversary rationality noted in previous work, relying instead on a quantal response (QR) model of the adversary's behavior; to the best of our knowledge, this is the first real-world deployment of the QR model. Second, to improve PROTECT's efficiency, we generate a compact representation of the defender's strategy space, exploiting equivalence and dominance. Third, we show how to practically model a real maritime patrolling problem as a Stackelberg game. Fourth, our experimental results illustrate that PROTECT's QR model handles real-world uncertainties more robustly than a perfect rationality model. Finally, in evaluating PROTECT, this paper provides real-world data: (i) a comparison of human-generated vs. PROTECT security schedules, and (ii) results from an Adversarial Perspective Team's (human mock attackers) analysis.¹

Introduction

The global need for security of key infrastructure with limited resources has led to significant interest in multiagent systems research on game theory for real-world security. In fact, three applications based on Stackelberg games have been transitioned to real-world deployment: ARMOR, used by the Los Angeles International Airport to randomize checkpoints of roadways and canine patrols; IRIS, which helps the US Federal Air Marshal Service in scheduling air marshals on international flights; and GUARDS, which is under evaluation by the US

Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

¹ This paper modifies the original paper (Shieh et al. 2012) with the inclusion of the Background section and a more condensed description of the Game Modeling, Compact Representation, and Robustness Analysis sections to provide greater accessibility for a broader audience.

Transportation Security Administration to allocate the resources available for airport protection (Tambe 2011). Yet we as a community of agents and AI researchers remain in the early stages of these deployments, and must continue to develop our understanding of core principles of innovative applications of game theory for security.

To this end, this paper presents a new game-theoretic security application to aid the United States Coast Guard (USCG), called Port Resilience Operational/Tactical Enforcement to Combat Terrorism (PROTECT). The USCG's mission includes maritime security of the US coasts, ports, and inland waterways, a security domain that faces increased risks in the context of threats such as terrorism and drug trafficking. Given a particular port and the variety of critical infrastructure that an adversary may attack within the port, USCG conducts patrols to protect this infrastructure; however, while the adversary has the opportunity to observe patrol patterns, limited security resources imply that USCG patrols cannot be at every location 24/7. To assist the USCG in allocating its patrolling resources, similar to previous applications (Tambe 2011), PROTECT uses an attacker-defender Stackelberg game framework, with USCG as the defender against terrorist adversaries that conduct surveillance before potentially launching an attack. PROTECT's solution is typically a mixed strategy, i.e., randomized patrol patterns that take into account the importance of different targets and the adversary's surveillance and anticipated reaction to USCG patrols.

While PROTECT builds on previous work, this paper highlights five key innovations. The first is PROTECT's departure from the assumption of perfect rationality on the part of the human adversaries. While appropriate in the initial applications as a first step, this assumption of perfect rationality is well recognized as a limitation of classical game theory, and bounded rationality has received significant attention in behavioral game-theoretic approaches (Camerer 2003). Within this behavioral framework, quantal response equilibrium has emerged as a promising approach to model human bounded rationality (Camerer 2003; McKelvey and Palfrey 1995; Wright and Leyton-Brown 2010), including recent results illustrating the benefits of the quantal response (QR) model in security games contexts (Yang et al. 2011). Therefore, PROTECT uses a novel algorithm called PASAQ (Yang, Tambe, and Ordonez 2012) based on the QR

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 2173-2177.


model of a human adversary. To the best of our knowledge, this is the first time that the QR model has been used in a real-world security application.

Second, PROTECT improves PASAQ's efficiency via a compact representation of defender strategies, with experimental results showing the significant benefits of this compact representation. Third, PROTECT addresses practical concerns of modeling a real-world maritime patrolling application in a Stackelberg framework. Fourth, this paper presents a detailed simulation analysis of PROTECT's robustness to uncertainty that may arise in the real world. For various cases of added uncertainty, the paper shows that PROTECT's quantal-response-based approach leads to significantly improved robustness when compared to an approach that assumes full attacker rationality.

PROTECT has been in use at the port of Boston since April 2011 and has been evaluated by the USCG. This evaluation brings forth our final key contribution: for the first time, this paper provides real-world data comparing human-generated and game-theoretic schedules. We also provide results from an Adversarial Perspective Team's (APT) analysis and comparison of patrols before and after the use of the PROTECT system from the viewpoint of an attacker. Given the success of PROTECT in Boston, we are now extending it to the port of New York, and based on the outcome there, it may potentially be extended to other ports in the US.

Figure 1: USCG boats patrolling the ports of Boston and NY. (a) PROTECT in use in Boston; (b) extending PROTECT to NY.

Background

Stackelberg Game: A generic Stackelberg game has two players, a leader and a follower (Fudenberg and Tirole 1991). The leader commits to a strategy first, and then the follower optimizes its reward, considering the action chosen by the leader (von Stengel and Zamir 2004). Each player has a set of possible pure strategies, i.e., the actions that they can execute. A mixed strategy allows a player to play a probability distribution over pure strategies. Payoffs for each player are defined over all possible pure-strategy outcomes for both players. The payoff functions are extended to mixed strategies by taking the expectation over pure-strategy outcomes. The follower can observe the leader's strategy, and then act in a way to optimize its own payoffs.

Stackelberg games are used to model the attacker-defender strategic interaction in security domains. The defender commits to a mixed (randomized) strategy, whereas the attacker conducts surveillance of these mixed strategies

and responds with a pure strategy of an attack on a target. The objective of this framework is to find the optimal mixed strategy for the defender. Examples of Stackelberg games can be found in (Tambe 2011).

Quantal Response: Previous applications of Stackelberg security games assumed that the attacker is perfectly rational, i.e., chooses a strategy that maximizes his expected utility (Tambe 2011). However, in many real-world domains, agents face human adversaries whose behavior may not match the optimum predicted by perfect rationality.

Quantal Response Equilibrium is an important model in behavioral game theory that has received widespread support in the literature for its superior ability to model human behavior in simultaneous-move games (McKelvey and Palfrey 1995; Wright and Leyton-Brown 2010). It suggests that instead of strictly maximizing utility, individuals respond stochastically in games: the chance of selecting a non-optimal strategy increases as the cost of such an error decreases. We assume that the attacker acts with bounded rationality; the defender is assisted by software, and thus we compute the defender's optimal rational strategy (Yang et al. 2011). Given the strategy of the defender, the quantal best response of the attacker is defined as

q_i = \frac{e^{\lambda G^a_i(x_i)}}{\sum_{j=1}^{T} e^{\lambda G^a_j(x_j)}}    (1)

The parameter λ ∈ [0, ∞] represents the amount of noise in the attacker's strategy: a λ value of 0 represents a uniform random probability over attacker strategies, while a value of ∞ represents a perfectly rational attacker. q_i corresponds to the probability that the attacker chooses target i; G^a_i(x_i) corresponds to the attacker's expected utility of attacking target i given x_i, the probability that the defender covers target i; and T is the total number of targets.
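For concreteness, Equation 1 can be sketched in a few lines of Python. This is an illustrative re-implementation, not the deployed PROTECT code; the utility values in the example are invented.

```python
import math

def quantal_response(attacker_utils, lam):
    """Equation 1: the probability q_i that the attacker picks target i is
    proportional to exp(lambda * G^a_i(x_i)). lam = 0 yields a uniform
    distribution; large lam approaches a rational best response."""
    # Shift by the max exponent for numerical stability.
    m = max(lam * g for g in attacker_utils)
    weights = [math.exp(lam * g - m) for g in attacker_utils]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical targets with attacker expected utilities G^a_i(x_i):
utils = [2.0, 1.0, -1.0]
print(quantal_response(utils, 0.0))  # uniform over the three targets
print(quantal_response(utils, 1.5))  # mass shifts toward the best target
```

As λ grows, the distribution concentrates on the utility-maximizing target, recovering the perfectly rational attacker as a limiting case.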

USCG and PROTECT's Goals

The USCG continues to face challenges with evolving asymmetric threats within the maritime environment, not only within the Maritime Global Commons, but also within the ports and waterways that make up the United States Maritime Transportation System. The former Director of National Intelligence, Dennis Blair, noted in 2010 a persistent threat "from al-Qa'ida and potentially others who share its anti-Western ideology. A major terrorist attack may emanate from either outside or inside the United States" (Blair 2010). This threat was reinforced in May of 2011 following the raid on Osama Bin Laden's home, where a large trove of material was uncovered, including plans to attack an oil tanker: "There is an indication of intent, with operatives seeking the size and construction of tankers, and concluding it's best to blow them up from the inside because of the strength of their hulls" (Dozier 2011). These oil tankers transit the U.S. Maritime Transportation System. The USCG plays a key role in the security of this system and the protection of seaports to support the economy, environment, and way of life in the US (Young and Orchard 2011).

Coupled with challenging economic times, USCG must operate as effectively as possible, achieving maximum benefit from every hour spent on patrol. Thus, USCG is compelled to re-examine the role that optimization of resource usage plays in its mission planning, and how innovation provided by game theory can be effectively employed.

The goal of PROTECT is to use game theory to assist the USCG in maximizing its effectiveness in the Ports, Waterways, and Coastal Security (PWCS) Mission. PWCS patrols are focused on protecting critical infrastructure; without the resources to provide one hundred percent on-scene presence at any, let alone all, of the critical infrastructure, optimization of security resources is critical. Toward that end, unpredictability creates situations of uncertainty for an enemy and can be enough to deem a target less appealing.

The PROTECT system addresses how the USCG should optimally patrol critical infrastructure in a port to maximize protection, knowing that the adversary may conduct surveillance and then launch an attack. While randomizing patrol patterns is key, PROTECT also addresses the fact that the targets are of unequal value, understanding that the adversary will adapt to whatever patrol patterns USCG conducts. The output of PROTECT is a schedule of patrols which includes when the patrols are to begin, what critical infrastructure to visit for each patrol, and what activities to perform at each critical infrastructure. While initially pilot tested in the port of Boston, PROTECT was intended to be generalizable and applicable to other ports.

Key Innovations in PROTECT

The PWCS patrol problem was modeled as a leader-follower (or attacker-defender) Stackelberg game (Fudenberg and Tirole 1991), with USCG as the leader (defender) and the terrorist adversaries in the role of the follower. We begin by discussing how to practically cast this real-world maritime patrolling problem of PWCS patrols as a Stackelberg game. We also show how to reduce the number of defender strategies before addressing the most important of the innovations in PROTECT: its use of the quantal response model.

Game Modeling

To model the USCG patrolling domain as a Stackelberg game, we need to define (i) the set of attacker strategies, (ii) the set of defender strategies, and (iii) the payoff function. These strategies and payoffs center on the targets in a port; ports, such as the port of Boston, have a significant number of potential targets (critical infrastructure). In our Stackelberg game formulation, the attacker conducts surveillance on the mixed strategy that the defender has committed to, and can then launch an attack. Thus, the attacks an attacker can launch on different possible targets are considered his/her pure strategies.

Instead of basing the defender strategies on individual targets, which would require significant USCG input while also micromanaging their activities, it was decided to group nearby targets into patrol areas. The presence of patrol areas led the USCG to redefine the set of defensive activities to be performed on patrol areas, providing a more accurate and expressive model of the patrols. Activities that take a longer time provide the defender a higher payoff compared to activities that take a shorter time to complete.

Patrol Schedule                    Target 1   Target 2   Target 3   Target 4
(1:k1), (2:k1), (1:k1)             50,-50     30,-30     15,-15     -20,20
(1:k2), (2:k1), (1:k1)             100,-100   60,-60     15,-15     -20,20
(1:k1), (2:k1), (1:k2)             100,-100   60,-60     15,-15     -20,20
(1:k1), (3:k1), (2:k1), (1:k1)     50,-50     30,-30     15,-15     10,-10
(1:k1), (2:k1), (3:k1), (1:k1)     50,-50     30,-30     15,-15     10,-10

Table 1: Portion of a simplified example of a game matrix

To generate all possible patrol schedules, a graph G = (V, E) is created with the patrol areas as vertices V and adjacent patrol areas as edges E, with each patrol schedule being represented by a closed walk of G that starts and ends at the patrol area b ∈ V, the base patrol area for the USCG. The patrol schedules are sequences of patrol areas and associated defensive activities, and are constrained by a maximum patrol time τ. The graph G along with the constraints b and τ are used to generate the defender strategies.
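The closed-walk generation can be sketched as a simple depth-first enumeration. The graph, activity durations, and travel time below are invented for illustration; the real port graph and timings come from USCG.

```python
def closed_walks(adj, base, activity_time, travel_time, max_time):
    """Enumerate patrol schedules: closed walks on the patrol-area graph
    that start and end at the base area b, pairing each visit with a
    defensive activity, subject to the patrol time bound tau (max_time)."""
    schedules = []

    def extend(area, walk, elapsed):
        for activity, t_act in activity_time.items():
            t = elapsed + t_act
            if t > max_time:
                continue
            step = walk + [(area, activity)]
            if area == base and len(step) > 1:
                schedules.append(step)  # complete closed walk back at base
            for neighbor in adj[area]:
                if t + travel_time <= max_time:
                    extend(neighbor, step, t + travel_time)

    extend(base, [], 0)
    return schedules

# Toy instance: three mutually adjacent patrol areas, two activity types.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
walks = closed_walks(adj, base=1, activity_time={"k1": 10, "k2": 30},
                     travel_time=5, max_time=60)
```

Each returned schedule is a sequence of (patrol area, activity) pairs whose total activity and travel time fits within the budget, matching the closed-walk construction described above.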

Table 1 gives an example, where the rows correspond to the defender's strategies and the columns correspond to the attacker's strategies. There are two possible defensive activities, k1 and k2, where k2 provides a higher payoff for the defender than k1. Suppose that the time bound disallows more than two k2 activities (given the time required for k2) within a patrol. Patrol area 1 has two targets (targets 1 and 2), while patrol areas 2 and 3 each have one target (targets 3 and 4, respectively). In the table, a patrol schedule is composed of a sequence of patrol areas and a defensive activity in each area. The patrol schedules are ordered so that the first patrol area in the schedule denotes which patrol area the defender needs to visit first. In this example, patrol area 1 is the base patrol area, and all of the patrol schedules begin and end at patrol area 1. For example, the patrol schedule in row 2 first visits patrol area 1 with activity k2, then travels to patrol area 2 with activity k1, and returns back to patrol area 1 with activity k1. For the payoffs, if target i is the attacker's choice and is also part of a patrol schedule, then the defender gains a reward R^d_i while the attacker receives a penalty P^a_i; otherwise, the defender receives a penalty P^d_i and the attacker gains a reward R^a_i. Furthermore, let G^d_{ij} be the payoff for the defender if the defender chooses patrol j and the attacker chooses to attack target i. G^d_{ij} can be represented as a linear combination of the defender reward/penalty on target i and A_{ij}, the effectiveness probability of the defensive activity performed on target i for patrol j, as described by Equation 2. The value of A_{ij} is 0 if target i is not in patrol j.

G^d_{ij} = A_{ij} R^d_i + (1 - A_{ij}) P^d_i    (2)

If a target is visited multiple times with different activities, we consider only the activity with the highest quality. While Table 1 shows a zero-sum game, the algorithm used by PROTECT is not limited to zero-sum games.
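Equation 2 is a simple convex combination of the defender's reward and penalty; a minimal sketch, using the hypothetical payoff scale of Table 1:

```python
def defender_payoff(effectiveness, reward, penalty):
    """Equation 2: G^d_ij = A_ij * R^d_i + (1 - A_ij) * P^d_i.
    effectiveness is A_ij, the probability that the activity performed on
    target i during patrol j is effective (0 if i is not in patrol j)."""
    return effectiveness * reward + (1 - effectiveness) * penalty

# Hypothetical target with R^d_i = 50 and P^d_i = -20:
print(defender_payoff(1.0, 50, -20))  # fully effective coverage -> 50
print(defender_payoff(0.0, 50, -20))  # target not in the patrol -> -20
```

Intermediate effectiveness values interpolate between the two extremes, which is how longer (more effective) activities earn the defender a higher payoff.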

Compact Representation

In our game, the number of defender strategies, i.e., patrol schedules, grows combinatorially, generating a scale-up challenge. To achieve scale-up, PROTECT uses a compact representation of the patrol schedules based on two ideas: (i)


combining equivalent patrol schedules, and (ii) removing dominated patrol schedules.

Figure 2: Flow chart of the PROTECT system

With respect to equivalence, different permutations of a patrol schedule provide identical payoffs. Furthermore, if an area is visited multiple times with different activities in a schedule, only the activity that provides the defender the highest payoff requires attention. Therefore, many patrol schedules are equivalent if the sets of patrol areas visited and defensive activities in the schedules are the same, even if their order differs. Such equivalent patrol schedules are combined into a single compact defender strategy, represented as a set of patrol areas and defensive activities (minus any ordering information). Table 2 presents a compact version of Table 1, which shows how the game matrix is simplified by using equivalence to form compact defender strategies; e.g., the patrol schedules in rows 2-3 of Table 1 are represented as the compact strategy Γ2 = {(1:k2), (2:k1)} in Table 2.

Compact Strategy                   Target 1   Target 2   Target 3   Target 4
Γ1 = {(1:k1), (2:k1)}              50,-50     30,-30     15,-15     -20,20
Γ2 = {(1:k2), (2:k1)}              100,-100   60,-60     15,-15     -20,20
Γ3 = {(1:k1), (2:k1), (3:k1)}      50,-50     30,-30     15,-15     10,-10

Table 2: Example compact strategies and game matrix

Next, the idea of dominance is illustrated using Table 2, noting that the difference between Γ1 and Γ2 is the defensive activity on patrol area 1. Since activity k2 gives the defender a higher payoff than k1, Γ1 can be removed from the set of defender strategies because Γ2 covers the same patrol areas while giving a higher payoff for patrol area 1.
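The equivalence and dominance reductions can be sketched as follows. The schedule format and the activity-quality ranking are assumptions matching the running example, not the deployed implementation.

```python
def compact_strategies(schedules, activity_rank):
    """Equivalence: collapse schedules with the same areas/activities
    (ignoring order; repeated visits keep only the best activity) into
    frozensets. activity_rank maps an activity to its quality,
    e.g. {"k1": 1, "k2": 2}."""
    compact = set()
    for sched in schedules:
        best = {}
        for area, act in sched:
            if area not in best or activity_rank[act] > activity_rank[best[area]]:
                best[area] = act
        compact.add(frozenset(best.items()))
    return compact

def remove_dominated(compact, activity_rank):
    """Dominance: drop a compact strategy if another covers the same patrol
    areas with activities at least as good everywhere and better somewhere."""
    def dominates(g1, g2):
        d1, d2 = dict(g1), dict(g2)
        if set(d1) != set(d2):
            return False
        r1 = [activity_rank[d1[a]] for a in d1]
        r2 = [activity_rank[d2[a]] for a in d1]
        return all(a >= b for a, b in zip(r1, r2)) and r1 != r2
    return {g for g in compact if not any(dominates(h, g) for h in compact)}
```

On the Table 1 rows, the three schedules covering areas 1 and 2 collapse to the two compact strategies Γ1 and Γ2, after which Γ1 is pruned as dominated by Γ2.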

Figure 2 shows a high-level view of the steps of the algorithm using the compact representation. In the expansion from a compact strategy to a full set of patrol schedules, we need to determine the probability of choosing each patrol schedule, since a compact strategy may correspond to multiple patrol schedules. The focus here is to increase the difficulty for the attacker to conduct surveillance by increasing unpredictability, which we achieve by randomizing uniformly over all expansions of the compact defender strategies. The uniform distribution provides the maximum entropy (greatest unpredictability).

Human Adversary Modeling

While previous game-theoretic security applications have assumed a perfectly rational attacker, PROTECT takes a step forward by addressing this limitation of classical game theory. Instead, PROTECT uses a boundedly rational adversary by way of a quantal response (QR) model, which has been shown to be a promising model of human decision making (McKelvey and Palfrey 1995; Rogers, Palfrey, and Camerer 2009; Yang et al. 2011). A recent study demonstrated the use of QR as an effective prediction model of humans (Wright and Leyton-Brown 2010). An even more relevant study of the QR model was conducted by Yang et al. (2011) in the context of security games, where this model was shown to outperform competitors in modeling human subjects. Based on this evidence, PROTECT uses a QR model of a human adversary. (Aided by a software assistant, the defender still computes the optimal mixed strategy.)

t_i      Target i
R^d_i    Defender reward on covering t_i if it's attacked
P^d_i    Defender penalty on not covering t_i if it's attacked
R^a_i    Attacker reward on attacking t_i if it's not covered
P^a_i    Attacker penalty on attacking t_i if it's covered
A_ij     Effectiveness probability of compact strategy Γ_j on t_i
a_j      Probability of choosing compact strategy Γ_j
J        Total number of compact strategies
x_i      Marginal coverage on t_i

Table 3: PASAQ notation as applied to PROTECT

To apply the QR model in a Stackelberg framework, PROTECT employs an algorithm known as PASAQ (Yang, Tambe, and Ordonez 2012). PASAQ computes the optimal defender strategy (within a guaranteed error bound) given a QR model of the adversary by solving the following nonlinear and non-convex optimization problem P, with Table 3 listing the notation:

P:

\max_{x,a} \frac{\sum_{i=1}^{T} e^{\lambda R^a_i} e^{-\lambda (R^a_i - P^a_i) x_i} \left( (R^d_i - P^d_i) x_i + P^d_i \right)}{\sum_{i=1}^{T} e^{\lambda R^a_i} e^{-\lambda (R^a_i - P^a_i) x_i}}

s.t.  x_i = \sum_{j=1}^{J} a_j A_{ij}, \forall i

      \sum_{j=1}^{J} a_j = 1,  0 \le a_j \le 1, \forall j

The first line of P corresponds to the computation of the defender's expected utility, resulting from a combination of Equations 1 and 2. Unlike in previous applications (Tambe 2011; Kiekintveld, Marecki, and Tambe 2011), x_i in this case summarizes not just presence or absence on a target, but also the effectiveness probability A_{ij} on the target.
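The objective of P can be evaluated directly for a given marginal coverage vector x. The following sketch (with invented payoffs) combines the QR attack weights of Equation 1 with the per-target defender utility of Equation 2; it evaluates the objective, it does not solve P.

```python
import math

def defender_eu_under_qr(x, Ra, Pa, Rd, Pd, lam):
    """Objective of P: the QR weight on target i is
    exp(lam * R^a_i) * exp(-lam * (R^a_i - P^a_i) * x_i), and the
    defender's utility on target i is (R^d_i - P^d_i) * x_i + P^d_i."""
    weights = [math.exp(lam * ra - lam * (ra - pa) * xi)
               for xi, ra, pa in zip(x, Ra, Pa)]
    utils = [(rd - pd) * xi + pd for xi, rd, pd in zip(x, Rd, Pd)]
    return sum(w * u for w, u in zip(weights, utils)) / sum(weights)

# Two hypothetical targets; with lam = 0 the attacker is uniform, so the
# result is simply the average of the per-target defender utilities.
eu = defender_eu_under_qr([1.0, 0.0], Ra=[10, 5], Pa=[-5, -5],
                          Rd=[5, 1], Pd=[-10, -1], lam=0.0)
```

Note that the weight follows from substituting G^a_i(x_i) = R^a_i - (R^a_i - P^a_i) x_i into Equation 1, which is exactly how the numerator and denominator of P arise.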

As with all QR models, a value for λ is needed to represent the noise in the attacker's strategy. Based on discussions with USCG experts about the attacker's behavior, λ values of 0 and ∞ were ruled out. Given the payoff data for Boston, an attacker's strategy with λ = 4 starts approaching a fully rational attacker: the probability of attack focuses on a single target. It was determined from the knowledge gathered from USCG that the attacker's strategy is best modeled with a λ value in the range [0.5, 4]. A discrete sampling approach was used to determine the λ value that gives the highest average expected utility across attacker strategies within this range, yielding λ = 1.5. Selecting an appropriate value for λ remains a complex issue, however, and it is a key agenda item for future work.
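The discrete sampling procedure can be sketched generically. Here `utility` is a hypothetical stand-in for solving PASAQ with a candidate λ and evaluating the resulting strategy against an attacker using λ'; the candidate grid and toy utility below are illustrative only.

```python
def pick_lambda(candidates, attacker_lams, utility):
    """Discrete sampling: choose the candidate lambda whose strategy
    maximizes the average defender expected utility over a range of
    plausible attacker lambdas. utility(lam_def, lam_att) is assumed
    to be supplied by the caller."""
    def avg(lam_def):
        return sum(utility(lam_def, la) for la in attacker_lams) / len(attacker_lams)
    return max(candidates, key=avg)

# Toy utility peaked where the defender's assumed lambda matches the
# attacker's; real utilities would come from PASAQ solutions.
best = pick_lambda([0.5, 1.5, 4.0], [0.5, 1.5, 2.5],
                   lambda ld, la: -(ld - la) ** 2)
```

This mirrors the selection described above: average performance over the plausible attacker range, not performance at a single point estimate.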


Evaluation

This section presents evaluations based on (i) experiments completed via simulations and (ii) real-world patrol data along with USCG analysis. All scenarios and experiments, including the payoff values and the graph (composed of 9 patrol areas), were based on the port of Boston. The defender's payoff values have a range of [-10, 5], while the attacker's payoff values have a range of [-5, 10]. The game was modeled as a zero-sum game² in which the attacker's loss or gain is balanced precisely by the defender's gain or loss. For PASAQ, the defender's strategy uses λ = 1.5 as mentioned before. All experiments were run on a machine with an Intel Dual Core 1.4 GHz processor and 2 GB of RAM.

Memory Analysis

Figure 3: Memory comparison

This section presents results based on simulation to show the memory efficiency of the compact representation versus the full representation. In Figure 3, the x-axis is the maximum patrol time allowed and the y-axis is the memory needed to run PROTECT. The maximum patrol time allowed determines the number of combinations of patrol areas that can be visited, so the x-axis indicates a scale-up in the number of defender strategies. When the maximum patrol time is set to 90 minutes, the full representation uses 540 MB of memory while the compact representation requires 20 MB. Due to the exponential increase in the memory needed for the full representation, it cannot be scaled up beyond 90 minutes. The graph of the runtime comparison is similar to the memory comparison graph.

Utility Analysis

Given that we are working with real data, it is useful to understand whether PROTECT using PASAQ with λ = 1.5 provides an advantage when compared to: (i) a uniform random defender strategy; (ii) a mixed strategy computed under the assumption that the attacker attacks any target uniformly at random (λ = 0); or (iii) a mixed strategy assuming a fully rational attacker (λ = ∞). The previously existing DOBSS algorithm was used for λ = ∞ (Tambe 2011). Additionally, comparison with the λ = ∞ approach is important because of the extensive use of this assumption in previous applications. Typically, we may not have an estimate of the exact value of the attacker's λ, only a possible range. Therefore, ideally we would wish to show that PROTECT (with λ = 1.5) provides an advantage over a range of λ values assumed for the attacker (not just over a point estimate), justifying our use of the PASAQ algorithm.

² In general these types of security games are non-zero-sum (Tambe 2011); however, for Boston, as a first step it was decided to cast the game as zero-sum.

Figure 4: Defender's expected utility when varying λ for the attacker's strategy

To achieve this, we compute the average defender utility of the four approaches above as the λ value of the attacker's strategy changes over [0, 6], which subsumes the range [0.5, 4] of reasonable attacker strategies. In Figure 4, the y-axis represents the defender's expected utility and the x-axis is the λ value used for the attacker's strategy. Both uniform random strategies perform well when the attacker's strategy is based on λ = 0. However, as λ increases, both strategies quickly drop to a very low defender expected utility. In contrast, the PASAQ strategy with λ = 1.5 provides a higher expected utility than the strategy assuming a fully rational attacker over a range of attacker λ values (and indeed over the range of interest), not just at λ = 1.5.

Robustness Analysis
In the real world, observation, execution, and payoffs are not always perfect, due to noise in the attacker's surveillance of the defender's patrols, the many tasks and responsibilities of the USCG (where the crew may be pulled off a patrol), and limited knowledge of the attacker's payoff values. Our hypothesis is that PASAQ with λ = 1.5 is more robust to such noise than a defender strategy that assumes full rationality of the attacker, such as DOBSS (Tambe 2011); i.e., PASAQ's expected defender utility will not degrade as much as DOBSS's over the range of attacker λ of interest. This is illustrated by comparing both PASAQ and DOBSS against observation, execution, and payoff noise (Kiekintveld, Marecki, and Tambe 2011; Korzhyk, Conitzer, and Parr 2011; Yin et al. 2011).

Figure 5 shows the performance of different strategies while considering execution noise. The y-axis represents the defender's expected utility and the x-axis is the attacker's λ value. If the defender covered a target with probability p, this probability now changes to lie in [p − x, p + x], where x is the noise. Low execution error corresponds to x = 0.1, whereas high error corresponds to x = 0.2. The key takeaway is that execution error leads to PASAQ dominating DOBSS over all tested values of λ. For both algorithms, the defender's expected utility decreases as more execution error is added, because the defender's strategy is impacted by the additional error. When execution error is added, PASAQ dominates DOBSS because the latter seeks to maximize the minimum defender expected utility, so multiple targets will have the same minimum defender utility. For DOBSS, when execution error is added, there is a greater probability that one of these targets will have less coverage, resulting in a lower defender expected utility. For PASAQ, typically only one target has the minimum defender expected utility; as a result, changes in coverage do not impact it as much as DOBSS. As execution error increases, the advantage in defender expected utility of PASAQ over DOBSS grows even further. This section only shows the execution noise results; details of the observation and payoff noise results can be found in (Shieh et al. 2012).

Figure 5: Defender’s expected utility: Execution noise
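The execution-noise model described above can be sketched directly: each planned coverage probability is perturbed within [p − x, p + x], and the defender's expected utility against a QR attacker is re-averaged over many draws. The payoffs and planned coverage below are hypothetical, and the sketch does not reproduce PASAQ or DOBSS themselves, only the noise model used to stress-test them:

```python
import math
import random

# Hypothetical zero-sum payoffs: the defender's penalty at each target if
# it is attacked while uncovered (the attacker's reward is the negation).
penalties = [-10.0, -5.0, -1.0]

def defender_eu(coverage, lam):
    """Defender expected utility against a QR attacker with parameter lam."""
    attacker_utils = [(1 - c) * (-p) for c, p in zip(coverage, penalties)]
    weights = [math.exp(lam * u) for u in attacker_utils]
    total = sum(weights)
    return sum((w / total) * (1 - c) * p
               for w, c, p in zip(weights, coverage, penalties))

def with_execution_noise(coverage, x, rng):
    """Each planned coverage p is executed as a uniform draw from
    [p - x, p + x], clipped to [0, 1], mirroring the noise model above."""
    return [min(1.0, max(0.0, p + rng.uniform(-x, x))) for p in coverage]

def avg_noisy_eu(coverage, lam, x, trials=2000, seed=7):
    """Monte Carlo estimate of defender EU under execution noise level x."""
    rng = random.Random(seed)
    return sum(defender_eu(with_execution_noise(coverage, x, rng), lam)
               for _ in range(trials)) / trials

planned = [0.6, 0.3, 0.1]      # hypothetical planned marginal coverage
for x in (0.0, 0.1, 0.2):      # none, "low", and "high" execution error
    print(x, round(avg_noisy_eu(planned, 1.5, x), 3))
```

Running the same experiment with a maximin (DOBSS-style) coverage vector instead of a QR-tuned one would reproduce the qualitative comparison in Figure 5: strategies that tie several targets at the minimum utility suffer more when coverage is perturbed.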

USCG Real-World Evaluation
Real-world scheduling data: Unlike prior publications of real-world applications of game theory for security, a key novelty of this paper is the inclusion of actual data from USCG patrols before and after the deployment of PROTECT at the port of Boston. Figures 6(a) and 6(b) show the frequency of visits by the USCG to different patrol areas over a number of weeks. The x-axis is the day of the week, and the y-axis is the number of times a patrol area is visited for a given day of the week. The y-axis is intentionally blurred for security reasons, as this is real data from Boston. There are more lines in Figure 6(a) than in Figure 6(b) because during the implementation of PROTECT, new patrol areas were formed which contained more targets, resulting in fewer patrol areas in the post-PROTECT figure. Figure 6(a) depicts a definite pattern in the patrols: while there is a spike in patrols executed on Day 5, there is a dearth of patrols on Day 2. Beyond this pattern, the lines in Figure 6(a) intersect, indicating that on some days a higher value target was visited more often, while on other days it was visited less often. This means that there was not a consistently high frequency of coverage of higher value targets before PROTECT. In Figure 6(b), we notice that the pattern of low patrols on Day 2 (from Figure 6(a)) disappears. Furthermore, lines do not frequently intersect, i.e., higher valued targets are visited consistently across the week. The top line in Figure 6(b) is the base patrol area, which is visited at a higher rate than all other patrol areas.

Adversary Perspective Team (APT): To obtain a better understanding of how the adversary views the potential targets in the port, the USCG created the Adversarial Perspective Team (APT), a mock attacker team. The APT provides assessments from the terrorist perspective and, as a secondary function, assesses the effectiveness of the patrol activities before and after deployment of PROTECT. In their evaluation, the APT incorporates the adversary's known intent, capabilities, skills, commitment, resources, and cultural influences. In addition, it screens attack possibilities and assists in identifying the level of deterrence projected at and perceived by the adversary. For the purposes of this research, the adversary is defined as an individual(s) with ties to al-Qa'ida or its affiliates.

Figure 6: Patrol visits per day by area. (a) Pre-PROTECT; (b) Post-PROTECT

The APT conducted a pre- and post-PROTECT assessment of the system's impact on adversary deterrence at the port of Boston. This analysis uncovered a positive trend: the effectiveness of deterrence increased from the pre- to the post-PROTECT observations.

Additional Real-world Indicators: The use of PROTECT, together with the APT's improved guidance to boat crews on how to conduct patrols, provided a noticeable increase in the quality and effectiveness of the patrols. Prior to implementing PROTECT, there were no documented reports of illicit activity. After implementation, USCG crews reported more illicit activities within the port and provided a noticeable "on the water" presence, with industry port partners commenting that "the Coast Guard seems to be everywhere, all the time." With no actual increase in the number of resources applied, and therefore no increase in capital or operating costs, these outcomes support the practical application of game theory in the maritime security environment.

Outcomes after Boston Implementation: The USCG viewed this system as a success, and as a result, PROTECT is now being deployed in the port of New York. We were presented an award for the work on the PROTECT system for Boston Harbor, which reflects the USCG's recognition of the impact and value of PROTECT.

Lessons Learned: Putting Theory into Practice
Developing the PROTECT model was a collaborative effort involving university researchers and USCG personnel representing decision makers, planners, and operators. Building on the lessons reported in (Tambe 2011) for working with security organizations, we informed the USCG of: (i) the assumptions underlying the game-theoretic approaches, e.g., full adversary rationality, and the strengths and limitations of different algorithms, rather than pre-selecting a simple heuristic approach; (ii) the need to define and collect correct inputs for model development; and (iii) a fundamental understanding of how the inputs affect the results. We gained three new insights involving real-world applied research: (i) unforeseen positive benefits because security agencies were compelled to reexamine their assumptions; (ii) the requirement to work with multiple teams in a security organization at multiple levels of its hierarchy; and (iii) the need to prepare answers to end-user practical questions not always directly related to the "meaty" research problems.

The first insight came about when the USCG was compelled to reassess its operational assumptions as a result of working through the research problem. This reexamination prompted the USCG to develop new PWCS mission tactics, techniques, and procedures. Through the iterative development process, the USCG reassessed the reasons why boat crews performed certain activities and whether those activities were sufficient. For example, instead of "covered" vs. "not covered" as the only two possibilities at a patrol point, there are now multiple sets of activities at each patrol point.

The second insight is that applied research requires the research team to collaborate with the planners and operators of a security organization to ensure the model accounts for all aspects of a complex real-world environment. Initially, when we started working on PROTECT, the focus was on patrolling each individual target. This appeared to micromanage the activities of boat crews, and it was through their input that individual targets were grouped into patrol areas associated with a PWCS patrol. On the other hand, input from USCG headquarters and the APT mentioned earlier led to other changes in PROTECT, e.g., departing from a fully rational model of an adversary to a QR model.

The third insight is the need to develop answers to end-user questions which are not always related to the "meaty" research question but are related to the larger knowledge domain on which the research depends. One example of the need to explain results involved a user citing that one patrol area was being visited repeatedly, and hence randomization did not seem to occur. After assessing this concern, we determined that the cause of the repeated visits to that patrol area was its high reward, an order of magnitude greater than that of the rarely visited patrol areas. PROTECT correctly assigned patrol schedules that covered the more "important" patrol areas more frequently. These practitioner-based issues demonstrate the need for researchers not only to be conversant in the algorithms and math behind the research, but also to be able to explain from a user's perspective how the solutions are accurate. An inability to address these issues would result in a lack of real-world user confidence in the model.

Summary

This paper reports on PROTECT, a game-theoretic system deployed by the USCG in the port of Boston since April 2011. The USCG has deemed the deployment of PROTECT in Boston a success, and efforts are underway to deploy PROTECT in the port of New York and at other ports in the United States. PROTECT has advanced the state of the art beyond previous applications of game theory for security such as ARMOR, IRIS, or GUARDS (Tambe 2011). The use of a QR model also sets PROTECT apart from other Stackelberg models such as (Basilico, Gatti, and Amigoni 2009; Vanek et al. 2011). Building on this initial success, we hope to deploy it at more and much larger ports.

Acknowledgments
We thank the USCG offices, and particularly Sector Boston, for their exceptional collaboration. The views expressed herein are those of the authors and are not to be construed as official or as reflecting the views of the Commandant or of the USCG. This research was supported by the US Department of Homeland Security through the National Center for Risk and Economic Analysis of Terrorism Events (CREATE) under award number 2010-ST-061-RE0001.

References
Basilico, N.; Gatti, N.; and Amigoni, F. 2009. Leader-follower strategies for robotic patrolling in environments with arbitrary topologies. In AAMAS.
Blair, D. 2010. Annual threat assessment of the US intelligence community for the senate select committee on intelligence. http://www.dni.gov/testimonies/20100202 testimony.pdf.
Camerer, C. F. 2003. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press.
Dozier, K. 2011. Bin laden trove of documents sharpen US aim. http://www.msnbc.msn.com/id/43331634/ns/us news-security/t/bin-laden-trove-documents-sharpen-us-aim/.
Fudenberg, D., and Tirole, J. 1991. Game Theory. MIT Press.
Kiekintveld, C.; Marecki, J.; and Tambe, M. 2011. Approximation methods for infinite Bayesian Stackelberg games: modeling distributional uncertainty. In AAMAS.
Korzhyk, D.; Conitzer, V.; and Parr, R. 2011. Solving Stackelberg games with uncertain observability. In AAMAS.
McKelvey, R. D., and Palfrey, T. R. 1995. Quantal response equilibria for normal form games. Games and Economic Behavior 10(1):6–38.
Rogers, B. W.; Palfrey, T. R.; and Camerer, C. F. 2009. Heterogeneous quantal response equilibrium and cognitive hierarchies. Journal of Economic Theory.
Shieh, E.; An, B.; Yang, R.; Tambe, M.; Baldwin, C.; DiRenzo, J.; Maule, B.; and Meyer, G. 2012. PROTECT: A deployed game theoretic system to protect the ports of the United States. In AAMAS.
Tambe, M. 2011. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.
Vanek, O.; Jakob, M.; Hrstka, O.; and Pechoucek, M. 2011. Using multi-agent simulation to improve the security of maritime transit. In MABS.
von Stengel, B., and Zamir, S. 2004. Leadership with commitment to mixed strategies. Technical Report LSE-CDAM-2004-01, CDAM Research Report.
Wright, J., and Leyton-Brown, K. 2010. Beyond equilibrium: Predicting human behavior in normal form games. In AAAI.
Yang, R.; Kiekintveld, C.; Ordonez, F.; Tambe, M.; and John, R. 2011. Improving resource allocation strategy against human adversaries in security games. In IJCAI.
Yang, R.; Tambe, M.; and Ordonez, F. 2012. Computing optimal strategy against quantal response in security games. In AAMAS.
Yin, Z.; Jain, M.; Tambe, M.; and Ordonez, F. 2011. Risk-averse strategies for security games with execution and observational uncertainty. In AAAI.
Young, S., and Orchard, D. 2011. Remembering 9/11: Protecting America's ports. http://coastguard.dodlive.mil/2011/09/remembering-911-protecting-americas-ports/.
