Research Article

A New Decentralized Approach of Multiagent Cooperative Pursuit Based on the Iterated Elimination of Dominated Strategies Model
Mohammed El Habib Souidi¹,² and Songhao Piao¹

¹Harbin Institute of Technology, Computer Science and Technology, Harbin 150001, China
²Department of Computer Science, University of Khenchela, 40000 Khenchela, Algeria

Correspondence should be addressed to Mohammed El Habib Souidi; mohamedsouidi1989@hotmail.com
Received 20 June 2016; Revised 7 September 2016; Accepted 25 September 2016

Academic Editor: Vladimir Turetsky

Copyright © 2016 M. E. H. Souidi and S. Piao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Game Theory is a promising approach to acquire coalition formations in multiagent systems. This paper focuses on the importance of distributed computation and of the dynamic formation and reformation of pursuit groups in pursuit-evasion problems. In order to address this task, we propose a decentralized coalition formation algorithm based on the Iterated Elimination of Dominated Strategies (IEDS). This Game Theory process is commonly used to solve problems requiring the iterative withdrawal of dominated strategies. Furthermore, we have used Markov Decision Process (MDP) principles to control the motion strategy of the agents in the environment. The simulation results demonstrate the feasibility and the validity of the given approach in comparison with different decentralized methods.
1. Introduction
A Multiagent System (MAS) is an organized set of intelligent agents situated in an environment that coordinate their actions in order to resolve complex problems. Coalition formation, considered a major focus in social systems, is an important method of cooperation. In general, coalition formation between agents is goal-directed and short-lived: coalitions are formed to achieve a specific objective and dissolve when it is accomplished. Coalition formation has received a considerable amount of attention in recent research [1, 2]. Further, some research activities are based on the notion of Organization, which allows the coalition of agents in the form of groups as well as the cooperation between their members. In this vein, Ferber et al. [3] proposed AALAADIN, an organizational model based on three principal axes, Agent, Group, and Role (AGR), used simultaneously to describe concrete agent organizations.
Multiagent cooperative pursuit is a well-known multiagent problem [4, 5]. Based on the relation between pursuers and evaders, the pursuit-evasion problem can be classified into single-object pursuit and multiobject pursuit. This problem has been considered in many references, and its applications have been inspired by an equally diverse set of approaches and useful techniques, such as Organization [6], in which we used the principles of the AGR organizational model to propose a pursuit coalition formation algorithm. Also, in order to equip each pursuit group with a dynamic access mechanism, we introduced a flexible organizational model extended from AGR through the application of fuzzy logic principles, which determines the membership degree of each pursuer in relation to each group [7].
Furthermore, Cai et al. introduced an economical auction mechanism (MPMEGBTBA) [8], where an advanced task negotiation process based on Task Bundle Auction was proposed in order to allocate tasks dynamically through the dynamic coalition formation of multiple agents. Several other studies treat the pursuit problem through different principles, such as Graph Theory [9], Polygonal Environments [10, 11], Data Mining [12], and Reinforcement Learning [13]. In this kind of problem, the pursuers and evaders come in different types: the type of a pursuer denotes its pursuit capacity, whereas the type of an evader reflects the number and type of pursuers required to perform its capture. The value of an evader indicates the expected rewards that should be returned to the relevant pursuers after the capture is achieved.

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2016, Article ID 5192423, 11 pages, http://dx.doi.org/10.1155/2016/5192423
Game Theory can be considered the simplest way to model situations of conflict, and it studies the interactions between interested agents. The classic question relating Game Theory to multiagent systems is "what is the best action that an agent can perform?" This principle has been widely used in multiagent pursuit problems [14-17]. Negotiation based on Game Theory is focused on the value and rewards of each agent, which appropriately reflect the objective of the agent's negotiation (satisfying the goal of each agent). The main advantage provided by Game Theory algorithms in MAS is the coordination appearing through the hypothesis of mutual rationality of the agents. Therefore, Game Theory algorithms can coordinate autonomous rational agents without a coordination mechanism explicitly integrated into the agents' model. They also provide different methods that define the optimal agent coalitions in several types of problems. On the other hand, the disadvantages of these algorithms concern the agents, which are frequently assumed to be perfectly rational. Moreover, Game Theory algorithms focus on the value of the optimal solution and overlook the most efficient method to achieve it.
In this paper, we focus on the Iterated Elimination of Dominated Strategies (IEDS), a Game Theory technique, to propose a coalition formation algorithm for pursuit-evasion problems. A strategy is the complete specification of an agent's behavior in any situation (in the case of an extensive-form game, it specifies what behavior the agent must undertake according to the set of information provided). Moreover, we use Markov Decision Process (MDP) principles in order to control the motion strategy of each agent. The MDP provides a formalism to model and resolve planning and learning problems under uncertainty [18, 19].
The paper is organized as follows: in Section 2, we discuss the main related works based on the same principles used in this paper. In Section 3, we focus on the pursuit-evasion problem, giving a detailed explanation of the simulation environment and its contents; we also introduce the environmental agents by defining the principal characteristics of pursuers and evaders. In Section 4, the Iterated Elimination of Dominated Strategies principle is described and detailed through an application example of this Game Theory process. In Section 5, the basic principles of the Markov Decision Process are presented; in particular, we clarify the primary functions, better known as the reward and transition functions. In Section 6, we introduce the distributed coalition formation algorithm with a detailed clarification of the coalition progress. A simulation of a pursuit-evasion game example is shown in Section 7, where we describe our simulation environment in detail and present the results achieved in comparison with outcomes based on other theories. Finally, Section 8 contains concluding remarks.
2. Related Work
Many works address the PE problem from game-theoretic principles, such as [14], where the author described the control of a team of autonomous agents tracking an intelligent evader in a nonaccurately mapped terrain, based on a method that calculates the Nash equilibrium policies by resolving an equivalent zero-sum matrix game. In this example, among all Nash equilibria, the evader selects the one which optimizes its deterministic distance to the pursuers' team. In order to resolve the problems often encountered in pursuit-evasion game algorithms, such as computational complexity and the lack of universality, Dong et al. [15] propose a hybrid algorithm founded on an improved dynamic artificial potential field and differential game, where the Nash equilibrium solution is optimal for both pursuer and evader in a barrier-free zone; in accordance with environment changes around the pursuit elements, the algorithm is applied with flexibility. Moreover, in [16], Amigoni and Basilico presented an approach to calculate the optimal pursuer strategy that maximizes the probability of the target's capture in a given environment. This approach is based on the definition of a game-theoretic pursuit-evasion model, as well as on its resolution through mathematical programming.
More recently, Lin et al. [20] proposed a pursuit-evasion differential game based on Nash strategies involving limited observations. On the one hand, the evader undertakes the standard feedback Nash strategy; on the other hand, the pursuers undertake Nash strategies based on the novel concept of best achievable performance indices. This model has potential applications in cases where several weakly equipped pursuing vehicles are tracking a well-equipped unmanned vehicle.
In relation to PE, MDP is usually used to provide the motion planning for the mobile pursuers through the maximization of the rewards obtained during the pursuit. In [21], a Partially Observable Markov Decision Process (POMDP) algorithm is used to search for a mobile target in a known graph; the main objective is to ensure the capture of the targets via the clearing of the graph in minimal time. In [22], the authors propose a new approach, the Continuous-Time Markov Decision Process (CTMDP), to address the PE problem. Compared with MDP, CTMDP takes into account the impact of the transition time between states, yielding strong robustness against changes in the transition probabilities. In [23], the authors proposed an innovative approach totally based on MDP, with the aim of resolving sequential multiagent decision problems by allowing agents to reason explicitly about specific coordination mechanisms. In other words, they determined a value iteration algorithm to compute optimal policies that recognizes and reasons about coordination problems.
Furthermore, we can consider other works based on MDP and IEDS, such as [24], in which an exact dynamic programming algorithm for partially observable stochastic games (POSGs) is developed; it is also proven that, when applied to finite-horizon POSGs, the algorithm iteratively eliminates very weakly dominated strategies without first forming a normal-form representation of the game. Otherwise, several types of coordination mechanisms are currently used, such as Stochastic Clustering Auctions (SCAs) [25, 26], which represent a class of cooperative auction methods based on the modified Swendsen-Wang method; SCA permits each robot to recombine the tasks that have been linked and applies to heterogeneous teams. Other mechanisms are market-based, such as TraderBots [27], applied to greedy agents in order to provide a detailed analysis of the requirements for robust and efficient multirobot coordination in dynamic environments. From the point of view of Game Theory, some research activities [28] investigated the optimal coordination approach for multiagent foraging: they built the equivalence between the optimal solution of the MAS and the equilibrium of the corresponding game, and then introduced the evolutionarily stable strategy to help resolve the equilibrium selection problem of traditional Game Theory.
3. Problem Description
In this section, we focus on the cooperation problem in which n pursuers situated in a limitary toroidal grid environment X have to capture m evaders of different types. The expressions P = {P_1, ..., P_n} and E = {E_1, ..., E_m} represent the collections of the n pursuers and the m evaders, respectively. Pursuer and evader represent the roles that the agents can play. Each evader is characterized by a type Re, with Re ∈ {I, II, III, IV}, indicating how many pursuers are required to capture it. Here, we suppose that the pursuers can evaluate the evaders' types after localization. There exist some fixed obstacles of different shapes and sizes in the environment X. The occupancy of a position is given by the mapping mp: X → {0, 1}, such that, for all x ∈ X, if mp(x) = 1, then x is an obstacle.
In our proposal, the strategies of each pursuer are guided by determining factors that reflect the individual development of the pursuer during the execution of the assigned tasks. These factors are detailed as follows.
Self-Confidence Degree. In multiagent systems, each agent must be able to execute the services requested by the other agents. The self-confidence degree is the assessment of the agent's success in relation to the assigned tasks. It is denoted and computed in the following way:
∀Conf ∈ [0.1, 1], Conf = max(0.1, C_s / C_t). (1)

C_s is the number of tasks that the agent has accomplished; C_t is the number of tasks in which the agent has participated.
The Credit. In the case where the agent cannot perform a task, its credit is affected. The credit of an agent is designated and calculated as follows:

∀Credit ∈ [0, 1], Credit = min(1, 1 − C_b / (C_t − C_s)). (2)

C_b is the number of tasks abandoned by the agent.
Environment Position. The position of the agent in the environment is a crucial criterion for the pursuit sequences, because the capture is easier when the pursuer is closer to the evader. The position Pos is computed as follows:
Pos = Dist(S_P, S_E). (3)

S_P is the state (cell) of the pursuer; S_E is the state (cell) of the evader; Dist is the distance between the pursuer and the evader:

Dist(S_P, S_E) = √((CC_Pi − CC_Ei)² + (CC_Pj − CC_Ej)²). (4)

(CC_Pi, CC_Pj) are the Cartesian coordinates of the pursuer; (CC_Ei, CC_Ej) are the Cartesian coordinates of the evader.
In order to distinguish the different coalitions, each pursuer belonging to a coalition calculates the value returned to itself through this strategy. This computation is based on the factors characterizing the pursuers. For example, if a pursuer P_1 belongs to the coalition Co, the value of this coalition in relation to this pursuer is calculated as follows:
Co(val_P1) = (Coef_1 × Conf_1 + Coef_2 × Credit_1 + Coef_3 × Pos_1) / Σ_{k=1}^{3} Coef_k + Σ_{i=2}^{Re} (Coef_1 × Conf_i + Coef_2 × Credit_i + Coef_3 × Pos_i) / (Re × Σ_{k=1}^{3} Coef_k). (5)
Coef_k is the coefficient of each factor. On the basis of these values and using the IEDS method, our mechanism is able to select the optimal pursuit coalition for each detected evader, as detailed in Section 6.
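To make equations (1)-(5) concrete, the three factors and the coalition value can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation; the function names, the sample coordinates, and the factor values are our own assumptions.

```python
from math import dist  # Euclidean distance between cells, as in eq. (4)

def confidence(c_s, c_t):
    """Self-confidence degree, eq. (1): c_s tasks accomplished out of c_t."""
    return max(0.1, c_s / c_t)

def credit(c_b, c_t, c_s):
    """Credit, eq. (2): c_b abandoned tasks penalize the agent."""
    return min(1.0, 1.0 - c_b / (c_t - c_s))

def coalition_value(factors, coefs, re):
    """Coalition value w.r.t. the first pursuer listed, eq. (5).
    `factors` holds one (Conf, Credit, Pos) tuple per coalition member,
    the evaluating pursuer first; `coefs` are (Coef_1, Coef_2, Coef_3)."""
    norm = sum(coefs)
    own = sum(c * f for c, f in zip(coefs, factors[0])) / norm
    partners = sum(sum(c * f for c, f in zip(coefs, m))
                   for m in factors[1:]) / (re * norm)
    return own + partners

# Hypothetical example: a pursuer evaluating a two-member coalition (Re = II).
pos = dist((40, 98), (49, 98))  # 9.0 cells to the evader, eq. (3)-(4)
value = coalition_value([(0.8, 1.0, pos), (0.5, 0.9, 12.0)], (1, 1, 1), re=2)
```

With equal coefficients, the first term weighs the evaluating pursuer's own factors and the second averages its partners' factors over the coalition size Re, as in (5).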
4. The Iterated Elimination of Dominated Strategies (IEDS)
A coalition is a set of pursuers required to capture the detected evaders. In the coalition, each pursuer must correspond to a specific strategy. In our proposal, a pure strategy s_i defines a specific pursuit-group integration that the pursuer will follow in every possible and attainable situation during the pursuit. Such coalitions may not be random or drawn from a distribution, as in the case of mixed strategies. A strategy str_i dominates another strategy str'_i if and only if, for every potential combination of the other players' actions str_{−i},

μ_i(str_i, str_{−i}) ≥ μ_i(str'_i, str_{−i}). (6)

μ is the function that returns the results obtained through the application of a specific strategy.
Consider the strategic game shown in Table 1, where the column player has three pure strategies and the row player has only two (a). The values shown in each cell represent the expected payoffs returned to the players when the corresponding strategies are selected. Playing Center is always better than playing Right for the column player; consequently, we can assume he will eventually stop playing Right, because it is a dominated strategy (b). So we can ignore the Right column after its elimination. Now the row player has a dominated strategy, Up: eventually the row player stops playing Up, and the Up row is eliminated (c). Finally, two choices remain, Down-Left and Down-Center, and the column player notices that it can only win by playing Left (d). So we can deduce that the IEDS solution is (Down, Left), with payoff (6, 6).

Table 1: Application of the IEDS technique.

         Left     Center   Right
Up       5, 4     3, 8     1, 5
Down     6, 6     6, 0     −3, −1

Panels (a)-(d) repeat this payoff matrix; bold fonts in the original reflect how the dominated strategies are deleted at each step.
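The elimination steps above can be reproduced mechanically. The following is a minimal sketch (our own code, not the authors') of iterated elimination of strictly dominated pure strategies, applied to the bimatrix game of Table 1:

```python
def ieds(row_payoffs, col_payoffs, rows, cols):
    """Iteratively delete strictly dominated pure strategies for both players."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Row player: r is dominated by r2 if r2 pays more against every column.
        for r in rows[:]:
            if any(all(row_payoffs[(r2, c)] > row_payoffs[(r, c)] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Column player: symmetric test over the remaining rows.
        for c in cols[:]:
            if any(all(col_payoffs[(r, c2)] > col_payoffs[(r, c)] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Payoffs of Table 1: (row player, column player) per cell.
R = {("Up", "Left"): 5, ("Up", "Center"): 3, ("Up", "Right"): 1,
     ("Down", "Left"): 6, ("Down", "Center"): 6, ("Down", "Right"): -3}
C = {("Up", "Left"): 4, ("Up", "Center"): 8, ("Up", "Right"): 5,
     ("Down", "Left"): 6, ("Down", "Center"): 0, ("Down", "Right"): -1}

print(ieds(R, C, ["Up", "Down"], ["Left", "Center", "Right"]))
# (['Down'], ['Left'])
```

The run eliminates Right, then Up, then Center, leaving exactly the (Down, Left) solution with payoff (6, 6) described above.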
5. Markov Decision Process Principles
Markov Decision Processes (MDPs) provide a mathematical framework to model decision making in situations where outcomes are partly random and partly under the control of a decision maker. In cooperative multiagent systems, the MDP allows the formalization of sequential decision problems. This process only models cooperative systems in which the reward function is shared by all players. An MDP is defined by the tuple ⟨N, S, A, T, R⟩ as follows:
N is the number of agents Ag_i in the system, i ∈ {1, ..., N}.
S corresponds to the set of agents' states s.
A = A_1 × A_2 × ⋯ × A_N defines the set of joint actions of the agents, where A_i is the set of local actions of the agent Ag_i.
T is the transition function; it returns the probability T(s, a, s') that the agent goes into the state s' if it runs the joint action a ∈ A from state s.
R defines the reward function; R(s, a, s') represents the reward obtained by the agent when it transits from the state s to the state s' by executing the action a.
5.1. Reward Function. In an MDP problem, the next states selected are the states returning the maximum definitive reward. In our proposal, we use heuristic functions in order to calculate the immediate reward of each state. The reward function defines the goals that the pursuers have to achieve and identifies the environmental obstacles. To calculate this function, we rely on the agents' environment position, detailed in Section 3, which allows a fair distribution of the rewards over the environmental cells. The reward of each state s concerned is calculated as follows:
R(s, a) = { γ, if E_i ⊆ s;
            0, if mp(x) = 1;
            γ − Val(Dist(CC_P, CC_E)), else. (7)

γ is the maximum reward; Val(Dist(CC_P, CC_E)) represents the distance value.
Regarding the distribution of the rewards over the standard cells, we note that the reward function is inversely proportional to the distance function.
Figure 1 illustrates a part of our simulation environment, detailed in Section 7. The values displayed in the different cells, [V_1, V_2, V_3], represent the gains generated by the reward function. These dynamic rewards are awarded to any pursuer situated in the cell concerned during the pursuit:

V_1 is the reward obtained if the pursuer concerned tracks the first evader; V_2 is the reward obtained if the pursuer concerned tracks the second evader; V_3 is the index of the cell (occupied or free).
5.2. Transition Function. The transition probabilities ρ describe the dynamism of the environment. They play the role of the next-state function in problem-solving search, knowing that every state could be the possible next state according to the action undertaken in the current state. Our approach is developed in a grid-of-cells environment, where each agent can move to four different states: s_up, s_down, s_left, and s_right.

The transition probabilities of the pursuers are based on the reward degree, as shown:
Σ_{s'} ρ(s' | s, a) = 1,
ρ(s' | s, a) = R(s', a) / γ,
ρ(s* | s, a) = max(ρ(s | s, a), ρ(s_up | s, a), ρ(s_down | s, a), ρ(s_right | s, a), ρ(s_left | s, a)),
∀s, a. (8)
Figure 1: Reward function applied to the grid environment. Each cell displays its values [V_1, V_2, V_3]. Cells with a red frame: the selected states; blue agents: pursuers; green agents: evaders; black cells: cells containing obstacles.
The linkages between the evader and each pursuer shown in Figure 2 reflect the optimal trajectories provided by the application of the method proposed in this section during each pursuit step.
6. Coalition Formation Algorithm Based on IEDS
A number of coalition formation algorithms have been developed to define which of the potential coalitions should actually be formed. To do so, they typically compute a value for each coalition, known as the coalition value, which provides an indication of the expected results that could be derived if this coalition is constituted. Then, having calculated all the coalition values, the optimal coalition to form can be selected. We employ an iterative algorithm in order to determine the optimal coalitions of agents: it begins with a complete set of coalitions (agent-strategy combinations) and iteratively eliminates the coalitions that contribute less to the MAS efficiency. The pseudocode of our algorithm is shown in Algorithm 1.
First, the algorithm calculates all the possible coalitions (Nbrcl) that the pursuers can form, before filtering them as needed. The expected number of possible coalitions is calculated according to the following:
Nbrcl = n! / ((n − Re_1)! Re_1!) × (n − Re_1)! / ((n − (Re_1 + Re_2))! Re_2!) × ⋯ × (n − (Re_1 + ⋯ + Re_{N−1}))! / ((n − (Re_1 + Re_2 + ⋯ + Re_N))! Re_N!) = ∏_{j=1}^{N} (n − Σ_{k=0}^{j−1} Re_k)! / ((n − Σ_{k=0}^{j} Re_k)! Re_j!). (9)

n is the number of pursuers in the environment; N is the number of evaders detected; Re_0 = 0.
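Equation (9) is a product of binomial coefficients: choose Re_1 of the n pursuers for the first evader, Re_2 of the remainder for the second, and so on. A minimal sketch (our own code, not the authors'):

```python
from math import comb

def possible_coalitions(n, requirements):
    """Number of possible coalition formations, eq. (9): a running product
    of binomial coefficients over the evaders' requirements Re_1, ..., Re_N."""
    total, remaining = 1, n
    for re in requirements:
        total *= comb(remaining, re)
        remaining -= re
    return total

# Section 7 case study: 10 pursuers, two evaders of type Re = IV.
print(possible_coalitions(10, [4, 4]))  # 3150
```

The result matches the case study: 45 general coalitions of λ = 8 pursuers, each yielding 70 internal splits, for 45 × 70 = 3150 coalitions in total.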
In order to distribute the calculation of the possible coalitions among the pursuers, the possible general coalitions
n: the number of pursuers
i = 0; k = 0
j: indicator of the chase iteration
Calculate the possible coalitions
While (C_life > 0) do
    Calculate the value of each coalition
    While (number of coalitions > 1) do
        Eliminate the dominated strategy of P_i
        i ← i mod n + 1
    end while
    Assign the pursuers' roles according to the selected coalition
    Chase iteration
end while
If (capture = true) then
    While (k ≤ n) do
        Update(Reward_{P_k})
        k++
    end while
Else
    The guilty pursuers pay some fines
end if
Algorithm 1
(Ω) will be calculated. A general coalition enrolls all the pursuers required to capture the set of detected evaders:
Ω = n! / ((n − λ)! λ!), (10)

where λ = Re_1 + Re_2 + ⋯ + Re_N. The general coalitions generated will be equitably distributed among the agents playing the role Pursuer. Specifically, each general coalition will be composed of N pursuit groups. From each general coalition generated through
Figure 2: Pursuers' behaviors prediction after the transition function application.
the precedent calculation, equation (10), a number of possible coalition formations (denoted here by ℵ) will be computed:

ℵ = λ! / ((λ − Re_1)! Re_1!) × (λ − Re_1)! / ((λ − (Re_1 + Re_2))! Re_2!) × ⋯ × (λ − (Re_1 + ⋯ + Re_{N−1}))! / ((λ − (Re_1 + Re_2 + ⋯ + Re_N))! Re_N!) = ∏_{j=1}^{N} (λ − Σ_{k=0}^{j−1} Re_k)! / ((λ − Σ_{k=0}^{j} Re_k)! Re_j!). (11)

Hence,

Nbrcl = Ω × ℵ = n! / ((n − λ)! λ!) × ∏_{j=1}^{N} (λ − Σ_{k=0}^{j−1} Re_k)! / ((λ − Σ_{k=0}^{j} Re_k)! Re_j!). (12)
This decentralized technique aims to balance the computation of the possible coalition formations among the pursuers; it is further detailed in Section 7 via its application to the case study. Note that the value of each generated coalition in relation to each pursuer it contains is calculated according to (5). Each pursuer shares the coalitions it calculated with the others to start the coalition selection process. Secondly, we apply the Iterated Elimination of Dominated Strategies principle with the aim of finding the optimal coalition, each strategy being represented by a possible coalition formation. Alternately, each pursuer eliminates the coalition with the lowest value in relation to itself and sends the update to the next pursuer concerned. Pursuers are then assigned in accordance with the selected coalition, and each pursuer performs one chase iteration. The algorithm repeats these instructions until the end of the chase life. When C_life = 0 and the captures are accomplished, rewards are attributed to each of the participating pursuers; the rewards are determined as follows:
Rewards_p = R(s, a) / L. (13)
L is the number of the coalition's members. Otherwise, in the case of capture failure, the guilty pursuers must pay fines to the rest of the coalition's members. These fines are calculated in the following manner:
γ = (s_0, a_1, s_1, a_2, s_2, ..., s_h, a_h),
Fines = Σ_{i=w}^{h−1} R(s_i, a_{i+1}). (14)
Table 2: The distribution of the possible coalitions' computation.

Pursuers                        P_1   P_2   P_3   P_4   P_5   P_6   P_7   P_8   P_9   P_10
General coalitions               5     5     5     5     5     4     4     4     4     4
Possible coalitions generated   350   350   350   350   350   280   280   280   280   280
Figure 3: Flow chart of the algorithm: agents' localization → possible coalitions' calculation → value of coalitions' calculation → dominated strategy's elimination (repeated while Nbrcl > 1) → pursuers' assignment → chase iteration → capture test leading to rewards or fines; the loop repeats while C_life ≠ 0.
γ is the set of states regarding the guilty pursuer; 0 ≤ w ≤ h, where w represents the index of the coalition's beginning.
Figure 3 shows the flow chart of this pursuit algorithm, summarizing the different steps explained in this section, from the detection to the capture of the existing evaders.
7. Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells, containing some obstacles characterized by constancy and solidity. As regards the environmental agents, our simulations are based on ten (10) pursuers and two (2) evaders of type Re = IV; Figure 4 details how an evader of this type can be captured. Each agent is marked with an ID number. Both pursuers and evaders have the same speed (one cell per iteration) and an excellent communication system. The pursuers' teams are fully capable of determining their actual positions, and the evaders disappear after the capture is accomplished. Once the capture of an evader is performed, the coalition created for this pursuit is automatically dissolved.
Table 2 summarizes the results obtained after the application of the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), there are Ω = 45 possible general coalitions, which are distributed over the existing pursuers as shown in Table 2. From each general coalition, a number of coalitions will be generated (ℵ = 70), according to (11).
Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers in relation to the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which only one pursuer computes the possible coalitions, the decentralized method significantly decreases the computation time by dividing the work among the existing pursuers.
In order to vary the types of coordination mechanisms used in our simulations, we compare this work with our recent pursuit-evasion research based on the AGR organizational model [6], as well as with the results achieved by the auction mechanism illustrated in Case C [8]. Note that these two methods are based on decentralized coalition formation:

Case A is a pursuit based on the AGR organizational model [6].
Case B is our new approach, based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case C is a pursuit based on an economical auction mechanism (MPMEGBTBA) [8].
The results shown in Figure 6 represent the average capturing time achieved during forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to highlight the difference between the cases, we take into consideration the iteration concept, which determines the number of state changes of each agent during the pursuits.
In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, due to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of the type Re = IV after the capture.
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers (number of possible coalitions versus number of pursuers, from 10 to 15).
coalition formation algorithm reveal an average capturing time of 78 iterations.
Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formation and reformation of the pursuit teams.
Finally, we have focused on the study of the average pursuers' rewards obtained per chase iteration during a full pursuit. In Figure 8, the y-axis represents the value of the rewards achieved by a pursuer, and each unit of the x-axis represents a chase iteration. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time (iterations) over (40) different pursuits (episodes), for Case A, Case B, and Case C.
in which the average pursuer's rewards achieved reach 0.59 and 0.507, respectively. Otherwise, with IEDS, the average result increases to 0.88.
The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit, applied to the three cases. The positivity of the results is due to the grouping and the equitable task sharing between the different pursuit groups imposed by the coordination mechanisms applied. Moreover, we can note the superiority of the results obtained through IEDS in relation to the other cases, provoked by the
Figure 7: The pursuers' rewards development over time (iterations), for Case A, Case B, and Case C.
Figure 8: Average pursuers' rewards obtained per iteration (three panels: Case A, Case B, and Case C; average pursuers' rewards obtained versus time in iterations).
Table 3: Pursuit results.

                                              AGR       IEDS    MPMEGBTBA
Average capturing time (iterations)           144.225   78      100.57
Average pursuers' rewards obtained
  per iteration                               0.59      0.88    0.507
Average pursuers' self-confidence
  development                                 0.408     0.533   0.451
Figure 9: Pursuers' learning (self-confidence) development during the pursuit (%), for Case A, Case B, and Case C.
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.
Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding the reward's development as well as the capturing time. The leading cause of this fact is the dynamism of our coalitional groups. This flexible mechanism improves the intelligence of the pursuers concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer undertakes the best path.
8 Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we derived our coalition algorithm from the Iterated Elimination of Dominated Strategies. This process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have focused on the Markov Decision Process as a motion
strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm and a decentralized coalition strategy based on the AGR organizational model, as well as an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512-516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225-230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Muller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214-230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1-13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075-1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235-244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804-4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875-884, 2005.
[11] J. Thunberg, P. Ogren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506-4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165-172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482-486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272-2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452-1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155-2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107-110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413-420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154-159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347-1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57-65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215-1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478-485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709-715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541-558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, Evolutionary Game Theory Based Cooperation Algorithm in Multi-Agent System, InTech, Rijeka, Croatia, 2009.
evaders are presented in different kinds of types. The type of a pursuer denotes its pursuit capacity, whereas the type of an evader reflects the number and types of pursuers required to perform its capture. The value of an evader indicates the expected rewards that should be returned to the relevant pursuers after the achievement of the capture.
Game Theory can be considered the simplest way to model situations of conflict; it studies the interactions between interested agents. The classic question relating game theory to multiagent systems is "what is the best action that an agent can perform?" This principle has been widely used in multiagent pursuit problems [14-17]. Negotiation based on Game Theory is focused on the value and rewards of each agent, which appropriately reflect the objective of the agent's negotiation (satisfying the goal of each agent). The main advantage provided by Game Theory algorithms in MAS is the coordination appearing through the hypothesis of mutual rationality of the agents. Therefore, Game Theory algorithms can coordinate autonomous rational agents without a coordination mechanism explicitly integrated into the agents' model. They also provide different methods that define the optimal agents' coalitions in several types of problems. Otherwise, the disadvantages of these algorithms concern the agents, which are frequently considered perfectly rational. Moreover, Game Theory algorithms focus on the value of the optimal solution and overlook the most efficient method to achieve it.
In this paper, we focus on the Iterated Elimination of Dominated Strategies (IEDS) Game Theory technique to propose a coalition formation algorithm for pursuit-evasion problems. A strategy is the complete specification of the agent's behavior in any situation (in the case of an extensive-form game, it specifies what behavior the agent must undertake according to the set of information provided). Moreover, we have used Markov Decision Process (MDP) principles in order to control the motion strategy of each agent. This process (MDP) provides a formalism to model and resolve planning and learning problems under uncertainty [18, 19].
The paper is organized as follows: in Section 2 we discuss the main related works based on the same principles used in this paper. In Section 3 we focus on the pursuit-evasion problem by submitting a detailed explanation of the simulation environment and its different contents; we also introduce the environmental agents by defining and clarifying the principal characteristics of pursuers and evaders. In Section 4 the Iterated Elimination of Dominated Strategies principle is described and detailed through an application example of this Game Theory process. In Section 5 the basic principles of the Markov Decision Process are presented; in particular, we clarify the primary functions, better known as the reward and transition functions. In Section 6 we introduce the distributed coalition formation algorithm with a detailed clarification of the coalition progress. A simulation of a pursuit-evasion game example is shown in Section 7, where we describe our simulation environment in a specific manner and present the results achieved in comparison with outcomes based on other theories. Finally, Section 8 contains concluding remarks.
2 Related Work
Many works are based on game-theoretic principles regarding the PE problem, such as [14], where the authors describe the control of a team of autonomous agents tracking an intelligent evader in a non-accurately mapped terrain, based on a method that calculates Nash equilibrium policies by resolving an equivalent zero-sum matrix game. In this example, among all Nash equilibria, the evader selects the one which optimizes its deterministic distance to the pursuers' team. In order to resolve the problems often encountered in pursuit-evasion game algorithms, such as computational complexity and the lack of universality, Dong et al. [15] propose a hybrid algorithm founded on an improved dynamic artificial potential field and differential game, where the Nash equilibrium solution is optimal for both pursuer and evader in a barrier-free zone; in accordance with environment changes around the pursuit elements, the algorithm is applied with flexibility. Moreover, in [16], Amigoni and Basilico presented an approach to calculate the optimal pursuer strategy that maximizes the probability of the target's capture in a given environment. This approach is based on the definition of a game-theoretic pursuit-evasion model, as well as on its resolution through mathematical programming.
More recently, Lin et al. [20] proposed a pursuit-evasion differential game based on Nash strategies involving limited observations. On the one hand, the evader undertakes the standard feedback Nash strategy; on the other hand, the pursuers undertake Nash strategies based on the novel concept of best achievable performance indices. This model has potential applications in cases where several weakly equipped pursuing vehicles track a well-equipped unmanned vehicle.
In relation to PE, MDP is usually used to provide motion planning for the mobile pursuers through the maximization of the rewards obtained during the pursuit. In [21], a Partially Observable Markov Decision Process (POMDP) algorithm is used to search for a mobile target in a known graph; the main objective is to ensure the capture of the targets via the clearing of the graph in minimal time. In [22], the authors propose a new approach, the Continuous-Time Markov Decision Process (CTMDP), to address the PE problem. In relation to MDP, CTMDP takes into account the impact of the transition time between states, providing strong robustness against changes in transition probability. In [23], the authors proposed an innovative approach totally based on MDP, with the aim of resolving sequential multiagent decision problems by allowing agents to reason explicitly about specific coordination mechanisms. In other words, they determined a value iteration algorithm to compute optimal policies that recognizes and reasons about Coordination Problems.
Furthermore, we can consider other works based on MDP and IEDS, such as [24], in which an exact dynamic programming algorithm for partially observable stochastic games (POSGs) is developed. It is also proven that the algorithm iteratively eliminates very weakly dominated strategies without first forming a normal-form representation of the
game when applied to finite-horizon POSGs. Otherwise, several types of coordination mechanisms are currently used, such as Stochastic Clustering Auctions (SCAs) [25, 26], which represent a class of cooperative auction methods based on the modified Swendsen-Wang method; an SCA permits each robot to reconstitute the tasks that have been linked and applies to heterogeneous teams. Other mechanisms are market-based, such as TraderBots [27], applied to greedy agents in order to provide a detailed analysis of the requirements for robust and efficient multirobot coordination in dynamic environments. From the point of view of Game Theory, some research activities [28] investigated the optimal coordination approach for multiagent foraging; indeed, they built the equivalence between the optimal solution of the MAS and the equilibrium of the game in the same case, and then introduced the evolutionarily stable strategy to resolve the equilibrium selection problem of traditional Game Theory.
3 Problem Description
In this section, we focus on the cooperation problem in which n pursuers situated in a limitary toroidal grid environment X have to capture m evaders of different types. The expressions P = {P_1, ..., P_n} and E = {E_1, ..., E_m} represent the collections of the n pursuers and the m evaders, respectively. Pursuers and evaders represent the roles that the agents can play. Each evader is characterized by a type Re, with Re ∈ {I, II, III, IV}, to indicate how many pursuers are required to capture it. Here, we suppose that the pursuers can evaluate the evaders' types after localization. There exist some fixed obstacles with different shapes and sizes in the environment X. The positions of the obstacles are given by the mapping mp: X → {0, 1}, such that ∀x ∈ X, if mp(x) = 1, then x is an obstacle.
In our proposal, the strategies of each pursuer are guided by determining factors that reflect the individual development of the pursuer during the execution of the assigned tasks. These factors are detailed as follows.
Self-Confidence Degree. In multiagent systems, each agent must be able to execute the services requested by the other agents. The self-confidence degree is the assessment of the agent's success in relation to the assigned tasks. It is denoted and computed in the following way:
\[
\forall \mathrm{Conf} \in [0.1, 1], \quad \mathrm{Conf} = \max\left(0.1, \frac{C_s}{C_t}\right), \tag{1}
\]
where C_s is the number of tasks that the agent has accomplished and C_t is the number of tasks in which the agent has participated.
The Credit. In the case where the agent cannot perform a task, its credit will be affected. The credit of an agent is designated and calculated as follows:
\[
\forall \mathrm{Credit} \in [0, 1], \quad \mathrm{Credit} = \min\left(1,\ 1 - \frac{C_b}{C_t - C_s}\right), \tag{2}
\]
where C_b is the number of tasks abandoned by the agent.
Environment Position. The position of the agent in the environment is a crucial criterion for the pursuit sequences, because the capture will be easier if the pursuer is closer to the evader. The position Pos is computed as follows:
\[
\mathrm{Pos} = \mathrm{Dist}(S_P, S_E), \tag{3}
\]
where S_P is the state (cell) of the pursuer, S_E is the state (cell) of the evader, and Dist is the distance between the pursuer and the evader:
\[
\mathrm{Dist}(S_P, S_E) = \sqrt{(\mathrm{CC}_{Pi} - \mathrm{CC}_{Ei})^2 + (\mathrm{CC}_{Pj} - \mathrm{CC}_{Ej})^2}, \tag{4}
\]
where (CC_{Pi}, CC_{Pj}) are the Cartesian coordinates of the pursuer and (CC_{Ei}, CC_{Ej}) are the Cartesian coordinates of the evader.
In order to distinguish the different coalitions eachpursuer belonging to a coalition calculates the value returnedto itself through this strategy This computation is basedon the factors characterizing the pursuers For example apursuer (1198751) belongs to the coalition (Co) the value of thiscoalition in relation to this pursuer is calculated as follows
Co(val1198751)
= Coef1 times Conf1 + Coef2 times Credit1 + Coef3 times Pos1sum3119896=1 Coef119896
+ Resum119894=2
Coef1 times Conf119894 + Coef2 times Credit119894 + Coef3 times Pos119894Re times sum3119896=1 Coef119896
(5)
Coef is coefficient of each factorOn the basis of these values and using IEDS method our
mechanismwill be able to select the optimal pursuit coalitionfor each evader detected as detailed in Section 6
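To make the factor computations concrete, the following Python sketch evaluates (1)-(5) for a hypothetical coalition. The function and field names (confidence, credit, members, coef) are illustrative rather than part of the paper's notation, and the coefficient values are placeholders:

```python
import math

def confidence(done, joined):
    # Eq. (1): Conf = max(0.1, Cs / Ct)
    return max(0.1, done / joined)

def credit(abandoned, joined, done):
    # Eq. (2): Credit = min(1, 1 - Cb / (Ct - Cs)); assumes Ct > Cs
    return min(1.0, 1.0 - abandoned / (joined - done))

def distance(p, e):
    # Eq. (4): Euclidean distance between Cartesian coordinates
    return math.hypot(p[0] - e[0], p[1] - e[1])

def coalition_value(members, re, coef=(1.0, 1.0, 1.0)):
    """Coalition value from the viewpoint of members[0], per eq. (5).

    members: list of dicts with the precomputed factors conf, credit, pos;
    re: the evader's type Re (number of required pursuers).
    """
    csum = sum(coef)
    def weighted(m):
        return coef[0] * m["conf"] + coef[1] * m["credit"] + coef[2] * m["pos"]
    own = weighted(members[0]) / csum                      # first term of (5)
    others = sum(weighted(m) for m in members[1:]) / (re * csum)  # sum i=2..Re
    return own + others
```

In this sketch each pursuer only needs its own record and those of its prospective partners, which is consistent with the decentralized computation described in Section 6.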
4 The Iterated Elimination of Dominated Strategies (IEDS)
The coalition is a set of pursuers required to capture the evaders detected. In the coalition, each pursuer must correspond to a specific strategy. In our proposal, a pure strategy s_i defines a specific pursuit group integration that the pursuer will follow in every possible and attainable situation during the pursuit. Such coalitions may not be random or drawn from a distribution, as in the case of mixed strategies. A strategy str_i dominates another strategy str'_i if and only if, for every potential combination of the other players' actions str_{-i},
\[
\mu_i(\mathrm{str}_i, \mathrm{str}_{-i}) \ge \mu_i(\mathrm{str}'_i, \mathrm{str}_{-i}), \tag{6}
\]
where μ is the function that returns the payoff obtained through the application of a specific strategy.
Consider the strategic game shown in Table 1, where the column player has three pure strategies and the row player has only two (a). The values shown in each cell represent the expected payoffs returned to the players if the corresponding strategies are selected. Playing Center is always better than playing Right for the column
Table 1: Application of the IEDS technique.

(a)
         Left    Center   Right
Up       5, 4    3, 8     1, 5
Down     6, 6    6, 0     -3, -1

(b) The same matrix, with the dominated column strategy Right marked for deletion.

(c) The matrix after eliminating Right, with the dominated row strategy Up marked for deletion.

(d) The matrix after eliminating Up, with the dominated column strategy Center marked for deletion.

Bold fonts in the original tables reflect how the dominated strategies are deleted.
player. Consequently, we can assume the column player will eventually stop playing Right, because it is a dominated strategy (b), so we can ignore the Right column after its elimination. Now the row player has a dominated strategy, Up; eventually the row player stops playing Up, and the Up row gets eliminated (c). Finally, we have two remaining choices, Down-Left and Down-Center, and the column player notices that it can only win by playing Left (d). So we can deduce that the IEDS solution is (Down, Left), with payoff (6, 6).
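The elimination sequence above can be checked mechanically. The following Python sketch is illustrative (it uses strict dominance, a slightly stronger test than the weak inequality in (6)) and recovers the (Down, Left) solution of Table 1:

```python
def iterated_elimination(rows, cols, payoff):
    """Iteratively delete strictly dominated strategies of a bimatrix game.

    payoff[(r, c)] = (row player's payoff, column player's payoff).
    Returns the surviving row and column strategies.
    """
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # A row r is strictly dominated if some r2 beats it against every column.
        for r in list(rows):
            if any(all(payoff[(r2, c)][0] > payoff[(r, c)][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # A column c is strictly dominated if some c2 beats it against every row.
        for c in list(cols):
            if any(all(payoff[(r, c2)][1] > payoff[(r, c)][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

# The Table 1 game.
payoff = {("Up", "Left"): (5, 4), ("Up", "Center"): (3, 8), ("Up", "Right"): (1, 5),
          ("Down", "Left"): (6, 6), ("Down", "Center"): (6, 0), ("Down", "Right"): (-3, -1)}
```

Running `iterated_elimination(["Up", "Down"], ["Left", "Center", "Right"], payoff)` reproduces the order of eliminations described in the text: Right first, then Up, then Center, leaving (Down, Left).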
5 Markov Decision Process Principles
Markov Decision Processes (MDPs) provide a mathematical framework to model decision making in situations where outcomes are partly random and partly under the control of a decision maker. In cooperative multiagent systems, an MDP allows the formalization of sequential decision problems. This process only models cooperative systems in which the reward function is shared by all players. An MDP is defined by the tuple ⟨N, S, A, T, R⟩ as follows:
N is the number of agents Ag_i in the system, i ∈ {1, ..., N}.
S corresponds to the set of agents' states s.
A = A_1 × A_2 × ⋯ × A_N defines the set of joint actions of the agents, where A_i is the set of local actions of the agent Ag_i.
T is the transition function; it returns the probability T(s, a, s') that the agent goes into the state s' if it runs the joint action a ∈ A from state s.
R defines the reward function; R(s, a, s') represents the reward obtained by the agent when it transits from the state s to the state s' by the execution of the action a.
5.1. Reward Function. In an MDP problem, the next states selected are those returning the maximum definitive reward. In our proposal, we have used heuristic functions in order to calculate the immediate reward of each state. The reward function defines the goals that the pursuers have to achieve and identifies the environmental obstacles. To calculate this function, we relied on the agents' environment position detailed in Section 3, which allows a fair distribution of the rewards over the environmental cells. The calculation of the reward in each state s concerned is effectuated as follows:
\[
R(s, a) =
\begin{cases}
\gamma & \text{if } E_i \subseteq s, \\
0 & \text{if } \mathrm{mp}(x) = 1, \\
\gamma - \mathrm{Val}(\mathrm{Dist}(\mathrm{CC}_P, \mathrm{CC}_E)) & \text{otherwise},
\end{cases} \tag{7}
\]
where γ is the maximum reward and Val(Dist(CC_P, CC_E)) represents the distance value.
Regarding the distribution of the rewards in the standard cells, we note that the reward function is inversely proportional to the distance function.
Figure 1 illustrates a part of our simulation environment detailed in Section 7. The values displayed in the different cells [V1, V2, V3] represent the gains generated by the reward function. The dynamic rewards will be awarded to any pursuer situated in the cell concerned during the pursuit:
V1 is the reward that could be obtained if the pursuer concerned tracks the first evader.
V2 is the reward that could be obtained if the pursuer concerned tracks the second evader.
V3 is the index of the cell (occupied or free).
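A minimal sketch of the reward rule (7) on a grid follows; the maximum reward γ = 100 and the choice of Val(Dist(·)) as the Euclidean distance itself are both assumptions made only for the example:

```python
import math

GAMMA = 100.0  # assumed maximum reward (gamma in eq. (7))

def reward(cell, evader_cell, is_obstacle):
    """Immediate reward of a cell per eq. (7): gamma on the evader's cell,
    0 on obstacle cells, gamma minus a distance value elsewhere."""
    if cell == evader_cell:          # the evader occupies this state
        return GAMMA
    if is_obstacle(cell):            # mp(x) = 1
        return 0.0
    dist = math.hypot(cell[0] - evader_cell[0], cell[1] - evader_cell[1])
    return GAMMA - dist              # Val(Dist(CC_P, CC_E)) taken as the distance
```

As in the text, the reward decreases as the pursuer's cell gets farther from the evader, so cells closer to the evader are preferred.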
5.2. Transition Function. The transition probabilities (ρ) describe the dynamism of the environment. They play the role of the next-state function in problem-solving search, knowing that every state could be the possible next state according to the action undertaken in the actual state. Our approach is developed in a grid-of-cells environment where each agent can move toward four different states: s_up, s_down, s_left, and s_right.
The transition probabilities of the pursuers are based on the reward degree, as shown in (8):
\[
\begin{aligned}
&\sum_{s'} \rho(s' \mid s, a) = 1, \\
&\rho(s' \mid s, a) = \frac{R(s', a)}{\gamma}, \\
&\rho(s' \mid s, a) = \max\bigl(\rho(s \mid s, a),\ \rho(s_{\mathrm{up}} \mid s, a),\ \rho(s_{\mathrm{down}} \mid s, a),\ \rho(s_{\mathrm{right}} \mid s, a),\ \rho(s_{\mathrm{left}} \mid s, a)\bigr), \quad \forall s, a.
\end{aligned} \tag{8}
\]
Figure 1: Reward function applied to the grid environment. Cells with a red frame: the selected states; blue agents: pursuers; green agents: evaders; black cells: cells containing obstacles.
The linkages between the evader and each pursuer shown in Figure 2 reflect the optimal trajectories provided by the application of the method proposed in this section during each different pursuit step.
6 Coalition Formation Algorithm Based on IEDS
A number of coalition formation algorithms have been developed to define which of the potential coalitions should actually be formed. To do so, they typically compute a value for each coalition, known as the coalition value, which provides an indication of the expected results that could be derived if this coalition is constituted. Then, having calculated all the coalitional values, the optimal coalition to form can be selected. We employ an iterative algorithm in order to determine the optimal coalitions of agents. It begins with a complete set of coalitions (agent-strategy combinations) and iteratively eliminates the coalitions that have lower contribution values to the MAS efficiency. The pseudocode of our algorithm is shown in Algorithm 1.
First, the algorithm calculates all the possible coalitions (Nbrcl) that the pursuers can form before their filtration as needed. The expected number of possible coalitions is calculated according to the following:
Nbrcl = n! / ((n - Re1)! · Re1!) × (n - Re1)! / ((n - (Re1 + Re2))! · Re2!) × ⋯ × (n - (Re1 + ⋯ + Re_{N-1}))! / ((n - (Re1 + ⋯ + Re_N))! · Re_N!)
      = ∏_{j=1}^{N} (n - Σ_{k=0}^{j-1} Re_k)! / ((n - Σ_{k=0}^{j} Re_k)! · Re_j!).   (9)
n is the number of pursuers in the environment, N is the number of evaders detected, and Re_0 = 0.
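Equation (9) is simply a product of binomial coefficients, so it can be evaluated directly. The following minimal Python sketch (function and variable names are ours, not the paper's) computes Nbrcl:

```python
from math import factorial

def nbr_coalitions(n, re):
    """Equation (9): number of ways to split pursuit groups of sizes
    re[0], ..., re[N-1] off from a pool of n pursuers."""
    total, remaining = 1, n
    for r in re:
        # binomial factor: choose r pursuers among those still unassigned
        total *= factorial(remaining) // (factorial(remaining - r) * factorial(r))
        remaining -= r
    return total

# Section 7 case study: 10 pursuers, two evaders of type Re = IV
print(nbr_coalitions(10, [4, 4]))  # -> 3150
```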
In order to distribute the calculation of the possible coalitions among the pursuers, the possible general coalitions
n: the number of pursuers
i = 0; k = 0
j: indicator of the chase iteration
Calculate the possible coalitions
While (C_life > 0) do
    Calculate the value of each coalition
    While (number of coalitions > 1) do
        Eliminate the dominated strategy of P_i
        i ← i mod n + 1
    end while
    Assign the pursuers' roles according to the selected coalition
    Chase iteration
end while
If (capture = true) then
    While (k ≤ n) do
        Update(Reward_Pk)
        k++
    end while
Else
    The guilty pursuers pay some fines
end if

Algorithm 1
(Ω) will be calculated. A general coalition enrolls all the pursuers required to capture the set of evaders detected:

Ω = n! / ((n - λ)! · λ!),  where λ = Re1 + Re2 + ⋯ + Re_N.   (10)

The general coalitions generated will be equitably distributed among the agents playing the role Pursuer. Specifically, each general coalition will be composed of N pursuit groups. From each general coalition generated through
Figure 2: Pursuers' behavior prediction after the transition function application.
the preceding calculation (equation (10)), a number of possible coalition formations (Φ) will be computed:

Φ = λ! / ((λ - Re1)! · Re1!) × (λ - Re1)! / ((λ - (Re1 + Re2))! · Re2!) × ⋯ × (λ - (Re1 + ⋯ + Re_{N-1}))! / ((λ - (Re1 + ⋯ + Re_N))! · Re_N!)
  = ∏_{j=1}^{N} (λ - Σ_{k=0}^{j-1} Re_k)! / ((λ - Σ_{k=0}^{j} Re_k)! · Re_j!).   (11)

Hence,

Nbrcl = Ω × Φ = n! / ((n - λ)! · λ!) × ∏_{j=1}^{N} (λ - Σ_{k=0}^{j-1} Re_k)! / ((λ - Σ_{k=0}^{j} Re_k)! · Re_j!).   (12)
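For the Section 7 case study (n = 10 pursuers, two evaders of type Re = IV), the counts in equations (10)-(12) can be checked numerically; this short sketch uses our own variable names:

```python
from math import comb, factorial

n, re = 10, [4, 4]                # Section 7 case study: Re1 = Re2 = IV
lam = sum(re)                     # lambda = Re1 + Re2 = 8
omega = comb(n, lam)              # equation (10): number of general coalitions
phi = factorial(lam) // (factorial(re[0]) * factorial(re[1]))  # equation (11)
nbrcl = omega * phi               # equation (12)
print(omega, phi, nbrcl)          # -> 45 70 3150
```

The total matches the distributed workload of Table 2: 5 × 350 + 5 × 280 = 3150 possible coalitions spread over the ten pursuers.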
This decentralized technique aims to balance the computation of the possible coalition formations among the pursuers. The method is further detailed in Section 7 via its application to the case study. Note that the value of each coalition generated, in relation to each pursuer it contains, is calculated according to (5). Each pursuer shares the coalitions calculated with the others to start the coalition selection process. Secondly, we apply the Iterated Elimination of Dominated Strategies principle with the aim of finding the optimal coalition through this process, knowing that each strategy is represented by a possible coalition formation. Alternately, each pursuer eliminates the coalition with the lowest value in relation to itself and sends the update to the next pursuer concerned. Pursuers are then assigned in accordance with the selected coalition, and each pursuer performs only one chase iteration. The algorithm repeats these instructions until the end of the chase life. When C_life = 0 and the captures are accomplished, rewards are attributed to each of the participating pursuers; the rewards are determined as follows:
Rewards_P = R(s, a) / L.   (13)
L is the number of the coalition's members. Otherwise, in the case of capture failure, the guilty pursuers must pay some fines to the rest of the coalition's members. These fines are calculated in the following manner:
γ = (s_0, a_1, s_1, a_2, s_2, …, s_h, a_h),  Fines = Σ_{i=w}^{h-1} R(s_i, a_{i+1}).   (14)
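The reward split of (13) and the fine of (14) amount to a few lines each. This is a sketch under our own naming, where step_rewards[i] stands for R(s_i, a_{i+1}) along the guilty pursuer's trajectory γ:

```python
def capture_reward(r_sa, L):
    """Equation (13): each of the L coalition members gets R(s, a) / L."""
    return r_sa / L

def fines(step_rewards, w):
    """Equation (14): sum of the rewards collected from the coalition's
    starting index w up to the end of the trajectory."""
    return sum(step_rewards[w:])

print(capture_reward(8, 4), fines([1, 2, 3], 1))  # -> 2.0 5
```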
Table 2: The distribution of the possible coalitions' computation.

Pursuers                        P1    P2    P3    P4    P5    P6    P7    P8    P9    P10
General coalitions               5     5     5     5     5     4     4     4     4     4
Possible coalitions generated  350   350   350   350   350   280   280   280   280   280
Figure 3: Flow chart of the algorithm (agents' localization → possible coalitions' calculation → value of coalitions' calculation → dominated strategy's elimination, repeated while Nbrcl > 1 → pursuers' assignment → chase iteration → capture test → rewards or fines, the whole loop repeated while C_life > 0).
γ is the set of states regarding the guilty pursuer, and 0 ≤ w ≤ h, where w represents the index of the coalition's beginning.

Figure 3 shows the flow chart of this pursuit algorithm, summarizing the different steps explained in this section, from the detection to the capture of the existing evaders.
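The selection loop at the core of Algorithm 1 (and of the flow chart in Figure 3) can be sketched as follows; the data layout and names are our assumptions, with values[i][c] standing for the value of candidate coalition c from pursuer P_i's point of view, as given by equation (5):

```python
def select_coalition(coalitions, values, n):
    """Round-robin iterated elimination: pursuers take turns deleting
    their worst-valued remaining candidate until one coalition is left."""
    alive = list(range(len(coalitions)))
    i = 0
    while len(alive) > 1:
        worst = min(alive, key=lambda c: values[i][c])
        alive.remove(worst)   # P_i eliminates its dominated strategy
        i = (i + 1) % n       # i <- i mod n + 1 in the paper's notation
    return coalitions[alive[0]]

print(select_coalition(["A", "B", "C"], [[3, 1, 2], [1, 3, 2]], 2))  # -> C
```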
7. Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells. The environment also contains some obstacles, characterized by their constancy and solidity. As regards the environmental agents, our simulations are based on ten (10) pursuers and two (2) evaders of type Re = IV; Figure 4 shows specifically how an evader of this type can be captured. Each agent is marked with an ID number. Both pursuers and evaders have the same speed (one cell per iteration) and an excellent communication system. The pursuers' teams are fully capable of determining their actual positions, and the evaders disappear after the capture is accomplished. Once the capture of an evader is performed, the coalition created for this pursuit is automatically dissolved.
Table 2 summarizes the results obtained after applying the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), the possible general coalitions (Ω) number 45, distributed over the existing pursuers as shown in Table 2. From each general coalition, a number of coalitions are generated (70 in this case) according to (11).

Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers in relation to the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which only one pursuer computes the possible coalitions, the decentralized method significantly decreases the computation time by dividing it over the number of existing pursuers.
In order to vary the types of coordination mechanisms used in our simulations, we compare this work with our recent pursuit-evasion research activity based on the AGR organizational model [6]. We also compare our results with those achieved after the application of an auction mechanism, illustrated in Case-C- [8]. Note that these two methods are based on decentralized coalition formation.

Case-A- is a pursuit based on the AGR organizational model [6].
Case-B- is our new approach based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case-C- is a pursuit based on an economical auction mechanism (MPMEGBTBA) [8].
The results shown in Figure 6 represent the average capturing time achieved during forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to highlight the difference between the cases, we take into consideration the iteration concept, which determines the number of state changes of each agent during the pursuits.

In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, due to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of the type Re = IV after the capture.
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers (x-axis: number of pursuers, 10-15; y-axis: number of possible coalitions).
coalition formation algorithm revealed an average capturing time of 78 iterations.

Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formations and reformations of the pursuit teams.

Finally, we have focused on the study of the average pursuers' rewards obtained at each chase iteration during a full pursuit. In Figure 8, the y-axis represents the value of the rewards achieved by a pursuer, and the x-axis represents the chase iterations. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time after (40) different pursuits (legend: Case-A-, Case-B-, Case-C-; x-axis: time in episodes; y-axis: average capturing time in iterations).
in which the average pursuer's rewards achieved reach 0.59 and 0.507, respectively. Otherwise, in IEDS, the average result increases to 0.88.

The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit, applied to the three cases. The positivity of the results is due to the grouping and to the equitable task sharing between the different pursuit groups imposed by the coordination mechanisms applied. Moreover, we can note the superiority of the results obtained through IEDS in relation to the other cases, caused by the
Figure 7: The pursuers' rewards development (legend: Case-A-, Case-B-, Case-C-; x-axis: time in iterations, 1-78; y-axis: pursuers' rewards development).
Figure 8: Average pursuers' reward per iteration (three panels, top to bottom: Case-C-, Case-B-, Case-A-; x-axis: time in iterations, 0-50; y-axis: average pursuers' rewards obtained).
Table 3: Pursuit results.

                                                    AGR       IEDS    MPMEGBTBA
Average capturing time (iterations)                 144.225   78      100.57
Average pursuers' rewards obtained per iteration    0.59      0.88    0.507
Average pursuers' self-confidence development       0.408     0.533   0.451
Figure 9: Pursuers' learning development during the pursuit (legend: Case-A-, Case-B-, Case-C-; x-axis: pursuit development in %; y-axis: pursuers' self-confidence development).
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.

Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding the reward development as well as the capturing time. The leading cause is the dynamism of our coalitional groups. This flexible mechanism improves the intelligence of the pursuers concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer undertakes the best path.
8. Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we derived our coalition algorithm from the Iterated Elimination of Dominated Strategies. This process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have used a Markov Decision Process as the motion strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61375081) and by a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512–516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225–230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Muller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214–230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1–13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075–1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235–244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804–4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, 2005.
[11] J. Thunberg, P. Ogren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506–4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165–172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482–486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272–2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452–1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155–2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107–110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413–420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154–159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347–1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57–65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215–1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478–485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709–715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541–558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, Evolutionary Game Theory Based Cooperation Algorithm in Multi-Agent System, Multiagent Systems, InTech, Rijeka, Croatia, 2009.
game when it is applied to finite-horizon POSGs. Otherwise, several types of coordination mechanisms are currently used, such as Stochastic Clustering Auctions (SCAs) [25, 26], which represent a class of cooperative auction methods based on the modified Swendsen-Wang method; it permits each robot to reconstitute the tasks that have been linked, and it applies to heterogeneous teams. Other mechanisms are market-based, such as TraderBots [27], applied to greedy agents in order to provide a detailed analysis of the requirements for robust and efficient multirobot coordination in dynamic environments. From the point of view of Game Theory, some research activities [28] investigated the optimal coordination approach for multiagent foraging; indeed, they built the equivalence between the optimal solution of the MAS and the equilibrium of the game in the same case, and then introduced the evolutionarily stable strategy to help resolve the equilibrium selection problem of traditional Game Theory.
3. Problem Description
In this section, we focus on the cooperation problem in which n pursuers situated in a bounded toroidal grid environment X have to capture m evaders of different types. The expressions P = {P1, …, Pn} and E = {E1, …, Em} represent the collections of n pursuers and m evaders, respectively. Pursuers and evaders represent the roles that the agents can play. Each evader is characterized by a type Re, with Re ∈ {I, II, III, IV}, indicating how many pursuers are required to capture it. Here we suppose that the pursuers can evaluate the evaders' types after localization. There exist some fixed obstacles with different shapes and sizes in the environment X. The occupancy of a position is given by the mapping mp: X → {0, 1}, such that for all x ∈ X, if mp(x) = 1, then x is an obstacle.
In our proposal, the strategies of each pursuer are guided by determining factors that reflect the individual development of the pursuer during the execution of the assigned tasks. These factors are detailed as follows.

Self-Confidence Degree. In multiagent systems, each agent must be able to execute the services requested by the other agents. The self-confidence degree is the assessment of the agent's success in relation to the assigned tasks. It is denoted and computed in the following way:
∀Conf ∈ [0.1, 1]: Conf = max(0.1, Cs / Ct).   (1)
Cs is the number of tasks that the agent has accomplished; Ct is the number of tasks in which the agent has participated.
The Credit. If the agent cannot perform a task, its credit is affected. The credit of an agent is designated and calculated as follows:

∀Credit ∈ [0, 1]: Credit = min(1, 1 - Cb / (Ct - Cs)).   (2)
Cb is the number of tasks abandoned by the agent.
Environment Position. The position of the agent in the environment is a crucial criterion for the pursuit sequences, because the capture is easier if the pursuer is closer to the evader. The position Pos is computed as follows:

Pos = Dist(S_P, S_E).   (3)
S_P is the state (cell) of the pursuer; S_E is the state (cell) of the evader; Dist is the distance between the pursuer and the evader:

Dist(S_P, S_E) = sqrt((CC_Pi - CC_Ei)^2 + (CC_Pj - CC_Ej)^2).   (4)

(CC_Pi, CC_Pj) are the Cartesian coordinates of the pursuer; (CC_Ei, CC_Ej) are the Cartesian coordinates of the evader.
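The three factors of equations (1), (2), and (4) can be written down directly; this is a minimal Python sketch with our own function names:

```python
from math import hypot

def self_confidence(cs, ct):
    """Equation (1): success rate over assigned tasks, floored at 0.1."""
    return max(0.1, cs / ct)

def credit(cb, ct, cs):
    """Equation (2): penalizes abandoned tasks, capped at 1."""
    return min(1.0, 1 - cb / (ct - cs))

def dist(p, e):
    """Equation (4): Euclidean distance between pursuer and evader cells."""
    return hypot(p[0] - e[0], p[1] - e[1])

print(self_confidence(3, 10), dist((0, 0), (3, 4)))  # -> 0.3 5.0
```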
In order to distinguish the different coalitions, each pursuer belonging to a coalition calculates the value returned to itself through this strategy. This computation is based on the factors characterizing the pursuers. For example, if a pursuer (P1) belongs to the coalition (Co), the value of this coalition in relation to this pursuer is calculated as follows:

Co(val_P1) = (Coef1 × Conf1 + Coef2 × Credit1 + Coef3 × Pos1) / Σ_{k=1}^{3} Coef_k + Σ_{i=2}^{Re} (Coef1 × Conf_i + Coef2 × Credit_i + Coef3 × Pos_i) / (Re × Σ_{k=1}^{3} Coef_k).   (5)
Coef_k is the coefficient of each factor. On the basis of these values, and using the IEDS method, our mechanism is able to select the optimal pursuit coalition for each evader detected, as detailed in Section 6.
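Equation (5) then combines the three factors of every coalition member; the sketch below (our own naming) evaluates a coalition from the viewpoint of its first listed member:

```python
def coalition_value(members, coef):
    """Equation (5): members is a list of (conf, credit, pos) triples with
    the evaluating pursuer first; coef holds (Coef1, Coef2, Coef3)."""
    s = sum(coef)
    def score(m):
        # weighted combination of one member's factors, normalized by sum(coef)
        return (coef[0] * m[0] + coef[1] * m[1] + coef[2] * m[2]) / s
    re = len(members)  # the coalition size equals the evader's type Re
    return score(members[0]) + sum(score(m) for m in members[1:]) / re

print(coalition_value([(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)], (1, 1, 1)))  # -> 1.5
```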
4. The Iterated Elimination of Dominated Strategies (IEDS)
A coalition is a set of pursuers required to capture the evaders detected. In a coalition, each pursuer must correspond to a specific strategy. In our proposal, a pure strategy s_i defines a specific pursuit group integration that the pursuer will follow in every possible and attainable situation during the pursuit. Such coalitions may not be random or drawn from a distribution, as in the case of mixed strategies. A strategy str_i dominates another strategy str'_i if and only if, for every potential combination of the other players' actions str_{-i},

μ_i(str_i, str_{-i}) ≥ μ_i(str'_i, str_{-i}),   (6)

where μ is the function that returns the results obtained through the application of a specific strategy.
Consider the strategic game shown in Table 1, where the column player has three pure strategies and the row player has only two (panel (a)). The values shown in each cell represent the expected payoffs returned to the players when the corresponding strategies are selected. Playing Center is always better than playing Right for the column
Table 1: Application of the IEDS technique.

         Left    Center    Right
Up       5, 4    3, 8      1, 5
Down     6, 6    6, 0      -3, -1

Panels (a)-(d) of the original repeat this payoff matrix; bold fonts reflect how the dominated strategies are deleted at each step.
player. Consequently, we can assume he will eventually stop playing Right, because it is a dominated strategy (panel (b)), so we can ignore the Right column after its elimination. Now the row player has a dominated strategy, Up; eventually the row player stops playing Up, and the Up row gets eliminated (panel (c)). Finally, two choices remain, (Down, Left) and (Down, Center), and the column player notices that it can only win by playing Left (panel (d)). We deduce that the IEDS solution is (Down, Left), with payoff (6, 6).
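The elimination sequence just described can be checked mechanically. The sketch below is our own encoding of Table 1 (using strict dominance, which suffices for this game) and reproduces the (Down, Left) solution:

```python
# Payoffs of Table 1: (row player, column player) for each strategy pair.
ROW = {("Up", "Left"): 5, ("Up", "Center"): 3, ("Up", "Right"): 1,
       ("Down", "Left"): 6, ("Down", "Center"): 6, ("Down", "Right"): -3}
COL = {("Up", "Left"): 4, ("Up", "Center"): 8, ("Up", "Right"): 5,
       ("Down", "Left"): 6, ("Down", "Center"): 0, ("Down", "Right"): -1}

def dominated(strats, others, payoff, is_row):
    """Return a strategy strictly dominated by another one, or None."""
    for s in strats:
        for t in strats:
            if t != s and all(
                payoff[(t, o) if is_row else (o, t)]
                > payoff[(s, o) if is_row else (o, s)]
                for o in others):
                return s
    return None

rows, cols = ["Up", "Down"], ["Left", "Center", "Right"]
while True:  # alternate eliminations until no dominated strategy remains
    d = dominated(rows, cols, ROW, True)
    if d is not None:
        rows.remove(d)
        continue
    d = dominated(cols, rows, COL, False)
    if d is not None:
        cols.remove(d)
        continue
    break

print(rows, cols)  # -> ['Down'] ['Left']
```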
5. Markov Decision Process Principles
Markov Decision Processes (MDPs) provide a mathematical framework to model decision making in situations where outcomes are partly random and partly under the control of a decision maker. In cooperative multiagent systems, an MDP allows the formalization of sequential decision problems. This process only models cooperative systems in which the reward function is shared by all players. An MDP is defined by the tuple ⟨N, S, A, T, R⟩ as follows:
N is the number of agents Ag_i in the system, i ∈ {1, …, N}.
S corresponds to the set of agents' states s.
A = A_1 × A_2 × ⋯ × A_N defines the set of joint actions of the agents, where A_i is the set of local actions of the agent Ag_i.
T is the transition function; it returns the probability T(s, a, s') that the agent goes into the state s' if it runs the joint action a ∈ A from state s.
R defines the reward function; R(s, a, s') represents the reward obtained by the agent when it transits from the state s to the state s' by the execution of the action a.
5.1. Reward Function. In an MDP problem, the next states selected are the states returning the maximum definitive reward. In our proposal, we have used heuristic functions in order to calculate the immediate reward of each state. The reward function defines the goals that the pursuers have to achieve and identifies the environmental obstacles. To calculate this function, we relied on the agents' environment position detailed in Section 3, which allows a fair distribution of the rewards over the environmental cells. The reward in each state s concerned is calculated as follows:
R(s, a) = γ, if E_i ⊆ s;
R(s, a) = 0, if mp(x) = 1;
R(s, a) = γ - Val(Dist(CC_P, CC_E)), otherwise.   (7)
120574 is the maximum reward Val(Dist(CC119875CC119864)) representsthe distance value
Regarding the distribution of the rewards in the standardcells we note that the reward function is inversely propor-tional to the distance function
Figure 1 illustrates a part of our simulation environment, detailed in Section 7. The values displayed in the different cells, [V_1, V_2, V_3], represent the gains generated by the reward function. These dynamic rewards are awarded to any pursuer situated in the cell concerned during the pursuit:

V_1 is the reward obtained if the pursuer concerned tracks the first evader.
V_2 is the reward obtained if the pursuer concerned tracks the second evader.
V_3 is the index of the cell (occupied or free).
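A minimal sketch of the reward rule in (7), under our own assumptions: γ = 100 as the maximum reward and a Chebyshev grid distance for Val(Dist(·)); the paper leaves both symbolic at this point.

```python
GAMMA = 100  # assumed maximum reward; the paper keeps gamma symbolic

def dist(pursuer_cell, evader_cell):
    # Chebyshev distance: one step per iteration in any allowed move direction
    return max(abs(pursuer_cell[0] - evader_cell[0]),
               abs(pursuer_cell[1] - evader_cell[1]))

def reward(pursuer_cell, evader_cell):
    d = dist(pursuer_cell, evader_cell)
    # (7): full reward on the evader's cell, otherwise gamma minus the distance
    return GAMMA if d == 0 else GAMMA - d
```

The reward then decreases by one per cell away from the evader, which matches the inverse proportionality to distance noted above.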
5.2. Transition Function. The transition probabilities ρ describe the dynamics of the environment. They play the role of the next-state function in a problem-solving search, knowing that every state could be the next state according to the action undertaken in the current state. Our approach is developed in a grid-of-cells environment, where each agent can move to four different states: s_up, s_down, s_left, and s_right. The transition probabilities of the pursuers are based on the reward values, as shown in (8):
\[
\sum_{s'} \rho(s' \mid s, a) = 1, \qquad
\rho(s' \mid s, a) = \frac{R(s', a)}{\gamma},
\]
\[
\rho(s' \mid s, a) = \max\bigl(\rho(s \mid s, a),\ \rho(s_{\text{up}} \mid s, a),\ \rho(s_{\text{down}} \mid s, a),\ \rho(s_{\text{right}} \mid s, a),\ \rho(s_{\text{left}} \mid s, a)\bigr) \quad \forall s, a.
\tag{8}
\]
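Read this way, (8) can be sketched as follows: each candidate successor is scored by R(s′, a)/γ, the scores are normalized so they satisfy the sum-to-one constraint, and the successor with the maximal probability is chosen. The normalization step is our reading of the first constraint, and γ = 100 and the cell rewards are illustrative values, not taken from the paper.

```python
GAMMA = 100  # assumed maximum reward, as in the sketch of (7)

def transition_probs(successor_rewards):
    """successor_rewards: {state_label: R(s', a)} for the candidate moves.
    Scores follow rho = R(s', a) / gamma, then are normalized to sum to 1."""
    raw = {s: r / GAMMA for s, r in successor_rewards.items()}
    total = sum(raw.values())
    return {s: p / total for s, p in raw.items()}

def pick_next_state(successor_rewards):
    # The max rule of (8): move toward the most rewarding neighboring cell.
    probs = transition_probs(successor_rewards)
    return max(probs, key=probs.get)

moves = {"stay": 95, "up": 97, "down": 93, "left": 94, "right": 96}
print(pick_next_state(moves))  # up
```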
Mathematical Problems in Engineering 5
Figure 1: Reward function applied to the grid environment. Cells with a red frame: the selected states; blue agents: pursuers; green agents: evaders; black cells: cells containing obstacles.
The links between the evader and each pursuer shown in Figure 2 reflect the optimal trajectories provided by the application of the method proposed in this section during each pursuit step.
6. Coalition Formation Algorithm Based on IEDS
A number of coalition formation algorithms have been developed to decide which of the potential coalitions should actually be formed. To do so, they typically compute a value for each coalition, known as the coalition value, which indicates the expected results that could be derived if this coalition were constituted. Then, having calculated all the coalition values, the optimal coalition to form can be selected. We employ an iterative algorithm to determine the optimal coalitions of agents: it begins with a complete set of coalitions (agent-strategy combinations) and iteratively eliminates the coalitions that contribute the least to the efficiency of the MAS. The pseudocode of our algorithm is shown in Algorithm 1.

First, the algorithm calculates all the possible coalitions (Nbr_cl) that the pursuers can form, before filtering them as needed. The expected number of possible coalitions is calculated according to the following formula:
\[
\mathrm{Nbr}_{cl} = \frac{n!}{(n - \mathrm{Re}_1)!\,\mathrm{Re}_1!} \times \frac{(n - \mathrm{Re}_1)!}{(n - (\mathrm{Re}_1 + \mathrm{Re}_2))!\,\mathrm{Re}_2!} \times \cdots \times \frac{(n - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_{N-1}))!}{(n - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_N))!\,\mathrm{Re}_N!} = \prod_{j=1}^{N} \frac{\bigl(n - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(n - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!}.
\tag{9}
\]
n is the number of pursuers in the environment, N is the number of evaders detected, and Re_0 = 0.
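Equation (9) is a product of binomial coefficients: each evader's pursuit group of size Re_j is chosen from the pursuers not yet assigned. A worked check, using the Section 7 setup (n = 10 pursuers and two evaders with Re_1 = Re_2 = 4) as our example:

```python
from math import comb

def nbr_cl(n, re):
    """Number of possible coalitions per (9): choose Re_1 pursuers out of n,
    then Re_2 out of the remainder, and so on for each detected evader."""
    total, remaining = 1, n
    for r in re:
        total *= comb(remaining, r)   # remaining! / ((remaining - r)! r!)
        remaining -= r
    return total

print(nbr_cl(10, [4, 4]))  # C(10,4) * C(6,4) = 210 * 15 = 3150
```

This total also matches Table 2 in Section 7, where the per-pursuer workloads (5 × 350 + 5 × 280) sum to 3150.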
In order to distribute the calculation of the possible coalitions among the pursuers, the possible general coalitions
n: the number of pursuers
i = 0; k = 0; j = indicator of the chase iteration
Calculate the possible coalitions
while (C_life > 0) do
    Calculate the value of each coalition
    while (number of coalitions > 1) do
        Eliminate the dominated strategy of P_i
        i <- i mod n + 1
    end while
    Assign the pursuers' roles according to the selected coalition
    Chase iteration
end while
if (capture = true) then
    while (k <= n) do
        Update(Reward_{P_k})
        k++
    end while
else
    The guilty pursuers pay some fines
end if

Algorithm 1
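The round-robin elimination loop at the heart of Algorithm 1 can be sketched as follows; the coalition-value function and all names here are our assumptions (the paper computes the actual values via its equation (5)):

```python
def select_coalition(coalitions, value, n_pursuers):
    """Round-robin IEDS over candidate coalitions: pursuer i deletes the
    coalition with the lowest value for itself, then hands over to the next
    pursuer, until a single coalition survives. `value(c, i)` is assumed."""
    remaining = list(coalitions)
    i = 0
    while len(remaining) > 1:
        worst = min(remaining, key=lambda c: value(c, i))
        remaining.remove(worst)       # eliminate P_i's dominated strategy
        i = (i + 1) % n_pursuers      # i <- i mod n + 1, written 0-based
    return remaining[0]

# Hypothetical values: vals[c][i] is coalition c's value for pursuer i.
vals = {"c1": [3, 2], "c2": [1, 5], "c3": [2, 4]}
best = select_coalition(["c1", "c2", "c3"], lambda c, i: vals[c][i], 2)
print(best)  # c3
```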
(Ω) will be calculated. A general coalition enrolls all the pursuers required to capture the set of evaders detected:

\[
\Omega = \frac{n!}{(n - \lambda)!\,\lambda!}, \qquad \lambda = \mathrm{Re}_1 + \mathrm{Re}_2 + \cdots + \mathrm{Re}_N.
\tag{10}
\]

The general coalitions generated will be equitably distributed among the agents playing the role Pursuer. Specifically, each general coalition will be composed of N pursuit groups. From each general coalition generated through
Figure 2: Prediction of the pursuers' behaviors after the application of the transition function.
the preceding calculation, equation (10), a number of possible coalition formations (denoted F here) will be computed:

\[
F = \frac{\lambda!}{(\lambda - \mathrm{Re}_1)!\,\mathrm{Re}_1!} \times \frac{(\lambda - \mathrm{Re}_1)!}{(\lambda - (\mathrm{Re}_1 + \mathrm{Re}_2))!\,\mathrm{Re}_2!} \times \cdots \times \frac{(\lambda - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_{N-1}))!}{(\lambda - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_N))!\,\mathrm{Re}_N!} = \prod_{j=1}^{N} \frac{\bigl(\lambda - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(\lambda - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!},
\tag{11}
\]

\[
\mathrm{Nbr}_{cl} = \Omega \times F = \frac{n!}{(n - \lambda)!\,\lambda!} \times \prod_{j=1}^{N} \frac{\bigl(\lambda - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(\lambda - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!}.
\tag{12}
\]
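Cross-checking (10)-(12) on the Section 7 case study (n = 10, Re_1 = Re_2 = 4, hence λ = 8), under our reading of the counts as binomial coefficients:

```python
from math import comb

n, re = 10, [4, 4]
lam = sum(re)                      # lambda = Re_1 + Re_2 = 8

omega = comb(n, lam)               # (10): n! / ((n - lambda)! lambda!)

formations, remaining = 1, lam     # (11): formations inside a general coalition
for r in re:
    formations *= comb(remaining, r)
    remaining -= r

print(omega, formations, omega * formations)  # 45 70 3150
```

The values 45 and 70 are exactly the counts reported in Section 7, and their product equals the total number of possible coalitions of (9).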
This decentralized technique aims to balance the computation of the possible coalition formations among the pursuers. The method is further detailed in Section 7 through its application to the case study. Note that the value of each coalition generated, in relation to each pursuer it contains, is calculated according to (5). Each pursuer shares the coalitions calculated with the others to start the coalition selection process. Secondly, we apply the Iterated Elimination of Dominated Strategies principle with the aim of finding the optimal coalition through this process, knowing that each strategy is represented by a possible coalition formation. Alternately, each pursuer eliminates the coalition with the lowest value in relation to itself and sends the update to the next pursuer concerned. Pursuers are then assigned in accordance with the selected coalition, and each pursuer performs only one chase iteration. The algorithm repeats these instructions until the end of the chase life. When C_life = 0 and the captures are accomplished, rewards are attributed to each of the participating pursuers, determined as follows:
\[
\mathrm{Rewards}_P = \frac{R(s, a)}{L}.
\tag{13}
\]
L is the number of the coalition's members. Otherwise, in the case of capture failure, the guilty pursuers must pay fines to the rest of the coalition's members. These fines are calculated in the following manner:

\[
\gamma = (s_0, a_1, s_1, a_2, s_2, \ldots, s_h, a_h), \qquad \mathrm{Fines} = \sum_{i=w}^{h-1} R(s_i, a_{i+1}).
\tag{14}
\]
Table 2: The distribution of the possible coalitions' computation.

Pursuers                        P1   P2   P3   P4   P5   P6   P7   P8   P9   P10
General coalitions               5    5    5    5    5    4    4    4    4    4
Possible coalitions generated  350  350  350  350  350  280  280  280  280  280
Figure 3: Flow chart of the algorithm. The chart chains the following steps: agents' localization; calculation of the possible coalitions; calculation of the coalitions' values; elimination of the dominated strategies (repeated while Nbr_cl > 1); pursuers' assignment; chase iteration; and the capture test, leading to rewards on success or fines on failure; the whole loop is repeated until C_life = 0.
γ is the set of states regarding the guilty pursuer, and 0 ≤ w ≤ h, where w represents the index of the coalition's beginning.
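The bookkeeping in (13) and (14) can be sketched with hypothetical numbers (function names and values are ours):

```python
def capture_reward(r_sa, members):
    # (13): the state reward R(s, a) is split among the L coalition members
    return r_sa / members

def fines(step_rewards, w):
    # (14): step_rewards[i] = R(s_i, a_{i+1}) along the guilty pursuer's
    # history; fines are summed from the coalition's start index w to h-1
    return sum(step_rewards[w:])

print(capture_reward(100, 4))  # 25.0
print(fines([5, 3, 2, 1], 1))  # 6
```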
Figure 3 shows the flow chart of this pursuit algorithm, summarizing the different steps explained in this section, from the detection to the capture of the existing evaders.
7. Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells, which also contains some obstacles, characterized by their constancy and solidity. As regards the agents in the environment, our simulations involve ten (10) pursuers and two (02) evaders of type Re = IV; Figure 4 details specifically how an evader of this type can be captured. Each agent is marked with an ID number. Both pursuers and evaders have the same speed (one cell per iteration) and an excellent communication system. The pursuers' teams are fully capable of determining their actual positions, and the evaders disappear once their capture is accomplished. When the capture of an evader is performed, the coalition created to carry out this pursuit is automatically dissolved.
Table 2 summarizes the results obtained after applying the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), there are 45 possible general coalitions (Ω), which are distributed among the existing pursuers as shown in Table 2. From each general coalition, 70 possible coalitions are generated according to (11).
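The Table 2 workloads follow from distributing the 45 general coalitions as evenly as possible over the 10 pursuers, each general coalition contributing 70 formations; a quick check under that reading (the even-split rule is our assumption):

```python
def distribute(total, workers):
    # Split `total` items as evenly as possible among `workers`
    base, extra = divmod(total, workers)
    return [base + 1 if i < extra else base for i in range(workers)]

general = distribute(45, 10)             # [5, 5, 5, 5, 5, 4, 4, 4, 4, 4]
per_pursuer = [g * 70 for g in general]  # [350, ..., 350, 280, ..., 280]
```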
Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers as a function of the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which a single pursuer computes all the possible coalitions, the decentralized method significantly decreases the computation time by dividing the work among the existing pursuers.
In order to vary the types of coordination mechanisms used in our simulations, we compare this work with our recent pursuit-evasion research activity based on the AGR organizational model [6], and with the results achieved through the application of an auction mechanism, illustrated in Case C [8]. Note that these two methods are based on decentralized coalition formation.

Case A is a pursuit based on the AGR organizational model [6].
Case B is our new approach based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case C is a pursuit based on an economical auction mechanism (MPMEGBTBA) [8].
The results shown in Figure 6 represent the average capturing time achieved during forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to show the difference between the cases, we take into consideration the iteration concept, which determines the number of state changes of each agent during the pursuits.

In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, due to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of type Re = IV after the capture.
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers.
coalition formation algorithm revealed an average capturing time of 78 iterations.

Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formation and reformation of the pursuit teams.

Finally, we have focused on the study of the average pursuers' rewards obtained per chase iteration during a full pursuit. In Figure 8, the y-axis represents the reward value achieved by a pursuer, and the x-axis represents the chase iterations. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time over forty (40) different pursuits.
in which the average pursuer's rewards achieved reach 0.59 and 0.507, respectively. Otherwise, in IEDS, the average result increases to 0.88.

The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit, applied to the three cases. The positivity of the results is due to the grouping and the equitable task sharing between the different pursuit groups imposed by the coordination mechanisms applied. Moreover, we can note the superiority of the results obtained through IEDS in relation to the other cases, provoked by the
Figure 7 The pursuersrsquo rewards development
Figure 8: Average pursuers' reward per iteration.
Table 3: Pursuit results.

                                                   AGR     IEDS   MPMEGBTBA
Average capturing time (iterations)              144.225    78     100.57
Average pursuers' rewards obtained per iteration   0.59    0.88     0.507
Average pursuers' self-confidence development      0.408   0.533    0.451
Figure 9 Pursuersrsquo learning development during the pursuit
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.

Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding the rewards' development as well as the capturing time. The leading cause of this fact is the dynamism of our coalitional groups: this flexible mechanism improves the intelligence of the pursuers concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer undertakes the best path.
8. Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we have built our coalition algorithm on the Iterated Elimination of Dominated Strategies; this process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have used the Markov Decision Process as the motion
strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512–516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225–230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Muller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214–230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1–13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075–1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235–244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804–4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, 2005.
[11] J. Thunberg, P. Ogren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506–4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165–172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482–486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272–2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452–1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155–2162, RiverCentre, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107–110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413–420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154–159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347–1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57–65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215–1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478–485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709–715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541–558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, Evolutionary Game Theory Based Cooperation Algorithm in Multi-Agent System, Multiagent Systems, InTech, Rijeka, Croatia, 2009.
Markov Decision Processes (MDPs) provide a mathematicalframework tomodel decisionmaking in situationswhere out-comes are somewhat random and partially under the controlof a decisionmaker In cooperative multiagent systems MDPallows the formalization of sequential decision problemsThisprocess only models the cooperative systems in which thereward function is shared by all players MDP is defined by⟨119873 119878 119860 119879 119877⟩ as follows
119873 is the number of agents Ag119894 in the system 119894 isin1 119873119878 corresponds to the set of agentsrsquo states 119904119860 119860 = 1198601 times 1198602 times sdot sdot sdot times 119860119873 defines the set of jointactions of the agents where 119860 119894 is the set of localactions of the agent Ag119894
119879 is the transition function It returns the probability119879(119904 119886 1199041015840)meaning that the agent goes into the state 1199041015840if it runs the joint action 119886 isin 119860 from state 119904119877 defines the reward function 119877(119904 119886 1199041015840) representsthe reward obtained by the agent when it transits from
the state 119904 to the state 1199041015840 by the execution of the action119886
51 Reward Function In MDP problem the next statesselected are the states returning maximum definitive rewardIn our proposal we have used Heuristic functions in orderto calculate the immediate reward of each state The rewardfunction defines the goals that the pursuers have to achieveand identifies the environmental obstacles To calculate thisfunction we relied on the agentsrsquo environment positiondetailed in Section 3 which allows the distribution of therewards on the environmental cells fairly The calculation ofthe rewards in each state 119904 concerned is effectuated as follows
119877 (119904 119886) =
120574 if 119864119894 sube 1199040 mp (119909) = 1120574 minus Val (Dist (CC119875CC119864)) else
(7)
120574 is the maximum reward Val(Dist(CC119875CC119864)) representsthe distance value
Regarding the distribution of the rewards in the standardcells we note that the reward function is inversely propor-tional to the distance function
Figure 1 illustrates a part of our simulation environmentdetailed in Section 7 The values displayed in the differentcells [1198811 1198812 1198813] represent the gains generated by the rewardfunction The dynamic rewards will be awarded to anypursuer situated in the cell concerned during the pursuit
1198811 the reward could be obtained if the pursuerconcerned tracks the first evader1198812 the reward could be obtained if the pursuerconcerned tracks the second evader1198813 is the index of the cell (occupied or free)
52 Transition Function The transition probabilities (120588)describe the dynamism of the environment They play therole of the next-state function in a problem-solving searchknowing that every state could be the possible next stateaccording to the action undertaken in the actual state Ourapproach is developed in grid of cells environment whereeach agent can move in four different states 119904up 119904down 119904leftand 119904right
The transition probabilities of the pursuers are based onthe reward degree as shown
sum1199041015840
120588 (1199041015840 | 119904 119886) = 1
120588 (1199041015840 | 119904 119886) = 119877 (1199041015840 119886)120574
120588 (1199041015840 | 119904 119886) = max (120588 (119904 | 119904 119886) 120588 (119904up | 119904 119886) 120588 (119904down | 119904 119886) 120588 (119904right | 119904 119886) 120588 (119904left | 119904 119886))
forall119904 119886
(8)
Mathematical Problems in Engineering 5
[41 97 0]
[42 96 0]
[43 95 0]
[44 94 0]
[40 98 1]
[41 97 0]
[42 96 0]
[43 95 0]
[39 97 0]
[40 96 0]
[41 95 0]
[42 94 1]
[38 96 0]
[39 95 0]
[41 93 0]
[37 95 0]
[38 94 0]
[39 93 0]
[40 92 0]
[36 94 0]
[37 93 0]
[38 92 0]
[39 91 0]
[35 93 0]
[36 92 0]
[38 90 0]
[34 92 0]
[35 91 0]
[36 90 0]
[37 89 0]
[34 90 0]
[35 89 0]
[36 88 0]
[33 91 1]
Figure 1 Reward function applied to the grid environment Cells with a red frame the selected states blue agents pursuers green agentsevaders black cells cells containing obstacles
The linkages between the evader and each pursuer shownin Figure 2 reflect the optimal trajectories provided by theapplication of the method proposed in this section duringeach different pursuit step
6 Coalition Formation AlgorithmBased on IEDS
A number of coalition formation algorithms have beendeveloped to define which of the potential coalitions shouldactually be formed To do so they typically compute avalue for each coalition known as the coalition value whichprovides an indication of the expected results that could bederived if this coalition is constitutedThen having calculatedall the coalitional values the decision about the optimalcoalition to form can be selected We employ an iterativealgorithm in order to determine the optimal coalitions ofagents It begins with a complete set of coalitions (agent-strategy combinations) and iteratively eliminates the coali-tions that have lower contribution values to MAS efficiencyThe pseudocode of our algorithm is shown in Algorithm 1
First, the algorithm calculates all the possible coalitions (Nbrcl) that the pursuers can form, before filtering them as needed. The expected number of possible coalitions is calculated according to the following:
$$\mathrm{Nbrcl} = \frac{n!}{(n - \mathrm{Re}_1)!\,\mathrm{Re}_1!} \times \frac{(n - \mathrm{Re}_1)!}{(n - (\mathrm{Re}_1 + \mathrm{Re}_2))!\,\mathrm{Re}_2!} \times \cdots \times \frac{(n - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_{N-1}))!}{(n - (\mathrm{Re}_1 + \mathrm{Re}_2 + \cdots + \mathrm{Re}_N))!\,\mathrm{Re}_N!} = \prod_{j=1}^{N} \frac{\bigl(n - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(n - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!} \tag{9}$$
where $n$ is the number of pursuers in the environment, $N$ is the number of evaders detected, and $\mathrm{Re}_0 = 0$.
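Equation (9) is a product of binomial coefficients, one per evader. A minimal sketch of its computation follows; the parameters used in the example mirror the paper's case study (10 pursuers and two evaders, each requiring Re = 4 pursuers), and the function name is ours.

```python
from math import factorial

# Sketch of equation (9): the number of possible coalitions Nbrcl for n
# pursuers, where re[j] pursuers are required to capture evader j (Re_0 = 0).

def nbrcl(n, re):
    """Product over evaders j of
    (n - sum Re_0..Re_{j-1})! / ((n - sum Re_0..Re_j)! * Re_j!)."""
    total, used = 1, 0
    for r in re:
        total *= factorial(n - used) // (factorial(n - used - r) * factorial(r))
        used += r
    return total

# Case study assumed from the paper: n = 10, N = 2, Re_1 = Re_2 = 4.
print(nbrcl(10, [4, 4]))  # C(10,4) * C(6,4) = 210 * 15 = 3150
```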
In order to distribute the calculation of the possible coalitions among the pursuers, the possible general coalitions
n: the number of pursuers
i = 0; k = 0
j: indicator of the chase iteration
Calculate the possible coalitions
while (C_life > 0) do
    Calculate the value of each coalition
    while (number of coalitions > 1) do
        Eliminate the dominated strategy of P_i
        i ← (i mod n) + 1
    end while
    Assign the pursuers' roles according to the selected coalition
    Chase iteration
end while
if (capture = true) then
    while (k ≤ n) do
        Update(Reward_{P_k})
        k++
    end while
else
    The guilty pursuers pay some fines
end if

Algorithm 1
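The inner elimination loop of Algorithm 1 can be sketched as follows. The coalition names and the value table are hypothetical, and the sketch uses 0-based pursuer indices in place of the algorithm's i ← (i mod n) + 1; it is an illustration of the round-robin IEDS step, not the authors' implementation.

```python
# Sketch of the round-robin Iterated Elimination of Dominated Strategies step:
# pursuers take turns removing the coalition that is worst from their own
# point of view until a single coalition survives.

def ieds_select(coalitions, value, n_pursuers):
    """coalitions: candidate coalition formations (the strategies);
    value(c, p): the value of coalition c in relation to pursuer p."""
    remaining = list(coalitions)
    i = 0  # index of the pursuer whose turn it is
    while len(remaining) > 1:
        # Pursuer i eliminates its dominated strategy (lowest value for itself).
        worst = min(remaining, key=lambda c: value(c, i))
        remaining.remove(worst)
        i = (i + 1) % n_pursuers  # pass the turn to the next pursuer
    return remaining[0]

# Hypothetical toy example: 3 candidate coalitions, 2 pursuers.
values = {("A", 0): 3, ("A", 1): 2,
          ("B", 0): 1, ("B", 1): 5,
          ("C", 0): 2, ("C", 1): 1}
print(ieds_select(["A", "B", "C"], lambda c, p: values[(c, p)], 2))  # -> "A"
```

In the toy run, pursuer 0 first removes "B" (its lowest value, 1), then pursuer 1 removes "C" (value 1 for it), leaving "A" as the selected coalition.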
(Ω) will be calculated. A general coalition enrolls all the pursuers required to capture the set of detected evaders:
$$\Omega = \frac{n!}{(n - \lambda)!\,\lambda!} \tag{10}$$
where $\lambda = (\mathrm{Re}_1 + \mathrm{Re}_2 + \cdots + \mathrm{Re}_N)$. The general coalitions generated will be equitably distributed among the agents playing the role Pursuer. Specifically, each general coalition will be composed of $N$ pursuit groups. From each general coalition generated through
Figure 2: Pursuers' behavior prediction after the transition function application.
the precedent calculation (equation (10)), a number of possible coalition formations ($\Psi$) will be computed:
$$\Psi = \frac{\lambda!}{(\lambda - \mathrm{Re}_1)!\,\mathrm{Re}_1!} \times \frac{(\lambda - \mathrm{Re}_1)!}{(\lambda - (\mathrm{Re}_1 + \mathrm{Re}_2))!\,\mathrm{Re}_2!} \times \cdots \times \frac{(\lambda - (\mathrm{Re}_1 + \cdots + \mathrm{Re}_{N-1}))!}{(\lambda - (\mathrm{Re}_1 + \mathrm{Re}_2 + \cdots + \mathrm{Re}_N))!\,\mathrm{Re}_N!} = \prod_{j=1}^{N} \frac{\bigl(\lambda - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(\lambda - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!} \tag{11}$$
$$\mathrm{Nbrcl} = \Omega \times \Psi = \frac{n!}{(n - \lambda)!\,\lambda!} \times \prod_{j=1}^{N} \frac{\bigl(\lambda - \sum_{k=0}^{j-1} \mathrm{Re}_k\bigr)!}{\bigl(\lambda - \sum_{k=0}^{j} \mathrm{Re}_k\bigr)!\,\mathrm{Re}_j!} \tag{12}$$
This decentralized technique aims to balance the computation of the possible coalition formations among the pursuers. The method is further detailed in Section 7 via its application to the case study. Note that the value of each coalition generated, in relation to each pursuer it contains, is calculated according to (5). Each pursuer shares the coalitions it has calculated with the others to start the coalition selection process. Secondly, we apply the Iterated Elimination of Dominated Strategies principle with the aim of finding the optimal coalition, where each strategy is represented by a possible coalition formation. Alternately, each pursuer eliminates the coalition with the lowest value in relation to itself and sends the update to the next pursuer concerned. Pursuers are assigned in accordance with the selected coalition, and each pursuer performs only one chase iteration. The algorithm repeats these instructions until the end of the chase life. When $C_{\mathrm{life}} = 0$ and the captures are accomplished, rewards are attributed to each participating pursuer, determined as follows:
$$\mathrm{Rewards}_p = \frac{R(s, a)}{L} \tag{13}$$
where $L$ is the number of the coalition's members. Otherwise, in the case of capture failure, the guilty pursuers must pay fines to the rest of the coalition's members. These fines are calculated in the following manner:
$$\gamma = (s_0, a_1, s_1, a_2, s_2, \ldots, s_h, a_h), \qquad \mathrm{Fines} = \sum_{i=w}^{h-1} R(s_i, a_{i+1}) \tag{14}$$
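A minimal sketch of the reward and fine rules of equations (13) and (14); the trajectory of per-step rewards and the function names are illustrative assumptions.

```python
# Sketch of equations (13)-(14): reward splitting on capture, and the fine
# a guilty pursuer pays on capture failure.

def member_reward(R_sa, L):
    """Equation (13): the capture reward R(s,a) divided among L coalition members."""
    return R_sa / L

def fines(step_rewards, w):
    """Equation (14): the sum of rewards R(s_i, a_{i+1}) accumulated from step w
    (the beginning of the coalition) up to step h-1; step_rewards[i] stands for
    R(s_i, a_{i+1})."""
    return sum(step_rewards[w:])

print(member_reward(12.0, 4))          # each of 4 members receives 3.0
print(fines([1.0, 2.0, 0.5, 1.5], 1))  # 2.0 + 0.5 + 1.5 = 4.0
```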
Table 2: The distribution of the possible coalitions' computation.

Pursuers                       P1    P2    P3    P4    P5    P6    P7    P8    P9    P10
General coalitions             5     5     5     5     5     4     4     4     4     4
Possible coalitions generated  350   350   350   350   350   280   280   280   280   280
[Flow chart: Agents' localization → Possible coalitions' calculation → Value of coalitions' calculation → Dominated strategy's elimination (repeated while Nbrcl > 1) → Pursuers' assignment → Chase iteration → Capture? (Yes: Rewards; No: Fines); the whole cycle repeats until C_life = 0.]

Figure 3: Flow chart of the algorithm.
where $\gamma$ is the sequence of states regarding the guilty pursuer, and $0 \le w \le h$, where $w$ represents the index of the coalition's beginning.
Figure 3 shows the flow chart of this pursuit algorithm, summarizing the different steps explained in this section, from the detection to the capture of the existing evaders.
7. Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells. The environment also contains some obstacles, characterized by their constancy and solidity. As regards the environmental agents, our simulations are based on ten (10) pursuers and two (2) evaders of type Re = IV; Figure 4 details how an evader of this type is captured. Each agent is marked with an ID number. Both pursuers and evaders have the same speed (one cell per iteration) and an excellent communication system. The pursuers' teams are fully capable of determining their actual positions, and the evaders disappear once their capture is accomplished. When the capture of an evader is performed, the coalition created to carry out this pursuit is automatically dissolved.
Table 2 summarizes the results obtained after applying the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), the number of possible general coalitions (Ω) is equal to 45, distributed among the existing pursuers as shown in Table 2. From each general coalition, a number of coalitions ($\Psi$ = 70) is generated according to (11).
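The case-study numbers can be cross-checked directly from equations (10) and (11). The sketch below assumes the paper's parameters (n = 10 pursuers, N = 2 evaders, Re_1 = Re_2 = 4); the share-splitting rule for the equitable distribution is our assumption, chosen to reproduce Table 2.

```python
from math import factorial

# Cross-check of the Table 2 case study: n = 10 pursuers, lambda = Re_1 + Re_2 = 8.
n, re = 10, [4, 4]
lam = sum(re)

# Equation (10): number of general coalitions.
omega = factorial(n) // (factorial(n - lam) * factorial(lam))

# Equation (11): coalition formations per general coalition.
psi, used = 1, 0
for r in re:
    psi *= factorial(lam - used) // (factorial(lam - used - r) * factorial(r))
    used += r

print(omega, psi, omega * psi)  # 45 70 3150

# Assumed equitable split of the 45 general coalitions over the 10 pursuers.
shares = [omega // n + (1 if i < omega % n else 0) for i in range(n)]
print(shares)                  # [5, 5, 5, 5, 5, 4, 4, 4, 4, 4], as in Table 2
print([s * psi for s in shares])  # 350 coalitions for P1..P5, 280 for P6..P10
```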
Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers in relation to the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which a single pursuer computes all the possible coalitions, the decentralized method significantly decreases the computation time by dividing the work over the number of existing pursuers.
In order to vary the types of coordination mechanisms used in our simulations, we compare this work with our recent pursuit-evasion research based on the AGR organizational model [6]. We also compare our results with those achieved by the auction mechanism illustrated in Case-C [8]. Note that these two methods are based on decentralized coalition formation:
Case-A: a pursuit based on the AGR organizational model [6].
Case-B: our new approach based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case-C: a pursuit based on an economical auction mechanism (MPMEGBTBA) [8].
The results shown in Figure 6 represent the average capturing time achieved during forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to showcase the difference between the cases, we take into consideration the iteration concept, which counts the number of state changes of each agent during the pursuits.
In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, due to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of type Re = IV after the capture.
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers.
coalition formation algorithm revealed an average capturing time of 78 iterations.
Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formation and reformation of the pursuit teams.
Finally, we have focused on the average pursuers' rewards obtained at each chase iteration during a full pursuit. In Figure 8, the y-axis represents the value of rewards achieved by a pursuer, and the x-axis represents the chase iterations. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time over (40) different pursuits.
in which the average pursuer's rewards reach 0.59 and 0.507, respectively. Otherwise, with IEDS, the average result increases to 0.88.
The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit, for the three cases. The positivity of the results is due to the grouping and the equitable task sharing between the different pursuit groups imposed by the coordination mechanisms applied. Moreover, we can note the superiority of the results obtained through IEDS in relation to the other cases, provoked by the
Figure 7: The pursuers' rewards development.
Figure 8: Average pursuers' reward per iteration.
Table 3: Pursuit results.

                                                  AGR       IEDS    MPMEGBTBA
Average capturing time (iterations)               144.225   78      100.57
Average pursuers' rewards obtained per iteration  0.59      0.88    0.507
Average pursuers' self-confidence development     0.408     0.533   0.451
Figure 9: Pursuers' learning development during the pursuit.
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.

Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding the reward development as well as the capturing time. The leading cause is the dynamism of our coalitional groups. This flexible mechanism improves the intelligence of the pursuers concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer undertakes the best path.
8. Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we derived our coalition algorithm from the Iterated Elimination of Dominated Strategies. This process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have used the Markov Decision Process as a motion
strategy for our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512–516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225–230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Müller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214–230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1–13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075–1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235–244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804–4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, 2005.
[11] J. Thunberg, P. Ögren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506–4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165–172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482–486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272–2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452–1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155–2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107–110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413–420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154–159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347–1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57–65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215–1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478–485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709–715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541–558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, Evolutionary Game Theory Based Cooperation Algorithm in Multi-Agent System, InTech, Rijeka, Croatia, 2009.
Mathematical Problems in Engineering 5
[41 97 0]
[42 96 0]
[43 95 0]
[44 94 0]
[40 98 1]
[41 97 0]
[42 96 0]
[43 95 0]
[39 97 0]
[40 96 0]
[41 95 0]
[42 94 1]
[38 96 0]
[39 95 0]
[41 93 0]
[37 95 0]
[38 94 0]
[39 93 0]
[40 92 0]
[36 94 0]
[37 93 0]
[38 92 0]
[39 91 0]
[35 93 0]
[36 92 0]
[38 90 0]
[34 92 0]
[35 91 0]
[36 90 0]
[37 89 0]
[34 90 0]
[35 89 0]
[36 88 0]
[33 91 1]
Figure 1 Reward function applied to the grid environment Cells with a red frame the selected states blue agents pursuers green agentsevaders black cells cells containing obstacles
The linkages between the evader and each pursuer shownin Figure 2 reflect the optimal trajectories provided by theapplication of the method proposed in this section duringeach different pursuit step
6 Coalition Formation AlgorithmBased on IEDS
A number of coalition formation algorithms have beendeveloped to define which of the potential coalitions shouldactually be formed To do so they typically compute avalue for each coalition known as the coalition value whichprovides an indication of the expected results that could bederived if this coalition is constitutedThen having calculatedall the coalitional values the decision about the optimalcoalition to form can be selected We employ an iterativealgorithm in order to determine the optimal coalitions ofagents It begins with a complete set of coalitions (agent-strategy combinations) and iteratively eliminates the coali-tions that have lower contribution values to MAS efficiencyThe pseudocode of our algorithm is shown in Algorithm 1
First the algorithm calculates all the possible coalitions(Nbrcl) that the pursuers can form before their filtration asneeded The expected number of the possible coalitions toform is calculated according to the following
Nbrcl = 119899(119899 minus Re1)Re1 times
119899 minus Re1(119899 minus (Re1 + Re2))Re2
times sdot sdot sdot times 119899 minus (Re1 + sdot sdot sdot + Re119873minus1)(119899 minus (Re1 + Re2 + sdot sdot sdot + Re119873))Re119873
= 119873prod119895=1
(119899 minus sum119896=119895minus1119896=0
Re119896)(119899 minus sum119896=119895
119896=0Re119896)Re119895
(9)
119899 is the number of pursuers in the environment 119873 is thenumber of evaders detected Re0 = 0
In order to distribute the calculation of the possiblecoalitions among the pursuers the possible general coalitions
119899 The number of pursuers119894 = 0119896 = 0119895 = indicator of the chase iterationCalculate the possible coalitionsWhile (119862life gt 0) doCalculate the value of each coalitionWhile (number of coalitions gt 1) do
Eliminate the dominated strategy of 119875119894119894 larr 119894 mod 119899 + 1end whileAssign the pursuersrsquo roles according to theSelected coalitionChase iteration
End whileIf (capture = true) thenWhile (119896 le 119899)
Update (Reward119875119896 )119870++end while
ElseThe guilty pursuers pay some fines
end if
Algorithm 1
(Ω) will be calculated A general coalition enrolls all thepursuers required to capture the set of evaders detected
Ω = 119899(119899 minus 120582)120582 (10)
120582 = (Re1 + Re2 + sdot sdot sdotRe119873)The general coalitions generated will be equitably dis-
tributed among the agents playing the role Pursuer Specif-ically each general coalition will be composed of 119873 pur-suit groups From each general coalition generated through
6 Mathematical Problems in Engineering
[53 88 0] [52 89 0] [51 90 0] [50 91 0] [49 92 0] [48 93 0] [47 94 0] [46 95 0] [45 94 0] [44 93 0] [43 92 0] [42 91 1] [41 90 0] [40 89 0] [39 88 0]
[54 89 1] [52 91 0] [51 92 0] [50 93 0] [49 94 0] [48 95 0] [47 96 0] [46 95 0] [45 94 0] [44 93 0] [42 91 0] [41 90 0]
[55 90 0] [54 91 0] [53 92 0] [52 93 0] [51 94 0] [50 95 0] [49 96 0] [48 97 0] [47 96 0] [46 95 0] [45 94 0] [44 93 0] [43 92 0] [42 91 0] [41 90 0]
[56 91 0] [55 92 0] [54 93 0] [53 94 0] [52 95 0] [51 96 0] [50 97 0] [49 98 1] [48 97 0] [47 96 0] [46 95 0] [45 94 0] [44 93 0] [43 92 0] [42 91 0]
[57 90 0] [56 91 0] [55 92 0] [54 93 0] [53 94 0] [52 95 0] [51 96 0] [50 97 0] [49 96 0] [48 95 0] [47 94 0] [46 93 0] [45 92 0] [44 91 0] [43 90 0]
[58 89 0] [57 90 0] [56 91 0] [55 92 0] [54 93 0] [53 94 0] [52 95 0] [51 96 0] [50 95 0] [49 94 0] [48 93 0] [47 92 0] [46 91 0] [45 90 0] [44 89 0]
[59 88 0] [57 90 0] [56 91 0] [55 92 0] [54 93 0] [52 95 0] [51 94 0] [50 93 0] [49 92 0] [47 90 0] [46 89 0] [45 88 0]
[60 87 0] [59 88 0] [58 89 0] [57 90 0] [56 91 0] [55 92 0] [54 93 0] [53 94 0] [52 93 0] [51 92 0] [50 91 0] [49 90 0] [48 89 0] [47 88 0] [46 87 0]
[61 86 0] [60 87 0] [59 88 0] [58 89 0] [57 90 0] [56 91 0] [55 92 0] [54 93 0] [53 92 0] [52 91 0] [51 90 0] [50 89 0] [49 88 0] [48 87 0] [47 86 0]
[62 85 0] [61 86 0] [60 87 0] [59 88 0] [58 89 0] [57 90 0] [55 92 0] [54 91 0] [53 90 0] [52 89 0] [51 88 0] [50 87 0] [49 86 0]
[63 84 0] [62 85 0] [61 86 0] [60 87 0] [59 88 0] [58 89 0] [57 90 1] [56 91 0] [55 90 0] [54 89 0] [53 88 0] [52 87 0] [51 86 0] [50 85 0] [49 84 1]
Figure 2 Pursuersrsquo behaviors prediction after the transition function application
precedent calculation equation (10) a number of possiblecoalition formations () will be computed
= 120582(120582 minus Re1)Re1 times
120582 minus Re1(120582 minus (Re1 + Re2))Re2
times sdot sdot sdot times 120582 minus (Re1 + sdot sdot sdot + Re119873minus1)(120582 minus (Re1 + Re2 + sdot sdot sdot + Re119873))Re119873
= 119873prod119895=1
(120582 minus sum119896=119895minus1119896=0
Re119896)(120582 minus sum119896=119895
119896=0Re119896)Re119895
(11)
Nbrcl = Ω times
= 119899(119899 minus 120582)120582 times
119873prod119895=1
(120582 minus sum119896=119895minus1119896=0
Re119896)(120582 minus sum119896=119895
119896=0Re119896)Re119895
(12)
This decentralized technique aims to balance the computa-tion of the possible coalition formations among the pursuersFurthermore this method is more detailed in Section 7 viaits application to the case study Noting that the value of eachcoalition generated in relation to each pursuer contained willbe calculated according to (5) Each pursuer shares the coali-tions calculated with the others to start the coalition selection
process Secondly we apply the Iterated Elimination ofDomi-nated Strategies principle with the aim of finding the optimalcoalition through this process Knowing that each strategyis represented by a possible coalition formation Alternatelyeach pursuer eliminates the coalition with the lower value inrelation to itself and sends the update to the next pursuer con-cerned Pursuers are assigned in accordance with the selectedcoalition Each pursuer performs only one chase iterationThe algorithm repeats these instructions until the end of thechase life When 119862life = 0 and the captures are accomplishedsome rewards will be attributed to each one of the participat-ing pursuers the rewards are determined as follows
Rewards119901 = 119877 (119904 119886)119871 (13)
119871 is the number of the coalitionrsquos membersOtherwise in the case of capture failure the guilty
pursuers must pay some fines to the rest of the coalitionrsquosmembersThese fines are calculated as the followingmanner
120574 = (1199040 1198861 1199041 1198862 1199042 119904ℎ 119886ℎ) Fines = ℎminus1sum
119894=119908
119877 (119904119894 119886119894+1) (14)
Mathematical Problems in Engineering 7
Table 2 The distribution of the possible coalitionsrsquo computation
Pursuers 1198751 1198752 1198753 1198754 1198755 1198756 1198757 1198758 1198759 11987510General coalitions 5 5 5 5 5 4 4 4 4 4Possible coalitions generated 350 350 350 350 350 280 280 280 280 280
Agentsrsquo localization
Possible coalitionsrsquo calculation
Value of coalitionsrsquo calculation
Dominated strategyrsquos elimination
Pursuersrsquo assignment
Chase iteration
Capture
Rewards Fines
Yes
Yes
Yes
No
No
NoClife = 0
Nbrcl gt 1
Figure 3 Flow chart of the algorithm
120574 is the set of states regarding the guilty pursuer 0 le 119908 le ℎwhere 119908 represents the index of coalitionrsquos beginning
Figure 3 reflects the flow chart of this pursuit algorithmresuming the different steps explained in this section fromthe detection to the capture of the existing evaders
7 Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells. The environment also contains some obstacles, characterized by their constancy and solidity. As regards the environmental agents, our simulations are based on ten (10) pursuers and two (2) evaders of type Re = IV; Figure 4 details how an evader of this type is captured. Each agent is marked with an ID number. Both pursuers and evaders have the same speed (one cell per iteration) and an excellent communication system. The pursuers' teams are fully capable of determining their actual positions, and the evaders disappear after the capture is accomplished. Once the capture of an evader is performed, the coalition created to carry out this pursuit is automatically dissolved.
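This environment can be modeled minimally as follows (a sketch: the grid size and one-cell-per-iteration speed come from the paper, while the obstacle positions and function names are illustrative assumptions):

```python
GRID = 100                         # 100 x 100 cells, as in the experiments
OBSTACLES = {(20, 20), (20, 21)}   # static, solid obstacles (example positions)

def step(pos, move):
    """Advance an agent by one cell per iteration; the agent stays put
    if the target cell is off the grid or occupied by an obstacle."""
    x, y = pos[0] + move[0], pos[1] + move[1]
    if 0 <= x < GRID and 0 <= y < GRID and (x, y) not in OBSTACLES:
        return (x, y)
    return pos

print(step((0, 0), (0, 1)))    # (0, 1)
print(step((0, 0), (-1, 0)))   # blocked by the border: (0, 0)
print(step((20, 19), (0, 1)))  # blocked by an obstacle: (20, 19)
```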
Table 2 summarizes the results obtained after applying the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), there are Ω = 45 possible general coalitions, which are distributed over the existing pursuers as shown in Table 2. From each general coalition, 70 possible coalitions are generated according to (11).
Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers in relation to the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which a single pursuer computes all the possible coalitions, the decentralized method significantly decreases the computation time by dividing the work among the existing pursuers.
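These counts follow directly from (10)-(12) and can be checked numerically (a sketch assuming, as in this case study, two evaders each requiring Re = 4 pursuers, so λ = 8; the helper name is ours):

```python
from math import comb

def coalition_counts(n, requirements):
    """Eq. (10): Omega = C(n, lambda) general coalitions, where
    lambda is the sum of the evaders' pursuer requirements.
    Eq. (11): the number of ways to split each general coalition's
    lambda pursuers among the evaders.
    Eq. (12): Nbr_cl = Omega * that product."""
    lam = sum(requirements)
    omega = comb(n, lam)
    per_general, left = 1, lam
    for re in requirements:
        per_general *= comb(left, re)
        left -= re
    return omega, per_general, omega * per_general

print(coalition_counts(10, [4, 4]))    # (45, 70, 3150) -- matches Table 2
for n in range(10, 16):                # growth with the pursuer count (cf. Figure 5)
    total = coalition_counts(n, [4, 4])[2]
    print(n, total, total // n)        # decentralized: roughly total/n per pursuer
```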
In order to vary the types of coordination mechanisms used in our simulations, we found it useful to compare this work both with our recent pursuit-evasion research activity based on the AGR organizational model [6] and with the results achieved by the auction mechanism illustrated in Case C [8]. Note that these two methods are based on decentralized coalition formation.

Case A is a pursuit based on the (AGR) organizational model [6].
Case B is our new approach based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case C is a pursuit based on an economical auction mechanism (MPMEGBTBA) [8].
The results shown in Figure 6 represent the average capturing time achieved during forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to showcase the difference between the cases, we take into consideration the iteration concept, which determines the number of state changes of each agent during the pursuits.

In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, thanks to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of the type Re = IV after the capture. [The figure shows a patch of grid cells, given as coordinate triples, around the capture position.]
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers (x-axis: number of pursuers, 10-15; y-axis: number of possible coalitions, up to 600,000; curves: Decentralized, Centralized).
coalition formation algorithm revealed an average capturing time of 78 iterations.
Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formation and reformation of the pursuit teams.
Finally, we have focused on the study of the average pursuers' rewards obtained per chase iteration during a full pursuit in each case. In Figure 8, the x-axis represents the chase iterations and the y-axis represents the value of the rewards achieved by a pursuer. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time over (40) different pursuits (x-axis: time in episodes, 1-40; y-axis: average capturing time in iterations, 40-200; curves: Case A, Case B, Case C).
in which the average pursuer's rewards achieved reach 0.59 and 0.507, respectively. Otherwise, with IEDS, the average result increases to 0.88.
The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit, applied to the three cases. The positivity of the results is due to the grouping and the equitable task sharing between the different pursuit groups imposed by the different coordination mechanisms applied. Moreover, we can note the superiority of the results obtained through IEDS in relation to the other cases, driven by the
Figure 7: The pursuers' rewards development (x-axis: time in iterations, 1-78; y-axis: pursuers' rewards development, 30-120; curves: Case A, Case B, Case C).
Figure 8: Average pursuers' reward per iteration, one panel per case (Case A, Case B, Case C; x-axis: time in iterations, 0-50; y-axis: average pursuers' rewards obtained, -1.7 to 3.4).
Table 3: Pursuit results.

                                                   AGR      IEDS    MPMEGBTBA
Average capturing time (iterations)                144.225  78      100.57
Average pursuers' rewards obtained per iteration   0.59     0.88    0.507
Average pursuers' self-confidence development      0.408    0.533   0.451
Figure 9: Pursuers' learning development during the pursuit (x-axis: pursuit development in %, 0-100; y-axis: pursuers' self-confidence development, 0-8; curves: Case A, Case B, Case C).
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.
Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding the reward development as well as the capturing time. The leading cause of this result is the dynamism of our coalitional groups. This flexible mechanism improves the intelligence of the pursuers concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer undertakes the best path.
8 Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we derived our coalition algorithm from the Iterated Elimination of Dominated Strategies. This process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have used the Markov Decision Process as a motion
strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512-516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225-230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Muller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214-230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1-13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075-1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235-244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804-4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875-884, 2005.
[11] J. Thunberg, P. Ogren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506-4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165-172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482-486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272-2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452-1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155-2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107-110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413-420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154-159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347-1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57-65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215-1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478-485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709-715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541-558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, Evolutionary Game Theory Based Cooperation Algorithm in Multi-Agent System, InTech, Rijeka, Croatia, 2009.
[19] L Ting Z Cheng and ZWeiming ldquoPlanning for target systemstriking based on Markov decision processrdquo in Proceedingsof the IEEE International Conference on Service Operationsand Logistics and Informatics (SOLI rsquo13) pp 154ndash159 IEEEDongguan China July 2013
[20] W Lin Z Qu and M A Simaan ldquoNash strategies for pursuit-evasion differential games involving limited observationsrdquo IEEETransactions on Aerospace and Electronic Systems vol 51 no 2pp 1347ndash1356 2015
[21] E Ehsan and F Kunwar ldquoProbabilistic search and pursuitevasion on a graphrdquo Transactions on Machine Learning andArtificial Intelligence vol 3 no 3 pp 57ndash65 2015
[22] S Jia X Wang and L Shen ldquoA continuous-time markovdecision process-based method with application in a pursuit-evasion examplerdquo IEEE Transactions on Systems Man andCybernetics Systems vol 46 no 9 pp 1215ndash1225 2016
Mathematical Problems in Engineering 11
[23] C Boutilier ldquoSequential optimality and coordination in mul-tiagent systemsrdquo in Proceedings of the 16th International JointConference on Artificial Intelligence (IJCAI rsquo99) vol 1 pp 478ndash485 Stockholm Sweden August 1999
[24] E A Hansen D S Bernstein and S Zilberstein ldquoDynamicprogramming for partially observable stochastic gamesrdquo in Pro-ceedings of the 19th National Conference on Artificial Intelligencepp 709ndash715 2004
[25] K Zhang E G Collins Jr and A Barbu ldquoAn efficient stochas-tic clustering auction for heterogeneous robotic collaborativeteamsrdquo Journal of Intelligent amp Robotic Systems vol 72 no 3-4 pp 541ndash558 2013
[26] K Zhang E G Collins Jr and D Shi ldquoCentralized anddistributed task allocation in multi-robot teams via a stochasticclustering auctionrdquo ACM Transactions on Autonomous andAdaptive Systems vol 7 no 2 article 21 2012
[27] M B Dias and T Sandholm TraderBots a new paradigmfor robust and efficient multirobot coordination in dynamicenvironments [PhD thesis] The Robotics Institute CarnegieMellon University Pittsburgh Pa USA 2004
[28] Y Wang Evolutionary Game Theory Based Cooperation Algo-rithm inMulti-Agent SystemMultiagent Systems InTech RijekaCroatia 2009
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Mathematical Problems in Engineering 7
Table 2: The distribution of the possible coalitions' computation.

Pursuers                        P1    P2    P3    P4    P5    P6    P7    P8    P9    P10
General coalitions              5     5     5     5     5     4     4     4     4     4
Possible coalitions generated   350   350   350   350   350   280   280   280   280   280
Figure 3: Flow chart of the algorithm. (Steps: agents' localization; possible coalitions' calculation; value of coalitions' calculation; dominated strategy's elimination; pursuers' assignment; chase iteration; capture; rewards/fines. Decision tests: Nbrcl > 1; Clife = 0.)
γ is the set of states regarding the guilty pursuer, and 0 ≤ w ≤ h, where w represents the index of the coalition's beginning.
Figure 3 shows the flow chart of this pursuit algorithm, summarizing the different steps explained in this section, from the detection of the existing evaders to their capture.
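As a rough illustration of the pipeline in Figure 3 (enumerate the possible coalitions, value them, eliminate dominated ones until a single coalition survives, then assign it), the following self-contained sketch uses a hypothetical distance-based value function and a coalition size of 2; neither choice is the paper's exact formulation from (10)–(11).

```python
from itertools import combinations
from math import dist

# Toy sketch of the Figure 3 loop. The value function, the coalition
# size (2), and the agent coordinates are illustrative assumptions.

def coalition_value(coalition, evader):
    # Hypothetical value: coalitions whose members are closer to the
    # evader are worth more (negative total distance).
    return -sum(dist(p, evader) for p in coalition)

def select_coalition(coalitions, evader):
    """Iteratively remove the dominated (lowest-value) coalition until
    a single one survives (the Nbrcl > 1 test in Figure 3)."""
    alive = list(coalitions)
    while len(alive) > 1:
        alive.remove(min(alive, key=lambda c: coalition_value(c, evader)))
    return alive[0]

pursuers = [(0, 0), (1, 5), (9, 9), (4, 4)]
evader = (5, 5)
candidates = list(combinations(pursuers, 2))  # possible coalitions' calculation
best = select_coalition(candidates, evader)   # surviving coalition to assign
```

Here the surviving pair is the two pursuers nearest the evader, mirroring the idea that elimination leaves only the most valuable coalition to be assigned.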
7. Simulation Experiments
In order to evaluate the approach presented in this paper, we run our pursuit-evasion game on an example taking place in a rectangular two-dimensional grid of 100 × 100 cells. The environment also contains obstacles, which are fixed and solid. As regards the environmental agents, our simulations involve ten (10) pursuers and two (2) evaders of type Re = IV; Figure 4 details how an evader of this type can be captured. Each agent is marked with an ID number. Both pursuers and evaders move at the same speed (one cell per iteration) and have an excellent communication system. The pursuit teams are fully capable of determining their current positions, and the evaders disappear once their capture is accomplished. When the capture of an evader is performed, the coalition created to carry out this pursuit is automatically dissolved.
Table 2 summarizes the results obtained after applying the decentralized computation of the possible coalitions to this case study, according to the process explained in Section 6. In this case, and according to (10), there are 45 possible general coalitions (Ω), which are distributed over the existing pursuers as shown in Table 2. From each general coalition, 70 coalitions are generated according to (11).
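The counts in Table 2 follow directly from the totals stated here. The sketch below reproduces them, assuming only that the 45 general coalitions are dealt out as evenly as possible among the 10 pursuers; the distribution rule is our assumption, but the 5/4 split and the 350/280 products match the table.

```python
# Reproduce the Table 2 counts from the stated totals: 45 general
# coalitions (eq. (10)), 10 pursuers, 70 coalitions generated from each
# general coalition (eq. (11)). The even-split rule is an assumption.

n_pursuers = 10
general_coalitions = 45   # Omega from (10)
per_general = 70          # coalitions generated per general coalition, per (11)

# Deal the 45 general coalitions out as evenly as possible.
share = [general_coalitions // n_pursuers
         + (1 if i < general_coalitions % n_pursuers else 0)
         for i in range(n_pursuers)]
generated = [s * per_general for s in share]

assert share == [5, 5, 5, 5, 5, 4, 4, 4, 4, 4]
assert generated == [350] * 5 + [280] * 5
assert sum(generated) == general_coalitions * per_general  # 3150 in total
```

The per-pursuer workload (at most 350 coalitions) versus the 3150-coalition total is exactly the gap that the next paragraph's centralized/decentralized comparison measures.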
Moreover, we have studied the number of possible coalitions generated in parallel by the pursuers in relation to the number of existing pursuers, as shown in Figure 5. Compared with the centralized method, in which a single pursuer computes all the possible coalitions, the decentralized method significantly decreases the computation time by dividing the work among the existing pursuers.
In order to vary the types of coordination mechanisms used in our simulations, we found it useful to compare this work with our recent pursuit-evasion research based on the AGR organizational model [6], as well as with the results achieved by the auction mechanism illustrated in Case C [8]. Note that both of these methods are based on decentralized coalition formation.
Case A is a pursuit based on the AGR organizational model [6].
Case B is our new approach based on the Iterated Elimination of Dominated Strategies (IEDS) principle.
Case C is a pursuit based on an economic auction mechanism (MPMEGBTBA) [8].
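For readers unfamiliar with the game-theoretic primitive behind Case B, here is a minimal, generic sketch of iterated elimination of strictly dominated strategies on a two-player payoff matrix; the 3×3 payoffs are a made-up textbook-style example, not taken from the paper.

```python
# Generic IEDS: repeatedly delete strictly dominated rows (row player)
# and columns (column player) until nothing more can be removed.

def ieds(payoff_row, payoff_col):
    rows = list(range(len(payoff_row)))
    cols = list(range(len(payoff_row[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            # r is strictly dominated if some other row o beats it everywhere.
            if any(all(payoff_row[o][c] > payoff_row[r][c] for c in cols)
                   for o in rows if o != r):
                rows.remove(r); changed = True
        for c in cols[:]:
            if any(all(payoff_col[r][o] > payoff_col[r][c] for r in rows)
                   for o in cols if o != c):
                cols.remove(c); changed = True
    return rows, cols

# Illustrative payoffs (row player's matrix A, column player's matrix B).
A = [[4, 3, 5], [2, 1, 2], [3, 0, 4]]
B = [[2, 5, 1], [0, 1, 0], [3, 2, 4]]
rows, cols = ieds(A, B)  # the game collapses to a single strategy pair
```

In this example rows 1 and 2 are strictly dominated by row 0, after which columns 0 and 2 are dominated by column 1, so the process leaves one surviving strategy pair.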
The results shown in Figure 6 represent the average capturing time achieved over forty (40) different simulation case studies (episodes), from the beginning to the end of each one. In order to highlight the differences between the cases, we measure time in iterations, where an iteration corresponds to one state change of each agent during the pursuit.
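As a small illustration of how such a curve is produced, the sketch below records episode lengths (iterations until capture) and reports their running average; the five numbers are placeholders we made up, not the paper's data.

```python
# Hypothetical episode lengths: iterations until capture, one per episode.
episode_lengths = [150, 130, 160, 120, 145]

# Running average after each episode, as plotted in a Figure 6-style curve.
running_avg = [sum(episode_lengths[:k + 1]) / (k + 1)
               for k in range(len(episode_lengths))]

assert running_avg[0] == 150
assert running_avg[-1] == 141  # 705 / 5
```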
In the first case (AGR), the average capturing time obtained equals 144.225 iterations. Furthermore, we note an interesting decrease to 100.57 iterations after the application of MPMEGBTBA, owing to the appropriate role attribution provided by this auction mechanism. However, the results obtained through the application of the IEDS
Figure 4: Example of an evader of the type Re = IV after the capture.
Figure 5: Centralized and decentralized coalitions' computation in relation to the number of pursuers. (x-axis: number of pursuers, 10 to 15; y-axis: number of possible coalitions; curves: centralized versus decentralized.)
coalition formation algorithm revealed an average capturing time of 78 iterations.
Figure 7 shows the development of the pursuers' reward function during the same pursuit period for the different cases; the outcomes reflect the improvement brought by the dynamic formation and reformation of the pursuit teams.
Finally, we have focused on the average pursuers' rewards obtained per chase iteration during a full pursuit. In Figure 8, the vertical axis represents the reward value achieved by a pursuer and the horizontal axis represents the chase iterations. The results shown in this figure reveal a certain similarity between AGR and MPMEGBTBA,
Figure 6: Average capturing time over forty (40) different pursuits. (x-axis: time in episodes, 1 to 40; y-axis: average capturing time in iterations; curves: Cases A, B, and C.)
in which the average pursuer's rewards reach 0.59 and 0.507, respectively. By contrast, with IEDS the average increases to 0.88.
The results shown in Figure 9 represent the internal learning development (self-confidence development) of the pursuers during the pursuit in the three cases. The positive results are due to the grouping and the equitable task sharing between the different pursuit groups imposed by the coordination mechanisms applied. Moreover, we note the superiority of the results obtained through IEDS over the other cases, caused by the
Figure 7: The pursuers' rewards development. (x-axis: time in iterations, 1 to 78; y-axis: pursuers' rewards development; curves: Cases A, B, and C.)
Figure 8: Average pursuers' reward per iteration. (Three panels, one per case; x-axis: time in iterations, 0 to 50; y-axis: average pursuers' rewards obtained, −1.7 to 3.4.)
Table 3: Pursuit results.

                                                  AGR       IEDS    MPMEGBTBA
Average capturing time (iterations)               144.225   78      100.57
Average pursuers' rewards obtained per iteration  0.59      0.88    0.507
Average pursuers' self-confidence development     0.408     0.533   0.451
Figure 9: Pursuers' learning development during the pursuit. (x-axis: pursuit development in %, 0 to 100; y-axis: pursuers' self-confidence development; curves: Cases A, B, and C.)
dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.
Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, regarding reward development as well as capturing time. The leading cause is the dynamism of our coalitional groups: this flexible mechanism improves the pursuers' intelligence concerning displacements and reward acquisition, knowing that the team reward is optimal when each pursuer follows the best path.
8. Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we derived our coalition algorithm from the Iterated Elimination of Dominated Strategies; this process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we adopted the Markov Decision Process as the motion
strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This paper was supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512–516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225–230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Muller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214–230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1–13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075–1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235–244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804–4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, 2005.
[11] J. Thunberg, P. Ogren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506–4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165–172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482–486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272–2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452–1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, RiverCentre, pp. 2155–2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107–110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413–420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154–159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347–1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57–65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215–1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478–485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709–715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541–558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, "Evolutionary game theory based cooperation algorithm in multi-agent system," in Multiagent Systems, InTech, Rijeka, Croatia, 2009.
[20] W Lin Z Qu and M A Simaan ldquoNash strategies for pursuit-evasion differential games involving limited observationsrdquo IEEETransactions on Aerospace and Electronic Systems vol 51 no 2pp 1347ndash1356 2015
[21] E Ehsan and F Kunwar ldquoProbabilistic search and pursuitevasion on a graphrdquo Transactions on Machine Learning andArtificial Intelligence vol 3 no 3 pp 57ndash65 2015
[22] S Jia X Wang and L Shen ldquoA continuous-time markovdecision process-based method with application in a pursuit-evasion examplerdquo IEEE Transactions on Systems Man andCybernetics Systems vol 46 no 9 pp 1215ndash1225 2016
Mathematical Problems in Engineering 11
[23] C Boutilier ldquoSequential optimality and coordination in mul-tiagent systemsrdquo in Proceedings of the 16th International JointConference on Artificial Intelligence (IJCAI rsquo99) vol 1 pp 478ndash485 Stockholm Sweden August 1999
[24] E A Hansen D S Bernstein and S Zilberstein ldquoDynamicprogramming for partially observable stochastic gamesrdquo in Pro-ceedings of the 19th National Conference on Artificial Intelligencepp 709ndash715 2004
[25] K Zhang E G Collins Jr and A Barbu ldquoAn efficient stochas-tic clustering auction for heterogeneous robotic collaborativeteamsrdquo Journal of Intelligent amp Robotic Systems vol 72 no 3-4 pp 541ndash558 2013
[26] K Zhang E G Collins Jr and D Shi ldquoCentralized anddistributed task allocation in multi-robot teams via a stochasticclustering auctionrdquo ACM Transactions on Autonomous andAdaptive Systems vol 7 no 2 article 21 2012
[27] M B Dias and T Sandholm TraderBots a new paradigmfor robust and efficient multirobot coordination in dynamicenvironments [PhD thesis] The Robotics Institute CarnegieMellon University Pittsburgh Pa USA 2004
[28] Y Wang Evolutionary Game Theory Based Cooperation Algo-rithm inMulti-Agent SystemMultiagent Systems InTech RijekaCroatia 2009
Figure 7: The pursuers' rewards development. [Plot: pursuers' rewards (30-120) versus time in iterations, with curves for Case A, Case B, and Case C.]
Figure 8: Average pursuers' reward per iteration. [Three panels, one each for Case A, Case B, and Case C: average obtained pursuers' rewards (-1.7 to 3.4) versus time in iterations (0-50).]
Table 3: Pursuit results.

                                                   AGR       IEDS    MPMEGBTBA
Average capturing time (iterations)              144.225      78      100.57
Average pursuers' rewards obtained by iteration    0.59        0.88     0.507
Average pursuers' self-confidence development      0.408       0.533    0.451
Figure 9: Pursuers' learning development during the pursuit. [Plot: pursuers' self-confidence development (0-8) versus pursuit development (0-100%), with curves for Case A, Case B, and Case C.]
These results reflect the dynamism of the coalition formations and the optimality of the task sharing provided by our algorithm.

Table 3 summarizes the main results achieved. We deduce that the pursuit algorithm based on the Iterated Elimination of Dominated Strategies (IEDS) outperforms both the algorithm based on the AGR organizational model and the auction mechanism based on MPMEGBTBA, with respect to reward development as well as capturing time. The leading cause is the dynamism of our coalitional groups: this flexible mechanism improves the pursuers' intelligence concerning their displacements and their reward acquisition, given that the team reward is optimal when each pursuer undertakes the best path.
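For concreteness, the elimination procedure that IEDS refers to can be sketched as follows. This is a generic two-player illustration with made-up payoff values (the classic Prisoner's Dilemma), not the coalition payoffs used in this paper:

```python
# Minimal sketch of Iterated Elimination of Dominated Strategies (IEDS)
# on a two-player normal-form game. Payoff matrices are illustrative.

def ieds(payoff_row, payoff_col):
    """Iteratively remove strictly dominated strategies.

    payoff_row[i][j] / payoff_col[i][j]: payoffs when the row player
    plays strategy i and the column player plays strategy j.
    Returns the surviving (row, column) strategy indices.
    """
    rows = list(range(len(payoff_row)))
    cols = list(range(len(payoff_row[0])))
    changed = True
    while changed:
        changed = False
        # Remove row strategies strictly dominated by another row.
        for i in rows[:]:
            if any(all(payoff_row[k][j] > payoff_row[i][j] for j in cols)
                   for k in rows if k != i):
                rows.remove(i)
                changed = True
        # Remove column strategies strictly dominated by another column.
        for j in cols[:]:
            if any(all(payoff_col[i][k] > payoff_col[i][j] for i in rows)
                   for k in cols if k != j):
                cols.remove(j)
                changed = True
    return rows, cols

# Prisoner's Dilemma: "cooperate" (index 0) is strictly dominated by
# "defect" (index 1) for both players.
R = [[3, 0], [5, 1]]   # row player's payoffs
C = [[3, 5], [0, 1]]   # column player's payoffs
print(ieds(R, C))      # -> ([1], [1]): both players defect
```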
8 Conclusion
This paper presents a decentralized coalition method based on Game Theory principles for different types of pursuit; the proposed method demonstrates the positive impact of the dynamism of the coalition formations. Firstly, we have built our coalition algorithm on the Iterated Elimination of Dominated Strategies. This process allows us to determine the optimal pursuit coalition strategy according to Game Theory principles. Secondly, we have adopted the Markov Decision Process as the motion strategy of our pursuers in the environment (a grid of cells). To highlight our proposal, we have developed a comparative study between our algorithm, a decentralized coalition strategy based on the AGR organizational model, and an auction mechanism based on MPMEGBTBA. The simulation results shown in this paper demonstrate that the algorithm based on IEDS is feasible and effective.
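As a minimal illustration of such a motion strategy (not the authors' implementation: the grid size, reward structure, and discount factor below are assumptions of this sketch), value iteration on a grid of cells can be written as:

```python
# Illustrative value-iteration sketch of an MDP motion strategy for a
# pursuer on an n x n grid of cells, with a reward of 1 for entering
# the evader's (goal) cell. Parameters are assumed, not the paper's.

def neighbours(s, n):
    """4-connected neighbouring cells of s that stay on the grid."""
    r, c = s
    cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(x, y) for x, y in cand if 0 <= x < n and 0 <= y < n]

def value_iteration(n, goal, gamma=0.9, eps=1e-6):
    """Compute state values on the grid; the capture cell is terminal."""
    V = {(r, c): 0.0 for r in range(n) for c in range(n)}
    while True:
        delta = 0.0
        for s in V:
            if s == goal:
                continue
            best = max((1.0 if t == goal else 0.0) + gamma * V[t]
                       for t in neighbours(s, n))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

def best_move(s, V, n, goal, gamma=0.9):
    """Greedy one-step policy: step toward the highest-valued cell."""
    return max(neighbours(s, n),
               key=lambda t: (1.0 if t == goal else 0.0) + gamma * V[t])

V = value_iteration(4, goal=(0, 0))
print(best_move((0, 2), V, 4, (0, 0)))  # the pursuer steps to (0, 1)
```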
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (no. 61375081) and a special fund project of Harbin science and technology innovation talents research (no. RC2013XK010002).
References
[1] A. Ghazikhani, H. R. Mashadi, and R. Monsefi, "A novel algorithm for coalition formation in multi-agent systems using cooperative game theory," in Proceedings of the 18th Iranian Conference on Electrical Engineering (ICEE '10), pp. 512-516, Isfahan, Iran, May 2010.
[2] L. Boongasame, "Preference coalition formation algorithm for buyer coalition," in Proceedings of the 9th International Joint Conference on Computer Science and Software Engineering (JCSSE '12), pp. 225-230, Bangkok, Thailand, May 2012.
[3] J. Ferber, O. Gutknecht, and F. Michel, "From agents to organizations: an organizational view of multi-agent systems," in Agent-Oriented Software Engineering IV: 4th International Workshop, AOSE 2003, Melbourne, Australia, July 15, 2003, Revised Papers, P. Giorgini, J. Müller, and J. Odell, Eds., vol. 2935 of Lecture Notes in Computer Science, pp. 214-230, Springer, Berlin, Germany, 2004.
[4] J. Y. Kuo, H.-F. Yu, K. F.-R. Liu, and F.-W. Lee, "Multiagent cooperative learning strategies for pursuit-evasion games," Mathematical Problems in Engineering, vol. 2015, Article ID 964871, 13 pages, 2015.
[5] G. I. Ibragimov and M. Salimi, "Pursuit-evasion differential game with many inertial players," Mathematical Problems in Engineering, vol. 2009, Article ID 653723, 15 pages, 2009.
[6] M. Souidi, S. Piao, G. Li, and L. Chang, "Coalition formation algorithm based on organization and Markov decision process for multi-player pursuit evasion," International Journal of Multiagent and Grid Systems, vol. 11, no. 1, pp. 1-13, 2015.
[7] M. E.-H. Souidi, P. Songhao, L. Guo, and C. Lin, "Multi-agent cooperation pursuit based on an extension of AALAADIN organisational model," Journal of Experimental & Theoretical Artificial Intelligence, vol. 28, no. 6, pp. 1075-1088, 2016.
[8] Z.-S. Cai, L.-N. Sun, H.-B. Gao, P.-C. Zhou, S.-H. Piao, and Q.-C. Huang, "Multi-robot cooperative pursuit based on task bundle auctions," in Intelligent Robotics and Applications, C. Xiong, Y. Huang, Y. Xiong, and H. Liu, Eds., vol. 5314 of Lecture Notes in Computer Science, pp. 235-244, Springer, Berlin, Germany, 2008.
[9] B. Goode, A. Kurdila, and M. Roan, "A graph theoretical approach toward a switched feedback controller for pursuit-evasion scenarios," in Proceedings of the American Control Conference (ACC '11), pp. 4804-4809, San Francisco, Calif, USA, June 2011.
[10] V. Isler, S. Kannan, and S. Khanna, "Randomized pursuit-evasion in a polygonal environment," IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875-884, 2005.
[11] J. Thunberg, P. Ögren, and X. Hu, "A Boolean Control Network approach to pursuit evasion problems in polygonal environments," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '11), pp. 4506-4511, May 2011.
[12] J. Li, Q. Pan, and B. Hong, "A new approach of multi-robot cooperative pursuit based on association rule data mining," International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 165-172, 2010.
[13] J. Liu, S. Liu, H. Wu, and Y. Zhang, "A pursuit-evasion algorithm based on hierarchical reinforcement learning," in Proceedings of the International Conference on Measuring Technology and Mechatronics Automation (ICMTMA '09), vol. 2, pp. 482-486, IEEE, Hunan, China, April 2009.
[14] J. P. Hespanha, M. Prandini, and S. Sastry, "Probabilistic pursuit-evasion games: a one-step Nash approach," in Proceedings of the 39th IEEE Conference on Decision and Control, vol. 3, pp. 2272-2277, Sydney, Australia, December 2000.
[15] J. Dong, X. Zhang, and X. Jia, "Strategies of pursuit-evasion game based on improved potential field and differential game theory for mobile robots," in Proceedings of the 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC '12), pp. 1452-1456, IEEE, Harbin, China, December 2012.
[16] F. Amigoni and N. Basilico, "A game theoretical approach to finding optimal strategies for pursuit evasion in grid environments," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2155-2162, Saint Paul, Minn, USA, May 2012.
[17] R. Liu and Z.-S. Cai, "A novel approach based on Evolutionary Game Theoretic model for multi-player pursuit evasion," in Proceedings of the International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE '10), vol. 1, pp. 107-110, August 2010.
[18] B. Khosravifar, F. Bouchet, R. Feyzi-Behnagh, R. Azevedo, and J. M. Harley, "Using intelligent multi-agent systems to model and foster self-regulated learning: a theoretically-based approach using Markov decision process," in Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA '13), pp. 413-420, IEEE, Barcelona, Spain, March 2013.
[19] L. Ting, Z. Cheng, and Z. Weiming, "Planning for target system striking based on Markov decision process," in Proceedings of the IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI '13), pp. 154-159, IEEE, Dongguan, China, July 2013.
[20] W. Lin, Z. Qu, and M. A. Simaan, "Nash strategies for pursuit-evasion differential games involving limited observations," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1347-1356, 2015.
[21] E. Ehsan and F. Kunwar, "Probabilistic search and pursuit evasion on a graph," Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 3, pp. 57-65, 2015.
[22] S. Jia, X. Wang, and L. Shen, "A continuous-time Markov decision process-based method with application in a pursuit-evasion example," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 9, pp. 1215-1225, 2016.
[23] C. Boutilier, "Sequential optimality and coordination in multiagent systems," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99), vol. 1, pp. 478-485, Stockholm, Sweden, August 1999.
[24] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709-715, 2004.
[25] K. Zhang, E. G. Collins Jr., and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robotic collaborative teams," Journal of Intelligent & Robotic Systems, vol. 72, no. 3-4, pp. 541-558, 2013.
[26] K. Zhang, E. G. Collins Jr., and D. Shi, "Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction," ACM Transactions on Autonomous and Adaptive Systems, vol. 7, no. 2, article 21, 2012.
[27] M. B. Dias and T. Sandholm, TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments [Ph.D. thesis], The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pa, USA, 2004.
[28] Y. Wang, "Evolutionary game theory based cooperation algorithm in multi-agent system," in Multiagent Systems, InTech, Rijeka, Croatia, 2009.