Building Human-Level AI for Real-Time Strategy Games

Ben G. Weber, Expressive Intelligence Studio, UC Santa Cruz
Michael Mateas, Expressive Intelligence Studio, UC Santa Cruz
Arnav Jhala, Computational Cinematics Studio, UC Santa Cruz

Abstract

Video games are complex simulation environments with many real-world properties that need to be addressed in order to build robust intelligence. In particular, real-time strategy games provide a multi-scale challenge which requires both deliberative and reactive reasoning processes. Experts approach this task by studying a corpus of games, building models for anticipating opponent actions, and practicing within the game environment. We motivate the need for integrating heterogeneous approaches by enumerating a range of competencies involved in gameplay and discuss how they are being implemented in EISBot, a reactive planning agent that we have applied to the task of playing real-time strategy games at the same granularity as humans.

Introduction

Real-time strategy (RTS) games are a difficult domain for humans to master. Expert-level gameplay requires performing hundreds of actions per minute in partial information environments. Additionally, actions need to be distributed across a range of in-game tasks including advancing research progress, sustaining an economy, and performing tactical assaults. Performing well in RTS games requires the ability to specialize in many competencies while at the same time working towards high-level goals. RTS gameplay requires both parallel goal pursuit across distinct competencies as well as coordination among these competencies.

Our work is motivated by the goal of building an agent capable of exhibiting these types of reasoning processes in a complex environment.
Specifically, our goal is to develop an RTS agent capable of performing as well as human experts. Paramount to this effort is evaluating the agent in an environment which resembles the interface provided to human players as closely as possible. Specifically, the agent should not be allowed to utilize any game state information not available to a human player, such as the locations of non-visible units, and the set of actions provided to the agent should be identical to those provided through the game's user interface. Enforcing this constraint ensures that an agent which performs well in this domain is capable of integrating and specializing in several competencies.

Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.

Real-time strategy games demonstrate the need for heterogeneous agent architectures, because reducing the decision complexity of the domain is non-trivial. One possible approach is to split gameplay into isolated subproblems and to reason about each subproblem independently. This approach is problematic, because RTS gameplay is non-hierarchical, resulting in contention for in-game resources across different subproblems. A second approach is to develop an abstraction of the state space and use it for reasoning about goals. This approach is also problematic, because different types of abstractions are necessary for the different tasks that need to be performed. RTS gameplay requires concurrent and coordinated goal pursuit across multiple abstractions. We refer to AI tasks that have these requirements as multi-scale AI problems (Weber et al. 2010).

Our domain of analysis is the real-time strategy game StarCraft. High-level StarCraft gameplay requires specializing in several different competencies of varying complexity. One such competency is micromanagement, which is the task of controlling individual units in combat scenarios in order to maximize the utility of a squad of units.
This area of expertise focuses on reactive behavior, because it requires split-second timing of actions. Another competency is strategy selection, which is the task of determining which types of units to produce and which upgrades to research. This area of expertise focuses on planning and opponent modeling, and involves formulating strategies in order to counter opponent strategies. We also selected this game because it is played at a professional level, resulting in a large number of high-quality replays being available for analysis, and there is a large player base for evaluating our system.

Our approach for managing the complexity of StarCraft is to partition gameplay into domains of competence and develop interfaces between these competencies for resolving goal conflicts. It is based on the integrated agent framework proposed by McCoy and Mateas (2008), which partitions the behaviors in a reactive planning agent into managers which specialize in distinct aspects of gameplay. The decomposition of gameplay into managers is based on analysis of expert gameplay.

EISBot is a heterogeneous agent for playing StarCraft which incorporates many of these competencies. The core of the agent is a reactive planner that supports real-time action execution and concurrent goal pursuit. A large number of competencies are implemented in the ABL reactive planning language (Mateas and Stern 2002). The reactive planner interfaces with external components which perform prediction and planning tasks. These components are implemented using case-based reasoning, machine learning, and particle filters. The components are connected together by applying several integration idioms, including using working memory as a blackboard, instantiating new goals for the agent to pursue, and suspending and resuming the execution of tasks. We have evaluated EISBot on a competitive StarCraft tournament ladder.

The remainder of this paper is structured as follows.
In Section 2 we discuss related work. Section 3 provides an overview of the EISBot architecture. Next, we discuss StarCraft gameplay and competencies in Section 4, and present the competencies in EISBot in Section 5. In Section 6 we present an evaluation of EISBot. We provide conclusions in Section 7 and discuss future work in Section 8.

Related Work

Our agent builds on ideas from commercial game AI, cognitive architectures, and heterogeneous agent designs. It also continues the tradition of using games as an AI testbed.

Video games are ideal for AI research, because they provide a domain in which incremental and integrative advances can be made (Laird and van Lent 2001). RTS games in particular are interesting because they provide several challenges including decision making under uncertainty, spatial and temporal reasoning, and adversarial real-time planning (Buro 2003). Previous work has also shown that the decision complexity of RTS games is enormous (Aha, Molineaux, and Ponsen 2005).

Cognitive architectures provide many of the mechanisms required for integrating heterogeneous competencies. The ICARUS architecture is capable of reasoning about concurrent goals, which can be interrupted and resumed (Langley and Choi 2006). The system has components for perception, planning, and acting which communicate through the use of an active memory. One of the main differences from our approach is that our system works towards modeling gameplay actions performed by players, as opposed to modeling the cognitive processes of players.

SOAR is a cognitive architecture that performs state abstraction, planning, and multitasking (Wintermute, Xu, and Laird 2007).
The system was applied to the task of playing real-time strategy games by specifying a middleware layer that serves as the perception system and gaming interface. RTS games provide a challenge for the architecture, because gameplay requires managing many cognitively distant tasks and there is a cost for switching between these tasks. The main difference from our approach is that we decompose gameplay into more distinct modules and have a larger focus on coordination across tasks.

Darmok is an online case-based planner that builds a case library by learning from demonstration (Ontanon et al. 2010). The system can be applied to RTS games by supplying a collection of game replays that are used during the planning process. The decision cycle in Darmok is similar to the one in EISBot, in which each update checks for finished steps, updates the world state, and picks the next step. Behaviors selected for execution by Darmok can be customized by specifying a behavior tree for performing actions (Palma et al. 2011). The main difference from our approach is that Darmok uses a case library to build a collection of behaviors, while EISBot uses a large collection of hand-authored behaviors.

Heterogeneous agent architectures have also been applied to the task of playing RTS games. Bakkes et al. (2008) present an approach for integrating an existing AI with case-based reasoning. The AI is able to adapt to opponents by using a case retriever which selects the most suitable parameter combination for the current situation, and modifying behavior based on these parameters. Baumgarten et al. (2009) present an RTS agent that uses case-based reasoning, simulated annealing, and decision tree learning.
The system uses an initial case library generated from a set of randomly played games, and refines the library by applying an evaluation function to retrieved cases in order to evolve new tactics. Our system differs from this work, because components in EISBot interface using goals and plans rather than parameter selection.

Commercial game AI provides an excellent baseline for agent performance, because it must operate within a complex environment, as opposed to an abstraction of a game. However, the goal of commercial game AI is to provide the player with an engaging experience, as opposed to playing at the same granularity as a player. Both deliberative and reactive planning approaches have been applied to games. Behavior trees are an action selection technique for non-player characters (NPCs) in games (Isla 2005). In a behavior tree system, an agent has a collection of behaviors available for execution. Each decision cycle update, the behavior with the highest priority that can be activated is selected for execution. Behavior trees provide a mechanism for implementing reactive behavior in an agent, but require substantial domain engineering. Goal-oriented action planning (GOAP) addresses this problem by using a deliberative planning process (Orkin 2003). A GOAP agent has a set of goals, which upon activation trigger the agent to replan. The planning operators are similar to those in STRIPS (Fikes and Nilsson 1971), but the planner uses A* to guide the planning process. The main challenge in implementing a GOAP system is defining representations and operators that can be reasoned about in real time. One of the limitations of both of these approaches is that they are used for action selection for a single NPC and do not extend to multi-scale problems.

EISBot Architecture

EISBot is an extension of McCoy and Mateas's (2008) integrated agent framework. We have extended the framework in several ways.
While the initial agent was applied to the task of playing Wargus (Ponsen et al. 2005), we have transferred many of the concepts to StarCraft. Additionally, we have implemented several of the competencies that were previously identified, but stubbed out in the original agent implementation. Finally, we have interfaced the architecture with external components in several ways.

[Figure 1: EISBot-StarCraft interface. Components shown: StarCraft, Brood War API, Control Process, ProxyBot, sensors, working memory (WMEs), ABT, ABL.]

The core of the agent is a reactive planner which interfaces with the game environment, manages active goals, and handles action execution. It is implemented in the ABL (A Behavior Language) reactive planning language (Mateas and Stern 2002). ABL is similar to BDI architectures such as PRS (Rao and Georgeff 1995), a hybrid architecture for planning and decision making. The main strength of reactive planning is support for acting on partial plans while pursuing goal-directed tasks and responding to changes in the environment (Josyula 2005).

EISBot interfaces with StarCraft through the use of sensors which perceive game state and actuators which enable the agent to send commands to units. The ABL interface and agent components are shown in Figure 1. The Brood War API, Control Process, and ProxyBot components provide a bridge between StarCraft and the agent. The agent also includes a working memory and an active behavior tree (ABT). The ABT pursues active goals by selecting behaviors for expansion from the behavior library, which is a collection of hand-authored behaviors.

Our agent is based on a conceptual partitioning of gameplay into areas of competence. It is composed of managers, each of which is responsible for a specific aspect of gameplay. A manager is implemented as a collection of behaviors in our system.
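As a rough sketch of this decomposition, each manager bundles behaviors that communicate only through a shared working memory. This is a minimal Python illustration under our own naming assumptions (the classes, WME kinds, and behavior functions are hypothetical), not EISBot's actual ABL implementation:

```python
from dataclasses import dataclass


@dataclass
class WME:
    """A working memory element: a typed fact on the shared blackboard."""
    kind: str
    data: dict


class WorkingMemory:
    """Shared store through which managers communicate."""
    def __init__(self):
        self._elements = []

    def post(self, wme):
        self._elements.append(wme)

    def consume(self, kind):
        """Remove and return all WMEs of the given kind."""
        matched = [w for w in self._elements if w.kind == kind]
        self._elements = [w for w in self._elements if w.kind != kind]
        return matched


class Manager:
    """A manager: a named collection of behaviors for one aspect of gameplay."""
    def __init__(self, name, behaviors):
        self.name = name
        self.behaviors = behaviors

    def update(self, memory):
        for behavior in self.behaviors:
            behavior(memory)


def request_barracks(memory):
    # Strategy-level behavior: decide *what* to build and post a request.
    memory.post(WME("build-request", {"structure": "barracks"}))


def start_construction(memory):
    # Construction-level behavior: consume requests and carry them out,
    # decoupled from the decision of what to build.
    for request in memory.consume("build-request"):
        memory.post(WME("build-started", request.data))


memory = WorkingMemory()
for manager in [Manager("strategy", [request_barracks]),
                Manager("construction", [start_construction])]:
    manager.update(memory)
```

The point of the sketch is the decoupling: neither manager calls the other directly, so either side can be swapped out as long as it honors the same WME kinds.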
We used several reactive planning idioms to decompose gameplay into subproblems, while maintaining the ability to handle cross-cutting concerns such as resource contention between managers. EISBot is composed of the following managers:

- Strategy manager: responsible for the strategy selection and attack timing competencies.
- Income manager: handles worker units, resource collection, and expanding.
- Construction manager: responsible for managing requests to build structures.
- Tactics manager: performs combat tasks and micromanagement behaviors.
- Recon manager: implements scouting behaviors.

Further discussion of the specific managers and planning idioms is available in previous work (McCoy and Mateas 2008; Weber et al. 2010).

Each of the managers implements an interface for interacting with other components. The interface defines how working memory elements (WMEs) are posted to and consumed from working memory. This approach enables a modular agent design, where new implementations of managers can be swapped into the agent. It also supports the decomposition of tasks, such as constructing a structure. In EISBot, the decision process for selecting which structures to construct is decoupled from the construction process, which involves selecting a construction site and monitoring execution of the task. This decoupling also enables specialized behavior to be specified in a manager, such as unit-specific movement behaviors in the tactics manager.

Integration Approaches

We have integrated the reactive planner with external components. The following approaches are used to interface components with the reactive planner:

- Augmenting working memory
- External goal formulation
- External plan generation
- Behavior activation

These approaches are not specific to reactive planning and can be applied to other agent architectures.

The agent's working memory serves as a blackboard (Hayes-Roth 1985), enabling communication between different managers. Working memory is also used to store beliefs about the game environment. External components interface with EISBot by augmenting working memory with additional beliefs.