programmable elasticity for actor-based cloud applications · 2017-12-05 · programmable...

7
Programmable Elasticity for Actor-based Cloud Applications Bo Sang Purdue University Srivatsan Ravi University of Southern California Gustavo Petri IRIF – Universit´ e Paris Diderot Mahsa Najafzadeh Purdue University Masoud Saeida Ardekani Samsung Research America Patrick Eugster Purdue University, TU Darmstadt Abstract The actor model is a popular paradigm for programming scal- able cloud applications. Building elastic and scalable cloud applications requires application developers to carefully ad- just the application scale (the required resources) and the placement of actors at the runtime. Unfortunately, there is no efficient solution which could manage application elastic- ity automatically during runtime without disrupting ongoing requests. This paper proposes the idea of programmable elas- ticity approach, which allows application developers to define a set of elasticity rules for different actors. The runtime ser- vice endeavors to apply the elasticity rules while relieving the application programmer from dealing with the management of distributed state and efficient utilization of cloud resources. CCS Concepts Computing methodologies Distributed programming languages; Keywords Cloud Application, Elasticity, Actor Model, Pro- gramming Model ACM Reference Format: Bo Sang, Srivatsan Ravi, Gustavo Petri, Mahsa Najafzadeh, Ma- soud Saeida Ardekani, and Patrick Eugster. 2017. Programmable Elasticity for Actor-based Cloud Applications. In Proceedings of PLOS’17:9th Workshop on Programming Languages and Oper- ating Systems (PLOS’17). ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3144555.3144558 1 Introduction The actor model [3] has gained popularity for building com- plex cloud-based distributed applications [10]. Some latest actor-based programming language (e.g, AEON [9] and Or- leans [4]) can help programmers implement cloud applica- tions in a more efficient and simple way. Actors are used as service endpoints that process requests or tasks, and their placement can directly affect the application performance. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only. PLOS’17, October 28, 2017, Shanghai, China © 2017 Association for Computing Machinery. ACM ISBN 978-1-4503-5153-9/17/10. . . $15.00 https://doi.org/10.1145/3144555.3144558 Thus, it is important to decide how to distribute and place actors across different servers to optimize performance and resource efficiency. Due to changes in workloads, the static placement of actors would not necessarily lead to the best per- formance and resource efficiency. For example, consider an online messaging application where users, modeled as actors, who join the same chat room can talk to each other. To reduce the communication cost, the actors participating in the same chat room should be put on the same server. However, one or many chat rooms may become big over time, such that one single server cannot handle the whole traffic. Moreover, users may actively join and/or leave different chat rooms, so it is not straightforward to choose one server for placing the user’s actor. Thus, actor-based applications require an efficient elas- ticity management service in order to react to all workload changes by automatically 1) re-distributing actors among available servers, and 2) provisioning and de-provisioning resources while ensuring Service Level Agreements (SLAs) and preserving the application semantics. Newell et al. [7] recently showed that more than 90% of communications among actors are remote for certain applica- tions. Thus, taking into account relationship between actors and also their workload can significantly increase the ac- tual potential of platforms towards actor-based programming models for building low-latency and scalable applications [4]. Moreover, and as we will argue in this paper, because re- source usage and interaction of actors are directly related to application logic, without any application-level information, elasticity management services may fail to make efficient de- cisions. Therefore, it is crucial to have a fine-grained elasticity management service. This paper proposes a model for programmable elasticity: it requires the minimal input from the programmers (e.g., speci- fying upper bounds on memory/CPU utilization, specifying which actors communicate frequently), while still providing efficient and fine-grained scale-adjustment without affecting the semantics of the distributed application or its availability. Generally speaking, this programming model raises the level of abstraction for elastic cloud programming by allowing developers to build efficient cloud applications with minimal programming efforts.

Upload: others

Post on 20-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

Programmable Elasticity for Actor-based CloudApplications

Bo SangPurdue University

Srivatsan RaviUniversity of Southern California

Gustavo PetriIRIF – Universite Paris Diderot

Mahsa NajafzadehPurdue University

Masoud Saeida ArdekaniSamsung Research America

Patrick EugsterPurdue University,

TU Darmstadt

AbstractThe actor model is a popular paradigm for programming scal-able cloud applications. Building elastic and scalable cloudapplications requires application developers to carefully ad-just the application scale (the required resources) and theplacement of actors at the runtime. Unfortunately, there isno efficient solution which could manage application elastic-ity automatically during runtime without disrupting ongoingrequests. This paper proposes the idea of programmable elas-ticity approach, which allows application developers to definea set of elasticity rules for different actors. The runtime ser-vice endeavors to apply the elasticity rules while relieving theapplication programmer from dealing with the managementof distributed state and efficient utilization of cloud resources.

CCS Concepts • Computing methodologies → Distributedprogramming languages;

Keywords Cloud Application, Elasticity, Actor Model, Pro-gramming Model

ACM Reference Format:Bo Sang, Srivatsan Ravi, Gustavo Petri, Mahsa Najafzadeh, Ma-soud Saeida Ardekani, and Patrick Eugster. 2017. ProgrammableElasticity for Actor-based Cloud Applications. In Proceedings ofPLOS’17:9th Workshop on Programming Languages and Oper-ating Systems (PLOS’17). ACM, New York, NY, USA, 7 pages.https://doi.org/10.1145/3144555.3144558

1 IntroductionThe actor model [3] has gained popularity for building com-plex cloud-based distributed applications [10]. Some latestactor-based programming language (e.g, AEON [9] and Or-leans [4]) can help programmers implement cloud applica-tions in a more efficient and simple way. Actors are usedas service endpoints that process requests or tasks, and theirplacement can directly affect the application performance.

ACM acknowledges that this contribution was authored or co-authored byan employee, contractor or affiliate of a national government. As such, theGovernment retains a nonexclusive, royalty-free right to publish or reproducethis article, or to allow others to do so, for Government purposes only.PLOS’17, October 28, 2017, Shanghai, China© 2017 Association for Computing Machinery.ACM ISBN 978-1-4503-5153-9/17/10. . . $15.00https://doi.org/10.1145/3144555.3144558

Thus, it is important to decide how to distribute and placeactors across different servers to optimize performance andresource efficiency. Due to changes in workloads, the staticplacement of actors would not necessarily lead to the best per-formance and resource efficiency. For example, consider anonline messaging application where users, modeled as actors,who join the same chat room can talk to each other. To reducethe communication cost, the actors participating in the samechat room should be put on the same server. However, one ormany chat rooms may become big over time, such that onesingle server cannot handle the whole traffic. Moreover, usersmay actively join and/or leave different chat rooms, so it isnot straightforward to choose one server for placing the user’sactor. Thus, actor-based applications require an efficient elas-ticity management service in order to react to all workloadchanges by automatically 1) re-distributing actors amongavailable servers, and 2) provisioning and de-provisioningresources while ensuring Service Level Agreements (SLAs)and preserving the application semantics.

Newell et al. [7] recently showed that more than 90% ofcommunications among actors are remote for certain applica-tions. Thus, taking into account relationship between actorsand also their workload can significantly increase the ac-tual potential of platforms towards actor-based programmingmodels for building low-latency and scalable applications [4].Moreover, and as we will argue in this paper, because re-source usage and interaction of actors are directly related toapplication logic, without any application-level information,elasticity management services may fail to make efficient de-cisions. Therefore, it is crucial to have a fine-grained elasticitymanagement service.

This paper proposes a model for programmable elasticity: itrequires the minimal input from the programmers (e.g., speci-fying upper bounds on memory/CPU utilization, specifyingwhich actors communicate frequently), while still providingefficient and fine-grained scale-adjustment without affectingthe semantics of the distributed application or its availability.Generally speaking, this programming model raises the levelof abstraction for elastic cloud programming by allowingdevelopers to build efficient cloud applications with minimalprogramming efforts.

Page 2: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

PLOS’17, October 28, 2017, Shanghai, China B. Sang et al.

Heartbeat Messages

Message

Session

Session

Session

Session

Session

Session

Observer

Router

Actor Reference

Figure 1. Halo 4 Presence Service [4].Router actors execute CPU intensive tasks and session actors will

make call on corresponding player actors frequently.

Roadmap. The remainder of this paper is structured as fol-lows. § 2 motivates our programmable elasticity model us-ing examples of two popular cloud applications. § 3 detailsour language support for programmable elasticity while § 4presents the details of our elasticity management service thatimplements the runtime system support for migrating actors.§ 5 presents preliminary evaluation for a cloud game engine,§ 6 overviews related work and § 7 concludes the paper.

2 MotivationIn this section, we motivate the need for programmable elas-ticity through two examples: the presence service of a largescale elastic game (Microsoft Halo), and a large scale elasticdata structure (B+ tree).Halo 4 Presence Service. The presence service of Microsoft’sHalo 4 is responsible for tracking active game sessions, par-ticipating players, and their game status. This service is de-veloped using a distributed actor model called Orleans [4, 7].Orleans is an actor-based programming model extended fromC#. Its runtime system keeps monitoring communicationsamong actors, and collocate actors with frequent interactionson the same server.

As shown in Figure 4, the presence service has four typesof actors: (1) Router actors receive heartbeats from Xboxconsoles. These actors are responsible for extracting sessionIDs from the heartbeats, and forwarding messages to theright session actors. (2) A session actor represents a gamesession. It receives messages from router actors, updates itsstatus, and calls player actors that are in the game session.(3) A player actor represents a game player. (4) An observeractor receives real-time notifications form session actors, andmanages session actors if some errors happen.

Each of the above four actor types have different placementrequirements during runtime. Since router actors are CPU de-manding, a programmer may want to balance them out onvarious servers. Session actors, on the other hand, frequentlycommunicate with their corresponding player actors. There-fore, a programmer may decide to colocate a session actoralong with its corresponding player actors on a single server

10

3 5

1 2 3 4

V1 V2 V3 V4

12 14

10 11 12 13 14 15 16

V10 V11 V12 V13 V14 V15 V16

Inner Node

Leaf Node

Delete 5

Tree Edge

Node Migration

30

...

Server

21

Deleted

5

V5

Figure 2. A distributed B+ tree.Deleting key 5 will remove the key from all inner nodes and value 5from the leaf node. It also results in structure adjustment and assignnew parent to node with key [10,11]. Then the inner node with key

[10,11] and the leaf node with value [10, 11] may be migrated.

in order to reduce latency by eliminating remote messages.Moreover, since a player may join different sessions, the run-time system may need to continuously monitor them, andmove a player actor to a server that is hosting its correspond-ing session actor. Consequently, these elasticity decisions arequite application-specific, and a program-agnostic approachcan not easily capture execution logic of the application.B+ tree. B+ tree is a popular data structure used for fast index-ing. Figure 2 shows a typical implementation of distributed B+tree. Since the keys of B+ tree are associated with the data inthe storage, the size of B+ tree varies as workload changes i.e.,adding or deleting keys. In order to handle large amount ofdata, B+ tree can be implemented as an elastic distributed ap-plication using actor-based frameworks by considering eachB+ tree node as an actor.

The runtime system needs to decide how to automaticallypartition these actors among servers. One viable approachis to collocate nodes with frequent interactions on the sameserver, which is the strategy of Orleans. However, solely re-lying on communications among actors may lead to poordecisions. In a B+ tree, the size of leaf nodes is much largerthan inner nodes, and have much higher migration overhead.The runtime system likely makes efforts to place leaf nodeswith their parent inner nodes on the same server. Therefore,one server may not be able to hold all those leaf actors. Inaddition, deleting and adding keys (eg., deleting key 5 inFigure 2) may result in B+ tree structure adjustment. Suchadjustments can lead to frequent migration of leaf nodes.

The above examples show that the runtime system needsmore application-specific information in order to have effi-cient elastic deployments. In this paper, we argue that pro-grammers implementing elastic applications can augmenttheir applications with little effort, and define fine-grainedelasticity policies while the underlying runtime system en-deavors to arbitrate among possible elasticity policies.

Page 3: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

(a) Primitive Resources

at, any actor typef .count # of calls to func. f

re f ( f d)references

via field f df d fieldcpu cpu usagemem memory usagenet network usage

(b) Behaviors

colocate(at1,at2,cd)separate(at1,at2,cd)

pin(at)isolate(at,cd,re)

Table 1. Elements of Configuration Language’s Syntaxany indicates any actor type, cd presents the conditions to trigger

elasticity behaviors.

3 Elasticity Programming ModelIn order to enable application-centric elasticity management,the runtime system collects statistics about the resource usagepatterns of the application, and it provides a simple configu-ration language to adapt the resource usage according to theexecution patterns of workload observed so far.

3.1 Elasticity ModelFor actor systems, elasticity decisions boil down to how toplace actors between servers for the better adjustment to theworkload. While there are many variables that could affectthe performance of an actor, we restrict the runtime system tomeasure the average cpu/memory/network usage, along withaverage number of requests received within a period of time,and the interactions between two particular actors.

While the finest elasticity decisions can only be made at thelevel of individual actor instances, we consider that a reason-able compromise between programmability and fine-grainedcontrol can be achieved by providing elasticity decisions atthe level of actor types. Hence, all actors of the same typewill follow the same policies, and the fine-grained tuningbetween different instances of the same type is informed bythe statistics which are collected on a per-actor level.

Two factors influence the choices of actor placement. Firstly,we consider the amount of resources necessary for an actor.To this end, we allow programmers to inform the runtimesystem of their expected CPU and memory utilization. Thesecond factor is the communication patterns, which greatlyimpact the performance of the application depending on thelocality of the actors. In general terms, the runtime systemwill choose to colocate actors that interact often in the sameserver, hence reducing the overhead due to network latencies.

3.2 Configuration LanguageTo express elasticity policies, our programming model isintegrated with an actor programming language [9]. We willnot discuss the details of the language, but we will assume thatactors are equipped with fields, and provide remote functionsimplemented by means of message passing. In fact, elasticity

policies are described in a separate file, and can use elementsof the syntax presented in Table 1. This specific syntax allowsa programmer to specify policies at the level of actor types(ranged over with the variable at in the table). The reservedkeyword any is used to designate any actor type. Hence, ourconfiguration language integrates the elasticity concerns withactors, fields and function names from the application codes.

There are a number of primitive resources that the program-mer can use to guide the policies as listed in Table 1a. Whendescribing the policy of a certain actor type, it is sometimesuseful to know how many times a certain function has beencalled in the last epoch, i.e., a configurable period of time.To this end, we provide f .count. One could be interested inaffecting all instances of a certain actor type at in a similarfashion. That is the purpose of the syntax re f (at), which rep-resents the set of all instances of type at pointed to by theactor on which the policy is being defined.

On top of resources related to the logical structure of theapplication, we provide also primitives to obtain informationabout the percentage of CPU, memory, and network usageincurred by an actor. Together these quantifiers can be usedin conditions – with standard conditionals and boolean opera-tors – to indicate when to trigger a particular behavior. Theconditions could also be empty and indicate True to triggerthe behavior.

Elasticity behaviors, shown in Table 1b, indicate to theruntime system that a certain configuration of the actors isdesirable now. The first command, colocate(at1,at2,cd) indi-cates to the runtime system that whenever the condition cd(as described above) is satisfied on actors of type at1 and at2the system should strive to preserve the concerned actors inthe same server. Notice that the condition cd can restrict therelation between the actors of type at1 and at2. For example,using the resource re f (at2) one could restrict that the actor oftype at2 be pointed by some field of the actor of type at1. Thisis typically the case when the programmer knows that twoactors related by cd should interact often, or are concernedwith a common service of the application. Conversely, thesyntax separate(at1,at2,cd) instructs the system to attemptto keep the actors separate whenever resources are available ifcd is satisfied. This could be used for example if both actorsrun computationally demanding activities, thus potentially ob-structing each other’s operation if colocated. The rule pin(at)indicates that whenever possible actors of type at should notbe moved. This could be important for services that have tobe highly available, where migration could poorly impactavailability during the transition from one server to another.Finally, isolate(at,cd,re) indicates to the runtime system thedesire to keep actors of type at in a server of their own whenresources are available and condition cd is met. The additionalcondition re indicates conditions on the desired resources ofthe receiving node after the migration. Thus conditions onlyinvolving the resources cpu, mem or net are allowed. This

Page 4: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

PLOS’17, October 28, 2017, Shanghai, China B. Sang et al.

LEM LEM LEM

GEM

Elasticity Behaviors Sets

Migration Action Queue

Elasticity Control flow

Actor Communication

Actor Migration

Actor Server

Figure 3. The architecture of Runtime System.GEM (Global Elasticity Monitor) manages the scale of the

application while LEM (Local Elasticity Monitor) manages actorsof a single server.

is useful when the programmer knows ahead of time thatresource demands will increase temporarily.

Many of the elasticity decisions are predicated over partic-ular conditions of the runtime system state. To express theseconditions, the programming model provides statistics aboutthe resource usage of actors. In particular, these resourcesstatistics are calculated over fixed time slots, and the queriesalways refer to the last completed time slot. To simplify, weconsider that these statistics are collected based on the activityof the last minute.1

As we explained in Section 2, there are two elasticity poli-cies which may improve the performance of B+ tree applica-tion during the runtime:

separate(InnerNode,Lea f Node,)

colocate(InnerNode, InnerNode, ref(children))The first policy tries to avoid to put any InnerNode actor

and Lea f Node actor on the same server. The second policyrequires to colocate InnerNode actors with parent-child rela-tionship. In B+ tree, if one InnerNode actor is the parent ofanother InnerNode actor, the first actor must has a referenceto the second actor via its field children.

Discussion. We remark that there are many possible alterna-tives for our configuration language, including more preciseruntime statistics, more precise elasticity migration decisions(which could be embedded into the program logic itself), andmore specific choices for the migration of the actors. Ours is afirst attempt to illustrate the need for such fine-grained elastic-ity policies. We will in future work consider finer constructsfor our language.

4 Elasticity Management ServiceIn this section, we introduce the design of our elasticity man-agement service which implements the runtime system forthe configuration language introduced in the previous section.

1The actual duration of a time slot is configurable.

Figure 3 illustrates the architecture of the runtime system. Itcontains one Global Elasticity Monitor (GEM) and multipleLocal Elasticity Monitors (LEMs). The GEM constantly mon-itors how much resources are being used by an application,and adjusts the total resources (e.g., the scale of cluster) thatare being used by the application accordingly. In order toavoid server overload or underutilization, the GEM informsthe overloaded servers about idle servers for scaling up, andalso asks underutilized servers to migrate their actors to asingle server for scaling down.

Each LEM is responsible for managing actors running ontop of one single server. Once an LEM receives the list of idleservers from GEM, indicating its associated server is over-loaded, the LEM firstly sorts all the local actors in descendingorder of resource usage. The LEM selects the most resourceconsuming actor a along with all the other actors that com-municate with a. The communication strength between twoactors is determined by the behaviors and conditions in elas-ticity policies defined by programmers, and also by tracingactor communication patterns at the runtime. Then the LEMwill trigger a set of migration events for the chosen actors.These events are inserted into the pending migration eventqueue. This procedure continues until enough number of ac-tors have been chosen for migration, and server’s resourceusage comes back to normal.

At the same time, every local actor periodically aggregatesits resource usage and checks the elasticity policies. If all con-ditions of a policy are satisfied, the actor informs the LEMof the corresponding elasticity behavior. It is not straightfor-ward how to implement elasticity behaviors in a correct andefficient way. A behavior may result in unacceptably high mi-gration overheads which outweigh its benefits. Complicatingmatters further is that for a given behavior, there might bea wide range of migration choices. For instance, to colocateactor a and actor b, one could migrate a to b or migrate bto a and even migrate both a and b to another server. Worse,for a given actor a, there might be conflict behaviors likecolocate(a,b) and colocate(a,c).

To tackle this problem, the LEM finds the relevant setof behaviors. Thus, we define a transitive relation betweenany pairs of behaviors. Two behaviors are relevant if theyinclude at least one common actor. The LEM places all therelevant behaviors in the same set. Next the LEM generatesall possible sequences of migration actions correspondingto all behaviors within a set. Some sequences may includeconflicting migration actions, such as migrating actor a toserver S1 and migrating actor a to server S2. These sequenceshave been deleted directly. For the rest of the action sequences,the LEM uses a measurement function to pick up the bestaction sequence. All the actions in the selected sequence areinserted into the pending migration queue.

Afterwards, the LEM starts to process actions in the pend-ing migration queue. For every migration action, the localLEM negotiates with LEMs on the migration-target servers.

Page 5: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

If all the remote LEMs accepts the migration requests, thelocal LEM marks the action as accepted and performs themigration action. Otherwise, this action is simply removedfrom the queue.

Intuitively, the runtime system ensures that policy requestsare totally ordered and ensure all-or-nothing semantics, i.e., apolicy is executed in its entirety or not executed at all, whichpreserves the application semantics.

5 Preliminary EvaluationIn this section, we present our preliminary evaluation results.We evaluate the efficacy of our solution, and show that it canyield to substantial performance improvements.Experimental setup. We implemented our elasticity manage-ment service in AEON [9] runtime system and implementedHalo 4 Presence Service using AEON. We deployed this appli-cation on two Amazon AWS m1.small instances. We created10 router actors, 10 session actors and 1 observer actor inthe presence service. Each session actor had at most 4 playeractors. We deploy 10 clients on a m1.medium instance tosimulate the game consoles, which repeatedly sends heartbeatmessages to the presence service.Router actors. Compared to other actors, router actors aremore CPU-intensive because they require to extract data fromheartbeat messages. Thus, the runtime system should guar-antee the load of router actors are equally distributed amongservers. However, without information of the router actors’resource usage, the runtime system could only apply a simpleplacement strategy like assigning the equal number of actorsto each server.

Since all actors in presence service are dynamically deletedand created, the number of actors on servers may become im-balanced overtime. Consider an example where most sessionand player actors, which are not CPU-intensive, have beenscheduled on one of the two available servers. Upon creatingnew router actors, the runtime system can follow the simplestrategy that balances the number of actors on each server. Inthis case, all created router actors may be placed on a sameserver. On the other hand, if the runtime system recognizesthe resource usage pattern of router actors, it may decide tobalance the number of router actors among two servers, whichwill certainly lead to imbalanced total number of actors ateach server. Figure 4a shows that the workload-aware balancestrategy outperforms the simple number of actors (#actors)balance strategy, even if it results in different number of actorsat each server.Session actors and player actors. Each session actor has ref-erences to player actors which are in the session. Since sessionactors communicate with player actors on every heartbeat, thelatency could significantly be improved if the runtime systemcolocates these two actor types. To verify this, we performan experiment in which each server has the same number ofrouter actors, player actors and session actors. In order to

minimize the effect of router actors, we reduce their computa-tion workload. Using our programming model, programmerscan suggest the runtime system to colocate session actorsand their corresponding player actors. In a comparison setup,the runtime system ignores relationships between session ac-tors and player actors and randomly distributes player actors.Figure 4b shows that ignoring references among session andplayer actors, and random placement leads to 50% additionallatency and 33% less throughout.Observer actor and session actors. Though the referencesbetween two actors give useful hints to the runtime system,we could not simply colocate all pairs of actors with relevantreferences. In Halo 4 Presence Service, observer actors alsohave reference to session actors. However, observer actorsonly make calls on those session actors when they observeerrors, which occur infrequently. In our implementation, weonly have one observer actor because there are only 10 ses-sion actors. If the runtime system colocate all actors withreferences, all session actors and player actors must be placedon one single server. The runtime system ignores referencebetween actors if programmers do not specify them in thepolicies. The experimental results in Figure 4c clearly in-dicates that the references could not be a universal rule todetermine the placement of actors. The runtime system needsmore specific information from programmer, which can helpto arbitrate useful actor references.

6 Related WorkStateless applications. Unlike our general programming modelfor actor-based applications which provides fine-grained elas-ticity at the level of actors, several works have studied elas-ticity techniques for “stateless” cloud applications for whichmigrating individual state does not affect the semantics of theapplication itself.

There are a few of works [6, 8] which apply machine learn-ing algorithms to understand the workload and predict therequired number of virtual machines during the next timeperiod. Those methods manage computing resources at vir-tual machine level and target stateless applications like webapplications. However, actor-based applications require morefine-grained elasticity management at the actor level. Evenwith the same number of VMs, different placement of actorsin those applications may result in different performance.

AWS autoscaling [1] allows users to setup a dynamic clus-ter which can scale up and down according to user-definedconditions. This service relieves users from scaling adjust-ment and with the help of load balancer, it achieves automaticelasticity management for stateless applications. However,this service can not be used for stateful applications directlysince the scaling adjustment of stateful applications requiresstate migration.

AWS Lambda [2] is one step further compared to autoscal-ing. Users only need to upload their codes has called Lambda

Page 6: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

PLOS’17, October 28, 2017, Shanghai, China B. Sang et al.

#Actors Balance Workload BalanceThroughput(req/s) 122 148

Latency(ms) 84.12 70.01(a) #Actors Balance vs. Workload Balance

Rand. Placement Ref. PlacementThroughput(req/s) 247 371

Latency(ms) 41.14 27.45(b) Random Placement vs. Reference Placement

Ref. Placement Ignore Ref.Throughput(req/s) 220 371

Latency(ms) 46.26 27.45(c) Reference Placement vs. Ignore Reference

Figure 4. Preliminary Experiment Results for Halo 4 Presence Service.

Our Model EventWave [5] Orleans [7] AEON [9] Autoscaling [1] Lambda [2]Stateful vs. Stateless Stateful Stateful Stateful Stateful Stateless Stateless

Elasticity level Actor Actor Actor Actor VM Lambda functionProgrammable elasticity Yes No No No No No

Figure 5. Summary of elasticity management for cloud applications

function and to specify sources which will trigger Lambdafunctions. Then the platform is responsible for capability,scalability and availability for their applications. However,Lambda is similar to auto-scaling which only works withstateless applications. According to the introduction of AWSLambda, it is mostly used to implement back-end services.Stateful applications. EventWave [5] is an event-driven pro-gramming model which supports strict serializability. Thoughits runtime supports live actor migration without efforts fromprogrammers, programmers still need to specify the migrationdetails (e.g, which actor and which server). In conclusion, itdoes not have automatic elasticity management.

Microsoft Orleans [7] is a programming language designedfor Azure cloud platform. It is based on the actor model andextends the C# language. Its runtime already includes someoptimization mechanisms based on actor locality. However,those optimizations are not a real elasticity solution as it ig-nores too many important factors in elasticity managementlike resource usage and migration overhead. A complete elas-ticity solution at least needs to process the scaling adjustmentaccording to the workload. Furthermore, Orleans does notprovide programmers any control over the runtime elasticitymanagement of their applications. The runtime system couldonly obtain limited information by observing the behaviors ofactors. This kind of black-box approach may fail to provideefficient elasticity solution in some cases.

AEON [9] is an actor programming model with automaticelasticity. However, its runtime system only supports very sim-ple elasticity management consisting in distributing evenlyon servers. The dynamic connection between actors and re-sources required by actors have evident impact on the overallperformance. AEON’s simple elasticity management servicecould not capture these important features of the applications.

7 ConclusionIn this paper, we focus on the problem and solution spacefor fine-grained elasticity in actor-based cloud applications.Ideally, when building large-scale distributed applicationslike a game engine or complex data structures, the applicationprogrammer would like to program largely with sequentialsemantics in mind so that he would not need to worry aboutthe subtle concurrency bugs that may violate the applicationsemantics. Similarly, the application programmer would alsolike to specify fine-grained elasticity policies for scaling dis-tributed actors with minimal programming overhead. Thegoal of this work is enabling application programmers to im-prove the resource efficiency of their cloud applications withlittle efforts.

There are different ways to implement the elasticity poli-cies. How to effectively translate the elasticity policies intoreal implementations is still open. However, the runtime sys-tem could not follow programmer-defined policies directly.Some policies may conflict with each other, and some poli-cies can promise more benefits. In the future, we are goingto extend the LEMs with cost/benefit analysis. We will alsoinvestigate how to integrate service level agreements with ourprogramming model to satisfy the quality of services.

AcknowledgmentsThis research is supported by NSF grants # 1117065, # 1618923,# 1421910, European Research Council grant # FP7-617805“LiVeSoft – Lightweight Verification of Software” and Ger-man Research Foundation under grant # SFB-1053 “MAKI –Multi-mechanism Adaptation for Future Internet”.

Page 7: Programmable Elasticity for Actor-based Cloud Applications · 2017-12-05 · Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

Programmable Elasticity for Actor-based Cloud Applications PLOS’17, October 28, 2017, Shanghai, China

References[1] 2017. Autoscaling. (2017). https://aws.amazon.com/autoscaling/

https://aws.amazon.com/autoscaling/.[2] 2017. AWS Lambda. (2017). https://aws.amazon.com/lambda/

https://aws.amazon.com/lambda/.[3] Gul Agha and Carl Hewitt. 1987. Actors: A Conceptual Foundation for

Concurrent Object-Oriented Programming. In Research Directions inObject-Oriented Programming. 49–74.

[4] Philip A Bernstein, Sergey Bykov, Alan Geller, Gabriel Kliot, andJorgen Thelin. 2014. Orleans : Distributed Virtual Actors for Pro-grammability and Scalability. Technical Report. Microsoft Research.

[5] Wei-Chiu Chuang, Bo Sang, Sunghwan Yoo, Rui Gu, Milind Kulkarni,and Charles Edwin Killian. 2013. EventWave: Programming Modeland Runtime Support for Tightly-coupled Elastic Cloud Applications.In Symp. on Cloud Computing (SoCC). 21:1–21:16.

[6] Zhenhuan Gong, Xiaohui Gu, and John Wilkes. 2010. PRESS: PRe-dictive Elastic ReSource Scaling for cloud systems. In Int. Conf. onNetwork and Service Management (CNSM). 9–16.

[7] Andrew Newell, Gabriel Kliot, Ishai Menache, Aditya Gopalan, So-ramichi Akiyama, and Mark Silberstein. 2016. Optimizing DistributedActor Systems for Dynamic Interactive Services. In Euro. Conf. onComp. Sys. (EuroSys). 38:1–38:15.

[8] Hiep Nguyen, Zhiming Shen, Xiaohui Gu, Sethuraman Subbiah, andJohn Wilkes. 2013. AGILE: Elastic Distributed Resource Scalingfor Infrastructure-as-a-Service. In Int. Conf. on Autonomic Computing(ICAC). 69–82.

[9] Bo Sang, Gustavo Petri, Masoud Saeida Ardekani, Srivatsan Ravi, andPatrick Th. Eugster. 2016. Programming Scalable Cloud Services withAEON. In Int. Conf. on Middleware (MIDDLEWARE). 16:1–16:14.

[10] Samira Tasharofi, Peter Dinges, and Ralph E. Johnson. 2013. WhyDo Scala Developers Mix the Actor Model with other ConcurrencyModels?. In European conference on Object-Oriented Programming(ECOOP). 302–326.