Resource Allocation across Multiple Cloud Data Centres

Barnaby Malet
Imperial College London
180 Queen’s Gate, London SW7 2AZ, UK
[email protected]

Peter Pietzuch
Imperial College London
180 Queen’s Gate, London SW7 2AZ, UK
[email protected]

ABSTRACT

Web applications with rich AJAX-driven user interfaces make asynchronous server-side calls to switch application state. To provide the best user experience, the response time of these calls must be as low as possible. Since response time is bounded by network delay, it can be minimised by placing application components closest to the network location of the majority of anticipated users. However, with a limited budget for hosting applications, developers need to select data centre locations strategically. In practice, the best choice is difficult to achieve manually due to dynamic client workloads and effects such as flash crowds.

In this paper, we propose a cloud management middleware that automatically adjusts the placement of web application components across multiple cloud data centres. Based on observations and predictions of client request rates, it migrates application components between data centres. Our evaluation with two data centres and globally distributed clients on PlanetLab shows that our approach can decrease median client response times by 21% for a realistic multi-tier web application.

Categories and Subject Descriptors
C.2.4 [Distributed Systems]: Distributed applications

General Terms

Management, Performance, Experimentation, Design

Keywords

Resource allocation, Management middleware, Web applications, Cloud computing

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MGC ’2010, 29 November - 3 December 2010, Bangalore, India.
Copyright 2010 ACM 978-1-4503-0453-5/10/11 ...$10.00.

1. INTRODUCTION
The benefits of hosting web applications in the cloud are becoming increasingly attractive to both individuals and businesses. By using resources on demand in cloud data centres, developers can significantly reduce management and deployment overheads. Infrastructure-as-a-Service (IaaS) solutions, such as Amazon Web Services [3] and Rackspace Cloud [16], operate multiple, geographically-distributed data centre locations with resources for computation, storage and communication. Locations are priced differently based on bandwidth and computational usage and provide different performance to clients based on their network location.

Many of today’s web applications expose rich user interfaces (UIs) written in JavaScript, which are designed to give a user experience similar to that of native desktop applications. They commonly employ a method called Asynchronous JavaScript And XML (AJAX) [12] to fetch further content after the application has loaded. AJAX enables the web browser to make background calls to the remote web server. This can be used to change UI elements without reloading the page and to provide responsive behaviour. The quality of the user experience is strongly correlated with the response time of the AJAX calls—high response times lead to sluggishness of the UI.

Given a fixed budget, the choice of cloud data centre in which to host an application is therefore an important decision: placing application components close to a large user base results in better quality-of-service due to lower latency and increased throughput; however, it may also incur a higher cost if resources at that location are more expensive. This problem is complicated by the fact that resource allocation decisions have to take changes of the workload into account.

Resources are available as virtual machines (VMs) that appear as traditional servers. Techniques for migrating VMs across LANs [8] are widely available, but the ability to migrate VMs across WANs remains more challenging [23]. We argue that these mechanisms should be used to strategically place application VMs across disparate geographic locations in order to optimise user-perceived latency and cost.

To achieve this goal, we propose a cloud management middleware that strategically migrates VMs of multi-tiered web applications between data centre sites in response to workload changes. Our middleware monitors client-server workload at the data centre gateway. This information is used by a control plane to initiate VM migrations between data centres in order to move application components closer to clients. The web clients are aware of the migration mechanism and redirect their traffic to the newly migrated VM.

This approach achieves lower network round trip times, improving client response times in the web application, without requiring additional VMs. Our evaluation results on the global PlanetLab test-bed [6] together with two data centres show that this approach can improve the median response time of a web application by 21% and the 80th percentile by 7% over a single data centre deployment.

In summary, the main contributions of this paper are:

1. an approach for reorganising two-tiered web applications by allocating resources across data centres in response to workload changes;

2. a design of a cloud management middleware that dynamically migrates VMs based on client workload to provide maximum quality of service to clients; and

3. a prototype middleware implementation that optimises the execution of a web application, and its evaluation with two data centres and many clients on PlanetLab [6].

In the next section, we motivate our work. Section 3 describes the design of the cloud management middleware, focusing on the VM migration and placement algorithms. We give results from an Internet deployment in Section 4. The paper finishes with related work in Section 5 and conclusions in Section 6.

2. MOTIVATION
Modern Internet applications are typically designed using a multi-tier architecture, in which each tier is responsible for a subset of the application’s computation and/or storage. Typical examples are the RUBiS [2] and TPC-W [5] web applications, which consist of a web server tier, an application tier and a database tier.

Due to recent advances in virtualisation technology [4, 20] and its associated benefits in data centre management, applications are no longer hosted on dedicated machines but instead placed within VMs. This shift has permitted researchers to develop autonomic management solutions that can resolve server hot-spots and performance bottlenecks using VM migration and provisioning [19, 24]. These techniques have been used by cloud providers to increase performance and ultimately provide better service level agreements (SLAs).

The widespread use of content delivery networks (CDNs) demonstrates the advantages of placing web servers close to clients [14]. However, the entry cost for using a CDN remains high and often beyond what small businesses can afford. A further disadvantage of CDNs is that they focus on content distribution and thus do not cater for custom computation and persistence. These shortcomings, coupled with the geographic diversity of cloud compute sites, provide a strong motivation for a cloud management middleware that optimises the placement of VMs.

To illustrate our idea, consider a scenario in which a web start-up deploys a multi-tiered web application in the cloud. It has a limited budget and thus can only afford a small number of VM instances. The choice of data centre locations is large and growing steadily, and the start-up must decide how to distribute its resources among the sites in order to provide the best service to its users.

This problem is non-trivial: different peak times across geographic regions mean that it is difficult to estimate where the majority of users will be located at any given time. In addition, the application may face unpredictable traffic of varying intensity and duration, such as flash crowds [10].

Figure 1: Cloud management middleware with two deployed web applications. Each application consists of an application tier (Web 1 or Web 2) and a storage tier (Data 1 or Data 2).

The decision to move or create a VM close to the clients may provide a benefit (i.e., decreased latency or increased throughput) but may also incur a cost (i.e., moving data and/or increased hosting fees). Since client workload changes can occur rapidly and unexpectedly, such decisions have to be periodically re-evaluated. We argue that this should be done by a cloud management middleware that makes these decisions dynamically at a fine granularity.

3. CLOUD MIDDLEWARE DESIGN
Our solution is to augment a traditional IaaS offering such as AWS [3] with a cloud management middleware. Our design aims to reduce the client response times of two-tiered web applications consisting of an application tier and a storage tier. It achieves this by dynamically moving application tiers and their associated data closer to clients based on the current and previous workloads.

Our middleware consists of the components shown in Figure 1. Clients bootstrap by downloading application code (such as HTML/JavaScript) from a web server whose IP address is resolved via DNS. The application code is static and can be distributed via traditional means (such as CDNs).

Clients track the movement of the application tiers using a JavaScript layer through which all server-side calls are passed. This means that application code can transparently refer to remote application components without having to know their current IP address. (This imposes the constraint that anchor tags must refer to a JavaScript function rather than a URL—the JavaScript layer then forms the URL based on the current server IP address.)

A control plane uses the client workload observed at the data centres to decide where best to place VMs. It coordinates VM migrations and traffic routing: when a migration decision has been made, a new VM image is instantiated at the target location and the application state is transferred. The clients then re-direct their traffic to the new site.

In terms of strategy (cf. Section 3.3), the control plane migrates application tier VMs freely between sites—we assume that they are stateless and have a low migration cost. In contrast, VMs supporting the storage tier are placed at the location that has historically seen the highest access. This is because data migration is considered to consume more resources.

In the example in Figure 1, both tiers of application 1 (Web 1 and Data 1) are placed in the UK, close to the majority of clients. Application 2 is reacting to a sudden increase in usage from the US by moving its application tier (Web 2) there, but leaving its database tier (Data 2) in place.

3.1 Web application architecture
Modern web applications are frequently built using the representational state transfer (REST) [11] architecture. It enforces a strict separation between the client, which manages application state, and the web service, which is invoked by the client upon a change in application state. At any point in time, a client can either be in transition between application states or “at rest”. When a client is at rest, it exerts no load on the server; commonly, a RESTful architecture relies on the HTTP protocol for communication. As opposed to a traditional web application such as RUBiS, where the line between client and server responsibility is blurred because the server maintains application state, a RESTful web service is stateless.

Building a RESTful web application allows developers to draw clear lines of responsibility between components and thus simplifies their implementation. Because RESTful applications are stateless, components can easily be duplicated to adapt to increases in load. Furthermore, intermediaries such as proxies, gateways and firewalls can be introduced without changing the interface between components. These advantages make this type of architecture popular in modern web applications and we assume it throughout the rest of the paper.

Modern web applications make heavy use of JavaScript on the client side to provide a rich user experience similar to desktop applications. In our work, we exploit this to get around the problem of maintaining VM addresses during a WAN migration: we extend client code with an intermediate JavaScript layer that acts as a proxy for all client-issued RESTful calls. This intermediary is aware of the current network location of the VM in terms of its IP address. It maintains a persistent HTTP connection to the control plane, which is used to notify the client of a VM migration; when this occurs, the JavaScript layer redirects RESTful calls accordingly. This technique requires a browser that supports the W3C Cross-Origin Resource Sharing recommendation [21] that permits JavaScript calls to web servers other than those from which the application code originated.

This approach allows for fine-grained control of load-balancing for this type of web application, which other techniques, such as round-robin DNS and HTTP redirecting, do not support. By providing a JavaScript client library, web applications can be extended to handle VM mobility with only minor changes to their code base.
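For illustration, a minimal TypeScript sketch of such a client library is shown below. The names are hypothetical, and server-sent events are used as one possible realisation of the persistent HTTP connection to the control plane; the implementation details above are not prescribed by this sketch.

    // Illustrative client-side layer (names are hypothetical). All RESTful
    // calls pass through this proxy so that the application never hard-codes
    // a server address; a long-lived connection to the control plane delivers
    // the new origin when a VM is migrated.
    class MigrationAwareClient {
      private origin: string;

      constructor(initialOrigin: string, controlPlaneUrl: string) {
        this.origin = initialOrigin;
        // Server-sent events as one way to realise the persistent HTTP
        // connection; each message carries the current application-tier origin.
        const events = new EventSource(controlPlaneUrl);
        events.onmessage = (e: MessageEvent) => {
          this.origin = JSON.parse(e.data).origin; // e.g. "http://198.51.100.7"
        };
      }

      // Issue a RESTful call against whichever data centre currently hosts the
      // application tier. CORS support is required because the origin changes.
      call(path: string, init?: RequestInit): Promise<Response> {
        return fetch(this.origin + path, { ...init, mode: "cors" });
      }
    }

    // Anchor tags invoke a function instead of linking to a fixed URL:
    const api = new MigrationAwareClient("http://eu.example.org", "http://control.example.org/events");
    // api.call("/items/42").then(r => r.json()).then(/* render the result */);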

3.2 Virtual machine migration
Mechanisms for migrating VMs across WANs remain elusive. Migrating a live VM requires high-speed networks, access to a shared block device and control over the network address space [8]—none of which is available when moving VMs between data centres over the Internet. Since resolving these issues requires changes to hypervisors and networks [23], we instead assume that a consistent image of each VM is available at every data centre. The middleware can then use a start-stop mechanism for migration, while coordinating with the clients to redirect traffic, as described previously. Next we explain this approach in more detail.

3.2.1 Application tier VMs

Since we assume that applications are built using a RESTful architecture, they are stateless and can be migrated by starting a new instance at the target location and stopping the old instance. In practice, an application tier may maintain an in-memory cache to reduce calls to the data tier; in these cases, the newly-instantiated application would lazily re-populate the cache using the data tier.

3.2.2 Storage tier VMs

We assume that the storage tier contains the application state, and its VMs must be migrated differently from the application tier VMs. Web applications often maintain a shared database of user-generated content. For RUBiS and TPC-W, it is in the range of a few hundred megabytes to a few gigabytes [2]. This data is challenging to replicate at multiple remote sites because it must remain consistent.

For simplicity, we opt to maintain a single active copy of the data at any point in time. Migration is therefore implemented as part of the application: to migrate a storage tier VM, an identical VM with an empty database is instantiated at the target location. The data is then iteratively copied to the target. Any concurrent database changes to the source data are also mirrored at the target. After the data has been copied, the target becomes the active copy and the source is switched off.

This approach is application-specific, and requires that the storage system is able to maintain a synchronised copy of the data during transfer. It incurs a considerable network cost for transferring the entire state, but the middleware tries to migrate storage tiers infrequently. As future work, storage tier migration could be improved using a distributed file system that handles consistency between replicas [18].
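The following TypeScript sketch illustrates this handover logic. The interfaces are assumptions introduced for illustration only; the actual migration is implemented inside the application and is storage-system specific.

    // Hypothetical interfaces for a storage replica (not an actual API).
    interface StorageReplica {
      snapshotChunks(): AsyncIterable<Uint8Array>;                          // iterative bulk copy
      changesSince(cursor: number): Promise<{ cursor: number; ops: Uint8Array[] }>;
      apply(ops: Uint8Array[]): Promise<void>;
      setActive(active: boolean): Promise<void>;
    }

    async function migrateStorageTier(source: StorageReplica, target: StorageReplica): Promise<void> {
      // 1. Iteratively copy the data set into the freshly instantiated, empty target VM.
      for await (const chunk of source.snapshotChunks()) {
        await target.apply([chunk]);
      }
      // 2. Mirror changes that reached the source during the copy until the target
      //    has caught up (a real deployment would briefly pause writes for the final round).
      let cursor = 0;
      for (;;) {
        const delta = await source.changesSince(cursor);
        if (delta.ops.length === 0) break;
        await target.apply(delta.ops);
        cursor = delta.cursor;
      }
      // 3. The target becomes the single active copy and the source is switched off.
      await source.setActive(false);
      await target.setActive(true);
    }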

3.3 Virtual machine placement
The middleware uses the following simple algorithms to translate the workload observations into migration commands for VMs hosted at data centres.

3.3.1 Placement goals

There are many potential goals when optimising the placement of VMs: reducing monetary costs [15], increasing application throughput or decreasing latency. The specific goal depends on the nature of the underlying application. For example, a large map-reduce job [9] that is not time-critical may strive to reduce monetary costs by choosing sites that offer the least expensive compute and network transfer rates. Web applications may have SLAs to meet, aiming to reduce response times as much as possible. In the case of web applications that expose JavaScript-driven GUIs using AJAX, reducing network round-trip times (RTT) is key.

Another decision to be made is which clients should achieve a given goal. For example, the middleware may prioritise a subset of clients belonging to paying customers and reduce their response times. Alternatively, it could try to achieve a minimum SLA such that all clients would be guaranteed a maximum bound on their RTTs when supporting an interactive web GUI. In this work, we choose to reduce the response times of web requests for the majority of globally distributed clients. We leave more complex placement goals that trade off gain and cost for future work.

3.3.2 Placement algorithms

We now present the algorithms that decide the placement location of the web application tiers. We treat the migration of application and storage VMs separately because storage VMs are more expensive to migrate due to communication costs. The middleware employs a reactive algorithm for the placement of application VMs and a predictive algorithm for storage VMs.

Reactive placement algorithm.
The goal of the reactive placement algorithm is to respond quickly to variations in application workload by moving application tier VMs. Intuitively, the system considers the optimal location of application VMs to be the site closest to the majority of the clients issuing requests within a given time window.

The control plane maintains a sliding window for each hosted web application. The window contains one entry per client web request, recording its origin location and size; as new requests reach the servers, they are added to the window. The control plane also maintains a list of available data centre locations DC.

At regular intervals, the control plane iterates through the window, counting requests per data centre location. A request is counted for a site if that site is closest to the requesting client in terms of geographic distance. Each site x is then assigned a value λ, which is the weighted sum of the requests counted for it:

λ(x) = Σ_{i=0}^{|requests(x)|} W_i

where W_i are the request weights, which can be tuned to prioritise some client requests over others. The new placement location is chosen as the data centre with the highest λ value:

arg max_{x ∈ DC} λ(x)

The size of the window used to record the requests governs how sensitive the system is to changes in client workload. In our experiments in Section 4, we empirically determined a window size of 10,000 requests to be rigid enough to prevent unnecessary oscillations but sufficiently flexible to permit migrations that compensate for a large increase in requests from a site. In general, we would expect the window size to be chosen adaptively to achieve a desired number of migrations.
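The TypeScript sketch below illustrates one placement round under these definitions. The request and site records, and the use of great-circle distance as the measure of geographic distance, are assumptions made for illustration.

    // Illustrative sketch of one reactive placement round (types are assumed).
    interface ClientRequest { lat: number; lon: number; weight: number; }
    interface Site { name: string; lat: number; lon: number; }

    // Great-circle (haversine) distance as a proxy for geographic distance.
    function distanceKm(aLat: number, aLon: number, bLat: number, bLon: number): number {
      const rad = Math.PI / 180, R = 6371;
      const dLat = (bLat - aLat) * rad, dLon = (bLon - aLon) * rad;
      const h = Math.sin(dLat / 2) ** 2 +
                Math.cos(aLat * rad) * Math.cos(bLat * rad) * Math.sin(dLon / 2) ** 2;
      return 2 * R * Math.asin(Math.sqrt(h));
    }

    // lambda(x) is the weighted sum of window requests whose nearest site is x;
    // the application tier is placed at the site with the highest lambda value.
    // Assumes the list of sites is non-empty.
    function reactivePlacement(requestWindow: ClientRequest[], sites: Site[]): Site {
      const lambda = new Map<string, number>();
      for (const r of requestWindow) {
        const nearest = sites.reduce((a, b) =>
          distanceKm(r.lat, r.lon, a.lat, a.lon) <= distanceKm(r.lat, r.lon, b.lat, b.lon) ? a : b);
        lambda.set(nearest.name, (lambda.get(nearest.name) ?? 0) + r.weight);
      }
      return sites.reduce((a, b) => (lambda.get(a.name) ?? 0) >= (lambda.get(b.name) ?? 0) ? a : b);
    }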

Predictive placement algorithm.
Because storage tiers are considered costly to migrate, their location must be chosen more strategically. In order to achieve a reasonable degree of accuracy when placing storage VMs, we use a predictive algorithm inspired by the technique employed by Rolia et al. [17]. It uses past observations of application workload to predict future peak demands.

The placement location is predicted at the granularity of hours. The best location of a storage VM for a given hour is the location that was closest to the majority of client accesses for that hour over the past several days. A storage VM is migrated at hour t + 1 if the predicted location is different from that at hour t and remains the same over the next w hours. Intuitively, this algorithm splits up a day into contiguous periods of at least w hours, moving storage VMs at the boundary between these periods.

This approach assumes that there are recurring patterns between days (cf. Figure 2). This means that it cannot adapt to unpredicted events such as flash crowds. However, such events would be handled by the reactive placement algorithm above. More advanced prediction techniques, such as a combined reactive and predictive approach [19], could be used instead.
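A TypeScript sketch of the hourly prediction and the stability check follows; the per-hour majority-location history and the parameter names are assumptions for illustration.

    // history[day][hour] holds the site that saw the majority of accesses during
    // that hour of that past day (data layout assumed; history is non-empty).
    function predictedSite(history: string[][], hour: number): string {
      const votes = new Map<string, number>();
      for (const day of history) {
        votes.set(day[hour], (votes.get(day[hour]) ?? 0) + 1);
      }
      // Return the site that was closest to the majority of accesses most often.
      return [...votes.entries()].reduce((a, b) => (a[1] >= b[1] ? a : b))[0];
    }

    // Migrate the storage tier at hour t + 1 only if the predicted location
    // differs from the current one and stays the same for the next w hours,
    // i.e. data only moves at the boundary between contiguous periods.
    function storageMigrationTarget(history: string[][], current: string, t: number, w: number): string | null {
      const next = predictedSite(history, (t + 1) % 24);
      if (next === current) return null;
      for (let k = 2; k <= w; k++) {
        if (predictedSite(history, (t + k) % 24) !== next) return null;
      }
      return next; // target site for the storage VM
    }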

4. EVALUATION
The goal of our experimental evaluation is to demonstrate that our cloud management middleware decreases the response time of a globally accessed web application. By placing VMs strategically, it can achieve this without requiring additional resources in the data centres.

4.1 Experimental set-up
The prototype deployment of our system consists of two data centre sites, one in the US hosted at the Emulab test-bed facility [22] and another in the UK hosted in the data centre of our university. We use the Xen Hypervisor [4] to multiplex each server into a number of separate VMs, which host individual application and storage tiers.

We use four virtual machines to host the various tiers in all experiments—two in the US and two in the UK. The application tiers use an Apache web server running PHP scripts that generate static data or retrieve data from a MySQL database in the storage tier. Client requests to the web server include the desired size of the response document and a flag indicating its source (database or static). A hard limit on the request size of 300 KB is set to put a bound on the amount of network resources used in the experiments.

For the application workload, we use a 24-hour request trace from our department’s web server. The trace is an Apache log file that contains HTTP status codes, request sizes, request times and client IP addresses. We adapt the trace in the following ways: (1) Only HTTP “OK (200)” status codes are kept. (2) All internal university requests are removed. (3) All requests from locations outside the US or the UK are discarded. To expedite experimentation, the 24-hour trace is uniformly compressed into a 2-hour run. Since we do not know how many of the requests in the original trace involved database accesses, we assume 10% of the requests to require database transactions due to efficient caching.
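The trace preparation can be sketched as follows in TypeScript; the field names are assumptions, and the compression of 24 hours into 2 hours corresponds to dividing every timestamp by 12.

    // Illustrative trace preparation (field names assumed).
    interface LogEntry {
      status: number;          // HTTP status code
      timeSec: number;         // request time, seconds since start of trace
      sizeBytes: number;       // response size
      region: "US" | "UK" | "OTHER";
      internal: boolean;       // request from inside the university network
    }

    function prepareTrace(entries: LogEntry[]): LogEntry[] {
      return entries
        .filter(e => e.status === 200)        // (1) keep only HTTP "OK (200)" responses
        .filter(e => !e.internal)             // (2) drop internal university requests
        .filter(e => e.region !== "OTHER")    // (3) keep US and UK clients only
        .map(e => ({ ...e, timeSec: e.timeSec / 12 })); // compress 24 hours into 2 hours
    }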

We use six PlanetLab nodes to emulate requests from a world-wide client population. The clients are split into EU and US clients. Each group is placed at a minimum 800 km radius from their respective data centres. The EU (respectively US) clients are assigned the UK (respectively US) portions of the trace to replay. Since we use three clients per sub-trace, this is equivalent to tripling the trace, resulting in a mean request rate of 28.6 requests/second over the two-hour period.

The clients replay the transformed trace using a multi-threaded HTTP downloader. For simplicity of implementation, we did not implement the JavaScript layer on the clients; instead, all URLs in the workload reference the IP addresses of gateways placed in the data centres. When the control plane makes a migration decision, the gateways issue redirect responses to redirect the client traffic to the new location.
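For illustration, a gateway of this kind can be as simple as the following TypeScript sketch; the port, the active origin and the use of a temporary-redirect status code are assumptions, since the gateway implementation is not detailed here.

    // Minimal illustrative gateway: every request is answered with a temporary
    // redirect to the data centre that currently hosts the application tier.
    import * as http from "node:http";

    let activeOrigin = "http://uk.example.org"; // updated by the control plane

    http.createServer((req, res) => {
      res.writeHead(307, { Location: activeOrigin + (req.url ?? "/") });
      res.end();
    }).listen(8080);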

Figure 2: Observed client request rate, split into US and EU requests. (The plot shows US and EU requests per second and the data centre location of the application VM over the elapsed time in minutes.)

4.2 Results
Figure 2 shows the split in client request workload over time based on client location, as observed at the application servers. As mentioned previously, the workload is obtained by uniformly compressing a day trace of our department’s web servers into a 2-hour period. Due to our compressed replay of the trace, each hour in the original trace is mapped to a 5-minute slot in the workload. Minute 0 corresponds to 1:21am in the original trace, minute 5 corresponds to 2:21am and so forth—the intersection of the request rates at minute 40 corresponds to 9:21am in the original trace.

The VM Location line shows the location of the application VM as decided by the reactive placement algorithm, where a low (respectively high) value indicates that the VM is in the US (respectively UK). We can see that the location of the VM closely follows the user base issuing the majority of requests at any given time: the VM is in the US for the first 40 minutes and then moves to the UK because the UK rate becomes higher than the US rate. Similarly, after 90 minutes, it switches back to the US. There are oscillations causing the VM to move back and forth between continents after around 90 minutes because the UK and US request rates are similar. Such oscillations are unwanted side effects of workload variations and we are investigating dampening approaches to suppress them.

The distribution of client response times is shown in Figure 3. Response time is measured at the client as the time elapsed between a request being made and the full receipt of the server’s response. We compare three resource allocation strategies: (1) singleDC places all VMs in the UK data centre and does not migrate them; (2) application is the reactive placement algorithm with a window size of 10,000 that moves the application VMs only, without migrating the storage VMs; and (3) application+data also migrates the storage VMs using the predictive placement algorithm. The placement algorithm is trained with “normal” days that did not exhibit any request surges in the workload.

In both reactive approaches, the control plane places the application components closer to the majority of clients at any given time. The results show that the application+data approach achieves a 21% decrease in client response times over the singleDC approach for the median and a 7% decrease for the 80th percentile. The application approach does not do as well, with a 15% median decrease but a 7% increase at the 80th percentile.

Figure 3: Distribution of client response times under different resource allocation strategies. (The plot shows the CDF of response time in milliseconds for the singleDC, application and application+data strategies.)

The lower performance is caused by the dependency between the application and storage tiers when the application tier is placed in the US: 10% of requests involve database accesses that require transatlantic communication between the two data centres. Since this happens over a slower network path, it is faster in these cases for the client to directly access the UK site. We believe that the performance could be improved if the data centres had better network connectivity or less data was transferred between the two tiers. This could be achieved by caching or optimisations at the application level.

Around 25% of requests perform significantly better in the singleDC allocation. At any point in time, the web application has requests originating from both regions. In cases where the adaptive approaches have placed application components in the US in response to client locality changes, the minority of requests originating from EU clients exhibit significantly higher response times.

During experimentation, we observed that improvements are sensitive to the locality of the clients in relation to the data centres. To illustrate this, consider the following scenario that we encountered: an application runs in a London data centre, with the majority of clients in London. A sudden increase in requests from the US causes the application to move to the US site. Although the US clients benefit from this move, this is not reflected as an overall improvement because the London clients experience a drastic performance decrease. This is an interesting trade-off and relates back to our discussion on optimisation goals in Section 3.3.1—we leave this for future work.

5. RELATED WORK
Urgaonkar et al. [19] strategically allocate VMs to application tiers in web applications within a single data centre. Their work requires more granular workload prediction to provision the correct number of VMs per application tier. Volley [1] addresses the problem of data placement in a multi-data centre context to reduce data access latency. They employ an iterative optimisation algorithm based on access patterns and data locations. They target large applications and their optimisation is performed on the order of days—in contrast to our fine-grained placement decisions.

To the best of our knowledge, no previous work places VMs closer to users on the Internet. However, migrating VMs for improved performance within a data centre has been widely studied in recent years. Sandpiper [24] alleviates server resource hot-spots using migration. Memory-
