2013 IEEE 6th International Conference on Cloud Computing (CLOUD), Santa Clara, CA
Smart CloudBench - Automated Performance Benchmarking of the Cloud
Mohan Baruwal Chhetri, Sergei Chichin, Quoc Bao Vo, Ryszard Kowalczyk
Faculty of Information & Communication Technologies, Swinburne University of Technology,
Melbourne, Australia
{mchhetri, schichin, bvo, rkowalczyk}@swin.edu.au
Abstract—As the rate of cloud computing adoption grows, so does the need for consumption assistance. Enterprises that are looking to migrate their IT systems to the cloud would like to quickly identify providers that offer resources with the most appropriate pricing and performance levels to match their specific business needs. However, no two vendors offer the same resource configurations, pricing and provisioning models, making the task of selecting appropriate computing resources complex, time-consuming and expensive. In this paper, we present Smart CloudBench - a platform that automates the performance benchmarking of cloud infrastructure, helping potential consumers quickly identify the cloud providers that can deliver the most appropriate price/performance levels to meet their specific requirements. Users can estimate the actual performance of the different cloud platforms by testing representative benchmark applications under representative load conditions. Experimentation using the prototype implementation shows that a higher price does not necessarily translate to better or more consistent performance, and benchmarking results can provide more information to help enterprises make better informed decisions.
Keywords-Cloud computing, Benchmarking, Cloud Infrastructure, Cloud Performance, Performance Evaluation, Automated Benchmarking
I. INTRODUCTION
In recent years, cloud computing has emerged as a
major disruptive technology that has changed the way
in which enterprises procure and consume computing re-
sources. There has been an exponential growth in the
number of Infrastructure-as-a-Service (IaaS) vendors and
their offerings, with a corresponding increase in the number
of enterprises looking to migrate some, or all of their
IT systems to the cloud. However, as the rate of cloud
computing adoption grows, so does the need for consump-
tion assistance. Different providers offer different resource
configurations and use different pricing and provisioning
models, and, while information about pricing levels and
supported resource configurations is publicly available,
there is limited information about the resource performance
levels. Such information is important for organizations looking
for opportunities to migrate their in-house IT systems to the
cloud. They would like to obtain a quick assessment of the
price/performance levels of different IaaS providers before
making any migration decisions.
A naive approach would be for an enterprise to deploy its own application on the target platforms, benchmark it with the range of possible workloads, measure the actual performance and analyse the test results. However, such an
approach is complex, time-consuming and expensive, and
very few organizations possess the time, resources and in-
house expertise to do a thorough and proactive evaluation
on multiple cloud platforms. A practical alternative is to test
representative applications against representative workloads
to estimate the performance of different cloud providers. For
example, the representative application for an e-commerce
website would be TPC-W, which is a transactional web
e-commerce benchmark, and the representative workload
would be the estimated or measured number of concurrent requests to the website. The benchmarking results of
representative benchmarks could be used to quantify the
application performance on the different IaaS platforms and
to obtain valuable insights into the difference in performance
across providers. By combining the benchmark results with
pricing information, enterprises can better identify the most
appropriate cloud providers and offerings based on their
specific business needs.
In this paper, we present the Smart CloudBench plat-
form, a system that enables the measurement of infrastruc-
ture performance in an efficient, quick and cost-effective
manner, through the automated execution of representative
benchmarks on multiple IaaS clouds to measure their per-
formance levels under different workload conditions. Smart CloudBench offers an extensible suite of benchmark applications corresponding to the most common types of applications hosted on the cloud1. Prospective cloud consumers can use it to (i) select the representative application(s) for evaluating cloud performance, (ii) configure the test harness, (iii) select and acquire instances on the cloud platforms to be tested, (iv) launch the tests, and (v) gather and use the results to build a price/performance matrix that supports decision-making for provider and resource selection.
1Today, the cloud is increasingly being used for hosting web applications, high performance computing applications, social networks, media streaming applications, and for the development and testing of complex enterprise applications. Ideally, there will be a benchmark application in the suite corresponding to each type of application.
978-0-7695-5028-2/13 $26.00 © 2013 IEEE
DOI 10.1109/CLOUD.2013.7
The key benefits of using the Smart CloudBench are:
• Reduced time and effort involved in benchmarking
cloud platforms. If the number of cloud instances to
benchmark is high, and the number of representa-
tive applications is large, then manually executing the
benchmarking process becomes a very cumbersome
exercise.
• Reduced cost of performance testing. Since the cloud
resources to be tested can be commissioned just in time
and decommissioned immediately after completion of
the tests, there are significant cost savings.
• Minimised human error due to simplified repetition of
the benchmarking process. While the initial investment
is large, subsequent executions become easy and quick.
• Automatic generation of reports based on the test results
for consumption by non-technical audiences.
• Centralised storage of performance data over time
which enables analysis of performance evolution.
• Benchmarking as a Service (BaaS) - the entire process
of performance benchmarking of cloud infrastructure is
offered as a service.
The rest of this paper is organised as follows. In Section
II we summarize the related work. In Section III we give
an overview of cloud performance benchmarking, followed
by a description of our proposed approach in Section IV.
In Section V we describe the experimental environment
used to validate the usefulness of Smart CloudBench. We
discuss the results of the experiments in Section VI, and draw
conclusions and identify future research in Section VII.
II. RELATED WORK
There has been significant research activity on the mea-
surement and characterization of cloud infrastructure perfor-
mance to enable decision support for provider and resource
selection.
In [1][2], the authors present CloudCmp, a framework to
compare cloud providers based on the performance of the
various components of the infrastructure including compu-
tation, scaling, storage and network connectivity. The same
authors present the CloudProphet tool [3] to predict the
end-to-end response time of an on-premise web application
when migrated to the cloud. The tool records the resource
usage trace of the application running on-premise and then
replays it on the cloud to predict performance. In [4], the
authors present CloudSuite, a benchmark suite for emerging
scale-out workloads. While most work on cloud performance looks at performance bottlenecks at the application level [1][2][8], this work focuses on analysing the micro-architecture of the processors used.
In [11], the authors propose CloudRank-D, a benchmark
suite for benchmarking and ranking the performance of
cloud computing systems hosting big data applications. The
main difference between CloudRank-D and our work is that
CloudRank-D specifically targets big-data applications while
our framework applies to any application. In [6] and [7], the
authors present their results on the analysis of resource usage
from the service provider and service consumer perspectives.
They study two models for resource sharing - the t-shirt
model and the time-sharing model. While we look at the
performance of the different cloud providers from a cloud
consumer’s perspective, the resource usage results can be
included as part of the benchmarking results to highlight the
resource usage under different load conditions. The resource
usage levels could also potentially affect the resource and
provider selection process.
In [10], the authors propose a methodology and process to
implement custom tailored benchmarks for testing different
cloud providers. Using this methodology, any enterprise
looking to examine the different cloud service offerings
can manually go through the process of selecting providers,
selecting and implementing (if necessary) a benchmark
application, deploying it on multiple cloud resources, per-
forming the tests and recording the results. Evaluation is
done at the end of the tests. Our work differs in that it offers prospective cloud consumers a service to do all of this without having to go through the entire setup process. Additionally, it gives users the flexibility to try out different what-if scenarios to obtain additional information about performance.
In [15], the authors discuss the IaaS cloud-specific ele-
ments of benchmarking from the user’s perspective. They
propose a generic approach for IaaS cloud benchmarking
which supports rapidly changing black box systems, where
resource and job management is provided by the testing
infrastructure and tests can be conducted with complex
workloads. Their tool SkyMark provides support for mi-
cro performance benchmarking in the context of multi-
job workloads based on the MapReduce model. In [16],
the authors provide a theoretical discussion on what cloud
benchmarking should, can and cannot be. They identify the
actors involved in cloud benchmarking and analyse a number
of use cases where benchmarking can play a significant
role. They also identify the challenges of building scenario-
specific benchmarks and propose some solutions to address
them.
In [17], the authors present the Cloud Architecture Run-
time Evaluation (CARE) framework for evaluating cloud ap-
plication development and runtime platforms. Their frame-
work includes a number of pre-built, pre-configured and
reconfigurable components for conducting performance eval-
uations across different target platforms. The key difference
between CARE and our work is that while CARE looks at
micro performance benchmarking, we look at performance
benchmarking across the complete application stack.
III. OVERVIEW
In this section, we give a brief overview of performance
benchmarking of cloud infrastructure. In the IaaS service
model, the service provider gives consumers the capability to
provision processing, storage, network and basic computing
resources on demand. While the consumer has control over
the operating system, assigned storage and the deployed
applications, it has no control over the underlying cloud
infrastructure. When a client requests and receives virtual
machines from a cloud provider, it perceives the provisioned
resource as a black-box whose run-time behaviour is un-
known.
Therefore, there is a need for tools and techniques to
measure the actual performance of the computing resources
offered by different cloud providers. One way to do this
is through benchmarking, which is a traditional approach
for verifying that the performance of a system meets the ex-
pected levels. In our current work, we look at benchmarking from the consumer's perspective, i.e. a black-box view of cloud performance, where the tester has limited or no knowledge of the underlying hardware specifications2.
There are two ways to benchmark the cloud infrastructure:
application stack benchmarking, and micro benchmarking.
We focus on benchmarking of the entire application stack
instead of looking at individual services, like IO, CPU or
RAM performance. While a set of micro benchmarks can offer a good starting point for evaluating the performance of a server, application stack benchmarking offers more customer-specific results and is easier for a non-technical audience to understand. Thus, if prospective consumers can find representative benchmarks for their in-house applications, they can design experiments to match their internal load levels and load variations, and then test the representative application to estimate how the different clouds compare in terms of performance and cost/performance. By doing representative
performance benchmarking, consumers can quickly assess
multiple cloud providers and their offerings in an objective,
consistent and fully automated manner without having to
deploy their own applications on the various cloud platforms.
Consumers can conduct performance benchmarking be-
fore migrating to the cloud to determine whether selected
providers offer appropriate price/performance levels. Once
they have selected a particular cloud provider and migrated
their in-house applications, they can continue to benchmark
the provided infrastructure to ensure that there is no degra-
dation of the performance over time.
IV. SMART CLOUDBENCH SYSTEM
In this section, we present the main features and reference
architecture of Smart CloudBench. Smart CloudBench is a
configurable, extensible and portable system for the auto-
mated benchmarking of cloud infrastructure performance.
We start with an overview of the system in Section IV-A
followed by a description of the benchmarking process in
Section IV-B.
2Different providers use different virtualization techniques to provision resources, which can affect performance significantly [12][14].
Figure 1: Smart CloudBench Architecture
A. Overview
The main components of the Smart CloudBench system,
presented in Figure 1, include:
1) Cloud Comparator (CC): This module allows users
to compare different cloud provider offerings based on their
specific requirements in terms of cost, geographic location,
infrastructure requirements etc. This module helps users
shortlist potential candidates for benchmarking.
2) Benchmark Orchestrator (BO): This is the main mod-
ule of the Smart CloudBench system. It orchestrates the
automated performance benchmarking of IaaS clouds. It
controls the entire process including benchmark selection,
provider selection, workload description, resource manage-
ment, workload generation, workload execution and result
collection. It automates all the tasks that would be manually
carried out in a normal benchmarking exercise.
3) Cloud Manager (CM): The cloud manager module
performs fundamental cloud resource management. It re-
ceives resource provisioning instructions from the Bench-
mark Orchestrator, based on which it procures appropriate
instances on the different providers - both for the System
Under Test (SUT) and the Test Agents (TA). It is responsible
for the decommissioning of the instances at the end of
each test. It uses the cloud purchaser module which we
have previously developed [18][19] to procure resources
according to user constraints regarding test completion time
and available budget.
4) Result Analyser (RA): This module collects the results
of benchmark tests, and delivers the performance results both
graphically and as textual reports.
5) Cloud Interface (CI): This module provides interfaces
to the different IaaS providers to enable automated manage-
ment of cloud instances including instantiation and termina-
tion before and after the execution of the benchmarks.
6) Provider & Benchmark Catalogs: The Smart Cloud-
Bench maintains a catalog of the different IaaS providers
Figure 2: Smart CloudBench Workflow
and their offerings. It also maintains a catalog of supported
benchmarks for the different types of representative appli-
cations.
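To make the component descriptions above concrete, the sketch below illustrates one plausible shape for the two catalogs and a Cloud Comparator-style shortlisting function. This is our own illustration, not the actual Smart CloudBench data model; all field names are assumptions.

```python
# Hypothetical catalog shapes (our illustration). The provider catalog
# records offerings for the Cloud Comparator; the benchmark catalog maps
# representative application types to supported benchmarks.
PROVIDER_CATALOG = [
    {"provider": "Amazon EC2", "type": "m1.small", "region": "us-west-1",
     "price_per_hr": 0.096, "memory_gb": 1.7},
    {"provider": "Rackspace", "type": "1GB", "region": "Chicago",
     "price_per_hr": 0.08, "memory_gb": 1.0},
]

BENCHMARK_CATALOG = {"e-commerce web application": "TPC-W"}

def compare(catalog, max_price_per_hr):
    """Cloud Comparator-style shortlisting by a user constraint (here: cost)."""
    return [o for o in catalog if o["price_per_hr"] <= max_price_per_hr]

shortlisted = compare(PROVIDER_CATALOG, 0.09)
assert [o["provider"] for o in shortlisted] == ["Rackspace"]
```

In the real system, additional constraints such as geographic location and supported operating systems would be filtered in the same way.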
B. Benchmarking Process
The steps involved in executing a typical benchmark using
the Smart CloudBench are depicted in Figure 2 with the
screenshots of the User Interface in Figure 3.
1) Provider Selection: (Figure 3a) As a first step, the user
selects the specific cloud providers and resource configura-
tions to test. This selection could be done based on user
requirements, which could include resource configuration,
cost, geographic location, supported operating systems etc.
2) Benchmark Selection: (Figure 3b) The user then selects the representative benchmark application(s) from those proposed.
3) Workload Selection: (Figure 3b) The user can de-
fine different scenarios to be tested against the selected benchmark on the shortlisted cloud providers. The request (comprising the selected benchmark(s), scenarios to test, and cloud resources to be tested on) is submitted to the BO.
4) Instance Procurement: Upon receiving the benchmarking request, the BO procures the required server instances from the selected providers. Technically, the back-end engine issues requests to the respective cloud providers' APIs to launch VMs of the specified type in the required location, using pre-built images that contain the packaged applications needed to start up the SUT and the TA. Different rules can be used to procure these instances depending upon the request context - e.g. available time or available budget.
5) Benchmark Execution: The BO then executes the benchmark by issuing remote calls to the web service running on the newly started cloud machines, and waits for the benchmark results to be returned.
(a) Provider and Instances Selection
(b) Benchmark & Workload Selection
(c) Results Visualisation: Test Summary Report
Figure 3: Smart CloudBench UI
6) Result Collection: The TAs return the benchmarking results to the BO via web service invocation.
7) Report Generation: The BO generates reports based on the returned results. These reports combine the formatted benchmarking data with static data about cloud provider prices.
8) Report Visualisation: (Figure 3c) Generated reports are pushed back to the user, and the data is visualised as graphs and data tables to support the user in the analysis and decision-making process.
9) Instance Decommissioning: Once the tests have been completed, the BO decommissions the instances that were started for the tests by issuing calls to the cloud providers' APIs.
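The nine steps above can be condensed into a single orchestration loop. The following is an illustrative outline, not the actual Smart CloudBench implementation: `CloudInterface`, `launch` and `terminate` are hypothetical stand-ins for the real provider adapters, and the benchmark invocation is stubbed out.

```python
# Illustrative sketch of the Benchmark Orchestrator cycle:
# procure -> execute -> collect -> decommission.
from dataclasses import dataclass, field

@dataclass
class CloudInterface:
    """Hypothetical provider adapter standing in for the real APIs."""
    running: set = field(default_factory=set)

    def launch(self, provider, instance_type, image):
        # Step 4: procure an instance from a pre-built image.
        vm_id = f"{provider}:{instance_type}:{len(self.running)}"
        self.running.add(vm_id)
        return vm_id

    def terminate(self, vm_id):
        # Step 9: decommission the instance.
        self.running.discard(vm_id)

def run_benchmark(ci, provider, instance_type, workloads):
    """Steps 4-9: procure SUT and TA, run each workload, always decommission."""
    sut = ci.launch(provider, instance_type, image="tpcw-server")
    ta = ci.launch(provider, "client", image="tpcw-client")
    results = {}
    try:
        for rbes in workloads:
            # Steps 5-6: a real BO would invoke a web service on the TA
            # here and collect the returned benchmark results.
            results[rbes] = {"sut": sut, "ta": ta, "rbes": rbes}
    finally:
        ci.terminate(sut)
        ci.terminate(ta)
    return results

ci = CloudInterface()
report = run_benchmark(ci, "ec2", "m1.small", [100, 500, 1000])
assert len(report) == 3 and not ci.running  # all instances released
```

The `try`/`finally` structure mirrors the cost argument made earlier: instances are decommissioned immediately after the tests, even if a benchmark run fails.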
V. EXPERIMENTAL ENVIRONMENT
In this section, we describe the experimental environ-
ment used to validate the usefulness of Smart CloudBench.
The representative benchmark used in our experiments is TPC-W [22], an e-commerce application. This is the most popular type of application running on the cloud, and its behaviour is relatively simple and well understood. To achieve diversity
and comprehensiveness in our experiments we have tested
TPC-W on a representative set of resources. We describe
the experimental setup and the measured metrics here. We
present the results of the experiments in the next section.
A. TPC-W Benchmark
The TPC-W application models an online bookstore
which is representative of a typical enterprise web appli-
cation. It includes a web server to render the web pages, an
application server to execute business logic, and a database
to store application data. It is designed to test the complete
application stack and does not make any assumptions on the
technologies and software systems used in each layer. The
benchmark consists of two parts. The first part is the TPC-
W application which supports a mix of 14 different web
interactions and three workload mixes, including searching
for products, shopping for products and ordering products.
The second part is the remote browser emulation (RBE)
system which generates the workload to test the application.
One RBE emulates a single customer and simulates the same
HTTP network traffic as would be seen by a real customer
using the browser.
There are certain inherent characteristics of the TPC-W benchmark that we inherit because we use a Java implementation of the TPC-W application that is available
online at http://www.cs.virginia.edu/~th8k/downloads/.

Provider (Region)                  | Instance Type | Code | Price ($/hr) | CPU     | Memory (GB)
Amazon EC2 (US-West N. California) | m1.small      | S1   | 0.096        | 1 EC2   | 1.7
                                   | m1.medium     | S2   | 0.192        | 2 EC2   | 3.75
                                   | m1.large      | S3   | 0.384        | 4 EC2   | 7.5
                                   | m1.xlarge     | S4   | 0.768        | 8 EC2   | 15
                                   | m2.xlarge     | S5   | 0.560        | 6.5 EC2 | 17.1
                                   | m2.2xlarge    | S6   | 1.120        | 13 EC2  | 34.2
                                   | m2.4xlarge    | S7   | 2.240        | 26 EC2  | 68.4
                                   | c1.medium     | S8   | 0.245        | 5 EC2   | 1.7
                                   | c1.xlarge     | S9   | 0.980        | 20 EC2  | 7
GoGrid (US-West 1)                 | Medium        | S10  | 0.160        | 2 Core  | 2
                                   | Large         | S11  | 0.32         | 4 Core  | 4
                                   | X-Large       | S12  | 0.64         | 8 Core  | 8
                                   | XX-Large      | S13  | 1.28         | 8 Core  | 16
                                   | XXX-Large     | S14  | 1.92         | 8 Core  | 24
Rackspace (Chicago)                | 1GB           | S15  | 0.08         | 1 Core  | 1
                                   | 2GB           | S16  | 0.16         | 2 Core  | 2
                                   | 4GB           | S17  | 0.32         | 2 Core  | 4
                                   | 8GB           | S18  | 0.58         | 4 Core  | 8
                                   | 15GB          | S19  | 1.08         | 6 Core  | 15
                                   | 30GB          | S20  | 1.56         | 8 Core  | 30

Table I: Configurations of the instances used in the benchmarking experiments; all listed instances host the TPC-W server (prices correct on the 22nd of April 2013)

Code | ART (ms)           | MRT (ms)              | SI                 | T                | St.Dev of ART
     | 100 / 500 / 1000   | 100 / 500 / 1000      | 100 / 500 / 1000   | 100 / 500 / 1000 | 100 / 500 / 1000
S1   | 2213 / 8583 / 8123 | 15361 / 23681 / 24353 | 1024 / 906 / 1174  | 23 / 1354 / 3253 | 345 / 470 / 654
S2   | 202 / 6082 / 6481  | 3277 / 24205 / 24358  | 1401 / 1741 / 2068 | 2 / 988 / 2807   | 76 / 344 / 230
S3   | 64 / 4768 / 6041   | 1240 / 23301 / 24472  | 1425 / 2413 / 2703 | 3 / 780 / 2490   | 23 / 203 / 155
S4   | 49 / 99 / 3669     | 600 / 1879 / 21452    | 1438 / 7103 / 7558 | 2 / 13 / 730     | 2 / 18 / 305
S5   | 41 / 755 / 3724    | 613 / 8560 / 21501    | 1426 / 6431 / 7640 | 2 / 38 / 837     | 12 / 553 / 1726
S6   | 34 / 42 / 1256     | 321 / 1028 / 15491    | 1423 / 7146 / 11814 | 2 / 13 / 104    | 1 / 6 / 359
S7   | 36 / 43 / 52       | 402 / 1057 / 1670     | 1435 / 7099 / 14312 | 3 / 13 / 26     | 5 / 13 / 13
S8   | 57 / 4670 / 5869   | 732 / 23387 / 24453   | 1429 / 2896 / 3043 | 2 / 572 / 2328   | 6 / 234 / 293
S9   | 56 / 100 / 2737    | 735 / 1534 / 19798    | 1426 / 7095 / 9039 | 3 / 13 / 451     | 3 / 53 / 516
S10  | 64 / 4427 / 5955   | 707 / 23574 / 24409   | 1420 / 3019 / 3329 | 0 / 587 / 3034   | 19 / 462 / 359
S11  | 54 / 133 / 3552    | 617 / 2164 / 21794    | 1437 / 7069 / 7737 | 0 / 0 / 709      | 7 / 72 / 495
S12  | 51 / 68 / 741      | 428 / 723 / 14898     | 1431 / 7128 / 12405 | 0 / 0 / 243     | 2 / 4 / 381
S13  | 93 / 2957 / 3778   | 1495 / 11020 / 17650  | 1418 / 4443 / 7395 | 0 / 499 / 1498   | 62 / 2973 / 3248
S14  | 57 / 154 / 2084    | 483 / 3490 / 18654    | 1435 / 7054 / 9963 | 0 / 2 / 498      | 6 / 152 / 1240
S15  | 236 / 6989 / 7076  | 4989 / 23976 / 24130  | 1385 / 1558 / 1840 | 1 / 1063 / 2942  | 106 / 229 / 485
S16  | 97 / 5825 / 6827   | 1438 / 23686 / 23930  | 1431 / 2286 / 2823 | 0 / 749 / 2344   | 31 / 430 / 163
S17  | 79 / 5492 / 6618   | 998 / 23690 / 23909   | 1446 / 2494 / 2960 | 0 / 667 / 2307   | 14 / 242 / 211
S18  | 64 / 736 / 4531    | 573 / 8547 / 22426    | 1418 / 6474 / 6330 | 0 / 0 / 988      | 2 / 135 / 236
S19  | 60 / 89 / 2878     | 509 / 1405 / 18358    | 1433 / 7113 / 9187 | 0 / 0 / 323      | 3 / 13 / 252
S20  | 60 / 71 / 1299     | 418 / 834 / 15817     | 1443 / 7114 / 11727 | 0 / 0 / 138     | 2 / 1 / 230

Table II: Test Results (Avg Response Time, Max Response Time, Successful Interactions, Timeouts, and Std. Deviation of ART, for workloads of 100 / 500 / 1000 RBEs)

Each benchmark cycle runs for 2 minutes. During this period, the
TPC-W client generates a random number of simultaneous
requests to the server, depending on the specified number of
RBEs. A single RBE can request only one web-page at a
time. The client also simulates the waiting time between
the browsing sessions of each emulated user. The server
responds to the requests of the client by generating the
corresponding web-pages. In case the request time exceeds
25 seconds, the request is dropped by timeout. The total
number of requests that fit in a single benchmarking cycle
varies depending on the response time. If the server cannot
cope with the workload, the average response time and the
number of timeouts will be high. In that case, the number of generated requests will be lower than when the server is capable of handling the generated workload and responds faster to the incoming requests.
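The relationship just described, between response time, completed interactions and the 25-second timeout, can be illustrated with a toy model of one 2-minute benchmark cycle. This is our own simplification, not TPC-W itself: real RBEs also simulate think time and a mix of 14 web interactions.

```python
# Toy model of one benchmark cycle: slower servers complete fewer
# interactions and drop more requests at the 25-second timeout.
TIMEOUT_MS = 25_000   # requests above this are dropped by timeout
CYCLE_MS = 120_000    # each benchmark cycle runs for 2 minutes

def simulate_cycle(response_time_ms, rbes):
    """Each RBE issues sequential requests for the whole cycle."""
    successful = timeouts = 0
    per_request = min(response_time_ms, TIMEOUT_MS)
    requests_per_rbe = CYCLE_MS // per_request
    for _ in range(rbes):
        if response_time_ms > TIMEOUT_MS:
            timeouts += requests_per_rbe
        else:
            successful += requests_per_rbe
    return successful, timeouts

fast = simulate_cycle(50, 100)      # responsive server
slow = simulate_cycle(30_000, 100)  # overloaded server
assert fast[0] > slow[0] and slow[1] > fast[1]
```

The model captures the key point made above: the total number of requests that fit in a cycle varies with response time, so both high response times and high timeout counts are symptoms of an overloaded server.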
B. Experimental Setup
All our experiments were conducted on three cloud
providers - Amazon Elastic Compute Cloud (EC2), GoGrid
Cloud Hosting, and Rackspace Hosting & Cloud. Resources
on all providers were provisioned in United States (N.
California for Amazon, San Francisco for GoGrid, Chicago
for Rackspace). We have implemented the cloud interface
by using the JClouds API, which currently supports 17 IaaS
providers with datacenters in more than 30 geographical
regions. Three different workloads of 100, 500 and 1000
RBEs were used to test the TPC-W application on 20
different types of cloud instances (refer to Table I). In these experiments we used the TA (client) provided by TPC-W. The cloud instance selected for running the TPC-W client was m1.medium on EC2, Large on GoGrid, and 2GB on Rackspace, for all tests. The scenario chosen for the test
was page browsing with the property get-images set to false
[22]. All tests on EC2 and Rackspace were repeated 10
times and the average was calculated. The benchmark on
GoGrid was conducted 20 times, because of high deviations
in performance of particular virtual instances. To address
that, we represent standard deviation values as a measure of
service consistency, or variability of performance.
The benchmarks on all three providers were executed in
parallel. The tests on Amazon and Rackspace completed
within 2 hours. Due to the limit on free IP addresses on
GoGrid, the tests were executed in two stages and took 4
hours. The total cost of running the tests was USD $53.23.
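The aggregation described above, averaging repeated runs and reporting the standard deviation as a measure of consistency, can be sketched as follows. The run values below are made up for illustration; only the method reflects the setup described in this section.

```python
# Tests are repeated (10 runs on EC2/Rackspace, 20 on GoGrid); the mean
# response time and its standard deviation (consistency measure) are reported.
from statistics import mean, stdev

def aggregate(runs_ms):
    """runs_ms: average response time of each repeated benchmark run."""
    return {"ART": mean(runs_ms), "StDev": stdev(runs_ms)}

stable = aggregate([51, 53, 50, 52, 51, 50, 52, 51, 53, 50])
erratic = aggregate([60, 900, 75, 2400, 66, 1800, 58, 950, 70, 2100])
assert erratic["StDev"] > stable["StDev"]  # erratic instance: less consistent
```

A high standard deviation with a moderate mean, as observed for some GoGrid instances, signals variable performance rather than uniformly poor performance.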
VI. DISCUSSION OF RESULTS
In this section we present the results of our benchmarking
experiments. The full results are displayed in Table II. In the experiments, we collected the following metrics
- average response time (ART), maximum response time
(MRT), total number of successful interactions (SI) and total
number of timeouts (T) during each benchmark cycle. In
order to determine the consistency of performance we also
calculated the standard deviation of average response time
(St.Dev). To illustrate the results, we describe three scenarios: one generic scenario, in which we do not consider any specific constraints, and two custom scenarios, in which the customer has specific requirements. For the sake of readability, we sometimes omit some of the metrics. We discuss the measured metrics and their significance, and give recommendations below. Note that in all figures
displayed in this section, the provider offerings are sorted
by increasing price, and the Y-axis in custom scenarios is in
logarithmic scale.
A. Generic scenario
In this scenario, we assume that there are no user con-
straints and we evaluate the entire set of cloud servers from
3 providers with a workload of 1000 RBEs. We present this scenario to demonstrate how to select potentially good offers. Different techniques can be used to rank the
offers based on the benchmarking results. We have proposed
one way to do this using utility theory and preference
policies in [8]. In [5], the authors propose CloudGenius,
a framework for automated decision-making for migration
of web applications to the cloud, which makes use of the well-known multi-criteria decision-making technique called the Analytic Hierarchy Process. These are just two among several multi-criteria decision-making techniques that can be used to help automate the decision-making process.
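As one concrete illustration of such a technique, the sketch below ranks offers by a simple weighted normalised score. This is neither the utility-theory approach of [8] nor the AHP of [5]; the weights are arbitrary, and the metric values are taken from Table II (1000-RBE workload).

```python
# Weighted-score ranking sketch: normalise each metric across offers,
# weight it, and sort. Lower ART, St.Dev and timeouts are all better.
def rank_offers(offers, weights):
    """offers: {code: {metric: value}}; weights: {metric: importance}."""
    maxima = {m: max(o[m] for o in offers.values()) or 1 for m in weights}
    def score(code):
        o = offers[code]
        return sum(w * o[m] / maxima[m] for m, w in weights.items())
    return sorted(offers, key=score)  # best (lowest score) first

offers = {  # values from Table II, 1000-RBE workload
    "S7":  {"ART": 52,   "StDev": 13,  "T": 26},
    "S12": {"ART": 741,  "StDev": 381, "T": 243},
    "S15": {"ART": 7076, "StDev": 485, "T": 2942},
}
ranking = rank_offers(offers, {"ART": 1.0, "StDev": 0.5, "T": 1.0})
assert ranking[0] == "S7"  # lowest on all three metrics
```

Price could be folded in as one more weighted metric, which is how a price/performance matrix would drive the final choice.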
In this example we focus on 3 characteristics: ART,
St.Dev, and T. The results are displayed in Figure 4. Based
on the displayed information, we can easily read and evalu-
ate the server configurations. For example, S15 has an average response time of about 7000 ms, 3000 requests dropped by timeout, and about 500 ms deviation in performance. We
can notice that S5, S13, and S14 have rather high deviation
in their performance compared to other instances. Thus,
if consistency of performance is essential, they should not
be considered. S7 is the most productive instance with the
lowest values for all three metrics, but is the most expensive
one as well. S12 offers a good compromise between the price
and performance. If we want to achieve the lowest level of
timeouts, we should consider S7, S6 and S20 with the final
choice depending on the available budget.
B. Custom scenarios
In this subsection we assume that a customer, A, owns a small online retail store selling jewellery. He currently runs a three-tier enterprise application on his own infrastructure and has now decided to migrate to the cloud because of growing business. He has a reasonable idea of the characteristics of his application. He has several options in front of him
and does not know which option to select. We consider
two scenarios when A has different budget constraints and
different workload and performance requirements. We give
the recommendation in each scenario.
Figure 4: Generic Scenario Results (workload of 1000 RBEs)
(a) Scenario 1 Results (workload of 100 RBEs) (b) Scenario 2 Results (workload of 1000 RBEs)
Figure 5: Custom Scenarios Results
Scenario 1: low requirements. The customer needs to accommodate a workload of at most 100 concurrent requests, with responses within 1 second, and the budget is limited to 20c/hour. By filtering the entire set of results, we get 4 instances that fit the requirements, which are presented in Figure 5a.
All 4 instances show insignificant variation in performance relative to one another, including S15, which is half the price of the other configurations. If the level of performance of S15 is satisfactory, it is the best choice. S2, being the most expensive option, has the highest chance of dropping requests by timeout, which does not make it particularly attractive. The best price/performance ratio is achieved by S10 and S16. At the same price, S10 would be the better choice because its performance characteristics are slightly better. Our recommendation is given in Table III.
Scenario 2: high requirements. The customer needs to accommodate a workload of 1000 concurrent requests, with responses within 3 seconds, and the budget is limited to $1.30/hour. After filtering the entire set of results, we get 4 instances that satisfy the conditions, as shown in Figure 5b.
Constraint             | Scenario 1 | Scenario 2
Budget                 | S15        | S12
Performance            | S10        | S12, S6
Consistency of service | S10        | S19

Table III: Recommendations

According to the benchmarking results, S12 is the cheapest server and has the best performance in terms of average
and maximum response time, and successful interactions.
This instance seems to be the best choice among all options.
S6, being the most expensive server, offers the lowest
number of timeouts and the second best performance. If the
consistency of service is important, S19 is the best choice
with the lowest standard deviation value. Our recommendation is given in Table III.
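The constraint filtering used in both custom scenarios can be sketched as follows. The instance data is a small subset transcribed from Tables I and II (1000-RBE workload), and the `shortlist` function is our own illustration of the filtering step.

```python
# Filter instances by budget and response-time constraints, as in the
# custom scenarios. Data: (price $/hr from Table I, ART at 1000 RBEs
# in ms from Table II) for a subset of the tested instances.
instances = {
    "S6":  (1.120, 1256),
    "S7":  (2.240, 52),
    "S12": (0.640, 741),
    "S19": (1.080, 2878),
    "S20": (1.560, 1299),
}

def shortlist(instances, max_price, max_art_ms):
    return sorted(code for code, (price, art) in instances.items()
                  if price <= max_price and art <= max_art_ms)

# Scenario 2 constraints: budget $1.30/hour, response within 3 seconds.
assert shortlist(instances, 1.30, 3000) == ["S12", "S19", "S6"]
```

S7 and S20 are excluded here purely on price, matching the observation that the most powerful (and most expensive) instance is not always the appropriate choice.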
VII. CONCLUSION
As the rate of cloud computing adoption grows, it is
becoming more important for prospective consumers to
make informed decisions. Consumers would like to obtain a
quick assessment of the price/performance levels of different
IaaS providers before making any migration decisions. One
way to do this is through benchmarking. In this paper, we
have presented the Smart CloudBench system, which allows
the automated execution of representative benchmarks on
different IaaS clouds under representative load conditions to
quickly estimate their cost/performance levels. The Smart
CloudBench platform helps decision-makers make informed
decisions about migrating their in-house systems to the
cloud. They can design specific experiments to test the
performance of representative applications using load con-
ditions that match the load levels of their own in-house
applications. Smart CloudBench is useful for organizations
that do not possess the time, resources and in-house expertise
to do a thorough evaluation of multiple cloud platforms.
We have implemented a proof-of-concept prototype of the
Smart CloudBench system and evaluated the TPC-W bench-
mark on twenty different cloud server types under variable
load conditions. Even though we performed simple load
tests, the results show the value of having such a bench-
marking tool by highlighting that price does not necessarily
translate to performance (and its consistency) on the cloud,
and that users do not necessarily benefit by procuring the
most powerful server instances.
As future work, we plan to add more application stack
benchmarks such as RUBiS, Olio and jEnterprise2010. We
plan to integrate and test more IaaS providers and their
offerings. We also plan to equip Smart CloudBench with a
set of micro-benchmarks, which could serve as an auditing
service for IaaS providers' SLAs. In addition, we are building
a more generic testing environment using tools such as JMeter.
ACKNOWLEDGMENT
This work was partially funded by the Service Delivery
& Aggregation Project within the Smart Services CRC.
REFERENCES
[1] A. Li, X. Yang, S. Kandula, and M. Zhang. CloudCmp: comparing public cloud providers. In Proc. of the 10th Annual Conference on Internet Measurement, November 2010.
[2] A. Li, X. Yang, S. Kandula, and M. Zhang. CloudCmp: shopping for a cloud made easy. In Proc. of the 2nd USENIX Conference on Hot Topics in Cloud Computing, June 2010.
[3] A. Li, X. Yang, S. Kandula, and M. Zhang. CloudProphet: towards application performance prediction in cloud. In Proc. of ACM SIGCOMM 2011, Toronto, pp. 426-427.
[4] M. Ferdman et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In Proc. of the 17th ASPLOS, pp. 37-48, 2012.
[5] M. Menzel and R. Ranjan. CloudGenius: decision support for web server cloud migration. In Proc. of the 21st WWW, Lyon, France, pp. 979-988, 2012.
RUBiS: http://rubis.ow2.org/index.html
Olio: http://incubator.apache.org/olio/
jEnterprise2010: http://www.spec.org/jEnterprise2010/
[6] D. Gmach, J. Rolia, and L. Cherkasova. Comparing efficiency and costs of cloud computing models. In Proc. of the IEEE Network Operations and Management Symposium (NOMS), pp. 647-650, 2012.
[7] D. Gmach, J. Rolia, and L. Cherkasova. Selling T-shirts and Time Shares in the Cloud. In Proc. of the 12th IEEE/ACM CCGrid, pp. 539-546, 2012.
[8] M. Baruwal Chhetri, Q. Bao Vo, R. Kowalczyk, and C. Lan Do. Cloud Broker: Helping You Buy Better. In Proc. of the 12th WISE, pp. 341-342, 2011.
[9] M. Baruwal Chhetri, Q. Bao Vo, and R. Kowalczyk. A Flexible Policy Framework for the QoS Differentiated Provisioning of Services. In Proc. of the 11th CCGRID, pp. 444-453, 2011.
[10] A. Lenk, M. Menzel, J. Lipsky, S. Tai, and P. Offermann. What are you paying for? Performance benchmarking for infrastructure-as-a-service offerings. In Proc. of IEEE CLOUD, pp. 484-491, 2011.
[11] C. Luo et al. CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications. Frontiers of Computer Science, 6(4), pp. 347-362, 2012.
[12] V. Vedam and J. Vemulapati. Demystifying Cloud Benchmarking Paradigm - An in Depth View. In Proc. of the 36th IEEE COMPSAC, pp. 416-421, 2012.
[13] P. Shivam, V. Marupadi, J. Chase, T. Subramaniam, and S. Babu. Cutting corners: Workbench automation for server benchmarking. In Proc. of the USENIX Annual Technical Conference, 2008.
[14] S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema. A performance analysis of EC2 cloud computing services for scientific computing. In Cloud Computing, Vol. 34, pp. 115-131, 2010.
[15] A. Iosup, R. Prodan, and D. Epema. IaaS Cloud Benchmarking: Approaches, Challenges, and Experience. In Proc. of the 5th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS), 2012.
[16] A. Alexandrov et al. Benchmarking in the Cloud: What it Should, Can, and Cannot Be. In Proc. of the TPC Technology Conference on Performance Evaluation & Benchmarking (TPCTC), VLDB 2012.
[17] L. Zhao, A. Liu, and J. Keung. Evaluating cloud platform architecture with the CARE framework. In Proc. of the 17th APSEC, pp. 60-69, 2010.
[18] M. Baruwal Chhetri, B. Q. Vo, and R. Kowalczyk. Policy-Based Automation of SLA Establishment for Cloud Computing Services. In Proc. of IEEE/ACM CCGRID, Ottawa, Canada, May 2012.
[19] M. Baruwal Chhetri, B. Q. Vo, and R. Kowalczyk. AutoSLAM - A Policy-driven Middleware for Automated SLA Establishment in SOA Environments. In Proc. of the 9th SCC, pp. 9-16, 2012.
[20] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In Proc. of the 1st ACM Symposium on Cloud Computing, pp. 143-154, 2010.