TRANSCRIPT
CSCI920 (Weeks 9 & 10)
Cloud Computing
Oct. 2010
What is cloud computing? Cloud computing economics
10 obstacles and opportunities for cloud computing
Outline
The main content of the slides is based on the following two references:
• Michael Armbrust et al. Above the Clouds: A Berkeley View of Cloud Computing. Feb. 2009. http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
• Peter Mell and Tim Grance. Effectively and Securely Using the Cloud Computing Paradigm, July 2009.
http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-computing-v26.ppt
An explanation from Wikipedia:
“Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.”
What is cloud computing?
Why is it called ‘cloud’ computing?
- ‘Cloud’ is a metaphor for the Internet
- It was once used to represent the telephone network
- We don’t care exactly where messages go to or come from
The first cloud was built around the TCP/IP abstraction; the second around the WWW data abstraction. Cloud computing abstracts away the infrastructure complexities of servers, applications, data, and heterogeneous platforms.
What is cloud computing?
Cloud computing is a new computing paradigm
- Cloud computing describes a new model for IT services, which provides dynamically scalable and virtualized resources over the Internet.
The client–server computing paradigm
- Users' applications are distributed between service providers (servers) and service requesters (clients).
The mainframe computing paradigm
- Large organizations possess powerful computers to conduct critical applications, like census, consumer statistics and financial transaction processing.
What is cloud computing?
In fact, cloud computing is realising the long-held dream of computing as a utility.
“Computing may someday be organized as a public utility” - John McCarthy, MIT Centennial in 1961
Powerful computational and storage capabilities available from utilities
Cloud computing is also viewed as the 5th utility, after water, electricity, gas, and telephony.
What is cloud computing?
IBM Cloud Computing
http://www.youtube.com/watch?v=lk5O67Xrflc&feature=related
What is cloud computing?
There are a number of definitions of cloud computing.
Here, we recommend a working definition proposed by NIST:
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
What is cloud computing?
The above cloud model promotes availability and consists of 5 essential characteristics, 3 service models, and 4 deployment models.
What is cloud computing?
5 Essential Cloud Characteristics:
• On-demand self-service
• Broad network access
• Resource pooling
– Location independence
• Rapid elasticity
• Measured service (pay-as-you-go)
What is cloud computing?
3 Service Models
• Software as a Service (SaaS): Cloud provider’s applications are available over a network.
• Platform as a Service (PaaS): Customer-created applications can be deployed to a cloud without the cost and complexity of buying and managing the underlying hardware and software and provisioning hosting capabilities
• Infrastructure as a Service (IaaS): Processing, storage, network capacity, and other fundamental computing resources can be rented over a network.
What is cloud computing?
Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data centres that provide those services.
Different cloud computing offerings can be distinguished according to the level of abstraction presented to the programmer and the level of management of the resources.
The following diagram shows the rough locations of some cloud computing offerings in this spectrum.
What is cloud computing?
Google Docs
http://www.youtube.com/watch?v=eRqUE6IHTEA
4 Cloud Deployment Models
• Public cloud: Mega-scale infrastructure, sold to the public and made available in a pay-as-you-go manner
• Private cloud: Enterprise-owned or leased cloud
• Community cloud: Shared infrastructure for a specific community
• Hybrid cloud: Composition of two or more clouds
What is cloud computing?
Galen Gruman & Eric Knorr, InfoWorld Executive Editor & Editor in Chief.
“A way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software.”
“The idea of loosely coupled services running on an agile, scalable infrastructure should eventually make every enterprise a node in the cloud.”
Source: http://www.infoworld.com/d/cloud-computing/what-cloud-computing-really-means-031
Thoughts on cloud computing
Tim O’Reilly, CEO O’Reilly Media
“I think it is one of the foundations of the next generation of computing”
“The network of networks is the platform for all computing”
“Everything we think of as a computer today is really just a device that connects to the big computer that we are all collectively building”.
Source: http://news.cnet.com/8301-13953_3-9938949-80.html?tag=mncol
Thoughts on cloud computing
Larry Ellison, Oracle’s CEO.
“The interesting thing about Cloud Computing is that we’ve redefined Cloud Computing to include everything that we already do. . . . I don’t understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads. ”
Source: the Wall Street Journal, September 26, 2008.
Andy Isherwood, HP’s Vice President of European Software Sales.
“A lot of people are jumping on the [cloud] bandwagon, but I have not heard two people say the same thing about it. There are multiple definitions out there of “the cloud.” ”
Source: ZDnet News, December 11, 2008
Thoughts on cloud computing
Richard Stallman, known for his advocacy of “free software”.
“It’s stupidity. It’s worse than stupidity: it’s a marketing hype campaign. Somebody is saying this is inevitable — and whenever you hear somebody saying that, it’s very likely to be a set of businesses campaigning to make it true.”
Source: The Guardian, September 29, 2008
Public Seminar by Dr Richard Stallman: Copyright vs. Community in the Age of Computer Networks
- 12:00-14:00, 12th Oct., Function Centre 2&3 in Building 11
Thoughts on cloud computing
James T. Yeh, director of IBM China Research Laboratory.
“… So it has been difficult to put a boundary of what is in Cloud Computing, and what is not. I assert that it is equally difficult to find a group of people who would agree on even the definition of Cloud Computing. In actuality, may be all that arguments are not necessary, as Clouds have many shapes and colors. ... It will be a very rich territory for both the businesses to take the advantage of the benefits of Cloud Computing and the academia to integrate the technology research and business research.”
Source: Keynote speech at CloudCom 2009.
Thoughts on cloud computing
Two Topics:
Elasticity: Shifting the Risk
Comparing Costs: Should I Move to the Cloud?
Cloud Computing Economics
Elasticity: Shifting the Risk
Elasticity: hours purchased via Cloud Computing (CC) can be distributed non-uniformly in time.
Example: use 100 server-hours today and no server-hours tomorrow, and still pay only for what you use.
Resources match the workload much more closely, as resources can be added or removed at a fine grain (one server at a time with EC2) and with a lead time of minutes rather than weeks.
Server utilization in datacentres ranges from 5% to 20%. This shocking fact is consistent with the observation that, for many services, the ratio of peak workload to average workload varies from 2 to 10.
Cloud Computing Economics
Example: How can elasticity reduce waste?
• A business requires 500 servers at the peak, but only 100 servers at the trough.
• Suppose the average utilization over a whole day is 300 servers.
• The actual utilization over the whole day is 300 × 24 = 7,200 server-hours.
• To meet the peak requirement of 500 servers, we pay for 500 × 24 = 12,000 server-hours, a factor of 1.7 more than what is needed.
Cloud Computing Economics
Example: How can elasticity reduce waste?
• So, as long as the pay-as-you-go cost per server-hour over 3 years is less than 1.7 times the cost of buying the server, we can save money using utility computing.
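The arithmetic above can be checked in a few lines of Python; the 500-server peak, 300-server average, and 24-hour day are the figures from the example (the function name is just for illustration):

```python
# Elasticity example from the slides: provisioning for the peak
# versus paying only for average usage.

def waste_factor(peak_servers, avg_servers, hours=24):
    """Ratio of server-hours paid for (peak provisioning)
    to server-hours actually needed (average usage)."""
    paid = peak_servers * hours      # 500 * 24 = 12,000 server-hours
    needed = avg_servers * hours     # 300 * 24 =  7,200 server-hours
    return paid / needed

print(round(waste_factor(500, 300), 2))  # ~1.67, the "factor of 1.7"
```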
Cloud Computing Economics
The benefits of elasticity were underestimated in the above example.
• Besides diurnal patterns, most services also experience seasonal or other periodic demand variations, as well as unexpected demand bursts due to external events.
• As acquiring and racking new equipment can take weeks, the only way to handle such spikes is to provision for them in advance.
• So, if service operators overestimate the spike they provision for, the excess capacity is wasted.
• If they underestimate the spike, potential revenue will be lost.
Cloud Computing Economics
• Moreover, some users, after experiencing poor service, may never come back.
Cloud Computing Economics
Do such scenarios really happen in practice?
Animoto: A private company that produces videos from user-selected photos, video clips and music.
http://animoto.com/
When Animoto made its service available via Facebook, it experienced a demand surge that resulted in growing from 50 servers to 3500 servers in three days.
No one could have foreseen that resource needs would suddenly double every 12 hours for 3 days.
By Nov. 2009, Animoto had 1 million registered users. This is an amazing success for a company only founded in August 2006.
Cloud Computing Economics
Elasticity is valuable to established companies as well
Cloud Computing Economics
Similarly, Salesforce.com hosts customers ranging from 2 seats to 40,000+ seats.
Even less-dramatic cases suffice to illustrate this key benefit of Cloud Computing: the risk of misestimating workload is shifted from the service operator to the cloud vendor.
Target, which uses AWS for the Target.com website, saw its sites slow down by only about 50% on “Black Friday” (28/11/2008), while other retailers had severe performance problems and intermittent unavailability.
When is CC preferable?
For a web business with varying demand over time and revenue proportional to user hours, the following simple inequality helps decide whether the business should be shifted to CC (ph = per hour):

UserHours_cloud × (Revenue_ph − Cost_cloud-ph)
≥
UserHours_datacentre × (Revenue_ph − Cost_datacentre-ph / Utilization)

• The left-hand side is the expected profit from using Cloud Computing: the net revenue per user-hour multiplied by the number of user-hours.
• The right-hand side is the expected profit from owning a datacentre.
Cloud Computing Economics
• In the above equation, if Utilization = 1, the two sides look the same.
• In practice, server utilization in datacentres ranges from 0.05 to 0.2.
• This illustrates the big economic advantage of CC.
2nd benefit to CC users: CC can reduce or even eliminate the penalty caused by unexpectedly scaling down (disposing of temporarily underutilized equipment).
3rd benefit to CC users: Without incurring a capital expense, they can enjoy the savings due to the decrease in the costs of (new) hardware and software.
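As a sketch, the decision inequality can be evaluated directly. The dollar figures below are invented purely for illustration; only the 0.2 utilization value comes from the slides:

```python
# Evaluating the cloud-vs-datacentre inequality from the slides.
# Dividing the datacentre cost by utilization reflects paying for
# idle capacity: at 0.2 utilization, each useful server-hour
# effectively costs five times the raw server-hour.

def cloud_profit(user_hours, revenue_ph, cost_cloud_ph):
    return user_hours * (revenue_ph - cost_cloud_ph)

def datacentre_profit(user_hours, revenue_ph, cost_dc_ph, utilization):
    return user_hours * (revenue_ph - cost_dc_ph / utilization)

# Hypothetical figures: $1.00 revenue per user-hour, $0.12 per
# cloud server-hour, $0.08 per owned server-hour, 20% utilization.
cloud = cloud_profit(1000, 1.00, 0.12)         # ~ $880
dc = datacentre_profit(1000, 1.00, 0.08, 0.2)  # ~ $600
print(cloud > dc)  # True: the utilization penalty dominates here
```

Even though the raw datacentre server-hour is cheaper here, the low utilization makes the cloud the more profitable option.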
Cloud Computing Economics
• Over the last 2.5 years, heavy users of AWS saw storage costs fall 20% and networking costs fall 50%.
Cloud Computing Economics
• Over less than one year, 9 new services or features have been added to AWS.
Cloud Computing Economics
Comparing Costs: Should I Move to the Cloud?
• We’ve tried to quantify elasticity, an economic benefit of CC.
• Here, we tackle an equally important but larger question: Is it more economical to move my existing datacentre-hosted service to the cloud, or to keep it in a datacentre?
• A good approach to answering this question is to track the rate of change of key technologies for CC.
• At the same time, we need to consider the combined effect of factors such as:
- Paying separately per resource
- Power, cooling and physical plant costs
Cloud Computing Economics
In the following, we take AWS (Amazon Web Services) as a case study.
Cloud Computing Economics
The Rate of Change of Key Technologies for CC from 2003 to 2008
Cloud Computing Economics
From the above table, we can see:
• Wide-area networking costs improved the least over this 5-year period, by less than a factor of 3.
• Over the same period, computing costs improved the most, by a factor of 16.
• At first glance, it seems that a given dollar will go further if used to purchase hardware in 2008, rather than to pay for use of that same hardware in CC.
• However, this simple analysis glosses over at least three important factors.
Cloud Computing Economics
1st Factor: Pay separately per resource
• Most applications do not make equal use of computation, storage, and network bandwidth.
• Some are CPU-bound, others network-bound, and so on.
• They may saturate one resource while underutilizing others.
• Pay-as-you-go CC can charge the application separately for each type of resource and thereby reduce the waste of underutilization.
• Example: If the CPU is only 50% utilized while the network is at capacity, then in a datacentre you are effectively paying for double the number of CPU cycles actually being used.
Cloud Computing Economics
2nd Factor: Power, cooling and physical plant costs
• The above simple analysis doesn’t include the costs of power, cooling, and the amortized cost of the building.
• If these costs are included, the costs of CPU, storage and bandwidth roughly double.
• Using this estimate, buying 128 hours of CPU in 2008 really costs $2 rather than $1, compared to $2.56 on EC2.
• Similarly, 10 GB of disk space costs $2 rather than $1, compared to $1.20–$1.50 per month on S3.
• Moreover, S3 actually replicates the data at least 3 times for durability and performance.
Cloud Computing Economics
A simple example of deciding whether to move a service into the cloud
• A biology lab creates 500 GB of new data for every wet-lab experiment.
• One EC2 instance takes 2 hours to process each GB of data.
• The lab has the equivalent of 20 instances locally, so the time to evaluate the experiment is 500 × 2 / 20 = 50 hours.
• Alternatively, they could process their data in a single hour using 1,000 instances at AWS.
• The cost to process one experiment would be just 1000 × $0.10 = $100 in computation and another 500 × $0.10 = $50 in network transfer fees.
Cloud Computing Economics
A simple example of deciding whether to move a service into the cloud
• However, the transfer rate from the lab to AWS is 20 Mbits/second.
• So, the transfer time is (500 GB × 1000 MB/GB × 8 bits/byte) / 20 Mbits/sec = 200,000 seconds, i.e. more than 55 hours.
• That is, using their own equipment takes 50 hours, but using CC takes over 56 hours.
• Therefore, they may not move their data analysis to the cloud, unless there is a better way to overcome the transfer-delay obstacle.
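The whole lab example can be reproduced with the slide figures (500 GB, 2 hours/GB, 20 local vs 1,000 cloud instances, $0.10 prices, 20 Mbit/s link); this is only a restatement of the arithmetic above:

```python
# Biology-lab example: local processing vs AWS, including transfer time.

data_gb = 500                 # data per wet-lab experiment
hours_per_gb = 2              # EC2 processing time per GB
local_instances = 20
cloud_instances = 1000
instance_price = 0.10         # $ per instance-hour
transfer_price = 0.10         # $ per GB transferred
link_mbps = 20                # lab-to-AWS link speed

local_hours = data_gb * hours_per_gb / local_instances          # 50 h
cloud_compute_hours = data_gb * hours_per_gb / cloud_instances  # 1 h
cost = (cloud_instances * cloud_compute_hours * instance_price
        + data_gb * transfer_price)                             # ~ $150

# 500 GB * 1000 MB/GB * 8 bits/byte / 20 Mbit/s = 200,000 seconds
transfer_hours = data_gb * 1000 * 8 / link_mbps / 3600          # ~55.6 h

print(local_hours, round(cost), round(transfer_hours + cloud_compute_hours, 1))
# 50 hours locally; $150 but ~56.6 hours in total via the cloud
```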
Cloud Computing Economics
According to Armbrust et al.’s research, there are ten major obstacles to the growth of Cloud Computing. Corresponding to each obstacle, there is an associated opportunity showing how to overcome it, ranging from straightforward product development to major research projects.
These top ten obstacles and opportunities:
3 technical obstacles to the adoption of Cloud Computing
5 technical obstacles to the growth of Cloud Computing
2 policy and business obstacles to the adoption of Cloud Computing
10 Obstacles and Opportunities
1st Obstacle: Availability of a Service
• Organizations worry about whether Utility Computing services have adequate availability
• This makes some wary of Cloud Computing
• However, existing SaaS products have set a high standard in this regard
• Google Search is effectively the dial tone of the Internet
• It seems difficult for new services like Cloud Computing to achieve similar availability at the current stage
• Example: Amazon has an annual revenue of $27 billion, so one minute of service failure means a potential loss of over $51,000.
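The $51,000 figure follows from dividing the annual revenue by the minutes in a year:

```python
# Back-of-the-envelope cost of one minute of downtime at Amazon,
# using the $27 billion annual revenue figure from the slides.

annual_revenue = 27e9
minutes_per_year = 365 * 24 * 60          # 525,600 minutes
loss_per_minute = annual_revenue / minutes_per_year
print(round(loss_per_minute))  # ~51370, i.e. over $51,000 per minute
```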
10 Obstacles and Opportunities
How to overcome the 1st Obstacle?
• Backup is good, though maybe not enough
• Multiple data centres of the same provider, distributed in different geographical regions, may still share common software infrastructure and accounting systems
• The company may also go out of business
• So, we can use multiple Cloud Computing providers to prevent service failure caused by a single company
• Just as large Internet service providers use multiple network providers
• This also allows SaaS providers to defend against DDoS attacks by using quick scale-up
10 Obstacles and Opportunities
Example: defending against DDoS attacks
• DDoS (Distributed Denial of Service) attacks typically use large “botnets”.
• A bot can be rented on the black market for $0.03 per week.
• Suppose an EC2 instance can handle 500 bots, and an attack uses 500,000 bots to generate an extra 1 GB/second of bogus network traffic.
• So, the attacker needs to invest $15,000 up front.
• At AWS’s current prices, the attack costs the victim an extra $360 per hour in network bandwidth and an extra $100 per hour (1,000 instances) of computation.
• The attack would have to last 32 hours in order to cost the victim more than it costs the attacker.
• In practice, an attack lasting that long is usually easier to uncover and defend against.
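The break-even arithmetic can be verified with the slide figures ($0.03 per bot per week, 500,000 bots, $360 + $100 per hour for the victim):

```python
# DDoS economics: hours before the victim's extra cloud bill
# exceeds the attacker's up-front botnet rental cost.

bots = 500_000
bot_rent_per_week = 0.03
attacker_upfront = bots * bot_rent_per_week     # ~ $15,000

victim_cost_per_hour = 360 + 100                # bandwidth + 1,000 instances

break_even_hours = attacker_upfront / victim_cost_per_hour
print(round(break_even_hours, 1))  # ~32.6 hours, the "32 hours" in the slides
```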
10 Obstacles and Opportunities
2nd Obstacle: Data Lock-in
• Software stacks have improved interoperability among platforms, but the APIs for Cloud Computing itself are still essentially proprietary (not standardized yet).
• Customers cannot easily extract their data and programs from one site to run on another.
• Customer lock-in may be attractive to CC providers, but CC users are vulnerable to price increases, to reliability problems, or even to providers going out of business.
10 Obstacles and Opportunities
Solution to Data Lock-in
• The obvious solution is to standardize the APIs for CC.
• But one may fear that this would lead to a “race to the bottom” in cloud pricing and flatten the profits of Cloud Computing providers.
• This seems not to be true, for two reasons.
• First, the quality of a service matters as well as the price.
• Second, standardization of APIs creates the opportunity for the same software infrastructure to be used in both a Private Cloud and a Public Cloud. Such an option could enable “Surge Computing”.
10 Obstacles and Opportunities
3rd Obstacle: Data Confidentiality and Auditability
• “My sensitive corporate data will never be in the cloud.”
• As current Cloud offerings are essentially public (rather than private) networks, they face more attacks.
• However, there are no fundamental obstacles to making a cloud-computing environment as secure as the vast majority of in-house IT environments, if proper security primitives and tools are correctly used.
• Similarly, auditability could be added as an additional layer beyond the reach of the virtualized guest OS (or virtualized application environment).
• Another concern is that many nations have laws (e.g. the USA PATRIOT Act) on keeping customer data and copyrighted material within national boundaries.
10 Obstacles and Opportunities
4th Obstacle: Data Transfer Bottlenecks
• Data transfer costs are an important issue.
• Applications continue to become more data-intensive, and applications may be “pulled apart” across the boundaries of clouds.
• At a rate of $100 to $150 per terabyte, these costs can add up very quickly.
• Lack of bandwidth is one reason few scientists use Cloud Computing.
• Shipping disks is one way to reduce the high cost of Internet transfers.
10 Obstacles and Opportunities
4th Obstacle: Data Transfer Bottlenecks
• US overnight delivery services have only about one failure in 400 attempts.
• Example: Transferring 10 TB (terabytes) from U.C. Berkeley to Amazon in Seattle would require 4,000,000 seconds (over 45 days) over a 20 Mbit/sec WAN link.
• Note: The Amazon S3 average write bandwidth is 5–18 Mbits/sec.
• If ten 1 TB disks were sent via overnight shipping instead, the transfer would take less than a day and cost about $400.
• A second opportunity is to make it attractive to keep data in the cloud.
• Amazon hosts large public datasets (e.g. US Census data) for free on S3; these datasets might “attract” EC2 cycles as there is no charge to transfer data between S3 and EC2
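The 10 TB transfer figure can be checked directly (taking 10 TB as 10^13 bytes and a 20 Mbit/s link, as in the slides):

```python
# Ship-vs-transfer example: 10 TB from Berkeley to Seattle
# over a 20 Mbit/s WAN link.

terabytes = 10
bits_total = terabytes * 10**12 * 8     # 10 TB expressed in bits
link_bits_per_sec = 20 * 10**6          # 20 Mbit/s

seconds = bits_total / link_bits_per_sec
days = seconds / 86_400
print(int(seconds), round(days, 1))  # 4000000 seconds, ~46.3 days
```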
10 Obstacles and Opportunities
4th Obstacle: Data Transfer Bottlenecks
• A third, more radical opportunity is to try to reduce the cost of WAN bandwidth more quickly.
• Two-thirds of the WAN bandwidth cost is for the high-end routers, whereas only one-third is for the fiber.
• Could simpler, low-cost routers built from commodity components replace the high-end distributed routers?
• Like WAN bandwidth, intra-cloud networking technology may be a performance bottleneck as well.
10 Obstacles and Opportunities
5th Obstacle: Performance Unpredictability
• In Cloud Computing, multiple Virtual Machines can share CPUs and main memory surprisingly well, but I/O sharing is more problematic.
• One opportunity is to improve architectures and operating systems to efficiently virtualize interrupts and I/O channels.
• Another possibility is that flash memory will decrease I/O interference.
• Another unpredictability obstacle concerns the scheduling of virtual machines for some batch-processing programs, specifically for high-performance computing (HPC).
10 Obstacles and Opportunities
6th Obstacle: Scalable Storage
• Three properties give Cloud Computing its appeal: short-term usage (resources can be scaled down as well as up according to requirements), no up-front cost, and infinite capacity on demand.
• This has a straightforward implication for computation, but it is less obvious how to apply it to persistent storage.
• As an open research problem, the opportunity is to create a storage system that can scale arbitrarily up and down on demand, while meeting programmer expectations in regard to resource management, data durability, and high availability.
10 Obstacles and Opportunities
7th Obstacle: Bugs in Large-Scale Distributed Systems
• One of the difficult challenges in Cloud Computing is removing errors in these very large scale distributed systems.
• One opportunity may be the reliance on virtual machines in Cloud Computing, though many traditional SaaS providers developed their infrastructure without using VMs.
• Since VMs are de rigueur in Utility Computing, that level of virtualization may make it possible to capture valuable information in ways that are implausible without VMs.
10 Obstacles and Opportunities
8th Obstacle: Scaling Quickly
• Pay-as-you-go certainly applies to storage and to network bandwidth.
• Computation is slightly different, depending on the virtualization level.
• The opportunity is then to automatically scale quickly up and down in response to load, in order to save money without violating service level agreements.
• Another reason for scaling is to conserve resources as well as money.
• An idle computer uses about two-thirds of the power of a busy computer.
10 Obstacles and Opportunities
9th Obstacle: Reputation Fate Sharing
• Reputations do not virtualize well in CC. One customer’s bad behaviour can affect the reputation of the cloud as a whole.
• An opportunity would be to create reputation-guarding services similar to the “trusted email” services currently offered (for a fee) to services hosted on smaller ISPs.
• Here, one legal issue concerns the transfer of legal liability.
10 Obstacles and Opportunities
10th Obstacle: Software Licensing
• Current software licensing models commonly restrict the computers on which the software can run.
• Hence, users pay for the software plus an annual maintenance fee.
• This licensing model is not a good match for Utility Computing. Many CC providers originally relied on open-source software partly for this reason.
• The opportunity is either to continue using open source and/or to change commercial software licensing structures to fit CC better.
10 Obstacles and Opportunities
10th Obstacle: Software Licensing
• For example, Microsoft and Amazon now offer pay-as-you-go software licensing for Windows Server and Windows SQL Server on EC2.
• An EC2 instance running Microsoft Windows costs $0.15 per hour, instead of the $0.10 per hour of the open-source version.
• A related obstacle is encouraging the sales forces of software companies to sell products into Cloud Computing. Pay-as-you-go seems incompatible with their quarterly sales targets.
10 Obstacles and Opportunities
What is cloud computing?
• A working definition proposed by NIST
• It has 5 essential characteristics, 3 service models, and 4 deployment models
Cloud computing economics
• Elasticity: Shifting the Risk
• Comparing Costs: Should I Move to the Cloud?
10 obstacles and opportunities for cloud computing
Summary