Fuyuki Ishikawa (石川冬樹)
Lecture Course: Service-Oriented Computing
11: Cloud Computing (1): Overview
2012/06/22
SOC'12 @ Sokendai, Fuyuki Ishikawa
Course Plan
1 Introduction
2 Basics: Distributed Objects
3 Basics: XML
4 Web Services: Foundations
5 Web Services: Composition
6 Web Services: Implementation
7 Related Topics (1): Reliability
8 Related Topics (2): Security
9 Related Topics (3): Engineering
10 Related Topics (4): Semantic Web
11 Cloud Computing (1): Overview
12 Cloud Computing (2): Experience
13 Discussion and Summary
14 Students’ Presentation
TOC
- Overview of Cloud Computing
  - Overview & Impacts
  - Example of IaaS: Amazon EC2
  - Example of PaaS: Google App Engine
  - Technical Notes
Cloud Computing: Overview
Eric Schmidt (Google CEO, 2006)
- “… It starts with the premise that the data services and architecture should be on servers. We call it cloud computing - they should be in a "cloud" somewhere.”
- “… There are a number of companies that have benefited from that.”
- “… This is the same talk that I gave in this room 10 years ago about something they called the network computer - which, I can assure you, none of you are using, because it didn't work.”
Cloud Computing: Overview
- Internet-based computing that uses computational resources, data, functionality, etc. on servers
  - Paradigm shift “from owning to using”
  - Computation as a public utility
- Somewhat common features:
  - Scales to large amounts of data/processing, often automatically
  - Pay per use
  - Much cheaper to use than to own
- cf. http://csrc.nist.gov/groups/SNS/cloud-computing/index.html
Cloud Computing: Definition
The NIST Definition of Cloud Computing
- Definition: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.
Cloud Computing: Definition (cont’d)
- Essential characteristics
  - On-demand self-service
  - Broad network access
  - Resource pooling
  - Rapid elasticity
  - Measured service
- Service models
  - SaaS, PaaS, IaaS
- Deployment models
  - Private, Community, Public, Hybrid
Cloud Computing: Representatives
- SaaS (Software-as-a-Service)
  - Packaged software
  - Gmail, Salesforce CRM, Microsoft Online Services, etc.
- PaaS (Platform-as-a-Service)
  - Platform, often including automated management middleware for specific programming languages
  - Force.com, Google App Engine, Windows Azure, etc.
- IaaS (Infrastructure-as-a-Service)
  - Infrastructure, including virtual machines and storage, that can be used freely for any purpose
  - Amazon EC2, Nifty Cloud, etc.
Notes: IaaS and PaaS
- Difficult (and not essential) to distinguish IaaS and PaaS
  - Not binary: a matter of what the provider prepares and what the user prepares
  - e.g., Azure is said to be at the middle level
- Sometimes considered together as “providing resources” (while SaaS can be considered “using resources to provide application-level services”)
  - e.g., [Report by M. Armbrust et al. (UC Berkeley)]
Notes: IaaS and PaaS
[Report by M. Armbrust et al. (UC Berkeley)]
Cloud Computing: Application (1)
Scaling
- Animoto: opened its movie creation/browsing service to Facebook users (2008)
  - Amazon EC2: 4,000 servers (originally 50) after 3 days, to handle 250 thousand users (originally 25 thousand)
- White House: set up a questionnaire web site (2009)
  - Google App Engine: 100 thousand questions and 3.6 million answers in 2 days (max 600 queries/second)
- Twitter, Nasdaq, …
Cloud Computing: Application (2)
Quick start-up and withdrawal
- New York Times: converted 100 years of newspaper images (several TB) into PDF (2007)
  - Amazon EC2: 100 virtual machines for 24 hours (about $1,250)
- Japanese ministries/cities: set up systems for urgent and temporary requirements (e.g., the fixed-sum cash benefit, 定額給付金) (2009)
  - Force.com
Cloud Computing: Economies of Scale
More volume, less cost

                                 1,000 servers   50,000 servers
  Network (per Mbit/sec)         $95             $13
  Storage (per GB)               $2.2            $0.4
  Management (per supervisor)    140 servers     1000 servers

  [Report by M. Armbrust et al. (UC Berkeley)]

- Amazon has more traffic for EC2/S3 than for its book commerce (since 2008)
  - More and more benefit for its original business
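The size of the gap in the table is easier to see as ratios. The following is a small sketch that only reuses the figures quoted on the slide; the dictionary layout and the function name `cost_advantage` are illustrative choices, not part of the cited report.

```python
# Figures from the slide's table: (1,000-server data center, 50,000-server data center).
COSTS = {
    "network": (95.0, 13.0),  # dollars per unit of bandwidth
    "storage": (2.2, 0.4),    # dollars per unit of storage
}
SERVERS_PER_SUPERVISOR = (140, 1000)


def cost_advantage(medium, large):
    """Return how many times cheaper the large data center is."""
    return medium / large


for name, (medium, large) in COSTS.items():
    print(f"{name}: {cost_advantage(medium, large):.1f}x cheaper")

# For management, more servers per supervisor is better, so the ratio is flipped.
admin_factor = SERVERS_PER_SUPERVISOR[1] / SERVERS_PER_SUPERVISOR[0]
print(f"management: {admin_factor:.1f}x more servers per supervisor")
```

Under these numbers each item improves by roughly a factor of five to seven at the larger scale.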
cf. Clouds and Grids
- Original concept of Grid
  - Grid(iron): networks (e.g., of pipes) for electricity, gas, etc.
  - Connect and use (without caring about the source, place, or realization means)
- Also holds for clouds
cf. Clouds and Grids
[By Ian Foster]
- Cloud, Grid, and Services can make us smarter
  - Services make distributed resources and capabilities accessible over the network
  - Grid assists with integration via standardized service interfaces and collective VO services
  - Cloud provides for scalable hosting of collective services
- VO: Virtual Organization
Amazon EC2: Overview
- Amazon EC2 (Elastic Compute Cloud) provides servers (processors and memory) (IaaS: Infrastructure-as-a-Service)
  - Operated with Xen (open source virtualization software)
  - Allows deployment of third-party images with operating systems and software packages
    - e.g., Windows Server 2008 + SQL Server 2005
- You can pick a disk image from the catalog (thousands available) and run/stop it whenever you want
Amazon EC2: Pricing (1)
Example of price settings:
- Virtual machine type “Small”
  - 1 core of 1 ECU (corresponding to a 1.0-1.2 GHz Opteron/Xeon)
  - 1.7 GB memory
  - $0.085/hour (Linux), $0.12/hour (Windows) (* 24 h * 30 days = $61.2, $86.4/month)
- Data download from EC2
  - If 0-10 TB in the month: $0.15/GB
- Data upload to EC2
  - $0.1/GB
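The monthly figures on this slide follow from simple arithmetic, which can be sketched as below. The rates are the ones quoted on the slide (2012 prices for the “Small” type); the function name `monthly_cost` and the 30-day month are illustrative assumptions, and the sketch does not model the cheaper tiers beyond 10 TB of download.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # same simplification as on the slide

HOURLY_RATE = {"linux": 0.085, "windows": 0.12}  # dollars per hour, "Small" type
DOWNLOAD_PER_GB = 0.15  # dollars, valid for the first 0-10 TB per month
UPLOAD_PER_GB = 0.10    # dollars


def monthly_cost(os_name, download_gb=0.0, upload_gb=0.0):
    """Cost in dollars of one always-on instance plus data transfer for a month."""
    compute = HOURLY_RATE[os_name] * HOURS_PER_DAY * DAYS_PER_MONTH
    transfer = download_gb * DOWNLOAD_PER_GB + upload_gb * UPLOAD_PER_GB
    return compute + transfer


print(round(monthly_cost("linux"), 2))
print(round(monthly_cost("windows"), 2))
print(round(monthly_cost("linux", download_gb=100, upload_gb=20), 2))
```

The first two calls reproduce the slide's $61.2 and $86.4 per month; the third adds a hypothetical 100 GB of download and 20 GB of upload.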
Amazon EC2: Pricing (2)
- Various choices
  - 1-3.25 ECU * 1-8 cores, 1.7-68.4 GB memory
- Pay per use
  - e.g., pay only for business hours (9am-5pm, 8 hours)
- SLA: Service Level Agreement
  - 10% back if the annual availability is less than 99.95%
- Other discounts
  - Reserved instances: if you pay for 1 year or 3 years, discount by about 2/3
  - Spot instances: automatically bid for unused resources (the actual price is said to be about 1/4-1/3 of the regular price)
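Two of the points above lend themselves to a quick calculation: what pay-per-use saves when a machine only runs during business hours, and how much downtime a 99.95% annual availability target leaves. This is a rough sketch; the $0.085/hour rate comes from the previous pricing slide, while the 22 working days per month and both function names are assumptions made for illustration.

```python
HOURLY_RATE_LINUX = 0.085  # dollars per hour, "Small" type (from the pricing slide)


def usage_cost(hours_per_day, days, hourly_rate=HOURLY_RATE_LINUX):
    """Cost in dollars when the instance runs only for the given hours."""
    return hours_per_day * days * hourly_rate


always_on = usage_cost(24, 30)
business_hours = usage_cost(8, 22)  # assumption: 22 working days in the month


def allowed_downtime_hours(availability, days=365):
    """Hours per period that may be unavailable while still meeting the target."""
    return (1.0 - availability) * days * 24


print(round(always_on, 2), round(business_hours, 2))
print(round(allowed_downtime_hours(0.9995), 2))
```

Under these assumptions the business-hours machine costs about a quarter of the always-on one, and the SLA threshold corresponds to a little over four hours of downtime per year.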
Amazon EC2: Options and Other Services
- Options
  - Elastic IP Address: use fixed IP addresses
  - Availability Zone: operate virtual machines in different management zones in a data center (to avoid everything going down at once)
- API for a “programmable data center”
  - A variety of third-party management tools (e.g., visualization of server statuses)
- API for interoperability
  - Eucalyptus-based (NII cloud, NASA cloud, etc.)
Amazon: Whole Picture
[J. Varia, Architecting for The Cloud: Best Practices, 2010]
GAE: Overview
- GAE (Google App Engine) provides platforms for web applications (PaaS: Platform-as-a-Service)
  - Dedicated mechanisms for scaling web servers
  - Availability and redundancy management
  - Scalable programming
- You can run scalable web applications on Google’s resources by following the given design strategies
GAE: Pricing
- Free quota
  - 1.3 million HTTP requests
  - 10 GB each for inbound and outbound traffic
  - 6.5 CPU hours
  - 12 GB of writes and 115 GB of reads in the data store
  - Thousands of API calls (data store, mail, image, etc.)
- Pay for usage beyond the free quota
  - CPU
  - Bandwidth
  - Storage
GAE: Programming
- Run a program in Python (or Java)
  - Makes a response to an HTTP request
  - Within 30 seconds
  - Without writing to the file system, using sockets directly, or creating subprocesses or threads
- Other kinds of programs are also possible
  - Run periodically by cron
  - Run on receipt of an email
  - Put into and run by task queues
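The request-handler style described above can be sketched as a plain WSGI callable, the standard Python web interface. The slides do not show App Engine's own handler classes, so nothing here is GAE-specific: it assumes a runtime that hosts a WSGI callable, and the names `application` and `fake_start_response` are illustrative. The handler only computes a response from the request, with no file writes, sockets, subprocesses, or threads.

```python
def application(environ, start_response):
    """Answer one HTTP request using only the request data."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = ("Hello, %s!\n" % name).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Local try-out: call the handler directly, the way a hosting platform would.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


response = application({"QUERY_STRING": "Sokendai"}, fake_start_response)
print(captured["status"], b"".join(response).decode("utf-8"))
```

Because the handler keeps no local state, the platform is free to run any number of copies on different servers, which is what makes this style easy to scale.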
GAE: Datastore
- Uses key-value data structures for persistent storage (Datastore)
  - Uses BigTable, developed for Google’s search indexing
  - Operates on entities that have one or more key-value pairs (properties)
- Distributed and replicated automatically
  - Groups of entities to be located “physically and logically nearby” need to be defined explicitly
  - Transactions can only be defined for entities in the same group
  - Too large a group leads to bad performance
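The entity-group rule can be illustrated with a toy in-memory model. This is not the Datastore API: `ToyDatastore` and its methods are hypothetical names, and the class only mimics the one constraint stated above, namely that a transaction may touch entities of a single group.

```python
class ToyDatastore:
    """In-memory stand-in: each entity is a dict of properties, stored per group."""

    def __init__(self):
        self.groups = {}  # group name -> {key -> properties}

    def put(self, group, key, **properties):
        self.groups.setdefault(group, {})[key] = dict(properties)

    def get(self, group, key):
        return self.groups[group][key]

    def transaction(self, updates):
        """Apply all updates or none; they must target a single entity group.

        updates: list of (group, key, property_name, new_value)
        """
        touched = {group for group, _, _, _ in updates}
        if len(touched) != 1:
            raise ValueError("a transaction may only touch one entity group")
        group = touched.pop()
        # Work on a copy so that a failure leaves the stored data untouched.
        staged = {k: dict(v) for k, v in self.groups[group].items()}
        for _, key, prop, value in updates:
            staged[key][prop] = value  # a missing key raises and aborts everything
        self.groups[group] = staged


store = ToyDatastore()
store.put("tom", "account", balance=100)
store.put("tom", "log", entries=0)
store.put("bob", "account", balance=50)

# Both entities belong to group "tom", so this is allowed.
store.transaction([("tom", "account", "balance", 90), ("tom", "log", "entries", 1)])

# This one spans two groups and is rejected.
try:
    store.transaction([("tom", "account", "balance", 80),
                       ("bob", "account", "balance", 60)])
except ValueError as error:
    print("rejected:", error)
```

The design trade-off is visible even in the toy: putting more entities into one group allows more transactions, but a real system then has to keep that whole group together, which is why overly large groups hurt performance.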
GAE: Specific Constraints
- Needs to follow the specific designs (to obtain the benefits of scalability and availability)
  - Key-value DB (not RDB)
    - Cannot efficiently count the number of data items
    - Cannot efficiently implement the join operation
- Needs dedicated programs
  - No threads, no file system access, …
- Existing web frameworks are often not optimal
  - e.g., eager caching of libraries can lead to timeouts, and often brings little benefit because contents in memory are swapped out more frequently
Scale Up and Scale Out
To handle larger and larger amounts of processing (e.g., requests):
- Scale Up: use stronger servers
  - Used in traditional on-premise servers
- Scale Out: use a large number of servers with management software tools
  - For web-scale data sets (e.g., search engines)
  - Based on the assumption that there are always some servers that do not work
  - Uses specific mechanisms for data and functionality (e.g., key-value DB, Map Reduce)
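One common way to spread data over many servers when scaling out is to hash each key to a server and keep extra copies on neighbouring servers. The slides do not prescribe this scheme; the sketch below is only an illustration, and the server names, the `copies` parameter, and both function names are made up for the example.

```python
import hashlib


def server_for(key, servers):
    """Pick a server by hashing the key (stable across runs, unlike hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def replicas_for(key, servers, copies=2):
    """Primary server plus the following ones, so a single failure loses no data."""
    start = servers.index(server_for(key, servers))
    count = min(copies, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(count)]


servers = ["server-%d" % i for i in range(4)]  # hypothetical server names
for key in ["Tom", "Bob"]:
    print(key, replicas_for(key, servers))
```

Adding capacity then means adding servers to the list rather than buying a stronger machine, although this simple modulo scheme would move many keys when the list changes.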
Key-Value DB vs. RDB
- Key-value DBs are actively discussed for scaling out (often called NoSQL)

RDB (normalized tables, combined by join)

  Name  Place
  Tom   Tokyo
  Bob   New York

  Name  Role
  Tom   Developer
  Bob   Manager

  join / normalize

  Name  Place     Role
  Tom   Tokyo     Developer
  Bob   New York  Manager

Key-Value Pairs

  Key  Value
  Tom  Place: Tokyo, Role: Developer, Agency: CompanyA
  Bob  Place: New York, Role: Manager
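The contrast in the figure can be written down with ordinary Python data structures. This is only a sketch using the figure's sample data: lists of tuples stand in for relational tables and a dictionary stands in for the key-value store, and `join_on_name` is an illustrative helper rather than any database API.

```python
# RDB style: two normalized tables, combined by a join on Name.
places = [("Tom", "Tokyo"), ("Bob", "New York")]
roles = [("Tom", "Developer"), ("Bob", "Manager")]


def join_on_name(left, right):
    """Combine rows of the two tables that share the same Name."""
    right_by_name = dict(right)
    return [(name, place, right_by_name[name])
            for name, place in left if name in right_by_name]


# Key-value style: one lookup returns everything stored under the key,
# and entries need not share the same set of properties.
people = {
    "Tom": {"Place": "Tokyo", "Role": "Developer", "Agency": "CompanyA"},
    "Bob": {"Place": "New York", "Role": "Manager"},
}

print(join_on_name(places, roles))
print(people["Tom"])
```

The key-value form answers "everything about Tom" with a single lookup on one server, while a question that cuts across keys (for example, all people in Tokyo) has no index to rely on in this sketch and would need a scan.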
Key-Value DB vs. RDB
- Advantages of key-value DB
  - Efficient full-text search by parallel processing
  - Efficient read operations by load distribution
  - High availability by replication
- Disadvantages of key-value DB
  - Keeping data consistent and supporting transactions is difficult or inefficient, because of the large number of replicas and the optimistic (non-blocking) write mechanisms that are often used
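An optimistic (non-blocking) write can be sketched with version numbers: nobody takes a lock, and a write succeeds only if the value has not changed since it was read. The class below is a single-process illustration with hypothetical names (`VersionedStore`, `expected_version`); it does not model replication, only the conflict check that makes consistency the caller's problem.

```python
class VersionedStore:
    """Key-value store where each value carries a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); version 0 means the key is not stored yet."""
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version):
        """Store value only if nobody wrote since expected_version was read."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False  # conflict: the caller must re-read and retry
        self._data[key] = (current_version + 1, value)
        return True


store = VersionedStore()
version_a, _ = store.read("Tom")  # client A reads
version_b, _ = store.read("Tom")  # client B reads the same version

print(store.write("Tom", {"Role": "Developer"}, version_a))  # True: A wins
print(store.write("Tom", {"Role": "Manager"}, version_b))    # False: B is rejected
```

Reads and writes never wait for each other, which helps throughput, but every writer needs retry logic, and a transaction spanning several keys becomes correspondingly harder.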
Map Reduce
- Distributed frameworks for parallel processing of large data sets
  - Split large data sets and process them on a large number of servers
  - Map: a data chunk -> key-value list
  - Reduce: [key, list of values] -> list of values
- Many implementations
  - Origin: Google’s MapReduce framework (was used for re-indexing the WWW data)
  - Hadoop open source framework
  - …
Map Reduce
Word Count Example: (1)
Input:
  abc def hij …………… xyz
Split:
  abc def …… pqr
  def stu …… def
  …
  vwx abc …… xyz
Map:
  [abc, 1] [def, 1] … [pqr, 1]
  [def, 1] [stu, 1] … [def, 1]
  …
  [vwx, 1] [abc, 1] … [xyz, 1]
Map Reduce
Word Count Example: (2)
Map output:
  [abc, 1] [def, 1] … [pqr, 1]
  [def, 1] [stu, 1] … [def, 1]
  …
  [vwx, 1] [abc, 1] … [xyz, 1]
Group:
  [abc, [1, 1, …, 1]] [def, [1, 1, …, 1]] … [xyz, [1, 1, …, 1]]
Partition:
  [abc, [1, 1, …, 1]] …
  [def, [1, 1, …, 1]] …
  …
  [xyz, [1, 1, …, 1]] …
Map Reduce
Word Count Example: (3)
Partitions:
  [abc, [1, 1, …, 1]] …
  [def, [1, 1, …, 1]] …
  …
  [xyz, [1, 1, …, 1]] …
Reduce:
  [abc, 346212] …
  [def, 96521] …
  …
  [xyz, 260412] …
Output:
  [abc, 346212] [def, 96521] … [xyz, 260412]
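The word count walk-through can be reproduced in a few lines of plain Python. This is a sequential sketch of the split, map, group, and reduce steps, not Hadoop or Google's framework: the function names and the tiny input are chosen for illustration, the partition step is skipped, and a real framework would run the map and reduce calls in parallel on many servers.

```python
from collections import defaultdict


def split(text, lines_per_chunk=1):
    """Split: cut the input into chunks that could go to different servers."""
    lines = text.splitlines()
    return [" ".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]


def map_chunk(chunk):
    """Map: a data chunk -> list of (word, 1) pairs."""
    return [(word, 1) for word in chunk.split()]


def group(pairs):
    """Group: collect all values emitted for the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_counts(key, values):
    """Reduce: (key, list of values) -> (key, total count)."""
    return key, sum(values)


def word_count(text):
    pairs = [pair for chunk in split(text) for pair in map_chunk(chunk)]
    return dict(reduce_counts(key, values) for key, values in group(pairs).items())


text = "abc def pqr\ndef stu def\nvwx abc xyz"
print(word_count(text))
```

Each `map_chunk` call depends only on its own chunk and each `reduce_counts` call only on its own key, which is what lets a framework distribute them freely.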
Many Ongoing Efforts
- New services and new providers
  - e.g., AWS Marketplace (Apr 2012)
- Standard software implementations for building clouds
  - e.g., OpenStack, CloudStack
- New software and applications
- …
- At NII
  - edubase Cloud: http://edubase.jp/cloud/ (in Japanese)
Next
- July 7
  - Cloud Computing: Inside
  - Run a simple example to see “quick, flexible self-service” as well as pros and cons in terms of distributed processing