KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association
8. Platform as a Service
www.kit.edu
Cloud Computing mit mathematischen Anwendungen
Dr. habil. Marcel Kunze
Engineering Mathematics and Computing Lab (EMCL)
Institut für Angewandte und Numerische Mathematik IV
Karlsruhe Institute of Technology (KIT)
Service Delivery Models
Cloud Computing | SS2011 | M.Kunze
IaaS
PaaS
SaaS
Platform as a Service (PaaS)
• Programming Environment
• Execution Environment
• The consumer controls the applications that run in the environment (and possibly has some control over the hosting environment), but does not control the operating system, hardware or network infrastructure on which they are running.
• The platform is typically an application framework.
[NIST]
Platform as a Service (PaaS)
• Amazon Relational Database Service (RDS)
• Google App Engine
Relational Databases
• Relational database features
  • Organization of structured data
  • Query languages (SQL)
  • Effective and secure data sharing (transactions)
  • Backup and recovery
  • Controlled redundancy
  • Data consistency and integrity constraints
• Isolation between applications and data
  • Data abstraction (data models)
  • Applications work on the representation of data
• The database file system is strongly protected against outside manipulation
  • It is best to hide it from the administrator / user
Relational Database Features
• Sharing of data and support for atomic multi-user transactions
  • Multiple users and applications may access the DB at the same time
  • Concurrency control is necessary for maintaining consistency
  • Transactions need to be atomic and isolated from each other
• Additional nice-to-have features
  • Linear scaling
    • Performance and storage capacity scale with the amount of resources
  • Flexible data schema
    • Free evolution or change
    • Versioning of table entries
    • Loosen up the relational model!
  • Elastic computing
    • Machines can be added or removed at any time
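The atomicity requirement above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database instead of a multi-user server; the `account` table, the names and the amounts are made up for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts atomically: both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return all balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

# Hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # would overdraw, so both updates are rolled back
final = balances(conn)
```

The second transfer leaves no trace in the table: the debit and the credit are undone together. Isolation between concurrent users is not shown here.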
Relational Database Scalability
Amazon Relational Database Service
• RDS implements a relational database platform service
• Scalability (up and out)
  • Re-instantiate the service with a different instance type
  • Add several read replicas
• Fault tolerance: Multi-AZ deployment
  • Mirroring of the database between availability zones
• Full implementation of the SQL query language
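Scaling out with read replicas only helps if reads actually reach the replicas. A minimal sketch of that idea follows, assuming the application picks the endpoint itself; the `ReplicaRouter` class and the host names are hypothetical and not part of any AWS API.

```python
import itertools

class ReplicaRouter:
    """Send writes to the primary endpoint and spread reads over read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Treat plain SELECT statements as reads; everything else goes to the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary

# Hypothetical endpoints, for illustration only
router = ReplicaRouter(
    primary="primary.example.invalid",
    replicas=["replica-1.example.invalid", "replica-2.example.invalid"],
)
targets = [router.endpoint_for(q) for q in (
    "SELECT * FROM City",
    "UPDATE City SET Population = 1 WHERE ID = 1",
    "SELECT COUNT(*) FROM City",
    "SELECT 1",
)]
```

Reads rotate over the replicas while the update goes to the primary. Replicas may lag behind the primary, so a read that must see its own write should also go to the primary.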
RDS Support in the AWS Management Console
• RDS offers MySQL and Oracle databases as a service
• RDS tools exist for the command line
• RDS API for Java, Python, …
RDS Pricing (MySQL)
• Data transfer and storage are billed in addition
• Each read replica counts extra
• Multi-AZ deployment is twice as expensive
• Reserved instances may be ordered at a comparatively low price
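The pricing rules above translate into simple arithmetic. The sketch below assumes that each read replica is billed like one additional instance and that Multi-AZ doubles only the primary's price; the hourly rate is made up and is not an actual AWS price.

```python
def monthly_instance_cost(hourly_rate, hours=730, read_replicas=0, multi_az=False):
    """Estimate instance-hour charges only; storage and data transfer are billed on top."""
    primary = hourly_rate * hours * (2 if multi_az else 1)  # Multi-AZ doubles the primary's price
    replicas = hourly_rate * hours * read_replicas          # each replica counts as an extra instance
    return primary + replicas

HYPOTHETICAL_RATE = 0.10  # USD per instance hour, made up for illustration

single = monthly_instance_cost(HYPOTHETICAL_RATE)
ha_with_replicas = monthly_instance_cost(HYPOTHETICAL_RATE, read_replicas=2, multi_az=True)
```

With these assumptions, a Multi-AZ deployment with two read replicas costs four times as much per month as a single instance.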
RDS Pricing (Oracle)
Homework 7
Create an RDS database instance in the Amazon console. Then use AWS to launch an Amazon Linux system (t1.micro) based on the image ami-8e1fece7.
• Log in to the server:
    ssh -i <keyname>.pem ec2-user@<server-ip>
• Install mysql:
    sudo yum install mysql
• Load the database world.sql:
    wget http://cloudvorlesung.s3.amazonaws.com/world.sql
    mysql -h <dbname>.us-east-1.rds.amazonaws.com -u dbroot --password=<dbpass>
    mysql> CREATE DATABASE world;
    mysql> USE world;
    mysql> SOURCE world.sql;
    mysql> SELECT * FROM world.City LIMIT 0, 100;
    mysql> quit
• Which cities in Germany have more than 100,000 inhabitants?
    SELECT * FROM `City` WHERE Population > 100000 AND CountryCode = "DEU" LIMIT 0, 100;
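The final query can also be tried without an RDS instance. The sketch below swaps in an in-memory SQLite database through Python's sqlite3 module; the table has only the three columns the query needs, and the rows are rough illustrative figures, not data from world.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER)")
# Illustrative rows only: rounded figures, not taken from world.sql.
conn.executemany("INSERT INTO City VALUES (?, ?, ?)", [
    ("Berlin", "DEU", 3400000),
    ("Karlsruhe", "DEU", 280000),
    ("Baden-Baden", "DEU", 53000),
    ("Paris", "FRA", 2100000),
])

# Same filter as in the homework, with parameters instead of literals.
rows = conn.execute(
    "SELECT Name, Population FROM City "
    "WHERE Population > ? AND CountryCode = ? "
    "ORDER BY Population DESC LIMIT 100",
    (100000, "DEU"),
).fetchall()
names = [name for name, _ in rows]
```

Passing the values as parameters avoids quoting problems; against RDS the same statement would be sent through a MySQL client library instead.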
Google App Engine
The Google Story
• Googol: 10^100
  • In the late 1930s, mathematician Edward Kasner was asked to come up with a name for a very large number. He outsourced this task to his nine-year-old nephew, Milton Sirotta, who in turn coined the word “googol”.
• Googolplex: 10^googol
  • A number of interest only to theoreticians. It has a googol (and one) digits, so even if you tried to write the number down and could somehow inscribe a quintillion (10^18) digits onto every particle in the universe, you would only write down 1% of all the digits.
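The 1% figure can be checked with exact integer arithmetic. The sketch assumes roughly 10^80 particles in the universe, a common order-of-magnitude estimate that the slide itself does not state.

```python
from fractions import Fraction

googol = 10**100
digits_in_googolplex = googol + 1   # 10^googol is a 1 followed by a googol zeros
particles = 10**80                  # assumed particle count, not stated in the text
digits_per_particle = 10**18        # a quintillion
writable = particles * digits_per_particle
share = Fraction(writable, googol)  # fraction of the zeros that would fit
```

Under that assumption, 10^98 digits fit, which is one hundredth of a googol.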
“Google” Circa 1997 (google.stanford.edu)
Source: J.Dean
Research Project, circa 1997
PhD students Larry Page and Sergey Brin at Stanford University
Source: J.Dean
Google Search Engine
• The Google Web Server (GWS) takes your query and coordinates the search and response
• The index is partitioned into “shards”. Each shard indexes a subset of the docs (web pages) and can be searched by multiple index servers.
  • The GWS routes your search to one index server
  • The result is an ID for every doc satisfying your search, rank-ordered by relevance
• The docs, too, are partitioned into “shards”; the partitioning is a hash on the doc ID. Each shard contains the full text of a subset of the docs and can be searched by multiple doc servers.
  • The GWS sends appropriate doc IDs to one doc server
  • The result is a URL, a title, and a summary for every relevant doc
• The GWS builds an HTTP response to your search and ships it off
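The two-stage lookup above can be sketched in a few lines. This is a toy model, not Google's implementation: it assumes the index shards use the same hash partitioning as the doc shards, it returns hits sorted by doc ID instead of ranking them, and the shard count, function names and documents are made up.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # arbitrary for the example

def shard_for(doc_id):
    """Stable hash of the doc ID, reduced to a shard number."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS

def build(docs):
    """Return (index_shards, doc_shards) for a {doc_id: text} mapping."""
    index_shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    doc_shards = [{} for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        s = shard_for(doc_id)
        doc_shards[s][doc_id] = text
        for word in text.lower().split():
            index_shards[s][word].add(doc_id)
    return index_shards, doc_shards

def search(word, index_shards, doc_shards):
    """Ask every index shard for matching IDs, then fetch a summary from the doc shards."""
    doc_ids = sorted(d for shard in index_shards for d in shard.get(word.lower(), ()))
    return [(d, doc_shards[shard_for(d)][d][:40]) for d in doc_ids]

# Hypothetical documents
docs = {
    "d1": "cloud computing with python",
    "d2": "relational database in the cloud",
    "d3": "python web apps",
}
index_shards, doc_shards = build(docs)
hits = search("cloud", index_shards, doc_shards)
```

Because the shard is computed from the doc ID alone, any server can locate a document's full text without a central directory.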
“Corkboards” (1999)
Source: J.Dean
Serving System, circa 1999
Source: J.Dean
Google Data Center (2000)
Source: J.Dean
Early 2001: In-Memory Index
Source: J.Dean
Serving Design, 2004 edition (“memcache”)
Source: J.Dean
Google BigTable
• “A BigTable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
• Google's implementation of a database
  • Lots of semi-structured data
  • Enormous scale
  • (row:string, column:string, time:int64) -> string
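The data model in the last bullet can be mimicked with a toy in-memory class. It only illustrates the sorted, versioned map; distribution and persistence are left out, and the class name, methods and sample keys are assumptions made for the example.

```python
import bisect

class TinyTable:
    """Toy in-memory map: (row, column, timestamp) -> bytes, kept sorted by key."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._cells = {}   # same keys -> value bytes

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negate so newer versions sort first
        if key not in self._cells:
            bisect.insort(self._keys, key)
        self._cells[key] = value

    def get(self, row, column, at=None):
        """Return the newest value, or the newest one with timestamp <= `at`."""
        start = (row, column, float("-inf") if at is None else -at)
        i = bisect.bisect_left(self._keys, start)
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._cells[self._keys[i]]
        return None

    def scan_rows(self, prefix):
        """Yield distinct row keys starting with `prefix`, in sorted order."""
        i = bisect.bisect_left(self._keys, (prefix,))
        seen = None
        while i < len(self._keys) and self._keys[i][0].startswith(prefix):
            if self._keys[i][0] != seen:
                seen = self._keys[i][0]
                yield seen
            i += 1

# Hypothetical sample cells
t = TinyTable()
t.put("com.example.www", "contents:", 1, b"<html>v1")
t.put("com.example.www", "contents:", 2, b"<html>v2")
t.put("com.example.www", "anchor:home", 1, b"Example")
t.put("org.example", "contents:", 1, b"x")
latest = t.get("com.example.www", "contents:")
older = t.get("com.example.www", "contents:", at=1)
missing = t.get("com.example.www", "nope")
rows = list(t.scan_rows("com."))
```

The map is sparse because only cells that were written occupy space, and keeping the keys sorted makes prefix scans over neighbouring rows cheap.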
2007: Universal Search
Source: J.Dean
Current Machines
• In-house rack design
• PC-class motherboards
• Low-end storage and networking hardware
• Linux + in-house software
Source: J.Dean
Challenges building Web Applications
Why Not LAMP?
• Linux, Apache, MySQL/PostgreSQL, Python/Perl/PHP/Ruby
• LAMP is the industry standard
• But management is a hassle:
  • Configuration, tuning
  • Backup and recovery, disk space management
  • Hardware failures, system crashes
  • Software updates, security patches
  • Log rotation, cron jobs, and much more
  • Redesign needed once your database exceeds one box
• “Google carries pagers so you don’t have to”
App Engine Components
1. Scalable Serving Infrastructure
2. Python and Java Runtime
3. Software Development Kit
4. Datastore
5. Web-based Admin Console
App Engine Features
• Does one thing well: running web apps
• Simple app configuration
• Scalable
• Secure
App Engine Does One Thing Well
• App Engine handles HTTP(S) requests, nothing else
  • Think RPC: request in, processing, response out
  • Works well for the web and AJAX; also for other services
• App configuration is dead simple
  • No performance tuning needed
• Everything is built to scale
  • “infinite” number of apps, requests/sec, storage capacity
  • APIs are simple, stupid
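The “request in, processing, response out” model is what a plain WSGI callable expresses. The sketch below uses only the Python standard library and is a generic WSGI example, not App Engine's own framework; the `call` helper and the greeting are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """Request in, processing, response out: no state survives between calls."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = ("Hello, %s!\n" % name).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

def call(application, query=""):
    """Invoke the app once with a synthetic request and collect the response."""
    environ = {"QUERY_STRING": query}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body

status, headers, body = call(app, "KIT")
```

Because the handler keeps no state of its own, the platform is free to run it in any process on any host, which is what makes the scaling described in this deck possible.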
App Engine Architecture
[Figure: App Engine architecture. Labels: req/resp; Python VM process (stdlib, app); R/O FS; stateful APIs (memcache, datastore); stateless APIs (images, urlfetch)]
Scaling
• Low-usage apps: many apps per physical host
• High-usage apps: multiple physical hosts per app
• Stateless APIs are trivial to replicate
• Memcache is trivial to shard
• Datastore built on top of Bigtable; designed to scale well
  • Abstraction on top of Bigtable
  • API influenced by scalability
    • No joins
    • Recommendations: denormalize schema; precompute joins
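“Denormalize schema; precompute joins” means copying the joined fields into the entity at write time. The sketch below uses plain dictionaries as a stand-in for a key-value datastore; the entity layout and function names are hypothetical and do not use the App Engine Datastore API.

```python
authors = {"a1": {"name": "Ada"}, "a2": {"name": "Linus"}}
posts = {}  # post_id -> entity; stands in for a key-value datastore

def add_post(post_id, author_id, title):
    """Copy the author's name into the post at write time (the precomputed join)."""
    posts[post_id] = {
        "author_id": author_id,
        "author_name": authors[author_id]["name"],  # denormalized copy
        "title": title,
    }

def rename_author(author_id, new_name):
    """The cost of denormalizing: every copy must be updated on change."""
    authors[author_id]["name"] = new_name
    for post in posts.values():
        if post["author_id"] == author_id:
            post["author_name"] = new_name

def render(post_id):
    """Reading needs a single lookup, no join."""
    p = posts[post_id]
    return "%s by %s" % (p["title"], p["author_name"])

add_post("p1", "a1", "Notes on engines")
before = render("p1")
rename_author("a1", "Ada L.")
after = render("p1")
```

Reads become a single key lookup, which scales well; the price is extra work and possible temporary inconsistency when the copied field changes.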
Automatic Scaling to Application Needs
• You don’t need to configure your resource needs
• One CPU can handle many requests per second
• Apps are hashed (really mapped) onto CPUs:
  • One process per app, many apps per CPU
  • Creating a new process is a matter of cloning a generic “model” process and then loading the application code (in fact the clones are pre-created and sit in a queue)
  • The process hangs around to handle more requests (reuse)
  • Eventually old processes are killed (recycle)
• Busy apps (many QPS) get assigned to multiple CPUs
  • This automatically adapts to the need
    • as long as CPUs are available
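The process life cycle above (pre-created clones in a queue, reuse, recycle) can be modelled with a toy scheduler. The class, the pool size and the recycle threshold are made up for illustration, and the sketch keeps at most one process per app, so the spreading of busy apps over several CPUs is not shown.

```python
from collections import deque

class Scheduler:
    """Toy model: pre-created generic workers are specialised per app, reused, then recycled."""

    def __init__(self, pool_size=3, max_requests=2):
        self.max_requests = max_requests  # recycle threshold (made up)
        self.pool = deque({"app": None, "served": 0} for _ in range(pool_size))
        self.active = {}                  # app name -> its worker
        self.log = []

    def handle(self, app):
        worker = self.active.get(app)
        if worker is None:
            worker = self.pool.popleft()  # take a pre-created clone from the queue
            worker["app"] = app           # stands in for loading the application code
            self.active[app] = worker
            self.log.append("start " + app)
        else:
            self.log.append("reuse " + app)
        worker["served"] += 1
        if worker["served"] >= self.max_requests:
            del self.active[app]                          # kill the old process
            self.pool.append({"app": None, "served": 0})  # and refill the queue
            self.log.append("recycle " + app)

sched = Scheduler()
for name in ["blog", "shop", "blog", "blog"]:
    sched.handle(name)
```

When the queue is empty, `popleft` raises `IndexError`, which mirrors the caveat that scaling only works as long as CPUs are available.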
Security
• Prevent the bad guys from breaking (into) your app
• Constrain direct OS functionality
  • no processes, threads, dynamic library loading
  • no sockets (use urlfetch API)
  • can’t write files (use datastore)
  • disallow unsafe Python extensions (e.g. ctypes)
• Limit resource usage (quota)
  • Limit of 1000 files per app, each at most 1 MB
  • Hard time limit of 10 seconds per request
  • Most requests must use less than 300 ms CPU time
  • Hard limit of 1 MB on request/response size, API call size, etc.
  • Quota system for number of requests, API calls, emails sent, etc.
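Two of the limits above, the 1 MB payload size and the 10-second deadline, can be expressed as a small wrapper around a request handler. This is a sketch, not how App Engine enforces its sandbox: it assumes 1 MB means 1024 * 1024 bytes, and it only detects an overlong run after the fact instead of interrupting it. All names are hypothetical.

```python
import time

MAX_BODY_BYTES = 1024 * 1024  # 1 MB request/response limit (binary megabyte assumed)
MAX_SECONDS = 10.0            # hard time limit per request

class LimitExceeded(Exception):
    """Raised when a request breaks one of the limits."""

def run_limited(handler, request_body, clock=time.monotonic):
    """Run `handler` and reject oversized payloads or overlong runs."""
    if len(request_body) > MAX_BODY_BYTES:
        raise LimitExceeded("request larger than 1 MB")
    started = clock()
    response = handler(request_body)
    if clock() - started > MAX_SECONDS:
        raise LimitExceeded("request took longer than 10 seconds")
    if len(response) > MAX_BODY_BYTES:
        raise LimitExceeded("response larger than 1 MB")
    return response

ok = run_limited(lambda body: body.upper(), b"ping")
try:
    run_limited(lambda body: body, b"x" * (MAX_BODY_BYTES + 1))
    too_big = False
except LimitExceeded:
    too_big = True
```

The `clock` parameter exists so the deadline check can be exercised with a fake clock instead of waiting ten seconds.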
Web-based Admin Console
App Engine Developer Account
• Hundreds of examples exist
  • Tools, Communication, Games, News, Finance, Sports, Lifestyle, Technology, Enterprise
• Register as a developer
  • http://code.google.com/appengine
• Free to get started
  • 500 MB storage
  • 2 GB bandwidth / day
  • ~5 million page views / month
• Pay-per-use if you need more
Billable Quota Unit Cost
• Free quota usage corresponds to approx. 5 million page hits per month
• Max. daily budget can be set
• Backend services for long-lasting jobs
http://code.google.com/intl/de-DE/appengine/downloads.html
Click to Deploy from IDE / SDK
Switch to Live Demo Now
• Eclipse: Integrated Development Environment (IDE)
  • http://www.eclipse.org
• Java Plugin for Google App Engine
  • http://code.google.com/appengine/docs/java/tools/eclipse.html
Summary
• Google App Engine
  • Development of Web Applications
  • Scalable and secure platform service
• Features
  • Large file uploads and downloads
  • Datastore import and export for large volumes
  • Pay-as-you-go billing (for resource usage over free quota)
  • Python and Java support
  • Monitoring of Web applications
• Learn more
  • http://code.google.com/appengine