Page 1: Cloud Slides

Cloud Computing and Scalable Data Management Solutions on Cloud Computing Platforms

Sherif Sakr

National ICT Australia (NICTA)

[email protected]

20 March 2012

S. Sakr (NICTA) Cloud Computing 20 March 2012 1 / 134

Page 2: Cloud Slides

Outline

Introduction to Cloud Computing

Amazon Web Services (Amazon EC2, Amazon S3, Amazon SimpleDB)

NoSQL Database Systems

Google BigTable

Yahoo! PNUTS

Amazon Dynamo

Other Systems

Database-as-a-Service (DaaS)

CloudDB AutoAdmin: Application-Managed Virtualized Database Servers (Our Research)

Page 3: Cloud Slides

Part I

Introduction to Cloud Computing


Page 4: Cloud Slides

Cloud Computing

Recently, there has been a great deal of hype about cloud computing.

Cloud computing tops Gartner’s list of the ten most disruptive technologies of the coming years.

Cloud computing is associated with a new paradigm for provisioning computing infrastructure: it shifts the location of this infrastructure to the network and reduces the costs of managing hardware and software resources.

Businesses and users can access application services on demand from anywhere in the world. This realizes the long-held dream of computing as a utility, where economies of scale drive down the cost of computing infrastructure.

Page 5: Cloud Slides

Cloud Computing

The major enabling features of the cloud include elasticity of resources, a pay-per-use cost model, low time to market, and the perception of unlimited resources and infinite scalability.

Builds on, but differs from, earlier attempts:

Distributed Computing

Distributed Databases

Grid Computing


Page 6: Cloud Slides

Cloud Computing Players

Big players such as Amazon, Google and Microsoft have established data centers for hosting cloud computing applications in various locations around the world.

Page 7: Cloud Slides

Grid Computing to Cloud Computing


Page 8: Cloud Slides

Grid Computing VS Cloud Computing


Page 9: Cloud Slides

Evolution to Cloud Computing


Page 10: Cloud Slides

Cloud Reality: Data Centers


Page 11: Cloud Slides

Cloud Reality: Data Centers


Page 12: Cloud Slides

Cloud Computing Essential Characteristics

On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction.

Broad network access. Capabilities are available over the network and accessed through standard mechanisms by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs).

Resource pooling. The provider’s computing resources are pooled to serve multiple consumers using a multitenant model, with different physical and virtual resources.

Rapid elasticity. Capabilities can be rapidly and elastically provisioned to quickly scale out and rapidly released to quickly scale in.

Measured service. Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing and bandwidth). Resource usage can be monitored and reported, providing transparency for both providers and consumers.

Page 13: Cloud Slides

Cloud Computing Essential Characteristics


Page 14: Cloud Slides

Economics of Cloud


Page 15: Cloud Slides

Economics of Cloud


Page 16: Cloud Slides

Cloud Computing Service Models

Infrastructure as a Service (IaaS): Provisions resources such as servers (often in the form of virtual machines), network bandwidth, storage, and related tools necessary to build an application environment from scratch.

Platform as a Service (PaaS): Provides a higher-level environment where developers can write customized applications. The maintenance, load-balancing and scale-out of the platform are done by the service provider, so developers can concentrate on the main functionality of their applications.

Software as a Service (SaaS): Refers to special-purpose software made available through the Internet. End users therefore do not need to manually download, install, configure or run the software on their own computing environments.

Page 17: Cloud Slides

Cloud Computing Service Models

Infrastructure as a Service (IaaS)

Platform as a Service (PaaS)

Software as a Service (SaaS)


Page 18: Cloud Slides

Cloud Computing Deployment Models

Private cloud. A cloud that is used exclusively by one organization. It may be managed by the organization or a third party and may exist on premise or off premise. A private cloud offers the highest degree of control over performance, reliability and security. However, private clouds are often criticized for being similar to traditional proprietary server farms and for not providing benefits such as the absence of upfront capital costs.

Community cloud. The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).

Page 19: Cloud Slides

Cloud Computing Deployment Models

Public cloud. The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services (e.g. Amazon, Google, Microsoft). In practice, public clouds offer several key benefits to service consumers, including no initial capital investment in infrastructure and the shifting of risks to infrastructure providers. However, public clouds lack fine-grained control over data, network and security settings, which may hamper their effectiveness in many business scenarios.

Hybrid cloud. The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load-balancing between clouds).

Page 20: Cloud Slides

Public Cloud VS Private Cloud


Page 21: Cloud Slides

Part II

Amazon Web Services


Page 22: Cloud Slides

Amazon Web Services (AWS)

AWS

http://aws.amazon.com/


Page 23: Cloud Slides

Amazon Data Centers

Amazon AWS

Jeff Barr. Amazon Web Services: Building Blocks

Page 24: Cloud Slides

Amazon Simple Storage Service (S3)

AWS S3

http://aws.amazon.com/s3/


Page 25: Cloud Slides

Amazon S3 Concepts

Objects:

Opaque data to be stored (1 byte to 5 Gigabytes)

Authentication and access controls

Buckets:

Object container (any number of objects)

100 buckets per account

Keys:

Unique object identifier within bucket

Up to 1024 bytes long

Flat object storage model

Standards-Based Interfaces: REST and SOAP

URL-Addressability (every object has a URL)

Page 26: Cloud Slides

S3 API

Service:

ListAllMyBuckets

Buckets:

CreateBucket

DeleteBucket

ListBucket

GetBucketAccessControlPolicy

SetBucketAccessControlPolicy

GetBucketLoggingStatus

SetBucketLoggingStatus

Objects:

PutObject

PutObjectInline

GetObject

GetObjectExtended

DeleteObject


Page 27: Cloud Slides

S3 API

Establish Connection:

    require 'S3'
    AWS_ACCESS_KEY_ID = '<your key>'
    AWS_SECRET_ACCESS_KEY = '<your key>'
    # The third argument (false) is presumably the library's "use SSL" flag
    conn = S3::AWSAuthConnection.new(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, false)

Create Bucket:

    BUCKET_NAME = 'assets.example.com'
    conn.create_bucket(BUCKET_NAME)

Page 28: Cloud Slides

Amazon Elastic Compute Cloud (EC2)

AWS EC2

http://aws.amazon.com/ec2/


Page 29: Cloud Slides

Amazon Elastic Compute Cloud (EC2)


Page 30: Cloud Slides

Amazon EC2 Concepts

Amazon Machine Image (AMI):

Bootable root disk stored in S3

Pre-defined or user-built

Catalog of user-built AMIs

OS: Fedora, CentOS, Gentoo, Debian, Ubuntu, Windows Server

App Stack: LAMP, mpiBLAST, Hadoop

Instance:

Running copy of an AMI

Launch in less than 2 minutes

Start/stop programmatically

Network Security Model:

Explicit access control

Security groups

Inter-service bandwidth is free

Page 31: Cloud Slides

EC2 API

Images:

RegisterImage

DescribeImages

DeregisterImage

Instances:

RunInstances

DescribeInstances

TerminateInstances

RebootInstances

Keypairs:

CreateKeyPair

DescribeKeyPairs

DeleteKeyPair

Image Attributes:

ModifyImageAttribute

DescribeImageAttribute

ResetImageAttribute

Page 32: Cloud Slides

Amazon Simple Queue Service (SQS)

AWS SQS

http://aws.amazon.com/sqs/


Page 33: Cloud Slides

SQS API

Queues:

ListQueues

DeleteQueue

Messages:

SendMessage

ReceiveMessage

DeleteMessage

Page 34: Cloud Slides

Amazon SimpleDB

AWS SimpleDB

http://aws.amazon.com/simpledb/


Page 35: Cloud Slides

SimpleDB Concepts

Domain:

Collection of similar items (similar to a database table)

Query language

Any number of items per domain (10 GB beta limit)

100 domains per account

Item:

Collection of key-value pairs (attributes) (similar to a database row)

Multiple values per attribute

Up to 256 attributes per item

Up to 1024 bytes per value

Billing:

Data storage

CPU utilization

Page 36: Cloud Slides

SimpleDB API

Domains:

CreateDomain

ListDomains

DeleteDomain

Items:

PutAttributes

GetAttributes

Query

Sample Queries:

'Title' = 'The Right Stuff'

'Number of Pages' < '00310'

'Rating' = '***' or 'Rating' = '*****'

'Year' > '1950' and 'Year' < '1960' or 'Year' starts-with '193' or 'Year' = '2007'
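SimpleDB compares attribute values as strings, which is why the sample query writes the page count as '00310'. A minimal Python sketch of the effect follows; the `pad` helper and the width of 5 are assumptions chosen only to match the slide, not part of any SimpleDB API.

```python
def pad(number: int, width: int = 5) -> str:
    """Zero-pad a non-negative integer so that string order matches numeric order."""
    return str(number).zfill(width)

# Unpadded strings compare character by character, so "1000" sorts before "310".
unpadded_misorders = "1000" < "310"

# With padding, lexicographic order and numeric order agree.
padded_misorders = pad(1000) < pad(310)

# Mirrors the sample query 'Number of Pages' < '00310' over a few page counts.
pages = [95, 1000, 310, 42]
matches = [p for p in pages if pad(p) < "00310"]
```

The same padding idea applies to any numeric attribute that has to take part in range comparisons.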


Page 37: Cloud Slides

Amazon Stack of Services


Page 38: Cloud Slides

Part III

Scalable Data Management on Cloud Platforms

Page 39: Cloud Slides

Cloud Data Management

One of the main goals of the next wave of Cloud Computing is to facilitate the job of implementing every application as a distributed, scalable and widely-accessible service on the Web.

Cloud Computing has provided the chance for deploying novel applications which were not economically feasible in a traditional enterprise infrastructure setting.

We are witnessing a proliferation in the number of applications with a tremendous increase in the scale of the data generated as well as consumed by such applications.

Page 40: Cloud Slides

Cloud Data Management

Recent advances in Web technology have made it easy for any user to provide and consume content of any form. For example, building a personal Web page (e.g. Google Sites), starting a blog (e.g. WordPress, Blogger, LiveJournal) and making both searchable by the public have now become a commodity.

Facebook serves 570 billion page views per month, stores 3 billion new photos every month, manages 25 billion pieces of content (e.g. status updates, comments) every month and runs its services over 30K servers.

The cloud-hosted database systems powering these applications form a critical component of their software stack.

Page 41: Cloud Slides

Cloud Data Management

In general, successful cloud data management services should satisfy as many of the following goals as possible:

Availability: They must always be accessible, even when there is a network failure or a whole datacenter has gone offline.

Scalability: They must be able to support very large databases with very high request rates at very low latency.

Elasticity: They must be able to satisfy changing application requirements in both directions (scaling up or scaling down). In particular, the system must be able to gracefully respond to these changing requirements and quickly recover to its steady state.

Performance: On public cloud computing platforms, pricing is structured so that one pays only for what one uses, so the vendor price increases linearly with the requisite storage, network bandwidth, and compute power. Therefore, the system performance has a direct effect on its costs.

Page 42: Cloud Slides

Data Replication and Data Partitioning

Data replication and data partitioning are two well-known strategies for achieving the availability, scalability and performance improvement goals in the distributed data management world.

When the application load increases, there are two main options for achieving scalability at the database tier and enabling the application to cope with more client requests:

Scaling up: aims at allocating a bigger machine with more horsepower (e.g. more processors, memory, bandwidth) to act as a database server.

Scaling out: aims at replicating and partitioning data across more machines.

The scaling up option has the main drawback that large machines are often very expensive, and eventually a physical limit is reached where a more powerful machine cannot be purchased at any cost.

The scaling out model fits well with the pay-as-you-go pricing philosophy.

Page 43: Cloud Slides

Scale Up VS Scale Out


Page 44: Cloud Slides

Options of Cloud-Hosted Database Systems

NoSQL Database System (Not Only SQL) - Key/Value Stores.

Database-as-a-Service (DaaS)

Virtualized Database Servers

Page 45: Cloud Slides

Options of Cloud-Hosted Database Systems


Page 46: Cloud Slides

Part IV

NoSQL Database Systems


Page 47: Cloud Slides

Web Tiers


Page 48: Cloud Slides

Internet Scalability


Page 49: Cloud Slides

Internet Scalability: Pragmatics


Page 50: Cloud Slides

Database Partitioning

Partitioning-based approach:

Works for medium-scale applications, but does not work at very large scale

Enterprises such as Google, Yahoo!, and Amazon:

Needed to support tens of millions of users

Furthermore, concurrent usage is very high

Traditional DBMS became a severe bottleneck

Emergence of custom-built solutions for data management for large-scale Web applications

Page 51: Cloud Slides

Driving Forces

Modern Web applications: unprecedented data management challenges

Foremost requirements and features:

Scalability:

Popular apps scale to millions of users

Support rapid growth with minimal operational effort

Low response times:

Low latency at page loads

Users geographically dispersed

High availability:

Service must continue despite server outages, network partitions, and other failures, even at the cost of risking some data consistency. For example, if Google cannot serve ads, it does not get paid; if it cannot render pages, it disappoints users.

Simplified query needs (no joins, aggregations)

Relaxed consistency needs:

Applications can tolerate stale or reordered data

Page 52: Cloud Slides

CAP Theorem

CAP Theorem

Eric A. Brewer. Towards robust distributed systems. PODC, 2000

Page 53: Cloud Slides

Consistency?

Traditional notion of serializability:

Performance, availability, and consistency trade-off

Hence, enforcing or trying to achieve serializability is impractical

Eventual consistency model for replication:

Potentially dangerous due to different order of updates on replicas

Photo-sharing app: <u1: remove MOM from access list>; <u2: post Spring-Break photos>

Consequence: if u1 and u2 are flipped → the result may not be acceptable

Eventual Consistency

Werner Vogels. Eventually consistent. Commun. ACM 52(1), 2009
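The photo-sharing example can be made concrete with a small Python sketch. The `Album` class and its method names are hypothetical; the only point is that two replicas applying the same two updates in different orders can expose different data.

```python
class Album:
    """Toy replica state: an access list plus, per photo, who could see it when posted."""

    def __init__(self):
        self.access_list = {"MOM", "FRIEND"}
        self.visible_to = {}

    def remove_from_access_list(self, user):
        self.access_list.discard(user)

    def post_photo(self, photo):
        # The photo becomes visible to whoever is on the access list at apply time.
        self.visible_to[photo] = set(self.access_list)


def apply_updates(updates):
    """Apply (method name, argument) pairs to a fresh replica, in the given order."""
    replica = Album()
    for name, arg in updates:
        getattr(replica, name)(arg)
    return replica


u1 = ("remove_from_access_list", "MOM")
u2 = ("post_photo", "spring-break")

in_order = apply_updates([u1, u2])   # intended order: MOM is removed first
flipped = apply_updates([u2, u1])    # a replica that sees the updates reordered
```

On the `flipped` replica the photos were published while MOM was still on the access list, which is exactly the outcome the user tried to prevent.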


Page 54: Cloud Slides


Page 55: Cloud Slides

How do I build a cool new web app?

Option 1: Code it up! Make it live!

Scale it later

Flickr, Twitter, MySpace, Facebook, ...


Page 56: Cloud Slides

How do I build a cool new web app?

Option 2: Make it industrial strength!

Evaluate scalable database backends

Evaluate scalable indexing systems

Evaluate scalable caching systems

Architect data partitioning schemes

Architect data replication schemes

Architect monitoring and reporting infrastructure

Write application

Go live

Realize it does not scale as well as you hoped

Rearchitect around bottlenecks

1 year later / ready to go!


Page 57: Cloud Slides


Page 58: Cloud Slides

Part V

Google Bigtable


Page 59: Cloud Slides

Bigtable

Bigtable is designed to reliably scale to petabytes of data and thousands of machines.

Bigtable is designed to support a variety of demanding workloads, which range from throughput-oriented batch-processing jobs to latency-sensitive serving of data to end users.

BigTable

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber. Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!). OSDI, 2006

Page 60: Cloud Slides

Big Table: Overall Architecture

Shared-nothing architecture consisting of thousands of nodes (commodity PCs).

Page 61: Cloud Slides

Big Table

Used in different applications supported by Google.


Page 62: Cloud Slides

Bigtable: Data Model

A sparse, distributed persistent multidimensional sorted map.

Data is partitioned across the nodes seamlessly.

The map is indexed by a row key, column key, and a timestamp.

Each value in the map is an uninterpreted array of bytes.

Example: a web table that uses URLs as row keys and various aspects of web pages as column names, and stores the page contents in the contents: column under the timestamps when they were fetched.
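As a rough illustration of the map described on this slide, the following Python sketch models a table as a dictionary keyed by (row key, column key, timestamp) with uninterpreted bytes as values. `ToyTable`, `put`, `get` and `scan` are hypothetical names, not Bigtable's client API, and an in-memory dict stands in for the distributed, persistent storage.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[str, str, int]   # (row key, column key, timestamp)


class ToyTable:
    def __init__(self):
        self.cells: Dict[Cell, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self.cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str, timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the value at `timestamp`, or the most recent version if none is given."""
        versions = {ts: v for (r, c, ts), v in self.cells.items() if r == row and c == column}
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def scan(self):
        """List cell keys sorted by row key, then column key, then newest timestamp first."""
        return sorted(self.cells, key=lambda k: (k[0], k[1], -k[2]))


table = ToyTable()
table.put("com.example.www", "contents:", 1, b"<html>v1</html>")
table.put("com.example.www", "contents:", 2, b"<html>v2</html>")
table.put("com.example.blog", "contents:", 1, b"<html>blog</html>")
```

The map is sparse in the sense that a row only holds the columns actually written; nothing is stored for absent cells.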


Page 63: Cloud Slides

Rows / Columns

A row key is an arbitrary string.

Every read or write of data under a single row is atomic.

Data is maintained in lexicographic order by row key. For example, in Google Earth, rows are named to ensure that adjacent geographic segments are stored near each other.

The row range for a table is dynamically partitioned.

Each partition (row range) is named a tablet (unit of distribution and load-balancing).

Column keys are grouped into sets called column families.

Hundreds of static column families (e.g. Language:English, Language:German).

Page 64: Cloud Slides

Bigtable API

Create and delete tables and column families

Modify cluster, table, and column family metadata such as access control rights

Write or delete values in Bigtable

Look up values from individual rows

Iterate over a subset of the data in a table

Atomic read-modify-write (R-M-W) sequences on data stored under a single row key. No support for general transactions across row keys, although Bigtable provides an interface for batching writes across row keys at the clients.

Page 65: Cloud Slides

SSTable

A database similar to a BDB database: stores and retrieves key/data pairs.

Cursors to iterate key/value pairs given a selection predicate (exact and range).

Configurable to use either persistent store (disk) or main-memory based.

Page 66: Cloud Slides

Chubby

A persistent and distributed lock service.

Consists of 5 active replicas; one replica is the master and serves requests.

Service is functional when a majority of the replicas are running and in communication with one another (when there is a quorum).

Implements a name service that consists of directories and files.

Finding a tablet

Stores: Key: table id + end row; Data: location

Cached at clients, which may detect data to be incorrect, in which case a lookup on the hierarchy is performed

Also prefetched (for range queries)
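The quorum rule stated for the five-replica lock service (functional while a majority of replicas are running and can talk to one another) can be sketched in a few lines of Python. The function name `has_quorum` is an assumption made for illustration.

```python
def has_quorum(live_replicas: int, total_replicas: int = 5) -> bool:
    """True when a strict majority of replicas is running and reachable."""
    return live_replicas > total_replicas // 2


# Availability of a five-replica service for every possible number of live replicas.
availability = {live: has_quorum(live) for live in range(6)}
```

With five replicas the service tolerates two failures; losing a third replica removes the majority.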


Page 67: Cloud Slides

Big Table uses Chubby to

Ensure there is at most one active master at a time.

Store the bootstrap location of Bigtable data (Root tablet).

Discover tablet servers and finalize tablet server deaths.

Store Bigtable schema information (column family information)

Store access control list.

If Chubby becomes unavailable for an extended period of time, Bigtable becomes unavailable.

A tablet is assigned to one tablet server at a time.

Master maintains the set of live tablet servers.


Page 68: Cloud Slides

Highlights of Bigtable

Separate storage layer from data management.

Restrict activity to one server.

Key-value store with column families.

Fault-tolerance achieved through: Chubby and GFS

Master-based approach for server/tablet management


Page 69: Cloud Slides

Part VI

Yahoo! PNUTS

Page 70: Cloud Slides

PNUTS Overview

Massively parallel and geographically distributed database system

Main focus is low latency for concurrent updates and queries.

Data Model:

Simple relational model

Single-table scans with predicates

Data storage organized as hashed or ordered tables.

PNUTS

Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni. PNUTS: Yahoo!’s hosted data serving platform. PVLDB 1(2), 2008

Page 71: Cloud Slides

PNUTS Overview

Fault-tolerance:

Redundancy at multiple levels: data, meta-data etc.

Leverages relaxed consistency for high availability: reads and writes despite failures

Pub/Sub Message System:

Yahoo! Message Broker for asynchronous updates

Record-level Mastering:

Asynchronous operations to enable record-level mastering

Hosting: Centrally managed database service, shared among many applications → basically, the idea of DaaS and multi-tenant databases.

Page 72: Cloud Slides

PNUTS Overview


Page 73: Cloud Slides

PNUTS Overview

Data Model:

Table of records with attributes

BLOB is a valid data type (exclude image/audio etc.)

Flexible schema:

Attributes can be added dynamically (no mention of dropping attributes)

Records are not required to have values for all attributes (i.e., integrity constraints are minimal)

Query model:

Designed primarily for online serving workloads that consist mostly of queries that read and write single records or small groups of records

Per-record operations (Get, Set, Delete)

Multi-record operations (Multiget, Scan, Getrange)

The query language of PNUTS supports selection and projection from a single table

Updates and deletes must specify the primary key

Caveats:

No referential integrity

No complex operations: joins, group-by, etc.

Page 74: Cloud Slides

PNUTS Overview


Page 75: Cloud Slides

Consistency Model

Hide the complexity of data replication

Between the two extremes:

One-copy serializability

Eventual consistency

Key assumption:

Applications manipulate one record at a time

Per-record timeline consistency:

All replicas of a record preserve the update order

One replica designated as a master (per record). All updates forwarded to that master.

Pub/Sub Mechanism: Reliability, Replication

No traditional database logging mechanism. Rely on guaranteed delivery

Pub/sub log for redo recovery: Replaying updates if necessary

Page 76: Cloud Slides

API Calls

Read-Any: Returns (possibly) a stale version of the record

Read-Critical(required-version): Version ≥ required-version

Read-Latest: Executed at the master

Write: ACID guarantees with a single write operation

TestAndSet(required-version): Performs the write if and only if the present version of the record equals required-version

→ Synchronizes concurrent writers, optimistically
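The API calls listed on this slide can be illustrated with a toy single-record model in Python. The `Record` class, its method names, and the choice to fall back to the master when a replica is too old are assumptions for this sketch; it is not the PNUTS client library.

```python
class Record:
    """Toy single-record store: one master copy plus one possibly stale replica."""

    def __init__(self, value):
        self.master = (1, value)     # (version, value) at the record's master
        self.replica = (1, value)    # a replica that may lag behind the master

    def write(self, value):
        version = self.master[0] + 1
        self.master = (version, value)   # all updates are serialized at the master
        return version

    def propagate(self):
        self.replica = self.master       # asynchronous propagation, modelled as a call

    def read_any(self):
        return self.replica              # may be stale

    def read_latest(self):
        return self.master               # executed at the master

    def read_critical(self, required_version):
        # Serve from the replica only if it is new enough, else go to the master.
        return self.replica if self.replica[0] >= required_version else self.master

    def test_and_set_write(self, required_version, value):
        # Write only if the current version equals the version the caller saw.
        if self.master[0] != required_version:
            return None
        return self.write(value)


rec = Record("v1")
v2 = rec.write("v2")                          # replica has not caught up yet
stale = rec.read_any()
fresh = rec.read_critical(v2)
lost_race = rec.test_and_set_write(1, "x")    # caller saw version 1, master is at 2
won_race = rec.test_and_set_write(2, "v3")
```

The failed `test_and_set_write` is how two concurrent writers are synchronized optimistically: the loser re-reads the record and retries.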


Page 77: Cloud Slides

PNUTS Detailed Architecture


Page 78: Cloud Slides

Physical Data Storage and Retrieval

Tables horizontally partitioned (TABLETS)

Tablets:

Scattered across many servers

100s to 1000s of Tablets/server

Each tablet stored at a region

100s Mbytes / Few Gbytes per Tablet

1000s to 10s of thousands of records/Tablet

Tablet → server assignment is flexible:

Load balancing

Fault tolerance

The router stores an interval mapping, which defines the boundaries of each tablet, and also maps each tablet to a storage unit.
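A minimal Python sketch of the router's interval mapping for an ordered table: sorted boundaries define the tablets, and a second map assigns each tablet to a storage unit. The `Router` class, the boundary values and the storage-unit names are made up for illustration; a hashed table would apply the same lookup to the hash of the key.

```python
import bisect


class Router:
    """Maps a key to a tablet via sorted interval boundaries, then to a storage unit."""

    def __init__(self, boundaries, tablet_to_unit):
        # boundaries[i] is the exclusive upper bound of tablet i; the last tablet is unbounded.
        self.boundaries = boundaries
        self.tablet_to_unit = tablet_to_unit

    def tablet_for(self, key: str) -> int:
        return bisect.bisect_right(self.boundaries, key)

    def storage_unit_for(self, key: str) -> str:
        return self.tablet_to_unit[self.tablet_for(key)]


router = Router(
    boundaries=["g", "p"],   # tablet 0: key < "g", tablet 1: "g" <= key < "p", tablet 2: the rest
    tablet_to_unit={0: "su-1", 1: "su-2", 2: "su-1"},
)
```

Because the tablet-to-unit map is separate from the boundaries, a tablet can be moved for load balancing or recovery by changing a single entry.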


Page 79: Cloud Slides

Physical Data Storage and Retrieval


Page 80: Cloud Slides

Per-record Mastering?

Why not a per-tablet master?

A per-record master enables fine-grained control of locality of access

Updates can originate in a non-master region:

Enables master migration from one region to another

No impact on the tablet

Currently N=3 consecutive non-local updates trigger migration
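The migration rule can be sketched as a small counter in Python. The `RecordMaster` class is hypothetical, and the sketch assumes that the consecutive non-local updates must all come from the same region, which the slide does not spell out.

```python
class RecordMaster:
    """Tracks which region masters a record and migrates after N consecutive remote updates."""

    def __init__(self, master_region: str, threshold: int = 3):
        self.master_region = master_region
        self.threshold = threshold
        self.streak_region = None
        self.streak = 0

    def update_from(self, region: str) -> None:
        if region == self.master_region:
            self.streak_region, self.streak = None, 0   # a local update resets the count
            return
        if region == self.streak_region:
            self.streak += 1
        else:
            self.streak_region, self.streak = region, 1
        if self.streak >= self.threshold:
            self.master_region = region                 # migrate mastership
            self.streak_region, self.streak = None, 0


rec = RecordMaster("west")
for origin in ["east", "east", "west", "east", "east"]:
    rec.update_from(origin)
before = rec.master_region
rec.update_from("east")
after = rec.master_region
```

Only this record's master moves; other records in the same tablet keep their own masters.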


Page 81: Cloud Slides

Yahoo! Message Broker

Updates committed once published

Asynchronous update propagation to other regions

Update not purged until applied to all replicas

Logical ordering of updates (partial order)

Consistency via per-record master


Page 82: Cloud Slides

Part VII

Amazon Dynamo

Page 83: Cloud Slides

Amazon Eco-system

E-commerce platform: serving 10s of millions of users

Objective: reliability and scalability dependent on system design

Highly decentralized architecture: Cannot afford to have dependencies

Storage technology: Always available

Standard modus-operandi: Ubiquitous failures


Page 84: Cloud Slides

Dynamo Design Overview

Data partitioning using consistent hashing

Data replication

Consistency via version vectors

Replica synchronization via quorum protocol

Gossip-based failure-detection and membership protocol

Data and Query Model:

Read/write operations via primary key

No relational schema: <key, value> object

Object size < 1 MB, typically.

Amazon Dynamo

Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, Werner Vogels. Dynamo: Amazon’s highly available key-value store. SOSP, 2007
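Data partitioning with consistent hashing, the first item in the design overview, can be sketched in Python as follows. `HashRing`, `ring_position` and the node names are illustrative assumptions, and the sketch gives each node a single position on the ring, leaving out the virtual nodes that a production system would add.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a node name or key onto the ring (MD5 is used here only as a stable hash)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_position(n), n) for n in nodes)

    def preference_list(self, key: str, replicas: int = 3):
        """First `replicas` distinct nodes met walking clockwise from the key's position."""
        positions = [pos for pos, _ in self.ring]
        start = bisect.bisect_right(positions, ring_position(key))
        count = min(replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


small = HashRing(["node-a", "node-b", "node-c", "node-d"])
large = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"])

keys = [f"key-{i}" for i in range(1000)]
# Keys whose owner changes when node-e joins the ring.
moved = [k for k in keys if small.preference_list(k, 1) != large.preference_list(k, 1)]
```

The property of interest is that adding a node only relocates keys onto the new node; every other key keeps its owner, which is what makes incremental scale-out cheap.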


Page 85: Cloud Slides

Dynamo Design Overview

ACID properties:

None

Only single-key updates

Efficiency: SLAs expressed at the 99.9th percentile of operations

Optimistic/Asynchronous Replication:

Leads to update conflicts

Needs conflict resolution: eventual consistency

Who resolves the conflict:

Data store: limited choices; syntactic: last write wins.

Application: semantic: case-by-case

Symmetry:

Peer-based design

Principle of equal responsibility
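Conflict detection with version vectors, followed by an application-level (semantic) merge, can be sketched in Python. The vector layout (node name to counter), the `compare` labels and the shopping-cart merge are assumptions made for illustration.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors mapping node name -> update counter."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"      # concurrent updates: reconciliation is needed
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge_carts(cart_a: set, cart_b: set) -> set:
    """Semantic resolution by the application: keep every item seen in either version."""
    return cart_a | cart_b


v1 = {"sx": 2}
v2 = {"sx": 2, "sy": 1}   # descends from v1 via an update handled by node sy
v3 = {"sx": 2, "sz": 1}   # descends from v1 via a concurrent update handled by node sz
```

Here `v2` and `v3` are siblings: neither dominates the other, so the store cannot pick a winner without either discarding data (last write wins) or asking the application to merge.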


Page 86: Cloud Slides

Service Level Agreements (SLA)

For an application to deliver its functionality in a bounded time, every dependency in the platform needs to deliver its functionality with even tighter bounds.

Example: a service guaranteeing that it will provide a response within 300 ms for 99.9% of its requests for a peak client load of 500 requests per second.
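A percentile-based SLA of this kind can be checked with integer arithmetic. The Python sketch below uses a hypothetical `meets_sla` function that expresses the 99.9% target as 999 per mille to avoid floating-point rounding; the latency samples are invented.

```python
def meets_sla(latencies_ms, limit_ms: int = 300, per_mille: int = 999) -> bool:
    """True if at least per_mille/1000 of the requests finished within limit_ms."""
    within = sum(1 for latency in latencies_ms if latency <= limit_ms)
    return within * 1000 >= per_mille * len(latencies_ms)


# 1000 requests with a single slow outlier: exactly 99.9% are within the limit.
good = [20] * 999 + [900]

# Two slow requests out of 1000 already exceed the 0.1% budget.
bad = [20] * 998 + [900, 900]
```

An average would hide both outliers, which is why the agreement is stated at a high percentile rather than as a mean latency.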


Page 87: Cloud Slides

Dynamo Design Overview


Page 88: Cloud Slides

Part VIII

Other NoSQL Systems


Page 89: Cloud Slides

NoSQL Database Systems

NoSQL Database Systems

http://nosql-database.org/


Page 90: Cloud Slides

Alternative Design Decisions

Data Model: Key/Value, Row Store, Graph-Oriented, Document-Oriented

Access-Path Optimization: Read-intensive VS Write-intensive, Single-Key VS Multi-Key

Data Partitioning: Row-Oriented, Column-Oriented, Multi-Column Oriented

Concurrency Management: Strong Consistency, Eventual Consistency, Weak Consistency

Replication Management: Single Master, Multi Master, 2PHC

CAP Theorem

Page 91: Cloud Slides

Alternative Design Decisions


Page 92: Cloud Slides

Too Many Choices Often Make Us Confused :(

We have choices, but ...

There is a wide variety of NoSQL systems in terms of their functional and non-functional offerings.

Developers need to understand many things:

Trade-offs among technologies/configurations.

How to fulfill applications’ requirements: which storage to use, with what configurations.

It’s not trivial!


Page 93: Cloud Slides

General Limitations of NoSQL Database Systems

Programming Model: NoSQL databases offer few facilities for ad-hoc query and analysis. Even a simple query requires significant programming expertise. The lack of support for declaratively expressing the important join operation has always been considered one of the main limitations of these systems.

Transaction Support: The limited support (if any) for transactions in NoSQL database systems is considered a big obstacle to their acceptance for implementing mission-critical systems.

No Support for Multiple Data Centers: Except for Cassandra, all open source NoSQL projects are designed to run on a cluster of machines in a single data center; they are not designed to run across multiple data centers.


Page 94: Cloud Slides

General Limitations of NoSQL Database Systems

Maturity and Support: Most NoSQL alternatives are in pre-production versions, with many key features either not stable enough or yet to be implemented. Therefore, enterprises are still approaching this new wave with extreme caution.

Expertise: Almost every NoSQL developer is in a learning mode. This situation will resolve itself naturally over time. However, currently, it is far easier to find experienced RDBMS programmers or administrators than a NoSQL expert.


Page 95: Cloud Slides

Part IX

Database-as-a-Service (DaaS)


Page 96: Cloud Slides

Database-as-a-Service (DaaS)

Database-as-a-service (DaaS) is a new paradigm for data management in which a third party service provider hosts a database as a service.

The service provides data management for its customers and thus alleviates the need for the service user to purchase expensive hardware and software, deal with software upgrades and hire professionals for administrative and maintenance tasks.

For example, Amazon RDS provides access to the capabilities of MySQL or Oracle database while Microsoft SQL Azure has been built on Microsoft SQL Server technologies.


Page 97: Cloud Slides

Database-as-a-Service (DaaS)

Users of these services can leverage the capabilities of traditional relational database systems such as creating, accessing and manipulating tables, views, indexes, roles, stored procedures, triggers and functions. They can also execute complex queries and joins across multiple tables.

The migration of the database tier of any software application to a relational database service is supposed to require minimal effort if the underlying RDBMS of the existing software application is compatible with the offered service.


Page 98: Cloud Slides

Amazon RDS

Making it easy to set up, operate and scale relational databases in the cloud since 2009

Deploy a pre-configured, resizable MySQL Database Instance in minutes via the AWS Management Console

Let Amazon RDS manage automated backups, software patching, replication for fault tolerance and read scaling

Compatible with existing MySQL apps and tools

Pay by the hour (rates vary by DB Instance class and region)


Page 99: Cloud Slides

Amazon RDS Multi-AZ Deployments

Enterprise-grade, fault-tolerant solution for production databases

What is a Multi-AZ deployment? With a single API call, Amazon RDS creates and synchronously maintains a hot standby in a different availability zone

In the event of an unplanned or planned outage, Amazon RDS automatically fails over to the standby so you can resume database writes and reads as soon as possible


Page 100: Cloud Slides

Amazon RDS


Page 101: Cloud Slides

Amazon RDS Read Replicas

A Read Replica is a copy of a specified DB Instance that can serve read traffic

Intended use cases: read scaling, business reporting

Not intended as a fault-tolerance substitute for Multi-AZ

Unlike Multi-AZ, uses native, asynchronous MySQL replication, so a replica can lag behind its source

Read Replica can use Multi-AZ deployment as source
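
To illustrate the read-scaling use case, here is a minimal application-side routing sketch. It assumes the application picks the endpoint itself; the `ReplicaRouter` class, the endpoint names, and the naive `SELECT` check are hypothetical and are not part of the Amazon RDS API.

```python
import itertools


class ReplicaRouter:
    """Send writes to the source instance and spread reads over read replicas."""

    def __init__(self, source, replicas):
        self.source = source
        # Fall back to the source when no replica is configured.
        self._readers = itertools.cycle(replicas or [source])

    def endpoint_for(self, sql):
        """Pick an endpoint name for a SQL statement (very naive classification)."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.source


router = ReplicaRouter("source-db", ["replica-1", "replica-2"])
print(router.endpoint_for("SELECT * FROM users"))          # replica-1
print(router.endpoint_for("UPDATE users SET name = 'x'"))  # source-db
print(router.endpoint_for("SELECT 1"))                     # replica-2
```

Because replication is asynchronous, a read routed to a replica may not yet see a write that was just sent to the source; reads that must observe the latest write would have to go to the source instead.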


Page 102: Cloud Slides

Limitations of Database-as-a-Service (DaaS)

The user does not have full control over provisioning the hardware resources needed to achieve their performance and scalability goals.

Many relational database systems are not yet supported by the DaaS paradigm (e.g. DB2, Postgres).

Some limitations or restrictions might be introduced by the service provider for different reasons, such as database count, size limit and connection constraints.


Page 103: Cloud Slides

Part X

CloudDB AutoAdmin (Our Research)


Page 104: Cloud Slides

Virtualized Database Servers

Virtualization is a key technology of the cloud computing paradigm.

Virtual machine technologies are increasingly being used to improve the manageability of software systems and lower their total cost of ownership.

Virtualization allows resources to be allocated to different applications on demand and hides the complexity of resource sharing from cloud users by providing a powerful abstraction for application and resource provisioning.

Resource virtualization technologies add a flexible and programmable layer of software between applications and the resources used by these applications.

Database servers, like any other software components, are migrated to run in virtual machines (e.g. Amazon EC2 Instances).


Page 105: Cloud Slides

Virtualized Database Servers


Page 106: Cloud Slides

Advantages of Virtualized Database Servers

The easiest option for migrating the database tier of an existing software application.

The application can have full control in dynamically allocating and configuring the physical resources of the database tier (database servers) as needed.

Software applications can fully utilize the elasticity feature of the cloud environment to achieve their defined and customized scalability or cost reduction goals.

Enables software applications to build their own geographically distributed database clusters. Without the cloud, building such an in-house cluster would require self-owned infrastructure, an option that only big enterprises can afford.


Page 107: Cloud Slides

Advantages of Virtualized Database Servers

[Figure labels: Replica 1; Replica 2; Interact; Synchronize; Interact]


Page 108: Cloud Slides

Limitations of Virtualized Database Servers

The onus is on the software application to handle:

Dynamic provisioning of the physical resources allocated to the database tier (elasticity).

When to provision a database?

How to provision a database?

Monetary cost management.

Management of the application-defined SLA.

Data replication management.


Page 109: Cloud Slides

CloudDB AutoAdmin / Vision

[Figure labels: Application Code (Consumer); CloudDB AutoAdmin; Cloud Database Service (Provider); Distributed, scalable and widely-accessible Database Application]

CloudDB AutoAdmin

Sherif Sakr, Liang Zhao, Hiroshi Wada, Anna Liu. CloudDB AutoAdmin: Towards a Truly Elastic Cloud-Based Data Store. ICWS, 2011.


Page 110: Cloud Slides

Dynamic Provisioning

Application workloads are quite dynamic, especially for Internet-scale applications.

Application workloads are quite different.

Different applications have different requirements. Different modules of the same application may have different goals (SLA) and consequently use different design decisions (rationing approach / pay only when it matters).


Page 111: Cloud Slides

Monetary Cost

The main goal of many enterprises in using cloud environments is to reduce costs rather than to improve performance.

Evaluation figures where the Y-axis represents the monetary cost rather than the performance characteristics will be quite common (if not mandatory) shortly.

The performance variability of public cloud providers and its direct effect on cost need to be considered (performance SLA).


Page 112: Cloud Slides

Admission Control

Cloud DB Administrator is a poor guy.

In practice, finding the right configuration of cloud databases (design decisions) in order to guarantee high performance or reduce the monetary cost is not a trivial task at all.

Reacting to the dynamic characteristics of application workloads (e.g. elasticity spikes) might not be affordable within a short period, considering that the reaction should keep momentum and avoid any unnecessary waste of money.

Automated/semi-automated techniques for managing the database tier in cloud environments are quite important and highly required.

Many resource-based approaches have been proposed. Our approach is SLA-based.


Page 113: Cloud Slides

SLA Management for Cloud Hosted Database

According to a Gartner market report released in November 2010, SaaS is forecast to have a 15.8% growth rate through 2014, which makes SaaS and cloud very interesting to the services industry.

The viability of the business models depends on the practicality and the success of the terms and conditions (SLAs) being offered by the service provider(s), in addition to how well they satisfy the service consumers.

Successful SLA management is a critical factor to be considered by both providers and consumers alike.

An SLA indicates the level of service agreed upon as well as the associated cost if the service provider fails to deliver that level of service.


Page 114: Cloud Slides

SLA Management for Cloud Hosted Database

Currently, cloud providers do not provide adequate SLAs for their service offerings. In particular, most providers guarantee only the availability (but not the performance) of their services.

One complexity that arises with virtualization technology is that it becomes harder to provide performance guarantees and to reason about a particular application’s performance.

The performance of an application hosted on a virtual machine becomes a function of the applications running in other virtual machines hosted on the same physical machine.

Several studies have reported that the variation of the performance of cloud computing resources is high.

The burden is on cloud consumers to assure their applications’ SLA (e.g. performance, throughput, response time) to their customers.

Mapping an application-defined SLA to low-level resource utilization is a very complex and challenging task. Our approach is declarative-based.


Page 115: Cloud Slides

SLA Management for Cloud Hosted Database

Cloud Infrastructure SLA (I-SLA): these SLAs are offered by cloud infrastructure providers to cloud consumers to assure the quality levels of their cloud computing resources (e.g., server performance, network speed, resource availability, storage capacity).

Cloud-hosted Application SLA (A-SLA): these guarantees relate to the levels of quality for an application which is deployed on a cloud infrastructure. In particular, cloud consumers often offer such guarantees to their application’s customers/users to assure the quality of the services they offer to them, such as the application’s response time and availability.


Page 116: Cloud Slides

SLA Management for Cloud Hosted Database


Page 117: Cloud Slides

SLA Management for Cloud Hosted Database

Infrastructure service providers charge business service providers for renting computing resources to deploy their applications.

Software service providers may charge the users for processing their workloads (e.g. Software-as-a-Service) or may process the user requests for free (cloud-hosted business application).

In both cases, the software service provider needs to guarantee its users’ SLA. Penalties are applied in the case of SaaS, and reputation loss is incurred in the case of cloud-hosted business applications.

For example, Amazon found that every 100ms of latency cost them 1% in sales, and Google found that an extra 500ms in search page generation time dropped traffic by 20%.


Page 118: Cloud Slides

CloudDB AutoAdmin


Page 119: Cloud Slides

CloudDB AutoAdmin

CloudDB AutoAdmin is an end-to-end framework for declarative management of the application-defined SLA for cloud-hosted databases.

An SLA is a contract between a service provider and its customers. In practice, there exist many forms of SLAs with different metrics (e.g. response time, throughput, availability, etc).

The SLA of the application is declaratively defined in terms of goals which are subject to a number of constraints that are specific to the running application.

The defined SLA is fed to the SLA checker component, which continuously evaluates it against the metrics reported by a monitoring module and triggers the execution of necessary corrective actions (e.g. scaling out/in the database tier) when required.


Page 120: Cloud Slides

CloudDB AutoAdmin

The triggering of these corrective actions is based on a set of declarative application-defined rules where the expected system goals (performance or cost) are optimized according to the defined SLA.

Examples of declarative rules are:

Scale out the underlying database tier if the average percentage of SLA violation for transactions T1 and T2 exceeds 10% for a continuous period of more than 8 minutes.

Scale in the database tier if the average percentage of SLA violation for transactions T1 and T2 is less than 2% for a continuous period of more than 8 minutes and the average number of concurrent users per underlying database replica is less than 25.

The framework is designed so that it acts as an admission control to achieve the SLA management goal for the database tier of software applications non-intrusively, with zero changes to their code.
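
The two example rules above can be sketched as a small decision function. This is a hedged illustration only: the `Observation` fields and the `decide` function are hypothetical names, and the slides do not show how CloudDB AutoAdmin actually encodes or evaluates its rules.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Metrics that have held over one continuous period (hypothetical fields)."""
    avg_violation_pct: float   # average % of SLA violations for T1 and T2
    period_minutes: float      # length of the continuous period
    users_per_replica: float   # average concurrent users per database replica


def decide(obs: Observation) -> str:
    """Evaluate the two example rules and return the action to take."""
    if obs.avg_violation_pct > 10 and obs.period_minutes > 8:
        return "scale_out"
    if (obs.avg_violation_pct < 2 and obs.period_minutes > 8
            and obs.users_per_replica < 25):
        return "scale_in"
    return "no_action"


print(decide(Observation(12.0, 9.0, 40.0)))  # scale_out
print(decide(Observation(1.0, 10.0, 20.0)))  # scale_in
print(decide(Observation(5.0, 10.0, 20.0)))  # no_action
```

The gap between the 10% and 2% thresholds leaves a band in which no action is taken, which avoids oscillating between scaling out and scaling in.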


Page 121: Cloud Slides

CloudDB AutoAdmin / SLA-Based Elasticity Manager


Page 122: Cloud Slides

CloudDB AutoAdmin / SLA-Based Elasticity Manager

Transaction/Workload Monitor: Continuously logs and monitors the executed database operations of the application workloads.

SLA Checker: Responsible for checking the results of the monitoring module and comparing them against the application-defined SLA of the different transaction types in order to report the percentage of SLA violations for each of them (if any). The SLA checker relies on a dynamic scheduler that uses the transaction rate as its main metric. Thus, it ensures that when the load is high, the SLA checks run more frequently, since violations are most likely to happen during these periods.

Action Manager: Continuously evaluates the conditions of the application-defined action rules and executes the necessary action when these conditions are satisfied.
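
One possible way to make the SLA checks run more often under high load is to derive the check interval from the transaction rate. The formula below is an assumption made for illustration (roughly one check per fixed number of transactions, clamped to a minimum and maximum interval); the slides do not specify the scheduler's actual policy, and all parameter names are hypothetical.

```python
def next_check_interval(tx_per_second, tx_per_check=1000,
                        min_interval_s=5.0, max_interval_s=300.0):
    """Seconds until the next SLA check: about one check per `tx_per_check` transactions."""
    if tx_per_second <= 0:
        return max_interval_s  # idle system: check as rarely as allowed
    interval = tx_per_check / tx_per_second
    return max(min_interval_s, min(max_interval_s, interval))


print(next_check_interval(10))   # 100.0
print(next_check_interval(500))  # 5.0
print(next_check_interval(1))    # 300.0
```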


Page 123: Cloud Slides

CloudDB AutoAdmin / SLA-Based Elasticity Manager

TransPrice (Ti, Si, Vi): is the amount of monetary/satisfaction units (Vi) that an end user will pay/get if a cloud consumer application brings the transaction (Ti) to completion according to the defined SLA (Si).

TransPenalty (Ti, Si, Vi): is the amount of monetary/dissatisfaction units (Vi) that the cloud consumer application will pay if it violates the defined SLA (Si) of the end user transaction (Ti).

SLA-Satisfaction (Ti, Pi, Ii): represents the percentage (Pi) of the instances of the application transaction (Ti) that satisfied their defined SLA during the last period of time units (Ii).

SLA-DisSatisfaction (Ti, Pi, Ii): represents the percentage (Pi) of the instances of the application transaction (Ti) that did not fulfill their defined SLA during the last period of time units (Ii).


Page 124: Cloud Slides

CloudDB AutoAdmin / SLA-Based Elasticity Manager

SLA-ViolationMagnitude (Ti, Ni, Mi, Ii): represents the number of the instances (Ni) of the application transaction (Ti) that did not fulfill their defined SLA and the average magnitude of SLA violation (Mi) during the last period of time units (Ii).

SLA-Threshold (Ti, Pi, Ii): defines the minimum percentage (Pi) of the instances of the application transaction (Ti) that should fulfill their defined SLA during the last period of time units (Ii).

RreplicaCost (C): is the amount of monetary units (C) which will be paid by the cloud consumer to the cloud service provider (per hour) for renting the required resources for hosting an additional replica of the database tier.

[Min/Max]Rreplicas: represents the boundary limits for the minimum/maximum number of replicas that can be allocated for the database tier of the software application.
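
A minimal sketch of how SLA-Satisfaction, SLA-DisSatisfaction, SLA-ViolationMagnitude and SLA-Threshold could be computed for one transaction type from the response times observed in the last period. The function names are hypothetical, and measuring the violation magnitude as milliseconds above the SLA is an assumption; the slides do not define its unit.

```python
def sla_metrics(response_times_ms, sla_ms):
    """Summarise one transaction type over the last period.

    Returns (satisfaction_pct, dissatisfaction_pct, violations, avg_magnitude_ms).
    """
    total = len(response_times_ms)
    # How far each violating instance exceeded its SLA (assumed unit: ms).
    overshoots = [t - sla_ms for t in response_times_ms if t > sla_ms]
    violations = len(overshoots)
    dissatisfaction = 100.0 * violations / total if total else 0.0
    satisfaction = 100.0 - dissatisfaction
    avg_magnitude = sum(overshoots) / violations if violations else 0.0
    return satisfaction, dissatisfaction, violations, avg_magnitude


def violates_threshold(satisfaction_pct, threshold_pct):
    """SLA-Threshold check: True if too few instances met their SLA."""
    return satisfaction_pct < threshold_pct


# Transaction T1 with a 200 ms SLA, observed over the last period.
times = [100, 150, 250, 300, 120, 180, 90, 110, 130, 170]
sat, dissat, n, mag = sla_metrics(times, sla_ms=200)
print(sat, dissat, n, mag)          # 80.0 20.0 2 75.0
print(violates_threshold(sat, 90))  # True
```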

Page 125: Cloud Slides

Replication Performance on Virtualized Database Servers

[Figure: replication experiment setup using the Cloudstone benchmark. Labels: Master; Slave 1 ... Slave k, Slave k+1 ... Slave n; M write operations; N read operations (distributed); M / N satisfies pre-defined read/write ratio. Scenarios (L1, L2, L3): replication within the same region and the same availability zone; replication within the same region but across availability zones (us-east-1a, us-east-1b); replication across regions (us-east, us-west, eu-west, ap-southeast, ap-northeast).]

Cloud Database Replication

Liang Zhao, Sherif Sakr, Alan Fekete, Hiroshi Wada, Anna Liu. Application-Managed Database Replication on Virtualized Cloud Environments. DMC, 2012.


Page 126: Cloud Slides

Replication Performance on Virtualized Database Servers


Page 127: Cloud Slides

One-Size-Does-Not-Fit-ALL: Consistency Rationing

Consistency Rationing

Tim Kraska, Martin Hentschel, Gustavo Alonso, Donald Kossmann. Consistency Rationing in the Cloud: Pay only when it matters. PVLDB 2(1), 2009.


Page 128: Cloud Slides

CloudDB AutoAdmin / Replication Controller


Page 129: Cloud Slides

CloudDB AutoAdmin / Replication Controller

Provisioning of a new database replica involves extracting database content from an existing replica and copying that content to a new replica.

The time of executing these operations mainly depends on the database size.

To provision database replicas in a timely fashion, it is necessary to periodically snapshot the database state in order to minimize the database extraction and copying time to that of only the snapshot synchronization time.

Obviously, there is a trade-off between the time to snapshot the database, the size of the transactional log and the amount of update transactions in the workload. This trade-off can be controlled by application-defined parameters. It can be further optimized by applying recently proposed live database migration techniques.
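
The snapshot-plus-log idea can be sketched as follows. This is a toy model under stated assumptions: the database is a Python dict, the transactional log is a list of `(lsn, key, value)` records, and `provision_replica` is a hypothetical helper; none of this is the actual CloudDB AutoAdmin implementation. It shows why a more recent snapshot leaves less log to replay.

```python
def provision_replica(snapshot, snapshot_lsn, log):
    """Build a new replica's state from the latest snapshot plus the log tail.

    `snapshot` is a dict of key -> value taken at log sequence number
    `snapshot_lsn`; `log` is a list of (lsn, key, value) update records.
    Returns the replica state and the number of log records replayed.
    """
    state = dict(snapshot)  # copy so the snapshot itself is not modified
    replayed = 0
    for lsn, key, value in sorted(log):
        if lsn > snapshot_lsn:  # only updates made after the snapshot
            state[key] = value
            replayed += 1
    return state, replayed


log = [(1, "a", 1), (2, "b", 2), (3, "a", 3), (4, "c", 4)]

# Old snapshot (taken at LSN 1): three records must be replayed.
print(provision_replica({"a": 1}, 1, log))
# ({'a': 3, 'b': 2, 'c': 4}, 3)

# Fresh snapshot (taken at LSN 3): only one record must be replayed.
print(provision_replica({"a": 3, "b": 2}, 3, log))
# ({'a': 3, 'b': 2, 'c': 4}, 1)
```

Snapshotting more often shortens the replay but costs snapshot time on the existing replica, which is the trade-off the application-defined parameters are meant to control.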


Page 130: Cloud Slides

CloudDB AutoAdmin / Replication Controller


Page 131: Cloud Slides

Road Map / Future Work

[Figure: CloudDB AutoAdmin road map architecture. Labels: Application; Workload; Load Balancer; Cloud-Hosted Database(s); CloudDB AutoAdmin; Workload Monitor; SLA Monitor; Replication Controller; Consistency Controller; Partitioning Controller; Performance Modeler; Financial Cost Modeler; Distributed Trans. Manager; Action Scheduler; Action Predictor; Metadata Manager; Declarative Specification of SLA Requirements; Automated Management of App. SLA Requirements]


Page 132: Cloud Slides

Road Map / Future Work

Replicating the whole database can deal with a volume spike in order to meet the performance requirements of a specific SLA.

A data spike (increasing volume to a specific object or table) may require replicating just a specific shard (partition) of the database in order to tackle the problem in a more efficient, effective and economical way. Such replication of a specific partition should also be done declaratively and transparently to the application code.
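
As a hedged sketch of the shard-level idea, the snippet below finds which partitions receive a disproportionate share of accesses and would therefore be candidates for replication. Hash partitioning with `zlib.crc32`, the share threshold, and the function names are assumptions for illustration; the slides describe this only as future work.

```python
import zlib
from collections import Counter


def shard_of(key, num_shards):
    """Map a key to a shard with a deterministic hash (assumed partitioning scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def hot_shards(accessed_keys, num_shards, share_threshold=0.5):
    """Return the shards that receive more than `share_threshold` of all accesses."""
    counts = Counter(shard_of(k, num_shards) for k in accessed_keys)
    total = sum(counts.values())
    return sorted(s for s, c in counts.items() if c / total > share_threshold)


# A data spike: 90% of the accesses hit a single object.
accesses = ["item-42"] * 90 + ["item-7"] * 10
hot = hot_shards(accesses, num_shards=4)
print(hot == [shard_of("item-42", 4)])  # True
```

Only the shards returned by `hot_shards` would need extra replicas, rather than the whole database.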

Management of distributed transactions.

Adaptive consistency controller.

Advanced techniques for adaptive cost modeling and management.


Page 133: Cloud Slides

Open Challenges

Data Confidentiality: Moving data off premises increases the number of potential security risks, and appropriate precautions must be taken.

Data Lock-In: APIs for cloud computing have not yet been the subject of active standardization. Thus, customers cannot easily extract their data and programs from one site to run on another.

Data Transfer Bottlenecks: Cloud users and cloud providers have to think about the implications of placement and traffic at every level of the system if they want to minimize costs.

Performance Unpredictability: Many HPC applications need to ensure that all the threads of a program are running simultaneously. However, today’s virtual machines and operating systems do not provide this service.

Application Debugging in Large-Scale Distributed Systems: A challenging aspect of cloud computing programming is the removal of errors in these very large-scale distributed systems.

Page 134: Cloud Slides

The End

Thank You
