Seminar: Cloud Computing A Scalable Data Platform for a Large Number of Small Applications Sabbir Ahmmed

Upload: sabbir-ahmmed

Post on 13-Jul-2015


Page 1: Scaling Small App

Seminar: Cloud Computing

A Scalable Data Platform for a Large Number of Small Applications

Sabbir Ahmmed

Page 2: Scaling Small App

9 January 2013 Scaling Small Apps 2

Motivation

Page 3: Scaling Small App

Outline

➔ The Article

➔ Introduction

➔ System Architecture

➔ Failure Management in a Cluster

➔ Enforcing Database SLAs

➔ Experimental Evaluation

➔ Conclusion

Page 4: Scaling Small App

The Article

Work done by: Fan Yang, Jayavel Shanmugasundaram, Ramana Yerneni
Work done at: Yahoo! Research
Published in: 2009 (CIDR)
Published under: Creative Commons License Agreement

This paper is NOT about:

➔ File systems / low-level internals (database vendors' wizardry)
➔ Database paradigms (relational, NoSQL, or any other kind)

This paper is about:

➔ Data platform / data-management solution (SaaS, PaaS, IaaS)
➔ Design space / system architecture (for cloud service providers)

Page 5: Scaling Small App

Introduction (I)

➔ Small/community applications
  ➔ Relatively small data size (tens to thousands of megabytes)
  ➔ Small throughput requirement (tens to hundreds of concurrent user sessions)
  ➔ Comfortably fit on a single machine
➔ However
  ➔ A large social network hosts a large number of such applications
    ➔ Tens of thousands!
  ➔ The combined data size and workload are quite large
    ➔ Petabytes of data and millions of concurrent users!

Page 6: Scaling Small App

Introduction (II)

➔ Problems with existing data-management solutions
  ➔ Commercial database systems
    ➔ Scale well
    ➔ Large monetary cost (licensing at a premium)
  ➔ Open-source alternatives
    ➔ Free
    ➔ Do not scale well
  ➔ Peer-to-peer (DHTs, ordered tables)
    ➔ Excellent scalability and throughput performance
    ➔ Only support very simple data-manipulation operations

Page 7: Scaling Small App

Introduction (III)

➔ Problems with existing data-management solutions
  ➔ Other emerging data platforms (e.g., Bigtable, PNUTS, SimpleDB)
    ➔ Scale to a large number of data operations
    ➔ Lack rich query-processing capabilities (crucial for most web apps!)
    ➔ Restrict the kinds of queries applications can issue!

Finally, all of the above solutions lack multi-tenancy support!

Page 8: Scaling Small App

Introduction (IV)

➔ The goal of this paper is to design a data-management solution that:
  ➔ Is low cost
  ➔ Is full-featured
  ➔ Is multi-tenancy capable
  ➔ Uses commodity hardware and free software (MySQL)
  ➔ Exploits two main properties of the applications:
    ➔ They are "small"
    ➔ They can comfortably "fit in a single machine"
  ➔ Addresses two main challenges:
    ➔ Fault tolerance
    ➔ Ensuring SLAs

Page 9: Scaling Small App

System Architecture (I)

Page 10: Scaling Small App

System Architecture (II)

➔ A few more words about the proposed architecture
  ➔ The system controller, colo controller, and cluster controller need to be fault tolerant
  ➔ Two main types of failures need to be handled:
    ➔ Machine failures within a colo (handled by synchronous replication within a colo)
    ➔ Colo failures (handled by asynchronous replication across colos)

Page 11: Scaling Small App

Failure Management in a Cluster (I)

➔ Replication architecture
  ➔ Uses single-node DBMSs
  ➔ Each database is hosted on two or more machines
  ➔ Coordinated by a cluster controller, which:
    ➔ Maintains a map of databases to machines
    ➔ Manages all DB connections to the client
    ➔ Maintains a DB connection to each machine
    ➔ Uses a read-one write-all replication protocol
    ➔ Uses the two-phase commit (2PC) protocol

[Figure: Cluster Controller and DB Cluster]
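
The read-one write-all and two-phase commit bullets on this slide can be illustrated with a minimal in-memory sketch. It is not the paper's implementation: the class names `Replica` and `ClusterController`, their methods, and the dictionary-based storage are all hypothetical stand-ins for MySQL instances and the real controller. The sketch also treats any "no" vote as an abort and does not model dropping a failed machine from the replica set.

```python
class Replica:
    """Hypothetical in-memory stand-in for one machine hosting a database copy."""

    def __init__(self, name):
        self.name = name
        self.data = {}        # committed key -> value pairs
        self.pending = None   # writes staged during the prepare phase
        self.alive = True

    def prepare(self, writes):
        # Phase 1: stage the writes and vote. A dead machine votes "no".
        if not self.alive:
            return False
        self.pending = dict(writes)
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged writes visible.
        self.data.update(self.pending or {})
        self.pending = None

    def abort(self):
        # Phase 2 (someone voted no): discard the staged writes.
        self.pending = None


class ClusterController:
    """Hypothetical coordinator: keeps the database-to-machines map."""

    def __init__(self):
        self.db_to_replicas = {}

    def host(self, db, replicas):
        self.db_to_replicas[db] = list(replicas)

    def read(self, db, key):
        # Read-one: a single live replica is enough to answer a read.
        for replica in self.db_to_replicas[db]:
            if replica.alive:
                return replica.data.get(key)
        raise RuntimeError("no live replica for " + db)

    def write(self, db, writes):
        # Write-all with 2PC: every replica must prepare before any commits.
        replicas = self.db_to_replicas[db]
        prepared = []
        for replica in replicas:
            if replica.prepare(writes):
                prepared.append(replica)
            else:
                for p in prepared:
                    p.abort()
                return False
        for replica in replicas:
            replica.commit()
        return True


m1, m2 = Replica("m1"), Replica("m2")
controller = ClusterController()
controller.host("app_db", [m1, m2])
controller.write("app_db", {"greeting": "hello"})
print(controller.read("app_db", "greeting"))
```

Because a write only commits after every replica has prepared, any single replica can serve a read, which is what makes the read-one half of the protocol safe in this sketch.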

Page 12: Scaling Small App

Failure Management in a Cluster (II)

Page 13: Scaling Small App

Failure Management in a Cluster (III)

Page 14: Scaling Small App

Failure Management in a Cluster (IV)

➔ Where should read operations be routed?
  ➔ Performance vs. load balancing
➔ There are three options:
  ➔ Option 1: all read operations go to the same physical machine
  ➔ Option 2: all read operations from a single transaction go to a single physical machine, but read operations from different transactions go to different physical machines
  ➔ Option 3: read operations from the same transaction go to different physical machines
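
The three read-routing options can be sketched as a small routing function. This is only an illustration: the name `make_router`, the use of transaction IDs as keys, and the round-robin choice among replicas are assumptions, since the slide does not say how a machine is picked.

```python
import itertools


def make_router(replicas, option):
    """Return route(txn_id) -> replica for one database's replica list.

    Round-robin selection is an assumption made for this sketch.
    """
    round_robin = itertools.cycle(replicas)
    pinned = {}  # txn_id -> replica, used only by Option 2

    def route(txn_id):
        if option == 1:
            # Option 1: every read goes to the same machine.
            return replicas[0]
        if option == 2:
            # Option 2: a transaction is pinned to one machine,
            # but different transactions may use different machines.
            if txn_id not in pinned:
                pinned[txn_id] = next(round_robin)
            return pinned[txn_id]
        if option == 3:
            # Option 3: each read may go to a different machine,
            # even within the same transaction.
            return next(round_robin)
        raise ValueError("option must be 1, 2, or 3")

    return route


route = make_router(["m1", "m2"], option=2)
print(route("t1"), route("t1"), route("t2"))
```

Option 1 keeps all reads on one machine, while Options 2 and 3 spread the load more evenly across the replicas, which is the performance versus load-balancing trade-off the slide names.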

Page 15: Scaling Small App

Failure Management in a Cluster (V)

➔ What about the serializability guarantee?
  ➔ Reminder: "a transaction schedule is serializable if its outcome (e.g., the resulting database state) is equal to the outcome of its transactions executed serially, i.e., sequentially without overlapping in time".
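
The quoted definition can be checked by brute force on a toy example. The sketch below is an assumption-laden illustration, not anything from the paper: transactions are modeled as lists of Python step functions, the helper names `run` and `is_serializable` are hypothetical, and the check compares final states for one initial state only.

```python
from itertools import permutations


def run(schedule, txns, initial):
    """Execute steps in the given interleaving and return the final state."""
    state = dict(initial)
    local = {t: {} for t in txns}   # per-transaction private variables
    position = {t: 0 for t in txns}
    for t in schedule:
        txns[t][position[t]](state, local[t])
        position[t] += 1
    return state


def is_serializable(schedule, txns, initial):
    """True if the schedule's final state matches some serial order."""
    outcome = run(schedule, txns, initial)
    for order in permutations(txns):
        serial = [t for t in order for _ in txns[t]]
        if run(serial, txns, initial) == outcome:
            return True
    return False


# Two transactions that each read x and then write it back incremented.
def read_x(state, local):
    local["x"] = state["x"]

def write_plus_10(state, local):
    state["x"] = local["x"] + 10

def write_plus_5(state, local):
    state["x"] = local["x"] + 5

txns = {"T1": [read_x, write_plus_10], "T2": [read_x, write_plus_5]}
start = {"x": 0}

# Both read before either writes: T1's update is lost (x ends at 5, not 15).
print(is_serializable(["T1", "T2", "T1", "T2"], txns, start))  # False
# T1 runs to completion before T2 starts.
print(is_serializable(["T1", "T1", "T2", "T2"], txns, start))  # True
```

The first schedule is the classic lost update: no serial order of T1 and T2 produces x = 5, so its outcome is not serializable under the definition above.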

Page 16: Scaling Small App

Failure Management in a Cluster (VI)

➔ What happens when a machine fails?
  ➔ The cluster controller continues to process database requests!
  ➔ It also initiates a background database replication process.

Page 17: Scaling Small App

Failure Management in a Cluster (VII)

➔ The key technical challenge lies in designing the replication process:
  ➔ So that replicas are transactionally consistent!
  ➔ Using existing DBMS tools (mysqldump in MySQL)!
  ➔ With minimum downtime to the database!

Page 18: Scaling Small App

Failure Management in a Cluster (VIII)

➔ Problem scenario:

Page 19: Scaling Small App

Failure Management in a Cluster (IX)

➔ Solution:

Page 20: Scaling Small App

Enforcing Database SLAs (I)

➔ Problem definition: allocate databases to the minimum number of machines while satisfying all database SLAs.
➔ SLA definition:
  ➔ The minimum throughput over a time period T
  ➔ The maximum fraction of rejected transactions over a time period T
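
The two-part SLA definition can be written down as a small check. This is a sketch under assumptions: the `SLA` class, the `meets_sla` helper, and the choice to measure throughput as committed transactions per second are all hypothetical, and the slide does not say how the paper measures either quantity.

```python
from dataclasses import dataclass


@dataclass
class SLA:
    min_throughput: float        # committed transactions per second over T
    max_reject_fraction: float   # rejected / submitted over T


def meets_sla(sla, committed, rejected, period_seconds):
    """Check both SLA conditions for one observation window of length T."""
    submitted = committed + rejected
    throughput = committed / period_seconds
    reject_fraction = rejected / submitted if submitted else 0.0
    return (throughput >= sla.min_throughput
            and reject_fraction <= sla.max_reject_fraction)


sla = SLA(min_throughput=10, max_reject_fraction=0.01)
# 6000 commits and 30 rejections in a 600-second window.
print(meets_sla(sla, committed=6000, rejected=30, period_seconds=600))
```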

Page 21: Scaling Small App

Enforcing Database SLAs (II)

➔ Solution: adopted First-Fit algorithms
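
A generic First-Fit placement can be sketched as follows. The slide only names the algorithm family, so everything concrete here is an assumption: the function name `first_fit`, the single scalar "load" per database, and the uniform machine capacity. The sketch also ignores replication (placing copies of one database on distinct machines).

```python
def first_fit(db_loads, machine_capacity):
    """Place each database on the first machine with enough free capacity.

    db_loads: dict of database name -> load (one scalar, an assumption).
    Returns a dict of database name -> machine index.
    """
    free = []        # remaining capacity of each opened machine
    placement = {}
    for db, load in db_loads.items():
        if load > machine_capacity:
            raise ValueError(db + " does not fit on a single machine")
        for i, remaining in enumerate(free):
            if load <= remaining:
                free[i] = remaining - load
                placement[db] = i
                break
        else:
            # No existing machine has room: open a new one.
            free.append(machine_capacity - load)
            placement[db] = len(free) - 1
    return placement


loads = {"blog": 5, "poll": 4, "quiz": 3, "wiki": 6}
print(first_fit(loads, machine_capacity=10))
```

First-Fit is a heuristic for this bin-packing style problem: it is fast and simple, but it does not guarantee the true minimum number of machines.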

Page 22: Scaling Small App

Experimental Evaluation (I)

➔ Synchronous replication:

Page 23: Scaling Small App

Experimental Evaluation (II)

➔ Failure recovery:

Page 24: Scaling Small App

Conclusion

➔ Opinion!
➔ Critical assessment!

Page 25: Scaling Small App

Questions