scaling small app
Seminar: Cloud Computing
A Scalable Data Platform for a Large Number of Small Applications
Sabbir Ahmmed
9 January 2013 Scaling Small Apps 2
Motivation
Outline
➔ The Article
➔ Introduction
➔ System Architecture
➔ Failure Management in a Cluster
➔ Enforcing Database SLAs
➔ Experimental Evaluation
➔ Conclusion
The Article
Work done by: Fan Yang, Jayavel Shanmugasundaram, Ramana Yerneni
Work done at: Yahoo! Research
Published in: 2009 (CIDR)
Published under: Creative Commons License Agreement
This paper is NOT about:
➔ File Systems/Low-Level Internals (Database vendors' wizardry)
➔ Database Paradigms (Relational, NoSQL or any other kind)
This paper is about:
➔ Data Platform/Data Management Solution (SaaS, PaaS, IaaS)
➔ Design Space/System Architecture (for Cloud service providers)
Introduction (I)
➔ Small/Community Applications
  ➔ Relatively small data size (tens to thousands of megabytes)
  ➔ Small throughput requirement (tens to hundreds of concurrent user sessions)
  ➔ Comfortably fit in a single machine
➔ However
  ➔ Large number of such applications in a large social network
    ➔ Tens of thousands!
  ➔ Combined data size and workload is quite large
    ➔ Petabytes of data and millions of concurrent users!
Introduction (II)
➔ Problems with existing data-management solutions
  ➔ Commercial database systems
    ➔ Scale well
    ➔ Large monetary cost (Licensing with premium)
  ➔ Open-source alternatives
    ➔ Free
    ➔ Do not scale well
  ➔ Peer-to-Peer (DHTs, Ordered tables)
    ➔ Excellent scalability and throughput performance
    ➔ Only support very simple data-manipulation operations
Introduction (III)
➔ Problems with existing data-management solutions
  ➔ Other emerging data platforms (e.g., Bigtable, PNUTS, SimpleDB)
    ➔ Scale to a large number of data operations
    ➔ Lack rich query processing capabilities (crucial for most web apps!)
    ➔ Restrict the kind of queries applications can issue
Finally, all of the above solutions lack multi-tenancy support!
Introduction (IV)
➔ The goal of this paper is to design a data management solution that is:
  ➔ Low cost
  ➔ Full-featured
  ➔ Multi-tenancy capable
➔ Uses commodity hardware and free software (MySQL)
➔ Exploits two main properties of applications:
  ➔ They are “small”
  ➔ Can comfortably “fit in a single machine”
➔ And finally addresses two main challenges
  ➔ Fault-tolerance
  ➔ Ensuring SLAs
System Architecture (I)
System Architecture (II)
➔ A few more words about the proposed architecture
  ➔ System controller, colo controller and cluster controller need to be fault tolerant
  ➔ Two main types of failures need to be handled
    ➔ Machine failures within a colo (handled by synchronous replication within a colo)
    ➔ Colo failures (handled by asynchronous replication across colos)
Failure Management in a Cluster (I)
➔ Replication Architecture
  ➔ Uses single-node DBMSs
  ➔ Each database is hosted on two or more machines
  ➔ Coordinated by a cluster controller
    ➔ Maintains a map of databases to machines
    ➔ Manages all DB connections to the client
    ➔ Maintains a DB connection to each machine
    ➔ Uses read-one write-all replication protocol
    ➔ Uses 2-phase commit (2PC) protocol
[Figure: Cluster Controller and DB Cluster]
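The read-one write-all and 2PC bullets on this slide can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `Replica` and `ClusterController` are hypothetical classes, a replica is a Python dict rather than a MySQL instance, and a machine that is down simply votes "no" in the prepare phase. It is not the paper's implementation.

```python
class Replica:
    """In-memory stand-in for one single-node DBMS (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # committed (db, key) -> value
        self.pending = {}   # txn id -> staged writes
        self.alive = True

    def prepare(self, txn_id, writes):
        # Phase 1 of 2PC: stage the writes and vote yes only if reachable.
        if not self.alive:
            return False
        self.pending[txn_id] = dict(writes)
        return True

    def commit(self, txn_id):
        # Phase 2 of 2PC: make the staged writes visible.
        self.data.update(self.pending.pop(txn_id))

    def abort(self, txn_id):
        # Discard staged writes, if any.
        self.pending.pop(txn_id, None)

    def read(self, db, key):
        return self.data.get((db, key))


class ClusterController:
    """Keeps the database-to-machine map and coordinates replicas."""

    def __init__(self, placement):
        self.placement = placement   # db name -> list of Replica
        self.next_txn = 0

    def write(self, db, writes):
        """Write-all: apply the writes on every replica of db via 2PC."""
        self.next_txn += 1
        txn_id = self.next_txn
        staged = {(db, key): value for key, value in writes.items()}
        prepared = []
        for replica in self.placement[db]:
            if replica.prepare(txn_id, staged):
                prepared.append(replica)
            else:
                # One "no" vote aborts the transaction everywhere.
                for p in prepared:
                    p.abort(txn_id)
                return False
        for replica in prepared:
            replica.commit(txn_id)
        return True

    def read(self, db, key):
        """Read-one: a single replica is enough to answer a read."""
        return self.placement[db][0].read(db, key)
```

With two replicas for one database, a successful `write` leaves both copies identical, which is what lets a read go to either machine.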
Failure Management in a Cluster (II)
Failure Management in a Cluster (III)
Failure Management in a Cluster (IV)
➔ Where to route the read operations?
  ➔ Performance vs. load balancing
➔ There are three options:
  ➔ Option 1: all read operations go to the same physical machine
  ➔ Option 2: all read operations from a single transaction go to a single physical machine, but read operations from different transactions go to different physical machines
  ➔ Option 3: read operations from the same transaction go to different physical machines
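The three read-routing options can be written down as one small function. This is a sketch under assumptions: `route_read` is a hypothetical name, replicas are identified by plain strings, Option 2 spreads transactions with a simple modulo on the transaction id, and Option 3 picks a replica at random for each read. The slides do not say how the controller chooses among machines.

```python
import random


def route_read(option, replicas, txn_id, rng=random):
    """Return the machine that serves one read of a replicated database."""
    if option == 1:
        # Option 1: every read goes to the same physical machine.
        return replicas[0]
    if option == 2:
        # Option 2: one machine per transaction (assumed: modulo spreading),
        # so different transactions can land on different machines.
        return replicas[txn_id % len(replicas)]
    if option == 3:
        # Option 3: each read, even within one transaction, may go to a
        # different machine (assumed: uniform random choice).
        return rng.choice(replicas)
    raise ValueError(f"unknown routing option: {option}")
```

Option 1 leaves all but one replica idle for reads, while Options 2 and 3 spread read load across machines, which is the performance versus load-balancing trade-off named on the slide.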
Failure Management in a Cluster (V)
➔ What about the serializability guarantee?
  ➔ Reminder: “a transaction schedule is serializable if its outcome (e.g., the resulting database state) is equal to the outcome of its transactions executed serially, i.e., sequentially without overlapping in time”.
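To make the quoted definition concrete, the sketch below tests a schedule for conflict-serializability, a standard sufficient condition for serializability: build a precedence graph between transactions and check that it has no cycle. The function name and the `(transaction, operation, item)` tuple format are assumptions for illustration, and this is a textbook check, not the argument used in the paper.

```python
from itertools import combinations


def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) in execution order, op is "r" or "w"."""
    # Two operations conflict if they come from different transactions,
    # touch the same item, and at least one of them is a write.
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((t1, t2))   # t1's operation came first

    # The schedule is conflict-serializable iff the graph is acyclic.
    # Repeatedly strip transactions that have no incoming edge.
    nodes = {txn for txn, _, _ in schedule}
    while nodes:
        sources = {
            n for n in nodes
            if not any(b == n and a in nodes for a, b in edges)
        }
        if not sources:
            return False          # every remaining node is on a cycle path
        nodes -= sources
    return True
```

For example, `[("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]` is rejected because T1 must come both before and after T2.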
Failure Management in a Cluster (VI)
➔ What happens when a machine fails?
  ➔ The cluster controller continues to process database requests!
  ➔ It also initiates a background database replication process.
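A rough sketch of the controller's reaction to a machine failure follows. It assumes a hypothetical `handle_machine_failure` helper, a placement map from database names to machine names, and a simple "first spare machine" target choice; none of these details come from the slides. The copy itself (for example with mysqldump) is left out and only planned as a task.

```python
def handle_machine_failure(placement, failed, spare_machines):
    """Drop a failed machine from the map and plan background copies.

    placement: dict of db name -> list of machine names (modified in place).
    Returns a list of (db, source_machine, target_machine) copy tasks.
    """
    tasks = []
    for db, machines in placement.items():
        if failed not in machines:
            continue
        # Requests for db keep flowing to the surviving replicas.
        machines.remove(failed)
        # Assumed policy: first spare machine that does not already host db.
        candidates = [m for m in spare_machines if m not in machines]
        if machines and candidates:
            tasks.append((db, machines[0], candidates[0]))
    return tasks
```

The target is deliberately not added to the placement map here: in this sketch a new replica would only start serving requests once its background copy has finished.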
Failure Management in a Cluster (VII)
➔ The key technical challenge lies in designing the replication process
  ➔ So that replicas are transactionally consistent
  ➔ Using existing DBMS tools (mysqldump in MySQL)
  ➔ With minimum downtime to the database
Failure Management in a Cluster (VIII)
➔ Problem scenario:
Failure Management in a Cluster (IX)
➔ Solution:
Enforcing Database SLAs (I)
➔ Problem definition: allocate databases to the minimum number of machines while satisfying all database SLAs
➔ SLA definition:
  ➔ The minimum throughput over a time period T
  ➔ The maximum fraction of rejected transactions over a time period T
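The two-part SLA definition can be checked with a few lines of arithmetic. The sketch assumes throughput is measured as committed transactions per second and that the rejected fraction is rejected over submitted transactions; the `SLA` class and `sla_satisfied` function are hypothetical names, and the slides do not fix the units.

```python
from dataclasses import dataclass


@dataclass
class SLA:
    min_throughput: float        # assumed unit: transactions per second over T
    max_reject_fraction: float   # rejected / submitted over T


def sla_satisfied(sla, committed, rejected, period_seconds):
    """Check one database's counters for a period T against its SLA."""
    submitted = committed + rejected
    throughput = committed / period_seconds
    reject_fraction = rejected / submitted if submitted else 0.0
    return (throughput >= sla.min_throughput
            and reject_fraction <= sla.max_reject_fraction)
```

For an SLA of at least 10 transactions per second and at most 5% rejections, a 60-second period with 700 committed and 20 rejected transactions passes: throughput is about 11.7 per second and the rejected fraction is about 2.8%.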
Enforcing Database SLAs (II)
➔ Solution: adopted First-Fit algorithms
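As a reminder of what First-Fit does, here is a minimal sketch of the classic bin-packing heuristic applied to database placement. It makes strong simplifying assumptions: each database's SLA is reduced to a single integer load, every machine has the same capacity, and replication (each database on two or more machines) is ignored. The `first_fit` function and its parameters are hypothetical and not the paper's algorithm.

```python
def first_fit(db_loads, capacity):
    """Assign each database to the first machine with enough room.

    db_loads: dict of db name -> load needed to meet its SLA (assumed scalar).
    capacity: load one machine can carry.
    Returns a dict of db name -> machine index.
    """
    free = []          # remaining capacity per opened machine
    assignment = {}
    for db, load in db_loads.items():
        if load > capacity:
            raise ValueError(f"{db} does not fit on a single machine")
        for i, room in enumerate(free):
            if load <= room:
                free[i] -= load
                assignment[db] = i
                break
        else:
            # No opened machine has room: open a new one.
            free.append(capacity - load)
            assignment[db] = len(free) - 1
    return assignment
```

First-Fit is a heuristic, so it does not always reach the true minimum number of machines, but it is simple and places each database in a single pass over the opened machines.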
Experimental Evaluation (I)
➔ Synchronous Replication:
Experimental Evaluation (II)
➔ Failure Recovery:
Conclusion
➔ Opinion!
➔ Critical assessment!
Questions