scalable architecture on amazon aws cloud - indicthreads cloud computing conference 2011

24
1 Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies [email protected]

Upload: indicthreads

Post on 05-Dec-2014

1.676 views

Category:

Technology


0 download

DESCRIPTION

Session presented at the 2nd IndicThreads.com Conference on Cloud Computing held in Pune, India on 3-4 June 2011. http://CloudComputing.IndicThreads.com Abstract:“With increasing demand, ever-growing datasets, unpredictable traffic patterns and need for faster response times, “scalable architecture” has become a necessity. Here, we will see how the traditional concepts and best practices for scalability have to be adopted for the cloud. Further, we will go through the unique advantages that Amazon AWS cloud offers for architecting scalable applications. As an architect, you need to identify the components and bottlenecks in your architecture and modify your application to leverage the underlying scalability. We will cover the following topics: Scalability principles for the cloud Leveraging AWS services for application components Shared nothing architecture Asynchronous work queues for loosely coupled applications Database scalability Tools, connectors and enablers to help build, deploy and monitor your cloud environment Scalability using Platform-as-a-Service offerings on top of AWS An example of a horizontally scalable architecture for an enterprise application on Amazon AWS This talk will act as a primer for a cloud architect to achieve an auto-scalable, highly available, fully-monitored edge-cached application.” Speaker: Kalpak Shah is the Founder & CEO of Clogeny Technologies Pvt. Ltd. and guides the overall strategic direction of the company. Clogeny is focused on niche software and product development in cloud computing and scalable applications domains. He is passionate about the ground-breaking economics and technology afforded by the cloud computing platforms. He has been leading and architecting cutting-edge product development across the cloud stack including IaaS, PaaS and SaaS vendors. He has previously worked at organizations like Sun Microsystems and Symantec in the storage domain primarily distributed and disk filesystems. Kalpak has a Bachelors’ of Engineering degree in computer engineering from PICT, University of Pune.

TRANSCRIPT

Page 1: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

1

Scalable Architecture on Amazon AWS Cloud

Kalpak Shah

Founder & CEO,Clogeny Technologies

[email protected]

Page 2: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

2

* http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php

Page 3: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

3

Architect to scale on-demand and provision as per current requirements

Ideal model for unpredictable and variable loads

Page 4: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

4

Scalability RequirementsIncrease in resources → Increase in performance

Predictability

Low Latency

High Reliability

Dynamism: Number of users, volume of data, skews

Operational efficiency

Costs should not scale

{Elasticity, Scalability, Resiliency}

Page 5: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

5

Scalability PerspectivesWhat needs to scale?

ComputeMemoryNetworkStorageMonitoring

Vertical scalability

Horizontal scalability

Scale across geographies

HPC workloads

Data Processing workloads

IO LatencyProvisioning timeBackup / Restore timesFailoverOps

Page 6: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

6

Vertical ScalabilityWhen scale is predictable and linearWhen you do not want to spend on re-architecting the application or deploymentIncrease instance sizes

1 – 33.5 EC2 Compute Units

613MB memory to 68GB memory

Size or number of EBS disks

HPC Instances10 Gigabit ethernet

Higher IOPS for EBS disks

Limitations….

Page 7: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

7

Scaling multi-tier stacks – 1Service Oriented Architecture

Loosely coupled

Standard service contracts

Web Services

Enables independent tiers for deployment & management

Messaging / Queue layer

Tier A Tier B

Queue 1 Queue 2

Page 8: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

8

Scaling multi-tier stacks – 2Amazon SQS: Reliable, scalable, hosted queue; exposed as web service

RabbitMQ: Open-source HA messaging system, clustering support

BeanStalkD: Simple, fast work queue

Clustering Application ServersJBoss App Server, IBM WebSphere Application Server

Add or remove nodes on the fly – automate through scripting

Stateless behavior can be added when necessary

VPC does not work across availability zones (AZ) – in the pipeline though

Page 9: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

9

Elastic Load Balancing, Auto Scaling

Amazon Elastic Load BalancingDistributes incoming traffic to your application across several EC2 instancesDetects unhealthy instances and reroutes traffic

Auto-ScalingEnabled by CloudWatch: Monitoring, custom metrics, free tier, graphs and statisticsRule-based automatic scaling of your EC2 capacityBased on metrics including resource utilization, software stack metrics or custom metricsN+1 redundancy

Page 10: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

10

Monitoring & Logging

Amazon CloudWatchMonitoring for AWS cloud resources & applicationsCollect and track metrics – CPU, latency, request counts, custom metrics

Monitoring with your own tools

Using Hyperic or Nagios for monitoring specific layers of your stack or to leverage existing investments

LoggingNo dependency on instances – copy necessary logs to S3 periodically

Page 11: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

11

Databases - Replication

Master-Slave Replication (MySQL, Oracle RAC)

Writes on master

Reads distributed across slaves

Works well in read mostly scenarios

Slave lag issue

Master

Writes

Slave Slave Slave

← Reads →

Page 12: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

12

Databases – Sharding - 1Partition data across masters

Writes & Reads are distributed

Application needs modification

Needs choice of partitioning strategy for uniform data distribution

Page 13: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

13

Databases – Sharding - 2

Issues

Joins cannot be performed across shards

Application modification can be expensive

Example

Evernote uses database sharding – localized failures, no need for joins

Each shard handles all data & traffic for about 100,000 users

http://blog.evernote.com/tech/2011/05/17/architectural-digest/#

Page 14: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

14

Databases – Amazon SimpleDBSchema-less distributed key-value storeHighly reliable and scalable (redundancy across geos)Automatic indexing of columnsAPI based global accessSupports multiple values for key/attributesEventual consistency or consistency – speed or consistency?Limitations

No joins, No transactions, No aggregators, text searches

NoSQL

MongoDB, Cassandra, Redis

Page 15: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

15

Databases – Amazon RDSRelational Database Service (RDS) from AWS

Scale your DB layer with minimum administration

MySQL and Oracle supported

Import existing databases & no changes to applications

Multi-AZ deployments supported

Manages backup of your database and enables restore from DB snapshots

Page 16: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

16

Reserving ScalabilityReserved Instances

AWS has finite hardware capacity

Provisioning times can vary

Use few reserved instances to “book” capacity in advance (also take advantage of lower prices)

Can be done across availability zones to ensure DR

Larger EBS disksCreate larger EBS disks to ensure better performance

Netflix creates 1TB disks in this manner

Page 17: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

17

Scalability using PaaSAmazon Elastic Beanstalk

Platform-as-a-Service with deployment, capacity provisioning, load balancing, auto-scaling & application health monitoring

Application versioning support (rollback if needed)

Uses EC2, S3, RDS, SimpleDB, Load Balancer, CloudWatch

Retain control of your infrastructure if desired

Other PaaS productsCumuLogic, CloudBees, DotCloud, PHP Fog

Java, PHP, RoR, MongoDB, MySQL……

Page 18: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

18

Elastic MapReduce (PaaS)Hosted Hadoop Framework

Manages job flows & provisioning of all infrastructure

AWS Console to create & manage workflows

Supports custom jars, Hive, Pig, streaming & processing in multiple languages/stacks

Debug & profile jobs

Run across geographies

Data processing, analytics

Scalable Managed MapReduce Platform

Page 19: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

19

Automation for managing scaleCloudFormation

Templatize your stack

Predictable provisioning of your stack

RightScaleSophisticated cloud management platform

Templates, automation, orchestration, portability

Tools, Connectors, EnablersAutomated orchestration & setup

Snapshot management

Monitor security groups and firewalls

Page 20: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

20

AZ2

Case Study

scalableportal.com ELB

AZ1

Web Server

1(Reserved)

Web Server

n

Auto Scaling Group - Apache

3rd Party API

LB

App Server

1(Reserved)

Auto Scaling Group - Django App Server

Managed Scaling

MySQLMaster

Work Server

Amazon S3

MySQLSlaves

EBS Snapshots

App Server

n

No listing,Hash based access,Millions of files

SES

Work Server

App Server

n

Work Server

Local DB

Page 21: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

21

There are some limits…EC2 has limit of 20 instances

S3 has limit of 100 buckets

Simple Email Service (SES) has a daily sending quota

NOTE: All of these limits can be increased or waived by requesting AWS. Ensure to do this before you hit the limits in production.

Page 22: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

22

Scale but minimize costs - 1Use of Reserved Instances

Commitment for upto 1-3 years with some upfront payment

Actual usage cost is much lower

If used for more than 6 months in a year, can be 30-45% cheaper than on-demand instances

Reduced Redundancy Storage

Reduce costs by storing non-critical data at lesser redundancy and lesser availability/durability of 99.99%

Instance Sizes

Run some smaller instances as part of clusters

Page 23: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

23

Scale but minimize costs - 2Data Transfer beyond 10TB

Consolidate AWS accounts so that higher usage translates to saved costs. $0.15 upto 10TB and $0.11 beyond 10TB.

Identify extra capacity

Use monitoring to identify unused capacity & optimize

Spot Instances

Bid for unused capacity – choose your maximum price

Get more within your existing budget

Page 24: Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conference 2011

24

Questions?