aws activate webinar - scalable databases for fast growing startups

Scalable Databases for Fast Growing Startups Blair Layton Business Development Manager – Database Services Amazon Web Services - APAC

Upload: amazon-web-services

Post on 11-Aug-2014


Category:

Data & Analytics



DESCRIPTION

Fast-growing startups building high-scale applications demand a lot from their infrastructure, and in particular from their databases. Databases often become the bottleneck of a startup’s technology stack and risk inhibiting fast growth, because they are not easy to set up, operate and scale in the cloud. This webinar focuses on how to build scalable databases in the cloud and covers how to effectively combine relational, NoSQL, and even data warehouse databases, which have become a reality for startups with the launch of Amazon Redshift.

Key takeaways:

• Understand the trade-off between SQL and NoSQL and when to go for a hybrid model.

• Best practices for setting up your database in the AWS cloud, whether using managed services or managing it yourself.

• Learn how to minimize the costs of your database with the right architecture and pricing models.

Who should attend: DBAs, startup CTOs, developers, engineers, architects, growth hackers

TRANSCRIPT

Page 1: AWS Activate webinar - Scalable databases for fast growing startups

Scalable Databases for Fast Growing Startups

Blair Layton

Business Development Manager – Database Services

Amazon Web Services - APAC

Page 2: AWS Activate webinar - Scalable databases for fast growing startups

Agenda

• Self-Managed or Managed Database Services?

• NoSQL or Relational?

• Performance Tips and Tricks

• How to scale from 1 to 10,000,000 users?

• How do I save money?

• Summary

• Q&A

Page 3: AWS Activate webinar - Scalable databases for fast growing startups

Self-Managed or Managed Database Services?

Page 4: AWS Activate webinar - Scalable databases for fast growing startups

Why Managed Databases?

[Chart: breakdown of database administration effort. Segment labels: backup & recovery; data load & unload; performance tuning; scripting & coding; security; planning; install, upgrade, patch and migrate; documentation, licensing & training. Percentages shown: 25%, 40%, 5%, 5%]

Page 5: AWS Activate webinar - Scalable databases for fast growing startups

If You Host Your Databases On-premises

Power, HVAC, net

Rack & stack

Server maintenance

OS patches

DB s/w patches

Database backups

Scaling

High availability

DB s/w installs

OS installation

you

App optimization

Page 7: AWS Activate webinar - Scalable databases for fast growing startups

If You Host Your Databases in EC2

Power, HVAC, net

Rack & stack

Server maintenance

OS patches

DB s/w patches

Database backups

Scaling

High availability

DB s/w installs

OS installation

you

App optimization

Page 8: AWS Activate webinar - Scalable databases for fast growing startups

OS patches

DB s/w patches

Database backups

Scaling

High availability

DB s/w installs

you

App optimization

Power, HVAC, net

Rack & stack

Server maintenance

OS installation

If You Host Your Databases in EC2

Page 9: AWS Activate webinar - Scalable databases for fast growing startups

If You Choose a Managed Database Service

Power, HVAC, net

Rack & stack

Server maintenance

OS patches

DB s/w patches

Database backups

App optimization

High availability

DB s/w installs

OS installation

you

Scaling

Page 10: AWS Activate webinar - Scalable databases for fast growing startups

differentiated effort increases the

uniqueness of an application

Page 11: AWS Activate webinar - Scalable databases for fast growing startups

Amazon RDS

Amazon DynamoDB Amazon Redshift

Amazon ElastiCache

Compute Storage

AWS Global Infrastructure

Database

Application Services

Deployment & Administration

Networking

AWS Database Services

Scalable High Performance Application Storage in the Cloud

Page 12: AWS Activate webinar - Scalable databases for fast growing startups

Relational Databases

Fully managed; zero admin

MySQL, Oracle, Postgres, SQL Server

Trillions of I/O requests/month

Amazon

RDS

Page 13: AWS Activate webinar - Scalable databases for fast growing startups

Flipboard relies on Amazon RDS

• Flipboard is an online magazine with millions of users and billions of “flips” per month

• Uses Amazon RDS and its Multi-AZ capabilities to store mission critical user data

"We were able to go from

concept to delivered product in

about six months with just a

handful of engineers."

- Greg Scallan, Chief Architect,

Flipboard

Page 14: AWS Activate webinar - Scalable databases for fast growing startups

Key Features

• Manageability

– Rapid deployment with pre-configured parameters

– Patch management

– Monitoring and metrics

• Availability and Data Durability

– Automated backups and point-in-time recovery

– DB snapshots

– Automatic host replacement (Single-AZ)

– Multi-AZ deployments

• Scalability

– Push-button scaling of storage, memory and compute

– Read replicas

Page 15: AWS Activate webinar - Scalable databases for fast growing startups

RDS for Production Workloads

[Table: rows are the Amazon RDS configurations Push-Button Scaling, Multi-AZ, Read Replicas and Provisioned IOPS; columns are Improve Availability, Increase Throughput and Reduce Latency]

[Diagram: a Multi-AZ deployment across two Availability Zones in one Region, with Read Replicas, Push-Button Scaling and Provisioned IOPS]

Page 16: AWS Activate webinar - Scalable databases for fast growing startups

In-Memory Cache

Elastic and reliable

Memcached or Redis

Fully managed; zero admin

Amazon

ElastiCache

Page 17: AWS Activate webinar - Scalable databases for fast growing startups

ElastiCache: Fully Managed Cache Service

Easy to Deploy

Deploy master-slave(s)

configuration with a few button clicks

or API calls

Easy to Migrate

Compatible with memcached or

Redis

Existing code will work when you

update node end points

Easy to Administer

ElastiCache automatically replaces failed

nodes and patches software as needed

CloudWatch enables you to monitor cache performance

metrics

Easy to Secure

Supports VPC and Security Group configurations

Easy to Scale

Provides assisted scale-up and scale-out capability

Page 18: AWS Activate webinar - Scalable databases for fast growing startups

Application

Server

Hot Items

Small, frequently-accessed items are ideal

candidates for read caching

• Reduce server-side latency to <1ms

• Eliminate “hot spot” performance barriers

• Offload heavy read activity from database
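The read-caching idea on this slide can be sketched as follows. This is a minimal illustration, not ElastiCache client code: the `HotItemCache` class, the `load_from_db` function and the `db_rows` data are hypothetical stand-ins, and an in-process dict plays the role that a Memcached or Redis node would play in a real deployment.

```python
import time
from typing import Callable, Dict, List, Tuple


class HotItemCache:
    """In-process stand-in for a Memcached/Redis-style read cache (hypothetical)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0):
        self._loader = loader      # called only on a cache miss, e.g. a database query
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expiry time, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the database and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying row has been updated."""
        self._store.pop(key, None)


# Hypothetical "database": a dict plus a log of every read that reaches it.
db_rows = {"user:1": "alice", "user:2": "bob"}
db_reads: List[str] = []


def load_from_db(key: str) -> str:
    db_reads.append(key)
    return db_rows[key]


cache = HotItemCache(load_from_db, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("user:1")  # only the first call reaches the database
```

In this sketch a thousand reads of one hot item cost a single database read; the TTL and the explicit `invalidate` call are the two usual ways to limit how stale a cached item can become.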

Page 19: AWS Activate webinar - Scalable databases for fast growing startups

NoSQL Database

Durable low latency

Fully managed; zero admin

Massive and seamless scalability

Amazon

DynamoDB

Page 20: AWS Activate webinar - Scalable databases for fast growing startups

WRITES

Continuously replicated to 3 AZs

Quorum acknowledgment

Persisted to disk (custom SSD)

READS

Strongly or eventually consistent

No trade-off in latency

Durable Low Latency – At Scale
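As a rough illustration of what "quorum acknowledgment" across three replicas can mean, the sketch below assumes a simple majority rule (2 of 3). The slides do not describe DynamoDB's internal protocol, so the majority rule and both function names are assumptions made for the example only.

```python
from typing import List


def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority (assumed rule)."""
    return replicas // 2 + 1


def write_acknowledged(acks: List[bool]) -> bool:
    """True when enough replicas have confirmed the write to reach the quorum."""
    return sum(acks) >= quorum_size(len(acks))


# Three replicas, one per Availability Zone; one replica is slow or unavailable.
print(write_acknowledged([True, True, False]))
```

Under this assumption a write can be acknowledged while one of the three copies is still catching up, which is one way to keep latency low without giving up durability.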

Page 21: AWS Activate webinar - Scalable databases for fast growing startups

Petabyte scale

Massively parallel

Relational data warehouse

Fully managed; zero admin

Amazon

Redshift

a lot faster

a lot cheaper

a whole lot simpler

Page 22: AWS Activate webinar - Scalable databases for fast growing startups

Parallelize and Distribute Everything

• Load

• Query

• Resize

• Backup

• Restore

[Diagram: SQL clients / BI tools in the client VPC connect to a Leader Node; three Compute Nodes (16TB each) are linked over 10 GigE (HPC); ingestion, backup and restore go through Amazon S3]

Page 23: AWS Activate webinar - Scalable databases for fast growing startups

Databases on EC2

• Any database that runs on Windows or Linux!

• Why?

• No managed service exists from AWS, e.g. MongoDB

• Full control

• Exceed limits of managed service, e.g. > 3TB of storage on RDS

Page 24: AWS Activate webinar - Scalable databases for fast growing startups
Page 25: AWS Activate webinar - Scalable databases for fast growing startups
Page 26: AWS Activate webinar - Scalable databases for fast growing startups

NoSQL or Relational?

Page 27: AWS Activate webinar - Scalable databases for fast growing startups

Not available on AWS

Spectrum of Database Options

SQL NoSQL

Low Cost High Cost

Do-it Yourself Fully

Managed

Page 28: AWS Activate webinar - Scalable databases for fast growing startups

Spectrum of Database Options

SQL NoSQL

Do-it Yourself Fully

Managed

Page 29: AWS Activate webinar - Scalable databases for fast growing startups

MySQL, Oracle, SQL Server, PostgreSQL; Amazon Redshift

Spectrum of Database Options

SQL NoSQL

Do-it Yourself Fully

Managed

MySQL, Oracle, SQL Server, PostgreSQL, MariaDB, Vertica, ParAccel…

Page 30: AWS Activate webinar - Scalable databases for fast growing startups

Spectrum of Database Options

SQL NoSQL

Do-it Yourself Fully

Managed

MongoDB, Cassandra, Redis, Memcached

DynamoDB, ElastiCache (Memcached), ElastiCache (Redis), SimpleDB

Page 31: AWS Activate webinar - Scalable databases for fast growing startups

Thinking About the Questions

Should I use SQL or NoSQL?

Should I use MySQL or PostgreSQL?

Should I use Redis, Memcached, or ElastiCache?

Should I use MongoDB, Cassandra, or DynamoDB?

Page 32: AWS Activate webinar - Scalable databases for fast growing startups

Actually, Thinking About the Right Questions

What are my scale and latency needs?

What are my transactional and consistency needs?

What are my read/write, storage and IOPS needs?

What are my time to market and server control needs?

Page 33: AWS Activate webinar - Scalable databases for fast growing startups

Factors to Consider

Factors | SQL | NoSQL
Application | App with complex business logic? | Web app with lots of users?
Transactions | Complex transactions, joins, updates? | Simple data model, updates, queries?
Scale | Developer managed | Automatic, on-demand scaling
Performance | Developer architected | Consistent, high performance at scale
Availability | Architected for fail-over | Seamless and transparent
Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP

Best of both worlds: Possible to use SQL and NoSQL models in one app

Page 34: AWS Activate webinar - Scalable databases for fast growing startups

Performance Tips and Tricks

Page 35: AWS Activate webinar - Scalable databases for fast growing startups

Performance Tips and Tricks

• Understand your workload

– Read:write ratio, I/O requirements, CPU requirements

• Identify bottlenecks

– CPU, memory, disk I/O, network latency/bandwidth

– Use CloudWatch and OS metrics

• Choose the right instance type

– High CPU, high memory, high storage, etc.

• Understand EBS!

Page 36: AWS Activate webinar - Scalable databases for fast growing startups

EBS =

Page 37: AWS Activate webinar - Scalable databases for fast growing startups

Amazon EBS Magnetic

Amazon Elastic Block Store (EBS)

• IOPS: ~100 IOPS steady-state, with best-effort bursts to hundreds. 40-200 IOPS in terms of variability.

• Throughput: variable by workload, best effort to 10s of MB/s.

• Latency: varies; reads typically <20 ms, writes typically <10 ms.

• Capacity: as provisioned, up to 1 TB.

Page 38: AWS Activate webinar - Scalable databases for fast growing startups

Amazon EBS General Purpose

Amazon Elastic Block Store (EBS)

• IOPS: 3 IOPS per GB consistent, with bursts to 3,000 IOPS. Bucket principle: the bucket fills up when not used and empties as used.

• Throughput: variable by workload, best effort to 64 MB/s.

• Latency: low and consistent.

• Capacity: as provisioned, up to 1 TB.
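The bucket principle for General Purpose volumes can be worked through with some simple arithmetic. The 3 IOPS per GB baseline and the 3,000 IOPS burst ceiling come from the slide; the bucket capacity of 5.4 million I/O credits is an assumption that is not stated in the slides, and the function names are made up for this sketch.

```python
import math

BASELINE_IOPS_PER_GB = 3       # from the slide
BURST_IOPS = 3000              # from the slide
BUCKET_CREDITS = 5_400_000     # ASSUMPTION: bucket capacity, not stated in the slides


def baseline_iops(size_gb: int) -> int:
    """Steady-state IOPS, which is also the rate at which the bucket refills."""
    return BASELINE_IOPS_PER_GB * size_gb


def burst_seconds(size_gb: int, bucket: int = BUCKET_CREDITS) -> float:
    """How long a full bucket can sustain the burst rate before it is empty."""
    drain_rate = BURST_IOPS - baseline_iops(size_gb)
    if drain_rate <= 0:
        return math.inf  # baseline already meets the burst rate; the bucket never drains
    return bucket / drain_rate


def refill_seconds(size_gb: int, bucket: int = BUCKET_CREDITS) -> float:
    """How long an idle volume needs to refill an empty bucket."""
    return bucket / baseline_iops(size_gb)


size = 100  # GB
print(baseline_iops(size), burst_seconds(size), refill_seconds(size))
```

Under these assumptions a 100 GB volume has a 300 IOPS baseline, can burst at 3,000 IOPS for 2,000 seconds, and needs 18,000 idle seconds to refill. Small volumes therefore suit spiky workloads better than sustained heavy I/O.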

Page 39: AWS Activate webinar - Scalable databases for fast growing startups

Amazon EBS Provisioned IOPS

Amazon Elastic Block Store (EBS)

• IOPS: within 10% of up to 4000 IOPS, 99.9% of a given year, as provisioned.

• Throughput: 16 KB per I/O = up to 64 MB/s, as provisioned.

• Latency: low and consistent, at recommended queue depth (QD).

• Capacity: as provisioned, up to 1 TB.

Page 40: AWS Activate webinar - Scalable databases for fast growing startups

[Diagram: EC2 sending an I/O to EBS]

Just because Amazon EC2 sends more work doesn’t mean there’s enough bandwidth to handle it!

Page 41: AWS Activate webinar - Scalable databases for fast growing startups

[Diagram: EC2 sending an I/O]

Without more bandwidth, more EBS volumes or higher PIOPS won’t help!

Page 42: AWS Activate webinar - Scalable databases for fast growing startups

EBS-Optimized

[Diagram: EC2 sending a “boatload” of I/O to EBS w/ PIOPS, captioned “Oh, YEAH!!”]

Page 43: AWS Activate webinar - Scalable databases for fast growing startups

Architecting for Performance

• IOPS consistency requires EBS-optimized instances

• Maximum throughput delivered by Amazon EBS is limited by Amazon EC2 bandwidth

• EBS throughput = EBS IOPS × block size

– Ex: 64 MB/s = 4000 IOPS × 16 KB

Max 8k = 2x

Max 4k = 4x*

Max 2k = 8x*

*Maximum IOPS is also limited to ~100,000 per 32 vCPU, irrespective of block size/throughput.
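The throughput formula on this slide is easy to check numerically. The sketch below uses decimal units (1 MB = 1,000 KB) so that it reproduces the slide's 64 MB/s example; the function names are hypothetical. Reading the "2x / 4x / 8x" figures as the IOPS available at smaller block sizes for the same throughput is an interpretation, not something the transcript states outright.

```python
from typing import Optional


def ebs_throughput_mb_s(iops: float, block_kb: float) -> float:
    """EBS throughput = EBS IOPS x block size, in decimal MB/s."""
    return iops * block_kb / 1000


def iops_at_throughput(throughput_mb_s: float, block_kb: float,
                       iops_cap: Optional[float] = None) -> float:
    """IOPS needed to fill a given throughput, optionally clipped to an instance cap."""
    iops = throughput_mb_s * 1000 / block_kb
    return min(iops, iops_cap) if iops_cap is not None else iops


print(ebs_throughput_mb_s(4000, 16))           # the slide's example
for block_kb in (16, 8, 4, 2):
    # Same 64 MB/s of bandwidth, progressively smaller I/Os.
    print(block_kb, iops_at_throughput(64, block_kb, iops_cap=100_000))
```

With bandwidth fixed at 64 MB/s, halving the block size doubles the number of I/Os that fit, until the separate per-instance IOPS ceiling mentioned in the footnote takes over.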

Page 44: AWS Activate webinar - Scalable databases for fast growing startups

Additional Hints

• Mount partitions with “noatime” and “nodiratime”

– Removes a write every time a read is done

• Turn off file system read ahead if possible

– Especially for OLTP systems

• Use vendor storage solutions

– Oracle ASM

• Optimize kernel settings

Page 45: AWS Activate webinar - Scalable databases for fast growing startups

Scaling from 1 to 10,000,000 Users

Page 46: AWS Activate webinar - Scalable databases for fast growing startups

So how do we scale?

Page 47: AWS Activate webinar - Scalable databases for fast growing startups

Hi, I have NO IDEA what I am doing!!

Page 48: AWS Activate webinar - Scalable databases for fast growing startups

So let’s start from day one, user one (you)

Page 49: AWS Activate webinar - Scalable databases for fast growing startups

Day One, User One:

• We could potentially get to a few hundred to a few thousand depending on application complexity and traffic

• No failover

• No redundancy

• Too many eggs in one basket

[Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]

Page 50: AWS Activate webinar - Scalable databases for fast growing startups

“We’re gonna need a bigger box”

• Simplest approach

• Can now leverage PIOPS

• High I/O instances

• High memory instances

• High CPU instances

• High storage instances

• Easy to change instance sizes

• Will hit an endpoint eventually

r3.8xlarge

m3.2xlarge

t2.small

Page 51: AWS Activate webinar - Scalable databases for fast growing startups

Day Two, User >1

First let’s separate out our single host into more than one.

• Web

• Database

– Make use of a database service?

[Diagram: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance]

Page 52: AWS Activate webinar - Scalable databases for fast growing startups

Start with the right databases for the job

Page 53: AWS Activate webinar - Scalable databases for fast growing startups

User >100

First let’s separate out our single host into more than one

• Web

• Database

– Use RDS to make your life easier

[Diagram: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance]

Page 54: AWS Activate webinar - Scalable databases for fast growing startups

User > 1000

Next let’s address our lack of failover and redundancy issues

• Elastic Load Balancing

• Another web instance

– In another Availability Zone

• Enable Amazon RDS Multi-AZ

[Diagram: User, Amazon Route 53, Elastic Load Balancing; one Availability Zone with a Web Instance and the RDS DB Instance Active (Multi-AZ); a second Availability Zone with a Web Instance and the RDS DB Instance Standby (Multi-AZ)]

Page 55: AWS Activate webinar - Scalable databases for fast growing startups

User > 10Ks–100Ks

[Diagram: User, Amazon Route 53, Elastic Load Balancing; two Availability Zones containing eight Web Instances, four RDS DB Instance Read Replicas, the RDS DB Instance Active (Multi-AZ) and the RDS DB Instance Standby (Multi-AZ)]
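Once read replicas are in place, the application has to decide which endpoint each query goes to. A minimal sketch of that routing follows; the `ReplicaRouter` class and the endpoint names are hypothetical, and the `SELECT`-prefix test is a deliberately naive assumption (real applications usually route per transaction or per code path).

```python
import itertools
from typing import List


class ReplicaRouter:
    """Send reads to replicas in round-robin order and everything else to the primary."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str) -> str:
        # Naive classification: anything that starts with SELECT counts as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


# Hypothetical endpoint names, for illustration only.
router = ReplicaRouter("primary.example.internal",
                       ["replica-1.example.internal", "replica-2.example.internal"])
print(router.endpoint_for("SELECT name FROM users WHERE id = 1"))
print(router.endpoint_for("UPDATE users SET name = 'x' WHERE id = 1"))
```

Replicas typically lag the primary slightly, so reads that must see a just-written value are usually sent to the primary rather than through this kind of router.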

Page 56: AWS Activate webinar - Scalable databases for fast growing startups

This will take us pretty far honestly, but we care about performance and efficiency, so let’s clean this up a bit

Page 57: AWS Activate webinar - Scalable databases for fast growing startups

Shift Some Load Around

Let’s lighten the load on our web and database instances

• Move static content from the web instance to Amazon S3 and CloudFront

• Move dynamic content from Elastic Load Balancing to CloudFront

• Move session/state and DB caching to ElastiCache or DynamoDB

[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, Web Instance, RDS DB Instance Active (Multi-AZ), ElastiCache and Amazon DynamoDB, shown for one Availability Zone]

Page 58: AWS Activate webinar - Scalable databases for fast growing startups

User >500k+

[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing and DynamoDB; two Availability Zones containing six Web Instances, two ElastiCache nodes, two RDS DB Instance Read Replicas, the RDS DB Instance Active (Multi-AZ) and the RDS DB Instance Standby (Multi-AZ)]

Page 59: AWS Activate webinar - Scalable databases for fast growing startups

From 500K to 1 Million Users

• Getting serious now

• Significant user base

• Plenty of attention if things go wrong

• Interesting phase for startups with funding rounds

Page 60: AWS Activate webinar - Scalable databases for fast growing startups

Time to make some radical improvements at the web & app layers

Page 61: AWS Activate webinar - Scalable databases for fast growing startups

SOAing

Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently. Use queues!

Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.
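The "use queues" advice can be illustrated with a small producer/worker sketch. Here Python's standard-library `queue.Queue` stands in for a managed queue such as Amazon SQS, and `run_workers` with its parameters is a hypothetical helper: the point is only that the web tier enqueues work and returns, while a separately scaled worker tier drains the queue.

```python
import queue
import threading
from typing import Callable, Iterable, List


def run_workers(jobs: Iterable[int], handler: Callable[[int], int],
                worker_count: int = 2) -> List[int]:
    """Push jobs onto a queue and let a pool of workers process them independently."""
    q: "queue.Queue" = queue.Queue()
    results: List[int] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            job = q.get()
            if job is None:          # sentinel: no more work for this worker
                q.task_done()
                break
            out = handler(job)
            with lock:               # results list is shared between threads
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:                 # the "web tier" only enqueues and moves on
        q.put(job)
    for _ in threads:                # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return results


print(sorted(run_workers([1, 2, 3], lambda n: n * n)))
```

Because producers and workers only share the queue, the number of workers can change without touching the code that enqueues jobs, which is the independence the slide argues for.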

Page 62: AWS Activate webinar - Scalable databases for fast growing startups

Users > 1 Million

[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, Amazon DynamoDB, Amazon SQS, Amazon SES, Amazon CloudWatch and ElastiCache; one Availability Zone shown with four Web Instances, two Worker Instances, two Internal App Instances, the RDS DB Instance Active (Multi-AZ) and two RDS DB Instance Read Replicas]

Page 63: AWS Activate webinar - Scalable databases for fast growing startups

The next big steps

Page 64: AWS Activate webinar - Scalable databases for fast growing startups

From 5 to 10 Million Users

You may start to run into issues with your database around contention on the write master.

How can you solve it?

• Federation (splitting into multiple DBs based on function)

• Sharding (splitting one data set up across multiple hosts)

• Moving some functionality to other types of databases

– NoSQL for hot tables, lookup tables, leaderboards/scoring, meta data

– Data warehouse for analytics: user behavior, performance monitoring, a/b testing results, KPIs/dashboards.
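The difference between federation and sharding can be sketched in a few lines. Everything named here is hypothetical (the database names, the shard list and both functions), and hash-modulo placement is just one simple sharding scheme chosen for illustration.

```python
import hashlib
from typing import Dict, List

# Federation: each function of the application gets its own database (hypothetical names).
FEDERATED_DBS: Dict[str, str] = {"users": "users-db", "orders": "orders-db"}

# Sharding: one data set is split across several hosts (hypothetical names).
SHARDS: List[str] = ["shard-0", "shard-1", "shard-2", "shard-3"]


def database_for_function(function: str) -> str:
    """Federation: route by what the data is for."""
    return FEDERATED_DBS[function]


def shard_for_key(key: str, shards: List[str] = SHARDS) -> str:
    """Sharding: route by a stable hash of the row's key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(database_for_function("orders"))
print(shard_for_key("user-42"))
```

One design caveat with plain hash-modulo placement: changing the number of shards moves most keys, so schemes such as consistent hashing or a lookup table are often preferred when the shard count is expected to grow.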

Page 65: AWS Activate webinar - Scalable databases for fast growing startups

How do I Save Money?

Page 66: AWS Activate webinar - Scalable databases for fast growing startups

Saving $$$

• Use managed database services

– Focus your limited resources on the application

– ElastiCache can reduce your database costs

• Understand how to scale from the start

– Save redesign work and unhappy customers

– Start and stop instances as required

• Use the AWS platform

– Don’t reinvent the wheel, concentrate on your core competency

– Using CloudFront will reduce your costs on EC2 dramatically

• Purchase RIs and use spot instances

• Constantly monitor and right-size your environment
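Whether a Reserved Instance (RI) pays off depends on how many hours the instance actually runs. The break-even arithmetic can be sketched as below; all prices are made-up placeholders rather than AWS prices, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760


def reserved_break_even_hours(upfront: float, reserved_hourly: float,
                              on_demand_hourly: float) -> float:
    """Running hours after which upfront + reserved rate beats the on-demand rate."""
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        raise ValueError("the reserved hourly rate must be below the on-demand rate")
    return upfront / saving_per_hour


# Placeholder prices, NOT real AWS pricing.
hours = reserved_break_even_hours(upfront=500.0, reserved_hourly=0.125,
                                  on_demand_hourly=0.25)
print(hours, hours / HOURS_PER_YEAR)
```

With these placeholder numbers the reservation breaks even after 4,000 running hours, a little under half a year of continuous use. An always-on database would clear that easily, while an instance that is stopped most of the time would not.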

Page 67: AWS Activate webinar - Scalable databases for fast growing startups

Sorry, How do I Scale my Database?

Page 68: AWS Activate webinar - Scalable databases for fast growing startups

Summary

• Decide on self-managed or managed database services

• Choose the right database for your use case and skillsets to start with

• Use Multi-AZ for your infrastructure

• Choose the right instance family and size for your workloads

• Understand the 3 types of EBS (Magnetic, General Purpose and PIOPS)

• Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, SQS, Amazon SES, etc.)

• Build in redundancy at every level

• Blend SQL & NoSQL wisely

• Use a data warehouse to offload large analytical queries from your main database

• Cache data both inside and outside your infrastructure

• Purchase RIs and use Spot instances

• Split tiers into individual services (SOA)

• Use autoscaling once you are ready for it

• Use automation tools in your infrastructure

• Make sure you have good metrics, monitoring, and logging tools in place

• Don’t reinvent the wheel

Page 69: AWS Activate webinar - Scalable databases for fast growing startups

Q & A