Scaling the Platform for your Startup. Andreas Chatzakis, AWS Solutions Architecture; Peter Mounce, Senior Software Developer at JUST EAT. 15th April 2015, AWS London Summit

Upload: amazon-web-services

Post on 15-Jul-2015

TRANSCRIPT

Page 1: Scaling the Platform for Your Startup

Scaling the Platform for your Startup

Andreas Chatzakis, AWS Solutions Architecture Peter Mounce, Senior Software Developer at JUST EAT

15th April 2015, AWS London Summit

Page 2: Scaling the Platform for Your Startup

Why are you here?

• Building the technology platform for your startup
• You want to prepare for success
• Learn about design patterns & scalability
• A pragmatic approach for startups

Page 3: Scaling the Platform for Your Startup

Priorities for startups

• Racing within a window of opportunity
• Small team with no legacy
• Focus on solving a problem
• Avoid over-engineering & re-engineering
• Reduce risk of failure when you go viral

Page 4: Scaling the Platform for Your Startup

A scalable architecture

• Can support growth in users, traffic, data size
• Without practical limits
• Without a drop in performance
• Seamlessly - just by adding more resources
• Efficiently - in terms of cost per user

Page 5: Scaling the Platform for Your Startup

Day 1 – Dev & private beta

Page 6: Scaling the Platform for Your Startup

Single host

THE server (e.g. Apache, MySQL)

Elastic IP www.example.com

Amazon Route 53 DNS service

Server Image (AMI)

Page 7: Scaling the Platform for Your Startup

Day 2 - Public beta

Page 8: Scaling the Platform for Your Startup

We need a bigger server

• Add larger & faster storage (EBS)
• Use the right instance type
• Easy to change instance sizes
• Not our long-term strategy
• Will hit a limit eventually
• No fault tolerance

Page 9: Scaling the Platform for Your Startup

Separating web and DB

• More capacity
• Scale each tier individually
• Tailor instance for each tier
  – Instance type
  – Storage
• Security
  – Security groups
  – DB in a private VPC subnet

Page 10: Scaling the Platform for Your Startup

But how do I choose what DB technology I need?

SQL? NoSQL?

Page 11: Scaling the Platform for Your Startup

Why start with a Relational DB?

• SQL is versatile & feature-rich
• Lots of existing code, tools, knowledge
• Clear patterns to scalability (for read-heavy apps)
• Reality: eventually you will have a polyglot data layer
  – There will be workloads where NoSQL is a better fit
  – Use the right tool for each workload

Page 12: Scaling the Platform for Your Startup

Key Insight: Relational Databases are Complex

•  Our experience running Amazon.com taught us that relational databases can be a pain to manage and operate with high availability

•  Poorly managed relational databases are a leading cause of lost sleep and downtime in the IT world!

•  Especially for startups with small teams

Page 13: Scaling the Platform for Your Startup

Amazon RDS

• Relational databases: MySQL, Aurora, PostgreSQL, Oracle, SQL Server
• Fully managed; zero admin

Page 14: Scaling the Platform for Your Startup

Improving efficiency

Page 15: Scaling the Platform for Your Startup

Offload static content

• Amazon S3: highly available hosting that scales
  – Static files (JavaScript, CSS, images)
  – User uploads
• S3 URLs – serve directly from S3
• Let the web server focus on dynamic content

Page 16: Scaling the Platform for Your Startup

Amazon CloudFront

• Worldwide network of edge locations
• Cache on the edge
  – Reduce latency
  – Reduce load on origin servers
  – Static and dynamic content
  – Even a few seconds of caching of popular content can have a huge impact
• Connection optimizations
  – Optimize transfer route
  – Reuse connections
  – Benefits even non-cacheable content

Page 17: Scaling the Platform for Your Startup

CloudFront for static & dynamic content

Amazon Route 53 → CloudFront distribution
• css/*, js/*, Images/* → S3 bucket (static content)
• Default (*) → EC2 instance(s) (dynamic content)
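The routing on this slide can be sketched in a few lines of Python. This is only an illustration of "first matching path pattern wins, otherwise fall back to the default behavior"; the origin labels `s3-bucket` and `ec2-instances` and the function `pick_origin` are hypothetical names, and the real matching is done by CloudFront itself, not by application code.

```python
from fnmatch import fnmatchcase

# Ordered path patterns taken from the slide; origin labels are hypothetical.
BEHAVIORS = [
    ("css/*", "s3-bucket"),
    ("js/*", "s3-bucket"),
    ("Images/*", "s3-bucket"),
]
DEFAULT_ORIGIN = "ec2-instances"  # stands in for the Default (*) behavior


def pick_origin(path: str) -> str:
    """Return the origin label that would serve the given request path."""
    key = path.lstrip("/")
    for pattern, origin in BEHAVIORS:
        # Case-sensitive match, so "images/x.png" does not match "Images/*".
        if fnmatchcase(key, pattern):
            return origin
    return DEFAULT_ORIGIN
```

For example, `pick_origin("/css/site.css")` selects the S3 bucket, while `pick_origin("/checkout")` falls through to the EC2 instances.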

Page 18: Scaling the Platform for Your Startup

Database caching

• Faster response from RAM
• Reduce load on database

Application server, with Amazon ElastiCache in front of the RDS database:
1. If data in cache, return result
2. If not in cache, read from DB
3. And store in cache
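The three numbered steps on this slide describe what is commonly called the cache-aside pattern. A minimal sketch follows, assuming an in-process `TTLCache` class as a stand-in for ElastiCache and a `fake_db` function as a stand-in for the RDS query; both are hypothetical and exist only to make the example runnable.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Tiny in-process stand-in for a cache such as ElastiCache (assumption)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._items[key] = (time.monotonic() + self.ttl, value)


def cached_read(cache: TTLCache, key: str, read_from_db: Callable[[str], str]) -> str:
    value = cache.get(key)        # 1. if data in cache, return result
    if value is not None:
        return value
    value = read_from_db(key)     # 2. if not in cache, read from DB
    cache.set(key, value)         # 3. and store in cache
    return value


db_calls: List[str] = []


def fake_db(key: str) -> str:
    """Hypothetical database read that records how often it is called."""
    db_calls.append(key)
    return f"row-for-{key}"


cache = TTLCache(ttl_seconds=60)
first = cached_read(cache, "user:1", fake_db)   # miss: goes to the database
second = cached_read(cache, "user:1", fake_db)  # hit: served from the cache
```

The second read never reaches the database, which is where the "reduce load on database" benefit comes from. The TTL bounds how stale a cached value can get.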

Page 19: Scaling the Platform for Your Startup

Amazon ElastiCache: in-memory cache

• Simple to deploy
• Managed
  – Automatically replaces failed nodes
  – Patch management
• Elastic
• Compatible

Page 20: Scaling the Platform for Your Startup

Day 3 – Paying customers

Page 21: Scaling the Platform for Your Startup

High Availability

Availability Zone a

RDS DB instance

Web server

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Amazon CloudFront

ElastiCache node 1

Page 22: Scaling the Platform for Your Startup

High Availability

Availability Zone a

RDS DB instance

Availability Zone b

Web server

Web server

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Amazon CloudFront

ElastiCache node 1

Page 23: Scaling the Platform for Your Startup

High Availability

Availability Zone a

RDS DB instance

Availability Zone b

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

S3 bucket for static assets

Amazon CloudFront

ElastiCache node 1

Page 24: Scaling the Platform for Your Startup

Elastic Load Balancing

• Managed Load Balancing Service
• Fault tolerant
• Health Checks
• Distributes traffic across AZs
• Elastic – automatically scales its capacity

Page 25: Scaling the Platform for Your Startup

High Availability

Availability Zone a

RDS DB instance

Availability Zone b

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

S3 bucket for static assets

ElastiCache node 1

Amazon CloudFront

Page 26: Scaling the Platform for Your Startup

High Availability

Availability Zone a

RDS DB instance

Availability Zone b

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

S3 bucket for static assets

ElastiCache node 1

Amazon CloudFront

Page 27: Scaling the Platform for Your Startup

Data layer HA

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

Page 28: Scaling the Platform for Your Startup

Data layer HA

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

ElastiCache node 2

Page 29: Scaling the Platform for Your Startup

User sessions

• Problem: often stored on local disk (not shared)
• Quick fix: ELB session stickiness
• Solution: DynamoDB

Elastic Load Balancing

Web server

Web server

Logged in Logged out
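A small sketch of why moving sessions out of the web server fixes the "logged in on one server, logged out on the other" problem. `SharedSessionStore` is an in-memory stand-in for a shared table such as one in DynamoDB, and `WebServer` is a hypothetical class; neither reflects a real SDK API.

```python
import uuid
from typing import Dict, Optional


class SharedSessionStore:
    """In-memory stand-in for a shared session table (assumption)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, str]] = {}

    def create(self, data: Dict[str, str]) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id: str) -> Optional[Dict[str, str]]:
        return self._sessions.get(session_id)


class WebServer:
    """Hypothetical stateless web server: it keeps no session data locally."""

    def __init__(self, name: str, store: SharedSessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        return self.store.create({"user": user})

    def whoami(self, session_id: str) -> str:
        session = self.store.load(session_id)
        return session["user"] if session else "logged out"


store = SharedSessionStore()
server_a = WebServer("a", store)
server_b = WebServer("b", store)

session_id = server_a.login("alice")   # the load balancer picks server a
seen_by_b = server_b.whoami(session_id)  # the next request lands on server b
```

Because both servers read the same store, the load balancer can send each request to either one without stickiness.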

Page 30: Scaling the Platform for Your Startup

Amazon DynamoDB

• Managed document and key-value store
• Simple to launch and scale
  – To millions of IOPS
  – Both reads and writes
• Consistent, fast performance
• Durable: perfect for storage of session data

https://github.com/aws/aws-dynamodb-session-tomcat

http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-dynamodb-session-handler.html

Page 31: Scaling the Platform for Your Startup

Day 4 – Let’s go viral!

Page 32: Scaling the Platform for Your Startup

Replace guesswork with elastic IT

Startups pre-AWS

Demand

Unhappy Customers

Waste $$$

Traditional

Capacity

Capacity

Demand

AWS Cloud

Page 33: Scaling the Platform for Your Startup

Scaling the web tier

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

ElastiCache node 2

Page 34: Scaling the Platform for Your Startup

Scaling the web tier

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

ElastiCache node 2

Web server

Web server

Page 35: Scaling the Platform for Your Startup

Scaling the web tier

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

Web server

Web server

RDS DB standby

ElastiCache node 2

Web server

Web server

Page 36: Scaling the Platform for Your Startup

Automatic resizing of compute clusters based on demand

Feature | Details
Control | Define minimum and maximum instance pool sizes and when scaling and cooldown occur.
Integrated with Amazon CloudWatch | Use metrics gathered by CloudWatch to drive scaling.
Instance types | Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.

aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c us-west-2b

Amazon CloudWatch → Trigger auto-scaling policy → Auto Scaling
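To make the "Control" row concrete, here is a hedged sketch of how minimum size, maximum size and cooldown interact when a metric triggers a scaling policy. It is a simplified model, not the Auto Scaling service's actual algorithm; `ScalingGroup`, `adjustment_for` and the CPU thresholds are all illustrative assumptions. The 4 and 200 limits mirror the CLI example on this slide.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Simplified model of a group with size limits and a cooldown."""

    min_size: int
    max_size: int
    desired: int
    cooldown_seconds: float
    last_scaled_at: float = float("-inf")

    def apply(self, adjustment: int, now: float) -> int:
        """Apply a scaling adjustment, honoring cooldown and size limits."""
        if now - self.last_scaled_at < self.cooldown_seconds:
            return self.desired  # still cooling down: ignore the trigger
        target = max(self.min_size, min(self.max_size, self.desired + adjustment))
        if target != self.desired:
            self.desired = target
            self.last_scaled_at = now
        return self.desired


def adjustment_for(cpu_percent: float) -> int:
    """Hypothetical policy; the thresholds are illustrative only."""
    if cpu_percent > 70:
        return 2
    if cpu_percent < 30:
        return -1
    return 0


group = ScalingGroup(min_size=4, max_size=200, desired=4, cooldown_seconds=300)
after_spike = group.apply(adjustment_for(85), now=0.0)      # scales out
during_cooldown = group.apply(adjustment_for(85), now=100.0)  # ignored
```

The cooldown stops a single sustained spike from adding capacity repeatedly before the first new instances have had time to take load.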

Page 37: Scaling the Platform for Your Startup
Page 38: Scaling the Platform for Your Startup
Page 39: Scaling the Platform for Your Startup

Prerequisite: decompose into small, loosely coupled, stateless building blocks

Page 40: Scaling the Platform for Your Startup

What does this mean in practice?

• Only store transient data on local disk
• Needs to persist beyond a single HTTP request?
  – Then store it elsewhere

• User uploads → Amazon S3
• User sessions → Amazon DynamoDB
• Application data → Amazon RDS

Page 41: Scaling the Platform for Your Startup

Having decomposed into small, loosely coupled, stateless building blocks, you can now scale out with ease.

Page 42: Scaling the Platform for Your Startup

Having decomposed into small, loosely coupled, stateless building blocks, we can also scale back with ease.

Page 43: Scaling the Platform for Your Startup

Take the shortcut

• While this architecture is simple, you still need to deal with:
  – Configuration details
  – Deploying code to multiple instances
  – Maintaining multiple environments (Dev, Test, Prod)
  – Maintaining different versions of the application
• Solution: Use AWS Elastic Beanstalk

Page 44: Scaling the Platform for Your Startup

AWS Elastic Beanstalk (EB)

• Easily deploy, monitor, and scale three-tier web applications and services.
• Infrastructure provisioned and managed by EB
• You maintain control.
• Preconfigured application containers
• Easily customizable.
• Support for these platforms:

Page 45: Scaling the Platform for Your Startup

Loose coupling with SQS

Tight coupling

• Place tasks into Amazon Simple Queue Service (SQS)
• SQS – buffer that protects backend systems
• Process asynchronously - at own pace
• Remove delay from latency-sensitive paths

Front End EC2 Instance → Put Message → SQS → Get Message → Back End EC2 Instance
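The put/get flow on this slide can be sketched with Python's standard `queue` and `threading` modules. The in-process queue is only a stand-in for SQS, and `front_end_handle` and `back_end_worker` are hypothetical names. The point is the shape of the interaction, not the SQS API.

```python
import queue
import threading
from typing import List

task_queue: queue.Queue = queue.Queue()  # stand-in for an SQS queue (assumption)
processed: List[str] = []


def front_end_handle(order_id: str) -> str:
    """Put a message and return at once; slow work stays off the request path."""
    task_queue.put(order_id)
    return "accepted"


def back_end_worker() -> None:
    """Get messages and process them at the worker's own pace."""
    while True:
        order_id = task_queue.get()
        if order_id is None:  # sentinel: stop the worker
            task_queue.task_done()
            break
        processed.append(f"processed {order_id}")
        task_queue.task_done()


worker = threading.Thread(target=back_end_worker, daemon=True)
worker.start()

responses = [front_end_handle(f"order-{n}") for n in range(3)]
task_queue.put(None)
task_queue.join()  # wait until the backlog has drained
```

The front end answers immediately whether or not the back end is keeping up. The queue absorbs the difference, which is what "buffer that protects backend systems" refers to.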

Page 46: Scaling the Platform for Your Startup

Day 5 – Add more features

Page 47: Scaling the Platform for Your Startup

Your Applications

• Mobile: Push Notifications, Mobile Analytics, Cognito, Cognito Sync
• Analytics: Kinesis, Data Pipeline, Redshift, EMR
• Enterprise Applications: WorkSpaces, WorkMail, WorkDocs
• Application: SQS, SWF, AppStream, Elastic Transcoder, SES, CloudSearch, SNS
• Deployment & Management: Elastic Beanstalk, OpsWorks, CloudFormation, CodeDeploy, CodePipeline, CodeCommit
• Security & Administration: CloudWatch, Config, CloudTrail, IAM, Directory, KMS
• Compute: EC2, ELB, Auto Scaling, Lambda, ECS
• Storage: EBS, S3, Glacier, CloudFront
• Database: DynamoDB, RDS, ElastiCache
• Network: VPC, Direct Connect, Route 53

AWS Global Infrastructure

Page 48: Scaling the Platform for Your Startup

AWS building blocks

Inherently Scalable & Highly Available (Automated)
• Elastic Load Balancing
• Amazon CloudFront
• Amazon Route 53
• Amazon S3
• Amazon SQS
• Amazon SES
• Amazon CloudSearch
• AWS Lambda
• …

Scalable & Highly Available (Configurable)
• Amazon DynamoDB
• Amazon Redshift
• Amazon RDS
• Amazon ElastiCache
• …

Scalable & Highly Available (With the right architecture)
• Amazon EC2
• Amazon VPC

Page 49: Scaling the Platform for Your Startup

Stay focused as you scale your team

On-Premise Infrastructure
• 30%: Your Business
• 70%: Managing All of the “Undifferentiated Heavy Lifting”

AWS Cloud-Based Infrastructure
• 70%: More Time to Focus on Your Business
• 30%: Configuring Your Cloud Assets

Page 50: Scaling the Platform for Your Startup

Day 6 – Growing fast

Page 51: Scaling the Platform for Your Startup

Scaling Relational DBs

• Increase RDS instance specs
  – Larger instance type
  – More storage / more PIOPS
• Read Replicas (Master – Slave)
  – Scale out beyond capacity of single DB instance
  – Available in Amazon RDS for MySQL, PostgreSQL and Amazon Aurora
  – Writes => master
  – Replication lag
  – Reads with tolerance to stale data => read replica (slave)
  – Reads with strong consistency requirements => master
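The routing rules in the read-replica bullets can be expressed as a small function. This is a sketch under assumptions: `ReplicaRouter` and the endpoint strings are hypothetical, and round-robin is just one simple way to spread reads across replicas.

```python
import itertools
from typing import List


class ReplicaRouter:
    """Choose a database endpoint per query, following the rules above."""

    def __init__(self, master: str, replicas: List[str]) -> None:
        self.master = master
        # Round-robin over replicas; None when there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, is_write: bool, needs_fresh_data: bool = False) -> str:
        # Writes and strongly consistent reads go to the master.
        if is_write or needs_fresh_data or self._replicas is None:
            return self.master
        # Reads that tolerate replication lag go to a read replica.
        return next(self._replicas)


router = ReplicaRouter("master", ["replica-1", "replica-2"])
write_target = router.endpoint_for(is_write=True)
fresh_read_target = router.endpoint_for(is_write=False, needs_fresh_data=True)
stale_ok_target = router.endpoint_for(is_write=False)
```

A typical case for `needs_fresh_data=True` is reading back something the same user has just written, where replication lag would otherwise show stale data.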

Page 52: Scaling the Platform for Your Startup

Scaling the DB

Web server

Web server

Web server

Web server

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

RDS DB standby

ElastiCache node 2

Page 53: Scaling the Platform for Your Startup

Scaling the DB

Web server

Web server

Web server

Web server

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

RDS DB standby

ElastiCache node 2

RDS read replica

Page 54: Scaling the Platform for Your Startup

Scaling the DB

Web server

Web server

Web server

Web server

Availability Zone a

RDS DB instance

ElastiCache node 1

Availability Zone b

S3 bucket for static assets

www.example.com

Amazon Route 53 DNS service

Elastic Load Balancing

RDS DB standby

ElastiCache node 2

RDS read replica

RDS read replica

Page 55: Scaling the Platform for Your Startup

What if your app is write-heavy?

Challenge: You will eventually hit the write throughput or storage limit of the master node

Solutions:
• Federation (splitting into multiple DBs based on function)
• Sharding (splitting one data set across multiple hosts)

Page 56: Scaling the Platform for Your Startup

Database federation

• Divide tables into smaller autonomous databases
• Harder to do cross-function queries
• Won’t help with single huge functions/tables

Example: Forums DB, Users DB, Products DB

Page 57: Scaling the Platform for Your Startup

Sharded horizontal scaling

• Store subset of rows into each database shard
• More complex at the application layer
• No practical limit on scalability
• Operational complexity

User | ShardID
002345 | A
002346 | B
002347 | C
002348 | B
002349 | A

Shards: A, B, C
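The User/ShardID mapping on this slide is a directory-based sharding scheme: the application looks up which shard holds a user's rows before it queries. A minimal sketch follows; the hostnames in `SHARD_ENDPOINTS` and the function `endpoint_for_user` are hypothetical, and a real deployment would keep the directory in a shared store rather than in code.

```python
from typing import Dict

# Directory mapping each user to its shard, copied from the slide.
SHARD_DIRECTORY: Dict[str, str] = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}

# Hypothetical database endpoints, one per shard.
SHARD_ENDPOINTS: Dict[str, str] = {
    "A": "shard-a.db.example.internal",
    "B": "shard-b.db.example.internal",
    "C": "shard-c.db.example.internal",
}


def endpoint_for_user(user_id: str) -> str:
    """Look up the shard for a user, then the endpoint for that shard."""
    shard_id = SHARD_DIRECTORY[user_id]  # raises KeyError for unknown users
    return SHARD_ENDPOINTS[shard_id]
```

This lookup step is the extra application-layer complexity the slide mentions. Queries that span users on different shards have to be fanned out and merged by the application.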

Page 58: Scaling the Platform for Your Startup

NoSQL data stores

• Trade query & integrity features of relational DBs for
  – More flexible data model
  – Horizontal scalability & predictable performance

DynamoDB: provisioned read/write performance per table

Page 59: Scaling the Platform for Your Startup

Massive and Seamless Scale

• Distributed system that can scale both reads and writes
  – Sharding + replicas
• Automatic partitioning:
  – Data set size growth
  – Provisioned capacity increases

Page 60: Scaling the Platform for Your Startup

Summary

Page 61: Scaling the Platform for Your Startup

Amazon Route 53 DNS service No limit

Availability Zone a

RDS DB instance

ElastiCache node 2

Availability Zone b

S3 bucket for static assets

www.example.com

Elastic Load Balancing

RDS DB standby

ElastiCache node 3

RDS read replica

RDS read replica

DynamoDB

RDS read replica

ElastiCache node 4

RDS read replica

ElastiCache node 1

CloudSearch Lambda SES SQS

Page 62: Scaling the Platform for Your Startup

A quick review

• Keep it simple and stateless
• Make use of managed self-scaling services
• Multi-AZ and Auto Scaling for your EC2 infrastructure
• Use the right DB for each workload
• Cache data at multiple levels
• Simplify operations with deployment tools

Page 63: Scaling the Platform for Your Startup

Next steps?

READ!
• aws.amazon.com/documentation
• aws.amazon.com/architecture
• aws.amazon.com/start-ups

ASK FOR HELP!
• forums.aws.amazon.com
• aws.amazon.com/support

Page 64: Scaling the Platform for Your Startup
Page 65: Scaling the Platform for Your Startup

Performance testing @ JUST EAT (Or: DoS yourself every night in production to prove you can take it)

@justeat_tech + @petemounce http://tech.just-eat.com

Page 66: Scaling the Platform for Your Startup

Please wait while I start my DoS attack... (Demo - start fake load, show dashboards)


Page 67: Scaling the Platform for Your Startup

The problem with performance tests & continuous delivery

● Don’t want to sacrifice continuous delivery & decoupled teams
● Don’t want performance to suffer

All the usual problems:
● Bottleneck through single environment
● Individual tests take too long


Page 68: Scaling the Platform for Your Startup

Why?

Continuously test
● performance
● capacity

If we find a problem Thursday night:
1. don’t run fake load over the weekend
2. enjoy weekend as normal
3. fix it next week at leisure


Page 69: Scaling the Platform for Your Startup

Gamble!

OH: “We deploy tens of small changes a day. I bet we won’t break production...”

OH: “Let’s just do it in production with fake traffic at the same time as customers!”


Page 70: Scaling the Platform for Your Startup

Not that much of a gamble, really

We have tight feedback loops at this point.

Engineers being on call

... highly invested in not regressing performance.


Page 71: Scaling the Platform for Your Startup

How?

Pick scenarios we care about

Pick data variations to exercise

Add header(s) to discriminate fake load vs customer load

Run it every night during peak time

If no alerts fire, we’re good
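The header-based discrimination step might look like the sketch below. The header name `X-Fake-Load` is purely hypothetical (the talk does not say which header or headers JUST EAT uses), as are the helper names. The idea is that every component can tell synthetic requests from customer requests and report them separately.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical header name; the talk does not name the real header(s).
FAKE_LOAD_HEADER = "X-Fake-Load"


def traffic_kind(headers: Dict[str, str]) -> str:
    """Classify a request as 'fake' or 'customer' from its headers."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    names = {name.lower() for name in headers}
    return "fake" if FAKE_LOAD_HEADER.lower() in names else "customer"


def count_by_kind(requests: Iterable[Dict[str, str]]) -> Counter:
    """Tally requests per kind, e.g. to keep business metrics free of fake load."""
    return Counter(traffic_kind(headers) for headers in requests)


sample = [
    {"Host": "www.example.com"},
    {"Host": "www.example.com", "x-fake-load": "nightly"},
    {"Host": "www.example.com", "X-Fake-Load": "nightly"},
]
counts = count_by_kind(sample)
```

With a tag like this, fake orders can be excluded from business reporting and downstream side effects while still exercising the same production code paths.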


Page 72: Scaling the Platform for Your Startup

What did we gain?

Continuous confidence in capacity


Page 73: Scaling the Platform for Your Startup

What did we gain?

Continuous confidence in dealing with spikes


Page 74: Scaling the Platform for Your Startup

What did we gain?

Performance as a 1st-class concern


Page 75: Scaling the Platform for Your Startup

What did we gain?

Tests become independent of environments’ data


Page 76: Scaling the Platform for Your Startup

(Remind me to stop my DoS attack now) (Demo - stop fake load, show dashboards)


Page 77: Scaling the Platform for Your Startup


Yes, we’re recruiting too. http://tech.just-eat.com/jobs