What an Enterprise Can Learn from Netflix, a Cloud-Native Company (ENT203) | AWS re:Invent 2013
DESCRIPTION
In moving its streaming product to the cloud, Netflix has been able to realize tremendous benefits in scalability, performance, and availability. The biggest benefit came from moving to a service-based architecture, which allowed engineering teams to accelerate their development cycle and innovate more quickly. However, cloud migration was a substantial effort. We mobilized resources across the company over several years, reorganized our engineering and operations teams, developed new security policies, migrated to the DevOps operations model, and even embraced a new product architecture. In this talk, we trace the evolution of the Netflix cloud model, both the successes and the challenges, and present them in a way that’s maximally useful to enterprises considering making the move to the cloud.
TRANSCRIPT
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
What an Enterprise Can Learn from Netflix,
a Cloud Native Company
Yury Izrailevsky, VP Cloud and Platform Engineering, Netflix
November 14, 2013
August 2008 Database Corruption
RDBMS
Performance Scalability Availability
Netflix Streaming Growth
• 5 billion quarterly streaming hours
• 40 million customers
• 41 countries
• 3 continents
100x growth since 2009
Netflix Cross-regional Cloud Architecture
Performance Scalability Availability
Cloud Too Expensive?
Netflix data center
87% cost reduction per streaming start
Cloud Efficiency Benefits
• Economy of scale
• Elasticity
[Chart: streaming growth, 1/4/2009 to 1/4/2013]
[Chart: cyclical daily streaming usage]
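The elasticity benefit can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the cosine-shaped demand curve, the peak and trough values, and the per-instance capacity are hypothetical and do not come from the talk. It compares the instance-hours needed when capacity is fixed at the daily peak with the instance-hours needed when capacity follows the cyclical load hour by hour.

```python
import math


def hourly_demand(peak: float, trough: float) -> list[float]:
    """Hypothetical cyclical daily load: a cosine wave that peaks at hour 20."""
    mid = (peak + trough) / 2
    amp = (peak - trough) / 2
    return [mid + amp * math.cos(2 * math.pi * (h - 20) / 24) for h in range(24)]


def instance_hours(demand: list[float], per_instance: float, elastic: bool) -> int:
    """Instance-hours needed to serve one day of the given hourly demand."""
    needed = [math.ceil(d / per_instance) for d in demand]
    if elastic:
        # Capacity is resized every hour to match the load.
        return sum(needed)
    # Capacity is provisioned for the peak hour and held all day.
    return max(needed) * len(needed)


# Illustrative numbers only (requests/sec and per-instance capacity are made up).
demand = hourly_demand(peak=1000.0, trough=200.0)
fixed = instance_hours(demand, per_instance=50.0, elastic=False)
elastic = instance_hours(demand, per_instance=50.0, elastic=True)
print(f"fixed: {fixed} instance-hours, elastic: {elastic} instance-hours")
```

The gap between the two totals grows with the ratio of peak to trough load, which is why a strongly cyclical workload is a good fit for elastic capacity.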
Performance Scalability Availability
A Truly Great Service…
Availability goal: 99.99%
30 secs/week at peak traffic
Has to Just Work!
Weekly Streaming Availability (13-week moving average)
[Chart: weekly availability, x-axis from 7/17/2011 to 10/13/2013]
12/24/2012 Elastic Load Balancing outage
Using AWS redundancy to build highly fault-tolerant architecture
Netflix Cloud Journey: Tough Decisions
• System rearchitecture
• New security model
• New operational model
• Organizational changes
Old Architecture: Consolidated Java App
Javaweb Javaweb Javaweb
… …
Cloud Native Service-based Architecture
Cascading Failures
[Diagram: API, Instant Queue, SimpleDB]
Cascading Failures
99% availability × 99% availability × … × 99% availability
= (99%)^500 ≈ 0.657%
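The arithmetic on this slide can be checked with a short sketch. It assumes a request succeeds only if all of its components succeed and that component failures are independent; the function name is hypothetical.

```python
def serial_availability(per_component: float, components: int) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return per_component ** components


# 500 components at 99% each, as on the slide.
result = serial_availability(0.99, 500)
print(f"{result:.3%}")  # 0.657%
```

Under these assumptions, even highly available components compound into a very unreliable whole, which motivates the strategies on the following slides.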
Cloud Native: Strategies to Improve Availability
• Graceful degradation
• Redundancy
Cloud Native: Graceful Degradation
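One common way to express graceful degradation in code is a fallback wrapper: if a dependency fails, the caller returns a simpler default response instead of propagating the error. The sketch below is a generic illustration, not Netflix's implementation; `with_fallback`, `personalized_rows`, and `popular_rows` are hypothetical names.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def with_fallback(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Return the primary result, or the fallback result if the primary raises."""
    try:
        return primary()
    except Exception:
        # Degrade instead of failing the whole request.
        return fallback()


def personalized_rows() -> list[str]:
    # Hypothetical dependency that is currently unavailable.
    raise TimeoutError("recommendation service unavailable")


def popular_rows() -> list[str]:
    # Hypothetical static default that needs no downstream call.
    return ["Popular on Netflix"]


rows = with_fallback(personalized_rows, popular_rows)
print(rows)
```

The user sees a less personalized page rather than an error, so a failure in one service does not cascade to its callers.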
Cloud Native: Redundancy
[Diagram: Zone A, Zone B, Zone C]
Redundancy across Availability Zones
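The effect of redundancy can be sketched as the mirror image of the cascading-failure calculation. The sketch assumes that zones fail independently and that any single surviving zone can carry the full load. Both are simplifying assumptions, and the 99% per-zone figure is illustrative rather than taken from the talk.

```python
def redundant_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one of several independent zones is up."""
    return 1 - (1 - per_zone) ** zones


# Illustrative only: three zones at a hypothetical 99% each.
three_zones = redundant_availability(0.99, 3)
print(f"{three_zones:.4%}")
```

Serial dependencies multiply availabilities, while redundant copies multiply failure probabilities, so redundancy pushes the combined figure up rather than down.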
Cloud Native Persistence
Relational RDBMS → NoSQL distributed databases
Testing Fault Tolerance: Simian Army
• Chaos Monkey
• Latency Monkey
• Chaos Gorilla
Open Source Portal at http://netflix.github.com
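The idea behind Chaos Monkey, randomly terminating instances to verify that the service survives, can be shown with a toy simulation. This is a sketch of the concept only, not the open-source tool: the instance IDs, the in-memory fleet, and the health rule are all hypothetical.

```python
import random


def chaos_monkey(instances: list[str], rng: random.Random) -> str:
    """Terminate (remove) one randomly chosen instance and return its ID."""
    victim = rng.choice(instances)
    instances.remove(victim)
    return victim


def service_is_healthy(instances: list[str], minimum: int) -> bool:
    """Hypothetical health rule: enough instances remain to serve traffic."""
    return len(instances) >= minimum


fleet = ["i-a1", "i-b2", "i-c3"]  # made-up instance IDs
killed = chaos_monkey(fleet, random.Random(0))
healthy = service_is_healthy(fleet, minimum=2)
print(f"terminated {killed}; healthy: {healthy}")
```

A real fault-injection run targets live infrastructure, but the check is the same: the service should stay healthy after any single instance disappears.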
Cloud Native Operations: DevOps
Netflix data center: central NOC team coordinates bi-weekly releases
Cloud: dev teams push production changes on own schedule; no central coordination
AMI-Based Cloud Deployments
• Bake new AMI for each app deployment
• Red-black deployments (old code, new code)
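A red-black deployment can be sketched as a small state change: start a cluster from the newly baked AMI, check it, then switch traffic while leaving the old cluster available for rollback. The sketch below is an in-memory illustration under those assumptions. The `Cluster` type, the `healthy` callback, and the AMI names are hypothetical, and no AWS API is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cluster:
    ami: str               # image the cluster was launched from
    serving: bool = False  # whether the cluster receives traffic


def red_black_deploy(
    old: Cluster, new_ami: str, healthy: Callable[[Cluster], bool]
) -> Cluster:
    """Launch a cluster from new_ami and shift traffic to it if it is healthy.

    Returns the cluster that is serving traffic after the attempt.
    """
    new = Cluster(ami=new_ami)
    if not healthy(new):
        # The new cluster never took traffic, so the old one keeps serving.
        return old
    new.serving = True
    old.serving = False  # kept running but idle, so rollback is a traffic switch
    return new


old = Cluster(ami="ami-old", serving=True)
live = red_black_deploy(old, "ami-new", healthy=lambda c: True)
print(live.ami, live.serving, old.serving)
```

Because the old cluster is disabled rather than destroyed, rolling back means re-enabling it instead of rebuilding it.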
Evolving a Cloud Native Organization
Data center → Cloud
• IT-Ops manages budget, capacity → Self-service provisioning by dev teams; visibility through tools
• Coordinated releases via centralized NOC → Distributed DevOps; SREs build tools, share best practices
• Oracle DBAs manage several databases → Java, DevOps engineers support dozens of Cassandra clusters
• Data science: analysts write SQL queries → Hadoop engineers build ETL using Pig/Python
Cloud Pilot Project: Jobs Page
Building a Great Streaming Product