Tweet @garethbowles with feedback!
Self-Service Build & Deployment @Netflix
Monday, August 5, 13
• How would your organization be different if all of your engineers could build, test and deploy their own code ...
• ... and were responsible for fixing what they broke at 3am?
Gareth Bowles
Netflix is the world’s leading Internet television network with more than 36 million members in 40 countries enjoying more than one billion hours of TV
shows and movies per month, including original series.
Source: http://ir.netflix.com
The Challenge
• We need to innovate rapidly, driven by:
• Global competition
• New connected devices
• Continuous customer feedback
• And we need fast rollback
The Challenge
• We need to scale to cope with:
• Growing customer base
• Peaks in demand:
• Special events: holidays, Oscars
• Daily fluctuations (weekdays vs. weekends, daytime vs. evening)
Things That Help
• We can push out updates whenever we like
• Company culture
Things That Got in Our Way
A Few Short Years Ago ...
• Monolithic web app
• Single points of failure
• Releases were done by following runbooks
• DC-based infrastructure
• Different teams used different tools
Meeting the Challenge
http://www.slideshare.net/reed2001/culture-1798664
Freedom and Responsibility
• Hire mature people who work well with others
• Give them the context for company success
• Then get out of their way
• But hold them responsible for results
Context, not Control
• Be transparent about what the company needs to succeed
• Minimize the processes people need to go through to achieve success
• Value results, not planning and process
Highly Aligned, Loosely Coupled
• Clear strategy and goals
• Team interactions focus on strategy, not tactics
• Minimal cross-functional meetings
• Occasional post-mortems to increase alignment
What This Helped Us Achieve
• DVD to Streaming
• DC to cloud
• US-only to 40-plus countries
Architecture
Credit: Steve Somers
Key Changes
• Service oriented architecture
• Many small teams, each providing their own interconnected service
• Deploy on Amazon Web Services
• Increased reliance on open source
Highly aligned, loosely coupled
• Services are built by different teams who work together to figure out what each service will provide.
• The service owner publishes an API that anyone can use.
What AWS Provides
• Machine Images (AMI)
• Instances (EC2)
• Elastic IPs
• Load Balancers
• Security groups / Autoscaling groups
Freedom and Responsibility
• Developers deploy when they want
• They also manage their own capacity and autoscaling
• And fix anything that breaks at 3am!
[Diagram: the Netflix API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]
• 2B requests per day into the Netflix API
• 12B outbound requests per day to API dependencies
Build and Deployment
The Audience
• ~700 engineers
• Large majority are developers
• Test engineers
• Delivery teams
• Operations & reliability engineering
Our Goal
• Lower the barriers to build, test and deployment until the entire process is accessible to every developer.
The Team
• 11 engineers and 1 director (but we’re hiring!)
• Developers, build / release engineers, DevOps
• Specialize, but understand the full stack
• Service oriented
Self-Service Build & Deployment
• Channel best practices
• Promote, don’t dictate
• Make adoption easy
• Make tools flexible
Building and Deploying
[Diagram: build and deployment pipeline. Labels: Perforce / Git (source); Ivy (libraries); Ant targets; "Groovy all over"; Jenkins (sync, resolve, compile, build, test, report, publish); snapshot / release libraries / apps; Artifactory; yum (rpms); Aminator; Asgard]
Is That Really Self-Service?
Common Build Framework
• Define a build with just a few lines of Ant code
• Templates for libraries and webapps
• Override standard targets if you need to
Jenkins Job DSL
• Define Jenkins build jobs using a domain-specific language (based on Groovy)
• Loop to create multiple jobs (e.g. for building different branches)
• Make one change and rerun to update all jobs
• The code is the configuration
• https://wiki.jenkins-ci.org/display/JENKINS/Job
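The "loop to create multiple jobs" idea can be sketched in a few lines. This is a Python illustration of the concept only, not the Job DSL itself (which is Groovy-based); the `make_job` and `make_jobs` helpers, the job fields, the build command, the branch names and the repository URL are all hypothetical.

```python
# Hypothetical sketch: job definitions generated from code, one per branch.
# It mimics the idea behind the Jenkins Job DSL; it is not the plugin's API.

def make_job(project, branch, repo_url):
    """Return a plain dict describing a build job for one branch."""
    return {
        "name": f"{project}-{branch}",
        "scm": {"url": repo_url, "branch": branch},
        "steps": ["ant clean build test"],  # assumed build command
    }

def make_jobs(project, branches, repo_url):
    """Loop over branches so that one code change updates every job."""
    return [make_job(project, branch, repo_url) for branch in branches]

jobs = make_jobs(
    "myservice",
    ["master", "release", "dev"],
    "ssh://git.example.com/myservice.git",  # placeholder URL
)
for job in jobs:
    print(job["name"], "->", job["scm"]["branch"])
```

Changing the build command in `make_job` and rerunning regenerates all three definitions, which is the sense in which the code is the configuration.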
Jenkins Dynaslaves
• Create build slaves in AWS
• Dedicated slave pools for teams
• Scale slave pools up and down on demand
• https://github.com/Netflix-Skunkworks/dynaslave-plugin
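Scaling a pool up and down on demand comes down to a sizing rule. The sketch below shows one possible rule in Python; it is an assumption for illustration, not the plugin's actual logic, and the function name, parameters and default bounds are hypothetical.

```python
# Hypothetical sizing rule for a team's build pool; not taken from the plugin.

def desired_pool_size(queued_builds, busy_slaves, min_size=1, max_size=20):
    """Size the pool to cover running plus waiting builds, within bounds."""
    wanted = busy_slaves + queued_builds
    return max(min_size, min(max_size, wanted))

print(desired_pool_size(queued_builds=5, busy_slaves=3))    # scale up
print(desired_pool_size(queued_builds=0, busy_slaves=0))    # scale down to the floor
print(desired_pool_size(queued_builds=50, busy_slaves=10))  # capped at the ceiling
```

The floor keeps a warm machine available for the next build, and the ceiling bounds cost; both values would be tuned per team.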
From Build to Deployment
Aminator
• Create (“bake”) AMIs
• Image contains a service and everything needed to run it
• Can be automatically triggered as a build step
• https://github.com/Netflix/aminator
How Baking is Different
Traditional (Generic AMI -> Instance):
• launch OS
• install packages
• install app
Netflix (App AMI -> Instance):
• launch OS + app
https://github.com/Netflix/aminator
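The contrast between the two launch styles can be modeled in a few lines of Python. This is a toy model, not Aminator's interface: the `bake` and `launch_baked` functions, the image fields and the package names are hypothetical.

```python
# Hypothetical model of the two launch styles; names and values are illustrative.
TRADITIONAL_STEPS = ["launch OS", "install packages", "install app"]

def bake(base_ami, packages, app):
    """Build time: combine the base image, packages and app into one image."""
    return {"base": base_ami, "packages": tuple(packages), "app": app}

def launch_baked(app_ami):
    """Launch time: the image already holds everything, so one step remains."""
    return {"ami": app_ami, "launch_steps": ["launch OS + app"]}

app_ami = bake("base-ami", ["jdk", "tomcat"], "myservice.war")
first = launch_baked(app_ami)
second = launch_baked(app_ami)

# Both instances come from the same image, so they hold identical software.
print(first["ami"] == second["ami"])
print(len(TRADITIONAL_STEPS), "steps vs", len(first["launch_steps"]))
```

Moving the install work to build time means nothing is installed when an instance starts, so every instance launched from the image carries the same software.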
[Diagram: contents of the baked AMI]
• Linux Base AMI (CentOS or Ubuntu)
• Java (JDK 6 or 7)
• Tomcat
• Optional Apache
• Monitoring
• Log Rotation to S3
• Appdynamics Machine Agent
• Appdynamics App Agent monitoring
• Application war file, base servlet, platform, interface jars for dependent services
• GC and thread dump logging
• Healthcheck, status servlets, JMX interface, Servo autoscale
At Netflix, the AMI is the unit of deployment.
Asgard
• Web UI and REST API for service deployment and management
• Manage ASGs, ELBs, security groups, ...
• Application -> cluster -> ASG
• Rapid deployment and rollback
• Available to all engineers
• https://github.com/Netflix/asgard
Red / Black Deployment
Netflix has moved the granularity from the
instance to the cluster.
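A cluster-level deployment with fast rollback can be sketched as follows. This is a minimal Python model, assuming red/black means bringing up a new ASG next to the old one and switching traffic between them; the `Cluster` class, its fields, the version-suffix naming and the AMI ids are hypothetical and not Asgard's API.

```python
# Hypothetical model of red/black deployment at cluster granularity.

class Cluster:
    """A cluster holds ASGs; each ASG runs one AMI version (assumed model)."""

    def __init__(self, name):
        self.name = name
        self.asgs = []  # each entry: {"name", "ami", "traffic"}

    def deploy(self, ami):
        """Start a new ASG, shift traffic to it, and keep the old one running."""
        new_asg = {
            "name": f"{self.name}-v{len(self.asgs):03d}",
            "ami": ami,
            "traffic": True,
        }
        for asg in self.asgs:
            asg["traffic"] = False  # old ASG stays up but takes no traffic
        self.asgs.append(new_asg)
        return new_asg

    def rollback(self):
        """Send traffic back to the previous ASG and disable the newest."""
        if len(self.asgs) < 2:
            raise RuntimeError("no previous ASG to roll back to")
        self.asgs[-1]["traffic"] = False
        self.asgs[-2]["traffic"] = True
        return self.asgs[-2]

cluster = Cluster("myservice")
cluster.deploy("ami-old")
cluster.deploy("ami-new")
cluster.rollback()
active = [asg["ami"] for asg in cluster.asgs if asg["traffic"]]
print(active)
```

Because the previous ASG is never torn down during the deploy, rollback is a traffic switch rather than a rebuild, which is what makes it fast.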
Simple Service Setup Effort
• Write the code (variable :-))
• 15 minutes to write a build file and define dependencies
• 15 mins to create a Jenkins build, 2 to 10 mins to run it
• 5 mins to bake an AMI
• 10 mins to deploy in test, another 10 for prod
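Adding up the figures above gives the total effort excluding the time to write the code. The short Python tally below uses only the numbers from the list; the step labels are paraphrased.

```python
# Tally of the setup steps listed above, as (label, min minutes, max minutes).
steps = [
    ("write build file and define dependencies", 15, 15),
    ("create Jenkins build", 15, 15),
    ("run Jenkins build", 2, 10),
    ("bake AMI", 5, 5),
    ("deploy in test", 10, 10),
    ("deploy in prod", 10, 10),
]

low = sum(lo for _, lo, _ in steps)
high = sum(hi for _, _, hi in steps)
print(f"{low} to {high} minutes")  # 57 to 65 minutes
```

So a simple service can go from a finished code change to production in roughly an hour.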
Just a quick reminder...
(Some of) Netflix is open source:
https://github.com/netflix
Why We Open Source
• Give back to Apache license OSS community
• Motivate, retain, hire top engineers
• Benefit from a shared ecosystem
• Make Netflix solutions into common standards
The Netflix Platform
• Discovery (Eureka)
• Entrypoints (Edda)
• Configuration (Archaius)
• Zookeeper (Exhibitor)
• Logging (Blitz4j & Honu)
• NIWS (Ribbon)
• GeoBase
• Circuit Breakers (Hystrix)
• Cassandra (Priam & Astyanax & CassJMeter)
• Cryptex
• AKMS
• EvCache
• Zuul
• i18n
• L10n
Open Source
https://github.com/Netflix/Cloud-Prize/wiki
Thank You!
Email: gbowles@{gmail,netflix}.com
Twitter: @garethbowles
Linkedin: www.linkedin.com/in/garethbowles