Cloud Foundry Platform Operations - CF Summit 2015
TRANSCRIPT
© 2015 Pivotal Software, Inc. All rights reserved.
A Developer’s Perspective on Cloud Foundry Operations: One Month in the Trenches
CF Summit 2015
Cornelia Davis, Director, Platform Engineering, Cloud Foundry
@cdavisafc
May 2015
Goals
Principles
Deployments
Monitoring
Stories
Platform
Deployment Topology
Jumpbox: ssh, BOSH CLI
MicroBOSH: S3, RDS
Full BOSH (cluster): S3, RDS
cf-release (cluster)
diego (cluster)
Pivotal Web Services: S3, RDS
Route53: domains cfapps.io, pivotal.io, …
ELBs: certs *.cfapps.io, run.pivotal.io, …
Doing Deployments
Planned ones during working hours!
– We can because Cloud Foundry enables zero downtime
– Developers on hand
Different types:
– New Release
– Stemcell upgrade (e.g. Heartbleed)
– Manifest only: change the topology (e.g. increase the number of DEAs)
minutes
hours
The one about the incident…
Spike in traffic expected and app instances scaled
Increased loggregator traffic
– Logs being dropped
Scale number of loggregators
Manifest only deploy
bosh deploy
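As a rough illustration of why a manifest-only deploy is a small change, the sketch below models a deployment manifest as a plain Python dict, raises the instance count of one job, and lists what changed. The job names and counts are made up for the example. The real manifest is YAML generated from templates, and `bosh deploy` reports its own diff.

```python
import copy


def scale_job(manifest, job_name, instances):
    """Return a copy of the manifest with one job's instance count changed."""
    updated = copy.deepcopy(manifest)
    for job in updated["jobs"]:
        if job["name"] == job_name:
            job["instances"] = instances
            return updated
    raise KeyError(f"no job named {job_name!r} in manifest")


def instance_diff(old, new):
    """List (job, old_count, new_count) for every job whose count changed."""
    old_counts = {job["name"]: job["instances"] for job in old["jobs"]}
    return [
        (job["name"], old_counts.get(job["name"]), job["instances"])
        for job in new["jobs"]
        if old_counts.get(job["name"]) != job["instances"]
    ]


# Hypothetical manifest fragment; names and numbers are illustrative only.
manifest = {
    "jobs": [
        {"name": "loggregator_z1", "instances": 2},
        {"name": "runner_z1", "instances": 100},
    ]
}

scaled = scale_job(manifest, "loggregator_z1", 4)
for name, before, after in instance_diff(manifest, scaled):
    print(f"{name}: instances {before} -> {after}")
```

Only the changed job shows up in the diff, which is the property that makes this kind of deploy quick to review before it runs.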
Doing Deployments
We do them during working hours!
– We can because Cloud Foundry enables zero downtime
– Developers on hand
Different types:
– New Release
– Stemcell upgrade (e.g. Heartbleed)
– Manifest only: change the topology (e.g. increase the number of DEAs)
Release previously tested in a staging environment
– Separate env: BOSH, storage, etc.
– Shared package cache: compilation previously completed
– Same tests
minutes
hours
Pipelines - Example
CF Runtime
CF Services
Runtime: Dev → Mast; Services: Dev → Mast
Svc: Mast → CR; RT: Mast → CR
Dijon, Tabasco, A1, Prod
Services: Dev; Runtime: CR
Services: CR; Runtime: Dev
Services: Mast; Runtime: Mast
Services: Rel; Runtime: Rel
Svc: CR → Rel; RT: CR → Rel
Shared Package Cache
Deployment Checklist: New Release
Pre-deployment:
Socialization – talk to related groups
Get the latest from GitHub (put there by the dev team)
Generate release notes (yes, this is automated too!)
Generate final release
Deployment:
Log into jumpbox
git pull
Generate new deployment manifest (from templates and prod config)
Log into BOSH
Upload release
bosh deploy – verify the diffs BOSH reports
Checklist for each type of deployment
Post-deployment:
Publish final release (to GitHub)
Update checklists
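One way to keep a checklist per deployment type is to hold it as data rather than as a document. The sketch below is only an illustration of that idea, not the team's tooling: the `Checklist` class is hypothetical, and the step texts are taken from the slide above.

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:
    """An ordered checklist grouped into phases (hypothetical helper)."""
    name: str
    phases: dict  # phase name -> ordered list of step descriptions
    done: set = field(default_factory=set)

    def steps(self):
        """Yield (phase, step) pairs in the order they should be done."""
        for phase, items in self.phases.items():
            for item in items:
                yield phase, item

    def complete(self, step):
        """Mark a step as done; reject steps that are not on the list."""
        if step not in {s for _, s in self.steps()}:
            raise ValueError(f"unknown step: {step!r}")
        self.done.add(step)

    def next_step(self):
        """Return the first (phase, step) not yet done, or None."""
        for phase, item in self.steps():
            if item not in self.done:
                return phase, item
        return None


new_release = Checklist(
    name="New Release",
    phases={
        "Pre-deployment": [
            "Socialization",
            "Get the latest from GitHub",
            "Generate release notes",
            "Generate final release",
        ],
        "Deployment": [
            "Log into jumpbox",
            "git pull",
            "Generate new deployment manifest",
            "Log into BOSH",
            "Upload release",
            "bosh deploy",
        ],
        "Post-deployment": [
            "Publish final release",
            "Update checklists",
        ],
    },
)

new_release.complete("Socialization")
print(new_release.next_step())
```

A stemcell upgrade or a manifest-only deploy would be a second and third instance with their own step lists.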
Monitoring
Cloudops Dashboard (Datadog)
Lamb Dashboard (Datadog)
Pivotal Web Services
Cloudops Logstash
Lamb Logstash
system metrics
Collector
Syslog config
system logs
The one about continuous delivery…
Dev Space
mock
customerDB
env config
Staging Space
customerDB
env config
Prod Space
customerDB
env config
Staging Prod
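The slide shows the same application in three spaces, each with its own customerDB and env config. Cloud Foundry hands bound service credentials to an application through the `VCAP_SERVICES` environment variable, so one artifact can run unchanged in every space. The sketch below assumes the service instance is named `customerDB` as on the slide; the helper name and the URIs are invented for illustration.

```python
import json
import os


def service_credentials(service_name, environ=os.environ):
    """Return the credentials of the bound service with the given name."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps a service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == service_name:
                return instance.get("credentials", {})
    raise LookupError(f"no bound service named {service_name!r}")


def fake_space_env(uri):
    """Build a stand-in environment for one space (illustrative values)."""
    binding = {"name": "customerDB", "credentials": {"uri": uri}}
    return {"VCAP_SERVICES": json.dumps({"user-provided": [binding]})}


# Hypothetical per-space environments; the application code stays the same.
dev_env = fake_space_env("mock://customers")
staging_env = fake_space_env("postgres://staging.example.test/customers")

print(service_credentials("customerDB", dev_env)["uri"])
print(service_credentials("customerDB", staging_env)["uri"])
```

Promoting a build from staging to prod then changes only what the space injects, not the code that reads it.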
Monitoring
Cloudops Dashboard (Datadog)
Lamb Dashboard (Datadog)
Pivotal Web Services
PagerDuty
Cloudops Logstash
Lamb Logstash
system metrics
Collector
Syslog config
system logs
Monitoring
Cloudops Dashboard (Datadog)
Lamb Dashboard (Datadog)
Pivotal Web Services
PagerDuty
Slack & Slackbots
Cloudops Logstash
Lamb Logstash
system metrics
Collector
Syslog config
system logs
Status Page
Smoke Tests
Pingdom
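The slides do not say what the smoke tests check, so the following is only a generic sketch of an outside-in check: request a few endpoints and report the ones that do not answer with HTTP 200. The URLs are placeholders, and the check function is injectable so the example runs without network access. Feeding failures into alerting is not modeled here.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Fetch a URL and return its HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def run_smoke_tests(urls, fetch=http_status):
    """Return the URLs whose check did not come back with HTTP 200."""
    return [url for url in urls if fetch(url) != 200]


# Canned responses stand in for real endpoints (placeholder URLs).
fake_responses = {
    "https://app.example.test/health": 200,
    "https://api.example.test/info": 503,
}
failures = run_smoke_tests(fake_responses, fetch=fake_responses.get)
print(failures)
```

Against a live platform the default `http_status` would be used in place of the canned lookup.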
What About App Ops? Different Team
Database
Web Server
Messaging
Your Application Code
PAAS
Virtualized Infrastructure
PAAS
Platform Operations
• Deploys platform
• Makes standard runtimes and services available
• Monitors platform
• Scales platform (ensuring sufficient capacity)
• Upgrades platform with zero downtime
Application Developers
• Creates deployable artifact
Application Operations
• Configures Prod space
• Deploys application to Prod
• Monitors application
• Scales application (capacity)
• Deploys new app version with zero downtime
The one about ephemeral ports…
…
Started updating job runner_z1 > runner_z1/91. Done (00:01:06)
Started updating job runner_z1 > runner_z1/92. Done (00:01:06)
Started updating job runner_z1 > runner_z1/93. Done (00:01:06)
Started updating job runner_z1 > runner_z1/94. Done (00:01:06)
Started updating job runner_z1 > runner_z1/95. Done (00:01:06)
Started updating job runner_z1 > runner_z1/96. Failed
Ephemeral Port Range: 32768 to 61000
Punchline: A great part of the value the platform brings is shielding app teams (dev and ops) from obscure, low-level details that have nothing to do with satisfying business needs!
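The transcript does not spell out what failed on runner_z1/96. As general background, the operating system picks source ports for outgoing connections from the ephemeral range, so a service configured to listen on a fixed port inside that range can find the port already taken. The sketch below works with the range quoted on the slide. The example ports are arbitrary, and the `/proc` path applies to Linux only.

```python
EPHEMERAL_LOW, EPHEMERAL_HIGH = 32768, 61000  # range quoted on the slide


def ephemeral_port_count(low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


def in_ephemeral_range(port, low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
    """True if a fixed listening port could collide with an ephemeral one."""
    return low <= port <= high


def read_linux_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the configured range on a Linux host as (low, high)."""
    with open(path) as handle:
        low, high = handle.read().split()
    return int(low), int(high)


print(ephemeral_port_count())  # 28233 ports in the range

# Arbitrary example ports, not taken from the talk.
for port in (8080, 34567):
    verdict = "inside" if in_ephemeral_range(port) else "outside"
    print(f"port {port} is {verdict} the ephemeral range")
```

This is the kind of low-level detail the punchline argues application teams should never have to think about.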
Summary
It was an AWESOME month!
I still lack “traditional ops” experience… I’m glad
Operational enablers in the platform:
– BOSH!!
– Immutable infrastructure
– Infrastructure as code - everything checked into GitHub!
– Firehose (instrumentation of the platform!!)
– Canaries and rolling upgrades
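To make the last enabler concrete, here is a simplified simulation of a canary followed by a rolling upgrade. The parameter names `canaries` and `max_in_flight` are borrowed from the update settings of a BOSH manifest, but the logic is only a sketch: it updates instances one at a time and stops at the first failure, whereas BOSH updates each batch in parallel. The instance names echo the deploy log in the ephemeral ports story, and the failing instance is hard-coded for the example.

```python
def rolling_update(instances, update, canaries=1, max_in_flight=2):
    """Update canaries first, then the rest in batches; stop on failure.

    Returns (updated, failed), where failed is None if all succeeded.
    """
    rest = instances[canaries:]
    batches = [instances[:canaries]]
    batches += [
        rest[i:i + max_in_flight] for i in range(0, len(rest), max_in_flight)
    ]
    updated = []
    for batch in batches:
        for instance in batch:
            if not update(instance):
                # Stop here so the remaining instances keep the old version.
                return updated, instance
            updated.append(instance)
    return updated, None


instances = [f"runner_z1/{i}" for i in range(91, 98)]


def pretend_update(instance):
    """Stand-in for a real update; one instance is made to fail."""
    return instance != "runner_z1/96"


updated, failed = rolling_update(instances, pretend_update)
print("updated:", updated)
print("failed:", failed)
```

Because the rollout halts at the failing instance, runner_z1/97 is never touched and keeps serving on the previous version.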
Blogs
Revelations From the Field – Life in the Operations Team (the one about the loggregator scaling) http://blog.pivotal.io/cloud-foundry-pivotal/p-o-v/revelations-from-the-field-life-in-the-operations-team
Is Continuous Delivery a First Class Concern of Your Platform? (the one about dashboards) http://blog.pivotal.io/cloud-foundry-pivotal/products/is-continuous-delivery-a-first-class-concern-of-your-platform
Cloud Foundry Ops: Ephemeral Ports and the Value of a Platform as a Service (the one about, well, ephemeral ports) http://blog.pivotal.io/cloud-foundry-pivotal/p-o-v/cloud-foundry-ops-ephemeral-ports-and-the-value-of-a-platform-as-a-service
Thank You
@cdavisafc