  • DevOps and Performance: Why, How & Best Practices (@grabnerandi)
  • The stuff we did when we were a start-up and we were all Devs, Testers and Ops
  • YOU ARE NOT ALONE: Popularity on Google
  • Who is doing it? How many successful deployments can they do? 300 deployments / year; 50-60 deployments / day; 10+ deployments / day; one deployment every 11.6 seconds
  • More on Amazon's story: 75% fewer outages since 2006; 90% fewer outage minutes; ~0.001% of deployments cause a problem; instantaneous automatic rollback; deploying every 11.6 s
  • Testing is Important and gives Confidence
  • But are we ready for the real world?
  • Measure performance during the game, minute 1 - 5: Ball possession 40 : 60, Fouls 0 : 0, Score 0 : 0
  • Measure performance during the game, minute 6 - 35: Ball possession 80 : 20, Fouls 2 : 12, Score 0 : 0
  • Not always a happy ending, minute 90: Ball possession 80 : 20, Fouls 4 : 25, Score 3 : 0
  • How does that relate to Software?
  • From Deploy to Deploy (timeline): Promotion/Event, Problems, Ops Playbook, War Room
  • The War Room back then: "Houston, we have a problem" (NASA Mission Control Center, Apollo 13, 1970)
  • The War Room now: Facebook, December 2012
  • 3 situations: WHY this happens and HOW to avoid it
  • #Disconnected Teams
  • Teamwork between Dev and Ops: SEV1 problem in production. Need access to log files. Where are they? Can't get them. Need to increase log level. Can't do! Can't change config files in prod!
  • Solution: Implement a Custom On Demand Remote Logger
  • Implementation and Rollout: Implemented custom logger. Worked well in load testing.
  • What happened? ~1 million lock exceptions in 30 minutes
  • Root Cause: A special WebSphere setting! Log Service provides a synchronized log file across ALL JVMs
  • Metrics: # Log Messages, # Exceptions. Share: Same Server Settings. DevOps: Agree on Data for Troubleshooting.
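The two metrics named on this slide, plus the on-demand log level change that Dev asked for, can be illustrated with a minimal Python sketch. This is not the custom logger from the talk (which ran on WebSphere); the logger name `shop.checkout`, the `CountingHandler` class and the `set_log_level` helper are hypothetical names chosen for illustration.

```python
import logging
from collections import Counter


class CountingHandler(logging.Handler):
    """Counts log records per level and records that carry exception info."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.per_level = Counter()
        self.exceptions = 0

    def emit(self, record):
        self.per_level[record.levelname] += 1
        if record.exc_info:
            self.exceptions += 1


def set_log_level(logger_name, level_name):
    """Change a logger's level at runtime (hypothetical on-demand switch)."""
    level = getattr(logging, level_name.upper())
    logging.getLogger(logger_name).setLevel(level)


logger = logging.getLogger("shop.checkout")  # hypothetical logger name
logger.propagate = False
counter = CountingHandler()
logger.addHandler(counter)

set_log_level("shop.checkout", "warning")
logger.debug("cart loaded")  # suppressed while the level is WARNING
set_log_level("shop.checkout", "debug")
logger.debug("cart loaded")  # counted after raising verbosity on demand
try:
    raise TimeoutError("lock wait timed out")
except TimeoutError:
    logger.exception("could not write log entry")  # logged at ERROR with exc_info

print(dict(counter.per_level), counter.exceptions)
```

Tracking these counters in load tests and in production gives both teams the same numbers to look at, provided the test servers run with the same settings as production.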
  • #No Agile Deployment
  • Ad on air: Load spike resulted in unavailability
  • Alternative: GoDaddy goes DevOps. 1h before Super Bowl kickoff vs. 1h after the game ended
  • Behind the Scenes
  • Metrics: Availability, Page Size, # Objects, # Hosts, # Connections. DevOps: Feature Switches.
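A feature switch lets Ops turn off heavy page elements shortly before an expected load spike and turn them back on afterwards, without a new deployment. The following is a minimal sketch of the idea only; the `FeatureSwitches` class, the flag names and the page parts are hypothetical and do not describe GoDaddy's actual setup.

```python
class FeatureSwitches:
    """In-memory feature switches; unknown switches default to off."""

    def __init__(self, defaults):
        self._flags = dict(defaults)

    def is_on(self, name):
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = enabled


def render_landing_page(switches):
    """Return the list of page parts to render (hypothetical part names)."""
    parts = ["headline", "signup_form"]
    if switches.is_on("hero_video"):
        parts.append("hero_video")
    if switches.is_on("third_party_widgets"):
        parts.append("third_party_widgets")
    return parts


switches = FeatureSwitches({"hero_video": True, "third_party_widgets": True})
full_page = render_landing_page(switches)

# Before the spike: switch off the heavy parts, no redeploy needed.
switches.set("hero_video", False)
switches.set("third_party_widgets", False)
light_page = render_landing_page(switches)

print(full_page)
print(light_page)
```

In a real system the flags would live in shared configuration that can be changed at runtime, so the lighter page reduces page size, object count and connections for every user at once.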
  • #Push without a Plan
  • Mobile Landing Page of Super Bowl Ad: 434 resources in total on that page, including 230 JPEGs, 75 PNGs and 50 GIFs. Total size of ~20 MB.
  • Redirects: ALL CSS and JS files are redirected to the www domain. This is a lot of time wasted, especially on high-latency mobile connections.
  • Critical Pages not Optimized! Browse, Search and Product Info perform well. Critical pages such as Shopping Cart are very slow because they don't follow best practices: 87 requests, 28 redirects.
  • Metrics: Load Time, # Resources (Images, ...), # HTTP 3xx, 4xx, 5xx. Dev: Build for Mobile. Test: Test on Mobile. Ops: Monitor Mobile.
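The resource and HTTP status metrics from this slide can be computed from any list of recorded page requests. The sketch below assumes a simple record format (a dict with `url`, `status`, `content_type` and `bytes`); that format, the `summarize_page` function and the sample data are all hypothetical and stand in for whatever a browser or monitoring tool exports.

```python
from collections import Counter


def summarize_page(requests):
    """Summarize resource count, types, status classes and total size."""
    by_type = Counter(r["content_type"] for r in requests)
    by_status_class = Counter(f"{r['status'] // 100}xx" for r in requests)
    return {
        "resources": len(requests),
        "by_type": dict(by_type),
        "by_status_class": dict(by_status_class),
        "total_bytes": sum(r["bytes"] for r in requests),
    }


# Hypothetical sample: one redirected CSS file and one missing image.
sample = [
    {"url": "/", "status": 200, "content_type": "text/html", "bytes": 12000},
    {"url": "/app.css", "status": 301, "content_type": "text/html", "bytes": 0},
    {"url": "/www/app.css", "status": 200, "content_type": "text/css", "bytes": 8000},
    {"url": "/hero.jpg", "status": 200, "content_type": "image/jpeg", "bytes": 450000},
    {"url": "/missing.png", "status": 404, "content_type": "text/html", "bytes": 300},
]

summary = summarize_page(sample)
print(summary)
```

Every 3xx entry is an extra round trip, which is why the redirect count matters most on high-latency mobile connections.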
  • # of Requests / User, # of Log Messages, # of Exceptions, # Objects Allocated, # Objects In Cache, Cache Hit Ratio, # of Images, # of SQLs, # SQLs per Request, Availability, # HTTP 3xx, 4xx, Page Size
  • Pipeline: Commit Stage (Compile, Execute Unit Test, Code Analysis, Build installers) -> Automated Acceptance Testing -> Automated Capacity Testing -> Manual testing (Key showcases, Exploratory testing) -> Release. Tests along the pipeline: Unit & Integration Tests, Functional Tests, Performance Tests, Production Monitoring, Functional Tests
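One way to use these metrics inside the pipeline is a simple gate that compares the current build against a baseline and stops promotion when a metric regresses. The sketch below is an assumption about how such a gate could look, not a tool from the talk: the `check_regressions` function, the metric names, the values and the 10% tolerance are all hypothetical, and it only handles metrics where lower is better.

```python
def check_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value exceeds baseline by more than tolerance.

    Assumes every metric is "lower is better" (e.g. # SQLs per request).
    """
    failures = {}
    for name, base in baseline.items():
        value = current.get(name)
        if value is None:
            continue  # metric not captured in this build
        if value > base * (1 + tolerance):
            failures[name] = (base, value)
    return failures


# Hypothetical numbers captured by automated tests for two builds.
baseline = {"sqls_per_request": 4, "exceptions": 0, "page_size_kb": 350}
current = {"sqls_per_request": 12, "exceptions": 0, "page_size_kb": 360}

failures = check_regressions(baseline, current)
print(failures)
```

A non-empty result would fail the stage, so a build that triples the SQL statements per request never reaches capacity testing or production.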
