StackStorm DevOps Automation Webinar
Interactive webinar slides - July 22, 2014. Presenters: Evan Powell & Patrick Hoolboom
www.stackstorm.com
@Stack_Storm
July 2014, CONFIDENTIAL
Vision, Common Operational Patterns, and a Little About Our Approach
Vision → Specific Patterns
7/21/14 © 2014 StackStorm, Inc. Confidential
We’ve been chatting with you – what have we learned?
• Let’s talk operational patterns.
• A little on monitoring.
• A lot on operations automation.
We want to learn more today!
What are we building?
Market
“Software is eating everything.”
{ developed and operated in the DevOps way }
StackStorm does DevOps operations automation
StackStorm Presenters
Patrick Hoolboom, Unicorn Stormer
• 7+ years building and running high-pace DevOps environment (2 years at Cloudmark)
• Puppet, Chef, Nagios, NewRelic, Logstash and more
• “Open source your process” – let’s share operational templates
Evan Powell, Co-Founder & CEO
• 15+ years in infrastructure software
• Founding CEO of Clarus Systems, acquired by OPNT
• Founding CEO of Nexenta Systems
• Defined and led OpenStorage and SDS space
• Led Nexenta to 5k+ customers, 280+ employees, $350 million+ of partner sales, $75 million in funding
130+ Discussions
Source CA: http://www.slideshare.net/CAinc/dev-ops-research-cust-deck-mann-march-31
CI/CD, ChatOps, DevOps
[Chart labels: CI, CD, ChatOps, “DevOps”, CD*; Every single soul; SREs; Love it!; Stupid!]
Day 1: Pretty well sorted
Day 2: Pretty much chaos
• But if infrastructure is immutable then maybe we never have Day 2?
Patterns: Monitoring
Monitoring as a service:
• Check in your checks and monitoring is free
• Transparency => shame => compliance
Ongoing challenges:
• Pager fatigue
• Automate all the thresholds
• Dependencies => deduplicating alerts
• Drive => Host => Service => Application => User
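One way to read "Dependencies => deduplicating alerts" is to suppress every alert that sits downstream of a deeper failure in the Drive => Host => Service => Application => User chain. The sketch below is a minimal illustration under that assumption; the `(layer, message)` alert shape and the function name are hypothetical, and a real system would track dependencies per host rather than one global chain.

```python
# Dependency chain from the slide, ordered from root cause to symptom.
LAYERS = ["drive", "host", "service", "application", "user"]


def deduplicate(alerts):
    """Keep only the alerts raised at the deepest failing layer.

    `alerts` is a list of (layer, message) tuples; this shape is an
    assumption made for the example.
    """
    if not alerts:
        return []
    rank = {layer: i for i, layer in enumerate(LAYERS)}
    # The lowest rank present is treated as the likely root cause.
    root = min(rank[layer] for layer, _ in alerts)
    return [alert for alert in alerts if rank[alert[0]] == root]
```

With a failed drive, the service and user alerts it causes are dropped and only the drive alert would page someone.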
Patterns: Monitoring
[Diagram labels: Nagios, Sensu, APM, Zabbix; Graphite, StatsD; Riemann / StackTach; event pipeline; auto corrections: Nagios or other; PagerDuty; humans via ChatOps; declare event, ask for help]
Patterns: Monitoring
Infer the threshold
• Anomaly detection
• Pattern recognition
Start with a check
And/or update the checks and thresholds as you learn more
Monitoring as a service → Remediation as a service
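Inferring a threshold instead of hard-coding it can start very simply. The sketch below assumes a threshold of the historical mean plus `k` standard deviations, which is only one basic form of anomaly detection and not something the slides prescribe; the function names and the default `k=3.0` are illustrative.

```python
import statistics


def infer_threshold(samples, k=3.0):
    """Return mean + k population standard deviations of past samples."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    return mean + k * stdev


def is_anomalous(value, samples, k=3.0):
    """Flag a new value that exceeds the inferred threshold."""
    return value > infer_threshold(samples, k)
```

As more samples arrive, re-running `infer_threshold` updates the check, which matches the "update the checks and thresholds as you learn more" advice.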
Remediation as a Service #1
Facebook’s FBAR: “Automating the work of hundreds”
https://www.facebook.com/notes/facebook-engineering/making-facebook-self-healing/10150275248698920
Remediation as a Service #2
Microsoft AutoPilot:
States:
• Healthy, Failed, Probation
Remediations (“Rs”):
• Restart, Reboot, Reimage, RMA
http://research.microsoft.com/pubs/64604/osr2007.pdf
“Machine learning algorithms to analyze these data in order to understand how to improve the policy settings, with the ultimate goal of automating many of the current manual policies.”
Patterns: Tools – Monitoring?
The #1 tool we have seen is NewRelic
• Roughly 75% of the discussions
• Amazing because it does not fit all the requirements
The #2 tool we have seen is Splunk
• Still seeing them more than LogStash, ElasticSearch although it is close
Patterns: Tools – Configuration
Puppet vs. Chef
• Ops vs. developers?
Ansible and Salt growing in importance
• Points of differentiation and also overlap
Best practices include:
• Separate source of truth
• Code reviews (duh?)
“State in the Repo” vs. Single Source of Truth
• Provides visibility into the system
• Fights configuration drift
• Reduces convergence time of changes
• Tied in to monitoring and configuration systems
What’s Wrong with Day 2 Automation?
• Lack of trust
• Hard to test
• Simple script + simple script + simple script + Zounds
• Monitoring
• My script vs. your script
• Integration sprawl
Automation: Don’t Forget #1
Simplicity scales. Best.
BEST PRACTICES
• Have X states (three seems logical) for a monitored system
• Have Y remediations
• Document both
• Automate the known mappings
• Atomic actions
• Workflows or actions of actions to tie them together
Lessons learned from many operators large and small
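The practices above (a small set of states, a documented set of remediations, atomic actions, and workflows that tie actions together) can be sketched in a few lines. This is an illustration only: the three states borrow the Healthy / Probation / Failed names from the AutoPilot slide, while the action names, the `workflow` helper, and the particular state-to-remediation mapping are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    PROBATION = "probation"
    FAILED = "failed"


# Atomic actions: each does exactly one thing and reports what it did.
def restart_service(host):
    return f"restart service on {host}"


def reboot(host):
    return f"reboot {host}"


def open_ticket(host):
    return f"open ticket for {host}"


def workflow(*actions):
    """Compose atomic actions into an 'action of actions'."""
    def run(host):
        return [action(host) for action in actions]
    return run


# The documented, automated mapping from known states to remediations.
REMEDIATIONS = {
    State.HEALTHY: workflow(),
    State.PROBATION: workflow(restart_service),
    State.FAILED: workflow(reboot, open_ticket),
}


def remediate(state, host):
    return REMEDIATIONS[state](host)
```

Because the mapping is a plain table, it doubles as documentation: anyone can read which remediation runs for which state.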
Automation: Don’t Forget #2
Failure detectors must be able to distinguish between the symptoms of failure and overloading, otherwise overloaded computers may be marked as failed and removed from service, amplifying the problem and triggering a cascade of failures that disables the entire application.
http://research.microsoft.com/pubs/64604/osr2007.pdf
BEST PRACTICE
Check a threshold with each step so that runaway automations cannot occur.
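A per-step threshold check acts as a circuit breaker. The sketch below assumes the automation removes failed hosts from service and must never remove more than a fixed fraction of the pool; the function name, the `is_failed` callback, and the 20% default are hypothetical choices for illustration.

```python
def drain_failed_hosts(hosts, is_failed, max_removed_fraction=0.2):
    """Remove failed hosts from service, halting once too many are gone."""
    removed = []
    limit = max_removed_fraction * len(hosts)
    for host in hosts:
        if not is_failed(host):
            continue
        # Threshold check before every step: stop instead of cascading.
        if len(removed) + 1 > limit:
            break
        removed.append(host)
    return removed
```

If an overload makes every host look failed, the automation stops after the allowed fraction and the remaining hosts stay in service for a human to examine.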
Automation: Don’t Forget #3
Humans are good at pattern matching – if they perceive the patterns.
• Always provide context. When a threshold is violated that requires that humans get back involved you are at risk of putting the human in a difficult place.
BEST PRACTICE
When the system needs a human, provide the human with context.
• Automate the delivery of that context for the sake of consistency
• Don’t allow the automation to just be additional complexity on top of complexity
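Automating the delivery of context can be as small as a structured escalation record that always renders the same way. The sketch below is one possible shape; the `Escalation` class and all of its field names are assumptions for the example, not part of any tool named in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Escalation:
    host: str
    check: str
    value: float
    threshold: float
    actions_tried: list = field(default_factory=list)

    def to_message(self):
        """Render the same context, in the same order, every time."""
        tried = ", ".join(self.actions_tried) or "none"
        return (
            f"{self.check} on {self.host} is {self.value} "
            f"(threshold {self.threshold}). "
            f"Automations already tried: {tried}."
        )
```

The person being paged sees what was violated, by how much, and what the automation already attempted, without having to reconstruct it from logs.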
Automation: Don’t Forget #4
Automation frees up humans to attack technical debt, refactor for growth and so forth. However, the implicit knowledge formed by eliminated manual action itself may be lost.
BEST PRACTICE
Refer to the prior Don’t Forget points including simplicity of state analysis and simplicity of allowed remediations.
• Also – allow manual invoking of automations
• Make automations human readable
• And otherwise transparent
• Give the automation an edgy personality (huh?)
ChatOps is Brilliant
Add some code to IRC or other chat to do stuff, and you have bot-executed code. Add to that an entire automation library – plus a personality – and the requirement that ALL changes happen through chat, and DevOps happens.
BEST PRACTICES
Get to know Hubot. And stay tuned for StackStorm’s take on this approach.
ChatOps benefits:
• Dev/Ops interface and newbie / old pro interface
• Context for humans
• Trust in automation
http://puppetlabs.com/blog/really-building-data-driven-infrastructure
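The core of the ChatOps idea, chat lines that trigger registered code while everyone watches, can be sketched without any chat library. The example below is not Hubot's API and not StackStorm's; the `command` decorator, the `!` prefix, and the `restart` command are all hypothetical and stand in for whatever bot framework is actually used.

```python
COMMANDS = {}    # command name -> function
AUDIT_LOG = []   # every command seen in chat, with who asked


def command(name):
    """Register a function as a chat command (hypothetical helper)."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register


@command("restart")
def restart(service):
    # Placeholder action: a real bot would call the automation library here.
    return f"restarting {service}"


def handle_message(user, text, prefix="!"):
    """Run the command in a chat line and return the bot's reply, if any."""
    parts = text[len(prefix):].split() if text.startswith(prefix) else []
    if not parts:
        return None  # ordinary chat, not a command
    name, args = parts[0], parts[1:]
    AUDIT_LOG.append((user, text))
    func = COMMANDS.get(name)
    if func is None:
        return f"unknown command: {name}"
    return func(*args)
```

Because every command passes through `handle_message`, the channel history and `AUDIT_LOG` together give the shared context and trust the slide lists as benefits.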
Integrations & Relationships
Auto Scaling
Automation as a Service
Before
[Diagram labels: Events, Actions, Scripts, each repeated]
#BadAuto
• Who did what to what and how did it go?
• Fragments of automation and configuration management
• No opportunity for learning
StackStorm
[Diagram labels: Events, Actions]
#ShareAndLearn
• Scripts -> automations
• Workflow – stitch stactions
• Close the loop – map events to automations
• Total audit & transparency
• Still DevOps friendly
• All automations are code in repositories
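"Close the loop" and "total audit" can be pictured as a rule table that maps event types to automations and records every dispatch. The sketch below is a generic illustration and does not show StackStorm's actual interfaces; the `on` decorator, the event dictionary shape, and the `disk_full` example are assumptions.

```python
RULES = {}   # event type -> automation
AUDIT = []   # (event type, automation name, result) trail


def on(event_type):
    """Map an event type to an automation (hypothetical helper)."""
    def register(func):
        RULES[event_type] = func
        return func
    return register


@on("disk_full")
def clean_tmp(event):
    # Placeholder automation; a real one would act on the host.
    return f"cleaned /tmp on {event['host']}"


def dispatch(event):
    """Run the automation mapped to an event and record what happened."""
    automation = RULES.get(event["type"])
    if automation is None:
        AUDIT.append((event["type"], None, "no automation mapped"))
        return None
    result = automation(event)
    AUDIT.append((event["type"], automation.__name__, result))
    return result
```

The audit trail answers the "who did what to what and how did it go?" question, including for events that had no automation mapped.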
Technical Details
• Centralized view (federated) into automations
• Audit trail and RBAC access controls
• Infrastructure as code
• Full APIs, configurations as code
• Collaboration inherent
• Abstraction of automations for sharing
• Two-way integration into collaboration including ChatOps
• Bidirectional close relationship with monitoring
• StackStorm helps users determine where to put problem analysis – StackTach and others promising
• Facility for self learning
• Simple today, headed towards controller synthesis approaches
Example Use Cases
• OpenStack and hybrid management
• Facilitated troubleshooting
• Over time – automated resolution
• Multi-stage deployments
• Develop, stage, deploy for example
• Response to security events
• Post-hoc as opposed to real time
Business Model
OpenSource
• Apache licenses
Free
• Free forever edition (community)
Enterprise
• On premise or hosted – all deployments now are on premise
Who Wins?
What Can You Do?
Register as a beta participant
• Are you doing some OpenStack?
• Can you provide feedback at least 2x per month?
• Automating something already?
• Interested in ChatOps? Not required…
Once we GA – grab free version
• Contribute to community w/ integrations and automations
• OpenSource and share your operations patterns
Questions
To respond, please unmute your line by pressing *6
CI, CD, Day 2 operations?
• Where are you in the continuum?
ChatOps – do you use it?
Have you had automations run amok?
DevOps team? Title? Silo?
Summary
World is changing rapidly
We’ve learned a lot
Safe, composable automations with circuit breakers and other controls should help
Private and confidential
Thank You
Follow us on Twitter @Stack_Storm