OpenFest 2014: Aggressive DevOps

Post on 12-Jul-2015

Aggressive DevOps

Ivo Vachkov

Xi Group Ltd.

What is “DevOps”?!

• Is it a technology / tooling?!

• Is it a cultural thing?!

• Is it a business thing?!

• Should I even care …

DevOps

• Development + Operations

• It is technological in nature!

• It is a cultural thing!

• “The Business” needs it!

A few myths about DevOps

• Developers can do Ops

• System Administrators are obsolete

• You need it only for the Cloud

• It is a supplementary activity

Why should I care?!

• Because today everything is distributed …

• … and distributed systems are hard!

• Because IT complexity is constantly growing!

• Because it allows you to scale the human factor!

Let's get technical!

The new normal

• Not a single server anymore!

The new normal

• Workload is dynamic!

The new normal

• Distributed systems are complex and fragile!

• Distributed systems come with control planes!

• Service discovery is required!

• “errāre hūmānum est” (Seneca)

… and from the ashes DevOps will rise, Fierce and Mighty …

… to help us …

• … change system architectures …

• … bring order to chaos …

• … build and deploy the product …

• … monitor everything …

• … analyze the log files …

• … educate Developers in all things Ops …

• … build data-driven control planes …

• … and much, much more …

New problems require new tools

• Configuration management

– Puppet, Chef, Ansible, Salt

– Vagrant

– Fabric, Gearman

• Infrastructure-as-Code tools

– AWS CLI / python-boto, REST API

– Joyent SmartDataCenter / node.js sdc

– Rackspace, CloudFlare / REST API

New problems require new tools

• Build and deployment automation

– Jenkins / Hudson

– Travis CI

– BuildBot

• Service Discovery

– DNS-SD

– CoreOS etcd, Heroku Doozer, Apache ZooKeeper

– Consul & Serf
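
To make the service discovery idea concrete, here is a minimal sketch of what such tools do at their core: instances register themselves, keep sending heartbeats, and drop out of lookups once they go silent. The `ServiceRegistry` class, its TTL, and the addresses are illustrative assumptions; this is not the API of etcd, ZooKeeper, or Consul.

```python
import time
from typing import Callable, Dict, List, Tuple

Address = Tuple[str, int]


class ServiceRegistry:
    """Toy in-memory registry (hypothetical): instances stay listed only
    while they keep sending heartbeats within the TTL."""

    def __init__(self, ttl_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # service name -> {(host, port): time of last heartbeat}
        self._instances: Dict[str, Dict[Address, float]] = {}

    def register(self, service: str, host: str, port: int) -> None:
        """Record (or refresh) an instance of a service."""
        self._instances.setdefault(service, {})[(host, port)] = self._clock()

    # A heartbeat is simply a re-registration that refreshes the timestamp.
    heartbeat = register

    def lookup(self, service: str) -> List[Address]:
        """Return live instances, dropping those whose TTL has expired."""
        now = self._clock()
        alive = {
            addr: seen
            for addr, seen in self._instances.get(service, {}).items()
            if now - seen <= self._ttl
        }
        self._instances[service] = alive
        return sorted(alive)
```

Real systems add replication and consensus on top of this, because the registry itself must survive node failures.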

New problems require new tools

• Full-stack application monitoring

– Graphite, Ganglia

– New Relic

– Stackdriver, SignalFuse, Boundary, AWS CloudWatch, …

• Alerting systems

– Nagios (really?!)

– Sensu

– PagerDuty, AWS SNS, …

Focus: Continuous Delivery

• Jenkins is your friend!

Focus: Continuous Delivery

• Build on every commit / merge!

• Deploy after every build!

• Verify / Smoke-test the deployment!

• If possible, route some real traffic to it!
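
The four steps above can be sketched as one small function that stops at the first failing stage. This is only an illustration under stated assumptions: `run_pipeline` and the `build`, `deploy`, `smoke_test`, and `route_traffic` callables are hypothetical placeholders for whatever Jenkins jobs or scripts a team actually uses, and the 5% canary share is an arbitrary default.

```python
from typing import Callable, List, Optional


def run_pipeline(
    commit: str,
    build: Callable[[str], Optional[str]],
    deploy: Callable[[str], str],
    smoke_test: Callable[[str], bool],
    route_traffic: Callable[[str, int], None],
    canary_percent: int = 5,
) -> List[str]:
    """Run build -> deploy -> smoke test -> canary for one commit.

    Returns the names of the stages that completed; the pipeline stops
    at the first stage that fails.
    """
    completed: List[str] = []

    artifact = build(commit)          # build on every commit / merge
    if artifact is None:              # assumption: None signals a failed build
        return completed
    completed.append("build")

    endpoint = deploy(artifact)       # deploy after every build
    completed.append("deploy")

    if not smoke_test(endpoint):      # verify the deployment before exposing it
        return completed
    completed.append("smoke-test")

    route_traffic(endpoint, canary_percent)  # send a small share of real traffic
    completed.append("canary")
    return completed
```

Passing the stages in as callables keeps the control flow testable without a real build server.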

Focus: Monitoring

• Nagios is obsolete but relevant …

• … it can still handle quite some load …

Focus: Monitoring

• Infrastructure vitals

– CPU / Memory / Disk / etc.

• Application vitals

– Critical processes / Critical services / Queues / etc.

• External triggers & Perceived performance

– User counts / Change rates / Processing Latency

• Trending

– … when *it* will hit the fan …
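
As a rough illustration of the trending point, a least-squares line through recent usage samples gives an estimate of when a resource will run out. The function name `time_until_full` and the sample format are assumptions made for this sketch; production trending usually has to deal with seasonality and noise that a straight line ignores.

```python
from typing import List, Optional, Tuple


def time_until_full(samples: List[Tuple[float, float]],
                    capacity: float) -> Optional[float]:
    """Estimate seconds from the last sample until usage reaches capacity.

    samples is a list of (timestamp_seconds, used_amount) pairs.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp

    # Ordinary least-squares slope and intercept of usage over time.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # flat or shrinking usage never hits the limit
    intercept = mean_u - slope * mean_t

    t_full = (capacity - intercept) / slope
    last_t = samples[-1][0]
    return max(0.0, t_full - last_t)
```

For example, a disk that grows from 50 GB to 70 GB over two hours, with 100 GB of capacity, is projected to fill roughly three hours after the last sample.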

Focus: Monitoring

MONITOR EVERYTHING !!!

Focus: Alerting

• Alert when human intervention is required!

• Automate otherwise!

• Have a direct link between alert and operational procedure!

• Account for “pager fatigue”!
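
A minimal sketch of these alerting rules, assuming a hypothetical `AlertRouter`: problems with a known automated fix are never paged, every page carries a link to its operational procedure, and repeats of the same alert are suppressed for a quiet period to limit pager fatigue. The check names, the runbook URL, and the 15-minute quiet period are made-up examples.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Event:
    check: str        # e.g. "disk_full"
    host: str
    timestamp: float  # seconds


# Hypothetical tables: which checks can be fixed automatically,
# and where the written procedure for each paged check lives.
AUTO_REMEDIATIONS: Dict[str, str] = {"service_down": "restart_service"}
RUNBOOKS: Dict[str, str] = {
    "disk_full": "https://wiki.example.com/runbooks/disk-full",
}


class AlertRouter:
    def __init__(self, quiet_period: float = 900.0) -> None:
        self._quiet = quiet_period
        self._last_page: Dict[Tuple[str, str], float] = {}

    def handle(self, event: Event) -> Dict[str, str]:
        # Automate whenever no human intervention is required.
        if event.check in AUTO_REMEDIATIONS:
            return {"action": "automate",
                    "procedure": AUTO_REMEDIATIONS[event.check]}

        # Suppress repeats of the same alert inside the quiet period.
        key = (event.check, event.host)
        last: Optional[float] = self._last_page.get(key)
        if last is not None and event.timestamp - last < self._quiet:
            return {"action": "suppress"}

        # Page a human, always with a link to the procedure.
        self._last_page[key] = event.timestamp
        runbook = RUNBOOKS.get(event.check, "no runbook yet: write one")
        return {"action": "page", "runbook": runbook}
```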

Focus: Alerting

• Cautionary tale: The Three Mile Island accident!

Focus: Smart Control Plane

• Monitoring data is consumed by the Control Plane.

• Workloads drive the elastic behavior of the distributed system.

• Business-specific logic guides the operational decision-making process.
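
One way to picture such a control plane is a function that turns monitoring data into a capacity decision, bounded by business rules. The sketch below is an assumption-laden illustration, not a real autoscaler: `desired_workers`, its parameters, and the default limits are all hypothetical. Queue depth and arrival rate stand in for the monitoring data, the drain target for a business goal, and the minimum and maximum for availability and budget constraints.

```python
import math


def desired_workers(
    queue_depth: float,         # monitoring: jobs currently waiting
    arrival_rate: float,        # monitoring: new jobs per second
    per_worker_rate: float,     # measured: jobs one worker handles per second
    max_drain_seconds: float,   # business goal: clear the backlog this fast
    min_workers: int = 2,       # business rule: never drop below this
    max_workers: int = 20,      # business rule: budget cap
) -> int:
    """Return how many workers the system should run right now."""
    if per_worker_rate <= 0 or max_drain_seconds <= 0:
        raise ValueError("rates and drain target must be positive")

    # Throughput needed to keep up with arrivals and drain the backlog in time.
    needed_rate = arrival_rate + queue_depth / max_drain_seconds
    workers = math.ceil(needed_rate / per_worker_rate)

    # Clamp the workload-driven answer to the business limits.
    return max(min_workers, min(max_workers, workers))
```

With 600 queued jobs, 50 arrivals per second, workers that process 10 jobs per second, and a 60-second drain target, this asks for 6 workers.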

Focus: Smart Control Plane

• Control Plane is Data-Driven.

• Control Plane is Pro-active.

• Control Plane is “aware” of business goals.

• Control Plane must be highly available.

Focus: Smart Control Plane

• Anomaly Detection and Recovery
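
A simple, hedged example of the detection half: flag a metric value that sits more than a few standard deviations away from a rolling window of recent values. The `AnomalyDetector` class and its defaults are assumptions chosen for illustration; the recovery action (restart, failover, scale-out) is left to the caller because it depends on the system.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque


class AnomalyDetector:
    """Rolling z-score detector (illustrative sketch)."""

    def __init__(self, window: int = 30, threshold: float = 3.0,
                 min_samples: int = 5) -> None:
        self._history: Deque[float] = deque(maxlen=window)
        self._threshold = threshold
        self._min_samples = min_samples

    def observe(self, value: float) -> bool:
        """Return True if value looks anomalous against recent history."""
        anomalous = False
        if len(self._history) >= self._min_samples:
            mu = mean(self._history)
            sigma = pstdev(self._history)
            if sigma > 0:
                anomalous = abs(value - mu) > self._threshold * sigma
            else:
                # A perfectly flat history: any change is suspicious.
                anomalous = value != mu
        if not anomalous:
            # Keep outliers out of the baseline so one spike does not
            # mask the next one.
            self._history.append(value)
        return anomalous
```

A control plane could call `observe` on each new sample and trigger its recovery procedure whenever the result is True.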

New problems require new culture

• Operational input in all phases of the Software Development Life Cycle!

• Operational instrumentation is part of the core product!

• Enterprise silos are NO MORE!

What about the “Aggressive”?!

Well … try implementing all of the above in a typical company … ;)

Thank you!

Q & A
