Event Driven Automation Meetup, May 14, 2015

Event Driven Automation and Workflows Dmitri Zimine CTO, StackStorm #Stack_Storm

Upload: dmitri-zimine

Post on 16-Aug-2015


Event Driven Automation and Workflows

Dmitri Zimine, CTO, StackStorm

#Stack_Storm

About myself

• Past:
  – Opalis Software (now aka M$ SC Orchestrator)
  – VMware

• Present:
  – StackStorm CTO & co-founder
  – Mistral core team member
  – I don’t ops (but most Stormers do)

Agenda

1. High level: Brief History Of Event Driven Automation

2. Into the weeds: Workflow patterns for IT automation

Business Process Management

[Diagram: vendor logos: VMware, CA, BMC, OpsWare, HP, CISCO, Microsoft, BMC, Citrix]

The Problem is Bigger than it was 5 years ago

More tools…

Still…

• Manual operations
• Custom scripts

Solution

• Event Driven Automation – with modern twist
  – FBAR (saving 1532 hours/day)
  – Salt Conf - Event Driven Infrastructure
  – Microsoft – new Azure Automation (RunBooks)

Solution: Event Driven Automation

[Diagram: Event Driven Automation. Labels: Sensors, Trigger, Rules, Call, Actions, Workflows; Infrastructure – Cloud – Applications – Tools – Processes]
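The pieces named in the diagram (sensors, triggers, rules, actions) can be sketched in a few lines of Python. This is a minimal illustration only, not StackStorm's implementation: the `Trigger`, `Rule`, and `RuleEngine` classes, the `disk.usage` trigger type, and the `cleanup_logs` action are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Trigger:
    """An event emitted by a sensor (hypothetical shape: a type plus a payload)."""
    type: str
    payload: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Rule:
    """Maps a trigger type plus a criteria check to an action."""
    trigger_type: str
    criteria: Callable[[Dict[str, Any]], bool]
    action: Callable[[Dict[str, Any]], Any]


class RuleEngine:
    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def dispatch(self, trigger: Trigger) -> List[Any]:
        """Run the action of every rule that matches the trigger."""
        results = []
        for rule in self.rules:
            if rule.trigger_type == trigger.type and rule.criteria(trigger.payload):
                results.append(rule.action(trigger.payload))
        return results


def cleanup_logs(payload: Dict[str, Any]) -> str:
    # Hypothetical action; a real one would call out to a remote system.
    return "cleaning logs on %s" % payload["host"]


engine = RuleEngine()
engine.register(Rule("disk.usage", lambda p: p["percent"] > 90, cleanup_logs))

# A sensor would emit these; here they are built by hand.
fired = engine.dispatch(Trigger("disk.usage", {"host": "web1", "percent": 95}))
ignored = engine.dispatch(Trigger("disk.usage", {"host": "web2", "percent": 40}))
print(fired)
print(ignored)
```

The action here is a plain function; in the deck's model it could equally be a workflow, which is where the rest of the talk goes.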

WORKFLOWS

Zoom to Workflow, and Get Practical

• From now on I focus on workflow

• Reminder: EDA != Workflow, but Workflow is a big part of it.

Patterns vs Practice

• ~100 patterns http://www.workflowpatterns.com/

• Practice – IMAO: only a few are sufficient

• Workflows do two things well:
  – Keep state
  – Carry data across systems

Basic: Sequence

...
tasks:
    t1_update_config:
        action: core.remote_sudo
        input:
            cmd: sed -i -e"s/keepalive_timeout
            hosts: my_webserver.example.com
        on-complete: t2_cleanup_logs

    t2_cleanup_logs:
        action: core.remote_sudo
        input:
            cmd: rm /var/log/nginx/
            hosts: my_webserver.example.com
        on-complete: t3_restart_service

    t3_restart_service:
        action: core.remote_sudo cmd="servic

t1 t2 t3

Basic: Data Passing

t1.code=0 msg=“Some string..”

t1 t2

examples.data_pass:
    input:
        - host
    tasks:
        t1_diagnose:
            action: diag.run_mysql_diag
            input:
                host: <% $.host %>
            publish:
                - msg: <% $.t1_diagnose.stdout.summary %>
            on-complete: t2_post_to_chat

        t2_post_to_chat:
            action: chatops.say
            input:
                header: Returned <% $.t1_diagnose.code %>
                details: <% $.msg %>
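The two things a workflow does well, keeping state and carrying data, can be illustrated with a toy runner. This is a sketch under assumptions, not how Mistral works internally: `run_sequence`, `diagnose`, and `post_to_chat` are hypothetical stand-ins, and the context is a plain dictionary playing the role of `$`.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
# A task is (name, action, publish): the action reads the context, and publish
# picks values out of the action result to add to the context.
Task = Tuple[str, Callable[[Context], Dict[str, Any]], Callable[[Dict[str, Any]], Context]]


def run_sequence(tasks: List[Task], ctx: Context) -> Context:
    """Run tasks in order, storing each result and published variable in ctx."""
    ctx = dict(ctx)  # copy so the caller's input is left untouched
    for name, action, publish in tasks:
        result = action(ctx)
        ctx[name] = result           # task result, like $.t1_diagnose
        ctx.update(publish(result))  # published variables, like $.msg
    return ctx


def diagnose(ctx: Context) -> Dict[str, Any]:
    # Hypothetical stand-in for diag.run_mysql_diag.
    return {"code": 0, "summary": "all good on " + ctx["host"]}


def post_to_chat(ctx: Context) -> Dict[str, Any]:
    # Hypothetical stand-in for chatops.say; it only formats a string.
    text = "Returned %d: %s" % (ctx["t1_diagnose"]["code"], ctx["msg"])
    return {"text": text}


final = run_sequence(
    [
        ("t1_diagnose", diagnose, lambda r: {"msg": r["summary"]}),
        ("t2_post_to_chat", post_to_chat, lambda r: {}),
    ],
    {"host": "db1"},
)
print(final["t2_post_to_chat"]["text"])
```

The second task never talks to the first one directly; it only reads what the runner stored in the shared context.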

Basic: Conditions

t1

t3

t2

tasks:
    ...
    t1_deploy:
        action: ops.deploy_fleet
        on-success: t2_post_to_chat
        on-failure: t3_page_admin

    t2_post_to_chat:
        action: chatops.say
        input:
            header: Successfully deployed <% $.t1_diag

    t3_page_admin:
        action: pagerduty.launch_incident
        input:
            description: Have to wake up dude...
            details: <% $.msg %>

Basic: Conditions on Data

t1

t3

t2

t1_diagnose:
    action: ops.run_mysql_diag
    publish:
        - code: <% $.t1_diagnose.return_code %>
    on-complete:
        - t2_post_to_chat: <% $.code == 0 %>
        - t3_page_mysql_admin: <% $.code > 0 %>

t2_post_to_chat:
    action: chatops.say
    input:
        header: "mysql checked, OK"

t3_page_mysql_admin:
    action: pagerduty.launch_incident
    input:
        description: Have to wake up dude...
        details: <% $.t1_diagnose.stdout %>

t1.code==0

t1.code >0
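Conditions on data amount to guarded transitions: each candidate next task carries an expression that is evaluated against the published context. A minimal sketch, assuming a dictionary context and Python callables in place of the `<% ... %>` expressions (the `next_tasks` helper is hypothetical):

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Transition = Tuple[str, Callable[[Context], bool]]


def next_tasks(transitions: List[Transition], ctx: Context) -> List[str]:
    """Return the names of all tasks whose guard condition holds for ctx."""
    return [name for name, guard in transitions if guard(ctx)]


# Mirrors the on-complete clause of t1_diagnose: each target has a guard.
on_complete: List[Transition] = [
    ("t2_post_to_chat", lambda ctx: ctx["code"] == 0),
    ("t3_page_mysql_admin", lambda ctx: ctx["code"] > 0),
]

healthy = next_tasks(on_complete, {"code": 0})
broken = next_tasks(on_complete, {"code": 2})
negative = next_tasks(on_complete, {"code": -1})
print(healthy)
print(broken)
print(negative)
```

With these two guards a negative code selects no task at all, so the guards should cover every outcome the action can produce.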

THAT’S THE BASICS! SUFFICIENT. THERE’S MORE…

More: Parallel Execution

t1

t4

t2

...
t1_do_build:
    action: cicd.do_build_and_packages
    on-success:
        - t2_test_ubuntu14
        - t3_test_fedora20
        - t4_test_rhel6

t2_test_ubuntu14:
    action: cicd.deploy_and_test distro="UBUNTU14"

t3_test_fedora20:
    action: cicd.deploy_and_test distro="F20"

t4_test_rhel6:
    action: cicd.deploy_and_test distro="RHEL6"

t3

More: Join

t5

t4

t2

t3 t1

16 ways to join

More: Join – Simple Merge

t5

t4

t2

...
t2_test_ubuntu14:
    action: cicd.deploy_and_test distro="UBUNTU14"
    on-success: t5_post_status

t3_test_fedora20:
    action: cicd.deploy_and_test distro="F20"
    on-success: t5_post_status

t4_test_rhel6:
    action: cicd.deploy_and_test distro="RHEL6"
    on-success: t5_post_status

t5_post_status:
    action: chatops.say
    input:
        header: Test completed!

t3

http://www.workflowpatterns.com/patterns/control/basic/wcp5.php

Simple Merge

t5 t5

More: Join – AND Join

t5

t4

t2

...
t2_test_ubuntu14:
    action: cicd.deploy_and_test distro="UBUNTU14"
    on-success: t5_post_status

t3_test_fedora20:
    action: cicd.deploy_and_test distro="F20"
    on-success: t5_post_status

t4_test_rhel6:
    action: cicd.deploy_and_test distro="RHEL6"
    on-success: t5_post_status

t5_tag_release:
    join: all
    action: cicd.tag_release

t3

http://www.workflowpatterns.com/patterns/control/new/wcp33.php

Full AND Join

More: Join - Discriminator

t5

t4

t2

...
t2_test_ubuntu14:
    action: cicd.deploy_and_test distro="UBUNTU14"
    on-failure: t5_report_and_fail

t3_test_fedora20:
    action: cicd.deploy_and_test distro="F20"
    on-failure: t5_report_and_fail

t4_test_rhel6:
    action: cicd.deploy_and_test distro="RHEL6"
    on-failure: t5_report_and_fail

t5_report_and_fail:
    join: one
    action: chatops.say header="FAILURE!"
    on-complete: fail

t3

http://www.workflowpatterns.com/patterns/control/advanced_branching/wcp9.php

Discriminator
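The difference between the three joins above comes down to when, and how many times, the joining task starts as the parallel branches arrive. The sketch below is a deliberately simplified model of that behavior, not Mistral's engine: `join_runs` is a hypothetical helper, every branch is assumed to take the transition into the join, and arrivals are assumed to come one at a time.

```python
from typing import List, Optional


def join_runs(arrivals: List[str], expected: int, join: Optional[str]) -> List[int]:
    """Return the arrival counts (1-based) at which the join task starts.

    join=None  -> simple merge: start on every arrival
    join="all" -> AND join: start once, when all expected branches have arrived
    join="one" -> discriminator: start once, on the first arrival
    """
    starts = []
    for count, _branch in enumerate(arrivals, start=1):
        if join is None:
            starts.append(count)
        elif join == "all" and count == expected:
            starts.append(count)
        elif join == "one" and count == 1:
            starts.append(count)
    return starts


branches = ["t2_test_ubuntu14", "t3_test_fedora20", "t4_test_rhel6"]
merge = join_runs(branches, expected=3, join=None)
and_join = join_runs(branches, expected=3, join="all")
discriminator = join_runs(branches, expected=3, join="one")
print(merge, and_join, discriminator)
```

In this model the simple merge starts the joining task three times, which is fine for a chat message but not for tagging a release; that is the case the AND join covers.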

More: Multiple Data

t1 t2

ip_list=[...]

...

t1_get_ip_list:
    action: myaws.allocate_floating_ips num=4
    publish:
        - ip_list: <% $.t1_get_ip_list.ips %>
    on-complete: t2_create_vms

t2_create_vms:
    with-items: ip in <% $.ip_list %>
    action: myaws.create_vms ip=<% $.ip %>
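The with-items construct runs one action per element of a list and collects the results. A rough Python equivalent, as a sketch only: `with_items` and `create_vm` are hypothetical names, the `concurrency` parameter is an assumption of this example, and the stand-in action just formats a string instead of creating anything.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def with_items(items: List[T], action: Callable[[T], R], concurrency: int = 4) -> List[R]:
    """Run action once per item and return the results in item order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map yields results in the order of the input items.
        return list(pool.map(action, items))


def create_vm(ip: str) -> str:
    # Hypothetical stand-in for myaws.create_vms.
    return "vm@" + ip


ip_list = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
vms = with_items(ip_list, create_vm)
print(vms)
```

The results come back in the same order as the input list even though the calls may overlap in time.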

And More Details…

• Nesting – nothing to say except:
  – Input and output
  – Nested workflow is an action, not a task

• Retries, Waits, Pause/Resume
• Default task policies

Recap: Workflow Operations

• Sequence
• Data passing
• Conditions (on data)
• Parallel execution
• Joins
• Multiple Data Items

What else

• Other than pattern support:
  • Reliability
  • Manageability – API, CLI, DSL, infra as code…
  • Good to have: good GUI

Summary

• Event Driven Automation is coming back – with a new twist

• EDA > Workflow, but Workflow is a key component

• Shameless plug: StackStorm is covering it all

• OpenSource Event Automation Platform
• Github: github.com/stackstorm/st2
• Twitter: Stack_Storm
• IRC: #stackstorm on FreeNode
• www.stackstorm.com