practical cloud & workflow orchestration

Practical Cloud & Workflow Orchestration 2011 Amazon Genomics Event Chris Dagdigian

Post on 01-Dec-2014

DESCRIPTION

A presentation given at the 2011 Amazon AWS Genomics meeting held in Seattle, WA. This is a 30-minute talk I gave focusing mainly on practical tools, tips and methods for bootstrapping and orchestration on the cloud. Covers examples of Ubuntu Cloud Init, AWS CloudFormation, Opscode Chef and MIT StarCluster.

TRANSCRIPT

Page 1: Practical Cloud & Workflow Orchestration

Practical Cloud & Workflow Orchestration

2011 Amazon Genomics Event Chris Dagdigian


Page 2: Practical Cloud & Workflow Orchestration
Page 3: Practical Cloud & Workflow Orchestration

I’m Chris. I’m an infrastructure geek. I work for the BioTeam.

Twitter: @chris_dag  

Page 4: Practical Cloud & Workflow Orchestration

Disclaimer.

Page 5: Practical Cloud & Workflow Orchestration

I’m not an Amazon shill.

Page 6: Practical Cloud & Workflow Orchestration

Really.

Page 7: Practical Cloud & Workflow Orchestration

The IaaS competition just can’t compete.

Page 8: Practical Cloud & Workflow Orchestration

AWS lets me build useful stuff.

Page 9: Practical Cloud & Workflow Orchestration

When stuff gets built, I get paid.

Page 10: Practical Cloud & Workflow Orchestration

Installing VMware & excreting a press release does not turn a company into a cloud provider.

Page 11: Practical Cloud & Workflow Orchestration

I need more than just virtual compute and block storage. AWS has tons of glue and many useful IaaS building blocks.

Page 12: Practical Cloud & Workflow Orchestration

IaaS competitors lag far behind in features and service offerings.

Page 13: Practical Cloud & Workflow Orchestration

Speaking of pretenders…

Page 14: Practical Cloud & Workflow Orchestration

No APIs? Not a cloud.

Page 15: Practical Cloud & Workflow Orchestration

No self-service? Not a cloud.

Page 16: Practical Cloud & Workflow Orchestration

I have to email a human? Not a cloud.

Page 17: Practical Cloud & Workflow Orchestration

50% failure rate on server launch? Lame cloud.

Page 18: Practical Cloud & Workflow Orchestration

Virtual servers & block storage only? Barely a cloud.

Page 19: Practical Cloud & Workflow Orchestration

I’m getting insufferable, huh? Moving on …

Page 20: Practical Cloud & Workflow Orchestration

Three Topics Today.

Page 21: Practical Cloud & Workflow Orchestration

Time, Laziness & Beauty.

Page 22: Practical Cloud & Workflow Orchestration

image: shanelin via flickr

Tick … Tick Tick…

Page 23: Practical Cloud & Workflow Orchestration

User expectations are changing.

Page 24: Practical Cloud & Workflow Orchestration

Automated provisioning can shrink the time between “I want to do some science” & “I’m ready to do some science”.

Page 25: Practical Cloud & Workflow Orchestration

However…

Page 26: Practical Cloud & Workflow Orchestration

If servers, storage and systems can be deployed in minutes …

Page 27: Practical Cloud & Workflow Orchestration

… why does it still take days, several helpdesk tickets & a team of humans to load software and configure my systems to actually do science?

Page 28: Practical Cloud & Workflow Orchestration

It shouldn’t.

Page 29: Practical Cloud & Workflow Orchestration

If provisioning gets faster, configuration management also needs to keep pace.

Page 30: Practical Cloud & Workflow Orchestration

Laziness.

Page 31: Practical Cloud & Workflow Orchestration

Larry Wall’s 1st Great Virtue

Page 32: Practical Cloud & Workflow Orchestration

“… the quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don't have to answer so many questions about it.”

Page 33: Practical Cloud & Workflow Orchestration

It’s all scriptable.

Page 34: Practical Cloud & Workflow Orchestration

•  Servers
•  Storage
•  Network
•  Bootstrapping
•  Provisioning
•  Configuration
•  Management
•  Monitoring
•  Scaling
•  Accounting & audit trails

Page 35: Practical Cloud & Workflow Orchestration

Not hype. Real.

Page 36: Practical Cloud & Workflow Orchestration

I can do it from my iPad.

Page 37: Practical Cloud & Workflow Orchestration

No cubicle required.

Page 38: Practical Cloud & Workflow Orchestration

Our research IT infrastructures can now be 100% virtual and 100% scriptable

Page 39: Practical Cloud & Workflow Orchestration

And it’s pretty easy to understand.

Page 40: Practical Cloud & Workflow Orchestration

Anyone can drive this stuff.

Page 41: Practical Cloud & Workflow Orchestration

Especially motivated researchers.

Page 42: Practical Cloud & Workflow Orchestration

Stuff like this is a big deal.

Page 43: Practical Cloud & Workflow Orchestration

5GB managed MySQL in the cloud. $.011 / hour

Page 44: Practical Cloud & Workflow Orchestration

Database Administrator not required.

Page 45: Practical Cloud & Workflow Orchestration

Automatic patching, backups & clustering

Page 46: Practical Cloud & Workflow Orchestration

Anyone with a web browser can launch one.

Page 47: Practical Cloud & Workflow Orchestration

Beauty.

Page 48: Practical Cloud & Workflow Orchestration

Scriptable infrastructure is just the beginning.

Page 49: Practical Cloud & Workflow Orchestration

The really cool stuff is what we build on top.

Page 50: Practical Cloud & Workflow Orchestration

With good tools …

Page 51: Practical Cloud & Workflow Orchestration

We can orchestrate complex systems, pipelines and workflows.

Page 52: Practical Cloud & Workflow Orchestration

Orchestrated systems working in concert are a beautiful thing.

Page 53: Practical Cloud & Workflow Orchestration

Let me show you a few of the tools we like.

Page 54: Practical Cloud & Workflow Orchestration

Cloud Init

Page 55: Practical Cloud & Workflow Orchestration

Cloud Init
•  https://help.ubuntu.com/community/UEC
•  Developed by Ubuntu
•  Baked into all Ubuntu UEC releases
•  Also baked into Amazon Linux AMIs
•  Works on Eucalyptus clouds as well

Page 56: Practical Cloud & Workflow Orchestration

Cloud Init gives you a hook into freshly booted systems.

Page 57: Practical Cloud & Workflow Orchestration

It’s a great and easy-to-comprehend way to bootstrap or customize generic server images.

Page 58: Practical Cloud & Workflow Orchestration

When you launch a server, you can inject a YAML formatted file into the environment.

Page 59: Practical Cloud & Workflow Orchestration

Cloud init files are parsed and executed right after the node boots for the first time.

Page 60: Practical Cloud & Workflow Orchestration

You can run scripts, install software, load SSH keys, etc. to ‘bootstrap’ a generic node.
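
As a rough illustration of those three bootstrap tasks in one file, here is a minimal sketch that writes a cloud-config file from a shell script. The file name, the package choice, the /opt/pipeline path and the SSH public key are all placeholders invented for this example, not values from the talk; `packages`, `ssh_authorized_keys` and `runcmd` are standard cloud-config keys.

```shell
#!/bin/sh
# Sketch: write a cloud-config file covering install software, load SSH keys, run scripts.
# The file name, package, paths and SSH key below are placeholders (assumptions).
CONFIG_FILE="./cloudInit-bootstrap.txt"

cat > "$CONFIG_FILE" <<'EOF'
#cloud-config
# Install software: packages are installed with the image's package manager.
packages:
 - git

# Load SSH keys: added to the default user's authorized_keys.
ssh_authorized_keys:
 - ssh-rsa AAAAB3NzaC1yc2E-PLACEHOLDER-KEY researcher@example.org

# Run scripts: each entry runs in order on first boot.
runcmd:
 - mkdir -p /opt/pipeline
 - echo "bootstrapped by cloud-init" > /opt/pipeline/README
EOF

echo "wrote $CONFIG_FILE"
```

The first line has to be `#cloud-config` so the payload is parsed as configuration. A file like this would be handed over at launch time with the same `--user-data-file` option that the launch script later in the talk uses.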

Page 61: Practical Cloud & Workflow Orchestration

    #cloud-config
    packages:
     - httpd

    runcmd:
     - /etc/init.d/httpd start
     - echo "<h1>Hello Amazon Genomics Event!</h1>" > /var/www/html/index.html

Page 62: Practical Cloud & Workflow Orchestration

Previous real-world example does this:
1.  Download/install Apache web server
2.  Turn on the web server
3.  Create a cheesy index.html

Page 63: Practical Cloud & Workflow Orchestration

This is the script I ran moments before this talk …

Page 64: Practical Cloud & Workflow Orchestration

    #!/bin/sh

    ec2-run-instances ami-8c1fece5 \
      -n 1 \
      -t m1.small \
      -g dagdemo-SG \
      -k dagdemo-sshkeypair \
      --user-data-file ./cloudInit-config.txt

Page 65: Practical Cloud & Workflow Orchestration

Important to understand:
•  ami-8c1fece5 is an Amazon Linux public AMI
•  No web server pre-installed
•  Never before been ‘touched’ by me
•  Cloud Init does it all via the script I injected at instance launch time

Page 66: Practical Cloud & Workflow Orchestration

Let’s see if it worked …

Page 67: Practical Cloud & Workflow Orchestration

Amazon CloudFormation

Page 68: Practical Cloud & Workflow Orchestration

Amazon CloudFormation
•  http://aws.amazon.com/cloudformation/
•  AWS specific
•  Sweet way to turn on|off entire stacks of related and dependent AWS services

Page 69: Practical Cloud & Workflow Orchestration

Treat complex infrastructure as single resource
•  Cliché example - In a single “stack” you can define and then start/stop:
    •  Elastic database cluster +
    •  Elastic webserver cluster +
    •  Monitoring & auto-scaling triggers
    •  Event & error notification
    •  Elastic load balancer

Page 70: Practical Cloud & Workflow Orchestration

My live demo of CloudFormation
•  Using the example WordPress Blog template
•  It does a ton of cool stuff:
    •  RDS backend for MySQL database, elastic webserver cluster with auto-scaling, security group setup, automatic alarm notices
•  It all sits behind an elastic load balancer

Page 71: Practical Cloud & Workflow Orchestration

My CloudFormation blog demo:
•  Actual stack file at http://biote.am/6d
•  Check it out …
•  JSON formatted but still quite readable
•  It lets me define and then control a ton of different related AWS services all at once.
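
To give a feel for the format, here is a minimal sketch of a template, far smaller than the WordPress demo stack and not taken from it. It declares a single security group; the file name, the logical name `WebSecurityGroup` and the descriptions are assumptions made for this example. A real stack would list many more entries under `Resources`.

```shell
#!/bin/sh
# Sketch: write a one-resource CloudFormation template (names are illustrative).
TEMPLATE_FILE="./cf-minimal.json.txt"

cat > "$TEMPLATE_FILE" <<'EOF'
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal sketch: one security group that allows inbound HTTP",
  "Resources": {
    "WebSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Allow HTTP from anywhere",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "80", "ToPort": "80", "CidrIp": "0.0.0.0/0" }
        ]
      }
    }
  },
  "Outputs": {
    "SecurityGroup": { "Value": { "Ref": "WebSecurityGroup" } }
  }
}
EOF

echo "wrote $TEMPLATE_FILE"
```

Every resource gets a logical name, a `Type` and a block of `Properties`, and other resources can point at it with `Ref`. That cross-referencing is what lets one file describe a set of related and dependent services.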

Page 72: Practical Cloud & Workflow Orchestration

    #!/bin/sh
    # Launch Stack
    cfn-create-stack AWSGenomics-demoStack \
      --template-file cf-wordpress.json.txt

Page 73: Practical Cloud & Workflow Orchestration

    #!/bin/sh
    # Check state & status

    cfn-describe-stacks AWSGenomics-demoStack
    echo ""
    cfn-describe-stack-events \
      AWSGenomics-demoStack --headers
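
Stack creation takes a while, so a status check like this usually ends up inside a polling loop. The following is a minimal sketch of such a loop, not something shown in the talk. `wait_for_status` and `fake_describe_stacks` are hypothetical helpers, and the stand-in's output format is invented so the sketch runs without AWS access; only the `CREATE_IN_PROGRESS` and `CREATE_COMPLETE` status names come from CloudFormation itself.

```shell
#!/bin/sh
# Sketch: poll a status command until its output contains the wanted state.

# wait_for_status WANTED MAX_TRIES SLEEP_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND's output contains WANTED, 1 if it never does.
wait_for_status() {
    wanted=$1; max_tries=$2; pause=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@" 2>/dev/null | grep -q "$wanted"; then
            echo "attempt $try: found $wanted"
            return 0
        fi
        echo "attempt $try: not yet $wanted"
        sleep "$pause"
        try=$((try + 1))
    done
    return 1
}

# Stand-in for the real status command (assumption: invented output format).
# It reports CREATE_IN_PROGRESS twice, then CREATE_COMPLETE. The call counter
# lives in a temp file because the command runs in a pipeline subshell.
STATE_FILE=$(mktemp)
echo 0 > "$STATE_FILE"
fake_describe_stacks() {
    calls=$(( $(cat "$STATE_FILE") + 1 ))
    echo "$calls" > "$STATE_FILE"
    if [ "$calls" -ge 3 ]; then
        echo "STACK  $1  CREATE_COMPLETE"
    else
        echo "STACK  $1  CREATE_IN_PROGRESS"
    fi
}

wait_for_status CREATE_COMPLETE 5 0 fake_describe_stacks AWSGenomics-demoStack
RESULT=$?
rm -f "$STATE_FILE"
```

Against a real stack, the stand-in would be swapped for `cfn-describe-stacks AWSGenomics-demoStack` with a pause of several seconds. A production version would also want to stop early on failure states such as a rollback, which this sketch does not check for.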

Page 74: Practical Cloud & Workflow Orchestration

10 AWS Services/Resources orchestrated as one.

Page 75: Practical Cloud & Workflow Orchestration
Page 76: Practical Cloud & Workflow Orchestration

CloudWatch.

Page 77: Practical Cloud & Workflow Orchestration

Auto-scaling triggers.

Page 78: Practical Cloud & Workflow Orchestration

SNS Endpoints for Alarms.

Page 79: Practical Cloud & Workflow Orchestration

Alarm triggers.

Page 80: Practical Cloud & Workflow Orchestration

RDS Database & Security Group.

Page 81: Practical Cloud & Workflow Orchestration

Elastic Load Balancer.

Page 82: Practical Cloud & Workflow Orchestration

EC2 Security Group.

Page 83: Practical Cloud & Workflow Orchestration

Cool, huh?

Page 84: Practical Cloud & Workflow Orchestration

{ in case the demo fails! }

Page 85: Practical Cloud & Workflow Orchestration
Page 86: Practical Cloud & Workflow Orchestration

Opscode Chef

Page 87: Practical Cloud & Workflow Orchestration

Chef enables Infrastructure as Code

Page 88: Practical Cloud & Workflow Orchestration

It’s freaking awesome.

Page 89: Practical Cloud & Workflow Orchestration

Chef lets you:
Manage configuration as idempotent Resources.
Group resources as idempotent Recipes.
Group recipes into Roles.
Track it all like Source Code.
Search your infrastructure like a ninja. Ohai!
Configure your systems, software & pipelines
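
“Idempotent” is the key word here: a resource describes the state you want, and applying it a second time changes nothing. The sketch below shows that idea in plain shell as an analogy only. It is not Chef code, and `ensure_line` and the `MAQ_HOME` setting are hypothetical names made up for this example.

```shell
#!/bin/sh
# Sketch: an idempotent "resource" in plain shell (an analogy, not Chef code).

# ensure_line FILE LINE
# Declares that FILE must contain LINE. Appends it only when it is missing,
# so running the function once or many times leaves the same file.
ensure_line() {
    file=$1; line=$2
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        echo "up to date: $line"
    else
        printf '%s\n' "$line" >> "$file"
        echo "added: $line"
    fi
}

DEMO_FILE=$(mktemp)
ensure_line "$DEMO_FILE" "export MAQ_HOME=/opt/maq"   # first run appends the line
ensure_line "$DEMO_FILE" "export MAQ_HOME=/opt/maq"   # second run changes nothing
LINE_COUNT=$(wc -l < "$DEMO_FILE")
echo "lines in file: $LINE_COUNT"
rm -f "$DEMO_FILE"
```

Because each step checks before it acts, the whole script can be re-run safely after a partial failure. That property is what makes it reasonable to run a configuration tool repeatedly against live systems.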

Page 90: Practical Cloud & Workflow Orchestration

http://www.opscode.com/chef/
•  Several flavors
    •  Open source
    •  Commercial / Managed
    •  Commercial / ‘Behind your Firewall’
•  No time today for even a short description of how it works. You should check it out.

Page 91: Practical Cloud & Workflow Orchestration

Chef demo via ‘knife’ command line …

Page 92: Practical Cloud & Workflow Orchestration

    knife ec2 server create \
      -N aws-genomicsDemo \
      -I ami-63be790a \
      -f t1.micro \
      -G default \
      -S bioteam-IAM-admins-v1 \
      -r 'recipe[getting-started]' \
      -i ./bioteam-IAM-admins-v1.pem \
      -x ubuntu

Page 93: Practical Cloud & Workflow Orchestration

Fully automatic remote bootstrapping …

Page 94: Practical Cloud & Workflow Orchestration

Done!

Page 95: Practical Cloud & Workflow Orchestration
Page 96: Practical Cloud & Workflow Orchestration

Search-driven, parallel remote SSH execution

Page 97: Practical Cloud & Workflow Orchestration

    knife ssh name:aws-genomicsDemo \
      -a cloud.public_hostname \
      -x ubuntu \
      -i bioteam-IAM-admins-v1.pem \
      'sudo chef-client; \
      cat /tmp/chef-getting-started.txt'

Page 98: Practical Cloud & Workflow Orchestration

Let’s install some genomics tools
•  Our Maq short read assembler cookbook:
    •  Installs all dependencies (compilers, etc.)
    •  Puts application source on node
    •  Builds maq from source
    •  Installs it

Page 99: Practical Cloud & Workflow Orchestration

    $ knife node \
        run_list add \
        aws-genomicsDemo \
        'recipe[maq]'

Page 100: Practical Cloud & Workflow Orchestration

It really is that easy.

Page 101: Practical Cloud & Workflow Orchestration

MIT StarCluster

Page 102: Practical Cloud & Workflow Orchestration

MIT StarCluster
•  http://web.mit.edu/stardev/cluster
•  Ready to use Linux compute farm on AWS
•  Grid Engine, MPI, NFS filesystems
•  Libraries, tools, applications
•  Easy to use, easy to extend
•  Integrates well with Chef

Page 103: Practical Cloud & Workflow Orchestration

If you have not built Linux clusters from scratch before …

Page 104: Practical Cloud & Workflow Orchestration

It’s hard to really appreciate everything that StarCluster does behind the scenes.

Page 105: Practical Cloud & Workflow Orchestration

MIT StarCluster – More Info
•  Live demo (time permitting)
•  StarCluster & Spot Instances Screencast
•  http://biote.am/6c
•  http://aws.amazon.com/ec2/spot-and-science/

Page 106: Practical Cloud & Workflow Orchestration

Phew. That’s a lot of slides.

Page 107: Practical Cloud & Workflow Orchestration

Time to explore the demos?

Page 108: Practical Cloud & Workflow Orchestration

Questions?

Page 109: Practical Cloud & Workflow Orchestration

Thanks! Related talk slides: http://biote.am/6a

“Mapping Informatics to the Cloud”