Turtles All the Way Down: Platform Ops in Public Cloud


Upload: bridgetkromhout

Post on 25-May-2015


DESCRIPTION

When I joined a startup already in progress as their first ops hire, I got a crash course in cloud operations. Running databases in EC2 without being on bare metal presents its own challenges; we also began using Hadoop and HBase on EMR, with tragicomic results. What monitoring existed was a twisty maze of half-measures, so improving our Mean Time To Lost Sleep required trying new tools and alerting strategies. And scaling performance meant relying on best practices and gut-feeling hunches. This talk will have appeal for those curious about AWS, about using MapReduce in the cloud, and about whether MongoDB is really "web scale". (Spoiler alert: lolol.) Come for the EC2 trivia; stay for the table-flipping. Notes and image credits: http://bridgetkromhout.com/speaking/2014/beyondthecode/notes/

TRANSCRIPT

Page 1: Turtles All the Way Down: Platform Ops in Public Cloud

Turtles All the Way Down

Platform Ops in Public Cloud

Bridget Kromhout

@bridgetkromhout

Page 2: Turtles All the Way Down: Platform Ops in Public Cloud


Page 3: Turtles All the Way Down: Platform Ops in Public Cloud

We are the largest online video distributor of international televised content streaming the world's best movies, documentaries and TV shows on demand with professional subtitles.


Page 4: Turtles All the Way Down: Platform Ops in Public Cloud

Platform ops in public cloud?

Do you mean Platform as a Service?

How is this different from Infrastructure as a Service?


Page 5: Turtles All the Way Down: Platform Ops in Public Cloud


Page 6: Turtles All the Way Down: Platform Ops in Public Cloud


Page 7: Turtles All the Way Down: Platform Ops in Public Cloud

(previous gig) SaaS Life

normal traffic

decision to turn off

decision to turn back on

accidental removal


Page 8: Turtles All the Way Down: Platform Ops in Public Cloud

Platform?


Page 9: Turtles All the Way Down: Platform Ops in Public Cloud


Page 10: Turtles All the Way Down: Platform Ops in Public Cloud


Page 11: Turtles All the Way Down: Platform Ops in Public Cloud


AWS Regions* (containing availability zones)

* for some values of regions: Beijing & Sydney too

Page 12: Turtles All the Way Down: Platform Ops in Public Cloud

Storage

No procurement delays!

All the IOPS!

No waiting!

Yes, cloud storage is better than the bad old days in some ways, but with caveats.


Page 13: Turtles All the Way Down: Platform Ops in Public Cloud

Alphabet Soup: EBS, SSDs, pIOPS

Go with SSDs for your Elastic Block Store.

EBS-optimized instances = faster network

Provisioned IOPS: guaranteed, but prevent bursting
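
The trade-off between bursting and provisioned IOPS can be made concrete with a little arithmetic. The slide gives no numbers, so the sketch below assumes the credit-bucket figures AWS has published for General Purpose SSD (gp2) volumes: a baseline of 3 IOPS per GiB (minimum 100), bursts up to 3,000 IOPS, and a bucket of 5.4 million I/O credits. These defaults and the function name `gp2_burst_seconds` are assumptions for illustration; check current AWS documentation before relying on them.

```python
def gp2_burst_seconds(size_gib, baseline_per_gib=3, min_baseline=100,
                      burst_iops=3000, credit_bucket=5_400_000):
    """Estimate how long a full credit bucket sustains the burst rate.

    All default values are assumed gp2 figures, not taken from the talk.
    Returns None when the baseline already meets or exceeds the burst rate,
    since such a volume never draws down its credits.
    """
    baseline = max(min_baseline, baseline_per_gib * size_gib)
    if baseline >= burst_iops:
        return None
    # Credits drain at the burst rate while refilling at the baseline rate.
    return credit_bucket / (burst_iops - baseline)


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, "GiB ->", gp2_burst_seconds(size))
```

Under these assumptions a small volume can burst for roughly half an hour and then drops to its baseline, whereas a provisioned-IOPS volume trades that headroom for a fixed, predictable rate.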


Page 14: Turtles All the Way Down: Platform Ops in Public Cloud

Story Time!

Data stores and sadness (as a service)


Page 15: Turtles All the Way Down: Platform Ops in Public Cloud


wow. such nosql. very webscale.

Page 16: Turtles All the Way Down: Platform Ops in Public Cloud


Page 17: Turtles All the Way Down: Platform Ops in Public Cloud

“a single write operation holds the lock exclusively, and no other read or write operations may share the lock.”
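
The behavior in that quote is that of a readers-writer lock: any number of readers may share it, but a writer needs it alone. The toy class below is only an illustration of those semantics, not MongoDB's implementation; the class name `ReadWriteLock` and its non-blocking `try_*` methods are hypothetical.

```python
import threading


class ReadWriteLock:
    """Toy readers-writer lock: many readers OR exactly one writer."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects the two counters below
        self._readers = 0
        self._writer = False

    def try_acquire_read(self):
        """Succeed unless a writer currently holds the lock."""
        with self._mutex:
            if self._writer:
                return False
            self._readers += 1
            return True

    def release_read(self):
        with self._mutex:
            self._readers -= 1

    def try_acquire_write(self):
        """Succeed only if there are no readers and no other writer."""
        with self._mutex:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def release_write(self):
        with self._mutex:
            self._writer = False


if __name__ == "__main__":
    lock = ReadWriteLock()
    print("write acquired:", lock.try_acquire_write())
    print("read while writing:", lock.try_acquire_read())
```

When one such lock covers a large scope, every write stalls every reader behind it, which is why a write-heavy workload can hurt read latency so badly.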


Page 18: Turtles All the Way Down: Platform Ops in Public Cloud

It’s 4am. Do you know what your EMR cluster is doing?


Page 19: Turtles All the Way Down: Platform Ops in Public Cloud

StatsD

monitoring != alerting
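
For readers who have not used StatsD, here is a minimal sketch of emitting a metric. StatsD accepts plain-text lines of the form `name:value|type` over UDP, by default on port 8125. The metric names, the helper names `format_metric` and `send_metric`, and the localhost address are assumptions for illustration; the talk does not show any client code.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one StatsD line, e.g. 'video.plays:1|c'.

    metric_type is 'c' (counter), 'ms' (timer), or 'g' (gauge).
    """
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, chosen only for this example.
    send_metric(format_metric("video.plays", 1, "c"))
    send_metric(format_metric("api.latency", 320, "ms", sample_rate=0.1))
```

Because the send is fire-and-forget UDP, instrumenting code this way is cheap, but it only collects data. Deciding which of those numbers should wake someone up is a separate alerting decision.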


Page 20: Turtles All the Way Down: Platform Ops in Public Cloud


Page 21: Turtles All the Way Down: Platform Ops in Public Cloud

If it moves, we track it. Sometimes we’ll draw a graph of something that isn’t moving yet, just in case it decides to make a run for it. -- Ian Malpass, Etsy


measure all the things

Page 22: Turtles All the Way Down: Platform Ops in Public Cloud

So, back to this platform stuff...

...how exactly do you build and deploy it?


Page 23: Turtles All the Way Down: Platform Ops in Public Cloud


Page 24: Turtles All the Way Down: Platform Ops in Public Cloud


Page 25: Turtles All the Way Down: Platform Ops in Public Cloud

orchestration & config management

Current: Future possibilities:


Page 26: Turtles All the Way Down: Platform Ops in Public Cloud


kitten, not unicorn

Page 27: Turtles All the Way Down: Platform Ops in Public Cloud


Page 28: Turtles All the Way Down: Platform Ops in Public Cloud

“the game has changed”

@littleidea


Page 29: Turtles All the Way Down: Platform Ops in Public Cloud


Page 30: Turtles All the Way Down: Platform Ops in Public Cloud

Questions? (and we’re hiring!)

@bridgetkromhout