Managing RightScale on RightScale

DESCRIPTION

RightScale Webinar: February 1, 2011 – Just like our customers, RightScale runs in the cloud and requires the best platform to automate operations. As such, RightScale uses RightScale to manage RightScale. Our complete infrastructure – development, testing, staging, and production – consists of servers that are configured, launched and managed by the RightScale Platform.

TRANSCRIPT

Page 1: Managing RightScale on RightScale

Managing RightScale on RightScale

February 1, 2011

Page 2: Managing RightScale on RightScale

Your Panel Today

Presenting

• Rafael H. Saavedra – VP, Engineering at RightScale

• Chris Horne – Director, Product Marketing at RightScale

Q&A

• Douglas Johnson – Operations Manager at RightScale

Please use the questions window to ask questions any time!

Page 3: Managing RightScale on RightScale

Topics

• Managing RightScale on RightScale (Dev, Staging, Prod & Meta)

• RightScale Meta manages RightScale Production

• Production System Overview

• Monitoring Production – Quis Custodiet Ipsos Custodes

• Our Favorite RightScale Features

• Our Not-so-favorite Features

• Deploying RightScale – Cloud Best Practices

Page 4: Managing RightScale on RightScale

Managing RightScale on RightScale

[Diagram labels: RightScale Production; Customer A, Customer B, Customer C, Customer D; RightScale Staging; RightScale Development (two systems)]

Page 5: Managing RightScale on RightScale

RS Production is managed by RS Meta

[Diagram labels: RightScale Meta Production; RightScale Production; RightScale Staging; Customer A, Customer D; RightScale Development (two systems)]

Page 6: Managing RightScale on RightScale

A multitude of RightScale systems

• Meta Production manages the Production system

• Meta currently lives outside the cloud containing production

• Meta is extremely secure, accessible only by a handful of operations folks

• The Production system is my.rightscale.com

• We are reaching 200 servers with a large fraction in EC2 US-East

• Servers are located in every cloud to achieve high availability

• Servers are allocated in well-defined availability zones

• A few staging systems are used for integration and QA

• Ad hoc systems for performance testing, demos, betas, etc.

• Many development systems with simplified configurations

• Development systems are available at the click of a button

Page 7: Managing RightScale on RightScale

Significant increase in cloud usage

[Charts (two panels): EC2 usage by month, November 2008 (N-08) through October 2010 (O-10)]

Page 8: Managing RightScale on RightScale

Some interesting RightScale numbers

• 2M servers launched by RightScale

• RightScale continuously monitors more than 70k servers

• Every day at RightScale:

  • 2,000 array resize actions are executed

  • 35,000 alert escalations are triggered

  • 20,000 escalation emails are sent to users

  • 9.0TB of monitoring data is exchanged with our servers

  • 1.6TB of logging data is sent to our servers
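
To put the daily figures above in perspective, here is a rough sketch that converts them to average rates. It assumes the load is spread evenly over 24 hours and that 1 TB means 10^12 bytes; neither assumption comes from the slides, so treat the results as order-of-magnitude estimates only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily figures quoted on the slide.
DAILY_COUNTS = {
    "array resize actions": 2_000,
    "alert escalations": 35_000,
    "escalation emails": 20_000,
}
DAILY_TERABYTES = {
    "monitoring data": 9.0,
    "logging data": 1.6,
}


def per_second(count_per_day: float) -> float:
    """Average events per second, assuming an even spread over the day."""
    return count_per_day / SECONDS_PER_DAY


def megabytes_per_second(terabytes_per_day: float) -> float:
    """Average MB/s, assuming decimal units (1 TB = 1,000,000 MB)."""
    return terabytes_per_day * 1_000_000 / SECONDS_PER_DAY


if __name__ == "__main__":
    for name, count in DAILY_COUNTS.items():
        print(f"{name}: about {per_second(count):.2f} per second")
    for name, terabytes in DAILY_TERABYTES.items():
        print(f"{name}: about {megabytes_per_second(terabytes):.1f} MB/s")
```

Under these assumptions the monitoring traffic averages roughly 100 MB/s, which helps explain why monitoring dominates the request mix.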

Page 9: Managing RightScale on RightScale

RightScale production (simplified)

[Diagram labels: Front Ends; Main App; dashboard; API; others; daemons; databases (DB Master, DB Slave); mirrors; logging; monitoring]

Page 10: Managing RightScale on RightScale

What do our users do?

• Dashboard, API, monitoring graphs & event notifications

• Most requests are monitoring updates: 85% of requests (70% of bandwidth)

• Dashboard and API calls are heavier requests; they represent 7% of requests but 26% of bandwidth

Distribution by Requests: Monitoring 85%, Notifications 8%, API 6%, Dashboard 1%

Distribution by Bandwidth: Monitoring 70%, Notifications 4%, API 15%, Dashboard 11%
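
The "heavier requests" point can be checked with a little arithmetic: dividing a category's bandwidth share by its request share gives its average request size relative to the overall average. The sketch below uses only the percentages from the two charts; the function and variable names are illustrative.

```python
# Percentages taken from the two pie charts on the slide.
REQUESTS_SHARE = {"Monitoring": 85, "Notifications": 8, "API": 6, "Dashboard": 1}
BANDWIDTH_SHARE = {"Monitoring": 70, "Notifications": 4, "API": 15, "Dashboard": 11}


def relative_weight(kind: str) -> float:
    """Average request size for `kind`, relative to the overall average.

    A value above 1 means requests of this kind are heavier than average.
    """
    return BANDWIDTH_SHARE[kind] / REQUESTS_SHARE[kind]


if __name__ == "__main__":
    for kind in REQUESTS_SHARE:
        print(f"{kind}: {relative_weight(kind):.2f}x the average request size")
```

By this measure a Dashboard request is about 11 times the average size and an API request about 2.5 times, while monitoring updates and notifications are lighter than average.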

Page 11: Managing RightScale on RightScale

We eat our own dog food

• Production servers are organized into independent deployments

• Core servers: frontends, core/api servers, databases, daemons

Page 12: Managing RightScale on RightScale

We eat our own dog food

• We use security groups extensively to isolate servers

• ServerTemplates are versioned for each major release

• This preserves the ability to launch exact configurations of past versions

Page 13: Managing RightScale on RightScale

Monitoring, alerts & escalations

• We monitor as much relevant data as possible and display it in insightful ways to quickly detect patterns and abnormalities

• We proactively eliminate the conditions that raise critical alerts

• No broken windows policy. No critical alerts can remain unresolved.

[Graphs: API Network Activity; Dashboard Network Activity]

Page 14: Managing RightScale on RightScale

How to monitor hundreds of servers?

Page 15: Managing RightScale on RightScale

How to monitor hundreds of servers?

• We leverage a monitoring data warehouse to develop heat maps & stacked graphs
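
The slides do not describe how the heat maps are built. As a minimal sketch of the general idea, the function below turns raw samples into a servers-by-time grid of averaged values that a plotting tool could color. The sample format `(server, unix_time, value)` and the function name are assumptions made for illustration, not RightScale's data model.

```python
from collections import defaultdict


def heat_map(samples, bucket_seconds):
    """Aggregate (server, unix_time, value) samples into a grid.

    Returns (servers, buckets, grid), where grid[i][j] is the mean value
    for servers[i] during time bucket buckets[j], or None if no sample
    fell into that cell.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (server, bucket) -> [sum, count]
    for server, unix_time, value in samples:
        key = (server, int(unix_time // bucket_seconds))
        totals[key][0] += value
        totals[key][1] += 1

    servers = sorted({server for server, _ in totals})
    buckets = sorted({bucket for _, bucket in totals})
    grid = [
        [
            totals[(server, bucket)][0] / totals[(server, bucket)][1]
            if (server, bucket) in totals
            else None
            for bucket in buckets
        ]
        for server in servers
    ]
    return servers, buckets, grid


if __name__ == "__main__":
    # Hypothetical CPU samples for two servers, bucketed per minute.
    demo = [("fe-1", 0, 10.0), ("fe-1", 30, 30.0), ("fe-1", 60, 50.0), ("db-1", 70, 5.0)]
    print(heat_map(demo, bucket_seconds=60))
```

With one row per server, an outlier shows up as a differently colored row even among hundreds of servers, which is the appeal of a heat map over hundreds of individual line graphs.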

Page 16: Managing RightScale on RightScale

Quis Custodiet Ipsos Custodes?*

• We monitor the monitoring and alerting systems

• We extensively use alerts to monitor the responsiveness of all RightScale servers

• When you have hundreds of cloud servers, you statistically see more instance failures. Instance and EBS failures can cause headaches. Be prepared to grab a new instance.

• The meta & production monitoring and alerting systems are fully decoupled from each other

* Who watches the watchmen?
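
One common way to "monitor the monitoring" is a heartbeat check run from an independent system: each monitoring component reports in periodically, and anything that has been silent for too long is flagged. The sketch below illustrates that idea only; the component names and the 120-second threshold are hypothetical, and the slides do not say this is how RightScale implements it.

```python
def stale_sources(last_seen, now, max_age_seconds):
    """Return the names of components whose last heartbeat is too old.

    last_seen maps a component name to the Unix time of its last heartbeat.
    """
    return sorted(
        name
        for name, timestamp in last_seen.items()
        if now - timestamp > max_age_seconds
    )


if __name__ == "__main__":
    # Hypothetical heartbeats, in seconds since some reference point.
    heartbeats = {"monitoring-1": 1000, "alerting-1": 700, "logging-1": 990}
    print(stale_sources(heartbeats, now=1010, max_age_seconds=120))
```

For the check to be meaningful it has to run somewhere that does not share a failure mode with the thing it watches, which matches the slide's point about keeping the meta and production systems decoupled.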

Page 17: Managing RightScale on RightScale

Our favorite RightScale features

• RightImages – Resist the temptation to build custom images. Leverage pure, base images to avoid introducing surprises.

• Input Inheritance – Makes it easy to keep configurations in sync for dozens of servers

• ServerTemplates – Makes it very easy to reproduce configurations across production, staging and development. You have to fully automate configuration to manage a high number of servers.

• Component Library – There are always new assets (RightScripts, ServerTemplates, Macros, etc.) that can be adapted to our needs

• Monitoring – It’s easy to make collectd plugins to monitor just about anything
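
As an illustration of the collectd point, here is a minimal sketch of a script for collectd's exec plugin, which reads `PUTVAL` lines from the script's standard output and exports `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` to it. The metric (number of entries in a directory) and the names `exec-spool` and `entries` are made up for the example; this is not a plugin from the presentation.

```python
import os
import sys
import time


def putval_line(host, plugin, type_instance, interval, value):
    """Format one reading in collectd's plain-text protocol.

    The identifier is host/plugin/type-type_instance. "gauge" is a
    standard collectd type, and "N" tells collectd to use the current time.
    """
    identifier = f"{host}/{plugin}/gauge-{type_instance}"
    return f'PUTVAL "{identifier}" interval={interval:g} N:{value}'


def count_entries(path):
    """Example metric: number of entries in a directory."""
    return len(os.listdir(path))


def run(path, iterations=None):
    """Emit readings forever, or `iterations` times when a limit is given."""
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    interval = float(os.environ.get("COLLECTD_INTERVAL", "10"))
    done = 0
    while iterations is None or done < iterations:
        line = putval_line(host, "exec-spool", "entries", interval, count_entries(path))
        print(line, flush=True)  # collectd reads stdout line by line
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)


if __name__ == "__main__":
    watched = sys.argv[1] if len(sys.argv) > 1 else "."
    # Loop only when launched by collectd; otherwise print one sample and exit.
    run(watched, iterations=None if "COLLECTD_INTERVAL" in os.environ else 1)
```

Run by hand, the script prints a single `PUTVAL` line, which is a quick way to check the output format before wiring it into a collectd configuration.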

Page 18: Managing RightScale on RightScale

Our not-so-favorite features

• ServerTemplates Inputs – Powerful but too many of them make templates difficult to use. Document them well for others.

• Revision Management – Still a ways to go to make users aware of new versions and how to update

• Component Library – Finding new resources from the library is neither easy nor intuitive

• Alerts – They work pretty well but they are not easy to configure, particularly custom ones

Page 19: Managing RightScale on RightScale

Best practices for upgrading RightScale

• In the cloud, the cost of duplicating servers is minimal

• Avoid upgrading existing servers (a non-cloud approach). Launch fresh ones with new software instead (fail forward).

• Old servers can take over in case something goes wrong

• Launch additional slaves to capture recovery points

  • One slave continues to replicate in case of master failure

  • Another slave is frozen at upgrade point – can rollback by failing over

• Don’t forget to take snapshots in case of major failure

Page 20: Managing RightScale on RightScale

Upgrading RightScale Step-by-Step

[Diagram labels: Front Ends; Main App (two groups); Databases; DB Master; DB Slave (two)]

1) Servers with current code

2) Servers with new code

3) Add second slave

4) Cut access to site

5) Stop all access to databases

6) Stop replication

7) Take snapshot at cutoff

8) Update schema

9) Reconnect all servers

10) Open access to site
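
The ten steps above can be sketched as an ordered runbook that stops at the first failure, so operators know exactly where the upgrade halted and can fall back to the old servers or the frozen slave. The step names are copied from the slide; `run_upgrade` and the `execute` callback are hypothetical placeholders, not RightScale tooling.

```python
# Step names copied from the slide, in numbered order.
UPGRADE_STEPS = [
    "Servers with current code",
    "Servers with new code",
    "Add second slave",
    "Cut access to site",
    "Stop all access to databases",
    "Stop replication",
    "Take snapshot at cutoff",
    "Update schema",
    "Reconnect all servers",
    "Open access to site",
]


def run_upgrade(steps, execute):
    """Run steps in order, stopping at the first one that fails.

    execute(number, step) is a caller-supplied function that performs the
    step and returns True on success. Returns (completed, failed), where
    failed is None on full success or a (number, step) tuple otherwise.
    """
    completed = []
    for number, step in enumerate(steps, start=1):
        if not execute(number, step):
            return completed, (number, step)
        completed.append(step)
    return completed, None


if __name__ == "__main__":
    # Dry run: pretend the schema update (step 8) fails.
    done, failed = run_upgrade(UPGRADE_STEPS, lambda number, step: number != 8)
    print(f"completed {len(done)} steps; failed at {failed}")
```

In this dry run the failure happens after the cutoff snapshot has been taken and replication to the second slave has stopped, which is the situation the frozen slave and snapshot are meant to cover.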

Page 21: Managing RightScale on RightScale

Upgrading RightScale Step-by-Step

[Diagram labels: Front Ends; Main App (two groups); Databases; DB Master; DB Slave (two); Cutoff Snapshot; Servers with new code; Servers with old code]

Page 22: Managing RightScale on RightScale

Have a project and want to discuss how RightScale can help?

Contact [email protected] or (866) 720-0208

Ready to get started?

Sign up for our Free Edition: www.RightScale.com/Free

Call us for a VIP trial of our paid editions

Need to learn more?

TCO calculator: www.RightScale.com/tco-calculator

User Conference Videos: www.RightScale.com/conference

Webinar archive: www.RightScale.com/webinars

White papers: www.RightScale.com/whitepapers

Q&A / Getting Started

Page 23: Managing RightScale on RightScale

Thank You!