Design For Failure Is The Path To Success In Cloud

Post on 24-Feb-2016



Design For Failure Is The Path To Success In Cloud
Ashay Chaudhary

REQUIREMENTS
Journey through the computing models

Evolution of Requirements

• Mainframe
• Desktop
• Client-Server

• Internet

• Cloud Computing

• Reliability
• Availability
• Serviceability
• Performance

• + Security

• + Agility

AVAILABILITY
Non-Cloud Model

Guiding Principles

• Design for Non-Failure
  • Quality Hardware
• Deploy with Redundancy
  • Specialty Hardware
• Manage Effectively
  • Expert Staff
  • Processes

AVAILABILITY
Cloud Model

Guiding Principles

• Design for Failure
• Design for Redundancy
• Monitor Extensively
• Track Dependencies

Design For Failure

• Assume nothing
• Expect failures
• Anywhere and everywhere
• Just because it is available now doesn’t mean it will be there later

• Failures cascade
• Unhandled failures propagate
• Poorly handled failures add complexity
• Difficulty increases exponentially with complexity

• Embrace failure, make it a first-class citizen

Handle All Failures

• Leaving failures unhandled is a very bad idea
• A poorly handled trivial failure in one part becomes a critical one somewhere else
• Two types of failures: Transient and Resource
• Transient failures are difficult; treat them like Resource failures and fail fast
• Delays are transient failures; define response time guarantees
• Failure injection is a lifestyle
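The "fail fast" and "response time guarantees" points can be sketched as a deadline wrapper. This is a minimal illustration, not part of the original slides: `call_with_deadline` and `DeadlineExceeded` are hypothetical names, and the timeout value is whatever budget the caller chooses.

```python
import concurrent.futures


class DeadlineExceeded(Exception):
    """Raised when a call does not answer within its response-time budget."""


def call_with_deadline(func, timeout_s, *args, **kwargs):
    """Run func in a worker thread and treat a late answer as a failure.

    Hypothetical helper: a delay beyond timeout_s is surfaced to the
    caller as an ordinary, handleable exception instead of a hang.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args, **kwargs)
    try:
        # Exceptions raised by func itself propagate unchanged from result().
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError as exc:
        raise DeadlineExceeded(f"no response within {timeout_s}s") from exc
    finally:
        # Do not wait for a slow call; the caller has already moved on.
        pool.shutdown(wait=False)
```

The slow call keeps running in its worker thread after the deadline; the sketch only guarantees that the caller stops waiting. The same wrapper is a convenient seam for failure injection in tests: pass in a function that sleeps or raises on purpose.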

Design For Redundancy

• Eliminate single points of failure
• Architect distributed applications
• Minimize duration of statefulness
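One way to picture "eliminate single points of failure" at the call level is failover across redundant replicas. The sketch below is an assumption-laden illustration: `call_with_failover` and `AllReplicasFailed` are hypothetical names, and each replica is modelled as a plain zero-argument callable that holds no state between calls.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllReplicasFailed(Exception):
    """Raised when every redundant replica has failed."""


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result."""
    errors = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    # Only when no replica answered does the failure reach the caller.
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed: {errors!r}")
```

Because the replicas are stateless here, any one of them can serve the request, which is what makes the failover safe.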

Monitor Extensively

• Self-assess and report health
• Complementary external monitoring
• Load and latency monitoring
• Proactively restart components
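A component that self-assesses and reports health might look roughly like the following. `HealthReporter` and its check names are hypothetical; the report format is an assumption chosen for illustration, and an external monitor would poll it separately.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class HealthReporter:
    """Collects named checks and reports the component's own health."""

    checks: Dict[str, Callable[[], bool]] = field(default_factory=dict)

    def register(self, name: str, check: Callable[[], bool]) -> None:
        self.checks[name] = check

    def report(self) -> dict:
        results = {}
        for name, check in self.checks.items():
            started = time.monotonic()
            try:
                ok = bool(check())
            except Exception:
                ok = False  # a crashing check counts as unhealthy
            # Record latency as well, since slow checks are a warning sign.
            results[name] = {"ok": ok, "latency_s": time.monotonic() - started}
        healthy = all(r["ok"] for r in results.values())
        return {"healthy": healthy, "checks": results}
```

A supervisor could restart the component proactively when `report()["healthy"]` turns false or a check's latency drifts upward.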

Track Dependencies

• Identify all dependencies
• Hardware, 3rd Party Libraries, Other servers, Network
• Infrastructure/Platform services, External services
• Your own components

• Track their health and availability
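Tracking dependency health and availability can be sketched as a small registry that records the outcome of every call to each dependency. All names here (`Dependency`, `DependencyTracker`, the `kind` labels) are hypothetical, and observed availability is simply successes divided by total observations.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    name: str
    kind: str  # e.g. "hardware", "library", "network", "external service"
    successes: int = 0
    failures: int = 0

    def record(self, ok: bool) -> None:
        if ok:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def availability(self) -> float:
        total = self.successes + self.failures
        # Assume nothing: a dependency never observed counts as unavailable.
        return self.successes / total if total else 0.0


class DependencyTracker:
    def __init__(self) -> None:
        self._deps = {}

    def add(self, name: str, kind: str) -> None:
        self._deps[name] = Dependency(name, kind)

    def record(self, name: str, ok: bool) -> None:
        self._deps[name].record(ok)

    def below(self, threshold: float) -> list:
        """Sorted names of dependencies with availability under threshold."""
        return sorted(
            d.name for d in self._deps.values() if d.availability < threshold
        )
```

Calling `below(0.9)` periodically gives a list of dependencies that need attention, including any that were registered but never exercised.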

Key Takeaways

• If there’s only one thing you could do
• Design for Failure

• It is a paradigm shift
• It is a cultural change
• It is not easy

• It is the key to success in the cloud

Ashay Chaudhary
Cloud Consultant

Corporate Education
Private Cloud Solutions
Highly Scalable SaaS Applications
SaaS Business Intelligence & Analytics

ashay@kloudpros.com
@ashay_c
