Software Availability by Resiliency


Upload: reza-samei

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Software Availability by Resiliency

In the name of ALLAH

Software Availability by Resiliency

By Reza Same'e, Software Developer @ BISPHONE

At ZCONF - 6th Sep 2015 | Shahrivar 1394

[email protected]

Page 2: Software Availability by Resiliency

Availability?

Available: present or ready for immediate use.

Page 3: Software Availability by Resiliency

Availability?

As far as the users know, when the response time exceeds their expectation, the system is down.

Page 4: Software Availability by Resiliency

Availability?

Available: ready for immediate use, and quick to react.

Page 5: Software Availability by Resiliency

How to Measure?

https://en.wikipedia.org/wiki/High_availability

Availability = MTTF / (MTTF + MTTR)

MTTF = Mean Time To Failure (~ uptime)
MTTR = Mean Time To Recovery (~ downtime)

Availability (percent)          Downtime per year
99.9999999 ("nine nines")       less than 32 ms
99.99999   ("seven nines")      about 3 s
99.999     ("five nines")       about 5 min
99.9       ("three nines")      about 9 hours
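The formula and the downtime figures above can be checked with a few lines of Python. This is only a minimal sketch: the function names and the 365.25-day year are assumptions made for illustration, not part of the talk.

```python
# Assumption: one year is 365.25 days (average Julian year).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR); both values in the same time unit."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Expected downtime per year for an availability given in percent."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Print the downtime for each row of the table, in seconds.
    for percent in (99.9999999, 99.99999, 99.999, 99.9):
        print(f"{percent}% -> {downtime_seconds_per_year(percent):.3f} s per year")
```

With these definitions, "five nines" works out to a little over five minutes of downtime per year, which matches the table.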

Page 6: Software Availability by Resiliency

Why It Matters?

- Critical Systems (Health-Care, …)

- Business

- Our Quality of Life (^_^)

Amazon Prime Day 2015: 34.4 million items, 398 per second

http://www.forbes.com/sites/ryanmac/2015/07/16/amazon-says-prime-day-was-huge-success-and-vows-to-repeat-it-despite-customer-criticism/

http://www.forbes.com/sites/ilyapozin/2013/10/17/industry-to-watch-in-2014-healthcare-tech/

Page 7: Software Availability by Resiliency

Problem?

Failures Are Everywhere!

Page 8: Software Availability by Resiliency

Solution?

You can't prevent failures… so you should manage them.

- How? Reduce MTTR.

Availability = MTTF / (MTTF + MTTR)

For a given MTTF, a shorter MTTR gives a higher availability.

Page 9: Software Availability by Resiliency

Reactive Manifesto?

- Responsive: react to users (goal)
- Resilient: react to failures (principle)
- Elastic: react to load (principle)
- Message Driven: interaction between modules / components (method)

http://www.reactivemanifesto.org/

Page 10: Software Availability by Resiliency

Reactive Manifesto and Availability?

Available = Responsive + Resilient

Availability depends on Resiliency.

Page 11: Software Availability by Resiliency

Resiliency

Resiliency means: react to failures.

A resilient system keeps processing transactions, even when there are transient impulses, persistent stresses, or component failures disrupting normal processing. This is what most people mean when they just say stability.

Page 12: Software Availability by Resiliency

Resiliency

Design for Resiliency in the Real World

Page 13: Software Availability by Resiliency

Resiliency

Isolation

Communication

Failures

Isolation Over Functionality & Failures +

Abstraction Over Accessibility

Page 14: Software Availability by Resiliency

Isolation: Functionality, Resources & State

- Single Responsibility
- Share Nothing
- Stateless
- Eventual Consistency & Idempotency
- …
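To make the "Idempotency" item concrete: an idempotent consumer can receive the same message twice (for example after a retry) and still apply it only once. The sketch below is hypothetical; the class name `PaymentLedger` and the message-id scheme are assumptions for illustration, not something shown in the talk.

```python
class PaymentLedger:
    """Hypothetical consumer that applies each message at most once."""

    def __init__(self) -> None:
        self.balance = 0
        self._seen_ids = set()  # ids of messages that were already applied

    def apply(self, message_id: str, amount: int) -> bool:
        """Apply the message; return False if it was a duplicate delivery."""
        if message_id in self._seen_ids:
            return False  # duplicate: state stays untouched
        self._seen_ids.add(message_id)
        self.balance += amount
        return True
```

Because a redelivered message changes nothing, the sender can safely retry after a timeout or a crash.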

Page 15: Software Availability by Resiliency

Isolation: Functionality, Resources & State

Bulkhead: Isolation & Redundancy

Isolation over failure: prevent a chain of failures

https://en.wikipedia.org/wiki/Compartment_(ship)
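In software, a bulkhead is often approximated by giving each dependency its own small, fixed budget of concurrent calls, so that one slow dependency cannot exhaust everything. The following is a minimal sketch under that assumption; `Bulkhead` and `BulkheadFullError` are hypothetical names.

```python
import threading


class BulkheadFullError(RuntimeError):
    """Raised when a compartment has no free slot left."""


class Bulkhead:
    """One compartment: at most max_concurrent calls run inside it at once."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, func, *args, **kwargs):
        # Reject immediately instead of queueing, so a flooded compartment
        # cannot drag its callers down with it.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"bulkhead {self.name!r} is full")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()
```

Each dependency would get its own `Bulkhead` instance; when one of them fills up, calls through the others are unaffected.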

Page 16: Software Availability by Resiliency

Communication

Location Transparency

DNS, Load Balancers, Message Brokers

Face 2 Face vs. Phone/Email

"Where" != "Who"

Page 17: Software Availability by Resiliency

Communication

Avoid Call Stack

Page 18: Software Availability by Resiliency

Communication

Async – 1: Event Driven

- Concurrency isn't easy!
- Breaks isolation (state, resources, context)
- Isolation over failure
- Lock-free

Page 19: Software Availability by Resiliency

Communication

Async – 2: Message Driven

The Big Idea is “Messaging” – Alan Kay

vs.

Page 20: Software Availability by Resiliency

Communication

Async – 2: Message Driven

The Big Idea is “Messaging” – Alan Kay

- Lock-free & non-blocking
- Leads to elasticity (scalability)
- Throttling
- Location transparency
- Isolation over failure
- Share nothing & bulkhead
- Concurrency is easy!
- Very flexible
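A rough way to see these properties together is a tiny actor-like component: it owns its state, receives messages through a bounded mailbox, and never shares that state with callers. This is a sketch using only the Python standard library, not the API of any actor framework; `CounterActor` and its methods are hypothetical names.

```python
import queue
import threading


class CounterActor:
    """Owns its state; the outside world can only send it messages."""

    _STOP = object()  # private sentinel message that ends the actor loop

    def __init__(self, mailbox_size: int = 100) -> None:
        self._mailbox = queue.Queue(maxsize=mailbox_size)  # bounded mailbox
        self._total = 0  # modified only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, amount: int) -> bool:
        """Fire-and-forget send; returns False instead of blocking when full."""
        try:
            self._mailbox.put_nowait(amount)
            return True
        except queue.Full:
            return False  # throttling: the sender decides what to do next

    def stop(self) -> int:
        """Ask the actor to finish queued messages, then return the total."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._total

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                return
            self._total += message
```

Senders need no locks of their own because they never touch `_total`, and the bounded mailbox gives a natural place to throttle.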


Page 21: Software Availability by Resiliency

Communication

And more…

- Avoid unlimited resources
- Strict and bug-free API
- Use timeouts
- ...
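As an illustration of "use timeouts" together with a bounded number of workers, the sketch below runs a call on a small thread pool and gives up waiting after a deadline. The names `call_with_timeout` and `slow_service` are hypothetical, and the pool size of 4 is an arbitrary assumption.

```python
import concurrent.futures
import time

# Bounded worker count: at most 4 calls run at the same time.
# Note that the executor's internal work queue is itself unbounded.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, timeout_seconds: float, *args):
    """Run func(*args) on the pool and wait at most timeout_seconds."""
    future = _POOL.submit(func, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if the call has not started yet;
        # a running call is not interrupted.
        future.cancel()
        raise TimeoutError(f"no reply within {timeout_seconds} s") from None


def slow_service(delay_seconds: float) -> str:
    """Hypothetical dependency that may take too long to answer."""
    time.sleep(delay_seconds)
    return "reply"
```

The caller gets its thread back after the deadline even though the slow call may still be running in the background, which is why the worker count should stay bounded as well.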

Page 22: Software Availability by Resiliency

Failure

In a resilient system, failures are first-class (fault tolerance).

Page 23: Software Availability by Resiliency

Failure

Fail Fast: Immediate & Visible

Crash safely, before corrupting state and becoming a ZOMBIE :(

http://bond.trendolizer.com/2015/01/how-to-jump-out-of-a-moving-car-and-survive.html

Page 24: Software Availability by Resiliency

Failure

http://bond.trendolizer.com/2015/01/how-to-jump-out-of-a-moving-car-and-survive.html

Fail Fast: Immediate & Visible

- Circuit Breaker
- Strict API
- Avoid Default Values
- Timeout
- Shed Load
- …
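Three of these items ("Strict API", "Avoid Default Values" and "Shed Load") can be sketched in a few lines. Everything named here (`parse_port`, `admit`, `MAX_PENDING`, `OverloadedError`) is hypothetical and only meant to show the idea of failing immediately and visibly.

```python
MAX_PENDING = 100  # hypothetical limit on queued requests


class OverloadedError(RuntimeError):
    """Raised when new work is rejected to protect work already accepted."""


def parse_port(config: dict) -> int:
    """Strict API: a missing or invalid port raises instead of defaulting."""
    if "port" not in config:
        raise KeyError("config is missing the required key 'port'")
    port = config["port"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise ValueError(f"port must be an integer, got {port!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def admit(pending_requests: int) -> None:
    """Shed load: reject right away when the backlog is already too long."""
    if pending_requests >= MAX_PENDING:
        raise OverloadedError(f"{pending_requests} requests already pending")
```

A silently applied default port would hide a broken configuration until much later; raising at startup makes the failure immediate and visible.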

Page 25: Software Availability by Resiliency

Failure

Circuit Breaker

A little Fail-Fast
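A circuit breaker can be sketched as a small wrapper that counts consecutive failures, then fails fast for a while instead of calling the broken dependency again. This is a simplified, single-threaded illustration with hypothetical names and thresholds, not the implementation of any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_seconds=30.0,
                 clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_after = reset_after_seconds
        self._clock = clock          # injectable so tests can control time
        self._failures = 0           # consecutive failures seen so far
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit is open; failing fast")
            # Otherwise half-open: let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            trial_failed = self._opened_at is not None
            if trial_failed or self._failures >= self._max_failures:
                self._opened_at = self._clock()  # open (or re-open)
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers get an error at once rather than waiting on a dependency that is known to be failing.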


Page 26: Software Availability by Resiliency

Failure

Supervisor

Supervise ME :)

- Monitor
- Restart
- Stop
- Escalate
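The four supervisor duties can be sketched as a loop around a task: watch it, restart it when it fails, and after too many restarts stop and escalate to whoever is above. The function `supervise` and the restart limit are assumptions for illustration; real supervisors (for example in Erlang/OTP or Akka) are considerably richer.

```python
class EscalatedFailure(RuntimeError):
    """Raised to the supervisor's own parent when restarts are exhausted."""


def supervise(task, max_restarts: int = 3):
    """Run task(); restart it on failure, escalate after max_restarts."""
    restarts = 0
    while True:
        try:
            return task()                      # monitor: run and watch
        except Exception as error:
            if restarts >= max_restarts:
                # stop + escalate: give up and hand the problem upwards
                raise EscalatedFailure(
                    f"gave up after {restarts} restarts"
                ) from error
            restarts += 1                      # restart: try again
```

The task itself stays simple and can crash freely; the decision about what to do with the failure lives in one place.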

http://www.topdreamer.com/funny-cute-baby-faces-photos/

Page 27: Software Availability by Resiliency

Failure

Error Kernel

Page 28: Software Availability by Resiliency

Failure

Error Kernel

FAILED

Page 29: Software Availability by Resiliency

Failure

Error Kernel

FAILED

Page 30: Software Availability by Resiliency

Failure

Error Kernel

RESTARTED

Page 31: Software Availability by Resiliency

Failure

Error Kernel

RESTARTED

- One-For-One
- All-For-One
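The difference between the two restart strategies is simply which children get restarted when one of them fails. A minimal sketch, with a hypothetical helper name and the strategy names taken from the slide:

```python
def children_to_restart(strategy: str, failed_child: str, all_children: list) -> list:
    """Return the children a supervisor would restart under each strategy."""
    if failed_child not in all_children:
        raise ValueError(f"unknown child: {failed_child!r}")
    if strategy == "one-for-one":
        return [failed_child]        # only the child that failed
    if strategy == "all-for-one":
        return list(all_children)    # the failed child and all its siblings
    raise ValueError(f"unknown strategy: {strategy!r}")
```

One-for-one fits children that are independent of each other; all-for-one fits children whose state only makes sense together.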

Page 32: Software Availability by Resiliency

Failure

http://askatoddler.com/redundancy/

Redundancy
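One simple use of redundancy is failover: if one replica fails, the same request goes to the next. The sketch below assumes replicas are plain callables, and the name `call_with_failover` is hypothetical.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as error:
            errors.append(error)  # remember the failure and try the next one
    raise RuntimeError(f"all {len(errors)} replicas failed")
```

Retrying on another replica is only safe when the request is idempotent, since the first replica may have partly processed it before failing.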

Page 33: Software Availability by Resiliency

And More ...

Test

- Platform, Tools & Framework

- Pull The Plug

http://blog.mmeconsulting.com/a-simple-but-costly-mistake/man-yanking-electrical-cord/

Page 34: Software Availability by Resiliency

And More ...

Platform

- Experience

- Maturity & Tools

- Platform Dependent: GC, ...

Page 35: Software Availability by Resiliency

And More ...

UI & UX

- Hide Failures

- Consistency
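One possible reading of "Hide Failures" is that the UI keeps showing the last known value when a backend call fails, instead of showing an error page. The helper below is a hypothetical sketch of that idea; `load_for_display`, the cache dictionary and the stale flag are all assumptions.

```python
def load_for_display(fetch, cache: dict, key: str):
    """Return (value, is_stale); fall back to the cached value on failure."""
    try:
        value = fetch(key)
    except Exception:
        # The backend failed: show the last known value if there is one,
        # otherwise None so the UI can render a placeholder.
        return cache.get(key), True
    cache[key] = value  # remember the fresh value for the next failure
    return value, False
```

The stale flag lets the UI stay consistent about what it shows, for example by marking old data discreetly rather than pretending it is fresh.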

Page 36: Software Availability by Resiliency

Summary

- Isolation is the first step toward resiliency

- Better isolation through "async" communication

- Manage failures with fail-fast and supervisors

- Hide failures in the UI & UX

Page 37: Software Availability by Resiliency

Good Luck (^_^)

Your quality of life after release 1.0 depends on choices you make long before that vital milestone.

People's quality of life depends on our choices.

– Me :)

Page 38: Software Availability by Resiliency

Good Luck (^_^)

- Questions?

- Thanks :)

Reza Same'e, Software Developer @ BISPHONE

< [email protected] | http://samee.blog.ir | @reza_samee >

Interested in Scala, Functional and Reactive

We are hiring! If you are interested in Scala / Java or Erlang, let us know: [email protected]