Opportunities to Improve System Reliability and Resilience by Donald Belcham

System Reliability and Resilience and stuff

Upload: net-conf-uy

Post on 07-Jul-2015


DESCRIPTION

Opportunities to Improve System Reliability and Resilience Donald Belcham .NET Conf UY 2014 http://netconf.uy

TRANSCRIPT

Page 1: Opportunities to Improve System Reliability and Resilience by Donald Belcham

System Reliability and Resilience

and stuff

Page 2: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Some things need to be cleared up first

Page 3: Opportunities to Improve System Reliability and Resilience by Donald Belcham

http://en.wikipedia.org/wiki/Vedette_(cabaret)

Page 4: Opportunities to Improve System Reliability and Resilience by Donald Belcham

tuple

Page 5: Opportunities to Improve System Reliability and Resilience by Donald Belcham

// Initialize customer and invoice
Initialize(customer, invoice);

Page 6: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public void Initialize(Customer customer, Invoice invoice)
{
    customer.Name = "asdf";
    invoice.Date = DateTime.Now;
}

Page 7: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Initialize(customer, invoice);
// did something happen to customer
// and/or invoice?

Page 8: Opportunities to Improve System Reliability and Resilience by Donald Belcham

customer.Name = InitNameFrom(customer, invoice);
invoice.Date = InitDateFrom(customer, invoice);

Page 9: Opportunities to Improve System Reliability and Resilience by Donald Belcham

customer.Name = GetNameFrom(customer, invoice);
invoice.Date = GetDateFrom(customer, invoice);

Page 10: Opportunities to Improve System Reliability and Resilience by Donald Belcham

var results = Initialize(customer, invoice);

customer.Name = results.Item1;
invoice.Date = results.Item2;

Page 11: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public Tuple<string, DateTime> Initialize(Customer customer, Invoice invoice)
{
    // Return the new values instead of mutating the arguments
    return new Tuple<string, DateTime>("asdf", DateTime.Now);
}

Page 12: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public static bool TryParse(string s, out DateTime result)

or

public static Tuple<bool, DateTime?> TryParse(string s)

Page 13: Opportunities to Improve System Reliability and Resilience by Donald Belcham

tuple

• Avoid side effects

• Avoid out parameters

• Multiple values without a specific type
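
The out-parameter versus tuple comparison on page 12 could be sketched roughly as follows. This is an illustration only: `DateParser` is a hypothetical wrapper name, not something from the talk, and it simply delegates to the framework's `DateTime.TryParse`.

```csharp
using System;

// Hypothetical wrapper (name assumed for illustration).
public static class DateParser
{
    // Returns the success flag and the parsed value together,
    // so the caller does not need an out parameter.
    public static Tuple<bool, DateTime?> TryParse(string s)
    {
        DateTime parsed;
        return DateTime.TryParse(s, out parsed)
            ? Tuple.Create(true, (DateTime?)parsed)
            : Tuple.Create(false, (DateTime?)null);
    }
}
```

A caller reads `Item1` to see whether parsing worked and `Item2` for the value; nothing passed in is modified.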

Page 14: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null object

Page 15: Opportunities to Improve System Reliability and Resilience by Donald Belcham

private ILogger _logger;

public MyClass(ILogger logger)
{
    _logger = logger;
}

if (_logger != null)
{
    _logger.Debug("it worked on my machine!");
}

Page 16: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null checks for everyone!

Page 17: Opportunities to Improve System Reliability and Resilience by Donald Belcham

forget one and…

Page 18: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public class NullLogger : ILogger
{
    public void Debug(string text)
    {
        // do sweet nothing
    }
}

Page 19: Opportunities to Improve System Reliability and Resilience by Donald Belcham

private ILogger _logger = new NullLogger();

public MyClass(ILogger logger)
{
    _logger = logger;
}

_logger.Debug("it worked on my machine!");

Page 20: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null object

• Can eliminate null checks

• Simple to implement
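
Put together, the null object slides might look like the sketch below. It makes one assumption beyond the slides: the constructor falls back to `NullLogger` when it is handed `null`, since otherwise a caller passing `null` would overwrite the field default. `DoWork` is a hypothetical method added only so the example has something to call.

```csharp
using System;

public interface ILogger
{
    void Debug(string text);
}

// Null object: satisfies the interface and deliberately does nothing.
public class NullLogger : ILogger
{
    public void Debug(string text) { }
}

public class MyClass
{
    private readonly ILogger _logger;

    // Assumption: a missing or null logger is replaced by the null object.
    public MyClass(ILogger logger = null)
    {
        _logger = logger ?? new NullLogger();
    }

    // Hypothetical method; note there is no null check before logging.
    public void DoWork()
    {
        _logger.Debug("it worked on my machine!");
    }
}
```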

Page 21: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Circuit Breaker

Page 22: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 23: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Retry

Page 24: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Out of Process Dependency

N times

Page 25: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Out of Process Dependency

N times * Y clients

Page 26: Opportunities to Improve System Reliability and Resilience by Donald Belcham

= Denial of Service Attack

Page 27: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Limit the # of retries

Page 28: Opportunities to Improve System Reliability and Resilience by Donald Belcham

N * Y becomes 5 * Y

Page 29: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Y is still a problem
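
A capped retry loop, as described on the slides above, might be sketched like this. The `Retry` class and its `Execute` signature are assumptions made for illustration, and the cap of 5 is only the number the slides happen to use. Note that this only bounds N; it does nothing about Y.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the talk).
public static class Retry
{
    // Runs the operation up to maxAttempts times and returns the first success.
    public static T Execute<T>(Func<T> operation, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        var failures = new List<Exception>();
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Keep every failure so no exception information is lost.
                failures.Add(ex);
            }
        }

        throw new AggregateException("All attempts failed.", failures);
    }
}
```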

Page 30: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 31: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Circuit Breaker

Page 32: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 33: Opportunities to Improve System Reliability and Resilience by Donald Belcham

State Machine

On :: Off

Page 34: Opportunities to Improve System Reliability and Resilience by Donald Belcham

On → Off when not healthy

Page 35: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Off → On manually

Page 36: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Get to software before we ask you to dance

Page 37: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Out of Process Dependency

Healthy or Unhealthy

Page 38: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Out of Process Dependency

State is independent of requestor

Page 39: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Has many independent external dependencies

Page 40: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Can throttle itself

Page 41: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Has a wait threshold

Page 42: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application, Circuit Breaker, External Dependency

Threshold = 2
Pause = 10ms
Timeout = 30s
State = Closed

Request
Request
Failure (e.g. HTTP 500)
Failure Count = 1
Pause 10ms
Request
Failure (e.g. HTTP 500)
Failure Count = 2
State = Open
OperationFailedException

Page 43: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application, Circuit Breaker, External Dependency

Threshold = 2
Pause = 10ms
Timeout = 30s
State = Open

Request
30s has not passed
CircuitBreakerOpenException

Request
30s has not passed
CircuitBreakerOpenException

System can try to become healthy for 30s

Page 44: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application, Circuit Breaker, External Dependency

Threshold = 2
Pause = 10ms
Timeout = 30s
State = ½ Open

30s has passed

Request
Request
Failure (e.g. HTTP 500)
Failure Count = 2
State = Open
OperationFailedException

Page 45: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application, Circuit Breaker, External Dependency

Threshold = 2
Pause = 10ms
Timeout = 30s
State = ½ Open

30s has passed

Request
Request
Response
Failure Count = 0
State = Closed
Response

Page 46: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Closed

Open

½ Open

Page 47: Opportunities to Improve System Reliability and Resilience by Donald Belcham

½ Open is like a manual reset

Page 48: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Pause

Timeout

Page 49: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Pause between calls in the loop

Page 50: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Timeout before you can call again

Page 51: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Exceptions

Page 52: Opportunities to Improve System Reliability and Resilience by Donald Belcham

OperationFailed:

AggregateException

Page 53: Opportunities to Improve System Reliability and Resilience by Donald Belcham

CircuitBreakerOpen:

ApplicationException

Page 54: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Don't Lose Exception Info

Page 55: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Always use InnerException(s)
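
The walkthrough on pages 42 to 45, together with the two exception types above, could be sketched as the class below. This is a minimal illustration under stated assumptions, not the speaker's implementation: the `CircuitBreaker` constructor, the `Execute` method, the `CircuitState` enum, and the injectable clock are all hypothetical, and holding a lock while pausing is a simplification that serializes callers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Raised when the attempts inside the breaker failed; every failure is kept as an inner exception.
public class OperationFailedException : AggregateException
{
    public OperationFailedException(string message, IEnumerable<Exception> failures)
        : base(message, failures) { }
}

// Raised when a call arrives while the breaker is open and the timeout has not passed.
public class CircuitBreakerOpenException : ApplicationException
{
    public CircuitBreakerOpenException(string message) : base(message) { }
}

public enum CircuitState { Closed, Open, HalfOpen }

public class CircuitBreaker
{
    private readonly int _threshold;      // failures allowed before opening
    private readonly TimeSpan _pause;     // wait between calls in the loop
    private readonly TimeSpan _timeout;   // wait before a call is allowed again
    private readonly Func<DateTime> _clock;
    private readonly object _sync = new object();
    private int _failureCount;
    private DateTime _openedAt;

    public CircuitState State { get; private set; }

    public CircuitBreaker(int threshold, TimeSpan pause, TimeSpan timeout, Func<DateTime> clock = null)
    {
        _threshold = threshold;
        _pause = pause;
        _timeout = timeout;
        _clock = clock ?? (() => DateTime.UtcNow);
        State = CircuitState.Closed;
    }

    public T Execute<T>(Func<T> operation)
    {
        lock (_sync)
        {
            if (State == CircuitState.Open)
            {
                if (_clock() - _openedAt < _timeout)
                    throw new CircuitBreakerOpenException("Circuit is open.");
                State = CircuitState.HalfOpen; // timeout passed: allow one trial call
            }

            var failures = new List<Exception>();
            while (true)
            {
                try
                {
                    var result = operation();
                    _failureCount = 0;
                    State = CircuitState.Closed;
                    return result;
                }
                catch (Exception ex)
                {
                    failures.Add(ex);
                    // A failed trial call, or reaching the threshold, opens the circuit.
                    if (State == CircuitState.HalfOpen || ++_failureCount >= _threshold)
                    {
                        State = CircuitState.Open;
                        _openedAt = _clock();
                        throw new OperationFailedException("Operation failed.", failures);
                    }
                }
                Thread.Sleep(_pause);
            }
        }
    }
}
```

With `threshold = 2`, two consecutive failures open the circuit and surface an `OperationFailedException` carrying both inner exceptions; calls made before the timeout elapses fail fast with `CircuitBreakerOpenException` without touching the dependency.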

Page 56: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application, Circuit Breaker, External Dependency

Threshold = 3
State = Closed

Request
Request
Failure (e.g. HTTP 500)
Failure Count = 1
Request
Failure (e.g. HTTP 500)
Failure Count = 2
Request?
Response
Failure Count = 0
State = Closed
Response

Page 57: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Segregate Dependencies

Page 58: Opportunities to Improve System Reliability and Resilience by Donald Belcham

circuitBreaker("database")

circuitBreaker("weatherservice")

Page 59: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Dependency type, endpoint svc,

endpoint
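
One way to keep a separate breaker per dependency, in the spirit of `circuitBreaker("database")` above, is a keyed registry. The sketch below is an assumption-laden illustration: `Breakers` and `NamedBreaker` are hypothetical names, and `NamedBreaker` is only a stand-in that tracks a failure count rather than a full breaker. The key could be a dependency type, a service, or a single endpoint, depending on how finely the state should be segregated.

```csharp
using System;
using System.Collections.Concurrent;

// Stand-in for a real breaker (hypothetical): it only tracks its own failures.
public class NamedBreaker
{
    public string Name { get; private set; }
    public int FailureCount { get; private set; }

    public NamedBreaker(string name) { Name = name; }

    public void RecordFailure() { FailureCount++; }
}

public static class Breakers
{
    private static readonly ConcurrentDictionary<string, NamedBreaker> Registry =
        new ConcurrentDictionary<string, NamedBreaker>(StringComparer.OrdinalIgnoreCase);

    // Returns the breaker for a key, creating it on first use.
    public static NamedBreaker CircuitBreaker(string key)
    {
        return Registry.GetOrAdd(key, k => new NamedBreaker(k));
    }
}
```

Because each key has its own state, a failing weather service does not open the circuit for the database.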

Page 60: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Where?

Page 61: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Proxy

Circuit Breaker

Out of Process Dependency
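
Placing the breaker inside the proxy means callers keep using the same interface and never see the breaker. A rough sketch, with every name assumed for illustration (`IWeatherService`, `StubWeatherService`, `WeatherServiceProxy`): the proxy takes a guard delegate, which in practice could be a circuit breaker's execute method.

```csharp
using System;

// Hypothetical service contract.
public interface IWeatherService
{
    string GetForecast(string city);
}

// Stand-in for the real out-of-process client.
public class StubWeatherService : IWeatherService
{
    public string GetForecast(string city)
    {
        return "sunny in " + city;
    }
}

// Proxy: same interface as the real client, but every call goes through the guard.
public class WeatherServiceProxy : IWeatherService
{
    private readonly IWeatherService _inner;
    private readonly Func<Func<string>, string> _guard;

    public WeatherServiceProxy(IWeatherService inner, Func<Func<string>, string> guard)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (guard == null) throw new ArgumentNullException("guard");
        _inner = inner;
        _guard = guard;
    }

    public string GetForecast(string city)
    {
        return _guard(() => _inner.GetForecast(city));
    }
}
```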

Page 62: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Watch for Inception

Page 63: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Your Application

Circuit Breaker

Web Service

Proxy

Circuit Breaker

Repository

Database

Page 64: Opportunities to Improve System Reliability and Resilience by Donald Belcham

circuit breaker

• Retry looping

• Slow down attempts

• Good neighbour

Page 65: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Thank you very much!

Page 66: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Thank you

Donald Belcham

@dbelcham

[email protected]