aws resilience and chaos engineering day

Post on 16-Feb-2022

3 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS Resilience and Chaos Engineering Day

Gunnar Grosch

Sr. Developer Advocate, AWS

@gunnargrosch

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

What we’ll cover in this session

• Challenges with distributed systems

• What chaos engineering is

• Phases of chaos engineering

• Common use cases for chaos engineering

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

© 2021, Amazon Web Services, Inc. or its affiliates.

Challenges withdistributed systems

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Distributed systems are complex

Message

Message

Reply

Reply

ServerNetworkClient

https://aws.amazon.com/builders-library/challenges-with-distributed-systems/

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Traditional testing is not enough

TESTING = VERIFYING A KNOWN CONDITION

Unit testingof components

Tested in isolation to ensure function meets expectations

Functional testingof integrations

Each execution path testedto assure expected results

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

And it can get more complicated…

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

IOError: No space left on device

close failed in file object destructor:

logfile

ROTATE

logfile.0

ROTATE

logfile.1

ROTATE

logfile.2

ROTATE

logfile.3

ROTATE

logfile.n

ROTATE

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

© 2021, Amazon Web Services, Inc. or its affiliates.

What is chaos engineering?

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production

principlesofchaos.org

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production

principlesofchaos.org

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production

principlesofchaos.org

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production

principlesofchaos.org

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Chaos engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production

principlesofchaos.org

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

The process of chaos engineering

• Stressing an application by creating disruptive events

• Observing how the system responds

• Implementing improvements

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Fundamental goals with chaos engineering

• Improve resilience and performance

• Uncover hidden issues

• Expose blind spotsMonitoring, observability, and alarm

• And more

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

When to do chaos experiments

• Adding new code

• Creating new features

• Adding services

• Adding or changing dependencies

• Continuously

• And more

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Don’t ask what happens if a system fails; ask what happens when it fails

28

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

© 2021, Amazon Web Services, Inc. or its affiliates.

Phases of chaos engineering

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Phases of chaos engineering

Steady state

Hypothesis

Run experiment

Verify

Improve

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

© 2021, Amazon Web Services, Inc. or its affiliates.

Use cases

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Use cases

One-offexperiments

Periodic game days

Automated experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Use cases

One-offexperiments

Periodic game days

Automated experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Use cases

One-offexperiments

Periodic game days

Automated experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Use cases

One-offexperiments

Periodic game days

Automated experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Automated experiments

Recurring scheduled

experiments

Event-triggered experiments

Continuous delivery experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Automated experiments

Recurring scheduled

experiments

Event-triggered experiments

Continuous delivery experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Automated experiments

Recurring scheduled

experiments

Event-triggered experiments

Continuous delivery experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Automated experiments

Recurring scheduled

experiments

Event-triggered experiments

Continuous delivery experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Automated experiments

Recurring scheduled

experiments

Event-triggered experiments

Continuous delivery experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Use cases

One-offexperiments

Periodic game days

Automated experiments

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Recap

• Challenges with distributed systems

• What chaos engineering is

• Phases of chaos engineering

• Common use cases for chaos engineering

© 2021, Amazon Web Services, Inc. or its affiliates.

AWS RESILIENCE AND CHAOS ENGINEERING DAY

Thank you!

© 2021, Amazon Web Services, Inc. or its affiliates.

Gunnar Grosch

Sr. Developer Advocate, AWS

@gunnargrosch

top related