resilience from theory to practice
![Page 1: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/1.jpg)
Resilience: From Theory to Practice
by:
Efim Dimenstein - Chief Architect
Ori Cohen - Lead Resilience Engineer
Jan 2016
![Page 2: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/2.jpg)
What is Liveperson
Liveperson transforms the connection between brands
and consumers.
![Page 3: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/3.jpg)
Our Scale
1.5M concurrent visits
3BN visits/month
200BN API calls/month
2 PB data
![Page 4: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/4.jpg)
Our Production
99.97% uptime
6 data centers
1000+ physical servers
6000+ VMs
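To put the 99.97% uptime figure in concrete terms, the short calculation below converts an uptime percentage into a downtime budget. This is only an illustrative sketch: the function name `downtime_minutes` and the choice of a 30-day month and a 365-day year are assumptions, not something stated on the slide.

```python
def downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an uptime percentage over a period."""
    unavailable_fraction = 1.0 - uptime_percent / 100.0
    return unavailable_fraction * period_days * 24 * 60


# Assumed periods: a 30-day month and a 365-day year.
per_month = downtime_minutes(99.97, 30)
per_year = downtime_minutes(99.97, 365)
print(f"{per_month:.2f} minutes per 30-day month")
print(f"{per_year:.2f} minutes per 365-day year")
```

Under those assumptions the budget works out to roughly 13 minutes per month, or a little over two and a half hours per year.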
![Page 5: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/5.jpg)
Our Engineering
Fast release cycle
~250 people in R&D
Constant innovation
Multiple technologies
![Page 6: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/6.jpg)
33 interruptions per month on average :)
![Page 7: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/7.jpg)
The Past
![Page 8: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/8.jpg)
The Past
![Page 9: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/9.jpg)
The Present
![Page 10: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/10.jpg)
LiveEngage Platform
Composable
~100 services
We keep splitting
Much easier to scale
![Page 11: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/11.jpg)
LiveEngage Platform
Services are grouped into types
The platform is divided into layers
![Page 12: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/12.jpg)
LiveEngage Platform
![Page 13: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/13.jpg)
Everything That Can Go Wrong Will Go Wrong
![Page 14: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/14.jpg)
![Page 15: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/15.jpg)
Resilience Pyramid
DC
HW
SERVICE
COMPONENT
CODE
![Page 16: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/16.jpg)
DC Resilience - Global
![Page 17: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/17.jpg)
DC Resilience
Primary
Secondary
![Page 18: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/18.jpg)
Service
Service X: Node 1, Node 2, Node 3, ..., Node N
![Page 19: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/19.jpg)
Service
Service X: Node 1, Node 2, Node 3, ..., Node N
HA Functionality
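The slide does not say how HA across the nodes of a service is implemented. One common approach, shown here purely as a sketch, is for the caller to try each node in turn and move on when one fails. The names `call_service` and `AllNodesFailed`, and the modelling of a node as a plain callable, are hypothetical.

```python
from typing import Callable, List, Sequence


class AllNodesFailed(Exception):
    """Raised when no node of the service could handle the request."""


def call_service(nodes: Sequence[Callable[[str], str]], request: str) -> str:
    """Send the request to the first node that answers, skipping failed nodes."""
    errors: List[Exception] = []
    for node in nodes:
        try:
            return node(request)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)
    raise AllNodesFailed(f"{len(errors)} of {len(nodes)} nodes failed")


def broken_node(request: str) -> str:
    # Hypothetical node that is currently down.
    raise ConnectionError("node is down")


def healthy_node(request: str) -> str:
    # Hypothetical node that answers normally.
    return f"handled {request}"


print(call_service([broken_node, healthy_node], "ping"))
```

With this shape, losing a single node is invisible to the caller as long as at least one other node of the service is still healthy.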
![Page 20: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/20.jpg)
Service Grouping
Administration & Configuration
Real Time
Near Real Time
Offline
![Page 21: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/21.jpg)
![Page 22: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/22.jpg)
Components
Solve once - reuse
The Glue
Level of abstraction
Isolates common problems
![Page 23: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/23.jpg)
Components - Guidelines
Retries
Fallback
Cache
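The three guidelines above can be combined in a single reusable component, in the spirit of "solve once - reuse". The sketch below is an assumption about what such a component might look like, not the presenters' implementation: `ResilientCall`, its parameters, and the exponential backoff between attempts are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, TypeVar

T = TypeVar("T")


class ResilientCall:
    """Hypothetical wrapper: retry a call, cache successes, fall back on failure."""

    def __init__(self, retries: int = 3, base_delay: float = 0.1) -> None:
        self.retries = retries
        self.base_delay = base_delay
        self._cache: Dict[str, object] = {}

    def call(self, key: str, fn: Callable[[], T],
             fallback: Optional[T] = None) -> Optional[T]:
        # Retries: attempt the call several times with a growing delay.
        for attempt in range(self.retries):
            try:
                result = fn()
                self._cache[key] = result  # Cache: remember the last good value.
                return result
            except Exception:
                if attempt < self.retries - 1:
                    time.sleep(self.base_delay * (2 ** attempt))
        # Cache: serve the last known good value if there is one.
        if key in self._cache:
            return self._cache[key]  # type: ignore[return-value]
        # Fallback: otherwise return the caller-supplied default.
        return fallback


def unavailable() -> str:
    # Hypothetical dependency that always fails.
    raise ConnectionError("dependency is down")


client = ResilientCall(retries=2, base_delay=0.01)
print(client.call("greeting", lambda: "hello"))
print(client.call("greeting", unavailable))
print(client.call("other", unavailable, fallback="default"))
```

The second call returns the cached value from the first, and the third returns the fallback because nothing was ever cached under that key.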
![Page 24: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/24.jpg)
![Page 25: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/25.jpg)
@ ground level
![Page 26: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/26.jpg)
trust company
![Page 27: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/27.jpg)
trust engineers
![Page 28: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/28.jpg)
and still evaluate
![Page 29: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/29.jpg)
knowledge is power
![Page 30: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/30.jpg)
tooling
![Page 31: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/31.jpg)
testing
![Page 32: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/32.jpg)
deployment
![Page 33: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/33.jpg)
metrics
![Page 34: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/34.jpg)
logs
![Page 35: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/35.jpg)
E2E
![Page 36: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/36.jpg)
ALERTING
![Page 37: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/37.jpg)
untested ==
unreliable
![Page 38: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/38.jpg)
but… ?
![Page 39: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/39.jpg)
cost effective
![Page 40: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/40.jpg)
visibility
![Page 41: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/41.jpg)
incident injection testing
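The slide names the technique without showing tooling. As a minimal sketch of the idea, the code below wraps a dependency so that it fails on purpose at a configurable rate, then checks that the calling code degrades gracefully. Every name here (`with_incidents`, `InjectedIncident`, `fetch_greeting`) is hypothetical and stands in for whatever tooling is actually used.

```python
import random
from typing import Callable, TypeVar

T = TypeVar("T")


class InjectedIncident(Exception):
    """Raised in place of a real dependency failure during a test."""


def with_incidents(fn: Callable[[], T], failure_rate: float,
                   rng: random.Random) -> Callable[[], T]:
    """Wrap fn so that a fraction of calls fail with an injected incident."""
    def wrapped() -> T:
        if rng.random() < failure_rate:
            raise InjectedIncident("simulated dependency failure")
        return fn()
    return wrapped


def fetch_greeting(dependency: Callable[[], str]) -> str:
    """Code under test: must keep working when the dependency fails."""
    try:
        return dependency()
    except InjectedIncident:
        return "default greeting"


# Inject failures into about half of the calls and verify that the caller
# always returns a usable answer.
rng = random.Random(42)  # seeded so the run is repeatable
flaky = with_incidents(lambda: "hello", failure_rate=0.5, rng=rng)
results = [fetch_greeting(flaky) for _ in range(20)]
assert set(results) <= {"hello", "default greeting"}
```

Seeding the random generator keeps a failing run reproducible, which matters when an injected incident exposes a real bug.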
![Page 42: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/42.jpg)
process
![Page 43: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/43.jpg)
opt-in
![Page 44: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/44.jpg)
resilience @ scale
● multi layered solution
● requires monitoring and testing
● ingrained in the company culture
● keep things simple
● trust and empower your engineers
● break stuff
![Page 45: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/45.jpg)
Thank you!
![Page 46: Resilience from Theory to Practice](https://reader035.vdocuments.net/reader035/viewer/2022070603/5871146b1a28abac6d8b68f7/html5/thumbnails/46.jpg)
Q&A