Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar

Upload: paul-hodges

Post on 11-Jan-2016


Page 1: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar

Monitoring Latency Sensitive Enterprise Applications on the Cloud

Shankar Narayanan, Ashiwan Sivakumar

Page 2

Enterprise Applications (EA)

Stock Trader Benchmark Application

[Figure: application components]

• Front End (FE)
• Business Service (BS)
• Order Processing Service (OS)
• Configuration Service (CS)
• Database (DB)

Page 3

EA as Services

[Figure: Users, FE (2 instances), BS (5 instances), OS (3 instances), DB (2 instances), Load Balancers, Service Endpoints]

Page 4

EA Characteristics

EA property     Relevant cloud characteristic
Scalability     Dynamic deployment sizes
Availability    Geo-redundancy
Economics       Pay-as-you-use
Elasticity      Decoupled services
Low latency     Deploy closer to user groups
Utilization     Load balancing

Notice: Dynamic and distributed nature of cloud deployments.

Reducing user-observed latency is the goal: monitor this!

Page 5

Performance Variation: Time Series and CDF of DB Latency

- Data snapshot covers 4 hours across both days
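A CDF like the one named on this slide can be computed directly from raw latency samples. The sketch below is only an illustration: the sample values and the helper names `empirical_cdf` and `fraction_at_or_below` are hypothetical and are not taken from the measurements.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return the sorted samples and, for each, the fraction of samples <= it."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]

def fraction_at_or_below(samples, threshold_ms):
    """Fraction of requests whose latency is <= threshold_ms."""
    ordered = sorted(samples)
    return bisect_right(ordered, threshold_ms) / len(ordered)

# Hypothetical DB latencies in milliseconds (illustrative values only).
latencies_ms = [12.0, 15.5, 11.2, 90.3, 14.8, 13.1, 250.0, 12.9]
xs, ps = empirical_cdf(latencies_ms)
```

Plotting `ps` against `xs` gives the CDF curve; a long flat tail to the right is what a few very slow requests look like in this view.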

Page 6

Monitoring Framework – Design Goals

• Resilience: Less sensitive to cloud variability
• Scalability: Capable of scaling with component instances
• Portability: Easy to integrate with applications
• Flexibility: Multiple levels of measurement
  – User-level latency
  – Component-level isolation
• Efficiency: Fast and accurate measurements

Page 7

Why is Monitoring Hard?

• Dynamic environment: the number of components changes
• Distributed deployment: needs a collection framework
• Variable request path: different choice of components
• Existing monitoring tools
  – Do not support service-oriented architectures
  – Too detailed
  – Not scalable

Remember: user-observed latency is our goal. Abstract away unnecessary details!

Page 8

Measuring End-points – Existing Tools

[Figure: request flow among Users, FE, BS, and DB with measurement points numbered 1 to 13; message labels: HTTP Request, SOAP Response, HTTP Response, MySQL Replies]

Aggregate!!

Page 9

Measurement Model

[Figure: timing diagram for components C_i, C_{i+1}, and C_{i+2}, with timestamps T_{i-1,i}, T_{i,i+1}, T_{i+1,i+2}, T_{i,i+2}, and their primed variants]

CL_i = component latency of the i-th component

LL_{i,i+1} = link latency between components i and i+1

N = number of components that C_i communicates with

n_j = number of calls made by C_i to each of the j components
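The formula that accompanied these definitions did not survive in the transcript. The sketch below is one plausible reading, not the authors' stated model. It assumes every call is synchronous and sequential, that component latency is the time a component holds a request minus the time it spends waiting on the components it calls, and that link latency is the difference between what the caller and the callee each observe for the same call. The function names are hypothetical.

```python
def component_latency(total_ms, downstream_call_ms):
    """Time spent inside component C_i itself (assumed definition).

    total_ms: time from the request arriving at C_i to C_i replying.
    downstream_call_ms: maps each called component j to the durations
        of the n_j calls C_i made to it, as observed by C_i.
    """
    waited = sum(sum(calls) for calls in downstream_call_ms.values())
    return total_ms - waited

def link_latency(caller_observed_ms, callee_observed_ms):
    """Round-trip cost of the link for one call (assumed definition)."""
    return caller_observed_ms - callee_observed_ms
```

For example, a component that holds a request for 120 ms while making two calls to BS (30 ms and 40 ms) and one call to CS (10 ms) would have a component latency of 40 ms under these assumptions.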

Page 10

Monitoring Framework Architecture

[Figure: two instrumented application components, each with a local log server and local raw log storage; aggregated logs; a notification queue (Notification Q); a global collector]
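The role of the local log server can be sketched as a buffer that turns many per-request log records into a single storage write. This is an illustration under stated assumptions: the class `LocalLogServer`, its methods, and the in-memory dictionary standing in for cloud storage are all hypothetical and do not come from the slides.

```python
from collections import defaultdict

class LocalLogServer:
    """Buffers per-request log records and writes them as one aggregated entry."""

    def __init__(self):
        self._pending = defaultdict(list)  # request_id -> buffered records
        self.storage = {}                  # stand-in for cloud storage
        self.storage_writes = 0            # number of storage writes issued

    def log(self, request_id, record):
        # Called by the instrumented component; no storage access happens here.
        self._pending[request_id].append(record)

    def flush(self, request_id):
        # One storage write per request, however many records were buffered.
        records = self._pending.pop(request_id, [])
        if records:
            self.storage[request_id] = records
            self.storage_writes += 1
        return len(records)
```

Logging ten records for one request and then calling `flush` once leaves `storage_writes` at 1, which is the effect the aggregation slides quantify.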

Page 11

Outline

• Monitoring tool
  – Collection framework
  – Instrumentation framework

Page 12

The Collection Framework

• Each component writes to local storage
• Front-end sends "done" message to local queue
• Queues: decouple producer and consumer entities
• Storage: persistence, no limit on size
• Both: scalable, robust

Question: Why is this the right model? When in doubt, measure!
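The division of labor between storage and queue can be sketched as follows. This is a simplified illustration, not the framework's code: a single dictionary stands in for the per-component local storage, Python's `queue.Queue` stands in for the cloud queue, and the function names are hypothetical.

```python
import queue

local_storage = {}           # stand-in for local storage (persistent records)
done_queue = queue.Queue()   # stand-in for the front-end's local queue

def component_write(component, request_id, latency_ms):
    """A component persists its own measurement; it never touches the queue."""
    local_storage.setdefault(request_id, {})[component] = latency_ms

def front_end_done(request_id):
    """The front-end signals that a request has finished end to end."""
    done_queue.put(request_id)

def collect():
    """Drain the queue and gather the stored records for each finished request."""
    collected = {}
    while True:
        try:
            request_id = done_queue.get_nowait()
        except queue.Empty:
            return collected
        collected[request_id] = local_storage.pop(request_id, {})
```

The queue carries only small "done" notifications, so the collector learns which requests are complete without polling storage, while the bulk of the data stays in storage.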

Page 13

Alternative Model

• All components write to queue
• Collection framework de-queues
• Forms a P2P network to collate the data

Page 14

Experiments on Azure and EC2

• Experiments evaluating performance of storage and queues.

• Real cloud deployments (Microsoft Azure, Amazon AWS)

• Extensive measurements from all data-centers
  – US (East/West/North/South)
  – Europe (West/Central)
  – Asia (East/South East)

Page 15

Performance of Storage and Queues

[Figure: two panels, Microsoft Azure and Amazon AWS; series: Write Q, Read Q, Write Store]

• Measurements made in all 12 datacenter regions (Azure and AWS)
• Experiment length: 24 to 26 hours
• Approx. 100,000 requests to storage, 16,000 requests to the queues
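A measurement of this kind can be sketched as a loop that times each request and then summarizes the samples. The harness below is an assumption about how such an experiment might be structured, not the authors' tooling; `fake_storage_write` is a hypothetical stand-in for a real storage or queue client call.

```python
import statistics
import time

def measure(operation, requests):
    """Run `operation` `requests` times and return each call's latency in ms."""
    latencies = []
    for _ in range(requests):
        start = time.perf_counter()
        operation()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies

def summarize(latencies_ms):
    """Median and 95th-percentile (nearest-rank) latency of a sample."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n) in integer math
    return {"median_ms": statistics.median(ordered), "p95_ms": ordered[rank - 1]}

fake_store = {}

def fake_storage_write():
    # Hypothetical stand-in; a real run would call the provider's client here.
    fake_store[len(fake_store)] = b"log record"
```

Calling `summarize(measure(fake_storage_write, 1000))` returns the two summary figures for one operation type; repeating it per operation and per region would give the comparison the slide plots.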

Page 16

Outline

• Monitoring tool
  – Collection framework
  – Instrumentation framework

Page 17

Instrumentation Framework - Goals

• Minimize coding effort and intervention
• Measure latency at the granularity of user request
• Automate instrumentation as much as possible
• Generate minimal measurement parameters

Page 18

Comparison of Existing Tools

Page 19

Instrumentation Framework

[Figure: Original Application Component, Aspects, Instrumented Application Component]

• Specification for the application end-points (X-Trace: log events)
• Measurement metric specification (X-Trace: meta-data)
• Log format specifications
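The idea of adding measurement through aspects, without editing the body of the original component, can be illustrated with a Python decorator. This is an analogy only: the slides do not show the aspect code, the `events` list is a stand-in log sink rather than the X-Trace API, and the endpoint name `FE.login` is hypothetical.

```python
import functools
import time

events = []   # stand-in for the log sink; this is not the X-Trace API

def measured(endpoint):
    """Wrap a function so its latency is logged without changing its body."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on normal return and on exceptions alike.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                events.append({"endpoint": endpoint, "latency_ms": elapsed_ms})
        return wrapper
    return decorate

@measured("FE.login")   # hypothetical endpoint name
def login(user):
    return f"welcome {user}"
```

Calling `login("alice")` returns the same value as the undecorated function and appends one record to `events`. The endpoint specification lives in the decorator argument, which is the part an automated tool could generate.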

Page 20

Experiment Set-up

• Deployed two similar benchmark applications
  – DayTrader: Amazon AWS
  – StockTrader: Windows Azure (prior work)
• Deployed the collection framework on AWS and Azure
• User sessions and request patterns from DaCapo benchmark suite
• Instrumentation:
  – Automated using aspects: DayTrader (AWS)
  – Custom coded: DayTrader and StockTrader

Page 21

Aggregation Benefit: DayTrader

Storage writes per user request type:

User request type   Without aggregation   With aggregation
                    FE      BS            FE      BS
Login               3       5             1       1
Portfolio           10      10            1       1
Update profile      4       5             1       1
Home                2       2             1       1
Buy                 1       7             1       1
Sell                1       8             1       1
Account             3       3             1       1
Total               24      40            7       7

• User sessions: 20, 1 every 10 seconds
• Results shown for a random user from DaCapo
• 78% of writes eliminated in the above case (transactions benefit)
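The 78% figure follows from the table totals: 24 + 40 = 64 writes without aggregation against 7 + 7 = 14 with it. The short check below reproduces that arithmetic from the per-request rows; the variable names are illustrative.

```python
# (FE, BS) storage writes per request type without aggregation, from the table.
without_aggregation = {
    "Login": (3, 5),
    "Portfolio": (10, 10),
    "Update profile": (4, 5),
    "Home": (2, 2),
    "Buy": (1, 7),
    "Sell": (1, 8),
    "Account": (3, 3),
}

writes_before = sum(fe + bs for fe, bs in without_aggregation.values())
# With aggregation, FE and BS each issue one write per request type.
writes_after = 2 * len(without_aggregation)
reduction_pct = 100 * (writes_before - writes_after) / writes_before
```

Here `reduction_pct` evaluates to 78.125, which the slide rounds to 78%.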

Page 22

Aggregation Benefit: MedRec Application Suite

Storage writes per application:

Application      Without aggregation   With aggregation
                 FE      BS            FE      BS
MedRec App       4       8             1       1
Physician App    8       15            1       1
Admin App        2       5             1       1

• Storage writes reduced by at least 50% from FE, 80% from BS
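The "at least" bounds are the smallest per-application reductions in each column. A quick check against the table values, with illustrative variable names:

```python
# (FE, BS) storage writes per application without aggregation, from the table.
# With aggregation every entry becomes a single write.
medrec_writes = {
    "MedRec App": (4, 8),
    "Physician App": (8, 15),
    "Admin App": (2, 5),
}

def reduction_pct(before, after=1):
    """Percentage of writes eliminated when `before` writes become `after`."""
    return 100 * (before - after) / before

min_fe_reduction = min(reduction_pct(fe) for fe, _ in medrec_writes.values())
min_bs_reduction = min(reduction_pct(bs) for _, bs in medrec_writes.values())
```

The minimum comes from the Admin App in both columns: 2 writes to 1 for FE (50%) and 5 writes to 1 for BS (80%).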

Page 23

Instrumentation Benefit

Category       Handcrafted          X-Trace with Aspect
               Code (# of files)    Code (# of files)
same           15250 (88)           15250 (92)
modified       593 (74)             465 (70)
added          878 (0)              166 (2)
automatable    0 (0)                166 (2)

• FE component code: automatable using aspects with X-Trace
• Cross-component calls: X-Trace object passed as parameter
• New lines of code reduced by ~80%
• SLOC reduced by ~20%
• Aspects can be automated
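The ~80% figure matches the "added" row of the table: 878 new lines when handcrafted against 166 with X-Trace and aspects. The slide does not say which rows the ~20% figure is computed from; the "modified" row gives about 22%, which may be what it refers to, but that is an inference rather than something the source states.

```python
# Line counts from the table (handcrafted, X-Trace with aspects).
added = (878, 166)
modified = (593, 465)

def reduction_pct(handcrafted, with_aspects):
    """Percentage drop in line count when moving from handcrafted to aspects."""
    return 100 * (handcrafted - with_aspects) / handcrafted

added_reduction = reduction_pct(*added)        # about 81%
modified_reduction = reduction_pct(*modified)  # about 22%
```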

Page 24

Future Work

• Scaling the framework
  – Application scale to framework scale ratio
  – Per datacenter? Per VM? Varies per cloud provider?
• Impact of these design decisions on the sensitivity of the framework

Page 25

Conclusions

• Architectural benefits:
  – Generic across application, number of components, access patterns
  – Scalable: decoupled entities
• Aggregation benefits:
  – N writes to storage become one write
  – Log server offloads work from application
• Instrumentation benefits:
  – Easy to integrate with application
  – New lines of code reduced by ~80%
  – SLOC reduced by ~20%

Page 26

Q & A

Page 27

Back up slides

Page 28

Azure Blob Read and Write Latency

Blob read and write take at least 30-40 msec

Page 29

Azure Queue Read and Write Latency

Queue read is costly; write is comparable to blob

Page 30

SQL Azure Performance Issue Snapshot (6 Days)