Servers in Action: Towards Distributed Traffic Measurement in Data Centers. Praveen Tammana & Myungjin Lee, School of Informatics, University of Edinburgh. MSN’14 - Cosener's House, Abingdon

Posted on 15-Aug-2020

Page 1: Servers in Action: Towards Distributed Traffic Measurement ...2 Network Measurement in Data Centers Data center applications – High Bandwidth – Latency sensitive Network Management

Servers in Action: Towards Distributed Traffic Measurement in Data Centers

Praveen Tammana & Myungjin Lee School of Informatics

University of Edinburgh

MSN’14 - Cosener's House, Abingdon

Page 2

Network Measurement in Data Centers

● Data center applications
– High bandwidth
– Latency sensitive
● Network management
– Control
    ● Routing, access control
– Measurement
    ● Traffic engineering, security applications
● Basic requirements
– Accuracy, scalability
● Data center requirements
– Programmable, responsive, evolvable

Page 3

Data Center Measurement Requirements

● Responsiveness: quick control-loop decisions (e.g., heavy flow scheduling)

● Programmability: adaptable to dynamic workloads

● Evolvability: software-based measurement module (Bloom filter, trie, hash table)

[Figure: data center topology with core, aggregate, and edge layers]

Page 4

Network Measurement Framework

Network management tasks
● Accounting
● Traffic engineering
● Fault diagnosis
● SLA monitoring
● Anomaly detection (worms, portscans, botnets)
● Forensic analysis

Flow Collector

Flow Monitoring
● Software
● Hardware

Page 5

Software Based - Flow Monitoring

● Sampling

– NetFlow, sFlow

– Copes with high traffic rates despite limited switch resources (SRAM, CPU)

● Problem

– Not accurate (basic requirement)

➔ Flow coverage and accuracy are compromised.

➔ Not suitable for management tasks that require fine-grained flow details.
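
The coverage problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses systematic 1-in-N packet sampling (real NetFlow or sFlow deployments may sample differently), a synthetic packet stream invented for illustration, and hypothetical function names.

```python
from collections import Counter

def sample_one_in_n(packets, n):
    """Systematic 1-in-n sampling: keep every n-th packet of the stream."""
    return [pkt for i, pkt in enumerate(packets) if i % n == 0]

def flow_packet_counts(packets):
    """Count packets per flow id; each packet is a (flow_id, size) tuple."""
    return Counter(flow for flow, _size in packets)

# Synthetic stream (assumption): one heavy flow followed by 50 small flows
# of two packets each.
packets = [("heavy", 1500)] * 1000
for k in range(50):
    packets += [(f"small-{k}", 100)] * 2

N = 100
true_counts = flow_packet_counts(packets)
sampled_counts = flow_packet_counts(sample_one_in_n(packets, N))

# Fraction of flows that show up at all in the sampled data.
coverage = len(sampled_counts) / len(true_counts)
# Scaling the sampled count by N estimates the heavy flow reasonably well,
# but most small flows are never seen.
estimate_heavy = sampled_counts["heavy"] * N

print(f"flows seen: {len(sampled_counts)} of {len(true_counts)}")
print(f"heavy flow estimate: {estimate_heavy} packets")
```

In this toy stream the heavy flow is estimated well while almost all small flows vanish, which is the kind of gap that hurts tasks needing fine-grained flow details.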

[Figure: the packet stream is sampled into counters (# bytes/pkts) that feed management tasks 1 to N: accounting, traffic engineering, fault diagnosis, SLA monitoring, anomaly detection (worms, portscans, botnets), forensic analysis]

Page 6

Hardware Based - Flow Monitoring

● Task Oriented

– Task 1 : Anomaly Detection

– Task 2 : Traffic Engineering

● Problem

– Not Evolvable (DC Requirement)

● Higher speed links (40/100 Gbps)

● SLA monitoring in data centers

[Figure: the packet stream feeds a separate set of counters in SRAM (# bytes/pkts) for each of tasks 1 to N]

Page 7

Network Is Evolving

(Net Optics 2013)

Page 8

Distributed Traffic Measurement

● Our approach :

– Distribute flow monitoring overhead between switches and servers

[Figure: (1) servers monitor their flows (f1 to f5), (2) statistic pkts are sent into the network, (3) switches aggregate the pkts and report results to the collector]

Page 9

Distributed Traffic Measurement

[Figure: data center topology with core, aggregate, and edge layers]

Administrators have complete control of switches and servers

- Servers have high computational resources (multiple cores, large memory)

- Hosts observe the relevant traffic of their running services

- A host monitors less traffic than a switch

Page 10

Proposed Framework

Measurement module
● Producers
● Monitors traffic and generates statistic packets (s-pkts), e.g., per-flow records
● Feeds s-pkts to the ToR switch

Aggregation module
● Consumers
● Aggregates statistic packets (s-pkts)
● Counters can be stored in high-density DRAM
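
The producer and consumer roles can be sketched as two small classes. This is a minimal illustration, not the authors' implementation: the `SPkt` record layout, the class names, and the export-then-reset behaviour are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class SPkt:
    """Hypothetical statistic packet carrying one per-flow record."""
    flow: tuple       # (src_ip, dst_ip, src_port, dst_port, proto)
    pkt_count: int
    byte_count: int

class MeasurementModule:
    """Producer on a server: counts the server's own traffic per flow."""
    def __init__(self):
        self._pkts = defaultdict(int)
        self._bytes = defaultdict(int)

    def observe(self, flow, size):
        """Account for one regular packet of the given size in bytes."""
        self._pkts[flow] += 1
        self._bytes[flow] += size

    def export(self):
        """Emit one s-pkt per flow, then reset the local counters."""
        out = [SPkt(f, self._pkts[f], self._bytes[f]) for f in self._pkts]
        self._pkts.clear()
        self._bytes.clear()
        return out

class AggregationModule:
    """Consumer on a switch: merges s-pkts into per-flow totals."""
    def __init__(self):
        self.totals = defaultdict(lambda: [0, 0])  # flow -> [packets, bytes]

    def consume(self, spkt):
        entry = self.totals[spkt.flow]
        entry[0] += spkt.pkt_count
        entry[1] += spkt.byte_count

# Usage with a made-up flow: the server sees three packets, exports one
# s-pkt, and the switch aggregates it.
flow = ("10.0.0.1", "10.0.1.2", 5000, 80, "tcp")
server = MeasurementModule()
for size in (1500, 1500, 40):
    server.observe(flow, size)

tor_switch = AggregationModule()
for spkt in server.export():
    tor_switch.consume(spkt)
```

The switch handles one s-pkt per flow record instead of updating a counter for every regular packet.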

Page 11

Proposed Framework – Packet Processing

Measurement module

NIC

1. Copy of regular packets

Regular packets

Page 12

Proposed Framework – Packet Processing

Measurement module

NIC

1. Copy of regular packets

Regular packets

2. Statistic packets (s-pkts)

Page 13

Proposed Framework – Packet Processing

Measurement module

NIC

1. Copy of regular packets

Regular packets

2. Statistic packets (s-pkts)

Aggregation module

Ingress port

Egress port

3. Copy of statistic packets (s-pkts)

Page 14

Ongoing work: s-pkt forwarding

● How to forward an s-pkt?

– Packet path encoding and IP source route option

– Use switch forwarding table

Page 15

Use case – Hierarchical Heavy Hitter (HHH)

Threshold: T = 10

Prefix tree (prefix: traffic volume), from root to leaves:

****: 40
0***: 40    1***: 0
00**: 19    01**: 21
000*: 12    001*: 7    010*: 12    011*: 9
0000: 11    0001: 1    0010: 5    0011: 2    0100: 9    0101: 3    0110: 5    0111: 4

HHH: the longest IP prefix that occupies more than a fraction T of link bandwidth after excluding any descendant HHH

Traffic volume for each IP Prefix
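
The definition can be turned into a short bottom-up computation. The sketch below is an illustration under assumptions: the leaf volumes are an assumed reading of the slide's example figure, the threshold is treated as an absolute volume (T = 10) rather than a fraction of link bandwidth, and the function name is hypothetical.

```python
def hierarchical_heavy_hitters(leaf_volumes, threshold):
    """Return {prefix: residual volume} for every HHH, scanning bottom-up.

    leaf_volumes maps fixed-width bit strings (e.g. "0101") to volumes.
    A prefix is an HHH when its volume, after excluding traffic already
    claimed by descendant HHHs, exceeds the threshold.
    """
    width = len(next(iter(leaf_volumes)))
    residual = dict(leaf_volumes)   # unclaimed volume per prefix at this level
    hhh = {}
    for length in range(width, -1, -1):
        parents = {}
        for bits, volume in residual.items():
            if volume > threshold:
                hhh[bits + "*" * (width - length)] = volume
                volume = 0          # claimed here, so not passed upwards
            if length > 0:
                parent = bits[:-1]
                parents[parent] = parents.get(parent, 0) + volume
        residual = parents
    return hhh

# Leaf volumes: an assumed reading of the example tree, not measured data.
leaves = {
    "0000": 11, "0001": 1, "0010": 5, "0011": 2,
    "0100": 9,  "0101": 3, "0110": 5, "0111": 4,
}
result = hierarchical_heavy_hitters(leaves, threshold=10)
print(result)
```

With these values, 0000 and 010* exceed the threshold on their own, and 0*** qualifies only because 17 units remain once those two descendants are excluded.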

Page 16

HHH Detection

[Figure: on each server, an HHH measurement module pre-filters the server's flows (f1 to f5) and emits statistic pkts; the HHH aggregation module maintains an IP prefix trie (source IP) and reports HHHs to the collector]

Page 17

Evaluation

● Simulation setup

– Measurement module : Customized YAF

– Aggregation module : IP Prefix Trie

– Packet trace – T. Benson : University data center

● Aggregation module performance

– HHH Accuracy

– Computation overhead on Servers and switches

– Compared with NetFlow

Page 18

Preliminary Results

[Figure: aggregation module overhead (AMO) relative to NetFlow overhead (NFO)]

[Figure: HHH accuracy at varying sampling rates]

FPR: False Positive Rate

FNR: False Negative Rate

Page 19

Preliminary Results

[Figure: aggregation module overhead (AMO) relative to NetFlow overhead (NFO)]

[Figure: HHH accuracy at varying sampling rates]

FPR: False Positive Rate

FNR: False Negative Rate

Correctness: at a 100% sampling rate, accuracy is 100%

Overhead: AMO is less than 2% of NFO
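
For reference, the two accuracy metrics can be computed from a ground-truth HHH set and a reported HHH set. The slides do not state which denominators were used, so the definitions here are assumptions: FPR is taken as the share of reported prefixes that are not true HHHs, and FNR as the share of true HHHs that were missed. The function name and sample sets are hypothetical.

```python
def hhh_error_rates(true_hhh, reported_hhh):
    """Return (fpr, fnr) under the assumed definitions described above."""
    true_hhh, reported_hhh = set(true_hhh), set(reported_hhh)
    false_pos = reported_hhh - true_hhh   # reported but not actually HHH
    false_neg = true_hhh - reported_hhh   # actual HHH that went unreported
    fpr = len(false_pos) / len(reported_hhh) if reported_hhh else 0.0
    fnr = len(false_neg) / len(true_hhh) if true_hhh else 0.0
    return fpr, fnr

# Made-up example: a sampled monitor misses two fine-grained prefixes
# and reports one coarser prefix that is not a true HHH.
truth = {"0000", "010*", "0***"}
reported = {"0***", "00**"}
fpr, fnr = hhh_error_rates(truth, reported)
print(f"FPR={fpr:.2f} FNR={fnr:.2f}")
```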

Page 20

Conclusions and Future work

● Conclusions
– Our framework offloads measurement overhead from switches
– Evolves along with data center traffic volume
– Provides more flexibility to data center operators
● Future work
– Prototyping the proposed framework
– Exploring performance across different measurement tasks
– End-host-based network troubleshooting (e.g., packet loss, delay)
– Impact of packet loss on accuracy
– Distributing measurement task overhead across the network

Page 21

Thank You

Questions

Praveen Tammana

University of Edinburgh

Page 22

Challenges

– Handling multiple paths between end hosts

– Staying consistent when forwarding rules are updated

Page 23

Measurement Tasks

● Hierarchical Heavy Hitter (HHH)
● Heavy Hitter
● Superspreader
● Flow Size Distribution
● DDoS

Page 24

Proposed Framework : s-pkt forwarding

[Figure: topology with nodes S, R, T1, T2, and A1]

Flow Path : S → T1 → A1 → T2 → R

Encodes path information into packet

Page 25

Proposed Framework : s-pkt Forwarding

[Figure: topology with nodes S, R, T1, T2, and A1; s-pkts travel hop by hop along the reverse of the flow path]

Flow path: S → T1 → A1 → T2 → R

s-pkt path: R → T2 → A1 → T1

1. Generate s-pkt

2. Enable IP source routing option
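
The two steps can be sketched as follows: reverse the flow path to get the s-pkt route, then encode the hops as an IPv4 source route option in the RFC 791 layout (type, length, pointer, route data). This is a hedged illustration rather than the authors' design: the choice of the strict variant, the helper names, and every IP address are assumptions.

```python
import socket

LSRR = 131  # Loose Source and Record Route option type (RFC 791)
SSRR = 137  # Strict Source and Record Route option type (RFC 791)

def spkt_route(flow_path):
    """Return (first_hop, remaining_hops) for an s-pkt sent by the receiver.

    The s-pkt follows the flow path in reverse, skipping the receiver
    (its origin) and stopping before the original sender.
    """
    hops = list(reversed(flow_path))[1:-1]
    return hops[0], hops[1:]

def source_route_option(hops, option_type=SSRR):
    """Build IPv4 source route option bytes for the given hop addresses."""
    route = b"".join(socket.inet_aton(h) for h in hops)
    length = 3 + len(route)                  # type + length + pointer + route
    option = bytes([option_type, length, 4]) + route  # pointer starts at 4
    padding = (-len(option)) % 4             # options end on a 32-bit boundary
    return option + b"\x00" * padding        # pad with End of Option List

# Hypothetical addresses for S, T1, A1, T2, R.
path = ["10.0.1.10", "10.0.1.1", "10.0.0.1", "10.0.2.1", "10.0.2.10"]
first_hop, rest = spkt_route(path)
option = source_route_option(rest)
# first_hop goes in the IP destination field; the option lists the rest.
print(first_hop, option.hex())
```

Because IPv4 options are limited to 40 bytes, a single option can carry at most nine addresses, which is enough for the short paths in this example.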