Post on 12-Feb-2016

Page 1: An Information-theoretic Approach to Network Measurement and Monitoring

An Information-theoretic Approach to Network Measurement and Monitoring

Yong Liu, Don Towsley, Tao Ye, Jean Bolot

Page 2: An Information-theoretic Approach to Network Measurement and Monitoring

Outline
motivation
background
flow-based network model
full packet trace compression
  • marginal/joint
coarser granularity
  • NetFlow and SNMP
future work

Page 3: An Information-theoretic Approach to Network Measurement and Monitoring

Motivation
network monitoring: sensing a network
  • traffic engineering, anomaly detection, …
  • single point vs. distributed
different granularities
  • full traffic trace: packet headers
  • flow level record: timing, volume
  • summary statistics: byte/packet counts
challenges
  • growing scales: high speed link, large topology
  • constrained resources: processing, storage, transmission
  • 30G headers/hour at UMass gateway
solutions
  • sampling: temporal/spatial
  • compression: marginal/distributed

Page 4: An Information-theoretic Approach to Network Measurement and Monitoring

Questions
how much can we compress monitoring traces?
how much information is captured by different monitoring granularities?
  • packet trace/NetFlow/SNMP
how much joint information is there in multiple monitors?
  • joint compression
  • trace aggregation
  • monitor placement

Page 5: An Information-theoretic Approach to Network Measurement and Monitoring

Our Contribution
flow-based network models
  • explore temporal/spatial correlation in network traces
  • projection to different granularity
information theoretic framework
  • entropy: bound/guideline on trace compression
  • quantitative approach for more general problems
validation against measurement from operational network

Page 6: An Information-theoretic Approach to Network Measurement and Monitoring

Entropy & Compression
Shannon entropy of discrete r.v.
compression of i.i.d. symbols (length M) by coding
  • coding:
  • expected code length:
  • info. theoretic bound on compression ratio:
Shannon/Huffman coding
  • assign short codewords to frequent outcomes
  • achieve the H(X) bound
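The relationship between entropy and code length can be checked numerically. The sketch below uses the standard definition H(X) = -sum_x p(x) log2 p(x) and a textbook Huffman construction; the function names and the sample field values are illustrative assumptions, not taken from the slides. For symbols stored in M bits, H(X)/M is then the entropy-based lower bound on the compression ratio, and a Huffman code has expected length L with H(X) <= L < H(X) + 1.

```python
import heapq
import math
from collections import Counter


def shannon_entropy(symbols):
    """Plug-in Shannon entropy, in bits per symbol, of a list of samples."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def expected_code_length(symbols):
    """Average Huffman codeword length, in bits per symbol."""
    counts = Counter(symbols)
    lengths = huffman_code_lengths(symbols)
    n = len(symbols)
    return sum(counts[s] / n * lengths[s] for s in counts)


# Hypothetical 8-bit header field with probabilities 1/2, 1/4, 1/8, 1/8.
field = [64] * 8 + [128] * 4 + [255] * 2 + [32] * 2
M = 8  # raw bits per symbol (assumed)
h = shannon_entropy(field)
print(h, expected_code_length(field), h / M)
```

With these dyadic probabilities the Huffman code meets the entropy exactly; for general distributions it only comes within one bit per symbol.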

Page 7: An Information-theoretic Approach to Network Measurement and Monitoring

Entropy & Correlation
joint entropy
entropy rate of stochastic process
  • exploit temporal correlation
  • Lempel-Ziv coding (LZ77, gzip, WinZip): asymptotically achieves the bound for stationary processes
joint entropy rate of correlated processes
  • exploit spatial correlation
  • Slepian-Wolf coding (distributed compression): encode each process individually, achieve the joint entropy rate in the limit
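For reference, the quantities named on this slide have the following standard textbook definitions. The symbols (X, Y for two correlated sources, R_X and R_Y for their individual coding rates) are chosen here for illustration and may differ from the notation on the original slides.

```latex
\begin{gather}
% joint entropy of two discrete random variables
H(X,Y) = -\sum_{x,y} p(x,y)\,\log_2 p(x,y) \\
% entropy rate of a stochastic process {X_i}
H(\mathcal{X}) = \lim_{n\to\infty} \frac{1}{n}\,H(X_1,\dots,X_n) \\
% Slepian-Wolf rate region for separate encoding, joint decoding
R_X \ge H(X \mid Y), \qquad R_Y \ge H(Y \mid X), \qquad R_X + R_Y \ge H(X,Y)
\end{gather}
```

The last line is what makes distributed compression attractive for monitoring: the sum rate can approach the joint entropy even though the encoders do not communicate.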

Page 8: An Information-theoretic Approach to Network Measurement and Monitoring

Network Trace Compression
naïve way: treat as byte stream, compress by generic tools
  • gzip compresses UMass traces by a factor of 2
network traces are highly structured data
  • multiple fields per packet
      • diversity in information richness
      • correlation among fields
  • multiple packets per flow
      • packets within a flow share information
      • temporal correlation
  • multiple monitors traversed by a flow
      • most fields unchanged within the network
      • spatial correlation
network models
  • explore correlation structure
  • quantify information content of network traces
  • serve as lower bounds/guidelines for compression algorithms

Page 9: An Information-theoretic Approach to Network Measurement and Monitoring

Packet Header Trace
[Figure: packet header record laid out in 32-bit rows (bit positions 0, 16, 31), in three parts]
Timing: time stamp (sec.), time stamp (sub-sec.)
IP Header: vers., HLen, ToS, total length, IPID, flags, fragment offset, TTL, protocol, header checksum, source IP address, destination IP address
TCP Header: source port, destination port, data sequence number, acknowledgment number, Hlen, TCP flags, window size, checksum, urgent pointer

Page 10: An Information-theoretic Approach to Network Measurement and Monitoring

Header Field Entropy
[Figure: the packet header layout (Timing, IP Header, and TCP Header fields in 32-bit rows), annotated with the labels "flow id" and "time"]

Page 11: An Information-theoretic Approach to Network Measurement and Monitoring

Single Point Packet Trace
packet trace: (T0, F0), (T1, F1), (T3, F0), (Tn, Fn), (Tm, F0)
temporal correlation introduced by flows
  • packets from the same flow are closely spaced in time
  • they share header information
packet inter-arrival:
# bits per packet:
flow based trace: (T0, F0), (T3, F0), (Tm, F0)
flow record: F0 (flow ID), K (flow size), T0 (arrival time), packet inter-arrivals
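The regrouping from a packet trace into flow records can be sketched as follows. This is a minimal illustration, assuming the trace is a time-ordered list of (timestamp, flow ID) pairs; the function name and the dictionary keys are hypothetical, chosen to mirror the record fields named on the slide.

```python
def to_flow_records(packets):
    """Group a time-ordered packet trace into per-flow records.

    packets: iterable of (timestamp, flow_id) pairs, sorted by timestamp.
    Returns one record per flow, in order of first-packet arrival, holding
    the flow ID, flow size K, arrival time T0, and packet inter-arrivals.
    """
    records = {}    # flow_id -> record (dicts keep insertion order)
    last_seen = {}  # flow_id -> timestamp of that flow's previous packet
    for ts, fid in packets:
        if fid not in records:
            records[fid] = {"flow_id": fid, "size": 0, "arrival": ts, "gaps": []}
        else:
            records[fid]["gaps"].append(ts - last_seen[fid])
        records[fid]["size"] += 1
        last_seen[fid] = ts
    return list(records.values())


# Illustrative trace: flow F0 sends three packets, flow F1 sends one.
trace = [(0, "F0"), (1, "F1"), (3, "F0"), (7, "F0")]
flows = to_flow_records(trace)
print(flows)
```

The flow ID is stored once per flow instead of once per packet, and absolute timestamps are replaced by small inter-arrival gaps, which is where the flow-based representation saves bits.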

Page 12: An Information-theoretic Approach to Network Measurement and Monitoring

Network Models
flow-based model
  • flow arrivals follow a Poisson process with rate:
  • flows are classified into independent flow classes according to routing (the set of routers traversed)
  • flow i is described by:
      • flow inter-arrival time:
      • flow ID:
      • flow length:
      • packet inter-arrival time within the flow:
packet arrival stochastic process:

Page 13: An Information-theoretic Approach to Network Measurement and Monitoring

Entropy in Flow Record
# bits per flow:
# bits per second:
marginal compression ratio: determined by flow length (pkts.) and variability in pkt. inter-arrival
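To see why flow length and inter-arrival variability drive the marginal compression ratio, consider the back-of-the-envelope sketch below. The decomposition is an assumption made for illustration (record fields treated as independent, packet gaps within a flow treated as i.i.d.), not a formula taken from the slides, and all numeric inputs are invented.

```python
def bits_per_flow(h_flow_gap, h_flow_id, h_flow_len, h_pkt_gap, mean_len):
    """Approximate bits needed for one flow record.

    Assumed decomposition: entropy of the flow inter-arrival time, the flow
    ID, and the flow length, plus one packet inter-arrival entropy for each
    packet after the first.
    """
    return h_flow_gap + h_flow_id + h_flow_len + (mean_len - 1) * h_pkt_gap


def marginal_compression_ratio(bits_flow, mean_len, raw_bits_per_packet):
    """Entropy of the flow record divided by the raw size of its packets."""
    return bits_flow / (mean_len * raw_bits_per_packet)


def bits_per_second(flow_rate, bits_flow):
    """Information rate when flows arrive at flow_rate flows per second."""
    return flow_rate * bits_flow


# Assumed raw record: 104-bit 5-tuple flow ID plus 64-bit timestamp.
RAW_BITS = 104 + 64
short = bits_per_flow(8, 104, 4, 6, mean_len=10)
long_ = bits_per_flow(8, 104, 4, 6, mean_len=100)
print(marginal_compression_ratio(short, 10, RAW_BITS))
print(marginal_compression_ratio(long_, 100, RAW_BITS))
```

Under these assumptions the per-flow overhead (flow ID, length, arrival) is amortized over more packets as flows get longer, so the ratio falls toward the per-packet gap entropy divided by the raw packet size; more variable gaps raise that floor.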

Page 14: An Information-theoretic Approach to Network Measurement and Monitoring

Single Point Compression: Results

Trace     H (total)   Model Ratio   Compression Algorithm
C1-in     706.3772    0.2002        0.6425
BB1-out   736.1722    0.2139        0.6574
BB2-out   689.9066    0.2186        0.6657

the compression ratio lower bound calculated from entropy is much lower than the ratio achieved by the real compression algorithm
how the real compression algorithm differs from the model:
  • it records IPID, packet size, TCP/UDP fields
  • it uses a fixed packet buffer for each flow => many flow records for long flows

[Figure: router with monitored links C1-in, C2-in, BB1-out, BB2-out]

Page 15: An Information-theoretic Approach to Network Measurement and Monitoring

Distributed Network Monitoring
single flow recorded by multiple monitors
spatial correlation: traces collected at distributed monitors are correlated
marginal node view: # bits/sec to represent flows seen by one node, bound on single point compression
network system view: # bits/sec to represent flows crossing the network, bound on joint compression
joint compression ratio: quantifies the gain of joint compression

Page 16: An Information-theoretic Approach to Network Measurement and Monitoring

Baseline Joint Entropy Model
“perfect” network
  • fixed routes/constant link delay/no packet loss
flow classes based on routes
  • flows arrive with rate:
  • # of monitors traversed:
  • # bits per flow record:
info. rate at node v:
network view info. rate:
joint compression ratio:
  • dependence on # of monitors traversed
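The dependence on the number of monitors traversed can be illustrated with a small calculation. The sketch assumes that, in the "perfect" network, every monitor on a flow's route records the same information, so the network-wide view needs each flow record once while the marginal views together store it once per monitor. The function name and the input tuples are hypothetical.

```python
def joint_compression_ratio(classes):
    """Network-view information rate over the sum of per-node rates.

    classes: list of (arrival_rate, bits_per_record, monitors_traversed),
    one tuple per flow class.
    """
    # Joint view: each flow record is needed only once.
    network_rate = sum(rate * bits for rate, bits, _ in classes)
    # Marginal views: every traversed monitor stores its own copy.
    marginal_sum = sum(rate * bits * m for rate, bits, m in classes)
    return network_rate / marginal_sum


# Every flow crosses exactly two monitors.
two_hop = joint_compression_ratio([(10.0, 700.0, 2), (5.0, 650.0, 2)])
# Half of the traffic crosses one monitor, half crosses two.
mixed = joint_compression_ratio([(10.0, 700.0, 1), (10.0, 700.0, 2)])
print(two_hop, mixed)
```

Under this assumption, a set of monitors in which every flow is seen exactly twice gives a ratio of 0.5, and the ratio moves toward 1 as more flows are seen by only one monitor in the set.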

Page 17: An Information-theoretic Approach to Network Measurement and Monitoring

Joint Compression: Results

Set of Traces                        Joint Compression Ratio
{C1-in, BB1-out, C2-in, BB2-out}     0.5
{C1-in, BB1-out}                     0.8649
{C1-in, BB2-out}                     0.8702
{C2-in, BB1-out}                     0.7125
{C2-in, BB2-out}                     0.6679

[Figure: router with monitored links C1-in, C2-in, BB1-out, BB2-out]

Page 18: An Information-theoretic Approach to Network Measurement and Monitoring

Coarser Granularity Models
NetFlow model
  • similar to flow model: joint compression result similar to full trace
SNMP model
  • any link SNMP rate process is the sum of the rate processes of all flow classes passing through that link
  • traffic rates of flow classes are independent Gaussian
  • entropy can be calculated from the covariance of these processes
  • information loss due to summation
      • small joint information between monitors
      • difficult to recover rates of flow classes from SNMP data
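The effect of summation on joint information can be illustrated with two links that share one flow class. The sketch below assumes each link rate is the sum of a shared Gaussian class rate and an independent link-only Gaussian class rate, and uses the standard mutual information of jointly Gaussian variables, I(A;B) = -1/2 log2(1 - rho^2). The function name and the variances are invented for illustration.

```python
import math


def link_mutual_information(var_shared, var_a_only, var_b_only):
    """Mutual information, in bits, between two Gaussian link rates.

    Link A carries the shared class plus its own traffic; link B likewise.
    All class rates are assumed independent Gaussian, so the covariance
    between the two link rates equals the variance of the shared class.
    """
    var_a = var_shared + var_a_only
    var_b = var_shared + var_b_only
    rho_sq = var_shared ** 2 / (var_a * var_b)  # squared correlation
    return -0.5 * math.log2(1.0 - rho_sq)


# Shared class is a small part of each link's variability.
small_overlap = link_mutual_information(1.0, 9.0, 9.0)
# Shared class dominates both links.
large_overlap = link_mutual_information(9.0, 1.0, 1.0)
print(small_overlap, large_overlap)
```

When the shared class accounts for only a small part of each link's variance, the two SNMP rate series carry almost no information about each other, which matches the slide's point that summation leaves little joint information between monitors.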

Page 19: An Information-theoretic Approach to Network Measurement and Monitoring

Joint Compression Ratio of Different Granularity

Set of Traces        SNMP     NetFlow   Packet Trace
{C1-in, BB1-out}     1.0021   0.8597    0.8649
{C1-in, BB2-out}     0.9997   0.8782    0.8702

[Figure: router with monitored links C1-in, C2-in, BB1-out, BB2-out]

Page 20: An Information-theoretic Approach to Network Measurement and Monitoring

Conclusion
information theoretic bound on marginal compression ratio: ~20% (time + flow ID; even lower if other low-entropy fields are included)
marginal compression ratio is high (not very compressible) in SNMP, lower in NetFlow, and lowest in full trace
joint coding is much more useful/necessary in the full trace case than in SNMP
“More entropy for your buck”

Page 21: An Information-theoretic Approach to Network Measurement and Monitoring

Future Work
network impairments
  • how many more bits for delay/loss/route change
model NetFlow with sampling
distributed compression algorithms
lossless vs. lossy compression
entropy based monitor placement
  • maximize information under constraints

Page 22: An Information-theoretic Approach to Network Measurement and Monitoring


Thanks!