Internet Measurement 2007

Upload: peter-conley

Post on 27-Dec-2015


Page 1: Internet Measurement 2007. Outline Measurement overview –Why measure? Why model measurements? –What to measure? Where to measure? Internet challenges

Internet Measurement

2007

Page 2: Outline

• Measurement overview
  – Why measure? Why model measurements?
  – What to measure? Where to measure?
• Internet challenges
• Measurement tools
  – Active: ping, traceroute, and pathchar
  – Passive: logs, SNMP, packet, and flow monitoring
• Operational applications of measurement
• Discussion

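Page 3: Performance Evaluation Techniques (Experiment/Measurement, Analysis, Simulation/Emulation)

• Experiment/measurement techniques: use measurement devices or measurement programs (software) to directly measure a computer system's performance metrics, or quantities related to them, and then compute the corresponding performance metrics from those values.
• Modeling techniques: build an appropriate model of the computer system under evaluation, then derive the model's performance metrics in order to evaluate the system. Modeling divides into two techniques: analysis and simulation.
• Analysis applies mathematical methods: the system is simplified and an analytical model is built to obtain the system's performance.
• Simulation applies software simulation: a simulation model is constructed that describes the computer system in detail and realistically. As the model runs the way the system itself does, statistics on the system's dynamic behavior are collected, yielding the relevant performance metrics.
• Measurement, analysis, and simulation are interrelated, validate one another, and each has its own strengths and weaknesses.
• Emulation: simulation + experiment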

Page 4: Why Measure?

• The Internet is a man-made system, so why do we need to measure it?
  – Because we still don’t really understand it
  – Because sometimes things go wrong
  – Analyze/characterize network phenomena
• Measurement for network operations
  – Detecting and diagnosing problems
  – What-if analysis of future changes
• Measurement for scientific discovery
  – Characterizing a complex system as an organism
  – Creating accurate models that represent reality
  – Identifying new features and phenomena
  – Testing new tools, protocols, systems, etc.

Page 5: Why Build Models of Measurements?

• Compact summary of measurements
  – Efficient way to represent a large data set
  – E.g., exponential distribution with mean 100 sec
• Expose important properties of measurements
  – Reveals underlying cause or engineering question
  – E.g., mean RTT to help explain TCP throughput
• Generate random but realistic data as input
  – Generate new data that agree in key properties
  – E.g., topology models to feed into simulators

“All models are wrong, but some models are useful.” – George Box
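The "compact summary" and "generate random but realistic data" ideas can be sketched in a few lines. This is a minimal illustration, assuming the measurements really are exponentially distributed; the function names, sample counts, and seeds are hypothetical, and the "measurements" are themselves synthetic.

```python
import random
import statistics

def fit_exponential_mean(samples):
    """Maximum-likelihood estimate of the mean of an exponential model."""
    return statistics.fmean(samples)

def generate(mean, n, seed=0):
    """Draw n synthetic values from an exponential distribution with this mean."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean) for _ in range(n)]

# Hypothetical "measurements": 10,000 values drawn with mean 100 sec.
measured = generate(mean=100.0, n=10_000, seed=1)

# One number now summarizes the whole data set ...
model_mean = fit_exponential_mean(measured)

# ... and can be used to generate new data that agree in that key property.
synthetic = generate(model_mean, n=10_000, seed=2)
```

The fitted model preserves only the property it encodes (here, the mean and the exponential shape); anything else in the original data, such as correlation between successive values, is lost.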

Page 6: What Can be Measured?

• Traffic
  – Packet or flow traces
  – Load statistics
• Performance of paths
  – Application performance, e.g., Web download time
  – Transport performance, e.g., TCP bulk throughput
  – Network performance, e.g., packet delay and loss
• Network structure
  – Topology, and paths on the topology
  – Dynamics of the routing protocol
• Performance metrics
  – Throughput, latency, response time, loss, utilization, arrival rate, bandwidth, routing (hops), reliability

Page 7: Sample Question: Topology?

• What is the topology of the network?
  – At the IP router layer
  – Without “inside” knowledge or official network maps
  – Without SNMP or other privileged access
• Why do we care?
  – Often need topologies for simulation and evaluation
  – Intrinsic interest in how the Internet behaves
    • “But we built it! We should understand it”
    • Emergent behavior; organic growth

Page 8: Where to Measure?

• Short answer
  – Anywhere you can!
• End hosts
  – Sending active probes to measure performance
  – Application logs, e.g., Web server logs
• Individual links/routers
  – Load statistics, packet traces, flow traces
  – Configuration state
  – Routing-protocol messages or table dumps
  – Alarms

Page 9: Internet Challenges Make Measurement an Art

• Stateless routers
  – Routers do not routinely store packet/flow state
  – Measurement is an afterthought, adds overhead
• IP narrow waist
  – IP measurements cannot see below network layer
  – E.g., link-layer retransmission, tunnels, etc.
• Violations of end-to-end argument
  – E.g., firewalls, address translators, and proxies
  – Not directly visible, and may block measurements
• Decentralized control
  – Autonomous Systems may block measurements
  – No global notion of time

Page 10: Active Measurement: Ping

• Adding traffic for purposes of measurement
  – Send probe packet(s) into the network and measure a response
  – Trade-offs between accuracy and overhead
  – Need careful methods to avoid introducing bias
• Ping: RTT and connectivity
  – Host sends an ICMP ECHO packet to a target
  – … and captures the ICMP ECHO REPLY
  – Useful for checking connectivity and RTT
  – Only requires control of one of the two end-points
• Problems with ping
  – Round-trip rather than one-way delays
  – Some hosts might not respond
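As a rough illustration of what a ping probe carries, the sketch below builds an ICMP ECHO request (type 8, code 0) with the standard Internet checksum and computes an RTT from two local timestamps. It deliberately does not send anything, since raw ICMP sockets usually need elevated privileges; the identifier, sequence number, and payload are arbitrary example values.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to an even length
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """ICMP ECHO request: type 8, code 0, checksum, identifier, sequence."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)   # checksum field zeroed
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload

def rtt_ms(sent_at: float, received_at: float) -> float:
    """Round-trip time in milliseconds, using only the sender's clock."""
    return (received_at - sent_at) * 1000.0

packet = build_echo_request(ident=0x1234, seq=1, payload=b"probe")
```

Because both timestamps come from the sending host, no clock synchronization is needed, which is also why ping yields round-trip rather than one-way delay.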

Page 11: Active Measurement: Traceroute

• Traceroute: path and RTT
  – TTL (Time-To-Live) field in IP packet header
    • Source sends a packet with a TTL of n
    • Each router along the path decrements the TTL
    • “TTL exceeded” sent when TTL reaches 0
  – Traceroute tool exploits this TTL behavior
    • Send packets with increasing TTL values

[Figure: the source sends probes toward the destination with TTL=1, TTL=2, …; the router where each probe expires returns a “time exceeded” message.]

Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message
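The TTL mechanism can be sketched as a small simulation rather than a real network probe. The path below is hypothetical (documentation addresses), and the model assumes every router decrements the TTL and always answers, which later slides note is not guaranteed.

```python
def probe(path, ttl):
    """Simulate one probe on a hypothetical path (router addresses followed by
    the destination). Returns (responder, message)."""
    for hop, node in enumerate(path, start=1):
        if hop == len(path):
            return node, "reply from destination"
        ttl -= 1                         # each router decrements the TTL
        if ttl == 0:
            return node, "time exceeded"

def traceroute(path, max_ttl=30):
    """Send probes with TTL = 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = probe(path, ttl)
        hops.append((ttl, responder, message))
        if message == "reply from destination":
            break
    return hops

# Hypothetical path: three routers, then the destination host.
path = ["192.0.2.1", "198.51.100.1", "203.0.113.1", "203.0.113.80"]
result = traceroute(path)
```

In this idealized model the list of responders reproduces the path exactly; real traceroute output differs wherever routers stay silent or answer from a different interface.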

Page 12: Problems with Traceroute

• Round-trip vs. one-way measurements
  – Paths may have asymmetric properties
  – Can’t unambiguously identify one-way outages
    • Failure to reach host: failure of reverse path?
• Returns IP address of interfaces, not routers
  – Routers have multiple interfaces
  – IP address of “time exceeded” packet may be the outgoing interface of the return packet
• Non-participating network elements
  – Some routers and firewalls don’t reply
  – ICMP “TTL exceeded” messages may be filtered or rate-limited
• Inaccurate delay
  – Includes processing delays on the router

Page 13: Famous Traceroute Pitfall

• Question: What ASes does traffic traverse?
• Strawman approach
  – Run traceroute to destination
  – Collect IP addresses
  – Use “whois” to map IP addresses to AS numbers
• Thought questions
  – What IP address is used to send “time exceeded” messages from routers?
  – How are interfaces numbered?
  – How accurate is whois data?

Page 14: Less Famous Traceroute Pitfall

• Measuring multiple paths
  – Host sends out a sequence of packets
  – Successive probes may traverse different paths
    • Each has a different destination port
    • Load balancers send probes along different paths
• Question: Why won’t just setting the same port number work?
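To see why probes with different destination ports can take different paths, here is a sketch of per-flow load balancing. It assumes a load balancer that hashes the five-tuple; the hash function, next-hop names, and addresses are illustrative assumptions, not a description of any particular router.

```python
import hashlib

NEXT_HOPS = ["router-A", "router-B"]     # hypothetical equal-cost next hops

def pick_next_hop(src_ip, dst_ip, proto, src_port, dst_port):
    """Per-flow load balancing: hash the five-tuple, then pick a next hop."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return NEXT_HOPS[digest[0] % len(NEXT_HOPS)]

# Classic traceroute: each probe uses a different destination port,
# so the hash (and possibly the next hop) changes from probe to probe.
varying = {pick_next_hop("192.0.2.10", "203.0.113.80", "udp", 40000, 33434 + i)
           for i in range(32)}

# Same five-tuple on every probe: the hash, and so the next hop, stays fixed.
fixed = {pick_next_hop("192.0.2.10", "203.0.113.80", "udp", 40000, 33434)
         for _ in range(32)}
```

With 32 different ports, `varying` will typically contain both next hops, while `fixed` always contains exactly one. Keeping the port constant removes the per-probe variation, but the tool then needs some other header field to tell its probes apart.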

Page 15: Applications of Traceroute

• Network troubleshooting
  – Identify forwarding loops and black holes
  – Identify long and convoluted paths
  – See how far the probe packets get
• Network topology inference
  – Launch traceroute probes from many places
  – … toward many destinations
  – Join together to fill in parts of the topology
  – … though traceroute undersamples the edges

Page 16: Active Measurement: Pathchar for Links

Per-hop capacity, latency, and loss

$$\mathrm{rtt}(i+1) - \mathrm{rtt}(i) = d + \frac{L}{c} + \varepsilon$$

Three delay components:
  – $d$: propagation delay
  – $L/c$: transmission delay
  – $\varepsilon$: queueing delay + noise

where $L$ is the packet size, $c$ is the link capacity, and $i$ is the initial TTL value.

How to infer $d$ and $c$?

[Figure: the minimum of rtt(i+1) − rtt(i) plotted against packet size L is a straight line with intercept d and slope 1/c.]
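The inference step can be sketched as a straight-line fit: taking the minimum RTT difference for each packet size is meant to filter out queueing delay and noise, leaving d + L/c, so the intercept estimates d and the slope estimates 1/c. The link parameters and packet sizes below are hypothetical, and the data are noise-free synthetic values rather than real probes.

```python
def fit_link(sizes, min_rtt_diff):
    """Least-squares line through (L, min rtt difference).
    The intercept estimates d; the slope estimates 1/c."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(min_rtt_diff) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, min_rtt_diff))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, 1.0 / slope        # (d in seconds, c in bytes/second)

# Hypothetical link: d = 2 ms, c = 1.25e6 bytes/s (10 Mbit/s).
true_d, true_c = 0.002, 1.25e6
sizes = [64, 256, 512, 1024, 1500]                   # packet sizes L in bytes
min_diffs = [true_d + L / true_c for L in sizes]     # minima: queueing removed
d_est, c_est = fit_link(sizes, min_diffs)
```

On real data the fit is only as good as the minima: if no probe of a given size ever crosses the hop without queueing, that point sits above the line and biases both estimates.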

Page 17: Passive Measurement

• Passive measurement
  – Capture data as it passes by
• Two main approaches
  – Packet-level monitoring
    • Keep packet-level statistics
    • Examine (and potentially log) a variety of packet-level statistics: essentially, anything in the packet
    • Timing
  – Flow-level monitoring
    • Monitor packet-by-packet (though sometimes sampled)
    • Keep aggregate statistics on a flow

Page 18: Passive Measurement: Logs at Hosts

• Web server logs
  – Host, time, URL, response code, content length, …
  – E.g., 122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] "GET /images/wwwtlogo.gif HTTP/1.0" 304 - "http://www.aflcio.org/home.htm" "Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-"
• DNS logs
  – Request, response, time
• Useful for workload characterization, troubleshooting, etc.
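A sketch of how the fields listed above might be pulled out of the sample Web server log line. The regular expression is an assumption that covers only the leading fields of this one format (host, time, request, response code, content length) and ignores the trailing referrer and user-agent strings; it does not validate the host field.

```python
import re

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<length>\d+|-)'
)

def parse_log_line(line):
    """Return the main fields of one access-log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # "-" means no content length was logged (here, a 304 response).
    fields["length"] = None if fields["length"] == "-" else int(fields["length"])
    return fields

line = ('122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] '
        '"GET /images/wwwtlogo.gif HTTP/1.0" 304 - '
        '"http://www.aflcio.org/home.htm" "Mozilla/2.0 (compatible; MSIE 3.02; '
        'Update a; AK; AOL 4.0; Windows 95)" "-"')
record = parse_log_line(line)
```

Returning `None` for non-matching lines, instead of raising, makes it easy to count how many lines failed to parse, which is itself a useful check on the data.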

Page 19: Passive Measurement: SNMP

• Simple Network Management Protocol (SNMP)
  – Get the number of packets across an interface per 5 minutes, or other similarly coarse statistics
  – Coarse-grained counters on the router
  – E.g., byte and packet counts
• Polling
  – Management system can poll the counters
  – E.g., once every five minutes
• Advantages: ubiquitous
• Limitations
  – Extremely coarse-grained statistics
  – Delivered over UDP!
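Turning two polls of a byte counter into a load figure can be sketched as follows. This assumes a 32-bit counter that wraps at most once between polls; the readings and function names are made up for illustration, and no SNMP library is involved.

```python
COUNTER32_MODULUS = 2 ** 32

def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two polls of a wrapping counter (at most one wrap)."""
    return (current - previous) % modulus

def average_rate_bps(prev_octets, curr_octets, interval_seconds):
    """Average bit rate over the polling interval from two byte-counter readings."""
    return counter_delta(prev_octets, curr_octets) * 8 / interval_seconds

# Hypothetical readings taken 300 s (five minutes) apart.
rate = average_rate_bps(1_000_000, 38_500_000, 300)     # no wrap
wrapped = counter_delta(4_294_967_000, 704)              # counter wrapped once
```

The result is only an average over the whole polling interval, so short bursts are invisible. On a fast link a 32-bit byte counter can also wrap more than once in five minutes, in which case the delta is ambiguous.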

Page 20: Passive Measurement: Packet Monitoring

• Tapping a link
  – Shared media (Ethernet, wireless) [figure: Host A, Host B, Monitor]
  – Multicast switch [figure: Host A, Host B, Host C, Switch, Monitor]
  – Splitting a point-to-point link [figure: Router A, Router B, Monitor]
  – Line card that does packet sampling [figure: Router A]

Page 21: Packet Monitoring: Selecting the Traffic

• Filter to focus on a subset of the packets
  – IP addresses/prefixes (e.g., to/from specific Web sites, client machines, DNS servers, mail servers)
  – Protocol (e.g., TCP, UDP, or ICMP)
  – Port numbers (e.g., HTTP, DNS, BGP, Napster)
• Collect first n bytes of packet (snap length)
  – Medium access control header (if present)
  – IP header (typically 20 bytes)
  – IP+UDP header (typically 28 bytes)
  – IP+TCP header (typically 40 bytes)
  – Application-layer message (entire packet)
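Filtering and snap length can be sketched together on a toy packet record. The `Packet` class and `capture` function are hypothetical simplifications; real capture tools apply the filter to raw bytes in the kernel rather than to parsed fields.

```python
from dataclasses import dataclass

@dataclass
class Packet:                 # hypothetical, simplified packet record
    protocol: str             # "tcp", "udp", or "icmp"
    dst_port: int
    data: bytes               # raw bytes starting at the IP header

def capture(packets, protocol, dst_port, snaplen):
    """Keep packets matching the filter; store only the first snaplen bytes."""
    return [p.data[:snaplen] for p in packets
            if p.protocol == protocol and p.dst_port == dst_port]

packets = [
    Packet("tcp", 80, bytes(1500)),   # Web traffic, full-size packet
    Packet("udp", 53, bytes(80)),     # DNS query, filtered out below
    Packet("tcp", 80, bytes(40)),     # bare TCP ACK: IP+TCP headers only
]

# Snap length of 40 bytes: enough for typical IP+TCP headers, no payload.
kept = capture(packets, protocol="tcp", dst_port=80, snaplen=40)
```

A short snap length cuts storage sharply (40 of 1500 bytes for the first packet) but rules out any later analysis of application-layer content.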

Page 22: Packet Capture: tcpdump/bpf

• Put interface in promiscuous mode
• Use bpf (Berkeley Packet Filter) to extract packets of interest
• Accuracy issues
  – Packets may be dropped by filter
  – Failure of tcpdump to keep up with filter
  – Failure of filter to keep up with dump speeds
  – Question: How to recover lost information from packet drops?

Page 23: Tcpdump Output (three-way TCP handshake and HTTP request message)

23:40:21.008043 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: S 617756405:617756405(0) win 32120 <mss 1460,sackOK,timestamp 46339 0,nop,wscale 0> (DF)

23:40:21.036758 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: S 2598794605:2598794605(0) ack 617756406 win 16384 <mss 512>

23:40:21.036789 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: . 1:1(0) ack 1 win 32120 (DF)

23:40:21.037372 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: P 1:513(512) ack 1 win 32256 (DF)

23:40:21.085106 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: . 1:1(0) ack 513 win 16384

23:40:21.085140 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: P 513:676(163) ack 1 win 32256 (DF)

23:40:21.124835 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: P 1:179(178) ack 676 win 16384

[Figure callouts on the first line: timestamp; client address and port #; Web server (port 80); SYN flag; sequence number; TCP options]

Page 24: Analysis of Packet Traces

• IP header
  – Traffic volume by IP addresses or protocol
  – Burstiness of the stream of packets
  – Packet properties (e.g., sizes, out-of-order)
• TCP header
  – Traffic breakdown by application (e.g., Web)
  – TCP congestion and flow control
  – Number of bytes and packets per session
• Application header
  – URLs, HTTP headers (e.g., cacheable response?)
  – DNS queries and responses, user keystrokes, …

Page 25: Aggregating Packets into IP Flows

[Figure: a timeline of packets grouped into flow 1, flow 2, flow 3, and flow 4]

• Set of packets that “belong together”
  – Source/destination IP addresses and port #
  – Same protocol, ToS bits, …
  – Same input/output interfaces at a router
• Packets that are “close” together in time
  – Maximum spacing between packets (e.g., 15 sec, 30 sec)
  – Example: flows 2 and 4 are different flows due to time
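The two grouping rules (same header fields, close together in time) can be sketched as a small aggregator. This is a simplified illustration: the key is just the five-tuple, the 30-second timeout is one of the example values above, and the trace is hypothetical.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "time src dst sport dport proto size")

def aggregate(packets, timeout=30.0):
    """Group time-ordered packets into flows by five-tuple; a gap longer than
    `timeout` seconds starts a new flow for the same key."""
    active = {}      # five-tuple -> index of that key's current flow in `flows`
    flows = []
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        idx = active.get(key)
        if idx is None or p.time - flows[idx]["end"] > timeout:
            flows.append({"key": key, "start": p.time, "end": p.time,
                          "packets": 0, "bytes": 0})
            idx = active[key] = len(flows) - 1
        flow = flows[idx]
        flow["end"] = p.time
        flow["packets"] += 1
        flow["bytes"] += p.size
    return flows

a = ("192.0.2.1", "203.0.113.5", 1043, 80, "tcp")
b = ("192.0.2.2", "203.0.113.5", 2001, 53, "udp")
trace = [
    Packet(0.0, *a, 60), Packet(1.0, *a, 1500),   # first flow on key a
    Packet(2.0, *b, 80),                          # a flow on key b
    Packet(50.0, *a, 60),                         # key a again, after a long gap
]
flows = aggregate(trace)
```

The last packet shares its five-tuple with the first two but arrives 49 seconds later, so it opens a separate flow, mirroring the "different flows due to time" example.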

Page 26: Traffic Flow Statistics

• Flow monitoring (e.g., Cisco Netflow)
  – Statistics about groups of related packets (e.g., same IP/TCP headers and close in time)
  – Records header information, counts, and time
  – May be sampled
• Flow record contents
  – Basic information about the flow…
    • Source and destination IP address and port
    • Packet and byte counts
    • Start and end times
    • ToS, TCP flags
  – … plus information related to routing
    • Next-hop IP address
    • Source and destination AS
    • Source and destination prefix

Page 27: Why Trust Your Data?

• Measurement requires a degree of suspicion
  – Why should I trust your data? Why should you?
• Resolving that...
  – Use current best practices
    • E.g., paris-traceroute, CAIDA topologies, etc.
  – Don't trust the data until forced to
    • Sanity checks and cross-validation
    • Spot checks (when applicable)

Page 28: Strategy: Examine the Zeroth-Order

• Paxson calls this “looking at spikes and outliers”
• More general: look at the data, not just aggregate statistics
  – Tempting/dangerous to blindly compute aggregates
  – Timeseries plots are telling (gaps, spikes, etc.)
  – Basics
    • Are the raw trace files empty?
      – Need not be 0-byte files (e.g., BGP update logs have state messages but no updates)
    • Metadata/context: Did weird things happen during collection (machine crash, disk full, etc.)?

Page 29: Strategy: Sanity Checks & Cross-Validation

• Paxson breaks cross-validation into two aspects
  – Self-consistency checks (and sanity checks)
  – Independent observations (looking at the same phenomenon in multiple ways)
• Example sanity checks
  – Exploiting additional properties of the measured phenomenon
    • E.g., TCP: reliability, cumulative ACKs (packet drop measurement problem)
  – Is time moving backwards?
    • Typical cause: clock synchronization issues
  – Has the speed of light increased?
    • E.g., 10 ms cross-country latencies
  – Do values make sense?
    • IP addresses like 0.0.1.2 indicate a bug
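Three of these sanity checks are simple enough to sketch directly. The function names are hypothetical, the speed-of-light bound uses the vacuum value (light in fiber is slower, so the real bound is stricter), and treating all of 0.0.0.0/8 as suspicious is an assumption that suits traces of ordinary host traffic.

```python
import ipaddress

SPEED_OF_LIGHT_KM_PER_MS = 299.792458    # in vacuum; fiber is slower

def time_goes_backwards(timestamps):
    """Indexes where a timestamp is earlier than the one before it."""
    return [i for i in range(1, len(timestamps))
            if timestamps[i] < timestamps[i - 1]]

def faster_than_light(rtt_ms, distance_km):
    """True if the RTT is shorter than light needs for the round trip."""
    return rtt_ms < 2 * distance_km / SPEED_OF_LIGHT_KM_PER_MS

def suspicious_address(text):
    """True for unparsable addresses or ones in 0.0.0.0/8 ("this network")."""
    try:
        addr = ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        return True
    return addr in ipaddress.IPv4Network("0.0.0.0/8")
```

For example, `faster_than_light(10, 4000)` flags a 10 ms round trip over an assumed 4,000 km path, and `suspicious_address("0.0.1.2")` flags the address from the slide. Checks like these are cheap enough to run on every record before computing any aggregate.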

Page 30: Cross-Validation Example

• Traceroutes captured in parallel with BGP routing updates
• Puzzle
  – Route monitor sees route withdrawal for prefix
  – Routing table has no route to the prefix
  – IP addresses within prefix still reachable from within the IP address space (i.e., traceroute goes through)
• Why?
  – Collection bugs … or
  – Broken mental model of routing setup: a default route!
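The default-route explanation can be sketched with a longest-prefix-match lookup. The table contents and next-hop names are hypothetical; the point is only that withdrawing a specific prefix does not make its addresses unroutable when a default route (0.0.0.0/0) is present.

```python
import ipaddress

def lookup(table, address):
    """Longest-prefix match: return the next hop of the most specific route."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

table = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream",        # default route
    ipaddress.ip_network("198.51.100.0/24"): "peer-1",    # specific prefix
}

before = lookup(table, "198.51.100.7")                    # uses the /24
del table[ipaddress.ip_network("198.51.100.0/24")]        # route withdrawn
after = lookup(table, "198.51.100.7")                     # falls back to default
```

After the withdrawal there is no route "to the prefix", yet the lookup still returns a next hop, which is consistent with traceroute continuing to get through.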

Page 31: Measurement Challenges for Operators

• Network-wide view
  – Crucial for evaluating control actions
  – Multiple kinds of data from multiple locations
• Large scale
  – Large number of high-speed links and routers
  – Large volume of measurement data
• Poor state-of-the-art
  – Working within existing protocols and products
  – Technology not designed with measurement in mind
• The “do no harm” principle
  – Don’t degrade router performance
  – Don’t require disabling key router features
  – Don’t overload the network with measurement data

Page 32: Network Operations Tasks

• Reporting of network-wide statistics
  – Generating basic information about usage and reliability
• Performance/reliability troubleshooting
  – Detecting and diagnosing anomalous events
• Security
  – Detecting, diagnosing, and blocking security problems
• Traffic engineering
  – Adjusting network configuration to the prevailing traffic
• Capacity planning
  – Deciding where and when to install new equipment

Page 33: Basic Reporting

• Producing basic statistics about the network
  – For business purposes, network planning, ad hoc studies
• Examples
  – Proportion of transit vs. customer-customer traffic
  – Total volume of traffic sent to/from each private peer
  – Mixture of traffic by application (Web, Napster, etc.)
  – Mixture of traffic to/from individual customers
  – Usage, loss, and reliability trends for each link
• Requirements
  – Network-wide view of basic traffic and reliability statistics
  – Ability to “slice and dice” measurements in different ways (e.g., by application, by customer, by peer, by link type)

Page 34: Troubleshooting

• Detecting and diagnosing problems
  – Recognizing and explaining anomalous events
• Examples
  – Why a backbone link is suddenly overloaded
  – Why the route to a destination prefix is flapping
  – Why DNS queries are failing with high probability
  – Why a route processor has high CPU utilization
  – Why a customer cannot reach certain Web sites
• Requirements
  – Network-wide view of many protocols and systems
  – Diverse measurements at different protocol levels
  – Thresholds for isolating significant phenomena

Page 35: Security

• Detecting and diagnosing problems
  – Recognizing suspicious traffic or disruptions
• Examples
  – Denial-of-service attack on a customer or service
  – Spread of a worm or virus through the network
  – Route hijack of an address block by an adversary
• Requirements
  – Detailed measurements from multiple places
  – Including deep-packet inspection, in some cases
  – Online analysis of the data
  – Installing filters to block the offending traffic

Page 36: Traffic Engineering

• Adjusting resource allocation policies
  – Path selection, buffer management, and link scheduling
• Examples
  – OSPF weights to divert traffic from congested links
  – BGP policies to balance load on peering links
  – Link-scheduling weights to reduce delay for “gold” traffic
• Requirements
  – Network-wide view of the traffic carried in the backbone
  – Timely view of the network topology and configuration
  – Accurate models to predict impact of control operations (e.g., the impact of RED parameters on TCP throughput)

Page 37: Capacity Planning

• Deciding whether to buy/install new equipment
  – What? Where? When?
• Examples
  – Where to put the next backbone router
  – When to upgrade a link to higher capacity
  – Whether to add/remove a particular peer
  – Whether the network can accommodate a new customer
  – Whether to install a caching proxy for cable modems
• Requirements
  – Projections of future traffic patterns from measurements
  – Cost estimates for buying/deploying new equipment
  – Model of the potential impact of the change (e.g., latency reduction and bandwidth savings from a caching proxy)

Page 38: Examples of Public Data Sets

• Network-wide data
  – Abilene and GEANT backbones
  – Netflow, IGP, and BGP traces
• CAIDA DatCat
  – Data catalogue maintained by CAIDA
  – http://imdc.datcat.org/
• Interdomain routing
  – RouteViews and RIPE-NCC
  – BGP routing tables and update messages
• Traceroute and looking glass servers
  – http://www.traceroute.org/
  – http://www.nanog.org/lookingglass.html

Page 39: PlanetLab for Network Measurement

• Nodes are largely at academic sites
  – Other alternatives: RON testbed (disadvantage: smaller, less software support)
• Repeatability of network experiments is tricky
  – Proportional sharing
    • Minimum guarantees provided by limiting the number of outstanding shares
  – Work-conserving CPU scheduler means an experiment could get more resources if there is less contention

Page 40: Discussion

• How important is accuracy of the data?
• How can we validate measurement studies?
• How to do controlled experiments with measurement techniques?
• Can we move measurement to a science rather than an art?
• Can we identify incentives for making measurement possible and data available?
• Measurement is meaningless without careful analysis
• Distributed analysis of measurement data?
• An architecture for router or line-card support for traffic and performance measurement?
• Trade-offs between security and privacy?