1
Internet Networking and Application Troubleshooting
Yao Zhao
EECS Department
Northwestern University
2
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
  – VScope, LEND, FAD and SPA
• Application Layer Troubleshooting
  – Rake
• Conclusions and Future Work
3
Motivation
“When something breaks in the Internet, the Internet's very decentralized structure makes it hard to figure out what went wrong and even harder to assign responsibility.”
- “Looking Over the Fence at Networks: A Neighbor's View of Networking Research”, by Committees on Research Horizons in networking, National Research Council, 2001.
4
Troubleshooting Philosophy
• Entity Oriented Troubleshooting
  – Monitor each entity separately
    • E.g. router packet drop rates, queue sizes and other SNMP counters
    • E.g. machine CPU load, I/O intensity, network utilization and other performance counters
  – Potential problems
    • Not all entities can be monitored
    • Inferring entity performance from the counters may be challenging
5
Troubleshooting Philosophy
• Entity Oriented Troubleshooting
• Task Based Troubleshooting
  – Use task performance to infer entity performance
    • E.g. infer link-level loss rates from Internet path loss rates
  – Advantages
    • Works with limited monitoring points (e.g. end hosts)
    • Focuses on target performance directly
6
Thesis Statements
• We design troubleshooting systems that monitor and diagnose Internet distributed systems at both the network layer and the application layer, using the task based troubleshooting philosophy.
7
Publications
• Papers
– Y. Zhao, Y. Chen, S. Ratnasamy, Load balanced and Efficient Hierarchical Data-Centric Storage in Sensor Networks, in the Proc. of SECON 2008
– Y. Gao, Y. Zhao, R. Schweller, S. Venkataraman, Y. Chen, D. Song, and M. Kao, Detecting Stealthy Spreaders Using Online Outdegree Histograms, in the Proc. of IWQoS, 2007
– Y. Zhao and Y. Chen, A Suite of Schemes for User-level Network Diagnosis without Infrastructure, in the Proc. of IEEE INFOCOM, 2007
– P. Narayana, R. Chen, Y. Zhao, Y. Chen, Z. Fu, and H. Zhou, Automatic Vulnerability Checking of IEEE 802.16 WiMAX Protocols through TLA+, in Proc. of NPSec, 2006
– Y. Zhao, Y. Chen, and D. Bindel, Towards Unbiased End-to-End Network Diagnosis, in Proc. of ACM SIGCOMM 2006
– Y. Zhao, Q. Zhang, B. Li, Y. Chen and W. Zhu, Hop ID based Routing in Mobile Ad Hoc Networks, in Proceedings of ICNP, 2005
• Patents
– E. C. Gillum, Q. Ke, Y. Xie, F. Yu and Y. Zhao, Graph Based Bot-User Detection, being filed through Microsoft Corporation, MS docket number 324953.01.
– J. Wang, Y. Chen, D. Pei, Y. Zhao, and Z. Zhu, Towards Efficient Large-Scale Network Monitoring and Diagnosis Under Operational Constraints, being filed through AT&T, docket number 1209-144.
8
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
  – VScope, LEND, FAD and SPA
• Application Layer Troubleshooting
  – Rake
• Conclusions and Future Work
9
Motivation
[Figure: troubleshooting components (Model, Monitoring, Diagnosis) shown against the Data Link, Network, Transport and Application layers]
10
Components in Network Troubleshooting
• Model
  – Defines the extrinsic observations and the intrinsic faults, as well as the relationship between them
• Monitoring
  – Collects the observations
• Diagnosis
  – Identifies the faulty location and finds the root cause
11
Thesis Research Topics
[Figure: the thesis systems (LEND, FAD and SPA; VScope; Rake) placed on the grid of components (Model, Monitoring, Diagnosis) and layers (Data Link, Network, Transport, Application)]
12
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
  – VScope, LEND, FAD and SPA
• Application Layer Troubleshooting
  – Rake
• Conclusions and Future Work
13
Network Layer Troubleshooting
• LEND [Sigcomm06]
  – Tomography diagnosis with the fewest statistical assumptions
• FAD & SPA [Infocom07]
  – On-demand loss rate diagnosis without infrastructure
• VScope [Patent]
  – Experimental design for ISP VPN network monitoring and diagnosis
14
LEND
• Basic Assumptions
  – End-to-end measurements can infer the end-to-end properties accurately
  – Link-level properties are independent
• Problem Formulation
  – Given end-to-end measurements, what is the finest granularity of link properties that we can achieve under the basic assumptions?
[Figure: trade-off between diagnosis granularity and accuracy. Labels: basic assumptions, more and stronger statistical assumptions, virtual link, diagnosis granularity, better accuracy]
15
LEND
• Contributions
  – Define the minimal identifiable unit under the basic assumptions (MILS)
  – Prove that only E2E paths are MILS with a directed graph topology (e.g., the Internet)
  – Propose the good path algorithm (incorporating measurement path properties) for finer MILS
[Figure: trade-off between diagnosis granularity and accuracy. Labels: basic assumptions, more and stronger statistical assumptions, virtual link, diagnosis granularity, better accuracy]
16
FAD & SPA
• Motivation
  – How do end users, with no special privileges, identify packet loss inside the network with one or two computers?
• Conclusions
  – We proposed three user-level loss rate diagnosis approaches
  – The combination of our approaches and Tulip [SOSP03] is much better than any single approach
17
VScope Motivation
• Two Important Services Provided by ISPs
  – Internet access service
  – VPN service
• Monitoring and Diagnosis on ISP Networks
  – Ensure the Service Level Agreement (SLA)
  – Help network operations
18
Problem Definition (1)
• Challenges in ISP Network Monitoring and Diagnosis
  – Operational constraints on monitors and links
    • A monitor can measure only a certain number of paths at a time
    • The measurement traffic through a link cannot exceed a threshold (e.g. 1% of the link bandwidth)
    • Path and monitor selection constraints
  – Monitor installation is costly
  – Real-time diagnosis
  – Special star-like topology features of ISP networks
    • Access links should be monitored
    • The backbone topology extended with access links (backboneExt) is large and star-like
19
Problem Definition (2)
• Monitor Setup Phase
  – From a given set of monitor candidates, select the minimal number of monitors that, in the measurement phase, can measure a path set covering all links in the network under the given measurement constraints
  – NP-hard even without considering the constraints
• Monitoring and Fault Diagnosis Phase
  – When faulty paths are discovered in the path monitoring phase, how can we quickly select paths, under the operational constraints, to measure further so that the faulty link(s) can be accurately identified?
20
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
  – VScope, LEND, FAD and SPA
• Application Layer Troubleshooting
  – Rake
• Conclusions and Future Work
21
Rake: Semantic Assisted Large Distributed System Diagnosis
• Motivation
• Related Work
• Rake
• Evaluation
• Conclusions
22
Motivation
• Large distributed systems involve hundreds or thousands of nodes
  – E.g. search systems, CDNs
• Host-based monitoring cannot infer application performance or detect bugs
  – Hard to translate OS-level information (such as CPU load) into application performance
  – Application logs may not be enough
• The task-based approach is adopted in many diagnosis systems
  – WAP5, Magpie, Sherlock
[Figure: search system architecture: Load Balancer, Web Servers, Aggregator, Dispatchers, Index Servers]
23
Task-based Approaches
• The Critical Problem: Message Linking
  – Link the messages in a task together into a path or tree
24
Example of Message Linking in Search System
[Figure: search system (Load Balancer, Web Servers, Aggregator, Dispatchers, Index Servers) with the messages of one task labeled by the ID they carry: URL, search keyword, doc ID]
25
Task-based Approaches
• The Critical Problem: Message Linking
  – Link the messages in a task together into a path or tree
• Black-box approaches
  – Do not need to instrument the application or to understand its internal structure or semantics
  – Use time correlation to link messages
    • Project 5, WAP5, Sherlock
• White-box approaches
  – Extract application-level data; require instrumenting the application and possibly understanding its source code
  – Insert a unique ID into the messages of a task
    • X-Trace, Pinpoint
26
Problems of Black-Box
• Time Correlation
  – Affected by cross traffic
[Figure: message timelines illustrating time correlation in the presence of cross traffic]
27
Related Work
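Axes: invasiveness (non-invasive: network sniffing, interposition; invasive: app or OS logs, source code modification) versus application knowledge (black-box, grey-box, white-box)
• Black-box: Project 5 and Sherlock (network sniffing), WAP5 (interposition), Footprint (app or OS logs)
• Grey-box: Rake (network sniffing), Magpie (app or OS logs)
• White-box: X-Trace and Pinpoint (source code modification)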
28
Rake
• Key Observations
  – Generally there is no unique ID linking the messages associated with the same request
  – Polymorphic IDs exist in different stages of the request
• Semantic Assisted
  – Use the semantics of the system to identify polymorphic IDs and link messages
29
Message Linking Example
[Figure: search system (Load Balancer, Web Servers, Aggregator, Dispatchers, Index Servers) with the messages of one task labeled by the ID they carry: URL, search keyword, doc ID]
30
Questions on Semantics
• What Are the Necessary Semantics?
  – In the worst case, re-implement the application
• How Does Rake Use the Semantics?
  – The naïve design is to implement Rake separately for each application with its specific semantics
• How Efficient Is Rake with Semantics?
  – Can message linking be accurate?
  – What is the computational complexity of Rake?
31
Necessary Semantics
• Intra-node linking
  – The system semantics
• Inter-node linking
  – The protocol semantics
[Figure: a node with messages P, Q, R and S]
32
Utilize Semantics in Rake
• Implementing a different Rake for each application is time consuming
  – Lesson learnt from implementing two versions of Rake, for CoralCDN and IRC
• Design Rake to take general semantics
  – A unified infrastructure
  – Provide a simple language for the user to supply semantics
33
Example of Rake Language (IRC)
<?xml version="1.0" encoding="ISO-8859-1"?>
<Rake>
  <Message name="IRC PRIVMSG">
    <Signature>
      <Protocol> TCP </Protocol>
      <Port> 6667 </Port>
    </Signature>
    <Link_ID>
      <Type> Regular expression </Type>
      <Pattern> PRIVMSG\s+(.*) </Pattern>
    </Link_ID>
    <Follow_ID id="0">
      <Type> Same as Link ID </Type>
    </Follow_ID>
    <Query_ID>
      <Type> No Return ID </Type>
    </Query_ID>
  </Message>
</Rake>
[Figure: a node with messages P, Q, R and S; Link_ID = Follow_ID and Query_ID = Response_ID]
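To make the Link_ID rule concrete, here is a minimal Python sketch of applying the regular expression from the IRC example to a message payload. The function name `extract_link_id` and the sample payload are hypothetical; only the pattern comes from the slide, and how Rake itself evaluates the pattern is not shown in the source.

```python
import re

# Pattern copied from the <Pattern> element of the IRC PRIVMSG rule.
LINK_ID_PATTERN = re.compile(r"PRIVMSG\s+(.*)")


def extract_link_id(payload: str):
    """Return the text captured by the pattern, or None if it does not match."""
    match = LINK_ID_PATTERN.search(payload)
    return match.group(1).strip() if match else None


# Hypothetical payload: a client message and the copy relayed by the server
# carry the same PRIVMSG body, so both yield the same ID.
sample = ":alice!a@host PRIVMSG #chan :hello world\r\n"
sample_id = extract_link_id(sample)
```

Because the Follow_ID is declared "Same as Link ID", the ID extracted from an incoming PRIVMSG is the ID expected in the messages it triggers.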
34
Signature
• Signature to Classify Messages
  <Signature>
    <Protocol> TCP </Protocol>
    <Port> 6667 </Port>
  </Signature>
• Formats of Signatures
  – Socket information
    • Protocol, port
  – Expression for TCP/IP header
    • udp[10] & 128 == 0
  – Regular expression
  – User defined function
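The header expression reads like a packet-filter expression: `udp[10]` is the byte at offset 10 from the start of the UDP header. With an 8-byte UDP header and a 2-byte DNS ID, that byte is the first DNS flags byte, and bit 128 is the QR bit (0 for a query). The sketch below checks the same condition in Python; the function names and the hand-built packets are assumptions for illustration, not part of Rake.

```python
import struct


def matches_dns_query_signature(protocol: str, port: int, udp_segment: bytes) -> bool:
    """Check protocol UDP, port 53, and udp[10] & 128 == 0 (QR bit clear)."""
    if protocol != "UDP" or port != 53:
        return False
    if len(udp_segment) <= 10:
        return False
    return (udp_segment[10] & 128) == 0


def build_udp_dns(flags: int) -> bytes:
    """Build a UDP header plus a 12-byte DNS header with the given flags."""
    udp_header = struct.pack("!HHHH", 40000, 53, 20, 0)  # src, dst, length, checksum
    dns_header = struct.pack("!HHHHHH", 0x1234, flags, 1, 0, 0, 0)
    return udp_header + dns_header


query = build_udp_dns(0x0100)     # QR = 0, recursion desired
response = build_udp_dns(0x8180)  # QR = 1, standard response flags
```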
35
Link_ID and Follow_ID
• Follow_IDs
  – The IDs that will appear in the messages triggered by this message
  – One message may have multiple Follow_IDs if it triggers multiple messages
• Link_ID
  – The ID of the current message
  – Matched against Follow_IDs previously seen
• Linking of Link_ID and Follow_ID
  – Mainly for intra-node message linking
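A minimal sketch of the Link_ID/Follow_ID matching described above, assuming messages are observed in order and IDs are plain strings. The class `IntraNodeLinker`, its method names and the example IDs are hypothetical; Rake's actual data structures are not described in the slides.

```python
class IntraNodeLinker:
    """Link each message to the earlier message that announced its ID."""

    def __init__(self):
        self.pending = {}  # Follow_ID -> name of the message that announced it
        self.links = []    # (parent message, triggered message)

    def observe(self, name, link_id, follow_ids=()):
        # A message is linked if its Link_ID equals a Follow_ID seen earlier.
        parent = self.pending.get(link_id)
        if parent is not None:
            self.links.append((parent, name))
        # Remember the IDs this message expects in the messages it triggers.
        for follow_id in follow_ids:
            self.pending[follow_id] = name


linker = IntraNodeLinker()
# Polymorphic IDs across stages, as in the search example: URL, keyword, doc ID.
linker.observe("web_request", "url:/search?q=rake", follow_ids=["kw:rake"])
linker.observe("aggregator_query", "kw:rake", follow_ids=["doc:42"])
linker.observe("index_lookup", "doc:42")
```

A dictionary lookup per message keeps the linking cost roughly constant per message, which is one plausible reading of why ID matching scales better than time correlation.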
36
Query_ID and Response_ID
• Query_IDs
  – For communication in query/response style, e.g. RPC calls and DNS queries/responses
  – The IDs that will appear in the response messages to this message
• Response_ID
  – The ID of the current message, matched against Query_IDs previously seen
  – By default, the query and the response are required to use the same socket
• Linking of Query_ID and Response_ID
  – Mainly for inter-node message linking
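The query/response rule can be sketched the same way, with the socket as part of the matching key to reflect the default same-socket requirement. The tuple layout, the function name `link_responses` and the sample trace are assumptions for illustration.

```python
def link_responses(messages):
    """Pair each response with the outstanding query on the same socket and ID.

    messages: iterable of (name, socket, kind, msg_id), kind is "query" or "response".
    """
    outstanding = {}  # (socket, Query_ID) -> query name
    links = []
    for name, sock, kind, msg_id in messages:
        key = (sock, msg_id)
        if kind == "query":
            outstanding[key] = name
        elif key in outstanding:
            links.append((outstanding.pop(key), name))
    return links


trace = [
    ("q1", ("10.0.0.1", 5353, "10.0.0.2", 53), "query", "example.org"),
    ("q2", ("10.0.0.3", 6000, "10.0.0.2", 53), "query", "example.org"),
    ("r2", ("10.0.0.3", 6000, "10.0.0.2", 53), "response", "example.org"),
    ("r1", ("10.0.0.1", 5353, "10.0.0.2", 53), "response", "example.org"),
]
pairs = link_responses(trace)
```

The two queries carry the same ID, yet the socket disambiguates them.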
37
Complicated Semantics
• The process of generating IDs may be complicated
  – XML and regular expressions are not good at complex computations
  – So let the user provide their own functions
    • The user provides shared/dynamic libraries
    • The functions for the IDs are specified in XML
    • Implemented using Libtool to load user defined functions at runtime
38
Example for DNS
<?xml version="1.0" encoding="ISO-8859-1"?>
<Rake>
  <Message name="DNS Query">
    <Signature>
      <Protocol> UDP </Protocol>
      <Port> 53 </Port>
      <Expression> udp[10] &amp; 128 == 0 </Expression>
    </Signature>
    <Link_ID>
      <Type> User Function </Type>
      <Libray> dns.so </Libray>
      <Function> Link_ID </Function>
    </Link_ID>
    <Follow_ID id="0">
      <Type> Link_ID </Type>
    </Follow_ID>
    <Query_ID>
      <Type> Link_ID </Type>
    </Query_ID>
  </Message>
  ……………………………..
(The Link_ID user function extracts the queried host.)
39
Accuracy Analysis
• One-to-one ID Transforming
  – Examples
    • In search, URL -> Keywords -> Canonical format
    • In CoralCDN, URL -> SHA-1 hash value
  – Ideally no error if requests are distinct
• Request ambiguity
  – Search keywords
    • Microsoft search data
    • Less than 1% of messages duplicated within 1 s
  – Web URLs
    • Two real HTTP traces
    • Less than 1% of messages duplicated within 1 s
  – Chat messages
    • No duplication with timestamps
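The sketch below illustrates both points, under stated assumptions. `url_to_hash_id` applies SHA-1 to the raw URL string, which may differ from how CoralCDN actually derives its keys. `duplicate_fraction` uses one plausible definition of "duplicated within 1 s" (the same ID seen again no more than one second after its previous occurrence); the slides do not define the metric, and the sample requests are invented.

```python
import hashlib


def url_to_hash_id(url: str) -> str:
    """Map a URL to a hex SHA-1 digest (an effectively one-to-one transform)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def duplicate_fraction(events, window=1.0):
    """Fraction of events whose ID was last seen within `window` seconds.

    events: list of (timestamp_seconds, id), sorted by timestamp.
    """
    last_seen = {}
    duplicates = 0
    for timestamp, ident in events:
        previous = last_seen.get(ident)
        if previous is not None and timestamp - previous <= window:
            duplicates += 1
        last_seen[ident] = timestamp
    return duplicates / len(events) if events else 0.0


requests = [(0.0, "/a"), (0.5, "/a"), (2.0, "/a"), (2.1, "/b")]
fraction = duplicate_fraction(requests)  # only the request at 0.5 s counts
```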
40
Potential Applications
• Search
  – Verified by a Microsoft employee
• CDN
  – CoralCDN is studied and evaluated
• Chat System
  – IRC is tested
• Distributed File System
  – Hadoop DFS is tested
41
Evaluation
• Application
  – CoralCDN
  – Deployed on PlanetLab
• Experiment
  – Employ PlanetLab hosts as web clients
  – Retrieve URLs from real traces at different frequencies
• Metrics
  – Linking accuracy (false positives, false negatives)
  – Diagnosis ability
• Compared Approach
  – WAP5
42
CoralCDN Task Tree
43
Message Linking Accuracy
• Rake Linking Accuracy is 100% for CoralCDN
  – The SHA-1 hash provides an almost one-to-one URL to HashID mapping
  – The cache mechanism
    • If the same URL is received twice, the second request is blocked until the first one retrieves the web page
• Use Rake Linking as Ground Truth to Evaluate WAP5
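A minimal sketch of how false positive and false negative percentages could be computed against a ground-truth link set. Normalizing both rates by the number of true links is an assumption (it allows false positive percentages above 100%); the slides do not state the exact definition. The link pairs are invented.

```python
def linking_error_rates(true_links, inferred_links):
    """Return (false_positive_pct, false_negative_pct) relative to the true links."""
    true_set, inferred_set = set(true_links), set(inferred_links)
    if not true_set:
        raise ValueError("ground truth must contain at least one link")
    false_negative = len(true_set - inferred_set) / len(true_set) * 100
    false_positive = len(inferred_set - true_set) / len(true_set) * 100
    return false_positive, false_negative


# Hypothetical links between message IDs.
ground_truth = [(1, 2), (2, 3), (3, 4), (4, 5)]
inferred = [(1, 2), (2, 3), (2, 9), (3, 9), (4, 9)]
fp_pct, fn_pct = linking_error_rates(ground_truth, inferred)
```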
44
Message Linking Accuracy (1)
[Figure: WAP5 false negative percentage (axis 0 to 80%) versus request rate (33, 53, 69, 93, 118)]
The higher the request rate, the lower the accuracy of WAP5.
45
Message Linking Accuracy (1)
[Figure: WAP5 false positive percentage (axis 0 to 200%) versus request rate (33, 53, 69, 93, 118)]
The higher the request rate, the lower the accuracy of WAP5.
46
Diagnosis Ability
• Controlled Experiments
  – Inject junk CPU-intensive processes
  – Calculate the packet processing time using WAP5 and Rake
[Figure: packet processing time (axis 0 to 0.16) for koala_CPU_10 through koala_CPU_100, Rake versus WAP5]
Rake can identify the slow machine, while WAP5 fails.
47
Discussion
• Implementation Experience
  – How hard is it for a user to provide semantics?
    • CoralCDN: 1 week of source code study
    • DNS: a couple of hours
    • Hadoop DFS: 1 week of source code study
• Inter-process Communication
• Encryption
  – Dynamic library interposition
48
Conclusions of Rake
• Feasibility
  – Rake works for many popular applications in different categories
• Ease of use
  – Rake allows the user to write semantics in XML
  – In our experience, the necessary semantics are easy to obtain
• Accuracy
  – Much more accurate than black-box approaches, and probably matches white-box approaches
49
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
  – VScope, LEND, FAD and SPA
• Application Layer Troubleshooting
  – Rake
• Conclusions and Future Work
50
Conclusions and Future Work
• Demonstrate That Task-based Troubleshooting Is Promising
  – Network layer troubleshooting
    • VScope, LEND, FAD and SPA
  – Application layer troubleshooting
    • Rake
• Future Work
  – Extend Rake in diagnosis
• Timeline for Thesis Writing
  – From present to Feb. 1
51
Q & A?
Thanks!
52
53
Backup
54
Monitor Setup Phase
• Single-round Monitoring
  – Measure all the target paths simultaneously
  – The basic approach, adopted by most monitoring experimental design papers
• Multi-round Monitoring
  – Measure the target paths in different time periods (rounds)
    • Tradeoff between time and link/node constraints
  – Multi-round monitoring is necessary and efficient for two reasons
    • Existence of operational constraints
    • Star-like topology
55
Single-Round Monitor Selection
• Pure Greedy Algorithm
  – Select monitors one by one, each time choosing the monitor that can measure the most uncovered links under the constraints
    • Calculating the gain of adding a new monitor is a variant of the Maximum k-Coverage problem
  – Simple and locally optimized
• Greedy Assisted Linear Programming based algorithm
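The pure greedy selection can be sketched as follows. This is a simplified illustration, not the VScope algorithm: only the per-monitor limit on measured paths (`k`) is modeled, the link bandwidth constraint is omitted, and the inner k-coverage gain is itself computed greedily. The monitor names, paths and links are hypothetical.

```python
def best_paths(paths, uncovered, k):
    """Greedily pick up to k paths of one monitor that cover the most uncovered links."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(paths, key=lambda p: len((p & uncovered) - covered), default=None)
        if best is None or not ((best & uncovered) - covered):
            break
        chosen.append(best)
        covered |= best & uncovered
    return chosen, covered


def greedy_monitor_selection(candidates, links, k):
    """candidates: monitor -> list of paths, each path a frozenset of links."""
    uncovered = set(links)
    remaining = dict(candidates)
    selected = []
    while uncovered and remaining:
        gains = {m: best_paths(p, uncovered, k)[1] for m, p in remaining.items()}
        monitor = max(gains, key=lambda m: len(gains[m]))
        if not gains[monitor]:
            break  # no remaining monitor covers any uncovered link
        selected.append(monitor)
        uncovered -= gains[monitor]
        del remaining[monitor]
    return selected, uncovered


candidates = {
    "m1": [frozenset({"a", "b"}), frozenset({"c"})],
    "m2": [frozenset({"c", "d"})],
    "m3": [frozenset({"a"})],
}
selected, left_uncovered = greedy_monitor_selection(candidates, {"a", "b", "c", "d"}, k=1)
```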
56
Greedy Assisted Linear Programming based algorithm
• Formulate Integer Linear Programming First
  – ILP is an NP-hard problem
• Relaxation to Linear Programming
  – Change every {0,1}-variable to a continuous variable between 0 and 1
• Random Rounding
  – Solve the linear program in polynomial time
  – Round the solutions within [0, 1] back to {0,1}-integers with certain probabilities
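A minimal sketch of the rounding step only, assuming the fractional LP solution is already available (solving the LP is omitted). Each monitor is kept with probability equal to its fractional value, and the draw is repeated until the chosen monitors cover every link. The retry loop, the function name and the sample values are assumptions; the slides do not specify the rounding probabilities.

```python
import random


def randomized_rounding(fractional, covers, links, rng, max_tries=100):
    """Round a fractional monitor selection to a feasible {0,1} selection.

    fractional: monitor -> LP value in [0, 1]
    covers: monitor -> set of links that monitor can measure
    Returns a list of chosen monitors, or None if no feasible draw was found.
    """
    required = set(links)
    for _ in range(max_tries):
        chosen = [m for m, x in fractional.items() if rng.random() < x]
        covered = set().union(*(covers[m] for m in chosen))
        if covered >= required:
            return chosen
    return None


covers = {"m1": {"a", "b"}, "m2": {"c", "d"}, "m3": {"a"}}
rng = random.Random(0)
# Hypothetical fractional solution; 1.0 is always kept and 0.0 never is.
rounded = randomized_rounding({"m1": 1.0, "m2": 1.0, "m3": 0.0}, covers, "abcd", rng)
```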
57
Multi-round Monitor Selection
• Star-like Topology and Operational Constraints Make Single-round Monitor Selection Inefficient
  – Multi-round monitoring vs. reducing measurement frequency
• Algorithms for Multi-round Monitor Selection
  – Multiply the constraints by the number of rounds and run single-round monitor selection
  – Schedule the paths to measure in different rounds
    • Greedy scheduling
    • Random scheduling
    • Linear programming based scheduling
58
Path Measurement Scheduling
• Greedy algorithm
  – Minimize link utilization in every step
• Random algorithm
  – Randomly schedule paths independently
  – Run the random algorithm multiple times to get the best schedule
• Linear Programming based algorithm with random rounding
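The greedy scheduling idea can be sketched as below: each path is placed in the round where it raises the peak per-link load the least. Counting paths per link as a proxy for link utilization is an assumption, as are the path and link names.

```python
def greedy_schedule(paths, rounds):
    """Assign each path to the round that keeps its busiest link least loaded.

    paths: path name -> set of links it traverses
    Returns: path name -> round index
    """
    load = [dict() for _ in range(rounds)]  # per round: link -> number of paths
    schedule = {}
    for name, links in paths.items():
        def peak(r):
            # Highest load on any of this path's links if it joined round r.
            return max((load[r].get(link, 0) + 1 for link in links), default=0)

        best = min(range(rounds), key=peak)
        schedule[name] = best
        for link in links:
            load[best][link] = load[best].get(link, 0) + 1
    return schedule


paths = {
    "p1": {"access1", "core1"},
    "p2": {"access1", "core2"},
    "p3": {"access2", "core1"},
}
schedule = greedy_schedule(paths, rounds=2)
```

Paths p1 and p2 share an access link, so the sketch separates them into different rounds, which mirrors why star-like topologies push toward multi-round monitoring.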
59
Monitoring and Diagnosis
• Path Monitoring and Faulty Path Discovery
• Faulty Link Diagnosis
  – Select and measure paths that favor the diagnosis of the potentially faulty links
[Figure: VScope workflow. VScope Setup: monitor selection and deployment. VScope Operation: iterative continuous path monitoring; if faulty paths are found (Y), link diagnosis; otherwise (N), monitoring continues]
60
Background and Related Work
• Network Layer Diagnosis
  – Linear algebraic model
  – Monitoring experimental design
  – Diagnosis algorithms
• Application Layer Diagnosis
  – Sherlock: enterprise network service diagnosis
61
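Linear Algebraic Model
Path loss rate p_i, link loss rate l_j:
[Figure: topology with nodes A, B, C and D, links 1, 2 and 3, and paths p1 and p2]
$$1 - p_1 = (1 - l_1)(1 - l_2)$$
$$\log(1 - p_1) = \log(1 - l_1) + \log(1 - l_2) = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \log(1 - l_1) \\ \log(1 - l_2) \\ \log(1 - l_3) \end{bmatrix}$$
With $x_j = \log(1 - l_j)$ and $b_i = \log(1 - p_i)$:
$$\begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b_1$$
$$G x = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
Usually an underconstrained system.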
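A small numeric sketch of the model, assuming path 1 traverses links 1 and 2 and path 2 traverses links 1, 2 and 3, with x_j = log(1 - l_j) and b_i = log(1 - p_i). The loss values are hypothetical. Two different link loss vectors produce the same path loss rates, which is what "underconstrained" means in practice.

```python
import math

# Routing matrix: rows are paths, columns are links (assumed topology).
G = [
    [1, 1, 0],
    [1, 1, 1],
]


def path_loss_rates(link_loss):
    """Compute path loss rates from link loss rates via b = G x in the log domain."""
    x = [math.log(1 - l) for l in link_loss]
    b = [sum(g * xj for g, xj in zip(row, x)) for row in G]
    return [1 - math.exp(bi) for bi in b]


# Hypothetical link loss rates: two lossy links versus one lossier link.
losses_a = path_loss_rates([0.1, 0.1, 0.0])
losses_b = path_loss_rates([0.19, 0.0, 0.0])
```

Both vectors give a loss rate of about 0.19 on each path, so end-to-end measurements alone cannot tell them apart.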
62
Monitoring Experimental Design
• Monitor Placement Problem
  – Select the fewest monitors that can measure a set of paths covering all the links [Infocom03]
• Path Selection Problem
  – Selection of the basis of the path matrix [Sigcomm04]
  – SVD based path selection [Infocom05]
  – Bayesian experimental design [Sigmetrics06]
• Network Layer Diagnosis
63
Network Layer Diagnosis
• Internet Tomography
  – Temporal correlation based algorithms
    • Unbiased if multicast is supported
  – Statistical algorithms
    • Introduce additional statistical assumptions or optimization goals