Scalable Management for Networks and Services
Rolf Stadler
Laboratory for Communication Networks, KTH Royal Institute of Technology
Stockholm
HP Laboratories, Palo Alto, March 31, 2003
Manager-Agent based management
[Figure labels: Management station, node, A (one per node)]
The Shift of a Management Paradigm
• Centralized Control
• Management protocols: SNMP, CMIP
• Program runs on Management Station
• Decentralized Control
• Program runs on network nodes
[Figure labels: M, Manager, A, Agent, P, Management Program, "download & execute", "results"]
Architecture for Pattern-based Management
[Figure labels: Management station, Management Program, Code Server, navigation, Router, Execution Environment]
Weaver—A Testbed for pattern-based Management
[Figure: Weaver testbed. Labels: Management Station; Router A, Router B, Router C, Router D; WAN A, WAN B, WAN C, WAN D; Fast Ethernet Switch]
Simple Navigation Patterns
[Figure column "navigation pattern": one diagram per type, with numbered hops]
• type 1: node-to-node
  – typical application: 1 node control/monitor (get/set of variables)
• type 2: visit all nodes along a path/flow
  – typical application: 1 flow/path control, e.g.: traceroute, bottleneck detection, signalling, VPN operation
• type 3: distribute agent to all nodes in subnet (parallel control)
  – typical application: subnet control, e.g.: topology detection
• type 4: visit all nodes in subnet (sequential control)
  – typical application: subnet control, message broadcast, e.g.: congestion location detection
[Figure sequence: Echo Pattern. Expansion frames with droot = 1, 2, 3, 4, 5, followed by contraction frames with droot = 4, 3, 2, 1]
The Echo Pattern
• Two phases of traversal
  – expansion phase: explorers flood network with requests for local operations
  – contraction phase: echoes return and aggregate results
• Properties
  – Generates balanced traffic load
  – Traffic load depends on network topology, not on speed of traversal
  – Time complexity increases linearly with network diameter
Examples of Echo-based Management
• Get information on topology
– compute the current number of leaf nodes, the connectivity distribution
– discover current topology within 10 hops of node x
• Get information on network state
– identify 10 most congested links
– compute distribution of link utilization, queue lengths
– identify subtopologies with highly loaded links
– find a resource R closest to node x
Pattern-based Management—An Engineering Approach to Decentralized Management
• A management program consists of
  – A navigation pattern (distributed graph traversal algorithm)
  – An operation on nodes
  – An aggregation function
• Relevance of this approach
  – Provides a basis to analyze management operations for performance, scalability, robustness
  – Supports concept of re-usable patterns, hides complexity
Composing Management Programs
[Figure labels: Management Program; Navigation Patterns; Echo Patterns: Segall, Chang, Skip, Wait, Scope, Multi; Aggregators; Echo Aggregators: Res. Disc., Leaf Count, Load. Hist., Conn. Hist.; Node Access / Local Operations: CLI, HTTP, XML, SNMP]
Properties of Patterns
[Figure labels: Management Program; Navigation Patterns; Echo Patterns (Simple Echo, Robust Echo, Others): Segall, Chang, Skip, Wait, Scope, Multi; Aggregators; Echo Aggregators: Res. Disc., Leaf Count, Load. Hist., Conn. Hist.; Node Access: CLI, HTTP, XML, SNMP]
• A pattern can be used for many management operations.
• A pattern can be chosen according to performance objectives.
• A pattern hides the complexity of a distributed operation.
• Network failures can be handled within patterns.
• Code mobility can be controlled.
    visitedi : boolean init false;
    Gi : set of integers init neighbors();
    parenti : integer init -1;

    Echo(inmsg: bytes, from: integer) {
      Gi := Gi - from;
      if visitedi = false {
        parenti := from;
        visitedi := true;
        OnInitiate(inmsg, outmsg);
        // expansion: send explorers to the remaining neighbors
        if Gi != empty dispatch(Gi, outmsg, i);
      } else OnAggregate(inmsg);
      if Gi = empty {
        OnComplete(outmsg);
        // contraction: send the echo to the parent, or finish at the root
        if parenti >= 0 dispatch(parenti, outmsg, i);
        else OnTerminate(inmsg);
      }
    }
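The Echo handler can be tried out in a single-process simulation. The following Python sketch is illustrative only and not Weaver code: the names run_echo, local_op and aggregate are made up here, messages are delivered in FIFO order, and explorers are assumed to go to all remaining neighbors in G_i.

```python
from collections import deque

def run_echo(graph, root, local_op, aggregate):
    """Simulate an Echo traversal on an undirected graph.

    graph: dict mapping node -> set of neighbor nodes (symmetric).
    Returns (aggregated result at root, messages counted per link).
    """
    pending = {v: set(nbrs) for v, nbrs in graph.items()}  # G_i
    visited = {v: False for v in graph}
    parent = {v: None for v in graph}
    partial = {}
    link_msgs = {}
    result = None
    # Each queue entry is (sender, receiver, kind, payload).
    queue = deque([(None, root, "explorer", None)])

    def send(src, dst, kind, payload):
        key = frozenset((src, dst))
        link_msgs[key] = link_msgs.get(key, 0) + 1
        queue.append((src, dst, kind, payload))

    while queue:
        src, node, kind, payload = queue.popleft()
        pending[node].discard(src)
        if not visited[node]:
            visited[node] = True
            parent[node] = src
            partial[node] = local_op(node)              # OnInitiate
            for nbr in pending[node]:                   # expansion
                send(node, nbr, "explorer", None)
        elif kind == "echo":
            partial[node] = aggregate(partial[node], payload)  # OnAggregate
        if not pending[node]:                           # OnComplete
            if parent[node] is not None:
                send(node, parent[node], "echo", partial[node])
            else:
                result = partial[node]                  # OnTerminate
    return result, link_msgs

# Example: count the nodes of a small graph with five links.
example = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
count, per_link = run_echo(example, 0, lambda v: 1, lambda a, b: a + b)
```

In this sketch every link carries exactly two messages (an explorer and an echo on spanning-tree links, two crossing explorers otherwise), which is one way to see why the load is balanced across links.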
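The Interface between Pattern and Aggregator
Handlers called by the pattern: OnBegin, OnInitiate, OnAggregate, OnComplete, OnTerminate
Aggregator code fragments shown on the slide (average load):
    … av_load := load(); n := 1; …
    … av_load := (av_load*n + av_loadj)/(n+1); n := n+1; …
    … av_loadi := av_load; …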
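The average-load fragments can be collected into one small aggregator object. This is a minimal Python sketch under assumptions: the class name AvgLoad, the method names and the child_n parameter are hypothetical. With the default child_n = 1 the update is the one written on the slide; passing the child's own sample count is a variation, not taken from the slides, that weights each subtree by its size.

```python
from dataclasses import dataclass

@dataclass
class AvgLoad:
    """Running average of node load, merged along the echo tree."""
    av_load: float = 0.0
    n: int = 0

    def on_initiate(self, local_load):
        # First visit: start from the node's own load.
        self.av_load, self.n = local_load, 1

    def on_aggregate(self, child_av, child_n=1):
        # Merge a value reported by a neighbor.
        # child_n = 1 gives (av_load*n + child_av) / (n + 1), as on the slide.
        total = self.n + child_n
        self.av_load = (self.av_load * self.n + child_av * child_n) / total
        self.n = total

    def on_complete(self):
        # Value (and sample count) to report to the parent.
        return self.av_load, self.n

a = AvgLoad()
a.on_initiate(0.25)
a.on_aggregate(0.75)             # slide-style update
w = AvgLoad()
w.on_initiate(1.0)
w.on_aggregate(4.0, child_n=3)   # weighted variation
```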
SIMPSON: A SIMple Pattern Simulator fOr Large Networks
[Figure: Traffic vs Time for 221 node grid network. x-axis: Time (secs), 0 to 6; y-axis: Traffic (bytes), 0 to 1.2e+06; data series "trace1.txt"]
Analyzing Management Operations
[Figure: Network Graph G=(V,E) with nodes A to F, and two Execution Graphs G'=(V',E'): Centralized Management (Star Pattern) and Distributed Management (Echo Pattern)]
Traffic Complexity of Management Operations
Amount of traffic placed on the network during execution.
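$$C_{traffic} = \sum_{v' \in V'} \; \sum_{0 < k \le childcount(v')} hopcount(v', child_k(v')) \cdot (I_q + I_r)$$
[Equation for C_traffic^echo: not recoverable from the transcript. Surviving terms: (I_q + I_r), E[degree(G)], |V|, and the constants 2 and 1]
$$C_{traffic}^{star} = (I_q + I_r) \sum_{v' \in V'} hopcount(v'_{root}, v')$$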
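The general traffic formula can be evaluated on small example graphs. The Python sketch below is a hedged illustration: the function names are made up, I_q and I_r are read as request and reply sizes, and the shape of the echo execution graph (a breadth-first spanning tree with hop count 1 per edge, plus every non-tree link appearing once from each side) is an assumption based on the execution-graph picture, not a formula taken from the slides.

```python
from collections import deque

def bfs(graph, root):
    """Return (hop counts, BFS-tree parents) from root."""
    dist, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(graph[u]):
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                queue.append(v)
    return dist, parent

def traffic(exec_edges, iq, ir):
    """Sum hopcount * (Iq + Ir) over all edges of an execution graph."""
    return sum(hops * (iq + ir) for _parent, _child, hops in exec_edges)

def star_exec_edges(graph, root):
    """Star pattern: the root talks to every node directly."""
    dist, _ = bfs(graph, root)
    return [(root, v, h) for v, h in dist.items() if v != root]

def echo_exec_edges(graph, root):
    """Echo pattern under the assumption stated in the lead-in."""
    _, parent = bfs(graph, root)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    edges = [(p, v, 1) for v, p in parent.items() if p is not None]
    for u in graph:
        for v in graph[u]:
            if frozenset((u, v)) not in tree:
                edges.append((u, v, 1))  # added once from each endpoint
    return edges

IQ = IR = 1024  # assumed 1 KB per message in each direction
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
star_path = traffic(star_exec_edges(path, 0), IQ, IR)
echo_path = traffic(echo_exec_edges(path, 0), IQ, IR)
star_ring = traffic(star_exec_edges(ring, 0), IQ, IR)
echo_ring = traffic(echo_exec_edges(ring, 0), IQ, IR)
```

On the four-node path the star pattern pays for every hop to every node, while the echo pattern uses each link once; on the ring the extra non-tree link makes the echo total larger, although no single link carries more than 2(I_q + I_r).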
Time Complexity of Management Operations
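Time needed from invocation until completion of an operation.
$$C_{time} = C_{time}(v'_{root})$$
$$C_{time}(v') = \begin{cases} t_c + t_r & \text{if } childcount(v') = 0 \\ t_c + t_r + M(v') & \text{otherwise} \end{cases}$$
$$M(v') = \max_{1 \le k \le childcount(v')} \left( k\,t_q + 2\,hopcount(v', child_k(v'))\,t_l + C_{time}(child_k(v')) \right)$$
$$C_{time}^{star} = O(|V|)$$
$$C_{time}^{echo} = O(d)$$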
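The recursive definition of the completion time translates directly into code. The following Python sketch is only an illustration: the function name c_time and the dictionary encoding of the execution graph are assumptions, and t_c, t_r, t_q and t_l are treated as plain parameters because the transcript does not define them.

```python
def c_time(tree, v, tc, tr, tq, tl, hop=lambda a, b: 1):
    """Completion time of the sub-operation rooted at execution-graph node v.

    tree: dict mapping a node to the ordered list of its children.
    hop:  function returning hopcount(v, child); defaults to 1 hop.
    """
    children = tree.get(v, [])
    if not children:
        return tc + tr
    # M(v'): the k-th child is contacted after k request slots, then costs a
    # round trip over hop(v, child) hops plus its own completion time.
    m = max(
        k * tq + 2 * hop(v, child) * tl + c_time(tree, child, tc, tr, tq, tl, hop)
        for k, child in enumerate(children, start=1)
    )
    return tc + tr + m

# Unit costs make the growth easy to see.
star = {0: [1, 2, 3, 4, 5]}                      # one manager, five leaves
chain = {0: [1], 1: [2], 2: [3], 3: [4], 4: [5]}
binary = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
t_star = c_time(star, 0, 1, 1, 1, 1)
t_chain = c_time(chain, 0, 1, 1, 1, 1)
t_binary = c_time(binary, 0, 1, 1, 1, 1)
```

With unit costs the star grows with the number of children of the root, and the chain grows with its depth, in line with the O(|V|) and O(d) statements.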
Performing Echo-based Operations on the Entire Internet
• Purpose: to illustrate the scalability of echo-based operations.
• What we needed:
– Complexity analysis of pattern
– Estimation of Internet topological properties
• diameter
• connectivity distribution
• number of nodes
Estimated Performance of Echo-based Operation on the Internet
Assumptions:
• Process-level transmission time: 5 ms
• Network delay per hop: 4 ms
• Message size: 1 KB
• Local operation: 500 ms per execution
• Diameter of Internet: 34 hops

                        Echo Pattern          Star Pattern
Aggregated Traffic      2.25 x 10^11 bytes    1.31 x 10^12 bytes
Max Traffic on a Link   4,096 bytes           1.8 x 10^9 bytes
Completion Time         17.48 seconds         5.09 days
Weaver Active Node
[Figure: architecture diagram. Components: Management Station, Active Node Manager, Source Repository, Binaries Repository, Preprocessor, C++ Compiler, Active Node Engine, Transport Access Point, Execution Environment, Device Manager, Router, Node State, Local Program States. Labeled flows: Source Code; Source, State; Source code, Active node management; Management commands; Management Operation Results; results; Events; SNMP sets; SNMP gets/traps]
Suboperations in Weaver
[Figure: timeline of suboperations on Node A and Node B, from start to end along the Time axis. Labels: Execution (T1), Serialization (T2), Dispatch (T3), Receiving (T4), Loading (T5) or Instantiation (T6), De-serialization (T7), Resolving (T8); communication delays TC1 and TC2]
Measuring Execution Times on Weaver
Suboperation                 Duration in ms       Performed by Module
Execution (T1)               1.57 (σ = 0.48)      Execution Environment
Serialization (T2)           3.46 (σ = 0.71)      Execution Environment
Dispatch (T3)                1.67 (σ = 0.49)      Transport Access Point
Receiving (T4)               0.62 (σ = 0.30)      Transport Access Point
Loading (T5)                 23.42 (σ = 0.70)     Execution Environment
Instantiation (T6)           0.77 (σ = 0.015)     Execution Environment
De-serialization (T7)        2.04 (σ = 0.49)      Execution Environment
Resolving (T8)               0.15 (σ = 0.001)     Execution Environment
Communications Delay (TC)    4.04 (σ = 0.10)      ---
Estimating Execution Times of Echo-based Operations on Weaver
Designing Robust Patterns

Plain Echo

    Echo(inmsg: bytes, from: integer) {
      Gi := Gi - from;
      if visitedi = false {
        parenti := from;
        visitedi := true;
        OnInitiate(inmsg, outmsg);
        if Gi != empty dispatch(Gi, outmsg, i);
      } else OnAggregate(inmsg);
      if Gi = empty {
        OnComplete(outmsg);
        if parenti >= 0 dispatch(parenti, outmsg, i);
        else OnTerminate(inmsg);
      }
    }

Skip Echo

    SkipEcho(inmsg: bytes, from: integer) {
      if visitedi = false {
        parenti := from;
        visitedi := true;
        OnInitiate(inmsg, outmsg, i);
        Gi := up_neighbors() - from;
        if Gi != empty dispatch(Gi, outmsg, i);
      } else {
        Gi := Gi - from;
        OnAggregate(inmsg);
      }
      if completei != true and Gi = empty {
        OnComplete(outmsg);
        completei := true;
        if parenti >= 0 dispatch(parenti, outmsg, i);
        else OnTerminate(inmsg);
      }
    }

    alarm(type: {failure, recovery}, affected: integer) {
      if visitedi = true {
        if type = failure {
          // stop waiting for the failed neighbor
          Gi := Gi - affected;
          if completei != true and Gi = empty {
            completei := true;
            OnComplete(outmsg);
            if parenti >= 0 dispatch(parenti, outmsg, i);
            else OnTerminate(inmsg);
          }
        }
      }
    }

Wait Echo

    WaitEcho(inmsg: bytes, from: integer) {
      if visitedi = false {
        parenti := from;
        visitedi := true;
        OnInitiate(inmsg, outmsg, i);
        Gi := up_neighbors() - from;
        if Gi != empty dispatch(Gi, outmsg, i);
      } else {
        Gi := Gi - from;
        OnAggregate(inmsg);
      }
      if completei != true and Gi = empty {
        OnComplete(outmsg);
        completei := true;
        if parenti >= 0 dispatch(parenti, outmsg, i);
        else OnTerminate(inmsg);
      }
    }

    alarm(type: {failure, recovery}, affected: integer) {
      if visitedi = true {
        if type = failure {
          // remember the failed neighbor in Bi
          Gi := Gi - affected;
          Bi := Bi + affected;
          if completei != true and Gi = empty {
            completei := true;
            OnComplete(outmsg);
            if parenti >= 0 dispatch(parenti, outmsg, i);
            else OnTerminate(inmsg);
          }
        } else {
          // recovery: wait for the neighbor again
          if affected is in Bi {
            Bi := Bi - affected;
            Gi := Gi + affected;
          }
        }
      }
    }
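The failure path of Skip Echo can be exercised on a single node in isolation. This Python sketch is a simplified illustration, not the Weaver implementation: the class SkipEchoNode and its outbox list are hypothetical, aggregator callbacks are left out, and explorers are assumed to go to the remaining up neighbors.

```python
class SkipEchoNode:
    """State of one node running a Skip Echo style handler."""

    def __init__(self, ident, up_neighbors):
        self.ident = ident
        self.up_neighbors = set(up_neighbors)
        self.visited = False
        self.complete = False
        self.terminated = False   # set only on the initiating node
        self.parent = None
        self.pending = set()      # G_i: neighbors still expected to answer
        self.outbox = []          # (destination, kind) pairs this node sent

    def on_message(self, sender):
        if not self.visited:
            self.visited, self.parent = True, sender
            self.pending = self.up_neighbors - {sender}
            self.outbox += [(n, "explorer") for n in sorted(self.pending)]
        else:
            self.pending.discard(sender)
        self._maybe_complete()

    def on_alarm(self, kind, affected):
        # A failed neighbor is skipped: the node stops waiting for it.
        if self.visited and kind == "failure":
            self.pending.discard(affected)
            self._maybe_complete()

    def _maybe_complete(self):
        if not self.complete and not self.pending:
            self.complete = True
            if self.parent is not None:
                self.outbox.append((self.parent, "echo"))
            else:
                self.terminated = True

# Node 1 is visited from node 0, hears back from 2, then learns that 3 failed.
node = SkipEchoNode(1, {0, 2, 3})
node.on_message(0)
node.on_message(2)
node.on_alarm("failure", 3)
node.on_alarm("failure", 2)   # late alarm: no second echo is sent
```

The complete flag is what keeps a late message or alarm from triggering a second echo toward the parent.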
[Figure: Coverage vs. Time for Skip Echo. x-axis: Time (mins), 52.5 to 55.5; y-axis: Coverage (nodes), 100 to 220; one curve each for MTTF = 3.683, 7.367, 11.05, 14.733, 29.467 and 73.67 hrs]
Network Coverage vs. Execution Time for Skip Echo
[Figure legend: MTTR = 1 min, MTTR = 11 min, MTTR 0, MTTR inf; MTTF = 3.6, 7.3, 11.0, 14.7, 29.4 and 73.6 hrs]
Current and Planned Work
• Self-organizing, adaptable Networks and Systems: Patterns for routing and dynamic construction of network control structures. (Constantin Adam)
• WQL: A table-based Network Query Language on Weaver. (Koon-Seng Lim)
• Policy-based Management: Patterns for distribution and dynamic re-computation of policies. (Alberto Gonzalez)
Literature on this Work
• K.S. Lim and R. Stadler: "Weaver: Realizing a scalable management paradigm on commodity routers," Eighth IFIP/IEEE International Symposium on Integrated Network Management (IM 2003), Colorado Springs, Colorado, USA, March 24-28, 2003.
• K.S. Lim and R. Stadler: "Developing pattern-based management programs," IFIP/IEEE International Conference on Management of Multimedia Networks and Services (MMNS 2001), Chicago, IL, October 29 - November 1, 2001.
• K.S. Lim and R. Stadler: "A navigation pattern for scalable Internet management," IFIP/IEEE International Symposium on Integrated Network Management (IM 2001), Seattle, Washington, May 14-18, 2001.
• R. Kawamura and R. Stadler: "A middleware architecture for active distributed management of IP networks," IEEE/IFIP Network Operations and Management Symposium (NOMS 2000), Honolulu, Hawaii, April 10-14, 2000.