The Network Management Problem
What Network Operators Must Be Able to Do
The requirement for network management:
• Provisioning
• Detecting faults
• Checking (and verifying) performance
• Billing/accounting
• Initiating repairs or network upgrades
• Maintaining the network inventory
The issues are:
• Bringing the managed data to the code
• Scalability
• The shortage of development skills for creating management systems
• The shortage of operational skills for running networks
Bringing the Managed Data to the Code
• Managed objects reside on many SNMP agent hosts.
• Copies of managed objects reside on SNMP management systems.
• Changes in agent data may have to be regularly reconciled with the management system copy.
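The reconciliation step can be sketched as a comparison between the rows read from an agent and the copy held by the management system. This is a minimal illustration only: the names `Reconciliation`, `reconcile`, and `apply_diff` are hypothetical, rows are modelled as plain dictionaries keyed by a row index, and the agent is assumed to be the authoritative side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reconciliation:
    added: dict    # rows on the agent that the NMS copy lacks
    removed: dict  # rows in the NMS copy that the agent no longer has
    changed: dict  # rows in both whose values differ (agent value shown)


def reconcile(agent_rows: dict, nms_rows: dict) -> Reconciliation:
    """Compare the agent's table with the management system copy."""
    added = {k: v for k, v in agent_rows.items() if k not in nms_rows}
    removed = {k: v for k, v in nms_rows.items() if k not in agent_rows}
    changed = {
        k: v for k, v in agent_rows.items()
        if k in nms_rows and nms_rows[k] != v
    }
    return Reconciliation(added, removed, changed)


def apply_diff(nms_rows: dict, diff: Reconciliation) -> dict:
    """Return a new NMS copy that matches the agent (agent is authoritative)."""
    updated = {k: v for k, v in nms_rows.items() if k not in diff.removed}
    updated.update(diff.added)
    updated.update(diff.changed)
    return updated
```

Returning the three sets separately, rather than overwriting the copy outright, lets the management system log or raise events for rows that changed outside its control.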
Scalability: Today's Network Is Tomorrow's NE
Layer 2 VPN Scalability
Virtual Circuit Status Monitoring
• A new type of MIB object.
• Compression software facilities in the agents and managers. To a degree, this could be considered to run counter to the philosophy of simplicity associated with SNMP.
MIB Note: Scalability
• Status (e.g., becoming congested or going out of service)
• Faults such as an intermediate node/link failure or receipt of an invalid MPLS label
• Deletion by a user via a CLI (i.e., outside the management system)
• Modification by a user (changing the administrative status from up to down)
Other Enterprise Network Scalability Issues
• Storage solutions, such as adding, deleting, modifying, and monitoring SANs
• Administration of firewalls, such as rules for permitting or blocking packet transit
• Routers, such as access control lists and static routes
• Security management, such as encryption keys, biometrics facilities, and password control
• Application management
Light Reading Trials
• MPLS throughput
• Latency
• IP throughput at OC-48
• IP throughput at OC-192
Large NEs
Advantages:
• They reduce the number of devices required, saving central office (CO) space and reducing cooling and power requirements.
• They may help to reduce cabling by aggregating links.
• They offer a richer feature set.
Disadvantages:
• They are harder to manage.
• They potentially generate vast amounts of management data.
• They are a possible single point of failure if not backed up.
Controlling the network may not be possible because of:
• Process priority clashes
• SNMP message queue sizes that are too small
• Excessive I/O interrupts
Expensive (and Scarce) Development Skill Sets
• Object-oriented development and modeling using Unified Modeling Language (UML) for capturing requirements, defining actors (system users) and use cases (the principal transactions and features), and mapping them into software classes
• Java/C++
• GUI, often packaged as part of a browser and providing access to network diagrams, provisioning facilities, faults, accounting, and so on
• Server software for long-running, multiclient FCAPS processes
• Specific support for mature/developing features, such as ATM/MPLS
• CORBA for multiple programming languages and remote object support across heterogeneous environments
• Database design/upgrade: matching MIB to database schema across numerous NMS/NE software releases
• Deployment and installation issues: performance is always an important deployment issue, as is ease of installation
• IP routing
• MPLS
• Layer 2 technologies such as ATM, FR, and Gigabit Ethernet
• Legacy technologies such as voice-over-TDM and X.25
• Ability to develop generic software components and models: the management system can hide much of the complex underlying detail of running the network
• Client/server design
• Managed object design, part of the modeling phase for the management system
• MIB design: often there is a need for new objects in the managed devices to support the management system
• A solution mindset
• Distributed, creative problem solving
• Taking ownership
• Acquiring domain expertise
• Embracing short development cycles
• Minimizing code changes
• Strong testing capability
Developer Note: A Solution Mindset
• Clear economic value
• Fulfillment of important requirements
• Resolution of one or more end-user problems
Examples of management system solutions include the following:
• Providing minimal management support for third-party devices
• Creating generic management system components that can be used across numerous different products and technologies
• Aiming for technology-independent software infrastructure using standard middleware
Developer Note: Distributed, Creative Problem Solving
• Software bugs
• NE bugs (can be very hard to identify)
• Performance bottlenecks in any of the FCAPS applications due to congestion in the network, DBMS, agent, manager, and so on
• Database problems such as deadlocks, client disconnections, log files filling up, and so on
Developer Note: Distributed, Creative Problem Solving
• Client applications crashing intermittently
• MIB table corruption, such as a number of set operations that only partially succeed. For example, three setRequests (against a MIB table) are sent but one message results in an agent timeout and the other two are successful, which could leave the table in an inconsistent state
• SNMP agent exceptions
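The partial-success case above can be detected by recording the outcome of every set request in a batch. The sketch below is an assumption-laden illustration, not a real SNMP client: `send_set` is a hypothetical callable that returns True on success, returns False on an error response, and raises `TimeoutError` when the agent does not answer.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    TIMEOUT = "timeout"
    ERROR = "error"


def send_batch(requests, send_set):
    """Send each (oid, value) pair and record what happened to it.

    send_set is a hypothetical transport function supplied by the caller.
    """
    results = {}
    for oid, value in requests:
        try:
            results[oid] = Outcome.OK if send_set(oid, value) else Outcome.ERROR
        except TimeoutError:
            results[oid] = Outcome.TIMEOUT
    return results


def is_partial(results):
    """True when some sets succeeded and at least one did not."""
    outcomes = set(results.values())
    return Outcome.OK in outcomes and len(outcomes) > 1
```

A timeout does not prove the set was not applied, so when `is_partial` reports True the safer reaction is usually to re-read the affected rows from the agent before deciding whether to retry or undo.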
The excellent tools available:
• UML support packages
• Java/C++/SDL products
• Version control
• Debuggers
Developer Note: Taking Ownership
Developer Note: Acquiring Domain Expertise and Linked Overviews
• Layer 2 and layer 3 traffic engineering
• Layer 2 and layer 3 QoS
• Network management
• Convergence of legacy technologies into IP. Many service providers have built large IP networks in anticipation of forecasted massive demand. These IP networks are, in many cases, not profitable, so service providers are keen to push existing revenue-generating services (such as layer 2) over them.
Developer Note: Acquiring Domain Expertise and Linked Overviews
• Backward and forward compatibility of new technologies, such as MPLS. An example is that of a service provider with existing, revenue-generating services such as ATM, FR, TDM, and Ethernet. The service provider wants to retain customers but migrate the numerous incoming services into a common MPLS core.
Linked Overviews
Developer Note: An ATM Linked Overview
• ATM is a layer 2 protocol suitable for deployment in a range of operational environments (in VLANs and ELANs, in the WAN, and also in SP networks).
• ATM offers a number of different categories and classes of service. The required service level is enforced by switches using policing (traffic cop function), shaping (modifying the traffic interarrival time), marking (for subsequent processing), and dropping.
• Traffic is presented to an ATM switch and converted into a stream of 53-byte ATM cells.
• The stream of cells is transmitted through an ATM cloud.
Developer Note: An ATM Linked Overview
• A preconfigured virtual circuit dictates the route taken by the cell stream. Virtual circuits can be created either manually or using a signaling protocol. If no virtual circuit is present then PNNI can signal switched virtual circuits (SVCs).
• The virtual circuit route passes through intermediate node interfaces and uses a label-based addressing scheme.
• Bandwidth can be reserved along the path of this virtual circuit in what is called a contract.
• Various traffic engineering capabilities are available, such as dictating the route for a virtual circuit.
The essential ATM managed objects can be derived:
• ATM nodes
• A virtual circuit (switched, permanent, or soft) spanning one or more nodes
• A set of interfaces and links
• A set of locally significant labels used for addressing
• An optional route or designated transit list
• A bandwidth contract
• Traffic engineering settings
• QoS details
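One way to carry the managed objects listed above into software classes is sketched here. Everything in it is illustrative: the class and field names are hypothetical, the locally significant labels are assumed to be VPI/VCI pairs, and the default QoS value is only a placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class CircuitKind(Enum):
    SWITCHED = "SVC"
    PERMANENT = "PVC"
    SOFT = "SPVC"


@dataclass
class Hop:
    node: str        # ATM node the circuit passes through
    interface: str   # interface on that node
    vpi: int         # locally significant label (assumed VPI/VCI pair)
    vci: int


@dataclass
class VirtualCircuit:
    name: str
    kind: CircuitKind
    hops: List[Hop]
    bandwidth_kbps: int = 0                                # bandwidth contract
    designated_transit_list: Optional[List[str]] = None    # optional route
    qos_class: str = "UBR"                                 # placeholder QoS detail

    def nodes(self) -> List[str]:
        """Node names in path order."""
        return [hop.node for hop in self.hops]
```

Keeping the per-hop labels on `Hop` rather than on the circuit reflects the fact that the labels are only meaningful on one node.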
Developer Note: An IP Linked Overview
• IP is packet-based: IP nodes make forwarding decisions with every packet.
• IP is not connection-oriented.
• IP provides a single class of service: best effort.
• IP does not provide traffic engineering capabilities.
• IP packets have two main sections: header and data.
• IP header lookups are required at each hop. With the present line-rate technology, lookups are no longer such a big issue; routing protocol convergence may cause more problems.
Developer Note: An IP Linked Overview
• IP devices are either hosts or routers (often called gateways).
• Hosts do not forward IP packets; routers do.
• IP devices have routing tables.
• IP operates in conjunction with other protocols, such as OSPF, IS-IS, Border Gateway Protocol 4 (BGP4), and Internet Control Message Protocol (ICMP).
• Large IP networks can be structured as autonomous systems made up of smaller interior areas or levels.
The essential managed objects of IP are:
• IP nodes (routers, hosts, clients, servers)
• IP interfaces
• IP subnets
• IP protocols (routed protocols such as TCP/IP and routing protocols such as OSPF and IS-IS)
• Interior Gateway Protocol (IGP) areas (OSPF) or levels (IS-IS)
• Exterior Gateway Protocol (EGP) autonomous systems
Embracing Short Development Cycles
• Reduced feature sets in more frequent releases
• Foundation releases
• Good upgrade paths
• Getting good operational feedback from end users
Minimizing Code Changes
Elements of NMS Development
NMS Development
• Using a browser-based GUI, the developer has provisioned onto the network a managed object such as an ATM virtual circuit or an MPLS LSP.
• The developer wants to check that the software executed the correct actions.
• During provisioning, the developer verifies that the correct Java code executed using a Java console and trace files (similar actions can be done for C/C++ systems).
• The database is updated by the management system code, and this can be checked by running an appropriate SQL script.
• The next step is verifying that the correct set of managed objects was written to the NE. To do this, the developer uses a MIB browser to check that the row object has been written to the associated agent MIB.
Other skills are:
• Data analysis: matching NE data to the NMS database schema
• Data analysis: defining NMS-resident objects that exist in complex component form in the network (an example is a VPN, as discussed earlier in this chapter)
• Upgrade considerations for when MIBs change (as they regularly do)
• UML, Java, and object-oriented development
• Class design for major NMS features, like MPLS provisioning
• GUI development
• Middleware using CORBA-based products
• Insulating applications from low-level code
When MIBs Change: Upgrade Considerations
• Deprecate old objects no longer in use; don't delete them from the MIB if at all possible.
• Keep the MIB object identifiers sequential; add new OIDs as necessary.
• Don't change any existing OIDs in MIBs that are currently in use by the NMS. RFC 2578 provides guidelines for this.
• Ensure that MIB files do not have to be changed in order to work with management systems.
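The first three rules above lend themselves to an automated check between two MIB revisions. The following is a rough sketch under stated assumptions: each revision has already been parsed into a dictionary mapping object name to an `(oid, status)` pair, and the function name `check_mib_upgrade` is hypothetical. It does not parse MIB files itself.

```python
def check_mib_upgrade(old: dict, new: dict) -> list:
    """Return a list of problems found when moving from old to new.

    Both arguments map object name -> (oid, status), for example
    {"vcStatus": ("1.3.6.1.4.1.99999.1.1", "current")}.
    """
    problems = []
    for name, (old_oid, _status) in old.items():
        if name not in new:
            # Rule: deprecate rather than delete.
            problems.append(f"{name}: deleted; mark it deprecated instead")
            continue
        new_oid, _new_status = new[name]
        if new_oid != old_oid:
            # Rule: never change an OID that is already in use.
            problems.append(f"{name}: OID changed from {old_oid} to {new_oid}")

    old_oids = {oid for oid, _status in old.values()}
    for name, (oid, _status) in new.items():
        if name not in old and oid in old_oids:
            # New objects must take new OIDs, not recycle existing ones.
            problems.append(f"{name}: reuses existing OID {oid}")
    return problems
```

Run as part of a build, a check like this can stop an NE release whose MIB would silently break the database mapping in the management system.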
UML, Java, and Object-Oriented Development
• Structured classification (use cases, classes, components, and nodes)
• Dynamic behavior (describes system changes over time)
• Model management (organization of the model itself)
Class Design for Major NMS Features
GUI Development
Middleware Using CORBA-Based Products
Insulating Applications from Low-Level Code
MPLS: Second Chunk
• Explicit Route Objects (ERO), strict and loose
• Resource blocks
• Tunnels and LSPs
• In-segments
• Out-segments
• Cross-connects
• Routing protocols
• Signaling protocols
• Label operations: lookup, push, swap, and pop
• Traffic engineering
• QoS
Label Operations
• Lookup: The node examines the value of the topmost label. This operation occurs at every node in an MPLS cloud. In our example, lookup would occur using Label2. Typically, a label lookup results in the packet being relabeled and forwarded through a node interface indicated by the incoming label.
• Swap: This occurs when an MPLS node replaces the label with a new one.
• Pop: This occurs when the topmost label is removed from the stack. If the label stack has a depth of one, then the packet is no longer MPLS-encapsulated. In this case, an IP lookup can be performed using the IP header.
• Push: This occurs when a label is either pushed onto the label stack or attached to an unlabeled packet.
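The four operations can be illustrated on a label stack modelled as a Python list whose first element is the topmost label. This is a toy model for reading the definitions, not forwarding-plane code; the function names and the label values in the usage note are made up.

```python
def lookup(stack):
    """Return the topmost label, or None for an unlabeled packet."""
    return stack[0] if stack else None


def push(stack, label):
    """Add a label on top; also covers labeling an unlabeled packet."""
    return [label] + stack


def swap(stack, new_label):
    """Replace the topmost label with a new one."""
    if not stack:
        raise ValueError("cannot swap on an unlabeled packet")
    return [new_label] + stack[1:]


def pop(stack):
    """Remove the topmost label; an empty result means plain IP again."""
    if not stack:
        raise ValueError("cannot pop an unlabeled packet")
    return stack[1:]
```

For instance, pushing 17 and then 42 onto an unlabeled packet gives `[42, 17]`; a swap to 99 gives `[99, 17]`; two pops leave `[]`, at which point the node falls back to an IP lookup.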
MPLS Encapsulation
• 0 – IPv4 explicit null that signals the receiving node to pop the label and execute an IP lookup
• 1 – Router alert that indicates to the receiving node to examine the packet more closely rather than simply forwarding it
• 2 – IPv6 explicit null
• 3 – Implicit null, a label that a node can advertise but that never appears in the encapsulation; an upstream node that would otherwise swap to this label pops the stack instead
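To show where these label values sit on the wire, the sketch below packs and unpacks a single 32-bit label stack entry. The field layout (20-bit label, 3 experimental bits, 1 bottom-of-stack bit, 8-bit TTL) is taken from RFC 3032 and is not described in the slides; the function names and defaults are hypothetical.

```python
import struct

# The four reserved values listed above.
RESERVED_LABELS = {
    0: "IPv4 explicit null",
    1: "Router alert",
    2: "IPv6 explicit null",
    3: "Implicit null",
}


def encode_entry(label, exp=0, bottom=True, ttl=64):
    """Pack one label stack entry into 4 bytes (network byte order)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 8:
        raise ValueError("exp must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def decode_entry(data):
    """Unpack the first 4 bytes of data into the entry's fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }
```

A receiver would decode the entry and then consult `RESERVED_LABELS.get(entry["label"])` before doing an ordinary label lookup.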
Summary
There are some serious problems affecting network management. Bringing managed data and code together is one of the central foundations of computing and network management. Achieving this union of data and code in a scalable fashion is a problem that gets more difficult as networks grow.