establishing best practices for network management
TRANSCRIPT
Copyright © 1998, Cisco Systems, Inc. All rights reserved. Printed in USA.1066_05F9_c2.scr 1
1© 1999, Cisco Systems, Inc. 8041066_05F9_c2
Establishing Best Practices for Network Management
Session 804
Agenda
• Introduction to Best Practices
• Preparing the Network for Management
• Managing Change
• Fault Management
• Summary
Introduction to Best Practices
Network Downtime Is Costly
• The Internet and e-commerce have significantly increased the availability stakes…
  - 24-hour banking
  - E-trade
  - Global economy
[Chart: Infonetics Cost of WAN Downtime ’98. Y-axis: average dollars per year ($000,000), 0 to 8. Labels: Productivity Loss, Revenue Loss, Costs, Enterprise Network Mgmt. Budget. Values shown: $4.2M, $3.6M, $3.6M.]
*Due to hard downtime and service degradations
Best Practices Defined
• Applying what works well for others to improve overall network availability
  - Reduce the time required for planned outages (scheduled change), including changes with no associated outage
  - Reduce network downtime during unplanned outages (unscheduled change)
Lots of Practices, Some Truths
Do What Works for You!
• Even the best NM products can be useless with “bad” practices
• Tools help you to do your job; they are NOT the job
• Communication and security are the “bread and butter” of best practices
Preparing the Network
Congratulations!
You’ve Just Been Promoted to Manage the Entire Network for the Western Region...
What They’re Really Thinking…
“What am I getting into… how am I going to do this? Where do I begin?”
“I sure hope he lasts longer than the last guy.”
“What a loser! Does he have any idea what he’s in for?”
“How come we don’t have legs?”
Preparing the Network for Management
Best Practices
1. Selecting the “right” tools
2. Preparing the devices
3. Preparing the tools
4. Building a baseline
5. Maintaining “management”
Selecting the Right Tools
• How do I select the “right” set of management applications?
  - Understand the technologies and buzzwords
  - Understand your network and end-user requirements
  - Implement company standards
  - Many choices: evaluate and choose what’s right for your environment
Platforms and Vendor-Specific Management
• NMS
  - SNMP-based, status map, and trap receiver
  - HP Openview, Tivoli Netview, CA UniCenter, SNMPc, etc.
  - MicroMuse, Seagate, Concord, Enterprise Pro, and MRTG
• Vendor specific
  - Geared towards managing a specific vendor’s devices only
  - Optivity, Transcend, CiscoWorks2000
Integrating Enterprise Management
[Diagram labels: Application, DBMS, Server, Network, Desktop, User; Device, Service; Network; Helpdesk, Trouble-ticket, Event MOM]
Understand Your Organization
• Roles and responsibilities
• Escalation policy
• Help desk vs. operations
• Planners vs. administrators
Preparing the Devices
• Security for Management
• Notification
• Baseline
Securing the Devices
• Identify scope of control
  - Who needs access to what?
• Secure and log access
  - Physical access (badge readers)
  - Telnet and console (AAA accounting, Syslog)
  - SNMP communities (ACL, SNMP traps)
Sample Security Configuration
Callouts (highlighted on this slide: Tacacs+, SNMP gets and sets, SNMP Community ACL; also shown: Syslog, SNMP traps)

aaa new-model
aaa authentication login test tacacs+ line
aaa authentication enable default tacacs+ enable
access-list 8 permit 161.44.34.157
logging 161.44.34.157
logging source-interface Loopback0
snmp-server community public RO
snmp-server community bitbuck RW 8
snmp-server contact Paul L. Della Maggiora
snmp-server chassis-id 071293
snmp-server system-shutdown
snmp-server trap-source Loopback0
snmp-server trap-authentication
snmp-server host 161.44.34.157 public frame-relay
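One way to check the SNMP community lines of a saved configuration is sketched below in Python. This is only an illustration: the function name `audit_snmp_communities` and the two rules it applies (no access list, well-known community name) are assumptions made for the example, not part of any Cisco tool.

```python
import re

# Matches lines such as "snmp-server community bitbuck RW 8".
COMMUNITY_RE = re.compile(
    r"^snmp-server community (?P<name>\S+) (?P<mode>RO|RW)(?: (?P<acl>\d+))?$"
)
# Assumed list of community names that are easy to guess.
WELL_KNOWN = {"public", "private"}


def audit_snmp_communities(config_text):
    """Return a list of findings for risky-looking community lines."""
    findings = []
    for line in config_text.splitlines():
        match = COMMUNITY_RE.match(line.strip())
        if not match:
            continue
        name, mode, acl = match.group("name", "mode", "acl")
        if acl is None:
            findings.append(f"{name} ({mode}): no access list applied")
        if name in WELL_KNOWN:
            findings.append(f"{name} ({mode}): well-known community string")
    return findings


SAMPLE = """\
access-list 8 permit 161.44.34.157
snmp-server community public RO
snmp-server community bitbuck RW 8
"""

for finding in audit_snmp_communities(SAMPLE):
    print(finding)
```

Run against the three sample lines, the sketch reports the read-only community twice (no ACL, well-known name) and accepts the read-write community because it is tied to access list 8.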
Security Access Changes
• Password change policy
  - Quarterly
  - Every time an employee leaves
• Solution
  - Use RADIUS or TACACS+
  - Script the change
Notification
• SNMP traps
  - Critical for NMS notification
• Syslog
  - Cisco-specific notification
Sample Notification Configuration
Callouts (highlighted on this slide: Syslog, SNMP traps; also shown: Tacacs+, SNMP gets and sets, SNMP Community ACL)

aaa new-model
aaa authentication login test tacacs+ line
aaa authentication enable default tacacs+ enable
access-list 8 permit 161.44.34.157
logging 161.44.34.157
logging source-interface Loopback0
snmp-server community public RO
snmp-server community bitbuck RW 8
snmp-server contact Paul L. Della Maggiora
snmp-server chassis-id 071293
snmp-server system-shutdown
snmp-server trap-source Loopback0
snmp-server trap-authentication
snmp-server host 161.44.34.157 public frame-relay
Building a Baseline
• Document the network
  - Maps
  - Spreadsheets/databases
• Track inventory
  - Identify equipment and who owns it
• Back up configurations
Building a Baseline
• Collect performance data
  - Snapshot of the network
  - Provides historical data for comparison
  - Useful for capacity planning and trending
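As a rough illustration of using historical data for comparison, the Python sketch below summarizes past utilization samples and flags a new reading that strays far from them. The function names, the sample values, and the choice of "more than three standard deviations from the mean" are all assumptions made for the example; real baselines usually also account for time of day and day of week.

```python
from statistics import mean, pstdev


def build_baseline(samples):
    """Summarize historical utilization samples (percent) as mean and spread."""
    return {"mean": mean(samples), "stdev": pstdev(samples)}


def deviates(baseline, value, k=3.0):
    """True when value lies more than k standard deviations from the mean."""
    return abs(value - baseline["mean"]) > k * baseline["stdev"]


# Hypothetical link-utilization history, in percent.
history = [20.0, 22.0, 18.0, 20.0]
baseline = build_baseline(history)

print(deviates(baseline, 21.0))  # a reading close to the historical mean
print(deviates(baseline, 35.0))  # a reading far above the historical mean
```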
Discovering the Network
• Auto-discovery can make documentation easy… but the daemons must be tamed
  - Filters
  - Seedfiles
  - Discovery intervals
  - Exchange inventory among multiple autodiscovery tools
Layer 2 Autodiscovery
1. Query seed device via SNMP
2. Query CDP neighbor table (ciscoCdpMIBObjects)
3. Interrogate neighbors
Caveat: CDP only sees Cisco devices

c55k-26 (enable) sho cdp neigh
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater

Port     Device-ID               Port-ID           Platform           Capability
-------- ----------------------- ----------------- ------------------ ----------
4/1      002261261               4/1               WS-C5000           T B S
4/1      002274433               4/1               WS-C5000           T B S
4/1      069004796               4/1               WS-C5500           T B S
4/1      Router_81.130           Ethernet0         cisco 4500         R
4/1      WBU_GATEWAY             Ethernet0         cisco 4500         R
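The seed-and-interrogate procedure is essentially a breadth-first walk of neighbor tables. The Python sketch below shows that walk under stated assumptions: `get_neighbors` is a hypothetical callable standing in for the SNMP query of a device's CDP neighbor table, `allow` is a hypothetical filter predicate, and the topology is made-up in-memory data that reuses device names from the output above.

```python
from collections import deque


def discover(seed, get_neighbors, allow=lambda name: True):
    """Breadth-first discovery starting at seed.

    get_neighbors(device) must return an iterable of neighbor names;
    allow(name) acts as a discovery filter.
    """
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue:
        device = queue.popleft()
        for neighbor in get_neighbors(device):
            if neighbor not in seen and allow(neighbor):
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Assumed neighbor tables; a real tool would fetch these per device.
TOPOLOGY = {
    "c55k-26": ["002261261", "Router_81.130", "WBU_GATEWAY"],
    "Router_81.130": ["c55k-26", "WBU_GATEWAY"],
}


def lookup(device):
    return TOPOLOGY.get(device, [])


print(discover("c55k-26", lookup))
```

The `seen` set is what keeps the walk from looping when two devices list each other as neighbors; the `allow` predicate is one simple place to apply a discovery filter.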
Layer 3 Autodiscovery
1. Start with default router
2. Query MIB II ifTable, ipAddrTable, ipRouteTable
3. Interrogate neighbors
Special cases, e.g. IP unnumbered, HSRP

4500-4>sho ip rout
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
       U - per-user static route

Gateway of last resort is not set

     100.0.0.0/8 is subnetted, 1 subnets
O       100.100.100.0 [110/70] via 172.16.11.1, 13:35:34, Serial0
     153.10.0.0/16 is subnetted, 1 subnets
C       153.10.1.0 is directly connected, Serial1
     172.16.0.0/16 is subnetted, 1 subnets
C       172.16.11.0 is directly connected, Serial0
Inventory
• Typical NMS is not enough
  - IP address, community strings, and interfaces
• Third-party management suites and vendor-specific tools provide richer content
• MIBs are generally vendor specific, although the Entity MIB will change this
Inventory
• Items of interest
  - System information
  - Chassis information
  - Chassis cards
  - Interfaces
  - Storage and memory
  - Serial numbers
• All information available via IETF and Cisco MIBs
Configurations
• Collection repository
  - Useful for staging new configs
  - Version control helps with space and documentation
• How to automate
  - Scheduled backup
  - Watch Syslog
Maintaining Management
• Adding new devices
• Keeping the management applications up-to-date
• New management products and standards
An Ongoing Process!
Change Management
Post Mortem Blues
“I Didn’t Do It”
• Unplanned outages may be the result of many factors. How do you explain and account for what occurred?
  - Fact based vs. hearsay
  - Who made the change, what was changed, and when?
  - Your job may be at stake
Some Facts
• 80% of all outages are due to human error*
  - When an airline’s reservation system went down, thousands of travel agents had to book flights manually. Estimated loss of reservations amounted to $36,000 a minute
*Based on Carnegie-Mellon Usability Study
Common Causes of Change
• Business growth or downsizing
• New applications or services
• Implementing new technology
• Deploying product fixes or upgrades
Change Management Defined
• Configuration, software, and hardware changes
• Change tasks include:
  - Anticipating and planning for change
  - Controlling the introduction of change
  - Installing and implementing changes to software and hardware
Best Practices for Change
Best Practices
1. Implementing a change control process
2. Planning for change
3. Implementing change
4. Monitoring change
Change Control Process
[Flow diagram; stage order below is inferred from the labels]
• Change request
  - End user request
  - New app, server
  - New network service
• Change review board
  - Identify risk
  - Schedule change
  - Generate work order
• Change or work order
  - Tracking #
  - Detailed change requests
• Implementation
  - Net admin
  - Engineer/tech.
• Validation
  - Change verification
  - Audit
• Close work order or resubmit if problems
Examples
Planning
• Hardware
  - Pre-configure, test prior to upgrade
• Software
  - Research release, defect support, new feature set, and device compatibility
• Configuration
  - Test prior to deployment
• Have a back-out plan
Implementing
• Make different types of changes one at a time
• Maker/checker model
• Understand contingency plan in event of failure
• Validate that the change was successful
Monitoring
• Identifying change: who, what, when
• Audit trail
• Fault notification
Change Management Tools
• Planning
  - SWIM: defect, image analysis
  - CWSI: Layer 2/Layer 3 topo
  - Netsys: impact of change
• Deployment
  - SWIM: download software images
  - CWConfig: deploy config changes
  - CiscoView: switch config changes
• Monitor
  - CAS: change audit and reporting service; logs software, config, and hardware changes
  - CWSI: topo and user tracking
Change Scenario
[Diagram labels: Network, Server, Change Agent, Syslog, Archive, Audit Log; arrows: Change, Poll, Transport]
1. User telnets into device and makes a config change (shutdown int)
2. Device updated; Syslog generated
3. Change Agent identifies device change, notifies Archive
4. Archive gets config via transport, validates change w/DIFF
5. IF VALID, Archive gets config and logs details to ENCASE
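The diff-based validation in the scenario can be approximated with a plain text diff. The Python sketch below uses the standard `difflib` module to compare an archived configuration with a freshly retrieved one; the function names and both configuration snippets are assumptions for illustration and do not describe how the archive product itself performs the comparison.

```python
import difflib


def config_diff(archived, running):
    """Return unified-diff lines between archived and running configs."""
    return list(
        difflib.unified_diff(
            archived.splitlines(),
            running.splitlines(),
            fromfile="archived",
            tofile="running",
            lineterm="",
        )
    )


def changed_lines(diff):
    """Keep only added or removed config lines, skipping the file headers."""
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical before/after snapshots of one interface.
ARCHIVED = "interface Ethernet0\n ip address 10.0.0.1 255.255.255.0\n"
RUNNING = "interface Ethernet0\n ip address 10.0.0.1 255.255.255.0\n shutdown\n"

for line in changed_lines(config_diff(ARCHIVED, RUNNING)):
    print(line)
```

An empty diff means the Syslog message did not correspond to a real configuration change, so nothing new needs to be archived or logged.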
Fault Management
Scenario
• Virginia building-003 network goes down
• Your boss has bad breath
• Multiple people making changes
• Resolution takes nine hours
Scenario
• Result:
  - Network was down an additional four hours due to conflicting changes
  - No one seems to know how the problem occurred or how it was resolved
Best Practices for Fault Management
Best Practices
1. Preventive Measures
2. Coordination
3. Reacting to Faults
4. Escalation Policy
5. Become Proactive
Preventive Measures
• Maintain accurate documentation
  - Key to quick resolution
  - Includes maps, closets, connections, wiring, and servers
  - May require process/policy change. Only good if up to date, easy to maintain, and useful
  - Dump it if you can’t maintain it!
Preventive Measures
• Remove single points of failure
  - Alternate paths for mission-critical applications
  - Redundant equipment for critical junctures
  - Ensure appropriate bandwidth to avoid contention and overutilization
  - Permits network rerouting
Coordination
Say What You Do, Do What You Say
• Communication is KEY...
  - Understand roles and responsibilities
  - Place phones in closets; use cell phones, pagers
  - Publish policies and procedures
Coordination
• Establish base of operations
  - All efforts must go through one person
  - Prevents “who dropped the baby” and “slam management”
  - Conduct practice “scramble”
• Train staff on devices and technology
Determination of Faults
ALARM
• Notification via:
  - NMS status change
  - Trap and event logs
  - Help desk
  - Phone call from tech (“whoops...”)
Determination of Faults
• Remove the “noise” factor
  1. Filter
  2. Prioritize
  3. Appropriately notify
  4. Correlate
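The four noise-reduction steps can be pictured as a small event pipeline. The Python sketch below is a simplified illustration only: the event fields, the severity names, the notification actions, and the "group events by device" correlation rule are all assumptions chosen for the example, not the behavior of any particular NMS.

```python
from collections import defaultdict

# Assumed severity ranking (lower number = more urgent) and notify actions.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2, "info": 3}
NOTIFY = {"critical": "page", "major": "email", "minor": "log", "info": "log"}


def triage(events, ignore=("info",)):
    """Filter, correlate, prioritize, and assign a notification action."""
    # Filter: drop severities nobody needs to see.
    kept = [e for e in events if e["severity"] not in ignore]

    # Correlate: treat all events from one device as a single incident.
    by_device = defaultdict(list)
    for event in kept:
        by_device[event["device"]].append(event)

    incidents = []
    for device, group in by_device.items():
        worst = min(group, key=lambda e: SEVERITY_ORDER[e["severity"]])
        incidents.append(
            {
                "device": device,
                "severity": worst["severity"],
                "count": len(group),
                # Notify: pick the action that matches the worst severity.
                "notify": NOTIFY[worst["severity"]],
            }
        )

    # Prioritize: most severe incidents first.
    incidents.sort(key=lambda i: SEVERITY_ORDER[i["severity"]])
    return incidents


EVENTS = [
    {"device": "rtr1", "severity": "minor"},
    {"device": "sw1", "severity": "info"},
    {"device": "rtr1", "severity": "critical"},
    {"device": "sw2", "severity": "major"},
]

for incident in triage(EVENTS):
    print(incident)
```

Four raw events collapse into two incidents here, which is the point of the exercise: operators see one prioritized line per affected device instead of every trap.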
Reacting to Faults
• Determine fault domain
  - Which equipment, services, and users are affected?
• Determine level of response
  - What is the severity of the fault?
  - Can we kill the backbone?
  - Identify dispatch timeframe and number of people
Reacting to Faults (Severe)
Is It Time to Hit the Big Red Switch?
• Determine escalation timeline
  - Criteria and time limits to escalate to next level
  - Opening a case with the TAC
  - Identifying the point of drastic action
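An escalation timeline is easiest to follow when it is written down as data. The Python sketch below shows one possible shape; the time limits and the owner at each level are invented for illustration (only "open a case with the TAC" and the "drastic action" decision come from the list above), so substitute the values from your own published policy.

```python
# Assumed timeline: (minutes since the fault was opened, who owns it from then).
ESCALATION_STEPS = [
    (0, "help desk"),
    (30, "network operations"),
    (60, "open a case with the TAC"),
    (120, "management decision on drastic action"),
]


def escalation_level(minutes_open, steps=ESCALATION_STEPS):
    """Return the owner for a fault that has been open for minutes_open."""
    current = steps[0][1]
    for threshold, owner in steps:
        if minutes_open >= threshold:
            current = owner
    return current


for elapsed in (0, 45, 120):
    print(elapsed, escalation_level(elapsed))
```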
Reacting to Faults (Severe)
• Coordinate, communicate, and document
• Debrief
  - Determine source of fault
  - Evaluate recovery efforts
  - Document resolution for continuous improvement process
  - In order to learn, avoid a CYA environment
Moving from Reactive to Proactive
• Automate fault notification, escalation, and resolution via “triggers”
• React to data before it goes bad
• Learn device and network behavior
  - “That doesn’t look right…”
Active vs. Passive Polling
• Polling with thresholds vs. event-based polling
  - RMON events and alarms
• Conservation of network traffic vs. device CPU and memory
• Might be a combination of both
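Threshold-based alarms usually pair a rising threshold with a lower falling threshold so that a value hovering near the limit does not generate a stream of events. The Python sketch below illustrates that hysteresis idea in the spirit of RMON alarms; it is a simplification (it ignores RMON's startup-alarm and delta-sampling options), and the function name, thresholds, and sample values are assumptions for the example.

```python
def threshold_events(samples, rising, falling):
    """Return (kind, index) events using rising/falling hysteresis.

    A "rising" event fires when a sample reaches the rising threshold;
    no further rising event fires until a sample has dropped to the
    falling threshold, which produces a "falling" event.
    """
    events = []
    alarmed = False
    for index, value in enumerate(samples):
        if not alarmed and value >= rising:
            events.append(("rising", index))
            alarmed = True
        elif alarmed and value <= falling:
            events.append(("falling", index))
            alarmed = False
    return events


# Hypothetical utilization samples, in percent.
SAMPLES = [10, 85, 90, 70, 95, 40, 88]
print(threshold_events(SAMPLES, rising=80, falling=50))
```

Whether this logic runs in the management station (active polling) or on the device (event-based) is the trade-off between network traffic and device CPU and memory.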
Fault Management Tools
• Planning
  - CiscoView: real-time monitoring
  - RME: Availability, Syslog, and CCO tools
  - CWSI: user tracking, traffic director, and topo
• Deployment
  - SWIM: defect analysis
  - CCO/TAC: case tracking tools
  - Stack Decoder: crash analysis
• Monitor
  - Availability: monitor key resources
  - Syslog: reporting, automated recovery
  - 24-Hour Reports: monitor reloads, Syslog, and change
  - Traffic Director: RMON config and report
Best Practices Can Improve Network Availability
• Prepare the network for management
  - Security, notification, and maintenance
• Implement a change control process
  - Plan, deploy, and monitor
• Reduce unplanned outage minutes through fault management
  - Prepare, coordinate, and be proactive
For More Information
• General network management portal
  http://netman.cit.buffalo.edu/index.html
• Another good network management portal
  http://compnetworking.miningco.com/msubmanage.htm?terms=network+management&cob=home&TMog=5006366091143m&Mint=56534342191358&FFV=1
• “The Simple Times”
  http://www.simple-times.org/pub/simple-times/issues/
• SNMP FAQ
  http://www.cis.ohio-state.edu/hypertext/faq/usenet/snmp-faq/part1/faq.html
For More Information
• Sample Cisco device security configs
  http://www.cisco.com/warp/public/700/tech_configs.html#SECURITY
• Cisco device SNMP configuration tips
  http://www.cisco.com/warp/public/490/index.shtml
• White paper on threshold management
  http://www.ccci.com/product/papers/pete/papers/thresh.htm
• Public domain performance monitoring tool (MRTG)
  http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html
Please Complete Your Evaluation Form
Session 804