installing template theme files - cisco.com · focus of this session is to show on real life...

Cisco Public 1 © 2010 Cisco and/or its affiliates. All rights reserved. Troubleshooting in enterprise networks Ing. Peter Mesjar Systems Engineer CCIE #17428 [email protected]


Post on 07-Sep-2018


Page 1: Installing Template Theme Files - cisco.com · Focus of this session is to show on real life examples most common issues and various troubleshooting tools available in Cisco Catalyst

Cisco Public 1© 2010 Cisco and/or its affiliates. All rights reserved.

Troubleshooting in enterprise networks

Ing. Peter Mesjar

Systems Engineer

CCIE #17428

[email protected]


This session uses real-life examples to show the most common issues and the troubleshooting tools available on Cisco Catalyst switches that are typically deployed in enterprise networks:

• Nightmare on Layer2

• Fear of high CPU

• Where did my packet go?

• Even good things break sometimes

• To prefer or not to prefer

• Life without nightmares


• Time is most important – use NTP on all devices

• Configure timestamps – for logged messages

What is better for troubleshooting – without time:

%SPANTREE-2-LOOPGUARDBLOCK: No BPDUs were received on port 1/1 in VLAN 1. Moved to loop-inconsistent state

%SPANTREE-2-LOOPGUARDUNBLOCK: Port 1/1 restored in VLAN 1

or with time:

2010 May 17 14:42:50.824 : %SPANTREE-2-LOOPGUARDBLOCK: No BPDUs were received on port 1/1 in VLAN 1. Moved to loop-inconsistent state

2010 May 17 14:42:50.934 : %SPANTREE-2-LOOPGUARDUNBLOCK: Port 1/1 restored in VLAN 1

Router(config)#service timestamps debug datetime msec

Router(config)#service timestamps log datetime msec
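With millisecond timestamps in the log, the time between two events can be computed instead of guessed. The following is a minimal Python sketch, assuming the timestamp layout of the sample messages above; the function name `parse_log_time` and the two embedded sample lines are illustrative only.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, copied from the sample log messages above.
TS_FORMAT = "%Y %b %d %H:%M:%S.%f"

def parse_log_time(line: str) -> datetime:
    """Return the timestamp that precedes ' : %' in a log line."""
    stamp, _, _ = line.partition(" : %")
    return datetime.strptime(stamp.strip(), TS_FORMAT)

# Sample lines in the same shape as the messages shown on the slide.
blocked = ("2010 May 17 14:42:50.824 : %SPANTREE-2-LOOPGUARDBLOCK: "
           "No BPDUs were received on port 1/1 in VLAN 1.")
restored = ("2010 May 17 14:42:50.934 : %SPANTREE-2-LOOPGUARDUNBLOCK: "
            "Port 1/1 restored in VLAN 1")

# Time the port spent in loop-inconsistent state, in milliseconds.
outage = parse_log_time(restored) - parse_log_time(blocked)
outage_ms = outage / timedelta(milliseconds=1)
print(f"port was loop-inconsistent for {outage_ms:.0f} ms")
```

Without timestamps the same two messages carry no information about how long the port stayed blocked.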


• Configure timestamps – for commands run during troubleshooting (does not work when the output is piped)

Permanently for certain lines:

Router(config-line)#exec prompt timestamp

Or temporarily for single session:

Router#terminal exec prompt timestamp

Router#show user

Load for five secs: 2%/1%; one minute: 1%; five minutes: 1%

Time source is NTP, 09:04:37.595 MET Mon May 16 2011

Line User Host(s) Idle Location

* 1 vty 0 cisco idle 00:00:00 10.0.0.2

Interface User Mode Idle Peer Address


• Safe debugging – almost any debug is safe if sent only to internal logging buffer

Disable complete logging output to console/telnet/SSH sessions:

Router(config)#no logging console

Router(config)#no logging monitor

or just disable sending of debug messages to console/telnet/SSH:

Router(config)#logging console 6

Router(config)#logging monitor 6

Then set up a large logging buffer (today’s routers come with large amounts of RAM):

Router(config)#logging buffered 256000

Getting large log file from the router:

Router#term len 0

Router#sh log | redirect

Router#term len 24


• How not to be overwhelmed by debugging output

Debug for particular IP address:

Router#debug ip packet <acl_num>

Router#debug ip pim <group_address>

Debug for particular interface:

Router#debug condition interface <int_id>

• Sniffer trace

Very useful for narrowing down which device is dropping packets or what happened just before a network event such as a router adjacency flap

Important to have time on sniffer PC synced to time on router/switch

Cat4k/Cat6k can sniff CPU traffic and send it to a directly attached PC

Cat6k with 12.2(33)SXI introduced Mini Protocol Analyzer

Cat4k with IOS XE will have Wireshark embedded


• STP (Spanning Tree Protocol) overview

L2 has no way to know that a certain packet has been forwarded multiple times (unlike L3, which has the IP TTL)

STP blocks redundant paths and creates a “single active link” tree-based topology

Switches exchange BPDU frames to keep the L2 topology loop free – if BPDUs stop arriving, a forwarding loop might occur

In today’s world of 1GE and 10GE, forwarding loops usually lead to network-down situations

Another frequently occurring issue is excessive flooding of unicast traffic and high CPU due to a high rate of BPDUs with the TCN bit set


• How do you know there is STP forwarding loop?

Multiple clients claim to have no or very sluggish network connectivity

High CPU utilization on switches connected to affected L2 segments

switch#sh process cpu | i CPU util|PID|Spanning

CPU utilization for five seconds: 71%/13%; one minute: 67%; five minutes: 66%

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process

175 1933722304 220986600 8750 51.83% 47.78% 47.47% 0 Spanning Tree

Logs indicating constant MAC flapping, HSRP flapping

*Jun 10 13:44:45.918 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 00d0.00c0.4c00 in Vlan501 is flapping between port Gi0/1 and port Gi0/2

*Jun 10 13:44:46.186 CET: %STANDBY-3-DUPADDR: Duplicate address 132.32.22.6 on Vlan501, sourced by 0000.0c07.ac69


• How do you know there is STP forwarding loop?

High physical link utilization

Switch#show int port-ch200 | i rate

30 second input rate 816978000 bits/sec, 1143661 packets/sec

30 second output rate 1000 bits/sec, 1 packets/sec

Switch#show int g7/1 | i rate

30 second input rate 213000 bits/sec, 583 packets/sec

30 second output rate 817050000 bits/sec, 1143750 packets/sec

Increasing amounts of drops on many interfaces

Switch#show int g7/1 | i drops

Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 9540


• Troubleshooting STP forwarding loop

Once we know it is a loop, we need to find the root cause and break the loop

Start with a topology diagram – if you do not have one, troubleshooting an STP loop can take quite a lot of time (CDP is your friend here)

From the logs try to narrow down which VLAN is affected

Start at root switch – on each switch:

- build your STP topology

switch#show spanning-tree vlan <vlan_id>

- check interface counters – idea here is to try to narrow down the segment which is originating all the traffic

switch#show int <int_id> | i rate

- check if port is receiving BPDUs or not – idea here is to find the redundant link that has issues sending/receiving BPDUs

switch#show spanning-tree interface <int_id> detail

Break the loop – Root Guard, Loop Guard, BPDU Guard, UDLD, hardware replacement, shut the link, etc...
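Narrowing down the affected VLAN from the logs can be scripted. The sketch below counts MAC flap notifications per VLAN; it assumes the `%SW_MATM-4-MACFLAP_NOTIF` message layout shown earlier in this session. The function name `flaps_per_vlan` is illustrative, and the second and third sample lines are made up for the example.

```python
import re
from collections import Counter

# Pattern for the MAC flap notification format shown earlier.
MACFLAP = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>\S+) in (?P<vlan>Vlan\d+) "
    r"is flapping between port (?P<a>\S+) and port (?P<b>\S+)"
)

def flaps_per_vlan(log_text: str) -> Counter:
    """Count MAC flap notifications per VLAN."""
    counts = Counter()
    for match in MACFLAP.finditer(log_text):
        counts[match.group("vlan")] += 1
    return counts

# First line is from the slide; the other two are hypothetical.
sample_log = """\
*Jun 10 13:44:45.918 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 00d0.00c0.4c00 in Vlan501 is flapping between port Gi0/1 and port Gi0/2
*Jun 10 13:44:46.102 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 00d0.00c0.4c01 in Vlan501 is flapping between port Gi0/1 and port Gi0/2
*Jun 10 13:44:46.305 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 0011.2233.4455 in Vlan20 is flapping between port Gi0/3 and port Gi0/4
"""

for vlan, count in flaps_per_vlan(sample_log).most_common():
    print(vlan, count)
```

The VLAN with the most flaps is a reasonable place to start building the STP topology.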


• How do you know there is excessive flooding due to STP?

BPDUs with the TC bit set notify switches that there is a change in the network (a port going up or down) and instruct them to flush their MAC tables

Spikes in CPU can occur until MAC table is relearned (MAC learning is done in software on cat2k/3k/4k platforms)

MAC table stable:

switch#show platform tcam utilization

Load for five secs: 10%/1%; one minute: 12%; five minutes: 15%

Time source is NTP, 21:52:45.159 YEKST Sat Aug 14 2010

CAM Utilization for ASIC# 0 Max Used

Masks/Values Masks/values

Unicast mac addresses: 656/5248 415/3225


• How do you know there is excessive flooding due to STP?

During the learning:

switch#show platform tcam utilization

Load for five secs: 87%/1%; one minute: 20%; five minutes: 16%

Time source is NTP, 21:53:04.486 YEKST Sat Aug 14 2010

CAM Utilization for ASIC# 0 Max Used

Masks/Values Masks/values

Unicast mac addresses: 656/5248 252/1930
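The two snapshots can be compared numerically to see how much of the MAC table was flushed. This is a rough Python sketch that assumes the `Unicast mac addresses: <max masks>/<max values> <used masks>/<used values>` layout shown in the two outputs; the helper name `used_mac_values` is an assumption made for the example.

```python
import re

# Matches the "Unicast mac addresses" row; groups 3 and 4 are the "Used" columns.
UNICAST = re.compile(r"Unicast mac addresses:\s+(\d+)/(\d+)\s+(\d+)/(\d+)")

def used_mac_values(output: str) -> int:
    """Return the number of used unicast MAC values from the TCAM output."""
    match = UNICAST.search(output)
    if match is None:
        raise ValueError("no 'Unicast mac addresses' line found")
    return int(match.group(4))

# Rows copied from the stable and the relearning snapshots.
stable = "Unicast mac addresses: 656/5248 415/3225"
relearning = "Unicast mac addresses: 656/5248 252/1930"

flushed = used_mac_values(stable) - used_mac_values(relearning)
share = flushed / used_mac_values(stable)
print(f"{flushed} entries missing ({share:.0%} of the stable table)")
```

A drop of this size within a few seconds, together with the CPU spike, points to a topology change rather than normal aging.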


• Troubleshooting excessive flooding due to STP

Track down the port that is originating TCN BPDUs – starting at the root switch, look for “Number of topology changes” with the command below:

switch#show spanning-tree detail

MST0 is executing the mstp compatible Spanning Tree protocol

Bridge Identifier has priority 32768, sysid 0, address f4ac.c1c4.2b80

Configured hello time 2, max age 20, forward delay 15, transmit hold-count 6

Current root has priority 24576, address 0019.07aa.9ac0

Root port is 56 (Port-channel1), cost of root path is 0

Topology change flag not set, detected flag not set

Number of topology changes 296 last change occurred 00:01:17 ago

from GigabitEthernet0/15

Break the flooding – shut the port, configure Portfast
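When many switches have to be checked, the topology change counter and its source port can be pulled out of the output automatically. The sketch below assumes the wording of `show spanning-tree detail` as shown above; `last_topology_change` is a hypothetical helper name.

```python
import re

# Matches the topology change summary, tolerating the line break before "from".
TC = re.compile(
    r"Number of topology changes\s+(\d+)\s+last change occurred\s+(\S+)\s+ago"
    r"\s+from\s+(\S+)"
)

def last_topology_change(detail_output: str):
    """Return (count, time since last change, source port) or None."""
    match = TC.search(detail_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)

# Excerpt in the same shape as the output shown above.
sample = """\
Topology change flag not set, detected flag not set
Number of topology changes 296 last change occurred 00:01:17 ago
        from GigabitEthernet0/15
"""

print(last_topology_change(sample))
```

Following the reported port from switch to switch leads to the device that originates the topology changes.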


• What is high CPU?

Unexpected increase in CPU utilization over measured baseline

• Why is high CPU not desirable?

CPU is critical for control plane stability

• What are some reasons for high CPU?

High CPU can be process based or interrupt based

- Process based high CPU example:

switch#show process cpu

CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 81%

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process

!--- Output omitted

139 6795740 1020252 6660 88.34% 91.63% 74.01% 0 BGP Router

- Interrupt based high CPU example:

switch#show process cpu

CPU utilization for five seconds: 91%/41%; one minute: 65%; five minutes: 64%

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process

!--- Output omitted

272 33707580 806802430 90 13.51% 14.39% 14.70% 0 IP Input
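In the `five seconds: X%/Y%` field, X is the total CPU load and Y is the part spent at interrupt level, so X minus Y is the process-level share. A small Python sketch of that split follows; the 25% ratio used to flag interrupt-heavy load is an arbitrary assumption for illustration, not a Cisco-defined threshold, and the names `CpuLoad`, `parse_cpu` and `interrupt_heavy` are made up for the example.

```python
import re
from typing import NamedTuple

CPU_LINE = re.compile(r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%")

class CpuLoad(NamedTuple):
    total: int       # first number: total CPU utilization
    interrupt: int   # second number: utilization at interrupt level

    @property
    def process(self) -> int:
        """Process-level share is total minus interrupt."""
        return self.total - self.interrupt

def parse_cpu(line: str) -> CpuLoad:
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError("not a 'CPU utilization' line")
    return CpuLoad(int(match.group(1)), int(match.group(2)))

def interrupt_heavy(load: CpuLoad, threshold: float = 0.25) -> bool:
    """Assumed heuristic: interrupt share is at least `threshold` of the total."""
    return load.total > 0 and load.interrupt / load.total >= threshold

process_case = parse_cpu("CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 81%")
interrupt_case = parse_cpu("CPU utilization for five seconds: 91%/41%; one minute: 65%; five minutes: 64%")

for load in (process_case, interrupt_case):
    print(load, "process:", load.process, "interrupt heavy:", interrupt_heavy(load))
```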


• Most common reasons for process based high CPU

BGP Scanner

SNMP polling

Netflow export

Virtual Exec


• Most common reasons for interrupt based high CPU

Packets with IP options, expired TTL, that need fragmentation, fail MTU check or checksum

Packets required for tunneling (cat6k supports GRE in hardware)

Packets for which output interface is same as input interface

Packets for glean (require ARP) and receive (destined to local router) MLS CEF adjacency types

Packets hitting ACL entry with log keyword

Packets hitting a PBR action that is not supported in hardware (e.g. match length, using a tunnel as the next-hop interface)

Packets for hardware assisted features, such as NAT

Packets switched on interfaces with ACLs that cannot be fit into TCAM

High CPU due to debug output sent to console


• Protecting against process based high CPU

Disable that process if possible

Lower the utilization of the process (e.g. for SNMP, create an SNMP view that restricts which MIBs can be polled)

• Protecting against interrupt based high CPU

Don’t use features that are not supported in hardware

Control Plane Policing

Rate limiters

• Tools for troubleshooting

Cat6k – CPU netdr capture, CPU SPAN, Mini protocol analyzer

Cat4k – CPU buffer capture, CPU SPAN

Cat2k/3k – debug for CPU queues


• Cat6k example – misbehaving application

CPU is busy processing 10kpps of traffic on ingress:

switch#sh proc cpu | i five

CPU utilization for five seconds: 91%/41%; one minute: 65%; five minutes: 64%

switch#show ibc | i rate

5 minute rx rate 32459000 bits/sec, 10645 packets/sec

5 minute tx rate 65000 bits/sec, 85 packets/sec

Side note – even under these conditions, there was no control plane instability, and the router was configured with OSPF and BGP

Enable Netdr capture to get a quick look at which packets are punted to the CPU:

switch#debug netdr capture rx


• Cat6k example – misbehaving application

Sample output of debug netdr capture:

switch#show netdr capture

------- dump of incoming inband packet -------

interface Vl480, routine mistral_process_rx_packet_inlin, timestamp 12:42:11.035

dbus info: src_vlan 0x1E0(480), src_indx 0x4C(76), len 0x9A(154)

bpdu 0, index_dir 0, flood 1, dont_lrn 0, dest_indx 0x41E0(16864)

B0000400 01E00000 004C0000 9A080000 0011FFFF FFFF0AFD 30200008 41E00000

mistral hdr: req_token 0x0(0), src_index 0x4C(76), rx_offset 0x76(118)

requeue 0, obl_pkt 0, vlan 0x1E0(480)

destmac FF.FF.FF.FF.FF.FF, srcmac 00.08.02.F1.0C.B5, protocol 0800

protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 136, identifier 50603

df 0, mf 0, fo 0, ttl 128, src 10.253.48.32, dst 255.255.255.255

udp src 6503, dst 6502 len 116 checksum 0x8921

Output showed a lot of broadcast packets sourced by 10.253.48.32

Solution was to track down incoming interface and then shut it down
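A netdr capture can hold many packets, so counting source/destination pairs quickly shows who is responsible. The sketch below assumes the `src <ip>, dst <ip>` field layout of the capture shown above; `top_talkers` is a hypothetical helper, and all sample lines except the first two are invented for the example.

```python
import re
from collections import Counter

# Matches only dotted IPv4 pairs, so "udp src 6503, dst 6502" is ignored.
FLOW = re.compile(r"\bsrc (\d+(?:\.\d+){3}), dst (\d+(?:\.\d+){3})")

def top_talkers(capture_text: str, limit: int = 3):
    """Return the most frequent (source, destination) pairs in a capture."""
    flows = Counter(FLOW.findall(capture_text))
    return flows.most_common(limit)

capture = """\
df 0, mf 0, fo 0, ttl 128, src 10.253.48.32, dst 255.255.255.255
udp src 6503, dst 6502 len 116 checksum 0x8921
df 0, mf 0, fo 0, ttl 128, src 10.253.48.32, dst 255.255.255.255
df 0, mf 0, fo 0, ttl 64, src 10.253.48.7, dst 10.253.48.1
"""

for (src, dst), count in top_talkers(capture):
    print(f"{src} -> {dst}: {count} packets")
```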


• Cat4k example – Microsoft NLB with unicast IP/multicast MAC

CPU very busy, with 41% on IP Input and on average going up to 80%

switch#show proc cpu | e 0.0

CPU utilization for five seconds: 64%/0%; one minute: 80%; five minutes: 74%

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process

48 54239612 60847901 891 15.67% 15.66% 15.29% 0 Cat4k Mgmt HiPri

91 89801752 13111461 6849 41.56% 59.67% 53.72% 0 IP Input

95 2670124 8754203 305 1.50% 1.60% 1.60% 0 Spanning Tree

175 1101140 4785600 230 0.71% 0.78% 0.79% 0 HSRP


• Cat4k example – Microsoft NLB with unicast IP/multicast MAC

switch#debug platform packet all buffer

switch#show platform cpu packet buffered

Index 2:

7 days 0:31:15:707102 - RxVlan: 33, RxPort: Gi1/3

Priority: Normal, Tag: Dot1Q Tag, Event: 17, Flags: 0x40, Size: 64

Eth: Src 02:02:0A:14:21:49 Dst 03:BF:0A:14:21:40 Type/Len 0x0800

Ip: ver:IpVersion4 len:20 tos:0 totLen:40 id:3789 fragOffset:0 ttl:128 proto:tcp

src: 10.20.33.67 dst: 10.20.33.64 firstFragment lastFragment

Remaining data:

0: 0x51 0xEB 0x1F 0x90 0x46 0x15 0x7B 0x59 0x66 0xCA

10: 0x17 0x5F 0x50 0x10 0xFF 0xFF 0xA8 0x16 0x0 0x0

20: 0x0 0x0 0x0 0x0 0x0 0x0 0x48 0xBA 0xFB 0xA9

Destination MAC address 03:BF:0A:14:21:40 has the multicast bit set – after talking to the client, it turned out that 10.20.33.67 was a Microsoft NLB server

Solution was to configure static ARP/MAC entries on the switch
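The multicast bit is the least significant bit of the first octet of the MAC address, which is easy to check in code. The sketch below also decodes the last four octets as an IPv4 address; reading them as the NLB cluster IP follows the usual NLB multicast-mode convention (03:BF followed by the IP) and should be treated as an assumption here. All function names are illustrative.

```python
import ipaddress
from typing import List

def mac_octets(mac: str) -> List[int]:
    """Split a colon-separated MAC address into integer octets."""
    return [int(part, 16) for part in mac.split(":")]

def is_multicast_mac(mac: str) -> bool:
    """The group (multicast) bit is the lowest bit of the first octet."""
    return bool(mac_octets(mac)[0] & 0x01)

def embedded_ipv4(mac: str) -> str:
    """Interpret the last four octets as an IPv4 address (assumed NLB convention)."""
    return str(ipaddress.IPv4Address(bytes(mac_octets(mac)[2:])))

dst_mac = "03:BF:0A:14:21:40"   # destination MAC from the capture
src_mac = "02:02:0A:14:21:49"   # source MAC from the capture

print(dst_mac, "multicast:", is_multicast_mac(dst_mac), "ip:", embedded_ipv4(dst_mac))
print(src_mac, "multicast:", is_multicast_mac(src_mac))
```

For the destination MAC in the capture, the decoded address matches the destination IP of the packet, 10.20.33.64.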


• Cat3k example – incorrectly configured SDM template

switch#show ip traffic

IP statistics:

Rcvd: 2107602 total, 564475 local destination

0 format errors, 0 checksum errors, 1444238 bad hop count

110 unknown protocol, 98709 not a gateway

0 security failures, 0 bad options, 3928 with options

Opts: 0 end, 0 nop, 0 basic security, 0 loose source route

0 timestamp, 0 extended security, 0 record route

0 stream ID, 0 strict source route, 3928 alert, 0 cipso, 0 ump

0 other

Frags: 0 reassembled, 0 timeouts, 0 couldn't reassemble

0 fragmented, 0 couldn't fragment

Bcast: 216024 received, 36 sent

Mcast: 0 received, 0 sent

Sent: 358616 generated, 680509528 forwarded

Drop: 744 encapsulation failed, 0 unresolved, 0 no adjacency

6 no route, 0 unicast RPF, 0 forced drop

0 options denied, 0 source IP address zero


• Cat3k example – incorrectly configured SDM template

switch#show controllers cpu-interface

cpu-queue-frames retrieved dropped invalid hol-block stray

----------------- ---------- ---------- ---------- ---------- ----------

rpc 5699890 0 0 0 0

stp 2467838 0 0 0 0

ipc 911457 0 0 0 0

routing protocol 4085448 0 0 0 0

L2 protocol 577996 0 0 0 0

remote console 2436 0 0 0 0

sw forwarding 680398547 0 0 0 0

host 359992 0 0 0 0

broadcast 2094978 0 0 0 0

cbt-to-spt 0 0 0 0 0

igmp snooping 2288969 0 0 0 0

icmp 0 0 0 0 0

logging 0 0 0 0 0

rpf-fail 0 0 0 0 0

dstats 0 0 0 0 0

cpu heartbeat 4477038 0 0 0 0

Each of these queues can be debugged to see packets hitting that queue:

debug platform cpu-queues
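To decide which queue to debug first, the counters can be ranked by the number of frames retrieved. A minimal Python sketch, assuming the five-column layout of `show controllers cpu-interface` shown above; `busiest_queue` is a hypothetical helper and the sample contains only some of the rows from the slide.

```python
import re

# Queue name (may contain spaces) followed by exactly five counters.
ROW = re.compile(r"(?P<name>\D.*?)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")

def busiest_queue(output: str):
    """Return (queue name, frames retrieved) for the busiest CPU queue."""
    best = None
    for line in output.splitlines():
        match = ROW.fullmatch(line.strip())
        if match is None:
            continue  # header, separator, or blank line
        retrieved = int(match.group(2))
        if best is None or retrieved > best[1]:
            best = (match.group("name"), retrieved)
    return best

sample = """\
cpu-queue-frames retrieved dropped invalid hol-block stray
----------------- ---------- ---------- ---------- ---------- ----------
rpc 5699890 0 0 0 0
stp 2467838 0 0 0 0
routing protocol 4085448 0 0 0 0
L2 protocol 577996 0 0 0 0
sw forwarding 680398547 0 0 0 0
host 359992 0 0 0 0
broadcast 2094978 0 0 0 0
cpu heartbeat 4477038 0 0 0 0
"""

print(busiest_queue(sample))
```

In this case the `sw forwarding` queue stands out, which agrees with the large "forwarded" count in `show ip traffic`.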


• Cat3k example – incorrectly configured SDM template

Hmm… something global is going on…

switch#show platform ip unicast counts | i TCAM fails

Fibs of Prefix length 32, with TCAM fails: 10

switch#show sdm prefer

------------------ show sdm prefer ------------------

The current template is "desktop vlan" template.

The selected template optimizes the resources in

the switch to support this level of features for

8 routed interfaces and 1024 VLANs.

number of unicast mac addresses: 12K

number of IPv4 IGMP groups + multicast routes: 1K

number of IPv4 unicast routes: 0

number of IPv4 policy based routing aces: 0

number of IPv4/MAC qos aces: 0.75K

number of IPv4/MAC security aces: 1K

Solution was to reconfigure SDM template to routing and reload switch


• Common scenarios for packet drops

Hardware issue – CRC errors, damaged line card

Input queue drops – CPU cannot handle incoming traffic anymore

Output queue drops – not enough buffer space

Overruns – packet drop on ingress interface due to very loaded egress interface

Performance issue – forwarding happening purely in software

Software bug – inconsistency between software and hardware forwarding table

• Tools to narrow down where the packet drop is occurring

Ping (with or without IP options, DF bit set, TOS set)

Traceroute

SPAN

Show commands


• Packet drops due to hardware issue

Watch for increment in input errors, CRCs, etc…

switch#ping 10.10.10.100 repeat 100

Type escape sequence to abort.

Sending 100, 100-byte ICMP Echos to 10.10.10.100, timeout is 2 seconds:

!...!..!.!!..!!.!!.!.!........!.!.!.!!.!!!.!..!!..!.!!!!!.!!!!!!..!!..!....!.!!!!!!.!...!.!.!!....!.

Success rate is 50 percent (50/100), round-trip min/avg/max = 1/1/4 ms

switch#sh ip cef 10.10.10.100 | i hop

nexthop 12.0.1.2 GigabitEthernet1/4

switch#sh int g1/4 | i errors

5253 input errors, 5253 CRC, 0 frame, 0 overrun, 0 ignored

0 output errors, 0 collisions, 0 interface resets

switch#sh counters int g1/4 | i Errors

0. rxCRCAlignErrors = 5253

6. ifInErrors = 5253

7. ifOutErrors = 0

25. InErrors = 5253

26. OutErrors = 0

42. CRCAlignErrors = 5253
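A single counter value says little; what matters is whether it keeps incrementing. The sketch below compares two snapshots of the `show interface` error line. It assumes the `<n> input errors, <n> CRC` wording shown above; the second snapshot is hypothetical and the helper name `input_error_counts` is made up for the example.

```python
import re

ERRORS = re.compile(r"(\d+) input errors, (\d+) CRC")

def input_error_counts(output: str):
    """Return (input errors, CRC errors) from a 'show interface' error line."""
    match = ERRORS.search(output)
    if match is None:
        raise ValueError("no input error line found")
    return int(match.group(1)), int(match.group(2))

# First snapshot is from the slide; the second one is a made-up later reading.
before = "5253 input errors, 5253 CRC, 0 frame, 0 overrun, 0 ignored"
after = "5310 input errors, 5310 CRC, 0 frame, 0 overrun, 0 ignored"

errors_delta = input_error_counts(after)[0] - input_error_counts(before)[0]
crc_delta = input_error_counts(after)[1] - input_error_counts(before)[1]
print(f"input errors +{errors_delta}, CRC +{crc_delta}")
```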


• Packet drops due to hardware issue

switch#sh int g5/17 counters errors

Port CrcAlign-Err Dropped-Bad-Pkts Collisions Symbol-Err

Gi5/17 0 0 0 5860410

Port Undersize Oversize Fragments Jabbers

Gi5/17 0 0 0 0

Port Single-Col Multi-Col Late-Col Excess-Col

Gi5/17 0 0 0 0

Port Deferred-Col False-Car Carri-Sen Sequence-Err

Gi5/17 0 0 0 0

Unfortunately the cause can be the cable, SFP, line card or even the chassis – the only way to find out is to actually swap the hardware


• Output queue drops

Different line cards have different buffer space, or switches can have shared memory buffer per port ASIC

Output queue drops also count for packets dropped by MQC policy

Output queue drops are most evident on users side, where lower speed interfaces (10/100M) get data from high speed interfaces (1G/10GE)

Also a lot of buffering is needed when traffic is very bursty (such as video)

switch#sh int g1/7 | i output drops

Input queue: 0/2000/0/0(size/max/drops/flushes); Total output drops: 1232194

[Figure: user access at 10/100/1000, core at 10GE, server farm at GE/10GE; buffering and output drops]


• Output queue drops – what can we do?

If the amount of drops (increase in drops) is low, it might not be a problem – TCP can adjust to drops

If the amount of drops is high, we can:

- try to see if we can do something on application level

- do a buffer tuning (possible only on shared memory platforms)

- if using oversubscribed line cards, do not use ports that are sharing same channel

- get a line card/switch with more buffer space


• Tuning egress buffer space – cat3k example

switch#show platform port-asic stats drop g1/0/49

Interface Gi1/0/49 TxQueue Drop Statistics

Queue 0

Weight 0 Frames 0

Weight 1 Frames 0

Weight 2 Frames 0

Queue 1

Weight 0 Frames 11450677

Weight 1 Frames 0

Weight 2 Frames 0

Queue 2

Weight 0 Frames 0

Weight 1 Frames 0

Weight 2 Frames 0

Queue 3

Weight 0 Frames 0

Weight 1 Frames 0

Weight 2 Frames 0

Cat3k prior to 12.2(44)SE does not show output drops with show interface
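With many ports to check, the per-queue drop counters can be summed automatically. The following Python sketch assumes the `Queue N` / `Weight W Frames F` layout shown above; `drops_per_queue` is a hypothetical helper, and the sample is shortened to one weight line for queues 2 and 3.

```python
import re

def drops_per_queue(output: str) -> dict:
    """Sum dropped frames over all weights for each TxQueue."""
    totals = {}
    queue = None
    for line in output.splitlines():
        line = line.strip()
        queue_match = re.fullmatch(r"Queue (\d+)", line)
        if queue_match:
            queue = int(queue_match.group(1))
            totals[queue] = 0
            continue
        weight_match = re.fullmatch(r"Weight (\d+) Frames (\d+)", line)
        if weight_match and queue is not None:
            totals[queue] += int(weight_match.group(2))
    return totals

sample = """\
Interface Gi1/0/49 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 11450677
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Queue 3
Weight 0 Frames 0
"""

dropping = {q: n for q, n in drops_per_queue(sample).items() if n > 0}
print(dropping)
```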


• Tuning egress buffer space – cat3k example

Cat3k has 2MB of packet buffer space divided into 8192 buffers of 256 bytes each

4 queues per port – when QOS is enabled, each queue gets 25% of available buffer space (referred to as common pool)

DSCP to queue map will say which queue packet will land in

switch#show mls qos map dscp-output-q

Default for DSCP 0 traffic is queue 2 regardless of whether QOS is enabled or not (in the previous slide this queue, shown there as Queue 1 because that output numbers queues from 0, had drops and they were gradually increasing)

Let’s increase buffer usage – below is based on real-life tuning that helps in most scenarios:

switch(config)#mls qos queue-set output 1 threshold 2 400 400 100 400

400 means the queue can take up to 4 times its assigned 25% share if necessary

End note:

- above buffer tuning is for whole switch so there is risk of buffer starvation

- if you have VoIP, do not forget to enable egress priority queue (queue 1)


• Performance issue – why it happens

Symptoms could be high CPU, packet drops across all interfaces, noticing slower application response

Cat3k/4k/6k use TCAM for HW forwarding - the main CPU on the supervisor is not designed for switching high amounts of data (it can handle only a couple of kpps)

Max. TCAM size for unicast routes:

Cat3k 12k IPv4, 8k IPv6 (can change depending on SDM template)

Cat4k sup6e/sup7e 256k IPv4, 128k IPv6

Cat6k in non-XL mode 192k IPv4, 32k IPv6 (you can go up to 239k for IPv4)

Cat6k in XL mode 512k IPv4, 256k IPv6 (you can go up to 1M for IPv4)

When TCAM is exhausted, you move to software forwarding – once you start forwarding in software, device performance will rapidly go down


• Performance issue – how to find out that TCAM is full

Check logs for the following error messages:

Cat4k:

%C4K_L3HWFORWARDING-2-FWDCAMFULL: L3 routing table is full. Switching to software forwarding.

Cat6k:

%MLSCEF-SP-7-FIB_EXCEPTION: FIB TCAM exception, Some entries will be software switched

On Cat6k switch will also go into TCAM exception state:

switch#show mls cef exception status

Current IPv4 FIB exception state = TRUE

Current IPv6 FIB exception state = FALSE

Current MPLS FIB exception state = FALSE

To get out of TCAM exception, we need to remove the condition that is causing it and then, on Cat4k, re-enable CEF; on Cat6k, do a full chassis reload


• Performance issue – cat6k example

Customer noticed a high number of software-switched flows in Netflow and, a few days before opening a case with TAC, could not access the switch

switch#show ibc | i rate

5 minute rx rate 89628000 bits/sec, 11248 packets/sec

5 minute tx rate 90136000 bits/sec, 11287 packets/sec

switch#sh proc cpu | i IP Input|five

CPU utilization for five seconds: 15%/9%; one minute: 22%; five minutes: 17%

155 16654372 110796235 150 0.39% 0.09% 0.05% 0 IP Input

In the logs we had TCAM exception message logged

Root cause was too many IPv4 routes coming from BGP speaker:

switch#sh ip route sum | i bgp| Ex|Source

Route Source Networks Subnets Overhead Memory (bytes)

bgp 65500 135467 167497 21813408 44489384

External: 302963 Internal: 1 Local: 0

However…


• Performance issue – cat6k example

Customer had XL mode supervisor, however he had non-XL DFC line card:

switch#show mod

Mod Ports Card Type Model Serial No.

--- ----- -------------------------------------- ------------------ -----------

2 48 CEF720 48 port 10/100/1000mb Ethernet WS-X6748-GE-TX SAL1305HHS0

5 2 Supervisor Engine 720 (Active) WS-SUP720-3BXL SAL1221RGRF

7 16 CEF720 16 port 10GE WS-X6716-10GE SAL1311LQEC

9 6 Firewall Module WS-SVC-FWM-1 SAD113203BD

Mod MAC addresses Hw Fw Sw Status

--- ---------------------------------- ------ ------------ ------------ -------

2 0024.97e8.66b0 to 0024.97e8.66df 3.0 12.2(18r)S1 12.2(33)SXH4 Ok

5 0016.c848.5218 to 0016.c848.521b 5.6 8.5(2) 12.2(33)SXH4 Ok

7 0021.a0ef.82c8 to 0021.a0ef.82d7 1.0 12.2(18r)S1 12.2(33)SXH4 Ok

9 0007.0e0f.1c18 to 0007.0e0f.1c1f 4.2 7.2(1) 3.2(2) Ok

Mod Sub-Module Model Serial Hw Status

---- --------------------------- ------------------ ----------- ------- -------

2 Centralized Forwarding Card WS-F6700-CFC SAL1301FJX7 4.1 Ok

5 Policy Feature Card 3 WS-F6K-PFC3BXL SAL1221RH0Q 1.8 Ok

5 MSFC3 Daughterboard WS-SUP720 SAL1221RE7C 3.1 Ok

7 Distributed Forwarding Card WS-F6700-DFC3C SAL1313M8NB 1.2 Ok

Ways out – reduce the number of routes learned via BGP, remove module 7, or reinstall module 7 with a DFC3CXL, and then reload the Cat6k chassis
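The mismatch in this example can be sanity-checked with simple arithmetic. The Python sketch below is illustrative only. It assumes the commonly documented FIB TCAM sizes of 256K entries for non-XL and 1M entries for XL forwarding engines, and it assumes that a chassis with mixed engines runs at the capacity of the least capable one. Verify both against the data sheet and `show mls cef maximum-routes` for your release. The names `FIB_CAPACITY`, `forwarding_mode` and `effective_capacity` are hypothetical.

```python
# Assumed FIB TCAM capacities (total entries, shared by all protocols).
FIB_CAPACITY = {"non-XL": 256 * 1024, "XL": 1024 * 1024}


def forwarding_mode(model: str) -> str:
    """Classify a PFC/DFC model string as XL or non-XL by its suffix."""
    return "XL" if model.upper().endswith("XL") else "non-XL"


def effective_capacity(models) -> int:
    """Assumption: the system runs at the capacity of the least capable engine."""
    return min(FIB_CAPACITY[forwarding_mode(m)] for m in models)


# Forwarding engines from 'show mod': PFC3BXL on the supervisor, DFC3C on module 7.
engines = ["WS-F6K-PFC3BXL", "WS-F6700-DFC3C"]

# BGP networks + subnets from 'show ip route summary'.
bgp_routes = 135467 + 167497

capacity = effective_capacity(engines)
print(f"routes={bgp_routes} capacity={capacity} overflow={bgp_routes > capacity}")
```

With the figures from the slides, the BGP table alone exceeds the assumed non-XL capacity, which is consistent with the TCAM exception in the logs.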

• Few notes on ping

No connectivity between two devices – check whether ARP is resolved; if not, try a static ARP entry

Traffic flows in two directions – try to find out in which direction the packet is dropped (this works only on endpoints)

Example ping from 1.1.1.1 to 2.2.2.2

switch(config)#access-list 100 permit icmp host 2.2.2.2 host 1.1.1.1 log

switch(config)#access-list 100 permit ip any any

switch(config-if)#ip access-group 100 in

Example ping from 2.2.2.2 to 1.1.1.1

switch(config)#access-list 100 permit icmp host 1.1.1.1 host 2.2.2.2 log

switch(config)#access-list 100 permit ip any any

switch(config-if)#ip access-group 100 in

Use show access-list 100 to view counters

Ping with IP options works, but ping without them does not – very likely there is an inconsistency between the hardware and software forwarding tables
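The two counting ACLs in the ping examples differ only in which host is the source and which is the destination. As a small illustration, the Python sketch below generates the same two `access-list` lines for any pair of addresses. The helper name `return_traffic_acl` is hypothetical, and the sketch only builds text; it does not talk to a device.

```python
def return_traffic_acl(pinger: str, target: str, acl: int = 100) -> list[str]:
    """Build ACL lines that log ICMP travelling from `target` back to `pinger`
    and permit all other IP traffic unchanged."""
    return [
        f"access-list {acl} permit icmp host {target} host {pinger} log",
        f"access-list {acl} permit ip any any",
    ]


# Ping from 1.1.1.1 to 2.2.2.2, as in the first example.
for line in return_traffic_acl("1.1.1.1", "2.2.2.2"):
    print(line)
```

If the counter on the first entry does not increase while the ping runs, the ICMP traffic from the target is not arriving at that interface.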

• Few notes on SPAN

For capturing hardware switched traffic

When using SPAN it is important to know how the capture was taken

If traffic is dropped on egress due to congestion, the packets will very likely still be captured by SPAN

When SPANning trunks, include VLAN ID

http://wiki.wireshark.org/CaptureSetup/VLAN

• Common scenarios for device outage

Hardware issue (malfunctioning line card, supervisor, port)

Device unable to boot

Device crashing/rebooting

• Hardware replacement needs RMA raised with Cisco

• Device crash needs crashinfo analysis done by Cisco

• Hardware issue

GOLD (Generic Online Diagnostics) – set of diagnostic tests for Catalyst switches

GOLD tests run at bootup and then periodically to verify health of the system

Default setting is to run minimal necessary tests

You can instruct switch to run full diagnostics at bootup or run particular diagnostic tests at runtime

Example of hardware issue caught by GOLD:

May 14 00:47:13 CST: %DIAG-SP-6-RUN_MINIMUM: Module 6: Running Minimal Diagnostics...

May 14 00:48:10 CST: %DIAG-SP-3-MAJOR: Module 6: Online Diagnostics detected a Major Error. Please use 'show diagnostic result <target>' to see test results.

May 14 00:48:10 CST: %CONST_DIAG-SP-3-BOOTUP_TEST_FAIL: Module 6: TestActiveToStandbyLoopback failed on port 1-2

May 14 00:48:10 CST: %CONST_DIAG-SP-3-BOOTUP_TEST_FAIL: Module 6: TestLoopback failed on port 1-2
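When many modules log diagnostics at bootup, it can help to pull out only the failed tests. The Python sketch below does this for the messages shown above. The regular expression is written against these sample lines only and is an assumption about the message layout, not a complete parser for every GOLD message. `failed_tests` is a hypothetical helper name.

```python
import re

LOG = """\
May 14 00:47:13 CST: %DIAG-SP-6-RUN_MINIMUM: Module 6: Running Minimal Diagnostics...
May 14 00:48:10 CST: %DIAG-SP-3-MAJOR: Module 6: Online Diagnostics detected a Major Error.
May 14 00:48:10 CST: %CONST_DIAG-SP-3-BOOTUP_TEST_FAIL: Module 6: TestActiveToStandbyLoopback failed on port 1-2
May 14 00:48:10 CST: %CONST_DIAG-SP-3-BOOTUP_TEST_FAIL: Module 6: TestLoopback failed on port 1-2
"""

# Matches the BOOTUP_TEST_FAIL lines and captures module, test name and port range.
FAIL = re.compile(
    r"%CONST_DIAG-\S*BOOTUP_TEST_FAIL: Module (\d+): (\S+) failed on port (\S+)"
)


def failed_tests(log: str):
    """Return (module, test name, ports) for every bootup test failure."""
    return [(int(m.group(1)), m.group(2), m.group(3)) for m in FAIL.finditer(log)]


for module, test, ports in failed_tests(LOG):
    print(f"module {module}: {test} failed on port {ports}")
```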

• Hardware issue

switch#show module 6

Mod Ports Card Type Model Serial No.

--- ----- -------------------------------------- ------------------ -----------

6 2 Supervisor Engine 720 (Hot) WS-SUP720-3BXL SAL1416FS7U

Mod MAC addresses Hw Fw Sw Status

--- ---------------------------------- ------ ------------ ------------ -------

6 0016.9de7.04e4 to 0016.9de7.04e7 5.10 8.5(3) 12.2(33)SRC3 MajFail

Mod Sub-Module Model Serial Hw Status

---- --------------------------- ------------------ ----------- ------- -------

6 Policy Feature Card 3 WS-F6K-PFC3BXL SAL1416FP73 1.11 MajFail

6 MSFC3 Daughterboard WS-SUP720 SAL1416FWRR 4.1 MajFail

Mod Online Diag Status

---- -------------------

6 Major Error

The solution here, however, was to replace the chassis

• Device unable to boot

What is the color of system LED?

Can you at least get into ROMMON?

• What to do if you are in ROMMON?

Check your configuration register using confreg and make sure it is set to 0x2102 (boot into operational state with config)

Try to boot image from ROMMON and check for potential error messages
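To see why 0x2102 is the expected value, the configuration register can be decoded bit by bit. The Python sketch below covers only the fields most relevant to boot problems and uses the commonly documented IOS bit meanings; treat the exact behaviour of boot field 1 as platform dependent. `decode_confreg` is a hypothetical helper name.

```python
def decode_confreg(value: int) -> dict:
    """Decode the boot-related fields of an IOS configuration register."""
    boot_field = value & 0x000F  # lowest four bits select the boot behaviour
    if boot_field == 0:
        boot = "stay in ROMMON"
    elif boot_field == 1:
        boot = "boot the first image in bootflash (platform dependent)"
    else:
        boot = "boot per 'boot system' commands in the configuration"
    return {
        "boot": boot,
        "ignore_startup_config": bool(value & 0x0040),  # bit 6
        "break_disabled": bool(value & 0x0100),         # bit 8
    }


for reg in (0x2102, 0x2142, 0x2100):
    print(hex(reg), decode_confreg(reg))
```

A register of 0x2142 (startup configuration ignored) or 0x2100 (stay in ROMMON) left over from password recovery or maintenance is a common reason for a device that does not come up in its operational state.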

• What to do if you are in ROMMON?

Check for any error messages:

Autoboot executing command: "boot disk0:s72033-advipservicesk9_wan-mz.122-18.SXF17.bin"

Loading image, please wait ...

device does not contain a valid magic number

loadprog: error - on file open

boot: cannot load "disk0:s72033-advipservicesk9_wan-mz.122-18.SXF17.bin"

Resolution – the IOS image is either missing or corrupted, or the flash card is not properly formatted

• What to do if you are in ROMMON?

Check for any error messages:

Autoboot executing command: "boot disk0:s72033-ipservicesk9_wan-mz.122-33.SXI5.bin"

Loading image, please wait ...

*** Bus Error (Load) Exception ***

Access address = 0x8ffffba0

PC = 0x80101ce4, Cause = 0x1c, Status Reg = 0x30409007

monitor: command "boot" aborted due to exception

Exit at the end of BOOT string

Resolution – supervisor replacement

• Device crashing/reloading

A device crash is almost always a software fault – the result of a crash is a crashinfo file in the device flash memory, which needs to be submitted to Cisco for analysis

A device crash can also be caused by hardware – a typical example is a parity error

switch#more slavesup-bootflash:crashinfo-20090110-054156

!--- output omitted

*** Cache Error Exception at 0x80000080, cerr 0x20000000 ***

instruction reference, primary cache, data field error , error not on SysAD Bus

!--- output omitted

Parity errors can be soft or hard – soft are transient one time events, hard are recurring events

Parity errors are here to stay (ICs are getting more complex, with higher density)

Suggested action plan is to monitor the device to determine if it was soft or hard parity error

• QoS on Catalyst switches is not same as on routers

Many things are hardware dependent (each line card in Cat6k has different buffers)

Many things are not supported (e.g. Cat6k does not support shaping)

By default QoS is disabled

• Catalyst switches use concept of trust

To decide whether we keep the packet's CoS/DSCP or overwrite it

Exception is Cat4k sup6/sup7

• QoS can be port based or VLAN based

Port based is default – QoS is applied to physical interface

VLAN based – QoS is applied to VLAN interface

• Improper marking

Below is the result of OSPF packets being re-marked to DSCP 0 because port trust was set incorrectly while QoS was globally enabled:

Aug 9 00:10:34.094 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 00:10:34.094 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.100 on Vlan3683 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 00:10:34.118 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.34 on Vlan3515 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 00:10:34.186 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.11 on Vlan3697 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 00:10:34.198 MET: %OSPF-5-ADJCHG: Process 44, Nbr 10.102.7.45 on Vlan3535 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 00:10:34.414 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from LOADING to FULL, Loading Done

Aug 9 00:10:39.078 MET: %OSPF-5-ADJCHG: Process 44, Nbr 10.102.7.45 on Vlan3535 from LOADING to FULL, Loading Done

Aug 9 00:10:39.194 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.34 on Vlan3515 from LOADING to FULL, Loading Done

Aug 9 00:10:39.262 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.100 on Vlan3683 from LOADING to FULL, Loading Done

Aug 9 00:10:39.586 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.11 on Vlan3697 from LOADING to FULL, Loading Done

Aug 9 11:11:36.137 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 11:11:36.137 MET: %OSPF-5-ADJCHG: Process 44, Nbr 10.102.7.45 on Vlan3535 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 11:11:36.177 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.11 on Vlan3697 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 11:11:36.193 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.34 on Vlan3515 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 11:11:36.205 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.100 on Vlan3683 from FULL to DOWN, Neighbor Down: Dead timer expired

Aug 9 11:11:40.757 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from LOADING to FULL, Loading Done

Aug 9 11:11:40.989 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.100 on Vlan3683 from LOADING to FULL, Loading Done

Aug 9 11:11:41.141 MET: %OSPF-5-ADJCHG: Process 44, Nbr 10.102.7.45 on Vlan3535 from LOADING to FULL, Loading Done

Aug 9 11:11:41.249 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.34 on Vlan3515 from LOADING to FULL, Loading Done

Aug 9 11:11:41.489 MET: %OSPF-5-ADJCHG: Process 1000, Nbr 10.168.7.11 on Vlan3697 from LOADING to FULL, Loading Done
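A pattern worth noticing in this log is that adjacencies in different OSPF processes and VLANs go down within the same second, which points away from a single faulty link. The Python sketch below makes that visible by counting FULL to DOWN transitions per second and per neighbor. Only a few of the lines above are embedded to keep it short, and the regular expression is an assumption based on these sample lines. `down_events` is a hypothetical helper name.

```python
import re
from collections import Counter

LOG = """\
Aug 9 00:10:34.094 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 9 00:10:34.118 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.34 on Vlan3515 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 9 00:10:34.414 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from LOADING to FULL, Loading Done
Aug 9 11:11:36.137 MET: %OSPF-5-ADJCHG: Process 33, Nbr 10.100.7.133 on Vlan3503 from FULL to DOWN, Neighbor Down: Dead timer expired
"""

# Captures: timestamp (to the second), neighbor, interface, old state, new state.
ADJ = re.compile(
    r"^(\w+ +\d+ [\d:]+)\.\d+ \w+: %OSPF-5-ADJCHG: Process \d+, "
    r"Nbr (\S+) on (\S+) from (\w+) to (\w+),",
    re.M,
)


def down_events(log: str):
    """Yield (timestamp, neighbor, interface) for each FULL -> DOWN transition."""
    for ts, nbr, intf, old, new in ADJ.findall(log):
        if (old, new) == ("FULL", "DOWN"):
            yield ts, nbr, intf


events = list(down_events(LOG))
per_second = Counter(ts for ts, _, _ in events)
per_neighbor = Counter(nbr for _, nbr, _ in events)

print("down events per second:", dict(per_second))
print("down events per neighbor:", dict(per_neighbor))
```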

• Improper marking – resolution step 1

QoS was enabled on the switch, but none of the interfaces were trusted – fix applied on the interfaces over which OSPF communicates:

switch(config-if)#mls qos trust dscp

Unfortunately this did not stop flapping…

• Improper marking – resolution step 2

switch#debug netdr capture rx destination-address 224.0.0.5

switch#show netdr capture

------- dump of incoming inband packet -------

interface Vl3515, routine mistral_process_rx_packet_inlin, timestamp 15:26:34.544

dbus info: src_vlan 0xDBB(3515), src_indx 0x280(640), len 0x62(98)

bpdu 0, index_dir 0, flood 1, dont_lrn 0, dest_indx 0x4DBB(19899)

40020400 0DBB0000 02800000 62080000 00590510 0B002040 00000008 4DBB0000

mistral hdr: req_token 0x0(0), src_index 0x280(640), rx_offset 0x76(118)

requeue 0, obl_pkt 0, vlan 0xDBB(3515)

destmac 01.00.5E.00.00.05, srcmac 00.24.14.84.1B.80, protocol 0800

protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 80, identifier 63378

df 0, mf 0, fo 0, ttl 1, src 10.100.7.34, dst 224.0.0.5, proto 89

The TOS of incoming OSPF hello packets was set to zero by an intermediate switch – the issue was corrected on that switch by trusting QoS
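The `tos 0x00` field in the capture is what gives the problem away. The Python sketch below decodes a TOS byte into IP precedence and DSCP. It relies on one fact not shown on the slide: IOS normally originates OSPF packets with IP precedence 6 (TOS 0xC0, DSCP 48), so a hello arriving with 0x00 has been re-marked along the path. `decode_tos` is a hypothetical helper name.

```python
def decode_tos(tos: int) -> tuple[int, int]:
    """Split an IPv4 TOS byte into (IP precedence, DSCP)."""
    precedence = tos >> 5  # top three bits
    dscp = tos >> 2        # top six bits
    return precedence, dscp


captured = decode_tos(0x00)  # value seen in the netdr capture
expected = decode_tos(0xC0)  # usual marking of OSPF packets sent by IOS

print("captured precedence/DSCP:", captured)
print("expected precedence/DSCP:", expected)
```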

• Policing with TCP

100Mbps of TCP traffic generated

switch#sh int g1/1 | i input rate

30 second input rate 98056000 bits/sec, 8268 packets/sec

switch#sh int g1/2 | i output rate

30 second output rate 97935000 bits/sec, 8260 packets/sec

Now let’s police this to 40Mbps…

[Diagram: switch with ports g1/2 and g1/1]

• Policing with TCP

Bc set to 1250000

switch#sh int g1/1 | i input rate

30 second input rate 43301000 bits/sec, 3641 packets/sec

switch#sh int g1/2 | i output rate

30 second output rate 30897000 bits/sec, 2600 packets/sec

Double Bc to 2500000

switch#sh int g1/1 | i input rate

30 second input rate 42965000 bits/sec, 3834 packets/sec

switch#sh int g1/2 | i output rate

30 second output rate 36410000 bits/sec, 3065 packets/sec

TCP gives rates below CIR due to slow start – always use large Bc when policing TCP traffic
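The effect of Bc can be put into numbers. The Python sketch below assumes that Bc is given in bytes and that the policing rate (CIR) is 40 Mbps, as stated on the previous slide. It computes how much time at CIR each burst size represents and what fraction of the CIR the measured output rate reached. The helper names are hypothetical.

```python
CIR_BPS = 40_000_000  # policing rate from the slides, in bits per second


def burst_seconds(bc_bytes: int, cir_bps: int = CIR_BPS) -> float:
    """Time that traffic at CIR needs to fill a burst of bc_bytes (Bc assumed in bytes)."""
    return bc_bytes * 8 / cir_bps


def cir_utilization(measured_bps: int, cir_bps: int = CIR_BPS) -> float:
    """Fraction of the CIR actually achieved on the output interface."""
    return measured_bps / cir_bps


# (Bc, measured 30 second output rate on g1/2) pairs from the slides.
for bc, out_bps in [(1_250_000, 30_897_000), (2_500_000, 36_410_000)]:
    print(f"Bc={bc}: {burst_seconds(bc):.2f} s at CIR, "
          f"{cir_utilization(out_bps):.0%} of CIR achieved")
```

Under these assumptions, doubling Bc from a quarter of a second to half a second of traffic at CIR moves the achieved rate noticeably closer to the configured 40 Mbps.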

• VSS principles

2 physical switches

Single control plane (active/standby principle)

Multichassis etherchannel for increased reliability

• VSS traffic forwarding troubleshooting

First determine whether the packet needs to be switched via the VSL link – if not, troubleshooting is the same as on a standalone Cat6k without VSS

• The next slides show an example of a correct forwarding table. If you find any unexpected results during your troubleshooting, you have a couple of options:

- You can try to clear ARP/MAC/IP route/CEF tables

- You can attempt to reload VSS to correct HW forwarding tables

- You can call Cisco TAC to troubleshoot deeper and find a root cause – most likely it will be either known or new bug, or it can be broken hardware

vss# sh ip route 192.168.254.253

Routing entry for 192.168.254.253/32

Known via "ospf 222", distance 110, metric 2, type intra area

Last update from 192.168.222.17 on Vlan226, 15:40:19 ago

Routing Descriptor Blocks:

* 192.168.222.17, from 192.168.255.253, 15:40:19 ago, via Vlan226

Route metric is 2, traffic share count is 1

vss# sh ip cef 192.168.254.253

192.168.254.253/32

nexthop 192.168.222.17 Vlan226

vss# sh ip arp 192.168.222.17

Protocol Address Age (min) Hardware Addr Type Interface

Internet 192.168.222.17 2 0005.9a3b.6c80 ARPA Vlan226

vss# sh mac-address-table address 0005.9a3b.6c80 vlan 226

...

vlan mac address type learn age ports

------+----------------+--------+-----+----------+--------------------------

Supervisor switch 1 Module 6

* 226 0005.9a3b.6c80 dynamic Yes 10 Po3

Supervisor switch 2 Module 6

* 226 0005.9a3b.6c80 dynamic Yes 10 Po3

vss# sh etherchannel 3 summary

...

Group Port-channel Protocol Ports

------+-------------+-----------+-----------------------------------------------

3 Po3(SU) PAgP Gi1/1/15(P) Gi2/6/3(P)

[Diagram: VSS topology; labels: 192.168.254.253, 192.168.230.2, 192.168.222.17, VSS, Po4, Po3, ports 1/1/33, 2/4/33, 1/1/15, 2/6/3]

• Is there route to destination

• What is the next hop

• Is the mac address of next hop resolved

• What is the port for this mac address

• What are physical ports of port-channel

• This is IOS view of the situation

• Same steps as on any other IOS switch

[Diagram: VSS topology; labels: 192.168.254.253, 192.168.230.2, 192.168.222.17, VSS, Po4, Po3, ports 1/1/33, 2/4/33, 1/1/15, 2/6/3]

vss# sh mls cef 192.168.254.253

Codes: decap - Decapsulation, + - Push Label

Index Prefix Adjacency

144 192.168.254.253/32 Vl226 , 0005.9a3b.6c80

vss# sh mls cef 192.168.254.253 detail

Codes: M - mask entry, V - value entry, A - adjacency index, P - priority bit

D - full don't switch, m - load balancing modnumber, B - BGP Bucket sel

V0 - Vlan 0,C0 - don't comp bit 0,V1 - Vlan 1,C1 - don't comp bit 1

RVTEN - RPF Vlan table enable, RVTSEL - RPF Vlan table select

Format: IPV4_DA - (8 | xtag vpn pi cr recirc tos prefix)

Format: IPV4_SA - (9 | xtag vpn pi cr recirc prefix)

M(144 ): E | 1 FFF 0 0 0 0 255.255.255.255

V(144 ): 8 | 1 0 0 0 0 0 192.168.254.253 (A:393217 ,P:1,D:0,m:0 ,B:0 )

vss# sh mls cef adjacency entry 393217 detail

Index: 393217 smac: 00d0.00c6.7800, dmac: 0005.9a3b.6c80

mtu: 1518, vlan: 226, dindex: 0x0, l3rw_vld: 1

format: MAC_TCP, flags: 0x2000208408

delta_seq: 0, delta_ack: 0

packets: 0, bytes: 0

vss# sh mls cef adjacency entry 393217 detail

Index: 393217 smac: 00d0.00c6.7800, dmac: 0005.9a3b.6c80

mtu: 1518, vlan: 226, dindex: 0x0, l3rw_vld: 1

format: MAC_TCP, flags: 0x2000208408

delta_seq: 0, delta_ack: 0

packets: 5, bytes: 590

• HW FIB entry

• HW adjacency entry

• HW switched packet counter (exported to IOS/zeroed periodically)

vss# sh ip route 192.168.254.253

Routing entry for 192.168.254.253/32

Known via "ospf 222", distance 110, metric 2, type intra area

Last update from 192.168.222.17 on Vlan226, 15:40:19 ago

Routing Descriptor Blocks:

* 192.168.222.17, from 192.168.255.253, 15:40:19 ago, via Vlan226

Route metric is 2, traffic share count is 1

vss# sh ip cef 192.168.254.253

192.168.254.253/32

nexthop 192.168.222.17 Vlan226

vss# sh ip arp 192.168.222.17

Protocol Address Age (min) Hardware Addr Type Interface

Internet 192.168.222.17 2 0005.9a3b.6c80 ARPA Vlan226

vss# sh mac-address-table address 0005.9a3b.6c80 vlan 226

...

vlan mac address type learn age ports

------+----------------+--------+-----+----------+--------------------------

Supervisor switch 1 Module 6

* 226 0005.9a3b.6c80 dynamic Yes 10 Po3

Supervisor switch 2 Module 6

* 226 0005.9a3b.6c80 dynamic Yes 10 Po3

vss# sh etherchannel 3 summary

...

Group Port-channel Protocol Ports

------+-------------+-----------+-----------------------------------------------

3 Po3(SU) PAgP Gi1/1/15(D) Gi2/6/3(P)

[Diagram: VSS topology; labels: 192.168.254.253, 192.168.230.2, 192.168.222.17, VSS, Po4, Po3, ports 1/1/33, 2/4/33, 1/1/15, 2/6/3]

• Is there route to destination

• What is the next hop

• Is the mac address of next hop resolved

• What is the port for this mac address

• What are physical ports of port-channel

• All ports on switch1 side are down

• If a packet arrives at switch1 and must be switched to Po3, it will cross the VSL

[Diagram: VSS topology; labels: 192.168.254.253, 192.168.230.2, 192.168.222.17, VSS, Po4, Po3, ports 1/1/33, 2/4/33, 1/1/15, 2/6/3]

vss# sh mac-address-table address 0005.9a3b.6c80 vlan 226 detail switch 1 module 6

MAC Table shown in details

========================================

PI_E RM RMA Type Alw-Lrn Trap Modified Notify Capture Flood Mac Address Age Pvlan SWbits Index XTag

----+---+---+----+-------+----+--------+------+-------+------+--------------+----+------+------+------+----

Supervisor switch 1 Module 6

Yes No No DY No No Yes No No No 0005.9a3b.6c80 0x86 226 0 0xB40 0

vss# remote command switch test switch virtual ltl index 0xB40

...

Unmapped index: 0xB40

------+----------------------------------------

SW view

Index | Ports

------+----------------------------------------

0x0B40 Po3[Gi2/6/3],Po10[Te1/6/4]

...

------+----------------------------------------

HW view

Index | Ports

------+----------------------------------------

0x0B40 Te1/6/4,Gi2/6/3

...

vss# sh switch virtual link port-channel | i Po

Group Port-channel Protocol Ports

10 Po10(RU) - Te1/6/4(P)

20 Po20(RU) - Te2/6/4(P)

• Find the index for given mac address on ingress forwarding engine

• Find what ports on the local switch (1) this index includes

• Index should include VSL ports

• How to verify if the packet from switch 1 will cross the VSL in order to reach that mac-address?
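The check described in these bullets comes down to one question: does the port list behind the index include the local VSL port-channel? The Python sketch below answers it for the "SW view" line shown above. It is an illustration that parses text copied from the output, with a regular expression written only against this sample. `ports_in_index` and `crosses_vsl` are hypothetical helper names.

```python
import re


def ports_in_index(sw_view_line: str) -> dict:
    """Parse e.g. '0x0B40 Po3[Gi2/6/3],Po10[Te1/6/4]' into {port-channel: [members]}."""
    _, ports = sw_view_line.split(None, 1)
    return {
        po: members.split(",")
        for po, members in re.findall(r"(Po\d+)\[([^\]]*)\]", ports)
    }


def crosses_vsl(sw_view_line: str, local_vsl_po: str) -> bool:
    """True if the index includes the VSL port-channel of the local switch."""
    return local_vsl_po in ports_in_index(sw_view_line)


sw_view = "0x0B40 Po3[Gi2/6/3],Po10[Te1/6/4]"  # SW view line for index 0xB40
local_vsl = "Po10"                              # VSL port-channel on switch 1

print(ports_in_index(sw_view))
print("packet from switch 1 crosses the VSL:", crosses_vsl(sw_view, local_vsl))
```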

Thank you.

• Troubleshooting STP

http://www.cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a0080136673.shtml

• Troubleshooting high CPU

Cat6k

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml

https://supportforums.cisco.com/docs/DOC-15608

Cat4k

http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml

Cat2k/3k

http://www.cisco.com/en/US/partner/docs/switches/lan/catalyst3750/software/troubleshooting/cpu_util.html

http://www.cisco.com/en/US/products/hw/switches/ps5023/products_tech_note09186a00807213f5.shtml

• Troubleshooting process based high CPU

High CPU due to BGP Scanner:

http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00809d16f0.shtml

High CPU due to SNMP

http://www.cisco.com/en/US/partner/tech/tk648/tk362/technologies_tech_note09186a00800948e6.shtml

• Troubleshooting input/output queue drops

http://www.cisco.com/en/US/partner/products/hw/routers/ps133/products_tech_note09186a0080094791.shtml

• Troubleshooting hardware issues

GOLD tests

http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_paper0900aecd801e659f.html

Cat6k

http://www.cisco.com/en/US/partner/products/hw/switches/ps700/products_tech_note09186a00801751d7.shtml

Cat4k

http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a008011e6b4.shtml