brkcrs-3146 - advanced vpc operation and troubleshooting

Upload: ronaldo-gama

Post on 27-Oct-2015

Page 1: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

BRKCRS-3146

Advanced VPC Operation and Troubleshooting

Follow us on Twitter for real time updates of the event:

@ciscoliveeurope, #CLEUR

Dmitry Goloubev Technical Leader, Tech services

Page 2: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public BRKCRS-3146 2

Housekeeping

We value your feedback. Don't forget to complete your online session evaluations after each session and the Overall Conference Evaluation, which will be available online from Thursday

Visit the World of Solutions and Meet the Engineer

Visit the Cisco Store to purchase your recommended readings

Please switch off your mobile phones

After the event don’t forget to visit Cisco Live Virtual: www.ciscolivevirtual.com

Follow us on Twitter for real time updates of the event: @ciscoliveeurope, #CLEUR

Page 3: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Goals

Understand general concepts of Virtual Port Channel feature on Nexus 7000

Review the impact of VPC on bridging and routing

Learn how to troubleshoot VPC

Page 4: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC at the network level

VPC enables building a PortChannel to 2 separate switches, virtualizing the network building block

No blocked ports, more usable bandwidth, load-sharing

Distribution switch or link failure does not mean reconvergence

[Figure: topology "from this" and "to this", plus the logical view ("…or, logically")]

Page 5: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC components at a glance

2 active control planes

2 configs

2 points of management

2 active data planes

Primary-Secondary notion for some aspects of operation

Control messages and data frames flow between the two peers via the Peer-Link

Peer-Link is an 802.1Q trunk

Control messages are carried by CFS over the Peer-Link

[Figure: VPC domain with a Primary and a Secondary peer, each showing an Active Data Plane and an Active Control Plane. Labels: VPC, Peer-Link, Peer Keepalive link, VPC domain]

Page 6: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Agenda

Initialization & Redundancy considerations

Spanning Tree

Traffic forwarding

1st hop redundancy

Multicast considerations

Page 7: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Stages of VPC initialization

16:34:06 %VPC-5-VPCM_ENABLED: vPC Manager enabled

16:34:07 %VPC-5-PEER_KEEP_ALIVE_STATUS: In domain 2, peer keep-alive status changed to enabled …

16:34:17 %ETHPORT-5-IF_UP: Interface port-channel2 is up in Layer3 … [slide annotation: this is the Peer-Keepalive link]

16:34:19 %VPC-3-VPC_PEER_LINK_BRINGUP_FAILED: vPC peer-link bringup failed (vPC peer is not reachable over cfs) …

16:34:19 %ETHPORT-5-IF_UP: Interface port-channel1 is up in mode trunk [slide annotation: this is the Peer-Link]

16:34:23 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to primary …

16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started

16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans

16:34:33 %INTERFACE_VLAN-5-UPDOWN: Line Protocol on Interface vlan 4, changed state to up

16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs

16:34:41 %VPC-3-VPC_BRINGUP_FAILED: vPC 102 bringup failed (Peer-link state is not UP)

16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs

16:35:13 %VPC-5-VPC_UP: vPC 102 is up

16:35:13 %ETHPORT-5-IF_UP: Interface port-channel102 is up in mode trunk

1. VPC manager starts

2. Peer-keepalive comes up (receives keepalives from the peer)

3. Peer-link comes up (data is not passing through yet, just CFS)

4. Primary/Secondary Role resolved

5. Global Consistency check

6. Peer-link is up for data

7. SVIs brought up (VPC + 10 sec)

8. VPCs brought up (SVI + 30 sec)

Timers are adjustable in VPC domain configuration context:

SVI: ‘delay restore interface-vlan’

VPC: ‘delay restore’
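The two delays in steps 7 and 8 can be read straight out of the syslog above. The following is a minimal Python sketch, assuming the log has been saved as text with the `HH:MM:SS %FACILITY-SEV-MNEMONIC: message` layout shown on this slide; the function name `timer_durations` is illustrative and not part of NX-OS.

```python
from datetime import datetime

# Timer-related syslog lines copied from the slide (timestamps are HH:MM:SS).
LOG = """\
16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started
16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans
16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs
16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs
"""


def timer_durations(log_text):
    """Return {timer_name: seconds} for every *_START / *_EXPIRED pair."""
    started, durations = {}, {}
    for line in log_text.splitlines():
        if "%" not in line:
            continue  # not a syslog line
        stamp, _, rest = line.partition(" ")
        # "%VPC-5-VPC_RESTORE_TIMER_START" -> "VPC_RESTORE_TIMER_START"
        mnemonic = rest.split(":", 1)[0].rsplit("-", 1)[-1]
        when = datetime.strptime(stamp, "%H:%M:%S")
        if mnemonic.endswith("_START"):
            started[mnemonic[: -len("_START")]] = when
        elif mnemonic.endswith("_EXPIRED"):
            name = mnemonic[: -len("_EXPIRED")]
            if name in started:
                durations[name] = int((when - started[name]).total_seconds())
    return durations


print(timer_durations(LOG))
```

For the log on this slide the SVI bringup delay comes out as 10 seconds and the VPC restore delay as 30 seconds, matching the "+10 sec" and "+30 sec" in the step list.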

Page 8: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC consistency checking

Certain configuration mistakes could lead to loops or blackholing (for example, when STP config is inconsistent). Others might cause undesirable forwarding implications for specific interfaces (for example, inconsistent ACLs or SVIs)

Consistency checking prevents network-wide issues (type 1) and warns about possible forwarding oddities (type 2)

Nexus# sh vpc consistency-parameters interface port-channel 1
Name                        Type  Local Value             Peer Value
-------------               ----  ----------------------  -----------------------
lag-id                      1     [(7f9b,                 [(7f9b,
                                  ...
mode                        1     active                  active
STP Port Type               1     Default                 Default
STP Port Guard              1     None                    None
STP MST Simulate PVST       1     Default                 Default
Native Vlan                 1     1                       1
Port Mode                   1     trunk                   trunk
MTU                         1     1500                    1500
Duplex                      1     full                    full
Speed                       1     10 Gb/s                 10 Gb/s
Allowed VLANs               -     101                     101

Nexus# sh vpc consistency-parameters global
Name                        Type  Local Value             Peer Value
-------------               ----  ----------------------  -----------------------
STP Mode                    1     Rapid-PVST              Rapid-PVST
STP Disabled                1     None                    None
STP MST Region Name         1     ""                      ""
STP MST Region Revision     1     0                       0
STP MST Region Instance to  1
 VLAN Mapping
STP Loopguard               1     Disabled                Disabled
STP Bridge Assurance        1     Enabled                 Enabled
STP Port Type, Edge         1     Normal, Disabled,       Normal, Disabled,
BPDUFilter, Edge BPDUGuard        Disabled                Disabled
STP MST Simulate PVST       1     Enabled                 Enabled
Interface-vlan admin up     2     101                     101
Interface-vlan routing      2     1,101                   1,101

Inconsistency types, actions and examples:

Type 1 / Global
  Action: Vlans suspended on peer-link, VPCs up with respective vlans suspended
  Example: Rapid-PVST STP on one peer, MST STP on another

Type 1 / Interface
  Action: Vlans suspended on respective VPC
  Example: MTU mismatch, STP guard config mismatch

Type 2
  Action: Syslog message
  Example: SVI is up on one peer, down on another
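To make the type 1 / type 2 distinction concrete, here is a small Python sketch that groups mismatching parameters by the Type column. It assumes the rows of the consistency-parameters output have already been split into name, type, local value and peer value; the `Param` tuple, the `mismatches` function and the sample mismatches are hypothetical and only illustrate the idea.

```python
from typing import NamedTuple


class Param(NamedTuple):
    name: str
    check_type: str  # "1", "2" or "-" as printed in the Type column
    local: str
    peer: str


def mismatches(params):
    """Group parameters whose local and peer values differ, by consistency type."""
    result = {"1": [], "2": []}
    for p in params:
        if p.check_type in result and p.local != p.peer:
            result[p.check_type].append(p.name)
    return result


# Sample rows; the mismatching values are made up for illustration.
rows = [
    Param("STP Mode", "1", "Rapid-PVST", "MST"),              # type 1 mismatch
    Param("MTU", "1", "1500", "1500"),                         # consistent
    Param("Interface-vlan admin up", "2", "101", "101,102"),   # type 2 mismatch
    Param("Allowed VLANs", "-", "101", "101"),                 # not type-checked
]

print(mismatches(rows))
```

Anything reported under "1" would suspend vlans as described in the list above, while entries under "2" would only produce a syslog message.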

Page 9: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Graceful Consistency check

VPC Type 1 inconsistency suspends all vlans on corresponding VPC on both peers

This triggers forwarding interruption during config changes (for example while changing MTU on VPC)

As of 4.2(8) and 5.2(1) VPC supports Graceful Consistency Check

Graceful consistency check brings down interfaces on secondary peer upon inconsistency, primary peer keeps forwarding traffic

Enabled by default

Nexus(config-vpc-domain)# graceful consistency-check

Nexus# show vpc brief
vPC domain id                   : 1
Peer status                     : peer adjacency formed ok
vPC keep-alive status           : peer is alive
vPC role                        : secondary
...
Graceful Consistency Check      : Enabled

vPC status
----------------------------------------------------------------------------
id   Port   Status  Consistency  Reason        Active vlans
--   ----   ------  -----------  ------        ------------
1    Po1    down*   failed       vPC type-1    2-10

Page 10: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC behavior at initialization

Peer-Keepalives must be heard before we bring up the Peer-Link

VPC control plane must be able to communicate to the peer over peer-link

Negotiate LACP/STP operating roles for the chassis

Wait for per-port peer parameters and handshake to bring up vPC ports

Performs peer parameters consistency check on each VPC bringup

Will not bring up VPCs if only one of two VPC peers comes up (for example after power outage)

Page 11: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC Reload Restore

Allows to bring up VPCs after timeout if peer is presumed dead

Default timeout 360 sec

Assumes primary role for STP and LACP

Nexus(config)# vpc domain 1
Nexus(config-vpc-domain)# reload restore ?
  <CR>
  delay  Duration to wait before assuming peer dead and restoring vpcs
Nexus(config-vpc-domain)# reload restore delay ?
  <240-3600>  Time-out for restoring vPC links (in seconds)

Page 12: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC auto-recovery (replaces Reload-Restore as of NXOS 5.2.1)

Auto-recovery addresses cases of multiple failures. For example

Peer-link fails and after a while primary switch (or keepalive link) fails

Both VPC peers are reloaded and only one comes back up

How it works

If Peer-link is down on secondary switch, 3 consecutive missing peer-keepalives will trigger auto-recovery

After reload (role is ‘none established’), if the auto-recovery timer (240 sec) expires while peer-link and peer-keepalive are still down, auto-recovery kicks in

Switch assumes primary role

VPCs are brought up bypassing consistency checks

Nexus(config)# vpc domain 1
Nexus(config-vpc-domain)# auto-recovery
Nexus# sh vpc | i recovery
Auto-recovery status            : Enabled (timeout = 240 seconds)

Failure type                                               Reload restore   Auto recovery
After reload only single peer comes up                     √                √
Peer-link fails, then eventually primary switch fails      -                √
completely

Page 13: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Troubleshooting VPC: initialization

Always start with sh vpc. It gives ~90% of all information needed for initial situation assessment

vpc1# sh vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                   : 1
Peer status                     : peer adjacency formed ok
vPC keep-alive status           : peer is alive
Configuration consistency status: success
Type-2 consistency reason       : Consistency Check Not Performed
vPC role                        : primary
Number of vPCs configured       : 1
Peer Gateway                    : Disabled
Dual-active excluded VLANs      : -

vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po100  up     1,101

vPC status
----------------------------------------------------------------------
id   Port   Status Consistency Reason                     Active vlans
--   ----   ------ ----------- ------                     ------------
1    Po1    up     success     success                    101

What the fields tell us:

Peer status: CFS can communicate with the peer

vPC keep-alive status: we hear peer-keepalives

Configuration consistency status: configs are compatible

vPC role: Master/Slave for certain apps

vPC Peer-link status: Peer-Link is up with expected vlans

vPC status: vlans are active on VPCs

Peer status issue: check if peer-link is up, check if the remote end is also configured as peer-link, then look at CFS. Note that peer-link will fully come up only when 1) peer-keepalive is up and 2) peers can talk via CFS over peer-link

Peer-keepalive issue: check ‘sh vpc keepalive’, check that the outgoing interface is up and in the correct vrf, check the route to the destination (in the correct vrf), ping the remote, and check the same on the remote peer

Role issue: check ‘sh vpc role’ on both sides. Note that the peer that has been up/active the longest will remain operational-active even if the other peer has better priority. This is done to minimize traffic disruption. If the role is ‘none established’, the VPC came up after reload/new config, and VPCs will not come up before the role is resolved or reload-restore/auto-recovery kicks in

Consistency issues: check ‘sh vpc consistency global|interface …’

Vlans not up: check if the respective vlan is allowed on peer-link, check syslog for other causes with ‘sh log log | inc VLANS’

Always keep track of the situation on both peers
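The first three checks can be automated when `sh vpc` output is collected from many switches. The sketch below is one possible approach in Python, assuming the `key : value` header layout shown on this slide; the function names and the `EXPECTED` table are illustrative, and the "broken" sample reuses values that appear later in this deck for a switch whose peer is unreachable.

```python
def parse_vpc_summary(text):
    """Parse the 'key : value' header lines of 'sh vpc' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Healthy values as shown in the output on this slide.
EXPECTED = {
    "Peer status": "peer adjacency formed ok",
    "vPC keep-alive status": "peer is alive",
    "Configuration consistency status": "success",
}


def first_problems(text):
    """Return the header fields that differ from the healthy values, in check order."""
    fields = parse_vpc_summary(text)
    return [key for key, want in EXPECTED.items() if fields.get(key) != want]


HEALTHY = """\
Peer status                     : peer adjacency formed ok
vPC keep-alive status           : peer is alive
Configuration consistency status: success
vPC role                        : primary
"""

BROKEN = """\
Peer status                     : peer link is down
vPC keep-alive status           : Suspended (Destination IP not reachable)
Configuration consistency status : failed
vPC role                        : none established
"""

print(first_problems(HEALTHY))
print(first_problems(BROKEN))
```

The returned list preserves the order of the checklist above, so the first entry is the natural place to start troubleshooting.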

Page 14: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC redundancy model

Process restartability: processes checkpoint their runtime state. A crashing process is restarted statefully by the NXOS system manager

Supervisor redundancy: HA-policy will trigger supervisor switchover in response to excessive process crashing, software, hardware or diagnostic failure

VPC redundancy: devices dual-attached to the VPC domain are protected against single switch failure (power, hardware, maintenance etc)

[Figure: VPC Domain with Switch 1 and Switch 2, each with an Active and a Standby (SSO) supervisor running Process 1, Process 2 … Process X]

Page 15: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC Keepalive link

Heartbeat between vPC peers to prevent dual-active scenario

Keepalives are sent every second by default on UDP port 3200

3 second hold timeout on peer-link loss: how long we ignore keepalives after peer-link loss

5 second keepalive timeout (starts after the hold timeout that follows peer-link down): how long we wait before declaring failure once the hold timeout has passed

Use dedicated link, although NXOS does not enforce this – just IP connectivity is verified

Management interface can be used as keepalive link, but do not connect the interfaces together directly (only active supervisor management interface is up)

Peer Keepalive

Page 16: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Handling Peer-link failure flow

Peer-link failure

1. Am I primary? If primary: done. If secondary: continue

2. Ignore keepalives for hold-timeout (3 sec)

3. Start keepalive timeout timer (default 5 sec)

4. Received keepalive? If yes: primary is alive, bring down all VPC ports

5. If no: keepalive timeout expired? If no: keep waiting for a keepalive (step 4). If yes: primary is gone, become primary

Note: If primary fails completely once the VPCs are down on secondary, VPCs will stay down until primary recovers
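The flow above can be sketched as a small decision function. This is a simplified Python model and not NX-OS code: it assumes keepalive arrival times are given in seconds after the peer-link failure, that keepalives inside the hold window are ignored, and that the timeout window is half-open. The function name and the returned strings are illustrative.

```python
def on_peer_link_failure(role, keepalive_times, hold=3.0, timeout=5.0):
    """Decide what a peer does after peer-link failure.

    role            -- "primary" or "secondary"
    keepalive_times -- seconds after the failure at which keepalives arrived
    hold, timeout   -- keepalive hold timeout and keepalive timeout (defaults from the slide)
    """
    if role == "primary":
        return "done"
    # Keepalives during the hold timeout are ignored; only those arriving
    # while the keepalive timeout timer is running count.
    window_start, window_end = hold, hold + timeout
    if any(window_start <= t < window_end for t in keepalive_times):
        return "bring down all VPC ports"  # primary is alive
    return "become primary"                # primary is gone


print(on_peer_link_failure("secondary", [0.5, 4.0]))
print(on_peer_link_failure("secondary", [1.0, 2.0]))
```

With the default timers the secondary reaches a decision at most 8 seconds after the peer-link failure (3 sec hold plus 5 sec timeout) in this model.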

Page 17: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Handling Peer-link failure flow with Auto-recovery

1. Peer-link down? If no: nothing to do. If yes: continue

2. Am I primary? If primary: done. If secondary: continue

3. Received keepalive? If yes: primary is alive, bring down all VPC ports and keep monitoring keepalives

4. If no: missed 3 keepalives in a row? If no: keep monitoring keepalives (step 3). If yes: primary is gone, become primary

5. NEW with auto-recovery: bypass consistency checks and bring up VPCs

Note: Unlike in the previous case, the keepalive status is always checked, not only for keepalive hold + keepalive timeout seconds after peer-link failure
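A minimal Python sketch of the secondary's behaviour with auto-recovery follows. It is an assumption-laden model: each element of `keepalive_seen` stands for one keepalive interval while the peer-link is down, and the function name and action strings are made up for illustration.

```python
def secondary_actions(keepalive_seen, missed_limit=3):
    """Actions taken by the secondary while the peer-link stays down.

    keepalive_seen -- iterable of booleans, one per keepalive interval
    missed_limit   -- consecutive missed keepalives that trigger auto-recovery
    """
    actions = []
    missed = 0
    for seen in keepalive_seen:
        if seen:
            missed = 0
            if not actions:
                # Primary is alive: the secondary takes its VPC ports down once.
                actions.append("bring down all VPC ports")
        else:
            missed += 1
            if missed >= missed_limit:
                # Primary is gone: auto-recovery takes over.
                actions += [
                    "become primary",
                    "bypass consistency checks",
                    "bring up VPCs",
                ]
                break
    return actions


# Peer-link fails, primary stays alive for a while, then fails as well.
print(secondary_actions([True, True, False, False, False]))
```

The example input corresponds to the first multiple-failure case listed for auto-recovery: the VPCs go down on the secondary first and are brought back up once three keepalives in a row are missed.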

Page 18: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

If Peer-link and Keepalive both fail …while primary peer is still alive

Dual-active situation

There will be 2 primary switches sending independent BPDUs

VPC Port-channels on upstream/downstream switches will be error-disabled by ‘EtherChannel Misconfiguration Guard’ after ~90 seconds, see http://www.cisco.com/en/US/tech/tk389/tk213/technologies_tech_note09186a008009448d.shtml

If a Nexus 7000/5000 is on the other end of the VPC, there is no errordisable, as NXOS does not support EtherChannel Guard

Depending on remote configuration (presence of VPC, peer-switch etc) there can be different outcomes, ranging from no impact, to STP dispute, to STP state cycling between dispute, blocking and forwarding

[Figure label: Split vlan]

Provision redundancy for keepalive link, make sure it doesn’t share datapath with peer-link

Page 19: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

What to do if only 1 peer is operational… … and VPCs are down … due to power issue, hardware failure on the 2nd peer etc

VPC(s) will be down if they had to flap or current peer was reloaded (because consistency check couldn’t be performed without 2nd peer)

Non-issue with auto-recovery, but what if current NXOS version < 5.2 ?

Possible actions

Recover 2nd peer

…or remove VPC config from port-channel(s)

vpc(config-if)# no vpc 123

…or, in case of many VPCs, remove VPC config

vpc# sh run vpc > bootflash:myvpc.conf
vpc(config)# no feature vpc

vpc# sh vpc
...
Peer status                       : peer link is down
vPC keep-alive status             : Suspended (Destination IP not reachable)
Configuration consistency status  : failed
Configuration inconsistency reason: Consistency Check Not Performed
vPC role                          : none established
...
vPC status
----------------------------------------------------------------------
id   Port   Status Consistency Reason                     Active vlans
--   ----   ------ ----------- ------                     ------------
102  Po102  down   Not         Consistency Check Not      -
                   Applicable  Performed

Page 20: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Troubleshooting VPC peer-keepalives

Nexus# show vpc peer-keepalive

vPC keep-alive status : peer is alive

--Send status : Success

--Last send at : 2009.06.19 00:41:15 589 ms

--Sent on interface : Eth2/35

--Receive status : Success

--Last receive at : 2009.06.19 00:41:14 580 ms

--Received on interface : Eth2/35

--Last update from peer : (1) seconds, (9) msec

vPC Keep-alive parameters

--Destination : 7.7.7.77

--Keepalive interval : 1000 msec

--Keepalive timeout : 5 seconds

--Keepalive hold timeout : 3 seconds

--Keepalive vrf : v1

--Keepalive udp port : 3200

--Keepalive tos : 192

Nexus# show vpc statistics peer-keepalive

vPC keep-alive status : peer is alive

vPC keep-alive statistics

----------------------------------------------------

peer-keepalive tx count: 9773

peer-keepalive rx count: 8985

average interval for peer rx: 991

Count of peer state changes: 0

Peer-keepalive is only essential at the time when peer-link goes down or comes up

At any other time peer-keepalive failure will only trigger syslog

Peer-keepalives might be affected by extreme control plane load (check CPU utilization & COPP)

Count of peer state changes: number of keepalive state transitions, the closer to 0 the better

Only reception of keepalive packets at IP level is required

Generic routing/switching connectivity troubleshooting might be needed if packets are lost (make sure there is a route/arp in the correct VRF…)

Page 21: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Cisco Fabric Services (CFS): CFS messaging

Transport mechanism for control-plane messaging between VPC peers

Uses

• Consistency validation

• MAC address synchronization

• vPC member port status signalling

• IGMP snooping synchronization

• vPC status signalling

VPC CFS messages are encapsulated in Ethernet frames and delivered to the peer via the peer-link

Nexus# sh cfs application
----------------------------------------------
 Application    Enabled   Scope
----------------------------------------------
 arp            Yes       Physical-eth
 stp            Yes       Physical-eth
 vpc            Yes       Physical-eth
 igmp           Yes       Physical-eth
 l2fm           Yes       Physical-eth
 ...

Page 22: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC: CFS troubleshooting

Cisco Fabric Services Transport of control messages between VPC peers

Nexus# show cfs status

Distribution : Enabled

Distribution over IP : Disabled

IPv4 multicast address : 239.255.70.83

IPv6 multicast address : ff15::efff:4653

Distribution over Ethernet : Enabled

Nexus# show cfs peers

Physical Fabric

---------------------------------------------

Switch WWN IP Address

---------------------------------------------

20:00:00:1b:54:c2:42:41 10.48.73.222 [Local]

Nexus

20:00:00:1b:54:c2:42:44 0.0.0.0

Total number of entries = 2

Nexus# show cfs internal ethernet-peer statistics | i Trans|Rece

Number of Segments Transmitted : 218

Number of Acks Transmitted : 223

Maximum Segment Size Transmitted : 0

Number of Transmission Timeouts : 0

Number of segments in Transmit Queue : 0

Number of segments in Re-Transmit Queue : 0

Total Number of Segments Received : 441

Number of Acks Received : 217

Number of Duplicate Messages Received : 0

Number of Unexpected Segments Received : 0

Number of fragmented segments Received : 2

Number of duplicate fragments Received : 0

Number of unfragmented segments Received : 210

Number of Received Segments Dropped : 0

Number of Unreliable segments Transmitted : 1

Number of Unreliable segments Received : 1

Nexus# sh cfs internal notification log name vpc

Sun Nov 14 15:27:22 2010: Peer add 20:00:00:1b:54:c2:42:44

Sun Nov 14 19:05:25 2010: Peer gone 20:00:00:1b:54:c2:42:44

Sun Nov 14 19:08:03 2010: Peer add 20:00:00:1b:54:c2:42:44

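show cfs peers: the remote peer should be seen

show cfs internal ethernet-peer statistics: TX/RX counters should move when VPC is active or coming up

sh cfs internal notification log name vpc: shows timestamps for when CFS communication for VPC was interrupted (peer-reload, peer-link issues etc)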
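The notification log can be turned into outage durations by pairing each "Peer gone" with the next "Peer add" for the same switch WWN. The Python sketch below assumes the timestamp layout shown on this slide and an English locale for day and month names; `cfs_outages` is an illustrative name.

```python
from datetime import datetime

# Notification log lines copied from the slide.
LOG = """\
Sun Nov 14 15:27:22 2010: Peer add 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:05:25 2010: Peer gone 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:08:03 2010: Peer add 20:00:00:1b:54:c2:42:44
"""


def cfs_outages(log_text):
    """Return [(wwn, seconds)] for every 'Peer gone' followed by a 'Peer add'."""
    gone_at = {}
    outages = []
    for line in log_text.splitlines():
        stamp, sep, event = line.partition(": Peer ")
        if not sep:
            continue  # not a peer add/gone line
        action, wwn = event.split()
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if action == "gone":
            gone_at[wwn] = when
        elif action == "add" and wwn in gone_at:
            delta = when - gone_at.pop(wwn)
            outages.append((wwn, int(delta.total_seconds())))
    return outages


print(cfs_outages(LOG))
```

For the log on this slide the peer was unreachable over CFS from 19:05:25 to 19:08:03, which is 158 seconds.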

Page 23: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Swapping Primary Secondary roles

Sometimes it is preferred for operational reasons to have specific switch as primary

VPCs are down for ~1 minute after primary changes to secondary

Approach

1. Change role priority

2. Bounce peer-link

vpc1(config)# vpc domain 2
vpc1(config-vpc-domain)# role priority 60
Warning:
 !!:: vPCs will be flapped on current primary vPC switch while attempting role change ::!!
Note:
 --------:: Change will take effect after user has re-initd the vPC peer-link ::--------
vpc1(config-vpc-domain)# int po1
vpc1(config-if)# shut
....
vpc1(config-if)# no shut
...
21:28:34 %VPC-5-ROLE_PRIORITY_CFGD: In domain 2, vPC role priority changed to 60
21:28:34 %VPC-5-SYSTEM_PRIO_CFGD: In domain 2, vPC system priority changed to 32667
21:28:36 %ETHPORT-5-IF_DOWN_NONE: Interface port-channel102 is down (None)
21:28:36 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to secondary
21:35:40 %VPC-5-VPC_PEER_LINK_UP: vPC Peer-link is up

Page 24: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC operational considerations … from troubleshooting perspective

VPC troubleshooting is often part of investigation of larger scale event – connectivity issues following power-outage, upgrade, migration, major changes etc

Datacenter connectivity being impacted usually implies lots of pressure (time and otherwise…)

Always know the current situation before trying to ‘recover’…

Trying to fix a non-issue risks making things worse… At minimum, collect the state of the system before trying anything drastic

When traffic forwarding is concerned, basic information on interfaces, VPC states, STP states, MAC addresses and L3 routes/ARPs is essential. It takes a minute to collect, just paste this into the shell on both peers:

term len 0
sh int
sh vpc
sh port-channel summary
sh spanning-tree
sh mac address-table
sh routing vrf all
sh ip arp vrf all

‘sh tech detail’ is preferred (though it takes ~10 minutes to collect, depending on CPU load and number of linecards). Note: if VDCs are used, best practice is to collect ‘sh tech detail’ from both the main VDC and the VDC in question. ‘sh tech brief’ is a faster alternative

Page 25: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC config considerations

VPC Domain # must be unique for each Layer2-adjacent VPC domain – otherwise issues with multicast forwarding, LACP negotiation of cross-VPC links may arise

Set logging level for vpc to 5 – makes VPC operation easier to follow

Use LACP for the peer-link (channel-group <x> mode active) – more resilient to separate link failures (fiber/sfp going bad) or switch control-plane failures

Use auto-recovery (if available, use reload-restore if not) – useful for cases of multiple failures, more graceful recovery

Page 26: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Agenda

Initialization & Redundancy considerations

Spanning Tree

Traffic forwarding

1st hop redundancy

Multicast considerations

Page 27: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Spanning Tree in VPC domain

STP runs on both switches (2 active control planes) but only the primary switch drives STP of VPCs. Port state changes are communicated to the secondary via CFS messages. For non-VPC ports the domain appears as 2 bridges

Peer-link is part of STP. BPDU handling is modified such that the Peer-link will not be blocked (similar to MST implementation of IST)

Non-VPC ports are managed independently by the local STP process on each switch

[Figure: Primary and Secondary peers, each running its own STP process, with numbered callouts 1 and 2]

Page 28: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP behavior upon VPC primary failure

1. Primary switch (STP root) fails

2. Secondary switch becomes operational primary and STP root

STP root port doesn’t change, nor do any STP port states for VPCs, so forwarding continues

Depending on control plane load it might take a few seconds for the Op-primary to start sending BPDUs. This might cause STP reconvergence on connected switches, hence increasing hello time or using the peer-switch feature might be considered in large deployments

[Figure: Primary (ROOT) fails, Secondary (Backup ROOT) becomes OP-Primary and ROOT, with callouts 1 and 2]

Page 29: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP behavior upon VPC primary recovery

1. Left switch comes back up

2. Peer-Link comes back up

3. VPC role is resolved as Operational-secondary

4. Left switch has better STP priority and becomes STP root

5. STP root port of the right switch will change and that will trigger SYNC: all non-edge STP ports will be temporarily blocked

Once sync is complete, ports will resume forwarding

[Figure: left switch returns as OP-Secondary and ROOT, right switch (Secondary, OP-Primary) goes from ROOT to Backup ROOT with SYNC, callouts 1 to 5]

Page 30: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC Peer-Switch feature

Both VPC switches originate BPDUs with preconfigured information. This keeps the BPDU the same when the primary fails or recovers, so no extra SYNC is required and the short interruption in forwarding described on the previous slide is avoided

Both left and right switches consider themselves root

Both left and right switches send BPDUs all the time, so there is no need to raise hello time, and STP Bridge Assurance can be enabled on VPCs

Configuration (identical on both the Primary and the Secondary switch):

spanning-tree vlan 1-1000 priority 8192
vpc domain 1
  peer-switch

[Figure: Primary and Secondary peers, both marked ROOT]

Page 31: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC Peer-Switch feature

Primary (left):

left# sh span vlan 101
VLAN0101
  Spanning tree enabled protocol rstp
  Root ID    Priority    8293
             Address     0023.04ee.be01
             This bridge is the root
  ...
  Bridge ID  Priority    8293   (priority 8192)
             Address     0023.04ee.be01
  ...
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1              Desg FWD 1         128.4096 (vPC) P2p
Po100            Root FWD 2         128.4195 (vPC peer-link)

left# sh vpc role | i mac
vPC system-mac                  : 00:23:04:ee:be:01
vPC local system-mac            : 00:1b:54:c2:42:43

Secondary (right):

right# sh span vlan 101
VLAN0101
  Spanning tree enabled protocol rstp
  Root ID    Priority    8293
             Address     0023.04ee.be01
             This bridge is the root
  ...
  Bridge ID  Priority    8293   (priority 8192)
             Address     0023.04ee.be01
  ...
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1              Desg FWD 1         128.4096 (vPC) P2p
Po100            Desg FWD 2         128.4195 (vPC peer-link)

In Peer-Switch mode the bridge-ID comes from the system-mac, as opposed to the local mac in normal mode

[Figure: both peers marked ROOT]
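The bridge ID values in the output above follow a simple rule: the displayed priority is the configured priority plus the VLAN number (8192 + 101 = 8293), and the address is the vPC system-mac when peer-switch is enabled. A small Python sketch of that rule, with a hypothetical function name and standard extended-system-ID limits assumed for validation:

```python
def bridge_id(configured_priority, vlan, local_mac, system_mac, peer_switch):
    """Return (priority, mac) as displayed by 'sh span vlan <vlan>'.

    Assumes per-VLAN STP with extended system ID: the configured priority is a
    multiple of 4096 and the VLAN number is added to it.
    """
    if configured_priority % 4096 or not 0 <= configured_priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    if not 1 <= vlan <= 4094:
        raise ValueError("vlan must be in 1..4094")
    # Peer-switch: both peers present the shared vPC system-mac.
    mac = system_mac if peer_switch else local_mac
    return configured_priority + vlan, mac


# Values from the slide: priority 8192, VLAN 101, MACs from 'sh vpc role'.
print(bridge_id(8192, 101, "00:1b:54:c2:42:43", "00:23:04:ee:be:01", True))
```

Because both peers compute the same priority and use the same system-mac, they advertise an identical bridge ID, which is why each of them reports "This bridge is the root".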

Page 32: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP inconsistencies

When STP detects certain abnormal situations it will mark ports as inconsistent and block them to prevent forwarding loops

- Root – Root Guard feature detected inconsistency (unwanted bridge tries to become root)

- Loop – Loop Guard feature detected inconsistency (port becomes designated because no BPDUs are being received)

- Bridge Assurance (BA) (no BPDUs are received from remote side)

- VPC Peer-link (any of above inconsistencies happened on VPC peer-link)

%STP-2-VPC_PEER_LINK_INCONSIST_BLOCK: vPC peer-link detected BPDU receive timeout blocking port-channel11 VLAN0121.

Page 33: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Handling Peer-Link STP inconsistencies on Primary switch

When peer-link STP inconsistency is detected on the primary switch, the link will be put in ‘inconsistent’ STP state (effectively blocking state)

BPDUs are not sent on the peer-link when it is inconsistent. This is to allow the secondary switch to detect the inconsistency and react

[Figure: Primary and Secondary peers, peer-link marked "inconsistency" at callout 1]

Page 34: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Handling Peer-Link STP inconsistencies on Secondary switch

1. When peer-link STP inconsistency is detected on the secondary switch, the peer link will be put in ‘inconsistent’ STP state (effectively blocking state)

2. Respective vlans or MST instances are also blocked on all VPCs

[Figure: Primary and Secondary peers, peer-link marked "inconsistency" at callout 1, VPCs marked "inconsistency" at callout 2]

This behavior depends on STP Bridge Assurance on peer-link (default) as a way to signal to the secondary peer about inconsistency

With BA disabled on Peer-link any inconsistency on the Primary will lead to Peer-link flap

Page 35: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP troubleshooting: PES/SPS & BPDU redirection

Primary VPC peer controls the port states on the secondary peer by means of SPS (set-port-state) messages

Changes in STP information are synchronized between peers using PES (port-event-sync) messages

nexus# sh spanning-tree internal info vpc | exc 0$

...

======= CFSoe Statistics =========================

Total PES Msgs sent : 4

Total SPS Msgs sent : 4

Total MCS Msgs sent : 8

Total PES Response Msgs received : 4

Total SPS Response Msgs received : 4

Total Response Msgs received : 8

nexus# sh system internal frame traffic | i BPDU

Ingress BPDUs qualified for redirection 42

Ingress BPDUs redirected to peer 42

Egress BPDUs qualified for redirection 0

Egress BPDUs dropped due to remote down 0

Egress BPDUs redirected to peer 0

BPDUs are sent to VPCs out of primary switch. If VPC leg connected to primary is down, BPDUs are sent over peer-link and sent out by secondary

Constantly incrementing SPS/PES counters might indicate STP instability or constant reconvergence.

Use ‘sh spanning detail’ and ‘debug spanning-tree events’ to find a reason for reconvergences
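One way to spot constantly incrementing counters is to capture the CFSoe statistics twice and compare them. The Python sketch below assumes the `name : value` layout shown above; the second snapshot (`AFTER`) contains made-up numbers, and the function names are illustrative.

```python
import re

# First snapshot: values from the slide.
BEFORE = """\
Total PES Msgs sent : 4
Total SPS Msgs sent : 4
Total MCS Msgs sent : 8
"""

# Second snapshot: hypothetical values taken some time later.
AFTER = """\
Total PES Msgs sent : 9
Total SPS Msgs sent : 19
Total MCS Msgs sent : 8
"""


def parse_counters(text):
    """Return {counter_name: value} for every 'name : number' line."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def growing(before, after, names=("Total PES Msgs sent", "Total SPS Msgs sent")):
    """Return {name: increase} for the watched counters that went up."""
    b, a = parse_counters(before), parse_counters(after)
    return {n: a[n] - b[n] for n in names if n in a and n in b and a[n] > b[n]}


print(growing(BEFORE, AFTER))
```

A non-empty result across several consecutive snapshots would be the cue to move on to the spanning-tree detail and debug commands mentioned above.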

Page 36: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP troubleshooting

Peer link is running STP

vpc1# sh spanning-tree vlan 4

VLAN0004
  Spanning tree enabled protocol rstp
  Root ID    Priority    32772
             Address     0018.ba88.4a00
             Cost        2
             Port        4096 (port-channel1)
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32772 (priority 32768 sys-id-ext 4)
             Address     68bd.abd7.51c2
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po1              Root FWD 1         128.4096 (vPC peer-link) Network P2p
Po102            Root FWD 1         128.4197 (vPC) P2p

vpc1# sh spanning-tree vlan 4 detail | i "^ Port|BPDU"

Port 4096 (port-channel1, vPC Peer-link) of VLAN0004 is root forwarding

BPDU: sent 46416, received 46418

Port 4197 (port-channel102, vPC) of VLAN0004 is root forwarding

BPDU: sent 0, received 0

On the other end of the peer-link, po1 is designated

It is possible to see a situation with 2 root ports: the peer-link and some VPC

This happens when the STP root is behind a VPC and the BPDU is received by the peer - this does not indicate any issue

Page 37: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

STP troubleshooting

Looking at BPDUs live

vpc1# debug spanning-tree bpdu_tx tree 101

14:20:37.556707 stp: RSTP(101): transmitting RSTP BPDU on port-channel100

14:20:37.556750 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel100 enc_type 1 len 42

14:20:37.556834 stp: RSTP(101): transmitting RSTP BPDU on port-channel1

14:20:37.556863 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel1 enc_type 2 len 36

vpc1# debug spanning-tree all

14:22:23.560147 stp: RSTP(1): transmitting RSTP BPDU on port-channel100

14:22:23.560169 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 1 port port-channel100 enc_type 2 len 36

14:22:23.560219 stp: BPDU TX: vb 1 vlan 1 port port-channel100 len 36 ->0180c2000000 CFG P:0000 V:02 T:02 F:78 R:80:01:00:1b:54:c2:42:43 00000002 B:80:01:00:1b:54:c2:42:44 9063 A:0000 M:0014 H:0002 F:000f

Looking at past events…

nexus# sh spanning-tree internal event-history tree 0 interface port-channel 50
VDC02 MST0000
<port-channel50>
0) Transition at 497772 usecs after Tue Oct 20 17:42:01 2009
   State: FWD Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]
1) Transition at 661395 usecs after Tue Oct 20 17:42:01 2009
   State: FWD Role: Root Age: 4 Inc: no [STP_PORT_ROLE_CHANGE]
2) Transition at 17741 usecs after Tue Oct 20 17:42:03 2009
   State: BLK Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]
...

Alternatively use ‘ethanalyzer’ to capture and dump BPDUs. Beware that BPDUs received by the other peer and redirected to the primary will not be decoded as expected because of the extra encapsulation

‘debug spanning-tree bpdu_tx’ output can be easily limited to the necessary Vlan/Interface, but it doesn’t dump the BPDU

‘debug spanning-tree all’ is very chatty – use ‘debug logfile <file>’ to redirect output to a file
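The ‘BPDU TX’ line printed by ‘debug spanning-tree all’ carries the raw BPDU fields. The sketch below assumes that the first ‘F:’ field (F:78 in the example) is the one-byte RST BPDU flags field and decodes it using the bit layout from IEEE 802.1D-2004; the mapping of the debug labels to BPDU fields is an assumption, not something documented on the slide.

```python
# Port role encoding used in bits 2-3 of the RST BPDU flags byte.
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}

def decode_rstp_flags(flags):
    """Decode the one-byte flags field of an RST BPDU (IEEE 802.1D-2004)."""
    return {
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "topology_change_ack": bool(flags & 0x80),
    }

# 0x78 is the value assumed to be the flags byte in the debug line (F:78).
decoded = decode_rstp_flags(0x78)
for name, value in decoded.items():
    print(name, value)
```

For 0x78 this yields a root port role with the learning, forwarding and agreement bits set, and no proposal or topology change.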

Page 38: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Layer2 stability features recap

Feature | What it does | Condition | Works on | Effect | Note

UDLD | Detects if a link becomes unidirectional | Link cannot carry BPDUs both ways, which causes loops | Physical port | Error-disables unidirectional links | Useful on port-channels to take out broken links; alternative is fast-timers PAgP/LACP

Bridge Assurance (BA) | Expects to receive a BPDU every hello_time from the peer | Dead control plane on the remote side, also BPDU loss | Logical port | Blocks port at STP level (BA-inconsistent state) | Main protection mechanism where supported; alternative is Loop Guard

Dispute | Checks the remote port role in the received BPDU; the role should not be designated in a BPDU received on a designated port | Cases of unidirectional communication | Logical port | Blocks port at STP level (Disputed state) | Complements BA, on by default. Somewhat overlaps with UDLD, but not as effective on port-channels. Only works with RSTP/MST BPDUs

Loop Guard | Doesn’t allow a port to take the designated role if it stopped receiving BPDUs | Unidirectional communication, control plane issues on remote | Logical port | Blocks port at STP level (Loop-inconsistent) | Superseded by BA + Dispute; use with PVST+ or when BA is not supported

Page 39: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Bridge assurance, Dispute & UDLD

BA is default enabled on Peer-Link, not recommended for VPCs unless Peer-Switch feature is also operational

Dispute is default enabled (for both RSTP and MST on VPC)

UDLD [normal mode] is recommended to take out bad links from channels (otherwise LACP takes ~100sec vs ~20 with UDLD)

Recommendation

Preferred: BA + UDLD + Dispute (on all interswitch links when using Peer-switch) when all switches support this (Nexus 7000/5000 and Cat6500/VSS do support it)

Without Peer-switch, BA should be kept only on the Peer-Link (no BA or Loop Guard on VPCs); use UDLD + Dispute

If the preferred config is not supported, use Loop Guard + UDLD (supported by all Cisco switches)

Supported features can potentially be mixed and matched per switch, but understand which failure cases each feature and each combination covers

Page 40: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Agenda

Initialization & Redundancy considerations

Spanning Tree

Traffic forwarding

1st hop redundancy

Multicast considerations

Page 41: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Special case for forwarding

[Diagram: PC A and PC B attached through access switches to the two VPC peers; callouts 1-5 mark the steps below]

1. PC A sends a packet to PC B

2. MAC B is not known by the left switch: flood

3. MAC B is not known by the right switch: flood

4. B receives duplicate frames

5. MAC A will be learned on the wrong port on the lower access switch, blackholing traffic to A

Frames received on Peer-Link must not be flooded out of VPCs

Page 42: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Special case for forwarding: VPC way

[Diagram: PC A and PC B attached through access switches to the two VPC peers; callouts 1-3 mark the steps below]

1. MAC B is not known by the left switch: flood

2. Frames received from Peer-Link are never sent out of VPC (except those without operational ports on ingress switch). Egress port ASICs will drop the frame

3. Frame is still flooded to devices that are solely connected to the egress switch

This rule (called ‘VPC check’) stands for all traffic (L2, L3, unicast, multicast, broadcast, flooded etc) on Nexus 7000 (Nexus 3000/5000 VPC have a similar rule, but a different implementation)
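The VPC check described above can be expressed as a small decision function. This is only a logical model of the rule as stated on the slide, with hypothetical parameter names; the real check is performed by the egress port ASICs.

```python
def vpc_check_allows(arrived_on_peer_link, egress_is_vpc, vpc_up_on_ingress_peer):
    """Return True if a frame may leave through the given egress port.

    arrived_on_peer_link   -- the frame reached this switch over the Peer-Link
    egress_is_vpc          -- the candidate egress port is a VPC member port
    vpc_up_on_ingress_peer -- the same VPC has operational ports on the peer
                              that originally received the frame
    """
    if not arrived_on_peer_link:
        return True   # frames received locally are not subject to the check
    if not egress_is_vpc:
        return True   # devices connected only to this switch still get the frame
    # The frame crossed the Peer-Link and wants to exit a VPC: allowed only when
    # the ingress peer has no operational port in that VPC.
    return not vpc_up_on_ingress_peer

# Flooded frame that crossed the Peer-Link, VPC healthy on both peers: dropped.
print(vpc_check_allows(True, True, True))
# Same frame, but the VPC leg on the ingress peer is down: forwarded.
print(vpc_check_allows(True, True, False))
```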

Page 43: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Summary: VPC traffic forwarding with Nexus 7000

[Diagram: forwarding paths between the VPC peers, marked as permitted (√) or dropped (X)]

Page 44: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Topologies where VPC forwarding rules will have implications

[Diagram: a router running OSPF with the VPC peers over a VPC, with the dropped path marked x; routed links shown as the alternative]

Packets arriving at the left switch with the destination MAC of the right switch will be dropped

With peer-gateway enabled, adjacencies may not come up

This issue is not specific to OSPF – the same applies to any routing protocol

Use routed links to connect routers

[Diagram: VPC in vlan 2; one peer has SVI 1 up and SVI 2 up, the other has SVI 1 up and SVI 2 down, with the dropped path marked x]

Configuration and operational state of SVI interfaces for vlans present on VPCs should be consistent

Otherwise a packet arriving at the left switch for a destination on a VPC in vlan 2 will have to cross the Peer-Link and will be dropped by the right switch

Add a routed cross-link between peers

Frames received from Peer-Link are never sent out of VPC (except those without operational ports on ingress switch)

Page 45: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Verifying whether frame will be sent to peer-link

Nexus# show mac address-table vlan 35

Legend:

* - primary entry, G - Gateway MAC, (R) - Routed MAC

age - seconds since last seen,+ - primary entry using vPC Peer-Link

VLAN MAC Address Type age Secure NTFY Ports

---------+-----------------+--------+---------+------+------+----------------

+ 35 0007.b400.0101 dynamic 0 False False Po1

G 35 0007.b400.0102 static - False False sup-eth1(R)

G 35 001b.54c2.4241 static - False False sup-eth1(R)

* 35 001b.54c2.4244 static - False False vPC Peer-Link

+ 35 0012.da65.9ec0 dynamic 0 False False Po1

If a frame arrives at this switch in vlan 35 destined to 001b.54c2.4244, it will be sent to the peer-link

If this MAC address belongs to one of the L3 SVI interfaces of the VPC peer, the IP destination of the frame is behind a VPC, and this VPC has active links on this (local) switch, then the frame will be dropped by the VPC peer

Verify where the destination MAC address of the frame points to
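When many entries have to be checked, the ‘show mac address-table’ rows can be classified with a few lines of script. The sketch below assumes the column layout shown above (flag, VLAN, MAC, type, age, Secure, NTFY, ports); the function names are hypothetical.

```python
def parse_mac_entry(line):
    """Split one 'show mac address-table' row into its fields."""
    fields = line.split()
    flag, vlan, mac, entry_type = fields[:4]
    ports = " ".join(fields[7:])   # skip the age, Secure and NTFY columns
    return {"flag": flag, "vlan": int(vlan), "mac": mac,
            "type": entry_type, "ports": ports}

def points_to_peer_link(entry):
    """True if frames destined to this MAC will be sent over the Peer-Link."""
    return entry["flag"] == "+" or entry["ports"] == "vPC Peer-Link"

rows = [
    "+ 35 0007.b400.0101 dynamic 0 False False Po1",
    "G 35 0007.b400.0102 static - False False sup-eth1(R)",
    "* 35 001b.54c2.4244 static - False False vPC Peer-Link",
]
for row in rows:
    entry = parse_mac_entry(row)
    print(entry["mac"], "peer-link" if points_to_peer_link(entry) else entry["ports"])
```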

Page 46: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

MAC address learning

[Diagram: PC A behind the lower VPC, PC B above the VPC peers; callouts 1-3 mark the steps below]

1. MAC A is learned on the lower VPC

2. MAC A is learned on the Peer-Link

3. A frame destined to A arriving at the right switch will be sent to the Peer-Link

Traffic should prefer local links when available (traffic locality rule)

Page 47: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

MAC address learning: VPC way

[Diagram: PC A behind the lower VPC, PC B above the VPC peers; a CFS message passes from the left switch to the right switch; callouts 1-3 mark the steps below]

1. MAC A is learned on the lower VPC

2. Left switch sends a CFS message to the right switch telling it about MAC A learned on the lower VPC. Right switch updates its MAC address table

3. A frame destined to A arriving at the right switch will be sent to the lower VPC

MAC addresses are never learned from traffic on Peer-Link
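The learning behaviour above can be modelled in a few lines. This is a toy model with made-up class and port names: a direct method call stands in for the CFS message, and only the two rules from the slide are implemented (no learning on the Peer-Link, VPC-learned MACs are synchronized to the peer).

```python
class VpcPeer:
    """Toy model of MAC learning on one VPC peer."""

    def __init__(self, name):
        self.name = name
        self.mac_table = {}
        self.peer = None

    def receive(self, src_mac, ingress_port):
        if ingress_port == "peer-link":
            return                                   # never learn from Peer-Link traffic
        self.mac_table[src_mac] = ingress_port
        if ingress_port.startswith("vpc") and self.peer is not None:
            # Stand-in for the CFS message sent to the other peer.
            self.peer.sync_from_peer(src_mac, ingress_port)

    def sync_from_peer(self, mac, vpc_port):
        self.mac_table[mac] = vpc_port               # point to the local leg of the same VPC

left, right = VpcPeer("left"), VpcPeer("right")
left.peer, right.peer = right, left

left.receive("MAC_A", "vpc-lower")    # step 1: learned on the lower VPC
right.receive("MAC_A", "peer-link")   # flooded copy over the Peer-Link: ignored
print(right.mac_table)
```

The right switch ends up pointing MAC A at its own leg of the lower VPC, which is what keeps traffic local.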

Page 48: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Troubleshooting Layer 2

[Diagram: flow between 91.0.0.10 and 20.1.2.3 through the VPC peers; MAC 0013.1908.e246 on Po50 (Vlan 50), Po22 (Vlan 20); both port-channels are VPCs]

MAC addresses should point to expected ports in expected vlans (path towards source)

nexus# sh mac address-table address 0013.1908.e246 vlan 50
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 50       0013.1908.e246    dynamic   0          F    F  Po50

The ports should be in STP forwarding mode

nexus# sh spanning-tree vlan 50 interface port-channel 50
Mst Instance     Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0002          Desg FWD 200       128.4145 (vPC) P2p

Hardware MAC address table should be consistent with the software table (the number after ‘address-table’ is the linecard slot number)

nexus# sh hardware mac address-table 2 address 0013.1908.e246 vlan 50
Valid| PI |  BD   |      MAC      |  Index | Stat| SW | Modi| Age | Tmr |
     |    |       |               |        |  ic |    | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1      1    161     0013.1908.e246  0x00a36  0     3    0     141   1

Finding port# for given index

nexus# sh system internal pixm info ltl 0x00a36 | i Eth.*,
0x0a36 Eth2/36,

nexus# sh mac address-table address 0021.55e0.66c2 vlan 20
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 20       0021.55e0.66c2    dynamic   660        F    F  Po22

nexus# sh spanning-tree vlan 20 interface port-channel 22
Mst Instance     Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0000          Desg FWD 200       128.4117 (vPC) Network P2p

nexus# sh hardware mac address-table 1 address 0021.55e0.66c2 vlan 20
Valid| PI |  BD   |      MAC      |  Index | Stat| SW | Modi| Age | Tmr |
     |    |       |               |        |  ic |    | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1      1    18      0021.55e0.66c2  0x00a32  0     2    0     103   1

nexus# sh system internal pixm info ltl 0x00a32 | i Eth.*,
0x0a32 Eth1/13, Eth1/14,

Page 49: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Troubleshooting Layer 3

[Diagram: flow between 91.0.0.10 and 20.1.2.3 through the VPC peers; MAC 0013.1908.e246 on Po50 (Vlan 50), Po22 (Vlan 20); both port-channels are VPCs]

Is there a route to the destination?

nexus# sh routing ip 20.1.2.3
...
20.1.2.3/32, ubest/mbest: 1/0
    *via 20.1.1.240, Vlan20, [1/0], 03:48:59, static

Is the next hop resolved?

nexus# sh ip arp 20.1.1.240
Address         Age       MAC Address     Interface
20.1.1.240      00:02:17  0021.55e0.66c2  Vlan20

Looking at module 2 because this is where the packets in question should be received

nexus# sh forwarding ip route 20.1.2.3 module 2
...
------------------+------------------+---------------------
Prefix            | Next-hop         | Interface
------------------+------------------+---------------------
20.1.2.3/32         20.1.1.240         Vlan20

Is the adjacency consistent with ARP?

nexus# sh forwarding adjacency 20.1.1.240 module 2
IPv4 adjacency information
next-hop         rewrite info      interface
--------------   ---------------   -------------
20.1.1.240       0021.55e0.66c2    Vlan20

Router MAC must have the Gateway flag in order for the packet to be L3 switched

nexus# sh int vl 20 | i address
Hardware is EtherSVI, address is 0023.ac66.1a42

nexus# sh mac address-table address 0023.ac66.1a42 vlan 20
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
G 20       0023.ac66.1a42    static    -          F    F  sup-eth1(R)

Page 50: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Where given packet will be load-balanced

For equal-cost routes

nexus# sh routing hash 91.0.0.10 20.1.2.3
Load-share parameters used for software forwarding:
load-share mode: address source-destination port source-destination
Universal-id seed: 0xcdb5769f
Hash for VRF "default"
Hashing to path *20.1.1.3 (hash: 0x2a), for route:
20.1.2.3/32, ubest/mbest: 2/0
    *via 20.1.1.3, Vlan20, [1/0], 00:01:37, static
    *via 20.1.1.240, Vlan20, [1/0], 16:32:42, static

Load-balancing is configurable under ‘ip load-sharing address’ in default VDC and affects all VDCs

For port-channels

nexus# sh port-channel load-balance forwarding-path interface port-channel 22 dst-ip 20.1.2.3 src-ip 91.0.0.10 vlan 20 module 2

Missing params will be substituted by 0's.

Module 2: Load-balance Algorithm: source-dest-ip-vlan

RBH: 0 Outgoing port id: Ethernet1/14

Load-balancing is configurable under ‘port-channel load-balance’ in default VDC and affects all VDCs

Use ‘sh port-channel rbh-distribution’ to see which link sends traffic for which of the 8 available load-balancing ‘buckets’

Page 51: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Datapath Drops

nexus# sh hardware internal errors all
----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
|------------------------------------------------------------------------|
| Device:R2D2 Role:MAC |
|------------------------------------------------------------------------|
Instance:7
ID    Name                                      Value            Ports
--    ----                                      -----            -----
28688 aric_no_port_select_error                 0000000000000002 1,3,5,7 I2
...
|------------------------------------------------------------------------|
| Device:Ashburton Role:MAC Mod: 1 |
|------------------------------------------------------------------------|
Instance:0
3629  Egress Port-1 VSL Dropped Packet Count    0000000853635833 5 -
3630  Egress Port-2 VSL Dropped Packet Count    0000000857893046 3 -
...
|------------------------------------------------------------------------|
| Device:Naxos Role:MAC SECURITY |
|------------------------------------------------------------------------|
Instance:0
ID    Name                                      Value            Ports
--    ----                                      -----            -----
106   m1_fab_p25_txq_tc0_drop_count             00000000000012af 2 -
...
|------------------------------------------------------------------------|
| Device:Metropolis Role:REWR |
|------------------------------------------------------------------------|
Instance:1
ID    Name                                      Value            Ports
--    ----                                      -----            -----
70    Krypton input controller zero portsel cnt 0000000000000038 18,20,22,24,26,28,30,32
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 |
|------------------------------------------------------------------------|
Instance:0
ID    Name                                      Value            Ports
--    ----                                      -----            -----
93    CL2 Invalid Pkt count                     00000008759cb9cb 1-32 I1
...

#1 command to look for hardware packet drops

Not every drop listed here is an actual data packet drop

Run several times to see if any counters increase at a rate similar to the traffic loss

To clear counters, use ‘clear statistics module-all device all’

Page 52: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Agenda

Initialization & Redundancy considerations

Spanning Tree

Traffic forwarding

1st hop redundancy

Multicast considerations

Page 53: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

1st hop redundancy with VPC

[Diagram: PC A and PC B attached via VPCs to two HSRP peers; left peer has Router MAC1 0001.0002.0003, right peer has Router MAC2 0005.0006.0007, both share Virtual MAC 0000.0c07.ac00. Frames shown as source MAC, destination MAC / source IP, destination IP: MAC_A, vMAC / IP A, IP B and MAC_B, vMAC / IP B, IP A]

HSRP/VRRP/GLBP used for 1st hop redundancy

Each VPC peer will L3 forward packets destined to its own Router MAC address

Both switches will L3 switch packets to the vMAC address as long as one of them is HSRP active or HSRP standby.

If both switches are in the HSRP listen state, they will not L3 switch packets to the vMAC

Page 54: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

First hop redundancy troubleshooting

HSRP

Standby peer (Nexus):

Interface Vlan1
  ip address 1.1.1.252/24
  hsrp 1
    ip 1.1.1.254

Nexus# sh hsrp brief
Interface   Grp Prio P State    Active addr   Standby addr   Group addr
Vlan1       1   100    Standby  1.1.1.253     local          1.1.1.254

Nexus# sh mac address-table address 0000.0c07.ac01
   VLAN     MAC Address      Type     age  Secure NTFY   Ports
---------+-----------------+--------+-----+------+------+-----------
G 1        0000.0c07.ac01    static   -    False  False  sup-eth1(R)

Active peer (Nexus2):

Interface Vlan1
  ip address 1.1.1.253/24
  hsrp 1
    ip 1.1.1.254

Nexus2# sh hsrp brief
Interface   Grp Prio P State    Active addr   Standby addr   Group addr
Vlan1       1   100    Active   local         1.1.1.252      1.1.1.254

Nexus2# sh mac address-table address 0000.0c07.ac01
   VLAN     MAC Address      Type     age  Secure NTFY   Ports
---------+-----------------+--------+-----+------+------+-----------
G 1        0000.0c07.ac01    static   -    False  False  sup-eth1(R)

Both peers will L3 forward packets destined to the vMac address as long as either peer in the VPC domain is in ‘active’ or ‘standby’ state for the corresponding group

Virtual mac address (vMac) will be installed in both peers

‘G’ (gateway) flag must be present on any MAC address for which the nexus is expected to L3 forward packets

Only the active peer will respond to ARP for the VIP
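The virtual MAC checked in the output above (0000.0c07.ac01 for group 1) follows the HSRP version 1 convention, where the last byte is the group number in hex. A minimal helper for computing the address to look up, with a hypothetical function name, could look like this:

```python
def hsrp_v1_vmac(group):
    """Return the HSRP version 1 virtual MAC for a group, in Cisco dotted notation."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 groups are 0-255")
    return "0000.0c07.ac{:02x}".format(group)

print(hsrp_v1_vmac(1))   # address to pass to 'sh mac address-table address ...'
```

This covers HSRP version 1 only; HSRP version 2, VRRP and GLBP use different virtual MAC ranges.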

Page 55: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

1st hop issue with some devices

[Diagram: PC A and Server B attached via VPCs to two HSRP peers; left peer has Router MAC1 0001.0002.0003, right peer has Router MAC2 0005.0006.0007, both share Virtual MAC 0000.0c07.ac00. Frames shown as source MAC, destination MAC / source IP, destination IP: MAC_A, vMAC / IP A, IP B; Router MAC1, MAC_B / IP A, IP B; MAC_B, Router MAC1 / IP B, IP A (dropped, X); callouts 1-5 mark the steps below]

1. PC A sends a packet to Server B

2. Left VPC switch will receive the packet and forward it to Server B; note that the source MAC of the outgoing packet will be that of Router1

3. Server B responding to PC A will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)

4. If the frame from B to A is load-balanced to the right switch, the MAC address of Router1 will point to the Peer-Link, and this is where the frame will be sent

5. Left switch will receive the frame from the Peer-Link and drop it

Why? Frames received from Peer-Link are never sent out of VPC except those without operational ports on ingress switch - egress port ASICs will drop the frame (VPC check)

Page 56: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Peer-Gateway : the workaround

[Diagram: PC A and Server B attached via VPCs to the two peers; with peer-gateway each peer lists Router MAC1 0001.0002.0003, Router MAC2 0005.0006.0007 and Virtual MAC 0000.0c07.ac00. Frame shown as source MAC, destination MAC / source IP, destination IP: MAC_B, Router MAC1 / IP B, IP A; callouts 1-2 mark the steps below]

With peer-gateway, both peers will install each other's router MACs in the L2 table, which will allow them to L3 forward traffic destined to either Router MAC

1. Server B responding to PC A will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)

2. Right switch will forward the packet towards the destination

Page 57: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Peer-Gateway : the implications

[Diagram: a top device attached via VPC to the two peers; each peer lists Router MAC1 0001.0002.0003, Router MAC2 0005.0006.0007 and Virtual MAC 0000.0c07.ac00. Frame shown: MAC_B, Router MAC1 / IP TOP, IP LEFT, TTL 1 (dropped, X); callouts 1-2 mark the steps below]

1. Top device attempts to establish an OSPF adjacency with the left switch

2. If peer-gateway is enabled in the VPC domain and the OSPF unicast packet is load-balanced to the right switch, this packet will be dropped

Why? Right switch will try to L3-switch the unicast packet (because RouterMAC1 is marked as a gateway MAC and the destination IP is not local). As the packet has TTL==1 it will be dropped

The same applies to any other protocol that uses unicast packets with TTL==1 entering the right switch but destined to the left switch (or vice versa)

Routing protocol peering with devices attached to the VPC domain via an SVI interface is not supported. A routed interface should be used in this case

There is a ‘peer-gateway exclude-vlan’ command to turn off peer-gateway on certain vlans
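The interaction between peer-gateway and TTL==1 packets can be followed with a small decision model. This is a simplified sketch of the behaviour described above, with hypothetical names and return strings; it ignores everything except the gateway MAC check and the TTL check.

```python
def l3_decision(dst_mac, ttl, dst_ip_is_local, local_mac, peer_mac, peer_gateway):
    """Model what one VPC peer does with a received unicast frame."""
    gateway_macs = {local_mac}
    if peer_gateway:
        gateway_macs.add(peer_mac)       # peer's router MAC is treated as a gateway MAC
    if dst_mac not in gateway_macs:
        return "bridge"                  # plain L2 forwarding, e.g. over the Peer-Link
    if dst_ip_is_local:
        return "deliver to control plane"
    if ttl <= 1:
        return "drop (TTL expired)"      # cannot be routed: TTL would reach 0
    return "route"

# Right switch (MAC2) receives an OSPF unicast packet addressed to the left switch (MAC1).
print(l3_decision("MAC1", 1, False, "MAC2", "MAC1", peer_gateway=True))
print(l3_decision("MAC1", 1, False, "MAC2", "MAC1", peer_gateway=False))
# Server B's reply to PC A, sent to Router MAC1 but hashed to the right switch.
print(l3_decision("MAC1", 64, False, "MAC2", "MAC1", peer_gateway=True))
```

The first call shows the adjacency problem, the third shows the case peer-gateway was designed to fix.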

Page 58: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC Agenda

Initialization & Redundancy considerations

Spanning Tree

Traffic forwarding

1st hop redundancy

Multicast considerations

Page 59: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

IP Multicast with VPC

[Diagram: Receiver behind a VPC, Source S1 and RP upstream; left peer is Primary and DR, right peer is Secondary (2ndary) and Proxy-DR; the IGMP join is relayed between the peers as CFS:IGMP; state shown: (*,G) with VPC in the OIF list on both peers, (S1,G) with VPC on one peer and null on the other]

Receiver sends IGMP report (join)

Access switch sends the join to the right VPC peer

Right VPC peer creates (*,G) and adds the VPC to the OIF list (as proxy-DR)

IGMP is encapsulated in CFS and sent to the left peer

Left peer (DR) creates (*,G), adding the VPC to the OIF list

DR (left peer) sends PIM Join to RP

Once (S1,G) traffic starts arriving, VPC peers will resolve which one will be forwarder for that (S,G): the peer with the best metric to the source, or the primary in a tie (this mechanism is specific to PIM in VPC mode, normally PIM would use assert)

Only the forwarder will have OIFs populated in (S,G); the non-forwarder won’t have VPC SVIs in its OIF list

Forwarder will send a copy of the frame to the peer-link for receivers single-connected to the other peer

Goal is to allow the peer receiving source traffic to forward it to receivers behind VPC without crossing the peer-link (VPC check will drop such traffic otherwise)

Page 60: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

IP Multicast with VPC: Prebuilt-SPT

[Diagram: Receiver behind a VPC, Source S1 and RP upstream; Primary peer is DR, Secondary (2ndary) peer is proxy-DR and becomes the new DR; state shown: (*,G) with VPC on both peers, (S1,G) with VPC on both peers with pre-built SPT, (S1,G) null otherwise]

In case of DR failure, proxy-DR becomes DR and posts the OIF-list from (*,G) to (S,G), but it will also need to pull traffic from RP/source, which delays recovery

With ‘ip pim pre-build-spt’ proxy-DR will also send a PIM Join to source/RP to draw the traffic

Traffic pulled by proxy-DR will be dropped until it becomes DR – provision uplink and replication bandwidth accordingly

Page 61: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

IP Multicast with VPC: source behind VPC

[Diagram: Source S1 behind VPC1, Receiver behind VPC2, RP upstream; Primary peer is DR, Secondary (2ndary) peer is Proxy-DR; state shown on both peers: (*,G) with VPC2, (S1,G) with VPC2]

When the Source is behind a VPC, both DR and Proxy-DR will add OIFs for the group to (S,G)

This is because either peer can receive source traffic and needs to be able to send it to receivers behind VPCs without crossing the peer-link (to avoid the traffic being dropped by VPC check)

When VPC is configured on N7K-F248XP-25 linecard (F2) there is no proxy-DR function (due to hardware specifics). Packet will be bridged to DR over peer-link (VPC check is modified accordingly for L3 multicast packets on F2 linecards)

Page 62: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Forwarder election in VPC

Peers do ‘metrics exchange’ over CFS for each new source

The peer that has the better metric to the source, or the primary in a tie, will be forwarder

VPC1# sh ip pim internal vpc rpf
Source: 10.0.1.1
  Pref/Metric: 110/21
  Source role: primary
  Forwarding state: Win (forwarding)

For sources behind VPC both peers will forward, as they have no control over which one will get the traffic…

VPC1# sh ip pim internal vpc rpf
Source: 1.1.1.1
  Pref/Metric: 0/0
  Source role: primary
  Forwarding state: Win-force (forwarding)
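The election rule above (better metric to the source wins, the primary wins a tie) can be sketched as a comparison of Pref/Metric pairs. The sketch assumes that lower preference and then lower metric is "better", as in ordinary unicast route selection; the dictionary layout and names are hypothetical, and the ‘Win-force’ case for sources behind VPC is not modelled.

```python
def elect_forwarder(local, peer):
    """Pick the forwarder for one source.

    Each argument is a dict with 'name', 'pref', 'metric' and 'is_primary'.
    """
    local_key = (local["pref"], local["metric"])
    peer_key = (peer["pref"], peer["metric"])
    if local_key != peer_key:
        # Assumption: lower preference, then lower metric, is the better path.
        return local["name"] if local_key < peer_key else peer["name"]
    # Tie: the VPC primary becomes the forwarder.
    return local["name"] if local["is_primary"] else peer["name"]

vpc1 = {"name": "VPC1", "pref": 110, "metric": 21, "is_primary": True}
vpc2 = {"name": "VPC2", "pref": 110, "metric": 41, "is_primary": False}   # hypothetical peer values
print(elect_forwarder(vpc1, vpc2))
```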

Page 63: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

VPC multicast: following packet flow

Control plane state for this group: where the information came from, whether it is stable (uptime), and the RPF interface

Nexus# show ip mroute 239.1.2.3
(*, 239.1.2.3/32), uptime: 06:46:05, igmp pim ip static
  Incoming interface: Vlan36, RPF nbr: 36.0.0.3
  Outgoing interface list: (count: 2)
    Ethernet2/43, uptime: 03:01:36, static
    Vlan37, uptime: 06:46:05, igmp

(33.0.0.33/32, 239.1.2.3/32), uptime: 06:46:05, ip pim mrib
  Incoming interface: Vlan36, RPF nbr: 36.0.0.3
  Outgoing interface list: (count: 2)
    Ethernet2/43, uptime: 03:01:36, mrib
    Vlan37, uptime: 06:46:04, mrib

Is traffic being switched for this group? Are packets being switched by this entry?

Nexus# show ip mroute 239.1.2.3 summary software-forwarded
Total number of routes: 3
Total number of (*,G) routes: 1
Total number of (S,G) routes: 1
Total number of (*,G-prefix) routes: 1
Group count: 1, rough average sources per group: 1.0

Group: 239.1.2.3/32, Source count: 1
Source      packets   bytes      aps  pps  bit-rate     oifs
(*,G)       0         0          0    0    0.000 bps    2  sw-pkts: 0
33.0.0.33   5046908   252345396  49   200  80.053 kbps  2  sw-pkts: 1

aps: average packet size

sw-pkts: packets forwarded in software

Counters are updated once per ~1 minute

Where are receivers on this vlan?

Nexus# show ip igmp snooping groups vlan 37
Type: S - Static, D - Dynamic, R - Router port
Vlan  Group Address  Ver  Type  Port list
37    */*            -    R     Vlan37
37    239.1.2.3      v2   D     Eth2/8
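As a quick sanity check of the ‘summary software-forwarded’ counters above, the average packet size (aps) can be recomputed from the packets and bytes columns. That the device rounds down is an assumption, although it matches the value shown for 33.0.0.33.

```python
# Counters for source 33.0.0.33 taken from the 'show ip mroute ... summary' output.
packets = 5046908
byte_count = 252345396

# Assumption: aps is the integer part of bytes / packets.
average_packet_size = byte_count // packets
print(average_packet_size)
```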

Page 64: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Following the flow: forwarding information

This is platform independent forwarding information

Nexus# show forwarding multicast route group 239.1.2.3

slot 1
=======

(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:0 Bytes:0
    Ethernet2/43 Outgoing Packets:N/A Bytes:N/A

(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
  Received Packets: 5723369 Bytes: 366295616
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:0 Bytes:0
    Ethernet2/43 Outgoing Packets:N/A Bytes:N/A

slot 2
=======

(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:5725816 Bytes:366452224
    Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816

(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:5725816 Bytes:366452224
    Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816

Ingress linecard entry (slot 1)

Egress linecard entry (slot 2)

Counters are updated once per ~1 minute

Counters between ingress/egress do not have to match, as the information is not collected at exactly the same time, a receiver might join after the entry was created, etc

Page 65: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

When traffic arrives via VPC

How to find which slot receives the S,G flow when ingress interface is port-channel scattered across several modules?

show forwarding multicast route group <g> source <s>

Nexus# show forwarding multicast route group 239.1.1.1 source 1.0.1.2 | i Received|slot
slot 1
Received Packets: 0 Bytes: 0
slot 2
Received Packets: 727203 Bytes: 487290999
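With many linecards the filtered output above can be scanned automatically for the slot whose ‘Received Packets’ counter is non-zero. A minimal sketch, assuming the output format shown above and using a hypothetical function name:

```python
import re

def ingress_slots(output):
    """Return the slot numbers that report a non-zero 'Received Packets' counter."""
    slots = []
    for slot, packets in re.findall(r"slot (\d+)\s+Received Packets:\s*(\d+)", output):
        if int(packets) > 0:
            slots.append(int(slot))
    return slots

sample = """slot 1
Received Packets: 0 Bytes: 0
slot 2
Received Packets: 727203 Bytes: 487290999
"""
print(ingress_slots(sample))
```

Because the counters are only refreshed about once a minute, a slot that shows zero immediately after the flow starts may simply not have been updated yet.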

Page 66: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Are there drops in forwarding path?

Start looking from Ingress module

Nexus# show hardware internal errors module 1
----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
...
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 Mod: 1 |
| Last cleared @ Thu Apr 8 12:57:37 2010
| Device Statistics Category :: ERROR
|------------------------------------------------------------------------|
Instance:0
ID   Name                               Value            Ports
--   ----                               -----            -----
259  L3 Fib Miss Pkt ctr                0000000000000007 1-32 I1
262  L3 Non-Rpf Drop Pkt ctr            0000000000125617 1-32 I1
319  NF2 V4 IPMAC Lkup Error            0000000000272277 1-32 I1
455  Exception cause: DROP (Unicast)    0000000000025510 1-32 I1
465  Exception cause: DROP (Multicast)  0000000000226148 1-32 I1

Always take several snapshots and look for drops that grow coherently with [suspected] multicast traffic drops

There are always some drops shown by above command – this doesn’t always mean the actual network packets are dropped. Some of these are diag packets, some are packets that are dropped on blocked ports, extra floods etc

Page 67: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Review & Summary

Infrastructure

Redundancy at process, supervisor, port-channel, chassis, VPC level

Both peers are needed to bring up VPCs; auto-recovery/reload-restore can change this

Peer-Keepalive + Role defines behavior during VPC failovers

Forwarding

Traffic locality (VPC check) + No learning on Peer-Link

No blocking ports (generally), but common L2 stability mechanisms still important (LACP active, UDLD, BA, Dispute)

Interfacing with L3 requires separate links + cross link

Troubleshooting

Layered, always take basic info, narrow down to a layer/issue type before trying to recover

Data plane – troubleshoot each peer like normal switch paying attention to nuances like VPC check, dual-DR and Router-MACs

Page 68: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Recommended Reading

Please visit the Cisco Store for suitable reading.

Page 71: BRKCRS-3146 - Advanced VPC Operation and Troubleshooting

Thank you.