![Page 1: Learned MLAG on Linux - Lessons - NetDev conf on Linux - Lessons Learned.pdfISL - inter switch link Dually connected Singly connected Primary role Secondary role. MLAG use case - hypervisor](https://reader033.vdocuments.net/reader033/viewer/2022060523/6052e4f6dc67164a6f6ac586/html5/thumbnails/1.jpg)
MLAG on Linux - Lessons Learned
Scott Emery, Wilson Kok
Cumulus Networks Inc.
Agenda
● MLAG introduction and use cases
● Lessons learned
● MLAG control plane model
● MLAG data plane
● Linux kernel requirements
● Other important changes and considerations
MLAG introduction
MLAG - a LAG across more than one node
● multi-homing for redundancy
● active-active to utilize all links which otherwise may get blocked by Spanning Tree
● no modification of LAG partner
MLAG terminology
● ISL - inter switch link
● Dually connected
● Singly connected
● Primary role
● Secondary role
MLAG use case - hypervisor
[Diagram: two hypervisor hosts, each with a kernel virtual switch, VMs, and uplinks eth0 and eth1 to a pair of switches]
● no MLAG - striping by VM MACs or other policies
● MLAG - it’s a bond
MLAG use case - L2 fabric
● no blocking links, full utilization of bandwidth
● load balancing and redundancy offered by LAG
Lessons learned
● L2 can be dangerous! Fail open by default, no TTL, unknown means flood...
● MLAG - more ways to live dangerously
● Rigorous and conservative interface state management needed. Temporary loops or duplicates not acceptable
Lessons learned
● Fast convergence depends on a lot of things done right:
  ○ Proper daemon up/down sequences:
    ■ UP: STPd up > MLAGd up > interface enable
    ■ DOWN: interface disable > MLAGd down > STPd down
  ○ Avoid split brain as much as possible:
    ■ changing LACP system id flaps bonds
    ■ have multiple heart beat channels between MLAG daemons
● Failures other than link and node down do happen (e.g. daemon crash) and should not melt the network
  ○ Need to fail closed, e.g. monit clean up
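The up/down ordering can be sketched in a few lines of Python. This is only an illustration: the component names `stpd`, `mlagd` and `interfaces` are placeholders for the STP daemon, the MLAG daemon and the interface enable step, not real service names. The point it shows is that the DOWN sequence is the exact reverse of the UP sequence.

```python
# Startup order taken from the slide; shutdown is the exact reverse.
# The names are placeholders, not real service names.
UP_SEQUENCE = ["stpd", "mlagd", "interfaces"]


def ordered_steps(direction):
    """Return (component, action) pairs in the order they should run."""
    if direction == "up":
        return [(name, "start") for name in UP_SEQUENCE]
    if direction == "down":
        return [(name, "stop") for name in reversed(UP_SEQUENCE)]
    raise ValueError(f"unknown direction: {direction!r}")


for step in ordered_steps("down"):
    print(step)
```

Deriving both orders from one list keeps them from drifting apart when a new component is added.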
MLAG control plane model
● Linux kernel enforces default interface state on MLAG enabled interfaces
● User space MLAG daemon maintains MLAG configuration, controls the formation of MLAG and updates interface state and data path
● Analogous to the user space Spanning Tree model
MLAG data plane
● L2 must never have loops, redundant paths are blocked
● But want to utilize all links, cannot block
Answer:
● Make the links appear logically the same for the protocols that are supposed to protect you!
MLAG data plane rules
● same packet is not delivered to a node more than once
● packet sourced from a dually connected node is not delivered back to the same node
This means packets crossing the ISL and destined to:
● dually-connected links => drop
● singly-connected links => forward
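The ISL rule can be written as a small forwarding predicate. This is a sketch, not bridge code: `PortKind` and `may_forward` are illustrative names, and ordinary bridging checks (such as never sending a frame back out its ingress port) are left out. The reasoning behind the drop is that the peer switch has its own local link to every dually connected node, so a copy arriving over the ISL would be either a duplicate or a reflection back to the sender.

```python
from enum import Enum


class PortKind(Enum):
    ISL = "isl"
    DUAL = "dually-connected"
    SINGLE = "singly-connected"


def may_forward(ingress, egress):
    """Return False only for frames that arrived on the ISL and would
    leave on a dually connected link; everything else is left to
    normal bridging."""
    if ingress is PortKind.ISL and egress is PortKind.DUAL:
        return False
    return True


print(may_forward(PortKind.ISL, PortKind.DUAL))    # False
print(may_forward(PortKind.ISL, PortKind.SINGLE))  # True
```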
Minimum Linux kernel requirements
● ability to set LACP system ID on bond independent of bond mac address
● mlag_enable attribute on bond
● mechanism to keep member interface carrier down independent of admin state
  ○ IFF_PROTO_DOWN
● duplicate filtering of packets crossing the ISL
Interface bring up
● user enables an mlag (bond with mlag_enable = 1)
  ○ bonding driver keeps the bond and all its slaves down
● MLAG daemon puts bond in dormant interface mode to begin
● when MLAG daemon peering is complete
  ○ sets mlag LACP system id on bond (802.3ad mode)
  ○ brings slaves up
  ○ LACP can run, no data traffic
  ○ LACP converges, bond moves from oper down to oper dormant
● MLAG daemon verifies MLAG membership, installs egress filter, then sets bond to oper up
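The bring-up sequence reads naturally as a small state machine. The sketch below is one possible model, with state and event names invented for illustration (they are not kernel or daemon identifiers). What it captures is the ordering guarantee: a bond cannot reach oper up unless peering completed, LACP converged, and the egress filter was installed, in that order.

```python
# Allowed transitions for one MLAG bond. State and event names are
# illustrative; any (state, event) pair not listed here is rejected.
TRANSITIONS = {
    # LACP system id set and slaves brought up; LACP runs, no data traffic
    ("held_down", "peering_complete"): "lacp_running",
    ("lacp_running", "lacp_converged"): "oper_dormant",
    # membership verified and duplicate filter in place
    ("oper_dormant", "egress_filter_installed"): "oper_up",
}


def advance(state, event):
    """Return the next state, or raise ValueError for an out-of-order event."""
    key = (state, event)
    if key not in TRANSITIONS:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")
    return TRANSITIONS[key]


state = "held_down"
for event in ("peering_complete", "lacp_converged", "egress_filter_installed"):
    state = advance(state, event)
print(state)  # oper_up
```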
Split brain handling
● MLAG daemon pair cannot talk to each other
  ○ ISL down but MLAG daemons alive
● MLAG daemon with secondary role keeps all MLAGs in down state with IFF_PROTO_DOWN
● IFF_PROTO_DOWN indicates to kernel to not bring bond slaves carrier up until it is cleared
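A sketch of the secondary's reaction, under stated assumptions: the commands are only built as argument lists and never executed, the bond names are hypothetical, and it assumes an iproute2 and driver combination where `ip link set dev <name> protodown on` is supported for the interface. The slide does not say what the primary does in a split brain, so this sketch returns no commands for it.

```python
def protodown_commands(role, peer_reachable, mlag_bonds):
    """Build (but do not run) the commands a secondary would use to hold
    its MLAG bonds down when it cannot reach its peer."""
    if role not in ("primary", "secondary"):
        raise ValueError(f"unknown role: {role!r}")
    if role == "secondary" and not peer_reachable:
        return [
            ["ip", "link", "set", "dev", bond, "protodown", "on"]
            for bond in mlag_bonds
        ]
    # Primary behaviour is not specified on the slide; nothing is held down here.
    return []


for cmd in protodown_commands("secondary", False, ["bond1", "bond2"]):
    print(" ".join(cmd))
```

Clearing the flag again is deliberately left out: per the bring-up sequence, a bond should only come back after peering and filter installation have been redone.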
Duplicate Filtering
Packet ingress on ISL should only egress on singly connected links
● use ebtables: -i <ISL> -o <dually connected interface> -j DROP
● rule MUST be installed before dually connected interface is oper up
● rule MUST be uninstalled as soon as interface becomes singly connected
One rule per dually connected interface, not scalable, especially in the case of non VLAN-aware bridge model with many bridges and many VLANs. Better if:
● ebtables can filter on the parent interface, e.g. eth1 instead of eth1.100, eth1.101, eth1.102….
● or bridge driver can make use of the knowledge of which link is ISL and which are dual-connected
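To make the scaling concern concrete, the sketch below generates the rule set for the non VLAN-aware model. Assumptions: the rules go in the ebtables FORWARD chain (the slide gives only the match and target), the interface names `peerlink`, `bond1` and `bond2` are hypothetical, and each VLAN uses `<parent>.<vlan>` sub-interfaces on both the ISL and the bonds. The commands are built as argument lists and not executed.

```python
def duplicate_filter_rule(isl, dual_port):
    """One ebtables rule: drop frames that came in on the ISL and would
    leave on a dually connected interface. FORWARD chain is an assumption."""
    return ["ebtables", "-A", "FORWARD", "-i", isl, "-o", dual_port, "-j", "DROP"]


def per_vlan_rules(isl, dual_bonds, vlans):
    """Rules needed when every VLAN has its own sub-interface and bridge."""
    return [
        duplicate_filter_rule(f"{isl}.{vlan}", f"{bond}.{vlan}")
        for vlan in vlans
        for bond in dual_bonds
    ]


rules = per_vlan_rules("peerlink", ["bond1", "bond2"], range(100, 103))
print(len(rules))  # 6
print(" ".join(rules[0]))
```

The rule count is the number of VLANs times the number of dually connected bonds, which is why matching on the parent interface, or teaching the bridge driver about the ISL, would scale better.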
Possible other Linux kernel requirements
● interface attribute to indicate ISL
● knowledge of the ‘dual-connectedness’ of a link
● knowledge of mlag id of interfaces
● bridge filtering modifications based upon above
Other important changes and considerations
● Spanning Tree changes
● MAC address management
● IGMP group membership handling
● MLAG control traffic treatment
Spanning Tree Changes
● STP daemon connects to MLAG daemon and learns
  ○ which is ISL
  ○ singly/dually connected interfaces and their MLAG id
  ○ when MLAG peering is up or down
● STP needs to run as if the two switches are one. Multiple approaches possible:
  ○ master STP daemon runs the protocol and maintains full state sync with the slave STP daemon
  or
  ○ each STP daemon does independent calculation. Loosely coupled, distributed processing
● Loosely coupled model is simpler and more scalable
Spanning Tree - Loosely coupled model
● use common bridge id (MLAG system id) when generating BPDUs
● only MLAG primary switch sends BPDU on dually-connected links
● both MLAG switches send BPDU on singly-connected links
● BPDU received on root port is processed and also relayed across ISL, replace source MAC with MLAG id
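The BPDU transmit rules for the loosely coupled model reduce to a two-input decision. The function below is a sketch with illustrative string values for role and port kind; it covers only who originates BPDUs on access-facing links and leaves out the relay across the ISL.

```python
def sends_bpdu(role, port_kind):
    """Decide whether this switch sends BPDUs on a port.

    role: "primary" or "secondary" (MLAG role of this switch)
    port_kind: "dual" or "single" (how the attached node is connected)
    """
    if role not in ("primary", "secondary"):
        raise ValueError(f"unknown role: {role!r}")
    if port_kind == "single":
        return True               # both switches send on singly connected links
    if port_kind == "dual":
        return role == "primary"  # only the primary sends on dually connected links
    raise ValueError(f"unknown port kind: {port_kind!r}")


print(sends_bpdu("secondary", "dual"))    # False
print(sends_bpdu("secondary", "single"))  # True
```

Because both switches use the same bridge id, the LAG partner sees BPDUs from one logical bridge whichever physical link they arrive on.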
MAC address management
Goals
● reduce unknown flood
● eliminate constant MAC moves between ISL and MLAG
Solution
● disable learning on ISL
● synchronize MAC addresses
  ○ install address learned on MLAG on one side to corresponding MLAG on the other side
  ○ install address learned on singly connected link on ISL on other side
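The two synchronization rules amount to a mapping from where an address was learned locally to where the peer should install it. A minimal sketch, assuming the peer keeps a table from MLAG id to its own bond name; the function name, the table and the interface names are all hypothetical.

```python
def peer_install_port(port_kind, mlag_id, peer_mlag_ports, peer_isl):
    """Return the port on the peer switch where a locally learned MAC
    address should be installed.

    port_kind: "dual" or "single" (kind of the local port that learned it)
    mlag_id: MLAG id of the local port (ignored for singly connected ports)
    peer_mlag_ports: mapping of MLAG id -> bond name on the peer
    peer_isl: name of the ISL interface on the peer
    """
    if port_kind == "dual":
        return peer_mlag_ports[mlag_id]  # same MLAG, other side
    if port_kind == "single":
        return peer_isl                  # only reachable across the ISL
    raise ValueError(f"unknown port kind: {port_kind!r}")


peer_ports = {1: "bond1", 2: "bond2"}
print(peer_install_port("dual", 2, peer_ports, "peerlink"))       # bond2
print(peer_install_port("single", None, peer_ports, "peerlink"))  # peerlink
```

With learning disabled on the ISL, these installed entries are the only way the peer knows about addresses behind the other switch, which is what stops the constant MAC moves.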
IGMP Snooping
MLAG daemons synchronize between themselves:
● IGMP group membership for dually connected links
● mrouter port information
● reports/queries may need to be flooded, the same duplicate filtering rule applies
MLAG control traffic
control traffic shares the ISL with data traffic, needs to be
● given higher priority
● independent of data traffic topology change - use a separate VLAN device on the ISL which is not bridged
While we’re at it...
● VLAN-aware bridge driver
  ○ great enhancement!
  ○ more work needed
    ■ scalability: vlan range*, per port per vlan local fdb*
    ■ usability: limited to single STP instance, per bridge igmp snooping control
● Bonding driver
  ○ a few issues with slave active state setting and MUX machine transitions*
(*patches submitted upstream)
Thank You!