perspective of virtual switching fabric€¦ · the average hash bucket’s list length cbp-input...

22
© 2018 VMware Inc. All rights reserved. Perspective of Virtual Switching Fabric [email protected]

Upload: others

Post on 18-Sep-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

© 2018 VMware Inc. All rights reserved.

Perspective of Virtual Switching [email protected]

Page 2: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Agenda

2

q Overview of Fabric Structure

q Emulation & Demonstration

q Performance Evaluation

q Use cases Imagination

Page 3: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

What is the Fabric?A L2 virtual networking using smart routing

o node naming: spine vSwitch node vs leaf vSwitch nodeo pseudo wire service: Ethernet-Line(E-Line)o pseudo lan service: Ethernet-LAN(E-LAN)

o mac based forwardingo emulate LAN at the core(built-in multicast tree, replication on-demand)

o path optimizationo adjacency next-hop count as weighto link workload as weighto for E-LAN, minimum spanning treeo For E-LINE, shortest path

E-LAN

E-LINE

Page 4: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

What is the Fabric? contdtransport layer consideration

o Ethernet over MPLS ? RFC448o path selection and tenant identificationo encapsulation overhead: 12+2+4=18 bytes

o mitigate TOR variables:o PF and VF mac addresses.

o hardware independencyo NICs supported by DPDKo high volume switch

B-dmac B-smac mpls(0x8847) L2 frame

Page 5: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Why Fabric?x86 platform networking capability

o data path essenceo hardware IO capability(PCIe Gen3 x8GT/s TL bandwidth utilization :66%)

from infrastructure network into host(hypervisor) memoryvpp benchmark: 480Gbps L3 forwarding, does not scale any more due to nic/PCI bus limitation

o memory bandwidth is bottleneck of virtual network from vm to vm(host/hypervisor)experiment: 60Gbps intra-numa-node ,two times memory copy, does not scale wellmemory movement consumes too much cpu time and memory bandwidth

o let dedicated servers as fabric do virtual networking!o more agility and hardware dependency than specific hardware solution.

o as the intrinsic requirement of NFV

Page 6: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

leaf vswitch

customer service port

Vlan basedservice selection

E-LINE

E-LAN

Label pushing&next hop forwarding

customer backbone port

spine vswitch leaf vswitch

provider backbone port

unicast

multicast

E-LINE

E-LAN

NHLFE selection Label swapping&replication

customer backbone portcustomer service port

Label basedservice selection

Label stripping&Vlan insertion

Fabric constitutionvirtual switch internals

Page 7: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Agenda

7

q Overview of Fabric Structure

q Emulation & Demonstration

q Performance Evaluation

q Use cases Imagination

Page 8: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Emulated environment pre-setup

Linux Guest(VMs)

0 1 2 3 4 5 6 7

Linux Host(Hyperv…) br-ac1 br-lan1 br-lan2 br-ac2

Leaf Spine Leaf

0 2 4 6 8 10net-nsnet-ns

§ Qemu KVM emulated guest, both host and guest OS are CentOS 7§ 8 x e1000 devices with VLAN stripping and insertion offloading capability§ 4 x ovs bridges for 4 LANs emulation(2 attachment circuit lans and 2 common lans)§ Port 0 and port 7 are for customers and port 1-6 are for fabric switches§ For fabric ports, once taken over by e3datapath, re-index them [0,2,4,6,8,10]§ For customer ports, use Linux namespace and vlan sub-interfaces to segregate themselves

Page 9: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

E-line servicecsp port 0 cbp port 2 pbp port 4 pbp port 6 cbp port 8 csp port 10

create two e-line services: 0 and 1

Vlan1000 ----> e-line 0

E-line 0 next hop(to port 4) via port2 with label:1

Port 4 with label 1,knows next hop(to port 8) via port 6 with label 100

Port 8 with label 100 goes to e-line1

e-line1--->vlan 2000

vlan 2000--->e-line1

E-lan1 next hop(to port 6) via port 8 with label:10000

Port 6 with 10000,it knows next hop(to port 2 ) with label 10.

Page 10: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

E-lan service multicast forwardingcsp port 0 cbp port 2 pbp port 4 pbp port 6 cbp port 8 csp port 10

create two e-lan services: 0 and 1 , and multicast next hop list:0

Vlan3000 ----> e-lan 0

E-lan 0,find no fwd entry, multicast next hop(to port 4) via port2 with label:2

Port 4 with label 2, goes to multicast list0,perform RPF check and send replication (to port 8) via port 6 with label:101

Port 8 with label 101 goes to e-lan1

e-lan1--->vlan 4000

vlan 4000--->e-lan1

E-lan1 still finds no fwd entry, use multicast nexthop(to port 6)via port 8 with label:10001

Port 6 with 10001,does multicast forwarding, finally goes to port 2 via port 4 with label:11

Page 11: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

E-lan service unicast forwarding§ At leaf virtual switch, a fwd entry is found with deterministic <label, nhlfe>§ At spine virtual switch, single next hop is bound to input label entry, no multicast list is searched

Page 12: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Agenda

12

q Overview of Fabric Structure

q Emulation & Demonstration

q Performance Evaluation

q Use cases Imagination

Page 13: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Phy-VS-Phy

VPP Performance at Scale

64B1518B0.0

200.0400.0600.0

[Gbps]]

480Gbps zero frame loss

64B1518B0.0

100.0200.0300.0[Mpps]

200Mpps zero frame loss

64B0

200400600[Gbps]]

IMIX => 342 Gbps,1518B => 462 Gbps

64B0

100200300

[Mpps]

64B => 238 Mpps

IPv6, 24 of 72 cores IPv4+ 2k Whitelist, 36 of 72 cores

Zero-packet-loss Throughput for 12 port 40GE

Hardware:

Cisco UCS C460 M4Intel® C610 series chipset4 x Intel® Xeon® Processor E7-8890v3(18 cores, 2.5GHz, 45MB Cache)2133 MHz, 512 GB Total9 x 2p 40GE Intel XL71018 x 40GE = 720GE !!

Latency18 x 7.7trillion packets soak testAverage latency: <23 usecMin Latency: 7…10 usecMax Latency: 3.5 ms

HeadroomAverage vector size ~24-27Max vector size 255Headroom for much more throughput/featuresNIC/PCI bus is the limit not vpp

https://fd.io/resources/

Page 14: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Max Quad Links rx/tx rate

0102030405060708090

100

bidirectional 2x bidirectional 3x bidirectional 4x singly directional 4x

bidirectional 4x

Max Quad Links Rx/Tx rate in Mpps

64-bytes 1514-bytes

46Gbps52Gbps

60Gbps

60Gbps 60Gbps

[Mpps]

Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz with 20M L3 cache

Page 15: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

vSwitching processing complexity

items spatial complexity temporal complexity

csp-input 2^12 vlan entries per csp interface o(1) to find vlan distribution entry

e-line-forward 1 <vlan,interface> entry and 1 <label,nhlfe> entry

all o(1) to find the fwd entries

e-lan-forward 64 <vlan,interface> and 64 <label,nhlfe> and 2^16 fib base entry per e-lan and [n/48,n] fib

entry

O(m/48) to find the mac fwd entry, where m is the average hash bucket’s list length

cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry

pbp-input 2^20 label entries per pbp interface o(1) to find unicast fwd nhlfeo(n) to enumerate multicast entries where n by

default set to 64

Page 16: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Performance expectation

o simpler forwarding logico dpdk native context o burst-oriented and cache optimized and fast indexo expected to scale out with ports across the vSwitch datapath

Page 17: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Agenda

17

q Overview of Fabric Structure

q Emulation & Demonstration

q Performance Evaluation

q Use cases Imagination

Page 18: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

vSwitching fabric

Fabric structure reviewbasic end-system accessing

hypervisor PF and VFs(sr-iov)apply QoS and VLAN

vm container bm

leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch

spine vSwitch spine vSwitch spine vSwitch

hypervisor hypervisor hypervisor

Multi-homed

make it bipartite graph ifignoring inter-spine-vswitch connections

attachment circuit LAN

• Try to fully migrate L2 virtual network into infrastructure network domain

Page 19: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Fabric structure review contdbasic end-system accessing

hypervisor

vm container bm

leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch

spine vSwitch spine vSwitch spine vSwitch

hypervisor hypervisor hypervisor

E-LANE-LINE

• local vlan scope• vlan designated forwarder for multi-homed leaf vSwitches• hairpin traffic optimized• BUM traffic optimized

vlan100

vlan1000 vlan2000

vlan101

vlan1001

vlan102vlan102

vSwitching fabric

Page 20: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Native HA viewService continuity

hypervisor

leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch

spine vSwitch spine vSwitch spine vSwitch

hypervisor hypervisor hypervisor

• only active ether-service in forwarding state• ether-service failure can be detected by in-band OAM or(and) controller• controller assists ether-service failover• populate <ff:ff:ff:ff:ff:ff,smac,vlan> to fast update AC lan’s mac table

vlan100

vlan1001activestandby

vSwitching fabric

Page 21: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

Native ECMP viewservice continuity and load balancing

hypervisor

leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch leaf vSwitch

spine vSwitch spine vSwitch spine vSwitch

hypervisor hypervisor hypervisor

• each ether-service is active• ether-services are detected by iOAM or(and) controller• failover can be achieved by disabling inactive ether-service

vlan100

vlan1001activeactive

vSwitching fabric

Page 22: Perspective of Virtual Switching Fabric€¦ · the average hash bucket’s list length cbp-input 2^20 label entries per cbp interface O(1) to find label distribution entry pbp-input

More use case featuresas a link-level L2 network virtualization solution

• We may still borrow ideas from compute virtualization• Snapshot or backup and restore your virtual network• Live migration for your virtual network • Virtual network High Availability• Virtual network fault tolerance (as ECMP?)