22 - idnog03 - christopher lim (mellanox) - efficient virtual network for service providers

Christopher Lim, Sr. Engineer July 2016 Mellanox Efficient Virtual Network for Service Providers

Upload: indonesia-network-operators-group

Post on 16-Apr-2017


Page 1: 22 - IDNOG03 - Christopher Lim (Mellanox) - Efficient Virtual Network for Service Providers

Christopher Lim, Sr. Engineer July 2016

Mellanox Efficient Virtual Network for Service Providers

Page 2

© 2016 Mellanox Technologies 2- Mellanox Confidential -

Leading Supplier of End-to-End Interconnect Solutions

Store, Analyze: Enabling the Use of Data

Software, ICs, Switches/Gateways, Adapter Cards, Cables/Modules

Comprehensive End-to-End InfiniBand and Ethernet Portfolio (VPI)

Metro / WAN, NPU & Multicore (NPS, TILE)

Page 3

Cloud-Native NFV Architecture Dictates Efficient Virtual Network

Mellanox EVN: Foundation for Efficient Telco Cloud Infrastructure

Efficient Virtual Network: Enabling High-performance, Reliable and Scalable Infrastructure for Cloud Service Delivery

AUTOMATION, ACCELERATION, VIRTUALIZATION

• Compute: Higher Workload Density
• Network: Line Rate Packet Processing
• Storage: Higher IOPS, Lower Latency

Page 4

SR-IOV – Overcome Compute Virtualization Penalty

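[Diagram: two groups of VMs on one Hypervisor with a Single Root I/O Virtualization (SR-IOV) capable NIC. Software-switched VMs each use a Virtual NIC attached to the Virtual Switch, which reaches the NIC through the PF Driver and the Physical Function. SR-IOV VMs (VM 1, VM 2 ... VM N) each load a VF Driver and reach their own Virtual Function over the PCIe Bus. The Physical Function and Virtual Functions connect to the NIC Embedded Switch.]

Application Direct Access to achieve bare metal I/O performance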

VMs leveraging SR-IOV and Mellanox eSwitch for near-line-rate performance without CPU overhead

Software-switched VMs suffering from compute virtualization penalty
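As a rough illustration of how the Virtual Functions shown on this slide come into existence on a Linux host, the sketch below writes to the kernel's standard `sriov_numvfs` sysfs attribute. It is a minimal sketch, not Mellanox's tooling: the function name `set_num_vfs` and the `sysfs_root` parameter are assumptions made for this example, and NIC firmware and BIOS settings that SR-IOV also depends on are out of scope.

```python
from pathlib import Path


def set_num_vfs(iface, num_vfs, sysfs_root="/sys/class/net"):
    """Request `num_vfs` SR-IOV Virtual Functions on the physical function `iface`.

    Returns the VF count that was written. `sysfs_root` is a parameter only
    so the function can be exercised against a fake directory tree.
    """
    dev = Path(sysfs_root) / iface / "device"

    # sriov_totalvfs reports the maximum number of VFs the device supports.
    total = int((dev / "sriov_totalvfs").read_text().strip())
    if not 0 <= num_vfs <= total:
        raise ValueError(f"{iface} supports 0..{total} VFs, got {num_vfs}")

    numvfs_file = dev / "sriov_numvfs"
    current = int(numvfs_file.read_text().strip())
    if current == num_vfs:
        return current
    if current != 0 and num_vfs != 0:
        # The kernel rejects a change from one non-zero count to another,
        # so tear down the existing VFs first.
        numvfs_file.write_text("0")
    numvfs_file.write_text(str(num_vfs))
    return num_vfs
```

Run as root, a call such as `set_num_vfs("enp3s0f0", 4)` (the interface name is hypothetical) would ask the PF driver to create four VFs, each of which can then be passed through to a VM.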

Page 5

SR-IOV + DPDK: Better Together with Mellanox PMD

[Diagram: VM 1, VM 2 ... VM N above the Hypervisor, each running the DPDK Library with the Mellanox DPDK PMD and attached over the PCIe Bus to its own Virtual Function on the Single Root I/O Virtualization (SR-IOV) capable NIC. The Virtual Functions connect to the NIC Embedded Switch.]

Further accelerate packet processing performance by eliminating interrupts and context switches

Page 6

Mellanox Sets New DPDK Performance Records

Superior DPDK Packet Performance at Various Frame Sizes (Lx 40G)

Frame Size (In Bytes)    Frames per Second (In Millions)
64                       42.11
128                      30.58
256                      17.96
512                      9.36
1024                     4.78
1280                     3.83
1518                     3.24

Test setup:
• ConnectX-4 Lx 40GbE, single port
• 4 cores dedicated to DPDK

Product              Single-port TCP Throughput    DPDK 64B Packet Throughput
ConnectX-4 100G      93.4 Gb/s                     74.4 million p/s
ConnectX-4 Lx 40G    37.6 Gb/s                     42.1 million p/s
ConnectX-4 Lx 25G    23.5 Gb/s                     34 million p/s
ConnectX-4 40G       37.6 Gb/s                     56.4 million p/s
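To put the frame-rate figures on this slide in context, the sketch below computes the theoretical Ethernet line rate in packets per second and compares it with the measured ConnectX-4 Lx 40GbE values. The 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) are standard Ethernet; the function name and the layout of the measured data are choices made for this example.

```python
# Per-frame overhead on the wire that is not counted in the frame size:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps, frame_bytes):
    """Theoretical maximum frame rate, in millions of frames per second."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6


# Measured values from the slide (ConnectX-4 Lx 40GbE, 4 cores for DPDK).
measured_40g_mpps = {
    64: 42.11, 128: 30.58, 256: 17.96, 512: 9.36,
    1024: 4.78, 1280: 3.83, 1518: 3.24,
}

for size, mpps in measured_40g_mpps.items():
    limit = line_rate_mpps(40, size)
    share = 100 * mpps / limit
    print(f"{size:>5} B: {mpps:6.2f} of {limit:6.2f} Mpps ({share:5.1f}% of line rate)")
```

Under this formula the measured rate sits below the 40GbE limit at 64-byte frames and approaches it as frames grow, which is the usual pattern when per-packet processing cost, not link bandwidth, is the bottleneck for small frames.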

Page 7

Turbocharge Overlay Networks with ConnectX-3/4 NICs

Solution:
• Overlay Network Accelerators in NIC
• Penalty-free overlays at bare-metal speed
• Integrated and validated by major SDN vendors

Benefits:
• 37.5Gb/s on 40G link, >2X compared to without VXLAN offload
• On a 20-core system, 7 cores are freed to run additional VMs, saving 35% of total cores while doubling the throughput!
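For readers unfamiliar with what an overlay adds to each packet, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348, which is part of the per-packet work that the slide's NIC offload is meant to absorb. It is illustrative only: the function names are made up for this example, and the outer Ethernet, IP and UDP headers that complete the encapsulation are omitted.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # the "I" flag: the VNI field is valid


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header):
    """Return the VNI carried in a VXLAN header, checking the I flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame, vni):
    """Prefix an inner Ethernet frame with its VXLAN header."""
    return vxlan_header(vni) + inner_frame


print(vxlan_header(5000).hex())
```

Without offload, the host CPU builds and strips these headers for every packet, and the extra outer headers hide the inner flow from stateless NIC features such as checksum and segmentation offload, which is one reason software-only overlays cost both throughput and cores.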

Page 8

Cumulus Overlay Solution

VMware NSX

PLUMgrid ONS

Nuage VSP

Midokura Midonet

Juniper OpenContrail

Akanda Astara

Cumulus LNV

Switch VXLAN tunnel endpoint (VTEP) is used
• To connect bare metal servers to VXLAN network
• To connect VXLAN and legacy network

Cumulus Integrated with every major Overlay Solution

Available with Mellanox switches April 2016

Page 9

Accelerated Switching And Packet Processing (ASAP2)

Best of both worlds: Enable hardware accelerated data plane with SDN/virtual switch control plane

Multiple possibilities of accelerated data plane including DPDK in CPU, embedded switch, FPGA, network processor, multi-core processor in server adaptor, TOR switch, or centralized acceleration pool

Standard hardware API to allow control plane and data plane to operate and innovate independently

Roadmap

[Diagram labels: Virtual Switch Control Plane; Standard Hardware Abstraction Interface; Hardware Accelerated Data Plane; ASAP2]

Page 10

ASAP2 Phase 1: ASAP2 Direct

OVS Control Plane, optionally combined with SDN controller

Direct application I/O access through SR-IOV

Accelerated forwarding and classification through Embedded Switch (eSwitch) on Mellanox NIC

[Diagram labels: four VMs, each with its own OS; tap, tap; SR-IOV to the VM; Embedded Switch]

Page 11

OVS Architecture and Operations

[Diagram: OVS-vswitchd in user space above the OVS Kernel Module in the kernel. The first packet of a flow goes up to OVS-vswitchd; subsequent packets are handled by the OVS Kernel Module.]

Forwarding
• Flow-based forwarding
• First packet of a new flow (match miss) is directed to user space (ovs-vswitchd)
• ovs-vswitchd determines flow handling and programs kernel (fast path)
• Following packets hit kernel flow entries and are executed in fast path
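The forwarding steps above can be sketched as a toy two-level lookup. This is a simplified model for illustration, not Open vSwitch code: the class `OvsModel`, its fields and the flow-key format are all invented for this example, and real OVS matches on many header fields with wildcards, not on an exact tuple.

```python
from dataclasses import dataclass, field


@dataclass
class OvsModel:
    """Toy model of the OVS slow path (user space) and fast path (kernel)."""
    policy: dict                                       # ovs-vswitchd's view: flow key -> action
    kernel_flows: dict = field(default_factory=dict)   # kernel flow cache
    upcalls: int = 0                                   # packets punted to user space

    def slow_path(self, key):
        """ovs-vswitchd decides how to handle the flow and programs the kernel."""
        self.upcalls += 1
        action = self.policy.get(key, "drop")
        self.kernel_flows[key] = action
        return action

    def receive(self, key):
        """Return (action, path) for one packet of the flow identified by `key`."""
        action = self.kernel_flows.get(key)
        if action is None:            # match miss: first packet of a new flow
            return self.slow_path(key), "slow"
        return action, "fast"         # kernel flow entry hit


ovs = OvsModel(policy={("10.0.0.1", "10.0.0.2"): "output:2"})
flow = ("10.0.0.1", "10.0.0.2")
for _ in range(3):
    print(ovs.receive(flow))
print("upcalls:", ovs.upcalls)
```

Only the first packet of the flow pays for the trip to user space; the remaining packets are served from the kernel flow cache.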

Page 12

Mellanox eSwitch

ASAP2 – Let the Hardware Do the Heavy-lifting

New Flow

• A new flow will result in a ‘miss’ action in eSwitch and is directed to OVS kernel module

• Miss in kernel will punt the packet to OVS-vswitchd in user space

Configuration

• OVS-vswitchd will resolve the flow entry, and based on a policy decision to offload, propagate that to corresponding eSwitch tables for offload-enabled flows

Fast Forwarding

• Subsequent frames of offload-enabled flows will be processed and forwarded by eSwitch

[Diagram: OVS-vswitchd (User) and OVS Kernel Module (Kernel) in software, above the eSwitch in hardware. The First Packet follows the Fallback Forwarding Path through software; Subsequent HW Forwarded Packets stay in hardware.]
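The new-flow, configuration and fast-forwarding steps on this slide can be modeled as a three-level lookup in which a policy decides which flows are pushed down to the eSwitch. This is a conceptual sketch only: `OffloadModel`, its attributes and the `offloadable` predicate are hypothetical names for this example and do not correspond to any Mellanox or OVS API.

```python
class OffloadModel:
    """Toy model of eSwitch -> OVS kernel module -> ovs-vswitchd lookup."""

    def __init__(self, policy, offloadable):
        self.policy = policy            # ovs-vswitchd's view: flow key -> action
        self.offloadable = offloadable  # policy decision: may this flow be offloaded?
        self.eswitch = {}               # hardware flow table
        self.kernel = {}                # kernel flow table (fallback path)

    def receive(self, key):
        """Return (action, where) for one packet of the flow identified by `key`."""
        if key in self.eswitch:         # forwarded entirely in hardware
            return self.eswitch[key], "eswitch"
        if key in self.kernel:          # eSwitch miss, kernel hit
            return self.kernel[key], "kernel"
        # Miss in both: punt to ovs-vswitchd, which resolves the flow entry.
        action = self.policy.get(key, "drop")
        self.kernel[key] = action
        if self.offloadable(key, action):
            self.eswitch[key] = action  # propagate to the eSwitch table
        return action, "vswitchd"


model = OffloadModel(
    policy={"flow-a": "output:vf1", "flow-b": "output:tap0"},
    offloadable=lambda key, action: action.startswith("output:vf"),
)
for key in ["flow-a", "flow-a", "flow-b", "flow-b"]:
    print(key, model.receive(key))
```

In this model an offload-enabled flow reaches software once and is then handled in hardware, while a flow that the policy keeps in software continues to be served by the kernel table.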

Page 13

OVS and SR-IOV, Working Seamlessly Together

Representor ports enable OVS to “know” and service those VMs that use SR-IOV

Representor ports are used for eSwitch / OVS communication (miss flow and PV to SR-IOV communication)

[Diagram labels: VMs using OVS Offload, each with a Netdev Representor; VMs using Para-Virtualization, each with a netdev; NIC eSwitch; Policy based Flow Sync]

Page 14

Software Defined Networking, at Full Speed

Highest performance (High throughput, low and deterministic latency)
• Offload is increasingly important as server I/O speed goes up

Low CPU overhead, higher infrastructure efficiency

Software defined

Everything In-Box
• All changes will be up-streamed, no proprietary OVS or kernel patches

[Diagram labels: SDN or Other Network Orchestration; Configuration; Stats Reporting; three eSwitches]

Page 15

Benchmark Targets

Metrics
• Message Rate (PPS)
• Network related CPU Load

Environments
• 25Gbps network
• Extreme performance
• Open Source
• Free

Standard Benchmark
• RFC 2544

Page 16

Benchmark Topology and Traffic Flow

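[Diagram: two setups compared (OVS Over DPDK vs. OVS Offload), each drawn as VM, Hypervisor and NIC layers with User and Kernel space and 25GE links. OVS Over DPDK: Testpmd on DPDK in the VM, OVS on DPDK in the Hypervisor. OVS Offload: Testpmd on DPDK in the VM, OVS in the Hypervisor with Flows Offload to the Mellanox eSwitch in the NIC.]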

Page 17

Results and Conclusions

330% higher message rate compared to OVS over DPDK
• 33M PPS vs. 7.6M PPS
• OVS Offload reaches near line rate at 25G (37.2M PPS)

Zero! CPU utilization on hypervisor compared to 4 cores with OVS over DPDK
• This delta will grow further with packet rate and link speed

Same CPU load on VM

[Charts: Message Rate and Dedicated Hypervisor Cores]

                                             OVS Offload    OVS over DPDK
Message Rate (Million Packets per Second)    33             7.6
Number of Dedicated Hypervisor Cores         0              4
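The headline percentages on this slide can be checked from the raw numbers it reports. The short calculation below uses only the slide's figures (33M PPS, 7.6M PPS and the 37.2M PPS line rate at 25G); the helper name `percent_higher` is an assumption made for this example.

```python
def percent_higher(new, baseline):
    """How much higher `new` is than `baseline`, in percent."""
    return (new - baseline) / baseline * 100


offload_mpps = 33.0          # OVS Offload, from the slide
ovs_dpdk_mpps = 7.6          # OVS over DPDK, from the slide
line_rate_25g_mpps = 37.2    # 25G line rate quoted on the slide

gain = percent_higher(offload_mpps, ovs_dpdk_mpps)
share = 100 * offload_mpps / line_rate_25g_mpps

print(f"OVS Offload is {gain:.0f}% higher than OVS over DPDK")
print(f"OVS Offload reaches {share:.1f}% of 25G line rate")
```

The first figure comes out slightly above the slide's rounded 330%, and the second quantifies what "near line rate" means for this test.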

Page 18

Accelerated Data Movement End to End: 25 is the New 10

One Switch. A World of Options.

Flexibility, Opportunities, Speed

Open Ethernet, Zero Packet Loss

Most Cost-Effective Ethernet Adapter

2.5X the Network Performance

Same Infrastructure, Same Connector

25G and 50G at Your Fingertips

Page 19

Spectrum: The Ultimate 25/100GbE Switch

The only predictable 25/50/100Gb/s Ethernet switch

Full wire speed, non-blocking switch
• Doesn’t drop packets per RFC 2544

ZPL: Zero Packet Loss for all packet sizes

Page 20

25GbE to 25GbE Latency Test Results

Not All Ethernet Switches Were Born Equal

[Chart: Microburst Absorption Capability. Max Burst Size (MB) by Packet size (64B, 512B, 1.5KB, 9KB), Spectrum vs. Tomahawk. Plotted values: 5.2, 8.4, 9.6, 9.7 and 0.3, 0.9, 1.0, 1.1]

[Chart: Consistently Low Latency. By Packet Size (Bytes), Spectrum vs. Tomahawk]

[Charts: Microburst Absorption Fairness; Avoidable Packet Loss. By Packet Size (Bytes), Broadcom vs. Spectrum]

www.Mellanox.com/tolly
www.zeropacketloss.com

Page 21

Open APIs

Open Composable Networks

Automation

End-to-End Interconnect

Network OS Choice

SONiC

Page 22

RDMA Acceleration – Overcome Transport Protocol Inefficiencies

ZERO Copy Remote Data Transfer

Kernel Bypass, Protocol Offload

Low Latency, High Performance Data Transfers

InfiniBand: 100Gb/s
RoCE*: 100Gb/s

* RDMA over Converged Ethernet

[Diagram: an Application and its Buffer on each of two hosts, drawn across USER, KERNEL and HARDWARE layers]

Page 23

RDMA Increases Memcached Performance

Memcached: High Performance in-memory distributed memory object caching system
• Simple key-value store
• Speeds application by eliminating database access
• Used by YouTube, Facebook, Zynga, Twitter etc.

RDMA improved Memcached performance:
• 1/3 query latency
• >3X throughput

D. Shankar, X. Lu, J. Jose, M.W. Rahman, N. Islam, and D.K. Panda, Can RDMA Benefit On‐Line Data Processing Workloads with Memcached and MySQL, ISPASS’15

OLDP workload

[Chart: Latency (sec) by No. of Clients (64, 96, 128, 160, 320, 400), Memcached-TCP vs. Memcached-RDMA. Annotation: Reduced by 66%]

[Chart: Throughput (Kq/s) by No. of Clients (64, 96, 128, 160, 320, 400), Memcached-TCP vs. Memcached-RDMA. Annotation: Increased by >200%]

Page 24

Case Studies

Page 25

Server I/O Decides Affirmed Networks Virtual EPC Efficiency

When server I/O is constrained, the Affirmed MCC deployment efficiency can be constrained, resulting in underutilized resources and larger server footprint

Mellanox 40G NIC enables MCC to fully utilize CPU resources, reduce server footprint and enhance efficiency.

[Diagram: Affirmed Mobile Content Cloud™ running on a Hypervisor and x86 HW Platform. Modules MCM, CCM, DCM, ASM, WSM and IOM are shown with instances numbered 1, 2 ... N, alongside a second set of the same modules labeled Affirmed High Availability.]

MCM – Management Control Module

CCM – Centralized Control Module

DCM – Distributed Control Module

IOM – Input Output Module

WSM – Workflow Services Module (data plane)

ASM – Advanced Services Module (data plane)

A Typical Datapath Traffic Pattern

[Diagram: an MCC Cluster of IOM and WSM instances connected to an SP Router]
• North-South traffic to and from MCC Cluster
• East-West traffic within MCC Cluster

A single “composite” virtualized network function with distributed microservices that can scale in and out independently

An Example to Show Server Efficiency Improvement

To support 20Gbps of Cluster I/O    With 10G NIC    With 40G NIC
Number of Servers Needed            4               1

Page 26

SR-IOV & Data Plane Acceleration Essential for Affirmed MCC

[Diagram: three server configurations, each with VMs on a Hypervisor and a Server NIC with PHY]

Option                            Throughput
Native Open vSwitch (OVS)         ~20-30% Line Rate
DPDK Accelerated vSwitch (AVS)    ~80% Line Rate
SR-IOV                            Near Line Rate

Page 27

Conclusion - The Mellanox EVN Differentiation

Higher Workload Density

Faster Data Movement

Cloud-native Scalability and Reliability

Operation and Cost Efficiency

Page 28

Thank You