DevConf 2017: Meeting Networking Requirements for NFV
Flavio Bruno Leitner, Principal Software Engineer - Networking Service Team
January 2017


TRANSCRIPT

Page 1: Dev Conf 2017 - Meeting nfv networking requirements

Meeting Networking Requirements for NFV

Flavio Bruno Leitner
Principal Software Engineer - Networking Service Team
January 2017

Page 2: Dev Conf 2017 - Meeting nfv networking requirements

● NFV concepts and goals
● NFV requirements
● 10G Ethernet
● Physical-Virtual-Physical (PVP) scenario
● Some network solutions
● Dive into DPDK-enabled Open vSwitch
● Possible improvements

Agenda

2

Page 3: Dev Conf 2017 - Meeting nfv networking requirements

Virtualize network hardware appliances

NFV - Network Functions Virtualization

3

[Diagram: Firewall, Load Balancer (LB) and Router appliances each running as a VM on a virtualization layer]

Page 4: Dev Conf 2017 - Meeting nfv networking requirements

A new product/project needs new networking infrastructure

NFV - Goals

4

Before
● Slow Process
● High Cost
● Less Flexibility

After
● Fast Process
● Lower Cost
● Greater Flexibility

Deploy a new service with a click!

Page 5: Dev Conf 2017 - Meeting nfv networking requirements

NFV - Networking Requirements

5

[Diagram: a VM on the virtualization layer shown as equal (=) to the appliance it replaces]

Page 6: Dev Conf 2017 - Meeting nfv networking requirements

Low Latency

High Throughput

… with zero packet loss

NFV Requirements - Challenge

6

Page 7: Dev Conf 2017 - Meeting nfv networking requirements

Worst case: wire speed with the smallest frame

Packet rate: 14.88Mpps (million packets per second)

Ethernet overhead on the wire: 20 bytes [inter-frame gap (12) + preamble/SFD (8)]

Minimum Ethernet frame: 64 bytes [MAC header (14) + payload (46) + FCS (4)]

Minimum on-the-wire footprint per frame: 20 + 64 = 84 bytes.

Challenge 10GBit/s

7
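For reference, the 14.88 Mpps figure follows directly from the 84-byte minimum on-the-wire footprint per frame:

\[
\frac{10 \times 10^{9}\ \text{bit/s}}{84\ \text{bytes} \times 8\ \text{bit/byte}} = \frac{10^{10}}{672} \approx 14.88\ \text{Mpps}
\]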

Page 8: Dev Conf 2017 - Meeting nfv networking requirements

How much time per packet?

1 / 14.88Mpps = 67.2 nanoseconds

3GHz CPU => ~200 cycles

Cache Miss => ~32 nanoseconds

L2 Cache Hit => ~10 cycles

L3 Cache Hit => ~36 cycles

Small Budget!

Challenge 10GBit/s - 14.88Mpps

Sources:
http://www.intel.co.uk/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
https://people.netfilter.org/hawk/presentations/nfws2014/dp-accel-10G-challenge.pdf

8
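Putting the numbers above together at 3 GHz:

\[
67.2\ \text{ns} \times 3\ \text{cycles/ns} \approx 202\ \text{cycles per packet}, \qquad 32\ \text{ns} \times 3\ \text{cycles/ns} \approx 96\ \text{cycles per cache miss}
\]

so a single cache miss alone consumes roughly half of the per-packet budget.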

Page 9: Dev Conf 2017 - Meeting nfv networking requirements

Networking to Virtual Machines - PVP

9

[Diagram: PVP path — traffic generator → physical port → vSwitch → logical port → VM → logical port → vSwitch → physical port → traffic generator]

Page 10: Dev Conf 2017 - Meeting nfv networking requirements

● Linux Bridge

● Open vSwitch (OVS)

● SR-IOV

● DPDK Enabled Open vSwitch (OVS-DPDK)

Networking to Virtual Machines

10

Page 11: Dev Conf 2017 - Meeting nfv networking requirements

● Use the kernel datapath

● NAPI

● Unpredictable latency

● Not SDN ready

● Low throughput: ~1Mpps/core (Phy-to-Phy)

● qemu runs in userspace

Linux Bridge

11

Page 12: Dev Conf 2017 - Meeting nfv networking requirements

● Use the kernel datapath

● NAPI

● Unpredictable latency

● SDN ready

● Low throughput: ~1Mpps/core

● qemu runs in userspace

Open vSwitch

12

Page 13: Dev Conf 2017 - Meeting nfv networking requirements

● Low latency

● High throughput

● Bypass the host

● Not SDN friendly - Can’t use a virtual switch in the host

● Physical HW exposed - no abstraction, certification issues/costs

● Migration issues

● Limited number of devices

SR-IOV

13

Page 14: Dev Conf 2017 - Meeting nfv networking requirements

What is DPDK?

● A set of libraries and drivers for fast packet processing.

● Open Source, BSD License

Usage:

● Receive and send packets within the minimum number of CPU cycles.

What it is not:

● A networking stack

Data Plane Development Kit (DPDK)

14

Page 15: Dev Conf 2017 - Meeting nfv networking requirements

Consists of APIs, provided through the BSD driver running in userspace, to configure the devices and their respective queues. In addition, a PMD accesses the RX and TX descriptors directly without any interrupts to quickly receive, process and deliver packets in the user's application.

DPDK - Poll-Mode Drivers

Source: http://dpdk.org/doc/guides/prog_guide/poll_mode_drv.html

15
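As an illustration of the description above, a busy-polling RX/TX loop built on the DPDK ethdev API might look like the minimal sketch below, assuming the EAL, ports and queues have already been initialized elsewhere (rte_eal_init, rte_eth_dev_configure, queue setup, rte_eth_dev_start); forward_loop and BURST_SIZE are illustrative names, not DPDK symbols.

/* Minimal poll-mode forwarding loop sketch: poll one port, forward to another. */
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32           /* packets fetched per poll */

static void forward_loop(uint16_t rx_port, uint16_t tx_port)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        /* Poll the RX descriptors directly; no interrupts are involved. */
        uint16_t nb_rx = rte_eth_rx_burst(rx_port, 0, bufs, BURST_SIZE);

        if (nb_rx == 0)
            continue;

        /* Hand the whole batch to the TX queue of the other port. */
        uint16_t nb_tx = rte_eth_tx_burst(tx_port, 0, bufs, nb_rx);

        /* Free any mbufs the NIC did not accept, to avoid leaking them. */
        while (nb_tx < nb_rx)
            rte_pktmbuf_free(bufs[nb_tx++]);
    }
}

An OVS-DPDK PMD thread (slides 18-21) runs essentially this kind of loop, with the flow lookup in the forwarding plane sitting between the RX and TX bursts.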

Page 16: Dev Conf 2017 - Meeting nfv networking requirements

● Open vSwitch kernel module is just a cache managed by userspace.

● DPDK provides the libraries and drivers to RX/TX from userspace.

● Yeah, DPDK enabled Open vSwitch!

● Remember the 14.88Mpps? ~16Mpps/core Phys-to-Phys.

● Costs at least one core kept 100% busy running the PMD thread

(power consumption, cooling, wasted cycles)

Open vSwitch + DPDK

16

Page 17: Dev Conf 2017 - Meeting nfv networking requirements

● Provide network connectivity to Virtual Machines

● Qemu runs in userspace

● Vhost-user interface (TX/RX shared virtqueues)

● Guests can choose between kernel or userspace

● Throughput: ~3.5Mpps/core (default features, PVP, tuned)

● Scales up linearly with multiple parallel streams

● System needs to be carefully tuned

OVS-DPDK for NFV

17
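A back-of-the-envelope consequence of the numbers above: at roughly 3.5 Mpps per core in the PVP setup, matching the 14.88 Mpps 64-byte wire rate of one 10G port would take on the order of

\[
\frac{14.88\ \text{Mpps}}{3.5\ \text{Mpps/core}} \approx 4.3\ \text{PMD cores},
\]

assuming the linear scaling with parallel streams mentioned above holds.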

Page 18: Dev Conf 2017 - Meeting nfv networking requirements

● Poll-Mode Driver thread owns a CPU

● Devices (queues) are distributed between PMD threads

● Each PMD thread will busy loop polling and processing

● Run-To-Completion

● Batching (reduce per packet processing cost)

How does it work?

18

Page 19: Dev Conf 2017 - Meeting nfv networking requirements

X-Ray Patient: OVS-DPDK PMD Thread

19

[Diagram: a PMD thread polling Port 1, Port 2, ..., Port n and pushing packets through the forwarding (FW) plane, with a drop path]

Page 20: Dev Conf 2017 - Meeting nfv networking requirements

PMD in PVP

20

[Diagram: the PVP path with the PMD thread and forwarding plane between the physical ports (P1, P2) and the VM's logical ports (L1, L2); traffic generator attached to the physical ports]

Page 21: Dev Conf 2017 - Meeting nfv networking requirements

Packet Flow

21

[Diagram: PMD and forwarding plane connecting physical NICs (ports 10 and 11) to vhost-user ports (20 and 21)]

Flows:
in_port=10,action=21
in_port=20,action=11

Page 22: Dev Conf 2017 - Meeting nfv networking requirements

Measuring Throughput: Zero Packet Loss

22

Expected:
● Constant traffic rate
● System is constantly dropping packets
● Decrease traffic rate, repeat

Page 23: Dev Conf 2017 - Meeting nfv networking requirements

Packet Drops: Aim For Weak Spots

23

[Diagram: same packet flow — physical NICs (10, 11), PMD, forwarding plane, vhost-user ports (20, 21) — with the queues as the weak spots]

Flows:
in_port=10,action=21
in_port=20,action=11

Page 24: Dev Conf 2017 - Meeting nfv networking requirements

Packet Drops: NIC RX QUEUE

24

[Diagram: physical NIC RX queue feeding the PMD thread and forwarding plane]

● Fixed size, limited by the hardware
● Drops are reported in the port stats
● Queue overflow (producer-consumer problem)

Page 25: Dev Conf 2017 - Meeting nfv networking requirements

Packet Drops: Vhost-user TX Queue

25

[Diagram: PMD and forwarding plane transmitting into the guest's vhost-user queue, with a drop path]

● Fixed size, limited in software
● Drops are reported in the guest
● Queue overflow (producer-consumer problem)

Page 26: Dev Conf 2017 - Meeting nfv networking requirements

Packet Drops: Vhost-user RX Queue

26

[Diagram: guest's vhost-user queue feeding the PMD and forwarding plane]

● Fixed size, limited in software
● Drops are reported in the port stats
● Queue overflow (producer-consumer problem)

Page 27: Dev Conf 2017 - Meeting nfv networking requirements

Measuring Throughput: Zero Packet Loss

27

Expected:
● Constant traffic rate
● System is constantly dropping packets
● Decrease traffic rate, repeat

Reality:
● System is stable for a period of time
● Few packets dropped sporadically
● Decrease traffic rate, repeat
● Very low throughput
● Understand what is causing the drops

Page 28: Dev Conf 2017 - Meeting nfv networking requirements

Estimating PMD Processing Budget

28

Throughput (Mpps) | Proc. Budget (µs) | PMD Budget (µs)
3.0               | 0.33              | 0.16
4.0               | 0.25              | 0.12
5.0               | 0.20              | 0.10
6.0               | 0.16              | 0.08
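The two budget columns follow from the packet rate alone: the processing budget is its inverse, and the PMD budget is half of that, consistent with a single PMD thread handling each packet twice on the PVP path (physical port to vhost-user, then vhost-user back to physical). For example, at 4.0 Mpps:

\[
\frac{1}{4.0\ \text{Mpps}} = 0.25\ \mu\text{s}, \qquad \frac{0.25\ \mu\text{s}}{2} \approx 0.12\ \mu\text{s}
\]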

Page 29: Dev Conf 2017 - Meeting nfv networking requirements

Measuring Polling/Processing cost.

29

Device     | Mode                 | Time (µs)
Phys       | Ingress Polling      | 0.2
Phys       | Ingress Processing   | 3.1
Phys       | Egress Polling       | 0.016
Phys       | Egress Processing    | 0
vhost-user | Ingress Polling      | 0.013
vhost-user | Ingress Processing   | 0
vhost-user | Egress Polling       | 0.73
vhost-user | Egress Processing    | 2.14
Total      | Polling + Processing | 6.2

Page 30: Dev Conf 2017 - Meeting nfv networking requirements

● The total of 6.2µs is about 24x the per-packet budget (0.25µs)
● Assuming 32 packets in a batch, the per-packet cost drops to 0.19µs, ~5Mpps
● 3.5Mpps with zero packet loss (0.29µs per packet) => average batch size of 21.4

Batching

30
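Working the batching numbers out explicitly from the 6.2 µs per-batch cost measured on the previous slide:

\[
\frac{6.2\ \mu\text{s}}{32\ \text{packets}} \approx 0.19\ \mu\text{s/packet} \;\Rightarrow\; \approx 5.2\ \text{Mpps}, \qquad \frac{6.2\ \mu\text{s}}{0.29\ \mu\text{s/packet}} \approx 21.4\ \text{packets per batch}
\]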

Page 31: Dev Conf 2017 - Meeting nfv networking requirements

● Internal sources

● External sources

What is wasting time?

31

Page 32: Dev Conf 2017 - Meeting nfv networking requirements

● What are they?

● How significant are they?

External Sources

32

Page 33: Dev Conf 2017 - Meeting nfv networking requirements

● PMD Processing Budget (3Mpps): 0.16µs

● Ftrace tool => Kernel RCU callback: 50µs + preemption cost

● Roughly 8 batches

● rcu_nocbs=<cpu-list>, rcu_nocb_poll

External Interferences: RCU Callback

33
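Relating the RCU callback cost to the 6.2 µs per-batch cost measured earlier:

\[
\frac{50\ \mu\text{s}}{6.2\ \mu\text{s per batch}} \approx 8\ \text{batches of work lost per callback, before counting the preemption cost}
\]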

Page 34: Dev Conf 2017 - Meeting nfv networking requirements

● nohz_full

● No way to get rid of it

External Interferences: Timer Interrupt

34

Page 35: Dev Conf 2017 - Meeting nfv networking requirements

● Scheduling issues:

○ irqbalance off

○ isolcpus

● Watchdog: nowatchdog

● Power Management: processor.max_cstate=1

● Hyper Threading

● Real-Time Kernel

External Interferences: Other Sources

35

Page 36: Dev Conf 2017 - Meeting nfv networking requirements

● Use DPDK L-Thread subsystem to isolate devices

● Disable mergeable buffers to increase batch sizes inside the guest

● Disable mergeable buffers to decrease per packet cost

● Increase OVS-DPDK batch size

● Increase NIC queue size

● Increase virtio ring size

● BIOS settings

● Hardware Offloading

● Faster platform/CPUs

● Improve CPU isolation in the kernel

Possible Improvements

36

Page 37: Dev Conf 2017 - Meeting nfv networking requirements

Thank You

Questions & Answers

Source: http://dpdk.org/doc/guides/prog_guide/poll_mode_drv.html

37