event-driven packet processing · 2020-07-14 · packet event rmw operations operate on main...

Post on 24-Jul-2020

6 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Stephen Ibanez, Gordon Brebner, Gianni Antichi, Nick McKeown

May 1st 2019

Event-Driven Packet Processing

P4 Programming Model

>> 2

TrafficManager

PacketGenerator

Synchronous packet-by-packet processing

Limitations of P4 Programming Model˃ Performing periodic tasks

HULA [1] – periodic packet probesCount-Min-Sketch – periodic state reset

˃ Updating state multiple times / using state in a different stageUsing congestion signals in ingress pipeline (AQM, NDP [2])

>> 3

Common Congestion Signals Other Congestion Signals• Queue size• Queue service rate• Queueing delay

• Packet loss volume• Rate of change of queue size• Timestamp of buffer

overflow/underflow events• Per-active-flow buffer occupancy• Etc…

˃ Solution: Generalize: Packet arrival/departure events è data-plane events

[1] Katta, Naga, et al. "Hula: Scalable load balancing using programmable data planes." SOSR, 2016.[2] Handley, Mark, et al. "Re-architecting datacenter networks and stacks for low latency and high performance." SIGCOMM, 2017.

Data-Plane Events

>> 4

Event Type Description

Ingress Packet Packet arrivalEgress Packet Packet departureRecirculated packet Packet sent back to ingressBuffer Enqueue Packet enqueued in bufferBuffer Dequeue Packet dequeued from bufferBuffer Overflow Packet dropped at bufferBuffer Underflow Buffer becomes emptyTimer Event Configurable timer expiresControl-plane triggered Control-plane triggers processing logic in

data-planeLink Status Change Link goes down / comes upPacket Transmission Packet finished transmissionState Condition Met User defined condition

Packet & MetadataEvents

MetadataEvents

Event-Driven Programming Model

>> 5

TrafficManagerIngress PipelineIngress Packet

Enqueue PipelineEnqueue Event

Dequeue PipelineDequeue Event

statestate

statestate

statestate

stat

e

stat

e

Does not sacrifice line-rate packet processing

Event-Driven Programming Model

>> 6

// arch.p4

extern shared_register<T> {

shared_register();

void read(out T result);

void write(in T value);

}

// my_prog.p4

shared_register<bit<32>>() bufSize_reg;

// Ingress Packet Event Logic

control Ingress(inout headers_t hdr,

inout std_meta_t meta) {

bit<32> bufSize;

apply {

bufSize_reg.read(bufSize);

// use bufSize to make forwarding decisions

}

}

// Enqueue Event Logic

control Enqueue(inout enq_data_t meta) {

bit<32> bufSize;

apply {

bufSize_reg.read(bufSize);

bufSize = bufSize + meta.pkt_len;

bufSize_reg.write(bufSize);

}

}

// Dequeue Event Logic

control Dequeue(inout deq_data_t meta) {

bit<32> bufSize;

apply {

bufSize_reg.read(bufSize);

bufSize = bufSize - meta.pkt_len;

bufSize_reg.write(bufSize);

}

}

˃ E.g: Compute total buffer occupancy:

Lower Line Rate Event Processing˃ Multi-ported memory is more practical

˃ One port per event type that accesses state array

>> 7

TrafficManagerIngress Packet

Enqueue Event

Dequeue Event

state

state

Higher Line Rate Event Processing˃ Multi-ported memory is impractical

>> 8

TrafficManager

statestate

EnqueueEvent

DequeueEvent

Higher Line Rate Event Processing˃ Multi-ported memory is impractical

˃ Approach:Multiple single ported register arraysPacket event RMW operations operate on main registerMetadata event RMW operations are aggregatedStaleness of algorithmic state is bounded

>> 9

TrafficManager

statestate

Enqueue Event

Dequeue EventIngress Packet Event

Enqueue: NULL

Dequeue: SUB 100B from queue 0

Packet: READ queue 1

NetFPGA SUME Event Switch Demo˃ Simple Fair-RED (FRED) AQM implementation

˃ Isolate TCP flow from non-adaptive UDP flow

˃ Computes per-active-flow queue occupancyEnqueue & Dequeue Events

˃ Queue occupancy tracing:

>> 10

Host A

Host B

Host C

Monitor

F0 qsize: XXX

F1 qsize: YYY

Flow 0(TCP Cubic)

Flow 1(UDP – 8Gbps)

• With FRED – UDP flow limited to 10KB of buffer• One 10MB TCP Flow è ~0.01 sec FCT

Que

ue O

ccup

ancy

(B)

Time (ms)

Per-Active-FlowQueue Occupancy

Conclusion

˃ Network algorithms are event-driven, so should our data-plane architectures

>> 11

IDLESend LSU

timeout

Start

link status change

Send HELLO

Periodicevent

Start

End Host Protocols(e.g. Reliable Delivery)

Control-Plane Protocols(e.g. Routing Protocols)

˃ Potential to offload much more functionality to our data-planes

Wait ACK

Send Pkt

timeoutACK

receiveddone

Start

Datasent

IDLE

Questions?

Line Rate Event Processing

˃ Idle clock cycles:1. Workload contains large packets2. Pipeline runs faster than line rate

>> 13

EnqueueOperations

Register

100

Queue SizeRegister

0

0

DequeueOperations

RegisterEnqueue: NULL

Dequeue: SUB 100B from queue 0

Packet: READ queue 1

0

SUB 10000:

1:

2:

3:

0:

1:

2:

3:

0:

1:

2:

3:

˃ Bounded staleness of the main register

SUME Event Switch on NetFPGA

>> 14

Output Queues

Event Merger

Timer Module

Packet Generator

Arbiter

nf0

nf1

nf2

nf3

dma

Timer period

Generate packet

Enq event

Deq eventDrop event

Timer event

Packet event

Link statuschange event

nf0nf1

nf2

nf3

dma

P4 Pipeline (SDNet)

top related