TRANSCRIPT
Software Defined Networking (SDN) and the design of high-performance programmable switches
Paolo Giaccone
Notes for the class on “Switching Architectures for Data Centers”
Politecnico di Torino
November 2019
Outline
1 Introduction to SDN
2 OpenFlow protocol
3 Design of OpenFlow switches
4 Design of protocol-independent switches: the PISA architecture
5 Languages for data plane programmability: P4
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 2 / 48
Section 1
Introduction to SDN
Network Architecture
Control plane
how to handle traffic (e.g., routing)
Data plane
forward the traffic based on the control plane decision
Traditional networks
Internet architecture
vertically integrated
control and data planes within each device
control plane distributed across the switches/routers
complex and hard to manage
due to the distributed control plane
difficult to understand the state of the network and its history
vendor-specific commands to manage switches/routers
Software Defined Networking (SDN)
Emerging networking paradigm
Separation between control and data plane
routers and switches just acting as forwarding elements
Flow-based forwarding decision
instead of destination-based forwarding
flow definition: a set of packet field values acting as a match rule and a set of actions to operate on all packets belonging to the same flow
unify the behavior of routers, switches, firewalls, load balancers, traffic shapers, etc.
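The match-rule-plus-actions abstraction can be mimicked in a few lines of Python as a sketch; the field names (eth_dst, eth_type, ...) and action strings are illustrative, not OpenFlow's exact ones, and the prefix match is simplified:

```python
# Hypothetical sketch of flow-based forwarding: a flow is a match rule
# (dict of field values, None = wildcard *) plus a list of actions.

def matches(rule, packet):
    """A rule matches if every specified field equals the packet's value."""
    return all(packet.get(field) == value
               for field, value in rule.items() if value is not None)

def forward(flow_table, packet):
    """Return the actions of the first matching flow, else punt to controller."""
    for rule, actions in flow_table:
        if matches(rule, packet):
            return actions
    return ["send-to-controller"]   # table miss: default OpenFlow behaviour

# One table format can express L2 switching, L3 routing and firewalling alike:
flow_table = [
    ({"eth_dst": "68:a8:6d:00:81:42"}, ["output:P1"]),                    # switch
    ({"eth_type": 0x0800, "ip_dst_prefix": "130.192.9"}, ["output:P2"]),  # router
    ({"tcp_dst": 6969}, ["drop"]),                                        # firewall
]
```

Only the populated fields differ between the device roles; the forwarding logic is identical.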
Software Defined Networking (SDN)
Logical centralization of the control plane
unique abstract view of the network state (e.g. topology)
control logic moved to an external entity
SDN controller or Network Operating System (NOS)
Network programmability
network applications run on the SDN controller
similar to computer applications running on a computer operating system
standard or ad-hoc languages
SDN controller
software platform running on commodity servers
logically centralized
for scalability and reliability reasons, can be distributed in different servers
open-source examples: NOX, POX, Ryu, ONOS, OpenDaylight, etc.
[Figure: simplified view of an SDN architecture (Fig. 1 in [Kr15]) — network application(s) running on top of the controller platform, reached through an open northbound API; data forwarding elements (e.g., OpenFlow switches) in the network infrastructure below, controlled through an open southbound API.]
Image taken from [Kr15]
SDN controller interfaces
Northbound Interface (NBI) API
APIs available to application developers
CLI (Command Line Interface), GUI (Graphical User Interface), REST APIs
e.g. “curl http://controllerIP:controllerPort/command”
abstracts the low-level instructions to program forwarding devices
needed to develop the network applications with any programming language
Southbound Interface (SBI) API
protocol to access the switches and to send them commands
enables the interaction between control and data plane
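For illustration, a northbound REST call like the curl example above boils down to composing a URL and a JSON body; the endpoint path and the JSON schema below are invented for this sketch and do not belong to any specific controller's API:

```python
import json

def build_flow_request(controller_ip, controller_port, dpid, match, actions):
    """Compose the URL and JSON body a network application would POST,
    e.g. via: curl http://controllerIP:controllerPort/<path> -d '<body>'.
    The /flows/<dpid> path and the body fields are hypothetical."""
    url = f"http://{controller_ip}:{controller_port}/flows/{dpid}"
    body = json.dumps({"match": match, "actions": actions, "priority": 100})
    return url, body

url, body = build_flow_request(
    "10.0.0.1", 8080, "0000000000000001",
    {"eth_type": 0x0800, "ipv4_dst": "130.192.9.0/24"},
    [{"type": "OUTPUT", "port": 2}])
```

Because the NBI is plain HTTP/JSON, the application can be written in any language, which is exactly the point of the abstraction.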
Adoption of SDN
WAN interconnection between data centers (e.g., Google's B4 network)
within the data center
Network operators
network operators and ISPs are migrating to SDN networks (or considering this possibility)
hybrid SDN integrates fully SDN devices with legacy devices
5G networks
SDN/NFV (Network Function Virtualization) integrated architecture
SDN complementary to NFV to support advanced traffic control, e.g., for service chaining
Section 2
OpenFlow protocol
OpenFlow
initial idea proposed at Stanford University in 2006
definition by Open Networking Foundation (ONF)
ONF promotes the adoption of SDN through open standards
OpenFlow 1.0 released in Dec 2009
OpenFlow 1.5.1 released in Mar 2015
defines the protocol adopted in the southbound interface (SBI)
Image taken from [OF15]
OpenFlow messages
transported on TLS or TCP connections
Packet-in: from the switch to the controller
to transfer the control of a received packet to the controller
carries a copy of the packet (possibly, only the header)
generated by default in case of forwarding table misses
Packet-out: from the controller to the switch
to specify the action(s) to apply to the packet (e.g., send the packet out of a specified port)
carries the full packet or a buffer ID referencing the packet stored at the switch
Flow-mod: from the controller to the switch
to modify the flow tables
carries the match-action rule to install in the switch
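The three messages cooperate in the reactive loop: a table miss raises a Packet-in, and the controller answers with a Flow-mod (installing a rule) plus a Packet-out (releasing the packet). A toy simulation of this exchange, with invented class and field names and no real OpenFlow encoding:

```python
# Minimal sketch of the reactive OpenFlow exchange, simulated in plain
# Python (no sockets, no wire format; names are illustrative).

class Switch:
    def __init__(self):
        self.flow_table = []            # list of (match_fields, actions)

    def receive(self, packet, controller):
        for match, actions in self.flow_table:
            if all(packet.get(k) == v for k, v in match.items()):
                return actions          # hit: apply the installed actions
        # table miss: Packet-in hands (a copy of) the packet to the controller
        return controller.packet_in(self, packet)

class Controller:
    def packet_in(self, switch, packet):
        actions = ["output:1"]          # the decision logic would live here
        # Flow-mod: install a rule so future packets are handled in the switch
        switch.flow_table.append(({"eth_dst": packet["eth_dst"]}, actions))
        return actions                  # Packet-out: release the packet

sw, ctrl = Switch(), Controller()
first = sw.receive({"eth_dst": "aa:bb"}, ctrl)    # miss -> Packet-in/Flow-mod
second = sw.receive({"eth_dst": "aa:bb"}, ctrl)   # hit in the flow table
```

After the first packet, the flow is handled entirely in the data plane; the controller is no longer involved.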
OpenFlow example
Example of flow tables
Ethernet switch (e.g. to reach 68:a8:6d:00:81:42)
Switch  MAC  MAC                Eth   IP   IP   TCP       TCP       Action
Port    src  dst                type  src  dst  src port  dst port
*       *    68:a8:6d:00:81:42  *     *    *    *         *         Forward P1
IP router (e.g. direct delivery on 130.192.9.0/24)
Switch  MAC  MAC  Eth     IP   IP           TCP       TCP       Action
Port    src  dst  type    src  dst          src port  dst port
*       *    *    0x0800  *    130.192.9.*  *         *         Forward P2
Firewall (e.g. block BitTorrent from all Politecnico’s hosts)
Switch  MAC  MAC  Eth     IP         IP   TCP       TCP       Action
Port    src  dst  type    src        dst  src port  dst port
*       *    *    0x0800  130.192.*  *    *         6969      Drop
OpenFlow fields
Version       Fields  Year
OpenFlow 1.0  12      2009
OpenFlow 1.1  15      2011
OpenFlow 1.2  36      2011
OpenFlow 1.3  40      2012
OpenFlow 1.4  41      2013
OpenFlow 1.5  44      2015
Example of fields: switch input port, Ethernet frame type and addresses, VLAN id and priority, IP DSCP/ECN flags and protocol and addresses, TCP ports and flags, UDP ports, ICMP type and code, ARP op code and addresses, MPLS labels and flags, metadata passed between tables.
Section 3
Design of OpenFlow switches
OpenFlow switches
Flow (or Forwarding) Tables
Each rule is a match-action pair
Match
binary exact match (0,1)
ternary match (0,1,*)
Action
drop, forward, modify, goto another table
Flexible
Enables implementing routers, switches, firewalls, load balancers, traffic shapers, etc.
Packet processing
Through a sequence of flow tables
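A ternary entry can be modeled as a (value, mask) pair, with priorities resolving overlapping entries, mirroring what a TCAM does in hardware. This is an illustrative sketch, not the spec's matching algorithm:

```python
# Sketch of binary exact vs. ternary matching on one header field:
# a mask bit of 0 means "don't care" (*); an all-ones mask is an exact match.

def ternary_match(entry_value, entry_mask, key):
    """True if key agrees with entry_value on every bit set in entry_mask."""
    return (key & entry_mask) == (entry_value & entry_mask)

def lookup(entries, key):
    """Return the actions of the highest-priority matching entry, else None.
    Entries are (priority, value, mask, actions) tuples."""
    best = None
    for priority, value, mask, actions in entries:
        if ternary_match(value, mask, key) and (best is None or priority > best[0]):
            best = (priority, actions)
    return best[1] if best else None

entries = [
    (10, 0b10100000, 0b11110000, "forward:P2"),  # ternary: matches 1010****
    (20, 0b10101111, 0b11111111, "drop"),        # binary exact: 10101111 only
]
```

A packet matching both entries (key 0b10101111) gets the higher-priority "drop"; keys matching only the wildcard prefix fall through to "forward:P2".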
[Figure 1 from the OpenFlow Switch Specification v1.5.0: main components of an OpenFlow switch — a datapath with a pipeline of flow tables plus group and meter tables, ports, and a control channel (one or more OpenFlow channels) connecting the switch to an external controller via the OpenFlow switch protocol. Using the protocol, the controller adds, updates, and deletes flow entries both reactively (in response to packets) and proactively; each flow entry consists of match fields, counters, and instructions, and matching starts at the first flow table, selecting the highest-priority matching entry in each table.]
Image taken from [OF15]
Packet processing within each flow table
Matching based only on the header fields
[Figure 4 from the OpenFlow spec: matching and instruction execution in a flow table — extract the header fields, find the highest-priority matching flow entry, then apply its instructions: Apply-Actions (execute a list of actions immediately, without changing the action set), Clear-Actions (empty the action set), Write-Actions (merge a set of actions into the current action set, overwriting any action of the same type), Write-Metadata (masked write to the metadata register), and Goto-Table. On a table miss, the outcome depends on the table's configuration.]
Image taken from [OF15]
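The action-set semantics described in the figure (Write-Actions merging with per-type overwrite, Clear-Actions emptying the set, execution deferred to the end of the pipeline) can be sketched as follows; representing the set as a dict keyed by action type is an assumption of this sketch:

```python
# Sketch of OpenFlow action-set semantics: at most one action per type,
# later Write-Actions overwrite earlier ones of the same type.

def write_actions(action_set, new_actions):
    """Merge the given actions into the set (Write-Actions instruction)."""
    for action in new_actions:
        action_set[action["type"]] = action   # same type -> overwrite
    return action_set

def clear_actions(action_set):
    """Empty the action set (Clear-Actions instruction)."""
    action_set.clear()
    return action_set

action_set = {}
write_actions(action_set, [{"type": "output", "port": 1}])      # table 1
write_actions(action_set, [{"type": "output", "port": 7},       # table 2:
                           {"type": "set_field", "ttl": 63}])   # overwrites output
# the set is executed once, when pipeline processing ends
```

This is why later tables can refine decisions taken by earlier ones without the packet being forwarded twice.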
Implementation platforms
standard CPU: software switch
FPGA, Net-FPGA
Network Processor (NPU)
switching chips
two orders of magnitude faster at switching than CPUs
an order of magnitude faster than NPUs
Implementation on NetFPGA
Implementation in FPGA, used as a reference model for the hardware architecture of any OpenFlow switch
[Excerpt from [Na08]: the NetFPGA OpenFlow switch pipeline is similar to the IPv4 reference router pipeline (Figure 5 there). The implementation is evaluated on flow table size, forwarding rate, and new-flow insertion rate: flow counts measured on the LBL and Stanford EE/CS networks fit easily in the switch's 32,000-entry exact-match table; forwarding runs at full line rate for all tested packet sizes; and new-flow insertion is bottlenecked mainly by the PCI bus between the hardware and the local switch manager.

Table 1: Summary of NetFPGA OpenFlow switch performance.
Pkt Size (bytes)  Forwarding (Mbps)  Full Loop (flows/s)
64                1000               61K
512               1000               41K
1024              1000               28K
1518              1000               19K]
Image taken from [Na08]
Section 4
Design of protocol-independent switches: the PISAarchitecture
Fully-programmable data planes
Provide a much higher level of programmability than OpenFlow, so that the SDN controller can offload decisions to the switches
OpenState / Open Packet Processor (OPP) / FlowBlaze
fast and flexible state machines can be programmed within each switch
OpenState exploits the same architecture as OpenFlow switches
very high performance and simplicity of programming
refer to [BC14], [OPP16], [FB19]
PISA + P4
Protocol-Independent Switch Architecture (PISA) as packet processor
P4 abstraction to configure the switch
may support internal state machines within each switch
refer to [Bo14]
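The OpenState idea of in-switch state machines can be sketched as a table keyed on (flow state, header fields) whose action includes a state transition. The port-knocking rules below are illustrative (a host must "knock" on port 1234 before port 22 opens), and the keying by source IP is an assumption of the sketch:

```python
# Sketch of a stateful match-action table: the match includes the per-flow
# state, and the action carries the next state.

# XFSM-style table: (state, tcp_dst) -> (action, next_state)
xfsm = {
    ("DEFAULT", 1234): ("drop", "KNOCKED"),   # the knock, silently recorded
    ("KNOCKED", 22):   ("forward", "OPEN"),
    ("OPEN", 22):      ("forward", "OPEN"),
}

state_table = {}   # per-flow state, here keyed by source IP

def process(src_ip, tcp_dst):
    state = state_table.get(src_ip, "DEFAULT")
    # unlisted (state, port) pairs: drop and keep the current state
    action, next_state = xfsm.get((state, tcp_dst), ("drop", state))
    state_table[src_ip] = next_state
    return action
```

The controller is not involved per packet: the state transitions run entirely in the switch, which is the point of this line of work.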
Background
Match action models
Single Match Table (SMT)
Multiple Match Table (MMT)
Reconfigurable Match Tables (RMT)
Protocol-Independent Switch Architecture (PISA)
Single Match Table (SMT)
the SDN controller tells the switch to match any set of packet header fields against entries
a parser locates and extracts the correct header fields to match against the table
match
binary exact match when all fields are completely specified
ternary match when some bits are switched off (i.e., wildcard entries)
SMT abstraction is good for
programmers (simplicity)
implementation (compatible with TCAM memories)
costly in terms of resources
the table needs to store every combination of headers
huge waste if one header match affects another, for example if a match on the first header determines a disjoint set of values to match on the second header, requiring the table to hold the Cartesian product of both
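The blow-up is easy to quantify: with 10 source MACs and 10 destination IPs, matching them jointly in one table needs their Cartesian product, while two chained tables need only the sum. A minimal sketch:

```python
# Entry-count comparison: Single Match Table vs. two chained tables.
# The MAC/IP names mirror the firewall example on these slides.

macs = [f"MAC{i}" for i in range(1, 11)]   # 10 source MAC addresses
ips  = [f"IP{j}"  for j in range(1, 11)]   # 10 destination IP addresses

# SMT: one entry per (MAC src, IP dst) combination
smt_entries = [(mac, ip) for mac in macs for ip in ips]

# MMT: table 1 matches the MAC (wildcarding the IP), table 2 matches the IP
mmt_entries = [(mac, None) for mac in macs] + [(None, ip) for ip in ips]
```

With n values per field the gap is n^2 versus 2n, which is exactly the waste the Multiple Match Table model removes.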
SMT example
Firewall
Switch  MAC    MAC  Eth    IP   IP    TCP       TCP       Action
Port    src    dst  type   src  dst   src port  dst port
*       MAC1   *    0x800  *    IP1   *         *         A1
*       MAC1   *    0x800  *    IP2   *         *         A2
...
*       MAC1   *    0x800  *    IP10  *         *         A10
*       MAC2   *    0x800  *    IP1   *         *         A1
...
*       MAC2   *    0x800  *    IP10  *         *         A10
...
*       MAC10  *    0x800  *    IP1   *         *         A1
...
*       MAC10  *    0x800  *    IP10  *         *         A10
Multiple Match Table (MMT)
multiple smaller match tables, each matched by a subset of packet fields
match tables arranged into a pipeline of stages
processing at stage j can be made to depend on processing from stage i < j
stage i modifies the packet headers or other information passed to stage j
easy to implement using a set of narrower tables in each stage
easy to pipeline
fixed multiple match tables
e.g., Ethernet header exact match → IP header longest prefix matching → ...
compatible with OpenFlow
MMT example
Firewall - Table 1
Switch  MAC    MAC  Eth    IP   IP   TCP       TCP       Action
Port    src    dst  type   src  dst  src port  dst port
*       MAC1   *    0x800  *    *    *         *         Goto Table 2
...
*       MAC10  *    0x800  *    *    *         *         Goto Table 2
Firewall - Table 2
Switch  MAC  MAC  Eth    IP   IP    TCP       TCP       Action
Port    src  dst  type   src  dst   src port  dst port
*       *    *    0x800  *    IP1   *         *         A1
...
*       *    *    0x800  *    IP10  *         *         A10
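A minimal sketch of the two-table firewall above as a pipeline: Table 1 matches the source MAC and issues "Goto Table 2"; Table 2 matches the destination IP. The drop-on-miss policy, the action names A1..A10, and the field names are assumptions of the sketch:

```python
# Two chained match tables replacing the Cartesian-product SMT.

table1 = {f"MAC{i}": "goto:2" for i in range(1, 11)}   # match on MAC src
table2 = {f"IP{j}": f"A{j}" for j in range(1, 11)}     # match on IP dst

def pipeline(packet):
    """Run the packet through the two stages; drop on any table miss
    (an assumed default policy)."""
    if table1.get(packet["mac_src"]) != "goto:2":
        return "drop"
    return table2.get(packet["ip_dst"], "drop")
```

Each stage only consults the fields it owns, which is what makes the narrow, pipelined tables cheap to build.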
Reconfigurable Match Tables (RMT)
set of pipelined stages, each with a match table of arbitrary depth and width
e.g., for IP forwarding, a match table of 256k 32-bit prefixes
e.g., for Ethernet, a match table of 64k 48-bit addresses
reconfigurations
field definitions can be altered and new fields added
number, topology, widths, and depths of match tables can be specified, subject only to resource limits
new actions may be defined, such as creating new fields
arbitrarily modified packets can be placed in specified queue(s) to be transferred to any subset of output ports, with a queuing discipline specified for each queue
configuration process managed by an SDN controller
RMT example
Firewall - Table 1
MAC    MAC  Eth    Action
src    dst  type
MAC1   *    0x800  Goto Table 2
...
MAC10  *    0x800  Goto Table 2
Firewall - Table 2
IP   IP    Action
src  dst
*    IP1   A1
...
*    IP10  A10
RMT implementation
Protocol-Independent Switch Architecture (PISA)
Programmable, pipelined architecture compatible with any packet processing
Step 1: programmable protocol parser
defines how the headers can be recognized according to their order in the packet
allows field definitions to be modified or added
Step 2: programmable match-action pipeline
defines the tables and the exact processing algorithm
Step 3: programmable deparser
defines how the packet looks on the wire when sent on the output interface
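The three steps can be sketched on a toy header format (a 4-byte destination address plus a 2-byte type field; the format and the rewrite rule are invented for illustration, not a real protocol):

```python
import struct

HEADER_FMT = ">IH"          # parser "program": how to recognise the fields

def parse(wire):
    """Step 1, programmable parser: bytes -> header fields + payload."""
    dst, htype = struct.unpack_from(HEADER_FMT, wire)
    return {"dst": dst, "type": htype}, wire[struct.calcsize(HEADER_FMT):]

def process(hdr):
    """Step 2, match-action pipeline (here: a single rewrite rule)."""
    if hdr["type"] == 0x0800:
        hdr["dst"] = 0xC0C00901
    return hdr

def deparse(hdr, payload):
    """Step 3, programmable deparser: fields -> bytes on the wire."""
    return struct.pack(HEADER_FMT, hdr["dst"], hdr["type"]) + payload

hdr, payload = parse(struct.pack(">IH", 0x0A000001, 0x0800) + b"data")
out = deparse(process(hdr), payload)
```

Because only HEADER_FMT and the pipeline logic encode protocol knowledge, swapping them reprograms the "switch" for a different header layout, which is the essence of protocol independence.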
PISA logical architecture
[Figure 1 from [Bo13]: RMT model architecture — (a) the RMT model as a sequence of logical match-action stages: a programmable parser splits each packet into header and payload and feeds the header, together with switch-state metadata, through N logical stages (match tables, statistics state, VLIW actions) into configurable output queues and a recombine stage on the output channels; (b) flexible match table configuration, mapping the N logical stages (ingress and egress logical match tables) onto M physical stages; (c) the VLIW action architecture, where a crossbar selects operands from the very wide packet header vector and per-field action units execute opcodes from a VLIW instruction memory. The accompanying text from [Bo13] motivates flexibly allocating physical stages to logical stages (N = 32 physical stages chosen as a compromise between resource waste and wiring/power overhead) and argues for the tiled, short-wire layout over a crossbar to decoupled memories.]
Image taken from [Bo13]
the output of the programmable parser is a packet header vector
set of header fields such as IP dest, Ethernet dest, etc.
“metadata” fields such as the input port on which the packet arrived and other router state variables (e.g., current size of router queues)
the packet header vector flows through a sequence of logical match stages, each of which abstracts a logical unit of packet processing (e.g., Ethernet or IP processing)
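The flow described above — parser output as a header vector, then a pipeline of logical match stages — can be sketched in a few lines of Python. Everything below (field names, the broadcast rule, the dictionary representation) is illustrative, not the RMT design itself:

```python
# Sketch: a packet header vector (fields + metadata) flowing through
# a sequence of logical match stages. Illustrative only.

def parse(packet_bytes, in_port):
    """Programmable-parser output: header fields plus metadata."""
    return {
        "eth_dst": packet_bytes[0:6].hex(),
        "eth_src": packet_bytes[6:12].hex(),
        "in_port": in_port,       # metadata: not taken from the wire
        "queue_depth": 0,         # metadata: router state variable
    }

def eth_stage(hv):
    """Logical stage for Ethernet processing (hypothetical rule)."""
    if hv["eth_dst"] == "ff" * 6:     # broadcast destination
        hv["flood"] = True
    return hv

def ip_stage(hv):
    """Logical stage for IP processing (placeholder)."""
    hv.setdefault("flood", False)
    return hv

hv = parse(bytes.fromhex("ffffffffffff" "0800000001aa"), in_port=3)
for stage in (eth_stage, ip_stage):   # pipeline of logical stages
    hv = stage(hv)
print(hv["flood"])  # True
```

The point of the sketch is only that each logical stage reads and rewrites the same header vector, which also carries metadata (here `in_port`, `queue_depth`) alongside the parsed fields.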
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 32 / 48
Design of protocol-independent switches: the PISA architecture
PISA hardware architecture
packet modifications through a wide instruction (VLIW, very long instruction word) that can operate on all fields in the header vector concurrently
flexible resource allocation minimizing resource waste
a physical pipeline stage has some resources (e.g., CPU, memory)
the resources needed for a logical stage can vary considerably
e.g., a firewall may require all ACLs, a core router may require only prefix matches, and an edge router may require some of each
flexible allocation of physical stages to logical stages
live reconfiguration of the pipeline to metamorphose, e.g., from a firewall to a core router
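The waste bound behind the bullets above (a logical stage that needs almost nothing still occupies one of the N physical stages, i.e., at most 1/N of the resources) can be checked with a small sketch. The allocator below is hypothetical, not the RMT chip's real allocation logic; memory demand is measured in units of one physical stage's capacity:

```python
# Hypothetical allocator: give each logical stage enough whole physical
# stages to cover its memory demand (demand in units of one stage).
import math

def allocate(demands, n_physical=32):
    """Return the number of physical stages assigned per logical stage."""
    used = [max(1, math.ceil(d)) for d in demands]  # at least one stage each
    assert sum(used) <= n_physical, "pipeline does not fit"
    return used

# A tiny logical stage (5% of one stage's memory) still takes one stage:
used = allocate([0.05, 3.2, 1.0], n_physical=32)
waste = (used[0] - 0.05) / 32   # idle fraction of total memory
print(used)          # [1, 4, 1]
print(waste < 1/32)  # True: waste is bounded by one stage out of N
```

With N = 32, as in the chip design quoted above, the worst-case waste per tiny logical stage is under 1/32 of the total, at the cost of the wiring and power overhead of 32 physical stages.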
[Figure 1 from [Bo13]: RMT model architecture. (a) RMT model as a sequence of logical Match-Action stages: programmable parser, logical stages 1..N with match tables, VLIW action units, statistics state and switch state (metadata), recombination, and configurable output queues. (b) Flexible match table configuration: ingress and egress logical match tables mapped onto physical stages 1..M. (c) VLIW action architecture: match results select VLIW instructions whose action units operate on the packet header vector over a very wide header bus.]
Image taken from [Bo13]
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 33 / 48
Design of protocol-independent switches: the PISA architecture
Example of single-chip implementation
64 ports × 10 Gb/s, i.e., aggregate throughput of 960M packets/s
64 byte Ethernet frame + 8 byte preamble + 12 byte interframe gap
1 GHz operating frequency
single pipeline with 32 stages
each physical stage has:
106 1000 × 112 bit SRAM blocks, used for 80 bit wide hash tables and to store actions and statistics
16 2000 × 40 bit TCAM blocks
e.g., longest prefix matching on 1 million IP prefixes
blocks may be used in parallel for wider matches, e.g., an 80 bit ACL lookup using two blocks
hash-based binary match in SRAM is 6× cheaper in area than TCAM ternary match
total memory across the 32 stages is 379.9 Mbit SRAM and 40.96 Mbit TCAM
packet instructions do not support state machines
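The figures on this slide follow from direct arithmetic on the stated parameters (64 × 10 Gb/s ports, 84 bytes per minimum-size frame on the wire, 32 stages with 106 SRAM and 16 TCAM blocks each). The exact packet rate comes out slightly below the quoted 960M packets/s:

```python
# Checking the single-chip numbers from the slide.

ports, rate = 64, 10e9                # 64 ports at 10 Gb/s
min_frame = (64 + 8 + 12) * 8         # frame + preamble + IFG, in bits
pps = ports * rate / min_frame
print(round(pps / 1e6))               # ~952M packets/s (quoted as 960M)

stages = 32
sram = stages * 106 * 1000 * 112      # 106 blocks of 1000 x 112 bit per stage
tcam = stages * 16 * 2000 * 40        # 16 blocks of 2000 x 40 bit per stage
print(sram / 1e6, tcam / 1e6)         # ~379.9 Mbit SRAM, 40.96 Mbit TCAM
```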
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 34 / 48
Languages for data plane programmability: P4
Section 5
Languages for data plane programmability: P4
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 35 / 48
Languages for data plane programmability: P4
P4 context
P4 language consortium https://p4.org/
specifications: P4_14 (2018) and P4_16 (2019)
based on the Portable Switch Architecture (PSA), an extension of the PISA architecture (2018)
open source code and open documentation
P4 compiler
Behavioral Model (BMv2), i.e., a P4 software switch, which can run standalone or in Mininet
P4Runtime, i.e., a runtime API and protocol to control data-plane programs
support on commercial chipsets (e.g., the 12.8 Tb/s Tofino2 chipset by Barefoot Networks) or on NetFPGA
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 36 / 48
Languages for data plane programmability: P4
P4 vs OpenFlow
P4 is used to configure the switch, i.e., to define how packets are parsed and processed
OpenFlow (or any other south-bound interface) is used to populate processing rules in the forwarding tables
Image taken from [Bo14]
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 37 / 48
Languages for data plane programmability: P4
Design goals 1/2
Reconfigurability
The controller can redefine the packet parsing and processing on the fly
Protocol independence
The controller specifies any packet format through the definition of
a packet parser to extract header fields
a collection of typed match+action tables that process these headers
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 38 / 48
Languages for data plane programmability: P4
Design goals 2/2
Target independence
Details of the underlying switch are not exposed to the controller programmer
The compiler takes the switch’s capabilities into account when turning a target-independent description (written in P4) into a target-dependent program (used to configure the switch)
support for switch ASICs, network processors, reconfigurable switches, software switches, FPGAs
similarly to a C programmer, who does not need to know the specifics of the underlying CPU
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 39 / 48
Languages for data plane programmability: P4
Abstract forwarding model
fully programmable parser to define new headers (vs. the fixed parser in OpenFlow)
match-action tables in series or in parallel (vs. in-series tables in OpenFlow)
Image taken from [Bo14]
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 40 / 48
Languages for data plane programmability: P4
Operations in the abstract forwarding model
Configure
Determines which protocols are supported and how the switch processes packets
program the parser
set the order of match+action stages
specify the header fields processed by each stage
re-configuration should not interrupt forwarding
Populate
Determines the policy applied to packets at any given time
add/modify/remove entries in the match+action tables that were specified during configuration
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 41 / 48
Languages for data plane programmability: P4
Processing in the abstract forwarding model
Ingress match-action table
may modify the packet header
determines the egress port(s) and the queue where to store the packet
based on ingress processing, the packet may be forwarded, replicated (for multicast, span, or to the control plane), dropped, or trigger flow control
Egress match-action table
may modify the packet header (e.g., multicast copies)
Action tables
Counters, policers, etc., can be associated with a flow to track packet-by-packet state
Metadata
Packets can carry additional information between stages, called metadata, which is treated identically to packet header fields
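A rough sketch of the ingress step described above, in Python: a match on a header field selects an action, and the forwarding decision is carried forward as metadata. All table contents and field names here are hypothetical:

```python
# Sketch of ingress match-action processing. Illustrative only.

def ingress(hv, table):
    """Match hv's destination MAC in the table and run the action."""
    action, params = table.get(hv["eth_dst"], ("drop", {}))
    if action == "forward":
        hv["egress_port"] = params["port"]   # decision stored as metadata
        hv["dropped"] = False
    else:                                    # default: drop
        hv["dropped"] = True
    return hv

l2_table = {"08:00:00:00:01:11": ("forward", {"port": 1})}
hv = ingress({"eth_dst": "08:00:00:00:01:11"}, l2_table)
print(hv["egress_port"], hv["dropped"])  # 1 False
```

In a real pipeline the same pattern repeats for the egress table, which sees the metadata written at ingress (egress port, queue) and may further rewrite the header, e.g., per multicast copy.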
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 42 / 48
Languages for data plane programmability: P4
P4 language (examples)
Header definition
header ethernet_t {
bit<48> dstAddr;
bit<48> srcAddr;
bit<16> etherType;
}
header ipv4_t {
...
}
struct headers {
ethernet_t ethernet;
ipv4_t ipv4;
}
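The `ethernet_t` header above corresponds to 14 bytes on the wire (6 + 6 + 2). A host-side Python sketch of the same layout using the `struct` module — illustrative only, not part of P4:

```python
# Decode the same 14-byte Ethernet layout declared by ethernet_t.
import struct

def parse_ethernet(frame: bytes):
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {"dstAddr": dst.hex(":"),
            "srcAddr": src.hex(":"),
            "etherType": ethertype}

frame = bytes.fromhex("080000000111" "080000000122" "0800") + b"payload"
hdr = parse_ethernet(frame)
print(hex(hdr["etherType"]))  # 0x800, i.e., IPv4
```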
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 43 / 48
Languages for data plane programmability: P4
P4 language (examples)
Parser definition
parser MyParser (..) {
state start {
transition parse_ethernet;
}
state parse_ethernet {
packet.extract(hdr.ethernet);
transition select(hdr.ethernet.etherType) {
TYPE_IPV4: parse_ipv4;
default: accept;
}
}
state parse_ipv4 {
packet.extract(hdr.ipv4);
transition accept;
}
}
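The parser above is a small state machine: start → parse_ethernet → (select on etherType) → parse_ipv4 or accept. The Python sketch below mimics those transitions; `TYPE_IPV4 = 0x0800` is assumed (the constant is not defined in the snippet above) and the header lengths are the standard ones:

```python
# State-machine sketch of MyParser. Illustrative only.
TYPE_IPV4 = 0x0800  # assumed value of the P4 constant

def run_parser(frame: bytes):
    headers, state = {}, "start"
    while state != "accept":
        if state == "start":
            state = "parse_ethernet"
        elif state == "parse_ethernet":
            headers["ethernet"] = frame[:14]               # extract
            ethertype = int.from_bytes(frame[12:14], "big")
            state = "parse_ipv4" if ethertype == TYPE_IPV4 else "accept"
        elif state == "parse_ipv4":
            headers["ipv4"] = frame[14:34]                 # 20-byte base header
            state = "accept"
    return headers

frame = bytes(12) + (0x0800).to_bytes(2, "big") + bytes(20)
print(sorted(run_parser(frame)))  # ['ethernet', 'ipv4']
```

A non-IPv4 etherType takes the `default: accept` branch, so only the Ethernet header is extracted.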
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 44 / 48
Languages for data plane programmability: P4
P4 language (examples)
Ingress processing 1/2
control MyIngress (...) {
action drop() {
mark_to_drop(standard_metadata);
}
action ipv4_forward(macAddr_t dstAddr , egressSpec_t port) {
standard_metadata.egress_spec = port;
hdr.ethernet.srcAddr = hdr.ethernet.dstAddr;
hdr.ethernet.dstAddr = dstAddr;
hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
}
...
}
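What `ipv4_forward` does can be restated in plain Python: pick the egress port, rewrite the Ethernet addresses for the next hop, and decrement the TTL. The dictionaries below stand in for the header and the standard metadata; the values are illustrative:

```python
# Behavioral restatement of the ipv4_forward action. Illustrative only.

def ipv4_forward(hdr, meta, dst_addr, port):
    meta["egress_spec"] = port           # select the output port
    hdr["eth_src"] = hdr["eth_dst"]      # old destination MAC becomes source
    hdr["eth_dst"] = dst_addr            # next hop's MAC from the table entry
    hdr["ttl"] -= 1                      # decrement TTL
    return hdr, meta

hdr, meta = ipv4_forward(
    {"eth_dst": "aa:aa:aa:aa:aa:aa", "eth_src": "bb:bb:bb:bb:bb:bb", "ttl": 64},
    {}, dst_addr="08:00:00:00:01:11", port=1)
print(meta["egress_spec"], hdr["ttl"])  # 1 63
```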
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 45 / 48
Languages for data plane programmability: P4
P4 language (examples)
Ingress processing 2/2
control MyIngress (...) {
...
table ipv4_lpm {
key = {
hdr.ipv4.dstAddr: lpm;
}
actions = {
ipv4_forward;
drop;
NoAction;
}
size = 1024;
default_action = drop();
}
apply {
if (hdr.ipv4.isValid()) {
    ipv4_lpm.apply();
}
}
}
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 46 / 48
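The `lpm` match kind in `ipv4_lpm` selects the most specific matching prefix, falling back to the default action on a miss. A behavioral sketch using the stdlib `ipaddress` module, with illustrative table contents:

```python
# Longest-prefix-match lookup, as ipv4_lpm behaves. Illustrative only.
import ipaddress

table = {
    ipaddress.ip_network("10.0.1.0/24"): ("ipv4_forward", {"port": 1}),
    ipaddress.ip_network("10.0.0.0/8"):  ("ipv4_forward", {"port": 2}),
}

def lpm_lookup(dst, default=("drop", {})):
    addr = ipaddress.ip_address(dst)
    hits = [net for net in table if addr in net]
    if not hits:
        return default                  # default_action = drop()
    return table[max(hits, key=lambda net: net.prefixlen)]  # most specific wins

print(lpm_lookup("10.0.1.1"))    # ('ipv4_forward', {'port': 1})
print(lpm_lookup("10.9.9.9"))    # ('ipv4_forward', {'port': 2})
print(lpm_lookup("192.168.0.1")) # ('drop', {})
```

In the hardware described earlier, this is exactly the kind of lookup the per-stage TCAM blocks perform at line rate.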
Languages for data plane programmability: P4
P4 language (examples)
Controller programming
Table loaded by the controller into the switch (json format)
"table_entries": [
{
"table": "MyIngress.ipv4_lpm",
"default_action": true ,
"action_name": "MyIngress.drop",
"action_params": { }
},
{
"table": "MyIngress.ipv4_lpm",
"match": {
"hdr.ipv4.dstAddr": ["10.0.1.1", 32]
},
"action_name": "MyIngress.ipv4_forward",
"action_params": {
"dstAddr": "08:00:00:00:01:11",
"port": 1
}
}
]
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 47 / 48
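The entries above are ordinary JSON, so a controller-side script can load them and separate default actions from match rules before installing them. The parsing logic below is an illustrative sketch, not the P4Runtime API:

```python
# Load controller table entries (same shape as the slide's JSON).
import json

entries = json.loads("""
[
 {"table": "MyIngress.ipv4_lpm",
  "default_action": true,
  "action_name": "MyIngress.drop",
  "action_params": {}},
 {"table": "MyIngress.ipv4_lpm",
  "match": {"hdr.ipv4.dstAddr": ["10.0.1.1", 32]},
  "action_name": "MyIngress.ipv4_forward",
  "action_params": {"dstAddr": "08:00:00:00:01:11", "port": 1}}
]
""")

# Split the entries: per-table default actions vs. explicit match rules.
defaults = {e["table"]: e["action_name"]
            for e in entries if e.get("default_action")}
rules = [(e["match"], e["action_name"], e["action_params"])
         for e in entries if "match" in e]

print(defaults["MyIngress.ipv4_lpm"])  # MyIngress.drop
print(rules[0][1])                     # MyIngress.ipv4_forward
```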
Languages for data plane programmability: P4
Bibliography
OPP16 Bianchi G., et al., “Open Packet Processor: a programmable architecture for wire speed platform-independent stateful in-network processing”, arXiv:1605.01977, 2016
BC14 Bianchi G., et al., “OpenState: programming platform-independent stateful OpenFlow applications inside the switch”, ACM SIGCOMM CCR, 2014
Bo13 Bosshart P., et al., “Forwarding metamorphosis: Fast programmable match-action processing in hardware for SDN”, ACM SIGCOMM CCR, 2013
Bo14 Bosshart P., et al., “P4: Programming protocol-independent packet processors”, ACM SIGCOMM CCR, 2014
Kr15 Kreutz D., et al., “Software-defined networking: A comprehensive survey”, Proceedings of the IEEE, 2015
Na08 Naous J., et al., “Implementing an OpenFlow switch on the NetFPGA platform”, ACM/IEEE ANCS, 2008
OF15 OpenFlow Switch Specification, Version 1.5.1, Open Networking Foundation, March 2015
FB19 Pontarelli S., et al., “FlowBlaze: Stateful Packet Processing in Hardware”, NSDI ’19
Giaccone (Politecnico di Torino) SDN and programmable switches Nov. 2019 48 / 48