Control

Fred Kuhns
fredk@arl.wustl.edu
Applied Research Laboratory
Department of Computer Science and Engineering
Washington University in St. Louis

Post on 31-Dec-2015

Fred Kuhns - 04/19/23

Virtual Networking – Basic Concepts

• Substrate Links interconnect adjacent Substrate Routers.
• Meta Links interconnect adjacent Meta Routers and are defined within the context of a substrate link.
• A Substrate Router hosts one or more Meta Router instances.
• Substrate links may be tunneled within existing networks: IP, MPLS, etc.

Adding a Node

Install new substrate router

Create substrate links between peers

Instantiate meta router(s)

Define meta-links between meta nodes (routers or hosts)

System Components

• General-purpose processing engines (PE/GP).
  – Shared: PlanetLab VM environment.
    • Local PlanetLab node manager to configure and manage VMs
      – vserver, vnet may change to support substrate functions
    • Implement substrate functions in kernel
      – rate control, mux/demux, substrate header processing
  – Dedicated: no local substrate functions
    • May choose to implement substrate header processing and rate control.
    • Substrate uses VLANs to ensure isolation (VLAN == MRid)
    • Can use 802.1Q priorities to isolate traffic further.
• NP blades (PE/NP).
  – Shared: user supplies parse and header formatting code.
  – Dedicated: user has full access to and control over the hardware device
• General Meta-Processing Engine (MPE) notes:
  – Use loopback to enforce rate limits between dedicated MPEs
  – Legacy node modeled as dedicated MPE; use loopback blade to remove/add substrate headers.
• Substrate links: interconnect substrate nodes
  – Meta-links defined within their context.
  – Assume an external entity configures end-to-end meta-nets and meta-links
  – Substrate links configured outside of the node manager's context

Switch

• Switch Blade Specs:
  – Promentum™ ATCA-2210
  – http://www.radisys.com/products/ds-page.cfm?productdatasheetsid=1191
  – 20-port 10GE fabric switch
    • 14 10GE links to user slots
    • 4 10GE links for external connections (up/cross links) on front panel
  – 24-port 1GE Base switch
    • 14 1GE links to user slots
    • 1GE link to redundant switch blade
    • 1 10GE and 4 1GE links for external connections (up/cross links) on front panel
  – Wire-speed L2 and L3 switching
  – 4K IEEE 802.1Q VLANs
  – Etc…
• Traversing the Switch:
  – Switching is based on Ethernet Destination Address
  – Isolation is based on VLAN.
    • One VLAN will be assigned to each MetaNet present on a Substrate Router.
    • All switch traffic for a MetaNet will be required to use its assigned VLAN.
  – Frames from a MetaNet will only be transmitted to a port which is allowed to receive the specified VLAN.
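The traversal rules above can be sketched as a toy forwarding function. This is only an illustration under stated assumptions: the class and method names (`FabricSwitch`, `allow`, `learn`, `forward`) are hypothetical, and frames to unknown destinations are simply dropped here, whereas a real L2 switch would normally flood them within the VLAN.

```python
from dataclasses import dataclass, field


@dataclass
class FabricSwitch:
    """Toy model of VLAN-isolated L2 switching (all names are illustrative)."""
    port_vlans: dict = field(default_factory=dict)  # port -> set of allowed VLAN ids
    mac_table: dict = field(default_factory=dict)   # (VLAN id, dst MAC) -> port

    def allow(self, port, vlan):
        """Permit `port` to send and receive frames tagged with `vlan`."""
        self.port_vlans.setdefault(port, set()).add(vlan)

    def learn(self, vlan, mac, port):
        """Record that `mac` is reachable through `port` within `vlan`."""
        self.mac_table[(vlan, mac)] = port

    def forward(self, in_port, vlan, dst_mac):
        """Return the output port, or None if the frame must be dropped."""
        if vlan not in self.port_vlans.get(in_port, set()):
            return None  # sender's port is not a member of this VLAN
        out_port = self.mac_table.get((vlan, dst_mac))
        if out_port is None or vlan not in self.port_vlans.get(out_port, set()):
            return None  # unknown destination, or port not allowed to receive the VLAN
        return out_port
```

Because the lookup key includes the VLAN id, a MetaNet can never reach a destination address that was learned in another MetaNet's VLAN.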

Packet Processing

• Key features
  – 16 32-bit 1.4 GHz micro-engines
    • peak instruction rate >20 GIPS
    • 8 hw contexts per processor
    • support >50 i/byte (input & output)
    • pipeline connections for streaming
  – four QDR SRAM interfaces and three RDRAM interfaces
  – high IO bandwidth (up to 20G)
  – XScale control processor
  – encryption/decryption engine
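The peak instruction rate quoted above follows from simple arithmetic. The short check below assumes that each micro-engine issues at most one instruction per clock cycle; the slide does not state the issue rate, so treat that as an assumption.

```python
MICROENGINES = 16
CLOCK_MHZ = 1400  # 1.4 GHz per micro-engine

# Assumption: one instruction issued per micro-engine per clock cycle.
peak_mips = MICROENGINES * CLOCK_MHZ  # millions of instructions per second
peak_gips = peak_mips / 1000

print(peak_gips)  # 22.4, consistent with the ">20 GIPS" figure
```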

System Architecture

• General purpose blades.
  – shared blades run Plab OS
    • no change to current apps
  – also support dedicated blades
  – use separate blade server to preserve ATCA slots for NPs
• NP blades.
  – support dedicated PEs
    • control from Vserver on PE/GP
  – shared PE options
    • shared NP for fast path
    • shared NP with plugins
• 10 GE fabric switch
  – VLANs used to isolate metarouters
  – uplinks for connecting to multiple chassis
• Good ratio of PEs to LC: 3:1

[Figure: chassis layout. A 10 GE switch on the Switch Blade connects the Line Card, PE/GP and PE/NP blades. Labels: up to 10 1GE interfaces; compute blade with disk; Radisys 7010; Radisys 7010 with RTM; 1 GE for control, 10 Gb/s for data.]

Block Diagram of a Meta-Router

[Figure: a Meta-Router in which MPEk1, MPEk2 and the control MPEk3 are interconnected by a Meta Switch; meta-interfaces numbered 0 to 5; data-path rate labels 1G, 1G, .5G, 2G, 1G, .5G and 3G, .1G, .1G, 3G, .1G, .1G.]

• Control/Management uses the Base channel (Control Net: IPv4).
• Meta Interfaces (MI): each MI is connected to a meta-link.
• Meta-Processing Engines (MPE):
  – virtual machine, COTS PC, NPU, FPGA
  – PEs differ in ease of “programming” and performance
  – an MR may use one or more PEs, possibly of different types
• MPEs are interconnected in the data plane by a meta-switch. Each packet includes the Meta-Router and Meta-PE identifier.
• Some substrate-detected errors or events are reported to the Meta-Router “control” MPE.
• The first Meta-Processing Engine (MPE) assigned to Meta-Network MNetk is called MPEk1.

System Block Diagram

[Figure: system block diagram. A Fabric Ethernet Switch (10Gbps, data path) and a Base Ethernet Switch (1Gbps, control) interconnect PE/NP blades (NPU-A and NPU-B, each with an XScale, plus TCAM), PE/GP blades (GP CPU, GbE interface, PCI, 2x1GE) and line cards (LC) with a 10 x 1GbE RTM. A loopback maps VLANX to VLANY. The shelf manager is attached over I2C (IPMI). The Node Server hosts the Node Manager and user login accounts.]

Top-Level View (exported) of the Node

• Node Server: Node Manager, user login accounts
• Substrate Control
• Exported Node Resource List (Processing engines, Substrate Links)
  – PE/GP: (control, IPaddr), (platform, x86), (type, dedicated), …
  – PE/GP: (control, IPaddr), (platform, x86), (type, linux_vserver), …
  – PE/NP: (control, IPaddr), (platform, IXP2800), (type, IXP_DEDICATED)
  – PE/NP: (control, IPaddr), (platform, IXP2800), (type, IXP_SHARED)
  – …
  – S-Link: (type, p2p), (peer, XXX), (BW, XXGbps)
  – S-Link: (type, p2p), (peer, _Desc_), (BW, XGbps)
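One way to picture the exported resource list is as a set of attribute dictionaries that a node manager can filter. The encoding below is a hypothetical sketch: the variable names and the `select` helper are not from the slides, and the values `IPaddr`, `XXX`, `_Desc_`, `XXGbps` and `XGbps` are the unfilled placeholders shown on the slide, not real data.

```python
# Hypothetical encoding of the exported list as attribute dictionaries.
# Placeholder values are copied from the slide as-is.
PROCESSING_ENGINES = [
    {"kind": "PE/GP", "control": "IPaddr", "platform": "x86", "type": "dedicated"},
    {"kind": "PE/GP", "control": "IPaddr", "platform": "x86", "type": "linux_vserver"},
    {"kind": "PE/NP", "control": "IPaddr", "platform": "IXP2800", "type": "IXP_DEDICATED"},
    {"kind": "PE/NP", "control": "IPaddr", "platform": "IXP2800", "type": "IXP_SHARED"},
]

SUBSTRATE_LINKS = [
    {"type": "p2p", "peer": "XXX", "bw": "XXGbps"},
    {"type": "p2p", "peer": "_Desc_", "bw": "XGbps"},
]


def select(resources, **wanted):
    """Return the resources whose attributes match every keyword given."""
    return [r for r in resources
            if all(r.get(key) == value for key, value in wanted.items())]
```

For example, `select(PROCESSING_ENGINES, platform="IXP2800")` returns the two PE/NP entries.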

Substrate: Enabling an MR

[Figure: Meta-Router MR1 for MNetk. PEs host MPEk1 and MPEk2 (MNetk data plane) and MPEk3 (MNetk control and management plane). Line cards (LC) and a host located within the node attach through the 10GbE fabric, with a loopback on VLANk. Meta-interfaces MI0 to MI4 belong to MNetk. The figure carries numbered callouts 0 to 7.]

• Define Meta-Interface mappings
• Enable VLANk on fabric switch ports
• Update shared MPEs for MI and inter-MPE traffic
• Update host with local Net gateway
• Allocate data-plane MPEs
• Allocate control-plane MPE (required)
• Enable control over Base switch (IP-based)
• Use loopback to define interfaces internal to the system node.

Block Diagram

[Figure: meta-router block diagram. Line cards sit on either side of the fabric switch; shared PE/NP, shared PE, dedicated PE and shared PE/GP blades host meta-routers MR1 to MR5. The base switch (control) connects the Node Manager / Node Server, which is attached to the Internet. Open questions marked in the figure: VMM?, ‘slice’/MN VMs?]

• Meta-Interfaces are rate controlled: each MR:MI pair is assigned its own rate-controlled queue.
• Line cards map each received packet to an MR and MI (lookup table: map to MR:MI), for example MR5:MI1.
• Lookup tables on the PEs map to a (Port, MetaLink) pair.
• Shared PE/GP: a VMM (“VM” manager) hosts the meta-net5 control and app-level service.
• Meta-net control and management functions (configure, stats, routing etc.) communicate with the MR over the separate base switch.

Partitioning the Control Plane

• Substrate manager
  – Initialization: discover system HW components and capabilities (blades, links etc.)
  – Hides low-level implementation details
  – Interacts with shelf manager for resetting boards or detecting failures.
• Node manager
  – Initialization: request system resource list
  – Operational: allocate resources to meta-networks (slice authorities?)
  – Request substrate to reset MPEs
• Substrate assumptions:
  – All MNets (slices) with a locally defined meta-router/service (sliver) have a control process to which the substrate can send exception packets and event notifications.
    • Communication:
      – out-of-band uses Base interface and internal IP addresses
      – in-band uses data plane and MPE id.
    • Notifications:
      – ARP errors, improperly formatted frame, interface down/up, etc.
  – If a meta-link is a pass-through link then the Node manager is responsible for handling meta-net level errors/event notifications, for example when a link goes down.

Initialization: Substrate Resource Discovery

• Creates list of devices and their Ethernet addresses
  – Network Processor (NP) blades:
    • Type: network-processor, Arch: ixp2800, Memory: 768MB (DRAM), Disk: 0, Rate: 5Gbps
  – General Processor (GP) blades:
    • Type: linux-vserver, Arch: X, Memory: X, Disk: X, Rate: X
  – Line Card blades:
    • not exposed to node manager, used to implement meta-interfaces
    • another entity creates substrate links to interconnect peer substrate nodes.
    • create table mapping line card blades, physical links and Ethernet addresses.
• Internal representation:
  – Substrate device ID: <ID, SDid>
  – If device has a local control daemon: <Control, IP Address>
  – Type = Processing Engine (NP/GP):
    • <Platform, (Dual IXP2800|Xeon|???)>, <Memory, #>, <Storage, #>, <Clock, (1.4GHz|???)>, <Fabric, 10GbE>, <Base, 1GbE>, ???
  – Type = Line Card
    • <Platform, Dual IXP2800>, <Ports, {<Media, Ethernet>, <Rate, 1Gbps>}>, ???
  – Substrate Links
    • <Type, p2p>, <Peer, Ethernet Address>, <Rate Limit>, …
    • Meta-Link list: <MLid, MLI>, <MR, MRid>, …

Initialization: Exported Resource Model

• List of available elements
  – Attributes of interest?
    • Platform: IXP2800, PowerPC, ARM, x86; Memory: DRAM/SRAM; Disk: XGB; Bandwidth: 5Gbps; VM_Type: linux-vserver, IXP_Shared, IXP_Dedicated, G__Dedicated; Special: TCAM
  – network-processor: NP-Shared, NP-Dedicated
  – General purpose: GP-Shared (linux-vserver), GP-Dedicated
  – Each element is assigned an IP address for control (internal control LAN)
• List of available substrate links:
  – Access networks (expect Ethernet LAN interface): substrate link is multi-access
    • Attributes: Access: multi-access, Available Bandwidth, Legacy protocol(s) (e.g. IP), Link protocol (e.g. Ethernet), Substrate ARP implementation.
  – Core interface: assume point-to-point, bandwidth controlled
    • Attributes: Access: Substrate; Bandwidth, Legacy protocol?

Instantiate a router: Register MNet

• Substrate assumptions:
  – All MNets (slices) with a locally defined meta-router/service (sliver) will have defined a control process to which the substrate can send exception packets and event notifications.
    • Communication: out-of-band uses Base interface and internal IP addresses, in-band uses data plane. ???
    • Notifications: ARP errors, improperly formatted frame, interface down/up, etc.
  – If a meta-link is a pass-through link then the Node manager is responsible for handling errors/event notification.
• Node manager actions:
  – Request binding of MNidk to allocated device (use SDid from initialization)
    • Substrate enables VLANk on applicable ports of the fabric switch
  – Allocate hardware resources (see following discussion for different scenarios)
  – If control module already instantiated then notify it of the MR location (IP address of control interface).
  – If creating control entity then register it with any line cards with meta-router interfaces (for exception traffic). ???
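The binding step in the node manager actions can be sketched as follows. This is a minimal illustration, not the actual substrate interface: the class `Substrate`, its `bind` method, the device-to-port map and the VLAN pool are all hypothetical names introduced for the example.

```python
class Substrate:
    """Toy substrate manager: binds a meta-net id to a VLAN on fabric ports."""

    def __init__(self, device_ports, vlan_ids):
        self.device_ports = dict(device_ports)  # SDid -> fabric switch port
        self.free_vlans = list(vlan_ids)        # VLAN ids not yet assigned
        self.vlan_of = {}                       # MNid -> VLAN id
        self.port_vlans = {}                    # port -> set of enabled VLAN ids

    def bind(self, mnid, sdid):
        """Bind meta-net `mnid` to device `sdid` and return the VLAN in use."""
        port = self.device_ports[sdid]          # KeyError if the device is unknown
        if mnid not in self.vlan_of:
            if not self.free_vlans:
                raise RuntimeError("no free VLAN ids")
            self.vlan_of[mnid] = self.free_vlans.pop(0)
        vlan = self.vlan_of[mnid]
        # Enable the meta-net's VLAN on the fabric port of the allocated device.
        self.port_vlans.setdefault(port, set()).add(vlan)
        return vlan
```

Binding the same meta-net to a second device reuses its VLAN, so all of its devices share one isolated broadcast domain on the fabric switch.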

Instantiate a router: Register Meta-Router (MR)

• Define MR-specific Meta-Processing Engines (MPE):
  – Register MR ID MRidk with substrate
    • substrate allocates VLANk and binds it to MRidk
  – Request Meta-Processing Engines
    • shared or dedicated, NP or GP; if shared then relative allocation (rspec)
      – shared: implies internal implementation has support for substrate functions
      – dedicated w/substrate: user implements substrate functions.
      – dedicated no/substrate: implies substrate will remove any substrate headers from data packets before delivering to MPE. For legacy systems.
    • indicate if this MPE is to receive control events from substrate (Control_MPE).
    • substrate returns MPE id (MPid) and control IP address (MPip) for each allocated MPE
    • substrate internally records Ethernet address of MPE and enables VLAN on applicable port
    • substrate assumes that any MPE may send data traffic to any other MPE
      – MPE specifies target MPE rather than MI when sending packet.
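The MPE request described above can be sketched as a small allocator that hands back an (MPid, MPip) pair. Everything here is illustrative: the names `MetaRouterRegistry`, `request_mpe` and the mode strings are hypothetical, the IP addresses in the usage note are made up, and the sketch ignores relative allocation (rspec) by giving each engine to a single requester.

```python
from dataclasses import dataclass
from itertools import count

MODES = {"shared", "dedicated_with_substrate", "dedicated_no_substrate"}


@dataclass
class MPE:
    mpid: int            # MPE id returned to the requester
    mpip: str            # control IP address on the Base network
    kind: str            # "NP" or "GP"
    mode: str            # one of MODES
    control: bool        # True if this MPE receives substrate control events
    strip_headers: bool  # substrate removes substrate headers before delivery


class MetaRouterRegistry:
    def __init__(self, free_engines):
        # free_engines: list of (kind, control IP) pairs known from discovery.
        self.free = list(free_engines)
        self.ids = count(1)
        self.mpes = {}

    def request_mpe(self, kind, mode, control=False):
        """Allocate one engine of `kind`; return (MPid, MPip)."""
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        for index, (engine_kind, ip) in enumerate(self.free):
            if engine_kind == kind:
                del self.free[index]
                mpe = MPE(next(self.ids), ip, kind, mode, control,
                          strip_headers=(mode == "dedicated_no_substrate"))
                self.mpes[mpe.mpid] = mpe
                return mpe.mpid, mpe.mpip
        raise LookupError(f"no free {kind} engine")
```

A call such as `MetaRouterRegistry([("GP", "10.0.0.21")]).request_mpe("GP", "shared", control=True)` would return `(1, "10.0.0.21")` and mark that engine as the control MPE.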

Instantiate a router: Register Meta-Router (MR)

• Create meta-interfaces (with BW constraints)
  – create meta-interfaces associated with external substrate links
    • request meta-interface id (MIid) be bound to substrate link x (SLx).
      – we need to work out the details of how an SL is specified
    • We need to work out the details of who assigns inbound versus outbound meta-link identifiers (when they are used). If the downstream node assigns the label, then some entity (node manager?) reports the outgoing label. This node assigns the inbound label.
    • multi-access substrate/meta link: node manager or meta-router control entity must configure meta-interface for ARP. Set local meta-address and send destination address with output data packet.
    • substrate updates tables to bind MI to “receiving” MPE (i.e. where substrate sends received packets)
  – create meta-interfaces for delivery to internal devices (for example, legacy PlanetLab nodes)
    • create meta-interface associated with an MPE (i.e. the end system)
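The table updates implied by binding a meta-interface to a substrate link can be sketched with two dictionaries, one per direction. This is a hypothetical layout: the class name, the method names and the choice to key inbound traffic on (substrate link, inbound meta-link identifier) are assumptions made for illustration, since the slides leave the label assignment details open.

```python
class MetaInterfaceTables:
    """Toy per-node tables that bind meta-interfaces to substrate links."""

    def __init__(self):
        # (substrate link, inbound MLI) -> (MRid, MIid, receiving MPid)
        self.inbound = {}
        # (MRid, MIid) -> (substrate link, outbound MLI, rate limit in bit/s)
        self.outbound = {}

    def bind(self, mrid, miid, link, in_mli, out_mli, rate_bps, receiving_mpid):
        """Bind MR:MI to `link` with a bandwidth constraint."""
        if (link, in_mli) in self.inbound:
            raise ValueError("inbound meta-link identifier already in use on this link")
        self.inbound[(link, in_mli)] = (mrid, miid, receiving_mpid)
        self.outbound[(mrid, miid)] = (link, out_mli, rate_bps)

    def classify(self, link, in_mli):
        """Map a received packet to (MRid, MIid, receiving MPid), or None."""
        return self.inbound.get((link, in_mli))

    def egress(self, mrid, miid):
        """Map an outgoing MR:MI to (substrate link, outbound MLI, rate), or None."""
        return self.outbound.get((mrid, miid))
```

The inbound table answers "which MPE gets this packet", and the outbound table answers "which link, label and rate limit apply when this meta-router sends on this interface".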

Line Cards: Assumptions

• Initially use a simplified model
  – Core interfaces have point-to-point substrate links which correspond (physically or logically) to physical links.
  – LAN interfaces only support legacy IP traffic

Scenarios

• Shared PE/NP: send request to device controller on the XScale
  – Allocate memory for MR Control Block
  – Allocate microengine and load MR code for Parser and Header Formatter
  – Allocate meta-interfaces (output queues) and assign bandwidth constraints
• Dedicated PE/NP
  – Notify device control daemon that it will be a dedicated device. May require loading/booting a different image?
• Shared GP
  – use existing/new PlanetLab framework
• Dedicated GP
  – legacy PlanetLab node
  – other

IPv4

• Create the default IPv4 Meta-Router, initially in the non-forwarding state.
  – Register MetaNet: output Meta-Net ID = MNid
  – Instantiate IPv4 router: output Meta-Router ID = MRid
• Add interfaces for legacy IPv4 traffic:
  – Substrate supports defining a default protocol handler (Meta-Router) for non-substrate traffic.
  – for protocol=IPv4, send to IPv4 meta-router (specify the corresponding MPE).
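The default protocol handler can be pictured as a table keyed by protocol. The sketch below assumes the substrate identifies legacy traffic by its EtherType (0x0800 is the standard value for IPv4); the class and method names are hypothetical, as are the `MRid` and `MPid` values in the usage note.

```python
ETHERTYPE_IPV4 = 0x0800  # standard EtherType for IPv4


class DefaultHandlers:
    """Toy table of default handlers for non-substrate (legacy) traffic."""

    def __init__(self):
        self.handlers = {}  # EtherType -> (MRid, MPid)

    def register(self, ethertype, mrid, mpid):
        """Send all non-substrate frames of `ethertype` to MPE `mpid` of `mrid`."""
        self.handlers[ethertype] = (mrid, mpid)

    def dispatch(self, ethertype):
        """Return (MRid, MPid) for a legacy frame, or None if unhandled."""
        return self.handlers.get(ethertype)
```

After `register(ETHERTYPE_IPV4, "MR-ipv4", 1)`, a legacy IPv4 frame dispatches to `("MR-ipv4", 1)`, while any other protocol returns `None`.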

General Control/Management

• Meta-routers use Base channel to send requests to control entity on associated MPE devices
• Node manager sends requests to central substrate manager (xml-rpc?)
  – requests to configure, start/stop and tear down meta-routers (MPEs and MIs).
• Substrate enforces isolation and policies/monitors meta-router sending rates.
  – Rate exceeded error: if an MPE violates rate limits then its interface is disabled and the control MPE is notified (over Base channel).
• Shared NP
  – XScale daemon
  – requests: start/stop forwarding; allocate shared memory for table; get/set statistic counters; set/alter MR control lock; add/remove lookup table entries.
  – Lookup entries can be added to send data packets to control MPE; packet header may contain tag to indicate reason packet was sent
  – mechanism for allocating space for MR-specific code segments.
• Dedicated NP
  – MPE controls XScale. When XScale boots, a control daemon is told to load a specific image containing user code.
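The rate-exceeded behaviour can be sketched with a token bucket. This is one possible way to detect a violation, not necessarily the mechanism the substrate uses: the token bucket itself, the class name `RateMonitor`, the `"RATE_EXCEEDED"` event string and the callback standing in for the Base-channel notification are all assumptions.

```python
class RateMonitor:
    """Toy token-bucket monitor for one MPE interface."""

    def __init__(self, rate_bps, burst_bytes, notify_control):
        self.rate = rate_bps / 8.0            # refill rate in bytes per second
        self.burst = burst_bytes              # bucket depth in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0                       # time of the previous update
        self.enabled = True
        self.notify_control = notify_control  # stands in for the Base channel

    def send(self, nbytes, now):
        """Account for a packet sent at time `now`; return True if accepted."""
        if not self.enabled:
            return False
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes > self.tokens:
            # Violation: disable the interface and tell the control MPE.
            self.enabled = False
            self.notify_control("RATE_EXCEEDED")
            return False
        self.tokens -= nbytes
        return True
```

Once disabled, the interface stays down until some control action re-enables it; the sketch leaves that recovery path out.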

ARP for Access Networks

• The substrate offers an ARP service to meta-routers
• Meta-router responsibilities:
  – before enabling an interface, must register the meta-network address associated with the meta-interface
  – send destination (next-hop) meta-net address with packets (part of substrate internal header). Substrate will use ARP with this value.
  – if the meta-router wants to use a multicast or broadcast address then it must also supply the link-layer destination address. So the substrate must also export the link-layer type.
• Substrate responsibilities
  – all substrate nodes on an access network must agree on meta-net identifiers (MLIs)
  – issues ARP requests/responses using supplied meta-net addresses and meta-net id (MLI).
  – maintain ARP table and time out entries according to relevant RFCs.
  – ARP Failed error: if ARP fails for a supplied address then substrate must send packet (or packet context) to control MPE of meta-router.
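The substrate-side ARP table can be sketched as a cache keyed by (MLI, meta-net address) with per-entry expiry. The names below are hypothetical, the timeout value is left to the caller rather than taken from any RFC, and sending the actual ARP requests is out of scope; the sketch only shows the table maintenance and the ARP Failed notification.

```python
class SubstrateArp:
    """Toy ARP cache for one access network."""

    def __init__(self, ttl_seconds, notify_control):
        self.ttl = ttl_seconds
        self.notify_control = notify_control  # delivers events to the control MPE
        self.table = {}  # (MLI, meta-net address) -> (link-layer address, expiry)

    def learn(self, mli, meta_addr, link_addr, now):
        """Record a mapping learned from an ARP response received at `now`."""
        self.table[(mli, meta_addr)] = (link_addr, now + self.ttl)

    def lookup(self, mli, meta_addr, now):
        """Return the link-layer address, or None if missing or expired."""
        entry = self.table.get((mli, meta_addr))
        if entry is None:
            return None
        link_addr, expiry = entry
        if now >= expiry:
            del self.table[(mli, meta_addr)]  # timed out; caller must re-ARP
            return None
        return link_addr

    def resolution_failed(self, mli, meta_addr, packet):
        """Report an ARP Failed error, handing the packet to the control MPE."""
        self.notify_control(("ARP_FAILED", mli, meta_addr, packet))
```

Keying on the MLI as well as the address keeps meta-nets with overlapping address spaces from seeing each other's entries.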