TRANSCRIPT
Application of SDN:
Load Balancing & Traffic Engineering
Outline
1 OpenFlow-Based Server Load Balancing Gone Wild
Introduction
OpenFlow Solution
Partitioning the Client Traffic
Transitioning With Connection Affinity
Evaluation
Future Work
Introduction
Clients access an online service through a single public IP address.
Data centers host online services on multiple replica servers offering the same service; each has a unique IP address and an integer weight.
Front-end load balancers: direct each client request to a particular replica server.
Problems: Dedicated load balancers are expensive and quickly become a single point of failure and congestion.
OpenFlow Basic Solution
Plug-n-Serve system uses OpenFlow to reactively assign client requests to replicas based on the current network and server load.
Plug-n-Serve intercepts the first packet of each client request and installs an individual forwarding rule that handles the remaining packets of the connection.
Scalability Limitations:
◮ Overhead and delay in involving the relatively slow controller in every client connection.
◮ Many rules installed at each switch (separate rule for each client).
◮ Heavy load on the controller.
OpenFlow Features
Microflow rule: matches on all fields.
Wildcard rule: can have “don’t care” bits in some fields.
Rules can be deleted after a fixed time interval (a hard timeout).
Rules can be deleted after a specified period of inactivity (a soft timeout).
The switch counts the number of bytes and packets matching eachrule. The controller can poll these counter values.
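These features can be pictured with a toy rule model; the class and field names below are our illustration, not OpenFlow's wire format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlowRule:
    """Toy model of an OpenFlow rule; names are illustrative only."""
    match: dict                          # field -> value; omitted fields are "don't care"
    priority: int = 0
    hard_timeout: Optional[int] = None   # seconds until unconditional deletion
    soft_timeout: Optional[int] = None   # seconds of inactivity before deletion
    packet_count: int = 0                # per-rule counters the controller can poll
    byte_count: int = 0

    def matches(self, packet: dict) -> bool:
        # A microflow rule lists every field; a wildcard rule lists only some.
        return all(packet.get(k) == v for k, v in self.match.items())

# Microflow rule: pins down all fields of one connection.
micro = FlowRule(match={"src_ip": "1.2.3.4", "dst_ip": "9.9.9.9",
                        "src_port": 4321, "dst_port": 80},
                 priority=10, soft_timeout=60)
# Wildcard rule: only the destination port matters.
wild = FlowRule(match={"dst_port": 80}, priority=1)
```

A packet from the pinned-down connection matches both rules; any other web packet matches only the wildcard rule, so the switch's priority field decides which action wins.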
OpenFlow Alternative Approach
Use wildcard rules to direct incoming client requests based on the client IP addresses.
Switch performs an “action” of:
1 Rewriting the server IP address
2 Forwarding the packet to the output port associated with the chosen replica.
Rely on microflow rules only during transitions from one set of wildcard rules to another.
Soft timeouts allow these microflow rules to “self destruct” after a client connection completes.
Load-balancing Architecture
Constraints:
1 Generating an efficient set of rules for a target distribution of load.
2 Ensuring that packets in the same TCP connection reach the sameserver across changes in the rules.
Components:
1 Partitioning algorithm: Generates wildcard rules that balance loadover the replicas.
2 Transitioning algorithm: Moves from one set of wildcard rules toanother, without disrupting ongoing connections.
[1] Partitioning the Client Traffic
Must divide client traffic in proportion to the load-balancing weights.
Successive packets from the same TCP connection must be forwarded to the same replica ⇒ rules installed match on client IP addresses
Figure: Basic model from load balancer switch’s view
[1] Partitioning the Client Traffic
Binary tree is used to represent IP prefixes.
If ∑ αj is a power of 2 ⇒ the weights map directly onto the leaf nodes of the binary tree.
Each Rj is associated with αj leaf nodes, e.g. R2 is associated with four leaves.
If ∑ αj is not a power of 2 ⇒ find the closest power of 2 and renormalize the weights.
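The renormalization step can be sketched in code; the exact rounding policy for fractional leaves is our assumption:

```python
import math

def leaf_allocation(weights):
    """Map replica weights to leaf counts whose total is a power of two.
    Sketch of the renormalization step; the tie-breaking policy is ours."""
    total = sum(weights)
    target = 2 ** round(math.log2(total))          # closest power of two
    scaled = [w * target / total for w in weights]
    leaves = [math.floor(s) for s in scaled]
    # Hand leftover leaves to the largest fractional remainders.
    leftover = target - sum(leaves)
    by_frac = sorted(range(len(weights)),
                     key=lambda i: scaled[i] - leaves[i], reverse=True)
    for i in by_frac[:leftover]:
        leaves[i] += 1
    return leaves
```

With weights (3, 4, 1) the total is already 8, so each replica keeps its weight as a leaf count; with (3, 4, 2) the total of 9 is renormalized down to 8 leaves.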
Figure: Wildcard rule assigned to each leaf node
Minimizing the Number of Wildcard Rules
Creating a wildcard rule for each leaf node ⇒ large number of rules.
Aggregate siblings associated with the same server replica:
10* can represent 100* and 101*, associated with R2.
00* can represent 000* and 001*, associated with R1.
6 wildcard rules instead of 8.
An alternate assignment can lead to only 4 rules: (0*, 10*, 110*, and 111*).
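The sibling-aggregation idea can be sketched as repeated merging of adjacent same-replica prefixes; this is a simplified illustration, not the paper's exact algorithm:

```python
def wildcard_rules(leaf_owners):
    """Collapse sibling leaves owned by the same replica into shorter
    prefixes. leaf_owners[i] is the replica owning leaf i of a complete
    binary tree, in address order. Simplified sketch."""
    depth = (len(leaf_owners) - 1).bit_length()
    rules = [(format(i, f"0{depth}b") + "*", owner)
             for i, owner in enumerate(leaf_owners)]
    merged = True
    while merged:
        merged = False
        out, i = [], 0
        while i < len(rules):
            if (i + 1 < len(rules)
                    and rules[i][1] == rules[i + 1][1]            # same replica
                    and len(rules[i][0]) == len(rules[i + 1][0])
                    and rules[i][0][:-2] == rules[i + 1][0][:-2]  # same parent
                    and rules[i][0][-2:] == "0*"):                # left sibling
                out.append((rules[i][0][:-2] + "*", rules[i][1]))
                merged, i = True, i + 2
            else:
                out.append(rules[i])
                i += 1
        rules = out
    return rules
```

With the leaves assigned contiguously (four for R2, then three for R1, one for R3), this collapses eight leaf rules into the four rules 0*, 10*, 110*, and 111*.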
Minimizing Change During Re-Partitioning
Weights αj may change over time: maintenance, energy saving, congestion.
Possible solution: regenerate wildcard rules from scratch.
Problems: Changes the replica selection for a large number of client IP addresses and increases the overhead of transitioning to new rules.
Minimizing Change During Re-Partitioning
Better Solution:
If the number of leaf nodes of a replica is unchanged ⇒ the rules of this replica may not need to change.
e.g. If α3 changes to 0 and α1 changes to 4: the rule of R2 remains unchanged, and R1 will only have one rule, 1*.
Create a new binary tree for the updated αj.
Pre-allocate leaf nodes to re-usable wildcard rules.
Re-usable wildcard rules: the ith highest bit is set to 1 in both the new and old αj, even if the old and new αj differ.
Allocate leaf nodes for larger groups rather than using existing rules of smaller pre-allocated nodes.
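The bit-matching reuse test can be illustrated as follows; this is only a sketch of the pre-allocation idea, and where each group sits in the tree is elided:

```python
def reusable_groups(old_weight, new_weight):
    """Leaf groups of size 2**b whose wildcard rule can survive a
    re-partition: bit b must be set in both the old and the new weight.
    Sketch only; the placement of groups in the tree is elided."""
    common = old_weight & new_weight
    return [1 << b for b in range(common.bit_length()) if (common >> b) & 1]
```

For example, an old weight of 6 (binary 110) and a new weight of 4 (binary 100) share the set bit for the 4-leaf group, so that group's rule can be kept, while the 2-leaf group must be reallocated.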
[2] Transitioning With Connection Affinity
Existing connections should complete at the original replica.
New Connection: The TCP SYN flag is set in the first packet of a new connection.
Approaches:
◮ Faster Transition: Direct some packets to the controller
◮ Slower Transition: Switch handles all packets
Transitioning Quickly With Microflow Rules
Rule directing all 0* traffic to the controller for inspection.
A dedicated high-priority microflow rule with a 60-second soft timeout for each connection.
Rule directs to the new replica R2 (for a SYN).
Rule directs to the old replica R1 (for a non-SYN).
Controller modifies the 0* rule to direct all future traffic to the new replica R2.
Transitioning With No Packets to Controller
Controller divides the address space for 0* into several smaller pieces, each represented by a high-priority wildcard rule (e.g., 000*, 001*, 010*, and 011*) directing traffic to the old replica R1.
A 60-second soft timeout is added to the higher-priority rules so they are deleted after no activity ⇒ traffic can then safely shift to R2.
Controller installs a single lower-priority rule directing 0* to the new replica R2.
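The rule set for this no-controller transition can be sketched as follows; the prefix notation and field names are our illustration:

```python
def transition_rules(prefix, old_replica, new_replica,
                     split_bits=2, timeout=60):
    """Carve `prefix` into 2**split_bits finer high-priority rules that
    keep existing connections on the old replica until they go idle,
    plus one low-priority catch-all sending fresh traffic to the new
    replica. Illustrative sketch; field names are ours."""
    base = prefix.rstrip("*")
    fine = [{"prefix": base + format(i, f"0{split_bits}b") + "*",
             "priority": 2, "replica": old_replica, "soft_timeout": timeout}
            for i in range(2 ** split_bits)]
    catch_all = {"prefix": prefix, "priority": 1,
                 "replica": new_replica, "soft_timeout": None}
    return fine + [catch_all]
```

`transition_rules("0*", "R1", "R2")` reproduces the five rules above: 000*, 001*, 010*, and 011* to R1 with 60-second soft timeouts, plus one lower-priority 0* rule to R2.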
Evaluation
α1 = 3, α2 = 4, α3 = 1
At time 75 sec: α2 = 0
Future Work: Non-Uniform Client Traffic
The target distribution of load is 50%, 25%, and 25% for R1, R2, and R3.
The actual division of load is (overwhelming) 75% for R1 and (underwhelming) 12.5% each for R2 and R3.
Solution:
Use OpenFlow counters for rules.
Identify severely overloaded and underloaded replicas.
Identify the set of rules to shift.
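The counter-driven correction might look like this toy sketch; the data layout and the heaviest-rule-first policy are our assumptions:

```python
def rules_to_shift(rule_load, target_share, total_load):
    """Pick wildcard rules to move away from overloaded replicas, using
    per-rule byte counters. rule_load maps prefix -> (replica, bytes);
    target_share maps replica -> desired load fraction. Toy sketch."""
    by_replica = {}
    for prefix, (replica, load) in rule_load.items():
        by_replica.setdefault(replica, []).append((prefix, load))
    shifts = []
    for replica, rules in by_replica.items():
        excess = sum(l for _, l in rules) - target_share[replica] * total_load
        # Move the heaviest rules first until the replica is back at target.
        for prefix, load in sorted(rules, key=lambda r: -r[1]):
            if excess <= 0:
                break
            shifts.append(prefix)
            excess -= load
    return shifts
```

With the numbers above (target 50/25/25 but actual 75/12.5/12.5), R1's single heavy 0* rule is flagged for splitting or shifting.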
Future Work: Network of Multiple Switches
SW1: forward packets with src IP in 1* to SW3, modify dst IP to R3.
SW1: forward packets with src IP in 00* to SW2, modify dst IP to R2.
SW1: forward packets with src IP in 01* to SW2, modify dst IP to R3.
SW2,SW3: forward packets to appropriate server.
Advantages
Computes concise wildcard rules that achieve a target distribution of the traffic.
Proactively installs wildcard rules in the switches to direct requests for large groups of clients without involving the controller.
Automatically adjusts to changes in load-balancing policies without disrupting existing connections.
Avoids the cost and complexity of separate load-balancer devices.
Allows flexibility of network topology.
Scales naturally as the number of switches and replicas grows, while directing client requests at line rate.
SDN and Traffic Engineering: SWAN
Outline
1 Achieving high utilization with software-driven WAN
Introduction
Services rely on low-latency inter-DC communication; hence resources are over-provisioned
Unable to fully leverage the investment:
◮ lack of co-ordination among services
◮ network under-subscribed on average
◮ poor efficiency of MPLS TE
Solution?
Introduction
Software-Driven WAN (SWAN) proposed by Microsoft
Enables inter-DC WAN to carry significantly more traffic.
Achieves high efficiency and utilization.
Enables updating the network’s data plane even under high load
Fully uses network capacity with only a small number of forwarding rules
Background & Motivation
Types of services:
Interactive Services
◮ critical path of end-user experience - eg. a DC contacts another DC to serve a user’s request
◮ highly sensitive to loss and delay
Elastic Services
◮ regular, timely delivery - eg. data replication
◮ sensitivity to delay varies
Background Services
◮ maintenance and provisioning activities - eg. copying all data of a service to another DC for long-term storage
◮ bandwidth hungry, requires more resources
◮ not sensitive to delay or latency
Background & Motivation - Issues with MPLS TE
Poor utilization
Daily traffic pattern on a busy link
Breakdown based on traffic type
Reduction in peak usage if background traffic is dynamically adapted
Background & Motivation - Issues with MPLS TE
Poor efficiency
Flows arrive in the order Fa followed by Fb and finally Fc
MPLS TE greedily assigns paths as shown in Fig. (a), while there exists a more efficient solution as shown in Fig. (b)
Background & Motivation - Issues with MPLS TE
Poor sharing
Link capacity = 1, each service (Si ⇒ Di) has unit demand
With link fairness, (S2 ⇒ D2) gets twice the throughput of the other services
SWAN Overview
SWAN’s sharing policies:
Small number of priority classes: Interactive ⇒ Elastic ⇒ Background (lowest priority)
◮ bandwidth allocated in strict precedence
◮ prefer shorter paths for higher priority classes
Except for interactive services, all others inform the SWAN controller about details of their demand. Interactive traffic is sent using the traditional approach.
Controller: up-to-date, global view of topology & demands; computes resource allocation for services
Per SDN paradigm, controller directly updates forwarding entries inswitches
SWAN Overview
Need for a scalable algorithm for global allocation
Computationally intensive (LP)
SWAN uses a practical approach
approximately fair with provable bounds and close to optimal
SWAN Overview
Atomic reconfiguration of a distributed switch
Each flow unit = 1, Link capacity = 1.5 units
SWAN computes multi-step congestion-free transition plan
SWAN Overview
Key concept
For each link, SWAN leaves a scratch capacity fraction s ∈ (0, 50%].
This scratch capacity guarantees that a congestion-free transition plan exists with a maximum of ⌈1/s⌉ − 1 steps.
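The bound can be checked numerically; this is a direct transcription of the ⌈1/s⌉ − 1 formula:

```python
import math

def max_transition_steps(s):
    """Worst-case number of steps in a congestion-free transition when
    every link keeps a scratch fraction s of its capacity free,
    0 < s <= 0.5, per the ceil(1/s) - 1 bound."""
    return math.ceil(1 / s) - 1
```

With s = 50% the transition completes in a single step; with s = 10% it may take up to 9 steps, a direct trade-off between reserved capacity and update speed.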
SWAN Overview
Switch hardware supports limited number of rules.
SWAN dynamically identifies and installs tunnels using LP.
What about network re-configuration? Will it disrupt traffic? SWAN sets aside scratch space (eg. 10%) on the switch to accommodate the new set of rules.
SWAN Design
Figure: Architecture of SWAN
Service brokers & hosts - hosts estimate the service’s demand (every Th time); the broker apportions the demand based on current limits; the broker also aggregates demand and updates the controller every Ts time.
Network agents - report topology changes to the controller, get traffic info. from the controller (every Ta time); reliably update switches.
Controller - uses info. on service demands and network topology (every Tc time), computes service allocations, decides forwarding plane config. updates, and instructs service brokers and network agents accordingly.
SWAN Design
Forwarding plane configuration
◮ uses label-based forwarding (similar to VLAN tagging)
◮ label assigned by source; transit switches use the label and a table to route
Computing service allocations
◮ approximate max-min fairness among same-priority classes
Updating forwarding state
◮ update traffic distribution across tunnels, using scratch capacity and an LP-based algorithm
◮ updating tunnels
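Label-based forwarding can be pictured with hypothetical tables; the switch names, label value, and port names below are invented for illustration:

```python
# The ingress switch stamps a tunnel label on each packet; transit
# switches then forward on (label -> port) lookups without inspecting IPs.
FORWARDING = {
    "DC_A": {42: "port-to-DC_B"},   # ingress for hypothetical tunnel label 42
    "DC_B": {42: "port-to-DC_C"},   # transit
    "DC_C": {42: "deliver-local"},  # egress
}

def next_hop(switch, label):
    """Look up the output port for a labeled packet at a switch."""
    return FORWARDING[switch][label]
```

Because only the small label table changes when tunnels move, re-routing a tunnel does not require rewriting per-destination IP rules at every transit switch.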
SWAN Design - Handling Failures
Network agents report link/switch failures to the controller.
Controller re-computes the allocation and updates network agents and service brokers, etc.
Network agents, service brokers, and the controller have backupinstances.
Conclusion
SWAN enables highly efficient and flexible inter-DC WAN
Scratch capacity on the links and scratch space on the switches enable updates without congestion.
Test-bed and data-driven simulations show SWAN can carry 60% more traffic.