TRANSCRIPT
Application of SDN:
Load Balancing & Traffic Engineering
Outline
1 OpenFlow-Based Server Load Balancing Gone Wild
Introduction
OpenFlow Solution
Partitioning the Client Traffic
Transitioning With Connection Affinity
Evaluation
Future Work
Introduction
Clients access an online service through a single public IP address.
Data centers host online services on multiple replica servers offering the same service; each has a unique IP address and an integer weight.
Front-end load balancers: direct each client request to a particular replica server.
Problems: Dedicated load balancers are expensive and quickly become a single point of failure and congestion.
OpenFlow Basic Solution
Plug-n-Serve system uses OpenFlow to reactively assign client requests to replicas based on the current network and server load.
Plug-n-Serve intercepts the first packet of each client request and installs an individual forwarding rule that handles the remaining packets of the connection.
Scalability Limitations:
◮ Overhead and delay in involving the relatively slow controller in every client connection.
◮ Many rules installed at each switch (separate rule for each client).
◮ Heavy load on the controller.
OpenFlow Features
Microflow rule: matches on all fields.
Wildcard rule: can have “don’t care” bits in some fields.
Rules can be deleted after a fixed time interval (a hard timeout).
Rules can be deleted after a specified period of inactivity (a soft timeout).
The switch counts the number of bytes and packets matching eachrule. The controller can poll these counter values.
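These features can be pictured with a toy rule model; the class and field names below are our illustration, not OpenFlow's wire format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlowRule:
    """Toy model of an OpenFlow rule; names are illustrative only."""
    match: dict                          # field -> value; omitted fields are "don't care"
    priority: int = 0
    hard_timeout: Optional[int] = None   # seconds until unconditional deletion
    soft_timeout: Optional[int] = None   # seconds of inactivity before deletion
    packet_count: int = 0                # per-rule counters the controller can poll
    byte_count: int = 0

    def matches(self, packet: dict) -> bool:
        # A microflow rule lists every field; a wildcard rule lists only some.
        return all(packet.get(k) == v for k, v in self.match.items())

# Microflow rule: pins down all fields of one connection.
micro = FlowRule(match={"src_ip": "1.2.3.4", "dst_ip": "9.9.9.9",
                        "src_port": 4321, "dst_port": 80},
                 priority=10, soft_timeout=60)
# Wildcard rule: only the destination port matters.
wild = FlowRule(match={"dst_port": 80}, priority=1)
```

A packet from the pinned-down connection matches both rules; any other web packet matches only the wildcard rule, so the switch's priority field decides which action wins.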
OpenFlow Alternative Approach
Use wildcard rules to direct incoming client requests based on the client IP addresses.
Switch performs an “action” of:
1 Rewriting the server IP address
2 Forwarding the packet to the output port associated with the chosen replica.
Rely on microflow rules only during transitions from one set of wildcard rules to another.
Soft timeouts allow these microflow rules to “self destruct” after a client connection completes.
Load-balancing Architecture
Constraints:
1 Generating an efficient set of rules for a target distribution of load.
2 Ensuring that packets in the same TCP connection reach the sameserver across changes in the rules.
Components:
1 Partitioning algorithm: Generates wildcard rules that balance loadover the replicas.
2 Transitioning algorithm: Moves from one set of wildcard rules toanother, without disrupting ongoing connections.
[1] Partitioning the Client Traffic
Must divide client traffic in proportion to the load-balancing weights.
Successive packets from the same TCP connection must be forwarded to the same replica ⇒ rules installed match on client IP addresses
Figure: Basic model from load balancer switch’s view
[1] Partitioning the Client Traffic
Binary tree is used to represent IP prefixes.
If ∑ αj is a power of 2 ⇒ the weights map directly onto the leaf nodes of the binary tree.
Each Rj is associated with αj leaf nodes, e.g. R2 is associated with four leaves.
If ∑ αj is not a power of 2 ⇒ find the closest power of 2 and renormalize the weights.
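The renormalization step can be sketched in code; the exact rounding policy for fractional leaves is our assumption:

```python
import math

def leaf_allocation(weights):
    """Map replica weights to leaf counts whose total is a power of two.
    Sketch of the renormalization step; the tie-breaking policy is ours."""
    total = sum(weights)
    target = 2 ** round(math.log2(total))          # closest power of two
    scaled = [w * target / total for w in weights]
    leaves = [math.floor(s) for s in scaled]
    # Hand leftover leaves to the largest fractional remainders.
    leftover = target - sum(leaves)
    by_frac = sorted(range(len(weights)),
                     key=lambda i: scaled[i] - leaves[i], reverse=True)
    for i in by_frac[:leftover]:
        leaves[i] += 1
    return leaves
```

With weights (3, 4, 1) the total is already 8, so each replica keeps its weight as a leaf count; with (3, 4, 2) the total of 9 is renormalized down to 8 leaves.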
Figure: Wildcard rule assigned to each leaf node
Minimizing the Number of Wildcard Rules
Creating a wildcard rule for each leaf node ⇒ large number of rules.
Aggregate siblings associated with the same server replica:
10* can represent 100* and 101*, associated with R2.
00* can represent 000* and 001*, associated with R1.
6 wildcard rules instead of 8.
An alternate assignment can lead to only 4 rules: (0*, 10*, 110*, and 111*).
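The sibling-aggregation idea can be sketched as repeated merging of adjacent same-replica prefixes; this is a simplified illustration, not the paper's exact algorithm:

```python
def wildcard_rules(leaf_owners):
    """Collapse sibling leaves owned by the same replica into shorter
    prefixes. leaf_owners[i] is the replica owning leaf i of a complete
    binary tree, in address order. Simplified sketch."""
    depth = (len(leaf_owners) - 1).bit_length()
    rules = [(format(i, f"0{depth}b") + "*", owner)
             for i, owner in enumerate(leaf_owners)]
    merged = True
    while merged:
        merged = False
        out, i = [], 0
        while i < len(rules):
            if (i + 1 < len(rules)
                    and rules[i][1] == rules[i + 1][1]            # same replica
                    and len(rules[i][0]) == len(rules[i + 1][0])
                    and rules[i][0][:-2] == rules[i + 1][0][:-2]  # same parent
                    and rules[i][0][-2:] == "0*"):                # left sibling
                out.append((rules[i][0][:-2] + "*", rules[i][1]))
                merged, i = True, i + 2
            else:
                out.append(rules[i])
                i += 1
        rules = out
    return rules
```

With the leaves assigned contiguously (four for R2, then three for R1, one for R3), this collapses eight leaf rules into the four rules 0*, 10*, 110*, and 111*.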
Minimizing Change During Re-Partitioning
Weights αj may change over time: maintenance, energy saving, congestion.
Possible solution: regenerate wildcard rules from scratch.
Problems: Changes the replica selection for a large number of client IP addresses and increases the overhead of transitioning to new rules.
Minimizing Change During Re-Partitioning
Better Solution:
If the number of leaf nodes of a replica is unchanged ⇒ the rules of this replica may not need to change.
e.g. If α3 changes to 0 and α1 changes to 4: the rule of R2 remains unchanged, and R1 will only have one rule, 1*.
Create a new binary tree for the updated αj.
Pre-allocate leaf nodes to re-usable wildcard rules.
Re-usable wildcard rules: the ith highest bit is set to 1 in both the new and old αj, even if the old and new αj differ.
Allocate leaf nodes for larger groups rather than using existing rules of smaller pre-allocated nodes.
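The bit-matching reuse test can be illustrated as follows; this is only a sketch of the pre-allocation idea, and where each group sits in the tree is elided:

```python
def reusable_groups(old_weight, new_weight):
    """Leaf groups of size 2**b whose wildcard rule can survive a
    re-partition: bit b must be set in both the old and the new weight.
    Sketch only; the placement of groups in the tree is elided."""
    common = old_weight & new_weight
    return [1 << b for b in range(common.bit_length()) if (common >> b) & 1]
```

For example, an old weight of 6 (binary 110) and a new weight of 4 (binary 100) share the set bit for the 4-leaf group, so that group's rule can be kept, while the 2-leaf group must be reallocated.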
[2] Transitioning With Connection Affinity
Existing connections should complete at the original replica.
New Connection: The TCP SYN flag is set in the first packet of a new connection.
Approaches:
◮ Faster Transition: Direct some packets to the controller
◮ Slower Transition: Switch handles all packets
Transitioning Quickly With Microflow Rules
Rule directing all 0* traffic to the controller for inspection.
A dedicated high-priority microflow rule with a 60-second soft timeout for each connection.
Rule directs to the new replica R2 (for a SYN).
Rule directs to the old replica R1 (for a non-SYN).
Controller modifies the 0* rule to direct all future traffic to the new replica R2.
Transitioning With No Packets to Controller
Controller divides the address space for 0* into several smaller pieces, each represented by a high-priority wildcard rule (e.g., 000*, 001*, 010*, and 011*) directing traffic to the old replica R1.
A 60-second soft timeout is added to the higher-priority rules so they are deleted after no activity ⇒ traffic can then safely shift to R2.
Controller installs a single lower-priority rule directing 0* to the new replica R2.
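The rule set for this no-controller transition can be sketched as follows; the prefix notation and field names are our illustration:

```python
def transition_rules(prefix, old_replica, new_replica,
                     split_bits=2, timeout=60):
    """Carve `prefix` into 2**split_bits finer high-priority rules that
    keep existing connections on the old replica until they go idle,
    plus one low-priority catch-all sending fresh traffic to the new
    replica. Illustrative sketch; field names are ours."""
    base = prefix.rstrip("*")
    fine = [{"prefix": base + format(i, f"0{split_bits}b") + "*",
             "priority": 2, "replica": old_replica, "soft_timeout": timeout}
            for i in range(2 ** split_bits)]
    catch_all = {"prefix": prefix, "priority": 1,
                 "replica": new_replica, "soft_timeout": None}
    return fine + [catch_all]
```

`transition_rules("0*", "R1", "R2")` reproduces the five rules above: 000*, 001*, 010*, and 011* to R1 with 60-second soft timeouts, plus one lower-priority 0* rule to R2.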
Evaluation
α1 = 3, α2 = 4, α3 = 1
At time 75 sec: α2 = 0
Future Work: Non-Uniform Client Traffic
The target distribution of load is 50%, 25%, and 25% for R1, R2, and R3.
The actual division of load is (overwhelming) 75% for R1 and (underwhelming) 12.5% each for R2 and R3.
Solution:
Use OpenFlow counters for rules.
Identify severely overloaded and underloaded replicas.
Identify the set of rules to shift.
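The counter-driven correction might look like this toy sketch; the data layout and the heaviest-rule-first policy are our assumptions:

```python
def rules_to_shift(rule_load, target_share, total_load):
    """Pick wildcard rules to move away from overloaded replicas, using
    per-rule byte counters. rule_load maps prefix -> (replica, bytes);
    target_share maps replica -> desired load fraction. Toy sketch."""
    by_replica = {}
    for prefix, (replica, load) in rule_load.items():
        by_replica.setdefault(replica, []).append((prefix, load))
    shifts = []
    for replica, rules in by_replica.items():
        excess = sum(l for _, l in rules) - target_share[replica] * total_load
        # Move the heaviest rules first until the replica is back at target.
        for prefix, load in sorted(rules, key=lambda r: -r[1]):
            if excess <= 0:
                break
            shifts.append(prefix)
            excess -= load
    return shifts
```

With the numbers above (target 50/25/25 but actual 75/12.5/12.5), R1's single heavy 0* rule is flagged for splitting or shifting.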
Future Work: Network of Multiple Switches
SW1: forward packets with src IP in 1* to SW3, modify dst IP to R3.
SW1: forward packets with src IP in 00* to SW2, modify dst IP to R2.
SW1: forward packets with src IP in 01* to SW2, modify dst IP to R3.
SW2,SW3: forward packets to appropriate server.
Advantages
Computes concise wildcard rules that achieve a target distribution of the traffic.
Proactively installs wildcard rules in the switches to direct requests for large groups of clients without involving the controller.
Automatically adjusts to changes in load-balancing policies without disrupting existing connections.
Avoids the cost and complexity of separate load-balancer devices.
Allows flexibility of network topology.
Scales naturally as the number of switches and replicas grows, while directing client requests at line rate.
SDN and Traffic Engineering: SWAN
Outline
1 Achieving high utilization with software-driven WAN
Introduction
Services rely on low-latency inter-DC communication; hence resources are over-provisioned
Unable to fully leverage the investment:
◮ lack of co-ordination among services
◮ network under-subscribed on average
◮ poor efficiency of MPLS TE
Solution?
Introduction
Software-Driven WAN (SWAN) proposed by Microsoft
Enables inter-DC WAN to carry significantly more traffic.
Achieves high efficiency and utilization.
Enables updating the network’s data plane even under high load
Fully uses network capacity with only a small number of forwarding rules
Background & Motivation
Types of services:
Interactive Services
◮ critical path of end-user experience - eg. a DC contacts another DC to serve a user’s request
◮ highly sensitive to loss and delay
Elastic Services
◮ regular, timely delivery - eg. data replication
◮ sensitivity to delay varies
Background Services
◮ maintenance and provisioning activities - eg. copying all data of a service to another DC for long-term storage
◮ bandwidth hungry, requires more resources
◮ not sensitive to delay or latency
Background & Motivation - Issues with MPLS TE
Poor utilization
Daily traffic pattern on a busy link
Breakdown based on traffic type
Reduction in peak usage if background traffic is dynamically adapted
Background & Motivation - Issues with MPLS TE
Poor efficiency
Flows arrive in the order Fa followed by Fb and finally Fc
MPLS TE greedily assigns paths as shown in Fig. (a), while there exists a more efficient solution as shown in Fig. (b)
Background & Motivation - Issues with MPLS TE
Poor sharing
Link capacity = 1, each service (Si ⇒ Di) has unit demand
With link fairness, (S2 ⇒ D2) gets twice the throughput of the other services
SWAN Overview
SWAN’s sharing policies:
Small number of priority classes: Interactive ⇒ Elastic ⇒ Background (lowest priority)
◮ bandwidth allocated in strict precedence
◮ prefer shorter paths for higher priority classes
Except for interactive services, all others inform the SWAN controller about details of their demand. Interactive traffic is sent using the traditional approach.
Controller: up-to-date, global view of topology & demands; computes resource allocation for services
Per SDN paradigm, controller directly updates forwarding entries inswitches
SWAN Overview
Need for a scalable algorithm for global allocation
Computationally intensive (LP)
SWAN uses a practical approach
approximately fair with provable bounds and close to optimal
SWAN Overview
Atomic reconfiguration of a distributed switch
Each flow unit = 1, Link capacity = 1.5 units
SWAN computes multi-step congestion-free transition plan
SWAN Overview
Key concept
For each link, SWAN leaves a scratch capacity fraction s ∈ (0, 50%].
This scratch capacity guarantees that a congestion-free transition plan exists with a maximum of ⌈1/s⌉ − 1 steps.
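The bound can be checked numerically; this is a direct transcription of the ⌈1/s⌉ − 1 formula:

```python
import math

def max_transition_steps(s):
    """Worst-case number of steps in a congestion-free transition when
    every link keeps a scratch fraction s of its capacity free,
    0 < s <= 0.5, per the ceil(1/s) - 1 bound."""
    return math.ceil(1 / s) - 1
```

With s = 50% the transition completes in a single step; with s = 10% it may take up to 9 steps, a direct trade-off between reserved capacity and update speed.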
SWAN Overview
Switch hardware supports limited number of rules.
SWAN dynamically identifies and installs tunnels using LP.
What about network re-configuration? Will it disrupt traffic? SWAN sets aside scratch space (eg. 10%) on the switch to accommodate the new set of rules.
SWAN Design
Figure: Architecture of SWAN
Service brokers & hosts - hosts estimate the service’s demand (every Th time); the broker apportions the demand based on current limits; the broker also aggregates demand and updates the controller every Ts time.
Network agents - report topology changes to the controller, get traffic info. from the controller (every Ta time); reliably update switches.
Controller - uses info. on service demands and network topology (every Tc time), computes service allocations, decides forwarding plane config. updates, and instructs service brokers and network agents accordingly.
SWAN Design
Forwarding plane configuration
◮ uses label-based forwarding (similar to VLAN tagging)
◮ label assigned by source; transit switches use the label and a table to route
Computing service allocations
◮ approximate max-min fairness among same-priority classes
Updating forwarding state
◮ update traffic distribution across tunnels, using scratch capacity and an LP-based algorithm
◮ updating tunnels
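Label-based forwarding can be pictured with hypothetical tables; the switch names, label value, and port names below are invented for illustration:

```python
# The ingress switch stamps a tunnel label on each packet; transit
# switches then forward on (label -> port) lookups without inspecting IPs.
FORWARDING = {
    "DC_A": {42: "port-to-DC_B"},   # ingress for hypothetical tunnel label 42
    "DC_B": {42: "port-to-DC_C"},   # transit
    "DC_C": {42: "deliver-local"},  # egress
}

def next_hop(switch, label):
    """Look up the output port for a labeled packet at a switch."""
    return FORWARDING[switch][label]
```

Because only the small label table changes when tunnels move, re-routing a tunnel does not require rewriting per-destination IP rules at every transit switch.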
SWAN Design - Handling Failures
Network agents report link/switch failures to the controller.
Controller re-computes the allocation and updates network agents and service brokers, etc.
Network agents, service brokers, and the controller have backupinstances.
Conclusion
SWAN enables highly efficient and flexible inter-DC WAN
Scratch capacity on the links and scratch space on the switches enable updates without congestion.
Test-bed and data-driven simulations show SWAN can carry 60% more traffic.