data center networking with multipath tcp costin raiciu university college london &...

28
Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh, UCL Sebastien Barre, Universite Catholique Louvain Damon Wischik. UCL Mark Handley, UCL UCL

Upload: kimberly-alvarez

Post on 27-Mar-2015

222 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Data Center Networking with Multipath TCP

Costin RaiciuUniversity College London & Universitatea Politehnica Bucuresti

Christopher Pluntke, UCLAdam Greenhalgh, UCLSebastien Barre, Universite Catholique LouvainDamon Wischik. UCLMark Handley, UCL

UCL

Page 2: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Topology

Data Center Networking Today

Routing

Resource Allocation

FatTree, VL2, BCube, multi-rooted tree

Random load balancing

TCP

Path Selection

OSPF, VLANs, TRILL

Page 3: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Topology

Data Center Networking Tomorrow

Routing

Resource Allocation

FatTree, VL2, BCube, multi-rooted tree

Random load balancing

TCP

MultipathTCPPath

Selection

OSPF, VLANs, TRILL

Page 4: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Data Centers are Important

Cloud computing Economies of scale:

networks of tens of thousands of hosts

Cool apps Web search, GFS, BigTable,

DryadLINQ, MapReduce Dense traffic patterns

Page 5: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Flexibility is Important in Data Centers

Apps distributed across thousands of machines. Flexibility: want any machine to be able to play

any role.

But: Traditional data center topologies are tree based. Don’t cope well with non-local traffic patterns.

Many recent proposals for better topologies.

Page 6: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Traditional Data Center Topology

…Racks of servers

Top of Rack Switches

Aggregation Switches

Core Switch

1Gbps

10Gbps

10Gbps

Page 7: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Fat Tree Topology [Fares et al., 2008; Clos, 1953]

Aggregation Switches

K Pods with K Switches

each

K=4

Racks of servers

1Gbps

1Gbps

Page 8: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

VL2 Topology [Greenberg et al, 2009, Clos topology]

10Gbps

20 hosts

10Gbps …

Page 9: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

BCube Topology [Guo et al, 2009]

BCube (4,1)

Page 10: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

How Do We Use this Capacity?

Need to distribute flows across paths.

Basic solution: Random Load Balancing. Use Equal-Cost Multipath (ECMP) routing.

• Hash to a path at random.

Use many differently rooted VLANs.• End-host hashes to a VLAN; determines path.

Page 11: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Collisions

Racks of servers

1Gbps

1Gbps

Page 12: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Can MPTCP self-optimize data-center traffic?

With Multipath TCP we can explore many paths: Instead of using one random path, use

many random paths Don’t worry about collisions. Just don’t send (much) traffic on colliding

paths

Page 13: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Simulation Setup

~8000 hosts Long-lived flows Permutation traffic matrix

Each hosts sends and receives from a single other randomly chosen host

Smallest amount of traffic that can fill the network

Page 14: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Multipath TCP in the Fat Tree Topology

Throughput Allocation

Page 15: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Performance depends on topology

VL2 BCube

Page 16: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Overloaded Fat Tree: better fairness with Multipath TCP

Page 17: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Centralized Scheduling

With RLB, it’s really hard to utilize FatTree.

Hedera [Fares et al.,2010] uses a centralized scheduler and flow switching. Start by using RLB Measure all flow throughput periodically. Any flow using more than 10% of its interface rate

is explicitly scheduled onto an unloaded link.

How does centralized scheduling compare with MPTCP?

Page 18: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

MPTCP vs Centralized Dynamic Scheduling

Infinite

Centralized Scheduling MPTCP

Scheduling Interval

Page 19: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Can’t we just use many TCP connections?

Loss rate of MP-TCP (“linked”) vs multiple uncoupled TCP flows

Retransmit timeouts with MPTCP (“linked”) vs uncoupled TCP flows

Page 20: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

MPTCP Linked Increases in DCs

Better fairness and less aggressive than uncoupled TCP

Improves throughput in dense traffic in BCube (25%)

Page 21: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

The bigger picture

Topology

Routing

Resource Allocation

FatTree, VL2, Bcube, multi-rooted tree

MultipathTCP

Path Selection

OSPF, VLANs, etc.

?

Page 22: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Multipath TCP can utilize topologies TCP can’t

1Gb/s

1Gb/s

Requirement: a subset of hosts should be able to communicate at 10Gb/s

10Gb/s

Page 23: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Multipath TCP can utilize topologies TCP can’t [2]

Problem ToR switch failures wipe out tens of

hosts Repair time is on the order of days

Solution: use two ToRs/rack, multi-home servers

Single path TCP Single flows still get same max

throughput Which interface do I use?

With Multipath TCP Flows double their maximum

throughput Path selection automatic

Page 24: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Summary

Data center networking offers many paths between end-hosts. Yet: Random Load Balancing does a poor job of utilizing

them Centralized scheduling is laggy and has inherently

limited knowledge Multipath TCP naturally optimizes data center networks:

Improves throughput Improves fairness More robust than centralized scheduling

Question: what topologies does multipath TCP enable?

Page 25: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Backup Slides

Page 26: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Centralized Scheduling: Setting the Threshold

Throughput

1Gbps

100Mbps

Hope

App Limited

17% worse than

multipath TCP

17% worse than

multipath TCP

Page 27: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Centralized Scheduling: Setting the Threshold

Throughput

1Gbps

100Mbps HopeApp Limited

21% worse than

multipath TCP

21% worse than

multipath TCP

Page 28: Data Center Networking with Multipath TCP Costin Raiciu University College London & Universitatea Politehnica Bucuresti Christopher Pluntke, UCL Adam Greenhalgh,

Centralized Scheduling: Setting the Threshold

Throughput

1Gbps

100Mbps17%

21%

500Mbps

45%

51%