Practical TDMA for Datacenter Ethernet Bhanu C. Vattikonda, George Porter, Amin Vahdat, Alex C. Snoeren


Page 1: Practical TDMA for Datacenter Ethernet

Practical TDMA for Datacenter Ethernet

Bhanu C. Vattikonda, George Porter, Amin Vahdat, Alex C. Snoeren

Page 2: Practical TDMA for Datacenter Ethernet

A variety of applications are hosted in datacenters, with gather/scatter and all-to-all communication patterns.

Their performance depends on throughput-sensitive traffic in the shuffle phase, and they also generate latency-sensitive traffic.

Page 3: Practical TDMA for Datacenter Ethernet

The network is treated as a black box.

Applications like Hadoop MapReduce perform inefficiently.

Applications like Memcached experience high latency.

Why does the lack of coordination hurt performance?

Page 4: Practical TDMA for Datacenter Ethernet

Example datacenter scenario

[Diagram: two bulk-transfer senders and one latency-sensitive sender share a path to a single traffic receiver]

• Bulk transfer
• Latency sensitive

Page 5: Practical TDMA for Datacenter Ethernet

Drops and queuing lead to poor performance

[Diagram: the same scenario, with two bulk-transfer flows and one latency-sensitive flow to a single traffic receiver]

• Bulk-transfer traffic experiences packet drops
• Latency-sensitive traffic gets queued in the buffers

Page 6: Practical TDMA for Datacenter Ethernet

Current solutions do not take a holistic approach

Facebook uses a custom UDP-based transport protocol.

Alternative transport protocols like DCTCP address TCP's shortcomings.

Infiniband and Myrinet offer boutique hardware solutions to these problems, but are expensive.

Since the demand can be anticipated, can we coordinate hosts?

Page 7: Practical TDMA for Datacenter Ethernet

Taking turns to transmit packets

[Diagram: the two bulk-transfer senders and the latency-sensitive sender take turns transmitting to the receiver]

TIME DIVISION MULTIPLE ACCESS

Page 8: Practical TDMA for Datacenter Ethernet

TDMA: An old technique

Page 9: Practical TDMA for Datacenter Ethernet

Enforcing TDMA is difficult

It is not practical to task hosts with keeping track of time and controlling transmissions.

End-host clocks quickly go out of synchronization.

Page 10: Practical TDMA for Datacenter Ethernet

Existing TDMA solutions need special support

Since end-host clocks cannot be synchronized, special support is needed from the network.

FTT-Ethernet, RTL-TEP, and TT-Ethernet require modified switching hardware.

Even with special support, the hosts need to run real-time operating systems to enforce TDMA.

Can we do TDMA with commodity Ethernet?

Page 11: Practical TDMA for Datacenter Ethernet

TDMA using Pause Frames

Flow-control packets (pause frames) can be used to control Ethernet transmissions.

Pause frames are processed in hardware, so flow-control packets are handled very efficiently.

Measurement setup: blast UDP packets from a sender, send 802.3x pause frames, and measure the time the sender takes to react to them.

Page 12: Practical TDMA for Datacenter Ethernet

TDMA using Pause Frames

Pause frames are processed in hardware, so flow-control packets are handled very efficiently.

Measurement done using 802.3x pause frames:

• Reaction time to pause frames is 2–6 μs
• Low variance

Page 13: Practical TDMA for Datacenter Ethernet

TDMA using commodity hardware

• Collect demand information from the end hosts
• Compute the schedule for communication
• Control end-host transmissions

TDMA is imposed over Ethernet using a centralized fabric manager.

Page 14: Practical TDMA for Datacenter Ethernet

TDMA example

Demands: S → D1: 1MB, S → D2: 1MB

[Diagram: sender S, destinations D1 and D2, and the fabric manager]

The fabric manager collects demand information from the end hosts, computes the schedule for communication, and controls end-host transmissions.

Schedule:
• round 1: S → D1
• round 2: S → D2
• round 3: S → D1
• round 4: S → D2
• …
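The schedule above can be sketched as a simple fabric-manager routine. This is a minimal illustration, not the paper's implementation; it assumes a 10 Gbps link and the 300 μs slot length used later in the talk.

```python
# Minimal sketch of a fabric manager computing a round-robin TDMA
# schedule from per-destination demands (illustrative helper, not the
# paper's actual implementation).

SLOT_US = 300                    # TDMA slot length from the talk
LINK_GBPS = 10                   # 10G Ethernet
# Bytes a sender can move in one slot: 10e9 bits/s * 300e-6 s / 8
BYTES_PER_SLOT = int(LINK_GBPS * 1e9 * SLOT_US * 1e-6 / 8)  # 375,000

def round_robin_schedule(demands):
    """demands: dict dest -> bytes outstanding. Returns a list of
    destinations, one per TDMA round, cycling round-robin until all
    demand is drained."""
    remaining = dict(demands)
    schedule = []
    while any(v > 0 for v in remaining.values()):
        for dest in demands:                 # fixed round-robin order
            if remaining[dest] > 0:
                schedule.append(dest)
                remaining[dest] -= BYTES_PER_SLOT
    return schedule

# The slide's example: S -> D1: 1MB, S -> D2: 1MB
sched = round_robin_schedule({"D1": 1_000_000, "D2": 1_000_000})
print(sched)  # alternates D1, D2 until both demands are drained
```

With 375 KB per slot, each 1 MB demand needs three slots, so the schedule alternates D1, D2 for six rounds.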

Page 15: Practical TDMA for Datacenter Ethernet

More than one host

[Diagram: the fabric manager sends control packets to multiple hosts, each proceeding through rounds]

• Control packets should be processed with low variance
• Control packets should arrive at the end hosts synchronously

Page 16: Practical TDMA for Datacenter Ethernet

Synchronized arrival of control packets

We cannot directly measure the synchronous arrival.

[Plot: difference in arrival of a pair of control packets at 24 hosts]

Page 17: Practical TDMA for Datacenter Ethernet

Synchronized arrival of control packets

[Plot: difference in arrival of a pair of control packets at 24 hosts]

Variation of ~15 μs for different sending rates at the end hosts.

Page 18: Practical TDMA for Datacenter Ethernet

Ideal scenario: control packets arrive synchronously

[Timeline: Host A and Host B each proceed through Round 1, Round 2, Round 3 with their round boundaries aligned]

Page 19: Practical TDMA for Datacenter Ethernet

Experiments show that packets do not arrive synchronously

[Timeline: Host A and Host B proceed through Round 1, Round 2, Round 3, but their round boundaries are out of sync by <15 μs]

Page 20: Practical TDMA for Datacenter Ethernet

Guard times to handle lack of synchronization

[Timeline: Host A and Host B stop transmitting during a guard time at the end of each round before starting the next]

Guard times (15 μs) handle out-of-sync control packets.
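The effect of the guard time can be checked with a small simulation. Assuming, per the measurements above, that control packets arrive at different hosts up to 15 μs apart, a guard time at least as long as that skew keeps a late host's round from overlapping an on-time host's next round. The model below is illustrative; the constants match the talk's numbers.

```python
# Illustrative check that a guard time >= the worst-case control-packet
# skew prevents transmission rounds from overlapping between two hosts.

SLOT_US = 300    # transmission slot
GUARD_US = 15    # guard time (equals the worst-case skew)
SKEW_US = 15     # max difference in control-packet arrival at hosts

def tx_window(skew_us, round_idx):
    """Interval during which a host transmits in a given round.
    Control packets delimit a SLOT_US transmission followed by a
    GUARD_US pause; a host whose control packets arrive skew_us late
    shifts its whole window by that skew."""
    base = round_idx * (SLOT_US + GUARD_US)
    return (base + skew_us, base + SLOT_US + skew_us)

for r in range(10):
    # Host B (worst-case late) must finish round r before Host A
    # (on time) begins round r + 1.
    _, b_end = tx_window(SKEW_US, r)
    a_next_start, _ = tx_window(0, r + 1)
    assert b_end <= a_next_start
print("guard time absorbs the skew")
```

The check holds exactly when GUARD_US >= SKEW_US, which is why the guard time is set to the measured 15 μs worst case.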

Page 21: Practical TDMA for Datacenter Ethernet

TDMA for Datacenter Ethernet

Control end-host transmissions:

• Use flow-control packets to achieve low variance
• Guard times adjust for variance in control-packet arrival

Page 22: Practical TDMA for Datacenter Ethernet

Encoding scheduling information

We use IEEE 802.1Qbb priority flow control frames to encode scheduling information.

Using iptables rules, traffic for different destinations can be classified into different Ethernet classes.

802.1Qbb priority flow control frames can then be used to selectively start transmission of packets to a destination.
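As a concrete illustration, an 802.1Qbb PFC frame is an Ethernet MAC control frame (EtherType 0x8808, opcode 0x0101) carrying a priority-enable vector and eight per-class pause quanta. The sketch below builds such a frame with Python's struct module; the field layout follows the standard as I understand it and is not taken from the paper, and actually transmitting the frame on a raw socket is omitted.

```python
import struct

# Sketch: build an IEEE 802.1Qbb priority flow control (PFC) frame.
# Illustrative only; field layout per the 802.1Qbb MAC control frame
# format, not the paper's implementation.

PFC_DST = bytes.fromhex("0180c2000001")   # MAC control multicast address
ETHERTYPE_MAC_CONTROL = 0x8808
PFC_OPCODE = 0x0101

def build_pfc_frame(src_mac: bytes, pause_quanta):
    """pause_quanta: 8 ints, one per priority class. The enable bit is
    set for every class so the frame acts on all of them: a quanta of 0
    un-pauses that class, nonzero pauses it for that many quanta
    (1 quantum = 512 bit times)."""
    assert len(pause_quanta) == 8
    enable_vector = 0xFF                   # act on all eight classes
    frame = PFC_DST + src_mac
    frame += struct.pack("!HHH", ETHERTYPE_MAC_CONTROL, PFC_OPCODE,
                         enable_vector)
    frame += struct.pack("!8H", *pause_quanta)
    frame += bytes(60 - len(frame))        # pad to minimum frame size
    return frame

# Pause every class except priority 3, i.e. allow only the one
# destination class scheduled in this slot to transmit.
quanta = [0xFFFF] * 8
quanta[3] = 0
frame = build_pfc_frame(bytes.fromhex("020000000001"), quanta)
assert len(frame) == 60 and frame[12:14] == b"\x88\x08"
```

Mapping destinations to priority classes (via the iptables classification above) is what lets a single frame start traffic to exactly one destination.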

Page 23: Practical TDMA for Datacenter Ethernet

Methodology to enforce TDMA slots

1. Pause all traffic
2. Un-pause traffic to a particular destination
3. Pause all traffic to begin the guard time
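The three steps above can be sketched as a per-slot loop. The `send_pfc` helper here is hypothetical: a real fabric manager would emit an 802.1Qbb frame (as in the previous slide), whereas this stand-in just records each directive so the sequence can be inspected.

```python
import time

# Sketch of the slot-enforcement loop from the slide. send_pfc is a
# hypothetical stand-in for sending a real priority flow control frame.

SLOT_S = 300e-6    # 300 us TDMA slot
GUARD_S = 15e-6    # 15 us guard time

actions = []

def send_pfc(host, action, dest=None):
    """Record a pause/un-pause directive for `host` (stand-in for
    emitting an actual PFC frame)."""
    actions.append((host, action, dest))

def run_slots(host, schedule):
    """schedule: list of destinations, one per slot."""
    for dest in schedule:
        send_pfc(host, "pause-all")        # 1. pause all traffic
        send_pfc(host, "unpause", dest)    # 2. un-pause one destination
        time.sleep(SLOT_S)                 #    let the slot elapse
        send_pfc(host, "pause-all")        # 3. begin the guard time
        time.sleep(GUARD_S)

run_slots("S", ["D1", "D2"])
# actions now holds, per slot: pause-all, unpause <dest>, pause-all
```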

Page 24: Practical TDMA for Datacenter Ethernet

Evaluation

• MapReduce shuffle phase: all-to-all transfer
• Memcached-like workloads: latency between nodes in a mixed environment in the presence of background flows
• Hybrid electrical and optical switch architectures: performance in dynamic network topologies

Page 25: Practical TDMA for Datacenter Ethernet

Experimental setup

• 24 servers: HP DL380, dual Myricom 10G NICs with kernel bypass to access packets
• 1 Cisco Nexus 5000 series 10G 96-port switch, 1 Cisco Nexus 5000 series 10G 52-port switch
• 300 μs TDMA slot and 15 μs guard time (effective 5% overhead)
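The 5% figure follows directly from the slot and guard lengths: each 300 μs of useful transmission is followed by 15 μs in which the link is idle.

```python
SLOT_US, GUARD_US = 300, 15

# Fraction of each slot cycle lost to the guard time
overhead = GUARD_US / (SLOT_US + GUARD_US)
print(f"{overhead:.1%}")   # 4.8%, i.e. roughly the stated 5% overhead
```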

Page 26: Practical TDMA for Datacenter Ethernet

All to all transfer in multi-hop topology

• 10GB all-to-all transfer

[Diagram: three groups of 8 hosts connected in a multi-hop topology]

Page 27: Practical TDMA for Datacenter Ethernet

All to all transfer in multi-hop topology

• 10GB all-to-all transfer
• We use a simple round-robin scheduler at each level
• 5% inefficiency owing to guard time

[Diagram: three groups of 8 hosts; plot comparing TCP all-to-all and TDMA all-to-all against the ideal transfer time of 1024 s]
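One standard way to realize a round-robin all-to-all schedule is the rotation schedule sketched below: in round r, host i sends to host (i + r) mod N, so each round is a clean one-to-one matching and over N-1 rounds every host sends to every other host exactly once. This is an illustrative stand-in; the slides do not give the details of the paper's per-level scheduler.

```python
def rotation_schedule(n):
    """Round-robin all-to-all: in round r, sender i transmits to
    receiver (i + r) % n. Returns a list of rounds, each a dict
    mapping sender -> receiver."""
    return [{i: (i + r) % n for i in range(n)} for r in range(1, n)]

rounds = rotation_schedule(8)   # 8 hosts per group, as in the setup

# Each round is a permutation (no receiver gets two senders at once),
# and over all rounds each host sends to every other host exactly once.
for rnd in rounds:
    assert len(set(rnd.values())) == len(rnd)
for i in range(8):
    assert {rnd[i] for rnd in rounds} == set(range(8)) - {i}
```

Because every round keeps each receiver busy with exactly one sender, the only capacity lost is the guard time, matching the 5% inefficiency noted above.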

Page 28: Practical TDMA for Datacenter Ethernet

Latency in the presence of background flows

[Diagram: two bulk-transfer flows and one latency-sensitive flow to a single receiver]

• Start both bulk transfers
• Measure latency between nodes using UDP

Page 29: Practical TDMA for Datacenter Ethernet

Latency in the presence of background flows

• Latency between the nodes in the presence of TCP flows is high and variable
• The TDMA system achieves lower latency

[Plot: latency under TCP, TDMA, and TDMA with kernel bypass]

Page 30: Practical TDMA for Datacenter Ethernet

Adapting to dynamic network configurations

[Diagram: hosts connected through both an optical circuit switch and an electrical packet switch]

Page 31: Practical TDMA for Datacenter Ethernet

Adapting to dynamic network configurations

• Link capacity between the hosts is varied between 10Gbps and 1Gbps every 10ms

[Plot: ideal performance of the sender-receiver pair under the varying link capacity]

Page 32: Practical TDMA for Datacenter Ethernet

Adapting to dynamic network configurations

• Link capacity between the hosts is varied between 10Gbps and 1Gbps every 10ms

[Plot: TCP performance of the same sender-receiver pair]

Page 33: Practical TDMA for Datacenter Ethernet

Adapting to dynamic network configurations

TDMA is better suited since it prevents packet losses.

[Plot: performance under the varying link capacity, compared with TCP]

Page 34: Practical TDMA for Datacenter Ethernet

Conclusion

TDMA can be achieved using commodity hardware, leveraging existing Ethernet standards.

TDMA can lead to performance gains in current networks:
• 15% shorter finish times for all-to-all transfers
• 3x lower latency

TDMA is well positioned for emerging network architectures which use dynamic topologies:
• 2.5x throughput improvement in dynamic network settings

Page 35: Practical TDMA for Datacenter Ethernet


Thank You