Circuit Emulation for Bulk Transfers in Distributed Storage and Clouds

Post on 19-Jan-2015

DESCRIPTION

Assuming that the majority of in-cloud networking is Ethernet-based, at least at departure and entry points, it is widely recognized that TCP/UDP communications fail to achieve the necessary throughput during bulk transfers. While modern switches support the maximum achievable throughput via the cut-through mode of operation, the practical benefit of this mode diminishes when multiple communicating parties contend for the network. This research addresses the problem by implementing circuits-over-packets emulation. Circuits are simply optimal schedules for communication sessions in which each session gets exclusive access to the network. Transfers of chunks of Big Data, pieces of storage, VM images, etc. all fall under the category of bulk transfers.

TRANSCRIPT

Setting the Mood

• "It's time to get rid of TCP/UDP protocols in DCs"

• DCs/Clouds are closed worlds, brand new technologies are OK

• with bulk transfers (BigData, ...), the business value of a TCP/UDP alternative is high

• circuits are an alternative to packets

M.Zhanikeev -- maratishe@gmail.com -- Circuit Emulation for Bulk Transfers in Dist. Storage and Clouds -- http://bit.do/marat140903 2/32...

Ethernet is the Best

Ethernet ... is the cheapest and most available technology with e2e support

• Fibre Channel (FC), SATA, etc. require expensive hardware, have low compatibility, and offer no e2e support

• FCoE = Ethernet, same problems: expensive hardware, no e2e support

• network virtualization is the best fit for Ethernet

• disclaimer: one of the proposed models will work with optical networks as well

Ethernet is the Worst

Ethernet ... is the worst technology in terms of throughput

• CSMA/CD is the biggest throughput limitation

◦ not in modern switches, but still a major problem in wireless

• the contention problem cannot be easily resolved

• the same applies to OBS/OPS optical technologies

Ethernet Contention

Ethernet and Contention

• whatever you do, Ethernet L2 domains cannot avoid contention

[Figure: two switch configurations, labeled "Qualitatively Identical"]

Parallel vs Sequential (2 flows)

[Figure: transfer time in contention (s) on the horizontal axis versus transfer time by exclusive circuits (s) on the vertical axis; both axes span 20 to 40 s]

Ethernet Switches : Basic Facts

• cut-through versus store-and-forward

• cut-through is 10..15x better

• Cisco has advanced cut-through: +bytes versus routing decision tradeoff

• store-and-forward is subject to QoS classes

◦ L3 DSCP versus L2 CoS, AF, EF, BE, SBE models

Switches : Modeling

[Figure: switch model with blocks C (cut-through), Q (queue) and D (drop), annotated "Check, etc." and "QoS classes"]

Proposal

Proposal : Circuits

Circuits ... are emulations which allow for exclusive access to an L2 domain by individual parties

• circuits-over-packets emulation

• cut-through mode for each circuit is guaranteed

• highest possible throughput

• NOTE: will work with the cheapest switches

• NOTE2: applies to optical networks as well (L2=lightpaths)
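
A rough way to see why exclusive access can pay off, even before counting any contention losses: under idealized fair sharing, two equal transfers both finish late, while back-to-back exclusive sessions let the first one finish early. The sketch below assumes a lossless link and perfectly fair sharing, which real Ethernet contention does not deliver; the function names and the numbers are illustrative and do not come from the slides.

```python
def parallel_finish_times(sizes_gb, link_gbps):
    """Finish times (s) when all flows share the link equally (idealized)."""
    remaining = sorted(sizes_gb)
    now, done, finishes = 0.0, 0.0, []
    # Under equal sharing, the smallest remaining flow always finishes first.
    for i, size in enumerate(remaining):
        active = len(remaining) - i          # flows still sharing the link
        now += (size - done) * 8 * active / link_gbps
        done = size
        finishes.append(now)
    return finishes


def sequential_finish_times(sizes_gb, link_gbps):
    """Finish times (s) when flows get exclusive access, shortest first."""
    now, finishes = 0.0, []
    for size in sorted(sizes_gb):
        now += size * 8 / link_gbps          # full link rate for one flow
        finishes.append(now)
    return finishes


sizes = [2.5, 2.5]                            # two 2.5 GB transfers (assumed)
par = parallel_finish_times(sizes, 1.0)       # 1 Gbps link (assumed)
seq = sequential_finish_times(sizes, 1.0)
print(par, seq)
```

In this toy model the last transfer finishes at the same time in both cases, but the first one finishes in half the time with exclusive access, so the average completion time drops.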

Implementation : 2 cases

• left: book-then-send, right: separate control layer

[Figure, left: SWITCH, NOC, Storage Node A, Storage Node B; Step 1: Book session; Step 2: Transfer bulk]

[Figure, right: two SWITCHes, Storage Node A, Storage Node B; Booking segment; Bulk segment]

Impl.: Centralized Case

[Figure: SWITCH, NOC, Storage Node A, Storage Node B; Step 1: Book session; Step 2: Transfer bulk]

• same network for booking and circuits

• inefficient but still valid/practical

• legacy-compatible, partial implementation, etc.
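
The book-then-send sequence can be sketched as a tiny booking service. This is a minimal sketch under stated assumptions: the `Noc` class, its `book` method and the single shared path are hypothetical names and simplifications, not an interface described in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Noc:
    """Hypothetical booking service: grants one exclusive session at a time."""
    busy_until: float = 0.0
    log: list = field(default_factory=list)

    def book(self, src: str, dst: str, duration: float, now: float) -> float:
        """Step 1: reserve the earliest free slot and return its start time."""
        start = max(now, self.busy_until)
        self.busy_until = start + duration
        self.log.append((src, dst, start, self.busy_until))
        return start


noc = Noc()
start_a = noc.book("A", "B", duration=20.0, now=0.0)
start_b = noc.book("B", "A", duration=20.0, now=5.0)
# Step 2: each node starts its bulk transfer at the start time it was given.
print(start_a, start_b)
```

The second request arrives while the path is booked, so it is simply queued behind the first session instead of contending with it.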

Impl.: Distributed Case

[Figure: two SWITCHes, Storage Node A, Storage Node B; Booking segment; Bulk segment]

• book on one network, send on another

• legacy-incompatible

• contention-sensing possible → fully distributed models

• can also use sensing and contention control

Optimization

Optimization : Basics

• same for distributed and centralized models

◦ does not matter, optimization shows the overall utility of a heuristic

• practical optimization = formulation + heuristic

• given: demand matrix

• expected result: a routing table mapping demand to topology

Optim. : OSPF → tuple notation

• OSPF is traditional in such optimizations, but too rigid for many practical cases

◦ too complex for lightpaths in optical networks

◦ no good heuristics for complex topologies

• OSPF notation is not very convenient

1. capacity constraints

2. flow preservation

3. contention/congestion metrics

• alternative: tuples ... for example ⟨s, d, v, t⟩ defines a demand of traffic volume v at time t from source s to destination d

◦ this notation is much more flexible for several coming formulations

Optim. : Basic Tuple Notation

• nodes: source s, destination d and others a, b, c

• individual demand tuple Ti = ⟨s, d, v, t⟩

• lightpath λ for optical networks

• time t, can be start time, start and end of a period, etc.

• we do not care about utility so far, just the notation, but utility is obvious in most cases

• → means results in... or leads to...
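
The demand tuple maps naturally onto a small record type. The sketch below is illustrative only: the `Demand` class name, the node labels and the units (gigabytes, seconds) are assumptions, and the slides fix only the four fields s, d, v, t.

```python
from typing import NamedTuple


class Demand(NamedTuple):
    s: str    # source node
    d: str    # destination node
    v: float  # traffic volume (assumed unit: gigabytes)
    t: float  # time, here taken as the requested start (assumed unit: seconds)


demands = [
    Demand("s1", "d1", 2.5, 0.0),
    Demand("s2", "d1", 1.0, 10.0),
]

# A conventional demand matrix can be recovered by summing volume per (s, d).
matrix = {}
for dem in demands:
    matrix[(dem.s, dem.d)] = matrix.get((dem.s, dem.d), 0.0) + dem.v
print(matrix)
```

Keeping each demand as its own tuple preserves the time field, which a plain source-by-destination matrix would discard.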

tOSPF : Traditional OSPF

Ti = ⟨s, d, v, t⟩ → ⟨s, a, b, ..., d⟩

Externals: Using the demand matrix, creates a set of per-link weights, which define a unique route for each demand item.

Internals: Per-link capacity constraint, in/out flow conservation constraint; unstable for large topologies and demand matrices

• s source

• d destination

• a, b, c, ... intermediate nodes on e2e paths/routes
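
Once per-link weights exist, turning a demand item into a route ⟨s, a, b, ..., d⟩ is a shortest-path computation. The sketch below shows only that routing step, using Dijkstra's algorithm on a made-up three-hop topology; choosing the weights from the demand matrix (the hard part of the optimization) is not shown, and the `shortest_route` name and the weights are assumptions.

```python
import heapq


def shortest_route(weights, s, d):
    """Dijkstra over per-link weights; returns the node list s..d, or None."""
    best = {s: 0.0}
    prev = {}
    heap = [(0.0, s)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == d:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in weights.get(node, {}).items():
            new_cost = cost + w
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if d not in best:
        return None
    # Walk the predecessor chain back from d to s.
    route = [d]
    while route[-1] != s:
        route.append(prev[route[-1]])
    return route[::-1]


# Hypothetical directed topology with per-link weights.
weights = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"b": 1.0, "d": 5.0},
    "b": {"d": 1.0},
}
route = shortest_route(weights, "s", "d")
print(route)
```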

oOSPF : Optical OSPF w/out Switching

Ti = ⟨s, d, v, t⟩ → ⟨s, λ⟩

Externals: Using the demand matrix, maps each demand item onto an isolated lightpath

Internals: Simple but inefficient because the number of e2e lightpaths is small

• s source

• d destination

• λ a wavelength for a fixed e2e lightpath from s to destination d
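
The mapping ⟨s, d, v, t⟩ → ⟨s, λ⟩ amounts to a table lookup over the provisioned e2e lightpaths. The sketch below is a hedged illustration: the `assign_lightpaths` helper, the wavelength labels and the idea of returning unserved demands as "rejected" are assumptions made to show why a small pool of e2e lightpaths is limiting.

```python
def assign_lightpaths(demands, lightpaths):
    """Map each demand to a dedicated e2e wavelength, if one is provisioned.

    demands: list of (s, d, v, t) tuples.
    lightpaths: dict mapping (s, d) to a wavelength label.
    Returns (assigned, rejected); assigned maps a demand to (s, wavelength).
    """
    assigned, rejected = {}, []
    for dem in demands:
        s, d, _v, _t = dem
        lam = lightpaths.get((s, d))
        if lam is None:
            rejected.append(dem)       # no fixed lightpath between s and d
        else:
            assigned[dem] = (s, lam)
    return assigned, rejected


lightpaths = {("s1", "d1"): "lambda1"}  # only one e2e lightpath exists
demands = [("s1", "d1", 2.5, 0.0), ("s2", "d1", 1.0, 10.0)]
assigned, rejected = assign_lightpaths(demands, lightpaths)
print(assigned, rejected)
```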

oOSPFs : Optical OSPF with Switching

Ti = ⟨s, d, v, t⟩ → ⟨s, λs, λa, λb, ...⟩

Externals: Using the demand matrix, maps each demand item onto a route of wavelengths

Internals: Efficient, but suffers from the same problems as traditional OSPF

• s source

• d destination

• λx an exit wavelength at a given node x
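
With switching, the result lists one exit wavelength per node along the route. The sketch below picks those wavelengths with a simple first-fit rule; it assumes the route is already known and that every node can change wavelength freely, and both the `exit_wavelengths` helper and the first-fit choice are illustrative assumptions rather than the method in the slides.

```python
def exit_wavelengths(route, free):
    """Pick the lowest-numbered free wavelength on each hop of the route.

    route: node list, e.g. ["s", "a", "b", "d"].
    free: dict mapping a link (u, v) to the set of free wavelength indices.
    Returns [s, lambda_s, lambda_a, ...] or None if some hop has no free one.
    """
    result = [route[0]]
    for u, v in zip(route, route[1:]):
        options = free.get((u, v), set())
        if not options:
            return None
        result.append(min(options))    # exit wavelength at node u
    return result


free = {("s", "a"): {1, 2}, ("a", "b"): {2}, ("b", "d"): {1, 3}}
hops = exit_wavelengths(["s", "a", "b", "d"], free)
print(hops)
```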

Proposal : Sensing Formulation

Ti = ⟨s, d, v, t1, t2⟩ → ⟨s, λ, t⟩

Externals: Using a matrix of loosely scheduled demand, create a schedule of sequential sessions with exclusive access to paths

Internals: Same approach for Ethernet (one wavelength) and optical networks

• s source

• d destination

• t1 and t2 are the user-preferred range for the start of a session; a value t is picked between them
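
Picking t inside [t1, t2] so that sessions never overlap can be sketched with a greedy scheduler. This is one possible heuristic under stated assumptions (a single shared path, a fixed rate, demands served in order of closing start window); it is not claimed to be optimal or to be the heuristic used by the author, and `schedule_sessions` is a hypothetical name.

```python
def schedule_sessions(demands, rate):
    """Greedy schedule of sequential sessions with exclusive path access.

    demands: list of (s, d, v, t1, t2); a session may start within [t1, t2].
    rate: path capacity in volume units per second.
    Returns (scheduled, rejected); scheduled holds (s, d, t_start, t_end).
    """
    scheduled, rejected = [], []
    free_at = 0.0
    # Serve first the demands whose start window closes earliest.
    for s, d, v, t1, t2 in sorted(demands, key=lambda x: x[4]):
        t = max(t1, free_at)           # earliest start that avoids overlap
        if t > t2:
            rejected.append((s, d, v, t1, t2))
            continue
        free_at = t + v / rate
        scheduled.append((s, d, t, free_at))
    return scheduled, rejected


demands = [
    ("A", "B", 100.0, 0.0, 10.0),
    ("C", "B", 50.0, 0.0, 30.0),
    ("D", "B", 100.0, 5.0, 12.0),
]
scheduled, rejected = schedule_sessions(demands, rate=10.0)
print(scheduled, rejected)
```

Here all three sessions fit back to back; a demand whose window closes before the path frees up would land in `rejected` and could fall back to ordinary best-effort transfer.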

Heuristics

Centralized Case

[Figure: SWITCH, NOC, Storage Node A, Storage Node B; Step 1: Book session; Step 2: Transfer bulk]

• all optimization formulations except sensing

• very close to traditional OSPF

• same problems as in OSPF

• the biggest problem is to know the demand matrix in advance

Distributed Case

[Figure: two SWITCHes, Storage Node A, Storage Node B; Booking segment; Bulk segment]

• can be used for all formulations

• perfectly suited for the Sensing formulation

The Sensing Model

• contention methods in wireless and OBS will work

◦ in practice: sensing can be SNMP-like feedback on the gate's status

◦ no sync among users is necessary

• same model for Ethernet (+virtual nets) and optical networks

• main advantage: the offload, no need to implement funny OSPF heuristics
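
The sense-then-send behaviour can be sketched with a gate object that reports its status. Everything named here (`Gate`, `status`, `acquire`, `release`, `try_send`) is hypothetical, and the in-memory object stands in for whatever SNMP-like feedback a real gate would expose.

```python
class Gate:
    """Hypothetical gate in front of a shared path; one holder at a time."""

    def __init__(self):
        self.holder = None

    def status(self):
        """SNMP-like feedback: is the path busy right now?"""
        return "busy" if self.holder is not None else "free"

    def acquire(self, source):
        if self.holder is None:
            self.holder = source
            return True
        return False

    def release(self, source):
        if self.holder == source:
            self.holder = None


def try_send(gate, source):
    """Sense first, send only if the gate is free; no sync with other sources."""
    if gate.status() == "free" and gate.acquire(source):
        return "sending"
    return "deferred"


gate = Gate()
first = try_send(gate, "A")     # path is free, A takes it
second = try_send(gate, "B")    # B senses a busy gate and defers
gate.release("A")
third = try_send(gate, "B")     # B retries after A is done
print(first, second, third)
```

Each source decides on its own from the gate's feedback, which is why no shared schedule or routing heuristic is needed in this model.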

Realistic Gate/Sensing Model

• an approximate view of the JGN topology

• two way = one way + ring

• Gates are created at the optical/Ethernet border

• NOTE: already working for Ethernet

Wrapup

• circuit emulation is necessary for effective bulk transfers

◦ up to 40% faster in our lab tests

• intra-DC, DC-DC, federations, etc. -- all can benefit from circuits

• circuits formulated as OSPF are bad -- a Gate/Sensing model is better

• validity: worst case is the existing technology, but the upper performance bound is very high

That’s all, thank you ...

Wait-n-Send Model

[Figure: axes labeled "Bulk size per transmission" and "Goodput"; annotations: "2 potential distributions in practice" and "Response curve(s)"]

Utility of Waiting (curve)

• I called it the Wait-n-See Curve

• the source waits for some time for exclusive access, sensing and accumulating bulk

• on timeout, the current bulk is released at best effort (fallback)
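
The wait-then-fallback rule can be written as a small decision function. In this sketch the moment the path becomes free (`free_at`) is passed in as a simulation input, whereas a real source would only learn it by sensing; the function name, parameters and the constant accumulation rate are all assumptions.

```python
def wait_n_send(free_at, now, timeout, arrival_rate, bulk=0.0):
    """Decide how a waiting source releases its accumulated bulk.

    free_at: time the path becomes free, or None if it never does (simulated).
    now, timeout: current time and the longest the source is willing to wait.
    arrival_rate: volume accumulated per second while waiting.
    Returns (mode, send_time, bulk_size).
    """
    deadline = now + timeout
    if free_at is not None and free_at <= deadline:
        send_time = max(now, free_at)
        mode = "exclusive"             # got the path within the waiting budget
    else:
        send_time = deadline
        mode = "best-effort"           # fallback on timeout
    # Bulk keeps accumulating for as long as the source waits.
    bulk += arrival_rate * (send_time - now)
    return mode, send_time, bulk


granted = wait_n_send(free_at=3.0, now=0.0, timeout=5.0, arrival_rate=2.0, bulk=10.0)
fallback = wait_n_send(free_at=None, now=0.0, timeout=5.0, arrival_rate=2.0, bulk=10.0)
print(granted, fallback)
```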
