
Datacenter Interconnection Network Design

Christina Delimitrou, Frank Nothaft, Milad Mohammadi, Laura Sharpless

May 26th 2011 – Final Project Presentation

Introduction

• Objective: Optimize datacenter networks for cost and power while maintaining competitive performance

• Evaluate alternative configurations in terms of Performance/Watt, Performance/$.

• Solution:

o Dragonfly topology
o Progressive Adaptive Routing (PAR)
o Virtual cut-through flow control

Outline

• Topology

• Cost

• Routing / Slicing

• Traffic Management

• Results

Topology

Dragonfly

• Fat Tree: Common interconnection network in large-scale datacenters

• Dragonfly: Cost-effective use of resources
o Fewer hops (improves latency)
o Greater path diversity
o Fewer optical cables (improves cost)

Design Overview

[Floor-plan figures: racks arranged in a 19 x 19 column layout; 9 racks = 1/2 group, 1 group = 18 racks, 8 routers per group. Routers (Router #0 through Router #3 in each row) are shared between adjacent rows of racks; rack footprint is 1m x 0.25m, with 3m aisles.]

• Cost optimization: Connect neighboring groups with electrical wires

Router Pins

[Figure: router pin allocation.]

Cost

• Free variables: routers / group, endpoints / router
• Routers / group and endpoints / router => number of groups
• Routers / group and endpoints / router also determine the number of optical cables needed (see the sketch below)
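
A minimal sketch of this sizing arithmetic, assuming a fixed endpoint budget (the later slicing slide's "1136 nodes = ~1% of total" suggests roughly 113,600 endpoints) and a fully connected Dragonfly global network with one link per pair of groups; the neighbor-ring electrical wiring follows the earlier slide, and the total, the names, and the wiring rule are all assumptions, not the project's actual cost model.

import math

def design_point(routers_per_group: int, endpoints_per_router: int,
                 total_endpoints: int = 113_600) -> dict:
    """Derive the group count and optical cable count for one design point."""
    endpoints_per_group = routers_per_group * endpoints_per_router
    groups = math.ceil(total_endpoints / endpoints_per_group)
    # One global link per pair of groups (fully connected global network).
    global_links = groups * (groups - 1) // 2
    # Neighboring groups are wired electrically (a ring of nearest
    # neighbors); every remaining global link needs an optical cable.
    return {"groups": groups, "optical_cables": global_links - groups}

# The design point chosen on a later slide: 8 routers / group and
# 71 endpoints / router give 200 groups of 568 endpoints each.
print(design_point(8, 71))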

Cost

[Plots: network cost across the design space.]

Design Point: 8 routers / group, 71 endpoints / router

[Plots: cost, energy, and latency at the chosen design point.]

Routing / Slicing

Progressive Adaptive Routing

• Adaptive routing handles tree saturation by steering traffic off congested minimal paths

PAR

• Implementation:
o 4 VCs to avoid deadlock:
VC0: minimal routing in the source group
VC1: Valiant routing to the intermediate group
VC2: minimal routing to the destination group
VC3: minimal routing within the destination group
o Next: determine the threshold via simulation; route minimally when (H x q)minimal < (H x q)non-minimal + 30, where H is the hop count and q the queue depth of the candidate path (see the sketch below)
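
A minimal sketch of the per-hop decision this rule implies, assuming H is the hop count and q the output-queue depth of each candidate path; the names are illustrative, not the project's actual code.

def route_minimally(h_min: int, q_min: int,
                    h_nonmin: int, q_nonmin: int,
                    threshold: int = 30) -> bool:
    """PAR decision: True -> stay minimal, False -> misroute (Valiant)."""
    # Stay on the minimal path unless its congestion-weighted cost
    # exceeds the non-minimal cost plus the fixed threshold.
    return h_min * q_min < h_nonmin * q_nonmin + threshold

Under PAR, this comparison can be revisited at each hop while the packet is still in its source group, so a packet that starts out minimally may later commit to a Valiant route.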

Slicing

We simulate:
• Two full groups
• One cloud node

[Figure: sliced topology showing Group 0, Group 1, and the cloud node.]

Traffic Management

• Challenges:
o Cloud abstraction
o Simulating PAR misrouting

• Cloud abstraction:
o 1136 simulated nodes = ~1% of the total; a single cloud node abstracts the remainder
o Assumption: the cloud node issues one packet per cycle, scaled by the number of cloud nodes * the ratio of cloud <-> real nodes
o Generates the packets leaving the cloud (see the sketch after this list)
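
One plausible reading of this injection model, sketched below: the cloud node injects at the per-node rate scaled by how many endpoints it stands in for, capped at the stated one packet per cycle. The node counts are inferred from the ~1% figure and are assumptions.

import random

def cloud_injects(per_node_rate: float,
                  cloud_nodes: int = 112_464,  # assumed: ~99% of the network
                  real_nodes: int = 1_136) -> bool:
    """Decide whether the cloud node injects a packet this cycle."""
    rate = per_node_rate * cloud_nodes / real_nodes  # scaled injection rate
    return random.random() < min(rate, 1.0)          # cap: 1 packet / cycle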

Traffic Management

• Traffic Misrouting
o Misrouting due to PAR varies inversely with offered traffic
o The range is linear and fairly small (15-20%)
o Currently we assume 20% of all packets are misrouted, and 1% of packets sent between nodes in the cloud are misrouted to the outside world (modeled in the sketch after this list)

• Hotspot Placement
o Randomly place hotspots in the topology
o Guarantee at least one non-cloud hotspot
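
A sketch of both traffic-manager rules, with the 20% and 1% rates taken from the slide and everything else (names, the node bookkeeping) assumed for illustration.

import random

def is_misrouted(src_in_cloud: bool, dst_in_cloud: bool) -> bool:
    """Apply the fixed misrouting rates from the slide."""
    if src_in_cloud and dst_in_cloud:
        # 1% of cloud-internal packets escape to the outside world.
        return random.random() < 0.01
    return random.random() < 0.20  # 20% of all other packets are misrouted

def place_hotspots(nodes: list, cloud: set, n_hotspots: int) -> set:
    """Randomly place hotspots, guaranteeing at least one non-cloud hotspot."""
    non_cloud = [n for n in nodes if n not in cloud]
    chosen = {random.choice(non_cloud)}  # the guaranteed non-cloud hotspot
    while len(chosen) < n_hotspots:
        chosen.add(random.choice(nodes))
    return chosen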

Results

[Plots: throughput vs. simulation cycles and latency vs. simulation cycles.]

Simulator Status

• Topology supports slicing
• Router supports PAR
• Traffic manager supports hotspots and slicing
• Message routing partly implemented

Next?

• Reevaluate the PAR implementation
• Complete the hotspot implementation
• Complete the message routing implementation

Thank you!