a network-state management service · 2014-12-10 · a network-state management service peng sun...

A Network-State Management Service Peng Sun Ratul Mahajan, Jennifer Rexford, Lihua Yuan, Ming Zhang, Ahsan Arefin Princeton & Microsoft


Page 1: A Network-State Management Service · 2014-12-10 · A Network-State Management Service Peng Sun Ratul Mahajan, Jennifer Rexford, Lihua Yuan, Ming Zhang, Ahsan Arefin Princeton &

A Network-State Management Service

Peng Sun
Ratul Mahajan, Jennifer Rexford, Lihua Yuan, Ming Zhang, Ahsan Arefin
Princeton & Microsoft

Complex Infrastructure

Microsoft Azure

Number of          2010          2014
Data centers       A few         10s
Network devices    1,000s        10s of 1,000s
Network capacity   10s of Tbps   Pbps

• Variety of vendors/models/time

Management Applications

• Traffic Engineering
• Load Balancing
• Link Corruption Mitigation
• Device Firmware Upgrade
• ……

Our Question

How to safely run multiple management applications on shared infrastructure?

Naïve Solution

• Run independently
• It does not work due to 2 problems

[Diagram: Traffic Engineering, Link Corruption Mitigation, and Firmware Upgrade each act on Network Devices]

Problem #1: Conflict

[Diagram: topology with ToRs, Agg A, Agg B, Core 1, and Core 2]

• Link-corruption-mitigation adjusts traffic away from Core 1
• TE tunes traffic among the links to Core 1 and Core 2

Problem #2: Safety Violation

[Diagram: topology with ToRs, Agg A, Agg B, Core 1, and Core 2]

• Link-corruption-mitigation shuts down faulty Agg A
• Firmware-upgrade schedules Agg B to upgrade
• With Agg A and Agg B both out of service, the ToRs below them lose their uplinks

Potential Solution #1

• One monolithic application
• Central control of all actions

[Diagram: Traffic Engineering, Firmware Upgrade, and Link Corruption Mitigation inside one application]

Too Complex to Build

• Difficult to develop
  • Combine all applications that are already individually complicated
• High maintenance cost
  • For such huge software in practice

Potential Solution #2

• Explicit coordination among applications
• Consensus over network changes

[Diagram: Traffic Engineering, Firmware Upgrade, and Link Corruption Mitigation coordinating with one another]

Still Too Complex

• Hard to understand each other
  • Diverse network interactions

[Table: rows Traffic Engineering and Firmware Upgrade; columns Routing and Device Config]

Main Enemy: Complexity

• Application development
• Application coordination

[Diagram: Independent, Explicitly coordinate, and Monolithic placed on an axis from Simple to Complex]

What We Advocate

• Loose coupling of applications
• Design principle:
  • Simplicity with safety guarantees
• Forgo joint optimization
  • Worthwhile tradeoff for simplicity
  • Applications could do it out-of-band

Overview of Statesman

• Network operating system for safe multi-application operation
• Uses network state abstraction
  • Three views of network state
  • Dependency model of states

The “State” in Statesman

• Complexity of dealing with devices
  • Heterogeneity
  • Device-specific commands

[Diagram: Network State layered over Network Devices]

State Variable Examples

State Variable           Value
Device Power Status      Up, down
Device Firmware          Version number
Device SDN Agent Boot    Up, down
Device Routing State     Routing rules
Link Admin Status        Up, down
Link Control Plane       BGP, OpenFlow, …
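
One way to picture these state variables is as entity-variable-value records. The sketch below is illustrative only: the `StateVar` class, the `to_state` helper, the entity names, and the example values are assumptions, not Statesman's actual schema.

```python
from dataclasses import dataclass

# Hypothetical record for one network-state variable; names are illustrative.
@dataclass(frozen=True)
class StateVar:
    entity: str    # device or link the variable belongs to, e.g. "AggA"
    variable: str  # variable name, e.g. "FirmwareVersion"
    value: str     # value, e.g. "Up", "Down", or a version string

def to_state(rows):
    """Index records by (entity, variable) so a view is a simple lookup."""
    return {(r.entity, r.variable): r.value for r in rows}

observed = to_state([
    StateVar("AggA", "PowerStatus", "Up"),
    StateVar("AggA", "FirmwareVersion", "6.0"),
    StateVar("AggA-Core1", "AdminStatus", "Up"),
])

print(observed[("AggA", "FirmwareVersion")])
```

Keying on the (entity, variable) pair keeps reads and writes uniform across device types, which is the point of the abstraction.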

Simplify Device Interaction

• Past: the application deals with Network Devices directly
  • Device statistics via SNMP, OF, vendor API, …
  • Device-specific cmds
• Now: the application deals with Network State
  • Read
  • Write

Views of Network State

• Observed State: actual state of the whole network
• Target State: desired state to be updated on the whole network

[Diagram: Applications, Observed State, Target State, Network Devices]

Two Views Are Not Enough

• One more view
  • Proposed State: a group of entity-variable-values desired by an application

[Diagram: Applications, Proposed State, Observed State, Target State, Network Devices]

How Merging Works

• Combine multiple proposed states into a safe target state
• Conflict resolution
  • Last-writer-wins
  • Priority-based locking
  • Sufficient for current deployment
• Safety invariant checking
  • Partial rejection & Skip update
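
The conflict-resolution step can be sketched as follows. This is a minimal illustration, not Statesman's implementation: the `merge` function, the tuple layout of a proposal, the lock table, and the rule that an equal-or-higher-priority lock holder blocks other writers are all assumptions. Safety invariant checking is left out here.

```python
# Sketch of merging proposed states into a target state.
# All names and the tuple layout are assumptions made for illustration.

def merge(target, proposals, locks, priority):
    """proposals: list of (timestamp, app, entity, variable, value).
    locks: entity -> app currently holding its lock.
    priority: app -> int (higher wins the lock)."""
    accepted, rejected = dict(target), []
    # Process in time order so that a later write overwrites an earlier one
    # (last-writer-wins).
    for ts, app, entity, variable, value in sorted(proposals):
        holder = locks.get(entity)
        if (holder is not None and holder != app
                and priority[holder] >= priority[app]):
            # Priority-based locking: a lower-priority app cannot write to
            # an entity locked by another app.
            rejected.append((app, entity, variable))
            continue
        accepted[(entity, variable)] = value
    return accepted, rejected

locks = {"BR1": "firmware-upgrade"}
priority = {"firmware-upgrade": 2, "te": 1}
proposals = [
    (1, "te", "BR1", "RoutingState", "tunnel-via-BR1"),
    (2, "firmware-upgrade", "BR1", "FirmwareVersion", "7.0"),
    (3, "te", "BR2", "RoutingState", "rules-v1"),
    (4, "te", "BR2", "RoutingState", "rules-v2"),
]
target, rejected = merge({}, proposals, locks, priority)
print(target)
print(rejected)
```

In this run the TE write to the locked BR1 is rejected, the firmware write goes through, and the later of the two TE writes to BR2 wins.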

Choose Safety Invariants

• Tight: hinder application too frequently
• Loose: cannot protect network operation
• Our current choice
  • Connectivity: Every pair of ToRs in one DC is connected
  • Capacity: 99% of ToR pairs have at least 50% capacity
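
The two invariants can be checked against a candidate target state roughly as below. This is a sketch under stated assumptions: the function names, the link-list input, and the `pair_capacity` map (current and baseline capacity per ToR pair, computed elsewhere) are hypothetical. How capacity between a ToR pair is actually measured is not covered by the slides.

```python
from collections import deque
from itertools import combinations

def reachable(adj, src):
    """Return the set of nodes reachable from src by breadth-first search."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def check_invariants(up_links, tors, pair_capacity,
                     pair_fraction=0.99, capacity_fraction=0.50):
    """up_links: (a, b) links that would be up in the candidate target state.
    pair_capacity: (tor_a, tor_b) -> (capacity_now, capacity_baseline)."""
    adj = {}
    for a, b in up_links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # Connectivity: every ToR must be reachable from the first ToR.
    connected = not tors or set(tors) <= reachable(adj, tors[0])
    # Capacity: enough ToR pairs keep at least half their baseline capacity.
    pairs = list(combinations(sorted(tors), 2))
    healthy = sum(
        1 for p in pairs
        if pair_capacity[p][0] >= capacity_fraction * pair_capacity[p][1])
    capacity_ok = not pairs or healthy / len(pairs) >= pair_fraction
    return connected, capacity_ok

tors = ["T1", "T2"]
# Agg A is down, Agg B is still up: half the capacity remains.
one_agg_down = check_invariants(
    [("T1", "AggB"), ("T2", "AggB")], tors, {("T1", "T2"): (10, 20)})
# Both Aggs are down: no path and no capacity.
both_aggs_down = check_invariants([], tors, {("T1", "T2"): (0, 20)})
print(one_agg_down, both_aggs_down)
```

The thresholds are parameters so that the tight-versus-loose tradeoff can be tuned without touching the check itself.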

Recap of Three-View Model

• Simplify network management
  • Observed State: what we see from the network
  • Proposed State: what we want the network to be
  • Target State: what can be actually done on the network

[Diagram: Applications, Statesman, and the three states]

Yet Another Problem

• What’s in Proposed State
  • Small number of state variables that the application cares about
• Implicit conflicts arise
  • Caused by state dependency

Implicit Conflict

[Diagram: devices A, B, C, and D]

• TE writes a new value of the routing state of B for tunneling traffic
• Firmware-upgrade writes a new value of the firmware state of B

Dependency Relations

• Device: PowerState, FirmwareVersion, ConfigurationState (bgpd, SDN), RoutingState
• Link: AdminState, ConfigurationState
• PathState

Build in Dependency Model

• Statesman calculates it internally
• Only exposes the result for each state variable
  • Whether the variable is controllable
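
A possible reading of "controllable" is that a variable can be changed only while everything it depends on is in a working state. The sketch below encodes that reading. The chain order, the `controllable` function, and the notion of "healthy" values are assumptions for illustration, not the model Statesman ships.

```python
# Hypothetical dependency chain for device variables; each variable depends
# on the ones before it. This ordering is an assumption for illustration.
DEVICE_CHAIN = ["PowerState", "FirmwareVersion",
                "ConfigurationState", "RoutingState"]

def controllable(observed, device, variable, healthy):
    """Treat a variable as controllable when every variable it depends on
    (those earlier in the chain) is observed with a healthy value.
    observed: (device, variable) -> value
    healthy: variable -> set of values considered working."""
    idx = DEVICE_CHAIN.index(variable)
    return all(observed.get((device, dep)) in healthy[dep]
               for dep in DEVICE_CHAIN[:idx])

healthy = {"PowerState": {"Up"}, "FirmwareVersion": {"7.0"},
           "ConfigurationState": {"Ready"}}
# Device B is in the middle of a firmware upgrade.
observed = {("B", "PowerState"): "Up",
            ("B", "FirmwareVersion"): "Upgrading",
            ("B", "ConfigurationState"): "Ready"}

routing_ok = controllable(observed, "B", "RoutingState", healthy)
firmware_ok = controllable(observed, "B", "FirmwareVersion", healthy)
print(routing_ok, firmware_ok)  # False True
```

Under this reading, TE would see that the routing state of B is not controllable during the upgrade, which is how the implicit conflict between TE and firmware-upgrade becomes visible without either application knowing about the other.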

Statesman System

[Diagram: Storage Service holding Observed State, Proposed State, and Target State, with Monitor, Checker, and Updater components]
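
The component names suggest a control loop around the three views held in the storage service. The sketch below is one plausible reading rather than the documented design: the function bodies, the `read_device` and `write_device` callables, the toy invariant, and the division of work between components are all assumptions.

```python
def monitor(devices, read_device):
    """Collect the observed state: (device, variable) -> value."""
    observed = {}
    for dev in devices:
        for variable, value in read_device(dev).items():
            observed[(dev, variable)] = value
    return observed

def checker(observed, proposed, is_safe):
    """Fold proposals into a target state, skipping any that break safety."""
    target = dict(observed)
    for key, value in proposed.items():
        candidate = {**target, key: value}
        if is_safe(candidate):
            target = candidate
    return target

def updater(observed, target, write_device):
    """Push only the variables whose target differs from what is observed."""
    for (dev, variable), value in target.items():
        if observed.get((dev, variable)) != value:
            write_device(dev, variable, value)

# Toy network: two aggregation switches, both up.
network = {"AggA": {"PowerStatus": "Up"}, "AggB": {"PowerStatus": "Up"}}

def write(dev, variable, value):
    network[dev][variable] = value

def at_least_one_up(state):
    # Toy invariant for this example only: some device must stay up.
    return any(v == "Up" for v in state.values())

observed = monitor(network, lambda dev: network[dev])
proposed = {("AggA", "PowerStatus"): "Down", ("AggB", "PowerStatus"): "Down"}
target = checker(observed, proposed, at_least_one_up)
updater(observed, target, write)
print(network)
```

Here the first proposal is accepted and the second is skipped because applying it would take both switches down, so only Agg A is changed on the network.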

Deployment Overview

• Operational in Microsoft Azure for 12 months
• Covers 10 DCs with 20K devices

Production Applications

• 3 diverse applications built
  • Device firmware upgrade
  • Link corruption mitigation
  • Traffic engineering
• Finished within months
• Only thousands of lines of code

Case #1: Resolve Conflict

Inter-DC TE & Firmware-upgrade

[Diagram: four data centers and eight border routers: DC 1 (BR 1, BR 2), DC 2 (BR 3, BR 4), DC 3 (BR 5, BR 6), DC 4 (BR 7, BR 8)]

DC = Data Center, BR = Border Router

[Plot for Case #1, with annotations in order of appearance]

• Firmware-upgrade acquires lock of BR1
• TE fails to acquire lock, and moves traffic away
• BR1 firmware upgrade starts
• BR1 firmware upgrade ends. Lock released.
• TE re-acquires lock, and moves traffic back

Case #1 Summary

• Each application:
  • Simple logic
  • Unaware of the other
• Statesman enables:
  • Conflict resolution
  • Necessary coordination

Case #2: Maintain Capacity Invariant

Firmware-upgrade & Link-corruption-mitigation

[Diagram: Core, Agg, and ToR layers across Pod 1 through Pod 10, with one link marked as corrupting packets]

[Plot for Case #2, with annotations in order of appearance]

• Upgrade proceeds at normal speed in Pods 3 and 5
• Upgrade in Pod 4 is slowed down by the checker due to lost capacity

Case #2 Summary

• Statesman:
  • Automatically adjusts application progress
  • Keeps the network within safety requirements

Conclusion

• Need network operating system for multiple management applications
• Statesman
  • Loose coupling of applications
  • Network state abstraction
• Deployed and operational in Azure

Thanks!

Questions?
Check the paper for related work