Machine Learning for Router Congestion
Kurtis Heimerl
bnrg.eecs.berkeley.edu/~randy/Courses/CS294.F07/Kurtis_poster.pdf

Preamble/Abstract

Network congestion in datacenters has taken down some of the largest service providers. In this work, I combine machine learning techniques with the flexibility a datacenter environment provides to remedy this problem. I focus in particular on the router, and use an active testing methodology to determine “failure dependencies” in the datacenter. With these dependencies, I will be able to route packets more effectively during times of congestion.

Goals:
• Determine the relative importance of services by testing at the router level.
• Route packets in a congestion event as dictated by their relative importance.
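The second goal can be illustrated with a minimal sketch. The poster does not specify the routing policy, so everything here is an assumption: the function name `shed_by_importance`, the per-flow importance scores, and the greedy rule of admitting the most important flows first until the link capacity is used up.

```python
def shed_by_importance(flows, capacity):
    """Greedy admission under congestion (hypothetical policy).

    flows: list of (name, importance, demand) tuples, where importance is an
           assumed score derived from the learned failure dependencies.
    capacity: bandwidth available on the congested link, in the same units
              as demand.
    Returns (admitted, dropped) lists of flow names.
    """
    admitted, dropped = [], []
    remaining = capacity
    # Consider the most important flows first.
    for name, _importance, demand in sorted(flows, key=lambda f: f[1], reverse=True):
        if demand <= remaining:
            admitted.append(name)
            remaining -= demand
        else:
            dropped.append(name)
    return admitted, dropped


if __name__ == "__main__":
    # Illustrative flows only; the names and numbers are not from the poster.
    example = [("storage", 0.8, 5), ("logging", 0.1, 5), ("index", 0.6, 4)]
    print(shed_by_importance(example, capacity=9))
```

A real router would more likely apply per-packet queueing priorities than all-or-nothing admission; the greedy form is used here only because it keeps the ordering by importance easy to see.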


Key Concepts/Tools
• Active testing
  • Actively drop packets to test the nature of the dependencies.
• Failure dependencies rather than generic dependencies
  • Generic dependencies do not allow us to route.
  • Actively modifying the network allows us to test the nature of the dependency.
• Batch jobs rather than online algorithms
  • Distinct phases
    • Data gathering (blue)
    • Testing (red)
  • Each phase requires only one SVM run, which allows us to reduce the overhead by using a coprocessor.
• Datacenter service redundancy
  • Services expect outages.
  • Infrastructure exists to restart dead services.
  • These allow us to kill services and expect minimal effect on the users of the services.
• Weighting later data points more heavily
  • Services may not immediately recover from outages.
  • Weight packets based on the expectation of a service recovery.
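The results on the poster are reported under equal, linear, and exponential weighting, but the exact formulas are not given. The sketch below shows one plausible reading, in which observations are ordered oldest first and later ones receive larger weights. The function names, the normalization to a sum of 1, and the `rate` parameter are all assumptions made for illustration.

```python
import math


def sample_weights(n, scheme="equal", rate=0.05):
    """Return n weights (oldest observation first), normalized to sum to 1.

    The three schemes mirror the labels on the poster; the formulas
    themselves are assumed, not taken from the source.
    """
    if n <= 0:
        return []
    if scheme == "equal":
        raw = [1.0] * n
    elif scheme == "linear":
        # Weight grows in proportion to the observation's position.
        raw = [float(i + 1) for i in range(n)]
    elif scheme == "exponential":
        # Weight grows geometrically; `rate` is a hypothetical tuning knob.
        raw = [math.exp(rate * i) for i in range(n)]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(raw)
    return [w / total for w in raw]


def weighted_score(observations, scheme="equal", rate=0.05):
    """Weighted average of per-observation values under the chosen scheme."""
    weights = sample_weights(len(observations), scheme, rate)
    return sum(w * x for w, x in zip(weights, observations))
```

For example, with observations `[0, 0, 1, 1]` the equal scheme gives 0.5 while the linear scheme gives about 0.7, because the two later observations carry more weight. This matches the intuition in the bullets above: a service that has not yet recovered from an outage should not be judged mainly on its earliest packets.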

117-108: Weight 0
158-108: Equal Weight 0.603175568804, Linear Weight 0.678278982714944, Exponential Weight 0.689595088365564
174-108: Equal Weight 0.642814009661783, Linear Weight 0.797291379253128, Exponential Weight 0.844763634041119

Result: 117 depends on 108. Very unlikely that 108 depends on 117.

HDFS, 3 nodes, replication 3: 174-108 disconnected

HDFS, 3 nodes, replication 1: 117-108 disconnected

117-108: Equal Weight 0.289433384379793, Linear Weight 0.281712955072555, Exponential Weight 0.27887394256017
117-158: Weight 0
158-108: Equal Weight 0.0677545796336299, Linear Weight 0.0794150611951044, Exponential Weight 0.0835522769207024
174-108: Weight 0
174-117: No data
174-158: Weight 0

Result: 174 depends on 108. Possible that 108 depends on 174.

[Figure panels, labeled by link: 174-108, 117-108, 174-117, 174-158, 158-108, 117-108, 174-108]
