1 Vivaldi: A Decentralized Network Coordinate System Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris Presented by: Chen Qian

Upload: roberta-knight

Post on 04-Jan-2016

TRANSCRIPT

Vivaldi: A Decentralized Network Coordinate System

Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris

Presented by:

Chen Qian

Probe-then-connect is an intuitive scheme to find a close server or host.

However, probing all servers first to find the closest one is not always practical:

• P2P systems such as KaZaA and BitTorrent have a large number of replica servers.
• DNS is an example of a system in which each piece of data is small, so probing can cost as much as the request itself.

Motivation

Synthetic coordinate systems allow Internet hosts to predict the RTT to any other host.

The distance between the coordinates of two hosts should be an accurate predictor of the RTT.

These systems can be constructed by each host only communicating with a small set of other hosts.

A Solution

Global Network Positioning (GNP) is the first coordinate system.

It is a landmark-based approach. Several nodes in the network act as landmarks, whose coordinates are given. A normal node uses its distances to three (or more) landmarks to estimate its own coordinates.

GNP

Vivaldi is a simple, adaptive, decentralized algorithm for computing network coordinates.

No low-dimensional coordinate space would predict RTTs exactly, because Internet latencies violate the triangle inequality.

Vivaldi introduces the notion of height, which improves the prediction accuracy.

Vivaldi

$$E = \sum_i \sum_j \left(L_{ij} - \lVert x_i - x_j \rVert\right)^2$$

Where
• $L_{ij}$: the actual RTT between nodes i and j
• $x_i$: the coordinates assigned to node i
• $\lVert x_i - x_j \rVert$: the distance between the coordinates of i and j

Minimizing the squared-error function is equivalent to minimizing the energy in a physical mass-spring network.

Prediction Error
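
As a rough illustration of the prediction error on this slide, the sketch below sums the squared difference between each measured RTT and the corresponding coordinate distance over all pairs of hosts. The function name `squared_error`, the 3-host RTT matrix, and both coordinate sets are made up for the example.

```python
import math

def squared_error(latency, coords):
    """Sum of (L_ij - ||x_i - x_j||)^2 over all ordered pairs i != j."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(len(coords)):
            if i == j:
                continue
            predicted = math.dist(coords[i], coords[j])
            total += (latency[i][j] - predicted) ** 2
    return total

# Hypothetical RTTs (ms) between three hosts that happen to lie on a line.
rtt = [[0, 10, 30],
       [10, 0, 20],
       [30, 20, 0]]

perfect = [(0.0, 0.0), (10.0, 0.0), (30.0, 0.0)]  # reproduces every RTT
skewed = [(0.0, 0.0), (10.0, 0.0), (25.0, 0.0)]   # third host placed 5 ms short

print(squared_error(rtt, perfect))
print(squared_error(rtt, skewed))
```

Coordinates that reproduce every RTT give zero error; moving one host away from its ideal position raises the error for every pair that involves it.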

Tries to minimize the error of predicted RTT values by simulating the movements of nodes under spring forces.

Centralized Algorithm

[Figure: a spring between N1 and N2. A single spring at rest: length 100. Longer spring: length 150. Shorter spring: length 50.]

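By Hooke's Law:

$$F_{ij} = \left(L_{ij} - \lVert x_i - x_j \rVert\right) \times u(x_i - x_j)$$

• $L_{ij} - \lVert x_i - x_j \rVert$: scalar quantity, the displacement of the spring from rest.
• $u(x_i - x_j)$: unit vector which gives the direction of the force on i.

The force vector $F_{ij}$ can be viewed as an error vector, which has a direction.

Algorithm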
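
A minimal sketch of the Hooke's Law force on this slide, in 2-D, using the spring lengths 100, 150, and 50 from the earlier spring figure. The names `unit_vector` and `spring_force` are hypothetical, and the random direction for two nodes at the same point is an assumption made so that the function is always defined.

```python
import math
import random

def unit_vector(v):
    """Scale a 2-D vector to length 1 (random direction if v is zero)."""
    norm = math.hypot(v[0], v[1])
    if norm == 0.0:
        # Assumption: nodes at the same point push apart in a random direction.
        angle = random.uniform(0.0, 2.0 * math.pi)
        return (math.cos(angle), math.sin(angle))
    return (v[0] / norm, v[1] / norm)

def spring_force(x_i, x_j, rtt):
    """Force on node i from the spring to node j, whose rest length is rtt."""
    diff = (x_i[0] - x_j[0], x_i[1] - x_j[1])
    displacement = rtt - math.hypot(diff[0], diff[1])  # scalar part
    u = unit_vector(diff)                               # direction part
    return (displacement * u[0], displacement * u[1])

n1 = (0.0, 0.0)
stretched = spring_force(n1, (150.0, 0.0), 100.0)   # spring longer than rest
compressed = spring_force(n1, (50.0, 0.0), 100.0)   # spring shorter than rest
print(stretched, compressed)
```

With the spring stretched to 150, the force on N1 points toward N2; with the spring compressed to 50, it points away from N2.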

Local minimum

But the global minimum is not guaranteed. The system may come to rest in a local minimum.

[Figure: nodes N1, N2, N3, N4, and N5 at rest in a local minimum.]

[Figure: the same nodes N1, N2, N3, N4, and N5 in an arrangement with lower error.]

• Calculate sum of forces on node i
• Move a step in the direction of the sum of forces

Centralized Algorithm
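
The two steps above can be sketched as a loop over all nodes, assuming a constant timestep `t` and a full RTT matrix available in one place. The function name `centralized_pass`, the timestep value, and the two-node example are illustrative choices, not taken from the slides.

```python
import math

def centralized_pass(coords, latency, t):
    """One pass over all nodes; coords is updated in place and returned."""
    for i in range(len(coords)):
        force = [0.0] * len(coords[i])
        for j in range(len(coords)):
            if i == j:
                continue
            diff = [a - b for a, b in zip(coords[i], coords[j])]
            dist = math.hypot(*diff)
            if dist == 0.0:
                continue  # sketch only: skip nodes sitting on the same point
            error = latency[i][j] - dist  # spring displacement from rest
            force = [f + error * d / dist for f, d in zip(force, diff)]
        # Move a step of size t in the direction of the summed force.
        coords[i] = [x + t * f for x, f in zip(coords[i], force)]
    return coords

# Two nodes whose measured RTT is 100 but which start 150 apart.
latency = [[0.0, 100.0], [100.0, 0.0]]
coords = [[0.0, 0.0], [150.0, 0.0]]
for _ in range(200):
    centralized_pass(coords, latency, t=0.1)
print(math.dist(coords[0], coords[1]))
```

In this two-node case each pass shrinks the remaining error, so the distance between the coordinates settles at the measured RTT.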

• Continuously contact sample nodes
• For each sample node
  • Calculate force (error change) of this sample
  • Move a step in the direction of the error

Simple Distributed Version

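Each sample moves node i by a timestep $\delta$ along the force from that sample:

$$x_i = x_i + \delta \times \left(rtt - \lVert x_i - x_j \rVert\right) \times u(x_i - x_j)$$

The force term is identical to the individual forces calculated in the loop of the centralized algorithm.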

Coordinates update
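
A minimal sketch of the per-sample coordinate update, assuming a fixed timestep `delta`. The function name `simple_update`, the value 0.25, and the random direction for coincident nodes are assumptions for the example.

```python
import math
import random

def simple_update(x_i, x_j, rtt, delta):
    """Move x_i one step of size delta along the force from one RTT sample."""
    diff = [a - b for a, b in zip(x_i, x_j)]
    dist = math.hypot(*diff)
    if dist == 0.0:
        # Assumption: co-located nodes separate in a random direction.
        diff = [random.gauss(0.0, 1.0) for _ in x_i]
        norm = math.hypot(*diff)
        unit = [d / norm for d in diff]
    else:
        unit = [d / dist for d in diff]
    error = rtt - dist  # positive when the coordinates are too close
    return [a + delta * error * u for a, u in zip(x_i, unit)]

# Node i samples node j: measured RTT is 100, coordinates are 150 apart.
first_step = simple_update([0.0, 0.0], [150.0, 0.0], 100.0, delta=0.25)
print(first_step)

# Both nodes keep sampling each other and updating only their own coordinates.
a, b = [0.0, 0.0], [150.0, 0.0]
for _ in range(100):
    a = simple_update(a, b, 100.0, delta=0.25)
    b = simple_update(b, a, 100.0, delta=0.25)
print(math.dist(a, b))
```

Each node needs only its own coordinates, the remote node's coordinates, and the measured RTT, which is what makes the update usable without a central coordinator.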

The main difficulty in implementing Vivaldi is ensuring that it converges to coordinates that predict RTT well.

If the timestep is too small, convergence is slow. If the timestep is too large, convergence may fail.

Adaptive Timestep

The system should achieve both fast convergence and avoidance of oscillation.

• Simple adaptive timestep
• Adaptive timestep to deal with large errors: if the remote node has a large error, it should be given less weight than a remote node with a small error.

Adaptive Timestep

Algorithm with adaptive timestep

Compute error confidence

Update local error

Adjust time step
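
A sketch of how the three labeled steps could fit together. It assumes the weight is the local error divided by the sum of the local and remote errors, that the local error is a moving average of the relative sample error, and that the timestep is a constant times the weight. The names `adaptive_update`, `C_C`, and `C_E` are hypothetical, and `C_E = 0.25` is an assumed value.

```python
import math

C_C = 0.25  # timestep constant; c = 0.25 is the value the slides report
C_E = 0.25  # assumed weight of a new sample in the local error average

def adaptive_update(x_i, e_i, x_j, e_j, rtt):
    """Return node i's new (coordinates, error) after one RTT sample to j.

    Precondition for this sketch: e_i + e_j > 0 and rtt > 0.
    """
    # Compute error confidence: trust the sample less if j is less certain.
    w = e_i / (e_i + e_j)
    diff = [a - b for a, b in zip(x_i, x_j)]
    dist = math.hypot(*diff)
    # Update local error: moving average of the relative sample error.
    e_s = abs(dist - rtt) / rtt
    alpha = C_E * w
    new_e_i = e_s * alpha + e_i * (1.0 - alpha)
    # Adjust time step, then move along the spring force.
    delta = C_C * w
    if dist == 0.0:
        return list(x_i), new_e_i  # sketch only: no direction defined
    step = delta * (rtt - dist) / dist
    return [a + step * d for a, d in zip(x_i, diff)], new_e_i

# Same sample, once from a confident remote node and once from an uncertain one.
x_sure, e_sure = adaptive_update([0.0, 0.0], 1.0, [150.0, 0.0], 1.0, 100.0)
x_unsure, e_unsure = adaptive_update([0.0, 0.0], 1.0, [150.0, 0.0], 9.0, 100.0)
print(x_sure, e_sure)
print(x_unsure, e_unsure)
```

The sample from the remote node with the larger error moves the local node a shorter distance, which is the weighting described on the previous slide.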

Latency data
• Matrix of inter-host Internet RTTs
• Compute coordinates from a subset of these RTTs
• Check accuracy of the algorithm by comparing simulated results to the full RTT matrix

4 data sets (2 measured, 2 synthetic)
• 192 nodes of the PlanetLab network; all-pairs ping gives a fully populated matrix
• 1740 Internet DNS servers
  • Collect the full matrix using the King method
  • Continuously measure pairs over a week and take the median value

Evaluation Methodology

More geographically diverse at that time

King’s method

The first DNS query is for a name in the domain of A; its response time gives the latency to A. The second query is for a name in the domain of B, but is sent initially to A. The difference between the two query times is the latency between A and B.
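
The arithmetic behind King's method can be sketched as a subtraction followed by a median over repeated measurements. The function name `king_latency` and the sample timings are invented for illustration; no real DNS queries are issued.

```python
from statistics import median

def king_latency(t_query_a, t_query_b_via_a):
    """Estimate the A-to-B latency from two query times seen by the client."""
    return t_query_b_via_a - t_query_a

# Hypothetical (query to A, query for B sent via A) times in milliseconds.
samples = [(40.0, 120.0), (42.0, 125.0), (41.0, 300.0)]

estimates = [king_latency(t_a, t_b) for t_a, t_b in samples]
estimate = median(estimates)  # the median damps the congested third sample
print(estimates, estimate)
```

Taking the median rather than the mean keeps a single congested measurement from dominating the estimate.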

King’s method

Take the median value, because King can report an RTT higher or lower than the true value if there is congestion.

About 10% of the original nodes were removed from the data. High load or queuing at name server A adds a delay that is significantly larger than the network latency. The initial query (to A) and the recursive query (via A to B) then require roughly the same amount of time, so the estimated latency between A and B is near zero.

Simulation test setup
• Input: RTT matrix
• Send a packet once a second
• The simulator delays each transmission by ½ RTT
• Use the measured RTT of the packets to update coordinates

Limitation of the simulator: RTTs do not vary over time; it cannot model queuing delay or changes in routing.

Setup

Error definitions
• Error of link: absolute difference between predicted RTT and measured RTT.
• Error of node: median of the link errors involving this node.
• Error of system: median of all node errors.

Setup

A small proportion of nodes have large errors?
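
The three error definitions can be written down directly. This is a sketch with made-up data: the function names, the 3-host RTT matrix, and the coordinates are all hypothetical.

```python
import math
from statistics import median

def link_error(coords, latency, i, j):
    """Absolute difference between predicted and measured RTT."""
    return abs(math.dist(coords[i], coords[j]) - latency[i][j])

def node_error(coords, latency, i):
    """Median of the link errors involving node i."""
    return median([link_error(coords, latency, i, j)
                   for j in range(len(coords)) if j != i])

def system_error(coords, latency):
    """Median of all node errors."""
    return median([node_error(coords, latency, i)
                   for i in range(len(coords))])

latency = [[0, 10, 30],
           [10, 0, 20],
           [30, 20, 0]]
coords = [(0.0, 0.0), (10.0, 0.0), (25.0, 0.0)]  # third host is 5 ms off

print([node_error(coords, latency, i) for i in range(3)])
print(system_error(coords, latency))
```

Because both levels use a median, a minority of badly placed nodes can leave the system error unchanged, which is the concern raised in the question above.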

(a) Constant timestep: values that are too small and values that are too large both cause large errors.

(b) Adaptive timestep: c = 0.25 yields both quick error reduction and low oscillation.

Timestep choice

200 new nodes join a stable 200-node network.

With a constant timestep, new nodes may confuse the old nodes, and the system needs to re-converge.

A timestep with weighted errors allows new nodes to find their places quickly.

Timestep choice

Sampling only nearby nodes gives good local coordinates but poor global coordinates.

In the second case, nodes contact distant nodes as well, which improves the accuracy of the coordinates.

Communication pattern

Give each node 4 close neighbors and 4 far-away neighbors. Each node chooses one of the far neighbors with probability p.

• p = 0.5: quick convergence.
• p < 0.5: convergence slows, but similarly accurate coordinates are eventually chosen.

Communication pattern
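
One way to sketch the neighbor choice described on this slide: with probability p pick one of the far neighbors, otherwise one of the close neighbors. The function name `pick_neighbor`, the neighbor labels, and the choice of a uniform pick within each group are assumptions.

```python
import random

def pick_neighbor(near, far, p, rng=random):
    """Choose a far neighbor with probability p, else a near neighbor."""
    pool = far if rng.random() < p else near
    return rng.choice(pool)

near = ["near-1", "near-2", "near-3", "near-4"]
far = ["far-1", "far-2", "far-3", "far-4"]

rng = random.Random(0)  # seeded only to make the demo repeatable
picks = [pick_neighbor(near, far, 0.5, rng) for _ in range(1000)]
far_share = sum(name in far for name in picks) / len(picks)
print(far_share)
```

Over many samples the share of far neighbors approaches p, so p directly controls how much long-distance information each node sees.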

Ability to adapt to changes in the network (tested with “Transit-Stub”).

At time 100, one of the transit-stub links is made 10 times larger; after 20 s the system has re-converged. At time 300, the link goes back to its normal size and the system quickly re-converges to the original error.

Adapting to network changes

Accuracy: Vivaldi vs. GNP

How about communication cost?

Model Selection

Almost any coordinate space satisfies the triangle inequality (the distance between A and C should be less than or equal to the distance along the path A-B-C).

[Figure: triangle of nodes N1, N2, and N3 with link RTTs of 100 ms, 48 ms, and 48 ms.]

Not always true in Internet
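
The violation in the figure can be checked with a few lines of arithmetic. Which pair of nodes carries the 100 ms link is not recoverable from the slide, so the matrix below assigns it to N1 and N2 as an assumption; the function names are hypothetical.

```python
def violates_triangle_inequality(direct, leg_1, leg_2):
    """True if the direct RTT exceeds the RTT of the two-leg detour."""
    return direct > leg_1 + leg_2

def best_two_hop(latency, i, j):
    """Lowest RTT from i to j through exactly one intermediate node."""
    return min(latency[i][k] + latency[k][j]
               for k in range(len(latency)) if k not in (i, j))

# Assumed layout: N1-N2 is 100 ms, N1-N3 and N3-N2 are 48 ms each.
rtt = [[0, 100, 48],
       [100, 0, 48],
       [48, 48, 0]]

detour = best_two_hop(rtt, 0, 1)
print(detour, violates_triangle_inequality(rtt[0][1], 48, 48))
```

No placement of three points in a coordinate space can make one side longer than the sum of the other two, so at least one of these three RTTs must be mispredicted.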

Triangle inequality

The best indirect path usually has lower RTT than the direct path.

But luckily only about 5% of pairs have a significantly shorter indirect path.

Euclidean Spaces

If geographic distance were the only factor in latency, a 2-D model would be sufficient. However, the fit is not perfect. Adding more dimensions improves the accuracy of the fit slightly.

3D is okay!

Spherical coordinates

Does a spherical distance function provide a more accurate model, as the distances are drawn from paths along the surface of the Earth?

No!

2D+Height

The Euclidean portion models a high-speed Internet core with latencies proportional to geographic distance. The height models the time it takes packets to travel the access link from the node to the core.

The cause of the access link latency may be queuing delay, low bandwidth, etc.

A packet sent from one node to another must travel the source node’s height, then travel in the Euclidean space, then travel the destination node’s height.
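
The packet path described above suggests a simple distance function: source height, plus Euclidean distance in the core, plus destination height. This is a sketch under that reading; the function name `height_distance`, the (position, height) tuple layout, and the sample values are assumptions.

```python
import math

def height_distance(a, b):
    """Predicted RTT between two nodes given as (2-D position, height)."""
    (pos_a, height_a), (pos_b, height_b) = a, b
    core = math.dist(pos_a, pos_b)      # travel across the Euclidean core
    return height_a + core + height_b   # plus both access links

# Hypothetical nodes: a is close to the core, b sits behind a slow access link.
a = ((0.0, 0.0), 5.0)
b = ((3.0, 4.0), 10.0)
print(height_distance(a, b))

# Two nodes at the same core position are still separated by their heights.
c = ((3.0, 4.0), 2.0)
print(height_distance(b, c))
```

Unlike a third Euclidean dimension, the heights always add, so a node behind a slow access link is far from every other node, including ones at the same position in the core.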

2D+Height

Performs better than 2D and 3D!

Does not look very promising because they take the median!

2D+Height

Nodes with large errors

The height model results in a smaller max error and median error.

Presents a simple, adaptive, decentralized algorithm for computing synthetic coordinates, which help Internet hosts to estimate latencies

Requires no fixed infrastructure. All nodes run the same algorithm.

Converges quickly by using an adaptive timestep.

Maintains accuracy even as a large number of new hosts that are uncertain of their coordinates join the network.

Conclusion

Thanks! Q&A