“Erdos and the Internet”


Post on 25-Feb-2016


Page 1: “Erdos and the Internet”

“Erdos and the Internet”

Milena Mihail, Georgia Tech

The Internet is a remarkable phenomenon that involves graph theory in a natural way and gives rise to new questions and models.

E.g., the Internet at the level of Autonomous Systems supports the critical BGP routing protocol.

Page 2: “Erdos and the Internet”

Search and routing networks, such as the WWW, the Internet, P2P networks, and ad hoc (mobile, wireless, sensor) networks, are pervasive and scale at an unprecedented rate.

Performance analysis/evaluation in networking: measure parameters hopefully predictive of performance.

Important in network simulation and design.

Page 3: “Erdos and the Internet”

Sparse small-world graphs with large degree-variance.

Want metrics predictive or explanatory of network function.

[Figure: degree (x-axis) vs. frequency (y-axis); annotations: “no sharp concentration”, “Erdos-Renyi”.]

Page 4: “Erdos and the Internet”

Networking questions

Routing
Does packet drop (blocking) scale?
Does the network evolve towards monopolies?
Are network resources used efficiently?
How does delay scale in routing?
Is there load balancing?
Is it or ?

Searching
How fast can you crawl the WWW?
Can you search a P2P network with low overhead?
Are there strategies to improve crawling and searching?
Is it or ?

Design
How can you maintain a well connected topology?
How about distributed and dynamic networks?
Is it or ?

Congestion

Graph on $n$ nodes.

Route 1 unit of flow between each pair of nodes.

Total flow $\Theta(n^2)$.

Congestion = flow on most loaded link under optimal routing.
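A hedged worked bound shows why bottlenecks matter for this definition. It uses only the setup above, with $n$ the number of nodes and $V$ the vertex set; the symbols $S$, $\bar S$ and $E(S,\bar S)$ are introduced here for illustration and are not from the slides. For any vertex set $S$, the $|S|\,|\bar S|$ units of flow between $S$ and its complement must all cross the cut edges, so some cut edge carries at least the average load (counting unordered pairs; with ordered pairs the bound doubles):

```latex
\[
\text{congestion} \;\ge\; \max_{\emptyset \neq S \subsetneq V}
\frac{|S|\,|\bar S|}{|E(S,\bar S)|},
\qquad \bar S = V \setminus S .
\]
```

A sparse cut separating two large sets therefore forces high congestion no matter how the flow is routed.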

Page 5: “Erdos and the Internet”

Relevant metric: “bottlenecks”

Conductance

Alon 85, Jerrum & Sinclair 88, Leighton & Rao 95
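The conductance formula on this slide did not survive extraction. As a minimal sketch, assuming one common normalization (minimum over vertex sets S with at most half the total degree of cut(S) divided by the total degree of S), a brute-force computation on a tiny graph looks as follows. The function names and the example graph are hypothetical, and the exponential enumeration is only meant for illustration.

```python
from itertools import combinations

def conductance(adj):
    """Brute-force conductance: min over S with vol(S) <= vol(V)/2 of cut(S)/vol(S).

    adj maps each vertex to the set of its neighbours (simple undirected graph).
    Assumed normalization; the slide's own formula is not recoverable.
    """
    nodes = list(adj)
    total_vol = sum(len(adj[v]) for v in nodes)
    best = float("inf")
    for k in range(1, len(nodes)):
        for subset in combinations(nodes, k):
            s = set(subset)
            vol = sum(len(adj[v]) for v in s)
            if vol == 0 or 2 * vol > total_vol:
                continue
            # edges with exactly one endpoint in S
            cut = sum(1 for u in s for w in adj[u] if w not in s)
            best = min(best, cut / vol)
    return best

def two_triangles():
    """Two triangles joined by the single edge 2-3: an obvious bottleneck."""
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    adj = {v: set() for v in range(6)}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj
```

On the two-triangle graph the minimizing set is one triangle, cut by the single bridging edge.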

Page 6: “Erdos and the Internet”

Second eigenvalue of the lazy random walk associated with the adjacency matrix closely approximates conductance.

Computationally soft: Matlab handles sparse graphs with 1-2M nodes.

[Figure: 100 largest eigenvalues, Random Graph vs. Internet; plots at 700 nodes, 3000 nodes, and 15000 nodes.]

This is also another point of view of the small-world phenomenon.

This also says that congestion under link capacities, search time and sampling time scale smoothly.

Eigenvectors associated with large eigenvalues are “shadows” of sets with bad conductance.
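To make the spectral quantity concrete, here is a small sketch in pure Python (not the Matlab computation mentioned on the slide) that estimates the second eigenvalue of the lazy walk P = (I + D^{-1}A)/2 by power iteration with the top eigenvector projected out. The function name and the iteration count are assumptions. With gap g = 1 - lambda_2 and the degree-normalized conductance Phi, the Cheeger-type inequality g <= Phi <= 2*sqrt(g) is the sense in which the eigenvalue approximates conductance; the slide's exact statement is not recoverable.

```python
import math

def lazy_walk_second_eigenvalue(adj, iters=1000):
    """Estimate the second-largest eigenvalue of the lazy random walk.

    adj maps each vertex to an iterable of neighbours (connected, undirected).
    Works on the symmetric matrix M = (I + D^{-1/2} A D^{-1/2}) / 2, which has
    the same eigenvalues as the lazy walk and no negative ones.
    """
    nodes = list(adj)
    deg = {v: len(adj[v]) for v in nodes}
    total = sum(deg.values())
    # Top eigenvector of M (eigenvalue 1) is proportional to sqrt(degree).
    top = {v: math.sqrt(deg[v] / total) for v in nodes}
    x = {v: float(i + 1) for i, v in enumerate(nodes)}  # deterministic start
    lam = 0.0
    for _ in range(iters):
        # Project out the top eigenvector, then normalize.
        dot = sum(x[v] * top[v] for v in nodes)
        x = {v: x[v] - dot * top[v] for v in nodes}
        norm = math.sqrt(sum(val * val for val in x.values()))
        if norm < 1e-15:
            break  # nothing left outside the top eigenspace
        x = {v: val / norm for v, val in x.items()}
        # Apply M.
        y = {
            v: 0.5 * x[v]
            + 0.5 * sum(x[w] / math.sqrt(deg[v] * deg[w]) for w in adj[v])
            for v in nodes
        }
        lam = sum(x[v] * y[v] for v in nodes)  # Rayleigh quotient
        x = y
    return lam
```

For a 6-cycle the non-lazy walk has second eigenvalue cos(60 degrees) = 0.5, so the lazy walk gives 0.75.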

Page 7: “Erdos and the Internet”

Beyond today, we need network models to predict future behavior.

What are suitable network models?

The Internet grows anarchically, so random graphs are good candidates.

Current network models are random graphs which produce power law degree sequences (thus also matching this important observed data).

Page 8: “Erdos and the Internet”

EVOLUTIONARY: Growth & Preferential Attachment

One vertex at a time.

New vertex attaches to existing vertices.

Simon 55, Barabasi & Albert 99, Kumar et al 00, Bollobas & Riordan 01, Bollobas, Riordan, Spencer & Tusnady 01.
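A minimal sketch of growth with preferential attachment, under assumptions not stated on the slide: each new vertex sends `d` edges, each endpoint is chosen with probability proportional to current degree, multi-edges are allowed, and the process starts from two vertices joined by `d` parallel edges. The function name and parameters are hypothetical; the cited papers differ in such details.

```python
import random

def preferential_attachment(n, d, seed=0):
    """Grow a multigraph on vertices 0..n-1 (n >= 2), one vertex at a time.

    Returns a list of edges (new_vertex, chosen_existing_vertex).
    """
    rng = random.Random(seed)
    edges = [(0, 1)] * d
    # Vertex v appears deg(v) times, so a uniform draw is degree-proportional.
    endpoints = [0, 1] * d
    for new in range(2, n):
        # Choose all targets before adding the new vertex, so no self-loops.
        targets = [rng.choice(endpoints) for _ in range(d)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges
```

Keeping the flat `endpoints` list is a simple way to sample proportionally to degree without recomputing weights at every step.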

Page 9: “Erdos and the Internet”

CONFIGURATIONAL aka structural MODEL

Given a degree sequence $d_1, \dots, d_n$, choose a random perfect matching over the $\sum_i d_i$ minivertices.

“Random” graph with given “power law” degree sequence.

Bollobas 80s, Molloy & Reed 90s, Aiello, Chung & Lu 00s, Sigcomm/Infocom 00s

Page 10: “Erdos and the Internet”

CONFIGURATIONAL MODEL

Given a degree sequence $d_1, \dots, d_n$, choose a random perfect matching over the $\sum_i d_i$ minivertices.

Edge multiplicity O(log n), a.s.; connected, a.s.
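The construction can be sketched in a few lines, assuming the usual reading of the model: vertex i contributes as many minivertices as its degree, and a uniformly random perfect matching on the minivertices is obtained by shuffling them and pairing consecutive entries. The function name is hypothetical; self-loops and parallel edges are kept, as in the multigraph version of the model.

```python
import random

def configuration_model(degrees, seed=0):
    """Random multigraph with the given degree sequence.

    degrees[i] is the degree of vertex i; the sum must be even.
    Returns a list of edges (u, v); loops and multi-edges may occur.
    """
    if sum(degrees) % 2:
        raise ValueError("degree sum must be even")
    rng = random.Random(seed)
    # One minivertex per unit of degree, labelled by its owning vertex.
    minis = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(minis)
    # Pairing consecutive entries of a uniform shuffle is a uniform matching.
    return [(minis[i], minis[i + 1]) for i in range(0, len(minis), 2)]
```

Every output has exactly the requested degrees, counting a self-loop twice.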

Page 11: “Erdos and the Internet”

Bounds on Conductance

Theorem [M, Papadimitriou, Saberi 03]: For a random graph grown with preferential attachment with , , a.s.

Theorem [Gkantsidis, M, Saberi 03]: For a random graph in the configurational model arising from degree sequence , , a.s.

Previously: Cooper & Frieze 02

Independent: Chung, Lu & Vu 03

for a different structural random graph model and

Technique: Probabilistic Counting Arguments & Combinatorics.

Difficulty: Non-homogeneity in state-space, Dependencies.

Page 12: “Erdos and the Internet”

Structural Model, Proof Idea:

Difficulty: Non-homogeneity in state-space

Worst case is when all vertices have degree 3.

Page 13: “Erdos and the Internet”

Growth with Preferential Connectivity Model, Proof Idea:

Difficulty: Arrival Time Dependencies

Shifting Argument

Page 14: “Erdos and the Internet”

[Figure: labels “first” and “last”.]

Page 15: “Erdos and the Internet”

Theorem [Gkantsidis, MM, Saberi 03]: For a random graph in the structural model arising from degree sequence there is a polynomial-time computable flow that routes demand between all vertices and with max link congestion a.s.

Theorem [MM, Papadimitriou, Saberi 03]: For a random graph grown with preferential attachment with there is a polynomial-time computable flow that routes demand between all vertices and with max link congestion , a.s.

Note: Why is demand ?

Each vertex with degree in the network core serves customers from the network periphery.

Page 16: “Erdos and the Internet”

Networking questions

Routing: Congestion
Does packet drop (blocking) scale?
Does the network evolve towards monopolies?
Are network resources used efficiently?
How does delay scale in routing?
Is there load balancing?
It is

Searching
How fast can you crawl the WWW?
Can you search a P2P network with low overhead?
Are there strategies to improve crawling and searching?
Is it or ?

Design
How can you maintain a well connected topology?
How about distributed and dynamic networks?
Is it or ?

Page 17: “Erdos and the Internet”

Searching, Cover Time and Mixing Time

Graph on $n$ nodes.

Search the graph by random walk.

Cover time = expected time to visit all nodes.

Mixing time = time to get arbitrarily close to the stationary distribution.
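As an illustration of the cover time definition, the sketch below runs a simple random walk until every node has been visited and averages the step count over several runs. The function names, the adjacency-list input format, and the trial count are assumptions for the example; this is a Monte Carlo estimate, not an exact computation.

```python
import random

def cover_steps(adj, start, rng):
    """Steps a simple random walk from `start` needs to visit every node.

    adj maps each node to a list of neighbours (connected graph).
    """
    seen = {start}
    cur = start
    steps = 0
    while len(seen) < len(adj):
        cur = rng.choice(adj[cur])  # move to a uniformly random neighbour
        seen.add(cur)
        steps += 1
    return steps

def estimate_cover_time(adj, start, trials=200, seed=0):
    """Average of `cover_steps` over independent walks."""
    rng = random.Random(seed)
    return sum(cover_steps(adj, start, rng) for _ in range(trials)) / trials
```

On a cycle, for instance, the estimate grows quadratically with the number of nodes, which is the kind of scaling the talk asks about for Internet-like graphs.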

Page 18: “Erdos and the Internet”

Conductance, Mixing and Cover Time

For

Cover Time

“mixing” in

Rapid Mixing of Random Walk: Alon 85, Jerrum & Sinclair 88
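The formulas on this slide were lost in extraction. For orientation only, a standard conductance-based bound of the Jerrum & Sinclair type for a lazy reversible random walk is given below. The symbols $\tau(\varepsilon)$ (mixing time to within distance $\varepsilon$), $\Phi$ (conductance) and $\pi_{\min}$ (smallest stationary probability) are notation introduced here, and constants vary between references.

```latex
\[
\tau(\varepsilon) \;=\; O\!\left(\frac{1}{\Phi^{2}}\,
\log\frac{1}{\varepsilon\,\pi_{\min}}\right).
\]
```

For a random walk on a connected graph with $n$ nodes, $\pi_{\min} \ge 1/n^{2}$, so constant conductance gives mixing in $O(\log n)$ steps for fixed $\varepsilon$.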

Page 19: “Erdos and the Internet”

Extensions of Cover Time

In practice, when crawling the WWW or searching a P2P network, when a node is visited, all nodes incident to the node are also visited. This can be implemented by one-step local replication of information.
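A minimal sketch of this look-ahead idea, assuming that "visited" means the walk marks the current node and all of its neighbours as discovered at each step. The function name and input format are hypothetical.

```python
import random

def cover_steps_lookahead(adj, start, rng):
    """Random-walk steps until every node is discovered with look-ahead one.

    Standing on a node discovers the node and all its neighbours, modelling
    one-step local replication. adj maps nodes to neighbour lists.
    """
    seen = {start} | set(adj[start])
    cur = start
    steps = 0
    while len(seen) < len(adj):
        cur = rng.choice(adj[cur])
        steps += 1
        seen.add(cur)
        seen.update(adj[cur])  # neighbours are discovered for free
    return steps
```

On a star graph the effect is extreme: a walk that starts at a leaf discovers everything after its single step to the center, whereas the plain walk must return to the center once per leaf.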

Page 20: “Erdos and the Internet”

Cover Time with Look-Ahead One

Theorem [MM, Saberi, Tetali 05]: In the configurational model with , can discover vertices in steps.

Proof

Adamic et al 02, Chawathe et al 03, Gkantsidis, MM, Saberi 05

Page 21: “Erdos and the Internet”

Cover Time with Look-Ahead Two

Theorem [MM, Saberi, Tetali 05]: In the configurational model with , can discover vertices in steps.

Proof

Page 22: “Erdos and the Internet”

Networking questions

Routing: Congestion
Does packet drop (blocking) scale?
Does the network evolve towards monopolies?
Are network resources used efficiently?
How does delay scale in routing?
Is there load balancing?
It is

Searching: Cover time
How fast can you crawl the WWW?
Can you search a P2P network with low overhead?
Are there strategies to improve crawling and searching?
It is and local replication offers substantial improvement

Design
How can you maintain a well connected topology?
How about distributed and dynamic networks?
Is it or ?

Page 23: “Erdos and the Internet”

The case of Peer-to-Peer Networks

n nodes, d-regular graph

Each node has resources O(polylog n) and knows a very small neighborhood around itself

Distributed, decentralized

Search for content, e.g. by flooding or random walk

Must maintain a well connected topology, e.g. a graph with good conductance, a random graph

Page 24: “Erdos and the Internet”

P2P networks are constantly randomizing their links

Gnutella: constantly drops existing connections and replaces them with new connections

There are between 5 and 30 requests for new connections per second.

About 1% of these requests are satisfied and existing links are dropped.

The network is working “in panic” trying to randomize, thus avoiding network configurations with bottlenecks and trying to maintain high conductance.

Page 25: “Erdos and the Internet”

P2P Network Topology Maintenance by Constant Randomization

Theorem [Cooper, Frieze & Greenhill 04]: The Markov chain corresponding to a 2-link switch on d-regular graphs is rapidly mixing.

Question: How does the network “pick” a random 2-link switch? In reality, the links involved in a switch are within constant distance.

Theorem [Feder, Guetz, M, Saberi 06]: The Markov chain on d-regular graphs is rapidly mixing, even under local 2-link switches or flips.
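One step of the general 2-link switch chain can be sketched as follows, assuming the usual formulation: pick two edges at random, swap one endpoint of each, and reject the move if it would create a self-loop or a parallel edge. The names `norm` and `two_link_switch` are hypothetical. This is the non-local switch, so unlike the flip chain it does not restrict the two edges to be within constant distance and does not preserve connectivity.

```python
import random

def norm(u, v):
    """Canonical representation of an undirected edge."""
    return (u, v) if u < v else (v, u)

def two_link_switch(edges, rng):
    """One step of the 2-link switch chain on a simple graph.

    edges is a set of tuples (u, v) with u < v and at least two elements.
    Returns a new edge set, or the same one if the proposed move is rejected.
    All vertex degrees are preserved.
    """
    e1, e2 = rng.sample(sorted(edges), 2)
    (a, b), (c, d) = e1, e2
    if rng.random() < 0.5:
        c, d = d, c  # choose one of the two possible rewirings
    if len({a, b, c, d}) < 4:
        return edges  # would create a self-loop
    new1, new2 = norm(a, d), norm(c, b)
    if new1 in edges or new2 in edges:
        return edges  # would create a parallel edge
    return (edges - {e1, e2}) | {new1, new2}
```

Repeated steps starting from any d-regular graph stay inside the space of d-regular simple graphs, which is the state space the theorems refer to.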

Page 26: “Erdos and the Internet”

The proof is a Markov chain comparison argument

Space of d-regular graphs: general 2-link switch Markov chain

Space of connected d-regular graphs: local Flip Markov chain

Define a mapping from to such that

(a) (b) each edge in maps to a path of constant length in

Page 27: “Erdos and the Internet”

Networking questions

Congestion
Does packet drop (blocking) scale?
Does the network evolve towards monopolies?
Are network resources used efficiently?
How does delay scale in routing?
Is there load balancing?
It is

Cover time
How fast can you crawl the WWW?
Can you search a P2P network with low overhead?
Are there strategies to improve crawling and searching?
It is

Mixing time
How can you maintain a well connected topology?
How about distributed and dynamic networks?
It is

Conductance

Page 28: “Erdos and the Internet”

Open Problems

The Internet topology has constant second eigenvalue, but larger than the second eigenvalue of random graphs. Can we develop random graph models with power-law degree distributions and with varying values of the second eigenvalue? Preliminary work by Flaxman, Frieze & Vera.

Routing on the Internet is done according to shortest paths. Can we characterize congestion under shortest path routing?

How can we maintain a P2P topology with good connectivity under dynamic settings of arriving and departing nodes?

Can we develop efficient distributed algorithms that discover critical links in the network? Preliminary work by Boyd, Diaconis & Xiao.