Algorithmic Problems in the Internet

Christos H. Papadimitriou

www.cs.berkeley.edu/~christos

Iowa State, April 2003 2

Goals of TCS (1950-2000): Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time)

Mathematical tools: combinatorics, logic

What should the goals of TCS be today? (and what math tools will be handy?)

The Internet

• huge, growing, open, emergent, mysterious

• built, operated and used by a multitude of diverse economic interests

• as information repository: open, huge, available, unstructured, critical

• foundational understanding urgently needed

Today…

• Games and mechanism design

• Getting lost in the web

• The Internet’s heavy tail

Games, games…

[Figure: a payoff matrix. Rows and columns are the two players’ strategies; each cell holds the pair of payoffs, e.g. 3,-2]

(NB: also, many players)

e.g.

matching pennies:
1,-1   -1,1
-1,1   1,-1

chicken:
0,0   0,1
1,0   -1,-1

prisoner’s dilemma:
3,3   0,4
4,0   1,1

Nash equilibrium

• Definition: double best response

(problem: may not exist)

• randomized Nash equilibrium

Theorem [Nash 1951]: A randomized Nash equilibrium always exists.

• Problem: there are usually many...
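The "double best response" definition can be checked mechanically on small games. The sketch below is illustrative only: it enumerates pure (non-randomized) equilibria of the three 2x2 example games, with each payoff table matched to its name by its standard structure. The function name `pure_nash` and the tuple encoding are assumptions made for this example, not anything from the talk.

```python
def pure_nash(game):
    """Return all pure Nash equilibria of a two-player game.

    game[r][c] is a pair (row player's payoff, column player's payoff).
    A cell is an equilibrium when each player's strategy is a best
    response to the other's ("double best response").
    """
    rows, cols = len(game), len(game[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            # Row player cannot gain by switching rows within column c.
            row_best = all(game[r][c][0] >= game[r2][c][0] for r2 in range(rows))
            # Column player cannot gain by switching columns within row r.
            col_best = all(game[r][c][1] >= game[r][c2][1] for c2 in range(cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


matching_pennies = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
chicken = [[(0, 0), (0, 1)], [(1, 0), (-1, -1)]]
prisoners_dilemma = [[(3, 3), (0, 4)], [(4, 0), (1, 1)]]

print(pure_nash(matching_pennies))   # [] : no pure equilibrium exists
print(pure_nash(chicken))            # [(0, 1), (1, 0)] : more than one
print(pure_nash(prisoners_dilemma))  # [(1, 1)]
```

Matching pennies shows the "may not exist" problem for pure strategies, and chicken shows the "usually many" problem.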

The price of anarchy

price of anarchy = (cost of worst Nash equilibrium) / (“socially optimum” cost) [Koutsoupias and P, 1998]

in network routing: = 2 [Roughgarden and Tardos, 2000, Roughgarden 2002]

mechanism design (or inverse game theory)

• agents have utilities – but these utilities are known only to them

• game designer prefers certain outcomes depending on players’ utilities

• designed game (mechanism) has designer’s goals as dominant strategies

e.g., Vickrey auction

• sealed-highest-bid auction encourages gaming and speculation

• Vickrey auction: Highest bidder wins, pays second-highest bid

Theorem: Vickrey auction is a truthful mechanism.

(Theorem: It maximizes social benefit and auctioneer expected revenue.)
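A minimal sketch of the second-price rule, assuming a single item, sealed bids, and that a bidder who ties the best competing bid loses. The names `vickrey_auction` and `utility` are hypothetical helpers for this illustration. The brute-force loop at the end is a spot check of truthfulness on one instance, not a proof.

```python
def vickrey_auction(bids):
    """bids maps bidder name -> bid. Returns (winner, price paid)."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


def utility(value, bid, other_bids):
    """Utility of a bidder with private `value` who submits `bid`.

    Assumption for this sketch: a tie with the best competing bid loses.
    """
    top_other = max(other_bids)
    return value - top_other if bid > top_other else 0


print(vickrey_auction({"ann": 10, "bob": 7, "cat": 4}))  # ('ann', 7)

# Spot check: with true value 10, no bid in 0..20 beats bidding 10.
others = [7, 4]
truthful = utility(10, 10, others)
assert all(utility(10, b, others) <= truthful for b in range(21))
```

Because the price does not depend on the winner's own bid, shading the bid can only change whether the bidder wins, never how much is paid on winning.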

Vickrey shortest paths

[Figure: a graph from s to t with edge costs 6, 6, 3, 4, 5, 11, 10, 3]

pay each edge e on the shortest path its declared cost c(e), plus a bonus equal to dist(s,t)|c(e)=∞ - dist(s,t)
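The payment rule can be sketched in a few lines. The graph below is a hypothetical five-edge example, not the one drawn on the slide, whose topology is not recoverable from the transcript. Edges are treated as directed, and "c(e) = ∞" is modeled by deleting the edge. The function names are assumptions made for this illustration.

```python
import heapq
import math


def shortest_path(edges, s, t, skip=None):
    """Dijkstra over directed edges {(u, v): cost}; `skip` is an edge to ignore.

    Returns (distance, list of edges on one shortest path).
    """
    adj = {}
    for (u, v), c in edges.items():
        if (u, v) != skip:
            adj.setdefault(u, []).append((v, c))
    best, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        if u == t:
            break
        for v, c in adj.get(u, []):
            if d + c < best.get(v, math.inf):
                best[v], prev[v] = d + c, u
                heapq.heappush(heap, (d + c, v))
    if t not in best:
        return math.inf, []
    path, node = [], t
    while node != s:
        path.append((prev[node], node))
        node = prev[node]
    return best[t], path[::-1]


def vickrey_payments(edges, s, t):
    """Pay each edge on the shortest path c(e) + dist without e - dist."""
    d, path = shortest_path(edges, s, t)
    payments = {}
    for e in path:
        d_without, _ = shortest_path(edges, s, t, skip=e)
        # An edge with no alternative route would get an infinite bonus.
        payments[e] = edges[e] + d_without - d
    return payments


# Hypothetical example graph (not the slide's figure).
edges = {("s", "a"): 3, ("a", "t"): 4, ("s", "b"): 5, ("b", "t"): 6, ("a", "b"): 1}
print(vickrey_payments(edges, "s", "t"))  # {('s', 'a'): 7, ('a', 't'): 7}
```

In this example the shortest path costs 7 but the total payment is 14, which is the kind of overpayment that motivates asking how large Vickrey charges get in practice.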

Problem:

[Figure: a graph from s to t with edge costs 11, 1, 1, 1, 10]

But…

• …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002]

• Could this be the manifestation of rational behavior at network creation?

• [FPSS 2002]: Vickrey charges
  – Depend on origin and destination
  – Can be computed on top of BGP

• [with Mihail and Saberi, 2003]
  – They are small in expectation in random graphs.
  – (Also: Why traffic grows moderately as the Internet grows…)

The web as a graph

cf: [Google 98], [Kleinberg 98]

• how do you sample the web?

[Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000]

• e.g.: 42% of web documents are in html. How do you find that?

• What is a “random” web document?

[Figure: the web as a graph of documents connected by hyperlinks]

Idea: random walk

Problems:

1. asymmetric
2. uneven degree
3. 2nd eigenvalue? λ = 0.99999

The web walker: results

• mixing time is ~ log N / (1 - λ), with λ the 2nd eigenvalue

• WW mixing time: 3,000,000

• actual WW mixing time: 100

• .com 49%, .jp 9%, .edu 7%, .cn 0.8%
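A toy illustration of the random-walk idea, under assumptions that are not stated on the slides. The asymmetry problem is handled by following links in both directions, and the uneven-degree problem by padding every node with self-loop probability so that all nodes have the same total degree. With both fixes the walk's stationary distribution is uniform, so a page reached after mixing is a uniformly "random" document. The five-page link list and the function names are hypothetical.

```python
def walk_matrix(n, links):
    """Transition matrix of a walk on the undirected, self-loop-padded graph."""
    nbrs = [set() for _ in range(n)]
    for u, v in links:
        if u != v:
            nbrs[u].add(v)  # follow the link forwards...
            nbrs[v].add(u)  # ...and backwards, making the graph symmetric
    d = max(len(s) for s in nbrs) + 1  # +1 keeps a self-loop at every node
    P = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in nbrs[u]:
            P[u][v] = 1.0 / d
        P[u][u] = 1.0 - len(nbrs[u]) / d  # leftover mass stays put
    return P


def steps_to_mix(P, start=0, eps=0.01, max_steps=10000):
    """Steps until the walk from `start` is within eps (total variation) of uniform."""
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    for t in range(1, max_steps + 1):
        dist = [sum(dist[u] * P[u][v] for u in range(n)) for v in range(n)]
        if 0.5 * sum(abs(p - 1.0 / n) for p in dist) < eps:
            return t
    return None


# Hypothetical five-page web with directed hyperlinks.
links = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
P = walk_matrix(5, links)
print(steps_to_mix(P))
```

On a real crawl the matrix is never built explicitly; the walker only needs the in-links and out-links of the page it is currently visiting.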

Q: Is the web a random graph?

• Many K3,3’s (“communities”)

• Indegrees/outdegrees obey “power laws”

• Model [Kumar et al, FOCS 2000]: copying

Also the Internet

• [Faloutsos³ 1999] the degrees of the Internet are power law distributed

• Both autonomous systems graph and router graph

• Eigenvalues: ditto!??!

• Model?

The world according to Zipf

• Power laws, Zipf’s law, heavy tails,…

• i-th largest is ~ i^(-a) (cities, words: a = 1, “Zipf’s Law”)

• Equivalently: prob[greater than x] ~ x^(-b)

• (compare with law of large numbers)

• “the signature of human activity”
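The rank form and the tail form of a power law can be compared on an idealized data set. In the sketch below the i-th largest value is exactly c·i^(-a). The number of values exceeding x then scales like x^(-1/a), which suggests b = 1/a under this idealization. The function names and the constant c are assumptions for this example.

```python
def rank_sizes(n, a, c=1000.0):
    """Idealized data set whose i-th largest value is c * i**(-a)."""
    return [c / i ** a for i in range(1, n + 1)]


def count_greater(sizes, x):
    """How many values exceed x (an unnormalized tail probability)."""
    return sum(1 for s in sizes if s > x)


sizes = rank_sizes(1000, a=1.0)
print(count_greater(sizes, 10))   # 99: ranks 1..99 exceed 10
print(count_greater(sizes, 100))  # 9: ranks 1..9 exceed 100
```

Raising the threshold by a factor of 10 cuts the count by roughly a factor of 10, the tail exponent b = 1 that matches a = 1.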

Models

• Size-independent growth (“the rich get richer,” or random walk in log paper)

• Growing number of growing cities

• In the web: copying links [Kumar et al, 2000]

• Carlson and Doyle 1999: Highly optimized tolerance (HOT)

Our model [with Fabrikant and Koutsoupias, 2002]:

node i attaches to the earlier node j that achieves min_{j < i} [ α·d_ij + hop_j ]

Theorem:

• if α < const, then graph is a star

degree = n - 1

• if α > √n, then there is exponential concentration of degrees

prob(degree > x) < exp(-ax)

• otherwise, if const < α < √n, heavy tail:

prob(degree > x) > x^(-b)
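A small simulation sketch of a growth model of this kind, under assumptions that are not spelled out on the slide. Nodes arrive one at a time at uniformly random points of the unit square. The distance d_ij is Euclidean, and hop_j is the number of hops from j to the first node. Each new node i links to the earlier node j that minimizes weight·d_ij + hop_j. The weight is written `alpha` here, and `fkp_tree` is a hypothetical name.

```python
import math
import random


def fkp_tree(n, alpha, seed=0):
    """Grow a tree on n random points; returns (parent list, degree list)."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    parent = [None] * n
    hop = [0] * n      # hops to node 0, the root
    degree = [0] * n
    for i in range(1, n):
        def cost(j):
            d_ij = math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
            return alpha * d_ij + hop[j]
        j = min(range(i), key=cost)  # best earlier node to attach to
        parent[i] = j
        hop[i] = hop[j] + 1
        degree[i] += 1
        degree[j] += 1
    return parent, degree


# Small weight on distance: every node attaches directly to the root.
parent, degree = fkp_tree(200, alpha=0.5)
print(degree[0])  # 199, i.e. a star

# Larger weights trade hops for distance; inspect the degree spread.
_, degree_mid = fkp_tree(200, alpha=4.0)
print(sorted(degree_mid, reverse=True)[:5])
```

In the unit square any distance is at most √2, so with weight 0.5 attaching to the root costs under 1 while any other node already costs at least one hop. That is why the first run is guaranteed to be a star.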

Heuristically optimized tradeoffs

• Also: file sizes (trade-off between communication costs and file overhead)

• Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?)

• cf HOT, [Mandelbrot 1954]

• Other examples?

• General theorem?

PS: eigenvalues

Model: Edge [i,j] has prob. ~ d_i d_j

Theorem [with Mihail, 2002]: If the d_i’s obey a power law, then the n^b largest eigenvalues are almost surely very close to √d_1, √d_2, √d_3, …

(NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent)

Corollary: Spectral methods are of dubious value in the presence of large features
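One way to build intuition for eigenvalues that track degrees is the simplest "large feature", a star. A star with d leaves has largest adjacency eigenvalue exactly √d, so the spectrum reveals only the hub's degree. The sketch below checks this numerically with power iteration. It illustrates that single fact and does not reproduce the theorem or its proof. The helper names are hypothetical.

```python
import math


def star(d):
    """Adjacency lists of a star: node 0 is the hub, nodes 1..d are leaves."""
    return [list(range(1, d + 1))] + [[0] for _ in range(d)]


def top_eigenvalue(adj, iters=200):
    """Largest adjacency eigenvalue via power iteration on A + I.

    The +I shift avoids the oscillation that plain power iteration
    shows on bipartite graphs, whose spectrum is symmetric about 0.
    """
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        w = [v[u] + sum(v[x] for x in adj[u]) for u in range(n)]  # (A + I) v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient with A itself (v has unit length).
    Av = [sum(v[x] for x in adj[u]) for u in range(n)]
    return sum(a * b for a, b in zip(Av, v))


print(round(top_eigenvalue(star(16)), 6))  # 4.0, the square root of the hub degree
```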