Quasi-Random Rumor Spreading
Benjamin Doerr, MPII Saarbrücken
joint work with
Tobias Friedrich, U Berkeley
Anna Huber, MPII Saarbrücken
Thomas Sauerwald, U Berkeley
Marvin Künnemann, U Saarbrücken
Advertisement: Positions at the MPI
5 Postdocs:
– Starting October 2009, deadline: January 31, 2009.
5 PhD students:
– positions filled continuously
All positions
– have generous support (travel, computer, ...)
– have no teaching duties, but teaching is possible
– are in the “Algorithms&Complexity” group (~40 researchers, mainly theory)
Quasi-Random Rumor Spreading
Outline:
– Randomized Rumor Spreading (classical): always contact a random neighbor
– Quasirandom Rumor Spreading (new model): less independent randomness
– Results
Conclusion: dependent random stuff...
– can be analyzed
– works well
Randomized Rumor Spreading
Model (on a graph G):
– Start: One vertex knows a rumor (“is informed”)
– Each round, each informed vertex contacts a neighbor chosen uniformly at random and informs it (if it wasn’t already)
– Problem: How many rounds are necessary to inform all vertices?
Stupid animation: G = K_n, edges not drawn
Round 0: Starting vertex is informed
Round 1: Starting vertex informs a random vertex
Round 2: Each informed vertex informs a random vertex
Round 3: Each informed vertex informs a random vertex
Round 4: Each informed vertex informs a random vertex
Round 5: Let's hope the remaining two get informed...
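The model above can be played through in a few lines. This is only a minimal sketch for the complete graph K_n; the function name `random_push_rounds` and the vertex labels 0..n-1 are assumptions made for illustration, not part of the talk.

```python
import random

def random_push_rounds(n, start=0, rng=None):
    """Rounds until all n vertices of K_n are informed (fully random model).

    Vertices are labelled 0..n-1 (an assumption of this sketch).
    """
    rng = rng or random.Random()
    informed = {start}
    rounds = 0
    while len(informed) < n:
        newly = set()
        for v in informed:
            # Pick a uniformly random neighbour of v in K_n,
            # i.e. any vertex except v itself.
            u = rng.randrange(n - 1)
            if u >= v:
                u += 1
            newly.add(u)
        # Vertices informed in this round start calling in the next round.
        informed |= newly
        rounds += 1
    return rounds
```

Calling `random_push_rounds(1024)` repeatedly gives an impression of how the broadcast time fluctuates from run to run.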
CS application:
– Broadcasting updates in distributed replicated databases: simple, robust, self-organized
Maths non-application: fun to study
Main results [n: number of vertices]:
– Easy: For all graphs and starting vertices, at least log_2(n) rounds are necessary
– Theorem: For the following graph classes, O(log(n)) rounds suffice w.h.p., independent of the starting vertex:
    Complete graphs: K_n = ([n], all 2-element subsets of [n])
    Hypercubes: H_d = ({0,1}^d, “Hamming distance one”)
    Random graphs: G_{n,p}, p ≥ (1+ε) log(n)/n
  For complete graphs, the broadcast time is log_2(n) + ln(n) + o(log(n))
  [Frieze & Grimmett (1985); Feige, Peleg, Raghavan, Upfal (1990)]
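To get a feeling for these bounds, the sketch below evaluates the easy lower bound (the number of informed vertices can at most double per round) and the leading term log_2(n) + ln(n) for the complete graph. The function names are hypothetical, and the o(log(n)) term is simply dropped, so the second value is only indicative for finite n.

```python
import math

def doubling_lower_bound(n):
    """Smallest t with 2**t >= n: the informed set at most doubles per round."""
    return (n - 1).bit_length()

def complete_graph_leading_term(n):
    """log_2(n) + ln(n); the o(log(n)) term of the bound is ignored here."""
    return math.log2(n) + math.log(n)

print(doubling_lower_bound(1024))                   # 10
print(round(complete_graph_leading_term(1024), 2))  # 16.93
```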
Motivation of this Work
Observation:
– “all decisions independent at random” is simple, yet efficient
Question: Can we do better with more clever (randomized) approaches?
– introduce problem-motivated dependencies
– concept of quasirandomness [Jim Propp]: Simulate properties of the random object/process deterministically
Successful applications:
– Quasi-Monte Carlo methods
– Propp machine (quasirandom random walks)
Deterministic Rumor Spreading?
Same model as above, except:
– Each vertex has a list of its neighbors.
– Informed vertices inform their neighbors in the order of this list
Problem: Might take long... [Proof by animation, graph K_n, n = 6]
Vertices: 1 2 3 4 5 6
List of vertex 1: 2 3 4 5 6
List of vertex 2: 3 4 5 6 1
List of vertex 3: 4 5 6 1 2
List of vertex 4: 5 6 1 2 3
List of vertex 5: 6 1 2 3 4
List of vertex 6: 1 2 3 4 5
Here: n − 1 rounds. No hope for cleverness (quasirandomness) here?
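The animation can be replayed in code. The following is a minimal sketch, assuming a connected graph given as a dictionary of neighbor lists; `deterministic_rounds` and `cyclic_lists` are illustrative names, and `cyclic_lists(6)` rebuilds the lists of the example.

```python
def cyclic_lists(n):
    """Vertex v (labelled 1..n) lists v+1, v+2, ..., wrapping around."""
    return {v: [(v - 1 + i) % n + 1 for i in range(1, n)]
            for v in range(1, n + 1)}

def deterministic_rounds(lists, start):
    """Rounds until everyone is informed if each informed vertex walks
    through its list from the front. Assumes a connected graph."""
    informed_at = {start: 0}  # vertex -> round in which it was informed
    rounds = 0
    while len(informed_at) < len(lists):
        rounds += 1
        called = []
        for v, t in informed_at.items():
            k = rounds - t - 1  # number of calls v has made so far
            lst = lists[v]
            called.append(lst[k % len(lst)])
        for u in called:
            informed_at.setdefault(u, rounds)
    return rounds

print(deterministic_rounds(cyclic_lists(6), start=1))  # 5
```

With these lists all informed vertices call the same target in every round, so only one new vertex is informed per round.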
Semi-Deterministic Rumor Spreading
Same model as above, except:
– Each vertex has a list of its neighbors.
– Informed vertices inform their neighbors in the order of this list, but start at a random position in the list
Results: The O(log(n)) bounds for
– complete graphs (including the leading constant),
– hypercubes,
– random graphs G_{n,p}, p ≥ (1+ε) log(n)/n
still hold regardless of the structure of the lists
[Two pieces of good news: (a) results hold, (b) things can be analyzed in spite of dependencies]
Quasirandom Rumor Spreading
Same model as above, except:
– Each vertex has a list of its neighbors.
– Informed vertices inform their neighbors in the order of this list, but start at a random position in the list
Natural property:
– A vertex never informs a neighbor twice (unless it has informed all neighbors)
Algorithmic aspects:
– If results hold for all lists, then lists already present for technical reasons can be used
– Fewer random bits needed
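A minimal sketch of the quasirandom model, assuming a connected graph given as a dictionary that maps each vertex to its neighbor list. The name `quasirandom_rounds` is made up for this example. Each vertex draws exactly one random number, namely its starting position, and afterwards proceeds cyclically through its list.

```python
import random

def quasirandom_rounds(lists, start, rng=None):
    """Rounds until all vertices are informed in the quasirandom model.

    lists: dict vertex -> list of neighbours (arbitrary order).
    Assumes the graph is connected.
    """
    rng = rng or random.Random()
    # position[v]: index in lists[v] of the neighbour v calls next
    position = {start: rng.randrange(len(lists[start]))}
    rounds = 0
    while len(position) < len(lists):
        rounds += 1
        called = []
        for v in position:
            lst = lists[v]
            called.append(lst[position[v]])
            position[v] = (position[v] + 1) % len(lst)  # cyclic order
        for u in called:
            if u not in position:
                # the only random choice u ever makes
                position[u] = rng.randrange(len(lists[u]))
    return rounds
```

On K_n this never needs more than n − 1 rounds, because after n − 1 rounds the starting vertex alone has called every neighbor.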
Intra-Talk Summary
Randomized rumor spreading:
– Informed vertices inform neighbors chosen uniformly at random
Quasirandom rumor spreading:
– Each vertex has an arbitrary list of its neighbors
– Informed vertices inform their neighbors in the order of this list, starting at a random position in the list
– Some nice properties
Remainder of the talk: Results!
– Runtime
– Robustness
– Some proof ideas
Runtime: Proven bounds
“As fast as independent”: The O(log(n)) bounds hold for
– complete graphs (including the leading constant),
– hypercubes,
– random graphs G_{n,p}, p ≥ (1+ε) log(n)/n
“Slightly faster than independent”:
– Random graphs G_{n,p}, p = (log(n)+log(log(n)))/n:
    independent: Θ(log(n)^2) rounds necessary to obtain a success probability of 1 − 1/n
    quasirandom: Θ(log(n)) rounds suffice
– Complete k-regular trees:
    independent: w.h.p. Θ(k log(n)) rounds necessary/sufficient
    quasirandom: w.p. 1, r rounds necessary/sufficient, where r = Θ(k log(n)/log(k))
Runtime: Experimental Results (n = 1024)
Average broadcast times:
Complete graph K_n (lists: neighbors sorted in increasing order)
    Fully random: 18.09 ± 1.74
    Quasirandom:  17.63 ± 1.76
Hypercube H_10 (lists: “inform the neighbor in dimension 1, 2, 3, ...”)
    Fully random: 21.11 ± 1.78
    Quasirandom:  18.71 ± 0.71
Random graphs G_{n,p}, p such that the graph is connected w.p. 1/2 (lists: neighbors sorted in increasing order)
    Fully random: 27.31 ± 50.82
    Quasirandom:  19.48 ± 3.07
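An experiment of this kind could be set up roughly as follows. This is not the code behind the numbers above, only a sketch under stated assumptions: the helper names, the number of runs, the seed, and the use of the population standard deviation are all choices made here, so the output will not match the reported figures exactly.

```python
import random
import statistics

def hypercube_lists(d):
    """Neighbour lists of H_d; entry i flips bit i (dimension order)."""
    return {v: [v ^ (1 << i) for i in range(d)] for v in range(1 << d)}

def broadcast_rounds(lists, start, rng, quasirandom):
    """Rounds until all vertices are informed, in either model."""
    # pos[v]: next list index of v (only used in the quasirandom model)
    pos = {start: rng.randrange(len(lists[start]))}
    rounds = 0
    while len(pos) < len(lists):
        rounds += 1
        called = []
        for v in pos:
            lst = lists[v]
            if quasirandom:
                called.append(lst[pos[v]])
                pos[v] = (pos[v] + 1) % len(lst)
            else:
                called.append(rng.choice(lst))
        for u in called:
            if u not in pos:
                pos[u] = rng.randrange(len(lists[u]))
    return rounds

def compare(lists, runs, seed=0):
    """Mean and standard deviation of the broadcast time for both models."""
    rng = random.Random(seed)
    result = {}
    for name, quasi in (("fully random", False), ("quasirandom", True)):
        times = [broadcast_rounds(lists, 0, rng, quasi) for _ in range(runs)]
        result[name] = (statistics.mean(times), statistics.pstdev(times))
    return result

if __name__ == "__main__":
    for name, (mean, dev) in compare(hypercube_lists(10), runs=100).items():
        print(f"{name}: {mean:.2f} ± {dev:.2f}")
```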
Robustness
Robustness: How well does the protocol work if some transmissions fail?
– Failure model: Each transmission fails independently with probability 1 − p (i.e., succeeds with probability p). The sender does not get to know this.
– Referee question: Could quasirandom be less robust?
– ‘Theorem’ [not yet written up]: W.h.p., both models need time log_2(n)/log_2(1+p) + ln(n)/p + o(log(n)) on the complete graph.
– Experiments: Average broadcast times ± standard deviations for hypercube and complete graph, n = 4096, p = 1/2 [figure not reproduced]
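For orientation, the leading term of the announced bound can be evaluated numerically. The sketch below drops the o(log(n)) term, so the value is only a rough indication for finite n; the function name is an assumption of this example.

```python
import math

def lossy_leading_term(n, p):
    """log_2(n)/log_2(1+p) + ln(n)/p, without the o(log(n)) term.

    p is the probability that a single transmission succeeds.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return math.log2(n) / math.log2(1 + p) + math.log(n) / p

print(round(lossy_leading_term(4096, 0.5), 2))  # 37.15
print(round(lossy_leading_term(4096, 1.0), 2))  # 20.32
```

For p = 1 the expression reduces to log_2(n) + ln(n), the leading term of the failure-free bound for the complete graph.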
Delaying&Ignoring: Some proof ideas...
Proceed in phases of several rounds:
– Assume pessimistically that nodes informed in this phase start rumor spreading only in the next phase (delaying).
– Next phase: Only the nodes newly informed in the last phase spread the rumor (ignore the rest).
– Cool: They still have their independent random choice!
How does it work for the Θ(log(n)) bound for K_n?
– Round 0: Start vertex informed
– 1st phase: log(n) rounds: log(n) newly informed nodes
– 2nd phase: log(n) rounds: Each of the log(n) newly informed nodes informs a random log(n) segment of its list. The segments are chosen independently, hence few overlaps. Result: Θ(log(n)^2) newly informed nodes.
– Phases until 1% informed: 8 rounds per phase. Half of the newly informed inform at least 4 new ones. Result: Twice as many newly informed nodes.
– “Endgame”...
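The "few overlaps" claim of the 2nd phase can be checked empirically in a strongly simplified setting. The sketch assumes that all callers share one cyclic list of all n vertices and ignores vertices that are already informed; neither simplification is part of the actual proof, and the function name is hypothetical.

```python
import random

def distinct_targets(n, callers, seg_len, rng):
    """Number of distinct list positions hit when each caller informs
    seg_len consecutive entries of a cyclic list of length n, starting
    at an independent uniformly random position."""
    hit = set()
    for _ in range(callers):
        start = rng.randrange(n)
        hit.update((start + i) % n for i in range(seg_len))
    return len(hit)

rng = random.Random(0)
n = 1 << 16  # log_2(n) = 16 callers, each with a segment of length 16
print(distinct_targets(n, callers=16, seg_len=16, rng=rng), "of at most", 16 * 16)
```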
Delaying&Ignoring...
Delaying: Delay independent random decisions until you have enough of them
– admits Chernoff bounds
Ignoring: Ignore nasty stuff to make the rest independent.
Problem: To get the leading constant, on average only
– an o(1) fraction of the decisions may be delayed;
– an o(1) fraction of the informed vertices may be ignored.
Solution: Busy phases
– vertices informed in the phase do inform others in this phase
– reduce dependencies by ignoring “overtaking”: If A calls B in the phase (determined by A’s random decision), then we ignore that A might call C and C might call B earlier than A.
– yields only a (1-o(1)) slowdown of the process.
Summary
Results:
– Theory: Guarantee that things work fine for all list structures
    good broadcast times & robustness for many graphs
    better broadcast times for some graphs
– Experiments: The lists we tried yield better results
    reduced broadcast times
    broadcast times more strongly concentrated
– General: No need to be afraid of dependencies!
Outlook:
– Try to “mathematically” see the differences seen in the experiments.
– Open problem: Are some list structures better or worse than others?
Thank you very much!