Milan Vojnović, Microsoft Research
Joint work with Moez Draief, Kyomin Jung, Bo Young Kim, Etienne Perron and Dinkar Vasudevan
Consensus
Lecture series ACiD – Algorithms and Complexity in Durham, 2012
Abstract
In this talk, I will consider the problem of distributed ranking of alternatives in a network of nodes under limited memory per node and limited information communicated between nodes. In particular, for the case of ranking two alternatives, each node in the network is assumed to prefer one of the alternatives, and the goal for each node is to correctly identify which of the two alternatives is preferred by the majority of the nodes. This type of problem has been studied under various names such as consensus, k-selection and quantile computation. The model is an abstraction that underlies various systems, such as ranking of items in distributed peer-to-peer systems and databases, and may also capture the dynamics of opinion formation in social networks.
This Talk Based on
M. Draief and M. V., Convergence Speed of Binary Interval Consensus, SIAM Journal on Control and Optimization, 2012
K. Jung, B. Y. Kim, and M. V., Distributed Ranking in Networks with Limited Memory and Communication, IEEE Int’l Symposium on Information Theory, 2012
E. Perron, D. Vasudevan, and M. V., Using Three States for Binary Consensus on Complete Graphs, IEEE Infocom 2009
Binary Consensus Problem
[Figure: network of nodes, each initially holding state 0 or 1]
Goal: each node wants to correctly decide whether 0 or 1 was initially held by the majority of nodes
Applications (Cont’d)
Ex. Distributed databases: top-k query processing
Query: Is object X most preferred by the majority of nodes?
Questions of Interest
Correctness: probability that each node identifies the initial majority alternative?
Convergence time: time to reach consensus?
Dependence on the number of nodes n and the initial fraction of nodes holding the majority state (voting margin)?
Desiderata
Reach correct consensus – initial majority
Fast convergence
Small communication overhead
Small processing per node
Decentralized
Classical Voter Model
Node takes over the state of the contacted node
Binary state per node & binary signaling
0 initially held by V nodes, 1 initially held by U nodes
Complete graph node interactions
Probability of incorrect consensus: $f_{U,V} = \frac{U}{U+V}$, for $U \le V$
[Figure: nodes holding states 0 and 1]
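A minimal simulation sketch of the voter model on a complete graph, useful for checking that the minority state wins with probability close to U/(U+V). It assumes a discrete-time jump chain (one observer per step) in place of the Poisson clocks, which does not change which state wins. The function names, trial count and seed are illustrative choices, not part of the talk.

```python
import random


def voter_model_winner(n_zero, n_one, rng):
    """Run the classical voter model until consensus; return the winning state."""
    states = [0] * n_zero + [1] * n_one
    n = len(states)
    ones = n_one
    while 0 < ones < n:
        observer = rng.randrange(n)   # node that initiates a contact
        contacted = rng.randrange(n)  # sampled uniformly across all nodes
        if states[observer] != states[contacted]:
            ones += states[contacted] - states[observer]
            states[observer] = states[contacted]  # take over the contacted state
    return 1 if ones == n else 0


def estimate_error(n_minority, n_majority, trials=2000, seed=0):
    """Fraction of runs in which the initial minority state (here: 1) wins."""
    rng = random.Random(seed)
    wins = sum(
        voter_model_winner(n_majority, n_minority, rng) for _ in range(trials)
    )
    return wins / trials


if __name__ == "__main__":
    U, V = 3, 7
    print("simulated:", estimate_error(U, V), "predicted:", U / (U + V))
```

With U = 3 and V = 7 the simulated frequency should land near 0.3, up to sampling noise.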
m-ary Hypothesis Testing
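Q: How many states does S need to decide the correct hypothesis with probability going to 1 as the number of observations grows?
Hypotheses $H_i$: mean in $[a_i, a_{i+1})$, $i = 0, 1, \ldots, m-1$, with $0 = a_0 < a_1 < \cdots < a_m = 1$
[Figure: an i.i.d. binary sequence (e.g. 000110111110100011) with mean determined by $H_i$ is observed by the system S]
A: m+1 states are necessary and sufficient (Koplowitz, IEEE Trans. IT ’75)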
Ternary Protocol
Both processing and signaling take one of three states: 0, 1 or e
e = “indecisive” state
[Figure: state transitions of an observer node among 0, 1 and e, depending on the state of the contacted node]
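A small sketch of how the ternary update could be written as a function of the observer's state and the contacted node's state. The exact rules are an assumption here: a decided node that hears the opposite decided state becomes indecisive, an indecisive node adopts a decided state it hears, and nothing changes otherwise. The name `ternary_update` is hypothetical.

```python
E = "e"  # the "indecisive" state


def ternary_update(observer, contacted):
    """Return the observer's new state after contacting another node.

    Assumed rules:
      - an observer in state e adopts the contacted state (stays e if it hears e);
      - a decided observer (0 or 1) that hears the opposite decided state moves to e;
      - in every other case the observer keeps its state.
    """
    if observer == E:
        return contacted
    if contacted == E or contacted == observer:
        return observer
    return E


if __name__ == "__main__":
    for obs in (0, 1, E):
        for con in (0, 1, E):
            print(obs, "contacts", con, "->", ternary_update(obs, con))
```

Under these rules a node never jumps directly from 0 to 1 or back; it always passes through e first.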
Binary Signaling
Processing same as for ternary protocol
Binary signaling – takes one of two states, 0 or 1
A node in state e signals 0 or 1 with equal probability
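A sketch of the binary-signaling variant: the node keeps three processing states, but what it sends is always 0 or 1. The processing rule is assumed to be: an indecisive observer adopts the signal it hears, and a decided observer that hears the opposite signal becomes indecisive. The names `signal` and `binary_protocol_update` are hypothetical.

```python
import random

E = "e"  # the "indecisive" processing state


def signal(state, rng):
    """Binary signal sent by a contacted node; e signals 0 or 1 with equal probability."""
    if state == E:
        return rng.randrange(2)
    return state


def binary_protocol_update(observer, contacted, rng):
    """Observer's new state after hearing the contacted node's binary signal."""
    heard = signal(contacted, rng)  # always 0 or 1
    if observer == E:
        return heard                # indecisive node adopts what it hears
    return observer if heard == observer else E


if __name__ == "__main__":
    rng = random.Random(0)
    print([binary_protocol_update(0, E, rng) for _ in range(10)])
```

The only change relative to the three-state signaling sketch is that a contacted node in state e can now push a decided observer into e, or pull an indecisive observer to either side.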
Binary Signaling – A Motivation
Nodes may not be able to signal indifference – by the very nature of the application
Ex. two news pieces may be equally most read but only one can be recommended to the user
[Figure: news recommendation list with headlines such as “US navy ship stems into port where Russian...” and “Soldier forced to sleep in car after hotel...”]
Assumptions
Complete graph node interactions
Each node samples a node uniformly at random across all nodes at instances of a Poisson process with intensity 1
Summary of Results
Ternary protocol:
Prob. of error decays exponentially with the number of nodes n – found exact exponent
log n convergence time
Binary protocol:
Prob. of error worse than for ternary protocol by a factor exponentially increasing with n, but not worse than for classical voter
Convergence time C log n with 2 ≤ C ≤ 3
Ternary Protocol - Dynamics
U = number of nodes in state 0
V = number of nodes in state 1
n = total number of nodes
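(U, V) Markov process, with transition rates from state (U, V):
$$
(U, V) \to
\begin{cases}
(U, V-1): & \frac{UV}{n} \\
(U, V+1): & \frac{(n-U-V)\,V}{n} \\
(U-1, V): & \frac{VU}{n} \\
(U+1, V): & \frac{(n-U-V)\,U}{n}
\end{cases}
$$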
Ternary Protocol - Probability of Error
Theorem – probability of error:
$$
f_{U,V} = \sum_{j=1}^{U} a_{U,V}(j)\, \frac{1}{2^{(U-j)+(V-j)+1}}
$$
where
$$
a_{U,V}(j) = \frac{V-U}{(U-j)+(V-j)} \binom{(U-j)+(V-j)}{U-j}
$$
(U, V) = initial point, V > U
Proof Outline
First-step analysis:
$$
(aU + aV + 2UV)\, f_{U,V} = aU f_{U+1,V} + UV f_{U-1,V} + aV f_{U,V+1} + UV f_{U,V-1}
$$
with $a = n - U - V$
Boundary conditions: $f_{0,V} = 0$ for $V > 0$, $f_{U,0} = 1$ for $U > 0$
Proof Outline (Cont’d)
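Lemma – $f_{U,V}$ is the solution of
$$
f_{U,V} = \frac{1}{2} f_{U-1,V} + \frac{1}{2} f_{U,V-1}
$$
Boundary conditions: $f_{0,V} = 0$ for $V > 0$, $f_{U,0} = 1$ for $U > 0$
i.e. $f_{U,V}$ is the error probability of
$$
(U, V) \to
\begin{cases}
(U-1, V): & \frac{1}{2} \mathbf{1}_{\{U > 0\}} \\
(U, V-1): & \frac{1}{2} \mathbf{1}_{\{V > 0\}}
\end{cases}
$$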
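Proof Outline (Cont’d)
[Figure: lattice paths from (U, V) to points (j, j) on the line U = V, where $f_{U,U} = 1/2$]
$$
f_{U,V} = \sum_{j=1}^{U} n_j\, \frac{1}{2^{(U-j)+(V-j)+1}}
$$
$n_j$ = number of paths from (U, V) to (j, j) that do not intersect the line U = V – Ballot theorem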
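The path-counting argument can be checked numerically. The sketch below assumes the error probability satisfies the half-half recursion f(U, V) = f(U-1, V)/2 + f(U, V-1)/2 with f(0, V) = 0 and f(U, 0) = 1, and compares it with a ballot-type sum in which the number of paths from (U, V) that first reach the diagonal at (j, j) is (V-U)/N * C(N, U-j), N = (U-j)+(V-j). Both the recursion and the path count are this sketch's reading of the slides; the function names are hypothetical. Exact rational arithmetic is used so the two can be compared with `==`.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def f_recursive(U, V):
    """Error probability from the half-half recursion (call with U + V > 0)."""
    if U == 0:
        return Fraction(0)  # state 0 died out first: correct outcome when V > U
    if V == 0:
        return Fraction(1)  # state 1 died out first: error
    return (f_recursive(U - 1, V) + f_recursive(U, V - 1)) / 2


def first_hit_paths(U, V, j):
    """Number of paths from (U, V) that first touch the diagonal at (j, j)."""
    steps = (U - j) + (V - j)
    return Fraction(V - U, steps) * comb(steps, U - j)


def f_ballot_sum(U, V):
    """Error probability as a sum over the first diagonal point (j, j); needs V > U."""
    total = Fraction(0)
    for j in range(1, U + 1):
        steps = (U - j) + (V - j)
        # each path has probability 2**-steps; from (j, j) the error probability is 1/2
        total += first_hit_paths(U, V, j) * Fraction(1, 2 ** (steps + 1))
    return total


if __name__ == "__main__":
    for U, V in [(1, 2), (2, 3), (3, 7), (10, 20)]:
        print((U, V), f_ballot_sum(U, V), float(f_recursive(U, V)))
```

For example, (U, V) = (2, 3) gives 5/16 from both computations.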
Probability of Error (Cont’d)
Corollary – For $(U(0), V(0))/n \to (1-\alpha, \alpha)$, $1/2 < \alpha \le 1$:
$$
\frac{1}{n} \log(f_{U,V}) \sim -D(\alpha \,\|\, 1/2), \quad n \text{ large}
$$
Ob. Exponential decay for large n
Convergence Time Lower Bound
Lower bound:
Example: path – reduction to classical voter model
[Figure: path of nodes 1 1 1 1 . . . 0 0 0 0, with the two blocks labeled U and V]
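Convergence Time Lower … (cont’d)
Ternary protocol on a path corresponds to classical voter model dynamics
[Figure: on the path 1 1 1 0 0 0, the boundary between the 1s and the 0s moves through an intermediate e state; transitions are labeled with probability 1/2]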
Binary Protocol – Reminder
Processing same as for ternary protocol
Binary signaling – takes one of two states, 0 or 1
A node in state e signals 0 or 1 with equal probability
Binary Protocol – Dynamics
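(U, V) Markov process, with transition rates from state (U, V):
$$
(U, V) \to
\begin{cases}
(U, V-1): & \frac{1}{2}\, V \left(1 + \frac{U-V}{n}\right) \\
(U, V+1): & \frac{1}{2}\, (n-U-V) \left(1 + \frac{V-U}{n}\right) \\
(U-1, V): & \frac{1}{2}\, U \left(1 + \frac{V-U}{n}\right) \\
(U+1, V): & \frac{1}{2}\, (n-U-V) \left(1 + \frac{U-V}{n}\right)
\end{cases}
$$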
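A short consistency check on the binary-protocol dynamics, written as a sketch. It assumes the transition rates follow from the protocol description: a node hears signal 1 with probability (V + (n-U-V)/2)/n and signal 0 with probability (U + (n-U-V)/2)/n. It then verifies that the per-node drift equals the mean-field right-hand side (1/2)[(1-v)^2 - u(1+v)] for u = U/n (and symmetrically for v = V/n). The function names are hypothetical.

```python
from fractions import Fraction


def binary_rates(U, V, n):
    """Assumed transition rates of (U, V) under binary signaling."""
    e = n - U - V  # number of nodes in the indecisive state
    half = Fraction(1, 2)
    return {
        (U, V - 1): half * V * (1 + Fraction(U - V, n)),  # a 1-node hears 0
        (U, V + 1): half * e * (1 + Fraction(V - U, n)),  # an e-node hears 1
        (U - 1, V): half * U * (1 + Fraction(V - U, n)),  # a 0-node hears 1
        (U + 1, V): half * e * (1 + Fraction(U - V, n)),  # an e-node hears 0
    }


def drift(U, V, n):
    """Expected rate of change of (U/n, V/n)."""
    r = binary_rates(U, V, n)
    dU = r[(U + 1, V)] - r[(U - 1, V)]
    dV = r[(U, V + 1)] - r[(U, V - 1)]
    return dU / n, dV / n


def mean_field_rhs(u, v):
    """Right-hand side of the many-nodes limit for (u, v)."""
    return ((1 - v) ** 2 - u * (1 + v)) / 2, ((1 - u) ** 2 - v * (1 + u)) / 2


if __name__ == "__main__":
    n, U, V = 20, 6, 9
    print(drift(U, V, n), mean_field_rhs(Fraction(U, n), Fraction(V, n)))
```

Because the arithmetic is exact, the drift and the mean-field expression agree term by term rather than only approximately.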
Probability of Error – Binary Signaling
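Theorem – $f_{U,V} = p_{V-U}$, where $p_{V-U}$ is a closed-form expression in n and V − U [formula not recoverable from the slide text]
Corollary – for large n: asymptotic expression for $\frac{1}{n} \log(p_{V-U})$ [not recoverable from the slide text]
Probability of Error (Cont’d)
Ob. Worse than under ternary protocol by a factor exponentially increasing with n
But …
Theorem – $f_{U,V} \le \frac{U}{U+V}$, for $U \le V$
– Not worse than classical voter model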
Binary Protocol: Many-Nodes Limit
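The limit ODE:
$$
\frac{d}{dt} u(t) = \frac{1}{2}\left[(1 - v(t))^2 - (1 + v(t))\, u(t)\right]
$$
$$
\frac{d}{dt} v(t) = \frac{1}{2}\left[(1 - u(t))^2 - (1 + u(t))\, v(t)\right]
$$
For z = u + v and w = v – u, we have
$$
\frac{d}{dt} z(t) = 1 - \frac{3}{2} z(t) + \frac{1}{2} w(t)^2
$$
$$
\frac{d}{dt} w(t) = \frac{1}{2} (1 - z(t))\, w(t)
$$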
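A minimal numerical sketch of the many-nodes limit, assuming the ODE reads du/dt = (1/2)[(1-v)^2 - (1+v)u] and dv/dt = (1/2)[(1-u)^2 - (1+u)v], with u and v the fractions of nodes in states 0 and 1. It uses a plain forward-Euler step; the step size, horizon and function names are illustrative choices.

```python
def ode_rhs(u, v):
    """Right-hand side of the assumed limit ODE for (u, v)."""
    du = 0.5 * ((1 - v) ** 2 - (1 + v) * u)
    dv = 0.5 * ((1 - u) ** 2 - (1 + u) * v)
    return du, dv


def integrate(u0, v0, t_end=60.0, dt=0.01):
    """Forward-Euler integration from (u0, v0) up to time t_end."""
    u, v = u0, v0
    for _ in range(int(round(t_end / dt))):
        du, dv = ode_rhs(u, v)
        u, v = u + dt * du, v + dt * dv
    return u, v


if __name__ == "__main__":
    # Start with a 60/40 split in favor of state 1 and no indecisive nodes.
    u, v = integrate(0.4, 0.6)
    print("u =", u, "v =", v)
```

Starting from any split with v > u, the trajectory should approach (u, v) = (0, 1), the fixed point with z = 1 and w = 1.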
Convergence Time
Theorem – Convergence time:
$$
2 \log(n) + A \le t(n) \le 3 \log(n) + B, \quad \text{for large } n
$$
A, B = constants independent of n
Slower than ternary signaling by at least factor 2
Not slower than factor 3
Proof Basic Steps
in this set in a finite time independent of
Asserted bounds follow by ODE comparisons
Extension to Plurality Protocol
m alternatives
Binary consensus as a special case: m = 2
Goal: each node to correctly identify an alternative that is initially a plurality winner
Plurality Protocol
For each alternative two states: strong and weak
At each communication instance between two nodes:
If the observer node is in strong state j and the contacted node is in a different strong state, then the observer node switches to weak state j
If the observer node is in weak state j, it switches to the state of the contacted node
O(log m) bits of memory per node and communication between nodes
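The two update rules can be sketched as a single function. States are modeled here as pairs (alternative, strength), which is a representation chosen for this example; the rules are read literally, so a weak observer copies the contacted node's full state, strength included. The name `plurality_update` is hypothetical.

```python
STRONG, WEAK = "strong", "weak"


def plurality_update(observer, contacted):
    """Return the observer's new (alternative, strength) state.

    - strong j meeting a different strong state -> weak j
    - weak j -> the state of the contacted node
    - otherwise the observer keeps its state
    """
    alt, strength = observer
    contacted_alt, contacted_strength = contacted
    if strength == STRONG:
        if contacted_strength == STRONG and contacted_alt != alt:
            return (alt, WEAK)
        return observer
    return contacted


if __name__ == "__main__":
    print(plurality_update((1, STRONG), (2, STRONG)))  # (1, 'weak')
    print(plurality_update((1, WEAK), (2, STRONG)))    # (2, 'strong')
```

With m alternatives this representation has 2m distinct states.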
Plurality Protocol (cont’d)
m alternatives
2m states: weak and strong
[Figure: alternatives 1, 2, …, m, each with a strong state s and a weak state s’; arrows show the observer's transitions]
Convergence Time Upper Bound
Linear in the number of alternatives
Logarithmic in the voting margin
Ternary Protocol Can Fail
[Figure: example network with nodes in states 0, 1 and e]
Complete graph with asymmetric communication rates
Two node types:
Light – small interaction rate
Heavy – large interaction rate
Q: Can initial minority prevail?
Initial Minority Can Prevail
Example:
Node types: 0.2 light, 0.8 heavy
Interaction rates: 0.1 light, 2 heavy
Fraction of nodes by type and state:
         U      V
Light    0.1    0.05
Heavy    0.35   0.45
Total    0.45   0.5
V state nodes (initial majority)
Quaternary Protocol
Four states: 0, e0, e1, 1
Update rules: swap or annihilate
[Figure: pairwise update rules among the states 0, e0, e1 and 1]
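A simulation sketch of a four-state protocol with swap and annihilate rules on a complete graph. The specific rule table below is an assumption, taken from the usual formulation of binary interval consensus in the literature rather than from the slide itself, and the names `RULES`, `leans_to` and `run` are hypothetical. Under these rules the difference between the number of nodes in state 0 and in state 1 never changes, which is what ties the final outcome to the initial majority.

```python
import random

ORDER = {"0": 0, "e0": 1, "e1": 2, "1": 3}

# Assumed pairwise rules, keyed by the pair sorted as 0 < e0 < e1 < 1.
RULES = {
    ("0", "1"): ("e1", "e0"),    # opposite decided states cancel out
    ("0", "e1"): ("e0", "0"),
    ("e0", "1"): ("1", "e1"),
    ("0", "e0"): ("e0", "0"),    # swap
    ("e0", "e1"): ("e1", "e0"),  # swap
    ("e1", "1"): ("1", "e1"),    # swap
}


def leans_to(state):
    """Alternative that a state currently points to: 0 for {0, e0}, 1 for {e1, 1}."""
    return 0 if state in ("0", "e0") else 1


def run(states, rng, max_steps=200_000):
    """Apply random pairwise interactions until all nodes lean the same way."""
    states = list(states)
    n = len(states)
    for _ in range(max_steps):
        if len({leans_to(s) for s in states}) == 1:
            break
        i, j = rng.sample(range(n), 2)
        if states[i] == states[j]:
            continue
        if ORDER[states[i]] > ORDER[states[j]]:
            i, j = j, i
        states[i], states[j] = RULES[(states[i], states[j])]
    return states


if __name__ == "__main__":
    final = run(["0"] * 5 + ["1"] * 4, random.Random(0))
    print(final)
```

Starting from five 0s and four 1s, every node should end in state 0 or e0, with exactly one node still in state 0.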
Convergence
For any given connected graph, the binary interval consensus converges to the correct state with probability 1. [Benezit et al, 2010]
Convergence (cont’d)
Each edge activated at instances of a Poisson point process of intensity
Let for every nonempty set of nodes matrix :
Phase 1 dynamics (cont’d)
Dynamics:
Sk = set of nodes in state 0
• The result follows by using a “spectral bound” on the expected number of nodes in state 1
Complete Graph
Each edge activated at rate 1/(n-1)
• Inversely proportional to the voting margin
• Can be made arbitrarily large!
Complete Graph (cont’d)
• The general bound is tight
• 0 and 1 state nodes annihilate after a random time that has exponential distribution with parameter cut
Erdos-Renyi Graphs (cont’d)
For sufficiently large expected degree, the bound is approximately the same as for the complete graph, as intuition would suggest
Conclusion
The ternary protocol has appealing properties for complete graphs:
Exponentially decreasing probability of error with n
Logarithmic convergence time in n
The quaternary protocol:
Guarantees convergence to the correct state with probability 1
A tight bound on the expected convergence time was provided
The bound was instantiated for particular graphs, including complete graph, path, cycle, star-shaped and Erdos-Renyi
Critical parameters: the number of nodes and voting margin
Open Problems
State of the Art: consider an algorithm and then analyze the probability of error and convergence time
Suggests a trade-off between accuracy and speed
Q: upper and lower bounds for the expected convergence time, for some classes of input graphs, subject to a bound on the probability of error?
Q: accuracy and speed vs. memory and communication constraints?
Some References
S. Shang, P. W. Cuff, S. R. Kulkarni and P. Hui, An Upper Bound on the Convergence Time for Distributed Binary Consensus, 15th Int’l Conf. on Information Fusion, 2012
M. A. Abdullah and M. Draief, Majority Consensus on Random Graphs of a Given Degree Sequence, ArXiv, 2012
E. Mossel, J. Neeman and O. Tamuz, Majority Dynamics and Aggregation of Information in Social Networks, 2012
F. Chierichetti and J. Kleinberg, Voting with Limited Information and Many Alternatives, ACM SODA 2012
F. Benezit, P. Thiran and M. Vetterli, The Distributed Multiple Voting Problem, IEEE Journal on Selected Topics in Signal Processing, Vol 5, No. 4, 2011
Some References (cont’d)
J. Cruise and A. Ganesh, Probabilistic Consensus via Polling and Majority Rules, Proc. of Allerton Conference, 2010
D. Acemoglu, M. A. Dahleh, I. Lobel and A. Ozdaglar, Bayesian Learning in Social Networks, forthcoming Review of Economic Studies, 2011
F. Benezit, P. Thiran and M. Vetterli, Interval Consensus: From Quantized Gossip to Voting, IEEE Int’l Conf. on Acoustics, Speech, and Signal Processing, 2009
A. Nedic, A. Olshevsky, A. Ozdaglar and J. N. Tsitsiklis, Distributed Averaging Algorithms and Quantization Effects, IEEE Conf. on Decision and Control, 2008
Some References (cont’d)
W. P. Tay, J. N. Tsitsiklis and M. Z. Win, On the Subexponential Decay of Detection Error Probabilities in Long Tandems, IEEE Trans. on Information Theory, Vol 54, No 10, 2008
F. Kuhn, T. Locher, R. Wattenhofer, Tight Bounds for Distributed Selection, ACM SPAA 2007
S. Boyd, A. Ghosh, B. Prabhakar and D. Shah, Randomized gossip algorithms, IEEE Trans. on Information Theory, Vol 52, No 6, 2006
T. M. Liggett, Interacting Particle Systems, Springer, 2006
M. Greenwald and S. Khanna, Power-conserving Computation of Order-Statistics over Sensor Networks, ACM PODS 2004
Some References (cont’d)
D. Kempe, J. Kleinberg and E. Tardos, Maximizing Influence through a Social Network, ACM KDD 2003
Y. Hassin and D. Peleg, Distributed Probabilistic Polling and Applications to Proportionate Agreement, Information and Computation, 171, 2001
M. Greenwald and S. Khanna, Space-efficient Online Computation of Quantile Summaries, ACM SIGMOD 2001
J. Koplowitz, Necessary and Sufficient Memory Size for m-hypothesis Testing, IEEE Trans. on Information Theory, Vol 21, No 1, 1975