2007/1/15 http://www.logos.ic.i.u-tokyo.ac.jp/~std/lpbcast.ppt 1

Lightweight Probabilistic Broadcast

M2 Tatsuya Shirai

M1 Dai Saito

Broadcast in Large Scale Environment

• End users send messages to all other users more frequently.
  – P2P BBS
  – Stock markets
• These applications need software broadcast.
• Participating processes change more dynamically than processes on servers:
  – machine crashes
  – logins to and logouts from applications

Deterministic Broadcast

• Each process forwards messages along predefined routes.
• This approach provides a consistent message delivery order.
  – Messages from each process arrive in the order in which it sends them.
• Reliability is expressed as “best effort”.

Deterministic Broadcast cont.

• Poor scalability
  – Single point of failure
  – Cost of maintaining routing information
• Low reliability in unstable networks
  – Perturbation of a few processes lowers the performance of healthy processes.

[Figure: x-axis: rate of perturbed processes]

Probabilistic Broadcast

• Each process transfers messages to randomly selected processes without using defined routing information.

• Approximate redundancy enhances reliability.
• Reliability is relatively high and stable in large-scale and unstable environments.
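The random selection step can be sketched as follows. This is a minimal illustration, not code from the presented papers; the function name `pick_targets` and its parameters are hypothetical.

```python
import random

def pick_targets(members, self_id, fanout, rng=random):
    """Return up to `fanout` distinct peers chosen uniformly at random.

    No routing table is consulted: every known member other than the
    caller is an equally likely target.
    """
    peers = [m for m in members if m != self_id]
    return rng.sample(peers, min(fanout, len(peers)))

# Process 3 picks 4 random targets out of 10 members.
print(pick_targets(range(10), self_id=3, fanout=4))
```

Because each process draws its targets independently, several processes may pick the same target; this overlap is the redundancy the slide refers to.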

Pbcast [Birman et al. 1999]

• This approach uses deterministic and probabilistic broadcast concurrently.
  – While network load is low, deterministic broadcast achieves high reliability at low cost.
  – While network load is high, probabilistic broadcast ensures a certain level of reliability, especially for healthy processes.

Deterministic Broadcast

• The first protocol is deterministic broadcast.
• It uses IP multicast or, if that is not available, randomly composed spanning trees.
  – However, composing spanning trees requires information about the full membership, so this approach is limited to a few hundred processes, as mentioned in the paper.

Anti-Entropy Protocol

• The second protocol is an anti-entropy protocol based on gossip.
  – In each round, members randomly choose some other members and send them a digest summarizing their message history.
  – A process that receives a digest checks which messages it lacks and requests the missing messages from the original sender.

[Figure: two processes, each holding a message history and membership information, exchange digests. One finds it lacks messages 5 and 8, the other that it lacks 3 and 9; each requests the missing IDs and receives the corresponding messages.]
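The digest comparison can be sketched as a set difference. This is a minimal sketch: the function names are hypothetical, and the message IDs 1 and 2 are made-up filler; only the gaps (5, 8 and 3, 9) come from the slide.

```python
def digest(history):
    """Summarize a message history as the set of message IDs it holds."""
    return set(history)

def missing_ids(own_history, received_digest):
    """IDs advertised in the digest that this process has not received."""
    return sorted(received_digest - set(own_history))

# Histories map message ID -> payload.
p = {1: "m1", 2: "m2", 3: "m3", 9: "m9"}   # lacks 5 and 8
q = {1: "m1", 2: "m2", 5: "m5", 8: "m8"}   # lacks 3 and 9

print(missing_ids(p, digest(q)))  # [5, 8]
print(missing_ids(q, digest(p)))  # [3, 9]
```

Only IDs travel in the digest; the payloads are sent afterwards, and only for the IDs that were requested.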

Anti-Entropy Protocol cont.

• Message size and fanout (the number of processes to which a process sends in one round) determine the network load of this protocol.
• Message size is limited by the message lifetime on each process.
  – A process gossips a message for a fixed number of rounds after its initial reception.
  – After that, the message is given up.

[Figure: message buffers holding message IDs (for example 1 5 8 6 2 4), with newer IDs such as 3, 7, and 9 arriving over successive rounds.]
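The lifetime rule can be sketched with a per-message countdown. This is an illustrative sketch only; the class name `GossipBuffer` and the choice of counting remaining rounds are assumptions, not details taken from the pbcast paper.

```python
class GossipBuffer:
    """Holds message IDs and stops gossiping each one after `lifetime` rounds."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.rounds_left = {}  # message ID -> remaining gossip rounds

    def receive(self, msg_id):
        # Only the initial reception starts the countdown.
        self.rounds_left.setdefault(msg_id, self.lifetime)

    def next_round(self):
        """Return the IDs to advertise this round, then age every entry."""
        advertised = sorted(self.rounds_left)
        self.rounds_left = {
            m: r - 1 for m, r in self.rounds_left.items() if r > 1
        }
        return advertised

buf = GossipBuffer(lifetime=2)
buf.receive(5)
print(buf.next_round())  # [5]
buf.receive(9)
print(buf.next_round())  # [5, 9]
print(buf.next_round())  # [9]
```

Since every message leaves the buffer after a fixed number of rounds, the digest size stays bounded by the message rate times the lifetime.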

Flow Control

• Flow control applies while the network load is high.
  – The rate of pbcast messages should be limited.
    • Normally every 100 ms.
  – Retransmission should be delayed by some rounds if many other processes request it.

[Figure: digests are exchanged; the requests for 3, 9 and for 5, 8 are followed by retransmission of messages 3 and 9 and of messages 5 and 8.]

Evaluations

• Parameters:
  – Message loss rate
  – Fanout, the number of processes
• Reliability:
  – (infected processes − failed processes) > all processes / 2
    • for applications based on quorum replication algorithms
• Throughput:
  – The number of messages a process receives in one second.
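The reliability predicate is a strict-majority test and can be written out directly. The function name and the sample numbers below are illustrative assumptions; the inequality itself is the one on the slide.

```python
def majority_reached(infected, failed, total):
    """True if infected-but-not-failed processes form a strict majority."""
    return (infected - failed) > total / 2

# With 50 processes, 30 infected of which 3 failed: 27 > 25.
print(majority_reached(infected=30, failed=3, total=50))  # True
# With 27 infected of which 3 failed: 24 > 25 does not hold.
print(majority_reached(infected=27, failed=3, total=50))  # False
```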

Effects of Fanout

• The plot shows predicate I for pbcast.
  – Message loss rate is 0.05.
  – Deterministic broadcast reaches 10% of the processes.
  – 50 processes participate.
• The probability of failure decreases as the fanout increases up to 8.

[Figure: x-axis: fanout (0 to 10)]

Scalability

• The plot shows predicate I for pbcast.
  – Message loss rate is 0.05.
  – Deterministic broadcast reaches 10% of the processes.
• The probability of failure decreases at larger scale.
  – However, broadcasting to all processes takes more rounds.

[Figure: x-axis: processes (0 to 60)]

Time for broadcast to all processes

• At 1024 processes, messages are received in 12 rounds on average and in fewer than 20 rounds.
  – Fanout is 1.
  – Deterministic broadcast is not used.
• This result shows that the mean grows as O(log N).

[Figure: x-axis: rounds (0 to 20); series labeled 16, 32, and 1024 processes]
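The logarithmic growth can be explored with a small simulation. Note the assumptions: this sketch uses plain push gossip without message loss, not the digest-based anti-entropy exchange of pbcast, so its round counts are not expected to match the slide's figures. The function name and the fixed seed are hypothetical.

```python
import math
import random

def rounds_to_reach_all(n, fanout=1, seed=0):
    """Simulate push gossip among n processes; return rounds until all are infected."""
    rng = random.Random(seed)
    infected = {0}                      # process 0 is the original sender
    rounds = 0
    while len(infected) < n:
        newly = set()
        for p in infected:
            # Sample from n - 1 slots and shift past p, so p never picks itself.
            targets = rng.sample(range(n - 1), min(fanout, n - 1))
            newly.update(t + 1 if t >= p else t for t in targets)
        infected |= newly
        rounds += 1
    return rounds

# Compare the simulated rounds with log2(n) for the sizes named on the slide.
for n in (16, 32, 1024):
    print(n, rounds_to_reach_all(n), math.ceil(math.log2(n)))
```

With fanout 1 the infected set can at most double per round, so log2(N) is a lower bound on the number of rounds.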

Throughput

• 150 messages are sent per second.
  – When message loss happens frequently, the fanout is limited to a small value.
• Throughput of perturbed processes decreases, but healthy processes still achieve full throughput.

[Figure: throughput against the rate of perturbed processes, for deterministic broadcast and pbcast]

Throughput cont.

• Throughput at 200 msg/sec.
  – 25% of the processes are perturbed 25% of the time.
  – Deterministic broadcast is not used.
• A high frequency of packet loss lowers throughput.
• In this case, average throughput decreases to 60% at 96 processes at high bandwidth.

[Figure: x-axis: loss rate (0 to 0.2); 32 to 96 processes]

Conclusion of pbcast

• A gossip-based protocol achieves scalability and reliability in general network environments.
• However, the cost to each process is not considered. The next topic is memory management for pbcast.

Membership Management

• Assumption
  – Each process knows all members.
    • Memory consumption at large scale
    • Communication required to keep the membership consistent
  – Scalability problems in large-scale environments

Membership Management of lpbcast

• Membership management + gossip
  – Each process knows a subset of all members.
  – Messages are sent together with member information.
  – The membership management buffer has a size limit.
    • Fixed memory consumption

Memory Management

• The memory requirement of a process should not change (at large scale).
  – Membership management buffer
  – Outgoing message buffer
  → Scalability
• pbcast from the viewpoint of “memory consumption”

lpbcast algorithm

• Assumptions
  – Each process has a unique ID.
  – Each message has a unique ID (including the process ID).
  – Joining/leaving (= subscribing/unsubscribing)
• Buffers
  – Events: event notifications
  – EventIDs: event IDs
  – Subs: subscription information
  – unSubs: unsubscription information
  – View: targets of gossip messages
• Size limit on all buffers
  – Especially Events and Subs
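The five buffers and their size limits can be sketched as a small data structure. The field names follow the slide; the class name, the limit values, and the `truncate` helper are assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field

@dataclass
class LpbcastState:
    """Per-process buffers named after the slide; the limits are assumed values."""
    max_view: int = 8
    max_subs: int = 8
    max_events: int = 30
    events: dict = field(default_factory=dict)    # event ID -> notification
    event_ids: set = field(default_factory=set)   # IDs of events seen so far
    subs: set = field(default_factory=set)        # subscriptions to forward
    unsubs: set = field(default_factory=set)      # unsubscriptions to forward
    view: set = field(default_factory=set)        # gossip targets

def truncate(buffer, limit, rng=random):
    """Remove random entries until `buffer` holds at most `limit` items."""
    removed = []
    while len(buffer) > limit:
        victim = rng.choice(sorted(buffer))
        removed.append(victim)
        if isinstance(buffer, dict):
            del buffer[victim]
        else:
            buffer.remove(victim)
    return removed

state = LpbcastState()
state.view.update(range(20))
truncate(state.view, state.max_view)
print(len(state.view))  # 8
```

Because every buffer is capped, the memory used by a process does not grow with the number of members.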

sending

• lpbcast(e)
  – Add e to Events.
• Periodic gossip
  – Send the buffers to a subset of View (every 50 ms).

[Figure: event e is added to Events; a gossip message Mes carrying Events, EventIDs, Subs, and unSubs is sent to processes in View.]
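The sending side can be sketched as two functions. The state layout (a plain dict) and the function names are hypothetical; adding the sender's own ID to the outgoing Subs and clearing Events after sending are assumptions of this sketch that the slide does not state.

```python
import random

def lpbcast(state, event_id, payload):
    """Publish an event: it is only buffered here and leaves with the next gossip."""
    state["events"][event_id] = payload

def gossip_round(state, fanout, rng=random):
    """Build one gossip message and pick the processes in View that receive it."""
    message = {
        "events": dict(state["events"]),
        "event_ids": set(state["event_ids"]),
        "subs": set(state["subs"]) | {state["id"]},   # assumption: advertise self
        "unsubs": set(state["unsubs"]),
    }
    view = sorted(state["view"])
    targets = rng.sample(view, min(fanout, len(view)))
    state["events"].clear()                           # assumption: send once
    return message, targets

state = {"id": "p1", "events": {}, "event_ids": set(),
         "subs": set(), "unsubs": set(), "view": {"p2", "p3", "p4"}}
lpbcast(state, ("p1", 1), "hello")
message, targets = gossip_round(state, fanout=2)
print(sorted(message), len(targets))
```

In a running process `gossip_round` would be driven by a timer (every 50 ms on the slide) and the message sent to each target over the network.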

receiving

• When receiving gossip…
  – Membership management
    • Add Mes.unSubs to unSubs, and remove Mes.unSubs from View and Subs.
    • Add Mes.Subs to View and Subs.
    • If View is too large, move some items to Subs at random.

[Figure: Mes.unSubs and Mes.Subs are applied to View and Subs.]
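The membership steps can be sketched as one update function. The dict-based state and the function name are hypothetical, and skipping the receiver's own ID when adding subscriptions is an assumption of this sketch.

```python
import random

def apply_membership(state, message, max_view, rng=random):
    """Update View, Subs, and unSubs from one received gossip message."""
    # Unsubscriptions: remember them and forget the leaving processes.
    state["unsubs"] |= message["unsubs"]
    state["view"] -= message["unsubs"]
    state["subs"] -= message["unsubs"]
    # Subscriptions: learn new processes (assumption: never add ourselves).
    new = message["subs"] - {state["id"]}
    state["view"] |= new
    state["subs"] |= new
    # Keep View bounded by moving random entries into Subs.
    while len(state["view"]) > max_view:
        moved = rng.choice(sorted(state["view"]))
        state["view"].remove(moved)
        state["subs"].add(moved)

state = {"id": "p1", "view": {"p2", "p3"}, "subs": {"p2"}, "unsubs": set()}
message = {"subs": {"p1", "p4", "p5"}, "unsubs": {"p2"}}
apply_membership(state, message, max_view=2)
print(len(state["view"]), sorted(state["unsubs"]))  # 2 ['p2']
```

Entries pushed out of View are not lost: they stay in Subs and are gossiped onward, so other processes can still learn about them.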

receiving

• When receiving gossip…
  – Event transmission
    • Events received for the first time are forwarded to other processes in View.
    • If Events is too large, remove items at random.
  – Retrieving events
    • When Mes.EventIDs contains the ID of an undelivered event, a request to retrieve that event is made.

[Figure: unknown events e in the message are added to Events; unknown IDs in EventIDs trigger retrieval of the corresponding event.]
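The event-handling steps can be sketched as follows. The function name, the dict-based state, and the return values are hypothetical; how and from whom a missing event is actually retrieved is left out because the slide does not specify it.

```python
import random

def apply_events(state, message, max_events, rng=random):
    """Deliver new events from a gossip message and list event IDs to retrieve."""
    delivered = []
    for event_id, payload in message["events"].items():
        if event_id not in state["event_ids"]:
            state["events"][event_id] = payload   # forwarded with the next gossip
            state["event_ids"].add(event_id)
            delivered.append(event_id)
    # IDs the sender knows but this process has never seen must be fetched.
    to_retrieve = sorted(message["event_ids"] - state["event_ids"])
    # Keep Events bounded by dropping random entries.
    while len(state["events"]) > max_events:
        del state["events"][rng.choice(sorted(state["events"]))]
    return delivered, to_retrieve

state = {"events": {}, "event_ids": {1}}
message = {"events": {1: "a", 2: "b", 3: "c"}, "event_ids": {1, 2, 3, 4}}
print(apply_events(state, message, max_events=1))  # ([2, 3], [4])
```

Event 1 is skipped because it was seen before, events 2 and 3 are delivered, and ID 4 is reported as missing because only its ID, not its payload, arrived.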

subscribing

• A subscribing process must know at least one process that is already a member.
• It sends a gossip message with itself appended to Subs.
• On timeout, it retransmits.

unsubscribing

• An unsubscribing process sends a gossip message with itself appended to unSubs.
  – The process is gradually removed from individual views.
  – A timeout is set on unSubs messages.
  – Assumption: a removed process will not recover soon.

 

features of lpbcast

• Throughput is as high as that of pbcast.
• Memory consumption can be estimated.
• The membership algorithm and the dissemination of events are dealt with at the same level.
• Each view is independent and uniformly random.
  – True P2P model
  → Suitable for WAN
  – Needs to recognize “locality”


Optimization

• Age-based
  – Optimization of the Events buffer
  – Currently, the Events buffer is purged at random.
  → Better to remove well-disseminated messages
  – Age = number of hops

[Figure: P1 calls bcast(m1) and bcast(m2); gossip(m2) reaches P2, which delivers m2. Buffer contents are shown as [m1] and [m1,m2].]
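Age-based purging can be sketched by evicting the event with the highest hop count first. The function name, the mapping from event ID to age, and the tie-break on the ID are assumptions for illustration.

```python
def purge_by_age(events, max_events):
    """Drop the oldest events (highest age) until at most `max_events` remain.

    `events` maps event ID -> age, where age counts the hops travelled so far.
    """
    while len(events) > max_events:
        # Ties on age are broken by ID so the result is deterministic.
        oldest = max(events, key=lambda e: (events[e], e))
        del events[oldest]
    return events

events = {"m1": 5, "m2": 1, "m3": 3}
print(sorted(purge_by_age(events, max_events=2)))  # ['m2', 'm3']
```

An event that has travelled many hops has probably reached most processes already, so dropping it costs less than dropping a fresh one.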

Optimization

• Frequency-based
  – Optimization of the Subs buffer
  – Currently, the Subs buffer is purged at random.
  → Better to remove well-known processes
  – Well-known = included in Subs buffers

[Figure: P1, P2, and P3 exchange Subs(P1, P2) and Subs(P2); Subs buffer contents are shown as [P2] and [P1,P2].]
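Frequency-based purging can be sketched by counting how often each process ID arrives in gossiped Subs buffers and evicting the most frequently seen one first. This deterministic rule, the function names, and the tie-break on the ID are assumptions of this sketch; the slide only says that well-known processes are better candidates for removal.

```python
from collections import Counter

def record_subs(seen, message_subs):
    """Count how often each process ID has arrived in a gossiped Subs buffer."""
    seen.update(message_subs)

def purge_by_frequency(subs, seen, max_subs):
    """Drop the most frequently seen (best-known) processes first."""
    while len(subs) > max_subs:
        best_known = max(subs, key=lambda p: (seen[p], p))
        subs.remove(best_known)
    return subs

seen = Counter()
record_subs(seen, {"P1", "P2"})   # Subs(P1, P2) from the figure
record_subs(seen, {"P2"})         # Subs(P2) from the figure
subs = {"P1", "P2", "P3"}
print(sorted(purge_by_frequency(subs, seen, max_subs=2)))  # ['P1', 'P3']
```

P2 was advertised twice, so it is treated as the best-known process and removed first, which keeps rarely advertised processes in circulation.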

Experiment : # of rounds

• Simulation
  – Probability of message loss: 0.05
  – Probability of process crash: 0.01
• Number of rounds needed to reach 99% of all processes
• Grows logarithmically
  – Fanout = 3

Experiment : Reliability

  – SUN Ultra 10 (Solaris 2.6, 256 MB memory)
  – 100 Mbps Ethernet
  – 40 msg/round, len(Events) = 60
• Probability that any given process delivers any given event notification

Experiment : Optimization Effect

• Age-based optimization
  – Delivery ratio = (# of delivered messages) / (# of broadcast messages)
  – 30 msg/round, len(Events) = 30, Fanout = 4, 60 processes

[Figure: delivery ratio for Optimized and Random purging]

Conclusion

• Scalability+Reliability

• Bimodal Multicast
  – A gossip-based protocol achieves scalability and reliability.
• Lightweight Probabilistic Broadcast
  – Pays attention to the cost to each process
  – Memory management for pbcast
  – Lightweight in large-scale environments