lh* rs p2p : a scalable distributed data structure for p2p environment
DESCRIPTION
CERIA Laboratory. LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment. W. LITWIN. T. SCHWARZ. H.YAKOUBEN. Paris Dauphine University [email protected]. Santa Clara University (USA) [email protected]. Paris Dauphine University [email protected]. - PowerPoint PPT PresentationTRANSCRIPT
LH*RSP2P: A Scalable Distributed
Data Structure for P2P Environment
W. LITWIN
CERIA Laboratory
H.YAKOUBENParis Dauphine University
[email protected] Dauphine University
T. SCHWARZSanta Clara University (USA)
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 2
PlanObjective
Overview: SDDS & P2P
LH*RSP2P
Architecture Addressing Properties Churn Management
Conclusion
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 3
Objective
High availabilityto deal with churn
At most one forwarding message for key search or insert or scan
(fastest known performance)
Very Large Scalable Files
A File of records identified by keys SDDS client nodes face the applications
and send queries to SDDS server nodes No centralized addressing Servers contain application or parity data
In buckets Overflowing servers split on new servers Servers do not notify clients about splits
SDDS (1993)
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 4
Clients use images of the file state for addressing Key based Range queries Scans …
Images get adjusted towards the file state during queries by Image Adjustment Messages Triggered by incorrect addressing by the client
IAMs reflect the file evolution by splits or, rarely, merges.
IAMs reflect also the location changes because of failures and recovery
SDDS (1993)
5LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
6
SDDS Typology
LH*sa
SDDS(1993)
Data Structures
Classics
Tree
m-d Tree1-d Tree
RP*, k-RP*, SD-Rtree, DRT*, LH*, DDH, EH*,
Hash
High Availability
1-dimensional d-dimensional
IH*…
LH*rs LH*sAlg. Sign…
k-Availability Security
LH*m LH*g
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Structured P2P Schemes
CHORD... BATON VBI-Tree
New Peer
7LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Clients
Growth through splits under inserts
Peer
SDDS Expansion
8LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Clients
SDDS Client Image Evolution
Image Adjustm
ent Messagin
g
Available at CERIA site Announced at DbWorld Managing LH* RS and RP* files
LH* implementaton based on LH*LH n scheme J. Karlsson’s Thesis & EDBT paper
In distributed RAM Under Windows Over 1 gbs Ethernet
Various functions Response time reaching 30 microsec Up to 300 times faster than disk files
SDDS 2007 Prototypes
9
Autonomous nodes store and search data By flooding in early systems
Freenet, Napster, Gnutella… Structured P2P reduce the flooding
Using decentralized data structures Distributed Hash Table (DHT) especially
Few folks know the concept is due to B. Devine FODO 93
Chord, P-tree, VBI, Baton… Structured P2P schemes are specific SDDS schemes
P2P (1995 ?)
10
Architecture based on LH*RS
ACM –TODS, Oct. 05)
11
LH*RSP2P
j
i’n’
Client Part
Server Part
LH*P2P Peer
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Peer
LH*RS
ClientLH*RS
DB
Candidate Peer
Candidate Peer
Client &Spare
Storage
LH*RSP2P Peer
LH*RS
ClientLH*RS
PB
Pupils
Pupil’s IP = Its key C
12
Client Address Calculus
a’ hi’(C ) ; /* a’ is the address of peer destination of the key C*/
if a’ < n’ then a hi’+1(C ) ;
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Global Addressing Rulea hi(C ) ; /* a is the address of peer destination of the key C*/
if a < n then a hi+1(C ) ; /* (i, n) state of an SDDS file, they are only known to the
file coordinator node
LH*RSP2P Addressing
hi (C ) = C mod 2i
File starts with i = 0 and n = 0 and a single data bucket 0
Every bucket keeps the bucket level j of hash function hi last used to split, j = 0 initially.
Overflowing bucket m alerts the coordinator Coordinator notifies bucket n to split
Usually n ≠ m It sends the address of the pupil ready for the new
bucket n + 2i
It also sends the addresses of the buckets that have been created since bucket n last split
LH*RSP2P File Expansion
13
Bucket n applies hi + 1 About half of keys migrates to new bucket n + 2i
About half of pupils migrate as well Bucket n and the new one set j = j + 1 Coordinator performs
n = n + 1 if n = 2i then i = i + 1 and n = 0 Resulting address space growth, starting from i = 0
and n = 0: 0 1 0 2, 1 3 ; 0 4, 1 5, 2 6, 3 7 0 8, 1 9, 2 10, 3 11, 412, 513…7 15
LH*RSP2P File Expansion
14
15
i’ = j - 1 ; /* j value after the splitn‘ = a + 1 /* a is the splitting bucket ; n = a + 1 if n’ = 2i’ then i’ = j + 1 ; n’ = 0 ;
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer & Pupil Image Adjustment After Peer Split
The adjustment concerns
Splitting peer a New peer a’ = a + 2i Every pupil of a and of a’
16
Before splittingCoordinator Peer (CP)
P0
j=2
i’=1n’=1
P2P1i=1n=1
j=1
i’=1n’=0
j=2
i’=1n’=1
After splitting
j=2
i’=2n’=0
CP
P0
j=2
i’=1n’=1
P2P1i=2n=0
j=2
i’=2n’=0
j=2
i’=1n’=1
P3
i’= j =1;n’= m+1= 1+1;
If n’=21 then n’=0; i’= i’+1and
(i’, n’)= (2,0)
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Example
17LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Server Address Calculus
a’ hj (C ) ;
if a’= a then exit /* Bucket a is the correct one
else send C to bucket a /* Forwarding to bucket a’exit;
Simpler and faster than for LH*
As only one forwarding is possible
18
i’ j - 1, n’ a + 1 ;if n’ >2i’ then n’ 0 ; i’ i’ + 1 ;
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer Image Adjustment by IAM
IAM comes from the correct bucket
Bucket a is the forwarding one
Bucket level j is that of the correct bucket
0f the forwarding one as well
• Same algorithm as for the adjustment of the local client and of pupils after a split
19LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
PC
i =3n=2
P0
j=4
i’=3n’=1
Pairs
P4
j=3
i’=2n’=1
P9
j=4
i’=3n’=2
P1
j=4
i’=3n’=2
9
9 Checking and
forward the key using A2
Peer Image Adjustment by IAM
IAMa = 1j = 4
20LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
PC
i =3n=2
P0
j=4
i’=3n’=1
Pairs
P4
j=3
i’= 3n’= 2
P9
j=4
i’=3n’=2
P1
j=4
i’=3n’=2
9
9
Peer Image Adjustment by IAM
IAMa = 1j = 4
21
Example of the File Expansion
PC
i=2n=2
P0
j=3
i’=2n’=1
Peers
P2
j=2
i’=1n’=1
P5
j=3
i’=2n’=2
Candidate Peer
i’=0n’=0
P6
j=3
i’=2n’=3
i=2
n=3
i’=2n’=3
j=3
Pupil
i’=2n’=1
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Assign a Tutor for Candidate Peer: LH-hash of its IP
Address
TUTOR, Update Pupil
LH*RSP2P
22
1. The maximal number of forwarding messages for the key search is one.
2. The maximal number of rounds for the scan search can be two.
3. The worst case addressing performance of LH*RS
P2P as defined by Property 1 is the fastest possible for any SDDS or a practical structured P2P addressing scheme.
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Properties of LH*RSP2P :
23LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 1
n n+2i’2i’a’0 a
Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split
No IAM came since.
j = i’+1 j = i’
No forwardinga+2i’
24LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Forwarding possible for any address a’ between (a, n)
Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split
No IAM came since.
n n+2i’2i’
a’0 a
j = i’+1 j = i’
a+2i’
Proof Property 1
25LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Forwarding possible for any address a’ beyond [n, a]
Case 2 : i = i’ + 1 and n < n’ Peer a addresses peer a’, using its image (i’,n’) from last split
No IAM came since.
n n+2i’+1 2i’
a’0 a
j = i’+2j = i’+1j = i’
2i’+1
Proof Property 1
26LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer a sends the scan to all buckets in its image• Including its image (i’, n’)
• Receiving peer a’ can have bucket level j as in the image
• j (a) = j’ (a)• No forwarding of the scan
• Or, bucket a’ split• Once and only once
• j (a) = j’ (a) + 1• See the figs for the key address calculus
Proof Property 2
27LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer a’ forwards the scan to its (only) child• No child can have a child• Peer a would first need to split again as
well
•Every peer gets thus the scan and only once
•There at worst two rounds
Proof Property 2
28LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
•The only faster worst case performance is zero forwarding messages
•Every split has to be notified then to every peer
•It would be against the scalability goal of every SDDS & structured P2P scheme
Proof Property 2
29LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management
Bucket reliability group with k parity buckets protect against up to k bucket failures per group
••••
•••••
••••• •
•••••
54321
0
•••••
•••••
Parity PeerData PeerRank
Parity RecordData Record
Tutoring records
30LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer leaves with notice
Coordinator Peer
j
i’,n’
j
i’,n’
j
i’,n’
i’,n’
P0 PmPl Candidate Peer
Notification
… … Say
that’s OK
LH*RSP2P Churn Management
31LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer leaves without notice or fails
Coordinator Peer
j
i’,n’
j
i’,n’
j
i’,n’
i’,n’Pl-1 PmPl Parity Peer
Query
Forward
LH*RS Bucket Recovery
LH*RSP2P Churn Management
32LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer leaves without notice or fails
Coordinator Peer
j
i’,n’
j
i’,n’
j
i’,n’
i’,n’Pl-1 Pm
Pl
Parity Peer
Answer
LH*RS Bucket Recovery
LH*RSP2P Churn Management
33LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Sure Search : Protects against outdated server read (transient communication or peer failure)
Coordinator Peer
j
i’,n’
j
i’,n’
j
i’,n’
i’,n’Pl-1 PmPl Parity Peer
Query
Answer
j
i’,n’
Pl
LH*RSP2P Churn Management
34
Conclusion LH*RS
P2P require at most one forward message when addressing error occur
Is the fastest known SDDS and P2P key based addressing algorithm
Protects efficiently against churn
Allows to manage very large scalable files Should have numerous applications Google ??
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
35
Current & Future Work
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Implementation of peer node architecture and of tutoring functionsUsing existing LH*RS prototypeCreated by Rim Moussa & shown at VLDB
2004 Performance Analysis Variants
36
END
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Work partly funded by the IST eGov-Bus project
Thank you for Your Attention
37
[1] Adina Crainiceanu, Prakash Linga, Johannes Gehrke, and Jayavel Shanmugasundaram. Querying Peer-to-Peer Networks Using P-Trees. In Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004). , June 2004.
[2] Bolosky W. J, Douceur J. R, Howell J. The Farsite Project: A Retrospective. Operating System Review, April 2007, p.17-26
[3] Devine R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm, Proc. Of the 4th Intl. Foundation of Data Organisation and Algorithms –FODO, 1993.
[4] Litwin, W. Neimat, M-A., Schneider, D. LH*: Linear Hashing for Distributed Files. ACM-SIGMOD Int. Conf. On Management of Data, 93.
[5] Litwin, W., Neimat, M-A., Schneider, D. LH*: A Scalable Distributed Data Structure. ACM-TODS, (Dec., 1996).
[6] Litwin, W., Neimat, M-A. High Availability LH* Schemes with Mirroring, Intl. Conf on Cooperating systems, , IEEE Press 1996.
[7] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Distributed Data Storage. Proc of 30th VLDB Conference, , 2004.
[8] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Scalable Distributed Data Structure. ACM-TODS, Sept 2005.
[9] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David Culler. Scalable, Distributed Data Structures for Internet Service Construction, Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000)
[10] Stoica, Morris, Karger, Kaashoek, Balakrishma. CHORD : A scalable Peer to Peer Lookup Service for Internet Application. SIGCOMM’O, August 27-31, 2001,
References
LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
38LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment