
    www.intel.com/labs

    Peer-to-Peer Computing: The hype, the hard problems and quest for solutions

    Krishna Kant, Ravi Iyer, Vijay Tewari

    Intel Corporation


    Outline

    Section I

    Overview of P2P

    P2P Framework

    Overview of distributed computing frameworks

    Additional P2P framework requirements

    P2P Middleware

    Section II

    Taxonomy of P2P applications

    Research Issues

    Section III

    Preliminary Performance Modeling

    Conclusion


    Goals for Section I

    Examine the early beginnings of Peer-to-Peer.

    Look at some possible definitions of Peer-to-Peer.

    Get a general idea of Peer-to-Peer applications and frameworks.

    Identify the requirements of Peer-to-Peer applications.


    P2P Beginnings

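    Interest kindled by distributed file-sharing applications

    Napster: Mediated digital music swapping. (http://www.napster.com)

    [Figure: Peer A, Peer B and a Mediator. (1) Peer A asks the Mediator "Where is X?"; (2) the Mediator answers "Peer B has it"; (3) Peer A copies X directly from Peer B.]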
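
    The three numbered steps of the mediated exchange can be sketched in a few lines. This is only an illustration, not Napster's actual protocol: the `Mediator` and `Peer` classes and their methods are hypothetical names, and the copy step is simulated with in-memory dictionaries rather than a network transfer.

```python
class Mediator:
    """Central index mapping file names to the peers that hold them (hypothetical)."""

    def __init__(self):
        self.index = {}

    def register(self, peer, filename):
        self.index.setdefault(filename, []).append(peer)

    def lookup(self, filename):
        return list(self.index.get(filename, []))


class Peer:
    def __init__(self, name, mediator):
        self.name = name
        self.mediator = mediator
        self.files = {}

    def share(self, filename, data):
        # Keep the file locally and advertise it to the mediator.
        self.files[filename] = data
        self.mediator.register(self, filename)

    def fetch(self, filename):
        holders = self.mediator.lookup(filename)   # steps 1 and 2: ask, get holders
        if not holders:
            return None
        data = holders[0].files[filename]          # step 3: copy directly from a peer
        self.files[filename] = data
        return data


mediator = Mediator()
peer_a, peer_b = Peer("A", mediator), Peer("B", mediator)
peer_b.share("X", b"song bytes")
copied = peer_a.fetch("X")
```

    Only the lookup goes through the mediator; the data itself moves peer to peer.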


    P2P Beginnings

    Gnutella: Fully distributed file sharing. (http://gnutella.wego.com)

    Freenet: Distributed file sharing with anonymity and key-based search. (http://freenet.sourceforge.net)

    [Figure: query propagation among Peers A, B, C and D. (1) Peer A asks its neighbors "Where is File (Key) X?"; (2) the query is forwarded onward; (3, 4) the reply "C: I have it." travels back along the query path; (5) Peer A issues "GET File (Key) X" over HTTP; (6) File X is returned.]
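
    A minimal sketch of the fully distributed search shown in the figure, assuming a hop limit and duplicate suppression. The four-peer topology, the `flood_query` function and its parameters are illustrative assumptions, not the Gnutella wire protocol.

```python
from collections import deque


def flood_query(neighbors, holders, origin, max_hops):
    """Breadth-first flood of a query; returns {holder: hop count at which it was reached}."""
    seen = {origin}
    frontier = deque([(origin, 0)])
    hits = {}
    while frontier:
        node, hops = frontier.popleft()
        if node in holders and node != origin:
            hits[node] = hops              # this peer would answer "I have it"
        if hops == max_hops:
            continue                       # hop limit reached, stop forwarding
        for nxt in neighbors[node]:
            if nxt not in seen:            # a peer ignores queries it has already seen
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return hits


# Hypothetical ring of four peers; only C holds file X.
neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
hits = flood_query(neighbors, holders={"C"}, origin="A", max_hops=2)
```

    With a hop limit of 1 the query never reaches C, which shows how the limit trades reachability for traffic.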


    We had them already!

    Using idle CPU cycles on home PCs, e.g., SETI@home

    Involves scanning radio telescope data for signs of extraterrestrial life.

    Chunks of data are downloaded by home PCs and processed, and the results are returned to the coordinator.

    Similar schemes are used for other heavy-duty computational problems.

    Idle disk and main memory on workstations are exploited in a number of network of workstations (NOW) projects.

    [Figure: a Master sends raw data to Peers 1 to 4; each peer does the data crunching and returns processed data to the Master.]


    Newer Applications

    P2P streaming media distribution

    CenterSpan (C-Star Multisource Peer Streaming)

    Mediated, secure P2P platform for distributing digital content.

    Partition content and encrypt each segment.

    Distribute segments amongst peers.

    Redundant distribution for reliability.

    Download segments from local cache, peers or seed servers.

    http://www.centerspan.com

    vTrails

    vtCaster: At stream source. Creates network topology tree based on end users (vtPass client software).

    Dynamically optimizes tree.

    Content distributed in a tiered manner.

    http://www.vtrails.com


    Newer Applications

    P2P Collaboration

    Groove (http://www.groove.net)

    Real time, small group interaction and collaboration.

    Fundamental notion around a shared space

    Each member of the group owns a copy of the shared space.

    Changes made to the shared space by one member are propagated to each member of the group (store and forward if some member is offline).

    Platform is secure.

    PKI for user authentication.

    End to end encryption.

    Groove components are digitally signed


    So, what is P2P?

    Hype: A new paradigm that can

    Unlock vast idle computing power of the Internet, and

    Provide unlimited performance scaling.

    Skeptic's view: Nothing new, just distributed computing re-discovered or made fashionable.

    Reality: Distributed computing on a large scale

    No longer limited to a single LAN or a single domain.

    Autonomous nodes, no controlling/managing authority.

    Heterogeneous nodes intermittently connected via links of varying speed and reliability.

    A tentative definition:

    A dynamic network (peers can come & go as they please).

    No central controlling or managing authority.

    A node can act both as a client and as a server.


    P2P Platforms

    Legion, University of Virginia, now owned by Avaki Corp.

    Globe, Vrije Univ., Netherlands

    Globus, developed by a consortium including Argonne Natl. Lab and USC's Information Sciences Institute.

    JXTA, Open source P2P effort started by Sun Microsystems.

    .NET by Microsoft Corp.

    WebOS, University of Washington

    Magi, Endeavors Technology

    Groove


    Avaki (Legion)

    Objective: Wide-area O/S functionality via distributed objects.

    Middleware infrastructure for distributed resource sharing in a mutually distrustful environment.

    Global O/S services built on top of local O/S

    *Source: Peer-to-Peer Computing by David Barkai (Intel Press)


    Avaki (Legion)

    Naming: LOID (Location-Independent Object ID), current object address & object name.

    Persistent object space: generalization of a file system (manages files, classes, hosts, etc.)

    Communication: RPC-like, except that the results can be forwarded to the real consumer directly.

    Security: RSA keys are a part of LOIDs; encryption, authentication and digesting are provided.

    Local autonomy: Objects call local O/S services for all management, protection and scheduling.

    Active objects: objects represent both processes and methods.

    Overall: Comprehensive WAN O/S, but not targeted as a general P2P enabler.


    Globus

    Objective: Grid computing, integration of existing services.

    Defines a collection of services, e.g.,

    Service discovery protocol

    Resource location & availability protocol

    Resource replication service

    Performance monitoring service

    Any service can be defined and becomes part of the system.

    Higher level services can be built on top of basic ones.

    Preserves site autonomy. Existing legacy services can be offered unaltered.

    Overall: Excellent reusability. Unconstrained toolbox approach => Very difficult to join two islands.


    JXTA

    Objective: A low-level framework to support P2P applications:

    Avoids any reference to specific policies or usage models.

    Not targeted for any specific language, O/S, runtime environment, or networking model.

    All exchanges are XML based.

    Base concepts for

    Identifiers

    Advertisements

    Peers

    Peer Groups

    Pipes

    At the highest abstraction defines a set of protocols using the base concepts:

    Peer Discovery Protocol: Discovery of peers, resources, peer groups etc.

    Peer Resolver Protocol

    Peer Information Protocol

    Peer Membership Protocol

    Pipe Binding Protocol

    Peer Endpoint Protocol


    JXTA

    Source: White Paper on Project JXTA: A Technology Overview by Li Gong


    Microsoft .NET in the context of P2P

    Objective: An enabler of general XML/SOAP based web services.

    Message transfer via SOAP (Simple Object Access Protocol) over HTTP.

    Kerberos based user authentication.

    Extensive class library.

    Emphasizes global user authentication via the Passport service (user distinct from the device being used).

    Hailstorm supports personal services which can be accessed via SOAP from any entity.


    MAGI

    Enabler for collaborative business applications.

    *Source: Peer-to-Peer Computing by David Barkai (Intel Press)


    Magi

    Magi: Micro-Apache Generic Interface, an extension of the Apache project.

    Superset of HTTP using

    WebDAV: Web Distributed Authoring & Versioning protocol, which provides locking services, discovery & assignment services, etc. for web documents.

    SWAP (Simple Workflow Access Protocol), which supports interaction between running services (e.g., notification, monitoring, remote stop/synchronization, etc.)

    Intended for servers; client interface is HTTP.


    WebOS

    Objective: WAN O/S that can dynamically push functionality to various nodes depending on loading.

    Outgrowth of the Berkeley NOW (network of workstations) project.

    Consists of a number of components

    Global naming: Mapping a service to multiple nodes, load balancing & failover.

    Wide-area file system (with transparent caching and cache coherency).

    Security & Authentication w/ fine-grain capability control.

    Process control: Support for remote process execution.

    Project no longer active, parts of it being used elsewhere.

    Overall: Dynamic configurability useful for P2P environment.


    Groove

    Groove (http://www.groove.net)

    Real time, small group interaction and collaboration.

    Fundamental notion around a shared space

    Each member of the group owns a copy of the shared space.

    Changes made to the shared space by one member are propagated to each member of the group (store and forward if some member is offline).

    Platform is secure.

    PKI for user authentication.

    End to end encryption.

    Groove components are digitally signed


    Requirements for P2P Applications

    Local autonomy: No control or management by a central authority.

    Scalability: Support collaboration of an arbitrarily large number of nodes.

    Security & Privacy: All accesses are authenticated and authorized.

    Fault Tolerance: Assured progress with up to k failures anywhere.

    Interoperability: Any peer that follows the protocol can participate irrespective of platform, OS, etc.

    Responsiveness: Satisfy the latency expectations of the application.

    Non-imposing: Allows the machine user full resource usage whenever desired without affecting responsiveness.

    Simplicity: Setting up a P2P application or participating in one should require a minimum of manual intervention.

    Auto-optimization: Ability to dynamically reconfigure the application (no. of nodes, functionality, etc.)

    Extensibility: Dynamic addition of functionality.


    P2P Services

    Basic.

    Network Services.

    Naming.

    Event and Exception management services.

    Storage Services

    Metadata services

    Security Services

    Advanced.

    Search and Discovery.

    Administrative and Auditing.

    File services akin to a virtual file system.

    User and group management services.

    Resource management services.

    Digital Rights management.

    Replication and Migration services.


    From Services to possible Layers

    [Figure: layered P2P service stack. Layers shown: Communications; Location Independent Services; Identity, Presence, Community; Security; Availability; Administration, Monitoring; Naming, Discovery, Directory; Sharable Resources. Standards and Policies run alongside the stack.]

    Callouts on this slide:

    Transport and data protocols for interoperability: common protocols (IP, IPv6, sockets, HTTP, XML, SOAP, ...), NAT and firewall solutions, roaming and intermittent connectivity.

    Availability from unreliable components: replication, striping, failover, guaranteed message queuing.

    Authorization, integrity, privacy, web of trust, certification, DRM.


    From Services to possible Layers

    [Figure: layered P2P service stack. Layers shown: Communications; Location Independent Services; Identity, Presence, Community; Security; Availability; Administration, Monitoring; Naming, Discovery, Directory; Sharable Resources. Standards and Policies run alongside the stack.]

    Callouts on this slide:

    User / group identity, authentication, persistence (beyond a session, across multiple devices).

    Local autonomy, IT allocation of resources, self administration, reliable whole from unreliable parts, resource monitoring, payment tracking.

    Name space management, metadata management, discovery & location of peers, services, resources, users.

    CPU, storage, memory, bandwidth, I/O devices, capability discovery.


    Questions ???


    Part 2: Taxonomy & Research Issues

    Goals:

    To introduce a taxonomy for classifying P2P applications and environments.

    To elaborate upon some major research issues.


    P2P Taxonomy (contd.)

    Environmental characteristics:

    Network latency: Ranges from uniformly low (e.g., for a high-speed LAN) to highly variable (e.g., for general WAN).

    Security concerns: Ranges from low (e.g., corporate intranet) to high (e.g., public WAN).

    Scope of failures: Ranges from occasional isolated failures (e.g., a laboratory network of workstations) to network partitioning.

    Connectivity: Ranges from always-on (e.g., nodes in a business LAN) to occasional-on (e.g., mobile devices).

    Heterogeneity: Ranges from complete homogeneity to complete heterogeneity (in platform, O/S, protocols etc.).

    Stability: Ranges from highly stable (i.e., planned occasional changes/upgrades) to unpredictable.

    Convenient to aggregate them as friendly and hostile.


    Research Issues

    Intelligent caching of search results.

    Intelligent object retrieval

    Retrieval by properties rather than URL.

    Need distributed indexing mechanisms.

    Directing searches to more promising and less loaded nodes.

    Multiparty synchronization and communication that scales to thousands of nodes.

    For home computers: Utilize idle computing resources w/o significant communication requirements.

    Unobtrusive use: If the owner wants to use the resources, get out of the way quickly.

    Low latency service handoff protocols.


    Research Issues (contd.)

    Distributed load balancing that scales to thousands of geographically distributed nodes.

    Stitching traffic from multiple paths to reduce latency or losses for real-time applications.

    Access control in a mutually suspicious environment (foreign objects on your machine must protect themselves from you, and you from these objects).

    Effective mapping of the application topology to the physical topology.

    Architectural features to

    Efficiently propagate requests and responses w/o significant CPU involvement

    Squelch duplicate, orphaned or very late responses.


    Additional P2P Issues

    Communicating with peers behind NAT devices and firewalls.

    Naming and addressing peers that do not have DNS entries.

    Coping with intermittent connectivity & presence (e.g., queued transfers).

    Authentication of users independent of devices.

    Digital rights management.

    On demand task migration w/o breaking the application.

    Efficient distributed information location and need based content migration.

    Scalability to huge number of peers (e.g., 100M):

    Peer state management

    Discovery and presence management (intermittent connectivity & slow last mile links)

    Certificate management and authentication.


    Part 3: Performance Study

    Goals:

    1. Define a performance model including

    - Network model

    - File storage and access model

    - File caching and propagation model

    2. Discuss sample results

    3. Discuss architectural impacts


    P2P Network Characteristics

    Desirable characteristics

    Adequate representation of ad hoc nature of the network.

    Expected to contain a few special sites (well-known, content rich, substantial resources, etc.)

    Heavy-tailed nature of connectivity.

    Other Issues

    Dynamic changes to the network

    Direct modeling not required if rate of change


    P2P Network Model

    Use a random graph model to represent topology.

    Traditional G(n,p) RG model too simplistic.

    Use a 2-tier non-uniform model built as follows:

    Start with a degree-$K_d$ regular graph of $N_d$ distinguished (dist.) nodes.

    Add $N_u$ undistinguished (undist.) nodes sequentially as follows:

    The new node connects to $K$ other nodes.

    $K$: const or an integer-valued RV in range $1..K_{max}$.

    Each connection targets an undistinguished node with prob $q_u$ (this may not be possible for the first $K_{max}$ nodes).

    Dist. node target: uniform distribution over all dist. nodes.

    Undist. node target: Zipf($\alpha$) over existing undist. nodes.

    At most one connection allowed between any pair of nodes.

    $\alpha$ controls the decay rate of nodal degree.

    $\alpha = 0$ => Uniform dist => Very slow decay. Used here for simplicity.
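
    The two-tier construction above can be sketched as follows. This is a minimal illustration under assumptions the slide does not state: the regular graph of distinguished nodes is built as a ring lattice (so the degree must be even), K is a constant, and the Zipf rank of an undistinguished node is taken to be its arrival order. The function name and parameter names are hypothetical.

```python
import random


def build_two_tier_graph(n_dist, k_dist, n_undist, k_new, q_undist, alpha=0.0, seed=0):
    """Return an adjacency dict {node: set(neighbors)} for the 2-tier random graph."""
    assert k_dist % 2 == 0 and k_dist < n_dist, "ring lattice needs an even degree < n_dist"
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_dist)}
    # Distinguished tier: ring lattice, every node linked to k_dist/2 nodes on each side.
    for i in range(n_dist):
        for step in range(1, k_dist // 2 + 1):
            j = (i + step) % n_dist
            adj[i].add(j)
            adj[j].add(i)
    undist = []
    for new in range(n_dist, n_dist + n_undist):
        adj[new] = set()
        for _ in range(k_new):
            # Candidates exclude nodes already linked: at most one edge per pair.
            free_u = [u for u in undist if u not in adj[new]]
            free_d = [d for d in range(n_dist) if d not in adj[new]]
            pick_undist = bool(free_u) and (not free_d or rng.random() < q_undist)
            if pick_undist:
                # Zipf(alpha) over existing undistinguished nodes, ranked by arrival order.
                weights = [1.0 / (u - n_dist + 1) ** alpha for u in free_u]
                target = rng.choices(free_u, weights=weights)[0]
            elif free_d:
                target = rng.choice(free_d)   # uniform over distinguished nodes
            else:
                break
            adj[new].add(target)
            adj[target].add(new)
        undist.append(new)
    return adj


graph = build_two_tier_graph(n_dist=10, k_dist=4, n_undist=90, k_new=3, q_undist=0.5)
```

    With `alpha=0.0` all Zipf weights are equal, which is the uniform case the slide says is used for simplicity.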


    Topological properties

    Some network properties can be derived analytically.

    Outline of analysis (see http://kkant.ccwebhost.com/download.htm)

    Degree distribution:

    Distinguished nodes at level 0, each new node defines a new level.

    $P_n(l_2, l)$: Prob(level $l$ node has degree $n$ when current level = $l_2$)

    Get recurrence eqns for $P_n(l_2, l)$ & hence its PGF $f(z \mid l_2, l)$.

    Get avg degree Dat($l_2$, $l$) at level $l$ when current level = $l_2$.

    Can be adapted for computing the undistinguished degree of a node.

    No of nodes reached in $h$ hops:

    $R_h$ matrix: $R_h(i,j)$ is prob of reaching level $i$ from level $j$ in exactly $h$ hops.

    Compute $R_h(i,j)$ by enumerating all unique paths of length $h$.

    Compute $G(l_2, h)$, avg no of nodes reached in $h$ hops starting from a level $l_2$.

    Request and response traffic at level $l$ node:

    $n_{reqs}$ = No of requests reaching undist. nodes in $h$ hops $= 1 + \sum_h G(l_2, h)$

    $n_{resps} = 1 + \sum_h h\,G(l_2, h)$, since a resp from $h$ hops away goes thru $h$ nodes.

    Nodal utilization & node engineering:

    Easy to ensure that nodal utilization does not exceed some limits.

    Queuing properties generally intractable; explored via simulation.
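
    The request and response counts above are plain sums over the per-hop reach. A short sketch, assuming G(l2, h) is supplied as a list with one entry per hop count h = 1, 2, ...; the function name and the input values are made up for illustration and are not taken from the results tables.

```python
def traffic_counts(nodes_per_hop):
    """nodes_per_hop[h-1] holds G(l2, h), the average number of nodes reached at hop h."""
    n_reqs = 1 + sum(nodes_per_hop)
    # A response from h hops away passes through h nodes, hence the weight h.
    n_resps = 1 + sum(h * g for h, g in enumerate(nodes_per_hop, start=1))
    return n_reqs, n_resps


# Hypothetical reach: 4 nodes at hop 1, 10 at hop 2, 6 at hop 3.
n_reqs, n_resps = traffic_counts([4.0, 10.0, 6.0])
```

    The response count grows faster than the request count because distant hits are weighted by their hop distance.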


    Sample Results - 100 nodes

    undist  no. of  nodes     undist    resps    traffic
    prob    hops    reached   reached   /node    /node
    0.05    1         5.9       3.3       4.9      6.1
    0.05    2        55.2      44.5     103.6    146.5
    0.05    3        99.1      85.8     235.2    320.5
    0.05    4       100        90.0     238.8    328.8
    0.05    5       100        90.0     238.8    328.8
    0.50    1         5.9       4.3       4.9      8.4
    0.50    2        34.3      23.8      61.7     82.3
    0.50    3        91.0      73.9     231.7    304.0
    0.50    4        99.9      89.4     267.5    356.9
    0.50    5       100        89.6     267.7    357.3
    0.95    1         5.9       5.3       4.9     10.6
    0.95    2        28.6      22.6      50.3     73.6
    0.95    3        76.7      63.8     194.6    258.4
    0.95    4        98.5      87.4     281.8    369.2
    0.95    5        99.7      89.3     287.8    377.2


    Sample Results - 500 nodes

    undist  no. of  nodes     undist    resps     traffic
    prob    hops    reached   reached   /node     /node
    0.05    1         6.0       3.6        5.0       6.2
    0.05    2       243.7     232.7      480.5     711.5
    0.05    3       499.7     488.6     1248.4    1737.0
    0.05    4       500.0     490.0     1249.6    1739.6
    0.50    1         6.0       4.7        5.0       8.5
    0.50    2        95.7      84.2      184.3     264.6
    0.50    3       483.5     465.1     1347.8    1812.4
    0.50    4       500.0     490.0     1413.9    1903.9
    0.95    1         6.0       5.8        5.0      10.7
    0.95    2        35.1      29.1       63.2      91.7
    0.95    3       163.5     137.1      448.3     582.4
    0.95    4       405.7     367.7     1417.2    1782.7


    Simulation of Random Graphs

    Simulation of random graphs is a hard problem

    Model represents a large number of topologies that the actual network might take.

    Too many instances to simulate explicitly and then average the results.

    Example: 2 dist & 3 undist nodes, each connects to 2 nodes => 6 distinct topologies.

    Possible approaches to simulation:

    Average case analysis

    Model with limited set of instances.

    Direct simulation of probabilistic model.


    Average case analysis

    Intended environment

    To study performance of an average network defined by RG model.

    No dynamic changes to the topology possible.

    Graph construction

    Start with the regular graph of distinguished nodes (as usual).

    For adding undist nodes, work with only the avg connectivities $K_d$ & $K_u$ for an incoming node.

    Always connect to the existing node with min connectivity.

    $\lfloor K_d \rfloor$ & $\lceil K_d \rceil$ can be used successively to handle non-integer $K_d$ values (similarly for $K_u$).

    Characteristics/issues

    Simple, only one graph to deal with in simulation.

    Gives correct avg reachability and nodal utilizations.

    All queuing metrics (including avg response time) are underestimated.


    Probabilistic Graph Emulation

    Intended environment

    To study overall performance when the topology is defined by the random graph model.

    Accommodate fast changing or unstable topologies.

    Method:

    For each node $i$, estimate relative prob $q_{ij}$ of having an edge to node $j \neq i$.

    A query coming from node $k$ to node $i$ is sent to node $j$ with prob $q_{ij}/(1 - q_{ik})$.

    This virtual topology for the query is used to return responses as well.

    Characteristics/Issues

    Method dependent on analytic calculation of edge probabilities to neighbors.

    Single simulation automatically visits various instances in the correct proportion.

    No explicit control over which instances are visited => Reliable results may take a very long time.

    Very expensive and difficult to handle complex operations (e.g., file migration).
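
    The forwarding rule in the method above rescales node i's edge probabilities so that the node the query came from is excluded. A small sketch, assuming the relative probabilities for node i are given as a dictionary; the function name, node labels and values are hypothetical.

```python
def forwarding_probs(q_i, came_from):
    """q_i maps neighbor j -> q_ij; returns j -> q_ij / (1 - q_ik) for every j != k."""
    q_ik = q_i.get(came_from, 0.0)
    if q_ik >= 1.0:
        return {}                      # the sender was the only possible neighbor
    return {j: q / (1.0 - q_ik) for j, q in q_i.items() if j != came_from}


# Node i has three potential neighbors; the query arrived from "k".
probs = forwarding_probs({"j1": 0.5, "j2": 0.3, "k": 0.2}, came_from="k")
```

    When the q values for a node sum to 1, the rescaled values over the remaining neighbors again sum to 1.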


    File Size & access distribution

    Using a 2-segment model:

    Small sizes: Distribution generally irregular; uniform is a reasonable model.

    Pareto tail with decay rate 1


    Parameters: File Copies

    Each search in a P2P network may result in multiple hits.

    Need only dist. of hits; precise modeling of search mechanism not needed.

    Use file copies for this:

    Each file has $C$ copies in the range $(1..C_{max})$ with a given distribution.

    A file is now identified by the triplet (category, file_no, copy_no), where file_no is a unique id (e.g., sequence no) of files in a category.

    This allows the following capabilities:

    Unique searches specified by the file-id triplet.

    Non-unique searches specified by (category, file_no).

    Replication control and fault-tolerant operation.

    File copy parameters:

    Distribution may be related to the nature of the file (not considered here).

    Separate distributions allowed for files allocated to dist & undist nodes.

    Assuming a triangular distribution with $C_{max} = 20$ and mode $C_{mode} = 5$ for all nodes => Mean no of copies = 8.667.
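
    The quoted mean follows from the standard formula for a continuous triangular distribution, (low + high + mode) / 3 = (1 + 20 + 5) / 3 = 8.667. The sketch below checks this with Python's standard `random.triangular`; treating the copy count as continuous and the lower limit as 1 are assumptions read off the stated range.

```python
import random


def triangular_mean(low, high, mode):
    """Mean of a continuous triangular distribution."""
    return (low + high + mode) / 3.0


mean_copies = triangular_mean(1, 20, 5)

# Empirical check with a fixed seed.
rng = random.Random(1)
samples = [rng.triangular(1, 20, 5) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
```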


    File Assignment to Nodes

    Assignment of copies to nodes:

    Assign copies at a fixed distance so as to distribute them evenly across the network.

    Apply an offset for each round of copy assignment to avoid bunching up.

    Do not assign more than one copy of a file to a node.

    Algorithm:

    tot_offset = 0;
    for (file_no = 0; file_no < no_files; file_no++) {      // loop over all files
        n_copies = triangular_rv(1, Cmax, Cmode);            // generate random no. of copies
        if (n_copies > n_nodes) n_copies = n_nodes;          // don't allow more copies than nodes
        distance = n_nodes / n_copies;                       // distance for copy allocation
        offset = 1 + n_nodes / no_files;                     // if too few files, get an offset to avoid bunching
        tot_offset = (tot_offset + offset) % n_nodes;
        node_no = tot_offset;                                // node for the assignment of first copy
        wraps = 0;                                           // assumed reset per file (not shown in the original)
        for (copy_no = 0; copy_no < n_copies; copy_no++) {
            assign_file(node_no, file_no, size);             // size: this file's size, drawn elsewhere
            node_no = (node_no + distance) % n_nodes;        // next node for assignment
            if (copy_no < n_copies - 1 && node_no == (tot_offset + wraps) % n_nodes) {
                node_no = (node_no + 1) % n_nodes;           // landed on an already used start node: shift by one
                wraps++;
            }
        }   // loop over copies
    }   // loop over files
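
    A runnable sketch of the same assignment loop, written in Python so the "no two copies of a file on one node" rule can be checked directly. Several details are assumptions rather than part of the slide: the copy count is a rounded continuous triangular draw, `wraps` is reset for every file, file sizes are left out, and `assign_copies` with its parameters is an illustrative name.

```python
import random


def assign_copies(n_nodes, n_files, c_max=20, c_mode=5, seed=0):
    """Return {file_no: [node numbers holding a copy]}."""
    rng = random.Random(seed)
    placement = {}
    tot_offset = 0
    offset = 1 + n_nodes // n_files           # extra offset when there are few files
    for file_no in range(n_files):
        n_copies = min(n_nodes, max(1, round(rng.triangular(1, c_max, c_mode))))
        distance = n_nodes // n_copies        # spacing between successive copies
        tot_offset = (tot_offset + offset) % n_nodes
        node_no, wraps, nodes = tot_offset, 0, []
        for copy_no in range(n_copies):
            nodes.append(node_no)
            node_no = (node_no + distance) % n_nodes
            if copy_no < n_copies - 1 and node_no == (tot_offset + wraps) % n_nodes:
                node_no = (node_no + 1) % n_nodes
                wraps += 1
        placement[file_no] = nodes
    return placement


placement = assign_copies(n_nodes=100, n_files=500)
```

    Because the spacing is the integer quotient of nodes by copies, the copies of one file span less than a full lap around the node numbers, so they land on distinct nodes.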


    Query Characteristics

    Assumptions:

    No queries (searches) started from distinguished nodes since these nodes are essentially servers.

    Identical query arrival process at each undistinguished node.

    Arrival process model

    An on-off process with identical Pareto distribution for on & off periods:

    $P(X > x) = (x/T)^{-\gamma}$ for $x > T$

    Assume $T = 12$ secs and $\gamma = 1.4$, which gives $E(X) = 30$ secs.

    Const inter-arrival time of 4 secs during the on-period, no traffic during off period.

    Total traffic at a node is superposition of arrivals from all reachable nodes.

    Approx. a self-similar process with Hurst parameter $H = (3 - \gamma)/2 = 0.8$ when no of reachable nodes is large.

    Query properties:

    Each query specifies a file (category, file_no) w/ given access characteristics.

    Shown results do not specify copy_no => Multiple hits possible for each query.

    Query percolates for $h$ hops ($h = 3$ can cover 95% of nodes for the chosen graph).

    If a query arrives at a node more than once, it is not propagated.
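
    A sketch of the on-off arrival process for a single node, assuming the Pareto tail is sampled by inverse transform and each on-period starts with an arrival. The function names and the one-hour horizon are illustrative; only T = 12 s, the tail exponent 1.4 and the 4 s spacing come from the slide.

```python
import random


def pareto_sample(rng, t_min, gamma):
    """Inverse-transform sample with P(X > x) = (x / t_min) ** -gamma for x > t_min."""
    u = 1.0 - rng.random()                 # uniform on (0, 1]
    return t_min * u ** (-1.0 / gamma)


def on_off_arrivals(horizon, t_min=12.0, gamma=1.4, gap=4.0, seed=0):
    """Query arrival times in [0, horizon) for one undistinguished node."""
    rng = random.Random(seed)
    arrivals, now = [], 0.0
    while now < horizon:
        on_end = now + pareto_sample(rng, t_min, gamma)     # on-period
        t = now
        while t < on_end and t < horizon:
            arrivals.append(t)                              # constant spacing while on
            t += gap
        now = on_end + pareto_sample(rng, t_min, gamma)     # silent off-period
    return arrivals


arrivals = on_off_arrivals(horizon=3600.0)
hurst = (3 - 1.4) / 2    # Hurst parameter for the superposed traffic
```

    Superposing many such sources, one per reachable node, gives the approximately self-similar aggregate the slide describes.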


    Simulation Results


    Major Observations


    Conclusions & Future Work

    Covered in the tutorial:

    Introduced major developments relevant to P2P computing.

    Introduced sample middleware functionality to support P2P applications.

    Introduced a taxonomy for classifying P2P computing applications and environments.

    Discussed major research issues to be resolved.

    Proposed a random graph model for P2P networks and studied its properties.

    Studied some performance issues for P2P deployments using detailed simulation of file-sharing applications.

    Potential Future Work

    Further refinement of middleware functionality and taxonomy as newer P2P applications emerge.

    More comprehensive performance studies, particularly going beyond simple file-sharing.


    Backup


    Goals

    Define Peer-to-Peer.

    General idea about the Peer-to-Peer applications and frameworks.

    Identify the requirements of Peer-to-Peer applications.

    Examine a taxonomy for Peer-to-Peer.

    Performance. (Not clear what we write here)


    Ad-hoc Collaborative computing

    Several applications, e.g., telemedicine, military planning, video-conferencing, document editing.

    A group of peers discover one another and form an ad-hoc network.

    Peers set up the necessary communication channels (perhaps secure) and distribute objects.

    Peers do arbitrary real-time computation, perhaps involving multiparty synchronization.

    Results are collected and the network disbanded.


    P2P Services

    Basic.

    Network Services.

    Core communication functionality.

    Enable communication on various network topologies, such as direct or via firewalls.

    Enable communication in the face of intermittent connectivity.

    Event and Exception management services.

    Publish and subscribe model.

    Storage Services

    Low level File services.

    Metadata services

    Generic mechanism for publishing and obtaining Metadata for

    Devices

    Resources (Files, CPU, Memory etc)


    P2P Services

    Security Services

    Identification

    Authentication

    Access Control

    Integrity

    Confidentiality

    Audit Trail

    User and group management services.

    Resource management and Placement services.

    Advanced.

    Naming.

    Search.

    Discovery.

    Administrative.

    Auditing.

    File services



    Web Sites of Interest

    Napster (http://www.napster.com)

    Gnutella (http://gnutella.wego.com)

    Freenet (http://freenet.sourceforge.net)

    JXTA (http://www.jxta.org)

    Avaki Corp (http://www.avaki.com)

    Legion (http://legion.virginia.edu)

    Globe (http://www.cs.vu.nl/~steen/globe)

    Globus (http://www.globus.org)

    Microsoft .Net (http://www.microsoft.com/net)
