Tuesday, January 27, 2009
![Page 1: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/1.jpg)
![Page 2: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/2.jpg)
Tuesday, January 27, 2009
“In the confrontation between the stream and the rock,
the stream always wins, not through strength but by
perseverance.”
- H. Jackson Brown
![Page 3: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/3.jpg)
Distributed Computing
Class: BSIT-8
Instructor: Dr. Raihan Ur Rasool
![Page 4: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/4.jpg)
Lecture Objectives
To understand the practical concepts of:
- P2P
- SOA
- Distributed algorithms
- Loose coupling and the degree of loose coupling
![Page 5: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/5.jpg)
Outline
- Peer-to-Peer Systems
  - Evolution of P2P systems, P2P middleware, routing overlays, case studies: Chord, Pastry, Tapestry
- Service Oriented Architecture
- Vision of the web and evolution of the web
- Web Services
  - Web Services, Web Services architecture, SOAP, WSDL, UDDI, service description and IDL, directory service for use with Web Services, XML security, coordination of Web Services
![Page 6: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/6.jpg)
Intro to P2P Systems [reliable resource sharing layer over unreliable]
- Demand for services: eliminating separately managed servers
- The scope for expanding a popular service by adding to the number of computers hosting it is limited when all the hosts must be owned and managed by the service provider:
  - Administration and fault recovery costs
  - Bandwidth that can be provided to a single server site over the available physical links
- Major service providers all face this problem, with varying severity
![Page 7: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/7.jpg)
Intro to P2P Systems
- Purpose:
  - Describe some general techniques
  - Construction of P2P applications
  - Scalability, reliability and security
- Problem: placing objects and managing workloads so as to ensure scalability without adding overheads
- P2P applications exploit resources available at the edges of the internet: storage, content, cycles, human presence
![Page 8: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/8.jpg)
![Page 9: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/9.jpg)
Intro to P2P Systems
- P2P applications exploit resources available at the edges of the internet: storage, content, cycles, human presence
- Traditional client-server systems provide access to these, but only on a single machine or tightly coupled servers
  - This centralized design requires few decisions about the placement and management of resources
- In P2P, algorithms for the placement and subsequent retrieval of information objects are a key aspect of the system design. A P2P system:
  - Is fully decentralized and self-organizing
  - Can dynamically balance the storage and processing loads between all the participating computers as they join and leave
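
The placement and rebalancing idea above can be sketched with consistent hashing. This is one common technique for the job, not one the slide names; the `Ring` class and the node and object names are hypothetical, and the sketch ignores replication, failures and network routing.

```python
import bisect
import hashlib


def ident(name: str) -> int:
    """Map a name to a point in a flat 160-bit identifier space (SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring: a key is owned by the first
    node whose identifier is >= the key's identifier, wrapping around."""

    def __init__(self):
        self._ids = []     # sorted node identifiers
        self._names = {}   # identifier -> node name

    def join(self, node: str) -> None:
        i = ident(node)
        bisect.insort(self._ids, i)
        self._names[i] = node

    def leave(self, node: str) -> None:
        i = ident(node)
        self._ids.remove(i)
        del self._names[i]

    def owner(self, key: str) -> str:
        pos = bisect.bisect_left(self._ids, ident(key))
        return self._names[self._ids[pos % len(self._ids)]]


ring = Ring()
for n in ("node-a", "node-b", "node-c"):
    ring.join(n)

keys = [f"object-{k}" for k in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.join("node-d")  # a new peer arrives
after = {k: ring.owner(k) for k in keys}

# Only objects now owned by the newcomer change hands; the rest stay put.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "objects moved")
```

When `node-d` leaves again, the same objects fall back to their previous owners, so a join or a departure only disturbs one neighbourhood of the identifier space.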
![Page 10: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/10.jpg)
P2P Design Characteristics
- Their design ensures that each user contributes resources to the system
- Although they may differ in the resources that they contribute, all the nodes in a peer-to-peer system have the same functional capabilities and responsibilities
- Their correct operation does not depend on the existence of any centrally administered systems
- They can be designed to offer a limited degree of anonymity to the providers and users of resources
- A key issue for their efficient operation is the choice of an algorithm for placing and retrieving data on many hosts:
  - Balance of load
  - Availability without much overhead
- Participants' availability to the system is unpredictable
![Page 11: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/11.jpg)
Evolution of P2P
- Volatile resources: a strength?
  - No guaranteed access to individual resources
  - Probability of failure can be minimized
- Can be grouped into three generations:
  - First generation: Napster music exchange service [OpenNap 2001]
  - Second generation: file-sharing applications with greater scalability, anonymity and fault tolerance (Gnutella, Kazaa, Freenet)
  - Third generation: developed with the help of middleware layers
    - Application-independent management of distributed resources on a global scale
    - E.g. Pastry, Tapestry, CAN, Chord, JXTA
    - Provide guarantees of delivery for requests in a bounded number of network hops
    - Place replicas of resources, keeping in mind volatile availability, trustworthiness and locality
![Page 12: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/12.jpg)
P2P Middleware - GUID
- Resources are identified by a globally unique identifier (GUID)
- The GUID is derived as a secure hash of the resource's state
- The hash makes a resource self-certifying
  - A client receiving the resource can check the hash
- This requires that the states of resources are immutable
- P2P systems are inherently best suited to the storage of immutable objects, e.g. music files and images
- Sharing of mutable objects can be managed by a set of trusted servers that manage the sequence of versions, e.g. OceanStore, Ivy (more in section 10.6)
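
A minimal sketch of the self-certifying property: the identifier is the hash of the content, so whoever receives the object can recompute the hash and compare. The slide only says "secure hash"; SHA-256 and the function names `guid` and `verify` are assumptions made for illustration.

```python
import hashlib


def guid(content: bytes) -> str:
    """Derive an identifier from the object's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def verify(expected_guid: str, content: bytes) -> bool:
    """A receiving client recomputes the hash and compares it with the GUID it asked for."""
    return guid(content) == expected_guid


original = b"some immutable music file bytes"
object_id = guid(original)

print(verify(object_id, original))         # genuine copy
print(verify(object_id, original + b"!"))  # tampered copy
```

Any change to the bytes changes the hash, which is why this scheme only fits immutable objects: a modified object is, by construction, a different object with a different GUID.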
![Page 13: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/13.jpg)
Overlay routing vs IP routing (shared characteristics)

| Characteristic | IP | Application-level routing overlay |
| --- | --- | --- |
| Scale | IPv4 is limited to 2^32 addressable nodes. The IPv6 name space is much more generous (2^128), but addresses in both versions are hierarchically structured and much of the space is pre-allocated according to administrative requirements. | Peer-to-peer systems can address more objects. The GUID name space is very large and flat (>2^128), allowing it to be much more fully occupied. |
| Load balancing | Loads on routers are determined by network topology and associated traffic patterns. | Object locations can be randomized and hence traffic patterns are divorced from the network topology. |
| Network dynamics (addition/deletion of objects/nodes) | IP routing tables are updated asynchronously on a best-efforts basis with time constants on the order of 1 hour. | Routing tables can be updated synchronously or asynchronously with fractions of a second delays. |
| Fault tolerance | Redundancy is designed into the IP network by its managers, ensuring tolerance of a single router or network connectivity failure. n-fold replication is costly. | Routes and object references can be replicated n-fold, ensuring tolerance of n failures of nodes or connections. |
| Target identification | Each IP address maps to exactly one target node. | Messages can be routed to the nearest replica of a target object. |
| Security and anonymity | Addressing is only secure when all nodes are trusted. Anonymity for the owners of addresses is not achievable. | Security can be achieved even in environments with limited trust. A limited degree of anonymity can be provided. |
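
The fault-tolerance and target-identification rows can be illustrated with a small replica-placement sketch. It assumes a circular 128-bit identifier space and stores each object on the `r` nodes whose identifiers are numerically closest to the object's identifier; all function and peer names are hypothetical. Note that "nearest" here means closest in identifier space, whereas a real overlay would typically also weigh network proximity.

```python
import hashlib

SPACE = 2 ** 128  # size of the assumed circular identifier space


def ident(name: str) -> int:
    """128-bit identifier: SHA-256 truncated to 16 bytes (an assumption)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:16], "big")


def distance(a: int, b: int) -> int:
    """Distance between two identifiers on the circle."""
    d = abs(a - b)
    return min(d, SPACE - d)


def replica_nodes(key: str, nodes: set, r: int) -> list:
    """The r nodes whose identifiers are closest to the key's identifier."""
    k = ident(key)
    return sorted(nodes, key=lambda n: distance(ident(n), k))[:r]


def lookup(key: str, live_nodes: set, stores: dict):
    """Closest live node that holds the key, or None if every holder is down."""
    k = ident(key)
    holders = [n for n in live_nodes if key in stores.get(n, set())]
    return min(holders, key=lambda n: distance(ident(n), k), default=None)


nodes = {f"peer-{i}" for i in range(10)}
stores = {n: set() for n in nodes}

replicas = replica_nodes("song.mp3", nodes, 3)
for n in replicas:
    stores[n].add("song.mp3")

# Two of the three holders fail: the object is still reachable.
print(lookup("song.mp3", nodes - set(replicas[:2]), stores))
# All three holders fail: the object is lost until a replica returns.
print(lookup("song.mp3", nodes - set(replicas), stores))
```

In this sketch the object stays reachable as long as at least one of its holders is alive, and the lookup always resolves to the closest surviving replica rather than to one fixed host.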
![Page 14: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/14.jpg)
Distributed Computation
- Only a small portion of the CPU cycles of most computers is utilized. Most computers are idle for the greatest portion of the day, and many of the ones in use spend the majority of their time waiting for input or a response.
- Loosely coupled: data/computation
- A number of projects have attempted to use these idle CPU cycles. The best known is the SETI@home project, but other projects, including code breaking, have used idle CPU cycles on distributed machines.
![Page 15: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/15.jpg)
How many of you did not shut down your computer and are now here in this room?
Assume we are 15 people running a screensaver without performing real work. The talk lasts one hour.

Opportunity loss for one hour:
- Speed: 15 * 0.8 GFlops = 12 GFlops
- Computation: 12 GFlops * 1 h = 43 200 billion floating-point operations

Costs for one hour:
- Power consumption: 15 * 300 W = 4500 W during one hour = 4.5 kWh
- Money: 4.5 kWh at 0.20 CHF per kWh = 0.9 CHF
- Oil needed: 0.36 liter (gasoline: 12.3 kWh/kg)
- CO2 emissions: 0.81 kg CO2 (gasoline: 2.27 kg CO2 / liter)
![Page 16: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/16.jpg)
During one year (15 people)…

Opportunity loss for one year:
- Speed: 15 * 0.8 GFlops = 12 GFlops
- Computation: 12 GFlops * 1 y = 378 432 000 billion floating-point operations

Costs for one year:
- Power consumption: 15 * 300 W = 4500 W during one year = 39.42 MWh
- Money: 39.42 MWh => 7 884 CHF (525.6 CHF per head)
- Oil needed: 3153.6 liter (gasoline: 12.3 kWh/kg)
- CO2 emissions: 7 t CO2 (gasoline: 2.27 kg CO2 / liter)
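
The arithmetic on these two slides can be reproduced with a short script. The constants are the slide's own figures; the function name `idle_cost` and the dictionary keys are hypothetical. Like the slide, the sketch treats a kilogram and a liter of gasoline as interchangeable, which is a simplification.

```python
PEOPLE = 15
GFLOPS_PER_PC = 0.8       # slide figure
WATTS_PER_PC = 300        # slide figure
CHF_PER_KWH = 0.20        # slide figure
KWH_PER_FUEL_UNIT = 12.3  # slide figure for gasoline (stated per kg)
KG_CO2_PER_LITER = 2.27   # slide figure


def idle_cost(hours: float) -> dict:
    """Opportunity loss and running costs of PEOPLE idle PCs over `hours`."""
    flop = PEOPLE * GFLOPS_PER_PC * 1e9 * hours * 3600
    kwh = PEOPLE * WATTS_PER_PC * hours / 1000
    fuel = kwh / KWH_PER_FUEL_UNIT
    return {
        "billion_flop": flop / 1e9,
        "kwh": kwh,
        "chf": kwh * CHF_PER_KWH,
        "fuel": fuel,
        "kg_co2": fuel * KG_CO2_PER_LITER,
    }


hour = idle_cost(1)
year = idle_cost(24 * 365)

for label, result in (("one hour", hour), ("one year", year)):
    print(label, {k: round(v, 2) for k, v in result.items()})
```

The computation, energy and money figures match the slides. The fuel and CO2 figures come out slightly higher, because the slides round the hourly fuel value down to 0.36 and then scale that rounded value by 8760 hours to get 3153.6.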
![Page 17: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/17.jpg)
Distributed Computation: Usage & Exploitation
- Best example: SETI@home (Search for Extra-Terrestrial Intelligence)
  - Partitions a stream of digitized radio telescope data into 107-second work units, each about 350 KB, and distributes them to client computers
  - Each work unit is redundantly distributed to 3-4 users, to guard against errors and bad nodes
  - Coordination work is handled by a single server
  - 3.91 million PCs had participated by 2002
  - In one year they processed 221 million work units, delivering an average of 27.36 teraflops of computing power
- Need for Grid: Blue Brain
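
The work-unit scheme can be sketched in a few lines. The 107-second unit length is from the slide; accepting a result by strict majority vote is an assumption, since the slide only says that units are distributed redundantly, and the function names are hypothetical.

```python
from collections import Counter

UNIT_SECONDS = 107  # work-unit length from the slide


def split_into_units(total_seconds: int) -> list:
    """Return (start, end) second offsets of consecutive work units."""
    return [(s, min(s + UNIT_SECONDS, total_seconds))
            for s in range(0, total_seconds, UNIT_SECONDS)]


def accept_result(results: list):
    """Accept the value reported by a strict majority of clients, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


units = split_into_units(1000)
print(len(units), units[-1])

# Three clients processed the same unit; one returned a corrupt answer.
print(accept_result(["no-signal", "no-signal", "corrupt"]))
# No agreement at all: the unit would have to be sent out again.
print(accept_result(["a", "b", "c"]))
```

Sending each unit to three or four clients costs extra computation, but it lets the single coordinating server detect a faulty or malicious node without trusting any individual client.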
![Page 18: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/18.jpg)
Discussion Question: Computer or Infomachine?
The first computers were used primarily for computations. One early use was calculating ballistic tables for the U.S. Navy during World War II.
Today, computers are used more for sharing information than computations— perhaps infomachine may be a more accurate name than computer?
Distributed computation may be better suited to Grid and peer-to-peer systems while information tends to be hierarchical and may be better suited to client/server.
![Page 19: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/19.jpg)
Current Peer-Peer Concerns
Topics listed in the IEEE 9th annual conference:
![Page 20: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/20.jpg)
Dangers and Attacks on P2P
- Poisoning (files with contents different from their description)
- Polluting (inserting bad packets into the files)
- Defection (users use the service without sharing)
- Insertion of viruses (attached to other files)
- Malware (originally attached to the files)
- Denial of Service (slow down or stop the network traffic)
- Filtering (some networks don't allow P2P traffic)
- Identity attacks (tracking down users and disturbing them)
- Spam (sending unsolicited information)
![Page 21: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/21.jpg)
Where are we?
- Introduction
- Napster and its legacy (self study)
- Peer-to-Peer middleware (self study)
- Routing overlays
- Overlay case studies: Pastry, Tapestry
- Application case studies: Squirrel, OceanStore, Ivy
- Summary
Discussion date: 6th January
First four and last two pages only
![Page 22: Tuesday, January 27, 2009](https://reader035.vdocuments.net/reader035/viewer/2022062309/56815209550346895dc04c51/html5/thumbnails/22.jpg)
Reading Assignment
Reading:
- Napster and its legacy
- Peer-to-Peer middleware

Discussion date: 10th February (first 8 pages, conclusion and future work)