Peer-to-Peer
By
Rui Zhang, Chen Teng, Li Dong,
Quanshuan He & Yongzheng Zhang
Overview
• What is peer-to-peer
• Application
• Advantages and Disadvantages
• Case Study (Gnutella)
• Conclusion
• References
What is peer-to-peer
• Why P2P?
• Circumstance
• Concept
• Landscape
Why P2P?
• Growing application demand
• Good Suitability
• Update immediately
Circumstance
• Napster – focus awareness
• One of the important foundations of Internet services
• Part of the application process of other architectures
• Scale of peer-to-peer protocol
Circumstance (cont.)
• Statistics:
• Home users: 79.5 million
• Work users: 1.2 million
• Simultaneous users: 640,000
• Downloads in September alone: 1.4 billion
What is Peer-to-Peer?
• Traditional Peer-to-Peer: "A type of network in which each workstation has equivalent capabilities and responsibilities. This differs from client/server architectures, in which some computers are dedicated to serving the others."
What is Peer-to-Peer? (cont.)
• P2P Architecture Today, key characteristics:
• interfaces running outside of a web browser
• both clients and servers
• easy to use and well-integrated
• content creation or functionality addition support
• connections provide
• something new!
• "cross-network" protocols support(SOAP or XML-RPC)
Landscape
• P2P Distributed Computing
• P2P Affinity Communities
• Peertailing
Key features of P2P application
• Discovering other peers
• Querying peers for content
• Sharing content with other peers
Different types of P2P application
• Pure P2P
• P2P with a Simple Discovery Server
• P2P with a Discovery and Lookup Server
Pure P2P
• No central server
• How to discover peers
• Uses information from local configuration scheme
• Employs network broadcasting and discovery techniques
• Limits the application’s reach
Pure P2P (cont.)
Diagram: Peer 1, Peer 2, Peer 3
(1) Content query
(2) File transfer
Pure P2P (cont.)
Diagram: Peer 1, Peer 2, Peer 3
(1) Content query
(2) Content query
(3) Response
(4) Response
(5) Connect and file transfer
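The query forwarding in the pure P2P diagrams can be sketched as a hop-limited search over a neighbor graph. This is only an illustration: the names `flood_query`, `neighbors`, `shared`, and `max_hops` are hypothetical, and the peer graph is held in memory instead of on a network.

```python
from collections import deque

def flood_query(neighbors, shared, start, wanted, max_hops=2):
    """Return peers within max_hops of start that share the wanted file."""
    seen = {start}                  # peers that already received the query
    queue = deque([(start, 0)])     # (peer, hops travelled so far)
    hits = []
    while queue:
        peer, hops = queue.popleft()
        if peer != start and wanted in shared.get(peer, set()):
            hits.append(peer)       # this peer would send a response
        if hops == max_hops:
            continue                # the query is not forwarded any further
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return hits

# Peer 1 knows only Peer 2; Peer 2 also knows Peer 3, which has the file.
neighbors = {"peer1": ["peer2"], "peer2": ["peer1", "peer3"], "peer3": ["peer2"]}
shared = {"peer3": {"blue.mp3"}}
hits = flood_query(neighbors, shared, "peer1", "blue.mp3")
```

After the responses come back, Peer 1 would connect directly to a peer in `hits` for the file transfer.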
P2P with a Simple Discovery Server
• Notifies central server of its existence at startup time
• Uses central server to download a list of other peers
• Goes through the list and contacts each peer individually with its request
P2P with a Simple Discovery Server (cont.)
Diagram: Peer 1, Peer 2, Peer 3, Server
(1) Log in
(2) Peer list
(3) Content query (to each listed peer)
(4) File transfer
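A minimal sketch of the simple discovery server flow, assuming an in-memory server. The class `DiscoveryServer` and the helper `find_content` are hypothetical names used for illustration only.

```python
class DiscoveryServer:
    """Keeps only a list of known peers; it knows nothing about content."""

    def __init__(self):
        self.peers = set()

    def log_in(self, peer_id):
        # Steps (1) and (2): register the peer and return the other peers.
        others = sorted(self.peers - {peer_id})
        self.peers.add(peer_id)
        return others

def find_content(peer_list, shared, wanted):
    # Step (3): the peer itself asks every listed peer, one by one.
    return [p for p in peer_list if wanted in shared.get(p, set())]

server = DiscoveryServer()
server.log_in("peer2")
server.log_in("peer3")
peer_list = server.log_in("peer1")
shared = {"peer2": {"red.mp3"}, "peer3": {"blue.mp3"}}
holders = find_content(peer_list, shared, "blue.mp3")
```

The number of content queries grows with the size of the peer list, which is the cost the lookup server variant tries to reduce.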
P2P with a Discovery and Lookup Server
• Server includes both discovery and content lookup services
• The peer application registers with a discovery server and uploads a list of its contents at regular intervals
• Queries central server for particular content
• Reduces the number of queries
P2P with a Discovery and Lookup Server (cont.)
Diagram: Peer 1, Peer 2, Peer 3, Server
(1) Tell server which content it wants
(2) List of peers which have the requested content
(3) Content query
(4) File transfer
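The lookup variant can be sketched as a server-side index from content name to peers. `LookupServer`, `register`, and `lookup` are hypothetical names, and the in-memory dictionary is an assumption made for illustration.

```python
from collections import defaultdict

class LookupServer:
    """Discovery plus content lookup: maps content names to peers."""

    def __init__(self):
        self.index = defaultdict(set)   # content name -> set of peer ids

    def register(self, peer_id, contents):
        # A peer re-uploads its content list at regular intervals,
        # so drop its old entries before adding the new ones.
        for holders in self.index.values():
            holders.discard(peer_id)
        for name in contents:
            self.index[name].add(peer_id)

    def lookup(self, name):
        # Steps (1) and (2): return only the peers that have the content.
        return sorted(self.index.get(name, set()))

server = LookupServer()
server.register("peer2", ["red.mp3"])
server.register("peer3", ["blue.mp3", "red.mp3"])
holders = server.lookup("blue.mp3")
```

The requesting peer then contacts only the peers in `holders`, instead of every peer on the network.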
When Using Napster
• You need:
• A copy of the Napster utility installed
• A directory on your computer that has been shared
• Some type of Internet connection
When Using Napster(cont.)
• The provider of the song needs:
• A copy of the Napster utility installed
• A shared directory on their computer
• Some type of Internet connection that is currently on
• A copy of the song you are looking for in the designated shared directory
The Napster Network
P2P advantages & disadvantages
P2P advantages
• Low cost
• Sharing individual resources
---- data resources (Napster)
---- hardware (SETI)
• Administration
• Highly fault-tolerant
• Real time updating (online auction)
P2P disadvantages
• The limited access number
• Availability
• Hard to predict the consequences of failure
• Bandwidth consumption
• Security problem
Security Problem
• Why client/server is more secure
--------- centralized resource
--------- centralized administration
--------- system integrity
• Why P2P is less secure
--------- non-specialist users
--------- vendors
--------- authentication information
--------- disclosure of IP and MAC addresses
--------- virus distribution
Security Problem (cont.)
Security problem ---- possible solutions
• Limit and restrict access number
----- validate certificate
----- obtain certificate
----- caching data (FreeNet)
Case Study
Gnutella
Concepts
1. Introduction for Gnutella
2. Gnutella & Firewalls
3. Security Considerations for Gnutella Users
4. Gnutella Protocol Information
5. Limitations and Risks for Gnutella
Gnutella basics
(1) An open, decentralized, peer-to-peer search system. It is a name for a technology.
(2) The Gnutella protocol and original servent ("Gnutella 0.56") were conceived and developed by Justin Frankel and Tom Pepper at Nullsoft in March 2000.
(3) Each piece of Gnutella software is a servent (SERVer+cliENT): both a server and a client in one.
Gnutella
1. Gnutella Is File Sharing.
2. Gnutella Is Anonymous.
3. Gnutella Is The Game: Telephone.
4. Gnutella Is Designed to Survive Nuclear War.
5. Gnutella Can Withstand A Band of Hungry Lawyers.
How Gnutella retrieves information
Gnutella & Firewalls
1. With a firewall, there are some problems for Gnutella when making a request for a file.
2. To compensate for this, Gnutella's designers came up with the "push request".
How Gnutella handles firewalls
Internet Security Considerations
1. IP Address Advertising
2. Connection Acceptance
IP Address Advertising
1. Peering networks dynamically collect, distribute, and broadcast the IP addresses of their active peers.
2. Malicious hackers now use special "IP Address Harvesters" to collect the Internet addresses of active, online, peering clients and servers, then target them by their IP addresses for direct attack.
Connection Acceptance
1. The typical personal computer never needs to accept unknown connections.
2. Users of peering services such as Gnutella do accept connections from other unknown machines and are therefore temporarily acting as Internet servers which are similarly vulnerable to direct attacks.
What Can You Do?
1. Take responsibility and get yourself informed!
2. Get your Shields UP!
3. Add a free Firewall!
4. Ignore the IBR (Internet Background Radiation)!
5. Tell Your Friends!
Gnutella Protocol Information
General Description
• Works by “Viral Propagation”
• Inordinate amounts of traffic
• In reality, it isn’t so bad! (Horizon 10000)
• Uses GUID to identify each message
• Each servent maintains a short memory of GUIDs it has seen
• http
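The "short memory of GUIDs" can be sketched as a bounded, oldest-first cache. The class name `SeenGuids`, the capacity value, and the eviction policy are assumptions for illustration; the slides do not say how long a servent remembers a GUID.

```python
import uuid
from collections import OrderedDict

class SeenGuids:
    """Remembers the most recent GUIDs so repeated messages can be dropped."""

    def __init__(self, capacity=1000):      # capacity is an assumed value
        self.capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, guid):
        """Return True if guid is new, False if it was seen recently."""
        if guid in self._seen:
            return False
        self._seen[guid] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # forget the oldest GUID
        return True

def new_guid():
    # 16 random bytes, matching the 16-byte message identifier field.
    return uuid.uuid4().bytes

memory = SeenGuids(capacity=2)
g1, g2, g3 = new_guid(), new_guid(), new_guid()
results = [memory.first_time(g) for g in (g1, g1, g2, g3, g1)]
```

In this run the second `g1` is reported as a repeat, while the last `g1` counts as new again because it was evicted when `g3` arrived.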
Connecting to a Servent
• Connect to another Gnutella servent by sending: GNUTELLA CONNECT/0.4\n\n
• The accepting servent responds: GNUTELLA OK\n\n
• After that, it's all data.
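A sketch of the connect exchange over an already-open socket, using only the two strings shown above. The helper names `recv_exact`, `initiate`, and `accept` are hypothetical, and error handling is kept to a minimum.

```python
import socket

CONNECT = b"GNUTELLA CONNECT/0.4\n\n"
OK = b"GNUTELLA OK\n\n"

def recv_exact(sock, n):
    """Read exactly n bytes from sock or raise if the peer closes early."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed during handshake")
        data += chunk
    return data

def initiate(sock):
    """Connecting side: send the connect string, then expect the OK reply."""
    sock.sendall(CONNECT)
    return recv_exact(sock, len(OK)) == OK

def accept(sock):
    """Accepting side: expect the connect string, then answer with OK."""
    if recv_exact(sock, len(CONNECT)) != CONNECT:
        return False
    sock.sendall(OK)
    return True
```

Once both functions return True, everything else on the connection is binary message data.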
Gnutella Messages
• Data passed on the Gnutella network are called "messages" (Header + Payload):
1. PING request
2. PONG reply
3. Query (Search Request)
4. Query Hit (Search Reply)
5. PUSH request
Header Format
Bytes   Summary              Description
0-15    Message Identifier   GUID, used to identify each particular message
16      Payload Descriptor   Function identifier (values listed below)
17      TTL                  Time To Live (hops left before dropped)
18      Hops                 Number of hops this message has taken
19-22   Payload Length       The length of the data which follows the header
Payload Descriptor values
Value   Function
0x00    Ping
0x01    Pong
0x40    Push Request
0x80    Query
0x81    Query Hit
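The 23-byte header can be packed and unpacked with Python's `struct` module. The slides do not state a byte order, so little-endian for the payload length is an assumption here; `pack_header` and `unpack_header` are hypothetical helper names.

```python
import struct

# 16-byte GUID, function, TTL, hops (1 byte each), 4-byte payload length.
# ASSUMPTION: little-endian ("<"); the slides do not give the byte order.
HEADER = struct.Struct("<16sBBBI")

FUNCTIONS = {0x00: "Ping", 0x01: "Pong", 0x40: "Push Request",
             0x80: "Query", 0x81: "Query Hit"}

def pack_header(guid, function, ttl, hops, payload_length):
    return HEADER.pack(guid, function, ttl, hops, payload_length)

def unpack_header(data):
    guid, function, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "guid": guid,
        "function": FUNCTIONS.get(function, "unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }

raw = pack_header(bytes(range(16)), 0x80, 7, 0, 12)
header = unpack_header(raw)
```

`HEADER.size` is 23, which matches bytes 0 through 22 in the table.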
Ping (function 0x00)
• No payload
• Servent sends/forwards PING message to all connected servents
Pong (function 0x01)
• Payload
Bytes   Summary          Description
0-1     Port number      Port number of responding host
2-5     IP address       IPv4 address of responding host
6-9     # of files       Number of total files shared
10-13   # of kilobytes   Size of total files shared
• Routing Instruction: Servent sends/forwards Pong message back along the path its Ping came from.
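A sketch of encoding and decoding the 14-byte Pong payload. Little-endian integers and an IP address stored as four bytes in dotted order are assumptions, since the slides do not specify byte order; `pack_pong` and `unpack_pong` are hypothetical names.

```python
import socket
import struct

# port (2 bytes), IPv4 address (4 bytes), file count (4), kilobytes (4).
# ASSUMPTION: little-endian integers; the slides do not give the byte order.
PONG = struct.Struct("<H4sII")

def pack_pong(port, ip, files, kilobytes):
    return PONG.pack(port, socket.inet_aton(ip), files, kilobytes)

def unpack_pong(payload):
    port, raw_ip, files, kilobytes = PONG.unpack(payload)
    return port, socket.inet_ntoa(raw_ip), files, kilobytes

payload = pack_pong(6346, "192.168.0.10", 120, 450000)
pong = unpack_pong(payload)
```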
Query (function 0x80)
• Payload
Bytes   Summary           Description
0-1     Minimum Speed     The minimum speed, in kilobytes/sec, of responding hosts
2+      Search Criteria   Search keywords or other criteria. NULL terminated.
• Routing Instruction: Servent sends/forwards Query message to all connected servents.
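The Query payload is a 2-byte minimum speed followed by a NULL-terminated search string. In this sketch the little-endian byte order and the ASCII encoding are assumptions, and `pack_query` and `unpack_query` are hypothetical names.

```python
import struct

def pack_query(min_speed, criteria):
    # ASSUMPTION: little-endian speed field and ASCII search text.
    return struct.pack("<H", min_speed) + criteria.encode("ascii") + b"\x00"

def unpack_query(payload):
    (min_speed,) = struct.unpack_from("<H", payload, 0)
    # Everything after the first two bytes, up to the NULL terminator.
    criteria = payload[2:].split(b"\x00", 1)[0].decode("ascii")
    return min_speed, criteria

payload = pack_query(0, "blue mp3")
query = unpack_query(payload)
```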
Query Hit (function 0x81)
• Payload
Bytes     Summary              Description
0         # of hits (N)        Number of hits in the result set following this header
1-2       Port                 Port number of responding host
3-6       IP address           IPv4 address of responding host
7-10      Speed                Speed of responding host, in kilobits/s
11+       Result Set           N result entries (format below)
last 16   Servent Identifier   GUID of responding host, used in PUSH
Result Set entry
Bytes     Summary              Description
0-3       Index                Index number of file
4-7       Size                 Size of file in bytes
8+        File Name            Terminated by a double NULL
Query Hit (cont.)
• Routing Instruction: Servent sends/forwards QueryHit message back along the path its Query came from.
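A sketch of walking the Query Hit payload: a fixed 11-byte head, N variable-length result entries, and a trailing 16-byte servent identifier. Byte order (little-endian) and ASCII file names are assumptions, and the function names are hypothetical.

```python
import socket
import struct

# hit count (1 byte), port (2), IPv4 address (4), speed (4) = 11 bytes.
# ASSUMPTION: little-endian integers; the slides do not give the byte order.
HIT_HEAD = struct.Struct("<BH4sI")

def pack_query_hit(port, ip, speed, results, servent_id):
    body = HIT_HEAD.pack(len(results), port, socket.inet_aton(ip), speed)
    for index, size, name in results:
        body += struct.pack("<II", index, size)
        body += name.encode("ascii") + b"\x00\x00"   # double NULL terminator
    return body + servent_id

def unpack_query_hit(payload):
    count, port, raw_ip, speed = HIT_HEAD.unpack_from(payload, 0)
    pos = HIT_HEAD.size
    results = []
    for _ in range(count):
        index, size = struct.unpack_from("<II", payload, pos)
        end = payload.index(b"\x00\x00", pos + 8)    # end of the file name
        results.append((index, size, payload[pos + 8:end].decode("ascii")))
        pos = end + 2
    return port, socket.inet_ntoa(raw_ip), speed, results, payload[-16:]

results = [(1234, 1624, "blue.mp3"), (7, 10, "a.txt")]
servent_id = bytes(range(16))
payload = pack_query_hit(6346, "10.0.0.5", 56, results, servent_id)
hit = unpack_query_hit(payload)
```

The file index and servent identifier recovered here are the values a Push Request would later carry.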
Push Request (function 0x40)
• Payload
Bytes   Summary              Description
0-15    Servent Identifier   GUID of the servent which should push
16-19   Index                Index number of file (given in query hit)
20-23   IP address           IPv4 address of servent to push to
24-25   Port number          Port number of servent to push to
• Routing Instruction
• Used when trying to download a file from a servent behind a firewall
• A Push message is sent along the path on which the Query Hit was delivered.
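The 26-byte Push Request payload can be sketched the same way as the other fixed-size payloads. Little-endian integers are again an assumption, and `pack_push` and `unpack_push` are hypothetical names.

```python
import socket
import struct

# servent GUID (16 bytes), file index (4), IPv4 address (4), port (2).
# ASSUMPTION: little-endian integers; the slides do not give the byte order.
PUSH = struct.Struct("<16sI4sH")

def pack_push(servent_id, index, ip, port):
    return PUSH.pack(servent_id, index, socket.inet_aton(ip), port)

def unpack_push(payload):
    servent_id, index, raw_ip, port = PUSH.unpack(payload)
    return servent_id, index, socket.inet_ntoa(raw_ip), port

servent_id = bytes(range(16))
payload = pack_push(servent_id, 1234, "10.0.0.9", 6346)
push = unpack_push(payload)
```

The IP address and port identify the requester, so the firewalled servent can open the connection outward and send the file.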
Routing Examples
Diagram: nodes A, B, C, D
• Imagine yourself as node 1. You have direct (physical socket) connections to nodes 2, 3, 4, and 5. You have reachable hosts at nodes 6 through 13.
1. You get a Ping from 2 with GUID of x.
2. Look up the message in your routing table [message, socket].
3. Not there? Save [message x, socket 2] in the routing table.
4. Respond with a Pong (GUID x) to node 2.
5. Forward this Ping to nodes 3, 4, and 5 (not 2!!).
6. Node 3 will respond with Pong (GUID x) to you.
7. Record [message x, socket 3] in the routing table, then find the entry [message x, socket 2], so forward this Pong to node 2.
8. Do the same thing with responses from 4 and 5.
9. Since nodes 3 through 5 will also pass the Ping on to nodes 8 through 13, you'll get Pongs from them too.
10. Node 3 is connected to 10, which is connected to 4, and 4 is connected to you! Node 4 will also send a Pong message along the path 410931. You look up in your routing table and find [message x, socket 4] is already there! You drop the message and do not forward it to anyone!
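The routing steps can be sketched with a table keyed by GUID. This is a simplification under stated assumptions: it records only the socket a Ping first arrived on (not Pong arrivals), and it returns a list of actions instead of sending anything. `Node`, `on_ping`, and `on_pong` are hypothetical names.

```python
class Node:
    """One servent with direct connections to a few neighbors."""

    def __init__(self, name, neighbors):
        self.name = name
        self.neighbors = list(neighbors)
        self.routes = {}    # GUID -> neighbor the Ping first arrived from

    def on_ping(self, guid, sender):
        if guid in self.routes:
            return []                           # seen before: drop it
        self.routes[guid] = sender              # remember the way back
        actions = [("pong", guid, sender)]      # answer the sender
        actions += [("ping", guid, n) for n in self.neighbors if n != sender]
        return actions

    def on_pong(self, guid, sender):
        back = self.routes.get(guid)
        if back is None:
            return []                           # no matching Ping: drop it
        return [("pong", guid, back)]           # relay along the Ping's path

node1 = Node(1, [2, 3, 4, 5])
first = node1.on_ping("x", 2)       # Ping x arrives from node 2
relayed = node1.on_pong("x", 3)     # node 3 answers with a Pong
repeat = node1.on_ping("x", 4)      # the same Ping comes around via node 4
```

Here `first` holds a Pong for node 2 plus Pings for nodes 3, 4, and 5, `relayed` sends node 3's Pong back to node 2, and `repeat` is empty because GUID x is already in the table.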
Downloading File
• The servent requests the file using HTTP:
GET /get/1234/blue.mp3 HTTP/1.0\r\n
Connection: Keep-Alive\r\n
User-Agent: Gnutella\r\n
Range: bytes=0-\r\n
\r\n
• The servent will respond with normal HTTP headers, e.g.:
HTTP 200 OK\r\n
Server: Gnutella\r\n
Content-type: application/binary\r\n
Content-length: 1624\r\n
\r\n
• Supports the range parameter to resume partial downloads
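A sketch of building the download request and reading the response headers as plain bytes, without opening a connection. The helper names `build_get` and `parse_headers` are hypothetical; the `start` argument shows how the Range header could be used to resume a partial download.

```python
def build_get(index, filename, start=0):
    """Build the request shown above; start > 0 resumes a partial download."""
    lines = [
        f"GET /get/{index}/{filename} HTTP/1.0",
        "Connection: Keep-Alive",
        "User-Agent: Gnutella",
        f"Range: bytes={start}-",
        "",
        "",     # two empty items give the final blank line (\r\n\r\n)
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_headers(raw):
    """Split a response into its status line and a dict of header fields."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    status, *fields = head.split("\r\n")
    headers = {}
    for field in fields:
        key, _, value = field.partition(":")
        headers[key.strip().lower()] = value.strip()
    return status, headers

request = build_get(1234, "blue.mp3")
resume = build_get(1234, "blue.mp3", start=800)
status, headers = parse_headers(b"HTTP 200 OK\r\nContent-length: 1624\r\n\r\n")
```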
Topology Summary
• The Gnutella network has no hierarchy, i.e. every servent is equal.
• Some servents contribute more than others.
• Gnutella network is not a tree and it is cyclic.
• Gnutella is barely HTTP.
Limitations and Risks
• Problem in scaling (not a tree)
• TTL imposes a horizon (10000) on each user
• Hackers misuse Gnutella for other purposes
• Difficulty in authenticating the source of the data returned
Conclusion
• Peer to peer is now being recognized as the computing paradigm of the future.
References
• http://www.peer-to-peerwg.org
• http://www.gnutellanet.com
• http://www.gnutellanews.com
• http://gnutella.wego.com
• http://www.limewire.com/glossary.htm