A Measurement Study of Peer-to-Peer File Sharing Systems
Stefan Saroiu, P. Krishna Gummadi, Steven D. Gribble
Presented by Zhengxiang Pan
March 18th, 2003
Introduction
• Napster & Gnutella
• Population of users
• Bottleneck bandwidth of hosts & latencies
• Duration of time peers remain connected
• Number of files shared & downloaded
Methodology-architecture
• Napster’s architecture
– A cluster of central servers
– Each peer connects to one server
– Servers cooperate to process queries
• Gnutella’s architecture
– No centralized servers
– Peers form an overlay network
– Queries are sent by a controlled flood
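The controlled flood can be illustrated with a minimal sketch. It assumes the overlay is an adjacency map and that each hop decrements a time-to-live (TTL) counter; the function name `flood_query` and the five-peer overlay are hypothetical, not taken from the study.

```python
from collections import deque


def flood_query(overlay, origin, ttl):
    """Return the set of peers reached by a TTL-limited flood from origin."""
    reached = {origin}
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbor in overlay[peer]:
            if neighbor not in reached:  # a peer handles each query only once
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached


# Hypothetical five-peer overlay, for illustration only.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

reached_ttl1 = flood_query(overlay, "a", 1)
reached_ttl3 = flood_query(overlay, "a", 3)
```

With a TTL of 1 the query reaches only the direct neighbors of the origin; a TTL of 3 covers this whole toy overlay.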
Methodology-crawler
• Napster crawler
– A large number of connections to a single server
– Issue popular queries in parallel
– Captured 40%-60% of local users
• Gnutella crawler
– Iteratively send ping messages with large TTLs
– Discover new hosts by receiving pong messages
– Captured 25%-50% of the total population
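The iterative ping/pong discovery used by the Gnutella crawler can be sketched roughly as follows. This is a simulation under stated assumptions, not the authors' crawler: `ping` is a hypothetical callable that returns the hosts named in the pong replies, and the `PONGS` table stands in for the network.

```python
def crawl(seed_hosts, ping):
    """Iteratively ping known hosts and collect new hosts from their pongs."""
    discovered = set(seed_hosts)
    pending = list(seed_hosts)
    while pending:
        host = pending.pop()
        for peer in ping(host):  # hosts advertised in the pong replies
            if peer not in discovered:
                discovered.add(peer)
                pending.append(peer)
    return discovered


# Hypothetical pong table: which hosts each pinged host reveals.
PONGS = {
    "h1": ["h2", "h3"],
    "h2": ["h1", "h4"],
    "h3": [],
    "h4": ["h2"],
    "h5": ["h1"],  # h5 knows h1, but no pong ever mentions h5
}


def simulated_ping(host):
    return PONGS.get(host, [])


found = crawl(["h1"], simulated_ping)
```

In this toy run `h5` is never discovered because no pong advertises it, which illustrates why a crawl of this kind captures only part of the population.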
Methodology-directly measured characteristics
• Latency
– Measure the round-trip time of a 40-byte TCP packet exchanged with the peer
• Lifetime
– Offline: does not respond to TCP SYN packets
– Inactive: responds with TCP RST
– Active: accepts the connection
• Bottleneck bandwidth
– Used as an approximation of available bandwidth
– Actively measure upstream and downstream bandwidth using a few TCP packets
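The three lifetime states map naturally onto the outcome of a TCP connection attempt. The sketch below is one way to express that mapping with Python's standard `socket` module; it is not the tool used in the study. The names `classify` and `probe`, the 3-second timeout, and the choice to treat every failure other than a refused connection as "offline" are assumptions.

```python
import errno
import socket


def classify(error_code):
    """Map the result of a TCP connect attempt to a lifetime state."""
    if error_code == 0:
        return "active"      # handshake completed: connection accepted
    if error_code == errno.ECONNREFUSED:
        return "inactive"    # host answered the SYN with a RST
    return "offline"         # assumption: timeouts and other errors count as no response


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and classify the peer (timeout is an assumed value)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            return classify(sock.connect_ex((host, port)))
        except OSError:
            return "offline"  # e.g. the host name could not be resolved
```

Calling `probe(host, port)` periodically for each discovered peer would yield a time series of states from which uptime and session duration could be estimated.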
Results-bandwidth
Downstream & upstream bottleneck bandwidth
- 50% in Napster & 60% in Gnutella use broadband connections
- 25% in Napster & 8% in Gnutella use modems
- 20% in Napster & 30% in Gnutella have high bandwidth (>3Mbps)
Results-reported bandwidth
22% of Napster users report “unknown” bandwidth
Results-latency
Latencies for Gnutella users
- Unstructured, ad-hoc overlay: a substantial fraction of peers suffer from high latency
- Noticeable differences for trans-oceanic peers
Results-availability
- Only 20% of peers had an IP-level uptime of 93% or more
- Median session duration: 60 minutes
Results-files
- 25% of Gnutella peers do not share any files
- 40%-60% of peers share 5%-20% of the shared files
Results-download & upload
The percentage of peers in each bandwidth class is roughly the same as the percentage of files shared by that bandwidth class.
Results-cooperation
- 30% of the users that report their bandwidth as 64 Kbps or less actually have a significantly greater bandwidth.
- 10% of the users reporting high bandwidth (3Mbps or higher) in reality have significantly lower bandwidth.
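A mismatch between reported and measured bandwidth can be flagged with a simple comparison. The sketch below is illustrative only: the function name `check_report` and the factor of 2 used to decide what counts as "significantly" different are assumptions, not thresholds from the study.

```python
def check_report(reported_kbps, measured_kbps, factor=2.0):
    """Compare a peer's reported bandwidth with a measured value.

    The factor defining a significant mismatch is an assumed threshold.
    """
    if measured_kbps > reported_kbps * factor:
        return "under-reported"
    if measured_kbps * factor < reported_kbps:
        return "over-reported"
    return "consistent"


# Hypothetical peers: (reported Kbps, measured Kbps).
modem_claim = check_report(64, 1500)
fast_claim = check_report(3000, 500)
honest_claim = check_report(1000, 900)
```

A system that verifies reports in this way could fall back on the measured value whenever the two disagree.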
Results-resilience of Gnutella overlay
Although highly resilient in the face of random breakdowns, Gnutella is nevertheless highly vulnerable in the face of well-orchestrated, targeted attacks.
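The contrast between random breakdowns and targeted attacks can be illustrated on a toy overlay. The topology below (two well-connected hubs, each with four single-link peers) is hypothetical and is not measured Gnutella data; the sketch compares the largest connected component after removing two leaf peers with the result of removing the two highest-degree peers.

```python
def build_overlay(edges):
    """Build an undirected adjacency map from a list of (a, b) links."""
    overlay = {}
    for a, b in edges:
        overlay.setdefault(a, set()).add(b)
        overlay.setdefault(b, set()).add(a)
    return overlay


def largest_component(overlay, removed):
    """Size of the largest connected component once `removed` peers are gone."""
    alive = set(overlay) - set(removed)
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in overlay[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical overlay: hubs H1 and H2 are linked, each serving four leaves.
edges = [("H1", "H2")]
edges += [("H1", leaf) for leaf in "abcd"]
edges += [("H2", leaf) for leaf in "efgh"]
overlay = build_overlay(edges)

# Targeted attack: remove the two highest-degree peers.
targeted = sorted(overlay, key=lambda n: len(overlay[n]), reverse=True)[:2]

after_leaf_failures = largest_component(overlay, ["a", "e"])
after_targeted_attack = largest_component(overlay, targeted)
```

Losing two leaf peers leaves the remaining eight peers connected, while removing the two hubs splits this toy overlay into isolated peers.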
Conclusion
• Heterogeneity of hosts
– Carefully delegate responsibilities
• Clear evidence of client-like and server-like behaviors
• Peers tend to misreport information if there is an incentive to do so
– Built-in incentives for telling the truth
– Verify reported information