Peer-to-Peer (P2P)
Zhenxiang Chen, Network Center of Jinan University

Table of Contents
• Background
• Definitions
• P2P based applications
• P2P structure
• Challenges
• Platform and tools
• Conclusion
• References and workgroup

Background

What is a peer
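[Figure: Internet topology. Labels: intercontinental Internet backbone, Internet backbone, regional networks, ISPs, local ISP, enterprise network provider, specialist provider, T1 line, firewall, web server, application server, database, third-party content, corporate users and corporate network, consumer users. Three peers are marked at the edge of the network.]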

What is an overlay network
• Overlay networks create a structured virtual topology above the basic transport protocol level that facilitates deterministic search and guarantees convergence.
[Figure: overlay links layered on top of IP links.]

Application Layer Network
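• Application Layer Network
• Overlay Network
• Network: defines the addressing scheme, routing scheme, and service model for communication between hosts
• Builds a network system that lives entirely at the application layer, on top of the existing Internet transport network
• Topology discovery, routing, and similar functions are handled entirely by the application layer, without relying on the network layer
• Large-scale distributed applications based on the Internet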

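Features of Application Layer Networks
• Advantages:
  • Easy to deploy; does not depend on upgrading network equipment
  • Good scalability
• Disadvantages:
  • Adds complexity and processing overhead
  • Cannot use the optimal route, which increases latency
  • Breaks the layered model of the network
• Routing is "selfish" (AS/bandwidth/get and offered)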

Other Overlay Networks
• Peer-to-Peer systems
• Application layer multicast
• VPN
• Service Overlay Networks
• 6bone
• Content distribution networks
[Figure: 6bone example. IPv6 nodes (6) connected across the IPv4 Internet through dual-stack (6/4) nodes and a 6/4 NAT.]

Network service scale rules
• Sarnoff's law: value scales as $O(n)$. The network is a broadcast medium: one sender (device) and many ($n-1$) receivers (devices).
• Metcalfe's law: value scales as $O(n^2)$. The network is a fully interconnected medium: any device can interact with the other $n-1$ devices, so $n(n-1) = n^2 - n$ transactions can run concurrently.
• Reed's law: value scales as $O(2^n)$. The network is a group-forming medium: it can form $C_n^2 + C_n^3 + \dots + C_n^{n-1} + C_n^n = 2^n - n - 1$ groups.
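As a quick numeric check of the three scale rules, the following minimal Python sketch evaluates each count for a few network sizes and confirms the closed form for Reed's law. The function names are illustrative and not part of the original slides.

```python
from math import comb


def sarnoff(n: int) -> int:
    """Broadcast medium: one sender reaches the other n - 1 devices."""
    return n - 1


def metcalfe(n: int) -> int:
    """Fully interconnected medium: n(n - 1) device-to-device transactions."""
    return n * (n - 1)


def reed(n: int) -> int:
    """Group-forming medium: number of groups with at least two members."""
    return sum(comb(n, k) for k in range(2, n + 1))


for n in (2, 5, 10, 20):
    # The sum of binomial coefficients matches the closed form 2^n - n - 1.
    assert reed(n) == 2**n - n - 1
    print(n, sarnoff(n), metcalfe(n), reed(n))
```

Even at n = 20 the group count is already above one million, while the broadcast count is 19, which is the point of the comparison.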

Problem
• Client-Server and Web architectures are inherently centralized.
• Some problems involve distributed control, distributed data, or a hierarchical organizational structure.
• Fitting a centralized solution to a decentralized problem makes a poor solution.
[Figure: centralized architectures. Client-server: thick clients connected to a server. Web: browsers (thin clients) connected to a web server, a middle tier, and a database server.]

P2P Architecture
• P2P means actors in the system talk directly with each other as equals.
• Can decentralize some or all of the solution.
• Represents distributed or hierarchical information models.
• Moves data and control to where the action is.

Definitions

Definitions of P2P
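• Intel working group: an application model that shares computer resources and services through direct exchange between systems
• R. L. Graham: defined by three key requirements
  • An operational computer of server quality
  • An addressing system independent of DNS
  • The ability to cope with variable connectivity
• C. Shirky:
  • A class of applications that takes advantage of resources (storage, CPU, content, human presence) available at the edges of the Internet
  • Accessing these decentralized resources means operating in an environment of unstable connectivity and unpredictable IP addresses, so P2P nodes must operate outside the DNS system
  • Nodes have significant or total autonomy
• Milojicic et al. (HP): P2P refers to a class of systems and applications that employ distributed resources to perform a critical function in a decentralized manner.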

Controversy
• Is p2p a new approach?
The problems in peer-to-peer systems are neither new nor unique; they make us look for solutions to old problems that we all worked around or tried to ignore before.
Andy Oram (O'Reilly & Associates), speech at the Free and Open Source Software Developers' Meeting, Brussels, BE, Feb. 2002

P2P based applications

Examples of p2p usage
• File-sharing applications
• Distributed databases
• Distributed computing (grid?)
• Collaboration
• Distributed games
• Instant messaging
• Ad hoc networks
• Application-level multicast
• Etc.

Peer-to-Peer Systems

Interesting P2P Applications
• Gnutella for dictionaries (with supernode)
  • Worldwide Lexicon, http://picto.weblogger.com
• Infrastructure for interoperability
  • Edutella (RDF-based Metadata Infrastructure), http://edutella.jxta.org/
• Global-scale storage
  • Oceanstore, http://oceanstore.cs.berkeley.edu/
• Payments (not involving a bank)
  • PayPal (more than 0.4 billion accounts, payment volume 15B/year, profit 230M/year)
  • eCount.com (email payment)

Interesting P2P Applications
• Instant Messaging
  • Jabber, http://www.jabber.org/
  • Skype, http://www.skype.com
• VoIP (good quality and latency, 256-bit encryption, NAT/firewall traversal)
  • Skype, www.skype.com/
• Groupware
  • Groove, http://www.groove.net/
• FOAF (Friends-of-a-Friend)
  • FOAFNaut, http://www.foafnaut.org/
  • Friends Reunited, http://www.friendsreunited.com/
  • Orkut, http://www.orkut.com/
  • Tribe.net, http://www.tribe.net/
• Application-layer multicast
  • PPLive, QQLive

Interesting P2P Applications
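• Virtual supercomputers: peer-to-peer technology yields an unprecedented amount of computing power
• Lets medical researchers speed up improvements to treatments and the design of drugs
• Accelerates new discoveries in cancer research
http://www.stanford.edu/group/pandegroup/Cosm/
http://members.ud.com/vypc/cancer/
Folding@home: protein folding and drug design
SETI@home: the search for extraterrestrial intelligence, http://www.equn.com/seticn/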

P2P Structure

P2P Overlay Network structure
• Unstructured
  • Without prior knowledge of the topology
  • Flooding
  • Freenet, Gnutella, FastTrack/KaZaA, BitTorrent, Overnet/eDonkey
• Structured
  • Topology is tightly controlled
  • DHT (distributed hash table)
  • CAN, Chord, Tapestry, Pastry, Kademlia, Viceroy
• Hybrid

Centralized model (Napster)
• File-sharing system
• Almost distributed system
  • The location of a document is centralized
  • The "transfer" is peer-to-peer
• Problems
  • Robustness
  • Scalability (?)
• Impacts
  • Lawsuits
  • Denial of service
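[Figure: peers register with a location server over the Internet. A peer asks the server "Document x?", the server answers "OK: Peer Z, IP = a.b.c.d", and the peer then sends "Document x!" to Peer Z and receives x directly.]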
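The centralized model can be sketched in a few lines of Python: only the index lives on the server, while the document itself would travel directly between peers. This is a minimal illustration, not Napster's actual protocol; the class name, method names, and peer addresses are assumptions made for the example.

```python
from collections import defaultdict


class LocationServer:
    """Central index: document name -> addresses of peers that hold it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_addr, documents):
        """A peer announces the documents it is willing to serve."""
        for doc in documents:
            self._index[doc].add(peer_addr)

    def unregister(self, peer_addr):
        """Remove a departing peer from every index entry."""
        for holders in self._index.values():
            holders.discard(peer_addr)

    def locate(self, doc):
        """Answer "Document x?" with the known holders (possibly none)."""
        return sorted(self._index.get(doc, ()))


server = LocationServer()
server.register("a.b.c.d", ["x", "y"])   # hypothetical peer Z
server.register("e.f.g.h", ["y"])        # hypothetical second peer
print(server.locate("x"))                # the transfer itself is peer-to-peer
```

The sketch also makes the robustness problem visible: if the single `LocationServer` instance disappears, no document can be located even though every peer still holds its files.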

Non-structured system (Gnutella-like)
• Two phases (like Napster)
  • Localization + exchange
• No server
• Open source
  • gnutella.wego.com
• Distributed search
  • The query is flooded
  • Loop avoidance
  • Limited TTL (not all nodes are visited)
[Figure: a query flooding through the overlay, with messages numbered by step.]
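Flooded search with loop avoidance and a TTL can be sketched as a breadth-first traversal of the overlay. The code below is a simplified model rather than the Gnutella wire protocol: the overlay graph, peer names, and file name are made up for the example, and a peer never sends the query back to the neighbor it came from.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}                        # loop avoidance: handle a query once
    queue = deque([(start, None, ttl)])   # (peer, sender, remaining hops)
    hits, messages = [], 0
    while queue:
        peer, sender, remaining = queue.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue                      # TTL exhausted: do not forward
        for neighbor in graph[peer]:
            if neighbor == sender:
                continue                  # never echo the query back
            messages += 1                 # sent even if the neighbor drops it
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, peer, remaining - 1))
    return hits, messages


# Hypothetical five-peer overlay; D and E hold the wanted file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"D": {"song.mp3"}, "E": {"song.mp3"}}

print(flood_search(graph, files, "A", "song.mp3", ttl=2))
print(flood_search(graph, files, "A", "song.mp3", ttl=3))
```

With a TTL of 2 the query reaches D but not E, which illustrates the bullet "not all nodes are visited"; raising the TTL finds more copies at the cost of more messages.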

Freenet
• Anonymity
• Replication, cache
• Routing
  • Local knowledge
  • Cache
  • TTL limits search

FastTrack/KaZaA
[Figure: peers upload metadata to their supernodes; numbered arrows show the steps of a search.]
Supernodes still use a broadcast protocol for search.

Related work: Skype From the KaZaA community
• Promote to super nodes
  • Peer cache of some super nodes
  • Based on availability, capacity
• Protocol among super nodes: ???
• Other features
  • Auto-detect NAT/firewall settings
  • Allows searching a user (e.g., kun*)
  • History of known buddies
  • All communication is encrypted
  • Conferencing
[Figure: overlay of peers (P).]

BitTorrent
[Figure: a seed, the torrent URL, and the tracker.]
The tracker keeps track of all owners of the file and handles peer lookups.

Why structured?
• Query time, number of messages, network usage, per-node state, etc.
• P2P systems: unstructured vs. structured
  • Data availability: decentralization, scalability, load balancing, fault tolerance
  • Maintenance: join/leave, repair
  • Efficient searching: proximity, locality
• If present => find it
  • Flooding: not scalable
  • Blind search: inefficient
Core facility: DHT (Distributed Hash Table)

General concepts of DHTs
• Every object has a (hash) key
• An object is stored at the node responsible for its key
• Every node maintains a small routing (hash) table consisting of its neighboring nodes
• All DHTs provide one elementary function
  • lookup(key) → node
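The elementary function lookup(key) → node can be illustrated with a small Python sketch. It assumes a Chord-style rule (a key belongs to the first node whose identifier is at or after the key's hash on a ring) and, unlike a real DHT, it keeps a global sorted list of nodes instead of per-node routing tables. The class name, the 16-bit identifier space, and the node names are assumptions for the example.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; deliberately small for readability (assumption)


def ident(name: str) -> int:
    """Hash a node name or object key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class TinyDHT:
    """Global-view model of a DHT: every object lives at its responsible node."""

    def __init__(self, node_names):
        names = list(node_names)
        self.ring = sorted((ident(n), n) for n in names)
        self.ids = [i for i, _ in self.ring]
        self.store = {n: {} for n in names}   # per-node local storage

    def lookup(self, key: str) -> str:
        """Return the node responsible for `key` (successor on the ring)."""
        i = bisect_left(self.ids, ident(key)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key: str, value) -> None:
        self.store[self.lookup(key)][key] = value

    def get(self, key: str):
        return self.store[self.lookup(key)].get(key)


dht = TinyDHT(["node-1", "node-2", "node-3", "node-4"])
dht.put("document-x", "contents of x")
print(dht.lookup("document-x"), dht.get("document-x"))
```

Because placement depends only on the hash, any participant that knows the ring computes the same responsible node, which is what makes the search deterministic.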

The role of DHT in structured P2P

Chord lookup

Chord lookup w/ finger table
[Figure: Chord ring with m = 6, so the identifier space has 2^m = 64 identifiers; each finger table has m entries.]
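A finger-table lookup for m = 6 can be sketched as follows. The ten node identifiers are an assumed example ring, and the finger tables are computed from a global node list for brevity, whereas real Chord nodes build and repair them through the join and stabilization protocol. Entry i of a node's finger table points to the successor of (node + 2^i) mod 2^m.

```python
M = 6
RING = 2 ** M                                        # 64 identifiers
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed example ring


def successor(ident):
    """First node at or clockwise after `ident`."""
    ident %= RING
    for node in NODES:
        if node >= ident:
            return node
    return NODES[0]                                  # wrap around the ring


def finger_table(node):
    """m entries; entry i is successor(node + 2^i)."""
    return [successor(node + 2 ** i) for i in range(M)]


def in_open(x, a, b):
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b                            # interval wraps past zero


def find_successor(start, key):
    """Route from `start` to the node responsible for `key`; return (node, path)."""
    node, path = start, [start]
    while True:
        succ = finger_table(node)[0]
        if key == succ or in_open(key, node, succ):
            path.append(succ)
            return succ, path
        # Jump to the closest finger that precedes the key.
        nxt = succ
        for finger in reversed(finger_table(node)):
            if in_open(finger, node, key):
                nxt = finger
                break
        node = nxt
        path.append(node)


print(finger_table(8))
print(find_successor(8, 54))
```

Each hop at least halves the remaining clockwise distance to the key, so on this ring the lookup for key 54 starting at node 8 needs three hops instead of walking through seven successors.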

Challenges

Technical Challenges of P2P
• Decentralization
• Control
• Security
• Sustainability
• Management

Decentralization
• Fully decentralized means every peer is an equal participant and no peers have special or administrative abilities
• Full decentralization is difficult, and many P2P systems are hybrids
• Decentralization is a tool, not a goal
  • Centralize the parts that need to be fast and need to scale
  • Decentralize the parts required by the problem model

Control
• Myth: P2P systems have no control
  Truth: P2P has no central control, but each peer is constrained by its own rules
• Myth: P2P systems must rely on an honor system and are prone to malicious users
  Truth: P2P systems have a design tradeoff, openness vs. susceptibility
• Myth: There is no way to control the data in a P2P system
  Truth: No one has super-user access to the data, but users control the data they create
• Myth: P2P has anonymous users with no accountability
  Truth: Mechanisms like pseudonyms allow anonymity while enforcing accountability
• Myth: P2P systems can’t exclude known malicious users
  Truth: Decentralized control of user access is possible but tricky

Security
• P2P applications can be made secure, much like the IP protocol
• Encryption can ensure that a file is unread and unmodified even if it passes through the control of malicious peers (e.g., Freenet)
• Data’s origin can be ensured even though anyone can add data to the system (e.g., Groove)
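The "unmodified" half of the second point can be illustrated with content addressing: if a file is requested by the hash of its contents, the receiver can detect any change made by a relaying peer. The sketch below uses SHA-256 from Python's standard library; it is a general illustration under that assumption, not the key scheme of Freenet or Groove, and it does not cover confidentiality or proof of origin, which need encryption and digital signatures.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Key under which the data is published: the hash of its contents."""
    return hashlib.sha256(data).hexdigest()


def verify(key: str, data: bytes) -> bool:
    """A receiver recomputes the hash and rejects data that does not match."""
    return content_key(data) == key


original = b"contents of document x"
key = content_key(original)

# A malicious relay flips part of the payload on the way through.
tampered = original.replace(b"x", b"y")

print(verify(key, original))   # data arrived intact
print(verify(key, tampered))   # modification is detected
```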

Sustainability
• You need a cool idea and a critical mass
• System must be easy to use
• Normal use of the system needs to contribute to the system
• Imposition on users must be things they don’t mind

Management
• Selfishness
• Equitableness
• Impact on IP networks
• Copyright and laws

Platform and tools

Jini – a service broker
• Jini is a Java-based service toolkit
• Provides a service broker called the Jini Lookup Service
• Provides a discovery and notification API
• Service stubs are passed to the requester
[Figure: three Jini service providers register with the Jini Lookup Service: Service X (attributes B, D), Service Y (attribute C), and Service X (attributes A, D). A service requester asks "Need service X with attribute A" and the lookup service answers with Service Provider 3.]
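The broker pattern in the figure can be sketched in Python as attribute matching over registrations. This is a language-neutral illustration only: Jini itself is a Java toolkit, and the class and method names below are hypothetical, not the Jini API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    provider: str
    service: str
    attributes: frozenset


class LookupService:
    """Hypothetical broker: providers register, requesters match on attributes."""

    def __init__(self):
        self._registrations = []

    def register(self, provider, service, attributes):
        self._registrations.append(
            Registration(provider, service, frozenset(attributes))
        )

    def lookup(self, service, required=()):
        """Return providers offering `service` with all `required` attributes."""
        required = frozenset(required)
        return [
            r.provider
            for r in self._registrations
            if r.service == service and required <= r.attributes
        ]


broker = LookupService()
broker.register("Service Provider 1", "X", {"B", "D"})
broker.register("Service Provider 2", "Y", {"C"})
broker.register("Service Provider 3", "X", {"A", "D"})

# "Need service X with attribute A"
print(broker.lookup("X", required={"A"}))
```

In Jini the answer would carry a service stub that the requester calls directly; here the provider name stands in for that stub.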

JXTA (Sun)
• Open platform for p2p cooperation
• Interoperability
  • Any system/peer/application
• Platform independence
  • Languages (C, Java, etc.)
  • System platforms (Unix, Windows, etc.)
  • Networking platforms (802.11, Bluetooth, TCP/IP, etc.)
• Ubiquity
  • Sensors, PDAs, routers, desktops, laptops, storage systems

JXTA (Sun)
• Objectives
  • Find peers and resources
  • Share files with anyone across the network
  • Create a particular group of peers across different networks
  • Communicate securely with peers across public networks
• Projects
  • Applications (24 projects)
  • Core (13 projects)
  • Demos (3 projects)
  • Forge (15 projects)
  • Other (12 projects)
  • Services (24 projects)

JXTA (Sun) Protocols
• Peer discovery protocol
• Peer resolver protocol
• Peer information protocol
• Pipe binding protocol
• Endpoint routing protocol
• ……

JXTA (Sun)
[Figure: JXTA layered architecture. Applications: JXTA community applications, JXTA Shell. Services: JXTA community services, Sun JXTA services, peer commands. JXTA core: peer groups, peer pipes, peer monitoring, security. Everything runs on any peer (desktop, cell phone, PDA, etc.).]

JXTA applications

PlanetLab
• Testbed to experiment with your networked applications
  • >400 nodes, >150 sites
  • PlanetLab consortium: 80+ universities, Intel, HP
• View presented to users: a distributed set of VMs
• Allocation unit: a slice = a set of virtual machines (VMs), one VM at each node
452 nodes, 162 sites, 450 research projects
[Figure: four nodes, each running a VMM that hosts one OS instance per slice.]
http://www.planet-lab.org/

PlanetLab usage examples
• Stress-test your Grid services (Globus RLS)
• GSLab: a playground to experiment with grid services
• ‘Better-than-Internet’ services:
  • Resilient Overlays
  • Multipath TCP (mTCP)
  • Multicast Overlays
[Figure: four nodes, each with a VMM, an OS, and a user account.]

Conclusions

Reviews
• P2P solutions can fit the problem model better than client-server or web solutions
• P2P solutions can do some cool things
• P2P solutions can be production quality, but have different issues than client-server or web solutions
• It is not hard to code a P2P solution
• Interesting application
• Big challenge

Final remarks
• P2P spans a very wide range of areas
• High interest in both academia and industry
• Much has already been done, but no conclusions are definitive
• IPv6 and P2P
  • NAT, firewalls, IPv6 as an overlay
• Many open issues
  • Trust, security, scalability, QoS, etc.

P2P related research in future
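[Figure: research areas surrounding P2P]
• Security and protection: trust, anonymity, reputation
• Intelligent agents / Web-based services: matchmaking, service description
• Network structure and design: network topology, routing, overlay networks
• Distributed databases: query decomposition, query distribution, mediation
• Social networks: small-world phenomenon, power-law networks
• Business and legal issues: business models, intellectual property
• Distributed data structures: distributed hash tables, scalable distributed data
• Scalable routing and object location
• Performance improvement
• Semantic overlay networks
• P2P algorithms
• Replication
• Web-based information search
• Incentives and fairness
• Privacy / security / trust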

What can we learn from P2P?

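P2P system case studies, summary: system characteristics
System | Category | Alternative solution | Platform | Languages/tools | Distinguishing features | Network
Avaki | Distributed computing | Single-installation HPC, supercomputers | Linux, Windows, Solaris | OO, parallel Fortran, Ada, C | Distributed administration, heterogeneity, security, high parallelism | Internet, intranet
SETI@home | Distributed computing | Distributed objects | All common OSes | Closed source | Large scale | Internet
Groove | Collaboration | Web-based collaboration | Windows | JavaScript, VB, Perl, C++, XML | Playback, self-updating | Internet, intranet
Magi | Collaboration | Distributed file sharing, chat/messaging | Windows, Mac | Java, XML, HTTP, WebDAV | HTTP-based, platform independent | Internet, ad-hoc
Freenet | Content sharing | Anonymous trusted single point | Any with Java | Java implementation and APIs | Preserves anonymity | Internet
Gnutella | Content sharing | Central servers | Windows, Linux | Java, C | Protocol | Internet
JXTA | Platform | C/S | Solaris, Linux, Windows | Java, C, Perl | Open source | Internet
.NET/My Services | Platform | Web-based | Windows | C#, VC++, JScript, VBScript, VB | Based on MS applications | Internet, mobile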

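P2P system case studies, comparison of characteristics: system characteristics
System | Decentralization | Scalability | Anonymity | Self-organization | Cost of ownership | Ad-hoc | Performance | Security | Transparency | Fault resilience | Interoperability
Avaki | No central node | 1000 tested, 2-3 thousand | N/A | Restructures after failure | Low | Compute resources join/leave | Speedup | Encryption, authentication, administrative domains | Local HW/SW heterogeneity | Checkpoint/restart, reliable messages | With Sun Grid
SETI@home | Master-slave | Millions | Medium | Low | Very low | Compute resources join/leave | Large speedup | Proprietary | High | Timed checkpoints | IP (?)
Groove | Hybrid P2P | N/A | Poor | High | Low | Collaborators join/leave | Medium | Shared spaces / authentication, authorization | High | Messages queued | IP-based
Magi | Hybrid P2P | About 100 | N/A | N/A | Low | Buddies join/leave | N/A | Certificate authority | Communication with offline buddies | Messages queued | JXTA/WebDAV
Freenet | Pure P2P | Theoretically log N | High | High | Low | Peers join/leave | Medium | Anonymity / anti-DoS | High | No single point of failure | Low
Gnutella | Pure P2P | Thousands | Low | High | Low | Peers join/leave | Low | Not addressed | Medium | Resumed downloads | IP (?)
JXTA | Pure P2P | Embedded systems | N/A | N/A | Low | Peers join/leave | N/A | Crypto algorithms / distributed trust | Low | Low | Low
.NET/My Services | Hybrid | Worldwide | N/A | Medium | Low | Peers join/leave | High | Passport-based | High | Replication | SOAP/XML/UDDI/WSDL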

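P2P system case studies, comparison of business models: system characteristics
System | Revenue model | Supported applications | Known customers | Competitors | Funding | Business model
Avaki | Product and open source | Computational grid, shared secure data | None / scientific labs evaluating | Platform Computing, Globus | Startup | N/A
SETI@home | Academic research | Closed | Academic | [email protected]... | Government | Machine sales plus screensaver
Groove | Product | Purchase, sales, and inventory | N/A | Magi | IPO | Alternative to Lotus collaboration tools
Magi | Product and open source | File sharing, messaging, chat | Global e-technology, media, software | Groove | Startup | N/A
Freenet | Open source | File sharing | Public | N/A | Startup | N/A
Gnutella | Open source | File sharing | Public | N/A | Public domain | Alternative P2P algorithm
JXTA | Open source and proprietary extensions | File sharing, event notification | Many P2P ports | .NET/My Services | Sun-supported public domain | Common P2P platform
.NET/My Services | Proprietary and open standards | MS Office, other MS applications | Large base | AOL/J2EE/JXTA | MS internal | Pervasive platform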

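System and application requirements
Solution comparison 1: system types
Requirement | Centralized | C/S | Peer to Peer
Decentralization | Low (none) | High | Very high
Ad-hoc behavior | None | Medium | High
Cost of ownership | Very high | High | Low
Anonymity | Low (none) | Medium | Very high
Scalability | Low | High | High
Performance | Individually high, aggregate low | Medium | Individually low, aggregate high
Fault resilience | Individually high, aggregate low | Medium | Individually low, aggregate high
Self-organization | Medium | Medium | Medium
Transparency | Low | Medium | Medium
Security | Very high | High | Low
Interoperability | Standardized | Standardized | In progress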

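Solution comparison 2: system types
Goal | Criterion | Centralized | C/S | P2P
User | Pervasiveness | Low | Medium | High
User | Technology level | Low | High | Medium
User | Complexity | High | Low | Medium
User | Trust and reputation | High | Medium | Low
Developer | Complexity | High | Straightforward | Typical - N0
Developer | Supportability | Low | High | Medium
Developer | Tools | Medium (proprietary) | High (standard) | Low (few)
Developer | Compatibility | Medium | High | Low
IT | Accountability | High | Medium | Low
IT | Being in control | High (full) | Medium | Low
IT | Manageability | Medium | High | Low
IT | Standards | Medium (proprietary) | High | Low (none)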

Main references
• Eng Keong Lua et al. “A Survey and Comparison of Peer-to-Peer Overlay Network Schemes,” IEEE Communications Surveys and Tutorials, Vol 7, No 2 (Second Quarter, 2005), pp. 72-93.
• Ion Stoica, Robert Morris, et al. “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications,” Proceedings of ACM SIGCOMM 2001, San Diego, CA, August 2001, pp. 149-160.
• Diego Doval and Donal O’Mahony, “Overlay networks: a scalable alternative for P2P,” IEEE Internet Computing, Vol 7, No 4 (July-August 2003), pp. 79-82.

References
• Distributed Computing
  • Distributed (www.distributed.net)
  • SETI@home (www.seti.org)
  • Genome@home (gah.stanford.edu)
  • Folding@home (www.stanford.edu/group/pandegroup/folding)
  • Global Grid Forum (www.globalgridforum.org)
  • Globus Project (www.globus.org)
• File sharing
  • Napster (www.napster.com)
  • Gnutella (gnutella.wego.com)
  • Kazaa (www.kazaa.com)

References
• Distributed hash tables
  • CAN (www.acm.org/sigs/sigcomm/sigcomm2001/p13-ratnasamy.pdf)
  • Pastry (research.microsoft.com/~antr/Pastry)
  • Chord (www.pdos.lcs.mit.edu/chord)
  • Tapestry (www.cs.berkeley.edu/~ravenben/tapestry)
  • Freenet (freenet.sourceforge.net)
  • Kademlia (kademlia.scs.cs.nyu.edu)
• Ad hoc networking
  • AODV (www.ietf.org/internet-drafts/draft-ietf-manet-aodv-13.txt)
  • OLSR (www.ietf.org/internet-drafts/draft-ietf-manet-olsr-10.txt)
  • Tribe (rp.lip6.fr/site_rp/_publications/350-79Viana.ps.gz)

References
• Platforms
  • JXTA (www.jxta.org)
  • .NET (www.microsoft.com/net)
• Collaboration
  • Groove (www.groove.net)
  • Endeavors (www.endeavors.com)
• IPv6 as a p2p overlay
• Working Groups
  • p2p.internet2.edu
  • www.openp2p.com

Slides borrowed
• Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, http://pdos.csail.mit.edu/~rtm/slides/sigcomm01.ppt
• P2P-SIP: Peer to peer Internet telephony using SIP, http://www1.cs.columbia.edu/~kns10/research/p2p-sip/

Working groups et al.
• A generic site on p2p from O'Reilly
  • www.openp2p.com
• P2P working group
  • www.peer-to-peerwg.org/
• Internet2 p2p working group
  • p2p.internet2.edu
• Peer-to-peer development (p2p-hackers)
  • zgp.org/mailman/listinfo/p2p-hackers
• Interesting meeting
  • www.codecon.org

Reading
• CAN
• Chord
• Tapestry
• Pastry