IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case
Lars Bremer, University of Paderborn, Germany
Kalman Graffi, University of Düsseldorf, Germany

Upload: kalman-graffi

Post on 15-Jun-2015

DESCRIPTION

Lars Bremer and Kalman Graffi. Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case. In IEEE ICC '13: Proceedings of the International Conference on Communications, 2013.

Abstract: Comparative evaluations of peer-to-peer protocols through simulations are a viable approach to judge the performance and costs of the individual protocols in large-scale networks. In order to support this work, we enhanced the peer-to-peer systems simulator PeerfactSim.KOM with a fine-grained analyzer concept, with exhaustive automated measurements and gnuplot generators as well as a coordination control to evaluate a set of experiment setups in parallel. Thus, by configuring all experiments and protocols only once and starting the simulator, all desired measurements are performed, analyzed, evaluated and combined, resulting in a holistic environment for the comparative evaluation of peer-to-peer systems.

Abstract: Cloud computing offers high availability, dynamic scalability, and elasticity while requiring only very little administration. However, this service comes with financial costs. Peer-to-peer systems, in contrast, operate at very low costs but cannot match the quality of service of the cloud. This paper focuses on the case study of Wikipedia and presents an approach to reduce the operational costs of hosting similar websites in the cloud by using a practical peer-to-peer approach. Visitors of the site join a Chord overlay, which acts as a first cache for article lookups. Simulation results show that up to 72% of the article lookups in Wikipedia could be answered by other visitors instead of using the cloud.

TRANSCRIPT

Page 1: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Lars Bremer, University of Paderborn, Germany

Kalman Graffi, University of Düsseldorf, Germany

Page 2: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Lars Bremer, Kalman Graffi: Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case 2

Know these banners?

Page 3: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Background on Wikipedia

Wikipedia
– Collaborative Internet encyclopaedia

Numbers on English Wikipedia
– Alexa rank: 6
– Article count: 3.8 million
– Edits: 3.4 million per month
– Page views: 11.3 million per hour

Figures show the popularity of articles
– Top: all articles
– Bottom: top 250 articles

Problem: Costs through high traffic

Page 4: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Motivation and Outline of our Work

Goal: Efficiency increase
– Cloud-like performance
  • Maintain high data availability
  • Quick article delivery
– Low operational costs
  • Users should help in sharing articles
  • Donations of network resources

Approach
– Combine peer-to-peer (p2p) and centralized (cloud) architecture
– Cloud is used as backup and main host
  • Much less traffic and lower costs
– Users participate in p2p overlay
  • Look up articles there first
  • Provide downloaded articles to other peers

Page 5: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Outline

Motivation / Use Case

Background on Structured P2P Overlays

Symbiotic Coupling of P2P and Cloud Systems

Evaluation

Conclusions

Page 6: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Background on Structured P2P Overlays

Nodes and objects use the same ID space

Each object is managed by a node (the responsible node)

Assignment based on IDs

Nodes maintain a topology / routing structure to support:
– Lookup: getResponsibleNode(ID)
– After that: e.g. data transfer

[Figure: ring of nodes with IDs 611, 709, 1008, 1622, 2011, 2207, 2906 and 3485; an object with H("my data") = 3107; arrows labelled "Lookup" and "Data transfer"]
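The lookup step can be illustrated with a minimal Python sketch. It reuses the node IDs and the key 3107 from the slide's example ring, but the assignment rule (first node whose ID is greater than or equal to the key, wrapping around, as in Chord) is an assumption, since the slide does not state which rule it uses. The function name `get_responsible_node` mirrors `getResponsibleNode(ID)`; the key 4000 is a made-up value to show the wrap-around.

```python
from bisect import bisect_left

# Node IDs read off the slide's example ring, kept sorted.
NODE_IDS = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]


def get_responsible_node(key, node_ids=NODE_IDS):
    """Return the first node ID >= key, wrapping around the ring.

    Assumption: Chord-style successor rule; other DHTs assign keys differently.
    """
    index = bisect_left(node_ids, key)
    return node_ids[index % len(node_ids)]


print(get_responsible_node(3107))  # H("my data") = 3107 -> 3485
print(get_responsible_node(4000))  # hypothetical key past the last node -> 611
```

A real overlay resolves this in several routing hops because no node knows the full ring; the sorted list here only shows which node ends up responsible.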

Page 7: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Cloud Computing vs. P2P Technology

Cloud and P2P
– Access to a distributed pool of resources:
  • Storage, bandwidth, computational power

Cloud computing
– Resource providers: companies
– Controlled environment
  • No (or minimal) churn
  • Homogeneous devices
– Selective centralized structures
– Mainly paid by usage

P2P systems
– Resource providers: user devices
– Uncontrolled environment
  • Churn
  • Heterogeneous devices
  • Uncertainty / unpredictability
  • Distributed access points

Page 8: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Symbiotic Coupling of P2P and Cloud Systems

Goal: High performance at low costs
– Performance: data availability, low delays
– Costs: traffic at cloud operator (linked to monetary expenses)

Our approach
– Main service (here Wikipedia) remains as main data pool
– Nodes install a p2p add-on (p2p overlay)
  • Allows sharing the content of specific services
– Nodes visiting Wikipedia
  • Join the p2p overlay and remain online for a while
– Articles are served and provided in p2p overlay
– If not available / initially: download from cloud
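The "overlay first, cloud as fallback" flow can be sketched as follows. This is a toy model under stated assumptions: the overlay and the cloud are plain dictionaries, peers never leave, and the class name `HybridStore` and its counters are hypothetical names chosen for illustration.

```python
class HybridStore:
    """Toy model: a p2p cache in front of a cloud origin (both plain dicts)."""

    def __init__(self, cloud_articles):
        self.cloud = dict(cloud_articles)  # main data pool
        self.overlay = {}                  # articles currently held by peers
        self.cloud_hits = 0
        self.peer_hits = 0

    def fetch(self, title):
        if title in self.overlay:          # served by another visitor
            self.peer_hits += 1
            return self.overlay[title]
        self.cloud_hits += 1               # not available / initial lookup
        article = self.cloud[title]
        self.overlay[title] = article      # the requester now provides it
        return article


store = HybridStore({"Chord": "text A", "Cloud": "text B"})
for title in ["Chord", "Chord", "Cloud", "Chord"]:
    store.fetch(title)
print(store.cloud_hits, store.peer_hits)  # 2 2
```

Only the first request for each article reaches the cloud; every later request is answered from the overlay, which is where the traffic saving comes from.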

Page 9: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Overview on the Architecture

Document space
– Article ID is the hashed article name
– Responsible node maintains list of article providers
– Article providers
  • Have downloaded the article once
  • Are registered at the responsible node

We use Chord
– Any other DHT is also fine
– Needs to support key-based routing
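A minimal sketch of the document space, assuming SHA-1 as the hash function, a small 16-bit ID space for readability, and a Chord-style successor rule for responsibility. None of these specifics are given on the slide; the class `Directory` and its methods are hypothetical names, and the node IDs are borrowed from the earlier example ring.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict

ID_BITS = 16  # hypothetical small ID space to keep the numbers readable


def article_id(name):
    """Hash an article name into the overlay's ID space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


class Directory:
    """Provider lists, grouped by the node responsible for each article ID."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        # providers[responsible_node][article_id] -> set of provider node IDs
        self.providers = defaultdict(lambda: defaultdict(set))

    def responsible_node(self, key):
        index = bisect_left(self.node_ids, key)
        return self.node_ids[index % len(self.node_ids)]

    def register(self, name, provider):
        key = article_id(name)
        self.providers[self.responsible_node(key)][key].add(provider)

    def lookup(self, name):
        key = article_id(name)
        return sorted(self.providers[self.responsible_node(key)][key])


directory = Directory([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])
directory.register("Peer-to-peer", provider=709)
directory.register("Peer-to-peer", provider=2207)
print(directory.lookup("Peer-to-peer"))     # [709, 2207]
print(directory.lookup("Cloud computing"))  # []
```

The responsible node stores only references to providers, not the article itself, so a requester still has to fetch the article from one of the listed providers or from the cloud.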

Page 10: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Operation: Initial Lookup for an Article

Page 11: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Operation: Further Lookup for an Article

Page 12: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Other Operations

Update
– Editing done on the cloud
– Active vs. passive updates
  • Active: cloud actively informs nodes holding references
  • Passive: responsible peers periodically check for updates
    – Frequency based on object popularity
– Old references are discarded
  • They point to outdated content
– New reference table is built up

Leave
– Leaving node informs all nodes holding references to it
– A leave can also be detected without notification, but this introduces delay
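How a responsible node might react to updates and leaves can be sketched like this. It is a simplified illustration, not the authors' implementation: the integer version numbers, the class `ReferenceTable`, and its method names are all assumptions introduced here, and neither the popularity-based check frequency nor failure detection is modelled.

```python
class ReferenceTable:
    """Provider references kept by a responsible node for its articles."""

    def __init__(self):
        self.version = {}    # article -> latest known version (assumed integer)
        self.providers = {}  # article -> set of provider node IDs

    def register(self, article, provider, version):
        known = self.version.get(article, 0)
        if version < known:
            return False                      # provider holds outdated content
        if version > known:
            self.on_update(article, version)  # newer copy seen: start afresh
        self.providers.setdefault(article, set()).add(provider)
        return True

    def on_update(self, article, new_version):
        # Old references point to outdated content, so the table is rebuilt.
        self.version[article] = new_version
        self.providers[article] = set()

    def on_leave(self, provider):
        # A leaving node is removed from every provider list it appears in.
        for holders in self.providers.values():
            holders.discard(provider)


table = ReferenceTable()
table.register("Chord", 709, version=1)
table.register("Chord", 2207, version=1)
table.on_leave(709)
print(sorted(table.providers["Chord"]))  # [2207]
table.on_update("Chord", 2)
print(sorted(table.providers["Chord"]))  # []
```

After the update, lookups for the article fall back to the cloud until some peer has downloaded the new version and registered again.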

Page 13: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Evaluation

Main questions
– What is the efficiency gain?
– How much traffic is saved?

Approach
– Evaluation through simulation

Layer setup
– User mode: downscaled Wikipedia workload
– Application: document storage
– Overlay: Chord
– Network model:
  • Global Network Positioning delay model
  • OECD bandwidth model
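To make the two evaluation questions concrete, the following toy simulation measures the fraction of lookups that peers can answer. It is only a sketch and does not reproduce the PeerfactSim.KOM setup or its results: the Zipf-like popularity (weight 1/rank), the parameter values, and the rule that a visitor stays online for a fixed number of subsequent requests are all assumptions made here.

```python
import random


def simulate(num_articles=1000, num_requests=20000, online_time=500, seed=1):
    """Return the fraction of lookups served by peers in a toy workload.

    All parameters are illustrative assumptions, not the paper's setup.
    A visitor provides the article it fetched for `online_time` subsequent
    requests and then leaves the overlay.
    """
    rng = random.Random(seed)
    # Zipf-like popularity: weight 1/rank (an assumed workload shape).
    weights = [1.0 / rank for rank in range(1, num_articles + 1)]
    requests = rng.choices(range(num_articles), weights=weights, k=num_requests)

    available_until = {}  # article -> last request index with a provider online
    peer_hits = 0
    for now, article in enumerate(requests):
        if available_until.get(article, -1) >= now:
            peer_hits += 1  # another visitor still provides the article
        # The requesting visitor now holds the article and stays for a while.
        available_until[article] = now + online_time
    return peer_hits / num_requests


print(f"served by peers: {simulate():.1%}")
```

Varying `online_time` shows the expected trend: the longer visitors remain in the overlay, the fewer lookups have to go to the cloud.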

Page 14: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

► PeerfactSim.KOM (see www.peerfact.org)

Type
– Event-based simulator
– Written in Java
– Simulations up to 100K peers possible
– Focus on simulation of p2p systems on various layers
  • User
  • Application
  • Services: monitoring, replication …
  • Overlays
  • Network models

Invitation to join the community
– Several universities use and extend the simulator actively
– Used and heavily extended in the project

Page 15: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Layered View

Layered Architecture
– Easy exchange of components
– Testing of new applications
– Testing of new mechanisms

Main idea
– Layers have several implementations
– Enables testing of individual layer mechanisms
  • on their own
  • in combination with other layers

See www.peerfact.org

[Figure: layered architecture with the layers User, Application, Service, Overlay, Transport and Network, and the Simulation Engine]

Page 16: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Simulation Setup / Workload Model

Page 17: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Simulation Results

Page 18: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Simulation Results

Page 19: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Simulation Results

Page 20: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Conclusions and Future Work

Symbiotic p2p/cloud approach lowers operational costs
– Users take load and share content
– Traffic load on server was reduced to 27.6% in this experiment
– Websites with many users can benefit from p2p support
– WebP2P: browser-based p2p via peerjs, nodejs, etc. is coming
– User devices are powerful, load can be handled "for free"

Future Work
– Investigate WebP2P approach
  • Browser plugin to create p2p overlay
– Create p2p framework for social networks
  • Use capacity of user devices to host a social network
  • See http://www.p2pframework.com
– Further extend PeerfactSim.KOM, the p2p system simulator
  • See http://www.peerfact.org

Page 21: IEEE ICC 2013 - Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case

Thank You for Your Attention

Jun.-Prof. Dr.-Ing. Kalman Graffi
Technology of Social Networks Group
Institute of Computer Science
Heinrich-Heine-Universität Düsseldorf

eMail: [email protected]
Web: www.p2pframework.com
