
Self-distributing Software Updates through Epidemic Dissemination

Cosmin Arad, Tallat M. Shafaat, Seif Haridi
Royal Institute of Technology (KTH)
Electrum 229, 164 40 Kista, Sweden

{icarad,tallat,haridi}@kth.se

Abstract

Peer-to-peer systems have recently gained tremendous popularity in both research and commercial endeavors. This paper argues for the systematic exploration of hybrids of centralized and peer-to-peer system design. We give an example application of a peer-to-peer architecture to an inherently centralized service and show how this application raises an interesting research question in the field of epidemic information dissemination. We propose a previously unexplored push mechanism for the distribution of updates for system software that exists in millions of copies.

1 Introduction

In the last decade, peer-to-peer technologies have been widely accepted as alternatives to the client-server paradigm of building distributed systems. Systems relying on a central server are often criticized for having a single point of failure or a scalability bottleneck. In contrast, peer-to-peer approaches have been touted for their fault tolerance and scalability. With a few exceptions [3], design decisions avoided centralization and adopted a pure peer-to-peer architecture where all system participants are equal.

We argue for the wider application of peer-to-peer approaches to systems that inherently include a central component. Such systems, for example, deliver media content from a central source to a large set of subscribers. In recent years, many peer-to-peer systems for media distribution have been proposed [8, 9]. In these cases, a peer-to-peer architecture decreases the cost of operating the source site. The savings stem from needing less upload bandwidth and fewer machines to provide the service: less is spent on hardware, electricity, and cooling.

We argue for the systematic exploration of system designs where peer-to-peer technology is applied to centralized services with the goal of decreasing the load on the service provider by utilizing the resources of the participating consumers.

One context that would benefit from a peer-to-peer architecture is the distribution of software updates to systems that exist in millions of copies. For example, anti-virus engines need to update their virus definitions on a daily basis. Operating system updates occur less often, but they are larger in size and must reach a large number of sites.

Software updates originate at the site of the software producer. A push model is a natural choice for distributing the updates to all instances of the software. Using a pull model instead would require clients to frequently poll the server for updates. This would overburden the server and hamper scalability [5]. However, a direct push from the server to all clients is not scalable either. Since epidemic dissemination [4] has been shown to be a scalable and reliable solution for information dissemination, we propose a push-based epidemic dissemination of software updates.

2010 Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems Workshop

978-0-7695-4229-4/10 $26.00 © 2010 IEEE

DOI 10.1109/SASOW.2010.69

2 Background

Gossiping refers to a class of algorithms in which each node periodically picks a neighbor and exchanges information with the selected neighbor.¹ The exchange of information can be of different types. If the exchange involves only data transfer from the initiator to the selected node, the gossip algorithm is called push-based gossiping. Similarly, if the exchange involves only data transfer from the selected node to the initiator, it is called pull-based gossiping. Finally, if the exchange involves data transfer in both directions, the algorithm is called push-pull gossiping [4].
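The three exchange types can be illustrated with a minimal sketch. The `Node` class and the function names below are hypothetical and serve only as illustration; each node is modeled as holding a set of update identifiers, and network communication is replaced by direct set operations.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical gossip participant holding a set of update identifiers."""
    name: str
    updates: set = field(default_factory=set)


def push(initiator: Node, selected: Node) -> None:
    # Push: data flows only from the initiator to the selected node.
    selected.updates |= initiator.updates


def pull(initiator: Node, selected: Node) -> None:
    # Pull: data flows only from the selected node to the initiator.
    initiator.updates |= selected.updates


def push_pull(initiator: Node, selected: Node) -> None:
    # Push-pull: data flows both ways, so both nodes end with the union.
    merged = initiator.updates | selected.updates
    initiator.updates = set(merged)
    selected.updates = set(merged)


a, b = Node("a", {"u1"}), Node("b", {"u2"})
push(a, b)  # b now also holds u1; a is unchanged
```

In a real deployment each function would correspond to one message exchange between the initiator and a neighbor chosen from its local view.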

Both push and pull approaches have their advantages and disadvantages. When there are many updates to be spread in the network, pull spreads the information faster than push. However, when updates are few, pull introduces unnecessary network traffic, which push does not. Similarly, if there is a limit on the number of connections a node can simultaneously maintain, push spreads information faster than pull [4]. It has been shown that for most applications, the best results are achieved by using push-pull [1, 2, 4, 6]. For instance, in contrast to push-pull, using only push or only pull to maintain a random overlay can result in a partition of the nodes [2].

Recently, gossip techniques have been employed in media streaming, where multiple nodes are interested in receiving the same media, e.g. a movie, from a server [8, 9]. While our goal also resembles media streaming, i.e. broadcasting data from a server to a large number of nodes, the requirements are different. In media streaming, a core requirement is the timely arrival of data so that it can be played in time. In contrast, our focus is on applications that are not latency critical, such as the dissemination of software updates. Levis et al. [7] proposed to use gossiping for propagating software updates in a sensor network, yet their focus is more on dealing with the constraints of sensor devices, such as low energy and bandwidth. We instead view software update dissemination as an example of a class of applications where: (1) data is introduced in the network by a few select servers, (2) latency is not critical, (3) the data to be spread requires more bandwidth than is usual for gossip algorithms [6, 10], and (4) the number of nodes interested in the data is very large.

¹ These algorithms are also referred to as epidemic dissemination algorithms.

3 K-try push gossip for update dissemination

We believe that further investigation of the push technique is required; as Franklin and Zdonik [5] have argued, push should be used in the correct context and with the correct methodology. We believe that k-try push, a variant of ordinary push, can be employed for scenarios with requirements like those of software update dissemination. In k-try push, each node u periodically tries to push new updates to a selected node. If the selected node already has the update, u selects another node to push the updates to. Node u repeats this until either it succeeds in pushing the data or it has tried k times.
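One cycle of k-try push at a single node can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `k_try_push`, the integer version numbers, and the `versions` dictionary (which stands in for the metadata query a real node would send over the network) are all hypothetical.

```python
import random


def k_try_push(node_version, neighbors, versions, k, rng=random):
    """Run one k-try push cycle for a node u.

    node_version: version currently held by u.
    neighbors:    list of neighbor ids u may select from.
    versions:     dict mapping node id -> held version (assumed stand-in
                  for a remote metadata check).
    Returns the id of the node that received the update, or None if all
    tries hit nodes that already had it.
    """
    # Pick at most k distinct neighbors, so that each retry selects
    # "another node" rather than the one that was just rejected.
    candidates = rng.sample(neighbors, min(k, len(neighbors)))
    for target in candidates:
        if versions[target] < node_version:
            versions[target] = node_version  # the actual data transfer
            return target
    return None
```

With `k = 1` the function behaves like ordinary push: a single neighbor is tried per cycle and the cycle is wasted if that neighbor is already up to date.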

We believe that k-try push overcomes the disadvantage of normal push, since it takes less time to disseminate new updates. The reason is that k-try push tries to find an uninfected node more times in each cycle than regular push does. Thus, the chances of spreading the information in a cycle are higher for k-try push than for regular push. Moreover, for an application like disseminating software updates, where the updates are large in size, testing whether a node already has an update only requires comparing metadata. This is not an expensive operation in terms of the bandwidth required.
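The gain from additional tries can be made concrete under a simplifying assumption that is ours, not the paper's: if a fraction f of the nodes already holds the update and each try hits an independent, uniformly random node, then at least one of k tries finds an uninfected node with probability 1 - f^k. The function name below is hypothetical.

```python
def success_probability(infected_fraction: float, k: int) -> float:
    """Chance that at least one of k independent uniform tries hits a node
    that does not yet have the update (simplified model, see lead-in)."""
    if not 0.0 <= infected_fraction <= 1.0:
        raise ValueError("infected_fraction must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - infected_fraction ** k


for k in (1, 2, 4, 8):
    print(k, round(success_probability(0.9, k), 4))
```

Under this model, when 90% of the nodes are already updated, the per-cycle chance of a useful push rises from 0.10 for regular push (k = 1) to about 0.57 for k = 8, which is where k-try push helps most: in the late phase of dissemination.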

Since any gossip algorithm requires nodes to maintain neighbors, our solution would require a layer for maintaining a random topology, such as Cyclon [10]. Further, to help k-try push succeed in pushing new updates, it would be desirable for nodes to have neighbors with lower version numbers, so that a push is more likely to find an uninfected neighbor. This would result in faster dissemination of new updates. Voulgaris and van Steen [11] have already explored how to cluster nodes based on similarity in contents. We believe similar methodologies can be used to cluster nodes according to version numbers.

We propose to employ a three-layer approach. The bottom layer, such as Cyclon [10], maintains a randomly connected network that provides each node with access to random nodes in the system. The middle layer takes in random nodes from the bottom layer and aims at creating an overlay where neighbors are either at low latency or clustered according to version numbers. The top layer performs the k-try push dissemination: it takes nodes from the middle layer and selects among them the nodes to which it tries to push.
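The interplay of the three layers can be sketched as below. This is only an illustration under assumptions: all function names are hypothetical, the bottom layer is a plain uniform sampler standing in for a protocol such as Cyclon, the middle layer ranks candidates by version number only (latency is ignored), and a shared `versions` dictionary replaces network communication.

```python
import random


def bottom_layer_sample(all_nodes, self_id, size, rng):
    """Bottom layer stand-in: return random peers other than self."""
    others = [n for n in all_nodes if n != self_id]
    return rng.sample(others, min(size, len(others)))


def middle_layer_select(candidates, versions, view_size):
    """Middle layer: keep the candidates with the lowest version numbers."""
    return sorted(candidates, key=lambda n: versions[n])[:view_size]


def top_layer_push(self_id, view, versions, k):
    """Top layer: k-try push over the view supplied by the middle layer."""
    for target in view[:k]:
        if versions[target] < versions[self_id]:
            versions[target] = versions[self_id]
            return target
    return None


def run_cycle(versions, k=3, sample_size=8, view_size=4, seed=0):
    """Let every node run one gossip cycle through all three layers."""
    rng = random.Random(seed)
    for node in list(versions):
        candidates = bottom_layer_sample(list(versions), node, sample_size, rng)
        view = middle_layer_select(candidates, versions, view_size)
        top_layer_push(node, view, versions, k)
    return versions
```

Because the middle layer orders the view by ascending version, the first try of the top layer already targets the most outdated known neighbor, so the remaining tries matter mainly when the whole view is up to date.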

4 Conclusion

By applying an epidemic dissemination approach to the distribution of software updates, we discovered a previously unexplored design decision in the area of gossip-based information dissemination. This leads us to speculate that the systematic exploration of the design space of applying peer-to-peer approaches to inherently centralized services may uncover new problems and yield new results.

References

[1] Mert Akdere, Cagatay Bilgin, Ozan Gerdaneri, Ibrahim Korpeoglu, Ozgur Ulusoy, and Ugur Cetintemel. A comparison of epidemic algorithms in wireless sensor networks. Computer Communications, 29(13):2450–2557, 2006.

[2] Andre Allavena, Alan Demers, and John E. Hopcroft. Correctness of a gossip based membership protocol. In PODC '05: Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing, pages 292–301, New York, NY, USA, 2005. ACM.

[3] Bram Cohen. Incentives Build Robustness in BitTorrent. In Proc. 1st Workshop on Economics of Peer-to-Peer Systems (P2PEcon), 2003.

[4] A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry. Epidemic Algorithms for Replicated Database Maintenance. In Proceedings of the 7th Annual ACM Symposium on Principles of Distributed Computing (PODC'87), pages 1–12, New York, NY, USA, 1987. ACM Press.

[5] Michael J. Franklin and Stanley B. Zdonik. "Data in your face": Push technology in perspective. In SIGMOD Conference, pages 516–519, 1998.

[6] Mark Jelasity, Spyros Voulgaris, Rachid Guerraoui, Anne-Marie Kermarrec, and Maarten van Steen. Gossip-based peer sampling. ACM Trans. Comput. Syst., 25(3):8, 2007.

[7] P. Levis, N. Patel, D. Culler, and S. Shenker. Trickle: a self-regulating algorithm for code maintenance and propagation in wireless sensor networks. In USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI), 2004.

[8] B. Li, Y. Qu, Y. Keung, S. Xie, C. Lin, J. Liu, and X. Zhang. Inside the new Coolstreaming: Principles, measurements and performance implications. In INFOCOM 2008. The 27th Conference on Computer Communications. IEEE, 2008.

[9] N. Magharei, R. Rejaie, and Yang Guo. Mesh or multiple-tree: A comparative study of live p2p streaming approaches. In INFOCOM 2007. 26th IEEE International Conference on Computer Communications, pages 1424–1432, 2007.

[10] S. Voulgaris, D. Gavidia, and M. van Steen. Cyclon: Inexpensive membership management for unstructured p2p overlays. Journal of Network and Systems Management, 13(2), 2005.

[11] S. Voulgaris and M. van Steen. Epidemic-Style Management of Semantic Overlays for Content-Based Searching. In Euro-Par 2005 Parallel Processing, pages 1143–1152, Lisbon, Portugal, 2005. Springer Berlin/Heidelberg.
