distributed communication algorithms for ad hoc mobile networks
TRANSCRIPT
J. Parallel Distrib. Comput. 63 (2003) 58–74
Distributed communication algorithms for ad hocmobile networks$,$$
Ioannis Chatzigiannakis,a,b Sotiris Nikoletseas,a,b,* and Paul Spirakisa,b
aComputer Technology Institute, 61 Riga Fereou Str, 26221 Patras, GreecebComputer Engineering and Informatics Department, Patras University, 26500 Rion, Patras, Greece
Received 17 October 2002; accepted 21 October 2002
Abstract
An ad hoc mobile network is a collection of mobile hosts, with wireless communication capabilities, forming a temporary network
without the aid of any established fixed infrastructure. In such networks, topological connectivity is subject to frequent,
unpredictable change. Our work focuses on networks with high rate of such changes to connectivity. For such dynamically changing
networks we propose protocols which exploit the co-ordinated (by the protocol) motion of a small part of the network. We show that
such protocols can be designed to work correctly and efficiently even in the case of arbitrary (but not malicious) movements of the
hosts not affected by the protocol. We also propose a methodology for the analysis of the expected behavior of protocols for such
networks, based on the assumption that mobile hosts (those whose motion is not guided by the protocol) conduct concurrent random
walks in their motion space. In particular, our work examines the fundamental problem of communication and proposes distributed
algorithms for it. We provide rigorous proofs of their correctness, and also give performance analyses by combinatorial tools.
Finally, we have evaluated these protocols by experimental means.
r 2002 Elsevier Science (USA). All rights reserved.
1. Introduction
1.1. The rate of topology changes and our approach
Mobile computing has been introduced (mainly as aresult of major technological developments) in the pastfew years forming a new computing environment.Because of the fact that mobile computing is constrainedby poor resources, highly dynamic variable connectivityand volatile energy sources, the design of stable andefficient mobile information systems is greatly compli-cated. Until now, two basic system models have been
proposed for mobile computing. The ‘‘fixed backbone’’mobile system model has been around the past decadeand has evolved to a fairly stable system that can exploita variety of information in order to enhance alreadyexisting services and yet provide new ones. On the otherhand, the ‘‘ad hoc’’ system model assumes that mobilehosts can form networks without the participation ofany fixed infrastructure, a fact that further complicatestheir design and implementation.
An ad hoc mobile network [21,32] is a collection ofmobile hosts with wireless interfaces forming a tempor-ary network without the aid of any established infra-
structure or centralized administration. In an ad hocnetwork two hosts that want to communicate may notbe within wireless transmission range of each other, butcould communicate if other hosts between them are alsoparticipating in the ad hoc network and are willing toforward packets for them.
Ad hoc mobile networks have been modeled by mostresearchers by a set of independent mobile nodescommunicating by message passing over a wireless
$A preliminary version of this work has appeared in the 15th
Symposium on Distributed Computing (DISC), in the 4th, 5th
Workshops on Algorithmic Engineering (WAE), see [9,11,12].$$This work was partially supported by the EU projects IST FET-
OPEN ALCOM-FT, FLAGS, CRESCCO and IMPROVING RTN
ARACNE.
*Corresponding author.
E-mail addresses: [email protected] (I. Chatzigiannakis), [email protected]
(S. Nikoletseas), [email protected] (P. Spirakis).
0743-7315/03/$ - see front matter r 2002 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0743-7315(02)00034-5
network. The network is modeled as a dynamicallychanging, not necessarily connected, undirected graph,with nodes as vertices and edges (virtual links) betweenvertices corresponding to nodes that can currentlycommunicate. Such a model is used, for example, in[23,26,32].
However, the proof of correctness of algorithmspresented under such settings requires a bound on the
virtual link changes that can occur at a time. As [32] alsostates, in ad hoc mobile networks the topologicalconnectivity is subject to frequent, unpredictablechanges (due to the motion of hosts). If the rate oftopological change is very high, then structured algo-rithms (i.e. algorithms that try to maintain datastructures based on connectivity) fail to react fastenough and [32] suggests that, in such cases, the onlyviable alternative is flooding. If the rate is low (‘‘quasi-static’’ networks) or medium, then adaptive algorithmictechniques can apply. In such settings many nice workshave appeared like e.g. the work of [26] on leaderelection, the work of [1] on communication (whichhowever examines only static transmission graphs), etc.
The most common way to establish communication inad hoc networks is to form paths of intermediate nodesthat lie within one another’s transmission range and candirectly communicate with each other [21,23,30,33]. Themobile nodes act as hosts and routers at the same time inorder to propagate packets along these paths. Indeed,this approach of exploiting pair wise communication iscommon in ad hoc mobile networks that cover arelatively small space (i.e. with diameter which is smallwith respect to transmission range) or are dense (i.e.thousands of wireless nodes) where all locations areoccupied by some hosts; broadcasting can be efficientlyaccomplished.
In wider area ad hoc networks with less users,however, broadcasting is impractical: two distant peerswill not be reached by any broadcast as users may notoccupy all intermediate locations (i.e. the formation of apath is not feasible). Even if a valid path is established,single link ‘‘failures’’ happening when a small number ofusers that were part of the communication path move ina way such that they are no longer within transmissionrange of each other, will make this path invalid. Notealso that the path established in this way may be verylong, even in the case of connecting nearby hosts.
We indeed conjecture that, in cases of high mobilityrate and/or of low density of user spreading, onecan even state an impossibility result: any algorithmthat tries to maintain a global structure with respectto the temporary network will be erroneous if themobility rate is faster than the rate of updates of thealgorithm.
In contrast to all such methods, we try to avoid ideasbased on paths finding and their maintenance. Weenvision networks with highly dynamic movement of the
mobile users, where the idea of ‘‘maintenance’’ of a validpath is inconceivable (paths can become invalidimmediately after they have been added to the routingtables). Our approach is to take advantage of the mobilehosts natural movement by exchanging informationwhenever mobile hosts meet incidentally. It is evident,however, that if the users are spread in remote areas andthey do not move beyond these areas, there is no way forinformation to reach them, unless the protocol takesspecial care of such situations.
In the light of the above, we propose the idea offorcing a small subset of the deployed hosts to move as
per the needs of the protocol. We call this set the‘‘support’’ of the network. Assuming the availability ofsuch hosts, the designer can use them suitably by
specifying their motion in certain times that thealgorithm dictates. We admit that the assumption ofavailability of such hosts for algorithms deviates fromthe ‘‘pure’’ definition of ad hoc mobile networks.However, we feel that our approach opens up an areafor distributed algorithms design for ad hoc mobilenetworks of high mobility rate.
Our approach is motivated by two research directionsof the past:
(a) The ‘‘two tier principle’’ [21], stated for mobilenetworks with a fixed subnetwork however, whichsays that any protocol should try to move commu-nication and computation to the fixed part of thenetwork. Our assumed set of hosts that are co-ordinated by the protocol simulates such a (skele-ton) network; the difference is that the simulationactually constructs a co-ordinated moving set.
(b) A usual scenario that fits to the ad hoc mobilemodel is the particular case of rapid deployment ofmobile hosts, in an area where there is no under-lying fixed infrastructure (either because it isimpossible or very expensive to create such aninfrastructure, or because it is not established yet,or because it has become temporarily unavailablei.e. destroyed or down).
In such a case of rapid deployment of a number ofmobile hosts, it is possible to have a small team of fastmoving and versatile vehicles, to implement the support.These vehicles can be cars, jeeps, motorcycles orhelicopters. We interestingly note that this small teamof fast moving vehicles can also be a collection ofindependently controlled mobile modules, i.e. robots.This specific approach is inspired by the recent paper ofWalter et al. In their paper ‘‘Distributed Reconfigura-tion of Metamorphic Robot Chains’’ [35] the authorsstudy the problem of motion co-ordination in distrib-uted systems consisting of such robots, which canconnect, disconnect and move around. The paper dealswith metamorphic systems where (as is also the case inour approach) all modules are identical.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 59
Note that the approach of having the support movingin a co-ordinated way, i.e. as a chain of nodes, has somesimilarities to [35].
1.2. Previous and related work
In a recent paper [24], Li and Rus present a modelwhich has some similarities to ours. The authors give aninteresting, yet different, protocol to send messages,which forces all the mobile hosts to slightly deviate (for a
short period of time) from their predefined, deterministic
routes, in order to propagate the messages. Theirprotocol is, thus, compulsory for any host, and it worksonly for deterministic host routes. Moreover, theirprotocol considers the propagation of only one message(end to end) each time, in order to be correct. Incontrast, our support scheme allows for simultaneousprocessing of many communication pairs. In theirsetting, they show optimality of message transmissiontimes.
Adler and Scheideler [1] deal only with static
transmission graphs i.e. the situation where the positionsof the mobile hosts and the environment do not change.In [1] the authors pointed out that static graphs providea starting point for the dynamic case. In our work, weconsider the dynamic case (i.e. mobile hosts movearbitrarily) and in this sense we extend their work. Asfar as performance is concerned, their work providestime bounds for communication that are proportional tothe diameter of the graph defined by random uniformspreading of the hosts.
We note here (see also Section 1.1) that motion co-ordination ideas for distributed systems of metamorphicrobots have been introduced by Walter et al. in [35]. Intheir paper ‘‘Distributed Reconfiguration of Meta-morphic Robot Chains’’ the authors study the problemof motion co-ordination in distributed systems consist-ing of such robots, which can connect, disconnect andmove around. The paper deals with metamorphicsystems where all modules are identical.
An interesting approach for Leader Election in ad hocmobile networks is presented in the paper of Malpaniet al. [26]. There, the authors present algorithms,based on the TORA [30] routing method, whichmaintain wireless links and guarantee exactly oneleader per connected component of the implied virtuallinks network among mobile hosts. Their proof ofcorrectness assumes a bounded rate of topologychanges.
1.3. The explicit model of motions
1.3.1. The motion space
The set of previous research that follows the approachof slowly changing communication graphs, models hostsmotions only implicitly, i.e. via the pre-assumed upper
bound on the rate of virtual link changes. In contrast,we propose an explicit model of motions because it isapparent that the motions of the hosts are the cause ofthe fragility of the virtual links.
Thus we distinguish explicitly between (a) the fixed
(for any algorithm) space of possible motions of themobile hosts and (b) the kind of motions that the hostsperform inside this space. In the sequel we have decidedto model the space of motions only combinatorially, i.e.as a graph. We however believe that other research willcomplement this effort by introducing geometry detailsinto the model.
In particular, we abstract the environment where thestations move (in three-dimensional space with possibleobstacles) by a motion-graph (i.e. we neglect the detailedgeometric characteristics of the motion. We expect thatfuture research will incorporate geometry constraintsinto the subject). In particular, we first assume that eachmobile host has a transmission range represented by asphere tr centered by itself. This means that any otherhost inside tr can receive any message broadcast by thishost. We approximate this sphere by a cube tc withvolume VðtcÞ; where VðtcÞoVðtrÞ: The size of tc canbe chosen in such a way that its volume VðtcÞ is themaximum that preserves VðtcÞoVðtrÞ; and if a mobilehost inside tc broadcasts a message, this message isreceived by any other host in tc: Given that the mobilehosts are moving in the space S; S is divided intoconsecutive cubes of volume VðtcÞ:
Definition 1.1. The motion graph GðV ;EÞ; (jV j ¼ n;jEj ¼ m), which corresponds to a quantization of S isconstructed in the following way: a vertex uAG
represents a cube of volume VðtcÞ: An edge ðu; vÞAG
if the corresponding cubes are adjacent.
The number of vertices n; actually approximates theratio between the volume VðSÞ of space S; and thespace occupied by the transmission range of a mobilehost VðtrÞ: In the extreme case where VðSÞCVðtrÞ(the transmission range of the hosts approximates thespace where they are moving), then n ¼ 1: Given thetransmission range tr; n depends linearly on the volume
of space S regardless of the choice of tc; and n ¼OðVðSÞ
VðtrÞÞ: Let us call the ratio VðSÞVðtrÞ by the term relative
motion space size and denote it by r: Since the edges of G
represent neighboring polyhedra each node is connectedwith a constant number of neighbors, which yields thatm ¼ YðnÞ: In our example where tc is a cube, G hasmaximum degree of six and mp6n:
Thus motion graph G is (usually) a bounded degree
graph as it is derived from a regular graph of smalldegree by deleting parts of it corresponding to motion orcommunication obstacles. Let D be the maximum vertexdegree of G:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7460
1.3.2. The motion of the hosts—adversaries
The motion that the hosts perform (we mean here thehosts that are not part of the support i.e. those whosemotion is not specified by the distributed algorithm) is an
input to any distributed algorithm.(a) In the general case, we assume that the motions of
such hosts are decided by an oblivious adversary: Theadversary determines motion patterns in any possibleway but independently of the part of the distributedalgorithm that specifies motions of the support. In otherwords, we exclude the case where some of the hosts notin the support are deliberately trying to maliciously
affect the protocol (e.g. to avoid the hosts in thesupport). This is a pragmatic assumption usuallyfollowed by applications. We call such motion adver-saries the restricted motion adversaries.
(b) For purposes of studying efficiency of distributedalgorithms for ad hoc networks on the average, weexamine the case where the motions of any host notaffected by the algorithm are modeled by concurrent and
independent random walks. In fact, the assumption thatthe mobile users move randomly, either according touniformly distributed changes in their directions andvelocities or according to the random waypoint mobilitymodel by picking random destinations, has been used byother research (see e.g. [17,20]).
We interestingly note here a fundamental result ofgraph theory, according to which dense graphs look likerandom graphs in many of their properties. This hasbeen noticed for expander graphs at least by [2,5] and iscaptured by the famous Szemeredi’s Regularity Lemma[34] which says that in some sense most graphs can beapproximated by random-looking graphs.
In analogy, we conjecture that any set of dense (in
number) but arbitrary otherwise motions of many hostsin the motion space can be approximated (at least withrespect to their meeting and hitting times statistics) by aset of concurrent dense random walks. But the meetingtimes statistics of the hosts essentially determine thevirtual fragile links of any ad hoc network. We thusbelieve that our suggestion for adoption of concurrentrandom walks as a model for input motions, is not onlya tool for average case performance analysis but it mightin fact approximate well any ad hoc network of densemotions.
2. A framework for protocol classes
Protocols (distributed algorithms) for ad hoc mobilenetworks can be divided into three major categories:
Definition 2.1. Non-Compulsory protocols are the oneswhose execution does not affect the movement of themobile hosts. On the other hand, compulsory protocolsare those that require all hosts to perform certain moves
in order to ensure the correct protocol execution.Finally, the class of ad hoc mobile networks protocolswhich enforce a (small) subset of the mobile hosts tomove in a certain way is called the class of semi-
compulsory protocols.Note that non-compulsory protocols try to take
advantage of the mobile hosts natural movement byexchanging information whenever mobile hosts meetincidentally. Compulsory protocols force the mobilehosts to move according to a specific scheme in order tomeet the protocol demands (i.e. meet more often, spreadin a geographical area, etc.).
Note also that we can further categorize each classdepending on whether or not the mobile hosts have(individual or common) sense of orientation in themotion space. We can strengthen this by assumingabilities of geolocation given to mobile hosts. Suchlocation information (i.e. complete co-ordinates accord-ing to some origin) can be obtained using globalpositioning system (GPS) [8,15,22,29,31] facilities.
Definition 2.2. The subset of the mobile hosts of an adhoc mobile network whose motion is determined by anetwork protocol P is called the support S of P: Thepart of P which indicates the way that members of Smove and communicate is called the support management
subprotocol MS of P:
This definition captures network management ideasfor ad hoc mobile networks.
Notice that our scheme defines a support (and itsmanagement subprotocol) suitable not only for pair wisecommunication but also for a whole set of basicproblems including many-to-one communication, infor-mation spreading and multicasting.
Definition 2.3. Consider a family of protocols, F; for amobile ad hoc network, and let each protocol P in Fhave the same support (and the same support manage-ment subprotocol). Then S is called the support of the
family F.
Our scheme follows the general design principle ofmobile networks (with a fixed subnetwork however)called the ‘‘two-tier’’ principle [21] which says that anyprotocol should try to move communication andcomputation to the fixed part of the network. Our ideaof the support S is a simulation of such a (skeleton)network, by moving hosts however.
In addition, we may wish that the way hosts in Smove (maybe co-ordinated) and communicate is robust(i.e. that it can tolerate failures of hosts).
The types of failures of hosts that we consider here arepermanent (i.e. stop) failures. Once such a fault happensthen the host of the fault does not participate in the adhoc mobile network anymore.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 61
Definition 2.4. A support management subprotocol,MS; is k-faults tolerant, if it still allows the membersof F (or P) to execute correctly, under the presence ofat most k permanent faults of hosts in S (kX1).
3. Basic communication algorithms
3.1. Problem definition
A basic communication problem, in ad hoc mobilenetworks, is to send information from some sender user,MHS; to another designated receiver user, MHR:
One way to solve this problem is the protocol ofnotifying every user that the sender MHS meets (andproviding all the information to it) hoping that some ofthem will eventually meet the receiver MHR:
Is there a more efficient technique (other thannotifying every user that the sender meets, in hope thatsome of them will then eventually meet the receiver) thatwill effectively solve the basic communication problemwithout flooding the network and exhausting the batteryand computational power of the hosts?
The most common way to establish communication isto form paths of intermediate nodes that lie within oneanother’s transmission range and can directly commu-nicate with each other [7,18,23,30,33,36]. Indeed, thisapproach of exploiting pair-wise communication iscommon in ad hoc mobile networks that cover arelatively small space (i.e. with diameter which is smallwith respect to transmission range) or are dense (i.e.thousands of wireless nodes) where all locations areoccupied by some hosts; broadcasting can be efficientlyaccomplished in such cases.
In wider area ad hoc networks with less users,however, broadcasting is impractical: two distant peerswill not be reached by any broadcast as users may notoccupy all intermediate locations (i.e. the formation of along path is not feasible). Even if a valid path isestablished, single link ‘‘failures’’ happening when asmall number of users that were part of the commu-nication path move in a way such that they are no longerwithin transmission range of each other, will make thispath invalid. Note also that the path established in thisway may be very long, even in the case of connectingnearby hosts.
In contrast to all such methods, we try to avoid ideasbased on paths finding and their maintenance. Weenvision networks with highly dynamic movement of themobile users, where the idea of ‘‘maintenance’’ of a validpath is inconceivable (paths can become invalidimmediately after they have been added to the directorytables). Our approach is to take advantage of the mobilehosts natural movement by exchanging informationwhenever mobile hosts meet incidentally. It is evident,however, that if the users are spread in remote areas andthey do not move beyond these areas, there is no way for
information to reach them, unless the protocol takesspecial care of such situations.
In the light of the above, we propose the idea offorcing only a small subset of the deployed hosts tomove as per the needs of the protocol, i.e. we propose toexploit the support idea. Assuming the availability ofsuch hosts, we use them to provide a simple, correct andefficient strategy for communication between any pair ofhosts in such networks that avoids message flooding.
A protocol solving this important communicationproblem is reliable if it allows the sender to be notifiedabout delivery of the information to the receiver. Notethat no distributed computing protocol can be imple-mented in ad hoc mobile networks without solving thisbasic communication problem.
3.2. The basic protocol
In simple terms, the protocol works as follows: Thenodes of the support move fast enough so that they visit(in sufficiently short time) the entire motion graph. Theirmotion is accomplished in a distributed way via asupport motion subprotocol P1: When some node of thesupport is within communication range of a sender, anunderlying sensor subprotocol P2 notifies the sender thatit may send its message(s).
The messages are then stored ‘‘somewhere within thesupport structure’’. When a receiver comes withincommunication range of a node of the support, thereceiver is notified that a message is ‘‘waiting’’ for himand the message is then forwarded to the receiver.
The messages received by the support are propagatedwithin the structure when two or more members of thesupport meet on the same site (or are within commu-nication range). A synchronization subprotocol P3 isused to dictate the way that the members of the supportexchange information.
In a way, the support S plays the role of a (moving)skeleton subnetwork (whose structure is defined by themotion subprotocol P1), through which all communica-tion is routed. From the above description, it is clearthat the size, k; and the shape of the support may affectperformance.
Note that the proposed scheme does not require thepropagation of messages through hosts that are not partof S; thus its security relies on the support’s security andis not compromised by the participation in messagecommunication of other mobile users. For a discussionof intrusion detection mechanisms for ad hoc mobilenetworks see [36].
3.3. The ‘‘Snake’’ support management subprotocol MSS
The main idea of the protocol proposed in [9,11] is asfollows. There is a set-up phase of the ad hoc network,
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7462
during which a predefined set, k; of hosts, become thenodes of the support. The members of the supportperform a leader election by running a randomizedbreaking symmetry protocol in anonymous networks(see e.g. [19]). This is run once and imposes only aninitial communication cost. The elected leader, denotedby MS0; is used to co-ordinate the support topologyand movement. Additionally, the leader assignslocal names to the rest of the support membersMS1;MS2;y;MSk�1:
The nodes of the support move in a co-ordinated way,always remaining pair wise adjacent (i.e., forming achain of k nodes), so that they sweep (given some time)the entire motion graph. This encapsulates the support
motion subprotocol PS1 : Essentially the motion subpro-
tocol PS1 enforces the support to move as a ‘‘Snake’’,
with the head (the elected leader MS0) doing a randomwalk on the motion graph and each of the other nodesMSi executing the simple protocol ‘‘move where MSi�1
was before’’. More formally, the movement of S isdefined as follows:
Initially, MSi; 8iAf0; 1;y; k � 1g; start from thesame area-node of the motion graph. The direction ofmovement of the leader MS0 is given by a memorylessoperation that chooses randomly the direction of the nextmove. Before leaving the current area-node, MS0 sendsa message to MS1 that states the new direction ofmovement. MS1 will change its direction as perinstructions of MS0 and will propagate the message toMS2: In analogy, MSi will follow the orders of MSi�1
after transmitting the new directions to MSiþ1: Move-ment orders received by MSi are positioned in aqueue Qi for sequential processing. The very first moveof MSi; 8iAf1; 2;y; k � 1g is delayed by a d period oftime.
We assume that the mobile support hosts move with acommon speed. Note that the above described motionsubprotocol PS
1 enforces the support to move as a‘‘Snake’’, with the head (the elected leader MS0) doing a
random walk on the motion graph G and each of theother nodes MSi executing the simple protocol ‘‘movewhere MSi�1 was before’’. This can be easily imple-mented because MSi will move following the edge fromwhich it received the message from MSi�1 and thereforeour protocol does not require common sense oforientation.
The purpose of the random walk of the head is toensure a cover (within some finite time) of the wholemotion graph, without memory (other than local) oftopology details. Note that this memoryless motion alsoensures fairness. The value of the random walk principle
of the motion of the support will be further justified inthe correctness and the efficiency parts of the paper,where we wish our basic communication protocol tomeet some performance requirements regardless of the
motion of the hosts not in S.
A modification of MSS is that the head does a random
walk on a spanning subgraph of G (e.g. a spanning tree).This modified MS (call it M 0
S) is more efficient in our
setting since ‘‘edges’’ of G just represent adjacentlocations and ‘‘nodes’’ are really possible host places.
A simple approach to implement the support syn-
chronization subprotocol PS3 is to use a forward mechan-
ism that transmits incoming messages (in-transit) to
neighboring members of the support (due to PS1 at least
one support host will be close enough to communicate).In this way, all incoming messages are eventually copiedand stored in every node of the support. This is not themost efficient storage scheme and can be refined invarious ways in order to reduce memory requirements.
3.4. The ‘‘Runners’’ support management subprotocol
MRS
A different approach to implement MS is to alloweach member of S not to move in a Snake-like fashion,but to perform an independent random walk on themotion graph G; i.e., the members of S can be viewed as‘‘Runners’’ running on G: In other words, instead ofmaintaining at all times pair wise adjacency betweenmembers of S; all hosts sweep the area by movingindependently of each other. When two runners meet,they exchange any information given to them by sendersencountered using a new synchronization subprotocol
PR3 : As in the Snake case, the same underlying sensor
subprotocol P2 is used to notify the sender that it maysend its message(s) when being within communicationrange of a node of the support.
As presented in [12], the Runners protocol does notuse the idea of a (moving) backbone subnetwork as no
motion subprotocol PR1 is used. However, all commu-
nication is still routed through the support S and weexpect that the size k of the support (i.e., the number ofrunners) will affect performance in a more efficient waythan that of the Snake approach. This expectation stemsfrom the fact that each host will meet each other inparallel, accelerating the spread of information (i.e.,messages to be delivered).
A member of the support needs to store allundelivered messages, and maintain a list of receipts tobe given to the originating senders. For simplicity, wecan assume a generic storage scheme where allundelivered messages are members of a set S1 and thelist of receipts is stored on another set S2: In fact, theunique ID of a message and its sender ID is all that isneeded to be stored in S2:
When two runners meet at the same vertex of the
motion graph G; the synchronization subprotocol PR3 is
activated. This subprotocol imposes that when runnersmeet on the same vertex, their sets S1 and S2 aresynchronized. In this way, a message delivered by some
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 63
runner will be removed from the set S1 of the rest ofrunners encountered, and similarly delivery receiptsalready given will be discarded from the set S2 of the
rest of runners. The synchronization subprotocol PR3 is
partially based on the two-phase commit algorithm aspresented in [25] and works as follows.
Let the members of S residing on the same site (i.e.,vertex) u of G be MSu
1 ;y;MSuj : Let also S1ðiÞ (resp.
S2ðiÞ) denote the S1 (resp. S2) set of runner MSui ;
1pipj: The algorithm assumes that the underlyingsensor subprotocol P2 informs all hosts about the
runner with the lowest ID, i.e., the runner MSu1 : PR
3
consists of two rounds.Round 1. All MSu
1 ;y;MSuj residing on vertex u of G;
send their S1 and S2 to runner MSu1 : Runner MSu
1
collects all the sets and combines them with its own tocompute its new sets S1 and S2: S2ð1Þ ¼
S1plpj S2ðlÞ
and S1ð1Þ ¼S
1plpj S1ðlÞ � S2ð1Þ:Round 2. Runner MSu
1 broadcasts its decision to all
the other support member hosts. All hosts that receivedthe broadcast apply the same rules, as MSu
1 did, to join
their S1 and S2 with the values received. Any host thatreceives a message at Round 2 and which has notparticipated in Round 1, accepts the value received inthat message as if it had participated in Round 1.
This simple algorithm guarantees that mobile hostswhich remain connected (i.e., are able to exchangemessages) for two continuous rounds, will manage tosynchronize their S1 and S2: Furthermore, on the eventthat a new host arrives or another disconnects duringRound 2, the execution of the protocol will not beaffected. In the case where runner MSu
1 fails to broad-
cast in Round 2 (either because of an internal failure orbecause it has left the site), then the protocol is simplyre-executed among the remaining runners.
Remark that the algorithm described above does notoffer a mechanism to remove message receipts from S2;eventually the memory of the hosts will be exhausted. Asimple approach to solve this problem and effectivelyreduce the memory usage is to construct an order list ofIDs contained in S2; for each receiver. This orderedsequence of IDs will have gaps—some messages will stillhave to be confirmed, and thus are not in S2: In this list,we can identify the maximum ID before the first gap.We are now able to remove from S2 all message receiptswith smaller ID than this ID.
3.5. The hierarchical approach
In this section we briefly introduce a new model of adhoc mobile networks, which we call hierarchical [10] thatare comprised of dense subnetworks of mobile usersinterconnected across access ports by sparse but fastconnections.
The lower level of the hierarchy may model dense adhoc subnetworks of mobile users that are unstructuredand where there is no fixed infrastructure. To implementcommunication in such a case, a possible solutionwould be to install a very fast (yet limited) backbone
interconnecting such highly populated mobile userareas, while using the support approach in the lowerlevels. We assume that this fast backbone provides alimited number of access ports within these dense areasof mobile users. We call this fast backbone the higher
level of the hierarchy.For a detailed description and the analysis of
communication in the hierarchical case see [10,13].
3.6. Analytical considerations for the ‘‘Snake’’ protocol
3.6.1. Preliminaries
In the sequel, we assume that the head of the Snakedoes a continuous time random walk on GðV ;EÞ; withoutloss of generality (if it is a discrete time random walk, allresults transfer easily, see [4]). We define the randomwalk of a host on G that induces a continuous timeMarkov chain MG as follows: The states of MG are thevertices of G and they are finite. Let st denote the state ofMG at time t: Given that st ¼ u; uAV ; the probabilitythat stþdt ¼ v; vAV ; is pðu; vÞ dt where
pðu; vÞ ¼1
dðuÞ if ðu; vÞAE;
0 otherwise
(
and dðuÞ is the degree of vertex u:We assume that all random walks are concurrent and
that there is a global time t; not necessarily known to thehosts.
Note 3.1. Since the motion graph G is finite andconnected, the continuous Markov chain abstractingthe random walk on it is automatically time-reversible.
Definition 3.1. PiðEÞ is the probability that the randomwalk satisfies an event E given it started at vertex i:
Definition 3.2. For a vertex j; let Tj ¼ minftX0: st ¼ jgbe the first hitting time of the walk onto that vertex andlet EiTj be its expected value, given that the walk started
at vertex i of G:
Definition 3.3. For the random walk of any particularhost, let pðÞ be the stationary distribution of its positionafter a sufficiently long time.
We denote Em½ � the expectation for the chain started
at time 0 from any vertex with distribution m (e.g. theinitial distribution of the Markov chain).
We know (see [4]) that for every vertex s; pðsÞ ¼ dðsÞ2m
where dðsÞ is the degree of s in G and m ¼ jEj:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7464
Definition 3.4. Let pj;k be the transition probability of
the random walk from vertex j to vertex k: Let pj;kðtÞ bethe probability that the random walk started at j will beat kAV in time t:
Definition 3.5. Let XðtÞ be the position of the randomwalk at time t:
3.6.2. Correctness guarantees under the restricted motion
adversary
Let us now consider the case where any sender orreceiver is allowed a general, unknown motion strategy,but its strategy is provided by a restricted motionadversary. This means that each host not in thesupport either executes a deterministic motion (whicheither stops at a node or cycles forever after someinitial part) or it executes a stochastic strategy whichhowever is independent of the motion of thesupport.
Theorem 3.1. The support S and the management
subprotocol MS guarantee reliable communication be-
tween any sender–receiver ðMHS; MHRÞ pair in finite
time, whose expected value is bounded only by a function
of the relative motion space size r and does not depend on
the number of hosts, and is also independent of how MHS;MHR move, provided that the mobile hosts not in the
support do not deliberately try to avoid the support.
Proof. For the proof purposes, it is enough to show thata node of S will meet MHS and MHR infinitely often,with probability 1 (in fact our argument is a conse-quence of the Borel–Cantelli Lemmas for infinitesequences of trials). We will furthermore show that thefirst meeting time M (with MHS or MHR) has anexpected value (where expectation is taken over the walkof S and any strategy of MHS (or MHR) and anystarting position of MHS (or MHR) and S) which isbounded by a function of the size of the motion graph G
only. This then proves the Theorem since it showsthat MHS (and MHR) meet with the head of S infi-nitely often, each time within a bounded expectedduration.
So, let EM be the expected time of the (first) meeting
and mn ¼ sup EM; where the supremum is taken overall starting positions of both S and MHS (or MHR) andall strategies of MHS (one can repeat the argument withMHR).
We proceed to show that we can construct for thewalk of S’s head a strong stationary time sequence Vi
such that for all sAV and for all times t
PiðXðViÞ ¼ s j Vi ¼ tÞ ¼ pðsÞ:
Remark that strong stationary times were introducedby Aldous and Diaconis in [3].
Notice that at times Vi; MHS (or MHR) willnecessarily be at some vertex s of V ; either still movingor stopped.
Let u be a time such that for X ;
pj;kðuÞX 1� 1
e
� �pðkÞ
for all j; k: Such a u always exists because pj;kðtÞconverges to pðkÞ from basic Markov Chain Theory.
Note that u depends only on the structure of thewalk’s graph, G: In fact, if one defines separation fromstationarity to be sðtÞ ¼ maxj sjðtÞ where sjðtÞ ¼supfs: pijðtÞXð1� sÞpðjÞg then
tð1Þ1 ¼ minft: sðtÞpe�1g
is called the separation threshold time. For generalgraphs G of n vertices this quantity is known to be
Oðn3Þ [6].Now consider a sequence of stopping times
UiAfu; 2u; 3u;yg such that
PiðXðUiÞ ¼ s j Ui ¼ uÞ ¼ 1
epðsÞ ð1Þ
for any sAV : By induction on lX1 then
PiðXðUiÞ ¼ s j Ui ¼ luÞ ¼ e�ðl�1Þ 1
epðsÞ:
This is because of the following: First remark that forl ¼ 1 we get the definition of Ui: Assume that therelation holds for ðl� 1Þ i.e.
PiðXðUiÞ ¼ s j Ui ¼ ðl� 1ÞuÞ ¼ e�ðl�2Þ 1
epðsÞ
for any sAV : Then 8sAV
PiðXðUiÞ ¼ s j Ui ¼ luÞ¼XaAV
PiðX ðUiÞ ¼ a j Ui ¼ ðl� 1ÞuÞ
PðX ðUi þ uÞ ¼ s j X ðUiÞ ¼ aÞ
¼ e�ðl�2Þ 1
e
XaAV
pðaÞ 1epðsÞ fromð1Þ
¼ e�ðl�1Þ 1
epðsÞ
which ends the induction step. Then, for all s
PiðXðUiÞ ¼ sÞ ¼ pðsÞ ð2Þ
and, from the geometric distribution with q ¼ 1e;
EiUi ¼ u 1� 1
e
� ��1
:
Now let c ¼ u ee�1
: So, we have constructed (by (2)) a
strong stationary time sequence Ui with EUi ¼ c:Consider the sequence 0 ¼ U0oU1oU2o? such thatfor iX0
EðUiþ1 � Ui j Uj; jpiÞpc:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 65
But, from our construction, the positions X ðUiÞ areindependent (of the distribution pðÞ), and, in particular,XðUiÞ are independent of Ui:
Therefore, regardless of the strategy of MHS (orMHR) and because of the independence assumption, thesupport’s head has chance at least mins pðsÞ to meetMHS (or MHR) at time Ui; independently of how i
varies.So, the meeting time M satisfies MpUT where T is a
stopping time with mean
ETp1
mins pðsÞ:
Note that the idea of a stopping time T such thatXðTÞ has distribution p and is independent of thestarting position is central to the standard modernTheory of Harris—recurrent Markov Chains (seee.g. [16]).
From Wald’s equality [4] then
EUTpc ET ) mnpc1
mins pðsÞ:
Note that since G is produced as a subgraph of aregular graph of fixed degree D we have
1
2mppðsÞp1
n
for all s ðn ¼ jV j; m ¼ jEjÞ; thusETp2m
hence
mnp2mc ¼ e
e � 12mu
Since m; u only depend on G; this proves theTheorem. &
Corollary 3.1. If S’s head walks randomly in a regular
spanning subgraph of G; then mnp2cn:
3.6.3. Worst-case time efficiency for the restricted motion
adversary
Clearly, one intuitively expects that if k ¼ jSj is thesize of the support then the higher k is (with respect ton), the best the performance of S gets.
By working as in the construction of the proof ofTheorem 3.1, we can create a sequence of strongstationary times Ui such that XðUiÞAF where F ¼fs: s is a position of a host in the supportg: Then pðsÞ isreplaced by pðFÞ which is just pðFÞ ¼
PpðsÞ over all
sAF : So now mn is bounded as follows:
mnpc1
minsAJðP
pðsÞÞ;
where J is any induced subgraph of the graph of thewalk of S’s head such that J is the neighborhood of avertex s of radius (maximum distance simple path) at
most k: The quantity
minJ
XsAJ
pðsÞ !
is then at least k2m
(since each pðsÞ is at least 12m
and there
are at least k vertices s in the induced subgraph), hence,
mnpc2mk:
The overall communication is given by
Ttotal ¼ X þ TS þ Y ; ð3Þ
where X is the time for the sender node to reach a nodeof the support; TS is the time for the message topropagate inside the support. Clearly, TS ¼ OðkÞ; i.e.linear in the support size; Y is the time for the receiver tomeet with a support node, after the propagation of themessage inside the support.
We have for all MHS; MHR:
EðTtotalÞp2mc
kþYðkÞ þ 2mc
k:
The upper bound achieves a minimum when k ¼ffiffiffiffiffiffiffiffiffi2mc
p:
Lemma 3.1. For the walk of S’s head on the entire motion
graph G; the communication time’s expected value is
bounded above by Yðffiffiffiffiffiffimc
pÞ when the (optimal) support
size jSj isffiffiffiffiffiffiffiffiffi2mc
pand c is e
e�1u; u being the ‘‘separation
threshold time’’ of the random walk on G:
Remark that other authors use for u the symbol tð1Þ1
for a symmetric Markov Chain of continuous time on n
states.
3.6.4. Tighter bounds for the worst case of motion
To make our protocol more efficient, we nowforce the head of S to perform a random walk on aregular spanning graph of G: Let GRðV ;E0Þ be such asubgraph.
Our improved protocol versions assume that (a) sucha subgraph exists in G and (b) is given in the beginningto all the stations of the support.
Then, for any sAV ; and for this new walk X 0; we have
for the steady-state probabilities pX 0 ðsÞ ¼ 1nfor all s
(since in this case dðsÞ ¼ r and 2m ¼ rn; for all s; wherer is the degree of any s).
Let now M be again the first meeting time of MHS (orMHR) and S’s head. By the stationarity of X 0 and sinceall pX 0 ðsÞ are equal,Z t
0
PðMHS;X 0 are together at time sÞ ds ¼ t
nð4Þ
Now let
pnðtÞ ¼ maxx;v
px;vðtÞ:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7466
Regardless of initial positions, the chance that MHS
(or MHR), S’s head are together (i.e. at the same vertex)
at time u is at most pnðuÞ: So, by assuming first that S’shead starts with the stationary distribution, we get
PðMHS;X 0 are together at time sÞ
¼Z s
0
f ðuÞPðtogether at time s j M ¼ uÞ du;
where f ðuÞ is the (unknown) probability density func-tion of the meeting time M: So
PðMHS;X 0 together at time sÞpZ s
0
f ðuÞpnðs � uÞ du:
But we know (see [4]) that
Lemma 3.2 (Aldous and Fill [4]). There exists an
absolute constant K such that
pnðtÞp1
nþ Kt�
12 where 0ptoN:
Thus
PðMHS;X 0 are together at time sÞ
p1
nPðMpsÞ þ K
Z s
0
f ðuÞðs � uÞ�12 du
So, from (4), we get
t
np1
n
Z t
0
PðMpsÞ ds þ K
Z t
0
f ðuÞ du
Z t
0
ðs � uÞ�12 ds
¼ t
n� 1
n
Z t
0
PðM4sÞ ds þ 2K
Z t
0
f ðuÞðt � uÞ�12 du
pt
n� 1
nE minðM; tÞ þ 2Kt�
12:
So, the expected of the minimum of M and t is
E minðM; tÞp2Knt�12:
Taking t0 ¼ ð4KnÞ2; from Markov’s inequality we get
PðMpt0ÞX12:
When S starts at some arbitrary vertex, we can use thenotion of separation time sðuÞ (a time to approach
stationarity by at least a fraction of ð1� 1eÞ�1Þ to get
PðMpu þ t0ÞX1� sðuÞ
2
i.e., by iteration,
EMp2ðu þ t0Þ1� sðuÞ ;
where u is as in the construction of Theorem 3.1. Thus,
mnp2
ð1� 1eÞðt0 þ uÞ
but for regular graphs it is known (see e.g. [4]) that
u ¼ Oðn2Þ implying (together with our remarks aboutS’s size and set of positions) the following theorem:
Theorem 3.2. By having S’s head to move on a regular
spanning subgraph of G; there is an absolute constant
g40 such that the expected meeting time of MHS (or
MHR) and S is bounded above by g n2
k:
Again, the total expected communication time is
bounded above by 2gn2
kþYðkÞ and by choosing k ¼ffiffiffiffiffiffiffiffiffi
2gn2p
we can get a best bound of YðnÞ for a support
size of YðnÞ:Recall that n ¼ OðVðSÞ
VðtrÞÞ i.e. n is linear to the ratio r of
the volumes of the space of motions and the transmis-sion range of each mobile host. Thus,
Corollary 3.2. By forcing the support’s head to move on a
regular spanning subgraph of the motion graph, our
protocol guarantees a total expected communication time
of YðrÞ; where r is the relative motion space size, and this
time is independent of the total number of mobile hosts,and their movement.
Note 3.2. Our analysis assumed that the head of Smoves according to a continuous time random walk oftotal rate 1 (rate of exit out of a node of G). If we selectthe support’s hosts to be c times faster than the rest ofthe hosts, all the estimated times, except of the inter-support time, will be divided by c: Thus
Corollary 3.3. Our modified protocol where the support
is c times faster than the rest of the mobile hosts
guarantees an expected total communication time which
can be made to be as small as Yðg rffiffiffic
p Þ where g is an
absolute constant.
3.6.5. Time efficiency—a lower bound
Lemma 3.3.
mnXmaxi;j EiTj:
Proof. Consider the case where MHS (or MHR) juststands still on some vertex j and S’s head starts at i: &
Corollary 3.4. When S starts at positions according to
the stationary distribution of its head’s walk then
mnXmax
jEpTj
for jAV ; where p is the stationary distribution.
From a Lemma of ([4, Chapter 4, pp. 21]), we knowthat for all i
EpTiXð1� pðiÞÞ2
qipðiÞ;
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 67
where qi ¼ di is the degree of i in G i.e.,
EpTiXmini
ð1� di
2mÞ2
didi
2m
Xmini
1
2m
ð2m � diÞ2
d2i
:
For regular spanning subgraphs of G of degree D we
have m ¼ Dn2; where di ¼ D for all i: Thus,
Theorem 3.3. When S’s head moves on a regular
spanning subgraph of G; of m edges, we have that the
expected meeting time of MHS (or MHR) and S cannot
be less thanðn�1Þ22m
:
Corollary 3.5. Since m ¼ YðnÞ we get a YðnÞ lower
bound for the expected communication time. In that sense,our protocol’s expected communication time is optimal,for a support size which is YðnÞ:
3.6.6. Time efficiency on the average
Time-efficiency of semi-compulsory protocols for adhoc networks is not possible to estimate without ascenario for the motion of the mobile users not in thesupport (i.e. the non-compulsory part). In a way similarto [9,19], we propose an ‘‘on-the-average’’ analysis byassuming that the movement of each mobile user is a
random walk on the corresponding motion graph G: Wepropose this kind of analysis as a necessary andinteresting first step in the analysis of efficiency of anysemi-compulsory or even non-compulsory protocol forad hoc mobile networks. In fact, the assumption that themobile users are moving randomly (according touniformly distributed changes in their directions andvelocities, or according to the random waypointmobility model, by picking random destinations) hasbeen used in [17,20].
In the light of the above, we assume that any hostnot belonging in the support, conducts a randomwalk on the motion graph, independently of the otherhosts.
The communication time of MSS is the total time
needed for a message to arrive from a sending node u toa receiving node v: Since the communication time, Ttotal;between MHS; MHR is bounded above by X þ Y þ Z;where X is the time for MHS to meet S; Y is the time forMHR to meet S (after X ) and Z is the messagepropagation time within S; in order to estimate theexpected values of X and Y (these are randomvariables), we work as follows:
(a) Note first that X ; Y are, statistically, of the samedistribution, under the assumption that u; v arerandomly located (at the start) in G: Thus EðXÞ ¼EðYÞ:
(b) We now replace the meeting time of u and S by ahitting time, using the following thought experi-ment:
(b1) We fix the support S in an ‘‘average’’ place insideG:
(b2) We then collapse S to a single node (by collapsingits nodes to one but keeping the incident edges).Let H be the resulting graph, s the resulting nodeand dðsÞ its degree.
(b3) We then estimate the hitting time of u to sassuming u is somewhere in G; according to thestationary distribution, p of its walk on H: Wedenote the expected value of this hitting time by
EpTHs :
Thus, now EðTtotalÞ ¼ 2EpTHs þ OðkÞ:
Note 3.3. The equation above amortizes over meetingtimes of senders (or receivers) because it uses thestationary distribution of their walks for their positionwhen they decide to send a message.
Now, proceeding as in [4], we get the followinglemma.
Lemma 3.4. For any node s of any graph H in a
continuous-time random walk
EpTHs p
t2ð1� pðsÞÞpðsÞ ;
where pðsÞ is the (stationary) probability of the walk
at node (state) s and t2 is the relaxation time of the
walk.
Proof. Let Zi;j ¼P
N
t¼oðPi;jðtÞ � pðjÞÞ dt; be the ijth
element of the recurrent potential Z={Zi;j} of the
random walk on H: Here,
PðtÞi;j ¼Prfwalk at vertex j on time t given it starts
at vertex igA corollary of renewal identities relates the mean
hitting time EpTi to Zii (see [4, Chapter 3]) as follows:For all vertices i;
pðiÞEpTi ¼ Zii
hence,
pðiÞEpTi ¼XN0
ðPiiðtÞ � pðiÞÞ dt:
Now, f ðtÞ ¼ PiiðtÞ � pðiÞ is a completely monotone
function, because, by the spectral representation of thetransition matrix, it follows that
PðtÞii � pðiÞ ¼
XmX2
u2im expð�lmtÞ;
where u is the orthonormal matrix of the spectraltheorem for Markov chains. But then, l2 controls thebehavior of f ðtÞ as t-N in the sense
f ðtÞpf ð0Þ expð�l2tÞ:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7468
But f ð0Þ ¼ 1� pðiÞ: Then we get for all vertices i;
pðiÞEpTipð1� pðiÞÞXN0
expð�l2tÞ dt
i.e. (for i ¼ s)
EpTsp1
l2
1� pðsÞpðsÞ : &
Note 3.4. In the above bound, t2 ¼ 1l2
where l2 is the
second eigenvalue of the (symmetric) matrix S ¼ fsi;jgwhere si;j ¼
ffiffiffip
pðiÞ pi;j ð
ffiffiffiffiffiffiffiffipðiÞ
pÞ�1 and P ¼ fpi;jg is the
transition matrix of the walk. Since S is symmetric, it isdiagonalizable and the spectral theorem gives the
following representation: S ¼ ULU�1 where U isorthonormal and L is a diagonal real matrix of diagonalentries 0 ¼ l1ol2p?pln0 ; n0 ¼ n � k þ 1 (wheren0 ¼ jVH j). These l’s are the eigenvalues of both P andS: In fact, l2 is an indication of the expansion of H andof the asymptotic rate of convergence to the stationarydistribution, while relaxation time t2 is the correspond-ing interpretation measuring time.
It is a well-known fact (see e.g. [28]) that 8vAVH ;
pðvÞ ¼ dðvÞ2m0 where m0 ¼ jEH j is the number of the edges of
H and dðvÞ is the degree of v in H: Thus pðsÞ ¼ dðsÞ2m0 :
By estimating d(s) and m0 and remarking that theoperation of locally collapsing a graph does not reduce
its expansion capability and hence lH2 XlG
2 (see Lemma
1.15 [14, Chapter 1, p. 13]).
Theorem 3.4.
EðXÞ ¼ EðYÞp 1
l2ðGÞYn
k
� :
Theorem 3.5. The expected communication time of our
scheme is bounded above by the formula
EðTtotalÞp2
l2ðGÞYn
k
� þYðkÞ:
Note 3.5. The above upper bound is minimized when
k ¼ffiffiffiffiffiffiffiffiffi2n
l2ðGÞ
q; a fact also verified by our experiments (see
[9] and also Section 3.7 in this paper).
We remark that this upper bound is tight in the sensethat by [4, Chapter 3],
EpTHs X
ð1� pðsÞÞ2
qspðsÞ;
where qs ¼P
jas psj in Hj is the total exit rate from s in
H: By using martingale type arguments we can showsharp concentration around the mean degree of s:
3.6.7. Robustness
Now, we examine the robustness of the support
management subprotocol MSS and MR
S under single
stop-faults.
Theorem 3.6. The support management subprotocol MSS
is 1-fault tolerant.
Proof. If a single host of S fails, then the next host inthe support becomes the head of the rest of the Snake.We thus have two, independent, random walks in G (ofthe two Snakes) which, however, will meet in expected
time at most mn (as in Theorem 3.1) and re-organize as asingle Snake via a very simple re-organization protocolwhich is the following:
When the head of the second Snake S2 meets a host hof the first Snake S1 then the head of S2 follows the hostwhich is ‘‘in front’’ of h in S1; and all the part of S1 afterand including h waits, to follow S2’s tail. &
Note that in the case that more than one faults occur,the procedure for merging Snakes described above maylead to deadlock (see Fig. 1).
Theorem 3.7. The support management subprotocol MRS
is resilient to t faults, for any 0ptok:
Proof. This is achieved using redundancy: whenever tworunners meet, they create copies of all messages intransit. In the worst-case, there are at most k � t copiesof each message. Note, however, that messages mayhave to be re-transmitted in the case that only one copyof them exists when some fault occurs. To overcome thislimitation, the sender will continue to transmit amessage for which delivery has not been confirmed, toany other members of the support encountered. This
1
3
24
Fig. 1. Deadlock situation when four ‘‘Snakes’’ are about to be
merged.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 69
guarantees that more than one copy of the message willbe present within the support structure. &
3.7. Experimental evaluation
To experimentally validate, fine-tune and furtherinvestigate the proposed protocols, we performed aseries of algorithmic engineering experiments.
All of our implementations follow closely the supportmotion subprotocols described above. They have beenimplemented as C++ classes using several advanceddata types of LEDA [27]. Each class is installed in anenvironment that allows to read graphs from files, andto perform a network simulation for a given number ofrounds, a fixed number of mobile users and certaincommunication and mobility behaviors. After theexecution of the simulation, the environment stores theresults again on files so that the measurements can berepresented in a graphical way.
3.7.1. Experimental set-up
A number of experiments were carried out modelingthe different possible situations regarding the geogra-phical area covered by an ad hoc mobile network. Weconsidered three kinds of inputs, unstructured (random)and more structured ones. Each kind of input corre-sponds to a different type of motion graph.
For all the cases that we investigated, each heS;generates a new message with probability ph ¼ 0:01 (i.e.,a new message is generated roughly every 100 rounds)by picking a random destination host. The selection ofthis value for ph is based on the assumption that themobile users do not execute a real-time application thatrequires continuous exchange of messages. Based on thischoice of ph; each experiment takes several thousands ofrounds in order to get an acceptable average messagedelay. For each experiment, a total of 100,000 messageswere transmitted. We carried out each experiment untilthe 100,000 messages were delivered to the designatedreceivers. Our test inputs are as follows.
Random graphs. This class of graphs is a naturalstarting point in order to experiment on areas withobstacles. We used the Gn;p model obtained by sampling
the edges of a complete graph of n nodes independently
with probability p: We used p ¼ 1:05 log nn
which is
marginally above the connectivity threshold for Gn;p:
The test cases included random graphs withnAf1600; 3200; 6400; 40; 000; 320; 000g over differentvalues of kA½3; 45�:
2D grid graphs. This class of graphs is the simplestmodel of motion one can consider (e.g., mobile hoststhat move on a plane surface). We used two differentffiffiffi
np
�ffiffiffin
pgrid graphs with nAf400; 1600g over different
values of k: For example, ifVðtcÞ ¼ 50 m2 then the area
covered by the ad hoc network will be 20; 000 m2 in thecase where n ¼ 400:
Bipartite multi-stage graphs. A bipartite multi-stagegraph is a graph consisting of a number x of stages (orlevels). Each stage contains n=x vertices and there areedges between vertices of consecutive stages. Theseedges are chosen randomly with some probability p
among all possible edges between the two stages. Thistype of graphs is interesting as such graphs may modelmovements of hosts that have to pass through certainplaces or regions, and have a different second eigenvaluethan grid and Gn;p graphs (their second eigenvalue
lies between that of grid and Gn;p graphs). In our ex-
periments, we considered x ¼ log n stages and chosep ¼ n
x lognx which is the threshold value for bipartite
connectivity (i.e., connectivity between each pair ofstages). The test cases included multi-stage graphswith 7, 8, 9 and 10 stages, number of verticesnAf1600; 3200; 6400; 40000g; and different values ofkA½3; 45�:
3.7.2. Validation of the ‘‘Snake’’ protocol’s performance
(average case)
In [9] we performed extensive experiments to evaluatethe performance of the ‘‘Snake’’ protocol. A number ofexperiments were carried out modeling the differentpossible situations regarding the geographical areacovered by an ad hoc mobile network. We onlyconsidered the average case where all users (even those
not in the support) perform independent and concurrent
random walks.In the first set of experiments, we focused on the
validation of the theoretical bounds provided for theperformance of the ‘‘Snake’’ protocol. For each messagegenerated, we calculated the delay X (in terms ofrounds) until the message has been transmitted to amember of S: Additionally, for each message receivedby the support, we logged the delay Y (in terms ofrounds) until the message was transmitted to thereceiver. We assumed that no extra delay was imposedby the mobile support hosts (i.e. TS ¼ 0). We then usedthe average delay over all measured fractions of messagedelays for each experiment in order to compare theresults and further discuss them.
The experimental results have been used to comparethe average message delays ðEðTtotalÞÞ with the upperbound provided by the theoretical analysis of theprotocol. We observe that as the total number n ofmotion-graph nodes remains constant, as we increasethe size k of S; the total message delay (i.e. EðTtotalÞÞ isdecreased. Actually, EðTtotalÞ initially decreases very fastwith k; while having a limiting behavior of no furthersignificant improvement when k crosses a certainthreshold value. Therefore, taking into account apossible amount of statistical error, the following has
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7470
been experimentally validated:
if k14k2 ) E1ðTtotalÞoE2ðTtotalÞ:Throughout the experiments we used small and
medium-sized graphs, regarding the total number ofnodes in the motion graph. Over the same type of graph(i.e. grid graphs) we observed that the average messagedelay EðTtotalÞ increases as the total number of nodesincreases. This can be clearly seen from Fig. 3 where thecurves of the four different graph-sizes are displayed. Itis safe to assume that for the same graph type and for afixed-size S the overall message delay increases with thesize of the motion graph. We further observe thatEðTtotalÞ does not increase linearly but is affected by theexpansion rate of the graph, or otherwise, as stated inthe theoretical analysis, by the second eigenvalue of thematrix of the graph. If we take into account possibleamount of statistical error that our experiments areinherently prone to, we can clearly conclude thefollowing:
if n14n2 ) E1ðTtotalÞ4E2ðTtotalÞ:Furthermore, we remark that EðTtotal) does not only
depend on the actual size of the graph and the size of thesupport S but, additionally, that it is directly affected bythe type of the graph. This is expressed throughout thetheoretical analysis by the effect of the second eigenva-lue of the matrix of the graph on communication times.If we combine the experimental results of Figs. 2–4 forfixed number of motion-graph nodes (e.g. n ¼ 400) andfixed number of S size (e.g. k ¼ 40) we observe thefollowing:
EgridðTtotalÞ4EmultistageðTtotalÞ4EGn;pðTtotalÞ
which validates the corresponding results due to theanalysis.
In Section 3.6.6, we note that the upper bound on
communication times EðTtotalÞp 2l2ðGÞYðn
kÞ þYðkÞ is
minimized when k ¼ffiffiffiffiffiffiffiffiffi2n
l2ðGÞ
q: Although our experiments
disregard the delay imposed by S (i.e. we take YðkÞ-0)this does not significantly affect our measurements.Interestingly, the values of Figs. 2–4 imply that theanalytical upper bound is not very tight. In particular,for the Gn;p graphs we clearly observe that the average
message delay drops below 3 rounds (which is a verylow, almost negligible, delay) if S size is equal to 10;however, the above formula implies that this size shouldbe higher, especially for the medium-sized graphs.Actually, it is further observed that there exists a certainthreshold value (as also implied by the theoreticalanalysis) for the size of S above which the overallmessage delay does not further significantly improve.The threshold size for the support validated experimen-tally is smaller than the threshold implied by theanalytical bound.
3.7.3. Experimental comparison of the ‘‘Snake’’ and the
‘‘Runners’’ approach
In [12] we performed a comparative experimentalstudy of the ‘‘Snake’’ and the ‘‘runners’’ protocols. Ourtest inputs were similar to that used in Section 3.7.2. Wemeasured the total delay of messages exchanged betweenpairs of sender–receiver users. For each motion graphconstructed, we injected 1000 users (mobile hosts) atrandom positions that generated 100 transaction mes-sage exchanges of 1 packet each by randomly picking
Average Message Delay (Random Graphs)
0
2
4
6
8
10
12
0 10 15 20 25 30 35 40 45
Support size (k)
Sim
ula
tio
n T
ime
(in
ro
un
ds)
S1600 R1600 S6400 R6400
5
Fig. 2. Average message delay for the ‘‘Snake’’ approach over Gnp types of motion graphs with different sizes n and varied support size k:
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 71
different destinations. For each message generated, wecalculated the overall delay (in terms of simulationrounds) until the message was finally transmitted to thereceiver. We used these measurements to experimentallyevaluate the performance of the two different supportmanagement subprotocols.
The reported experiments for the three different testinputs we considered are illustrated in Figs. 5–7. In thesefigures, we have included the results of two instances ofthe input graph w.r.t. n; namely a smaller and a largervalue of n (similar results hold for other values of n).Each curve in the figures is characterized by the name‘Px’, where P refers to the protocol used (S for Snakeand R for runners) and x is a 3 or 4 digit numberdenoting the value of n:
The curves reported for the Snake protocol confirmthe theoretical analysis in Section 3.6.6. That is, theaverage message delay drops rather quickly when k is
small, but after some threshold value stabilizes andbecomes almost independent of the value of k: Thisobservation applies to all test inputs we considered. We
Average Message Delay (Bipartite-Multistage Graphs)
0
5
10
15
20
25
30
35
40
0 10 15 20 25 30 35 40 45
Support size (k)
Sim
ula
tio
n T
ime
(in
ro
un
ds)
S1600 R1600 S6400 R6400
5
Fig. 3. Average message delay of the ‘‘Snake’’ approach over bipartite multi-stage types of motion graphs with different sizes ðnÞ and varied support
size k:
Fig. 5. Average message delay over random graphs with different sizes
ðnÞ and varied support size k; for the Snake (S) protocol and Runners
(R) protocol.
Fig. 4. Average message delay of ‘‘Snake’’ approach over grid types of
motion graphs with different sizes ðnÞ and varied support size k:
Average Message Delay vs Support Size(Bipartite Multi-stage Graphs)
0
10
20
30
40
50
60
5
Support Size (#hosts)
Mes
sage
Del
ay (
avg)
G400
G1600
G3200
G40000
403025201510
Fig. 6. Average message delay over bipartite multi-stage motion
graphs with different sizes ðnÞ and varied support size k; for the
Snake (S) protocol and Runners (R) protocol.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7472
also note that this behavior of the Snake protocol issimilar to the one reported in Section 3.7.2, although wecount here the extra delay time imposed by thesynchronization subprotocol P3:
Regarding the ‘‘runners’’ protocol, we firstly observethat its curve follows the same pattern with that of theSnake protocol. Unfortunately, we do not have yet atheoretical analysis for the new protocol to see whetherit behaves as should expected, but from the experimentalevidence we suspect that it obeys a similar but tighterbound than that of the Snake protocol. Our secondobservation is that the performance of the ‘‘runners’’protocol is slightly better than that of the ‘‘Snake’’protocol in random graphs (except for the case ofrandom graphs with small support size; see Fig. 5) andbipartite multi-stage graphs (Fig. 6), but is substantiallybetter in the more structured case (grid graphs; cf.Fig. 7). This could be partly explained by the fact thatthe structured graphs have a smaller second eigenvaluethan that of bipartite multi-stage and random graphs. Asmall value for this eigenvalue makes the averagemessage delay to be more dependent on the hiddenconstant in the asymptotic analysis of the ‘‘Snake’’protocol (see Section 3.6.6) which is apparently largerthan the corresponding constant of the runnersprotocol.
4. Conclusions
In this paper, we have presented some noveldistributed algorithms for the problem of basic com-
munication in ad hoc mobile networks. Our approachis motivated by the fact that in such networkstopological connectivity is subject to frequent, unpre-dictable change. Our work especially focuses onnetworks with high rate of such changes to connectivity.In particular, we introduce explicit models for themotion of the hosts and for the network area. Forsuch highly changing networks we have proposedprotocols, whose key idea is to simulate the absentfixed infrastructure by co-ordinated (by the protocol)motion of a small part of the network, which actsas an intermediate pool for receiving and deliveringmessages.
We have also provided a framework for protocolclasses, depending on the way protocols affect motion,as well as the common sense of orientation and fault-tolerance properties the protocols have.
We have shown that such protocols can be designedto work correctly and efficiently even in the case ofarbitrary (but not malicious movements of the hosts notaffected by the protocol.
We have also proposed a methodology for theanalysis of the expected behavior of protocols forsuch networks, based on the assumption that mobile
hosts (those whose motion is not guided by the protocol)conduct concurrent random walks in their motionspace.
We provide rigorous proofs of the protocols correct-ness, and also give performance analyses by combina-torial tools. Finally, we have evaluated these protocolsby experimental means, by validating, comparativelystudying and fine-tuning these protocols.
Average Message Delay (2D Grid Graphs)
0
250
500
750
1000
1250
1500
1750
2000
2250
2500
0 10 15 20 25 30 35 40 45
Support size (k)
Sim
ula
tio
n T
ime
(in
ro
un
ds)
S400 R400 S1600 R1600
5
Fig. 7. Average message delay over grid motion graphs with different sizes ðnÞ and varied support size k; for the Snake (S) protocol and Runners (R)
protocol.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–74 73
References
[1] M. Adler, C. Scheideler, Efficient communication strategies for ad
hoc wireless networks, in: Proceedings of the 10th Annual
Symposium on Parallel Algorithms and Architectures—SPAA’98,
1998.
[2] M. Ajtai, J. Komlos, E. Szemeredi, Deterministic simulation in
LOGSPACE, in: Proceedings of the 19th Annual ACM Sympo-
sium on Theory of Computing—STOC’87, 1987.
[3] D. Aldous, P. Diaconis, Strong stationary times and finite random
walks, Adv. Appl. Math. 8 (1987) 69–97.
[4] D. Aldous, J. Fill, Reversible Markov chains and random walks
on graphs, Unpublished manuscript. http://stat-www.berkeley.
edu/users/aldous/book.html, 1999.
[5] N. Alon, F.R.K. Chung, Explicit construction of linear sized
tolerant networks, Discrete Math. 72 (1998) 15–19.
[6] G. Brightwell, P. Winkler, Maximum hitting times for random
walks on graphs, J. Random Struct. Algorithms 1 (1990)
263–276.
[7] J. Broch, D.B. Johnson, D.A. Maltz, The dynamic source routing
protocol for mobile ad hoc networks, IETF, Internet Draft, draft-
ietf-manet-dsr-01.txt, December, 1998.
[8] A. Boukerche, A source route discovery optimization scheme in
mobile ad hoc networks, in: Proceedings of the Eighth European
Conference on Parallel Processing—EUROPAR’02, 2002.
[9] I. Chatzigiannakis, S. Nikoletseas, P. Spirakis, Analysis and
experimental evaluation of an innovative and efficient routing
approach for ad hoc mobile networks, in: Proceedings of the
Fourth Annual Workshop on Algorithmic Engineering—
WAE’00, 2000, Lecture Notes in Computer Science, Vol. 1982,
Springer, Berlin, 2000.
[10] I. Chatzigiannakis, S. Nikoletseas, P. Spirakis, An efficient
routing protocol for hierarchical ad hoc mobile networks,
in: Proceedings of the First International Workshop on Parallel
and Distributed Computing Issues in Wireless Networks and
Mobile Computing, satellite workshop of 15th Annual Interna-
tional Parallel & Distributed Processing Symposium—IPDPS’01,
2001.
[11] I. Chatzigiannakis, S. Nikoletseas, P. Spirakis, An efficient
communication strategy for ad hoc mobile networks,
in: Proceedings of the 15th International Symposium on
Distributed Computing—DISC’01, 2001, pp. 285–299. Also Brief
Announcement in Proceedings of the 20th Annual Symposium
on Principles of Distributed Computing—PODC’01, Lecture
Notes in Computer Science, Vol. 2180, Springer, Berlin, 2001,
pp. 320–322.
[12] I. Chatzigiannakis, S. Nikoletseas, P. Spirakis, Experimental
evaluation of basic communication for ad hoc mobile networks,
in: Proceedings of the Fifth Annual Workshop on Algorithmic
Engineering—WAE’01, Lecture Notes in Computer Science, Vol.
2141, Springer, Berlin, 2001, pp. 159–171.
[13] I. Chatzigiannakis, S. Nikoletseas, Design and analysis of an
efficient communication strategy for hierarchical and highly
changing ad hoc mobile networks, in: A. Zomaya, M. Kumar
(Eds.), ACM/Baltzer Mobile Networks Journal (MONET),
Special Issue on Parallel Processing Issues in Mobile Computing,
2003, to appear.
[14] F.R.K. Chung, Spectral graph theory. Regional conference series
in mathematics, ISSN 0160-7642; no. 92, CBMS Conference on
Recent Advances in Spectral Graph Theory, California State
University at Fesno, 1994, ISBN 0-8218-0315-8.
[15] G. Dommety, R. Jain, Potential networking applications of global
positioning systems (GPS), Tech. Rep. TR-24, CS Department,
The Ohio State University, April, 1996.
[16] R. Durret, Probability: Theory and Examples, Wadsworth,
Belmont, CA, 1991.
[17] Z.J. Haas, M.R. Pearlman, The performance of a new routing
protocol for the reconfigurable wireless networks, in: Proceedings
of the IEEE International Conference on Communications—
ICC’98, 1998.
[18] Z.J. Haas, M.R. Pearlman, The zone routing protocol (ZRP) for
ad hoc networks, IETF, Internet Draft, draft-zone-routing-
protocol-02.txt, June, 1999.
[19] K.P. Hatzis, G.P. Pentaris, P.G. Spirakis, V.T. Tampakas, R.B.
Tan, Fundamental control algorithms in mobile networks, in:
Proceedings of the 11th Annual ACM Symposium on Parallel
Algorithms and Architectures—SPAA’99, 1999.
[20] G. Holland, N. Vaidya, Analysis of TCP performance over
mobile ad hoc networks, in: Proceedings of the Fifth Annual
ACM/IEEE International Conference on Mobile Computing—
MOBICOM’99, 1999.
[21] T. Imielinski, H.F. Korth, Mobile Computing, Kluwer Academic
Publishers, Dordrecht, 1996.
[22] Iowa State University GPS page. http://www.cnde.iastate.edu/
gps.html.
[23] Y. Ko, N.H. Vaidya, Location-aided routing (LAR) in mobile ad
hoc networks, in: Proceedings of the Fourth Annual ACM/IEEE
International Conference on Mobile Computing—MOBI-
COM’98, 1998.
[24] Q. Li, D. Rus, Sending messages to mobile users in disconnected
ad hoc wireless networks, in: Proceedings of the Sixth Annual
ACM/IEEE International Conference on Mobile Computing—
MOBICOM’00, 2000.
[25] N.A. Lynch, Distributed algorithms, Morgan Kaufmann Publish-
ers Inc., Los Altos, CA, 1996.
[26] N. Malpani, J.L. Welch, N. Vaidya, Leader election algorithms
for mobile ad hoc networks, in: Proceedings of the Third Annual
ACM Symposium on Discrete Algorithms and Models for
Mobility—DIALM’00, 2000.
[27] K. Mehlhorn, S. N.aher, LEDA: A Platform for Combinatorial
and Geometric Computing, Cambridge University Press, Cam-
bridge, 1999.
[28] R. Motwani, P. Raghavan, Randomized Algorithms, Cambridge
University Press, Cambridge, 1995.
[29] NAVSTAR GPS operations. http://tycho.usno.navy.mil/gpsin-
fo.html.
[30] V.D. Park, M.S. Corson, Temporally-ordered routing algorithms
(TORA) version 1 functional specification, IETF, Internet Draft,
draft-ietf-manet-tora-spec-02.txt, October, 1999.
[31] B. Parkinson, S. Gilbert, NAVSTAR: global positioning
system ten years later, in: Proceedings of the IEEE, 1983,
pp. 1177–1186.
[32] C.E. Perkins, Ad Hoc Networking, Addison–Wesley, Boston,
MA, USA, January, 2001.
[33] C.E. Perkins, E.M. Royer, Ad hoc on demand distance vector
(AODV) routing, IETF, Internet Draft, draft-ietf-manet-aodv-
04.txt, 1999.
[34] E. Szemeredi, Regular Partitions of graphs, Colloques Inter-
nationaux C.N.R.S., Nr 260, Problemes Combinatoires et Theorie
des Graphes, Orsay, 1976, pp. 399–401.
[35] J.E. Walter, J.L. Welch, N.M. Amato, Distributed reconfigura-
tion of metamorphic robot chains, in: Proceedings of the 19th
ACM Annual Symposium on Principles of Distributed Comput-
ing—PODC’00, 2000.
[36] Y. Zhang, W. Lee, Intrusion detection in wireless ad hoc
networks, in: Proceedings of the Sixth Annual ACM/IEEE
International Conference on Mobile Computing—MOBI-
COM’00, 2000.
I. Chatzigiannakis et al. / J. Parallel Distrib. Comput. 63 (2003) 58–7474