
Counting in Anonymous Dynamic Networks: An Experimental Perspective

Abstract

Counting is a fundamental problem in distributed systems because it is a basic building block for implementing high-level abstractions. In anonymous dynamic networks, where nodes have no identities and no a priori knowledge of the network, counting is especially important. This paper builds on two leader-based algorithms, ANoK and ALCO, both of which count the exact number of nodes using the notion of energy transfer. ANoK has a limitation: it guarantees that the leader eventually obtains the number of processes, but the leader cannot recognize when this happens. This paper describes and analyzes a new algorithm, A*NoK, in which the leader predicts when its current count is correct, and evaluates both algorithms experimentally. In addition, it presents a new algorithm, A*LCO, that shortens the convergence time.

Acknowledgements

After four months of hard work, this project has finally come to fruition. I encountered many problems along the way, and I overcame them with the help of many people. First of all, I give my sincerest gratitude to my supervisor, Evangelos Kranakis, who supported me throughout the project and helped me find ideas while letting me work in my own way. Without his support and encouragement, this paper would not have been finished, or even started. I could not have wished for a better supervisor or for better help than what I received.

Many teachers and students also helped me. When I ran into problems, they guided me toward the answers. When I did not know what to do, they helped me figure out the next step. They were always there with encouragement and support, which allowed me to finish this project. I give them my heartfelt thanks, and I apologize for disturbing them so often over the last four months.

I also thank the Department of Computer Science, which provided the equipment and support I needed to start and finish my project.

Finally, I give my sincere regards to my family for their full support throughout my studies at Carleton University, which let me focus on my studies without having to worry about anything else.

1 Introduction

In the 21st century, with the rapid development of computer technology, the word “network” is used more and more often in everyday life. Networks serve purposes as small as entertainment, shopping and studying, and as large as personal and international business, government and the military, and I expect that they will soon reach every part of our lives. Computer and network technology progresses with each passing day, and, notably, distributed computing systems are shifting from static to increasingly dynamic. The relatively stable, traditional static models of computation no longer suit established and fast-emerging information technologies. More and more mobile computing devices are being integrated into people’s work and life, and most of them provide communication, sensing and mobility capabilities. Even the internet has gradually shifted from the wireless internet to the wireless mobile internet. Networked sensors and mobile devices exchange data between the real world and both modern and traditional networks, including information, communication and social networks. Such hyper-connected dynamic environments give us a chance to capture what the old static distributed systems cannot. A good example is the following.

Delay-tolerant networks (DTNs), also known as disruption-tolerant networks, are highly dynamic networks with little infrastructure. DTNs address heterogeneous networks that may lack continuous connectivity: in some environments the network becomes disconnected, so an end-to-end route for a message cannot be guaranteed. Indeed, a basic characteristic of a DTN is that at any instant there may be no end-to-end route at all, and node mobility may range from completely predictable to completely unpredictable. Research on DTNs provides strong theoretical and technical support for messaging in domains such as military flights, aerospace, disaster recovery and emergency rescue.

Unfortunately, research on dynamic communication networks is still at an early stage. A lot of work remains to be done in distributed computing, where studies have mostly considered changes and failures in the topology that are slow and that eventually stabilize. Topologies that change this slowly are considered inappropriate for reasoning about dynamic networks. Othon [3] stated that “even graph theoretic techniques need to be revisited: the suitable graph model is now that of a dynamic graph in which each edge has an associated set of time labels indicating availability times. Even fundamental properties of classical graphs do not carry over to their temporal counterparts.” (Othon, Ioannis, & Paul 2012)

At present, from a social perspective, a critical issue in designing dynamic networks is security and trust. This new form of network has many special features, such as a dynamically changing topology, an open communication medium, and limited node and channel resources. These features are a double-edged sword. A hyper-connected dynamic infrastructure makes it easier for information to be monitored and intercepted. For example, as an increasing number of people use social networking applications, privacy invasion and the inappropriate transfer of personal information have become especially serious, bringing inconvenience and even danger to people’s lives. Moreover, some sectors, including the military, government and defense departments, place ever higher security requirements on the wireless mobile internet. Because of these security threats, a solution is needed as soon as possible, and this research field has become a hot topic. It is in this context that anonymous networks came into view.

In fact, networks now struggle to deal with attackers who track and monitor users online. Data confidentiality and integrity, identification, information security and entity authentication all need to be considered, rather than focusing only on selective targeting, data aggregation and profiling. To address this problem, mobile networks have to provide an anonymity service that conceals the identity of the mobile node, so anonymity plays a significant role in mobile services. This is what we call anonymity of the artifacts. Traditional encryption technology also offers some protection for data privacy, security, integrity and identity in the mobile network, but attackers can still learn the source, destination, quantity and some implications of the messages. They can even deduce confidential information, such as node identity and location, by analyzing the transfer pattern and the header information of messages, such as source and destination addresses and message length, and then use that information to launch targeted attacks. Anonymity technology, by contrast, hides the communication relationships in the traffic, so that an attacker cannot learn the relationship between the communicating parties or their identities. In addition, a dynamic anonymous network strategy in the wireless mobile internet adapts well to complicated network topologies, lowers system overhead and improves efficiency.

In this project, I explore a basic problem in dynamic anonymous networks: counting. The goal is to count the number of nodes in the network without any prior information about the nodes or the state of the network; in particular, nodes do not know the size of the network. Counting is one of the most basic and important problems in distributed computing and a central part of network management and control. Future networks are expected to be extremely dynamic: connected artifacts may become unreachable immediately after receiving a message. In current theoretical models of dynamic networks, the topology usually changes arbitrarily from round to round. The edges, which represent communication among the hyper-connected artifacts, are changed at each round by an adversary, whose modifications are constrained so that the network is always connected; alternatively, the edges are randomly distributed with certain properties. Taking these assumptions for granted, the theoretical results yield robust, scalable and terminating protocols for distributed tasks.

In this work, I drop the assumptions made in the previous theoretical models:

1) There is no prior knowledge about the network: nodes do not know the size n of the network or any other metric.

2) Nodes are initially identical: they have no unique identities and execute the same program. The only exception is the leader; if no leader is introduced, nodes that execute the same program in a symmetric network cannot count.

3) The network does not have to be connected at all times.

I believe this mode of operation is well suited to future hyper-connected environments, and it also takes privacy into account. Under these conditions, I present a distributed algorithm called A*NoK, which uses a termination heuristic so that it does not need to know the size of the anonymous network in advance. The algorithm is based on the no-knowledge algorithm ANoK introduced in [2]. It uses an energy-transfer technique to find the exact number of nodes in the anonymous network.

I use an experimental approach to examine the performance of the two algorithms A*NoK and ANoK. I also compare A*NoK with the algorithm ALCO, and I modify ALCO so that it works under a more general mode of operation; the result is called A*LCO. I use different random evolving graph models to test the algorithms’ error rate and how efficiently they terminate the computation. I also use periodically disconnected networks that model the duty cycle of the artifacts.

In densely connected anonymous networks, A*NoK always terminates correctly. In networks that experience regular partitions, the accuracy with which A*NoK predicts the size depends on the degree of disconnection of the network: the counting accuracy decreases as the periods of disconnection get longer. I conclude that A*NoK can decide the predicate “does the network contain more than N nodes?” within a number of rounds that is smaller than the number needed by ANoK as reported in [2].

2 Preliminaries

Energy-Transfer Technique. This technique uses a simple energy transfer to count the number of nodes in a network whose topology changes constantly and whose nodes have no identities. Each node starts with a fixed energy charge and, during each round, discharges itself by sending energy to its neighbors. A leader acts as a sink: it collects energy and never gives energy to its neighbors. The technique ensures that the total amount of energy over all nodes in the network does not change; energy is neither created nor destroyed. Eventually, all of the energy is transferred to and stored in the leader, which can then use the sum of the energy to calculate the size of the network. This method is very simple to implement and needs very little information about the network. All the algorithms in this paper use the energy-transfer technique. They either need some knowledge of the network, such as an upper bound on the node degree, to let the computation terminate, or make no additional assumptions, in which case they obtain the exact number of nodes in the network but cannot know when to terminate.

The lack of knowledge about the network makes the no-knowledge approach widely applicable, but it also means that it cannot be used as is in practice: because the leader is not able to verify any terminating condition, it cannot output an answer to the counting problem. I therefore present an algorithm A*NoK, based on ANoK, that uses a heuristic to define a terminating condition and to produce an accurate count.

The Counting Problem. The counting problem is defined by the following properties. Let y be a variable that represents the guess of the size of the network held at process v.

i) Convergence: there exists a round r after which at least one process p permanently holds a correct guess of the size of the network, i.e. y = |V|.

ii) Consciousness: if a node v holds a correct guess y of |V| from round r, then there exists a round r’, with r’ ≥ r, at which v becomes aware that its guess is correct.

System Model. A network is dynamic when its topology changes over time because of failures of nodes and communication links. The computation is governed by a global clock that is accessible to all nodes and proceeds in discrete synchronous rounds. Let r be the local variable holding the current round number; every node has access to the current round number through this variable. A dynamic network is represented by a dynamic graph G(r) = (V, E(r)), where V is the set of nodes. V is assumed to be static throughout this work, that is, it remains the same during the execution. Let E’ = {{u, v} : u, v ∈ V}; then E : ℕ → P(E’) is a function that maps a round number r ∈ ℕ to a set E(r) of bidirectional links drawn from E’. The edges of the dynamic graph G are chosen by a worst-case adversary, each edge set being a subset of E’, so that G is an infinite sequence of instantaneous graphs G(1), G(2), G(3), …

Importantly, the nodes in V have no identifiers: they are anonymous. I write l_v(r) for the local view of a node v at round r, that is, the local variables kept by the neighbors of v at round r.

Nodes communicate through anonymous broadcast. Formally, at each round r, each node u produces a message m_u(r) and sends it to its neighbors N_u(r) = {v | {u, v} ∈ E(r)}.

Dynamic graph models. The dynamicity of the topology is captured by the following graph models:

1. G(n, p) graph: At the beginning of each round r, the edge set is emptied; then, for each pair of processes u, v ∈ V, the edge uv is created with a given probability p. In the G(n, p) model there is a connectivity threshold t, which depends on the number of nodes n: when p is above the threshold, the graph G(n, p) is connected with very high probability.

2. Edge-Markovian (EM) graph: At each round r, the edges are modified according to the following two rules:

(a) Each edge uv ∈ E(r - 1) is removed from E(r) with probability p_d, called the death probability.

(b) Each edge uv ∉ E(r - 1) is created and inserted in E(r) with probability p_b, called the birth probability.

Intuitively, the death and birth probabilities control the connectivity of the graph at each round.

3. Random Connected graph: To construct a graph that is as sparse as possible while still connected, at each round r I pick a random pair of nodes u, v ∈ V and add the edge uv to the graph, and I repeat this procedure until the graph is connected.

4. Duty-cycle based graph: The dynamic graph is based on a fixed, connected topology given at round r0. During the duty-cycling phase, a node that is awake at round ri can exchange messages with any neighboring node that is also awake. When a node is in sleep mode at round rj, all its adjacent edges are removed from the graph. Because of the duty cycle, not all edges are present at each round, which creates the dynamicity of the graph. This model captures how resource-constrained devices work. Note that in this model the graph does not need to be connected at each round.
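The four models above can be sketched as per-round edge generators. This is a minimal illustration, not the generator used in the experiments: the function names, the representation of an edge as a sorted pair `(u, v)` with nodes numbered `0..n-1`, and the `awake` set passed to the duty-cycle function are all assumptions made here. The random connected model is implemented literally as described (random edges are added until the graph is connected), which does not guarantee a minimum number of edges.

```python
import random
from itertools import combinations


def gnp_round(n, p, rng):
    """G(n, p): start from an empty edge set and add each pair with probability p."""
    return {pair for pair in combinations(range(n), 2) if rng.random() < p}


def edge_markovian_round(n, prev_edges, p_birth, p_death, rng):
    """Edge-Markovian: present edges die with p_death, absent edges are born with p_birth."""
    edges = set()
    for pair in combinations(range(n), 2):
        if pair in prev_edges:
            if rng.random() >= p_death:   # the edge survives this round
                edges.add(pair)
        elif rng.random() < p_birth:      # the edge is born this round
            edges.add(pair)
    return edges


def is_connected(n, edges):
    """Depth-first search from node 0 over an undirected edge set."""
    adjacency = {v: set() for v in range(n)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def random_connected_round(n, rng):
    """Random connected: add random edges one at a time until the graph is connected."""
    edges = set()
    while not is_connected(n, edges):
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return edges


def duty_cycle_round(base_edges, awake):
    """Duty cycle: keep only the base edges whose two endpoints are both awake."""
    return {(u, v) for (u, v) in base_edges if u in awake and v in awake}
```

A dynamic graph is then a sequence of calls with a shared `random.Random(seed)` instance, for example feeding the output of one `edge_markovian_round` call as `prev_edges` of the next.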

Related Work.

Static anonymous networks. The question of which problems can be solved by a distributed system when all processors start from the same state and run the same algorithm has a long history, with roots dating back to the seminal work of Angluin [4], who studied the problem of establishing a “center”. She was the first to identify the connection with the theory of graph coverings. Further characterizations of the problems that are solvable under certain topological constraints followed, in particular in the work of Yamashita and Kameda [25]. There has been much other notable research. For example, work on unknown networks has addressed robot exploration and map drawing of an unknown graph, as well as information dissemination. Sakamoto [26] studied the “usefulness” of initial conditions for distributed algorithms on anonymous networks, such as having a leader or knowing n, by giving transformation algorithms from one initial condition to another.

Fraigniaud et al. [27] assumed a unique leader in order to break symmetry and assign short labels as fast as possible. More recently, Chalopin et al. [21] studied the problem of naming anonymous networks in the context of snapshot computation.

Finally, Aspnes et al. [29] studied the relative powers of reliable anonymous distributed systems with different communication mechanisms: anonymous broadcast, read-write registers, or read-write registers plus additional shared-memory objects.

Dynamic distributed systems where all processes have distinct identifiers. O’Dell et al. [30] were the first to study distributed systems with worst-case dynamicity, introducing the 1-interval connectivity model. They studied flooding and routing problems in asynchronous communication, allowing nodes to detect changes in their local neighborhood. Under the same model, the counting problem was studied in networks whose nodes have unique identifiers, yielding an algorithm that requires O(n²) rounds using O(log n) bits per message. Michail et al. [31] then studied anonymous counting in the worst-case dynamicity model and proposed an algorithm that, given an upper bound on the maximum degree of the graphs produced by the adversary, lets each node compute an upper bound on the size of the network. I use this algorithm as a fundamental building block of the counting algorithms in this work. Michail et al. [32] replaced the 1-interval connectivity assumption with less restrictive temporal connectivity conditions, which only require that another causal influence occurs within every time window of a given length. To capture the speed of information propagation in a dynamic network, they proposed several novel metrics and provided terminating algorithms for fast information spreading under continuous disconnectivity.

To the best of my knowledge, this is the first experimental study of distributed counting algorithms in anonymous dynamic networks that may be disconnected. The results of this project provide strong evidence that efficient computation is possible in this kind of future network.

Adversarial vs Random dynamic graph. The gossiping model [34, 35] can be seen as a dynamic graph in which, at each round, each node randomly selects some neighbors with which to exchange views. The adversarial model considered in this project differs from such a random dynamic graph: here the adversary can read the state of the nodes and compute the best set of edges to add or remove in order to break the correctness of the counting, whereas the gossip adversary just selects nodes at random, without any strategy.

3 Counting Algorithms for Anonymous Dynamic Networks

3.1 The No-Knowledge Algorithm ANoK

First, I introduce the No-Knowledge algorithm ANoK, which works as follows.

At the initial round r0, each non-leader node v has energy e_v = 1. In each round, v transfers half of its current energy to its neighbors. The problem is that v has no prior knowledge of the network: before it receives the messages of round r, it can only guess the number of its neighbors in r. Therefore, v assumes that it has d neighbors and broadcasts the energy quantity e_v / (2d). It then collects the messages sent by its neighbors at the beginning of the round and stores them in a local variable S_msg. Finally, at the end of the round, v updates its energy to

$$e_v \leftarrow \frac{e_v}{2} + \left(d - |S_{msg}|\right)\frac{e_v}{2d} + \sum_{m \in S_{msg}} m$$

so that the total quantity of energy over the whole network does not change.

Note that when the real number of neighbors is at most the estimate, that is, |N_v(r)| ≤ d, the global energy over all the processes stays constant, because v compensates at the end of the round according to the number of messages it actually received. When the real number of neighbors exceeds the estimate, that is, |N_v(r)| > d, v releases a local surplus of energy. For example, suppose that v has energy e_v, that its estimated number of neighbors is d = 2, and that its real number of neighbors is |N_v(r)| = 8. If v sends ¼ e_v to every neighbor, the total amount of energy transferred is twice the energy stored in v, since 8 * ¼ e_v = 2 e_v while v had only e_v. Because v adjusts its residual energy according to the number of messages received, its residual energy becomes negative, but the global energy is still conserved.
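The arithmetic of this example can be written out explicitly. The sketch below assumes the update rule e_v ← e_v/2 + (d - |S_msg|) * e_v/(2d) + Σ m with d = 2 and eight received messages, and, for clarity, leaves out the energy that v receives from its neighbors during the same round.

```latex
\begin{align*}
\text{sent by } v &= 8 \cdot \frac{e_v}{2 \cdot 2} = 2\,e_v \\
\text{kept by } v &= \frac{e_v}{2} + (2 - 8)\,\frac{e_v}{2 \cdot 2}
                   = \frac{e_v}{2} - \frac{3\,e_v}{2} = -e_v \\
\text{sent} + \text{kept} &= 2\,e_v - e_v = e_v
\end{align*}
```

The residual energy of v is negative, yet the sum of what v sent and what v kept equals the energy it started with, so the global energy is unchanged.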

The local surpluses of energy (positive or negative) created by the adversary can give the leader a temporary energy value e that is greater than |V| or negative. In addition, at each round the adversary could change the degree of the nodes so as to prevent the leader from converging. To deal with these issues, each process stores locally the highest number of neighbors it has ever seen and uses this number as its estimated degree d. With this rule, the surplus of local energy (positive or negative) that the adversary can create is upper bounded by a function f(|V|): energy conservation must be maintained, each local surplus is finite, and each node can increase d at most |V| - 1 times, so the worst-case adversary cannot create local surpluses indefinitely. Hence the leader eventually converges to the value |V|, since the convergence can be delayed only a finite number of times. Intuitively, the adversary cannot delay convergence for long: once the energy stored in V \ {v_l} falls below a certain value, the local surplus it could create, even in the worst case, is not enough to change the leader’s count. Therefore, if the leader outputs the count [e_vl] at each round r, it can be proved that there exists a round r* after which the leader always computes the correct value, whatever the moves of the adversary [2].
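The behaviour described above can be explored with a small simulation. This is a sketch under stated assumptions, not the implementation evaluated in this work: it assumes the update rule e_v ← e_v/2 + (d - |S_msg|) * e_v/(2d) + Σ m, that the leader broadcasts a zero-energy message so that its neighbors still count it, that the leader also starts with one unit of energy (so the total is |V|), and that every degree estimate starts at 1. The names `anok_round` and `run_anok` are illustrative.

```python
import random
from itertools import combinations


def anok_round(energy, d_est, edges, leader=0):
    """Run one synchronous ANoK round and return the new energy list.

    energy[v] is the energy of node v, d_est[v] its degree estimate (updated
    in place), and edges is a set of (u, v) pairs for the graph of this round.
    """
    n = len(energy)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Every non-leader broadcasts e_v / (2 d). The leader is a sink and sends 0
    # (assumption: a zero-energy message, so that neighbors still count it).
    sent = [0.0 if v == leader else energy[v] / (2 * d_est[v]) for v in range(n)]

    new_energy = []
    for v in range(n):
        received = [sent[u] for u in neighbors[v]]
        if v == leader:
            new_energy.append(energy[v] + sum(received))
        else:
            d = d_est[v]
            kept = energy[v] / 2 + (d - len(received)) * energy[v] / (2 * d)
            new_energy.append(kept + sum(received))
            d_est[v] = max(d, len(received))  # highest number of neighbors seen so far
    return new_energy


def run_anok(n, rounds, next_edges, leader=0):
    """Simulate ANoK; next_edges(r) returns the edge set of round r."""
    energy = [1.0] * n   # assumption: the leader also starts with one unit
    d_est = [1] * n      # assumption: initial degree estimate of 1
    for r in range(rounds):
        energy = anok_round(energy, d_est, next_edges(r), leader)
    return energy


if __name__ == "__main__":
    rng = random.Random(7)
    n, p = 10, 0.4

    def gnp(_r):
        # A fresh G(n, p) graph at every round.
        return {pair for pair in combinations(range(n), 2) if rng.random() < p}

    final = run_anok(n, 300, gnp)
    print("leader energy:", final[0], "total energy:", sum(final))
```

The total energy stays equal to n at every round, up to floating-point rounding, while the leader's energy approaches n; the simulation gives the leader no way of knowing when that has happened.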

However, the fact that the leader outputs the same count over several consecutive rounds is not enough to define a terminating condition, because such a number can always be influenced by the adversary. Thus, the leader cannot detect convergence. For instance, suppose that the leader stops at the first round r in which its increment of energy is below a threshold t. It is always possible to build a network of size t + 1 in which each node has residual energy t. Each increment of the leader’s energy is then below the termination threshold while the residual energy is greater than 1, so when the leader terminates it misses one node.

Proofs. Next, I show that ANoK converges to the exact count in a finite number of rounds. First, I prove that the quantity of negative energy created by the dynamic adversary is bounded by a function of the network size |V|. Then, I prove that the leader obtains the correct count in a finite, but unknown, number of rounds, which establishes the convergence property.

Lemma 1. Let G(r) = (V, E(r)) be a dynamic anonymous graph. In the ANoK algorithm, the amount of negative energy that can be created is finite. Moreover, a single node v_i ∈ V with energy e can generate only a bounded amount of negative energy in any round r.

Proof. Consider a generic node v_i ∈ V that creates negative energy during the execution of the algorithm, and let E_n = {r(i,1), r(i,2), …, r(i,t)} be the set of rounds in which v_i generates negative energy. The aim is to show that t is finite. For a generic round r(i,j) ∈ E_n we must have ½ e + (D_max - |rcv_vi|) * e / (2 D_max) < 0, which holds exactly when |rcv_vi| > 2 D_max. Hence D_max more than doubles at the end of round r(i,j). Because |rcv_vi| ≤ |V| and D_max ≥ 1, the condition |rcv_vi| > 2 D_max can occur at most log(|V|) times, so t ≤ O(log(|V|)). Also, the negative energy generated by v_i in a single round is at most (|rcv_vi| - 2 D_max) * e, which is maximized when |rcv_vi| = |V| and D_max = 1.
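The condition for negative energy and the logarithmic bound can be derived step by step. This is a sketch that assumes the update rule e ← e/2 + (D_max - |rcv|) * e/(2 D_max) + (received energy) with e > 0, and that D_max is replaced by |rcv| whenever |rcv| is larger. The notation D^(j), for the value of D_max right after the j-th round in which v_i generates negative energy, is introduced here only for illustration.

```latex
\begin{align*}
\frac{e}{2} + \left(D_{max} - |rcv_{v_i}|\right)\frac{e}{2 D_{max}}
  &= \frac{e\,\left(2 D_{max} - |rcv_{v_i}|\right)}{2 D_{max}} < 0
  \iff |rcv_{v_i}| > 2 D_{max} \\
D^{(j)} = |rcv_{v_i}| &> 2\,D^{(j-1)}
  \;\Rightarrow\; D^{(t)} > 2^{t} D^{(0)} \geq 2^{t} \\
2^{t} < D^{(t)} &\leq |V| \;\Rightarrow\; t < \log_2 |V|
\end{align*}
```

The first line shows that the energy kept by v_i, before adding what it receives, is negative exactly when more than 2 D_max messages arrive; the remaining lines show that this can happen fewer than log2 |V| times, because each occurrence more than doubles D_max and D_max can never exceed |V|.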

Lemma 2. Let G(r) = (V, E(r)) be a dynamic anonymous graph. In any execution of ANoK, for any ε ∈ ℝ+ there is a round r_{|V|,ε} after which the amount of negative energy that can be transferred to the leader is less than ε.

Proof. By the energy invariant, the negative energy that a node can generate depends on the energy that the node already possesses. Moreover, as long as no negative energy is generated, the energy received by the leader is monotonically increasing, so the absolute value of the energy in V \ {v_l} is a monotonically decreasing function of r. The maximum amount of negative energy that can be created from round r on is therefore bounded, and by the previous consideration it is a monotonic function f of the energy remaining in V \ {v_l}. Because f is monotonic, there exists ε1 such that f(ε1) ≤ ε, which proves the claim.

Lemma 3. Let G(r) = (V, E(r)) be a dynamic anonymous graph in an execution of ANoK. Then the energy stored at the leader converges to |V|. Therefore, ANoK is a counting algorithm that respects the convergence property.

Proof. Lemma 2 states that there is a round r_{|V|,ε} after which the maximum amount of negative energy that can be created in the network is bounded by a quantity ε that can be made arbitrarily small. By energy conservation, the total amount of positive and negative energy in V \ {v_l} is bounded at each round by a monotonically decreasing function of the round number. Therefore, there exists a round r after which the energy remaining in V \ {v_l} is too small to change the leader’s count, and from then on the variable count_vl is equal to |V|. This is why ANoK respects the convergence property.

In fact, the previous lemma implies that there is a round r from which the leader’s count is correct, but no node has the ability to detect this condition. Therefore, ANoK respects only the convergence property, not the consciousness property. In practice, such unconscious algorithms can be used by applications whose safety does not depend on a correct count and whose liveness and fairness need only an eventually correct count.

Energy vs Mass Conservation. Kempe et al. [37] use a global invariant, the conservation of mass, that is similar to the idea of energy transfer and energy conservation, together with a push-only mechanism, to implement a gossip-based aggregation protocol. Despite the similarity between the two concepts, there are important differences. Their underlying graph is generated by a simple function, whereas the model used here is controlled by a worst-case adversary. In addition, in their model a node knows in advance the number of neighbors for the message exchange, which is not the case in this paper. As a result, they obtain probabilistic bounds on the convergence time, and each node of the system converges to the average of the input values; in the algorithm used here, the leader keeps absorbing energy and converges to the correct count, while the other nodes converge to zero.

3.2 The No-Knowledge Algorithm with Termination Heuristic A*NoK

Here, I show how to add a heuristic to the basic ANoK in order to obtain a new algorithm, A*NoK, for anonymous networks with no prior knowledge that does have a termination condition. The idea of the heuristic is to let the leader decide when the current count becomes the final one. The heuristic assumes that the dynamicity of the graph is governed by a random process, that is, links change according to a uniform probability distribution, and that the leader is able to observe the flow of energy it receives.

At each round r, the leader v_l receives some energy from its neighbors. The flow of energy to the leader at round r can be written as

$$\Phi(r) = \sum_{v \in N_{v_l}(r)} \frac{e_v(r)}{2\, d^{max}_v(r)}$$

where e_v(r) is the energy of node v at round r and d^max_v(r) is the maximum number of neighbors that v has seen so far. After a certain number of rounds, the flow observed by the leader is

$$\Phi(r) \approx |N_{v_l}(r)| \, \frac{\bar{e}(r)}{2\, \bar{d}^{max}(r)}$$

where $\bar{d}^{max}(r)$ is the average of the maximum degrees seen by the nodes of G up to round r, and $\bar{e}(r)$ is the average energy held by the non-leader nodes at round r.

Note that, without the leader, the energy in the network would always stay balanced among the nodes; the leader is the only node that absorbs energy. Thus, the neighbors of the leader have less energy than the other nodes, since they transfer some of their energy to the leader without receiving anything from it. Because of the probabilistic nature of the edge creation process and the way ANoK works, all the non-leader nodes tend to hold a similar quantity of energy, as they balance the energy surplus among themselves. Therefore, the leader obtains the estimate:

Because of the probabilistic nature of the edge creation process, the leader sees the same maximum number of neighbors as the other nodes. Hence, we have:

and, substituting, we have:

Based on this we get:

Here, the result is the estimate of the number of processes in the network made by the leader at round r, and p(r) is:

At round r the leader also holds the count it has made so far, and Δ(r) denotes the difference between the network size estimated from the energy flow and the energy currently stored at the leader.

Finally, we obtain a termination condition as follows: as long as the flow-based estimate remains constant over the last k rounds, the leader computes its average, and if after k consecutive rounds both this quantity and the energy-based count are equal to k, the counting procedure terminates and the leader outputs k.
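The flow-based estimate described in this section can be illustrated with a small simulation. It is only a sketch under stated assumptions, not the simulator used for the experiments: non-leader nodes start with one unit of energy and the leader with none, every non-leader node sends ev/(2·dmaxv) to each current neighbor (with dmaxv updated before sending), the graph is redrawn as G(n, p) at every round, and the leader inverts the averaged flow as e_leader/(1 − rho). The function name `simulate_flow`, the variable `rho` and the exact form of the estimate are assumptions introduced here for illustration.

```python
import random

def simulate_flow(n, p, rounds, seed=0):
    """Return (history, energy); node 0 is the leader.

    history holds one (leader_energy, flow_estimate) pair per round, where the
    estimate is the assumed number of non-leader nodes (or None if undefined).
    """
    rng = random.Random(seed)
    energy = [0.0] + [1.0] * (n - 1)   # assumption: leader starts with no energy
    dmax = [1] * n                     # largest degree each node has seen so far
    history = []
    for _ in range(rounds):
        # Draw a fresh G(n, p) topology for this round.
        nbrs = [[] for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                if rng.random() < p:
                    nbrs[u].append(v)
                    nbrs[v].append(u)
        for v in range(n):
            dmax[v] = max(dmax[v], len(nbrs[v]))
        # Assumed transfer rule: each non-leader sends e_v / (2 * dmax_v) per neighbor.
        inbox = [0.0] * n
        sent = [0.0] * n
        for v in range(1, n):
            share = energy[v] / (2 * dmax[v])
            for w in nbrs[v]:
                inbox[w] += share
                sent[v] += share
        flow = inbox[0]                # energy absorbed by the leader this round
        for v in range(n):
            energy[v] += inbox[v] - sent[v]
        # Assumed inversion: flow ~ |N| * avg_energy / (2 * dmax) and
        # avg_energy ~ 1 - e_leader / m, hence m ~ e_leader / (1 - rho).
        estimate = None
        if nbrs[0]:
            rho = 2 * dmax[0] * flow / len(nbrs[0])
            if rho < 1:
                estimate = energy[0] / (1 - rho)
        history.append((energy[0], estimate))
    return history, energy
```

Because the leader never sends energy and every other node gives away at most half of what it holds, the total energy is conserved and the leader's energy can only grow, which is the property the energy-based count relies on.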

3.3 The Local Counting Oracle-based Algorithm ALCO

To deal with the problem of the terminating condition, Di Luna et al. [4] introduced the concept of a local counting oracle (LCO), which reports the current number of neighbors at each round r. Based on such an oracle they built a counting algorithm, ALCO, that counts the exact number of processes in a finite number of rounds. In this work I introduce a new algorithm, A*LCO, obtained by a simple modification of ALCO that makes it work in a practical context. I then compare the performance of A*LCO and A*NoK.

The main idea of ALCO is to give every node in the graph a color and then to count how many processes share each color. The computation proceeds in synchronous rounds. At the initial round r0 the leader colors itself with color c0, while every other node has no color, which is represented by ⊥. After that, at each round r, a node takes the new color cr when it satisfies two conditions: its color is ⊥, and it has at least one neighbor whose color is not ⊥. Moreover, at any round r, each colored non-leader process broadcasts a unit of energy tagged with its color cr, the current round r, and the multi-set containing the information it has about its neighbors. When the counting algorithm starts at round r0, the leader knows exactly how many processes have color c0, since they are its neighbors: the leader colors them, and their number is given by the LCO. In the later rounds the leader initializes a local variable that acts as a container for the energy sent by the colored nodes. Energy is transferred to the leader with the same mechanism as in ANoK, and access to the LCO also ensures that negative energy cannot be created.
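The coloring rule can be sketched in a few lines. The code below is an illustration under assumptions, not the implementation from [4]: rounds are numbered from 0, a color ci is represented by the integer i, ⊥ by `None`, and the topology of each round is given as an edge list. The function name `color_nodes` is hypothetical.

```python
def color_nodes(n, leader, rounds):
    """rounds: one edge list per synchronous round, starting at r0.

    Returns color[v] = i for color c_i, or None for the uncolored state.
    """
    color = [None] * n
    color[leader] = 0                      # the leader takes c0 at round r0
    for r, edges in enumerate(rounds):
        newly_colored = set()
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                # a is uncolored and sees a neighbor that was already colored
                # at the start of this round.
                if color[a] is None and color[b] is not None:
                    newly_colored.add(a)
        for v in newly_colored:            # apply the new colors synchronously
            color[v] = r
    return color
```

On a static chain 0-1-2-3 with node 0 as leader, three rounds give the colors c0, c0, c1, c2, so the color of a node records the round at which the coloring wave reached it.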

The leader begins by collecting energy from the nodes with color c0 until all of it has been collected. Once it has collected all the energy from the nodes with color c0, it can compute a bound B1 on the set of nodes colored with c1, using the multi-sets of local views gathered from the nodes colored c0. This bound is obtained by multiplying the number of ⊥ entries in each multi-set by the multiplicity of that multi-set, where the multiplicity is obtained from the energy. The leader then uses the bound B1 to decide when it has collected enough energy to obtain the correct count C1 of processes with color c1.

At round 2, each node with color c1 creates a unit of energy and transfers it to the leader; this energy is tagged with the local view lw(1), the color c1 and the round 1.

By collecting this energy, the leader can compute, for each node w, the multiplicity of its neighbors with id 0. Based on this information, when the energy collected from the nodes with color c1 equals the adjusted bound B1, the leader can lower B1 until it obtains the correct count C1 of the nodes with id 1. Using the sum C≤1 = C1 + C0, the leader can then compute the multi-set of neighbors of the nodes with ids {0, 1}. Finally, if this multi-set is empty the leader terminates; otherwise it applies the same procedure to count the nodes with color c2, and so on.

3.4 The Local Counting Oracle-based Algorithm with Symmetry Breaking A*LCO

Di Luna et al. [4] showed that ALCO computes an exact count in finite time, but it needs many rounds to do so. The reason is that the leader cannot tell that two nodes share the same color until it has collected more than one unit of energy. For example, suppose that at round r there are y nodes with the same color and the same multi-set lv of neighbors. The leader must collect at least y − 1 + ε energy to obtain the correct multiplicity y of lv, which takes a long time. In practice, however, such nodes may be distinguishable if we consider the history of their local views, that is, the union of all the multi-sets they have seen from round r0 up to the current round.

Building on this observation, I introduce a new algorithm, A*LCO, that breaks some of the symmetry in ALCO. Symmetry breaking is obtained through an additional parameter that takes the whole history of local views into account and is used to disambiguate processes having the same color. Specifically, each non-leader process with color ci ≠ ⊥ computes a round id ridr at each round r.

Therefore, two nodes u and v that have different multi-sets of neighbors at some round r' will have different ridr for every round r > r'. The value ridr is attached, together with the other information, to the energy created at round r.
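One way to picture the round id is as a fingerprint of the whole history of local views. The sketch below is an assumption about its form, since the defining formula is not available here: it keeps the tuple of all multi-sets of neighbor colors a node has seen so far, so two nodes whose views differ at some round keep distinct ids from then on. The names `local_view` and `round_ids` are hypothetical.

```python
from collections import Counter

def local_view(neighbor_colors):
    """Canonical, hashable form of a multi-set of neighbor colors."""
    # Sorting by repr gives a stable order even when None (uncolored) is mixed with ints.
    return tuple(sorted(Counter(neighbor_colors).items(), key=repr))

def round_ids(views_per_round):
    """views_per_round[r][v] lists the neighbor colors node v sees at round r.

    Returns rids with rids[r][v] = history of local views of v up to round r.
    """
    n = len(views_per_round[0])
    history = [() for _ in range(n)]
    rids = []
    for views in views_per_round:
        history = [history[v] + (local_view(views[v]),) for v in range(n)]
        rids.append(list(history))
    return rids
```

With ids of this kind the leader can count the distinct ridr values directly and only falls back on energy for nodes whose histories are still identical.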

In this modified algorithm, when the dynamic topology has not broken the symmetry, in other words when two nodes have the same neighborhood at every round, the count still relies on energy. When the symmetry has been broken, however, the count is fast. Assuming that at round r all nodes with cid ≠ ⊥ have different ridr, the leader can collect information from all the necessary nodes in at most |V| rounds. In the example above, if ½ y nodes have different ridr, the leader has to wait until it has collected a smaller amount of energy, which is faster.

4 Performance Evaluation

Tool. For the experiments I used a Java simulator built on the JUNG library [36], which provides the graph data structure. Each node of the graph represents a process v and exposes an interface consisting of two methods:

1) one that sends a message for round r;

2) one that delivers a message for round r.

In addition, each node has a queue qv that stores the received messages. The simulation uses a set of threads. In the send phase, each thread Tj takes a node v still to be examined in this round from a list lm containing all the nodes, removes it from the list, and executes its send-message method; Tj then adds the message produced by v to the queues of the nodes in Nv(r). When lm is empty, a different set of threads is activated to deliver the messages: each thread Tj takes a node v from a list ld and handles the delivery of all the messages that v received and stored in qv during the current round. Once all the queued messages have been delivered to all the processes, the round terminates and the topology may change according to the dynamicity model. Finally, a new round can start.
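The two-phase structure of a round can be sketched as follows. This is a single-threaded illustration in Python rather than the multi-threaded Java simulator described above; the class `Node`, the function `run_round` and the message format are assumptions made for the example.

```python
from collections import deque

class Node:
    def __init__(self, ident):
        self.ident = ident        # used by the simulator only; the algorithm stays anonymous
        self.queue = deque()      # q_v: messages received during the current round
        self.delivered = []

    def send(self, r):
        """Produce the message broadcast at round r (placeholder payload)."""
        return (self.ident, r)

    def deliver(self, r, message):
        """Consume one message received at round r."""
        self.delivered.append(message)

def run_round(nodes, neighbors, r):
    """neighbors[v] lists the indices of the neighbors of node v at round r."""
    # Send phase: every node's message is queued at each of its neighbors.
    for v, node in enumerate(nodes):
        message = node.send(r)
        for w in neighbors[v]:
            nodes[w].queue.append(message)
    # Deliver phase: starts only after all sends, which keeps the round synchronous.
    for node in nodes:
        while node.queue:
            node.deliver(r, node.queue.popleft())
```

Separating the two phases is what guarantees that a message sent at round r is never processed before every node has sent its own message for that round.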

Metrics and parameters. I considered three key performance metrics:

i) Convergence Time Distribution: the convergence time is defined as the first round at which the algorithm outputs the correct value. In the next sections I study the probability distribution of the convergence time in order to characterize the average latency before the algorithm obtains the correct count.

ii) Flow-Based Gain Δ: the difference, measured at the leader, between the size estimated through the flow and the size estimated through the energy stored at the leader, written Δ(r) at round r.

iii) Error frequency ρ: the probability of incorrect termination under the heuristic-based termination condition presented in the previous section.

I used the following parameters to evaluate these metrics:

i) Dynamicity model: different types of dynamic graphs are used to evaluate how the model influences each metric.

ii) Edge creation probability p: within a given model, this probability controls the dynamicity of the graph.

I evaluated the performance of the algorithms on networks of {10, 100, 1000} nodes. In the following sections, most of the results come from 1000 independent runs.

4.1 Evaluation of ANoK

For ANoK, I ran the evaluation on G(n, p), edge-Markovian and duty-cycle-based graphs. Let us first focus on G(n, p) graphs, in which the connectivity threshold t is a function of the number of nodes in the graph. I evaluated the algorithm for several values of the probability p. When p is greater than 2t, only connected graph instances are considered; when p is smaller than 2t, disconnected instances are allowed.
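A G(n, p) instance and the connectivity check used to accept or reject it can be sketched as below. The threshold formula is an assumption: the sketch uses ln(n)/n, the classical connectivity threshold of G(n, p), which may differ from the exact expression used in the experiments. The function names are hypothetical.

```python
import math
import random

def connectivity_threshold(n):
    """Assumed threshold t = ln(n) / n (classical G(n, p) connectivity threshold)."""
    return math.log(n) / n

def gnp(n, p, rng):
    """Draw the edge list of one G(n, p) instance."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

def is_connected(n, edges):
    """Depth-first search from node 0; assumes n >= 1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n
```

For runs restricted to connected instances, one would redraw `gnp(n, p, rng)` until `is_connected` returns True; for the other runs every draw is accepted.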

Figure 1 presents the convergence time distribution of ANoK on G(n, p) graphs. The convergence time becomes worse as more disconnected instances are considered. Note that the algorithm still converges to the correct count even in the presence of disconnected instances, and that the convergence time grows in inverse proportion to p. Moreover, the disconnected instances increase the variance of the distribution.

Figure 1: ANoK convergence time distribution on G(n, p) graphs

Next, for the edge-Markovian graphs, the probability of creating an edge is the same as in the G(n, p) graphs, pb = f(t), and the probability of deleting an edge is set to pd = 0.25.
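One step of an edge-Markovian process can be sketched as follows: an edge that is absent appears with probability pb, and an edge that is present disappears with probability pd. The function name `edge_markovian_step` and the representation of the topology as a set of ordered pairs are assumptions made for the example.

```python
import random

def edge_markovian_step(n, edges, pb, pd, rng):
    """edges: set of pairs (u, v) with u < v present in the previous round."""
    next_edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            if (u, v) in edges:
                if rng.random() >= pd:     # an existing edge survives with probability 1 - pd
                    next_edges.add((u, v))
            elif rng.random() < pb:        # an absent edge is created with probability pb
                next_edges.add((u, v))
    return next_edges
```

Because an existing edge survives a round with probability 1 − pd, edges persist across rounds, which is the property that distinguishes this model from redrawing a fresh G(n, p) graph every round.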

Figure 2 shows the ANoK convergence time distribution on edge-Markovian graphs, which can be compared directly with the G(n, p) results shown above. The persistence of edges across rounds mitigates the effect of low edge creation probabilities, so the convergence time is lower than on G(n, p) graphs.

Figure 2: ANoK convergence time distribution on edge-Markovian graphs

4.2 Evaluation of A*NoK

Here I evaluate A*NoK on G(n, p) graphs and edge-Markovian graphs. Figure 3 presents several measures related to the correctness of the heuristic. Besides the error frequency ρ, I also measured the maximum error and the average error, that is, the number of nodes missed with respect to the real number of nodes in the graph. I omit the probabilities for which every run terminates correctly, namely p ≥ ½ t on G(n, p) graphs and pb ≥ ¼ t on edge-Markovian graphs. For disconnected topologies, p ≤ ¼ t on G(n, p) graphs or pb ≤ 1/8 t on edge-Markovian graphs, fewer than 100% of the counting instances terminate correctly, and the fraction worsens as p decreases. In addition, the heuristic shows a bimodal behavior when it fails:

i) The heuristic forces termination early in the counting process, and the leader outputs a count that is much smaller than the real number of processes.

ii) The heuristic fails when the energy accumulated by the leader is already close to the network size.

In my experiments, the heuristic never failed in any way other than these two. The Convergence Detection Time rows of the table report the number of rounds, after the first convergence, that the heuristic needs to terminate the count correctly. In most cases this number equals the size of the network, even on disconnected instances.
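The correctness measures can be computed from the list of counts output by the leader, one per run. The sketch below assumes that the average error, the maximum error and σ are taken over the failed runs only and that σ is the population standard deviation; both choices are assumptions, and the function name `termination_stats` is hypothetical.

```python
import statistics

def termination_stats(true_size, outputs):
    """outputs: the count produced by the leader in each run."""
    # Error of a failed run = number of nodes missed with respect to the real size.
    errors = [abs(true_size - out) for out in outputs if out != true_size]
    stats = {"rho": len(errors) / len(outputs)}   # error frequency
    if errors:
        stats["avg_error"] = statistics.mean(errors)
        stats["max_error"] = max(errors)
        stats["sigma"] = statistics.pstdev(errors)
    return stats
```

For example, four runs on a network of 100 nodes with outputs 100, 100, 99 and 97 give an error frequency of 50%, an average error of 2 nodes and a maximum error of 3 nodes.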

Model                               G(n, p)                                         Edge-Markovian, pd = 0.25
p                                   t/4     t/4     t/4     t/8     t/16    t/32    t/8     t/16    t/32
|V|                                 10      100     1000    100     100     100     100     100     100
ρ                                   22%     3%      2%      19%     25%     84%     30%     68%     76%
Average Error                       2,02    8,96    1       9       44,5    41,4    1       3,12    11,9
Max Error in Nodes                  8       96      1       99      99      99      1       99      99
σ of Error                          2,1166  27,4    0       27,4    48,3    48,81           14,23   29,73
Convergence Detection Time Average  10,2    100     1000    100     100     100     100     100     100
Convergence Detection Time Max      40      100     1000    100     100     100     100     100     100
Convergence Detection Time Min      10      100     1000    100     100     100     100     100     100

Figure 3: Evaluation of the termination correctness. The results are the outcome of 500 experiments.

4.3 Comparison between ANoK and A*NoK

The flow can be used to estimate |V| and thus to obtain a faster count. Figure 4 displays the evolution of Δ, defined above as the difference, measured at the leader, between the size estimated through the flow and the size estimated through the energy stored at the leader, from a temporal perspective in 4(a) and from an energy perspective in 4(b).

Figure 4: Difference between the size estimated with the flow (A*NoK) and the size estimated by looking at the energy stored at the leader (ANoK) in a G(n, p) network of |V| = 100.

Δ reaches its maximum value when the energy at the leader is about half the size of the network. Thus, when p ≥ t and the network is connected, the heuristic lets the leader predict the presence of at least 17 other nodes. This approach could therefore answer predicates such as |V| ≥ T faster on connected instances. The flow-based estimation also does well on disconnected instances, but only up to a specific threshold; beyond it, the gain obtained with the flow drops to one or two nodes more than the energy-based estimate.

The figures also explain why the termination heuristic works badly on instances with p ≤ ¼ t. The value of Δ falls below the threshold of 1 both when the energy at the leader is low and when it is close to |V|, so two kinds of misbehavior can occur: the count may terminate a few rounds after the start, or it may terminate near |V| when Δ falls below 1 again.

Figure 4(a) shows the behavior of Δ over time.

i) For p ≥ t the network is connected, and the leader's count quickly approaches half the size of the network, where Δ is maximal. The energy-based count, however, approaches the real size along an exponential curve, as the exponential decay of Δ shows. The same behavior holds for p < t, although the slower decay of Δ clearly indicates a slower approach to the real size.

ii) The curves show a high variance for p ≤ t, since the presence of disconnected topologies introduces a variance in the convergence time whose magnitude is proportional to the inverse of p. Moreover, the high variance of the flow seen by the leader during the execution translates into a high variance in convergence.

Figure 5 shows the same behavior on edge-Markovian graphs. The edge-Markovian process is less sensitive to the value of p, and the presence of more edges has a positive effect on Δ. The maximum value is slightly lower for the edge-Markovian process: 17 against 17.3 for the G(n, p) graph.

Figure 5: Difference between the size estimated with the flow (A*NoK) and the size estimated by looking at the energy stored at the leader (ANoK) for an edge-Markovian network with pd = 0.25 and |V| = 100.

Note that when the test is run on larger graphs with |V| = 1000, the curves show the same behavior as in Figure 4 and Figure 5, and in this case the maximum Δ is about 170 nodes.

4.4 Duty Cycle

To test the adaptiveness of the heuristic, I ran A*NoK on regular topologies, chains and rings, with a duty cycle of eighty percent: each node independently sleeps 20% of the time, and the links of sleeping nodes are deleted while they sleep. On a ring with |V| = 100 the average convergence time is about 26986 rounds over 100 experiments, while on a chain the average convergence time is about 70000 rounds. On random G(n, p) topologies with p = 2t the average convergence time is 1059 rounds over 200 experiments. The duty cycle degrades both the size estimation and the termination heuristic. On the chain and the ring the termination heuristic always fails, while on the random graph it fails in about 23% of the instances.
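The duty-cycle dynamicity can be sketched as a filter applied to a fixed base topology at every round: each node is awake with probability 0.8, independently of the others, and an edge is kept only if both of its endpoints are awake. The function names below are hypothetical and the code is an illustration, not the simulator used for the experiments.

```python
import random

def ring_edges(n):
    """Base ring topology on n nodes."""
    return [(v, (v + 1) % n) for v in range(n)]

def chain_edges(n):
    """Base chain topology on n nodes."""
    return [(v, v + 1) for v in range(n - 1)]

def duty_cycle_round(n, base_edges, awake_prob, rng):
    """Return the edges active in one round under an independent duty cycle."""
    awake = [rng.random() < awake_prob for _ in range(n)]
    # A link is deleted for the round if either endpoint is asleep.
    return [(u, v) for u, v in base_edges if awake[u] and awake[v]]
```

With an awake probability of 0.8 a given link is active in only about 64% of the rounds, and a chain or ring is disconnected in most rounds, which is consistent with the long convergence times reported above.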

Figure 6: Difference between the size estimated with the flow (A*NoK) and the size estimated by looking at the energy stored at the leader (ANoK) over 200 runs for duty cycle and random graph with |V| = 100.

4.5 Evaluation of A*LCO

I evaluated the termination time of A*LCO on G(n, p), edge-Markovian and random connected graphs. Recall that A*LCO only works on instances that are 1-interval connected. For reference, the original algorithm ALCO takes on average 15 rounds with |V| = 10, 393 rounds with |V| = 100, and 7753 rounds with |V| = 1000. Figure 7 shows the performance of the symmetry-breaking version. The symmetry-breaking extension lets the algorithm terminate faster, with a termination time close to the size of the network. This also indicates that the additional knowledge offered by the LCO allows the counting algorithm to run faster.

Model                 G(n, p), p = 2t           Edge-Markovian, pb = 2t     Random Connected Graph
|V|                   10      100     1000      10      100     1000        10      100     1000
Average Termination   7,6     107,6   1690,6    9,9     113,4   15543,2     10,14   113,8   1559
Max Termination       17      187     2117      19      222     2684        23      220     1899
Min Termination       5       96      1175      5       99      807         6       13      1263

Figure 7: Termination performance of A*LCO on 1-interval connected instances.

5 Conclusion and Future Work

In this paper I introduced two new practical algorithms, A*NoK and A*LCO, which are modifications of the energy-transfer algorithms ANoK and ALCO presented in [4]. I implemented, tested and compared these algorithms. The experiments show that A*NoK not only terminates correctly on dense graphs but also performs well on disconnected instances. On sparse and extremely disconnected instances, or on regular topologies whose dynamicity is due to duty-cycling, however, its error rate becomes high. Moreover, when the heuristic fails it exhibits a bimodal behavior. This problem needs further study to understand how better heuristics can be designed. Thanks to the concept of energy flow, A*NoK can answer predicates such as |V| ≥ T faster, which is a practical use of energy-transfer-based algorithms. In addition, the slow convergence of ANoK on sparse and disconnected instances could be studied to design algorithms that estimate, in a distributed way, the edge density and the connectivity of anonymous dynamic graphs when the size is known. The experiments suggest that the convergence time depends strictly on the edge creation probability p. Finally, A*LCO terminates quickly on dynamic graphs whose dynamicity is governed by a random adversary, which shows a trade-off between the knowledge of the network and the counting time.

References

1. G. A. Di Luna, S. Bonomi, I. Chatzigiannakis, R. Baldoni. Counting in Anonymous Dynamic Networks: An Experimental Perspective.

2. G. Di Luna, R. Baldoni, S. Bonomi, and I. Chatzigiannakis. Counting on Anonymous Dynamic Networks through Energy Transfer. Technical Report 1, 2013.

3. Othon Michail, Ioannis Chatzigiannakis, Paul G. Spirakis. Naming and Counting in Anonymous Unknown Dynamic Networks.

4. G. A. Di Luna, S. Bonomi, I. Chatzigiannakis, R. Baldoni. Counting on Anonymous Dynamic Networks through Energy Transfer.

5. Dana Angluin. Local and global properties in networks of processors (extended abstract). In Proceedings of the twelfth annual ACM symposium on Theory of computing, STOC ’80, pages 82–93. ACM, 1980.

7. F. Kuhn, N. Lynch, R. Oshman. Distributed computation in dynamic networks.

8. H. Terelius, D. Varagnolo, K. H. Johansson. Distributed size estimation of dynamic anonymous networks.

9. A. Boukerche, K. El-Khatib, L. Xu, L. Korba. An efficient secure distributed anonymous routing protocol for mobile and wireless ad hoc networks.

10. R. Niu, P. K. Varshney. Performance Analysis of Distributed Detection in a Random Sensor Field.

11. A. Kimura, E. Kohno, Y. Kakuda. Security and Dependability Enhancement of Wireless Sensor Networks with Multipath Routing Utilizing the Connectedness of Joint Nodes.

12. K. Zhao. Dynamic anonymous routing protocol of the Mobile network.

13. C. Perkins, E. Royer. Ad hoc on-demand distance vector routing. The Second IEEE Workshop on Mobile Computing Systems and Applications, 1999.

14. K. Fall. A Delay-Tolerant Network Architecture for Challenged Internets. SIGCOMM, August 2003.

15. André Allavena, Alan Demers, and John Hopcroft. Correctness of a Gossip-based Membership Protocol. In Proc. 24th ACM Symposium on Principles of Distributed Computing (PODC 2005).

16. Rajmohan Rajaraman, Northeastern U. www.ccs.neu.edu/home/rraj/Talks/DynamicNetworks/DYNAMO/June 2006

17. Y. Afek, B. Awerbuch, and E. Gafni. Applying static network protocols to dynamic networks. In Proc. of 28th Symp. On Foundations of Computer Science (FOCS), pages 358–370, 1987.

18. Y. Afek and D. Hendler. On the complexity of global computation in the presence of link failures: The general case. Distributed Computing, 8(3):115–120, 1995.

19. Roger Dingledine, Nick Mathewson, Paul Syverson. Tor: The Second-Generation Onion Router. Usenix Security 2004, August 2004.

20. V. Kostakos. Temporal graphs. Physica A: Statistical Mechanics and its Applications, 388(6):1007–1023, 2009.

21. Jérémie Chalopin, Yves Métivier, and Thomas Morsellino. On snapshots and stable properties detection in anonymous fully distributed systems (extended abstract). In Guy Even and Magnús Halldórsson, editors, Structural Information and Communication Complexity, volume 7355 of Lecture Notes in Computer Science, pages 207–218. Springer Berlin / Heidelberg, 2012. 10.1007/978-3-642-31104-8 18.

22. Yun Zhou, Yuguang Fang, and Yanchao Zhang. Securing wireless sensor networks: a survey. Communications Surveys Tutorials, IEEE, 10(3):6 –28, 2008.

23. Masafumi Yamashita and Tiko Kameda. Computing on an anonymous network. In Proceedings of the seventh annual ACM Symposium on Principles of distributed computing, PODC ’88, pages 117–130, New York, NY, USA, 1988. ACM.

24. Christian Scheideler. Models and techniques for communication in dynamic networks. In Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science, STACS ’02, pages 27–49, London, UK, UK, 2002. Springer-Verlag.

25. Naoshi Sakamoto. Comparison of initial conditions for distributed algorithms on anonymous networks. In Proceedings of the eighteenth annual ACM symposium on Principles of distributed computing, PODC ’99, pages 173–179. ACM, 1999.

26. Masafumi Yamashita and Tiko Kameda. Computing on an anonymous network. In Proceedings of the seventh annual ACM Symposium on Principles of distributed computing, PODC ’88, pages 117–130, New York, NY, USA, 1988. ACM.

27. Naoshi Sakamoto. Comparison of initial conditions for distributed algorithms on anonymous networks. In Proceedings of the eighteenth annual ACM symposium on Principles of distributed computing, PODC ’99, pages 173–179. ACM, 1999.

28. P. Fraigniaud, A. Pelc, D. Peleg, and S. Pérennes. Assigning labels in unknown anonymous networks. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 101–111. ACM, 2000.

29. James Aspnes, Faith Ellen Fich, and Eric Ruppert. Relationships between broadcast and shared memory in reliable anonymous distributed systems. Distributed Computing, 18(3):209–219, February 2006.

30. Regina O’Dell and Roger Wattenhofer. Information dissemination in highly dynamic graphs. In Proceedings of the 2005 joint workshop on Foundations of mobile computing, DIALM-POMC ’05, pages 104–110, New York, NY, USA, 2005. ACM.

31. Othon Michail, Ioannis Chatzigiannakis, and Paul G. Spirakis. Brief announcement: Naming and counting in anonymous unknown dynamic networks. In DISC, pages 437–438, 2012.

32. Othon Michail, Ioannis Chatzigiannakis, and Paul G. Spirakis. Causality, influence, and computation in possibly disconnected synchronous dynamic networks. In Roberto Baldoni, Paola Flocchini, and Ravindran Binoy, editors, OPODIS, volume 7702 of Lecture Notes in Computer Science, pages 269–283. Springer, 2012.

33. P. Boldi and S. Vigna. Computing anonymously with arbitrary knowledge. In PODC, pages 181–188. ACM, 1999.

34. M. Jelasity, A. Montresor, and Ö. Babaoglu. Gossip-based aggregation in large dynamic networks. ACM Trans. Comput. Syst., 23(3):219–252, 2005.

35. D. Kempe, A. Dobra, and J. Gehrke. Gossip-based computation of aggregate information. In FOCS, pages 482–491, 2003.

36. Jung. http://jung.sourceforge.net/, 2013.

37. D. Kempe, A. Dobra, and J. Gehrke. Gossip-based computation of aggregate information. In FOCS, pages 482–491, 2003.
