international journal of pure and applied …communication in vanet 1senthamilselvan, 2wahidabanu...
ANALYSIS OF SPAWN PROTOCOL AND EDFC ALGORITHM FOR SECURE
COMMUNICATION IN VANET
1Senthamilselvan, 2Wahidabanu
1Research Scholar, Bharath University, Chennai, Tamilnadu, India. 2Professor, Department of Electronics and Communication Engineering,
2Government College of Engineering, Thanjavur, Anna University, Chennai, Tamilnadu, India.
Abstract: Wireless sensor networks (WSNs) are similar
to wireless ad hoc networks in the sense that they rely
on wireless connectivity and spontaneous formation of
networks so that sensor data can be transported
wirelessly. Here we propose a novel
decentralized implementation of fountain codes in
sensor networks, such that data can be encoded in a
completely distributed fashion. In our proposed
algorithms, a sensor disseminates its data to a random
subset of sensors in the network, while each sensor
only encodes data that it has received. As the collector
collects a sufficient number of encoded data blocks by
visiting more and more sensors, it is able to decode all
original data with an efficient decoding process
designed for fountain codes. The salient and original
contribution of this paper is a solution to disseminate
data from one sensor to others in an efficient and
scalable fashion with the help of Exact Decentralized
Fountain Codes (EDFC) along with the routing
SPAWN protocol. As has been well known,
conventional shortest-path routing algorithms require
each sensor to maintain a routing table with a size
proportional to the total number of sensors in the
network with their respective locations. Be that as it
may, we have observed that our decentralized
implementation of fountain codes does not require the
support of a generic layer of SPAWN routing protocol.
We propose SPAWN (Swarming Protocol for vehicular
Adhoc Wireless Networks), a simple cooperative
strategy for content delivery in future vehicular
networks. We examine the issues involved in using such a
strategy from the standpoint of vehicular ad hoc networks. In this
paper, we propose a decentralized algorithm using
fountain codes to guarantee the persistence and
reliability of cached data on unreliable sensors. With
fountain codes, the collector is able to recover all data
as long as a sufficient number of sensors are alive. Our
theoretical analysis and simulation-based studies show
that the EDFC algorithm maintains the same level of
fault tolerance as the original centralized fountain code,
while introducing lower overhead than a naive random-walk
based implementation in the dissemination process.
Finally, we show that ad hoc relay wireless networks,
based on NS2 simulation technologies, have
potential for many prominent applications of this kind.
Keywords: Vehicular Networks (VANET), Content
Distribution, SPAWN Protocol, EDFC, Data
Dissemination, Broadcast, Adaptive Traffic Control
Scheme.
1. Introduction
Wireless sensor networks consist of unreliable and
energy-constrained sensors communicating with one
another wirelessly. It has been a conventional
assumption that, in wireless sensor networks, measured
data in individual sensors are gathered (via data
aggregation) and processed en masse at powered
sinks with Internet connections [1]. There are, however,
at least two cases in which this assumption may not
realistically hold. First, if sensor networks consist of a
large number of sensors (in the order of tens of
thousands or higher), it may not be energy efficient to
gather measured data from sensors to sinks using data
aggregation. Second, if sensors are randomly deployed
in inaccessible geographical regions or environments, it
may not be feasible to deploy powered sinks as well.
Along with the industrial development and
technological progress, monitoring the environment
plays a vital role. Much research has been devoted
to monitoring systems that can replace
traditional systems in critical environments [2].
Wireless sensor networks (WSNs) are deployed in
an area of interest to sense phenomena. A wireless sensor
network is a type of ad hoc network that can sense and
process data collected from the environment. These
networks are composed of autonomous devices, called
sensor nodes. Each sensor has a buffer that can be
divided into small slots. WSNs have recently been applied
in many environmental applications, such as measuring
temperature, humidity and salts, or monitoring objects.
WSN applications share a common task, which is
environmental monitoring [3]. This task is realized by
using nodes to sense data from the environment and
International Journal of Pure and Applied Mathematics, Volume 118 No. 20 2018, 1961-1973. ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version). url: http://www.ijpam.eu. Special Issue.
send it to the base station. All applications must
therefore use an efficient data collection approach.
Wireless sensor networks have many advantages, but their
main problem is the limited resources of each node. As a
result, any node may fail to communicate with other nodes
in the network, or disappear from the network altogether,
because of accidental events in harsh environments or
battery depletion.
Figure 1. (a) A WSN randomly distributed in a field.
Node S is the source node and floods its data to its
neighboring nodes D; each node D unicasts the data
to one of its neighbors, and so on.
In this paper, we propose a shortest-path algorithm
called Exact Decentralized Fountain Codes (EDFC), used
along with the SPAWN protocol, which depends on
unicasting instead of multicasting to provide controlled
redundancy of data through the network [4]. As a
consequence of these modifications, the number of
created messages and the energy consumption are
reduced. The algorithm also takes the buffer status
into account: if a node's buffer is full, the node
cancels the operation to reduce its processing load.
The proposed model is applied in large-scale wireless
sensor networks where the nodes are randomly
distributed in the serviced environment.
2. Preliminaries
2.1 The SPAWN Protocol
SPAWN has the same generic structure of any
swarming protocol. Peers downloading a file form a
mesh and exchange pieces of the file amongst
themselves. However, the wireless setting of SPAWN,
characterized by limited capacity, intermittent
connectivity and a high degree of churn in nodes, requires
it to adapt in specific ways. As we shall see, this
particular scenario provides a compelling incentive for
individual nodes to cooperate while accessing the
Internet. Figure 1(a) describes the basic operation of the
SPAWN protocol.
Figure 1. (a) Evolution of a file in a node using the SPAWN protocol.
(1) A car arrives in the range of a gateway, (2)
initiates a download and (3) downloads a piece of the file.
(4) After getting out of range, it (5) starts to gossip with
its neighbours about content availability and (6)
exchanges pieces of the file, thereby getting a larger
portion of the file than it would by waiting for the next
gateway to resume the download.
2.2 Peer Discovery
There are several components to the operation of the
SPAWN protocol, but for brevity we just focus on Peer
Discovery. We propose a decentralized mechanism for
peer discovery. We utilize the broadcast medium of the
wireless channel to gossip information about the
content availability at neighbours. In a mobile
environment, gossiping provides a way to incorporate
location awareness into the peer discovery scheme [3].
Since TCP over multiple hops suffers quite
dramatically in the ad hoc wireless scenario [4], a
node is better off using unicast TCP connections with
nearby peers. Gossiping helps here in constructing
overlapping meshes of physically close peers for
exchanging pieces of the file. Figure 1(b) shows the
structure of a gossip message in SPAWN.
Figure 1. (b) Gossip message format. Here n_i denotes
the address of the i-th node in the path that
the gossip message traversed.
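As an illustration only, a gossip message carrying the list of traversed node addresses n_1, ..., n_i could be represented as below. The field names (`file_id`, `pieces`, `origin_time`, `route`) and the helper methods are hypothetical; the source specifies only that the message records the addresses of the nodes on its path and advertises content availability.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class GossipMessage:
    """Hypothetical in-memory form of a SPAWN gossip message."""
    file_id: str          # assumed: identifier of the advertised file
    pieces: Set[int]      # assumed: piece indices the originator holds
    origin_time: float    # origination time-stamp
    route: List[str] = field(default_factory=list)  # n_1 .. n_i, nodes traversed

    def forwarded_by(self, node_id: str) -> "GossipMessage":
        """Return a copy whose route list is stamped with the forwarder's address."""
        return GossipMessage(self.file_id, set(self.pieces),
                             self.origin_time, self.route + [node_id])

    def hops(self) -> int:
        """Number of hops travelled, a simple proxy for physical proximity."""
        return max(len(self.route) - 1, 0)


msg = GossipMessage("file-42", {0, 3, 7}, origin_time=12.5, route=["n1"])
msg2 = msg.forwarded_by("n2").forwarded_by("n3")
```

A receiver could use `hops()` to prefer nearby peers, which is one way to read the location awareness that gossiping is said to provide.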
2.3 Coded Cooperative Data Transmission
A. Motivating Example
In this work, a basic bidirectional transmission
protocol using network coding and digital fountain
codes is proposed. Network Simulator (NS2) is a highly
adaptable and fast simulator for large heterogeneous
networks that supports both wired and wireless network
protocols. In the proposed protocols, the XOR operation
is applied to the original data. Furthermore, we also
consider the bidirectional transmission scheme with
multiple relays. Mobile approaches focus on recreating
established networks by mitigating disconnection,
bandwidth degradation and limited communication range
using quality-of-service and routing mechanisms. Network
coding is less prone to a single point of failure. A
practical use of network coding has been demonstrated
in [3] for large-scale content sharing in the wired
Internet. After receiving a sufficient number of blocks
associated with linearly independent coefficients, a
receiving node can decode the blocks.
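The role of the XOR operation in bidirectional relaying can be sketched as follows. This is a minimal illustration under assumed conditions (two end nodes A and B, one relay, equal-length payloads); the function name `xor_bytes` and the payloads are hypothetical and not taken from the proposed protocol.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("payloads must have equal length (pad beforehand)")
    return bytes(p ^ q for p, q in zip(x, y))


# A and B each send their packet to the relay.
packet_a = b"hello from A"
packet_b = b"reply from B"

# The relay broadcasts a single coded packet instead of two separate ones.
coded = xor_bytes(packet_a, packet_b)

# Each end node cancels its own packet to recover the other one.
recovered_at_a = xor_bytes(coded, packet_a)
recovered_at_b = xor_bytes(coded, packet_b)
```

The relay transmits once rather than twice, which is the saving that network coding exploits on a broadcast medium.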
B. Cooperative Downloading
Cooperative downloading is a practical and
effective alternative to mirror servers and
content delivery networks. BitTorrent is one of the
most popular cooperative distributed downloading
protocols in use today. It is based on the principle of
swarming, wherein the desired file is downloaded in
parallel from multiple cooperating peers. The parallel
download approach used by BitTorrent enables it to
achieve improved performance compared with
other peer-to-peer systems. With the recent advances in
wireless communication technologies and the large
growth in the number of mobile users, content
distribution in mobile wireless environments is gaining
importance. Several routing approaches
suitable for peer-to-peer file sharing over MANETs
have been discussed in [4].
A VANET (Vehicular Ad hoc Network) is a
special kind of MANET (Mobile Ad hoc
Network) in which the path, direction and speed of the
nodes are largely deterministic. In addition, VANET nodes
frequently pass stationary gateways from which they
can download data. The high speed of the
nodes in a VANET also causes frequent
disconnections and consequent route breakages,
which impede the operation of peer-to-peer protocols that
depend on routing to find peers. In
VANETs, where the nodes move at high speed and the
topology changes quickly, vehicles can download only
partial data from the gateways before
disconnection. VANETs are therefore ideal
candidates for cooperative content sharing
systems. However, most existing peer-to-peer
swarming protocols are designed for fixed-topology
networks. Because of the changing topology and
high mobility, implementing these schemes in VANETs is
very challenging [5]. In particular, devising
peer and content selection strategies for sharing
is quite complex. The SPAWN protocol uses a gossip
mechanism to advertise the list of pieces each node has
and takes it into account when selecting content among
peers. Moreover, exploiting the broadcast nature of the
wireless medium enables SPAWN to reduce redundant
transmissions.
C. Unequal Packet Importance
The unequal importance of packets lies at the heart of
multimedia streaming: it is used to guide
prioritized transmission over a network. Depending on
the transmission scenario, available differentiation
mechanisms are used to ensure that the most important
packets of a particular stream are given priority, thus
providing a graceful degradation in the presence of
adverse network conditions [6]. One challenge that
arises from this fundamental property of multimedia,
with respect to network coding, is that network coding,
so far, has been agnostic to the content of the packets
that are coded together.
In inter-session network coding, the goal is to mix
together several packets from different flows, thus
increasing the information per packet and eventually
the throughput. However, for media streaming it is not
only the quantity of delivered packets that matters but
also their quality.
Using an analogy to Redundant Array of
Independent Disks (RAID) in computer systems [5], we
explain a number of alternatives that motivate the approach
of this paper.
2.4 Why Fountain Codes?
Similar to RAID 1 disk arrays (mirrored disks), the first
alternative is to utilize other sensors to “mirror” or
“replicate” data from a particular sensor. After such
mirroring, when a sensor fails, the data it has gathered
before its failure can be retrieved from other “backup”
sensors, improving data persistence and fault tolerance.
As an example of existing work, Hass et al. [6] have
addressed the fault tolerance of distributed location
databases by using replication. It is easy to see that a
large number of replicas are required to maintain a
certain degree of fault tolerance, when the failure rate
of sensors increases. Analogous to RAID 6 disk arrays
(in which Reed-Solomon codes are used), error-
correcting codes may be used to alleviate the
disadvantages of mirroring data. It has also been
proposed that Reed-Solomon and LDPC codes may be
used towards building reliable and distributed data
storage systems over wide-area networks [7], [8]. There
is one catch, however, in that the encoding and
decoding processes of conventional error-correcting
codes have to be implemented in a centralized fashion.
To make good use of them in sensor networks, all data
have to be delivered to a centralized sensor node, which
encodes them and re-distributes the encoded data
within the network. This is obviously not realistic: a
single sensor is not assumed to have sufficient
energy or computational power to encode all the data
received, the central node is a single point of failure,
and the communication cost of forwarding all data to a
central node for encoding is prohibitive.
In wireless sensor networks, where data are
generated from different sensors in a distributed way, it
has been proposed in existing work that random linear
codes may be used to improve the degree of fault
tolerance [9], [10], [11] by decentralized encoding.
Random linear codes are also investigated in distributed
networked storage [12]. Unfortunately, similar to
Reed-Solomon codes, the decoding process of random linear
codes is computationally expensive, with a decoding
complexity of $O(K^3)$, where K is the number of data
blocks to be encoded. We believe that the superior
decoding complexity of fountain codes, $O(K \ln K)$,
may come to our rescue, even though the encoding
process of fountain codes is also centralized in the coding
literature.
We briefly introduce LT codes [4], the first
fountain codes proposed in the coding literature. We
use LT codes in our subsequent discussions and
performance evaluation. Fountain codes are rateless in
the sense that the number of encoded data blocks that
can be generated from the original data blocks is
potentially unlimited, and the encoded blocks can be
computed on the fly. Therefore, fountain codes are
especially suitable for erasure channels where the
erasure probabilities are unknown. Fountain codes also
enjoy excellent computational efficiency for both
encoding and decoding processes. In LT codes, it has
been shown that K source blocks can be decoded from
any subset of $K + O(\sqrt{K}\,\ln^2(K/\delta))$ encoded blocks
with probability $1-\delta$. The encoding and decoding
complexities are both $O(K \ln(K/\delta))$. The number of source
blocks that are used to generate an encoded block is
referred to as the degree of the encoded block. The
degree distribution of encoded blocks in LT codes
follows the Robust Soliton distribution.
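For reference, the Robust Soliton distribution can be computed as in the sketch below, which follows the standard definition from the LT-codes literature. The parameter values `c = 0.1` and `delta = 0.5`, the function name, and the rounding of the spike position K/R to an integer are assumptions made for illustration; they are not values stated in this paper.

```python
import math
from typing import List


def robust_soliton(K: int, c: float = 0.1, delta: float = 0.5) -> List[float]:
    """Return [mu(1), ..., mu(K)], the Robust Soliton degree probabilities."""
    R = c * math.log(K / delta) * math.sqrt(K)
    # Position of the spike; K/R is rarely an integer, so it is rounded (assumption).
    pivot = min(max(int(round(K / R)), 1), K)

    # Ideal Soliton part: rho(1) = 1/K, rho(d) = 1/(d(d-1)) for d >= 2.
    rho = [0.0] * (K + 1)
    rho[1] = 1.0 / K
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Correction part: tau(d) = R/(dK) below the spike, a spike at the pivot, 0 above.
    tau = [0.0] * (K + 1)
    for d in range(1, pivot):
        tau[d] = R / (d * K)
    tau[pivot] = R * math.log(R / delta) / K

    beta = sum(rho) + sum(tau)  # normalisation constant
    return [(rho[d] + tau[d]) / beta for d in range(1, K + 1)]


mu = robust_soliton(100)
```

With these assumed parameters most of the probability mass sits on low degrees, with degree 2 the most likely, plus a spike near K/R that helps keep the decoder supplied with degree-one blocks.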
The encoder of LT codes generates each encoded
block independently by first randomly choosing the
degree d of an encoded block from the Robust Soliton
distribution, and then choosing d distinct blocks from the K
source blocks uniformly at random [13]. The encoded
block is the bitwise exclusive-or (XOR) of the d source
blocks. The decoding process utilizes the Belief
Propagation (BP) algorithm, which intuitively is more
computationally efficient than the general matrix
inversion process, since degree-one encoded blocks are
used to activate a cascading backward cancellation
process, producing more degree-one encoded blocks.
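The encoding and decoding steps described above can be sketched as follows. This is an unoptimized illustration, not the implementation evaluated in this paper: source blocks are modelled as integers, the degree distribution is passed in as a list of weights, and the names `lt_encode` and `lt_decode` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Set, Tuple


def lt_encode(blocks: Sequence[int], degree_dist: Sequence[float],
              rng: random.Random) -> Tuple[Set[int], int]:
    """Produce one encoded block: (indices of the sources used, XOR of those sources)."""
    K = len(blocks)
    d = rng.choices(range(1, K + 1), weights=degree_dist)[0]  # draw the degree
    idx = rng.sample(range(K), d)                              # d distinct sources
    value = 0
    for i in idx:
        value ^= blocks[i]
    return set(idx), value


def lt_decode(encoded, K: int) -> List[Optional[int]]:
    """Peeling decoder: degree-one blocks release sources, which are then
    cancelled out of the remaining blocks. Unrecovered sources stay None."""
    pending = [[set(idx), val] for idx, val in encoded]
    decoded: List[Optional[int]] = [None] * K
    progress = True
    while progress:
        progress = False
        for item in pending:
            idx = item[0]
            # Cancel every already-decoded source out of this encoded block.
            for i in [i for i in idx if decoded[i] is not None]:
                item[1] ^= decoded[i]
                idx.remove(i)
            if len(idx) == 1:            # a degree-one block reveals one source
                i = idx.pop()
                decoded[i] = item[1]
                progress = True
    return decoded


blocks = [5, 9, 12]
hand_built = [({0}, 5), ({0, 1}, 5 ^ 9), ({1, 2}, 9 ^ 12)]
print(lt_decode(hand_built, K=3))  # [5, 9, 12]
```

In the hand-built example the degree-one block releases source 0, which reduces the second block to degree one, and so on; this is the cascading cancellation referred to above.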
2.5 Data Access Through Decentralized Fountain
Codes
We propose to use decentralized fountain codes to
efficiently maintain data persistence in large-scale
sensor networks. In this section, we first show how
fountain codes can be used to access sensed data in a
decentralized fashion, then present two data
dissemination algorithms based on “one-way” random
walks.
A. Decentralized Fountain Codes
We model a sensor network as a graph G = (V, E),
where V represents the set of sensors and E represents
the set of links. Let N denote the total number of nodes
in V. We assume that a subset of K nodes, called
sensing nodes, monitor the environment and generate
sensed data to be disseminated. The sensing nodes are
able to cache their data on all other nodes (referred to
as caching nodes), such that their data are available
even when they fail or otherwise become unavailable
[14],[15]. In fountain codes, each encoded block is the
XOR combination of a number of source blocks,
randomly selected based on a special distribution, such
as the Robust Soliton distribution in LT codes.
To take advantage of fountain codes in sensor
networks, one way to design a straightforward
implementation is based on “two-way” random walks
as follows. A node first generates the code-degree d
from the Robust Soliton distribution, where the code-
degree of a node refers to the degree of the encoded
block cached on the node. This node may then request
the source blocks from d distinct sensing nodes,
uniformly distributed in the K sensing nodes, using
random walks to deliver these requests. All d random-
walk paths to the d sensing nodes are recorded in the
network. Upon receiving the requests, the d sensing
nodes disseminate their source blocks through the
previous recorded d random-walk paths to the node.
Finally, this node encodes the d received source blocks
for later retrieval.
However, we believe that such “two-way”
random-walk paths are rigid, unrealistic and
undesirable. First, sending data along the recorded
random-walk paths makes the more demanding
assumption of bidirectional wireless links, which
sometimes does not hold in reality. Second, to record
random walk paths in the network, sensors have to
maintain forwarding tables with the size proportional to
the number of random walks in the network. Third,
sensor networks are inherently unreliable, and random-
walk paths may be broken or lost if any of the sensors
on these paths fails during the transmission. Finally,
excessive overhead of transmitting control messages is
unavoidable when selecting random sensing nodes and
additional random walks have to be activated if the
selected node is not a sensing node, or a duplicate of
previous selections. In this paper, we seek to construct
decentralized fountain codes with only one traversal of
random walks, from sensing nodes to the caching nodes
that encode and store the source blocks. Though
nontrivial to design, such “one-way” random walks are
much more scalable, completely stateless, and have
avoided all the overhead and drawbacks of the
aforementioned “two-way” walks. Our algorithms are
intuitively justified based on the following rationale. If
the steady-state probabilities of two nodes are different
in the random-walk Markov chain, e.g., $\pi_i > \pi_j$, random
walks guarantee that each source block has a higher
probability to stop on node i than on node j. Therefore,
node i will receive more distinct source blocks than
node j, which results in node i obtaining a higher
code-degree than node j [16].
Henceforth, we propose two heuristic algorithms
to guarantee the Robust Soliton distribution of LT
codes, in a completely decentralized fashion, and using
“one-way” random walks. These two algorithms are
called Exact Decentralized Fountain Codes (EDFC)
and Approximate Decentralized Fountain Codes
(ADFC), since EDFC achieves the same coding
performance as the original centralized LT codes,
whereas ADFC offers degraded performance in exchange
for a lower dissemination cost. For both
algorithms, we assume that, before execution, every
node has been pre-programmed with the number of
sensing nodes K, the total number of nodes N, and the
Robust Soliton distribution. All this information can
be broadcast to every node when the sensor network is
initialized. However, K and N may decrease with time
as some sensors die. In Section V, we will show that
EDFC and ADFC do not require each node to know the
precise values of K and N.
B. Exact Decentralized Fountain Codes (EDFC)
Because of the randomization introduced by random
walks, the number of distinct source blocks received by
a node is uncertain and usually does not equal the
code-degree of this node. EDFC attempts to overcome this
challenge and guarantee the Robust Soliton degree
distribution. It is based on the observation that we may
disseminate more than d source blocks to each node,
but encode only d of them, where d is the code-degree
of the node. Assume each node receives $x_d \cdot d$ source
blocks on average, where $x_d$ is called the redundancy
coefficient. By choosing a sufficiently large $x_d$, the
probability that these source blocks comprise
fewer than d distinct source blocks can be made
arbitrarily small. Therefore, the distribution of the
number of source blocks used in encoding can be made
arbitrarily close to the Robust Soliton degree
distribution. Note that, by symmetry, $x_d$ should
depend only on d.
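The effect of the redundancy coefficient can be illustrated numerically. The sketch below assumes, purely for illustration, that each of the ceil(x_d · d) blocks received by a node is drawn independently and uniformly from the K source blocks; the function name is hypothetical, and the random walks of the actual algorithm need not satisfy this idealized assumption exactly.

```python
import math


def prob_fewer_than_d_distinct(K: int, d: int, x_d: float) -> float:
    """Probability that ceil(x_d * d) uniform draws from K sources
    contain fewer than d distinct sources."""
    m = math.ceil(x_d * d)
    # dist[j] = probability of having seen exactly j distinct sources so far
    dist = [0.0] * (min(m, K) + 1)
    dist[0] = 1.0
    for _ in range(m):
        nxt = [0.0] * len(dist)
        for j, p in enumerate(dist):
            if p == 0.0:
                continue
            nxt[j] += p * j / K                  # draw repeats a known source
            if j + 1 < len(nxt):
                nxt[j + 1] += p * (K - j) / K    # draw reveals a new source
        dist = nxt
    return sum(dist[:d])


shortfall_no_redundancy = prob_fewer_than_d_distinct(K=100, d=5, x_d=1.0)
shortfall_with_redundancy = prob_fewer_than_d_distinct(K=100, d=5, x_d=3.0)
```

Under this assumption, raising `x_d` from 1 to 3 sharply lowers the chance that a code-degree 5 node ends up with fewer than 5 distinct blocks, which is the behaviour the argument above relies on.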
Algorithm Description: EDFC proceeds as follows.
Initially, each node generates its code-degree from the
Robust Soliton distribution. Next, each sensing node
sends out its source block by random walks. As long as
a source block stops at a node at the end of the random
walk, this node will store this source block. After all
source blocks are disseminated, each node generates its
encoded block from a subset of received source blocks
with cardinality equal to its code-degree. Although the
algorithm is simple, two critical elements need to be
computed, and they are derived in detail in the following:
the number of random walks launched from each
sensing node and the probabilistic forwarding tables for
random walks.
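The dissemination and encoding steps of the algorithm description can be sketched as below. This is a deliberately simplified illustration: it uses plain uniform random walks over neighbor lists rather than the probabilistic forwarding tables derived in this paper, models source blocks as integers, and takes the code-degrees as given. The function name and all parameters are hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple


def edfc_sketch(neighbors: Dict[int, List[int]], sources: Dict[int, int],
                degrees: Dict[int, int], b: int, walk_len: int,
                rng: random.Random) -> Dict[int, Tuple[Set[int], int]]:
    """neighbors: node -> neighbor list; sources: sensing node -> source block;
    degrees: node -> code-degree d; b: walks per sensing node.
    Returns node -> (sensing nodes used, XOR of their source blocks)."""
    received: Dict[int, Set[int]] = {v: set() for v in neighbors}

    # Dissemination: each sensing node launches b one-way random walks.
    for s in sources:
        for _ in range(b):
            v = s
            for _ in range(walk_len):
                v = rng.choice(neighbors[v])   # simplification: uniform next hop
            received[v].add(s)                 # the node where the walk stops stores it

    # Encoding: each node XORs d of the distinct blocks it received.
    encoded = {}
    for v, got in received.items():
        d = min(degrees[v], len(got))          # may fall short without enough redundancy
        chosen = rng.sample(sorted(got), d)
        value = 0
        for s in chosen:
            value ^= sources[s]
        encoded[v] = (set(chosen), value)
    return encoded


ring = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
src = {0: 11, 2: 22, 4: 33}
deg = {v: 2 for v in ring}
result = edfc_sketch(ring, src, deg, b=4, walk_len=5, rng=random.Random(1))
```

The `min(degrees[v], len(got))` line makes the shortfall visible: a node that received fewer than d distinct blocks cannot reach its chosen code-degree, which is the case the redundancy coefficient is meant to make rare.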
The number of random walks: The expected number of
nodes with code-degree d in the network is $N\mu(d)$,
where $\mu(d)$, the fraction of code-degree d nodes, is
defined in (1). Each code-degree d node receives $x_d d$
source blocks. Therefore, the total number of source
blocks received in the network is
$\sum_{d=1}^{K} N\mu(d)\,x_d\,d$, which also equals $bK$, the
number of source blocks disseminated from the K sensing
nodes, where b denotes the number of random walks from
each sensing node. Hence, we have
$$b = \frac{N}{K}\sum_{d=1}^{K} x_d\,d\,\mu(d).$$
Figure 2. (a) Chosen Degree Distribution (b) Actual Degree Distribution
The actual degree distribution has a shape similar
to the Robust Soliton distribution as shown in Fig. 2,
though not exactly the same. Because of this
inaccuracy, we expect that the collector needs to collect
more encoded blocks than in EDFC to successfully
decode. However, further numerical computation
reveals that the overhead ratio g2 is only 0.2326. This
suggests that far less transmission cost is required to
disseminate source blocks than EDFC or the ideal
algorithm.
3. Gossiping Schemes & Communication
Overhead in SPAWN Protocol
Gossiping Schemes
We evaluate various gossiping schemes which we
describe in this section.
Probabilistic Spawn
Spawners not interested in the particular file listen to
gossip messages of that file and forward them with a
low probability. Interested Spawners listen to those
gossip messages and forward them with a higher
probability after stamping the route-list of the packet
with their own id. An interested spawner that is
currently downloading a file generates gossip
messages on completing the download of each new piece.
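The forwarding rule of Probabilistic Spawn can be sketched as follows. The probabilities `p_low` and `p_high` are placeholder values chosen for illustration, and the function name is hypothetical; the source states only that non-interested spawners forward with a low probability and interested spawners with a higher one, stamping the route-list.

```python
import random
from typing import List, Optional


def handle_gossip(route: List[str], node_id: str, interested: bool,
                  rng: random.Random,
                  p_low: float = 0.1, p_high: float = 0.9) -> Optional[List[str]]:
    """Return the route list to re-broadcast, or None if the message is dropped."""
    p = p_high if interested else p_low
    if rng.random() >= p:
        return None                      # not forwarded this time
    if interested:
        return route + [node_id]         # interested spawners stamp the route-list
    return list(route)                   # non-interested spawners relay it unchanged


rng = random.Random(0)
stamped = handle_gossip(["n1"], "n2", interested=True, rng=rng, p_low=0.0, p_high=1.0)
dropped = handle_gossip(["n1"], "n2", interested=False, rng=rng, p_low=0.0, p_high=1.0)
```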
Rate-Limited Spawn
Each Spawner maintains two caches, a Non-Interested
cache of gossip messages about files that it is not
interested in, and an interested cache. Periodically,
gossip messages are picked up from one of the caches
and re-broadcasted (without updating the origination
time-stamp). Interested cache messages are selected at
a higher frequency. The decision about which message
to select from a particular cache can be made in
different ways.
• Rate-Limited-Recent Spawn: The gossip
message with the most recent origination time-stamp is
forwarded.
• Rate-Limited-Random Spawn: The gossip
message is selected at random from the relevant cache.
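The two-cache selection described above can be sketched as below. The class and function names, the value of `p_interested`, and the fallback to the other cache when the selected one is empty are assumptions made for illustration; the source specifies only that interested-cache messages are selected at a higher frequency and that the message is chosen either by most recent time-stamp or at random.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gossip:
    file_id: str
    origin_time: float   # origination time-stamp, not updated on re-broadcast


def pick_gossip(interested: List[Gossip], non_interested: List[Gossip],
                rng: random.Random, p_interested: float = 0.75,
                policy: str = "recent") -> Optional[Gossip]:
    """Pick one cached gossip message to re-broadcast, or None if both caches are empty."""
    use_interested = rng.random() < p_interested
    cache = interested if use_interested else non_interested
    if not cache:                         # assumed fallback: try the other cache
        cache = non_interested if use_interested else interested
    if not cache:
        return None
    if policy == "recent":                # Rate-Limited-Recent Spawn
        return max(cache, key=lambda g: g.origin_time)
    return rng.choice(cache)              # Rate-Limited-Random Spawn


wanted = [Gossip("a", 1.0), Gossip("a", 5.0)]
other = [Gossip("b", 9.0)]
picked = pick_gossip(wanted, other, random.Random(0), p_interested=1.0)
```

Calling `pick_gossip` once per timer tick yields the rate limiting: at most one message leaves the node per period, regardless of how many gossip messages it has heard.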
Figure 3. Local-file piece Evolution
Communication Overhead
We examine three metrics under different network
sizes: Message Cost, Energy Cost, and Termination
Time. In Figure 4, we compare the total message cost
of our ECPC algorithm with EDFC and RCDS. In
general, the total message cost of ECPC is much lower
than that of the other two approaches, and the difference
grows faster as the network size scales
up. This is because dissemination in EDFC and
RCDS relies on random walks, which cost at least
$O(n^2\,\mathrm{polylog}\,n)$ messages, while dissemination in ECPC
consumes only O(1) messages per node. In particular, the
message cost of ECPC stays under 1,000 at a
network size of 500, whereas RCDS requires 2,000,000
messages in total to achieve the expected code-degree
distribution across the network, and approximately
1,200,000 messages are consumed in EDFC.
Figure 4. Communication Overhead: Total Message Cost
Since the packet transmission power is
dynamically adjusted in ECPC, the energy cost of each
message broadcast varies dramatically. Thus, a
reduction in message cost does not guarantee better
energy efficiency. We therefore consider the total energy
consumption required by each protocol, to avoid
arguable concerns about the energy efficiency of the
proposed ECPC. Figure 5 presents the total
energy ratio between RCDS and ECPC and between EDFC
and ECPC. For RCDS, the
energy ratio to ECPC rises from about 20 (N =
100) to 100 (N = 500). Compared with RCDS, EDFC
decreases the number of random walks for each raw
data item by relaxing the exact condition to an
approximate one. Hence, the maximum energy ratio between
EDFC and ECPC is about 60 at a network size of
500 [16],[17]. The reason behind this significant energy
saving is that ECPC uses only single-hop packet
broadcasting to ensure that the final degree distribution is
satisfied among encoded and stored packets. Though
transmission power may be increased to cover more
neighbors in one broadcasting attempt, the expected Tx
power value throughout the multi-round encoding process
is closer to P’. In terms of time for data dissemination,
the ECPC method terminates within a short time period, e.g.
less than 10 time units under varying network sizes, as
shown in Figure 6. In EDFC and RCDS, each source
node can initiate its own random walk at the same time,
as long as there is no media access conflict.
Figure 5. Total Energy consumption distributed data storage Schemes
Their encoding nevertheless takes considerable
time to terminate: 200 time units for EDFC and
400 time units for RCDS at a
network size of 500.
Figure 6. Communication Overhead: Total Termination Time
Data Recovery Ratio
We study the decoding performance in terms of data
recovery ratio. Data recovery ratio denotes the
percentage of original data recovered after decoding.
Each node has a failure probability of 10%. First, we
observe that the recovery ratio of ECPC is 40% at a
network size of 50, but increases to over 90% as the
network size grows beyond 400. In the experiment, ECPC
selects β = 0.07 from the range (0, 0.1) to maintain the
randomness properties. The decoding performance of
EDFC and RCDS also shows an increasing trend
as the network size grows. The reason for their poorer
performance is that nodes can fail while a random
walk is still in progress, which may stop the
random walk at the failed node. Thus, the
resulting degree distribution is not the best match for the
expected code-degree distribution. In particular, the
data recovery ratio is only 67% and 62% for EDFC and
RCDS respectively.
Figure 7. The decoding performance under varying network sizes: MAX data
recovery ratio for different network sizes
We examine the data recovery ratio for 500 nodes
as a function of elapsed experiment time in Figure 6.
We keep the node failure probability at 10%. Since we
repeat the experiments 50 times with node failures occurring
randomly each time, the overall failure occurrence time
is also uniformly distributed. Since EDFC and RCDS
need 300 to 400 time units to perform the random-walk
based data dissemination, the recovery ratio is 0
during the early period of data dissemination for
both EDFC and RCDS. On the other hand, ECPC
terminates in a short time period, making data recovery
possible even at an early stage, such as at 100
time units. The peak recovery ratio of EDFC reaches
only 67% because of node failures that occur during the
data dissemination. As time elapses, the recovery
ratios of the three approaches decrease, because
more nodes fail over time. However, the decoding
performance of ECPC degrades much less than that of the
other two approaches.
4. Performance Evaluation
To evaluate the effectiveness and performance of the
proposed algorithms, we have implemented both the
original centralized and the decentralized
implementations of fountain codes. The centralized
implementation of fountain codes (LT codes in this
case) consists of about 1000 lines of C++ code, with
fully optimized implementation of the encoding and
decoding algorithms. The decentralized implementation
of fountain codes with random walks in wireless sensor
networks is also simulated with C++. We use the two-
dimensional Geometric Random Graph, G2(N, r), as
the topological model, where N sensors are uniformly
distributed on a unit disc. Besides the total number of
sensors N and the radio range r of each node, an
additional parameter special to our algorithms is the
total number of sensing nodes K. The K sensing nodes
are uniformly distributed among the N sensors. We set
K = 10000, N = 20000, and r = 0.033 in most
experiments with exceptions explicitly stated. The
average number of neighbors for each node is 21 in
such a setting. To mitigate randomness, we show, for
each data point in all figures, the average and the 95%
confidence interval from 10 independent experiments.
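The reported average neighbor count can be sanity-checked with a short calculation. The sketch below is an independent illustration, not the simulator used in this paper: it estimates the mean degree of a geometric random graph on the unit disc, first analytically while ignoring boundary effects (node density N/π times coverage area πr²), then by brute force on a small instance. The function names are hypothetical.

```python
import math
import random


def expected_degree(N: int, r: float) -> float:
    """Mean neighbor count ignoring boundary effects: (N / pi) * (pi * r^2)."""
    return N * r * r


def simulated_mean_degree(N: int, r: float, rng: random.Random) -> float:
    """Place N points uniformly on the unit disc and count links of length <= r."""
    pts = []
    while len(pts) < N:                       # rejection sampling from the square
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1.0:
            pts.append((x, y))
    links = 0
    for i in range(N):
        for j in range(i + 1, N):
            if math.dist(pts[i], pts[j]) <= r:
                links += 1
    return 2.0 * links / N                    # each link adds one neighbor to two nodes


paper_setting = expected_degree(20000, 0.033)     # N r^2 = 21.78
small_run = simulated_mean_degree(300, 0.2, random.Random(42))
```

For N = 20000 and r = 0.033 the boundary-free estimate is about 21.8, consistent with the average of 21 neighbors stated above; nodes near the edge of the disc have fewer neighbors, which pulls the true average slightly lower.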
Figure 8. Decoding Ratio of Centralized
Fountain Codes
Communication Cost and Decoding Ratio
We examine two main performance metrics, the
communication cost and the decoding ratio. The
communication cost, governed by the length of random
walks and the number of random walks from a sensing
node, represents the system efficiency. The decoding
ratio denotes the number of nodes that need to be
visited by a collector for decoding, normalized by the
number of sensing nodes. It reflects the degree of fault
tolerance of the network: the fewer nodes a collector
must visit in order to decode all data, the higher the
percentage of nodes that are allowed to fail.
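The link between decoding ratio and fault tolerance can be made concrete with a small calculation. The sketch below rests on a simplifying assumption that is ours rather than the paper's: any surviving set of (decoding ratio × K) nodes is taken to be sufficient for decoding, so the tolerable failure fraction is 1 minus that count divided by N. The function name is hypothetical.

```python
def max_failure_fraction(decoding_ratio: float, K: int, N: int) -> float:
    """Largest fraction of the N nodes that may fail while enough remain
    for the collector to visit decoding_ratio * K of them."""
    needed = decoding_ratio * K
    return max(0.0, 1.0 - needed / N)


# Settings used in most experiments of this paper: K = 10000, N = 20000.
edfc_tolerance = max_failure_fraction(1.05, 10000, 20000)   # 0.475
adfc_tolerance = max_failure_fraction(1.6, 10000, 20000)    # 0.2
```

Under this reading, a decoding ratio of 1.05 leaves room for nearly half the nodes to fail, while a ratio of 1.6 leaves room for about a fifth.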
We compare the performance of EDFC and ADFC with
the two-way algorithm as described in Section III-A.
First, the impact of random-walk lengths on the
decoding ratio is studied. Fig. 3 plots the decoding ratio
vs. the length of random walks for the three algorithms.
In general, the decoding ratio decreases when the
length of random walks increases, and stays stationary
at a certain value once the length exceeds a threshold,
for all three algorithms. This is because the random
walks approach the steady-state distribution as their
lengths increase. In particular, for EDFC, Fig. 3 shows
that when the random-walk length is larger than 500,
the decoding ratio stays stationary around 1.05, which
implies that EDFC achieves the same decoding
performance as the original centralized fountain codes.
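The effect of walk length on convergence can be illustrated on a toy graph. The Python sketch below computes the exact distribution of a simple random walk after a given number of hops and its total variation distance from the steady state. It uses a plain neighbor-uniform walk on a hypothetical six-node graph, not the transition rule of EDFC or ADFC, so it only illustrates the general trend that longer walks land closer to the steady-state distribution.

```python
def walk_distribution(adj, start, length):
    """Exact distribution of a simple random walk after `length` hops."""
    dist = [0.0] * len(adj)
    dist[start] = 1.0
    for _ in range(length):
        nxt = [0.0] * len(adj)
        for node, prob in enumerate(dist):
            for nb in adj[node]:
                nxt[nb] += prob / len(adj[node])  # move to a uniform neighbor
        dist = nxt
    return dist

def stationary_distribution(adj):
    """Steady state of the simple walk: proportional to node degree."""
    total = sum(len(nbrs) for nbrs in adj)
    return [len(nbrs) / total for nbrs in adj]

def total_variation(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical topology: a 6-node ring plus the chord 0-2 (connected, non-bipartite).
ADJ = [[1, 2, 5], [0, 2], [0, 1, 3], [2, 4], [3, 5], [0, 4]]
PI = stationary_distribution(ADJ)

gaps = {}
for hops in (1, 10, 50, 200):
    gaps[hops] = total_variation(walk_distribution(ADJ, 3, hops), PI)
    print(f"length {hops:3d}: distance to steady state = {gaps[hops]:.3e}")
```

The distance shrinks as the walk gets longer and then stays near zero, mirroring the way the decoding ratio flattens once the walk length passes a threshold.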
For ADFC, Fig. 3 shows that when the random-walk
length is larger than 50, the decoding ratio stays
around 1.6, higher than what is achievable by EDFC.
This is because the actual degree distribution of ADFC
is slightly different from the Robust Soliton
distribution. However, ADFC requires much lower cost
in data dissemination as we will show later. For the
two-way algorithm, the length of random walks shown
in Fig. 3 is the one-way length. The decoding ratio of
the two-way algorithm is lower than that of EDFC and
ADFC when the random-walk length is smaller than 500.
This is because the uniform steady-state distribution
in the two-way algorithm is easier to reach than the
biased steady-state distribution of EDFC and ADFC
[13],[16]. Yet, this does not imply that the two-way
algorithm has lower transmission cost than EDFC and
ADFC for a given decoding ratio, since its additional
random walk overhead due to selecting a node that is
not a sensing node, or a duplicate of previous
selections, is not reflected in the length of random
walks shown in the figure.
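For reference, the Robust Soliton distribution mentioned above can be computed as in the following Python sketch, which follows the standard definition from the LT codes literature [13]. The parameter values c = 0.1 and delta = 0.5 and the rounding of the spike position K/R are assumptions chosen for illustration; the paper does not state the values used in its experiments.

```python
import math

def robust_soliton(K, c=0.1, delta=0.5):
    """Return mu[0..K], where mu[d] is the probability of degree d.

    c and delta are illustrative defaults, not values taken from the paper.
    """
    R = c * math.log(K / delta) * math.sqrt(K)
    spike = max(1, min(K, int(round(K / R))))  # degree carrying the extra mass

    # Ideal Soliton component rho(d).
    rho = [0.0] * (K + 1)
    rho[1] = 1.0 / K
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Robustness component tau(d).
    tau = [0.0] * (K + 1)
    for d in range(1, spike):
        tau[d] = R / (d * K)
    tau[spike] = R * math.log(R / delta) / K

    beta = sum(rho) + sum(tau)  # normalization constant
    return [(rho[d] + tau[d]) / beta for d in range(K + 1)]

mu = robust_soliton(1000)
most_likely = max(range(len(mu)), key=lambda d: mu[d])
print(f"sum = {sum(mu):.6f}, most likely degree = {most_likely}")
```

A decentralized scheme whose effective degree distribution deviates from this target, as the text reports for ADFC, can be expected to need more encoded blocks before decoding completes.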
Next, we compare the communication costs of
these three algorithms. For EDFC and the two-way
algorithm, we record their minimal transmission costs
when their decoding ratios are similar to centralized
fountain codes. For ADFC, we record its minimal
transmission cost when its decoding ratio has stabilized
at some value (e.g., 1.6 in Fig. 3). In addition, for the
two-way algorithm, the transmission cost of control
messages in selecting random nodes is considered the
same as the cost in data transmission. Fig. 4 shows that
the dissemination cost of ADFC is much lower than
EDFC and the two-way algorithm [16].
This is due to two reasons. First, the required
length of random walks of ADFC is shorter than both
EDFC and the two-way algorithm as shown in Fig. 3.
Second, for ADFC, the number of random walks from
each sensing node is significantly smaller than that of
EDFC and the two-way algorithm. For example, in the
simulation setting of Fig. 3, we observe that this number
is 6 for ADFC and 50 for EDFC.
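A back-of-the-envelope model shows how these two factors combine. The Python sketch below assumes that each hop of a random walk costs one transmission, so the dissemination cost is the product of the number of sensing nodes, the number of walks per node, and the walk length. The walk lengths of 500 (EDFC) and 50 (ADFC) are the thresholds read from the discussion of Fig. 3; treating them as the operating points, and the function name itself, are assumptions for illustration rather than the measured costs plotted in the figures.

```python
def dissemination_cost(num_sensing_nodes, walks_per_node, walk_length):
    """Total hop count, assuming one transmission per hop of every walk."""
    return num_sensing_nodes * walks_per_node * walk_length

K = 10000  # sensing nodes in the default setting

edfc_cost = dissemination_cost(K, walks_per_node=50, walk_length=500)
adfc_cost = dissemination_cost(K, walks_per_node=6, walk_length=50)

print(f"EDFC: {edfc_cost} transmissions")
print(f"ADFC: {adfc_cost} transmissions")
print(f"ADFC / EDFC = {adfc_cost / edfc_cost:.3f}")
```

Under this simple model ADFC needs about 1.2% of the transmissions of EDFC, which is consistent in direction with the large gap in dissemination cost reported above.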
Figure 9. The ratio of dissemination cost of EDFC
and ADFC to that of two-way algorithm
Furthermore, Fig. 8 shows that the cost ratio of
EDFC to the two-way algorithm starts from 0.2 and
increases slowly to 0.8 as the number of source blocks
increases from 1000 to 20000. This cost ratio increases
sub-linearly with the number of source blocks since, as
shown in Theorem 1, the redundancy coefficient for
EDFC grows sub-linearly with the number of source
blocks. Nevertheless, Fig. 4 shows that EDFC
significantly outperforms the two-way algorithm for
practical sensor networks with fewer than 20000 nodes.
We also emphasize that, as explained in Section III-A,
EDFC is easier to implement and avoids the many
disadvantages of the two-way algorithm. Finally, both
figures suggest that EDFC is more robust than ADFC to
overestimation of N or K.
5. Conclusion
Wireless ad-hoc networks may need to operate in the
absence of sinks in remote geographical zones. In this
paper, we seek to improve the fault tolerance and
persistence of data in sensor networks by proposing a
decentralized implementation of fountain codes, which
efficiently disseminates original data throughout the
network with SPAWN. We are attracted by the superior
decoding performance and low decoding complexity of
fountain codes as the number of nodes in the sensor
network scales up, especially when compared to
alternative coding techniques such as random linear
codes and ADFC. We have shown, in simulation results
as well as in actual experiments, that the EDFC
algorithm is able to provide near-optimal fault tolerance
with minimal demand on local storage.
References
[1] Zhenzhou Tang, Hongyu Wang, Qian Hu, and
Long Hai, “How Network Coding Benefits Converge-Cast
in Wireless Sensor Networks,” in Proc. IEEE
Vehicular Technology Conference (VTC 2012 Fall),
2012.
[2] M. Gorlatova, A. Wallwater, and G. Zussman,
“Networking Low-Power Energy Harvesting Devices:
Measurements and Algorithms,” in IEEE INFOCOM,
Apr. 2011.
[3] R. Huang, W.-Z. Song, M. Xu, N. Peterson, B.
Shirazi, and R. LaHusen, “Real-World Sensor Network
for Long-Term Volcano Monitoring: Design and
Findings,” IEEE Transactions on Parallel and
Distributed Systems, 2011.
[4] S. A. Aly, H. Darwish, M. Youssef, and M.
Zidan, “Distributed Flooding based storage algorithms
for large-scale wireless sensor networks,” in Proc.
IEEE International Conference on Communication,
Dresden, Germany, June 13-17, 2009.
[5] Y. Gu, T. Zhu, and T. He, “ESC: Energy
Synchronized Communication in Sustainable Sensor
Networks,” in The 17th International Conference on
Network Protocols, Oct. 2009.
[6] Philip A. Chou and Yunnan Wu, “Network
Coding for the Internet and Wireless Networks,”
Microsoft Research, One Microsoft Way, Redmond,
WA, 98052, June 2007.
[7] A. G. Dimakis, V. Prabhakaran, and K.
Ramchandran, “Decentralized Erasure Codes for
Distributed Networked Storage,” IEEE Transactions on
Information Theory, vol. 52, no. 6, pp. 2809–2816,
June 2006.
[8] A. G. Dimakis, V. Prabhakaran, and K.
Ramchandran, “Distributed Fountain Codes for
Networked Storage,” in Proc. of IEEE International
Conference on Acoustics, Speech, and Signal
Processing (ICASSP), 2006.
[9] A. Kamra, J. Feldman, V. Misra, and D.
Rubenstein, “Growth Codes: Maximizing Sensor
Network Data Persistence,” in Proc. of ACM
SIGCOMM, 2006.
[10] D. Wang, Q. Zhang, and J. Liu, “Partial
Network Coding for Continuous Data Collection in
Sensor Networks,” in Proc. of the Fourteenth IEEE
International Workshop on Quality of Service
(IWQoS), 2006.
[11] M. Rabbat, J. Haupt, A. Singh, and R. Nowak,
“Decentralized Compression and Predistribution via
Randomized Gossiping,” in Proc. of the Fifth
International Symposium on Information Processing in
Sensor Networks (IPSN), 2006.
[12] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam,
and E. Cayirci, “A survey on sensor networks,” IEEE
Communications Magazine, August 2002.
[13] M. Luby, “LT Codes,” in Proc. of the 43rd
IEEE Symposium on Foundations of Computer Science
(FOCS), 2002.
[14] J. Kubiatowicz, D. Bindel, Y. Chen, S.
Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea,
H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao,
“OceanStore: An Architecture for Global-Scale
Persistent Storage,” in Proc. of ACM Architectural
Support for Programming Languages and Operating
Systems (ASPLOS), 2000.
[15] B. Karp and H. T. Kung, “GPSR: Greedy
Perimeter Stateless Routing for Wireless Networks,” in
Proc. of the Sixth Annual ACM/IEEE International
Conference on Mobile Computing and Networking
(MobiCom), 2000.
[16] Z. J. Haas and B. Liang, “Ad Hoc Mobility
Management with Uniform Quorum Systems,”
IEEE/ACM Transactions on Networking, vol. 7, no. 2,
pp. 228–240, April 1999.
[17] P. M. Chen, E. K. Lee, G. A. Gibson, R. H.
Katz, and D. A. Patterson, “RAID: High-Performance,
Reliable Secondary Storage,” ACM Computing
Surveys (CSUR), vol. 26, no. 2, pp. 145–185, June
1994.
[18] T. Padmapriya and V. Saminadan, “Performance
Improvement in long term Evolution-advanced network
using multiple input multiple output technique,”
Journal of Advanced Research in Dynamical and
Control Systems, vol. 9, Sp-6, pp. 990–1010, 2017.
[19] S. V. Manikanthan and K. Srividhya, “An
Android based secure access control using ARM and
cloud computing,” in Electronics and
Communication Systems (ICECS), 2015 2nd
International Conference on, 26-27 Feb. 2015,
IEEE, DOI: 10.1109/ECS.2015.7124833.