1 data dissemination peyman teymoori. 2 introduction data dissemination: the process by which...
DESCRIPTION
3 Routing Models Address Centric Each source independently send data to sink Data Centric Routing nodes en-route look at data sent Source 2 Source 1 Sink BA Source 2 Source 1 Sink BA AC Routing DC RoutingTRANSCRIPT
1
Data Dissemination
Peyman Teymoori
2
Introduction
Data dissemination: the process by which queries or data routed in the network
Source: the node generating data Event: the information to be reported Sink: the node interested in an event Two general steps:
Interest propagation: temperature, intrusion Data propagation: routing, aggregation
3
Routing Models
Address Centric Each source independently send data to sink
Data Centric Routing nodes en-route look at data sent
Source 2
Source 1
Sink
BA
Source 2
Source 1
Sink
BA
AC Routing
DC Routing
4
Differences with Current Networks Difficult to pay special attention to any individual
node: Collecting information within the specified region Collaboration between neighbors
Sensors may be inaccessible: embedded in physical structures. thrown into inhospitable terrain.
Sensor networks deployed in very large ad hoc manner No static infrastructure
5
Differences with Current Networks They will suffer substantial changes as nodes fail:
battery exhaustion accidents new nodes are added.
User and environmental demands also contribute to dynamics: Nodes move Objects move
Data-centric and application-centric Location aware Time aware
6
One possible solution? Internet technology coupled with ad-hoc routing
mechanism Each node has one IP address Each node can run applications and services Nodes establish an ad-hoc network amongst
themselves when deployed Application instances running on each node can
communicate with each other
Overall Design of Sensor Networks
7
Why Different and Difficult?
A sensor node is not an identity (address) Content based and data centric
Where are nodes whose temperatures will exceed more than 10 degrees for next 10 minutes?
Tell me the location of the object ( with interest specification) every 100ms for 2 minutes.
Multiple sensors collaborate to achieve one goal. Intermediate nodes can perform data aggregation
and caching in addition to routing.where, when, how?
8
Energy-limited nodes Computation
Aggregate data Suppress redundant routing information
Communication Bandwidth-limited Energy-intensive
Scalability: ad-hoc deployment in large scale Robustness: unexpected sensor node failures Dynamic changes: no a-priori knowledge, mobility
Goal: Minimize energy dissipation
Challenges
9
Aggregation Many studies addressing not only the routing problem
but also representing and combining data more efficiently
Process of data while being forwarded toward the sink
Reducing the number of transmissions Definition:
Gathering & routing information in a multihop network Processing data at intermediate nodes to
Reduce resource consumption Increase network lifetime
10
Aggregation Two approaches:
In-network with size reduction More ability to reduce traffic Less accuracy Difficulty in reconstructing the original data
In-network without size reduction Merging some smaller packets into one
Requires: Networking protocol: routing Effective aggregation functions Data representation
11
Aggregation A different routing paradigm is required Data-centric routing: nodes route packets based on
packet content Taking into account:
The most suitable aggregation points Data type Priority of information
Timing strategies: Periodic simple aggregation Periodic per-hop aggregation Periodic per-hop adjusted aggregation
depends on the node’s position in the gathering tree
12
Aggregation
Aggregation functions: Lossy & lossless Duplicate sensitive vs. duplicate insensitive
AVG vs. MAX Data representation:
limited storage capabilities vary according to the application requirements distributed source coding: A method to deal
with data representation and compression
13
Network Protocols for In-Network Aggregation How to forward packets in order to facilitate
in-network aggregation Several approaches:
Tree-based (Hierarchical): SPTs rooted at sink, or nodes grouped into clusters
Multi-path routing: DAG, more robust Hybrid approaches
14
Flooding
Just broadcast what you receive and is not yours (consider the max hop count)
Disadvantages: Implosion
A
B C
D
(a)
(a)
(a)
(a)
A B
C (r,s)(q,r)
q sr
Data overlap
Resource blindness
15
Gossiping
A modified version of flooding Random selection of neighbors for broadcast Avoids implosion Disadvantages:
Taking a long time to propagate No delivering guarantee
16
Rumor Routing
An agent-based path creation algorithm Agents or “ants”:
long-lived entities created at random by nodes packets circulated to establish shortest paths
to events they encounter perform path optimization
Motivation Sometimes a non-optimal route is satisfactory
17
Rumor Routing Creating Paths:
Nodes having observed an event send out agents which leave routing info to the event as state in nodes
Agents attempt to travel in a straight line
If an agent crosses a path to another event, it begins to build the path to both
Agent also optimizes paths if they find shorter ones.
18
Rumor Routing Basis for algorithm:
Observation: Two lines in a bounded rectangle have a 69% chance of intersecting
Create a set of straight line gradients from event, then send query along a random straight line from source.
Event
Source
19
Rumor Routing
(a) (b)
20
Rumor Routing
Advantages: Tunable best effort delivery Tunable for a range of query/event ratios
Disadvantages: Optimal parameters depend heavily on
topology (but can be adaptively tuned) Does not guarantee delivery
21
Sensor Protocols for Information via Negotiation (SPIN) A data-centric routing approach Uses negotiation & resource adaptation to
address deficiencies of flooding Two basic ideas:
Exchanging sensor data may be expensive, but exchanging data about sensor data may not be.
Nodes need to monitor and adapt to changes in their own energy resources
22
Sensor Protocols for Information via Negotiation (SPIN) Data negotiation
Meta-data (data naming) Application-level control Model “ideal” data paths
SPIN messages ADV- advertise data REQ- request specific data DATA- requested data
Resource management
A B
A B
A B
ADV
REQ
DATA
23
Sensor Protocols for Information via Negotiation (SPIN)
B
AADVREQDATA
ADV
AD
VADV
ADV
AD
V ADV
REQ
REQ
REQ
REQ
REQ
DATADA
TA
DATA
DA
TA
DATA
24
Cost-Field Approach
Sets up minimum cost paths to a sink A two-phase process:
Set up the cost field (metrics such as delay) Data dissemination using the cost
At each node, cost = min cost to the sink No explicit path information
25
Cost-Field Approach Setting up the cost field:
Sink broadcasts: ADV + its cost as 0 A node N hears an ADV from M:
Its path Ln : total cost from node N to sink Lm : the cost of node M to sink Cnm : the cost from node N to M
If Ln was updated, the new cost is broadcasted (new ADV)
Flooding-based implementation of Dijkstra’s algorithm A back-off-based approach, Time to defer = γ * Cmn
γ is a parameter of the algorithm
min( , )N M NMcost L L C
26
Cost-Field Approach
An example of setting up the cost field γ = 10
27
Cost-Field Approach
Data dissemination: A source sends a message
cost = Cs cost-so-far = 0
In an intermediate node M (with cost = Cm): If cost-so-far + Cm = Cs then
Forward the packet
28
Directed Diffusion
A reactive data-centric protocol Suitable for monitoring Expressed in terms of named data Organized in three phases:
Interest dissemination Gradient setup Path reinforcement & forwarding
29
Directed Diffusion
Naming A list of attribute – value pairs Animal tracking:
Interest ( Task ) DescriptionType = four-legged animalInterval = 20 msDuration = 1 minuteLocation = [-100, -100; 200, 400]
RequestRequestNode dataType =four-legged animalInstance = elephantLocation = [125, 220]Confidence = 0.85Time = 02:10:35
ReplyReply
30
Directed Diffusion Interest
The sink periodically broadcasts interest messages
Every node maintains an interest cache Each item corresponds to a distinct interest No information about the sink Interest aggregation : identical type, completely
overlap rectangle attributes Each entry in the cache has several fields
Timestamp: last received matching interest Several gradients: data rate, duration, direction
31
Directed Diffusion Setting Up Gradient
Source
Sink
Interest = InterrogationGradient = Who is interested(data rate , duration, direction)
Neighbor’s choices :1. Flooding 2. Geographic routing3. Cache data to direct interests
32
Directed Diffusion Data propagation
Sensor node computes the highest requested event rate among all its outgoing gradients
When a node receives a data: Find a matching interest entry in its cache
Examine the gradient list, send out data by rate Cache keeps track of recent seen data items
(loop prevention) Data message is sent individually to the
relevant neighbors (unicast)
33
Directed Diffusion
Reinforcing the best path
Source
Sink
Low rate event Reinforcement = Increased interest
The neighbor reinforces a path:1. At least one neighbor2. Choose the one from whom it first received the latest event (low delay)3. Choose all neighbors from which new events were recently received
34
Directed Diffusion The reinforced path must be periodically
refreshed A trade off based on network dynamics:
Frequency of gradient setup Achieved performance
MAC layer issues: Keeping local (control) traffic at a low level
Avoid collision, delay Enhanced Directed Diffusion
Joint of Directed Diffusion & cluster-based arch.
35
Tiny AGgregation (TAG)
A tree-based data-centric approach Timing: periodic per hop adjusted Two main phases:
Distribution: disseminating queries Collection: aggregating & routing readings
Declarative interface for data collection and aggregation – SQL style
36
Tiny AGgregation (TAG) A sample query:
SELECT: an expression over one or more aggregation values expr: the name of a single attribute agg: aggregation function attrs: the attributes by which the sensor readings are partitioned WHERE, HAVING: filters out irrelevant readings GROUP BY: specifies an attribute based partitioning of readings EPOCH DURATION: time interval of aggr record computation
SELECT {agg(expr), attrs} from SENSOR WHERE {selPreds} GROUP BY {attrs} HAVING {havingPreds} EPOCH DURATION i
37
Tiny AGgregation (TAG)
Distribution: The sink broadcasts an organizing message
Message contains level & distance from the root Node receiving the message:
If it doesn’t belongs to any level: Its level = message.level + 1 Its parent = the sender node Rebroadcasts the message adding its own ID & level
Broadcasting the query along the structure
38
Tiny AGgregation (TAG)
Collection: Each parent waits for data Then sends its aggregation up the tree Epochs are divided slots equals to the max
depth of the tree – Sleep & Wake up Every epoch, new aggregate produced Most of the times, motes are idle and in low
power state
39
Tiny AGgregation (TAG)
1 2 3 4 5
4 1
3
2
1
4
1
2 3
4
51
Sensor #
<- T
ime
SELECT COUNT(*) FROM sensors
40
Tiny AGgregation (TAG)
1 2 3 4 5
4 1
3 2
2
1
4
1
2 3
4
5
2
Sensor #
SELECT COUNT(*) FROM sensors
<- T
ime
41
Tiny AGgregation (TAG)
1 2 3 4 5
4 1
3 2
2 1 3
1
4
1
2 3
4
5
31
Sensor #
SELECT COUNT(*) FROM sensors
<- T
ime
42
Tiny AGgregation (TAG)
1 2 3 4 5
4 1
3 2
2 1 3
1 5
4
1
2 3
4
5
5Sensor #
SELECT COUNT(*) FROM sensors
<- T
ime
43
Tiny AGgregation (TAG)
1 2 3 4 5
4 1
3 2
2 1 3
1 5
4 1
1
2 3
4
51
Sensor #
SELECT COUNT(*) FROM sensors
<- T
ime
44
Tiny AGgregation (TAG)
A grouping example
45
Synopsis Diffusion Synopsis Diffusion : In-network aggregation Synopsis Functions
Synopsis Generation: s=SG(r) Synopsis Fusion: s=SF(s1,s2) Synopsis Evaluation: r*=SE(s)
Efficient Topology for Synopsis Diffusion Rings (R0 R1 … Ri-1 Ri )
Duplicate Sensitive Aggregates Mapping DS Aggregates Order- and duplicate-
insensitive synopsis
46
Synopsis Diffusion
Aggregate : A metric of aggregationClass of Aggregates
Not T
AGCo
unt M
IN Hist
ogra
mAv
erag
e
Coun
t Dist
inct
Med
ian
Sink
47
Synopsis Diffusion
[ ]
48
Synopsis Diffusion Phases of SD:
Distribution Phase Aggregate query is flooded through the network Network node form a set of rings
Aggregation Phase Each node uses SG to convert local data to local
synopsis and then uses SF to merge two synopsis to create a new one. The query initiator uses the SE to generate the final result.
Adapting the Topology Ring Topology Adaptive Ring Topology Nodes moves up or down in the rings dependent upon
the messages it overhears.
49
Delay Bounded Medium Access Control (DB-MAC) A tree-based aggregation scheme A joint design of routing & MAC protocols Minimizes the latency for delay bounded
apps. Takes advantage of data aggregation Adopts CSMA/CA scheme based on
RTS/CTS/DATA/ACK handshake Suitable for cases where:
Different sources sense an event There is delay constraint
50
Delay Bounded Medium Access Control (DB-MAC) A message exchange example in DB-MAC
51
Delay Bounded Medium Access Control (DB-MAC) Dynamically aggregates data while forwarding toward
the sink Aggregation tree is built on the fly No knowledge of the network topology
RTS/CTS are exploited to do aggregation Back-off intervals respect priorities By overhearing CTSs, the relay node is chosen Advantages:
Flexible & distributed construction of the tree Suitable for dynamic topologies Energy-efficient due to cross-layer design
52
Tributaries and Deltas
A hybrid approach: tree-based & multipath Idea
In low packet loss rate regions, use tree (T) In high loss rate regions, use multipath (M)
How to link the regions: M nodes only send data to M nodes M nodes subgraph includes the sink
Thresholds for M node percentage
53
Tributaries and Deltas