in network processing
DESCRIPTION
In Network Processing. When processing is cheaper than transmitting. Daniel V Uhlig, Maryam Rahmaniheris. Basic Problem: How to gather interesting data from thousands of Motes? Tens to thousands of motes, unreliable individually. To collect and analyze data.
![Page 1: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/1.jpg)
1
In Network Processing
When processing is cheaper than transmitting
Daniel V Uhlig Maryam Rahmaniheris
![Page 2: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/2.jpg)
2
Basic Problem
How to gather interesting data from thousands of Motes?
• Tens to thousands of motes
• Unreliable individually
To collect and analyze data
• Long term low energy deployment
• Can use the processing power at each Mote
Analyze locally before sharing data
![Page 3: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/3.jpg)
3
Costs
Transmission of data is expensive compared to CPU cycles
• 1Kb transmitted 100 meters = 3 million CPU instructions
• AA power Mote can transmit 1 message per day for about two months (assuming no other power draws)
• Power density is growing very slowly compared to computation power, storage, etc.
Analyze and process locally, only transmitting what is required
![Page 4: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/4.jpg)
4
Framework of Problem
Minimize communications
◦ Minimize broadcast/receive time
◦ Minimize message size
◦ Move computations to individual nodes
Nodes pass data in multi-hop fashion towards a root
Select connectivity so graph helps with processing
Handle faulty nodes within network
![Page 5: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/5.jpg)
5
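Example of Problem (MAX)
[Figure: routing tree of nodes with local readings A: 7, 1, 6; B: 4, 7, 6; C: 4, 6; D: 3, 4, 6; E: 3, 5, 1; F: 2, 7, 5, 10. Each edge is labelled with the single partial maximum forwarded along it (5, 6, 7 or 10), and the value 10 reaches the root.]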
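The MAX example can be sketched in a few lines of Python. The readings are the ones on the slide; the tree shape (which node is whose child) is an assumption made for illustration, since the slide's edges are not recoverable from the transcript.

```python
# Local readings per node, taken from the slide.
readings = {
    "A": [7, 1, 6],
    "B": [4, 7, 6],
    "C": [4, 6],
    "D": [3, 4, 6],
    "E": [3, 5, 1],
    "F": [2, 7, 5, 10],
}

# Hypothetical routing tree (parent -> children); the slide's real edges may differ.
children = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}


def partial_max(node):
    """Return the single value this node forwards to its parent."""
    best = max(readings[node])
    for child in children[node]:
        best = max(best, partial_max(child))
    return best


def messages_sent(node):
    """Count messages below this node: every non-root node sends exactly one."""
    return sum(1 + messages_sent(child) for child in children[node])


root_max = partial_max("A")
in_network_msgs = messages_sent("A")
```

Under this assumed tree, each of the five non-root nodes sends one value no matter how many readings sit below it, which is the saving the slide is pointing at.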
![Page 6: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/6.jpg)
6
Complications
Max is very simple
What about Count?
◦ Need to avoid double counting due to redundant paths
What about spatial events?
◦ Need to evaluate readings across multiple sensors
Correlation between events
Failures of nodes can lose branches of the tree
![Page 7: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/7.jpg)
7
Design Decisions
• Connectivity Graph – unstructured or how to structure
• Diffusion of requests and how to combine data
• Maintenance messages vs Query messages
• Reliability of results
• Load balancing
– message traffic
– storage
• Storage costs at different nodes
![Page 8: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/8.jpg)
8
TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks
S.Madden, M.Franklin, J.Hellerstein, and W.Hong
Intel Research, 2002
![Page 9: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/9.jpg)
9
TAG
• Aggregates values in a low power, distributed network
• Implemented on TinyOS Motes
• SQL-like language to search for values or sets of values
– Simple declarative language
• Energy savings
• Tree based methodology
– Root node generates requests and disseminates them down to the children
![Page 10: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/10.jpg)
10
TAG Functions
• Three functions to aggregate results
– f (merge function)
  • Each node runs f to combine values
  • <z> = f(<x>, <y>)
  • EX: f(<SUM1, COUNT1>, <SUM2, COUNT2>) = <SUM1+SUM2, COUNT1+COUNT2>
– i (initialize function)
  • Generates state record at lowest level of tree
  • EX: <SUM, COUNT>
– e (evaluator function)
  • Root uses e to generate the final result
  • RESULT = e(<z>)
  • EX: SUM/COUNT
• Functions must be preloaded on Motes or distributed via software protocols
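As a minimal sketch of the three functions for the Average example, the Python below uses the hypothetical names `init`, `merge` and `evaluate` for i, f and e. The readings are made up, and the fold over a flat list stands in for the merging that would really happen hop by hop up the tree.

```python
from functools import reduce


def init(reading):
    """i: turn one sensor reading into a <SUM, COUNT> state record."""
    return (reading, 1)


def merge(x, y):
    """f: combine two partial state records into one."""
    return (x[0] + y[0], x[1] + y[1])


def evaluate(z):
    """e: run at the root to turn the final state record into the answer."""
    total, count = z
    return total / count


readings = [4, 6, 2, 8]  # illustrative values, not from the slides
state = reduce(merge, (init(r) for r in readings))
average = evaluate(state)
```

The state record stays two numbers wide at every level of the tree, which is what makes Average cheap to aggregate in the network.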
![Page 11: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/11.jpg)
11
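TAG
[Figure: Count computed via the aggregation tree. Nodes forward partial counts toward the root, which obtains Count = 10. A second caption reads "Max via tree".]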
![Page 12: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/12.jpg)
12
TAG Taxonomy
All searches have different properties that affect aggregate performance
• Duplicate insensitive – unaffected by double counting (Max, Min), as opposed to duplicate sensitive (Count, Average)
– Duplicate sensitivity restricts network properties
• Exemplary – return one value (Max/Min)
– Sensitive to failure
• Summary – computation over values (Average)
– Less sensitive to failure
![Page 13: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/13.jpg)
13
TAG Taxonomy
• Distributive – Partial states are the same as final state (Max)
• Algebraic – Partial states are of fixed size but differ from final state (Average - Sum, Count)
• Holistic – Partial states contain all sub-records (median)
– Unique – similar to Holistic, but partial records may be smaller than holistic
• Content Sensitive – Size of partial records depends on content (Count Distinct)
![Page 14: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/14.jpg)
14
TAG
Diffusion of requests and then collection of information
Epochs subdivided for each level to complete task
◦ Saves energy
◦ Limits rate of data flow
![Page 15: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/15.jpg)
15
TAG Optimizations
Snooping – Broadcast messages so others can hear messages
◦ Rejoin tree if parents have failure
◦ Listen to other broadcasts and only broadcast if its values are needed
  In case of MAX, do not broadcast if peer has transmitted a higher value
Hypothesis testing – root guesses at value to minimize traffic
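The snooping rule for MAX can be sketched as below. The function names are hypothetical, and the simulation assumes a single broadcast domain in which every node overhears all earlier transmissions in the epoch, which is a simplification of a real radio neighbourhood.

```python
def should_transmit(own_value, overheard_values):
    """Stay silent if any overheard peer already reported a strictly higher value."""
    return not any(v > own_value for v in overheard_values)


def snooping_round(values):
    """Simulate one epoch; nodes take turns in list order and hear earlier broadcasts."""
    heard = []
    for v in values:
        if should_transmit(v, heard):
            heard.append(v)  # this node broadcasts, later nodes overhear it
    return heard


sent = snooping_round([5, 3, 9, 7, 9])
```

Here only three of the five nodes transmit, yet the maximum still reaches the parent. A tie is still broadcast because the slide's rule suppresses only on a higher value.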
![Page 16: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/16.jpg)
16
TAG - Results
Theoretical results for
◦ 2500 nodes
Savings depend on function
Duplicate insensitive, summary best
◦ Distributive helps
Holistic is the worst
![Page 17: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/17.jpg)
17
TAG Real World Results
• 16 Mote network
• Count number of motes in 4 sec epochs
• No optimizations
• Quality of count is due to less radio contention in TAG
• Centralized used 4685 messages vs TAG's 2330
• 50% reduction, but less than theoretical results
– Different loss model, node placement
![Page 18: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/18.jpg)
18
Advantages/Disadvantages
• Loss of nodes and subtrees
– Maintenance for structured connectivity
• Single message per node per epoch
– Message size might increase at higher level nodes
– Root gets overloaded (Does it always matter?)
• Epochs give a method for idling nodes
– Snooping not included, timing issues
![Page 19: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/19.jpg)
20
Synopsis Diffusion for Robust Aggregation in Sensor Networks
S.Nath, P.Gibbons, S.Seshan, Z.Anderson
Microsoft Research, 2008
![Page 20: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/20.jpg)
21
Motivation
TAG
◦ Not robust against node or link failure
◦ A single node failure leads to loss of the entire sub-branch's data
Synopsis Diffusion
◦ Exploits the broadcast nature of the wireless medium to enhance reliability
◦ Separates routing from aggregation
◦ The final aggregated data at the sink is independent of the underlying routing topology
◦ Synopsis diffusion can be used on top of any routing structure
◦ The order of evaluations and the number of times each datum is included in the result are irrelevant
![Page 21: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/21.jpg)
TAG
Not robust against node or link failure
22
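[Figure: the Count aggregation tree from the earlier TAG slide, now with a failure in the tree; the true answer is Count = 10, and the figure also shows the values 10 and 3 at the root.]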
![Page 22: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/22.jpg)
23
Synopsis Diffusion
Multi-path routing
◦ Benefits
  Robust
  Energy-efficient
◦ Challenges
  Duplicate sensitivity
  Order sensitivity
[Figure: Count computed over a multi-path topology, illustrating duplicate sensitivity.]
![Page 23: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/23.jpg)
24
Contributions
A novel aggregation framework
◦ ODI synopsis: small-sized digest of the partial results
  Bit-vectors
  Sample
  Histogram
Better aggregation topologies
◦ Multi-path routing
◦ Implicit acknowledgment
◦ Adaptive rings
Example aggregates
Performance evaluation
![Page 24: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/24.jpg)
25
Aggregation
The exact definition of these functions depends on the particular aggregation function:
◦ SG(.)
  Takes a sensor reading and generates a synopsis
◦ SF(.,.)
  Takes two synopses and generates a new one
◦ SE(.)
  Translates a synopsis into the final answer
SG: Synopsis Generation
SF: Synopsis Fusion
SE: Synopsis Evaluation
![Page 25: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/25.jpg)
26
Synopsis Diffusion Algorithm
Distribution phase
◦ The aggregate query is flooded
◦ The aggregate topology is constructed
Aggregation phase
◦ Aggregated values are routed toward Sink
◦ SG() and SF() functions are used to create partial results
![Page 26: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/26.jpg)
27
Ring Topology
The sink is in R0
A node is in Ri if it is i hops away from the sink
Nodes in Ri-1 should hear the broadcast by nodes in Ri
Loose synchronization between nodes in different rings
Each node transmits only once
◦ Energy cost same as tree
[Figure: concentric rings R0 to R3 around the sink, with example nodes A, B and C.]
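One way to picture the ring construction is a breadth-first search from the sink, sketched below. The connectivity graph and node names (`u`, `v`, `w`) are made up and are not the nodes in the slide's figure, and `assign_rings` and `parents` are hypothetical helper names.

```python
from collections import deque


def assign_rings(neighbors, sink):
    """Breadth-first search: a node's ring index is its hop count from the sink."""
    ring = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in ring:
                ring[nb] = ring[node] + 1
                queue.append(nb)
    return ring


def parents(neighbors, ring, node):
    """Neighbours one ring closer to the sink, i.e. those expected to hear this node."""
    return [nb for nb in neighbors[node] if ring[nb] == ring[node] - 1]


# Hypothetical symmetric connectivity graph.
neighbors = {
    "sink": ["u", "v"],
    "u": ["sink", "v", "w"],
    "v": ["sink", "u", "w"],
    "w": ["u", "v"],
}
ring = assign_rings(neighbors, "sink")
w_parents = parents(neighbors, ring, "w")
```

Node `w` broadcasts once and both `u` and `v` can fold its synopsis into their own, so losing one of the two links does not lose `w`'s reading.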
![Page 27: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/27.jpg)
28
Example: Count
Coin tossing experiment CT(x) used in Flajolet and Martin's Algorithm:
◦ For i = 1, …, x-1: CT(x) = i with probability $2^{-i}$
◦ Simulates the behavior of the exponential hash function
◦ Synopsis: a bit vector of length k > log(n)
  n is an upper bound on the number of sensor nodes in the network
◦ SG(): a bit vector of length k with only the CT(k)th bit set
◦ SF(): bitwise Boolean OR
◦ SE(): if i is the index of the lowest-order 0 in the bit vector, output $2^{i-1}/0.77$ (0.77 is the "magic constant")
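The Count synopsis can be sketched as follows. Bit vectors are stored as Python integers, bit indices are 1-based as on the slide, and the constant is written as 0.77351, the usual Flajolet-Martin value that the slide rounds to 0.77. The node count, vector length and seed are arbitrary choices for the illustration.

```python
import random

MAGIC = 0.77351  # Flajolet-Martin correction factor (0.77 on the slide)


def ct(x, rng):
    """Coin toss: return i with probability 2**-i for i < x, otherwise x."""
    i = 1
    while i < x and rng.random() < 0.5:
        i += 1
    return i


def sg(k, rng):
    """Synopsis generation: k-bit vector (as an int) with only bit CT(k) set."""
    return 1 << (ct(k, rng) - 1)


def sf(s1, s2):
    """Synopsis fusion: bitwise OR."""
    return s1 | s2


def se(s):
    """Synopsis evaluation: find the 1-based index i of the lowest-order 0 bit."""
    i = 1
    while s & 1:
        s >>= 1
        i += 1
    return 2 ** (i - 1) / MAGIC


rng = random.Random(0)
k = 16  # must exceed log2 of the node-count upper bound
synopses = [sg(k, rng) for _ in range(1000)]

fused = 0
for s in synopses:
    fused = sf(fused, s)
    fused = sf(fused, s)  # the same synopsis arriving over a second path changes nothing
estimate = se(fused)
```

A single bit vector only yields a coarse estimate in steps of powers of two; the point of the sketch is that OR-ing in duplicates leaves the synopsis unchanged.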
![Page 28: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/28.jpg)
29
Example: Count
The number of live sensor nodes, N, is proportional to $2^{i-1}$
[Figure: per-node bit-vector synopses are combined by bitwise OR along multiple paths; the last fused vector shown is 0 1 1 0 1 1. Annotations: "Count 1 bits", 4.]
Intuition: The probability of N nodes all failing to set the ith bit is $(1 - 2^{-i})^N$, which is approximately 0.37 when $N = 2^i$ and even smaller for larger N.
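The 0.37 in the intuition for Count is the familiar limit $1/e$. A short worked version, substituting $N = 2^i$ and, as an added generalization not on the slide, $N = c \cdot 2^i$:

```latex
\[
\Pr[\text{bit } i \text{ stays } 0]
  = \left(1 - 2^{-i}\right)^{N}
  \;\overset{N = 2^{i}}{=}\; \left(1 - \frac{1}{N}\right)^{N}
  \;\longrightarrow\; e^{-1} \approx 0.37 ,
\qquad
\left(1 - 2^{-i}\right)^{c \cdot 2^{i}} \approx e^{-c}.
\]
```

So a bit whose index is well below $\log_2 N$ is almost certainly set, which is why the position of the lowest unset bit tracks $\log_2 N$.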
![Page 29: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/29.jpg)
30
ODI-Correctness
Aggregation DAG Canonical left-deep tree
For any aggregation DAG, the resulting synopsis is identical to the synopsis produced by the canonical left-deep tree
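[Figure: left, an aggregation DAG over readings r1 to r5; right, the canonical left-deep tree over the same readings. In both, SG is applied to each reading and SF combines the synopses, producing the same final synopsis s.]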
![Page 30: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/30.jpg)
31
A Simple Test for ODI-Correctness
Theorem: Properties P1-P4 are necessary and sufficient properties for ODI-Correctness
◦ P1: SG() preserves duplicates
  If two readings are considered duplicates then the same synopsis is generated
◦ P2: SF() is commutative
  SF(s1, s2) = SF(s2, s1)
◦ P3: SF() is associative
  SF(s1, SF(s2, s3)) = SF(SF(s1, s2), s3)
◦ P4: SF() is same-synopsis idempotent
  SF(s, s) = s
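Properties P2 to P4 are easy to spot-check on sample synopses, as in the sketch below. The helper name `passes_p2_p4` and the sample values are made up. A check over samples can only refute a candidate fusion function, not prove it correct, and P1 concerns SG() so it is not covered here.

```python
from itertools import product


def passes_p2_p4(sf, samples):
    """Spot-check commutativity, associativity and idempotence on sample synopses."""
    for a, b, c in product(samples, repeat=3):
        if sf(a, b) != sf(b, a):
            return False  # P2 violated
        if sf(a, sf(b, c)) != sf(sf(a, b), c):
            return False  # P3 violated
    return all(sf(s, s) == s for s in samples)  # P4


samples = [0b0001, 0b0010, 0b0110, 0b1000]
or_ok = passes_p2_p4(lambda a, b: a | b, samples)   # bit-vector fusion
max_ok = passes_p2_p4(max, samples)                 # MAX as its own synopsis
sum_ok = passes_p2_p4(lambda a, b: a + b, samples)  # plain addition
```

Bitwise OR and MAX pass, while plain addition fails the idempotence check, which is the double-counting problem that duplicate-sensitive aggregates have on multi-path topologies.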
![Page 31: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/31.jpg)
32
More Examples
Uniform Sample of Readings
◦ Synopsis: A sample of size K of <value, random number, sensor id> tuples
◦ SG(): Output the tuple <val_u, r_u, id_u>
◦ SF(s,s'): Output the K tuples in s∪s' with the K largest r_i
◦ SE(s): Output the set of values val_i in s
◦ Useful for holistic aggregation
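A minimal sketch of the uniform-sample synopsis follows, using frozensets of `(value, r, sensor_id)` tuples. The readings, K and the seed are illustrative, and breaking ties on the sensor id is an added assumption so that the fusion is deterministic.

```python
import random


def sg(value, sensor_id, rng):
    """Synopsis generation: a one-tuple sample <val, r, id> with a random r."""
    return frozenset({(value, rng.random(), sensor_id)})


def sf(s1, s2, k):
    """Synopsis fusion: keep the k tuples of the union with the largest r."""
    union = s1 | s2  # set union drops tuples that arrived over two paths
    top = sorted(union, key=lambda t: (t[1], t[2]), reverse=True)[:k]
    return frozenset(top)


def se(s):
    """Synopsis evaluation: the sampled values."""
    return sorted(t[0] for t in s)


rng = random.Random(1)
K = 3
leaves = [sg(v, i, rng) for i, v in enumerate([10, 20, 30, 40, 50])]

# Two overlapping paths toward the sink: leaves[2] is heard on both.
left = sf(sf(leaves[0], leaves[1], K), leaves[2], K)
right = sf(sf(leaves[2], leaves[3], K), leaves[4], K)
sample = sf(left, right, K)
values = se(sample)
```

Because each reading carries its own random number, the K survivors are the same whichever paths the tuples took, and the sink can estimate holistic aggregates such as the median from `values`.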
![Page 32: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/32.jpg)
33
More Examples
Frequent Items (items occurring at least T times)
◦ Synopsis: A set of <val, weight> pairs; the values are unique and the weights are at least log(T)
◦ SG(): Compute CT(k), where k > log(n), and call this the weight; if it is at least log(T), output <val, weight>
◦ SF(s,s'): For each distinct value, discard all but the pair <val, weight> with maximum weight. Output the remaining pairs.
◦ SE(s): Output <val, $2^{weight}$> for each <val, weight> pair in s as a frequent value and its approximate count
◦ Intuition: A value occurring at least T times is expected to have at least one of its calls to CT() return at least log(T) (p = 1/T)
![Page 33: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/33.jpg)
34
Error Bounds of Approximation
Communication error
◦ 1 − (percent contributing)
◦ h: height of DAG
◦ k: the number of neighbors each node has
◦ p: probability of loss
◦ The overall communication error upper bound: $1 - (1 - p^k)^h$
◦ If p=0.1, h=10 then the error is negligible with k=3
Approximation error
◦ Introduced by SG(), SF(), and SE() functions
◦ Theorem 2: any approximation error guarantee provided for the centralized data stream scenario immediately applies to a synopsis diffusion algorithm, as long as the data stream synopsis is ODI-correct.
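Plugging the slide's numbers into the communication error bound $1 - (1 - p^k)^h$ shows what "negligible" means here. The function name is hypothetical, and the k=1 comparison is added for contrast as a single-parent case.

```python
def communication_error_bound(p, k, h):
    """Upper bound 1 - (1 - p**k)**h on the communication error."""
    return 1 - (1 - p ** k) ** h


multi_path = communication_error_bound(p=0.1, k=3, h=10)   # three neighbours per node
single_path = communication_error_bound(p=0.1, k=1, h=10)  # one parent per node
```

With three neighbours the bound comes out just under 1%, whereas the same formula with a single parent gives roughly 65%.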
![Page 34: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/34.jpg)
35
Adaptive Rings
Implicit acknowledgement provided by ODI synopses
◦ Retransmission
  High energy cost and delay
◦ Adapting the topology
  When the number of times a node's transmission is included in the parents' transmissions is below a threshold
  Assign the node to a ring where it can have a good number of parents
  Assign a node in ring i with probability p to:
    Ring i+1 if $n_i > n_{i-1}$, $n_{i+1} > n_{i-1}$ and $n_{i+2} > n_i$
    Ring i-1 if $n_{i-2} > n_{i-1}$, $n_{i-1} < n_{i+1}$ and $n_{i-2} > n_i$
![Page 35: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/35.jpg)
36
Effectiveness of Adaptation
[Figure panels: Rings; Adaptive Rings]
• Random placement of sensors in a 20*20 grid with a realistic communication model
• The solid squares indicate the nodes not accounted for in the final answer
![Page 36: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/36.jpg)
37
Realistic Loss Experiment
The algorithms are implemented in the TAG simulator
600 sensors deployed randomly in a 20 ft * 20 ft grid
The query node is in the center
Loss probabilities are assigned based on the distance between nodes
![Page 37: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/37.jpg)
38
Impact of Packet Loss
[Figure panels: RMS Error; % Value Included]
![Page 38: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/38.jpg)
39
Synopsis Diffusion
Pros
◦ High reliability and robustness
◦ More accurate answers
◦ Implicit acknowledgment
◦ Dynamic topology adaptation
◦ Moderately affected by mobility
Cons
◦ Approximation error
◦ Low node density decreases the benefits
◦ The fusion functions should be defined for each aggregation function
◦ Increased message size
![Page 39: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/39.jpg)
40
Overall Discussion points
Is there any benefit in coupling routing with aggregation?
◦ Choosing the paths and finding the optimal aggregation points
◦ Routing the sensed data along a longer path to maximize aggregation
◦ Finding the optimal routing structure
  Considering energy cost of links
  NP-Complete
  Heuristics (Greedy Incremental)
Considering data correlation in the aggregation process
◦ Spatial
◦ Temporal
  Defining a threshold
  TiNA
![Page 40: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/40.jpg)
41
Overall Discussion points
Could the energy saving gained by aggregation be outweighed by its cost?
◦ Aggregation function cost
  Storage cost
  Computation cost (number of CPU cycles)
No mobility
◦ Static aggregation tree
Structure-less or structured? That is the question…
◦ Continuous
◦ On-demand
![Page 41: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/41.jpg)
42
Generalize Problem to other areas
Transmitting large amounts of data on the internet is slow
◦ Better to process locally and transmit the interesting parts only
![Page 42: In Network Processing](https://reader036.vdocuments.net/reader036/viewer/2022070415/56814e53550346895dbbe10b/html5/thumbnails/42.jpg)
43
Overall Discussion points
How does query rate affect design decisions?
Load balancing between levels of the tree
◦ Overload root and main nodes
How will video capabilities of Imote affect aggregation models?