ahmed helmy
Post on 20-Jan-2016
UNIVERSITY OF SOUTHERN CALIFORNIA
Understanding and Utilizing Multi-Dimensional Correlations in Sensor Networks: A Protocol Design Perspective
Ahmed Helmy
Department of Electrical Engineering
USC Viterbi School of Engineering
University of Southern California
helmy@usc.edu
Web: ceng.usc.edu/~helmy, Lab: nile.usc.edu
Outline
• Classifying Correlations
• How to Utilize Correlations?
• Insights for Protocol Design
– Gradient-based Routing (RUGGED)
– Active Query Routing (ACQUIRE)
– Abnormality Detection and Filtering Inserted Data
• WLANs as Sensor Networks (IMPACT)
– Sensing access and usage patterns
– Analyzing correlations in wireless users behavior
• Issues
Correlation Classification
• Dimensions of Correlation:
– Spatial
• Between neighboring nodes
– Temporal
• Across time (different samples) for the same node
– Spatio-temporal
• Moving target (e.g., vehicle), moving phenomenon (e.g., fire)
• What is correlated?
– Sensor readings (e.g., temperature, light, gradients)
– Communication channel (e.g., loss, fading)
– Localization information, …
How Can We Utilize Correlations?
• In-network processing
– Aggregation
– Abstraction/ adaptive fidelity/ zoom-in
• Prediction (model-based), enables Caching
• Routing (gradients in time and space, etc.)
• Abnormality detection (attacks, failures, mis-calibration)
• Equivalence
– Sampling smaller set of nodes (sleep/wake-up)
– Topology control
RUGGED: RoUting on finGerprint Gradients in sEnsor Networks
Jabed Faruque, Ahmed Helmy
Department of Electrical Engineering
University of Southern California
faruque@usc.edu, helmy@usc.edu
URL: http://nile.usc.edu, http://ceng.usc.edu/~helmy
- Faruque, Psounis, Helmy, IEEE/ACM DCOSS 2005.
- Faruque, Helmy, IEEE ICPS 2004.
Introduction
• Sensor networks are envisioned to be widely used for habitat and environmental monitoring, among others
• Every physical event produces a fingerprint in the environment
• Diffusion laws are usually an inherent property of many physical phenomena
$f(d) \propto 1/d^{\alpha}$, where d = distance from the source and $\alpha$ = diffusion parameter, which depends on the type of effect (e.g., $\alpha = 1$ for temperature, $\alpha = 2$ for light)
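The diffusion law on this slide can be sketched numerically as follows. This is a minimal illustration, not part of the original talk: the function name `effect_magnitude`, the `source_strength` scale, and the `threshold` cut-off (standing in for a sensor's minimum measurable value) are all assumptions made for the example.

```python
def effect_magnitude(distance, alpha, source_strength=100.0, threshold=0.0):
    """Illustrative diffusion law: magnitude proportional to 1 / distance**alpha.

    source_strength and threshold are hypothetical parameters; readings
    below threshold are reported as zero.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    value = source_strength / distance ** alpha
    return value if value >= threshold else 0.0


# Readings at 1, 2 and 4 distance units from the source.
temperature_like = [effect_magnitude(d, alpha=1) for d in (1, 2, 4)]
light_like = [effect_magnitude(d, alpha=2) for d in (1, 2, 4)]
print(temperature_like)  # [100.0, 50.0, 25.0]
print(light_like)        # [100.0, 25.0, 6.25]
```

A larger diffusion parameter makes the fingerprint fall off faster, so fewer nodes see a usable gradient.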
Example (of diffusion): Isoseismal (intensity) maps (North Palm Springs earthquake of July 8, 1986)
Ref.: Southern California Earthquake Center. (http://www.scec.org)
Why Is the Natural Information Gradient Important?
• This natural information gradient is FREE
• Routing protocols can use it to forward query packets (greedily)
- Locate event(s); e.g., fire, nuclear leakage.
• Diffusion property is not limited to natural phenomena
- Time gradient
• Existing approaches (flooding, expanding ring search, random walk, etc.) do not utilize this information gradient
Challenges
- Erroneous readings of malfunctioning sensors
- Calibration error, obstacles. Cause local max/min
- Environmental noise
- In real life, sensors are unable to measure below a certain threshold. So, the diffusion curve has a finite tail
- Non-uniform sensor distribution (gaps)
[Figure: magnitude of effect vs. distance, showing a local maximum, a dip, and a gap in the diffusion curve]
Objective
Design an efficient algorithm to locate source(s) in sensor networks, utilizing the natural information gradient, i.e., the diffusion pattern of the event’s effect
- Gradient based
- Fully distributed
- Robust to node or sensor failure or malfunction
- Capable of finding multiple sources
Environment Model
• Event’s effect follows the diffusion law
• Discontinuity exists in the diffusion curve with finite tail
• Environmental noise
Basic Protocol
• A node can be in one of two modes
- flat region mode
- gradient region mode
• A node forwards the query to its neighbors along with its information level
• To forward the query, each node uses the following algorithm:
1. In the information gradient region, follow the greedy approach
- Forward the query to the neighbors if the information level about the event improves
2. In an unsmooth gradient region, use probabilistic forwarding based on the Simulated Annealing concept
- The probabilistic function is $f_p(x) = 1/x^{a}$, where x = hop count in the information gradient region and ‘a’ depends on the diffusion parameter ($\alpha$)
3. Use flooding in the flat (i.e., zero) information region
- Decreases latency to reach the gradient information region
- Handles queries in the absence of an event
• Query ID prevents looping
• Once the query is resolved, the node uses the reverse path to reply
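The three forwarding rules above can be sketched as a single per-node decision. This is a minimal sketch under stated assumptions, not the RUGGED implementation: the function names, the way the sender's information level is passed in, and the choice to treat "both levels are zero" as the flat region are all hypothetical.

```python
import random


def forward_probability(hops_in_gradient, a):
    """f_p(x) = 1 / x**a, where x is the hop count inside the gradient region."""
    x = max(1, hops_in_gradient)  # guard against x = 0 (assumption)
    return 1.0 / x ** a


def should_forward(own_level, sender_level, hops_in_gradient, a, rng=random):
    """Decide whether this node forwards the query it just received."""
    # Rule 3: flat (zero) information region, so flood.
    if own_level == 0 and sender_level == 0:
        return True
    # Rule 1: greedy forwarding while the information level improves.
    if own_level > sender_level:
        return True
    # Rule 2: no improvement, so forward probabilistically.
    return rng.random() < forward_probability(hops_in_gradient, a)
```

Passing `rng` explicitly keeps the decision reproducible in tests; a deployment would also need the query ID check from the slide to prevent looping, which is omitted here.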
[Figure: forwarding of the query (Q') toward the event (E) around node M_n, whose neighbors are labeled n_g, and node M_x, whose neighbors are labeled n_p]
• All neighbors (n_g) of M_n have more information, so they forward the query to their neighbors
• All neighbors (n_p) of M_x have less information, so they forward the query to their neighbors probabilistically
Query Types
• I. Single-value query
- Search for a specific value and have a single response
• II. Global Maxima search
- Search for the maximum value of information in the system
- Intermediate nodes suppress non-promising replies
• III. Multiple Events detection (still presents a challenge)
- Search for multiple events of the same type
Performance Metrics
• Reachability, i.e., success probability
- Probability that the query will reach the source
• Overhead in terms of average energy dissipation
- Number of transmissions to forward the query and to get the reply
• For the probabilistic function $f_p(x) = 1/x^{a}$, $a < \alpha$ is recommended, but a value close to $\alpha$ gives the optimal trade-off between reachability and overhead
- Reachability of ~98% is achievable in the presence of noise, gaps and flat regions
Comparisons
• Existing gradient-based routing protocols can be categorized into two major approaches
• Single-path approach
- CADR [Chu2002], Min-hop [Liu2003], …
• Multiple-path approach
- GRAB [Ye2003], RUGGED [Faruque2004]
Which approach to choose?
Objective
• Analyze the performance of these general approaches to route a query
- Model query success rate and overhead
• Using probability tools
- For ideal and lossy wireless link conditions
• Simulate the protocols based on these approaches in more realistic scenarios
- Also investigate the path quality metric
• Compare both approaches using analytical and simulation results
Brief Description of Routing Approaches
[Figure: two grids of sensor readings with the querier Q and the source S (reading 100). Left panel: single-path query forwarding with look-ahead = 1, marking the active node and candidate nodes. Right panel: multiple-path query forwarding, marking the active nodes.]
Variations of Single-path Approach
• Depends on the Next Active node selection policy
1. Basic single-path approach
- Selects the candidate node having maximum information that is also higher than that of the current active node
- Sensitive to local maxima
2. Improved single-path approach
- Selects the candidate node having maximum information
- Information of the selected node can be less than that of the current active node
[Figure: example neighborhoods showing the active node and its candidate nodes with their information levels]
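The two selection policies differ only in whether the best candidate must beat the current active node. The sketch below illustrates that difference; the function names and the dictionary representation of candidates are assumptions for the example, and tie-breaking or bookkeeping of already visited nodes is left out.

```python
def next_active_basic(current_level, candidates):
    """Basic policy: pick the best candidate only if it improves on the
    current active node; otherwise stop (stuck at a local maximum).

    candidates maps a hypothetical node id to its information level.
    """
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] > current_level else None


def next_active_improved(current_level, candidates):
    """Improved policy: always pick the best candidate, even if its
    information level is below that of the current active node."""
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# The active node holds 18; every candidate is lower, i.e. a local maximum.
neighbors = {"n1": 10, "n2": 15, "n3": 12}
print(next_active_basic(18, neighbors))     # None
print(next_active_improved(18, neighbors))  # n2
```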
Comparisons - Query Success Rate (ideal and lossy link case, p_c = 0.05)
[Figures: ideal link case, analytical result; lossy link case, analytical result]
• Query success rate of the improved single-path approach drops drastically for lossy links while the multiple-path approach is quite resilient
• ARQ may improve success rate of the improved single-path approach
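The analytical model behind these plots is not reproduced in the slides. As rough intuition only, and not the authors' model, the sketch below assumes each hop independently loses the query with probability `p_loss` and that the multiple-path approach behaves like `n_paths` independent paths of equal length. Both assumptions, and all names, are hypothetical.

```python
def single_path_success(hops, p_loss):
    """Probability that a query survives every hop of one path,
    assuming independent per-hop loss (assumption)."""
    return (1.0 - p_loss) ** hops


def multi_path_success(hops, p_loss, n_paths):
    """Probability that at least one of n_paths independent paths
    delivers the query (idealized, assumption)."""
    one_path = single_path_success(hops, p_loss)
    return 1.0 - (1.0 - one_path) ** n_paths


for hops in (5, 10, 20):
    print(hops,
          round(single_path_success(hops, 0.05), 3),
          round(multi_path_success(hops, 0.05, 3), 3))
```

Even in this simplified picture, the single-path success rate decays geometrically with path length while redundancy flattens the decay, which matches the qualitative claim in the bullets above.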
Comparisons - Overhead
[Figures: overhead of both approaches; energy saving of the multiple-path approach over the improved single-path approach]
• Multiple-path approach creates extra paths due to probabilistic forwarding, so overhead increases
• Single-path approach uses 1-hop look-ahead at every step to decide on the forwarder
• With the increase of malfunctioning nodes, the overhead of the single-path approach increases
- The length of the path increases
Results - Path Quality (ideal link case)
• Path quality: ratio of the average path length produced by a routing approach to the shortest path length between a source and a sink
• Multiple-path approach yields shorter paths, which are close to the shortest path
• With the increase of malfunctioning nodes, the path length of the single-path approach increases
Conclusions
• Multiple-path approach causes less overhead when a source is < 20 hops from the sink
• Multiple-path approach yields shorter paths
• With increase of malfunctioning nodes, the query success rate of the multiple-path approach degrades gracefully
• With lossy links
- Query success rate of the single-path approach drops drastically
- Multiple-path approach is quite resilient
Future work
• Combine the benefits of both routing approaches in a hybrid routing approach
• Develop a more adaptive multiple-path approach to reduce the number of extra paths due to probabilistic forwarding
• Implementation & evaluation in a test-bed
- on-going 150 sensor node new test-bed at USC
- continued work under the NSF-funded ACQUIRE project
ACQUIRE: ACtive QUery Forwarding In Sensor Networks
Original team: Narayanan Sadagopan, Bhaskar Krishnamachari, Ahmed Helmy
Current: Sundeep Pattem, Jabed Faruque, Rahul Orgaonkar, Yongjin Kim, Jung-Hyun Jun, Sapon Tanachaiwiwat, Shao-Cheng Wang
Department of Electrical Engineering
USC Viterbi School of Engineering
University of Southern California
URL: http://ceng.usc.edu/~acquire
Funding: NSF NETS NOSS, Intel (equipment)
Develop a model of variation over time (or space) using measurements
Use the model to predict data/readings. Only trigger updates or queries when data/readings deviate from the predicted value.
Depending on the data dynamics, we may be able to cache information collected earlier and answer queries without having to trigger new data collection.
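The idea of triggering updates only on deviation from a prediction can be sketched as below. The predictor used here, "the last transmitted value still holds", is the simplest possible model and is an assumption for illustration; the function name and `tolerance` parameter are likewise hypothetical.

```python
def updates_to_send(readings, tolerance):
    """Return the indices of readings a node would actually transmit.

    The sink predicts that the last transmitted value still holds
    (constant model, an assumption); the node sends a new reading only
    when it deviates from that prediction by more than tolerance.
    """
    sent = []
    predicted = None
    for index, value in enumerate(readings):
        if predicted is None or abs(value - predicted) > tolerance:
            sent.append(index)
            predicted = value  # both sides now agree on the new value
    return sent


temperatures = [20.0, 20.2, 20.4, 22.0, 22.1]
print(updates_to_send(temperatures, tolerance=0.5))  # [0, 3]
```

Between updates, queries can be answered from the cached prediction, which is the caching benefit the slide describes.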
ACtive QUery forwarding In sensoR nEtworks (ACQUIRE)*
• A mechanism for answering one-shot, complex queries for replicated data in sensor nets:
– One-shot (vs. continuous): answers are given based on explicit queries about current readings.
– Complex (vs. simple): the query can contain several sub-queries. E.g: (x OR y) AND z.
– Replicated data: several sensors might have answer to a sub-query.
• Example: Micro Climate Data Collection
– Different sensor modalities
– Give a location where (Temp > 80 degrees OR Humidity > 40%) AND Wind speed > 20 mph
* N. Sadagopan, B. Krishnamachari, A. Helmy, “Active Query Forwarding In Sensor Networks (ACQUIRE)”, AdHoc Networks Journal - Elsevier, Jan 2005 [Earlier version in SNPA ‘03]
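The micro-climate example is a complex query made of three sub-queries. A minimal sketch of evaluating it against cached readings follows; the field names (`temp_f`, `humidity_pct`, `wind_mph`), the location labels, and the `resolve` helper are assumptions for illustration and not part of ACQUIRE itself.

```python
def matches_query(reading):
    """(Temp > 80 degrees OR Humidity > 40%) AND Wind speed > 20 mph."""
    return ((reading["temp_f"] > 80 or reading["humidity_pct"] > 40)
            and reading["wind_mph"] > 20)


def resolve(predicate, cached_readings):
    """Return the first location whose reading satisfies the query, or None.

    Because the data is replicated, any one matching location is enough
    to answer a one-shot query.
    """
    for location, reading in cached_readings.items():
        if predicate(reading):
            return location
    return None


cache = {
    "ridge": {"temp_f": 85, "humidity_pct": 30, "wind_mph": 10},
    "valley": {"temp_f": 70, "humidity_pct": 55, "wind_mph": 25},
}
print(resolve(matches_query, cache))  # valley
```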
Flooding Based Queries (Directed Diffusion)
[Figure (a): Flooding of interest query [QA, QC] from querier node (sink x*) through nodes 1 to 10]
[Figure (b): Response to query; replicated responses such as [RA, RC, RC] travel back toward x*]
Flooding:
• Useful for long standing (continuous) queries
• Replicated responses might make it very inefficient.
ACQUIRE
[Figure (d): Sample trajectory of active query (solid) and response (dashed) in basic ACQUIRE (zero look-ahead). Legend: Active Query, Complete Response, Update Messages]
• An active node “refreshes” data from its “neighborhood”.
• The query is then forwarded to a node on the edge of the neighborhood
ACQUIRE
• Key Features
– In-network processing
– Does not rely on geographic information or unicast routing protocol
• Existence of these may considerably improve performance
– d helps us span the space from random walk (d = 0) to flooding (d = D, the network diameter)
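How the look-ahead d controls the size of the refreshed neighborhood can be sketched with a bounded breadth-first search. This is an illustration only; the adjacency-list representation and the function name are assumptions, and the sketch says nothing about how ACQUIRE actually collects the updates.

```python
from collections import deque


def neighborhood(adjacency, start, d):
    """Return the set of nodes within d hops of start (start included).

    adjacency maps each node to a list of its one-hop neighbors.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == d:
            continue  # do not expand beyond the look-ahead
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops)


# A four-node chain 1 - 2 - 3 - 4 with diameter 3.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(neighborhood(chain, 1, 0))  # {1}
print(neighborhood(chain, 1, 3))  # {1, 2, 3, 4}
```

With d = 0 the active node knows only itself, so the query trajectory degenerates to a walk; with d equal to the network diameter one refresh covers every node, which is flooding.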
ACQUIRE
• Look-ahead parameter, d
– Determines the size of the “neighborhood” in hops.
– Effects a tradeoff between the number of steps taken to resolve the query and the energy consumed.
– Optimal look-ahead, d*
• Depends on the query rate, refresh rate and the data dynamics (captured by the amortization factor, c)
• May be achieved by localized schemes.
• The higher the query rates & lower the data dynamics, the higher the optimal look ahead.
Performance of ACQUIRE
[Figure: Average Energy per Query vs. Look-ahead Parameter (d), for N=1000, M=200, with one curve for each of c = 0.01 through c = 0.07]
c is the refresh/query ratio (e.g., 0.01 means refresh once every 100 queries); the refresh overhead is amortized over the saving in queries.
ACQUIRE
• Efficiency
– 60-75% energy savings over Expanding Ring Search (analytical results)
– Order of magnitude savings over flooding.
• Future Work
– Develop ACQUIRE into a full-fledged protocol that actively adapts the ‘d’ parameter for optimal performance
– Evaluation over an experimental sensor network test bed.
– ceng.usc.edu/~acquire
Correlations and Inserted Data
• Main purpose of sensor networks: Collect Data
• Sybil attacks may insert false data that affect operation of sensor networks:
– Impersonating multiple IDs (at same/different times)
– Outlier detection alone will not work
• Approach:
– Understand normal correlations between data
– Detect outliers based on reference to normal behavior
– Design protocol robust to massive amount of forged data
Single Attacker Scenario I
Data: X from location (x,y): interesting events
(MobiQuitous 2005)
Single Attacker Scenario II
Data: X’ from location (x,y): normal events
Sybil Attack Scenario I
Data: W_i from location (x_i, y_i): interesting events
[Figure legend: Source, Source/forwarder, Attackers (sybil nodes), Inactive node, Aggregator, Sink]
Sybil Attack Scenario II
Data: W_i’ from location (x_i, y_i): normal events
[Figure legend: Source/forwarder, Inactive node, Aggregator, Sink, Attackers (sybil nodes)]
Data Correlation (Great Duck Island)
ID  | 111 (T, P, H) | 116 (T, P, H) | 122 (T, P, H) | 126 (T, P, H)
111 | 1, 1, 1       |               |               |
116 | .74, .64, .74 | 1, 1, 1       |               |
122 | .83, .42, .91 | .84, .67, .80 | 1, 1, 1       |
126 | .67, .41, .56 | .55, .50, .64 | .70, .55, .77 | 1, 1, 1
T: Temperature, P: Pressure, H: Humidity
ID: Sensor ID (only 4 neighboring sensors are shown)
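Each entry in a table like this is a correlation coefficient between the readings of two sensors for one modality. Assuming the Pearson coefficient is meant (the slide does not say which coefficient was used), a minimal sketch of the computation is shown below; the sample readings are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical temperature samples from two neighboring sensors.
sensor_a = [14.1, 14.8, 15.9, 17.2, 16.5]
sensor_b = [13.9, 14.5, 15.2, 16.8, 16.6]
print(round(pearson(sensor_a, sensor_b), 2))
```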
Anomaly Relationship Test (ART) Architecture
• Statistical Analysis Module
– T*-test (Outlier threshold)
– Correlation-coefficient analysis
• Authentication Module
– Distributed Interactive Proof
S. Tanachaiwiwat, A. Helmy, MobiQuitous 2005
Anomaly Relationship Test (ART) Protocol
(1) Correlation/T*-test
(2) Request valid credential
(3) Response with valid/invalid/no response
(4) Send report to sink
(5) Cross verify
Perform at verifiers only!
[Figure: roles shown are source, verifier (forwarder), verifier (aggregator), sink, prover (attacker), Sybil nodes, and compromised/failed nodes]
Summary
• Dynamic sliding window correlation analysis and the T*-test can alleviate the attack effectively, even under a full-scale attack from sybil nodes.
• Remarks
– Recognition of normal/abnormal/malicious events based on statistical analysis
– Malicious data insertion can cause problems for critical missions in WSNs
– Error is reduced by using a Dynamic Sliding Window and a careful choice of correlation threshold
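The sliding-window correlation check can be sketched as follows. This is a simplified illustration, not the ART implementation: it uses a fixed-size window rather than a dynamic one, omits the T*-test entirely, assumes the Pearson coefficient, and treats a constant window as uncorrelated. All names and the threshold value are hypothetical.

```python
from math import sqrt


def _pearson(xs, ys):
    """Pearson correlation; returns 0.0 for a constant window (assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def flag_low_correlation(reported, reference, window, threshold):
    """For each full window, flag True when the reported series has
    stopped tracking the reference series of a neighboring sensor."""
    flags = []
    for end in range(window, len(reported) + 1):
        r = _pearson(reported[end - window:end], reference[end - window:end])
        flags.append(r < threshold)
    return flags


reference = [1, 2, 3, 4, 5, 6]   # trusted neighbor
reported = [1, 2, 3, 4, 3, 2]    # diverges after the fourth sample
print(flag_low_correlation(reported, reference, window=3, threshold=0.5))
# [False, False, True, True]
```

In the protocol on the earlier slide, a flagged window would only trigger the credential request; it is not by itself proof of an attack.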
WLANs as Sensor Networks
Total Population: ~25,000 students
Wireless Users: ~6,000 students
Access Points: ~400
IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis*
• Classes of future sensor networks will be attached to humans
• What kinds of correlations exist between users?
• Analyze measurements of wireless networks
– Understand Wireless Users Behavior (individual and group)
– Develop models to understand associations and friendship
• Study of relationships and user behavior based on measurements of various University WLANs
* W. Hsu, A. Helmy, “IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis”, USC TR, July ‘05 (Under Submission)
Statistics of Studied Traces
- Four major campuses
- Month long traces studied
- Total users in the study: over 12,000 users
- Total Access Points in the study: over 1,300
Observations: On-line Time
On-off behavior is very common for wireless users. This seems especially true for small handheld devices. There are clear categories of heavy and light users,
the distribution of which is skewed and heavily depends on the campus.
Observations: Visited Access Points (APs)
•Individual users access only a very small portion of APs in the network, less than 35% in all campuses. The long-term mobility of users is highly skewed in terms of time associated with each AP. On average a user spends more than 95% of time at its top five most visited APs.
[percentage of visited APs]
Observations: Visited APs
•The majority of users experience low mobility while using the network. This is even true for portable devices such as PDAs. The actual handoff statistics depend heavily on the environment.
Observations: Similarity Index
•We observe clear repetitive patterns of association in wireless network users. Typically, user association patterns show the strongest repetitive pattern at a time gap of one day/one week.
Observations: Encounters
•In all the traces, the MNs encounter a relatively small fraction of the user population: below 40% in most cases and never above 60%. Except for the UCSD trace, on average an MN encounters only 1.88%-5.94% of the whole population. The number of total encounters for the users follows a BiPareto distribution, the parameters of which depend on the campus.
Encounter-graphs
• Definition
– When 2 nodes access the same AP at the same time we call this an ‘encounter’
– The encounter graph has all the mobile nodes as vertices and its edges link all those vertices that encounter each other
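The definition above can be turned into a small construction routine. This is a sketch under assumptions: the trace is represented as `(node, ap, start, end)` association sessions, which is a hypothetical format and not necessarily that of the studied traces, and two sessions count as "at the same time" when their intervals overlap.

```python
from collections import defaultdict
from itertools import combinations


def encounter_graph(sessions):
    """Build the encounter graph from association sessions.

    sessions: iterable of (node, ap, start, end) tuples (assumed format).
    Returns a dict mapping every node to the set of nodes it encountered.
    """
    by_ap = defaultdict(list)
    graph = {}
    for node, ap, start, end in sessions:
        by_ap[ap].append((node, start, end))
        graph.setdefault(node, set())  # isolated nodes are still vertices
    for records in by_ap.values():
        for (n1, s1, e1), (n2, s2, e2) in combinations(records, 2):
            # Same AP and overlapping time intervals: an encounter.
            if n1 != n2 and s1 < e2 and s2 < e1:
                graph[n1].add(n2)
                graph[n2].add(n1)
    return graph


trace = [
    ("u1", "ap1", 0, 10),
    ("u2", "ap1", 5, 15),   # overlaps u1 at ap1
    ("u3", "ap1", 20, 30),  # same AP, later time
    ("u4", "ap2", 0, 10),   # same time, different AP
]
print(encounter_graph(trace))
```

The pairwise comparison per AP is quadratic in the number of sessions at that AP, which is acceptable for a sketch but would need an interval sweep for month-long traces.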
Regular Graph: High path length, High clustering
Random Graph: Low path length, Low clustering
Small World Graph: Low path length, High clustering
- In Small Worlds, a few shortcuts contract the diameter (i.e., path length) of a regular graph to resemble the diameter of a random graph without affecting the graph structure (i.e., clustering)
[Figure: Clustering and Path Length vs. probability of re-wiring (p), for p from 0.0001 to 1]
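The two quantities that distinguish these graph classes can be computed as sketched below. The graph representation (a dict of neighbor sets) and the conventions chosen, namely that nodes with fewer than two neighbors contribute zero clustering and that unreachable pairs are ignored in the path length, are assumptions for this illustration.

```python
from collections import deque


def clustering_coefficient(graph):
    """Average over nodes of (links among neighbors) / (possible links)."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            continue  # contributes zero by convention (assumption)
        links = sum(1 for u in neighbors for v in neighbors
                    if u < v and v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean hop distance over all reachable ordered pairs, via BFS."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(clustering_coefficient(triangle), average_path_length(triangle))  # 1.0 1.0
```

A small-world graph shows a clustering value close to the regular graph's and a path length close to the random graph's.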
Encounter-graphs and Friendship
• Encounters link most of the MNs together in a connected graph:
– Even though each MN encounters only a small portion of the population.
– The encounter graph is a Small World graph
– Even for a short time period (1 day), its clustering coefficient, average path length, and connectivity are all close to those for longer traces.
• Friendship between MNs is highly asymmetric.
– The distribution of the friendship index is exponential for all the traces, regardless of the friendship definition (based on time, encounter, or location).
– Among all node pairs, fewer than 5% have a friendship index larger than 0.01, and fewer than 1% have a friendship index larger than 0.4.
Encounter-graphs using Friends
•Top-ranked friends tend to form cliques, and low-ranked friends are the key to providing random links and reducing the degree of separation in the encounter graph.
Encounter-based Information Diffusion
•Encounter patterns are rich enough to support information diffusion. Specifically, information can be delivered to more than 94% of users within two days. The reachability and average delay do not degrade significantly until at least ~40% of nodes are selfish.
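Encounter-based diffusion can be sketched as an epidemic-style replay of a time-ordered encounter list. The rules used here are assumptions for illustration, not the study's exact methodology: a node holding the message hands it over at every encounter, and a selfish node accepts the message but never forwards it. The function name and the tuple format are hypothetical.

```python
def diffuse(encounters, source, selfish=frozenset()):
    """Replay encounters and return {node: time the message arrived}.

    encounters: list of (time, node_a, node_b), sorted by time (assumed).
    selfish: nodes that receive but do not forward (assumption).
    """
    delivered = {source: 0}
    for time, a, b in encounters:
        for giver, taker in ((a, b), (b, a)):
            if giver in delivered and taker not in delivered:
                if giver == source or giver not in selfish:
                    delivered[taker] = time
    return delivered


meetings = [(1, "s", "a"), (2, "a", "b"), (3, "b", "c")]
print(diffuse(meetings, "s"))                 # {'s': 0, 'a': 1, 'b': 2, 'c': 3}
print(diffuse(meetings, "s", selfish={"a"}))  # {'s': 0, 'a': 1}
```

Reachability is then the fraction of nodes present in the returned dict, and the average delay is the mean of its values for nodes other than the source.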
Vision: Building Community-wide Wireless/Mobility Library
• Library of measurements from WLANs, mobility and associations from potential wireless societies (e.g., universities, vehicular nets)
• Library of realistic models of user behavior (e.g., mobility, traffic, friendship, encounter models, … )
• Library of benchmarks and guidelines for simulation and evaluation
• How much insight can we get by analyzing the traces?
• Can we use the insight to ‘design’ protocols of the future (not only for evaluation)?
• Currently 20 major universities willing to share their traces
• …. more to come: http://nile.usc.edu/MobiLib (under heavy update)
• If you have traces: helmy@usc.edu !
Issues
• How can we model correlations accurately?
• How can we further utilize correlations?
• Context-aware protocols:
– Phenomenon-aware protocols
– Socially-aware protocols
• Other kinds of correlations:
– Sensor Networks Test-beds: correlation between radio connectivity and phenomenon (e.g., rain)
– …
Thank You !
• Related Links
– ACQUIRE: ceng.usc.edu/~acquire
– Mobility Library: nile.usc.edu/MobiLib
– Lab: nile.usc.edu
– Homepage: ceng.usc.edu/~helmy