1
Rethinking Data Management for Storage-centric Sensor Networks
Yanlei Diao, Deepak Ganesan, Gaurav Mathur, and Prashant Shenoy
CIDR 2007: Proceedings of the Third Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, CA, January 2007.
2
STONES Project
STONES: STOrage for Networked Embedded Systems
http://sensors.cs.umass.edu/projects/stones/
Energy-efficient Storage for Sensors
Sensor Database
PRESTO, TSAR, Capsule, … etc.
3
Papers
PRESTO: Feedback-Driven Data Management in Sensor Networks
  ACM/USENIX NSDI 2006
TSAR: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs
  ACM SenSys 2005
Capsule: An Energy-Optimized Object Storage System for Memory-Constrained Sensor Devices
  ACM SenSys 2006
Proxy Cache
Sensor Nodes
4
PRESTO: Introduction
PRESTO Proxy
PRESTO Sensor
precision(query) > confidence interval
x(t+1)-x(t) > worst-case deviation
5
TSAR: Introduction
6
Capsule: Introduction
7
Outline
Introduction
StonesDB Architecture
Local Database
Distributed Data Management
Current Status and Conclusions
References
8
Introduction: Data Management in Sensor Networks
Live Data Management
  Real-time queries
  Only a small window of data is important
  Event detection and notification
  Push-down filters, AQP, … etc.
Archival Data Management
  Database outside the sensor network
  View the sensor network as a database
  Analysis of past events, historical trends
9
Introduction: Example (Smart Home and Smart Biz)
Live Data Management?
Archival Data Management?
10
Introduction: Centralized Archival Data Management
Internet
DBMS (Database Management System)
User Query
Data Access
Sensors with high data rate?!
camera, acoustic, vibration…
lossless aggregation…
Low data rate, High query rate
22.1 ˚C 21.5 ˚C 21.8 ˚C
?
11
Introduction: Storage-centric Archival Data Management
Internet
User Query
Data Access
Local Storage
Flash Memory
acoustic
image
Push query to sensors!
limited capabilities
flash memory efficiency
High data rate, Low query rate
?
12
Introduction: Sensor Node Hardware Today
Mica2 mote: 6 MHz processor, 4 KB RAM, 128 KB flash
iMote2: 13 – 416 MHz processor, 32 MB RAM, 32 MB flash
13
Introduction: Technology Trends
[Figure: Energy cost of storage compared to that of communication. x-axis: Generation of Sensor Platforms; y-axis: Energy Cost (per byte); curves: Communication, Storage.]
14
Design Goals
Exploit local flash memory
  Cheap, energy-efficient flash memory
Exploit resource-rich proxies
  Cache data and split query plans
Support a rich set of queries
  SQL-type queries
  Data mining and similarity search queries
Support heterogeneity
  Configurable to heterogeneous sensor platforms
15
StonesDB Architecture: Two-tier Sensor Networks
Local Database
Distributed Data Management Layer
user specified confidence bound
16
StonesDB Architecture: System Operations
Image Retrieval
…
Proxy Cache of Image Summaries
Find the images that have no Onion Head facial expression ...
17
StonesDB Architecture: System Operations
Image Retrieval
Find the images that have no Onion Head facial expression ...
Query Engine
Partitioned Access Methods
…
Sensor Local Storage
18
Local Database: Architecture of Local Database
Stream
Index
Summary
19
Local Database: Costs and Benefits of Access Methods
Cost for B+ tree insertion: $H \cdot (C_r + C_w)$
Cost for sequential scan: $C_r / R_{page}$
where
  $H$: depth of the B+ tree (an $H$-level B+ tree)
  $C_r$, $C_w$: cost for a page read / page write
  $R_{page}$: readings per page
Sequential scan is 340 times more energy efficient when the depth of the B+ tree is 2.
Scan is better when data is not accessed very frequently.
Lazy index construction!
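The comparison on this slide can be worked through numerically. The sketch below assumes the cost model as read from the slide: a B+ tree insertion costs one page read and one page write per level, and a sequential scan amortizes one page read over all readings in the page. The function names and the numeric values are hypothetical placeholders, not measurements from the talk, so the printed ratio is not expected to reproduce the 340x figure.

```python
def btree_insert_cost(depth, c_read, c_write):
    """Per-reading cost of a B+ tree insertion: H * (C_r + C_w)."""
    return depth * (c_read + c_write)


def scan_cost_per_reading(c_read, readings_per_page):
    """Per-reading cost of a sequential scan: C_r / R_page."""
    return c_read / readings_per_page


def index_to_scan_ratio(depth, c_read, c_write, readings_per_page):
    """How many times more expensive indexing a reading is than scanning it."""
    index_cost = btree_insert_cost(depth, c_read, c_write)
    return index_cost / scan_cost_per_reading(c_read, readings_per_page)


if __name__ == "__main__":
    # Hypothetical unit costs: a page write costs 3x a page read,
    # and 32 readings fit in one flash page.
    ratio = index_to_scan_ratio(depth=2, c_read=1.0, c_write=3.0,
                                readings_per_page=32)
    print(f"index insertion costs {ratio:.0f}x a scan, per reading")
```

The ratio grows with tree depth and with the number of readings per page, which is the motivation for building indexes lazily, only for data that is queried often enough to repay the insertion cost.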
20
Local Database: Partitioned Access Methods
temporal segments
B+ Tree, R Tree
Write-Once Indexing!
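One way to picture partitioned, write-once indexing over temporal segments is sketched below. This is an illustration under stated assumptions, not the StonesDB implementation: the class names `Segment` and `PartitionedStore` are hypothetical, a sorted list searched with `bisect` stands in for a B+ tree, timestamps are assumed to arrive in non-decreasing order, and the segment that is still being written is scanned sequentially instead of indexed.

```python
import bisect


class Segment:
    """One temporal segment of the stream, covering start <= t < end."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.readings = []   # (timestamp, value) in arrival order
        self.index = None    # built lazily, at most once

    def append(self, timestamp, value):
        if self.index is not None:
            raise ValueError("segment already indexed; the index is write-once")
        self.readings.append((timestamp, value))

    def scan(self, lo, hi):
        """Sequential scan: timestamps of readings with lo <= value <= hi."""
        return [t for t, v in self.readings if lo <= v <= hi]

    def indexed_lookup(self, lo, hi):
        """Same result as scan(), using an index built on first use."""
        if self.index is None:
            pairs = sorted((v, t) for t, v in self.readings)
            self.index = ([v for v, _ in pairs], [t for _, t in pairs])
        values, stamps = self.index
        i = bisect.bisect_left(values, lo)
        j = bisect.bisect_right(values, hi)
        return sorted(stamps[i:j])


class PartitionedStore:
    def __init__(self, segment_length):
        self.segment_length = segment_length
        self.segments = []

    def append(self, timestamp, value):
        start = (timestamp // self.segment_length) * self.segment_length
        if not self.segments or self.segments[-1].start != start:
            self.segments.append(Segment(start, start + self.segment_length))
        self.segments[-1].append(timestamp, value)

    def query(self, t_from, t_to, lo, hi):
        """Timestamps in [t_from, t_to] whose value lies in [lo, hi]."""
        hits = []
        for seg in self.segments:
            if seg.end <= t_from or seg.start > t_to:
                continue  # segment does not overlap the time range
            is_open = seg is self.segments[-1]
            found = seg.scan(lo, hi) if is_open else seg.indexed_lookup(lo, hi)
            hits.extend(t for t in found if t_from <= t <= t_to)
        return hits
```

Because each closed segment is indexed once and never updated, no index page is rewritten in place, which suits flash memory; segments that are never queried never pay the indexing cost.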
21
Local Database: Summarization and Aging
All available storage gets filled…
  When to drop these summaries?
  How to drop these summaries?
  Graceful query quality degradation.
[Figure: Multi-resolution Summarization: Local Storage Allocation. The local storage capacity is divided among Resolution 1, Resolution 2, Resolution 3, and Resolution 4.]
22
Local Database: Summarization and Aging
High query accuracy, low compactness
Low query accuracy, high compactness
How long should a summary be stored in the network?
23
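Local Database: Summarization and Aging
[Figure: Query Accuracy (y-axis, marked at 95% and 50%) versus Time (x-axis, from present to past, with Age_i marked). Curves: Q_user, the user-desired quality degradation, and Q_system, the system-provided step function. The gap between the two curves is the Quality Difference.]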
Objective: minimize the worst case quality difference
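$\min \; \max_{0 \le t \le T} \; q_{\mathrm{diff}}(t)$
[Constraint equations not recoverable from the extraction; surviving symbols: Age_i, r_i, s, R, N, and the index bounds 1 and 4.]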
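The aging objective can be illustrated by evaluating the worst-case gap between a user-desired quality curve and a system-provided step function. The sketch below is not the authors' formulation: the linear shape of the user curve, the function names, the sampling approach, and the two candidate step functions are all assumptions made for illustration. Only the 95% and 50% accuracy levels are taken from the slide.

```python
def user_quality(t, horizon, q_now=0.95, q_oldest=0.50):
    """User-desired accuracy for data of age t (assumed linear decay)."""
    return q_now + (q_oldest - q_now) * (t / horizon)


def system_quality(t, steps):
    """System-provided step function.

    steps is a list of (age_limit, accuracy) pairs sorted by age_limit;
    data of age t <= age_limit is still available at that accuracy.
    Data older than every limit has been dropped (accuracy 0).
    """
    for age_limit, accuracy in steps:
        if t <= age_limit:
            return accuracy
    return 0.0


def worst_case_difference(steps, horizon, samples=1000):
    """Largest shortfall of the system curve below the user curve on [0, T]."""
    worst = 0.0
    for k in range(samples + 1):
        t = horizon * k / samples
        diff = user_quality(t, horizon) - system_quality(t, steps)
        worst = max(worst, diff)
    return worst


if __name__ == "__main__":
    T = 100
    # Two hypothetical aging plans: two coarse steps versus four finer steps.
    two_steps = [(50, 0.95), (100, 0.50)]
    four_steps = [(25, 0.8375), (50, 0.725), (75, 0.6125), (100, 0.50)]
    for name, plan in [("two steps", two_steps), ("four steps", four_steps)]:
        print(name, round(worst_case_difference(plan, T), 4))
```

An aging policy in this spirit would search over the step positions and accuracy levels permitted by the storage budget and keep the plan with the smallest worst-case difference.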
24
Distributed Data Management: The Problems
Proxy Cache of Image Summaries
What summaries to cache?
What resolution of summaries?
How should a query plan be split?
I want the data of …
25
Distributed Data Management: Querying the Proxy Cache
Internet
User Query
Statistical models?
Low-resolution data?
Metadata of images?
Response
Gateway, Proxy
Sensor Nodes
Minimize the total cost of:
  summaries from sensors to the proxy
  queries from the proxy to sensors
  query execution at the sensors
  results back to the proxy
26
Distributed Data Management: Querying the Sensor Tier
Gateway, Proxy
Sensor Nodes
Cache miss…
Accuracy requirement not met?
User Query
How to split the query plan?
Use Query Processing Engine…
27
Distributed Data Management: Querying the Sensor Tier
Sensor Nodes
Gateway, Proxy
Number of cars over past half hour?
Sensor nodes: 01, 02, 03, 04
Proxy Cache of Image Summaries: 01, 02, 03, 04
Partially process the query at the proxy!
28
Distributed Data Management: Querying the Sensor Tier
Sensor Nodes
Gateway, Proxy
Average temperature between 1:00 PM and 2:00 PM
Sensor nodes: 01, 02, 03, 04
Proxy Cache of Data Summaries
Refine the result at the sensor node!
… 19.2 ˚C, 19.5 ˚C
20.2 ˚C, 21.1 ˚C
20.0 ˚C, 18.6 ˚C
average temperature every two hours
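The example on this slide, a one-hour query against a cache of two-hour averages, can be sketched as follows. This is a minimal illustration, not the StonesDB query engine: the `Sensor` and `Proxy` classes, the `need_exact` flag, and the timestamps attached to the temperature values are assumptions. The proxy answers from its cache when the cached window matches the query or an approximate answer is acceptable, and otherwise forwards the query to the sensor for refinement.

```python
from statistics import mean


class Sensor:
    """Holds the full-resolution readings in local storage."""

    def __init__(self, readings):
        self.readings = readings  # list of (hour_of_day, temperature)

    def average(self, start, end):
        return mean(v for h, v in self.readings if start <= h < end)


class Proxy:
    """Caches one average per fixed-length window (two hours by default)."""

    def __init__(self, sensor, window=2):
        self.sensor = sensor
        self.window = window
        self.cache = {}
        hours = [h for h, _ in sensor.readings]
        start = int(min(hours) // window) * window
        while start <= max(hours):
            self.cache[start] = sensor.average(start, start + window)
            start += window

    def query(self, start, end, need_exact):
        """Return (answer, tier that produced it)."""
        window_start = start - start % self.window
        covered = end <= window_start + self.window
        exact_hit = covered and start == window_start \
            and end == window_start + self.window
        if exact_hit or (covered and not need_exact):
            return self.cache[window_start], "proxy"
        # Cache cannot meet the accuracy requirement: refine at the sensor.
        return self.sensor.average(start, end), "sensor"


if __name__ == "__main__":
    # Hypothetical half-hourly readings from 12:00 to 14:30.
    sensor = Sensor([(12.0, 19.2), (12.5, 19.5), (13.0, 20.2),
                     (13.5, 21.1), (14.0, 20.0), (14.5, 18.6)])
    proxy = Proxy(sensor)
    print(proxy.query(13, 14, need_exact=False))  # answered from the cache
    print(proxy.query(13, 14, need_exact=True))   # refined at the sensor
```

The approximate answer costs no communication with the sensor tier, while the exact answer requires a round trip; which one is used depends on the accuracy the user asks for.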
29
Current Status and Conclusions
Implemented Capsule
  Flash-based object store
  Energy-efficient data structures (lists, arrays, trees)
  Currently being extended with summarization, aging, and partitioned indexing.
Other related systems
  TSAR: separates data from metadata
  PRESTO: implements a proxy cache
No running system for the StonesDB architecture.
30
References: The STONES Project
STONES http://sensors.cs.umass.edu/projects/stones/
CAPSULE http://sensors.cs.umass.edu/projects/capsule/
PRESTO http://presto.cs.umass.edu/
The Wireless Sensor Networks Group at UMASS http://sensors.cs.umass.edu/index.shtml