Databases Unplugged: Challenges in Ubiquitous Data Management
Michael Franklin, UC Berkeley


Post on 22-Dec-2015




M. Franklin, 12/17/99 2

“Gazillions of Gizmos”

“In ten years, billions of people will be using the Web, but a trillion "gizmos" will also be connected to the Web.” Asilomar Rep. on DB Research, Dec. 1998

You’ve heard it before…

Smartphones, PDAs, Smartcards, badges, wearables, lightswitches, toasters, …

Worldwide sales of Internet-enabled appliances projected to grow from 5.9M units in 1998 to 55.7M units in 2002. IDC via H&Q report


An Explosion in Scale

[Figure: computing eras plotted on two axes, Distribution and Personalization, each running from Less to More. Eras shown: Batch/RJE, Time Sharing, WS/Server, PC + Network. Trend labels: many people per computer; one person per computer; many computers per person. Also labeled: information appliances; scaled-down PCs, desktop metaphor.]

(Picture is by way of Randy Katz)


Technical Challenges

Disconnection/Weak Connection
- Standard distributed database techniques break down.

Limited resources
- Memory, CPU, Power, User Interface, Bandwidth

Movement/Location
- Killer mobile apps use current and future locations.

Scale
- Number and diversity of devices.

Reliability
- Palm Pilots don’t bounce.


But, is Mobile Data Mgmt Needed?

“Fundamentally, the ability to access all information from anywhere and have ONE unified and synchronized information repository is critical to making appliances useful.” Hambrecht and Quist, iWord, March 1999

“All these information appliances have internal data that "docks" with other data stores. Each gizmo is a candidate for database system technology, because most will store and manage some information.” Asilomar Report


Road Map

Motivation

Alternative scenarios for mobile databases

Technical/Research challenges

Some solutions
- Consistency
- Data Dissemination
- Data Recharging

Conclusions


How Will it Happen?

SQL engine on the device (largely standalone)

Extension of enterprise infrastructure

Data Collection (device to infrastructure)

Data Dissemination (infrastructure to device)

PIM-driven information assistant

Alternatives


SQL Engine on the Device

Reasonable for a palmtop, but probably not the toaster or light-switch…

Stand-alone with occasional synchronization.

Footprint versus functionality
- Engine can be made surprisingly small (10s-100s of KB).
- Sybase uses a “take what you need” library approach.

All major vendors are playing in this space:
- Oracle Lite, Sybase SQL Anywhere, Informix/Cloudscape, DB2 for the Workpad, SQL Server for Windows CE

But, what is the killer app???


Extension of Enterprise

Logical Progression?
- Mainframe -> Desktop -> Palm
- ERP -> Palm

Device becomes the endpoint of the enterprise infrastructure (queries and updates).

This is happening but must take into account fundamental limitations of the mobile platforms.

Again, examples exist, but the killer app has not yet emerged here.


Data Collection Devices

Inventory Management/Tracking/Sensors/Census

Examples: Symbol Technologies (Palm with a bar code scanner); more futuristic: smart dust.

Asymmetric (device to server) data flow/usage dictates system architecture.

Many applications exist, but no clear need for full function DBMS on the device.

Server-side DB must handle data streams


Data Dissemination

Many Potential Apps
- stock and sports tickers
- traffic information systems
- software distribution
- news and/or entertainment delivery

Asymmetric (server to devices) data flow/usage dictates system architecture.

No clear need for full function DBMS on the device, but intelligent caching and filtering on device is crucial.


Personal Information Management

PIM is the killer app for mobile devices.

So, use PIM to drive the data management architecture.

Example: IBM’s Active Calendar

Calendar provides semantic information on what information will be needed when (and where).

Use this information to pre-stage information from the fixed infrastructure.

This seems to be the most promising approach for driving device DB functionality.


Research Issues

Transactions (not likely) and Consistency.

Distribution of function
- how to split query functionality? adaptive??

New Querying and Access Models
- info filtering and dissemination
- location centric/movement triggers/pervasive (invasive?) computing
- Evidence Accrual – killer app: dating game

Availability and Recovery


Data Caching and Consistency

How to keep distributed data consistent?

Centralized algorithms require connectivity at specific times.

Alternative: Epidemic Algorithms (Peer-to-peer)
- Conflict detection: timestamps, version vectors, …
- Conflict handling (update commitment):
  - Optimistic (resolution): manual except in limited domains
  - Pessimistic (avoidance): primary copy, write-all, or voting-based
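Conflict detection with version vectors can be sketched as follows. This is a minimal illustration rather than any particular system's implementation: it assumes each replica keeps a per-site update counter, and the names `compare`, `pda`, and `server` are hypothetical.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping site name to update count).

    Returns "equal", "a_newer", "b_newer", or "conflict". A conflict means
    each replica has seen an update that the other has not.
    """
    sites = set(a) | set(b)
    # A site missing from a vector is treated as having a count of 0.
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in sites)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in sites)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


# The PDA updated while disconnected; the server did not change.
pda_only = compare({"pda": 2, "server": 1}, {"pda": 1, "server": 1})

# Both replicas were updated independently: concurrent updates.
concurrent = compare({"pda": 2, "server": 1}, {"pda": 1, "server": 2})
```

In the first case the newer copy can simply overwrite the older one; only the second case needs optimistic resolution or pessimistic avoidance.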


Epidemic Protocol Illustration

(Picture is by way of Ugur Cetintemel)


Deno - Cetintemel and Keleher

Pessimistic, Asynchronous (epidemic), voting-based

“Bounded” weighted-voting:
- Each replica is assigned a currency $c_i$ s.t. $0 \le c_i \le 1.0$
- Total currency in the system is bounded, i.e., $\sum_i c_i = 1.0$
- Currency can be re-distributed for optimization or planned disconnection.

An update’s life:
- Sites issue tentative updates
- Updates and votes are propagated in a pair-wise fashion
- Updates gather votes as they pass through sites
- An update commits when it gathers a plurality of votes
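The bounded-currency invariant can be illustrated with a small sketch. This is not Deno's actual protocol: `transfer_currency` is a hypothetical helper, the three-site allocation is invented for illustration, and a real system would need to agree on the transfer among replicas.

```python
from fractions import Fraction


def transfer_currency(currency, giver, receiver, amount):
    """Return a new allocation with `amount` moved from giver to receiver.

    `currency` maps site name to its share; shares always sum to exactly 1.
    """
    amount = Fraction(amount)
    if not Fraction(0) <= amount <= currency[giver]:
        raise ValueError("a site cannot give away more currency than it holds")
    updated = dict(currency)
    updated[giver] -= amount
    updated[receiver] += amount
    # The total is unchanged because the same amount is removed and added.
    assert sum(updated.values()) == 1
    return updated


# Hypothetical allocation for three replicas (exact fractions avoid rounding).
currency = {"s1": Fraction("0.4"), "s2": Fraction("0.3"), "s3": Fraction("0.3")}

# Before a planned disconnection, s3 hands its voting weight to s1 so that
# the remaining sites can still reach a plurality without it.
before_trip = transfer_currency(currency, "s3", "s1", "0.3")
```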


Decentralized Update Commitment

An update u wins an election with plurality

A site s maintains:
- votes(u): the sum of votes u gained so far
- unknown: the sum of votes unknown to s (i.e., $1.0 - \sum_{u} \mathrm{votes}(u)$)

u commits iff for all u’ <> u,
- votes(u) > votes(u') + unknown, and
- votes(u) > unknown

Issues: time to commit; abort rates

Example 1 (s1, Oi): votes known so far, then the resulting tallies

1) (s1, 0.20, u1): votes(u1) = 0.20; unknown = 0.80
2) (s1, 0.20, u1), (s5, 0.20, u1): votes(u1) = 0.40; unknown = 0.60
3) (s1, 0.20, u1), (s5, 0.20, u1), (s6, 0.15, u2): votes(u1) = 0.40, votes(u2) = 0.15; unknown = 0.45
4) (s1, 0.20, u1), (s5, 0.20, u1), (s6, 0.15, u2), (s2, 0.15, u1): votes(u1) = 0.55, votes(u2) = 0.15; unknown = 0.30. u1 commits!

Example 2 (s1, Oi): votes known so far, then the resulting tallies

1) (s1, 0.20, u1): votes(u1) = 0.20; unknown = 0.80
2) (s1, 0.20, u1), (s4, 0.20, u2): votes(u1) = 0.20, votes(u2) = 0.20; unknown = 0.60
3) (s1, 0.20, u1), (s4, 0.20, u2), (s6, 0.25, u3): votes(u1) = 0.20, votes(u2) = 0.20, votes(u3) = 0.25; unknown = 0.35
4) (s1, 0.20, u1), (s4, 0.20, u2), (s6, 0.25, u3), (s2, 0.25, u2): votes(u1) = 0.20, votes(u2) = 0.45, votes(u3) = 0.25; unknown = 0.10. u2 commits!
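The commit rule and the two vote traces on this slide can be checked with a short sketch. It assumes each tuple reads (site, currency, update) and that total currency is 1.0; the function names `tally`, `committed_update`, and `first_commit` are illustrative and not taken from Deno.

```python
from fractions import Fraction


def tally(votes_seen):
    """Sum the currency each update has gathered from (site, currency, update) tuples."""
    totals = {}
    for _site, currency, update in votes_seen:
        totals[update] = totals.get(update, Fraction(0)) + Fraction(currency)
    return totals


def committed_update(votes_seen):
    """Return the update that can commit given locally known votes, or None."""
    totals = tally(votes_seen)
    unknown = Fraction(1) - sum(totals.values(), Fraction(0))
    for u, v in totals.items():
        # u must beat every rival even if the rival gets all unknown votes,
        # and must also beat an update this site has not heard of yet.
        beats_rivals = all(v > w + unknown for r, w in totals.items() if r != u)
        if beats_rivals and v > unknown:
            return u
    return None


def first_commit(trace):
    """Replay a trace vote by vote; return (step, update) at the first commit."""
    for step in range(1, len(trace) + 1):
        winner = committed_update(trace[:step])
        if winner is not None:
            return step, winner
    return None


# Currencies are given as strings so Fraction parses them exactly.
trace_1 = [("s1", "0.20", "u1"), ("s5", "0.20", "u1"),
           ("s6", "0.15", "u2"), ("s2", "0.15", "u1")]
trace_2 = [("s1", "0.20", "u1"), ("s4", "0.20", "u2"),
           ("s6", "0.25", "u3"), ("s2", "0.25", "u2")]

result_1 = first_commit(trace_1)
result_2 = first_commit(trace_2)
```

In the second trace u2 commits with only 0.45 of the currency, which shows that the rule needs a plurality rather than a majority: no rival can catch up even if it collects every unknown vote.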


Semantic Caching - Dar et al.

Idea: Maintain description of cache contents as a set of logical predicates rather than a list of items.

Potential advantages:
- Less overhead with no need for static clustering (reduces bandwidth requirements).
- Describe missing items with logical remainder query.
- Application/Environment specific replacement functions, e.g. considering direction and velocity.

Issues:
- controlling complexity of cache descriptions
- interacting with real database systems
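A remainder query can be sketched for the simplest case. This is an assumption-laden illustration, not the scheme from Dar et al.: it assumes the cache description is a list of half-open ranges `[lo, hi)` over a single attribute, and the function name `remainder` is hypothetical.

```python
def remainder(query, cached):
    """Return the sub-ranges of `query` that no cached range covers.

    `query` is a (lo, hi) pair and `cached` is a list of (lo, hi) pairs,
    all treated as half-open ranges over one attribute.
    """
    lo, hi = query
    missing = []
    for c_lo, c_hi in sorted(cached):
        if c_hi <= lo or c_lo >= hi:
            continue  # this cached range does not overlap what is left
        if c_lo > lo:
            missing.append((lo, c_lo))  # gap before the cached range
        lo = max(lo, c_hi)  # everything up to c_hi is now answered locally
        if lo >= hi:
            break
    if lo < hi:
        missing.append((lo, hi))  # tail not covered by any cached range
    return missing


# The cache holds rows with key in [0, 10) and [20, 30).
cached_ranges = [(0, 10), (20, 30)]

# A query for [5, 25) only needs [10, 20) from the server.
to_fetch = remainder((5, 25), cached_ranges)
```

Only the remainder travels over the wireless link; the rest of the answer comes from the cache, which is where the bandwidth saving comes from.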


Dissemination-Based Info Sys (DBIS)

1) Push vs. Pull is just one dimension along which to compare data delivery mechanisms.

- We’ve identified three.

2) Different mechanisms for data delivery can (and should) be applied at different points in the system.

- Select components from toolkit.

Franklin and Zdonik - Framework in OOPSLA 97, Toolkit description and demo in SIGMOD 99.


DBIS Framework

An architecture that combines data delivery techniques for responsive client access.

3 types of nodes:
- Data sources
- Clients
- Information brokers (can add value)

Any data delivery mode can be used.

Network transparency

Possibly dynamic.


Delivery Options

Pull
- Aperiodic, unicast: request/response
- Aperiodic, 1-to-n: request/response w/ snoop
- Periodic, unicast: polling
- Periodic, 1-to-n: polling w/ snoop

Push
- Aperiodic, unicast: email lists
- Aperiodic, 1-to-n: publish/subscribe
- Periodic, unicast: email list digests
- Periodic, 1-to-n: broadcast disks, publish/subscribe
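The three dimensions of data delivery (pull vs. push, aperiodic vs. periodic, unicast vs. 1-to-n) can be written down as a small lookup table. This is only a sketch: the enum and function names are hypothetical, and the assignment of mechanisms to cells is an assumption that follows the order in which the labels appear on the slide.

```python
from enum import Enum


class Initiation(Enum):
    PULL = "pull"
    PUSH = "push"


class Schedule(Enum):
    APERIODIC = "aperiodic"
    PERIODIC = "periodic"


class Fanout(Enum):
    UNICAST = "unicast"
    ONE_TO_N = "1-to-n"


# One entry per leaf of the taxonomy (2 x 2 x 2 = 8 cells).
DELIVERY_OPTIONS = {
    (Initiation.PULL, Schedule.APERIODIC, Fanout.UNICAST): ["request/response"],
    (Initiation.PULL, Schedule.APERIODIC, Fanout.ONE_TO_N): ["request/response w/ snoop"],
    (Initiation.PULL, Schedule.PERIODIC, Fanout.UNICAST): ["polling"],
    (Initiation.PULL, Schedule.PERIODIC, Fanout.ONE_TO_N): ["polling w/ snoop"],
    (Initiation.PUSH, Schedule.APERIODIC, Fanout.UNICAST): ["email lists"],
    (Initiation.PUSH, Schedule.APERIODIC, Fanout.ONE_TO_N): ["publish/subscribe"],
    (Initiation.PUSH, Schedule.PERIODIC, Fanout.UNICAST): ["email list digests"],
    (Initiation.PUSH, Schedule.PERIODIC, Fanout.ONE_TO_N): ["broadcast disks", "publish/subscribe"],
}


def options_for(initiation, schedule, fanout):
    """Look up the example mechanisms for one cell of the taxonomy."""
    return DELIVERY_OPTIONS[(initiation, schedule, fanout)]


periodic_broadcast = options_for(Initiation.PUSH, Schedule.PERIODIC, Fanout.ONE_TO_N)
```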


Network Transparency

[Figure: clients, brokers, and sources connected by links]

The type of a link matters only to nodes on each end


DBIS Example

[Figure: an example configuration with a server DB, three proxy caches, one 1-to-n push link, and three unicast pull links. The delivery modes can vary dynamically.]


DBIS Research Issues

Each data delivery mechanism has unique aspects
- Broadcast Disks: scheduling, caching, prefetching, updates
- On-demand Broadcast: scheduling, data staging
- Publish/Subscribe: large-scale filtering, channelization

Security/Fault-tolerance/Reliability

End-to-End network design and control

Fundamental performance tradeoffs

Exploiting existing and emerging technologies


“Data Recharging”

Mobile devices require 2 resources: power and data

It is impractical to be continuously connected to fixed sources of these.

Devices cope with disconnection using caching:
- Power cached in rechargeable batteries
- Data cached in hot-synched memory

Ideal: make recharging data as simple as power:
- Anywhere (with adapters), anytime, flexible connection duration

Joint work w/ Mitch Cherniack and Stan Zdonik getting underway


Data Recharging - Research Agenda

Profile Definition and Maintenance

Update Storage and Preparation

Efficient integration of "recharge" updates with existing cached data.

Recharge, Trickle Charge, Jump Start...

Consistency Guarantees

Global Data Staging

Approaches will be driven by (mostly PIM) applications.


Conclusions

Lots of plausible/useful mobile data architectures.
- For many, the applications exist today
- Each has its own set of fascinating research opportunities.

PIM is the killer app for mobile data access.

It can be used to drive the integration with enterprise and Internet data sources.

Successful MDA work lies at the intersection of communications and data management rather than exclusively in either camp.