
Post on 14-Jan-2016


Caching with “Good Enough” Currency, Consistency, and Completeness

Hongfei Guo, University of Wisconsin
Per-Åke Larson, Microsoft Research
Raghu Ramakrishnan, University of Wisconsin


Motivation — Scaling Google

Motivation: Scaling a DBMS by Caching

[Figure: application servers running app-specific code query a caching DBMS; updates go to the backend DBMS and reach the cache as asynchronous updates.]

Problem: How can we tell whether the cached data is “good enough” for an application?

NO data quality requirements from the applications!
NO data quality guarantees from the caching DBMS!

Big Picture

Fine-grained data quality-aware database caching model

Apps: specify data quality requirements in queries [SIGMOD 2004] [SIGMOD 2004 Demo]
Cache admin: specifies local data quality. Cache: keeps track of local data quality. [VLDB 2005]
Query processing: enforces data quality constraints [SIGMOD 2004] [VLDB 2005]
System performance evaluation [ongoing work]

[Figure: application servers, caching DBMS, backend DBMS]

Contributions

Goal: fine-grained data quality-aware cache management

Problems
  How does the cache track data quality?
  How does the admin specify cache properties?
  How to maintain the cache efficiently?
  How to enforce data quality constraints for queries?

A comprehensive solution
  Cache properties
  Dynamic cache model
  Efficient cache maintenance and “safety”
  Efficiently enforce data quality checking


Review: Data Quality Metrics (informal)

Currency: The time elapsed since this copy became stale

Consistency: A query result is (snapshot) consistent iff it is as if evaluated from a snapshot of the master database

C&C: Currency & Consistency
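The two informal metrics above can be sketched in a few lines of Python. This is only an illustration: the `CachedCopy` record, its `snapshot_id` and `stale_since` fields, and the use of seconds are assumptions made here, not part of the talk. Requiring a shared snapshot id is a sufficient test for snapshot consistency under these assumptions, not the only possible one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CachedCopy:
    """A cached object plus the bookkeeping this sketch assumes (hypothetical)."""
    key: str
    snapshot_id: int               # master-database snapshot the copy reflects
    stale_since: Optional[float]   # time the copy became stale; None if still current


def currency(copy: CachedCopy, now: float) -> float:
    """Elapsed time since the copy became stale (0 if it is not stale)."""
    if copy.stale_since is None:
        return 0.0
    return max(0.0, now - copy.stale_since)


def snapshot_consistent(copies: Sequence[CachedCopy]) -> bool:
    """True if every copy reflects the same snapshot of the master database."""
    return len({c.snapshot_id for c in copies}) <= 1
```

A result built from copies with different snapshot ids may still be individually fresh, which is why currency and consistency are tracked as separate properties.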

Review: Proposed SQL Syntax

BookCopy
bid  title      author
1    databases  Raghu
2    databases  Ullman

ReviewCopy
rid  bid  text
1    1    …
2    1    …
3    2    …

SELECT *
FROM Books B, Reviews R
WHERE B.bid = R.bid AND B.title = 'Databases'
CURRENCY BOUND 10 min ON (B, R)

Variants of the currency clause:
CURRENCY BOUND 10 min ON (B), 30 min ON (R)
CURRENCY BOUND 10 min ON (B, R) BY B.bid

Parts of the clause: currency bound (10 min), consistency class ((B, R)), group by (BY B.bid)

Query result
bid  title      author  bid  rid  text
1    databases  Raghu   1    1    …
1    databases  Raghu   1    2    …
2    databases  Ullman  2    3    …
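One way to read the currency clause is as a list of consistency classes, each with its own bound. The sketch below encodes that reading under stated assumptions: `ConsistencyClass`, `CopyState`, and `satisfies` are hypothetical names, the state of each cached copy is reduced to a snapshot id and a staleness in minutes, and the BY grouping is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ConsistencyClass:
    """One 'bound ON (tables)' item of a currency clause (hypothetical encoding)."""
    bound_min: float          # currency bound in minutes
    tables: Tuple[str, ...]   # aliases that must come from a single snapshot


# alias -> (snapshot id of the cached copy, minutes since it became stale)
CopyState = Dict[str, Tuple[int, float]]


def satisfies(clause: List[ConsistencyClass], state: CopyState) -> bool:
    """Check every consistency class: one snapshot, and all copies within bound."""
    for cls in clause:
        snapshots = {state[t][0] for t in cls.tables}
        if len(snapshots) > 1:
            return False      # tables in one class must be mutually consistent
        if any(state[t][1] > cls.bound_min for t in cls.tables):
            return False      # some copy is staler than the class allows
    return True


# CURRENCY BOUND 10 min ON (B, R)
joint = [ConsistencyClass(10, ("B", "R"))]
# CURRENCY BOUND 10 min ON (B), 30 min ON (R)
separate = [ConsistencyClass(10, ("B",)), ConsistencyClass(30, ("R",))]
```

With B at snapshot 5 (4 minutes stale) and R at snapshot 4 (20 minutes stale), `separate` is satisfied while `joint` is not, because the joint class demands a common snapshot.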

Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions

Cache Properties

Why Define Cache Properties?

[Figure: queries with relaxed C&C requirements enter query processing, which returns results; cache maintenance; cache properties = contract.]

Cache Properties (P+3C)

Describe local data status:
Presence: per object
Consistency: a set of objects
Completeness: per predicate
Currency: object staleness

Presence

Example:
SELECT *
FROM Authors A
WHERE authorId = 1

Question: Is an object present in the cache?

Consistency and Currency

Example:
SELECT *
FROM Authors A
WHERE authorId IN (1, 2, 3)
CURRENCY BOUND 10 min ON (A)

Question: Is a set of objects consistent and no more than 10 minutes old?

Completeness

Example:
SELECT *
FROM Authors A
WHERE city = 'Madison'

Question: Are ALL authors from Madison in the cache?

Basic Concepts

[Figure labels: master database with tables, objects, and snapshots H1 and H2; cache with View 1, View 2, and View 3.]

Cache Property Examples

[Figure labels: master database snapshots H1 and H2; cache with View 1, View 2, and View 3; annotations: present, complete, consistent, stale point.]

Currency = now - stale point

Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions

Specifying Cache Properties

Specified as integrity constraints

Single view:
  Presence constraint
  Consistency constraint
  Completeness constraint

Between two views:
  Presence correlation constraint
  Consistency correlation constraint

Presence Constraint

AuthorCopy is a partially materialized view [Zhou et al 2005] at the caching DBMS. AuthorList_PCT is its control table and authorId is the control key.

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

CREATE VIEW AuthorCopy AS SELECT * FROM Authors

CREATE TABLE AuthorList_PCT (authorId int)

ALTER VIEW AuthorCopy ADD PRESENCE
ON authorId IN (SELECT authorId FROM AuthorList_PCT)

[Figure labels: backend DBMS, caching DBMS]
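The effect of a presence constraint can be illustrated with a small Python sketch. It assumes the backend Authors table holds the three rows shown on the slide; `materialize_author_copy` and `is_present` are hypothetical helper names, and this is not how the prototype implements partially materialized views.

```python
# Backend Authors table; contents assumed to match the rows shown on the slide.
AUTHORS = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]


def materialize_author_copy(authors, author_list_pct):
    """Rows of Authors whose control key (authorId) is in the control table."""
    wanted = set(author_list_pct)
    return [row for row in authors if row["authorId"] in wanted]


def is_present(author_id, author_list_pct):
    """Presence check answered from the control table alone."""
    return author_id in set(author_list_pct)
```

Changing the contents of the control table changes what the view holds, without redefining the view itself.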

Consistency Constraint

CityList_CsCT:
city
Madison

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

CREATE TABLE CityList_CsCT (city string)

ALTER VIEW AuthorCopy ADD CONSISTENCY
ON city IN (SELECT city FROM CityList_CsCT)

[Figure labels: backend DBMS, cache region]

Completeness Constraint

CityList_CpCT:
city
Madison
New York

AuthorList_PCT:
authorId
1
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

CREATE TABLE CityList_CpCT (city string)

ALTER VIEW AuthorCopy ADD COMPLETENESS
ON city IN (SELECT city FROM CityList_CpCT)

[Figure label: backend DBMS]
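The example data can be reproduced with a short Python sketch. It assumes that the view holds the union of the rows admitted by the presence control table and by the completeness control table, which is what the slide's data suggests; the function names and the backend table contents are illustrative assumptions.

```python
# Backend Authors table; contents assumed to match the rows shown on the slide.
AUTHORS = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]


def materialize(authors, author_list_pct, city_list_cpct):
    """Rows admitted by the presence list or by a city declared complete."""
    ids, cities = set(author_list_pct), set(city_list_cpct)
    return [r for r in authors if r["authorId"] in ids or r["city"] in cities]


def complete_for_city(city, city_list_cpct):
    """Can the cache answer 'all authors from <city>' locally?"""
    return city in set(city_list_cpct)


author_copy = materialize(AUTHORS, [1, 3], ["Madison", "New York"])
```

Here Bob is cached only because Madison is declared complete. Cedric is present, yet the cache cannot claim to hold all Seattle authors, since Seattle is not in the completeness control table.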

Presence Correlation Constraint

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:
isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

ALTER VIEW BookCopy ADD PRESENCE
ON authorId IN (SELECT authorId FROM AuthorCopy)

[Figure: cache schema graph. AuthorList_PCT is linked to AuthorCopy on authorId, and AuthorCopy is linked to BookCopy on authorId. Also labeled: backend DBMS.]
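A presence correlation constraint ties the contents of one view to another. The following Python sketch shows the idea with the slide's BookCopy rows; the backend Books contents and the helper name `materialize_book_copy` are assumptions for illustration.

```python
# Backend Books table; contents assumed to match the BookCopy rows on the slide.
BOOKS = [
    {"isbn": 111, "authorId": 1, "title": "aaa"},
    {"isbn": 222, "authorId": 1, "title": "bbb"},
    {"isbn": 333, "authorId": 2, "title": "ccc"},
    {"isbn": 444, "authorId": 3, "title": "ddd"},
    {"isbn": 555, "authorId": 3, "title": "eee"},
]


def materialize_book_copy(books, author_copy_ids):
    """Books whose author is currently present in AuthorCopy."""
    present_authors = set(author_copy_ids)
    return [b for b in books if b["authorId"] in present_authors]
```

In this sketch, once author 2 leaves AuthorCopy, book 333 is no longer admitted to BookCopy, so a join of the two views over cached authors does not miss books.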

Consistency Correlation Constraint

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:
isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

ALTER VIEW BookCopy ADD CONSISTENCY ROOT

[Figure: cache schema graph. AuthorList_PCT is linked to AuthorCopy on authorId, and AuthorCopy is linked to BookCopy on authorId. Also labeled: backend DBMS.]

Cache Schema Example

[Figure: cache schema graph. Nodes: AuthorList_PCT, CityList_CsCT, AuthorCopy, BookCopy, ReviewCopy, ReviewerCopy, ReviewerList_PCT. Edge labels: authorId, authorId, isbn, reviewId, reviewerId.]

Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions


Extension to the Optimizer

Compile-time consistency checking

Run-time currency and inexpensive consistency checking

Cost estimation

Run-time C&C Checking

Currency guard: check whether local view V satisfies the currency requirement
Consistency guard: check whether local view V satisfies the consistency requirement

[Figure: a ChoosePlan operator evaluates the C&C guard and selects either the local plan using V or the remote plan requesting E.]
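The ChoosePlan idea can be sketched as a guarded choice between two callables. This is a simplified illustration, not the optimizer extension itself: `currency_guard` and `choose_plan` are hypothetical names, and the guard assumes the cache records when view V was last synchronized, using the time since then as a conservative upper bound on staleness.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def currency_guard(last_sync: float, now: float, bound: float) -> bool:
    """True if view V was synchronized within the query's currency bound."""
    return now - last_sync <= bound


def choose_plan(guard_ok: bool,
                local_plan: Callable[[], T],
                remote_plan: Callable[[], T]) -> T:
    """Run the local plan only when the guard passes; otherwise go remote."""
    return local_plan() if guard_ok else remote_plan()


def local() -> str:
    return "local plan using V"


def remote() -> str:
    return "remote plan requesting E"


# Bound of 600 seconds (10 minutes); V last synchronized at t = 100.
fresh = choose_plan(currency_guard(100.0, 400.0, 600.0), local, remote)
stale = choose_plan(currency_guard(100.0, 900.0, 600.0), local, remote)
```

Because the guard is evaluated at run time, the same compiled plan can serve a query from the cache at one moment and from the backend at another.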

Future Directions

Improve current prototype
  Read-write transactions?
Adaptive data quality-aware caching policies
  Control-table content?
  Refresh intervals?
Automate cache design/tuning
  How to get a good cache schema? (i.e., cache region granularity, assignment)
Comprehensive performance evaluation
  Cache configurations?
  Comparison with other replication solutions?

Summary

Goal: fine-grained data quality-aware cache management

A comprehensive solution
  Four cache properties
  Dynamic cache model
  Efficient cache maintenance and “safety”
  Efficiently enforce C&C checking

Questions?


So long, and thanks for all the fish!


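Simple Consistency Guards Overhead

[Figure: bar chart of execution time (ms, axis 0 to 80) for queries Qa, Qb, and Qc under local and remote execution, with each bar split into consistency guard time and query time. Overhead labels shown on the chart: 16.56%, 14.00%, 1.72%, 1.59%, 1.66%, 1.6%.]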

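Single Table Consistency Guard Overhead

[Figure: bar chart of execution time (ms, axis 0 to 7) for guards A11a, A11b, A12, S11, and S12 under local and remote execution, with each bar split into consistency guard time and query time (Qa is used). Overhead labels shown on the chart: 62.85%, 16.98%, 71.41%, 6.06%, 8.79%, 7.48%, 2.33%, 4.95%, 58.32%, 23.77%.]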
