Post on 14-Dec-2015

Aggregate Queries in Peer-to-Peer OLAP

• Mauricio Minuto Espil Faculty of Engineering Universidad Católica Argentina

• Alejandro A. Vaisman Computer Science Department Universidad de Buenos Aires

7th International Workshop on Data Warehousing & OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Peer-to-Peer Systems

MAIN CHARACTERISTICS:

• Involves a network of interconnected peer systems;
• The network topology is not relevant;
• Each peer maintains full autonomy over its own data resources;
• Each peer may assume the role of local; the rest become acquaintances of the local peer;
• The roles of local and acquaintance among peers are not static; they are functional and are determined with respect to an operation.

Peer-to-Peer Data Management

MAIN CHARACTERISTICS:

• No global schema is assumed to exist for data;
• Each peer must manage its data according to its own perspective;
• A query may be posed on any peer; the responsive peer becomes local with respect to the query;
• Answers to queries must reflect the best attempt to gather data from all peers;
• Answers to queries posed by local peer users must conform to the view those users have of their data;
• Peers must cooperate in maintaining the local views of data.

OLAP Data in a Peer-to-Peer System

THE PROBLEM:

• OLAP data is essentially multidimensional;
• Multidimensional data consists of a collection of views of base and derived aggregated data, describing fact indicators by dimensions of analysis;
• Concepts for aggregation within dimensions are obtained from finer-grained concepts through hierarchies;
• Different peers may have related fact indicators described by different dimension hierarchies;
• Integration is needed: any summary concept that appears in a hierarchy of an acquaintance peer must be transformed into a summary concept meaningful to the local peer.

OLAP Data in a Peer-to-Peer System

THE PROBLEM (CONTINUED):

• The expected integration is not always possible;
• Users may pose OLAP queries in a local peer expecting results involving all relevant data stored in all peers;
• Local queries must be propagated among the acquaintances;
• A rewriting of the propagated queries is needed to conform to the view of the local user;
• The rewriting technique must accomplish the data integration on the fly;
• Incomplete and uncertain results must be admitted.

Peer-to-Peer OLAP

MODEL (DEFINES):

• FACT PEERS
• DIMENSION PEERS
• AGGREGATE P2P OLAP QUERIES
• COMPLETE AND CERTAIN QUERY ANSWERS

ARCHITECTURE (INVOLVES):

• AUTONOMOUS PEER DATA MANAGEMENT
• THREE PHASE PEER TO PEER COORDINATION
• COOPERATIVE QUERY ANSWERING

Fact Integration

TYPES OF FACT:

• GENERIC FACT
• FACT PEERS

IS-A RELATIONSHIP

FACT CONCILIATION PHASE:

• SOURCE PEER: PUBLISHES GENERIC FACT DEFINITION AND DIMENSIONAL STRUCTURE
• LISTENING PEER: GENERIC FACT AGREEMENT AND DIMENSION PEERS DEFINITION

Dimension Integration

CONSISTS IN:

• LEVEL HIERARCHY INTEGRATION
• MEMBER HIERARCHY INTEGRATION

COMPRISES:

• CORRESPONDENCE DEFINITION AMONG DIMENSION LEVELS
• REVISION/MAPPING DEFINITION AMONG DIMENSION INSTANCES

INVOLVES:

• A PAIR OF DIMENSION PEERS

Level Hierarchy Integration

LEVEL CORRESPONDENCE

• APPLIES ON SCHEMAS
• ESTABLISHES HOW A PAIR OF LEVELS ON DIFFERENT PEER DIMENSIONS ARE RELATED
• IS PRODUCED/UPDATED DURING A SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN CORRESPONDENCE TABLES

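Level Hierarchy Integration

ORDER PRESERVING LEVEL CORRESPONDENCE

[Figure: two peer level hierarchies linked by an order-preserving correspondence. Level labels: Benefit Type, Funding Class, All, Tax Discharge Category, Loan Type, All, Charity Modality, Benefit Type]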

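Level Hierarchy Integration

A LEVEL CORRESPONDENCE THAT DOES NOT PRESERVE ORDER IS NOT ADMISSIBLE

[Figure: two peer level hierarchies linked by a correspondence that does not preserve order, marked WRONG. Level labels: Benefit Type, Funding Class, All, Tax Discharge Category, Loan Types, All, Charity Modality, Benefit Type]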
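The admissibility condition above can be sketched as a small check. This is only an illustration under stated assumptions: the slides do not define "order preserving" formally, so the sketch assumes it means that whenever one local level rolls up (directly or transitively) to another and both have corresponding levels, their images are ordered the same way in the acquaintance's hierarchy. The level names `a`, `b`, `x`, `y` and the function names are hypothetical.

```python
from itertools import product


def transitive_closure(edges):
    """Return all (lower, higher) level pairs implied by direct roll-up edges."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_order_preserving(local_edges, acq_edges, correspondence):
    """Assumed reading: ordered local levels must map to equally ordered levels."""
    local_order = transitive_closure(local_edges)
    acq_order = transitive_closure(acq_edges)
    for lower, higher in local_order:
        if lower in correspondence and higher in correspondence:
            image_low = correspondence[lower]
            image_high = correspondence[higher]
            if image_low != image_high and (image_low, image_high) not in acq_order:
                return False
    return True


# Hypothetical level hierarchies, given as direct (child level, parent level) edges.
local_edges = [("a", "b"), ("b", "All")]
acq_edges = [("x", "y"), ("y", "All")]

admissible = {"a": "x", "b": "y", "All": "All"}
inadmissible = {"a": "y", "b": "x"}  # swaps the order of the two levels

print(is_order_preserving(local_edges, acq_edges, admissible))    # True
print(is_order_preserving(local_edges, acq_edges, inadmissible))  # False
```

In a real system the correspondence would be read from the correspondence tables mentioned earlier rather than from an in-memory dictionary.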

Member Hierarchy Integration

INTEGRATION BY MAPPING

• APPLIES ON INSTANCES
• ESTABLISHES HOW A PAIR OF MEMBERS OF CORRESPONDING LEVELS ARE RELATED
• IS PRODUCED/UPDATED DURING A MAPPING ACQUISITION PHASE
• MUST BE PRECEDED BY AT LEAST ONE SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN MAPPING TABLES

Member Hierarchy Integration

MAPPINGS: HOMOMORPHISM PROPERTY

For each member m of a level l such that map(l:m) is defined, if there exists some member m' of a level l' satisfying roll-up(l:m) = l':m', and level l' is in dom(Correspondence), then roll-up(map(l:m)) = map(l':m').

[Figure: diagram relating l:m and l':m' through two map arrows and two roll-up arrows. Other labels on the slide: l1: m1 (Local), l'1: m'1 (Peer), l2: m2 (Local), l'2: m'2 (Acq)]
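The homomorphism property can be checked mechanically. The following is a minimal sketch, assuming members are represented as `(level, name)` pairs, that each member has at most one direct roll-up parent, and that roll-ups and mappings are held in plain dictionaries; none of these representations come from the slides, and all names are hypothetical.

```python
def homomorphism_violations(rollup_src, rollup_dst, mapping, corresponding_levels):
    """Return mapped members for which roll-up(map(m)) differs from map(roll-up(m)).

    rollup_src / rollup_dst: member -> direct parent member in each peer.
    mapping: member of the source peer -> member of the destination peer.
    corresponding_levels: source levels that are in dom(Correspondence).
    """
    violations = []
    for member, image in mapping.items():
        parent = rollup_src.get(member)
        if parent is None:
            continue  # top of the hierarchy: nothing to check
        parent_level = parent[0]
        if parent_level not in corresponding_levels:
            continue  # the property only constrains corresponding levels
        expected = mapping.get(parent)   # map(l':m')
        actual = rollup_dst.get(image)   # roll-up(map(l:m))
        if expected is None or actual != expected:
            violations.append(member)
    return violations


# Hypothetical instance: m1 and m2 share a parent p in the source peer,
# but their images n1 and n2 have different parents in the destination peer.
rollup_src = {("l", "m1"): ("l2", "p"), ("l", "m2"): ("l2", "p")}
rollup_dst = {("k", "n1"): ("k2", "q1"), ("k", "n2"): ("k2", "q2")}
mapping = {
    ("l", "m1"): ("k", "n1"),
    ("l", "m2"): ("k", "n2"),
    ("l2", "p"): ("k2", "q1"),
}

print(homomorphism_violations(rollup_src, rollup_dst, mapping, {"l", "l2"}))
# [('l', 'm2')]
```

A non-empty result signals the kind of conflict that a mapping alone cannot resolve.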

Member Hierarchy Integration

HOMOMORPHISM MAY NOT ALWAYS BE GUARANTEED

[Figure: members l:m1 and l:m2 and member l':m', connected by map and roll-up arrows]

Member m' in level l' is conflicting; it cannot be mapped.

An approach based exclusively on mapping is not always effective.

Member Hierarchy Integration

MAPPINGS DO NOT SUFFICE: REVISIONS MAY BE NECESSARY

[Figure: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2 and l':m'; label: Conflicting Member]

REVISIONS AFFECT THE VIEW A PEER HAS OF THE HIERARCHY OF ITS ACQUAINTANCE ONLY

Member Hierarchy Integration

A REVISION BY SPLITTING MAY BE USED TO REPAIR CONFLICTS, GIVING WAY TO MAPPABLE MEMBERS

EXAMPLE OF A REVISION: CONFLICTING MEMBER SPLIT

[Figure: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, l:m1' and l':m2'; label: Non-Conflicting Members]
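One way to picture a revision by splitting is sketched below. The slides do not spell out the splitting rule, so this sketch assumes that a conflicting acquaintance member is split into one revised member per local parent reached by its children through the mapping. The function, the `#1`, `#2` naming of the parts, and the sample members are all hypothetical. In line with the slides, the acquaintance's original roll-ups are left untouched; only a revised view is produced.

```python
from collections import defaultdict


def split_conflicting_member(conflicting, acq_children, mapping, local_rollup):
    """Split `conflicting` into one revised member per local parent (assumed rule).

    acq_children: acquaintance member -> list of its child members.
    mapping: acquaintance child -> local member.
    local_rollup: local member -> its local parent.
    Returns (revised_rollup, revised_mapping) for the revised view only.
    """
    groups = defaultdict(list)
    for child in acq_children[conflicting]:
        local_parent = local_rollup[mapping[child]]
        groups[local_parent].append(child)

    revised_rollup = {}
    revised_mapping = {}
    for index, (local_parent, children) in enumerate(groups.items(), start=1):
        part = f"{conflicting}#{index}"
        for child in children:
            revised_rollup[child] = part      # child now rolls up to the split part
        revised_mapping[part] = local_parent  # each part is mappable
    return revised_rollup, revised_mapping


# Hypothetical data: the children of acquaintance member "mp" reach two
# different local parents (m1 and m2), so "mp" cannot be mapped as a whole.
acq_children = {"mp": ["c1", "c2", "c3"]}
mapping = {"c1": "d1", "c2": "d2", "c3": "d3"}
local_rollup = {"d1": "m1", "d2": "m1", "d3": "m2"}

revised_rollup, revised_mapping = split_conflicting_member(
    "mp", acq_children, mapping, local_rollup
)
print(revised_rollup)   # {'c1': 'mp#1', 'c2': 'mp#1', 'c3': 'mp#2'}
print(revised_mapping)  # {'mp#1': 'm1', 'mp#2': 'm2'}
```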

Member Hierarchy Integration

A REVISION BY RECLASSIFYING MAY BE AN ALTERNATIVE TO RESTORE HOMOMORPHISM

EXAMPLE OF A REVISION: CONFLICTING MEMBER RECLASSIFICATION

[Figure: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, l:m3, l:m' and l':m"; label: Non-Conflicting Members]

Member Hierarchy Integration

REVISE AND MAP APPROACH:

LOCAL PEER:
• PRODUCES AND BROADCASTS REVISION AND MAPPING DEFINITIONS TO POTENTIAL ACQUAINTANCES

ACQUAINTANCE:
• REVISES ITS OWN HIERARCHIES, PRODUCING A REVISED INSTANCE (REVISED ROLL-UPS) WITH RESPECT TO THE LOCAL PEER
• STORES INFORMATION ON MAPPINGS IN METADATA MAPPING TABLES

Member Hierarchy Integration

BOTTOM-UP COMPLETION APPROACH

Whenever some member m2' of a level l' is not mapped, a bottom-up completion approach for query answering is employed: information on non-mapped members and their roll-ups is stored in metadata completion tables.

[Figure: members l:m1, l:m2, l':m1' and l':m2' connected by map and roll-up arrows; labels: "Incomplete roll-up", and "Non-Mapped Member" next to l':m1']
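A completion table could be built along the following lines. This is a sketch under assumptions: the slides only say that non-mapped members and their roll-ups are stored, so the table layout, the idea of also recording the nearest mapped ancestor found by climbing the hierarchy bottom-up, and all names used here are hypothetical.

```python
def build_completion_table(rollup, mapping):
    """Record each non-mapped member with its roll-up and nearest mapped ancestor.

    rollup: acquaintance member -> direct parent (None at the top).
    mapping: acquaintance member -> local member, for mapped members only.
    """
    table = {}
    for member in rollup:
        if member in mapping:
            continue  # mapped members need no completion entry
        ancestor = rollup[member]
        # Climb bottom-up until a mapped ancestor (or the top) is reached.
        while ancestor is not None and ancestor not in mapping:
            ancestor = rollup.get(ancestor)
        table[member] = {
            "rolls_up_to": rollup[member],
            "nearest_mapped": ancestor,
        }
    return table


# Hypothetical acquaintance hierarchy: acq_m2 has no mapping to the local peer.
rollup = {"acq_m1": "acq_p", "acq_m2": "acq_p", "acq_p": None}
mapping = {"acq_m1": "m1", "acq_p": "p"}

completion_table = build_completion_table(rollup, mapping)
print(completion_table)
# {'acq_m2': {'rolls_up_to': 'acq_p', 'nearest_mapped': 'acq_p'}}
```

Data attached to `acq_m2` cannot be reported at its own level in the local view, which is one way a query answer can end up incomplete.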

P2P OLAP Queries

Syntactical Structure (Datalog Style):

query(Z1, ..., Zn, aggr(M), Set of Peers) ←
    Generic Fact(X1, ..., Xn, M),
    rollup dimension d1 from bottom level to desired level l1 (X1, Z1),
    ...,
    rollup dimension dn from bottom level to desired level ln (Xn, Zn);
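The meaning of the query form can be illustrated on a single peer. The sketch below assumes facts are tuples `(x1, ..., xn, measure)` and that each roll-up from the bottom level to the desired level is a dictionary; facts with no roll-up entry are dropped, as a join would do. The data and the function name are hypothetical, and the "Set of Peers" argument is not modeled here.

```python
from collections import defaultdict


def evaluate_query(facts, rollups, aggr=sum):
    """Group measures by rolled-up coordinates (Z1..Zn) and aggregate them.

    facts: iterable of tuples (x1, ..., xn, measure).
    rollups: list of n dicts, each mapping a bottom-level member to the
             member at the desired level of that dimension.
    """
    groups = defaultdict(list)
    for *coords, measure in facts:
        key = []
        for rollup, x in zip(rollups, coords):
            if x not in rollup:
                break  # no roll-up for this member: the fact does not join
            key.append(rollup[x])
        else:
            groups[tuple(key)].append(measure)
    return {key: aggr(values) for key, values in groups.items()}


# Hypothetical fact table with two dimensions and one measure.
facts = [("p1", "s1", 10), ("p2", "s1", 5), ("p3", "s2", 7), ("p9", "s1", 1)]
rollups = [
    {"p1": "A", "p2": "A", "p3": "B"},   # dimension d1: bottom level -> l1
    {"s1": "North", "s2": "North"},      # dimension d2: bottom level -> l2
]

result = evaluate_query(facts, rollups)
print(result)  # {('A', 'North'): 15, ('B', 'North'): 7}
```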

Query Evaluation Process

• GENERATES A QUERY FOR EACH RELEVANT PEER (INCLUDING THE LOCAL PEER);
• GENERATED QUERIES ARE PROPAGATED TO RELEVANT PEERS;
• QUERIES FOR RELEVANT PEERS STEM FROM THE REWRITING OF THE SUBMITTED P2P OLAP QUERY;
• THE REWRITING PROCESS INTRODUCES REFERENCES TO FACT PEERS, REVISED ROLL-UPS, AND MAPPING AND COMPLETION TABLES;
• RESULTS OF PROPAGATED QUERIES ARE COLLECTED AND AGGREGATED LOCALLY TO PRODUCE THE FINAL QUERY ANSWER;
• QUERY ANSWERS MAY BE UNCERTAIN AND INCOMPLETE DUE TO BOTTOM-UP COMPLETION.

Query Processing

[Figure: query processing flow between a Local Peer and a Relevant Peer. Step labels: QUERY, Rewriting, Evaluation, Partial Result, Integration, Answer. Data labels: Fact tables, Revised Rollups, Metadata, Mapping tables, Completion tables]
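The final integration step, where partial results are collected and aggregated locally, can be sketched as follows. The sketch assumes a SUM aggregate (other aggregates, such as averages, would need more than a per-peer value), that each peer returns its groups already expressed in the local peer's members, and that each peer reports a flag saying whether bottom-up completion made its result incomplete. The data structures and names are hypothetical.

```python
from collections import defaultdict


def integrate(partial_results):
    """Combine per-peer SUM results into one answer.

    partial_results: list of (groups, incomplete) pairs, where groups maps a
    tuple of local-level members to a partial sum and incomplete is a bool.
    Returns (answer, incomplete) for the whole query.
    """
    answer = defaultdict(int)
    incomplete = False
    for groups, peer_incomplete in partial_results:
        for key, value in groups.items():
            answer[key] += value
        incomplete = incomplete or peer_incomplete
    return dict(answer), incomplete


# Hypothetical partial results: one from the local peer, one from an
# acquaintance whose result was affected by a non-mapped member.
local_result = ({("A",): 15, ("B",): 7}, False)
acquaintance_result = ({("A",): 3}, True)

answer, is_incomplete = integrate([local_result, acquaintance_result])
print(answer)         # {('A',): 18, ('B',): 7}
print(is_incomplete)  # True
```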

CONCLUSIONS: MAIN POINTS DISCUSSED

• GENERIC FACTS
• FACT CONCILIATION PHASE
• HIERARCHY LEVEL CORRESPONDENCE
• SCHEMA CONCILIATION PHASE
• REVISE AND MAP APPROACH
• BOTTOM-UP COMPLETION
• MAPPING ACQUISITION PHASE
• P2P OLAP QUERIES
• QUERY REWRITING AND EVALUATION
