Distributed Database Management Systems

Reading

Textbook: Ch. 4

Farkas, CSCE 824 - Spring 2011

Design Issues

Placing of data and programs (DBMS and application)
Network issues


Level of Sharing

No sharing
Data sharing
Data and program sharing


Heterogeneous environment!

Top-Down Design

Global conceptual schema distribution
– Fragmentation
– Replication
– Allocation
Figure 3.2


Correctness of Fragmentation

1. Completeness: given the fragmentation $F_R = \{R_1, \dots, R_n\}$ of relation $R$, every data item in $R$ appears in at least one fragment $R_i$
2. Reconstruction: $R = \nabla R_i,\ \forall R_i \in F_R$, for some relational operator $\nabla$ (e.g., union for horizontal and join for vertical fragmentation)
3. Disjointness:
– Horizontal: there is no data item $d_j \in R_i$ such that $d_j \in R_k$, where $k \neq i$
– Vertical: same as horizontal for non-primary key attributes

1&2: Lossless-join (normalization)
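The three correctness rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, not taken from the slides: the relation `EMP`, its rows, and the helper names are hypothetical, rows are modeled as tuples, and fragments as Python sets.

```python
from itertools import combinations

# Hypothetical relation EMP(emp_id, name, city), modeled as a set of tuples.
EMP = {
    (1, "Ann", "Columbia"),
    (2, "Bob", "Charleston"),
    (3, "Cho", "Columbia"),
}


def fragment(relation, predicates):
    """Horizontal fragmentation: one fragment per selection predicate."""
    return [{row for row in relation if pred(row)} for pred in predicates]


def is_complete(relation, fragments):
    """Completeness: every row of the relation is in at least one fragment."""
    return all(any(row in frag for frag in fragments) for row in relation)


def reconstructs(relation, fragments):
    """Reconstruction: the union of the fragments gives back the relation."""
    return set().union(*fragments) == relation


def is_disjoint(fragments):
    """Disjointness: no row appears in two different fragments."""
    return all(a.isdisjoint(b) for a, b in combinations(fragments, 2))


# Predicates that partition EMP by city.
good = fragment(EMP, [lambda r: r[2] == "Columbia",
                      lambda r: r[2] != "Columbia"])

# Overlapping predicates that also miss the row for Bob.
bad = fragment(EMP, [lambda r: r[2] == "Columbia",
                     lambda r: r[0] >= 3])

print(is_complete(EMP, good), reconstructs(EMP, good), is_disjoint(good))
print(is_complete(EMP, bad), reconstructs(EMP, bad), is_disjoint(bad))
```

For vertical fragmentation the reconstruction operator would be a join on the primary key rather than a union, and the disjointness check would apply to the non-key attributes only.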

Data Directory

Global vs. local conceptual schemas
– How to search?
– Where to store?
– Single vs. multiple copies?


Current Research

Allocation: new requirements, technology, etc.
Where to store the fragments?
Dynamic environment
– Usage pattern
– Application characteristics
– Network changes
– Security


Bottom-Up Approach

Multi-database systems
How to integrate them into 1 database?
– Interoperability


Database Integration

Physical integration
– Materialized database: data warehouses
– Extract-transform-load (ETL) tools
Logical integration
– Virtual (not materialized) integration
– Enterprise Information Integration


Data Warehouses

On-line Analytical Processing (OLAP) applications:
– Decision support systems
– Trend analysis and forecasting
Complex queries, large databases
Materialized view maintenance
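To make the maintenance problem concrete, here is a minimal sketch of incremental maintenance for one materialized aggregate view. It is an illustration under assumptions, not a method from the slides: the view (total sales per region), the class `SalesByRegionView`, and the sample rows are all hypothetical.

```python
class SalesByRegionView:
    """Materialized SUM(amount) GROUP BY region, maintained incrementally."""

    def __init__(self, rows):
        self.total = {}   # region -> sum of amounts
        self.count = {}   # region -> number of base rows in the group
        for region, amount in rows:
            self.on_insert(region, amount)

    def on_insert(self, region, amount):
        """Apply one inserted base row to the stored view."""
        self.total[region] = self.total.get(region, 0) + amount
        self.count[region] = self.count.get(region, 0) + 1

    def on_delete(self, region, amount):
        """Apply one deleted base row; drop the group when it becomes empty."""
        self.total[region] -= amount
        self.count[region] -= 1
        if self.count[region] == 0:
            del self.total[region], self.count[region]


def recompute(rows):
    """Reference result: rebuild the view from scratch."""
    out = {}
    for region, amount in rows:
        out[region] = out.get(region, 0) + amount
    return out


base = [("east", 10), ("west", 5)]
view = SalesByRegionView(base)

# Every change to the base data is propagated to the view as a delta.
base.append(("east", 7))
view.on_insert("east", 7)
base.remove(("west", 5))
view.on_delete("west", 5)

print(view.total == recompute(base))
```

The per-group row count is kept so that a group disappears from the view when its last base row is deleted, which is what a full recomputation would produce.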


Logical Integration

No materialized global database
Virtual integration: data remains at the local (operational) databases
Global conceptual schema may not contain everything from local schemas
Autonomous and heterogeneous local systems


Bottom-Up Design

Global Conceptual Schema (GCS or mediated schema)
– Defined first: local conceptual schemas (LCSs) are mapped to the GCS
– Defined during the integration of the LCSs, together with the corresponding mappings from the LCSs to the GCS


GCS Defined First

Local-as-view (LAV) systems
– Each LCS is treated as a view over the GCS
– Query results: constrained to the objects in the local DBs, while the GCS definition may be richer
– Potentially incomplete answers
Global-as-view (GAV) systems
– GCS is defined as a set of views over the LCSs
– View definition defines how to derive elements of the GCS
– Query results: constrained to the GCS, while the local DBs might be richer
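The difference between the two approaches can be sketched with toy data. Everything below is an assumption made for illustration (the source names, attributes, and rows are hypothetical), and the LAV part is deliberately simplified: real LAV systems rewrite queries using the view definitions rather than filtering on a declared set of cities.

```python
# --- GAV: the global relation is a view over the local sources. ---
LOCAL_HR = [("Ann", "R&D", "B12"), ("Bob", "Sales", "C03")]  # (name, dept, office)
LOCAL_PAY = [("Ann", 90), ("Bob", 70), ("Cho", 65)]          # (name, salary)


def global_employee():
    """Global Employee(name, dept, salary), defined as a join of the sources.

    The local 'office' attribute is not exposed: the local DBs are richer
    than what the GCS lets a query see.
    """
    pay = dict(LOCAL_PAY)
    return [(name, dept, pay[name])
            for name, dept, _office in LOCAL_HR if name in pay]


# --- LAV: each source is described as a view over the GCS. ---
# Hypothetical description: this source holds Employee(name, city) rows
# for the listed cities only.
LAV_SOURCES = [
    ({"Columbia"}, [("Ann", "Columbia"), ("Cho", "Columbia")]),
]


def lav_employees_in(city):
    """Answer a global query using only the sources that can contribute."""
    answers = []
    for covered_cities, rows in LAV_SOURCES:
        if city in covered_cities:
            answers.extend(row for row in rows if row[1] == city)
    return answers


print(global_employee())
print(lav_employees_in("Columbia"))
print(lav_employees_in("Charleston"))  # GCS allows the query, no source covers it
```

The last call illustrates the incompleteness risk under LAV: the global schema can express a question about Charleston, but the answer is limited to what the described sources actually hold.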


Design Tasks

Schema translation
Schema generation
Figure 4.3


Intermediate Canonical Representation

Expressive enough to incorporate all concepts in the local databases
Simple, intuitive, practical, etc.
Example: E/R model, relational model, graph/tree models, etc.
Tools


Schema Generation

Schema matching: syntax and semantics
Integration of common schema elements
Schema mapping
See examples 4.1, 4.2


Schema Matching

Defined or discovered (e.g., web data)
Rules:
– Correspondence between 2 elements
– Predicate stating whether the correspondence holds or not
– Similarity value between the 2 elements
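A match rule with these three parts can be sketched as a small record plus a scoring function. This is an illustrative assumption, not an algorithm from the slides or the textbook: the `Match` record, the name-based similarity, and the 0.6 threshold are hypothetical choices, and the score is purely syntactic (it uses the standard-library `difflib.SequenceMatcher`), so it cannot recognize synonyms.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Match:
    source: str        # element of the first schema
    target: str        # corresponding element of the second schema
    similarity: float  # similarity value in [0, 1]
    holds: bool        # predicate: is the correspondence accepted?


def name_similarity(a, b):
    """Case-insensitive string similarity between two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schemas(source_attrs, target_attrs, threshold=0.6):
    """For each source element, propose its most similar target element."""
    matches = []
    for s in source_attrs:
        best = max(target_attrs, key=lambda t: name_similarity(s, t))
        sim = name_similarity(s, best)
        matches.append(Match(s, best, sim, sim >= threshold))
    return matches


for m in match_schemas(["EmpName", "Sal"], ["empname", "salary", "dept"]):
    print(m)
```

Because the proposal is one target per source element, this sketch only produces one-to-one candidates; a human reviewer or a semantic matcher would still have to confirm or reject each proposed correspondence.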


Finding Correspondence

Difficult process due to schema heterogeneity
Can be automated?
– Insufficient schema and instance information
– Unavailability of schema documentation
– Subjectivity of matching

Matching Algorithm Issues

Schema vs. instance matching
– Concept match
– Data instance: semantic inconsistencies
Element-level vs. structure-level mapping
– Element name semantics
– Multiple attribute mapping?
Matching cardinality
– One-to-one, one-to-many, many-to-many


Semantic Schema Heterogeneity

Semantic: meaning, interpretation, and intended use of data
– Synonyms, homonyms, hypernyms
– Different ontologies
– Imprecise wording


Structural Schema Heterogeneity

– Type conflict: attribute vs. entity
– Dependency conflict: mapping cardinality inconsistencies
– Key conflict: different primary keys
– Behavioral conflict: modeling assumptions, e.g., referential integrity, deletion, etc.


Schema Integration

Binary
N-ary


Schema Mapping

How the data from local databases can be mapped to the GCS
Mapping creation
Mapping maintenance


Mapping Creation

Input: LCS, GCS, M (schema matches)
Output: $Q = \{Q_1, \dots, Q_k\}$ such that
– $DB_{GCS} = Q(DB_{LCS})$
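One way to read the output condition is that each query in Q derives one global relation from the local database, and applying all of them yields the global database instance. The sketch below illustrates that reading under assumptions: the local relations `staff` and `depts`, the global relations `Employee` and `Department`, and the two queries are hypothetical, and the queries are written by hand rather than generated from schema matches.

```python
# Hypothetical local database (an instance of the LCS).
LOCAL_DB = {
    "staff": [{"ename": "Ann", "dno": 10}, {"ename": "Bob", "dno": 20}],
    "depts": [{"dno": 10, "dname": "R&D"}, {"dno": 20, "dname": "Sales"}],
}


def q_employee(db):
    """Q_1: derive global Employee(name, department) by joining staff and depts."""
    dept_name = {d["dno"]: d["dname"] for d in db["depts"]}
    return [{"name": s["ename"], "department": dept_name[s["dno"]]}
            for s in db["staff"]]


def q_department(db):
    """Q_2: derive global Department(name) by renaming depts.dname."""
    return [{"name": d["dname"]} for d in db["depts"]]


# Q = {Q_1, ..., Q_k}: one query per global relation.
Q = {"Employee": q_employee, "Department": q_department}


def apply_mapping(queries, db):
    """Compute DB_GCS = Q(DB_LCS) by running every mapping query."""
    return {relation: query(db) for relation, query in queries.items()}


DB_GCS = apply_mapping(Q, LOCAL_DB)
print(DB_GCS)
```

In a logical (virtual) integration the result would not be stored; the same queries would instead be used to translate global queries into local ones at run time.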


Security Objectives

Confidentiality
Integrity
Availability


Question 1

How do distributed databases impact the security objectives?
– Confidentiality in traditional vs. distributed DBs
– Integrity in traditional vs. distributed DBs
– Availability in traditional vs. distributed DBs


Integrity

Correctness criteria
– Top-down design
– Bottom-up design


Availability

What are the issues related to availability when dealing with
– Top-down design
– Bottom-up design


Confidentiality

(will be covered in 2nd part of semester but…)
Centralized vs. distributed security policy
– Top-down design
– Bottom-up design


Next Class

Semantics-based Database Integration