DISTRIBUTED DATABASES AND DDBMS

Upload: baldwin-black

Post on 24-Dec-2015


Learning Objectives

• Understand the concept of “Distributed Data”
• Describe various Distributed Data and DDBMS implementations
• Explain how database design affects the DDBMS environment
• Apply DDBMS principles to solve problems

Definitions

Distributed Database: A single logical database that is spread physically across computers in multiple locations that are connected by a data communications link

Decentralized Database: A collection of independent databases on non-networked computers

They are not the same thing!

What are we talking about here?

Key Questions:
• Are components of the application in more than one place?
• Are the data in more than one place?
• Does the app use more than one DBMS or “system” for data management?
• Which facets, if any, are transparent to users?

Why distribute your app or data? It’s hard. It’s complex. So why do it?

Scalability. Redundancy.

Application Complexity

Monolithic

Everything works / is contained within one computer.

Ex. MS Word

Distributed

Various working pieces are in different physical places, working over a computer network.

Ex. Google Docs

Data Distribution

Single Site Data (Simple)

All data stored in / retrieved from one place on a network.

Ex. WordPress

Multi-Site Data (Complex)

Various parts of the data come from various sites on a network.

Ex. My Slice, DNS

Data Complexity

Homogeneous (Easier)

All data associated with the application is stored in the same DBMS

Ex. WordPress

Heterogeneous (More Difficult)

Various data components of the application are stored in different DBMSes

Ex. SU Blackboard, Facebook

Multisite Data DBMS Options

• Horizontal Partitioning: distributing data by row
• Vertical Partitioning: distributing data by table or column
• Replication: copying data either on a schedule or in real time
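To make the difference between splitting by row and splitting by column concrete, here is a minimal Python sketch. The sample rows, column names, and both function names are illustrative assumptions, not part of the original slides.

```python
# Hypothetical sample rows; the column names are illustrative assumptions.
customers = [
    {"id": 1, "name": "Ada", "region": "NA", "credit_limit": 5000},
    {"id": 2, "name": "Bo", "region": "EU", "credit_limit": 3000},
    {"id": 3, "name": "Cy", "region": "NA", "credit_limit": 1000},
]


def horizontal_partition(rows, key):
    """Split by row: whole rows are grouped by the value of one column."""
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts


def vertical_partition(rows, pk, column_groups):
    """Split by column: each fragment keeps the primary key plus some columns."""
    return {
        name: [{pk: row[pk], **{c: row[c] for c in cols}} for row in rows]
        for name, cols in column_groups.items()
    }


by_region = horizontal_partition(customers, "region")
by_column = vertical_partition(
    customers, "id", {"profile": ["name", "region"], "finance": ["credit_limit"]}
)
```

Each horizontal fragment holds complete rows for one region, while each vertical fragment holds every row but only some columns. Keeping the primary key in every vertical fragment is what allows the full row to be reassembled later.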

Summary: The taxonomy

App: Monolithic or Distributed
Data: Single Site or Multi Site
DBMS: Homogeneous or Heterogeneous
Multi Site: Horizontally Partitioned, Vertically Partitioned, or Replicated

Homogeneous == Same DBMS

User’s View of Db
CRM Db
• Customers
• Sales Staff
• Orders

Actual Implementation
N. America (Oracle)
• Customers
• Sales Staff
Europe (Oracle)
• Orders

Heterogeneous == Multiple DBMS

User’s View of Db
CRM Db
• Customers
• Sales Staff
• Orders

Actual Implementation
N. America (Oracle)
• Customers
• Sales Staff
Europe (MySQL)
• Orders
Europe (File System)
• Orders Invoices

Example of Replication

User’s View of Db
CRM Db
• Customers
• Sales Staff
• Orders

Actual Implementation
N. America (Master)
• All Customers
• All Sales Staff
• All Orders
Europe (Replica)
• All Customers
• All Sales Staff
• All Orders
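The master/replica arrangement can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Site` and `Master` classes, the `real_time` flag, and the table names are hypothetical, and a real DBMS would ship changes over the network rather than copy in-memory dictionaries.

```python
import copy


class Site:
    """A hypothetical site holding a full copy of the data, keyed by table name."""

    def __init__(self, name):
        self.name = name
        self.tables = {"customers": {}, "orders": {}}


class Master(Site):
    """Accepts writes and pushes them to replicas, immediately or on demand."""

    def __init__(self, name, replicas, real_time=True):
        super().__init__(name)
        self.replicas = replicas
        self.real_time = real_time

    def write(self, table, key, row):
        self.tables[table][key] = row
        if self.real_time:
            # Real-time replication: every write is copied to every replica.
            for replica in self.replicas:
                replica.tables[table][key] = dict(row)

    def synchronize(self):
        """Scheduled or on-demand sync: overwrite each replica with a full copy."""
        for replica in self.replicas:
            replica.tables = copy.deepcopy(self.tables)


europe = Site("Europe")
north_america = Master("N. America", [europe], real_time=False)
north_america.write("orders", 1, {"customer": "Ada", "total": 40})
stale = dict(europe.tables["orders"])  # the replica has not seen the write yet
north_america.synchronize()
```

With `real_time=False` the replica lags behind the master until `synchronize()` runs. That lag is the trade-off between copying on a schedule and copying in real time.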

Example of Horizontal Partitioning

User’s View of Db
CRM Db
• Customers
• Sales Staff
• Orders

Actual Implementation
N. America
• NA Customers
• NA Sales Staff
• NA Orders
Europe
• E Customers
• E Sales Staff
• E Orders
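One way to present the regional partitions above as a single CRM Db is a thin routing layer. The sketch below uses Python's built-in `sqlite3` module, with two in-memory databases standing in for the N. America and Europe sites. The `CrmDb` class, the `make_site` helper, the table layout, and the sample rows are assumptions for illustration.

```python
import sqlite3


def make_site(rows):
    """Create one in-memory database standing in for a regional site."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


class CrmDb:
    """Presents the regional partitions as one logical customers table."""

    def __init__(self, sites):
        self.sites = sites  # region code -> connection

    def customers(self, region=None):
        # With a region, only that partition is read; otherwise all are combined.
        targets = [self.sites[region]] if region else self.sites.values()
        rows = []
        for conn in targets:
            rows.extend(conn.execute("SELECT id, name, region FROM customers"))
        return sorted(rows)


crm = CrmDb({
    "NA": make_site([(1, "Ada", "NA"), (3, "Cy", "NA")]),
    "E": make_site([(2, "Bo", "E")]),
})
all_customers = crm.customers()
european = crm.customers("E")
```

The caller asks `crm` for customers and never names a site, which is the user's view. Which partition actually answers is decided inside `CrmDb`.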

Example of Vertical Partitioning

User’s View of Db
ERP System
• Financials
• Customer Service
• Prod. Support
• Human Resources

Actual Implementation
N. America
• Financials
• Human Resources
Europe
• Customer Service
• Prod. Support

5 Typical Distributed Databases

• Centralized with Single Site Data
• Replicated with Snapshots (in real time)
• Replicated with Synchronization (on demand, or a schedule)
• Integrated Partitions (Partitioning in data center)
• Independent Partitions (Geographically distributed partitioning)

Transparency

• Location Transparency: User/application does not need to know where data resides
• Replication Transparency: User/application does not need to know about duplication of data
• Failure Transparency: Either all or none of the actions of a transaction are committed

Transparency is difficult but important. The more widely the data is distributed, the greater the need for transparency to offset the complexity.
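Failure transparency, where either all or none of a transaction's actions are committed, is often achieved with a prepare-then-commit exchange between sites. The slides do not name a protocol, so the sketch below is a simplified two-phase-commit-style illustration and not the method of any particular DBMS. `Participant`, `run_transaction`, and the sample keys are hypothetical names.

```python
class Participant:
    """A hypothetical site that stages changes before making them visible."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.data = {}
        self.staged = {}

    def prepare(self, changes):
        """Phase 1: stage the changes and vote on whether they can be committed."""
        if self.fail_on_prepare:
            return False
        self.staged = dict(changes)
        return True

    def commit(self):
        """Phase 2: make the staged changes visible."""
        self.data.update(self.staged)
        self.staged = {}

    def rollback(self):
        """Phase 2 (abort): discard the staged changes."""
        self.staged = {}


def run_transaction(work):
    """Commit at every site only if every site votes yes; otherwise commit nowhere."""
    if all(site.prepare(changes) for site, changes in work):
        for site, _ in work:
            site.commit()
        return True
    for site, _ in work:
        site.rollback()
    return False


na = Participant("N. America")
eu = Participant("Europe", fail_on_prepare=True)
ok = run_transaction([(na, {"order:1": "paid"}), (eu, {"invoice:1": "sent"})])
```

Because the Europe site refuses to prepare, nothing is committed at N. America either. A production protocol would also need durable logs and timeouts to cope with a site crashing between the two phases, which this sketch leaves out.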

Applying The Concepts Via Example:

• Monolithic or Distributed?
• Single Site or Multi Site data?
• If multi-site: H / V Partitioned or Replicated? Homogeneous or Heterogeneous?
• Location Transparency? Replication Transparency? Failure Transparency?

DISTRIBUTED DATABASES AND DDBMS

Questions?