Distributed Database

Upload: berniece-brooks

Post on 04-Jan-2016


Introduction

 A major motivation behind the development of database systems is the desire to integrate the operational data of an organization and to provide controlled access to the data. Although integration and controlled access may imply centralization, this is not the intention.

In fact, the development of computer networks promotes a decentralized mode of work. This decentralized approach mirrors the organizational structure of many companies, which are logically distributed into divisions, departments, projects, and so on, and physically distributed into offices, plants, and factories, where each unit maintains its own operational data. A distributed database system that reflects this organizational structure should improve both the shareability of the data and the efficiency of data access: it makes the data in all units accessible and stores data close to the location where it is most frequently used.


Distributed DBMS

 A distributed DBMS is the software system that permits the management of the distributed database and makes the distribution transparent to users.

 A Distributed Database Management System (DDBMS) consists of a single logical database that is split into a number of fragments. Each fragment is stored on one or more computers under the control of a separate DBMS, with the computers connected by a communications network. Each site is capable of independently processing user requests that require access to local data and is also capable of processing data stored on other computers in the network.

Users access the distributed database via applications. Applications are classified as those that do not require data from other sites (local applications) and those that do require data from other sites (global applications). We require a DDBMS to have at least one global application.
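The distinction between local and global applications can be sketched in a few lines of Python. This is only an illustration: the catalog `FRAGMENT_SITE`, the fragment names, the site names, and the function `classify_application` are all hypothetical and do not come from any particular DDBMS.

```python
# Hypothetical catalog mapping each fragment to the site that stores it.
FRAGMENT_SITE = {
    "f1": "site_A",
    "f2": "site_B",
    "f3": "site_C",
}

def classify_application(home_site, fragments_needed):
    """Return 'local' if every needed fragment is stored at home_site,
    otherwise 'global' (data from another site is required)."""
    remote = [f for f in fragments_needed if FRAGMENT_SITE[f] != home_site]
    return "global" if remote else "local"

print(classify_application("site_A", ["f1"]))        # needs only local data
print(classify_application("site_A", ["f1", "f2"]))  # also needs data from site_B
```

Under this sketch, an application becomes global as soon as a single fragment it needs is stored elsewhere.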


Banking Example

 

Using distributed database technology, a bank may implement its database system on a number of separate computer systems rather than a single, centralized mainframe. The computer systems may be located at each local branch office: for example, Amritsar, Patiala, and Qadian. A network linking the computers enables the branches to communicate with each other, and a DDBMS enables them to access data stored at another branch office. Thus, a client living in Amritsar can also check his or her account while staying in Patiala or Qadian.

 


Distributed Relational Database Design

 In this section we examine the factors that have to be considered for the design of a distributed relational database. More specifically, we examine:

 

      Fragmentation

A relation may be divided into a number of subrelations, called fragments, which are then distributed.

 

There are two main types of fragmentation:

1) Horizontal fragmentation

2) Vertical fragmentation


      Allocation

Each fragment is stored at the site with ‘optimal’ distribution.

      Replication

The DDBMS may maintain a copy of a fragment at several different sites.

 The definition and allocation of fragments must be based on how the database is to be used. This involves analyzing transactions. The design should be based on both quantitative and qualitative information: quantitative information is used in allocation, and qualitative information is used in fragmentation.

The quantitative information may include:

      The frequency with which a transaction is run.

      The site from which a transaction is run.

      The performance criteria for transactions.
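As a rough illustration of how such quantitative information might feed into allocation, the sketch below places each fragment at the site that accesses it most often. This is a deliberately simple heuristic chosen for illustration, not a method prescribed by these slides: the workload figures, the fragment names `F1` and `F2`, and the function `allocate_by_frequency` are hypothetical, and performance criteria are ignored.

```python
from collections import defaultdict

# Hypothetical workload: (transaction, originating site, fragment accessed, runs per day).
workload = [
    ("T1", "Amritsar", "F1", 50),
    ("T2", "Patiala", "F1", 10),
    ("T3", "Patiala", "F2", 40),
    ("T4", "Qadian", "F2", 5),
]

def allocate_by_frequency(workload):
    """Place each fragment at the site with the highest total access frequency."""
    totals = defaultdict(lambda: defaultdict(int))
    for _txn, site, fragment, freq in workload:
        totals[fragment][site] += freq
    return {frag: max(by_site, key=by_site.get) for frag, by_site in totals.items()}

allocation = allocate_by_frequency(workload)
print(allocation)
```

With these made-up numbers, `F1` is placed at Amritsar (50 runs versus 10) and `F2` at Patiala (40 runs versus 5).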


The qualitative information may include information about the transactions that are executed. The definition and allocation of fragments should aim to achieve the following objectives:

•Locality of reference

•Improved reliability and availability

•Acceptable performance

•Balanced storage capacities and costs

•Minimal communication costs


Data Allocation

There are four alternative strategies regarding the placement of data:

      Centralized

      Fragmented

      Complete replication

      Selective replication.

We now compare these strategies using the strategic objectives identified above.


Centralized

This strategy consists of a single database and DBMS stored at one site with users distributed across the network (we referred to this previously as distributed processing). Locality of reference is at its lowest as all sites, except the central site, have to use the network for all data accesses. This also means that communication costs are high. Reliability and availability are low, as a failure of the central site results in the loss of the entire database system.

 Fragmented (or partitioned)

This strategy partitions the database into disjoint fragments, with each fragment assigned to one site. If data items are located at the site where they are used most frequently, locality of reference is high. As there is no replication, storage costs are low. Reliability and availability are also low, although higher than in the centralized case, because the failure of a site results in the loss of only that site’s data. Performance should be good and communication costs low if the distribution is designed properly.


Advantages of fragmentation

•Usage

•Efficiency

•Parallelism

•Security

Disadvantages of fragmentation

•Performance

•Integrity


Data Fragmentation

 If relation r is fragmented, r is divided into a number of fragments r1, r2, …, rn. These fragments contain sufficient information to allow reconstruction of the original relation r. As we shall see, this reconstruction can take place through the application of either the union operation or a special type of join operation on the various fragments.

 There are three different schemes for fragmenting a relation:

       Horizontal fragmentation

      Vertical fragmentation

      Mixed fragmentation

 We shall illustrate these approaches by fragmenting the relation EMP, with schema:

EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)


Horizontal Fragmentation

 In horizontal fragmentation, the relations (tables) are divided horizontally. That is, some of the tuples of the relation are placed on one computer and the rest are placed on other computers.

 

A horizontal fragment is a subset of the tuples in that relation.

 

To reconstruct the relation r from its horizontal fragments, a UNION operation is performed on the fragments. A set of horizontal fragments that together contain all the tuples of r is called a complete horizontal fragmentation.


For example, suppose that the relation r is the EMP relation above. This relation can be divided into n fragments, each consisting of the tuples of employees belonging to a particular department. If EMP has three departments (10, 20 and 30), this results in three fragments:

$EMP1 = \sigma_{DEPTNO=10}(EMP)$

$EMP2 = \sigma_{DEPTNO=20}(EMP)$

$EMP3 = \sigma_{DEPTNO=30}(EMP)$

 Fragment EMP1 is stored at the department 10 site, fragment EMP2 at the department 20 site, and fragment EMP3 at the department 30 site.

[Figure: the three horizontal fragments of EMP]

 


We obtain the reconstruction of the relation r by taking the union of all fragments; that is,

$r = r_1 \cup r_2 \cup \dots \cup r_n$
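Horizontal fragmentation by department and reconstruction by union can be sketched as follows. The rows are illustrative sample data, and only a few columns of the EMP schema are used to keep the example short; the helper names `horizontal_fragment` and `union` are assumptions made for this sketch.

```python
# Illustrative sample rows; the column names are a subset of the EMP schema.
EMP = [
    {"EMPNO": 7782, "ENAME": "CLARK", "JOB": "MANAGER", "DEPTNO": 10},
    {"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 20},
    {"EMPNO": 7499, "ENAME": "ALLEN", "JOB": "SALESMAN", "DEPTNO": 30},
    {"EMPNO": 7934, "ENAME": "MILLER", "JOB": "CLERK", "DEPTNO": 10},
]

def horizontal_fragment(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

EMP1 = horizontal_fragment(EMP, lambda t: t["DEPTNO"] == 10)
EMP2 = horizontal_fragment(EMP, lambda t: t["DEPTNO"] == 20)
EMP3 = horizontal_fragment(EMP, lambda t: t["DEPTNO"] == 30)

def union(*fragments):
    """Set union of fragments: concatenate and drop duplicate tuples."""
    seen = set()
    result = []
    for fragment in fragments:
        for row in fragment:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                result.append(row)
    return result

reconstructed = union(EMP1, EMP2, EMP3)
by_empno = lambda t: t["EMPNO"]
print(sorted(reconstructed, key=by_empno) == sorted(EMP, key=by_empno))
```

Because every tuple has exactly one DEPTNO value, the three fragments are disjoint and their union contains every tuple of EMP.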


Vertical Fragmentation

 In vertical fragmentation, some of the columns (attributes) are stored on one computer and the rest are stored on other computers. This is because each site may not need all the attributes of a relation.

 A vertical fragment keeps only certain attributes of the relation.

 

The fragmentation should be done such that we can reconstruct relation r from the fragments by taking the natural join:

 

$r = r_1 \bowtie r_2 \bowtie r_3 \bowtie \dots \bowtie r_n$
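A minimal sketch of vertical fragmentation and reconstruction by natural join is given below. It assumes that every vertical fragment carries the key attribute EMPNO, so that the join can match the pieces of each tuple. The sample rows, the particular split of attributes, and the helper names `project` and `natural_join` are illustrative assumptions.

```python
# Illustrative sample rows; the column names are a subset of the EMP schema.
EMP = [
    {"EMPNO": 7782, "ENAME": "CLARK", "JOB": "MANAGER", "DEPTNO": 10, "SAL": 2450, "COMM": None},
    {"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 20, "SAL": 800, "COMM": None},
    {"EMPNO": 7499, "ENAME": "ALLEN", "JOB": "SALESMAN", "DEPTNO": 30, "SAL": 1600, "COMM": 300},
]

def project(relation, attrs):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

# Assumed split: both fragments keep the key EMPNO.
EMP1 = project(EMP, ["EMPNO", "ENAME", "JOB", "DEPTNO"])
EMP2 = project(EMP, ["EMPNO", "SAL", "COMM"])

rebuilt = natural_join(EMP1, EMP2)
by_empno = lambda t: t["EMPNO"]
print(sorted(rebuilt, key=by_empno) == sorted(EMP, key=by_empno))
```

If a fragment omitted EMPNO, the two fragments would share no attribute and the join would degenerate into a Cartesian product, so the original relation could not be recovered.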

 

 


Mixed Fragmentation

 Mixed fragmentation, also known as Hybrid fragmentation, intermixes the horizontal and vertical fragmentation.

The relation r is divided into a number of fragment relations r1, r2, …, rn. Each fragment is obtained by applying either the horizontal or the vertical fragmentation scheme to relation r, or to a fragment of r that was obtained previously.

For example, combining horizontal and vertical fragmentation of the EMP relation results in a mixed fragmentation. The relation is first divided into the vertical fragments EMP1 and EMP2. We can then further divide fragment EMP1, using the horizontal-fragmentation scheme, into the following three fragments:

$EMP1a = \sigma_{DEPTNO=10}(EMP1)$

$EMP2a = \sigma_{DEPTNO=20}(EMP1)$

$EMP3a = \sigma_{DEPTNO=30}(EMP1)$
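Mixed fragmentation can be sketched by chaining the two schemes. The sketch assumes that EMP is first split vertically into EMP1 and EMP2, that both keep the key EMPNO, that EMP1 also keeps DEPTNO, and that all three selections are then applied to EMP1. The sample rows, the attribute split, and the helper names are illustrative.

```python
# Illustrative sample rows; the column names are a subset of the EMP schema.
EMP = [
    {"EMPNO": 7782, "ENAME": "CLARK", "DEPTNO": 10, "SAL": 2450},
    {"EMPNO": 7369, "ENAME": "SMITH", "DEPTNO": 20, "SAL": 800},
    {"EMPNO": 7499, "ENAME": "ALLEN", "DEPTNO": 30, "SAL": 1600},
]

def select(relation, predicate):
    """Horizontal step: keep the tuples that satisfy the predicate."""
    return [dict(t) for t in relation if predicate(t)]

def project(relation, attrs):
    """Vertical step: keep the listed attributes.
    No duplicate removal is needed here because the key EMPNO is always kept."""
    return [{a: t[a] for a in attrs} for t in relation]

def join_on_key(left, right, key="EMPNO"):
    """Natural join for the case where the key is the only shared attribute."""
    right_by_key = {t[key]: t for t in right}
    return [{**t, **right_by_key[t[key]]} for t in left if t[key] in right_by_key]

# Step 1: vertical fragmentation.
EMP1 = project(EMP, ["EMPNO", "ENAME", "DEPTNO"])
EMP2 = project(EMP, ["EMPNO", "SAL"])

# Step 2: horizontal fragmentation of EMP1 by department.
EMP1a = select(EMP1, lambda t: t["DEPTNO"] == 10)
EMP2a = select(EMP1, lambda t: t["DEPTNO"] == 20)
EMP3a = select(EMP1, lambda t: t["DEPTNO"] == 30)

# Reconstruction: union the horizontal pieces (they are disjoint), then join with EMP2.
rebuilt = join_on_key(EMP1a + EMP2a + EMP3a, EMP2)
by_empno = lambda t: t["EMPNO"]
print(sorted(rebuilt, key=by_empno) == sorted(EMP, key=by_empno))
```

Reconstruction undoes the steps in reverse order: the union restores EMP1, and the join with EMP2 restores EMP.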


Data Replication and Fragmentation

 

The techniques described for data replication and data fragmentation can be applied successively to the same relation. That is, a fragment can be replicated, replicas of fragments can be fragmented further, and so on. For example, consider a distributed system consisting of sites S1, S2, …, S11. Given fragments of EMP such as EMP1a, EMP2a and EMP2, we can store a copy of EMP1a at sites S1, S3 and S7; a copy of EMP2a at sites S4 and S11; and a copy of EMP2 at sites S2, S8 and S9.
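The placement in this example can be recorded in a small replica catalog. The sketch below uses the fragment and site names from the example; the dictionary `REPLICAS` and the functions `available_replicas` and `fragments_at` are hypothetical names introduced only for illustration.

```python
# Replica catalog taken from the example: fragment -> sites holding a copy.
REPLICAS = {
    "EMP1a": ["S1", "S3", "S7"],
    "EMP2a": ["S4", "S11"],
    "EMP2": ["S2", "S8", "S9"],
}

def available_replicas(fragment, failed_sites=frozenset()):
    """Sites that hold a copy of the fragment and have not failed."""
    return [site for site in REPLICAS[fragment] if site not in failed_sites]

def fragments_at(site):
    """Fragments that have a copy stored at the given site."""
    return sorted(f for f, sites in REPLICAS.items() if site in sites)

print(available_replicas("EMP1a", failed_sites={"S1"}))
print(fragments_at("S8"))
```

With this placement, EMP1a remains readable after S1 fails because copies survive at S3 and S7, whereas EMP2a becomes unavailable only if both S4 and S11 fail.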

 


Complete replication

 This strategy consists of maintaining a complete copy of the database at each site. Therefore, locality of reference, reliability and availability, and performance are maximized. However, storage costs and communication costs for updates are at their highest. To overcome some of these problems, snapshots are sometimes used. A snapshot is a copy of the data at a given time. The copies are updated periodically, for example, hourly or weekly, so they may not always be up to date. Snapshots are also sometimes used to implement views in a distributed database, to reduce the time it takes to perform a database operation on a view.

Selective replication

This strategy is a combination of fragmentation, replication and centralization. Some data items are fragmented to achieve high locality of reference and others, which are used at many sites and are not frequently updated, are replicated; otherwise, the data items are centralized. The objective of this strategy is to have all the advantages of the other approaches but none of the disadvantages. This is the most commonly used strategy because of its flexibility.
