lectures ppt for unit 1

81
What is Data and Information?  Data is the term, that may be new to beginners, but it is very interesting and simple to understand. It can be anything like name of a person or a place or a number etc. Data is the name given to basic facts and entities such as names and numbers. The main examples of data are weights, prices, costs, numbers of items sold, employee names, product names, addresses, tax codes, registration marks etc.

Upload: rahul-sawhney

Post on 07-Apr-2018

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 1/81

What is Data and Information? 

Data is the term, that may be new to beginners, but it is veryinteresting and simple to understand. It can be anything like

name of a person or a place or a number etc. Data is the name

given to basic facts and entities such as names and numbers.

The main examples of data are weights, prices, costs, numbers

of items sold, employee names, product names, addresses, taxcodes, registration marks etc.

Page 2: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 2/81

Information: Information is data that has been converted into

a more useful or intelligible form. It helps human beings in

their decision making process. Examples are: Time Table,Merit List, Report card, Headed tables, printed documents,

pay slips, receipts, reports etc. The information is obtained by

assembling items of data into a meaningful form.

For example, marks obtained by students and their roll

numbers form data, the report card/sheet is the information.

Page 3: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 3/81

Difference between Data and

Information

Data is the material on which computer programs work upon. It

can be numbers, letters of the alphabet, words, special symbols

etc. But by themselves they have no meaning. For example, thefollowing sequence of digits 240343 is meaningless by itself 

since it could refer to a data of birth, a part number for a

automobile, the number of rupees spent on a project, population

of a town, the number of people employed in a largeorganization etc. Once we know what the sequence refers to,

then it becomes meaningful and can be called information.

When we write above as 24-03-43, it may mean date of birth as

24th March 1943.

Page 4: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 4/81

A set of words would be data but text would be information.

For example “ANNUAL-EXAMINATION, AMITABH,

JYOTSNA, PHYSICS” is a set of data and “JYOTSNA 

SCORED THE HIGHEST MARKS IN PHYSICS IN

ANNUAL EXAMINATION” is information.

Page 5: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 5/81

Page 6: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 6/81

Database 

The related information when placed is an organized form makes a

database. The organization of data/information is necessary becauseunorganized information has no meaning. There are so many examples

of organized information, more precisely and the most common are, the

dictionary, the telephone directory, student record register, your own

address book and many more. In each of these the data is stored in some

particular order i.e. in an organized form.

In dictionary, the words are arranged in alphabetic order along with their

meanings. So that it becomes easier to search any word whose meaning

is required. If this ordering would not have done, how could you find one

word out of say 10,000 words. Similarly everybody can make a database

of his/her own to keep the information in an organized manner. Think of 

your own address directory where you keep the addresses and phone

numbers of your near and dear ones and it is also a database. Now let us

move one step ahead. What do we do with that database?

Page 7: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 7/81

There are so many operations like:

¨ To add new information (e.g. to add the address of a new friend in

your address book)

To view or retrieve the stored information (e.g. you have to find

the address of one of your old friends)

¨ To modify or edit the existing information (e.g. your friend hasshifted to a new place so his address would get changed)

¨ To remove or delete the unwanted information (e.g. your friend has

changed his/her mobile number, so his/her mobile number would have to

be removed from list)¨ Arranging the information in a desired order etc.

Page 8: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 8/81

Manual database and its

problems Consider an example of accounts department of an organization. To

make the salary calculations of the employees every month they are to

keep the record of every employee and do a number of calculations such

as addition of allowances like DA, HRA to the basic salary and to makeseveral deductions as loan recoveries, income tax and insurance etc and

at the end, to make the pay slips of the net pay. This whole procedure is

repeated every month and is very tedious and laborious job. It‟s a mere

calculation job and does not require any logic or intelligence. So to waste

the skills and intelligence of human beings on such repetitivecalculations is not a wise decision.

Page 9: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 9/81

Consider another situation where a magazine publisher, who has 10000

subscribers, receives a cheque from Mr. Suneet Bhatia with a request to

renew his subscription for the magazine, but Mr. Suneet Bhatia does not

mention his subscription number. Now, the publisher has to search the

entire list of 10,000 names to find out the subscription number of Mr.

Suneet Bhatia. This is a boring job, isn‟t it?

Page 10: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 10/81

Consider the case list of shareholders maintained by a public limited

company. Usually there are thousands of shareholders in each company.

Some large companies may have even millions of shareholders. Every

day, the company receives several requests for a change of address.

Similarly, when a person sells a share certificate to another person, the

buyer sends the share certificate to the company for transferring of the

corresponding shares to his name. Then the company has to update its

records.

Page 11: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 11/81

The company also has to send regular communications for meetings of 

shareholders, annual accounts, payment of dividends, offers of additionalshares, etc. Next, assume that a company, which has one million share

holders wants to pay dividend. And according to the latest Government

regulations the company must complete the job within 10 days. Now, to

calculate the dividend payable to its one million shareholders, and writecheques and mail letters, in ten days, even by putting 100 people on the

 job, is very difficult to accomplish within the time frame. By employing

extra staff or by taking additional time, the company may be able to

complete the job. However, there are bound to be several mistakes, such

as a wrong calculation of the dividend amount, or mailing the letters to

wrong addresses.

Page 12: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 12/81

Database and Computers 

¨ Computer has a large storage capacity. It can store thousands of 

records at a time.

¨ It has high speed, within no time it searches any desired

information, arrange the data in alphabetical order, do calculations onthe data and make repetitions and so on.

¨ Computer is more accurate.

¨ Data in computers can be stored in the form of a file, records andfields.

¨ There are two approaches for storing data in computers such as

File based approach and Database approach.

Page 13: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 13/81

File Based Approach 

File Based system: File-based systems were an early attempt to

computerize the manual filing system that we are all familiar with.

A file system is a method for storing and organizing computer files and

the data they contain to make it easy to find and access them.

File systems may use a storage device such as a hard disk or CD-ROM

and involve maintaining the physical location of the files

Page 14: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 14/81

In our own home, we probably have some sort of filing system, which

contains receipts, guarantees, invoices, bank statements, and such like.

When we need to look something up, we go to the filing system andsearch through the system starting from the first entry until we find what

we want.

Alternatively, we may have an indexing system that helps to locate what

we want more quickly. For example we may have divisions in the filingsystem or separate folders for different types of item that are in some

way logically related.

Page 15: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 15/81

The manual filing system works well when the number of items to bestored is small. It even works quite adequately when there are large

numbers of items and we have only to store and retrieve them. However,

the manual filing system breaks down when we have to cross-reference

or process the information in the files. For example, a typical real estate

agent‟s office might have a separate file for each property for sale or rent,

each potential buyer and renter, and each member of staff.

Page 16: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 16/81

Clearly the manual system is inadequate for this type of work. The file

based system was developed in response to the needs of industry for

more efficient data access. In early processing systems, an

organization‟s information was stored as groups of records in separate

files. These file processing systems consisted of a few data files and

many application programs.

Each file called a flat file, contained and processed information for one

specific function, such as accounting or inventory.

Page 17: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 17/81

In addition, early programmers focused on physical implementation andaccess procedures when designing a database. These physical procedures

were written into database application; therefore, physical changes

resulted in intensive rework on the part of the programmer. As systems

became more complex, file processing systems offered little flexibility,

presented many limitations, and were difficult to maintain.

Page 18: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 18/81

There are following problems associated with the File Based Approach:

Separated and Isolated Data 

To make a decision, a user might need data from two separate files. First,

the files were evaluated by analysts and programmers to determine the

specific data required from each file and the relationships between the

data and then applications could be written in a programming languageto process and extract the needed data. Imagine the work involved if data

from several files was needed.

Limitations of the File-Based Approach 

Page 19: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 19/81

Duplication of data

Often the same information is stored in more than one file. Uncontrolled

duplication of data is not required for several reasons, such as:

¨ Duplication is wasteful. It costs time and money to enter the data

more than once.

¨ It takes up additional storage space, again with associated costs.

¨ Duplication can lead to loss of data integrity; in other words the data

is no longer consistent. For example, consider the duplication of data

between the Payroll and Personnel departments. If a member of staff 

moves to new house and the change of address is communicated only to

Personnel and not to Payroll, the  person‟s pay slip will be sent to thewrong address. A more serious problem occurs if an employee is

promoted with an associated increase in salary. Again, the change is

notified to Personnel but the change does not filter through to Payroll.

Now, the employee is receiving the wrong salary. When this error isdetected, it will take time and effort to resolve.

Page 20: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 20/81

Data Dependence 

In file processing systems, files and records were described by specific

physical formats that were coded into the application program by

programmers. If the format of a certain record was changed, the code in

each file containing that format must be updated. Furthermore,

instructions for data storage and access were written into the

application‟s code.

Therefore, changes in storage structure or access methods could greatlyaffect the processing or results of an application.

Page 21: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 21/81

Difficulty in representing data from the user’s view 

To create useful applications for the user, often data from various files

must be combined. In file processing it was difficult to determine

relationships between isolated data in order to meet user requirements.

Page 22: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 22/81

Data Inflexibility 

Program-data interdependency and data isolation, limited the flexibility

of file processing systems in providing users with ad hoc information

requests.

Incompatible file formats 

As the structure of files is embedded in the application programs, the

structures are dependent on the application programming language. For

example, the structure of a file generated by a COBOL program may be

different from the structure of a file generated by a „C‟ program. The

direct incompatibility of such files makes them difficult to process jointly.

Page 23: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 23/81

Database Approach 

In order to remove all the above limitations of the File Based Approach,

a new approach was required that must be more effective known as

Database approach.

A database is a computer based record keeping system whose over all

purpose is to record and maintain information. The database is a single,

large repository of data, which can be used simultaneously by many

departments and users.

Instead of disconnected files with redundant data, all data items are

integrated with a minimum amount of duplication. The database is no

longer owned by one department but is a shared corporate resource.

Page 24: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 24/81

The database holds not only the organization‟s operational data but also a

description of this data. For this reason, a database is also defined as a

self-describing collection of integrated records. The description of the

data is known as the Data Dictionary or Meta Data (the „data about

data‟). It is the self-describing nature of a database that providesprogram-data independence.

Page 25: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 25/81

A database implies separation of physical storage from use of the data byan application program to achieve program/data independence.

Using a database system, the user or programmer or application

specialist need not know the details of how the data are stored and such

details are “transparent to the user”. Changes (or updating) can be madeto data without affecting other components of the system. These changes

include, for example, change of data format or file structure or relocation

from one device to another.

Characteristics of data in a database

Page 26: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 26/81

Characteristics of data in a database

The data in a database should have the following features:

Shared: Data in a database are shared among different users and

applications.

Validity/Integrity/Correctness: Data should be correct with respect to

the real world entity that they represent.

Security: Data should be protected from unauthorized access.Consistency: Whenever more than one data element in a database

represents related real world values, the values should be consistent with

respect to the relationship.

Non-redundancy: No two data items in a database should represent the

same real world entity.

Page 27: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 27/81

Independence: Data at different levels should be independent of each

other so that the changes in one level should not affect the other levels.

To create, manage and manipulate data in databases, a management

system known as database management system was developed.

Page 28: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 28/81

The Database Management System (DBMS) 

DBMS A database management system is the software system

that allows users to define, create and maintain a database and providescontrolled access to the data.

A database management system (DBMS) is basically a collection of 

programs that enables users to store, modify, and extract information

from a database as per the requirements. DBMS is an intermediate layer

between programs and the data. Programs access the DBMS, which then

accesses the data. There are different types of DBMS ranging from

small systems that run on personal computers to huge systems that runon mainframes. The following are main examples of database

applications:

Computerized library systems

Page 29: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 29/81

Computerized library systems

¨ Automated teller machines

¨ Flight reservation systems

¨ Computerized parts inventory systems

Commercially available Database management systems in the market

are dbase, Foxpro, IMS and Oracle.

These systems allow users to create, update, and extract information

from their databases. Compared to a manual filing system, the biggestadvantages to a computerized database system are speed, accuracy, and

accessibility.

Page 30: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 30/81

DBMS provides the following languages to provide the facilities to its

users:

Data Definition Language (DDL): It is a language that allows the users to define data and their relationship

to other types of data. It is mainly used to create files, databases, data

dictionary and tables within databases.

It is also used to specify the structure of each table, set of associated

values with each attribute, integrity constraints, security and

authorization information for each table and physical storage structure of 

each table on disk.

Page 31: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 31/81

Data Manipulation Language (DML) 

It is a language that provides a set of operations to support the basic data

manipulation operations on the data held in the databases. It allows users

to insert, update, delete and retrieve data from the database. The part of 

DML that involves data retrieval (SELECT command) is called a query

language.

DML can be distinguished between two types: procedural and non-

procedural.

The procedural language specify how the output of a DML statement

must be obtained, while the non-procedural DMLs describe only what

output is to be obtained. Typically procedural language treats records

individually, while non-procedural languages operate on sets of records.

Page 32: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 32/81

Components of the DBMS Environment 

Hardware

¨ Software

¨ Data

¨ Users

Procedures

Page 33: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 33/81

Hardware 

The hardware is the actual computer system used for keeping andaccessing the database. Conventional DBMS hardware consists of 

secondary storage devices, usually hard disks, on which the database

physically resides, together with the associated Input- Output devices,

device controllers and so forth. Databases run on a range of machines,

from Micro-computers to large mainframes. Other hardware issues for a

DBMS includes database machines, which is hardware designed

specifically to support a database system.

Page 34: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 34/81

Software 

The software is the actual DBMS. Between the physical database itself 

(i.e. the data as actually stored) and the users of the system is a layer of software, usually called the Database Management System or DBMS. All

requests from users for access to the database are handled by the DBMS.

One general function provided by the DBMS is thus the shielding of 

database users from complex hardware-level detail.The DBMS allows the users to communicate with the database. In a

sense, it is the mediator between the database and the users. The DBMS

controls the access and helps to maintain the consistency of the data.

Utilities are usually included as part of the DBMS. Some of the mostcommon utilities are report writers, application development tools and

other design aids. Examples of DBMS software include Microsoft

Access, Oracle Corporation‟s Personal Oracle and IBM DB2. 

Page 35: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 35/81

Data 

It is the most important component of DBMS environment from the endusers point of view. As shown in above figure 1.2 observe that data acts

as a bridge between the machine components and the user components.

The database contains both the operational data and the meta-data, the

„data about data‟.

The database should contain all the data needed by the organization. One

of the major features of databases is that the actual data are separated

from the programs that use the data. A database should always be

designed, built and populated for a particular audience and for a specificur ose.

Page 36: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 36/81

Users 

There are a number of users who can access or retrieve data on demand

using the applications and interfaces provided by the DBMS. Each type

of user needs different software capabilities. The users of a database

system can be classified in the following groups, depending on their

degrees of expertise or the mode of their interactions with the DBMS.

Page 37: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 37/81

The users can be:

¨ Naive Users

¨ Online Users

¨ Application Programmers

¨ DBA

Page 38: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 38/81

Naive Users

Naive Users are those users who need not be aware of the presence of thedatabase system or any other system supporting their usage. Naive users

are end users of the database who work through a menu driven

application program where the type and range of response is always

indicated to the user.

A user of an Automatic Teller Machine (ATM) falls in this category. The

user is instructed through each step of a transaction. He or she then

responds by pressing a coded key or entering a numeric value.

The operations that can be performed by naïve users are very limited and

affect only a precise portion of the database. For example, in the case of 

the user of the Automatic Teller Machine, user‟s action affects only one

or more of his/her own accounts.

Page 39: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 39/81

 Online Users

Online users are those who may communicate with the database directly

via an online terminal or indirectly via a user interface and applicationprogram. These users are aware of the presence of the database system

and may have acquired a certain amount of expertise with in the limited

interaction permitted with a database.

Application Programmers

Professional programmers are those who are responsible for developing

application programs or user interface. The application programs could

be written using general purpose programming language or the

commands available to manipulate a database.

Page 40: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 40/81

Database Administrator

The database administrator (DBA) is the person or group in charge for

implementing the database system within an organization. The DBAhas all the system privileges allowed by the DBMS and can assign

(grant) and remove (revoke) levels of access (privileges) to and form

other users. DBA is also responsible for the evaluation, selection and

implementation of DBMS package.

P d

Page 41: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 41/81

Procedures

Procedures refer to the instructions and rules that govern the design and

use of the database. The users of the system and the staff that manage the

database require documented procedures on how to use or run thesystem.

These may consist of instructions on how to:

¨ Log on to the DBMS.¨ Use a particular DBMS facility or application program.

¨ Start and stop the DBMS.

¨ Make backup copies of the database.

¨ Handle hardware or software failures.

¨ Change the structure of a table, reorganize the database across

multiple disks, improve performance, or archive data to secondary

storage.

Page 42: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 42/81

Advantages of DBMS Controlling Redundancy 

In file system, each application has its own private files, which cannot

be shared between multiple applications. This can often lead to

considerable redundancy in the stored data, which results in wastage of 

storage space. By having centralized database most of this can be

avoided. It is not possible that all redundancy should be eliminated.

Sometimes there are sound business and technical reasons for

maintaining multiple copies of the same data. In a database system,

however this redundancy can be controlled.

For example: In case of college database, there may be the number of 

applications like General Office, Library, Account Office, Hostel etc.

Each of these applications may maintain the following information into

own private file applications:

Page 43: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 43/81

Page 44: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 44/81

Page 45: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 45/81

Integrity can be enforced 

Integrity of data means that data in database is always accurate, such

that incorrect information cannot be stored in database. In order to

maintain the integrity of data, some integrity constraints are enforced on

the database. A DBMS should provide capabilities for defining and

enforcing the constraints.

For Example: Let us consider the case of college database and suppose

that college having only BA, BCA, BIT, BBA and BCOM classes. But if 

a user enters the class MCA, then this incorrect information must not be

stored in database and must be prompted that this is an invalid data entry.

In order to enforce this, the integrity constraint must be applied to the

class attribute of the student entity. But, in case of file system thisconstraint must be enforced on all the application separately (because all

applications have a class field).

Page 46: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 46/81

 

In case of DBMS, this integrity constraint is applied only once on the

class field of the General Office (because class field appears only once inthe whole database), and all other applications will get the class

information about the student from the General Office table so the

integrity constraint is applied to the whole database. Now, we can say

that integrity constraint can be easily enforced in centralized DBMSsystem as compared to file system.

Page 47: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 47/81

Inconsistency can be avoided 

When the same data is duplicated and changes are made at one site,

which is not propagated to the other site, it gives rise to inconsistency

and the two entries regarding the same data will not agree. At such times

the data is said to be inconsistent. So if the redundancy is removed

chances of having inconsistent data is also removed.

Let us again consider the college system and suppose that in case of 

General_Office file it is indicated that Roll_Number 5 lives in Amritsar

but in library file it is indicated that Roll_Number 5 lives in Jalandhar.

Then this is a state at which the two entries of the same object do not

agree with each other (that is one is updated and other is not). At such

time the database is said to be inconsistent.

An inconsistent database is capable of supplying incorrect or conflicting

information. So, there should be no inconsistency in database. It can be

clearly shown that inconsistency can be avoided in centralized system

very well as compared to file system.

 

Page 48: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 48/81

Let us consider again the example of college system and suppose that

RollNo 5 is shifted from Amritsar to Jalandhar, then address information

of Roll Number 5 must be updated, whenever Roll number and address

occurs in the system. In case of file system, the information must be

updated separately in each application, but if we make updation only at

three places and forget to make updation at fourth application, then the

whole system show the inconsistent results about Roll Number 5.

In case of DBMS, Roll number and address occurs together only single

time in General_Office table. So, it needs single updation and then all

other application retrieve the address information from General_Office

which is updated so, all application will get the current and latest

information by providing single update operation and this single updateoperation is propagated to the whole database or all other application

automatically, this property is called as Propagation of Update.

Page 49: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 49/81

Data can be shared 

As explained earlier, the data about Name, Class, Father_name etc. of 

General_Office is shared by multiple applications in centralized DBMS

as compared to file system so now applications can be developed to

operate against the same stored data. The applications may be developed

without having to create any new stored files.

Standards can be enforced 

Since DBMS is a central system, so standard can be enforced easily may

be at Company level, Department level, National level or International

level. The standardized data is very helpful during migration or

interchanging of data. The file system is an independent system sostandard cannot be easily enforced on multiple independent applications.

Page 50: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 50/81

Restricting unauthorized access 

When multiple users share a database, it is likely that some users will

not be authorized to access all information in the database. For exampleaccount office data is often considered confidential, and hence only

authorized persons are allowed to access such data. In addition, some

users may be permitted only to retrieve data, whereas other are allowed

both to retrieve and to update. Hence the type of access operation

retrieval or update must also be controlled. Typically, users or usergroups are given account numbers protected by passwords, which they

can use to gain access to the database. A DBMS should provide a

security and authorization subsystem, which the DBA uses to create

accounts and to specify account restrictions. The DBMS should thenenforce these restrictions automatically.

Note: Without such checks the security of the data may actually be more

at risk in database system than in a traditional file system.

Page 51: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 51/81

Solving enterprise requirement than individual requirement 

Since many types of users with varying level of technical knowledge use

a database, a DBMS should provide a variety of user interface. The

overall requirements of the enterprise are more important than the

individual user requirements. So the DBA can structure the database

system to provide an overall service that is “best for the enterprise”.

For example: A representation can be chosen for the data in storage that

gives fast access for the most important application at the cost of poor

performance in some other application. But the file system favors the

individual requirements than the enterprise requirements.

Page 52: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 52/81

Providing Backup and Recovery

A DBMS must provide facilities for recovering from hardware or

software failures. The backup and recovery subsystem of the DBMS is

responsible for recovery. For example, if the computer system fails in the

middle of a complex update program, the recovery subsystem is

responsible for making sure that the database is restored to the state it

was in before the program started executing.

i f S

Page 53: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 53/81

 The disadvantages of the database approach are summarized as follows:

Complexity

The provision of the functionality that is expected of a good DBMS

makes the DBMS an extremely complex piece of software. Database

designers, developers, database administrators and end-users must

understand this functionality to take full advantage of it. Failure tounderstand the system can lead to bad design decisions, which can have

serious consequences for an organization.

Disadvantages of DBMS

Si

Page 54: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 54/81

Size

The complexity and breadth of functionality makes the DBMS an

extremely large piece of software, occupying many megabytes of disk 

space and requiring substantial amounts of memory to run efficiently.

Performance

Typically, a File Based system is written for a specific application, such

as invoicing. As result, performance is generally very good. However,the DBMS is written to be more general, to cater for many applications

rather than just one. The effect is that some applications may not run as

fast as they used to.

Higher impact of a failure

The centralization of resources increases the vulnerability of the system.

Since all users and applications rely on the availability of the DBMS, the

failure of any component can bring operations to a halt.

C t f DBMS

Page 55: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 55/81

Cost of DBMS

The cost of DBMS varies significantly, depending on the environment

and functionality provided. There is also the recurrent annual

maintenance cost.

Additional Hardware costs

The disk storage requirements for the DBMS and the database may

necessitate the purchase of additional storage space. Furthermore, to

achieve the required performance it may be necessary to purchase a

larger machine, perhaps even a machine dedicated to running the DBMS.

The procurement of additional hardware results in further expenditure.

Page 56: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 56/81

Cost of Conversion

In some situations, the cost of the DBMS and extra hardware may be

insignificant compared with the cost of converting existing applications

to run on the new DBMS and hardware. This cost also includes the cost

of training staff to use these new systems and possibly the employmentof specialist staff to help with conversion and running of the system. This

cost is one of the main reasons why some organizations feel tied to their

current systems and cannot switch to modern database technology.

Wh t t U DBMS

Page 57: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 57/81

When not to Use a DBMS

In spite of the advantages of using a DBMS, there are a few situations in

which such a system may involve unnecessary overhead costs, as that

would not be incurred in traditional file processing.

The overhead costs of using a DBMS are due to the following:

¨ High initial investment in hardware, software, and training

¨ Generality that a DBMS provides for defining and processing data.

¨ Overhead for providing security, concurrency control, recovery, and

integrity functions.

Additional problems may arise if the database designers and DBA do notproperly design the database or if the database systems applications are

not implemented properly.

Page 58: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 58/81

Hence, it may be more desirable to use regular files under the following

circumstances:

¨ The database and applications are simple, well defined and not

expected to change.

¨ There are tight real-time requirements for some programs that may

not be met because of DBMS overhead.

¨ Multiple user access to data is not required.

¨ An application may need to manipulate the data in a way not

supported by the query language.

Comparison of File Management System with Database

Page 59: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 59/81

Comparison of File Management System with Database

Management System 

Page 60: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 60/81

Page 61: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 61/81

An early proposal for a standard terminology and general architecture

for database systems was produced in 1971 by the DBTG (Data

Base Task Group) appointed by the Conference on Data Systems and

Languages (CODASYL, 1971). The DBTG recognized the need for atwo level approach with a system view called the schema and user

views called subschema. The American National Standards Institute

(ANSI) recognized the need for a three level approach with a system

catalog.

Three Level Architecture

Page 62: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 62/81

There are following three levels or layers of DBMS architecture:

¨  External Level 

¨  Conceptual Level 

¨  Internal Level 

Objective of the Three Level Architecture

Page 63: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 63/81

Objective of the Three Level Architecture

¨ Each user should be able to access the same data, but have a

different customized view of the data.

¨  User‟s interaction with the database should be independent of 

storage considerations.

Page 64: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 64/81

¨ The Database Administrator (DBA) should be able to change the

database storage structures without affecting the user‟s views.

¨ The internal structure of the database should be unaffected by

changes to the physical aspects of storage, such as the changeover to anew storage device.

¨ The DBA should be able to change the conceptual structure of thedatabase without affecting all users.

Page 65: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 65/81

Page 66: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 66/81

External Level or View level 

It is the users‟ view of the database. This level describes that part of thedatabase that is relevant to each user. External level is the one, which is

closest to the end users. This level deals with the way in which individual

users view data. Individual users are given different views according to

the user‟s requirement.

External level is also known as the view level. In addition different views

may have different representations of the same data. For example, one

user may view dates in the form (day, month, year), while another may

view dates as (year, month, day).

Page 67: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 67/81

Conceptual Level or Logical level 

It is the community view of the database. This level describes what data

is stored in the database and the relationships among the data. Themiddle level in the three level architecture is the conceptual level.

This level contains the logical structure of the entire database as seen by

the DBA. It is a complete view of the data requirements of the

organization that is independent of any storage considerations. Theconceptual level represents:

¨ All entities, their attributes, and their relationships;

¨ The constraints on the data;

¨ Security and integrity information.

.

Page 68: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 68/81

This level must not contain any storage dependent details. For instance,

the description of an entity should contain only data types of attributes

(for example, integer, real, character) and their length (such as the

maximum number of digits or characters), but not any storage

considerations, such as the number of bytes occupied. Conceptual level

is also known as the logical level

Internal Level or Storage level

Page 69: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 69/81

It is the physical representation of the database on the computer. This

level describes how the data is stored in the database. The internal level

is the one that concerns the way the data are physically stored on the

hardware.

The internal level covers the physical implementation of the database to

achieve optimal runtime performance and storage space utilization. It

covers the data structures and file organizations used to store data on

storage devices. It interfaces with the operating system access methods to

place the data on the storage devices, build the indexes, retrieve the data,

and so on.

Page 70: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 70/81

The internal level is concerned with such things as:

¨ Storage space allocation for data and indexes;

¨ Record descriptions for storage (with stored sizes for data items);

¨ Record placement;

¨ Data compression and data encryption techniques.

There will be only one conceptual view, consisting of the abstract

representation of the database in its entirely. Similarly there will be only

one internal or physical view, representing the total database, as it isphysically stored.

Schema 

Page 71: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 71/81

The overall description of the database is called the Database Schema.

There are three different types of schema in the database corresponding

to each data view of database. In other words, the data views at each of three levels are described by schema.

A schema is defined as an outline or a plan that describes the

records and relationships existing at the particular level. The External

view is described by means of a schema called external schema thatcorrespond to different views of the data. Similarly the Conceptual view

is defined by conceptual schema, which describes all the entities,

attributes, and relationship together with integrity constraints. Internal

View is defined by internal schema, which is a complete description of the internal model, containing definition of stored records, the methods

of representation, the data fields, and the indexes used.

There is only one conceptual schema and one internal schema per

database The schema also describes the way in which data elements at

Page 72: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 72/81

database. The schema also describes the way in which data elements at

one level can be mapped to the corresponding data elements in the next

level.

Thus, we can say that schema establishes correspondence between the

records and relationships in the two levels. In a relational database, the

schema defines the tables, the fields in each table, and the relationships

between fields and tables. Schema are generally stored in a data

dictionary.

The schema is specified during the database design process and is not

expected to change frequently. However the actual data in the database

may change frequently; The data in the database at any particular point

in time is called a database instance. Therefore, many database instancescan correspond to the same database schema. The schema is sometimes

called the intension of the database, while an instance is called an

extension (or state) of the database.

Page 73: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 73/81

To understand the difference between the three levels, consider again

the database schema that describes College Database system. If User1

is a Library clerk, the external view would contain only the studentand book information. If User2 is a account office clerk then he/she

may be interested in students detail and fee detail.

Page 74: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 74/81

Mapping between Views

Page 75: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 75/81

Mapping between Views 

External/Conceptual Mapping: Each external schema is related to the

conceptual schema by the external/conceptual mapping. A mapping

between the external and conceptual views gives the correspondenceamong the records and the relationships of the external and conceptual

views.. There is a mapping from a particular logical record in the

external view to one (or more) conceptual record(s) in the conceptual

view.

Names of the fields and records, for instance, may be different. A number

of conceptual fields can be combined into a single external field, for

example, Last_Name and First_Name at the conceptual level but Name

at the external level. A given external record could be derived from anumber of conceptual records.

Page 76: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 76/81

Conceptual/Internal Mapping: Conceptual schema is related to the

internal schema by the conceptual/internal mapping. This enables the

DBMS to find the actual record or combination of records in physical

storage that constitute a logical record in conceptual schema. Mapping

between the conceptual and the internal levels specifies the method of 

deriving the conceptual record from the physical database.

Differences between Conceptual/Internal Views 

¨ Representation of numeric values could be different in the two

views.

¨ Representation of string data can be considered by the two views to

be coded differently.

¨ 

Data Independence-Achievement of Layered Architecture of DBMS

Page 77: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 77/81

Data Independence Achievement of Layered Architecture of DBMS 

There are two kinds of data independence:

¨ Logical data independence¨ Physical data independence

Logical data independence 

Logical data independence indicates that the conceptual schema can bechanged without affecting the existing external schemas. The change

would be absorbed by the mapping between the external and conceptual

levels.

Page 78: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 78/81

Physical data independence 

Physical data independence indicates that the physical storage structures

or devices could be changed without affecting conceptual schema. Thechange would be absorbed by the mapping between the conceptual and

internal levels. Physical data independence is achieved by the presence

of the internal level of the database and the mapping or transformation

from the conceptual level of the database to the internal level.If there is a need to change the file organization or the type of physical

device used as a result of growth in the database or new technology, a

change is required in the conceptual/internal mapping between the

conceptual and internal levels.The physical data independence criterion requires that the conceptual

level does not specify storage structures or the access methods (indexing,

hashing etc.) used to retrieve the data from the physical storage medium.

Page 79: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 79/81

The Logical data independence is difficult to achieve than physical data

independence as it requires the flexibility in the design of database and

programmer has to foresee the future requirements or modifications in

the design.

Th P d f D b A

Page 80: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 80/81

The Procedure for Database Access 

A user‟s request for data is received by the data manager, which

determines the physical record required. The decision as to whichphysical record is needed may require some preliminary consultation of 

the database and/or the data dictionary prior to the access of the actual

data itself.

The data manager sends the request for a specific physical record to the

file manager. The file manager decides which physical block of 

secondary storage devices contains the required record and sends the

request for the appropriate block to the disk manager. A block is a unit of physical input/output operations between primary and secondary storage.

The disk manager retrieves the block and sends it to the file manager,

which sends the required record to the data manager.

Page 81: Lectures Ppt for Unit 1

8/4/2019 Lectures Ppt for Unit 1

http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 81/81