adbms lectures 1-5


Upload: razee

Post on 29-Oct-2014


Page 1: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 1

What is a Database?

• A collection of data or information about real-world entities such as objects or events.

• A collection of related records held over a period of time in a computer-readable form.

• A shared, integrated computer structure that stores a collection of data, that is, raw facts of interest to the user.

Page 2: ADBMS Lectures 1-5

Applications of Databases

• The preferred method of storage for large multi-user applications, where coordination between many users is needed (information systems).

• many electronic mail programs and personal organizers are based on standard database technology.

• Software database drivers are available for most database platforms so that application software can use a common Application Programming Interface to retrieve the information stored in a database.

Page 3: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 3

What is a DBMS?

• Computer software designed for the purpose of managing databases.

• A software program that manages the storage of and access to data in databases.

• A complex set of software programs that controls the organization, storage, management, and retrieval of data in a database.

Page 4: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 4

Role/Advantages of DBMS

• Improved data sharing

• Better data integration

• Minimized data inconsistency

• Improved data access

• Improved decision making

• Increased end-user productivity

Page 6: ADBMS Lectures 1-5

Components of a DBMS

• Modeling language
  – defines the schema of each database hosted in the DBMS, according to the DBMS data model.

• Data structure (fields, records, files and objects)
  – optimized to deal with very large amounts of data stored on a permanent data storage device

Page 7: ADBMS Lectures 1-5

Components of a DBMS

• Database query language
  – allows users to interactively interrogate the database, analyze its data and update it according to the user privileges on data

• Transaction mechanism
  – guarantees the ACID properties (atomicity, consistency, isolation, durability) in order to ensure data integrity despite concurrent user access and faults
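The transaction mechanism can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, the transfer function and the amounts are hypothetical and are not part of the lecture; the point is only that a failed transfer leaves no partial update behind (atomicity).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing unit."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK: credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been executed when the debit is rejected, so the rollback is what keeps the data consistent.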

Page 8: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 8

Database Management System

• found at the heart of most database applications

• DBMS accepts requests for data from the application program and instructs the OS to transfer the appropriate data

• When a DBMS is used, information systems can be changed much more easily as the organization's information requirements change

Page 9: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 9

Database Management System cont.

• Database servers are specially designed computers that hold the actual databases and run only the DBMS and related software.

• Database servers are usually multiprocessor computers, with RAID disk arrays used for stable storage.

Page 10: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 10

Database System Environment

Page 11: ADBMS Lectures 1-5

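DBMS Architecture: 3 Schema

[Figure: three-schema architecture. END USERS work at the External Level; below it are the Conceptual Level (CONCEPTUAL SCHEMA) and the Internal Level (INTERNAL SCHEMA), which sits on the STORED DATABASE. Logical data independence lies between the external and conceptual levels; physical data independence lies between the conceptual and internal levels. Objective: to provide data independence.]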

Page 12: ADBMS Lectures 1-5

DBMS Architecture: 3 Schema

• External Level
  – User’s view of the database

• Conceptual Level
  – The community view of the database
  – Describes what is stored in the DB & the relationships among the data

• Internal Level
  – The physical representation of the database in the computer
  – Describes how the data is stored in the DB

Page 13: ADBMS Lectures 1-5

Data Independence

• Upper levels are unaffected by changes to lower levels

• Logical Data Independence
  – Being able to change the conceptual schema without having to change the external schemas.

• Physical Data Independence
  – Being able to change the internal schema without having to change the conceptual or external schemas.
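Logical data independence can be sketched with a view, again using Python's sqlite3 module. The employee table and the employee_names view are hypothetical names chosen for illustration; the view plays the role of an external schema and the base table the role of the conceptual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Ann Lane')")

# External level: a user view that exposes only the names.
conn.execute("CREATE VIEW employee_names AS SELECT name FROM employee")
before = conn.execute("SELECT name FROM employee_names").fetchall()

# Conceptual-level change: a new column is added to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN bdate TEXT")

# The external view is queried exactly as before and returns the same rows.
after = conn.execute("SELECT name FROM employee_names").fetchall()
print(before, after)
```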

Page 14: ADBMS Lectures 1-5

Classification/Types of DBMSs

• According to data model
  – Hierarchical, network, relational

• According to number of users
  – Single-user, multi-user

• According to number of sites
  – Centralized, distributed

• According to purpose
  – General, special purpose

Page 15: ADBMS Lectures 1-5

Database Design

• Refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data

• What is a good database?
  – Meets all the requirements
  – Must be designed carefully
  – A well-designed database facilitates data management and generates accurate and valuable information.

Page 16: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 16

Data Model

• An abstraction of a more complex real-world object or event

• Helps you understand the complexities of the real-world environment

• Represents data structures and their characteristics, relations, constraints, and transformations

• Facilitates interaction among the designer, the applications programmer, and the end-user.

Page 17: ADBMS Lectures 1-5

Data Models

• Defines a method for data organization

• It specifies relationships and integrity rules in databases

• DBMSs can be based on a data model
  – Hierarchical
  – Network
  – Relational

Page 18: ADBMS Lectures 1-5

Hierarchical Model

• Data is organized into a tree-like structure

• Every entity has one to two immediate neighbors – superior or subordinate.

• Restrictions:
  – Branches can only diverge & are not allowed to converge.
    • A child element can be related to only one parent.
  – Prevents any entity from being related to any entity more than one level up the hierarchy.

Page 19: ADBMS Lectures 1-5

Hierarchical Model: An Example

[Figure: tree diagram. Root: Company. Second level: Automobile, Aerospace. Third level: Sales, MIS, Marketing, Research, HR, MIS. Leaves: Ann Lane, Joe Smith, Bob Lee, Laura Kline.]

Page 20: ADBMS Lectures 1-5

Network Model

• Derived from the hierarchical model

• Basic structure of superior and subordinate retained

• Difference:
  – Allows a child entity to have more than one parent
  – Allows a child to be connected directly to a “grandparent” element
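The difference between the two models can be sketched in a few lines of Python. The node names come from the example slides, but which department sits under which division is an assumption made here for illustration, as are the helper function names.

```python
# Hierarchical model: every child records exactly one parent.
hier_parent = {
    "Automobile": "Company",
    "Aerospace": "Company",
    "Sales": "Automobile",    # assumed placement
    "Research": "Aerospace",  # assumed placement
}


def path_to_root(node, parent):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


# Network model: a child may record several parents.
net_parents = {
    "Automobile": {"Company"},
    "Aerospace": {"Company"},
    "MIS": {"Automobile", "Aerospace"},  # assumed: one MIS unit shared by both
}


def is_hierarchical(parents):
    """True when no node has more than one parent (branches never converge)."""
    return all(len(p) <= 1 for p in parents.values())


print(path_to_root("Sales", hier_parent))
print(is_hierarchical(net_parents))
```

A mapping with a single parent per node is enough for the hierarchical model; the network model needs a set of parents per node, which is what allows branches to converge.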

Page 21: ADBMS Lectures 1-5

Network Model: An Example

[Figure: network diagram with the same nodes as the hierarchical example. Company; Automobile, Aerospace; Sales, MIS, Marketing, Research, HR, MIS; Ann Lane, Joe Smith, Bob Lee, Laura Kline.]

Page 22: ADBMS Lectures 1-5

Relational Model

• Rooted in 2 branches of mathematics:
  – Set theory – properties of sets
  – Predicate calculus – mathematical propositions

• Permits the database designer to create a consistent, logical representation of information

• All data about entities are contained in tables

• Any table can reference other tables through the use of keys & relationships

Page 23: ADBMS Lectures 1-5

Relational Data Model

COMP 208 Database Management System 1

Page 24: ADBMS Lectures 1-5

Relational Database

• The term was originally defined and coined by E.F. Codd

• A database that conforms to the relational model; the term refers to a database's data and schema (the database's structure of how that data is arranged).

• a set of relations or a database built in an RDBMS (Relational DBMS)

• a relational database is a collection of relations (frequently called tables)

Page 25: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 25

Relational DBMS

• a system that manages data using the relational model

• the term "RDBMS" is inaccurately used as a generic label for the relational database concept

• examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, Ingres etc.

Page 26: ADBMS Lectures 1-5

Friday, April 7, 2023 Advanced DBMS 26

Components of Relational Database

• Relations/tables

• Keys

• Relationships

Page 27: ADBMS Lectures 1-5

Relation

• A set of tuples (records) that all have the same attributes
  – Attribute – property of an entity/relationship

• Is usually represented by a table, which is data organized in rows & columns

• All of the data stored in a column should be in the same domain
  – Domain – set of possible values for a given attribute

Page 28: ADBMS Lectures 1-5

Relation: An Example

SSN | LastName | FirstName | Address | City

123456 | Hansen | Ola | Timoteivn 10 | Sandnes

789012 | Svendson | Tove | Borgvn 23 | Sandnes

345678 | Pettersen | Kari | Storgt 20 | Stavanger

Page 29: ADBMS Lectures 1-5

Keys

• A kind of constraint which requires that the object, or critical information about the object, isn't duplicated

• A set of one or more attributes
  – Candidate key – any attribute, or set of attributes, whose value is unique for every tuple
  – Primary key – the candidate key chosen to identify tuples
  – Simple key – made up of only one attribute
  – Composite key – made up of two or more attributes
  – Foreign key – an attribute used to reference a primary key in another table/relation
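The key types can be sketched with Python's sqlite3 module. The three tables and the try_insert helper are hypothetical and exist only to show a simple primary key, a composite primary key and a foreign key rejecting bad rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE department (deptid INTEGER PRIMARY KEY, deptname TEXT)")
conn.execute("""CREATE TABLE employee (
    empid  INTEGER PRIMARY KEY,                            -- simple primary key
    name   TEXT,
    deptid INTEGER NOT NULL REFERENCES department(deptid)  -- foreign key
)""")
conn.execute("""CREATE TABLE works (
    empid  INTEGER REFERENCES employee(empid),
    projid INTEGER,
    hours  REAL,
    PRIMARY KEY (empid, projid)                            -- composite primary key
)""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("INSERT INTO department VALUES (?, ?)", (10, "Research")),  # accepted
    try_insert("INSERT INTO department VALUES (?, ?)", (10, "MIS")),       # duplicate primary key
    try_insert("INSERT INTO employee VALUES (?, ?, ?)", (1, "Anna", 10)),  # accepted
    try_insert("INSERT INTO employee VALUES (?, ?, ?)", (2, "Gail", 99)),  # no such department
    try_insert("INSERT INTO works VALUES (?, ?, ?)", (1, 100, 8.0)),       # accepted
    try_insert("INSERT INTO works VALUES (?, ?, ?)", (1, 200, 4.0)),       # same empid, new projid
    try_insert("INSERT INTO works VALUES (?, ?, ?)", (1, 100, 2.0)),       # duplicate composite key
]
print(results)
```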

Page 30: ADBMS Lectures 1-5

Key: An Example

SSN | LastName | FirstName | Address | City

123456 | Hansen | Ola | Timoteivn 10 | Sandnes

789012 | Svendson | Tove | Borgvn 23 | Sandnes

345678 | Pettersen | Kari | Storgt 20 | Stavanger

Key: primary key

Page 31: ADBMS Lectures 1-5

Relationships

• An association of entities where the association includes one entity from each participating entity type

• Example:
  – borrower borrows/returns material
  – employee belongs to/employs department
  – product sold to/bought by customer

Page 32: ADBMS Lectures 1-5

Entity-Relationship Data Model: A conceptual data model

MITF – 202 Advanced Database Systems

Mr. Alvin R. Malicdem, Professor

Page 33: ADBMS Lectures 1-5

The Conceptual Data Model

• A set of concepts that describe the structure of the database and the associated retrieval and update transactions on the database

• The aim is to describe the information used by an organization in a way which is not governed by implementation-level issues and details.

Page 34: ADBMS Lectures 1-5

The Entity-Relationship Data Model

• A high-level conceptual data model developed by Peter Chen (1976) to facilitate database design

• Supports the user’s perception of data and conceals the more technical aspects associated with database design

• It is independent of the particular DBMS and hardware platform that is used to implement the database

Page 35: ADBMS Lectures 1-5

The Entity-Relationship Data Model

• A common method of analysis involves identifying:
  – the ENTITIES (persons, places, things etc.) which the organization has to deal with – objects/events.
  – the ATTRIBUTES – the items of information which characterize and describe these entities.
  – the RELATIONSHIPS between entities which exist and must be taken into account when processing information.

Page 36: ADBMS Lectures 1-5

Entities

• The basic object of the real world

• Defined by a set of attributes (properties)

• Uniquely identified by a primary key

• Classified with similar entities under one set

• Considered an instance of a given entity type

• Tangible or intangible

Page 37: ADBMS Lectures 1-5

Entities: An Example

Entity Type: Employee

SSN | LastName | FirstName | Address | City

123456 | Hansen | Ola | Timoteivn 10 | Sandnes

789012 | Svendson | Tove | Borgvn 23 | Sandnes

345678 | Pettersen | Kari | Storgt 20 | Stavanger

Attributes: the column headings

Tuples/records/entity set: the rows

Page 38: ADBMS Lectures 1-5

Entity Types

• Weak Entity type
  – An entity type that is existence-dependent on some other entity type (weak entities do not have their own keys)
  – Example: employee – dependents

• Strong Entity type
  – An entity type that is not existence-dependent on some other entity type

Page 39: ADBMS Lectures 1-5

Attributes

• Pieces of information ABOUT entities

• Analysis must identify those which are actually relevant to the proposed application

• Give rise to recorded items of data in the database

Page 40: ADBMS Lectures 1-5

Characteristics of an Attribute

• Attribute name.

• The domain from which attribute values are taken.
  – Set of allowable values of an attribute

• Whether the attribute is part of the entity identifier.

• Whether it is permanent or time-varying.

• Whether it is required or optional for the entity.

Page 41: ADBMS Lectures 1-5

Types of Attributes

• Simple – single component with an independent existence

• Composite – multiple components, each with an independent existence

• Single-valued – holds a single value for a single entity

• Multi-valued – holds multiple values for a single entity

• Derived attribute – value is derived (computed) from a related attribute
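The attribute types can be sketched as a small Python data class. The Employee and Name classes, their fields and the sample values are hypothetical; age is shown as a derived attribute computed from the stored birth date.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:
    """Composite attribute: several components, each meaningful on its own."""
    first: str
    last: str


@dataclass
class Employee:
    empid: int                                       # simple, single-valued
    name: Name                                       # composite, single-valued
    bdate: date                                      # simple, single-valued
    phones: list[str] = field(default_factory=list)  # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from bdate rather than stored."""
        years = today.year - self.bdate.year
        if (today.month, today.day) < (self.bdate.month, self.bdate.day):
            years -= 1  # birthday has not yet occurred this year
        return years


e = Employee(1, Name("Ola", "Hansen"), date(1990, 6, 15), ["555-0100", "555-0101"])
print(e.age(date(2023, 4, 7)), len(e.phones))
```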

Page 42: ADBMS Lectures 1-5

Examples

• Simple & single-valued. Ex. gender, civil status

• Simple & multi-valued. Ex. phone #

• Composite & single-valued. Ex. name

• Composite & multi-valued. Ex. address

Page 43: ADBMS Lectures 1-5

Relationships

• An association of entities where the association includes one entity from each participating entity type

• Should be named by a word or phrase which explains its function

Page 44: ADBMS Lectures 1-5

Relationships: Examples

• borrower borrows/returns material

• employee belongs to/employs department

• product sold to/bought by customer

Page 45: ADBMS Lectures 1-5

Degree of a Relationship

• The degree indicates the # of associated entities
  – Unary Relationship
    • The same entity type participates more than once, in different roles
  – Binary Relationship
    • Two different entities participate in the relationship
  – Ternary Relationship
    • Three different entities participate in the relationship

Page 46: ADBMS Lectures 1-5

Examples

• Unary Relationship: STAFF supervises STAFF

• Binary Relationship: OWNER owns PROPERTY

• Ternary Relationship: RENTER, STAFF and INTERVIEW, joined by "Sets up"

Page 47: ADBMS Lectures 1-5

Mapping Cardinality

• Expresses the number of entities to which another entity can be associated via a relationship set

• The number of participating entities in a relationship
  – ONE-TO-ONE, e.g. Building - Location
  – ONE-TO-MANY, e.g. Hospital - Patient
  – MANY-TO-MANY, e.g. Author - Book
  – RECURSIVE, e.g. Manager - Employee

Page 48: ADBMS Lectures 1-5

Entity Participation

• A relationship may be required/mandatory or optional for either participant

• e.g. a piece of property must be owned by a person but not all persons need to own a piece of property.

Page 49: ADBMS Lectures 1-5

Sample Notations

Friday, April 7, 2023 49ADBS

Page 50: ADBMS Lectures 1-5

Friday, April 7, 2023 ADBS 50

Simplified Notations for ER Modeling

Entity Type

Weak Entity Type

Relationship Type

Attribute

Multi-valued attribute

derived attribute

one

many

mandatory

optional

Page 51: ADBMS Lectures 1-5

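A Complete ER Diagram

[Figure: ER diagram with entity types EMPLOYEE (empid, name, address, bdate), DEPARTMENT (deptid, deptname) and PROJECT (projid, projname). Relationships: "manages" and "works for" between EMPLOYEE and DEPARTMENT; "works on" (attribute: hours) between EMPLOYEE and PROJECT.]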

Page 52: ADBMS Lectures 1-5

Exercise

• FastFlight Airlines is a small air carrier operating in three north-eastern states. FastFlight is in the process of computerizing its passenger reservation system. The following data items have been identified: reservation code, flight number, flight date, origin, destination, departure time, arrival time, passenger name, seat number, reservation agent number and reservation agent name. For example, flight number 303, which is scheduled every Tuesday and Thursday, leaves Augusta, Maine, at 9:23am and arrives in Nashua, New Hampshire at 10:17am. You can assume that the FastFlight reservation system will detect automatically whether empty seats are available. Draw the Entity Relationship Diagram for this system.

Page 53: ADBMS Lectures 1-5

MITF 202 Advanced Database Systems

Alvin R. Malicdem, Professor

Transformation from ER to Relational Model

Page 54: ADBMS Lectures 1-5

Transformation

• Involves the transformation of the ER diagram to a complete database scheme

• Entity sets in the ER model become tables in a relational database

• The relationships become the source of foreign keys that integrate the resulting tables together

• Based on the mapping cardinalities & entity participation of the relationship

Page 55: ADBMS Lectures 1-5

Given

• Let:
  – E1 and E2 be entity sets
  – R be a relationship that associates E1 & E2

• ER Diagram:

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

Page 56: ADBMS Lectures 1-5

Rule 1

• Many-to-many regardless of entity participation

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1)

E2(pkE2,attrE2)

R(pkE1,pkE2,attrR)

Page 57: ADBMS Lectures 1-5

Rule 2

• One-to-many, mandatory on the many side regardless of entity participation on the “1” side

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1,pkE2,attrR)

E2(pkE2,attrE2)

Page 58: ADBMS Lectures 1-5

Rule 3

• One-to-many, optional on the many side regardless of entity participation on the “1” side

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1)

E2(pkE2,attrE2)

R(pkE1,pkE2,attrR)

Page 59: ADBMS Lectures 1-5

Rule 4

• One-to-one, mandatory on both sides

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1,pkE2,attrR)

E2(pkE2,attrE2)

or

E1(pkE1,attrE1)

E2(pkE2,attrE2,pkE1,attrR)

Page 60: ADBMS Lectures 1-5

Rule 5

• One-to-one, mandatory on one side

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1,pkE2,attrR)

E2(pkE2,attrE2)

Page 61: ADBMS Lectures 1-5

Rule 6

• Optional on both sides

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1)

E2(pkE2,attrE2)

R(pkE1,pkE2,attrR)
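Rules 1 to 6 can be collected into one small Python function. This is a sketch under stated assumptions, not the lecture's own procedure: the function name and argument encoding are hypothetical, E1 is taken to be the "many" side of a one-to-many relationship, Rule 6 is read as the one-to-one case, and for one-to-one relationships the foreign key is placed in the mandatory side.

```python
def transform(cardinality, e1_mandatory, e2_mandatory):
    """Return the relation schemas for a relationship R between E1 and E2.

    cardinality: "M:N", "1:N" (E1 assumed to be the many side) or "1:1".
    e1_mandatory / e2_mandatory: whether each entity must participate in R.
    """
    e1 = ["pkE1", "attrE1"]
    e2 = ["pkE2", "attrE2"]
    # Separate relation for R, used by Rules 1, 3 and 6.
    bridge = {"E1": e1, "E2": e2, "R": ["pkE1", "pkE2", "attrR"]}
    # Foreign key (plus R's attribute) folded into one of the entities.
    fk_in_e1 = {"E1": e1 + ["pkE2", "attrR"], "E2": e2}
    fk_in_e2 = {"E1": e1, "E2": e2 + ["pkE1", "attrR"]}

    if cardinality == "M:N":
        return bridge                                # Rule 1
    if cardinality == "1:N":
        return fk_in_e1 if e1_mandatory else bridge  # Rules 2 and 3
    if cardinality == "1:1":
        if e1_mandatory:
            return fk_in_e1                          # Rule 4 (either side) or Rule 5
        if e2_mandatory:
            return fk_in_e2                          # Rule 5, mirrored
        return bridge                                # Rule 6
    raise ValueError(f"unknown cardinality: {cardinality}")


print(transform("1:N", True, False))
```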

Page 62: ADBMS Lectures 1-5

Weak Entity Rule

[Figure: E1 (pkE1, attrE1) connected through R (attrR) to E2 (pkE2, attrE2)]

E1(pkE1,attrE1,attrR)

E2(pkE2,attrE2,pkE1)

Page 63: ADBMS Lectures 1-5

Multi-valued Attribute Rule

• Transform a multi-valued attribute into a weak entity and follow the weak entity rule

• Ex:

[Figure: EMPLOYEE resides ADDRESS]

Page 64: ADBMS Lectures 1-5

Transform the ER Diagram into a Relational Schema

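[Figure: ER diagram with entity types EMPLOYEE (empid, name, address, bdate), DEPARTMENT (deptid, deptname) and PROJECT (projid, projname). Relationships: "manages" and "works for" between EMPLOYEE and DEPARTMENT; "works on" (attribute: hours) between EMPLOYEE and PROJECT.]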

Page 65: ADBMS Lectures 1-5

Two entities at a time

[Figure: EMPLOYEE (empid, name, address, bdate) works on (hours) PROJECT (projid, projname)]

Apply Rule 1: many-to-many, regardless of entity participation

EMPLOYEE(empid,name,address,bdate)

PROJECT(projid,projname)

WORKS(empid,projid,hours)

Page 66: ADBMS Lectures 1-5

Next

[Figure: EMPLOYEE (empid, name, address, bdate) works for DEPARTMENT (deptid, deptname)]

Apply Rule 2: one-to-many, mandatory on the “many” side

EMPLOYEE(empid,name,address,bdate,deptid)

DEPARTMENT(deptid,deptname)

Page 67: ADBMS Lectures 1-5

And the last…

[Figure: EMPLOYEE (empid, name, address, bdate) manages DEPARTMENT (deptid, deptname)]

Apply Rule 5: one-to-one, mandatory on one side

EMPLOYEE(empid,name,address,bdate)

DEPARTMENT(deptid,deptname,empid)

Page 68: ADBMS Lectures 1-5

The Complete Relational Schema


EMPLOYEE(empid,name,address,bdate,deptid)

PROJECT(projid,projname)

DEPARTMENT(deptid,deptname,empid)

WORKS(empid,projid,hours)
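The four relations above can be written out as table definitions. The sketch below uses Python's sqlite3 module; the column types, the NOT NULL and UNIQUE choices and the sample row values are assumptions added for illustration. DEPARTMENT.empid is left nullable here only so that a department can be inserted before its manager exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    deptid   INTEGER PRIMARY KEY,
    deptname TEXT NOT NULL,
    empid    INTEGER UNIQUE REFERENCES employee(empid)   -- manages (one-to-one)
);
CREATE TABLE employee (
    empid   INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    address TEXT,
    bdate   TEXT,
    deptid  INTEGER NOT NULL REFERENCES department(deptid)  -- works for
);
CREATE TABLE project (
    projid   INTEGER PRIMARY KEY,
    projname TEXT NOT NULL
);
CREATE TABLE works (
    empid  INTEGER NOT NULL REFERENCES employee(empid),
    projid INTEGER NOT NULL REFERENCES project(projid),
    hours  REAL,
    PRIMARY KEY (empid, projid)                           -- works on (many-to-many)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO department (deptid, deptname) VALUES (1, 'Research')")
conn.execute("INSERT INTO employee VALUES (8934, 'Anna Burgin', '2 Ferman St.', '1980-01-01', 1)")
conn.execute("UPDATE department SET empid = 8934 WHERE deptid = 1")
conn.execute("INSERT INTO project VALUES (1, 'Payroll')")
conn.execute("INSERT INTO works VALUES (8934, 1, 12.5)")

rows = conn.execute("""
    SELECT e.name, d.deptname, p.projname, w.hours
    FROM works w
    JOIN employee e ON e.empid = w.empid
    JOIN project p ON p.projid = w.projid
    JOIN department d ON d.deptid = e.deptid
""").fetchall()
print(rows)
```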


Page 69: ADBMS Lectures 1-5

Transformation & Foreign Key Issues


• Referential integrity is a vital consideration in designing a database.

• Whenever a tuple in a referenced relation is deleted, the cascade rule should only be applied to the referencing tuples if there will be no undue loss of information
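The two delete rules can be compared side by side with Python's sqlite3 module. The tables, the sample rows and the build and delete_project helpers are hypothetical; only the ON DELETE behavior is the point.

```python
import sqlite3


def build(on_delete):
    """Create a tiny PROJECT/WORKS database using the given delete rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE project (projid INTEGER PRIMARY KEY, pname TEXT)")
    conn.execute(f"""CREATE TABLE works (
        empid  INTEGER,
        projid INTEGER REFERENCES project(projid) ON DELETE {on_delete},
        PRIMARY KEY (empid, projid)
    )""")
    conn.execute("INSERT INTO project VALUES (1, 'Alpha')")
    conn.execute("INSERT INTO works VALUES (10, 1)")
    return conn


def delete_project(conn):
    """Try to delete the referenced project; report what happened."""
    try:
        conn.execute("DELETE FROM project WHERE projid = 1")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False
    remaining = conn.execute("SELECT COUNT(*) FROM works").fetchone()[0]
    return deleted, remaining


cascade_result = delete_project(build("CASCADE"))    # referencing rows go too
restrict_result = delete_project(build("RESTRICT"))  # delete is refused
print(cascade_result, restrict_result)
```

With CASCADE the referencing tuple in works disappears together with the project; with RESTRICT the project cannot be deleted while it is still referenced.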


Page 70: ADBMS Lectures 1-5

One-many: optional on both sides

• If null values are not allowed in foreign keys:
  – EMPLOYEE(empid,ename)
  – PROJECT(projid,pname,startdate)
  – WORKS(empid,projid,assigndate)
    • FK: empid references EMPLOYEE DELETE restrict/cascade*
    • FK: projid references PROJECT DELETE cascade/restrict

[Figure: EMPLOYEE (empid, ename) works (assigndate) PROJECT (projid, pname, startdate)]

Page 71: ADBMS Lectures 1-5

Explanation

ADBS

• Due to the optionality of the relationship, the cascade rule is possible without any loss of important data. That is, whenever an employee is deleted from EMPLOYEE or a project is removed from PROJECT, the referencing tuples in WORKS can be deleted as well.

• NOTE: if the cascade rule is allowed, the restrict rule is also applicable.

04/07/2023

Page 72: ADBMS Lectures 1-5

One-many: optional on the “many” side, mandatory on the “one” side

• If null values are not allowed in foreign keys:
  – EMPLOYEE(empid,ename)
  – PROJECT(projid,pname,startdate)
  – WORKS(empid,projid,assigndate)
    • FK: empid references EMPLOYEE DELETE cascade/restrict
    • FK: projid references PROJECT DELETE cascade/restrict

[Figure: EMPLOYEE (empid, ename) works (assigndate) PROJECT (projid, pname, startdate)]

Page 73: ADBMS Lectures 1-5

Explanation

• Cascade* indicates that the cascade rule may be used as long as the interface for removing an employee incorporates a checking mechanism for determining whether that employee is the only person working for a particular project.

• Due to the optional participation of EMPLOYEE in the relationship WORKS, the cascade rule is allowed. However, the mandatory participation of PROJECT in the relationship requires that at least one employee works on it.

Page 74: ADBMS Lectures 1-5

One-many: mandatory on the “many” side, optional on the “one” side

• Due to mandatory participation of EMPLOYEE and the transformation rule, null values are absolutely not allowed in the foreign key
  – EMPLOYEE(empid,ename,projid,assigndate)
    • FK: projid references PROJECT DELETE restrict/cascade
  – PROJECT(projid,pname,startdate)

[Figure: EMPLOYEE (empid, ename) works (assigndate) PROJECT (projid, pname, startdate)]

Page 75: ADBMS Lectures 1-5

Explanation

• Despite the optional participation of PROJECT in the relationship WORKS, the deletion of a project must not be cascaded to EMPLOYEE, because deleting the referencing tuples in EMPLOYEE would result in a loss of employee information.

• If, however, the application really requires that employees are automatically removed when a project is shelved, the cascade rule must then be used.

Page 76: ADBMS Lectures 1-5

One-many: mandatory on both sides

  – EMPLOYEE(empid,ename,projid,assigndate)
    • FK: projid references PROJECT DELETE restrict/cascade
  – PROJECT(projid,pname,startdate)

• Due to the mandatory nature of the participation of PROJECT in the relationship WORKS, the deletion of a tuple in EMPLOYEE might lead to a project not having an employee working for it.

[Figure: EMPLOYEE (empid, ename) works (assigndate) PROJECT (projid, pname, startdate)]

Page 77: ADBMS Lectures 1-5

ADVANCE Database System

Database Normalization

Page 78: ADBMS Lectures 1-5

Database Normalization

Friday, April 7, 2023 ADBS 78

• a technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems, namely data anomalies.

Page 79: ADBMS Lectures 1-5

Database Normalization

Friday, April 7, 2023 ADBS 79

• For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity.

Page 80: ADBMS Lectures 1-5

Database Normalization

Friday, April 7, 2023 ADBS 80

• A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.

Page 81: ADBMS Lectures 1-5

Why do we need to normalize?

Friday, April 7, 2023 ADBS 81

• To avoid anomalies

Page 82: ADBMS Lectures 1-5

Why do we need to normalize?

Friday, April 7, 2023 ADBS 82

• Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance.

• Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an Automated teller machine), while less normalized tables tend to be used in database applications that do not need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).

Page 83: ADBMS Lectures 1-5

Normal Forms

Friday, April 7, 2023 ADBS 83

• 1NF, 2NF, 3NF, BCNF, 4NF, 5NF

• Database theory describes a table's degree of normalization in terms of normal forms of successively higher degrees of strictness.

• A table in third normal form (3NF), for example, is consequently in second normal form (2NF) as well; but the reverse is not always the case.

Page 84: ADBMS Lectures 1-5

Anomalies: Problems addressed by Normalization

Friday, April 7, 2023 ADBS 84

• If the proper normal forms aren't followed, various undesirable side effects can occur in a database

• These side effects are commonly referred to as anomalies.
  – Insertion Anomalies
  – Deletion Anomalies
  – Update Anomalies

Page 85: ADBMS Lectures 1-5

Insertion Anomaly

Friday, April 7, 2023 ADBS 85

• This occurs when you can’t add a row to a table because some other, unrelated piece of information is not yet available.

Until the new faculty member is assigned to teach at least one course, his details cannot be recorded.

Page 86: ADBMS Lectures 1-5

Deletion Anomaly

Friday, April 7, 2023 ADBS 86

• This occurs when you can’t delete a row from a table without losing information: the row you want to delete contains an important piece of information, or it is the last row in the table that contains this piece of information.

Page 87: ADBMS Lectures 1-5

Deletion Anomaly

Friday, April 7, 2023 ADBS 87

All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.

Page 88: ADBMS Lectures 1-5

Update Anomaly

Friday, April 7, 2023 ADBS 88

• These occur when there is unnecessary redundancy in the data.

Employee 519 is shown as having different addresses on different records.
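An update anomaly of this kind can be detected mechanically. The rows below are hypothetical (the original table is not reproduced in the transcript) and the function name is an assumption; the sketch simply looks for employees that appear with more than one address.

```python
from collections import defaultdict

# Hypothetical un-normalized rows: the address is repeated on every record.
rows = [
    {"empid": 519, "address": "12 Oak St.", "course": "CS1"},
    {"empid": 519, "address": "98 Elm Ave.", "course": "CS2"},  # stale copy
    {"empid": 600, "address": "5 Pine Rd.", "course": "CS1"},
]


def conflicting_addresses(rows):
    """Return the employees that appear with more than one address."""
    seen = defaultdict(set)
    for r in rows:
        seen[r["empid"]].add(r["address"])
    return {emp: sorted(addrs) for emp, addrs in seen.items() if len(addrs) > 1}


conflicts = conflicting_addresses(rows)
print(conflicts)
```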

Page 89: ADBMS Lectures 1-5

First Normal Form (1NF)

Friday, April 7, 2023 ADBS 89

• Any table that has only one value per cell, or row/column intersection, is in 1NF.
  – Columns contain only scalar values, not arrays
  – There can be only one value per column-row position (field) in a table.

Page 90: ADBMS Lectures 1-5

Violation of 1NF (not normalized)

First Name | Last Name | Course Code | Title

John | Dela Cruz | CS1, CS22, IS101, Math3a | Intro to Computer Concepts, Computer Graphics, Mgmt Info System, College Algebra

Peter | Domingo | CS2, CS3, Math3a, Math6, IS102 | Programming 2, Data Structures, College Algebra, Statistics, Quality Assurance

Kaye | Abad | CS1, Math3a, Engl2 | Intro to Computer Concepts, College Algebra, Comm. Arts 2

Page 91: ADBMS Lectures 1-5

A table in the 1NF

FirstName LastName CourseCode Title

John Dela Cruz CS1 Intro to Computer Concepts

John Dela Cruz CS22 Computer Graphics

John Dela Cruz IS101 Mgmt Info System

John Dela Cruz Math3a College Algebra

Peter Domingo CS2 Programming 2

Peter Domingo CS3 Data Structures

Peter Domingo Math3a College Algebra

Peter Domingo Math6 Statistics

Peter Domingo IS102 Quality Assurance

Kaye Abad CS1 Intro to Computer Concepts

Kaye Abad Math3a College Algebra

Kaye Abad Engl2 Comm Arts 2
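The step from the un-normalized table to the 1NF table can be sketched in Python. The function name and the tuple layout are assumptions; the data is taken from two rows of the un-normalized example. Each comma-separated cell is split so that every output row holds exactly one course.

```python
unnormalized = [
    ("John", "Dela Cruz", "CS1, CS22, IS101, Math3a",
     "Intro to Computer Concepts, Computer Graphics, Mgmt Info System, College Algebra"),
    ("Kaye", "Abad", "CS1,Math3a,Engl2",
     "Intro to Computer Concepts, College Algebra, Comm. Arts 2"),
]


def to_1nf(rows):
    """Split multi-valued cells so each output row holds one course."""
    result = []
    for first, last, codes, titles in rows:
        code_list = [c.strip() for c in codes.split(",")]
        title_list = [t.strip() for t in titles.split(",")]
        if len(code_list) != len(title_list):
            raise ValueError("course codes and titles do not line up")
        for code, title in zip(code_list, title_list):
            result.append((first, last, code, title))
    return result


flat = to_1nf(unnormalized)
for row in flat:
    print(row)
```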

Friday, April 7, 2023 ADBS 91

Page 92: ADBMS Lectures 1-5

Second Normal Form (2NF)

Friday, April 7, 2023 ADBS 92

• It is in the 1NF

• Every column that is not a part of the primary key is functionally dependent on the entire primary key.

Page 93: ADBMS Lectures 1-5

Functional Dependency

Friday, April 7, 2023 ADBS 93

• Attribute B is functionally dependent on attribute A if, for each value of attribute A, there is exactly one value of attribute B.

• For example, Employee Address is functionally dependent on Employee ID, because a particular Employee ID value corresponds to one and only one Employee Address value: Employee ID → Employee Address
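The definition can be turned into a small check over sample rows. The function name and the sample data are hypothetical. Note that such a check can only show that a dependency is violated by the data at hand; it cannot prove that the dependency holds in general.

```python
def holds(rows, lhs, rhs):
    """True if every value of `lhs` maps to exactly one value of `rhs` in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample: two employees who happen to share an address.
rows = [
    {"EmployeeID": 519, "EmployeeAddress": "12 Oak St."},
    {"EmployeeID": 519, "EmployeeAddress": "12 Oak St."},
    {"EmployeeID": 600, "EmployeeAddress": "12 Oak St."},
]

id_determines_address = holds(rows, ["EmployeeID"], ["EmployeeAddress"])
address_determines_id = holds(rows, ["EmployeeAddress"], ["EmployeeID"])
print(id_determines_address, address_determines_id)
```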

Page 94: ADBMS Lectures 1-5

Second Normal Form (2NF)

Friday, April 7, 2023 ADBS 94

• To test whether a table is in the 2NF, we ask:
  – What is the key to this relation? If the key is concatenated or composite (more than one column or attribute), we further ask:
  – Are there any non-key columns that depend on only part of the key?

Page 95: ADBMS Lectures 1-5

Violation of the 2NF

PatientID RelativeID Relationship Patient_Telephone

2490974 GGP001 Father 123-1342

2490974 GGP002 Guardian 123-1342

2490974 GGP003 Mother 123-1342

0803484 PDE001 Brother 789-3421

0803484 PDE002 Uncle 789-3421

9857092 AGE001 Sister 421-5986

Patient_Telephone is not functionally dependent on the entire primary key (PatientID+RelativeID). The Patient_Telephone column is fully dependent on PatientID alone.

Page 96: ADBMS Lectures 1-5

Normalized – 2NF

Friday, April 7, 2023 ADBS 96

PatientID RelativeID Relationship

2490974 GGP001 Father

2490974 GGP002 Guardian

2490974 GGP003 Mother

0803484 PDE001 Brother

0803484 PDE002 Uncle

9857092 AGE001 Sister

PatientID Patient_Telephone

2490974 123-1342

0803484 789-3421

9857092 421-5986
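The decomposition can be reproduced in Python by projecting the original rows onto the two column sets and then joining them back on PatientID. The variable names are assumptions; the data is the patient table from the slides.

```python
rows = [
    ("2490974", "GGP001", "Father", "123-1342"),
    ("2490974", "GGP002", "Guardian", "123-1342"),
    ("2490974", "GGP003", "Mother", "123-1342"),
    ("0803484", "PDE001", "Brother", "789-3421"),
    ("0803484", "PDE002", "Uncle", "789-3421"),
    ("9857092", "AGE001", "Sister", "421-5986"),
]

# Projection 1: columns that depend on the whole key (PatientID, RelativeID).
relatives = sorted({(p, r, rel) for p, r, rel, _ in rows})

# Projection 2: the column that depends on PatientID alone.
phones = sorted({(p, tel) for p, _, _, tel in rows})

# Joining the two projections on PatientID gives back the original table.
rejoined = sorted(
    (p, r, rel, tel)
    for p, r, rel in relatives
    for p2, tel in phones
    if p == p2
)
print(len(relatives), len(phones), rejoined == sorted(rows))
```

Each telephone number is now stored once per patient, and no information is lost because the join reconstructs every original row.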

Page 97: ADBMS Lectures 1-5

Third Normal Form (3NF)

Friday, April 7, 2023 ADBS 97

• It is in the 2NF

• All of its columns that are not part of the primary key are mutually independent

• The 3NF is concerned with the removal of transitive dependencies in tables.

• A transitive dependency exists when a column is dependent on a column that is not part of the primary key.

Page 98: ADBMS Lectures 1-5

Transitive Dependency

Friday, April 7, 2023 ADBS 98

• A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.

• Example:
  – Column1 is directly dependent on the primary key column; another column, say Column2, is indirectly dependent on the primary key column because of its dependency on Column1.
  – The dependency of Column2 on the primary key is by virtue of its dependency on Column1.

Page 99: ADBMS Lectures 1-5

Violation of 3NF

EmployeeID First_Name Last_Name Department Dept_Address

8934 Anna Burgin Research 2 Ferman St.

3049 Clarence Dillon Research 2 Ferman St.

4589 Elliot Freeman Research 2 Ferman St.

7623 Gail Healy MIS 3 Gauss Lane

6103 Keith Jordan MIS 3 Gauss Lane

4503 Michael Landon HR 6 Wiles Place


This table violates the 3NF because Dept_Address depends on Department, which is not part of the primary key.

Page 100: ADBMS Lectures 1-5

Normalized – 3NF

Friday, April 7, 2023 ADBS 100

EmployeeID First_Name Last_Name Department

8934 Anna Burgin Research

3049 Clarence Dillon Research

4589 Elliot Freeman Research

7623 Gail Healy MIS

6103 Keith Jordan MIS

4503 Michael Landon HR

Department Dept_Address

Research 2 Ferman St.

MIS 3 Gauss Lane

HR 6 Wiles Place
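The benefit of the 3NF split can be sketched in Python. The data comes from the two tables above; the with_address helper and the new MIS address are hypothetical. Because each department address is stored once, a single update is seen by every employee of that department.

```python
employees = [
    (8934, "Anna", "Burgin", "Research"),
    (3049, "Clarence", "Dillon", "Research"),
    (4589, "Elliot", "Freeman", "Research"),
    (7623, "Gail", "Healy", "MIS"),
    (6103, "Keith", "Jordan", "MIS"),
    (4503, "Michael", "Landon", "HR"),
]
dept_address = {
    "Research": "2 Ferman St.",
    "MIS": "3 Gauss Lane",
    "HR": "6 Wiles Place",
}


def with_address(employees, dept_address):
    """Join the two tables on Department to rebuild the original wide rows."""
    return [(*e, dept_address[e[3]]) for e in employees]


joined = with_address(employees, dept_address)

# A department moves: one update in one place (hypothetical new address).
dept_address["MIS"] = "9 Newton Rd."
mis_addresses = {row[4] for row in with_address(employees, dept_address) if row[3] == "MIS"}
print(joined[0], mis_addresses)
```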

Page 101: ADBMS Lectures 1-5

Boyce-Codd Normal Form (BCNF)

Friday, April 7, 2023 ADBS 101

• An extension of 3NF for the special case where:
  – There are at least 2 candidate keys in the table
  – All the candidate keys are composite keys
  – There is an overlapping column in the candidate keys.

• For a table to be in BCNF:
  – It must be in the 3NF
  – All of the columns in all of its candidate keys must be functionally independent.

Page 102: ADBMS Lectures 1-5

Violation of the BCNF

EmployeeID Extension Month Overtime_Hours

8934 9089 January 2

8934 9089 February 1.5

8934 9089 March 2

7623 8607 January 0

7623 8607 February 3

7623 8607 March 3

4503 3869 January 2

4503 3869 February 8

4503 3869 March 0

Friday, April 7, 2023 ADBS 102

Page 103: ADBMS Lectures 1-5

Fourth Normal Form (4NF)

Friday, April 7, 2023 ADBS 103

• It is in the 3NF

• There is only one multi-valued dependency per table.
  – A multi-valued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows

• Multi-valued dependency occurs when there is a many-to-many relationship between two columns in a table.

Page 104: ADBMS Lectures 1-5

Violation of the 4NF

AuthorID BookTitle Editor

Smith023 Monkeys are from Asteriod Lisa

Smith023 How to Exercise Catherine

Smith023 How to Exercise John

Williams153 Monkeys are from Asteriod John

Williams153 AP Programming Elaine

Johnson823 Dogs are from Hale-Bopp Lisa

Johnson823 Monkeys are from Asteriod Silvia

Friday, April 7, 2023 ADBS 104

There is no functional dependency between the BookTitle and Editor columns.