Chapter 6
Normalization
6-2
In This Chapter You Will Learn:
How tables that contain redundant data can suffer from update anomalies, which can introduce data inconsistencies into a database.
The rules associated with the most commonly used normal forms, namely first (1NF), second (2NF), and third (3NF) normal forms.
How tables that break the rules of 1NF, 2NF, or 3NF are likely to contain redundant data and suffer from update anomalies.
How to restructure tables that break the rules of 1NF, 2NF, or 3NF.
6-3
Normalization
A technique for producing a suitable set of tables given
the data requirements of an enterprise.
Developed by E.F. Codd (1972).
For a given table, it is often performed as a series of tests
to determine whether the rules of a given normal form
are satisfied or violated.
Normalization leads to a normalized table structure.
6-4
The Purpose of Normalization
To identify a suitable set of tables that support the data requirements of an organization.
A suitable set of tables includes the following:
A minimal number of attributes necessary to support the data requirements.
Attributes with a close, logical relationship are organized in the same table.
Minimal data redundancy: each attribute is represented only once, with an important exception for attributes that form part or all of FKs.
6-5
How Normalization Supports DB Design
© Pearson Education Limited 1995, 2005
6-6
Normalization
The most commonly used normal forms are first (1NF), second (2NF), and third (3NF) normal forms.
All these normal forms are based on rules about relationships among the columns of a table.
A table can be normalized to prevent the possible occurrence of update anomalies.
Note:
In general, the IT industry considers normalization to 3NF
an acceptable level for removing data redundancy.
Higher normalization levels are not widely used.
6-7
Data Redundancy and Update Anomalies
The major aims of normalization are to:
minimize data redundancy by grouping related data columns together in a single table;
reduce the file storage space required by base tables.
Problems associated with data redundancy are
illustrated by comparing the StaffBranch table with the Staff and
Branch tables.
The tables store information about staff and branches.
6-8
Data Redundancy – StaffBranch Table
Question:
1. Any redundant data?
2. Primary key?
[Figure: StaffBranch table]
6-9
Data Redundancy – Staff and Branch Tables
Question:
1. Any redundant data?
2. Primary key?
6-10
Data Redundancy and Update Anomalies
StaffBranch table has redundant data
The details of a branch are repeated for every member
of staff.
In contrast, the branch information appears only once
for each branch in the Branch table, and only the
branch number (branchNo) is repeated in the Staff
table, to represent the location each staff member
works at.
6-11
Data Redundancy and Update Anomalies
Tables that contain redundant information may
potentially suffer from update anomalies.
Types of update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
6-12
Insertion Anomalies
1. To insert a new member of staff located at branch B003, we must also enter the correct details of branch B003 so that
the branch details are consistent with the values for branch B003 in other records of the StaffBranch table.
2. To insert a new branch that currently has no members of staff into the StaffBranch table, it is necessary to enter nulls into the staff-related columns,
such as staffNo. However, as staffNo is the PK for the StaffBranch table, this is
not allowed.
[Figure: StaffBranch table, with the PK column marked]
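The two insertion problems can be reproduced in a small sketch. It uses Python's built-in sqlite3 module with a cut-down StaffBranch table; the column subset, the staff numbers, names, addresses, and branch B004 are all made up for illustration, and only B003 comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down StaffBranch table (hypothetical subset of columns).
# NOT NULL is spelled out because SQLite otherwise tolerates NULLs in a
# non-integer PRIMARY KEY column.
conn.execute("""
    CREATE TABLE StaffBranch (
        staffNo       TEXT PRIMARY KEY NOT NULL,
        name          TEXT,
        branchNo      TEXT,
        branchAddress TEXT
    )
""")
conn.execute(
    "INSERT INTO StaffBranch VALUES ('S0001', 'Ann', 'B003', '10 High Street')"
)

def try_insert(row):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO StaffBranch VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

# Anomaly 1: a new member of staff at B003 with a mistyped address is
# accepted, so B003 now has two different addresses.
new_staff = try_insert(("S0002", "Bob", "B003", "10 Hihg Street"))
b003_addresses = {
    row[0]
    for row in conn.execute(
        "SELECT branchAddress FROM StaffBranch WHERE branchNo = 'B003'"
    )
}

# Anomaly 2: a new branch with no staff needs a NULL staffNo,
# which the PK forbids.
new_branch = try_insert((None, None, "B004", "5 Low Road"))
```

Nothing in the single-table design stops the inconsistent address from being stored, while the branch-only row cannot be stored at all.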
6-13
A Design Without Insertion Anomalies
1. To insert a new staff located at branch B003, …
2. To insert a new branch that currently has no members of staff ……
No problem with the new design !!!!
6-14
Deletion Anomalies
If we delete a record from the StaffBranch table that represents the last member of staff located at a branch, the details about that branch are also lost from the database.
Example: if we delete the record for staff S0415, …
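A minimal sketch of the deletion anomaly, using plain Python lists in place of tables. The rows are hypothetical, and it is an assumption here that S0415 is the only member of staff at branch B003.

```python
# Hypothetical rows; S0415 is assumed to be the only staff member at B003.
staff_branch = [
    {"staffNo": "S0001", "branchNo": "B001", "branchAddress": "1 Main Street"},
    {"staffNo": "S0002", "branchNo": "B001", "branchAddress": "1 Main Street"},
    {"staffNo": "S0415", "branchNo": "B003", "branchAddress": "10 High Street"},
]

def delete_staff(rows, staff_no):
    """Return the rows that remain after deleting one member of staff."""
    return [row for row in rows if row["staffNo"] != staff_no]

# Single-table design: deleting S0415 also removes every trace of B003.
remaining = delete_staff(staff_branch, "S0415")
branches_single = {row["branchNo"] for row in remaining}

# Two-table design: branch details live in their own table and survive.
staff = [{"staffNo": r["staffNo"], "branchNo": r["branchNo"]} for r in staff_branch]
branch = {r["branchNo"]: r["branchAddress"] for r in staff_branch}
staff = delete_staff(staff, "S0415")
branches_split = set(branch)
```

After the delete, `branches_single` no longer contains B003, whereas `branches_split` still does.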
6-15
A Design Without Deletion Anomalies
If we delete the record for staff S0415, …
No problem with the new design !!!!
6-16
Modification Anomalies
If we want to change the value of one of the columns of a particular branch in the StaffBranch table, for example the telephone number for branch B001, we must update the records of all staff located at that branch.
If this modification is not carried out on all the appropriate records of the StaffBranch table, the database will become inconsistent.
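The inconsistency described above can be sketched in a few lines of Python. The staff numbers and telephone numbers are invented for illustration; only branch B001 comes from the slide.

```python
# Hypothetical rows: two members of staff at branch B001.
staff_branch = [
    {"staffNo": "S0001", "branchNo": "B001", "telNo": "555-0100"},
    {"staffNo": "S0002", "branchNo": "B001", "telNo": "555-0100"},
]

def tel_numbers(rows, branch_no):
    """Collect every telephone number recorded for one branch."""
    return {row["telNo"] for row in rows if row["branchNo"] == branch_no}

# Incomplete modification: only one of the two B001 records is changed.
staff_branch[0]["telNo"] = "555-0199"
recorded_for_b001 = tel_numbers(staff_branch, "B001")

# With a separate Branch table the number is stored once,
# so a single update cannot leave stale copies behind.
branch = {"B001": {"telNo": "555-0100"}}
branch["B001"]["telNo"] = "555-0199"
```

`recorded_for_b001` ends up holding two different numbers for the same branch, which is exactly the inconsistent state the slide warns about.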
6-17
A Design Without Modification Anomalies
If we want to change the value of the telephone number for branch B001, …
No problem with the new design !!!!
6-18
Functional Dependency
Functional Dependency (FD)
Describes the relationship between the columns of a table
For example,
Assume that A and B are columns of table R.
B is functionally dependent on A (denoted A → B), if
each value of A in R is associated with exactly one value
of B in R, at any moment in time.
For example, in the StaffBranch table:
staffNo → branchNo (Yes)
branchNo → staffNo (No)
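The definition can be tested against sample data with a short helper. This is a sketch with hypothetical rows; note that a sample can only refute a functional dependency, because the definition requires it to hold at any moment in time, not just in the rows currently stored.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows,
    i.e. each combination of lhs values maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical StaffBranch rows (only the two columns of interest).
rows = [
    {"staffNo": "S0001", "branchNo": "B001"},
    {"staffNo": "S0002", "branchNo": "B001"},
    {"staffNo": "S0003", "branchNo": "B003"},
]

staff_determines_branch = holds(rows, ["staffNo"], ["branchNo"])
branch_determines_staff = holds(rows, ["branchNo"], ["staffNo"])
```

With these rows, staffNo → branchNo is not violated, while branchNo → staffNo fails because B001 is associated with two staff numbers.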
6-19
Functional Dependency
The determinant of a functional dependency refers to the column or group of columns on the left-hand side of the arrow.
Diagrammatic representation.
Example: branchNo → branchAddress
6-20
Example - Functional Dependency
6-21
The Process of Normalization
A formal technique for analyzing a relation based on the PK of the relation and the FDs between the columns of the relation (table).
Normalization consists of a series of rules that must be applied to convert from an unnormalized structure into a normalized structure.
The process is described in a series of steps which lead to “higher” levels of normalization. These levels are called normal forms.
As normalization proceeds step by step, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
6-22
6-23
First Normal Form (1NF)
A table is in 1NF if the intersection of each record and
each column contains only one value in the table.
6-24
The Following Table Is Not In 1NF
Branch Table
6-25
Converting Branch Table to 1NF (Method 1)
1. Place the multi-valued column(s) along with a copy of
the original key column(s) into a separate table.
2. Remove the multi-valued column(s) from the original
table
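The two steps of Method 1 can be sketched as follows. The slide's Branch table did not survive in the transcript, so the multi-valued column name `telNos`, the new table name `branch_telephone`, and all values are assumptions made for illustration.

```python
# Hypothetical unnormalized Branch rows: telNos holds several values per record.
branch_unf = [
    {"branchNo": "B001", "branchAddress": "1 Main Street",
     "telNos": ["555-0100", "555-0101"]},
    {"branchNo": "B003", "branchAddress": "10 High Street",
     "telNos": ["555-0300"]},
]

# Step 1: place the multi-valued column, with a copy of the key column,
# into a separate table (one record per value).
branch_telephone = [
    {"branchNo": row["branchNo"], "telNo": tel}
    for row in branch_unf
    for tel in row["telNos"]
]

# Step 2: remove the multi-valued column from the original table.
branch = [
    {"branchNo": row["branchNo"], "branchAddress": row["branchAddress"]}
    for row in branch_unf
]
```

Every cell in `branch` and `branch_telephone` now holds a single value, and `branchNo` in the new table links each telephone number back to its branch.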
6-26
Converting to 1NF: Method 1
6-27
Converting to 1NF: Method 2
[Figure: Method 2, annotated with "copy" and "new record" labels]
6-28
Second Normal Form (2NF)
Applies only to tables with composite primary keys.
Based on the concept of full functional dependency.
Full functional dependency: if A and B are columns of a table, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
If B is dependent on a subset of A, this is referred to as a partial dependency (PD).
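As a sketch of the distinction, the helper below checks full functional dependency against sample rows by testing every non-empty proper subset of the determinant. The rows are hypothetical, and as with any data-based check it can only show that a dependency is not contradicted by the sample.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """True if each combination of lhs values maps to one rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs
    also determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, list(subset), rhs):
                return False  # a partial dependency exists
    return True

# Hypothetical rows with a composite key (staffNo, branchNo).
rows = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "1 Main St", "hoursPerWeek": 16},
    {"staffNo": "S1", "branchNo": "B2", "branchAddress": "9 Park Rd", "hoursPerWeek": 9},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "1 Main St", "hoursPerWeek": 14},
    {"staffNo": "S2", "branchNo": "B2", "branchAddress": "9 Park Rd", "hoursPerWeek": 10},
]
key = ["staffNo", "branchNo"]

hours_full = fully_dependent(rows, key, ["hoursPerWeek"])
address_full = fully_dependent(rows, key, ["branchAddress"])
```

Here `hoursPerWeek` needs both key columns, while `branchAddress` is already determined by `branchNo` alone and is therefore only partially dependent on the key.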
6-29
Second Normal Form (2NF)
A table is in 2NF if
1. it is in 1NF, and
2. every non-PK column is fully functionally dependent on the PK.
A table in 1NF will be in 2NF if any one of the following applies:
The PK is composed of only one column.
No nonkey columns exist in the table.
Every nonkey attribute is dependent on all of the columns of the PK.
6-30
TempStaffAllocation Table Is Not In 2NF
branchNo → branchAddress (a PD; branchNo is part of the PK)
staffNo → name, position (a PD; staffNo is part of the PK)
staffNo, branchNo → hoursPerWeek
6-31
Converting to Second Normal Form
For each group of partial dependencies:
1. Determine which non-key columns are not dependent
upon the table’s entire PK.
2. Remove those columns from the base table.
3. Create a new table with those columns and the partial
PK columns that they depend upon.
4. Create a FK for the original base table, which links to the
PK of the new table.
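The four steps can be sketched at the schema level, working from a list of FDs rather than from data. The function name `to_2nf` and the tuple-based representation are illustrative choices; the columns and FDs are those given for TempStaffAllocation.

```python
def to_2nf(pk, columns, fds):
    """Split off every partial dependency (determinant is a proper subset
    of the PK) into its own table; return (base_columns, new_tables)."""
    new_tables = {}
    moved = set()
    for lhs, rhs in fds:
        if set(lhs) < set(pk):  # step 1: determinant is only part of the PK
            # Step 3: new table keyed by the partial PK columns.
            new_tables[lhs] = list(lhs) + list(rhs)
            moved.update(rhs)
    # Step 2: remove the moved columns from the base table. The PK columns
    # stay behind and act as FKs to the new tables (step 4).
    base = [col for col in columns if col not in moved]
    return base, new_tables

pk = ("staffNo", "branchNo")
columns = ["staffNo", "branchNo", "name", "position", "branchAddress", "hoursPerWeek"]
fds = [
    (("branchNo",), ("branchAddress",)),
    (("staffNo",), ("name", "position")),
    (("staffNo", "branchNo"), ("hoursPerWeek",)),
]

base, new_tables = to_2nf(pk, columns, fds)
```

The base table keeps `staffNo`, `branchNo`, and `hoursPerWeek`, and two new tables are created, one keyed by `branchNo` and one keyed by `staffNo`.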
6-32
Converting To 2NF
Using:
branchNo → branchAddress
staffNo → name, position
6-33
Third Normal Form (3NF)
Based on the concept of transitive dependency.
Transitive Dependency (TD)
If A → B and B → C, then C is transitively dependent on
A through B.
6-34
Third Normal Form (3NF)
A table is in 3NF if
1. it is in 1NF and 2NF, and
2. no non-PK column transitively depends on the PK.
A table is in 3NF if every nonkey column directly depends on the PK, and not on another nonkey column.
6-35
Converting to 3NF
For each group of transitive dependencies, remove any
columns that depend upon another non-key column:
1. Determine which columns depend upon another non-key column(s).
2. Remove those columns from the base table.
3. Create a new table with those columns and the non-key column(s) that they depend upon.
4. Create a foreign key in the original table, which links to the PK of the new table.
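A schema-level sketch of these steps, applied to the StaffBranch columns. The function name `to_3nf` and the data representation are assumptions for illustration; a dependency is treated as transitive when its determinant consists only of non-key columns.

```python
def to_3nf(pk, columns, fds):
    """Split off every dependency whose determinant is made up of non-key
    columns; return (base_columns, new_tables)."""
    non_key = set(columns) - set(pk)
    new_tables = {}
    moved = set()
    for lhs, rhs in fds:
        if set(lhs) <= non_key:  # step 1: columns depend on non-key column(s)
            # Step 3: new table holding the determinant and its dependents.
            new_tables[lhs] = list(lhs) + list(rhs)
            moved.update(rhs)
    # Step 2: remove the dependent columns. The determinant stays in the
    # base table as the foreign key to the new table (step 4).
    base = [col for col in columns if col not in moved]
    return base, new_tables

pk = ("staffNo",)
columns = ["staffNo", "name", "position", "salary",
           "branchNo", "branchAddress", "telNo"]
fds = [
    (("staffNo",), ("name", "position", "salary",
                    "branchNo", "branchAddress", "telNo")),
    (("branchNo",), ("branchAddress", "telNo")),
]

base, new_tables = to_3nf(pk, columns, fds)
```

`branchNo` remains in the base table as the FK, while `branchAddress` and `telNo` move to a table keyed by `branchNo`.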
6-36
StaffBranch Table Is Not In 3NF
staffNo → name, position, salary, branchNo, branchAddress, telNo
branchNo → branchAddress, telNo (a group of transitive dependencies)
6-37
Converting To 3NF
Using: branchNo → branchAddress, telNo
6-38
Example 1 – Normalization: Property Rental Report
6-39
Example 1 – UNF to 1NF
[Figure: UNF to 1NF conversion, annotated with "copy" and "new record" labels]
6-40
Example 1 – Define Primary Key
(Customer_No, RentStart)?
(Customer_No, RentFinish)? Note: NULL values could be in RentFinish.
(Property_No, RentStart)?
(Property_No, RentFinish)?
(Customer_No, Property_No)? Any assumption? A customer does not rent the same property twice.
6-41
Example 1 – FDs for Customer_Rental
FDs:
Customer_No, Property_No → CName, PAddress, RentStart, RentFinish, Rent, Owner_No, OName (primary key)
Customer_No → CName
Property_No → PAddress, Rent, Owner_No, OName
Owner_No → OName
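One way to check the choice of primary key against these FDs is to compute attribute closures: the set of columns reachable from a starting set by repeatedly applying the FDs. This is a standard technique rather than something shown on the slide, and the `closure` helper below is a minimal sketch of it.

```python
def closure(attrs, fds):
    """Return every column determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole determinant is already reachable.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

all_columns = {"Customer_No", "Property_No", "CName", "PAddress", "RentStart",
               "RentFinish", "Rent", "Owner_No", "OName"}
fds = [
    (("Customer_No", "Property_No"),
     ("CName", "PAddress", "RentStart", "RentFinish", "Rent", "Owner_No", "OName")),
    (("Customer_No",), ("CName",)),
    (("Property_No",), ("PAddress", "Rent", "Owner_No", "OName")),
    (("Owner_No",), ("OName",)),
]

key_closure = closure({"Customer_No", "Property_No"}, fds)
customer_closure = closure({"Customer_No"}, fds)
property_closure = closure({"Property_No"}, fds)
```

The composite key reaches all nine columns, while neither `Customer_No` nor `Property_No` does so alone, so under these FDs no part of the key could be dropped.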
6-42
Example 1 – Converting Customer_Rental to 2NF
Remove partial dependency
[Figure: Customer_Rental with the primary key marked; annotations 1, 2, 3]
6-43
Converting Customer_Rental to 2NF
6-44
Example 1 – Converting Property_Owner To 3NF
Remove transitive dependency
[Figure: Property_Owner; annotations 1, 2]
6-45
Example 1 – Converting Property_Owner To 3NF
6-46
Example 1 – Process of Normalization
Remove PD: 3 tables
Remove TD: 4 tables
6-47
Example 1 – Summary of 3NF
[Figure: original table]
6-48
Example 2 – Property Inspection Report
6-49
Example 2 – Property Inspection
Business Rules:
When staff are required to undertake inspections, they
are allocated a company car for use on the day of the
inspections.
However, a car may be allocated to several staff members
as required throughout the working day.
A staff member may inspect several properties on a
given date, but a property is only inspected once on a
given date.
6-50
Example 2 – UNF To 1NF
[Figure: UNF to 1NF conversion, annotated with "copy" and "new record" labels]
6-51
Example 2 – Define Primary Key
(Staff_No, IDate)?
(Property_No, IDate)?
Check business rules.
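The business rules decide between the two candidates: a staff member may inspect several properties on one date, but a property is inspected only once on a given date. A small sketch with hypothetical inspection rows shows what that means for uniqueness.

```python
from collections import Counter

# Hypothetical rows: (Property_No, IDate, Staff_No).
inspections = [
    ("P001", "2024-05-18", "S001"),
    ("P002", "2024-05-18", "S001"),  # same staff, same date, another property
    ("P001", "2024-07-26", "S002"),
]

def is_unique(rows, positions):
    """True if the columns at the given positions never repeat as a group."""
    counts = Counter(tuple(row[i] for i in positions) for row in rows)
    return all(n == 1 for n in counts.values())

staff_date_unique = is_unique(inspections, (2, 1))     # (Staff_No, IDate)
property_date_unique = is_unique(inspections, (0, 1))  # (Property_No, IDate)
```

Rows permitted by the business rules already break uniqueness for (Staff_No, IDate), whereas (Property_No, IDate) stays unique, which is consistent with using it as the determinant in the FDs that follow.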
6-52
Example 2 – FDs Of Property_Inspection
FDs:
FD1: Property_No, IDate → ITime, PAddress, Comments, Staff_No, SName, Car_Reg
FD2: Property_No → PAddress
FD3: Staff_No → SName
6-53
Example 2 – Converting To 2NF
Property_Inspection (Property_No, IDate, ITime, PAddress, Comments, Staff_No, SName, Car_Reg)
Remove FD2 (Partial Dependency)
Prop_Inspection (Property_No, IDate, ITime, Comments, Staff_No, SName, Car_Reg)
Prop (Property_No, PAddress)
6-54
Example 2 – Converting To 3NF
Prop (Property_No, PAddress)
Prop_Inspection (Property_No, IDate, ITime, Comments, Staff_No, SName, Car_Reg)
Remove FD3 (Transitive Dependency)
Prop (Property_No, PAddress)
Prop_Inspection (Property_No, IDate, ITime, Comments, Staff_No, Car_Reg)
Staff (Staff_No, SName)
6-55
Summary: Normalization Rules
Normal form rule: description
First Normal Form (1NF): The table must express a set of unordered, two-dimensional tables. The table cannot contain repeating groups.
Second Normal Form (2NF): The table must be in 1NF. Every non-key column must be dependent on all parts of the primary key.
Third Normal Form (3NF): The table must be in 2NF. No non-key column may be functionally dependent on another non-key column.
“Each non-primary key value MUST be dependent on the key, the whole key, and nothing but the key.”
The key: no repeating group.
The whole key: no partial dependency.
Nothing but the key: no transitive dependency.
6-56
The Process of Normalization up to 3NF