database design and development: a visual approach
Database Design and Development: A Visual Approach © 2006 Prentice Hall
Chapter 4
DATABASE DESIGN AND DEVELOPMENT: A VISUAL APPROACH
Chapter 4
Normalization
Raymond Frost – John Day – Craig Van Slyke
Normalized vs. Denormalized
Exhibit 4-1: Arcade Database Normalized vs. Denormalized Design
Denormalized Sample Data
Exhibit 4-2: Arcade Database Denormalized Sample Data
*note the duplicate entries
Denormalized Design Causes Update Problems
Exhibit 4-3: Arcade Database Update Problems Due to Duplicate Data
When the password for Thom Luce was changed, it was changed in only one of his two entries.
Which password is the correct password?
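The update problem can be reproduced in a few lines. This is a minimal sketch, assuming an in-memory SQLite database; the table name visit_denormalized, the email address, and the password values are hypothetical stand-ins, since the exhibit's actual data is not reproduced in this transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical denormalized table: member fields repeat on every visit row.
conn.execute(
    "CREATE TABLE visit_denormalized ("
    "id TEXT PRIMARY KEY, email TEXT, password TEXT, fname TEXT, lname TEXT)"
)
conn.executemany(
    "INSERT INTO visit_denormalized VALUES (?, ?, ?, ?, ?)",
    [
        ("001", "luce@example.com", "old_pw", "Thom", "Luce"),
        ("002", "luce@example.com", "old_pw", "Thom", "Luce"),
    ],
)

# The password change reaches only one of the two duplicate entries.
conn.execute("UPDATE visit_denormalized SET password = 'new_pw' WHERE id = '001'")

# The same member now has two different passwords on record.
passwords = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT password FROM visit_denormalized "
        "WHERE email = 'luce@example.com' ORDER BY password"
    )
]
print(passwords)
```

After the partial update, the query returns two passwords for one email, and nothing in the table says which is current.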
Normalized Design Eliminates Update Problems
Exhibit 4-4: Arcade Database No Update Problem in Normalized Data
Denormalized Design Creates Insert Problems
Exhibit 4-5: Arcade Database Insert Problem Due to Duplicate Data
A new member cannot be created until they have made a visit, because the row would otherwise have no primary key value, and that field may not be left blank.
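The insert problem can be sketched the same way. This example assumes SQLite and a hypothetical visit_denormalized table whose primary key is the visit id; the member details are invented. The column is declared NOT NULL explicitly because SQLite does not enforce that on a TEXT primary key by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical denormalized table keyed on the visit id.
conn.execute(
    "CREATE TABLE visit_denormalized ("
    "id TEXT NOT NULL PRIMARY KEY, email TEXT, fname TEXT, lname TEXT)"
)

# A member who has not visited yet has no visit id to supply.
try:
    conn.execute(
        "INSERT INTO visit_denormalized (id, email, fname, lname) "
        "VALUES (NULL, 'new.member@example.com', 'New', 'Member')"
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

print(insert_rejected)
```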
Normalized Design Eliminates the Insert Problem
Exhibit 4-6: Arcade Database No Insert Problem in Normalized Design
Denormalized Design Creates Delete Problems
Exhibit 4-7: Arcade Database Delete Problem Due to Duplicate Data
If a member makes only one visit, deleting that record will cause the loss of the member data.
Deleting visit 005 would cause the loss of Sean McGann’s member data.
Normalized Design Eliminates the Delete Problem
Exhibit 4-8: Arcade Database No Delete Problem in Normalized Design
Now, deleting visit 005 would not cause the loss of Sean McGann’s member data.
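The two delete outcomes can be compared side by side. This is a sketch under assumptions: SQLite as the engine, hypothetical table names (visit_denormalized, member, visit), and an invented email address for Sean McGann. The column member_email stands in for the MEMBER$email foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visit_denormalized (
        id TEXT PRIMARY KEY, email TEXT, fname TEXT, lname TEXT);
    INSERT INTO visit_denormalized
        VALUES ('005', 'mcgann@example.com', 'Sean', 'McGann');

    CREATE TABLE member (email TEXT PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE visit (
        id TEXT PRIMARY KEY,
        member_email TEXT REFERENCES member(email));
    INSERT INTO member VALUES ('mcgann@example.com', 'Sean', 'McGann');
    INSERT INTO visit VALUES ('005', 'mcgann@example.com');

    DELETE FROM visit_denormalized WHERE id = '005';
    DELETE FROM visit WHERE id = '005';
""")

# Count what is still known about the member in each design.
left_denormalized = conn.execute(
    "SELECT COUNT(*) FROM visit_denormalized WHERE lname = 'McGann'"
).fetchone()[0]
left_normalized = conn.execute(
    "SELECT COUNT(*) FROM member WHERE lname = 'McGann'"
).fetchone()[0]
print(left_denormalized, left_normalized)
```

In the denormalized table the member vanishes with the visit; in the normalized pair the MEMBER row is untouched.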
First Normal Form (1NF)
Exhibit 4-9: 1NF Violation
Definition: A table in which all fields contain a single value.
The example above would be a violation of the 1NF rule; therefore this table would have to be redesigned.
Fixing Normalization Violations
Step 1: Tables
- Create new table(s)
- Rename original table if necessary
Step 2: Relationships
- Establish relationships between original and new table(s)
Step 3: Fields
- Transfer fields and rename as needed
Step 4: Keys
- Choose PK and FK for all tables
Solving a 1NF Violation
Exhibit 4-10: Arcade Database Solving the 1NF Violation
Step 1: Tables
- Since the phone number column violated the 1NF rule, make a new table to hold phone numbers: DIRECTORY.
Step 2: Relationships
- A member has multiple phone numbers and a phone number belongs to one member
Step 3: Fields
- The phone field is transferred to the DIRECTORY table
Step 4: Keys
- The email column in MEMBER becomes a FK (MEMBER$email) in DIRECTORY
- The PK in DIRECTORY becomes MEMBER$email and phone since two members could have the same phone number (e.g. two members from the same household).
Tables in 1NF Eliminate Repeating Data Problems
Exhibit 4-11: 1NF Solution with Sample Data
Now all tables have fields that contain only single values.
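The 1NF fix can be sketched as a small data migration. The rows below are hypothetical (the exhibit data is not in this transcript), SQLite is an assumed engine, and member_email stands in for MEMBER$email. The multi-valued phone field is split so that each DIRECTORY row holds one phone number.

```python
import sqlite3

# Hypothetical 1NF-violating rows: the phone field holds several values at once.
members_raw = [
    ("luce@example.com", "Thom", "Luce", "555-0101, 555-0102"),
    ("mcgann@example.com", "Sean", "McGann", "555-0101"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (email TEXT PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE directory (
        member_email TEXT NOT NULL REFERENCES member(email),
        phone TEXT NOT NULL,
        PRIMARY KEY (member_email, phone)
    );
""")

for email, fname, lname, phones in members_raw:
    conn.execute("INSERT INTO member VALUES (?, ?, ?)", (email, fname, lname))
    # One DIRECTORY row per phone number, so every field holds a single value.
    for phone in phones.split(","):
        conn.execute("INSERT INTO directory VALUES (?, ?)", (email, phone.strip()))

directory_rows = conn.execute(
    "SELECT member_email, phone FROM directory ORDER BY member_email, phone"
).fetchall()
print(directory_rows)
```

Because the key is the pair (member_email, phone), the shared number 555-0101 can appear once for each member, which is the household case noted in Step 4.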
Determinants
Exhibit 4-12: Primary Key Determines Non-key Fields
Determinant: a field or group of fields that controls or determines the values in another field.
The value of email will determine the values in all the other fields.
That is, if you know someone’s email, you can determine the rest of their information.
Determinants and Duplicate Data
Exhibit 4-13: Email Acts as a Determinant
Exhibit 4-14: Email Fails to Act as a Determinant
When you have duplicate data, knowing someone’s email may not allow you to determine the rest of the data as shown below.
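Whether a field acts as a determinant can be tested against sample rows. The helper below, determines, is a hypothetical illustration and not something from the book; the rows are invented. Note that sample data can only disprove a determinant, never prove one in general.

```python
def determines(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant)
        value = tuple(row[field] for field in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Duplicate rows that agree: email still acts as a determinant.
consistent = [
    {"email": "luce@example.com", "password": "pw1", "lname": "Luce"},
    {"email": "luce@example.com", "password": "pw1", "lname": "Luce"},
]
# Duplicate rows that disagree: email no longer determines password.
inconsistent = [
    {"email": "luce@example.com", "password": "pw1", "lname": "Luce"},
    {"email": "luce@example.com", "password": "pw2", "lname": "Luce"},
]

acts_as_determinant = determines(consistent, ["email"], ["password"])
fails_as_determinant = determines(inconsistent, ["email"], ["password"])
print(acts_as_determinant, fails_as_determinant)
```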
Second Normal Form (2NF)
Exhibit 4-15: 2NF Violation
Definition: A table in which each non-key field is determined by the whole primary key and not part of the primary key by itself.
In this example, once you know a student’s id, you can determine his or her name, dorm, and phone.
Therefore, the fname, lname, dorm, and phone non-key fields are determined by just part of the primary key: id.
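A partial dependency of this kind can be searched for mechanically. The sketch below is an assumption-laden illustration: the function names, the composite key (id, call_no), and the sample rows are all hypothetical. It reports every non-key field that a proper subset of the key appears to determine in the given rows.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """True if, in these rows, each combination of lhs values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (key_part, field) pairs where only part of the key determines field."""
    found = []
    for size in range(1, len(key)):
        for part in combinations(key, size):
            for field in non_key:
                if holds(rows, part, field):
                    found.append((part, field))
    return found


# Hypothetical enrollment rows keyed on (id, call_no).
rows = [
    {"id": "S1", "call_no": "001", "lname": "Green", "grade": "A"},
    {"id": "S1", "call_no": "002", "lname": "Green", "grade": "B"},
    {"id": "S2", "call_no": "001", "lname": "Hypothetical", "grade": "C"},
]
violations = partial_dependencies(rows, ("id", "call_no"), ("lname", "grade"))
print(violations)
```

Here lname depends on id alone, which is the 2NF violation, while grade needs both parts of the key.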
Update Problem Caused by 2NF Violation
Exhibit 4-16: 2NF Violation Creates Update Problem
Data not determined by the whole primary key will be duplicated, and an update may miss some of the duplicate copies.
In this example, we no longer know the correct phone number for Jim Green.
Solving a 2NF Violation
Exhibit 4-17: Enrollment Database Solving the 2NF Violation
Step 1: Tables
- Since only name, dorm, and phone belong in the STUDENT table, create a new table (ENROLL) for the registration information.
Step 2: Relationships
- A student can enroll in many sections but a particular student-section enrollment relates back to one student.
Step 3: Fields
- The SECTION$call_no and grade are information about the enrollment.
Step 4: Keys
- The id column in STUDENT becomes a FK (STUDENT$id) in ENROLL.
- The PK in ENROLL becomes STUDENT$id and SECTION$call_no.
All Keys Are Now Determinants
Exhibit 4-18: Keys Are Now the Only Determinants
All the fields in the STUDENT table are determined by the id.
In ENROLL, grade is determined by the student + section.
2NF Solution with Sample Data
Exhibit 4-19: 2NF Solution with Sample Data
If you know a student’s id, you can determine the values in fname, lname, dorm, and phone.
If you know the id of a student and the call_no of a section, you can determine the value of grade.
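The 2NF solution can be sketched as a schema. This assumes SQLite; the ids, dorm, phone numbers, and grades are hypothetical, and student_id and section_call_no stand in for STUDENT$id and SECTION$call_no. The student's phone is stored once, so a single update reaches every enrollment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        id TEXT PRIMARY KEY, fname TEXT, lname TEXT, dorm TEXT, phone TEXT);
    CREATE TABLE enroll (
        student_id TEXT NOT NULL REFERENCES student(id),
        section_call_no TEXT NOT NULL,
        grade TEXT,
        PRIMARY KEY (student_id, section_call_no)
    );
    INSERT INTO student VALUES ('S1', 'Jim', 'Green', 'Hypothetical Hall', '555-0100');
    INSERT INTO enroll VALUES ('S1', '001', 'A');
    INSERT INTO enroll VALUES ('S1', '002', 'B');
""")

# The phone lives in exactly one row, so one UPDATE is enough.
conn.execute("UPDATE student SET phone = '555-0199' WHERE id = 'S1'")

# Every enrollment sees the same phone number through the join.
phones = conn.execute(
    "SELECT DISTINCT s.phone FROM enroll e JOIN student s ON s.id = e.student_id"
).fetchall()
print(phones)
```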
Third Normal Form (3NF)
Exhibit 4-20: 3NF Violation
Definition: A table in which none of the non-key fields determine another non-key field.
In this example, once you know a member’s email, you can determine his or her password, name, and phone number.
Therefore, the password, name, and phone non-key fields are determined by another non-key field: email.
Update Problem Caused by 3NF Violation
Exhibit 4-21: 3NF Violation Creates Update Problem
By including both member information and visit information in the same table, not all non-key fields are determined by the primary key.
Solving a 3NF Violation
Exhibit 4-22: 3NF Solution
Step 1: Tables
- Since only password, name, and phone belong in the MEMBER table, create a new table (VISIT) for the visit information.
Step 2: Relationships
- A member can make many visits, but a particular visit is associated with one member.
Step 3: Fields
- The id, date_time_in, and date_time_out are information about the visit, so move them to the VISIT table.
Step 4: Keys
- The email column becomes the PK in MEMBER.
- The email column in MEMBER becomes a FK (MEMBER$email) in VISIT.
- The PK in VISIT becomes the session id for the visit.
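The four steps can be sketched as a schema. This assumes SQLite with foreign key enforcement switched on; the sample values are hypothetical and member_email stands in for MEMBER$email. It shows a single password update reaching every visit, and a visit for an unknown member being rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE member (
        email TEXT PRIMARY KEY, password TEXT, fname TEXT, lname TEXT, phone TEXT);
    CREATE TABLE visit (
        id TEXT PRIMARY KEY,
        member_email TEXT NOT NULL REFERENCES member(email),
        date_time_in TEXT,
        date_time_out TEXT
    );
    INSERT INTO member VALUES ('luce@example.com', 'pw1', 'Thom', 'Luce', '555-0101');
    INSERT INTO visit VALUES ('001', 'luce@example.com', '2006-01-05 10:00', '2006-01-05 11:00');
    INSERT INTO visit VALUES ('002', 'luce@example.com', '2006-01-06 14:00', '2006-01-06 15:30');
""")

# One update in MEMBER is visible from every visit.
conn.execute("UPDATE member SET password = 'pw2' WHERE email = 'luce@example.com'")
visit_passwords = conn.execute(
    "SELECT v.id, m.password FROM visit v "
    "JOIN member m ON m.email = v.member_email ORDER BY v.id"
).fetchall()

# A visit must point at an existing member.
try:
    conn.execute("INSERT INTO visit VALUES ('003', 'nobody@example.com', NULL, NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(visit_passwords, orphan_rejected)
```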
Keys Are Now Determinants
Exhibit 4-23: 3NF Solution – Keys Are Now the Only Determinants
All the fields in the MEMBER table are determined by email.
In VISIT, all non-key fields, including MEMBER$email, are determined by id.
3NF Solution With Sample Data
Exhibit 4-24: 3NF Solution with Sample Data
All the non-key fields are determined only by the primary key.
Boyce-Codd Normal Form (BCNF)
Exhibit 4-25: BCNF Violation
Definition: Every determinant is a key.
In this example, there are two fields that could be determinants for an employee: employee_id and ssn.
Update Problem Caused by a BCNF Violation
Exhibit 4-26: BCNF Violation Creates Update Problem
Since ssn is a non-key field in this table, it can be easily updated, causing incorrect data.
Solving a BCNF Violation
Exhibit 4-27: BCNF Solution
Step 1: Tables
- Since ssn is really information about the employee, it needs to be in the EMPLOYEE table.
Step 2: Relationships
- An employee can get many bonuses but a particular bonus belongs to just one employee.
Step 3: Fields
- Move the ssn field into the EMPLOYEE table.
Step 4: Keys
- The primary key for the EMPLOYEE table is the employee’s id.
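One way to sketch the BCNF solution follows, assuming SQLite. The UNIQUE constraint on ssn is an assumption added here so that ssn, which is a determinant, is also declared as a key; the slides do not say how the book enforces this. The ids, ssn value, and bonus amounts are hypothetical, and employee_id stands in for the foreign key to EMPLOYEE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id TEXT PRIMARY KEY,
        ssn TEXT NOT NULL UNIQUE
    );
    CREATE TABLE quarterly_bonus (
        employee_id TEXT NOT NULL REFERENCES employee(id),
        quarter TEXT NOT NULL,
        bonus REAL,
        PRIMARY KEY (employee_id, quarter)
    );
    INSERT INTO employee VALUES ('E1', '000-00-0001');
    INSERT INTO quarterly_bonus VALUES ('E1', 'Q1', 500.0);
    INSERT INTO quarterly_bonus VALUES ('E1', 'Q2', 750.0);
""")

# The ssn is stored once per employee, however many bonuses exist.
ssn_copies = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE id = 'E1'"
).fetchone()[0]

# Because ssn is declared unique, a second employee cannot reuse it.
try:
    conn.execute("INSERT INTO employee VALUES ('E2', '000-00-0001')")
    duplicate_ssn_rejected = False
except sqlite3.IntegrityError:
    duplicate_ssn_rejected = True

print(ssn_copies, duplicate_ssn_rejected)
```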
Keys Are Now Determinants
Exhibit 4-28: BCNF Solution – Keys Are Now the Only Determinants
All the fields in the EMPLOYEE table are determined by id.
In QUARTERLY_BONUS, all non-key fields are determined by id and quarter.
BCNF Solution With Sample Data
Exhibit 4-29: BCNF Solution With Data
All the non-key fields are determined only by the primary key.
Fourth Normal Form (4NF)
Exhibit 4-30: 4NF Violation
Definition: In an all-key table, part of the key can determine multiple values of, at most, one other field.
In this example, email can determine multiple languages or sports associated with an employee.
The double-headed arrow indicates a multivalued dependency: one field determining multiple values of another field.
Update Problem Caused by a 4NF Violation
Exhibit 4-31: 4NF Violation Creates Update Problem
We could not drop Luce’s German certification without losing his sports information since the primary key requires that all three fields have a value.
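The loss can be shown with plain tuples. This is a minimal sketch: the email address and the two sports are hypothetical, and only the German certification comes from the slide. Each row of the all-key table is (email, language, sport).

```python
# Hypothetical all-key table: every row needs email, language, and sport.
rows = {
    ("luce@example.com", "German", "Tennis"),
    ("luce@example.com", "German", "Golf"),
}

# Dropping the German certification means deleting every row that mentions it.
after_drop = {row for row in rows if row[1] != "German"}

sports_before = {sport for _, _, sport in rows}
sports_after = {sport for _, _, sport in after_drop}
print(sports_before, sports_after)
```

Since no row can exist without a language value, removing German removes the only rows that carried the sports.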
Solving a 4NF Violation
Exhibit 4-32: 4NF Solution
Step 1: Tables
- Email must be associated with both languages and sports, so create two new tables: LANGUAGE and SPORT.
Step 2: Relationships
- These tables would not be directly related, but rather, would be related separately to a member table.
Step 3: Fields
- Put language and sport in separate tables each time paired with email.
Step 4: Keys
- The primary key for the LANGUAGE table is email and language.
- The primary key for the SPORT table is email and sport.
No More Than One Multi-Valued Determinant
Exhibit 4-33: 4NF Solution – Only One MVD per Table
Each all-key table only has one part of the key that determines multiple values of another field.
4NF Solution with Sample Data
Exhibit 4-34: 4NF Solution with Sample Data
Now Luce’s German certification can be dropped without losing his sports information.
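The decomposed design can be sketched in SQLite, which is an assumed engine. The email address and sports are hypothetical; the LANGUAGE and SPORT tables and their composite keys follow the steps on the earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE language (
        email TEXT NOT NULL, language TEXT NOT NULL,
        PRIMARY KEY (email, language));
    CREATE TABLE sport (
        email TEXT NOT NULL, sport TEXT NOT NULL,
        PRIMARY KEY (email, sport));
    INSERT INTO language VALUES ('luce@example.com', 'German');
    INSERT INTO sport VALUES ('luce@example.com', 'Tennis');
    INSERT INTO sport VALUES ('luce@example.com', 'Golf');

    DELETE FROM language
        WHERE email = 'luce@example.com' AND language = 'German';
""")

# The language is gone, but the sports rows were never touched.
languages_left = conn.execute("SELECT COUNT(*) FROM language").fetchone()[0]
sports_left = [row[0] for row in conn.execute("SELECT sport FROM sport ORDER BY sport")]
print(languages_left, sports_left)
```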
Detecting Normalization Violations
Exhibit 4-35: Necessary Conditions for Normal Form Violations
These are the conditions under which each type of violation can occur.
ER Diagram for Practice Exercise 1
Exhibit 4-36: ER Diagram for Practice Exercise 1
ER Diagram for Practice Exercise 2
Exhibit 4-37: ER Diagram for Practice Exercise 2
ER Diagram for Practice Exercise 3
Exhibit 4-38: ER Diagram for Practice Exercise 3