Database Design and Development: A Visual Approach

Database Design and Development: A Visual Approach © 2006 Prentice Hall

Chapter 4: Normalization

Raymond Frost – John Day – Craig Van Slyke



Normalized vs. Denormalized

Exhibit 4-1: Arcade Database Normalized vs. Denormalized Design


Denormalized Sample Data

Exhibit 4-2: Arcade Database Denormalized Sample Data

* Note the duplicate entries.


Denormalized Design Causes Update Problems

Exhibit 4-3: Arcade Database Update Problems Due to Duplicate Data

When the password for Thom Luce was changed, it was not changed in both his entries.

Which password is the correct password?
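The update problem can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the table name visit_denorm, the email address, and the password values are assumptions made for illustration and do not come from the exhibit.

```python
import sqlite3

# Hypothetical denormalized table: member data is repeated on every visit row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit_denorm (visit_id TEXT PRIMARY KEY, email TEXT, password TEXT)"
)
conn.executemany(
    "INSERT INTO visit_denorm VALUES (?, ?, ?)",
    [("001", "luce@example.com", "old_pw"), ("002", "luce@example.com", "old_pw")],
)

# The password change reaches only one of the member's two rows.
conn.execute("UPDATE visit_denorm SET password = 'new_pw' WHERE visit_id = '001'")

# Collect every password now on file for the same member.
passwords = {
    row[0]
    for row in conn.execute(
        "SELECT password FROM visit_denorm WHERE email = 'luce@example.com'"
    )
}
print(sorted(passwords))
```

After the partial update the set holds two different passwords for one member, which is exactly the ambiguity the slide asks about.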


Normalized Design Eliminates Update Problems

Exhibit 4-4: Arcade Database No Update Problem in Normalized Data


Denormalized Design Creates Insert Problems

Exhibit 4-5: Arcade Database Insert Problem Due to Duplicate Data

A new member cannot be created until they have made a visit; otherwise the record would have no primary key value, and the primary key field may not be left blank.


Normalized Design Eliminates the Insert Problem

Exhibit 4-6: Arcade Database No Insert Problem in Normalized Design
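The insert problem and its fix can be sketched with sqlite3. The table and column names (visit_denorm, member, fname) and the sample values are assumptions for illustration. Note that SQLite only rejects a NULL in a TEXT primary key when NOT NULL is stated explicitly, so the sketch states it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical denormalized table keyed by the visit id.
conn.execute(
    "CREATE TABLE visit_denorm ("
    "visit_id TEXT PRIMARY KEY NOT NULL, email TEXT, fname TEXT)"
)

# A member with no visit yet has no value for the primary key.
try:
    conn.execute("INSERT INTO visit_denorm VALUES (NULL, 'new@example.com', 'New')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# In a normalized design the member has a table of its own, keyed by email.
conn.execute("CREATE TABLE member (email TEXT PRIMARY KEY NOT NULL, fname TEXT)")
conn.execute("INSERT INTO member VALUES ('new@example.com', 'New')")
member_count = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]
```

The first insert fails because the key is missing, while the same member is stored without trouble once member data no longer depends on a visit row.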


Denormalized Design Creates Delete Problems

Exhibit 4-7: Arcade Database Delete Problem Due to Duplicate Data

If a member makes only one visit, deleting that record will cause the loss of the member data.

Deleting visit 005 would cause the loss of Sean McGann’s member data.


Normalized Design Eliminates the Delete Problem

Exhibit 4-8: Arcade Database No Delete Problem in Normalized Design

Now, deleting visit 005 would not cause the loss of Sean McGann’s member data.
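A short sqlite3 sketch contrasts the two designs when visit 005 is deleted. Only the visit number and the member's last name come from the slides; the table layouts, the email address, and the column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the member's data lives only on the visit row.
conn.execute(
    "CREATE TABLE visit_denorm (visit_id TEXT PRIMARY KEY, email TEXT, lname TEXT)"
)
conn.execute("INSERT INTO visit_denorm VALUES ('005', 'mcgann@example.com', 'McGann')")
conn.execute("DELETE FROM visit_denorm WHERE visit_id = '005'")
left_denorm = conn.execute(
    "SELECT COUNT(*) FROM visit_denorm WHERE lname = 'McGann'"
).fetchone()[0]

# Normalized: member and visit are separate tables linked by a foreign key.
conn.execute("CREATE TABLE member (email TEXT PRIMARY KEY, lname TEXT)")
conn.execute(
    'CREATE TABLE visit (id TEXT PRIMARY KEY, '
    '"MEMBER$email" TEXT REFERENCES member(email))'
)
conn.execute("INSERT INTO member VALUES ('mcgann@example.com', 'McGann')")
conn.execute("INSERT INTO visit VALUES ('005', 'mcgann@example.com')")
conn.execute("DELETE FROM visit WHERE id = '005'")
left_norm = conn.execute(
    "SELECT COUNT(*) FROM member WHERE lname = 'McGann'"
).fetchone()[0]
```

In the denormalized table the member disappears with the visit; in the normalized pair the MEMBER row survives the delete.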


First Normal Form (1NF)

Exhibit 4-9: 1NF Violation

Definition: A table in which all fields contain a single value.

The example above would be a violation of the 1NF rule; therefore this table would have to be redesigned.


Fixing Normalization Violations

Step 1: Tables
- Create new table(s)
- Rename original table if necessary

Step 2: Relationships
- Establish relationships between original and new table(s)

Step 3: Fields
- Transfer fields and rename as needed

Step 4: Keys
- Choose PK and FK for all tables


Solving a 1NF Violation

Exhibit 4-10: Arcade Database Solving the 1NF Violation

Step 1: Tables
- Since the phone number column violated the 1NF rule, make a new table to hold phone numbers: DIRECTORY.

Step 2: Relationships
- A member has multiple phone numbers and a phone number belongs to one member.

Step 3: Fields
- The phone field is transferred to the DIRECTORY table.

Step 4: Keys
- The email column in MEMBER becomes a FK (MEMBER$email) in DIRECTORY.
- The PK in DIRECTORY becomes MEMBER$email and phone, since two members could have the same phone number (e.g. two members from the same household).
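The four steps above can be sketched in Python with sqlite3. The input rows, email addresses, and phone numbers are invented for illustration, and the comma-separated phone format is an assumption about how the multi-valued field was stored.

```python
import sqlite3

# Hypothetical 1NF-violating rows: the phone field holds several values.
members = [
    ("luce@example.com", "555-0101, 555-0102"),
    ("mcgann@example.com", "555-0101"),  # same household, same number
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (email TEXT PRIMARY KEY)")
# DIRECTORY: FK back to MEMBER, composite PK of MEMBER$email and phone.
conn.execute(
    'CREATE TABLE directory ('
    '"MEMBER$email" TEXT REFERENCES member(email), '
    'phone TEXT, '
    'PRIMARY KEY ("MEMBER$email", phone))'
)

for email, phones in members:
    conn.execute("INSERT INTO member VALUES (?)", (email,))
    # One DIRECTORY row per phone number, so every field holds a single value.
    for phone in phones.split(","):
        conn.execute("INSERT INTO directory VALUES (?, ?)", (email, phone.strip()))

directory_rows = conn.execute(
    'SELECT "MEMBER$email", phone FROM directory ORDER BY 1, 2'
).fetchall()
```

Because the key combines MEMBER$email with phone, the shared household number can appear once for each member without a key collision.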


Tables in 1NF Eliminate Repeating Data Problems

Exhibit 4-11: 1NF Solution with Sample Data

Now all tables have fields that contain only single values.


Determinants

Exhibit 4-12: Primary Key Determines Non-key Fields

Determinant: a field or group of fields that controls or determines the values in another field.

The value of email will determine the values in all the other fields.

That is, if you know someone’s email, you can determine the rest of their information.


Determinants and Duplicate Data

Exhibit 4-13: Email Acts as a Determinant

Exhibit 4-14: Email Fails to Act as a Determinant

When you have duplicate data, knowing someone’s email may not allow you to determine the rest of the data as shown below.
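One way to test whether a field acts as a determinant is to check that each of its values maps to exactly one value of the dependent field. The helper name determines and the sample rows are hypothetical; this is a sketch of the idea, not a method taken from the text.

```python
from collections import defaultdict


def determines(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return all(len(values) == 1 for values in seen.values())


# Invented sample data: one row per member.
clean = [
    {"email": "luce@example.com", "password": "pw1"},
    {"email": "mcgann@example.com", "password": "pw2"},
]
# Invented sample data: the same member appears twice with conflicting passwords.
duplicated = [
    {"email": "luce@example.com", "password": "pw1"},
    {"email": "luce@example.com", "password": "pw9"},
]

acts_as_determinant = determines(clean, "email", "password")
fails_as_determinant = not determines(duplicated, "email", "password")
```

With the clean rows email determines password; with the duplicated rows the same email leads to two passwords, so it no longer does.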


Second Normal Form (2NF)

Exhibit 4-15: 2NF Violation

Definition: A table in which each non-key field is determined by the whole primary key and not part of the primary key by itself.

In this example, once you know a student’s id, you can determine his or her name, dorm, and phone.

Therefore, the fname, lname, dorm, and phone non-key fields are determined by just part of the primary key: id.


Update Problem Caused by 2NF Violation

Exhibit 4-16: 2NF Violation Creates Update Problem

Data not determined by the whole primary key will be duplicated, and an update may not reach every instance of the duplicated data.

In this example, we no longer know the correct phone number for Jim Green.


Solving a 2NF Violation

Exhibit 4-17: Enrollment Database Solving the 2NF Violation

Step 1: Tables
- Since only name, dorm, and phone belong in the STUDENT table, create a new table (ENROLL) for the registration information.

Step 2: Relationships
- A student can enroll in many sections but a particular student-section enrollment relates back to one student.

Step 3: Fields
- The SECTION$call_no and grade are information about the enrollment.

Step 4: Keys
- The id column in STUDENT becomes a FK (STUDENT$id) in ENROLL.
- The PK in ENROLL becomes STUDENT$id and SECTION$call_no.
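The resulting STUDENT and ENROLL tables can be sketched in sqlite3 as follows. The field names follow the slides; the student id, dorm, phone numbers, call numbers, and grades are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        id TEXT PRIMARY KEY, fname TEXT, lname TEXT, dorm TEXT, phone TEXT);
    CREATE TABLE enroll (
        "STUDENT$id" TEXT REFERENCES student(id),
        "SECTION$call_no" TEXT,
        grade TEXT,
        PRIMARY KEY ("STUDENT$id", "SECTION$call_no"));
    INSERT INTO student VALUES ('s1', 'Jim', 'Green', 'Hall A', '555-0100');
    INSERT INTO enroll VALUES ('s1', '001', 'A'), ('s1', '002', 'B');
    """
)

# The phone number is stored once, so a single UPDATE changes it everywhere.
conn.execute("UPDATE student SET phone = '555-0199' WHERE id = 's1'")

# Every enrollment for the student now resolves to the same phone number.
phones = conn.execute(
    'SELECT DISTINCT s.phone FROM enroll e '
    'JOIN student s ON s.id = e."STUDENT$id"'
).fetchall()
```

Since phone depends only on the student id, keeping it in STUDENT means two enrollments can never disagree about it.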


All Keys Are Now Determinants

Exhibit 4-18: Keys Are Now the Only Determinants

All the fields in the STUDENT table are determined by the id.

In ENROLL, grade is determined by the student + section.


2NF Solution with Sample Data

Exhibit 4-19: 2NF Solution with Sample Data

If you know a student’s id, you can determine the values in fname, lname, dorm, and phone.

If you know the id of a student and the call_no of a section, you can determine the value of grade.


Third Normal Form (3NF)

Exhibit 4-20: 3NF Violation

Definition: A table in which none of the non-key fields determine another non-key field.

In this example, once you know a member’s email, you can determine his or her password, name, and phone number.

Therefore, the password, name, and phone non-key fields are determined by another non-key field, email, rather than by the primary key.


Update Problem Caused by 3NF Violation

Exhibit 4-21: 3NF Violation Creates Update Problem

By including both member information and visit information in the same table, not all non-key fields are determined by the primary key.


Solving a 3NF Violation

Exhibit 4-22: 3NF Solution

Step 1: Tables
- Since only password, name, and phone belong in the MEMBER table, create a new table (VISIT) for the visit information.

Step 2: Relationships
- A member can make many visits, but a particular visit is associated with one member.

Step 3: Fields
- The id, date_time_in, and date_time_out are information about the visit, so move them to the VISIT table.

Step 4: Keys
- The email column becomes the PK in MEMBER.
- The email column in MEMBER becomes a FK (MEMBER$email) in VISIT.
- The PK in VISIT becomes the session id for the visit.
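The split into MEMBER and VISIT can be illustrated in plain Python by decomposing flat rows. The field names follow the slides, but every value below is invented, and the dictionary and list used to stand in for the two tables are only a sketch.

```python
# Hypothetical flat rows in the style of the 3NF violation: one row per visit,
# with the member's details repeated on each row. All values are invented.
flat_rows = [
    {"id": "001", "email": "luce@example.com", "password": "pw1",
     "date_time_in": "2006-01-05 10:00", "date_time_out": "2006-01-05 11:00"},
    {"id": "002", "email": "luce@example.com", "password": "pw1",
     "date_time_in": "2006-01-06 09:00", "date_time_out": "2006-01-06 09:30"},
    {"id": "005", "email": "mcgann@example.com", "password": "pw2",
     "date_time_in": "2006-01-07 14:00", "date_time_out": "2006-01-07 15:00"},
]

member = {}  # MEMBER: keyed by email (the PK)
visit = []   # VISIT: one row per visit id (the PK), with MEMBER$email as FK

for row in flat_rows:
    # Member fields are stored once per email; a later row overwrites an earlier one.
    member[row["email"]] = {"password": row["password"]}
    visit.append({
        "id": row["id"],
        "MEMBER$email": row["email"],
        "date_time_in": row["date_time_in"],
        "date_time_out": row["date_time_out"],
    })
```

Three flat rows collapse into two member entries and three visit rows, and no visit row carries a password any longer.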


Keys Are Now Determinants

Exhibit 4-23: 3NF Solution – Keys Are Now the Only Determinants

All the fields in the MEMBER table are determined by email.

In VISIT, all non-key fields are determined by id and email.


3NF Solution With Sample Data

Exhibit 4-24: 3NF Solution with Sample Data

All the non-key fields are determined only by the primary key.


Boyce-Codd Normal Form (BCNF)

Exhibit 4-25: BCNF Violation

Definition: Every determinant is a key.

In this example, there are two fields that could be determinants for an employee: employee_id and ssn.


Update Problem Caused by a BCNF Violation

Exhibit 4-26: BCNF Violation Creates Update Problem

Since ssn is a non-key field in this table, it can be easily updated, causing incorrect data.


Solving a BCNF Violation

Exhibit 4-27: BCNF Solution

Step 1: Tables
- Since ssn is really information about the employee, it needs to be in the EMPLOYEE table.

Step 2: Relationships
- An employee can get many bonuses but a particular bonus belongs to just one employee.

Step 3: Fields
- Move the ssn field into the EMPLOYEE table.

Step 4: Keys
- The primary key for the EMPLOYEE table is the employee’s id.
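A sqlite3 sketch of the BCNF solution follows. The UNIQUE constraint on ssn, the bonus column, and all sample values are assumptions added for illustration; the slides only state that ssn moves to EMPLOYEE and that id is its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- ssn is stored once per employee; UNIQUE is an added assumption here.
    CREATE TABLE employee (id TEXT PRIMARY KEY, ssn TEXT UNIQUE NOT NULL);
    CREATE TABLE quarterly_bonus (
        "EMPLOYEE$id" TEXT REFERENCES employee(id),
        quarter TEXT,
        bonus INTEGER,
        PRIMARY KEY ("EMPLOYEE$id", quarter));
    INSERT INTO employee VALUES ('e1', '000-00-0001'), ('e2', '000-00-0002');
    INSERT INTO quarterly_bonus VALUES ('e1', 'Q1', 100), ('e1', 'Q2', 150);
    """
)

# Giving a second employee the same ssn is rejected by the UNIQUE constraint.
try:
    conn.execute("UPDATE employee SET ssn = '000-00-0001' WHERE id = 'e2'")
    duplicate_ssn_allowed = True
except sqlite3.IntegrityError:
    duplicate_ssn_allowed = False

# Employee e1 has two bonus rows but only one stored ssn.
ssn_copies = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE id = 'e1'"
).fetchone()[0]
```

With ssn held in EMPLOYEE, adding bonus rows never creates another copy of it, so there is nothing to update inconsistently.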


Keys Are Now Determinants

Exhibit 4-28: BCNF Solution – Keys Are Now the Only Determinants

All the fields in the EMPLOYEE table are determined by id.

In QUARTERLY_BONUS, all non-key fields are determined by id and quarter.


BCNF Solution With Sample Data

Exhibit 4-29: BCNF Solution With Data

All the non-key fields are determined only by the primary key.


Fourth Normal Form (4NF)

Exhibit 4-30: 4NF Violation

Definition: In an all-key table, part of the key can determine multiple values of, at most, one other field.

In this example, email can determine multiple languages or sports associated with an employee.

The double-headed arrow indicates a multivalued dependency: one field determining multiple values of another field.


Update Problem Caused by a 4NF Violation

Exhibit 4-31: 4NF Violation Creates Update Problem

We could not drop Luce’s German certification without losing his sports information since the primary key requires that all three fields have a value.


Solving a 4NF Violation

Exhibit 4-32: 4NF Solution

Step 1: Tables
- Email must be associated with both languages and sports, so create two new tables: LANGUAGE and SPORT.

Step 2: Relationships
- These tables would not be directly related, but rather, would be related separately to a member table.

Step 3: Fields
- Put language and sport in separate tables, each time paired with email.

Step 4: Keys
- The primary key for the LANGUAGE table is email and language.
- The primary key for the SPORT table is email and sport.


No More Than One Multi-Valued Determinant

Exhibit 4-33: 4NF Solution – Only One MVD per Table

Each all-key table has only one part of the key that determines multiple values of another field.


4NF Solution with Sample Data

Exhibit 4-34: 4NF Solution with Sample Data

Now Luce’s German certification can be dropped without losing his sports information.
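The 4NF solution can be sketched in sqlite3 with two all-key tables. The table and key layout follows the slides; the email address and the French and Golf entries are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two all-key tables, each holding a single multivalued dependency on email.
    CREATE TABLE language (email TEXT, language TEXT, PRIMARY KEY (email, language));
    CREATE TABLE sport (email TEXT, sport TEXT, PRIMARY KEY (email, sport));
    INSERT INTO language VALUES
        ('luce@example.com', 'German'), ('luce@example.com', 'French');
    INSERT INTO sport VALUES ('luce@example.com', 'Golf');
    """
)

# Dropping the German certification touches only the LANGUAGE table.
conn.execute(
    "DELETE FROM language WHERE email = 'luce@example.com' AND language = 'German'"
)

languages = [r[0] for r in conn.execute("SELECT language FROM language ORDER BY language")]
sports = [r[0] for r in conn.execute("SELECT sport FROM sport")]
```

The sport row is untouched by the delete because language and sport no longer share a key.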


Detecting Normalization Violations

Exhibit 4-35: Necessary Conditions for Normal Form Violations

These are the conditions under which each type of violation can occur.


ER Diagram for Practice Exercise 1

Exhibit 4-36: ER Diagram for Practice Exercise 1


ER Diagram for Practice Exercise 2

Exhibit 4-37: ER Diagram for Practice Exercise 2


ER Diagram for Practice Exercise 3

Exhibit 4-38: ER Diagram for Practice Exercise 3