Normalisation: Ensuring data integrity in database design

Upload: george-campbell

Post on 18-Dec-2015

TRANSCRIPT

Normalisation
Ensuring data integrity in database design

Introduction

• What is normalisation?
• Why use normalisation?
• How do we normalise?
• How far should we normalise data?
    1st, 2nd and 3rd Normal Form

What is Normalisation?

• The process of reducing data to a set of relations (tables), whose relationships can be shown in an ERD
• Normalisation, as covered here, has three stages:
    1st Normal Form (1NF)
    2nd Normal Form (2NF)
    3rd Normal Form (3NF)
• Each stage removes further redundancy from the data

Why use normalisation?

• To ensure data integrity and avoid duplication of data
• To keep attributes atomic

Data integrity: all data is consistent throughout the database.

Data duplication: the same data is stored in more than one location, or a stored value could be calculated from other data (e.g. price + VAT can be worked out from Unit Price and VAT rate).

Atomic attributes: each attribute holds a single value, not a group (e.g. do not use a single ‘subjects’ attribute in the student table if a student is studying more than one subject).
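
The atomic-attribute rule can be illustrated with a short Python sketch. The record layout, the sample students and the names `students_raw` and `to_atomic_rows` are all hypothetical, chosen only to show a multi-valued ‘subjects’ attribute being split into one row per student and subject.

```python
# Hypothetical, non-atomic student records: 'subjects' holds a group of values.
students_raw = [
    {"student_number": 1, "surname": "Smith", "subjects": "Maths, Physics"},
    {"student_number": 2, "surname": "Jones", "subjects": "Art"},
]


def to_atomic_rows(records):
    """Split the multi-valued 'subjects' field into one row per (student, subject)."""
    rows = []
    for record in records:
        for subject in record["subjects"].split(","):
            rows.append(
                {
                    "student_number": record["student_number"],
                    "subject": subject.strip(),
                }
            )
    return rows


student_subjects = to_atomic_rows(students_raw)
for row in student_subjects:
    print(row)
```

Each resulting row holds exactly one subject, so a single subject can be searched for or updated without parsing a list stored inside one field.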

How do we normalise?

• Make sure all attributes are atomic (single values).
• All attributes must belong to a single entity (the primary key can be a foreign key in another entity).
• All attributes must relate to the entity’s primary key.
• Normalise as far as possible (3NF is usual).

How far should data be normalised?

Data is usually normalised to 3rd Normal Form. A table is in:

• 1NF if it has no repeating attributes or groups of attributes, and all data items are atomic.
• 2NF when it is in 1NF and every field other than the primary key fields depends on the whole primary key, not on just part of a composite key.
• 3NF when it is in 2NF and no non-key attribute is functionally dependent on another non-key attribute.

Example: 3NF

Project_no   Engineer      Address
0000         Bailey, S.    22 High Street
1111         Hussain, R.   17 Ford Lane
2222         Bailey, S.    22 High Street
3333         Bailey, S.    22 High Street

• Address is dependent on Engineer (both are non-key fields).
• Where an engineer is in charge of more than one project, data redundancy occurs.
• Create a new entity to resolve this.
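
The dependency in this example can be checked with a small Python sketch. The helper name `determines` and the tuple layout are assumptions made for illustration. Note that a dependency holding in four sample rows only suggests a rule; whether Address really depends on Engineer is a design decision about what the data means.

```python
from collections import Counter

# Rows of the un-normalised table from the example: (project_no, engineer, address).
projects = [
    ("0000", "Bailey, S.", "22 High Street"),
    ("1111", "Hussain, R.", "17 Ford Lane"),
    ("2222", "Bailey, S.", "22 High Street"),
    ("3333", "Bailey, S.", "22 High Street"),
]


def determines(rows, lhs, rhs):
    """Return True if column index lhs functionally determines column index rhs in rows."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs value met for this lhs value;
        # any later, different rhs value breaks the dependency.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


engineer_determines_address = determines(projects, 1, 2)
address_copies = Counter(address for _, _, address in projects)

print(engineer_determines_address)        # True
print(address_copies["22 High Street"])   # 3
```

The same address is stored three times because it depends on the engineer rather than on the project number, which is the redundancy that 3NF removes.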

Resolving 3NF

PROJECT
Project_no   Engineer
0000         Bailey, S.
1111         Hussain, R.
2222         Bailey, S.
3333         Bailey, S.

ENGINEER
Engineer      Address
Bailey, S.    22 High Street
Hussain, R.   17 Ford Lane

• Now we need to store each address only once.
• If we need to know an engineer’s address we can look it up in the ENGINEER table.
• The Engineer attribute is the link between the two tables, and in the PROJECT table it is now a foreign key.
• These relations are now in third normal form.
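
As a rough sketch, the two relations could be built with Python's standard `sqlite3` module. The column types, the in-memory database and the helper name `address_for_project` are assumptions; the slides do not name a database system. The engineer's name is used as the key only because the example does so.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE engineer (
        engineer TEXT PRIMARY KEY,
        address  TEXT NOT NULL
    );
    CREATE TABLE project (
        project_no TEXT PRIMARY KEY,
        engineer   TEXT NOT NULL REFERENCES engineer (engineer)
    );
    """
)

# Each address is stored once, in the engineer table.
conn.executemany(
    "INSERT INTO engineer VALUES (?, ?)",
    [("Bailey, S.", "22 High Street"), ("Hussain, R.", "17 Ford Lane")],
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?)",
    [
        ("0000", "Bailey, S."),
        ("1111", "Hussain, R."),
        ("2222", "Bailey, S."),
        ("3333", "Bailey, S."),
    ],
)


def address_for_project(project_no):
    """Look up the address of the engineer in charge of a project via the foreign key."""
    row = conn.execute(
        """
        SELECT e.address
        FROM project AS p
        JOIN engineer AS e ON e.engineer = p.engineer
        WHERE p.project_no = ?
        """,
        (project_no,),
    ).fetchone()
    return row[0] if row else None


print(address_for_project("2222"))  # 22 High Street
```

If an engineer moves, only one row in the engineer table changes, and every project picks up the new address through the join.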

List all attributes

Student number, Forename, Surname, Gender, Tutor, Tutor code, Unit ID, Assessor code, Assessor name, Date achieved

1st normal form: remove repeating groups by creating further entities

Student: Student number, Forename, Surname, Gender, Tutor, Tutor code, *Unit ID, Date achieved
Unit: Unit ID, Assessor code, Assessor name

2nd normal form: reduce duplicate data by identifying primary keys and composite keys

List primary keys and combinations (each becomes an entity):
1. Student Number
2. Unit ID
3. Student Number, Unit ID

Student: Student number, Forename, Surname, Gender, Tutor, Tutor code
Unit: Unit ID, Assessor code, Assessor name
Student Achievement: *Student number, *Unit ID, Date achieved

3rd normal form: check all fields in entities depend wholly on the primary key

Student: Student number, Student Forename, Student Surname, Student Gender, *Tutor code
Unit: Unit ID, *Assessor code
Student Achievement: *Student number, *Unit ID, Date achieved
Staff: Staff code, Staff name

(* marks a foreign key)
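
The 3rd normal form entities above might be sketched as a schema using Python's standard `sqlite3` module. The column types, the in-memory database and every sample row (student, staff names, unit IDs, dates) are hypothetical; only the entity and attribute names come from the worked example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.executescript(
    """
    CREATE TABLE staff (
        staff_code TEXT PRIMARY KEY,
        staff_name TEXT NOT NULL
    );
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        forename       TEXT NOT NULL,
        surname        TEXT NOT NULL,
        gender         TEXT,
        tutor_code     TEXT REFERENCES staff (staff_code)
    );
    CREATE TABLE unit (
        unit_id       TEXT PRIMARY KEY,
        assessor_code TEXT REFERENCES staff (staff_code)
    );
    CREATE TABLE student_achievement (
        student_number INTEGER REFERENCES student (student_number),
        unit_id        TEXT REFERENCES unit (unit_id),
        date_achieved  TEXT,
        PRIMARY KEY (student_number, unit_id)  -- composite key
    );
    """
)

# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("T01", "A. Tutor"), ("A01", "B. Assessor")],
)
conn.execute("INSERT INTO student VALUES (1, 'Sam', 'Smith', 'F', 'T01')")
conn.executemany("INSERT INTO unit VALUES (?, ?)", [("U1", "A01"), ("U2", "A01")])
conn.executemany(
    "INSERT INTO student_achievement VALUES (?, ?, ?)",
    [(1, "U1", "2015-06-01"), (1, "U2", "2015-07-01")],
)

# Rebuild the original flat view by following the foreign keys.
rows = conn.execute(
    """
    SELECT s.surname, u.unit_id, a.staff_name, sa.date_achieved
    FROM student_achievement AS sa
    JOIN student AS s ON s.student_number = sa.student_number
    JOIN unit    AS u ON u.unit_id = sa.unit_id
    JOIN staff   AS a ON a.staff_code = u.assessor_code
    ORDER BY u.unit_id
    """
).fetchall()

for row in rows:
    print(row)
```

The assessor's name appears in the output for both units but is stored only once, in the staff table, which also serves tutors through the Tutor code foreign key.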

Summary

• Make sure all attributes are atomic.
• Normalise data by removing repeating groups.
• Relate all fields in a table to the primary key.
• Create further entities/tables where necessary.
