Normalisation: ensuring data integrity in database design

Introduction
• What is normalisation?
• Why use normalisation?
• How do we normalise?
• How far should we normalise data?
• 1st, 2nd and 3rd Normal Form

What is Normalisation?
The process of reducing data to a set of relationships (ERDs).
Normalisation has three stages:
• 1st Normal Form (1NF)
• 2nd Normal Form (2NF)
• 3rd Normal Form (3NF)
Each stage removes further redundancy from the data.

Why use normalisation?
• To ensure data integrity and avoid duplication of data.
• To keep attributes atomic.
Data integrity – all data is consistent throughout the database.
Data duplication – the same data is found in more than one location, or a value is stored even though it can be calculated from other data (e.g. price + VAT can be worked out from unit price and VAT rate).
Atomic attributes – data represents single values, not groups (e.g. do not use a single ‘subjects’ attribute in the student table if students are studying more than one subject).
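As a rough illustration of atomic attributes and calculated data, the sketch below splits a multi-valued ‘subjects’ value into one row per subject and works out price + VAT on demand instead of storing it. The student record, the prices and the function names are all hypothetical and are not taken from the slides.

```python
# Hypothetical data for illustration only; not taken from the slides.

# Non-atomic: one attribute holds a group of values.
student_raw = {"student_number": 1, "subjects": "Maths, Physics, Art"}


def to_atomic_rows(student):
    """Return one (student_number, subject) row per subject."""
    subjects = [s.strip() for s in student["subjects"].split(",")]
    return [(student["student_number"], subject) for subject in subjects]


def price_with_vat(unit_price_pence, vat_rate_percent):
    """Derive price + VAT when it is needed rather than storing it."""
    return unit_price_pence + unit_price_pence * vat_rate_percent // 100


atomic_rows = to_atomic_rows(student_raw)
total = price_with_vat(1000, 20)  # prices held in pence to avoid rounding issues
```

Each row in `atomic_rows` now holds a single subject, and the VAT-inclusive price never needs to be kept in step with the unit price because it is not stored.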

How do we normalise?
• Make sure all attributes are atomic (single values).
• All attributes must belong to a single entity (the primary key can be a foreign key in another entity).
• All attributes must relate to the entity primary key.
• Normalise as far as possible (3NF is usual).

How far should data be normalised?
Usually to 3rd Normal Form. A table is in:
• 1NF if it has no repeating attributes or groups of attributes, and all data items are atomic.
• 2NF when it is in 1NF and all the fields other than the primary key fields depend on the whole of the primary key, not just part of it.
• 3NF when it is in 2NF and no non-key attribute is functionally dependent on another non-key attribute.
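The normal-form definitions rest on the idea of one attribute depending on another. A minimal sketch of how such a dependency could be tested against sample rows is given below. The helper `holds` and the sample values are hypothetical; the attribute names borrow from the student example used later in the slides. Note that sample data can only show that a dependency fails, it cannot prove that one always holds.

```python
def holds(rows, determinant, dependent):
    """Return True if each value of `determinant` maps to exactly one
    value of `dependent` in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows with the composite key (student_number, unit_id).
rows = [
    {"student_number": 1, "unit_id": "U1", "forename": "Ann", "date_achieved": "2020-01-10"},
    {"student_number": 1, "unit_id": "U2", "forename": "Ann", "date_achieved": "2020-02-14"},
    {"student_number": 2, "unit_id": "U1", "forename": "Raj", "date_achieved": "2020-01-12"},
]

# forename depends on only part of the key, which breaks 2NF.
partial_dependency = holds(rows, ["student_number"], ["forename"])

# date_achieved is not fixed by either part of the key on its own...
date_needs_student_only = holds(rows, ["student_number"], ["date_achieved"])
date_needs_unit_only = holds(rows, ["unit_id"], ["date_achieved"])
# ...but it is fixed by the whole key.
date_needs_whole_key = holds(rows, ["student_number", "unit_id"], ["date_achieved"])
```

Here `forename` would move to a table keyed on `student_number` alone, while `date_achieved` stays with the composite key.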

Example – 3NF

Project_no  Engineer     Address
0000        Bailey, S.   22 High Street
1111        Hussain, R.  17 Ford Lane
2222        Bailey, S.   22 High Street
3333        Bailey, S.   22 High Street

• Address is dependent on Engineer (both non-key fields).
• Where an engineer is in charge of more than one project, data redundancy occurs.
• Create a new entity to resolve this.
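To see why this redundancy threatens data integrity, the sketch below loads the example table into plain Python dictionaries and changes the address in only one row. The new address and the helper `addresses_for` are hypothetical, added purely for illustration.

```python
# The un-normalised table from the example, one dict per row.
projects = [
    {"project_no": "0000", "engineer": "Bailey, S.", "address": "22 High Street"},
    {"project_no": "1111", "engineer": "Hussain, R.", "address": "17 Ford Lane"},
    {"project_no": "2222", "engineer": "Bailey, S.", "address": "22 High Street"},
    {"project_no": "3333", "engineer": "Bailey, S.", "address": "22 High Street"},
]


def addresses_for(rows, engineer):
    """Return the set of distinct addresses stored for one engineer."""
    return {row["address"] for row in rows if row["engineer"] == engineer}


# How many times is Bailey's address stored?
copies = sum(1 for row in projects if row["engineer"] == "Bailey, S.")

# Update only one of those rows (hypothetical new address).
projects[0]["address"] = "5 Mill Road"

# The table now disagrees with itself about where Bailey lives.
conflicting = addresses_for(projects, "Bailey, S.")
```

After the partial update, `conflicting` holds two different addresses for the same engineer, which is exactly the kind of inconsistency that normalising to 3NF is meant to prevent.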

Resolving 3NF

PROJECT
Project_no  Engineer
0000        Bailey, S.
1111        Hussain, R.
2222        Bailey, S.
3333        Bailey, S.

ENGINEER
Engineer     Address
Bailey, S.   22 High Street
Hussain, R.  17 Ford Lane

• Now we need to store the address only once.
• If we need to know an engineer’s address we can look it up in the ENGINEER table.
• The engineer attribute is the link between the two tables, and in the PROJECT table it is now a foreign key.
These relations are now in third normal form.
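The two relations can be sketched with Python's built-in `sqlite3` module. The table and column names follow the slide; the lower-case spelling, the in-memory database and the helper `address_for_project` are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE engineer (
        engineer TEXT PRIMARY KEY,
        address  TEXT NOT NULL
    );
    CREATE TABLE project (
        project_no TEXT PRIMARY KEY,
        engineer   TEXT NOT NULL REFERENCES engineer(engineer)
    );
""")

conn.executemany("INSERT INTO engineer VALUES (?, ?)", [
    ("Bailey, S.", "22 High Street"),
    ("Hussain, R.", "17 Ford Lane"),
])
conn.executemany("INSERT INTO project VALUES (?, ?)", [
    ("0000", "Bailey, S."), ("1111", "Hussain, R."),
    ("2222", "Bailey, S."), ("3333", "Bailey, S."),
])


def address_for_project(project_no):
    """Look up the engineer's address by joining the two tables."""
    row = conn.execute(
        "SELECT e.address FROM project p "
        "JOIN engineer e ON e.engineer = p.engineer "
        "WHERE p.project_no = ?",
        (project_no,),
    ).fetchone()
    return row[0] if row else None


address = address_for_project("2222")
# Bailey's address is now stored in exactly one row.
stored = conn.execute(
    "SELECT COUNT(*) FROM engineer WHERE engineer = 'Bailey, S.'"
).fetchone()[0]
```

The sketch keeps the engineer's name as the key because the slide does; a real design would more likely use a separate engineer code, since two engineers could share a name.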

List all attributes
Student number, Forename, Surname, Gender, Tutor, Tutor code, Unit ID, Assessor code, Assessor name, Date achieved

1st normal form: remove repeating groups by creating further entities
Student: Student number, Forename, Surname, Gender, Tutor, Tutor code, *Unit ID, Date achieved
Unit: Unit ID, Assessor code, Assessor name

2nd normal form: reduce duplicate data by identifying primary keys and composite keys
List primary keys and combinations:
1. Student Number
2. Unit ID
3. Student Number, Unit ID
Each becomes an entity.
Student: Student number, Forename, Surname, Gender, Tutor, Tutor code
Unit: Unit ID, Assessor code, Assessor name
Student Achievement: *Student number, *Unit ID, Date achieved

3rd normal form: check all fields in entities depend wholly on the primary key
Student: Student number, Student Forename, Student Surname, Student Gender, *Tutor code
Unit: Unit ID, *Assessor code
Student Achievement: *Student number, *Unit ID, Date achieved
Staff: Staff code, Staff name
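The final 3NF entities could be written as a schema along the following lines, again using Python's `sqlite3` module. This is a sketch under stated assumptions: the asterisk is read as marking a foreign key, both Tutor code and Assessor code are taken to refer to Staff code, and the column types and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE staff (
        staff_code TEXT PRIMARY KEY,
        staff_name TEXT NOT NULL
    );
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        forename   TEXT,
        surname    TEXT,
        gender     TEXT,
        tutor_code TEXT REFERENCES staff(staff_code)      -- assumed link to Staff
    );
    CREATE TABLE unit (
        unit_id       TEXT PRIMARY KEY,
        assessor_code TEXT REFERENCES staff(staff_code)   -- assumed link to Staff
    );
    CREATE TABLE student_achievement (
        student_number INTEGER REFERENCES student(student_number),
        unit_id        TEXT REFERENCES unit(unit_id),
        date_achieved  TEXT,
        PRIMARY KEY (student_number, unit_id)              -- composite key
    );
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO staff VALUES ('T1', 'Ms Example')")
conn.execute("INSERT INTO student VALUES (1, 'Ann', 'Lee', 'F', 'T1')")
conn.execute("INSERT INTO unit VALUES ('U1', 'T1')")
conn.execute("INSERT INTO student_achievement VALUES (1, 'U1', '2020-01-10')")

# The assessor's name is stored once in staff and reached through the unit.
assessor = conn.execute(
    "SELECT s.staff_name FROM unit u "
    "JOIN staff s ON s.staff_code = u.assessor_code "
    "WHERE u.unit_id = 'U1'"
).fetchone()[0]

# The composite key rejects a second record for the same student and unit.
try:
    conn.execute("INSERT INTO student_achievement VALUES (1, 'U1', '2020-03-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Staff names appear only in the `staff` table, so a change of name is made in one place, and each student can hold at most one achievement record per unit.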