Lecture 07: Functional Dependencies (web.uettaxila.edu.pk/cms/seadmsbssp09/notes/adbms-lecture-7...)


Post on 16-Mar-2020



Functional Dependencies

• Framework for systematic design and optimization of relational schemas

• Generalization over the notion of Keys

• Crucial in obtaining correct normalized schemas


Definitions

In any relation R, suppose there exists a set of attributes A1, A2, …, An and an attribute B such that any two tuples that have the same values for A1, A2, …, An also have the same value for B. Then A1, A2, …, An are said to functionally determine B.

A functional dependency (FD) of the above form is written as:

A1, A2, …, An → B

Functional dependencies define properties of the schema and not of any particular instance. The dependency must hold for all tuples in every instance of the schema.

Definitions

If A1, A2, …, An functionally determine several attributes, the FDs can all be combined into one expression. The FDs

A1, A2, …, An → B1
A1, A2, …, An → B2
A1, A2, …, An → B3
…
A1, A2, …, An → Bm

are written together as:

A1, A2, …, An → B1 B2 B3 … Bm


Definitions

Keys revisited: If a subset of attributes can uniquely determine the entire tuple, then it is called a super key.

Minimal super keys, that is, candidate keys, can be defined analogously: a candidate key is a super key none of whose proper subsets is a super key.

Functional Dependencies

Consider the relation:

Movies (title, year, length, filmType, studio, star)

We can identify some FDs such as the following:

title, year → length
title, year → filmType

However, note that

title, year → star

may not always be true, since a movie can have more than one star!
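A relation instance can never prove that an FD holds for the schema, but it can refute one. The following is a minimal sketch of such a check; the helper name `fd_holds` and the sample rows are made up for illustration and are not part of the lecture.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen,
        # and returns the stored value on later occurrences
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample rows for Movies; the values are illustrative only.
movies = [
    {"title": "T1", "year": 1990, "length": 120, "filmType": "color",
     "studio": "S1", "star": "X"},
    {"title": "T1", "year": 1990, "length": 120, "filmType": "color",
     "studio": "S1", "star": "Y"},
    {"title": "T2", "year": 1995, "length": 95, "filmType": "color",
     "studio": "S2", "star": "X"},
]

print(fd_holds(movies, ["title", "year"], ["length"]))  # True
print(fd_holds(movies, ["title", "year"], ["star"]))    # False
```

A result of True only means that this particular instance does not contradict the FD.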


Reasoning about FDs

Transitivity:

In any relation R, if A → B and B → C, then the FD A → C also holds for R.

Example:

If “Employee_Number” → “Job” and “Job” → “Salary”, then

“Employee_Number” → “Salary”

Reasoning about FDs

Two FDs S = A → B and T = C → D are said to be equivalent if the set of relation instances satisfying S is the same as the set of relation instances satisfying T.

We say that S follows from T if every relation instance satisfying T also satisfies S.

FDs S and T are equivalent if S follows from T and T follows from S.


Trivial Functional Dependencies

Note that in the Movies relation:

title, year → title

An FD where the right hand side is contained within the left hand side is called a trivial FD.

If there is at least one element on the RHS that is not contained in the LHS, the FD is called non-trivial, and if none of the elements of the RHS are contained in the LHS, it is called a completely non-trivial FD.

Closure of FDs

In any relation R, let A be a set of attributes of R.

The closure of A under the FDs of R is the set of all attributes that are “eventually” determined by A.

Let: A → B; B → C, D; B ∪ D → E

Then, closure(A) = A ∪ B ∪ C ∪ D ∪ E


Closure of FDs

Adding attributes to closure(A):

Let A’ ⊆ closure(A) and A’ → F, then

closure(A) = closure(A) ∪ F

Computing closure of FDs

Given a relation R and a set of attributes A, closure(A) is computed by the following algorithm:

1. Initially closure(A) = A
2. For every A’ ⊆ closure(A), if there exists an FD of the form A’ → B and B ⊄ closure(A), then closure(A) = closure(A) ∪ B
3. Repeat step 2 until no more attributes can be added to closure(A)

The closure of a set of attributes A is denoted by A+. Note that if A+ is the set of all attributes of R, then A is a super-key of R.
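The closure algorithm can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's own code: the function name `closure` and the representation of each FD as a pair of attribute sets are assumptions, and the example treats A, B, C, D and E as single attributes.

```python
def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)                 # step 1: start with the attributes themselves
    changed = True
    while changed:                      # step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:            # step 2: apply every FD whose LHS is covered
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The FDs A -> B; B -> C, D; B ∪ D -> E, with single-letter attributes.
fds = [({"A"}, {"B"}), ({"B"}, {"C", "D"}), ({"B", "D"}, {"E"})]

print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(closure({"C"}, fds)))  # ['C']
```

Since the closure of A here contains every attribute, A would be a super-key of a relation with exactly these five attributes.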


Inferred FDs

In a relation R, let A, B, C and D be sets of attributes of R such that:

A → B; B → C; and C → D

Also let DA ⊆ D be the attributes of D that also belong to A, and let D’ = D – DA.

Given this, and provided D’ is not empty, we can infer a non-trivial FD: A → D’.

FDs which are specified are called stated FDs, and FDs which are derived are called inferred FDs.

Inferred FDs

A given set of FDs from which the set of all FDs for a relation can be inferred is called a basis of the relation.

If the basis is such that no subset of the basis is also a basis, then it is said to be a minimal basis for the relation.


Armstrong’s Axioms

For computing the set of FDs that follow from a given set of FDs, the following rules called Armstrong’s axioms are useful:

1. Reflexivity: If B ⊆ A, then A → B

2. Augmentation: If A → B, then A ∪ C → B ∪ C. Note also that if A → B, then A ∪ C → B for any set of attributes C.

3. Transitivity: If A → B and B → C then A → C
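In practice, whether an FD X → Y follows from a set of FDs is usually tested through attribute closure rather than by applying the axioms one at a time: X → Y follows exactly when Y ⊆ X+. The sketch below assumes FDs are given as pairs of attribute sets; the names `closure` and `follows` are illustrative.

```python
def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(lhs, rhs, fds):
    """Return True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)


fds = [({"Employee_Number"}, {"Job"}), ({"Job"}, {"Salary"})]

print(follows({"Employee_Number"}, {"Salary"}, fds))         # True (transitivity)
print(follows({"Employee_Number", "Job"}, {"Job"}, fds))     # True (reflexivity)
print(follows({"Job"}, {"Employee_Number"}, fds))            # False
```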

Projecting FDs

Let R be a relation and F(R) be the set of all FDs in R.

Suppose relation S is projected from R, by removing some attributes. How can we infer F(S)?

FDs that belong to F(S) are those which:
1. Follow from F(R)
2. Involve only attributes of S


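Projecting FDs

Given a relation R (A,B,C,D) and F(R) = {A → B, B → C, C → D}.

Suppose S is projected from R as S(A,C,D). What is F(S)? To compute F(S), start by computing the closures of all attributes in S.

In R, A+ = {A, B, C, D}, which gives the FDs A → B, A → C, A → D.
In S, only the FDs over attributes of S remain: A → C, A → D.

C+ = {C, D}, which gives C → D, and D+ = {D}, which gives nothing new.

Since A+ contains all attributes of S, it is not required to compute (AC)+, (AD)+ or (ACD)+. (CD)+ = {C, D} adds nothing new, so A → C, A → D and C → D form a basis for F(S).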
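The projection procedure can be sketched as follows. This is a minimal illustration, assuming FDs are stored as pairs of attribute sets; `project_fds` is a hypothetical helper name, and the enumeration of subsets is exponential, so it is only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(sub_attrs, fds):
    """Return non-trivial FDs (lhs, attribute) that hold on sub_attrs."""
    sub = sorted(sub_attrs)
    keys = []        # LHS sets whose closure already covers all of sub_attrs
    projected = []
    for size in range(1, len(sub)):
        for lhs in combinations(sub, size):
            if any(set(k) <= set(lhs) for k in keys):
                continue  # supersets of a super-key of S add nothing new
            reach = closure(lhs, fds) & set(sub)
            if reach == set(sub):
                keys.append(lhs)
            for attr in sorted(reach - set(lhs)):
                projected.append((lhs, attr))
    return projected


fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C"}, {"D"})]
for lhs, attr in project_fds({"A", "C", "D"}, fds):
    print(",".join(lhs), "->", attr)
# A -> C
# A -> D
# C -> D
```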

Designing Relational Schemas

In a carelessly designed relational schema, functional dependencies are “improper”. This leads to the following problems:

1. Redundancy: Information is repeated across tuples

2. Update anomalies: If information is repeated across tuples, then an update of any such information has to be performed across all tuples containing the information

3. Deletion anomalies: If information is repeated across tuples, deletion of that information has to be performed across all these tuples, and deleting the last such tuple can also lose other information stored in it.


Designing Relational Schemas

Consider the Movie (title, year, length, studio, star) relation, where:

title, year → length
title, year → studio

But title, year → star need not be true.

For each movie star of a given movie, the title, year, length and studio information has to be repeated. If any of these values have to be updated or deleted, the change must be applied to all tuples where they occur.

Decomposition

Anomalies are removed from a relation R(A) by decomposing it into other relations S(B) and T(C) where B, C ⊂ A, such that there are no anomalies in S and T.

A decomposition whose relations do not contain any such anomalies is said to be in Boyce-Codd Normal Form (BCNF).

A BCNF relation has the following property: A relation R(A) is said to be in BCNF if, whenever a non-trivial FD of the form A’ → A’’ holds in R(A), A’ is a super-key for R.
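The BCNF condition can be tested with attribute closure. The sketch below is illustrative only: `bcnf_violations` is a hypothetical helper, FDs are assumed to be pairs of attribute tuples, and only the stated FDs are examined (this is enough for the relation they were stated on, but a relation obtained by projection needs its projected FDs checked instead).

```python
def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the stated FDs that are non-trivial and whose LHS is not a super-key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attrs)
    ]


movies = ("title", "year", "length", "studio", "star")
movie_fds = [(("title", "year"), ("length", "studio"))]
print(bcnf_violations(movies, movie_fds))
# [(('title', 'year'), ('length', 'studio'))]

# A relation whose only FD has a key on the left has no violations.
print(bcnf_violations(("A", "B"), [(("A",), ("B",))]))  # []
```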


Decomposition

In a given relation R(A), let there be a functional dependency of the form A’ → A’’ which violates BCNF.

In order to bring R into BCNF, decompose R as follows:

Let B be the set of all attributes which lie in the RHS of any FD that has A’ in the LHS.

Remove the set of all attributes A’ ∪ B and form a separate relation. Retain A’ along with A – {A’ ∪ B} to form the other decomposed part of the relation R.

Decomposition

Example:

Consider the Movies (title, year, length, studio, star) relation. Here the following FD holds:

title, year → length, studio

However, this is a BCNF violating FD, since (title, year) is not a super-key as the attribute “star” is not in (title,year)+.

To decompose Movies, remove (title, year) along with (length, studio) and put them in a separate relation. Retain (title, year) along with (star) to form the other relation.


Decomposition

Hence:

Movies (title, year, length, studio, star) is decomposed into

Movies1 (title, year, length, studio) and Movies2 (title, year, star)
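One decomposition step can be sketched in Python. This is a sketch under stated assumptions: the helper name `bcnf_split` is made up, the FD used is title, year → length, studio, and the set B is computed as the closure of the violating left-hand side minus that left-hand side, which is a common way to obtain "all attributes determined by A’".

```python
def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_split(attrs, fds, lhs):
    """Split attrs on a BCNF-violating LHS into (lhs ∪ B, lhs ∪ remaining)."""
    attrs, lhs = set(attrs), set(lhs)
    b = (closure(lhs, fds) & attrs) - lhs   # attributes determined by lhs
    first = lhs | b                          # the separated relation
    second = lhs | (attrs - first)           # lhs plus everything left over
    return first, second


movies = ("title", "year", "length", "studio", "star")
movie_fds = [(("title", "year"), ("length", "studio"))]

movies1, movies2 = bcnf_split(movies, movie_fds, ("title", "year"))
print(sorted(movies1))  # ['length', 'studio', 'title', 'year']
print(sorted(movies2))  # ['star', 'title', 'year']
```

A full decomposition would repeat this step on each resulting relation, using the FDs projected onto it, until no violation remains.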

2-attribute Relations

Any 2-attribute relation of the form R(A,B) is always in BCNF. To prove this, consider the following cases:

1. There are no FDs between A and B, in which case only trivial FDs exist and R is in BCNF

2. A → B, but there is no FD of the form B → A. In this case, A is the key and R is in BCNF.

3. B → A, but there is no FD of the form A → B. This is symmetric to the case above; here, B is the key.

4. A → B and B → A. Both A and B are keys, so this does not violate the BCNF condition.


Third Normal Form (3NF)

Sometimes, some BCNF violating FDs cannot be removed from relations without losing information.

Consider the relation Drama (title, theater, city) having the following FDs:

FD1: title, city → theater
(title and city form the key as they uniquely determine theater)

FD2: theater → city
(each drama theater has a unique name across cities)

FD2 violates BCNF since {theater} is not a key to Drama.

Third Normal Form (3NF)

Based on FD2, if we decompose Drama into the relations Drama1 (title, theater) and Drama2 (theater, city), it will be incorrect!

This is because FD1 involves attributes of both relations and can no longer be checked on Drama1 or Drama2 alone, so in the join of Drama1 and Drama2, (title, city) is no longer guaranteed to be the key!


Third Normal Form (3NF)

Consider the example tables:

Drama1

Title    Theater
Jeans    Naz
Jeans    Jude Brave Golden
Troy     Naz

Drama2

Theater              City
Naz                  Lahore
Jude Brave Golden    Karachi

Third Normal Form (3NF)

A Join between Drama1 and Drama2 gives the table:

Title    Theater              City
Jeans    Naz                  Lahore
Jeans    Jude Brave Golden    Karachi
Troy     Naz                  Lahore

Note that (theater, city) does not uniquely determine title: (Naz, Lahore) appears with both Jeans and Troy!


Third Normal Form (3NF)

Discrepancies in the previous example occurred because of the FD theater → city, where theater is not a key by itself, but city is part of a key!

To accommodate such cases, the “third normal form” (3NF) decomposition is used, which relaxes BCNF as follows:

Any relation R is said to be in 3NF if, for any non-trivial FD of the form A → B, either A is a super-key, or B is a member of some key.

An attribute that is a member of a key is called a prime attribute.
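The difference between BCNF and 3NF can be illustrated on the Drama relation. The sketch below is illustrative: the helper names are made up, FDs are assumed to be pairs of attribute tuples, candidate keys are found by brute force (exponential, fine for tiny schemas), and only the stated FDs are checked.

```python
from itertools import combinations


def closure(attrs, fds):
    """Compute attrs+ under fds, given as (lhs, rhs) pairs of attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all minimal attribute combinations whose closure is attrs."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(combo, fds) == attrs:
                keys.append(combo)
    return keys


def is_bcnf(attrs, fds):
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) == set(attrs)
               for lhs, rhs in fds)


def is_3nf(attrs, fds):
    prime = {a for key in candidate_keys(attrs, fds) for a in key}
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)          # the non-trivial part of the RHS
        if extra and closure(lhs, fds) != set(attrs) and not extra <= prime:
            return False
    return True


drama = ("title", "theater", "city")
drama_fds = [(("title", "city"), ("theater",)), (("theater",), ("city",))]

print(candidate_keys(drama, drama_fds))  # [('city', 'title'), ('theater', 'title')]
print(is_bcnf(drama, drama_fds))         # False
print(is_3nf(drama, drama_fds))          # True
```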

Multi-valued Dependencies

In some cases, even if a relation is in BCNF, there could still be redundancies.

Consider the relation: Drama (title, theater, star, genre).

Drama is in BCNF.

A given drama may have many stars. For every entry of star, the theater and genre attributes have to be repeated.


Multi-valued Dependencies

• The notation for multivalued dependency is a double-headed arrow between two attributes, A →→ B. In English, a multivalued dependency means that “if I know a value of A, I can determine a subset of B values.”

This relationship was also axiomatized by Beeri, Fagin, and Howard (1977). Their axioms are

• Reflexive: X →→ X
• Augmentation: if X →→ Y then XZ →→ Y
• Union: if X →→ Y and X →→ Z then X →→ YZ
• Projection: if X →→ Y and X →→ Z then X →→ (Y ∩ Z) and X →→ (Y – Z)

Multi-valued Dependencies

• Transitivity: if X →→ Y and Y →→ Z then X →→ (Z – Y)
• Pseudotransitivity: if X →→ Y and YW →→ Z then XW →→ (Z – YW)
• Complement: if X →→ Y and Z = (R – XY) then X →→ Z
• Replication: if X → Y then X →→ Y
• Coalescence: if X →→ Y and Z → W where W ⊆ Y and Y ∩ Z = Ø then X → W


Multi-valued Dependencies

In a given relation R(A), we say that there is a multi-valued dependency (MVD) if the following condition exists:

Suppose A’ and B are sets of attributes in A such that each value of A’ is associated with a set of values of B.

Now if this set of B values is independent of the values of all the remaining attributes in A – A’ – B, then the above dependency is said to be a multi-valued dependency denoted by:

A’ →→ B

Fourth Normal Form (4NF)

A relation that has no “non-trivial” multi-valued dependencies, other than those whose left-hand side is a super-key, is said to be in fourth normal form (4NF).

In a given relation R(A), the MVD A’ →→ B

is said to be “non-trivial” if: B ⊄ A’ and A’ ∪ B ⊂ A

A relation R(A) is said to be in 4NF if for every non-trivial MVD of the form A’ →→ B, A’ is a super-key.


Example

• Consider a table of departments, their projects, and the parts they stock. The MVDs in the table would be

department →→ projects
department →→ parts

• Assume that department d1 works on jobs j1 and j2 with parts p1 and p2; that department d2 works on jobs j3, j4, and j5 with parts p2 and p4; and that department d3 works on job j2 only with parts p5 and p6. The table would look like this:

Table

department  job  part
d1          j1   p1
d1          j1   p2
d1          j2   p1
d1          j2   p2
d2          j3   p2
d2          j3   p4
d2          j4   p2
d2          j4   p4
d2          j5   p2
d2          j5   p4
d3          j2   p5
d3          j2   p6


Example Contd..

• If you want to add a part to a department, you must create more than one new row.

• Likewise, removing a part or a job can destroy information. Updating a part or job name will also require multiple rows to be changed.

• The solution is to split this table into two tables, one with (department, projects) in it and one with (department, parts) in it. Informally, the definition of 4NF is that we have no more than one independent MVD in a table. If a table is in 4NF, it is also in BCNF.
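The split can be tried out on the department table. The following is a small sketch, not code from the lecture: the table is rebuilt from the assignments given in the example, and the variable names are illustrative.

```python
from itertools import product

# Jobs and parts per department, as described in the example.
work = {
    "d1": (["j1", "j2"], ["p1", "p2"]),
    "d2": (["j3", "j4", "j5"], ["p2", "p4"]),
    "d3": (["j2"], ["p5", "p6"]),
}

# The combined table holds every (job, part) combination per department.
rows = {(d, j, p) for d, (jobs, parts) in work.items()
        for j, p in product(jobs, parts)}

# The two tables of the 4NF decomposition.
dept_job = {(d, j) for d, j, _ in rows}
dept_part = {(d, p) for d, _, p in rows}

# Joining them back on department reproduces the original table.
rejoined = {(d, j, p) for d, j in dept_job for d2, p in dept_part if d == d2}

print(len(rows), len(dept_job), len(dept_part))  # 12 6 6
print(rejoined == rows)                          # True

# Adding part p3 to d1 needs one row per job of d1 in the combined table,
# but only a single row in dept_part.
new_rows = {("d1", j, "p3") for d, j in dept_job if d == "d1"}
print(len(new_rows))                             # 2
```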

Relationship between NFs

[Figure: diagram relating 3NF, BCNF and 4NF]

Note that 4NF implies BCNF implies 3NF.


Join Dependencies

• A join dependency is a further generalization of MVDs.

• A join dependency (JD) ⋈ {R1, …, Rn} is said to hold over a relation R if R1, …, Rn is a lossless-join decomposition of R.

• An MVD X →→ Y over a relation R can be expressed as the join dependency ⋈ {XY, X(R−Y)}.

Unlike FDs and MVDs, there is no set of sound and complete inference rules for JDs.

course      teacher  book
Physics101  Green    Mechanics
Physics101  Green    Optics
Physics101  Brown    Mechanics
Physics101  Brown    Optics
Math301     Green    Mechanics
Math301     Green    Vectors
Math301     Green    Geometry

• As an example, in the CTB relation, the MVD C →→ T can be expressed as the join dependency ⋈ {CT, CB}.
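Whether a join dependency holds on a given instance can be checked by projecting the relation onto each component and joining the projections back. The sketch below is illustrative: the helper names are made up, and each tuple is represented as a frozenset of (attribute, value) pairs so that projections of different widths can be joined uniformly.

```python
def project(rel, attrs):
    """Project a relation (set of frozensets of (attr, value) pairs) onto attrs."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            merged = dict(t1)
            if all(merged.get(a, v) == v for a, v in t2):
                merged.update(t2)
                out.add(frozenset(merged.items()))
    return out


def jd_holds(rel, components):
    """Return True if joining the projections of rel gives back rel."""
    joined = project(rel, components[0])
    for comp in components[1:]:
        joined = natural_join(joined, project(rel, comp))
    return joined == rel


ctb_rows = [
    ("Physics101", "Green", "Mechanics"),
    ("Physics101", "Green", "Optics"),
    ("Physics101", "Brown", "Mechanics"),
    ("Physics101", "Brown", "Optics"),
    ("Math301", "Green", "Mechanics"),
    ("Math301", "Green", "Vectors"),
    ("Math301", "Green", "Geometry"),
]
ctb = {frozenset(zip(("course", "teacher", "book"), row)) for row in ctb_rows}

print(jd_holds(ctb, [{"course", "teacher"}, {"course", "book"}]))   # True
print(jd_holds(ctb, [{"course", "teacher"}, {"teacher", "book"}]))  # False
```

The second decomposition is lossy on this instance: joining on teacher produces extra tuples such as (Math301, Green, Optics).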


SELECT BS.buyer, SL.seller, BL.lender
  FROM BuyerLender AS BL,
       SellerLender AS SL,
       BuyerSeller AS BS
 WHERE BL.buyer = BS.buyer
   AND BL.lender = SL.lender
   AND SL.seller = BS.seller;


Fifth Normal Form (5NF)

• Fifth normal form, also called the join-projection normal form (JPNF) or the projection-join normal form

• Based on the idea of a lossless join or the lack of a join-projection anomaly.

• This problem occurs when you have an n-way relationship, where n > 2.

• A quick, sufficient check for 5NF is to see if the table is in 3NF and all the candidate keys are single columns.

Domain-Key Normal Form (DKNF)

• Domain-key normal form was proposed by Ron Fagin (1981).

• The idea is that if all the constraints implied by domain restrictions and by key conditions are true, then the database is in at least 5NF.

• The interesting part of Fagin’s paper is that there is no mention of functional dependencies, multivalued dependencies, or join dependencies.

• This is currently considered the strongest normal form possible.

• The problem is that his paper does not tell you how you can achieve DKNF and shows that in some cases it is impossible.