C20.0046: Database Management Systems, Lecture #4


Upload: magee

Post on 21-Mar-2016



Page 1: C20.0046: Database Management Systems Lecture #4

M.P. Johnson, DBMS, Stern/NYU, Spring 2008

C20.0046: Database Management Systems
Lecture #4

M.P. Johnson
Stern School of Business, NYU
Spring, 2008

Page 2: C20.0046: Database Management Systems Lecture #4

Agenda
Last time: relational model
This time:
1. Functional dependencies
   Keys and superkeys in terms of FDs
   Finding keys for relations
2. Extended E/R example
3. Rules for combining FDs
Next time: anomalies & normalization

Pick groups

Page 3: C20.0046: Database Management Systems Lecture #4

Where are we going, where have we been?
Goal: manage large amounts of data effectively
  Use a DBMS
    Must define a schema
  DBMSs use the relational model
  But initial design is easier in E/R
    Must design an E/R diagram
    Must then convert it to the relational model

Page 4: C20.0046: Database Management Systems Lecture #4

Where are we going, where have we been?
At this point, we often find problems: redundancy
How to fix?
  Convert the tables to a special “normal” form
How to do this?
  First step: check which FDs there are
    The reason we looked at FDs last time
    Will have to look at all true FDs of the table
  Then we'll do decompositions

Page 5: C20.0046: Database Management Systems Lecture #4

Next topic: Functional dependencies
FDs are constraints
  Logically part of the schema
  Can’t tell from particular relation instances
  An FD may hold for some instances “accidentally”
Finding all FDs is part of DB design
  Used in normalization

Page 6: C20.0046: Database Management Systems Lecture #4

Functional dependencies
Definition:
  If two tuples agree on the attributes A1, A2, …, An
  then they must also agree on the attributes B1, B2, …, Bm
Notation: A1, A2, …, An → B1, B2, …, Bm
Read as: the Ai functionally determine the Bj

Page 7: C20.0046: Database Management Systems Lecture #4

Typical examples of FDs
Product
  name → price, manufacturer
Person
  ssn → name, age
  father’s/husband’s-name → last-name
  zipcode → state
  phone → state (notwithstanding inter-state area codes?)
Company
  name → stockprice, president
  symbol → name
  name → symbol

Page 8: C20.0046: Database Management Systems Lecture #4

Functional dependencies
To check A → B, erase all other columns; for all rows t1, t2:

      A1 ... Am | B1 ... Bm
  t1
  t2
  if t1, t2 agree on the A columns, then t1, t2 must agree on the B columns

i.e., check if the remaining relation is many-one
  no “divergences”
  i.e., if A → B is a well-defined function
  thus, functional dependency

Page 9: C20.0046: Database Management Systems Lecture #4

FD example
Product(name, category, color, department, price)

Consider these FDs:
  name → color
  category → department
  color, category → price

What do they say?

Page 10: C20.0046: Database Management Systems Lecture #4

FD example
FDs as properties:
• On some instances they hold
• On others they don’t

name     category  color  department  price
Gizmo    Gadget    Green  Toys        49
Tweaker  Gadget    Green  Toys        99

Does this instance satisfy all the FDs?
  name → color
  category → department
  color, category → price

Page 11: C20.0046: Database Management Systems Lecture #4

FD example

name     category    color  department    price
Gizmo    Gadget      Green  Toys          49
Tweaker  Gadget      Black  Toys          99
Gizmo    Stationery  Green  Office-supp.  59

What about this one?
  name → color
  category → department
  color, category → price
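The two questions above can be answered mechanically. The following is a minimal sketch, assuming each instance is stored as a list of dictionaries; the helper name `satisfies` and the variable names are illustrative and not from the lecture. It remembers the right-side values seen for each left-side value and reports a violation as soon as two rows agree on the left but differ on the right.

```python
COLUMNS = ("name", "category", "color", "department", "price")

def make_rows(*tuples):
    """Turn plain tuples into dictionaries keyed by column name."""
    return [dict(zip(COLUMNS, t)) for t in tuples]

def satisfies(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the FD lhs -> rhs."""
    seen = {}  # left-side values -> right-side values seen with them
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores `right` the first time `left` appears,
        # and returns the stored value on later appearances.
        if seen.setdefault(left, right) != right:
            return False
    return True

# The two-row instance and the three-row instance from the slides.
first = make_rows(
    ("Gizmo", "Gadget", "Green", "Toys", 49),
    ("Tweaker", "Gadget", "Green", "Toys", 99),
)
second = make_rows(
    ("Gizmo", "Gadget", "Green", "Toys", 49),
    ("Tweaker", "Gadget", "Black", "Toys", 99),
    ("Gizmo", "Stationery", "Green", "Office-supp.", 59),
)

FDS = [
    (("name",), ("color",)),
    (("category",), ("department",)),
    (("color", "category"), ("price",)),
]

for label, rows in (("first", first), ("second", second)):
    for lhs, rhs in FDS:
        verdict = "holds" if satisfies(rows, lhs, rhs) else "violated"
        print(f"{label}: {', '.join(lhs)} -> {', '.join(rhs)}: {verdict}")
```

On the first instance only color, category → price is violated (Green Gadgets cost both 49 and 99); the second instance satisfies all three FDs.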

Page 12: C20.0046: Database Management Systems Lecture #4

Recognizing FDs

EmpID  Name   Phone  Position
E0045  Smith  1234   Clerk
E1847  John   9876   Salesrep
E1111  Smith  9876   Salesrep
E9999  Mary   1234   Lawyer

Q: Is Position → Phone an FD here?
A: It is for this instance, but no, presumably not in general

Other FDs?
  EmpID → Name, Phone, Position
  but not Phone → Position
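To see why an instance can only suggest FDs, it may help to list every single-attribute FD that happens to hold in the employee table. The sketch below is an illustration under assumptions: the function names are made up here, and the extra row for a second clerk is hypothetical data, not from the slides.

```python
from itertools import permutations

COLUMNS = ("EmpID", "Name", "Phone", "Position")
ROWS = [
    ("E0045", "Smith", "1234", "Clerk"),
    ("E1847", "John", "9876", "Salesrep"),
    ("E1111", "Smith", "9876", "Salesrep"),
    ("E9999", "Mary", "1234", "Lawyer"),
]

def holds(rows, lhs_index, rhs_index):
    """True if column lhs_index determines column rhs_index in `rows`."""
    mapping = {}
    for row in rows:
        if mapping.setdefault(row[lhs_index], row[rhs_index]) != row[rhs_index]:
            return False
    return True

def single_attribute_fds(rows, columns):
    """List every FD X -> Y (single attributes, X != Y) that holds in `rows`."""
    return [
        (columns[i], columns[j])
        for i, j in permutations(range(len(columns)), 2)
        if holds(rows, i, j)
    ]

found = single_attribute_fds(ROWS, COLUMNS)
for lhs, rhs in found:
    print(f"{lhs} -> {rhs}")

# Hypothetical extra row: a second clerk with a different phone.
# Position -> Phone held only "accidentally" and now breaks.
extended = ROWS + [("E2222", "Ann", "5555", "Clerk")]
print(holds(extended, COLUMNS.index("Position"), COLUMNS.index("Phone")))  # False
```

The instance yields the three EmpID FDs plus Position → Phone. Only the designer can say which of these are real constraints of the schema.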

Page 13: C20.0046: Database Management Systems Lecture #4

Keys of relations
{A1 A2 A3 … An} is a key for relation R if
1. A1 A2 A3 … An functionally determine all other attributes
   Usual notation: A1 A2 A3 … An → B1 B2 … Bk
   Relations are sets, so distinct rows can’t agree on all Ai
2. A1 A2 A3 … An is minimal
   No proper subset of A1 A2 A3 … An functionally determines all other attributes of R

Primary key: chosen if there are several possible keys

Page 14: C20.0046: Database Management Systems Lecture #4

Keys example
Relation: Student(SSN, Name, Address, DoB, Email, Credits)
Which of the following are keys, and why?
  SSN
  Name, Address (on reasonable assumptions)
  Name, SSN
  Email, SSN
  Email

NB: minimal != smallest

Page 15: C20.0046: Database Management Systems Lecture #4

Superkeys
Definition: a set of attributes that contains a key
  Satisfies the first condition: determination
  Might not satisfy the second: minimality
    Some superkey attributes may be superfluous
Keys are superkeys
  A key is a special case of a superkey
  The set of superkeys is a superset of the set of keys
{Name, SSN} is a superkey but not a key

Page 16: C20.0046: Database Management Systems Lecture #4

Discovering keys for relations
Relation from an entity set
  Key of relation = (minimized) key of entity set
Relation from a binary relationship
  Many-many: union of keys of both entity sets
  Many(M)-one(O): only key of M (why?)
  One-one: key of either entity set (but not both!)

Page 17: C20.0046: Database Management Systems Lecture #4

Review: entity sets
Key of entity set = key of relation
Student(Name, Address, DoB, SSN, Email, Credits)
[E/R diagram: entity set Student with attributes Name, Address, DoB, SSN, Email, Credits]

Page 18: C20.0046: Database Management Systems Lecture #4

Review: many-many
Many-many key: union of both ES keys
[E/R diagram: Student, Enrolls, Course; attributes SSN, Credits, CourseID, Name]
Enrolls(SSN, CourseID)

Page 19: C20.0046: Database Management Systems Lecture #4

Review: many-one
Key of the many ES but not of the one ES
  Keys from both would be non-minimal
[E/R diagram: Course, MeetsIn, Room; attributes CourseID, Name, RoomNo, Capacity]
MeetsIn(CourseID, RoomNo)

Page 20: C20.0046: Database Management Systems Lecture #4

Review: one-one
Keys of both ESs included in relation
Key is key of either ES (but not both!)
[E/R diagram: Husbands, Married, Wives; attributes SSN, Name on each side]
Married(HSSN, WSSN) with key HSSN, or
Married(HSSN, WSSN) with key WSSN
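The three binary cases reviewed above can be summarized in a few lines. This is only a sketch: the function `relationship_key` and its string labels are invented for illustration, and for "many-one" the first entity set is assumed to be the "many" side.

```python
def relationship_key(multiplicity, key_a, key_b):
    """Return the possible keys of the relation for a binary relationship A-B.

    multiplicity is "many-many", "many-one" (A is the many side), or "one-one".
    Each returned key is a frozenset of attribute names.
    """
    key_a, key_b = frozenset(key_a), frozenset(key_b)
    if multiplicity == "many-many":
        return [key_a | key_b]   # union of both entity-set keys
    if multiplicity == "many-one":
        return [key_a]           # key of the many side only
    if multiplicity == "one-one":
        return [key_a, key_b]    # either key alone, never both together
    raise ValueError(f"unknown multiplicity: {multiplicity}")

# The three review examples from the slides.
print(relationship_key("many-many", {"SSN"}, {"CourseID"}))    # Enrolls
print(relationship_key("many-one", {"CourseID"}, {"RoomNo"}))  # MeetsIn
print(relationship_key("one-one", {"HSSN"}, {"WSSN"}))         # Married
```

In the many-one case the key of the "one" side is still stored in the relation, but adding it to the key would make the key non-minimal.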

Page 21: C20.0046: Database Management Systems Lecture #4

Review: multiway relships
Multiple ways: may not be obvious
R: F, G, H → E is many-one
  E’s key is included but not part of key
Recall that relship atts are implicitly many-one
[E/R diagram: Course, Enrolls, Student, Section; attributes CourseID, Name, SSN, Name, RoomNo, Capacity]
Enrolls(CourseID, SSN, RoomNo)

Page 22: C20.0046: Database Management Systems Lecture #4

Extended E/R example: Exercise 2.3
Profs have: ssn, name, age, rank, specialty
Projs have: proj#, sponsor name (e.g. NSF), start date, end date, budget
Grad students have: ssn, name, age, degree program
Each proj: managed by one prof
Each proj: worked on by one or more profs
Each proj: worked on by one or more grad students
When grad students work on a proj, a prof must supervise their work. Grad students may have different supervisors for each proj.
Depts have: dept#, dept name, main office
Profs work in one or more departments, and have a percent-worked in each dept
Grad students have one major dept in which they work

Then convert to relations…

Page 23: C20.0046: Database Management Systems Lecture #4

Extended E/R example: Exercise 2.7
Patients are ID’ed by ssn, and they have names, addresses, and ages
Doctors are ID’ed by ssn. For each doctor, record name, specialty, years of experience
Each pharm. company is ID’ed by name and has a phone number
For each drug, the trade name and formula are recorded. Each is sold by a given pharm. company, and the trade name ID’s a drug uniquely from the products of that company. If a pharm. company is deleted, you need not keep track of its products any longer.
Each pharmacy has: name, address, phone number
Every patient has a primary physician. Doctors can have multiple patients.
Each pharmacy can sell several drugs and has a price for each. A drug price could vary between pharmacies.
Doctors prescribe drugs for patients. A doctor could prescribe multiple drugs for multiple patients, and a patient could obtain prescriptions from multiple doctors. Each prescription has a date and quantity. Assume that if a doctor prescribes the same drug for the same patient multiple times, we only record the last prescription.
A pharm. company can contract with several pharmacies, and vice versa. For each contract, store the text and start and end dates.
Pharmacies appoint a unique supervisor for each contract.

How would this change if each drug had a single price?

Page 24: C20.0046: Database Management Systems Lecture #4

Next topic: Combining FDs
If some FDs are satisfied, then others are satisfied too

If all these FDs are true:
  name → color
  category → department
  color, category → price

Then this FD also holds:
  name, category → price

Why?

Page 25: C20.0046: Database Management Systems Lecture #4

Rules for FDs (quickly)
Reasoning about FDs: given a set of FDs, infer other FDs (useful)
  E.g. A → B, B → C ⟹ A → C
Definitions: for FD-sets S and T
  T follows from S if all relation-instances satisfying S also satisfy T.
  S and T are equivalent if the sets of relation-instances satisfying S and T are the same.
  I.e., S and T are equivalent if S follows from T, and T follows from S.

Page 26: C20.0046: Database Management Systems Lecture #4

Splitting & combining FDs (quickly)
Splitting rule: the FD
  A1, A2, …, An → B1, B2, …, Bm
can be split into
  A1, A2, …, An → B1
  A1, A2, …, An → B2
  . . .
  A1, A2, …, An → Bm
Combining rule: the reverse direction; FDs with the same left side can be merged into one
Q: Can you split and combine the A’s, too?
Note: doesn’t apply to the left side
[Figure: rows t1, t2 over columns A1 ... Am, B1 ... Bm]
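As a small illustration of the two rules, the sketch below splits an FD into single-attribute right sides and merges them back. The names `split_fd` and `combine_fds` are hypothetical helpers, and representing an FD as a (left side, right side) pair is an assumption made for this example.

```python
from collections import defaultdict

def split_fd(lhs, rhs):
    """Splitting rule: one FD per right-side attribute, same left side."""
    return [(frozenset(lhs), attr) for attr in sorted(rhs)]

def combine_fds(single_fds):
    """Combining rule: merge FDs that share exactly the same left side."""
    grouped = defaultdict(set)
    for lhs, attr in single_fds:
        grouped[frozenset(lhs)].add(attr)
    return dict(grouped)

parts = split_fd({"name"}, {"price", "manufacturer"})
print(parts)               # name -> manufacturer, name -> price
print(combine_fds(parts))  # back to name -> {manufacturer, price}
```

Neither helper ever touches the left side, which matches the note above. For instance, color, category → price does not give color → price: in the three-row Product instance shown earlier, Green appears with prices 49 and 59.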

Page 27: C20.0046: Database Management Systems Lecture #4

Reflexive rule: trivial FDs (quickly)
A1, A2, …, An → Ai with i in 1..n is a trivial FD
[Figure: rows t, t’ over columns A1 … An]

FD A1 A2 … An → B1 B2 … Bk may be
  Trivial: Bs are a subset of As
  Nontrivial: >= 1 of the Bs is not among the As
  Completely nontrivial: none of the Bs is among the As
Trivial elimination rule:
  Eliminate common attributes from Bs, to get an equivalent completely nontrivial FD

Page 28: C20.0046: Database Management Systems Lecture #4

Transitive rule (quickly)
If
  A1, A2, …, An → B1, B2, …, Bm
and
  B1, B2, …, Bm → C1, C2, …, Cp
then
  A1, A2, …, An → C1, C2, …, Cp
[Figure: rows t, t’ over columns A1 … An, B1 … Bm, C1 ... Cp]

Page 29: C20.0046: Database Management Systems Lecture #4

Augmentation rule (quickly)
If
  A1, A2, …, An → B
then
  A1, A2, …, An, C → B, C, for any C
[Figure: rows t, t’ over columns A1 … An, B, C]

Page 30: C20.0046: Database Management Systems Lecture #4

Rules summary (quickly)
1. A → B ⟹ AC → B (by definition)
2. Separation/Combination
3. Reflexive
4. Augmentation
5. Transitivity

Last 3 called Armstrong’s Axioms
  Complete: every FD in the closure can be derived from these
  Sound: only FDs that really follow can be derived from these
Don’t need to memorize details…

Page 31: C20.0046: Database Management Systems Lecture #4

Inferring FDs example (quickly)
Start from the following FDs:
  1. name → color
  2. category → department
  3. color, category → price

Infer the following FDs:

Inferred FD                           Which rule did we apply?
4. name, category → name              Reflexive rule
5. name, category → color             Transitivity(4,1)
6. name, category → category          Reflexive rule
7. name, category → color, category   combine(5,6) or Aug(1)
8. name, category → price             Transitivity(3,7)

Page 32: C20.0046: Database Management Systems Lecture #4

Problem: infer all FDs
Given a set of FDs, infer all possible FDs
How to proceed?
  Try all possible FDs, apply all rules
    E.g. R(A, B, C, D): how many FDs are possible?
  Drop trivial FDs, drop augmented FDs
    Still way too many
Better: use the Closure Algorithm…
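To get a feel for "way too many", one can simply count the candidate FDs over R(A, B, C, D). The sketch below assumes that both sides of an FD are non-empty attribute sets; that counting convention is an assumption here, not something the slide fixes.

```python
from itertools import combinations

def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a frozenset."""
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            yield frozenset(combo)

attrs = ["A", "B", "C", "D"]
subsets = list(nonempty_subsets(attrs))

# Every pair (left side, right side) is a syntactically possible FD.
all_fds = [(lhs, rhs) for lhs in subsets for rhs in subsets]
# Nontrivial: at least one right-side attribute is not on the left.
nontrivial = [(lhs, rhs) for lhs, rhs in all_fds if not rhs <= lhs]
# Completely nontrivial: left and right sides share no attribute.
completely_nontrivial = [(lhs, rhs) for lhs, rhs in all_fds if not (lhs & rhs)]

print(len(subsets), len(all_fds), len(nontrivial), len(completely_nontrivial))
# prints: 15 225 160 50
```

Even with only four attributes there are 225 candidates, and the count grows exponentially with the number of attributes, which is why testing them one by one is impractical.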

Page 33: C20.0046: Database Management Systems Lecture #4

Closure of a set of attributes
Given a set of attributes A1, …, An
The closure, {A1, …, An}+ = {B in Atts: A1, …, An → B}

Example:
  name → color
  category → department
  color, category → price

Closures:
  {name}+ = {name, color}
  {name, category}+ = {name, category, color, department, price}
  {color}+ = {color}

Page 34: C20.0046: Database Management Systems Lecture #4

Closure Algorithm
Start with X = {A1, …, An}.
Repeat:
  if B1, …, Bn → C is an FD and B1, …, Bn are all in X, then add C to X
until X doesn’t change

Example:
  name → color
  category → department
  color, category → price
  {name, category}+ = {name, category, color, department, price}
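The loop above translates almost line for line into code. This is a minimal sketch, assuming FDs are given as (left side, right side) pairs of Python sets; the name `closure` and the constant `PRODUCT_FDS` are illustrative.

```python
def closure(attrs, fds):
    """Compute the closure of `attrs` under `fds`.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:                      # "until X doesn't change"
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its whole left side is already in X
            # and it would add something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

PRODUCT_FDS = [
    ({"name"}, {"color"}),
    ({"category"}, {"department"}),
    ({"color", "category"}, {"price"}),
]

print(sorted(closure({"name"}, PRODUCT_FDS)))
# ['color', 'name']
print(sorted(closure({"name", "category"}, PRODUCT_FDS)))
# ['category', 'color', 'department', 'name', 'price']
print(sorted(closure({"color"}, PRODUCT_FDS)))
# ['color']
```

Each pass either adds at least one attribute or stops, so the loop runs at most as many times as there are attributes.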

Page 35: C20.0046: Database Management Systems Lecture #4

Example
In class:
R(A, B, C, D, E, F)
  A, B → C
  A, D → E
  B → D
  A, F → B

Compute {A, B}+: X = {A, B, }
Compute {A, F}+: X = {A, F, }
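Closures also turn the earlier key and superkey definitions into mechanical checks: a set is a superkey if its closure is all of R, and a key if, in addition, no attribute can be dropped. The sketch below applies this to the relation R above. The helper names (`is_superkey`, `is_key`, `candidate_keys`) are assumptions for illustration, and running it also works out the in-class closures.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """First condition: the attributes determine everything in R."""
    return closure(attrs, fds) == set(all_attrs)

def is_key(attrs, all_attrs, fds):
    """Both conditions: determination and minimality."""
    if not is_superkey(attrs, all_attrs, fds):
        return False
    # Dropping one attribute at a time is enough, because every
    # superset of a superkey is itself a superkey.
    return not any(
        is_superkey(set(attrs) - {a}, all_attrs, fds) for a in attrs
    )

def candidate_keys(all_attrs, fds):
    """Brute-force search over all subsets; fine for small relations."""
    attrs = sorted(all_attrs)
    return [
        set(combo)
        for size in range(1, len(attrs) + 1)
        for combo in combinations(attrs, size)
        if is_key(set(combo), all_attrs, fds)
    ]

R = {"A", "B", "C", "D", "E", "F"}
FDS = [
    ({"A", "B"}, {"C"}),
    ({"A", "D"}, {"E"}),
    ({"B"}, {"D"}),
    ({"A", "F"}, {"B"}),
]

print(sorted(closure({"A", "B"}, FDS)))
print(sorted(closure({"A", "F"}, FDS)))
print(candidate_keys(R, FDS))
```

The brute-force search is exponential in the number of attributes, so it is only meant for classroom-sized examples.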

Page 36: C20.0046: Database Management Systems Lecture #4

Next time
Read ch.19, sections 4-5