1 Database Design Database Abstraction Layers 1. Conceptual Model 2. Logical Model 3. Physical Database Design

Upload: eden-hemmings

Post on 02-Apr-2015


TRANSCRIPT

Page 1: 1 Database Design Database Abstraction Layers 1.Conceptual Model 2.Logical Model 3.Physical Database Design

1

Database Design

Database Abstraction Layers

1. Conceptual Model

2. Logical Model

3. Physical Database Design

Page 2

Database design

[Figure: phases of database design. Stages: Requirements Engineering, Conceptual Modelling, Logical Modelling, Physical Modelling. Artifacts: book of duty, conceptual design (ER), logical design (schema), physical design. Inputs: information requirements, processing requirements, DBMS, hardware/OS.]

Page 3

Book of Duty

Describe information requirements
• Objects used (e.g., student, professor, lecture)
• Domains of attributes of objects
• Identifiers, references / relationships

Describe processes
• E.g., examination, degree, register course

Describe processing requirements
• Cardinalities: how many students?
• Distributions: skew of lecture attendance
• Workload: how often a process is carried out
• Priorities and service level agreements

Page 4

Entity/Relationship (ER) Models

Entity (German: Gegenstandstyp, "object type")

Relationship (German: Beziehungstyp, "relationship type")

Attribute (German: Eigenschaft, "property")

Key (German: Identifikation, "identification")

Role

Page 5

Entity/Relationship (ER) Models

[Figure: the terms Entity, Relationship, Attribute, Key, and Role illustrated on an ER diagram fragment with entity Student, attributes MatrNr and Name, relationship attends, and role Attendant]

Page 6

Entity/Relationship (ER) Models

[Figure: the terms Entity, Relationship, Attribute, Key, and Role illustrated on a larger ER diagram: entity Student (Legi, Name, Semester) and entity Lecture (Nr, Title, CP) connected by the relationship attends, with roles Attendant and Course]

Page 7

University

[Figure: ER schema "University". Entities: Student (Legi, Name, Semester), Lecture (Nr, Title, CP), Professor (PersNr, Name, Level, Room), Assistant (PersNr, Name, Area). Relationships: attends (Student, Lecture), requires (Lecture with itself, roles follow-up and prerequisite), gives (Professor, Lecture), works-for (Assistant, Professor), tests (Student, Lecture, Professor, with attribute Grade).]

Page 8

Natural Language Version

• Students have a Legi, Name and Semester. The Legi identifies a student uniquely.
• Lectures have a Nr, CP and Title. The Nr identifies a lecture uniquely.
• Professors have a PersNr, Name, Level and Room. The PersNr identifies a professor uniquely.
• Assistants have a PersNr, Name and (research) Area. The PersNr identifies an assistant uniquely.
• Students attend lectures.
• Lectures can be prerequisites for other lectures.
• Professors give lectures.
• Assistants work for professors.
• Students are tested by professors about lectures. Students receive grades as part of these tests.

Is this the only possible interpretation?
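The natural language description can be turned into data structures almost mechanically. The following is a minimal Python sketch, not part of the original slides: the attribute types, the sample values, and the helper `dangling` are assumptions made for illustration. Entities become records keyed by their identifying attribute, and relationships become sets of key tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    legi: int        # key: identifies a student uniquely
    name: str
    semester: int


@dataclass(frozen=True)
class Lecture:
    nr: int          # key
    title: str
    cp: int


@dataclass(frozen=True)
class Professor:
    pers_nr: int     # key
    name: str
    level: str
    room: str


@dataclass(frozen=True)
class Assistant:
    pers_nr: int     # key
    name: str
    area: str


# Made-up sample entities, stored by key.
students = {s.legi: s for s in [Student(1, "Ada", 3)]}
lectures = {l.nr: l for l in [Lecture(10, "Databases", 8),
                              Lecture(11, "Query Processing", 6)]}
professors = {p.pers_nr: p for p in [Professor(100, "Prof. X", "full", "R1")]}
assistants = {a.pers_nr: a for a in [Assistant(200, "Bob", "Indexing")]}

# Relationships as sets of key tuples (made-up instances).
attends = {(1, 10)}                 # (legi, lecture nr)
requires = {(10, 11)}               # (prerequisite nr, follow-up nr)
gives = {(100, 10), (100, 11)}      # (professor pers_nr, lecture nr)
works_for = {(200, 100)}            # (assistant pers_nr, professor pers_nr)
tests = {(1, 10, 100): 5.5}         # (legi, nr, pers_nr) -> Grade


def dangling(relation, *entity_maps):
    """Return the tuples that reference a key missing from its entity set."""
    return [t for t in relation
            if any(k not in m for k, m in zip(t, entity_maps))]
```

Note that Grade lives on the `tests` relationship rather than on any one entity, which is one of the modelling decisions the closing question of the slide invites you to reconsider.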


Page 10

Why ER?

Advantages
• ER diagrams are easy to create
• ER diagrams are easy to edit
• ER diagrams are easy to read (even for the layman)
• ER diagrams express all information requirements

Other aspects
• Minimality
• Tools (e.g., Visio)
• Graphical representation

General
• Try to be concise, complete, comprehensible, and correct
• Controversy whether ER/UML is useful in practice
• No controversy that everybody needs to learn ER/UML

Page 11

Functionalities

[Figure: entities E1 and E2 connected by relationship R]

$R \subseteq E_1 \times E_2$

Possible functionalities of R between E1 and E2: 1:1, 1:N, N:1, N:M
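To make the four functionalities concrete, here is a small Python sketch that inspects one instance of a binary relationship and reports the tightest functionality that instance is consistent with. It is an illustration, not part of the slides. It assumes the common textbook convention that 1:N means each E2 entity is related to at most one E1 entity, and N:1 means each E1 entity is related to at most one E2 entity. A functionality is a constraint on all legal instances, so a single instance can only refute a functionality, never prove it.

```python
from collections import defaultdict


def functionality(pairs):
    """Classify an instance of R ⊆ E1 × E2, given as (e1, e2) pairs."""
    partners_of_e1 = defaultdict(set)
    partners_of_e2 = defaultdict(set)
    for e1, e2 in pairs:
        partners_of_e1[e1].add(e2)
        partners_of_e2[e2].add(e1)

    # Each e1 has at most one e2: R is a partial function E1 -> E2.
    e1_determines_e2 = all(len(s) <= 1 for s in partners_of_e1.values())
    # Each e2 has at most one e1: R is a partial function E2 -> E1.
    e2_determines_e1 = all(len(s) <= 1 for s in partners_of_e2.values())

    if e1_determines_e2 and e2_determines_e1:
        return "1:1"
    if e2_determines_e1:
        return "1:N"
    if e1_determines_e2:
        return "N:1"
    return "N:M"
```

For example, `functionality([("p1", "l1"), ("p1", "l2")])` reports `"1:N"`: one professor gives several lectures, but no lecture has two professors.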

Page 12

Functionalities of n-ary relationships

[Figure: relationship R connecting entities E1, E2, ..., Ek, ..., En; the edges carry the labels N, M, P, and 1]

$R : E_1 \times \dots \times E_{k-1} \times E_{k+1} \times \dots \times E_n \to E_k$

A 1 at Ek means that R is a (partial) function from the remaining entities to Ek.

Page 13

Example: seminar

[Figure: ternary relationship supervise (attribute Grade) between Professor, Student, and Topic, with functionality 1 at Professor, N at Student, and 1 at Topic]

$\text{supervise} : \text{Professor} \times \text{Student} \to \text{Topic}$

$\text{supervise} : \text{Topic} \times \text{Student} \to \text{Professor}$

Page 14

Constraints

The functionalities rule out the following:

1. A student doing more than one seminar with the same prof.

2. A student working on the same topic more than once.

The following is possible:

Profs may recycle topics and assign the same topic to several students.

The same topic may be supervised by several profs.
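The constraints of the seminar example follow directly from reading supervise as two partial functions, (Professor, Student) to Topic and (Topic, Student) to Professor. A minimal Python sketch of that check is given below; the function names and the sample tuples are invented for illustration.

```python
PROF, STUDENT, TOPIC = 0, 1, 2   # positions inside a supervise tuple


def violates_function(tuples, determinant, dependent):
    """True if two tuples agree on the determinant positions
    but differ at the dependent position."""
    seen = {}
    for t in tuples:
        key = tuple(t[i] for i in determinant)
        if seen.setdefault(key, t[dependent]) != t[dependent]:
            return True
    return False


def is_legal(supervise):
    """Check both functionalities of the ternary relationship."""
    return not (
        violates_function(supervise, (PROF, STUDENT), TOPIC)
        or violates_function(supervise, (TOPIC, STUDENT), PROF)
    )


# Allowed: p1 recycles topic t1, and t1 is also supervised by p2.
recycled = {("p1", "s1", "t1"), ("p1", "s2", "t1"), ("p2", "s3", "t1")}
# Ruled out: s1 does two seminars with p1.
two_seminars = {("p1", "s1", "t1"), ("p1", "s1", "t2")}
# Ruled out: s1 works on topic t1 twice.
same_topic_twice = {("p1", "s1", "t1"), ("p2", "s1", "t1")}
```

Here `is_legal(recycled)` holds, while the other two instances are rejected.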

Page 15

Example

[Figure: instance diagram with professors p1 to p4, topics t1 to t4, students s1 to s4, and relationship instances b1 to b6. Dashed lines represent illegal references.]

Page 16

Functionalities

[Figure: the University ER schema from page 7 (Student, Lecture, Professor, Assistant with relationships attends, requires, gives, works-for, tests), now annotated with the functionalities 1, N, and M on the relationship edges]

Page 17

Exercise

Model "+" as a relationship.

How many functions have you modelled?

Page 18

Two Binary vs. One Ternary Relationship

A thief steals a painting as part of a theft.

• Model as two binary relationships
• Model as one ternary relationship
• What is better?
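One way to explore the question is to compare what each model can reconstruct. The Python sketch below is an illustration only: the names are made up, and the split into "thief takes part in theft" and "painting is stolen in theft" is just one possible pair of binary relationships. It shows that joining the two binary relationships can produce tuples the ternary relationship never contained.

```python
# Ternary model: (thief, painting, theft). Made-up instance.
steals = {
    ("Anna", "Painting A", "theft1"),
    ("Ben", "Painting B", "theft1"),
}

# Binary model: project the ternary relationship onto two pairs.
takes_part_in = {(thief, theft) for thief, _, theft in steals}
stolen_in = {(painting, theft) for _, painting, theft in steals}

# Try to recover who stole what by joining on the theft.
rejoined = {
    (thief, painting, theft)
    for thief, theft in takes_part_in
    for painting, theft2 in stolen_in
    if theft == theft2
}

# Tuples that appear only after the join.
spurious = rejoined - steals
```

With this instance `spurious` contains ("Anna", "Painting B", "theft1") and ("Ben", "Painting A", "theft1"). If it matters which thief took which painting, this particular binary split loses that information; if only participation in the theft matters, the binary model may be sufficient.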

Page 19

Rules of thumb

Attribute vs. Entity
• Entity if the concept has more than one relationship
• Attribute if the concept has only one 1:1 relationship

Partitioning of ER Models
• Most realistic models are larger than a page
• Partition by domains (library, research, finances, ...)
• I do not know of any good automatic graph partitioning tool

Good vs. Bad models
• Do not model redundancy or tricks to improve performance
• Fewer entities is better (the fewer, the better!)
• Remember the C4 rule (concise, correct, complete, comprehensible)

Page 20

(min, max)-Notation

[Figure: relationship R connecting entities E1, E2, ..., Ek, ..., En; the edge to each Ei is labelled (min_i, max_i)]

$R \subseteq E_1 \times \dots \times E_i \times \dots \times E_n$

For all $e_i \in E_i$:
• At least $min_i$ records $(\dots, e_i, \dots)$ exist in R, AND
• At most $max_i$ records $(\dots, e_i, \dots)$ exist in R
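The definition translates directly into a counting check. The following Python sketch is an illustration under assumptions: the function name, the use of `math.inf` for an open upper bound, and the sample data (including the choice of bounds for `gives`) are all invented here.

```python
import math
from collections import Counter


def check_min_max(relation, entity_sets, bounds):
    """Return (position, entity, count) for every entity that violates
    its (min, max) annotation.

    relation:    iterable of n-tuples (e_1, ..., e_n)
    entity_sets: [E_1, ..., E_n]
    bounds:      [(min_1, max_1), ..., (min_n, max_n)]
    """
    tuples = set(relation)
    violations = []
    for i, (entities, (low, high)) in enumerate(zip(entity_sets, bounds)):
        counts = Counter(t[i] for t in tuples)
        for e in sorted(entities):
            # Counter yields 0 for entities that occur in no tuple.
            if not low <= counts[e] <= high:
                violations.append((i, e, counts[e]))
    return violations


# Assumed example: every lecture is given by exactly one professor (1, 1),
# a professor gives any number of lectures (0, *).
professors = {"p1", "p2"}
lectures = {"l1", "l2", "l3"}
gives = {("p1", "l1"), ("p1", "l2")}
bounds = [(0, math.inf), (1, 1)]

problems = check_min_max(gives, [professors, lectures], bounds)
```

Here `problems` reports lecture l3, which appears in no `gives` tuple although its annotation demands exactly one.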

Page 21

Geometric Modelling

[Figure: ER schema of a polyhedron. Entities: Polyhedron (PolyID), Surface (SurfaceID), Edge (EdgeID), Node (X, Y, Z). Relationships and functionalities: covers between Polyhedron and Surface (1:N), boundary between Surface and Edge (N:M), StartEnd between Edge and Node (N:M).]

Page 22

Geometric Modelling

[Figure: the polyhedron ER schema from page 21 with (min, max) annotations added. covers: (4, *) at Polyhedron and (1, 1) at Surface. boundary: (3, *) at Surface and (2, 2) at Edge. StartEnd: (2, 2) at Edge and (3, *) at Node.]

Page 23

Weak Entities

• The existence of a room depends on the existence of the associated building.
• Why must such relationships be N:1 (or 1:1)?
• RoomNr is only unique within a building.
• Key of a room: BldNr and RoomNr

[Figure: Building (BldNr, Size) and the weak entity Room (RoomNr, Size), connected by the relationship located with functionality 1 at Building and N at Room]
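A small Python sketch can make the two properties of a weak entity tangible: the composite key (BldNr, RoomNr) and the existence dependency on the owning building. The container class `Campus` and its methods are hypothetical names chosen for this illustration, as are the attribute types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Building:
    bld_nr: int      # key
    size: float


@dataclass(frozen=True)
class Room:
    bld_nr: int      # borrowed from the owning building
    room_nr: int     # only unique within one building
    size: float

    @property
    def key(self):
        return (self.bld_nr, self.room_nr)


class Campus:
    """Hypothetical container that enforces the existence dependency."""

    def __init__(self):
        self.buildings = {}
        self.rooms = {}

    def add_building(self, building):
        self.buildings[building.bld_nr] = building

    def add_room(self, room):
        # A room cannot exist without its building.
        if room.bld_nr not in self.buildings:
            raise KeyError(f"no building {room.bld_nr}")
        self.rooms[room.key] = room

    def remove_building(self, bld_nr):
        # Removing the building removes all of its rooms.
        del self.buildings[bld_nr]
        self.rooms = {k: r for k, r in self.rooms.items() if k[0] != bld_nr}
```

Because each room borrows exactly one BldNr for its key, a room can belong to only one building, which hints at the answer to the question about N:1.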

Page 24

Exams depend on the student

[Figure: weak entity Exam (attributes Grade, Part) depends on Student (Legi) via takes, with functionality 1 at Student and N at Exam. Exam is also connected to Lecture (Nr) via covers and to Professor (PersNr) via gives, both N:M.]

• Can the existence of an entity depend on several other entities? (E.g., exam on student and prof?)

Page 25

Corner Case 1

[Figure: Human and Heart connected by the relationship "has" (German: besitzt), functionality 1:1]

• A human cannot exist without a heart.
• A heart cannot exist without a human.
• Anne lives on Bob's heart. Bob lives on Anne's heart. Possible?
• Heart transplantation: possible?

ER describes possible worlds and their rules. ER does not describe legal transitions!

Page 26

Corner Case 2

• A human can only survive with at least one kidney.
• A relationship can only survive with all its entities.
• Not expressible with ER! (Why not?)

[Figure: Kidney and Human connected by the relationship "has" (German: hat), functionality N at Kidney and 1 at Human]

Page 27

Why is this a bad model?

[Figure: ER diagram whose elements are the ER concepts themselves (Attribute, Entity, Relationship), with edge labels N, N, 1, 1]

Page 28

Why is this example so itchy?

Is the following "instance" legal?

Answer: Yes!
• When "requires" dies, "%" dies, too
• When "lecture" dies, "%" dies, too
• But not cleanly modelled: transitivity of weakness

N.B. Weakness of "Relationship" cannot be modelled!

[Figure: Lecture, requires, and "%"; "%" engages in a relationship with Lecture via requires]

Page 29

Why is this example so itchy?

Is the following "instance" legal?

Answer: Yes!
• When "R" dies, "A" dies, too
• When "E" dies, "A" dies, too
• No way to model that it is an "either or" relationship

Yet another weakness of ER (lack of negation)

[Figure: diagram with the elements E, R, and A]

Page 30

Solution

Model attributes as two weak entities:
• attributes of relationships
• attributes of entities

(Give up on relationships as weak entities)

Not perfect because redundant, but the most precise way to model ER as ER

Page 31

Generalization

[Figure: generalization hierarchy. Uni-Member (attribute Name) generalizes Student (Legi) and Employee (PersNr) via is-a; Employee generalizes Assistant (Area) and Professor (Room, Level) via a second is-a.]

Page 32

Comments on Generalization

• There may be Employees who are neither Assistant nor Prof.
• There may be an Employee who is also an Assistant.
• There may be an Employee who is also a Prof.
• There may be an Employee who is also an Assistant and a Prof.

ER has no way to restrict to any of those four cases.
• If at all, done in the "quantitative part" of the book of duty
• Does it matter?
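The four cases can be illustrated by treating each entity type as a set of keys. In the Python sketch below, which is an illustration with made-up PersNr values and invented function names, the is-a relationship only guarantees the subset condition. Disjointness and coverage are extra rules that would have to be stated and checked outside the ER model.

```python
# Made-up PersNr values, one per case from the list above.
employees = {1, 2, 3, 4}
assistants = {2, 4}     # 2: assistant only, 4: both
professors = {3, 4}     # 3: professor only, 4: both


def respects_is_a(general, *special):
    """is-a only demands that each specialization is a subset."""
    return all(s <= general for s in special)


def is_disjoint(a, b):
    """Extra rule, not expressible in ER: no entity in both sets."""
    return a.isdisjoint(b)


def is_covering(general, *special):
    """Extra rule, not expressible in ER: every general entity
    belongs to at least one specialization."""
    return set().union(*special) == general


def roles(pers_nr):
    """List the specializations an employee belongs to."""
    found = []
    if pers_nr in assistants:
        found.append("Assistant")
    if pers_nr in professors:
        found.append("Professor")
    return found
```

This instance satisfies is-a, yet employee 1 has no specialization and employee 4 has two, so both `is_covering` and `is_disjoint` fail.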

Page 33

[Figure: the University ER schema with generalization and (min, max) annotations. Entities: Student (Legi, Name, Semester), Lecture (Nr, Title, CP), Assistant (Area), Professor (Room, Level), and Employee (PersNr, Name), which generalizes Assistant and Professor via is-a. Relationships: attends, requires (roles follow-up and prerequisite), tests (attribute Grade), gives, Works-for. Edge annotations in the diagram: (0,*) eight times, (1,1) twice, and (3,*) once.]

Page 34

Aggregation

[Figure: aggregation hierarchy. Frame and Wheel are each Part-of Bicycle; further parts, shown as "...", are Part-of Frame and Wheel.]

Page 35

Aggregation and Generalization

[Figure: the Bicycle aggregation hierarchy (Frame and Wheel Part-of Bicycle, with further elided parts) combined with a generalization hierarchy: Bicycle, Tricycle, Motorcycle, and Car are connected by is-a to Manual vehicle and Motor vehicle, which are in turn connected by is-a to Vehicle.]

Page 36

Limitations of ER

ER has no formal semantics
• unclear whether this is a bug or a feature (natural language has no formal semantics either)

No way to express relationships between sets of entities
• e.g., existence of a person depends on a set of organs
• sets of sets are notoriously hard to model (more on that when we talk about 4NF)

No way to express negative rules
• e.g., the same entity cannot be an Assistant and a Professor
• again, negation is notoriously hard (e.g., 2nd-order logic)

ER has been around for 30+ years
• maybe ER hit the sweet spot of expressivity vs. simplicity (UML class diagrams inherit the same weaknesses)

Page 37

Why is ER modelling so difficult?

[Figure: View 1, View 2, View 3, and View 4 are consolidated into one global schema]

Global Schema
• No redundancy
• No conflicts
• Avoid synonyms
• Avoid homonyms

Page 38

Consolidation Hierarchies

[Figure: two consolidation trees over the schemas S1 to S4. One consolidates step by step: S1 and S2 into S1,2; S1,2 and S3 into S1,2,3; S1,2,3 and S4 into S1,2,3,4. The other consolidates pairwise: S1 and S2 into S1,2; S3 and S4 into S3,4; S1,2 and S3,4 into S1,2,3,4.]

Problem: How to achieve multi-lateral consensus?

Page 39

Example: Professor View

[Figure: ER diagram with entities Student, Assistant, Professor, Master thesis (Title), and PhD thesis (Title), and relationships do, write, and supervise (twice)]

Page 40

Example: Library View

[Figure: ER diagram with entities Library, Uni-Member, and Document, relationships owns, lends, and manages, and attributes Signature, Authors, Title, Year, Date, and Faculty]

Page 41

Example: Lecture View

[Figure: ER diagram with entities Lecture, Lecturer, and Textbook, the relationship recommends, and attributes Title, Year, Publisher, and Authors]

Page 42

Observations

• Lecturer and Professor are synonyms.
• Uni-Member is a generalization of Student, Professor and Assistant.
• However, libraries are managed by Employees. (View 2 is imprecise in this respect.)
• Dissertations, Master theses and Books are different species of Document. All are held in libraries.
• Do and Write are synonyms in View 1.

Things get complicated very quickly; this requires "engineers"
• Not unique
• Need to invent new concepts
• Need to compromise (e.g., authorship of documents)

Page 43

[Figure: consolidated schema. Entities in generalization hierarchies: Person, Uni-Member, Student, Employee, Assistant, Professor, Lecturer; Document, Master Thesis, Dissertation, Book. Further entity: Library. Relationships: supervise (twice), recommends, manages, lends, keeps, writes. Attributes: faculty, Signature, Title, Year, Publisher, Date.]

Page 44

Data Modelling with UML

• Unified Modelling Language (UML)
• De-facto standard for object-oriented design
• Data modelling is done with "class diagrams"

• Class in UML ~ Entity in ER
• Attribute in UML ~ Attribute in ER
• Association in UML ~ Relationship in ER
• Composition in UML ~ Weak Entity in ER
• Generalization in UML ~ Generalization in ER

Key differences between UML class diagrams and ER
• Methods are associated to classes in UML
• Keys are not modelled in UML
• UML explicitly models aggregation (part-of)
• UML supports the modelling of instances (object diagrams)
• UML has much more to offer (use cases, sequence diagrams, ...)

Page 45

Class: Professor

Professor

- PersNr: Integer
+ Name: String
- Level: String

+ promote()
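As a rough illustration of how such a class diagram maps to code, here is a Python sketch of the Professor class. The slide does not say what promote() does or which levels exist, so the `LEVELS` ladder and the promotion rule below are pure assumptions. Python has no enforced visibility; a leading underscore stands in for the "-" (private) marker.

```python
class Professor:
    # Hypothetical level ladder, not taken from the slides.
    LEVELS = ("assistant", "associate", "full")

    def __init__(self, pers_nr: int, name: str, level: str):
        if level not in self.LEVELS:
            raise ValueError(f"unknown level: {level}")
        self._pers_nr = pers_nr   # "- PersNr: Integer" (private)
        self.name = name          # "+ Name: String"   (public)
        self._level = level       # "- Level: String"  (private)

    @property
    def level(self) -> str:
        """Read-only view of the private Level attribute."""
        return self._level

    def promote(self) -> None:
        """Assumed behaviour: move one step up, stay at the top level."""
        i = self.LEVELS.index(self._level)
        if i + 1 < len(self.LEVELS):
            self._level = self.LEVELS[i + 1]
```

The method is what distinguishes this from an ER entity: an ER model would record PersNr, Name, and Level, but has no place for promote().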

Page 46

Associations (directed, undirected)

Page 47

Functionalities & Multiplicities

Multiplicities
• Every instance of A is associated to 4 to 6 instances of B.
• Every instance of B is associated to 2 to 5 instances of A.
• Be careful: Flipped around as compared to ER.
• Be careful: Cannot be used for n-ary relationships.

Functionalities
• Represented as UML multiplicities: 1, *, 1..*, 0..*, or 0..1
• Otherwise, the same as in ER.
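The "flipped around" warning can be made concrete with a small conversion. The sketch below reads the warning as a comparison with the (min, max) notation: UML writes the number of B partners of each A at the B end of the association, whereas (min, max) writes that same range next to A. The function names are invented for this illustration, and the conversion only makes sense for binary associations.

```python
import math


def parse_multiplicity(text):
    """Turn a UML multiplicity such as '1', '*', '0..1', '1..*'
    or '4..6' into a (min, max) pair."""
    text = text.strip()
    if text == "*":
        return (0, math.inf)
    if ".." in text:
        low, high = text.split("..")
        return (int(low), math.inf if high == "*" else int(high))
    n = int(text)
    return (n, n)


def uml_to_min_max(at_a_end, at_b_end):
    """Swap sides: the multiplicity written at the B end in UML becomes
    the (min, max) annotation at A, and vice versa."""
    return {
        "A": parse_multiplicity(at_b_end),
        "B": parse_multiplicity(at_a_end),
    }


# The slide's example: each A has 4 to 6 Bs, each B has 2 to 5 As.
# In UML, "4..6" is written at the B end and "2..5" at the A end.
slide_example = uml_to_min_max(at_a_end="2..5", at_b_end="4..6")
```

For the slide's example this yields (4, 6) at A and (2, 5) at B.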

Page 48

Aggregation

Page 49

Generalization