Entity-Relationship (ER) modeling and Relational Database Design

Fernando J. Pineda 140.636

Programmers-For-Hire Inc: A one-table database

MID  Name          Skills                Firm  Loc
1    Joh Smith     Access, DB2, FoxPro   ABC   AL
2    Dave Jones    dBASE, Clipper        MCI   FL
3    Mike Beach                          IBM   DE
4    Jerry Miller  DB2, Oracle           MCI   FL
5    Ben Stuart    Oracle, Sybase        AIC   NE
6    Fred Flint    Informix              ABC   AL
7    Joe Blow                            RUN   IA
8    Greg Brown    Access, MS SQLserver  XYZ   NY
9    Doug Hope                           IBM   DE

Problems due to bad database design

- Some queries require special programming
  - The Skills attribute has more than one value in each row
  - We need to parse the values of the Skills attribute when we search for a particular skill
- If we insert a new employee at IBM, but incorrectly enter MD as the location, then how do we determine where IBM actually resides? (insertion anomaly)
- If Greg Brown quits, and we delete him from the table, we lose the fact that XYZ was in our organization. (deletion anomaly)
- If the IBM office moves from DE to MA, we need to scan every row and update the IBM entries. If we miss just one, we don't know where IBM is located. (update anomaly)
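The problems listed above can be reproduced in a few lines. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for a real DBMS; the table name programmer and the choice of rows are illustrative and not part of the original slides.

```python
import sqlite3

# In-memory SQLite database standing in for the one-table design
# (the table name "programmer" is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE programmer (mid INTEGER, name TEXT, skills TEXT, firm TEXT, loc TEXT)"
)
conn.executemany(
    "INSERT INTO programmer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Joh Smith", "Access, DB2, FoxPro", "ABC", "AL"),
        (3, "Mike Beach", "", "IBM", "DE"),
        (4, "Jerry Miller", "DB2, Oracle", "MCI", "FL"),
        (8, "Greg Brown", "Access, MS SQLserver", "XYZ", "NY"),
        (9, "Doug Hope", "", "IBM", "DE"),
    ],
)

# Special programming: finding one skill means parsing the multi-valued column.
db2_people = [
    name
    for name, skills in conn.execute("SELECT name, skills FROM programmer ORDER BY mid")
    if "DB2" in [s.strip() for s in skills.split(",")]
]

# Update anomaly: the move from DE to MA reaches only one of the two IBM rows.
conn.execute("UPDATE programmer SET loc = 'MA' WHERE mid = 3")
ibm_locations = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT loc FROM programmer WHERE firm = 'IBM'")
)

# Deletion anomaly: removing Greg Brown also removes the only record of firm XYZ.
conn.execute("DELETE FROM programmer WHERE name = 'Greg Brown'")
xyz_rows = conn.execute("SELECT COUNT(*) FROM programmer WHERE firm = 'XYZ'").fetchone()[0]
```

After the partial update the table reports two locations for IBM, and after the delete no row mentions XYZ at all.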

Data Modeling

- The purpose of data modeling is to impose a formal structure on data. The formal structure is known as a data model.
- Most databases support only one data model.
- The relational data model is just one such representation.
- An entity-relationship diagram (ER diagram) is a DBMS-independent technique for representing the structure of a relational data model.

ER modeling terminology

- Entity: a table
  - Something about which we store data, e.g. a protein, a customer, a citation, etc.
- Attribute: a column in a table
  - Entities are characterized by their attributes, e.g. a customer entity is described by a customer_id, first name, last name, street, city, zipcode and telephone number.
- Instance (of an entity): a row in a table
  - Entities in a relational database are represented by a row of attributes.
  - A row of attributes is an instance of an entity.
  - Often we simply say entity when we mean instance of an entity.

Entity Identifiers

- We put data in a database so we can ultimately retrieve it!
- To retrieve specific data we need a means of distinguishing one instance of an entity from another instance.
- Each instance of an entity must have at least one unique attribute (the entity identifier)

Entity Identifiers & keys

- Superkey
  - A set of columns in a table that together serve to uniquely identify a row
- Key
  - A minimal superkey: no column can be removed without losing uniqueness
- Foreign key
  - A column in a table that is the key of another table (used for joining tables together).
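The difference between a superkey and a key can be checked mechanically on sample rows. This is a minimal sketch; the function names is_superkey and is_key are hypothetical, and passing the check on sample data only shows that the columns are unique for these rows, whereas a real key is a property of the design that must hold for every row that could ever be stored.

```python
from itertools import combinations

# A few rows of the one-table example, reduced to three columns.
rows = [
    {"MID": 2, "Firm": "MCI", "Loc": "FL"},
    {"MID": 3, "Firm": "IBM", "Loc": "DE"},
    {"MID": 4, "Firm": "MCI", "Loc": "FL"},
    {"MID": 9, "Firm": "IBM", "Loc": "DE"},
]


def is_superkey(columns, rows):
    """True if no two rows share the same values in `columns`."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)


def is_key(columns, rows):
    """True if `columns` is a superkey and no proper, non-empty subset of it is one."""
    if not is_superkey(columns, rows):
        return False
    return not any(
        is_superkey(subset, rows)
        for size in range(1, len(columns))
        for subset in combinations(columns, size)
    )
```

Here ("MID", "Firm") identifies every row but is not minimal, because ("MID",) alone already does; ("Firm", "Loc") is not even a superkey.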

Entity + identifier + attribute uniquely retrieves entity data

MID  Name          Skills                Firm  Loc
1    Joh Smith     Access, DB2, FoxPro   ABC   AL
2    Dave Jones    dBASE, Clipper        MCI   FL
3    Mike Beach                          IBM   DE
4    Jerry Miller  DB2, Oracle           MCI   FL
5    Ben Stuart    Oracle, Sybase        AIC   NE
6    Fred Flint    Informix              ABC   AL
7    Joe Blow                            RUN   IA
8    Greg Brown    Access, MS SQLserver  XYZ   NY
9    Doug Hope                           IBM   DE
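Read as a lookup rule, the slide title says that naming the table, one identifier value, and one column selects exactly one stored value. A minimal sketch, assuming sqlite3 as a stand-in DBMS; the table name programmer and the helper lookup are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE programmer (mid INTEGER PRIMARY KEY, name TEXT, skills TEXT, firm TEXT, loc TEXT)"
)
conn.executemany(
    "INSERT INTO programmer VALUES (?, ?, ?, ?, ?)",
    [
        (5, "Ben Stuart", "Oracle, Sybase", "AIC", "NE"),
        (8, "Greg Brown", "Access, MS SQLserver", "XYZ", "NY"),
    ],
)

ATTRIBUTES = {"name", "skills", "firm", "loc"}


def lookup(mid, attribute):
    """Return one attribute of the instance identified by `mid`, or None if no such row."""
    # Column names cannot be bound as SQL parameters, so validate against a fixed set.
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attribute}")
    row = conn.execute(
        f"SELECT {attribute} FROM programmer WHERE mid = ?", (mid,)
    ).fetchone()
    return row[0] if row else None
```

For example, lookup(8, "firm") addresses the Firm cell of Greg Brown's row.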

Choosing entity identifiers

- Bad ways of identifying instances (e.g. of people)
  - Social security number
  - Phone_number
  - First_name+Last_name+social
- Best practice
  - Unique meaningless numbers make the best entity identifiers.
  - Sometimes there is a natural or previously assigned meaningless number, e.g. an order number.
  - Sometimes concatenated entity identifiers are appropriate
  - AUTO_INCREMENT in MySQL
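A generated, meaningless number keeps two instances apart even when every descriptive attribute matches. A minimal sketch, assuming sqlite3 as a stand-in: SQLite spells the feature INTEGER PRIMARY KEY AUTOINCREMENT where MySQL uses AUTO_INCREMENT, and the person table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite equivalent of a MySQL AUTO_INCREMENT primary key.
conn.execute(
    "CREATE TABLE person ("
    " person_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " first_name TEXT,"
    " last_name TEXT)"
)

# Two different people with the same name: the name cannot tell them apart.
first = conn.execute("INSERT INTO person (first_name, last_name) VALUES ('Joe', 'Blow')")
second = conn.execute("INSERT INTO person (first_name, last_name) VALUES ('Joe', 'Blow')")

# The DBMS assigned a distinct identifier to each row.
ids = [first.lastrowid, second.lastrowid]
```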

Single vs multi-valued attributes

- Multi-valued attributes cause problems:
  - What is the meaning of the data?
  - Slows down searching
  - How many values can be stored? Unnecessarily restricts the amount of data that can be stored.
- In general, if you encounter multi-valued data, it's a hint that it's time to create a new instance (row).

ID  Parent    Children                              Birthdays
1   Fernando  Katy, Daniel                          12/21/93, 03/26/95
2   Rosie     Lucky,Two,Yeller,Little-Squirt,Furzy  01/06/2004
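The hint above, one new row per value, can be sketched as follows. This assumes sqlite3 as a stand-in DBMS and hypothetical table names parent and child. The birthdays are left out because the flattened table does not say which date belongs to which child, which is exactly the "meaning of the data" problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (parent_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child ("
    " child_id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES parent(parent_id),"
    " name TEXT)"
)
conn.execute("INSERT INTO parent VALUES (1, 'Fernando'), (2, 'Rosie')")

# The multi-valued "children" column from the table above, keyed by parent ID.
multi_valued = {1: "Katy, Daniel", 2: "Lucky,Two,Yeller,Little-Squirt,Furzy"}

# One row per child instead of one comma-separated string per parent.
for parent_id, children in multi_valued.items():
    for child_name in children.split(","):
        conn.execute(
            "INSERT INTO child (parent_id, name) VALUES (?, ?)",
            (parent_id, child_name.strip()),
        )

counts = conn.execute(
    "SELECT p.name, COUNT(c.child_id) FROM parent p"
    " JOIN child c ON c.parent_id = p.parent_id"
    " GROUP BY p.parent_id ORDER BY p.parent_id"
).fetchall()

# Searching for one child is now a plain equality test, with no string parsing.
rosie_has_yeller = conn.execute(
    "SELECT COUNT(*) FROM child WHERE parent_id = 2 AND name = 'Yeller'"
).fetchone()[0] == 1
```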

ER diagram styles

- Special case of Associative Networks
  - http://web.cs.mun.ca/~ulf/pld/assoc.html
- Chen Style (Classical ER modeling approach)
  - Chen, P., "The Entity-Relationship Model-Toward a Unified View of Data", ACM Trans. on Database Systems, vol. 1, no. 1, Mar. 1976
- Information Engineering Style (Classical ER modeling approach)
  - Martin, J. and McClure, C., Diagramming Techniques for Analysts and Programmers, Prentice Hall (1985)
  - Finkelstein, C., An Introduction to Information Engineering: From Strategic Planning to Information Systems, Addison Wesley (1989)
  - http://www.inconcept.com/JCM/April2000/halpin.html
- Object-oriented Styles (Modern approaches)
  - UML -- essentially ER + methods
  - ORM

Chen Style

- Box: Entity
- Ellipse/Rounded box: Attribute
- Asterisk: Entity instance identifier
- Arrow/line: Relationship
- Diamond: makes relationships explicit

[Diagram: entity employee (attributes *e_id, first_name, last_name, telephone) and entity department (attributes *d_id, dept_name, location), joined by the relationship diamond works-in with cardinality labels 1 and n]

Information Engineering Style

- Box: Entity
  - Entity name labels the box
  - Attributes are in the lower section of the box
  - Instance identifier has an asterisk
- Line: Indicates Relationship
- Terminators: Indicate multiplicity of the relationship
  - ||  One and only one
  - 0|  Zero or one
  - >|  One or more
  - >0  Zero, one or more

[Diagram: box employee (*e_id, telephone, first_name, last_name, d_id) connected by a line to box department (*d_id, dept_name, location); the line carries a || terminator]
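The employee and department boxes can be sketched as two tables in which employee.d_id is a foreign key to department. This is a minimal sketch, assuming sqlite3 as a stand-in DBMS and assuming the || terminator sits on the department end, i.e. every employee works in one and only one department. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute(
    "CREATE TABLE department (d_id INTEGER PRIMARY KEY, dept_name TEXT, location TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    " e_id INTEGER PRIMARY KEY,"
    " first_name TEXT, last_name TEXT, telephone TEXT,"
    # NOT NULL plus the foreign key expresses "one and only one" department.
    " d_id INTEGER NOT NULL REFERENCES department(d_id))"
)

# Hypothetical sample data.
conn.execute("INSERT INTO department VALUES (1, 'Sales', 'NY')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 'Lee', '555-0100', 1)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 'Ray', '555-0101', 1)")

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Cy', 'Poe', '555-0102', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

staff_in_sales = conn.execute("SELECT COUNT(*) FROM employee WHERE d_id = 1").fetchone()[0]
```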

Domains

- Each attribute has a domain, which is the set of values the attribute is allowed to take.
- A domain can be small, e.g.
  - TRUE, FALSE
  - Dates
  - zip codes (different from integers since they can start with zero)
- A DBMS enforces a domain via domain constraints, e.g.
  - Date domain constraints reject illegal dates
  - Time domain constraints enforce 24 hours, 60 minutes and 60 seconds.
- Practical domain constraints
  - Use data types allowed by the particular DBMS
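A domain constraint can be sketched with a column type plus a CHECK clause. This is a minimal sketch, assuming sqlite3 as a stand-in DBMS; the address table, its columns, and the sample zip code are hypothetical. The zip code is stored as text so that a leading zero survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address ("
    " address_id INTEGER PRIMARY KEY,"
    # Domain: exactly five characters, none of which is a non-digit.
    " zipcode TEXT NOT NULL"
    "   CHECK (length(zipcode) = 5 AND zipcode NOT GLOB '*[^0-9]*'),"
    # Domain: the two logical values, stored as 0 and 1.
    " is_residential INTEGER NOT NULL CHECK (is_residential IN (0, 1)))"
)

conn.execute("INSERT INTO address (zipcode, is_residential) VALUES ('02139', 1)")

# A value outside the domain (only four digits) is refused by the DBMS.
try:
    conn.execute("INSERT INTO address (zipcode, is_residential) VALUES ('2139', 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored_zip = conn.execute("SELECT zipcode FROM address").fetchone()[0]
```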

Examples of Practical Domains (e.g. in MySQL)

- CHAR: fixed-length string, up to 255 characters in MySQL
- VARCHAR: variable-length string with a declared maximum length (up to 255 characters in old MySQL versions, up to 65,535 bytes since MySQL 5.0.3)
- INT: 4-byte integer (in MySQL the size is fixed and does not depend on the OS)
- DECIMAL(m,n): decimal with m digits in total and n digits to the right of the decimal point
- DATE: a date
- TIME: a time
- DATETIME: the combination of a date and time
- BOOLEAN: a logical value (TRUE, FALSE); in MySQL a synonym for TINYINT(1)
- BLOB: Binary Large Object for anything binary

See the RDBMS documentation for other data types
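To make the DECIMAL(m,n) domain concrete: in MySQL, m counts all digits and n counts the digits after the decimal point, so DECIMAL(6,2) ranges up to 9999.99. The sketch below checks that rule with Python's decimal module. The helper name fits_decimal is hypothetical, and it tests only whether a value fits exactly, not how a particular DBMS rounds or rejects values that do not.

```python
from decimal import Decimal


def fits_decimal(value, m, n):
    """True if `value` fits DECIMAL(m, n) exactly: m digits in total, n after the point."""
    sign, digits, exponent = Decimal(value).as_tuple()
    fractional_digits = max(-exponent, 0)
    integral_digits = max(len(digits) + exponent, 0)
    return fractional_digits <= n and integral_digits <= m - n
```

For example, fits_decimal("9999.99", 6, 2) holds, while "10000.00" needs a fifth integral digit and "19.995" needs a third fractional digit.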

Basic Data Relationships

- Three basic types of relationships
  - one-to-one (e.g. people and social security numbers)
  - one-to-many (e.g. species and tissues)
  - many-to-many (e.g. students and classes)
- Relationships are between particular instances of entities
- ER diagrams show possible relationships between instances.
- No requirement that every instance of every entity have every relationship.

Schemas

- A complete entity-relationship (ER) diagram represents the overall logical plan of a relational database.
- The relational database schema specifies how to implement the ER diagram in a real database.

Video store (first draft)

[Diagram: five entity boxes; * marks the identifier attribute]
- customer: *cust_id, cust_name, cust_address, telephone, credit_card_no
- video: *vid_id, title, price, actor_id, prod_id
- Actor: *actor_id, actor_name
- Producer: *prod_id, prod_name, studio
- order: *order_id, cust_id, order_date
Legend: Entity, Weak entity

Many-to-many relationship

- Actors can be in zero or more videos and a given video can have zero or more actors

[Diagram: actor (*actor_id, actor_name) in a many-to-many relationship with video (*vid_id, title, price, dist_id, actor_id, prod_id)]

Many-to-many relationships are problematic

- Actors can be in more than one video and a given video can have multiple actors.
- One (terrible) way to handle this is to use multi-valued attributes.

Actor
actor_id  actor_name
30        Billy Crystal
27        Meg Ryan
1077      Tom Hanks

Video
vid_id  title                 vid_actors
10578   When Harry met Sally  301,27
2901    Sleepless in Seattle  1077, 27

But we know that multi-valued attributes cause problems

Many-to-many relationships are problematic

- We could eliminate multi-valued attributes by creating a new row for each actor in the video table.

Actor
actor_id  actor_name
30        Billy Crystal
27        Meg Ryan
1077      Tom Hanks

Video
vid_id  title                 vid_actors
10578   When Harry met Sally  301
10578   When Harry met Sally  27
2901    Sleepless in Seattle  1077
2901    Sleepless in Seattle  27

But then our vid_id is not unique

Many-to-many relationships are problematic

- We could create a unique key in the table by combining vid_id and actor_id into a primary key

Video
vid_id  title                 actor_id
10578   When Harry met Sally  301
10578   When Harry met Sally  27
2901    Sleepless in Seattle  1077
2901    Sleepless in Seattle  27

But now we have lost videos as an entity
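Both dead ends can be reproduced directly. This is a minimal sketch, assuming sqlite3 as a stand-in DBMS; the table names video_v1 and video_v2 are hypothetical, and the misspelled title in the second table is deliberate, to show what goes wrong once a video is no longer a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Attempt 1: one row per actor while vid_id alone is the primary key.
conn.execute(
    "CREATE TABLE video_v1 (vid_id INTEGER PRIMARY KEY, title TEXT, actor_id INTEGER)"
)
conn.execute("INSERT INTO video_v1 VALUES (2901, 'Sleepless in Seattle', 1077)")
try:
    conn.execute("INSERT INTO video_v1 VALUES (2901, 'Sleepless in Seattle', 27)")
    second_actor_stored = True
except sqlite3.IntegrityError:
    # vid_id would no longer be unique, so the second actor cannot be stored.
    second_actor_stored = False

# Attempt 2: composite primary key (vid_id, actor_id).
conn.execute(
    "CREATE TABLE video_v2 ("
    " vid_id INTEGER, title TEXT, actor_id INTEGER,"
    " PRIMARY KEY (vid_id, actor_id))"
)
conn.execute("INSERT INTO video_v2 VALUES (2901, 'Sleepless in Seattle', 1077)")
# The title is repeated per actor, so nothing stops the copies from disagreeing.
conn.execute("INSERT INTO video_v2 VALUES (2901, 'Sleepless in Seatle', 27)")

titles_for_one_video = conn.execute(
    "SELECT COUNT(DISTINCT title) FROM video_v2 WHERE vid_id = 2901"
).fetchone()[0]
```

With the composite key, a single video now carries two different titles: the table describes appearances of actors in videos, not videos.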

Many-to-many relationships are problematic

- A more subtle problem.
  - An order can have many videos and a video can be in many orders.
  - Customers can order more than one copy of a video.
  - Where do we put the quantity attribute?

[Diagram, option 1: order (*order_id, quantity) related to video (*vid_id)]
[Diagram, option 2: order (*order_id) related to video (*vid_id, quantity)]

- Neither is correct, since quantity applies to the relationship between the entities rather than to either of the entities.
- quantity is an example of relationship data

Eliminating many-to-many relationships with relationship entities (composite entities)

- Composite entities are entities that represent relationship data

[Diagram: order (*order_id, cust_id, order_date, order_filled) and video (*vid_id, title), each linked to the composite entity Included_in (*vid_id, *order_id, quantity)]

- A single many-to-many relationship between order and video entities is replaced by two one-to-many relationships
- Note that the identifier for the composite entity is composed by concatenating the entity identifiers of the two related entities: primary key(vid_id, order_id)
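The composite entity can be sketched as a third table whose primary key concatenates the two foreign keys and which owns the quantity column. This is a minimal sketch, assuming sqlite3 as a stand-in DBMS; the order rows, dates, and quantities are invented, and order_filled is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    -- "order" is quoted because ORDER is an SQL keyword.
    CREATE TABLE "order" (order_id INTEGER PRIMARY KEY, cust_id INTEGER, order_date TEXT);
    CREATE TABLE video (vid_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE included_in (
        vid_id   INTEGER NOT NULL REFERENCES video(vid_id),
        order_id INTEGER NOT NULL REFERENCES "order"(order_id),
        quantity INTEGER,              -- relationship data lives here
        PRIMARY KEY (vid_id, order_id)
    );
    INSERT INTO "order" VALUES (1, 7, '2024-01-01'), (2, 8, '2024-01-02');
    INSERT INTO video VALUES (10578, 'When Harry Met Sally'), (2901, 'Sleepless in Seattle');
    INSERT INTO included_in VALUES (10578, 1, 2), (2901, 1, 1), (2901, 2, 3);
    """
)

# Total copies ordered per video, summed across all orders.
copies = conn.execute(
    "SELECT v.title, SUM(i.quantity) FROM included_in i"
    " JOIN video v ON v.vid_id = i.vid_id"
    " GROUP BY v.vid_id ORDER BY v.vid_id"
).fetchall()
```

Each video and each order is still stored once, and quantity is attached to the (video, order) pair it actually describes.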

Video store (second draft)

[Diagram: entity boxes linked through composite entities; * marks identifier attributes]
- customer: *cust_id, cust_name, cust_address, telephone, credit_card_no
- order: *order_id, cust_id, order_date
- video: *vid_id, title, price
- actor: *actor_id, actor_name
- producer: *prod_id, prod_name, studio
- Included_in: *order_id, *vid_id, quantity
- produced_by: *prod_id, *vid_id
- performed_in: *actor_id, *vid_id
Legend: Entity, Weak entity, Composite (relationship) entity

SQL statements (entities)

create table customer (
    customer_id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_name varchar(64),
    customer_address varchar(64),
    customer_telephone char(12),
    customer_credit_card char(20)
) engine=InnoDB;

create table actor (
    actor_id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    actor_name varchar(64)
) engine=InnoDB;

create table producer (
    prod_id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    prod_name varchar(64)
) engine=InnoDB;

create table video (
    vid_id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title varchar(64),
    price decimal(6,2)
) engine=InnoDB;

SQL statements (weak entity)

-- ORDER is a reserved word in MySQL, so the table name is quoted with backticks.
create table `order` (
    order_id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id int,
    order_date DATE,
    FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
) engine=InnoDB;

SQL statements (composite/relationship entities)

create table performed_in (
    vid_id int NOT NULL,
    actor_id int NOT NULL,
    FOREIGN KEY (actor_id) REFERENCES actor(actor_id),
    FOREIGN KEY (vid_id) REFERENCES video(vid_id),
    PRIMARY KEY (actor_id, vid_id)
) engine=InnoDB;

create table produced_by (
    vid_id int NOT NULL,
    prod_id int NOT NULL,
    FOREIGN KEY (prod_id) REFERENCES producer(prod_id),
    FOREIGN KEY (vid_id) REFERENCES video(vid_id),
    PRIMARY KEY (prod_id, vid_id)
) engine=InnoDB;

-- The reserved word ORDER must be backtick-quoted when used as a table name.
create table included_in (
    order_id int NOT NULL,
    vid_id int NOT NULL,
    quantity int,
    FOREIGN KEY (order_id) REFERENCES `order`(order_id),
    FOREIGN KEY (vid_id) REFERENCES video(vid_id),
    PRIMARY KEY (order_id, vid_id)
) engine=InnoDB;
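With the relationship tables in place, a many-to-many question such as "who is in this video?" becomes a join through performed_in. This is a minimal sketch, assuming a SQLite translation of the MySQL statements (run through Python's sqlite3) with only the three tables needed; the helper cast_of is hypothetical, and Billy Crystal is given actor_id 30 as in the Actor table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_name TEXT);
    CREATE TABLE video (vid_id INTEGER PRIMARY KEY, title TEXT, price NUMERIC);
    CREATE TABLE performed_in (
        vid_id   INTEGER NOT NULL REFERENCES video(vid_id),
        actor_id INTEGER NOT NULL REFERENCES actor(actor_id),
        PRIMARY KEY (actor_id, vid_id)
    );
    INSERT INTO actor VALUES (30, 'Billy Crystal'), (27, 'Meg Ryan'), (1077, 'Tom Hanks');
    INSERT INTO video (vid_id, title) VALUES
        (10578, 'When Harry Met Sally'), (2901, 'Sleepless in Seattle');
    INSERT INTO performed_in VALUES (10578, 30), (10578, 27), (2901, 1077), (2901, 27);
    """
)


def cast_of(title):
    """Names of the actors in the video with the given title, in alphabetical order."""
    rows = conn.execute(
        "SELECT a.actor_name FROM performed_in p"
        " JOIN actor a ON a.actor_id = p.actor_id"
        " JOIN video v ON v.vid_id = p.vid_id"
        " WHERE v.title = ? ORDER BY a.actor_name",
        (title,),
    )
    return [row[0] for row in rows]


# The other direction of the relationship: how many videos one actor appears in.
meg_ryan_videos = conn.execute(
    "SELECT COUNT(*) FROM performed_in WHERE actor_id = 27"
).fetchone()[0]
```

Each actor and each video is stored exactly once, and the four performed_in rows carry the whole many-to-many relationship.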

Example schemas

Plasmodium falciparum DB ER diagram

Swiss-Prot InterPro database schema http://www.ebi.ac.uk/swissprot/Publications/mbd2.html

Protein families, domains and functional sites

Array Express Database Schema

End