The Relational Model 01/28/2014 – Material from Chapter 4 (Chap2 and Chap3 make an appearance)

Project Proposal - Questions?

Added some clarification.

What questions do you have about the project?

Data sets are available online at

psql -h db.cs.jmu.edu vlds (to play with the data)

If you are already inside psql: \c databaseName

Homework debrief

Scripts – If everything is working correctly, you can run a script through psql.

psql -h hostname -f filename

-- comment to end of line (inline comment)

/* C-like comment, possibly multiple lines */ (block comment)

It is easier to build a text file and cut and paste commands from it. The file is also easier to change.

You can also use the -e option in psql to echo the queries sent to the server to standard output. Use > to redirect that output to a file.

Datatypes

http://www.postgresql.org/docs/9.3/static/datatype.html

Of note: varchar(n) stores at most n characters in the corresponding field.

A value shorter than n characters is not padded, so it uses less space.

money

date and time types

Review 2.2.1

acctNo   type       balance
12345    savings    12000.00
23456    checking    1000.00
34567    savings       25.25

firstName   lastName   idNo      account
Robbie      Banks      901-222   12345
Lena        Hand       805-333   12345
Lena        Hand       805-333   23456

a. attributes
b. tuples
c. one tuple from each relation
d. relation schema
e. database schema
f. a domain for each attribute
g. an equivalent relation
h. a possible key to the relation

Review Exercise – 2.2.1

Accounts(balance : float, acctNo : integer, type : string)

acctNo   type       balance
12345    savings    12000.00
23456    checking    1000.00
34567    savings       25.25

Customer(firstName : string, lastName : string, idNo : string, account : integer)

firstName   lastName   idNo      account
Robbie      Banks      901-222   12345
Lena        Hand       805-333   12345
Lena        Hand       805-333   23456

a. attributes
b. tuples
c. one tuple from each relation
d. relation schema
e. database schema
f. a domain for each attribute
g. an equivalent relation
h. a possible key to the relation
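
Item g asks for an equivalent relation. The sketch below is not from the slides; it is one way to see why the Accounts schema can list balance first while the table lists acctNo first. It assumes a relation is a set of tuples and each tuple is a set of (attribute, value) pairs, so neither column order nor row order matters. The helper name relation is hypothetical.

```python
# Sketch: a relation as a set of tuples, each tuple a set of (attribute, value) pairs.
# The helper name `relation` is hypothetical, introduced only for this illustration.

def relation(attributes, rows):
    """Build an order-independent relation from a header and a list of rows."""
    return frozenset(frozenset(zip(attributes, row)) for row in rows)

# Accounts as printed in the table: (acctNo, type, balance)
accounts = relation(
    ("acctNo", "type", "balance"),
    [(12345, "savings", 12000.00),
     (23456, "checking", 1000.00),
     (34567, "savings", 25.25)],
)

# Same data with columns permuted to (balance, acctNo, type) and rows reordered
accounts_equivalent = relation(
    ("balance", "acctNo", "type"),
    [(25.25, 34567, "savings"),
     (12000.00, 12345, "savings"),
     (1000.00, 23456, "checking")],
)

print(accounts == accounts_equivalent)  # True
```

Changing any single value, or dropping a row, would make the comparison False, which is the sense in which the two presentations are the same relation.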

Model (freeonlinedictionary.com)

A schematic description of a system, theory, or phenomenon that accounts for its known or inferred properties and may be used for further study of its characteristics: a model of generative grammar; a model of an atom; an economic model; a database model.

A model gives us a way of understanding the database and the interactions among its objects.

Design vs Implementation

The Relational Model itself

The database is formed from a series of interconnected tables.

Each table has one or more attributes (fields).

Each table has zero or more rows (tuples).

Tables can be connected by keys.
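
As a rough illustration of tables connected by keys, the sketch below rebuilds the Accounts and Customer tables from the review exercise. It uses Python's built-in sqlite3 module as a stand-in for the class PostgreSQL server, so type names and enforcement details differ from what psql would show. The choice of primary keys, in particular (idNo, account) for Customer, is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Accounts (
    acctNo  INTEGER PRIMARY KEY,
    type    TEXT,
    balance REAL
);
CREATE TABLE Customer (
    firstName TEXT,
    lastName  TEXT,
    idNo      TEXT,
    account   INTEGER REFERENCES Accounts(acctNo),  -- the key that connects the tables
    PRIMARY KEY (idNo, account)  -- assumed key: one row per customer/account pair
);
""")

conn.executemany("INSERT INTO Accounts VALUES (?, ?, ?)", [
    (12345, "savings", 12000.00),
    (23456, "checking", 1000.00),
    (34567, "savings", 25.25),
])
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", [
    ("Robbie", "Banks", "901-222", 12345),
    ("Lena", "Hand", "805-333", 12345),
    ("Lena", "Hand", "805-333", 23456),
])

# Follow the key from each Customer row to its Accounts row
rows = conn.execute("""
    SELECT c.firstName, c.lastName, a.type, a.balance
    FROM Customer AS c JOIN Accounts AS a ON c.account = a.acctNo
    ORDER BY c.idNo, a.acctNo
""").fetchall()

for row in rows:
    print(row)
```

Account 34567 has no customer row, so it does not appear in the joined result.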

How do we express that model in design?

Various design tools

ER Diagramming (Peter Chen – 1976) (Remember Codd’s paper in 1970)

Databases can be expressed as “entities” that are connected by “relationships”.

From Chen’s paper

In the ER-Diagram Chen Style

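[Diagram: Chen-style ER diagram. Legend: Entity, Relationship, Attribute. The example connects the entities Teacher and Classes through the relationship Teaches; the attributes shown are Number, Name, Id, and Class.]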

Multiplicity

Each connection can have a number associated with it.

1:1 – one to one – each entity in one relation has at most one related entity in the other, and vice versa. Ex: a faculty member and their office.

1:M – one to many – each entity in one relation may have many related entities in the other, but not vice versa. Ex: one faculty member teaches many courses, but each course is taught by only one faculty member.

M:M (often shown as M:N) – many to many – each entity in one relation may have many related entities in the other, and vice versa. Ex: a supplier supplies many products, and each product is supplied by many suppliers.
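
The three cases above can be sketched in Python as a check over a set of (left, right) pairs. This is an illustration, not part of the slides: the function name multiplicity and the sample data are hypothetical, and the result only reports the tightest multiplicity that one particular instance is consistent with. The real multiplicity is a design decision about what the schema allows.

```python
from collections import defaultdict

def multiplicity(pairs):
    """Classify (left, right) pairs as '1:1', '1:M', 'M:1', or 'M:N'."""
    rights_of = defaultdict(set)  # left entity  -> related right entities
    lefts_of = defaultdict(set)   # right entity -> related left entities
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)

    left_has_many = any(len(s) > 1 for s in rights_of.values())
    right_has_many = any(len(s) > 1 for s in lefts_of.values())

    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:M"
    if right_has_many:
        return "M:1"
    return "1:1"

# Hypothetical sample data mirroring the three examples
offices = {("Faculty A", "Office 1"), ("Faculty B", "Office 2")}
teaches = {("Faculty A", "Course 1"), ("Faculty A", "Course 2"),
           ("Faculty B", "Course 3")}
supplies = {("Supplier X", "Product 1"), ("Supplier X", "Product 2"),
            ("Supplier Y", "Product 1")}

print(multiplicity(offices))   # 1:1
print(multiplicity(teaches))   # 1:M
print(multiplicity(supplies))  # M:N
```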

In the ER-Diagram Chen Style

[Diagram: the same Chen-style ER diagram (Teacher, Teaches, Classes; attributes Number, Name, Id, Class), now with the multiplicities 1 and M marked on the Teaches relationship.]

Instances

Each ERD describes a database schema.

If that schema is realized as a database, an instance of the database is a snapshot of the database at a particular point in time.

A database instance

Accounts(balance : float, acctNo : integer, type : string)

acctNo   type       balance
12345    savings    12000.00
23456    checking    1000.00
34567    savings       25.25

Customer(firstName : string, lastName : string, idNo : string, account : integer)

firstName   lastName   idNo      account
Robbie      Banks      901-222   12345
Lena        Hand       805-333   12345
Lena        Hand       805-333   23456

Keys (Chapter 3.1.2)

“A key for an entity set E is a set K of one or more attributes such that, given any two distinct entities e1 and e2 in E, e1 and e2 cannot have identical values for each of the attributes in the key.” Ullman

A key is a set of attributes that uniquely identifies a row of a table.

There may be more than one key. One is selected as the primary key.

One additional requirement – no proper subset of the key attributes can itself be a key.

Keys

Person(id, email, last, first, dob, SSN)

id, email, last-first-dob, SSN may all be considered keys.

id-email, last-first-dob-SSN would not be keys. In each, there is a proper subset of elements that itself is a key.

We would call id-email and last-first-dob-SSN superkeys: sets of attributes that contain a key. Every key is a superkey, but not every superkey is a key.

We sometimes use the term candidate keys for the possible keys of a relation. From the candidates, a single key is chosen as the primary key.
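
The difference between a superkey and a key can be sketched in Python as two checks over sample rows. The function names and the Person rows below are hypothetical, made up so that every smaller attribute set has a collision. Keep in mind that a key is a statement about every legal instance: sample data can show that a set of attributes is not a key, but cannot prove that it is one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of attrs is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical Person rows (all values invented for the example)
people = [
    {"id": 1, "email": "a@example.com", "last": "Hand",  "first": "Lena",
     "dob": "1990-01-01", "SSN": "000-00-0001"},
    {"id": 2, "email": "b@example.com", "last": "Hand",  "first": "Lena",
     "dob": "1985-05-05", "SSN": "000-00-0002"},
    {"id": 3, "email": "c@example.com", "last": "Hand",  "first": "Robbie",
     "dob": "1990-01-01", "SSN": "000-00-0003"},
    {"id": 4, "email": "d@example.com", "last": "Banks", "first": "Lena",
     "dob": "1990-01-01", "SSN": "000-00-0004"},
]

print(is_key(people, ("id",)))                          # True
print(is_superkey(people, ("id", "email")))             # True
print(is_key(people, ("id", "email")))                  # False: id alone suffices
print(is_key(people, ("last", "first", "dob")))         # True
print(is_key(people, ("last", "first", "dob", "SSN")))  # False: contains a key
```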

Keys in ERD

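[Diagram: ER diagram in which Movies, Studios, and Presidents are connected by the relationships Own and Runs. Attributes shown: title, year, genre, length, Id, name. The label “composite key” presumably points at title and year of Movies.]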

[Diagram: Movies – Own – Studios – Runs – Presidents]

So what does this ERD tell us?

Constraints (Referential Integrity)

The curved arrow is a 1 (exactly one) relationship. The filled arrow is a 0..1 (at most one) relationship. (I may have a studio that does not currently have a president, but if I have a president, she must run a studio.)

A studio may own 0..* Movies, but a Movie must be owned by a single Studio.

Many-one relationship (->): optional tuple. Referential integrity: required tuple.
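
One possible way to state these constraints as table definitions is sketched below. It uses Python's built-in sqlite3 module as a stand-in for the class PostgreSQL server. The column names, the choice of name as the Studios key, and the sample rows are assumptions made for the example; the slides do not give a table design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Studios (
    name TEXT PRIMARY KEY            -- assumed key for Studios
);
CREATE TABLE Movies (
    title  TEXT,
    year   INTEGER,
    -- required tuple: every movie is owned by exactly one existing studio
    studio TEXT NOT NULL REFERENCES Studios(name),
    PRIMARY KEY (title, year)
);
CREATE TABLE Presidents (
    id     INTEGER PRIMARY KEY,
    name   TEXT,
    -- NOT NULL: a president must run a studio
    -- UNIQUE:   a studio has at most one president
    studio TEXT NOT NULL UNIQUE REFERENCES Studios(name)
);
""")

conn.execute("INSERT INTO Studios VALUES ('Studio A')")
conn.execute("INSERT INTO Studios VALUES ('Studio B')")  # no president, no movies: allowed
conn.execute("INSERT INTO Presidents VALUES (1, 'President One', 'Studio A')")
conn.execute("INSERT INTO Movies VALUES ('Movie One', 2014, 'Studio A')")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A movie whose studio does not exist
unknown_studio = rejected("INSERT INTO Movies VALUES ('Movie Two', 2014, 'No Such Studio')")
# A movie with no studio at all
missing_studio = rejected("INSERT INTO Movies VALUES ('Movie Three', 2014, NULL)")
# A second president for a studio that already has one
second_president = rejected("INSERT INTO Presidents VALUES (2, 'President Two', 'Studio A')")

print(unknown_studio, missing_studio, second_president)  # True True True
```

Studio B shows the optional side of each relationship: it exists with no president and no movies, and nothing rejects it.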

[Diagram: Movies – Own – Studios – Runs – Presidents]

Alternate expression of the same idea

Constraints (Referential Integrity)

The curved arrow is a 1 (exactly one) relationship. The filled arrow is a 0..1 (at most one) relationship. (I may have a studio that does not currently have a president, but if I have a president, she must run a studio.)

A studio may own 0..* Movies, but a Movie must be owned by a single Studio.

Diagram labels: 0..1, 1, 1, 0..m

Other kinds of relationships - self

[Diagram: Movies connected to itself by the relationship “sequel of”, with the two roles labeled “original” and “sequel”.]

A movie may have many sequels, but a sequel can have only one original.

Other kinds of relationships - multiway

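[Diagram: a multiway relationship Contracts connecting Movies, Stars, and Studios. Studios is connected twice, once as the producing studio and once as the star's studio. The label “pay” also appears in the diagram.]

A contract is between a star, the movie that star is in, the producing studio, and the star’s contracting studio.

Producing studio: the arrow implies that there is at most one producing studio.

Star’s studio: the arrow implies that a star can have at most one studio.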

Other diagramming tools

Crow’s Foot

UML – See Chapter 4.7

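[Diagram: crow's-foot notation with boxes for Movies (PK title, PK year, length, genre), Studios, and Presidents; the attributes PK Id and name are also listed. Movies and Studios are joined by “own”. Each line end carries a minimum cardinality and a maximum cardinality.]

The Movies–Studios line means: a movie must be owned by one and only one studio. A studio may own no movies or many movies.

The Studios–Presidents line means: a studio may have one president or none, but if there is a president, she must preside over exactly one studio.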

Some other terms

Weak entity (represented by double lines) – An entity that depends upon the existence of another. Usually it will have the key of another table as one component of its own key.

Ex. A student-class table is a weak entity of student and class, as it depends on both.

We will see that relationships are often implemented as tables within a db. They become weak entities.
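
The student-class example might look like the sketch below when the relationship is implemented as its own table. Python's built-in sqlite3 module stands in for PostgreSQL here; the table names, column names, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Student (id INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE Class   (id INTEGER PRIMARY KEY, title TEXT);

-- The relationship as a table: its key is built from the keys of both parents
CREATE TABLE StudentClass (
    student_id INTEGER NOT NULL REFERENCES Student(id),
    class_id   INTEGER NOT NULL REFERENCES Class(id),
    PRIMARY KEY (student_id, class_id)
);
""")

conn.execute("INSERT INTO Student VALUES (1, 'Student One')")
conn.execute("INSERT INTO Class VALUES (10, 'Class Ten')")
conn.execute("INSERT INTO StudentClass VALUES (1, 10)")

# A StudentClass row cannot exist without both of its parents
try:
    conn.execute("INSERT INTO StudentClass VALUES (2, 10)")  # no student 2
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# A parent that still has dependent rows cannot simply disappear
try:
    conn.execute("DELETE FROM Student WHERE id = 1")
    student_deleted = True
except sqlite3.IntegrityError:
    student_deleted = False

print(orphan_accepted, student_deleted)  # False False
```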

Design considerations

While we will be looking more at design through the semester, a few things of note:

Faithfulness (or fitness) – The data should “faithfully” represent the underlying real-world model and be “fit” for the application. Ex: Stars and Movies implies a many-to-many relationship, since movies have multiple stars and stars appear in multiple movies.

It also depends on the enterprise in which the data will be used. A school that has no “team” teaching could say that one teacher teaches many courses but a course can be taught by only one teacher. JMU has team teaching, so there we have a many-to-many relationship between teachers and courses.

Design

Avoid redundancy – Redundant data is often inconsistent data.

Simplicity – Avoid extra elements or elements that do not support the enterprise.

Choosing the right relationships – Look at the relationships among the entities and determine the best way to represent them. (See diagram 143)

Choosing the right kind of element – Is it an attribute of an entity or a standalone entity in its own right? Is it an entity or simply a relationship between entities?

Design

Choose the right data elements (Atomicity) – Do we make name a single attribute, or do we break it up into first, middle, and last? The full name can be derived from the individual parts.

Avoid derived data (Staleness) – Data that can be derived from other data perhaps does not belong in the db. If I have date of birth, I can derive age at any time that I need it. If I have vendor transactions, I can derive the total amount purchased.
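
Both derivations mentioned above can be sketched in a few lines of Python. The function names, the vendor names, and the amounts are hypothetical; the point is only that the derived values are computed when needed instead of being stored and left to go stale.

```python
from datetime import date

def age_on(dob, today):
    """Whole years elapsed between a date of birth and a given day."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)

def total_purchased(transactions, vendor):
    """Sum the amounts of all transactions for one vendor."""
    return sum(amount for name, amount in transactions if name == vendor)

# Age is computed on demand from the stored date of birth
print(age_on(date(1990, 6, 15), date(2014, 1, 28)))  # 23
print(age_on(date(1990, 1, 28), date(2014, 1, 28)))  # 24

# Hypothetical vendor transactions as (vendor, amount) pairs
transactions = [("Vendor A", 100.0), ("Vendor A", 50.5), ("Vendor B", 20.0)]
print(total_purchased(transactions, "Vendor A"))     # 150.5
```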

Design – Referential Integrity

Where do we want to enforce the notion of required tuples?

When must we have a corresponding row?

Sometimes we build tables to help us maintain data integrity.