comp 5138 relational database management systems sem2, 2007 lecture 3 relational data model

43
COMP 5138 COMP 5138 Relational Database Relational Database Management Systems Management Systems Sem2, 2007 Sem2, 2007 Lecture 3 Lecture 3 Relational Data Model Relational Data Model

Upload: morgan-manning

Post on 16-Dec-2015

223 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

COMP 5138COMP 5138

Relational Database Relational Database

Management Systems Management Systems

Sem2, 2007Sem2, 2007

Lecture 3Lecture 3

Relational Data ModelRelational Data Model

Page 2: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

2222

L3L3 Relational Data ModelRelational Data Model

Download from our website

Individual work with your academic honesty declaration form

Hardcopy or written work not electronic form

Due date: 20/08/2007(Monday, week 5), late submission will not be accepted

ER Diagram, relational schema and SQL DDL

Assignment 1

Page 3: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

3333

L3L3 Relational Data ModelRelational Data Model

Introduction to Conceptual Data ModelDatabase Design SequenceConceptual data modelTypes of conceptual data model

Entity Relationship ModelEntity, Relationship, AttributeDegree of relationshipCardinality and participationTernary relationshipWeak entity

Review of Last Class

Page 4: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

4444

L3L3 Relational Data ModelRelational Data Model

Defining the SchemaMathematics of RelationsLogical Database Design

Today’s Agenda

Page 5: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

5555

L3L3 Relational Data ModelRelational Data ModelReview

Relational database: a set of relationsRelation:

Instance : a table, with rows and columns. The value in the database at a given time.Schema : specifies name of relation, plus name and type of each column.

E.G. Student (sid: string, name: string, login: string, age: integer, gpa: real).

Can think of a relation as a set of rows or tuples (i.e., all rows are distinct).

Page 6: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

6666

L3L3 Relational Data ModelRelational Data ModelExample Instance

All rows are distinct. Do all columns in a relation instance have to be distinct? NO!

sid name login age gpa

53666 Jones jones@cs 18 3.4

53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8

StudentStudentFields (Attributes, Columns)

Tuples (Records,

Rows)

Page 7: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

7777

L3L3 Relational Data ModelRelational Data ModelDefining Schema in SQL

Creates the Student relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. As another example, the Enrolled table holds information about courses that students take.

CREATE TABLE Student(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL)

CREATE TABLE Enrolled(sid: CHAR(20), cid: CHAR(20), grade:

CHAR(2))

Page 8: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

8888

L3L3 Relational Data ModelRelational Data Model

Destroying and Altering Relations

DROP TABLE Student

Destroys the relation Student. The schema information and the tuples are deleted.

ALTER TABLE Student ADD COLUMN firstYear: integer

The schema of Student is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.

Page 9: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

9999

L3L3 Relational Data ModelRelational Data ModelAdding Tuples

Can insert a single tuple using:INSERT INTO Student (sid, name, login, age, gpa)VALUES (53688, ‘Smith’, ‘smith@ee’, 18, 3.2)

Can delete all tuples satisfying some condition (e.g., name = Smith):

DELETE FROM Student SWHERE S.name = ‘Smith’

Powerful variants of these commands are available; more later!

Page 10: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

10101010

L3L3 Relational Data ModelRelational Data ModelIntegrity Constraints

Integrity Constraint (IC): condition that must be true for any instance of the database; e.g., domain constraints.

ICs can be declared in the schemaThey are specified when schema is defined.All declared ICs are checked whenever relations are modified.

A legal instance of a relation is one that satisfies all specified ICs.

If ICs are declared, DBMS will not allow illegal instances.

If the DBMS checks ICs, stored data is more faithful to real-world meaning.

Avoids data entry errors, too!

Page 11: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

11111111

L3L3 Relational Data ModelRelational Data ModelNon-Null Columns

One domain constraint is to insist that no value in a given column can be null

The value can’t be unknown; The concept can’t be inapplicable

In SQLCREATE TABLE Student

(sid: CHAR(20) NOT NULL, name: CHAR(20), login: CHAR(10) NOT NULL,age: INTEGER,gpa: REAL)

Page 12: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

12121212

L3L3 Relational Data ModelRelational Data ModelRelational Keys

Keys are special fields that serve two main purposes:

Primary keys are unique not-null identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are uniqueForeign keys are where one table has a column that refers to the primary key of another table (to allow connections to be made)

Keys can be simple (a single field) or composite (more than one field)Keys usually are used as indices to speed up the response to user queries (More on this later in the semester)

Page 13: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

13131313

L3L3 Relational Data ModelRelational Data ModelExample: Relational Keys

Primary Key

Foreign Key (implements 1:N relationship between customer and order)

Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product)

Page 14: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

14141414

L3L3 Relational Data ModelRelational Data ModelPrimary Key Constraints

A set of fields is a key for a relation if :1. No two distinct tuples can have same

values in all key fields, and2. This is not true for any subset of the key.

Part 2 false? A superkey.If there’s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key.

E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.

Page 15: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

15151515

L3L3 Relational Data ModelRelational Data ModelStudent and Course

student courseenroll title

cid

credits

sid

name

grade

CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid))

Page 16: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

16161616

L3L3 Relational Data ModelRelational Data Model

Primary and Candidate Keys in SQL

Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.

“For a given student and course, there is a single grade.” vs. “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

Used carelessly, an IC can prevent the storage of database instances that arise in practice!

CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid))

CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade))

vsvs

Page 17: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

17171717

L3L3 Relational Data ModelRelational Data ModelForeign Keys, Referential

Integrity

Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’.E.g. sid is a foreign key referring to Student:

Enrolled(sid: string, cid: string, grade: string)If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.

Can you name a data model w/o referential integrity? Links in HTML!

sid name

cid title

Student

Course

credits

grade

Enrolled

sid cid

Page 18: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

18181818

L3L3 Relational Data ModelRelational Data ModelForeign Keys in SQL

Only students listed in the Students relation should be allowed to enroll for courses.CREATE TABLE Enrolled

(sid CHAR(20), cid CHAR(20), grade CHAR(2),PRIMARY KEY (sid,cid),FOREIGN KEY (sid) REFERENCES Student )

sid name login age gpa

53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8

sid cid grade53666 Carnatic101 C53666 Reggae203 B53650 Topology112 A53666 History105 B

EnrolledStudent

Page 19: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

19191919

L3L3 Relational Data ModelRelational Data ModelEnforcing Referential

Integrity

Consider Student and Enrolled; sid in Enrolled is a foreign key that references Student.

What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)

What should be done if a Student tuple is deleted?Also delete all Enrolled tuples that refer to it.Disallow deletion of a Student tuple that is referred to.Set sid in Enrolled tuples that refer to it to a default sid.(In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.)

Similar if primary key of Student tuple is updated.

Page 20: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

20202020

L3L3 Relational Data ModelRelational Data ModelReferential Integrity in SQL

SQL/92 and SQL:1999 support all 4 options on deletes and updates.

Default is NO ACTION (delete/update is rejected)CASCADE (also delete all tuples that refer to deleted tuple)SET NULL / SET DEFAULT (sets foreign key value of referencing tuple)

CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Student

ON DELETE CASCADEON UPDATE SET DEFAULT )

sid name

cid title

Student

Coursecredits

gradeEnrolled

sid cid

Page 21: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

21212121

L3L3 Relational Data ModelRelational Data ModelNamed Constraints

Tip: give a name to constraints (this makes error messages clearer)

CREATE TABLE Student ( sid INTEGER, … ,CONSTRAINT Students_PK PRIMARY KEY (sid) );

CREATE TABLE Course ( cid CHAR(8), … ,CONSTRAINT Courses_PK PRIMARY KEY (cid) );

CREATE TABLE Enroll ( sid INTEGER, cid CHAR(8),CONSTRAINT Enroll_FK1 FOREIGN KEY (sid) REFERENCES Students,CONSTRAINT Enroll_FK2 FOREIGN KEY (cid) REFERENCES Course,CONSTRAINT Enroll_PK PRIMARY KEY (sid,cid));

student courseenroll title

cid

credits

sid

name

Page 22: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

22222222

L3L3 Relational Data ModelRelational Data Model Views

A view is just a relation, but we store a definition, rather than a set of tuples.

The definition is given by a query (see later lectures) which combine information from one or more tablesUsers can operate on the view as if it were a stored table

DBMS calculates the value whenever it is needed

CREATE VIEW YoungActiveStudents (name, grade)AS SELECT S.name, E.gradeFROM Students S, Enrolled EWHERE S.sid = E.sid and S.age<21

Page 23: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

23232323

L3L3 Relational Data ModelRelational Data Model

Defining the SchemaMathematics of RelationsLogical Database Design

Today’s Agenda

Page 24: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

24242424

L3L3 Relational Data ModelRelational Data Model Relation

The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper:"A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970.

The above paper caused a major revolution in the field of Database management and earned Ted Codd the coveted ACM Turing Award.

The relational Model of Data is based on the mathematical concept of Relation.

Studied in Discrete Mathematics

The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.

Page 25: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

25252525

L3L3 Relational Data ModelRelational Data ModelDefinition of Relation

A Relation is a mathematical concept based on the ideas of sets.Formal definition

Relation RGiven sets D1, D2, …, Dn, a relation R is a subset of D1 x D2 x…x Dn.

Thus, a relation is a set of n-tuples ( a1, a2, …, an) where ai Di

Example: If

customer-name = {Jones, Smith, Curry, Lindsay}customer-street = {Main, North, Park}customer-city = {Sydney, Brisbane, Perth}

then R = { (Jones, Main, Sydney), (Smith, North, Perth), (Curry, North, Brisbane), (Lindsay, Park, Sydney) }is a relation over customer-name x customer-street x customer-city

Page 26: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

26262626

L3L3 Relational Data ModelRelational Data ModelRelation Schema and

Instance

A relation R hasa relation schema: specifies name of relation, plus name and data type of each attribute.

A1, A2, …, An are attributes

R = (A1, A2, …, An ) is a relation schema

e.g. Students (sid: string, name: string, login: string, age: integer, gpa: real).

an relation instance: a table, with rows and columns.

D1, D2, …, Dn are the domains

each attribute corresponds to one domain:dom(Ai) = Di , 1 <= i <= n

R D1 x D2 x…x Dn

#rows = cardinality, #fields = degree.

Page 27: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

27272727

L3L3 Relational Data ModelRelational Data Model

Defining the SchemaMathematics of RelationsLogical Database Design

Today’s Agenda

Page 28: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

28282828

L3L3 Relational Data ModelRelational Data ModelLogical DB Design:

ER to Relational

Entity sets to tables:

CREATE TABLE Employee (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))Employee

ssnname

lot

ssn name lot

Employee

Page 29: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

29292929

L3L3 Relational Data ModelRelational Data ModelRelationship Sets to Tables

In translating a relationship set to a relation, attributes of the relation must include:

Keys for each participating entity set (as foreign keys).

This set of attributes forms a superkey for the relation.

All descriptive attributes.

CREATE TABLE Works_In( ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (ssn, did), FOREIGN KEY (ssn) REFERENCES Employee, FOREIGN KEY (did) REFERENCES Department)

ssn name lot

did dname

Employee

Department

budget

since

Works_In

ssn did

Page 30: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

30303030

L3L3 Relational Data ModelRelational Data ModelReview: Key Constraints

Each dept has at most one manager, according to the key constraint on Manages.

Translation to relational model?

Many-to-Many1-to-1 1-to Many Many-to-1

dname

budgetdid

since

lot

name

ssn

ManagesEmployee Department

Page 31: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

31313131

L3L3 Relational Data ModelRelational Data ModelTranslating ER Diagrams

with Key Constraints

Map relationship to a table:

Note that did is the key now!Separate tables for Employees and Departments.

Since each department has a unique manager, we could instead combine Manages and Departments.

CREATE TABLE Manages( ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employee, FOREIGN KEY (did) REFERENCES Department)

CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11), since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employee)

OROR

Page 32: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

32323232

L3L3 Relational Data ModelRelational Data ModelReview: Participation

Constraints

Does every department have a manager?If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial).

Every did value in Departments table must appear in a row of the Manages table (with a non-null ssn value!)

lot

name dnamebudgetdid

sincename dname

budgetdid

since

Manages

since

DepartmentEmployee

ssn

Works_In

Page 33: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

33333333

L3L3 Relational Data ModelRelational Data Model

Participation Constraints in SQL

We can capture participation constraints involving one entity set in a binary relationship, but little else (without resorting to CHECK constraints).

CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employee, ON DELETE NO ACTION)

Page 34: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

34343434

L3L3 Relational Data ModelRelational Data ModelReview: Weak Entities

A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.

Owner entity set and weak entity set must participate in a one-to-many relationship set (1 owner, many weak entities).Weak entity set must have total participation in this identifying relationship set.

lot

name

agepname

DependentEmployee

ssn

Policy

cost

Page 35: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

35353535

L3L3 Relational Data ModelRelational Data ModelTranslating Weak Entity

Sets

Weak entity set and identifying relationship set are translated into a single table.

When the owner entity is deleted, all owned weak entities must also be deleted.

CREATE TABLE Dep_Policy ( pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (pname, ssn), FOREIGN KEY (ssn) REFERENCES Employee, ON DELETE CASCADE)

Page 36: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

36363636

L3L3 Relational Data ModelRelational Data ModelMulti-valued attributes

Resolution of a Multi-valued attributeFinite & non-critical: split the attribute into multiple attributes

Video Store

Phone#Customer

Video StoreHome Phone#

CustomerOffice Phone#

Infinite & critical: create a new entity

Phone Company

Phone# Customer

Phone Company

Customer Phone

Phone#

Phone type

Servicetype

Page 37: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

37373737

L3L3 Relational Data ModelRelational Data Model

Mapping Finite Multi-valued Attributes

LicenceNo Name Contact Age

PILOT

IdentifierAttributeLicenceNumber

ENTITY

Multi-valued Attribute

PILOT

Attribute

Attribute

AttributeName Age

Contact

Aircraft TypeEndorsement

Endorsement 1 Endorsement 2

Page 38: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

38383838

L3L3 Relational Data ModelRelational Data Model

Mapping Infinite Multi-valued Attributes

LicenceNo Name Contact Age

PILOT

LicenceNumber

ENTITYPILOT

Name Age

ContactAircraft TypeEndorsement

Endorsement LicenceNo

ENDORSEMENT

ENDORSEMENTHas

Page 39: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

39393939

L3L3 Relational Data ModelRelational Data ModelReview: ISA Hierarchies

As in C++, or other PLs, attributes are inherited.

If we declare A ISA B, every A entity is also considered to be a B entity.

Overlap constraints: Can Joe be an Hourly_Emp as well as a Contract_Emp entity? (Allowed/disallowed)Covering constraints: Does every Employee entity also have to be an Hourly_Emp or a Contract_Emp entity? (Yes/no)

Contract_Emp

namessn

Employee

lot

hourly_wages

ISA

Hourly_Emp

contractid

hours_worked

Page 40: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

40404040

L3L3 Relational Data ModelRelational Data ModelTranslating ISA Hierarchies

to Relations

General approach:3 relations: Employee, Hourly_Emp and Contract_Emp.

Hourly_Emp: Every employee is recorded in Employee. For hourly emps, extra info recorded in Hourly_Emp (hourly_wages, hours_worked, ssn); must delete Hourly_Emp tuple if referenced Employee tuple is deleted).Queries involving all employees easy, those involving just Hourly_Emp require a join to get some attributes.

Alternative: Just Hourly_Emp and Contract_Emp.Hourly_Emp: ssn, name, lot, hourly_wages, hours_worked.Each employee must be in one of these two subclasses.

Page 41: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

41414141

L3L3 Relational Data ModelRelational Data ModelReview: Binary vs.

Ternary Relationships

What are the additional constraints in the 2nd diagram?

agepname

DependentCovers

name

Employee

ssn lot

Policy

policyid cost

Beneficiary

agepname

Dependent

policyid cost

Policy

Purchaser

name

Employee

ssn lot

Bad design

Better design

Page 42: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

42424242

L3L3 Relational Data ModelRelational Data ModelBinary vs. Ternary

Relationships (Contd.)

The key constraints allow us to combine Purchaser with Policies and Beneficiary with Dependents.

Participation constraints lead to NOT NULL constraints.

What if Policies is a weak entity set?

CREATE TABLE Policy ( policyid INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (policyid). FOREIGN KEY (ssn) REFERENCES Employee, ON DELETE CASCADE)

CREATE TABLE Dependent ( pname CHAR(20), age INTEGER, policyid INTEGER, PRIMARY KEY (pname, policyid). FOREIGN KEY (policyid) REFERENCES Policy, ON DELETE CASCADE)

Page 43: COMP 5138 Relational Database Management Systems Sem2, 2007 Lecture 3 Relational Data Model

43434343

L3L3 Relational Data ModelRelational Data Model Wrap-Up

Relational data modelSQL DML, especially for constraints

Mathematics of RelationsLogical Database Design

Transformation from ER to Relational