relational model (cb chapter 4) cpsc 356 database ellen walker hiram college

Post on 14-Dec-2015

222 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Relational Model (CB Chapter 4)

CPSC 356 Database

Ellen Walker

Hiram College

Relational Model

• Proposed by E.F. Codd, 1970• Goals

– Data independence (physical vs. conceptual)– Normalization (avoiding redundancy)– Set-oriented data manipulation language

(relational algebra)

• Examples of Relational DBMS: – DBMS System R (IBM), Ingres (UC Berkeley)– Oracle (all versions), Access, RDB, DB2 …

Terminology

• A relation (instance) is a table with columns and rows.– (don’t confuse with relationships in ER models)

• An attribute is a named column of a table.– (corresponds to simple attribute in ER)

• A domain is the set of allowable values for an attribute

• A tuple is a row of a table.

More Terminology

• The degree or arity of a relation is its number of attributes (columns)

• The cardinality of a relation is its number of tuples (rows)

• A relational database is a collection of normalized relations with distinct names.– We’ll get to normalization later!

Example Relation (Phone Book)

• Attributes: First, last, dept, email, etc.• Domain for Title: {Prof, Inst, Assoc, Asst}• Tuple: (Obie,Slotterbeck,CS,Obie,5275,Prof)• Degree = 6, Cardinality = 4

First Last Dept Email Phone Title

Obie Slotterbeck CS Obie 5275 Prof

Irina Lomonosov CS LomonosovI 5002 Inst

Ellen Walker CS WalkerEL 5250 Prof

Angela Guercio CS GuercioA 6048 Asst

Order Doesn’t Matter!

• This is the same phone book relation.

Email Title First Phone Last Dept

LomonosovI Inst Irina 5002 Lomonosov CS

Obie Prof Obie 5275 Slotterbeck CS

WalkerEL Prof Ellen 5250 Walker CS

GuercioA Asst Angela 5048 Guercio CS

Sets and Subsets

• A set is an unordered collection of unique elementsSet S = {1,2,3} “1 is an element of S”

{a,b,c} = {a,c,b}

• A subset of a set is another set whose elements all come from the original set.{a,b} is a subset of {a,c,b}

{1,2,3} is a subset of {1,2,3}

{1,2,4} is not a subset of {1,2,3}

{} (the empty set) is a subset of every set!

Mathematical Relations

• Cartesian product: a set of ordered pairs, where each contains one element from each original set{1,2,3} x {a, b} =

{(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)}

• Relation: any subset of a Cartesian product– R1 = {(1,a),(2,b),(3,a)}– R2 = {(1,a), (1,b), (2,b), (3,b)}– R3 = { }

Bigger Relations

• First, take the cross product of n sets– If n = 3, get triples; if n=4, get quadruples– For arbitrary n, get “n-tuples”

• Now, take a subset of that big cross product– Several n-tuples

• Arrange them in rows, so the commas line up– Looks familiar…

• Domains of attributes are sets from the cross product.

Database Schemas

• A schema describes a database, but not the data

• Relation schema: set of attribute and domain name pairs

(First: char-string) (Last: char-string)(Dept: {CS,Math,Comm,Econ,Mgmt, etc.})(Phone: 4-digit-string) (email: char-string)(Title: {Inst,Asst,Assoc,Prof})

• Relational Database Schema: set of relational schemas, each with a distinct name and integrity constraints (later)

Properties of (Database) Relations

• Every relation has a unique name• Every attribute has a unique name • Attribute’s values are all from its domain• Because relations are sets…

– No duplicate tuples– Order of tuples doesn’t matter (theoretically)

• Every cell contains a single value (single-valued attributes)

Single-valued Attributes

• “Oldest-child” is an ok attribute• “Children” is not, because it is likely to be

multi-valued• Break up composite attributes into multiple

attributes– Name --> First-name, Middle-init, Last-name

• Use “relationships” in the ER sense for multivalued attributes.

Keys

• Superkey: set of attributes that uniquely identifies a tuple. – (The whole tuple is always a superkey - why?)

• Candidate Key: a superkey that doesn’t have a proper subset that is also a superkey

• Primary Key: the candidate key that is selected by the designer (often generated)

• Foreign Key: Set of attributes in one relation that matches the Primary key of another.

Properties of a Candidate Key

• Uniqueness– The set of attribute values in the candidate key

uniquely identifies a tuple

• Irreducibility– If one attribute is removed from the candidate key,

then uniqueness no longer holds

NULL

• NULL is considered a member of every domain

• NULL means:– Value is currently unknown– Value is not applicable for this entry (e.g. “spouse”

of a single person)

• Don’t confuse NULL with– Zero– Blank or empty string

Relational Schema “Shorthand”

• List the relation name first• Attribute names in parentheses following the

relation name• Primary key attribute(s) underlined• Example

– Branch (BranchNo, street, city, postcode)– Staff (StaffNo, fName, position, sex, DOB, salary,

BranchNo)

• Arrow from foreign key to its referent

Integrity Constraints

• Rules that apply to all instances of any database

• If these rules are violated, the instance is not a legal database (it doesn’t “make sense”)

• Some come from business rules of enterprise– “Students can take only 1 course in 3-week term”

• Others from the schema design– “InstructorID cannot be NULL and must match the

ID of a listed faculty member”

Key Constraint

• Specifies which attributes compose a candidate key (or primary key) for the relation

• Example:– {first, last} and {email} are both keys of the Phone

Book relation

• Generally, only the primary key is specified• In practice, an auto-generated id number is

used as the key

Basic Integrity Constraints

– Entity Integrity: No attribute of a primary key can be NULL for any tuple.

• This constraint is implicit in the KEY constraint in SQL

– Referential Integrity: If a relation has a foreign key, either the foreign key is NULL, or the foreign key matches the primary key of an existing tuple in another relation.

• Attribute names do not have to be the same

Maintaining Referential Integrity

• Adding a new tuple– Add with NULL foreign keys – Add tuple in the other relation (with primary key matching the

foreign key) first

• Deleting a tuple– Find all tuples in other relations whose foreign keys match

this one– Either delete those tuples, or change their foreign keys to

NULL

• Modifying a tuple– If a foreign key is changed, make sure that the new value

points to an existing tuple or is NULL

Summary of Constraints

• Domain constraint– Each attribute’s values come from its domain

• Key constraint– Every relation has a primary key

• Entity integrity constraint– No NULL primary key attributes

• Referential integrity constraint– Foreign keys are NULL or refer to existing tuples’ primary

keys

• Semantic constraints– Any additional constraints that must be satisfied based on

the client’s needs

Views

• “Relations” that appear to exist, but actually combine attributes from multiple base (implemented) relations.

• When a base table changes, the view changes & vice versa

• Useful for– Security: limit which attributes a user sees– Customization– Simplification

top related