IT 20303 • Relational Database Theory

Upload: orea

Post on 22-Jan-2016


IT 20303

• Relational Database Theory

Relational Database Theory

• The Relational Theory

– Ways of working with data

• Types of “Models”

–File database model

–Hierarchical database model

–Network database model

–Relational database model


• The Relational Theory

– Meaning of database model

• The way data is organized & stored

• The way data is manipulated

• Relational Model of Data

– Published in 1970 by Dr. Edgar F. (Ted) Codd of IBM

• “A Relational Model of Data for Large Shared Data Banks”

• Relational Model of Data

– Purpose

• Achieve program/data structure independence

• Treat data in a disciplined way

–Apply rigor of mathematics

–Uses Set Theory – sets of related data

• Improve programmer productivity

• The Relational Model

– Relational uses familiar concepts

• The data is perceived as organized in tables

– Relational also incorporates the rigor of mathematics

• Rows of the table are treated as elements in a set

• Manipulation of rows is based on set operations (Venn diagrams)

– User works with a set of rows at a time


• Relational also impacts Data Design

– Files were often constructed to support an application

– Tables are designed to describe one thing or Entity in the database

• Example of a Relation:

– ANIMAL – Entity (Relation)

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200


• Definition of a Relation

– Data is organized & stored in structures called relations

– A relation is a table that adheres to certain rules

• A relation can be called a table


• Definition of a Relation

– A relation is a table containing all the data about some entity

• An entity is a thing or object that is important in this application area

• Data items in the table are related

• Relational Data Structure

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200

[Diagram labels: Relation (the whole table); Attributes (the columns); Tuples (the rows); Primary Key; Domains (Name, Species, Weight)]


• Relational Data Structure Definitions

– Relation

• The Table

– Tuple

• A Row

– Attribute

• A Column

• Relational Data Structure Definitions

– Primary Key

• A unique identifier for each tuple in the table

– Domain

• A pool of legal values from which an attribute value is selected

–Related to meaning

–Has a Data Type


• Relational Data Structure Definitions

– Degree

• The number of attributes

– Cardinality

• The number of tuples


• Relational Table Rules

– A Relation is a table that adheres to the following rules:

• There are No Duplicate Tuples in the table

–The tuples in the table are treated as a mathematical set


• Relational Table Rules

–By definition, a set is a collection of unique elements

• There must be a primary key (unique identifier) for each tuple

• Relational Table Rules

• There is no order to the tuples (top to bottom)

• There is no order to the attributes (left to right)

–By convention, the primary key attribute is usually the first one on the left side of the table
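The table rules above can be sketched in Python (the language is an assumption; the slides contain no code). A Python `set` of tuples stands in for a relation, and all variable names are illustrative.

```python
# A relation modelled as a set of tuples (ANAME, AFAMILY, WEIGHT).
animal = {
    ("Candice", "Camel", 1800),
    ("Zona", "Zebra", 900),
    ("Sam", "Snake", 5),
}

# No duplicate tuples: adding an existing tuple leaves the set unchanged.
animal.add(("Zona", "Zebra", 900))
cardinality = len(animal)  # still 3

# No order to the tuples: the same tuples listed differently are the same set.
reordered = {
    ("Sam", "Snake", 5),
    ("Candice", "Camel", 1800),
    ("Zona", "Zebra", 900),
}
same_relation = animal == reordered  # True

# No order to the attributes: a row keyed by attribute name does not
# depend on the left-to-right position of its columns.
row_a = {"ANAME": "Sam", "AFAMILY": "Snake", "WEIGHT": 5}
row_b = {"WEIGHT": 5, "ANAME": "Sam", "AFAMILY": "Snake"}
same_row = row_a == row_b  # True
```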


• Attributes

– Each attribute has a datatype

• Examples: Integer, character, date, user-defined

– The data value of an attribute can be null

• Attributes

– Each attribute value is atomic

• There is One & Only One data value in each cell of the table

• There are no Lists or Arrays

• One fact per field, one field per fact

– Can be called a Field (MS Access)

• Relational Data Structure: Design

– Each relation contains data about only one entity

• Each row corresponds to one unique occurrence of the entity

– A relation does not contain arrays, lists or repeating groups

• No multi-valued attributes


– Tables are designed according to Rules of Normalization

• Each data item in the table is determined

–By the Primary Key

–By the Whole Primary Key

–Only by the Primary Key


– Normalization avoids well-known update problems

• Optimizes design to minimize redundancy & storage requirements

• Example: Table with repeating group

Animal

ANAME    AFAMILY   WEIGHT  FOOD
Candice  Camel     1800    Hay, Buns
Zona     Zebra     900     Brush
Sam      Snake     5       Mice, People
Elmer    Elephant  5000    Leaves
Leonard  Lion      1200    People, Meat

• Example: Table with no repeating group

Animal

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200

Animal-Food

ANAME    FOOD
Candice  Hay
Candice  Buns
Zona     Brush
Sam      Mice
Sam      People
Elmer    Leaves
Leonard  People
Leonard  Meat
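Removing the repeating group can be sketched as follows. This is an illustrative Python sketch, not part of the original slides; the variable names are assumptions. The multi-valued FOOD field is split out into a second relation with one food per row.

```python
# Unnormalized rows: FOOD holds a list, so the attribute is not atomic.
unnormalized = [
    {"ANAME": "Candice", "AFAMILY": "Camel", "WEIGHT": 1800, "FOOD": ["Hay", "Buns"]},
    {"ANAME": "Zona", "AFAMILY": "Zebra", "WEIGHT": 900, "FOOD": ["Brush"]},
    {"ANAME": "Sam", "AFAMILY": "Snake", "WEIGHT": 5, "FOOD": ["Mice", "People"]},
    {"ANAME": "Elmer", "AFAMILY": "Elephant", "WEIGHT": 5000, "FOOD": ["Leaves"]},
    {"ANAME": "Leonard", "AFAMILY": "Lion", "WEIGHT": 1200, "FOOD": ["People", "Meat"]},
]

# Animal: one tuple per animal, every value atomic.
animal = {(r["ANAME"], r["AFAMILY"], r["WEIGHT"]) for r in unnormalized}

# Animal-Food: one tuple per (animal, food) pair.
animal_food = {(r["ANAME"], food) for r in unnormalized for food in r["FOOD"]}

print(len(animal), len(animal_food))  # prints "5 8"
```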

• A Database Models the Real World

– A Database represents Reality

– The database is a collection of relations

• A relation represents an entity type

• Each tuple represents one occurrence of that entity type

• Each occurrence of an entity is unique


• A Database Models the Real World

– A database contains information about

• Entities

• Relationships between entities

• Rules about the entities’ data & the relationships

• Relational Databases Support Relationships

– Relational databases support relationships between entities

• Relationship is established by a Foreign Key

• Repeat the Primary Key of one table in the related table(s)

• Example: The Zoo has an “Adopt-an-Animal” program

– A zoo member can adopt an animal

Zoo-Member (ANAME is the Foreign Key)

MID  MNAME         MADDR           ***  ANAME
171  N. Harrison   1400 Blush Rd        Zona
144  J. Montagano  1108 5th Ave         Leonard
194  J. Spence     1244 Lark Ln         Candice
303  E. Wingate    5222 Gains Dr        Candice
101  H. Yarchun    177 Beach Rd
270  K. Steeg      140 Crystal Dr       Zona
291  S. Ackerman   1172 Park Dr         Sam
301  K. Snyder     196 279th Ave

Animal

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200

• Example: Another Relationship

Animal-Food (Composite Primary Key: ANAME + FOOD; Foreign Key: ANAME)

ANAME    FOOD
Candice  Hay
Candice  Buns
Zona     Brush
Sam      Mice
Sam      People
Elmer    Leaves
Leonard  People
Leonard  Meat

Animal

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200

• Relational Integrity Rules

– Entity Integrity

• No part of the Primary Key (PK) may be Null

– Referential Integrity

• The value of a Foreign Key (FK) must either

–Be Null or

–Be one of the values of the PK in the related table
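The two integrity rules can be expressed as simple checks. The sketch below is an illustration in Python; the function names and the row "X. Example" / "Gina" are hypothetical and do not come from the slides.

```python
# Parent relation ANIMAL: the primary key values (ANAME).
animal_keys = {"Candice", "Zona", "Sam", "Elmer", "Leonard"}

# Child relation ZOO-MEMBER as (MID, MNAME, ANAME); ANAME is the foreign key.
zoo_member = [
    (171, "N. Harrison", "Zona"),
    (144, "J. Montagano", "Leonard"),
    (101, "H. Yarchun", None),  # no adopted animal: the FK is null
]

def entity_integrity_ok(rows, pk_positions):
    """True if no part of the primary key is null in any row."""
    return all(row[i] is not None for row in rows for i in pk_positions)

def referential_integrity_ok(rows, fk_position, parent_keys):
    """True if every FK value is null or equals some parent PK value."""
    return all(
        row[fk_position] is None or row[fk_position] in parent_keys
        for row in rows
    )

pk_ok = entity_integrity_ok(zoo_member, [0])                  # True
fk_ok = referential_integrity_ok(zoo_member, 2, animal_keys)  # True

# Hypothetical bad row: "Gina" is not a PK value in ANIMAL.
bad_fk_ok = referential_integrity_ok(
    [(999, "X. Example", "Gina")], 2, animal_keys
)  # False
```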


• Keys, Keys, and More Keys

– Characteristic of a Primary Key (PK)

• Unique

• Mandatory

• Unchanging

• Under the control of IT organization

• Keys, Keys, and More Keys

– Names or Types of Keys

• Candidate Key

–A minimal set of attributes that can be used as the unique identifier for a table

• Primary Key

–One of the candidate keys

• Alternate Key

–A candidate key that is not the primary key

• Foreign Key

–A primary key of a related table

–Indicates relationships

• Composite Key

–A key composed of more than one attribute

• Search Key

–One or more attributes on which a retrieval is based

»Indexes


• Characteristics of Relationships

– Referential integrity applies to the relationship between entities

• Also known as an existence constraint or an enterprise rule

• For every relationship, referential integrity must be defined

• Relationships have Cardinality

– One-To-One

– One-To-Many

– Many-To-Many

• Relationships have Optionality

– Each entity’s participation is either

• Mandatory or

• Optional


• Cardinality reflects Business Rules

– One-To-One Relationship

• One animal is cared for by one zoo worker

• One zoo worker cares for one animal


• Cardinality reflects Business Rules

– One-To-Many Relationship

• One animal is cared for by many zoo workers

• One zoo worker cares for only one animal


• Cardinality reflects Business Rules

– Many-To-Many Relationship

• One animal is cared for by many zoo workers

• One zoo worker cares for many animals


• Mandatory Relationship

– The Foreign Key Cannot be Null

– Every purchase order must have a supplier

– In the example below the FK, SNO, cannot be Null

• Example:

PORDER

ONO   SNO   ODATE     ***
7001  1234  03/09/02
7002  2079  03/10/02
7003  2079  03/12/02
***

SUPPLIER

SNO   SNAME            SADDR
1234  Farm & Feed      7000 Booth Rd
2079  The Grain House  2001 Larkin Dr
***

• Example: FK can be Null

ANIMAL

ANID  ANAME    AFAMILY   WEIGHT
0001  Candice  Camel     1800
0002  Zona     Zebra     900
0003  Sam      Snake     5
0004  Elmer    Elephant  5000
0005  Leonard  Lion      1200

ZOO-MEMBER (ANID is the Foreign Key)

MID  MNAME         MADDR           ***  ANID
171  N. Harrison   1400 Blush Rd        0002
144  J. Montagano  1108 5th Ave         0005
194  J. Spence     1244 Lark Ln         0001
303  E. Wingate    5222 Gains Dr        0001
101  H. Yarchun    177 Beach Rd
270  K. Steeg      140 Crystal Dr       0002
291  S. Ackerman   1172 Park Dr         0003
301  K. Snyder     196 279th Ave


• What happens when a Tuple is deleted?

– For every relationship, there are three possible delete options

• Cascades

–Delete the target tuple and

–Delete the related tuples

• Restricted

–Delete restricted to cases for which there are no related tuples

• Nullifies

–Delete the target tuple and

–Set the FK to null in the related tuples
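The three delete options can be sketched as plain Python functions over sets of tuples. This is a simplified illustration, not how any particular DBMS implements them; the function names and the reduced sample data are assumptions.

```python
# ANIMAL as (ANID, ANAME); ZOO-MEMBER as (MID, MNAME, ANID), ANID is the FK.
animal = {("0001", "Candice"), ("0002", "Zona"), ("0004", "Elmer")}
zoo_member = {
    (194, "J. Spence", "0001"),
    (303, "E. Wingate", "0001"),
    (171, "N. Harrison", "0002"),
}

def delete_cascades(parent, child, key):
    """Delete the target tuple and every related child tuple."""
    return ({p for p in parent if p[0] != key},
            {c for c in child if c[2] != key})

def delete_restricted(parent, child, key):
    """Delete the target tuple only if no child tuple refers to it."""
    if any(c[2] == key for c in child):
        raise ValueError(f"{key} still has related tuples")
    return {p for p in parent if p[0] != key}, set(child)

def delete_nullifies(parent, child, key):
    """Delete the target tuple and set the FK to null in related tuples."""
    return ({p for p in parent if p[0] != key},
            {(c[0], c[1], None) if c[2] == key else c for c in child})

cascaded = delete_cascades(animal, zoo_member, "0001")
nullified = delete_nullifies(animal, zoo_member, "0001")
unreferenced = delete_restricted(animal, zoo_member, "0004")  # no members refer to 0004
```

Calling `delete_restricted(animal, zoo_member, "0001")` raises `ValueError`, because two members still refer to animal 0001.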


• Relational Algebra Operations

– Select

– Project

– Join

– Union

– Intersect

– Difference

• Our Zoo Database Tables

ANIMAL

ANID  ANAME    AFAMILY   WEIGHT
0001  Candice  Camel     1800
0002  Zona     Zebra     900
0003  Sam      Snake     5
0004  Elmer    Elephant  5000
0005  Leonard  Lion      1200

ZOO-MEMBER

MID  MNAME         MADDR           ***  ANID
171  N. Harrison   1400 Blush Rd        0002
144  J. Montagano  1108 5th Ave         0005
194  J. Spence     1244 Lark Ln         0001
303  E. Wingate    5222 Gains Dr        0001
101  H. Yarchun    177 Beach Rd
270  K. Steeg      140 Crystal Dr       0002
291  S. Ackerman   1172 Park Dr         0003
301  K. Snyder     196 279th Ave

ANIMAL-FOOD

ANID  FOOD
0001  Hay
0001  Buns
0002  Brush
0003  Mice
0003  People
0004  Leaves
0005  People
0005  Meat


• Relational Algebra: SELECT

– Extracts specified tuples from a relation (or get rows from a table)

• Example: SELECT out from the ANIMAL-FOOD table (display) the rows where FOOD = People

ANIMAL-FOOD

ANID  FOOD
0001  Hay
0001  Buns
0002  Brush
0003  Mice
0003  People
0004  Leaves
0005  People
0005  Meat

RESULTS

ANID  FOOD
0003  People
0005  People
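The SELECT operation above can be sketched in Python as a filter over a set of tuples. The helper name `select` and the use of a predicate function are illustrative assumptions.

```python
# ANIMAL-FOOD as a set of (ANID, FOOD) tuples.
animal_food = {
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
    ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
}

def select(relation, predicate):
    """Return the tuples of `relation` for which `predicate` is true."""
    return {row for row in relation if predicate(row)}

result = select(animal_food, lambda row: row[1] == "People")
print(sorted(result))  # [('0003', 'People'), ('0005', 'People')]
```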

• Relational Algebra: PROJECT

– Extracts specified attributes (columns) from a relation (or get columns from a table)

• Example: PROJECT from the ZOO-MEMBER table columns (MID, MNAME)

ZOO-MEMBER

MID  MNAME         MADDR           ***  ANID
171  N. Harrison   1400 Blush Rd        0002
144  J. Montagano  1108 5th Ave         0005
194  J. Spence     1244 Lark Ln         0001
303  E. Wingate    5222 Gains Dr        0001
101  H. Yarchun    177 Beach Rd
270  K. Steeg      140 Crystal Dr       0002
291  S. Ackerman   1172 Park Dr         0003
301  K. Snyder     196 279th Ave

RESULTS

MID  MNAME
171  N. Harrison
144  J. Montagano
194  J. Spence
303  E. Wingate
101  H. Yarchun
270  K. Steeg
291  S. Ackerman
301  K. Snyder
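A PROJECT can be sketched in Python as picking column positions out of each tuple. The helper name `project` is an assumption, and only the first four ZOO-MEMBER rows are used to keep the example short. Because the result is a set, projecting onto a column with repeated values collapses the duplicates.

```python
ATTRIBUTES = ("MID", "MNAME", "MADDR", "ANID")
zoo_member = {
    (171, "N. Harrison", "1400 Blush Rd", "0002"),
    (144, "J. Montagano", "1108 5th Ave", "0005"),
    (194, "J. Spence", "1244 Lark Ln", "0001"),
    (303, "E. Wingate", "5222 Gains Dr", "0001"),
}

def project(relation, attributes, wanted):
    """Keep only the `wanted` attributes of every tuple."""
    positions = [attributes.index(name) for name in wanted]
    return {tuple(row[i] for i in positions) for row in relation}

result = project(zoo_member, ATTRIBUTES, ("MID", "MNAME"))

# Two members adopted animal 0001, but the set keeps the value once.
adopted = project(zoo_member, ATTRIBUTES, ("ANID",))
```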


• Relational Algebra: JOIN

– Join the data in two tables

• Concatenate one row from Table 1 with one row from Table 2

–Usually based on a common column called the join condition

• Example: JOIN T1 and T2 based on the AFAMILY column

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03

RESULT

ANID  AFAMILY  AFAMILY  AREA
0001  Camel    Camel    01
0002  Zebra    Zebra    03

• Different types of Joins

– Equijoin – means a row in T1 is joined with a row in T2 where the values in the common column(s) are equal

– This is the most common type of join

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03

RESULT

ANID  AFAMILY  AFAMILY  AREA
0001  Camel    Camel    01
0002  Zebra    Zebra    03

Join T1 and T2 where T1.AFAMILY=T2.AFAMILY
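An equijoin can be sketched in Python as a nested loop that concatenates every pair of rows whose join columns are equal. The function name and the use of column positions are illustrative assumptions; a real DBMS may use very different join algorithms.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]      # (AFAMILY, AREA)

def equijoin(left, right, left_col, right_col):
    """Concatenate each left row with each right row whose join columns match."""
    return [l + r for l in left for r in right if l[left_col] == r[right_col]]

# Join T1 and T2 where T1.AFAMILY = T2.AFAMILY
result = equijoin(t1, t2, 1, 0)
print(result)
# [('0001', 'Camel', 'Camel', '01'), ('0002', 'Zebra', 'Zebra', '03')]
```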

• Natural Join

– The rows of T1 are joined with the rows of T2 where the PK value in one table equals the FK value in the other table

• Matching is on the columns whose names are the same in both tables

• The common column appears only once in the result

• Don’t use this in a Production Database – renaming a column silently changes the join condition

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03

RESULT

ANID  AFAMILY  AREA
0001  Camel    01
0002  Zebra    03

T1 NATURAL JOIN T2
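A natural join can be sketched in Python with rows stored as dictionaries keyed by attribute name. In the usual relational-algebra definition the join matches on every attribute name the two tables share, and each shared column appears once in the result. The function name is an assumption.

```python
t1 = [{"ANID": "0001", "AFAMILY": "Camel"}, {"ANID": "0002", "AFAMILY": "Zebra"}]
t2 = [{"AFAMILY": "Camel", "AREA": "01"}, {"AFAMILY": "Zebra", "AREA": "03"}]

def natural_join(left, right):
    """Join rows that agree on every attribute name the two rows share."""
    result = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[name] == r[name] for name in common):
                # Merging the dicts keeps each shared attribute only once.
                result.append({**l, **r})
    return result

result = natural_join(t1, t2)
```

Because the match depends only on column names, renaming AFAMILY in one table would leave no shared name, and this sketch would then pair every row with every row. This illustrates the renaming problem the slide warns about.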

• Inner Join

– The rows of T1 are joined with the rows of T2 based on the join condition specified

• Only rows from T1 with a matching row in T2 are in the result

• Often an Inner Join is both a Natural Join & an Equijoin

• Example: Inner Join

– T1 INNER JOIN T2 on T1.AFAMILY=T2.AFAMILY

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03

RESULT

ANID  AFAMILY  AFAMILY  AREA
0001  Camel    Camel    01
0002  Zebra    Zebra    03

• Outer Join

– The rows of T1 are joined with the rows of T2

• All rows from one of the tables are included in the result even if there is no matching row in the other table

• Example: Outer Join

– T1 RIGHT OUTER JOIN T2 on T1.AFAMILY=T2.AFAMILY

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03
Snake    05

RESULT

ANID  AFAMILY  AFAMILY  AREA
0001  Camel    Camel    01
0002  Zebra    Zebra    03
               Snake    05
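A right outer join can be sketched in Python as follows. Every row of the right table appears in the result, and a right row with no match is padded with nulls (`None`) for the left table's columns. The function name and parameters are illustrative assumptions.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]                  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03"), ("Snake", "05")]     # (AFAMILY, AREA)

def right_outer_join(left, right, left_col, right_col, left_width):
    """Keep every right row; pad with None where no left row matches."""
    result = []
    for r in right:
        matches = [l for l in left if l[left_col] == r[right_col]]
        if matches:
            result.extend(l + r for l in matches)
        else:
            result.append((None,) * left_width + r)
    return result

# T1 RIGHT OUTER JOIN T2 on T1.AFAMILY = T2.AFAMILY
result = right_outer_join(t1, t2, 1, 0, left_width=2)
```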

• Cross Join

– Every row in T1 is joined with every row in T2

• All possible combinations of rows in the two tables

• Also called a Cartesian Product

• Example: Cross Join

– T1 CROSS JOIN T2

T1

ANID  AFAMILY
0001  Camel
0002  Zebra

T2

AFAMILY  AREA
Camel    01
Zebra    03

RESULT

ANID  AFAMILY  AFAMILY  AREA
0001  Camel    Camel    01
0001  Camel    Zebra    03
0002  Zebra    Camel    01
0002  Zebra    Zebra    03
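A cross join maps directly onto Python's standard `itertools.product`, which yields every pairing of rows from the two tables. The variable names are illustrative.

```python
from itertools import product

t1 = [("0001", "Camel"), ("0002", "Zebra")]  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]      # (AFAMILY, AREA)

# T1 CROSS JOIN T2: concatenate every T1 row with every T2 row.
result = [l + r for l, r in product(t1, t2)]

# The result size is the product of the two cardinalities.
print(len(result))  # prints "4"
```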

• An RDBMS manipulates Data using Relational Algebra Operations

– There are (usually) several sequences of operations to answer a query

• One sequence may be more efficient than another

– A relational DBMS internally has routines that do the relational algebra

– A relational DBMS generates a sequence or plan of relational algebra operations to accomplish the request

– A relational DBMS has a query optimizer to develop an efficient query plan

• A least-cost optimizer generates several execution plans and chooses the least-cost one, i.e., the one with the least amount of I/O
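To illustrate why the order of operations matters, the Python sketch below answers the same question ("which members adopted a zebra?") with two different sequences of operations. It only counts intermediate rows as a rough stand-in for cost; real optimizers estimate cost from statistics, and the plan names here are assumptions.

```python
animal = [("0001", "Camel"), ("0002", "Zebra"), ("0003", "Snake"),
          ("0004", "Elephant"), ("0005", "Lion")]             # (ANID, AFAMILY)
member = [(171, "0002"), (144, "0005"), (194, "0001"),
          (303, "0001"), (270, "0002"), (291, "0003")]        # (MID, ANID)

# Plan A: join first, then select the zebra rows.
joined = [m + a for m in member for a in animal if m[1] == a[0]]
plan_a = {row[0] for row in joined if row[3] == "Zebra"}

# Plan B: select the zebra rows first, then join.
zebras = [a for a in animal if a[1] == "Zebra"]
plan_b = {m[0] for m in member for a in zebras if m[1] == a[0]}

# Same answer, but Plan B carries a smaller intermediate result.
print(plan_a == plan_b, len(joined), len(zebras))  # prints "True 6 1"
```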


• Union, Intersection, and Minus

Union – union together (append) the result tables from two queries

Intersect – take only the rows that are identical in the result tables from two queries

Difference – take only the rows in the first result table that have no identical rows in the second result table

• Relational Algebra: UNION

– Union together the results of two queries

• Result contains every element in either one or both sets

– Query 1

• Select the rows from ANIMAL where WEIGHT > 2000 into T1

• Project from T1(ANID) into Result 1

– Query 2

• Select the rows from ANIMAL-FOOD where FOOD = People into T2

• Project from T2(ANID) into Result 2

– Query 1 UNION Query 2

RESULT 1  RESULT 2  RESULT
ANID      ANID      ANID
0004      0003      0003
          0005      0004
                    0005
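The UNION example can be reproduced with Python sets, whose `|` operator is set union. The data is taken from the zoo tables; the variable names are illustrative.

```python
animal = {("0001", 1800), ("0002", 900), ("0003", 5),
          ("0004", 5000), ("0005", 1200)}                    # (ANID, WEIGHT)
animal_food = {
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
    ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
}                                                            # (ANID, FOOD)

# Query 1: select WEIGHT > 2000, then project ANID.
result_1 = {anid for anid, weight in animal if weight > 2000}

# Query 2: select FOOD = People, then project ANID.
result_2 = {anid for anid, food in animal_food if food == "People"}

union = result_1 | result_2
print(sorted(union))  # ['0003', '0004', '0005']
```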

• Relational Algebra: INTERSECTION

– Take only the rows (tuples) that are identical in the result tables of two queries

• Query 1

– Select out the rows from ANIMAL where WEIGHT > 1000 into T1

– Project from T1(ANID) into Result 1

• Query 2

– Project from ZOO-MEMBER(ANID) into Result 2

• Query 1 INTERSECT Query 2

RESULT 1  RESULT 2  RESULT
ANID      ANID      ANID
0001      0002      0001
0004      0005      0005
0005      0001
          0003
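The INTERSECT example can be reproduced with Python's set intersection operator `&`. Null ANID values in ZOO-MEMBER are left out of Result 2 here, which matches the slide's result table but is an assumption about how nulls should be treated.

```python
animal = {("0001", 1800), ("0002", 900), ("0003", 5),
          ("0004", 5000), ("0005", 1200)}                    # (ANID, WEIGHT)

# The ANID column of ZOO-MEMBER; None marks members with no adopted animal.
member_anids = ["0002", "0005", "0001", "0001", None, "0002", "0003", None]

# Query 1: select WEIGHT > 1000, then project ANID.
result_1 = {anid for anid, weight in animal if weight > 1000}

# Query 2: project ANID from ZOO-MEMBER (duplicates collapse in a set).
result_2 = {anid for anid in member_anids if anid is not None}

intersection = result_1 & result_2
print(sorted(intersection))  # ['0001', '0005']
```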

• Relational Algebra: Minus/Difference/Except

– Subtract the results of a second query from the results of the first query

• Query 1

– Project from ANIMAL(ANID) into Result 1

• Query 2

– Project from ZOO-MEMBER(ANID) into Result 2

• Query 1 EXCEPT Query 2

RESULT 1  RESULT 2  RESULT
ANID      ANID      ANID
0001      0002      0004
0002      0005
0003      0001
0004      0003
0005
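The EXCEPT example can be reproduced with Python's set difference operator `-`. The result is the set of animals that no member has adopted. The variable names are illustrative.

```python
# Query 1: project ANID from ANIMAL.
result_1 = {"0001", "0002", "0003", "0004", "0005"}

# Query 2: project ANID from ZOO-MEMBER (non-null values, duplicates collapsed).
result_2 = {"0002", "0005", "0001", "0003"}

# Query 1 EXCEPT Query 2: rows in the first result with no identical row in the second.
difference = result_1 - result_2
print(sorted(difference))  # ['0004']
```

The operation is not symmetric: `result_2 - result_1` is empty here, because every adopted animal exists in ANIMAL.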

• Strengths of the Relational Approach

– Simple

• People are familiar with tables

• Few rules

• Few operations

– Easy to learn

• Relational algebra is straightforward

• Multiple high-level, non-procedural languages are available (e.g., SQL)

– Well founded

• Basis is mathematics, set theory