rel algebra
TRANSCRIPT
-
8/6/2019 Rel Algebra
1/58
Chapter 6
The Relational Algebra
-
8/6/2019 Rel Algebra
2/58
Slide 6- 2
Chapter Outline
Relational Algebra
Unary Relational Operations
Relational Algebra Operations From Set Theory
Binary Relational Operations Additional Relational Operations
Examples of Queries in Relational Algebra
Example Database Application (COMPANY)
-
8/6/2019 Rel Algebra
3/58
Slide 6- 3
Relational Algebra Overview
Relational algebra is the basic set of operationsfor the relational model
These operations enable a user to specify basicretrieval requests (orqueries)
The result of an operation is a new relation, whichmay have been formed from one or more inputrelations
This property makes the algebra closed (allobjects in relational algebra are relations)
-
8/6/2019 Rel Algebra
4/58
Slide 6- 4
Relational Algebra Overview (continued)
The algebra operations thus produce newrelations
These can be further manipulated using
operations of the same algebra
A sequence of relational algebra operationsforms a relational algebra expression
The result of a relational algebra expression is also arelation that represents the result of a databasequery (or retrieval request)
-
8/6/2019 Rel Algebra
5/58
Slide 6- 5
Brief History of Origins of Algebra(from the author Navathe)
Muhammad ibn Musa al-Khwarizmi (800-847 CE) wrote abook titled al-jabr about arithmetic of variables Book was translated into Latin. Its title (al-jabr) gave Algebra its name.
Al-Khwarizmi called variables shay Shay is Arabic for thing. Spanish transliterated shay as xay (x was sh in Spain). In time this word was abbreviated as x.
Where does the word Algorithm come from?
Algorithm originates from al-Khwarizmi" Reference: PBS (http://www.pbs.org/empires/islam/innoalgebra.html)
-
8/6/2019 Rel Algebra
6/58
Slide 6- 6
Relational Algebra Overview
Relational Algebra consists of several groups of operations Unary Relational Operations
SELECT (symbol: W (sigma)) PROJECT (symbol: T (pi)) R
ENAME (symbol: V (rho)) Relational Algebra Operations From Set Theory
UNION ( ), INTERSECTION ( ), DIFFERENCE (or MINUS, ) CARTESIAN PRODUCT ( x )
Binary Relational Operations JOIN (several variations of JOIN exist)
DIVISION Additional Relational Operations
OUTER JOINS, OUTER UNION AGGREGATE FUNCTIONS (These compute summary of
information: for example, SUM, COUNT, AVG, MIN, MAX)
-
8/6/2019 Rel Algebra
7/58
Slide 6- 7
Database State for COMPANY
All examples discussed below refer to the COMPANY databaseshown here.
-
8/6/2019 Rel Algebra
8/58
Slide 6- 8
Unary Relational Operations: SELECT
The SELECT operation (denoted by W (sigma)) is used to select asubsetof the tuples from a relation based on a selection condition.
The selection condition acts as a filter
Keeps only those tuples that satisfy the qualifying condition
Tuples satisfying the condition are selectedwhereas theother tuples are discarded (filtered out)
Examples:
Select the EMPLOYEE tuples whose department number is 4:
W DNO = 4 (EMPLOYEE) Select the employee tuples whose salary is greater than $30,000:
W SALARY > 30,000 (EMPLOYEE)
-
8/6/2019 Rel Algebra
9/58
Slide 6- 9
Unary Relational Operations: SELECT
In general, the selectoperation is denoted by
W (R) where
the symbol W (sigma) is used to denote the selectoperator
the selection condition is a Boolean (conditional)expression specified on the attributes of relation R
tuples that make the condition true are selected appear in the result of the operation
tuples that make the condition false are filtered out discarded from the result of the operation
-
8/6/2019 Rel Algebra
10/58
Can use Comparison Operators
The Select OperatorThe Select Operator ---- WW
! { " u e
Selection predicate may includecomparison between two attributes
Wsalary < 30000(Employee)
Find those employee having salary less than 30000
-
8/6/2019 Rel Algebra
11/58
The Select OperatorThe Select Operator ---- WW
Can also use ConnectivesAND
OR
Wdno=5AND salary < 30000(Employee)
Find those employee of department number 5 and salaryless than 30000
-
8/6/2019 Rel Algebra
12/58
Slide 6- 12
Unary Relational Operations: SELECT(contd.)
SELECT Operation Properties The SELECT operation W (R) produces a relation
S that has the same schema (same attributes) as R SELECT W is commutative:
W (W < condition2> (R)) = W (W < condition1> (R)) Because of commutativity property, a cascade (sequence) of
SELECT operations can be applied in any order: W(W (W (R)) = W (W (W ( R)))
A cascade of SELECT operations can be replaced by a singleselection with a conjunction of all the conditions:
W(W< cond2> (W(R)) = W AND < cond2> AND (R)))
The number of tuples in the result of a SELECT is less than(or equal to) the number of tuples in the input relation R
-
8/6/2019 Rel Algebra
13/58
Slide 6- 13
The following query results refer to thisdatabase state
-
8/6/2019 Rel Algebra
14/58
Slide 6- 14
Unary Relational Operations: PROJECT
PROJECT Operation is denoted by T (pi)
This operation keeps certain columns (attributes)from a relation and discards the other columns.
PROJECT creates a vertical partitioning The list of specified columns (attributes) is kept in
each tuple
The other attributes in each tuple are discarded
Example: To list each employees first and lastname and salary, the following is used:
TLNAME, FNAME,SALARY(EMPLOYEE)
-
8/6/2019 Rel Algebra
15/58
Slide 6- 15
Unary Relational Operations: PROJECT(cont.)
The general form of theprojectoperation is:T(R)
T (pi) is the symbol used to represent theprojectoperation is the desired list of attributes from
relation R.
The project operation removes any duplicate
tuples This is because the result of theprojectoperation
must be a set of tuples Mathematical sets do not allowduplicate elements.
-
8/6/2019 Rel Algebra
16/58
Slide 6- 16
Unary Relational Operations: PROJECT(contd.)
PROJECT Operation Properties
The number of tuples in the result of projectionT(R) is always less or equal to the number of
tuples in R If the list of attributes includes a keyofR, then the
number of tuples in the result of PROJECT is equalto the number of tuples in R
PROJECT is notcommutative T (T (R) ) = T (R) as long as
contains the attributes in
-
8/6/2019 Rel Algebra
17/58
Slide 6- 17
Examples of applying SELECT andPROJECT operations
-
8/6/2019 Rel Algebra
18/58
Slide 6- 18
Relational Algebra Expressions
We may want to apply several relational algebraoperations one after the other
Either we can write the operations as a single
relational algebra expression by nesting theoperations, or
We can apply one operation at a time and createintermediate result relations.
In the latter case, we must give names to therelations that hold the intermediate results.
-
8/6/2019 Rel Algebra
19/58
Slide 6- 19
Single expression versus sequence ofrelational operations (Example)
To retrieve the first name, last name, and salary of allemployees who work in department number 5, we mustapply a select and a project operation
We can write asingle relational algebra expre
ssion asfollows:
TFNAME, LNAME, SALARY(W DNO=5(EMPLOYEE))
OR We can explicitly show the sequence of operations,giving a name to each intermediate relation:
DEP5_EMPSn W DNO=5(EMPLOYEE)
RESULTn T FNAME, LNAME, SALARY (DEP5_EMPS)
-
8/6/2019 Rel Algebra
20/58
Slide 6- 20
Unary Relational Operations: RENAME
The RENAME operator is denoted by V (rho)
In some cases, we may want to rename theattributes of a relation or the relation name or
both
Useful when a query requires multipleoperations
Necessary in some cases (see JOIN operationlater)
-
8/6/2019 Rel Algebra
21/58
Slide 6- 21
Unary Relational Operations: RENAME(contd.)
The general RENAME operation V can beexpressed by any of the following forms:
VS (B1, B2, , Bn )(R) changes both:
the relation name to S,and
the column (attribute) names to B1, B2, ..Bn
VS(R) changes:
the relation name only to S
V(B1, B2, , Bn )(R) changes:
the column (attribute) names only to B1, B2, ..Bn
-
8/6/2019 Rel Algebra
22/58
Slide 6- 22
Unary Relational Operations: RENAME(contd.)
For convenience, we also use a shorthandforrenaming attributes in an intermediate relation:
If we write:
RESULTn T FNAME, LNAME, SALARY (DEP5_EMPS) RESULT will have the same attribute names as
DEP5_EMPS (same attributes as EMPLOYEE)
If we write:
RESULT (F, M, L, S, B, A, SX, SAL, SU, DNO)nV RESULT (F.M.L.S.B,A,SX,SAL,SU, DNO)(DEP5_EMPS)
The 10 attributes of DEP5_EMPS are renamedtoF, M, L, S, B, A, SX, SAL, SU, DNO, respectively
-
8/6/2019 Rel Algebra
23/58
Slide 6- 23
Example of applying multiple operationsand RENAME
-
8/6/2019 Rel Algebra
24/58
Slide 6- 24
Relational Algebra Operations fromSet Theory: UNION
UNION Operation
Binary operation, denoted by
The result ofR S, is a relation that includes all
tuples that are either in R or in S or in both R andS
Duplicate tuples are eliminated
The two operand relations R and S must be typecompatible (or UNION compatible)
R and S must have same number of attributes
Each pair of corresponding attributes must be typecompatible (have same or compatible domains)
-
8/6/2019 Rel Algebra
25/58
Slide 6- 25
Relational Algebra Operations fromSet Theory: UNION
Example: To retrieve the social security numbers of all employees who
eitherwork in department 5(RESULT1 below) ordirectlysupervise an employee who works in department 5(RESULT2below)
We can use the UNION operation as follows:DEP5_EMPS n WDNO=5 (EMPLOYEE)RESULT1n T SSN(DEP5_EMPS)
RESULT2(SSN)n TSUPERSSN(DEP5_EMPS)RESULTn RESULT1 RESULT2
The union operation produces the tuples that are in eitherRESULT1 orRESULT2 or both
-
8/6/2019 Rel Algebra
26/58
Slide 6- 26
Example of the result of a UNIONoperation
UNION Example
-
8/6/2019 Rel Algebra
27/58
Slide 6- 27
Relational Algebra Operations fromSet Theory
Type Compatibility of operands is required for the binaryset operation UNION , (also for INTERSECTION , andSET DIFFERENCE , see next slides)
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are typecompatible if:
they have the same number of attributes, and
the domains of corresponding attributes are type compatible(i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n).
The resulting relation forR1R2 (also forR1R2, orR1R2, see next slides) has the same attribute names as thefirstoperand relation R1 (by convention)
-
8/6/2019 Rel Algebra
28/58
Slide 6- 28
Relational Algebra Operations from SetTheory: INTERSECTION
INTERSECTION is denoted by
The result of the operation R S, is a
relation that includes all tuples that are inboth R and S
The attribute names in the result will be thesame as the attribute names in R
The two operand relations R and S must betype compatible
-
8/6/2019 Rel Algebra
29/58
Slide 6- 29
Relational Algebra Operations from SetTheory: SET DIFFERENCE (cont.)
SET DIFFERENCE (also called MINUS orEXCEPT) is denoted by
The result ofR S, is a relation that includes all
tuples that are in R but not in S
The attribute names in the result will be thesame as the attribute names in R
The two operand relations R and S must betype compatible
-
8/6/2019 Rel Algebra
30/58
Slide 6- 30
Example to illustrate the result of UNION,INTERSECT, and DIFFERENCE
-
8/6/2019 Rel Algebra
31/58
Slide 6- 31
Some properties of UNION, INTERSECT,and DIFFERENCE
Notice that both union and intersection are commutativeoperations; that is
R S = S R, and R S = S R
Both union and intersection can be treated as n-aryoperations applicable to any number of relations as bothare associative operations; that is
R (S T) = (R S) T
(R S) T = R (S T)
The minus operation is not commutative; that is, ingeneral
R S S R
-
8/6/2019 Rel Algebra
32/58
Slide 6- 32
Relational Algebra Operations from SetTheory: CARTESIAN PRODUCT
CARTESIAN (or CROSS) PRODUCT Operation
This operation is used to combine tuples from two relationsin a combinatorial fashion.
Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)
Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
The resulting relation state has one tuple for eachcombination of tuplesone from R and one from S.
Hence, ifR has nR tuples (denoted as |R| = nR ), and S hasnS tuples, then R x S will have nR * nS tuples.
The two operands do NOT have to be "type compatible
-
8/6/2019 Rel Algebra
33/58
Slide 6- 33
Relational Algebra Operations from SetTheory: CARTESIAN PRODUCT (cont.)
Generally, CROSS PRODUCT is not ameaningful operation
Can become meaningful when followed by other
operations Example (not meaningful):
FEMALE_EMPS n W SEX=F(EMPLOYEE)
EMPNAMES n T FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS n EMPNAMES x DEPENDENT
EMP_DEPENDENTS will contain every combination ofEMPNAMES and DEPENDENT whether or not they are actually related
-
8/6/2019 Rel Algebra
34/58
Slide 6- 34
Relational Algebra Operations from SetTheory: CARTESIAN PRODUCT (cont.)
To keep only combinations where theDEPENDENT is related to the EMPLOYEE, weadd a SELECT operation as follows
Example (meaningful): FEMALE_EMPS n W SEX=F(EMPLOYEE)
EMPNAMES n T FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS n EMPNAMES x DEPENDENT
ACTUAL_DEPS n W SSN=ESSN(EMP_DEPENDENTS)
RESULTn T FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPS)
RESULT will now contain the name of female employeesand their dependents
-
8/6/2019 Rel Algebra
35/58
A Simple Cartesian ProductA Simple Cartesian Product
Two one-attribute relations
a
A
B
C
b
X
Y
Z
A B
a b
A X
A Y
A Z
B X
B Y
B Z
C X
C Y
C Z
Their cartesian product
A X B
-
8/6/2019 Rel Algebra
36/58
Slide 6- 36
Example of applying CARTESIANPRODUCT
-
8/6/2019 Rel Algebra
37/58
Slide 6- 37
Binary Relational Operations: JOIN
JOIN Operation (denoted by ) The sequence of CARTESIAN PRODUCT followed by
SELECT is used quite commonly to identify and selectrelated tuples from two relations
A special operation, called JOIN combines this sequenceinto a single operation
This operation is very important for any relational databasewith more than a single relation, because it allows uscombine related tuples from various relations
The general form of a join operation on two relations R(A1,
A2, . . ., An) and S(B1, B2, . . ., Bm) is:R Swhere R and S can be any relations that result from generalrelational algebra expressions.
-
8/6/2019 Rel Algebra
38/58
Slide 6- 38
Binary Relational Operations: JOIN (cont.)
Example: Suppose that we want to retrieve the name of themanager of each department. To get the managers name, we need to combine each
DEPARTMENT tuple with the EMPLOYEE tuple whose SSNvalue matches the MGRSSN value in the department tuple.
We do this by using the join operation.
DEPT_MGRn DEPARTMENT MGRSSN=SSN EMPLOYEE
MGRSSN=SSN is the join condition Combines each department record with the employee who
manages the department The join condition can also be specified as
DEPARTMENT.MGRSSN= EMPLOYEE.SSN
-
8/6/2019 Rel Algebra
39/58
Slide 6- 39
Example of applying the JOIN operation
DEPT_MGRn DEPARTMENT MGRSSN=SSN EMPLOYEE
-
8/6/2019 Rel Algebra
40/58
Slide 6- 40
Some properties of JOIN
Consider the following JOIN operation:
R(A1, A2, . . ., An) S(B1, B2, . . ., Bm)
R.Ai=S.Bj
Result is a relation Q with degree n + m attributes: Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
The resulting relation state has one tuple for eachcombination of tuplesr from R and s from S, but only iftheysatisfy the join condition r[Ai]=s[Bj]
Hence, ifR has nR tuples, and S has nS tuples, then the joinresult will generally have less than nR * nS tuples.
Only related tuples (based on the join condition) will appearin the result
-
8/6/2019 Rel Algebra
41/58
Slide 6- 41
Some properties of JOIN
The general case of JOIN operation is called aTheta-join: R S
theta
The join condition is called theta Theta can be any general boolean expression on
the attributes ofR and S; for example: R.Ai
-
8/6/2019 Rel Algebra
42/58
Slide 6- 42
Binary Relational Operations: EQUIJOIN
EQUIJOIN Operation
The most common use of join involves joinconditions with equality comparisons only
Such a join, where the only comparison operatorused is =, is called an EQUIJOIN.
In the result of an EQUIJOIN we always have oneor more pairs of attributes (whose names need not
be identical) that have identical values in everytuple.
The JOIN seen in the previous example was anEQUIJOIN.
-
8/6/2019 Rel Algebra
43/58
Slide 6- 43
Binary Relational Operations:NATURAL JOIN Operation
NATURAL JOIN Operation
Another variation of JOIN called NATURAL JOIN denoted by * was created to get rid of the second(superfluous) attribute in an EQUIJOIN condition.
because one of each pair of attributes with identical values issuperfluous
The standard definition of natural join requires that the twojoin attributes, or each pair of corresponding join attributes,have the same name in both relations
If this is not the case, a renaming operation is applied first.
-
8/6/2019 Rel Algebra
44/58
Slide 6- 44
Binary Relational OperationsNATURAL JOIN (contd.)
Example: To apply a natural join on the DNUMBER attributes ofDEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:
DEPT_LOCSn DEPARTMENT * DEPT_LOCATIONS
Only attribute with the same name is DNUMBER
An implicit join condition is created based on this attribute:DEPARTMENT.DNUMBER=DEPT_LOCATIONS.DNUMBER
Another example: Q n R(A,B,C,D) * S(C,D,E)
The implicit join condition includes each pairof attributes with thesame name, ANDed together: R.C=S.C AND R.D = S.D
Result keeps only one attribute of each such pair: Q(A,B,C,D,E)
-
8/6/2019 Rel Algebra
45/58
Slide 6- 45
Example of NATURAL JOIN operation
-
8/6/2019 Rel Algebra
46/58
Slide 6- 46
Complete Set ofRelational Operations
The set of operations including SELECT W,PROJECT T , UNION , DIFFERENCE ,RENAME V, and CARTESIAN PRODUCT X is
called a complete setbecause any otherrelational algebra expression can be expressedby a combination of these five operations.
For example:
R S = (R S ) ((R S) (S R))
R S = W (R X S)
-
8/6/2019 Rel Algebra
47/58
Slide 6- 47
Binary Relational Operations: DIVISION
DIVISION Operation The division operation is applied to two relations
R(Z) z S(X), where X subset Z. Let Y = Z - X (and hence Z= X Y); that is, let Y be the set of attributes ofR that are
not attributes of S.
The result of DIVISION is a relation T(Y) that includes atuple t if tuples tR appear in R with tR [Y] = t, and with tR [X] = ts for every tuple ts in S.
For a tuple t to appear in the result T of the DIVISION, thevalues in t must appear in R in combination with everytuplein S.
-
8/6/2019 Rel Algebra
48/58
Slide 6- 48
Example of DIVISION
-
8/6/2019 Rel Algebra
49/58
Slide 6- 49
Recap ofRelational Algebra Operations
-
8/6/2019 Rel Algebra
50/58
Slide 6- 50
Additional Relational Operations:Aggregate Functions and Grouping
A type of request that cannot be expressed in the basicrelational algebra is to specify mathematical aggregatefunctions on collections of values from the database.
Examples of such functions include retrieving the averageor total salary of all employees or the total number ofemployee tuples. These functions are used in simple statistical queries that
summarize information from the database tuples.
Common functions applied to collections of numericvalues include SUM, AVERAGE, MAXIMUM, and MINIMUM.
The COUNT function is used for counting tuples orvalues.
-
8/6/2019 Rel Algebra
51/58
Slide 6- 51
Examples of applying aggregate functionsand grouping
-
8/6/2019 Rel Algebra
52/58
Slide 6- 52
Illustrating aggregate functions andgrouping
-
8/6/2019 Rel Algebra
53/58
Slide 6- 53
Additional Relational Operations (cont.)
The OUTER JOIN Operation
In NATURAL JOIN and EQUIJOIN, tuples without amatching(orrelated) tuple are eliminated from the joinresult
Tuples with null in the join attributes are also eliminated
This amounts to loss of information.
A set of operations, called OUTER joins, can be used whenwe want to keep all the tuples in R, or all those in S, or all
those in both relations in the result of the join,
regardless ofwhether or not they have matching tuples in the otherrelation.
-
8/6/2019 Rel Algebra
54/58
Slide 6- 54
Additional Relational Operations (cont.)
The left outer join operation keeps every tuple inthe first or left relation R in R S; if no matchingtuple is found in S, then the attributes of S in the
join result are filled or padded with null values.
A similar operation, right outer join, keeps everytuple in the second or right relation S in the resultofR S.
A third operation, full outer join, denoted bykeeps all tuples in both the left and the rightrelations when no matching tuples are found,padding them with null values as needed.
-
8/6/2019 Rel Algebra
55/58
Slide 6- 55
Additional Relational Operations (cont.)
-
8/6/2019 Rel Algebra
56/58
Slide 6- 56
Additional Relational Operations (cont.)
OUTER UNION Operations The outer union operation was developed to take
the union of tuples from two relations if therelations are not type compatible.
This operation will take the union of tuples in tworelations R(X, Y) and S(X, Z) that are partiallycompatible, meaning that only some of theirattributes, say X, are type compatible.
The attributes that are type compatible are
represented only once in the result, and thoseattributes that are not type compatible from eitherrelation are also kept in the result relation T(X, Y,Z).
-
8/6/2019 Rel Algebra
57/58
Slide 6- 57
Additional Relational Operations (cont.)
Example: An outer union can be applied to two relationswhose schemas are STUDENT(Name, SSN, Department,Advisor) and INSTRUCTOR(Name, SSN, Department,Rank). Tuples from the two relations are matched based on having the
same combination of values of the shared attributes Name,SSN, Department.
If a student is also an instructor, both Advisor and Rank willhave a value; otherwise, one of these two attributes will be null.
The result relation STUDENT_OR_INSTRUCTOR will have thefollowing attributes:
STUDENT_OR_INSTRUCTOR (Name, SSN, Department,Advisor, Rank)
-
8/6/2019 Rel Algebra
58/58
Examples of Queries in RelationalAlgebra : Procedural Form
Q1: Retrieve the name and address of all employees who work for the
Research department.
RESEARCH_DEPTn W DNAME=Research (DEPARTMENT)
RESEARCH_EMPSn (RESEARCH_DEPT DNUMBER= DNOEMPLOYEEEMPLOYEE)
RESULTn T FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
Q6: Retrieve the names of employees who have no dependents.
ALL_EMPSn T SSN(EMPLOYEE)
EMPS_WITH_DEPS(SSN)n T ESSN(DEPENDENT)
EMPS_WITHOUT_DEPS n (ALL_EMPS - EMPS_WITH_DEPS)
RESULTn T LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)