From the Calculus to the Structured Query Language
Zachary G. Ives
University of Pennsylvania
CIS 550 – Database & Information Systems
September 22, 2005
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
2
Administrivia
Homework 1 due Tuesday
  Homework 2 will also be handed out
  Will involve writing SQL
Oracle set up on eniac.seas.upenn.edu (also eniac-l.seas.upenn.edu)
  Go to: www.seas.upenn.edu/~zives/cis550/oracle-faq.html
  Click on “create Oracle account” link
  Enter your login info so you’ll get an Oracle account
3
Tuple Relational Calculus (in More Detail)
Queries of form:
  {T | p}    (p is the predicate)
Predicate: boolean expression over Tx attribs
Expressions:
  Tx ∈ R
  TX.a op TY.b
  TX.a op const
  const op TX.a
  T.a = Tx.a
  where op is one of =, ≠, <, >, ≤, ≥; Tx, … are tuple variables; Tx.a, … are attributes
Complex expressions: e1 ∧ e2, e1 ∨ e2, ¬e, and e1 → e2
Universal and existential quantifiers: ∀, ∃
4
Domain Relational Calculus to Tuple Relational Calculus
{<subj> | ∃ cid, sem, sid (<cid, subj, sem> ∈ COURSE ∧ <sid, “C”, cid> ∈ Takes)}
{<cid> | ∃ s1, s2 (<cid, s1, s2> ∈ COURSE ∧ ∃ cid2, s3, s4 (<cid2, s3, s4> ∈ COURSE ∧ (cid > cid2)))}
5
Mini-Quiz on the Relational Calculus
How do you write:
  DRC: Which students have taken more than one course from the same professor?
  TRC: Which faculty teach every course?
6
Algebra vs. Calculus
We’ve claimed that the calculus (when safe) and the algebra are equivalent
Thus (core) SQL => calculus ⇔ algebra makes sense
Let’s look more closely at this…
SELECT *
FROM STUDENT, Takes, COURSE
WHERE STUDENT.sid = Takes.sid AND Takes.cid = COURSE.cid
[Diagram labels: STUDENT, Takes, COURSE; Calculus]
7
Translating from RA to DRC
Core of relational algebra: σ, π, ∪, ×, −
We need to work our way through the structure of an RA expression, translating each possible form.
  Let TR[e] be the translation of RA expression e into DRC.
Relation names: For the RA expression R, the DRC expression is
  {<x1,x2, …, xn> | <x1,x2, …, xn> ∈ R}
8
Selection: TR[σθ R]
Suppose we have σθ(e’), where e’ is another RA expression that translates as:
  TR[e’] = {<x1,x2, …, xn> | p}
Then the translation of σθ(e’) is
  {<x1,x2, …, xn> | p ∧ θ’}
where θ’ is obtained from θ by replacing each attribute with the corresponding variable
Example: TR[σ#1=#2 ∧ #4>2.5 R] (if R has arity 4) is
  {<x1,x2,x3,x4> | <x1,x2,x3,x4> ∈ R ∧ x1=x2 ∧ x4>2.5}
9
Projection: TR[πi1,…,im(e)]
If TR[e] = {<x1,x2, …, xn> | p} then
  TR[πi1,i2,…,im(e)] = {<xi1,xi2, …, xim> | ∃ xj1,xj2, …, xjk. p},
where xj1,xj2, …, xjk are variables in x1,x2, …, xn that are not in xi1,xi2, …, xim
Example: With R as before,
  π#1,#3(R) = {<x1,x3> | ∃ x2,x4. <x1,x2,x3,x4> ∈ R}
10
Union: TR[R1 ∪ R2]
R1 and R2 must have the same arity
For e1 ∪ e2, where e1, e2 are algebra expressions:
  TR[e1] = {<x1,…,xn> | p} and TR[e2] = {<y1,…,yn> | q}
Relabel the variables in the second:
  TR[e2] = {<x1,…,xn> | q’}
This may involve relabeling bound variables in q to avoid clashes
  TR[e1 ∪ e2] = {<x1,…,xn> | p ∨ q’}
Example: TR[R1 ∪ R2] = {<x1,x2,x3,x4> | <x1,x2,x3,x4> ∈ R1 ∨ <x1,x2,x3,x4> ∈ R2}
11
Other Binary Operators
Difference: The same conditions hold as for union
  If TR[e1] = {<x1,…,xn> | p} and TR[e2] = {<x1,…,xn> | q}
  Then TR[e1 − e2] = {<x1,…,xn> | p ∧ ¬q}
Product:
  If TR[e1] = {<x1,…,xn> | p} and TR[e2] = {<y1,…,ym> | q}
  Then TR[e1 × e2] = {<x1,…,xn, y1,…,ym> | p ∧ q}
Example: TR[R × S] = {<x1,…,xn, y1,…,ym> | <x1,…,xn> ∈ R ∧ <y1,…,ym> ∈ S}
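The translation rules above can be checked on a tiny instance. The following is a minimal sketch, not part of the original slides: it evaluates each DRC expression by brute force over the active domain and compares the result with the direct algebra operation. The relations R1 and R2, their contents, and the helper name `drc` are all hypothetical.

```python
from itertools import product

# Hypothetical arity-4 relations (not from the slides).
R1 = {(1, 1, 5, 3), (2, 3, 5, 4), (4, 4, 6, 1)}
R2 = {(1, 1, 5, 3), (7, 7, 7, 7)}

# Active domain: every value that appears in some tuple.
dom = {v for rel in (R1, R2) for t in rel for v in t}

def drc(arity, pred):
    """Evaluate {<x1..xk> | pred} by testing every tuple over the active domain."""
    return {t for t in product(dom, repeat=arity) if pred(*t)}

# Selection: sigma_{#1=#2 AND #4>2.5}(R1)
sel_drc = drc(4, lambda x1, x2, x3, x4:
              (x1, x2, x3, x4) in R1 and x1 == x2 and x4 > 2.5)
sel_ra = {t for t in R1 if t[0] == t[1] and t[3] > 2.5}

# Projection: pi_{#1,#3}(R1); x2 and x4 become existentially quantified.
proj_drc = drc(2, lambda x1, x3:
               any((x1, x2, x3, x4) in R1 for x2 in dom for x4 in dom))
proj_ra = {(t[0], t[2]) for t in R1}

# Union is disjunction; difference is conjunction with a negation.
union_drc = drc(4, lambda *x: x in R1 or x in R2)
diff_drc = drc(4, lambda *x: x in R1 and x not in R2)

print(sel_drc == sel_ra, proj_drc == proj_ra,
      union_drc == R1 | R2, diff_drc == R1 - R2)
```

Enumerating the whole domain is only practical for toy data; it is used here because it mirrors the calculus semantics literally.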
12
Relational Algebra vs. Calculus
Can translate relational algebra into relational calculus
Given syntactic restrictions that guarantee safety of calculus query, can translate back to relational algebra
These are the principles behind initial development of relational databases
  SQL is close to calculus; query plan is close to algebra
  But SQL can do other things (recursion, aggregation) that RA/RC can’t
Great example of theory leading to practice!
13
Basic SQL: A Friendly Face Over the Tuple Relational Calculus
SELECT [DISTINCT] {T1.attrib, …, T2.attrib}    (select-list)
FROM {relation} T1, {relation} T2, …           (from-list)
WHERE {predicates}                             (qualification)
Let’s do some examples, which will leverage your knowledge of the relational calculus…
  Faculty ids
  Course IDs for courses with students expecting a “C”
  Courses taken by Jill
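One possible way to write the three example queries is sketched here. It runs on SQLite through Python's sqlite3 module, which is an assumption for the sake of a runnable example (the course itself uses Oracle). The column exp-grade is renamed exp_grade because a hyphen is not legal in an unquoted identifier; the rows are those of the deck's example data instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO PROFESSOR VALUES (1,'Ives'),(2,'Saul'),(8,'Martin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

# Faculty ids: one range variable, no qualification.
faculty_ids = conn.execute(
    "SELECT P.fid FROM PROFESSOR P ORDER BY P.fid").fetchall()

# Course IDs for courses with students expecting a "C".
c_courses = conn.execute(
    "SELECT DISTINCT T.cid FROM Takes T WHERE T.exp_grade = 'C'").fetchall()

# Courses taken by Jill: three range variables joined in the qualification.
jill_courses = conn.execute("""
    SELECT C.cid, C.subj
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.name = 'Jill' AND S.sid = T.sid AND T.cid = C.cid
    ORDER BY C.cid""").fetchall()

print(faculty_ids, c_courses, jill_courses)
```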
14
Our Example Data Instance
STUDENT
sid  name
1    Jill
2    Qun
3    Nitin

PROFESSOR
fid  name
1    Ives
2    Saul
8    Martin

Takes
sid  exp-grade  cid
1    A          550-0105
1    A          700-1005
3    C          501-0105

COURSE
cid       subj  sem
550-0105  DB    F05
700-1005  AI    S05
501-0105  Arch  F05

Teaches
fid  cid
1    550-0105
2    700-1005
8    501-0105
15
Some Nice Features
SELECT *
  All STUDENTs
AS
  As a “range variable” (tuple variable): optional
  As an attribute rename operator
Example:
  Which students (names) have taken more than one course from the same professor?
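A hedged sketch of the example query follows, again assuming SQLite via Python's sqlite3 rather than Oracle. AS is used both for range variables and to rename the output attribute. On the slide data no student qualifies, so the sketch also adds one hypothetical Teaches row (Ives teaching 700-1005), which is not in the original instance, to show a non-empty answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE Teaches (fid INTEGER, cid TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO Teaches VALUES (1,'550-0105'),(2,'700-1005'),(8,'501-0105');
""")

# Two Takes variables find two different courses of one student;
# two Teaches variables require the same professor for both.
QUERY = """
SELECT DISTINCT S.name AS student
FROM STUDENT AS S, Takes AS T1, Takes AS T2, Teaches AS P1, Teaches AS P2
WHERE S.sid = T1.sid AND S.sid = T2.sid AND T1.cid <> T2.cid
  AND T1.cid = P1.cid AND T2.cid = P2.cid AND P1.fid = P2.fid
"""

on_slide_data = conn.execute(QUERY).fetchall()

# Hypothetical extra row, not part of the slide instance.
conn.execute("INSERT INTO Teaches VALUES (1, '700-1005')")
with_extra_row = conn.execute(QUERY).fetchall()

print(on_slide_data, with_extra_row)
```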
16
Expressions in SQL
Can do computation over scalars (int, real or string) in the select-list or the qualification
  Show all student IDs decremented by 1
Strings:
  Fixed (CHAR(x)) or variable length (VARCHAR(x))
  Use single quotes: 'A string'
  Special comparison operator: LIKE
  Not equal: <>
Typecasting:
  CAST(S.sid AS VARCHAR(255))
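The features on this slide can be tried out as follows. This is a sketch that assumes SQLite via Python's sqlite3; other systems may differ in details, for example in the case sensitivity of LIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
""")

def column(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

# Scalar computation in the select-list: student IDs decremented by 1.
decremented = column("SELECT S.sid - 1 FROM STUDENT S ORDER BY S.sid")

# LIKE with the % wildcard: names starting with J.
j_names = column("SELECT S.name FROM STUDENT S WHERE S.name LIKE 'J%'")

# Not equal is written <>.
not_jill = column(
    "SELECT S.name FROM STUDENT S WHERE S.name <> 'Jill' ORDER BY S.name")

# Typecasting the integer id to a string.
as_text = column(
    "SELECT CAST(S.sid AS VARCHAR(255)) FROM STUDENT S ORDER BY S.sid")

print(decremented, j_names, not_jill, as_text)
```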
17
Set Operations
Set operations default to set semantics, not bag semantics:
  (SELECT … FROM … WHERE …)
  {op}
  (SELECT … FROM … WHERE …)
Where op is one of:
  UNION
  INTERSECT, MINUS/EXCEPT (many DBs don’t support these last ones!)
Bag semantics: append ALL to the operator (e.g. UNION ALL)
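The difference between set and bag semantics shows up as soon as the operands contain duplicates. A minimal sketch, assuming SQLite via Python's sqlite3: SQLite supports INTERSECT and EXCEPT but does not accept parentheses around the operands, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
""")

def sids(op):
    """Combine the sid columns of STUDENT and Takes with the given operator."""
    sql = f"SELECT sid FROM STUDENT {op} SELECT sid FROM Takes"
    return sorted(row[0] for row in conn.execute(sql))

union = sids("UNION")          # set semantics: duplicates removed
union_all = sids("UNION ALL")  # bag semantics: duplicates kept
intersect = sids("INTERSECT")  # students who take at least one course
difference = sids("EXCEPT")    # students who take no course

print(union, union_all, intersect, difference)
```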
18
Exercise
Find all students who have taken DB but not AI
  Hint: use EXCEPT
19
Nested Queries in SQL
Simplest: IN/NOT IN
Example: Students who have taken subjects that have (at any point) been taught by Martin
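One way to phrase the Martin example with IN is sketched next, assuming SQLite via Python's sqlite3 and the deck's example instance (with exp-grade renamed exp_grade). The inner query computes the subjects Martin has taught; the outer query keeps students who took any course in one of those subjects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
CREATE TABLE Teaches (fid INTEGER, cid TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO PROFESSOR VALUES (1,'Ives'),(2,'Saul'),(8,'Martin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
INSERT INTO Teaches VALUES (1,'550-0105'),(2,'700-1005'),(8,'501-0105');
""")

martin_students = conn.execute("""
    SELECT DISTINCT S.name
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
      AND C.subj IN (SELECT C2.subj
                     FROM PROFESSOR P, Teaches Te, COURSE C2
                     WHERE P.name = 'Martin'
                       AND P.fid = Te.fid AND Te.cid = C2.cid)
""").fetchall()

print(martin_students)
```

The subquery does not refer to any outer variable, so it can be evaluated once, independently of the outer query.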
20
Correlated Subqueries
Most common: EXISTS/NOT EXISTS
  Find all students who have taken DB but not AI
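A possible formulation with correlated subqueries is sketched here, assuming SQLite via Python's sqlite3. Each subquery refers to the outer variable S, so it is conceptually re-evaluated per student. On the slide data the answer is empty (Jill took both DB and AI), so the sketch adds one hypothetical enrollment, Qun in 550-0105 with grade B, which is not in the original instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

QUERY = """
SELECT S.name
FROM STUDENT S
WHERE EXISTS (SELECT * FROM Takes T, COURSE C
              WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'DB')
  AND NOT EXISTS (SELECT * FROM Takes T, COURSE C
                  WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'AI')
"""

on_slide_data = conn.execute(QUERY).fetchall()

# Hypothetical extra enrollment, not part of the slide instance.
conn.execute("INSERT INTO Takes VALUES (2, 'B', '550-0105')")
with_extra_row = conn.execute(QUERY).fetchall()

print(on_slide_data, with_extra_row)
```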
21
Universal and Existential Quantification
Generally used with subqueries: {op} ANY, {op} ALL
  Find the students with the best expected grades
22
Table Expressions
Can substitute a subquery for any relation in the FROM clause:
SELECT S.sid
FROM (SELECT sid FROM STUDENT WHERE sid = 5) S
WHERE S.sid = 4
Notice that we can actually simplify this query!
What is this equivalent to?
23
Aggregation
GROUP BY
SELECT {group-attribs}, {aggregate-operator}(attrib)
FROM {relation} T1, {relation} T2, …
WHERE {predicates}
GROUP BY {group-list}
Aggregate operators
  AVG, COUNT, SUM, MAX, MIN
  DISTINCT keyword for AVG, COUNT, SUM
24
Some Examples
Number of students in each course offering
Number of different grades expected for each course offering
Number of (distinct) students taking AI courses
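The three examples might be written as follows. This is a sketch assuming SQLite via Python's sqlite3 and the deck's example instance (exp-grade renamed exp_grade); a course offering is identified by its cid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

# Number of students in each course offering.
students_per_course = conn.execute("""
    SELECT T.cid, COUNT(T.sid)
    FROM Takes T
    GROUP BY T.cid ORDER BY T.cid""").fetchall()

# Number of different grades expected for each course offering.
grades_per_course = conn.execute("""
    SELECT T.cid, COUNT(DISTINCT T.exp_grade)
    FROM Takes T
    GROUP BY T.cid ORDER BY T.cid""").fetchall()

# Number of (distinct) students taking AI courses: no GROUP BY,
# so the aggregate ranges over all qualifying rows.
ai_students = conn.execute("""
    SELECT COUNT(DISTINCT T.sid)
    FROM Takes T, COURSE C
    WHERE T.cid = C.cid AND C.subj = 'AI'""").fetchone()[0]

print(students_per_course, grades_per_course, ai_students)
```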
25
What If You Want to Only Show Some Groups?
The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group):
SELECT C.subj, COUNT(S.sid)
FROM STUDENT S, Takes T, COURSE C
WHERE S.sid = T.sid AND T.cid = C.cid
GROUP BY subj
HAVING COUNT(S.sid) > 5
Exercise: For each subject taught by at least two professors, list the minimum expected grade
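The HAVING query above can be run against the small example instance as a sketch, assuming SQLite via Python's sqlite3. The threshold is turned into a parameter (the helper name `subjects_with_more_than` is hypothetical) because with only three enrollments the slide's threshold of 5 filters out every group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

def subjects_with_more_than(n):
    """Subjects whose enrollment count exceeds n, with that count."""
    return conn.execute("""
        SELECT C.subj, COUNT(S.sid)
        FROM STUDENT S, Takes T, COURSE C
        WHERE S.sid = T.sid AND T.cid = C.cid
        GROUP BY C.subj
        HAVING COUNT(S.sid) > ?
        ORDER BY C.subj""", (n,)).fetchall()

slide_threshold = subjects_with_more_than(5)  # the slide's HAVING ... > 5
any_enrollment = subjects_with_more_than(0)

print(slide_threshold, any_enrollment)
```

WHERE filters rows before grouping, while HAVING filters whole groups afterwards, which is why the aggregate can only appear in HAVING.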
26
Aggregation and Table Expressions
Sometimes need to compute results over the results of a previous aggregation:
SELECT subj, AVG(size)
FROM (
  SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
  FROM STUDENT S, Takes T, COURSE C
  WHERE S.sid = T.sid AND T.cid = C.cid
  GROUP BY C.cid, C.subj)
GROUP BY subj
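A runnable sketch of this two-level aggregation, assuming SQLite via Python's sqlite3: the inner query counts students per course offering and the outer query averages those counts per subject. The alias per_offering for the table expression is an addition; some systems require such an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'),(2,'Qun'),(3,'Nitin');
INSERT INTO Takes VALUES (1,'A','550-0105'),(1,'A','700-1005'),(3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'),('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

avg_size_per_subject = conn.execute("""
    SELECT subj, AVG(size)
    FROM (
        SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
        FROM STUDENT S, Takes T, COURSE C
        WHERE S.sid = T.sid AND T.cid = C.cid
        GROUP BY C.cid, C.subj) AS per_offering
    GROUP BY subj
    ORDER BY subj""").fetchall()

print(avg_size_per_subject)
```

In the example instance every subject has a single offering with one student, so each average is 1.0.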
27
Something to Ponder
Tables are great, but…
  Not everyone is uniform – I may have a cell phone but not a fax
  We may simply be missing certain information
  We may be unsure about values
How do we handle these things?