CS 405G: Introduction to Database Systems. SQL II. Instructor: Jinze Liu
TRANSCRIPT
Review: SQL DML
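• DML includes 4 main statements: SELECT (query), INSERT, UPDATE and DELETE
• e.g.:
SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid = E.sid AND S.age = 19
This query combines three relational operations: SELECT (S.age = 19), PROJECT (S.name, E.cid) and JOIN (S.sid = E.sid).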
SELECT R.sid
FROM Boats B, Reserves R
WHERE B.color = 'red' AND R.bid = B.bid
INTERSECT
SELECT R.sid
FROM Boats B, Reserves R
WHERE B.color = 'green' AND R.bid = B.bid

Reserves
sid  bid  day
1    101  9/12
2    103  9/13
1    105  9/13

Boats
bid  bname        color
101  Nina         red
102  Pinta        blue
103  Santa Maria  red
105  Titanic      green

Sailors who reserved a red boat: sid = 1, 2
Sailors who reserved a green boat: sid = 1
Intersection: sid = 1
Now let’s do this with a self-join
Example: Find sailors who have reserved a red and a green boat
SELECT R1.sid
FROM Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE B1.color = 'red' AND B1.bid = R1.bid
  AND B2.color = 'green' AND B2.bid = R2.bid
  AND R1.sid = R2.sid
Find sids of sailors who’ve reserved a red and a green boat
Reserves
sid  bid  day
1    101  9/12
2    103  9/13
1    105  9/13

Boats
bid  bname        color
101  Nina         red
102  Pinta        blue
103  Santa Maria  red
105  Titanic      green

Find red reserved boats: sid = 1, 2
Find green reserved boats: sid = 1
Find matching green and red reserved boats: sid = 1
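The two formulations can be checked against each other on the slide data. The following is a minimal sketch that assumes Python's built-in sqlite3 module as the engine; the variable names are illustrative, and DISTINCT is added to the self-join (it is not on the slide) because a self-join can return the same sid more than once, whereas INTERSECT removes duplicates.

```python
import sqlite3

# In-memory database loaded with the Boats and Reserves rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Boats VALUES (101,'Nina','red'), (102,'Pinta','blue'),
                         (103,'Santa Maria','red'), (105,'Titanic','green');
INSERT INTO Reserves VALUES (1,101,'9/12'), (2,103,'9/13'), (1,105,'9/13');
""")

intersect_sql = """
SELECT R.sid FROM Boats B, Reserves R
WHERE B.color = 'red' AND R.bid = B.bid
INTERSECT
SELECT R.sid FROM Boats B, Reserves R
WHERE B.color = 'green' AND R.bid = B.bid
"""

# DISTINCT is an addition for this sketch, not part of the slide query.
self_join_sql = """
SELECT DISTINCT R1.sid
FROM Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE B1.color = 'red' AND B1.bid = R1.bid
  AND B2.color = 'green' AND B2.bid = R2.bid
  AND R1.sid = R2.sid
"""

intersect_sids = [row[0] for row in conn.execute(intersect_sql)]
self_join_sids = [row[0] for row in conn.execute(self_join_sql)]
print(intersect_sids, self_join_sids)
```

Both lists contain only sailor 1, the one sailor with a red (101) and a green (105) reservation.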
Nested Queries
• The WHERE clause can itself contain an SQL query! (Actually, so can the FROM and HAVING clauses.)
e.g. Find the names of sailors who've reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid = 103)
First compute the set of all sailors that have reserved boat 103…
…and then check the sid of each tuple in Sailors to see if it is in that set.
Reserves
sid  bid  day
1    101  9/12
2    103  9/13
1    105  9/13

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

Subquery result (sailors who reserved boat 103): sid = 2
Query result: sname = Bilbo
Nested Queries
• This nested query was uncorrelated because the subquery does not refer to anything in the enclosing query.
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid = 103)
It can be evaluated once and then checked for each tuple in the enclosing query.
Nested Queries with correlation
• Nested queries can also be correlated; the subquery refers to the enclosing query.
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid = 103 AND R.sid = S.sid)
The subquery must be re-evaluated for each tuple in the enclosing query.
EXISTS is a set operator that is true if the result of the set expression has at least one tuple.
Nested Queries
e.g. Find the names of sailors who’ve reserved boat #103:
Reserves
sid  bid  day
1    101  9/12
2    103  9/13
1    105  9/13

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid = 103 AND R.sid = S.sid)

For S.sid = 2 (Bilbo), the subquery returns:
sid  bid  day
2    103  9/13

Notice that this query computes the same answer as the previous query!
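To see that the uncorrelated IN form and the correlated EXISTS form agree, both can be run on the slide data. This is a small sketch assuming Python's built-in sqlite3 module; the variable names are illustrative only.

```python
import sqlite3

# In-memory database loaded with the Sailors and Reserves rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27);
INSERT INTO Reserves VALUES (1,101,'9/12'), (2,103,'9/13'), (1,105,'9/13');
""")

# Uncorrelated: the subquery never mentions S, so it can be evaluated once.
uncorrelated_sql = """
SELECT S.sname FROM Sailors S
WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
"""

# Correlated: the subquery refers to S.sid, so it depends on the outer tuple.
correlated_sql = """
SELECT S.sname FROM Sailors S
WHERE EXISTS (SELECT * FROM Reserves R
              WHERE R.bid = 103 AND R.sid = S.sid)
"""

in_names = [row[0] for row in conn.execute(uncorrelated_sql)]
exists_names = [row[0] for row in conn.execute(correlated_sql)]
print(in_names, exists_names)
```

Both queries return only Bilbo, the single sailor with a reservation for boat 103.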
Set-Comparison Operators
• <tuple expression> IN <set expression>
  – True if <tuple> is a member of <set>
  – Also, NOT IN
• EXISTS <set expression>
  – True if <set expression> evaluates to a set with at least one member
  – Also, NOT EXISTS
• UNIQUE <set expression>
  – True if <set expression> evaluates to a set with no duplicates; each row can appear exactly once
  – Also, NOT UNIQUE
• <tuple expression> comparison op ANY <set expression>
  – True if <set expression> contains at least one member that makes the comparison true
  – Also, op ALL
Use NOT EXISTS for Division
Find sailors who've reserved all boats. X = set of sailors and Y = set of all boats with reservations.
Recall: X/Y means only give me the X tuples that have a match with every tuple in Y.
SELECT S.sname                                    -- Find Sailors S such that ...
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid                    -- there is no boat B ...
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid  -- without a reservation by Sailor S
                                    FROM Reserves R
                                    WHERE R.bid = B.bid AND R.sid = S.sid))
Division
Reserves
sid  bid  day
1    103  9/12
2    103  9/13
3    103  9/14
3    101  9/12
1    103  9/13

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

Boats
bid  bname  color
101  Nina   red
103  Pinta  blue

SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid
                                    FROM Reserves R
                                    WHERE R.bid = B.bid AND R.sid = S.sid))
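The double NOT EXISTS pattern can be traced on the Division slide's tables. The sketch below assumes Python's built-in sqlite3 module; the variable names are illustrative only.

```python
import sqlite3

# In-memory database loaded with the rows from the Division slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27);
INSERT INTO Boats VALUES (101,'Nina','red'), (103,'Pinta','blue');
INSERT INTO Reserves VALUES (1,103,'9/12'), (2,103,'9/13'), (3,103,'9/14'),
                            (3,101,'9/12'), (1,103,'9/13');
""")

# A sailor qualifies when no boat exists that the sailor has not reserved.
division_sql = """
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid
                                    FROM Reserves R
                                    WHERE R.bid = B.bid AND R.sid = S.sid))
"""

all_boat_sailors = [row[0] for row in conn.execute(division_sql)]
print(all_boat_sailors)
```

Frodo and Bilbo have reserved only boat 103, so boat 101 is a counterexample for each of them; Sam has reserved both 101 and 103 and is the only name returned.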
UNIQUE
Find the names of sailors who've reserved boat #103 at most once (UNIQUE is also true when the subquery returns no rows)
SELECT S.sname
FROM Sailors S
WHERE UNIQUE (SELECT sid, bid
              FROM Reserves R
              WHERE R.bid = 103 AND S.sid = R.sid)

Reserves
sid  bid  day
1    103  9/12
2    103  9/13
1    103  9/13

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

For S.sid = 2 (Bilbo), exactly one Reserves tuple matches:
sid  bid  day
2    103  9/13
ANY
Find sailors whose rating is greater than that of some sailor called Bilbo:
SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname = 'Bilbo')

Sailors (scanned as both S and S2)
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

S2 tuples selected by the subquery
sid  sname  rating  age
2    Bilbo  2       39

Correlated or uncorrelated?
Uncorrelated!
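The meaning of ANY ("at least one member makes the comparison true") can be spelled out as an EXISTS test. The sketch below assumes Python's built-in sqlite3 module, which to my knowledge does not implement the ANY operator, so the query is rewritten into an equivalent EXISTS form rather than run as written on the slide. The ORDER BY and the variable names are additions for illustration.

```python
import sqlite3

# In-memory database loaded with the Sailors rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27);
""")

# "S.rating > ANY (subquery)" holds when some subquery row satisfies the
# comparison, which is what this EXISTS rewrite states directly.
any_as_exists_sql = """
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT * FROM Sailors S2
              WHERE S2.sname = 'Bilbo' AND S.rating > S2.rating)
ORDER BY S.sid
"""

better_than_some_bilbo = [row[0] for row in conn.execute(any_as_exists_sql)]
print(better_than_some_bilbo)
```

Frodo (7) and Sam (8) outrank Bilbo (2); Bilbo does not outrank himself. The rewrite is correlated even though the original ANY query is not.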
Aggregate Operators
• Very powerful; enables computations over sets of tuples
SELECT AVG(S.age)
FROM Sailors S
WHERE S.rating = 10

SELECT COUNT(*)
FROM Sailors S
• COUNT: returns a count of tuples in the set
• AVG: returns average of column values in the set
• SUM: returns sum of column values in the set
• MIN, MAX: returns min (max) value of column values in a set.
• DISTINCT can be added to COUNT, AVG, SUM to perform the computation only over distinct values.
SELECT AVG(DISTINCT S.age)
FROM Sailors S
WHERE S.rating = 10
Aggregate Operators
Find name and age of the oldest sailor(s)
SELECT S.sname, MAX(S.age)
FROM Sailors S
What will the result be?

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

sname  age
Frodo  39
Bilbo  39
Sam    39

No: this is not legal syntax. A SELECT clause that uses an aggregate may not list other columns without a GROUP BY clause (we'll learn about those next).
Aggregate Operators
Find name and age of the oldest sailor(s)
Instead:
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX(S2.age)
               FROM Sailors S2)
Find the maximum age (the subquery)…
And then find the sailor(s) of that age (the outer query)…
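The subquery formulation can be run on the slide's Sailors table. A minimal sketch, assuming Python's built-in sqlite3 module; the variable names are illustrative only.

```python
import sqlite3

# In-memory database loaded with the Sailors rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27);
""")

# The scalar subquery computes the maximum age; the outer query then
# returns every sailor whose age equals it.
oldest_sql = """
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
"""

oldest = conn.execute(oldest_sql).fetchall()
print(oldest)
```

If several sailors shared the maximum age, all of them would be returned, which is why the slide says "sailor(s)".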
GROUP BY and HAVING
• So far, we've seen aggregate operators applied to all tuples.
  – What if we want to apply ops to each of several groups of tuples?
• Consider: Find the age of the youngest sailor for each rating level.
  – In general, we don't know how many rating levels exist, and what the rating values for these levels are!
  – Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!):
For i = 1, 2, ... , 10:
SELECT MIN(S.age)
FROM Sailors S
WHERE S.rating = i
Queries With GROUP BY
• To generate values for a column based on groups of rows, use aggregate functions in SELECT statements with the GROUP BY clause
SELECT [DISTINCT] target-list
FROM relation-list
[WHERE qualification]
GROUP BY grouping-list
target-list contains:
• list of column names from grouping-list
• terms with aggregate operations (e.g., MIN(S.age)).
First select the rows that satisfy the WHERE qualification…
Then group them by the values in the grouping-list columns…
And finally compute the aggregate function over each group…
Returning 1 row per group
Group By Example
For each rating, find the age of the youngest sailor with age >= 18
SELECT S.rating, MIN(S.age)
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating

Sailors
sid  sname   rating  age
1    Frodo   7       22
2    Bilbo   2       39
3    Sam     8       27
4    Pippin  2       21
5    Merry   8       17

Rows with age >= 18, grouped by rating
sid  sname   rating  age
1    Frodo   7       22    Group 1
2    Bilbo   2       39    Group 2
4    Pippin  2       21    Group 2
3    Sam     8       27    Group 3

Result
rating  MIN(age)
7       22
2       21
8       27
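The select, group, aggregate sequence can be reproduced on the five sailors from the slide. This sketch assumes Python's built-in sqlite3 module; the ORDER BY is an addition (not on the slide) so that the printed order is deterministic, and the variable names are illustrative.

```python
import sqlite3

# In-memory database loaded with the five Sailors rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27),
                           (4,'Pippin',2,21), (5,'Merry',8,17);
""")

# WHERE filters rows first (Merry, age 17, is dropped), GROUP BY forms one
# group per rating, and MIN is computed once per group.
youngest_sql = """
SELECT S.rating, MIN(S.age)
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
ORDER BY S.rating
"""

youngest_per_rating = conn.execute(youngest_sql).fetchall()
print(youngest_per_rating)
```

Rating 8 reports age 27 rather than 17 because the WHERE clause removes Merry before any grouping happens.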
Find the number of reservations for each red boat.
SELECT B.bid, COUNT(*) AS tot_res
FROM Boats B, Reserves R
WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid

Reserves
sid  bid  day
1    102  9/12
2    103  9/13
3    101  9/14
4    101  9/14

Boats
bid  bname        color
101  Nina         red
102  Pinta        blue
103  Santa Maria  red

Reservations of red boats, grouped by bid
sid  bid  day
3    101  9/14
4    101  9/14
2    103  9/13

Result
bid  tot_res
103  1
101  2
Queries With GROUP BY and HAVING
• Use the HAVING clause with the GROUP BY clause to restrict which group-rows are returned in the result set
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification
Find the age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors
SELECT S.rating, MIN(S.age)
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) > 1

Sailors
sid  sname    rating  age
22   Dustin   7       45.0
31   lubber   8       55.5
71   zorba    10      16.0
64   horatio  7       35.0
29   brutus   1       33.0
58   rusty    10      35.0

1. Rows with age >= 18
rating  age
1       33.0
7       45.0
7       35.0
8       55.5
10      35.0

2. Grouped by rating
rating  m-age  count
1       33.0   1
7       35.0   2
8       55.5   1
10      35.0   1

3. Groups with count > 1
Answer
rating  m-age
7       35.0
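The three evaluation steps (filter rows, group, filter groups) can be checked on the six sailors from the slide. A minimal sketch, assuming Python's built-in sqlite3 module; the variable names are illustrative only.

```python
import sqlite3

# In-memory database loaded with the six Sailors rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0), (31,'lubber',8,55.5),
                           (71,'zorba',10,16.0), (64,'horatio',7,35.0),
                           (29,'brutus',1,33.0), (58,'rusty',10,35.0);
""")

# WHERE removes individual rows before grouping; HAVING removes whole
# groups after the aggregates have been computed.
having_sql = """
SELECT S.rating, MIN(S.age)
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) > 1
"""

qualifying_groups = conn.execute(having_sql).fetchall()
print(qualifying_groups)
```

Rating 10 has two sailors in the table, but zorba (16.0) is removed by WHERE, so that group has only one member when HAVING is applied and is dropped.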
Sailors who have reserved all boats

Sailors
sid  sname  rating  age
1    Frodo  7       22
2    Bilbo  2       39
3    Sam    8       27

Reserves
sid  bid  day
1    102  9/12
2    102  9/12
2    101  9/14
1    102  9/10
2    103  9/13

Boats
bid  bname        color
101  Nina         red
102  Pinta        blue
103  Santa Maria  red

SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
GROUP BY S.sname, S.sid
HAVING COUNT(DISTINCT R.bid) = (SELECT COUNT(*) FROM Boats)

Subquery result: count = 3

Join of Sailors and Reserves
sname  sid  bid
Frodo  1    102
Bilbo  2    101
Bilbo  2    102
Frodo  1    102
Bilbo  2    103

Grouped by sname, sid
sname  sid  bid
Frodo  1    102, 102
Bilbo  2    101, 102, 103

COUNT(DISTINCT R.bid) per group
sname  sid  count
Frodo  1    1
Bilbo  2    3

Only Bilbo's count equals the number of boats (3), so the result is Bilbo.
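The counting approach to "reserved all boats" can be run on this slide's tables. The sketch below assumes Python's built-in sqlite3 module and uses the column name sname from the Sailors table header; the variable names are illustrative.

```python
import sqlite3

# In-memory database loaded with the rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (1,'Frodo',7,22), (2,'Bilbo',2,39), (3,'Sam',8,27);
INSERT INTO Boats VALUES (101,'Nina','red'), (102,'Pinta','blue'),
                         (103,'Santa Maria','red');
INSERT INTO Reserves VALUES (1,102,'9/12'), (2,102,'9/12'), (2,101,'9/14'),
                            (1,102,'9/10'), (2,103,'9/13');
""")

# A sailor has reserved every boat when the number of distinct boats in
# that sailor's group equals the total number of boats.
count_sql = """
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
GROUP BY S.sname, S.sid
HAVING COUNT(DISTINCT R.bid) = (SELECT COUNT(*) FROM Boats)
"""

reserved_all = [row[0] for row in conn.execute(count_sql)]
print(reserved_all)
```

DISTINCT matters here: Frodo has two reservations, but both are for boat 102, so his distinct count is 1.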
More about Joins
Explicit join semantics needed unless it is an INNER join (INNER is default)
SELECT (column_list)
FROM table_name
  [INNER | {LEFT | RIGHT | FULL} OUTER] JOIN table_name
  ON qualification_list
WHERE …
Default semantics: Inner Join
Only rows that match search conditions are returned.
SELECT s.sid, s.sname, r.bid
FROM Sailors s INNER JOIN Reserves r
ON s.sid = r.sid
Returns only those sailors who have reserved boats
SQL-92 also allows:
SELECT s.sid, s.sname, r.bid
FROM Sailors s NATURAL JOIN Reserves r
"NATURAL" means equi-join for each pair of attributes with the same name

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
95   Bob     3       63.5

Reserves
sid  bid  day
22   101  10/10/96
95   103  11/12/96

Result
s.sid  s.sname  r.bid
22     Dustin   101
95     Bob      103
Left Outer Join
Left Outer Join returns all matched rows, plus all unmatched rows from the table on the left of the join clause
(use nulls in fields of non-matching tuples)
SELECT s.sid, s.sname, r.bid
FROM Sailors s LEFT OUTER JOIN Reserves r
ON s.sid = r.sid
Returns all sailors & information on whether they have reserved boats

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
95   Bob     3       63.5

Reserves
sid  bid  day
22   101  10/10/96
95   103  11/12/96

Result
s.sid  s.sname  r.bid
22     Dustin   101
95     Bob      103
31     Lubber   (null)
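The difference between the inner and the left outer join shows up directly on the slide data. A minimal sketch, assuming Python's built-in sqlite3 module, where SQL NULL comes back as Python None; the ORDER BY clauses and variable names are additions for illustration.

```python
import sqlite3

# In-memory database loaded with the Sailors and Reserves rows from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0), (31,'Lubber',8,55.5), (95,'Bob',3,63.5);
INSERT INTO Reserves VALUES (22,101,'10/10/96'), (95,103,'11/12/96');
""")

inner_sql = """
SELECT s.sid, s.sname, r.bid
FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid
ORDER BY s.sid
"""

# Same join, but unmatched Sailors rows are kept and padded with NULL.
left_sql = """
SELECT s.sid, s.sname, r.bid
FROM Sailors s LEFT OUTER JOIN Reserves r ON s.sid = r.sid
ORDER BY s.sid
"""

inner_rows = conn.execute(inner_sql).fetchall()
left_rows = conn.execute(left_sql).fetchall()
print(inner_rows)
print(left_rows)
```

Lubber has no reservation, so he is missing from the inner join and appears in the left outer join with a null bid.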
Right Outer Join
Right Outer Join returns all matched rows, plus all unmatched rows from the table on the right of the join clause
SELECT r.sid, b.bid, b.bname
FROM Reserves r RIGHT OUTER JOIN Boats b
ON r.bid = b.bid
Returns all boats & information on which ones are reserved.

Reserves
sid  bid  day
22   101  10/10/96
95   103  11/12/96

Boats
bid  bname      color
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red

Result
r.sid   b.bid  b.bname
22      101    Interlake
(null)  102    Interlake
95      103    Clipper
(null)  104    Marine
Full Outer Join
Full Outer Join returns all (matched or unmatched) rows from the tables on both sides of the join clause
SELECT r.sid, b.bid, b.bname
FROM Reserves r FULL OUTER JOIN Boats b
ON r.bid = b.bid
Returns all boats & all information on reservations

Reserves
sid  bid  day
22   101  10/10/96
95   103  11/12/96

Boats
bid  bname      color
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red

Result
r.sid   b.bid  b.bname
22      101    Interlake
(null)  102    Interlake
95      103    Clipper
(null)  104    Marine

Note: in this case it is the same as the right outer join because bid is a foreign key in Reserves, so all reservations must have a corresponding tuple in Boats.
INSERT
INSERT INTO Boats VALUES (105, 'Clipper', 'purple')
INSERT INTO Boats (bid, color) VALUES (99, 'yellow')
"bulk insert" from one table to another (must be type compatible):
INSERT INTO TEMP(bid)
SELECT r.bid FROM Reserves r WHERE r.sid = 22;
"bulk insert" from files (in Postgres): COPY

INSERT [INTO] table_name [(column_list)]
VALUES (value_list)

INSERT [INTO] table_name [(column_list)]
<select statement>
DELETE & UPDATE
DELETE FROM Boats WHERE color = 'red'

DELETE FROM Boats b
WHERE b.bid =
      (SELECT r.bid FROM Reserves r WHERE r.sid = 22)
-- "=" requires the subquery to return a single row; use IN if it can return several

Can also modify tuples using UPDATE statement.
UPDATE Boats
SET color = 'green'
WHERE bid = 103;

DELETE [FROM] table_name
[WHERE qualification]
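The INSERT, UPDATE and DELETE statements can be tried in sequence on a small Boats table. This sketch assumes Python's built-in sqlite3 module; the three starting rows are borrowed from the earlier Boats slides, and the order in which the statements run is a choice made for this example only.

```python
import sqlite3

# In-memory Boats table seeded with three rows from the earlier slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
INSERT INTO Boats VALUES (101,'Nina','red'), (102,'Pinta','blue'),
                         (103,'Santa Maria','red');
""")

# Full-row insert, then an insert that names only some columns;
# the omitted bname column is left NULL.
conn.execute("INSERT INTO Boats VALUES (105, 'Clipper', 'purple')")
conn.execute("INSERT INTO Boats (bid, color) VALUES (99, 'yellow')")

# Repaint boat 103, then delete whatever is still red (only boat 101).
conn.execute("UPDATE Boats SET color = 'green' WHERE bid = 103")
conn.execute("DELETE FROM Boats WHERE color = 'red'")

boats_after = conn.execute(
    "SELECT bid, bname, color FROM Boats ORDER BY bid").fetchall()
print(boats_after)
```

Because the UPDATE runs before the DELETE, boat 103 is no longer red and survives; swapping the two statements would delete it.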
Null Values
• Values are sometimes
  – unknown (e.g., a rating has not been assigned) or
  – inapplicable (e.g., no spouse's name).
  – SQL provides a special value null for such situations.
• The presence of null complicates many issues. E.g.:
  – Special operators needed to check if value is/is not null.
  – "rating>8" - true or false when rating is null? What about AND, OR and NOT connectives?
  – Need a 3-valued logic (true, false and unknown).
  – Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don't evaluate to true.)
  – New operators (in particular, outer joins) possible/needed.
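The three-valued logic can be observed directly. The sketch below assumes Python's built-in sqlite3 module, where unknown/NULL comes back as None and true/false as 1/0; the ratings (including Sam's missing rating) are hypothetical values chosen for this example, not slide data.

```python
import sqlite3

# Hypothetical ratings: Sam's rating is NULL (not yet assigned).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER);
INSERT INTO Sailors VALUES (1,'Frodo',9), (2,'Bilbo',2), (3,'Sam',NULL);
""")

def names(where_clause):
    """Return the snames of the rows for which the condition is true."""
    sql = "SELECT sname FROM Sailors WHERE " + where_clause + " ORDER BY sid"
    return [row[0] for row in conn.execute(sql)]

high = names("rating > 8")            # unknown for Sam, so he is dropped
not_high = names("NOT (rating > 8)")  # NOT unknown is still unknown
unrated = names("rating IS NULL")     # the special operator for null tests

# unknown AND false = false, unknown OR true = true, NOT unknown = unknown
truth = conn.execute(
    "SELECT NULL > 8, (NULL > 8) AND 0, (NULL > 8) OR 1, NOT (NULL > 8)"
).fetchone()
print(high, not_high, unrated, truth)
```

Sam appears in neither `high` nor `not_high`, since WHERE keeps only rows whose condition evaluates to true; only the IS NULL test finds him.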