
CS 405G: Introduction to Database Systems

SQL II

Instructor: Jinze Liu

Review: SQL DML

• DML includes 4 main statements: SELECT (query), INSERT, UPDATE and DELETE

• e.g.:

    SELECT S.name, E.cid            -- PROJECT
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid             -- JOIN
      AND S.age = 19                -- SELECT

    SELECT R.sid
    FROM Boats B, Reserves R
    WHERE B.color = 'red' AND R.bid = B.bid

    INTERSECT

    SELECT R.sid
    FROM Boats B, Reserves R
    WHERE B.color = 'green' AND R.bid = B.bid

Reserves

    sid  bid  day
    1    101  9/12
    2    103  9/13
    1    105  9/13

Boats

    bid  bname        color
    101  Nina         red
    102  Pinta        blue
    103  Santa Maria  red
    105  Titanic      green

Result

    sid (red): 1, 2    INTERSECT    sid (green): 1    =    sid: 1

Find sids of sailors who've reserved a red and a green boat

Now let's do this with a self-join

Example: Find sailors who have reserved a red and a green boat

    SELECT R1.sid
    FROM Boats B1, Reserves R1, Boats B2, Reserves R2
    WHERE B1.color = 'red' AND B1.bid = R1.bid      -- Find red reserved boats
      AND B2.color = 'green' AND B2.bid = R2.bid    -- Find green reserved boats
      AND R1.sid = R2.sid                           -- Find matching green and red reserved boats

Reserves

    sid  bid  day
    1    101  9/12
    2    103  9/13
    1    105  9/13

Boats

    bid  bname        color
    101  Nina         red
    102  Pinta        blue
    103  Santa Maria  red
    105  Titanic      green

Result

    sid (red): 1, 2    matched with    sid (green): 1    =    sid: 1
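The two formulations above can be checked against each other with a small sketch. It uses Python's built-in sqlite3 module and the Reserves and Boats rows shown on the slides; the column types and the DISTINCT in the self-join are assumptions added here, not part of the original slides.

```python
import sqlite3

# In-memory database loaded with the slide's example rows (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Boats VALUES (101, 'Nina', 'red'), (102, 'Pinta', 'blue'),
                             (103, 'Santa Maria', 'red'), (105, 'Titanic', 'green');
    INSERT INTO Reserves VALUES (1, 101, '9/12'), (2, 103, '9/13'), (1, 105, '9/13');
""")

# Version 1: INTERSECT of the "red" sailors and the "green" sailors.
intersect_rows = conn.execute("""
    SELECT R.sid FROM Boats B, Reserves R
    WHERE B.color = 'red' AND R.bid = B.bid
    INTERSECT
    SELECT R.sid FROM Boats B, Reserves R
    WHERE B.color = 'green' AND R.bid = B.bid
""").fetchall()

# Version 2: self-join. DISTINCT is added here (an assumption) because a sailor
# with several red or green reservations would otherwise appear more than once.
self_join_rows = conn.execute("""
    SELECT DISTINCT R1.sid
    FROM Boats B1, Reserves R1, Boats B2, Reserves R2
    WHERE B1.color = 'red' AND B1.bid = R1.bid
      AND B2.color = 'green' AND B2.bid = R2.bid
      AND R1.sid = R2.sid
""").fetchall()

print(intersect_rows, self_join_rows)
```

INTERSECT removes duplicates on its own, which is why only the self-join needs DISTINCT to behave the same way on larger data.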

Nested Queries

• The WHERE clause can itself contain an SQL query! (Actually, so can the FROM and HAVING clauses.)

e.g. Find the names of sailors who've reserved boat #103:

    SELECT S.sname
    FROM Sailors S
    WHERE S.sid IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid = 103)

First compute the set of all sailors that have reserved boat 103…

…and then check the sid of each tuple in Sailors to see if it is in that set.

Reserves

    sid  bid  day
    1    101  9/12
    2    103  9/13
    1    105  9/13

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

Subquery result

    sid
    2

Query result

    sname
    Bilbo

Nested Queries

• This nested query was uncorrelated because the subquery does not refer to anything in the enclosing query.

    SELECT S.sname
    FROM Sailors S
    WHERE S.sid IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid = 103)

It can be evaluated once and then checked for each tuple in the enclosing query.

Nested Queries with correlation

• Nested queries can also be correlated: the subquery refers to the enclosing query.

    SELECT S.sname
    FROM Sailors S
    WHERE EXISTS (SELECT *
                  FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)

The subquery must be re-evaluated for each tuple in the enclosing query.

EXISTS is a set operator that is true if the result of the set expression has at least one tuple.

Nested Queries

e.g. Find the names of sailors who've reserved boat #103:

    SELECT S.sname
    FROM Sailors S
    WHERE EXISTS (SELECT *
                  FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)

Reserves

    sid  bid  day
    1    101  9/12
    2    103  9/13
    1    105  9/13

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

Matching Reserves tuple

    sid  bid  day
    2    103  9/13

Notice that this query computes the same answer as the previous query!
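A minimal sketch of the comparison above, assuming Python's sqlite3 module and the Sailors and Reserves rows from the slides (column types are assumptions). It runs the uncorrelated IN form and the correlated EXISTS form side by side.

```python
import sqlite3

# Example rows copied from the slides; the schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (1, 'Frodo', 7, 22), (2, 'Bilbo', 2, 39), (3, 'Sam', 8, 27);
    INSERT INTO Reserves VALUES (1, 101, '9/12'), (2, 103, '9/13'), (1, 105, '9/13');
""")

# Uncorrelated: the subquery does not mention S, so it can be evaluated once.
in_rows = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
""").fetchall()

# Correlated: the subquery refers to S.sid from the enclosing query.
exists_rows = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)
""").fetchall()

print(in_rows, exists_rows)
```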

Set-Comparison Operators

• <tuple expression> IN <set expression>
  – True if <tuple> is a member of <set>
  – Also, NOT IN

• EXISTS <set expression>
  – True if <set expression> evaluates to a set with at least one member
  – Also, NOT EXISTS

• UNIQUE <set expression>
  – True if <set expression> evaluates to a set with no duplicates; each row can appear exactly once
  – Also, NOT UNIQUE

• <tuple expression> comparison op ANY <set expression>
  – True if <set expression> contains at least one member that makes the comparison true
  – Also, op ALL

Use NOT EXISTS for Division

Find sailors who've reserved all boats.

    SELECT S.sname                                        -- Find Sailors S such that ...
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid                        -- there is no boat B ...
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid      -- without a reservation by Sailor S
                                        FROM Reserves R
                                        WHERE R.bid = B.bid
                                          AND R.sid = S.sid))

X = set of sailors and Y = set of all boats with reservations.

Recall: X/Y means only give me the X tuples that are matched with every tuple in Y.

Division

    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid
                                        FROM Reserves R
                                        WHERE R.bid = B.bid
                                          AND R.sid = S.sid))

Reserves

    sid  bid  day
    1    103  9/12
    2    103  9/13
    3    103  9/14
    3    101  9/12
    1    103  9/13

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

Boats

    bid  bname  color
    101  Nina   red
    103  Pinta  blue
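The double NOT EXISTS pattern can be traced on the slide's data with a short sketch. It assumes Python's sqlite3 module; the rows are the ones shown above and the column types are assumptions.

```python
import sqlite3

# Rows from the Division slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (1, 'Frodo', 7, 22), (2, 'Bilbo', 2, 39), (3, 'Sam', 8, 27);
    INSERT INTO Boats VALUES (101, 'Nina', 'red'), (103, 'Pinta', 'blue');
    INSERT INTO Reserves VALUES (1, 103, '9/12'), (2, 103, '9/13'), (3, 103, '9/14'),
                                (3, 101, '9/12'), (1, 103, '9/13');
""")

# Keep a sailor only if no boat exists that the sailor has not reserved.
all_boats_rows = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid
                                        FROM Reserves R
                                        WHERE R.bid = B.bid
                                          AND R.sid = S.sid))
""").fetchall()

print(all_boats_rows)
```

Frodo and Bilbo have reserved only boat 103, so boat 101 survives the inner NOT EXISTS for them and they are filtered out; only the sailor with reservations for both 101 and 103 remains.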

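UNIQUE

Find the names of sailors who've reserved boat #103 exactly once

    SELECT S.sname
    FROM Sailors S
    WHERE UNIQUE (SELECT sid, bid
                  FROM Reserves R
                  WHERE R.bid = 103 AND S.sid = R.sid)

Reserves

    sid  bid  day
    1    103  9/12
    2    103  9/13
    1    103  9/13

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

Reserves tuple for the sailor with exactly one reservation of boat 103

    sid  bid  day
    2    103  9/13

Note: UNIQUE is also true when the subquery result is empty, so as written the query also accepts a sailor with no reservation for boat #103.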

ANY

Find sailors whose rating is greater than that of some sailor called Bilbo:

    SELECT *
    FROM Sailors S
    WHERE S.rating > ANY (SELECT S2.rating
                          FROM Sailors S2
                          WHERE S2.sname = 'Bilbo')

Sailors (scanned once as S and once as S2)

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

S2 tuples with sname = 'Bilbo'

    sid  sname  rating  age
    2    Bilbo  2       39

Correlated or uncorrelated?

Uncorrelated!

Aggregate Operators

• Very powerful; enables computations over sets of tuples

    SELECT AVG(S.age)
    FROM Sailors S
    WHERE S.rating = 10

    SELECT COUNT(*)
    FROM Sailors S

• COUNT: returns a count of tuples in the set

• AVG: returns average of column values in the set

• SUM: returns sum of column values in the set

• MIN, MAX: return the min (max) of the column values in the set

• DISTINCT can be added to COUNT, AVG, SUM to perform computation only over distinct values.

    SELECT AVG(DISTINCT S.age)
    FROM Sailors S
    WHERE S.rating = 10

Aggregate Operators

Find name and age of the oldest sailor(s)

    SELECT S.sname, MAX(S.age)
    FROM Sailors S

What will the result be?

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

    sname  age
    Frodo  39
    Bilbo  39
    Sam    39

X  Not legal syntax; no other columns allowed in SELECT clause without a GROUP BY clause (we'll learn about those next)

Aggregate Operators

Find name and age of the oldest sailor(s)

Instead:

    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age)
                   FROM Sailors S2)

Find the maximum age (the subquery)…

And then find the sailor(s) of that age (the outer query)…
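As a quick check of the subquery form, here is a sketch using Python's sqlite3 module and the three Sailors rows from the slide (column types are assumptions). The illegal `SELECT S.sname, MAX(S.age)` form is deliberately not run, since engines differ in how they treat it.

```python
import sqlite3

# Sailors rows from the slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    INSERT INTO Sailors VALUES (1, 'Frodo', 7, 22), (2, 'Bilbo', 2, 39), (3, 'Sam', 8, 27);
""")

# The scalar subquery finds the maximum age; the outer query keeps sailors of that age.
oldest_rows = conn.execute("""
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
""").fetchall()

print(oldest_rows)
```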

GROUP BY and HAVING

• So far, we've seen aggregate operators applied to all tuples.
  – What if we want to apply ops to each of several groups of tuples?

• Consider: Find the age of the youngest sailor for each rating level.
  – In general, we don't know how many rating levels exist, and what the rating values for these levels are!
  – Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!):

    For i = 1, 2, ... , 10:

        SELECT MIN(S.age)
        FROM Sailors S
        WHERE S.rating = i

Queries With GROUP BY

    SELECT [DISTINCT] target-list
    FROM relation-list
    [WHERE qualification]
    GROUP BY grouping-list

• To generate values for a column based on groups of rows, use aggregate functions in SELECT statements with the GROUP BY clause

target-list contains:
• list of column names from grouping-list
• terms with aggregate operations (e.g., MIN(S.age)).

First select these rows (WHERE)…

Then group them by the values in these columns (GROUP BY)…

And finally compute aggregate function over each group (target-list)…

Returning 1 row per group

Group By Example

For each rating, find the age of the youngest sailor with age >= 18

    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating

Sailors

    sid  sname   rating  age
    1    Frodo   7       22
    2    Bilbo   2       39
    3    Sam     8       27
    4    Pippin  2       21
    5    Merry   8       17

Groups (after removing sailors with age < 18)

    Group 1:  1  Frodo   7  22
    Group 2:  2  Bilbo   2  39
              4  Pippin  2  21
    Group 3:  3  Sam     8  27

Result

    rating  MIN(age)
    7       22
    2       21
    8       27
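The grouping steps above can be reproduced with a small sketch, assuming Python's sqlite3 module and the five Sailors rows from the slide. The ORDER BY is an addition made here so the output order is deterministic; the slide's query leaves row order unspecified.

```python
import sqlite3

# Sailors rows from the Group By Example slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    INSERT INTO Sailors VALUES (1, 'Frodo', 7, 22), (2, 'Bilbo', 2, 39), (3, 'Sam', 8, 27),
                               (4, 'Pippin', 2, 21), (5, 'Merry', 8, 17);
""")

# WHERE filters rows first, GROUP BY forms one group per rating,
# and MIN is computed once per group.
youngest_rows = conn.execute("""
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    ORDER BY S.rating   -- added for a stable output order (not on the slide)
""").fetchall()

print(youngest_rows)
```

Merry (age 17) is dropped by the WHERE clause before grouping, so the rating-8 group contains only Sam.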

Find the number of reservations for each red boat.

    SELECT B.bid, COUNT(*) AS tot_res
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid AND B.color = 'red'
    GROUP BY B.bid

Reserves

    sid  bid  day
    1    102  9/12
    2    103  9/13
    3    101  9/14
    4    101  9/14

Boats

    bid  bname        color
    101  Nina         red
    102  Pinta        blue
    103  Santa Maria  red

Reservations of red boats, grouped by bid

    sid  bid  day
    3    101  9/14
    4    101  9/14

    2    103  9/13

Result

    bid  tot_res
    103  1
    101  2
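A sketch of the same count, assuming Python's sqlite3 module and the Reserves and Boats rows from the slide (column types are assumptions). An ORDER BY is added here only to make the printed order predictable.

```python
import sqlite3

# Rows from the slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Boats VALUES (101, 'Nina', 'red'), (102, 'Pinta', 'blue'),
                             (103, 'Santa Maria', 'red');
    INSERT INTO Reserves VALUES (1, 102, '9/12'), (2, 103, '9/13'),
                                (3, 101, '9/14'), (4, 101, '9/14');
""")

# Join, keep red boats, then count the joined rows in each bid group.
red_counts = conn.execute("""
    SELECT B.bid, COUNT(*) AS tot_res
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid AND B.color = 'red'
    GROUP BY B.bid
    ORDER BY B.bid      -- added for a stable output order (not on the slide)
""").fetchall()

print(red_counts)
```

Because this is an inner join, a red boat with no reservations would not appear at all rather than appearing with a count of 0.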

Queries With GROUP BY and HAVING

• Use the HAVING clause with the GROUP BY clause to restrict which group-rows are returned in the result set

    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
    GROUP BY grouping-list
    HAVING group-qualification

Find the age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors

    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1

Sailors

    sid  sname    rating  age
    22   Dustin   7       45.0
    31   lubber   8       55.5
    71   zorba    10      16.0
    64   horatio  7       35.0
    29   brutus   1       33.0
    58   rusty    10      35.0

2. Rows with age >= 18

    rating  age
    1       33.0
    7       45.0
    7       35.0
    8       55.5
    10      35.0

3. Groups, with minimum age and count

    rating  m-age  count
    1       33.0   1
    7       35.0   2
    8       55.5   1
    10      35.0   1

Answer

    rating  m-age
    7       35.0
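The WHERE, GROUP BY, HAVING pipeline above can be run as a sketch with Python's sqlite3 module, using the six Sailors rows from the slide (column types are assumptions).

```python
import sqlite3

# Sailors rows from the HAVING slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'lubber', 8, 55.5),
                               (71, 'zorba', 10, 16.0), (64, 'horatio', 7, 35.0),
                               (29, 'brutus', 1, 33.0), (58, 'rusty', 10, 35.0);
""")

# HAVING is applied to whole groups, after WHERE has filtered individual rows.
having_rows = conn.execute("""
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
""").fetchall()

print(having_rows)
```

Rating 10 has two sailors in the table, but zorba (age 16.0) is removed by WHERE before grouping, so that group has a count of 1 and is rejected by HAVING.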

Sailors who have reserved all boats

    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
    GROUP BY S.sname, S.sid
    HAVING COUNT(DISTINCT R.bid) = (SELECT COUNT(*) FROM Boats)

Sailors

    sid  sname  rating  age
    1    Frodo  7       22
    2    Bilbo  2       39
    3    Sam    8       27

Reserves

    sid  bid  day
    1    102  9/12
    2    102  9/12
    2    101  9/14
    1    102  9/10
    2    103  9/13

Boats

    bid  bname        color
    101  Nina         red
    102  Pinta        blue
    103  Santa Maria  red

Subquery result (number of boats)

    count
    3

Join of Sailors and Reserves

    sname  sid  bid
    Frodo  1    102
    Bilbo  2    101
    Bilbo  2    102
    Frodo  1    102
    Bilbo  2    103

Grouped by sname, sid

    sname  sid  bid
    Frodo  1    102, 102
    Bilbo  2    101, 102, 103

COUNT(DISTINCT R.bid) per group

    sname  sid  count
    Frodo  1    1
    Bilbo  2    3
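The counting approach can be checked with a sketch that assumes Python's sqlite3 module and the rows shown above (column types are assumptions). It also prints the per-sailor distinct-boat counts so the HAVING comparison is visible.

```python
import sqlite3

# Rows from the slide; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (1, 'Frodo', 7, 22), (2, 'Bilbo', 2, 39), (3, 'Sam', 8, 27);
    INSERT INTO Boats VALUES (101, 'Nina', 'red'), (102, 'Pinta', 'blue'),
                             (103, 'Santa Maria', 'red');
    INSERT INTO Reserves VALUES (1, 102, '9/12'), (2, 102, '9/12'), (2, 101, '9/14'),
                                (1, 102, '9/10'), (2, 103, '9/13');
""")

# Distinct boats reserved per sailor (the intermediate table on the slide).
per_sailor = conn.execute("""
    SELECT S.sname, S.sid, COUNT(DISTINCT R.bid)
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
    GROUP BY S.sname, S.sid
    ORDER BY S.sid
""").fetchall()

# A sailor has reserved all boats when that count equals the number of boats.
all_boats = conn.execute("""
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
    GROUP BY S.sname, S.sid
    HAVING COUNT(DISTINCT R.bid) = (SELECT COUNT(*) FROM Boats)
""").fetchall()

print(per_sailor, all_boats)
```

DISTINCT matters here: Frodo has two reservations, but both are for boat 102, so his count is 1.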

More about Joins

    SELECT (column_list)
    FROM table_name
         [INNER | {LEFT | RIGHT | FULL} OUTER] JOIN table_name
         ON qualification_list
    WHERE …

Explicit join semantics needed unless it is an INNER join (INNER is default)

Default semantics: Inner Join

Only rows that match search conditions are returned.

    SELECT s.sid, s.sname, r.bid
    FROM Sailors s INNER JOIN Reserves r
    ON s.sid = r.sid

Returns only those sailors who have reserved boats

SQL-92 also allows:

    SELECT s.sid, s.sname, r.bid
    FROM Sailors s NATURAL JOIN Reserves r

"NATURAL" means equi-join for each pair of attributes with the same name

Sailors

    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    95   Bob     3       63.5

Reserves

    sid  bid  day
    22   101  10/10/96
    95   103  11/12/96

Result

    s.sid  s.sname  r.bid
    22     Dustin   101
    95     Bob      103

Left Outer Join

Left Outer Join returns all matched rows, plus all unmatched rows from the table on the left of the join clause

(use nulls in fields of non-matching tuples)

    SELECT s.sid, s.sname, r.bid
    FROM Sailors s LEFT OUTER JOIN Reserves r
    ON s.sid = r.sid

Returns all sailors & information on whether they have reserved boats

Sailors

    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    95   Bob     3       63.5

Reserves

    sid  bid  day
    22   101  10/10/96
    95   103  11/12/96

Result

    s.sid  s.sname  r.bid
    22     Dustin   101
    95     Bob      103
    31     Lubber   (null)
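The difference between the inner and left outer join can be seen by running both on the slide's data. This is a sketch assuming Python's sqlite3 module; column types and the ORDER BY clauses are additions made here. Only INNER and LEFT OUTER joins are shown because older SQLite versions do not support RIGHT or FULL OUTER JOIN.

```python
import sqlite3

# Rows from the join slides; schema types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                               (95, 'Bob', 3, 63.5);
    INSERT INTO Reserves VALUES (22, 101, '10/10/96'), (95, 103, '11/12/96');
""")

# Inner join: only sailors with a matching reservation.
inner_rows = conn.execute("""
    SELECT s.sid, s.sname, r.bid
    FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid
    ORDER BY s.sid
""").fetchall()

# Left outer join: every sailor, with NULL (None in Python) where nothing matches.
left_rows = conn.execute("""
    SELECT s.sid, s.sname, r.bid
    FROM Sailors s LEFT OUTER JOIN Reserves r ON s.sid = r.sid
    ORDER BY s.sid
""").fetchall()

print(inner_rows)
print(left_rows)
```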

Right Outer Join

Right Outer Join returns all matched rows, plus all unmatched rows from the table on the right of the join clause

    SELECT r.sid, b.bid, b.bname
    FROM Reserves r RIGHT OUTER JOIN Boats b
    ON r.bid = b.bid

Returns all boats & information on which ones are reserved.

Reserves

    sid  bid  day
    22   101  10/10/96
    95   103  11/12/96

Boats

    bid  bname      color
    101  Interlake  blue
    102  Interlake  red
    103  Clipper    green
    104  Marine     red

Result

    r.sid   b.bid  b.bname
    22      101    Interlake
    (null)  102    Interlake
    95      103    Clipper
    (null)  104    Marine

Full Outer Join

Full Outer Join returns all (matched or unmatched) rows from the tables on both sides of the join clause

    SELECT r.sid, b.bid, b.bname
    FROM Reserves r FULL OUTER JOIN Boats b
    ON r.bid = b.bid

Returns all boats & all information on reservations

Reserves

    sid  bid  day
    22   101  10/10/96
    95   103  11/12/96

Boats

    bid  bname      color
    101  Interlake  blue
    102  Interlake  red
    103  Clipper    green
    104  Marine     red

Result

    r.sid   b.bid  b.bname
    22      101    Interlake
    (null)  102    Interlake
    95      103    Clipper
    (null)  104    Marine

Note: in this case it is the same as the ROJ because bid is a foreign key in Reserves, so all reservations must have a corresponding tuple in Boats.

INSERT

    INSERT [INTO] table_name [(column_list)]
    VALUES (value_list)

    INSERT [INTO] table_name [(column_list)]
    <select statement>

    INSERT INTO Boats VALUES (105, 'Clipper', 'purple')
    INSERT INTO Boats (bid, color) VALUES (99, 'yellow')

"bulk insert" from one table to another (must be type compatible):

    INSERT INTO TEMP(bid)
    SELECT r.bid FROM Reserves r WHERE r.sid = 22;

"bulk insert" from files (in Postgres): COPY

DELETE & UPDATE

    DELETE [FROM] table_name
    [WHERE qualification]

    DELETE FROM Boats WHERE color = 'red'

    DELETE FROM Boats b
    WHERE b.bid =
        (SELECT r.bid FROM Reserves r WHERE r.sid = 22)

Can also modify tuples using UPDATE statement.

    UPDATE Boats
    SET color = 'green'
    WHERE bid = 103;
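The INSERT, UPDATE and DELETE statements above can be tried together in a small sketch. It assumes Python's sqlite3 module and a starting Boats table borrowed from the earlier slides (an assumption; these slides do not show a starting state). The statements run in the order INSERT, UPDATE, DELETE so that each one has a visible effect.

```python
import sqlite3

# Hypothetical starting state: three boats taken from earlier slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    INSERT INTO Boats VALUES (101, 'Nina', 'red'), (102, 'Pinta', 'blue'),
                             (103, 'Santa Maria', 'red');
""")

# Full-row insert, then an insert that names only some columns (bname becomes NULL).
conn.execute("INSERT INTO Boats VALUES (105, 'Clipper', 'purple')")
conn.execute("INSERT INTO Boats (bid, color) VALUES (99, 'yellow')")

# Repaint boat 103 before deleting the red boats, so it survives the DELETE.
conn.execute("UPDATE Boats SET color = 'green' WHERE bid = 103")
conn.execute("DELETE FROM Boats WHERE color = 'red'")

boats = conn.execute("SELECT bid, bname, color FROM Boats ORDER BY bid").fetchall()
print(boats)
```

String literals use single quotes throughout; in standard SQL, double quotes denote identifiers rather than strings.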

Null Values

• Values are sometimes
  – unknown (e.g., a rating has not been assigned) or
  – inapplicable (e.g., no spouse's name).
  – SQL provides a special value null for such situations.

• The presence of null complicates many issues. E.g.:
  – Special operators needed to check if value is/is not null.
  – "rating > 8": true or false when rating is null? What about AND, OR and NOT connectives?
  – Need a 3-valued logic (true, false and unknown).
  – Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don't evaluate to true.)
  – New operators (in particular, outer joins) possible/needed.

Null Values – 3 Valued Logic

    (null > 0)      is null
    (null + 1)      is null
    (null = 0)      is null
    null AND true   is null

    AND    T     F     Null
    T      T     F     Null
    F      F     F     F
    Null   Null  F     Null

    OR     T     F     Null
    T      T     T     T
    F      T     F     Null
    Null   T     Null  Null
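A few entries of these truth tables can be checked directly. The sketch below assumes Python's sqlite3 module, where SQL NULL comes back as Python's None and true/false come back as 1/0; the small Sailors table with a missing rating is hypothetical data made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Evaluate individual expressions; None in Python stands for SQL NULL (unknown).
exprs = ["NULL > 0", "NULL + 1", "NULL = 0",
         "NULL AND 1", "NULL AND 0", "NULL OR 1", "NULL OR 0"]
values = {e: conn.execute(f"SELECT {e}").fetchone()[0] for e in exprs}
print(values)

# Hypothetical data: sailor 3 has no rating assigned yet.
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, rating INTEGER);
    INSERT INTO Sailors VALUES (1, 9), (2, 5), (3, NULL);
""")

# WHERE keeps only rows whose condition is true, so the NULL row is dropped
# by both a condition and its negation; IS NULL is needed to find it.
above = conn.execute("SELECT COUNT(*) FROM Sailors WHERE rating > 8").fetchone()[0]
not_above = conn.execute("SELECT COUNT(*) FROM Sailors WHERE NOT (rating > 8)").fetchone()[0]
unrated = conn.execute("SELECT COUNT(*) FROM Sailors WHERE rating IS NULL").fetchone()[0]
print(above, not_above, unrated)
```

The two counts for `rating > 8` and its negation add up to 2, not 3, which is the practical consequence of a WHERE clause eliminating rows that evaluate to unknown.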