Post on 17-Dec-2015
1
SQL: Structured Query Language (‘Sequel’)
Chapter 5
3
Running Example
Instances of the Sailors and Reserves relations in our examples.
S1 (Sailors):
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S2 (Sailors):
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
R1 (Reserves):
sid bid day
22 101 10/10/96
58 103 11/12/96
4
Basic SQL Query
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
• relation-list: a list of relation names.
• target-list: a list of attributes of relations in relation-list.
• qualification: comparisons (Attr op const or Attr1 op Attr2, where op is one of <, >, =, <=, >=, <>) combined using AND, OR and NOT.
• DISTINCT is an optional keyword indicating that the answer should not contain duplicates. The default is that duplicates are not eliminated!
5
Example Query
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
6
Comparisons in Oracle
7
Conceptual Evaluation Strategy
The semantics of an SQL query are defined in terms of the following conceptual evaluation strategy:
• Compute the cross-product of relation-list.
• Discard resulting tuples if they fail qualifications.
• Delete attributes that are not in target-list.
• If DISTINCT is specified, eliminate duplicate rows.
This is probably the least efficient way to compute a query! An optimizer must find more efficient strategies to compute the same answers.
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
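The four conceptual steps can be sketched directly in Python. This is only an illustration of the semantics, not how a real engine evaluates the query; the data are the S1 and R1 instances from the running example, and the helper name `evaluate` is made up for this sketch.

```python
from itertools import product

# Instances S1 (Sailors) and R1 (Reserves) from the running example.
sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

def evaluate(distinct=False):
    # 1. Cross-product of the relation-list.
    rows = product(sailors, reserves)
    # 2. Discard tuples that fail the qualification S.sid=R.sid AND R.bid=103.
    rows = [(s, r) for s, r in rows if s[0] == r[0] and r[1] == 103]
    # 3. Keep only the target-list attribute S.sname.
    names = [s[1] for s, _ in rows]
    # 4. If DISTINCT is requested, drop duplicates (keeping first occurrences).
    return list(dict.fromkeys(names)) if distinct else names

result = evaluate()
print(result)  # ['rusty']
```

Only the pair (58 rusty, 58 103) survives step 2, so the answer is the single name rusty.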
8
Sailors S:
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
Reserves R:
sid bid day
22 101 10/10/96
58 103 11/12/96
Cross-product S × R:
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
10
A Note on Range Variables
Range variables are needed only if the same relation appears twice in the FROM clause. The previous query can also be written as:
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND bid=103
OR
SELECT sname
FROM Sailors, Reserves
WHERE Sailors.sid=Reserves.sid AND bid=103
It is good style, however, to use range variables always!
11
Find sailors who’ve reserved at least one boat
Question: What is the output of this query?
SELECT S.sid
FROM Sailors S, Reserves R
WHERE S.sid=R.sid
13
Expressions and Strings
AS and = are two ways to name fields in the result.
LIKE is used for string matching: `_' stands for any one character and `%' stands for 0 or more arbitrary characters.
Meaning of the query: find sailors (age of sailor and two fields defined by expressions) whose names begin and end with B and contain at least three characters.
SELECT S.age, age1=S.age-5, 2*S.age AS age2
FROM Sailors S
WHERE S.sname LIKE 'B_%B'
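A small runnable sketch of the pattern match, using Python's sqlite3 module. The four sailor rows are hypothetical, chosen only to exercise the pattern. SQLite does not accept the `age1=...` naming form, so both computed fields use AS here, and sname is added to the output to make the matches visible. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which is why the sample names are all upper case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical rows: only BOB and BARB start and end with B and have >= 3 characters.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "BOB", 7, 30.0), (2, "BB", 8, 40.0), (3, "BARB", 9, 50.0), (4, "BART", 5, 20.0)],
)

rows = conn.execute("""
    SELECT S.sname, S.age, S.age - 5 AS age1, 2 * S.age AS age2
    FROM Sailors S
    WHERE S.sname LIKE 'B_%B'
    ORDER BY S.sid
""").fetchall()
print(rows)
```

BB is rejected because `B_%B` needs at least one character between the two B's, and BART is rejected because it does not end with B.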
14
SQL Queries Using Set Operations
15
Find sid’s of sailors who’ve reserved a red or a green boat
SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color='red' OR B.color='green')
UNION: computes the union of any two union-compatible sets of tuples. The same query written with UNION:
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
UNION
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
17
Find sid’s of sailors who’ve reserved a red and a green boat.
If we replace OR by AND as done below, what are the semantics?
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color='red' AND B.color='green')
(No single Boats tuple can be both red and green, so this query always returns an empty result.)
18
Find sid’s of sailors who’ve reserved a red and a green boat
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
INTERSECT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
INTERSECT: computes the intersection of two union-compatible sets of tuples.
What happens if we change S.sid to S.sname? (S.sid is a key field!)
20
Find sid’s of sailors who’ve reserved a red and a green boat
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
INTERSECT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
Could the query be written without INTERSECT?
SELECT S.sid
FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE (S.sid=R1.sid AND R1.bid=B1.bid AND B1.color='red')
  AND (S.sid=R2.sid AND R2.bid=B2.bid AND B2.color='green')
Do we need the Sailors relation in the FROM clauses?
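The three formulations can be compared on a tiny database with sqlite3. The Boats and Reserves rows here are hypothetical (the slides give no Boats instance), and Reserves is simplified to (sid, bid). DISTINCT is added to the join versions because, unlike INTERSECT, a plain join does not remove duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, sname TEXT);
    CREATE TABLE Boats (bid INTEGER, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Hypothetical data: only sailor 22 reserved both a red and a green boat.
    INSERT INTO Sailors VALUES (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
    INSERT INTO Boats VALUES (101, 'red'), (102, 'green'), (103, 'red');
    INSERT INTO Reserves VALUES (22, 101), (22, 102), (31, 101), (58, 102);
""")

with_intersect = conn.execute("""
    SELECT S.sid FROM Sailors S, Boats B, Reserves R
    WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
    INTERSECT
    SELECT S.sid FROM Sailors S, Boats B, Reserves R
    WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
""").fetchall()

with_join = conn.execute("""
    SELECT DISTINCT S.sid
    FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
    WHERE S.sid=R1.sid AND R1.bid=B1.bid AND B1.color='red'
      AND S.sid=R2.sid AND R2.bid=B2.bid AND B2.color='green'
""").fetchall()

# Same join without Sailors: Reserves already carries the sid.
without_sailors = conn.execute("""
    SELECT DISTINCT R1.sid
    FROM Boats B1, Reserves R1, Boats B2, Reserves R2
    WHERE R1.sid=R2.sid AND R1.bid=B1.bid AND B1.color='red'
      AND R2.bid=B2.bid AND B2.color='green'
""").fetchall()
print(with_intersect, with_join, without_sailors)
```

On this data all three return only sailor 22, which suggests that Sailors is not needed when only the sid is requested (assuming every Reserves.sid refers to an existing sailor).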
22
Find sid’s of sailors who’ve reserved a red and not a green boat
What if we want to compute the DIFFERENCE instead? Use EXCEPT:
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
EXCEPT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
EXCEPT is in the SQL/92 standard, but some systems don’t support it. What then? A candidate rewrite without EXCEPT:
SELECT S.sid
FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE S.sid=R1.sid AND R1.bid=B1.bid AND S.sid=R2.sid AND R2.bid=B2.bid
  AND (B1.color='red' AND NOT (B2.color='green'))
23
Is the above query without EXCEPT correct or not?
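One way to probe the question is to run both versions on a small database. The sketch below uses sqlite3 with hypothetical Boats and Reserves rows (Reserves simplified to (sid, bid)); sailor 22 has reserved both a red and a green boat, so 22 should not appear in the answer. A NOT IN variant is included as one possible EXCEPT-free rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, sname TEXT);
    CREATE TABLE Boats (bid INTEGER, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Hypothetical data: 22 reserved red and green, 31 only red, 58 only green.
    INSERT INTO Sailors VALUES (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
    INSERT INTO Boats VALUES (101, 'red'), (102, 'green'), (103, 'red');
    INSERT INTO Reserves VALUES (22, 101), (22, 102), (31, 101), (58, 102);
""")

def sids(sql):
    """Run a one-column query and return its values as a sorted, duplicate-free list."""
    return sorted({row[0] for row in conn.execute(sql)})

red = ("SELECT S.sid FROM Sailors S, Boats B, Reserves R "
       "WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'")
green = red.replace("'red'", "'green'")

except_ids = sids(f"{red} EXCEPT {green}")

flat_ids = sids("""
    SELECT S.sid
    FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
    WHERE S.sid=R1.sid AND R1.bid=B1.bid AND S.sid=R2.sid AND R2.bid=B2.bid
      AND (B1.color='red' AND NOT (B2.color='green'))
""")

not_in_ids = sids(f"{red} AND S.sid NOT IN ({green.replace('S.sid FROM', 'S.sid AS g FROM', 1).replace(' AS g', '')})")
print(except_ids, flat_ids, not_in_ids)
```

On this data the flat query also returns sailor 22: R2 may be the very same red-boat reservation as R1, and that boat is "not green". The condition "some reserved boat is not green" is weaker than "no reserved boat is green", so the flat rewrite is not equivalent to EXCEPT.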
24
About duplication
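Default for the set operations: duplicates are eliminated,
unless we use the special keyword ALL.
Number of copies of a row in the result, when the row appears m times in the first input and n times in the second:
UNION ALL: m+n
INTERSECT ALL: min(m,n)
EXCEPT ALL: m-n (or 0 if n > m)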
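The multiset counts can be checked with Python's `collections.Counter`, whose `+`, `&` and `-` operators happen to follow the same m+n, min(m,n) and m-n rules. This is only an analogy for the ALL semantics, not SQL itself; the rows 'a', 'b', 'c' are made up.

```python
from collections import Counter

# Each Counter maps a row to its number of copies in one input.
left = Counter({"a": 3, "b": 1})   # m copies per row
right = Counter({"a": 2, "c": 1})  # n copies per row

union_all = left + right       # m + n copies
intersect_all = left & right   # min(m, n) copies
except_all = left - right      # m - n copies, rows with a non-positive count are dropped

print(union_all, intersect_all, except_all)
```

Row 'a' appears 3 and 2 times, so it ends up with 5, 2 and 1 copies respectively.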
25
Nested Queries
26
Nested Queries
The WHERE, FROM, and HAVING clauses can themselves contain another SQL query, called a subquery. This is a powerful feature of SQL.
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid=103)
27
Nested Queries Evaluation
Nested loops evaluation: for each Sailors tuple,
• Evaluate the WHERE clause qualification.
• Which here means re-computing the subquery.
28
Nested Queries
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid=103)
Change the query to find sailors who’ve not reserved #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid NOT IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid=103)
29
Nested Queries
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid=103)
Could this query be rewritten without nesting?
30
Nested Queries with Correlation
EXISTS is another set comparison operator, like IN.
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid=103 AND S.sid=R.sid)
31
Nested Queries with Correlation
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid=103 AND S.sid=R.sid)
Why, in general, must this subquery be re-computed for each Sailors tuple?
Could this query be rewritten as a flat SQL query?
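As a sketch of the alternatives, the IN, EXISTS and flat-join formulations can be run side by side with sqlite3 on the S1 and R1 instances of the running example. The helper `names` is introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    -- Instances S1 and R1 from the running example.
    INSERT INTO Sailors VALUES (22,'dustin',7,45.0), (31,'lubber',8,55.5), (58,'rusty',10,35.0);
    INSERT INTO Reserves VALUES (22,101,'10/10/96'), (58,103,'11/12/96');
""")

def names(where, from_clause="Sailors S"):
    """Return snames for the given FROM and WHERE clauses, ordered by sid."""
    sql = f"SELECT S.sname FROM {from_clause} WHERE {where} ORDER BY S.sid"
    return [row[0] for row in conn.execute(sql)]

with_in = names("S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103)")
with_exists = names("EXISTS (SELECT * FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid)")
with_not_in = names("S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid=103)")
# Flat rewrite: a join instead of a subquery.
flat = names("S.sid=R.sid AND R.bid=103", from_clause="Sailors S, Reserves R")
print(with_in, with_exists, flat, with_not_in)
```

The EXISTS subquery mentions S.sid, so its result depends on the current Sailors tuple; that correlation is why it must, in general, be re-evaluated per tuple. One caveat for the flat version: if a sailor reserved boat 103 on several days, the join would list the name once per reservation unless DISTINCT is added.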
33
More on Set-Comparison Operators
op ANY and op ALL, where op is one of <, >, =, <=, >=, <>
IN is equivalent to = ANY
NOT IN is equivalent to <> ALL
Find sailors whose rating is greater than that of some sailor called Horatio:
SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname='Horatio')
34
More on Set-Comparison Operators
Find sailors whose rating is greater than that of all sailors called Horatio:
SELECT *
FROM Sailors S
WHERE S.rating > ALL (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname='Horatio')
What if there is no sailor with that name? “> ALL” returns TRUE if the set is empty.
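SQLite does not implement ANY and ALL, so the sketch below mimics them with Python's built-in `any()` and `all()`, which have the same empty-set behaviour (`any([])` is False, `all([])` is True). The names and ratings are those of the 11-row Sailors instance used later in these slides, where the name is spelled in lower case; the helper functions are hypothetical, and NULL ratings are ignored.

```python
# (sname, rating) pairs from the Sailors instance used in the grouping examples.
sailors = [("dustin", 7), ("brutus", 1), ("lubber", 8), ("andy", 8), ("rusty", 10),
           ("horatio", 7), ("zorba", 10), ("horatio", 9), ("art", 3), ("bob", 3), ("frodo", 3)]

def ratings_of(name):
    """The subquery: ratings of all sailors with the given name."""
    return [rating for sname, rating in sailors if sname == name]

def greater_than_any(name):
    """S.rating > ANY (subquery): true if the rating beats at least one value."""
    return [s for s, r in sailors if any(r > x for x in ratings_of(name))]

def greater_than_all(name):
    """S.rating > ALL (subquery): true if the rating beats every value."""
    return [s for s, r in sailors if all(r > x for x in ratings_of(name))]

print(greater_than_any("horatio"))       # ratings above 7 (the lower horatio rating)
print(greater_than_all("horatio"))       # ratings above 9 (the higher horatio rating)
print(len(greater_than_any("nobody")))   # empty subquery: ANY selects no one
print(len(greater_than_all("nobody")))   # empty subquery: ALL selects everyone
```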
35
Rewriting INTERSECT Queries Using IN
Find sid’s of sailors who’ve reserved both a red and a green boat:
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
  AND S.sid IN (SELECT S2.sid
                FROM Sailors S2, Boats B2, Reserves R2
                WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color='green')
36
Rewriting INTERSECT Queries Using IN
INTERSECT queries can be rewritten using IN.
Find sname’s of sailors who’ve reserved both a red and a green boat:
SELECT S.sname
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
  AND S.sid IN (SELECT S2.sid
                FROM Sailors S2, Boats B2, Reserves R2
                WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color='green')
Note: this query can return the sname (and not just the sid) of sailors who’ve reserved both red and green boats.
37
Division in SQL
Find sailors who’ve reserved all boats.
38
Division in SQL
Find sailors who’ve reserved all boats.
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ((SELECT B.bid
                   FROM Boats B)
                  EXCEPT
                  (SELECT R.bid
                   FROM Reserves R
                   WHERE R.sid=S.sid))
39
Division in SQL
Can we do it without EXCEPT ?
40
Division in SQL
Find sailors who’ve reserved all boats.
Let’s do it without EXCEPT:
(1) With EXCEPT:
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ((SELECT B.bid
                   FROM Boats B)
                  EXCEPT
                  (SELECT R.bid
                   FROM Reserves R
                   WHERE R.sid=S.sid))
(2) Without EXCEPT:
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid
                                    FROM Reserves R
                                    WHERE R.bid=B.bid AND R.sid=S.sid))
Reading query (2): find sailors S such that there is no boat B without a Reserves tuple showing S reserved B.
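Both division queries can be tried with sqlite3. The Boats and Reserves rows are hypothetical (two boats, with only dustin reserving both), and Reserves is simplified to (sid, bid). SQLite does not accept parentheses around the operands of EXCEPT, so query (1) is written without them; this is a syntax adaptation for SQLite, not a change in meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, sname TEXT);
    CREATE TABLE Boats (bid INTEGER);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Hypothetical data: dustin reserved every boat, rusty only one, lubber none.
    INSERT INTO Sailors VALUES (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
    INSERT INTO Boats VALUES (101), (102);
    INSERT INTO Reserves VALUES (22, 101), (22, 102), (58, 101);
""")

# (1) No boat is left after removing the boats S has reserved.
with_except = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                      EXCEPT
                      SELECT R.bid FROM Reserves R WHERE R.sid=S.sid)
""").fetchall()

# (2) There is no boat B without a Reserves tuple showing S reserved B.
without_except = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid
                                        FROM Reserves R
                                        WHERE R.bid=B.bid AND R.sid=S.sid))
""").fetchall()
print(with_except, without_except)
```

Both formulations select only dustin on this data.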
41
Aggregation and Having Clauses
42
Aggregate Operators
Significant extension of relational algebra.
COUNT (*)
COUNT ([DISTINCT] A)
SUM ([DISTINCT] A)
AVG ([DISTINCT] A)
MAX (A)
MIN (A)
Why no DISTINCT for MAX and MIN?
43
Aggregate Operators
SELECT COUNT (*)
FROM Sailors S
SELECT AVG (S.age)
FROM Sailors S
WHERE S.rating=10
SELECT COUNT (DISTINCT S.rating)
FROM Sailors S
WHERE S.sname='Bob'
44
More Examples of Aggregate Operators
SELECT AVG (DISTINCT S.age)
FROM Sailors S
WHERE S.rating=10
SELECT S.sname
FROM Sailors S
WHERE S.rating = (SELECT MAX (S2.rating)
                  FROM Sailors S2)
47
Find name and age of the oldest sailor(s)
First query:
SELECT S.sname, MAX (S.age)
FROM Sailors S
Second query:
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX (S2.age)
               FROM Sailors S2)
• Is the first query legal? No! Why not? (S.sname is neither inside an aggregate nor listed in a GROUP BY clause.)
• Is the second query legal?
48
Motivation for Grouping
So far, we have applied aggregate operators to all (qualifying) tuples.
Question: what if we want to apply an aggregate to each group of tuples?
Example: find the age of the youngest sailor for each rating level.
Example procedure: suppose the rating values are {1,2,…,10}; then we can write 10 queries:
For i = 1, 2, ... , 10:
SELECT MIN (S.age)
FROM Sailors S
WHERE S.rating = i
49
Motivation for Grouping
What’s the problem with the above?
• We don’t know how many rating levels exist.
• Nor what the rating values for these levels are.
50
Add Group By to SQL
51
Queries With GROUP BY and HAVING
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification
A group is a set of tuples that each have the same value for all attributes in grouping-list.
The HAVING clause is a restriction on each group.
52
Queries With GROUP BY and HAVING
The target-list contains: (i) attribute names, (ii) terms of the form aggregate-op (column-name), e.g., MIN (S.age).
REQUIREMENT: the attribute names in (i) must be a subset of grouping-list.
Why? Each answer tuple corresponds to a group, so each of its attributes must have a single value per group.
53
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors
SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1
54
GroupBy --- Conceptual Evaluation
1. Compute the cross-product of relation-list.
2. Discard tuples that fail qualification.
3. Delete `unnecessary' fields.
4. Partition the remaining tuples into groups by the value of attributes in grouping-list (GROUP BY).
5. Eliminate groups using the group-qualification (HAVING).
We will have a single value per group. That is, one answer tuple is generated per qualifying group.
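The five steps can be traced in plain Python for the query "youngest sailor with age >= 18, for each rating with at least 2 such sailors". This is a sketch of the conceptual strategy only; the rows are the 11-tuple Sailors instance shown on the slides that follow, and the variable names are illustrative.

```python
from collections import defaultdict

# (sid, sname, rating, age): the Sailors instance from the slides.
sailors = [
    (22, "dustin", 7, 45.0), (29, "brutus", 1, 33.0), (31, "lubber", 8, 55.5),
    (32, "andy", 8, 25.5), (58, "rusty", 10, 35.0), (64, "horatio", 7, 35.0),
    (71, "zorba", 10, 16.0), (74, "horatio", 9, 35.0), (85, "art", 3, 25.5),
    (95, "bob", 3, 63.5), (96, "frodo", 3, 25.5),
]

# Steps 1-2: one relation, so the cross-product is the relation itself;
# discard tuples that fail the qualification age >= 18.
qualifying = [s for s in sailors if s[3] >= 18]

# Step 3: delete the fields the query does not need.
pairs = [(rating, age) for _, _, rating, age in qualifying]

# Step 4: partition by the grouping-list (rating).
groups = defaultdict(list)
for rating, age in pairs:
    groups[rating].append(age)

# Step 5: keep groups satisfying HAVING COUNT(*) > 1, one answer tuple per group.
answer = sorted((rating, min(ages)) for rating, ages in groups.items() if len(ages) > 1)
print(answer)  # [(3, 25.5), (7, 35.0), (8, 25.5)]
```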
55
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors
SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1
Sailors instance:
sid sname rating age
22 dustin 7 45.0
29 brutus 1 33.0
31 lubber 8 55.5
32 andy 8 25.5
58 rusty 10 35.0
64 horatio 7 35.0
71 zorba 10 16.0
74 horatio 9 35.0
85 art 3 25.5
95 bob 3 63.5
96 frodo 3 25.5
56
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors.
Rating and age fields of the Sailors instance:
rating age
7 45.0
1 33.0
8 55.5
8 25.5
10 35.0
7 35.0
10 16.0
9 35.0
3 25.5
3 63.5
3 25.5
After discarding the tuple with age < 18 and partitioning by rating:
rating age
1 33.0
3 25.5
3 63.5
3 25.5
7 45.0
7 35.0
8 55.5
8 25.5
9 35.0
10 35.0
Answer relation:
rating minage
3 25.5
7 35.0
8 25.5
60
Find age of the youngest sailor with age >= 18, for each rating with at least 2 sailors between 18 and 60.
SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18 AND S.age <= 60
GROUP BY S.rating
HAVING COUNT (*) > 1
Sailors instance:
sid sname rating age
22 dustin 7 45.0
29 brutus 1 33.0
31 lubber 8 55.5
32 andy 8 25.5
58 rusty 10 35.0
64 horatio 7 35.0
71 zorba 10 16.0
74 horatio 9 35.0
85 art 3 25.5
95 bob 3 63.5
96 frodo 3 25.5
Answer relation:
rating minage
3 25.5
7 35.0
8 25.5
61
For each red boat, find the number of reservations for this boat
SELECT B.bid, COUNT (*) AS scount
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
GROUP BY B.bid
Grouping over a join of three relations.
Q: What do we get if we remove B.color='red' from the WHERE clause and add a HAVING clause with this condition?
• No difference in the intended result, but the query is illegal.
• Only columns in the GROUP BY clause can appear in HAVING, unless they are inside an aggregate operator.
62
Find age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age)
SELECT S.rating, MIN (S.age)
FROM Sailors S
WHERE S.age > 18
GROUP BY S.rating
HAVING 1 < (SELECT COUNT (*)
            FROM Sailors S2
            WHERE S.rating=S2.rating)
The HAVING clause can also contain a subquery.
Find age of the youngest sailor with age > 18, for each rating with at least 2 such sailors (of age > 18): replace the HAVING clause above with
HAVING COUNT (*) > 1
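The difference between the two HAVING clauses shows up on the Sailors instance from the earlier slides, as this sqlite3 sketch suggests. An ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (29, "brutus", 1, 33.0), (31, "lubber", 8, 55.5),
    (32, "andy", 8, 25.5), (58, "rusty", 10, 35.0), (64, "horatio", 7, 35.0),
    (71, "zorba", 10, 16.0), (74, "horatio", 9, 35.0), (85, "art", 3, 25.5),
    (95, "bob", 3, 63.5), (96, "frodo", 3, 25.5),
])

# At least 2 sailors of ANY age per rating: the subquery ignores the WHERE filter.
any_age = conn.execute("""
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age > 18
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT(*) FROM Sailors S2 WHERE S.rating = S2.rating)
    ORDER BY S.rating
""").fetchall()

# At least 2 sailors older than 18 per rating: COUNT(*) sees only qualifying tuples.
such_sailors = conn.execute("""
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age > 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
""").fetchall()
print(any_age)
print(such_sailors)
```

Rating 10 has two sailors (rusty, 35.0, and zorba, 16.0) but only one older than 18, so it appears in the first answer and not in the second.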
63
Find those ratings for which the average age is the minimum over all ratings
WRONG: aggregate operations cannot be nested!
SELECT S.rating
FROM Sailors S
WHERE S.age = (SELECT MIN (AVG (S2.age))
               FROM Sailors S2)
65
Find those ratings for which the average age is the minimum over all ratings
Correct solution (in SQL/92):
SELECT Temp.rating, Temp.avg_age
FROM (SELECT S.rating, AVG (S.age) AS avg_age
      FROM Sailors S
      GROUP BY S.rating) AS Temp
WHERE Temp.avg_age = (SELECT MIN (Temp2.avg_age)
                      FROM (SELECT S.rating, AVG (S.age) AS avg_age
                            FROM Sailors S
                            GROUP BY S.rating) AS Temp2)
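A runnable sketch of the nested-FROM solution with sqlite3, on the Sailors instance from the grouping slides. The alias names avg_age, Temp and Temp2 are choices made for this example (a hyphen is not valid in an unquoted SQL identifier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (29, "brutus", 1, 33.0), (31, "lubber", 8, 55.5),
    (32, "andy", 8, 25.5), (58, "rusty", 10, 35.0), (64, "horatio", 7, 35.0),
    (71, "zorba", 10, 16.0), (74, "horatio", 9, 35.0), (85, "art", 3, 25.5),
    (95, "bob", 3, 63.5), (96, "frodo", 3, 25.5),
])

# Average age per rating, computed once as a derived table and once more
# inside the scalar subquery that finds the minimum of those averages.
per_rating = "SELECT S.rating, AVG(S.age) AS avg_age FROM Sailors S GROUP BY S.rating"
youngest_ratings = conn.execute(f"""
    SELECT Temp.rating, Temp.avg_age
    FROM ({per_rating}) AS Temp
    WHERE Temp.avg_age = (SELECT MIN(Temp2.avg_age) FROM ({per_rating}) AS Temp2)
""").fetchall()
print(youngest_ratings)  # [(10, 25.5)]
```

Rating 10 wins because its two sailors (35.0 and 16.0) average 25.5, lower than every other rating's average.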
66
Null Values
67
Null Values
Field values in a tuple are sometimes unknown (e.g., a rating has not been assigned) or inapplicable (e.g., no maiden name for a male person).
SQL provides a special value null for such situations.
The presence of null complicates many issues.
68
Comparisons Using Null Values
Is (S.rating = 8) TRUE or FALSE? What if rating has a null value in the tuple?
We need a 3-valued logic: a condition can be true, false or unknown.
SQL provides the special operators IS NULL and IS NOT NULL to check whether a value is or is not null.
To disallow NULL values: rating INTEGER NOT NULL
69
Comparison with NULL values
Arithmetic operations on NULL return NULL.
Comparison operators on NULL return UNKNOWN.
We can explicitly check whether a value is null or not by IS NULL and by IS NOT NULL.
70
Truth table with UNKNOWN
UNKNOWN AND TRUE = UNKNOWN
UNKNOWN OR TRUE = TRUE
UNKNOWN AND FALSE = FALSE
UNKNOWN OR FALSE = UNKNOWN
UNKNOWN AND UNKNOWN = UNKNOWN
UNKNOWN OR UNKNOWN = UNKNOWN
NOT UNKNOWN = UNKNOWN
WHERE clause is satisfied only when it evaluates to TRUE.
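The truth table can be checked against SQLite, which represents TRUE as 1, FALSE as 0 and UNKNOWN as NULL (returned to Python as None). The table T and its three rows are hypothetical, included only to show that a WHERE clause keeps a row only when the condition is TRUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def ev(expr):
    """Evaluate a constant SQL expression; NULL comes back as None."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# NULL plays the role of UNKNOWN, 1 of TRUE, 0 of FALSE.
truth = {expr: ev(expr) for expr in [
    "NULL AND 1", "NULL OR 1", "NULL AND 0", "NULL OR 0",
    "NULL AND NULL", "NULL OR NULL", "NOT NULL",
]}
print(truth)

# A row whose rating is NULL satisfies neither rating = 8 nor NOT (rating = 8).
conn.execute("CREATE TABLE T (rating INTEGER)")
conn.executemany("INSERT INTO T VALUES (?)", [(8,), (None,), (5,)])

def count(where):
    return conn.execute(f"SELECT COUNT(*) FROM T WHERE {where}").fetchone()[0]

eq_8 = count("rating = 8")
not_eq_8 = count("NOT (rating = 8)")
is_null = count("rating IS NULL")
print(eq_8, not_eq_8, is_null)  # 1 1 1
```

The three counts add up to the three rows only because IS NULL picks up the row that both comparisons leave as UNKNOWN.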
71
Outer Joins : Special Operators
Left Outer Join, Right Outer Join, Full Outer Join
SELECT S.sid, R.bid
FROM Sailors S NATURAL LEFT OUTER JOIN Reserves R
SELECT S.sid, R.bid
FROM Sailors S LEFT OUTER JOIN Reserves R
ON S.sid = R.sid
Sailors rows (left) without a matching Reserves row (right) appear in the result, but not vice versa.
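A quick sqlite3 sketch of the left outer join on the S1 and R1 instances of the running example. Sailor 31 (lubber) has no reservation, so the row is kept with a NULL bid, which Python shows as None. The ORDER BY is added only for a deterministic output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    -- Instances S1 and R1 from the running example.
    INSERT INTO Sailors VALUES (22,'dustin',7,45.0), (31,'lubber',8,55.5), (58,'rusty',10,35.0);
    INSERT INTO Reserves VALUES (22,101,'10/10/96'), (58,103,'11/12/96');
""")

outer = conn.execute("""
    SELECT S.sid, R.bid
    FROM Sailors S LEFT OUTER JOIN Reserves R
    ON S.sid = R.sid
    ORDER BY S.sid
""").fetchall()

# For comparison, the plain (inner) join drops the unmatched sailor.
inner = conn.execute("""
    SELECT S.sid, R.bid
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
    ORDER BY S.sid
""").fetchall()
print(outer)  # [(22, 101), (31, None), (58, 103)]
print(inner)  # [(22, 101), (58, 103)]
```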
72
Sorting: ORDER BY clause
SELECT *
FROM Student
WHERE sNumber >= 1
ORDER BY sNumber, sName
In relational algebra, with a sort operator τ and selection σ: τ (sNumber, sName) (σ (sNumber >= 1) (Student))
73
SQL Queries - Summary
SELECT [DISTINCT] a1, a2, …, an
FROM R1, R2, …, Rm
[WHERE C1]
[GROUP BY g1, g2, …, gl [HAVING C2]]
[ORDER BY o1, o2, …, oj]
74
Summary
SQL was an important factor in the early acceptance of the relational model: it is easy to understand!
Relationally complete; in fact, it has significantly more expressive power than relational algebra.
There are many alternative ways to write a query, so the optimizer must look for the most efficient evaluation plan.
In practice, users (still) need to be aware of how queries are optimized and evaluated to get the best results.