Copyright © 2003-2008 Curt Hill

Queries in SQL
More options

Duplicates
• A select usually joins several tables, creating large unique tuples
• The temporary table has an unspecified key
• If the select removes portions of the key, then duplicates can occur
• Consider the query that links faculty to the students taking any of their classes

The query
SELECT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid

This produces 238 rows
What is the key?

The Key
• Does not need to be specified
• In this case it is the linking fields
– F_naid (or ct_naid)
– Ct_dept
– Ct_number
– S_id
• Since some of these fields will be removed by the Select, duplicates occur

Removing duplicates
• In this query duplicates occur when a student takes multiple classes from the teacher
• The result is not a set (which eliminates duplicates) but a multi-set (which allows duplicates)
• Placing the reserved word DISTINCT immediately after the Select removes these
• The new query follows:

Revised query
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid

This produces 213 rows
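
The effect of DISTINCT can be tried on a small scale. The sketch below is only an illustration: it uses SQLite from Python, and the column types and sample rows are assumptions, not the course database (so the row counts are 5 and 4, not 238 and 213).

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

# The join from the slides, shared by both queries.
JOIN = """
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
"""

plain = conn.execute("SELECT f_name, s_name " + JOIN).fetchall()
distinct = conn.execute("SELECT DISTINCT f_name, s_name " + JOIN).fetchall()

# Lee takes two classes from Adams, so (Adams, Lee) appears twice in `plain`.
print(len(plain), len(distinct))
```

In this sample data the only duplicate comes from one student taking two classes from the same teacher, which is the situation the slides describe.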

How does this work?
• Removing duplicates is not trivial
• There are several ways, but all take work
• One possibility is to sort the tuples
– Duplicates then must be adjacent
• Another is to hash them
– Duplicates have the same hash key
• Small queries could be done in memory, larger ones cannot
• We will consider sorting and hashing later
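
The two approaches can be sketched in a few lines of Python. This is an in-memory illustration only, with made-up function names and rows; it is not a description of how any particular DBMS implements DISTINCT.

```python
def dedup_by_sorting(rows):
    """Sort the tuples, then keep a row only if it differs from the previous one."""
    result = []
    for row in sorted(rows):
        if not result or result[-1] != row:
            result.append(row)
    return result


def dedup_by_hashing(rows):
    """Keep a hash set of rows already seen; equal rows hash to the same key."""
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


# Hypothetical result rows (f_name, s_name) containing one duplicate.
rows = [("Baker", "Kim"), ("Adams", "Lee"), ("Adams", "Kim"), ("Adams", "Lee")]

by_sort = dedup_by_sorting(rows)  # output comes back in sorted order
by_hash = dedup_by_hashing(rows)  # output keeps first-seen order
print(by_sort)
print(by_hash)
```

Both give the same set of rows; only the output order differs, which is one reason the order of a DISTINCT result should not be relied on.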

Deception
• The difference between the two queries is just one keyword
• That keyword forces the DBMS to do substantial extra work
• Looks like no big deal but actually is
• Hence the query is deceptively different
• However, make the database do its job

All
• The opposite of Distinct is All
• Specifies that duplicates should not be eliminated
• Since elimination is expensive, it is usually not done
– Thus All gives the same result whether present or absent
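
A quick check that ALL is the default, again as a sketch using SQLite from Python with an assumed schema and sample rows:

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

JOIN = """
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
"""

with_all = conn.execute("SELECT ALL f_name, s_name " + JOIN).fetchall()
without_all = conn.execute("SELECT f_name, s_name " + JOIN).fetchall()

# Compare as sorted lists because neither query promises an output order.
print(sorted(with_all) == sorted(without_all))
```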

Order
• The order of the output table is dependent on many unpredictable things
• Different DBMSs may give different orderings, even with same data
– Based on how they process the data
• The order of the above queries is different on Oracle and MySQL
• Worse yet, neither will put all the students from one faculty member together

Order by clause
• Order by follows the Where
• It specifies a sort order for the output
• May specify one or more fields
• Fields do not have to be displayed

Sorted query 1
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_name, s_name

Sorting
• The default behavior is to sort:
– In a case sensitive way
– In ascending order (lowest to highest)
• Usually we sort on the displayed values
– With SELECT DISTINCT, Oracle and SQL Server only allow this
– MySQL has allowed sorts on other fields

Sorted query 2
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_naid, s_id

Sort Order
• The default is to sort in ascending order for all sort keys
• The key may be followed by ASC or DESC
• ASC gives ascending order
• DESC gives descending order
• These keywords are abbreviations and may not be spelled out
• If left out, ASC is the default

Sorted query 3
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_name DESC, s_name ASC
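
A small runnable version of a mixed ASC/DESC sort, as a sketch using SQLite from Python. The schema types and sample rows are assumptions made for the illustration.

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

# Faculty names descending, student names ascending within each faculty member.
sorted_rows = conn.execute("""
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
ORDER BY f_name DESC, s_name ASC
""").fetchall()

for row in sorted_rows:
    print(row)
```

With the ORDER BY in place, all the students of one faculty member come out together, which the unsorted queries do not guarantee.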

Aggregate operations
• We can collapse several rows into one
• This produces a summary report
• Several rows of table become one row of output
• This requires the Group By clause with Aggregate functions
• The Group By follows Where
• Aggregate functions are in Select

Group By and Aggregate functions
• Each of these Aggregate functions specifies a field:
– Count
– Avg
– Sum
– Max
– Min
• Usually used with Group by but not always
• Group by follows Where
• Specifies the groups as changes in fields

Grouped Query 1
SELECT f_name, count(s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name

This produces 16 rows

Commentary
• Group by forces a sort
• This is only a means to ensure that the items are together
• The DISTINCT keyword may be used within aggregate functions:
– Count
– Avg
– Sum

Grouped Query 2
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name

This produces 16 rows but different counts
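
The difference between the two grouped queries can be seen on a tiny data set. This is a sketch using SQLite from Python; the column types and rows are assumptions, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

TAIL = """
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name
ORDER BY f_name
"""

# count(s_name) counts every enrollment; count(DISTINCT s_name) counts each
# student name once per faculty member.
enrollments = conn.execute("SELECT f_name, count(s_name) " + TAIL).fetchall()
students = conn.execute("SELECT f_name, count(DISTINCT s_name) " + TAIL).fetchall()

print(enrollments)
print(students)
```

Both queries return one row per faculty member; only the counts change, as the slide notes.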

Secondary Selection
• The Where does an initial selection
– It eliminates numerous combinations of tuples of no interest
• We may also wish to remove aggregated rows
• This must occur after the Where but before final table
• This is done with the HAVING clause of the GROUP BY

Having
• The Having clause follows the Group By fields
• It gives a selection criterion for rows
• Usually based upon the aggregate functions
• Form:
  Having comparison
• See following

Grouped Query 3
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
HAVING count(*) > 10

Commentary
• This produces 9 rows
• Notice the * is the parameter of count
• Other Aggregate functions could be used as well
• A Having without a Group By is like a Where
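
A scaled-down HAVING example, as a sketch using SQLite from Python. The schema types and rows are assumptions, and the threshold is lowered from 10 to 2 to suit the tiny data set.

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

# WHERE filters the joined rows first; HAVING then filters whole groups.
# count(*) counts all joined rows in a group (enrollments), while the
# displayed count(DISTINCT s_name) counts distinct student names.
busy = conn.execute("""
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name
HAVING count(*) > 2
ORDER BY f_name
""").fetchall()

print(busy)
```

Adams has three enrollments and survives the HAVING; Baker has two and is dropped. The displayed count for Adams is still 2, since the select list and the HAVING use different aggregates.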

Ungrouped Query
• Suppose we just want a count or sum
• Then we can use an aggregate function without Group By
• This will generally collapse the entire table into a single row
• Consider the next screen

Aggregates
• Counting rows:
  Select count(*)
  from faculty
– Results in one row with count of 19
• Sum of student balances:
  Select sum(s_balance)
  from students
– Results in one row with the sum: 93240.34
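
The same two ungrouped aggregates can be run against a toy database. This is a sketch using SQLite from Python; the tables hold made-up rows, so the results (2 and 150.5) differ from the 19 and 93240.34 reported for the course database.

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
""")

# Without GROUP BY, each aggregate collapses the whole table into one row.
count_rows = conn.execute("SELECT count(*) FROM faculty").fetchall()
sum_rows = conn.execute("SELECT sum(s_balance) FROM students").fetchall()

faculty_count = count_rows[0][0]
total_balance = sum_rows[0][0]
print(faculty_count, total_balance)
```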

Variations
• Recall this query
  SELECT f_name, count(DISTINCT s_name)
  …
  GROUP BY f_name
• Suppose f_naid were included in the Select
  SELECT f_name, f_naid, …
• In Oracle and SQL Server it would also have to be part of the Group By
– But not in MySQL

Bad Oracle Query
SELECT f_name, f_naid, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
– Receives an error:
  ORA-00979: not a GROUP BY expression
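
One way to make the query acceptable everywhere is to list every non-aggregated select field in the Group By. The sketch below shows that form using SQLite from Python; the schema types and rows are assumptions. SQLite, like the MySQL behavior described above, would also accept the shorter Group By, so this only demonstrates the portable form, not the Oracle error.

```python
import sqlite3

# Hypothetical schema and sample data (not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1,'Adams'), (2,'Baker');
INSERT INTO c_teach  VALUES (1,'CS',101), (1,'CS',201), (2,'MATH',110);
INSERT INTO students VALUES (10,'Lee',100.0), (11,'Kim',50.5), (12,'Roy',0.0);
INSERT INTO grades   VALUES (10,'CS',101), (10,'CS',201), (11,'CS',101),
                            (11,'MATH',110), (12,'MATH',110);
""")

# Both non-aggregated select fields (f_name, f_naid) appear in GROUP BY.
portable = conn.execute("""
SELECT f_name, f_naid, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name, f_naid
ORDER BY f_name
""").fetchall()

print(portable)
```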
