Basic SQL for Institutional Research
Phil Rhodes
TAIR 2012
February 22, 2012
Concurrent Session B4
Topics
• Why SQL?
• What is SQL?
• Basic Queries
• Joining Two Tables
• Simple Summary Reports
Why SQL?
• Institutional Research is all about data
• Data usually ‘locked up’ in relational databases (Oracle, DB2, SQL Server)
• IR offices can:
  • Depend on IT to give timely, correct extracts
  • Extract data themselves
• There are other tools, but some operations are easier in SQL
What is SQL?
• “S‐Q‐L” vs “sequel”
• Developed at IBM in the early 1970s
• Based on a 1970 paper by Edgar F. Codd
• First standardized in 1986
• Many variants
  • “English”
What is SQL?
• Queries
  • Retrieve data
• Data Manipulation Language (DML)
  • Add, update, and delete data
• Data Definition Language (DDL)
  • Manage structure of tables and indices
  • Character, numeric, date/date-time values
• Data Control Language (DCL)
  • Access control
What is SQL?
• Focus on Queries
• Examples from SAS and MS Access
Sample Data
Tables: Student, Acad_Info, Schedule
Sample Data
• Student table
  • Keys: ID
  • Bio-Demo data
• Acad_Info table
  • Keys: ID, TERM
  • College, Degree, Major, Classification
• Schedule table
  • Keys: ID, TERM, CRN
  • Course, Credit Hours
Basic SQL Queries
• Most basic query:
    select [column-list]
    from [table]
• Returns values of the columns in [column-list] for all rows of the table
• An asterisk ‘*’ can be used as a wildcard for all columns
  • Useful shorthand, risky in production code
• Example 1 ‐ SAS
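The SAS demo itself is not part of this transcript. As a stand-in, here is a minimal sketch of the basic query pattern using Python's built-in sqlite3 module. The SQLite engine, the `student` table layout, and the sample rows are all assumptions made for illustration, not the presenter's data.

```python
import sqlite3

# In-memory SQLite database; table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table student (id integer primary key, gender text, state text)"
)
conn.executemany(
    "insert into student values (?, ?, ?)",
    [(1, "M", "TX"), (2, "M", "OK"), (3, "F", "TX")],
)

# Named column list: only the requested columns come back, for every row.
named = conn.execute("select id, state from student").fetchall()

# Asterisk: every column of the table, for every row.
everything = conn.execute("select * from student").fetchall()

print(named)
print(everything)
```

Naming the columns keeps the result stable if someone later adds or reorders columns in the table, which is one reason `*` is considered risky in production code.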
Basic SQL Queries
• Restricting the rows retrieved
    select [column-list]
    from [table]
    where [logical-expression]
• Standard boolean operators AND, OR, NOT
  • Also IN, LIKE, BETWEEN
• Three-valued logic – True, False, Null/Unknown
  • IS NULL
• Example 2 – SAS & Access
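The SAS and Access demos are not reproduced in the transcript. The sketch below, run against SQLite through Python's sqlite3 module, shows the operators listed above and how a null value behaves under three-valued logic. The `student` columns and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table student "
    "(id integer primary key, last_name text, state text, age integer)"
)
conn.executemany(
    "insert into student values (?, ?, ?, ?)",
    [
        (1, "Smith", "TX", 19),
        (2, "Smythe", "OK", 24),
        (3, "Jones", None, 31),  # home state is unknown (null)
        (4, "Garcia", "NM", 22),
    ],
)

def ids(sql):
    """Run a query and return the matching IDs in ascending order."""
    return [row[0] for row in conn.execute(sql + " order by id")]

in_list = ids("select id from student where state in ('TX', 'OK')")
pattern = ids("select id from student where last_name like 'Sm%'")
age_range = ids("select id from student where age between 20 and 25")

# Three-valued logic: for the null state, both "= 'TX'" and "<> 'TX'"
# evaluate to unknown, so student 3 appears in neither result.
is_tx = ids("select id from student where state = 'TX'")
not_tx = ids("select id from student where state <> 'TX'")

# IS NULL is the way to find the rows the comparisons above skip.
missing = ids("select id from student where state is null")

print(in_list, pattern, age_range)  # [1, 2] [1, 2] [2, 4]
print(is_tx, not_tx, missing)       # [1] [2, 4] [3]
```

The practical consequence for reporting is that "TX" plus "not TX" does not add up to the full table unless the null rows are counted separately.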
Joining Two Tables
• Often require data from more than one table
• Keys
  • Required to match rows from one table to another
  • Primary and secondary keys
• Two types of joins
  • Inner Joins
  • Outer Joins
Joining Two Tables
• Inner Joins
    select [column-list]
    from [table1] as a
    inner join [table2] as b
    on a.key = b.key
• Returns data from rows in both tables where keys match
Joining Two Tables – Inner Join
Table 1
ID  Gender
1   M
2   M
3   F
4   F

Table 2
ID  Classification
1   Freshman
3   Sophomore
5   Junior
7   Senior

select a.id, a.gender, b.classification
from table1 as a
inner join table2 as b
on a.id = b.id
Joining Two Tables – Inner Join
select a.id, a.gender, b.classification
from table1 as a
inner join table2 as b
on a.id = b.id

Result
ID  Gender  Classification
1   M       Freshman
3   F       Sophomore
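The inner join on these two small tables can be reproduced with a short script. This is a sketch that assumes SQLite (via Python's sqlite3 module) as the engine rather than SAS or Access; the table contents are the ones shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer, gender text)")
conn.execute("create table table2 (id integer, classification text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [(1, "M"), (2, "M"), (3, "F"), (4, "F")],
)
conn.executemany(
    "insert into table2 values (?, ?)",
    [(1, "Freshman"), (3, "Sophomore"), (5, "Junior"), (7, "Senior")],
)

# Only IDs present in both tables (1 and 3) survive the inner join.
inner_rows = conn.execute(
    """
    select a.id, a.gender, b.classification
    from table1 as a
    inner join table2 as b
      on a.id = b.id
    order by a.id
    """
).fetchall()

print(inner_rows)  # [(1, 'M', 'Freshman'), (3, 'F', 'Sophomore')]
```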
Joining Two Tables – Inner Join
• Example 3
Joining Two Tables
• Outer Joins
    select [column-list]
    from [table1] as a
    left join [table2] as b
    on a.key = b.key
• Returns all rows from table1 plus matching rows in table2
Joining Two Tables – Outer Join
Table 1
ID  Gender
1   M
2   M
3   F
4   F

Table 2
ID  Classification
1   Freshman
3   Sophomore
5   Junior
7   Senior

select a.id, a.gender, b.classification
from table1 as a
left join table2 as b
on a.id = b.id
Joining Two Tables – Outer Join
select a.id, a.gender, b.classification
from table1 as a
left join table2 as b
on a.id = b.id

Result
ID  Gender  Classification
1   M       Freshman
2   M
3   F       Sophomore
4   F
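The same two tables can be used to check the left join result. As before, this is a sketch assuming SQLite through Python's sqlite3 module; the blank Classification cells on the slide come back as null (`None` in Python). The second query, which lists the unmatched IDs, is an addition for illustration and is not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer, gender text)")
conn.execute("create table table2 (id integer, classification text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [(1, "M"), (2, "M"), (3, "F"), (4, "F")],
)
conn.executemany(
    "insert into table2 values (?, ?)",
    [(1, "Freshman"), (3, "Sophomore"), (5, "Junior"), (7, "Senior")],
)

# Every row of table1 is kept; classification is null where table2
# has no matching ID.
left_rows = conn.execute(
    """
    select a.id, a.gender, b.classification
    from table1 as a
    left join table2 as b
      on a.id = b.id
    order by a.id
    """
).fetchall()

# Filtering on the null side isolates the table1 rows with no match.
unmatched = [
    row[0]
    for row in conn.execute(
        """
        select a.id
        from table1 as a
        left join table2 as b
          on a.id = b.id
        where b.id is null
        order by a.id
        """
    )
]

print(left_rows)
# [(1, 'M', 'Freshman'), (2, 'M', None), (3, 'F', 'Sophomore'), (4, 'F', None)]
print(unmatched)  # [2, 4]
```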
Joining Two Tables – Outer Join
• Example 4
Joining Two Tables – Issues
• Bad or missing join conditions (WHERE or ON clauses)
  • Cartesian products
• Multiplying Observations
  • Duplicated data if more than one row in a table matches the join criterion
  • Sometimes intended, sometimes a sign of an error
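Both issues are easy to see by counting rows. The sketch below assumes SQLite through Python's sqlite3 module and uses a hypothetical two-student `student` table and a four-row `schedule` table; none of this data is from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id integer, gender text)")
conn.execute(
    "create table schedule (id integer, crn text, credit_hours integer)"
)
conn.executemany("insert into student values (?, ?)", [(1, "M"), (2, "F")])
conn.executemany(
    "insert into schedule values (?, ?, ?)",
    [(1, "10001", 3), (1, "10002", 4), (1, "10003", 3), (2, "10001", 3)],
)

def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]

# Missing join condition: every student row pairs with every schedule
# row, giving 2 x 4 = 8 rows (a Cartesian product).
cartesian = scalar("select count(*) from student as a, schedule as b")

# Correct condition, but student 1 matches three schedule rows, so that
# student's data is repeated three times in the joined result.
joined = scalar(
    "select count(*) from student as a "
    "inner join schedule as b on a.id = b.id"
)

# Counting distinct IDs recovers the number of students.
students = scalar(
    "select count(distinct a.id) from student as a "
    "inner join schedule as b on a.id = b.id"
)

print(cartesian, joined, students)  # 8 4 2
```

Comparing the joined row count with the count of distinct keys is a quick check on whether a join has multiplied observations.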
Basic Reporting
• GROUP BY
  • Combine rows with common values
  • Often used with SQL aggregation functions like SUM()
    • SUM, MIN, MAX, AVG, COUNT, VAR, STDEV
  • WHERE is applied before GROUP BY
• HAVING
  • Similar to WHERE, but applied after GROUP BY
  • Can use aggregation functions
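The order in which WHERE, GROUP BY, and HAVING apply is easiest to see on a small table. The following sketch assumes SQLite through Python's sqlite3 module; the `student` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id integer, state text, gender text)")
conn.executemany(
    "insert into student values (?, ?, ?)",
    [
        (1, "TX", "F"),
        (2, "TX", "M"),
        (3, "OK", "F"),
        (4, "TX", "F"),
        (5, "NM", "M"),
        (6, "OK", "F"),
    ],
)

# GROUP BY collapses rows sharing a state; COUNT(*) counts each group.
by_state = conn.execute(
    "select state, count(*) as n from student "
    "group by state order by state"
).fetchall()

# WHERE removes rows before grouping (only female students are counted);
# HAVING removes groups after aggregation (states with fewer than 2).
filtered = conn.execute(
    """
    select state, count(*) as n
    from student
    where gender = 'F'
    group by state
    having count(*) >= 2
    order by state
    """
).fetchall()

print(by_state)  # [('NM', 1), ('OK', 2), ('TX', 3)]
print(filtered)  # [('OK', 2), ('TX', 2)]
```

NM drops out of the second result because its only student is removed by WHERE, so no NM group ever reaches HAVING.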
Basic Reporting
• ORDER BY
  • Order the result data by the values of certain columns
  • Ascending (ASC, the default) or descending (DESC)
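A short sketch of sorting, again assuming SQLite through Python's sqlite3 module with a hypothetical `student` table. It sorts a summary by count in descending order and breaks ties alphabetically by state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id integer, state text)")
conn.executemany(
    "insert into student values (?, ?)",
    [(1, "TX"), (2, "TX"), (3, "OK"), (4, "TX"),
     (5, "NM"), (6, "OK"), (7, "NM")],
)

# Ascending is the default direction when none is given.
states_asc = [
    row[0]
    for row in conn.execute(
        "select distinct state from student order by state"
    )
]

# Largest group first; ties on the count fall back to state name.
ranked = conn.execute(
    """
    select state, count(*) as n
    from student
    group by state
    order by n desc, state asc
    """
).fetchall()

print(states_asc)  # ['NM', 'OK', 'TX']
print(ranked)      # [('TX', 3), ('NM', 2), ('OK', 2)]
```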
Basic Reporting
• Example 5: Create a report of the number of students by home state.
• Example 6: Create a report by classification of students taking more than 18 hours
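The presenter's code for these examples is not included in the transcript. One possible reading of Example 6 is sketched below, assuming SQLite through Python's sqlite3 module. The column names (`classification`, `credit_hours`), the term code `201210`, and all rows are hypothetical; only the table names and keys follow the Sample Data slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table acad_info (id integer, term text, classification text)"
)
conn.execute(
    "create table schedule "
    "(id integer, term text, crn text, credit_hours integer)"
)

term = "201210"  # hypothetical term code
conn.executemany(
    "insert into acad_info values (?, ?, ?)",
    [(1, term, "Freshman"), (2, term, "Freshman"),
     (3, term, "Senior"), (4, term, "Senior")],
)

# Credit hours per course for each student: totals are 19, 9, 20, 18.
hours = {1: [4, 4, 4, 4, 3], 2: [3, 3, 3], 3: [4, 4, 4, 4, 4], 4: [3] * 6}
conn.executemany(
    "insert into schedule values (?, ?, ?, ?)",
    [
        (sid, term, f"{sid}{i:04d}", h)
        for sid, course_hours in hours.items()
        for i, h in enumerate(course_hours)
    ],
)

# Step 1 (subquery): total hours per student and term, keeping only
# loads above 18. Step 2: join to acad_info and count by classification.
overload_report = conn.execute(
    """
    select a.classification, count(*) as n_students
    from acad_info as a
    inner join (
        select id, term, sum(credit_hours) as total_hours
        from schedule
        group by id, term
        having sum(credit_hours) > 18
    ) as h
      on a.id = h.id and a.term = h.term
    where a.term = ?
    group by a.classification
    order by a.classification
    """,
    (term,),
).fetchall()

print(overload_report)  # [('Freshman', 1), ('Senior', 1)]
```

Summing hours in a subquery first avoids the row-multiplication problem: each student contributes one row to the join, so the final count is a count of students rather than of course registrations. Student 4, at exactly 18 hours, is excluded by the strict `> 18` comparison.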
Resources
• SAS Online Documentation
  • support.sas.com/onlinedoc/913/docMainpage.jsp
  • Base SAS Base SAS Procedures Guide SAS SQL Procedure User’s Guide
• MS Access SQL Documentation
  • office.microsoft.com/en-us/access-help/CH010072899.aspx
• SQL Tutorial
  • www.sqltutorial.org
Resources
• O’Reilly Books (shop.oreilly.com)
  • Learning SQL, 2nd Edition
  • SQL In a Nutshell, 3rd Edition
  • SQL Cookbook, 1st Edition