Advanced SQL Programming on the AS/400 (iSeries)
Mark Holm
Centerfield Technology
Goals
• Introduce some useful advanced SQL programming techniques
• Show you how to let the database do more work to reduce programming effort
• Go over some basic techniques and tips to improve performance
Notes
• V4R3 and higher syntax used in examples
• Examples show only a small subset of what can be done!
Agenda
• Joining files - techniques, do’s and don’ts
• Query within a query - Subqueries
• Stacking data - Unions
• Simplifying data with Views
• Referential Integrity and constraints
• Performance, performance, performance
Joining files
• Joins are used to relate data from different tables
• Data can be retrieved with one “open file” rather than many
• The concept is identical to a join logical file, but without an associated permanent object (unless the join is defined in an SQL view)
Join types
• Inner Join
  – Used to find related data
• Left Outer (or simply Outer) Join
  – Used to find related data and ‘orphaned’ rows
• Exception Join
  – Used to find only ‘orphaned’ rows
• Cross Join
  – Joins all rows to all rows
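Of the four types, the cross join is the simplest to picture. A minimal sketch, assuming the Employee and Department sample tables from the next slide (three rows each) and that the CROSS JOIN keyword form is available on your release:

```sql
-- Cross join: every Employee row is paired with every Department row.
-- With the 3-row sample tables this returns 3 x 3 = 9 rows.
SELECT LastName, Area
  FROM Employee CROSS JOIN Department;

-- Same result without the JOIN keyword: no WHERE clause relates the tables.
SELECT LastName, Area
  FROM Employee, Department;
```

Because the row count is the product of the two table sizes, a cross join over large files is usually a sign that a join condition was forgotten.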
Sample tables

Employee table
FirstName  LastName  Dept
John       Doe       397
Cindy      Smith     450
Sally      Anderson  250

Department table
Dept  Area
397   Development
550   Marketing
250   Sales
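To try the join examples yourself, the sample tables could be created as sketched below. The column types and lengths are assumptions, since the slide shows only column names and data.

```sql
-- Column types are assumed; the slide shows only names and values.
CREATE TABLE Employee
  (FirstName CHAR(20),
   LastName  CHAR(30),
   Dept      INTEGER);

CREATE TABLE Department
  (Dept INTEGER,
   Area CHAR(20));

INSERT INTO Employee VALUES ('John',  'Doe',      397);
INSERT INTO Employee VALUES ('Cindy', 'Smith',    450);
INSERT INTO Employee VALUES ('Sally', 'Anderson', 250);

INSERT INTO Department VALUES (397, 'Development');
INSERT INTO Department VALUES (550, 'Marketing');
INSERT INTO Department VALUES (250, 'Sales');
```

Note that Smith's department (450) has no row in Department, and Marketing (550) has no employees. These two mismatches are what make the join types return different results.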
Inner Join
• Method #1 - Using the WHERE Clause
SELECT LastName, Area FROM Employee, Department
WHERE Employee.Dept = Department.Dept
• Method #2 - Using the JOIN Clause
SELECT LastName, Area FROM Employee INNER
JOIN Department ON Employee.Dept = Department.Dept
NOTE: Method #2 is useful if you need to influence the order in which the tables are
joined for performance reasons. This only works on releases prior to V4R4.
Results
Result table
LastName  Area
Doe       Development
Anderson  Sales
• Returns the list of employees that are in a valid department.
• Employee ‘Smith’ is not returned because she is not in a department listed in the ‘Department’ table.
Left Outer Join
SELECT LastName, Area FROM Employee
LEFT OUTER JOIN Department
ON Employee.Dept = Department.Dept
• Must use Join Syntax
Results
Result table
LastName  Area
Doe       Development
Smith     -
Anderson  Sales
• Returns the list of employees even if they are not in a valid department
• Employee ‘Smith’ has a NULL Area because her Dept could not be matched to a row in the ‘Department’ table
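If a report should show a label rather than a NULL, the outer join can be combined with COALESCE. A minimal sketch against the sample tables; the 'Unassigned' text is only an illustrative choice.

```sql
-- Replace the NULL Area of unmatched employees with a readable label.
SELECT LastName, COALESCE(Area, 'Unassigned') AS Area
  FROM Employee
  LEFT OUTER JOIN Department
    ON Employee.Dept = Department.Dept;
-- For the sample data, Smith is listed with 'Unassigned'.
```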
Exception Join
SELECT LastName, Area FROM Employee
EXCEPTION JOIN Department
ON Employee.Dept = Department.Dept
• Must use Join Syntax
Results
Result table
LastName  Area
Smith     -
• Returns the list of employees only if they are NOT in a valid department
• Employee ‘Smith’ is the only one without a valid department
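EXCEPTION JOIN is specific to DB2 on the AS/400. As a hedged sketch, the same ‘orphaned’ rows can usually be found with standard SQL, which may help if a statement also has to run on other databases. Both forms assume the sample tables.

```sql
-- Form 1: outer join, then keep only the rows that found no match.
SELECT LastName
  FROM Employee
  LEFT OUTER JOIN Department
    ON Employee.Dept = Department.Dept
 WHERE Department.Dept IS NULL;

-- Form 2: correlated NOT EXISTS subquery.
SELECT LastName
  FROM Employee
 WHERE NOT EXISTS (SELECT *
                     FROM Department
                    WHERE Department.Dept = Employee.Dept);
-- For the sample data, both return only Smith.
```

Which form performs best depends on the optimizer and the available indexes, so it is worth checking the optimizer messages for each.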
WARNING!
• The order in which tables are listed in the FROM clause is important
• For OUTER and EXCEPTION joins, the database must join the tables in that order.
• The result may be horrible performance... more on this topic later
Observations
• Joins provide one way to bury application logic in the database
• Each join type has a purpose and can be used not only to get the data you want but also to identify “incomplete” information
• With some exceptions, if joined properly, performance should be at least as good as doing the join in an application
Subqueries
• Subqueries are a powerful way to select only the data you need without separate statements.
• Example: List employees making a higher than average salary
Subquery Example
SELECT FNAME, LNAME FROM EMPLOYEE
WHERE SALARY > (SELECT AVG(SALARY)
FROM EMPLOYEE)

SELECT FNAME, LNAME FROM EMPLOYEE
WHERE SALARY > (SELECT AVG(SALARY)
FROM EMPLOYEE
WHERE LNAME = 'JONES')
Subqueries - types
• Correlated
  – Inner select refers to part of the outer (parent) select (multiple evaluations)
• Non-Correlated
  – Inner select does not relate to outer query (one evaluation)
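To make the distinction concrete, here is a sketch of a correlated subquery. It assumes the EMPLOYEE table also has a DEPT column, which the ‘Subquery Example’ slide does not show.

```sql
-- Correlated: the inner select refers to E.DEPT from the outer select,
-- so the average is evaluated for the department of each outer row.
SELECT FNAME, LNAME
  FROM EMPLOYEE E
 WHERE SALARY > (SELECT AVG(SALARY)
                   FROM EMPLOYEE D
                  WHERE D.DEPT = E.DEPT);
```

By contrast, both statements on the ‘Subquery Example’ slide are non-correlated: their inner selects never mention the outer row, so they can be evaluated once.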
Subquery Tips 1
• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary > ALL (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE salary > (SELECT max(salary) FROM salscale)
Subquery Tips 2
• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary IN (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE EXISTS (SELECT salary FROM salscale WHERE employee.salid = salscale.salid)
UNIONs
• Unions provide a way to append multiple row sets (files) in one statement
• Example: Process all of the orders from January and February
SELECT * FROM JanOrders WHERE SKU = 199976
UNION
SELECT * FROM FebOrders WHERE SKU = 199976
Unions
• Each SELECT statement that is UNIONed together must have the same number of result columns and have compatible types
• Two forms of syntax
  – UNION ALL - allows duplicate records
  – UNION - returns only distinct rows
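A sketch of the two forms side by side. The column names (OrderId, SKU, Qty) are hypothetical; listing them explicitly, rather than using SELECT *, makes it easier to see that both selects return the same number of compatible columns.

```sql
-- UNION ALL keeps every row, including duplicates, and typically
-- avoids the extra work of eliminating them.
SELECT OrderId, SKU, Qty FROM JanOrders WHERE SKU = 199976
UNION ALL
SELECT OrderId, SKU, Qty FROM FebOrders WHERE SKU = 199976
ORDER BY OrderId;

-- UNION returns each distinct (OrderId, SKU, Qty) combination once.
SELECT OrderId, SKU, Qty FROM JanOrders WHERE SKU = 199976
UNION
SELECT OrderId, SKU, Qty FROM FebOrders WHERE SKU = 199976;
```

If the two files cannot contain the same row, UNION ALL gives the same answer as UNION and is usually the cheaper choice.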
Views
• Views provide a convenient way to store SQL logic permanently
• Create once and use many times
• Also make the database more understandable to users
• Can put simple business rules into views to ensure consistency
Views
• Example: Make it easy for the human resources department to run a report that shows ‘new’ employees.
CREATE VIEW HR/NEWBIES (EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE) AS
SELECT concat(concat(strip(last_name),','),strip(first_name)),
department,
hire_date
FROM HR/EMPLOYEE
WHERE (year(current date)-year(hire_date)) < 2
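Once the view exists, the report becomes an ordinary query. A minimal usage sketch, using only the view and column names from the CREATE VIEW statement above:

```sql
-- The HR user needs no knowledge of the underlying table or the date logic.
SELECT EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE
  FROM HR/NEWBIES
 ORDER BY HIRE_DATE DESC;
```

Note that the view compares calendar years, so anyone hired at any time during the current or previous year counts as ‘new’.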
Performance
• SQL performance is harder to predict and tune than native I/O.
• SQL provides a powerful way to manipulate data but you have little control over HOW it does it.
• Query optimizer takes responsibility for doing it ‘right’.
Performance - diagnosis
• Getting information about how the optimizer processed a query is crucial
• Can be done via one or all of the following:
  – STRDBG: debug messages in job log
  – STRDBMON: optimizer info put in file
  – QAQQINI: can be used to force messages
  – CHGQRYA: messages put out when time limit set to 0
Performance tips
• Create indexes
  – Over columns that significantly limit data in WHERE clause
  – Over columns that join tables together
  – Over columns used in ORDER BY and GROUP BY clauses
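As a sketch of these three cases, the statements below create one index for each. The Orders table, its CustNo and OrderDate columns, and all index names are assumptions for illustration; which indexes actually help depends on your queries and data.

```sql
-- Selection: CustNo appears in WHERE clauses and narrows the data significantly.
CREATE INDEX OrdCustIx ON Orders (CustNo);

-- Join: Employee.Dept is joined to Department.Dept.
CREATE INDEX EmpDeptIx ON Employee (Dept);

-- Ordering/grouping: queries that ORDER BY or GROUP BY OrderDate.
CREATE INDEX OrdDateIx ON Orders (OrderDate);
```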
Performance tips
• Create Encoded Vector Indexes (EVIs)
  – Most useful in heavy query environments with a lot of data (e.g. large data warehouses)
  – Helps queries that process between 20-60% of a table’s data
  – Create over columns with a modest number of distinct values and those with data skew
  – EVIs bridge the gap between traditional indexes and table scans
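A hedged sketch of creating an EVI with the CREATE ENCODED VECTOR INDEX statement. The table, column, and index names are hypothetical, and the optional WITH n DISTINCT VALUES clause is only a sizing hint; check the SQL reference for your release before relying on it.

```sql
-- RegionCode has only a handful of distinct values, a typical EVI candidate.
CREATE ENCODED VECTOR INDEX SalesRegionEvi
    ON Sales (RegionCode)
  WITH 10 DISTINCT VALUES;
```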
Performance tips
• Encourage optimizer to use indexes
  – Use keyed columns in WHERE clause if possible
  – Use ANDed conditions as much as possible
  – OPTIMIZE FOR n ROWS
  – Don’t do things that eliminate index use
    • Data conversion (binary-key = 1.5)
    • LIKE clause w/leading wildcard (NAME LIKE '%JOE')
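The sketch below contrasts predicates that let the optimizer use an index with ones that tend to prevent it. The Customer table and its columns are hypothetical; an index over NAME is assumed, and CUSTNO is assumed to be an integer key.

```sql
-- Can use an index over NAME: the leading characters are known.
SELECT CUSTNO, NAME FROM Customer
 WHERE NAME LIKE 'JOE%';

-- Usually cannot: a leading wildcard means every NAME must be examined.
SELECT CUSTNO, NAME FROM Customer
 WHERE NAME LIKE '%JOE';

-- Compare a key with a value of its own type (1, not 1.5)
-- so that no data conversion is needed.
SELECT CUSTNO, NAME FROM Customer
 WHERE CUSTNO = 1;

-- Tell the optimizer that only the first rows are needed, which
-- tends to favor index access over a full scan and sort.
SELECT CUSTNO, NAME FROM Customer
 WHERE NAME LIKE 'J%'
 ORDER BY NAME
 OPTIMIZE FOR 10 ROWS;
```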
Performance tips
• Keep statements simple
  – Complex statements are much more difficult to optimize
  – Provide more opportunity for the optimizer to choose a sub-optimal plan of attack
Performance tips
• Enable DB2 to use parallelism
  – Query processed by many tasks (CPU parallelism) or by getting data from many disks at once (I/O parallelism)
  – CPU parallelism requires IBM’s SMP feature and a machine with multiple processors
  – Enabled via the QQRYDEGREE system value, CHGQRYA, or the QAQQINI file
Other useful features
• CASE clause - conditional calculations
• ALIAS - access to multi-member files
• Primary/Foreign keys - referential integrity
• Constraints
CASE
• Conditional calculations with CASE
SELECT Warehouse, Description,
CASE RegionCode
WHEN 'E' THEN 'East Region'
WHEN 'S' THEN 'South Region'
WHEN 'M' THEN 'Midwest Region'
WHEN 'W' THEN 'West Region'
END
FROM Locations
CASE
• Avoiding calculation errors (e.g. division by 0)
SELECT Warehouse, Description,
CASE NumInStock
WHEN 0 THEN NULL
ELSE CaseUnits/NumInStock
END
FROM Inventory
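Both CASE slides use the simple form, which compares one expression against a list of values. As a sketch, the searched form tests a full condition in each WHEN, and the result column can be named with AS. The StockLevel name and the thresholds are made up for illustration.

```sql
-- Searched CASE: each WHEN holds a full condition, so ranges can be tested.
SELECT Warehouse, Description,
       CASE
         WHEN NumInStock = 0  THEN 'Out of stock'
         WHEN NumInStock < 10 THEN 'Low'
         ELSE 'OK'
       END AS StockLevel
  FROM Inventory;
```

The WHEN clauses are tested in order and the first true one wins. A NULL NumInStock satisfies neither condition and falls through to ELSE.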
ALIAS names
• The CREATE ALIAS statement creates an alias on a table, view, or member of a database file.
  – CREATE ALIAS alias-name FOR table (member)
• Example: Create an alias over the second member of a multi-member physical file
  – CREATE ALIAS February FOR MonthSales (February)
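A usage sketch: once the alias exists, it can be used wherever a table name is allowed, and dropping it does not affect the file or its members. The names follow the example above, with the member name given in parentheses after the file name.

```sql
-- Create an alias over member FEBRUARY of the multi-member file MonthSales.
CREATE ALIAS February FOR MonthSales (February);

-- Query just that member through the alias.
SELECT * FROM February;

-- Remove the alias; the underlying member and its data are untouched.
DROP ALIAS February;
```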
Referential Integrity
• Keeps two or more files in sync with each other
• Ensures that children rows have parents
• Can also be used to automatically delete children when parents are deleted
Referential Integrity Rules
• A row inserted into a child table must have a parent row (typically in another table).
• Parent rules
  – A parent row cannot be deleted if there are dependent children (Restrict rule) OR
  – All children are also deleted (Cascade rule) OR
  – All children’s foreign keys are changed (Set Null and Set Default rules)
[Diagram: Parent table (Primary Key) and Child table (Foreign Key). Primary key must be unique.]
Referential Integrity syntax
• ALTER TABLE Hr/Employee ADD CONSTRAINT EmpPK PRIMARY KEY (EmployeeId)
• ALTER TABLE Hr/Department ADD CONSTRAINT EmpFK FOREIGN KEY (EmployeeId) REFERENCES Hr/Employee (EmployeeId) ON DELETE CASCADE ON UPDATE RESTRICT
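To see the rules in action, here is a self-contained sketch with two hypothetical tables, Orders (parent) and OrderLines (child). All table, column, and constraint names are assumptions for illustration and are not part of the HR schema above. On the AS/400, cascade-style rules generally also require both files to be journaled.

```sql
CREATE TABLE Orders
  (OrderId INTEGER NOT NULL,
   CONSTRAINT OrdersPK PRIMARY KEY (OrderId));

CREATE TABLE OrderLines
  (OrderId INTEGER NOT NULL,
   LineNbr INTEGER NOT NULL,
   CONSTRAINT LinesFK FOREIGN KEY (OrderId)
     REFERENCES Orders (OrderId)
     ON DELETE CASCADE
     ON UPDATE RESTRICT);

INSERT INTO Orders VALUES (1);
INSERT INTO OrderLines VALUES (1, 1);   -- accepted: parent row 1 exists
INSERT INTO OrderLines VALUES (2, 1);   -- rejected: no parent row 2

DELETE FROM Orders WHERE OrderId = 1;   -- cascade also deletes the line for order 1
```

Replacing ON DELETE CASCADE with ON DELETE RESTRICT would instead cause the final DELETE to fail while the order line still exists.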
Check Constraints
• Rules which limit the allowable values in one or more columns:
CREATE TABLE Employee
(FirstName CHAR(20),
LastName CHAR(30),
-- Salary's data type is an assumption
Salary DECIMAL(9,2) CHECK (Salary>0 AND Salary<200000))
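A check constraint can also be added to a table that already exists. The sketch below assumes an Employee table with a numeric Salary column that was created without the check; the constraint name SalaryRange and the inserted values are made up.

```sql
-- Add the rule to an existing table.
ALTER TABLE Employee
  ADD CONSTRAINT SalaryRange
  CHECK (Salary > 0 AND Salary < 200000);

-- Rejected by the database, whichever program or utility issues it:
INSERT INTO Employee (FirstName, LastName, Salary)
  VALUES ('Pat', 'Jones', 250000);

-- Accepted:
INSERT INTO Employee (FirstName, LastName, Salary)
  VALUES ('Pat', 'Jones', 50000);
```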
Check Constraints
• Effectively does data checking at the database level.
• Data checking done with display files or application logic can now be done at the database level.
• Ensures that it is always done and closes “back doors” like DFU, ODBC, 3rd-party utilities...
Other resources
• Database Design and Programming for DB2/400 - book by Paul Conte
• SQL for Smarties - book by Joe Celko
• SQL Tutorial - www.as400network.com
• AS/400 DB2 web site at http://www.as400.ibm.com/db2/db2main.htm
• Publications at http://publib.boulder.ibm.com/pubs/html/as400/
• Our web site at http://www.centerfieldtechnology.com
Summary
• SQL is a powerful way to access and process data
• Used effectively, it can reduce the time it takes to build applications
• Once tuned, it can perform very close to (and sometimes better than) HLLs alone
Good Luck
and
Happy SQLing