Oracle SQL: Part 3
TRANSCRIPT
REPORTING AGGREGATED DATA USING GROUP FUNCTIONS
OBJECTIVES
At the end of this lesson, you will know how to:
1. Identify the available group functions.
2. Describe the use of group functions in SELECT statements.
3. Group data by using the GROUP BY clause.
4. Include or exclude grouped rows by using the HAVING clause.
WHAT ARE GROUP FUNCTIONS?
Group functions are SQL functions that operate on a group of rows and return one result per group.
This group of rows may be all the rows of a table, or the rows split into smaller groups.
EXAMPLE
This example uses the group function COUNT to count the number of employees who earn a salary from the company.
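The statement shown on the slide is not part of this transcript. A minimal sketch of such a query, assuming the employees table has a salary column as used elsewhere in this lesson, might look like this:

```sql
-- Count the employees whose salary is not null
SELECT COUNT(salary)
FROM employees;
```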
AVG & SUM
The AVG and SUM functions are used to find the average and the sum, respectively, of a group of numbers.
EXERCISE
Query your database for the average salary of all the employees in the department where department_id = 90.
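One possible answer to the exercise, sketched under the assumption that salary and department_id are columns of the employees table:

```sql
-- Average salary of the employees in department 90
SELECT AVG(salary)
FROM employees
WHERE department_id = 90;
```

Replacing AVG with SUM in the same statement would return the total salary for that department instead.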
MIN & MAX
The MAX and MIN functions display the maximum and the minimum, respectively, of all the values in the specified group or column.
EXAMPLE
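The slide example is missing from the transcript. As an illustration only, assuming the same employees table and salary column, the two functions can be used together:

```sql
-- Highest and lowest salary across all employees
SELECT MAX(salary), MIN(salary)
FROM employees;
```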
STDDEV & VARIANCE
STDDEV and VARIANCE are used to find the standard deviation and the variance of the numbers in the specified column or group.
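A short sketch of how these two functions might be called, again assuming the salary column of the employees table:

```sql
-- Standard deviation and variance of all salaries
SELECT STDDEV(salary), VARIANCE(salary)
FROM employees;
```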
COUNT
The COUNT function counts the number of rows in the stated group or column. It has three different variations but each performs the same function of counting the number of rows in the group.
COUNT (COLUMN_NAME)
This form counts the rows in which the stated column is not null, and returns their total number.
FORMAT: SELECT COUNT (column_name) FROM table_name;
EXAMPLE:
This SQL statement COUNTs all the rows in the employees table and returns the number of rows.
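The statement itself is not in the transcript. Counting every row of a table is normally done with COUNT(*), so a likely form is:

```sql
-- Count all rows in the employees table, including rows with nulls
SELECT COUNT(*)
FROM employees;
```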
EXAMPLE:
The SQL statement counts the number of rows that have a value in the manager_id column.
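Following the FORMAT given earlier, a sketch of this statement would be:

```sql
-- Count the rows where manager_id is not null
SELECT COUNT(manager_id)
FROM employees;
```

Because nulls are ignored, an employee with no manager is not included in this count.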
COUNT (DISTINCT COLUMN_NAME)
Where the same value occurs many times in a column, the Oracle server counts them all as one value when the DISTINCT keyword is used inside COUNT.
Here the Oracle server counts the different values in the manager_id column instead of the number of rows.
From the result of this query, it is clear that all 107 employees share only 18 manager_ids.
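The query being discussed is not in the transcript; based on the description, it was presumably along these lines:

```sql
-- Count the different manager_id values, ignoring duplicates and nulls
SELECT COUNT(DISTINCT manager_id)
FROM employees;
```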
The DISTINCT keyword, when used with a group function, makes the function consider only the distinct values in that group.
The opposite of the DISTINCT keyword is the ALL keyword, which is the default: the function operates on all the values in the group, including duplicates. Only null values are excluded.
GROUPING DATA
Earlier we said that group functions operate on a group of rows.
We also said that this group of rows could be all the rows of a table.
When we want to divide the rows of a table into smaller groups, so that a group function returns one result for each group, we use the GROUP BY clause.
EXAMPLE:
Conceptually, the Oracle server looks at the department_id column and finds the distinct departments.
It then groups the rows by department and, for each distinct department_id, returns the minimum salary.
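The steps just described correspond to a query of roughly this shape (a sketch; the slide's exact statement is not in the transcript):

```sql
-- Minimum salary for each department
SELECT department_id, MIN(salary)
FROM employees
GROUP BY department_id;
```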
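When using the GROUP BY clause, one general rule is that GROUP BY is followed by the name of the column (or columns) to group by, and every column in the SELECT list that is not inside a group function must appear in the GROUP BY clause.
A column alias cannot be used in the GROUP BY clause.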
EXAMPLE:
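The slide example is missing here. As an illustration of the alias rule, the sketch below gives department_id an alias in the SELECT list but still names the column itself in GROUP BY. The alias names dept and avg_sal are made up for this example.

```sql
-- The aliases can label the output,
-- but GROUP BY refers to the column name, not the alias
SELECT department_id AS dept, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id;
```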
NESTED GROUPS
The situation may demand a nested GROUP, also referred to as a sub GROUP.
EXAMPLE:
With nested group functions, the inner group function is evaluated before the outer group function.
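The example from the slide is not in the transcript. Reading "nested groups" as one group function wrapped around another, a minimal sketch would be:

```sql
-- Inner function: average salary per department
-- Outer function: the highest of those averages
SELECT MAX(AVG(salary))
FROM employees
GROUP BY department_id;
```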
INCLUDING & EXCLUDING ROWS
To restrict individual rows in a SELECT statement, we use the WHERE clause. To restrict the groups produced by a GROUP BY clause, we use the HAVING clause.
Only the groups that meet the HAVING condition are returned.
EXAMPLE:
All the rows displayed by the Oracle server have a department_id greater than 30.
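A query that would produce such a result might look like the sketch below. The choice of MAX(salary) as the group function is an assumption, since the original statement is not in the transcript.

```sql
-- Keep only the groups whose department_id is greater than 30
SELECT department_id, MAX(salary)
FROM employees
GROUP BY department_id
HAVING department_id > 30;
```

HAVING is more commonly used with a condition on a group function, for example HAVING MAX(salary) > 10000, which WHERE cannot express.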
DISPLAYING DATA FROM MULTIPLE TABLES
OBJECTIVES
At the end of this module, you will learn how to:
1. Write a SELECT statement to access data from more than one table using EQUIJOINS and NON EQUIJOINS.
2. Join a table to itself by using a SELF JOIN.
3. Generate a CARTESIAN PRODUCT of all the rows from two or more tables.
DATA FROM MULTIPLE TABLES
There are times when the situation demands that you view data from more than one table using one SELECT statement.
Suppose you have a department_id and you want to post a letter to that department. The postal code is not in the departments table; it is in the locations table.
You would have to join the departments table to the locations table to get the postal code you need.
When you have such situations, you use the JOIN keyword.
Five types of join comply with standard SQL. These are:
1. The NATURAL JOIN
2. The USING clause
3. The CROSS JOIN
4. The FULL OUTER JOIN
5. Arbitrary join conditions for outer joins
THE NATURAL JOIN
NATURAL JOIN is based on all the columns in the two tables that have the same column_name and data type.
The joining condition for the NATURAL JOIN, is based on the equality of these columns.
EXAMPLE:
Take a look at all the columns in the departments table and the locations table.
A NATURAL JOIN combines all the columns in the departments table with all the columns in the locations table.
The two tables are joined on their common column, which in this case is location_id; it appears once, as the first column in the query result.
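The join described above can be sketched as follows, assuming location_id is the only column name the two tables share:

```sql
-- Join departments to locations on their common column, location_id
SELECT *
FROM departments
NATURAL JOIN locations;
```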
The rows returned by a NATURAL JOIN can be restricted by using the WHERE clause.
FORMAT: SELECT column_name
FROM table_name1
NATURAL JOIN table_name2
WHERE condition
EXAMPLE:
This would display only the rows with a department_id greater than 30.
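Following the FORMAT above, a sketch of the restricted join might be written as shown here. The department_name and city columns are assumptions about the departments and locations tables.

```sql
-- Natural join restricted to departments numbered above 30
SELECT department_id, department_name, location_id, city
FROM departments
NATURAL JOIN locations
WHERE department_id > 30;
```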
THE USING CLAUSE
The USING clause is used when more than one column could satisfy the JOIN condition, i.e. both tables have more than one column name in common.
The USING clause specifies which column is to be used for the JOIN.
EXAMPLE:
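The slide example is missing from the transcript. As a sketch, assume the employees and departments tables both contain department_id and manager_id columns; USING then picks department_id alone as the join column. The department_name column is also an assumption.

```sql
-- Join on department_id only, even though the tables share other column names
SELECT employee_id, last_name, department_name
FROM employees JOIN departments
USING (department_id);
```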
Note that the NATURAL JOIN and the USING clause can’t be used together.
TABLE ALIASES
When joining tables that have column names in common, you need to specify which table a particular column belongs to.
Prefixing every column with the full table name makes the SELECT statement long and hard to read, which is why a table alias is useful.
Although an alias can be as long as 30 characters, it is better to keep it short and simple.
FORMAT:
SELECT table_alias1.column_name1, table_alias2.column_name2
FROM table_name1 table_alias1 JOIN table_name2 table_alias2
USING (column_name)
EXAMPLE:
Once a table alias is defined in the FROM clause, it must be used in place of the table name throughout the SELECT statement.
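A minimal sketch of table aliases in use, with e and d chosen here as illustrative aliases and department_name assumed to be a column of departments:

```sql
-- e and d stand in for employees and departments
SELECT e.last_name, d.department_name
FROM employees e JOIN departments d
USING (department_id);
```

Note that in Oracle the column named in the USING clause is written without a table alias.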
SELF JOINS
A self-join is used to join a table to itself.
The ON clause is used to state the join condition of a self-join; where it is necessary to restrict the rows displayed, you add a WHERE clause.
FORMAT:
SELECT column_name1, column_name2
FROM table1 a JOIN table1 b
ON (a.column_name1 condition b.column_name2)
EXERCISE:
Query the employees table for a possible father and son relationship among employees.
To do that, you have to check whether the last_name of any employee is the same as the first_name of any other employee.
The query returns no data; that is to say, no such scenario exists.
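One way to write this exercise as a self-join, following the FORMAT above (a sketch; the aliases a and b are arbitrary):

```sql
-- Pair each employee with any employee whose first_name equals their last_name
SELECT a.first_name, a.last_name, b.first_name, b.last_name
FROM employees a JOIN employees b
ON (a.last_name = b.first_name);
```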
EXERCISE:
Search the employees table for employees who are managers.
These are employees whose employee_id appears as the manager_id of another employee.
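A possible answer, sketched as a self-join in which e stands for an employee and m for that employee's manager (both aliases are illustrative):

```sql
-- List each manager once, however many employees report to them
SELECT DISTINCT m.employee_id, m.first_name, m.last_name
FROM employees e JOIN employees m
ON (e.manager_id = m.employee_id);
```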
NON EQUIJOINS
A non-equijoin is a join condition that uses any comparison operator other than equality.
A typical non-equijoin contains either the BETWEEN comparison condition or the ANY comparison condition.
The <, >, <= and >= operators are further examples of comparison conditions used in non-equijoins.
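As an illustration of a non-equijoin with BETWEEN, the sketch below assumes a jobs table with job_title, min_salary and max_salary columns. That table is not mentioned in this transcript.

```sql
-- Match each employee to every job whose salary range contains their salary
SELECT e.last_name, e.salary, j.job_title
FROM employees e JOIN jobs j
ON (e.salary BETWEEN j.min_salary AND j.max_salary);
```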
GENERATING CARTESIAN PRODUCTS
If you write a join query and fail to specify a join condition, Oracle Database returns the CARTESIAN PRODUCT of the tables.
Oracle combines each row of one table with each row of the other table, resulting in many rows that are rarely useful.
To avoid a CARTESIAN PRODUCT, always include a valid JOIN condition, unless you specifically need a CARTESIAN PRODUCT.
Cartesian products are not always useless: when we need a large number of rows for analysis purposes, we can use a CARTESIAN PRODUCT.
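When a Cartesian product is wanted on purpose, it can be requested explicitly with CROSS JOIN. A minimal sketch, assuming the employees and departments tables used earlier and a department_name column:

```sql
-- Every employee row paired with every department row
SELECT last_name, department_name
FROM employees
CROSS JOIN departments;
```

The number of rows returned is the number of rows in employees multiplied by the number of rows in departments.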