tips for effective queries

Upload: balajismith

Post on 06-Jul-2018


  • 8/17/2019 Tips for Effective Queries


    Tips to Write Effective Queries and EXPLAIN PLAN

    Contents

    SQL Statement Processing Phases
    AUTOTRACE
    EXPLAIN PLAN
    Explain Plan Using SQL ID
    Explain Plan from Active Session (Using TOP)
    Explain Plan Operations Reference:
      - Table Access Methods (Full Table Scans, Cluster, Hash, by Rowid, Index Lookup)
      - Index Access Methods (Unique scan, Range scan, Full scan, Fast full scan)
      - Join Operation Techniques (Nested Loops, Merge Joins or Sort Joins, Hash Joins)
      - Operations (sort, filter, view)
    Detect Driving Table
    Several Tips to write better queries

    SQL Statement Processing Phases

    The four statement processing phases in SQL are parsing, binding, executing and fetching.

    PARSE: During the parse step, Oracle first checks whether the SQL statement is already in the library cache. If it is, only a little further processing is necessary, such as verification of access rights. If not, the statement must be parsed, checked for syntax errors, checked for correctness of table and column names, and optimized to find the best access plan. The former type of parse is called a soft parse, and it is considerably faster than the latter, a hard parse.

    BIND: It scans the statement for bind variables and assigns a value to each variable.

    EXECUTE: The Server applies the parse tree to the data buffers, performs necessary I/O and sorts for DML statements.

    FETCH: Retrieves rows for a SELECT statement during the fetch phase. Each fetch retrieves multiple rows, using an array fetch.

    A careful understanding of these steps shows that real user data is processed in steps 2 through 4, and that step 1 is present merely for the Oracle engine to deal with the SQL statement. This first step may take considerable time and resources, and because it is overhead from the data processing point of view, applications should be written to minimize the time spent in it. The most efficient way to do this is to avoid the parse/optimization step as much as possible.
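    One common way to avoid repeated hard parses is to use bind variables, so that the statement text stays identical between executions. The SQL*Plus sketch below assumes the EMP demo table used later in this document; the employee numbers are illustrative assumptions only.

    ```sql
    -- Literal values: each distinct literal gives a different statement text,
    -- so each statement may need its own hard parse.
    SELECT ename FROM emp WHERE empno = 7369;
    SELECT ename FROM emp WHERE empno = 7499;

    -- Bind variable: the statement text is the same on every execution,
    -- so later executions can reuse the cursor already in the library cache.
    VARIABLE v_empno NUMBER
    EXEC :v_empno := 7369
    SELECT ename FROM emp WHERE empno = :v_empno;
    EXEC :v_empno := 7499
    SELECT ename FROM emp WHERE empno = :v_empno;
    ```

    With the bind variable version, the second execution should only need a soft parse, since Oracle finds the identical statement text in the library cache.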

    This PDF was generated via the PDFmyURL web conversion service (http://pdfmyurl.com/?src=pdf)


    AUTOTRACE

    The AUTOTRACE facility of SQL*Plus automatically reports the execution plan and some additional statistics for each SQL command you run. Note that the statement is normally executed; only SET AUTOTRACE TRACEONLY EXPLAIN shows the plan of a SELECT WITHOUT executing it. The facility is controlled with the SET AUTOTRACE command.

    Setup:

    - Create the PLAN_TABLE as SYS by executing:
    @$ORACLE_HOME/rdbms/admin/utlxplan.sql
    create or replace public synonym PLAN_TABLE for PLAN_TABLE;
    grant all on PLAN_TABLE to PUBLIC;

    - Set up the PLUSTRACE role (to be used with the AUTOTRACE options) as the SYS user:
    @$ORACLE_HOME/sqlplus/admin/plustrce.sql
    grant plustrace to public;

    If granting the 'plustrace' role to public doesn't work, you could also do the following:
    alter user &USER_NAME default role PLUSTRACE;

    Note: If you have problems with AUTOTRACE, try the following as SYS:
    grant select on v_$session to plustrace;

    Options to execute it

    First of all, the default output format is hard to read, so I suggest executing the following first:
    set lines 100 wrap on trim on trimspool on
    col plan_plus_exp format a100

    OFF - Disables autotracing of SQL statements
    ON - Enables autotracing of SQL statements
    TRACEONLY - Enables autotracing of SQL statements and suppresses statement output
    EXPLAIN - Displays execution plans, but does not display statistics
    STATISTICS - Displays statistics, but does not display execution plans

    The best option is usually SET AUTOTRACE TRACE (short for TRACEONLY). It does not return the rows selected by the query; it returns the access path from the plan table and the statistics.


    If you just want the execution plan, you can use SET AUTOTRACE TRACE EXP. These are the options:
    set autotrace on explain;             -> only the explain plan and the query result
    set autotrace on statistics;          -> only the result set and statistics. No explain plan
    set autotrace traceonly;              -> only the explain plan and statistics. No query result
    set autotrace traceonly statistics;   -> only the statistics. No query result or explain plan
    set autotrace traceonly explain;      -> only the explain plan. No query result or statistics

    To disable it, use:
    SET AUTOTRACE OFF;

    NOTE: The most important results are the db block gets, consistent gets, physical reads, redo size, sorts (memory) and sorts (disk).
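    A short SQL*Plus session puts these options together. It is only a sketch: it assumes the EMP and DEPT demo tables used elsewhere in this document, and the dname column is an assumption about the DEPT table.

    ```sql
    -- Make the plan output readable
    SET LINESIZE 100

    -- Show only the plan; the SELECT itself is not executed
    SET AUTOTRACE TRACEONLY EXPLAIN
    SELECT e.ename, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno = d.deptno;

    -- Run the query, suppress its rows, and show only the statistics
    SET AUTOTRACE TRACEONLY STATISTICS
    SELECT e.ename, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno = d.deptno;

    -- Return to normal behaviour
    SET AUTOTRACE OFF
    ```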

    Statistic Explanation

    • recursive calls: The number of internal calls Oracle made to execute the command. These are additional SQL statements that Oracle executes implicitly to process your (user) SQL statement. They can be many things: hard parses, trigger executions, sort extent allocations, data dictionary lookups/updates, etc.

    • db block gets: The number of blocks read in current mode. A 'db block get' is a current mode get; that is, it returns the most up-to-date copy of the data in that block, as it is right now. There can only be one current copy of a block in the buffer cache at any time. Db block gets are generally used when DML changes data in the database. In that case, row-level locks are implicitly taken on the updated rows. There is also at least one well-known case where a select statement does a db block get and does not take a lock: when it does a full table scan or fast full index scan, Oracle will read the segment header in current mode.

    • consistent gets: The number of blocks read in consistent mode, that is, reads that did not change the data and therefore did not interfere with other users (e.g. by locking data). A 'consistent get' is when Oracle gets the data in a block which is consistent with a given point in time, or SCN. The consistent get is at the heart of Oracle's read consistency mechanism. When blocks are fetched in order to satisfy a query result set, they are fetched in consistent mode. If no block in the buffer cache is consistent to the correct point in time, Oracle will (attempt to) reconstruct that block using the information in the rollback segments. If it fails to do so, the query errors out with the much dreaded, much feared, and much misunderstood ORA-1555 "snapshot too old".

    • physical reads: The number of blocks read from disk. Basically those that cannot be satisfied by the cache and those that are direct reads.
    • redo size: The amount of redo generated, in bytes. The redo entries are written from the log buffer to the online redo log files by LGWR.
    • bytes sent via SQL*Net to client: The number of bytes sent across the network from the server to the client.
    • bytes received via SQL*Net from client: The number of bytes sent across the network from the client to the server.
    • SQL*Net roundtrips to/from client: The number of exchanges between client and server.
    • sorts (memory): The number of data sorts performed in memory.
    • sorts (disk): The number of data sorts performed on disk.
    • rows processed: The number of rows processed by the query.


    The db block gets, consistent gets and physical reads give the number of blocks that were read from the buffers or from disk. For many queries, the number of physical reads is low because the data is already in the database buffers. If the number of physical reads is high, the query can be expected to be slow, as there will be many disk accesses.

    The bytes received/sent via SQL*Net indicate how much data is being moved across the network. This is important because moving a lot of data across the network may affect the network's performance.

    The sorts indicate the amount of work done in sorting data during the execution of the query. Sorts are important because sorting data is a slow process.
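    The same counters can also be read directly from the dynamic performance views, which is handy when AUTOTRACE is not available. The query below is a sketch; it assumes your user has SELECT access to V$MYSTAT and V$STATNAME, and it reports cumulative values for the current session rather than for a single statement.

    ```sql
    -- Cumulative values of the key statistics for the current session.
    -- Run it before and after a query and compare the values to see
    -- how much work that query did.
    SELECT n.name, s.value
    FROM   v$mystat s, v$statname n
    WHERE  s.statistic# = n.statistic#
    AND    n.name IN ('db block gets',
                      'consistent gets',
                      'physical reads',
                      'redo size',
                      'sorts (memory)',
                      'sorts (disk)')
    ORDER BY n.name;
    ```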

    EXPLAIN PLAN

    The EXPLAIN PLAN command uses a table to store information about the execution plan chosen by the optimizer.
    Oracle also provides the AUTOTRACE facility to show the execution plan and some statistics.

    There are two methods for looking at the execution plan:
    1. EXPLAIN PLAN command: Displays an execution plan for a SQL statement without actually executing the statement.
    2. V$SQL_PLAN: A dictionary view introduced in Oracle 9i that shows the execution plan for a SQL statement that has been compiled into a cursor in the cursor cache.

    EXPLAIN PLAN COMMAND

    Perform the following to check it:
    EXPLAIN PLAN FOR your query.

    Example
    EXPLAIN PLAN FOR
    SELECT * FROM emp e, dept d
    WHERE e.deptno = d.deptno
    AND e.ename = 'SMITH';

    Finally, use the DBMS_XPLAN.DISPLAY function to display the execution plan:
    SET LINESIZE 130
    SET PAGESIZE 0
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

    -------------------------------------------------------------------------
    | Id  | Operation                     | Name    | Rows | Bytes | Cost |
    -------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |         |    1 |    57 |    3 |
    |   1 |  NESTED LOOPS                 |         |    1 |    57 |    3 |
    |*  2 |   TABLE ACCESS FULL           | EMP     |    1 |    37 |    2 |
    |   3 |   TABLE ACCESS BY INDEX ROWID | DEPT    |    1 |    20 |    1 |
    |*  4 |    INDEX UNIQUE SCAN          | PK_DEPT |    1 |       |      |
    -------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("E"."ENAME"='SMITH')
       4 - access("E"."DEPTNO"="D"."DEPTNO")

    How To Read Query Plans?

    The execution order in EXPLAIN PLAN output begins with the line that is the furthest indented to the right.
    The next step is the parent of that line.
    If two lines are indented equally, then the top line is normally executed first.

    --------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name      | Rows | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |           |    1 |    11 |    53   (2)| 00:00:01 |
    |   1 |  SORT AGGREGATE               |           |    1 |    11 |            |          |
    |   2 |   TABLE ACCESS BY INDEX ROWID | SKEW      |   53 |   583 |    53   (2)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN           | SKEW_COL1 |   54 |       |     3   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------

    The DBMS_XPLAN package supplies four table functions:

    DISPLAY: formats and displays the contents of a PLAN_TABLE. Parameters: table_name, statement_id, format, filter_preds
    DISPLAY_CURSOR: formats and displays the execution plan of any loaded cursor available in V$SQL. Parameters: sql_id, child_number, format
    DISPLAY_AWR: formats and displays the execution plan of a stored SQL statement in the AWR (DBA_HIST_SQLPLAN). Parameters: sql_id, plan_hash_value, db_id, format
    DISPLAY_SQLSET: formats and displays the execution plan of statements stored in a SQL tuning set; used in conjunction with the package DBMS_SQLTUNE. Parameters: sqlset_name, sql_id, plan_hash_value, format, sqlset_owner

    The DBMS_XPLAN.DISPLAY function is most often called with these 3 parameters:
    1. table_name - Name of the plan table, default value 'PLAN_TABLE'.
    2. statement_id - Statement id of the plan to be displayed, default value NULL.
    3. format - Controls the level of detail displayed, default value 'TYPICAL'. Other values include 'BASIC', 'ALL', 'SERIAL'.


    EXPLAIN PLAN SET STATEMENT_ID='TSH' FOR
    SELECT * FROM emp e, dept d
    WHERE e.deptno = d.deptno
    AND e.ename = 'SMITH';

    SET LINESIZE 130
    SET PAGESIZE 0
    SELECT *
    FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE','TSH','BASIC'));

    ----------------------------------------------------
    | Id  | Operation                     | Name    |
    ----------------------------------------------------
    |   0 | SELECT STATEMENT              |         |
    |   1 |  NESTED LOOPS                 |         |
    |   2 |   TABLE ACCESS FULL           | EMP     |
    |   3 |   TABLE ACCESS BY INDEX ROWID | DEPT    |
    |   4 |    INDEX UNIQUE SCAN          | PK_DEPT |
    ----------------------------------------------------
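    If DBMS_XPLAN is not available, the plan can also be read straight from the plan table with a hierarchical query. This is a minimal sketch that assumes the default PLAN_TABLE and the 'TSH' statement id set by the EXPLAIN PLAN statement above.

    ```sql
    -- Walk the plan rows of statement 'TSH' from the root (id = 0) downwards,
    -- indenting each operation by its depth in the plan tree.
    SELECT LPAD(' ', 2 * (LEVEL - 1)) || operation || ' ' || options || ' ' || object_name AS query_plan
    FROM   plan_table
    START WITH id = 0 AND statement_id = 'TSH'
    CONNECT BY PRIOR id = parent_id AND statement_id = 'TSH';
    ```

    The indentation produced by LEVEL makes the parent/child relationships visible, which is what the reading rules (furthest right first, then its parent) rely on.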

    Explain Plan using the SQL ID

    You can also grab the Explain Plan by using the SQL ID for an already executed query. Example:

    create table t ( x varchar2(30) primary key, y int );
    exec dbms_stats.set_table_stats( user, 'T', numrows => 1000000, numblks => 100000 );
    declare
        l_x_number number;
        l_x_string varchar2(30);
    begin
        execute immediate 'alter session set optimizer_mode=all_rows';
        for x in (select * from t look_for_me where x = l_x_number) loop null; end loop;
        for x in (select * from t look_for_me where x = l_x_string) loop null; end loop;
        execute immediate 'alter session set optimizer_mode=first_rows';
        for x in (select * from t look_for_me where x = l_x_number) loop null; end loop;
        for x in (select * from t look_for_me where x = l_x_string) loop null; end loop;
    end;
    /

    Run this query to "catch" specific queries:
    select sql_id, child_number, sql_text
    from v$sql
    where upper(sql_text) like 'SELECT % FROM T%'
    ORDER BY 2;

    Then run the following to show its plan (enter the sql_id returned by the previous query):
    select * from table(DBMS_XPLAN.DISPLAY_CURSOR('&cursor_id', 0) );  -- The 0 is the child_number of the query

    You can also view the plan of a statement stored in the AWR (replace 'SQL_ID' with the actual SQL ID):
    select * from table(DBMS_XPLAN.DISPLAY_AWR('SQL_ID'));
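    When the statement of interest is the one you have just run in your own session, DISPLAY_CURSOR can be called without arguments. The sketch below assumes the EMP demo table and that your user can read V$SESSION, V$SQL and V$SQL_PLAN.

    ```sql
    -- With SERVEROUTPUT on, SQL*Plus issues an extra call after each statement,
    -- and that call would become the "last statement" of the session.
    SET SERVEROUTPUT OFF

    SELECT * FROM emp WHERE ename = 'SMITH';

    -- No sql_id given: shows the plan of the last statement executed by this session
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
    ```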

    Explain plan Hierarchy

    Sample explain plan:

    Query Plan
    -----------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1234
      TABLE ACCESS FULL TPAIS [:Q65001] [ANALYZED]

    The rightmost, uppermost operation of an explain plan is the first thing that is executed. In this case TABLE ACCESS FULL TPAIS is the first operation. This statement means we are doing a full table scan of table TPAIS. When this operation completes, the resultant row source is passed up to the next level of the query for processing. In this case it is the SELECT STATEMENT, which is the top of the query.

    [CHOOSE] is an indication of the optimizer_goal for the query. This DOES NOT necessarily indicate that the plan has actually used this goal. The only way to confirm this is to check the cost= part of the explain plan as well. For example, the following plan indicates that the CBO has been used because there is a cost in the cost field:
    SELECT STATEMENT [CHOOSE] Cost=1234

    However, the explain plan below indicates the use of the RBO because the cost field is blank:
    SELECT STATEMENT [CHOOSE] Cost=

    The cost field is a comparative cost that is used internally to determine the best cost for particular plans. The costs of different statements are not really directly comparable.

    [:Q65001] indicates that this particular part of the query is being executed in parallel. This number indicates that the operation will be processed by a parallel query slave as opposed to being executed serially.

    [ANALYZED] indicates that the object in question has been analyzed and there are currently statistics available for the CBO to use. There is no indication of the 'level' of analysis done.


    Explain Plan from Active Session (Using TOP)

    If you notice that a session is using too much CPU, you can identify the actions performed by that session using top and Explain Plan.
    So first, use top to identify the process using high CPU and take note of its PID.

    set linesize 140
    set pagesize 100
    col username format a15
    col machine format a20
    ACCEPT Process_ID prompt 'Pid : '
    select s.inst_id, p.spid, s.sid, s.serial#, s.username, s.machine
    from gv$session s, gv$process p
    where s.paddr = p.addr
    and s.inst_id = p.inst_id
    and p.spid = &Process_ID;

    Once you have the SID associated with that PID, you can use it to find the SQL being executed:
    set lines 140
    set pages 10000
    set long 1000000
    ACCEPT Process_SID prompt 'Sid : '
    SELECT a.sql_id, a.sql_fulltext
    FROM v$sqlarea a, v$session s
    WHERE a.address = s.sql_address
    AND s.sid = &Process_SID;

    Then display the plan for that SQL ID:
    set lines 150
    set pages 40000
    col operation format a55
    col object format a25
    ACCEPT sentencia prompt 'SQL ID of the executed statement : '
    select lpad(' ',2*depth)||operation||' '||options||decode(id, 0, substr(optimizer,1, 6)||'Cost='||to_char(cost)) operation,
    object_name object, cpu_cost, io_cost
    from v$sql_plan where sql_id='&sentencia';

    Explain Plan Operations Reference

    Table Access Methods

    1- FULL TABLE SCAN (FTS) - Reads every row in the table, that is, every block up to the high water mark (HWM). The HWM marks the last block in the table that has ever had data written to it. If you have deleted all the rows, a full scan will still read up to the HWM; TRUNCATE resets the HWM back to the start of the table, whereas DELETE does not. Buffers from FTS operations are placed on the Least Recently Used (LRU) end of the buffer cache, so they will be quickly aged out. FTS is not recommended for large tables unless you are reading more than 5-10% of the table (or so) or you intend to run in parallel. Oracle uses multiblock reads where it can.

    2- CLUSTER - Access via an index cluster.

    3- HASH - A hash key is issued to access one or more rows in a table with a matching hash value.

    4- BY ROWID - This is the quickest access method available. Oracle simply retrieves the block specified and extracts the rows it is interested in. Access by rowid:

    SQL> explain plan for select * from dept where rowid = ':x';

    Query Plan
    ------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID DEPT [ANALYZED]

    Another example where the table is accessed by rowid following index lookup:

    SQL> explain plan for select empno,ename from emp where empno=10;

    Query Plan
    ------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID EMP [ANALYZED]
        INDEX UNIQUE SCAN EMP_I1

    5- INDEX LOOKUP - The data is accessed by looking up key values in an index and returning rowids. A rowid uniquely identifies an individual row in a particular data block. This block is read via single block I/O. In this example an index is used to find the relevant row(s) and then the table is accessed to look up the ename column (which is not included in the index):

    SQL> explain plan for select empno,ename from emp where empno=10;

    Query Plan
    ------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID EMP [ANALYZED]
        INDEX UNIQUE SCAN EMP_I1

    Note the 'TABLE ACCESS BY ROWID' section. This indicates that the table data is not being accessed via a FTS operation but rather by a rowid lookup. In this case looking up values in the index first has produced the rowid. The index is being accessed by an 'INDEX UNIQUE SCAN' operation. This is explained below. The index name in this case is EMP_I1. If all the required data resides in the index then a table lookup may be unnecessary and all you will see is an index access with no table access.


    In the next example all the columns (empno) are in the index. Notice that no table access takes place:

    SQL> explain plan for select empno from emp where empno=10;

    Query Plan
    ------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      INDEX UNIQUE SCAN EMP_I1

    Indexes are presorted, so sorting may be unnecessary if the sort order required is the same as the index order. In the next example the rows are returned in the order of the index, hence a sort is unnecessary.

    SQL> explain plan for select empno,ename from emp where empno > 7876 order by empno;

    Query Plan
    --------------------------------------------------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID EMP [ANALYZED]
        INDEX RANGE SCAN EMP_I1 [ANALYZED]

    In the next example we force a full table scan. Because we have forced a FTS, the data is unsorted and must be sorted after it has been retrieved.

    SQL> explain plan for select /*+ Full(emp) */ empno,ename from emp where empno> 7876 order by empno;

    Query Plan
    --------------------------------------------------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=9
      SORT ORDER BY
        TABLE ACCESS FULL EMP [ANALYZED] Cost=1 Card=2 Bytes=66
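    The high water mark behaviour described under FULL TABLE SCAN can be observed with a small experiment. This is a sketch only: the table name hwm_demo and its columns are hypothetical, and the number of blocks read depends on your block size and storage settings.

    ```sql
    -- Hypothetical scratch table used only for this demonstration
    CREATE TABLE hwm_demo AS
    SELECT LEVEL AS id, RPAD('x', 100, 'x') AS padding
    FROM   dual
    CONNECT BY LEVEL <= 100000;

    -- Delete every row: the emptied blocks remain below the high water mark
    DELETE FROM hwm_demo;
    COMMIT;

    SET AUTOTRACE TRACEONLY STATISTICS
    SELECT COUNT(*) FROM hwm_demo;   -- the full scan still reads up to the HWM
    SET AUTOTRACE OFF

    -- TRUNCATE resets the high water mark
    TRUNCATE TABLE hwm_demo;

    SET AUTOTRACE TRACEONLY STATISTICS
    SELECT COUNT(*) FROM hwm_demo;   -- far fewer blocks should be read now
    SET AUTOTRACE OFF

    DROP TABLE hwm_demo;
    ```

    Comparing the consistent gets of the two COUNT(*) queries shows the difference between DELETE and TRUNCATE as far as full scans are concerned.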

    Index Access Methods

    There are 5 methods of index lookup:

    b1. Index Unique Scan
    b2. Index Range Scan
    b3. Index Full Scan
    b4. Index Fast Full Scan
    b5. Index Skip Scan

    1. Index Unique Scan
    Only one row will be returned. Used when a UNIQUE or PRIMARY KEY constraint guarantees that the statement accesses only a single row.
    Example:


    SQL> explain plan for select empno,ename from emp where empno=10;

    Query Plan
    ------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID EMP [ANALYZED]
        INDEX UNIQUE SCAN EMP_I1

    2. Index Range Scan
    This is a method for accessing multiple column values. You must supply AT LEAST the leading column of the index to access data via the index. It can be used for range operations (e.g. >, <, >=, <=, BETWEEN).

    SQL> explain plan for select empno,ename from emp where empno > 7876 order by empno;

    Query Plan
    --------------------------------------------------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      TABLE ACCESS BY ROWID EMP [ANALYZED]
        INDEX RANGE SCAN EMP_I1 [ANALYZED]

    A non-unique index may return multiple values for an equality predicate such as col1 = 5, and will therefore use an index range scan:

    SQL> explain plan for select mgr from emp where mgr = 5;

    Query Plan
    --------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      INDEX RANGE SCAN EMP_I2 [ANALYZED]

    3. Index Full Scan
    In certain circumstances it is possible for the whole index to be scanned as opposed to a range scan (i.e. where no constraining predicates are provided for a table). Oracle chooses an index full scan when you have statistics that indicate that it is going to be more efficient than a full table scan and a sort. For example, Oracle may do a full index scan when we do an unbounded scan of an index and want the data to be ordered in the index order. The optimizer may decide that selecting all the information from the index and not sorting is more efficient than doing a FTS or a Fast Full Index Scan and then sorting. An index full scan performs single block I/Os, so it may prove to be inefficient.

    It processes all leaf blocks of an index, but only enough branch blocks to find the first leaf block. Used when all necessary columns are in the index and the ORDER BY clause matches the index structure, or if a sort merge join is done.
    E.g. Index BE_IX is a concatenated index on big_emp (empno,ename):

    SQL> explain plan for select empno,ename from big_emp order by empno,ename;

    Query Plan
    --------------------------------------------------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=26
      INDEX FULL SCAN BE_IX [ANALYZED]


    4. Index Fast Full Scan (rarely used)
    Scans all the blocks in the index. Rows are not returned in sorted order. Note that INDEX FAST FULL SCAN is the mechanism behind fast index create and recreate. It can replace a FTS when all necessary columns are in the index, it uses multi-block I/O and it can run in parallel.
    E.g. Index BE_IX is a concatenated index on big_emp (empno, ename):

    SQL> explain plan for select empno,ename from big_emp;

    Query Plan
    ------------------------------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      INDEX FAST FULL SCAN BE_IX [ANALYZED]

    5. Index Skip Scan
    Skips the leading edge of the index and uses the rest. Advantageous if there are few distinct values in the leading column and many distinct values in the non-leading column.
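    A skip scan can be tried out with a composite index whose leading column is not referenced in the WHERE clause. In this sketch the index name emp_dept_ename_ix is hypothetical, the EMP columns are those of the demo table used above, and INDEX_SS is only a request: the optimizer may still pick a different plan.

    ```sql
    -- Hypothetical composite index: few distinct values in the leading column (deptno),
    -- many distinct values in the second column (ename)
    CREATE INDEX emp_dept_ename_ix ON emp (deptno, ename);

    -- The predicate omits the leading column; the INDEX_SS hint asks for a skip scan
    EXPLAIN PLAN FOR
    SELECT /*+ INDEX_SS(e emp_dept_ename_ix) */ e.empno, e.ename
    FROM   emp e
    WHERE  e.ename = 'SMITH';

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    ```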

    Join Operations Techniques

    There are three kinds of join methods: nested loops, merge joins, and hash joins. Each has specific performance implications, and each should be used in different circumstances.

    a. Nested loops work from one table (preferably the smaller of the two), looking up the join criteria in the larger table. For every row in the outer table, Oracle looks up the matching rows in the inner table. Useful when joining small subsets of data and there is an efficient way to access the second table (index lookup). It helps if the join column of the larger table is indexed. Nested loops are useful when joining a smaller table to a larger table and perform very well on smaller amounts of data. Nesting is when you perform the same operation for every element in a data set: for each row in A, do B.

    b. Hash joins read the smaller tables into a hash table in memory so the referenced records can be quickly accessed by the hash key. Hash joins are great in data warehouse scenarios where several smaller tables (with referential integrity defined) are being referenced in the same SQL query as a single larger or very large table. The hash join has an initial overhead (of creating the hash tables) but performs rather well no matter how many rows are involved.

    c. Sort Merge or Merge joins work by selecting the result set from each table, and then merging these two (or more) results together. Merge joins are useful when joining two relatively large tables of about the same size; the merge join starts out with more overhead but remains rather consistent.

    a. NESTED LOOPS JOIN - Nested Loops Joins are the most common and straightforward type of join in Oracle. When joining two tables, for each row in one table Oracle looks up the matching rows in the other table.
    Take the example of 2 tables joined as follows:
    Select *
    From Table1 T1, Table2 T2
    Where T1.Table1_Id = T2.Table1_id;

    In the case of the Nested Loop Join, an outer table is chosen (say Table1 in this case) and, for each row in the outer table, the inner table (Table2) is accessed with an index to retrieve the matching rows. Once all matching rows from Table2 are found, the next row of Table1 is retrieved and the matching against Table2 is performed again.
    It is important that efficient index access is used on the inner table (Table2 in this example) or that the inner table be a very small table. This is critical to prevent table scans from being performed on the inner table for each row of the outer table that was retrieved.
    The optimizer uses nested loops when joining tables containing a small number of rows with an efficient driving condition. It is the most common join performed by transactional (OLTP) systems.
      -OUTER - A nested loops operation to perform an outer join statement.

    Note: You will see more use of nested loops with the FIRST_ROWS optimizer mode, as it works on the model of showing results to the user as soon as they are fetched. There is no need to cache any data before it is returned to the user. In the case of a hash join this is needed, as explained below.
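    To see what a nested loops plan looks like for a given query, the join method can be requested with hints. The example below is a sketch under assumptions: it uses the EMP and DEPT demo tables, the dname column and the deptno value are illustrative, and the optimizer is free to ignore hints it cannot honour.

    ```sql
    -- Ask for DEPT as the outer (driving) table and EMP as the inner table
    -- joined with nested loops.
    EXPLAIN PLAN FOR
    SELECT /*+ LEADING(d) USE_NL(e) */ e.ename, d.dname
    FROM   dept d, emp e
    WHERE  e.deptno = d.deptno
    AND    d.deptno = 10;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    ```

    For this to be efficient, the inner table (EMP here) should have an index on the join column, as noted above.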

    b. HASH JOIN - An operation that joins two sets of rows and returns the result.
      -ANTI - A hash anti-join.
      -SEMI - A hash semi-join.

    Hash joins are used when joining large tables. The optimizer uses the smaller of the 2 tables to build a hash table in memory, then scans the larger table and compares the hash value of each of its rows with this hash table to find the joined rows.
    The algorithm of the hash join is divided into two parts:

    1. Build an in-memory hash table on the smaller of the two tables.
    2. Probe this hash table with the hash value of each row of the second table.

    Unlike a nested loop, the output of a hash join is not instantaneous, because the join is blocked until the hash table has been built.
    The hash join is a very efficient join when used in the right situation. With the hash join, one table is chosen as the outer table (the larger of the two tables in the join) and the other is chosen as the inner table. Both tables are broken into sections and the inner table's join columns are stored in memory (if hash_area_size is large enough) and 'hashed'. This hashing provides an algorithmic pointer that makes data access very efficient. Oracle attempts to keep the inner table in memory since it will be 'scanned' many times. The outer rows that match the query predicates are then selected and, for each outer table row chosen, hashing is performed on the key and the hash value is used to quickly find the matching row in the inner table. This join can often outperform a sort merge join, particularly when one table is much larger than the other. No sorting is performed and index access can be avoided, since the hash algorithm is used to locate the block where the inner row is stored. Hash joins are only used for equi-joins. Other important init.ora parameters are: hash_join_enabled, sort_area_size and hash_multiblock_io_count.

    Note: You may see more hash joins used with the ALL_ROWS optimizer mode, because it works on the model of showing results after all the rows of at least one of the tables have been hashed into the hash table.
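    A hash join can be requested in the same way, which is useful for comparing its cost with the other join methods on your own data. Again a sketch: EMP and DEPT are the demo tables, dname is an assumed column, and the optimizer decides which input actually becomes the build table.

    ```sql
    -- Ask for a hash join between DEPT and EMP.
    -- This is an equi-join, which is what a hash join requires.
    EXPLAIN PLAN FOR
    SELECT /*+ LEADING(d) USE_HASH(e) */ e.ename, d.dname
    FROM   dept d, emp e
    WHERE  e.deptno = d.deptno;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    ```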


    c. SORT MERGE JOIN or MERGE JOIN or Merge Scan - An operation that accepts two sets of rows, each sorted by a specific value, and combines each row from one set with the matching rows from the other. Take an example of 2 tables being joined and returning a large number of rows (say, thousands) as follows:

    Select *
    From Table1 T1, Table2 T2
    Where T1.Table1_Id = T2.Table1_id;

    The Merge Scan join will be chosen because the database has detected that a large number of rows need to be processed, and it may also notice that index access to the rows is not efficient since the data is not clustered (ordered) efficiently for this join. The steps followed to perform this type of join are as follows:

    1) Pick an inner and outer table.
    2) Access the inner table and choose the rows that match the predicates in the Where clause of the SQL statement.
    3) Sort the rows retrieved from the inner table by the joining columns and store these as a temporary table. This step may not be performed if the data is ordered by the keys and efficient index access can be performed.
    4) The outer table may also need to be sorted by the joining columns so that both tables to be joined are sorted in the same manner. This step is also optional and depends on whether the outer table is already well ordered by the keys and whether efficient index access can be used.
    5) Read both outer and inner tables (these may be the sorted temporary tables created in previous steps), choosing rows that match the join criteria. This operation is very quick since both tables are sorted in the same manner and database prefetch can be used.
    6) Optionally sort the data one more time if a sort was requested (e.g. an 'Order By' clause) using columns that are not the same as were used to perform the join.

    The merge join can be deceptively fast due to database multi-block fetch capabilities (helped by the initialization parameter db_file_multiblock_read_count) and the fact that each table is accessed only once. Unlike hash joins, merge joins are not restricted to equi-joins. The other init.ora parameter that can be tuned to help performance is sort_area_size.

    Options of the operation:
      - OUTER - A merge join operation to perform an outer join statement.
      - ANTI - A merge anti-join.
      - SEMI - A merge semi-join.

    The important point to understand is that, unlike a nested loop, where the driven (inner) table is read once for every row coming from the outer table, in a sort merge join each of the tables involved is accessed at most once. So sort merge joins prove to be better than nested loops when the data set is large.
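The sort merge join discussed above can be requested with the USE_MERGE hint. This is a minimal sketch that reuses the placeholder tables Table1 and Table2 from the example; the second query is an assumed variation with an inequality join, a case where a hash join is not available but a sort merge join is. As with any hint, the optimizer is free to ignore it.

```sql
-- Equi-join from the example above, with a hint asking for a sort merge join.
SELECT /*+ USE_MERGE(t1 t2) */ *
FROM   Table1 t1, Table2 t2
WHERE  t1.Table1_Id = t2.Table1_Id;

-- Inequality join: not eligible for a hash join, but eligible for a sort merge join.
SELECT /*+ USE_MERGE(t1 t2) */ *
FROM   Table1 t1, Table2 t2
WHERE  t1.Table1_Id < t2.Table1_Id;
```

When the hint is followed, the plan typically shows a MERGE JOIN step with SORT JOIN steps beneath it.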

    When does the optimizer use a sort merge join?

    a) When the join condition is an inequality condition (like


    Operations
    The following operations show up in explain plans:
    a. Sort
    b. Filter
    c. View

    a. Sorts
    There are a number of different operations that promote sorts:

      - Order by clauses
      - Group by
      - Sort merge join

    Sorts are expensive operations, especially on large tables where the rows do not fit in memory and spill to disk. By default sort blocks are placed into the buffer cache. This may result in aging out of other blocks that may be reread by other processes.

    b. Filter
    FILTER has a number of different meanings. It is used to indicate partition elimination; it may also indicate an actual filter step where one row source is filtering another; and functions such as MIN may introduce filter steps into query plans.
    In the next example there are 2 filter steps. The first is effectively like a NL (nested loops) except that it stops when it gets something that it doesn't like (i.e. a bounded NL). This is there because of the NOT IN. The second is filtering out the min value:

    SQL> explain plan for
         select * from emp
         where empno not in (select min(empno) from big_emp group by empno);

    Query Plan
    ------------------
    SELECT STATEMENT [CHOOSE] Cost=1
      FILTER                          **** This is like a bounded nested loops
        TABLE ACCESS FULL EMP [ANALYZED]
        FILTER                        **** This filter is introduced by the min
          SORT GROUP BY NOSORT
            INDEX FULL SCAN BE_IX

    This example is also interesting in that it has a NOSORT function. The group by does not need to sort because the index row source is already presorted.

    c. Views
    When a view cannot be merged into the main query you will often see a projection view operation. This indicates that the 'view' will be selected from directly, as opposed to being broken down into joins on the base tables. A number of constructs make a view non-mergeable, and inline views that contain such constructs are non-mergeable as well.
    In the following example the select contains an inline view that cannot be merged:

    SQL> explain plan for
         select ename, tot
         from emp, (select empno, sum(empno) tot from big_emp group by empno) tmp
         where emp.empno = tmp.empno;

    Query Plan
    ------------------------
    SELECT STATEMENT [CHOOSE]
      HASH JOIN
        TABLE ACCESS FULL EMP [ANALYZED]
        VIEW
          SORT GROUP BY
            INDEX FULL SCAN BE_IX

    In this case the inline view tmp, which contains an aggregate function, cannot be merged into the main query. The explain plan shows this as a VIEW step.

    Optimizer Method and how to know the Driving Table
    A small "golden rule" is that your driving table should be the table that returns the smallest number of rows (so you need to look at the where clause), and this is not always the table with the smallest number of rows. But where do you specify the driving table?

    Oracle processes result sets a table at a time. It starts by retrieving all the data for the first (driving) table. Once this data is retrieved it is used to limit the number of rows processed for subsequent (driven) tables. In the case of multiple table joins, the driving table limits the rows processed for the first driven table. Once processed, this combined set of data is the driving set for the second driven table, and so on. In plain terms, it is best to process first the tables that will retrieve a small number of rows. The optimizer will do this to the best of its ability regardless of the structure of the DML, but some factors may help.

    Both the Rule and Cost based optimizers select a driving table for each query.

    In the RBO (Rule Based Optimizer) the driving table is the LAST TABLE in the FROM CLAUSE (it chooses the driving order by taking the tables in the FROM clause RIGHT to LEFT).

    In the CBO (Cost Based Optimizer) the driving table is determined from costs derived from GATHERED STATISTICS. If there are no statistics, or if the optimizer_mode is COST, then the CBO chooses the driving order of tables from LEFT to RIGHT in the FROM clause, so place the most limiting tables first in the FROM clause.

    If a decision cannot be made, the order of processing is from the END of the FROM clause to the START.
    In RBO, we have a habit of ordering tables right-to-left in queries, right being the driving table for the query.
    In CBO, I had to adapt to ordering from left-to-right, left being the driving table. The ORDERED hint used in CBO picks up tables left-to-right for processing. Take your pick.

    Hence, it is important to have good statistics so that the correct driving table is picked.
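To make the left-to-right behaviour of the ORDERED hint concrete, here is a small sketch. The EMP and DEPT demo tables and the 'RESEARCH' filter are assumptions chosen for illustration; the point is only that the table listed first in the FROM clause becomes the driving table when the hint is honoured.

```sql
-- DEPT is filtered down to a single row, so it is listed first and,
-- with ORDERED, is joined first (it drives the join to EMP).
SELECT /*+ ORDERED */ e.ename, d.dname
FROM   dept d, emp e
WHERE  d.dname  = 'RESEARCH'
AND    e.deptno = d.deptno;
```

Swapping the two tables in the FROM clause while keeping the hint would make EMP the driving table instead, which is usually the less selective choice here.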


    The WHERE clause is the main decision maker about which indexes to use. You should always try to use your unique indexes first, and if that is not possible, use a non-unique index. For a query to use an index, one or more fields from that index need to be mentioned in the WHERE clause. On concatenated indexes, the index will only be used if the first field in the index is mentioned. On 10g that is no longer strictly needed, because the optimizer can consider an index skip scan. The more of its fields are mentioned in the WHERE clause, the better an index is used.
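The rule about the leading field of a concatenated index can be sketched as follows. The index name emp_dept_job_ix is hypothetical, and the EMP demo table with DEPTNO and JOB columns is an assumption.

```sql
-- Hypothetical concatenated index on (deptno, job).
CREATE INDEX emp_dept_job_ix ON emp (deptno, job);

-- Leading column present: the index is a candidate access path.
SELECT ename FROM emp WHERE deptno = 20;

-- Both columns present: the index can be used more selectively.
SELECT ename FROM emp WHERE deptno = 20 AND job = 'CLERK';

-- Leading column absent: older releases ignore the index for this query,
-- newer ones may consider an index skip scan.
SELECT ename FROM emp WHERE job = 'CLERK';
```

Whether the optimizer actually uses the index in each case depends on statistics and table size, so it is worth confirming with an explain plan.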

    So if you need to get statistics on your schema quickly, you can perform:

    BEGIN
      dbms_stats.gather_schema_stats (ownname => 'SCOTT'
        , estimate_percent => 10
        , degree => 5
        , cascade => true);
    END;
    /

    OR

    execute dbms_stats.gather_schema_stats(ownname => 'SCOTT', estimate_percent => 10, degree => 5, cascade => true);

    If you want to grab statistics for a table and its indexes, then:

    EXEC DBMS_STATS.gather_table_stats('SCOTT', 'TEST', cascade => TRUE);

    More information: http://www.pafumi.net/Tuning.htm#ANALYZE

    TIPS to write better queries
    Although two SQL statements may produce the same result, Oracle may process one faster than the other. You can use the results of the EXPLAIN PLAN statement to compare the execution plans and costs of the two statements and determine which is more efficient. Following are some tips that help in writing efficient queries.
    Before starting our discussion, one nice command to know:

    Flushing the Buffer Cache
    Prior to Oracle Database 10g, the only way to flush the database buffer cache was to shut down the database and restart it. Oracle Database 10g now allows you to flush the database buffer cache with the ALTER SYSTEM command using the FLUSH BUFFER_CACHE clause. The FLUSH BUFFER_CACHE clause is useful if you need to measure the performance of rewritten queries or a suite of queries from identical starting points. Use the following statements to flush the caches:

    ALTER SYSTEM FLUSH BUFFER_CACHE;   -- This command flushes the buffer cache in the SGA
    ALTER SYSTEM FLUSH SHARED_POOL;    -- This command flushes the shared pool

    However, note that these clauses are intended for use only on a test database. It is not advisable to use them on a production database, because subsequent queries will have no hits, only misses.

    Declare with Care!!
    The following table and the sections after it offer some concrete advice on potential issues you might encounter when declaring variables in PL/SQL.


    NUMBER: If you don't specify a precision, as in NUMBER(12,2), Oracle supports up to 38 digits of precision. If you don't need this precision, you're wasting memory.

    CHAR: This is a fixed-length character string and is mostly available for compatibility purposes with code written in earlier versions of Oracle. The values assigned to CHAR variables are right-padded with spaces, which can result in unexpected behavior. Avoid CHAR unless it's specifically needed.

    VARCHAR2: The greatest challenge you will run into with VARCHAR2 is to avoid the tendency to hard-code a maximum length, as in VARCHAR2(30). Use %TYPE as described later in this section.

    INTEGER: If your integer values fall within the range of -2^31+1 .. 2^31-1 (a.k.a. -2147483647 .. 2147483647), you should declare your variables as PLS_INTEGER. This is the most efficient format for integer manipulation (until you get to Oracle Database 10g Release 2, at which point BINARY_INTEGER, PLS_INTEGER and all the other subtypes of BINARY_INTEGER offer the same performance).

    Anchor variables to database datatypes using %TYPE and %ROWTYPE.
    When you declare a variable using %TYPE or %ROWTYPE, you "anchor" the type of that data to another, previously defined element. If your program variable has the same datatype as (and, as is usually the case, is acting as a container for) a column in a table or view, use %TYPE to define it from that column. If your record has the same structure as a row in a table or view, use %ROWTYPE to define it from that table. Your code will automatically adapt to underlying changes in data structures.
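A minimal sketch of anchored declarations follows. It assumes the EMP demo table with an ENAME column; the variable names are made up for the example.

```sql
DECLARE
  v_ename  emp.ename%TYPE;   -- follows the datatype and length of EMP.ENAME
  r_emp    emp%ROWTYPE;      -- record with one field per column of EMP
BEGIN
  -- Fetch one arbitrary row into the anchored record.
  SELECT * INTO r_emp FROM emp WHERE ROWNUM = 1;
  v_ename := r_emp.ename;
  DBMS_OUTPUT.PUT_LINE(v_ename);
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('EMP is empty');
END;
/
```

If ENAME is later widened, say from VARCHAR2(10) to VARCHAR2(30), this block keeps working without any change to its declarations.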

    1. Existence of a row
    Do not use 'Select count(*)…' to test the existence of a row. Instead, open an explicit cursor, fetch once, and then check cursor%NOTFOUND.

    If you are going to insert a row, or update it if it already exists, then instead of:

    DECLARE
      /* Declare variables which will be used in SQL statements */
      v_LastName VARCHAR2(10) := 'Pafumi';
      v_NewMajor VARCHAR2(10) := 'Computer';
      v_exists   NUMBER := 0;
    BEGIN
      SELECT count(1) INTO v_exists
      FROM students
      WHERE last_name = v_LastName;

      /* Test for "at least one row", not for exactly one. */
      IF v_exists > 0 THEN
        /* Update the students table. */
        UPDATE students
        SET major = v_NewMajor
        WHERE last_name = v_LastName;
      ELSE
        INSERT INTO students (ID, last_name, major)
        VALUES (10020, v_LastName, v_NewMajor);
      END IF;
    END;
    /

    Try the following instead, which is much faster:

    DECLARE
      /* Declare variables which will be used in SQL statements */
      v_LastName VARCHAR2(10) := 'Pafumi';
      v_NewMajor VARCHAR2(10) := 'Computer';
    BEGIN
      /* Update the students table. */
      UPDATE students
      SET major = v_NewMajor
      WHERE last_name = v_LastName;

      /* Check to see if the record was found. If not, then we need to insert this record. */
      IF SQL%NOTFOUND THEN
        INSERT INTO students (ID, last_name, major)
        VALUES (10020, v_LastName, v_NewMajor);
      END IF;
    END;
    /

    --Another example, inserting values into a table with a PK
    --(fragment: the body of a function; its header and declarations are not shown):
      INSERT INTO RecognitionLog (MachineName, StartDateTime)
      VALUES (p_MachineName, p_StartDateTime);
      p_RowsAffected := SQL%ROWCOUNT;
      COMMIT;
      RETURN 0;
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        UPDATE RecognitionLog
        SET EndDateTime = p_EndDateTime,
            TotalRecognized = p_TotalRecognized,
            TotalRecognitionFailed = p_TotalRecognitionFailed
        WHERE MachineName = p_MachineName
          AND StartDateTime = p_StartDateTime;
        p_RowsAffected := SQL%ROWCOUNT;
        COMMIT;
        RETURN 0;
      WHEN OTHERS THEN
        ROLLBACK;
        p_RowsAffected := 0;
        RETURN 1;
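On Oracle 9i and later, the update-or-insert pattern shown in this tip can also be expressed as a single MERGE statement. The sketch below reuses the students table and the literal values from the first example; the inline source query and the alias names s and src are assumptions made for illustration.

```sql
MERGE INTO students s
USING (SELECT 10020      AS id,
              'Pafumi'   AS last_name,
              'Computer' AS major
       FROM   dual) src
ON (s.last_name = src.last_name)
WHEN MATCHED THEN
  UPDATE SET s.major = src.major          -- row exists: update it
WHEN NOT MATCHED THEN
  INSERT (id, last_name, major)           -- row missing: insert it
  VALUES (src.id, src.last_name, src.major);
```

This keeps the existence test and the write in one statement, so no separate count or SQL%NOTFOUND check is needed.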

    If you just want to check the existence of a row, instead of the "classical":

    select count(*) from student where status = 10;


    You can perform the following:

    SELECT COUNT(*) INTO v_count
      FROM student WHERE status = 10 AND ROWNUM = 1;

    or

    SELECT '1' INTO v_dummy
      FROM student WHERE status = 10 AND ROWNUM = 1;

    In these examples only a single record is retrieved in the presence/absence check. Note that the second form raises NO_DATA_FOUND when no row matches, so it needs an exception handler.
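The explicit-cursor approach recommended at the start of this tip is not shown in the text, so here is one possible sketch of it. The student table and the status = 10 filter come from the examples above; the cursor and variable names are made up.

```sql
DECLARE
  CURSOR c_exists IS
    SELECT 1 FROM student WHERE status = 10;
  v_dummy  PLS_INTEGER;
  v_found  BOOLEAN;
BEGIN
  OPEN c_exists;
  FETCH c_exists INTO v_dummy;    -- fetch once; no need to read further rows
  v_found := c_exists%FOUND;      -- capture the result before closing
  CLOSE c_exists;

  IF v_found THEN
    DBMS_OUTPUT.PUT_LINE('at least one row exists');
  ELSE
    DBMS_OUTPUT.PUT_LINE('no matching row');
  END IF;
END;
/
```

Unlike SELECT ... INTO, a single fetch that finds nothing does not raise NO_DATA_FOUND, so no exception handler is required.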

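    2. Avoid the use of IS NULL or IS NOT NULL.
    A single-column B-tree index holds no entries for rows whose key is NULL, so an IS NULL predicate cannot be resolved through such an index. If the data model allows it, store a sentinel value in place of NULL and search for that value.
    Instead of:
      Select * from clients where phone_number is null;
    Use:
      Select * from clients where phone_number = 0000000000000000;
    The two queries return the same rows only if missing phone numbers are actually stored as that sentinel value.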
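Where a sentinel value is not practical, a commonly used workaround is to add a constant as a second index column, so that rows with a NULL key still get an index entry. This is a sketch only: the index name is hypothetical, and whether the optimizer uses the index depends on statistics and on how many rows are NULL.

```sql
-- Hypothetical index name. The constant second column means every row,
-- including those with a NULL phone_number, has an entry in the index.
CREATE INDEX clients_phone_ix ON clients (phone_number, 0);

-- This predicate can now be resolved through the index.
SELECT * FROM clients WHERE phone_number IS NULL;
```

The trade-off is a slightly larger index in exchange for keeping NULL as the natural representation of a missing value.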

    3. Select the data that you need ONLY!!!
    When selecting from a table, be sure to select only the data that you need. For example, if you only need 1 column from a 50 column table, be sure to do a 'select fld from table' and only retrieve what you need. If you do a 'select * from table' you will be fetching ALL columns of the table, which increases network traffic and causes the system to perform unnecessary work to retrieve data that is not being used.

    4. Always use table alias and prefix
    The parse phase for statements can be decreased by efficient use of aliasing. This helps the speed of parsing the statements in two ways:

      - If an alias is not present, the engine must resolve which tables own the specified columns.
      - A short alias is parsed more quickly than a long table name or alias. If possible, reduce the alias to a single letter.
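A short example of the aliasing convention described above, assuming the EMP and DEPT demo tables:

```sql
-- Every column carries a one-letter alias, so no column name has to be
-- resolved against both tables during parsing.
SELECT e.ename, e.sal, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    d.dname  = 'RESEARCH';
```

Prefixing also protects the statement from becoming ambiguous if a column with the same name is later added to the other table.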

    5. IN and EXISTS

    Correlated Queries (Exists)
    A subquery is said to be correlated when it is joined to the outer query within the subquery. An example of this is:

    Select last_name, first_name
    From Customer
    Where customer.city = 'Chicago'
      and Exists
          (Select customer_id
           From Sales
           Where sales.total_sales_amt > 10000
             and sales.customer_id = customer.customer_id);

    Notice that the last line in the above query is a join of the outer Customer table and the inner Sales table. Given the query above, the outer query is read, and for each Customer in Chicago the outer row is joined to the subquery. Therefore, in the case of a correlated subquery, the inner query is executed once for every row read in the outer query. EXISTS often results in a FULL TABLE SCAN. This is efficient where a relatively small number of rows are processed by the query, but considerable overhead is incurred when a large number of rows are read.

    Uncorrelated Queries (sub-query executes first) (IN)
    A subquery is said to be uncorrelated (aka non-correlated) when the two tables are not joined together in the inner query. In effect, the inner (sub)query is processed first and then the result set is joined to the outer query. This is very efficient for queries that return a large number of rows. An example of this is:

    Select last_name, first_name
    From Customer
    Where customer_id IN
          (Select customer_id
           From Sales
           Where sales.total_sales_amt > 10000);

    The Sales table will be processed first, and then all entries with a total_sales_amt > 10000 will be joined to the Customer table. This is efficient where a large number of rows is being processed.
    The optimizer is more likely to translate an IN into a join. It is important to understand the number of rows to be returned by a query and then decide which approach to use.

    EXISTS vs. IN

      - Use a join where possible.
      - Use IN over EXISTS (i.e. non-correlated subquery vs. correlated subquery).
      - The optimizer is more likely to translate IN into a join than it is with EXISTS.
      - IN executes a subquery once, while EXISTS executes it once per outer row.
      - IN is similar to a merge-scan, while EXISTS is similar to a nested-loop join.

    There are some cases where EXISTS can outperform IN, but in more cases IN will dramatically out-perform EXISTS. In general, IN is better than EXISTS.
    EXISTS tries to satisfy the subquery as quickly as possible and returns 'true' if the subquery returns 1 or more rows, so the columns it searches should be indexed. Optimize the execution of the subquery.

    Not In vs. Not Exists
    Subqueries may be written using NOT IN and NOT EXISTS clauses. The NOT EXISTS clause is sometimes more efficient, since the database only needs to verify non-existence. With NOT IN the entire result set must be materialized. Another consideration when using NOT IN is that if the subquery returns any NULL, no rows are returned at all. With NOT EXISTS, a value in the outer query that has a NULL value in the inner will be returned. NOT IN performs very well as an anti-join using the cost-based optimizer and often outperforms NOT EXISTS when this access path is used. Outer joins can also be a very fast way to accomplish this.
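The NULL behaviour of NOT IN described above is easy to miss, so here is a small sketch of it. It assumes the EMP and DEPT demo tables and that EMP.DEPTNO is nullable.

```sql
-- If any EMP row has a NULL deptno, this query returns no rows at all,
-- because "deptno NOT IN (..., NULL)" can never evaluate to true.
SELECT d.deptno, d.dname
FROM   dept d
WHERE  d.deptno NOT IN (SELECT e.deptno FROM emp e);

-- Filtering the NULLs out of the subquery restores the expected result.
SELECT d.deptno, d.dname
FROM   dept d
WHERE  d.deptno NOT IN (SELECT e.deptno FROM emp e WHERE e.deptno IS NOT NULL);

-- NOT EXISTS is unaffected by NULLs in EMP.DEPTNO.
SELECT d.deptno, d.dname
FROM   dept d
WHERE  NOT EXISTS (SELECT NULL FROM emp e WHERE e.deptno = d.deptno);
```

The second and third queries return the departments without employees; the first does so only when the subquery column contains no NULLs.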

    Use IN instead of EXISTS.
    A simple trick to increase the speed of an EXISTS subquery is to replace it with IN. The IN method is faster than EXISTS because it doesn't check unnecessary rows in the comparison.
    But this tip will be useful only if the inner query returns a small number of rows. If the inner query retrieves a larger row set, then it is better to use EXISTS.
    Example (the EXISTS form must be correlated on the same column that IN compares, otherwise the two queries are not equivalent):

    Before:
      select cgrfnbr from category
      where EXISTS (select cpcgnbr from cgprrel where cpprnbr = 149 and cpcgnbr = category.cgrfnbr)
    After:
      select cgrfnbr from category
      where cgrfnbr IN (select cpcgnbr from cgprrel where cpprnbr = 149)

    36% Time Reduction could be achieved.

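    6. Use Joins in place of EXISTS.

      SELECT *
      FROM emp e
      WHERE EXISTS (SELECT d.deptno
                    FROM dept d
                    WHERE e.deptno = d.deptno
                      AND d.dname = 'RESEARCH');

    To improve performance use the following (e.* keeps the result limited to the EMP columns; the rewrite is safe here because deptno is unique in DEPT, so the join cannot duplicate EMP rows):

      SELECT e.*
      FROM emp e, dept d
      WHERE e.deptno = d.deptno
        AND d.dname = 'RESEARCH';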

     

    7. Use EXISTS in place of DISTINCT.

      SELECT DISTINCT d.deptno, d.dname
      FROM dept d, emp e
      WHERE d.deptno = e.deptno;

    The following SQL statement is a better alternative.

      SELECT d.deptno, d.dname
      FROM dept d
      WHERE EXISTS (SELECT 'X'
                    FROM emp e
                    WHERE d.deptno = e.deptno);

    Another Example:

      SELECT DISTINCT hetitle, hename
      FROM helpfiles h, merchant m
      WHERE m.merfnbr = h.hemenbr;

    Much Better (the subquery has to be correlated with the outer row, otherwise it would return every help file as soon as any merchant exists):

      SELECT hetitle, hename
      FROM helpfiles h
      WHERE EXISTS (SELECT m.merfnbr
                    FROM merchant m
                    WHERE m.merfnbr = h.hemenbr);

    48% Time Reduction could be achieved.

    8. Math Expressions.

    The optimizer fully evaluates expressions whenever possible and translates certain syntactic constructs into equivalent constructs. This is done either because Oracle can more quickly evaluate the resulting expression than the original expression or because the original expression is merely a syntactic equivalent of the resulting expression. Any computation of constants is performed only once when the statement is optimized rather than each time the statement is executed. Consider these conditions that test for monthly salaries greater than 2000:

      sal > 24000/12
      sal > 2000
      sal*12 > 24000

    If a SQL statement contains the first condition, the optimizer simplifies it into the second condition.

    Note that the optimizer does not simplify expressions across comparison operators. The optimizer does not simplify the third expression into the second. For this reason, application developers should write conditions that compare columns with constants whenever possible, rather than conditions with expressions involving columns.

    The optimizer does not use an index for the following statement.

      SELECT *
      FROM emp
      WHERE sal*12 > 24000;

    Instead use the following statement.

      SELECT *
      FROM emp
      WHERE sal > 24000/12;

    9. Never use NOT in an indexed column.
    Whenever Oracle encounters a NOT on an indexed column, it will perform a full-table scan.

      SELECT *
      FROM emp
      WHERE NOT deptno = 0;

    Instead use the following (equivalent as long as deptno is never negative).

      SELECT *
      FROM emp
      WHERE deptno > 0;

     

    10. Never use a function / calculation on an indexed column (unless you are SURE that you are using a function-based index, new in Oracle8i).
    If any function is used on an indexed column, the optimizer will not use the index. Use some other alternative. If you don't have another choice, keep functions on the right hand side of the equal sign. The concatenate || symbol will also disable indexes. Examples:

      /** Do not use **/
      SELECT * FROM emp WHERE SUBSTR (ENAME, 1,3) = 'MIL';

      /** Suggested Alternative **/
      Note: Optimizer uses the index only when optimizer_goal is set to FIRST_ROWS.
      SELECT * FROM emp WHERE ENAME LIKE 'MIL%';

      /** Do not use **/
      SELECT * FROM emp WHERE sal != 0;

      Note: An index can tell you what is there in a table but not what is not in a table.
      Note: Optimizer uses the index only when optimizer_goal = FIRST_ROWS.

      /** Suggested Alternative **/
      SELECT * FROM emp WHERE sal > 0;

      /** Do not use **/
      SELECT * FROM emp WHERE ename || job = 'MILLERCLERK';

      Note: || is the concatenate function. Like other functions it disables the index.

      /** Suggested Alternative **/
      Note: Optimizer uses the index only when optimizer_goal = FIRST_ROWS.
      SELECT *
      FROM emp
      WHERE ename = 'MILLER'
        AND job = 'CLERK';
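When the function in the predicate cannot be avoided, the function-based index mentioned in this tip is the usual way out. The following is a sketch under assumptions: the index name is hypothetical, and depending on the release, extra privileges or session settings may be needed before the optimizer will use such an index.

```sql
-- Hypothetical index name; function-based indexes need Oracle8i or later.
CREATE INDEX emp_ename_sub3_ix ON emp (SUBSTR(ename, 1, 3));

-- The predicate has to use the same expression as the index definition
-- for the index to be considered.
SELECT * FROM emp WHERE SUBSTR(ename, 1, 3) = 'MIL';
```

An explain plan is the simplest way to confirm that the index is actually picked up.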

    11. Whenever possible try to use bind variables

    In Dynamic SQL, this is a MUST!!!

    The next example would always require a hard parse when it is submitted:

    create or replace procedure dsal(p_empno in number) as
    begin
      execute immediate 'update emp set sal = sal*2 where empno = '||p_empno;
      commit;
    end;
    /

    It is more effective to use bind variables on the EXECUTE IMMEDIATE command, as follows:

    create or replace procedure dsal(p_empno in number) as
    begin
      execute immediate 'update emp set sal=sal*2 where empno = :x' using p_empno;
      commit;
    end;
    /

    The Performance Killer
    Just to give you a tiny idea of how huge a difference this can make performance wise, you only need to run a very small test.

    Here is the Performance Killer ....

    SQL> alter system flush shared_pool;
    SQL> set serveroutput on;

    declare
      type rc is ref cursor;
      l_rc rc;
      l_dummy all_objects.object_name%type;
      l_start number default dbms_utility.get_time;
    begin
      for i in 1 .. 1000
      loop
        open l_rc for
          'select object_name from all_objects
            where object_id = ' || i;
        fetch l_rc into l_dummy;
        close l_rc;
        -- dbms_output.put_line(l_dummy);
      end loop;
      dbms_output.put_line (round((dbms_utility.get_time-l_start)/100, 2) || ' Seconds...' );
    end;
    /

    101.71 Seconds...

    ... and here is the Performance Winner:

    declare
      type rc is ref cursor;
      l_rc rc;
      l_dummy all_objects.object_name%type;
      l_start number default dbms_utility.get_time;
    begin
      for i in 1 .. 1000
      loop
        open l_rc for
          'select object_name from all_objects where object_id = :x' using i;
        fetch l_rc into l_dummy;
        close l_rc;
        -- dbms_output.put_line(l_dummy);
      end loop;
      dbms_output.put_line (round((dbms_utility.get_time-l_start)/100, 2) || ' Seconds...' );
    end;
    /

    1.9 Seconds...

    That is pretty dramatic. The fact is that not only does this execute much faster (we spent more time PARSING our queries than actually EXECUTING them!), it will also let more users use your system simultaneously.
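One way to see the effect of the two test blocks above in the shared pool is to look at V$SQL after each run. This is a sketch that assumes SELECT privilege on V$SQL and that the statement text starts exactly as written in the tests.

```sql
-- After the literal version, expect many separate cursors, one per value of i.
-- After the bind version, expect a single cursor with a high execution count.
SELECT sql_text, parse_calls, executions
FROM   v$sql
WHERE  sql_text LIKE 'select object_name from all_objects%'
ORDER  BY sql_text;
```

Flushing the shared pool between the two runs, as the test does, keeps the rows from the first run out of the second result.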

    12. Use the same convention for all your queries.

    Oracle will put all your SQL or PL/SQL code in memory and will reuse statements that are the same (saving parse time). So remember that:

      Select * from emp where dept = :dept_no

    is different from:

      Select * from EMP where dept = :dept_no

    Even differing spaces in the statement will cause this lookup to fail. If the statement does not have a cached execution plan, it must be parsed before execution.

    13. Tuning the WHERE Clause:

      - When using AND clauses in the WHERE clause, put the most stringent AND clause furthest from the WHERE.
      - When using OR clauses in the WHERE clause, put the most stringent OR clause closest to the WHERE.

    14. Do not use the keyword HAVING; use the keyword WHERE instead

    The HAVING clause filters selected rows only after all rows have been fetched. Using a WHERE clause helps reduce overheads in sorting, summing, etc. HAVING clauses should only be used when columns with summary operations applied to them are restricted by the clause.

    Given Query:

      SELECT d.dname, AVG (e.sal)
      FROM emp e, dept d
      WHERE e.deptno = d.deptno
      GROUP BY d.dname
      HAVING dname != 'RESEARCH' AND dname != 'SALES';

    Alternative:

      SELECT d.dname, AVG (e.sal)
      FROM emp e, dept d
      WHERE e.deptno = d.deptno
        AND dname != 'RESEARCH' AND dname != 'SALES'
      GROUP BY d.dname;

    26% Time Reduction could be achieved.

    15. Avoid Multiple Subqueries where possible
    Instead of this:

      Update emp
      set emp_cat = (select max (category) from emp_categories),
          sal_range = (select max (sal_range) from emp_categories);

    Use:

      Update emp
      set (emp_cat, sal_range) = (select max (category), max (sal_range) from emp_categories);

    16. Use IN in place of OR
    Least Efficient:

      Select ….
      From location
      Where loc_id = 10 or loc_id = 20 or loc_id = 30

    Most Efficient:

      Select ….
      From location
      Where loc_id in (10,20,30)

    17. Do not Commit inside a Loop
    Do not use a commit or DDL statements inside a loop or cursor, because that will make the undo segments needed by the cursor unavailable.

    Many applications commit more frequently than necessary, and their performance suffers as a result. In isolation a commit is not a very expensive operation, but lots of unnecessary commits can nevertheless cause severe performance problems. While a few extra commits may not be noticed, the cumulative effect of thousands of extra commits is very noticeable. Look at this test. Insert 1,000 rows into a test table, first committing after every row and then as a single transaction. Your mileage may vary, but my results, on an otherwise idle system, show a performance blowout of more than 100% when committing after every row.

    create table t (n number);

    --BAD METHOD
    declare
      start_time number;
    begin
      start_time := dbms_utility.get_time;
      for i in 0..999 loop
        insert into t values (i);
        commit;
      end loop;
      dbms_output.put_line(dbms_utility.get_time - start_time || ' centiseconds');
    end;
    /
    102 centiseconds

    truncate table t;

    --GOOD METHOD
    declare
      start_time number;
    begin
      start_time := dbms_utility.get_time;
      for i in 0..999 loop
        insert into t values (i);
      end loop;
      commit;
      dbms_output.put_line(dbms_utility.get_time - start_time || ' centiseconds');
    end;
    /
    44 centiseconds

    18. Use UNION ALL instead of UNION

    The problem is that in a UNION, Oracle finds all the qualifying rows and then "deduplicates" them. To see what I mean, you can simply compare the following queries:

    select * from dual
    union
    select * from dual;

    D
    -
    X

    select * from dual
    union ALL
    select * from dual;

    D
    -
    X
    X

    Note how the first query returns only one record and the second returns two. A UNION forces a big sort and deduplication (a removal of duplicate values). Most of the time, this is wholly unnecessary. To see how this might affect you, I'll use the data dictionary tables to run a WHERE EXISTS query using UNION and UNION ALL and compare the results with TKPROF. The results are dramatic.


    First, I'll do the UNION query:

    select *
    from dual
    where exists
      (select null from all_objects
       union
       select null from dba_objects
       union
       select null from all_users);

    call     cnt   cpu   ela   query
    -------  ---  ----  ----  ------
    Parse      1  0.01  0.00       0
    Execute    1  2.78  2.75  192234
    Fetch      2  0.00  0.00       3
    -------  ---  ----  ----  ------
    total      4  2.79  2.76  192237

    As you can see, that was a lot of work: more than 192,000 I/Os just to see if I should fetch that row from DUAL. Now I add a UNION ALL to the query:

    select *
    from dual
    where exists
      (select null from all_objects
       union all
       select null from dba_objects
       union all
       select null from all_users);

    call     cnt   cpu   ela   query
    -------  ---  ----  ----  ------
    Parse      1  0.00  0.00       0
    Execute    1  0.01  0.00       9
    Fetch      2  0.00  0.00       3
    -------  ---  ----  ----  ------
    total      4  0.01  0.00      12

    Quite a change! What happened here was that the WHERE EXISTS stopped running the subquery when it got the first row back, and because the database did not have to bother with that deduplicate step, getting the first row back was very fast indeed.

    The bottom line: If you can use UNION ALL, by all means use it over UNION to avoid a costly deduplication step, a step that is probably not even necessary most of the time.

    19. Check that your application is using the existing indexes

    This is a CRITICAL point, so make use of EXPLAIN PLAN to confirm it!
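
    A minimal sketch of such a check follows. The query is a hypothetical example against the EMP demo table used elsewhere in this document, and it assumes an index exists on empno; DBMS_XPLAN.DISPLAY formats the most recently explained statement.

    ```sql
    -- Illustrative only: the statement being explained is a made-up example.
    EXPLAIN PLAN FOR
    SELECT ename
      FROM emp
     WHERE empno = 7369;

    -- Show the plan that was just stored in the plan table.
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    ```

    In the output, an index access operation (for example INDEX UNIQUE SCAN or INDEX RANGE SCAN) indicates the index is being used, while TABLE ACCESS FULL indicates it is not.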

    20. Recommendation to work with dates.

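    If you need to get all the data for today's date, instead of:

    SELECT ImportedDate, State
      FROM IssueData
     WHERE TRUNC(ImportedDate) = TRUNC(SYSDATE);

    Use the following:

    SELECT ImportedDate, State
      FROM IssueData
     WHERE ImportedDate BETWEEN TRUNC(SYSDATE) AND TRUNC(SYSDATE) + .99999;

    Applying TRUNC to the column prevents Oracle from using a regular index on ImportedDate; comparing the bare column against a range leaves such an index usable.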
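
    An alternative sketch, not part of the original tip: the same range can be written as a half-open interval. This avoids relying on the .99999 fraction and would also cover fractional seconds if ImportedDate were a TIMESTAMP column (an assumption; the original does not state the column type).

    ```sql
    -- Sketch only: IssueData and its columns come from the example above.
    -- Rows from midnight today (inclusive) up to midnight tomorrow (exclusive).
    SELECT ImportedDate, State
      FROM IssueData
     WHERE ImportedDate >= TRUNC(SYSDATE)
       AND ImportedDate <  TRUNC(SYSDATE) + 1;
    ```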

    21. Anti Joins

    An anti-join is used to return rows from a table that are not present in another table. It might be used, for example, between DEPT and EMP to return only those rows in DEPT that didn't join to anything in EMP. It can be written in any of the following ways:

    SELECT *
      FROM dept
     WHERE deptno NOT IN (SELECT deptno FROM emp);

    SELECT dept.*
      FROM dept, emp
     WHERE dept.deptno = emp.deptno (+)
       AND emp.ROWID IS NULL;

    SELECT *
      FROM dept
     WHERE NOT EXISTS (SELECT NULL FROM emp WHERE emp.deptno = dept.deptno);
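
    One caveat worth checking, offered as a sketch rather than as part of the original tip: the NOT IN form behaves differently from the other two when the subquery can return a NULL. If any emp.deptno is NULL, the NOT IN query returns no rows at all. Assuming emp.deptno is nullable (the original does not say), the NULLs can be filtered out explicitly:

    ```sql
    -- Sketch only: exclude NULLs from the subquery so that NOT IN
    -- returns the same departments as the NOT EXISTS form.
    SELECT *
      FROM dept
     WHERE deptno NOT IN (SELECT deptno
                            FROM emp
                           WHERE deptno IS NOT NULL);
    ```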

    22. Full Outer Joins

    Normally, an outer join of table A to table B would return every record in table A, and if it had a mate in table B, that would be returned as well. Every row in table A would be output, but some rows of table B might not appear in the result set. A full outer join would return every row in table A, as well as every row in table B. The syntax for a full outer join is new in Oracle 9i, but it is only a syntactic convenience: full outer join result sets can also be produced using conventional SQL.

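    To set up the example so that both tables contain unmatched rows, move the employees of department 10 to a department 9 that does not exist in DEPT:

    update emp set deptno = 9 where deptno = 10;
    commit;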

    Conventional SQL:

    SELECT empno, ename, dept.deptno, dname
      FROM emp, dept
     WHERE emp.deptno(+) = dept.deptno
    UNION ALL
    SELECT empno, ename, emp.deptno, NULL
      FROM emp, dept
     WHERE emp.deptno = dept.deptno(+)
       AND dept.deptno IS NULL
    ORDER BY 1,2,3,4;

     EMPNO ENAME    DEPTNO DNAME
    ------ ------- ------- ----------
      7369 SMITH        20 RESEARCH
      7499 ALLEN        30 SALES
      7521 WARD         30 SALES
      7566 JONES        20 RESEARCH
      7654 MARTIN       30 SALES
      7698 BLAKE        30 SALES
      7782 CLARK         9
      7788 SCOTT        20 RESEARCH
      7839 KING          9
      7844 TURNER       30 SALES
      7876 ADAMS        20 RESEARCH
      7900 JAMES        30 SALES
      7902 FORD         20 RESEARCH
      7934 MILLER        9
                        10 ACCOUNTING
                        40 OPERATIONS

    New Syntax:

    SELECT empno, ename,
           NVL(dept.deptno, emp.deptno) deptno, dname
      FROM emp FULL OUTER JOIN dept ON
           (emp.deptno = dept.deptno)
    ORDER BY 1,2,3,4;

     EMPNO ENAME    DEPTNO DNAME
    ------ ------- ------- ----------
      7369 SMITH        20 RESEARCH
      7499 ALLEN        30 SALES
      7521 WARD         30 SALES
      7566 JONES        20 RESEARCH
      7654 MARTIN       30 SALES
      7698 BLAKE        30 SALES
      7782 CLARK         9
      7788 SCOTT        20 RESEARCH
      7839 KING          9
      7844 TURNER       30 SALES
      7876 ADAMS        20 RESEARCH
      7900 JAMES        30 SALES
      7902 FORD         20 RESEARCH
      7934 MILLER        9
                        10 ACCOUNTING
                        40 OPERATIONS

    23. Use BETWEEN instead of IN.

    The BETWEEN keyword is very useful for filtering out values in a specific range. When the values form a contiguous range, it is much faster than listing each value of the range in an IN.

    Example:

    Before:

    SELECT crpcgnbr
      FROM cgryrel
     WHERE crpcgnbr IN (508858, 508859, 508860, 508861, 508862, 508863, 508864)

    After:

    SELECT crpcgnbr
      FROM cgryrel
     WHERE crpcgnbr BETWEEN 508858 AND 508864

    A 59% time reduction could be achieved.
