correlated subquery

Upload: saeed-meethal

Post on 13-Apr-2018

235 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/27/2019 Correlated Subquery

    1/15

    Correlated Subquery

  • 7/27/2019 Correlated Subquery

    2/15

    What is Correlated Subquery?

    If the a subquery depends on the values of

    its parent query then the sub query is

    called correlated subquery.

    select

    Coursename , Courseadminid,

    (select

    Firstname+' '+Lastname

    from

    studentwhere

    studentid=Course.courseadminid

    ) as CourseAdminName

    from

    course

  • 7/27/2019 Correlated Subquery

    3/15

    How Does It Work?

    For each output row from parent query, Oraclewill execute the subquery once.

    Oracle may use subquery resultset cache, that

    is, Oracle will cache the subquery resultsetsbased on parent query values.

    But the cache size is very limited. For example,for scalar subquery (scalar means each

    subquery execution only returns one row),Oracle uses a HASH table for cache, and thecache slot number is 256 for 8i and 9i, and 1024for 10g (from Jonathan Lewis).

  • 7/27/2019 Correlated Subquery

    4/15

    Performance Concerns

    Since the subquery is executed once per recordfrom parent query, we should eliminate as manyrows as possible before the execution of thesubquery step.

    The efficiency of each subquery execution: dowe have right index or partition pruning? Forexample, when a subquery has to use FTS on atable with 10K blocks, if the parent query returns10K rows, that will be a huge number of physicalIO.

    If the purpose is to get a value from anothertable, consider using JOIN.

  • 7/27/2019 Correlated Subquery

    5/15

    Subquery In Action: UAD

    This query had massive temp space usages

    on the hash join and it had not yet to

    generate a single output row to execute

    subquery.

  • 7/27/2019 Correlated Subquery

    6/15

    Subquery In Action: TAOHere is the source of occasional TAO DB CPU alerts, the part of the view

    TAO.TAO_FACT_AGGREGATE_D_SEGMENTS used by TAO cube build process. Note there is a

    subquery around table RM.RM_PIXELS, which used to be the source of high LIO and CBC

    contentions. In one run on 09/27 from hour 06 to 11, one query recorded 699M LIO and AWR ASH

    had major waits as CPU and latch: cache buffers chain. Table RM.RM_PIXELS has 101142 rows,

    so it passed the limit of cache slots. The other table involved TAO.TAO_DEFAULT_IDS has only 3

    rows.

  • 7/27/2019 Correlated Subquery

    7/15

    Todays Subquery: Direct

    MarketingRealtime Tracking:

    Query has executed 31,002 seconds

    Last call from client (SID 2008) is 29,602 seconds.

    All the PX slave has LAST_CALL_ET as 2 seconds, meaning a

    new round of subquery execution just started.

    This is the view used by the qurery in question. Direct Marketing

    wraps most of its business logic inside views. Plenty of correlated

    subqueries here, this is another direct marketing feature.

  • 7/27/2019 Correlated Subquery

    8/15

    The Issue

    After 11.2.0.3 upgrade (2012/09/28), this query has continuouslytopped high physical IO usages among all databases scanned byperformance framework.

    For example, the single run from 09/30 22 hour to 10/01 08 hour, itread 339M blocks. The run from 09/26 22 hour only read 3.168Mblocks, also completed within 2 hours. So my interest is toinvestigate where the 100 times of physical IO is from.

    AWR segment statistics shows the source of physical IO is the tableSEM.AI_KEYWORD, one of the table used twice by the correlatedsubquery inside the view related to the query. The physical reads onthis table for the same period from 09/30 22 hour to 10/01 08 hour is100M blocks. The mismatch from 300M blocks from AWR SQL stats

    is another interesting thing worth some research.

  • 7/27/2019 Correlated Subquery

    9/15

    FTS and partition pruning

    for AI_KEYWORD

    1. 6 sets of PX queues, or DFO (dataflow object), one for parent query, one

    for each subquery.

    2. 6 PX COORDINATOR operations. So

    each subquery will start its own PX

    operations.

    3. Plans for subqueries in SELECT listare usually listed at the top on the

    main query. Here from line ID 1 to 28.

    The main query is from 29 to 32.

    4. The final COST is usually messed up

    and the cost from subqueries not

    counted in.

    5. The values passed from main queryto subquery usually lised in

    PREDICATE part as bind variables.

    6. No plan change after upgrade.

  • 7/27/2019 Correlated Subquery

    10/15

    Table Info

    The base table of parent query, AI_ACCOUNT, has 206 rows, or206 unique IDs.

    The subquery condition on AI_KEYWORD is ACCOUNT_ID =AI_ACCOUNT.ID for each row returned from parent query.

    AI_KEYWORD is list partitioned by ACCOUNT_ID. The total size is1,730,120 blocks (outdated, but pretty close. dba_segments has1,752,876 blocks). The largest partition has 85,120 blocks.

    In theory, for each row from AI_ACCOUNT, the two subqueriesinside the view related to AI_KEYWORD will each read a singlepartition. So this query should use at most 2* 1,752,876 blocks(3.505M) of physical reads, plus some block numbers from othertables. That is close to the result before 11.2.0.3 upgrade (3.168M).

  • 7/27/2019 Correlated Subquery

    11/15

    Some Research

    Using session stats to see if any interesting statisticsduring realtime tracking.

    From previous case study, we have seen the case whena target table has full table refresh, consistent readsrelated to UNDO could cause larger than expected IOusages. Another case is chained or migrated rows whichcould also cause larger than expected IO usages. Alsothere is no UNDO and chained/migrated rows statisticsinside session statistics during realtime tracking.

    Using ASH and AWR ASH: The dominant events are

    direct path read, which is usually FTS. AWR ASH alsoshows the major waits for the query before upgrade isdb file scatter read, which usually can take advantagesof cache.

  • 7/27/2019 Correlated Subquery

    12/15

    Where is The Clue?

    The huge PX process count is from the sum of PX

    slave processes used by all runs of the subqueries.

    1. Plenty of sessions use direct path read at the same time. Direct marketing uses DEFAULparallel attribute at table level for thousands of tables. The execution plan shows PX for all

    subqueries and main query.

    2. The interesting thing here is, each direct path read has a different object_id. We are

    expecting partition pruning to a single partition (table has no subpartition).

    3. There is an issue with partition pruning with correlated subquery.

    We can also use AWR ASH for the

    same SQL_ID, and order the object

    id by sample time. For this query,

    we can see the same object_id

    appeared multiple times at totally

    different time ranges, meaning the

    same partition was scanned

    multiple times.

  • 7/27/2019 Correlated Subquery

    13/15

    Some Test to Narrow Down The

    Scope The purpose is to see if PX is the cause for inefficient partition pruning inside subquery.

    I rewrote the query by replacing the view with an inline view, and using no_parallel hints todisable all PX operations.

    The query plan shows partition pruning with PARTITION LIST SINGLE operation, the one withPX is PX Block Iterator. So I assume partition pruning will work in this case.

    The query completed within 310 seconds. Total physical IO is 1,564,075 blocks. Next run took 3minutes with 400K blocks. The major waits were db file scatter read, so cache played a role.

  • 7/27/2019 Correlated Subquery

    14/15

    Is It Reproducible?

    Yes and No. When this database was busy and my test failed to acquire enough PX processes,it actually completed with the time close to the one run in serial.

    But when the database was relatively idle and the query could acquire PX processes for mostof the time, the pattern could be reproduced. Here is the part of SQL report from a test query.

    After 17 minutes, it had read more than 10M blocks, with only one account id completed.

  • 7/27/2019 Correlated Subquery

    15/15

    Summary

    Be careful about the performance of correlated

    subquery.

    Use DEFAULT (or any) parallel attribute on

    tables with caution. This should be an Oracle bug or unwanted

    feature, just not sure if it exists in earlier version.

    Remove outdated or unwanted scheduled queryjobs. The offending query is from an outdated

    scheduled job no application actually uses.