Guiding Practices for Gathering Database Statistics

Martin Widlake, Database Architecture, Performance & Training
Ora600 Limited
http://mwidlake.wordpress.com
www.ora600.org.uk


TRANSCRIPT

Page 1: Guiding Practices for Gathering Database Statistics

ORA600 Ltd 1/45

Guiding Practices
for
Gathering Database Statistics

Martin Widlake
Database Architecture, Performance & Training
Ora600 Limited

http://mwidlake.wordpress.com
www.ora600.org.uk

Page 2: Guiding Practices for Gathering Database Statistics


Abstract

• Guiding practices for Database Statistics gathering

• The Cost Based Optimizer continues to improve and stats gathering is now more efficient than ever - but it still seems that most Oracle sites struggle with performance issues due to poor stats. It's like the annoying, embarrassing rash that simply won't go away. I will cover the options available and general principles for sorting out the stats issue, which should lead to more stable, good performance, ie a more comfortable life. This should calm the annoying rash and give you some potential treatments should it flare up again.

Page 3: Guiding Practices for Gathering Database Statistics


Who am I and why am I doing this Talk?

• 20+ years of Oracle experience, mostly as a developer, development DBA, Architect, Performance guy.

• I keep getting pulled into designing “better” methods of gathering stats for clients and, frankly, I’d rather do other things {thus the presentations and blog postings telling everyone what I know}.

• Tested the CBO under V7.3 and became a cautious advocate of it in V8. Been fighting the issues since!

• I am of the opinion that, overall, the CBO now gets 99% of SQL execution plans good enough, if the stats are good.

• Stats gathering can be quite interesting. Honest!

Page 4: Guiding Practices for Gathering Database Statistics

These slides will be on the UKOUG web site.

I am going to talk around some slides (the ones with pictures on and key points) and skip over some - as we all get tired of reading powerpoint slides in presentations.

The others are there to fill in the chat.

Ask questions, Email [email protected]


Page 5: Guiding Practices for Gathering Database Statistics


Quick Quiz

• What is the most common version of Oracle you currently use? (8, 9, 10.1, 10.2, 11.1, 11.2)

• What is the latest version you use in production?

• Who relies on the Automated Stats Collection job on their database? (If “Yes”, have you altered the schedule?)

• Who has intermittent performance issues when code goes bad either “overnight” or after stats are collected?

• Who has a site-crafted stats gathering regime?

• (If your site wrote its own, did it take 2X, 4X, 8X or more the effort to get right than you expected?)

Page 6: Guiding Practices for Gathering Database Statistics


Possibly the Single Most Common Cause of Poor Database Performance

Poor or missing object statistics are probably the single most common and easily fixed cause of poor database performance.

Most issues with individual SQL statements performing poorly are fixed by gathering accurate statistics on the tables involved.

The worst of all situations is to have statistics on only a few tables. The Cost Based Optimiser is invoked and has to use very poor defaults for everything else.

In my opinion, the introduction of the automated stats gathering process with Oracle 10g was probably the single greatest performance enhancement by Oracle Corp in the last 15 years.

And I don’t really like it.

Page 7: Guiding Practices for Gathering Database Statistics


Automatic Stats Collection

[Diagram] The Auto Stats Job runs its gathers in the scheduled window. Its preparation phase draws on:

• DBA_TAB_MODIFICATIONS (10% staleness threshold by default)
• SYS.COL_USAGE$
• Data dictionary information and existing statistics
• Global and table prefs (stale_pct, est_pct, method_opt, degree..)
• FLUSH_DATABASE_MONITORING_INFO
• OBJ_FILTER_LIST (STALE and EMPTY)

Page 8: Guiding Practices for Gathering Database Statistics


GATHER_DATABASE_STATS_JOB_PROC

From the 10g Tuning guide:-

The GATHER_DATABASE_STATS_JOB_PROC procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows). The DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC is an internal procedure, but it operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

That last sentence is the only major change in the 11g documentation.

Page 9: Guiding Practices for Gathering Database Statistics


Automated DBMS_STATS Job

• If it works for you, then fine, leave it be and work on something else. If it almost works for you, fix the exceptions, leave the main job alone and work on something else.

• If you have a VLDB (or you downloaded this as you had an issue with stats gathering) it is almost certainly not good enough for you.

• It is an attempt at a single solution that works for every situation, and it does not. Even Oracle Corp has admitted it simply does not work for VLDBs; it chokes on large objects.

• Turn it off (maybe leave it running for DICTIONARY stats) and write the replacement. You will write something that does a lot of what this job does. Your replacement will almost certainly be more complex than you initially plan. Sorry.

• There is no one single solution to stats gathering that is right for any large, complex system. {Sorry again}

Page 10: Guiding Practices for Gathering Database Statistics


DBA_TAB_MODIFICATIONS

• All inserts, updates, deletes and truncate operations on monitored tables are flushed to this table. From V10 upwards, that is everything.

• Under V9 it is flushed every 3 hours, under V10.1 every 15 minutes; under V10.2/V11 it is not automatically flushed.

• It is flushed by schema/database-level DBMS_STATS GATHER calls or by calling DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO.

• It does not capture direct inserts, appends, things that avoid the SQL layer.

• It increments, including over database restarts.

• Row is deleted when stats are gathered on the Table OR PARTITION (and only at the correct level).

It seems generally accurate but I have witnessed it missing the odd “insert-into-select” statement. And it does not see direct-insert activity.
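As a quick illustration of the flushing point above, a sketch (the owner and table names are placeholders):

```sql
-- Push the in-memory DML monitoring counters to the dictionary so
-- DBA_TAB_MODIFICATIONS is current (needed on 10.2/11, where there
-- is no automatic periodic flush)
exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

-- Then check the recent change counters for one table
select inserts, updates, deletes, truncated, timestamp
from   dba_tab_modifications
where  table_owner = 'MYSCHEMA'      -- placeholder owner/table
and    table_name  = 'MY_BIG_TABLE';
```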

Page 11: Guiding Practices for Gathering Database Statistics


show_tab_mods

-- show_tab_mods
-- Martin Widlake 11-nov-07
-- quick check on recent table changes
-- NB flush if want up-to-date info (15 min interval)
set lines 100 pause on
col obj_name form a50
col no_ins form 9999,999
col no_upd form 9999,999
col no_del form 999,999
select table_owner||'.'||table_name||'-'||partition_name obj_name
,inserts no_ins
,updates no_upd
,deletes no_del
,substr(truncated,1,1) T
,to_char(timestamp,'YYMMDD hh:mm:ss') last_flush
from dba_tab_modifications
where timestamp > sysdate -31
and table_owner not like 'SYS%'
order by timestamp desc
/
clear colu

OBJ_NAME                                NO_INS    NO_UPD  NO_DEL T LAST_FLUSH
--------------------------------------- --------- ------- ------ - ---------------
MIDDLEOFFICE.POSITIONLIQUIDATIONAUDIT-  9,588     0       0      N 080602 10:06:03
MIDDLEOFFICE.POSITIONKEEPINGGROUPS-     2         1       0      N 080602 10:06:03
GATEWAY.TREE_RELATIONS-                 44        0       43     N 080602 10:06:03
COMMHOME.COMM_TRAD_PCTTRAD_BAND-        4         0       4      N 080602 10:06:03
GATEWAY.INSTRUMENT_INFO-                2         1       0      N 080602 10:06:03
GATEWAY.TRADINGPERIODPROFILEDATA-       1         0       1      N 080602 10:06:03
BOBJ_LOGIN.ORDERS-ORDERS_12345678       1,066     9,381   0      N 080531 06:05:03
DATAFIX.TRADEORD_VOLT-                  2         0       0      N 080531 06:05:03
BOBJ_LOGIN.ORDERS_HISTORY-OH_54545454   9,379     0       0      N 080531 06:05:03
DATAFIX.MAXORDERS-                      2         0       0      N 080531 06:05:03
BOBJ_LOGIN.ORDERAUDIT-ORDERAUDIT23      1375610   0       0      N 080530 10:05:02
COMMHOME.COMM_ON_DEPOSIT_H-             4         0       0      N 080530 10:05:02
GATEWAY.BIN$TnRrf2V98FrgQwrckArwWg==$0- 95,449    0       0      N 080530 10:05:02
GATEWAY.BIN$TnRrf2V38FrgQwrckArwWg==$0- 882       0       0      N 080530 10:05:02
GATEWAY.DEALINGRECNMSAI-                290       5,052   0      N 080530 10:05:02
GATEWAY.BIN$TnRkBv02UezgQwrckApR7A==$0- 16,913    0       0      N 080530 10:05:02

Page 12: Guiding Practices for Gathering Database Statistics


-- mdw 11/05/03
-- mdw 17/01/08 Modified to look at dba_tab_modifications
set pause on pages 24 lines 110 pause 'Any Key>'
colu anlyzd_rows form 99999,999,999
colu tot_rows form 99999,999,999
colu tab_name form a30
colu chngs form 99,999,999
colu pct_c form 999.999
select dbta.owner||'.'||dbta.table_name tab_name
,dbta.num_rows anlyzd_rows
,to_char(dbta.last_analyzed,'yymmdd hh24:mi:ss') last_anlzd
,nvl(dbta.num_rows,0)+nvl(dtm.inserts,0)
 -nvl(dtm.deletes,0) tot_rows
,nvl(dtm.inserts,0)+nvl(dtm.deletes,0)+nvl(dtm.updates,0) chngs
,(nvl(dtm.inserts,0)+nvl(dtm.deletes,0)+nvl(dtm.updates,0))
 /greatest(nvl(dbta.num_rows,0),1) pct_c
from dba_tables dbta
left outer join dba_tab_modifications dtm
   on dbta.owner = dtm.table_owner
  and dbta.table_name = dtm.table_name
  and dtm.partition_name is null
where dbta.table_name like upper(nvl('&Tab_name','WHOOPS'))
/
clear colu

TAB_NAME                       ANLYZD_ROWS LAST_ANLZD      TOT_ROWS  CHNGS   PCT_C
------------------------------ ----------- --------------- --------- ------- ------
GATEWAY.ACCESSGROUPS           4           080212 16:03:12 4         0       .000
MIDDLEOFFICE.ACCOUNTAUDIT      4,725,464   080512 22:37:16 4,898,302 173,738 .037
GATEWAY.ACCOUNTBEHAVIOURS      14          080212 16:03:18 14        0       .000
GATEWAY.ENBZHEERHWV            149,136     080522 22:06:39 150,922   6,156   .041
DATAFIX.VRHERHHHEHH_08_RBK     17,650      080425 22:00:04 17,650    0       .000
DATAFIX.AFEOUNSFEWS_190505     42,757      071105 22:01:00 42,757    0       .000
DATAFIX.ACSFFWEGGEE_310108     182         080131 22:00:03 182       0       .000
MM_AUDIT.EFEFEFGHOME_AUDIT     513,230     080509 22:07:34 526,192   12,962  .025
DATAFIX.AEGFWSCEFOE_TEMP       10,000      071105 22:00:52 10,000    0       .000
GATEWAY.AEFSSTEFFFE            2,083       080212 16:03:36 2,083     0       .000
MM_AUDIT.ACCOUNSGSEGEGGEIT     2,083       071105 22:00:21 2,083     0       .000
MIDDLEOFFICE.ASGESGEPOTALLOCAT 123,944     080602 22:00:40 123,944   0       .000
GATEWAY.ENBZHEERHWV            39,136      080522 22:06:39 44,922    6,156   .131
DATAFIX.RWERNGNONNGNVZBYNO     17,650      080425 22:00:04 17,650    0       .000
DATAFIX.WEGEBTRUUUXTDWTTHH     32,707      071105 22:01:00 32,707    0       .000
DATAFIX.ACGERTIIJJYKYNJNIY     189         080131 22:00:03 189       0       .000
MM_AUDIT.KUKLYUTDJJ4YUNYJJ     313,230     080509 22:07:34 426,192   112,962 .251
DATAFIX.AERGERHGGRE_TEMP       10,000      071105 22:00:52 10,000    0       .000
GATEWAY.AEFSSTEFFFE            2,083       080212 16:03:36 15,231    0       .000

Page 13: Guiding Practices for Gathering Database Statistics


SYS.COL_USAGE$

• Every time a SQL statement is parsed, information about columns referenced in table joins and where predicates is stored in the internal table SYS.COL_USAGE$.

• This is what DBMS_STATS uses to help decide which of the indexed columns to gather stats on when method_opt “for all indexed columns” is used.

• It might also play a part in deciding which columns to gather histograms on, as I have tested adding very skewed columns to a table and the automatic stats collection does not gather histograms and neither does a specific call to gather_stats with method_opt=“auto”.

• It can also be useful for identifying whether an index is missing, or whether an existing index is even likely to be used.

Page 14: Guiding Practices for Gathering Database Statistics


-- chk_col_usage
-- this is a rip-off of Tim Gorman's script to look at the column usage info that, in
-- 9i, 10g and 11 beta at least, is not revealed in a DB view. Gits
col owner form a22 wrap
col tab_name form a30 wrap
col column_name form a30 wrap
col equal_preds form 9999,999
col eqi_joins form 9999,999
col noneqi_jns form 9999,999
col range_prds form 9999,999
col like_prds form 9999,999
col null_prds form 9999,999
select oo.name owner
, o.name tab_name
, c.name column_name
, u.equality_preds equal_preds
, u.equijoin_preds eqi_joins
, u.nonequijoin_preds nonegi_jns
, u.range_preds range_prds
, u.like_preds like_prds
, u.null_preds null_prds
, u.timestamp ts
from sys.col_usage$ u, sys.obj$ o, sys.user$ oo, sys.col$ c
where o.obj# = u.obj#
and oo.user# = o.owner#
and c.obj# = u.obj#
and c.col# = u.intcol#
and o.name like upper(nvl('&tab_name','%'))||'%'
and oo.name like upper(nvl('&tab_own','%'))||'%'
order by 1,2,3
/
clear colu

OWNER           TAB_NAME                  COLUMN_NAME
--------------- ------------------------- ---------------------------
EQUAL_PREDS EQI_JOINS NONEGI_JNS RANGE_PRDS LIKE_PRDS NULL_PRDS TS
----------- --------- ---------- ---------- --------- --------- -------------
COMMHOME        COMM_TRAD_PCTTRAD_H       TREENODEID
2           0         0          0          0         0         02 JUN 2011 12:44:22
COMMHOME        COMM_TRAD_PIPREFUND       CC
0           3         0          0          0         0         02 JUN 2011 10:59:13
COMMHOME        COMM_TRAD_PIPREFUND       HOMEID
4,581       10        0          0          0         0         15 JUN 2011 20:38:44
COMMHOME        COMM_TRAD_PIPREFUND       ISLEAF
163         0         0          0          0         0         15 JUN 2011 20:38:44
COMMHOME        COMM_TRAD_PIPREFUND       TREENODEID
177         169       0          0          0         0         15 JUN 2011 20:38:44
SYS             TS$                       FLAGS
3,102       0         0          0          0         0         14 JUN 2011 23:33:37
SYS             TS$                       NAME
2,708       1,954     5          0          392       0         15 JUN 2011 12:22:23
SYS             TS$                       ONLINE$
3,553       0         0          0          0         0         15 JUN 2011 06:38:16
SYS             TS$                       TS#
1,132       8,555     0          86         0         0         15 JUN 2011 18:08:32
GATEWAY         TRADINGPERIODPROFILEDATA  TRADINGCLOSE
0           0         0          1          0         0         29 MAR 2011 05:23:45
GATEWAY         TRADINGPERIODPROFILES     INSTGROUPID
46          237       0          0          0         0         12 JUN 2011 11:09:17
GATEWAY         TRADINGPERIODPROFILES     PROFILENAME
0           67        0          0          1         0         12 JUN 2011 00:53:36
GATEWAY         TRADINGPERIODPROFILES     SOURCEID
44          234       0          0          0         0         12 JUN 2011 11:09:17

Page 15: Guiding Practices for Gathering Database Statistics

Statistics Hierarchy

There is more than one type of “stats” that Oracle can gather, and they have different impacts and are best gathered in different ways. (In the original diagram, arrows down the side mark increasing impact and the scope of the auto gather.)

SYSTEM STATISTICS
Gather “once” – how fast the hardware is. Multi-Block read : Single-Block, plus the speed of your CPU.

FIXED OBJECT STATISTICS
Gather “once” – how big your memory objects are. Areas of memory, number of users, size of caches. X$ sys-only “objects”.

DICTIONARY STATISTICS
Gather regularly, probably via the auto stats job. Do not enhance or do one-offs. Essentially “normal” stats for sys.obj$-type things.

OBJECT STATISTICS
Gather regularly, via the auto job and enhancements. Tables, Indexes, Columns. What DBAs/Developers mean by “Stats”.

Page 16: Guiding Practices for Gathering Database Statistics


System Statistics

• In effect, these stats are just the CPU speed and the relative speeds of Single-Block Reads (SBR) and Multi-Block Reads (MBR).

• The actual speed of single- and multi- block reads are recorded, in milliseconds, but it is the ratio between them that counts.

• If Multi-Block reads are found to be the same speed or faster than Single-Block reads, Oracle 10 ignores the data collected and does not store it.

• The CBO converts all IO and CPU cost into units of single-block reads. That is what the COST is in explain plan. It is also what you see in AWR.

• Gathering System Statistics may:
  • Push oracle towards or away from high-CPU actions like sorts.
  • Alter the likelihood of full table scans and fast full index scans as oracle better understands the cost of the multi-block actions.

Page 17: Guiding Practices for Gathering Database Statistics


System Statistics

• V10 and 11 come with a default set of system statistics. You can GATHER_SYSTEM_STATS with a fake workload or based on activity on your system over a period of time.

• I advise the latter – but ensure your system has a representative workload.

• You only need to gather the System Statistics “once” (but ensure you do so with an “average load”).

• Re-gathering is only required if your storage changes (eg add more spindles), if there is a major system change, or if the server(s) you use change significantly in CPU utilisation.

• Gathering at day and night and storing/swapping system stats is often suggested – but seems to be a bit of an urban myth.

• You may wish to gather system stats 4 or 5 times and DBMS_STATS.SET_SYSTEM_STATS to the average.

• NB Not RAC aware – system stats gathered on one node apply to all
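A minimal sketch of the workload-based gather described above (the 120-minute interval is an arbitrary illustration, not a recommendation from the slides):

```sql
-- Snapshot system activity over a representative 2-hour window;
-- Oracle records SREADTIM, MREADTIM, MBRC, CPUSPEED and so on
exec DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL', interval => 120)

-- Inspect what was stored
select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';

-- If averaging 4 or 5 runs, individual values can then be set directly, eg:
exec DBMS_STATS.SET_SYSTEM_STATS('MREADTIM', 11.5)
```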

Page 18: Guiding Practices for Gathering Database Statistics


Fixed Object Statistics

• These are statistics on the in-memory “dynamic performance” objects, the x$ and similar tables (what V$ views sit on).

• Need to gather “once” and only re-gather if something significant changes such as allocating much more memory to the instance or the number of user sessions greatly increasing.

• Gathering Fixed Object stats will aid internal SQL that checks session details or looks at memory constructs. I have seen a small improvement in parse speed. Certain dictionary queries run faster.

• Re-gather after upgrade etc.

• If they have never been gathered the impact can be significant, but I have yet to personally see a major change as a result of re-gathering (I just do so once a year on “just in case” principles).
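The once-only gather described above is a single call; a sketch:

```sql
-- Gather stats on the X$ fixed tables; best run while the instance
-- is under a typical load, as the current X$ contents are what get sampled
exec DBMS_STATS.GATHER_FIXED_OBJECTS_STATS

-- Confirm something was recorded (11g view)
select count(*)
from   dba_tab_statistics
where  object_type = 'FIXED TABLE'
and    last_analyzed is not null;
```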

Page 19: Guiding Practices for Gathering Database Statistics


Dictionary Statistics

• Statistics on the internal tables owned by SYS and other internal Oracle users. SYS.OBJ$, SYS.TAB$, SYS.TS$, those tables.

• Are gathered as part of the default statistics gathering job.

• In effect just like gathering schema statistics on the SYS, SYSTEM, OUTLN and other internal users. Supported (and recommended {*} ) from V10; under V9, keep with the RBO.

• Can take several hours to gather on a database with tens of thousands of objects or more.

• Can significantly aid parsing and other internal SQL, as well as DBA scripts running on the DBA views and also the underlying tables.

Page 20: Guiding Practices for Gathering Database Statistics


Dictionary Statistics

• Gather them regularly. Weekly to monthly.

• If you use the default automatic statistics gathering job, it is collecting dictionary statistics for you and is fine.

• If you disable the automatic statistics gathering for your schema stats, either:
  • Leave it running for ONLY Dictionary statistics: DBMS_STATS.SET_PARAM(‘AUTOSTATS_TARGET’,’ORACLE’)
  • Organise regular dictionary stats gathering by your own methods.

• Not gathering Dictionary Stats, especially on a very large/complex database, could lead to very poor dictionary/parse performance.

• To be honest, with 10.1/10.2 at least, even gathering Dictionary Stats for systems with massive numbers of segments can fail to resolve some slow Dictionary performance
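The two manual options above come down to a couple of calls; a sketch (10g syntax - 11g also offers SET_GLOBAL_PREFS):

```sql
-- One-off (or regularly scheduled) manual dictionary gather
exec DBMS_STATS.GATHER_DICTIONARY_STATS

-- Or keep the automatic job running, but restricted to Oracle-owned schemas only
exec DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET', 'ORACLE')
```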

Page 21: Guiding Practices for Gathering Database Statistics

That was the pre-amble. Getting the System, Fixed Object and Dictionary stats gathered gives you a solid base to tackle Object Statistics.


Page 22: Guiding Practices for Gathering Database Statistics


Why are Stats Key to Performance?

• Used by the Cost Based Optimiser (CBO).

• CBO examines the SQL statement and works out the various ways in which it could satisfy the query (up to about 2,000 plans under 10).

• For each step the CBO works out the cost, which is expected IO plus CPU (if turned on) and the cardinality, the number of records that step will return.

• The cardinality is passed back to the next step and can be a multiplier of that step’s cost.

• The costs are added up and Oracle then picks the plan with the lowest overall cost(*) and runs that plan

(*) This is a slight lie, but the principle is true

Page 23: Guiding Practices for Gathering Database Statistics


Why are Stats Key to Performance?

• The CBO is very logical; it uses just the figures it is presented with and simple calculations to make its decision. No magic involved.

• If those figures are wrong, ie the statistics are not representative, then the costs calculated will be incorrect.

• A small error can cascade up the code and cause a large difference to other steps and cause the plan to change.

• With edge cases, a small difference often results in a different plan being chosen.

• That different plan is often sub-optimal, sometimes seriously so, occasionally hundreds or thousands of times slower.

• Cost/Cardinality being too low, often 1, is the most common cause of poor performance, due to prompting nested loop plans and incorrect driving tables.

Page 24: Guiding Practices for Gathering Database Statistics


Automated Stats Gathering

• If the automated job is working for you, leave it alone.

• If you turned off the automated job and “wrote your own” under V10 or before and now are on 11 – consider going back to the automated job. It is faster and more accurate.

• If you turned off the automated job and did not do anything about your Dictionary stats, that is bad.

• Under V11 you can control %stale and defaults like method_opt at table level. Consider doing that.

• If you decided to “roll your own” I would advise you leave the automated job running but just for Oracle’s objects:

dbms_stats.set_global_prefs(‘AUTOSTATS_TARGET’,’ORACLE’)
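The table-level control mentioned above is done under V11 with SET_TABLE_PREFS; a sketch (schema and table names are placeholders):

```sql
-- Mark this table stale at 5% change instead of the default 10%
exec DBMS_STATS.SET_TABLE_PREFS('MYSCHEMA', 'BIG_TABLE', 'STALE_PERCENT', '5')

-- Stop the auto job building histograms on it
exec DBMS_STATS.SET_TABLE_PREFS('MYSCHEMA', 'BIG_TABLE', 'METHOD_OPT', 'FOR ALL COLUMNS SIZE 1')

-- Check the effective preference
select DBMS_STATS.GET_PREFS('STALE_PERCENT', 'MYSCHEMA', 'BIG_TABLE') from dual;
```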

Page 25: Guiding Practices for Gathering Database Statistics


Stats Gathering

• Version 9, swap to DBMS_STATS and write your own.

• Version 10, use the automated job (at least for dictionary stats) and write your own version/exceptions for your tables.

• V11.1 – Test, do not trust me, but I would still say go auto.

• V11.2 – Use the automated job and if you must intervene, use default sample size so you get one pass NDV.

• V11 – look at rolling up stats for partitions, subpartitions – but read up on it extensively. You have to ensure you gather all partitions or sub-partitions.

• With all versions of Oracle you will have exceptions, usually overnight batch or partitioned tables. If you are the DBA, this is part of your job.

Page 26: Guiding Practices for Gathering Database Statistics


New Oracle 11 NDV

• One of the most demanding parts of generating object statistics is gathering the Number of Distinct Values.

• Oracle 11 introduced the single-pass NDV function. It scans the data once, uses much less memory, is faster and more accurate.

• You have to use: ESTIMATE_PERCENT=>DBMS_STATS.AUTO_SAMPLE_SIZE

• You cannot use BLOCK sampling.

Amit Poddar via JL:
http://jonathanlewis.files.wordpress.com/2011/12/one-pass-distinct-sampling-presentation.pdf
http://structureddata.org/2007/09/17/oracle-11g-enhancements-to-dbms_stats/

https://blogs.oracle.com/optimizer/entry/improvement_of_auto_sampling_statistics_gathering_feature_in_oracle_database_11g
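To get the one-pass NDV you simply leave the sample size at its default; a sketch (owner and table names are placeholders):

```sql
begin
  -- 11g: AUTO_SAMPLE_SIZE enables the one-pass, full-scan NDV algorithm;
  -- any explicit percentage (or block sampling) falls back to the old method
  dbms_stats.gather_table_stats(
    ownname          => 'MYSCHEMA',
    tabname          => 'BIG_TABLE',
    estimate_percent => dbms_stats.auto_sample_size);
end;
/
```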

Page 27: Guiding Practices for Gathering Database Statistics


The Automated Job may Choke

Page 28: Guiding Practices for Gathering Database Statistics


select substr(operation,1,30) operation
,to_char(start_time,'DD-MON-YYYY HH24:MI:SS.FF') START_tme
,to_char(END_TIME,'DD-MON-YYYY HH24:MI:SS.FF') end_tme
from sys.WRI$_OPTSTAT_OPR
order by start_time desc

OPERATION                      START_TME                      END_TME
------------------------------ ------------------------------ ---------------------
gather_database_stats(auto)    02-JUN-2011 22:00:02.979206    02-JUN-2011 23:04:37
gather_database_stats(auto)    31-MAY-2011 06:00:02.811976    31-MAY-2011 06:56:21
gather_database_stats(auto)    30-MAY-2011 22:00:01.976379    30-MAY-2011 22:19:31
gather_database_stats(auto)    29-MAY-2011 22:00:01.416256    29-MAY-2011 23:36:14
gather_database_stats(auto)    28-MAY-2011 22:00:02.243542    29-MAY-2011 00:12:18
gather_database_stats(auto)    27-MAY-2011 22:00:03.588237    27-MAY-2011 23:14:24
gather_database_stats(auto)    26-MAY-2011 22:00:01.602425    26-MAY-2011 23:14:05
gather_dictionary_stats        24-MAY-2011 11:42:31.771667    24-MAY-2011 11:42:35
gather_dictionary_stats        24-MAY-2011 11:42:11.396340    24-MAY-2011 11:42:15
gather_database_stats(auto)    24-MAY-2011 06:00:02.905945    24-MAY-2011 06:20:38
gather_database_stats(auto)    23-MAY-2011 22:00:01.732964    23-MAY-2011 22:58:35
gather_database_stats(auto)    22-MAY-2011 22:00:01.421518    23-MAY-2011 06:00:05
gather_database_stats(auto)    21-MAY-2011 22:00:01.942455    22-MAY-2011 06:00:01
gather_database_stats(auto)    20-MAY-2011 22:00:03.066981    21-MAY-2011 06:00:01
gather_database_stats(auto)    19-MAY-2011 22:00:02.571718    19-MAY-2011 23:04:27
gather_database_stats(auto)    17-MAY-2011 06:00:01.462810    17-MAY-2011 08:02:34
gather_database_stats(auto)    16-MAY-2011 22:00:01.096761    16-MAY-2011 23:14:56

Page 29: Guiding Practices for Gathering Database Statistics


Fixing Choked Stats

• Once the automated job chokes, it will continue to choke. Every night. This is because it tries the same thing each night.

• The longer weekend run should sort things out – until it chokes.

• Identify the table (look for the gather statement interactively during the window, pull off the list of tables to gather with options=>'LIST STALE', check for large objects with a 10% difference...)

• Do a manual gather with something like: block_sample=>true, estimate_percent=>0.1, degree=>8, method_opt=>'for all columns size 1', cascade=>false, no_invalidate=>false, granularity=>'global'

• Once that is run you can afford to do a larger sample size and do the indexes. Do the PK first.

• You probably need to lock the stats on the table and treat it as an exception.
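Put together, the cheap first-pass gather suggested above might look like this (parameter values as per the slide; the owner and table names are placeholders):

```sql
begin
  -- fast, minimal gather to replace missing or ancient stats on a huge table
  dbms_stats.gather_table_stats(
    ownname          => 'MYSCHEMA',
    tabname          => 'HUGE_TABLE',
    estimate_percent => 0.1,
    block_sample     => true,                     -- whole blocks: far less IO
    degree           => 8,
    method_opt       => 'FOR ALL COLUMNS SIZE 1', -- no histograms on this pass
    granularity      => 'GLOBAL',
    cascade          => false,                    -- indexes done separately, PK first
    no_invalidate    => false);                   -- existing cursors re-parse now
end;
/
```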

Page 30: Guiding Practices for Gathering Database Statistics


The Biggest “Wrong Stats” Issues

The below are the worst “stats” causes of performance issues, in order, in my opinion based on experience:

1. The stats say a segment is empty and it is not.

2. Your WHERE predicates are out of range for the known column values.

3. The stats say a segment holds 10× less data than it does. The more orders of magnitude out, the worse.

4. Histograms. (There and not needed, or needed and not there. Ouch.)

5. The correlation between columns is not understood by the optimiser, e.g. that values in tab1.x “line up” with those in tab2.y.

6. Edge cases that experts get excited about but 99% of us never see.

Page 31: Guiding Practices for Gathering Database Statistics


Stats Issues are a VLDB thing?

• I can only go on what I have seen and, to a less trustworthy level (ironic, given the sources), what I have heard...

• OLTP systems with a need for the fastest absolute response time to small data queries are NOT troubled by object stats.

• Edge cases that balance on correlation, or on swapping to nested loop from hash, or on using a Cartesian join, are specific to OLTP and a set of requirements where stats gathering is, well, redundant.

• Where pain occurs is when a plan that hashes several segments together {often including partition exclusion} swaps to either a nested loop or Cartesian merge join that is not suitable

• When it comes down to it, the plan for large volumes of data is right for all volumes of data. If it does it in 30 seconds inefficiently, doing it in 33 seconds efficiently is spot-on good enough.

Page 32: Guiding Practices for Gathering Database Statistics


Single Values Expected Outside of Range

[Charts: for a column with 10,000 values spread over a known range of roughly 100-400, the expected number of rows returned for a single-value predicate as the predicate value moves beyond the known high value - shown for Oracle 10 and Oracle 11, each with and without histograms.]

Page 33: Guiding Practices for Gathering Database Statistics


Range Values Expected Outside of Range

[Charts: the same comparison for range predicates against a column with 10,000 values between roughly 200 and 300 - expected row counts as the queried range moves outside the known values, for Oracle 10 and Oracle 11, each with and without histograms.]

Page 34: Guiding Practices for Gathering Database Statistics


Histograms and Data Ranges

• Oracle 10 deals with column values being “out of range” by decreasing cardinality over that range.

• Eg if the low_value is 200 and the high_value is 300 and there are 1000 rows, that is 10 rows for 225.

• For 350, it is 50% outside the range, so 10 rows is reduced by 50% to 5 rows. At value 400 it drops to 1 row. This is fine if you gather at 10%.

• Histograms massively alter this “out of range” half-life. I have seen massive issues with dates. A table covering 5 years of data, with histograms on the date, can reduce a value only a week out of range to less than 1% of the average value.

• This is a big issue on large tables that have stats gathered only occasionally (as not changed by a big percentage).

Page 35: Guiding Practices for Gathering Database Statistics


Histograms and Partitions

• This Out of Range issue is thrown into sharp relief with Partitions.

• If the partitions are daily or weekly and have histograms on them, SQL statements selecting data for the latest hour can become “out of range” sooner than you can believe.

• Spot this by the cardinality being 1 rather than several hundred or thousand.

• Can happen even without histograms, especially with daily partitions

• Solutions? Use dynamic sampling on the latest daily or weekly partitions (where your in-house code either does not collect or deletes such stats), insert half-day stats, or gather very aggressively.
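One way of seeding believable stats on a new partition, rather than waiting for a gather, is to copy them from an earlier one (an 11g feature; all names here are placeholders):

```sql
begin
  -- copy stats from yesterday's partition onto today's empty one;
  -- the partition-key low/high values are adjusted to the new range
  dbms_stats.copy_table_stats(
    ownname     => 'MYSCHEMA',
    tabname     => 'ORDERS',
    srcpartname => 'P_20110601',
    dstpartname => 'P_20110602');
end;
/
```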

Page 36: Guiding Practices for Gathering Database Statistics

Replacing the Automatic Stats Job

The CBO is Complex.
Stats are in effect a constantly evolving part of your code base.
A simple approach will only work on a simple system.
(And I really am Sorry!)


Page 37: Guiding Practices for Gathering Database Statistics


Replacing the AUTO stats job

• Don’t.

• Tweak the current job – make it run at different times, alter the %stale and defaults at table level.

• Lock the stats on the tables that give you issues and treat these as your exceptions.

• If you MUST replace the Auto job it will hurt:
  • The initial estimate for a “simple” replacement will be a week or two.
  • If you plan and estimate it, that will come out as four weeks.
  • It will take you 2 months.
  • You will emulate most of what the AUTO job does.

• Your solution will probably need to look at segment size and DBA_TAB_MODIFICATIONS, and have several control tables that allow you to specify METHOD_OPT, SAMPLE SIZE and GRANULARITY at table level...

Page 38: Guiding Practices for Gathering Database Statistics


Auto Stats Replacement

[Diagram: your stats job is driven by a control table, OBJ_CTL (OWNER, TAB_NAME, PART_NAME, DFLT_STALE, DFLT_SAMPLE, DFLT_METHOD_OPT, IDX_PCT), plus STATS_CTL and copy rules; it reads DBA_TAB_MODS and DBA_SEGMENTS, lists STALE/EMPTY objects, and writes to STATS_LOG and SAVE_STATS. Gathering approach: BLOCK SAMPLE, TAB THEN INDEX, ORDER BY SIZE ASC, ALL COLUMNS SIZE 1, PARALLEL.]

Page 39: Guiding Practices for Gathering Database Statistics


Block or Row Sample

• When you state an ESTIMATE_PERCENT for a table gather statement, you can also state whether it is a block or row sample. It defaults to Row.

• Row sample selects the percentage of rows scattered across all blocks. If you have eg a 16k block size and over 100 rows per block, a 1% sample size will visit every block.

• Block sample selects whole blocks, which greatly reduces the physical IO

• Block sampling gives low column cardinalities and is susceptible to getting high/low values that are not close to the true edges.

• The under-sampling of columns varies depending on data spread, but my tests show that a 5% block sample is about as good as a 0.8% row sample, yet still 10 times faster. It is fine for 99% of cases.

Breaking news!

There appears to be some issue with block sampling on version 10.2. You can use the SAMPLE clause in normal SQL SELECT statements, and that is what Oracle does to gather stats. However, BLOCK sampling seems to vary the actual number of blocks it checks for a given sample size. I will investigate when I have time.

• Oracle V11 up – the new NDV algorithm and the speed of stats gathering push me back towards AUTO_SAMPLE_SIZE.
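A minimal sketch of the row/block choice (schema and table names are hypothetical; BLOCK_SAMPLE is a documented parameter of GATHER_TABLE_STATS):

```sql
-- Row sample (the default): 1% of rows, scattered across nearly every block.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       estimate_percent => 1, block_sample => FALSE)

-- Block sample: 5% of whole blocks, far less physical IO for similar quality.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       estimate_percent => 5, block_sample => TRUE)

-- The SAMPLE clause Oracle itself uses under the covers:
SELECT COUNT(*) FROM big_table SAMPLE BLOCK (5);
```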

Page 40: Guiding Practices for Gathering Database Statistics


Write Something to Gather Stats on all Segments

• You need to gather stats on each segment that needs stats. I strongly suggest any in-house code works segment-by-segment such that all tables, partitions and index segments are processed “as one”

• You could use GATHER_SCHEMA_STATS to LIST the objects into an array and process them. This gets the objects Oracle would deem in need of stats:
o STALE lists those objects that have stats but have changed by 10%
o EMPTY lists those objects with no current stats
o AUTO is supposed to list both but is buggy in 10.2.0.3

• Alternatively, run through all segments in the schemas you are interested in and use DBA_TAB_MODIFICATIONS directly. Slower, but much more control.

• I used the above on 10.1, but on our current 10.2 system the data dictionary is too slow. We decided to revert to GATHER_SCHEMA_STATS.
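The LIST approach can be sketched as below (schema name is hypothetical; DBMS_STATS.ObjectTab is the documented collection type returned by the OBJLIST parameter):

```sql
-- Hedged sketch: list stale objects without gathering anything.
DECLARE
  l_objects DBMS_STATS.ObjectTab;
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'MYSCHEMA',
    options => 'LIST STALE',   -- list only; no stats are gathered
    objlist => l_objects);
  -- Process the list however your in-house job needs to.
  FOR i IN 1 .. l_objects.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_objects(i).objtype || ' ' ||
                         l_objects(i).objname || ' ' ||
                         l_objects(i).partname);
  END LOOP;
END;
/
```

'LIST EMPTY' and 'LIST AUTO' work the same way, with the 10.2.0.3 AUTO caveat above.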

Page 41: Guiding Practices for Gathering Database Statistics


Do not write code to roll back stats.

• Oracle keeps all changed stats for 31 days, by default.

• You can alter this with DBMS_STATS.ALTER_STATS_HISTORY_RETENTION and check it with DBMS_STATS.GET_STATS_HISTORY_RETENTION (usually 31).

• If the stats history is causing you issues, use PURGE_STATS to get rid of it but, be warned, once gone it is gone. But then, how often do you RESTORE stats?

• Unless you REALLY want to be able to identify sets of stats, just use Oracle’s in-built feature where it stores previous stats for 31 days and you can recover them.

• This hidden feature can create a LOT of data, especially if you have lots of partitions with lots of column stats. It goes into SYSTEM TS. http://mwidlake.wordpress.com/2009/08/03/why-is-my-system-tablespace-so-big/
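The retention calls mentioned above, as a sketch (the 14-day value is just an illustration):

```sql
-- Check the current retention and the oldest restorable point.
SELECT DBMS_STATS.GET_STATS_HISTORY_RETENTION FROM dual;    -- usually 31 (days)
SELECT DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY FROM dual;

-- Shorten retention if the history is bloating SYSTEM/SYSAUX.
exec DBMS_STATS.ALTER_STATS_HISTORY_RETENTION(14)

-- Purge history older than a week. Once purged, it is gone for good.
exec DBMS_STATS.PURGE_STATS(SYSDATE - 7)
```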

Page 42: Guiding Practices for Gathering Database Statistics


Rolling Back Statistics

• Most people are aware of the potential to have a user statistics table. This is a table you create that you can put stats into using EXPORT_XXXX_STATS and retrieve them from with IMPORT_XXXX_STATS.

• You use CREATE_STAT_TABLE to create the table, and give a STATID to sets of stats you wish to export and import.

• One little gotcha. The documentation is not clear: if you state a user statistics table in GATHER_XXXX_STATS then the new stats are NOT placed into the stats table; they are put in the dictionary and the OLD values are put in the stats table.

• Rolling back stats works for system, fixed_object, database and object stats. I’ve tested it, it works – at least under 10.1 and 10.2.

• Oracle 11 allows you to create Pending Stats, which is very nice. It beats the manual version I developed for V10 in... ohh... 2005?
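The export/import round trip can be sketched as below. Schema, table and STATID names are hypothetical:

```sql
-- Create the user statistics table once.
exec DBMS_STATS.CREATE_STAT_TABLE('MYSCHEMA', 'MY_STATS_TAB')

-- Save the current dictionary stats under a named STATID.
exec DBMS_STATS.EXPORT_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       stattab => 'MY_STATS_TAB', statid => 'BEFORE_LOAD')

-- ... gather fresh stats, run the load, decide you preferred the old plan ...

-- Put the saved stats back into the dictionary.
exec DBMS_STATS.IMPORT_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       stattab => 'MY_STATS_TAB', statid => 'BEFORE_LOAD')
```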

Page 43: Guiding Practices for Gathering Database Statistics


Rolling Back

• Gathering System, Fixed Object and Dictionary statistics are “system-wide” changes. Do you have a suitable system to test this on?

• All have equivalent DBMS_STATS.RESTORE_XXXX_STATISTICS commands that, when I tested on 10.2.0.3, worked correctly (including blanking stats that were previously null).

• You could save the stats being replaced in your own statistics table and recover from there. The only advantage is it is easier to interrogate stats you saved into your own stats table.

• Deleting System Stats reinstigates the default values seen after install.

• Deleting Fixed Object stats deletes the stats.
• Deleting Dictionary stats is not a very good idea, precisely because it works.
• I said you could restore Dictionary stats but, even though I tested it, I am not going to promise that all stats will be the same after you restore...

Page 44: Guiding Practices for Gathering Database Statistics


Tables Cascade or Not?

• If you gather statistics on a Table or Table Partition, the default is to CASCADE to the relevant index segments.

• The default will gather between 1000 and 2000 index blocks. I find this to be overkill.

• By NOT cascading to index partitions but gathering the indexes specifically, setting a sample size derived from the index size, you can use smaller sample sizes.

• However, this is more code to write and test. It may be more pragmatic to just cascade, though each index segment may end up taking as long as the table segment to be gathered.

• Consider global indexes and partitioned global indexes.
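Doing table and index separately can be sketched as below (schema, table and index names hypothetical; the percentages are illustrative, derived from segment sizes as described above):

```sql
-- Gather the table alone; do not cascade into the indexes.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       estimate_percent => 1, cascade => FALSE)

-- Gather the (much smaller) index with its own, size-derived sample.
exec DBMS_STATS.GATHER_INDEX_STATS('MYSCHEMA', 'BIG_TABLE_PK', -
       estimate_percent => 5)
```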

Page 45: Guiding Practices for Gathering Database Statistics


DBMS_STATS Package

• Many of us still refer to “analyze the table” but of course we all now mean “gather statistics with DBMS_STATS”. Don’t we.

• DBMS_STATS gathers better statistics, including histograms, and cascades down to partitions and sub-partitions as needed.

• Can be run for specific segments (table, index, partition thereof etc), for a schema or for the database and can be set to only gather “stale” objects.

• Can use PARALLEL to make up for the slower running, and can set sample size by row or by block (see more later).

• In these slides, any procedure or function is in DBMS_STATS unless otherwise specified.

• Don’t use ANALYZE any more, especially not on production systems.

Page 46: Guiding Practices for Gathering Database Statistics


Stability and Performance Dichotomy

• To get the best performance, you want the stats to be as up-to-date and accurate as possible.

• When the stats change, it is like having a miniature code release. The processing of the application can change – after all, that is the idea.

• How many people here would allow a code release on their business critical systems at any time?

• Most people are not actually concerned with good performance, they are concerned with poor performance.

• Code can suddenly start running slowly “over night” when stats are gathered. Just one very bad statement can actually make the system unusable.

Please note, the actual result of the data processing does not change: the functionality is preserved, but the time taken can change, the idea being that it changes for the better.

Page 47: Guiding Practices for Gathering Database Statistics


Why are Bad Stats worse than No Stats?

• As an example, stats showing a table or partition is empty will cause the CBO to think it costs 1 to scan and that only one record will be found. Nested loop access and Cartesian merge join plans will occur.

• More later, but stats saying there is no data later than a month ago will cause the CBO to again assume there is as little as one record to find.

• If there are no stats, Oracle will use dynamic sampling or defaults, which are usually better than bad stats.

• Spotting Bad Stats is not as easy as spotting missing Stats, especially as Oracle stores high/low values and histograms in internal formats.

• Old stats (a specific type of bad stats) can cause a plan to flip to another plan without notice, with NO CHANGE.

Page 48: Guiding Practices for Gathering Database Statistics


Intervention methods for stats

• Stated low percentage stats gathering for large objects – use BLOCK SAMPLING and do tables and indexes specifically.

• Set statistics manually or use COPY_TABLE_STATS. The tricky part is the column stats and any histograms you have to have.

• Gather stats at lowest level (sub-partition or Partition) and allow them to be calculated at higher levels via Incremental stats. Made possible via synopses (but needs care to implement)

• The overall aim is always the same. You do not need accurate stats, you need stats that are good enough to give execution plans that will work for large data volumes

• Delete and lock stats and allow Dynamic Sampling to occur.
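The manual-set and copy interventions can be sketched as below. All names and numbers are hypothetical; COPY_TABLE_STATS needs 10.2.0.4 or later:

```sql
-- Set stats directly on a new partition that has no representative data yet.
exec DBMS_STATS.SET_TABLE_STATS('MYSCHEMA', 'SALES', partname => 'P_NEW', -
       numrows => 5000000, numblks => 80000, avgrlen => 120)

-- Or copy stats (including column stats) from a similar, older partition.
exec DBMS_STATS.COPY_TABLE_STATS('MYSCHEMA', 'SALES', -
       srcpartname => 'P_PREVIOUS', dstpartname => 'P_NEW')
```

COPY_TABLE_STATS handles the tricky column stats for you, which is why it is usually preferable to setting everything by hand.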

Page 49: Guiding Practices for Gathering Database Statistics


When to Gather

• Most sites have stats gathering over night, even if they replace the auto stats gathering job. This may well not be the best time.

• You may be better off collecting stats in the late evening or early morning. You can use PARALLEL to get the job done more quickly.

• You are very unlikely to benefit from gathering all stats in one window.

• If you are processing data into a table, gather the stats at the right time(s). This is probably after the load, or it could be several times during the process.

• So long as only specific gathering is done, there is great benefit to be had gathering stats as your batch process proceeds.

Page 50: Guiding Practices for Gathering Database Statistics


When to Gather - Batch Processing

• I’m soooooo tired of this conversation and I have had it so many times... Batch processing means you need to think about stats.

• You have a table that is empty. You stuff it full of data. You use it to load data into your live database. Then you may or may not truncate it. Ask yourself: when is the volume of data significant, and when do you gather stats?

• If it is a global temporary table you may need to set some stats.

• If I was to be given a pound for each time I had seen that the batch processing had no concept of stats gathering I would have £17.50.

• So long as only specific gathering is done, there is great benefit to be had gathering stats as your batch process proceeds.
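For the global temporary table case, setting stats can be sketched as below (names and values hypothetical; a GTT has no persisted stats of its own, so set values representative of a typical batch run):

```sql
-- Give the CBO representative numbers for the staging GTT before the batch
-- queries it, then lock them so nothing overwrites them.
exec DBMS_STATS.SET_TABLE_STATS('MYSCHEMA', 'GTT_STAGING', -
       numrows => 100000, numblks => 2000, avgrlen => 100)
exec DBMS_STATS.LOCK_TABLE_STATS('MYSCHEMA', 'GTT_STAGING')
```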

Page 51: Guiding Practices for Gathering Database Statistics


Incremental Stats

• I’m sorry but I ripped it out. There is nothing between “you can do it” and “here is how it works” that is not 1 min or 30 mins.

• Do NOT attempt before V11, even though it is back-ported to 10.2.0.4 and .5. I’ve tried, it was hard. It was not nice.

• You gather stats at the lowest reasonable level, sub-partition or partition. These are summed up, and synopses are used to allow NDV calculations. Better than the crap I came up with.

• YOU HAVE TO BE RIGOROUS in ensuring all stats are gathered at the lowest levels and issues do not occur, but it works well if controlled.

• No one seems to mention this, but, it all falls apart when you add new columns to a table. It won’t handle this as far as I have seen and those tables with stats that no longer update? Urrrrgghhhh.
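The enabling step can be sketched as below (schema, table and partition names hypothetical; the INCREMENTAL preference needs V11, per the warning above):

```sql
-- Turn on incremental global stats for a partitioned table (11g).
exec DBMS_STATS.SET_TABLE_PREFS('MYSCHEMA', 'SALES', 'INCREMENTAL', 'TRUE')

-- Gather at partition level; global stats are then derived from synopses
-- rather than re-scanning the whole table.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'SALES', -
       partname => 'P_LATEST', granularity => 'PARTITION')
```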

Page 52: Guiding Practices for Gathering Database Statistics


One Extreme - Dynamic Sampling

Delete stats from a table and lock the table.

• If there are no stats for a table, a sample is taken from the table as it is now, at parse time, thus avoiding issues with the table having recently changed.

• You set the level of DYNAMIC SAMPLING at the instance level. In 10.2 it defaults to 2: sample any segment lacking stats.

• At level 3, guesses for filter predicates are checked, and at level 4 correlation between columns is checked. These can greatly improve performance.

• Only the data sampled can be considered and this is going to either be much less than that considered by a GATHER statement or else will take a long time to sample.

• Levels above 4 increase the number of blocks assessed by the dynamic sample, up to 10 which is “every block”. Don’t do that.

Page 53: Guiding Practices for Gathering Database Statistics


Dynamic Sampling

• Dynamic sampling can be set at the database level (optimizer_dynamic_sampling) or in hints.

• I would suggest setting the database level to 3 or 4 on a Datawarehouse, but I’ve not had much success getting Live sites to do this.

• Dynamic sampling does extend parse times (sometimes to several seconds, potentially), so it is not suitable for sites with high SQL statement turnover (i.e. no use of binds). Which is a shame, as binds cause other issues, with histograms and correlation!

• Some people argue that, on DWs, you should delete all user table (and index) stats, lock the tables and use dynamic sampling alone.

• I repeat, Not suitable for OLTP-like SQL activity.
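Both mechanisms can be sketched as below (table and column names hypothetical):

```sql
-- Instance-wide: raise the level from the 10.2 default of 2.
ALTER SYSTEM SET optimizer_dynamic_sampling = 4;

-- Per-statement override via hint; level 4 also checks correlation
-- between the filter columns.
SELECT /*+ dynamic_sampling(t 4) */ COUNT(*)
FROM   big_table t
WHERE  col_a = 1
AND    col_b = 2;
```

The hint form is the safer experiment on a live system, since it affects one statement rather than every parse.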

Page 54: Guiding Practices for Gathering Database Statistics


Other Extreme - Freezing Stats

• DBMS_STATS.LOCK_TABLE_STATS and LOCK_SCHEMA_STATS allow you to lock the stats. You can’t lock at partition or index level.

• The automated stats collection will leave those segments (and their dependents) alone unless you use the ‘force=true’ parameter.

• Individual gather/set/delete/import stats statements on a locked table will cause an error.

• Locking populated stats will help preserve an execution plan, which gives stability but prevents improvements. However, you can’t stop time.

• If you delete the stats first and lock them, they stay empty, allowing dynamic sampling to prevail.

• If you lock the stats, people or processes gathering stats THAT SHOULD NOT BE will fail. They can override via the force parameter.
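The locking behaviour can be sketched as below (schema and table names hypothetical):

```sql
-- Freeze the stats as they are now.
exec DBMS_STATS.LOCK_TABLE_STATS('MYSCHEMA', 'VOLATILE_TAB')

-- An ordinary gather now fails with ORA-20005 (object statistics are locked):
-- exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'VOLATILE_TAB')

-- A deliberate override.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'VOLATILE_TAB', force => TRUE)

-- The other extreme: delete first, then lock, leaving the table
-- empty of stats so dynamic sampling prevails.
exec DBMS_STATS.DELETE_TABLE_STATS('MYSCHEMA', 'VOLATILE_TAB')
exec DBMS_STATS.LOCK_TABLE_STATS('MYSCHEMA', 'VOLATILE_TAB')
```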

Page 55: Guiding Practices for Gathering Database Statistics


Locking Stats

• If you lock the stats on a table then Oracle will not collect them any more.

• If you lock a table with empty stats, the CBO will dynamically sample the table when the SQL statement is parsed. This is fine for long-running SQL, as on Data Warehouses, and utterly unacceptable for OLTP systems where you need the answer in 100ms.

• If you have gathered stats on an empty table and lock the stats, you are in a world of pain, as the CBO will think there are no rows and base its plan on that.

• If you gather stats when you know that they will be good, and then lock them, then your execution plans will probably be good.

• If you have a complex situation but know what stats are needed, you can SET those stats and lock the table. No more stats will be gathered. Everything will be fine until the world moves on.

Page 56: Guiding Practices for Gathering Database Statistics


Histograms

• Histograms and bind variables do not mix well on Oracle 10 due to bind peeking. Oracle 11 is supposed to fix this.

• In essence, if the first bind values seen by the parse are not typical (or match a low-cardinality plan) the plan chosen can perform very poorly for larger returned data sets.

• These bad plans can get stuck in the SGA, e.g. a statement that takes 11 minutes to run and is kicked off every 10 minutes. You can’t get the code out of the SGA.

• Histograms increase the time taken to gather stats significantly. Oracle 10.1 and 10.2 massively over-gather histograms.

• If you write your own code you can stop histograms being gathered.
• But histograms are really good for some code.
• If only there was a “parse” hint.
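Stopping, or targeting, histogram collection in your own code is a matter of METHOD_OPT (schema, table and column names hypothetical):

```sql
-- Basic column stats only, no histograms on anything.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       method_opt => 'FOR ALL COLUMNS SIZE 1')

-- Histogram only on the one skewed column you know benefits from it.
exec DBMS_STATS.GATHER_TABLE_STATS('MYSCHEMA', 'BIG_TABLE', -
       method_opt => 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS STATUS SIZE 254')
```

SIZE 1 means one bucket, i.e. no histogram; 254 is the 10g maximum bucket count.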

Page 57: Guiding Practices for Gathering Database Statistics


Further Information

• PL/SQL Packages and Types Reference, DBMS_STATS

• Blogs - Jonathan Lewis (generally), Doug Burns (recent series on partitions and stats), Christian Antognini; follow links to others.

• My Blog, I intend to do some more posts on stats gathering.

• Email me and ask, I’m happy to answer general questions if I can.

• Your system.

Page 58: Guiding Practices for Gathering Database Statistics


Table Sample Size

• Generally speaking, the larger the table, the smaller the sample size you need for “good enough” statistics.

• I usually decide on a percentage sample size derived from the number of blocks in the table, aiming for 1000 to 10,000 blocks.

• SIZE_AUTO samples 0.01%, then 0.1% (or similar), sees if the stats change significantly, then increases the sample size until the stats are stable or a compute would be quicker.

• SIZE_AUTO sounds clever but, in practice, usually works out to be inefficient. It just does not cope with lots of large segments. If you think about it, it almost guarantees gathering at a sample size larger than you need before stopping.

• You may want to gather the table only, with no CASCADE to indexes, as Oracle over-samples indexes with the cascade option (I hardly ever use CASCADE in replacements for the auto job).

Page 59: Guiding Practices for Gathering Database Statistics

I over-ran, didn’t I? And I kept it to 45 slides.

Oh Well, next SIG...

