best practices for oracle exadata and the oracle optimizer

44
1 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Best Practices for Oracle Exadata and the Oracle Optimizer Maria Colgan, Product Manager, Oracle Optimizer Levi Norman, Product Marketing Director, Oracle Exadata

Upload: edgar-alejandro-villegas

Post on 15-Jul-2015

359 views

Category:

Data & Analytics


3 download

TRANSCRIPT

Page 1: Best Practices for Oracle Exadata and the Oracle Optimizer

1 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Best Practices for Oracle Exadata and the

Oracle Optimizer

Maria Colgan, Product Manager, Oracle Optimizer

Levi Norman, Product Marketing Director, Oracle Exadata

Page 2: Best Practices for Oracle Exadata and the Oracle Optimizer

2 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

• Fastest Data Warehouse & OLTP • Best Cost/Performance Data Warehouse & OLTP • Optimized Hardware • Software Breakthroughs • Scales from ¼ Rack to 8 Full Racks

Oracle Exadata Database Machine Extreme performance. Lowest cost.

Page 3: Best Practices for Oracle Exadata and the Oracle Optimizer

3 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Hybrid Columnar Compression Smart Flash Cache

Smart Scan Queries

Up to

50X 10X

+ + +

Oracle Exadata Innovation Storage Server Software

Page 4: Best Practices for Oracle Exadata and the Oracle Optimizer

4 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Standardized and Simple to Deploy Delivered Ready-to-Run

• Oracle Exadata Database Machines Are The Same Thoroughly Tested & Highly Supportable

Identical Configuration Used by Oracle Engineering

• Run Existing OLTP and DW Applications 30 Years of Oracle DB Capabilities

No Exadata Certification Required

• Leverage Oracle Ecosystem Skills, Knowledge Base, People, & Partners

Page 5: Best Practices for Oracle Exadata and the Oracle Optimizer

5 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Oracle Exadata in the Market Rapid Adoption Across Geographies and Industries

Page 6: Best Practices for Oracle Exadata and the Oracle Optimizer

6 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Agenda

• How to gather statistics

• Additional types of statistics

• When to gather statistics

• Statistics gathering performance

Page 7: Best Practices for Oracle Exadata and the Oracle Optimizer

7 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics

• Analyze command is deprecated

– Only good for row chaining

• The GATHER_*_STATS procedures take 13 parameters

– Ideally you should only set the first 2-3 parameters

• SCHEMA NAME

• TABLE NAME

• PARTITION NAME

Use DBMS_STATS Package

Page 8: Best Practices for Oracle Exadata and the Oracle Optimizer

8 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics Use DBMS_STATS Package

• Your gather statistics commands should be this simple

Page 9: Best Practices for Oracle Exadata and the Oracle Optimizer

9 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics

• Can change the default value at the global level

– DBMS_STATS.SET_GLOBAL_PREF

– This changes the value for all existing objects and any new objects

• Can change the default value at the table level

– DBMS_STATS.SET_TABLE_PREF

Changing Default Parameter Values for Gathering Statistics

Page 10: Best Practices for Oracle Exadata and the Oracle Optimizer

10 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics

• CASCADE

• CONCURRENT

• DEGREE

• ESTIMATE_PERCENT

• METHOD_OPT

• NO_INVALIDATE

• GRANULARITY

• PUBLISH

• INCREMENTAL

• STALE_PERCENT

• AUTOSTATS_TARGET

(SET_GLOBAL_PREFS only)

Changing Default Parameter Values for Gathering Statistics

•The following parameter defaults can be changed:

Page 11: Best Practices for Oracle Exadata and the Oracle Optimizer

11 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics

• # 1 most commonly asked question

– “What sample size should I use?”

• Controlled by ESTIMATE_PRECENT parameter

• From 11g onwards use default value AUTO_SAMPLE_SIZE

– New hash based algorithm

– Speed of a 10% sample

– Accuracy of 100% sample

Sample Size

Page 12: Best Practices for Oracle Exadata and the Oracle Optimizer

12 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

How to Gather Statistics

• Speed of a 10% sample

• Accuracy of 100% sample

Sample Size

Run Num AUTO_SAMPLE_SIZE 10% SAMPLE 100% SAMPLE

1 00:02:21.86 00:02:31.56 00:08:24.10

2 00:02:38.11 00:02:49.49 00:07:38.25

3 00:02:39.31 00:02:38.55 00:07:37.83

Column

Name

NDV with

AUTO_SAMPLE_SIZE

NDV with 10%

SAMPLE

NDV with 100%

SAMPLE

C1 59852 31464 60351

C2 1270912 608544 1289760

C3 768384 359424 777942

Page 13: Best Practices for Oracle Exadata and the Oracle Optimizer

13 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Agenda

• How to gather statistics

• Additional types of statistics

• When to gather statistics

• Statistics gathering performance

Page 14: Best Practices for Oracle Exadata and the Oracle Optimizer

14 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Additional Types of Statistics

• Two types of Extended Statistics

– Column groups statistics

• Column group statistics useful when multiple column from the same

table are used in where clause predicates

– Expression statistics

• Expression statistics useful when a column is used as part of a

complex expression in where clause predicate

• Can be manually or automatically created

• Automatically maintained when statistics are gathered

When Table and Column Statistics are not enough

Page 15: Best Practices for Oracle Exadata and the Oracle Optimizer

15 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics –

SELECT * FROM vehicles

WHERE model = ‘530xi’

AND color = 'RED’;

Column Group Statistics

SLIVER C320 MERC

RED SLK MERC

RED 911 PORSCHE

SILVER 530xi BMW

BLACK 530xi BMW

RED 530xi BMW Color Model Make

Vehicles Table

Cardinality #ROWS * 1 * 1 12 * 1 * 1 1 NDV c1 NDV c2 4 3

= =>

MAKE MODEL COLOR

BMW 530xi RED

=

Page 16: Best Practices for Oracle Exadata and the Oracle Optimizer

16 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

SELECT * FROM vehicles WHERE model = ‘530xi’

AND make = ‘BMW’;

Column Group Statistics

SLIVER C320 MERC

RED SLK MERC

RED 911 PORSCHE

SILVER 530xi BMW

BLACK 530xi BMW

RED 530xi BMW Color Model Make

Vehicles Table

Cardinality #ROWS * 1 * 1 12 * 1 * 1 1 NDV c1 NDV c2 4 3

= => =

MAKE MODEL COLOR

BMW 530xi RED

BMW 530xi BLACK

BMW 530xi SLIVER

Page 17: Best Practices for Oracle Exadata and the Oracle Optimizer

17 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

• Create extended statistics on the Model & Make columns using

DBMS_STATS.CREATE_EXTENDED_STATS

Column Group Statistics

New Column

with system

generated

name

Page 18: Best Practices for Oracle Exadata and the Oracle Optimizer

18 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

SELECT * FROM vehicles WHERE model = ‘530xi’

AND make = ‘BMW’;

Column Group Statistics

SLIVER C320 MERC

RED SLK MERC

RED 911 PORSCHE

SILVER 530xi BMW

BLACK 530xi BMW

RED 530xi BMW Color Model Make

Vehicles Table

Cardinality calculated using column group statistics

MAKE MODEL COLOR

BMW 530xi RED

BMW 530xi BLACK

BMW 530xi SLIVER

Page 19: Best Practices for Oracle Exadata and the Oracle Optimizer

19 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

SELECT *

FROM Customers

WHERE UPPER(CUST_LAST_NAME) = ‘SMITH’;

• Optimizer doesn’t know how function affects values in the column

• Optimizer guesses the cardinality to be 1% of rows

SELECT count(*) FROM customers;

COUNT(*)

55500

Expression Statistics

Cardinality estimate

is 1% of the rows

Page 20: Best Practices for Oracle Exadata and the Oracle Optimizer

20 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

Expression Statistics

New Column with

system generated

name

Page 21: Best Practices for Oracle Exadata and the Oracle Optimizer

21 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

1. Start column group usage capture

Automatic Column Group Detection

• Oracle can automatically

detect column group

candidates based on an STS

or by monitoring a workload

• Uses DBMS_STATS procedure

SEED_COL_USAGE

• If the first two arguments are

set to NULL the current

workload will be monitored

• The third argument is the

time limit in seconds

Page 22: Best Practices for Oracle Exadata and the Oracle Optimizer

22 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

2. Run your workload

Automatic Column Group Detection

• Actual number of rows

returned by this query is 932

• Optimizer under-estimates

the cardinality as it assumes

each where clause predicate

will reduce number of rows

returned

• Optimizer is not aware of

real-world relations between

city, state, & country

Page 23: Best Practices for Oracle Exadata and the Oracle Optimizer

23 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

2. Run your workload

Automatic Column Group Detection

• Actual number of rows

returned by this query is 145

• Optimizer over-estimates the

cardinality as it is not aware

of the real-world relations

between state & country

Page 24: Best Practices for Oracle Exadata and the Oracle Optimizer

24 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Extended Statistics

3. Check column usage information recorded for our table

Automatic Column Group Detection

COLUMN USAGE REPORT FOR SH.CUSTOMERS

1. COUNTRY_ID : EQ

2. CUST_CITY : EQ

3. CUST_STATE_PROVINCE : EQ

4. (CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID) : FILTER

5. (CUST_STATE_PROVINCE, COUNTRY_ID) : GROUP_BY

SELECT dbms_stats.report_col_usage(user, 'customers') FROM dual;

EQ means column was used in equality predicate in query 1

GROUP_BY columns used in group by expression in query 2

FILTER means columns used together as filter predicates rather than join

etc. Comes from query 1

Page 25: Best Practices for Oracle Exadata and the Oracle Optimizer

25 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Enhancements

4. Create extended stats for customers based on usage

EXTENSIONS FOR SH.CUSTOMERS

1. (CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID):

SYS_STUMZ$C3AIHLPBROI#SKA58H_N created

2. (CUST_STATE_PROVINCE, COUNTRY_ID) :

SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ created

Column group statistics will now be automatically maintained

every time you gather statistics on this table

Automatic Column Group Detection

SELECT dbms_stats.create_extended_stats(user, 'customers') FROM dual;

Page 26: Best Practices for Oracle Exadata and the Oracle Optimizer

26 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Agenda

• How to gather statistics

• Additional types of statistics

• When to gather statistics

• Statistics gathering performance

Page 27: Best Practices for Oracle Exadata and the Oracle Optimizer

27 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

When to Gather Statistics

• Oracle automatically collect statistics for all database

objects, which are missing or have stale statistics

• AutoTask run during a predefined maintenance window

• Internally prioritizes the database objects – Both user schema and dictionary tables

– Objects that need updated statistics most are processed first

• Controlled by DBMS_AUTO_TASK_ADMIN package or via

Enterprise Manager

Automatic Statistics Gathering

Page 28: Best Practices for Oracle Exadata and the Oracle Optimizer

28 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

When to Gather Statistics Automatic Statistics Gathering in Enterprise Manager

• Enterprise Manager allows

you to control all aspects of

the automatic statistics

gathering task

• The statistics gathering task

can be set to only run during

certain maintenance windows

Page 29: Best Practices for Oracle Exadata and the Oracle Optimizer

29 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

When to Gather Statistics

• If you want to disable auto job for application schema

leaving it on for Oracle dictionary tables

• The scope of the auto job is controlled by the global

preference AUTOSTATS_TARGET

• Possible values are

– AUTO Oracle decides what tables need statistics (Default)

– All Statistics gathered for all tables in the system

– ORACLE Statistics gathered for only the dictionary tables

Automatic Statistics Gathering

Page 30: Best Practices for Oracle Exadata and the Oracle Optimizer

30 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

When to Gather Statistics

• Need to determine when to manually gather statistics

• After large data loads

– Add statistics gather to the ETL or ELT process

• If trickle loading or online transactions

– Manually determine when statistics are stale and trigger gather

– USER_TAB_MODIFICATIONS lists # INSERTS, UPDATES, and

DELETES that occurs on each table

• If trickle loading into a partition table

– Used dbms.stats.copy_table_stats()

If the Auto Statistics Gather Job is not suitable

Page 31: Best Practices for Oracle Exadata and the Oracle Optimizer

31 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

When to Gather Statistics If the Auto Statistics Gather Job is not suitable

Partitioned Table

Partition 1

June 1st 2012

: Partition 4

June 4th 2012

Partition 5

June 5th 2012

DBMS_STATS.COPY_TABLE_STATS(); • Copies statistic from source

partition to new partition

• Adjusts min & max values for

partition column

• Both partition & global statistics

• Copies statistics of the

dependent objects

• Columns, local (partitioned)

indexes* etc.

• Does not update global indexes

Page 32: Best Practices for Oracle Exadata and the Oracle Optimizer

32 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Agenda

• How to gather statistics

• Additional types of statistics

• When to gather statistics

• Statistics gathering performance

Page 33: Best Practices for Oracle Exadata and the Oracle Optimizer

33 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance

• Three parallel options to speed up statistics gathering

– Inter object using parallel execution

– Intra object using concurrency

– The combination of Inter and Intra object

• Incremental statistics gathering for partitioned tables

How to speed up statistics gathering

Page 34: Best Practices for Oracle Exadata and the Oracle Optimizer

34 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance

• Controlled by GATHER_*_STATS parameter DEGREE

• Default is to use parallel degree specified on object

• If set to AUTO Oracle decide parallel degree used

• Works on one object at a time

Inter Object using parallel execution

Page 35: Best Practices for Oracle Exadata and the Oracle Optimizer

35 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance Inter Object using parallel execution

P4

P3

P2

P1 Customers

Table • Customers table has a

degree of parallelism of 4

• 4 parallel server processes

will be used to gather stats

Page 36: Best Practices for Oracle Exadata and the Oracle Optimizer

36 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance Inter Object using parallel execution

Exec DBMS_STATS.GATHER_TABLE_STATS(‘SH’,’SALES);

Sales Table

Partition 1

May 18th 2012

Partition 2

May 19th 2012

Partition 3

May 20th 2012

• Each individual partition will

have statistics gathered one

after the other

• The statistics gather

procedure on each individual

partition operates in parallel

BUT the statistics gathering

procedures won’t happen

concurrently

P4

P3

P2

P1

Page 37: Best Practices for Oracle Exadata and the Oracle Optimizer

37 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance

• Gather statistics on multiple objects at the same time

• Controlled by DBMS_STATS preference, CONCURRENT

• Uses Database Scheduler and Advanced Queuing

• Number of concurrent gather operations controlled by

job_queue_processes parameter

• Each gather operation can still operate in parallel

Intra Object

Page 38: Best Practices for Oracle Exadata and the Oracle Optimizer

38 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance Intra Object Statistics Gathering for SH Schema

Exec DBMS_STATS.GATHER_SCHEMA_STATS(‘SH’);

• A statistics gathering job is

created for each table and

partition in the schema

• Level 1 contain statistics

gathering jobs for all non-

partitioned tables and a

coordinating job for each

partitioned table

• Level 2 contain statistics

gathering jobs for each

partition in the partitioned

tables

Page 39: Best Practices for Oracle Exadata and the Oracle Optimizer

39 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance Inter Object using concurrent parallel execution

Exec DBMS_STATS.GATHER_TABLE_STATS(‘SH’,’SALES);

Sales Table

Partition 1

May 18th 2012

Partition 2

May 19th 2012

Partition 3

May 20th 2012

Job1

Job2

Job3

• The number of concurrent

gathers is controlled by the

parameter

job_queue_processes

• In this example it is set to 3

• Remember each concurrent

gather operates in parallel

• In this example parallel

degree is 4

Page 40: Best Practices for Oracle Exadata and the Oracle Optimizer

40 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Statistics Gathering Performance

• Typically gathering statistics after a bulk loading data into one partition would causes a full scan of all partitions to gather global table statistics – Extremely time consuming

• With Incremental Statistic gather statistics for touched partition(s) ONLY – Table (global) statistics are accurately built from partition statistics

– Reduce statistics gathering time considerably

– Controlled by INCREMENTAL preference

Incremental Statistics Gathering for Partitioned tables

Page 41: Best Practices for Oracle Exadata and the Oracle Optimizer

41 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Incremental Statistics Gathering

Sales Table

May 22nd 2011

May 23rd 2011

May 18th 2011

May 19th 2011

May 20th 2011

May 21st 2011

Sysaux Tablespace

1. Partition level stats are

gathered & synopsis created

2. Global stats generated by aggregating

partition level statistics and synopsis

Page 42: Best Practices for Oracle Exadata and the Oracle Optimizer

42 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

Incremental Statistics Gathering

Sales Table

May 22nd 2011

May 23rd 2011

May 18th 2011

May 19th 2011

May 20th 2011

May 21st 2011

Sysaux Tablespace May 24th 2011

4. Gather partition

statistics for new partition

5. Retrieve synopsis for

each of the other partitions

from Sysaux

6. Global stats generated by aggregating

the original partition synopsis with the

new one

3. A new partition is added to the table & Data is Loaded

Page 43: Best Practices for Oracle Exadata and the Oracle Optimizer

43 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.

More Information

• Accompanying Two Part White Paper Series – Understanding Optimizer Statistics

– Best Practices for Managing Optimizer Statistics

• Optimizer Blog – http://blogs.oracle.com/optimizer

• Oracle.com – http://www.oracle.com/technetwork/database/focus-areas/bi-

datawarehousing/dbbi-tech-info-optmztn-092214.html

• Oracle Exadata Database Machine – http://www.oracle.com/exadata

Page 44: Best Practices for Oracle Exadata and the Oracle Optimizer

44 Copyright © 2011, Oracle and/or its affiliates. All rights

reserved.