T : +44 (0) 8446 697 995 E : [email protected] W: www.rittmanmead.com
Optimizing the Performance of the Oracle BI Apps Mark Rittman, Director, Rittman Mead Consulting Rittman Mead BI Forum 2009
Who Am I?
Oracle BI&DW Architecture and Development Specialist
Co-Founder of Rittman Mead Consulting
Oracle BI&DW Project Delivery Specialists
10+ years with Discoverer, OWB etc
Oracle ACE Director, ACE of the Year 2005
Writer for OTN and Oracle Magazine
Longest-running Oracle blog: http://www.rittmanmead.com/blog
Ex-Chair of UKOUG BIRT SIG
Co-Chair of ODTUG BI&DW SIG
Editor of Oracle Scene, the UKOUG magazine
Speaker at IOUG and BIWA events
Rittman Mead Consulting
Oracle BI&DW Project Specialists providing consulting, training and support
Clients in the UK, USA, Europe and the Middle East
Voted UKOUG BI Partner of the Year 2008
Consultants in Europe and North America
Regular speakers at user group and Oracle events
Oracle Business Intelligence Applications
Packaged set of ETL mappings, dimensional data warehouse and pre-built reports and dashboards
Financial Analytics
HR Analytics
Marketing Analytics
Order Management Analytics
Sales Analytics
Service Analytics
Contact Center Analytics
Supply Chain Analytics
Part of Oracle Business Intelligence Technology Stack
Predefined, Integrated Dimensional Data Warehouse
Integrated, conformed dimensional data warehouse
Allows modular deployment
Lowest grain of information
Prebuilt aggregates
Deployable on Oracle, Microsoft SQL Server, IBM DB2 and Teradata
History tracking
Indexing
Oracle BI Applications Product Architecture
Oracle Data Warehouse Administration Console
Control panel for running the OBAW load process ETL packaged into Execution Plans Tight integration with Informatica Run jobs, monitor progress Orchestrate the ETL process
Oracle BI Apps Technology Limitations
Designed to work against Oracle, IBM DB2 or Microsoft SQL Server
Uses only database features common to all three platforms:
B-tree and bitmap indexes
Tables
NOT NULL constraints
That's about it
Doesn't use any platform-specific features such as:
Database aggregations (materialized views, OLAP etc)
Compression features
VLDB features
Oracle Database Data Warehousing Enhancements
Compression
Partitioning
Materialized Views
Star Transformations
Others, potentially: OLAP, query rewrite, etc
How can we make use of these? And will we break things if we use them?
A Word of Warning Though...
Some changes, such as partitioning, may require changes to index handling
Some changes involve more work than others
To make use of these features, you will need to upgrade certain components, e.g. the DAC
Some might require the use of tools outside of Informatica and the DAC (e.g. AWM)
Need to make sure the ETL process and queries still run properly afterwards
Oracle BI Apps Optimization Scenario
To illustrate the benefits of using Oracle DW features, we will focus on one fact table, W_SALES_INVOICE_LINE_F, and its aggregate W_SALES_INVOICE_LINE_A
Reasonably large: 450k rows in the sample dataset
Currently takes around 904 seconds (about 15 minutes) to load both tables
Takes up 189MB of disk space
Full table scans require all of the table to be loaded into the buffer cache
Can we improve on this?
Optimization Task List
1. Use Segment Compression on the fact table
2. Add Partitioning to the fact table
3. Replace the existing aggregate table with a fast-refresh materialized view
Scenario #1 : Compressing the Fact Table
The tables that the DAC creates are uncompressed; W_SALES_INVOICE_LINE_F is 189MB in size
Oracle has a feature called Segment Compression that allows us to compress a table
Packs more rows of data into a single data block
Only works for bulk-load inserts
Makes full table scans faster as well
Simple to implement
Things to Consider When Implementing Compression
The DAC normally creates tables for you, but isn't aware of compression
You will therefore need to find a way to add the COMPRESS clause to your tables
One method is to get your DBA to run a script outside of the DAC
One-off task; can be done at the start, or after your tables are deployed
If done after deployment, only newly loaded data is compressed
Implementing Compression Step 1 : Alter Table to use Compression
Run these steps as a SQL script outside of the DAC
Optional: truncate the table to remove old, uncompressed values
Alter the table to use compression
The existing DAC task can then be executed as normal
Full and incremental loads both use BULK LOAD (i.e. direct-path load) by default
Switches to conventional path if the ETL requires both INSERT and UPDATE (check)
Indexes will be dropped, and then recreated, as part of the ETL task

TRUNCATE TABLE w_sales_invoice_line_f;
ALTER TABLE w_sales_invoice_line_f COMPRESS;
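Once the next full load has run, the effect can be sanity-checked from the data dictionary. The queries below are a suggested check, assuming you are connected as the warehouse schema owner:

```sql
-- Confirm the table is now flagged as compressed
SELECT table_name, compression
FROM   user_tables
WHERE  table_name = 'W_SALES_INVOICE_LINE_F';

-- Check the segment size in MB after the reload
SELECT segment_name,
       ROUND(bytes / 1024 / 1024) AS size_mb
FROM   user_segments
WHERE  segment_name = 'W_SALES_INVOICE_LINE_F';
```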
Implementing Compression : Results
Compression reduced the size of the fact table by 77%
Time to load the table reduced slightly as well (2% reduction)
No impact on the ETL process, except for the need to add the COMPRESS clause outside of the DAC
All Informatica workflows use bulk-load inserts as normal, including incremental loads
Beware of mixed insert/updates though: these load via the conventional path and the new rows are not compressed
Fact Table Type   Size in MB   Time to Load (Full Load)
Uncompressed      189MB        904 seconds
Compressed        43MB         887 seconds
Scenario #2 : Partitioning the Fact Table
The tables created by the DAC don't use partitioning
Partitioning is useful for manageability (independent backup/restore) and can enhance query performance (partition elimination, partition-wise joins etc)
An option of the Enterprise Edition of the Database
Used by the vast majority of Oracle DW customers
Oracle BI Apps by default doesn't use it though
Things to Consider When Implementing Partitioning
The DAC can't add partitioning clauses to table creation
We therefore need to find a way to create our table with a PARTITION BY clause
If we partition a table, we need to create its bitmap indexes as LOCAL, as the DAC by default creates them as GLOBAL
We may also need to add new partitions as the ETL window moves on
New in Oracle DAC 10.1.3.4 : Actions
Actions can be used to override the normal table and index creation scripts Can also be used to add preceding and following actions to a task We can use an Index Action to tell the DAC to create indexes on partitioned tables LOCAL Table actions only can override TRUNCATE and ANALYZE table steps Therefore we will use Task Actions to drop and recreate table before FULL load,
to add PARTITION BY clause
Implementing Partitioning Step 1 : Create Task Action
Create a task action that drops the fact table, then recreates it with partitioning
The drop stage is required so that the action can be rerun; mark it as Continue on Fail
The SQL command is regular Oracle SQL; adapt it for other platforms
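As a sketch, the Task Action SQL might look something like the following. The partition key (invoiced_on_dt_wid) and the range boundaries are illustrative assumptions only, and the column list would come from the DAC-generated DDL for the table:

```sql
-- Illustrative Task Action SQL for the FULL load.
-- Mark the DROP as "Continue on Fail" so the action can be rerun.
DROP TABLE w_sales_invoice_line_f;

-- Recreate with a PARTITION BY clause; the partition key and ranges
-- here are examples only - pick a key that matches your queries.
CREATE TABLE w_sales_invoice_line_f
( ... )                    -- column list from the DAC-generated DDL
COMPRESS
PARTITION BY RANGE (invoiced_on_dt_wid)
( PARTITION p2007  VALUES LESS THAN (20080101)
, PARTITION p2008  VALUES LESS THAN (20090101)
, PARTITION p_max  VALUES LESS THAN (MAXVALUE)
);
```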
Implementing Partitioning Step 2 : Create Index Action
Uses DAC functions to return the index name, table name and list of indexed columns
A single CREATE INDEX statement; the DAC will run it for each affected index
Add the LOCAL clause so that the bitmap indexes become valid for our partitioned table
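The Index Action SQL would follow this general pattern. The DAC function names below (getIndexName(), getTableName(), getIndexColumns()) follow the convention described above, but should be verified against the function list in your DAC version:

```sql
-- Illustrative Index Action SQL: the DAC substitutes the index name,
-- table name and column list at runtime for each associated index.
CREATE BITMAP INDEX getIndexName()
ON getTableName() ( getIndexColumns() )
LOCAL NOLOGGING;
```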
Implementing Partitioning Step 3 : Associate Task Action with Task
Locate the task (SIL_SalesInvoiceLinesFact) that loads the fact table
Add the Task Action as a Preceding Action in order to drop and recreate the fact table
Only runs for FULL loads; sets the table up with the partitioning we require
Implementing Partitioning Step 4 : Associate Index Action with Indexes
Locate the indexes associated with the newly partitioned table
Use the Add Action feature to associate the indexes with the new Index Action
New Action will override normal index creation for the selected indexes
Implementing Partitioning : Results
The main benefit of adding partitioning is partition elimination
Allows Oracle to process only those partitions holding data relevant to the query, rather than loading the whole table into the buffer cache
The total size of the table will generally be slightly larger than the non-partitioned version, but as compression can also be used, it was still much smaller than the original table
Load time was slightly faster than both the original and the compressed-only versions
Fact Table Type            Size in MB   Time to Load (Full Load)
Original                   189MB        904 seconds
Compressed & Partitioned   44MB         834 seconds
Implementing Partitioning : Results
Typical SQL execution plan before partitioning Full table scans require all of the table to be run through the buffer cache
select sum(net_amt)
from   w_sales_invoice_line_f
where  cost_center_wid = 4500;

Execution Plan
----------------------------------------------------------
Plan hash value: 1876308179

---------------------------------------------------------------------------------------------
| Id  | Operation          | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                        |     1 |     8 |  1342   (1)| 00:00:17 |
|   1 |  SORT AGGREGATE    |                        |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| W_SALES_INVOICE_LINE_F | 17038 |   133K|  1342   (1)| 00:00:17 |
---------------------------------------------------------------------------------------------
Implementing Partitioning : Results
After partitioning is added, full table scans only have to process the relevant partitions Reduces the I/O associated with full table scans (partition elimination) Also permits partition-wise joins, independent backup & recovery of partitions etc
select sum(net_amt)
from   w_sales_invoice_line_f
where  cost_center_wid = 4500;

Execution Plan
----------------------------------------------------------
Plan hash value: 2982023362

------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                        |     1 |     8 |   405   (1)| 00:00:05 |       |       |
|   1 |  SORT AGGREGATE         |                        |     1 |     8 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                        |  9758 | 78064 |   405   (1)| 00:00:05 |     4 |     4 |
|*  3 |    TABLE ACCESS FULL    | W_SALES_INVOICE_LINE_F |  9758 | 78064 |   405   (1)| 00:00:05 |     4 |     4 |
------------------------------------------------------------------------------------------------------------------
Scenario #3 : Replacing Aggregate Tables with Materialized Views
Aggregates in the OBAW are regular tables loaded via Informatica workflows
The workflows have to populate aggregates by reloading and aggregating fact table data
The Enterprise Edition of the Oracle Database has a similar concept called Materialized Views
The main benefit to us is that they can be very fast to refresh
Advanced forms of materialized views use OLAP to pre-calculate an entire star schema
Components of a Materialized View
The Materialized View itself (similar to a view definition, with specific clauses for the MV)
Materialized View Logs (for tracking changes to the source tables, used for fast refresh)
Indexes on the Materialized View (the MV can also be partitioned, compressed etc)
Script to refresh the materialized view
CREATE MATERIALIZED VIEW LOG ON W_SALES_INVOICE_LINE_F
WITH SEQUENCE, ROWID
(SALES_ORDLN_ID, SALES_PCKLN_ID ....)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW W_SALES_INVOICE_LINE_A
PCTFREE 0
BUILD IMMEDIATE
REFRESH FAST
AS
SELECT ... GROUP BY ...;
Things to Consider When Implementing Materialized Views
You need to reproduce the logic used by the PLP mapping
Loading of the MV is done in two stages: the initial build, and subsequent refreshes
Fast-refresh MVs have their own requirements around required COUNT columns
We will also need to index the MVs; they could also benefit from partitioning

CREATE MATERIALIZED VIEW W_SALES_INVOICE_LINE_A
PCTFREE 0
BUILD IMMEDIATE
REFRESH FAST
AS
SELECT ... GROUP BY ...;

begin
  DBMS_MVIEW.REFRESH('OBAW.W_SALES_INVOICE_LINE_A');
end;
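To make the point about COUNT columns concrete: for fast refresh of an aggregate MV, Oracle generally requires a COUNT(*) plus a COUNT() alongside each SUM(). The join and grouping columns in this sketch are illustrative assumptions, not the exact PLP mapping logic:

```sql
-- Sketch of a fast-refreshable aggregate MV; COUNT(*) and COUNT()
-- columns accompany each SUM() to satisfy fast-refresh requirements.
-- Join and grouping columns are illustrative assumptions.
CREATE MATERIALIZED VIEW W_SALES_INVOICE_LINE_A
PCTFREE 0
BUILD IMMEDIATE
REFRESH FAST
AS
SELECT f.cost_center_wid,
       d.per_name_month,
       SUM(f.net_amt)   AS net_amt,
       COUNT(f.net_amt) AS net_amt_cnt,  -- needed for fast refresh
       COUNT(*)         AS row_cnt       -- needed for fast refresh
FROM   w_sales_invoice_line_f f,
       w_day_d d
WHERE  f.invoiced_on_dt_wid = d.row_wid
GROUP  BY f.cost_center_wid, d.per_name_month;
```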
Implementing Materialized Views Step 1 : Extract Aggregate Logic
Locate the task (PLP_SalesInvoiceLinesAggregate_Load) that loads the aggregate table
Extract the logic from the associated Informatica mappings that load the table
Locate the SQL query in the Source Qualifier transformation
Identify which keys are generated
No need to populate every metadata column (ROW_WID, ETL_PROC_WID etc)
Implementing Materialized Views Step 2 : Create MV Task Action
Create a Task Action to drop and recreate the Materialized View Logs and the Materialized View
The MV logs will track changes to the source tables (W_SALES_INVOICE_LINE_F, W_DAY_D), enabling fast refresh of the MV
The initial build of the MV will populate the aggregate automatically
Implementing Materialized Views Step 3 : Create Refresh Task Actions
Incremental loads of the MV will be carried out via DBMS_MVIEW.REFRESH
You need to create two Task Actions for this:
One, for incremental loads, to run DBMS_MVIEW.REFRESH
One, for full loads, to act as a dummy action
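The incremental-load Task Action is then just a short PL/SQL block. Explicitly requesting the fast-refresh method ('F') makes the intent clear; the OBAW schema name here is an example:

```sql
-- Incremental-load Task Action: fast-refresh the MV from the MV logs.
-- 'F' requests a fast refresh; the schema name is site-specific.
BEGIN
  DBMS_MVIEW.REFRESH('OBAW.W_SALES_INVOICE_LINE_A', method => 'F');
END;
```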
Implementing Materialized Views Step 4 : Add Preceding Task Action
Like the creation of the partitioned table, the MV is created using a Preceding Task Action Assign Task Action to the PLP_SalesInvoiceLinesAggregate_Load DAC task Task will now run the MV creation scripts before starting the table load
Implementing Materialized Views Step 5 : Turn off Table Truncation
By default, the DAC will truncate target tables when a task runs in FULL mode This needs to be disabled, otherwise the MV will be truncated just after creation Subsequent fast refreshes will also fail with an ORA-32320 if truncation occurs
Implementing Materialized Views Step 6 : Register SQL MV Commands
Finally, register the SQL Task Actions with the PLP task, so that these are run instead of the Informatica mappings
The main DBMS_MVIEW.REFRESH task action runs for incremental loads
The dummy task action (which contains no SQL statement) runs for full loads
The initial load is performed by the MV creation task action
Implementing Materialized Views : Results
FULL refresh time of subject area (fact + aggregate) fell by 50% INCREMENTAL refresh time of subject area fell by 33% Actual FULL refresh time for aggregate went from 300 seconds to 48 seconds Actual INCREMENTAL refresh time for aggregate went from 25 seconds to 1 second
Fact & Aggregate Type         Full Refresh   Incremental Refresh
Table fact, Table Aggregate   904 secs       499 secs
Table fact, MV Aggregate      437 secs       334 secs
Further Opportunities for Optimization
Replace the Materialized View with a Cube-Organized Materialized View
Add Dimensions to the OBAW database to facilitate Query Rewrite
Add constraints and enable STAR_TRANSFORMATION_ENABLED
Support compression for all INSERTs and UPDATEs using the Advanced Compression Option
Similar features can be found on the IBM DB2 and Microsoft SQL Server platforms
The key is DAC Actions, which allow us to override the default functionality
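As one sketch of the star-transformation item above: the usual prerequisites are RELY constraints declaring the fact-to-dimension relationships, plus the initialization parameter. The constraint and column names below are illustrative only:

```sql
-- Illustrative setup for star transformations: declare the fact-to-
-- dimension relationship without the cost of validating it (RELY),
-- then enable the optimizer feature. Names are examples only.
ALTER TABLE w_sales_invoice_line_f
  ADD CONSTRAINT fk_inv_line_day
  FOREIGN KEY (invoiced_on_dt_wid) REFERENCES w_day_d (row_wid)
  RELY DISABLE NOVALIDATE;

ALTER SESSION SET star_transformation_enabled = TRUE;
```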
Summary
The Oracle Business Analytics Warehouse has a generic data model
It does not by default make use of Oracle (or other DB) data warehouse optimizations
Oracle DAC 10.1.3.4 provides Index, Table and Task Actions that let you override the standard behaviour
You can add compression, partitioning and materialized views simply and easily
It is also possible to enable star transformations, OLAP access etc
Tests have shown around 50% improvements in ETL time, query time and disk space usage
For more details, check out http://www.rittmanmead.com/blog
Thank you for attending, enjoy the rest of the conference