agenda odi performance odi scheduling odi deployment/release

AGENDA
• ODI Performance
• ODI Scheduling
• ODI Deployment/Release
BI-Quotient www.bi-q.ie

ULI BETHKE
• Dublin based
• Blog www.bi-q.ie
• ODI 2007
• Reviewer of two ODI books
• ODI articles on OTN
• Deputy chair OUG BI SIG. Next event 11th June
• ODI advanced trainer

ODI PERFORMANCE
ODI is a metadata-driven (SQL) code generator using code templates (knowledge modules). It uses a Java agent to communicate and send data between source and target systems and the repository over the network.
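The idea of generating SQL from metadata plus a code template can be sketched in a few lines of Python. This is only an illustration of the principle, not how ODI or its knowledge modules are implemented: the template text, the `MAPPING` dictionary and every table and column name are hypothetical.

```python
from string import Template

# Hypothetical, heavily simplified "knowledge module": a SQL template
# with placeholders that are filled from metadata.
KM_INSERT_SELECT = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $expressions\n"
    "FROM $source"
)

# Hypothetical mapping metadata: target column -> source expression.
MAPPING = {
    "target": "DW.DIM_CUSTOMER",
    "source": "STG.CUSTOMER",
    "columns": {
        "CUSTOMER_ID": "CUST_ID",
        "CUSTOMER_NAME": "UPPER(NAME)",
    },
}


def generate_sql(mapping):
    """Fill the template from the mapping metadata and return the SQL text."""
    columns = ", ".join(mapping["columns"].keys())
    expressions = ", ".join(mapping["columns"].values())
    return KM_INSERT_SELECT.substitute(
        target=mapping["target"],
        source=mapping["source"],
        columns=columns,
        expressions=expressions,
    )


if __name__ == "__main__":
    print(generate_sql(MAPPING))
```

Changing the template changes the generated code for every mapping that uses it, which is why the quality of the generated SQL depends so much on the knowledge module.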

SQL
- More than 80% of ODI performance issues are SQL issues, so SQL is the main ODI skill
- Perfect your SQL. Advanced SQL. Analytic Functions
- Know your database(s) inside out. In particular the target
- Understand, write, and modify Knowledge Modules

AGENT
- Lightweight Java based application
- Tied to host OS
- Generates code based on ODI metadata
- Communicates with source, target, repository
- JDBC data transport
- XML
- Jetty
- Interpreters: Jython, JBS, JavaScript, Groovy
- HSQLDB in memory database
- Scheduler
- Sizing

AGENT
Target
- Least amount of roundtrips. Network (JDBC, XML)
- One target database server only (DW)
Another Server
- ODBC drivers
- JEE agent on WebLogic
- No support for target OS
- Resources on target
- DBA

INTERFACES
- No KMs that use row-by-row processing!
- Use ODI functions rather than DB functions
- Don’t overuse CKM (especially for large data volumes)
- Temp indexes (I$)
- Gather statistics (C$, I$, TGT when applicable)
- Rule of thumb: Use loader KMs or DB link KMs rather than JDBC KMs
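The difference between row-by-row and set-based processing can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the tables `src`, `tgt_rows` and `tgt_set` are hypothetical, and SQLite stands in for the target database.

```python
import sqlite3


def setup():
    """Create an in-memory database with a small source table and two targets."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER, amount INTEGER)")
    conn.execute("CREATE TABLE tgt_rows (id INTEGER, amount_x2 INTEGER)")
    conn.execute("CREATE TABLE tgt_set (id INTEGER, amount_x2 INTEGER)")
    conn.executemany(
        "INSERT INTO src VALUES (?, ?)", [(i, i * 10) for i in range(1000)]
    )
    return conn


def load_row_by_row(conn):
    """Anti-pattern: pull every row to the client, then one INSERT per row."""
    rows = conn.execute("SELECT id, amount FROM src").fetchall()
    for row_id, amount in rows:
        conn.execute("INSERT INTO tgt_rows VALUES (?, ?)", (row_id, amount * 2))


def load_set_based(conn):
    """Set-based: a single INSERT ... SELECT, the data never leaves the database."""
    conn.execute("INSERT INTO tgt_set SELECT id, amount * 2 FROM src")


if __name__ == "__main__":
    conn = setup()
    load_row_by_row(conn)
    load_set_based(conn)
    same = (
        conn.execute("SELECT * FROM tgt_rows ORDER BY id").fetchall()
        == conn.execute("SELECT * FROM tgt_set ORDER BY id").fetchall()
    )
    print("identical results:", same)
```

Both loads produce the same rows. The row-by-row version issues one statement per row, so over a network its cost grows with the number of roundtrips, while the set-based version is a single statement.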

SOURCE/TARGET
- Schemas on the same database server: define them as physical schemas, not as separate data servers
- Have sources physically close to target
- Minimize impact on source
- Chunking

CRITICAL PATH
NETWORK PATHS    PATH DURATIONS
B > E > H        6 + 2 + 11 = 19
B > D > F        6 + 4 + 14 = 24
B > D > G        6 + 4 + 10 = 20
A > C > G        9 + 8 + 10 = 27   CRITICAL PATH
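The critical path above can be recomputed with a short longest-path search. The node durations and edges below are read off the four listed paths; the function names are made up for this sketch.

```python
from functools import lru_cache

# Durations and edges taken from the four paths listed on the slide.
DURATION = {"A": 9, "B": 6, "C": 8, "D": 4, "E": 2, "F": 14, "G": 10, "H": 11}
SUCCESSORS = {
    "A": ["C"],
    "B": ["D", "E"],
    "C": ["G"],
    "D": ["F", "G"],
    "E": ["H"],
    "F": [],
    "G": [],
    "H": [],
}


@lru_cache(maxsize=None)
def longest_from(node):
    """Return (total duration, path) of the longest chain starting at node."""
    best = (0, ())
    for nxt in SUCCESSORS[node]:
        candidate = longest_from(nxt)
        if candidate[0] > best[0]:
            best = candidate
    return DURATION[node] + best[0], (node,) + best[1]


def critical_path():
    """Longest chain over all start nodes (nodes that have no predecessor)."""
    targets = {n for succ in SUCCESSORS.values() for n in succ}
    starts = [n for n in SUCCESSORS if n not in targets]
    return max((longest_from(s) for s in starts), key=lambda result: result[0])


if __name__ == "__main__":
    total, path = critical_path()
    print(" > ".join(path), "=", total)  # A > C > G = 27
```

Tuning effort pays off only on the critical path: shortening F, for example, does not reduce the overall run time as long as A > C > G still takes 27.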

MICRO TUNING
• JDBC drivers
• JVM
• Type 4 or 5 JDBC drivers (Data Direct)
• Array fetch size
• DB packet size
• Network packet size
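The effect of the array fetch size can be illustrated with Python's DB-API, where `cursor.arraysize` sets how many rows `fetchmany()` returns per call. This is a sketch of the batching idea only: in-memory SQLite has no network, and the table `numbers` and the helper names are hypothetical. With a remote database each fetch call would roughly correspond to a roundtrip.

```python
import sqlite3


def make_numbers(row_count):
    """Create an in-memory table with row_count rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE numbers (n INTEGER)")
    conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(row_count)])
    return conn


def count_fetch_calls(conn, array_size):
    """Read the whole table in batches; return (non-empty fetches, rows read)."""
    cur = conn.cursor()
    cur.arraysize = array_size  # rows returned per fetchmany() call
    cur.execute("SELECT n FROM numbers")
    calls = rows = 0
    while True:
        batch = cur.fetchmany()
        if not batch:
            break
        calls += 1
        rows += len(batch)
    return calls, rows


if __name__ == "__main__":
    conn = make_numbers(1000)
    for size in (1, 100):
        calls, rows = count_fetch_calls(conn, size)
        print(f"array size {size}: {calls} fetches for {rows} rows")
```

A larger array size means fewer fetches for the same data, at the cost of more memory per fetch on the agent side.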

PERFORMANCE MONITORING
• ODI Log Data Mart
• Facts
• Dimensions
• Metrics
• Frontend

DBMS_SQLTUNE_UTIL0
• dbms_sqltune_util0.sqltext_to_sqlid
• Link to Data Dictionary Tables

MACIEJ KOCON
• Dublin based
• ODI 2005 (Sunopsis)
• Reviewer of two ODI books
• Blog www.bi-q.ie
• [email protected]

ORCHESTRATING DWH PROCESSES
• Orchestration of Data Process Flow
  – Standard DWH Process flow orchestration
  – Packages in Oracle Data Integrator 10g
  – Load Plans in Oracle Data Integrator 11g
• Process Flow use cases - efficiency analysis
• Alternative scheduling
  – benefits

TYPICAL DATA FLOW in DWH (E-LT)
Step 1: STAGE. DATA EXTRACT: loads data from sources
Step 2: DIMs. LABEL: provides structured labeling information
Step 3: FACTS. FACTS: consists of measurements, metrics or facts
Each step is made up of data transport & transform units
Orchestration: ODI 10g Packages, ODI 11g Load Plans

ORCHESTRATION – ODI PACKAGES
[Diagram: objects INT_A, PRC_B, INT_C, INT_D and INT_E arranged in packages PKG_ABC, PKG_DE and PKG_ABCDE]
• Using objects directly
• Using scenarios (compiled code): SYNCHRONOUS or ASYNCHRONOUS

ODI 10g vs. ODI 11g
[Diagram: STAGE, DIMs and FACTS steps with objects INT_A, PRC_B, INT_C, PRC_D, INT_F and PRC_G in packages PKG_ABC, PKG_DE, PKG_FG and PKG_DM; processes A to G]
ODI 10g Packages and ODI 11g Load Plans: SAME EFFECT!

PROCESS FLOW EFFICIENCY ANALYSIS
Standard Flow Orchestration: Stage-(stop)-DIMs-(stop)-Facts
Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10
[Diagram: processes A to G; steps run sequentially, processes within a step run in parallel]
Total: 30 + 30 + 10 = 70
DOWNSIDES:
• Possible inefficiencies (idle resources)
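The total of 70 and the idle time behind it can be reproduced with a small calculation. The durations are from the slide; which process belongs to which step is an assumption made for this sketch (the transcript does not preserve the diagram), chosen so that the step maxima are 30, 30 and 10.

```python
# Durations from the slide.
DURATION = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}

# ASSUMPTION: step membership is not recoverable from the transcript.
STEPS = [
    ("STAGE", ["A", "B", "C"]),
    ("DIMs", ["D", "E"]),
    ("FACTS", ["F", "G"]),
]


def standard_flow_duration(steps, duration):
    """Each step waits for its slowest process before the next step starts."""
    return sum(max(duration[p] for p in procs) for _, procs in steps)


def idle_time(steps, duration):
    """Time that finished processes spend waiting for the slowest one in their step."""
    total = 0
    for _, procs in steps:
        slowest = max(duration[p] for p in procs)
        total += sum(slowest - duration[p] for p in procs)
    return total


if __name__ == "__main__":
    print("total:", standard_flow_duration(STEPS, DURATION))  # 30 + 30 + 10 = 70
    print("idle:", idle_time(STEPS, DURATION))
```

Under this assumed grouping the short processes wait a combined 60 time units for the long-running A and E, which is the idle-resource problem named above.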

ADVANCED DATA FLOW EXAMPLE

ENTERPRISE DWH DATA FLOW EXAMPLE

PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10
[Diagram: processes A to G rearranged across sequential and parallel steps]
Total after optimization: 50
70 / 50 = 1.4 times quicker!
UPSIDE:
• Efficiency improved
DOWNSIDES:
• Timings knowledge required
• Overall dependency knowledge required

PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10
[Diagram: processes A to G; steps run sequentially, processes within a step run in parallel]
Total: 30 + 30 + 10 = 70, unchanged after the optimization attempt (70 vs. 70)
DOWNSIDE:
• Inefficiency exists but can’t be resolved
• Consumer waiting & impact

TRADITIONAL SCHEDULING - LIMITATIONS
• Possible inefficiencies (idle resources)
• Timings knowledge required
• Overall dependency knowledge required
• Inefficiency exists but can’t be resolved
• Consumer waiting & impact
SCHEDULER

DEPENDENCY DRIVEN SCHEDULING
[Diagram: processes A to E]
PACKAGES & LOAD PLANS

PROCESS FLOW EFFICIENCY ANALYSIS
Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10
Standard flow (sequential steps, parallel within a step): 30 + 30 + 10 = 70
With dependency driven scheduling: 30
70 / 30 = 2.3 times faster!

DEPENDENCY DRIVEN SCHEDULING
• Simplifies orchestrating the flow
  – only immediate upstream definition required
  – execution timings not relevant
  – self-adapts in the most effective way
• Improves overall E-LT performance
  – fewer idle resources, better utilization
  – independency
  – unveils its full potential in complex Enterprise class DWHs (Inmon)
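A minimal simulation shows how declaring only the immediate upstream of each process is enough to derive the schedule. The durations are the ones used in the efficiency analysis; the `UPSTREAM` edges are hypothetical (the real dependency graph is not recoverable from the transcript) and were chosen to be consistent with the total of 30 quoted there. The sketch also assumes unlimited parallel capacity.

```python
from functools import lru_cache

# Durations from the efficiency analysis slides.
DURATION = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}

# ASSUMPTION: hypothetical immediate-upstream definitions, one entry per process.
UPSTREAM = {
    "A": [],
    "B": [],
    "C": [],
    "D": ["B"],
    "E": [],
    "F": ["D"],
    "G": ["C"],
}


@lru_cache(maxsize=None)
def finish_time(process):
    """A process starts as soon as all of its immediate upstream processes finish."""
    start = max((finish_time(u) for u in UPSTREAM[process]), default=0)
    return start + DURATION[process]


def makespan():
    """Overall run time: the latest finish time of any process."""
    return max(finish_time(p) for p in DURATION)


if __name__ == "__main__":
    for name in sorted(DURATION):
        print(name, "finishes at", finish_time(name))
    print("total:", makespan())
```

No step boundaries or timing estimates appear anywhere in the definition: if a duration changes, the start times of the downstream processes follow automatically.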

DEPENDENCY DRIVEN SCHEDULING
• Notifications
  – errors (+auto-restartability)
  – finish summary
  – logging
• Multiple/overlapping E-LT streams
  – load with different frequencies
• Parameterization
  – improved system stress control
  – process prioritization

FIRST RUN: 10 processes
TODAY: 584 processes, 1389 dependencies, 132 231 scenarios run
TIME: 12h43m with LOAD PLANS vs. 4h21m with the SCHEDULER
2.9 TIMES FASTER
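The speed-up factors quoted in this deck can be checked with a few lines of arithmetic. The helper names are made up for this sketch; the inputs are the figures from the slides.

```python
def minutes(hours, mins):
    """Convert a duration given as hours and minutes to minutes."""
    return hours * 60 + mins


def speedup(before, after):
    """How many times faster the 'after' run is compared to the 'before' run."""
    return before / after


if __name__ == "__main__":
    # Load Plans 12h43m vs. scheduler 4h21m
    print(round(speedup(minutes(12, 43), minutes(4, 21)), 1))  # 2.9
    # Efficiency analysis: 70 vs. 50 and 70 vs. 30
    print(round(speedup(70, 50), 1))  # 1.4
    print(round(speedup(70, 30), 1))  # 2.3
```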

ENTERPRISE DWH DATA FLOW

RELEASE 1.0

RELEASE 2.0 TST

TESTING RELEASE 2.0

DEPLOY RELEASE 2.0 PRD

THE HOT FIX SITUATION

RELEASE FREQUENTLY

CI ENVIRONMENT

THE BUILD MASTER

AUTOMATE STUFF

ODI VS. SOURCE CONTROL

ODI STRUCTURE

BEYOND INTRA BUILD DEPENDENCIES