1
Jitendra Malhotra, Lead Engineer, Global Customer Support, Informatica
Avneet Bajwa, Senior Engineer, Global Customer Support, Informatica
Three Simple Ways to Master the Administration and Management of an MDM Hub
2
Breakout Session Includes
• MDM Components
• Optimizing Hub Environment
• Tools & Utilities
• Effective Problem Resolution of MDM Hub
• IDD Best Practices
3
MDM Components
4
MDM Components
• Hub Store: A collection of databases in which business data is stored and consolidated. A Hub Store consists of one Master Database and one or more Operational Reference Stores (ORS).
• Hub Server: A J2EE application, deployed on the application server, that manages core and common services for the MDM Hub. It orchestrates data processing within the Hub Store as well as integration with external applications.
• Hub Console: The MDM user interface, comprising a set of administrative and data management tools for administrators and data stewards.
• Cleanse Match Server: Handles the cleanse operations that standardize data and the match operations.
5
MDM Components
[Architecture diagram: three tiers]
• User Interfaces: Hub Console (JNLP client), IDD (HTML client), Custom Client
• Application Server tier: Hub Server (SIF Engine, accessed through the SIF SDK APIs), Cleanse Match Server (Cleanse Engine)
• Hub Store (Database tier): CMX_SYSTEM, CMX_ORS1, CMX_ORS2
6
Hub Environment
• Components on a Single Machine
• Components distributed on Multiple Machines
7
Trust Framework
[Diagram: process flow through the MDM Hub]
• Sources (Reference or Relationship Data): Data Source, Application, Data Warehouse, delivered through ETL or Msg Queue/Services
• Landing, then Staging (Delta Detection and Cleansing; Raw and Reject records)
• Consolidation Process (Apply Trust and Validation): Insert/Update, New, Queued for Matching, Match, Queued for Merging, Auto Merge, Manual Merge, Un-Merge
• Target Data Model (Name, Product, Address) with Dynamic Cell-Level Survivorship
• Rules-based Configuration Tools and Metadata: Application Management, Rules, Hierarchy, Validation, State Mgmt, Workflow, Event Trigger
• Content: History, Lineage, X Ref, Trust Score, Audit
• Events published through a Msg Queue to Consumers (Master Reference or Relationship Data)
8
Services Integration Framework (SIF)
[Diagram: SIF access interfaces]
• Applications: Portal, Oracle, SAP, Siebel, Composite, Legacy, Business Data Director
• Access Interfaces: Synchronous / Asynchronous (EJB, SOAP, HTTP, JMS)
• Security Access Manager (SAM)
• Services & Events Generator (Design Time)
• Business Services: Get Customer, Get Name, Get Address
• Business Events: New Customer Profile, Name Change, New Address
• Schema Specific Services, Generic Services, Data Services, Process Services, Data Events
• Multidomain MDM Hub
9
Optimizing Hub Environment
10
Optimizing Hub Environment
• Database Optimizations
1. Init.ora parameters recommendations
• Application Server
1. JVM Parameters
2. Connection Pool Sizes and JTA timeout
11
Database Optimization – Init.ora parameters
• Oracle DB parameters for baseline performance:
• memory_target 6000M
• memory_max_target 6000M
• Adjust the Oracle PGA and SGA sizing according to the memory available on the server.
• Unless otherwise noted, the recommendations are minimum settings; additional resources will improve performance.
• For larger systems, you may need to raise PGA_AGGREGATE_TARGET and SGA_TARGET beyond the required 6 GB to make use of the total memory available. As a rule: RAM = O/S + two equal amounts for SGA and PGA.
• Example: 32 GB (total memory) = 4 GB (O/S) + 14 GB (PGA) + 14 GB (SGA)
• The settings for Oracle 11g assume a machine with 8 GB of RAM.
• Details of the init.ora settings are given in Knowledge Base article 90408.
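The sizing rule above (RAM = O/S + two equal amounts for SGA and PGA) can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, and the 4 GB operating-system reservation is taken from the 32 GB example on the slide, not from a stated rule.

```python
def split_oracle_memory(total_gb: float, os_reserved_gb: float = 4.0) -> dict:
    """Split the RAM left after the OS reservation equally between SGA and PGA.

    The 4 GB default mirrors the slide's 32 GB example and is an assumption.
    """
    if total_gb <= os_reserved_gb:
        raise ValueError("total memory must exceed the OS reservation")
    share = (total_gb - os_reserved_gb) / 2
    return {"os_gb": os_reserved_gb, "sga_gb": share, "pga_gb": share}


# Reproduces the slide's example: 32 GB = 4 GB (O/S) + 14 GB (PGA) + 14 GB (SGA)
print(split_oracle_memory(32))
```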
12
Init.ora recommendations continued

Informatica Recommended init.ora Parameters

Oracle database parameter        Non-Default Required Value
db_block_checking                FALSE
db_file_multiblock_read_count    Do not set
db_cache_size                    2000M
disk_asynch_io                   TRUE
filesystemio_options             SETALL
java_pool_size                   0
large_pool_size                  400M
log_buffer                       4002816
open_cursors                     1000
parallel_adaptive_multi_user     TRUE
pga_aggregate_target             3000M
processes                        1000
recyclebin                       OFF
sga_target                       4000M
shared_pool_size                 400M
streams_pool_size                0
workarea_size_policy             AUTO
utl_file_dir                     ** *
db_writer_processes              1 or <number of CPUs/8>

NOTE: The default init.ora parameters should be used except where noted above.
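One way to use the list above is to compare an instance's current values against the recommendations. The following is a minimal sketch, not an Informatica tool: the function name and the sample values are hypothetical, only a subset of the parameters is included, and gathering the current values (for example from Oracle's v$parameter view) is left out.

```python
# Subset of the recommended values from the slide (fixed values only).
RECOMMENDED = {
    "db_block_checking": "FALSE",
    "disk_asynch_io": "TRUE",
    "filesystemio_options": "SETALL",
    "open_cursors": "1000",
    "processes": "1000",
    "recyclebin": "OFF",
    "workarea_size_policy": "AUTO",
}


def find_deviations(current: dict) -> dict:
    """Return {parameter: (current_value, recommended_value)} for each mismatch."""
    deviations = {}
    for name, wanted in RECOMMENDED.items():
        actual = current.get(name)
        # A missing parameter is reported with None as its current value.
        if actual is None or actual.upper() != wanted.upper():
            deviations[name] = (actual, wanted)
    return deviations


# Hypothetical current values, for illustration only.
current_values = {
    "db_block_checking": "false",
    "disk_asynch_io": "TRUE",
    "filesystemio_options": "NONE",
    "open_cursors": "300",
    "processes": "1000",
    "recyclebin": "OFF",
    "workarea_size_policy": "AUTO",
}
for name, (actual, wanted) in find_deviations(current_values).items():
    print(f"{name}: current={actual}, recommended={wanted}")
```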
13
Application Server Configuration
• JVM Settings
• Xms = 512m
• Xmx = 2048m
• PermSize = 256m
• MaxPermSize = 512m
• Xss = 2048k
• JNLP Settings
• jnlp.max-heap-size=2048m
• jnlp.initial-heap-size=512m
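As a rough illustration, the JVM settings above correspond to standard HotSpot command-line options. The sketch below only builds the option string; the helper name is hypothetical, and where the string is configured (a startup script, an environment variable, an admin console) depends on the application server and is not covered by the slide. The PermSize options apply to Java 7 and earlier, since the permanent generation was removed in Java 8.

```python
# Values as listed on the slide.
JVM_SETTINGS = {
    "Xms": "512m",
    "Xmx": "2048m",
    "PermSize": "256m",
    "MaxPermSize": "512m",
    "Xss": "2048k",
}


def to_jvm_flags(settings: dict) -> str:
    """Render the settings as HotSpot command-line options."""
    flags = []
    for name, value in settings.items():
        if name in ("PermSize", "MaxPermSize"):
            # Permanent generation sizes are -XX options (Java 7 and earlier).
            flags.append(f"-XX:{name}={value}")
        else:
            # Heap and thread stack sizes use the short -X form.
            flags.append(f"-{name}{value}")
    return " ".join(flags)


print(to_jvm_flags(JVM_SETTINGS))
# -Xms512m -Xmx2048m -XX:PermSize=256m -XX:MaxPermSize=512m -Xss2048k
```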
14
Optimizing Connection Pool Size
Connection pool size = maximum number of concurrent jobs * (maximum thread count + 2)
• For example, to run 8 jobs in parallel with a maximum thread count of 16, you need 8 * (16 + 2) = 144 connections.
• The datasource should therefore have at least 144 connections available. Not all of them will normally be used, but this size is safe for all cases.
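The formula above is simple enough to check in a few lines. The function name is hypothetical; the arithmetic is exactly the slide's.

```python
def required_pool_size(max_concurrent_jobs: int, max_thread_count: int) -> int:
    """Connections needed: concurrent jobs * (max thread count + 2)."""
    return max_concurrent_jobs * (max_thread_count + 2)


# The slide's example: 8 parallel jobs, maximum thread count of 16.
print(required_pool_size(8, 16))  # 144
```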
15
JTA Timeout
• Should be set to 600 or above
• Information on application server configuration is available in KB article 120187:
• https://communities.informatica.com/infakb/solution/18/Pages/120187.aspx?docid=120187&type=external&index=1
16
Tools & Utilities
17
Tools & Utilities
• Various built-in MDM tools and utilities, as well as external ones, can help users identify potential issues in the day-to-day operation of the Hub. Some of these tools and utilities are:
• Enterprise Manager
• Metadata Manager
• TEST_IO Utility
• Memory Analyzer Utility
• Heap Dump Utility
18
Enterprise Manager
• View properties, version histories, and environment reports for:
• Hub Server
• Cleanse servers
• ORS databases
• Master database
19
Enterprise Manager
• Enterprise Manager is a powerful tool for reviewing the Hub environment through its “Environment Report” utility
20
Metadata Manager
• Validates metadata and repairs some metadata errors
• Promotes changes from one environment to another
• Exports metadata from one environment to another
21
Other Useful Utilities
TEST_IO Utility
• A utility developed by MDM Engineering to help measure database disk I/O.
Memory Analyzer
• Helps monitor the health of application servers.
Heap Dump Analyzer
• Used to analyze issues related to out-of-memory server crashes.
22
TEST_IO Utility
• One of the important factors in performance is disk I/O speed
• Sample output:

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      735      0.01       0.05          0          1          0           0
Execute    738     43.69      76.72     511988     518341     523816    20000488
Fetch      733      0.02       1.22        194       1294          0         489
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2206     43.73      78.00     512182     519636     523816    20000977

• Verify that 20 million records were processed. If not, something went wrong with the script.
• A good Execute elapsed time is between 40 and 100.
• A just-OK elapsed time is between 100 and 200.
• Any elapsed time above 200 needs serious investigation and could have an enormous impact on SIF as well as batch job performance.
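The checks above can be applied mechanically to the Execute and total rows of the report. The sketch below is an illustration with hypothetical helper names, not part of the TEST_IO utility. It assumes that each row has the eight whitespace-separated columns shown on the slide, that values below 40 also count as good, and that the boundaries at exactly 100 and 200 fall into the better band.

```python
EXPECTED_ROWS = 20_000_000  # the test is expected to process 20 million records


def parse_totals_row(line: str) -> dict:
    """Parse one row: call, count, cpu, elapsed, disk, query, current, rows."""
    call, count, cpu, elapsed, disk, query, current, rows = line.split()
    return {
        "call": call,
        "count": int(count),
        "cpu": float(cpu),
        "elapsed": float(elapsed),
        "disk": int(disk),
        "query": int(query),
        "current": int(current),
        "rows": int(rows),
    }


def classify_execute_elapsed(elapsed: float) -> str:
    """Map the Execute elapsed time onto the slide's three bands."""
    if elapsed <= 100:
        return "good"
    if elapsed <= 200:
        return "just ok"
    return "investigate"


# Rows copied from the sample output on the slide.
execute = parse_totals_row("Execute 738 43.69 76.72 511988 518341 523816 20000488")
total = parse_totals_row("total 2206 43.73 78.00 512182 519636 523816 20000977")

print("all records processed:", total["rows"] >= EXPECTED_ROWS)
print("execute elapsed:", classify_execute_elapsed(execute["elapsed"]))
```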
23
Application Server Monitoring
• Server Memory
• Live Threads
• CPU Utilization
• Class Loaders
Application Server
24
Heap Dump Analysis
• Histograms
• Leak Suspects
• Top Consumers
• Component Report
• Dominator Tree
Heap Analyzer
25
Troubleshooting & Resolving Hub Issues
26
MDM Log Files
• Database debug log
• cmxserver.log (server logs)
• cmxserver.log (cleanse logs)
• Alert logs from Oracle
• Application server logs
27
MDM Components and Common Logs
[Architecture diagram from the MDM Components slide, annotated with the exception raised at each tier and the log to check]

Exception                         Where to look
SiperianClientException           Custom Application Logs
SiperianCommunicationException    Custom Application Logs
SiperianServerException           Hub Server Logs
SiperianServerException           Cleanse Server Logs
SiperianServerException           Database Logs
28
MDM Log Analysis – Scenario 1
• The customer reports IDD throwing an exception while performing a merge.
• Support is provided with MDM Server logs in debug mode.
• Initial analysis points to the database side of MDM.
• Support requests the database debug logs from that time frame.
• After review, the issue is found to be within the product and a fix is provided.
29
MDM Log Analysis
• 19-APR-2012 15:57:46.621[ERROR][sid:200][Preview_bvt:Generate_bvt............. 79 package body CMXBV.1279] Autonomous SQL Error (-955).SQLERRM is: ORA-00955: name is already used by an existing object
• 19-APR-2012 15:57:46.676[ERROR][sid:200][Preview_bvt:Generate_bvt............. 93 package body CMXBV.1293] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_B" has errors
• 19-APR-2012 15:57:46.781[ERROR][sid:200][Preview_bvt:Generate_bvt............. 06 package body CMXBV.1306] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
• 19-APR-2012 15:57:46.782[ERROR][sid:200][Preview_bvt:Generate_bvt............. 16 package body CMXBV.1316] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
• 19-APR-2012 15:57:46.797[DEBUG][sid:200][Preview_bvt:Generate_bvt............. 19 package body CMXBV.1319] CMX_ORS.T$1_BE0EA5683999CFB2E040_b dropped.
• 19-APR-2012 15:57:46.805[DEBUG][sid:200][Preview_bvt:Generate_bvt............. 20 package body CMXBV.1320] CMX_ORS.T$1_BE0EA5683999CFB2E040_g dropped.
• 19-APR-2012 15:57:46.806[DEBUG][sid:200][Preview_bvt:Application.............. 09 package body CMXUT.2709] Module Name: Generate_bvt*****Exception Name: ERROR_BUILD_TABLE1
• 19-APR-2012 15:57:46.859[DEBUG][sid:200][Preview_bvt:Generate_bvt............. 09 package body CMXUT.2709] Module Name: Preview_bvt*****Exception Name: NO_BVT_FOR_RECORD
• 19-APR-2012 15:57:46.869[DEBUG][sid:200][Preview_bvt:Preview_bvt.............. 86 package body CMXBV.2686] 0,SIP0: No BVT available for rowid_object 132057, SIP-28241: Error creating BV1 working table for C_PARTY: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
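When a database debug log is long, a quick first pass is to count the Oracle error codes that appear in ERROR entries. The sketch below is an assumption-laden illustration: the line pattern is inferred only from the excerpt above (a `[LEVEL][sid:N]` prefix, with each entry on a single line), and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the excerpt: "...[ERROR][sid:200][...] message".
LEVEL_RE = re.compile(r"\[(?P<level>[A-Z]+)\]\[sid:(?P<sid>\d+)\]")
ORA_RE = re.compile(r"ORA-\d{5}")


def summarize_ora_errors(lines) -> Counter:
    """Count ORA-xxxxx codes that appear in ERROR-level entries."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match and match.group("level") == "ERROR":
            counts.update(ORA_RE.findall(line))
    return counts


# Three entries copied from the excerpt above.
sample = [
    '19-APR-2012 15:57:46.621[ERROR][sid:200][Preview_bvt:Generate_bvt............. 79 package body CMXBV.1279] Autonomous SQL Error (-955).SQLERRM is: ORA-00955: name is already used by an existing object',
    '19-APR-2012 15:57:46.676[ERROR][sid:200][Preview_bvt:Generate_bvt............. 93 package body CMXBV.1293] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_B" has errors',
    '19-APR-2012 15:57:46.797[DEBUG][sid:200][Preview_bvt:Generate_bvt............. 19 package body CMXBV.1319] CMX_ORS.T$1_BE0EA5683999CFB2E040_b dropped.',
]
for code, count in summarize_ora_errors(sample).items():
    print(code, count)
```

In this excerpt the first error in time order (ORA-00955) precedes the ORA-04063 view errors, so sorting the result by first occurrence rather than by count may be the more useful view when looking for a root cause.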
30
MDM Log Analysis – Scenario 2
• The customer reports an IDD operation running for a long time and then throwing an IDD error message.
• Support requests the CMX Server logs.
• After reviewing the server logs and the server load, Support recommends increasing the thread pool queue.
31
MDM Log Analysis
• [2012-04-10 12:04:53,022] [http-0.0.0.0-8080-18] [SEVERE] javax.enterprise.resource.webcontainer.jsf.application: org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor@43ee60d2] did not accept task: com.siperian.dsapp.jsf.ui.dsbean.dataview.savehandle.CompositeAsyncSaveOperation@1d002224; nested exception is java.util.concurrent.RejectedExecutionException
javax.faces.el.EvaluationException: org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor@43ee60d2] did not accept task: com.siperian.dsapp.jsf.ui.dsbean.dataview.savehandle.CompositeAsyncSaveOperation@1d002224; nested exception is java.util.concurrent.RejectedExecutionException
    at javax.faces.component.MethodBindingMethodExpressionAdapter.invoke(MethodBindingMethodExpressionAdapter.java:102)
    at com.sun.faces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:102)
    at javax.faces.component.UICommand.broadcast(UICommand.java:387)
    at org.ajax4jsf.component.AjaxActionComponent.broadcast(AjaxActionComponent.java:55)
    at com.exadel.siperian.component.SipUIAjaxCommandButton.broadcast(SipUIAjaxCommandButton.java:25)
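A server log can be scanned for this symptom by looking for TaskRejectedException messages and counting which task types were rejected. The sketch below is illustrative only: the pattern is derived from the single message shown above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Derived from the message above:
# "TaskRejectedException: Executor [<executor>] did not accept task: <class>@<hash>"
REJECTED_RE = re.compile(
    r"TaskRejectedException:\s+Executor\s+\[(?P<executor>[^\]]+)\]\s+"
    r"did not accept task:\s+(?P<task>[\w.$]+)@"
)


def rejected_tasks(log_text: str) -> Counter:
    """Count rejected tasks per task class name."""
    return Counter(m.group("task") for m in REJECTED_RE.finditer(log_text))


# Message copied from the log excerpt above.
sample = (
    "[2012-04-10 12:04:53,022] [http-0.0.0.0-8080-18] [SEVERE] "
    "javax.enterprise.resource.webcontainer.jsf.application: "
    "org.springframework.core.task.TaskRejectedException: Executor "
    "[java.util.concurrent.ThreadPoolExecutor@43ee60d2] did not accept task: "
    "com.siperian.dsapp.jsf.ui.dsbean.dataview.savehandle."
    "CompositeAsyncSaveOperation@1d002224; nested exception is "
    "java.util.concurrent.RejectedExecutionException"
)
for task, count in rejected_tasks(sample).items():
    print(count, task)
```

A java.util.concurrent.ThreadPoolExecutor rejects a task when it is shut down or when it uses bounded threads and queue capacity and is saturated, which is consistent with the recommendation in Scenario 2 to increase the thread pool queue.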
32
IDD Best Practices
• SAM, HM Configuration
• IDD Design
• User Exits
• Sizing
33
Business Data Components
[Diagram: Business Data Director and Business Data Components]
Business Data Components Library and Business Data Components Platform deliver reliable and relevant data in real time through standard and composite applications.
• Business Data Director Application
• Business Data Components Library: Hierarchy, Potential Matches, Merge, Point-in-Time History, Cross References, Match Comparison, Build Component
• Users and Applications: Salesforce, Siebel, SAP, SharePoint, Portals, Others
• Embedding technologies: VisualForce, NetWeaver, VBC (HTTP), Web Part, Portlet, iFrames
• Sources: Applications, Third Party Data, CIF, Legacy Systems
• Integration: ESB, EAI, MOM, ETL, SQL, JCA, JNI
• Informatica MDM (Multidomain MDM): Match and Merge, Relationships, History / Lineage; Create, Consume, Manage, Monitor
34
IDD Best Practices
Configuration
• Security Access Manager
• Hierarchy Manager
• Sizing
Design
• IDD Configuration
• ORS Design
User Exits
• Correct Entry Points
• Optimization of Custom Code
• SIF API
35
IDD Best Practices
Security Access Manager
• MDM access is broken into 6 categories: Read, Write, Update, Merge, Delete, Execute
• Each resource related directly or indirectly to an IDD operation must be given the required access
• The IDD implementation guide has good references for SAM configuration
36
IDD Best Practices
Hierarchy Manager
• HM Configuration in the Hub
• HM Profile Validation
• IDD Configuration for Single and Multiple Hops
37