
Upload: hoangnhi

Post on 05-May-2018


Table of Contents

Introduction
About the tool
    Scope
    Components
    Report Structure
    Limitations
How to Use
    Installation
    Configuration and Execution
        Oracle
        SQL Server
Important Points to Remember
Appendix
    A. Step by step process flow for installation
        Oracle
        SQL Server
    B. Naming Convention
    C. Procedures and their function
    D. Sample Review Reports

Informatica Code Review Tool

Introduction

Manual review of Informatica PowerCenter code takes a considerable amount of time and

effort. Even then, it leaves room for human error and has an impact on project deadlines.

The purpose of this tool is to reduce the time taken for review and to increase the code

review quality and effectiveness while automating much of the process.

About the tool

The Informatica Code Review Tool reviews Informatica code against a set of widely accepted guidelines. The tool is database dependent and is therefore specific to the repository database. The current version can review Informatica PowerCenter repositories hosted on Oracle or SQL Server databases. The tool consists of two sets of code, one for each type of database. Each set consists of a PowerCenter workflow, a template entity list (to be used as the source file for the workflow), one installation batch file and a set of database procedures.

The PowerCenter mapping takes in a list of entities to be reviewed and calls the wrapper review procedure for each entity. These procedures generate the review reports in a path defined by the user. The entities in the entity list can be folders, workflows or mappings. The tool reviews only the latest checked-in version of the code. The procedures employed to review the code are based on Informatica's Metadata Exchange (MX) views and hence are not PowerCenter version specific.

Scope

The tool reviews the code against the following standards:

- All sub-entities within a PowerCenter entity should have a description. This includes transformations, mappings, tasks and workflows.

- Two similar transformations shouldn't be present one after the other in a mapping. This applies to Expression, Transaction Control, Update Strategy, Aggregator, Filter, Router and Sorter transformations.

- The ports within an expression and an aggregator should be in the order: Input-Input/Output-Variable-Output.

- All the filters present in the mapping should have a valid filter condition and not the default value (TRUE).

- There shouldn't be a joiner after two homogeneous source qualifiers on the same database. Nor should there be a filter right after a source qualifier that reads from a database.

- The Override Tracing of a session should be 'Normal' or 'Terse'.

- The data type, the precision and the scale of the ports should be kept homogeneous across the mapping.

- Aggregators and joiners should have sorted input. This helps improve the performance of the workflow.

- Stop on Errors for a session should be set to 1, so that no errors are skipped.

- The transformations, mappings, command tasks, sessions and workflows should be named as per the accepted standards. For example, the name of a mapping should start with 'm_'. The complete list of naming standards checked by the tool is given in Appendix B.

- All the output and input-output ports in an expression and an aggregator should be linked to at least one port downstream.

- All the input and variable ports in an expression and in an aggregator should be used in the expression of at least one port.

- All the unconnected lookups should be called at least once in the mapping.

- All the transformations in a mapping should have a tracing level of 'Normal' or 'Terse' to avoid creating large session logs in the production environment.

- The session and workflow log directory should be parameterized.

- Additional concurrent pipelines for lookup cache creation should not be disabled.

- The tool also lists all the source qualifiers that have an SQL override.

- The tool checks that every link between tasks in a workflow contains a condition.

In addition, a summary report is generated. The details contained in the summary report are described in the SUMMARY_REPORT procedure description. A sample summary report is attached in Appendix D.

Components

The deployment package contains two sets of code: one set to review Informatica PowerCenter repositories on Oracle, and another set for repositories that use SQL Server as the database. The two sets are almost identical but have a few minor differences.

Each set consists of the following entities:

- One Informatica PowerCenter workflow XML, which should be imported using the Repository Manager. It contains a mapping that processes the entity list and a workflow associated with the mapping.

- A batch file which creates the review procedures in the database.

- SQL procedures that review the entities.

- An entity list template. This template is specific to the database being used for the PowerCenter repository.

The batch file is used to create the procedures in the respective databases. The installation

process is explained in detail in a later section.

Report Structure

The tool generates three types of reports: a summary report for folders, and one report for each of the workflows and mappings within the folder. All reports are plain text files and follow the naming convention:

SummaryReport_<Folder name>.txt - Summary report for folders

<Folder name>_<Workflow name>.txt - For workflows

<Folder name>_<Mapping name>.txt - For mappings

Folder name is the name of the folder in which the workflow/mapping is present.
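As an illustration of the naming convention above, the following minimal Python sketch builds the report file names that one would expect for a given review. It is not part of the tool; the function name `report_file_names` and its parameters are hypothetical and exist only for this example.

```python
def report_file_names(folder, workflows=(), mappings=(), include_summary=False):
    """Return the report file names expected under the documented convention.

    folder          -- name of the folder that holds the workflows/mappings
    workflows       -- workflow names reviewed in that folder
    mappings        -- mapping names reviewed in that folder
    include_summary -- True for a folder-level review (summary report exists)
    """
    names = []
    if include_summary:
        names.append(f"SummaryReport_{folder}.txt")
    names.extend(f"{folder}_{workflow}.txt" for workflow in workflows)
    names.extend(f"{folder}_{mapping}.txt" for mapping in mappings)
    return names


if __name__ == "__main__":
    # Hypothetical folder, workflow and mapping names.
    for name in report_file_names("Folder3", ["wf_load"], ["m_load"], include_summary=True):
        print(name)
```

A helper like this can be handy for checking, before a run, that none of the expected report files already exist in the output directory.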

If a folder is reviewed, the tool creates one report for every workflow and one report for

every mapping within those workflows. The mappings that are not called in any of the

workflows are not reviewed.

The review comments in the report are listed under appropriate headings. Sample review reports are attached in Appendix D.

Limitations

1. The tool can handle Informatica PowerCenter installations on Oracle and SQL Server databases only. Installations on any other database can't be reviewed.

2. The mappings and the sessions that are not used in any of the workflows are not reviewed in a folder-level review. The mappings, however, can be reviewed independently.

3. A session can't be reviewed independently.

4. The tool reviews only the latest checked-in version. If an entity is checked out, then the last checked-in version is reviewed.

5. The code appends its output to the review report. So, the user should make sure that there are no existing reports in the path where the reports are to be generated.

How to Use

Installation

The installation package consists of a zipped file. Steps to install the code packages for both

DB types are described in detail in Appendix A.

Configuration and Execution

Once the package has been installed, the user has to edit some of the session properties to

suit their environment.

The source file and the target files used in the mapping are flat files, so the user needs to either change them to suit the environment or make sure that the respective default paths are available.

A database user whose default schema is the one that stores the MX views of the Informatica PowerCenter repository needs to be created (or an existing one used). The connection for this user should be used for the Stored Procedure in the mapping.

Configuration of the Entity File

The entity file lists present in each of the folders act as the source for the Powercenter

workflows. The entity file lists are different for the two different databases. The template

for each type is present in the respective folders. They are described in detail below:

Oracle

The entity file is a comma separated value file and consists of three columns :

Folder_Name, Entity_Name, Entity_Type.

A sample entity file would look like :

Folder_Name,Entity_Name,Entity_Type

Folder1,Mapping1,mapping

Folder2,workflow1,workflow

Folder3,Folder3,folder

The first line should always contain the name of the columns.

This input will review 1 workflow, 1 mapping and 1 folder. The entries in the file are not

case sensitive.
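The Oracle entity file described above can also be produced programmatically. The following is a minimal sketch, assuming the three-column layout shown in the sample; the helper name `build_oracle_entity_file` is hypothetical and not part of the tool.

```python
import csv
import io

# Column order taken from the sample Oracle entity file.
ORACLE_HEADER = ["Folder_Name", "Entity_Name", "Entity_Type"]
# The three entity types the tool can review.
VALID_TYPES = {"folder", "workflow", "mapping"}


def build_oracle_entity_file(rows):
    """Return CSV text for an Oracle entity file.

    rows is an iterable of (folder_name, entity_name, entity_type) tuples.
    Entity types are compared case-insensitively, since the entries in
    the file are not case sensitive.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(ORACLE_HEADER)  # the first line must hold the column names
    for folder, entity, entity_type in rows:
        if entity_type.strip().lower() not in VALID_TYPES:
            raise ValueError(f"unsupported entity type: {entity_type!r}")
        writer.writerow([folder.strip(), entity.strip(), entity_type.strip()])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_oracle_entity_file([
        ("Folder1", "Mapping1", "mapping"),
        ("Folder2", "workflow1", "workflow"),
        ("Folder3", "Folder3", "folder"),
    ]), end="")
```

Rejecting unknown entity types up front avoids discovering the problem only after the workflow has run.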

SQL Server

The entity file is a comma separated value file and has four columns:

Entity_Name,Entity_Type,Folder_Name,Path

A sample entity file would look like:

Entity_Name,Entity_Type,Folder_Name,Path

Mapping1,mapping,Folder1,C:\Review

workflow1,workflow,Folder2,C:\Review

Folder3,folder,Folder3,C:\Review

The path in the source file is the directory where the review reports are generated and

hence should be accessible to Powercenter as well as SQL server.

The first line should always contain the column names. The entries in the file are not case

sensitive.
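Because a single missing comma in the SQL Server entity file shifts every later column, it can help to validate the file before running the workflow. The sketch below is one possible check, assuming the four-column layout described above; the function name `check_sqlserver_entity_file` is hypothetical and not part of the tool.

```python
import csv
import io

EXPECTED_COLUMNS = 4  # Entity_Name, Entity_Type, Folder_Name, Path
VALID_TYPES = {"folder", "workflow", "mapping"}


def check_sqlserver_entity_file(text):
    """Return a list of (line_number, problem) tuples for an entity file.

    The first line is treated as the header and is not validated.
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header line
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        fields = [field.strip() for field in row]
        if len(fields) != EXPECTED_COLUMNS:
            problems.append(
                (line_number, f"expected {EXPECTED_COLUMNS} columns, found {len(fields)}")
            )
            continue
        if fields[1].lower() not in VALID_TYPES:
            problems.append((line_number, f"unknown entity type {fields[1]!r}"))
    return problems


if __name__ == "__main__":
    sample = (
        "Entity_Name,Entity_Type,Folder_Name,Path\n"
        "Mapping1,mapping,Folder1,C:\\Review\n"
        "workflow1,workflow Folder2,C:\\Review\n"  # comma missing on purpose
    )
    for line_number, problem in check_sqlserver_entity_file(sample):
        print(f"line {line_number}: {problem}")
```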

Important Points to Remember

- The source file (entity file list) can have three types of entities: folder, workflow and mapping. If the entity type is folder, the tool reviews all the workflows within the folder and all the mappings associated with the workflows. If the entity type is workflow, the tool reviews the workflow as well as all the sessions and mappings associated with the workflow. If the entity type is mapping, it reviews only the listed mapping.

- The review reports for an Oracle installation are generated at the path defined by the user while executing the Install_Oracle.bat batch file. For SQL Server installations, the review reports are generated for each entity at the path specified in the source file.

- The procedures are based upon Informatica MX views, which contain information about only the checked-in entities. So, the user should make sure that the entities that are to be reviewed are checked in before the Informatica job is kicked off.

- For Oracle code, the review directory should be accessible to the Oracle server. The review directory should be present beforehand for both the code sets.

- The entity list used as source in the Informatica job can only handle three entity types: Workflow, Mapping and Folder.

- Informatica Repository Manager should be used to import the workflow XML from the installer package.

- The user for the stored procedure should have "select" access on the MX views.

- The source file directory and target file directory should be present before the execution of the workflow.

- The entries in the entity file are not case-sensitive and hence can be entered in any case.

- The output file can have two messages:

  a. "<Entity type> is incorrect", which means that either the entity type or the entity name is incorrect.

  b. "Processed", which means successful execution of the Wrapper_Review procedure.

- When a folder is reviewed, only those mappings are reviewed which are called in one of the checked-in workflows.

- For Oracle code, the Oracle username used during installation should have the "create directory" privilege.

- The user provided while installing the SQL Server version should have select access on the Informatica MX views.

Appendix

A. Step by step process flow for installation

As described earlier, the code contains two sets of procedures. The user needs to install only the set that is applicable to the installation. For example, if the Informatica PowerCenter installation is done on an Oracle database, then the user needs to install the Oracle set only. The installation method for both sets is described below.

Oracle

Follow these steps to install the code review tool for Oracle:

1. Extract the zip package. The package extracts into a folder called Code_Review, which contains two folders: Oracle and SQLServer.

2. The 'Oracle' folder contains a batch file (Install_Oracle.bat), a SQL file (Install.sql) and an Informatica PowerCenter workflow (wf_s_m_QualityReview.XML).

3. For installation, open the command prompt and change directory to the folder where Install_Oracle.bat is present.

4. Execute Install_Oracle.bat from the command prompt. For Oracle installation, the batch file expects 4 parameters, in sequence: DB username, DB password, DB connect string/TNS name, and the server path where the review reports are to be generated. The path should be accessible to Oracle as well. For example, if the username that has select access on the MX views is Usr, the password is Usr_pwd, the TNS name is Ora11 and the server path where the reports are to be generated is C:\ReviewReport, then the command would look like:

Install_Oracle.bat Usr Usr_pwd Ora11 C:\ReviewReport

5. The batch file should execute two SQL files: DIR_REPORT.sql (generated by Install_Oracle.bat) and Install.sql. Both files should execute successfully.

6. Import the workflow XML wf_s_m_QualityReview.XML using Informatica Repository Manager.

7. Change the source path, target path and the connection string for the stored procedure in the workflow as per the requirements.

8. The installation is done and the tool is ready for use.

SQL Server

Follow these steps to install the code review tool for SQL Server:

1. Extract the zip package. The package extracts into a folder called Code_Review, which contains two folders: Oracle and SQLServer.

2. The 'SQLServer' folder contains a batch file (Install_SQL.bat) and an Informatica PowerCenter workflow (wf_s_m_QualityReview.XML).

3. For installation, open the command prompt and change directory to the folder where Install_SQL.bat is present.

4. Execute Install_SQL.bat from the command prompt. For SQL Server installation, the batch file expects 4 parameters, in sequence: DB schema, connect string, login ID and password. The DB schema provided here should have select access on the MX views. For example, if the DB schema is InfaDev, the connect string is sqlserver.test.com\SQLSRVR, the login ID is InfaUsr and the password is InfaUsrPwd, then the command would look like:

Install_SQL.bat InfaDev sqlserver.test.com\SQLSRVR InfaUsr InfaUsrPwd

5. The batch file should execute all the stored procedure SQL files. All these files should execute successfully.

6. Import the workflow XML wf_s_m_QualityReview.XML using Informatica Repository Manager.

7. Change the source path, target path and the connection string for the stored procedure in the workflow as per the requirements.

8. The installation is done and the tool is ready for use.

B. Naming Convention

The Informatica code is evaluated against the following naming convention:

Entity Type            Name should start with
Workflow               wf_
Session                s_
Command Task           cmd_
Mapping                m_
Source Qualifier       sq_
Transaction Control    tct_
Stored Procedure       sp_
Update Strategy        upd_
Expression             exp_
Joiner                 jnr_
Aggregator             agg_
Lookup                 lkp_
Filter                 fil_
Router                 rtr_
Mapplet                mplt_
Sequence Generator     seq_
Sorter                 srt_
Rank                   rnk_
SQL                    sql_
Union                  un_
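The naming convention above amounts to a simple prefix lookup. The following minimal Python sketch shows one way such a check could be expressed; it is an illustration only, not the tool's SQL implementation. The function name `follows_naming_convention` is hypothetical, and the case-sensitive comparison is an assumption, since the document does not say how the tool treats case in names.

```python
# Prefixes copied from the naming convention table.
NAME_PREFIXES = {
    "Workflow": "wf_",
    "Session": "s_",
    "Command Task": "cmd_",
    "Mapping": "m_",
    "Source Qualifier": "sq_",
    "Transaction Control": "tct_",
    "Stored Procedure": "sp_",
    "Update Strategy": "upd_",
    "Expression": "exp_",
    "Joiner": "jnr_",
    "Aggregator": "agg_",
    "Lookup": "lkp_",
    "Filter": "fil_",
    "Router": "rtr_",
    "Mapplet": "mplt_",
    "Sequence Generator": "seq_",
    "Sorter": "srt_",
    "Rank": "rnk_",
    "SQL": "sql_",
    "Union": "un_",
}


def follows_naming_convention(entity_type, name):
    """Return True if name starts with the prefix required for entity_type.

    Assumption: the comparison is case sensitive. Raises KeyError for an
    entity type that is not in the table.
    """
    return name.startswith(NAME_PREFIXES[entity_type])


if __name__ == "__main__":
    # Hypothetical entity names.
    print(follows_naming_convention("Mapping", "m_load_customers"))
    print(follows_naming_convention("Session", "load_customers"))
```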

C. Procedures and their function

There are two sets of procedures for the purpose of code review: one set for Oracle installations of Informatica and the other for Informatica installations that use SQL Server as the database. The procedures have the same names and functionality in both sets. All the procedures are listed below with a short description of each:

PRC_WRAPPER_REVIEW: The procedure accepts the Entity_Type, Entity_Name and

Folder_name as input and calls one of the three procedures (PRC_FOLDER_REVIEW,

PRC_WORKFLOW_REVIEW or PRC_MAPPING_REVIEW) based upon the entity type. As the

name suggests, if the entity type is a folder, this calls PRC_FOLDER_REVIEW procedure and

so on.
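The dispatch performed by PRC_WRAPPER_REVIEW can be pictured with the short Python sketch below. It only mirrors the routing described above and the two status messages the output file can contain; the real procedure is a database stored procedure, and the function name `wrapper_review` and its return shape are hypothetical.

```python
# Review procedure chosen for each entity type, as described above.
REVIEW_PROCEDURES = {
    "folder": "PRC_FOLDER_REVIEW",
    "workflow": "PRC_WORKFLOW_REVIEW",
    "mapping": "PRC_MAPPING_REVIEW",
}


def wrapper_review(entity_type, entity_name, folder_name):
    """Return (call, message) for one row of the entity list.

    call is a (procedure, folder_name, entity_name) tuple naming the
    procedure that would be invoked, or None when the entity type is not
    recognised. Entity types are matched case-insensitively.
    """
    procedure = REVIEW_PROCEDURES.get(entity_type.strip().lower())
    if procedure is None:
        return None, f"{entity_type} is incorrect"
    return (procedure, folder_name, entity_name), "Processed"


if __name__ == "__main__":
    print(wrapper_review("Workflow", "workflow1", "Folder2"))
    print(wrapper_review("session", "s_load", "Folder2"))
```

Note that this sketch only validates the entity type; according to the guide, the "is incorrect" message is also written when the entity name cannot be found.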

PRC_FOLDER_REVIEW: This procedure is called by the PRC_WRAPPER_REVIEW procedure when the entity type is 'Folder'. This procedure lists all the checked-in workflows in the folder and then calls the PRC_WORKFLOW_REVIEW procedure for each of these workflows.

SUMMARY_REPORT: This procedure is used to generate a summary report for the folder.

This summary report is only generated for folder reviews. This procedure calls the following

procedures: PRC_SUMREP_AGG_LKP_DEF_MEM, PRC_SUMREP_COMPLEX_MAPS,

PRC_SUMREP_MAP_TASK_DESC, PRC_SUMREP_PORT_TYPE_PREC_MIS,

PRC_SUMREP_PORTS_DEF_VALS, PRC_SUMREP_SESS_CONCUR_LKP,

PRC_SUMREP_SESS_LOG_DIR, PRC_SUMREP_SESS_OVERRIDE,

PRC_SUMREP_SQ_OVERRIDE, PRC_SUMREP_TRANS_DESC, PRC_SUMREP_WFLOW_DESC,

PRC_SUMREP_WFLOW_LOG.

The following checks are reported in the summary report:

a. Aggregators and lookups whose memory values are left at the default, Auto

b. The number of complex mappings in a folder

c. The mappings, sessions and workflows having no description

d. The ports with mismatching data type and precision from one transformation to

another

e. The ports with default or no default values

f. Sessions with no additional concurrent lookup threads

g. Session and workflows with hardcoded log directories

h. Source qualifiers with source qualifier overrides

i. Sessions with override tracing other than terse or normal

j. Transformations with no description

k. Workflows and sessions with no description

The summary report details the results in terms of percentages in most of the scenarios and

rates the various aspects of the code in the folder. This procedure is called from the

PRC_WRAPPER_REVIEW when the entity type to be reviewed is a folder.

PRC_WORKFLOW_REVIEW: This procedure is called by the PRC_FOLDER_REVIEW procedure, as indicated in the previous section, or by PRC_WRAPPER_REVIEW when the entity type is a workflow. It calls the individual procedures that check the various quality standards pertaining to the sessions within the workflow, and it also performs the quality checks at the workflow level. This procedure also lists all the mappings called by the sessions contained within the workflow and calls the PRC_MAPPING_REVIEW procedure for quality review of each of these mappings.

PRC_MAPPING_REVIEW: This procedure is called by the PRC_WORKFLOW_REVIEW procedure for each of the mappings associated with the workflow, or by PRC_WRAPPER_REVIEW when the entity type is a mapping. This procedure performs the mapping level checks and calls procedures that review the mapping with respect to individual quality standards.

PRC_WF_COMMENTS: This is a workflow level procedure which checks whether all the tasks contained within a workflow have a description. This procedure is called by the procedure PRC_WORKFLOW_REVIEW.

PRC_STOP_ON_ERRS: This is a workflow level procedure which checks the 'Stop on Errors' property of all the sessions within the workflow against the standard described in the Scope section (a value of 1). This procedure is called by the procedure PRC_WORKFLOW_REVIEW.

PRC_OVERRIDE_TRACING: This is a workflow level procedure which checks whether the Override Tracing for all the sessions within the workflow is set to Normal or Terse. This procedure is called by the procedure PRC_WORKFLOW_REVIEW.

PRC_WF_LINK_COND: This is a workflow level procedure which checks whether every link present in the workflow has a condition attached to it. This procedure is called by the procedure PRC_WORKFLOW_REVIEW.

PRC_SESSION_LOG_DIRECTORY: This is a workflow level procedure. It checks for

hardcoded session and workflow log directories. This procedure is called by the procedure

PRC_WORKFLOW_REVIEW.

PRC_SESS_CONCUR_LKP: This is a workflow level procedure. It checks for concurrent

lookup builds for all the sessions in the workflow. This procedure is called by the procedure

PRC_WORKFLOW_REVIEW.

PRC_TRANS_COMMENTS: This is a mapping level procedure which checks for description

for all the transformations in the mapping. This is called by the procedure

PRC_MAPPING_REVIEW.

PRC_SORTED_INP_AGGJNR: This is a mapping level procedure which checks if all the

aggregators and joiners in the mapping have sorted input property checked. This is called

by the procedure PRC_MAPPING_REVIEW.

PRC_FILTER_COND: This is a mapping level procedure which checks whether all the filters within the mapping have valid filter conditions and aren't left at the default value (TRUE), in which case they would have no real functionality and would just act as a pass-through. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_JNR_FIL_POST_SOURCE: This is a mapping level procedure which raises a warning if a filter or a joiner is used right after a source qualifier that reads from a database. In such cases the function of the filter or the joiner could have been pushed into the source qualifier itself, without bringing any excess data into the mapping. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_UNLINKED_OUTPUT_PORTS: This is a mapping level procedure which checks if all

the output and input-output ports in the expression and aggregator are linked to one or

more ports downstream. This helps reduce the redundant code within the mapping. This is

called by the procedure PRC_MAPPING_REVIEW.

PRC_CONSECUTIVE_TRANS: This is a mapping level procedure which checks for presence

of two consecutive transformations of same type, the functionality of which could have been

handled by just one. The transformations which the procedure would report are Expression,

Transaction Control, Update Strategy, Aggregator, Filter, Router and Sorter. This is called

by the procedure PRC_MAPPING_REVIEW.
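The idea behind the consecutive-transformation check can be sketched in a few lines of Python. This is an illustration of the rule only, not the tool's SQL; the input shape (a name-to-type dictionary plus a list of links) and the function name `consecutive_same_type` are assumptions made for the example.

```python
# Transformation types the procedure reports on, per the description above.
CHECKED_TYPES = {
    "Expression", "Transaction Control", "Update Strategy",
    "Aggregator", "Filter", "Router", "Sorter",
}


def consecutive_same_type(transformations, links):
    """Find directly linked transformations that share a checked type.

    transformations -- dict mapping transformation name to its type
    links           -- iterable of (from_name, to_name) pairs; several
                       port-level links between the same pair are allowed
    Returns a list of (from_name, to_name, type) tuples without duplicates.
    """
    findings = []
    seen = set()
    for source, target in links:
        kind = transformations[source]
        if kind in CHECKED_TYPES and transformations[target] == kind:
            if (source, target) not in seen:
                seen.add((source, target))
                findings.append((source, target, kind))
    return findings


if __name__ == "__main__":
    # Hypothetical mapping: sq_orders -> exp_clean -> exp_derive -> fil_valid
    types = {
        "sq_orders": "Source Qualifier",
        "exp_clean": "Expression",
        "exp_derive": "Expression",
        "fil_valid": "Filter",
    }
    links = [
        ("sq_orders", "exp_clean"),
        ("exp_clean", "exp_derive"),
        ("exp_clean", "exp_derive"),  # second port on the same link
        ("exp_derive", "fil_valid"),
    ]
    print(consecutive_same_type(types, links))
```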

PRC_UNUSED_UNCONN_LKP: This is a mapping level procedure which checks whether all the unconnected lookups within the mapping are called in at least one of the expressions in the mapping. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_UNUSED_INP_VAR_PORTS: This is a mapping level procedure which checks for unused input and variable ports within the transformations of the mapping. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_TRANSFORMATION_NAMING_CONV: This is a mapping level procedure which

checks for the conformity of transformation names with the naming standards as detailed in

Appendix B. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_EXP_PORT_ORDER: This is a mapping level procedure which checks for the order of

the port within a transformation. The ports should follow the order Input-Input/Output-

Variable-Output. This is called by the procedure PRC_MAPPING_REVIEW.
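The port-order rule can be illustrated with a small Python sketch: each port type gets a rank, and the ranks must never decrease as one walks down the port list. This shows the rule only, not the tool's SQL; the function name `ports_in_order` and the list-of-strings input are assumptions made for the example.

```python
# Required order of port types within a transformation.
PORT_ORDER = ["Input", "Input/Output", "Variable", "Output"]
PORT_RANK = {port_type: rank for rank, port_type in enumerate(PORT_ORDER)}


def ports_in_order(port_types):
    """Return True if the port types appear in the required order.

    port_types is the list of port types in the order the ports are
    defined in the transformation. An empty list counts as ordered.
    """
    ranks = [PORT_RANK[port_type] for port_type in port_types]
    return all(earlier <= later for earlier, later in zip(ranks, ranks[1:]))


if __name__ == "__main__":
    print(ports_in_order(["Input", "Input", "Input/Output", "Variable", "Output"]))
    print(ports_in_order(["Input", "Output", "Variable"]))
```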

PRC_PORT_PROP: This is a mapping level procedure which checks if a port's data type, precision or scale changes from one transformation to another and reports if there is a change. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_TRANS_TRACING_LEVEL: This is a mapping level procedure which checks the Tracing Level for all the transformations within a mapping. The procedure lists the transformations within the mapping that have a tracing level greater than 'Normal'. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_LKP_AGG_DEF_MEM: This is a mapping level procedure which checks the memory value settings for aggregators and lookups within a mapping. This procedure lists all the aggregators and lookups which have the default 'Auto' value. This is called by the procedure PRC_MAPPING_REVIEW.

PRC_SQ_OVERRIDE: This is a mapping level procedure which checks for source qualifier overrides and lists all the source qualifiers with overrides in the generated report. This procedure is called by PRC_MAPPING_REVIEW.

D. Sample Review Reports

Sample summary report for folder:

Sample workflow review report:

Sample mapping review report: