48810459-informatica-senarios

8/7/2019 48810459-informatica-senarios


How to use the "SUBSTR" function in a mapping.

Explanation : Returns a portion of a string. SUBSTR counts all characters, including blanks, starting at the beginning of the string.

Syntax

SUBSTR( string , start [, length ] )

Example

SUBSTR( IN_PHONE, 1, 3 )

This returns the first three characters of IN_PHONE.
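To make the 1-based counting concrete, here is a minimal Python sketch of the behaviour described above. The helper name `substr` and the sample phone number are illustrative assumptions, and the sketch only covers a positive start position; it is not Informatica's implementation.

```python
def substr(string, start, length=None):
    """Rough Python analogue of SUBSTR(string, start [, length]).

    Assumption: only positive, 1-based start positions are handled here.
    """
    if string is None:
        return None
    if start < 1:
        raise ValueError("this sketch only supports start >= 1")
    begin = start - 1  # convert the 1-based position to a 0-based index
    if length is None:
        return string[begin:]  # no length: return the rest of the string
    if length <= 0:
        return ""
    return string[begin:begin + length]


area_code = substr("809-555-0269", 1, 3)
```

Every character counts toward the position, including blanks and punctuation, so a start of 5 in the sample value lands on the first digit after the first hyphen.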

Design a mapping that generates a sequence of numbers using the setvariable function in an Expression transformation (without using a Sequence Generator).

Mapping : generate a sequence of numbers without using a Sequence Generator.

Solution :
Source : Flatfile
Target : Relational
Database : Oracle
Note : uses the setmaxvariable() function and mapping variables.

Download : XML FILE

m_sequence_variablefunction
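The idea behind this mapping can be sketched in Python: a counter is incremented for every row, and the highest value reached is kept so that the next run continues from it. The function name `generate_sequence` and the `persisted_max` parameter are hypothetical stand-ins for the mapping variable; this is an illustration of the technique, not the contents of the XML file.

```python
def generate_sequence(rows, persisted_max=0):
    """Number each row, starting after the value saved by the previous run.

    persisted_max plays the role of the mapping variable (an assumption
    made for this sketch).
    """
    current = persisted_max
    numbered = []
    for row in rows:
        current += 1  # increment once per row, like a variable port
        numbered.append((current, row))
    # Returning the highest value mimics saving the maximum for the next run.
    return numbered, current


first_run, saved = generate_sequence(["a", "b", "c"])
second_run, saved_again = generate_sequence(["d", "e"], persisted_max=saved)
```

Passing the saved value into the second call is what keeps the numbers unique across runs.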

DWH

Design a mapping to move the first half of the data to one target and the second half to another target. For example, if you have 20 records in the source, the first 10 go to one target and the other 10 to the second target. If the source has an odd number of records, the first n/2 + 1 go to one target and the rest to the second target.

Mapping : first half to one target and second half to other target.

Solution :
Source : Flatfile
Target : Relational
Database : Oracle
Tip : use a stored procedure to count the records.

Download : XML FILE

m_firsthalf_secondhalf
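The split rule can be checked with a short Python sketch. It assumes the total record count is known before routing (the role the stored procedure plays in the tip above) and that n/2 means integer division; the function name `split_halves` is hypothetical.

```python
def split_halves(records):
    """Return (first_half, second_half).

    For an odd count the first half receives the extra record,
    i.e. n // 2 + 1 rows.
    """
    total = len(records)  # stands in for the record count lookup
    first_count = (total + 1) // 2  # n/2 for even n, n//2 + 1 for odd n
    return records[:first_count], records[first_count:]


even_first, even_second = split_halves(list(range(1, 21)))
odd_first, odd_second = split_halves(list(range(1, 22)))
```

With 20 records each target receives 10 rows; with 21 records the first target receives 11 and the second receives 10.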



REPOSITORY ADMIN CONSOLE

Actions

* Create Local or Global Repository
* Start Repositories
* Back up Repository
* Move the copy of the Repository to a different Server
* Disable the Repository
* Export connection information
* Notify Users : a notification message can be sent to all the users connected to the Repository
* Propagate
* Register Repositories
* Restore Repository
* Upgrade Repository


Actions

* Create reusable tasks, worklets, and workflows.
* Schedule workflows.
* Configure tasks.

Workflow

A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data.

Worklets

A worklet is an object that represents a set of tasks.



When to create Worklets?

Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the Worklet Designer to create and edit worklets.

Where to use Worklets?

You can run worklets inside a workflow. The workflow that contains the worklet is called the parent workflow. You can also nest a worklet in another worklet.

WORKFLOW MONITOR

You can monitor workflows and tasks in the Workflow Monitor. View details about a workflow or task in Gantt Chart view or Task view.

Actions

You can run, stop, abort, and resume workflows from the Workflow Monitor.

You can view the log file and performance data.

Slowly Changing Dimension

* It is a dimension that changes slowly over time.

Slowly Changing Dimension Mapping | Type | Description
SCD Type 1 | Slowly Changing Dimension | Inserts new dimensions. Overwrites existing dimensions with changed dimensions. (Shows current data.)
SCD Type 2 / Version Data | Slowly Changing Dimension | Inserts new and changed dimensions. Creates a version number and increments the primary key to track changes.
SCD Type 2 / Flag Current | Slowly Changing Dimension | Inserts new and changed dimensions. Flags the current version and increments the primary key to track changes.
SCD Type 2 / Date Range | Slowly Changing Dimension | Inserts new and changed dimensions. Creates an effective date range to track changes.
SCD Type 3 | Slowly Changing Dimension | Inserts new dimensions. Updates changed values in existing dimensions. Optionally uses the load date to track changes.
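The difference between overwriting (Type 1) and versioning (Type 2 / Version Data) can be sketched in Python. The function names, the row layout with `pk`, `nk`, `attrs` and `version` keys, and the choice to start versions at 0 are all assumptions made for this illustration, not the logic generated by any Informatica wizard.

```python
def scd_type1(dimension, incoming):
    """Type 1: overwrite changed dimensions, so only current data is kept.

    dimension and incoming map a natural key to a dict of attributes.
    """
    result = dict(dimension)
    for natural_key, attributes in incoming.items():
        result[natural_key] = attributes  # insert new or overwrite existing
    return result


def scd_type2_version(rows, incoming):
    """Type 2 / Version Data: add a new row for each new or changed dimension."""
    result = list(rows)
    next_pk = max((row["pk"] for row in result), default=0) + 1
    for natural_key, attributes in incoming.items():
        history = [row for row in result if row["nk"] == natural_key]
        latest = max(history, key=lambda row: row["version"], default=None)
        if latest is not None and latest["attrs"] == attributes:
            continue  # unchanged dimension: nothing to insert
        version = 0 if latest is None else latest["version"] + 1
        result.append({"pk": next_pk, "nk": natural_key,
                       "attrs": attributes, "version": version})
        next_pk += 1  # a new primary key for every inserted row
    return result


current_only = scd_type1({"C1": {"city": "Pune"}}, {"C1": {"city": "Delhi"}})
history = scd_type2_version([], {"C1": {"city": "Pune"}})
history = scd_type2_version(history, {"C1": {"city": "Delhi"}})
```

After the two loads, the Type 1 result holds only the latest city, while the Type 2 result holds both rows with increasing version numbers and primary keys.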



OLTP | OLAP
On Line Transaction Processing | On Line Analytical Processing
Continuously updates data | Read-only data
Tables are in normalized form | Partially normalized / denormalized tables
Single record access | Multiple records for analysis purposes
Holds current data | Holds current and historical data
Records are maintained using a primary key field | Records are based on a surrogate key field
Tables or records can be deleted | Records cannot be deleted
Complex data model | Simplified data model

DATAMART | DATA WAREHOUSE
A scaled-down version of the Data Warehouse that addresses only one subject, like Sales Department, HR Department, etc. | It is a database management system that facilitates on-line analytical processing by allowing the data to be viewed in different dimensions or perspectives to provide business intelligence.
One fact table with multiple dimension tables. | More than one fact table and multiple dimension tables.
[Sales Department] [HR Department] [Manufacturing Department] | [Sales Department, HR Department, Manufacturing Department]
Small organizations prefer DATAMART | Bigger organizations prefer DATA WAREHOUSE

DIMENSION TABLE | FACT TABLE
It provides the context / descriptive information for the fact table measurements. | It provides the measurements of an enterprise.
Structure of Dimension - surrogate key, one or more other fields that compose the natural key (nk), and a set of attributes. | A measurement is an amount determined by observation.
The size of a Dimension Table is smaller than that of a Fact Table. | Structure of Fact Table - foreign key (fk), Degenerated Dimension and Measurements.
In a schema, there are more Dimension Tables than Fact Tables. | The size of a Fact Table is larger than that of a Dimension Table.
The surrogate key is used to prevent primary key (pk) violations (store historical data). | In a schema, there are fewer Fact Tables than Dimension Tables.
Provides entry points to data. | Composed of Degenerate Dimension fields that act as the Primary Key.
Values of fields are in numeric and text representation. | Values of the fields are always in numeric or integer form.
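How a fact table's foreign keys point at dimension surrogate keys can be shown with a small Python sketch. The sample tables, the field names (`product_sk`, `amount`, `category`) and the helper `total_by_attribute` are all made up for illustration.

```python
# Dimension table: surrogate key -> natural key plus descriptive attributes.
dim_product = {
    1: {"nk": "P-100", "name": "Widget", "category": "Tools"},
    2: {"nk": "P-200", "name": "Wrench", "category": "Tools"},
    3: {"nk": "P-300", "name": "Robot", "category": "Toys"},
}

# Fact table: a foreign key to the dimension plus a numeric measurement.
fact_sales = [
    {"product_sk": 1, "amount": 10.0},
    {"product_sk": 2, "amount": 25.0},
    {"product_sk": 3, "amount": 5.0},
]


def total_by_attribute(facts, dimension, fk, attribute, measure):
    """Sum a fact measure grouped by a descriptive dimension attribute."""
    totals = {}
    for fact in facts:
        group = dimension[fact[fk]][attribute]  # follow the foreign key
        totals[group] = totals.get(group, 0) + fact[measure]
    return totals


sales_by_category = total_by_attribute(
    fact_sales, dim_product, "product_sk", "category", "amount"
)
```

The fact rows carry only keys and numbers, while the text used for grouping comes from the dimension, which is the division of roles the comparison above describes.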



DATA MINING VS WEB MINING

DATA MINING | WEB MINING
Data mining involves using techniques to find underlying structure and relationships in large amounts of data. | Web mining involves the analysis of Web server logs of a Web site.
Data mining products tend to fall into five categories: neural networks, knowledge discovery, data visualization, fuzzy query analysis and case-based reasoning. | The Web server logs contain the entire collection of requests made by a potential or current customer through their browser and responses by the Web server.

FACT TABLE VS DIMENSION TABLE

FACT TABLE | DIMENSION TABLE
A fact table in a data warehouse describes the transaction data. It contains characteristics and key figures. | A table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created. A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.
In a data model schema, fewer fact tables are observed. | In a data model schema, more dimension tables are observed.

RDBMS SCHEMA VS DWH SCHEMA

RDBMS SCHEMA | DWH SCHEMA
* Used for OLTP systems | * Used for OLAP systems
* Traditional and old schema | * New generation schema
* Normalized | * Denormalized
* Difficult to understand and navigate | * Easy to understand and navigate
* Cannot solve extract and complex problems | * Extract and complex problems can be easily solved
* Poorly modelled | * Very good model

How to find the number of successful, rejected and bad records in the same mapping.

* First we separate the data using an Expression transformation, which flags each row as 1 or 0. The condition is as follows:

  IIF(NOT IS_DATE(HIREDATE,'DD-MON-YY') OR ISNULL(EMPNO) OR ISNULL(NAME) OR ISNULL(HIREDATE) OR ISNULL(SEX), 1, 0)

* FLAG = 1 is considered invalid data and FLAG = 0 is considered valid data. This data is routed to the next transformation using a Router transformation, where we add two user-defined groups: one with FLAG = 1 for invalid data and the other with FLAG = 0 for valid data.



* FLAG = 1 data is forwarded to an Expression transformation. Here we take one variable port and two output ports: one for incrementing the count and the other for flagging the row ...
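The flag-and-route steps above can be sketched in Python. This is only an illustration of the logic: the function names are hypothetical, the rows are plain dicts, and `'DD-MON-YY'` is approximated with the `strptime` pattern `%d-%b-%y`, which assumes English month abbreviations.

```python
from datetime import datetime

REQUIRED_FIELDS = ("EMPNO", "NAME", "HIREDATE", "SEX")


def flag_row(row):
    """Return 1 for an invalid row and 0 for a valid one."""
    if any(row.get(field) is None for field in REQUIRED_FIELDS):
        return 1  # a missing value plays the role of ISNULL(...)
    try:
        datetime.strptime(row["HIREDATE"], "%d-%b-%y")
    except ValueError:
        return 1  # not a date in the expected format
    return 0


def route_rows(rows):
    """Split rows into (valid, invalid) groups, as the two router groups do."""
    valid, invalid = [], []
    for row in rows:
        (invalid if flag_row(row) == 1 else valid).append(row)
    return valid, invalid


rows = [
    {"EMPNO": 7369, "NAME": "SMITH", "HIREDATE": "17-DEC-80", "SEX": "M"},
    {"EMPNO": None, "NAME": "ALLEN", "HIREDATE": "20-FEB-81", "SEX": "M"},
    {"EMPNO": 7521, "NAME": "WARD", "HIREDATE": "1981/02/22", "SEX": "M"},
]
valid_rows, invalid_rows = route_rows(rows)
valid_count, invalid_count = len(valid_rows), len(invalid_rows)
```

Counting the length of each group gives the numbers of valid and invalid records in a single pass over the source.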
