48810459-informatica-senarios
8/7/2019 48810459-informatica-senarios
How to use the "SUBSTR" function in a mapping.
Explanation : Returns a portion of a string. SUBSTR counts all characters, including blanks, starting at the beginning of the string.
Syntax
SUBSTR( string , start [, length ] )
Example
SUBSTR( IN_PHONE, 1, 3 )
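The 1-based counting described above can be sketched in Python. This is only a rough model, not the Informatica implementation: the helper name substr is hypothetical, and the handling of a start of 0, a negative start, and a non-positive length are assumptions about how the function is commonly documented to behave.

```python
def substr(string, start, length=None):
    """Rough model of SUBSTR(string, start [, length]) with 1-based positions."""
    if string is None:
        return None
    if start > 0:
        begin = start - 1                      # position 1 is the first character
    elif start == 0:
        begin = 0                              # assumption: 0 is treated like 1
    else:
        begin = max(len(string) + start, 0)    # assumption: negative counts from the end
    if length is None:
        return string[begin:]                  # no length: rest of the string
    if length <= 0:
        return ""                              # assumption: non-positive length gives an empty string
    return string[begin:begin + length]


# Hypothetical phone value; the first three characters are the area code.
area_code = substr("809-555-0269", 1, 3)
```

With the example value above, area_code holds the first three characters of the phone number, matching the SUBSTR( IN_PHONE, 1, 3 ) call.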
Design a mapping that generates a sequence of numbers using the SETVARIABLE function in an Expression transformation (without using a Sequence Generator).
Mapping : generate a sequence of numbers without using a Sequence Generator.
Solution : Source : Flat file
Target : Relational
Database : Oracle
Note : uses the SETMAXVARIABLE() function and mapping variables.
Download : XML FILE
m_sequence_variablefunction
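The idea behind this mapping can be illustrated with a small Python model. It is a sketch only: the class MappingVariable and the function number_rows are hypothetical stand-ins for a mapping variable and an Expression transformation, and the real Integration Service may evaluate and persist variables differently.

```python
class MappingVariable:
    """Hypothetical stand-in for a mapping variable that keeps its value between runs."""

    def __init__(self, initial=0):
        self.value = initial

    def set_max(self, candidate):
        # Models the idea of SETMAXVARIABLE: keep the larger of the two values.
        self.value = max(self.value, candidate)
        return self.value


def number_rows(rows, seq_var):
    """Attach the next sequence number to each row without a sequence generator."""
    numbered = []
    for row in rows:
        next_no = seq_var.set_max(seq_var.value + 1)
        numbered.append((next_no, row))
    return numbered


seq = MappingVariable(0)
first_run = number_rows(["a", "b", "c"], seq)
second_run = number_rows(["d"], seq)   # continues from the saved value
```

Because the variable keeps the highest value it has seen, a second run continues numbering where the first run stopped.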
DWH
Design a mapping that moves the first half of the data to one target and the second half to another target. E.g., if you have 20 records in the source, the first 10 go to one target and the other 10 to the second target; if the source has an odd number of records, the first n/2 + 1 (with n/2 rounded down) go to one target and the rest to the second target.
Mapping : first half to one target and second half to the other target.
Solution : Source : Flat file
Target : Relational
Database : Oracle
Tip : use a stored procedure to count the records.
Download : XML FILE
m_firsthalf_secondhalf
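The split rule can be checked with a few lines of Python. This is a sketch of the arithmetic only, assuming the record count is known up front (the mapping gets it from a stored procedure); the function name split_halves is hypothetical.

```python
def split_halves(records):
    """Return (first_target, second_target) following the n/2 and n/2 + 1 rule."""
    total = len(records)
    # Even count: n/2 rows. Odd count: n/2 rounded down, plus 1.
    first_size = total // 2 + total % 2
    return records[:first_size], records[first_size:]


first_even, second_even = split_halves(list(range(1, 21)))   # 20 records
first_odd, second_odd = split_halves(list(range(1, 6)))      # 5 records
```

In a mapping, the same comparison (row number against the computed first_size) would be the condition that routes each row to one target or the other.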
REPOSITORY ADMIN CONSOLE
Actions
* Create Local or Global Repository
* Start Repositories
* Back up Repository
* Move a copy of the Repository to a different Server
* Disable the Repository
* Export connection information
* Notify Users : a notification message can be sent to all users connected to the Repository
* Propagate
* Register Repositories
* Restore Repository
* Upgrade Repository
Actions
* Create reusable tasks, Worklets, and Workflows
* Schedule Workflows
* Configure tasks
Workflow
A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data.
Worklets
A worklet is an object that represents a set of tasks.
When to create Worklets?
Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the Worklet Designer to create and edit worklets.
Where to use Worklets?
You can run worklets inside a workflow. The workflow that contains the worklet is called the parent workflow. You can also nest a worklet in another worklet.
WORKFLOW MONITOR
You can monitor workflows and tasks in the Workflow Monitor. View details about a workflow or task in Gantt Chart view or Task view.
Actions
You can run, stop, abort, and resume workflows from the Workflow Monitor.
You can view the log file and performance data.
Slowly Changing Dimension
* It is a dimension that changes slowly over time.
Slowly Changing Dimension Mapping | Type | Description
SCD Type 1 | Slowly Changing Dimension | Inserts new dimensions. Overwrites existing dimensions with changed dimensions. (Shows Current Data)
SCD Type 2 / Version Data | Slowly Changing Dimension | Inserts new and changed dimensions. Creates a version number and increments the primary key to track changes.
SCD Type 2 / Flag Current | Slowly Changing Dimension | Inserts new and changed dimensions. Flags the current version and increments the primary key to track changes.
SCD Type 2 / Date Range | Slowly Changing Dimension | Inserts new and changed dimensions. Creates an effective date range to track changes.
SCD Type 3 | Slowly Changing Dimension | Inserts new dimensions. Updates changed values in existing dimensions. Optionally uses the load date to track changes.
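The difference between overwriting (Type 1) and versioning (Type 2 / Flag Current) can be sketched in Python. This is a simplified illustration under stated assumptions, not the logic generated by the mapping wizards: the function names, the row layout (pk, natural_key, attrs, current_flag), and the sample customer data are all hypothetical.

```python
def apply_scd_type1(dimension, incoming):
    """Type 1: insert new keys and overwrite changed ones (current data only)."""
    for key, attrs in incoming.items():
        dimension[key] = dict(attrs)
    return dimension


def apply_scd_type2_flag(rows, incoming):
    """Type 2 / Flag Current: keep history, flag the newest version with 1."""
    next_pk = max((r["pk"] for r in rows), default=0) + 1
    current = {r["natural_key"]: r for r in rows if r["current_flag"] == 1}
    for key, attrs in incoming.items():
        existing = current.get(key)
        if existing is not None and existing["attrs"] == attrs:
            continue                          # unchanged: nothing to insert
        if existing is not None:
            existing["current_flag"] = 0      # old version is no longer current
        rows.append({"pk": next_pk, "natural_key": key,
                     "attrs": dict(attrs), "current_flag": 1})
        next_pk += 1
    return rows


changes = {"C1": {"city": "Delhi"}, "C2": {"city": "Goa"}}

type1_dim = apply_scd_type1({"C1": {"city": "Pune"}}, changes)

type2_rows = apply_scd_type2_flag(
    [{"pk": 1, "natural_key": "C1", "attrs": {"city": "Pune"}, "current_flag": 1}],
    changes,
)
```

After the Type 1 load only the latest city for C1 remains, while the Type 2 load keeps the old C1 row with its flag set to 0 and adds new rows with incremented primary keys.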
OLTP | OLAP
On-Line Transaction Processing | On-Line Analytical Processing
Continuously updates data | Read-only data
Tables are in normalized form | Partially normalized / denormalized tables
Single record access | Multiple records for analysis purposes
Holds current data | Holds current and historical data
Records are maintained using a primary key field | Records are based on a surrogate key field
Tables or records can be deleted | Records cannot be deleted
Complex data model | Simplified data model
DATAMART | DATA WAREHOUSE
A scaled-down version of the Data Warehouse that addresses only one subject, like Sales Department, HR Department, etc. | It is a database management system that facilitates on-line analytical processing by allowing the data to be viewed in different dimensions or perspectives to provide business intelligence.
One fact table with multiple dimension tables. | More than one fact table and multiple dimension tables.
[Sales Department] [HR Department] [Manufacturing Department] | [Sales Department, HR Department, Manufacturing Department]
Small organizations prefer a DATAMART | Bigger organizations prefer a DATA WAREHOUSE
DIMENSION TABLE | FACT TABLE
It provides the context / descriptive information for a fact table's measurements. | It provides measurements of an enterprise.
Structure of a Dimension : surrogate key, one or more other fields that compose the natural key (nk), and a set of attributes. | A measurement is the amount determined by observation.
The size of a Dimension Table is smaller than that of a Fact Table. | Structure of a Fact Table : foreign keys (fk), Degenerate Dimensions, and measurements.
In a schema, there are more Dimension Tables than Fact Tables. | The size of a Fact Table is larger than that of a Dimension Table.
The surrogate key is used to prevent primary key (pk) violations (store historical data). | In a schema, there are fewer Fact Tables than Dimension Tables.
Provides entry points to data. | Composed of Degenerate Dimension fields that act as the primary key.
Values of fields are in numeric and text representation. | Values of the fields are always in numeric or integer form.
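To make the structures in the comparison above concrete, here is a minimal Python sketch of one dimension table and one fact table. All table names, keys, and values are invented for illustration; the only points taken from the comparison are that the dimension carries a surrogate key, a natural key, and attributes, while the fact carries foreign keys, a degenerate dimension, and numeric measurements.

```python
# Dimension: surrogate key -> natural key plus descriptive attributes.
dim_product = {
    1: {"natural_key": "P-100", "category": "Books"},
    2: {"natural_key": "P-200", "category": "Toys"},
}

# Fact: foreign key to the dimension, a degenerate dimension (order_no),
# and a numeric measurement (amount).
fact_sales = [
    {"product_sk": 1, "order_no": "A1", "amount": 20.0},
    {"product_sk": 1, "order_no": "A2", "amount": 15.0},
    {"product_sk": 2, "order_no": "A3", "amount": 5.0},
]


def sales_by_category(facts, dimension):
    """Sum the measurement, using the dimension to supply the context."""
    totals = {}
    for fact in facts:
        category = dimension[fact["product_sk"]]["category"]
        totals[category] = totals.get(category, 0.0) + fact["amount"]
    return totals


totals = sales_by_category(fact_sales, dim_product)
```

The fact rows hold only keys and numbers; the readable grouping label comes from the dimension, which is why the dimension is described as the entry point to the data.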
DATA MINING VS WEB MINING
DATA MINING | WEB MINING
Data mining involves using techniques to find underlying structure and relationships in large amounts of data. | Web mining involves the analysis of the Web server logs of a Web site.
Data mining products tend to fall into five categories: neural networks, knowledge discovery, data visualization, fuzzy query analysis, and case-based reasoning. | The Web server logs contain the entire collection of requests made by a potential or current customer through their browser and the responses by the Web server.
FACT TABLE VS DIMENSION TABLE
FACT TABLE | DIMENSION TABLE
A fact table in a data warehouse describes the transaction data. It contains characteristics and key figures. | A table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created. A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.
In a data model schema, fewer fact tables are observed. | In a data model schema, more dimension tables are observed.
RDBMS SCHEMA VS DWH SCHEMA
RDBMS SCHEMA | DWH SCHEMA
* Used for OLTP systems | * Used for OLAP systems
* Traditional and old schema | * New generation schema
* Normalized | * Denormalized
* Difficult to understand and navigate | * Easy to understand and navigate
* Cannot solve extract and complex problems | * Extract and complex problems can be easily solved
* Poorly modelled | * Very good model
How to find the number of successful, rejected, and bad records in the same mapping.
* First, we separate the data using an Expression transformation, which flags each row as 1 or 0. The condition is as follows:
* IIF(NOT IS_DATE(HIREDATE,'DD-MON-YY') OR ISNULL(EMPNO) OR ISNULL(NAME) OR ISNULL(HIREDATE) OR ISNULL(SEX), 1, 0)
* FLAG = 1 is considered invalid data and FLAG = 0 is considered valid data. This data is routed to the next transformation using a Router transformation. Here we add two user-defined groups: one with FLAG = 1 for invalid data and the other with FLAG = 0 for valid data.
* FLAG = 1 data is forwarded to an Expression transformation. Here we take one variable port and two output ports: one for incrementing a count and the other for flagging the row ...
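The flag condition and the two Router groups described above can be approximated in Python. This is a sketch under assumptions: the helper names is_date, flag_row, and route are hypothetical, the sample rows are invented, and the format %d-%b-%y is used as a rough Python equivalent of 'DD-MON-YY' (month-name parsing depends on the locale, assumed English here).

```python
from datetime import datetime


def is_date(value, fmt="%d-%b-%y"):
    """Rough stand-in for IS_DATE(value, 'DD-MON-YY')."""
    if value is None:
        return False
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def flag_row(row):
    """Return 1 for invalid rows and 0 for valid rows, like the IIF condition."""
    required = ("EMPNO", "NAME", "HIREDATE", "SEX")
    has_null = any(row.get(col) is None for col in required)
    return 1 if has_null or not is_date(row.get("HIREDATE")) else 0


def route(rows):
    """Split rows into the two groups: FLAG = 0 (valid) and FLAG = 1 (invalid)."""
    valid = [r for r in rows if flag_row(r) == 0]
    invalid = [r for r in rows if flag_row(r) == 1]
    return valid, invalid


sample = [
    {"EMPNO": 1, "NAME": "A", "HIREDATE": "17-DEC-80", "SEX": "M"},
    {"EMPNO": 2, "NAME": None, "HIREDATE": "02-APR-81", "SEX": "F"},
    {"EMPNO": 3, "NAME": "C", "HIREDATE": "1981-04-02", "SEX": "M"},
]
valid_rows, invalid_rows = route(sample)
invalid_count = len(invalid_rows)   # the running count kept for FLAG = 1 rows
```

Counting the rows in each group gives the number of valid and invalid records in a single pass over the source data.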