Introduction to DataStage


Upload: bhaskar-reddy

Post on 28-Feb-2018


7/25/2019 Introduction to DataStage

http://slidepdf.com/reader/full/introduction-to-datastage 1/111

 

SPICADATA SYSTEMS

 Agenda

Introduction to ETL

DataStage client-server architecture
• 2-tier architecture
• 3-tier architecture

DataStage client review

Manager, Designer, Director, Admin

Agenda (continued)

Overview of the following:

DataStage Administrator
DataStage Designer
DataStage Manager
DataStage Director
DataStage Jobs
Stages

What is ETL?

ETL stands for:
• extracting data from outside sources,
• transforming it to fit business needs, and ultimately
• loading it into the target.
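The three ETL steps can be sketched in a few lines of Python. This is only an illustration and not DataStage code: the CSV source text, the column names `customer` and `amount`, and the in-memory `warehouse` list are all assumptions made for the example.

```python
import csv
import io


def extract(source_text):
    """Extract: read rows from an outside source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: clean names and convert amounts (a hypothetical business rule)."""
    return [
        {"customer": r["customer"].strip().upper(), "amount": float(r["amount"])}
        for r in rows
    ]


def load(rows, target):
    """Load: append the transformed rows to the target (here, a plain list)."""
    target.extend(rows)
    return len(rows)


SOURCE = "customer,amount\n alice ,10.5\nbob,20\n"  # hypothetical sample data
warehouse = []
loaded = load(transform(extract(SOURCE)), warehouse)
```

Each function maps to one letter of ETL, so the pipeline reads as load(transform(extract(...))).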

ETL Process

DataStage - Introduction

DataStage is an ETL tool. It is used to:

• Design jobs for Extraction, Transformation, and Loading (ETL).
• Carry out data integration projects such as data warehouses, data marts, and system migrations.
• Import, export, create, and manage metadata for use within jobs.
• Schedule, run, and monitor jobs, all within DataStage.
• Administer DataStage development and execution environments.

DataStage client-server architecture

[Figure: 2-tier client-server architecture. Labels: Clients, Network (Internet), Server (DataStage Server).]

[Figure: 3-tier client-server architecture. Labels: Citrix application server, DataStage, SSH client, Business Objects, Control-M scheduler, local area network, firewalls.]

DataStage client components

Administrator - add/delete projects, set defaults
Manager - import metadata, back up projects
Designer - assemble jobs, compile, and execute
Director - execute jobs, examine job run logs
Version control - log all the changes made to the Designer components

Administrator

• Projects can be created and deleted in Administrator.
• Project properties are set in Administrator.
• Environment variables and their defaults are set in Administrator.

Setting project properties:

• Log on to Administrator.
• Select the project.
• Go to the Properties tab.

DataStage Manager

[Screenshot labels: Host name, Project]

Manager Contents

Metadata for sources and targets
• Table definitions
• Source/target sequential file/dataset layouts

DataStage objects
• Jobs
• Routines
• Table definitions
• Containers

Import and export components/projects

Back up projects/components

Manager: Import and Export

Exporting DataStage components
• To export using Manager, go to "Export > DataStage components".
• Select the DataStage objects to export.
• Specify the type of export: DSX / XML.
• Select the path on the client machine where the exported file will be written.

Importing DataStage components
• To import using Manager, go to "Import > DataStage components".
• Select the DataStage objects to import.

Manager - Exporting DataStage Objects

DataStage Designer

Designer functions
• Used to build jobs
• Compile/run the created jobs
• Define the job environment variable properties

[Screenshot labels: Run, Compile, Job properties, Palette, Repository, Host name, Project]

DataStage Director - Functions

• Used to control/monitor the jobs or job activity
• Validate, start, stop, and reset the job
• Schedule job runs
• View the log and filter/limit the log entries
• Can change the job parameters at run time

Director: Job execution/validation

Director: Job Monitor

Director: Job Reset

Director - Log View

[Screenshot labels: Log detail, Log view]

DataStage Jobs

• An executable DataStage program.
• Created in DataStage Designer, but can use components from Manager.
• Built using a graphical user interface.
• Made up of stages, which represent the required processing steps, and links between the stages, which represent the flow of data.
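One way to picture a job is as a small graph: stages are the nodes and links are the directed edges along which data flows. The sketch below models that idea in Python; the `Job` class and the stage names are hypothetical and are not part of any DataStage API.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A job modeled as named stages plus directed links between them."""
    name: str
    stages: list = field(default_factory=list)
    links: list = field(default_factory=list)  # (from_stage, to_stage) pairs

    def add_stage(self, stage):
        self.stages.append(stage)

    def add_link(self, src, dst):
        # A link only makes sense between stages already placed in the job.
        if src not in self.stages or dst not in self.stages:
            raise ValueError("both ends of a link must be stages in the job")
        self.links.append((src, dst))

    def downstream(self, stage):
        """Return the stages that receive data directly from `stage`."""
        return [dst for src, dst in self.links if src == stage]


job = Job("simple_job")  # hypothetical job name
for stage_name in ("source_file", "transformer", "target_file"):
    job.add_stage(stage_name)
job.add_link("source_file", "transformer")
job.add_link("transformer", "target_file")
```

Calling `job.downstream("source_file")` then answers the question "where does this stage's data go next?".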

Types of DataStage Jobs

• Server jobs
• Parallel jobs
• Mainframe jobs

Stage

• A stage describes a data source, a particular process, or a data mart.
  - For example, one stage may extract data from a data source, while another transforms it.
• Stages are added to a job and linked together using the Designer.

A Simple Job with Stages

File formats

• Fixed-length file: each field has a fixed length (similar to table metadata).

Row1Column1 Row1Column2
Row2Column1 Row2Column2

• Variable-length/delimited file
  - Each field is separated by a delimiter.
  - A record delimiter signifies end-of-record.

[Diagram: Field 1, Field 2, Field 3, each followed by a field delimiter]
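The difference between the two file formats shows up clearly when parsing them by hand. The following Python sketch is an illustration only: the field widths, the comma delimiter, and the sample values are assumptions, not values taken from the slides.

```python
def parse_fixed(record, widths):
    """Fixed-length format: cut the record at known column widths."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width].rstrip())
        pos += width
    return fields


def parse_delimited(text, field_delim=",", record_delim="\n"):
    """Delimited format: split on the record delimiter, then on the field delimiter."""
    return [rec.split(field_delim) for rec in text.split(record_delim) if rec]


# Hypothetical sample data: a 6-character code followed by a 10-character name.
fixed_record = "A001".ljust(6) + "Widget".ljust(10)
fixed_row = parse_fixed(fixed_record, [6, 10])

delim_rows = parse_delimited("A001,Widget\nA002,Gadget\n")
```

A fixed-length reader needs the widths up front (the metadata), while a delimited reader needs only the two delimiter characters.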

Sequential File Stage

• Used to extract data from, or load data to, a sequential file.
• Specify the full path to the file.
• Specify a file format: fixed width or delimited.
• Specify the column definitions.
• Specify the write action.

Where to find DataStage Designer?

Logging Into DataStage Designer

DataStage Designer

Select the Server Job.

Designer Work Area

Designer - Create a New Job

You can also go to File in the menu and select New.

Sequential File Stage

Drag the Sequential File stage from the tool palette.

Simple Job

Keeping the right mouse button pressed, link both the sequential stages.

Importing Flat File Definition

Right-click the Table Definitions node and select the options as shown in the screenshot.

Importing File

Select the file and click Import. You can click the browse button to locate the path of the file.

Definition of Flat File

Select this option to treat the first line as column names.

After Successful Import

The flat file has been imported.

Stage Properties (Source)

Right-click the Sequential File stage and select Properties.

Click the browse button to browse for the file.

Select the appropriate file and click the indicated button.

Select this option to treat the first row as column names.

Go to the Columns tab and click the Load button to load columns.

The Columns tab shows the column definition of the selected flat file. Click the View Data button to see the data preview.

Stage Properties (Target)

Right-click the target stage and select Properties.

Enter the filename for output.

Select this option to treat the first line as column names.

Column definition from source.

Save and Compile the Job

Click this button to save the job.

Click here to compile the job.

Run the Job

Clicking this button will invoke the box below.

Click this button to run the job.

Executed Job

If the link goes green, the job has executed successfully.

Select this option to open DataStage Director.

DataStage Director

The highlighted job has been completed. Click here to view the log.

DataStage Director Log View

This is the log of the completed job.

DataStage Designer

Right-click and select this option to view target data.

Data preview for target.

DataStage Designer

Introduction to DataStage Designer
Server Jobs
Stages
Exercises

DataStage Designer

A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server.

Server Jobs

Server jobs are compiled and run on the DataStage server. A server job does the following:

• Connects to databases on other machines as necessary and extracts data.
• Processes the data.
• Writes the data to the target data warehouse.

Sort stage

• The DataStage Sort stage sorts a variety of data.
• The Sort stage receives a stream of rows on a single input link and delivers a sorted stream of rows on the output link.
• The Sort stage must have one input link and one output link.

Sort stage (continued)

• A single input stream link provides the rows of data to be sorted.
• A single output stream link receives the sorted rows of data.
• Output rows have the same column order as input columns.
• The resulting sorted rows are written as column values to a single output link.

Sort stage

The following rules specify the order of rows, depending on case-sensitivity:

Case-Sensitivity   Ascending Order       Descending Order
Sensitive          a, asc, ascending     d, dsc, descending
Insensitive        A, ASC, ASCENDING     D, DSC, DESCENDING
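The keyword table can be read as a tiny sort-specification language. The Python sketch below parses a specification such as "Prod_Cd dsc" and sorts rows accordingly. It assumes, as the table suggests, that a lowercase keyword requests a case-sensitive sort and an uppercase keyword a case-insensitive one; the function names and sample rows are hypothetical.

```python
ASCENDING = {"a", "asc", "ascending"}
DESCENDING = {"d", "dsc", "descending"}


def parse_sort_spec(spec):
    """Split 'column keyword' into (column, descending, case_insensitive)."""
    column, keyword = spec.split()
    if keyword.lower() not in ASCENDING | DESCENDING:
        raise ValueError(f"unknown sort keyword: {keyword}")
    # Assumption: an uppercase keyword means a case-insensitive comparison.
    return column, keyword.lower() in DESCENDING, keyword.isupper()


def sort_rows(rows, spec):
    """Return the rows sorted on one column according to the specification."""
    column, descending, fold_case = parse_sort_spec(spec)
    if fold_case:
        key = lambda row: row[column].lower()
    else:
        key = lambda row: row[column]
    return sorted(rows, key=key, reverse=descending)


rows = [{"Prod_Cd": "B2"}, {"Prod_Cd": "a1"}, {"Prod_Cd": "A3"}]
sensitive = [r["Prod_Cd"] for r in sort_rows(rows, "Prod_Cd dsc")]
insensitive = [r["Prod_Cd"] for r in sort_rows(rows, "Prod_Cd DSC")]
```

With these sample rows the two specifications give different orders, because uppercase letters sort before lowercase ones in a case-sensitive comparison.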

Sort Job

Sort stage Properties

Right-click the stage for Properties.

Exercise: Metadata Import

• Copy the source files folder to c:\
• Log in to Designer.
• From Table Definitions > Import, select Sequential File.
• Select c:\srcfiles\cust_ord.txt
• Specify that the first line is column names.
• Specify "," as the delimiter.
• (Metadata can also be imported using DS Manager.)

Exercise: Sort Stage

• Create a new server job: go to File in the menu bar, select New, and then select Server Job.
• From the Palette in the left-side pane, under the File section, select Sequential File (Source).
• Go back to the Palette and select Sort under the Processing section.
• Go back to the Palette once again and select Sequential File (Target) under the File section.
• Once all the above stages have been placed, connect them with a Link from the General section.
• Double-click the first Sequential File stage to open the properties window.
• In the properties window, on the Outputs page, in the General tab, specify the file name with the appropriate path, or browse for it (c:\srcfiles\cust_ord.txt).
• Then, in the Format tab, select the option "First line is column names".
• In the Columns tab, check whether the right column names have been imported. If you don't see anything, click Load and import the desired file definition.
• To preview the data, click the View Data button.
• In the Sort stage, go to the Stage page and, under the Properties tab, select the Sort Specification option. You need to enter Prod_Cd dsc.
• Finally, in the Sequential File (Target), go to the Inputs page and, under the General tab, enter the file name with the directory path (c:\tgt\prod_sort.txt). Under the Format tab, select "First line is column names". Under the Columns tab, check whether the right column names have been used. If you don't see anything, click Load and import the desired file definition.
• Save your job from the File menu, then compile it by going to the File menu and selecting Compile.
• After the compilation has completed successfully, run the job by selecting Run from the File menu.

Aggregator stage

• This stage classifies data rows from a single input link into groups and computes totals or other aggregate functions for each group.
• The summed totals for each group are output from the stage through an output link.
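The grouping and totalling that the Aggregator stage performs can be sketched in plain Python. This is an illustration of the idea only; the column names `product` and `amount` and the output column `total` are assumptions.

```python
from collections import defaultdict


def aggregate(rows, group_col, value_col):
    """Group rows on one column and sum another column for each group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    # One output row per group, in the order the groups were first seen.
    return [{group_col: key, "total": total} for key, total in totals.items()]


orders = [  # hypothetical input rows
    {"product": "P1", "amount": 100.0},
    {"product": "P2", "amount": 50.0},
    {"product": "P1", "amount": 25.0},
]
result = aggregate(orders, "product", "amount")
```

Three input rows collapse into two output rows, one per product.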

Aggregator stage Properties

Right-click the stage for Properties.

You can add a description on the General tab.

Select the Inputs tab. Select this tab to view the column definition from the input link.

Select the Outputs tab, then select the Columns tab.

Double-click the field to open the derivation window.

Select this option to create a group according to this field, and click OK.

Double-click the field to open the derivation window.

Select the appropriate function to perform an aggregation, and click OK.

Annotations

DataStage allows you to insert notes into a Diagram window. These are called annotations, and there are two types:

• Annotation. You enter the text for this yourself. Use it to annotate stages and links in your job design. These can be cut, copied, and pasted into other jobs.
• Job Description Annotation. This displays either the short or full description from the job properties. You can edit the description within the annotation if required. There can be only one of these per job, and it cannot be cut, copied, or pasted into other jobs.

Exercise

• Read the data from c:\srcfiles\cust_ord.txt, calculate the total amount for each product, and write it to the file c:\tgt\prodagg.txt.
• Add an Annotation to describe the job stages.

Transformer stage

• This stage is used to handle extracted data.
• It can perform the required conversions and pass data to another stage.

This stage can perform the following tasks:

• Create new columns on a link.
• Delete columns from a link.
• Define local variables.
• Define constraints.
• Define output column derivations.
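Several of these tasks can be shown on a single row in Python. The sketch below is not Transformer syntax: the column names `cust_id`, `ord_amt`, and `note`, the `discount` parameter, and the derived column `new_amt` are all hypothetical.

```python
def transform_row(row, discount):
    """Apply a local variable, a derivation, and a column deletion to one row."""
    # Local variable: computed once per row and reused by the derivation.
    amount = float(row["ord_amt"])
    # Output link: "note" is deleted, "new_amt" is a new derived column.
    return {"cust_id": row["cust_id"], "new_amt": amount - discount}


in_row = {"cust_id": "C1", "ord_amt": "5000", "note": "x"}  # sample input row
out_row = transform_row(in_row, discount=2000)
```

The output row carries only the columns mapped to the output link, which is how a column ends up "deleted".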

Transformer stage Properties

Double-click the stage for Properties.

[Screenshot labels: Input link, Output link]

Link the columns from the input link to the output link by keeping the left mouse button pressed.

Right-click the required field to add a derivation.

Define Derivation

Add the derivation in this way and click OK.

Define Local Variables

Add Constraints

You can define limits for output data by specifying a constraint.

Reject Link

• Note that the reject link(s) must be last in the execution order.
• Any data on rows not written to any other output link in the stage is then written to the reject link(s), using the column mappings you have specified.
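The reject rule amounts to "try every constrained output first, and send whatever nothing accepted to the reject link". A minimal Python sketch of that routing follows; the `route` function, the link name `big_orders`, and the threshold of 1000 are assumptions for illustration.

```python
def route(rows, outputs):
    """Send each row to every output whose constraint it meets.

    `outputs` is an ordered list of (link_name, constraint) pairs.
    Rows accepted by no output are collected for the reject link, which is
    why the reject link has to be evaluated last.
    """
    routed = {name: [] for name, _ in outputs}
    rejects = []
    for row in rows:
        matched = False
        for name, constraint in outputs:
            if constraint(row):
                routed[name].append(row)
                matched = True
        if not matched:
            rejects.append(row)
    return routed, rejects


rows = [{"ord_amt": 500}, {"ord_amt": 1500}]  # hypothetical input rows
routed, rejects = route(rows, [("big_orders", lambda r: r["ord_amt"] > 1000)])
```

Here the 1500 row reaches `big_orders` and the 500 row, which met no constraint, lands in `rejects`.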

Connect columns to reject link

Exercise: Transformer Stage

• Read the data from cust_ord.txt.
• Add local variable vamt = [?]0000.
• Add derivation newamt.
• newamt = ordamt - 2000
• Add constraint ordamt > vamt.
• Add a sequential stage for writing rejected data (data that doesn't meet the constraint).

Database Stage - ODBC

• ODBC stages are used to represent a database that supports the industry-standard Open Database Connectivity API. You can use an ODBC stage to extract, write, or aggregate data.
• Each ODBC stage can have any number of inputs or outputs.
• You can specify the data on an input link using an SQL statement constructed by DataStage, a user-defined query, or a stored procedure.

Stage Properties (create an ODBC DSN in Windows)

Load Column Definition

View SQL

Define Where Clause

Aggregating Data Using an ODBC Stage

• You can use an ODBC stage to aggregate data at the source instead of using an intermediate Aggregator stage.
• Define the group-by column.
• Define the Derivation cell for the column you want to aggregate using SUM or COUNT.
• View the changed SQL with the GROUP BY clause.
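Aggregating at the source means the database runs the GROUP BY and returns only the summary rows. The sketch below shows the kind of SQL involved, using Python's built-in sqlite3 module as a stand-in for an ODBC data source; the `orders` table and its columns are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the real ODBC source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("P1", 100.0), ("P2", 50.0), ("P1", 25.0)],
)

# The database groups and totals the rows, so no separate aggregation
# step is needed downstream.
sql = (
    "SELECT product, SUM(amount), COUNT(*) "
    "FROM orders GROUP BY product ORDER BY product"
)
stats = conn.execute(sql).fetchall()
conn.close()
```

Only one row per product leaves the database, which is usually less data to move than the raw order rows.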

Aggregating Data Using an ODBC Stage (for better performance)

Database Stage - Oracle OCI

• Use this stage for Oracle data sources or targets.
• The Oracle client should be pre-installed on the machine.

Stage Properties

Input Properties

Update Action

• Truncate table then insert rows
• Clear table then insert rows
• Insert rows without clearing
• Delete existing rows only
• Replace existing rows completely
• Update existing rows only
• Update existing rows or insert new rows
• Insert new rows or update existing rows
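The "update existing rows or insert new rows" action can be read as: try an UPDATE first, and INSERT only when no row was updated. The sketch below shows that logic with Python's sqlite3 module standing in for the Oracle target; the `stats` table and the helper name are hypothetical, and the stage's actual SQL may differ.

```python
import sqlite3


def update_or_insert(conn, product, total):
    """Update the row for `product`; insert it if no row was updated."""
    cur = conn.execute(
        "UPDATE stats SET total = ? WHERE product = ?", (total, product)
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO stats (product, total) VALUES (?, ?)", (product, total)
        )


conn = sqlite3.connect(":memory:")  # stand-in for the real target database
conn.execute("CREATE TABLE stats (product TEXT PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO stats VALUES ('P1', 10.0)")

update_or_insert(conn, "P1", 125.0)  # existing row: updated
update_or_insert(conn, "P2", 50.0)   # new row: inserted
final = conn.execute("SELECT product, total FROM stats ORDER BY product").fetchall()
```

"Insert new rows or update existing rows" would simply try the two statements in the opposite order, which tends to suit loads where most rows are new.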

Load Columns

SQL Tab

Using User-defined SQL Statements

• Instead of writing data using an SQL statement constructed by DataStage, you can enter your own SQL INSERT, DELETE, or UPDATE statement for each ORAOCI input link.
• This statement is executed for each row.

Before and After SQL

• Before. Contains the SQL statements executed before the stage processes any job data rows.
• After. Contains the SQL statements executed after the stage processes the job data rows.
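The order of execution (Before SQL, then the user-defined statement once per row, then After SQL) can be sketched as follows. Python's sqlite3 module stands in for the Oracle connection, and the `run_stage` helper, the `target` and `audit` tables, and the statements are all hypothetical.

```python
import sqlite3


def run_stage(conn, rows, row_sql, before_sql=(), after_sql=()):
    """Run Before statements, the per-row statement, then After statements."""
    for statement in before_sql:
        conn.execute(statement)
    for row in rows:
        conn.execute(row_sql, row)  # user-defined SQL, once per data row
    for statement in after_sql:
        conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real target database
conn.execute("CREATE TABLE target (product TEXT, amount REAL)")
conn.execute("CREATE TABLE audit (note TEXT)")

run_stage(
    conn,
    rows=[("P1", 100.0), ("P2", 50.0)],
    row_sql="INSERT INTO target (product, amount) VALUES (?, ?)",
    before_sql=["DELETE FROM target"],
    after_sql=["INSERT INTO audit VALUES ('load finished')"],
)
loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
audited = conn.execute("SELECT note FROM audit").fetchall()
```

Before SQL is a natural place for clean-up such as emptying the target, and After SQL for bookkeeping such as an audit record.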

ODBC Stage used to Aggregate the Data

Target Properties

Transformer Properties

Exercise

• Create user oltp identified by oltp.
• Create user stage identified by stage.
• Run the source scripts against the oltp user.
• Run the target scripts against the stage user.
• Create an ODBC DSN in Windows for logging in to the Oracle users.
• Import table definitions from the oltp and stage users.
• Create a job to read data from the ordertx table, aggregate it by product, and load it into the custorderstats table. (Do not use the Aggregator stage.)