introduction to datastage
TRANSCRIPT
7/25/2019 Introduction to DataStage
http://slidepdf.com/reader/full/introduction-to-datastage 1/111
SPICADATA SYSTEMS
Agenda
• Introduction to ETL
• DataStage client-server architecture
  • 2 Tier architecture
  • 3 Tier architecture
• DataStage client review
MANAGER DESIGNER DIRECTOR ADMIN

Agenda…
Overview of the following:
• DataStage Administrator
• DataStage Designer
• DataStage Manager
• DataStage Director
• DataStage Jobs
• Stages

What is ETL?
ETL stands for:
• extracting data from outside sources,
• transforming it to fit business needs, and ultimately
• loading it into the target.
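
The three steps can be sketched in a few lines of Python. This is only an illustration of the ETL idea, not how DataStage runs a job; the sample data, the column names (`product`, `amount`), and the list standing in for the target are all assumptions.

```python
import csv
import io

# Hypothetical source data; a real job would read a file or a database.
SOURCE = "product,amount\nwidget,10\ngadget,5\n"

def extract(text):
    """Extract: read rows from the (assumed) delimited source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert types and apply a simple business rule."""
    return [
        {"product": r["product"].upper(), "amount": int(r["amount"])}
        for r in rows
    ]

def load(rows, target):
    """Load: append the transformed rows to the target (a list here)."""
    target.extend(rows)
    return target

warehouse = load(transform(extract(SOURCE)), [])
print(warehouse)
```

Keeping the three steps as separate functions mirrors the way an ETL tool separates source, processing, and target stages.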

ETL Process

DataStage - Introduction
DataStage is an ETL tool, well suited to data integration projects such as data warehouses, data marts, and system migrations. It is used to:
• Design jobs for Extraction, Transformation, and Loading (ETL).
• Import, export, create, and manage metadata for use within jobs.

DataStage Introduction
• Schedule, run, and monitor jobs, all within DataStage.
• Administer DataStage development and execution environments.

DataStage client-server architecture
[Figure: 2-tier client-server architecture. Clients connect to the DataStage Server over a network (Internet).]

DataStage client-server architecture
[Figure: 3-tier client-server architecture. Labels: Citrix application server; DataStage, SSH client, Business Objects, and Control-M scheduler (each shown twice); local area network; firewalls.]

DataStage client components
Administrator: add/delete projects, set defaults
Manager: import metadata, back up projects
Designer: assemble jobs, compile, and execute
Director: execute jobs, examine job run logs
Version control: log all the changes made to the Designer components

Administrator
• Projects can be created and deleted in Administrator.
• Project properties are set in Administrator.
• Environment variables and their defaults are set in Administrator.
Environment variables are defined here

Administrator
Setting Project Properties:
• Log on to Administrator
• Select the project
• Go to the Properties tab

DataStage Manager
[Screenshot labels: Host name, Project]

Manager Contents
Metadata for source and targets
• Table definitions
• Source/target sequential file/dataset layouts
DataStage Objects
• Jobs
• Routines
• Table definitions
• Containers
Import and export components/projects
Back up projects/components

Manager
Import and Export
Exporting DataStage components
• To export using Manager, go to "Export > DataStage components".
• Select DataStage objects for export.
• Specify the type of export: DSX / XML.
• Select the path on the client machine where the file is to be exported.
Importing DataStage components
• To import using Manager, go to "Import > DataStage components".
• Select DataStage objects to import.

Manager - Exporting DataStage Objects

DataStage Designer
• Designer functions
  - Used to build jobs
  - Compile/run the created jobs
  - Define the job environment variable properties

DataStage Designer
[Screenshot labels: Run, Compile, Job properties, Palette, Repository, Host name, Project]

DataStage Director - Functions
• Used to control/monitor the jobs or job activity
• Validate, start, stop, and reset the job
• Schedule job runs
• View the log and filter/limit the log entries
• Can change the job parameters at run time

DataStage Director

Director
Job execution/validation

Director
Job Monitor

Director
Job Reset

Director - Log View
Log detail

DataStage Jobs
• An executable DataStage program.
• Created in DataStage Designer, but can use components from Manager.
• Built using a graphical user interface.
• Made up of stages, which represent the processing steps required, and links between the stages, which represent the flow of data.

Types of DataStage Jobs
• Server Jobs
• Parallel Jobs
• Mainframe Jobs

Stage
• A stage describes a data source, a particular process, or a data mart.
  - For example, one stage may extract data from one data source, while another transforms it.
• Stages are added to a job and linked together using the Designer.
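
As a rough mental model, a job can be pictured as a set of named stages connected by links. The toy Python class below is a hypothetical sketch of that idea only; the `Job` class, its methods, and the sample stages are invented for illustration and do not reflect DataStage internals. It also assumes a simple linear chain of stages.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """Toy model of a job: named stages plus directed links between them."""
    stages: dict = field(default_factory=dict)   # stage name -> callable(rows) -> rows
    links: list = field(default_factory=list)    # (from_stage, to_stage) pairs

    def add_stage(self, name, func):
        self.stages[name] = func

    def link(self, src, dst):
        self.links.append((src, dst))

    def run(self, first_stage):
        """Follow the links from first_stage, passing rows along each link."""
        rows, current = self.stages[first_stage]([]), first_stage
        next_stage = dict(self.links)            # assumes one outgoing link per stage
        while current in next_stage:
            current = next_stage[current]
            rows = self.stages[current](rows)
        return rows

job = Job()
job.add_stage("source", lambda _: [{"name": "ann"}, {"name": "bob"}])
job.add_stage("transform", lambda rows: [{"name": r["name"].title()} for r in rows])
job.link("source", "transform")
result = job.run("source")
print(result)
```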

A Simple Job with Stages

File formats
• Fixed length file
  - Each field has a fixed length (similar to table metadata)
Row1Column1 Row1Column2
Row2Column1 Row2Column2

File formats
• Variable length/delimited file
  - Each field is separated by a delimiter
  - A record delimiter signifies end-of-record
[Diagram: Field 1, Field 2, Field 3, each followed by a field delimiter]
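
The difference between the two formats shows up clearly when parsing them by hand. The following Python sketch is illustrative only: the field names, the widths in `FIELD_WIDTHS`, and the sample records are assumptions, and a newline is assumed as the record delimiter.

```python
import csv
import io

# Fixed-length record: every field occupies a fixed number of characters.
# The widths below are hypothetical; they play the role of table metadata.
FIELD_WIDTHS = [("id", 4), ("name", 10), ("amount", 6)]

def parse_fixed(line):
    """Slice one fixed-length record into named fields."""
    record, pos = {}, 0
    for name, width in FIELD_WIDTHS:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record

def parse_delimited(text, delimiter=","):
    """Split delimited text: the delimiter separates fields, a newline ends a record."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

fixed_row = parse_fixed("0001Widget    001250")
delimited_rows = parse_delimited("1,Widget,1250\n2,Gadget,300\n")
print(fixed_row)
print(delimited_rows)
```

A fixed-length file needs the widths to be known in advance, while a delimited file only needs the delimiter characters.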

Sequential File Stage
• Used to extract data from, or load data to, a sequential file.
• Specify the full path to the file.
• Specify a file format: fixed width or delimited.
• Specify column definitions.
• Specify the write action.

Where to find DataStage Designer?

Logging Into DataStage Designer

DataStage Designer
Select the Server Job.

Designer Work Area

Designer - Create a New Job
You can also go to File in the menu and select New.

Sequential File Stage
Drag the Sequential File from the Tool Palette.

Simple Job
Keeping the right mouse button pressed, link both the sequential stages.

Importing Flat File Definition
Right click over the Table definitions node and select the options as shown above.

Importing File
Select the file and click on Import.
You can click on this button to browse for the path of the file.

Definition of Flat File
Select this option to treat the first line as column names.

After Successful Import
The flat file has been imported.

Stage Properties (Source)
Right click on the Sequential File Stage and select Properties.
Click on this button to browse for the file.

Stage Properties (Source)
Select the appropriate file and confirm the selection.

Stage Properties (Source)
Select this option to treat the first row as column names.

Stage Properties (Source)
Go to the Columns tab and click on the Load button to load columns.

Stage Properties (Source)
Column definition of the selected flat file.
Click on the View Data button to see the data preview.

Stage Properties (Target)
Right click on the Target Stage and select Properties.
Enter the filename for output.

Stage Properties (Target)
Select this option to treat the first line as column names.

Stage Properties (Target)
Column definition from source.

Save and Compile the Job
Click on this button to save the job.
Click here to compile the job.

Run the Job
Clicking this button will invoke the box below.
Click this button to run the job.

Executed Job
If the link goes green, the job has executed successfully.
Select this option to open DataStage Director.

DataStage Director
The highlighted job has been completed.
Click here to view the log.

DataStage Director Log View
This is the log of the completed job.

DataStage Designer
Right click and select this option to view target data.

DataStage Designer
Data Preview for Target

DataStage Designer
• Introduction to DataStage Designer
• Server Jobs
• Stages
• Exercises

DataStage Designer
A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server.

Server Jobs
Server jobs are compiled and run on the DataStage server. A server job will do the following:
• Connect to databases on other machines as necessary and extract data.
• Process the data.
• Write the data to the target data warehouse.

Sort stage
• The DataStage Sort stage sorts a variety of data.
• The Sort stage receives a stream of rows on a single input link and delivers a sorted stream of rows on the output link.
• The Sort stage must have one input link and one output link.

Sort stage…
• A single input stream link provides rows of data to be sorted.
• A single output stream link receives sorted rows of data.
• Output rows have the same column order as input columns.
• The resulting sorted rows are written as column values to a single output link.

Sort stage
The following rules specify the order of rows, depending on case-sensitivity:
Case-Sensitivity   Ascending Order      Descending Order
Sensitive          a, asc, ascending    d, dsc, descending
Insensitive        A, ASC, ASCENDING    D, DSC, DESCENDING
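
To make the keywords concrete, here is a small Python sketch that reads a sort specification of the form "column keyword" and sorts rows accordingly. It is a hypothetical illustration, not the Sort stage's parser: the function names, the `Prod_Cd` column, the sample rows, the case-insensitive keyword matching, and the ascending default when no keyword is given are all assumptions.

```python
ASCENDING = {"a", "asc", "ascending"}
DESCENDING = {"d", "dsc", "descending"}

def parse_sort_spec(spec):
    """Split a spec such as 'Prod_Cd dsc' into (column, reverse_flag)."""
    parts = spec.split()
    column = parts[0]
    # Assumption: keywords are matched case-insensitively, default is ascending.
    keyword = parts[1].lower() if len(parts) > 1 else "asc"
    if keyword in DESCENDING:
        return column, True
    if keyword in ASCENDING:
        return column, False
    raise ValueError(f"unknown sort order keyword: {keyword!r}")

def sort_rows(rows, spec):
    """Return the rows sorted on the column named in the spec."""
    column, reverse = parse_sort_spec(spec)
    return sorted(rows, key=lambda row: row[column], reverse=reverse)

rows = [{"Prod_Cd": "P2"}, {"Prod_Cd": "P9"}, {"Prod_Cd": "P1"}]
sorted_rows = sort_rows(rows, "Prod_Cd dsc")
print([r["Prod_Cd"] for r in sorted_rows])
```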

Sort Job

Sort stage Properties
Right click on the Stage for Properties.

Exercise-Metadata Import
• Copy the Source files folder to c:\
• Log in to Designer
• From the table definition Import, select sequential file
• Select c:\srcfiles\cust_ord.txt
• Specify that the first line is column names
• Specify ',' as the delimiter
• (Metadata can also be imported using DS Manager)
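
What the import derives from the file can be approximated in Python: the first line supplies the column names and the delimiter splits each remaining record. This is a sketch under assumptions; the sample text and its column names are invented stand-ins for the real source file, whose contents are not shown in the slides.

```python
import csv
import io

# Stand-in for the contents of a source file; the column names are made up.
SAMPLE = "cust_id,prod_cd,ord_amt\n101,P1,2500\n102,P2,700\n"

def import_definition(text, delimiter=","):
    """Return the column names from the first line and the remaining data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    columns = next(reader)      # first line is column names
    rows = list(reader)
    return columns, rows

columns, rows = import_definition(SAMPLE)
print(columns)
print(rows)
```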

Exercise-Sort Stage
• Create a new server job: go to File in the menu bar, select New, and then select Server Job.
• From the Palette in the left side pane, under the File section, select Sequential File (Source).
• Go back to the Palette and select Sort under the Processing section.
• Once again go back to the Palette and select Sequential File (Target) under the File section.
• Once all the above stages have been placed, connect them with a Link from the General section.
• Now go to the first Sequential File and double click to open the properties window.
• In the properties window, on the Outputs page and in the General tab, specify the file name with the appropriate path or browse (c:\srcfiles\cust_ord.txt).

Exercise-Sort Stage
• Then in the Format tab select the option 'First line is column names'.
• In the Columns tab check whether the right column names have been imported. If you don't see anything, you can click on Load and import the desired file definition.
• To preview the data, click on the View Data button.
• In the Sort stage go to the Stage page and, under the Properties tab, select the option Sort specification. You need to enter Prod_Cd dsc.
• Finally, in the Sequential File (Target) go to the Inputs page and under the General tab enter the file name with the directory path (c:\tgt\prod_sort.txt). Under the Format tab select 'First line is column names'. Under the Columns tab check whether the right column names are used. If you don't see anything, you can click on Load and import the desired file definition.

Exercise-Sort Stage…
• Save your job from the File menu, then compile the job by going to the File menu and selecting Compile.
• After the compilation has completed successfully, run the job by selecting Run from the File menu.

Aggregator stage
• This stage classifies data rows from a single input link into groups and computes totals or other aggregate functions for each group.
• The summed totals for each group are output from the stage through an output link.
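
The grouping and totalling behaviour can be sketched in plain Python. This is only an illustration of the concept; the `aggregate` function, the column names, and the sample orders are assumptions rather than anything defined by DataStage.

```python
from collections import defaultdict

def aggregate(rows, group_col, value_col):
    """Group rows by group_col and sum value_col within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)

orders = [
    {"product": "P1", "amount": 100},
    {"product": "P2", "amount": 40},
    {"product": "P1", "amount": 60},
]
totals = aggregate(orders, "product", "amount")
print(totals)
```

Each distinct value of the grouping column produces one output row, which is why the output usually has fewer rows than the input.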

Aggregator stage Properties
Right click on the Stage for Properties.
You can add some description on the General tab.

Aggregator stage Properties
Select the Inputs tab.
Select this tab to view the column definition from the input link.

Aggregator stage Properties
Select the Outputs tab.
Select the Columns tab.

Aggregator stage Properties
Double click on the field to open the derivation window.

Aggregator stage Properties
Select this option to create a group according to this field and click on OK.

Aggregator stage Properties
Double click on the field to open the derivation window.

Aggregator stage Properties
Select the appropriate function to perform an aggregation and click on OK.

Annotations
DataStage allows you to insert notes into a Diagram window. These are called annotations and there are two types:
• Annotation. You enter the text for this yourself. Use it to annotate stages and links in your job design. These can be cut, copied, and pasted into other jobs.
• Job Description Annotation. This displays either the short or full description from the job properties. You can edit the description within the annotation if required. There can only be one of these per job, and it cannot be cut, copied, or pasted into other jobs.

Annotations

Exercise
• Read the data from c:\srcfiles\cust_ord.txt, calculate the total amount for each product, and write the result to the file c:\tgt\prodagg.txt.
• Add an Annotation to describe the job stages.

Transformer stage
• This stage is used to handle extracted data.
• It can perform required conversions and pass data to another stage.

Transformer stage
This stage can perform the following tasks:
• Create new columns on a link.
• Delete columns from a link.
• Define local variables.
• Define constraints.
• Define output column derivations.

Transformer stage Properties
Double click on the Stage for Properties.
[Screenshot labels: Input link, Output link]

Transformer stage Properties
Link the columns from the input link to the output link by keeping the left mouse button pressed.
[Screenshot labels: Input, Output]

Transformer stage Properties
Right click on the required field to add a derivation.

Define Derivation
In this way you can add a derivation and click on OK.

Define Local Variables

Add Constraints
You can define limits for output data by specifying a constraint.

Reject Link
• Note that the reject link(s) must be last in the execution order.
• Any data on rows not written to any other output link in the stage is then written to the reject link(s), using the column mappings you have specified.

Connect columns to reject link

Exercise-Transformer Stage
• Read the data from cust_ord.txt
• Add local variable vamt = [?]0000
• Add derivation newamt
• newamt = ordamt - 2000
• Add constraint ordamt > vamt
• Add a sequential stage for writing rejected data (data that doesn't meet the constraint).
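
The logic of this exercise (a local variable, a derivation, a constraint, and a reject link) can be mimicked in Python as follows. It is a sketch under assumptions: the threshold of 10000 is a placeholder rather than the value from the exercise, and the function and the sample orders are invented for illustration.

```python
THRESHOLD = 10000   # placeholder for the local variable vamt (assumed value)
DEDUCTION = 2000    # amount subtracted in the newamt derivation

def transform(rows, threshold=THRESHOLD):
    """Route each row to the output link or the reject link."""
    output, rejects = [], []
    for row in rows:
        if row["ordamt"] > threshold:                      # constraint
            out = dict(row)
            out["newamt"] = row["ordamt"] - DEDUCTION      # derivation
            output.append(out)
        else:
            rejects.append(row)                            # reject link
    return output, rejects

orders = [{"ordamt": 15000}, {"ordamt": 8000}]
output, rejects = transform(orders)
print(output)
print(rejects)
```

Rows that fail the constraint are not lost; they are collected separately, which is the role the reject link plays in the job.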

Database Stage - ODBC
• ODBC stages are used to represent a database that supports the industry standard Open Database Connectivity API. You can use an ODBC stage to extract, write, or aggregate data.
• Each ODBC stage can have any number of inputs or outputs.
• You can specify the data on an input link using an SQL statement constructed by DataStage, a user-defined query, or a stored procedure.

Stage Properties (Create ODBC DSN in Windows)

Load Column Definition

View SQL

Define Where Clause

Aggregating Data Using an ODBC Stage
• You can use an ODBC stage to aggregate data at the source instead of using an intermediate Aggregator stage.
• Define the group by column.
• Define the Derivation cell for the column you want to aggregate using SUM or COUNT.
• View the changed SQL with the Group by clause.
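
The kind of SQL this produces can be tried out with Python's built-in `sqlite3` module. This is a hedged sketch: SQLite stands in for the ODBC data source, and the `orders` table, its columns, and the sample rows are assumptions, not the SQL DataStage actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # SQLite stands in for the ODBC source
conn.execute("CREATE TABLE orders (prod_cd TEXT, ord_amt INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("P1", 100), ("P2", 40), ("P1", 60)],
)

# Grouping column plus SUM/COUNT derivations, aggregated by the database.
query = """
    SELECT prod_cd, SUM(ord_amt) AS total_amt, COUNT(*) AS order_count
    FROM orders
    GROUP BY prod_cd
    ORDER BY prod_cd
"""
stats = conn.execute(query).fetchall()
print(stats)
conn.close()
```

Because the database does the grouping, only one row per group travels to the job instead of every source row.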

Aggregating Data Using an ODBC Stage
(For better performance)

Database Stage - Oracle OCI
• Use this stage for Oracle data sources or targets.
• The Oracle client should be pre-installed on the machine.

Stage Properties

Input Properties

Update Action
• Truncate table then insert rows
• Clear table then insert rows
• Insert rows without clearing
• Delete existing rows only
• Replace existing rows completely
• Update existing rows only
• Update existing rows or insert new rows
• Insert new rows or update existing rows
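
As an illustration of one of these actions, "Update existing rows or insert new rows" can be emulated in Python with `sqlite3`. This is a sketch under assumptions: SQLite stands in for the target database, the `stats` table and the helper function are hypothetical, and the stage itself may implement the action differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # SQLite stands in for the target database
conn.execute("CREATE TABLE stats (prod_cd TEXT PRIMARY KEY, total_amt INTEGER)")
conn.execute("INSERT INTO stats VALUES ('P1', 100)")

def update_or_insert(conn, prod_cd, total_amt):
    """Update the row if it exists; otherwise insert a new one."""
    cur = conn.execute(
        "UPDATE stats SET total_amt = ? WHERE prod_cd = ?", (total_amt, prod_cd)
    )
    if cur.rowcount == 0:            # no existing row matched the key
        conn.execute("INSERT INTO stats VALUES (?, ?)", (prod_cd, total_amt))

update_or_insert(conn, "P1", 160)    # existing key: updated
update_or_insert(conn, "P2", 40)     # new key: inserted
result = conn.execute("SELECT * FROM stats ORDER BY prod_cd").fetchall()
print(result)
conn.close()
```

"Insert new rows or update existing rows" would try the two statements in the opposite order; which is cheaper depends on whether most incoming rows already exist in the target.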

Load Columns

SQL Tab

Using User-defined SQL Statements
• Instead of writing data using an SQL statement constructed by DataStage, you can enter your own SQL INSERT, DELETE, or UPDATE statement for each ORAOCI input link.
• This statement is executed for each row.

Before After SQL
• Before. Contains the SQL statements executed before the stage processes any job data rows.
• After. Contains the SQL statements executed after the stage processes the job data rows.
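
The order of execution (Before statements once, the row statement once per row, After statements once) can be sketched in Python with `sqlite3`. Everything here is an assumption for illustration: SQLite stands in for Oracle, and the `write_rows` helper, the tables, and the statements are hypothetical.

```python
import sqlite3

def write_rows(conn, rows, row_sql, before_sql=(), after_sql=()):
    """Run before_sql once, row_sql once per row, then after_sql once."""
    for statement in before_sql:
        conn.execute(statement)
    for row in rows:
        conn.execute(row_sql, row)          # user-defined statement, per row
    for statement in after_sql:
        conn.execute(statement)
    conn.commit()

conn = sqlite3.connect(":memory:")          # SQLite stands in for Oracle
conn.execute("CREATE TABLE target (prod_cd TEXT, total_amt INTEGER)")
conn.execute("CREATE TABLE load_log (note TEXT)")
conn.execute("INSERT INTO target VALUES ('OLD', 0)")

write_rows(
    conn,
    rows=[("P1", 160), ("P2", 40)],
    row_sql="INSERT INTO target VALUES (?, ?)",
    before_sql=["DELETE FROM target"],
    after_sql=["INSERT INTO load_log VALUES ('load finished')"],
)
loaded = conn.execute("SELECT * FROM target ORDER BY prod_cd").fetchall()
log = conn.execute("SELECT note FROM load_log").fetchall()
print(loaded, log)
```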

ODBC Stage used to Aggregate the Data

Target Properties

Transformer Properties

Exercise
• Create user oltp identified by oltp
• Create user stag identified by stage
• Run source scripts against the oltp user
• Run target scripts against the stage user
• Create an ODBC DSN in Windows for logging in to the Oracle user
• Import table definitions from the oltp and stage users
• Create a job to read data from the ordertx table, aggregate it by product, and load it into the custorderstats table (do not use the Aggregator stage)