Data Warehousing & Informatica
By
Deepthi.G
AGENDA
♦ INTRODUCTION TO DATA WAREHOUSING
♦ DATA WAREHOUSING ARCHITECTURE
♦ STEPS FOR BUILDING A DATA WAREHOUSE
♦ TYPES OF SCHEMAS
♦ CONCEPTS OF DATA WAREHOUSING
♦ LIST OF TOOLS
♦ INTRODUCTION TO INFORMATICA
♦ INFORMATICA ARCHITECTURE
♦ COMPONENTS OF INFORMATICA
♦ WORKING WITH INFORMATICA
♦ INSTALLATION AND CONFIGURATION
AS PER W. H. INMON, A DATA WAREHOUSE IS
♦ Subject-Oriented
♦ Integrated
♦ Time-Variant
♦ Non-volatile
Data warehousing is also known as a decision support system (DSS).
Subject Oriented Analysis
Transactional Storage (Process Oriented)
♦ Entry: Sales Rep, Quantity Sold, Prod Number, Date, Customer Name, Product Description, Unit Price, Mail Address
Data Warehouse Storage (Subject Oriented)
♦ Sales
♦ Customers
♦ Products
Integration of Data
Transactional Storage -> Data Warehouse Storage
♦ Encoding: Appl. A - M, F; Appl. B - 1, 0; Appl. C - X, Y -> M, F
♦ Unit of Attributes: Appl. A - pipeline cm; Appl. B - pipeline inches; Appl. C - pipeline mcf -> pipeline cm
♦ Physical Attributes: Appl. A - balance dec(13,2); Appl. B - balance PIC 9(9)V99; Appl. C - balance float -> balance dec(13,2)
♦ Naming Conventions: Appl. A - bal-on-hand; Appl. B - current_balance; Appl. C - balance -> balance
♦ Data Consistency: Appl. A - date (Julian); Appl. B - date (yymmdd); Appl. C - date (absolute) -> date (Julian)
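To make the integration step concrete, here is a minimal Python sketch of how values from the three applications might be brought to one warehouse convention. The function names, the choice of which source code maps to "M" for Appl. B and Appl. C, and the sample values are assumptions made for illustration; the slide only names the encodings and units.

```python
from datetime import date, datetime

# Per-application gender encodings named on the slide.
# ASSUMPTION: "1" and "X" stand for "M"; the slide does not say which is which.
GENDER_MAPS = {
    "A": {"M": "M", "F": "F"},
    "B": {"1": "M", "0": "F"},
    "C": {"X": "M", "Y": "F"},
}

# Conversion factors to centimetres, the warehouse unit on the slide.
# Only cm and inches are handled in this sketch; mcf is not.
TO_CM = {"cm": 1.0, "inches": 2.54}


def integrate_gender(app, value):
    """Translate an application-specific gender code to the warehouse code."""
    return GENDER_MAPS[app][str(value)]


def integrate_length(value, unit):
    """Convert a pipeline length to centimetres."""
    return round(value * TO_CM[unit], 2)


def integrate_yymmdd(text):
    """Parse a yymmdd string (Appl. B style) into a date object."""
    return datetime.strptime(text, "%y%m%d").date()


gender = integrate_gender("B", 1)
length_cm = integrate_length(10, "inches")
sale_date = integrate_yymmdd("970131")
```

Each helper handles one of the slide's integration concerns (encoding, unit of attributes, date format), so every source value reaches the warehouse in a single agreed form.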
Volatility of Data
Transactional Storage (Volatile): Record-by-Record Data Manipulation
♦ Insert
♦ Access
♦ Change
♦ Delete
Data Warehouse Storage (Non-Volatile): Mass Load / Access of Data
♦ Load
♦ Access
Time Variant Data Analysis
Transactional Storage: Current Data
Data Warehouse Storage: Historical Data
[Chart: Sales (Region, Year - Year 97 - 1st Qtr). Sales (in lakhs) for January, February and March, by region: East, West, North]
♦ Data warehouses store large volumes of data which are frequently used by DSS
♦ It is maintained separately from the organization’s operational databases
♦ Data warehouses are relatively static with only infrequent updates
♦ A data warehouse is a stand-alone repository of information, integrated from several, possibly heterogeneous operational databases
Data warehousing
♦ Is the enabling technology that facilitates improved business decision-making
♦ Is a process, not a product
♦ Is a technique for assembling and managing a wide variety of data from multiple operational systems for decision support and analytical processing
It’s a journey, not a destination.
Data Warehouse Architecture
[Diagram: Source (Oracle, Teradata, DB2, SQL Server) -> Staging Area -> Data Warehouse (Metadata, Raw Data, Summary Data) -> Data Mart -> Analysis, Reporting, Data Mining]
Source: The database from which data is extracted.
Ex: Oracle, Teradata, Sybase, DB2
Staging area: A temporary storage area used for processing the data.
Metadata: Data about the data, or a description of the data.
Data Mart: A data mart is a data warehouse for a specific domain.
A data mart can be one of two types:
♦ Independent data mart
♦ Dependent data mart
Steps For Building A Data Warehouse
♦ Identify key business drivers, sponsorship and risks
♦ Survey information needs, identify desired functionality and define functional requirements for the initial subject area
♦ Architect the long-term data warehousing architecture
♦ Evaluate and finalize the DW tool and technology
♦ Conduct a proof of concept
♦ Design the target database schema
♦ Build data mapping, extract, transformation, cleansing and aggregation/summarization rules
♦ Build the initial data mart, using an exact subset of the enterprise data warehousing architecture, and expand to the enterprise architecture over subsequent phases
♦ Maintain and administer the data warehouse
Snowflake Schema
Used in the same way as the star schema, but the cube has at least one dimension with two or more levels, under at least two hierarchies.
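As an illustration of a dimension with more than one level, the sketch below builds a tiny snowflake-style layout in an in-memory SQLite database from Python. The table names, columns and rows are hypothetical and are not taken from the slides; the point is only that the product dimension is split into a second level (category) that the fact table reaches through a chain of joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Second level of the product dimension (hypothetical).
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT
);
-- First level of the product dimension, pointing at its parent level.
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES category(category_id)
);
-- Fact table referencing only the lowest dimension level.
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product(product_id),
    quantity   INTEGER
);
INSERT INTO category VALUES (1, 'Hardware');
INSERT INTO product VALUES (10, 'Valve', 1), (11, 'Pipe', 1);
INSERT INTO sales_fact VALUES (10, 3), (11, 5), (10, 2);
""")

# Roll sales up to the category level by walking the dimension levels.
rows = conn.execute("""
    SELECT c.name, SUM(f.quantity)
    FROM sales_fact f
    JOIN product  p ON p.product_id  = f.product_id
    JOIN category c ON c.category_id = p.category_id
    GROUP BY c.name
""").fetchall()
print(rows)
```

In a star schema the category name would sit directly in the product table; splitting it out saves repetition at the cost of an extra join.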
List Of Tools
ETL TOOLS: Informatica, Ascential DataStage, IBM Visual Warehouse, Oracle Warehouse Builder.
OLAP SERVER: Oracle Express Server, Hyperion Essbase, IBM DB2 OLAP Server, Microsoft SQL Server OLAP Services, Seagate HOLOS, SAS/MDDB.
OLAP TOOLS: Oracle Express Suite, Business Objects, Web Intelligence, SAS, Cognos PowerPlay/Impromptu, KALIDO, MicroStrategy, Brio Query, MetaCube.
DATA WAREHOUSE: Oracle, Informix, Teradata, DB2/UDB, Sybase, Microsoft SQL Server.
INTRODUCTION TO INFORMATICA
It is an ETL tool:
♦ Extracting data from sources
♦ Performing the transformations
♦ Loading the data into the target
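The three ETL stages can be sketched in a few lines of plain Python. This is not Informatica code; it is a hand-written illustration, and the CSV contents, column names and target table are assumptions chosen to echo the sales attributes used earlier in the deck.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an operational system.
SOURCE_CSV = "prod_number,quantity_sold,unit_price\nP1,2,10.50\nP2,1,4.00\n"


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: derive a sales amount for each row."""
    return [
        (r["prod_number"], int(r["quantity_sold"]) * float(r["unit_price"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (prod_number TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT prod_number, amount FROM sales ORDER BY prod_number"
).fetchall()
```

An ETL tool provides the same three stages through configuration rather than hand-written code.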
INFORMATICA ARCHITECTURE
[Diagram components: Repository Manager, Repository Admin Console, Informatica Repository Server, Designer, Workflow Manager, Workflow Monitor (validation, session, status); data flow: Source -> Informatica Server -> Target]
Components of Informatica
♦ REPOSITORY MANAGER
♦ DESIGNER
♦ SERVER MANAGER
REPOSITORY MANAGER
♦ REPOSITORY SECURITY
♦ FOLDER MANAGEMENT
♦ METADATA REPORTING
♦ REPOSITORY MAINTENANCE
WINDOWS: NAVIGATOR WINDOW, ANALYSIS WINDOW, DEPENDENCY WINDOW, OUTPUT WINDOW
REPOSITORY SECURITY
♦ CREATE USERS
♦ CREATE GROUPS
♦ ASSIGN PRIVILEGES
♦ MOVE USERS INTO GROUPS
♦ ASSIGN ADDITIONAL PRIVILEGES TO USERS
REPOSITORY SECURITY
♦ LOCK TYPES ( READ, WRITE, EXEC, FETCH, SAVE )
♦ OBJECT LOCKS ( FOLDERS, SOURCE DEF., TARGET DEF. )
♦ VIEW LOCKS ( EDIT| SHOW LOCKS )
♦ UNLOCKING OBJECTS
FOLDER MANAGEMENT
♦ FOLDER ATTRIBUTES
* OWNER
* PERMISSIONS
* SHARED
* SHORTCUT
* VERSIONS
DESIGNER
♦ SOURCE ANALYZER: TO CREATE SOURCE DEFINITIONS
♦ WAREHOUSE DESIGNER: TO CREATE TARGET DEFINITIONS
♦ TRANSFORMATION DEVELOPER: TO CREATE REUSABLE TRANSFORMATIONS
♦ MAPPLET DESIGNER: TO CREATE REUSABLE MAPPINGS
♦ MAPPING DESIGNER: TO CREATE SOURCE TO TARGET MAPPINGS
Designer
Mapping = Source + Transformation + Target
Transformation : 2 Types
Active Transformation
Passive Transformation
ACTIVE TRANSFORMATION
Sorter, Rank, Router, Normalizer, Source Qualifier, Joiner, Aggregator, Advanced External Procedure, Update Strategy, Custom Transformation, Transaction Control, Union
PASSIVE TRANSFORMATION
Lookup, Expression, Stored Procedure, Sequence Generator, External Procedure, XML Source Qualifier
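The difference between the two types is that an active transformation can change the number of rows passing through it, while a passive transformation emits one row for each input row. The Python sketch below imitates an Expression (passive) and an Aggregator (active) on a few hypothetical sales rows; the function names and data are illustrative only and are not Informatica APIs.

```python
def expression_passive(rows):
    """Passive: adds a derived column, one output row per input row."""
    return [{**r, "amount": r["quantity"] * r["unit_price"]} for r in rows]


def aggregator_active(rows):
    """Active: groups by region, so the row count can shrink."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["quantity"]
    return [{"region": region, "quantity": qty} for region, qty in totals.items()]


# Hypothetical input rows.
rows = [
    {"region": "East", "quantity": 2, "unit_price": 5.0},
    {"region": "East", "quantity": 3, "unit_price": 5.0},
    {"region": "West", "quantity": 1, "unit_price": 5.0},
]

passive_out = expression_passive(rows)   # still three rows
active_out = aggregator_active(rows)     # two rows, one per region
```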
SERVER MANAGER
♦ CONFIGURE SERVER
♦ CREATE SESSION
♦ START SESSION
♦ MONITOR SESSION
♦ VIEW LOGS
♦ CORRECT SESSION PROBLEMS
Important Bottlenecks:
♦ TEST SQL QUERY
♦ CHECK SESSION LOG FOR ERRORS
♦ CHECK PERFORMANCE DETAILS
♦ REDUCE NUMBER OF RECORDS PROCESSED
♦ INDEX THE SOURCE
♦ REPLACE DEFAULT QUERY WITH AN OPTIMIZED QUERY
♦ DROP INDEXES BEFORE LOADING
♦ CONSIDER INCREASING COMMIT LEVEL
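Two of these tips, dropping indexes before loading and raising the commit level, can be illustrated outside Informatica with Python and SQLite. The sketch below reads "commit level" as the number of rows written between commits, which is an interpretation rather than something the slide states; the table, index and function names are hypothetical.

```python
import sqlite3


def load_rows(conn, rows, commit_interval):
    """Insert rows, committing once every `commit_interval` rows.

    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target VALUES (?, ?)", row)
        pending += 1
        if pending == commit_interval:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit whatever is left over
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_target_id ON target (id)")

rows = [(i, f"name{i}") for i in range(10)]

# Drop the index so it is not maintained row by row during the load.
conn.execute("DROP INDEX idx_target_id")
commits_small = load_rows(conn, rows[:5], commit_interval=1)  # one commit per row
commits_large = load_rows(conn, rows[5:], commit_interval=5)  # one commit in total
# Rebuild the index once, after the data is in place.
conn.execute("CREATE INDEX idx_target_id ON target (id)")
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

A larger interval means fewer commits for the same number of rows, which is usually where the time saving comes from; the trade-off is that more work is lost if the load fails partway through.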
INSTALLATION AND CONFIGURATION
SYSTEM REQUIREMENTS :
♦ OPERATING SYSTEM ( WINDOWS 95/98/NT 4.0 )
♦ DISK SPACE ( 120 MB )
♦ RAM ( 32 MB )
♦ CONNECTIVITY ( MERANT ODBC 3.50 )
♦ NETWORK SUPPORT ( TCP/IP OR IPX/SPX )
THANK YOU