Atlanta Microsoft Database Forum: Introduction to Data Warehousing Concepts. Presented by Brian Thomas, Solution Builders, Inc., March 8, 2004. Brian.Thomas@SolutionBuilders.com


Page 1: Atlanta Microsoft Database Forum

Introduction to Data Warehousing Concepts

Presented by Brian Thomas, Solution Builders, Inc.

March 8, 2004

Brian.Thomas@SolutionBuilders.com

Page 2: What is a Data Warehouse?

A data warehouse is data collected from one or many systems that exist within and outside the organization. The data is structured so as to reduce the time it takes to produce reliable information.

Page 3: Why Build a Data Warehouse?

• To Provide a Consistent Common Source for Corporate Information

• To Store Large Volumes of Historical Detail Data from Mission-Critical Applications

• To Improve the Ability to Access, Report Against, and Analyze Information

• To Solve or Improve Upon Business Processes

Page 4: Turning Data into Information

Functional Data Warehouse

[Diagram: the Sales System produces System Generated Reports.]

Sales analysis is extrapolated from the system reports.

Page 5: Turning Data into Information

Functional Data Warehouse

[Diagram: the Sales System feeds a Functional Data Warehouse of Sales Information.]

Sales information is available to a wider audience of decision makers.

Page 6: Turning Data into Information

Cross Organizational Functional Data Warehouse

[Diagram: the Sales Systems of Division A, Division B, and Division C feed a Centralized Data Warehouse of Sales Data from across the Organization.]

Analysis is performed and decisions are drawn from the cross-organizational sales data.

Page 7: Turning Data into Information

Cross Functional Data Warehouse

[Diagram: the Sales System, Production Systems, and Marketing System produce System Generated Reports.]

Corporate performance analysis is extrapolated from the system reports.

Page 8: Turning Data into Information

Cross Functional Data Warehouse

[Diagram: the Sales System, Production Systems, and Marketing System feed a Cross Functional Data Warehouse of Information.]

Corporate performance analysis is available to a wider audience.

Page 9: Turning Data into Information

Cross Organizational & Cross Functional Data Warehouse

[Diagram: Division A, Division B, and Division C feed a Centralized Cross Functional Data Warehouse of Information.]

Analysis is performed and decisions are made from the cross-functional organizational performance data.

Page 10: Data Warehouse Architecture

[Diagram: end-to-end architecture. Labels recovered from the slide:]

Source Systems: Division A, Division B, Division C, External Data

Extraction Transformation Load (ETL)

Data Warehouse Components

• Levels: Corporate Level, Business Group Level, Divisional Level

• Stores: Enterprise Data Warehouse; DW / DM (three shown); DM (six shown)

• Axis labels: Increased Level of Standardization; Increased Local Specifications

Data Access & Query Management Services

Management Systems: Planning & Forecasting, Performance Management, Scorecards & Dashboards, Analytics & Modeling, Query & Reporting

Access Methods: Portal / Web Interface, Desktop Applications, Printed Reports, Email, Mobile Devices

Page 11: Data Warehouse Architecture

Source Systems

[Diagram labels: Division A, Division B, Division C, External Data; Extract, Transformation and Load (ETL); Data Staging Area; Data Warehouse Repository]

Page 12: Data Warehouse Architecture

Data Staging Area

• Subject Area Oriented

• Data Structure more closely mirrors Operational System Data Layouts

• Supports Identification of Changed Data

• Acts as a Working Area to Support the Transformation Process
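One way a staging area can support identification of changed data is to keep the previous extract and compare it with the current one. The sketch below is only an illustration under stated assumptions: the customer rows, the key values, and the helper names `row_fingerprint` and `detect_changes` are hypothetical and do not come from the presentation.

```python
import hashlib


def row_fingerprint(row: dict) -> str:
    """Return a stable hash of a row's column values."""
    # Sort by column name so the fingerprint does not depend on dict order.
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two staged snapshots, each keyed by the source primary key."""
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current
        if k in previous
        and row_fingerprint(current[k]) != row_fingerprint(previous[k])
    )
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical staged extracts of a customer table, keyed by customer ID.
yesterday = {
    101: {"name": "Acme", "city": "Atlanta"},
    102: {"name": "Globex", "city": "Denver"},
    103: {"name": "Initech", "city": "Dallas"},
}
today = {
    101: {"name": "Acme", "city": "Atlanta"},
    102: {"name": "Globex", "city": "Chicago"},
    104: {"name": "Umbrella", "city": "Dallas"},
}

changes = detect_changes(yesterday, today)
print(changes)
```

Only the rows reported as inserted, updated, or deleted would then need to move on to the transformation step. Source systems that expose timestamps or change logs may make this comparison unnecessary.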

Page 13: Data Warehouse Architecture

Extraction, Transformation & Load (ETL)

• Perform Attribute Standardization and Cleansing

• Apply Business Rules and Calculations

• Consolidate using Matching and Merge / Purge Logic

• Ensure Proper Linking and Tracking of History
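The consolidation bullet above can be illustrated with a small matching and merge / purge routine. This is a minimal sketch, assuming that two records describe the same entity when their normalized name and postal code agree. The field names, the sample records, and the functions `match_key` and `merge_purge` are hypothetical, and real matching logic is usually fuzzier than this.

```python
def match_key(record: dict) -> tuple:
    """Build a normalized key; records with equal keys are treated as one entity."""
    cleaned = record["name"].lower().replace(".", "").replace(",", "")
    name = " ".join(cleaned.split())  # collapse repeated whitespace
    return (name, record["postal_code"].strip())


def merge_purge(records: list) -> list:
    """Merge records sharing a match key, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        survivor = merged.setdefault(match_key(record), {})
        for field, value in record.items():
            if value and not survivor.get(field):
                survivor[field] = value
    return list(merged.values())


# Hypothetical customer records arriving from different source systems.
records = [
    {"name": "Acme Corp.", "postal_code": "30303", "phone": ""},
    {"name": "ACME  CORP", "postal_code": "30303 ", "phone": "404-555-0100"},
    {"name": "Globex, Inc.", "postal_code": "80202", "phone": "303-555-0199"},
]

consolidated = merge_purge(records)
for row in consolidated:
    print(row)
```

The first two records collapse into a single surviving record that takes its phone number from the second source, while the third record passes through unchanged.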

Page 14: Data Warehouse Architecture

Extraction, Transformation & Load (ETL)

Lookup Function
App. A: Male, Female
App. B: 1, 0
App. C: x, y
App. D: m, f
Data warehouse: Male, Female

Conversion Function
App. A: pipeline (cm)
App. B: pipeline (inches)
App. C: pipeline (mcf)
App. D: pipeline (yds)
Data warehouse: pipeline (cm)

Formatting Function
App. A: Date (julian)
App. B: Date (yyyymmdd)
App. C: Date (mm/dd/yyyy)
App. D: Date (absolute)
Data warehouse: Date (julian)

Merging Function
App. A: Description
App. B: Description
App. C: Description
App. D: Description
Data warehouse: Description

Mapping Function
App. A: balance on hand
App. B: current balance
App. C: cash in house
App. D: balance
Data warehouse: Balance
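The lookup, conversion, and formatting functions on this slide might be sketched as follows. Several details are assumptions rather than anything the slide states: which code stands for which gender in each application, that "julian" means the yyyyddd layout (year plus day of year), and the `strptime` patterns used for applications B and C. The mcf measurement is left out because it is a volume unit with no fixed conversion to centimeters, and the julian and absolute source date formats are not handled.

```python
from datetime import datetime

# Lookup function: map each application's gender codes to one standard value.
# The pairing of codes to values for apps B, C, and D is an assumption.
GENDER_LOOKUP = {
    "A": {"Male": "Male", "Female": "Female"},
    "B": {"1": "Male", "0": "Female"},
    "C": {"x": "Male", "y": "Female"},
    "D": {"m": "Male", "f": "Female"},
}

# Conversion function: centimeters per source unit (length units only).
CM_PER_UNIT = {"cm": 1.0, "inches": 2.54, "yds": 91.44}

# Formatting function: the date layout each application is assumed to use.
DATE_FORMATS = {"B": "%Y%m%d", "C": "%m/%d/%Y"}


def standardize_gender(app: str, code: str) -> str:
    """Translate an application-specific code into the warehouse value."""
    return GENDER_LOOKUP[app][code]


def to_centimeters(value: float, unit: str) -> float:
    """Convert a length measurement to centimeters."""
    return round(value * CM_PER_UNIT[unit], 2)


def to_julian(app: str, text: str) -> str:
    """Reformat a date as yyyyddd (year followed by day of year)."""
    return datetime.strptime(text, DATE_FORMATS[app]).strftime("%Y%j")


print(standardize_gender("B", "1"))
print(to_centimeters(10, "inches"))
print(to_julian("C", "03/08/2004"))
```

Keeping the per-application rules in lookup tables means that adding a fifth source application changes data, not code.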

Page 15: Data Warehouse Architecture

Data Warehouse Repository

• Organized around Conformed Dimensions and Facts

• Promotes Usability and Intuitiveness

• Consolidated and Cross-Functional

• Historical and Atomic Representation of Data

• Insulated from Source System Modifications and Additions

Page 16: Data Warehouse Repository

Star Schema Concepts

Fact Table

This table is the core of the Star Schema structure and contains the Facts or Measures available through the Data Warehouse.

These Facts answer the questions of “What”, “How Much”, or “How Many”.

Some Examples: Sales Dollars, Units Sold, Gross Profit, Expense Amount, Net Income, Unit Cost, Number of Employees, Turnover, Salary, Tenure, etc.

Page 17: Data Warehouse Repository

Star Schema Concepts

Dimension Tables

These tables describe the Facts or Measures. They contain the Attributes and may also be Hierarchical.

These Dimensions answer the questions of “Who”, “What”, “When”, or “Where”.

Some Examples:

• Day, Week, Month, Quarter, Year

• Sales Person, Sales Manager, VP of Sales

• Product, Product Category, Product Line

• Cost Center, Unit, Segment, Business, Company

Page 18: Data Warehouse Repository

Star Schema Concepts

[Diagram: star schema with Sales_Fact joined to five dimension tables]

Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Required Data (Business Metrics) or (Measures), ...

Time_Dim: TimeKey, TheDate, ...

Employee_Dim: EmployeeKey, EmployeeID, ...

Product_Dim: ProductKey, ProductID, ...

Customer_Dim: CustomerKey, CustomerID, ...

Shipper_Dim: ShipperKey, ShipperID, ...
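A cut-down version of this star schema can be tried out with Python's built-in `sqlite3` module. This is only a sketch: it keeps two of the five dimensions, and the `Quarter`, `ProductName`, `SalesDollars`, and `UnitsSold` columns, along with all sample rows, are assumptions added for illustration (the slide elides the non-key columns with "...").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim (TimeKey INTEGER PRIMARY KEY, TheDate TEXT, Quarter TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID TEXT, ProductName TEXT);
CREATE TABLE Sales_Fact (
    TimeKey INTEGER REFERENCES Time_Dim (TimeKey),
    ProductKey INTEGER REFERENCES Product_Dim (ProductKey),
    SalesDollars REAL,   -- measure (assumed column)
    UnitsSold INTEGER    -- measure (assumed column)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO Time_Dim VALUES (?, ?, ?)", [
    (1, "2004-01-15", "Q1"), (2, "2004-02-20", "Q1"), (3, "2004-04-10", "Q2"),
])
conn.executemany("INSERT INTO Product_Dim VALUES (?, ?, ?)", [
    (1, "P-APL", "Apples"), (2, "P-GRP", "Grapes"),
])
conn.executemany("INSERT INTO Sales_Fact VALUES (?, ?, ?, ?)", [
    (1, 1, 100.0, 10), (2, 1, 50.0, 5), (2, 2, 80.0, 4), (3, 2, 120.0, 6),
])

# Dimensions supply the "when" and "what"; the fact table supplies "how much".
rows = conn.execute("""
    SELECT t.Quarter, p.ProductName, SUM(f.SalesDollars), SUM(f.UnitsSold)
    FROM Sales_Fact AS f
    JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY t.Quarter, p.ProductName
    ORDER BY t.Quarter, p.ProductName
""").fetchall()

for row in rows:
    print(row)
```

Every analytical query follows the same shape: join the fact table to the dimensions of interest on their keys, group by dimension attributes, and aggregate the measures.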

Page 19: Data Warehouse Repository

Cube Concepts

[Diagram: a cube with one dimension on each axis]

Product Dimension: Apples, Cherries, Grapes, Melons

Time Dimension: Q1, Q2, Q3, Q4

Markets Dimension: Dallas, Denver, Chicago, Atlanta

Page 20: Data Warehouse Repository

Cube Concepts

[Diagram: the same cube, with one cell labeled Sales Fact]

Product Dimension: Apples, Cherries, Grapes, Melons

Time Dimension: Q1, Q2, Q3, Q4

Markets Dimension: Dallas, Denver, Chicago, Atlanta
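The cube on these two slides can be pictured as a mapping from one member of each dimension to a Sales Fact value. The sketch below assumes a plain dictionary as the storage structure; the sales figures and the `slice_cube` helper are made up for illustration.

```python
# Each cell holds a Sales Fact value and is addressed by one member from
# each dimension: (product, market, quarter). All figures are made up.
cube = {
    ("Apples", "Atlanta", "Q1"): 120,
    ("Apples", "Atlanta", "Q2"): 135,
    ("Apples", "Denver", "Q1"): 90,
    ("Grapes", "Atlanta", "Q1"): 60,
    ("Grapes", "Dallas", "Q2"): 75,
}

DIMENSIONS = ("product", "market", "quarter")


def slice_cube(cells: dict, **fixed: str) -> dict:
    """Return the cells whose coordinates match every fixed dimension member."""
    positions = {DIMENSIONS.index(name): member for name, member in fixed.items()}
    return {
        coords: value
        for coords, value in cells.items()
        if all(coords[i] == member for i, member in positions.items())
    }


# A single cell: one member chosen on every axis.
apples_in_atlanta_q1 = cube[("Apples", "Atlanta", "Q1")]

# A slice: fix one dimension and keep the others free.
q1_slice = slice_cube(cube, quarter="Q1")

# A roll-up: total a slice across the free dimensions.
apples_total = sum(slice_cube(cube, product="Apples").values())

print(apples_in_atlanta_q1, len(q1_slice), apples_total)
```

Fixing a member on one axis corresponds to cutting a plane out of the cube, and fixing members on all three axes picks out a single Sales Fact cell.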

Page 21: Data Warehouse Repository

Storage Concepts

• Relational On-Line Analytical Processing (ROLAP): The information stored in the Data Warehouse is held in a relational structure. Aggregations are performed on the fly, either by the database or in the analysis tool.

• Multidimensional On-Line Analytical Processing (MOLAP): The information is aggregated in a predefined manner based on the characteristics of the Measures and the defined hierarchy of the Dimensions. Because the data is pre-aggregated, navigating through the hierarchies is near-instantaneous: the user is simply navigating to a point within the Multidimensional Cube rather than performing any on-the-fly aggregations.

• Hybrid On-Line Analytical Processing (HOLAP): This is a combination of MOLAP and ROLAP. A portion of the data, typically the set of information that is accessed most frequently, is aggregated in a predefined manner. Additional detail can be held in a ROLAP structure, which allows a user to drill through the MOLAP structure into the ROLAP structure.
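The difference between aggregating on the fly and navigating to a pre-aggregated point can be sketched in a few lines. This is an illustration only, not how any particular OLAP engine stores data: the detail rows, the two-level time hierarchy (year, then quarter), and the function names are all assumptions.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, quarter, product, sales).
detail_rows = [
    (2003, "Q4", "Apples", 40),
    (2004, "Q1", "Apples", 100),
    (2004, "Q1", "Grapes", 80),
    (2004, "Q2", "Apples", 50),
]


def rolap_total(rows, year=None, quarter=None):
    """ROLAP style: scan the detail rows and aggregate at query time."""
    return sum(
        sales
        for y, q, _product, sales in rows
        if (year is None or y == year) and (quarter is None or q == quarter)
    )


def build_molap_aggregates(rows):
    """MOLAP style: pre-aggregate once for every level of the time hierarchy."""
    aggregates = defaultdict(int)
    for y, q, _product, sales in rows:
        aggregates[("all",)] += sales
        aggregates[("year", y)] += sales
        aggregates[("quarter", y, q)] += sales
    return dict(aggregates)


aggregates = build_molap_aggregates(detail_rows)

on_the_fly = rolap_total(detail_rows, year=2004)   # scans every detail row
precomputed = aggregates[("year", 2004)]           # a single lookup

print(on_the_fly, precomputed)
```

Both approaches return the same total. The pre-aggregated version trades extra storage and load-time work for faster queries, and a HOLAP arrangement would answer from `aggregates` where a level exists and fall back to `rolap_total` for detail.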

Page 22: Data Warehouse Repository

Cube Concepts

Client perspective     MOLAP     HOLAP     ROLAP
Query performance      Fastest   Faster    Fast
Storage consumption    High      Medium    Low

Page 23: Where does Microsoft fit in?

[Diagram: the Data Warehouse Architecture diagram from Page 10, repeated with Microsoft products overlaid]

• SQL Server DTS

• SQL Server Relational Database and Analysis Services

• SQL Stored Procedures, SQL Views, MDX, and .NET Web Services

• Microsoft Office, Reporting Services and .NET Framework

• SharePoint Portal, Exchange, and .NET Framework

Page 24: Q & A