-
8/12/2019 Tuesday Introduction to OLAP and Dimensional Modelling
Introduction to OLAP and
Dimensional Modelling
Tuesday
-
Overview: Tuesday
Format      Time           Description
Lecture     10:00 - 10:45  Introduction to OLAP and Analysis Services
Demo        10:45 - 11:30  Dimensional modelling
Lab         12:15 - 13:00  Practical session: Defining a data source and defining and deploying a cube
Lab         13:00 - 13:45  Practical session: Modifying measures, attributes and hierarchies
Lecture     14:30 - 15:15  Observations about design for OLAP and Reporting
Discussion  15:15 - 16:00  Wrap-up: questions and feedback
-
Definition of OLAP
Fast Analysis of Shared Multidimensional
Information (FASMI, Nigel Pendse)
Fast
Analysis (statistical and business logic)
Shared
Multidimensional
Information (all of the data and derived
information needed)
-
Multidimensional
The system must provide a
multidimensional conceptual view of the
data, including full support for hierarchies
and multiple hierarchies, as this is
certainly the most logical way to analyze
businesses and organizations.
-
Alternative definition of OLAP
(from SAS)
OLAP is "fast access to large amounts of summarized data".
This implies the concept of dimensionality: without
dimensions, there would be nothing to summarize the data by.
An alternative definition is that OLAP provides:
"the ability of users to conveniently interrogate
large amounts of data, at varying levels of detail, across a variety of combinations of business dimensions"
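As a rough illustration of "varying levels of detail, across a variety of combinations of business dimensions", the sketch below sums one measure by different subsets of dimensions. The rows, the dimension names and the `summarize` helper are all hypothetical and exist only for this example; an OLAP server would precompute or cache such summaries rather than scan the rows each time.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, product, year, sales_amount).
FACTS = [
    ("Australia", "Bike", 2003, 100.0),
    ("Australia", "Helmet", 2003, 20.0),
    ("Australia", "Bike", 2004, 150.0),
    ("France", "Bike", 2004, 200.0),
    ("France", "Helmet", 2004, 30.0),
]
DIMENSIONS = ("country", "product", "year")


def summarize(facts, by):
    """Sum sales_amount grouped by the named dimensions.

    An empty tuple of dimension names yields the grand total.
    """
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]
    return dict(totals)


# The same data viewed at three levels of detail.
grand_total = summarize(FACTS, ())
by_country = summarize(FACTS, ("country",))
by_country_year = summarize(FACTS, ("country", "year"))

print(grand_total)
print(by_country)
print(by_country_year)
```

Each call picks a different combination of dimensions; with no dimensions at all there is only one number left, which is the point of the remark about dimensionality.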
-
Kimball's Four-Step Design
Process
1. Select a business process
2. Declare the grain
3. Choose dimensions
4. Identify facts
-
STEP 1: Select a business process
For our exercise, we will be looking at
Internet sales
-
A Quick Look at the Data (1)
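USE AdventureWorksDW;
-- Sample five rows from the Internet sales fact table.
-- Without ORDER BY, which five rows come back is not guaranteed.
SELECT TOP 5
    CustomerKey, ProductKey, OrderDateKey,
    SalesAmount
FROM FactInternetSales;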
-
A Quick Look at the Data (2)
CustomerKey  ProductKey  OrderDateKey  SalesAmount
-----------  ----------  ------------  -----------
11003        346         1             3399.99
14501        336         1             699.0982
21768        310         1             3578.27
25863        346         1             3399.99
28389        346         1             3399.99
-
STEP 2: Declare the grain
(What does a row in the fact table mean?)
In our example, a row is an individual
order.
Design rule: recognise the trade-off.
A finer grain facilitates more detailed analysis,
but results in a larger quantity of data.
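The trade-off can be sketched with a few made-up rows: rolling order-grain rows up to one row per day and product gives fewer rows, but the individual orders can no longer be recovered. The data and the `coarsen_to_daily` helper are hypothetical, not taken from AdventureWorksDW.

```python
from collections import defaultdict

# Hypothetical order-grain rows: (order_id, order_date, product, sales_amount).
ORDERS = [
    (1, "2004-07-01", "Bike", 3399.99),
    (2, "2004-07-01", "Bike", 3399.99),
    (3, "2004-07-01", "Helmet", 34.99),
    (4, "2004-07-02", "Bike", 3578.27),
]


def coarsen_to_daily(orders):
    """Roll order-grain rows up to one row per (order_date, product).

    Returns {(order_date, product): (order_count, total_sales)}.
    The order_id is dropped, so per-order analysis is no longer possible.
    """
    daily = defaultdict(lambda: [0, 0.0])
    for _order_id, order_date, product, amount in orders:
        bucket = daily[(order_date, product)]
        bucket[0] += 1
        bucket[1] += amount
    return {key: (count, round(total, 2)) for key, (count, total) in daily.items()}


daily_rows = coarsen_to_daily(ORDERS)
print(len(ORDERS), "rows at order grain")
print(len(daily_rows), "rows at daily grain")
```

Going from the coarse grain back to the fine one is not possible, which is why the grain is declared before dimensions and facts are chosen.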
-
STEP 4: Identify facts
The numeric facts that we will measure
FactInternetSales
-
DimCustomer
    CustomerKey
    GeographyKey
    CustomerAlternateKey
    Title
    FirstName
    MiddleName

DimGeography
    GeographyKey
    City
    StateProvinceCode
    StateProvinceName
    CountryRegionCode
    EnglishCountryRegionName
    SpanishCountryRegionNa...

DimProduct
    ProductKey
    ProductAlternateKey
    ProductSubcategoryKey
    WeightUnitMeasureCode
    SizeUnitMeasureCode
    EnglishProductName
    SpanishProductName
    FrenchProductName
    StandardCost
    FinishedGoodsFlag
    Color
    SafetyStockLevel
    ReorderPoint
    ListPrice

DimTime
    TimeKey
    FullDateAlternateKey
    DayNumberOfWeek
    EnglishDayNameOfWeek
    SpanishDayNameOfWeek
    FrenchDayNameOfWeek
    DayNumberOfMonth

FactInternetSales
    ProductKey
    OrderDateKey
    DueDateKey
    ShipDateKey
    CustomerKey
    PromotionKey
    CurrencyKey
    SalesTerritoryKey
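A minimal sketch of how a fact table is queried through its dimension keys, using an in-memory SQLite database rather than the real AdventureWorksDW. Only a handful of the columns listed above are recreated, and every row value is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, EnglishProductName TEXT);
CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE FactInternetSales (
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    CustomerKey INTEGER REFERENCES DimCustomer(CustomerKey),
    SalesAmount REAL
);
-- Illustrative rows only; keys, names and amounts are invented.
INSERT INTO DimProduct VALUES (1, 'Road Bike'), (2, 'Helmet');
INSERT INTO DimCustomer VALUES (10, 'Ana'), (11, 'Ben');
INSERT INTO FactInternetSales VALUES (1, 10, 1000.0), (1, 11, 1500.0), (2, 10, 50.0);
""")

# Summarize the numeric fact by an attribute of the product dimension.
sales_by_product = conn.execute("""
    SELECT p.EnglishProductName, SUM(f.SalesAmount) AS TotalSales
    FROM FactInternetSales AS f
    JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.EnglishProductName
    ORDER BY p.EnglishProductName
""").fetchall()

# The same fact table, summarized by an attribute of the customer dimension.
sales_by_customer = conn.execute("""
    SELECT c.FirstName, SUM(f.SalesAmount) AS TotalSales
    FROM FactInternetSales AS f
    JOIN DimCustomer AS c ON c.CustomerKey = f.CustomerKey
    GROUP BY c.FirstName
    ORDER BY c.FirstName
""").fetchall()

print(sales_by_product)
print(sales_by_customer)
```

The fact table holds only keys and numeric measures; the descriptive attributes used for grouping live in the dimension tables.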
-
Demo
Dimensional modelling
-
Lab
Defining a data source view
Defining and deploying a cube
-
Lab
Practical session: Modifying measures,
attributes and hierarchies
-
Lecture: Observations about design
for OLAP and Reporting
-
The BI Bottleneck (1)
Report consumers         The report may be electronic, e.g. Excel.
Power users              Capable of some self-service.
Report authors           They know the data and the business.
Reporting administrator  They know the database and the data, but not necessarily how it relates to the business.
Challenge: make reporting more interactive so that changes can be accommodated without passing along the chain.
-
The BI Bottleneck (2)
Typically, analysts' time is the scarce
resource.
The number of iterations is the killer.
Sometimes, testing is the bottleneck.
Possible solution: the analyst spends a bit
more time in the first iteration providing the
business user with a more
generic/interactive report.
-
The BI Bottleneck (3)
Long lead times
High development costs
Apparently small changes to a requirement for a report take a long time to
implement.
For each link along the chain that a change request must pass through, the delay
goes up by a big factor.
-
The Relational Model of Data
Conceptually, homogeneous tabular structure:
Logic: for declarative query language
Algebra: for query optimization
Application interface (e.g. simple reporting tools). Application designers and even some end-users can
(just about) understand tables.
The relational model provided a mutually intelligible
language for implementers, administrators, developers, researchers and even users.
Flexible: join anything with anything (cf. OLAP).
-
Inadequacy of the Relational Model
for Reporting applications
Heterogeneous data sources:
Database, OLAP, XML Web services, etc.
Relational model does not fit well with the
area between storage and presentation.
Aggregation hierarchies
Matrix structures
-
The Microsoft approach: UDM
SQL Server Analysis Services 2005 implements
the UDM (Unified Dimensional Model).
Acts as a bridge between users and their data.
Encapsulates semantics, language and time.
UDM perspectives allow the user to view
subsets.
Integrated with Data Mining. Accessed via SOAP and XML for Analysis.
-
UDM
A UDM provides a single dimensional model for all OLAP analysis and relational reporting needs, so you can use either MDX or SQL.
Perspectives are the new data marts.
Cubes are largely transparent concepts, downgraded to the status of caches.
Commonly you'll only have one cube with multiple measure groups and multiple perspectives.
It's better to think of measure groups instead of cubes; partitions now apply to measure groups.
Whilst a UDM can gather data from numerous data sources, the need to cleanse data still requires a data warehouse.
A cube is structured around dimensional attributes (previously known as member properties) rather than dimensional hierarchies. Hence the virtual dimension, as a term, is now gone and the concept has been converted to a real, first-class dimension.
UDM has five new dimension types: Role Playing, Fact, Reference, Data Mining and Many-to-Many.
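Of the dimension types listed, role playing is the easiest to picture relationally: one dimension table is joined to the fact table several times, once per role. The sketch below assumes an in-memory SQLite database with a cut-down DimTime and FactInternetSales; the dates and amount are invented, and this shows only the relational idea, not how Analysis Services implements it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimTime (TimeKey INTEGER PRIMARY KEY, FullDateAlternateKey TEXT);
CREATE TABLE FactInternetSales (
    OrderDateKey INTEGER REFERENCES DimTime(TimeKey),
    ShipDateKey INTEGER REFERENCES DimTime(TimeKey),
    SalesAmount REAL
);
-- Illustrative rows only; the dates and the amount are invented.
INSERT INTO DimTime VALUES (1, '2001-07-01'), (8, '2001-07-08');
INSERT INTO FactInternetSales VALUES (1, 8, 3399.99);
""")

# DimTime plays two roles here: order date and ship date.
# Each role is a separate alias over the same physical table.
role_rows = conn.execute("""
    SELECT order_date.FullDateAlternateKey AS OrderDate,
           ship_date.FullDateAlternateKey AS ShipDate,
           f.SalesAmount
    FROM FactInternetSales AS f
    JOIN DimTime AS order_date ON order_date.TimeKey = f.OrderDateKey
    JOIN DimTime AS ship_date ON ship_date.TimeKey = f.ShipDateKey
""").fetchall()

print(role_rows)
```

In the schema shown earlier, OrderDateKey, DueDateKey and ShipDateKey in FactInternetSales are natural candidates for this treatment.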
-
The pre-UDM and UDM stacks
Pre-UDM stack
    Dimension model (pivot table)
    Calculations (Excel)
    End-user model (if you are lucky)
    Data source view
    Management settings
UDM stack
    Management settings
    End-user model
    Calculations
    Dimensional model
    Data source view
-
Enterprise BI with UDM
[Diagram labels: DW; Datamart (x2); BI Applications; XML Web Service; MOLAP; Reporting Tool (1); OLAP Browser (2); OLAP Browser (1); Reporting Tool (1); UDM]
-
Desirable features of a BI data
model
The model must facilitate:
re-use of report spare parts by the power
users (rather than just the report authors);
more flexibility for report consumers;
easier maintenance of the set of all reports
used by an enterprise (e.g. avoiding the
reporting chain);
interaction.
-
Current design principles
All about how to make reports look good.
See, for example, Microsoft SQL Server 2005
Report Design: Best Practices and Guidelines
Some focus on maintenance.
No focus on re-use.
-
Wrap-Up
Questions
Feedback