ch05 dss turban data warehouse at
TRANSCRIPT
-
8/13/2019 CH05 DSS Turban Data Warehouse At
1/58
MIS-410: Decision Support System DSS)
Ashis Talukder
Lecturer, MISDhaka University
-
8/13/2019 CH05 DSS Turban Data Warehouse At
2/58
Chapter 5
DATA WAREHOUSING
-
8/13/2019 CH05 DSS Turban Data Warehouse At
3/58
Ashis Talukder, MIS, DU 3
Learning Objectives Understand the basic definitions and concepts
of data warehouses
Understand data warehousing architectures
Describe the processes used in developing andmanaging data warehouses
Explain data warehousing operations
Explain the role of data warehouses in decisionsupport
-
8/13/2019 CH05 DSS Turban Data Warehouse At
4/58
Ashis Talukder, MIS, DU 4
Learning Objectives Explain data integration and the
extraction, transformation, and load (ETL)
processes
Describe real-time (active) data
warehousing
Understand data warehouse
administration and security issues
-
8/13/2019 CH05 DSS Turban Data Warehouse At
5/58
Ashis Talukder, MIS, DU 5
Data WarehousingDefinitions and Concepts Data warehouse
A physical repository where relational data arespecially organized to provide enterprise-wide,
cleansed data in a standardized format Current and historical data
Data are structured to be available in the formready for analytical processing (OLAP), mining,
querying, reporting Subject oriented, integrated, time-variant, non-
volatile collection of data for managementsdecision making.
-
8/13/2019 CH05 DSS Turban Data Warehouse At
6/58
Ashis Talukder, MIS, DU 6
Data WarehousingDefinitions and Concepts Characteristics of data warehousing
Subject oriented
Integrated
Time variant (time series) Nonvolatile
Web based
Relational/multidimensional
Client/server
Real-time
Include metadata
-
8/13/2019 CH05 DSS Turban Data Warehouse At
7/58
Ashis Talukder, MIS, DU 7
Data WarehousingDefinitions and Concepts Data mart
A departmental data warehouse that stores only
relevant data
Subset of DW Dependent data mart
A subset that is created directly from a data warehouse
Independent data mart
A small data warehouse designed for a strategic
business unit or a department but its source is not an
EDW
-
8/13/2019 CH05 DSS Turban Data Warehouse At
8/58
Ashis Talukder, MIS, DU 8
Data WarehousingDefinitions and Concepts Operational data stores (ODS)
A type of database often used as an interim area for a datawarehouse, especially for customer information files
Recent form of customer info file (CIF)
Used for short term decision involving mission-critical application
Like short term memory with recent info unlike DW which is longterm memory with permanent info
Oper marts An operational data mart. An oper mart is a small-scale data mart
typically used by a single department or functional area in anorganization
Data comes from ODS
-
8/13/2019 CH05 DSS Turban Data Warehouse At
9/58
Ashis Talukder, MIS, DU 9
Data WarehousingDefinitions and Concepts Enterprise data warehouse (EDW)
A technology thatprovides a vehicle for pushing data from sourcesystems into a data warehouse
Large scale DW, integrates data from many sources
Used to provide data for many types of DSS:
CRM SCM
Business Performance Monitor (BPM)
Business activity monitor (BAM)
Product lifecycle monitoring (PLM)
Revenue management
Knowledge management System (KMS) Metadata
Data about data. In a data warehouse, metadata describe thecontents of a data warehouse and the manner of its use
-
8/13/2019 CH05 DSS Turban Data Warehouse At
10/58
Ashis Talukder, MIS, DU 10
Data WarehousingProcess Overview Organizations continuously collect data,
information, and knowledge at an
increasingly accelerated rate and store
them in computerized systems
The number of users needing to access the
information continues to increase as a
result of improved reliability and availabilityof network access, especially the Internet
-
8/13/2019 CH05 DSS Turban Data Warehouse At
11/58
Ashis Talukder, MIS, DU 11
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
12/58
Ashis Talukder, MIS, DU 12
Data WarehousingProcess Overview Organizations need to create
data warehousesmassivedata stores of time series datafor decision support.
Data are imported formvarious internal and externalsources, cleansed andorganized in a mannerconsentient with the
organizations need. Then Data mart can be loaded
for any department
-
8/13/2019 CH05 DSS Turban Data Warehouse At
13/58
Ashis Talukder, MIS, DU 13
Data WarehousingProcess Overview The major components
of a data warehousing
process
Data sources
Data extraction
Data loading
Comprehensivedatabase
Metadata
Middleware tools
-
8/13/2019 CH05 DSS Turban Data Warehouse At
14/58
Ashis Talukder, MIS, DU 14
Data WarehousingProcess Overview Data sources
Multiple independent operational legacy system
From external data provider (Census)
From OLTP or enterprise resource planning (ERP)
Web data can be fed as Web logs
Data extraction: By custom-written or commercial software
Data loading Loaded into staging area where transformed and
cleansed
Then data are ready to load in DW
-
8/13/2019 CH05 DSS Turban Data Warehouse At
15/58
Ashis Talukder, MIS, DU 15
Data WarehousingProcess Overview Comprehensive database
This is EDW to support all decision analysis byproviding relevant summarized and detailed informationfrom different sources.
Metadata Includes software programs about data and rules fororganizing data summaries that are easy to index orsearch, with web tools
Middleware tools
Enables access to DW May use SQL queries, Front end application
Ex: data reporting, mining, OLAP, reporting tools,visulizing tool
-
8/13/2019 CH05 DSS Turban Data Warehouse At
16/58
Ashis Talukder, MIS, DU 16
Data Warehousing Architectures Three tire architecture
Two Tire Architecture
One tire Architecture
-
8/13/2019 CH05 DSS Turban Data Warehouse At
17/58
Ashis Talukder, MIS, DU 17
Data Warehousing Architectures Three parts of the data warehouse
The data warehouse that contains the data and
associated software
Data acquisition (back-end) software thatextracts data from legacy systems and external
sources, consolidates and summarizes them,
and loads them into the data warehouse
Client (front-end) software that allows users to
access and analyze data from the warehouse
-
8/13/2019 CH05 DSS Turban Data Warehouse At
18/58
Ashis Talukder, MIS, DU 18
Data Warehousing Architectures
-
8/13/2019 CH05 DSS Turban Data Warehouse At
19/58
Ashis Talukder, MIS, DU 19
Data Warehousing Architectures
-
8/13/2019 CH05 DSS Turban Data Warehouse At
20/58
Ashis Talukder, MIS, DU 20
Data Warehousing Architectures
-
8/13/2019 CH05 DSS Turban Data Warehouse At
21/58
Ashis Talukder, MIS, DU 21
Data Warehousing Architectures Issues to consider when deciding which
architecture to use:
Which database management system (DBMS)
should be used? Will parallel processing and/or partitioning be
used?
Will data migration tools be used to load thedata warehouse?
What tools will be used to support data retrieval
and analysis?
-
8/13/2019 CH05 DSS Turban Data Warehouse At
22/58
Ashis Talukder, MIS, DU 22
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
23/58
Ashis Talukder, MIS, DU 23
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
24/58
Ashis Talukder, MIS, DU 24
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
25/58
-
8/13/2019 CH05 DSS Turban Data Warehouse At
26/58
Ashis Talukder, MIS, DU 26
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
27/58
Ashis Talukder, MIS, DU 27
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
28/58
Ashis Talukder, MIS, DU 28
Data WarehousingProcess Overview
-
8/13/2019 CH05 DSS Turban Data Warehouse At
29/58
Ashis Talukder, MIS, DU 29
Data WarehousingArchitectures
1. Information
interdependencebetween organizational
units
2. Upper managements
information needs
3. Urgency of need for adata warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the datawarehouse prior to
implementation
7. Compatibility with existing
systems
8. Perceived ability of the in-
house IT staff
9. Technical issues
10. Social/political factors
Ten factors that potentially affect the architecture selection
decision (Ariyachandra & Watson 2005):
-
8/13/2019 CH05 DSS Turban Data Warehouse At
30/58
Ashis Talukder, MIS, DU 30
Data Integration and theExtraction, Transformation,and Load ETL) Process Data integration
Integration that comprises three major processes:
data access,
data federation, and change capture.
When these three processes are correctly
implemented, data can be accessed and made
accessible to an array of ETL and analysis toolsand data warehousing environments
-
8/13/2019 CH05 DSS Turban Data Warehouse At
31/58
Ashis Talukder, MIS, DU 31
Data Integration and theExtraction, Transformation,and Load ETL) Process Enterprise application integration (EAI)
A technology thatprovides a vehicle for
pushing data from source systems into a
data warehouse
-
8/13/2019 CH05 DSS Turban Data Warehouse At
32/58
Ashis Talukder, MIS, DU 32
Data Integration and theExtraction, Transformation,and Load ETL) Process Enterprise information integration (EII)
An evolving tool space that promises real-
time data integration from a variety of
sources, such as relational databases, Webservices, and multidimensional databases
-
8/13/2019 CH05 DSS Turban Data Warehouse At
33/58
Ashis Talukder, MIS, DU 33
Data Integration and theExtraction, Transformation,and Load ETL) Process Extraction, transformation, and load (ETL)
A data warehousing process that consists of
Extraction: reading data from a database
Transformation: converting the extracted data from its previous form
into the form in which it needs to be so that it can be
placed into a data warehouse or simply anotherdatabase, and
Load: putting the data into the data warehouse
-
8/13/2019 CH05 DSS Turban Data Warehouse At
34/58
Ashis Talukder, MIS, DU 34
Data Integration and theExtraction, Transformation,and Load ETL) Process
-
8/13/2019 CH05 DSS Turban Data Warehouse At
35/58
Ashis Talukder, MIS, DU 35
Data Integration and theExtraction, Transformation,and Load ETL) Process Issues affect whether an organization will
purchase data transformation tools or build
the transformation process itself
Data transformation tools are expensive
Data transformation tools may have a long
learning curve
It is difficult to measure how the IT organizationis doing until it has learned to use the data
transformation tools
-
8/13/2019 CH05 DSS Turban Data Warehouse At
36/58
-
8/13/2019 CH05 DSS Turban Data Warehouse At
37/58
-
8/13/2019 CH05 DSS Turban Data Warehouse At
38/58
Ashis Talukder, MIS, DU 38
Data Warehouse Development Indirect benefits result from end users using
these direct benefits
Enhance business knowledge
Present competitive advantage
Enhance customer service and satisfaction
Facilitate decision making
Help in reforming business processes
-
8/13/2019 CH05 DSS Turban Data Warehouse At
39/58
Ashis Talukder, MIS, DU 39
Data Warehouse Development Data warehouse vendors
Six guidelines to considered when developing a
vendor list:
1. Financial strength
2. ERP linkages
3. Qualified consultants
4. Market share
5. Industry experience6. Established partnerships
-
8/13/2019 CH05 DSS Turban Data Warehouse At
40/58
Ashis Talukder, MIS, DU 40
Data Warehouse Development Data warehouse development approaches
Inmon Model: EDW approach
Kimball Model: Data mart approach
Which model is best? There is no one-size-fits-all strategy to data
warehousing
One alternative is the hosted warehouse
-
8/13/2019 CH05 DSS Turban Data Warehouse At
41/58
Ashis Talukder, MIS, DU 41
Data Warehouse Development Data warehouse structure: The Star
Schema
Dimensional modeling
A retrieval-based system that supports high-volume query access
Dimension tables
A table that address howdata will be analyzed
-
8/13/2019 CH05 DSS Turban Data Warehouse At
42/58
Ashis Talukder, MIS, DU 42
Data Warehouse Development A Central Fact TableContains
many rows that
correspond to observed
facts
Attribute for decision
analysis,
descriptive attributes for
query reporting
Foreign key to join
dimension tables
Fact Table addressesWHAT data warehouse
supports for decision
analysis
-
8/13/2019 CH05 DSS Turban Data Warehouse At
43/58
Ashis Talukder, MIS, DU 43
Data Warehouse Development A Central Fact TableSurrounded by number of
dimension tables.
Dimension table contains
Classification and
aggregation information Data that describe data
in the fact tables
One-to-many
relationship
Fact Table addresses
HOW data will be
analyzed
-
8/13/2019 CH05 DSS Turban Data Warehouse At
44/58
Ashis Talukder, MIS, DU 44
Data Warehouse Development Grain
A definition of the highest level of detail that is
supported in a data warehouse
Whether DW is highly summarized or havedetailed transaction data
Drill-down
The process of probing beyond a summarized
value to investigate each of the detail
transactions that comprise the summary
-
8/13/2019 CH05 DSS Turban Data Warehouse At
45/58
Ashis Talukder, MIS, DU 45
Data Warehouse Development Data warehousing implementation issues
Implementing a data warehouse is generally a
massive effort that must be planned and
executed according to established methods
There are many facets to the project lifecycle,
and no single person can be an expert in each
area
-
8/13/2019 CH05 DSS Turban Data Warehouse At
46/58
Ashis Talukder, MIS, DU 46
Data Warehouse Development
1. Establishment ofservice-levelagreements and data-refresh requirements
2. Identification of datasources and their
governance policies3. Data quality planning
4. Data model design
5. ETL tool selection
6. Relational databasesoftware and platformselection
7. Data transport
8. Data conversion
9. Reconciliation process
10.Purge (Cleaning) andarchive planning
11.End-user support
Eleven major tasks that could be performed in parallel forsuccessful implementation of a data warehouse (Solomon, 2005) :
-
8/13/2019 CH05 DSS Turban Data Warehouse At
47/58
Ashis Talukder, MIS, DU 47
Data Warehouse Development Some best practices for implementing a
data warehouse (Weir, 2002):
Project must fit with corporate strategy andbusiness objectives
There must be complete buy-in to the projectby executives, managers, and users
It is important to manage user expectationsabout the completed project
The data warehouse must be builtincrementally
Build in adaptability
-
8/13/2019 CH05 DSS Turban Data Warehouse At
48/58
Ashis Talukder, MIS, DU 48
Data Warehouse Development Some best practices for implementing a
data warehouse (Weir, 2002):
The project must be managed by both IT and
business professionals Develop a business/supplier relationship
Only load data that have been cleansed and
are of a quality understood by the organization
Do not overlook training requirements
Be politically aware
-
8/13/2019 CH05 DSS Turban Data Warehouse At
49/58
Ashis Talukder, MIS, DU 49
Data Warehouse Development Failure factors in data warehouse projects:
Cultural issues being ignored
Inappropriate architecture
Unclear business objectives Missing information
Unrealistic expectations
Low levels of data summarization Low data quality
-
8/13/2019 CH05 DSS Turban Data Warehouse At
50/58
Ashis Talukder, MIS, DU 50
Data Warehouse Development Issues to consider to build a successful
data warehouse:
Starting with the wrong sponsorship chain
Setting expectations that you cannot meet andfrustrating executives at the moment of truth
Engaging in politically naive behavior
Loading the warehouse with information just
because it is available
-
8/13/2019 CH05 DSS Turban Data Warehouse At
51/58
Ashis Talukder, MIS, DU 51
Data Warehouse Development Issues to consider to build a successful
data warehouse:
Believing that data warehousing databasedesign is the same as transactional database
design Choosing a data warehouse manager who is
technology oriented rather than user oriented
Focusing on traditional internal record-orienteddata and ignoring the value of external dataand of text, images, and, perhaps, sound andvideo
-
8/13/2019 CH05 DSS Turban Data Warehouse At
52/58
Ashis Talukder, MIS, DU 52
Data Warehouse Development Issues to consider to build a successful
data warehouse:
Delivering data with overlapping and confusing
definitions Believing promises of performance, capacity,
and scalability
Believing that your problems are over when the
data warehouse is up and running
Focusing on ad hoc data mining and periodic
reporting instead of alerts
-
8/13/2019 CH05 DSS Turban Data Warehouse At
53/58
Ashis Talukder, MIS, DU 53
Real-Time Data Warehousing Real-time (active) data warehousing
The process of loading and providing data
via a data warehouse as they become
available
-
8/13/2019 CH05 DSS Turban Data Warehouse At
54/58
Ashis Talukder, MIS, DU 54
Real-Time Data Warehousing Levels of data warehouses:
1. Reports what happened
2. Some analysis occurs
3. Provides prediction capabilities,4. Operationalization
5. Becomes capable of making events happen
-
8/13/2019 CH05 DSS Turban Data Warehouse At
55/58
Ashis Talukder, MIS, DU 55
Real-Time DataWarehousing
-
8/13/2019 CH05 DSS Turban Data Warehouse At
56/58
Ashis Talukder, MIS, DU 56
Real-Time Data Warehousing The need for real-time data
A business often cannot afford to wait a whole day for
its operational data to load into the data warehouse for
analysis
Provides incremental real-time data showing everystate change and almost analogous patterns over time
Maintaining metadata in sync is possible
Less costly to develop, maintain, and secure one huge
data warehouse so that data are centralized for BI/BAtools
An EAI with real-time data collection can reduce or
eliminate the nightly batch processes
-
8/13/2019 CH05 DSS Turban Data Warehouse At
57/58
Ashis Talukder, MIS, DU 57
Data WarehouseAdministration and Security Issues
Data warehouse administrator (DWA)A person responsible for the administration
and management of a data warehouse
-
8/13/2019 CH05 DSS Turban Data Warehouse At
58/58
Ashis Talukder MIS DU 58
Data WarehouseAdministration and Security Issues
Effective security in a data warehouseshould focus on four main areas:
Establishing effective corporate and securitypolicies and procedures
Implementing logical security procedures andtechniques to restrict access
Limiting physical access to the data centerenvironment
Establishing an effective internal control reviewprocess with an emphasis on security andprivacy