ch05 dss turban data warehouse at

Upload: suriya-shopna

Post on 04-Jun-2018


  • 8/13/2019 CH05 DSS Turban Data Warehouse At


    MIS-410: Decision Support System (DSS)

    Ashis Talukder

    Lecturer, MIS, Dhaka University


    Chapter 5

    DATA WAREHOUSING


    Learning Objectives

    Understand the basic definitions and concepts of data warehouses
    Understand data warehousing architectures
    Describe the processes used in developing and managing data warehouses
    Explain data warehousing operations
    Explain the role of data warehouses in decision support


    Learning Objectives (continued)

    Explain data integration and the extraction, transformation, and load (ETL) processes
    Describe real-time (active) data warehousing
    Understand data warehouse administration and security issues


    Data Warehousing Definitions and Concepts

    Data warehouse
      A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format
      Holds current and historical data
      Data are structured to be available in a form ready for analytical processing (OLAP), mining, querying, and reporting
      A subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management's decision making


    Data Warehousing Definitions and Concepts

    Characteristics of data warehousing
      Subject oriented
      Integrated
      Time variant (time series)
      Nonvolatile
      Web based
      Relational/multidimensional
      Client/server
      Real-time
      Includes metadata


    Data Warehousing Definitions and Concepts

    Data mart
      A departmental data warehouse that stores only relevant data; a subset of a data warehouse (DW)
    Dependent data mart
      A subset that is created directly from a data warehouse
    Independent data mart
      A small data warehouse designed for a strategic business unit or a department, but whose source is not an enterprise data warehouse (EDW)
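A dependent data mart can be pictured as a departmental slice carved directly out of the warehouse. The following is a minimal Python sketch under stated assumptions: the `warehouse` rows, the `department` field, and the `build_dependent_mart` helper are hypothetical names chosen for illustration and do not come from the slides.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical enterprise warehouse rows (illustrative data only).
warehouse: List[Row] = [
    {"department": "sales", "year": 2017, "amount": 120.0},
    {"department": "sales", "year": 2018, "amount": 150.0},
    {"department": "hr", "year": 2018, "amount": 40.0},
]


def build_dependent_mart(dw: List[Row], department: str) -> List[Row]:
    """Copy only the warehouse rows that are relevant to one department."""
    return [dict(row) for row in dw if row["department"] == department]


# The mart is derived from the warehouse, so it is a dependent data mart.
sales_mart = build_dependent_mart(warehouse, "sales")
```

An independent data mart would be filled from its own sources instead of from `warehouse`, which is why its contents can drift from the enterprise view.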


    Data Warehousing Definitions and Concepts

    Operational data store (ODS)
      A type of database often used as an interim area for a data warehouse, especially for customer information files
      A recent form of the customer information file (CIF)
      Used for short-term decisions involving mission-critical applications
      Like short-term memory holding only recent information, unlike a DW, which is like long-term memory holding permanent information
    Oper mart
      An operational data mart: a small-scale data mart typically used by a single department or functional area in an organization
      Its data come from an ODS


    Data Warehousing Definitions and Concepts

    Enterprise data warehouse (EDW)
      A large-scale data warehouse, used across the enterprise for decision support, that integrates data from many sources
      Used to provide data for many types of DSS:
        Customer relationship management (CRM)
        Supply chain management (SCM)
        Business performance management (BPM)
        Business activity monitoring (BAM)
        Product lifecycle management (PLM)
        Revenue management
        Knowledge management systems (KMS)
    Metadata
      Data about data. In a data warehouse, metadata describe the contents of the warehouse and the manner of its use


    Data Warehousing Process Overview

    Organizations continuously collect data, information, and knowledge at an increasingly accelerated rate and store them in computerized systems
    The number of users needing to access the information continues to increase as a result of improved reliability and availability of network access, especially the Internet


    Data Warehousing Process Overview

    Organizations need to create data warehouses: massive data stores of time-series data for decision support
    Data are imported from various internal and external sources, then cleansed and organized in a manner consistent with the organization's needs
    Data marts can then be loaded for any department


    Data Warehousing Process Overview

    The major components of a data warehousing process:
      Data sources
      Data extraction
      Data loading
      Comprehensive database
      Metadata
      Middleware tools


    Data Warehousing Process Overview

    Data sources
      Multiple independent operational legacy systems
      External data providers (e.g., census data)
      OLTP or enterprise resource planning (ERP) systems
      Web data, fed in as Web logs
    Data extraction
      Performed by custom-written or commercial software
    Data loading
      Data are loaded into a staging area, where they are transformed and cleansed
      The data are then ready to load into the DW


    Data Warehousing Process Overview

    Comprehensive database
      The EDW, which supports all decision analysis by providing relevant summarized and detailed information from different sources
    Metadata
      Include software programs about data and rules for organizing data summaries that are easy to index or search, especially with Web tools
    Middleware tools
      Enable access to the DW
      May use SQL queries or front-end applications
      Examples: reporting, data mining, OLAP, and visualization tools


    Data Warehousing Architectures

    Three-tier architecture
    Two-tier architecture
    One-tier architecture


    Data Warehousing Architectures

    Three parts of the data warehouse:
      The data warehouse itself, which contains the data and associated software
      Data acquisition (back-end) software, which extracts data from legacy systems and external sources, consolidates and summarizes them, and loads them into the data warehouse
      Client (front-end) software, which allows users to access and analyze data from the warehouse


    Data Warehousing Architectures

    Issues to consider when deciding which architecture to use:
      Which database management system (DBMS) should be used?
      Will parallel processing and/or partitioning be used?
      Will data migration tools be used to load the data warehouse?
      What tools will be used to support data retrieval and analysis?


    Data Warehousing Architectures

    Ten factors that potentially affect the architecture selection decision (Ariyachandra & Watson, 2005):
      1. Information interdependence between organizational units
      2. Upper management's information needs
      3. Urgency of need for a data warehouse
      4. Nature of end-user tasks
      5. Constraints on resources
      6. Strategic view of the data warehouse prior to implementation
      7. Compatibility with existing systems
      8. Perceived ability of the in-house IT staff
      9. Technical issues
      10. Social/political factors


    Data Integration and the Extraction, Transformation, and Load (ETL) Process

    Data integration
      Integration that comprises three major processes: data access, data federation, and change capture
      When these three processes are correctly implemented, data can be accessed and made accessible to an array of ETL and analysis tools and data warehousing environments


    Data Integration and the Extraction, Transformation, and Load (ETL) Process

    Enterprise application integration (EAI)
      A technology that provides a vehicle for pushing data from source systems into a data warehouse


    Data Integration and the Extraction, Transformation, and Load (ETL) Process

    Enterprise information integration (EII)
      An evolving tool space that promises real-time data integration from a variety of sources, such as relational databases, Web services, and multidimensional databases


    Data Integration and the Extraction, Transformation, and Load (ETL) Process

    Extraction, transformation, and load (ETL)
      A data warehousing process that consists of:
        Extraction: reading data from a database
        Transformation: converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse or simply another database
        Load: putting the data into the data warehouse
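The three ETL steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the `SOURCE_ROWS` records, the `sales` table, and the cleansing rules (trim and title-case names, drop rows whose amount is not numeric) are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical source records; in practice they would be read from an
# operational database, a legacy system, or an external provider.
SOURCE_ROWS: List[Dict[str, str]] = [
    {"customer": " alice ", "amount": "10.50", "date": "2018-06-01"},
    {"customer": "BOB", "amount": "n/a", "date": "2018-06-02"},
    {"customer": "Carol", "amount": "7", "date": "2018-06-02"},
]


def extract() -> List[Dict[str, str]]:
    """Extraction: read the raw rows from the source."""
    return list(SOURCE_ROWS)


def transform(rows: List[Dict[str, str]]) -> List[Tuple[str, float, str]]:
    """Transformation: cleanse and standardize; skip rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # non-numeric amount, so the row is rejected
        clean.append((row["customer"].strip().title(), amount, row["date"]))
    return clean


def load(conn: sqlite3.Connection, rows: List[Tuple[str, float, str]]) -> int:
    """Load: put the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
loaded = load(conn, transform(extract()))
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
```

Keeping the three steps as separate functions mirrors the staging idea from the process overview: rows are cleansed before they ever reach the warehouse table.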


    Data Integration and the Extraction, Transformation, and Load (ETL) Process

    Issues that affect whether an organization will purchase data transformation tools or build the transformation process itself:
      Data transformation tools are expensive
      Data transformation tools may have a long learning curve
      It is difficult to measure how the IT organization is doing until it has learned to use the data transformation tools


    Data Warehouse Development

    Indirect benefits result from end users using these direct benefits:

    Enhance business knowledge

    Present competitive advantage

    Enhance customer service and satisfaction

    Facilitate decision making

    Help in reforming business processes


    Data Warehouse Development

    Data warehouse vendors
      Six guidelines to consider when developing a vendor list:
        1. Financial strength
        2. ERP linkages
        3. Qualified consultants
        4. Market share
        5. Industry experience
        6. Established partnerships


    Data Warehouse Development

    Data warehouse development approaches
      Inmon model: EDW approach
      Kimball model: data mart approach
    Which model is best?
      There is no one-size-fits-all strategy for data warehousing
      One alternative is the hosted warehouse


    Data Warehouse Development

    Data warehouse structure: the star schema
    Dimensional modeling
      A retrieval-based system that supports high-volume query access
    Dimension tables
      Tables that address how data will be analyzed


    Data Warehouse Development

    A central fact table
      Contains many rows that correspond to observed facts
      Holds attributes for decision analysis and descriptive attributes for query reporting
      Holds foreign keys to join to the dimension tables
      The fact table addresses WHAT the data warehouse supports for decision analysis


    Data Warehouse Development

    The central fact table is surrounded by a number of dimension tables
    A dimension table contains
      Classification and aggregation information
      Data that describe the data in the fact table
    Each dimension table has a one-to-many relationship with the fact table
    Dimension tables address HOW data will be analyzed
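A small star schema makes the fact/dimension split concrete. The sketch below uses Python's built-in `sqlite3` module; the table names (`fact_sales`, `dim_product`, `dim_date`), their columns, and the sample rows are assumptions invented for illustration, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: describe HOW the facts can be sliced.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY, name TEXT, category TEXT
    );
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER
    );
    -- Central fact table: WHAT is measured, plus foreign keys to dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units INTEGER,
        revenue REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Hardware'), (2, 'Mouse', 'Hardware'), (3, 'Antivirus', 'Software');
    INSERT INTO dim_date VALUES (1, 2018, 1), (2, 2018, 2);
    INSERT INTO fact_sales VALUES
        (1, 1, 2, 2000.0), (2, 1, 10, 200.0), (3, 2, 5, 250.0), (1, 2, 1, 1000.0);
    """
)

# Join the fact table to one dimension and aggregate by a dimension attribute.
revenue_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
```

Each product row in `dim_product` can match many rows in `fact_sales`, which is the one-to-many relationship between a dimension table and the fact table.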


    Data Warehouse Development

    Grain
      A definition of the highest level of detail that is supported in a data warehouse
      Determines whether the DW is highly summarized or holds detailed transaction data
    Drill-down
      The process of probing beyond a summarized value to investigate each of the detail transactions that comprise the summary
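Grain and drill-down are easiest to see on a toy data set. The following Python sketch assumes data stored at transaction grain; the `transactions` tuples and the `summarize` and `drill_down` helpers are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Detail rows stored at transaction grain: (region, order_id, amount).
Transaction = Tuple[str, int, float]

transactions: List[Transaction] = [
    ("East", 101, 40.0),
    ("East", 102, 60.0),
    ("West", 201, 25.0),
]


def summarize(rows: List[Transaction]) -> Dict[str, float]:
    """Roll the detail rows up to one total per region."""
    totals: Dict[str, float] = defaultdict(float)
    for region, _order_id, amount in rows:
        totals[region] += amount
    return dict(totals)


def drill_down(rows: List[Transaction], region: str) -> List[Transaction]:
    """Return the detail transactions behind one summarized value."""
    return [row for row in rows if row[0] == region]


summary = summarize(transactions)
east_detail = drill_down(transactions, "East")
```

Drill-down only works here because the detail rows were kept. A warehouse whose grain stops at regional totals could not answer the same question.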


    Data Warehouse Development

    Data warehousing implementation issues
      Implementing a data warehouse is generally a massive effort that must be planned and executed according to established methods
      There are many facets to the project lifecycle, and no single person can be an expert in each area


    Data Warehouse Development

    Eleven major tasks that could be performed in parallel for successful implementation of a data warehouse (Solomon, 2005):
      1. Establishment of service-level agreements and data-refresh requirements
      2. Identification of data sources and their governance policies
      3. Data quality planning
      4. Data model design
      5. ETL tool selection
      6. Relational database software and platform selection
      7. Data transport
      8. Data conversion
      9. Reconciliation process
      10. Purge (cleaning) and archive planning
      11. End-user support


    Data Warehouse Development

    Some best practices for implementing a data warehouse (Weir, 2002):
      The project must fit with corporate strategy and business objectives
      There must be complete buy-in to the project by executives, managers, and users
      It is important to manage user expectations about the completed project
      The data warehouse must be built incrementally
      Build in adaptability


    Data Warehouse Development

    Some best practices for implementing a data warehouse (Weir, 2002), continued:
      The project must be managed by both IT and business professionals
      Develop a business/supplier relationship
      Only load data that have been cleansed and are of a quality understood by the organization
      Do not overlook training requirements
      Be politically aware


    Data Warehouse Development

    Failure factors in data warehouse projects:
      Cultural issues being ignored
      Inappropriate architecture
      Unclear business objectives
      Missing information
      Unrealistic expectations
      Low levels of data summarization
      Low data quality


    Data Warehouse Development

    Issues to consider (pitfalls to avoid) in building a successful data warehouse:
      Starting with the wrong sponsorship chain
      Setting expectations that you cannot meet and frustrating executives at the moment of truth
      Engaging in politically naive behavior
      Loading the warehouse with information just because it is available


    Data Warehouse Development

    Issues to consider (pitfalls to avoid) in building a successful data warehouse, continued:
      Believing that data warehousing database design is the same as transactional database design
      Choosing a data warehouse manager who is technology oriented rather than user oriented
      Focusing on traditional internal record-oriented data and ignoring the value of external data and of text, images, and, perhaps, sound and video


    Data Warehouse Development

    Issues to consider (pitfalls to avoid) in building a successful data warehouse, continued:
      Delivering data with overlapping and confusing definitions
      Believing promises of performance, capacity, and scalability
      Believing that your problems are over when the data warehouse is up and running
      Focusing on ad hoc data mining and periodic reporting instead of alerts


    Real-Time Data Warehousing

    Real-time (active) data warehousing
      The process of loading and providing data via a data warehouse as they become available


    Real-Time Data Warehousing

    Levels of data warehouses:
      1. Reports what happened
      2. Some analysis occurs
      3. Provides prediction capabilities
      4. Operationalization
      5. Becomes capable of making events happen


    Real-Time Data Warehousing

    The need for real-time data
      A business often cannot afford to wait a whole day for its operational data to load into the data warehouse for analysis
      Provides incremental real-time data showing every state change and almost analogous patterns over time
      Keeping metadata in sync is possible
      It is less costly to develop, maintain, and secure one huge data warehouse so that data are centralized for BI/BA tools
      An EAI with real-time data collection can reduce or eliminate the nightly batch processes
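One way to picture loading data "as they become available" is an incremental load driven by a watermark, so each run picks up only the events that arrived since the previous run. The sketch below is a simplified illustration under assumptions: the `Warehouse` class, its `last_seq` watermark, and the `(sequence, payload)` event shape are hypothetical and do not describe any particular product.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # (sequence number, payload)


class Warehouse:
    """Toy warehouse that loads only events it has not seen yet."""

    def __init__(self) -> None:
        self.rows: List[Event] = []
        self.last_seq = 0  # watermark: highest sequence number loaded so far

    def load_incremental(self, source: List[Event]) -> int:
        """Load events newer than the watermark and return how many were added."""
        new_rows = [event for event in source if event[0] > self.last_seq]
        self.rows.extend(new_rows)
        if new_rows:
            self.last_seq = max(seq for seq, _payload in new_rows)
        return len(new_rows)


source: List[Event] = [(1, "order created")]
dw = Warehouse()
first = dw.load_incremental(source)   # picks up event 1

source.append((2, "order paid"))
source.append((3, "order shipped"))
second = dw.load_incremental(source)  # picks up events 2 and 3 only
third = dw.load_incremental(source)   # nothing new to load
```

Running `load_incremental` frequently, rather than once per night over the full source, is what shrinks the delay between an operational event and its availability for analysis.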


    Data Warehouse Administration and Security Issues

    Data warehouse administrator (DWA)
      A person responsible for the administration and management of a data warehouse


    Data Warehouse Administration and Security Issues

    Effective security in a data warehouse should focus on four main areas:
      Establishing effective corporate and security policies and procedures
      Implementing logical security procedures and techniques to restrict access
      Limiting physical access to the data center environment
      Establishing an effective internal control review process with an emphasis on security and privacy