professional issues in computing · pdf filetelenor, ufone, mobilink, warid, zong ??? ... data...

48
Data Warehousing (Introduction, Types & Applications) Naveed Iqbal, Assistant Professor NUCES, Islamabad Campus (Lecture Slides Week # 2)

Upload: phunghuong

Post on 08-Feb-2018

228 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

Data Warehousing(Introduction, Types & Applications)

Naveed Iqbal, Assistant Professor

NUCES, Islamabad Campus(Lecture Slides Week # 2)

Page 2: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 2

Why a Data Warehouse (DWH)?

Data recording and storage is growing: Almost every industry has huge amount of operational data.

Careful use/analysis of historic information may result inexcellent prediction for the future: Knowledge worker wants to turn available data into useful

information.

This information is used by them to support strategic decisionmaking.

Gives total view of the organization: It is a platform for consolidated historical data for analysis.

It stores data of good quality so that knowledge worker canmake correct decisions.

Intelligent decision-support is required for decision-making.

Page 3: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 3

Why a Data Warehouse? (Contd.)

From business perspective:

It is latest marketing weapon.

Helps to keep customers by learning more about

their needs.

Valuable tool in today‟s competitive fast evolving

world.

Page 4: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 4

Reason-I: Why a Data Warehouse (DWH)?

Data sets are growing:

How Much Data is that?1 MB 220 or 106 bytes Small novel 3½ Disk.

1 GB 230 or 109 bytesPaper reams that could fill the back of a

pickup van.

1 TB 240 or 1012 bytes50,000 trees chopped and converted into

paper and printed.

2 PB 1 PB = 250 or 1015 bytes Academic research libraries across USA.

5 EB 1 EB = 260 or 1018 bytesAll words ever spoken by the Human

Beings.

Page 5: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 5

Reason-I: Why a Data Warehouse (DWH)?

Size of Data Sets are going up.

Cost of Data Storage is coming down. The amount of data average business collects and stores is

doubling every year.

Total hardware and software cost to store and manage 1 MB ofdata: 1990: $ 15

2002: ¢ 15 (down 100 times)

2010: < ¢ 1 (down 150 times)

A few examples: Wall Mart: 24+ TB

Finance Telecom: 100+ TB

CERN: Upto 20 PB by 2006

Stanford Linear Accelerator Center (SLAC): 500 TB

Telenor, Ufone, Mobilink, Warid, Zong ???

Page 6: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 6

Caution!

A Warehouse of Data

is NOT a

Data Warehouse.

Page 7: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 7

Caution!

Size

is NOT

Everything.

Page 8: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 8

Reason-2: Why a Data Warehouse (DWH)?

DBMS Approach

List of all items that were soldlast month?

List of all makeup itemspurchased by Sassi?

The total sales of the last monthgrouped by branch?

How many sales transactionsoccurred during the month ofJanuary?

Intelligent Enterprise

Which items sell together?Which items to stock?

Where and how to place theitems? What discounts to offer?

How best to target customers toincrease sales at a branch?

Which customers are most likelyto respond to my nextpromotional campaign, and why?

Businesses demand Intelligence (BI).

Complex questions from integrated data.

“Intelligent Enterprise”

Page 9: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 9

Reason-3: Why a Data Warehouse (DWH)?

Businesses want much more …

What happened?

Why it happened?

What will happen?

What is happening?

What do you want to happen?

Page 10: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 10

What is a Data Warehouse?

A complete repository of historical

corporate data extracted from

transaction systems that is

available for ad-hoc access by

knowledge workers.

Page 11: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 11

What is a Data Warehouse?

Transaction System: Management Information System (MIS)

Could be typed sheets (NOT transaction system)

Ad-Hoc Access: Does not have a certain access pattern

Queries not known in advance

Difficult to write SQL in advance

Knowledge Workers: Typically NOT IT literate (Executives, Analysts, Managers)

NOT clerical workers

Decision makers

Page 12: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 12

What is a Data Warehouse?

Inmons‟s Definition:

A Data Warehouse is:

Subject-oriented

Integrated

Time-variant

Nonvolatile

Collection of data in support of

management‟s decision making process.

Page 13: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 13

Another View of a DWH

Subject

Oriented

Integrated

Time Variant

Non Volatile

Page 14: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 14

Subject-oriented

Data Warehouse is organized around subjects such as sales,

product, customer.

It focuses on modeling and analysis of data for decision makers.

Excludes data not useful in decision support process.

Page 15: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 15

Integration

Data Warehouse is constructed by integrating multiple

heterogeneous sources.

Data Preprocessing are applied to ensure consistency.

RDBMS

Legacy

System

Data

Warehouse

Flat File Data Processing

Data Transformation

Page 16: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 16

Time-variant

Provides information from historical perspective e.g.

past 5-10 years.

Every key structure contains either implicitly or

explicitly an element of time.

Page 17: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 17

Nonvolatile

Data once recorded cannot be updated.

Data Warehouse requires two operations in dataaccessing

Initial loading of data

Access of data

load

access

Page 18: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 18

Summary: What is a Data Warehouse?

It is a blend of many technologies, the basicconcept being: Take all data from different operational systems

If necessary, add relevant data from industry

Transform all data and bring into a uniform format

Integrate all data as a single entity

Store data in a format supporting easy access fordecision support

Create performance enhancing indices

Implement performance enhancement joins

Run ad-hoc queries with slow selectivity

Page 19: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 19

Benefits of Data Warehouse

High returns on investment.

Substantial competitive advantage.

Increased productivity of corporate decision-makers.

Fast reporting for decision making process.

Reduced reporting load on transactional systems.

Making institutional data more user-friendly andaccessible for knowledge workers.

Integrated data from different source systems.

Enabled „point-in-time‟ analysis and trending overtime.

Helps in identifying and resolving data integrity issues,either in the warehouse itself or in the source systemsthat collect the data.

Page 20: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 20

Data Warehouse: How is it Different?

1. Decision making is Ad-Hoc

Page 21: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 21

Data Warehouse: How is it Different?

2. Different patterns of hardware utilization

Bus Service vs. Train

Page 22: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 22

Data Warehouse: How is it Different?

3. Combines operational and historic data

Don‟t do data entry into a DWH. OLTP or ERP are the

source systems.

OLTP systems don‟t keep history, cannot get balance

statement more than a year old.

DWH keep historical data, even of bygone customers.

Why?

In the context of bank, want to know why the customer

left?

What are the events that led to his/her leaving? Why?

Customer retention

Page 23: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 23

Data Warehouse: How is it Different?

How much history? Depends on:

Industry

Cost of storing historical data

Economic value of historical data

Industry and history

Telecom calls are much much more as compared to banktransactions

18 months

Retailers interested in analyzing yearly seasonal patterns 65 weeks, why?

Insurance companies want to do actuary analysis, use thehistorical data in order to predict risk

7 years

Hence NOT a complete repository of data.

Page 24: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 24

Data Warehouse: How is it Different?

How much history?

Economic value of data vs. storage cost

Data Warehouse a complete repository of

data?

Page 25: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 25

Data Warehouse: How is it Different?

4. Usually (but not always) periodic or batch updatesrather than real-time

The boundary is blurring for active data warehousing.

For an ATM, if update not in real-time, then lot of realtrouble.

DWH is for strategic decision making based onhistorical data, would not hurt if transactions of lastone hour or day are absent.

Rate of update depends on:

Volume of data

Nature of business

Cost of keeping historical data

Benefit of keeping historical data

Page 26: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 26

Data Warehouse: How is it Different?

5. Starts with 6x12 availability requirement … but 7x24usually becomes the goal.

Decision makers typically don‟t work 24 hrs a day and7 days a week. An ATM system does.

Once decision makers start using the DWH, and startreaping the benefits, they start liking it.

Start using the DWH more often, till want it available100% of the time.

For business across the globe, 50% of the world maybe sleeping at any one time, but the businesses areup 100% of the time.

100% availability not a trivial task, need to take intoaccount loading strategies, refresh rates etc.

Page 27: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 27

Data Warehouse: How is it Different?

6. Does not follows the traditional development model

• Requirements gathering

• Analysis

• Design

• Programming

• Testing

• Integration

• Implementation

• Design warehouse

• Integrate data

• Test for biasness / incorrectness

• Program w.r.t. data

• Design DSS system

• Analyze results

• Understand requirements

Page 28: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 28

Data Warehouse: How is it Different?

7. Comparison of response times

OLAP (Online Analytical Processing) queries must be

executed in a small number of seconds.

Often requires de-normalization and/or sampling

Complex query scripts and large list selections can

generally be executed in a small number of minutes.

Sophisticated clustering algorithms e.g. data mining

can generally be executed in a small number of hours

(even for hundreds of thousands of customers).

Page 29: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 29

Data Warehouse: How is it Different?

8. Data Warehouse vs. OLTP (Online Transaction Processing)

OLTP: Select tx_date, balance from tx_table where account_ID = 829;

DWH

Select balance, age, sal, gender from customer_table and tx_table

where age between (30 and 40) and education = „graduate‟ and

custID.customer_table = customer_ID.tx_table;

OLTP

Primary key used

No concept of primary index

May use a single table

Normally few rows returned

High selectivity of query

Indexing on primary key (unique)

DWH

Primary key NOT used

Primary index used

Mostly uses multiple tables

Normally many rows returned

Low selectivity of query

Indexing on primary index (non-

unique)

Page 30: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 30

DWH vs. OLTP: Summary

DWH

Application neutral

Single source of „truth‟

Evolves over time

How to improve business

Historical, detailed data

Some summary

Lightly de-normalized

Hardly uses PK

No. of returned results in Ks

Minutes to hours

Typical availability 12x6

OLTP

Application specific

Multiple databases with repetition

Off the shelf application

Runs the business

Operational data

No summary

Fully normalized

Based on PK

No. of returned results in 100s

Sub seconds to seconds

Typical availability 24x7

Scope

Data

Perspective

Queries

Time Factor

Page 31: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 31

DWH

Informational processing

Analysis

Knowledge workers

Decision support

Subject oriented

Complex query

Mostly read

Information out

High flexibility / autonomy

Query throughput

OLTP

Operational processing

Transaction

Clerks, DBAs etc.

Day to day operation

Application oriented

Short, simple transaction

Read / write

Data in

High performance / availability

Transaction throughput

Characteristics

Orientation

Users

Function

DB Design

Unit of work

Access

Focus

Priority

Metric

DWH vs. OLTP: Summary

Page 32: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 32

Putting the pieces together

Data Warehouse Server

(Tier 1)

Data

Warehouse

Operational

Data Bases

Semistructured

Sources Query/Reporting

Data Marts

MOLAP

ROLAP

Clients

(Tier 3)

Tools

Meta

Data

Data sources

Data

(Tier 0)

IT

Users

Business

Users

Business Users

Data Mining

Archived

data

Analysis

OLAP Servers

(Tier 2)

Extract

Transform

Load

(ETL)

www data

Page 33: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 33

Why is this hard?

Data sources are unstructured & heterogeneous.

Requirements are always changing.

Most computer scientist trained on OLTP systems,those concepts not valid for VLDB & DSS.

The scale factor in VLDB implementations is difficult tocomprehend.

Performance impacts are often non-linear O(n) vs.O(nlogn) e.g. scanning vs indexing.

Complex computer/database architectures.

Rapidly changing product characteristics.

Page 34: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 34

DWH: High level implementation steps

Phase-I

Determine user needs

Determine DBMS Server platform

Determine hardware platform(s)

Information and Data Modeling

Construct metadata repository

Phase-II

Data acquisition and cleansing

Data transform, transport and populate

Determine middleware connectivity

Prototyping, querying and reporting

Data Mining

Online Analytical Processing (OLAP)

Phase-III

Deployment and System Management

Page 35: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 35

Types of Data Warehouses

Financial

Telecommunication

Insurance

Human Resource

Global

Exploratory

Page 36: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 36

Types of Data Warehouses

Financial

First Data Warehouse that an organization

builds. This is appealing because:

Nerve center, easy to get attention.

In most organizations, smallest data set.

Touches all aspects of an organization with a

common denomination i.e. money.

Inherent structure of data directly influenced by

the day-to-day activities of financial processing.

Page 37: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 37

Types of Data Warehouses

Telecommunication

Dominated by sheer volume of data

Many ways to accommodate call leveldetail: Only a few months of call level detail.

Storing lots of call level detail scattered overdifferent storage media.

Storing only selective call level detail etc.

Unfortunately, for many kinds of processing,working at an aggregate level is simply notpossible as finding patterns will be difficult.

Page 38: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 38

Types of Data Warehouses

Insurance

Insurance Data Warehouses are similar toother Data Warehouses BUT with fewexceptions: Store data that is very old and used for actuarial

processing / analysis.

Typical business may change dramatically overlast 40-50 years, but not insurance.

In retailing or telecom, there are few importantdates but in the insurance environment there aremany dates of many kinds.

Page 39: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 39

Types of Data Warehouses

Insurance

Insurance Data Warehouses are similar toother Data Warehouses BUT with fewexceptions (Contd.): Long operational business cycles, in years.

Processing time in months. Thus the operatingspeed is different.

Transactions are not gathered and processedbut are in kind of „frozen‟.

Thus a very unique approach of design &implementation.

Page 40: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 40

DWH: Typical Applications

Impact on organization‟s core business is tostreamline and maximize profitability. Fraud Detection

Profitability Analysis

Direct Mail / Database Marketing

Credit Risk Prediction

Customer Retention Modeling

Yield Management

Inventory Management

ROI on any one of these applications canjustify HW / SW and Consultancy costs in mostorganizations.

Page 41: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 41

DWH: Typical Applications

Fraud Detection

By observing data usage patterns

People have typical purchase patterns

Deviation patterns

Certain cities notorious for fraud

Certain items bought by stolen cards

Similar behavior for stolen cards

Page 42: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 42

DWH: Typical Applications

Profitability Analysis

Banks know if they are profitable are not

Don‟t know which customers are profitable

Typically more than 50% are NOT profitable

Don‟t know which one?

Balance is not enough, transactionalbehavior is the key

Restructure products and pricing strategies

Life time profitability models (next 3-5 years)

Page 43: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 43

DWH: Typical Applications

Direct Mail Marketing

Targeted marketing

Offering high bandwidth package NOT to all

users

Know from call detail records of web surfing

Saves marketing expense, saving pennies

Knowing your customer better

Page 44: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 44

DWH: Typical Applications

Credit Risk Prediction

Who should get a loan?

Customer segregation i.e. stable vs. rolling

Qualitative decision making NOT subjective

Different interest rates for different

customers

Do not subsidize bad customer on the basis

of good

Page 45: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 45

DWH: Typical Applications

Yield Management

Works for fixed inventory businesses

The price of item suddenly goes to zero

Item prices vary for varying customers

Examples: Airlines, Hotels etc.

E.g. Price of air ticket depends on: How much in advance ticket was bought?

How many vacant seats were available?

How profitable is the customer?

Ticket is one-way or return?

Page 46: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 46

DWH: Recent Applications

Agriculture Systems

Agriculture related data collected for

decades

Metrological data consists of 50+ attributes

Decision making based on expert judgment

Lack of integration results in underutilization

What is required, in which amount and

when?

Page 47: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 47

DWH: Typical Early Adopters

Financial service / insurance

Retailing and distribution

Telecommunications

Transportation

Government

Common thread:

Lots of customers and transactions.

Page 48: Professional Issues in Computing · PDF fileTelenor, Ufone, Mobilink, Warid, Zong ??? ... Data Warehouse Flat File Data Processing Data Transformation. NUCES, Islamabad Campus Data

NUCES, Islamabad Campus Data Warehousing - Fall 2012 48

DWH: End User Expectations

Point and click access to data

Insulation from DBMS structures

Want semantic data model – not 3rd normalform

Integration with existing tools: Excel,SAS etc.

Interactive response times for onlineanalysis but batch time is important aswell.