

Page 1: Data Mining for Enterprise Solutions - …download.101com.com/pub/tdwi/Files/DataMining_EB3019.pdf · Data Mining for Enterprise Solutions ... The Teradata Data Mining Lab Data Warehousing

Data Mining for Enterprise Solutions

A Business Perspective on Mining Data for Corporate Intelligence

By: Lelia Morrill, Retrograde Data Systems

In Conjunction with The Teradata Data Mining Lab

Data Warehousing


Data Mining for Enterprise Solutions

EB-3019 > 1103 > PAGE 2 OF 14

Table of Contents

Executive Overview
Marketplace Transitions
Using Mining for Competitive Advantage
Multidimensional Answers on a Global Scale
Adaptive Response for Leveraging Web Interactions
The Value of Data Mining: Justifying the Investment
Applying Data Mining to Solve Real-World Business Problems
The Data Mining Process
Where Does Data Mining Fit?
Data Mining Challenges
The Business Analytic Roadmap
Summary

Executive Overview

This paper explores data mining from the business perspective, focusing on the premise that for a corporation to realize the potential ROI to be gained from high-value analytics, the business must play a lead role in defining, validating, and translating results into corporate profit.

Teradata®, a division of NCR, has evolved data mining from the realm of raw algorithms to proven high-value analytics that have had significant impact on corporate revenue generation and cost savings. The types of problems that are appropriate for applying mining solutions vary from simple market basket analysis to complex customer insight and prediction. Data mining can be described as the process of identifying and interpreting intrinsic patterns in data to solve a business problem. The mining process yields an analytic model that can be used to gain insight, reduce costs, and increase profit. It is the key component of the Knowledge Discovery process that incorporates analyzing, understanding, deploying, and using mining results. Data mining analysts use and apply the technology. The business owns and drives the knowledge discovery process. This teaming is key to exploiting and translating analytic models into lucrative action. The business steers and guides the process by determining and prioritizing the problems that can be addressed through data mining, problems such as customer retention, marketing effectiveness, fraud detection, and behavioral segmentation. The business validates results for expected outcome and develops a strategy for deploying analytic models into relevant action and successful programs.



Marketplace Transitions

Data mining has consistently proven its effectiveness in select and focused situations, from oil drilling to niche identification, since the mid-1980s. It has been used for medical diagnosis, genome analysis, and behavioral profiling. When historical data exists with established precedents for observing trends and patterns over time, data mining can assist in any field, across all industries. If data mining has been used successfully since the 1980s, why is it only now making a corporate splash? Several historical challenges are at the root of data mining’s slow emergence into business, including:

• Lack of standards and business packaging

• Inability of tools to scale up to the volumes of data

• Data mining tools that struggle to reach industrial strength on real-world data problems (noisy, missing, and faulty corporate data)

• Databases designed for operational processing that cannot scale up to voluminous analytical processing

• Corporate warehousing and methods that have been slow to evolve

• Business users who do not trust results they cannot validate or understand

• Data analysis and mining that are typically niche-oriented processes existing outside of business processes

Technological advances in compute power and speed, advanced data processing and management techniques, and greater user sophistication have changed the face of today’s business. There is heightened demand for accessing data, generating knowledge, and solving difficult business problems. The marketplace is poised, more than ever, to exploit advanced data mining for enterprise profitability. Then why aren’t more companies devoting budget and resources to getting up and running with data mining? There are several reasons:

1. Most tools still work in their own proprietary environment. The process of moving vast amounts of data out of warehouse databases into the tool environment (and back and forth during the exploration process) is cumbersome, time-consuming, and unintuitive.

2. Most databases are not optimized for analytic processing.

3. The business has not integrated data mining and knowledge discovery into its workflow.

4. Companies do not support data mining from the top.

Most tool providers and database vendors are aware of the chasm between technology, business, and mining. They are now trying to reorient tools and databases, originally designed for operational processing or sampled analysis, not high-volume analytic processing, to handle increasing data volume and the analytic business complexity required to deliver answers.

[Figure 1. Enterprise mining is a collaborative effort between businesses, miners, and IT. Diagram labels: Infrastructure for Intelligence; Validate and Use Intelligence; Warehouse; Knowledge Deployment; Business; IT; Miners; Deploy Intelligence; Develop Intelligence.]

Teradata has met the challenge with a combination of four elements: the Teradata Database, which is designed for enterprise-level analytic processing; Teradata Warehouse Miner, for in-place data mining; a Data Mining Lab, a dynamic, low-risk environment where mining experts work with clients to develop enterprise mining solutions; and knowledge discovery training. Teradata Warehouse is the leading data warehousing technology available today, and has now extended its power to mining. This means a faster, more intuitive process that keeps analysts and business users closer to their data. Teradata Warehouse Miner’s capability for speedy throughput creates a dynamic, engaging process where business users see and use intricate, current mining solutions. Teradata has made enterprise-level mining achievable.

Using Mining for Competitive Advantage

Data mining provides insight that yields corporate knowledge. With executive commitment, data mining can provide powerful, predictive capability that leads to more strategic business and strengthens corporate positioning. How? By providing immediate, accurate business answers across a global spectrum, adaptive response to critical market trends, agility in dealing with competitive land mines, and insight that translates to customer intimacy. These are characteristics of successful companies.

The level of corporate intelligence required to maneuver judiciously and deftly in today’s marketplace is confounding. For centuries, business has been run on the assumptions of experienced management and the information from simplistic reporting mechanisms, but the volume of data that must be sifted through to glean interesting information is growing exponentially. This is where data mining excels: in analyzing the volumes of historical data that contain the truth about what has occurred in business operations. It helps companies decipher quantitative, fact-based intelligence that can be extracted from the data, and assists them in predicting what will happen in future situations.

Data mining brings high ROI to the warehouse investment. It augments CRM applications by inserting intelligence in the form of scores, predictions, descriptions, profiles, propensities, and value into customer records. Data mining makes CRM smarter. It is becoming a major component of establishing business direction and strategic positioning in today’s progressive companies.

Multidimensional Answers on a Global Scale

Most companies acknowledge the wealth of information stored in historical data and have put forth commitment and effort to building warehouse environments that store and manage this data. But many of these labors have gone by the wayside because business users don’t necessarily see the value of static data, because finding answers to complex questions is difficult and time-consuming when using standard reporting and data access tools alone. The value of data mining is the strength and comprehensiveness that it brings for extracting answers to enterprise-level, multidimensional business questions. With good tools, the right skill sets, and a quality data environment, tough questions can be answered and predictive solutions deployed globally.

[Figure 2. Data mining makes CRM smarter. Diagram title: Teradata Data Mining and OLAP Assists CRM. Diagram labels: Teradata Warehouse Miner; Custom Analytics; and six analytic questions:]

• Channel Analysis: What is the best channel to reach my customer base?

• Churn Analysis: Which of my customers are my loyal customers, and who will always switch suppliers?

• Propensity to Buy: Who is most likely to purchase this type of product?

• Customer Value: Which of my customers are most profitable?

• Fraud Detection: How can I tell if a transaction is fraudulent?

• Cross Sell: What other products is this customer likely to buy?

For example, a global company wanted to understand product trends across stores on multiple continents. Data mining helped them to understand which stores were doing best and worst. It also helped them understand which factors, such as location, weather, demographic profile, season, product category, size of store, and years in business, were affecting trends and revenues. The resulting understanding was then deployed to the sales team, who put together tailored programs to fit each store’s needs; to the marketing team, who designed more targeted, relevant campaigns; and to the product development team, who bundled and repackaged products for better customer service. A predictive model was deployed to help strategic planners choose optimal sites for new stores. Data mining is an involved process, but it can provide far-reaching answers.

Adaptive Response for Leveraging Web Interactions

Whether reacting to market trends or dealing intelligently with an online customer, data mining helps companies to respond in real time and to adapt to the situation at hand. If an online customer is frustrated and having difficulty, or perhaps having success and considering a serious product purchase, most companies would like to intercede to create a satisfying experience that leads to immediate or future sales. Companies can use data mining in this situation to develop descriptive models that profile customer behavior, online trends, and buying patterns. They can also use data mining to develop predictive models that score customers for value, propensity to respond to ads, propensity to buy (various products), or propensity to defect. And they can develop mining models that predict how much a customer will buy, how much they will spend, how often they will shop, and how satisfied they are. All of these models can then be integrated and deployed online to adaptively respond to an individual user session. The integrated suite can also be deployed to the customer service center, which can respond knowingly to each customer’s request. Models can be developed singly or in concert.

Once web data have been organized and integrated into a customer data warehouse environment, the most valuable mining can occur: Total Customer Management across all channels. Data mining can be used to develop a hard-hitting set of high-value analytics that answer questions such as:

• How is the business growing across the web channel?

• How can we make our total product offering available through the web?

• What are the ramifications to our standard points of distribution, e.g., the Call Center?

[Figure 3. Intelligence mined from corporate data is leveraged across the enterprise. Diagram labels: Operational Databases; Customer Transactions; Detailed Data; Data Warehouse; Historical Data; High-Value Analytics; Intelligence; Smarter CRM; Meaningful, Personal Customer Interaction.]


The business must participate in, own, and drive the process, and bring its acumen to a sound deployment strategy. Integrated mining creates the ability to respond adaptively in real-time, real-world business.

The Value of Data Mining: Justifying the Investment

Companies can acquire facts about their customers through querying the data warehouse. They can organize customer data so that CRM applications can take advantage of managing relationships and lifecycles. But to get to a point of customer intimacy, where the customer is understood at an individual level, including their behavior, trends and preferences, the analytic power of data mining is required. Data mining is one of the most lucrative reasons to build a data warehouse, and brings far greater value to CRM.

In the 1990s, fraud in the insurance industry accounted for $80 billion in losses per year across the industry. Using data mining to offset fraudulent activity by even 10% meant saving $8 billion per year. Progressive credit card companies have understood the value of data mining and have used models in an operational mode to thwart fraud, saving millions per analytic model per year. Many of those companies are extending their tried-and-true data mining models to web activity, eager to save several million dollars more per year.
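The insurance estimate above is a single multiplication, and it can be checked directly. The short Python sketch below uses only the two figures quoted in the text; the function name is illustrative.

```python
def annual_savings(annual_losses: int, reduction_pct: int) -> int:
    """Dollars saved per year if losses fall by reduction_pct percent."""
    return annual_losses * reduction_pct // 100


# Figures quoted in the text: 80 billion dollars in yearly losses, 10% offset.
INSURANCE_FRAUD_LOSSES = 80_000_000_000
print(annual_savings(INSURANCE_FRAUD_LOSSES, 10))  # 8000000000
```

The same function makes it easy to see how sensitive the business case is to the assumed reduction: a 1% offset would still be worth 800 million dollars per year on these figures.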

The stories of successful mining, and the millions that can be earned and saved, extend across all industries. Manufacturing has used mining to hone capacities, loads, and processing, saving millions in lost time and breakdowns. Telecommunication companies have used mining to prevent churn, to cross sell, to bundle, and gain customer insight, saving and earning millions per year. Banks have successfully implemented mining for target marketing, offering new products and services, and bundling products to better serve customers’ needs, thereby saving and earning millions.

How effective data mining will be at any given business is unknown until the data, technology, and culture have been assessed for readiness and potential business profit. But what is clear is that without data mining, companies will not discover the power of insight and knowledge that exists currently or potentially in their corporate data. Recognition of the importance of data warehousing has been a big step that has gained momentum over the last several years. Having clean, reliable, organized data allows for consistency and repeatability in the complex, compute-intensive mining process. With warehousing as the foundation, data mining brings the power and intelligence layer to the warehouse environment, and ultimately to the business user’s desktop.

The revenue gains from the effective application of high-value analytics have historically justified both warehousing and mining investments, and are playing a larger role in CRM considerations.

[Figure 4. The end result of mining is an analytic model trained on business history. The model can be a called process or embedded code. Model outputs shown: scores, descriptions, predictions, forecasts, propensities, behavioral patterns, segment classifications, most important business drivers, and business rules.]

Applying Data Mining to Solve Real-World Business Problems

Business mining solutions have been successfully implemented across all industries at several of the largest companies worldwide. Because each business situation is different, and the historical data captured over the years is far from standardized or consistent across companies, it’s virtually impossible to develop a turnkey mining solution for complicated enterprise-level questions. Data mining is a process, and although several of the components within the process are automated, there are several phases that require domain expertise and analytic shrewdness to sculpt, develop, decipher, and deploy actionable results. The key factors in successful mining include quality data, historical precedent, experienced analysts, champion business users, comprehensive tools, and an integrated, robust environment. The final result is a valuable analytic model that provides an ongoing capability to describe and predict the business.

Given the right ingredients and a clear definition of the problem, mining can be applied to achieve in-depth answers. Here is a short list of types of business issues that lend themselves to data mining solutions:

Customer Segmentation and Behavioral Profiling

Segmentation is a way to derive homogeneous groups based on common traits. Those attributes in combination provide insight from which profiles can be synthesized and segment IDs assigned to each customer. The segments can be further analyzed and programs derived that are tailored to the profile of each group, maximizing response and engagement.
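As a rough illustration of assigning segment IDs from a combination of traits, the Python sketch below groups customers by two attributes. The traits (annual spend, visits per year), the thresholds, and the segment labels are all hypothetical; a real project would derive the groupings from the data itself, for instance with a clustering algorithm, rather than from fixed cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    annual_spend: float   # hypothetical trait
    visits_per_year: int  # hypothetical trait


def assign_segment(c: Customer) -> str:
    """Combine two traits into a segment ID (thresholds are illustrative)."""
    spend = "high" if c.annual_spend >= 1000 else "low"
    freq = "frequent" if c.visits_per_year >= 12 else "occasional"
    return f"{spend}-{freq}"


customers = [
    Customer(1, 2500.0, 30),
    Customer(2, 150.0, 2),
    Customer(3, 1800.0, 4),
    Customer(4, 300.0, 20),
]

# Group customer IDs by segment so programs can be tailored per group.
segments = {}
for c in customers:
    segments.setdefault(assign_segment(c), []).append(c.customer_id)

for name, members in sorted(segments.items()):
    print(name, members)
```

Each customer ends up with exactly one segment ID, which is what allows the ID to be stored on the customer record and used later by campaign tools.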

Customer Retention

Patterns of flight are analyzed and factors leading to churn are discerned. The analytic model uses identified patterns of flight to predict the risk of losing current customers, allowing the company to intervene with offers, programs, and enticements that will retain the customer. Used in conjunction with customer profitability, lifetime value, and cross-sell analysis, campaigns can be further targeted to high-value customers who exhibit high risk potential and who have propensity to buy or respond to offers.
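A minimal sketch of that targeting step, assuming each customer record already carries three model outputs: a churn-risk score, a lifetime value estimate, and a propensity-to-respond score. The field names and cutoff values are assumptions made for illustration.

```python
def retention_targets(scored, risk_cutoff=0.7, value_cutoff=5000.0, response_cutoff=0.5):
    """Return IDs of high-value, high-risk customers likely to respond, riskiest first."""
    picked = [
        c for c in scored
        if c["churn_risk"] >= risk_cutoff
        and c["lifetime_value"] >= value_cutoff
        and c["response_propensity"] >= response_cutoff
    ]
    picked.sort(key=lambda c: c["churn_risk"], reverse=True)
    return [c["id"] for c in picked]


scored_customers = [
    {"id": "A", "churn_risk": 0.9, "lifetime_value": 12000.0, "response_propensity": 0.6},
    {"id": "B", "churn_risk": 0.8, "lifetime_value": 800.0, "response_propensity": 0.9},
    {"id": "C", "churn_risk": 0.2, "lifetime_value": 20000.0, "response_propensity": 0.7},
    {"id": "D", "churn_risk": 0.75, "lifetime_value": 6000.0, "response_propensity": 0.55},
]
print(retention_targets(scored_customers))  # ['A', 'D']
```

Customer B is at risk but low in value, and customer C is valuable but not at risk, so neither is selected; only customers who meet all three conditions receive the retention offer.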

Customer Profitability

Customers can be segmented into groups based on value and profitability. Once factors of high to low profitability have been identified, whether behavioral, demographic, or psychographic, customer profitability can be predicted by identifying similar factors in new customers.

Customer Lifetime Value

Patterns of behavior and activity that lead to high or low value over time are identified and overlaid onto newer customers. These are then used to predict the lifetime value of each customer. Based on findings, programs can then be tailored to enhance, maintain, or drop relationships.

Customer Satisfaction

Business domain experts identify attributes that can be used to identify, understand, and interpret satisfaction. Metrics such as revenue per year, increase per year, number of items purchased, credit rating, and activity levels can be used to segment and score customers. Tailored programs are then implemented to increase satisfaction per segment.

Customer Acquisition

Acquiring new customers is far more intelligent and targeted when done in conjunction with (or after) profitability, lifetime value, and/or propensity-to-respond analysis. This targets marketing efforts to the group of prospects that has the highest potential for value and loyalty and those who demonstrate high likelihood of responding.

Targeted Marketing

Segmentation analysis can provide profiles of demographic and behavioral attributes as they map to high-value customers. Marketing can then be focused, as mentioned above, on targeted segments of high-value, likely responders with solid potential. This makes for intelligent, fact-based marketing.

Effective Campaigns

A CRM tool assists companies in managing customer, marketing, and campaign data. Data mining can then be used to gain customer insight, monitor existing campaigns, and analyze the effectiveness of campaigns over time. Attributes of success can then be applied to new campaigns and best practices honed for most profitable results.

Cross Selling

Data mining can be used to discover the attributes that lead to high propensity to buy. The model can then be overlaid onto current customers who can be marketed to based on high propensity to purchase, thus enhancing individual portfolios, creating greater loyalty, and increasing retention.

Channel Management

Data mining assists companies in better understanding which channels are most effective for various segments of customers. Patterns of activity are analyzed to understand trends of usage. Once understood, offers, programs, services, and products can be introduced to the customer in an engaging, responsive dialogue. Channels can be enhanced to better serve customers.

Sales Forecasting

Mining can be used to understand sales trends, to predict revenue, and to establish business drivers leading to increased sales. Profiles of sales representative behavior can be synthesized to profile best sales practices.

Fraud Detection

Data mining has been used to help companies identify attributes that can indicate fraud. If data have been captured over time, patterns can be used to predict fraud on an ongoing basis. If data have not been captured and organized through time, data mining can be used to investigate suspicious characteristics of operational data. For example, in the case of medical fraud, if attributes such as number of visits, number of claims, amount of charges, or number of procedures deviate from average and predicted values, investigations can be initiated, with the analysis providing both the justification and the direction.
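One simple way to flag values that deviate from the average is a z-score check against a peer group. The paper does not name a specific technique, so the Python sketch below is only an assumption about how such a screen might look; the provider names, claim counts, and the threshold of two standard deviations are invented for illustration.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=2.0):
    """Return names whose value lies more than `threshold` standard deviations from the mean."""
    vals = list(values.values())
    mu = mean(vals)
    sigma = pstdev(vals)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [name for name, v in values.items() if abs(v - mu) / sigma > threshold]


# Hypothetical number of claims filed per provider in one period.
claims_per_provider = {
    "P1": 40, "P2": 42, "P3": 38, "P4": 41, "P5": 39, "P6": 40, "P7": 120,
}
print(flag_outliers(claims_per_provider))  # ['P7']
```

A flag like this is a reason to investigate, not proof of fraud. In practice the comparison would be made within comparable peer groups and against model-predicted values as well as simple averages.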

E-Business and Web Mining

Integrate any or all of the above data mining models and deploy onto web servers for adaptively responding to customer clicks and observed behavior. Integrate web data with customer warehouse data and use the high-value analytics of data mining to gain a total customer view, behavioral patterns, and insight across all channels.

The Data Mining Process

Data mining is a process, not a shrink-wrapped package. To be successful, actionable, and profitable, it must be a collaboration driven by the business, developed by mining analysts, and supported by IT.

To be efficient and repeatable, an environment designed for industrial-strength warehousing and analytic processing is necessary. Teradata has developed the skill sets, tools, and architecture for ensuring successful mining, and has incorporated the practical knowledge, learned through years of mining experience, into the Teradata Data Mining Method. This methodology, outlined below, has been at the core of successful data mining project implementations at some of the world’s largest companies.

1. Develop project scope and plan: set checkpoints throughout the process.

2. Identify and clarify the business problem or question to be solved: this gives the project focus.

3. Determine and prepare the technology and architecture or environment for mining: this ensures processing efficiency and effectiveness.

4. Select, analyze, and prepare the data for the mining process: data mining hinges on this step, which means knowing the data intimately so that it can be used precisely and intelligently to attain relevant business results.

5. Develop analytic models: choose best methods and algorithms, make final variable selections, iterate to best model.

6. Deploy results: compare results to expected outcomes, validate with domain experts, test analytic models for accuracy, deploy to the business in whatever form is most suitable (can be incorporated into applications, databases, warehouse processes or as standalone code).

7. Transfer knowledge between analysts and business users throughout the discovery process.

Business users play a key role in identifying the business question(s) that will be answered, verifying potential business factors and data sources, determining the validity of the analytic models, and developing deployment strategies for using the analytic model for business answers. For example, consider how best to deploy a customer value model. Should it be invoked for scoring in the database once a customer has been a patron for six months? Or should it be manually invoked by territory analysts who can use it to assess clients per region and develop targeted programs? Or perhaps it should be invoked in batch mode, providing weekly reports to the right business analysts. These are the types of considerations that business users will work through as they formulate strategies for deploying their predictive models into useful decision workflow.
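The deployment options described above differ mainly in when the model is invoked. The Python sketch below illustrates two of them, assuming a hypothetical `value_score` function that stands in for the trained customer value model; the scoring formula, field names, and dates are placeholders, not part of any Teradata product.

```python
from datetime import date


def value_score(customer):
    """Placeholder for a trained customer value model (illustrative formula only)."""
    return customer["annual_spend"] / 10


def months_as_patron(customer, today):
    """Whole months elapsed between the join date and today."""
    start = customer["joined"]
    return (today.year - start.year) * 12 + (today.month - start.month) - (today.day < start.day)


def score_if_eligible(customer, today, min_months=6):
    """Tenure-triggered option: score only once the customer has six months of history."""
    if months_as_patron(customer, today) >= min_months:
        return value_score(customer)
    return None


def weekly_batch_report(customers, today):
    """Batch option: score every eligible customer in one scheduled run."""
    report = {}
    for c in customers:
        score = score_if_eligible(c, today)
        if score is not None:
            report[c["id"]] = score
    return report


customers = [
    {"id": 1, "joined": date(2003, 1, 15), "annual_spend": 1200.0},
    {"id": 2, "joined": date(2003, 9, 1), "annual_spend": 900.0},
]
print(weekly_batch_report(customers, today=date(2003, 11, 1)))  # {1: 120.0}
```

The manual option would call the same `score_if_eligible` function on demand, for example from a territory analyst's tool. Keeping the model behind one function is a design choice that lets the business change the invocation strategy without touching the model itself.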

The end result of the mining process is an analytic model that can be used for understanding the past and predicting the future. Analytic models can be deployed into the decision environment for ongoing predictive capability. The deployment strategy is designed by the business user and implemented by the IT organization. Bringing the process full circle to completion ensures that the value and ROI of data mining are realized.

Where Does Data Mining Fit?

Corporations discovered the value of data in the eighties and the importance of customer-centric focus in the nineties. Businesses continue running operational systems to manage the capture and modification of business data, and the processes are expanding daily with web interactions. How does data mining fit into all of this? Data mining processing uses warehouse data as input. It crunches through historical data, finding patterns and developing rules about the business. Once the analysis is run, business users validate the output. The intelligence from the analysis is then incorporated back into the warehouse in the form of scores, predictions, forecasts, and descriptions. Decision applications that access warehouse data, including CRM, will also have access to the mined intelligence.
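The round trip described here (read warehouse data, compute scores, write them back, let decision applications read them) can be sketched with Python's built-in `sqlite3` module. An in-memory SQLite database stands in for the warehouse, and the table, columns, and scoring rule are assumptions made for the example, not a Teradata schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, visits INTEGER, spend REAL, value_score REAL)"
)
conn.executemany(
    "INSERT INTO customer (id, visits, spend) VALUES (?, ?, ?)",
    [(1, 24, 3000.0), (2, 3, 200.0), (3, 10, 900.0)],
)


def toy_model(visits, spend):
    """Illustrative scoring rule, not a trained model."""
    return spend / 100 + visits


# Mining step: read history, compute scores, write them back beside the source data.
rows = conn.execute("SELECT id, visits, spend FROM customer").fetchall()
conn.executemany(
    "UPDATE customer SET value_score = ? WHERE id = ?",
    [(toy_model(v, s), cid) for cid, v, s in rows],
)

# A decision application (for example CRM) reads the score like any other column.
top = conn.execute(
    "SELECT id, value_score FROM customer ORDER BY value_score DESC LIMIT 1"
).fetchone()
print(top)  # (1, 54.0)
```

Because the score lives in the same table as the customer data, any tool that can already query the warehouse gains access to the mined intelligence without a separate integration.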

Data mining works hand in hand with warehousing. The two technologies and processes complement each other and offer mutual benefit. A warehouse that will be used for mining should be designed with architectural considerations for eventual mining. This also encourages including the right players up front (a combination of business, miners, and IT), ensuring that business needs are met and the technical environment is accounted for.

[Figure 5. Deploy analytic models into the enterprise for high-value business usage. Diagram labels: Operational Databases; DW; OLAP; DSS; Reports; Data mining process; Build; Develop Analytic Model; Test & Deploy; Use; Feedback Loop. Note in diagram: the model can be deployed as code, database triggers, a called module, or a one-time report.]


The final result of mining, that is, scores, predictions, and descriptions, can be propagated through the business community in a variety of ways. One way to make the data accessible in an ad hoc manner is by integrating it into warehouse data and making it available through multidimensional data access tools.

Data mining is not a push-button exercise and has, therefore, been very difficult to package and shrink-wrap as a standalone application or with other business applications such as CRM. Although data mining is an intensive business process that takes time, a framework based on experience and expertise can speed development. Once analytic models are developed, decision applications, including CRM, can take advantage of the intelligence generated by the mining process.

Data Mining Challenges

What are the ingredients of successful mining? The right people, an integrated technological environment, good tools, and sound business commitment. There are difficult challenges within each of these components.

Developing Data Mining Skills

One of the biggest challenges to establishing mining as an internal corporate service is developing the skill sets. A skilled analyst will have expertise in statistics, machine learning algorithms, business analysis, and technology. Because data mining is a relatively new field, skilled data mining analysts are difficult to find. However, this should not deter companies from moving forward with data mining. There are many avenues for developing skills internally, including hiring data mining consultants who develop data mining capability with the objective of transferring knowledge. This, along with a comprehensive training program, can build a core competency in a methodical and confident manner.

Business users must also be trained, starting with the specifics of the data in their warehouse. Knowing what is there and how to navigate it brings a more savvy business user to the data mining project. Business users should then be trained in the data mining process and, finally, in how to deploy and use data mining results to greatest advantage.

[Figure 6. Integrated, business-driven architecture. Diagram layers: Technical Components (Tools, Systems); DW Infrastructure (Operational, Warehouse, Analytic Server, Predictive App); Warehouse Data Model (Location, Customer, Market, Product, Sales, Financial); Analytic Business Models; Marketing Workstation.]

Data warehouse practitioners from the IT

community must be trained in the techni-

cal and functional aspects of the mining

process and tools so that they can support

analysts in preparing the environment,

accessing the data, and deploying results

into databases and applications. If IT staff are

trained up front, they will understand

how to augment warehouse business

discovery with questions that

bring mining into the warehouse early,

delivering value in step with phased

implementation.

The Right Technological Environment

The right technological environment for

data mining has a foundation of good

quality data. In today’s terms, that means

data warehousing. Data mining can occur

without a warehouse in place, but the

problems with gathering and cleansing the

data can seem insurmountable. Also, once

the process has been completed for one

model, the cycle has to start again from

scratch for each subsequent model. There

is no repeatability. The data warehouse

provides a natural environment for

efficient, ongoing data mining. It also

becomes the repository for data mining

results. This includes the information

gleaned from historical data about how

the business has been running and why,

the resultant analytical models, and finally,

the scores, predictions, and intelligence

that come from the data mining endeavor.

Although it would seem that data mining

should be the next logical step after

developing a warehouse, it remains

very difficult for most

companies that attempt it. This is because

relational database management systems

(RDBMS) were originally designed for

operational processing of high-speed

transactions. Adding, deleting, or modifying

records is an entirely different process

from analyzing volumes of historical data.

The persistent challenges include:

• Inability to scale up to the new data

volumes being generated by historical

and web transactions

• Inability of database vendors to

efficiently take advantage of parallel

processing capability

• Inability of mining tools to go directly

against the data source (data ware-

house databases), creating awkward

and cumbersome movement of data

in and out of environments

Teradata Database was originally developed

for high-speed, analytic processing. Mining

an enterprise-level warehouse is now

achievable through Teradata Warehouse’s

parallel, scalable capability. The detailed data

sitting static in too many warehouses can be

brought to life for detailed, transaction-level

mining. The Teradata solution takes data

mining to a new level: one where

enterprise-level mining solutions are a reality.

[Figure 7 labels: Data Warehouse Data (Name, Addr, #Prods, Tot$, #Yrs) augmented with Mined Intelligence (propensity to buy products X, Y, Z; LTV; profitability score; churn score; cluster id)]

Figure 7. Customer records are augmented with mined intelligence and deployed through an OLAP tool.
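As a rough illustration of what in-place scoring means, the sketch below writes a churn score onto customer records without moving the data out of the database. It is only a sketch under stated assumptions: an in-memory SQLite database stands in for the warehouse, the table and column names are hypothetical, and the scoring rules are made up rather than taken from any real model or from Teradata Warehouse Miner.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (name TEXT, num_prods INTEGER, "
    "total_spend REAL, years INTEGER, churn_score REAL)"
)
conn.executemany(
    "INSERT INTO customer (name, num_prods, total_spend, years) "
    "VALUES (?, ?, ?, ?)",
    [("Avery", 1, 120.0, 1), ("Blake", 4, 2300.0, 6), ("Casey", 2, 450.0, 2)],
)

# A made-up decision tree, expressed as a SQL CASE expression so the
# scoring runs inside the database instead of in an external tool.
SCORE_SQL = """
UPDATE customer
SET churn_score = CASE
    WHEN years < 2 AND num_prods <= 1 THEN 0.8
    WHEN years < 2 THEN 0.5
    WHEN total_spend < 500 THEN 0.4
    ELSE 0.1
END
"""
conn.execute(SCORE_SQL)

# The augmented records are now available to any reporting or OLAP query.
scores = dict(conn.execute("SELECT name, churn_score FROM customer"))
print(scores)
```

Because the score lives on the customer record itself, downstream queries can filter or group on it like any other column.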

The Right Tools

When developing a data mining practice,

mining analysts typically team up with IT

to determine which tools work best within

the technical architecture. Tools are

usually chosen on the basis of:

Comprehensiveness

Is there a variety of statistical and

machine learning algorithms, for example,

Factor Analysis, Decision Tree, Linear

Regression, Logistic Regression, Rule

Induction, Neural Net, or Clustering?

Different modeling techniques suit

different problems.

Data Manipulation

Can the tool work with data directly in

the source database or must the data be

moved in and out of the tool environment

for derivation, transformation, modeling,

and testing?

Functionality

Is there surrounding functionality that

encases the mining engine, allowing users

to set parameters, easily read output,

understand the validity of the model,

change settings, choose different variables,

create new variables on the fly, and create

and save work flows?

Metadata

Is there easy access to information about

the development and use of the analytic

models? Is the model building informa-

tion integrated with warehouse data

information?

Tools have become more user-friendly,

sophisticated, and industrial strength

compared to ten years ago when mining

was making its first emergence into

corporate scenarios. But most still fall

short in terms of scalability and integra-

tion into RDBMS environments for

in-place mining of detailed data. Teradata

Warehouse Miner’s in-place mining

diminishes the problem of data movement.

The data is at the source, and

iterating through exploration, data

selection, and preparation is part of

an integrated, intuitive process. Teradata

Warehouse Miner reduces processing

times by orders of magnitude.

Figure 8. Mining process division of labor.

1. Business Users: Initiate project, clarify issue, define business parameters.

2. Mining Experts: Data pre-analysis, sculpt project with business, clarify expected outcomes.

3. IT: Gather data, define subsets, develop access routines, prepare technical architecture.

4. Mining Experts: Prepare data, experiment with modeling approaches (algorithms, methods), build analytic models, test, validate.

5. Business: Verify results, confirm expected outcomes, test model, develop deployment strategy, use ongoing.

6. IT: Deploy models and resulting intelligence into databases, applications, job streams.

7. Mining Expert: Monitor, validate, hone, refresh models over time.

To experience the benefit, customers need

a comprehensive data mining solution

that gets them up and running,

starting with education, proof-of-concept

projects, tool evaluation, and a learning

environment for developing new skill sets.

Management commitment is necessary for

exploiting data as the corporate asset that

it is, and for realizing and acting on the

potential knowledge that is locked within.

[Figure 9 labels: Business Users; Analysts / Miners; Warehouse Administrator; Reports, Ad Hoc Queries; Data Analysis, Statistics; Analytic Models/Scores, Propensities, Predictions, Descriptions; Data Warehouse; Mining Results; OLAP; Teradata Warehouse Miner Stats; Teradata Warehouse Miner & Mining Partner Tools]

Figure 9. Knowledge generation, management and deployment within the warehouse.

Business Analytic Roadmap

Phased solution development and

implementation

Developing an Analytic Roadmap brings

into focus those issues and problems that

can be solved by mining the data ware-

house, and prominently engages the

business community to drive and own

the knowledge discovery process. Bringing

these issues to the forefront at the onset

of warehouse development creates the

possibility of realizing immediate, high-return

value from the warehouse implementation.

It has been proven that users

who receive up-front value are more likely

to support and use the warehouse to

business advantage. A successful ware-

house implementation is not just a

technological feat — it must also be a

cultural achievement that engages the

business community in an interactive,

high-quality data environment where

they can ask questions, find answers,

and synthesize results into knowledge and

business action. An Analytic Roadmap

will give guidance and direction to

developing such an environment.

A mining architect conducts an intensive

discovery process that identifies business

issues, problems, and questions that are

decision oriented and complex, and which

can be clarified and/or solved using data

mining processes. The types of questions

asked might include:

• Questions that are supported by

existing historical data (data that have

been identified as candidate or current

subject areas in the data warehouse).

• Questions that will provide high ROI

if addressed through data mining

and/or analytical reporting (OLAP).

• Questions that the business has

identified as urgent.

• Questions that do not have data

available to support the solution,

but if augmented by third-party data,

could be solved (requires further gap

analysis project).
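One way to picture how the question types above could feed a roadmap is a simple triage-and-rank pass over candidate questions. The sketch below is illustrative only: the `CandidateQuestion` fields, the 1-5 ROI rating, and the ordering rule (urgent first, then higher expected ROI) are assumptions made for this example, not a method described in this paper.

```python
from dataclasses import dataclass


@dataclass
class CandidateQuestion:
    text: str
    data_available: bool        # supported by current or candidate subject areas
    expected_roi: int           # hypothetical 1-5 rating agreed with the business
    urgent: bool                # flagged as urgent by the business
    needs_third_party_data: bool = False


def triage(q: CandidateQuestion) -> str:
    """Sort a question into one of three hypothetical buckets."""
    if q.data_available:
        return "candidate"
    if q.needs_third_party_data:
        return "gap analysis"
    return "defer"


def rank(questions: list[CandidateQuestion]) -> list[CandidateQuestion]:
    """Order the ready questions: urgent first, then by expected ROI."""
    ready = [q for q in questions if triage(q) == "candidate"]
    return sorted(ready, key=lambda q: (q.urgent, q.expected_roi), reverse=True)


questions = [
    CandidateQuestion("Which products sell together?", True, 3, False),
    CandidateQuestion("Which customers are likely to churn?", True, 5, True),
    CandidateQuestion("How do competitor prices affect demand?", False, 4, False,
                      needs_third_party_data=True),
]

for q in rank(questions):
    print(q.text)
print(triage(questions[2]))
```

In practice the weighting would come out of the discovery sessions with the business; the point is only that each question type maps to a distinct next step.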


Value of the Analytic Roadmap

Developing an enterprise analytic frame-

work is similar to developing an enterprise

data model. It brings everyone to a consis-

tent understanding of data mining, provides

education in relevant, everyday business

terms, and builds knowledge of the mining

process by breaking down the necessary

steps to get from abstract ideas to imple-

mented solutions. Important outcomes

are the sense of business ownership that

is developed and an understanding of how

to map mining projects to warehouse

implementations to successfully exploit

the warehouse for greater ROI.

Summary

There have been many challenges on

the road to making data mining viable.

Teradata Warehouse has effectively

overcome these obstacles by providing:

• the ability to deal with large volumes of data

• in-place mining

• integration between mining and

warehousing

• the ability to scale up with processing

and data demands

• the ability to score directly in the

database, making the results immediately

available to the business community

Teradata Warehouse mining transforms

what is otherwise an ad hoc, desktop

process into a profitable enterprise

capability. The promise of translating

high-value analytics into business solu-

tions has been realized in the Teradata

Warehouse mining products and services.

This paper was developed by Lelia Morrill

of Retrograde Data Systems and the

Teradata Data Mining Lab. For more

information, please contact Mike Rote,

Director of Teradata Data Mining for

Teradata, a division of NCR.

[Figure 10 labels, by subject area:

• Customer: Loyalty, Satisfaction, Propensity to Buy, Profitability, Lifetime Value, Propensity to Churn

• Financial: Profitability, Satisfaction, Retention, Forecasting, Lifetime Value, Loss

• Marketing: Target Marketing, Market Basket Analysis, Cross-Sell Strategies, Campaign Effectiveness, Life Cycle Sequence, Best Campaign

• Sales: Channel Analysis, Best Practices, Sales Forecasting, Partner Profiling, Bundling, Rep Profiling

• Equipment: Inventory Analysis, Shipment Analysis, Shipper Profiling, Warehouse Optimization, Maintenance Forecasts, Timeline Optimization

• Product: Supply/Demand, New Product Projections, Price Point Analysis, Product Optimization, Lifecycle Analysis, Product Bundling]

Figure 10. The Analytic Roadmap provides a framework for Enterprise Knowledge Management.

Teradata and NCR are registered trademarks of NCR Corporation. NCR continually enhances products as new technologies and components become available. NCR, therefore, reserves the right to change specifications without prior notice. All features, functions and operations described herein may not be marketed in all parts of the world. Consult your Teradata representative or Teradata.com for the latest information.

© 2003 NCR Corporation Dayton, OH U.S.A. All rights reserved.