ERP Centric Data Mining and KD

ERP Centric Data Mining and Knowledge Discovery. Naeem Hashmi, Chief Technology Officer, Information Frameworks. e-mail: [email protected] Web: http://infoframeworks.com Webcast - searchsap.com, September 10, 2002. Copyright 2002.


Page 1: ERP Centric Data Mining and KD

Copyrights 2002 ERP Data Mining & Knowledge Discovery webcast searchsap.com Sept 10, 2002

ERP Centric Data Mining and Knowledge Discovery

Naeem Hashmi

Chief Technology Officer

Information Frameworks

e-mail: [email protected]

Web: http://infoframeworks.com

Webcast - searchsap.com, September 10, 2002

Page 2: ERP Centric Data Mining and KD

About the Speaker

Naeem Hashmi

• Founder and CTO of Information Frameworks; an author, speaker and world-renowned expert on emerging Information Architectures, Integration and Business Intelligence Technologies.

• Author of the best-selling book SAP Business Information Warehouse for SAP, 2000.

• Technical Editor, SAP BW Certification Guide, authored by Catherine Roze, 2002

• Contributing Author, SAP BW Handbook, 2002

• Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute.

• 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object and Client/Server Technologies; and Strategic Consulting.

• Email: [email protected] URL: http://infoframeworks.com Tel: 603-432-4550

Page 3: ERP Centric Data Mining and KD

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP centric Data Mining
• Q&A


Page 5: ERP Centric Data Mining and KD

What is Data Mining and Knowledge Discovery?

• Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores to extract data patterns, models and rules

• Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends and patterns. Data mining is an integral part of the knowledge discovery process

Page 6: ERP Centric Data Mining and KD

Data Mining and Statistics?

• DM sounds very similar to regression analysis, but its approach and purpose are quite different

– Statistical methods test a hypothesis on a data set

– Data mining starts from the data sets to construct a hypothesis

Page 7: ERP Centric Data Mining and KD

Data Mining - Present State

Application Domains

• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)

Source: http://www.kdnuggets.com/polls/

Page 8: ERP Centric Data Mining and KD

Data Mining Methodologies

Source: http://www.kdnuggets.com/polls/

CRISP-DM: CRoss Industry Standard Process for Data Mining

SIX STEPS PROCESS

1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Source: http://www.crisp-dm.org/

Page 9: ERP Centric Data Mining and KD

Data Mining Process

CRoss Industry Standard Process (CRISP) for Data Mining

1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Called out on the diagram: Data Understanding, Data Preparation, Data Warehouse

Initially will take about 60% to 80% of the data mining project time

Source: http://www.crisp-dm.org/

Page 10: ERP Centric Data Mining and KD

Data Mining - Tools and Data Formats

Domains

• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)

Data Formats

• Flat files: 57%
• Proprietary: 37%
• DBMS: 27%

Source: http://www.kdnuggets.com/polls/

Page 11: ERP Centric Data Mining and KD

Data Mining Technology

TECHNIQUES

• Visualization: Use human pattern recognition capabilities
• Statistics: Applying statistical techniques to predict
• Decision Trees: Building scripts based on historic data
• Association Rules (Rule Induction): Reasoning from specific facts to reach a hypothesis
• Clustering: Refers to finding and visualizing groups of facts that were not previously known
• Neural Networks: Learning how to solve problems based on examples
• K-Nearest Neighbor: Classification by looking at similar data
• Genetic Algorithms: Survival of the fittest …

USAGE

• Discover
• Understand
• Predict
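To make "classification by looking at similar data" concrete, here is a minimal K-Nearest Neighbor sketch in plain Python. The customer records, feature names and the `knn_predict` helper are hypothetical illustrations, not taken from the webcast or from any vendor product.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Sort the training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical customers: (age, orders per year) -> observed outcome.
train = [
    ((25, 2), "left"),
    ((30, 3), "left"),
    ((28, 1), "left"),
    ((45, 12), "stayed"),
    ((50, 15), "stayed"),
    ((48, 10), "stayed"),
]

print(knn_predict(train, (27, 2)))
print(knn_predict(train, (47, 11)))
```

In practice the features would usually be scaled first, since raw Euclidean distance lets the attribute with the largest range dominate.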

Page 12: ERP Centric Data Mining and KD

Data Mining Models

Two Types of Data Mining Models

Prediction Models: Prediction and Classification

• Regression algorithms: Neural Networks, Rule Induction
  – Predict Numerical Outcome
• Classification algorithms: CHAID, discriminant analysis
  – Predict Symbolic Outcome

Descriptive Models: Grouping & Associations

• Clustering/Grouping algorithms: K-means, Kohonen, Factor Analysis
• Association algorithms: Apriori, Sequence
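As an illustration of what association algorithms compute, the sketch below counts support and confidence for item pairs in plain Python. It is only the counting step for two-item rules, not a full Apriori implementation, and the baskets, thresholds and the `pair_rules` helper are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds."""
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical market baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "milk"],
]

for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Apriori extends this idea to larger item sets by only growing sets whose subsets already met the support threshold.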

Page 13: ERP Centric Data Mining and KD

Traditional DM vendors

• SPSS Clementine

• SAS Enterprise Miner

• IBM Intelligent Miner

• Salford CART/MARTS

• …more

Page 14: ERP Centric Data Mining and KD

Database Vendors – DM within the Products

• Data Mining Engine in Oracle 9i
  – Oracle 9i consists of key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
• IBM Intelligent Miner into DB2
• TeraMiner into Teradata
• Microsoft – SQL Server 2000

• When you implement DM functionality in a DBMS, you are limited to a specific database engine, which is not very flexible in a typical heterogeneous enterprise application landscape.

Page 15: ERP Centric Data Mining and KD

Data Mining Standards

• PMML - Predictive Model Markup Language
• OLE DB for Data Mining
• Java Data Mining API
• Other data exchange standards for analytics that need data mining extensions
  – CWM: Common Warehouse Metamodel
  – XML/A: XML for Analytics
  – CPEX: Customer Profile EXchange
  – xCIL: Extensible Customer Information Language

Page 16: ERP Centric Data Mining and KD

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP centric Data Mining
• Q&A

Page 17: ERP Centric Data Mining and KD

Enterprise Applications Landscape

• ERP Solutions
  – Oracle
  – PeopleSoft
  – SAP

• ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
  – Customer Relationship Management
  – Business Intelligence
  – Enterprise Portals

• Siebel

• Oracle Business Intelligence Solution

• PeopleSoft Enterprise Performance Management

• SAP Business Information Warehouse

Page 18: ERP Centric Data Mining and KD

Oracle Business Intelligence Solution

Oracle 9i Business Intelligence
• Oracle9iDS Warehouse Builder
• Oracle9iAS Discoverer
• Oracle9iDS Reports
• Oracle9iAS Portal
• Oracle9iAS Clickstream Intelligence
• Oracle9iAS Personalization
• Oracle9i Data Mining
• Oracle9iDS Business Intelligence Beans

Business Processes (Pre-Built Portlets)
• Response to Lead (27)
• Lead to Quote (56)
• Quote to Order (15)
• Order to Cash (34)
• Demand to Build (40)
• Procure to Pay (28)
• Revenue to Compensation (29)
• Expiration to Renewal (33)
• Issue to Resolution (51)
• HR Family (43)

Oracle 9i DM Integration
• Oracle Marketing Online for Campaign Management
• Oracle9iAS Personalization
• iStore
• more to come…

Source: Oracle

Page 19: ERP Centric Data Mining and KD

PeopleSoft Business Intelligence Solution

Enterprise Performance Management (EPM)

• Customer Profitability
• Finance
• Workforce Analytics
• Supply Chain Management Process
• Workforce Rewards
• Enrollment Management
• Retail Merchandise
• Project Analysis
• Student Administration
• Balanced Scorecard
• Employee Scorecard
• Customer Scorecard
• Vendor Scorecard
• CRM Prospect Analysis
• CRM Marketing Analysis
• CRM Sales Effectiveness
• CRM Service Effectiveness

Courtesy: eBusiness Advantage Inc. (www.ebizadvan.com)

Data Mining Capabilities

No word on PeopleSoft data mining tools/technologies for predictive analytics (home grown, acquired or 3rd-party products). No response from PeopleSoft contacts.

Page 20: ERP Centric Data Mining and KD

SAP Business Intelligence Solution

Business Information Warehouse
• +420 InfoCubes
• +1700 Queries
• 90 ODS Objects

SAP CRM
• Campaign management
• Opportunity analytics
• Customer behavior modeling

SAP SCM
• Demand planning
• Spend optimization
• SCOR KPIs

SAP Financials, Human Capital Management, SEM
• Balanced scorecard
• Planning
• Economic profit
• Benchmarking
• Employee turnover & retention
• Corporate investment management

Closed loop platform capabilities
• Drill-through (report-report i/f)
• Remote cubes (read through)
• Real-time data warehousing
• Data mining
• Write back to operational system

SAP Portals
• E-commerce analysis

SAP Markets, Procurement
• Bidding, pattern-based offering
• Activity reporting, service analytics

Source: SAP

Page 21: ERP Centric Data Mining and KD

CRM Vendors – Data Mining Integration

• Oracle CRM
  – Pre 9i: Darwin
  – Post 9i: ODM
• RightPoint and E.piphany
• SPSS and Siebel
• SAP CRM
  – Native Data Mining built in SAP BW - Database Independent
  – IBM Intelligent Miner interface with SAP BW
• PeopleSoft CRM
  – No official data mining product or vendor solution
  – Waiting for their response on what they have

Page 22: ERP Centric Data Mining and KD

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP centric Data Mining
• Q&A

Page 23: ERP Centric Data Mining and KD

SAP BW 3.0b Data Mining Implementation

• Currently for Customer Subject Area
• Algorithms Supported
  – Decision Trees
  – Scoring
  – Clustering/Segmentation
  – Association
• Data Mining process
  – Model definition
  – Training the model
  – Performing prediction using the training results
  – Uploading the results back into BW
  – Utilizing the mining results (on the operational side)
  – SAPGUI is the interface to the data mining modeling and analysis

No Extensive Data Staging

Page 24: ERP Centric Data Mining and KD

Modeling a Decision Tree

Create a mining model

• Model columns
• Specifying the column parameters
• Specifying the values in case the original values in the column are to be treated differently
• Indicating the prediction column
• Indicating the key column
• The nature of the column content
• Data type of the column

Source: SAP

Page 25: ERP Centric Data Mining and KD

Modeling a Decision Tree

Specify Model Parameters

• Use a portion (%) of the data or the whole data set for training
• Size of the window (such as 10%)
• The number of repeats with different samples
• Stop training when the number of cases under the given node is less than or equal to the specified value
• Stop training when the accuracy is greater than or equal to the expected accuracy
• If the tree is too big, prune the tree without violating the expected accuracy
• Use the information gain threshold to check the relevance

Source: SAP
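The stopping rules and the information gain threshold listed above can be sketched in plain Python as follows. This is the textbook entropy-based calculation, not SAP's implementation, which the slides do not describe; the function names, default thresholds and sample rows are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, column):
    """Entropy reduction obtained by splitting the rows on `column`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[column], []).append(label)
    # Weighted entropy of the child nodes produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def should_stop(labels, min_cases=5, expected_accuracy=0.9):
    """Stop when the node is small enough or its majority class is accurate enough."""
    if len(labels) <= min_cases:
        return True
    majority = Counter(labels).most_common(1)[0][1]
    return majority / len(labels) >= expected_accuracy

# Hypothetical training rows with a single candidate split column.
rows = [
    {"profession": "LABOURER"},
    {"profession": "LABOURER"},
    {"profession": "CLERK"},
    {"profession": "CLERK"},
]
labels = ["left", "left", "stayed", "stayed"]

print(information_gain(rows, labels, "profession"))
print(should_stop(["left"] * 5 + ["stayed"] * 5))
```

A column whose gain falls below the chosen threshold would be treated as not relevant for splitting at that node.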

Page 26: ERP Centric Data Mining and KD

Modeling a Decision Tree

Create a training source and map the model columns

• BW Query
• Runtime parameters for query
• Model columns
• Selected source columns
• Mapping between model column and source column

Source: SAP

Page 27: ERP Centric Data Mining and KD

SAP BW Data Mining – Process Steps

• Create a mining model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query

Source: SAP

Page 28: ERP Centric Data Mining and KD

Viewing Decision Tree Training Results

• This decision tree predicts whether the customer has left or is still “on board”
• Chances of a customer leaving is 70.7% if the profession is “LABOURER”
• Chart shows the distribution at the selected node
• Out of a total of 705 cases, 41 cases are covered under this node
• 28/41 customers are likely to leave
• 13/41 customers are likely to stay

Source: SAP
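The node figures quoted above can be turned into shares with a few lines of arithmetic. The variable names are illustrative. Note that 28/41 works out to about 68.3%, slightly below the 70.7% shown for the "LABOURER" node; the slide does not say how the 70.7% was derived.

```python
total_cases = 705   # all training cases (from the slide)
node_cases = 41     # cases covered by the selected node
leaving = 28        # customers at this node likely to leave
staying = 13        # customers at this node likely to stay

coverage = node_cases / total_cases
leave_share = leaving / node_cases
stay_share = staying / node_cases

print(f"Node covers {coverage:.1%} of all cases")   # 5.8%
print(f"Likely to leave: {leave_share:.1%}")        # 68.3%
print(f"Likely to stay: {stay_share:.1%}")          # 31.7%
```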

Page 29: ERP Centric Data Mining and KD

Data Mining – Decision Trees

Uploaded in BW, then BEX for further analysis

Source: SAP

Page 30: ERP Centric Data Mining and KD

Data Mining – Association

• Create an Association model
• Define model columns
• Train the model
• Predictions using training results
• Using the data mining results against BW Query

Source: SAP

Page 31: ERP Centric Data Mining and KD

Data Mining – Association

Source: SAP

Page 32: ERP Centric Data Mining and KD

Data Mining – Cluster Analysis

• Create a Cluster model

• Train the model

• Predictions using training results

• Using the data mining results against BW Query

Source: SAP
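The slides do not say which clustering algorithm SAP BW uses. As a generic illustration of "training" a cluster model and then assigning records to clusters, here is a minimal k-means sketch in plain Python; the points, starting centroids and the `kmeans` helper are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                # New centroid is the per-axis mean of the cluster members.
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers described by two numeric attributes.
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Scoring a new record then amounts to finding its nearest centroid, which corresponds to the "predictions using training results" step.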

Page 33: ERP Centric Data Mining and KD

Viewing Cluster Analysis Results

Source: SAP

Page 34: ERP Centric Data Mining and KD

Viewing Cluster Analysis Results

Uploaded in BW, then BEX for further analysis

Source: SAP

Page 35: ERP Centric Data Mining and KD

SAP Data Mining

• Good attempt to implement a few Data Mining algorithms
• Very traditional Data Mining approach
• Requires a well-versed statistician or Data Mining expert to model and interpret the results
• Source: BEX Query – Big Limitation in DM
• Weak visualization
• BEX for additional discovery - slicing and dicing

Page 36: ERP Centric Data Mining and KD

SAP BW - IBM Intelligent Miner

IBM Intelligent Miner is designed to:

• Copy data from SAP BW to IBM Intelligent Miner
  – Results of reports in BW – Modeling in Business Explorer Analyzer
  – Data direct from InfoCubes (for cross-selling analysis) – Descriptions, hierarchies
• Results data from IBM IM back into SAP BW
  – Results of segmentation can be loaded as master data or hierarchies
• Data transport is designed through Wizards in SAP BW
  – Possible to get a good view of Intelligent Miner results from SAP BW

Page 37: ERP Centric Data Mining and KD

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP centric Data Mining
• Q&A

Page 38: ERP Centric Data Mining and KD

ERPs and Data Mining: Good and the Bad News

• Good News
  – Known Business Processes
  – Few Data Sources
  – Improved Data Quality
  – Metadata Integration
  – Near real-time data mining
  – Closed-loop Knowledge Discovery
  – Consistent Infrastructure
• Bad News
  – Complex Data Structures
  – Performance
  – Availability
  – Very few Data Mining algorithms - Today

CRISP-DM

1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Page 39: ERP Centric Data Mining and KD

Data Mining Process and ERP Data Mining

CRISP-DM phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment

Called out on the diagram: Business Understanding, Data Understanding, Data Preparation, Deployment

Will reduce data mining project time by up to 50%

Good News for Future Business Applications

Source: http://www.crisp-dm.org/

Page 40: ERP Centric Data Mining and KD

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP centric Data Mining
• Q&A

Page 41: ERP Centric Data Mining and KD

INFORMATION FRAMEWORKS

Technology/Solution Assessment

Product Strategy Solution Strategy

Product Positioning Competitive Analysis

Software product architecture Marketing Strategy

Product Performance and Benchmarking Consulting

Hardware Configuration

Market Research Market Assessment

Competitive Analysis Technology due

Seminars, Webinars, Keynotes

Panel Moderator, Publications

Hands-on training, Conferences

Executive and Senior IT Management Consulting

Enterprise Information Architectures (EIA) Business Case Development

Information Architecture Application

Deployment Architectures implementation

Legacy Application Migration Strategies

ERP Application deployment strategies

Enterprise Applications Integration (EAI)

Architectures, Service Modeling and design, EAI technology assessment

Tools and Technology Assessment

Vendor Selection and Assessment

Conference Room Pilot implementation

Business Intelligence and Portals

Architectures, Methodologies

Tool/technology/Vendor assessment and selection

Data Warehouse, Data Marts, Analytics, Information Delivery

Deployment Architectures

Business Intelligence and eBusiness Integration architectures

Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and knowledge Transfer

KNOWLEDGE TRANSFER

INFORMATION TECHNOLOGY ORGANIZATION

SOFTWARE AND SOLUTION VENDORS

INFORMATION TECHNOLOGY INVESTORS

http://infoframeworks.com

Page 42: ERP Centric Data Mining and KD

Questions

Naeem Hashmi, Chief Technology Officer

September 10, 2002

Email: [email protected] Site: http://infoframeworks.com

Tel: 603-432-4550