Page 1: MAKING THE BUSINESS BETTER

DATA MINING

Presented by: Mohammed Dwikat

Presented to: Faculty of IT, MIS Department, An Najah National University

Page 2: What is Data Mining

Exploration and analysis of large quantities of data in order to discover meaningful patterns.

Extraction of useful information from data.

Example: grouping together similar documents returned by a search engine.

Page 3: What is NOT Data Mining

Looking up a phone number in a phone directory.

Querying a Web search engine for information about “Amazon”.

Searching for a customer name in a bank.

Page 4: Data Mining Tasks

Predictive Methods: use some variables to predict unknown or future values of other variables.

Descriptive Methods: find human-interpretable patterns that describe the data.

Page 5: Data Mining Tasks

Clustering [Descriptive]

Association Rule Discovery [Descriptive]

Sequential Pattern Discovery [Descriptive]

Classification [Predictive]

Regression [Predictive]

Deviation Detection [Predictive]

Page 6: Clustering Example

Euclidean distance-based clustering in 3-D space.

Intracluster distances are minimized.

Intercluster distances are maximized.
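One way to make the idea of Euclidean distance-based clustering concrete is a small k-means sketch. The slides do not name a specific algorithm, so k-means is an assumption here, and the 3-D points and starting centroids are made up for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its assigned points, until the centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 3-D points forming two well-separated groups.
points = [(1, 1, 1), (1, 2, 1), (2, 1, 1), (9, 9, 9), (9, 8, 9), (8, 9, 9)]
# Assumption: the first and last points are used as starting centroids.
centroids, clusters = kmeans(points, [points[0], points[-1]])
print(clusters)
```

With these points the three low-valued points end up in one cluster and the three high-valued points in the other, so distances inside each cluster are small and the distance between the clusters is large.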

Page 7: Other Clustering Examples

Market Segmentation: Goal: to subdivide a market into distinct subsets of customers, where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.

Document Clustering: Goal: to find groups of documents that are similar to each other based on the important terms appearing in them.

Page 8: Association Rule Example

Predict the occurrence of an item based on the occurrences of other items.

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
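The strength of the two rules can be checked against the five transactions on this slide. The sketch uses support and confidence, the standard measures for association rules. The slide does not name them, so treating them as the evaluation criteria is an assumption.

```python
# The five transactions from the slide, one set of items per TID.
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

rules = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in rules:
    s = support(lhs | rhs, transactions)
    c = confidence(lhs, rhs, transactions)
    print(f"{sorted(lhs)} --> {sorted(rhs)}: support={s:.2f}, confidence={c:.2f}")
```

On this data {Milk} --> {Coke} has support 0.60 and confidence 0.75, and {Diaper, Milk} --> {Beer} has support 0.40 and confidence 0.67.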

Page 9: Other Association Rule Examples

Marketing and Sales Promotion

Supermarket shelf management

Inventory Management

Page 10: Sequential Pattern Discovery Example

Find rules that predict strong sequential dependencies among different events.

(A B) (C) --> (D E)
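A sequential rule of this shape can be tested against recorded event sequences. The sketch below treats each sequence as an ordered list of event sets and measures how often (A B) followed by (C) is later followed by (D E). The function names and the four sequences are hypothetical, and rule confidence is used as an assumed measure of a "strong" dependency.

```python
def contains_in_order(sequence, pattern):
    """True if every element of pattern is a subset of some element of
    sequence, with the matched elements appearing in the same order."""
    pos = 0
    for wanted in pattern:
        # Skip ahead to the first remaining element that covers `wanted`.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True

def rule_confidence(sequences, antecedent, consequent):
    """Among sequences containing the antecedent, the fraction that also
    contain the antecedent followed by the consequent."""
    with_lhs = [s for s in sequences if contains_in_order(s, antecedent)]
    if not with_lhs:
        return 0.0
    with_rule = [s for s in with_lhs
                 if contains_in_order(s, antecedent + consequent)]
    return len(with_rule) / len(with_lhs)

# Hypothetical event sequences; each set holds events that occur together.
sequences = [
    [{"A", "B"}, {"C"}, {"D", "E"}],
    [{"A", "B", "F"}, {"G"}, {"C"}, {"D", "E"}],
    [{"A", "B"}, {"C"}, {"D"}],
    [{"C"}, {"A", "B"}],
]
antecedent = [{"A", "B"}, {"C"}]
consequent = [{"D", "E"}]
print(rule_confidence(sequences, antecedent, consequent))
```

Three of the four sequences contain (A B) followed by (C), and two of those go on to contain (D E), so the rule holds in two out of three cases on this made-up data.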

Page 11: Other Sequential Pattern Discovery Examples

In telecommunications alarm logs:
(Inverter_Problem Excessive_Line_Current) (Rectifier_Alarm) --> (Fire_Alarm)

In point-of-sale transaction sequences:
Computer Bookstore: (Intro_To_Visual_C) (C++_Primer) --> (Perl_for_dummies, Tcl_Tk)
Athletic Apparel Store: (Shoes) (Racket, Racketball) --> (Sports_Jacket)

Page 12: Classification Example

Given a collection of records (the training set), each record contains a set of attributes, one of which is the class.

Find a model for the class attribute as a function of the values of the other attributes.

Goal: previously unseen records should be assigned a class as accurately as possible.

A test set is used to determine the accuracy of the model.

Page 13: Classification Example

Training Set (Refund and Marital Status are categorical, Taxable Income is continuous, Cheat is the class):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Test Set:

Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             ?
Yes     Married         50K             ?
No      Married         150K            ?
Yes     Divorced        90K             ?
No      Single          40K             ?
No      Married         80K             ?

[Diagram: Training Set --> Learn Classifier --> Model; the Test Set is fed to the Model]
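To show what "a model for the class attribute as a function of the other attributes" can look like, here is a sketch using the records from this slide. The `predict` rules are hand-written for illustration and were not produced by a learning algorithm or taken from the slides. The 80K income threshold is likewise an assumption, chosen because it separates the training records.

```python
# Training records from the slide: (refund, marital_status, income_k, cheat).
training_set = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]

# Unlabeled records from the slide: (refund, marital_status, income_k).
test_set = [
    ("No", "Single", 75),
    ("Yes", "Married", 50),
    ("No", "Married", 150),
    ("Yes", "Divorced", 90),
    ("No", "Single", 40),
    ("No", "Married", 80),
]

def predict(refund, marital_status, income_k):
    """Hypothetical hand-written decision rules standing in for a learned
    model; the 80K threshold is an assumption, not from the slides."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    return "Yes" if income_k >= 80 else "No"

def accuracy(labeled_records):
    """Fraction of labeled records whose predicted class matches the label."""
    correct = sum(predict(r, m, i) == label for r, m, i, label in labeled_records)
    return correct / len(labeled_records)

print("training accuracy:", accuracy(training_set))
for record in test_set:
    print(record, "-->", predict(*record))
```

These rules reproduce all ten training labels, but that figure says little about unseen data. The test records on the slide carry no labels ("?"), so measuring real accuracy would need a labeled test set, as the previous slide notes.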

Page 14: Other Classification Examples

Direct Marketing: reduce the cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.

Fraud Detection: predict fraudulent cases in credit card transactions.

Page 15: Regression Examples

Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.

Examples:

Predicting sales amounts based on advertising expenditure.

Predicting wind velocities as a function of temperature, humidity, air pressure, etc.

Time series prediction of stock market indices.
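For the linear case, the sales-versus-advertising example can be sketched as an ordinary least-squares line fit. The figures below are invented for illustration and the single-variable linear model is an assumption. The slides also allow several input variables and nonlinear dependencies.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: advertising expenditure and the resulting sales amounts.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(advertising, sales)
predicted_sales = slope * 6.0 + intercept  # forecast for an unseen spend of 6.0
print(f"sales = {slope:.2f} * advertising + {intercept:.2f}")
print(f"predicted sales at 6.0: {predicted_sales:.2f}")
```

On this toy data the fitted line is roughly sales = 1.97 * advertising + 1.09, which gives a forecast of about 12.91 for a spend of 6.0.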

Page 16: Deviation/Anomaly Example

Detect significant deviations from normal behavior.

Applications:

Credit Card Fraud Detection

Network Intrusion Detection
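A very simple way to flag "significant deviations" is to measure how many standard deviations each value lies from the mean (a z-score). The slides do not prescribe a method, so the z-score test, the threshold of 2.0, and the transaction amounts below are all assumptions made for illustration.

```python
import statistics

def find_deviations(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds `threshold`
    population standard deviations."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mean) / std > threshold]

# Hypothetical card transaction amounts; one is far larger than the rest.
amounts = [20, 22, 19, 21, 23, 20, 22, 250]
print(find_deviations(amounts))
```

Only the 250 transaction is flagged here. With so few points a single extreme value also inflates the standard deviation, so real fraud or intrusion detection would normally use more data and a more robust measure.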

Page 17: Prediction Measurement

Confusion Matrix

Example of a confusion matrix:

              Predicted
Actual    Pass    Fail
Pass      9       3
Fail      1       7

True Positive vs. True Negative

False Positive vs. False Negative
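The four cells of the example matrix map directly onto true/false positives and negatives. The sketch assumes rows are actual outcomes, columns are predicted outcomes, and "Pass" is the positive class. Accuracy, precision and recall are standard summary measures rather than ones named on the slide.

```python
# Cells of the example matrix, assuming rows = actual and columns = predicted,
# with "Pass" as the positive class.
tp, fn = 9, 3  # actual Pass: predicted Pass, predicted Fail
fp, tn = 1, 7  # actual Fail: predicted Pass, predicted Fail

total = tp + fn + fp + tn
accuracy = (tp + tn) / total   # share of all predictions that were right
precision = tp / (tp + fp)     # share of predicted Pass that really passed
recall = tp / (tp + fn)        # share of actual Pass that was predicted Pass

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For this matrix the accuracy is 0.80, precision is 0.90 and recall is 0.75. If the rows and columns were the other way round, accuracy would stay the same while precision and recall would swap.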

Page 18: Challenges

Distributed Data

Dimensionality

Complex and Heterogeneous Data

Data Quality

Data Ownership and Distribution

Privacy Preservation

Page 19: Data Mining Applications

WEKA: free, simple, limited

SAS Enterprise Miner: Data Miner, Text Miner

SPSS: regression, time series and more

Page 20: Questions

Thank You