A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002

A Look at Data Mining

Presented by:

Charles Hollingsworth

Flavia Peynado

Ritch Overton

DSc8020, Group Presentation, July 31, 2002

What is Data Mining?

It may be described as the process of extracting previously unidentified, valid, and actionable information from large databases and then using the information to make crucial business decisions.

Why the need for data mining?

Business environment is constantly changing.

• Customer Behavior Patterns

• Market Saturation

• New niche markets

• Increased commoditization

• Time to market

• Shorter product life cycles

• Increased competition and business risks

Drivers

• The Customer

• Products

• Competition

• Operations/Data Assets

Enablers

• Data flood

• Growth of data warehousing

• New IT solutions

• New research in machine learning

Process overview contd.

1. Business Understanding

2. Data understanding

3. Data Preparation

4. Data Transformation

5. Data Mining

6. Analysis of results

7. Assimilation of results

Effort needed at each stage of data mining

[Figure: chart of effort by stage; vertical axis "Effort", scale 0 to 60]

Visualization

Goal is to provide a summary and overview of a dataset

Promotes Understanding: Deconstructive process

Promotes Trust: Constructive process

Narrows the gap between human and computer during data analysis

Types of Visualization Tools

• Histograms

• Bar Charts

• Scatter Plots

• Pie Charts

• Line Plots

• Time-Series Plots

• Decision Trees

• Coxcomb Plots

• Stereograms

• Moseley’s X-rays

Histogram

Graphically illustrates how many observations fall into each category (bin)

[Figure: Histogram for Diameter; horizontal axis: Category; vertical axis: 0 to 100]
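
The counting behind a histogram can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_counts`, the bin width, and the diameter values are assumptions made for the example, not data from the slide.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many observations fall in each bin of width bin_width."""
    counts = Counter()
    for v in values:
        # Lower edge of the bin that contains v, e.g. 23 -> 20 when bin_width is 10
        lower = (v // bin_width) * bin_width
        counts[lower] += 1
    return dict(sorted(counts.items()))


# Hypothetical diameter measurements (not taken from the slide)
diameters = [12, 18, 23, 25, 27, 31, 34, 38, 39, 45]

# Text rendering: one row per bin, one '#' per observation
for lower, n in histogram_counts(diameters, 10).items():
    print(f"[{lower}, {lower + 10}): {'#' * n}")
```

Each bar's height is simply the count for its bin; a plotting library would draw the same counts as rectangles.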

Bar Chart

Categories are placed on the vertical axis, rather than on the horizontal axis as in a histogram

Scatter Plot

[Figure: Scatter Plot; horizontal axis: Domestic Gross (0 to 200); vertical axis: Salary (0 to 25)]

Graphical representation of the relationship between two variables.
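
A scatter plot shows the relationship visually; one common way to summarize the same relationship as a number is the Pearson correlation coefficient. The sketch below is an assumption about how one might quantify it, not something the slide prescribes, and the `domestic_gross` and `salary` values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical points, one (domestic gross, salary) pair per observation
domestic_gross = [20, 60, 95, 130, 180]
salary = [3, 8, 10, 15, 22]

r = pearson_r(domestic_gross, salary)
print(f"r = {r:.3f}")
```

A value near +1 or -1 corresponds to points that lie close to a straight line; a value near 0 corresponds to a shapeless cloud.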

Pie Chart

Radii are used to divide a circle into wedges. The resulting angles represent the values of the wedges.

Spring 2000 Salary Survey (pie chart legend)

• <$30,000

• $30,000 to $39,999

• $40,000 to $49,999

• $50,000 to $59,999

• $60,000 to $69,999

• More than $70,000

• No Answer
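
The way wedge angles encode values can be sketched as follows. The category labels come from the legend above, but the response counts are hypothetical (the slide gives none), and `wedge_angles` is a name chosen for this example.

```python
def wedge_angles(counts):
    """Map each category to its wedge angle in degrees (all angles sum to 360)."""
    total = sum(counts.values())
    return {label: 360.0 * n / total for label, n in counts.items()}


# Hypothetical response counts; only the labels are taken from the slide
survey = {
    "<$30,000": 5,
    "$30,000 to $39,999": 10,
    "$40,000 to $49,999": 20,
    "$50,000 to $59,999": 25,
    "$60,000 to $69,999": 15,
    "More than $70,000": 15,
    "No Answer": 10,
}

for label, angle in wedge_angles(survey).items():
    print(f"{label:<20} {angle:6.1f} degrees")
```

Each wedge's share of the full circle equals its category's share of the total, so a category with a quarter of the responses gets a quarter turn.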

Line Plot

Connects consecutive data points to enhance visualization

Time-Series Plot: Playfair’s

• Helpful in forecasting future values

• Time variable is placed on the horizontal axis

• Makes patterns in data more apparent

• In Playfair’s chart, the area between the two time-series curves is emphasized to show the difference between them, which represents the balance of trade.
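
The shaded area between two time-series curves can be estimated numerically. The sketch below uses the trapezoidal rule and assumes the two curves are exports and imports sampled at the same times; the series names and all numbers are hypothetical, chosen only to illustrate the idea.

```python
def area_between(times, series_a, series_b):
    """Trapezoidal estimate of the signed area between two curves.

    Both series must be sampled at the same time points. The result is
    positive where series_a lies above series_b.
    """
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        gap_prev = series_a[i - 1] - series_b[i - 1]
        gap_curr = series_a[i] - series_b[i]
        # Area of the trapezoid spanned by the two gaps over this interval
        area += 0.5 * (gap_prev + gap_curr) * dt
    return area


# Hypothetical yearly samples (arbitrary money units)
years = [1700, 1710, 1720, 1730]
exports = [1.0, 1.2, 1.5, 1.9]
imports = [0.8, 0.9, 1.0, 1.2]

balance_area = area_between(years, exports, imports)
print(f"Accumulated balance over the period: {balance_area:.2f}")
```

The gap at any single time point is the balance at that moment; the area accumulates it over the whole period, which is what the shading makes visible.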

Decision Trees

Conventions for Decision Trees:

1. Composed of nodes (points in time) and branches (possible decisions).

2. Squares represent decision nodes, circles represent probability nodes, triangles represent end nodes.

3. Probabilities are listed on probability branches.

4. Monetary values are listed on the branches where they occur.

5. Decision maker has no control over probability branches.
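
Trees drawn with these conventions are commonly evaluated by "rolling back" from the end nodes: a probability node takes the probability-weighted average of its branches, and a decision node takes its best branch. The slide does not name this procedure, so the sketch below is an assumption about how such a tree might be evaluated. The dictionary layout, the `rollback` name, and the launch example are all hypothetical, and payoffs are placed on end nodes for simplicity rather than on individual branches.

```python
def rollback(node):
    """Return the expected monetary value of a decision-tree node."""
    kind = node["kind"]
    if kind == "end":
        # Triangle: final payoff
        return node["value"]
    if kind == "chance":
        # Circle: the decision maker has no control, so weight by probability
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        # Square: the decision maker picks the branch with the highest value
        return max(rollback(child) for _, child in node["branches"])
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical tree: launch a product (60% success) or do nothing
tree = {
    "kind": "decision",
    "branches": [
        ("launch", {
            "kind": "chance",
            "branches": [
                (0.6, {"kind": "end", "value": 100.0}),
                (0.4, {"kind": "end", "value": -40.0}),
            ],
        }),
        ("do nothing", {"kind": "end", "value": 0.0}),
    ],
}

print(f"Expected value of the best decision: {rollback(tree):.1f}")
```

Here the launch branch is worth 0.6 × 100 + 0.4 × (−40) = 44, which beats doing nothing, so the decision node rolls back to 44.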

Decision Trees

Coxcomb Plot

In 1858, Florence Nightingale constructed graphs of her own design, which she called “Coxcombs”.

In a Coxcomb, every wedge has the same angle and the radii vary, whereas in a pie chart the angles of the wedges vary.
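
How the radii might be derived from the data can be sketched as follows. The sketch assumes that the area of each wedge, not its radius, is proportional to the value, which is how Nightingale's diagrams are usually read; the slide itself only says that the radii vary. The function name `coxcomb_radii` and the sample values are hypothetical.

```python
from math import pi, sqrt


def coxcomb_radii(values, scale=1.0):
    """Radius of each equal-angle wedge so that wedge area equals scale * value."""
    angle = 2 * pi / len(values)  # every wedge gets the same angle
    # Area of a circular sector is 0.5 * r**2 * angle, so r = sqrt(2 * area / angle)
    return [sqrt(2 * scale * v / angle) for v in values]


# Hypothetical values, one per wedge
values = [1, 4, 9, 16]
for v, r in zip(values, coxcomb_radii(values)):
    print(f"value {v:>2} -> radius {r:.3f}")
```

Under this assumption the radius grows with the square root of the value, so a value four times larger produces a wedge only twice as long.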

Stereogram

Luigi Perozzo, from the Annali di Statistica, 1880

The population of Sweden from 1750-1875 by age groups

Moseley’s X-rays

This plot of X-ray data led Henry Moseley to discover that the atomic number is more than a serial number: it has a physical basis. Moseley proposed that the atomic number is the number of positive charges in the nucleus of an atom of the element, which equals the number of electrons in the neutral atom.

Other Visualization Tools

• Doughnut

• Area Chart

• Box Plot

• Radar

Algorithms

Predictive

• Regression

• Classification

Descriptive

• Parallel Formulation of Classification

• Association Rule Discovery

• Sequential Pattern Discovery Analysis

• Clustering

Applying

Relevance to managers

• Decreasing Costs

• Valuing Appropriately

• Effective Implementation

Conclusion

Converging Developments

• Data compilation

• Processing power

• Maturing Algorithms

• Visualization

Accessible Resources