![Page 1: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/1.jpg)
A Look at Data Mining
Presented by:
Charles Hollingsworth
Flavia Peynado
Ritch Overton
DSc8020, Group Presentation, July 31, 2002
![Page 2: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/2.jpg)
What is Data Mining?
Data mining may be described as the process of extracting previously unidentified, valid, and actionable information from large databases and then using that information to make crucial business decisions.
![Page 3: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/3.jpg)
Why the need for data mining?
The business environment is constantly changing:
• Customer behavior patterns
• Market saturation
• New niche markets
• Increased commoditization
• Time to market
• Shorter product life cycles
• Increased competition and business risks
![Page 4: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/4.jpg)
Drivers
• The Customer
• Products
• Competition
• Operations/Data Assets
Enablers
• Data flood
• Growth of data warehousing
• New IT solutions
• New research in machine learning
![Page 5: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/5.jpg)
Process overview contd.
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Data Transformation
5. Data Mining
6. Analysis of results
7. Assimilation of results
![Page 6: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/6.jpg)
Effort needed at each stage of data mining
[Chart: effort (vertical axis, 0 to 60) at each stage of the data mining process]
![Page 7: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/7.jpg)
Visualization
Goal is to provide a summary and overview of a dataset
Promotes Understanding: Deconstructive process
Promotes Trust: Constructive process
Narrows the gap between human and computer during data analysis
![Page 8: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/8.jpg)
Types of Visualization Tools
• Histograms
• Bar Charts
• Scatter Plots
• Pie Charts
• Line Plots
• Time-Series Plots
• Decision Trees
• Coxcomb Plots
• Stereograms
• Moseley’s X-rays
![Page 9: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/9.jpg)
Histogram
Graphically illustrates how many observations fall in various categories
[Figure: Histogram for Diameter; horizontal axis: Category; vertical axis: 0 to 100]
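As a rough illustration of the counting behind a histogram, the sketch below groups observations into equal-width bins and prints one text bar per bin. The function name `histogram_counts`, the bin width, and the diameter values are all hypothetical; none of them come from the slides.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many observations fall in each bin [k*w, (k+1)*w)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical diameter measurements (not taken from the slides).
diameters = [1.2, 1.7, 2.1, 2.4, 2.6, 3.3, 3.8, 4.0]
counts = histogram_counts(diameters, bin_width=1)

# Print one row per bin, with one '#' per observation in that bin.
for lower, n in counts.items():
    print(f"{lower}-{lower + 1}: {'#' * n}")
```

A plotting library would normally draw the bars; the point here is only that each bar's height is a count of observations per category.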
![Page 10: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/10.jpg)
Bar Chart
Categories are placed on the vertical axis, rather than on the horizontal axis as in a histogram.
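A minimal text sketch of that layout, with categories running down the vertical axis and bar length running horizontally. The function `horizontal_bar_chart` and the regional figures are made up for illustration only.

```python
def horizontal_bar_chart(data, width=30):
    """Return text lines with one bar per category, scaled to `width` characters."""
    largest = max(data.values())
    label_width = max(len(name) for name in data)
    lines = []
    for name, value in data.items():
        # Bar length is proportional to the value; the largest value fills `width`.
        bar = "#" * round(value / largest * width)
        lines.append(f"{name:<{label_width}} | {bar} {value}")
    return lines


# Hypothetical category totals (not taken from the slides).
sales = {"North": 40, "South": 30, "East": 10}
chart = horizontal_bar_chart(sales, width=20)
print("\n".join(chart))
```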
![Page 11: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/11.jpg)
Scatter Plot
[Figure: scatter plot of Salary (vertical axis, 0 to 25) against Domestic Gross (horizontal axis, 0 to 200)]
Graphical representation of the relationship between two variables.
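A scatter plot shows the relationship visually; one common numeric companion is the Pearson correlation coefficient. The sketch below pairs two variables into plot points and computes that coefficient by hand. The gross and salary figures are invented for illustration, and the slides do not say that correlation was computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data (not taken from the slides).
gross = [20, 60, 90, 150, 180]
salary = [2, 5, 9, 14, 20]

points = list(zip(gross, salary))  # the (x, y) pairs a scatter plot would draw
r = pearson_r(gross, salary)
print(points)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 corresponds to points lying close to a straight line; a value near 0 corresponds to no linear pattern.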
![Page 12: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/12.jpg)
Pie Chart
Radii are used to divide a circle into wedges. The resulting angles represent the values of the wedges.
[Figure: pie chart, Spring 2000 Salary Survey, with wedges for:]
• <$30,000
• $30,000 to $39,999
• $40,000 to $49,999
• $50,000 to $59,999
• $60,000 to $69,999
• More than $70,000
• No Answer
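The wedge angles of a pie chart follow directly from each category's share of the total. The sketch below uses the salary bands from the survey legend, but the response counts are hypothetical because the slide text does not give them.

```python
def wedge_angles(values):
    """Map each category to its wedge angle in degrees (share of total * 360)."""
    total = sum(values.values())
    return {label: 360 * count / total for label, count in values.items()}


# Salary bands from the slide legend; the counts are made up for illustration.
responses = {
    "<$30,000": 10,
    "$30,000 to $39,999": 20,
    "$40,000 to $49,999": 30,
    "$50,000 to $59,999": 20,
    "$60,000 to $69,999": 10,
    "More than $70,000": 5,
    "No Answer": 5,
}

angles = wedge_angles(responses)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```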
![Page 13: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/13.jpg)
Line Plot
Connects consecutive data points with line segments so that trends are easier to see
![Page 14: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/14.jpg)
Time-Series Plot: Playfair’s
• Helpful in forecasting future values
• Time variable is placed on the horizontal axis
• Makes patterns in data more apparent
• In Playfair’s chart, the area between the two time-series curves is emphasized to show the difference between them, which represents the balance of trade.
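The idea of reading the gap between two time series as a balance can be sketched numerically. The example below computes the balance at each time point and a trapezoidal estimate of the area between the curves. The years and the export and import figures are hypothetical, not Playfair's data, and `area_between` is an illustrative helper name.

```python
def area_between(times, upper, lower):
    """Trapezoidal estimate of the signed area between two time series."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        gap_prev = upper[i - 1] - lower[i - 1]
        gap_next = upper[i] - lower[i]
        area += dt * (gap_prev + gap_next) / 2
    return area


# Hypothetical trade figures (not Playfair's actual data).
years = [1700, 1710, 1720, 1730]
exports = [50, 60, 80, 90]
imports = [40, 55, 60, 70]

balance = [e - m for e, m in zip(exports, imports)]  # balance of trade per year
cumulative = area_between(years, exports, imports)   # shaded area between curves

print(balance)
print(cumulative)
```

A positive area means exports exceeded imports over the period as a whole; the sign flips wherever the curves cross.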
![Page 16: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/16.jpg)
Decision Trees
Conventions for Decision Trees:
1. Composed of nodes (points in time) and branches (possible decisions or outcomes).
2. Squares represent decision nodes, circles represent probability nodes, triangles represent end nodes.
3. Probabilities are listed on probability branches.
4. Monetary values are listed on the branches where they occur.
5. Decision maker has no control over probability branches.
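One common way to evaluate a tree built with these conventions is to fold it back from the end nodes: probability nodes take the probability-weighted average of their branches, and decision nodes take the best branch. The sketch below assumes that approach; the slides do not state it. The tree, its payoffs, and its probabilities are hypothetical, and for simplicity monetary values are placed on end nodes rather than on individual branches.

```python
def expected_value(node):
    """Fold back a decision tree and return its expected monetary value."""
    kind = node["kind"]
    if kind == "end":
        # End node (triangle): the payoff is known.
        return node["value"]
    if kind == "probability":
        # Probability node (circle): weight each outcome by its probability.
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        # Decision node (square): the decision maker picks the best branch.
        return max(expected_value(child) for _, child in node["branches"])
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical tree: launch a product (uncertain payoff) or do nothing.
tree = {
    "kind": "decision",
    "branches": [
        ("launch", {
            "kind": "probability",
            "branches": [
                (0.6, {"kind": "end", "value": 100}),
                (0.4, {"kind": "end", "value": -50}),
            ],
        }),
        ("do nothing", {"kind": "end", "value": 0}),
    ],
}

best = expected_value(tree)
print(best)
```

Here the launch branch is worth 0.6 × 100 + 0.4 × (−50) = 40, which beats doing nothing, so the tree as a whole evaluates to about 40.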
![Page 17: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/17.jpg)
Decision Trees
![Page 18: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/18.jpg)
Coxcomb Plot
In 1858, Florence Nightingale constructed graphs of her own design, which she called “Coxcombs”.
In a Coxcomb, the wedges have equal angles and their radii vary, whereas in a pie chart the radii are equal and the angles of the wedges vary.
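A small sketch of how wedge radii could be derived for such a plot. It assumes that each wedge's area (not its radius) is proportional to the value, which is how Nightingale's diagrams are usually described; under that assumption the radius grows with the square root of the value. The function names and the sample values are hypothetical.

```python
from math import pi, sqrt


def sector_area(radius, angle):
    """Area of a circular sector with the given radius and angle (radians)."""
    return 0.5 * angle * radius ** 2


def coxcomb_radii(values):
    """Radii of equal-angle wedges whose areas equal the given values."""
    angle = 2 * pi / len(values)
    # Solve value = 0.5 * angle * r**2 for r.
    return [sqrt(2 * v / angle) for v in values]


# Hypothetical counts for three periods (not Nightingale's data).
values = [4, 16, 36]
radii = coxcomb_radii(values)
print(radii)
```

Because the radius scales with the square root, quadrupling a value only doubles the wedge's radius.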
![Page 20: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/20.jpg)
Stereogram
Luigi Perozzo, from the Annali di Statistica, 1880
The population of Sweden from 1750-1875 by age groups
![Page 22: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/22.jpg)
Moseley’s X-rays
His X-ray measurements led Henry Moseley to discover that the atomic number is more than a serial number: it has a physical basis. Moseley proposed that the atomic number equals the number of positive charges in the nucleus, and hence the number of electrons in a neutral atom of the element.
![Page 24: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/24.jpg)
Other Visualization Tools
• Doughnut
• Area Chart
• Box Plot
• Radar
![Page 25: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/25.jpg)
Algorithms
Predictive
• Regression
• Classification
Descriptive
• Parallel Formulation of Classification
• Association Rule Discovery
• Sequential Pattern Discovery Analysis
• Clustering
![Page 26: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/26.jpg)
Applying
Relevance to managers
• Decreasing Costs
• Valuing Appropriately
• Effective Implementation
![Page 27: A Look at Data Mining Presented by: Charles Hollingsworth Flavia Peynado Ritch Overton DSc8020, Group Presentation, July 31, 2002](https://reader035.vdocuments.net/reader035/viewer/2022070401/56649f215503460f94c399fe/html5/thumbnails/27.jpg)
Conclusion
Converging Developments
• Data compilation
• Processing power
• Maturing algorithms
• Visualization
Accessible Resources