lecture 5: automatic cluster detection lecture 6: artificial neural networks
DESCRIPTION
Brief introduction to lectures. Lecture 5: Automatic cluster detection. Lecture 6: Artificial neural networks. Lecture 7: Evaluation of discovered knowledge. Transparencies prepared by Ho Tu Bao [JAIST].
Lecture 5: Automatic cluster detection
Lecture 6: Artificial neural networks
Lecture 7: Evaluation of discovered knowledge
Brief introduction to lectures
Transparencies prepared by Ho Tu Bao [JAIST]
Lecture 5: Automatic Cluster Detection
• One of the most widely used KDD classification techniques for unsupervised data.
• Content of the lecture
1. Introduction
2. Partitioning Clustering
3. Hierarchical Clustering
4. Software and case-studies
• Prerequisite: Nothing special
Partitioning Clustering
A partition of a set of n objects $X = \{x_1, x_2, \dots, x_n\}$ is a collection of $K$ disjoint non-empty subsets $P_1, P_2, \dots, P_K$ of $X$ ($K \le n$), often called clusters, satisfying the following conditions:
(1) they are disjoint: $P_i \cap P_j = \emptyset$ for all $P_i$ and $P_j$, $i \ne j$;
(2) their union is $X$: $P_1 \cup P_2 \cup \dots \cup P_K = X$.
Denote the partition $P = \{P_1, P_2, \dots, P_K\}$; the $P_i$ are called components of $P$.
• Each cluster must contain at least one object
• Each object must belong to exactly one group
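The two conditions of the definition can be checked mechanically. The following is a minimal sketch, assuming objects are represented by hashable values such as integer indices; the function name `is_partition` is hypothetical and not part of the lecture.

```python
from itertools import combinations


def is_partition(components, universe):
    """Return True if `components` is a partition of `universe`."""
    sets = [set(c) for c in components]
    # Every component (cluster) must contain at least one object.
    if any(len(s) == 0 for s in sets):
        return False
    # Condition (1): the components are pairwise disjoint.
    if any(a & b for a, b in combinations(sets, 2)):
        return False
    # Condition (2): the union of the components is the whole set X.
    return set().union(*sets) == set(universe)


X = range(1, 7)
print(is_partition([{1, 2, 3}, {4}, {5, 6}], X))   # True
print(is_partition([{1, 2}, {2, 3}, {4, 5, 6}], X))  # False: object 2 is in two groups
```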
Partitioning Clustering
What is a “good” partitioning clustering?
Key ideas: Objects in each group are similar, and objects in different groups are dissimilar.
Minimize the within-group distance and maximize the between-group distance.
Notice: There are many ways to define the “within-group distance” (the average distance to the group’s center, the average distance between all pairs of objects, etc.) and the “between-group distance”. In general, it is not feasible to find the optimal clustering, since the number of possible partitions grows very quickly with n.
Example: $P = \{\{x_1, x_4, x_7, x_9, x_{10}\}, \{x_2, x_8\}, \{x_3, x_5, x_6\}\}$, with components $P_1$, $P_2$, $P_3$ in that order.
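As an illustration of one of the many possible definitions, the sketch below takes the within-group distance to be the average distance between all pairs of objects in the same group, and the between-group distance to be the average distance between objects in different groups. Both function names and the toy points are assumptions made for this example only.

```python
from itertools import combinations
from math import dist


def within_group_distance(clusters):
    """Average distance between all pairs of objects that share a cluster."""
    pairs = [dist(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs) if pairs else 0.0


def between_group_distance(clusters):
    """Average distance between pairs of objects in different clusters."""
    pairs = [dist(a, b)
             for c1, c2 in combinations(clusters, 2)
             for a in c1 for b in c2]
    return sum(pairs) / len(pairs) if pairs else 0.0


# Two groupings of the same four points (toy data).
good = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 5.0), (5.0, 6.0)]]
bad = [[(0.0, 0.0), (5.0, 5.0)], [(0.0, 1.0), (5.0, 6.0)]]
for name, clustering in (("good", good), ("bad", bad)):
    print(name, within_group_distance(clustering), between_group_distance(clustering))
```

Under these definitions the first grouping has the smaller within-group distance and the larger between-group distance, which matches the key idea stated on the slide.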
Hierarchical Clustering
A hierarchical clustering is a sequence of partitions in which each partition is nested into the next partition in the sequence.
Partition Q is nested into partition P if every component of Q is a subset of a component of P.
(This definition is for bottom-up hierarchical clustering. In case of top-down hierarchical clustering, “next” becomes “previous”).
$P = \{x_1, x_4, x_7, x_9, x_{10}\}, \{x_2, x_8\}, \{x_3, x_5, x_6\}$
$Q = \{x_1, x_4, x_9\}, \{x_7, x_{10}\}, \{x_2, x_8\}, \{x_5\}, \{x_3, x_6\}$
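The nesting condition translates directly into a subset test. In this minimal sketch the function name `is_nested` is hypothetical, and the sets `P` and `Q` hold the object indices of the slide's example as best recoverable from the transcript.

```python
def is_nested(q, p):
    """Return True if every component of q is a subset of some component of p."""
    return all(any(set(qc) <= set(pc) for pc in p) for qc in q)


# Object indices of the example partitions (assumed reading of the slide).
P = [{1, 4, 7, 9, 10}, {2, 8}, {3, 5, 6}]
Q = [{1, 4, 9}, {7, 10}, {2, 8}, {5}, {3, 6}]

print(is_nested(Q, P))  # True: Q is nested into P
print(is_nested(P, Q))  # False: P is coarser than Q
```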
Bottom-up Hierarchical Clustering
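$\{x_1, x_2, x_3, x_4, x_5, x_6\}$
$\{x_1, x_2, x_3, x_4\}, \{x_5, x_6\}$
$\{x_1, x_2, x_3\}, \{x_4\}, \{x_5, x_6\}$
$\{x_1\}, \{x_2, x_3\}, \{x_4\}, \{x_5\}, \{x_6\}$
$\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_6\}$
[Figure: dendrogram over the leaves x1 x2 x3 x4 x5 x6; bottom-up clustering builds these partitions from the last row to the first]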
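A bottom-up sequence of nested partitions can be produced by repeatedly merging the two closest clusters. The sketch below is one possible instance, assuming one-dimensional numeric objects and single linkage (cluster distance is the smallest distance between members); neither choice comes from the lecture, and the function name `bottom_up` is hypothetical.

```python
def bottom_up(points):
    """Merge the two closest clusters until one remains; return every partition."""
    clusters = [[p] for p in points]            # start: one cluster per object
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Single linkage: pick the pair of clusters with the closest members.
        i, j = min(
            ((a, b) for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda ab: min(abs(x - y)
                               for x in clusters[ab[0]]
                               for y in clusters[ab[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
        history.append([list(c) for c in clusters])
    return history


for partition in bottom_up([1, 2, 6, 7, 15]):
    print(partition)
```

Each printed partition is nested into the one printed after it, as the definition of hierarchical clustering requires.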
Top-Down Hierarchical Clustering
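$\{x_1, x_2, x_3, x_4, x_5, x_6\}$
$\{x_1, x_2, x_3, x_4\}, \{x_5, x_6\}$
$\{x_1, x_2, x_3\}, \{x_4\}, \{x_5, x_6\}$
$\{x_1\}, \{x_2, x_3\}, \{x_4\}, \{x_5\}, \{x_6\}$
$\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_6\}$
[Figure: dendrogram over the leaves x1 x2 x3 x4 x5 x6; top-down clustering builds these partitions from the first row to the last]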
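A top-down sequence starts from a single cluster and splits it step by step. The sketch below is a deliberately simple instance, assuming one-dimensional numeric objects and a splitting rule that cuts a cluster at its widest gap between neighbouring values; the rule and the function name `top_down` are assumptions for illustration, not the method used in the lecture.

```python
def top_down(points):
    """Split clusters at their widest gap until every cluster is a singleton."""
    clusters = [sorted(points)]                 # start: one cluster with all objects
    history = [[list(c) for c in clusters]]
    while any(len(c) > 1 for c in clusters):
        # Find the widest gap between neighbouring values inside any cluster.
        _gap, ci, pos = max(
            (c[k + 1] - c[k], idx, k + 1)
            for idx, c in enumerate(clusters)
            for k in range(len(c) - 1)
        )
        c = clusters[ci]
        clusters[ci:ci + 1] = [c[:pos], c[pos:]]  # replace the cluster by its two halves
        history.append([list(x) for x in clusters])
    return history


for partition in top_down([1, 2, 6, 7, 15]):
    print(partition)
```

Here each printed partition is nested into the one printed before it, which is the top-down reading of the definition.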
OSHAM: Hybrid Model
[Figure: screenshot of OSHAM on the Wisconsin Breast Cancer data, with panels labelled Attributes, Brief Description of Concepts, Concept Hierarchy, Multiple Inheritance Concepts and Discovered Concepts]
Lecture 1: Overview of KDD
Lecture 2: Preparing data
Lecture 3: Decision tree induction
Lecture 4: Mining association rules
Lecture 5: Automatic cluster detection
Lecture 6: Artificial neural networks
Lecture 7: Evaluation of discovered knowledge
Brief introduction to lectures
Lecture 6: Neural networks
• One of the most widely used KDD classification techniques.
• Content of the lecture
1. Neural network representation
2. Feed-forward neural networks
3. Using back-propagation algorithm
4. Case-studies
• Prerequisite: Nothing special
Lecture 7: Evaluation of discovered knowledge
• One of the most widely used KDD classification techniques.
• Content of the lecture
1. Cross validation
2. Bootstrapping
3. Case-studies
• Prerequisite: Nothing special
Out-of-sample testing
[Figure: flow diagram. Historical data (warehouse) → sampling method → sample data → sampling method → training data (2/3) and testing data (1/3). Training data → induction method → model. Model and testing data → error estimation → error.]
The quality of the test-sample estimate depends on the number of test cases and on the validity of the independence assumption.
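The out-of-sample procedure can be sketched in a few lines. This is a minimal illustration, assuming the data are `(features, label)` pairs and using a toy induction method that always predicts the most common training label; the names `holdout_error` and `majority_class` are hypothetical, and the 2/3 to 1/3 split follows the slide.

```python
import random
from collections import Counter


def holdout_error(data, induce, train_fraction=2 / 3, seed=0):
    """Estimate the error rate on a held-out test set (2/3 train, 1/3 test)."""
    rows = list(data)
    random.Random(seed).shuffle(rows)           # random sampling of the cases
    cut = round(len(rows) * train_fraction)
    train, test = rows[:cut], rows[cut:]
    model = induce(train)                       # induction method -> model
    wrong = sum(1 for x, y in test if model(x) != y)
    return wrong / len(test)                    # error estimate on unseen cases


def majority_class(train):
    """Toy induction method: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


data = [(i, "a") for i in range(20)] + [(i, "b") for i in range(10)]
print(holdout_error(data, majority_class))
```

With only ten test cases in this toy run, the estimate changes noticeably with the random seed, which illustrates the dependence on the number of test cases.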
Cross Validation
[Figure: flow diagram. Historical data (warehouse) → sampling method → sample data → sampling method → Sample 1, Sample 2, ..., Sample n (mutually exclusive, equal size). Samples → induction method → model → error estimation → run's error; iterate over the samples, then a final error estimation.]
10-fold cross validation appears adequate (n = 10)
Evaluation: k-fold cross validation (k=3)
Inputs: a data set and a method to be evaluated.
1. Randomly split the data set into 3 subsets of equal size.
2. Run the method on each pair of subsets as training data to find knowledge.
3. Test on the remaining subset as testing data to evaluate the accuracy.
4. Average the accuracies as the final evaluation.
[Figure: the three train/test arrangements of subsets 1, 2 and 3]
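The k-fold procedure can be sketched as follows. This is a minimal illustration, assuming `(features, label)` pairs, at least k cases, and a toy induction method that predicts the most common training label; `k_fold_accuracy` and `majority_class` are hypothetical names. When the number of cases is not a multiple of k, the subsets here differ in size by at most one.

```python
import random
from collections import Counter


def k_fold_accuracy(data, induce, k=3, seed=0):
    """Average test accuracy over k mutually exclusive folds."""
    rows = list(data)
    random.Random(seed).shuffle(rows)           # random split of the data set
    folds = [rows[i::k] for i in range(k)]      # k disjoint subsets
    accuracies = []
    for i, test in enumerate(folds):
        # Train on the other k-1 subsets, test on the remaining one.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = induce(train)
        correct = sum(1 for x, y in test if model(x) == y)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k                  # final evaluation


def majority_class(train):
    """Toy induction method: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


data = [(i, "a") for i in range(6)] + [(i, "b") for i in range(3)]
print(k_fold_accuracy(data, majority_class, k=3))
```

Calling the same function with `k=10` gives the 10-fold variant.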
Outline of the presentation
Objectives, Prerequisite and Content
Brief Introduction to Lectures
Discussion and Conclusion
This presentation summarizes the content and organization of the lectures in the module “Knowledge Discovery and Data Mining”.