An improved hierarchical partitioning fuzzy approach to pattern classification
A dissertation submitted to The University of Manchester
for the degree of MSc Information Systems Engineering
in the Faculty of Engineering and Physical Sciences
2008
Han Ding
School of Computer Science
LIST OF CONTENTS

LIST OF FIGURES AND TABLES
ABSTRACT
DECLARATION
COPYRIGHT
ACKNOWLEDGEMENT
1. Introduction
   1.1 Pattern classification
   1.2 The problem
   1.3 Overview of dissertation
2. Background
   2.1 Statistical approaches
   2.2 Neural network approaches
   2.3 Structural approaches
   2.4 Fuzzy approaches
   2.5 Comparison
3. Research Method
   3.1 Hierarchical overlapping fuzzy approach
       3.1.1 Initial Input Partitioning
       3.1.2 Fuzzy Rules Generation
       3.1.3 Fuzzy Inference Process
   3.2 Improvements
       3.2.1 Euclidean distance calculation in assigning rules
       3.2.2 Tuning slopes
4. Implementation
   4.1 Implementation background and software environment
   4.2 Programme procedure and operating results
5. Evaluation
   5.1 Training and testing data issue
   5.2 Iris dataset
   5.3 Wisconsin Breast Cancer dataset
   5.4 Comparisons with other methods and analysis
6. Conclusion and Future work
References
Appendix 1: Sample source code
Appendix 2: A sample separation of Iris dataset

Final word count: 12953
LIST OF FIGURES AND TABLES

Figure 1: A typical pattern classification system
Figure 2: A mapping x → y from measurement space X to decision space Y
Figure 3: An artificial neuron
Figure 4: The multi-layer perceptron
Figure 5: An example of structural patterns
Figure 6: An iterative partitioning of the overlapping area
Figure 7: Hierarchy of the generated hyperboxes
Figure 8: Class boundaries and a membership function of a 2-dimensional hyperbox
Figure 9: Class boundaries and membership functions of 2 overlapping hyperboxes
Figure 10: Modified class boundaries and membership functions
Table 1: Classification results for the first configuration
Table 2: Rules generated from the Iris dataset
Table 3: More results for other configurations on the Iris dataset
Table 4: Rules generated from the Wisconsin Breast Cancer dataset
Table 5: Results for all configurations of the Wisconsin Breast Cancer dataset
Table 6: Comparative results of several classification systems on the Iris dataset
Table 7: Comparative results of systems on the Wisconsin Breast Cancer dataset
ABSTRACT
Pattern classification has become an essential element in a wide variety of fields, such as engineering control and medical diagnosis. There are numerous approaches to classification, and each has proved effective in certain cases. Nevertheless, a more general, accurate and efficient method is still desirable, and fuzzy logic approaches have been applied successfully in this area.
In this dissertation, a pattern classification system is realised based on an improved hierarchical partitioning fuzzy approach, initially proposed by I. Gadaras and L. Mikhailov [9]. The approach can extract classification rules directly from numerical data, and it focuses on achieving high accuracy at low computational cost. A meaningful input partitioning technique for overlapping areas and some adjustments to the membership functions are highlighted.
A pattern classification system based on the proposed fuzzy methodology is implemented in Java with a JDBC-accessed database. The system is evaluated on the Fisher Iris dataset and the Wisconsin Breast Cancer dataset, which are widely used benchmarks of classification performance. Comparative results are analysed in detail, with critical conclusions and suggestions for future work.
Keywords: pattern classification, improved hierarchical partitioning, fuzzy approach.
DECLARATION

No portion of the work referred to in this dissertation has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.
COPYRIGHT

The ownership of any intellectual property rights which may be described in this
dissertation is vested in the University of Manchester, subject to any prior agreement to the
contrary, and may not be made available for use by third parties without the written
permission of the University, which will prescribe the terms and conditions of any such
agreement.
ACKNOWLEDGEMENT

I would like to take this opportunity to thank my supervisor, Dr. Ludmil Mikhailov, for his professional and amiable guidance, and Ioannis Gadaras, for his inspiring and constant support. I would also like to thank my parents, Sophie, and everyone who helped me during this year.
1. Introduction
In this chapter, the basic concepts of pattern classification are introduced, followed by a general description of the problem. Finally, the structure of the dissertation is outlined, with a brief introduction to each of the following chapters.
1.1 Pattern classification
Pattern classification is a vital human ability. A person can identify whether an animal is a cat or a horse from its shape, size and behaviour. This process, from receiving visual information to judging the species, is exactly pattern classification.

Since artificial intelligence emerged in the 1950s, people have tried to give computers this ability. Within a decade, pattern classification had become a new discipline and developed rapidly. Nowadays, pattern classification aims to classify real-world data as accurately and efficiently as possible. It covers a wide range of information processing research and is applied in fields such as voice recognition, fingerprint detection and disease diagnosis [1]. Consequently, pattern classification is no longer solely a subject of computer science; it involves diverse fields including cybernetics, linguistics and biology.
[Figure omitted: External Signals → Data Acquisition → Captured Data → Feature Extractor → Feature Vector → Classifier → Class Indices → Output Device.]
Figure 1: A typical pattern classification system
A typical four-operator pattern classification system is illustrated in Figure 1 [2]. Initially, External Signals are captured by the Data Acquisition component and transformed into a form that can be understood by the next operator in the system. Because of the huge amount of data captured, it is both difficult and unnecessary to process all of the information; the Feature Extractor therefore takes the responsibility of condensing the information and discarding unimportant data. It converts the useful information into a multi-dimensional Feature Vector and passes it to the Classifier. The Classifier assigns each input to a specific class on the basis of the received Feature Vector and produces Class Indices for the Output Device. Finally, the classification result is displayed or further processed by the Output Device.
A concrete example demonstrates this process. In a speaker recognition system, the physical sound wave is the External Signal. The Data Acquisition component, a microphone, receives the wave and converts it into digital format for further processing; ambient noise is also filtered out at this stage. These digital data (the Captured Data) are converted by a Feature Extractor into a Feature Vector containing features such as frequency and volume, which provides quantitative information for the Classifier. The Classifier here can be a computer that stores personal records, algorithms and classification rules. It recognises the person speaking and passes commands (Class Indices) to the display (Output Device) so that the name of the person is shown on the screen.
1.2 The problem
Feature extractor and classifier are two major focuses of pattern classification systems.
Either of these two elements could influence capacity of the system: if one of them
performs strongly enough, the other one can be entirely ignored. However, the
performance of feature extraction usually relays on the specific application, and that is
relatively difficult to improve generally. Therefore, more research now is focused on
improving the classifier. This dissertation is also trying to contribute improvements to the
classifier.
A good classifier aims to produce highly accurate pattern classification result with as less
expensiveness as possible. However in the past, it seems very difficult to achieve at the
same time. Some approaches provided an excellent accuracy but generated a great amount
of rules, which was costly and time consuming. On the other hand, some algorithms
achieved simple and fast, at the cost of low precision.
Therefore, the problem is how to create a novel method for the classifier that strikes an optimum balance between accuracy and cost.
1.3 Overview of dissertation
In Chapter 2, the background and the different methods for tackling the pattern classification problem are briefly presented. The chapter explains the principles of several familiar classification approaches, including statistical approaches, structural approaches and neural network approaches. Fuzzy approaches are discussed in more detail, with a comparative analysis of these methodologies.
Chapter 3 illustrates the major methodology of this dissertation, which is inspired by an overlapping fuzzy classification approach [9]. Detailed descriptions of the main processes can be found there: initial input partitioning, fuzzy rule generation and the fuzzy inference process. Novel theoretical improvements are highlighted next, including the use of Euclidean distance calculation and the tuning of membership function slopes to enhance classification performance.
Chapter 4 covers the implementation details, from the development techniques and environment to the programming procedure. It explains step by step how the programmed system implements the proposed approach.
In Chapter 5, evaluations are carried out on two popular pattern classification testing datasets: the Fisher Iris dataset and the Wisconsin Breast Cancer dataset. An important discussion about the selection of training and testing data is presented first. After a brief introduction to each dataset, the evaluation results follow, with comparisons of different classification methods and critical analysis.
Chapter 6 concludes the dissertation and emphasises its significant achievements. Current problems are also investigated, with suggestions for several directions of future research.
2. Background
A great amount of effort has been devoted to pattern classification research. Many creative methods have been suggested, and some have proved quite effective in certain cases. Currently, a classification system usually employs one of the following approaches: statistical (or decision-theoretic) approaches, neural network approaches, structural (or syntactic) approaches and fuzzy approaches. In this chapter, we briefly introduce each of them and explain fuzzy approaches in more detail, since this is the approach applied in this dissertation.
2.1 Statistical approaches
Pattern classification covers a wide range of problems, and it is hard to find a single unified approach. However, statistical decision and estimation are regarded as fundamental to the discipline of pattern classification [3].
[Figure omitted: clusters of points labelled A, B and C in measurement space X, each cluster mapped to a single point Ay, By, Cy, ..., Ny in decision space Y.]
Figure 2: A mapping x → y from measurement space X to decision space Y
Statistics is a mathematical method for summarising a collection of data; it describes the frequency with which variations appear. Statistical approaches use algorithms to analyse the probability of an input belonging to a certain class.

An example is illustrated in Figure 2. Each input corresponds to a point in the multi-dimensional measurement space X. Inputs that belong to the same class lie close together and map to one class in the decision space Y. The mapping process tries to link each point to the correct class y in Y; the best mapping is the one that gives the maximum recognition rate.
The accuracy of this approach actually relies on the natural distribution of the inputs. That is to say, if the inputs belonging to one class are loosely scattered, the accuracy suffers. To solve this problem, additional knowledge or information about the distribution is required.
2.2 Neural network approaches
The inspiration for neural networks came from observations in biology. An artificial neural network, imitating its biological counterpart, consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. Each basic element operates at a very low level, and the only thing a single neuron can output is a Boolean value. The network connects a great number of these neurons through input and output weights, which can change dynamically during the learning process.
[Figure omitted: inputs x1, x2, ..., xm are multiplied by weights w1, w2, ..., wm, summed, and passed through an activation function f(x) to produce the output.]
Figure 3: An artificial neuron
Neural network approaches are widely used and effective for solving pattern classification problems. They provide a new suite of nonlinear algorithms, as well as a vehicle for implementing existing feature extraction and classification algorithms efficiently. In spite of the different underlying principles, nearly all well-known neural network models are actually similar to classical statistical pattern classification methods [4].

The power of a neural network is greatly improved by introducing a hidden layer (shown in Figure 4). The hidden layer receives output from the input layer and produces adjusted output for the final layer, which makes the decisions. This is what allows the network to overcome the difficulty of problems that are not linearly separable.
[Figure omitted: inputs x1, x2, ..., xm enter the input layer, pass through input weights w to the hidden layer h, and through output weights to the output layer, producing outputs y1, ..., ym.]
Figure 4: The multi-layer perceptron
The strong point of neural network approaches is their learning ability. With adaptable weight functions and many available algorithms, they also have great potential for parallelism, because each computing element operates independently. There are, however, problems. Firstly, a neural network is unable to extract rules automatically from its neurons to form an interpretable system. Secondly, the relatively long learning time does affect the performance of pattern classification. Thirdly, a trained network is very difficult to analyse and study for future improvements.
2.3 Structural approaches
Some complex patterns are more suitable to approach from a different perspective, especially when a pattern can be viewed as a combination of simple sub-patterns [5]. Structural approaches assume that the pattern structure is quantifiable and that the problem can be solved by hierarchically decomposing the original structure into several manageable parts. The approach describes how the given pattern is constructed.
Figure 5 gives a good example of a structural approach. There are two patterns, each containing several strings. It can easily be identified that all strings in Pattern A keep the format AB^nC, n ≥ 0, while those in Pattern B have the format ABC^n, n ≥ 0. Having abstracted these rules, it is possible to determine that a string belongs to a specific pattern, although strings like ABC could belong to both patterns.
[Figure omitted: Pattern A contains strings such as AC, ABC, ABBC and ABBBC; Pattern B contains strings such as AB, ABC, ABCC and ABCCC.]
Figure 5: An example of structural patterns
2.4 Fuzzy approaches
Fuzzy approaches originate from fuzzy logic, a concept created by Professor L. A. Zadeh in 1965. Although it faced fierce criticism and argument at the beginning, it has eventually been recognised, and its applications have contributed to various fields. In the past, scientists widely applied probability theory to describe uncertainty; that theory, however, only allows a statement to take one of two distinct truth values, 0 or 1, and sometimes that is not enough. For example, when considering whether a person is old or young, assuming "old" means more than 70 and "young" means less than 50, how can we describe a person whose age is 60? Obviously, it is inappropriate to use probability in this case.
By introducing the membership function, fuzzy theory is able to handle such issues properly. Professor Zadeh extended the description of values from only "0" and "1" to the continuous interval [0, 1] (the fuzzy set). If an element cannot possibly belong to a class, its membership of that class is 0; on the contrary, if an element definitely belongs to the class, its membership is 1. Consequently, a membership between 0 and 1 describes the degree of belonging [6].
Fuzzy rules consist of an antecedent, such as "If x is in A", and a consequence, such as "Then y belongs to B"; for example, "if the person is old, then his acuity is poor". The difference between fuzzy rules and crisp rules is that in crisp rules the antecedent and consequence can only be true or false, whereas in fuzzy rules they can take any value between 0 and 1; for example, the degree of "old" could be 0.8. By acquiring these IF-THEN rules as operators, a direct relationship from input elements to pattern classes can be realised.
There are many kinds of fuzzy classification approaches; two of the most common are Wang and Mendel's method and Abe and Lan's method. Wang and Mendel's method [13], introduced in their 1992 paper, contains five steps: first, it divides the input and output spaces into fuzzy regions; second, it generates rules from the given numerical data; third, a degree of membership is assigned to each generated rule in order to resolve conflicts between intersecting rules; fourth, a Fuzzy-Associative-Memory (FAM) bank is created from the generated rules and from linguistic rules supplied by human experts; finally, a mapping from the input space to the output space is achieved through the FAM bank as the defuzzifying procedure. Wang and Mendel's method has proved effective and accurate at approximation; however, it requires an initial partition of the input space and a priori knowledge, which are hard for a computer to accomplish automatically.
In 1995, Shigeo Abe and Ming-Shong Lan suggested a method that can extract rules directly from numerical data [14]. This method groups the training input data into activation hyperboxes corresponding to their output patterns. If there is an intersection between hyperboxes containing inputs of more than one class, lower-level inhibition hyperboxes are formed recursively until no overlaps remain. This method needs neither an initial partition nor prior knowledge; however, the partitioning process finds it difficult to remove all overlaps entirely, and in certain cases some of the partitioning may be irrelevant and redundant.
Another fuzzy classification method is the hierarchical overlapping fuzzy approach proposed recently by I. Gadaras and L. Mikhailov [9]. The partitioning procedure of this approach is very similar to Abe and Lan's method, but it creatively provides termination criteria for the recursive partitioning, which avoid meaningless clustering. This approach inherits the advantages of Abe and Lan's method while overcoming its problems, considering both the accuracy and the number of generated rules. The fuzzy system based on this approach has also been carefully evaluated on a variety of testing datasets; comparison of its results with those of other fuzzy approaches showed that it achieves fairly good accuracy at a relatively low generation cost.
2.5 Comparison
Each approach has its advantages and disadvantages and may be suitable for certain cases. Moreover, it is also possible to combine the merits of several approaches in order to maximise performance. Fuzzy theory and neural networks have been particularly popular in recent years, as they have shown great performance and potential in solving pattern classification problems.
Statistical approaches are fundamental and simple. Given a limited amount of information, they may solve the problem directly, without needing to provide a more general solution as an intermediate step [7]. However, as mentioned above, their accuracy depends heavily on the natural distribution of the data, and if additional information is not accessible, the classification result can be severely affected.
Structural approaches can deal efficiently with obviously syntactic patterns, but they may cause a combinatorial explosion of possibilities to investigate, requiring huge training sets and large computational efforts [8]. Difficulties are also encountered in the segmentation of noisy patterns and other interference.
The major advantages of neural network approaches are their learning ability and accuracy, as recent research has proven. Their drawbacks are the time-costly learning process and the inconvenience of rule extraction and analysis.
Fuzzy approaches, on the other hand, offer more outstanding benefits. Because they are closer to human logic, the inner operations of a fuzzy system can be clearly understood, which enables further improvement. Meanwhile, problems that are hard to model mathematically can also be handled by fuzzy approaches. Fuzzy approaches are not perfect, however, since it is difficult to formalise the process of rule generation; that is, the rules and benchmarks in the system may require human expertise, something inevitable in any research related to artificial intelligence. Nevertheless, numerical problems give fuzzy approaches a great platform on which to prove their ability. By empowering computers to work in a way very similar to human thinking, fuzzy approaches can achieve accuracy and efficiency simultaneously. A fuzzy approach has therefore been selected for this dissertation, one that focuses on extracting fuzzy rules directly from numerical data.
3. Research Method
This dissertation involves the realisation of a pattern classification system using fuzzy theory. As mentioned above, many different fuzzy approaches already exist; the one we employ for this classification system is the hierarchical overlapping fuzzy approach [9]. It is a novel approach that automatically extracts fuzzy rules from labelled numerical data, with a meaningful input partitioning method and a hierarchical fuzzy rule structure.

This chapter starts by describing the selected fuzzy approach in detail. After that, we suggest theoretical improvements and adjustments, and then discuss their possible effects.
3.1 Hierarchical overlapping fuzzy approach
The hierarchical overlapping fuzzy approach has three stages: initial input partitioning, fuzzy rule generation and the fuzzy inference process. It divides the training input vectors into many regions, each assigned a single class. After a rule is generated for each region, the region boundaries are expanded by fuzzy inference. When classifying, the system automatically assigns each input to a specific region according to its relative location and its membership in each dimension, and then executes the corresponding rule.
3.1.1 Initial Input Partitioning
In the training stage, assume that there is a set of input-output data pairs

[(x_1^(1), x_2^(1), ..., x_m^(1); y^(1)), (x_1^(2), x_2^(2), ..., x_m^(2); y^(2)), ..., (x_1^(N), x_2^(N), ..., x_m^(N); y^(N))],

where x_k^(j) is the k-th attribute of the input vector x and y^(j) is the output class of the j-th data pair, j = 1, 2, 3, ..., N. The data pairs are labelled with the different classes to which they belong. The target is to build single-class regions by recursively partitioning the training input-output data pairs.
A hyperbox A_i for the i-th class, containing all of its paired inputs X_i, can be created from the minimum value v_ik and maximum value V_ik in each k-th dimension. For all x ∈ X_i we have

A_i = { x ∈ X_i | v_ik ≤ x_k ≤ V_ik, k = 1, 2, ..., m }.
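As a minimal illustration of this construction, the bounds of a hyperbox can be computed by a per-dimension minimum/maximum scan over the training inputs of one class. The following Java sketch is illustrative only, not the dissertation's actual code, and its class and method names are hypothetical:

final class Hyperbox {
    final double[] min;  // v_ik for k = 0..m-1
    final double[] max;  // V_ik for k = 0..m-1

    // Build the smallest box containing every input vector of one class.
    Hyperbox(double[][] classInputs) {
        int m = classInputs[0].length;
        min = new double[m];
        max = new double[m];
        java.util.Arrays.fill(min, Double.POSITIVE_INFINITY);
        java.util.Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] x : classInputs) {
            for (int k = 0; k < m; k++) {
                if (x[k] < min[k]) min[k] = x[k];
                if (x[k] > max[k]) max[k] = x[k];
            }
        }
    }

    // True if v_ik <= x_k <= V_ik in every dimension.
    boolean contains(double[] x) {
        for (int k = 0; k < x.length; k++) {
            if (x[k] < min[k] || x[k] > max[k]) return false;
        }
        return true;
    }
}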
Hyperboxes may include data from other classes and consequently create overlapping areas with other hyperboxes. If that is the case, a new hyperbox A_ij = A_i ∩ A_j is created for their intersection. The overlapping area could, of course, also be created by three different classes, A_ijk = A_i ∩ A_j ∩ A_k, but here we discuss only the case of two overlapping hyperboxes, for brevity and clarity.
For these overlapping hyperboxes, a recursive algorithm is suggested for partitioning the input space. At the l-th iteration, the overlapping hyperbox A_ij^l is partitioned further into A_i^(l+1) and A_j^(l+1) if the termination requirements are not met.
[Figure omitted: at each iteration, the overlap A_ij^l of hyperboxes A_i^l and A_j^l is repartitioned into A_i^(l+1), A_j^(l+1) and a smaller overlap A_ij^(l+1).]
Figure 6: An iterative partitioning of the overlapping area
As the iteration shown in Figure 6 proceeds, this process forms the hierarchical hyperboxes A_i^0, A_i^1, A_i^2, ..., A_i^L for class i, where L is the depth of the recursion (Figure 7).
[Figure omitted: a tree rooted at A; at each level l the overlap A_ij^l splits into A_i^(l+1), A_ij^(l+1) and A_j^(l+1), down to level L with A_i^L, A_ij^L and A_j^L.]
Figure 7: Hierarchy of the generated hyperboxes
As mentioned above, the iteration stops when the termination requirements are reached. The first criterion is to stop when the number of input data in the overlapping area A_ij^l is relatively small. This index can be expressed as

R_ij^l = D(A_i^l ∩ A_j^l) / D(A_i^l ∪ A_j^l),

where D(A_i^l ∩ A_j^l) is the number of inputs in the overlapping area and D(A_i^l ∪ A_j^l) is the number of all inputs in A_ij^(l−1). If the value of R_ij^l is close to 0, the number of inputs in the overlapping area must be very small, and further partitioning seems meaningless. The first termination parameter Th1 can be set by the user: when R_ij^l < Th1, the partitioning of A_ij^l is stopped and the region is marked with one of the classes.
If R_ij^l > Th1, the second criterion is activated. It checks whether one class obviously dominates the overlapping area. This index can be calculated as

S_ij^l = (D_i(A_i^l ∩ A_j^l) − D_j(A_i^l ∩ A_j^l)) / D(A_i^l ∩ A_j^l),

where D(A_i^l ∩ A_j^l) is the total number of inputs in the intersection area and D_i(A_i^l ∩ A_j^l) − D_j(A_i^l ∩ A_j^l) is the difference between the numbers of inputs of classes i and j in this area. The value of S_ij^l should be a number between 0 and 1: the closer S_ij^l is to 1, the more one class dominates the area and the less meaningful it is to continue partitioning. The second termination parameter Th2 can also be set by the user: when S_ij^l > Th2, the partitioning of A_ij^l is stopped and the region is marked with one of the classes.
Eventually, the whole input space is divided into regions, each of which belongs to a single class.
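To make the two termination criteria concrete, the sketch below computes R_ij^l and S_ij^l from simple point counts and tests them against Th1 and Th2. It is a hedged illustration under the definitions above, not the original implementation; all names are assumptions:

static boolean shouldStopPartitioning(int countI, int countJ, int countParent,
                                      double th1, double th2) {
    // countI, countJ: inputs of classes i and j inside the overlap A_ij^l;
    // countParent: all inputs in the parent region A_ij^(l-1).
    int countOverlap = countI + countJ;                            // D(A_i^l ∩ A_j^l)
    double r = (double) countOverlap / countParent;                // R_ij^l
    if (r < th1) return true;                                      // overlap too small to matter
    double s = Math.abs(countI - countJ) / (double) countOverlap;  // S_ij^l, in [0, 1]
    return s > th2;                                                // one class clearly dominates
}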
3.1.2 Fuzzy Rules Generation
After input partitioning, fuzzy rules can be generated for each region. For example, if a hyperbox A_i^l only has data from one class Y_i, rule (1) is generated:

IF x is in A_i^l THEN y is in Y_i    (1)

If hyperbox A_i^l contains an overlapping area A_ij^l, rule (2) is generated:

IF x is in A_i^l AND x is not in A_ij^l THEN y is in Y_i    (2)
Two additional rules are generated for an overlapping hyperbox A_ij^l, since such a hyperbox normally contains inputs of two or more classes. As described above, two criteria can stop the partitioning; when the partitioning is terminated, whether by the tiny number of inputs or by the obvious domination of one class in the overlapping area, these areas are trivial for the overall accuracy. Consequently, without significant influence on accuracy, two simple but meaningful rules are introduced:

IF x is in A_ij^l THEN y is in Y_i when w_i > w_j, OR y is in Y_j when w_i < w_j,
where w_i = D_i(A_ij^l) / D(A_ij^l) and w_j = D_j(A_ij^l) / D(A_ij^l).    (3)
However, if w_i is exactly equal to w_j, a distance-based rule is applied. This rule calculates the Euclidean distances d_i and d_j from x to the centroids C_i and C_j of the two classes Y_i and Y_j, where a centroid C is calculated as the arithmetic mean in each dimension. For instance, if the overlapping area contains N points x^(p) = (x_1^(p), x_2^(p), ..., x_m^(p)), p = 1, 2, ..., N, that belong to class Y_i, then the centroid of class i is

C_i = (x_c1^i, x_c2^i, ..., x_cm^i), where x_ck^i = (x_k^(1) + x_k^(2) + ... + x_k^(N)) / N, k = 1, 2, ..., m.
Rule (4) is then:

IF x is in A_ij^l THEN y is in Y_i when d_j > d_i, OR y is in Y_j when d_i > d_j, where

d_i = sqrt((x_1 − x_1^Ci)^2 + (x_2 − x_2^Ci)^2 + ... + (x_m − x_m^Ci)^2),
d_j = sqrt((x_1 − x_1^Cj)^2 + (x_2 − x_2^Cj)^2 + ... + (x_m − x_m^Cj)^2).    (4)
After the rule generation process, every partitioned region is assigned a specific fuzzy rule. To achieve more accuracy and flexibility, fuzzy inference is applied, since it guarantees that the most appropriate rule is executed for a new input.
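The centroid and distance calculations behind rule (4) can be sketched directly. The Java fragment below is an illustration with hypothetical names, not the system's code; it computes the per-dimension means and assigns the input to the class with the nearer centroid:

// Arithmetic mean of the points in each dimension: the centroid C.
static double[] centroid(double[][] points) {
    int m = points[0].length;
    double[] c = new double[m];
    for (double[] p : points)
        for (int k = 0; k < m; k++) c[k] += p[k];
    for (int k = 0; k < m; k++) c[k] /= points.length;
    return c;
}

// Euclidean distance from input x to a centroid c.
static double euclidean(double[] x, double[] c) {
    double sum = 0.0;
    for (int k = 0; k < x.length; k++) {
        double d = x[k] - c[k];
        sum += d * d;
    }
    return Math.sqrt(sum);
}

// Rule (4): pick class i if d_i <= d_j, otherwise class j.
static int distanceRule(double[] x, double[] centroidI, double[] centroidJ,
                        int classI, int classJ) {
    return euclidean(x, centroidI) <= euclidean(x, centroidJ) ? classI : classJ;
}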
3.1.3 Fuzzy Inference Process
For the reasons mentioned before, the membership function of fuzzy theory is introduced. To determine which rule is executed, the degree of membership to each region needs to be investigated. After the partitioning is complete, if an input point lies inside a hyperbox in some dimension, its membership of that hyperbox in that dimension is 1; the membership decreases from 1 to 0 as the point moves away from the boundaries of the region. The fuzzy area around a hyperbox is called the "generalisation area", and if an input falls inside this area its membership of the hyperbox lies between 0 and 1, indicating that it partially belongs to the hyperbox. The membership function can be represented as a trapezoid, describing full membership by its upper base and gradually decreasing membership by its slopes. This expression of membership, employed in [10] and [11], is now widely used.
If a hyperbox A_i^l does not overlap with other hyperboxes, its membership function can be defined by the equation

m_i^l(x_k) = min{ [1 − max(0, min(1, γ_k (x_k − V_ik^l)))], [1 − max(0, min(1, γ_k (v_ik^l − x_k)))] },    (a)

where γ_k is the sensitivity parameter for the k-th dimension (or attribute).
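Equation (a) translates almost directly into code. The sketch below is a hedged transcription, not taken from the implementation:

// Trapezoidal membership: 1 inside [v, V], linear slopes of width 1/gamma outside.
static double membership(double x, double v, double V, double gamma) {
    double left  = 1.0 - Math.max(0.0, Math.min(1.0, gamma * (v - x)));
    double right = 1.0 - Math.max(0.0, Math.min(1.0, gamma * (x - V)));
    return Math.min(left, right);
}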
[Figure omitted: a trapezoidal membership function m_i^l(x_k) over x_k, equal to 1 between v_ik^l and V_ik^l and sloping linearly to 0 at v_ik^l − 1/γ_k and V_ik^l + 1/γ_k; the slopes form the generalisable region.]
Figure 8: Class boundaries and a membership function of a 2-dimensional hyperbox
Figure 8 illustrates the generalisation area and its boundaries, as well as a membership function of a 2-dimensional hyperbox. If the input vector is placed inside hyperbox A_i^l, it has degree of membership m_i^l(x_k) = 1 in each k-th dimension, since x_k satisfies v_ik^l ≤ x_k ≤ V_ik^l, k = 1, ..., m. If the input vector is placed outside A_i^l in some dimensions, but not far away, the degrees of membership in those dimensions are 0 < m_i^l(x_k) < 1, since x_k satisfies v_ik^l − 1/γ_k < x_k < V_ik^l + 1/γ_k and therefore belongs to the generalisation area. If in the k-th dimension x_k ≤ v_ik^l − 1/γ_k or x_k ≥ V_ik^l + 1/γ_k, the degree of membership in that dimension is 0, which means the vector lies outside the generalised region. The sensitivity parameter γ_k is able to enlarge or reduce the area of the generalised region.
This method can also be extended to the situation where two hyperboxes overlap. If a hyperbox A_i^l overlaps with another hyperbox A_j^l, the membership function can be defined by the equation

m_i^l(x_k) = min{ [1 − max(0, min(1, γ_k (v_ik^l − x_k)))], [1 − max(0, min(1, (x_k − V_ik^l) / (1/γ_k + v_jk^l − V_ik^l)))] }.    (b)
[Figure omitted: two overlapping trapezoidal membership functions m_i^l(x_k) and m_j^l(x_k) for hyperboxes A_i^l and A_j^l with overlap A_ij^l; each is 1 between its own bounds v and V and slopes to 0 outside them.]
Figure 9: Class boundaries and membership functions of 2 overlapping hyperboxes
In Figure 9, similarly to Figure 8, generalised regions are produced by both hyperboxes. For the overlapping area, two rules of type (2) are applied, and the min operator is used in each rule. The min operator takes the minimum degree of membership of the fuzzy values for a given input:

d_Ri^l(x) = min_k { m_i^l(x_k) }, k = 1, 2, ..., m,

where d_Ri^l is the degree of membership for executing the rule of the region of class i. The min operator guarantees that the degree of membership for executing a rule equals one only if the input vector lies inside the hyperbox in every dimension. If the input vector is outside the hyperbox or in the overlapping area, the degree of membership for executing the rule must be less than one. If the input vector belongs to two regions that represent different classes, the degrees of membership for executing the rules, d_Ri^l(x) and d_Rj^l(x), can be calculated respectively. If the input vector is placed in the common area of the two hyperboxes, rule (3) or rule (4) is executed. As in the single-hyperbox case, the sensitivity parameter γ_k is able to control the area of the generalised region.
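The min-operator inference can be sketched as follows, reusing the membership function transcribed after equation (a); again, this is an illustration rather than the system's code:

static double firingDegree(double[] x, double[] v, double[] V, double[] gamma) {
    double degree = 1.0;
    for (int k = 0; k < x.length; k++) {
        // The rule fires fully only if x is inside the hyperbox in every dimension.
        degree = Math.min(degree, membership(x[k], v[k], V[k], gamma[k]));
    }
    return degree;
}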
3.2 Improvements
In order to achieve better performance from the method above, adjustments are suggested below on theoretical grounds. Predictions of their possible effects are discussed at the end of this chapter, and all of them are carefully examined and analysed in the evaluation chapter.
3.2.1 Euclidean distance calculation in assigning rules
The hierarchical overlapping fuzzy approach is a very efficient classification method. It has a very effective partitioning procedure and integrates fuzzy theory commendably, considering both the accuracy and the number of generated rules.
This approach deals with the partitioned regions and generates rules in a rational way, primarily based on density judgement. For example, in a region that contains two classes and whose partitioning was stopped by Th1 or Th2, classifying an input vector requires comparing the densities of the two classes in that area: the input is assigned to the class with the greater density. Only if the densities of the two classes are exactly the same is the Euclidean distance-based comparison activated, in which case the input is assigned to the class whose centroid lies at the shorter distance from the input vector.
However, the situations in a partitioned region differ, because the partitioning can be terminated by either criterion, Th1 or Th2. If the partitioning is stopped by Th2, one of the classes must dominate the area, so the density comparison is appropriate. But if the partitioning is stopped by Th1, that only suggests the area may be trivial for the overall accuracy; which class dominates the area is still unknown. Although the overlapping approach does provide a Euclidean distance-based comparison, it is activated only when the densities are perfectly equal, which is a rare case. As a result, the Euclidean distance-based comparison tends to efface itself.
For human beings, the judging process normally combines these two criteria equally. When people come across the problem of two overlapping classes, they seem to use density judgement ("How frequently did this happen here in the past?") and Euclidean-distance judgement ("Is it numerically close to the mathematical mean?") to equal degrees.
Therefore, we suggest enhancing the power of the Euclidean-distance judgement in the overlapping area. It can either replace the density-based method or coexist with it. This can be achieved by setting a benchmark that activates the Euclidean-distance comparison, rather than waiting for the density comparison to end in a draw: when the density difference between the classes is not obvious, for example when it is less than 2, the density judgement is immediately abandoned and the distance judgement is used instead. The benchmark can also be adjusted by the user to control the preference between density comparison and distance comparison for a specific problem (see the sketch below).
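A hedged sketch of this adjustment follows. The benchmark parameter and all names are illustrative assumptions rather than the dissertation's code; it reuses the distanceRule sketch from Section 3.1.2:

static int classifyInOverlap(double[] x, int countI, int countJ,
                             double[] centroidI, double[] centroidJ,
                             int classI, int classJ, int benchmark) {
    // Use the density judgement only when the class counts differ clearly.
    if (Math.abs(countI - countJ) >= benchmark) {
        return countI > countJ ? classI : classJ;
    }
    // Near-draw: fall back to the Euclidean-distance judgement.
    return distanceRule(x, centroidI, centroidJ, classI, classJ);
}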
This adjustment is expected to improve the accuracy slightly, especially in some particular cases. It enhances the role of the Euclidean-distance comparison and is more flexible for different problems. Of course, the final result still needs to be verified; it will be compared with the original approach in the evaluation chapter.
3.2.2 Tuning slopes
When testing the performance of the previous version of this approach, many points were found to be missing from classification, which severely affected the accuracy. Although missing points are sometimes inevitable, points near the boundaries that obviously belong to a certain class were also being ignored, because the slopes end at v_ik^l − 1/γ_k. Theoretically, if we fix the membership at 1 beyond such a boundary (because no other class lies in that direction), some missing points would be classified correctly. This simple change could improve accuracy without creating more misclassified points. The advantage of this adjustment should be particularly obvious when there are only 2 or 3 classes but multiple dimensions (as in the Wisconsin Breast Cancer dataset). This behaviour is also closer to human judgement, which can make an obvious decision about a point even when it is not numerically clear-cut. The modification does not affect the overlapping side of a hyperbox, because in the overlapping area the assignment of such points remains unknown. This also implies that if many classes exist rather than one or two, for example more than 10 classes, the improvement could be very slight.
Another modification concerns the length of the slopes of the membership functions. Previously, the slopes of all classes were determined only by a sensitivity parameter γ. Although for each application the sensitivity parameter γ can be set by the user, changing γ extends all slopes equally by a certain distance, regardless of the size of the hyperboxes. This can be unreasonable when the sizes of the hyperboxes vary dramatically: a small hyperbox may receive a generalisation area (related to the slopes) twice its original size, whilst a big hyperbox may receive a generalisation area only one tenth of its original area. Generally speaking, a bigger hyperbox means its points are distributed more sparsely than others, so it needs a bigger generalisation area.

One of the best solutions is to tune the slopes according to the size of the hyperbox. The extension distance of every slope is adjusted from 1/γ to (V_ik^l − v_ik^l)/γ. That means all slopes and generalisation areas are now related to the size of the hyperbox, giving each hyperbox a proper generalisation area rather than keeping a fixed one for all hyperboxes. Figure 10 illustrates the modified version of the membership functions.
[Figure omitted: the modified membership functions of two overlapping hyperboxes; each slope now extends from v − (V − v)/γ to V + (V − v)/γ, so the generalisation area scales with the hyperbox size.]
Figure 10: Modified class boundaries and membership functions
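The size-dependent slope amounts to replacing the fixed extension 1/γ with (V − v)/γ in the membership function. The sketch below is one illustrative reading of the adjustment, not the original code:

static double membershipScaled(double x, double v, double V, double gamma) {
    double extent = (V - v) / gamma;  // slope length now scales with the box width
    if (extent <= 0.0) return (x >= v && x <= V) ? 1.0 : 0.0;  // degenerate box
    double left  = 1.0 - Math.max(0.0, Math.min(1.0, (v - x) / extent));
    double right = 1.0 - Math.max(0.0, Math.min(1.0, (x - V) / extent));
    return Math.min(left, right);
}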
4. Implementation
This chapter describes the implementation of the proposed method as a programme. It involves two sections: the first introduces the implementation background and software environment; the second describes the specific programme logic and procedures, with concrete examples of execution results.
4.1 Implementation background and software environment
A fuzzy classification system developed by I. Gadaras already existed, which provided an excellent general framework for similar fuzzy systems, as well as a well-developed fuzzy Java library. In this system, all fuzzy concepts were realised as objects; for example, fuzzy sets, fuzzy values and fuzzy variables were designed as separate classes, which made the system very flexible to apply to different applications. Although many modifications were necessary to realise the proposed method, the general framework and basic logic were inherited, for two major reasons. First, the effects of the modifications can easily be verified by comparison with the performance of the original system. Second, it is also possible for future developers to reuse parts of the programme: since new modifications only change the related sections, the format and logic always remain clear, and more attention can be paid to new theoretical improvements.
Beyond realising the proposed method on the previous framework, the current version of the system can test approaches in a more general way. For example, it separates the training and testing datasets in order to evaluate under real-world conditions, and it makes it more convenient to examine different configurations. The framework is now also more flexible with respect to new theoretical modifications. A source code sample of the modified system can be found in Appendix 1.
The programme is based on object-oriented programming concepts. Although it could have been written in various languages, Java (with JBuilder as the development tool), one of the most popular programming languages, was employed for the realisation. The resulting programme is easily understandable and maintainable, and all the theoretical parts of the proposed method can be clearly identified in it.

A JDBC-accessed database was selected to store the different testing datasets and separations. When testing new datasets or configurations, only a simple update of table names is required, and the datasets and separations can be manipulated efficiently with SQL commands. Since many previous versions of similar classification systems used text files to store data, the programme also allows transformation between different dataset formats.
4.2 Programme procedure and operating results
In this section, the programme realisation of each theoretical part is described. The programme was written in the order of the proposed method; although it varies slightly for each dataset, the general logic and procedure are the same.
Step 1: Initialisation

First of all, the programme imports all the libraries it needs. The most important, the fuzzy library "nrc.fuzzy.*", contains all the classes for the fuzzy concepts. The programme then connects to the database with an administrative username and password. After that, it reads all the training data from the database into memory for constructing the initial hyperboxes.
Step 2: Construct hyperboxes

After all training data have been input, the programme identifies the maximum and minimum values of each class in each dimension. It uses a result set, "rs = stmt.executeQuery("SELECT MAX(attr_1) AS maxX, MAX(attr_2) AS maxY…")", to hold these values. When re-partitioning, the programme works in the same way as for the initial construction of the hyperboxes, until all overlapping areas meet the termination criteria. The fuzzy output values can be set by the programmer to any numerical values, as long as the outputs can be recognised in the operating results.
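A fuller version of this query might look as follows. The table and column names (iris_train, attr_1 to attr_4, class_label) are hypothetical stand-ins, since the actual schema is not shown here:

static double[][] loadBounds(java.sql.Connection conn, int classLabel)
        throws java.sql.SQLException {
    String sql = "SELECT MIN(attr_1), MAX(attr_1), MIN(attr_2), MAX(attr_2), "
               + "MIN(attr_3), MAX(attr_3), MIN(attr_4), MAX(attr_4) "
               + "FROM iris_train WHERE class_label = ?";
    try (java.sql.PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setInt(1, classLabel);
        try (java.sql.ResultSet rs = ps.executeQuery()) {
            rs.next();
            double[] min = new double[4], max = new double[4];
            for (int k = 0; k < 4; k++) {
                min[k] = rs.getDouble(2 * k + 1);  // MIN of attribute k+1
                max[k] = rs.getDouble(2 * k + 2);  // MAX of attribute k+1
            }
            return new double[][] { min, max };    // hyperbox bounds v and V
        }
    }
}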
Step 3: Extract rules from training data

In this stage, the training data that were read into memory generate the fuzzy rules. Each input-output pair is transformed into a rule for future classification, and the fuzzy rule class then stores all rules as double arrays in main memory. For example:

double outxlow[] = {0, 1.5, 3};     // x breakpoints of the 'low' output term
double outylow[] = {0, 1, 0};      // membership values at those breakpoints
double outxmed[] = {3.5, 5, 6.5};  // x breakpoints of the 'med' output term
output.addTerm("low", outxlow, outylow, 3);  // register the 'low' term on the output variable
This means that when the input lies in the range 0-3, the output lies in the range 3.5-6.5. The output value is used for the final classification. These rules can either be set by a human expert or extracted automatically from the training data. The training part of the programme then finishes.
Step 4: Obtain testing data

In order to examine the performance of the classification system, the programme reads the other part of the dataset. Testing data are introduced in the same way as the training data but without the class outputs, which are to be determined by the system. Here, the programme also records which inputs lie in overlapping areas and which do not.
Step5: Produce results of testing data
Rules generated from training data had been stored in the memory, and after inputs of
testing data arrive, the programme starts classifying. For each region, regardless it is
overlapping or non-overlapping area a fixed rule has already waited there. Every input
matches the rule assigned in the region, and is determined to the final output. These
classification results will be compared with original outputs.
Here is a concrete example of an execution result:
Fuzzy Variable -> out [ 0, 10.0 ] unit(s)
Fuzzy Set -> { 0/3.5 0.15/3.73 0.15/6.27 0/6.5 }
34
Fuzzy Variable -> out [ 0, 10.0 ] unit(s)
Fuzzy Set -> { 0/3.5 0.5/4.25 0.5/5.75 0/6.5 }
46
Fuzzy Variable -> out [ 0, 10.0 ] unit(s)
Fuzzy Set -> { 0/0 0.75/1.12 0.75/1.88 0/3 }
1
Fuzzy Variable -> out [ 0, 10.0 ] unit(s)
Fuzzy Set -> { 0/0 0.75/1.12 0.75/1.88 0/3 }
2 …
The classification result for each testing datum is shown in three rows. The first row defines the total range of the fuzzy variable: "out [ 0, 10.0 ] unit(s)" means that the fuzzy value can be a number between 0 and 10. The second row shows the resulting fuzzy set, which can be one of several sets; for instance, "{ 0/3.5 0.15/3.73 0.15/6.27 0/6.5 }" means that the datum belongs to the set spanning 3.5-6.5, i.e. the intermediate class. The last row lists the ID of the testing datum, which is used for marking errors and for statistics.
Step 6: Show classification results and calculate the accuracy

By comparing each classified result with its original output, it is easy to identify whether the data are correctly classified and to assess the general performance of the system. The classified results are stored in memory, and for the comparison the programme imports the original output of each testing datum from the database. It finds all differences between the classified and original outputs and counts how many points are misclassified. The system not only calculates the accuracy from the number of errors, but also shows the execution time and the details of the errors, which are used for evaluating performance and for further analysis.
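The error counting can be sketched as below; this is illustrative only, and in the dissertation's result tables missing (unclassified) points are counted among the errors as well:

static double errorRate(int[] predicted, int[] original,
                        java.util.List<Integer> errorIds) {
    int errors = 0;
    for (int i = 0; i < predicted.length; i++) {
        if (predicted[i] != original[i]) {  // misclassified or missing
            errors++;
            errorIds.add(i + 1);            // 1-based ID, as in the result tables
        }
    }
    return (double) errors / predicted.length;
}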
5. Evaluation
This chapter contains the classification results for the proposed method, investigated in detail with critical analysis. The performance of the system is evaluated against two criteria: the pattern classification accuracy and the cost of rule generation. Accuracy is perhaps the most important criterion for evaluating classification systems, since it is the original reason for developing such systems and the major direction for improvement. Cost is also an essential consideration, especially when dealing with huge amounts of data: if a system generates too many fuzzy rules, it requires a long execution time and consumes considerable computing resources, which may lead to an unrealistic cost. Another issue worth mentioning is the difference between artificial testing environments and real-world cases; that is, in real-world cases the testing data can be very different from the training patterns and consequently yield more errors. Therefore, testing systems in an environment similar to real-world cases is crucial.
Two famous datasets were selected to test the proposed method and the system: the Iris Flower Dataset and the Wisconsin Breast Cancer Dataset, which properly illustrate the accuracy and the cost achieved in real-world situations. Detailed results, discussion and analysis can be found in the following sections.
5.1 Training and testing data issue
Before evaluating the results for each dataset, a general discussion about training and testing data is necessary. Inevitably, the performance of a classification system is affected by the selection of training and testing data.
There are many controversies about how to select training and testing data. The selection can significantly affect the classification result: if the training data are "good" enough, all the testing data will be included in the initial hyperboxes, and even without fuzzy inference the result can be accurate, since only a few points fall outside the generalisation areas. On the other hand, if very little or "poor" data is used for training, there will be a great number of missing points, leading to low accuracy.
Basically, a fuzzy classification system should be general enough to adjust itself to different situations and applications, so using equal amounts of randomly selected data for training and testing is the normal procedure. In practice, however, when users apply the fuzzy system to a specific situation, it is realistic for them to select typical cases for training so that the system works much better. For example, the user could construct a training separation that contains the maximum and minimum values in each dimension (a sketch of one such selection follows). If no testing input then lies outside the initial hyperbox boundaries, further adjustments can concentrate on the overlapping areas and thus achieve better accuracy.
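One possible reading of this careful selection is sketched below, under the assumption that the samples realising the per-dimension minima and maxima of each class are picked for training; this is not the dissertation's actual selection code:

static java.util.Set<Integer> selectExtremes(double[][] data, int[] labels,
                                             int numClasses) {
    java.util.Set<Integer> chosen = new java.util.TreeSet<>();
    int m = data[0].length;
    for (int c = 0; c < numClasses; c++) {
        for (int k = 0; k < m; k++) {
            int argMin = -1, argMax = -1;
            for (int i = 0; i < data.length; i++) {
                if (labels[i] != c) continue;
                if (argMin < 0 || data[i][k] < data[argMin][k]) argMin = i;
                if (argMax < 0 || data[i][k] > data[argMax][k]) argMax = i;
            }
            if (argMin >= 0) chosen.add(argMin);
            if (argMax >= 0) chosen.add(argMax);
        }
    }
    return chosen;  // indices of training candidates covering all extremes
}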
In this dissertation, the proposed method was tested in both ways: random data were used to test the general performance of the system, and different training and testing separations of the datasets were also tried in order to identify the best performance of the system.
5.2 Iris dataset
The Iris Flower Dataset, or Fisher's Iris Dataset [12], is a popular dataset for testing pattern classification. It is a four-dimensional dataset introduced by Sir Ronald Aylmer Fisher, containing 150 samples from three species of Iris flowers: Iris setosa, Iris virginica and Iris versicolor. The features measured are the length and the width of the sepal and the petal; pattern classification systems use the combination of these four features to determine the species.
As mentioned before, two types of configuration were employed to evaluate the performance of the proposed method on the Iris dataset, one examining its average ability and the other its best performance. In the first type of configuration, 75 samples (25 from each class) were chosen randomly for training and the other 75 (25 from each class) for testing. In order to compare with similar systems, a separation of the data applied by P. C. Chen [2] was experimented with first (Table 1); the details of this separation can be found in the appendices as an example.
Table 1. Classification results for the first configuration

Config.   Training size   Testing size   Random?   Correctly classified   Errors   Error rate   Error IDs
Config1   75              75             Yes       68                     7        9.3%         28, 58, 68, 69, 70, 63, 53
In the first configuration, 2 testing points fell into the overlapping area, and they were correctly classified. No. 63 and No. 53, which originally belong to Iris versicolor, were misclassified as Iris virginica. No. 28, No. 58, No. 68, No. 69 and No. 70 were missing, meaning they were not classified at all. Different results can be obtained by tuning the sensitivity parameter γ, since the numbers of misclassified and missing points may vary. In this configuration, however, γ = 1.5 was the optimum, and because of the fixed memberships set near the boundaries, the effect of tuning the slopes (through the sensitivity parameter γ) was reduced. The rules generated can be found in Table 2.
Table 2. Rules generated from the Iris dataset

          Sepal width   Sepal length   Petal width   Petal length   Class
Rule 1    low           high           low           low            setosa
Rule 2    med           low            med           med            versicolor
Rule 3    high          med            high          high           virginica
Rule 4    med-high      low-medium     med-high      med-high       versicolor or virginica
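For reference, the sensitivity parameter γ appears in the Appendix 1 code as the field r: each trapezoidal membership function is extended past the hyperbox edge by a fraction of the box width, so a larger γ gives a shorter extension and a steeper slope. A minimal sketch, assuming a single attribute with hyperbox edge values min and max:

public class SlopeTuning
{
    // Illustration only: build the x-knots of a trapezoidal "low" membership
    // function whose falling slope is controlled by the sensitivity parameter
    // gamma (the field r in the Appendix 1 code). The foot extends past the
    // hyperbox edge by (max - min) / gamma.
    static double[] lowTermKnots(double min, double max, double gamma)
    {
        return new double[] {0, 0, max, max + (max - min) / gamma};
    }

    public static void main(String[] args)
    {
        // Hypothetical hyperbox edge values for one attribute, gamma = 1.5.
        System.out.println(java.util.Arrays.toString(lowTermKnots(1.0, 1.9, 1.5)));
        // -> [0.0, 0.0, 1.9, 2.5]: membership stays 1 up to 1.9, falls to 0 at 2.5
    }
}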
To evaluate the average performance of the proposed method on Iris, more configurations
were tested: one was a purely random selection (configuration 2), and the other followed a
simple numerical rule, taking No. 1-25, No. 51-75 and No. 100-125 for training and the rest
for testing (configuration 3).
Finally, as mentioned above, a careful selection of training data was also tested to find the
best performance of the proposed method (configuration 4). Although there were only 20
training samples, they contained the maximum and minimum values in every dimension.
Therefore, no point fell outside the initial hyperboxes after training and no point was
missing from classification.
Table 3. More results for other configurations.

          Training      Testing       Randomly                 Correctly    Number      Error
          sample size   sample size   selected?                classified   of errors   rate    Error ids
Config 1  75            75            Yes                      68           7           9.3%    28, 58, 68, 69, 70, 63, 53
Config 2  75            75            Yes                      71           4           5.2%    42, 139, 71, 78
Config 3  75            75            No                       69           6           8.0%    42, 44, 134, 135, 84, 78
Config 4  20            130           No (careful selection)   129          1           0.7%    42
Configuration 2 achieved better accuracy than the first, with only four errors: No. 42 and
No. 139 were missing, and No. 71 and No. 78, which originally belong to Iris versicolor,
were misclassified as Iris virginica. The optimum sensitivity parameter γ for these
configurations is 4. In configuration 3, No. 84 and No. 78 were misclassified in the same
way as in configuration 2, but more points were missing. After careful selection of training
data, the accuracy reached an extremely high level: of the 130 testing samples, only No. 42
was missing, and all 6 points in the overlapping area were correctly decided. This again
demonstrates the importance of data selection, so choosing training data carefully in real
applications is highly desirable.
5.3 Wisconsin Breast Cancer dataset
The Wisconsin Breast Cancer Dataset was compiled at the University of Wisconsin
Hospitals and has been widely used for the evaluation of pattern classification systems. It
contains nine attributes describing cell characteristics (Clump Thickness (CT),
Uniformity of Cell Size (UC), Uniformity of Cell Shape (UC), Marginal Adhesion (MA),
Single Epithelial Cell Size (SE), Bare Nuclei (BN), Bland Chromatin (BC), Normal
Nucleoli (NN), Mitoses (Mit)) with two output classes for the nature of the tumour: benign
or malignant. The dataset has 699 observations, of which 16 cases are excluded because
their attribute descriptions are incomplete. Of the remaining 683 cases, 444 belong to the
benign class and the other 239 to the malignant class; 252 of all cases lie in the overlapping
area. It is a fairly high-dimensional dataset in which the two classes overlap extensively.
As with the evaluation on the Iris dataset, three investigations were conducted: two testing
the average performance and one testing the maximum ability of the proposed method on
the Wisconsin Breast Cancer dataset. All configurations used 340 samples for training and
343 for testing, but the training data for the last configuration were carefully selected. In
the database there are nine columns for the attributes and one for the output class: benign
or malignant.
Table 4. Rules generated from the Wisconsin Breast Cancer dataset

        CT         UC         UC         MA         SE         BN         BC         NN         Mit.       Class
R1      low        low        low        low        low        low        low        low        low        benign
R2      high       high       high       high       high       high       high       high       high       malignant
R3.1    low-med    low-med    low-med    low-med    low-med    low-med    low-med    low-med    low-med    benign
R3.2    med-high   med-high   med-high   med-high   med-high   med-high   med-high   med-high   med-high   malignant
Table 4 shows the rules generated from the training data after two iterations. The number
of iterations can be controlled manually by the user through suitable settings of the
stopping parameters Th1 and Th2. In this experiment two iterations were used for the
following reasons: firstly, beyond two iterations there was no explicit improvement in
accuracy, only a higher cost of producing rules; secondly, it made the results easy to
compare with previous versions of the proposed method, so that the advantages of the
current version could be identified explicitly. A sketch of such a stopping criterion is given
below.
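For illustration only, the following sketch shows one way the iteration could be terminated. The names maxIterations, th1 and th2 follow the text above, but their exact semantics in the implemented system are not specified there, so this reading (th1 as a minimum accuracy gain, th2 as a minimum share of unresolved overlap points) is an assumption.

public class StoppingCriterion
{
    // Hypothetical reading of the stopping parameters Th1/Th2: stop refining
    // the overlapping area when the accuracy gain of the last iteration drops
    // below th1, or the fraction of points still unresolved in the overlap
    // falls below th2, or a hard iteration cap is reached.
    static boolean shouldStop(int iteration, int maxIterations,
                              double accuracyGain, double th1,
                              double overlapFraction, double th2)
    {
        return iteration >= maxIterations
            || accuracyGain < th1
            || overlapFraction < th2;
    }

    public static void main(String[] args)
    {
        // Two iterations, as in the experiment: cap = 2.
        System.out.println(shouldStop(2, 2, 0.004, 0.01, 0.05, 0.02)); // true
    }
}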
In the first configuration, the data were split by id: all odd ids were taken as training data
and all even ids as testing data. This is a convenient way to separate the dataset that is
effectively equivalent to a random separation. Configuration 2 used a purely random
separation to show normal performance. In configuration 3, the training data were selected
in the same way as in configuration 4 for the Iris data, containing all maximum and
minimum values so as to optimise the initial hyperboxes.
Table 5. Results for different configurations of the Wisconsin Breast Cancer dataset.

          Training      Testing       Randomly                 Correctly    Number      Error
          sample size   sample size   selected?                classified   of errors   rate
Config 1  340           343           Yes                      330          13          3.79%
Config 2  340           343           Yes                      329          14          4.0%
Config 3  340           343           No (careful selection)   339          4           1.17%
The accuracy of configuration 1 was 96.21%, with 13 points misclassified in the
second-iteration overlapping area. Because there were only two classes in this experiment,
and the membership functions near the boundary had been modified, no point was missing.
The second configuration, made by entirely random selection, achieved a slightly worse
but similar accuracy of 96.0%, suggesting an average accuracy of around 96.1%.
Configuration 3 proved the importance of selection again: only 4 points were misclassified.
After careful choice of training data, the accuracy rose to 98.83%.
All of the tests were run under conditions similar to real-world cases, using completely
disjoint data for training and testing. The sensitivity parameter γ was set to 3, the optimum
for these tests.
5.4 Comparisons with other methods and analysis
First, the results of various methods on the Iris dataset were compared. These comparisons
with classification systems based on other methods or approaches clearly demonstrate the
performance of the proposed method (Table 6).
Table 6. Comparative results of several classification systems on the Iris dataset

Approach                 Training No.    Testing No.      Errors   Error rate
Bayes Classifier         75 (selected)   75 (selected)    2        2.6%
Fuzzy k-NN               36 (random)     36 (random)      4        11.0%
k-nearest neighbour      75 (random)     75 (random)      4        5.2%
Fuzzy Perceptron         whole set       whole set        2        2.6%
Abe and Lan’s method     75 (random)     75 (random)      7        9.3%
Proposed method          75 (random)     75 (random)      4        5.2%
Proposed method*         20 (selected)   130 (selected)   1        0.7%
On Iris, the proposed method achieved a low error rate of 5.2%. This matched the
k-nearest neighbour approach and bettered both the fuzzy k-NN approach [15] and Abe
and Lan’s method [6], which is also based on an overlapping approach. Although the
Fuzzy Perceptron made only 2 errors, it used the whole set for both training and testing,
which means its training data could have been the best possible set. Like the Fuzzy
Perceptron, the Bayes Classifier selected its training and testing data but gave no
information about how the 75 training samples were chosen. The last row of Table 6
shows that after careful selection the proposed system achieved very high accuracy, with
only one sample misclassified. The previous system using a similar approach, developed
by I. Gadaras and L. Mikhailov [9], achieved at best 2 errors on 75 testing samples (a 2.7%
error rate), so the improvement in the current version shows that the modifications
genuinely increased accuracy. Because the original way of generating rules was retained,
there was no increase in computational cost from complex mathematical calculation.
Furthermore, the proposed method needed neither an initial partition of the input space nor
any prior knowledge, unlike the Bayes Classifier, which makes it easy to apply in a more
general environment.
For the Wisconsin Breast Cancer dataset, the proposed method was compared with other
methods in the same way as for Iris. Several results from other research are listed in
Table 7, together with the proposed method’s average ability and best performance. All
the methods listed are eminent research in the field: the evolutionary method VISIT
suggested by Chang and Lilly [16], the neuro-fuzzy approach NEFCLASS by Nauck and
Kruse [17], an alternative technique based on decision-tree initialisation by Abonyi and
Szeifert [18], and Gadaras and Mikhailov’s method [9], the predecessor of the proposed
method.
Table 7. Comparative results of systems on the Wisconsin Breast Cancer dataset

Approach                         Training No.     Testing No.      Accuracy
VISIT (Chang & Lilly)            400 (random)     283 (random)     96.50%
NEFCLASS                         whole set        whole set        95.06%
Abonyi & Szeifert’s method       342 (random)     341 (random)     95.57%
Gadaras & Mikhailov’s method     340 (random)     343 (random)     96.08%
Proposed method                  340 (random)     343 (random)     96.10%
Proposed method*                 340 (selected)   343 (selected)   98.83%
In terms of average ability, the accuracy of the proposed method outperformed
NEFCLASS, Abonyi & Szeifert’s method and the previous Gadaras and Mikhailov
method. The advantage is even more marked given that NEFCLASS required a long
training period and more than ten conditions for rule pruning, while Abonyi & Szeifert’s
method needed parameter initialisation and three to four conditions. VISIT achieved
higher accuracy than the proposed method, but it used 400 samples for training and
required initialisation of the membership functions, which led to more than 100 learning
iterations. The average accuracy of Gadaras and Mikhailov’s method was 96.08%, and its
maximum performance was 97.12% (with careful selection of training data); the proposed
method exceeded it in both situations. Gadaras and Mikhailov’s method trains very
quickly and requires no prior knowledge, advantages in computational cost that the
proposed method shares; the modifications and improvements over that method are
reflected in accuracy.
It is evident that the classification results of the proposed method are comparable to other
modern, eminent approaches in terms of both accuracy and computational cost, especially
in high-dimensional cases. The method requires only a few parameters, which the user can
flexibly adjust to specific situations and needs. When proper training data are obtained,
the system can achieve a very high level of accuracy.
Meanwhile, when testing the proposed method, we also observed that the purely
Euclidean-distance-based algorithm achieved nearly the same accuracy as the
density-based one. However, on the nine-dimensional Wisconsin Breast Cancer dataset,
calculating Euclidean distances consumed a little more time than simply comparing
densities in the overlapping area. The accuracy of the two algorithms also varies with the
separation of the dataset, so algorithm selection probably ought to depend on the specific
case; overall, switching between the two algorithms did not noticeably affect accuracy (a
sketch contrasting the two criteria follows this paragraph). Moreover, iterating in the
overlapping area proved a sound way to obtain accurate results, especially when the
amount of data is large and a fair proportion of it falls into the overlapping area.
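For illustration only, the following sketch contrasts the two assignment criteria for a point in the overlapping area: distance to the hyperbox centres (as in the Appendix 1 code) versus a count of training points of each class inside the overlap. The density rule here is a plausible reading of the description above, not the exact implemented logic, and the centre values in main are hypothetical.

public class OverlapAssignment
{
    // Distance criterion: assign to the class whose hyperbox centre is nearer.
    static int byDistance(double[] p, double[] centreA, double[] centreB)
    {
        double da = 0, db = 0;
        for (int i = 0; i < p.length; i++)
        {
            da += (p[i] - centreA[i]) * (p[i] - centreA[i]);
            db += (p[i] - centreB[i]) * (p[i] - centreB[i]);
        }
        return da <= db ? 0 : 1;   // squared distances suffice: no sqrt needed
    }

    // Density criterion (assumed reading): assign to the class with more
    // training points inside the overlapping region.
    static int byDensity(int countA, int countB)
    {
        return countA >= countB ? 0 : 1;
    }

    public static void main(String[] args)
    {
        double[] p = {6.0, 5.0, 2.8, 1.6};            // a hypothetical test point
        double[] cVersicolor = {5.9, 4.3, 2.8, 1.3};  // hypothetical box centres
        double[] cVirginica  = {6.6, 5.6, 3.0, 2.0};
        System.out.println(byDistance(p, cVersicolor, cVirginica)); // 0 -> class A
        System.out.println(byDensity(12, 30));                      // 1 -> class B
    }
}

Note that the distance rule needs no square root for the comparison, so its extra cost relative to the density count comes only from the per-dimension arithmetic, which grows with the number of dimensions.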
However, two situations may not be well suited to this approach. First, if only a few points
fall into the overlapping areas, iteration seems redundant and constructing new
hyperboxes consumes extra resources. Second, if the distribution is highly separated, this
overlapping approach may not handle non-overlapping situations efficiently. The first
problem can be addressed by careful control of the termination parameters, or even by
adding new criteria to stop the iteration in certain classification cases. The second
problem, of data distribution, is a general one for all classification systems; an interesting
non-overlapping method researched by L. Mikhailov [19] offers an alternative that may
suit certain other situations.
6. Conclusion and Future work
In this dissertation, a fuzzy classification system using a hierarchical overlapping
approach was realised. The theoretical method behind the system is based on an approach
proposed by I. Gadaras and L. Mikhailov, which extracts fuzzy rules directly from
numerical data. The system was developed in the Java programming language with a
JDBC-accessed database. All testing results were evaluated on well-known datasets: the
Fisher Iris dataset and the Wisconsin Breast Cancer dataset. Comparisons with other
methods and similar systems were also provided, followed by comparative analysis.
The major achievements of this dissertation are as follows. Firstly, a
Euclidean-distance-based assignment algorithm was realised as an alternative to the
density-based one. Results showed that in certain cases this approach achieved slightly
better accuracy than the density-based version, although the improvement was small and
not general, and for very large datasets the Euclidean distance calculation could be costly.
Secondly, a modification of the membership functions near the boundaries was also
applied successfully. This considerably rectified the problem of missing points, which
helped to decrease the error rate; it was especially useful when there was a large amount of
data with only a few classes. Thirdly, tuning the slopes according to the size of the
hyperbox proved effective for improving accuracy, a noticeable advance over I. Gadaras
and L. Mikhailov’s method. Moreover, the system inherited the previous advantage of low
computational cost, since no prior knowledge was required for initialisation.
Owing to the complexity of classification problems themselves, it is very difficult to
obtain a system general enough to suit every situation. The pattern classification system
developed in this dissertation has shown comparatively good performance, but its results
do depend on the data distribution to some extent. For example, if the data lie very loosely,
or only a few of them fall into the overlapping area, then other methods such as
non-overlapping approaches may perform better. Moreover, the iterative approach suits
heavily overlapped situations and cannot guarantee an improvement in accuracy in all
cases; the stopping criteria for the iteration partially solve this problem, but the user still
has to adjust them for each specific application.
Future research on this subject remains desirable. Deeper analysis of various situations
could lead to better results. One direction would be to devise an approach or algorithm
that suits a wide range of circumstances, or at least reduces the influence of the data
distribution; another would be to integrate the different approaches in one system that can
select an algorithm, manually or automatically, for each specific application. This could
be achieved by combining the previous classification systems under a proper user
interface.
References
[1] C. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[2] P. C. Chen, Fuzzy Approach for Pattern Classification. Dissertation submitted to the
University of Manchester, 1999.
[3] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic Press, 1990.
[4] A. Jain, P. Duin and J. Mao, “Statistical Pattern Recognition: A Review,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, 2000.
[5] K. S. Fu, Syntactic Pattern Recognition and Applications. Prentice-Hall, 1982.
[6] S. Abe and M. S. Lan, “Fuzzy Rules Extraction Directly from Numerical Data for
Function Approximation,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 25,
no. 1, January 1995.
[7] V. N. Vapnik, Statistical Learning Theory. New York: John Wiley & Sons, 1998.
[8] L. I. Perlovsky, “Conundrum of Combinatorial Complexity,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 20, no. 6, pp. 666-670, 1998.
[9] I. Gadaras and L. Mikhailov, “Generation of Fuzzy Classification Rules Directly from
Overlapping Input Partitioning,” IEEE International Fuzzy Systems Conference, London,
pp. 1-6, 2007.
[10] J. Abonyi and F. Szeifert, “Supervised fuzzy clustering for the identification of fuzzy
classifiers,” Pattern Recognition Letters, vol. 24, pp. 2195-2207, 2003.
[11] M. Setnes and H. Roubos, “GA-fuzzy modelling and classification: Complexity and
performance,” IEEE Transactions on Fuzzy Systems, vol. 8, pp. 509-522, October 2000.
[12] R. A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Annals
of Eugenics, vol. 7, pp. 179-188, 1936.
[13] L. Wang and J. M. Mendel, “Generating Fuzzy Rules by Learning from Examples,”
IEEE Transactions on Systems, Man, and Cybernetics, vol. 22, pp. 1414-1427, 1992.
[14] S. Abe and M. S. Lan, “Fuzzy Rules Extraction Directly from Numerical Data for
Function Approximation,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 25,
no. 1, pp. 119-129, 1995.
[15] J. M. Keller, M. R. Gray and J. A. Givens, “A fuzzy k-nearest neighbor algorithm,”
IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, pp. 580-585, 1985.
[16] X. Chang and J. H. Lilly, “Evolutionary Design of a Fuzzy Classifier from Data,”
IEEE Transactions on Systems, Man, and Cybernetics, vol. 34, pp. 1894-1906, 2004.
[17] D. Nauck and R. Kruse, “Obtaining interpretable fuzzy classification rules from
medical data,” Artificial Intelligence in Medicine, vol. 16, pp. 149-169, 1999.
[18] J. Abonyi and F. Szeifert, “Supervised fuzzy clustering for the identification of fuzzy
classifiers,” Pattern Recognition Letters, vol. 24, pp. 2195-2207, 2003.
[19] L. Mikhailov, “Generation of Fuzzy Classification Rules by Non-Overlapping Input
Partitioning,” IEEE International Symposium on Evolving Fuzzy Systems (EFS’06), Lake
District, UK, pp. 365-369, 2006.
Appendix 1: Sample source code
HyperBoxM.java
///////////////////////////////////////////////////////////////////////////////////
//construct the hyperbox
///////////////////////////////////////////////////////////////////////////////////
public class HyperBoxM
{
private double xmin, xmax, ymin, ymax, zmin, zmax, wmin, wmax;
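// Note: the box is hard-coded for four dimensions (x, y, z, w), matching the
// four Iris attributes; getMinimum/getMaximum/getCentre select a dimension
// by index 1-4.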
public HyperBoxM(double xmin, double xmax, double ymin, double ymax, double zmin, double
zmax, double wmin, double wmax)
{
this.xmin = xmin;
this.xmax = xmax;
this.ymin = ymin;
this.ymax = ymax;
this.zmin = zmin;
this.zmax = zmax;
this.wmin = wmin;
this.wmax = wmax;
}
///////////////////////////////////////////////////////////////////////////////////
//Obtain Minimum value in each dimension
///////////////////////////////////////////////////////////////////////////////////
public double getMinimum(int dimension)
{
double min = 0.0;
switch( dimension )
{
case 1:
min = xmin;
break;
case 2:
min = ymin;
break;
case 3:
min = zmin;
break;
case 4:
min = wmin;
break;
default:
min = 0.0;
}
return min;
}
///////////////////////////////////////////////////////////////////////////////////
//Obtain Maximum value in each dimension
///////////////////////////////////////////////////////////////////////////////////
public double getMaximum(int dimension)
{
double max = 0.0;
switch( dimension )
{
case 1:
max = xmax;
break;
case 2:
max = ymax;
break;
case 3:
max = zmax;
break;
case 4:
max = wmax;
break;
default:
max = 0.0;
}
return max;
}
///////////////////////////////////////////////////////////////////////////////////
//Obtain centre for calculation of Euclidean distance
///////////////////////////////////////////////////////////////////////////////////
public double getCentre(int dimension)
{
double centre = 0.0;
switch (dimension)
{
case 1:
centre = (xmax + xmin) / 2;
break;
case 2:
centre = (ymax + ymin) / 2;
break;
case 3:
centre = (zmax + zmin) / 2;
break;
case 4:
centre = (wmax + wmin) / 2;
break;
default:
centre = 0.0;
}
return centre;
}
}
classificationTestM.java
///////////////////////////////////////////////////////////////////////////////////
//Imports
///////////////////////////////////////////////////////////////////////////////////
import java.sql.*;
import java.util.*;
import java.lang.*;
import nrc.fuzzy.*;
///////////////////////////////////////////////////////////////////////////////////
//select separation and set sensitivity parameter
///////////////////////////////////////////////////////////////////////////////////
public class classificationTestM
{
private final static String tableName = "iris_Train";
private final static String tableName2 = "iris_Test";
private double r = 2;   // sensitivity parameter (gamma in the text)
// Assumed here: the constructor, omitted from this excerpt, opens the JDBC
// connection and initialises stmt; rs holds the current result set.
private Statement stmt;
private ResultSet rs;
///////////////////////////////////////////////////////////////////////////////////
//generate hyperboxes
///////////////////////////////////////////////////////////////////////////////////
public boolean hyperboxGeneration() throws SQLException
{
HyperBoxM hBox1 = null, hBox2 = null, hBox3 = null;
rs = stmt.executeQuery("SELECT MAX(attr_1) AS maxX, MAX(attr_2) AS maxY, MAX(attr_3) AS
maxZ, MAX(attr_4) AS maxW, MIN(attr_1) AS minX, MIN(attr_2) AS minY, MIN(attr_3) AS minZ,
MIN(attr_4) AS minW FROM " + tableName + " WHERE out = 1");
if( rs.next() )
{
hBox1 = new HyperBoxM(rs.getDouble("minX"), rs.getDouble("maxX"), rs.getDouble("minY"),
rs.getDouble("maxY"), rs.getDouble("minZ"), rs.getDouble("maxZ"), rs.getDouble("minW"),
rs.getDouble("maxW"));
rs.close();
}
rs = stmt.executeQuery("SELECT MAX(attr_1) AS maxX, MAX(attr_2) AS maxY, MAX(attr_3) AS
maxZ, MAX(attr_4) AS maxW, MIN(attr_1) AS minX, MIN(attr_2) AS minY, MIN(attr_3) AS minZ,
MIN(attr_4) AS minW FROM " + tableName + " WHERE out = 2");
if( rs.next() )
{
hBox2 = new HyperBoxM(rs.getDouble("minX"), rs.getDouble("maxX"), rs.getDouble("minY"),
rs.getDouble("maxY"), rs.getDouble("minZ"), rs.getDouble("maxZ"), rs.getDouble("minW"),
rs.getDouble("maxW"));
rs.close();
}
rs = stmt.executeQuery("SELECT MAX(attr_1) AS maxX, MAX(attr_2) AS maxY, MAX(attr_3) AS
maxZ, MAX(attr_4) AS maxW, MIN(attr_1) AS minX, MIN(attr_2) AS minY, MIN(attr_3) AS minZ,
MIN(attr_4) AS minW FROM " + tableName + " WHERE out = 3");
if( rs.next() )
{
hBox3 = new HyperBoxM(rs.getDouble("minX"), rs.getDouble("maxX"), rs.getDouble("minY"),
rs.getDouble("maxY"), rs.getDouble("minZ"), rs.getDouble("maxZ"), rs.getDouble("minW"),
rs.getDouble("maxW"));
rs.close();
}
// (stmt must remain open here: it is reused for the testing-data queries below)
///////////////////////////////////////////////////////////////////////////////////
//set fuzzy variables
///////////////////////////////////////////////////////////////////////////////////
FuzzyVariable attribute1 = new FuzzyVariable("attr_1", 0, 10, "unit(s)");
FuzzyVariable attribute2 = new FuzzyVariable("attr_2", 0, 10, "unit(s)");
FuzzyVariable attribute3 = new FuzzyVariable("attr_3", 0, 10, "unit(s)");
FuzzyVariable attribute4 = new FuzzyVariable("attr_4", 0, 10, "unit(s)");
FuzzyVariable output = new FuzzyVariable("out", 0, 10, "unit(s)");
///////////////////////////////////////////////////////////////////////////////////
//generate membership functions
///////////////////////////////////////////////////////////////////////////////////
double d1xlow[] = {0, 0, (hBox1.getMaximum(1)), (hBox1.getMaximum(1) +
(hBox1.getMaximum(1)-hBox1.getMinimum(1)) / r)};
double d1ylow[] = {0, 1, 1, 0};
double d1xmed[] = {(hBox2.getMinimum(1) - ((hBox2.getMaximum(1)-hBox2.getMinimum(1))
/ r)), hBox2.getMinimum(1), hBox3.getMinimum(1), hBox2.getMaximum(1)};
double d1ymed[] = {0, 1, 1, 0};
double d1xhigh[] = {hBox3.getMinimum(1), hBox2.getMaximum(1), 10, 10};
double d1yhigh[] = {0, 1, 1, 0};
attribute1.addTerm("low", d1xlow, d1ylow, 4);
attribute1.addTerm("medium", d1xmed, d1ymed, 4);
attribute1.addTerm("high", d1xhigh, d1yhigh, 4);
double d2xlow[] = {0, 0, hBox3.getMinimum(2), hBox2.getMaximum(2) +
(hBox2.getMaximum(2)-hBox2.getMinimum(2))/r};
double d2ylow[] = {0, 1, 1, 0};
double d2xmed[] = {hBox3.getMinimum(2), hBox2.getMaximum(2), hBox3.getMaximum(2),
hBox3.getMaximum(2)};
double d2ymed[] = {0, 1, 1, 0};
double d2xhigh[] = {hBox1.getMinimum(2) - (hBox1.getMaximum(2)-hBox1.getMinimum(2)) /
r, hBox1.getMaximum(2), 10, 10};
double d2yhigh[] = {0, 1, 1, 0};
attribute2.addTerm("low", d2xlow, d2ylow, 4);
attribute2.addTerm("medium", d2xmed, d2ymed, 4);
attribute2.addTerm("high", d2xhigh, d2yhigh, 4);
double d3xlow[] = {0, 0, (hBox1.getMaximum(3)), hBox1.getMaximum(3)+
(hBox1.getMaximum(3)-hBox1.getMinimum(3))/r };
double d3ylow[] = {0, 1, 1, 0};
double d3xmed[] = {(hBox2.getMinimum(3) -
(hBox2.getMaximum(3)-hBox2.getMinimum(3))/r), hBox2.getMinimum(3), hBox3.getMinimum(3),
hBox2.getMaximum(3)};
double d3ymed[] = {0, 1, 1, 0};
double d3xhigh[] = {hBox3.getMinimum(3), hBox2.getMaximum(3), 10, 10};
double d3yhigh[] = {0, 1, 1, 0};
attribute3.addTerm("low", d3xlow, d3ylow, 4);
attribute3.addTerm("medium", d3xmed, d3ymed, 4);
attribute3.addTerm("high", d3xhigh, d3yhigh, 4);
double d4xlow[] = {0, 0, (hBox1.getMaximum(4)-hBox1.getMinimum(4))/r,
hBox1.getMaximum(4)};
double d4ylow[] = {0, 1, 1, 0};
double d4xmed[] = {(hBox2.getMinimum(4) - (hBox2.getMaximum(4)-hBox2.getMinimum(4))/
r), hBox2.getMinimum(4), hBox3.getMinimum(4), hBox2.getMaximum(4)};
double d4ymed[] = {0, 1, 1, 0};
double d4xhigh[] = {hBox3.getMinimum(4), hBox2.getMaximum(4), 10, 10};
double d4yhigh[] = {0, 1, 1, 0};
attribute4.addTerm("low", d4xlow, d4ylow, 4);
attribute4.addTerm("medium", d4xmed, d4ymed, 4);
attribute4.addTerm("high", d4xhigh, d4yhigh, 4);
///////////////////////////////////////////////////////////////////////////////////
//set fuzzy values
///////////////////////////////////////////////////////////////////////////////////
double outxlow[] = {0, 1.5, 3};
double outylow[] = {0, 1, 0};
double outxmed[] = {3.5, 5, 6.5};
double outymed[] = {0, 1, 0};
double outxhigh[] = {7, 8.5, 10};
double outyhigh[] = {0, 1, 0};
output.addTerm("low", outxlow, outylow, 3);
output.addTerm("medium", outxmed, outymed, 3);
output.addTerm("high", outxhigh, outyhigh, 3);
///////////////////////////////////////////////////////////////////////////////////
//generate fuzzy rules
///////////////////////////////////////////////////////////////////////////////////
FuzzyRule lhll = new FuzzyRule();
FuzzyRule mlmm = new FuzzyRule();
FuzzyRule hmhh = new FuzzyRule();
lhll.addAntecedent(new FuzzyValue(attribute1, "low"));
lhll.addAntecedent(new FuzzyValue(attribute2, "high"));
lhll.addAntecedent(new FuzzyValue(attribute3, "low"));
lhll.addAntecedent(new FuzzyValue(attribute4, "low"));
lhll.addConclusion(new FuzzyValue(output, "low"));
mlmm.addAntecedent(new FuzzyValue(attribute1, "medium"));
mlmm.addAntecedent(new FuzzyValue(attribute2, "low"));
mlmm.addAntecedent(new FuzzyValue(attribute3, "medium"));
mlmm.addAntecedent(new FuzzyValue(attribute4, "medium"));
mlmm.addConclusion(new FuzzyValue(output, "medium"));
hmhh.addAntecedent(new FuzzyValue(attribute1, "high"));
hmhh.addAntecedent(new FuzzyValue(attribute2, "medium"));
hmhh.addAntecedent(new FuzzyValue(attribute3, "high"));
hmhh.addAntecedent(new FuzzyValue(attribute4, "high"));
hmhh.addConclusion(new FuzzyValue(output, "high"));
int overIds = 0, nonOverIds = 0,overIds2 = 0, overIds3 = 0;
Vector<Integer> overlapIds = new Vector<Integer>();
Vector<Integer> nonOverlapIds = new Vector<Integer>();
///////////////////////////////////////////////////////////////////////////////////
//Obtain testing data (overlapping and non-overlapping points)
///////////////////////////////////////////////////////////////////////////////////
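// The overlapping region is taken as [hBox3.min, hBox2.max] in every
// dimension, i.e. where the versicolor and virginica boxes intersect;
// setosa is well separated from the other two classes on this data.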
rs = stmt.executeQuery("SELECT count(id) as cnt FROM " + tableName2 + " WHERE (attr_1
>= " + hBox3.getMinimum(1) + " AND attr_1 <= " + hBox2.getMaximum(1) + ") AND " +
"(attr_2 > " +
hBox3.getMinimum(2) + " AND attr_2 < " + hBox2.getMaximum(2) + ") AND "+
"(attr_3 > " +
hBox3.getMinimum(3) + " AND attr_3 < " + hBox2.getMaximum(3) + ") AND "+
"(attr_4 > " +
hBox3.getMinimum(4) + " AND attr_4 < " + hBox2.getMaximum(4) + ") AND out=2");
if( rs.next() )
overIds2 = rs.getInt("cnt");
rs.close();
rs = stmt.executeQuery("SELECT count(id) as cnt FROM " + tableName2 + " WHERE (attr_1
>= " + hBox3.getMinimum(1) + " AND attr_1 <= " + hBox2.getMaximum(1) + ") AND " +
"(attr_2 > " +
hBox3.getMinimum(2) + " AND attr_2 < " + hBox2.getMaximum(2) + ") AND "+
"(attr_3 > " +
hBox3.getMinimum(3) + " AND attr_3 < " + hBox2.getMaximum(3) + ") AND "+
"(attr_4 > " +
hBox3.getMinimum(4) + " AND attr_4 < " + hBox2.getMaximum(4) + ") AND out=3");
if( rs.next() )
overIds3 = rs.getInt("cnt");
rs.close();
//overlapping points
rs = stmt.executeQuery("SELECT count(id) as cnt FROM " + tableName2 + " WHERE (attr_1
>= " + hBox3.getMinimum(1) + " AND attr_1 <= " + hBox2.getMaximum(1) + ") AND " +
"(attr_2 > " +
hBox3.getMinimum(2) + " AND attr_2 < " + hBox2.getMaximum(2) + ") AND "+
"(attr_3 > " +
hBox3.getMinimum(3) + " AND attr_3 < " + hBox2.getMaximum(3) + ") AND "+
"(attr_4 > " +
hBox3.getMinimum(4) + " AND attr_4 < " + hBox2.getMaximum(4) + ")");
if( rs.next() )
overIds = rs.getInt("cnt");
rs.close();
rs = stmt.executeQuery("SELECT id FROM " + tableName2 + " WHERE (attr_1 >= " +
hBox3.getMinimum(1) + " AND attr_1 <= " + hBox2.getMaximum(1) + ") AND " +
"(attr_2 > " +
hBox3.getMinimum(2) + " AND attr_2 < " + hBox2.getMaximum(2) + ") AND "+
"(attr_3 > " +
hBox3.getMinimum(3) + " AND attr_3 < " + hBox2.getMaximum(3) + ") AND "+
"(attr_4 > " +
hBox3.getMinimum(4) + " AND attr_4 < " + hBox2.getMaximum(4) + ")");
while(rs.next())
{
overlapIds.addElement( rs.getInt("id") );
}
rs.close();
FuzzyValueVector fvv1, fvv2, fvv3;
///////////////////////////////////////////////////////////////////////////////////
//classify each point in overlapping area
///////////////////////////////////////////////////////////////////////////////////
for(int i = 0; i < overIds; i++)
{
rs = stmt.executeQuery("SELECT id, attr_1, attr_2, attr_3, attr_4 FROM " + tableName2
+ " WHERE id = " + overlapIds.elementAt(i));
if( rs.next() )
{
double a = rs.getDouble("attr_1");
double b = rs.getDouble("attr_2");
double c = rs.getDouble("attr_3");
double d = rs.getDouble("attr_4");
FuzzyValue at1FV = new FuzzyValue(attribute1, new SingletonFuzzySet(a));
FuzzyValue at2FV = new FuzzyValue(attribute2, new SingletonFuzzySet(b));
FuzzyValue at3FV = new FuzzyValue(attribute3, new SingletonFuzzySet(c));
FuzzyValue at4FV = new FuzzyValue(attribute4, new SingletonFuzzySet(d));
rs.close();
double c21 = hBox2.getCentre(1);
double c22 = hBox2.getCentre(2);
double c23 = hBox2.getCentre(3);
double c24 = hBox2.getCentre(4);
double c31 = hBox3.getCentre(1);
double c32 = hBox3.getCentre(2);
double c33 = hBox3.getCentre(3);
double c34 = hBox3.getCentre(4);
///////////////////////////////////////////////////////////////////////////////////
//calculate Euclidean distance and match rule
///////////////////////////////////////////////////////////////////////////////////
if (distance(a, b, c, d, c21, c22, c23, c24) <= distance(a, b, c, d, c31, c32, c33,
c34))
{
mlmm.removeAllInputs();
mlmm.addInput(at1FV);
mlmm.addInput(at2FV);
mlmm.addInput(at3FV);
mlmm.addInput(at4FV);
if( mlmm.testRuleMatching() )
{
fvv2 = mlmm.execute();
System.out.println(fvv2.fuzzyValueAt(0));
System.out.println(overlapIds.elementAt(i));
}
}
else
{
hmhh.removeAllInputs();
hmhh.addInput(at1FV);
hmhh.addInput(at2FV);
hmhh.addInput(at3FV);
hmhh.addInput(at4FV);
if( hmhh.testRuleMatching() )
{
fvv3 = hmhh.execute();
System.out.println(fvv3.fuzzyValueAt(0));
System.out.println(overlapIds.elementAt(i));
}
}
}//end if
}//end for overlapping
///////////////////////////////////////////////////////////////////////////////////
//deal with points in non-overlapping area
///////////////////////////////////////////////////////////////////////////////////
rs = stmt.executeQuery("SELECT count(id) as cnt FROM " + tableName2 + " WHERE (attr_1
<= " + hBox3.getMinimum(1) + " or attr_1 >= " + hBox2.getMaximum(1) + ") or " +
"(attr_2 <= " +
hBox3.getMinimum(2) + " or attr_2 >= " + hBox2.getMaximum(2) + ") or "+
"(attr_3 <= " +
hBox3.getMinimum(3) + " or attr_3 >= " + hBox2.getMaximum(3) + ") or "+
"(attr_4 <= " +
hBox3.getMinimum(4) + " or attr_4 >= " + hBox2.getMaximum(4) + ")");
if( rs.next() )
nonOverIds = rs.getInt("cnt");
rs.close();
rs = stmt.executeQuery("SELECT id FROM " + tableName2 + " WHERE (attr_1 <= " +
hBox3.getMinimum(1) + " or attr_1 >= " + hBox2.getMaximum(1) + ") or " +
"(attr_2 <= " +
hBox3.getMinimum(2) + " or attr_2 >= " + hBox2.getMaximum(2) + ") or "+
"(attr_3 <= " +
hBox3.getMinimum(3) + " or attr_3 >= " + hBox2.getMaximum(3) + ") or "+
"(attr_4 <= " +
hBox3.getMinimum(4) + " or attr_4 >= " + hBox2.getMaximum(4) + ")");
while(rs.next())
{
nonOverlapIds.addElement( rs.getInt("id") );
}
rs.close();
for(int j = 0; j < nonOverIds; j++)
{
rs = stmt.executeQuery("SELECT id, attr_1, attr_2, attr_3, attr_4 FROM " + tableName2
+ " WHERE id = " + nonOverlapIds.elementAt(j));
if( rs.next() )
{
FuzzyValue at1FV = new FuzzyValue(attribute1, new
SingletonFuzzySet(rs.getDouble("attr_1")));
FuzzyValue at2FV = new FuzzyValue(attribute2, new
SingletonFuzzySet(rs.getDouble("attr_2")));
FuzzyValue at3FV = new FuzzyValue(attribute3, new
SingletonFuzzySet(rs.getDouble("attr_3")));
FuzzyValue at4FV = new FuzzyValue(attribute4, new
SingletonFuzzySet(rs.getDouble("attr_4")));
rs.close();
lhll.removeAllInputs();
lhll.addInput(at1FV);
lhll.addInput(at2FV);
lhll.addInput(at3FV);
lhll.addInput(at4FV);
if( lhll.testRuleMatching() )
{
fvv1 = lhll.execute();
System.out.println(fvv1.fuzzyValueAt(0));
System.out.println(nonOverlapIds.elementAt(j));
}
mlmm.removeAllInputs();
mlmm.addInput(at1FV);
mlmm.addInput(at2FV);
mlmm.addInput(at3FV);
mlmm.addInput(at4FV);
if( mlmm.testRuleMatching() )
{
fvv2 = mlmm.execute();
System.out.println(fvv2.fuzzyValueAt(0));
System.out.println(nonOverlapIds.elementAt(j));
}
hmhh.removeAllInputs();
hmhh.addInput(at1FV);
hmhh.addInput(at2FV);
hmhh.addInput(at3FV);
hmhh.addInput(at4FV);
if( hmhh.testRuleMatching() )
{
fvv3 = hmhh.execute();
System.out.println(fvv3.fuzzyValueAt(0));
System.out.println(nonOverlapIds.elementAt(j));
}
}
}//end for non overlap
return true;
}//end hyperboxGeneration (closing brace and return restored; truncated in the listing)
///////////////////////////////////////////////////////////////////////////////////
//detect existence of overlapping area
///////////////////////////////////////////////////////////////////////////////////
private boolean existsOverlapping(HyperBoxM box1, HyperBoxM box2)
{
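// Assumption: box1 is the lower box in every dimension, so overlap exists
// when box1's maximum reaches past box2's minimum; a fully general test
// would also check box2.getMaximum(d) >= box1.getMinimum(d).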
if( box1.getMaximum(1) >= box2.getMinimum(1) )
if( box1.getMaximum(2) >= box2.getMinimum(2) )
if( box1.getMaximum(3) >= box2.getMinimum(3) )
if( box1.getMaximum(4) >= box2.getMinimum(4) )
return true;
return false;
}
///////////////////////////////////////////////////////////////////////////////////
//calculate Euclidean distance
///////////////////////////////////////////////////////////////////////////////////
public double distance(double a, double b, double c, double d,
double a1, double b1, double c1, double d1)
{
// Squared differences make Math.abs unnecessary.
double dx = a - a1, dy = b - b1, dz = c - c1, dw = d - d1;
return Math.sqrt(dx * dx + dy * dy + dz * dz + dw * dw);
}
///////////////////////////////////////////////////////////////////////////////////
//main
///////////////////////////////////////////////////////////////////////////////////
public static void main(String[] args) throws SQLException
{
classificationTestM calgo = new classificationTestM("CAlgoDTB");
if( calgo.hyperboxGeneration() )
System.out.println("[+] Finish");
}
}//end class
Appendix 2: A sample separation of Iris dataset
Training data
*species: 1 for setosa, 2 for versicolor, 3 for virginica
id species Petal width Petal length Sepal width Sepal length
1 1 0.2 1.4 3.3 5
2 1 0.2 1.6 3.1 4.8
3 1 0.2 1.3 3.2 4.4
4 1 0.2 1.4 3 4.9
5 1 0.4 1.5 3.4 5.4
6 1 0.2 1.4 4.2 5.5
7 1 0.2 1.4 2.9 4.4
8 1 0.1 1.4 3 4.8
9 1 0.4 1.5 3.7 5.1
10 1 0.2 1.3 3 4.4
11 1 0.2 1.6 3.2 4.7
12 1 0.1 1.1 3 4.3
13 1 0.2 1.4 3.5 5.1
14 1 0.4 1.6 3.4 5
15 1 0.2 1.3 3.2 4.7
16 1 0.2 1.5 3.4 5.1
17 1 0.1 1.5 3.1 4.9
18 1 0.2 1.5 3.7 5.4
19 1 0.3 1.3 2.3 4.5
20 1 0.3 1.5 3.8 5.1
21 1 0.2 1.5 3.5 5.2
22 1 0.6 1.6 3.5 5
23 1 0.2 1.4 3.2 4.6
24 1 0.2 1.5 3.1 4.6
25 1 0.2 1.5 3.7 5.3
26 2 1.3 4.5 2.8 5.7
27 2 1.2 4 2.6 5.8
28 2 1 4.1 2.7 5.8
29 2 1.5 4.5 2.9 6
30 2 1 3.3 2.4 4.9
31 2 1.5 4.2 3 5.9
32 2 1.5 4.9 2.5 6.3
33 2 1.4 4.4 3 6.6
34 2 1.1 3.9 2.5 5.6
35 2 1.5 4.5 3 5.4
36 2 1 3.5 2.6 5.7
37 2 1.3 4.2 2.7 5.6
38 2 1.3 5.4 2.9 6.2
39 2 1.2 4.7 2.8 6.1
40 2 1.3 4.1 2.8 5.7
41 2 1.5 4.9 3.1 6.9
42 2 1.3 4 2.5 5.5
43 2 1.5 4.6 2.8 6.5
44 2 1.8 4.8 3.2 5.9
45 2 1.3 4 2.8 6.1
46 2 1.1 3.8 2.4 5.5
47 2 1.2 4.2 3 5.7
48 2 1.3 5.6 2.9 6.6
49 2 1.5 4.7 3.1 6.7
50 2 1.3 4 2.3 5.5
51 3 2.4 5.6 3.1 6.7
52 3 1.9 5.1 2.7 5.8
53 3 1.9 5 2.5 6.3
54 3 1.8 4.9 2.7 6.3
55 3 1.5 5 2.2 6
56 3 2 4.9 2.8 5.6
57 3 1.8 5.8 2.5 6.7
58 3 2.1 5.4 3.1 6.9
59 3 2.1 5.5 3 6.8
60 3 1.5 5.1 2.8 6.3
61 3 2.3 5.9 3.2 6.8
62 3 2.5 5.7 3.3 6.7
63 3 2.1 5.7 3.3 6.7
64 3 1.8 4.8 3 6
65 3 1.8 5.5 3 6.5
66 3 2.1 6.6 3 7.6
67 3 1.8 6 3.2 7.2
68 3 2 6.7 2.8 7.7
69 3 1.4 5.6 2.6 6.1
70 3 2.4 5.6 3.4 6.3
71 3 1.6 5.8 3 7.2
72 3 2.3 6.9 2.6 7.7
73 3 1.9 6.1 2.8 7.4
74 3 2.2 5.8 3 6.5
75 3 2 5 2.5 5.7
Testing data
*species: 1 for setosa, 2 for versicolor, 3 for virginica
id species Petal width Petal length Sepal width Sepal length
1 1 0.2 1 3.6 4.6
2 1 0.1 1.4 3.6 4.9
3 1 0.2 1.6 3.8 5.1
4 1 0.2 1.6 3 5
5 1 0.4 1.9 3.8 5.1
6 1 0.2 1.4 3.6 5
7 1 0.3 1.7 3.8 5.7
8 1 0.2 1.3 3.5 5.5
9 1 0.2 1.2 3.2 5
10 1 0.1 1.5 4.1 5.2
11 1 0.2 1.5 3.1 4.9
12 1 0.4 1.7 3.9 5.4
13 1 0.4 1.3 3.9 5.4
14 1 0.3 1.4 3.4 4.6
15 1 0.5 1.7 3.3 5.1
16 1 0.2 1.4 3.4 5.2
17 1 0.2 1.2 4 5.8
18 1 0.2 1.5 3.4 5.2
19 1 0.3 1.3 3.5 5
20 1 0.3 1.4 3.5 5.1
21 1 0.2 1.7 3.4 5.4
22 1 0.2 1.6 3.4 4.8
23 1 0.4 1.5 4.4 5.7
24 1 0.3 1.4 3 4.8
25 1 0.2 1.9 3.4 4.8
26 2 1 3.3 2.3 5
27 2 1.6 4.7 3.3 6.3
28 2 1.4 4.7 3.2 7
29 2 1.4 3.9 2.7 5.2
30 2 1.2 3.9 2.7 5.8
31 2 1.3 4.4 2.3 6.3
32 2 1.1 3 2.5 5.1
33 2 1.3 3.6 2.9 5.6
34 2 1.7 5 3 6.7
35 2 1.5 4.5 2.2 6.2
36 2 1.4 4.6 3 6.1
37 2 1.5 4.5 3.2 6.4
38 2 1.4 4.4 3.1 6.7
39 2 1.3 4.2 2.6 5.7
40 2 1.6 4.5 3.4 6
41 2 1 3.5 2 5
42 2 1 5 2.2 6
43 2 1.4 4.8 2.8 6.8
44 2 1.2 4.4 2.6 5.5
45 2 1.5 4.5 3 5.6
46 2 1.6 5.1 2.7 6
47 2 1.4 4.7 2.9 6.1
48 2 1 3.7 2.4 5.5
49 2 1.3 4.1 3 5.6
50 2 1.3 4.3 2.9 6.4
51 3 2.3 5.1 3.1 6.9
52 3 2 5.2 3 6.5
53 3 1.7 4.5 2.5 4.9
54 3 2.1 5.6 2.8 6.4
55 3 1.9 5.1 2.7 5.8
56 3 1.8 5.5 3.1 6.4
57 3 2.3 5.7 3.2 6.9
58 3 2.5 6.1 3.6 7.2
59 3 2.2 5.6 2.8 6.4
60 3 2.3 5.4 3.4 6.2
61 3 1.8 5.1 3 5.9
62 3 2.3 5.3 3.2 6.4
63 3 1.3 5.2 3 6.7
64 3 1.8 4.9 3 6.1
65 3 2.3 6.1 3 7.7
66 3 2 5.1 3.2 6.5
67 3 2.5 6 3.3 6.3
68 3 2.2 6.7 3.8 7.7
69 3 2 6.4 3.8 7.9
70 3 1.8 4.8 2.8 6.2
71 3 2.1 5.9 3 7.1
72 3 1.8 5.6 2.9 6.3
73 3 1.8 6.3 2.9 7.3
74 3 2.4 5.1 2.8 5.8
75 3 1.9 5.3 2.7 6.4