Fuzzy rule-based system derived from similarity to prototypes

Włodzisław Duch
Department of Informatics, Nicolaus Copernicus University, Poland
School of Computer Engineering, Nanyang Technological University, Singapore

Marcin Blachnik
Division of Computer Methods, Department of Electrotechnology, The Silesian University of Technology, Poland

Plan

1. What is it all about?
2. Fuzzy rule systems and prototype rule-based systems.
3. From prototype rules to fuzzy rules and vice versa, with examples.
4. Results of applications on real datasets.
5. Conclusions.

Motivation

When understanding data and situations, recognizing objects, or making a diagnosis, people frequently use similarity to known cases and rarely use logical reasoning, yet soft computing experts use logic instead of similarity ...

Relations between similarity and logic are not clear.

Q1: How to obtain the same decision borders in Fuzzy Logic systems and Prototype Rule Based systems?

Q2: What type of similarity measure corresponds to typical fuzzy membership functions, and vice versa?

Q3: How to transform one type of system into the other while preserving the decision borders?

Q4: Are there any advantages of such transformations?

Q5: Can we understand data better using prototypes instead of logical rules?

Fuzzy Rule Based System

Learning process includes:

– for each feature, select shapes of membership functions and the number of these functions;
– optimize parameters of the membership functions (such as positions and spreads) using training data;
– aggregate input information and calculate final rule activations for each category;
– assign membership degrees to output classes;
– write the set of F-rules and interpret them.
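The aggregation and class-assignment steps listed above can be illustrated with a minimal sketch. It assumes Gaussian membership functions, product aggregation, and a winner-takes-all decision; the rule layout (`centers`, `spreads`, `label`) and all numeric values are hypothetical and only serve the illustration.

```python
import math

def gaussian_mf(x, center, spread):
    """Gaussian membership function with a given position and spread."""
    return math.exp(-((x - center) ** 2) / (2.0 * spread ** 2))

def rule_activation(x, centers, spreads):
    """Aggregate per-feature memberships with the product T-norm."""
    activation = 1.0
    for xi, c, s in zip(x, centers, spreads):
        activation *= gaussian_mf(xi, c, s)
    return activation

def classify(x, rules):
    """Return the class label of the most strongly activated rule."""
    best = max(rules, key=lambda r: rule_activation(x, r["centers"], r["spreads"]))
    return best["label"]

# Hypothetical two-rule system on two features (illustrative values only).
rules = [
    {"label": "A", "centers": [0.0, 0.0], "spreads": [1.0, 1.0]},
    {"label": "B", "centers": [3.0, 3.0], "spreads": [1.0, 1.0]},
]
print(classify([0.5, 0.2], rules))
print(classify([2.8, 3.1], rules))
```

In a real F-rule system the positions and spreads would come from the optimization step on training data rather than being set by hand.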

Prototype Rule Based System

Learning process involves:

– specify the number and positions of prototypes;
– select similarity or dissimilarity (distance) functions (we use distance functions);
– calculate the distance (similarity) to each prototype;
– assign a P-rule to the output class; the choices are listed next.

Nearest prototype rule:

$$\text{If } P = \arg\min_{P'} D(X, P') \text{ then } \mathrm{Class}(X) = \mathrm{Class}(P)$$

This is similar to the fuzzy logic rule:

$$\text{If } R = \arg\max_{k} \mathrm{MembF}_k(X) \text{ then } \mathrm{Class}(X) = \mathrm{Class}(R)$$

Another form of P-rules is based on a similarity threshold:

$$\text{If } D(X, P) \le d_P \text{ then } C$$

With an appropriate choice of the distance $D(X,P)$, crisp logic rules are obtained; for example, a threshold on the Chebyshev ($L_\infty$) distance is a conjunction of interval conditions on the individual features.
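The nearest prototype rule and the threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the prototype list, and the threshold value are hypothetical, and the Chebyshev distance is used only as one example of a distance that yields crisp interval rules.

```python
import math

def euclidean(x, p):
    """Euclidean distance between a vector and a prototype."""
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))

def chebyshev(x, p):
    """Chebyshev (L-infinity) distance: the largest per-feature difference."""
    return max(abs(xi - pi) for xi, pi in zip(x, p))

def nearest_prototype_class(x, prototypes, dist=euclidean):
    """If P = argmin_P' D(X, P') then Class(X) = Class(P)."""
    _, label = min(prototypes, key=lambda pl: dist(x, pl[0]))
    return label

def threshold_rule(x, prototype, label, d_p, dist=euclidean):
    """If D(X, P) <= d_P then class C; None means the rule does not fire."""
    return label if dist(x, prototype) <= d_p else None

# Hypothetical prototypes as (vector, class label) pairs.
prototypes = [([0.0, 0.0], "A"), ([3.0, 3.0], "B")]

print(nearest_prototype_class([0.2, 0.1], prototypes))
print(threshold_rule([0.9, -0.9], [0.0, 0.0], "A", 1.0))
print(threshold_rule([0.9, -0.9], [0.0, 0.0], "A", 1.0, dist=chebyshev))
```

With the Chebyshev distance the threshold rule fires exactly when every feature lies within `d_p` of the prototype, which is the crisp rule "x1 in [p1 - d, p1 + d] and x2 in [p2 - d, p2 + d]". The same point does not satisfy the Euclidean threshold, because its Euclidean distance to the prototype exceeds 1.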

Advantages of prototype based rules

– Inspired by cognitive psychology: it may be easier to understand prototypes and similarity than fuzzy rules.
– P-rules may be defined for nominal features using probabilistic distance measures (such as VDM), while F-rules require numerical inputs.
– Many algorithms for prototype selection and optimization exist, but they have not been applied to understanding data, and their relation to fuzzy rules has not been explored.
– Applications of P-rules to real datasets give excellent results while generating a small number of prototypes.

Value Difference Metric (VDM)

VDM is a probability difference measure.

For one attribute:

$$d_{VDM}(x_j, r_j)^q = \sum_{i=1}^{K} \left| p(C_i \mid x_j) - p(C_i \mid r_j) \right|^q$$

For many attributes:

$$D_{VDM}(X, R)^q = \sum_{j=1}^{N} d_{VDM}(x_j, r_j)^q$$

Here $K$ is the number of classes and $N$ is the number of attributes.

The VDM measure can also be applied to continuous features, in the simplest way using discretization and interpolation, or using other probability estimation techniques (Gaussian smoothing, Parzen windows, etc.).
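A minimal sketch of the VDM computation for symbolic attributes follows. It assumes that the conditional probabilities p(C_i | x_j) are estimated by simple frequency counts on the training data; the helper names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def class_probabilities(values, labels):
    """Estimate p(C | value) for one symbolic attribute by frequency counts."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    return {
        v: [cnt[c] / sum(cnt.values()) for c in classes]
        for v, cnt in counts.items()
    }

def vdm_attribute(px, pr, q=2):
    """d_VDM(x_j, r_j)^q = sum_i |p(C_i|x_j) - p(C_i|r_j)|^q."""
    return sum(abs(a - b) ** q for a, b in zip(px, pr))

def vdm_distance(x, r, prob_tables, q=2):
    """D_VDM(X, R)^q = sum_j d_VDM(x_j, r_j)^q (returns the q-th power)."""
    return sum(
        vdm_attribute(table[xj], table[rj], q)
        for xj, rj, table in zip(x, r, prob_tables)
    )

# Toy training data: two symbolic attributes, two classes (illustrative only).
labels = ["+", "+", "-", "-"]
attr1 = ["a", "a", "c", "c"]   # perfectly separates the classes
attr2 = ["g", "t", "g", "t"]   # carries no class information
tables = [class_probabilities(attr1, labels), class_probabilities(attr2, labels)]

print(vdm_distance(["a", "g"], ["c", "t"], tables))
```

In this toy example only the first attribute contributes to the distance: the values of the second attribute have identical class distributions, so VDM treats them as the same value.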

P-rules → F-rules

Condition: preserve classification borders.

Q: How are membership functions and distance functions related? Can one obtain new, interesting membership functions from known distance functions and vice versa?

For all additive distance functions, the exponential transformation changes distances D of P-rules into products of membership functions (MF) of F-rules: MF = exp(-D).

Example: Euclidean distance is equivalent to Gaussian MFs.

$$D^2(X, P) = \sum_{i=1}^{N} W_i (X_i - P_i)^2$$

$$F = \exp\left(-D^2(X, P)\right) = \exp\left(-\sum_{i=1}^{N} W_i (X_i - P_i)^2\right) = \prod_{i=1}^{N} \exp\left(-W_i (X_i - P_i)^2\right)$$

$$\mu(X_i; P_i, W_i) = \exp\left(-W_i (X_i - P_i)^2\right), \qquad F(X; P, W) = \prod_{i=1}^{N} \mu(X_i; P_i, W_i)$$

The algebraic (product) T-norm is obtained with Gaussian MFs.
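The equivalence between the weighted Euclidean distance and a product of Gaussian MFs can be checked numerically. The sketch below uses arbitrary illustrative vectors and weights; the function names are hypothetical.

```python
import math

def weighted_sq_distance(x, p, w):
    """D^2(X, P) = sum_i W_i (X_i - P_i)^2."""
    return sum(wi * (xi - pi) ** 2 for xi, pi, wi in zip(x, p, w))

def gaussian_mf(xi, pi, wi):
    """Per-feature Gaussian membership function exp(-W_i (X_i - P_i)^2)."""
    return math.exp(-wi * (xi - pi) ** 2)

def product_of_mfs(x, p, w):
    """Product T-norm over the per-feature membership functions."""
    return math.prod(gaussian_mf(xi, pi, wi) for xi, pi, wi in zip(x, p, w))

# Arbitrary illustrative values.
x = [1.0, 2.0, 0.5]
p = [0.0, 1.5, 1.0]
w = [0.5, 2.0, 1.0]

lhs = math.exp(-weighted_sq_distance(x, p, w))  # exp transformation of the distance
rhs = product_of_mfs(x, p, w)                   # product of Gaussian MFs
print(math.isclose(lhs, rhs))
```

Because exp(-D) decreases monotonically with D, the prototype with the smallest distance is also the rule with the largest activation, so the decision borders are preserved.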

Visualization

[Figure: decision border, MF for attribute 1, and MF for attribute 2, shown for the Euclidean distance function and for the square of the Canberra distance function.]

VDM distance => membership functions

[Figure: decision border, MF for attribute 1, and MF for attribute 2, shown for the DVDM distance function and for the IVDM distance function.]

Inverse transformation

For all product T-norm systems:

$$D = -\ln(F)$$

Advantage: new types of distance functions are generated.

Example: distances generated from triangular functions, where $\sigma_i$ denotes the half-width of the triangular function for feature $i$.

$$F = \prod_{i=1}^{N} \begin{cases} 1 - (x_i - p_i)/\sigma_i & x_i \in [p_i;\ p_i + \sigma_i) \\ 1 - (p_i - x_i)/\sigma_i & x_i \in (p_i - \sigma_i;\ p_i) \\ 0 & \text{otherwise} \end{cases}$$

$$D = -\ln(F) = \sum_{i=1}^{N} \begin{cases} -\ln\left(1 - (x_i - p_i)/\sigma_i\right) & x_i \in [p_i;\ p_i + \sigma_i) \\ -\ln\left(1 - (p_i - x_i)/\sigma_i\right) & x_i \in (p_i - \sigma_i;\ p_i) \\ \infty & \text{otherwise} \end{cases}$$
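The distance generated from triangular membership functions can be sketched as follows. The parameter name `sigma` for the half-width of the triangle is an assumption made for this illustration, as are the function names and test values.

```python
import math

def triangular_mf(x, p, sigma):
    """Triangular MF centered at p with half-width sigma (assumed name)."""
    return max(0.0, 1.0 - abs(x - p) / sigma)

def triangular_distance(xs, ps, sigmas):
    """D = -ln(product of MFs) = sum of -ln(MF_i); infinite outside the support."""
    total = 0.0
    for x, p, s in zip(xs, ps, sigmas):
        mf = triangular_mf(x, p, s)
        if mf == 0.0:
            return math.inf  # any feature outside its triangle gives F = 0
        total += -math.log(mf)
    return total

print(triangular_distance([0.0, 1.0], [0.0, 1.0], [1.0, 2.0]))  # at the prototype
print(triangular_distance([0.5, 1.0], [0.0, 1.0], [1.0, 2.0]))  # inside the support
print(triangular_distance([1.5, 1.0], [0.0, 1.0], [1.0, 2.0]))  # outside the support
```

The resulting distance is zero at the prototype, grows without bound as a feature approaches the edge of its triangle, and is infinite outside the support, so a vector outside the support of any feature can never match that prototype.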

Applications to real data

1. Gene expression data for 2 types of leukaemia (Golub et al., Science 286 (1999) 531-537)

Description: 2 classes, 1100 features, 3 most relevant selected.
Used methods: 1 prototype/class LVQ, DVDM similarity measure.
Results (number of misclassified vectors):

Data set    Golub et al.    P-rules
Train       3               0
Test        5               3

2. Searching for promoters in DNA strings

Description: 2 classes, 57 features, all symbolic features.
Used methods: 9 prototypes for promoters, 12 for nonpromoters, generated using C-means + LVQ, with VDM similarity measure.
Results: 5 misclassified vectors in leave-one-out test.

Conclusions

A first step in understanding relations between fuzzy and similarity-based systems was made.

Prototype rules can be expressed using fuzzy rules and vice versa.

New possibilities in both fields:
– new types of membership functions;
– new types of distance functions.

The VDM measure used in P-rules leads to a natural shape of membership functions in fuzzy logic for symbolic data.

Expert knowledge can be captured in both types of rules, but sometimes it is easier to express as P-rules and sometimes as F-rules.

Many open problems remain.

Thank you for lending your ears ...

Speaker: Marcin Blachnik