Fuzzy-Rough Instance Selection

Richard Jensen (Aberystwyth University, UK) and Chris Cornelis (Ghent University, Belgium)

Page 1: Fuzzy-Rough Instance Selection

Richard Jensen and Chris Cornelis

Chris Cornelis, Ghent University, Belgium

Richard Jensen, Aberystwyth University, UK

Fuzzy-Rough Instance Selection

Page 2: Fuzzy-Rough Instance Selection

Outline

• The importance of instance selection

• Rough set theory

• Fuzzy-rough sets

• Fuzzy-rough instance selection

• Experimentation

• Conclusion

Page 3: Fuzzy-Rough Instance Selection

Instance selection

• Knowledge discovery

• The problem of too much data
  • Requires storage
  • Intractable for data mining algorithms

• Removing data that is noisy or irrelevant

Page 4: Fuzzy-Rough Instance Selection

Rough set theory

Rx is the set of all points that are indiscernible from point x

[Figure: Set A with its lower approximation and upper approximation, drawn over the equivalence classes Rx]
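The picture on this slide can be sketched in a few lines of Python. This is a minimal illustration of the classical definitions, not code from the talk: the lower approximation collects the equivalence classes that lie entirely inside A, and the upper approximation collects those that overlap A. The toy objects, the feature names, and the function names are invented for the example.

```python
from collections import defaultdict

def equivalence_classes(objects, features):
    """Group object ids that share the same values on the chosen features."""
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[f] for f in features)
        classes[key].add(obj_id)
    return list(classes.values())

def approximations(objects, features, target):
    """Return the (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for eq_class in equivalence_classes(objects, features):
        if eq_class <= target:   # class lies entirely inside A
            lower |= eq_class
        if eq_class & target:    # class overlaps A
            upper |= eq_class
    return lower, upper

# Toy data, invented for illustration only.
objects = {
    1: {"colour": "red", "size": "small"},
    2: {"colour": "red", "size": "small"},
    3: {"colour": "blue", "size": "large"},
    4: {"colour": "blue", "size": "small"},
}
A = {1, 3}
lower, upper = approximations(objects, ["colour", "size"], A)
print(lower, upper)
```

Objects 1 and 2 are indiscernible, so object 1 alone cannot place their class in the lower approximation of A; the class only appears in the upper approximation.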

Page 5: Fuzzy-Rough Instance Selection

Fuzzy-rough sets

• Approximate equality

• Handle real-valued features via fuzzy tolerance relations instead of crisp equivalence

• Better noise and uncertainty handling

• Focus has been on feature selection, not instance selection
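To make "approximate equality" concrete, here is a small sketch of one fuzzy tolerance relation for real-valued features. The specific form (one minus the range-normalised distance per feature, combined with the minimum t-norm) is a common choice in the fuzzy-rough literature and is an assumption here, not necessarily the relation used in the talk. The function names and sample values are hypothetical.

```python
def feature_similarity(x, y, value_range):
    """Similarity in [0, 1] of two values of one real-valued feature."""
    return max(0.0, 1.0 - abs(x - y) / value_range)

def fuzzy_tolerance(u, v, ranges):
    """Combine per-feature similarities with the minimum t-norm."""
    return min(feature_similarity(a, b, r) for a, b, r in zip(u, v, ranges))

# Two objects described by two real-valued features (values invented).
x = [5.0, 1.0]
y = [6.0, 1.5]
ranges = [10.0, 2.0]   # assumed value range of each feature

sim = fuzzy_tolerance(x, y, ranges)
print(sim)
```

Unlike a crisp equivalence relation, which would treat x and y as simply different, the relation returns a degree of similarity between 0 and 1.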

Page 6: Fuzzy-Rough Instance Selection

Fuzzy-rough sets

• Parameterized relation

• Fuzzy-rough definitions:
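The formulas from this slide are not present in the transcript. As a hedged reminder, the block below gives definitions that are commonly used in the fuzzy-rough literature; they are an assumption and may differ in detail from what the slide showed. Here $\alpha$ is the relation parameter, $l(a)$ is the range of feature $a$, $\mathcal{T}$ is a t-norm, and $\mathcal{I}$ is a fuzzy implicator.

```latex
% Parameterized fuzzy tolerance relation for a single feature a
R_a^{\alpha}(x, y) = \max\!\left(0,\; 1 - \alpha \,\frac{|a(x) - a(y)|}{l(a)}\right)

% Fuzzy-rough lower and upper approximations of a fuzzy set A under R
(R \downarrow A)(y) = \inf_{x \in X} \mathcal{I}\big(R(x, y),\, A(x)\big)
\qquad
(R \uparrow A)(y) = \sup_{x \in X} \mathcal{T}\big(R(x, y),\, A(x)\big)
```

Under this form, a larger $\alpha$ makes the relation stricter, so fewer pairs of objects count as approximately equal.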

Page 7: Fuzzy-Rough Instance Selection

Instance selection: basic idea

Remove objects while keeping the underlying approximations unchanged

[Figure: objects marked "Not needed"]

Page 8: Fuzzy-Rough Instance Selection

Instance selection: basic idea

Remove objects while keeping the underlying approximations unchanged
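One way to make the basic idea concrete is a simple threshold rule, sketched below. It is an illustrative sketch only and is not claimed to be FRIS-I, FRIS-II, or FRIS-III, whose details are not in this transcript. Each object gets its membership degree in the fuzzy-rough lower approximation of its own class, and objects below a threshold are dropped. The Łukasiewicz implicator, the similarity function, the threshold name `tau`, and the toy data are all assumptions made for the example.

```python
def similarity(u, v, ranges, alpha=1.0):
    """Fuzzy tolerance: min over features of max(0, 1 - alpha*|a-b|/range)."""
    return min(max(0.0, 1.0 - alpha * abs(a - b) / r)
               for a, b, r in zip(u, v, ranges))

def positive_region(data, labels, ranges, alpha=1.0):
    """Membership of each object in the lower approximation of its own class.

    With crisp classes and the Lukasiewicz implicator, I(R, 1) = 1 and
    I(R, 0) = 1 - R, so only objects of other classes lower the degree.
    """
    pos = []
    for i, x in enumerate(data):
        degrees = [1.0 - similarity(x, y, ranges, alpha)
                   for j, y in enumerate(data) if labels[j] != labels[i]]
        pos.append(min(degrees, default=1.0))
    return pos

def select_instances(data, labels, ranges, tau=0.5, alpha=1.0):
    """Keep the indices of objects whose membership is at least tau."""
    pos = positive_region(data, labels, ranges, alpha)
    return [i for i, p in enumerate(pos) if p >= tau]

# Toy one-feature data, invented for illustration.
data = [[0.0], [0.1], [0.45], [0.9], [1.0]]
labels = ["a", "a", "a", "b", "b"]
ranges = [1.0]

kept = select_instances(data, labels, ranges, tau=0.5)
print(kept)
```

In this toy data the objects at 0.45 and 0.9 lie closest to the opposite class, so their membership falls below the threshold and they are removed, while the remaining objects are kept.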

Page 9: Fuzzy-Rough Instance Selection

FRIS-I

Page 10: Fuzzy-Rough Instance Selection

FRIS-II

Page 11: Fuzzy-Rough Instance Selection

FRIS-III

Page 12: Fuzzy-Rough Instance Selection

Experimentation: setup

Page 13: Fuzzy-Rough Instance Selection

Results: FRIS-I (heart)

• (214 objects, 9 features)

Page 14: Fuzzy-Rough Instance Selection

Results: FRIS-II (heart)

Page 15: Fuzzy-Rough Instance Selection

Results: FRIS-III (heart)

Page 16: Fuzzy-Rough Instance Selection

Conclusion

• Proposed new techniques for instance selection based on fuzzy-rough sets

• Managed to reduce the number of instances significantly while retaining classification accuracy

• Future work
  • Many possibilities for novel fuzzy-rough instance selection methods
  • Comparisons with non-rough techniques
  • Improving the complexity of FRIS-III
  • Combined instance/feature selection

Page 17: Fuzzy-Rough Instance Selection

• WEKA implementations of all fuzzy-rough methods can be downloaded from:

http://users.aber.ac.uk/rkj/book/weka.zip