new challenges in learning classifier systems: mining rarities and evolving fuzzy rules
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules
Student: Albert Orriols-Puig
Supervisor: Ester Bernadó-Mansilla
Grup de Recerca en Sistemes Intel·ligents
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
Background
GRSI has been researching machine learning and data mining, with a special focus on data classification
Research aims at
  Improving learning methods
  Applying learning methods to real-world applications
The application of LCSs to classification problems is one of the main research lines
  LCSs are appealing because they mine streams of examples
  Many applications make their data available in streams
Important challenges need to be addressed to deal with complex applications
Slide 2 Grup de Recerca en Sistemes Intel·ligents New Challenges in LCS
Background
General schema of LCSs, introduced by Holland
[Figure: the environment sends the sensorial state and the feedback to the learning classifier system, which returns an action]
Components of a learning classifier system:
  Classifiers 1 to n, in any representation: production rules, genetic programs, perceptrons, SVMs
  Online rule evaluator (apportionment of credit algorithms). XCS: Q-Learning (Sutton & Barto, 1998); uses the Widrow-Hoff delta rule
  Rule evolution (evolutionary algorithm): typically, a GA (Holland, 75; Goldberg, 89) applied to the population
Slide 3
When this Work Started
In 2004, when Michigan-style LCSs were reaching maturity
  First successful implementations (Wilson, 95; Wilson, 98)
  Many other derivations: YCS, UCS, XCSF, and many others
  Applications in important domains
    Data mining (Bernadó et al., 02; Wilson, 02a; Bacardit & Butz, 04)
    Function approximation (Wilson, 02b)
    Reinforcement learning (Lanzi, 02)
  Theoretical analyses for design (Butz et al., 02, 03, 04b)
But still, there are important challenges to face
Slide 4
Two Key Challenges in ML and LCSs
1st challenge: Learning from domains that contain rare classes
Data classification: extract interesting, useful, and hidden patterns
The most interesting knowledge resides in rare classes
  Example: fraud detection in credit card transactions
Can learners model rare classes accurately? Maybe not!
[Figure: Dataset → Learner → Knowledge model; the learner minimizes the learning error and maximizes generalization]
What about online learning?
  More challenging: model rare classes on the fly
Aim: Analyze and improve LCSs for mining domains with rarities
Slide 5
Two Key Challenges in ML and LCSs
2nd challenge: Building more understandable models and bringing reasoning mechanisms closer to human ones
In some domains, interpretability is more important than accuracy
LCSs most often use interval-based rules in domains described by continuous variables
  Variables are "semantic-free"
  Analyses of the inference mechanisms are scarce
Fuzzy logic provides a robust framework for knowledge representation and reasoning under uncertainty
  Some fuzzy LCS approaches already exist
  But no online fuzzy LCS for supervised learning has been designed
Aim: Incorporate fuzzy logic into LCSs for supervised learning
Slide 6
Goal of this Work
General goal: address the two challenges with
  The extended classifier system (XCS) (Wilson, 95, 98)
    By far, the most influential Michigan-style LCS
  The supervised classifier system (UCS) (Bernadó-Mansilla, 03)
    Inherits XCS's architecture and specializes it for data classification
Two challenges with two LCSs that lead to four objectives
Challenges and objectives:
  XCS and UCS
    1. Revise and update UCS and compare it with XCS
  LCS and rare classes
    2. Analyze and improve LCSs for mining rarities
    3. Apply LCSs for extracting models from real-world classification problems with rarities
  Fuzzy logic in LCS
    4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
Slide 7
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 8
Description of XCS
In training mode for single-step tasks (Wilson, 95)
Designed for reinforcement learning
[Figure: XCS training cycle. The environment provides a problem instance; match set generation selects from the population [P] the classifiers that form the match set [M]; an action is selected randomly and defines the action set [A]; the environment returns a reward; the classifier parameters are updated (Widrow-Hoff rule); the genetic algorithm applies selection, reproduction, and mutation, followed by deletion]
Classifier parameters (columns in the figure): C A P ε F num as ts exp
  Error: error of the predicted payoff
  Fitness: computed as a function of the error
Fitness sharing
Competition in the niche
Slide 9
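The training cycle on this slide can be illustrated with a minimal sketch. It assumes ternary conditions over bit strings ('#' as a wildcard) and classifiers stored as dictionaries. The function names, the dictionary keys, the learning rate `beta=0.2`, and the order of the two updates are illustrative assumptions, not details taken from the slides. Fitness computation, the GA, and deletion are left out.

```python
def matches(condition, instance):
    """A ternary condition matches when every non-'#' symbol equals the input bit."""
    return all(c == "#" or c == x for c, x in zip(condition, instance))


def widrow_hoff(estimate, target, beta):
    """Move the estimate a fraction beta of the way toward the target."""
    return estimate + beta * (target - estimate)


def xcs_step(population, instance, action, reward, beta=0.2):
    """One training step: build [M] and [A], then update the classifiers in [A]."""
    # Match set [M]: classifiers whose condition matches the problem instance.
    match_set = [cl for cl in population if matches(cl["condition"], instance)]
    # Action set [A]: classifiers of [M] that advocate the selected action.
    action_set = [cl for cl in match_set if cl["action"] == action]
    for cl in action_set:
        # Assumed order: error first (against the old prediction), then prediction.
        cl["error"] = widrow_hoff(cl["error"], abs(reward - cl["prediction"]), beta)
        cl["prediction"] = widrow_hoff(cl["prediction"], reward, beta)
        cl["exp"] += 1
    return match_set, action_set
```

Because each update moves only a fraction `beta` toward the latest reward, the estimates track a recency-weighted average instead of an exact one.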
Description of UCS
In training mode (Bernadó-Mansilla & Garrell, 03)
[Figure: UCS training cycle. The environment provides a stream of examples, each a problem instance plus its output class; match set generation selects from the population [P] the classifiers that form the match set [M]; correct set generation builds the correct set [C]; the classifier parameters are updated as the average of the parameter values; the genetic algorithm applies selection, reproduction, and mutation, followed by deletion]
Classifier parameters (columns in the figure): C A acc F num cs ts exp
Competition in the niche
Key differences with respect to XCS
  Accuracy computation as average of correct predictions
  Exploration of the "correct class" instead of all classes
  No fitness sharing
Slide 10
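The first two differences listed above can be sketched as follows. The sketch assumes ternary conditions over bit strings and classifiers stored as dictionaries; the names `ucs_step`, `correct`, `exp`, and `acc` are illustrative, not taken from the slides.

```python
def matches(condition, instance):
    """A ternary condition matches when every non-'#' symbol equals the input bit."""
    return all(c == "#" or c == x for c, x in zip(condition, instance))


def ucs_step(population, instance, label):
    """One supervised step: build [M] and [C] and update accuracy as a pure average."""
    match_set = [cl for cl in population if matches(cl["condition"], instance)]
    for cl in match_set:
        cl["exp"] += 1
        if cl["action"] == label:
            cl["correct"] += 1
        # Accuracy is the proportion of correct predictions, an exact average.
        cl["acc"] = cl["correct"] / cl["exp"]
    # Correct set [C]: classifiers of [M] that predict the class of the example.
    correct_set = [cl for cl in match_set if cl["action"] == label]
    return match_set, correct_set
```

Since the class label arrives with every example, only the correct set is explored, and no reward signal or learning rate is needed for the accuracy estimate.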
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 11
Fitness Sharing in UCS
Sharing or not sharing, a key difference between XCS and UCS
Goal
  Design a fitness sharing scheme
  Empirically test whether fitness sharing is beneficial to UCS
  Empirically compare XCS with UCS
Incorporate a fitness sharing scheme into UCS
  Take inspiration from XCS
  [Equations; annotated terms: classifier accuracy, relative accuracy, classifier numerosity, learning rate]
  And finally, fitness is shared in [M]
Slide 12
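The equations of this slide did not survive extraction, so the following is only a hedged sketch of one way an XCS-inspired sharing scheme could combine the annotated terms. The thresholded power function for the accuracy term and the parameter names and values (`acc0`, `alpha`, `nu`, `beta`) are assumptions borrowed from XCS conventions, not formulas recovered from the slides.

```python
def shared_fitness_update(match_set, label, acc0=0.99, alpha=0.1, nu=10, beta=0.2):
    """Update fitness so that the classifiers in [M] share a fixed amount of credit."""
    k = []
    for cl in match_set:
        if cl["action"] != label:
            k.append(0.0)  # classifiers outside the correct set receive no credit
        elif cl["acc"] > acc0:
            k.append(1.0)  # accurate classifiers receive full credit
        else:
            k.append(alpha * (cl["acc"] / acc0) ** nu)  # inaccurate ones are penalized
    # Relative accuracy: each classifier's share, weighted by its numerosity.
    total = sum(ki * cl["num"] for ki, cl in zip(k, match_set))
    for ki, cl in zip(k, match_set):
        relative = ki * cl["num"] / total if total > 0 else 0.0
        # The learning rate beta controls how fast fitness follows the share.
        cl["fitness"] += beta * (relative - cl["fitness"])
```

Under this sketch the shares within one match set sum to one, so an over-general classifier competes for credit with the accurate classifiers that match the same examples.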
Methodology of Analysis
Analysis divided into two comparisons
1. Compare UCS without fitness sharing (UCSns) and with fitness sharing (UCSs)
2. Compare UCSs with XCS
Comparison on four boundedly difficult problems that permit moving the complexity along four dimensions: number of classes, size of the building block, class imbalance, and proportion of noise
  The parity problem (par)
  The decoder problem (dec)
  The position problem (pos)
  The 20-bit multiplexer with alternating noise (mux-an)
Slide 13
Does Fitness Sharing Benefit UCS?
Fitness sharing provides the following benefits:
  Higher pressure toward deletion of over-general classifiers
  Higher selective pressure toward the fittest classifiers in [C]
  Better results in the four problems: par, dec, pos, and mux-an
[Figure: UCSns vs. UCSs in the decoder problem]
Slide 14
Comparison of UCS with XCS
Advantages of UCS due to
The exploration regime
  XCS explores all the classes while UCS explores only the "correct" class
The accuracy guidance
  XCS may provide misleading guidance toward the fittest classifiers, identified as the fitness dilemma (Butz et al., 2003)
  UCS solves this problem by computing accuracy as the proportion of correct predictions
[Figure: UCSs vs. XCS in the decoder problem]
Slide 15
Summary of the Comparison
The empirical study has shown that
UCS benefits from a fitness sharing scheme. Therefore, we use UCSs in the remainder of this work
Key differences between XCS and UCS reviewed and experimentally analyzed
  Explore regime
  Accuracy guidance
  Population size
XCS is a more general architecture and can solve reinforcement learning problems
Slide 16
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 17
Motivation
So, do rare classes pose a challenge to XCS?
Test on the unbalanced 11-bit multiplexer
$\mathrm{IR} = \dfrac{\text{number of examples of the majority class}}{\text{number of examples of the minority class}}$
[Figure: %[O] with XCS]
Slide 18
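As a small worked example of the imbalance ratio, the sketch below enumerates the 11-bit multiplexer (3 address bits, 8 data bits), thins one class, and computes IR from the class counts. The deterministic thinning (keeping every fourth instance of class 1) is only an illustrative assumption; the slide does not state how the unbalanced version was generated.

```python
from collections import Counter
from itertools import product


def multiplexer(bits, address_bits=3):
    """Return the data bit selected by the address encoded in the leading bits."""
    address = int("".join(str(b) for b in bits[:address_bits]), 2)
    return bits[address_bits + address]


def imbalance_ratio(labels):
    """IR = examples of the majority class / examples of the minority class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


instances = list(product([0, 1], repeat=11))
labels = [multiplexer(x) for x in instances]
balanced_ir = imbalance_ratio(labels)

# Assumed thinning scheme: keep every fourth instance of class 1.
majority = [x for x, y in zip(instances, labels) if y == 0]
minority = [x for x, y in zip(instances, labels) if y == 1][::4]
thinned_ir = imbalance_ratio([0] * len(majority) + [1] * len(minority))
```

The complete problem has 1024 instances of each class, so `balanced_ir` is 1.0, and keeping 256 instances of class 1 gives `thinned_ir` equal to 4.0.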
Design Decomposition
Aim
  Analyze the challenges that rare classes pose to XCS
  Improve XCS in problems with rare classes
The design decomposition approach (Goldberg, 02) proposes to
  Decompose the problem into critical elements
  Derive "little" models, or facetwise models, for each element, assuming that the others behave in an ideal manner
  Integrate all the models (patchquilt integration)
Slide 19
Focusing the Problem
How should XCS partition the problem solution?
[Figure labels: nourished niche; small disjunct or starved niche; again, more small disjuncts; over-general classifier]
Slide 20
Critical Elements of LCS
Five critical elements to detect small niches were identified:
1. Estimate the classifier parameters correctly
2. Analyze whether representatives of starved niches can be provided in initialization
3. Ensure the generation and growth of representatives of starved niches
4. Adjust the GA application rate
5. Ensure that representatives of starved niches will take over their niches
Derivations studied according to the imbalance ratio (ir)
Slide 21
1. Estimate Classifier Parameters
Derive the maximum imbalance ratio
The error of over-general classifiers is: [Equation]
However, empirical results did not agree with the theory
  Error of the most over-general classifier tracked over time
[Figure, ir = 100: deviation between the theoretical and the empirical error]
Over-general classifiers may be considered accurate
Slide 22
1. Estimate Classifier Parameters
We proposed two alternatives to obtain better estimates
1. Tune the learning rate of the Widrow-Hoff rule according to ir
2. Apply gradient descent methods (Butz et al., 2005)
[Figures: estimates compared with the theoretical value; panels labeled ir = 100 and ir = 10000]
Slide 23
2. Provide Representatives in Initialization
Can covering provide schemas of classifiers of starved niches?
Probability of activating covering in the first minority class instance: [Equation; annotated terms: specificity of [P], imbalance ratio, length of the classifier]
For large values of ir, covering will not provide schemas of the minority class
We continue the analysis assuming a covering failure
Slide 24
3. Ensure Growth of Representatives
How to size the population to ensure that representatives of starved niches will be supplied?
Assumptions:
  Crossover is not considered. Only mutation (probability of mutation μ)
  Random deletion
  A GA is applied to [A] every time [A] is activated
The time to create a representative of a starved niche is: [Equation]
Time to receive a genetic event: [Equation]
Mixing all together: population size bound to ensure reproductive opportunity: [Equation; annotated terms: number of classes, imbalance ratio]
Slide 25
3. Ensure Growth of Representatives
Theory matches empirical results (parity problem)
  Imbalanced parity problem with building block length from 1 to 4
  Unbalanced by removing instances of one of the classes
Theory also matches when the assumptions of the model are not met
[Figure panels: Widrow-Hoff rule; all assumptions satisfied]
Slide 26
4. Adjust GA Application Rate
Assumption in the previous model
  A GA is applied to [A] every time [A] is activated
What is the effect of varying θGA?
  To guarantee that all niches receive approximately the same number of genetic events: [Equation]
  If satisfied, all niches receive the same number of genetic opportunities
Hence, the time of deletion increases linearly with ir and the population size remains constant
Slide 27
5. Ensure Takeover of Representatives
The previous facets set the conditions to ensure that
  1. Representatives of starved niches are created
  2. Representatives of starved niches receive a genetic event
But still, to ensure full convergence we need that
  Representatives of starved niches take over their niche
  These representatives will not be extinguished
Study the takeover time of representatives, which depends on
  Initial stock of classifiers in the niche
  Type of selection
    Proportionate selection (Wilson, 95)
    Tournament selection (Butz et al., 2005c)
Slide 28
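The two selection schemes named above can be contrasted with a generic sketch. It is a textbook formulation, not the exact operators analyzed in the thesis. The dictionary key `fitness`, the explicit `rng` argument, and an absolute tournament `size` are illustrative assumptions.

```python
import random


def proportionate_selection(classifiers, rng):
    """Roulette wheel: the chance of selection is proportional to fitness."""
    total = sum(cl["fitness"] for cl in classifiers)
    pick = rng.uniform(0, total)
    running = 0.0
    for cl in classifiers:
        running += cl["fitness"]
        if pick <= running:
            return cl
    return classifiers[-1]  # guard against floating-point round-off


def tournament_selection(classifiers, rng, size):
    """Sample `size` contestants without replacement and return the fittest."""
    contestants = rng.sample(classifiers, size)
    return max(contestants, key=lambda cl: cl["fitness"])


niche = [{"id": "best", "fitness": 1.0}, {"id": "over-general", "fitness": 0.0}]
rng = random.Random(0)
winner_proportionate = proportionate_selection(niche, rng)
winner_tournament = tournament_selection(niche, rng, size=2)
```

Proportionate selection depends on the fitness values themselves, whereas tournament selection depends only on the fitness ranking and the tournament size, which is in line with the dependence on the type of selection noted on the slide.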
5. Ensure Takeover of Representatives
Takeover time for proportionate selection: [Equation; annotated terms: population size, number of niches, initial proportion of classifiers, final proportion of classifiers, ratio of the accuracy of the over-general classifier to the accuracy of the best representative]
Condition for niche extinction: [Equation; annotated term: maximum acceptable error predicted by the niche extinction model]
Slide 29
5. Ensure Takeover of Representatives
Takeover time for tournament selection: [Equation; annotated terms: population size, initial proportion of classifiers, final proportion of classifiers, tournament size]
Condition for niche extinction: [Equation; annotated terms: number of classifiers in the niche, number of representatives in the niche predicted by the niche extinction model]
Key differences with respect to proportionate selection:
  Independent of the fitness of the best and the over-general classifier
  Highly dependent on the tournament size
Slide 30
Patchquilt Integration
Will XCS learn rare classes? Lessons learned from the models
1. Parameters need to be correctly estimated
  Widrow-Hoff rule with auto-adjusted β
  Gradient descent methods
2. Representatives need to be created and evolved
  Covering may fail if ir is large
  The challenge can be met by
    Sizing the population according to the imbalance ratio
    Setting θGA according to the imbalance ratio
3. Niche extinction models set the conditions under which XCS will fail
  Indicate how parameters should be tuned to satisfy the model
  Takeover time models to predict the time to convergence
Slide 31
Why Is this Analysis Important?
The lessons enable us to solve problems that previously eluded solution
Unbalanced 11-bit multiplexer problem
[Figures: %[O] with XCS before the analysis and after the analysis]
Before, we could solve up to ir=32
Now we can solve up to ir=1024 and more
Slide 32
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 33
Reviewing the Critical Elements
1. Estimate the classifier parameters correctly
  Pure averages! We get the exact value
2. Analyze whether representatives of starved niches can be provided in initialization
  Covering applied if the correct set is empty
  If no mutation, covering will always be applied to the first minority class instances
  Suppose the worst case: no provision
  We derive maximum bounds
Slide 34
Reviewing the Critical Elements
3. Ensure the generation and growth of representatives of starved niches
  [Figure panels, plotted against the imbalance ratio: default configuration; all assumptions satisfied]
4. Adjust the GA application rate
  XCS's model is still valid
5. Ensure that representatives of starved niches will take over their niches
  XCS's takeover time models are still valid
Slide 35
Patchquilt Integration
The lessons enable us to solve problems that previously eluded solution
Results following the guidelines provided by the lessons
[Figure: %[O] with UCS]
Slide 36
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 37
Motivation
From boundedly difficult problems to real-world problems (RWP)
RWP contain continuous attributes → interval-based rules
  IF x1 in [l1, u1] and x2 in [l2, u2] and … and xn in [ln, un] THEN classi
Key difference: problem characteristics are not known
  Gap between theory and application to RWP
  How can we apply the recommendations extracted from the analysis?
Aim
  1. Start bridging the gap between theory and practice
  2. Confirm that both LCSs are valuable for mining domains with rarities
Slide 38
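The interval-based rule format on this slide can be read as the following matching test. This is a minimal sketch in which a rule is a list of (lower, upper) bounds, one per attribute, plus a class label; the dictionary layout, the closed intervals, and the example values are illustrative assumptions.

```python
def matches(rule, x):
    """True when every attribute value lies inside the rule's interval for it."""
    return all(lower <= xi <= upper for (lower, upper), xi in zip(rule["intervals"], x))


# Hypothetical rule: IF x1 in [0.0, 0.5] and x2 in [0.2, 0.9] THEN class 1
rule = {"intervals": [(0.0, 0.5), (0.2, 0.9)], "cls": 1}
inside = matches(rule, (0.3, 0.4))
outside = matches(rule, (0.7, 0.4))
```

The bounds carry no linguistic meaning of their own, which is the sense in which the slides later call such rules "semantic-free".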
What is Different in RWP
Imbalance ratio vs. niche imbalance ratio?
  In boundedly difficult problems, IR equaled the niche imbalance ratio
  In RWP, this assumption may not hold
[Figure: same imbalance ratio, different niche imbalance ratio]
Niche imbalance ratio (NIR) in RWP depends on:
  IR
  Geometrical distribution of the examples
  Knowledge representation
Slide 39
Self-Adaptation to Unknown Domains
Heuristic to estimate the niche imbalance ratio
  Take the strongest over-general classifier
  Assume NIR is the imbalance ratio of the over-general classifier
  Tune parameters according to NIR and the recommendations extracted from the facetwise analysis
Empirical test on the 11-bit multiplexer problem
[Figures: %[O] with XCS; %[B] with UCS]
Slide 40
LCS in RWP
Comparison methodology
Comparison with:
  C4.5 (Quinlan, 95)
  SMO (Platt, 98)
  IBk (Aha et al., 91)
  Configured to maximize performance
Selection of 25 imbalanced real-world problems with different characteristics
10-fold cross validation
Performance measure: TP rate · TN rate
Statistical tests:
  Friedman's test (Friedman, 37, 40)
  Nemenyi test (Nemenyi, 63)
  Wilcoxon signed-ranks test (Wilcoxon, 45)

Id.     Data set           #Ins.  #At.  ir
bald1   balance disc. 1      625     4  11.76
bald2   balance disc. 2      625     4   1.17
bald3   balance disc. 3      625     4   1.17
bpa     bupa                 345     6   1.38
glsd1   glass disc. 1        214     9  22.75
glsd2   glass disc. 2        214     9  15.47
glsd3   glass disc. 3        214     9  11.59
glsd4   glass disc. 4        214     9   6.38
glsd5   glass disc. 5        214     9   2.06
glsd6   glass disc. 6        214     9   1.82
h-s     heart-disease        270    13   1.25
pim     pima-indian          768     8   1.87
tao     tao-grid            1888     2   1.00
thyd1   thyroid disc. 1      215     5   6.17
thyd2   thyroid disc. 2      215     5   5.14
thyd3   thyroid disc. 3      215     5   2.31
wavd1   waveform disc. 1    5000    40   2.02
wavd2   waveform disc. 2    5000    40   1.96
wavd3   waveform disc. 3    5000    40   2.02
wbcd    Wis. B. cancer       699     9   1.90
wdbc    Wis. diag.           569    30   1.68
wined1  wine disc. 1         178    13   2.71
wined2  wine disc. 2         178    13   2.02
wined3  wine disc. 3         178    13   1.51
wpbc    Wis. prognostic      198    33   3.21
Slide 41
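The performance measure used in this comparison, the product of the true positive rate and the true negative rate, can be computed as in the sketch below. The function name and the convention of returning 0.0 for a class with no examples are illustrative assumptions.

```python
def tp_tn_product(y_true, y_pred, positive):
    """Product of the TP rate (on `positive`) and the TN rate (on the other class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    tn_rate = tn / (tn + fp) if tn + fp else 0.0
    return tp_rate * tn_rate


y_true = [1, 1, 0, 0, 0, 0]
score = tp_tn_product(y_true, [1, 0, 0, 0, 0, 1], positive=1)
majority_only = tp_tn_product(y_true, [0, 0, 0, 0, 0, 0], positive=1)
```

In this toy case a model that always predicts the majority class gets `majority_only == 0.0` even though four of its six predictions are correct, which is why the product suits imbalanced problems better than plain accuracy.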
Summary of the Results
[Table: TP rate · TN rate]
XCS and UCS perform the best on average for the tested problems
  However, no significant differences according to Friedman's test
Pairwise analysis enables the extraction of further observations
XCS and UCS fail to create accurate models in problems such as bald2, bald3, and tao, which have a low imbalance ratio
  They have difficulties learning from domains with curved boundaries
  Other complexities in addition to class imbalance
Slide 42
Discussion
When an ML practitioner has a new problem
  Which learner should she or he apply?
The empirical analysis indicated that
  She or he should bet on LCSs
  But there are no guarantees of being the best performer on a particular problem
What is missing?
  Evaluate problem complexity
  Link problem complexity with the domain of competence of LCSs
How?
  Complexity metrics are a good starting point (Ho & Basu, 02) to bridge the gap between theory and practice
Slide 43
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 44
Motivation
Competent data classification techniques should be able to
  Evolve accurate models
  in some legible structure
LCSs are very appealing since they evolve highly accurate models online
However:
  They tend to evolve a large number of semantic-free interval-based rules
  They use reasoning mechanisms that can be unintuitive
(Bernadó et al., 02)
Slide 45
Design of Fuzzy-UCS
Linguistic fuzzy representation
  Rule: IF x1 is A1 and x2 is A2 … and xn is An THEN class1
Disjunction of linguistic fuzzy terms
  Example: IF x1 is small and x2 is medium or large THEN class1
In our experiments, all variables shared the same semantics, which were defined by triangular membership functions
[Figure: triangular membership functions for small, medium, and large]
Classifier parameters were changed to let them deal with fuzzy matching
Slide 46
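The representation on this slide can be made concrete with a sketch of fuzzy matching for the example rule. Attributes normalized to [0, 1], the positions of the three triangles, the product for "and", and the bounded sum for "or" are all assumptions chosen for illustration; the slide fixes only the triangular shape and the three linguistic terms.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed semantics shared by all variables, with attributes normalized to [0, 1].
TERMS = {
    "small": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "large": (0.5, 1.0, 1.5),
}


def matching_degree(rule, x):
    """Degree to which example x matches a rule given as one tuple of terms per variable."""
    degree = 1.0
    for terms, xi in zip(rule, x):
        # "or" among the terms of one variable: bounded sum (assumed T-conorm).
        membership = min(1.0, sum(triangular(xi, *TERMS[t]) for t in terms))
        # "and" across variables: product (assumed T-norm).
        degree *= membership
    return degree


# IF x1 is small and x2 is medium or large THEN class1
rule = [("small",), ("medium", "large")]
degree = matching_degree(rule, (0.25, 0.75))
```

Unlike an interval rule, which either matches or not, this rule matches the example to degree 0.5, so the classifier parameters have to be updated in proportion to the matching degree.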
Design of Fuzzy-UCS
Three procedures designed to infer the class of test examples, which result in a tradeoff between interpretability and accuracy
  Weighted average (wavg): based on average voting. All rules considered.
  Action winner (awin): best rule decides the class. Only best matching rules considered.
  Most numerous and fittest rules (nfit): based on average voting. Only most numerous rules considered.
Size of the rule set: from largest (+) with wavg to smallest (-) with nfit
Slide 47
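The difference between the first two inference procedures can be sketched as follows. Each rule is assumed to carry its matching degree for the test example (`match`) and a fitness value, and a rule's vote is assumed to be the product of the two. These names and the exact vote weighting are illustrative assumptions, and nfit is not sketched.

```python
from collections import defaultdict


def infer_wavg(rules):
    """Weighted average: every rule votes for its class; the largest total wins."""
    votes = defaultdict(float)
    for r in rules:
        votes[r["cls"]] += r["match"] * r["fitness"]
    return max(votes, key=votes.get)


def infer_awin(rules):
    """Action winner: the single rule with the strongest vote decides the class."""
    best = max(rules, key=lambda r: r["match"] * r["fitness"])
    return best["cls"]


rules = [
    {"cls": "A", "match": 0.9, "fitness": 0.9},
    {"cls": "B", "match": 0.5, "fitness": 0.5},
    {"cls": "B", "match": 0.6, "fitness": 0.6},
    {"cls": "B", "match": 0.5, "fitness": 0.6},
]
by_wavg = infer_wavg(rules)
by_awin = infer_awin(rules)
```

Here the three weaker rules for class B outvote the single strong rule for class A under wavg, while awin follows the strongest rule alone. This is why awin only needs the best rules and yields a smaller rule set.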
Methodology of Analysis
Comparison methodology
Two comparisons
  Fuzzy learners
  Non-fuzzy learners
Selection of 20 real-world problems
10-fold cross validation
Metrics
  Test accuracy
  Number of rules of the models
Statistical tests:
  Friedman's test (Friedman, 37, 40)
  Nemenyi test (Nemenyi, 63)
  Bonferroni-Dunn test (Dunn, 61)
  Wilcoxon signed-ranks test (Wilcoxon, 45)

Id    Data set             #Ins  #At  #Cl  %Min  %Maj  %MI
ann   Annealing             898   38    5   0.9  76.2   0.0
aut   Automobile            205   25    6   1.5  32.7  22.4
bal   Balance               625    4    3   7.8  46.1   0.0
bpa   Bupa                  345    6    2  42.0  58.0   0.0
cmc   Contrac. choice      1473    9    3  22.6  42.7   0.0
col   Horse colic           368   22    2  37.0  63.0  98.1
gls   Glass                 214    9    6   4.2  35.5   0.0
h-c   Heart-c               303   13    2  45.5  54.5   2.3
h-s   Heart-s               270   13    2  44.4  56.6   0.0
irs   Iris                  150    4    3  33.3  33.3   0.0
pim   Pima                  768    8    2  34.9  65.1   0.0
son   Sonar                 208   60    2  46.7  53.3   0.0
tao   Tao                  1888    2    2  50.0  50.0   0.0
thy   Thyroid               215    5    3  14.0  60.0   0.0
veh   Vehicle               846   18    4  23.5  25.8   0.0
wbcd  Wisc. breast-cancer   699    9    2  34.5  65.5   2.3
wdbc  Wisc. Diagnosis       569   30    2  37.3  62.7   0.0
wne   Wine                  178   13    3  27.0  39.9   0.0
wpbc  Wisc. Prognostic      198   33    2  23.7  76.3   2.0
zoo   Zoo                   101   17    7   4.0  40.6   0.0
Slide 48
Comparison with the Fuzzy Learners
1. Fuzzy GP (GP) (Sánchez et al., 01)
2. Fuzzy GAP (GAP) (Sánchez & Couso, 00)
3. Fuzzy SAP (SAP) (Sánchez et al., 01)
4. Fuzzy AdaBoost (AB) (del Jesus et al., 04)
5. Fuzzy LogitBoost (LB) (Otero & Sánchez, 06)
6. Fuzzy MaxLogitBoost (MLB) (Otero & Sánchez, 07)
All methods run using KEEL (Alcalá-Fdez et al., 08)
[Figure: accuracy against interpretability (from - to +). Labels: Fuzzy-UCS wavg (1000's of rules); Fuzzy-UCS awin (< 100 rules); Fuzzy-UCS nfit (> 10 rules); Fuzzy GAP, Fuzzy SAP; Fuzzy AdaBoost; Fuzzy GP, Fuzzy MLB; Fuzzy LogitBoost]
Slide 49
Comparison with Non-Fuzzy Learners
1. C4.5 (Quinlan, 95)
2. IBk (Aha et al., 91)
3. Naïve Bayes (NB) (John & Langley, 95)
4. Part (Frank & Witten, 98)
5. SMO (Platt, 98)
6. GAssist (Bacardit, 04)
7. UCS (Bernadó & Garrell, 03)
[Figure: accuracy against interpretability (from - to +). Labels: Fuzzy-UCS wavg; Fuzzy-UCS awin; Fuzzy-UCS nfit; GAssist; Naïve Bayes; C4.5; Part; SMO; IBk; UCS]
Slide 50
Mining Large Volumes of Data
The last experiment
Fuzzy-UCS to extract models from the 1999 KDD Cup intrusion detection mechanism data set
  494,022 examples with 41 features
Slide 51
Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
Slide 52
Conclusions and Further Work
This work contributed to
  Increasing the comprehension of how LCSs work
  Improving them to deal with problems that contain rare classes
  Providing new implementations of LCSs
Two challenges and four objectives addressed in the context of LCSs
1. Revise and update UCS and compare it to XCS
  New fitness sharing designed
  Fitness sharing provides benefits to UCS
  Key differences between UCS and XCS empirically studied
  Further work: complement the analysis with theory
Slide 53
Conclusions and Further Work
2 & 3. Study LCSs in domains with rare classes
  Start with a systematic analysis validated with boundedly difficult problems
  Finish with its application to real-world problems with rare classes
[Figure labels: Problem: LCSs can learn from imbalanced domains; Complex systems: lots of interacting components; Facetwise analysis: small models; Application of LCSs to a new real-world problem; Heuristic to estimate the niche imbalance ratio; Problem characterization; Complexity metrics; Resampling techniques; Domain of competence of LCSs; Future research line]
Further work
  Design measures to characterize real-world classification problems
  Measure the difficulty of the problems
  Link problem difficulty with domain of competence
  Include problem difficulty in the study of re-sampling techniques, etc.
  First steps taken in (Bernadó et al., 06; Orriols et al., 08a)
Slide 54
Conclusions and Further Work
4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
Analysis to mix
Accurate online evaluation system of LCSs
Human-like representation and reasoning mechanisms of fuzzy logic
Robust discovery capabilities of GAs
None of the three ideas was novel in itself, but combining them to create a supervised learning technique was
Fuzzy-UCS
Evolved highly accurate models of moderate size
Was able to extract classification models from large volumes of data
Is prepared to deal with domains with uncertainty and vagueness
Further work
Adapt LCSs to extract association rules online
Many real-world applications generate data streams
LCSs are appealing since they mine data streams
However, in most cases, the data are unlabeled
Aim: design an LCS that is able to extract association rules online
First steps taken in (Orriols et al., 2008f)
Lessons Learned on the Way
1. The importance of design decomposition
We need to improve LCS for mining rarities
  1. Mix existing, powerful techniques that solve problems that you intuitively identify
  The thesis started in this way (Orriols-Puig, 05a, 05b)
  Lesson: despite moderate success, poor understanding
  2. Build complete models of your system
  3. Design decomposition and facetwise analysis (Goldberg, 02)
  Key for success
  Not only for GAs or LCSs
2. The relevance of crossbreeding ideas
New complex real-world problems require the best practices of different fields
LCSs are friendly frameworks for crossbreeding ideas
Publications
This work has resulted in 35 publications:
7 journal papers (4 accepted/published and 3 currently submitted)
5 papers in LNCS/LNAI volumes
6 book chapters
15 international conference papers
2 national conference papers
Selected publications
Albert Orriols-Puig, Ester Bernadó-Mansilla, David E. Goldberg, Kumara Sastry, and Pier Luca Lanzi. Facetwise Analysis of XCS for Problems with Class Imbalances. IEEE Transactions on Evolutionary Computation, 2008, submitted
Albert Orriols-Puig, Jorge Casillas and Ester Bernadó-Mansilla. Fuzzy-UCS: A Michigan-style Fuzzy-Learning Classifier System for Supervised Learning. IEEE Transactions on Evolutionary Computation, 2008, doi=10.1109/TEVC.2008.925144
Albert Orriols-Puig, Ester Bernadó-Mansilla. Evolutionary Rule-Based Systems for Imbalanced Datasets. Soft Computing Journal. Special Issue on Evolutionary and Metaheuristic-based Data Mining, 2008, doi=10.1007/s00500-008-0319-7
Albert Orriols-Puig and Ester Bernadó-Mansilla. Revisiting UCS: Description, Fitness Sharing, and Comparison with XCS. In Advances at the frontier of LCS, LNCS series, volume 4998, pages 96–116, Springer, 2008
Albert Orriols-Puig, David E. Goldberg, Kumara Sastry, and Ester Bernadó-Mansilla. Modeling XCS in Class Imbalances: Population Size and Parameter Settings. In GECCO’07, pages 1838-1845, ACM Press, 2007
Albert Orriols-Puig, Kumara Sastry, Pier Luca Lanzi, David E. Goldberg, and Ester Bernadó-Mansilla. Modeling Selection Pressure in XCS for Proportionate and Tournament Selection. In GECCO’07, pages 1846-1853, ACM Press, 2007
Albert Orriols-Puig and Ester Bernadó-Mansilla. Bounding XCS’s Parameters for Unbalanced Datasets. Best paper nomination. In GECCO’06, pages 1561-1568. ACM Press, 2006
Acknowledgments
Enginyeria i Arquitectura La Salle
Prof. Ester Bernadó-Mansilla
My first “second home”: the IlliGAL
Prof. David E. Goldberg for accepting my visits and for all his valuable lessons
All labbies, and especially Kumara Sastry, Xavier Llorà, and Tian Li Yu
My second “second home”: the SCI2S group
Prof. Francisco Herrera for accepting my visits and for his time and advice
All labbies, and especially Jorge Casillas
My examining committee
Prof. David E. Goldberg, Prof. Francisco Herrera, Prof. Martin V. Butz, Prof. Xavier Llorà, and Prof. Xavier Vilasís
All the people I have worked with
Ester Bernadó-Mansilla, Jorge Casillas, David E. Goldberg, Pier Luca Lanzi, Francisco J. Martínez-López, Sergio Morales-Ortigosa, Núria Macià, Joaquim Rios-Boutin, Kumara Sastry, Francesc Teixidó-Navarro
The research was supported by
Departament d’Universitats, Recerca i Societat de la Informació (DURSI)
Under a FI scholarship with reference 2005FI-00252
Under two BE travel grants with references 2006BE-00299 and 2007BE2-00124
Generalitat de Catalunya, under grants 2002SGR-00155 and 2005SGR-00302
Ministerio de Educación y Ciencia, under projects KEEL and KEEL2 with references TIC2002-04036-C05-03 and TIN2005-08386-C05-04