Fuzzy Interpretation of Discretized Intervals
Author: Dr. Xindong Wu
IEEE Transactions on Fuzzy Systems, Vol. 7, No. 6, December 1999
Presented by: Gong Chen
Outline
• Concepts Review
• Overview
• Problem
• Solution
• Related Techniques
• Algorithms Design in HCV
• Experimental Results
• Conclusion
• Answers for Final Exam
Concepts Review
• Induction: generalize rules from training data
• Deduction: apply the generalized rules to testing data
• Three possible results of deduction:
– Single match
– No match
– Multiple match
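The three deduction outcomes can be sketched in a few lines. This is only an illustration, not HCV's implementation: the rule format (a dict of required attribute values plus a class label) and the names `Rule`, `matching_rules`, and `match_kind` are hypothetical, and counting a "multiple match" by distinct classes rather than by rules is an assumption.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    # Hypothetical rule format: required attribute values and the predicted class.
    conditions: dict
    label: str


def matching_rules(rules, example):
    """Return every rule whose conditions are all satisfied by the example."""
    return [
        r for r in rules
        if all(example.get(attr) == val for attr, val in r.conditions.items())
    ]


def match_kind(rules, example):
    """Name the deduction outcome for one test example."""
    # Assumption: outcomes are counted by distinct predicted classes.
    classes = {r.label for r in matching_rules(rules, example)}
    if not classes:
        return "no match"
    return "single match" if len(classes) == 1 else "multiple match"


rules = [
    Rule({"age": "old"}, "MORE_EXPERIENCE"),
    Rule({"degree": "none"}, "LESS_EXPERIENCE"),  # made-up second rule
]
print(match_kind(rules, {"age": "old", "degree": "phd"}))
```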
Concepts Review
• Discretization of Continuous domains
– Continuous numerical domains can be discretized into intervals
– The discretized intervals can be treated as nominal values
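As a small sketch of treating intervals as nominal values, the function below maps a number to the label of the interval that contains it. The cut points (35 and 50) and the labels are made up for the example, and `to_nominal` is a hypothetical helper name.

```python
from bisect import bisect_right


def to_nominal(value, cut_points, labels):
    """Map a number to the label of the interval it falls in.

    cut_points must be sorted, and len(labels) must equal len(cut_points) + 1.
    A value equal to a cut point is placed in the interval to its right.
    """
    return labels[bisect_right(cut_points, value)]


cuts = [35, 50]                       # assumed cut points
names = ["young", "middle", "old"]    # assumed interval labels
print(to_nominal(20, cuts, names), to_nominal(49.49, cuts, names), to_nominal(50, cuts, names))
```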
Concepts Review
• Using the Information Gain Heuristic (IGH) for discretization (employed by HCV):
– $x = (x_i + x_{i+1})/2$ for $i = 1, \ldots, n-1$
– $x$ is a possible cut point if $x_i$ and $x_{i+1}$ are of different classes
– Use IGH to find the best $x$
– Recursively split on the left and on the right
– Stop the recursive splitting when a stopping criterion is met
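The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not HCV's code: the stopping criterion (a minimum segment size, `min_size`) is invented for the example because the slides do not say which criterion is used, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def candidate_cuts(values, labels):
    """Midpoints between adjacent sorted values whose classes differ."""
    pairs = sorted(zip(values, labels))
    return [
        (a + b) / 2
        for (a, lab_a), (b, lab_b) in zip(pairs, pairs[1:])
        if lab_a != lab_b and a != b
    ]


def best_cut(values, labels):
    """Return the candidate cut with the highest information gain, or None."""
    base = entropy(labels)
    best, best_gain = None, 0.0
    for cut in candidate_cuts(values, labels):
        left = [lab for v, lab in zip(values, labels) if v <= cut]
        right = [lab for v, lab in zip(values, labels) if v > cut]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best_gain:
            best, best_gain = cut, gain
    return best


def discretize(values, labels, min_size=2):
    """Recursively split on the best cut; min_size is an assumed stopping rule."""
    if len(values) < 2 * min_size:
        return []
    cut = best_cut(values, labels)
    if cut is None:
        return []
    left = [(v, lab) for v, lab in zip(values, labels) if v <= cut]
    right = [(v, lab) for v, lab in zip(values, labels) if v > cut]
    return (
        discretize([v for v, _ in left], [lab for _, lab in left], min_size)
        + [cut]
        + discretize([v for v, _ in right], [lab for _, lab in right], min_size)
    )


ages = [20, 25, 30, 52, 60, 65]
classes = ["a", "a", "a", "b", "b", "b"]
print(discretize(ages, classes))
```

On this toy data the only place where the class changes is between 30 and 52, so a single cut at their midpoint is returned.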
Overview
[Diagram: Training Data → Discretization → Induction → Rules; Testing Data → Deduction → No match / Single match / Multiple match; Fuzzy Borders]
Problem
• Discretization of continuous domains does not always yield an accurate interpretation!
• Recall that Information Gain is a heuristic measure applied to the training data, so the cut points it selects are not guaranteed to fit "real-world" data.
• Example
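Problem
• Heuristic 1 (e.g., Information Gain)
[Figure: age axis marked at 18 and 35 (young) and at 49 (old), with the test value 49.49]
• Heuristic 2 (e.g., Gain Ratio)
[Figure: age axis marked at 18 and 35 (young) and at 50 (old), with the test value 49.49]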
Problem
• Suppose that after induction we get just one rule:
– If (age = old) then Class = MORE_EXPERIENCE
• According to Heuristic 2, the instance (age = 49.49) gets no match!
Solution
• A safer way to describe age = 49.49 is to say: to some degree it is young; to some degree it is old.
• This replaces a single assertion that it is definitely young or definitely old.
• Thus, to some degree, it can still match a rule and get a classification result instead of no match.
– No match → single match or multiple match, to some degree
• This is the so-called fuzzy match!
Solution
• “Fuzziness is a type of deterministic uncertainty. It describes the event class ambiguity.”
• “Fuzziness works when there are the outcomes that belong to several event classes at the same time but to different degrees.”
• “Fuzziness measures the degree to which an event occurs.”
– Jim Bezdek, Didier Dubois, Bart Kosko, Henri Prade
Solution
• "To some degree"?
– A membership function describes the "degree"
– A membership function tells you to what degree an event belongs to a class
– A membership function calculates this degree
• Three widely used membership functions are employed by HCV:
– Linear
– Polynomial
– Arctan
Solution
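• Linear membership function
[Figure: an interval from $x_{left}$ to $x_{right}$ of length $l$, with each border spread over a width of $sl$]
$k = \frac{1}{2sl}$; $a = -k x_{left} + \frac{1}{2}$; $b = k x_{right} + \frac{1}{2}$
$lin_{left}(x) = kx + a$
$lin_{right}(x) = -kx + b$
$lin(x) = \max(0, \min(1, lin_{left}(x), lin_{right}(x)))$
• $s$ is a user-specified parameter; e.g., $s = 0.1$ indicates that the interval spreads out into the adjacent intervals for 10% of its original length at each end.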
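The linear membership function translates almost directly into code. The sketch below follows the formulas as given; the function name and the sample interval [50, 70] for "old" are assumptions (the upper bound 70 is made up).

```python
def linear_membership(x, x_left, x_right, s=0.1):
    """Degree (0..1) to which x belongs to the interval [x_left, x_right].

    s is the spread parameter: each border is blurred over s * (interval length).
    """
    length = x_right - x_left
    k = 1.0 / (2.0 * s * length)
    a = -k * x_left + 0.5
    b = k * x_right + 0.5
    lin_left = k * x + a      # rises through 0.5 at x_left
    lin_right = -k * x + b    # falls through 0.5 at x_right
    return max(0.0, min(1.0, lin_left, lin_right))


# Assumed interval for "old": [50, 70], so the spread is 0.1 * 20 = 2 on each side.
print(linear_membership(49.49, 50, 70))
```

With these assumed numbers, age 49.49 belongs to "old" with a degree of about 0.37, instead of being rejected outright by the sharp border at 50.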
Solution
• Polynomial Membership Function: uses a smoother curve instead of a linear function.
• Arctan Membership Function
• Experimental results show no significant difference between the three kinds of functions, so the Polynomial Membership Function is chosen.
Solution
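$poly_{side}(x) = a_{side}x^3 + b_{side}x^2 + c_{side}x + d_{side}$, where $side \in \{left, right\}$
$a_{side} = \frac{1}{4(ls)^3}$
$b_{side} = -3a_{side}x_{side}$
$c_{side} = 3a_{side}(x_{side}^2 - (ls)^2)$
$d_{side} = -a_{side}(x_{side}^3 - 3x_{side}(ls)^2 + 2(ls)^3)$
$$poly(x) = \begin{cases} poly_{left}(x), & \text{if } x_{left} - ls \le x \le x_{left} + ls \\ poly_{right}(x), & \text{if } x_{right} - ls \le x \le x_{right} + ls \\ 1, & \text{if } x_{left} + ls \le x \le x_{right} - ls \\ 0, & \text{otherwise} \end{cases}$$
$poly(x)$ gives the degree to which $x$ belongs to an interval.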
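A cubic membership function of this kind can be sketched as below. It is written in terms of the offset from a border, `u = x - x_side`, and the half-width `w = l * s`. The signs are chosen so that the curve rises smoothly from 0 to 1 across the left border and falls from 1 to 0 across the right one; that choice is an assumption about the intended shape, not a transcription of the coefficients above. It also assumes `s < 0.5` so the two border regions do not overlap.

```python
def _rising(u, w):
    """Cubic that climbs from 0 at u = -w to 1 at u = +w, passing 0.5 at u = 0.

    Its slope is zero at both ends, so it joins the flat parts smoothly.
    """
    return (-u**3 + 3 * w**2 * u + 2 * w**3) / (4 * w**3)


def poly_membership(x, x_left, x_right, s=0.1):
    """Degree (0..1) to which x belongs to [x_left, x_right], with cubic borders."""
    w = (x_right - x_left) * s  # l * s: half-width of each fuzzy border
    if x_left - w <= x <= x_left + w:
        return _rising(x - x_left, w)
    if x_right - w <= x <= x_right + w:
        return 1.0 - _rising(x - x_right, w)
    if x_left + w < x < x_right - w:
        return 1.0
    return 0.0


# Same assumed interval as a typical "old" range: [50, 70].
print(poly_membership(49.49, 50, 70))
```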
Related Techniques
• No match
– Largest Class: assign all no-match examples to the largest class, the default class
• Multiple match
– Largest Rule: assign examples to the rule that covers the largest number of examples
– Estimate of Probability
• Fuzzy borders can introduce multiple-match conflicts, so a hybrid method is desired for the whole process
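The Largest Class and Largest Rule fallbacks are simple enough to sketch directly. The data layout here (a list of training labels, and matched rules as `(label, n_covered)` pairs) and the function names are assumptions for the example.

```python
from collections import Counter


def largest_class(training_labels):
    """No match: fall back to the most frequent class in the training data."""
    return Counter(training_labels).most_common(1)[0][0]


def largest_rule(matched_rules):
    """Multiple match: use the class of the rule covering the most training examples.

    matched_rules is a list of (label, n_covered) pairs (assumed layout).
    """
    return max(matched_rules, key=lambda rule: rule[1])[0]


print(largest_class(["a", "b", "a"]))
print(largest_rule([("a", 12), ("b", 30)]))
```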
Related Techniques
• Estimate of Probability
[Equation not recoverable from the transcript; its annotations were:]
– The number of examples in the training set covered by conj
– The probability that e belongs to class ci
– Conj1 and Conj2 are two rules supporting that e belongs to ci
Algorithms Design in HCV
• HCV (Large)
– No match: Largest Class
– Multiple match: Largest Rule
• HCV (Fuzzy)
– No match: Fuzzy Match
– Multiple match: Fuzzy Match
• HCV (Hybrid)
– No match: Fuzzy Match
– Multiple match: Estimate of Probability
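The hybrid strategy can be pictured as a dispatch on the deduction outcome. The sketch below is a simplified illustration over a single numeric attribute, not HCV's implementation: rules are `(x_left, x_right, label)` triples, the fuzzy match uses the linear membership function, and the multiple-match resolver is passed in as a callback because the probability estimate itself is not reproduced here. The interval bounds other than 18, 35, and 50, and the label `LESS_EXPERIENCE`, are made up.

```python
def linear_membership(x, x_left, x_right, s=0.1):
    """Linear fuzzy membership of x in [x_left, x_right] with spread s."""
    k = 1.0 / (2.0 * s * (x_right - x_left))
    return max(0.0, min(1.0, k * (x - x_left) + 0.5, -k * (x - x_right) + 0.5))


def deduce_hybrid(x, rules, resolve_multiple, s=0.1):
    """Classify x: sharp borders first, fuzzy borders only in the no-match case."""
    sharp = [r for r in rules if r[0] <= x < r[1]]
    if len(sharp) == 1:
        return sharp[0][2]                      # single match
    if sharp:
        return resolve_multiple(sharp)          # multiple match: caller-supplied strategy
    # No match: find the rule closest to x, i.e. with the highest fuzzy membership.
    best = max(rules, key=lambda r: linear_membership(x, r[0], r[1], s))
    if linear_membership(x, best[0], best[1], s) > 0.0:
        return best[2]
    return None                                 # still no rule applies


rules = [(18, 35, "LESS_EXPERIENCE"), (50, 70, "MORE_EXPERIENCE")]
pick_first = lambda matched: matched[0][2]      # placeholder resolver, not the real estimate
print(deduce_hybrid(49.49, rules, pick_first))
```

Here age 49.49 has no sharp match, but its fuzzy membership in the assumed "old" interval is positive, so the example is classified rather than dropped.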
Experimental Results
• Data:
– 17 datasets from the UCI Machine Learning Repository
– Why these were selected:
1) Numerical data
2) Situations where no rules clearly apply
• Test conditions
– The 68 parameters in HCV are all left at their defaults, except the deduction strategy
– Parameters for C4.5 and NewID are the ones recommended by their respective inventors
Experimental Results
Dataset           HCV (hybrid)  HCV (large)  HCV (fuzzy)  C4.5 (R 8)  C4.5 (R 5)  NewID
Anneal            98.00%        93.00%       93.00%       95.00%      93.00%      81.00%
Bupa              57.60%        55.90%       55.90%       71.20%      61.00%      73.00%
Cleveland 2       78.00%        68.10%       73.60%       71.40%      76.90%      67.00%
Cleveland 5       54.90%        56.00%       52.70%       51.60%      56.00%      47.30%
CRX               82.50%        72.50%       82.00%       83.00%      80.00%      79.00%
Glass (w/out ID)  72.30%        60.00%       60.00%       71.50%      64.60%      66.00%
Hungarian 2       86.30%        85.00%       85.00%       81.20%      80.00%      78.00%
Hypothyroid       97.80%        86.30%       96.30%       99.40%      99.40%      92.00%
Imports 85        62.70%        59.30%       61.00%       61.00%      67.80%      61.00%
Ionosphere        88.00%        81.20%       81.20%       86.30%      85.50%      82.00%
Labor Neg         76.50%        76.50%       76.50%       82.40%      82.40%      65.00%
Pima              73.90%        69.10%       69.10%       73.50%      75.50%      73.00%
Swiss 2           96.90%        96.90%       96.90%       96.90%      96.90%      97.00%
Swiss 5           28.10%        25.00%       28.10%       40.60%      31.20%      22.00%
Va 2              78.90%        78.90%       78.90%       77.50%      70.40%      77.00%
Va 5              28.20%        25.40%       29.60%       31.00%      26.80%      20.00%
Wine              90.40%        76.90%       76.90%       90.40%      90.00%      90.40%
Experimental Results
• Predictive accuracy
– HCV (hybrid) outperforms the others on 9 datasets
– HCV (large): 3 datasets
– HCV (fuzzy): 2 datasets
– C4.5 (R 8): 7 datasets
– C4.5 (R 5): 6 datasets
– NewID: 3 datasets
– HCV (hybrid) clearly and significantly outperforms the other interpretation techniques (in HCV) on datasets with numerical data in "no match" and "multiple match" cases.
• C4.5 and NewID are included for reference, not for extensive comparison.
Conclusion
• Fuzziness is strongly domain dependent; HCV allows users to specify their own intervals and fuzzy functions.
– An important direction to take with specific domains
• The fuzzy-borders design combined with probability estimation achieves better results in terms of predictive accuracy.
– Applicable to other machine learning and data mining algorithms
Answers for Final Exam Problems
• Q1: When doing deduction on real-world data, what are the three possible cases for each test example?
– Single match
– No match
– Multiple match
• Q2: Of the three cases during deduction, which ones does the HCV hybrid interpretation algorithm use fuzzy borders to classify?
– No match
• Q3: In the hybrid interpretation algorithm used in HCV,
– When are sharp borders set up? "Sharp borders are set up as usual during induction"
– When are fuzzy borders defined? In deduction, "only in the no match case, fuzzy borders are set up in order to find a rule which is closest to the test example in question"
Thank You!