A Semi-naive Bayes Classifier with Grouping of Cases
J. Abellán, A. Cano, A. R. Masegosa, S. Moral
Department of Computer Science and A.I. University of Granada
Spain
2
Outline
1. Introduction.
2. Semi-Naive Bayes Classifier with Grouping of Cases.
   General Description
   The Joining Criteria
   The Grouping Criteria
3. Experimental Evaluation.
4. Conclusions and Future Work.
3
Introduction: Information from a Database
Data base: each row is a case described by the attribute variables (Calcium, Tumor, Coma, Migraine) and the class variable (Cancer).
Calcium Tumor Coma Migraine Cancer
normal a1 absent absent absent
high a1 present absent present
normal a1 absent absent absent
normal a1 absent absent absent
high a0 present present absent
...... ...... ...... ...... ......
4
Introduction: Naive Bayes (Duda & Hart, 1973)
Attribute variables $\{X_i \mid i = 1, \dots, r\}$. Class variable $C$ with states $\{c_1, \dots, c_k\}$.
New observation $z = (z_1, \dots, z_r)$, i.e. $(X_1 = z_1, \dots, X_r = z_r)$.
Select the state of $C$: $\arg\max_{c_i} P(c_i \mid z)$.
Assuming the attributes are independent once the class variable is known: $\arg\max_{c_i} P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i)$.
Graphical structure: the class node C is the only parent of each attribute node X1, X2, …, Xr.
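The decision rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Elvira implementation: the function names, the toy data and the Laplace smoothing constant `alpha` are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels, alpha=1.0):
    """Collect the counts needed to estimate P(c) and P(x_j | c)."""
    class_counts = Counter(labels)
    # value_counts[j][c][v] = number of cases with class c and X_j = v
    value_counts = defaultdict(lambda: defaultdict(Counter))
    domains = defaultdict(set)
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[j][c][v] += 1
            domains[j].add(v)
    return class_counts, value_counts, domains, alpha

def predict(model, z):
    """Return arg max over c of log P(c) + sum_j log P(z_j | c)."""
    class_counts, value_counts, domains, alpha = model
    n = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)
        for j, v in enumerate(z):
            # Laplace-smoothed estimate of P(X_j = v | C = c) (assumption)
            num = value_counts[j][c][v] + alpha
            den = n_c + alpha * len(domains[j])
            score += math.log(num / den)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy data: two attributes, one binary class.
rows = [("high", "present"), ("high", "present"),
        ("normal", "absent"), ("normal", "absent"), ("normal", "present")]
labels = ["yes", "yes", "no", "no", "no"]
model = train_naive_bayes(rows, labels)
```

Working in log space avoids numerical underflow when the number of attributes r is large.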
5
Introduction: Naive Bayes Classifiers
Naive Bayesian classifiers: NB’s performance is comparable with some state-of-the-art classifiers even though its independence assumption does not hold in typical cases.
Question: “Can the performance be better when the conditional independence assumption of NB is relaxed?”
6
Introduction: Semi-Naive Bayes Classifiers
Semi-Naive Bayesian Classifiers (SNB)
A looser assumption than NB.
Independence holds among the joined variables given the class variable C.
7
Introduction: Semi-Naive Bayes Classifiers
Main problem of the Semi-NB approach: when to join two variables? This requires a joining criterion.
Kononenko’s criterion is entropy based: class entropy reduction.
Pazzani’s criterion is accuracy based: wrapper estimation, with very high complexity when the number of variables is high.
8
A SNB with Grouping of Cases: Joining Method
Three new proposals for joining criteria:
BDe: Bayesian Dirichlet Equivalent.
L1O: the Expected Log-likelihood under Leave-One-Out.
LRT: Log-likelihood Ratio Test.
9
A SNB with Grouping of Cases: Grouping Method
Problem: joining variables increases the number of parameters to estimate.
Independent: P(Xi | C) P(Xj | C). Number of parameters: #(C) (#(Xi) + #(Xj)).
Dependent: P(Xi, Xj | C). Number of parameters: #(C) #(Xi) #(Xj).
Solution: “Grouping cases of the new variable”, i.e. merging cases that carry similar information.
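To make the growth in parameters concrete, the two counts from this slide can be evaluated directly. The function names and the example cardinalities are hypothetical; the formulas are the ones stated above.

```python
def n_params_independent(n_c, n_xi, n_xj):
    """#(C) (#(Xi) + #(Xj)): Xi and Xj kept as separate children of C."""
    return n_c * (n_xi + n_xj)

def n_params_joined(n_c, n_xi, n_xj):
    """#(C) #(Xi) #(Xj): Xi and Xj replaced by their Cartesian product."""
    return n_c * n_xi * n_xj

# Hypothetical cardinalities: 2 classes, Xi with 5 cases, Xj with 6 cases.
independent = n_params_independent(2, 5, 6)  # 2 * (5 + 6) = 22
joined = n_params_joined(2, 5, 6)            # 2 * 5 * 6 = 60
```

With these cardinalities the joined variable needs almost three times as many parameters, which is the motivation for grouping its cases.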
10
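A SNB with Grouping of Cases: Example
Initial structure: C is the parent of X1, X2, …, Xr.
Joining Phase: each pair of variables is evaluated using a joining criterion (JC). Resulting structure: C is the parent of X5 x X9, X1, …, Xr.
Grouping Phase: each pair of cases is evaluated using a grouping criterion (GC), and cases with similar information are grouped. Resulting structure: C is the parent of X5 x X9, X1, …, Xr.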
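The two data transformations behind these phases can be sketched as follows. This only shows what joining two variables and grouping two cases do to the data; the decision of which pairs to join or group is left to the criteria (JC and GC). The function names, the tuple representation of a joined case and the `frozenset` label for a grouped case are assumptions made for the illustration.

```python
def join_columns(rows, i, j):
    """Replace columns i and j by one column holding the pair (x_i, x_j).

    The joined column is appended at the end of each row.
    """
    joined = []
    for row in rows:
        rest = [v for k, v in enumerate(row) if k not in (i, j)]
        joined.append(tuple(rest) + ((row[i], row[j]),))
    return joined

def group_cases(rows, col, case_a, case_b):
    """Merge two cases of column `col` into a single grouped case."""
    merged = frozenset([case_a, case_b])
    grouped = []
    for row in rows:
        row = list(row)
        if row[col] in (case_a, case_b):
            row[col] = merged
        grouped.append(tuple(row))
    return grouped

# Hypothetical data with three attribute columns.
rows = [("a", "p", "q"), ("b", "p", "r"), ("a", "s", "q")]
joined_rows = join_columns(rows, 1, 2)
grouped_rows = group_cases(joined_rows, 1, ("p", "q"), ("s", "q"))
```

After grouping, the first and third rows share the same case of the joined variable, so fewer parameters have to be estimated for it.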
11
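Joining Criteria: BDe criterion
Bayesian Dirichlet equivalent metric (BDe)
“Bayesian scores measure the quality of a model, M, as the posterior probability of the model given the learning data D”
JC(BDe) = Score(M1 : D) - Score(M2 : D)
Figure: the two compared structures, M1 and M2. In one, X and Y are separate children of C; in the other, the joined variable X x Y is a single child of C.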
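A score difference of this kind can be sketched with the standard Dirichlet marginal likelihood. Because both structures share everything except the children X, Y and X x Y, only those local terms are needed. Several points are assumptions, not taken from the slides: the BDeu form of the prior, the equivalent sample size `ess=1.0`, a uniform prior over structures, domain sizes taken from the observed values, and the orientation (joined model first, so a positive value favours joining).

```python
import math
from collections import Counter

def bdeu_local_score(values, classes, n_values, n_classes, ess=1.0):
    """Log marginal likelihood of one child of C under a BDeu prior."""
    a_c = ess / n_classes   # prior mass per class
    a_cv = a_c / n_values   # prior mass per (class, value) cell
    n_c = Counter(classes)
    n_cv = Counter(zip(classes, values))
    score = 0.0
    for count in n_c.values():
        score += math.lgamma(a_c) - math.lgamma(a_c + count)
    for count in n_cv.values():
        # Unobserved cells contribute zero, so they can be skipped.
        score += math.lgamma(a_cv + count) - math.lgamma(a_cv)
    return score

def jc_bde(xs, ys, cs, ess=1.0):
    """Score of the joined structure minus score of the separate one."""
    n_x, n_y, n_c = len(set(xs)), len(set(ys)), len(set(cs))
    joined = bdeu_local_score(list(zip(xs, ys)), cs, n_x * n_y, n_c, ess)
    separate = (bdeu_local_score(xs, cs, n_x, n_c, ess)
                + bdeu_local_score(ys, cs, n_y, n_c, ess))
    return joined - separate

# Hypothetical data: 40 cases, two classes of 20 cases each.
cs = [0] * 20 + [1] * 20
xs_dep = [0, 1] * 20
ys_dep = list(xs_dep)          # Y is a copy of X: strongly dependent
xs_ind = [0, 0, 1, 1] * 10
ys_ind = [0, 1, 0, 1] * 10     # every (x, y) pair equally frequent
```

On the dependent data the difference is positive, and on the balanced independent data it is negative, because the joined model pays for its extra parameters.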
12
Joining Criteria: L1O criterion
Expected Log-Likelihood Under Leave-One-Out (L1O).
Leave-one-out estimation. Laplace estimation.
“The estimation of the log-likelihood of the class is carried out with a leave-one-out scheme computed with a closed equation”
13
Joining Criteria: LRT criterion
Log-likelihood Ratio Test (LRT):
“Comparison of two nested models: M1, in which the variables are merged, and M2, in which the variables are independent”
Corrector factor: based on the total number of comparisons over the n active variables.
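For two nested models like these, the classical log-likelihood ratio statistic is twice the difference of the maximised log-likelihoods. Under the independent model it is asymptotically chi-square distributed with #(C) (#(X) - 1) (#(Y) - 1) degrees of freedom. The sketch below computes that statistic from counts. The function name and data are hypothetical, and the corrector factor for multiple comparisons is not reproduced because its formula does not appear in this transcript.

```python
import math
from collections import Counter

def lrt_statistic(xs, ys, cs):
    """Return (G, dof) for merged X x Y versus independent X, Y given C."""
    n_c = Counter(cs)
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_xyc = Counter(zip(xs, ys, cs))
    g = 0.0
    for (x, y, c), n in n_xyc.items():
        # Ratio of the ML estimate of P(x, y | c) to P(x | c) P(y | c)
        g += 2.0 * n * math.log(n * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
    dof = len(n_c) * (len(set(xs)) - 1) * (len(set(ys)) - 1)
    return g, dof

# Hypothetical data: 40 cases, two classes of 20 cases each.
cs = [0] * 20 + [1] * 20
g_dep, dof_dep = lrt_statistic([0, 1] * 20, [0, 1] * 20, cs)
g_ind, dof_ind = lrt_statistic([0, 0, 1, 1] * 10, [0, 1, 0, 1] * 10, cs)
```

A large G relative to the chi-square distribution with `dof` degrees of freedom is evidence against independence, and therefore in favour of joining the two variables.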
14
Grouping Method: Hypotheses
Hypotheses (a model selection problem):
The sample data D is restricted to the cases where X = xi or X = xj.
xi and xj are considered the only possible cases of X.
Grouping xi and xj implies that X has only one case.
Cases that carry similar information are candidates for grouping.
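One plausible reading of these hypotheses is a test on the restricted sample of whether xi and xj lead to different class distributions. The sketch below uses a log-likelihood ratio statistic for that comparison. It is an assumption made for illustration, not the grouping score defined by the authors, and the function name and data are hypothetical.

```python
import math
from collections import Counter

def grouping_statistic(xs, cs, case_i, case_j):
    """G statistic comparing 'two cases' against 'one grouped case'."""
    # Restrict the sample D to X = case_i or X = case_j.
    pairs = [(x, c) for x, c in zip(xs, cs) if x in (case_i, case_j)]
    n = len(pairs)
    n_x = Counter(x for x, _ in pairs)
    n_c = Counter(c for _, c in pairs)
    n_xc = Counter(pairs)
    g = 0.0
    for (x, c), count in n_xc.items():
        g += 2.0 * count * math.log(count * n / (n_x[x] * n_c[c]))
    return g

# Hypothetical variable with three cases: "a" and "b" have the same
# class distribution, while "c" always goes with class "yes".
xs = ["a"] * 10 + ["b"] * 10 + ["c"] * 10
cs = (["yes"] * 5 + ["no"] * 5) * 2 + ["yes"] * 10
g_ab = grouping_statistic(xs, cs, "a", "b")
g_ac = grouping_statistic(xs, cs, "a", "c")
```

A value near zero, as for "a" and "b", means the two cases carry similar information about the class and can be grouped. A large value, as for "a" and "c", argues for keeping them apart.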
15
Grouping Method: Criteria
BDe score:
L1O score:
LRT score:
16
Experimental Evaluation: Details
SNB-G was implemented in Elvira.
Integrated in Weka for evaluation.
Tested on 13 databases without missing values from the UCI repository.
10-fold cross-validation repeated 10 times.
Comparison with a corrected paired t-test at the 5% significance level.
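The slides do not say which correction was applied. A common choice for repeated cross-validation is the corrected resampled t-test of Nadeau and Bengio, which inflates the variance to account for overlapping training sets. The sketch below assumes that variant. The function name and the accuracy differences are hypothetical.

```python
import math
import statistics

def corrected_resampled_t(diffs, test_train_ratio):
    """t statistic with variance inflated by (1/k + n_test/n_train)."""
    k = len(diffs)                     # number of paired runs
    mean = statistics.mean(diffs)
    var = statistics.variance(diffs)   # sample variance, k - 1 denominator
    return mean / math.sqrt((1.0 / k + test_train_ratio) * var)

# 10 x 10-fold cross-validation gives k = 100 paired accuracy differences,
# and each fold tests on 1/10 and trains on 9/10 of the data.
diffs = [0.01] * 50 + [0.03] * 50      # hypothetical accuracy differences
t = corrected_resampled_t(diffs, test_train_ratio=1.0 / 9.0)
```

The statistic is compared against a Student t distribution with k - 1 degrees of freedom. Without the `test_train_ratio` term the test would be far too optimistic, because the 100 runs are not independent.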
17
Evaluating Joining Criteria: Naive Bayes Comparison
The trade-off between accuracy and log-likelihood is better for LRT.
L1O performs poorly as a joining criterion.
18
Evaluating Joining Criteria: Pazzani’s Semi-NB Comparison
LRT works slightly better than BDe.
Similar performance with a lower time complexity.
LRT is the best joining criterion.
19
Evaluating Grouping Criteria: Naive Bayes Comparison
LRT Joining + Grouping Method.
No strong differences among the criteria. L1O is slightly better.
L1O is the best grouping criterion.
20
Pazzani’s Semi-NB Comparison
SNB-G = LRT Joining + L1O Grouping
Similar performance.
Dramatic reduction in building time.
21
State-of-the-art Classifiers: AODE, TAN and LBR Comparison
Three wins against NB.
One win vs. one defeat against AODE.
No differences against TAN and LBR.
One win against Pazzani’s Semi-NB.
22
Conclusions and Future Work
Conclusions:
A preprocessing step for Naive Bayes: a method for joining variables combined with a method for grouping cases.
Very efficient, with performance similar to Pazzani’s Semi-NB classifier.
Future work:
Application to high-dimensionality data sets.
Generalization of the methodology to other models: decision trees and the TAN model.