Multi-classification and Rule Extraction with SVMs
Cecilio Angulo [email protected]
GREC – Grup de Recerca en Enginyeria del Coneixement
UPC – Universitat Politecnica de Catalunya
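Bi-Class SVM
Standard primal SVM formulation (Vapnik, 1996)
[Figure: Class A and Class B separated by the hyperplane w·x + b = 0, with margin hyperplanes w·x + b = 1 and w·x + b = −1; margin width 2 / ||w||]
$$\min_{w,b} \ \tfrac{1}{2}\,\|w\|^{2} \quad \text{s.t.} \quad A\,w + b\,\mathbf{1} \ge \mathbf{1}, \qquad B\,w + b\,\mathbf{1} \le -\mathbf{1}$$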
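The constraints and the margin width 2 / ||w|| can be checked numerically for a candidate hyperplane. The sketch below is only an illustration: it assumes A and B hold the Class A and Class B patterns one per row, and the helper names `is_feasible` and `margin_width` are hypothetical, not part of the slides.

```python
import math

def score(w, b, x):
    """Value of w·x + b for one pattern x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def is_feasible(w, b, A, B):
    """True if every row of A satisfies w·x + b >= 1 and every row of B satisfies w·x + b <= -1."""
    return all(score(w, b, x) >= 1 for x in A) and all(score(w, b, x) <= -1 for x in B)

def margin_width(w):
    """Distance between the hyperplanes w·x + b = 1 and w·x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Toy data (assumed for illustration only)
A = [(1.0, 0.0), (2.0, 1.0)]
B = [(-1.0, 0.0), (-3.0, 2.0)]
w, b = (1.0, 0.0), 0.0
print(is_feasible(w, b, A, B), margin_width(w))
```

Among all feasible (w, b), the primal problem picks the one with the smallest ||w||, that is, the widest margin.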
Multi-Class SVM
• From Bi-class
– one versus all
– one versus one
– ECOC
• All the classes at once
• From Tri-class
From Bi-Class
• one versus all. Decomposition Phase
[Figure: data set with five classes, Class A to Class E]
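A minimal sketch of the one-versus-all decomposition: each class in turn is relabelled +1 and all the others −1, giving one bi-class problem per class. The function name `one_versus_all_labels` is a hypothetical helper introduced here for illustration.

```python
def one_versus_all_labels(labels):
    """Return one +1/-1 label vector per class: the class itself is +1, the rest -1."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}

# Five classes, as in the slide's example data set
labels = ["A", "B", "C", "D", "E", "A"]
problems = one_versus_all_labels(labels)
print(len(problems))   # one bi-class problem per class
print(problems["A"])
```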
From Bi-Class
• one versus all. Reconstruction Procedure
[Figure panels: Linear Kernel; 7-Polynomial Kernel; Gaussian Kernel, σ = 0.2; Gaussian Kernel, σ = 1]
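The slides do not spell out the reconstruction rule. A common choice, assumed here, is to assign the class whose one-versus-all machine returns the largest output. The name `one_versus_all_predict` and the score values are hypothetical.

```python
def one_versus_all_predict(scores):
    """Pick the class whose 'class versus rest' machine gives the largest output.

    scores maps each class name to the real-valued output f_c(x) of its machine.
    """
    return max(scores, key=scores.get)

# Hypothetical outputs of five one-versus-all machines for a single pattern
scores = {"A": -0.7, "B": 0.3, "C": 1.2, "D": -1.1, "E": -0.2}
print(one_versus_all_predict(scores))
```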
From Bi-Class
• one versus one. Decomposition Phase
[Figure: data set with five classes, Class A to Class E]
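A minimal sketch of the one-versus-one decomposition: one bi-class problem is built for every pair of classes, using only the patterns of those two classes. With N classes this gives N(N−1)/2 problems. The helper name `one_versus_one_problems` is hypothetical.

```python
from itertools import combinations

def one_versus_one_problems(samples, labels):
    """Build one bi-class training set per pair of classes.

    For the pair (pos, neg), patterns of class pos get label +1, patterns of
    class neg get label -1, and all other patterns are left out.
    """
    problems = {}
    for pos, neg in combinations(sorted(set(labels)), 2):
        problems[(pos, neg)] = [
            (x, 1 if y == pos else -1)
            for x, y in zip(samples, labels)
            if y in (pos, neg)
        ]
    return problems

samples = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
labels = ["A", "B", "C", "D", "E"]
problems = one_versus_one_problems(samples, labels)
print(len(problems))          # 5 * 4 / 2 pairs
print(problems[("A", "B")])
```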
From Bi-Class
• one versus one. Reconstruction Procedure
[Figure panels: Linear Kernel; 7-Polynomial Kernel; Gaussian Kernel, σ = 0.2; Gaussian Kernel, σ = 1]
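For one versus one, a usual reconstruction is majority voting, and that is the scheme assumed in this sketch; the slides do not name the exact rule. Each pairwise machine votes for one of its two classes and the most voted class wins. `one_versus_one_vote` and the outputs below are hypothetical.

```python
from collections import Counter

def one_versus_one_vote(pair_outputs):
    """Majority vote over pairwise machines.

    pair_outputs maps (pos, neg) to the machine output for one pattern;
    a positive output votes for pos, anything else votes for neg.
    """
    votes = Counter()
    for (pos, neg), out in pair_outputs.items():
        votes[pos if out > 0 else neg] += 1
    return votes.most_common(1)[0][0]

# Hypothetical outputs of the three pairwise machines for classes A, B, C
outputs = {("A", "B"): -0.8, ("A", "C"): 0.4, ("B", "C"): 1.3}
print(one_versus_one_vote(outputs))
```

Ties are broken here by the order in which votes were first recorded, which is an arbitrary choice.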
From Bi-Class
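• Error Correcting Output Codes (ECOC)
Code matrices (rows: classes C1 … CN; columns: classifiers f1 … fL)
one versus one:
$$\begin{array}{c|ccccc}
 & f_1 & \cdots & f_i & \cdots & f_L \\ \hline
C_1 & 1 & -1 & 0 & \cdots & 0 \\
\vdots & & & & & \\
C_j & 0 & \cdots & 1 & \cdots & 0 \\
\vdots & & & & & \\
C_N & 0 & \cdots & 0 & -1 & 1
\end{array}$$
one versus all:
$$\begin{array}{c|ccccc}
 & f_1 & \cdots & f_i & \cdots & f_L \\ \hline
C_1 & 1 & -1 & -1 & -1 & -1 \\
\vdots & & & & & \\
C_j & -1 & \cdots & 1 & \cdots & -1 \\
\vdots & & & & & \\
C_N & -1 & -1 & -1 & -1 & 1
\end{array}$$
ECOC:
$$\begin{array}{c|ccccc}
 & f_1 & \cdots & f_i & \cdots & f_L \\ \hline
C_1 & 1 & -1 & 1 & -1 & -1 \\
\vdots & & & & & \\
C_j & 1 & \cdots & 1 & \cdots & -1 \\
\vdots & & & & & \\
C_N & -1 & 1 & -1 & -1 & 1
\end{array}$$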
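A minimal decoding sketch for such code matrices: the outputs of the L classifiers form a word, and the pattern is assigned to the class whose codeword is closest. The distance used here counts disagreements and ignores 0 entries (classes a classifier was not trained on); that convention, the name `ecoc_decode`, and the 3-class matrix are assumptions for illustration.

```python
def ecoc_decode(code_matrix, outputs):
    """Return the class whose codeword disagrees least with the classifier outputs.

    code_matrix maps each class to a tuple of entries in {+1, -1, 0};
    outputs is the tuple of +1/-1 classifier decisions for one pattern.
    Entries equal to 0 are skipped when counting disagreements.
    """
    def distance(codeword):
        return sum(1 for c, o in zip(codeword, outputs) if c != 0 and c != o)
    return min(code_matrix, key=lambda cls: distance(code_matrix[cls]))

# One-versus-all written as a code matrix for three classes
one_versus_all = {
    "C1": (1, -1, -1),
    "C2": (-1, 1, -1),
    "C3": (-1, -1, 1),
}
print(ecoc_decode(one_versus_all, (-1, 1, -1)))
```

With longer codewords, a single wrong classifier can still leave the correct class as the nearest codeword, which is the error-correcting property the name refers to.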
All the classes at once
• Decomposition - Reconstruction Procedure
[Figure panels: Linear Kernel; 7-Polynomial Kernel; Gaussian Kernel, σ = 0.2; Gaussian Kernel, σ = 1]
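Tri-Class SVM
Primal SVM formulation (Angulo, 2001)
[Figure: Class A, Class B and Class C with the hyperplanes w·x + b = 1, w·x + b = δ, w·x + b = 0, w·x + b = −δ and w·x + b = −1; panels: Linear Kernel; 7-Polynomial Kernel]
$$\min_{w,b} \ \tfrac{1}{2}\,\|w\|^{2} \quad \text{s.t.} \quad A\,w + b\,\mathbf{1} \ge \mathbf{1}, \qquad B\,w + b\,\mathbf{1} \le -\mathbf{1}, \qquad -\delta\,\mathbf{1} \le C\,w + b\,\mathbf{1} \le \delta\,\mathbf{1}$$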
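The three constraints suggest a machine with three possible answers. The sketch below assumes the output is +1 at or above δ, −1 at or below −δ, and 0 inside the band; the slides give only the training constraints, so these thresholds and the name `tri_class_output` are assumptions.

```python
def tri_class_output(w, b, x, delta):
    """Ternary decision from the value w·x + b.

    Assumed rule: +1 if the value is >= delta, -1 if it is <= -delta,
    and 0 when it falls inside the band (-delta, delta).
    """
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    if value >= delta:
        return 1
    if value <= -delta:
        return -1
    return 0

w, b, delta = (1.0, 0.0), 0.0, 0.5
for x in [(2.0, 0.0), (0.2, 0.0), (-3.0, 0.0)]:
    print(x, tri_class_output(w, b, x, delta))
```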
From Tri-Class
• Decomposition Phase (with δ = 0.01, 0.90)
[Figure: data set with five classes, Class A to Class E]
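A sketch of one way to decompose a multi-class problem into tri-class machines, assuming one machine per pair of classes in which the pair takes the labels +1 and −1 and every remaining class takes the label 0 (the role of C in the tri-class formulation). The slides do not state the scheme in words, so the pairing and the name `tri_class_labels` are assumptions.

```python
from itertools import combinations

def tri_class_labels(labels):
    """One ternary label vector per pair of classes.

    For the pair (pos, neg): class pos -> +1, class neg -> -1, all others -> 0.
    Unlike one versus one, every pattern is kept in every problem.
    """
    result = {}
    for pos, neg in combinations(sorted(set(labels)), 2):
        result[(pos, neg)] = [
            1 if y == pos else -1 if y == neg else 0 for y in labels
        ]
    return result

labels = ["A", "B", "C", "D", "E"]
machines = tri_class_labels(labels)
print(len(machines))
print(machines[("A", "B")])
```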
From Tri-Class
• Reconstruction Procedure (with δ = 0.01)
[Figure panels: Linear Kernel; 2-Polynomial Kernel; 7-Polynomial Kernel; Gaussian Kernel, σ = 0.2; Gaussian Kernel, σ = 0.5; Gaussian Kernel, σ = 1]
From Tri-Class
• Reconstruction Procedure (with δ = 0.90)
[Figure panels: Linear Kernel; 2-Polynomial Kernel; 7-Polynomial Kernel; Gaussian Kernel, σ = 0.2; Gaussian Kernel, σ = 0.5; Gaussian Kernel, σ = 1]
Rule Extraction with SVMs
• Idea
• Examples
• Results
Rule Extraction
• Idea
[Diagram elements: Data; SVM (SVs, α's); Clustering (Clusters); Rule Extraction; New Model; SVM function]
Ellipsoids: Equation rules
IF A·X1² + B·X2² + C·X1·X2 + D·X1 + E·X2 + F ≤ G THEN CLASS
Hyper-rectangles: Interval rules
IF X1 ∈ [a,b] ∧ X2 ∈ [c,d] THEN CLASS
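Both rule types can be evaluated directly on a two-dimensional pattern. The sketch below only shows how an extracted rule would be applied once its coefficients are known; how the ellipsoids and hyper-rectangles are fitted is not covered, and the function names and coefficient values are illustrative assumptions.

```python
def equation_rule(x1, x2, A, B, C, D, E, F, G):
    """Ellipsoid (equation) rule: A*x1^2 + B*x2^2 + C*x1*x2 + D*x1 + E*x2 + F <= G."""
    return A * x1**2 + B * x2**2 + C * x1 * x2 + D * x1 + E * x2 + F <= G

def interval_rule(x1, x2, a, b, c, d):
    """Hyper-rectangle (interval) rule: x1 in [a, b] and x2 in [c, d]."""
    return a <= x1 <= b and c <= x2 <= d

# Unit circle centred at the origin as an equation rule (assumed coefficients)
circle = dict(A=1, B=1, C=0, D=0, E=0, F=0, G=1)
print(equation_rule(0.5, 0.5, **circle), equation_rule(1.0, 1.0, **circle))

# Unit square as an interval rule (assumed bounds)
print(interval_rule(0.5, 0.2, a=0, b=1, c=0, d=1), interval_rule(1.5, 0.2, a=0, b=1, c=0, d=1))
```

An interval rule is easier to read, while an equation rule can follow the shape of a region more tightly.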
Rule Extraction
• Examples
[Figure panels, for each of two examples: SVM function; First iteration; Second iteration; Third iteration]
Rule Extraction
• Results
Table 1. Results obtained from data sets (with Netlab software).
                                 | Equation rules                    | Interval rules
Data set    RBF nodes  RBF error |   Err     Co     Cv     Ov     NR |   Err     Co     Cv     Ov     NR
Iris              4.5      0.040 | 0.028  94.67  64.67   0.00    5.4 | 0.046  96.67  69.33   0.00    5.5
Wisconsin         2.2      0.029 | 0.032  98.54  89.31   1.17    3.8 | 0.039  97.22  91.92   2.33    7.7
Wine              3.0      0.011 | 0.023  98.89  70.30   0.56    6.2 | 0.039  96.03  79.25   3.36    9.0
Soybean           5.4      0.020 | 0.060  91.50  19.00   0.00    6.3 | 0.020  100.0  71.50   2.00    5.8
Thyroid           9.3      0.065 | 0.047  92.58  80.02   0.47   13.0 | 0.059  96.73  75.30   5.49   13.4
Monk3             6.0      0.048 | 0.064  91.90  68.52   0.69   11.0 | 0.027  97.92  100.0   0.00    8.0
Zoo               7.0      0.062 | 0.080  93.22  61.73   0.00   15.0 | 0.073  96.98  77.28   1.11   15.4
Mushroom         30.0      0.040 |  0.06  92.18  77.16   3.43   30.0 |  0.06  92.00  93.47   7.27     30
Rule Extraction
• Results
Table 2. Results obtained from data sets (with Orr software).
                                 | Equation rules                    | Interval rules
Data set    RBF nodes  RBF error |   Err     Co     Cv     Ov     NR |   Err     Co     Cv     Ov     NR
Iris              5.1      0.033 | 0.033  96.00  70.00   0.00    6.4 | 0.033  94.67  72.00   0.00    6.2
Wisconsin        21.5      0.034 | 0.039  97.50  82.00   0.28   23.9 | 0.045  96.65  95.02   3.95   24.8
Wine             15.2      0.039 | 0.045  91.56  59.86   0.62   28.0 | 0.039  94.41  84.92   6.08   69.7
Soybean          12.4      0.000 | 0.040  96.00  47.00   0.00   13.3 | 0.100  89.50  91.00  34.50   17.7
Thyroid          29.2      0.062 | 0.042  90.28  64.65   0.00   31.8 | 0.046  89.72  74.97   0.45   32.5
Monk3            12.0      0.050 | 0.064  91.89  61.57   2.54   21.0 | 0.028  94.21  100.0  57.25   23.0
Zoo             17.33      0.088 | 0.090  91.29  58.55   0.00  21.67 | 0.098  91.42  95.64   3.02   24.0
Mushroom         48.0      0.051 | 0.063  90.24  71.31   3.58   49.0 | 0.059  92.79  92.41   8.12   49.0