http://www.reframe-d2k.org/
Rethinking the Essence, Flexibility and Reusability of Advanced Model Exploitation
} Give an overview of threshold selection methods in multi-label classification: (i) context-independent and (ii) context-based.
} Study different threshold settings: global, label-wise, instance-wise.
} Analyse the performance of these methods using multi-label cost curves.
} Training set:
◦ Data with multiple labels/target variables.
◦ Binary targets.
◦ Target costs.
} Training context:
◦ A model.
} Deployment context:
◦ Operating condition: costs.
} Solution:
◦ Threshold tuning.
Multi-label Data
Threshold Selection Methods
} Context-independent: cost is ignored (fixed)
1. Fixed-score (globally or label-wise)
2. RCut (instance-wise)
3. MCut (instance-wise)

Example: global fixed-score threshold of 0.5

EX   Features          Targets
     X1 X2 X3 … XF     Y1    Y2    Y3    …   YL
1                      0.2   0.6   0.7       0.9
2                      0.1   0.4   0.5       0.7
3                      0.0   0.2   0.3       0.4
.
N                      .     .     .         .
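A minimal sketch of global fixed-score thresholding, applied to the three example score rows. The function name `fixed_score` and the rule that a score equal to the threshold counts as positive are assumptions made for illustration; the slides do not state how ties are handled.

```python
# Scores for the four target columns shown in the example (Y1, Y2, Y3, YL).
scores = [
    [0.2, 0.6, 0.7, 0.9],
    [0.1, 0.4, 0.5, 0.7],
    [0.0, 0.2, 0.3, 0.4],
]


def fixed_score(scores, threshold=0.5):
    """Mark a label as positive when its score reaches one global threshold.

    Assumption: a score equal to the threshold is treated as positive.
    """
    return [[int(s >= threshold) for s in row] for row in scores]


predictions = fixed_score(scores, threshold=0.5)
for row in predictions:
    print(row)
```

The label-wise variant would use one threshold per column instead of a single value.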
Example: RCut, top 2 most relevant labels (threshold per instance)

EX   Features          Targets
     X1 X2 X3 … XF     Y1    Y2    Y3    …   YL
1                      0.2   0.6   0.7       0.9
2                      0.1   0.4   0.5       0.7
3                      0.0   0.2   0.3       0.4
.
N                      .     .     .         .
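RCut can be sketched as a per-instance top-k selection. The function name `rcut` and the tie-breaking rule (the earlier label wins) are assumptions for illustration, not taken from the slides.

```python
def rcut(scores, k=2):
    """For each instance, mark its k highest-scoring labels as positive.

    Assumption: ties are broken by label order (the earlier label wins),
    which follows from the stability of Python's sort.
    """
    predictions = []
    for row in scores:
        # Indices of the k largest scores in this row.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        predictions.append([int(j in top) for j in range(len(row))])
    return predictions


scores = [
    [0.2, 0.6, 0.7, 0.9],
    [0.1, 0.4, 0.5, 0.7],
    [0.0, 0.2, 0.3, 0.4],
]
rcut_predictions = rcut(scores, k=2)
for row in rcut_predictions:
    print(row)
```

Because the cut is by rank, every instance gets exactly k positive labels, even the third row, whose scores are all below 0.5.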
Example: MCut (threshold per instance)

EX   Features          Targets
     X1 X2 X3 … XF     Y1    Y2    Y3    …   YL
1                      0.2   0.6   0.7       0.9
2                      0.1   0.2   0.5       0.7
3                      .     .     .         .
.
N                      .     .     .         .
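The slides do not define MCut. The sketch below assumes the usual "maximum cut" reading: sort an instance's scores, find the largest gap between consecutive scores, and place the threshold at the midpoint of that gap. The function name `mcut` is illustrative.

```python
def mcut(row):
    """Return (threshold, predictions) for one instance.

    Assumption: the threshold is the midpoint of the largest gap between
    consecutive sorted scores. Needs at least two scores.
    """
    ordered = sorted(row, reverse=True)
    gaps = [ordered[i] - ordered[i + 1] for i in range(len(ordered) - 1)]
    # Position of the widest gap (the first one, if several are equal).
    i = max(range(len(gaps)), key=lambda j: gaps[j])
    threshold = (ordered[i] + ordered[i + 1]) / 2
    return threshold, [int(s > threshold) for s in row]


for row in ([0.2, 0.6, 0.7, 0.9], [0.1, 0.2, 0.5, 0.7]):
    threshold, predicted = mcut(row)
    print(round(threshold, 2), predicted)
```

Unlike RCut, the number of positive labels varies from instance to instance under this rule.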
} Context-based: cost is considered
1. Score-driven: $t = c$
2. Rate-driven (PCut): $t = R^{-1}(c)$
3. Optimal (SCut)
Applied globally or label-wise.

Example: score-driven, label-wise (threshold per label); costs shown: c = 0.5 and c = 0.2

EX   Features          Targets
     X1 X2 X3 … XF     Y1    Y2    Y3    …   YL
1                      0.3   0.2   0.1       0.0
2                      0.4   0.3   0.2       0.0
3                      0.5   0.5   0.4       0.1
.
N                      0.9   0.7   0.6   .   0.2
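A minimal sketch of score-driven, label-wise thresholding, where each label's threshold is its cost (t = c). The slide shows only the cost values 0.5 and 0.2, so the assignment of a cost to each of the four columns below is hypothetical, as is the rule that a score equal to the threshold counts as positive.

```python
# Example scores for columns Y1, Y2, Y3, YL (rows 1, 2, 3 and N).
scores = [
    [0.3, 0.2, 0.1, 0.0],
    [0.4, 0.3, 0.2, 0.0],
    [0.5, 0.5, 0.4, 0.1],
    [0.9, 0.7, 0.6, 0.2],
]

# Hypothetical per-label costs; only the values 0.5 and 0.2 appear on the slide.
costs = [0.5, 0.2, 0.5, 0.2]


def score_driven(scores, costs):
    """Use each label's cost as its threshold (t = c).

    Assumption: a score equal to the threshold is treated as positive.
    """
    return [[int(s >= c) for s, c in zip(row, costs)] for row in scores]


score_driven_predictions = score_driven(scores, costs)
for row in score_driven_predictions:
    print(row)
```

For the global variant, the same single cost would be compared against every column.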
Example: rate-driven, label-wise (threshold per label), with c = R = 0.5

EX   Features          Targets
     X1 X2 X3 … XF     Y1    Y2    Y3    …   YL
1                      0.3   0.2   0.1       0.0
2                      0.4   0.3   0.2       0.0
3                      0.5   0.5   0.4       0.1
.
N                      0.9   0.7   0.6   .   0.2
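A hedged sketch of the rate-driven rule for one label. It assumes that R(t) is the fraction of instances scored below t, so that a cost c leaves a fraction c of the instances negative and the rest positive. The slides do not spell out this convention, and the function name `rate_driven_column` is illustrative.

```python
def rate_driven_column(column, c):
    """Label the lowest-scoring fraction c of the instances as negative.

    Assumptions: R(t) is the fraction of scores below t, the count of
    negatives is round(c * n), and ties are broken by instance order.
    """
    n = len(column)
    n_negative = round(c * n)
    # Instance indices ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: column[i])
    negatives = set(order[:n_negative])
    return [0 if i in negatives else 1 for i in range(n)]


y1_scores = [0.3, 0.4, 0.5, 0.9]  # column Y1 of the example
print(rate_driven_column(y1_scores, c=0.5))
```

The rule works on ranks rather than raw scores, so it does not depend on the scores being calibrated.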
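For the third context-based method, Optimal (SCut), one plausible reading is to pick, for each label, the threshold that minimises the cost-weighted loss on the training data. The sketch below assumes the loss 2(c·FP + (1 − c)·FN)/n, where c weights false positives. The true labels are hypothetical, since the slides show scores only, and the function names are illustrative.

```python
def cost_loss(column, labels, threshold, c):
    """Cost-weighted loss of one label at a given threshold.

    Assumption: c weights false positives and (1 - c) false negatives;
    the factor 2 makes the loss equal the error rate at c = 0.5.
    """
    fp = sum(1 for s, y in zip(column, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(column, labels) if s < threshold and y == 1)
    return 2 * (c * fp + (1 - c) * fn) / len(column)


def optimal_threshold(column, labels, c):
    """Try every observed score (plus 'predict nothing') as the threshold."""
    candidates = sorted(set(column)) + [float("inf")]
    # min() keeps the lowest threshold when several give the same loss.
    return min(candidates, key=lambda t: cost_loss(column, labels, t, c))


y1_scores = [0.3, 0.4, 0.5, 0.9]  # column Y1 of the example
y1_labels = [0, 1, 0, 1]          # hypothetical ground truth
print(optimal_threshold(y1_scores, y1_labels, c=0.5))
print(optimal_threshold(y1_scores, y1_labels, c=0.8))
```

A threshold tuned this way fits the training data, so its loss on new data may be higher.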
} Datasets: 6 multi-label datasets from Mulan
◦ Names: Enron, Birds, Yeast, Flags, Scene, Emotions
◦ Number of labels: 6 to 53
} Trained model: binary relevance (BR) + logistic regression
} Different thresholding methods
} Different settings: globally, label-wise, instance-wise
} Misclassification cost
◦ Case 1: one uniform cost
◦ Case 2: L uniform costs; their average is used for the global threshold
} Evaluation: cost curves for multi-label classification
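A minimal sketch of how a multi-label cost curve could be computed: sweep the cost c over [0, 1], choose a threshold with a given method, and average the cost-weighted loss over labels. The loss 2(c·FP + (1 − c)·FN)/n, the toy scores and labels, and all function names are assumptions for illustration, not the setup used in the experiments.

```python
def cost_loss(column, labels, threshold, c):
    """Cost-weighted loss of one label (assumed form, see lead-in)."""
    fp = sum(1 for s, y in zip(column, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(column, labels) if s < threshold and y == 1)
    return 2 * (c * fp + (1 - c) * fn) / len(column)


def cost_curve(score_columns, label_columns, choose_threshold, costs):
    """Average loss over all labels at each cost value."""
    curve = []
    for c in costs:
        t = choose_threshold(c)
        losses = [
            cost_loss(col, ys, t, c)
            for col, ys in zip(score_columns, label_columns)
        ]
        curve.append(sum(losses) / len(losses))
    return curve


# Hypothetical toy data: two labels, four instances each.
score_columns = [[0.3, 0.4, 0.5, 0.9], [0.2, 0.3, 0.5, 0.7]]
label_columns = [[0, 1, 0, 1], [0, 0, 1, 1]]

costs = [i / 10 for i in range(11)]
fixed = cost_curve(score_columns, label_columns, lambda c: 0.5, costs)
score_driven = cost_curve(score_columns, label_columns, lambda c: c, costs)

for c, f, s in zip(costs, fixed, score_driven):
    print(c, round(f, 3), round(s, 3))
```

Plotting each list against `costs` gives one curve per thresholding method. The two curves meet at c = 0.5, where both methods use the same threshold.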
[Figure: cost curves (x-axis: Cost; y-axis: Loss) for the Fixed, Score-driven, Rate-driven and Optimal methods.]
[Figure: two panels, "Cost curves for equal costs" and "Scatter plots for unequal costs" (x-axis: Average labels costs; y-axis: Average loss over all labels).]
[Figure: "Enron Dataset: Global Threshold", 53 targets (x-axis: Average Cost; y-axis: Loss); methods: Score-driven, Optimal, Fixed. Accompanying histogram of scores (x-axis: Scores; y-axis: Frequencies).]
[Figure: histograms of the cost (x-axis: cost; y-axis: Frequency). Panel titles: "1 uniform random variable", "uniform random variables", "5 uniform random variables", "53 uniform random variables".]
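The panels appear to show how the average of L independent uniform costs concentrates around 0.5 as L grows. A small simulation sketch of that effect follows. The sample size, the seed and the function name `average_cost_samples` are arbitrary choices, not taken from the slides.

```python
import random
import statistics


def average_cost_samples(n_labels, n_samples=10000, seed=0):
    """Draw n_samples averages of n_labels independent Uniform(0, 1) costs."""
    rng = random.Random(seed)
    return [
        sum(rng.random() for _ in range(n_labels)) / n_labels
        for _ in range(n_samples)
    ]


for n_labels in (1, 5, 53):
    samples = average_cost_samples(n_labels)
    # The standard deviation of the mean of L uniforms is sqrt(1 / (12 L)).
    print(
        n_labels,
        round(statistics.mean(samples), 3),
        round(statistics.pstdev(samples), 3),
        round((1 / (12 * n_labels)) ** 0.5, 3),
    )
```

With 53 labels the average cost rarely strays far from 0.5, so a global threshold driven by the average cost would be expected to behave much like a fixed threshold of 0.5.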
[Figure: two panels, "Enron Dataset: Label-wise Thresholds" and "Yeast Dataset: Label-wise Thresholds" (x-axis: Average Cost; y-axis: Loss); methods: Fixed, Score-driven, Rate-driven, Optimal.]
[Figure: "Enron Dataset: Instance-wise Thresholds" (x-axis: Average Cost; y-axis: Loss); methods: MCut, RCut.]
[Figure: bar charts of Average Loss on Enron, Birds, Yeast, Flags, Emotions and Scene when label costs are different:
◦ One global threshold: Score_driven, Fixed_score, Optimal_train
◦ Label-wise thresholds: Score_driven, Rate_driven, Optimal_train
◦ Instance-wise thresholds: Rcut, Mcut]
} In the paper:
◦ A structured presentation of multi-label thresholding methods.
◦ A link to binary classification thresholding methods.
◦ Comparative experimental results on the performance of different thresholding methods.
} Next time:
◦ Depth, i.e.
  ▪ Study a possible link between thresholds and evaluation metrics
  ▪ Non-uniform cost
  ▪ Cost-based F-measure
  ▪ A cost per data point
◦ What do you think?