page 1/9
UPMC at MediaEval 2016: Retrieving Diverse Social Images Task
Sabrina Tollari
Université Pierre et Marie Curie (UPMC), Paris 6, UMR CNRS LIP6
MediaEval 2016 Workshop, October 21st, 2016, Hilversum, Netherlands
page 2/9
Framework
Strategy: first re-rank to improve relevance, then cluster using agglomerative hierarchical clustering (AHC)
Baseline 1 1 1 1 1 1 1
Step 1: Re-rank baseline to improve relevance
1 2 3 4 5 6 7
Step 2: Cluster results using hierarchical clustering
6 5 4 2 1 7 3
Step 3: Sort the images within each cluster and sort the clusters using the image ranks from Step 1
1 2 3 7 4 5 6
Step 4: Re-rank the results alternating images from clusters
1 3 4 5 2 7 6
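Steps 2-4 above can be sketched as follows. This is a minimal illustration of cluster-based round-robin diversification, not the authors' actual implementation; the document ids and cluster labels are made up for the example.

```python
def diversify(ranked_ids, labels):
    """ranked_ids: doc ids sorted by relevance (Step 1).
    labels: cluster label per doc, aligned with ranked_ids (Step 2)."""
    # Step 3: group docs by cluster, preserving relevance order inside each
    clusters = {}
    for doc, lab in zip(ranked_ids, labels):
        clusters.setdefault(lab, []).append(doc)
    # sort clusters by the rank of their best (highest-ranked) document
    ordered = sorted(clusters.values(),
                     key=lambda docs: ranked_ids.index(docs[0]))
    # Step 4: round-robin, taking one document per cluster per pass
    result, depth = [], 0
    while len(result) < len(ranked_ids):
        for docs in ordered:
            if depth < len(docs):
                result.append(docs[depth])
        depth += 1
    return result

# Example: 7 docs in relevance order, three hypothetical clusters A/B/C
# diversify([1, 2, 3, 4, 5, 6, 7], ['A', 'A', 'B', 'B', 'B', 'C', 'A'])
```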
page 3/9
Framework
Strategy: first re-rank to improve relevance, then cluster using agglomerative hierarchical clustering (AHC)
Is this strategy a good idea?
Does the relevance of the baseline affect the final results?
[Figure: F1@20, P@20, and CR@20 vs. number of clusters, comparing the baseline, Only Step 1 (VSM(ttu)), AHC without Step 1, and AHC with Step 1, using VSM(ttu)+AHCCompl(cred)]
VSM(ttu): Vector Space Model using title (t), tags (t), username (u)
page 4/9
Influence of the number of documents
Is it worth taking the time to cluster 300 results online in order to improve the F1@20 of the first 20 documents?
[Figure: F1@20 vs. number of clusters, comparing the baseline, VSM(ttu), AHC with nbDocs=150, and AHC with nbDocs=300, using VSM(ttu)+AHCCompl(cred); companion panels show P@20 and CR@20]
Similar best F1@20 value
But the peak of the curve is wider with 300 documents
⇒ More chance of finding the best number of clusters for the testset
Most of the time, around 50 clusters gives the best F1@20 for 300 documents
page 5/9
Best results using only one feature
Which feature works best with our method (on the devset)?
[Figure: F1@20 (devset) vs. number of clusters, comparing the baseline, VSM(ttu), cred, username, text only (tdtu), visual only (ScalCol), visual only (cnn_ad), and random, using VSM(ttu)+AHCCompl(feature); companion panels show P@20 and CR@20]
On devset, we tested several features
- We use the credibility descriptors (cred) as a vector input for AHC
⇒ cred gives the best results among single-feature runs (not confirmed on testset)
What is the meaning of this feature for diversity?
- One vector per user, not per document
- cred is better than grouping documents by username (on devset)
- cred and username are better than text only (on devset)
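The clustering step (AHCCompl in the legends) is complete-linkage agglomerative clustering over feature vectors. The following is a stdlib-only sketch under stated assumptions: the slides do not specify the distance measure or implementation, so Euclidean distance and a fixed target cluster count are used here purely for illustration.

```python
import math

def ahc_complete(vectors, n_clusters):
    """Merge clusters greedily until n_clusters remain.
    Complete linkage: the distance between two clusters is the
    MAXIMUM pairwise distance between their members."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest cluster pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(math.dist(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Example: two well-separated groups of points collapse into two clusters
# ahc_complete([(0, 0), (0, 1), (10, 0), (10, 1)], 2)
```

This naive O(n^3) version is fine for 150-300 results per query; a production system would use an optimized library routine instead.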
page 6/9
Fusion of similarities
How can we use different features to improve diversity?
⇒ Several possible ways; we choose to fuse similarities
Let sim(x, y) be a similarity between documents x and y.
Linear fusion, with f1, f2 two features and τ ∈ [0, 1]:
  sim_Linear(f1,f2,τ)(x, y) = τ · sim_f1(x, y) + (1 − τ) · sim_f2(x, y)
Weighted-max fusion:
  sim_WMax(f1,w1,...,fn,wn)(x, y) = max_{i ∈ {1,...,n}} w_i · sim_fi(x, y)
with n the number of features and w_i the weight of feature f_i, such that Σ_{i=1}^{n} w_i = 1.
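Both operators translate directly into code. A minimal sketch, assuming the per-feature similarities have already been computed; the numeric values in the example are illustrative only.

```python
def sim_linear(sim_f1, sim_f2, tau):
    """Linear fusion: tau * sim_f1 + (1 - tau) * sim_f2, tau in [0, 1]."""
    return tau * sim_f1 + (1 - tau) * sim_f2

def sim_wmax(sims, weights):
    """Weighted-max fusion: max_i w_i * sim_fi, with sum(weights) == 1."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return max(w * s for w, s in zip(weights, sims))

# Example: fuse a text and a visual similarity with tau = 0.02,
# and three similarities with WMax weights summing to 1
# sim_linear(0.5, 1.0, 0.02)
# sim_wmax([0.5, 0.2, 1.0], [0.014, 0.97, 0.016])
```

Note the different behaviour: linear fusion blends all features for every pair, while WMax lets whichever weighted feature dominates decide the similarity of that pair.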
page 7/9
Best fusion results
[Figure: F1@20 (devset) vs. number of clusters, comparing the baseline, VSM(ttu), cred, WMax(tdtu,0.014,ScalCol,0.97,cred,0.016), Linear(tdtu,ScalCol,0.02), tdtu, and ScalCol, using VSM(ttu)+AHCCompl(feature); companion panels show P@20 and CR@20]
On devset, we ran many experiments to optimise the fusion weights
The linear fusion of text and visual similarities gave much better results than text only (tdtu) or visual only (ScalCol)
But the linear fusion gave a lower result than cred
Finally, the best WMax fusion gave a slightly better result than cred
page 8/9
Run results
Run        Step 1   Steps 2-4: AHC features    devset F1@20   testset F1@20
baseline   -        -                          0.467 (ref.)   -
run 1      No       visual                     0.498 (+7%)    0.430
run 2      Yes      text                       0.569 (+22%)   0.552
run 3      Yes      Linear(text,visual)        0.582 (+25%)   0.553
run 4      Yes      cred                       0.585 (+25%)   0.543
run 5      Yes      WMax(text,visual,cred)     0.588 (+26%)   0.544
Number of queries                              70             64
On testset:
The text feature gives better results than the cred feature
The best result is obtained with the linear fusion of visual and textual feature similarities, not with WMax
The F1@20 scores for runs 2 to 5 are very close (≈ 0.55)
⇒ Difficult to draw a reliable conclusion
page 9/9
Conclusion and discussion
On this benchmark and with our framework
Is it worth taking the time to cluster 300 results?
- To improve F1@20? No.
- To ensure a good F1@20? Yes.
On devset:
- The credibility descriptors (cred) gave very good results
  - Why? What is the meaning of these descriptors for diversity?
  - Results not as good on testset
- The WMax operator gave the best results
  - Not confirmed on testset; maybe an overfitting problem
- The linear fusion between text and visual gave good results
  - Confirmed on testset
Thank you for your attention