sac treck 2008
![Page 1: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/1.jpg)
the effect of correlation coefficients on
communities of recommenders
neal lathia, stephen hailes, licia capra
department of computer science
university college london
ACM SAC TRECK, Fortaleza, Brazil: March 2008
Trust, Recommendations, Evidence and other Collaboration Know-how
![Page 2: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/2.jpg)
recommender systems:
built on collaboration between users
![Page 3: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/3.jpg)
collaborative filtering research: design methods to solve problems
1. accuracy, coverage
2. data sparsity, cold-start
3. incorporating tag knowledge
for example,
![Page 4: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/4.jpg)
… a method to classify content correctly
data → intelligent process → predicted ratings
our focus: k-nearest neighbours (kNN)
![Page 5: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/5.jpg)
how do we model kNN collaborative filtering?
![Page 6: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/6.jpg)
a graph of cooperating users
me
nodes = users
links = weighted according to similarity
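A minimal sketch of this graph view, assuming the link weights are already available as a dense matrix `w` where `w[a][b]` is the similarity between users `a` and `b`. The class and method names are hypothetical; the slides do not prescribe a data structure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: users are nodes, w[a][b] is the weight of the link a-b.
class UserGraph {

    // Returns the ids of the k users most similar to `me`, highest weight first.
    static List<Integer> nearestNeighbours(double[][] w, int me, int k) {
        List<Integer> others = new ArrayList<>();
        for (int b = 0; b < w.length; b++) {
            if (b != me) {
                others.add(b);
            }
        }
        // Sort by descending link weight from `me`.
        others.sort((x, y) -> Double.compare(w[me][y], w[me][x]));
        return new ArrayList<>(others.subList(0, Math.min(k, others.size())));
    }
}
```

Calling `UserGraph.nearestNeighbours(w, 0, 10)` would return the ten strongest links for user 0, which is the neighbourhood a kNN recommender would consult.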
![Page 7: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/7.jpg)
accuracy, coverage
to answer this question, we need to find the optimal weighting:
the best similarity measure for the dataset, from the many available:
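$$w_{a,b} = \frac{|R_a \cap R_b|}{|R_a \cup R_b|}$$

$$w_{a,b} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$

$$w_{a,b} = \frac{1}{N} \cdot \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$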
and there are more still…
$$w_{a,b} = \frac{\sum_i (r_{a,i} - 2.5)(r_{b,i} - 2.5)}{\sqrt{\sum_i (r_{a,i} - 2.5)^2 \sum_i (r_{b,i} - 2.5)^2}}$$
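A sketch of three of the measures on this slide over sparse profiles: the co-rated proportion, the Pearson correlation, and the Pearson variant that centres ratings on a fixed midpoint such as 2.5. The names are hypothetical. Two choices are assumptions rather than something the slides state: each user's mean is taken over all of their ratings, and a zero denominator yields a similarity of 0.0.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; a profile maps item id -> rating.
class SimilarityMeasures {

    // Items that both users have rated.
    static Set<Integer> coRated(Map<Integer, Double> a, Map<Integer, Double> b) {
        Set<Integer> shared = new HashSet<>(a.keySet());
        shared.retainAll(b.keySet());
        return shared;
    }

    // Co-rated proportion: shared items over all items rated by either user.
    static double coRatedProportion(Map<Integer, Double> a, Map<Integer, Double> b) {
        Set<Integer> union = new HashSet<>(a.keySet());
        union.addAll(b.keySet());
        return union.isEmpty() ? 0.0 : (double) coRated(a, b).size() / union.size();
    }

    // Mean of all ratings in a profile (assumption: not restricted to co-rated items).
    static double mean(Map<Integer, Double> profile) {
        double sum = 0.0;
        for (double r : profile.values()) {
            sum += r;
        }
        return profile.isEmpty() ? 0.0 : sum / profile.size();
    }

    // Correlation of deviations from the given centres, over co-rated items.
    // Returns 0.0 when the denominator vanishes (assumption).
    static double centredCorrelation(Map<Integer, Double> a, Map<Integer, Double> b,
                                     double centreA, double centreB) {
        double num = 0.0, sqA = 0.0, sqB = 0.0;
        for (Integer i : coRated(a, b)) {
            double da = a.get(i) - centreA;
            double db = b.get(i) - centreB;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : num / den;
    }

    // Pearson: deviations are measured from each user's own mean.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        return centredCorrelation(a, b, mean(a), mean(b));
    }

    // Midpoint-centred variant: deviations are measured from a fixed value, e.g. 2.5.
    static double constrainedPearson(Map<Integer, Double> a, Map<Integer, Double> b,
                                     double midpoint) {
        return centredCorrelation(a, b, midpoint, midpoint);
    }
}
```

Sharing one `centredCorrelation` helper makes the difference between the two Pearson variants explicit: only the centre that ratings are compared against changes.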
![Page 8: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/8.jpg)
concordance: proportion of agreement
Somers’ d:

$$w_{a,b} = \frac{C - D}{N - T}$$

+0.5, +3.0: concordant (C)
-1.5, +1.5: discordant (D)
+1.5, +/-?: tied (T)
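A sketch of the concordance count, assuming each pair compares the two users' deviations from their own mean rating on one item: same sign is concordant, opposite sign is discordant, and a zero deviation on either side is treated as tied. The tie rule, the handling of an all-tied profile, and the names are assumptions.

```java
// Hypothetical sketch of a concordance-based similarity: (C - D) / (N - T).
class Concordance {

    // a[i], b[i]: the two users' ratings of item i; meanA, meanB: their mean ratings.
    static double somersD(double[] a, double meanA, double[] b, double meanB) {
        int concordant = 0, discordant = 0, tied = 0;
        for (int i = 0; i < a.length; i++) {
            double product = (a[i] - meanA) * (b[i] - meanB);
            if (product > 0) {
                concordant++;        // both above or both below their means
            } else if (product < 0) {
                discordant++;        // one above, one below
            } else {
                tied++;              // at least one rating equals its mean
            }
        }
        int n = a.length;
        // Assumption: no usable pairs means no evidence, so similarity 0.0.
        return n == tied ? 0.0 : (double) (concordant - discordant) / (n - tied);
    }
}
```

With the three pairs listed above entered as deviations (means of 0), one pair is concordant, one discordant and one tied, so the measure is (1 - 1) / (3 - 1) = 0.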
![Page 9: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/9.jpg)
community view of the graph:
(a very small example)
[figure: graph centred on "me"; each link is labelled with a similarity weight, ranging from -0.99 to 0.99]
![Page 10: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/10.jpg)
or, put another way:
(a very small example)
[figure: the same graph centred on "me"; each link is labelled good, bad, or none]
![Page 11: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/11.jpg)
what is the best way of generating the graph?
![Page 12: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/12.jpg)
like this?
(a very small example)
[figure: graph centred on "me"; one possible labelling of the links as good, bad, or none]
![Page 13: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/13.jpg)
or like this?
(a very small example)
[figure: graph centred on "me"; an alternative labelling of the links as good, bad, or none]
![Page 14: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/14.jpg)
similarity values depend on the method used:
there is no agreement between measures
my profile: [2][3][1][5][3]
neighbour profile: [4][1][3][2][3]

| measure | similarity | reading |
|---|---|---|
| pearson | -0.50 | bad |
| weighted-pearson | -0.05 | near zero |
| cosine angle | 0.76 | good |
| co-rated proportion | 1.00 | very good |
| concordance | -0.06 | near zero |
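The disagreement can be checked on the two profiles from this slide. The sketch below computes Pearson, the cosine angle, and a weighted Pearson. The weighting used here, scaling by min(n, 50) / 50 for n co-rated items, is an assumption: it is a common significance weighting and it happens to give -0.05 for this pair, but the slides do not spell it out.

```java
// Hypothetical sketch; both profiles rate the same five items.
class ProfileExample {

    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    static double pearson(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b);
        double num = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            num += (a[i] - meanA) * (b[i] - meanB);
            sqA += (a[i] - meanA) * (a[i] - meanA);
            sqB += (b[i] - meanB) * (b[i] - meanB);
        }
        return num / Math.sqrt(sqA * sqB);
    }

    // Assumed significance weighting: min(n, 50) / 50.
    static double weightedPearson(double[] a, double[] b) {
        return pearson(a, b) * Math.min(a.length, 50) / 50.0;
    }

    // Cosine of the angle between the raw rating vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sqA += a[i] * a[i];
            sqB += b[i] * b[i];
        }
        return dot / Math.sqrt(sqA * sqB);
    }

    public static void main(String[] args) {
        double[] me = {2, 3, 1, 5, 3};
        double[] neighbour = {4, 1, 3, 2, 3};
        System.out.println("pearson          " + pearson(me, neighbour));
        System.out.println("weighted pearson " + weightedPearson(me, neighbour));
        System.out.println("cosine angle     " + cosine(me, neighbour));
    }
}
```

Rounded to two decimals, the three values are -0.50, -0.05 and 0.76, matching the corresponding rows for this pair of profiles.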
![Page 15: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/15.jpg)
nodes = users
links = weighted according to similarity
each method will change the distribution of similarity across the graph
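One way to compare measures is to bin every link weight in the graph and look at the proportion of links per bin. A minimal sketch, assuming weights lie in [-1.0, 1.0] and that the bin width divides that range evenly; the names are hypothetical.

```java
// Hypothetical sketch: proportion of link weights falling in each bin of [-1.0, 1.0].
class SimilarityHistogram {

    static double[] proportions(double[] weights, double binWidth) {
        int bins = (int) Math.round(2.0 / binWidth);
        double[] proportion = new double[bins];
        for (double w : weights) {
            int bin = (int) Math.floor((w + 1.0) / binWidth);
            // Clamp so that w = 1.0 (and any rounding overshoot) lands in the last bin.
            bin = Math.max(0, Math.min(bins - 1, bin));
            proportion[bin] += 1.0 / weights.length;
        }
        return proportion;
    }
}
```

A bin width of 0.05 gives 40 bins; running this once per similarity measure over the same set of user pairs would give directly comparable distributions.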
![Page 16: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/16.jpg)
… the pearson distribution
intelligent process
[figure: "Pearson Distribution" histogram; x-axis: Range (similarity bins of width 0.05, from -1.0 to 1.0); y-axis: Proportion (0 to 0.09)]
![Page 17: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/17.jpg)
… the modified pearson distributions
weighted-PCC, constrained-PCC
[figure: "Modified Pearson Distributions" histogram; series: Weighted-PCC, Constrained-PCC; x-axis: Range (similarity bins of width 0.05, from -1.0 to 1.0); y-axis: Proportion (0 to 0.35)]
![Page 18: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/18.jpg)
… and other measures
intelligent process
[figure: "Other Distributions" histogram; series: Co-Rated, Somers, VSS; x-axis: Range (similarity bins of width 0.05, from -1.0 to 1.0); y-axis: Proportion (0 to 0.7)]
somers’ d, co-rated, cosine angle
![Page 19: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/19.jpg)
an experiment with random numbers
![Page 20: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/20.jpg)
what happens if we do this?
me
java.util.Random r = new java.util.Random();
for (int i = 0; i < similarity.length; i++) {      // for all neighbours i
    similarity[i] = (r.nextDouble() * 2.0) - 1.0;  // uniform in [-1.0, 1.0)
}
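A self-contained version of the same idea, as a sketch. It also covers the other two baselines named in the results, a uniform draw from a narrower range such as R(0.5, 1.0) and a constant weight. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch: similarity weights that ignore the rating data entirely.
class RandomSimilarity {

    // One uniform random weight in [low, high) per neighbour.
    static double[] uniform(int neighbours, double low, double high, Random r) {
        double[] similarity = new double[neighbours];
        for (int i = 0; i < neighbours; i++) {
            similarity[i] = low + r.nextDouble() * (high - low);
        }
        return similarity;
    }

    // The same weight for every neighbour.
    static double[] constant(int neighbours, double value) {
        double[] similarity = new double[neighbours];
        Arrays.fill(similarity, value);
        return similarity;
    }
}
```

For example, `RandomSimilarity.uniform(n, -1.0, 1.0, new Random())` corresponds to R(-1.0, 1.0), and `RandomSimilarity.constant(n, 1.0)` to Constant(1.0). Passing a seeded `Random` makes a run repeatable.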
![Page 21: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/21.jpg)
| Neighborhood | Co Rated | Somers’ d | PCC | wPCC | R(0.5, 1.0) | Constant(1.0) | R(-1.0, 1.0) |
|---|---|---|---|---|---|---|---|
| 1 | 0.9449 | 0.9492 | 1.1150 | 0.9596 | 1.0665 | 1.0406 | 1.0341 |
| 10 | 0.8498 | 0.8355 | 1.0455 | 0.8277 | 0.9595 | 0.9495 | 0.9689 |
| 30 | 0.7979 | 0.7931 | 0.9464 | 0.7847 | 0.8903 | 0.9108 | 0.8848 |
| 50 | 0.7852 | 0.7817 | 0.9007 | 0.7733 | 0.8584 | 0.8922 | 0.8498 |
| 100 | 0.7759 | 0.7728 | 0.8136 | 0.7647 | 0.8222 | 0.8511 | 0.8153 |
| 153 | 0.7726 | 0.7727 | 0.7817 | 0.7638 | 0.8053 | 0.8243 | 0.8024 |
| 229 | 0.7717 | 0.7771 | 0.7716 | 0.7679 | 0.7919 | 0.7992 | 0.8058 |
| 459 | 0.7718 | 0.7992 | 0.8073 | 0.8025 | 0.7773 | 0.7769 | 0.7811 |
accuracy:

$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$
…cross-validation results in paper
movielens u1 subset…
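The neighbourhood size in these results is the number of neighbours whose ratings feed each prediction. The slides do not give the prediction formula. The sketch below assumes a common choice: the target user's mean plus the similarity-weighted average of the neighbours' deviations from their own means. All names are hypothetical, and a missing neighbour rating is encoded as NaN for illustration.

```java
// Hypothetical sketch of a mean-centred, similarity-weighted kNN prediction.
class KnnPredictor {

    // meanA: the target user's mean rating.
    // weights[j], neighbourRatings[j], neighbourMeans[j]: neighbour j's similarity,
    // rating of the target item (NaN if unrated), and mean rating.
    // Returns NaN when no neighbour can contribute, i.e. the prediction is uncovered.
    static double predict(double meanA, double[] weights,
                          double[] neighbourRatings, double[] neighbourMeans) {
        double num = 0.0, den = 0.0;
        for (int j = 0; j < weights.length; j++) {
            if (Double.isNaN(neighbourRatings[j])) {
                continue; // this neighbour has not rated the item
            }
            num += weights[j] * (neighbourRatings[j] - neighbourMeans[j]);
            den += Math.abs(weights[j]);
        }
        return den == 0.0 ? Double.NaN : meanA + num / den;
    }
}
```

Under this assumption the similarity measure only enters through `weights`, which is why swapping in random or constant weights is a meaningful baseline.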
![Page 22: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/22.jpg)
coverage:

$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$
…cross-validation results in paper
movielens u1 subset…
| Neighborhood | Co Rated | Somers’ d | PCC | wPCC | Oracle |
|---|---|---|---|---|---|
| 1 | 0.67795 | 0.57165 | 0.96725 | 0.61375 | 0.00495 |
| 10 | 0.15455 | 0.0999 | 0.80515 | 0.1114 | 0.00495 |
| 30 | 0.0512 | 0.0407 | 0.57225 | 0.04135 | 0.00495 |
| 50 | 0.03065 | 0.0266 | 0.3641 | 0.0251 | 0.00495 |
| 100 | 0.01515 | 0.01645 | 0.08345 | 0.01485 | 0.00495 |
| 153 | 0.00945 | 0.0122 | 0.0273 | 0.01135 | 0.00495 |
| 229 | 0.00715 | 0.00965 | 0.01165 | 0.00915 | 0.00495 |
| 459 | 0.00495 | 0.0054 | 0.00495 | 0.00495 | 0.00495 |
(best coverage when all of community used)
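The coverage figure here is the share of requested predictions that could not be made, so lower is better. A minimal sketch, assuming an uncovered prediction is encoded as NaN (an illustrative convention, not one taken from the slides):

```java
// Hypothetical sketch: fraction of predictions that could not be generated.
class CoverageMetric {

    static double uncoveredFraction(double[] predictions) {
        if (predictions.length == 0) {
            return 0.0; // assumption: nothing requested, nothing uncovered
        }
        int uncovered = 0;
        for (double p : predictions) {
            if (Double.isNaN(p)) {
                uncovered++;
            }
        }
        return (double) uncovered / predictions.length;
    }
}
```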
![Page 23: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/23.jpg)
why do we get these results?
![Page 24: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/24.jpg)
a) our error measures are not good enough?

$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$

$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$
J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. In ACM Transactions on Information Systems, volume 22, pages 5–53. ACM Press, 2004.
S.M. McNee, J. Riedl, and J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.
$$\mathrm{RMSE} = \sqrt{\frac{\sum (p_{a,i} - r_{a,i})^2}{N}}$$
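Both error measures are short to compute once predictions and true ratings are paired up. A sketch with hypothetical names, assuming the two arrays are aligned, non-empty, and contain only covered predictions:

```java
// Hypothetical sketch: mean absolute error and root mean squared error.
class ErrorMeasures {

    static double mae(double[] ratings, double[] predictions) {
        double sum = 0.0;
        for (int i = 0; i < ratings.length; i++) {
            sum += Math.abs(predictions[i] - ratings[i]);
        }
        return sum / ratings.length;
    }

    static double rmse(double[] ratings, double[] predictions) {
        double sum = 0.0;
        for (int i = 0; i < ratings.length; i++) {
            double error = predictions[i] - ratings[i];
            sum += error * error;
        }
        return Math.sqrt(sum / ratings.length);
    }
}
```

For ratings {4, 2, 5} and predictions {3, 2, 3} the absolute errors are 1, 0 and 2, so MAE is 1.0, while RMSE is the square root of 5/3 and so penalises the larger error more heavily.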
![Page 25: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/25.jpg)
b) is there something wrong with the dataset?
![Page 26: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/26.jpg)
c) is user-similarity not strong enough to capture the best recommender relationships in the graph?
![Page 27: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/27.jpg)
one proposal…
N. Lathia, S. Hailes, L. Capra. Trust-Based Collaborative Filtering. To appear in IFIPTM 2008: Joint iTrust and PST Conferences on Privacy, Trust Management and Security. Trondheim, Norway. June 2008.
is modelling filtering as a trust-management problem a potential solution?
once we do that, more questions arise…
![Page 28: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/28.jpg)
what other graph properties emerge from kNN collaborative filtering?
how does the graph evolve over time?
current work
N. Lathia, S. Hailes, L. Capra. Evolving Communities of Recommenders: A Temporal Evaluation. Research Note RN/08/01, Department of Computer Science, University College London. Under Submission.
N. Lathia, S. Hailes, L. Capra. kNN User Filtering: A Temporal Implicit Social Network. Current Work.
![Page 29: SAC TRECK 2008](https://reader036.vdocuments.net/reader036/viewer/2022062616/548ffb79b47959d3248b45bb/html5/thumbnails/29.jpg)
read more: http://mobblog.cs.ucl.ac.uk
trust, recommendations, …
questions?