Milan Vojnović, MSRC, Systems and Networking: Tagging done by YOU


Page 1:

Milan Vojnović
MSRC, Systems and Networking

Tagging done by YOU

Page 2:

Thanks

• TagBooster project: Dinan Gunawardena, James Cruise (U Cambridge), Peter Marbach (U Toronto), Fabian Suchanek (MPI)
• Product groups: O14 Sharepoint Communities, Tagspace, Officelabs
• MSR Tagging Summit
• TagBooster User Study: Nick Duffield, John Mulgrew, Andy Slowey
• This talk: Abi, Alex, Chris, Peter, Stephen

Page 3:

Social tagging in web2.0

Page 4:

Why tag?

Page 5:

• Tagging: what and why

• Tag suggestions

• Conclusion

Page 6:

In this talk, we’ll find relations among the following: f, g, x, y

Page 7:

Discover, filter, share

Page 8:

Faceted browsing

[Figure: the items BBC news, Michael Palin, BBC radio and BBC shop listed under the tag "bbc", then listed again with the tags "bbc" and "palin"]

Page 9:

Tagging vs. traditional classification

• Traditional classification
– Pre-defined vocabulary
– Structured
– Done by authors/librarians
– Non-trivial task

• Social tagging
– Use any words
– No structure
– Done by anyone
– Easy

Page 10:

Systems with controlled vocabulary

Page 11:

Social tagging challenges

• Vocabulary evolution
– Filtering tags, tag suggestions, tagging metaphors
– Uncontrolled vocabulary: scalable, mitigate vocabulary problem, but tag noise

• User interface design
– Tagcloud, tag clustering

• Cold start
– Lack of prior knowledge about tags for an object
– Participation incentive

• Scale
– More tagging events, easier filtering

• Making use of tags
– Related tags for navigation, expertise tracking, tag meta-data for search, scoped rankings of items, faceted browsing

Page 12:

TagBooster User Study

Sept-Oct 07

• 4000+ participants
• Tagging web pages
• Questionnaire

Page 13:

Tagging done by YOU

Page 14:

Analogous to voting

music soul london

Tags: music, soul, jazz, london, black, artist, british, singer

Feedback!

Page 15:

Why suggest tags?

Positive / Negative

• Hiding users’ true preference over tags

• I picked a suggested tag that I now can’t remember

• I tend to overuse the same tags over and over (“exploit” vs. “explore”)

• Less effort (cognitive, typing)

• Encourage users to use tags (cold start)

• Conformance in vocabulary

Page 16:

Top Popular: classical suggestion method

# sel.  tag
174     music
110     radio
96      internet radio
77      online radio
49      last.fm
40      online music
34      fm
33      streaming music
31      streaming
28      last fm
22      web radio
19      scrobbling
18      lastfm
12      listen
12      new music
10      mp3
10      stream
9       streaming radio

Suggested tags: radio, music, online radio, internet radio
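The Top Popular rule can be sketched in a few lines of Python: suggest the s tags with the highest selection counts so far. The function name `top_popular` and the alphabetical tie-break are assumptions made for illustration; the counts are the first rows of the slide's table.

```python
def top_popular(selection_counts, s):
    """Return the s tags with the highest selection counts.

    Ties are broken alphabetically (an assumption; the slide does not say).
    """
    ranked = sorted(selection_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:s]]


# First rows of the table on this slide.
counts = {
    "music": 174,
    "radio": 110,
    "internet radio": 96,
    "online radio": 77,
    "last.fm": 49,
    "online music": 40,
}

suggested = top_popular(counts, 4)
print(suggested)  # ['music', 'radio', 'internet radio', 'online radio']
```

The result is the same set of four tags as the slide's suggestion, listed here in order of count.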

Page 17:

Users’ generation of tags

[Figure: a user picks tags such as "singer" and "artist" from the set of all tags (music, jazz, Black, british, rehab, London, soul) and from the suggested tags (music, jazz, british, singer, soul)]

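Page 18:

Simple user model

[Figure: the set of all tags (music, jazz, Black, british, rehab, London, soul, singer, artist) and the suggested tags (music, jazz, british, singer, soul)]

• With probability 1-p (no imitation): the user selects tag i from the set of all tags with probability r_i
• With probability p (imitation): the user selects tag i from the suggested tags with probability proportional to r_i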
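A minimal simulation sketch of this user model, assuming that an imitating user picks among the suggested tags with probability proportional to the true preferences r_i, and a non-imitating user picks among all tags with probability r_i. The function name `select_tag` and the preference values in the demo are hypothetical.

```python
import random


def select_tag(r, suggested, p, rng):
    """Draw one tag selection under the assumed user model.

    r         -- dict mapping tag -> true preference r_i (positive, sums to 1)
    suggested -- list of currently suggested tags (all present in r)
    p         -- imitation probability
    rng       -- random.Random instance, passed in for reproducibility
    """
    if suggested and rng.random() < p:
        pool = list(suggested)  # imitation: restrict to the suggested tags
    else:
        pool = list(r)          # no imitation: choose among all tags
    weights = [r[tag] for tag in pool]
    return rng.choices(pool, weights=weights, k=1)[0]


# Hypothetical preferences, for illustration only.
r = {"music": 0.4, "soul": 0.3, "jazz": 0.2, "london": 0.1}
rng = random.Random(0)
draws = [select_tag(r, ["jazz", "london"], 0.3, rng) for _ in range(10)]
print(draws)
```

Setting p = 0 recovers selection from the true preferences alone; p = 1 confines every selection to the suggested tags.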

Page 19:

Users’ tag selection affected by tag suggestions

[Figure: frequency of tag selection (axis from 0 to 0.5) for the tag "apollo", conditional on the tag having been suggested vs. unconditional]

Page 20:

The imitation rate

Boes’ estimate:

$$\hat{p} = \frac{g - h}{g}$$

g: portion of tag selections not in S; suggestions not made

h: portion of tag selections not in the suggestion set S

p̂: 0.32, 0.31, 0.4, 0.34
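One way to read this estimate under the simple user model: without suggestions, a selection falls outside S with some probability g; with suggestions, only non-imitating users can select outside S, so that probability drops to h = (1 - p) g, which gives p = (g - h) / g. The sketch below computes this ratio. The assignment of g to "suggestions not made" and h to "suggestions made" is an assumption, and the input values are made up for illustration, not taken from the study.

```python
def imitation_rate_estimate(g, h):
    """Estimate the imitation probability p as (g - h) / g.

    g -- portion of selections outside S when no suggestions are made
    h -- portion of selections outside S when S is suggested
    (This assignment of g and h is an assumption.)
    """
    if not 0 < g <= 1:
        raise ValueError("g must lie in (0, 1]")
    if not 0 <= h <= 1:
        raise ValueError("h must lie in [0, 1]")
    return (g - h) / g


# Made-up inputs, for illustration only.
print(imitation_rate_estimate(0.6, 0.45))
```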

Page 21:

Move-to-Set: simple randomised rule

Sel.    Tag
174     music
110     radio
96      internet radio
77      online radio
49      last.fm
40      online music
34      fm
33      streaming music
31      streaming
28      last fm
22      web radio
19      scrobbling
18      lastfm
12      listen
12      new music
10      mp3
10      stream
9       streaming radio

Suggested tags: last.fm, music, online radio, web radio

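Page 22:

Correctness of popularity order

$$f_i = (1-p)\, r_i + p \sum_{S:\, i \in S} \pi(S)\, \frac{r_i}{\sum_{j \in S} r_j}$$

Here f_i is the long-run frequency of selecting tag i, and π(S) denotes the long-run frequency of suggesting the set S.

$$r_i \ge r_j \;\Rightarrow\; f_i \ge f_j\ ?$$

Sufficient: $A(i) \ge A(j)$ for $r_i \ge r_j$

Under the user model, for any imitation probability p < 1, the long-run frequency of tag selections induces the true popularity ranking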

Page 23:

Simple update rule

Suggested tags: last.fm, music, online radio, web radio

User selects: radio

Suggested tags: last.fm, music, radio, web radio

• Converges to sampling the suggestion set proportional to the product of true rank scores

• Same as “show most recent item” for suggestion set size 1
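The update illustrated on this slide can be sketched as follows, assuming that a selected tag which is not currently suggested replaces a uniformly random member of the suggestion set, and that nothing changes otherwise. The uniform choice is an assumption about the "simple randomised rule"; the name `move_to_set` is hypothetical.

```python
import random


def move_to_set(suggested, selected, rng):
    """Return the suggestion list after one tag selection.

    If the selected tag is already suggested, the list is unchanged.
    Otherwise it replaces one member chosen uniformly at random
    (assumed replacement rule).
    """
    updated = list(suggested)
    if selected not in updated:
        updated[rng.randrange(len(updated))] = selected
    return updated


rng = random.Random(0)
before = ["last.fm", "music", "online radio", "web radio"]
after = move_to_set(before, "radio", rng)
print(after)  # "radio" has replaced one of the four previous suggestions
```

With a suggestion set of size 1 the selected tag always becomes the single suggestion, which matches the "show most recent item" remark.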

Page 24:

Analogous to exclusion process

[Figure: exclusion process with sites i and j and rate r_j]

Page 25:

Frequency Move-to-Set

Rank    Tag
174     music
110     radio
96      internet radio
77      online radio
49      last.fm
40      online music
34      fm
33      streaming music
31      streaming
28      last fm
22      web radio
19      scrobbling
18      lastfm
12      listen
12      new music
10      mp3
10      stream
9       streaming radio

Suggested tags: radio, music, online radio, internet radio

User selects "radio": Rank(radio) remains unchanged (“radio” suggested)

User selects "last.fm": Rank(last.fm)++ (“last.fm” NOT suggested)
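A rough sketch of the counter update shown here: a tag's rank score is incremented only when the tag is selected while it is not suggested. The second step, taking the s tags with the highest scores as the new suggestion set, is an assumption about how the scores are used, and `fmts_update` is a hypothetical name.

```python
def fmts_update(rank, suggested, selected, s):
    """Apply one selection and return the new suggestion list.

    rank      -- dict mapping tag -> rank score (updated in place)
    suggested -- tags that were suggested when the selection was made
    selected  -- the tag the user selected
    s         -- suggestion set size
    """
    if selected not in suggested:
        # Only selections made without the help of a suggestion count.
        rank[selected] = rank.get(selected, 0) + 1
    # Assumed rule: suggest the s highest-scoring tags (ties alphabetical).
    ordered = sorted(rank, key=lambda tag: (-rank[tag], tag))
    return ordered[:s]


rank = {"music": 174, "radio": 110, "internet radio": 96,
        "online radio": 77, "last.fm": 49}
suggested = ["radio", "music", "online radio", "internet radio"]

fmts_update(rank, suggested, "radio", 4)    # suggested: score unchanged
fmts_update(rank, suggested, "last.fm", 4)  # not suggested: score + 1
print(rank["radio"], rank["last.fm"])       # 110 50
```

Discounting selections of suggested tags is what keeps imitation from inflating the scores of tags that are already being shown.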

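Page 26:

Only sufficiently popular tags eventually suggested

$$s_i = \begin{cases} 1 - \left(1 - \dfrac{s}{|C|}\right) \dfrac{h(r)}{r_i}, & i \in C \\ 0, & \text{o.w.} \end{cases}$$

– s_i: frequency of suggesting tag i
– C: competing set
– s: suggestion set size
– h(r): harmonic mean of r_1, ..., r_|C|

Tag i in the competing set iff:

$$r_i > \left(1 - \frac{s}{i}\right) h_i(r)$$

with h_i(r) the harmonic mean of r_1, ..., r_i and tags indexed in decreasing order of r_i.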
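To make the competing-set idea concrete, the sketch below assumes the following reading of the slide: tags are sorted by decreasing true rank score r_i, tag i belongs to the competing set C iff r_i > (1 - s/i) h_i(r) with h_i(r) the harmonic mean of r_1, ..., r_i, and the suggestion frequency of a competing tag is 1 - (1 - s/|C|) h(r) / r_i. The function name and the example scores are hypothetical.

```python
def suggestion_frequencies(r, s):
    """Long-run suggestion frequency of each tag (assumed formula).

    r -- true rank scores, positive and sorted in decreasing order
    s -- suggestion set size, 1 <= s <= len(r)
    """
    if not 1 <= s <= len(r):
        raise ValueError("need 1 <= s <= len(r)")
    size_c, h_c, inv_sum = 0, 0.0, 0.0
    for i, r_i in enumerate(r, start=1):
        inv = inv_sum + 1.0 / r_i
        h_i = i / inv                    # harmonic mean of r_1..r_i
        if r_i > (1 - s / i) * h_i:      # tag i still competes
            size_c, h_c, inv_sum = i, h_i, inv
        else:
            break                        # less popular tags are never suggested
    scale = (1 - s / size_c) * h_c
    return [1 - scale / r_i if i <= size_c else 0.0
            for i, r_i in enumerate(r, start=1)]


# Hypothetical rank scores, for illustration only.
freqs = suggestion_frequencies([0.5, 0.3, 0.15, 0.05], s=2)
print(freqs)
```

In this example the least popular tag falls outside the competing set and gets frequency 0, while the remaining frequencies add up to the suggestion set size.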

Page 27:

Suggestion methods in action

[Figure: frequency of tag suggestion and normalised frequency of tag selection, each plotted against tag rank i, for the methods TOP, FMTS, MTS and NONE]

Page 28:

Suggestion methods in action (cont’d)

[Figure: comparison of TOP, FMTS, MTS and NONE]

Page 29:

How did users appreciate the suggested tags?

Web page  Method  They were   They were OK, but     They were
                  confusing   not very relevant     generally helpful
engadget  TOP     35.00%      15.00%                50.00%
engadget  FMTS    25.93%      25.93%                48.15%
engadget  MTS     22.22%      25.93%                51.85%
lastfm    TOP     22.22%      55.56%                22.22%
lastfm    FMTS    25.00%      32.14%                42.86%
lastfm    MTS     27.59%      24.14%                48.28%
startup   TOP     39.13%      21.74%                39.13%
startup   FMTS    50.00%      29.17%                20.83%
startup   MTS     30.30%      24.24%                45.46%
mit       TOP     21.74%      30.44%                47.83%
mit       FMTS    23.08%      23.08%                53.85%
mit       MTS     24.24%      18.18%                57.68%

Page 30:

Why did I select these tags?

Tags: gadgets, technology, engadget, blog

2

1

I thought these are keywords that I would likely use later to find this item

I thought these are categories that best describe the object

else

Page 31:

Why did I select these tags? (cont’d)

YOU: find, search, describe, categorise, identify, remember, organise, classify

Wikipedia definition of tag (meta data): describing the item, keyword-based classification, search

Page 32:

Why did I select these tags? (cont’d)

Semantic analysis of tags, search and content keywords (May 2007 popular Web searches + delicious tags)

• Tags similar to categories

• Small overlap with search keywords

Page 33:

Summary

• Social tagging poses interesting research challenges
– Space for innovation

• A mix of control theory, user behaviour, information retrieval, interface design

• Aim at best design of tagging systems to support particular users’ tasks

Page 34:

Sample of research challenges

• User model?

• Rate of convergence

• Asymptotically accurate algorithms

• Select from the list only (e.g. remote controller/mobile device)

• What does it mean for a tag to be relevant?

• Make suggestions to improve users’ task (e.g. search, faceted browsing)?

• Beyond popularity ranking (ongoing work):
– Ranking across multiple lists
– Faces project
– Tag to attract

Page 35:

Familiarity with tagging

Email domain     Users
microsoft.com    44%
hotmail.com      15%
gmail.com        11%
other            30%

Tagging frequency   Users
daily               15%
weekly              25%
monthly             17%
less frequently     40%

Still used infrequently by many