school of information university of michigan expertise sharing dynamics in online forums lada adamic...

51
School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy, Jiang Yang School of Information, University of Michigan

Upload: damon-dorsey

Post on 16-Jan-2016

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

School of InformationUniversity of Michigan

Expertise Sharing Dynamics in Online Forums

Lada Adamic

joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy, Jiang Yang

School of Information, University of Michigan

Page 2: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Outline of talk

Knows

Knowledge iN

Page 3: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Oozing out knowledge

Knowledge In ``Knowledge search is like oozing out knowledge in human brains to the Internet. People who know something better than others can present their know-how, skills or knowledge''

NHN CEO Chae Hwi-young

“(It is) the next generation of search… (it) is a kind of collective brain -- a searchable database of everything everyone knows. It's a culture of generosity. The fundamental belief is that everyone knows something.”

-- Eckart Walther (Yahoo Research)

Page 4: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Limitations of Current Systems

• The Response Time Gap

4939N =

ExpertiseRating

lowhigh

WAITTIME(min)

10000

9000

8000

7000

6000

5000

4000

3000

2000

1000

0

69

96

41

• The Expertise Gap • Difficult to infer reliability of answers

Automatically ranking expertise may be helpful.

Page 5: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Related work

• Analysis of online communities• NetScan (Smith, Fisher, et al. at Microsoft)• Social network analysis (LiveJournal, blog communities)• Motivations of online participation (Lakhani & Hippel, Kraut)

• Graph-based ranking algorithms• PageRank, HITS, etc.

• Expertise sharing studies• Expertise recommenders

• ContactFinder (Krulwich et al.), Answer Garden (Ackerman)• Small Blue (Lin)

• Automatic evaluation of expertise levels• Using different text resources (Kautz, et al, and a lot of others)• Using email networks (Campbell et al.)

• Best answers correspond to best questions (Agichtein et al.)

Page 6: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Java Forum

• 87 sub-forums• 1,438,053

messages• community

expertise network constructed:• 196,191 users• 796,270 edges

Page 7: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Constructing a community expertise network

A B C

Thread 1 Thread 2

Thread 1: Large Data, binary search or hashtable? user ARe: Large... user BRe: Large... user C

Thread 2: Binary file with ASCII data user ARe: File with... user C

A

B

C

1

1

A

B

C

1

2

A

B

C

1/2

1+1//2

A

B

C

0.9

0.1

unweighted

weighted by # threads

weighted by shared credit

weighted with backflow

Page 8: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Uneven participation

100

101

102

103

10-4

10-3

10-2

10-1

100

degree (k)

cum

ula

tive

prob

abili

ty

= 1.87 fit, R2 = 0.9730

number of people one received replies from

number of people one replied to

• ‘answer people’ may reply to thousands of others

• ‘question people’ are also uneven in the number of repliers to their posts, but to a lesser extent

Page 9: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Not Everyone Asks/Replies

• Core: A strongly connected component, in which everyone asks and answers • IN: Mostly askers.• OUT: Mostly Helpers

The Web is a bow tie The Java Forum network is

an uneven bow tie

Page 10: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

fragment of the Java Forum

Page 11: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Relating network structure to Java expertise

• Human-rated expertise levels• 2 raters• 135 JavaForum users with >= 10 posts• inter-rater agreement (= 0.74, = 0.83)• for evaluation of algorithms, omit users where raters disagreed by

more than 1 level (= 0.80, = 0.83)

L Category Description

5 Top Java expert Knows the core Java theory and related advanced topics deeply.

4 Java professional Can answer all or most of Java concept questions. Also knows one or some sub topics very well,

3 Java user Knows advanced Java concepts. Can program relatively well.

2 Java learner Knows basic concepts and can program, but is not good at advanced topics of Java.

1 Newbie Just starting to learn java.

Page 12: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Algorithm Rankings vs. Human Ratings

simple local measures do as well (and better) than measures incorporating the wider network topology

Top K Kendall’s Spearman’s

# answersz-score # answersindegreez-score indegreePageRankHITS authority

0.9

0.8

0.7

0.6

0.5

0.4

0.3

0.2

0.1

0

Page 13: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

10121617181920192N =

LEVCOM

1098765432

RANK of PRANK

160

140

120

100

80

60

40

20

0

- 20

9281

5

68

1

10121617181920192N =

LEVCOM

1098765432

RANK of REPLY

140

120

100

80

60

40

20

0

- 20

40

101

10121617181920192N =

LEVCOM

1098765432

RANK of ZTHREADS

160

140

120

100

80

60

40

20

0

- 20

40

1011

10121117171917192N =

LEVCOM

1098765432

RANK of HITS_AUT

140

120

100

80

60

40

20

0

- 20

33

automated vs. human ratings

# answers

human rating

auto

mat

ed r

anki

ng

10121617181920192N =

LEVCOM

1098765432

RANK of INDGR

160

140

120

100

80

60

40

20

0

- 20

40

101

10121117171917192N =

LEVCOM

1098765432

RANK of ZDGR

140

120

100

80

60

40

20

0

- 20

106104

z # answers

HITS authority

indegree

z indegree

PageRank

Page 14: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Modeling community structure to explain algorithm performance

Control Parameters: Distribution of expertise Who asks questions most often? Who answers questions most often? best expert most likely someone a bit more expert

ExpertiseNet Simulator

Page 15: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Simulating probability of expertise pairing

0 1 2 3 4 50

1

2

3

4

5

replier expertise

asker expertise

0.02

0.04

0.06

0.08

0.1

0.12

0 1 2 3 4 50

1

2

3

4

5

replier expertise

asker expertise

0

0.05

0.1

0.15

suppose:

expertise is uniformly distributed

probability of posing a question is inversely proportional to expertise

pij = probability a user with expertise j replies to a user with expertise i

2 models:‘best’ preferred ‘just better’ preferred

iep ijij /~ )( −β iep ji

ij /~ )( −γ j>i

Page 16: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Visualization

Best “preferred” just better

Page 17: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Degree correlation profiles

best preferred (simulation) just better (simulation)

Java Forum Networkas

ker

inde

gree

aske

r in

degr

ee

aske

r in

degr

ee

Page 18: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

It can tell us when to use which algorithms

Preferred Helper: ‘just better’

Preferred Helper: ‘best available’

Page 19: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Different ranking algorithms perform differently

In the ‘just better’ model, a node is correctly ranked by PageRank but not by HITS

Page 20: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Summary for a focused expertise sharing forum

• Expertise Networks have interesting characteristics• A set of useful metrics • Ranking algorithms are affected by network structures• Simulation as an analysis tool

• There are rich design opportunities• Find experts with the help of structural information (and content

analysis) • Predict good answers • Re-order questions/answers to match expertise

UIST2007: “Expertise-Level based Interface Personalization for Online Help-seeking Communities”

Page 21: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

• Looking at diverse sets of question-answer forums (Yahoo Answers)

• Expertise across different topics

Everyone is not a Java expert, but everyone knows something...

cars & transportation

maintenance & repairs

beauty & style

hair

w/ Jun Zhang, Eytan Bakshy, Mark Ackerman

Page 22: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Yahoo! Answers

• A community-driven knowledge market site, allowing users to ask and answer questions

• About 80 million answers since launch in Dec 2005• Similar sites: Naver, Baidu-knows

• our sample• 1 month• 8,452,337 answers• 1,178,983 questions• unique repliers: 433,402• unique askers: 495,414• users who are both askers and helpers: 211,372

Page 23: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,
Page 24: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,
Page 25: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,
Page 26: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

length of answer vs. # of answers

Celebrities

MusicMathematics

Physics

Languages

ChemistryCooking & recipesProgramming & design

Jokes & Riddles

Parenting

Polls &Surveys

Baby NamesWrestling

Small business

Genealogy

thread length

cont

ent l

engt

h

Page 27: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

network structure of a technical forum

programming & design

Page 28: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

network structure of a discussion forum

politics

Page 29: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Clustering categories: technical, advice/support, discussion

Page 30: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Breadth of user’s post patterns

• Sum entropies at each level

cars & transportation

maintenance & repairs

beauty & style

hair

)log( ,, iLi

iLL ppH ∑−= ∑=L

LT HH

0.3 0.7

0.70.1 0.2

L=1

L=2

car audio

thanks to Mark Newman

Page 31: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

user with entropy = 0

all answers are in the Pets > Dogs subcategory

Page 32: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

user with medium entropy

_Travel *1*

_Asia_Pacific *1*

_Pets *1*

__General___Pets *1*

_Sports *18*

__Cricket *18*

_Society___Culture *19*

_Cultures___Groups *1*

__Religion___Spirituality *18*

_Arts___Humanities *1*

__Philosophy *1*

===== entropy ======

L1 entropy: 0.99

L2 entropy: 1.09

total: 2.08

Page 33: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

user with medium entropy

_Beauty___Style *32*

__Makeup *7*

__General___Beauty___Style *3*

__Fashion___Accessories *12*

__Hair *7*

_Skin___Body *3*

_Family___Relationships *8* __Singles___Dating *6* __Friends *1*

__Family *1*

===== entropy ======

L1 entropy: 0.50

L2 entropy: 1.83

total: 2.33

focused on a few top level categories, during the sampled period

Page 34: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

high entropy user

===== entropy ======

L1 entropy: 2.64

L2 entropy: 3.11

total entropy: 5.75

user with high entropy

Page 35: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Entropy and “expertise”

• consider 1st level categories and 2nd level entropies (HL=2) within them individually

level 1 category(ies)

Pearson (entropy,score)

p-value

computers & internet science and math

-0.22 10-7

family & relationships

-0.13 10-13

sports -0.01 0.65

Page 36: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Predicting best answers

lengthier replies

Page 37: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Predicting best answers

Programming Marriage wrestling

answer length + + +

thread length - - -

replier best answers

+ + +

replier answers - - -

R2 0.729 0.693 0.692

Page 38: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

people who answer about X, ask about Y

Page 39: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Yahoo Answers Summary

• Everyone knows something• Differing network characteristics by category type• Focus corresponds to performance, primarily for “factual

categories”• Predicting best answers easiest for those categories

Page 40: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Competing to share expertise on Witkey sites

Page 41: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

when money is given to best answers

• previously (Java Forum): asker -> replier• in Task CN: submitter -> winner

Page 42: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

task prestige network

• If winners of other tasks lose in this task, this task is more prestigious...

Page 43: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

network of task participants

• Same two users compete twice:

• same winner 77% of the time (compared to 1/2 chance)

• Same two users compete 3x:

• same winner all 3 times in 56% of the cases (compared to 1/4 chance)

Page 44: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

Taskcn knowledge sharing community

• Askers offer cash reward for best “solution” to task

w/ Jiang Yang and Mark Ackerman

Page 45: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

does money matter?

• more views• not more submissions on average• small negative correlation with task PageRank

• task may be a bit more difficult

• average prestige negatively correlated with number of participants

• but winner’s prestige positively correlated

does the number of participants matter?

Page 46: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

predicting who will win

• Competing against few other and having a good track record is predictive of future wins

• Some users “learn” to improve their odds of winning by selecting less popular tasks

Page 47: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

users learn to choose less competitive tasks

Page 48: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

successful users shorten interval between wins

Page 49: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

summary of Taskcn findings

• Amount of reward does not correlate with• # submissions• expertise level

• It does correlate with the number of views

• Can infer expertise from new formulation of expertise networks

• Can predict future probability of winning

• Overall, users don’t increase their winning rate over time• successful users

• choose less popular tasks• decrease intervals between wins

Page 50: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

for more info• Competing to share expertise

Yang, J., Adamic, L., Ackerman, M.S., ICWSM2008

• Examining knowledge sharing on Yahoo Answers Adamic, L., Zhang, J., Ackerman, M.S., Knowledge Sharing and Yahoo Answers: Everyone knows something, WWW’08

• ExpertiseRank algorithms and evaluations Zhang, J., Ackerman, M.S., Adamic, L., Expertise Networks in Online Communities: Structure

and Algorithms, WWW’07

• Simulations of expertise networks Zhang, J., Ackerman, M.S., Adamic, L., CommunityNetSimulator: Using Simulations to Study

Online Community Network Formation and Implications, C&T2007

• QuME: A Mechanism to Support Expertise Finding In Online Help-seeking CommunitiesJ. Zhang, M. S. Ackerman, and L. A. Adamic, UIST2007, Newport, RI, 2007.

Lada Adamic [email protected]

http://www-personal.umich.edu/~ladamic

Page 51: School of Information University of Michigan Expertise Sharing Dynamics in Online Forums Lada Adamic joint work with Jun Zhang, Mark Ackerman, Eytan Bakshy,

thanks!

Jun Zhang [email protected]

http://www-personal.si.umich.edu/~junzh

Mark Ackerman [email protected]

http://www.eecs.umich.edu/~ackerm/

Jiang Yang [email protected]

http://www.yangjiang.us/

Eytan Bakshy [email protected]

http://www-personal.umich.edu/~ebakshy

Thanks to: ARI, Intel, NSF 0325347