
Page 1: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

INSTITUTE OF PHYSICS BELGRADE

Introduction | Network topology | Entropy measures | Temporal patterns | Summary

Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Marija Mitrović Dankulov, Bosiljka Tadić

Scientific Computing Laboratory, Institute of Physics Belgrade

University of Belgrade, Pregrevica 118, 11080 Belgrade

Page 2: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags


Collective Knowledge Building

A socio-cultural process that takes place through the self-organized dynamics of interactions among individuals.

Conditions that support collective knowledge building:

(i) Problems as attempts to understand the world or a field.

(ii) Improving the coherence, quality and utility of ideas.

(iii) Interaction - participants negotiate a fit between their own ideas and those of others.

(iv) All participants must contribute.

(v) Knowledge-building discourse, which is more than knowledge sharing: participants engage in constructing, refining and transforming knowledge.

KnowEscape 2014 | M. Mitrović Dankulov: Quantitative study of knowledge building

Page 3: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags


Questions & Answers Sites

Rich repositories for studying the dynamics of collective knowledge building.

On Q&A sites:

Participants ask and answer questions and vote on them.

Participants comment and engage in discussion about questions and answers.

All participants contribute through different types of actions: posting and voting on questions, answers and comments. They construct (ask/answer), refine (comment/vote) and transform knowledge.

Page 4: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Data: Stack Exchange

Stack Exchange: where experts answer your questions! A network of 130 Q&A sites where participants answer informational and factual questions.

Mathematics:

Data for a four-year period: from the site's beginning (July 2010) until April 2014.

Rich dataset: 77895 users posted 269819 questions, 400511 answers and 1265445 comments.

High temporal resolution.

Tags - a list of up to 5 tags is assigned to each question. Overall 1040 tags: calculus, linear algebra, complex analysis, application, ...

Page 5: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Quantitative study of knowledge building: methods

Tools and methods from statistical physics and complex network theory.

Complex networks - topological structure.

Entropy measures of user activity and of activity on different tags.

Time series analysis - power spectrum, avalanches, fluctuations.

Page 6: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Network mapping

Weighted bipartite network

Two partitions: Users and Questions.

Link weight: number of answers/comments.

Structural properties of the bipartite network and its projections onto the Question and User partitions.

[M. Mitrović et al., EPJB 73, 293-301 (2010).]
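The mapping above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the input format (a list of `(user, question)` pairs, one per answer or comment) and the function names `build_bipartite`, `degrees` and `project_users` are assumptions made for the example, and the projection weight (number of shared questions) is one common choice among several.

```python
from collections import defaultdict
from itertools import combinations


def build_bipartite(events):
    """Weighted user-question links; weight = number of answers/comments.

    `events` is an assumed input format: one (user, question) pair per action.
    """
    weights = defaultdict(int)
    for user, question in events:
        weights[(user, question)] += 1
    return dict(weights)


def degrees(weights):
    """Degree of each node in the two partitions (number of distinct neighbors)."""
    user_deg, question_deg = defaultdict(int), defaultdict(int)
    for user, question in weights:
        user_deg[user] += 1
        question_deg[question] += 1
    return dict(user_deg), dict(question_deg)


def project_users(weights):
    """Projection onto Users: link two users who acted on the same question.

    Assumption: the projected weight counts the shared questions.
    """
    users_by_question = defaultdict(set)
    for user, question in weights:
        users_by_question[question].add(user)
    projection = defaultdict(int)
    for users in users_by_question.values():
        for a, b in combinations(sorted(users), 2):
            projection[(a, b)] += 1
    return dict(projection)


# Toy data, invented for illustration only.
events = [("u1", "q1"), ("u1", "q1"), ("u2", "q1"), ("u2", "q2"), ("u3", "q2")]
weights = build_bipartite(events)
user_deg, question_deg = degrees(weights)
user_links = project_users(weights)
```

The projection onto Questions follows the same pattern with the roles of the two partitions swapped.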

Page 7: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Topology

Broad degree distributions for both partitions, stable over time and across tags.

[Figure: four log-log panels. Distributions P(s) versus s for the 1st, 2nd, 3rd and 4th year (one panel labeled Users), and distributions P(q) versus q for the tags homework, calculus, real-analysis and linear-algebra.]

Page 8: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Community structure

Two-week activity network. Community detection method - Louvain method. [V. D. Blondel et al., JSTAT 2008 (10), P10008 (2008).]

Communities are formed around a few very active experts.

Page 9: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Focus and expertise of users

[Figure: two histograms over the entropy H (horizontal axis), with the vertical axis labeled "number of users": one for Questions and one for Answers+Comments.]

User activity on separate tags - $X_i = \{n_1, \ldots, n_{\max}\}$; total activity $\Sigma_i = \sum_l n_l$.

User's entropy - $H_i = -\sum_l \frac{n_l}{\Sigma_i} \log \frac{n_l}{\Sigma_i}$.

Lower $H_i$ means higher focus.

[Adamic et al., Proceedings of WWW'08 (2008).]
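As a minimal sketch of the user entropy defined above, the function below takes one user's activity counts per tag and returns the entropy of the normalized counts. The function name `user_entropy`, the dictionary input and the default logarithm base 2 are assumptions for illustration; the slide does not state the base.

```python
import math


def user_entropy(tag_counts, base=2.0):
    """Entropy of one user's activity over tags.

    `tag_counts` maps tag -> number of actions n_l (assumed input format).
    The logarithm base is an assumption; base 2 gives entropy in bits.
    """
    total = sum(tag_counts.values())  # total activity of the user
    entropy = 0.0
    for n in tag_counts.values():
        if n > 0:  # tags with no activity contribute nothing
            p = n / total
            entropy -= p * math.log(p, base)
    return entropy


# Toy users, invented for illustration only.
focused = user_entropy({"calculus": 10})
broad = user_entropy({"calculus": 5, "algebra": 5, "topology": 5, "logic": 5})
```

A user active on a single tag has zero entropy, while a user spreading activity evenly over four tags has entropy 2 in base 2, which matches the reading that lower entropy means higher focus.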

Page 10: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Zipf's and Heaps' law

Heaps' law

[Figure: log-log plot of D(N) versus N for Tags and for Combination of Tags.]

$D(N) \sim N^{\beta}$; $\beta < 1$ means sublinear growth ($\beta = 0.27$ for Tags and $\beta = 0.92$ for Combination of Tags).

Zipf's law

[Figure: log-log plot of f(R) versus R for Tags and for Combination of Tags.]

$f(R) \sim R^{-\alpha}$; $\alpha = 1.47$ for Tags and $\alpha = 1$ for Combination of Tags.
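A rough sketch of how both curves could be obtained from a stream of tag occurrences is given below. It assumes the usual reading of the two laws (D is the number of distinct items seen after N occurrences, f is the frequency of the item with rank R); the slides do not spell these definitions out. The function names are hypothetical, and the ordinary least-squares fit in log-log space is a crude estimator chosen for brevity, not necessarily the fitting procedure behind the quoted exponents.

```python
import math
from collections import Counter


def heaps_curve(stream):
    """(N, D(N)) pairs: number of distinct items after N occurrences."""
    seen, curve = set(), []
    for n, item in enumerate(stream, start=1):
        seen.add(item)
        curve.append((n, len(seen)))
    return curve


def zipf_ranking(stream):
    """(R, f(R)) pairs: frequency of the item with rank R (rank 1 = most frequent)."""
    counts = sorted(Counter(stream).values(), reverse=True)
    return list(enumerate(counts, start=1))


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x); a crude exponent estimate."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy stream of tags, invented for illustration only.
stream = ["calculus", "algebra", "calculus", "calculus", "topology", "algebra"]
curve = heaps_curve(stream)
ranking = zipf_ranking(stream)
beta_estimate = loglog_slope(curve)       # exponent of D(N)
alpha_estimate = -loglog_slope(ranking)   # exponent of f(R), sign flipped
```

For combinations of tags, the same functions apply if each stream item is, for example, a sorted tuple of the tags attached to one question.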

Page 11: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Entropy of events associated to Tag

[Figure: S/log(K) versus K (logarithmic K axis) for the data and for the reshuffled sequence.]

K - number of occurrences of the Tag.

Ψ - the sequence of events, divided into K equal intervals; $f_l$ is the number of occurrences of the Tag in interval $l$.

$S = -\sum_{l=1}^{K} \frac{f_l}{K} \log\left(\frac{f_l}{K}\right)$

$S = 0$ when all events are in one interval; $S_{\max} = \log(K)$ when events are equally distributed.
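The normalized entropy S/log(K) can be sketched as follows. The input format (one boolean per event in the sequence, `True` where the tag occurs), the function names and the way positions are assigned to intervals are assumptions for illustration. The reshuffling helper is one plausible reading of the "reshuffle" curve in the figure, namely a random permutation of the event order.

```python
import math
import random


def normalized_tag_entropy(is_tag_event):
    """S / log(K) for one tag over a sequence of events.

    The sequence is cut into K equal intervals, where K is the number of
    occurrences of the tag, and f_l counts the occurrences in interval l.
    """
    n = len(is_tag_event)
    k = sum(is_tag_event)
    if k < 2:
        raise ValueError("need at least two occurrences so that log(K) > 0")
    counts = [0] * k
    for position, hit in enumerate(is_tag_event):
        if hit:
            counts[position * k // n] += 1  # index of the interval holding this event
    s = -sum((f / k) * math.log(f / k) for f in counts if f > 0)
    return s / math.log(k)


def reshuffled(is_tag_event, seed=0):
    """Randomly permuted copy of the sequence (assumed meaning of 'reshuffle')."""
    copy = list(is_tag_event)
    random.Random(seed).shuffle(copy)
    return copy


# Toy sequences, invented for illustration only.
clustered = [True] * 4 + [False] * 12       # all occurrences in one interval
spread = [True, False, False, False] * 4    # one occurrence per interval
s_clustered = normalized_tag_entropy(clustered)
s_spread = normalized_tag_entropy(spread)
s_shuffled = normalized_tag_entropy(reshuffled(clustered))
```

The two toy sequences reproduce the limiting cases on the slide: the clustered sequence gives 0 and the evenly spread one gives 1.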

Page 12: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Power spectrum

Power spectrum is of type 1/f for small frequencies - long-term correlations.

[Figure: log-log power spectra, each with a binned curve, for p(t), Na(t), homework and calculus, shown together with the corresponding time series over t in units of 10 min: New users p(t), all Na(t), homework N(t) and calculus N(t).]

[M. Mitrovic et al., JSTAT 2011, P02005, (2011).]
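For readers unfamiliar with the method, a power spectrum of an activity time series can be estimated with a plain periodogram, as in the sketch below. This is a generic textbook estimator written with the standard library only, not the authors' pipeline; the function name `power_spectrum` and the normalization are assumptions. The naive transform costs O(n^2), so for long series a fast Fourier transform such as `numpy.fft.rfft` would be the practical choice.

```python
import cmath
import math


def power_spectrum(series):
    """Periodogram of a time series: list of (frequency, power) pairs.

    Frequencies are in cycles per time step; the mean is removed first so
    that the zero-frequency component does not dominate.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


# Toy signal, invented for illustration: 4 full cycles in 64 time steps.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
spectrum = power_spectrum(signal)
peak_frequency = max(spectrum, key=lambda pair: pair[1])[0]
```

On a log-log plot, a spectrum of type 1/f appears as a straight line of slope close to -1 at small frequencies; averaging the power in logarithmically spaced frequency bins gives curves like those labeled "binned".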

Page 13: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Avalanche distribution

Time series of events N(t) ⇒ time series of avalanches Si.

[Figure: log-log distributions of avalanche size, P(S) versus S, and avalanche duration, P(T) versus T, for all, homework and calculus; plus a segment of the time series of events N(t) for t between 78000 and 80000.]

Broad distributions of avalanche sizes and durations ⇒ self-organized criticality (SOC).
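One common way to turn a time series of events into avalanches is sketched below. The slide does not give the definition used, so the one here is an assumption: an avalanche is a maximal run of consecutive time steps where N(t) stays above a baseline, its size S is the summed excess over the baseline, and its duration T is the number of time steps. The function name `avalanches` and the default baseline of zero are likewise hypothetical.

```python
def avalanches(series, baseline=0):
    """Sizes and durations of avalanches in a time series of event counts.

    Assumed definition: an avalanche is a maximal run with N(t) > baseline;
    size = sum of (N(t) - baseline) over the run, duration = run length.
    """
    sizes, durations = [], []
    size = duration = 0
    for value in series:
        if value > baseline:
            size += value - baseline
            duration += 1
        elif duration:  # the run has just ended
            sizes.append(size)
            durations.append(duration)
            size = duration = 0
    if duration:  # series ended in the middle of an avalanche
        sizes.append(size)
        durations.append(duration)
    return sizes, durations


# Toy series, invented for illustration only.
series = [0, 2, 3, 0, 0, 1, 0, 4, 4, 1]
sizes, durations = avalanches(series)
```

Histograms of the resulting sizes and durations, plotted on log-log axes, correspond to P(S) and P(T).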

Page 14: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Avalanche size returns

[Figure: distribution P(d) versus d/σ, with a logarithmic vertical axis, for homework and calculus.]

Return di=Si+1 − Si+∆

$P(d) = P_0 \left[1 - (1-q)\left(\frac{d}{\sigma}\right)^2\right]^{\frac{1}{1-q}}$

SOC ⇒ peaked distribution with a fat tail.
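The fitting function above (a q-Gaussian) is easy to evaluate directly, as in this sketch. The function names and the toy parameter values are invented for illustration. The `returns` helper assumes the simplest definition, the difference between consecutive avalanche sizes, which may differ from the exact definition on the slide.

```python
def returns(sizes):
    """Differences between consecutive avalanche sizes (assumed definition)."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


def q_gaussian(d, p0, q, sigma):
    """P(d) = P0 * [1 - (1 - q) * (d / sigma)**2] ** (1 / (1 - q)), for q != 1.

    For q > 1 the bracket is always positive and the tails decay as a
    power law; for q < 1 the support is bounded, so values outside it are 0.
    """
    if q == 1:
        raise ValueError("q = 1 is the Gaussian limit and is not handled here")
    bracket = 1.0 - (1.0 - q) * (d / sigma) ** 2
    if bracket <= 0.0:
        return 0.0
    return p0 * bracket ** (1.0 / (1.0 - q))


# Toy values, invented for illustration only.
toy_returns = returns([5, 1, 9])
peak = q_gaussian(0.0, p0=0.5, q=1.5, sigma=2.0)
tail = q_gaussian(2.0, p0=1.0, q=1.5, sigma=2.0)
```

With q = 1.5 the density at d = 10σ is still about 4e-4 of its peak value, whereas a Gaussian would be smaller by many orders of magnitude; this is the sense in which the distribution is peaked with a fat tail.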

Page 15: Quantitative Study of Innovation and Knowledge Building in Questions&Answers System with Math Tags

Summary

Collective knowledge building can be studied by applying methods of complex networks and statistical physics:

Complex networks - Q&A sites can be used for studying the dynamics of the collective knowledge-building process.

Entropy measures - most users focus on a few categories (expertise); tag-specific dynamics is a highly cooperative process.

Time series analysis - a self-organized criticality mechanism with long-range correlations is at the origin of collective knowledge building.
