INSTITUTE OF PHYSICS BELGRADE
Introduction
Network topology
Entropy measures
Temporal patterns
Summary
Quantitative Study of Innovation and Knowledge Building in Questions & Answers System with Math Tags
Marija Mitrovic Dankulov, Bosiljka Tadic
Scientific Computing Laboratory, Institute of Physics Belgrade
University of Belgrade, Pregrevica 118, 11080 Belgrade
Collective Knowledge Building
Socio-cultural process which takes place through self-organized dynamics of interactions among individuals.
Conditions that support collective knowledge building:
(i) Problems as an attempt to understand world/field.
(ii) Improving coherence, quality and utility of ideas.
(iii) Interaction - participants negotiate the fit between their own ideas and those of others.
(iv) All participants must contribute.
(v) Knowledge-building discourse, more than knowledge sharing: participants engage in constructing, refining and transforming knowledge.
KnowEscape 2014| M. Mitrovic Dankulov: Quantitative study of knowledge building
Questions & Answers Sites
Rich repositories for studying the dynamics of collective knowledge building.
On Q&A sites:
Participants ask, answer and vote for questions.
Comment and engage in discussion about questions/answers.
All participants contribute through different types of actions: posting and voting for questions, answers, comments. They construct (ask/answer), refine (comment/vote) and transform knowledge.
Data: Stack Exchange
Stack Exchange: expert answers to your questions!
Network of 130 Q&A sites where participants answer informational and factual questions.
Mathematics:
Data for four year period: since the beginning (July 2010)until April 2014.
Rich dataset: 77895 users posted 269819 questions, 400511 answers and 1265445 comments.
High temporal resolution.
Tags - a list of up to 5 tags is assigned to each question.
Overall 1040 tags: calculus, linear algebra, complex analysis, application, . . .
Quantitative study of knowledge building: methods
Tools and methods from statistical physics and complex network theory.
Complex networks - topological structure.
Entropy measures of user activity and activity on different tags.
Time series analysis - power spectrum, avalanches, fluctuations.
Network mapping
Weighted bipartite network
Two partitions: Users and Questions.
Link weight: number of answers/comments.
Structural properties of the bipartite network and its projections to the Question and User partitions.
[M. Mitrovic et al., EPJB 73, 293-301, (2010).]
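The mapping above can be illustrated with a minimal sketch. It assumes the activity log is available as (user, question) pairs, one per answer or comment; the function names and the choice of projection weight (number of questions two users share) are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def build_bipartite(events):
    """Weighted bipartite network from (user, question) pairs.

    Each pair stands for one answer or comment; the link weight is the
    number of such actions by that user on that question.
    """
    weights = defaultdict(int)
    for user, question in events:
        weights[(user, question)] += 1
    return dict(weights)


def project_on_users(weights):
    """User projection (assumed rule): two users are linked if they acted
    on the same question; weight = number of questions they share."""
    users_by_question = defaultdict(set)
    for user, question in weights:
        users_by_question[question].add(user)
    projection = defaultdict(int)
    for users in users_by_question.values():
        ordered = sorted(users)
        for i, u in enumerate(ordered):
            for v in ordered[i + 1:]:
                projection[(u, v)] += 1
    return dict(projection)


# Toy log: u1 answers and comments on q1, u2 acts once on q1 and once on q2.
toy_events = [("u1", "q1"), ("u1", "q1"), ("u2", "q1"), ("u2", "q2")]
toy_weights = build_bipartite(toy_events)
toy_projection = project_on_users(toy_weights)
```

The Question projection follows by swapping the roles of the two partitions.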
Topology
Broad degree distributions for both partitions, stable over time and across tags.
[Figure: log-log plots of the distributions P(s), shown for the 1st, 2nd, 3rd and 4th year (one panel labelled Users), and P(q), shown for the tags homework, calculus, real-analysis and linear-algebra.]
Community structure
2-week activity network.
Community detection method - Louvain method. [V. D. Blondel, JSTAT 2008 (10), P100, (2008).]
Communities are formed around a few very active experts.
Focus and expertise of users
[Figure: number of users versus entropy H; panels: Questions, Answers+Comments.]
User activity on separate tags - $X_i = \{n_1, \ldots, n_{max}\}$; total activity $\Sigma_i = \sum_l n_l$.
User's entropy - $H_i = -\sum_l \frac{n_l}{\Sigma_i} \log\left(\frac{n_l}{\Sigma_i}\right)$.
Lower $H_i$ means higher focus.
[Adamic et al., Proceedings of WWW’08, (2008).]
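As a rough illustration of the user entropy, the sketch below computes it from a per-tag activity count for one user. The function name and the logarithm base (2 by default) are assumptions; the slide does not state which base is used.

```python
import math


def user_entropy(tag_counts, base=2.0):
    """Entropy of one user's activity over tags.

    tag_counts maps tag -> number of actions n_l by this user.
    The log base is an assumption (not stated in the talk).
    """
    total = sum(tag_counts.values())  # total activity of the user
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in tag_counts.values():
        if count > 0:
            share = count / total
            entropy -= share * math.log(share, base)
    return entropy


# A fully focused user versus a user split evenly over two tags.
focused = user_entropy({"calculus": 10})
split = user_entropy({"calculus": 5, "linear-algebra": 5})
```

A user active on a single tag has zero entropy, while activity spread evenly over more tags gives a larger value, which matches the reading that lower entropy means higher focus.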
Zipf’s and Heaps’ law
Heaps’ law
[Figure: log-log plot of D(N) versus N for Tags and Combination of Tags.]
$D(N) \sim N^{\beta}$; $\beta < 1$: sublinear growth ($\beta = 0.27$ for Tags and $\beta = 0.92$ for Combination of Tags).
Zipf’s law
[Figure: log-log plot of f(R) versus R for Tags and Combination of Tags.]
$f(R) \sim R^{-\alpha}$; $\alpha = 1.47$ for Tags and $\alpha = 1$ for Combination of Tags.
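Both laws can be measured from a chronological stream of tag occurrences. The sketch below is one possible way to do it, assuming D(N) counts the distinct tags seen in the first N occurrences and f(R) is the frequency of the tag with rank R. The least-squares fit on log-log axes is a rough estimator chosen for brevity and is not necessarily the fitting procedure behind the quoted exponents.

```python
import math
from collections import Counter


def heaps_curve(tag_stream):
    """D(N): number of distinct tags among the first N occurrences."""
    seen = set()
    curve = []
    for tag in tag_stream:
        seen.add(tag)
        curve.append(len(seen))
    return curve


def zipf_ranking(tag_stream):
    """f(R): tag frequencies sorted from most to least frequent."""
    return sorted(Counter(tag_stream).values(), reverse=True)


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x) (rough exponent estimate)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var


stream = ["calculus", "algebra", "calculus", "topology"]
growth = heaps_curve(stream)      # distinct tags after each occurrence
ranked = zipf_ranking(stream)     # frequencies by rank
```

For Heaps' law the slope of `heaps_curve` against N estimates β; for Zipf's law the slope of `zipf_ranking` against rank estimates -α.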
Entropy of events associated with a tag
[Figure: S/log(K) versus K on a logarithmic K axis, for the data and for the reshuffled sequence.]
K - number of occurrences of the tag.
Ψ - the sequence of events, divided into K equal intervals; $f_l$ is the number of occurrences of the tag in interval l.
$S = -\sum_{l=1}^{K} \frac{f_l}{K} \log\left(\frac{f_l}{K}\right)$
S = 0: all events are in one interval; $S_{max} = \log(K)$: events are equally distributed.
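The definition above can be turned into a short calculation. This sketch assumes the sequence is given as a chronological list of flags (True where the event carries the tag) and that the K equal intervals are cut by event index; both choices, and the function names, are assumptions made for illustration. The reshuffled baseline is obtained here by a random permutation of the same flags.

```python
import math
import random


def tag_entropy(flags):
    """Normalized entropy S / log(K) of one tag's occurrences.

    flags: chronological list of booleans, True where the event has the tag.
    The sequence is split by event index into K equal intervals,
    K being the number of occurrences of the tag.
    """
    k = sum(flags)
    if k < 2:
        return 0.0  # log(K) is zero or undefined, nothing to normalize
    n = len(flags)
    counts = [0] * k
    for index, has_tag in enumerate(flags):
        if has_tag:
            counts[index * k // n] += 1  # interval that contains this event
    entropy = 0.0
    for f in counts:
        if f > 0:
            entropy -= (f / k) * math.log(f / k)
    return entropy / math.log(k)


def reshuffled(flags, seed=0):
    """Randomly permuted copy of the sequence, used as a baseline."""
    copy = list(flags)
    random.Random(seed).shuffle(copy)
    return copy


bursty = tag_entropy([True, True, False, False])   # both hits in one interval
even = tag_entropy([True, False, True, False])     # one hit per interval
```

Values near 1 indicate occurrences spread evenly over the sequence, and values near 0 indicate occurrences concentrated in a few intervals.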
Power spectrum
The power spectrum is of 1/f type at low frequencies, indicating long-term correlations.
[Figure: power spectra (raw and binned) of the time series p(t) and Na(t) and of the homework and calculus series, on log-log axes; below them, the corresponding time series in 10-minute bins: p(t) (New users), Na(t) (all), N(t) (homework), N(t) (calculus).]
[M. Mitrovic et al., JSTAT 2011, P02005, (2011).]
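A power spectrum of an activity time series can be sketched with a plain discrete Fourier transform. The version below uses only the standard library and is quadratic in the series length, so it is meant for illustration on short series; the normalization by the series length and the function name are assumptions, and for real data a fast Fourier transform would be the practical choice.

```python
import cmath
import math


def power_spectrum(series):
    """Power at frequencies k/n for k = 1 .. n//2 (mean removed first)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


# Toy signal with exactly three oscillations over 32 time steps.
toy_signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
toy_spectrum = power_spectrum(toy_signal)
peak_k = toy_spectrum.index(max(toy_spectrum)) + 1  # frequency index of the peak
```

A 1/f-type spectrum would show up as a roughly straight line with slope close to -1 when the spectrum is plotted against frequency on log-log axes.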
Avalanche distribution
Time series of events N(t) ⇒ time series of avalanches Si.
[Figure: log-log distributions of avalanche sizes P(S) and durations P(T) for all, homework and calculus; a segment of the time series of events N(t) versus t.]
Broad distributions of avalanche sizes and durations ⇒ self-organized criticality (SOC).
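One common way to turn a time series of events into avalanches is to treat every uninterrupted run above a baseline as a single avalanche. The sketch below follows that convention as an assumption: the talk does not spell out its definition, and the baseline value, the function name and the handling of a run that is still open at the end of the series are all illustrative choices.

```python
def extract_avalanches(series, baseline=0):
    """Split N(t) into avalanches: maximal runs with N(t) > baseline.

    Returns (sizes, durations); size is the summed excess over the
    baseline, duration the number of time steps in the run.
    """
    sizes, durations = [], []
    size = 0
    duration = 0
    for value in series:
        if value > baseline:
            size += value - baseline
            duration += 1
        elif duration > 0:
            sizes.append(size)
            durations.append(duration)
            size = 0
            duration = 0
    if duration > 0:  # run still open when the series ends (kept by assumption)
        sizes.append(size)
        durations.append(duration)
    return sizes, durations


toy_sizes, toy_durations = extract_avalanches([0, 2, 3, 0, 0, 1, 0, 4])
```

Histograms of the resulting sizes and durations give empirical estimates of P(S) and P(T).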
Avalanche size returns
[Figure: distribution P(d) versus d/σ on a logarithmic vertical axis, for homework and calculus.]
Return di=Si+1 − Si+∆
$P(d) = P_0 \left[1 - (1-q)\left(\frac{d}{\sigma}\right)^2\right]^{\frac{1}{1-q}}$
SOC ⇒ peaked distribution with fat tail.
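The fitting function for the return distribution can be evaluated directly. The sketch below is a plain transcription of that q-Gaussian form; the function name, the default amplitude and the handling of the q → 1 limit (where the expression reduces to exp(-(d/σ)²)) are additions made for illustration.

```python
import math


def q_gaussian(d, sigma, q, p0=1.0):
    """P(d) = P0 * [1 - (1 - q) * (d / sigma)**2] ** (1 / (1 - q))."""
    x = d / sigma
    if abs(q - 1.0) < 1e-12:
        return p0 * math.exp(-x * x)  # limit of the expression as q -> 1
    base = 1.0 - (1.0 - q) * x * x
    if base <= 0.0:
        return 0.0  # outside the support, which can happen for q < 1
    return p0 * base ** (1.0 / (1.0 - q))


# Far from the peak, q > 1 decays much more slowly than the Gaussian limit.
heavy_tail = q_gaussian(5.0, 1.0, 1.5)
gaussian_tail = q_gaussian(5.0, 1.0, 1.0)
```

For q > 1 the tail decays as a power law in d, which is the fat-tailed shape referred to above.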
Summary
Collective knowledge building can be studied by applying methods of complex networks and statistical physics:
Complex networks - Q&A sites can be used for studying the dynamics of the collective knowledge building process.
Entropy measures - most users focus on a few categories (expertise); tag-specific dynamics is a highly cooperative process.
Time series analysis - a self-organized criticality mechanism with long-range correlations is at the origin of collective knowledge building.