point processes - adapted from gomez-rodriguez [gomez...
TRANSCRIPT
Point Processes
Adapted from Gomez-Rodriguez [4, Gomez-Rodriguez]
Knowledge Discovery and Data Mining 2 (VU) (707.004)
Tiago Santos
Institute for Interactive Systems and Data Science, TU Graz
2019-12-05
Tiago Santos (ISDS, TU Graz) Point Processes 2019-12-05 1 / 37
Section 1
Motivation and Applications
Example 1: assessing source trustworthiness
Timeline of edits to a Wikipedia article
Refutation probabilities by topic and source
Paper: [9, Tabibian et al.]
Example 2: seismology models
Interactions between different kinds of earthquakes
Paper: [7, Ogata 1983]
Generalized problem formulation
Suppose:
1. Discrete event stream of timestamps
   - Irrespective of application scenario [3, Daley and Vere-Jones], [1, Bacry et al.], [5, Kurashima et al.]
2. Non-trivial temporal dynamics and dependencies:
   - Dependence on own event history
   - Dependence on other event histories
When facing such a problem, consider Hawkes processes!
Section 2
Univariate Point Processes and Hawkes Processes
Temporal point processes
Definition: A random process whose realization consists of discrete events localized in time.
Formally, $N(t) = \int_0^t dN(s)$ and $dN(t) = \sum_{t_i \in \mathcal{H}(t)} \delta(t - t_i)\,dt$, where $dN(t) \in \{0, 1\}$ and $\delta$ is the Dirac delta.
Intensity function
Since it is cumbersome to model event timelines directly, we model event intensity over time:
$\lambda^*(t)\,dt = E[dN(t) \mid \mathcal{H}(t)]$
$\lambda^*(t)\,dt$ is the expected (infinitesimal) change in event count over time, given the event history.
→ $\lambda^*(t)$ is an event rate (i.e., number of events per time unit), and it changes over time!
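One way to read the definition, given that $dN(t)$ only takes the values 0 and 1: the conditional expectation reduces to a conditional probability. The short derivation below is a reading aid and uses only the symbols already defined on these slides.

```latex
\lambda^*(t)\,dt = E[dN(t) \mid \mathcal{H}(t)]
  = 1 \cdot P(dN(t) = 1 \mid \mathcal{H}(t)) + 0 \cdot P(dN(t) = 0 \mid \mathcal{H}(t))
  = P(\text{event in } [t, t + dt) \mid \mathcal{H}(t))
```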
Poisson process
Intensity of a Poisson process:
$\lambda^*(t) = \mu$
Note:
1. Intensity independent of history
2. Events occur uniformly at random
3. Exponential inter-event time distribution
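The third note suggests a simple way to simulate a Poisson process: draw exponential inter-event times and accumulate them. The sketch below assumes NumPy; the function name `sample_poisson_process` and the parameter values are illustrative and not part of the slides.

```python
import numpy as np

def sample_poisson_process(mu, T, seed=0):
    """Sample event times of a homogeneous Poisson process with rate mu on [0, T)."""
    rng = np.random.default_rng(seed)
    times = []
    t = rng.exponential(1.0 / mu)  # first inter-event gap ~ Exp(mu)
    while t < T:
        times.append(t)
        t += rng.exponential(1.0 / mu)  # gaps are i.i.d. exponential
    return np.array(times)

# Illustrative values: rate 2 events per time unit, horizon 1000
events = sample_poisson_process(mu=2.0, T=1000.0)
rate_estimate = len(events) / 1000.0  # should be close to mu
```

On a long horizon, the empirical rate `len(events) / T` should approach µ, which is also the maximum-likelihood estimate.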
Inhomogeneous Poisson process
Intensity of an inhomogeneous Poisson process:
$\lambda^*(t) = g(t) \geq 0$
Note:
1. Intensity independent of history
Survival (or terminating) process
Intensity of a survival (or terminating) process:
$\lambda^*(t) = g^*(t)(1 - N(t)) \geq 0$
Note:
1. Limited number of occurrences
Hawkes (or self-exciting) process
Intensity of Hawkes (or self-exciting) process:
$\lambda^*(t) = \mu + \sum_{t_i \in \mathcal{H}(t)} \alpha \kappa_\beta(t - t_i)$
Note:
1. Clustered (or bursty) occurrence of events
2. Intensity is stochastic and history-dependent
Hawkes (or self-exciting) process
Typical choices for kernel function $\kappa_\beta(t)$ include power law and exponential kernel:
$\kappa_\beta(t) = e^{-\beta t}$
Hence we get:
$\lambda^*(t) = \mu + \sum_{t_i < t} \alpha e^{-\beta(t - t_i)}$
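To make the exponential-kernel intensity concrete, here is a minimal sketch that evaluates it at a single time point. It assumes NumPy; the function name `hawkes_intensity`, the toy history, and the parameter values are illustrative only.

```python
import numpy as np

def hawkes_intensity(t, history, mu, alpha, beta):
    """Evaluate lambda*(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))."""
    history = np.asarray(history, dtype=float)
    past = history[history < t]  # only strictly earlier events contribute
    return mu + alpha * np.exp(-beta * (t - past)).sum()

# Toy history with two events close together and one later event
history = [1.0, 1.5, 4.0]
lam_before = hawkes_intensity(0.5, history, mu=0.2, alpha=0.8, beta=1.0)  # no past events yet
lam_after = hawkes_intensity(1.6, history, mu=0.2, alpha=0.8, beta=1.0)   # shortly after a burst
```

Before the first event the intensity equals the baseline µ; right after the two clustered events it is well above µ, which is the self-exciting behaviour described above.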
What can we do with these models?
Fit models to real data by maximizing log-likelihood
Sample from fitted process via Ogata thinning [6, Ogata 1981]
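A minimal sketch of thinning for the univariate exponential-kernel Hawkes process follows. It relies on the fact that, between events, the exponential kernel only decays, so the intensity just after the current time upper-bounds the intensity until the next event. The function name and parameter values are illustrative assumptions, and this simplified version is not tuned for long timelines (each step is linear in the number of past events).

```python
import numpy as np

def sample_hawkes_thinning(mu, alpha, beta, T, seed=0):
    """Sample a univariate Hawkes process with kernel alpha * exp(-beta * t) on [0, T)."""
    rng = np.random.default_rng(seed)
    events = []
    t = 0.0
    while True:
        past = np.array(events)
        # Intensity just after time t; it bounds lambda* until the next event
        lam_bar = mu + alpha * np.exp(-beta * (t - past)).sum()
        # Candidate point from a dominating Poisson process with rate lam_bar
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            break
        lam_t = mu + alpha * np.exp(-beta * (t - past)).sum()
        # Accept the candidate with probability lam_t / lam_bar
        if rng.uniform() * lam_bar <= lam_t:
            events.append(t)
    return np.array(events)

# Illustrative stationary setting (alpha / beta = 0.5 < 1)
events = sample_hawkes_thinning(mu=0.5, alpha=0.5, beta=1.0, T=200.0)
```

For a stationary process the long-run event rate is µ / (1 − α/β), so the sampled count can be checked against that value as a rough sanity test.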
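Fitting temporal point processes: Poisson
Likelihood of historical timeline with length T (here with three events $t_1, t_2, t_3$):
$\lambda^*(t_1)\lambda^*(t_2)\lambda^*(t_3) \exp\left(-\int_0^T \lambda^*(\tau)\,d\tau\right) = \mu^3 \exp(-\mu T)$
Maximizing log-likelihood:
$\mu^* = \arg\max_\mu \; 3\log(\mu) - \mu T = \frac{3}{T}$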
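The maximizer follows from setting the derivative of the log-likelihood to zero. The worked step below generalizes the three-event example to $n$ events (the symbol $\ell$ for the log-likelihood is introduced here for convenience); with $n = 3$ it recovers $\mu^* = 3/T$.

```latex
\ell(\mu) = n \log \mu - \mu T, \qquad
\frac{d\ell}{d\mu} = \frac{n}{\mu} - T = 0
\;\Longrightarrow\; \mu^* = \frac{n}{T}, \qquad
\frac{d^2\ell}{d\mu^2} = -\frac{n}{\mu^2} < 0
```

The negative second derivative confirms that the stationary point is a maximum.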
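Fitting temporal point processes: Hawkes
Likelihood of historical timeline with length T:
$\lambda^*(t_1)\lambda^*(t_2)\lambda^*(t_3) \cdots \lambda^*(t_n) \exp\left(-\int_0^T \lambda^*(\tau)\,d\tau\right)$
Set $\lambda^*(t) = \mu + \sum_{t_i \in \mathcal{H}(t)} \alpha \kappa_\beta(t - t_i)$ and maximize the log-likelihood:
$\max_{\mu, \alpha} \sum_{i=1}^{n} \log \lambda^*(t_i) - \int_0^T \lambda^*(\tau)\,d\tau$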
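For the exponential kernel, the integral in the log-likelihood has a closed form, so the objective can be handed to a generic optimizer. The sketch below assumes NumPy and SciPy, keeps β fixed (matching the maximization over µ and α above), and uses L-BFGS-B as one possible optimizer. The function names and the toy timeline are illustrative and not taken from the slides.

```python
import numpy as np
from scipy.optimize import minimize

def hawkes_neg_log_likelihood(params, events, T, beta):
    """Negative log-likelihood of an exponential-kernel Hawkes process on [0, T]."""
    mu, alpha = params
    events = np.asarray(events, dtype=float)
    # A[i] = sum_{j < i} exp(-beta * (t_i - t_j)), computed recursively
    A = np.zeros(len(events))
    for i in range(1, len(events)):
        A[i] = np.exp(-beta * (events[i] - events[i - 1])) * (1.0 + A[i - 1])
    log_term = np.log(mu + alpha * A).sum()
    # Closed-form integral of lambda*(tau) over [0, T]
    compensator = mu * T + (alpha / beta) * (1.0 - np.exp(-beta * (T - events))).sum()
    return -(log_term - compensator)

def fit_hawkes(events, T, beta):
    """Maximize the log-likelihood over (mu, alpha) for a fixed beta."""
    result = minimize(
        hawkes_neg_log_likelihood,
        x0=np.array([0.5, 0.5]),
        args=(np.asarray(events, dtype=float), T, beta),
        method="L-BFGS-B",
        bounds=[(1e-6, None), (0.0, None)],  # mu > 0, alpha >= 0
    )
    return result.x

# Toy timeline with visibly clustered events (made-up values)
timeline = [0.5, 0.7, 0.8, 3.0, 3.1, 6.5, 6.6, 6.9, 9.0]
mu_hat, alpha_hat = fit_hawkes(timeline, T=10.0, beta=1.0)
```

Setting α = 0 reduces the objective to the Poisson log-likelihood, which is a convenient correctness check for the implementation.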
Section 3
Multivariate Hawkes Processes
Mutually exciting process
Intensity of mutually exciting (or cross-exciting) Hawkes process:
$\lambda^*(t) = \mu + \sum_{t_i \in \mathcal{H}_b(t)} \alpha \kappa_\beta(t - t_i) + \sum_{t_i \in \mathcal{H}_c(t)} \gamma \kappa_\beta(t - t_i)$
Note:
1. Superposition of processes
2. Clustered occurrence of events affected by neighbors
Multivariate Hawkes process
M-variate Hawkes process with exponential kernel:
$\lambda_m^*(t) = \mu_m + \sum_{n=1}^{M} \sum_{t_i^n < t} \alpha_{mn} e^{-\beta_{mn}(t - t_i^n)}$
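The M-variate intensity can be evaluated with a direct double loop over target dimension m and source dimension n. The sketch below assumes NumPy and the index convention of the formula above, where `alpha[m, n]` and `beta[m, n]` describe the influence of dimension n on dimension m. The function name, the toy histories, and the parameter values are illustrative.

```python
import numpy as np

def multivariate_intensity(t, histories, mu, alpha, beta):
    """Return the vector (lambda*_1(t), ..., lambda*_M(t)).

    histories[n] holds the event times of dimension n.
    """
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    lam = mu.copy()
    for m in range(len(mu)):
        for n, events in enumerate(histories):
            past = np.asarray(events, dtype=float)
            past = past[past < t]  # only strictly earlier events of dimension n
            lam[m] += alpha[m, n] * np.exp(-beta[m, n] * (t - past)).sum()
    return lam

# Illustrative 2-variate setting: one event per dimension
mu = [0.1, 0.5]
alpha = [[0.1, 0.7], [0.5, 0.2]]
beta = [[1.2, 1.0], [0.8, 0.6]]
histories = [[1.0], [2.0]]
lam_at_3 = multivariate_intensity(3.0, histories, mu, alpha, beta)
```

Each entry of the result sums the baseline of that dimension with decayed contributions from all dimensions, including its own.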
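Fitting and sampling multivariate Hawkes
Sampling and fitting multivariate Hawkes processes works as previously.
Example 2-variate Hawkes process sample for T = 8:
[Figure: intensities $\lambda_1$ and $\lambda_2$ over time (x-axis: Time, 0 to 8; y-axis: Intensity, 0.00 to 1.00)]
Parameter values: $\mu = \begin{pmatrix} 0.1 \\ 0.5 \end{pmatrix}$, $\alpha = \begin{pmatrix} 0.1 & 0.7 \\ 0.5 & 0.2 \end{pmatrix}$, $\beta = \begin{pmatrix} 1.2 & 1.0 \\ 0.8 & 0.6 \end{pmatrix}$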
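A sample like the one in this example could be drawn with a multivariate variant of thinning: propose candidates from the summed intensity, then attribute each accepted event to a dimension in proportion to its intensity. The sketch below is one simplified way to do this, assuming NumPy and the kernel $\alpha_{mn} e^{-\beta_{mn} t}$; it is not the code used to produce the slide's figure, and the function name is hypothetical.

```python
import numpy as np

def sample_multivariate_hawkes(mu, alpha, beta, T, seed=0):
    """Sample an M-variate exponential-kernel Hawkes process on [0, T) by thinning."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    M = len(mu)
    histories = [[] for _ in range(M)]

    def intensities(t):
        # Vector of lambda*_m at time t (right limit if t is an event time)
        lam = mu.copy()
        for m in range(M):
            for n in range(M):
                past = np.array(histories[n])
                lam[m] += alpha[m, n] * np.exp(-beta[m, n] * (t - past)).sum()
        return lam

    t = 0.0
    while True:
        lam_bar = intensities(t).sum()  # bounds the total intensity until the next event
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            break
        lam = intensities(t)
        u = rng.uniform() * lam_bar
        if u <= lam.sum():
            # Accepted: pick dimension m with probability lam[m] / lam.sum()
            m = min(int(np.searchsorted(np.cumsum(lam), u)), M - 1)
            histories[m].append(t)
    return [np.array(h) for h in histories]

# Parameter values from the example above
mu = [0.1, 0.5]
alpha = [[0.1, 0.7], [0.5, 0.2]]
beta = [[1.2, 1.0], [0.8, 0.6]]
sample = sample_multivariate_hawkes(mu, alpha, beta, T=8.0)
```

With such a short horizon the number of events per dimension varies considerably between seeds, so a single sample should not be over-interpreted.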
Section 4
A few words of caution!
Pitfalls & Counter-Measures
Ensure stationarity of multivariate Hawkes; otherwise the intensity can grow without bound.
- Stationarity test: spectral radius ρ < 1
- Fitting β: EM, L-BFGS, hyperparameter optimization, ...
- Fit quality: measure with Q-Q plot
Alternative approaches:
- Information theory (e.g. transfer entropy)
- Dynamical systems (e.g. branching processes)
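The stationarity test can be sketched in a few lines. The assumption here is that the test is applied to the matrix of kernel integrals; for the kernel $\alpha_{mn} e^{-\beta_{mn} t}$ each entry integrates to $\alpha_{mn} / \beta_{mn}$. Libraries that parameterize the kernel differently (for example as $\alpha\beta e^{-\beta t}$) need a different matrix. The function name is illustrative, and the values are those of the 2-variate example in Section 3.

```python
import numpy as np

def spectral_radius(alpha, beta):
    """Spectral radius of the matrix of kernel integrals alpha[m, n] / beta[m, n]."""
    branching = np.asarray(alpha, dtype=float) / np.asarray(beta, dtype=float)
    return np.max(np.abs(np.linalg.eigvals(branching)))

# Parameter values of the 2-variate example
alpha = np.array([[0.1, 0.7], [0.5, 0.2]])
beta = np.array([[1.2, 1.0], [0.8, 0.6]])
rho = spectral_radius(alpha, beta)
is_stationary = rho < 1.0
```

Under this parameterization the example comes out stationary, though with a spectral radius fairly close to 1.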
Section 5
Example Application: Understanding Q&A Community Development
Motivation
How and why do some online communities grow and others do not?
How do users become active, and how does their activity evolve over time?
We aim to understand the role of user excitation in the activity levels of Stack Exchange Q&A forums.
→ This will help community managers guide and encourage activity.
Fitting Multivariate Hawkes
Ensuring stationarity:
- Fit only stationary segments of event streams
- Estimate stationary segments via the algorithm of Zeileis et al. [10, Zeileis et al.]
Fitting $\beta_{m,n}$:
- Assume $\beta_{m,n} = \beta$ for all $1 \leq m, n \leq M$
- Algorithm: Bayesian hyperparameter optimization [2, Bergstra et al.]
Dataset
Stack Exchange: 159 Q&A communities from 2008 to 2017 with 22 million events
Dataset group | # communities | Activity total | Age (years) | Growth (%)
Growing | 22 | [7987, 1489384] | [3.08, 7.83] | [169.29, 757.62]
Declining | 22 | [3301, 117474] | [3, 7.75] | [−82.7, −28.53]
STEM | 15 | [15759, 745674] | [2.41, 8.75] | [−35.01, 757.61]
Humanities | 15 | [87, 896631] | [0.17, 6.83] | [−50.10, 127.47]

Communities per group (growth in parentheses where given):
Growing: electronics (757.62%), ru (736.42%), codegolf (510.06%), chemistry, sharepoint, academia, puzzling, tex, codereview, blender, unix, money, gis, ux, crypto, security, stats, salesforce, dba, wordpress (182.28%), opendata (174.69%), askubuntu (169.29%)
Declining: boardgames (−28.53%), fitness (−34.56%), sound (−35.01%), productivity, tridion, parenting, pets, craftcms, webapps, spanish, cooking, ham, bricks, gardening, cstheory, expressionengine, pm, skeptics, sustainability, genealogy (−80.26%), ebooks (−81.52%), stackapps (−82.7%)
STEM: electronics (757.62%), chemistry (473.48%), stats (199.18%), biology, datascience, physics, astronomy, cs, space, cogsci, earthscience, engineering, reverseengineering (0.00%), softwareengineering (−21.28%), sound (−35.01%)
Humanities: philosophy (122.45%), english (117.76%), chinese (23.17%), music, german, mythology, portuguese, christianity, esperanto, arabic, russian, writers, buddhism (−26.62%), french (−27.91%), spanish (−50.10%)
Experimental Setup
Longitudinal comparison:
- We compare groups of datasets across 3 years by fitting a Hawkes process every 3 months
Group comparisons:
- Growing vs. declining
- STEM vs. humanities
Mapping event streams to Hawkes processes:
- Every dataset group is a multivariate process, every community a process realization
- 4 process dimensions distinguish common activity and user types:
  - Questions by Power Users (QP)
  - Questions by Casual Users (QC)
  - Answers by Power Users (AP)
  - Answers by Casual Users (AC)
Growing vs. Declining: Baseline Excitation
Low baseline intensities:
[Figure: baseline intensity per quarter (x-axis: Time (quarter), 1 to 12; y-axis: Intensity, 0.0 to 2.0). Panels: (a) Baseline of Answers by Power Users; (b) Baseline of Questions by Power Users; (c) Baseline of Answers by Casual Users; (d) Baseline of Questions by Casual Users]
Growing vs. Declining: Self- and Cross-Excitation
Early power user excitation, late casual user excitation and late self-excitation:
[Figure: self- and cross-excitation per quarter (x-axis: Time (quarter), 1 to 12; y-axis: Intensity, 0.0 to 2.0). Panels:
(a) Self of AP (annotated "Late Stage Self-Excitation")
(b) Cross of QP on AP (annotated "Early Power User Cross-Excitation")
(c) Cross of AC on AP
(d) Cross of QC on AP (annotated "Early Power User Cross-Excitation")
(e) Cross of AP on QP
(f) Self of QP (annotated "Late Stage Self-Excitation")
(g) Cross of AC on QP
(h) Cross of QC on QP
(i) Excitation of AP on AC
(j) Cross of QP on AC
(k) Self of AC (annotated "Late Stage Self-Excitation")
(l) Cross of QC on AC (annotated "Late Casual User Cross-Excitation")
(m) Cross of AP on QC
(n) Cross of QP on QC
(o) Cross of AC on QC
(p) Self of QC (annotated "Late Stage Self-Excitation")]
STEM vs. Humanities: Self- and Cross-Excitation
Importance of casual users for STEM communities, and of power users for Humanities:
[Figure: excitation per quarter (x-axis: Time (quarter), 1 to 12; y-axis: Intensity, 0.0 to 2.0). Panels: (a) Self of AC (annotated "Casual User Self-Excitation"); (b) Cross of QC on AP (annotated "Power User Cross-Excitation")]
Effect Evaluation: "Sanity Checks"
High self-excitation of casual users in STEM is not due to growth (K-S two-sample test)
Permutation tests confirm the effects do not arise at random:
[Figure: permuted excitation per quarter (x-axis: Time (quarter), 1 to 12; y-axis: Intensity, 0.0 to 2.0). Panels:
(a) Permuted Cross-Excitation of Questions by Power Users on Answers by Power Users (growing vs. declining comparison; annotated "Early Power User Cross-Excitation")
(b) Permuted Cross-Excitation of Questions by Casual Users on Answers by Casual Users (growing vs. declining comparison; annotated "Late Casual User Cross-Excitation")
(c) Permuted Self-Excitation of Answers by Casual Users (growing vs. declining comparison; annotated "Late Stage Self-Excitation")
(d) Permuted Cross-Excitation of Questions by Casual Users on Answers by Power Users (Humanities vs. STEM comparison; annotated "Power User Cross-Excitation")]
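The general idea of a permutation test can be sketched as follows: shuffle the group labels many times and count how often the shuffled difference is at least as large as the observed one. This is a generic two-sample version on the difference in means, assuming NumPy; the slides do not specify the exact statistic or permutation scheme used in the study, and the numbers below are made up for illustration.

```python
import numpy as np

def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)  # random reassignment of group labels
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if diff >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive
    return (count + 1) / (n_permutations + 1)

# Hypothetical excitation estimates for two groups of communities
growing = [0.9, 1.1, 1.3, 1.0, 1.2, 1.4]
declining = [0.3, 0.4, 0.2, 0.5, 0.35, 0.25]
p_value = permutation_test(growing, declining)
```

A small p-value indicates that a difference of the observed size rarely arises when group membership is assigned at random.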
Effect Evaluation: Predictive Impact
Prediction setup:
- We fit a quarter and predict the next over 3 years
- We measure prediction K-S distance and RMSE
- We compare 3 models in the Growing-vs-Declining setting:
  - Baseline
  - Excitation Effects Removed
  - Full
Excitation effects matter for prediction:
- Best performance by Full model
- Quarters where the Excitation Effects Removed model performs worse allow for ranking effects with respect to predictive importance:
  1. Late Stage Self-Excitation
  2. Early Power User Excitation
  3. Late Casual User Excitation
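The two error measures named in the prediction setup can be computed as in the sketch below, assuming NumPy and SciPy. Which quantities the study actually compares (for example event counts or inter-event times) is not stated on the slides, so the inputs here are placeholders with made-up values.

```python
import numpy as np
from scipy.stats import ks_2samp

def rmse(predicted, observed):
    """Root mean squared error between two equally long sequences."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.sqrt(np.mean((predicted - observed) ** 2))

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (maximum gap between empirical CDFs)."""
    return ks_2samp(sample_a, sample_b).statistic

# Placeholder data: predicted vs. observed event counts per interval
predicted_counts = [2, 5, 3, 6]
observed_counts = [3, 5, 2, 4]
count_rmse = rmse(predicted_counts, observed_counts)

# Placeholder data: simulated vs. observed inter-event times
simulated_gaps = [0.2, 0.5, 0.9, 1.4, 2.0]
observed_gaps = [0.3, 0.4, 1.0, 1.1, 2.5]
gap_ks = ks_distance(simulated_gaps, observed_gaps)
```

RMSE penalizes pointwise deviations, while the K-S distance compares whole distributions, so the two measures can rank models differently.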
Limitations
Tested result robustness only to slight changes in thresholds
- Extend Hawkes to include time-varying parameters
High-dimensional Hawkes process may be more realistic
Pinpointing exact transition dates beyond scope of this work
No claim of causality
Conclusions
Leveraging Hawkes processes, we uncovered user excitation effects in comparisons of growing-vs-declining and STEM-vs-humanities Stack Exchange communities
Impact:
- Importance of timing in rotating user mix
- Excitation effects may serve as development indicator
- Adjust community management according to communities' topical focus
Future work:
- Generalize to other Q&A platforms
- Extend methodological approach to other domains (e.g. different activities or platforms altogether)
Source: [8, Santos et al.]
Section 6
Further Resources
Code Resources
Python package: Tick
https://github.com/X-DataInitiative/tick
C++ package: PtPack
https://github.com/dunan/MultiVariatePointProcess
Hawkes network inference: Pyhawkes
https://github.com/slinderman/pyhawkes
Models from papers:
- Distilling Information Reliability and Source Trustworthiness from Digital Traces
  http://btabibian.com/projects/reliability/
- Modeling Interdependent and Periodic Real-World Action Sequences
  http://snap.stanford.edu/tipas/
References
[1] E. Bacry, I. Mastromatteo, and J.-F. Muzy. Hawkes processes in finance. Market Microstructure and Liquidity, 1(01):1550005, 2015.
[2] J. Bergstra, D. Yamins, and D. Cox. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In Proceedings of the 30th International Conference on Machine Learning (ICML'13), pages 115–123, 2013.
[3] D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods. Springer Science & Business Media, 2003.
[4] M. Gomez-Rodriguez. Machine learning for dynamic social network analysis seminar. http://learning.mpi-sws.org/uc3m-seminar/, 2017. Accessed: 2018-02-10.
[5] T. Kurashima, T. Althoff, and J. Leskovec. Modeling interdependent and periodic real-world action sequences. In Proceedings of the 2018 World Wide Web Conference, pages 803–812. International World Wide Web Conferences Steering Committee, 2018.
[6] Y. Ogata. On Lewis' simulation method for point processes. IEEE Transactions on Information Theory, 27(1):23–31, 1981.
[7] Y. Ogata. Likelihood analysis of point processes and its applications to seismological data. Bulletin of the International Statistical Institute, 50:943–961, 1983.
[8] T. Santos, S. Walk, R. Kern, M. Strohmaier, and D. Helic. Self- and cross-excitation in Stack Exchange question & answers communities. In WWW, 2019.
[9] B. Tabibian, I. Valera, M. Farajtabar, L. Song, B. Schölkopf, and M. Gomez-Rodriguez. Distilling information reliability and source trustworthiness from digital traces. In Proceedings of the 26th International Conference on World Wide Web, pages 847–855. International World Wide Web Conferences Steering Committee, 2017.
[10] A. Zeileis, C. Kleiber, W. Krämer, and K. Hornik. Testing and dating of structural changes in practice. Computational Statistics & Data Analysis, 44:109–123, 2003.