
Learning to Detect Events with Markov-Modulated Poisson Processes

Ihler, Hutchins and Smyth (2007)

Outline

Problem: Finding unusual activity (events) in rhythms of natural human activity

Method: unsupervised learning
- A time-varying Poisson process modulated by a hidden Markov process (events)
- A Bayesian framework for parameter learning

Why is it hard?

Chicken-and-egg problem: where do we start?

Previous approach (baseline): a simple threshold model, which has severe limitations

Need to quantify the notion of unusual activity:
- How unusual is a measurement?
- How persistent is a deviating measurement?

The Data Sets

Two data sets were used:

Building data
- Counts of people entering and exiting a building
- 15 weeks of data, 30-minute time bins
- 29 known events in the 15 weeks

Freeway traffic data
- Vehicle counts on a freeway on-ramp
- 6 months of data, 5-minute time bins
- 78 known events in the 6 months
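Both data sets are simply count series: the number of arrivals in each fixed-width time bin. A minimal sketch (hypothetical input; the paper starts from already-binned counts) of producing such a series from raw timestamps:

import pandas as pd

def bin_counts(timestamps, freq="30min"):
    # Count one arrival per timestamp, then sum the arrivals within each time bin.
    s = pd.Series(1, index=pd.to_datetime(timestamps))
    return s.resample(freq).sum().astype(int)

For example, bin_counts(door_log_timestamps, freq="30min") would give the building-style series, and freq="5min" the freeway-style one (door_log_timestamps is a hypothetical list of entry times).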

Building Data: example day (figure)

Building Data: example week (figure)

Freeway Traffic Data: example day (figure)

Freeway Traffic Data: example week (figure)

A naïve Poisson model

Is the data actually Poisson? For a Poisson distribution the mean equals the variance; is this the case in our data?
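A minimal sketch of this check (assuming counts is a timestamp-indexed pandas Series like the one produced above): group the bins by time-of-week slot and compare the empirical mean and variance within each slot; variance-to-mean ratios well above 1 suggest the data are overdispersed relative to a plain Poisson model.

import pandas as pd

def dispersion_by_slot(counts):
    # One slot per (day-of-week, time-of-day); a Poisson fit implies mean and variance are roughly equal.
    slots = [(ts.dayofweek, ts.time()) for ts in counts.index]
    g = counts.groupby(slots)
    return pd.DataFrame({"mean": g.mean(), "var": g.var(), "var/mean": g.var() / g.mean()})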

A Baseline Model

Use a simple threshold approach: we say there is an event if

P(N;λ) < ε
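A minimal sketch of this baseline (hypothetical names; lam would be some per-bin rate estimate, e.g. the average count for that time-of-week slot):

import numpy as np
from scipy.stats import poisson

def threshold_events(counts, lam, eps=1e-3):
    # Flag bin t as an event whenever the observed count is improbable
    # under a Poisson distribution with the "normal" rate lam[t].
    return poisson.pmf(np.asarray(counts), np.asarray(lam)) < eps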

Problems with this Approach

- Hard to detect sustained small variations
- Hard to capture event duration
- Chicken-and-egg problem

The model (1)

Assuming the processes are additive: the observed count N(t) is the sum of the normal, rhythmic activity N0(t) and the event activity NE(t)

...which is a fair assumption

The model (2)

What is a Markov Process?

Illustration: a two-state Markov chain with states A = Rainy and B = Sunny, with example transition probabilities (0.1, 0.5) labelling the arrows (figure)

Modelling Events with a Markov Process

We define a three-state Markov chain: z(t) is the state at time t, and the three possible states are
- 0 if there is no event
- +1 if there is a positive event
- -1 if there is a negative event

With a 3×3 transition matrix Mz
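A minimal sketch (illustrative parameters, not the paper's fitted values) of the generative story so far: z(t) follows a Markov chain with transition matrix M, and the observed count is N(t) = N0(t) + NE(t), with N0(t) ~ Poisson(lam[t]) for the normal rhythm and extra event counts NE(t) ~ Poisson(gamma) whenever z(t) is the positive-event state. Negative events, which suppress counts, are omitted here for brevity.

import numpy as np

def simulate(lam, M, gamma, rng=None):
    rng = rng or np.random.default_rng(0)
    T, K = len(lam), M.shape[0]
    z = np.zeros(T, dtype=int)            # state indices: 0 = no event, 1 = positive event
    N0 = rng.poisson(lam)                 # normal, rhythmic activity
    NE = np.zeros(T, dtype=int)           # extra counts attributed to events
    for t in range(1, T):
        z[t] = rng.choice(K, p=M[z[t - 1]])
        if z[t] == 1:
            NE[t] = rng.poisson(gamma)
    return z, N0, NE, N0 + NE

For example, simulate(np.full(48, 10.0), np.array([[0.95, 0.05], [0.2, 0.8]]), gamma=20.0) generates one day of 30-minute bins with occasional positive events (a two-state M, since the negative-event state is dropped in this sketch).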

Details of the Markov Process

We give each row in the transition matrix a Dirichlet prior:

Given z(t), we can model NE(t) as Poisson with rate γ(t). We give γ(t) a Gamma prior Γ(γ; aE, bE), which is independent of t

We can then marginalize out over γ(t):
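Assuming the shape-rate parameterization of the Gamma prior (a notational assumption, not stated on the slide), this integral has a standard closed form, the negative binomial distribution:

P(NE) = ∫ P(NE; γ) Γ(γ; aE, bE) dγ = C(NE + aE - 1, NE) · (bE / (bE + 1))^aE · (1 / (bE + 1))^NE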

Graphical Model of the Dependencies

Learning the parameters

If we are given the hidden variables N0(t), NE(t) and z(t), we can compute MAP estimates or draw posterior samples of the parameters λ(t) and Mz

So, we can use MCMC; iterate between sampling from the hidden variables (given the parameters), and the parameters (given the variables)

Sampling the hidden variables, given the parameters

Rough outline:
- First, use the forward-backward algorithm [Baum et al. 1970] to sample z(t)
- Then, given z(t), determine N0(t) and NE(t) by sampling
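A minimal sketch of the second step for a positive event (not the paper's exact procedure; negative events, where activity is suppressed, need separate handling): conditional on z(t) = +1 and the total count N(t), the normal share N0(t) is drawn from a distribution proportional to Poisson(k; λ(t)) · Poisson(N(t) - k; γ), and NE(t) is the remainder.

import numpy as np
from scipy.stats import poisson

def split_count(N, lam, gamma, rng=None):
    # Sample N0 from p(N0 = k | N, z = +1), proportional to Pois(k; lam) * Pois(N - k; gamma),
    # then take NE = N - N0.
    rng = rng or np.random.default_rng()
    k = np.arange(N + 1)
    w = poisson.pmf(k, lam) * poisson.pmf(N - k, gamma)
    N0 = rng.choice(k, p=w / w.sum())
    return N0, N - N0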

Sampling the parameters, given the hidden variables

The conjugate prior distributions give us a straightforward way to compute the posteriors

Use the sufficient statistics of the data as (updating) parameters for the posterior:
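A minimal sketch of these conjugate updates (hypothetical variable names, with the rate prior simplified to a single Gamma prior on one Poisson rate): the Gamma(a, b) prior becomes Gamma(a + sum of counts, b + number of bins), and each Dirichlet row prior on the transition matrix is updated by adding the observed transition counts.

import numpy as np

def gamma_posterior(a, b, counts):
    counts = np.asarray(counts)
    return a + counts.sum(), b + counts.size        # updated (shape, rate)

def dirichlet_posterior(alpha, z, K=3):
    # z: sequence of state indices in {0, ..., K-1}; alpha: prior pseudo-counts for each row.
    trans = np.zeros((K, K))
    for s, t in zip(z[:-1], z[1:]):
        trans[s, t] += 1
    return np.asarray(alpha) + trans                # posterior pseudo-counts, row by row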

Prior distributions of zij and γ(t)

Markov-modulated Poisson processes are sensitive to the selection of priors for zij and γ(t)

For the domains of these models, we often have strong ideas on e.g. what constitutes a “rare” event

Use these ideas to build strong priors in the model in order to avoid overfitting, and to adjust threshold levels of event detection
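A minimal sketch (illustrative numbers only, not the paper's settings) of what such strong priors might look like: Dirichlet pseudo-counts that keep the chain in the no-event state most of the time but make events persistent once started, plus a Gamma prior on γ that encodes the expected size of an event.

import numpy as np

# Dirichlet pseudo-counts, one row per current state (0 = none, 1 = positive event, 2 = negative event).
alpha = np.array([
    [1000.0,  1.0,   1.0],   # events start rarely
    [  25.0, 75.0,   0.01],  # positive events persist, almost never flip sign
    [  25.0,  0.01, 75.0],   # same for negative events
])

# Gamma(shape, rate) prior on the event rate: prior mean a_E / b_E = 30 extra counts per bin.
a_E, b_E = 5.0, 1.0 / 6.0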

Calculating Results

We are looking to detect unusual events; we can use our model to do this by calculating the posterior probability that an event is present at each time:

We can then compare our predictions with the known event occurrences
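A minimal sketch (hypothetical names) of this step using MCMC output: with z_samples a (number of samples × number of bins) array of sampled state sequences, the posterior event probability in bin t is the fraction of samples in which z(t) ≠ 0, and thresholding it gives predicted event bins to compare against the known-event list.

import numpy as np

def event_posterior(z_samples):
    return (np.asarray(z_samples) != 0).mean(axis=0)     # P(event in bin t | data)

def predicted_event_bins(z_samples, threshold=0.5):
    return np.flatnonzero(event_posterior(z_samples) >= threshold)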

Example Posterior Predictions (1)

Example Posterior Predictions (2)

Example Posterior Predictions (3)

Comparison of Predicted Events with Known Events

Other Possible Inferences

The model can be modified to test the degree of heterogeneity of the time process. We can ask questions like

- Are all weekdays essentially the same?
- Are all afternoons essentially the same?

We can estimate event attendance
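A minimal sketch (hypothetical names) of the attendance estimate: since NE(t) is the count attributed to the event, summing the sampled NE(t) over the bins an event spans and averaging over MCMC samples gives a posterior mean attendance.

import numpy as np

def estimate_attendance(NE_samples, t0, t1):
    # NE_samples: (number of samples × number of bins) array of sampled event counts NE(t).
    return np.asarray(NE_samples)[:, t0:t1 + 1].sum(axis=1).mean()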

Conclusion

The model is much more effective than the threshold approach

- Good detection rate
- Difficult to assess the false positive rate

Possibility for extension

Questions