Mining Event Periodicity from Incomplete Observations

Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han. University of Illinois at Urbana-Champaign (*now at Penn State University). KDD 2012, Beijing, China.

Page 1: Mining Event Periodicity from Incomplete Observations

Zhenhui Jessie Li 1

Mining Event Periodicity from Incomplete Observations

Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han

University of Illinois at Urbana-Champaign

*Now at Penn State University

KDD 2012

Beijing, China

Page 2: Mining Event Periodicity from Incomplete Observations


Prologue: Detect Periodicity in Movements [Li et al., KDD’10]

Problem: What is the periodicity of the movement?

Bee example:
• 8 hours in hive
• 16 hours flying nearby

Page 3: Mining Event Periodicity from Incomplete Observations


Prologue: Detect Periodicity in Movements [Li et al., KDD’10]

Observe the in-and-out movements from the reference spot (i.e., hive).

[Figure: a two-dimensional movement is converted into a one-dimensional binary sequence over time (in hive / outside hive)]

Easy to see the periodicity.
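The conversion from a movement track to an in/out sequence can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_binary_sequence`, the circular reference region with a `radius`, and the sample positions are all assumptions made for the example.

```python
import math

def to_binary_sequence(track, reference, radius):
    """Map each (x, y) position to 1 if it lies within `radius` of `reference`, else 0."""
    rx, ry = reference
    return [1 if math.hypot(x - rx, y - ry) <= radius else 0 for x, y in track]

# Hypothetical positions: two samples near the hive at (0, 0), then two samples away,
# repeated three times.
track = [(0.1, 0.0), (0.0, 0.2), (5.0, 4.0), (6.0, 3.0)] * 3
sequence = to_binary_sequence(track, reference=(0.0, 0.0), radius=1.0)
print(sequence)
```

Once the track is reduced to this 0/1 sequence, the repeating pattern is visible without looking at the two-dimensional path.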

Page 4: Mining Event Periodicity from Incomplete Observations


Challenge: Periodicity Detection for Incomplete Observations

• Two factors result in incomplete observations: inconsistent sampling and a low sampling rate

• Movement data collection in real scenarios:
– Human movement data collected from cellphones: locations are reported only when making calls
– Animal movement data: 2~3 locations in 3~5 days

2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…

[Figure: in hive / outside hive sequences for complete observations and for incomplete observations]
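A sparse log like the one on this slide can be turned into (timestamp, value) pairs before any periodicity analysis. The sketch below is only an illustration: the hourly unit, the choice of 2009-05-02 00:00 as the origin, and the function name `parse_observations` are assumptions, not details given in the talk.

```python
from datetime import datetime

RAW_LOG = """\
2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
"""

def parse_observations(text, origin, unit_seconds=3600):
    """Return (timestamp_index, value) pairs; value is 1 for 'in' and 0 for 'out'."""
    observations = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        date_part, time_part, state = line.split()
        when = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M")
        # Index of the time unit (here: hour) that contains this observation.
        index = int((when - origin).total_seconds() // unit_seconds)
        observations.append((index, 1 if state == "in" else 0))
    return observations

origin = datetime(2009, 5, 2)  # assumed start of the observation window
observations = parse_observations(RAW_LOG, origin)
print(observations)
```

Most hourly indices never appear in the result, which is exactly the incompleteness the slide describes.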

Page 5: Mining Event Periodicity from Incomplete Observations


A Challenging Case of Detecting Periodicity for Incomplete Observations

Sparse Raw Data:

2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…

[Figure: the sparse observations marked on a timeline as in / out / in]

Any periodicity in the above sequence?

Page 6: Mining Event Periodicity from Incomplete Observations


Mining Periodicity in Incomplete Data

• Event has a period of 20
• Occurrences of the event happen between 20k+5 and 20k+10

Page 7: Mining Event Periodicity from Incomplete Observations


A Probabilistic Model for Periodic Event

Example:
• Human daily periodicity of visiting the office
• Period is 24 (hours)
• Visiting the office at 10:00-11:00 and 14:00-16:00
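The office example can be written down as a vector with one probability per position in the period. This is a sketch under stated assumptions: the slide gives the period and the office hours but not the probability values, so `P_IN_OFFICE_HOURS` and `P_OTHERWISE` are hypothetical, as is the helper name `event_probability`.

```python
T = 24  # period in hours, from the slide

# Assumed probabilities: the slide names the office hours, not the values.
P_IN_OFFICE_HOURS = 0.9
P_OTHERWISE = 0.05
OFFICE_HOURS = {10, 14, 15}  # the hours starting at 10:00, 14:00 and 15:00

# p[i] is the probability that the event occurs at position i of the period.
p = [P_IN_OFFICE_HOURS if hour in OFFICE_HOURS else P_OTHERWISE for hour in range(T)]

def event_probability(t):
    """Probability that the event occurs at absolute timestamp t (in hours)."""
    return p[t % T]

print(event_probability(10), event_probability(34), event_probability(5))
```

Timestamps 10 and 34 share the same position in the period (10), so they share the same probability.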

Page 8: Mining Event Periodicity from Incomplete Observations


A Probabilistic Model for Periodic Event with Random Observation

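[Figure: the periodic model generates the event sequence, of which only some timestamps are observed, e.g. x(5)=1 and x(62)=0]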
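One way to read this slide is as a two-step generative process: the event outcome at each timestamp is drawn from the periodic model, and then only a random subset of timestamps is actually observed. The sketch below follows that reading. The function name `generate_observations`, the probability values, and the observation probability of 0.1 are assumptions for illustration, not values from the talk.

```python
import random

def generate_observations(p, length, observe_prob, seed=0):
    """Draw x(t) with probability p[t mod T] and keep each t with probability observe_prob."""
    rng = random.Random(seed)
    T = len(p)
    observed = {}
    for t in range(length):
        x_t = 1 if rng.random() < p[t % T] else 0  # event outcome at timestamp t
        if rng.random() < observe_prob:            # is timestamp t observed at all?
            observed[t] = x_t
    return observed

# Hypothetical model: period 20, event likely at positions 5..10 (values are assumed).
p = [0.9 if 5 <= i <= 10 else 0.05 for i in range(20)]
observed = generate_observations(p, length=2000, observe_prob=0.1)
positives = sorted(t for t, x in observed.items() if x == 1)
print(len(observed), "observed timestamps,", len(positives), "of them positive")
```

The result maps a sparse set of timestamps to 0/1 values, in the same spirit as the x(5)=1 and x(62)=0 examples on the slide.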

Page 9: Mining Event Periodicity from Incomplete Observations


Periodicity Detection by Overlaying Observations

[Figure: observations overlaid under two candidate periods. True period: skewed distribution. Wrong period: even distribution.]
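Overlaying amounts to folding every observed timestamp onto its position within the candidate period and counting. A minimal sketch, with the function name `overlay` and the sample timestamps chosen only for illustration:

```python
from collections import Counter

def overlay(timestamps, T):
    """Count how many timestamps fall on each position 0..T-1 after folding by T."""
    counts = Counter(t % T for t in timestamps)
    return [counts.get(i, 0) for i in range(T)]

# Hypothetical positive observations of an event that recurs every 24 time units.
positive = [10, 34, 58, 82, 106, 130, 154, 178, 202, 226, 250, 274]

print(overlay(positive, 24))  # folded by the true period
print(overlay(positive, 23))  # folded by a wrong period
```

With the true period all counts pile up on a single position (skewed), while the wrong period spreads them one per position (even).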

Page 10: Mining Event Periodicity from Incomplete Observations


Relationship between Observation Ratio and Probabilistic Model

[Figure panels: Pos/Neg Ratio; Periodic Distribution Vector]
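The formulas on this slide did not survive in the transcript. One way to state the relationship, under assumptions that are ours rather than the slide's, is the following. Let $T$ be the true period, let $p_i$ be the probability of the event at position $i$, let $S^{+}$ and $S^{-}$ be the observed timestamps with value 1 and 0, and suppose each timestamp is observed independently with the same probability. The symbols $\mu^{+}$, $\mu^{-}$, and $I$ are notation introduced here for illustration.

```latex
% Ratio of positive (negative) observations whose position falls in I \subseteq \{0, \dots, T-1\}
\mu^{+}(I, T) = \frac{\left|\{\, t \in S^{+} : t \bmod T \in I \,\}\right|}{|S^{+}|},
\qquad
\mu^{-}(I, T) = \frac{\left|\{\, t \in S^{-} : t \bmod T \in I \,\}\right|}{|S^{-}|}

% As the number of observations grows, the ratios approach quantities
% that depend only on the periodic distribution vector p
\mu^{+}(I, T) \;\longrightarrow\; \frac{\sum_{i \in I} p_i}{\sum_{i=0}^{T-1} p_i},
\qquad
\mu^{-}(I, T) \;\longrightarrow\; \frac{\sum_{i \in I} (1 - p_i)}{\sum_{i=0}^{T-1} (1 - p_i)}
```

Under these assumptions the observation ratios do not depend on which timestamps happened to be observed, only on the vector $p$.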

Page 11: Mining Event Periodicity from Incomplete Observations


Discrepancy Score to Measure Periodicity

If T (=24) is the correct period, the discrepancy score should be large for a certain set of timestamps

If T (=23) is a wrong period, the discrepancy score is likely to be close to zero for any set of timestamps
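The slide does not spell out the score itself. A plausible reading, assumed here, is the ratio of positive observations falling on a chosen set of positions minus the ratio of negative observations falling on the same set. The names `ratio` and `discrepancy`, the position set `{10}`, and the sample timestamps are all hypothetical.

```python
def ratio(timestamps, I, T):
    """Fraction of timestamps whose position t mod T falls in the set I."""
    if not timestamps:
        return 0.0
    return sum(1 for t in timestamps if t % T in I) / len(timestamps)

def discrepancy(positive, negative, I, T):
    """Positive ratio minus negative ratio on the position set I (assumed definition)."""
    return ratio(positive, I, T) - ratio(negative, I, T)

# Hypothetical sparse observations: the event is seen at hour 10 on some days
# and is absent at scattered other hours.
positive = [10, 34, 82, 130, 226]
negative = [3, 40, 69, 100, 141, 190, 215, 260]

print(discrepancy(positive, negative, {10}, 24))  # candidate period 24
print(discrepancy(positive, negative, {10}, 23))  # candidate period 23
```

With this few points the score under the wrong period is small rather than exactly zero; it shrinks further as more observations are folded in.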

Page 12: Mining Event Periodicity from Incomplete Observations


Periodicity Measure
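The formula on this slide is missing from the transcript. A sketch of one natural measure, assumed here rather than taken from the slide, is the largest discrepancy over all position sets, which is reached by keeping exactly the positions where the positive ratio exceeds the negative ratio. The function name `periodicity_score`, the simulated data (event likely between 8:00 and 18:00, 20% of timestamps observed), and the candidate range are illustrative assumptions.

```python
import random

def periodicity_score(positive, negative, T):
    """Largest discrepancy over all position sets: sum of the positive per-position gaps."""
    n_pos, n_neg = len(positive), len(negative)
    if n_pos == 0 or n_neg == 0:
        return 0.0
    pos_counts = [0] * T
    neg_counts = [0] * T
    for t in positive:
        pos_counts[t % T] += 1
    for t in negative:
        neg_counts[t % T] += 1
    return sum(
        max(0.0, pos_counts[i] / n_pos - neg_counts[i] / n_neg) for i in range(T)
    )

# Hypothetical data: true period 24, event likely during hours 8..17.
rng = random.Random(42)
p = [0.9 if 8 <= hour < 18 else 0.05 for hour in range(24)]
positive, negative = [], []
for t in range(24 * 400):
    if rng.random() < 0.2:  # timestamp t is observed with probability 0.2
        (positive if rng.random() < p[t % 24] else negative).append(t)

# Multiples of the true period (e.g. 48) score at least as high under this raw
# measure, so the candidate range here stops below 48.
scores = {T: periodicity_score(positive, negative, T) for T in range(2, 41)}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3), round(scores[23], 3))
```

The raw maximum is biased upward when there are few observations per position, so a practical version would likely need some form of normalization; this sketch leaves that out.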

Page 13: Mining Event Periodicity from Incomplete Observations


Performance Comparisons

[Figure: performance comparison plotted against sampling rate (ratio of observed points in the complete sequence)]

Page 14: Mining Event Periodicity from Incomplete Observations


Experiment on Real Human Data

One person’s visits to a specific location

[Figure panels: sampling rate 20 min; sampling rate 1 hour]

Page 15: Mining Event Periodicity from Incomplete Observations


Problems with Using Fourier Transform to Detect Periodicity

[Figure panels: T=4; T=16]

Page 16: Mining Event Periodicity from Incomplete Observations


Summary: Mining Event Periodicity from Incomplete Observations

• Motivation
– Challenge of the real data: incomplete observations (inconsistent + low sampling rate)

• Method
– Overlay the segments and measure the “skewness” of the distribution
– Theoretically prove the correctness of the method

• Application
– Location prediction
– 2nd place in Nokia Mobile Data Challenge 2012
– Periodicity-based feature + SVM

Thanks! Questions?