Mining Trajectory Profiles for Discovering User Communities Speaker : Chih-Wen Chang National Chiao Tung University, Taiwan 2009.11.03 Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng


Mining Trajectory Profiles for Discovering User Communities

Speaker: Chih-Wen Chang
National Chiao Tung University, Taiwan

2009.11.03

Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng


Outline

• Motivation
• Goal
• Framework
  – Preprocess
  – Construct User’s Profiles
  – Formulate Distance function
  – Identify Community
• Experiments
• Conclusion


Motivation (1/2)

• With the rapid development of positioning techniques, users can easily collect their trajectories
  – GPS loggers, smartphones, and navigation devices


Motivation (2/2)

• Many GPS community sites have been established
  – Users can share their own trajectories
  – Users can search trajectories

[Figure: screenshots of My Tracks and EveryTrail, and of a trajectory query]


Goal

• Mine user communities from raw trajectories
  – User Communities
    • Sets of users who have similar moving behaviors
• Applications
  – Find new friends
  – Recommendation
  – Rank of trajectories


[Figure: user profiles, the distance measured between users, and the resulting Community 1 and Community 2]

1. Construct user’s profile
2. Formulate distance function
3. Identify user communities


Framework

Preprocess

Construct User’s Profile

Measure Distance Between Users

Identify Community


Preprocessing

• Step 1:
  – Find frequent regions
    • Input: all trajectories of users
    • Output: frequent regions
    • Density-based approach
• Step 2:
  – Transform trajectories into sequences of frequent region IDs
    • T1 : <A, B, D>
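The two preprocessing steps can be sketched roughly as follows. The slides only say that a density-based approach is used, so this sketch substitutes a simple grid count for it: a cell is treated as a frequent region when it holds at least `min_points` points. The names `cell_of`, `find_frequent_regions`, `to_region_sequence`, `cell_size` and `min_points`, and the sample coordinates, are all assumptions made for illustration.

```python
from collections import Counter
from string import ascii_uppercase


def cell_of(point, cell_size):
    """Map an (x, y) point to the grid cell that contains it."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def find_frequent_regions(trajectories, cell_size=1.0, min_points=2):
    """Step 1: return {cell: region_id} for cells with at least min_points points.

    Grid counting is a stand-in for the density-based approach on the slide.
    """
    counts = Counter(cell_of(p, cell_size) for t in trajectories for p in t)
    dense = sorted(cell for cell, n in counts.items() if n >= min_points)
    # Region ids are letters in sorted cell order (an illustrative choice).
    return {
        cell: (ascii_uppercase[i] if i < 26 else f"R{i}")
        for i, cell in enumerate(dense)
    }


def to_region_sequence(trajectory, regions, cell_size=1.0):
    """Step 2: replace points by region ids, skipping points outside any region."""
    sequence = []
    for point in trajectory:
        region = regions.get(cell_of(point, cell_size))
        # Collapse consecutive points that fall into the same region.
        if region is not None and (not sequence or sequence[-1] != region):
            sequence.append(region)
    return sequence


# Hypothetical sample data: three short trajectories of (x, y) points.
trajectories = [
    [(0.2, 0.3), (1.5, 0.4), (3.1, 0.2)],
    [(0.5, 0.5), (1.2, 0.8), (3.7, 0.9)],
    [(0.9, 0.1), (2.5, 0.5), (3.3, 0.6)],
]
regions = find_frequent_regions(trajectories)
sequences = [to_region_sequence(t, regions) for t in trajectories]
print(sequences)
```

The third trajectory passes through a cell that only it visits, so that point is dropped and the sequence skips straight to the next frequent region.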


Construct User’s Profiles (1/2)

• User’s Profile
  – Probabilistic Suffix Tree (abbreviated as PST)
    • Find and organize trajectory patterns
    • Record the probability of next movements

[Figure: a PST whose nodes are frequently moving sequences, each with a conditional table of next possible movements]


Construct User’s Profiles (2/2)

• Construct PST
  – Level by level
  – Two operations:
    • Create a child node
      – The counts of Before symbol > MinSup
    • Add a symbol into the related conditional table
      – The counts of After symbol > MinSup

Example (MinSup = 0.2)

Sequence: ABEABAACBADFHJHIEDH

Tree: root → A:0.5; root → B:0.375 → AB:0.25

Before symbol: A, count 2, 2/3 × 0.375 = 0.25
After symbol: A, count 1, 1/2 = 0.5; E, count 1, 1/2 = 0.5

Node B

SID   Count   C. Prob.
A     1       0.5
E     1       0.5
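A minimal sketch of the level-by-level construction, under stated assumptions: node probabilities are estimated as plain relative frequencies over the whole sequence, which need not reproduce the numbers on the slide, and a child is created by prepending a "before" symbol to an existing node. The names `build_pst`, `next_symbol_table` and `max_depth` are hypothetical, and `min_sup=0.1` is chosen only so that the two-symbol node survives under this estimate.

```python
from collections import Counter


def next_symbol_table(sequence, pattern, min_sup):
    """Conditional table: probability of each symbol that follows `pattern`."""
    k = len(pattern)
    followers = Counter(
        sequence[i + k]
        for i in range(len(sequence) - k)
        if sequence[i:i + k] == pattern
    )
    total = sum(followers.values())
    # Keep only "after" symbols whose conditional probability exceeds min_sup.
    return {s: c / total for s, c in followers.items() if c / total > min_sup}


def build_pst(sequence, min_sup=0.2, max_depth=2):
    """Return {pattern: {"prob": p, "next": conditional table}}, built level by level."""
    n = len(sequence)
    pst = {}
    frontier = {""}  # the root is the empty pattern
    for depth in range(1, max_depth + 1):
        counts = Counter(sequence[i:i + depth] for i in range(n - depth + 1))
        next_frontier = set()
        for pattern, count in counts.items():
            parent = pattern[1:]  # dropping the "before" symbol gives the parent
            prob = count / n      # assumption: relative frequency as the estimate
            if parent in frontier and prob > min_sup:
                pst[pattern] = {
                    "prob": prob,
                    "next": next_symbol_table(sequence, pattern, min_sup),
                }
                next_frontier.add(pattern)
        frontier = next_frontier
    return pst


pst = build_pst("ABEABAACBADFHJHIEDH", min_sup=0.1, max_depth=2)
print(sorted(pst))
print(pst["AB"]["next"])
```

For the slide's sequence, the two occurrences of AB are followed once by E and once by A, so the conditional table of that node holds A and E with probability 0.5 each, as in the table on the slide.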


Formulate Distance function (1/3)

• Determine distance of users
  1. Transform the PST into a Moving Sequence List
     Each element in the moving sequence list is a branch of the PST with its probabilities

L1[1..2] = <[(A,0.5)], [(B,0.375), (AB,0.33)]>
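One way to read this transformation is that every root-to-leaf branch of the PST becomes one element of the list. The sketch below assumes the PST is given as a mapping from node pattern to probability and that a node's parent is the pattern without its first symbol; `moving_sequence_list` and `pst_probs` are hypothetical names, and every suffix of a node is assumed to be present in the mapping.

```python
def moving_sequence_list(pst_probs):
    """Turn {pattern: probability} into a list of root-to-leaf branches."""

    def is_leaf(pattern):
        # A node is a leaf when no other node has it as its parent.
        return not any(
            other[1:] == pattern for other in pst_probs if other != pattern
        )

    branches = []
    for leaf in sorted(p for p in pst_probs if is_leaf(p)):
        # Walk up from the leaf by repeatedly dropping the first symbol,
        # e.g. "AB" -> ["AB", "B"], then reverse to get root-to-leaf order.
        path = [leaf[i:] for i in range(len(leaf))]
        branches.append([(p, pst_probs[p]) for p in reversed(path)])
    return branches


# Node probabilities taken from the L1 example on the slide.
l1 = moving_sequence_list({"A": 0.5, "B": 0.375, "AB": 0.33})
print(l1)
```

With the three nodes of the slide's example, the result has two elements: the branch ending at A, and the branch that goes through B to AB.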


Formulate Distance function (2/3)

2. Define the distance between PSTs
   – Find the minimal dist(Li[1..m], Lj[1..n])
   – Use three editing operations
     • Insertion

       Before:
       L1={m1:0.3,m2:0.2,m3:0.3}
       L2={m1:0.3,m2:0.2}

       After inserting m3:0.3 into L2:
       L1={m1:0.3,m2:0.2,m3:0.3}
       L2={m1:0.3,m2:0.2,m3:0.3}

       Cost = 0.3

[Figure: trees T1 and T2 for the insertion, with labels 0.2 and 0.1]


Formulate Distance function (3/3)

     • Deletion

       Before:
       L1={m1:0.2,m2:0.3}
       L2={m1:0.2,m2:0.3,m3:0.3}

       After deleting m3:0.3 from L2:
       L1={m1:0.2,m2:0.3}
       L2={m1:0.2,m2:0.3,____}

       Cost = 0.3

     • Replacement

       Before:
       L1={m1:0.2,m2:0.2,m3:0.2}
       L2={m1:0.2,m2:0.2,m4:0.3}

       After replacing m4:0.3 with m3:0.2 in L2:
       L1={m1:0.2,m2:0.2,m3:0.2}
       L2={m1:0.2,m2:0.2,m3:0.2}

       Cost = 0.3 + 0.2 = 0.5

[Figure: trees T1 and T2 for the deletion and the replacement]
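The three editing operations suggest a weighted edit distance, sketched here with dynamic programming. The costs follow the slide examples: inserting or deleting an element costs its probability, and a replacement costs the sum of both probabilities. The cost of aligning two elements with the same label but different probabilities is not given on the slides, so the absolute difference used below is an assumption, as are the names `edit_distance`, `l1` and `l2`.

```python
def edit_distance(l1, l2):
    """Minimal total cost of turning l2 into l1; both are lists of (label, prob)."""
    m, n = len(l1), len(l2)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + l1[i - 1][1]  # insert every element of l1
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + l2[j - 1][1]  # delete every element of l2
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            (a, pa), (b, pb) = l1[i - 1], l2[j - 1]
            # Same label: assumed cost |pa - pb|. Different label: replacement.
            align = abs(pa - pb) if a == b else pa + pb
            dist[i][j] = min(
                dist[i - 1][j] + pa,         # insertion
                dist[i][j - 1] + pb,         # deletion
                dist[i - 1][j - 1] + align,  # match or replacement
            )
    return dist[m][n]


# The three examples from the slides.
insertion = edit_distance(
    [("m1", 0.3), ("m2", 0.2), ("m3", 0.3)], [("m1", 0.3), ("m2", 0.2)]
)
deletion = edit_distance(
    [("m1", 0.2), ("m2", 0.3)], [("m1", 0.2), ("m2", 0.3), ("m3", 0.3)]
)
replacement = edit_distance(
    [("m1", 0.2), ("m2", 0.2), ("m3", 0.2)],
    [("m1", 0.2), ("m2", 0.2), ("m4", 0.3)],
)
print(insertion, deletion, replacement)
```

Under these cost assumptions the sketch gives 0.3, 0.3 and 0.5 for the insertion, deletion and replacement examples (up to floating-point rounding).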


Identify Community (1/4)

• User community
  – The same community: δMLS(Ti,Tj) < threshold δ
  – The number of communities is minimal
• Transform the relation between PSTs into a graph
  – A vertex represents a user
  – An edge exists between two vertices when δMLS(Ti,Tj) < threshold δ

[Figure: example graph with vertices O1, O2, O3, O4, O5]
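The graph construction can be sketched as below. The pairwise distances are made-up values for the five users in the figure, since the slide does not list them; `build_graph`, `pair_distance` and the threshold 0.3 are likewise hypothetical.

```python
from itertools import combinations


def build_graph(users, distance, threshold):
    """Return an adjacency map with an edge wherever distance(u, v) < threshold."""
    adjacency = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if distance(u, v) < threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


# Hypothetical distances between the users' PSTs; unlisted pairs are far apart.
known = {
    frozenset(("O1", "O2")): 0.10,
    frozenset(("O1", "O3")): 0.20,
    frozenset(("O2", "O3")): 0.15,
    frozenset(("O3", "O4")): 0.25,
    frozenset(("O4", "O5")): 0.10,
}


def pair_distance(u, v):
    return known.get(frozenset((u, v)), 1.0)


users = ["O1", "O2", "O3", "O4", "O5"]
graph = build_graph(users, pair_distance, threshold=0.3)
print(graph)
```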


Identify Community (2/4)

• Model as a minimum clique problem
  – A clique is a set of pair-wise adjacent vertices

[Figure: example on the graph with vertices O1, O2, O3, O4, O5]
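Splitting a graph into as few cliques as possible is computationally hard in general, so the sketch below uses a simple greedy heuristic rather than an exact method. It is not necessarily the algorithm used by the authors, and it does not guarantee the minimum number of communities. The function name `greedy_clique_partition` and the example edges are assumptions.

```python
def greedy_clique_partition(adjacency):
    """Partition vertices into cliques, greedily growing one clique at a time."""
    unassigned = set(adjacency)
    communities = []
    while unassigned:
        # Seed with the vertex that has the most unassigned neighbours.
        seed = max(sorted(unassigned), key=lambda v: len(adjacency[v] & unassigned))
        clique = {seed}
        candidates = adjacency[seed] & unassigned
        while candidates:
            # Add the candidate that keeps the most other candidates available.
            v = max(sorted(candidates), key=lambda c: len(adjacency[c] & candidates))
            clique.add(v)
            # Only vertices adjacent to every clique member remain candidates.
            candidates = candidates & adjacency[v]
        communities.append(clique)
        unassigned -= clique
    return communities


# Hypothetical edges between the five users shown in the figure.
adjacency = {
    "O1": {"O2", "O3"},
    "O2": {"O1", "O3"},
    "O3": {"O1", "O2", "O4"},
    "O4": {"O3", "O5"},
    "O5": {"O4"},
}
communities = greedy_clique_partition(adjacency)
print(communities)
```

On this example the heuristic first grows the triangle O1, O2, O3 and then groups the remaining pair O4, O5.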


Identify Community (3/4)

• Select a representative PST for each community
  – Represent all PSTs in the same community
  – Advantages
    • Reduce the storage overhead
    • Speed up query processing
    • Identify the communities of new users

[Figure: a representative PST per community, and a new user whose community to be added into is unknown]


Identify Community (4/4)

• Two factors
  1. Size of representative PST
     ▪ The number of tree nodes, denoted as N(Ti)
  2. Distance between the selected PST and others in the same community
     ▪ The error sum, denoted as ES
       - Sum of the distances between the selected PST and the others

• Representative PST
  – Minimize
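The selection can be sketched as picking the community member with the lowest combined score of the two factors. How N(Ti) and ES are combined is not preserved in this transcript, so the sketch leaves that to a caller-supplied `score` function; the product used in the example is only a placeholder and is not taken from the slides. All names and numbers below are hypothetical.

```python
def error_sum(candidate, community, distance):
    """ES: sum of the distances between the candidate and the other members."""
    return sum(distance(candidate, other) for other in community if other != candidate)


def select_representative(community, size, distance, score):
    """Return the member minimising score(N(Ti), ES)."""
    return min(
        community,
        key=lambda t: score(size(t), error_sum(t, community, distance)),
    )


# Hypothetical community of three PSTs with made-up sizes and distances.
sizes = {"T1": 5, "T2": 3, "T3": 4}
pairs = {
    frozenset(("T1", "T2")): 0.2,
    frozenset(("T1", "T3")): 0.4,
    frozenset(("T2", "T3")): 0.3,
}

representative = select_representative(
    community=["T1", "T2", "T3"],
    size=lambda t: sizes[t],
    distance=lambda a, b: pairs[frozenset((a, b))],
    score=lambda n, es: n * es,  # placeholder combination, an assumption
)
print(representative)
```

Here T2 is both the smallest tree and the closest to the others, so it is selected under any score that increases with both factors.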


Experiments (1/4)

• Simulator Model
  – Use real trajectories from CarWeb to simulate the group mobility of users
    • Total: 2400 trajectories


Experiments (2/4)

• Compare to Generalized Sequential Pattern mining algorithm (GSP)
  – Set of sequential patterns, e.g. sp1, sp2, ..., spn
  – Trajectory profile of a user represented as a vector
  – Distance function between profiles
    • Cosine similarity measurement: $\mathrm{similarity}(V_i, V_j) = \dfrac{V_i \cdot V_j}{|V_i|\,|V_j|}$
    • Example: $\dfrac{\langle 1,1,0,0 \rangle \cdot \langle 0,1,1,1 \rangle}{|\langle 1,1,0,0 \rangle|\,|\langle 0,1,1,1 \rangle|} = \dfrac{1}{\sqrt{2}\,\sqrt{3}}$
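The cosine similarity used for the GSP baseline can be checked with a few lines of code. The helper `profile_vector`, which marks which sequential patterns a user has, is an assumed reading of how the vectors are built, and the pattern sets are made up so that they yield the two vectors of the slide example.

```python
import math


def profile_vector(all_patterns, user_patterns):
    """Binary vector: 1 where the user has the pattern, 0 otherwise (assumed encoding)."""
    return [1 if sp in user_patterns else 0 for sp in all_patterns]


def cosine_similarity(v_i, v_j):
    """Dot product of the two vectors divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm_i = math.sqrt(sum(a * a for a in v_i))
    norm_j = math.sqrt(sum(b * b for b in v_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # convention chosen here for empty profiles
    return dot / (norm_i * norm_j)


all_patterns = ["sp1", "sp2", "sp3", "sp4"]
v1 = profile_vector(all_patterns, {"sp1", "sp2"})         # <1,1,0,0>
v2 = profile_vector(all_patterns, {"sp2", "sp3", "sp4"})  # <0,1,1,1>
similarity = cosine_similarity(v1, v2)
print(similarity)
```

The two vectors share one pattern and have lengths √2 and √3, so the similarity is 1/(√2·√3), roughly 0.408.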


Experiments (3/4)

• Impact of Trajectory Profiles
  – GSP is always larger than PST, especially when MinSup is smaller than 0.15

[Figure: two panels, Storage and Prediction]


Experiments (4/4)

• Impact of the threshold δ and MinSup
  – A smaller threshold δ yields more communities

[Figure: two panels, Storage and Prediction]


Conclusion

• Explore the problem of mining communities from trajectories
  – Preprocess
    • Find frequent regions
    • Replace trajectories by region IDs
  – Construct User’s Profile
    • Build probabilistic suffix tree (abbreviated as PST)
  – Measure Distance Between Users
    • Formulate distance function
  – Identify Community
    • Cluster users by distance function
    • Select representative PSTs


THANK YOU!