Frequent Pattern Mining from Time-Fading Streams of Uncertain Data

Carson Kai-Sang Leung and Fan Jiang

DaWaK 2011





Outline

  Motivation
  Background
  Method
    A Naive Algorithm: TUF-Streaming(Naive)
    A Space-Saving Algorithm: TUF-Streaming(Space)
    A Time-Saving Algorithm: TUF-Streaming(Time)
  Experimental Result
  Conclusion


Motivation

In the past few years, several mining algorithms have been proposed to discover frequent patterns from uncertain data. However, most of them mine frequent patterns from static databases of uncertain data, not from dynamic streams.


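Background

Mining from a Static Database of Uncertain Data

  x: item
  X: itemset
  DB: transaction database
  t_i: transaction

The expected support of X in DB can be computed by summing (over all transactions t_1, ..., t_|DB|) the product of the existential probabilities of the items within X:

  expSup(X) = \sum_{i=1}^{|DB|} \prod_{x \in X} P(x, t_i)

where P(x, t_i) denotes the existential probability of item x in transaction t_i.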
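The sum-of-products definition above can be illustrated with a minimal Python sketch. The function name, the dictionary representation of a transaction (item mapped to its existential probability), and the toy database are all assumptions made for illustration; they do not come from the slides.

```python
from math import prod


def expected_support(itemset, transactions):
    """Sum, over all transactions, of the product of the existential
    probabilities of the items in `itemset`."""
    total = 0.0
    for t in transactions:
        # An item missing from a transaction is treated as having
        # existential probability 0 (an assumption of this sketch).
        total += prod(t.get(x, 0.0) for x in itemset)
    return total


# Hypothetical toy database: each transaction maps item -> probability.
db = [
    {"a": 0.5, "b": 0.5},
    {"a": 0.75, "b": 1.0, "c": 0.5},
]

print(expected_support({"a", "b"}, db))  # 0.5*0.5 + 0.75*1.0
print(expected_support({"c"}, db))       # 0.0 + 0.5
```

A dictionary per transaction keeps the lookup of each existential probability simple; a real stream miner would typically use a tree structure instead.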


Background

Mining from Uncertain Data Streams with a Sliding Window

  B_i: batch
  X: itemset
  T: time
  DB: transaction database

The expected support of X in the current sliding window can be computed by summing the expected support of X in each batch of uncertain data that the window contains.
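As a rough sketch of the sliding-window computation, the window's expected support can be obtained by adding up per-batch expected supports and letting the oldest batch drop out when a new one arrives. The window size `w`, the class name, and the toy batches below are assumptions for illustration, not notation taken from the slides.

```python
from collections import deque
from math import prod


def batch_expected_support(itemset, batch):
    """Expected support of `itemset` within a single batch of transactions."""
    return sum(prod(t.get(x, 0.0) for x in itemset) for t in batch)


class SlidingWindow:
    """Keeps only the w most recent batches (w is an assumed parameter)."""

    def __init__(self, w):
        # deque(maxlen=w) discards the oldest batch automatically.
        self.batches = deque(maxlen=w)

    def add_batch(self, batch):
        self.batches.append(batch)

    def expected_support(self, itemset):
        # Per-batch expected supports are simply added together.
        return sum(batch_expected_support(itemset, b) for b in self.batches)


# Hypothetical stream of three batches with a window of two batches.
window = SlidingWindow(w=2)
window.add_batch([{"a": 1.0}])
window.add_batch([{"a": 0.5}])
window.add_batch([{"a": 0.25}, {"a": 0.5}])

# Only the last two batches remain in the window.
print(window.expected_support({"a"}))
```

Recomputing the sum on every query is the simplest choice; caching per-batch supports would avoid repeated work at the cost of extra memory.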


A Naive Algorithm: TUF-Streaming(Naive)

minsup = 1.0, preMinsup = 0.8


(Cont.)


(Cont.)

minsup = 1.0, preMinsup = 0.8


A Space-Saving Algorithm: TUF-Streaming(Space)

minsup = 1.0, preMinsup = 0.8


(Cont.)


A Time-Saving Algorithm: TUF-Streaming(Time)

Frequent patterns found: {a}=1.7, {b}=1.8, {b,c}=1.44, {b,c,d}=0.86, {b,d}=1.08, {c}=1.6, {c,d}=0.96, {d}=1.3

[Figure labels: last batch; last batch's expected support]
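To make the role of the two thresholds on the neighbouring slides (minsup = 1.0, preMinsup = 0.8) concrete, the sketch below splits the expected supports listed on this slide into two groups. It assumes that preMinsup is a looser threshold used to retain patterns that are not yet frequent with respect to minsup; the slides give only the values, so this reading and the function name are assumptions.

```python
# Expected supports as listed on the slide.
pattern_supports = {
    ("a",): 1.7,
    ("b",): 1.8,
    ("b", "c"): 1.44,
    ("b", "c", "d"): 0.86,
    ("b", "d"): 1.08,
    ("c",): 1.6,
    ("c", "d"): 0.96,
    ("d",): 1.3,
}

MINSUP = 1.0      # threshold value shown on the slides
PRE_MINSUP = 0.8  # looser threshold value shown on the slides


def split_by_threshold(supports, minsup, pre_minsup):
    """Separate patterns meeting minsup from those that only meet pre_minsup."""
    frequent = {p: s for p, s in supports.items() if s >= minsup}
    borderline = {p: s for p, s in supports.items() if pre_minsup <= s < minsup}
    return frequent, borderline


frequent, borderline = split_by_threshold(pattern_supports, MINSUP, PRE_MINSUP)
print(sorted(frequent))    # patterns with expected support >= minsup
print(sorted(borderline))  # patterns kept only because of preMinsup
```

Every pattern on the slide has an expected support of at least 0.8, which is consistent with preMinsup being the threshold applied when the batch is mined.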


(Cont.)

Frequent patterns found: {a}, {a,d}, {b}, {b,c}, {c}, {d}

minsup = 1.0, preMinsup = 0.8


(Cont.)

Frequent patterns found: {a}, {b}, {b,d}, {c}, {d}

minsup = 1.0, preMinsup = 0.8


Experimental Result


(Cont.)


Conclusion

In this paper, we proposed tree-based mining algorithms for mining frequent patterns from dynamic streams of uncertain data under both the time-fading and landmark models.