Expert Systems with Applications 34 (2008) 459–468

Multi-level fuzzy mining with multiple minimum supports

Yeong-Chyi Lee, Tzung-Pei Hong, Tien-Chin Wang

Presenter: Huai-Ping Chu

2008/11/15


Outline

Abstract
Introduction
Review of related mining algorithms
The proposed algorithm
An example
Conclusion


Abstract

In real applications, different items may have different support criteria to judge their importance, taxonomic relationships among items may appear, and data may have quantitative values.

A fuzzy multiple-level mining algorithm for extracting knowledge implicit in quantitative transactions with multiple minimum supports of items is proposed to derive large itemsets and discover cross-level fuzzy association rules under the maximum-itemset minimum-taxonomy support constraint.


Introduction

An association rule is expressed in the form A → B, where A and B are sets of items, such that the presence of A in a transaction implies the presence of B in the same transaction.

Srikant & Agrawal proposed a method for mining association rules from data sets using quantitative and categorical attributes.

Hong et al. proposed a fuzzy mining algorithm for managing quantitative data.


Introduction (cont.)

Liu et al. proposed an approach for mining association rules with non-uniform minimum support values. It allows users to specify different minimum supports for different items and uses the lowest minimum support among the items in an itemset as the minimum support of that itemset.

Lee, Hong & Lin proposed a simple and efficient algorithm based on the Apriori approach to generate large itemsets under the maximum constraint of multiple minimum supports.


Introduction (cont.)

Han et al. and Agrawal et al. each proposed algorithms to discover association rules over multiple-level taxonomic relationships among items.

This paper thus proposes a fuzzy multiple-level mining algorithm with multiple supports of items for extracting implicit knowledge from transactions stored as quantitative values, which integrates fuzzy-set concepts, data-mining technologies and multiple-level taxonomy to find fuzzy association rules.


Review of related mining algorithms

Mining multiple-level association rules.

Mining association rules with multiple minimum supports.


1. Mining multiple-level association rules

Relevant item taxonomies are usually predefined in real-world applications and can be represented as hierarchy trees.

Terminal nodes on the trees represent actual items appearing in transactions; internal nodes represent classes or concepts formed from lower-level nodes.


The method of Han & Fu: Nodes in predefined taxonomies are first encoded using sequences of numbers and the symbol “*” according to their positions in the hierarchy tree.

[Figure: example hierarchy tree with encoded nodes (1**), (11*), (111), (112), (12*), (2**), (21*), (211), (212), (22*)]
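
The encoding can be sketched in Python as follows. This is only an illustration, not Han & Fu's implementation: the nested-dict layout, the item names, and the function name `encode_taxonomy` are assumptions made for the example, and it assumes at most nine children per node so that each level fits in one digit.

```python
def encode_taxonomy(tree, depth, prefix=""):
    """Map each node name to a code such as '1**', '11*' or '111'.

    tree  : nested dict {name: subtree}; a leaf has an empty dict (assumed layout)
    depth : total number of levels, which fixes the code length
    """
    codes = {}
    for position, (name, subtree) in enumerate(tree.items(), start=1):
        code = prefix + str(position)            # one digit per level
        codes[name] = code.ljust(depth, "*")     # pad unused levels with '*'
        codes.update(encode_taxonomy(subtree, depth, code))
    return codes


# Hypothetical three-level taxonomy (names are illustrative only).
taxonomy = {
    "drink": {"milk": {"brand_a milk": {}, "brand_b milk": {}}, "juice": {}},
    "food": {"bread": {"white bread": {}, "wheat bread": {}}, "cookies": {}},
}
codes = encode_taxonomy(taxonomy, depth=3)
print(codes)
```

With this hypothetical tree the codes have the same shape as the nodes in the figure: one digit per level from the root, and “*” for the levels below the node.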


A top-down progressively deepening search approach is used and exploration of “level-crossing” association relationships is allowed.

Candidate itemsets at a given level may thus contain items from other levels.

EX: Large items at level 2 may be paired with large items at level 1 to form candidate 2-itemsets at level 2 (such as {11*,2**}).


2. Mining association rules with multiple minimum supports

Liu et al. proposed an approach for mining association rules with non-uniform minimum support values, allowing users to specify different minimum supports for different items.

The minimum support value of an itemset is defined as the lowest of the minimum supports of the items in the itemset.


The minimum support of an item means that the occurrence frequency of the item must be larger than or equal to it for the item to be considered in the next mining steps. If the support of an item is below its support threshold, the item is not worth considering.

When the minimum support value of an itemset is defined as the lowest of the minimum supports of the items in it, the itemset may be large even though some items included in it are small.


EX: The minimum support of item A is 20%. The minimum support of item B is 40%. If the support of item B is 30%, which is smaller than its minimum support of 40%, then the 2-itemset {A,B} should not be worth considering.

It is meaningful to assign the minimum support of an itemset as the maximum of the minimum supports of the items contained in the itemset.
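
The difference between the two definitions can be checked on the example above with a small Python sketch. The function name `itemset_min_support` and the `constraint` argument are assumptions made for illustration; the percentages are the ones given in the example.

```python
def itemset_min_support(itemset, min_sup, constraint="max"):
    """Minimum support of an itemset under the 'min' or 'max' constraint."""
    thresholds = [min_sup[item] for item in itemset]
    return max(thresholds) if constraint == "max" else min(thresholds)


min_sup = {"A": 0.20, "B": 0.40}   # per-item minimum supports from the example
support_b = 0.30                   # observed support of item B from the example

# B alone is small: its support is below its own minimum support.
b_is_large = support_b >= min_sup["B"]

# The support of {A, B} can never exceed the support of B.
low = itemset_min_support({"A", "B"}, min_sup, "min")    # lowest of the two
high = itemset_min_support({"A", "B"}, min_sup, "max")   # highest of the two
could_pass_min = support_b >= low    # {A, B} might still be large
could_pass_max = support_b >= high   # {A, B} is ruled out
print(b_is_large, could_pass_min, could_pass_max)
```

Under the maximum constraint an itemset can only be large if every item in it is large, which is what lets an Apriori-style level-wise search prune candidates.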


The proposed algorithm

The mining algorithm for fuzzy multiple-level association rules under the maximum-itemset minimum-taxonomy support constraint of multiple minimum supports:

INPUT: A set of quantitative transaction data, a taxonomy with the primitive items assigned their own minimum supports, a set of membership functions, and a minimum confidence value.

OUTPUT: A set of fuzzy multiple-level association rules under maximum constraints of multiple minimum supports.


Step 1: Encode the taxonomy using a sequence of numbers and the symbol “*”.

Step 2: Translate the item names in the transaction data according to the encoding schema.

Step 3: Group the items with the same first k digits in each transaction Di, and add up the amounts of the items in the same group in Di.
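
Step 3 can be sketched in Python as below. The transaction layout (a dict from item code to quantity), the sample quantities, and the function name `group_by_prefix` are assumptions for illustration; the grouping key is the first k digits of each code, padded with “*”.

```python
from collections import defaultdict


def group_by_prefix(transaction, k, code_length=3):
    """Sum the quantities of items that share the same first k digits."""
    grouped = defaultdict(int)
    for code, amount in transaction.items():
        group = code[:k].ljust(code_length, "*")   # e.g. '111' -> '11*' for k = 2
        grouped[group] += amount
    return dict(grouped)


# Hypothetical encoded transaction: item code -> purchased quantity.
d1 = {"111": 3, "112": 2, "121": 4, "211": 1}
level1 = group_by_prefix(d1, k=1)
level2 = group_by_prefix(d1, k=2)
print(level1, level2)
```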


Step 4: Calculate the occurrence count of each group over all the transactions. Remove the groups whose counts are less than their respective support thresholds.

Step 5: Transform the quantitative value of each remaining group in each transaction into a fuzzy set $f^k_{ij}$ represented as $(f^k_{ij1}/R^k_{j1} + f^k_{ij2}/R^k_{j2} + \dots + f^k_{ijh}/R^k_{jh})$, where k is the level number and h is the number of fuzzy regions for $I^k_j$.
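
A minimal Python sketch of this transformation follows. The three regions (Low, Middle, High) and their breakpoints at 1, 6 and 11 are assumed purely for illustration; the actual membership functions are an input to the algorithm and are not given on this slide.

```python
def fuzzify(value, low_peak=1.0, middle_peak=6.0, high_peak=11.0):
    """Return the membership degrees of `value` in Low, Middle and High.

    The peaks define assumed triangular/shoulder-shaped functions whose
    degrees always sum to 1; they are illustrative only.
    """
    def rising(x, a, b):
        # Ramps linearly from 0 at a to 1 at b, clipped to [0, 1].
        return min(1.0, max(0.0, (x - a) / (b - a)))

    low = 1.0 - rising(value, low_peak, middle_peak)
    high = rising(value, middle_peak, high_peak)
    middle = 1.0 - low - high
    return {"Low": low, "Middle": middle, "High": high}


# A quantity of 3.5 for some group lies halfway between the Low and Middle peaks.
fuzzy_set = fuzzify(3.5)
print(fuzzy_set)
```

In the slide's notation, the result for a quantity of 3.5 would be written as (0.5/Low + 0.5/Middle + 0.0/High).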


Step 6: Collect the fuzzy regions (linguistic terms) with membership values > 0 to form the candidate set $C^k_1$.

Step 7: Check whether the value $count^k_{jl}$ of each region $R^k_{jl}$ in $C^k_1$ is greater than or equal to the threshold, which is the minimum of the minimum supports of the primitive items descending from it. If $R^k_{jl}$ satisfies the threshold, put it into the set of large 1-itemsets ($L^k_1$) for level k.
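
The threshold test in Step 7 can be sketched as follows. The function names, the minimum supports, and the region counts are hypothetical; the only rule taken from the slide is that a higher-level item's threshold is the minimum of the minimum supports of the primitive items beneath it.

```python
def taxonomy_threshold(code, primitive_min_sup):
    """Minimum of the minimum supports of the primitive items under `code`."""
    prefix = code.rstrip("*")
    return min(s for item, s in primitive_min_sup.items() if item.startswith(prefix))


def large_1_itemsets(region_counts, primitive_min_sup):
    """Keep the (item code, region) pairs whose count meets the item's threshold."""
    return {
        (code, region): count
        for (code, region), count in region_counts.items()
        if count >= taxonomy_threshold(code, primitive_min_sup)
    }


# Hypothetical minimum supports (as counts) of primitive items, keyed by code.
primitive_min_sup = {"111": 2.0, "112": 1.5, "121": 3.0, "211": 2.5}
# Hypothetical scalar cardinalities of fuzzy regions at level 2.
region_counts = {("11*", "Low"): 1.2, ("11*", "Middle"): 2.4, ("12*", "Middle"): 2.8}
large = large_1_itemsets(region_counts, primitive_min_sup)
print(large)
```

Here the threshold of 11* is 1.5 (the smaller of 2.0 and 1.5), so only the region 11*.Middle is kept; 12*.Middle falls short of its threshold of 3.0.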


Step 8: Generate the candidate set $C^k_2$ from $L^1_1, L^2_1, \dots, L^k_1$ to find “level-crossing” large itemsets, subject to the following conditions:

Each 2-itemset in $C^k_2$ must contain at least one item in $L^k_1$.

The two regions in a 2-itemset may not have the same item name.

The two item names in a 2-itemset may not be in a hierarchy relation in the taxonomy.

The support values of both large 1-itemsets comprising a candidate 2-itemset must be greater than or equal to the maximum of the minimum supports of the two large 1-itemsets.
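
The four conditions of Step 8 can be sketched in Python as below. This is an illustration under assumptions, not the authors' code: the data layout (`large_by_level` mapping a level to its large 1-itemsets with counts), the thresholds in `min_sup`, and all sample values are hypothetical.

```python
from itertools import combinations


def is_ancestor(a, b):
    """True if code `a` is an ancestor of code `b` (e.g. '1**' of '11*')."""
    prefix = a.rstrip("*")
    return len(prefix) < len(b.rstrip("*")) and b.startswith(prefix)


def candidate_2_itemsets(large_by_level, k, min_sup):
    """large_by_level[j] maps (code, region) -> count for the large 1-itemsets of level j."""
    pool = {}
    for level in range(1, k + 1):
        pool.update(large_by_level[level])
    current = set(large_by_level[k])
    candidates = []
    for x, y in combinations(sorted(pool), 2):
        if x not in current and y not in current:
            continue                      # condition 1: needs an item from level k
        if x[0] == y[0]:
            continue                      # condition 2: same item name
        if is_ancestor(x[0], y[0]) or is_ancestor(y[0], x[0]):
            continue                      # condition 3: hierarchy relation
        threshold = max(min_sup[x[0]], min_sup[y[0]])
        if pool[x] >= threshold and pool[y] >= threshold:
            candidates.append((x, y))     # condition 4: both meet the maximum
    return candidates


# Hypothetical large 1-itemsets with counts, and hypothetical item thresholds.
large_by_level = {
    1: {("1**", "Middle"): 3.0, ("2**", "Middle"): 2.6},
    2: {("11*", "Low"): 2.2, ("21*", "Middle"): 1.4},
}
min_sup = {"1**": 1.5, "11*": 1.5, "2**": 1.2, "21*": 1.2}
c2 = candidate_2_itemsets(large_by_level, k=2, min_sup=min_sup)
print(c2)
```

With these values the only surviving candidate is the level-crossing pair (11*.Low, 2**.Middle): 21*.Middle is large on its own but its count of 1.4 is below the maximum threshold of 1.5 whenever it is paired with 1** or 11*.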


Step 9: Do the following substeps for each newly formed candidate 2-itemset s with regions ($s_1$, $s_2$) in $C^k_2$:

Calculate the fuzzy value of s in each transaction $D_i$ as $f_{is} = f_{is_1} \wedge f_{is_2}$.

Calculate the scalar cardinality of s over all the transaction data as $count_s = \sum_{i=1}^{n} f_{is}$, where n is the number of transactions.

If $count_s$ is greater than or equal to the maximum of the minimum supports of the items contained in it, put s into $L^k_2$.
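
Step 9 can be sketched in Python as follows, assuming the minimum operator is used for the fuzzy intersection. The membership values, the two thresholds, and the function name `fuzzy_count` are hypothetical.

```python
def fuzzy_count(transactions, regions):
    """Scalar cardinality: sum over transactions of the minimum membership value."""
    return sum(min(t.get(r, 0.0) for r in regions) for t in transactions)


# Hypothetical membership values: one dict per transaction, keyed by (code, region).
transactions = [
    {("11*", "Low"): 0.8, ("2**", "Middle"): 0.6},
    {("11*", "Low"): 0.4, ("2**", "Middle"): 1.0},
    {("11*", "Low"): 1.0},
]
s = (("11*", "Low"), ("2**", "Middle"))
count_s = fuzzy_count(transactions, s)   # min per transaction: 0.6, 0.4, 0.0
threshold = max(1.5, 1.2)                # assumed minimum supports of the two items
is_large = count_s >= threshold
print(count_s, is_large)
```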


Step 10: Repeat the above steps in a similar way to generate all large q-itemsets.

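Step 11: Construct the fuzzy association rules for each large q-itemset s with regions ($s_1$, $s_2$, …, $s_q$) by the following substeps:

Form all possible association rules as follows: $s_1 \wedge \dots \wedge s_{r-1} \wedge s_{r+1} \wedge \dots \wedge s_q \rightarrow s_r$, for r = 1 to q.

Calculate the confidence values of all association rules by

$$\frac{\sum_{i=1}^{n} f_{is}}{\sum_{i=1}^{n} \left( f_{is_1} \wedge \dots \wedge f_{is_{r-1}} \wedge f_{is_{r+1}} \wedge \dots \wedge f_{is_q} \right)}$$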
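
The confidence calculation can be sketched in Python as below, again assuming the minimum operator for the fuzzy intersection. The membership values and the function names are hypothetical.

```python
def fuzzy_count(transactions, regions):
    """Sum over transactions of the minimum membership value among `regions`."""
    return sum(min(t.get(r, 0.0) for r in regions) for t in transactions)


def confidence(transactions, antecedent, consequent):
    """Fuzzy count of the whole itemset divided by that of the antecedent."""
    whole = fuzzy_count(transactions, tuple(antecedent) + (consequent,))
    base = fuzzy_count(transactions, tuple(antecedent))
    return whole / base if base > 0 else 0.0


# Hypothetical membership values: one dict per transaction, keyed by (code, region).
transactions = [
    {("11*", "Low"): 0.8, ("2**", "Middle"): 0.6},
    {("11*", "Low"): 0.4, ("2**", "Middle"): 1.0},
    {("11*", "Low"): 1.0},
]
a, b = ("11*", "Low"), ("2**", "Middle")
conf_ab = confidence(transactions, [a], b)   # rule a -> b: 1.0 / 2.2
conf_ba = confidence(transactions, [b], a)   # rule b -> a: 1.0 / 1.6
print(round(conf_ab, 3), round(conf_ba, 3))
```

The two directions of the same itemset share a numerator but have different denominators, which is why a pair of rules over the same regions can end up with different confidence values.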


Step 12: Output the rules whose confidence values are greater than or equal to the predefined minimum confidence value.


An example


All possible association rules are formed as follows:

If 2** = Middle, then 3** = Middle; If 3** = Middle, then 2** = Middle;

If 21* = Middle, then 22* = Low; If 22* = Low, then 21* = Middle;

If 22* = Low, then 32* = Middle; If 32* = Middle, then 22* = Low.


The confidence values of the above association rules are calculated as follows:

If 2** = Middle, then 3** = Middle, with conf = 0.74. If 3** = Middle, then 2** = Middle, with conf = 0.69.

If 21* = Middle, then 22* = Low, with conf = 0.82. If 22* = Low, then 21* = Middle, with conf = 0.46.

If 22* = Low, then 32* = Middle, with conf = 0.97. If 32* = Middle, then 22* = Low, with conf = 1.0.


Assume the minimum confidence is set at 0.8 in this example.

The following three association rules are generated.

If 21* = Middle, then 22* = Low, with conf = 0.82.

If 22* = Low, then 32* = Middle, with conf = 0.97.

If 32* = Middle, then 22* = Low, with conf = 1.0.
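
The final filtering step can be reproduced with a few lines of Python. The tuple layout is an assumption for illustration; the confidence values are the ones listed in the example, with the fourth rule's value taken to be 0.46.

```python
# (antecedent, consequent, confidence) for the six rules of the example.
rules = [
    ("2** = Middle", "3** = Middle", 0.74),
    ("3** = Middle", "2** = Middle", 0.69),
    ("21* = Middle", "22* = Low", 0.82),
    ("22* = Low", "21* = Middle", 0.46),   # assumed reading of the slide's value
    ("22* = Low", "32* = Middle", 0.97),
    ("32* = Middle", "22* = Low", 1.0),
]
min_conf = 0.8

kept = [rule for rule in rules if rule[2] >= min_conf]
for antecedent, consequent, conf in kept:
    print(f"If {antecedent}, then {consequent}, with conf = {conf}")
```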


Conclusion

This algorithm offers a solution to three issues that commonly arise in real mining applications: using different criteria to judge the importance of different items, managing taxonomic relationships among items, and dealing with quantitative data sets.

In this algorithm, the minimum support for an item at a higher taxonomic concept is set as the minimum of the minimum supports of the items belonging to it, and the minimum support for an itemset is set as the maximum of the minimum supports of the items contained in the itemset.


THANK YOU !!