Frequent Itemsets and
Association Rules
MTAT.03.183 Data Mining. 20.02.2014.
Konstantin Tretyakov
Frequent patterns
Humpty Dumpty sat on a wall,
Humpty Dumpty had a great fall;
All the King's horses and all the King's men
Couldn't put Humpty together again
20.02.2014 MTAT.03.183 Data Mining - Frequent Itemsets & Association Rules
Example pattern: [H|D]umpty
Frequent patterns
Front page → Search → Product A → Product B → Product C
Front page → Ad → Product B → Purchase → Product C
Front page → Front page → Feedback
Front page → Search → Product B → Search → Product C → Purchase
Front page → Ad → Product C → Search → Product B → Purchase
Front page → Front page → Front page → Front page
Why frequent patterns
Repetition in data always indicates structure.
This structure may be due to known processes.
Otherwise, we want to know about it and
explain it.
Consequently, searching for frequent
patterns in data is one of the most basic
procedures used in descriptive data analysis.
Quiz
Is a frequent pattern always interesting?
No (e.g. "Front page" is frequent but uninteresting)
Is an interesting pattern always frequent?
No (e.g. "Diapers & Beer" may be a rare but very valuable pattern)
NB:
Finding frequent patterns: primarily an algorithmic problem
Finding "interesting" patterns: primarily a statistical problem
Frequent itemsets
In general, we may be interested in different
types of data and types of patterns:
Natural language / Word sequences
Text / Regular expression patterns
Web log / Event sequences (various models)
Purchases / Item sets (today's topic; however, many notions and ideas apply to all data / pattern types)
…
Definitions
Let the data consist of transactions (i.e. sets of items).
The patterns we are interested in are itemsets.
E.g. {Milk, Eggs} is an itemset.
The support of an itemset is the proportion of
transactions where it occurs
support({Milk, Eggs}) = 0
support({Bread}) = 4/5 = 0.8
support({Milk, Beer}) = 2/5 = 0.4
NB: in some treatments, support may denote the number of transactions ("support count"):
support({Milk, Eggs}) = 0
support({Bread}) = 4
support({Milk, Beer}) = 2
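The definition can be checked with a few lines of Python. The transaction table from the slide is not reproduced in this text, so the five baskets below are an assumption: they are the classic market-basket example, chosen because they give the values quoted above. The helper names `support_count` and `support` are illustrative.

```python
from fractions import Fraction

# Assumed transaction table (the slide's own table is not in the text).
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Proportion of transactions that contain `itemset`."""
    return Fraction(support_count(itemset, transactions), len(transactions))


print(support({"Milk", "Eggs"}, TRANSACTIONS))        # 0
print(support({"Bread"}, TRANSACTIONS))               # 4/5
print(support_count({"Milk", "Beer"}, TRANSACTIONS))  # 2
```

Using `Fraction` keeps the proportions exact, which makes them easy to compare with the values on the slide.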
An itemset is frequent if its support is greater than or equal to some predefined threshold $s_{\min}$.
Let $s_{\min} = 3$ (as a support count). Find all frequent itemsets.
{Bread}, {Milk}, {Diaper}, {Beer}, {Bread, Milk}, {Bread, Diaper}, {Milk, Diaper}, {Diaper, Beer}, {}.
Explanations for itemsets
Suppose we found that {Beer, Milk, Diapers}
is a frequent itemset.
How do we explain it?
Perhaps it is frequent because there is a causal relationship:
{Diapers, Milk} => Beer
Association rules
An association rule is an implication of the form $X \Rightarrow Y$, where $X$ and $Y$ are itemsets.
The support of a rule is just the support of $X \cup Y$:
$$\mathrm{support}(X \Rightarrow Y) = \mathrm{support}(X \cup Y)$$
The confidence of a rule is the proportion of transactions satisfying the right part among the transactions that satisfy the left part:
$$\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)}$$
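As a small worked example of both definitions, the sketch below computes the support and confidence of the rule {Diaper, Milk} => {Beer}. The transaction table is an assumption (the classic five-basket example; the slide's table is not in the text), and the function names are illustrative.

```python
from fractions import Fraction

# Assumed toy data; the slide's own table is not in the text.
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def rule_support(lhs, rhs, transactions):
    """Support of the rule lhs => rhs, i.e. support of the union."""
    return support(set(lhs) | set(rhs), transactions)


def confidence(lhs, rhs, transactions):
    """Confidence of lhs => rhs; assumes lhs occurs at least once."""
    return rule_support(lhs, rhs, transactions) / support(lhs, transactions)


print(rule_support({"Diaper", "Milk"}, {"Beer"}, TRANSACTIONS))  # 2/5
print(confidence({"Diaper", "Milk"}, {"Beer"}, TRANSACTIONS))    # 2/3
```

Two of the three baskets that contain both diapers and milk also contain beer, hence the confidence of 2/3.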
Association rule mining
Given a set of transactions, find all rules $X \Rightarrow Y$ such that
$$\mathrm{support}(X \Rightarrow Y) \ge s_{\min}$$
$$\mathrm{confidence}(X \Rightarrow Y) \ge c_{\min}$$
Solution:
Find a frequent itemset $A$ with support $\ge s_{\min}$.
Then find a partitioning of $A$ into a left and a right part, so that the resulting rule has high confidence.
The algorithmic approach is the same in both steps. We'll start with the first one.
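The second step can be sketched as follows: given one frequent itemset, try every split into a non-empty left and right part and keep the splits whose confidence reaches the threshold. This is a plain enumeration for illustration, not an optimized method; the data and the name `rules_from_itemset` are assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Assumed toy data; the slide's own table is not in the text.
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain `itemset`."""
    hits = sum(1 for t in transactions if set(itemset) <= t)
    return Fraction(hits, len(transactions))


def rules_from_itemset(itemset, transactions, c_min):
    """Yield (lhs, rhs, confidence) for each split with confidence >= c_min.

    Assumes `itemset` is frequent, so every left part has non-zero support.
    """
    itemset = frozenset(itemset)
    whole = support(itemset, transactions)
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            conf = whole / support(lhs, transactions)
            if conf >= c_min:
                yield lhs, itemset - lhs, conf


for lhs, rhs, conf in rules_from_itemset({"Diaper", "Beer"}, TRANSACTIONS, Fraction(3, 4)):
    print(sorted(lhs), "=>", sorted(rhs), conf)
```

Every rule produced this way has the same support as the itemset it was split from, so only the confidence has to be checked.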
Frequent itemset mining
What would be the simplest method for
finding all frequent itemsets?
Frequent itemset mining
Brute force approach:
Enumerate all possible sets of items
For each set compute its support in the database
Output sets with support $\ge s_{\min}$
for itemset in subsets(items):
    if support(itemset, data) >= s_min:
        yield itemset
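A runnable version of this loop might look as follows. The helpers `subsets` and `support` are not defined on the slide, so their implementations here are assumptions, as is the toy data; `support` returns a count so that it can be compared with a threshold of 3.

```python
from itertools import chain, combinations


def subsets(items):
    """All subsets of `items`, including the empty set, as frozensets."""
    items = sorted(items)
    all_sizes = (combinations(items, k) for k in range(len(items) + 1))
    return (frozenset(c) for c in chain.from_iterable(all_sizes))


def support(itemset, data):
    """Support count: number of transactions that contain `itemset`."""
    return sum(1 for t in data if itemset <= t)


def frequent_itemsets(items, data, s_min):
    """Brute force: test every one of the 2^n possible itemsets."""
    for itemset in subsets(items):
        if support(itemset, data) >= s_min:
            yield itemset


# Assumed toy data; the slide's own table is not in the text.
DATA = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
ITEMS = set().union(*DATA)

found = list(frequent_itemsets(ITEMS, DATA, s_min=3))
print(len(found))  # 9, counting the empty itemset
```

With six distinct items the loop already tests 64 itemsets, which is why this approach only works for very small item catalogues.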
Let there be $n$ different items, $m$ transactions, and average transaction size $w$.
What is the complexity of this algorithm?
$2^n$ iterations
Scan $m$ transactions, do $w$ operations at each
$O(2^n m w)$
Faster itemset mining
Apriori: Avoid scanning through all $2^n$ itemsets.
FP-tree: Also store transactions in an
efficient data structure, speeding up matching.
The Apriori idea
Suppose that
support({A, C}) = 42/100
What does it tell you about
support({A, B, C}) ?
It follows that
support({A, B, C}) ≤ 42/100,
since every transaction that contains {A, B, C} also contains {A, C}. For example:
(A, B, C, D, E)
(A, C, D, E)
(B, D, E, F)
(A, B, C, F)
(A, C, F, G)
(A, B, C, D)
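The same observation can be verified on the six example transactions listed above. The sketch counts {A, C} and {A, B, C}, then checks every itemset against each of its one-item-smaller subsets; the helper name `support_count` is illustrative.

```python
from itertools import combinations

# The six example transactions from the slide.
TRANSACTIONS = [
    set("ABCDE"),
    set("ACDE"),
    set("BDEF"),
    set("ABCF"),
    set("ACFG"),
    set("ABCD"),
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


print(support_count("AC", TRANSACTIONS))   # 5
print(support_count("ABC", TRANSACTIONS))  # 3

# Look for an itemset that is more frequent than one of its subsets.
items = sorted(set().union(*TRANSACTIONS))
violations = [
    (sub, sup)
    for k in range(1, len(items) + 1)
    for sup in combinations(items, k)
    for sub in combinations(sup, k - 1)
    if support_count(sub, TRANSACTIONS) < support_count(sup, TRANSACTIONS)
]
print(violations)  # []
```

Checking only the immediate subsets is enough, because the inequality carries over to smaller subsets by transitivity.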
Anti-monotonicity of support
In general,
$$X \subseteq Y \implies \mathrm{support}(X) \ge \mathrm{support}(Y)$$
It follows that:
If an itemset is not frequent, all of its supersets are also not frequent.
and
If an itemset is frequent, all of its subsets are also frequent.
This is known as the Apriori principle.
Basic Apriori algorithm
First generate frequent 1-sets,
next, generate frequent 2-sets from 1-sets,
… then generate frequent 3-sets from 2-sets,
… etc., until there are no frequent $k$-sets.
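The level-wise loop outlined above can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: candidates are formed by joining pairs of frequent sets from the previous level, the function name `apriori` and the toy data are assumptions, and thresholds are support counts.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for all non-empty frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)

    k = 2
    while level:
        # Candidate k-sets: unions of two frequent (k-1)-sets ...
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # ... kept only if every (k-1)-subset is itself frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count the actual support of the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(level)
        k += 1
    return frequent


# Assumed toy data; not taken from the slides.
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
result = apriori(TRANSACTIONS, min_count=3)
print(len(result))  # 8 (four 1-sets and four 2-sets)
```

The loop stops as soon as a level produces no frequent itemsets, since by anti-monotonicity no larger itemset can be frequent either.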
First, generate frequent 1-sets:
simply count the frequency of each item and keep only
the frequent ones.
{A}: 10
{B}: 15
{C}: 3
{D}: 4
{E}: 6
{F}: 10
{G}: 4
Let min support count = 5. The frequent 1-sets are then:
{A}: 10
{B}: 15
{E}: 6
{F}: 10
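In code, this filtering step is a single dictionary comprehension over the item counts shown on the slide (the variable names are illustrative):

```python
# Item counts as given on the slide.
item_counts = {"A": 10, "B": 15, "C": 3, "D": 4, "E": 6, "F": 10, "G": 4}
min_count = 5

# Keep only the items whose count reaches the threshold.
frequent_1 = {item: n for item, n in item_counts.items() if n >= min_count}
print(frequent_1)  # {'A': 10, 'B': 15, 'E': 6, 'F': 10}
```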
Next, generate frequent 2-sets. This is done in
several steps.
1. Generate candidate 2-sets
All subsets of a candidate set must be frequent. For 2-sets it simply means that both elements are from frequent 1-sets.
{A,B}
{A,E}
{A,F}
{B,E}
{B,F}
{E,F}
2. Count actual support for each candidate (requires a full pass over the transaction database for each candidate)
{A,B}: 10
{A,E}: 3
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
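The counting step can be organized in two ways, sketched below: one scan of the database per candidate, as described above, or a single scan in which each transaction updates every candidate it contains. Both yield the same counts. The transactions are borrowed from the six-transaction example earlier in the deck, not the data behind the counts on this slide, and the candidate pairs and function names are assumptions.

```python
from itertools import combinations

# Six example transactions (from the earlier anti-monotonicity slide).
TRANSACTIONS = [
    set("ABCDE"),
    set("ACDE"),
    set("BDEF"),
    set("ABCF"),
    set("ACFG"),
    set("ABCD"),
]
# Hypothetical candidate 2-sets over the items A, B, C, D.
CANDIDATES = [frozenset(p) for p in combinations("ABCD", 2)]


def count_per_candidate(candidates, transactions):
    """One scan of the database for each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def count_single_pass(candidates, transactions):
    """One scan in total; each transaction updates all candidates it contains."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    return counts


per_candidate = count_per_candidate(CANDIDATES, TRANSACTIONS)
single_pass = count_single_pass(CANDIDATES, TRANSACTIONS)
print(per_candidate == single_pass)       # True
print(single_pass[frozenset("AC")])       # 5
```

The number of subset tests is the same in both variants; the difference is how often the transaction database is read, which matters when it does not fit in memory.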
3. … and keep only the ones that are actually frequent
Now we have frequent 1- and 2-sets.
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
In many practical situations
the algorithm stops here.
Why?
1. Because there are so many items that enumerating beyond 2-sets is impractical.
2. Because knowing frequent 2-sets is already useful enough (think of the "Beer/Diapers" example).
But letβs try generating frequent 3-sets. We
proceed as before.
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
57
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
![Page 58: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/58.jpg)
Basic Apriori algorithm
But letβs try generating frequent 3-sets. We
proceed as before.
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
58
1. Generate candidate 3-sets
We augment each 2-set with an additional
element and check that all 2-subsets of
the resulting set is frequent.
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
{A,B,F}
{B,E,F}
This can be optimized somewhat, see, e.g:
http://www.dais.unive.it/~orlando/PAPERS/dawak01.pdf
![Page 59: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/59.jpg)
Basic Apriori algorithm
But letβs try generating frequent 3-sets. We
proceed as before.
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
59
2. Count actual support
(Again, a pass over the whole DB)
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
{A,B,F}: 4
{B,E,F}: 5
![Page 60: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/60.jpg)
Basic Apriori algorithm
But letβs try generating frequent 3-sets. We
proceed as before.
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
60
3. β¦ and throw away the
non-frequent ones
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
{A,B,F}: 4
{B,E,F}: 5
![Page 61: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/61.jpg)
Basic Apriori algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
61
{A}: 10
{B}: 15
{E}: 6
{F}: 10
{A,B}: 10
{A,F}: 5
{B,E}: 6
{B,F}: 10
{E,F}: 5
{B,E,F}: 5
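The candidate-generation step can be sketched in a few lines of Python. This is only an illustration, not the lecture's code: the function name `candidates` is an assumption, and the input is the list of frequent 2-sets from the example.

```python
from itertools import combinations

def candidates(frequent_k):
    """Candidate (k+1)-sets all of whose k-subsets are in frequent_k."""
    frequent = {frozenset(s) for s in frequent_k}
    if not frequent:
        return []
    k = len(next(iter(frequent)))
    items = sorted(set().union(*frequent))
    result = set()
    for s in frequent:
        for item in items:
            if item in s:
                continue
            cand = s | {item}
            # Keep the candidate only if every k-subset is frequent
            if all(frozenset(sub) in frequent for sub in combinations(cand, k)):
                result.add(cand)
    return sorted(tuple(sorted(c)) for c in result)

# Frequent 2-sets from the example
frequent_2 = [("A", "B"), ("A", "F"), ("B", "E"), ("B", "F"), ("E", "F")]
print(candidates(frequent_2))  # [('A', 'B', 'F'), ('B', 'E', 'F')]
```

Sets such as {A,B,E} never become candidates because {A,E} is not frequent, so their support is never counted.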
![Page 63: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/63.jpg)
Basic Apriori algorithm
Quiz: In this particular example, can there be frequent 4-sets?
No, because if, say, {B,E,F,X} is frequent, then {B,E,X} must be frequent too! But {B,E,F} is the only frequent 3-set.
![Page 64: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/64.jpg)
Basic Apriori algorithm
    # Assumes items, data, s_min, support() and output() are defined elsewhere.
    from itertools import combinations

    # Candidate 1-sets
    C = [(i,) for i in items]
    # Frequent 1-sets
    F = [itemset for itemset in C
         if support(itemset, data) >= s_min]
    # Frequent (k+1)-sets from k-sets.
    while len(F) != 0:
        output(F)
        # Generate candidates of size k+1 (items kept in increasing order)
        C = [itemset + (item,)
             for itemset in F
             for item in items
             if item > itemset[-1]]
        # Check that all k-subsets of each candidate are frequent
        C = [itemset for itemset in C
             if all(s in F for s in combinations(itemset, len(itemset)-1))]
        # Count actual support and leave only the frequent ones
        F = [itemset for itemset in C
             if support(itemset, data) >= s_min]
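To run the algorithm end to end, `support` and the data need to be filled in. The sketch below does that under stated assumptions: the `support` implementation, the `apriori` wrapper and the five-transaction toy database are all made up for illustration and do not come from the slides.

```python
from itertools import combinations

def support(itemset, data):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in data if set(itemset) <= t)

def apriori(data, s_min):
    """Return {itemset tuple: support} for all frequent itemsets."""
    items = sorted(set().union(*data))
    level = {(i,): support((i,), data) for i in items}
    level = {s: n for s, n in level.items() if n >= s_min}
    result = {}
    while level:
        result.update(level)
        # Generate candidates of size k+1
        cands = [s + (i,) for s in level for i in items if i > s[-1]]
        # Keep candidates whose k-subsets are all frequent
        cands = [c for c in cands
                 if all(sub in level for sub in combinations(c, len(c) - 1))]
        # Count actual support and keep only the frequent ones
        counted = {c: support(c, data) for c in cands}
        level = {c: n for c, n in counted.items() if n >= s_min}
    return result

# Hypothetical toy database (not from the slides)
data = [{"A", "B"}, {"A", "B", "C"}, {"B", "C"}, {"A", "C"}, {"A", "B", "C"}]
for itemset, n in apriori(data, s_min=3).items():
    print(itemset, n)
```

With this data all three 2-sets are frequent, so {A,B,C} becomes a candidate; its actual support is 2, so it is counted once and then thrown away.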
![Page 65: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/65.jpg)
Many optimizations are possible
• Avoid generating the separate candidate set explicitly (do it on-the-fly while counting).
• Store k-sets in a hash tree data structure (speeds up the counting/generation process).
• Use a part of the whole transaction DB (sample or partition).
• Use Bloom-filter-like data structures to reduce the candidate set.
• etc.
(see, e.g., http://i.stanford.edu/~ullman/mmds/ch6a.pdf)
![Page 66: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/66.jpg)
Compact representation of itemsets
If {A,B,C,D,E} is frequent, then all of the following sets are frequent as well:
{A},{B},{C},{D},{E},
{A,B},{A,C},{A,D},{A,E},{B,C},{B,D},{B,E},{C,D},{C,E},{D,E},
{A,B,C},{A,B,D},{A,B,E},{A,C,D},{A,C,E},{A,D,E},{B,C,D},{B,C,E},{B,D,E},{C,D,E},
{A,B,C,D},{A,B,C,E},{A,B,D,E},{A,C,D,E},{B,C,D,E}
Do we really want our algorithm to report those?
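A quick way to see how fast this list grows: a frequent set of n items has 2^n − 2 non-empty proper subsets, all of them frequent. A small Python check for the five-item example (the variable names are illustrative):

```python
from itertools import combinations

items = "ABCDE"
# All non-empty proper subsets of {A,B,C,D,E}
subsets = [set(c) for k in range(1, len(items))
           for c in combinations(items, k)]
print(len(subsets))  # 30, i.e. 2**5 - 2
```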
![Page 67: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/67.jpg)
Closed itemsets
![Page 68: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/68.jpg)
Maximal frequent itemsets
![Page 70: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/70.jpg)
Quiz
Can an itemset be:
• Closed and not maximal? Yes. E.g., adding any item will reduce support, but adding some items will still give a frequent set.
• Not closed and maximal? No. If a set is not closed, we can add some other item without reducing support, so the larger set is frequent too and the original set is not maximal.
• Closed and maximal? Yes. Adding any item will reduce support and make the set infrequent.
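The quiz answers can be checked on the earlier Apriori example. The sketch below assumes the usual definitions, which match the answers above: a frequent itemset is closed if every proper superset has strictly smaller support, and maximal if no proper superset is frequent. The helper names are illustrative.

```python
# Frequent itemsets and their supports from the earlier Apriori example
supports = {
    frozenset("A"): 10, frozenset("B"): 15, frozenset("E"): 6,
    frozenset("F"): 10, frozenset("AB"): 10, frozenset("AF"): 5,
    frozenset("BE"): 6, frozenset("BF"): 10, frozenset("EF"): 5,
    frozenset("BEF"): 5,
}

def is_closed(s):
    # Checking frequent supersets is enough: a superset with the same
    # support as a frequent set would itself be frequent.
    return all(supports[t] < supports[s] for t in supports if s < t)

def is_maximal(s):
    return not any(s < t for t in supports)

def label(s):
    return "".join(sorted(s))

closed = sorted(label(s) for s in supports if is_closed(s))
maximal = sorted(label(s) for s in supports if is_maximal(s))
print("closed: ", closed)   # ['AB', 'AF', 'B', 'BE', 'BEF', 'BF']
print("maximal:", maximal)  # ['AB', 'AF', 'BEF']
```

Here {B} is closed but not maximal, {B,E,F} is both, and every maximal set also appears in the closed list.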
![Page 71: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/71.jpg)
Closed vs maximal frequent itemsets
![Page 72: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/72.jpg)
Intermediate summary
• Frequent itemsets are interesting because they correspond to structure in the data.
• Association rules are basically frequent itemsets with bells and whistles.
• Apriori-like algorithms find frequent sets more efficiently than brute-force search.
• It is sufficient to find only maximal itemsets (or closed itemsets, if some flexibility in changing the cutoff later is needed).
![Page 73: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/73.jpg)
Apriori is fine, but
When the number of frequent patterns is large, the candidate sets are large and the repeated database scans are slow.
This happens, in particular, when there is a long frequent pattern (then all of its subsets are also frequent).
![Page 74: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/74.jpg)
FP-tree
A frequent-pattern tree is a data structure for storing the transaction database in a way that makes it possible to enumerate frequent itemsets.
![Page 76: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/76.jpg)
Constructing the FP-tree
1. Count the frequency of each item
TID Items
100 F,A,C,D,G,I,M,P
200 A,B,C,F,L,M,O
300 B,F,H,J,O
400 B,C,K,S,P
500 A,F,C,E,L,P,M,N
Item counts: F: 4, A: 3, C: 4, D: 1, G: 1, I: 1, M: 3, S: 1, P: 3, B: 3, L: 2, O: 2, H: 1, J: 1, K: 1, E: 1, N: 1
![Page 77: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/77.jpg)
Constructing the FP-tree
2. Leave only the frequent items
TID Items
100 F,A,C,M,P
200 A,B,C,F,M
300 B,F
400 B,C,P
500 A,F,C,P,M
Item counts: F: 4, A: 3, C: 4, B: 3, M: 3, P: 3
![Page 78: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/78.jpg)
Constructing the FP-tree
3. Sort items by frequency (recommended)
TID Items
100 F,C,A,M,P
200 F,C,A,B,M
300 F,B
400 C,B,P
500 F,C,A,M,P
Item counts: F: 4, C: 4, A: 3, B: 3, M: 3, P: 3
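Steps 1 to 3 can be sketched in Python as follows. The names `db`, `MIN_COUNT` and `rank` are illustrative assumptions. Ties between equally frequent items are broken alphabetically here, so C comes before F; the table above puts F first. Any fixed order works as long as it is used consistently.

```python
from collections import Counter

# Transaction database from the example (TID -> items)
db = {
    100: "FACDGIMP",
    200: "ABCFLMO",
    300: "BFHJO",
    400: "BCKSP",
    500: "AFCELPMN",
}
MIN_COUNT = 3  # assumed support threshold

# 1. Count in how many transactions each item occurs
counts = Counter(item for items in db.values() for item in set(items))

# 2. Keep only the frequent items
frequent = {item for item, n in counts.items() if n >= MIN_COUNT}

# 3. Sort each transaction by decreasing item frequency
#    (ties broken alphabetically, an arbitrary choice)
def rank(item):
    return (-counts[item], item)

prepared = {tid: sorted(frequent & set(items), key=rank)
            for tid, items in db.items()}

for tid, items in prepared.items():
    print(tid, ",".join(items))
```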
![Page 79: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/79.jpg)
Constructing the FP-tree
4. Create a header table and a root node
Header table: F, C, A, B, M, P
Tree: root (no children yet)
![Page 81: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/81.jpg)
Constructing the FP-tree
5. Now start inserting transactions. Each transaction will be a path in the tree:
Tree after inserting transaction 100 (F,C,A,M,P):
root → F → C → A → M → P
The header table (F, C, A, B, M, P) keeps a linked list of the occurrences of each item in the tree.
![Page 83: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/83.jpg)
Constructing the FP-tree
The next transaction (200: F,C,A,B,M) shares a part of its path (F,C,A) with the previously inserted one. We keep counts at each node to track this.
Tree (root-to-leaf paths; shared prefixes are the same nodes):
root → F:2 → C:2 → A:2 → M:1 → P:1
root → F:2 → C:2 → A:2 → B:1 → M:1
We update the header table lists (the new nodes B and M need to be included).
![Page 84: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/84.jpg)
Constructing the FP-tree
Inserting the third transaction (300: F,B)…
Tree (root-to-leaf paths; shared prefixes are the same nodes):
root → F:3 → C:2 → A:2 → M:1 → P:1
root → F:3 → C:2 → A:2 → B:1 → M:1
root → F:3 → B:1
![Page 85: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/85.jpg)
Constructing the FP-tree
… the fourth (400: C,B,P)
Tree (root-to-leaf paths; shared prefixes are the same nodes):
root → F:3 → C:2 → A:2 → M:1 → P:1
root → F:3 → C:2 → A:2 → B:1 → M:1
root → F:3 → B:1
root → C:1 → B:1 → P:1
![Page 87: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/87.jpg)
Constructing the FP-tree
… the fifth (500: F,C,A,M,P) is exactly like the first, so we only update counts along the path.
Final tree (root-to-leaf paths; shared prefixes are the same nodes):
root → F:4 → C:3 → A:3 → M:2 → P:2
root → F:4 → C:3 → A:3 → B:1 → M:1
root → F:4 → B:1
root → C:1 → B:1 → P:1
The resulting data structure retains all the information about the frequent items in the original database, and additionally allows some fast lookups.
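The insertion procedure can be sketched in Python as below. This is a minimal illustration, not a reference implementation: the `Node` class and `build_fp_tree` are assumed names, and a plain Python list per item stands in for the header table's linked list.

```python
class Node:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}  # item -> Node

def build_fp_tree(transactions):
    """Return (root, header); header maps item -> list of its nodes."""
    root = Node(None, None)
    header = {}
    for items in transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                # New branch: create the node and register it in the header
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1  # one more transaction passes through here
            node = child
    return root, header

# Frequency-sorted transactions from step 3
transactions = ["FCAMP", "FCABM", "FB", "CBP", "FCAMP"]
root, header = build_fp_tree(transactions)

def show(node, depth=0):
    """Print the tree, one node per line, indented by depth."""
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

show(root)
```

The `parent` links are what later allow walking from any occurrence of an item up towards the root.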
![Page 89: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/89.jpg)
Quiz
How can we find the total number of transactions stored in this tree?
![Page 91: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/91.jpg)
Quiz
How many transactions contain “M”?
![Page 95: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/95.jpg)
Quiz
Which transactions contain “B”?
![Page 97: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/97.jpg)
Quiz
Make a tree that corresponds to all transactions that contain “B”.
Resulting tree (root-to-leaf paths; shared prefixes are the same nodes):
root → F:2 → C:1 → A:1 → B:1 → M:1
root → F:2 → B:1
root → C:1 → B:1 → P:1
Note the count changes!
![Page 101: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/101.jpg)
Quiz
Make a tree that corresponds to all transactions that contain “B” and “C”.
First find the tree for “B”, then further trim it for “C”:
root → F:1 → C:1 → A:1 → B:1 → M:1
root → C:1 → B:1 → P:1
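The tree-trimming quizzes can be reproduced with a small Python sketch. It is written under assumptions: the tree is a nested dict rather than the header-table structure, the root also carries a count (the total number of stored transactions), and `build`, `filter_tree` and `paths` are illustrative names.

```python
def insert(node, items):
    """node is a dict: {"count": int, "children": {item: node}}."""
    node["count"] += 1
    if items:
        child = node["children"].setdefault(
            items[0], {"count": 0, "children": {}})
        insert(child, items[1:])

def build(transactions):
    root = {"count": 0, "children": {}}
    for t in transactions:
        insert(root, t)
    return root

def filter_tree(node, item):
    """New tree with only the transactions that contain item."""
    children = {}
    for name, child in node["children"].items():
        if name == item:
            children[name] = child  # everything below contains item
        else:
            kept = filter_tree(child, item)
            if kept["count"] > 0:
                children[name] = kept
    total = sum(c["count"] for c in children.values())
    return {"count": total, "children": children}

def paths(node, prefix=()):
    """Root-to-leaf paths as tuples of 'item:count' labels."""
    if not node["children"]:
        return [prefix] if prefix else []
    out = []
    for name, child in node["children"].items():
        out += paths(child, prefix + (f"{name}:{child['count']}",))
    return out

tree = build(["FCAMP", "FCABM", "FB", "CBP", "FCAMP"])
with_b = filter_tree(tree, "B")
with_bc = filter_tree(with_b, "C")
for p in paths(with_bc):
    print(" → ".join(p))
```

In this sketch the count at the root of each filtered tree is the number of transactions that survived the filter, which is the support of the corresponding itemset.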
![Page 103: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/103.jpg)
FP-growth algorithm
Frequent-pattern enumeration based on FP-tree
Core idea:
• Recursively enumerate all subsets.
• For each subset, construct an FP-tree of the transactions that contain this subset.
• Output those subsets for which the corresponding tree contains at least (support count) transactions.
Recursive enumeration of all subsets:
    def subsets(items, current_subset):
        process(current_subset)
        for i in items:
            # Extend only with larger items, so each subset is visited once.
            # The empty starting subset can be extended with any item.
            if not current_subset or i > max(current_subset):
                new_subset = current_subset + [i]
                subsets(items, new_subset)
Start enumeration with
    subsets(items, [])
![Page 104: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/104.jpg)
FP-growth algorithm
Core idea:
• Recursively enumerate all subsets.
    def subsets(items, current_subset, FP_tree):
        process(current_subset)
        for i in items:
            # The empty starting subset can be extended with any item.
            if not current_subset or i > max(current_subset):
                new_subset = current_subset + [i]
                # Keep only the transactions that also contain i
                new_tree = FP_tree.filter(i)
                subsets(items, new_subset, new_tree)
Start enumeration with
    subsets(items, [], full_tree)
Now, while enumerating itemsets, we keep track of the FP-tree with all transactions that contain the current itemset.
FP-growth algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
105
Frequent-pattern enumeration based on FP-tree
F
C
A
B
M
P
root
F:4
C:3
A:3
M:2
P:2
B:1
M:1
B:1
C:1
B:1
P:1
![Page 106: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/106.jpg)
FP-growth algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
106
Frequent-pattern enumeration based on FP-tree
Start from the bottom
of the header table
Extract all transactions
that contain βPβ.
F
C
A
B
M
P
root
F:4
C:3
A:3
M:2
P:2
B:1
M:1
B:1
C:1
B:1
P:1
![Page 107: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/107.jpg)
FP-growth algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
107
Frequent-pattern enumeration based on FP-tree
Start from the bottom
of the header table
Extract all transactions
that contain βPβ and make
a corresponding FP-tree
F
C
A
B
M
P
root
F:4
C:3
A:3
M:2
P:2
B:1
M:1
B:1
C:1
B:1
P:1
![Page 108: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/108.jpg)
FP-growth algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
108
Frequent-pattern enumeration based on FP-tree
F
C
A
B
M
P
root
F:2
C:2
A:2
M:2
P:2
C:1
B:1
P:1
Start from the bottom
of the header table
Extract all transactions
that contain βPβ and make
a corresponding FP-tree
![Page 111: Frequent Itemsets and Association Rules](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/111.jpg)
FP-growth algorithm
Frequent-pattern enumeration based on FP-tree
We can see that P is frequent (for support threshold 3). Now descend recursively to examine all sets like P + {…}, starting with {P, M}:
root → F:2 → C:2 → A:2 → M:2 → P:2
{P, M} occurs in only 2 transactions, so it is not frequent, and there is no need to dig deeper with {P,M, …}. We can return from the recursive call.
![Page 112: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/112.jpg)
FP-growth algorithm
20.02.2014MTAT.03.183 Data Mining - Frequent Itemsets
& Association Rules
112
Frequent-pattern enumeration based on FP-tree
F
C
A
B
M
P
root
F:2
C:2
A:2
M:2
P:2
C:1
B:1
P:1
β¦ back to level {P}
![Page 113: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/113.jpg)
FP-growth algorithm
Frequent-pattern enumeration based on FP-tree
[Figure: FP-tree with header table F, C, A, B, M, P; shown path: root → C:1 → B:1 → P:1]
… and descend to {P, B}
![Page 114: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/114.jpg)
FP-growth algorithm
Frequent-pattern enumeration based on FP-tree
… etc.:
{P,B}: not frequent, backtrack
{P,A}: not frequent, backtrack
{P,C}: frequent, recurse
{P,C,F}: not frequent, backtrack
{P,F}: not frequent, backtrack
Now we are completely done with "P"!
The recursion continues with {M}, then {M,B}, then {M,A} (frequent), then {M,A,C} (frequent), then {M,A,C,F} (frequent), then {M,A,F} (frequent), … etc.
[Figure: full FP-tree with header table F, C, A, B, M, P; paths: root → F:4 → C:3 → A:3 → M:2 → P:2; A:3 → B:1 → M:1; F:4 → B:1; root → C:1 → B:1 → P:1]
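The recursion walked through above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation from the slides: instead of an FP-tree with node links it uses projected (conditional) transaction lists, which follow the same descend-and-backtrack order. The five transactions are an assumption, reconstructed from the node counts in the tree, and the function name `pattern_growth` is hypothetical.

```python
from collections import Counter

def pattern_growth(transactions, min_support, order, suffix=()):
    """Yield (itemset, support) for every frequent itemset that extends `suffix`."""
    counts = Counter(item for t in transactions for item in set(t))
    # Visit items bottom-up in header-table order (P first, F last).
    for pos in range(len(order) - 1, -1, -1):
        item = order[pos]
        support = counts[item]
        if support < min_support:
            continue  # not frequent: no need to dig deeper, backtrack
        found = (item,) + suffix
        yield frozenset(found), support
        # Conditional database: transactions containing `item`,
        # restricted to the items above it in the header order.
        above = set(order[:pos])
        conditional = [[i for i in t if i in above]
                       for t in transactions if item in t]
        yield from pattern_growth(conditional, min_support, order[:pos], found)

# Assumed transactions, reconstructed from the counts in the tree.
transactions = [list("FCAMP"), list("FCAMP"), list("FCABM"),
                list("FB"), list("CBP")]
frequent = dict(pattern_growth(transactions, 3, list("FCABMP")))
for itemset, support in frequent.items():
    print("".join(sorted(itemset)), support)
```

With support threshold 3, the generator visits {P}, then {P,C}, then moves on to {M} and its extensions, matching the order described on the slide. A real FP-growth implementation would build conditional FP-trees rather than copy transaction lists.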
![Page 115: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/115.jpg)
FP-growth algorithm
Frequent-pattern enumeration based on FP-tree
• Several optimizations of the idea just described are possible.
• In a careful implementation you never need to go downwards in the tree (i.e. against the pointers): when processing an item you only care about the part of the tree above it.
• You can have the algorithm output only the maximal sets.
• Special cases, such as a "chain" tree, can be treated separately.
[Figure: full FP-tree with header table F, C, A, B, M, P; paths: root → F:4 → C:3 → A:3 → M:2 → P:2; A:3 → B:1 → M:1; F:4 → B:1; root → C:1 → B:1 → P:1]
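As an illustration of the "chain" special case: when a conditional tree is a single path, every non-empty subset of its nodes, combined with the current suffix, is frequent, and its support is the smallest count among the chosen nodes. The sketch below assumes the infrequent nodes have already been pruned from the chain; `chain_patterns` is a hypothetical helper name, and the example chain F:3 → C:3 → A:3 is what remains above M in the tree for support threshold 3.

```python
from itertools import combinations

def chain_patterns(chain, suffix):
    """Enumerate all patterns of a single-path tree without recursion.

    chain  -- list of (item, count) pairs from the root to the leaf
    suffix -- items already fixed by the enclosing recursion
    """
    results = {}
    for size in range(1, len(chain) + 1):
        for nodes in combinations(chain, size):
            items = frozenset(item for item, _ in nodes) | frozenset(suffix)
            # A subset is supported as often as its least frequent node.
            results[items] = min(count for _, count in nodes)
    return results

# Chain left above M once the infrequent B node is dropped (assumed example).
patterns = chain_patterns([("F", 3), ("C", 3), ("A", 3)], suffix="M")
for itemset, support in patterns.items():
    print("".join(sorted(itemset)), support)
```

This produces the seven extensions of {M} directly, instead of reaching them through nested recursive calls.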
![Page 116: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/116.jpg)
Quiz
Could you use an FP-tree with the "usual" Apriori algorithm?
![Page 117: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/117.jpg)
Intermediate summary
Frequent itemsets are interesting because they correspond to structure in the data.
Association rules are basically frequent itemsets with bells and whistles.
Apriori-like algorithms search for frequent sets more efficiently than brute force.
An FP-tree is a data structure for storing transactions.
FP-growth can be more memory-efficient than Apriori.
Maximal and closed itemsets are your friends.
![Page 118: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/118.jpg)
Back to association rules
Recall association rule mining:
Problem: find association rules $X \to Y$ with
support at least $s_{\min}$ and
confidence at least $c_{\min}$.
Solution:
Find frequent itemsets with support at least $s_{\min}$.
For each itemset, find a split into $X \to Y$ that achieves the required confidence.
![Page 119: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/119.jpg)
Rule generation
Suppose we found that $\{A, B, C, D\}$ is a frequent itemset with the necessary support.
We can make a variety of rules from it:
$\{A\} \to \{B, C, D\}$
$\{A, B\} \to \{C, D\}$
$\{B, C, D\} \to \{A\}$
How can we efficiently find those that satisfy the confidence threshold?
![Page 120: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/120.jpg)
Rule generation
Recall that
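$$\mathrm{confidence}(\{A\} \to \{B, C, D\}) = \frac{\mathrm{support}(\{A, B, C, D\})}{\mathrm{support}(\{A\})}$$
$$\mathrm{confidence}(\{A, B\} \to \{C, D\}) = \frac{\mathrm{support}(\{A, B, C, D\})}{\mathrm{support}(\{A, B\})}$$
$$\mathrm{confidence}(\{A, C, D\} \to \{B\}) = \frac{\mathrm{support}(\{A, B, C, D\})}{\mathrm{support}(\{A, C, D\})}$$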
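To make the formulas concrete, here is a small sketch that evaluates them on made-up data. The five transactions and the helper names `support` and `confidence` are assumptions for illustration; support is taken as a raw count, which gives the same ratio as using fractions of the dataset.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    whole = set(antecedent) | set(consequent)
    return support(whole, transactions) / support(antecedent, transactions)

# Hypothetical toy data, not taken from the slides.
transactions = [
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"A", "C", "D"},
    {"B", "C"},
]

print(confidence({"A"}, {"B", "C", "D"}, transactions))   # 2 / 4
print(confidence({"A", "B"}, {"C", "D"}, transactions))   # 2 / 3
print(confidence({"A", "C", "D"}, {"B"}, transactions))   # 2 / 3
```

All three rules share the numerator support({A, B, C, D}) = 2; only the antecedent's support in the denominator differs.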
![Page 121: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/121.jpg)
Rule generation
Recall that
$$\mathrm{confidence}(\{A\} \to \{B, C, D\}) = \frac{\mathrm{support}(\{A, B, C, D\})}{\mathrm{support}(\{A\})}$$
Consequently, among all rules built on the set $\{A, B, C, D\}$, confidence is inversely proportional to the support of the antecedent (the left part of the rule).
I.e. confidence is monotonic w.r.t. the antecedent (and anti-monotonic w.r.t. the consequent).
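The monotonicity claim follows in one step from the anti-monotonicity of support. A short derivation for a general frequent itemset $I$ (the symbols $I$, $X$ and $X'$ are introduced here only for illustration):

```latex
\[
X' \subseteq X \subseteq I
\;\Longrightarrow\;
\mathrm{support}(X') \ge \mathrm{support}(X)
\;\Longrightarrow\;
\mathrm{confidence}(X' \to I \setminus X')
= \frac{\mathrm{support}(I)}{\mathrm{support}(X')}
\le \frac{\mathrm{support}(I)}{\mathrm{support}(X)}
= \mathrm{confidence}(X \to I \setminus X).
\]
```

So if a rule fails the confidence threshold, every rule obtained from it by moving further items from the antecedent into the consequent fails as well.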
![Page 122: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/122.jpg)
Rule generation
In other words, you can use an Apriori-style level-wise search to generate rules from a frequent itemset that has been found.
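One way this could look, as a minimal sketch rather than a reference implementation: grow the consequent level by level, and extend only those consequents whose rules passed the confidence threshold, since a larger consequent can only lower the confidence. The function name `generate_rules` and the toy transactions are assumptions for illustration.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def generate_rules(itemset, transactions, min_confidence):
    """Return {(antecedent, consequent): confidence} for rules built on `itemset`."""
    itemset = frozenset(itemset)
    whole = support(itemset, transactions)
    rules = {}
    consequents = [frozenset([item]) for item in itemset]  # level 1
    while consequents:
        passed = []
        for consequent in consequents:
            antecedent = itemset - consequent
            if not antecedent:
                continue  # a rule needs a non-empty left part
            conf = whole / support(antecedent, transactions)
            if conf >= min_confidence:
                rules[(antecedent, consequent)] = conf
                passed.append(consequent)
        # Next level: merge passing consequents that differ in one item and keep
        # a candidate only if all of its one-smaller subsets passed.
        size = len(passed[0]) + 1 if passed else 0
        merged = {a | b for a, b in combinations(passed, 2) if len(a | b) == size}
        passed_set = set(passed)
        consequents = [c for c in merged
                       if all(c - {i} in passed_set for i in c)]
    return rules

# Hypothetical toy data, not taken from the slides.
transactions = [
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"A", "C", "D"},
    {"B", "C"},
]
rules = generate_rules("ABCD", transactions, min_confidence=0.7)
for (lhs, rhs), conf in rules.items():
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

On this data the consequent {B} fails at the first level, so no consequent containing B is ever tested, which is the same pruning idea Apriori applies to itemsets.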
![Page 123: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/123.jpg)
Quiz
Can you apply FP-growth to search for the
rule split for a given frequent set?
![Page 124: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/124.jpg)
Questions?
![Page 125: Frequent Itemsets and Association Rules...Association rule mining Given a set of transactions, find all rules β , such that support β β₯π min confidence β β₯ min MTAT.03.183](https://reader036.vdocuments.net/reader036/viewer/2022071218/605100a5baa92b1554099c0e/html5/thumbnails/125.jpg)
Image Credits
Title page image: © Marco Taiana,
http://www.flickr.com/photos/marcotaiana/92122947/
Some diagrams borrowed from the slides by Tan et al.,
http://www-users.cs.umn.edu/~kumar/dmbook/index.php