![Page 1: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/1.jpg)
Lecture 4: Association
Market Basket Analysis
Analysis of Customer Behavior and Service Modeling
![Page 2: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/2.jpg)
What Is Association Mining?
Association rule mining:
– Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
Applications:
– Market basket analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
Examples:
– Rule form: “Body ⇒ Head [support, confidence]”
• buys(x, “diapers”) ⇒ buys(x, “beers”) [0.5%, 60%]
• major(x, “CS”) ^ takes(x, “DB”) ⇒ grade(x, “A”) [1%, 75%]
![Page 3: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/3.jpg)
Support and Confidence
Support
– Percentage of samples that contain both A and B
– support(A ⇒ B) = P(A ∩ B)
Confidence
– Percentage of samples containing A that also contain B
– confidence(A ⇒ B) = P(B|A)
Example
– computer ⇒ financial_management_software [support = 2%, confidence = 60%]
![Page 4: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/4.jpg)
Association Rules: Basic Concepts
Given: (1) a database of transactions, (2) each transaction is a list of items (purchased by a customer in a visit)
Find: all rules that correlate the presence of one set of items with that of another set of items
– e.g., 98% of people who purchase tires and auto accessories also get automotive services done
Applications
– Home Electronics: What other products should the store stock up on?
– Retailing: Shelf design, promotion structuring, direct marketing
![Page 5: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/5.jpg)
Rule Measures: Support and Confidence
Find all the rules A ⇒ C with minimum confidence and support
– Support (s): probability that a transaction contains {A & C}
– Confidence (c): conditional probability that a transaction having {A} also contains {C}

| Transaction ID | Items Bought |
|---|---|
| 2000 | A, B, C |
| 1000 | A, C |
| 4000 | A, D |
| 5000 | B, E, F |

Let minimum support be 50% and minimum confidence be 50%. Then we have
– A ⇒ C (50%, 66.6%)
– C ⇒ A (50%, 100%)

[Figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both]
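The two rule measures can be checked directly against the four-transaction table. The sketch below is a minimal illustration, not part of the original slides; the helper names `support` and `confidence` are chosen here for convenience, and exact fractions are used to avoid rounding surprises.

```python
from fractions import Fraction

# Transaction table from the slide: transaction ID -> items bought.
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}
baskets = list(transactions.values())


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in baskets if itemset <= items)
    return Fraction(hits, len(baskets))


def confidence(lhs, rhs, baskets):
    """Conditional probability that a basket containing `lhs` also contains `rhs`.

    Raises ZeroDivisionError if no basket contains `lhs`.
    """
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)


for lhs, rhs in [("A", "C"), ("C", "A")]:
    s = support({lhs, rhs}, baskets)
    c = confidence({lhs}, {rhs}, baskets)
    print(f"{lhs} => {rhs}: support = {float(s):.1%}, confidence = {float(c):.1%}")
```

Both rules share the same support because support depends only on the combined itemset {A, C}; the confidences differ because the denominators, support({A}) and support({C}), differ.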
![Page 6: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/6.jpg)
Mining Association Rules: An Example
Target: min. support 50%, min. confidence 50%

| Transaction ID | Items Bought |
|---|---|
| 2000 | A, B, C |
| 1000 | A, C |
| 4000 | A, D |
| 5000 | B, E, F |

| Frequent Itemset | Support |
|---|---|
| {A} | 75% |
| {B} | 50% |
| {C} | 50% |
| {A, C} | 50% |

For rule A ⇒ C:
– support = support({A, C}) = 50%
– confidence = support({A, C}) / support({A}) = 66.6%
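As a rough sketch of how the frequent itemsets and rules in this example could be derived, the code below enumerates every candidate itemset by brute force. This is not the Apriori algorithm or any particular tool's method, just an exhaustive search that is workable for a toy table; the names `MIN_SUPPORT`, `MIN_CONFIDENCE`, `frequent`, and `rules` are illustrative.

```python
from itertools import combinations

baskets = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.5


def support(itemset):
    """Fraction of baskets containing every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


# Step 1: enumerate all candidate itemsets and keep the frequent ones.
# (Exhaustive enumeration is exponential in the number of items.)
items = sorted(set().union(*baskets))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        s = support(frozenset(combo))
        if s >= MIN_SUPPORT:
            frequent[frozenset(combo)] = s

# Step 2: split each frequent itemset into LHS => RHS and keep confident rules.
rules = []
for itemset, s in frequent.items():
    for k in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), k):
            lhs = frozenset(lhs)
            conf = s / frequent[lhs]  # any subset of a frequent itemset is frequent
            if conf >= MIN_CONFIDENCE:
                rules.append((sorted(lhs), sorted(itemset - lhs), s, conf))

for lhs, rhs, s, conf in rules:
    print(f"{lhs} => {rhs}: support = {s:.1%}, confidence = {conf:.1%}")
```

With the thresholds above, the search keeps {A}, {B}, {C}, and {A, C}, and yields the two rules A ⇒ C and C ⇒ A.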
![Page 7: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/7.jpg)
An Example of Market Basket (1)
There are 8 transactions involving three items: A (Apple), B (Banana), and C (Carrot).
Check the associations for the two cases below.
(1) A ⇒ B (2) (A, B) ⇒ C

| # | Basket |
|---|---|
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A, B |
| 5 | A, C |
| 6 | B, C |
| 7 | A, B, C |
| 8 | A, B, C |
![Page 8: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/8.jpg)
An Example of Market Basket (2)
The basic probabilities are shown below:

| Measure | (1) A ⇒ B | (2) (A, B) ⇒ C |
|---|---|---|
| LHS | P(A) = 5/8 = 0.625 | P(A,B) = 3/8 = 0.375 |
| RHS | P(B) = 5/8 = 0.625 | P(C) = 5/8 = 0.625 |
| Coverage | LHS = 0.625 | LHS = 0.375 |
| Support | P(A∩B) = 3/8 = 0.375 | P((A,B)∩C) = 2/8 = 0.25 |
| Confidence | P(B\|A) = 0.375/0.625 = 0.6 | P(C\|(A,B)) = 0.25/0.375 = 0.667 |
| Lift | 0.375/(0.625*0.625) = 0.96 | 0.25/(0.375*0.625) = 1.07 |
| Leverage | 0.375 - 0.391 = -0.016 | 0.25 - 0.234 = 0.016 |
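The measures for both cases can be recomputed from the eight baskets. The following is a minimal sketch; the function names `prob` and `rule_metrics` are hypothetical helpers introduced for illustration.

```python
# The eight baskets from the example (A = Apple, B = Banana, C = Carrot).
baskets = [
    {"A"}, {"B"}, {"C"},
    {"A", "B"}, {"A", "C"}, {"B", "C"},
    {"A", "B", "C"}, {"A", "B", "C"},
]


def prob(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def rule_metrics(lhs, rhs):
    """Coverage, support, confidence, lift, and leverage for the rule lhs => rhs."""
    p_lhs, p_rhs, p_both = prob(lhs), prob(rhs), prob(lhs | rhs)
    return {
        "coverage": p_lhs,                    # P(LHS)
        "support": p_both,                    # P(LHS ∩ RHS)
        "confidence": p_both / p_lhs,         # P(RHS | LHS)
        "lift": p_both / (p_lhs * p_rhs),     # ratio to the independent case
        "leverage": p_both - p_lhs * p_rhs,   # difference from the independent case
    }


for lhs, rhs in [({"A"}, {"B"}), ({"A", "B"}, {"C"})]:
    metrics = rule_metrics(lhs, rhs)
    print(sorted(lhs), "=>", sorted(rhs))
    for name, value in metrics.items():
        print(f"  {name}: {value:.4f}")
```

Lift and leverage compare the same two quantities, P(LHS ∩ RHS) and P(LHS)*P(RHS), once as a ratio and once as a difference, so they always agree on the direction of the association.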
![Page 9: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/9.jpg)
What are good association rules? (How to interpret them?)
Lift
– Lift = P(A∩B) / (P(A)*P(B))
– If lift is close to 1, there is no association between the two items (sets).
– If lift is greater than 1, there is a positive association between the two items (sets).
– If lift is less than 1, there is a negative association between the two items (sets).
![Page 10: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/10.jpg)
Leverage
– Leverage = P(A∩B) - P(A)*P(B). There are three cases:
① Leverage > 0: the two items (sets) are positively associated
② Leverage = 0: the two items (sets) are independent
③ Leverage < 0: the two items (sets) are negatively associated
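The three leverage cases can be written as a small classifier. This is an illustrative sketch only; the function name `interpret` and the tolerance parameter `tol` are assumptions added here, since with real data an exact leverage of 0 is rare and some threshold for "close enough to independent" has to be chosen.

```python
import math


def interpret(p_a, p_b, p_ab, tol=1e-9):
    """Classify the association between A and B from P(A), P(B), and P(A∩B).

    `tol` is a hypothetical tolerance for treating leverage as zero.
    """
    leverage = p_ab - p_a * p_b
    if math.isclose(leverage, 0.0, abs_tol=tol):
        return "independent"
    return "positive" if leverage > 0 else "negative"


# Probabilities taken from the 8-basket example: A => B and (A, B) => C.
print(interpret(0.625, 0.625, 0.375))  # leverage is below 0
print(interpret(0.375, 0.625, 0.25))   # leverage is above 0
print(interpret(0.5, 0.5, 0.25))       # leverage is exactly 0
```

Whenever P(A) and P(B) are both nonzero, leverage > 0 is equivalent to lift > 1, so either measure gives the same classification.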
![Page 11: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/11.jpg)
Lab on Association Rules(1)
SPSS Clementine and SAS Enterprise Miner both include association rule software.
This exercise uses Magnum Opus. Go to http://www.rulequest.com and download the Magnum Opus evaluation version.
![Page 12: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/12.jpg)
After you install the program, the initial screen appears. From the menu, choose File – Import Data (Ctrl – O).
![Page 13: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/13.jpg)
Demo data sets are already included. Magnum Opus supports two types of data sets: transaction data (*.idi, *.itl) and attribute-value data (*.data, *.nam).
Transaction data comes in two formats (*.idi, *.itl).
idi (identifier-item file):
001, apples
001, oranges
001, bananas
002, apples
002, carrots
002, lettuce
002, tomatoes
itl (item list file):
apples, oranges, bananas
apples, carrots, lettuce, tomatoes
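The relationship between the two transaction formats can be sketched as a conversion from identifier-item lines to item-list lines. This is a rough illustration that assumes one comma-separated "identifier, item" pair per line, as in the example; it is not Magnum Opus's own parser, and the function name `idi_to_itl` is hypothetical.

```python
def idi_to_itl(idi_lines):
    """Group identifier-item pairs into one item list per identifier.

    Assumes each non-empty line looks like "identifier, item".
    Baskets are returned in order of first appearance.
    """
    baskets = {}
    for line in idi_lines:
        line = line.strip()
        if not line:
            continue
        identifier, item = (part.strip() for part in line.split(",", 1))
        baskets.setdefault(identifier, []).append(item)
    return [", ".join(items) for items in baskets.values()]


idi_text = """\
001, apples
001, oranges
001, bananas
002, apples
002, carrots
002, lettuce
002, tomatoes
"""

for itl_line in idi_to_itl(idi_text.splitlines()):
    print(itl_line)
```

The idi format repeats the transaction identifier on every line, while the itl format drops the identifier and uses one line per basket.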
![Page 14: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/14.jpg)
If you open tutorial.idi in Notepad, you can see the file contents, as shown on the left of the slide.
The example shown contains 5 transactions (baskets).
![Page 15: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/15.jpg)
Choose File – Import Data, or click the import icon, then select tutorial.idi.
Check "Identifier – item file" and click Next >.
![Page 16: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/16.jpg)
Click Yes, then click Next > …
Click Next > …
![Page 17: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/17.jpg)
Click Next > …
What percentage of the whole file do you want to use? Type 50% and click Next > …
![Page 18: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/18.jpg)
Click Import Data.
A screen like the one shown on the left of the slide then appears.
![Page 19: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/19.jpg)
Leave the settings as they are:
– Search by: LIFT
– Minimum lift: 1
– Maximum no. of rules: 10
Click GO.
![Page 20: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/20.jpg)
The results are saved in the tutorial.out file. A derived rule is shown below:
lettuce & carrots are associated with tomatoes with strength = 0.857
coverage = 0.042: 21 cases satisfy the LHS
support = 0.036: 18 cases satisfy both the LHS and the RHS
lift 3.51: the strength is 3.51 times greater than the strength if there were no association
leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association
![Page 21: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/21.jpg)
lettuce & carrots ⇒ tomatoes
– When lettuce and carrots are purchased, tomatoes are purchased as well.
coverage = 0.042: 21 cases satisfy the LHS
– P(lettuce & carrots) = 21/500 = 0.042
support = 0.036: 18 cases satisfy both the LHS and the RHS
– P((lettuce & carrots) ∩ tomatoes) = 18/500 = 0.036
strength (confidence) = 0.857
– P(RHS|LHS) = 18/21 = 0.036/0.042 = 0.857
![Page 22: Lecture 4: Association Market Basket Analysis Analysis of Customer Behavior and Service Modeling](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649f435503460f94c63e99/html5/thumbnails/22.jpg)
lift 3.51: the strength is 3.51 times greater than the strength if there were no association
– That is, (18/21)/(122/500) = 0.857/0.244 = 3.51
leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association
– P(LHS ∩ RHS) - P(LHS)*P(RHS) = 0.036 - 0.042*0.244 = 0.0258
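The reported figures can be reproduced from the raw counts quoted on the slides (500 cases in total, 21 matching the LHS, 18 matching both sides, 122 containing tomatoes). The sketch below simply redoes that arithmetic; the variable names are illustrative.

```python
# Counts taken from the slides for the rule: lettuce & carrots => tomatoes.
n_total = 500  # cases used (denominator in 21/500 and 122/500)
n_lhs = 21     # cases containing lettuce and carrots
n_both = 18    # cases containing lettuce, carrots, and tomatoes
n_rhs = 122    # cases containing tomatoes

coverage = n_lhs / n_total               # P(LHS)
support = n_both / n_total               # P(LHS ∩ RHS)
strength = n_both / n_lhs                # confidence, P(RHS | LHS)
p_rhs = n_rhs / n_total                  # P(RHS)

lift = strength / p_rhs                  # how many times stronger than independence
leverage = support - coverage * p_rhs    # excess support over independence
extra_cases = leverage * n_total         # the same excess expressed in cases

print(f"coverage = {coverage:.3f}")
print(f"support  = {support:.3f}")
print(f"strength = {strength:.3f}")
print(f"lift     = {lift:.2f}")
print(f"leverage = {leverage:.4f} ({extra_cases:.1f} cases)")
```

Under independence, about 21 * 0.244 ≈ 5.1 of the 21 LHS cases would be expected to contain tomatoes; the observed 18 exceeds that by roughly 12.9 cases, which is what the leverage expresses.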