data mining association analysis stu (1)

Upload: vijay-sai

Post on 02-Jun-2018

TRANSCRIPT

  • 8/10/2019 Data Mining Association Analysis Stu (1)

    1/17

    ERP 345 5410

    Business Intelligence
    Association Analysis

    Source: Business Intelligence, 3rd ed., Sharda et al., 2014, Prentice Hall; SAP University Alliance, BI workshop

    2/17

    Association Rule Mining

    Finds interesting relationships (affinities) between variables (items or events)

    Part of the machine learning family

    Employs unsupervised learning: there is no output variable

    Also known as market basket analysis

    Often used as an example to describe data mining (DM) to ordinary people, such as the famous relationship between diapers and beer

    3/17

    Association Analysis - Example: Diapers implies Beers?

    [Figure: association analysis as a data mining task. Customers' purchases of products A to E are mined for cross-selling rules that answer "What products / services are typically bought together?". The rules are exported to a web shop and used in merchandising.]

    4/17

    An urban legend: Beer Implies Diapers?

    Pattern: An analysis of the behavior of supermarket shoppers discovered that customers who buy beer also tend to buy diapers.

    Rationale: Men in their 20s who purchase beer on Fridays after receiving their paycheck are also likely to buy a pack of diapers for the young kids in the family.

    5/17

    Association Rule Mining

    Input: simple point-of-sale transaction data

    Output: the most frequent affinities among items

    Example: according to the transaction data, "Customers who bought a laptop computer and virus protection software also bought an extended service plan 70 percent of the time."

    6/17

    How to Apply Market Basket Analysis Results?

    Put the items next to each other for ease of finding

    Promote the items as a package

    Place items far apart from each other so that the customer has to walk the aisles to search for them, and by doing so potentially see and buy other items

    Direct marketers can use this information to determine which new products to offer to their current customers.

    Inventory policies can be improved if reorder points reflect the demand for the complementary products.

    7/17

    Association Rules for Market Basket Analysis

    Rules are written in the form "left-hand side implies right-hand side", or A => B

    Green Peppers IMPLIES Bananas

    Oranges IMPLIES Apples

    Milk IMPLIES Bread

    To make effective use of a rule, three numeric measures about that rule must be considered: (1) support, (2) confidence, and (3) lift

    8/17

    Measures of Predictive Ability -- Confidence

    Confidence measures what percentage of baskets that contained item A also contained item B.

    Confidence of the rule A => B = (# of transactions containing A & B) / (# of transactions containing A)

    Discussion: What does Confidence measure?
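
    The confidence formula above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `confidence` and the four toy baskets are assumptions made up for the example.

```python
def confidence(baskets, a, b):
    """Share of baskets containing item a that also contain item b."""
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        raise ValueError(f"no basket contains {a!r}")
    with_a_and_b = [basket for basket in with_a if b in basket]
    return len(with_a_and_b) / len(with_a)


# Hypothetical toy data: each basket is the set of items in one transaction.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
    {"milk"},
]

conf_milk_bread = confidence(baskets, "milk", "bread")  # 2 of the 4 milk baskets
conf_bread_milk = confidence(baskets, "bread", "milk")  # 2 of the 2 bread baskets
print(f"milk => bread: {conf_milk_bread:.2f}")
print(f"bread => milk: {conf_bread_milk:.2f}")
```

    The two results differ because confidence is directional: the denominator counts only the baskets that contain the left-hand item.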

    9/17

    Measures of Predictive Ability -- Support

    Support refers to the percentage of baskets where the rule was true (both items A and B were present).

    Support of the rule A => B = (# of transactions containing A & B) / (# of all transactions)

    Note: We could also define the Support of any single item the same way, i.e. Support of item A = (# of transactions containing A) / (# of all transactions)

    Questions: What is Support? Why Support?

    So we would only like to retain rules with large support.
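
    A short Python sketch of the support formula, together with the idea of keeping only rules with large support. The function name `support`, the toy baskets, and the `MIN_SUPPORT` cutoff of 0.4 are all assumptions for illustration; the slides do not state a threshold.

```python
def support(baskets, items):
    """Fraction of all baskets that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in baskets if items <= basket)
    return hits / len(baskets)


# Hypothetical toy data, not taken from the slides.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
    {"C"},
]

MIN_SUPPORT = 0.4  # assumed cutoff; the slides do not give a threshold

candidate_rules = [("A", "B"), ("A", "C"), ("B", "C")]
kept = [
    (a, b) for a, b in candidate_rules
    if support(baskets, {a, b}) >= MIN_SUPPORT
]
print(support(baskets, {"A"}))  # support of a single item
print(kept)                     # rules that clear the cutoff
```

    A rule with low support describes only a handful of baskets, so its confidence and lift rest on very little evidence; that is one reason to filter on support first.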

    10/17

    Measures of Predictive Ability -- Lift

    Lift measures how much more frequently item A is found with item B than would be expected if A and B were bought independently.

    Lift of the rule A => B = (Confidence of the rule A => B) / (Support of B)

    = [(# of transactions containing A & B) x (# of all transactions)] / [(# of transactions containing A) x (# of transactions containing B)]

    Questions: What does lift measure? Why Lift?
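
    The two forms of the lift formula can be checked against each other with a small Python sketch. The function names and the counts below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def lift_from_counts(n_ab, n_a, n_b, n_total):
    """Lift of A => B as confidence divided by the support of B."""
    confidence = n_ab / n_a      # share of A-baskets that also hold B
    support_b = n_b / n_total    # share of all baskets that hold B
    return confidence / support_b


def lift_single_fraction(n_ab, n_a, n_b, n_total):
    """The same quantity written as one fraction of raw counts."""
    return (n_ab * n_total) / (n_a * n_b)


def describe(lift):
    """Plain-language reading of a lift value."""
    if lift > 1:
        return "A and B appear together more often than expected"
    if lift < 1:
        return "A and B appear together less often than expected"
    return "A and B look independent"


# Hypothetical counts: 1,000 baskets, 200 with A, 250 with B, 100 with both.
value = lift_from_counts(n_ab=100, n_a=200, n_b=250, n_total=1000)
print(value, describe(value))
```

    A lift of 1 means that knowing A is in the basket tells you nothing about B; a high confidence on its own can be misleading when B is simply a popular item.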

    11/17

    Small Example

    Rule: Diapers -> Beer

    Support: 60% (3/5)
        60% of all purchases have diapers and beer

    Confidence: 75% (3/4)
        If diapers are purchased, 75% chance of buying beer

    Lift: 1.25 (75%/60%) (Note: this 60% is not the 60% Support of the rule, but the Support of Beer)
        If diapers are purchased, the person is 1.25 times more likely to purchase beer

    URL: http://www-users.cs.umn.edu/~kumar/dmbook/ch6.pdf
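
    The three figures for Diapers -> Beer can be reproduced with exact fractions. The transaction table for this slide did not survive in the transcript, so the five baskets below are an assumption: they are chosen only so that the counts match the numbers quoted (diapers in 4 of 5 baskets, beer in 3 of 5, both in 3 of 5).

```python
from fractions import Fraction

# Assumed baskets (see the note above); only the counts are taken from the slide.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(items):
    """Number of baskets containing every item in `items`."""
    return sum(1 for basket in baskets if set(items) <= basket)


n = len(baskets)
support_rule = Fraction(count({"diapers", "beer"}), n)
confidence = Fraction(count({"diapers", "beer"}), count({"diapers"}))
support_beer = Fraction(count({"beer"}), n)
lift = confidence / support_beer

print(f"support    = {support_rule} = {float(support_rule):.0%}")
print(f"confidence = {confidence} = {float(confidence):.0%}")
print(f"lift       = {lift} = {float(lift):.2f}")
```

    Here the support of the rule and the support of beer are both 3/5 only because every basket with beer also has diapers; in general the two numbers differ.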

    12/17

    Using the Results

    The tabulations can immediately be translated into association rules and the numerical measures computed.

    Comparing this week's table to last week's table can immediately show the effect of this week's promotional activities.

    Some rules are going to be trivial, but you may discover some interesting facts/patterns from the data.

    13/17

    Limitations to Market Basket Analysis

    A large number of real transactions are needed to do an effective basket analysis, but the data's accuracy is compromised if all the products do not occur with similar frequency.

    The analysis can sometimes capture results that were due to the success of previous marketing campaigns (and not natural tendencies of customers).

    14/17

    Performing Analysis with Virtual Items

    The sales data can be augmented with the addition of virtual items. For example, we could record that the customer was new to us, or had children.

    The transaction record might look like:

    Item 1: Sweater    Item 2: Jacket    Item 3: New customer

    This might allow us to see what patterns new customers have versus old customers.
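
    One way to picture virtual items in code is to add a marker to the basket before running the usual measures. The sketch below is only an illustration: the record layout, the `NEW_CUSTOMER` marker name, and the four transactions are hypothetical.

```python
# Hypothetical transactions; "NEW_CUSTOMER" is a virtual item added to the
# basket, not a product that was actually sold.
transactions = [
    {"items": {"sweater", "jacket"}, "new_customer": True},
    {"items": {"sweater", "scarf"}, "new_customer": False},
    {"items": {"jacket", "scarf"}, "new_customer": True},
    {"items": {"sweater"}, "new_customer": False},
]


def with_virtual_items(transaction):
    """Return the basket with a marker item for new customers."""
    basket = set(transaction["items"])
    if transaction["new_customer"]:
        basket.add("NEW_CUSTOMER")
    return basket


def confidence(baskets, a, b):
    """Share of baskets containing a that also contain b."""
    with_a = [basket for basket in baskets if a in basket]
    return sum(1 for basket in with_a if b in basket) / len(with_a)


baskets = [with_virtual_items(t) for t in transactions]

new_conf = confidence(baskets, "NEW_CUSTOMER", "jacket")
overall = sum(1 for basket in baskets if "jacket" in basket) / len(baskets)
print(f"NEW_CUSTOMER => jacket confidence: {new_conf:.2f}")
print(f"share of all baskets with a jacket: {overall:.2f}")
```

    Because the marker is treated like any other item, rules such as NEW_CUSTOMER => jacket come out of the same calculation used for product pairs.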

    15/17

    Multidimensional Market Basket Analysis

    Rules can involve more than two items, for example Plant and Clay Pot IMPLIES Soil.

    These rules are built iteratively. First, pairs are found, then relevant sets of three or four. These are then pruned by removing those that occur infrequently.

    In an environment like a grocery store, where customers commonly buy over 100 items, rules could involve as many as 10 items.
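
    The iterative build-and-prune idea can be sketched as a level-wise search in the style of the Apriori algorithm. The slides do not name an algorithm, so treat this as one possible reading rather than the course's method; the function name, the `min_count` and `max_size` parameters, and the garden-shop baskets are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_count, max_size=3):
    """Level-wise search: grow itemsets one item at a time, pruning rare ones."""
    items = sorted(set().union(*baskets))
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current and size <= max_size:
        # Count how many baskets contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(survivors)
        # Build the next level by joining survivors that differ by one item.
        size += 1
        candidates = {a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == size}
        # Keep a candidate only if every smaller subset was itself frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in survivors
                          for s in combinations(c, size - 1))]
    return frequent


# Hypothetical baskets for illustration.
baskets = [
    {"plant", "clay pot", "soil"},
    {"plant", "clay pot", "soil"},
    {"plant", "clay pot"},
    {"plant", "soil", "gloves"},
    {"gloves"},
]

frequent = frequent_itemsets(baskets, min_count=2)
for itemset, n in sorted(frequent.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

    Pruning at each level is what keeps the search manageable: a set of three items is only counted if all of its pairs already occurred often enough.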

    16/17

    Lab 5 Association Analysis with Excel

    Use the transactions from slide # 11 as the input.

    Develop your own Excel spreadsheet (download the sample Excel file from Blackboard) to conduct association analysis (with the formulas from slides 8, 9 & 10).

    Report Confidence, Support and Lift for all possible first-level (A implies B) rules.

    Lab Questions

    (1) What conclusions can you make from the association analysis of this lab? Explain.

    (2) What suggestions can you provide to the store manager? Explain.
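
    For cross-checking a spreadsheet, the same table of first-level rules can be produced with a few lines of Python. This is a sketch under stated assumptions, not a substitute for the Excel work: the four baskets are hypothetical placeholders and should be replaced with the lab's transactions.

```python
from itertools import permutations

# Hypothetical baskets for illustration; substitute the lab's transactions.
baskets = [
    {"A", "B"},
    {"A", "C"},
    {"A", "B", "C"},
    {"B"},
]


def count(*items):
    """Number of baskets containing all of the given items."""
    return sum(1 for basket in baskets if set(items) <= basket)


n = len(baskets)
items = sorted(set().union(*baskets))
rows = []
for a, b in permutations(items, 2):  # every ordered pair A => B
    n_ab = count(a, b)
    support = n_ab / n
    confidence = n_ab / count(a)
    lift = confidence / (count(b) / n)
    rows.append((a, b, support, confidence, lift))

rows.sort(key=lambda row: (row[0], row[1]))  # by Product A, then Product B

for a, b, support, confidence, lift in rows:
    print(f"{a} => {b}  support={support:.2f}  "
          f"confidence={confidence:.2f}  lift={lift:.2f}")
```

    Note that A => B and B => A share the same support and lift but can have different confidence, which is why both directions are listed.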

    17/17

    Lab 5 report

    Turn in a lab report with the following:

    A cover page

    Summary of the lab

    A screenshot of your Excel table including all the association measures of your rules. Sort your rules first by Product A, then by Product B (ascending order)

    Two lab questions with your answers