mining non-derivable association rules

Mining Non-Derivable Association Rules. Bart Goethals, Juho Muhonen, Hannu Toivonen. Proceedings of SIAM 2005. Speaker: Pei-Min Chou. Date: 05/12/30

Upload: omar

Post on 09-Jan-2016



TRANSCRIPT

  • Mining Non-Derivable Association Rules
    Bart Goethals, Juho Muhonen, Hannu Toivonen
    Proceedings of SIAM 2005
    Speaker: Pei-Min Chou
    Date: 05/12/30

  • Introduction
    Association rule
      Support: Ex: A=>C; 2/4 = 50%
      Confidence: Ex: A=>C; supp(AC)/supp(A) = 2/3 = 67%
    X=>Y, where X, Y: itemsets
      Frequent: X∪Y is frequent
      Confident: supp(X∪Y)/supp(X) ≥ confidence threshold
    Typically, the set of association rules is large and redundant
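    The support and confidence definitions above can be sketched in a few lines of Python. The slide does not show the transaction database behind its numbers, so the four transactions below are an assumption, chosen only because they reproduce 2/4 and 2/3; the function names `support` and `confidence` are likewise illustrative.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, condition, consequent):
    """supp(X ∪ Y) / supp(X) for the rule X => Y."""
    both = set(condition) | set(consequent)
    return support(transactions, both) / support(transactions, condition)


# Assumed database (not given on the slide): A and C occur together
# in 2 of 4 transactions, and A occurs in 3 of 4.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

print(support(transactions, {"A", "C"}))       # 0.5
print(confidence(transactions, {"A"}, {"C"}))  # about 0.667
```

    A rule X=>Y would then be reported when both values reach their user-given thresholds.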

  • Introduction (cont.)
    Related work
      Apply rules with the same confidence
      Use a specific inference system to prune
      Does not give an error bound
    Mining non-derivable association rules
      Find tight bounds on the confidence of a rule from its subrules
      If lower bound = upper bound, the rule is derivable

  • Non-derivable set property
    Downward closed:
      all supersets of a derivable set are derivable
      all subsets of a non-derivable set are non-derivable
    Given all subrules of X=>Y:
      X=>Y is derivable if and only if X∪Y is a derivable set

  • Method
    Goal: remove all derivable association rules
    Different cases:
      Rules have exactly the same condition and consequent
      Fixed consequent
        Single item: ex. abc=>d
        Multiple items: ex. abc=>de
      Fixed condition or consequent
        consequent: use the method above
        condition: use the inclusion-exclusion principle
      Some subrules

  • Example
    Consider the rule abc=>d
    All subrules:

    Given all subrules, the only missing information is the support of abc and of abcd

  • Bounds on supp(abc)
    Use the inclusion-exclusion principle:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({})
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a)
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b)
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c)
      supp(abc) ≤ supp(ab)
      supp(abc) ≤ supp(bc)
      supp(abc) ≤ supp(ac)
      supp(abc) ≥ 0
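    The eight bounds above are instances of one general inclusion-exclusion rule. The formula below is a sketch in the notation commonly used for non-derivable itemsets; the symbol δ_J(I) and the summation variable K do not appear on the slides and are introduced here only for illustration.

```latex
% For an itemset I and any subset J \subseteq I, define
\delta_J(I) \;=\; \sum_{J \subseteq K \subsetneq I} (-1)^{|I \setminus K| + 1}\, \mathrm{supp}(K)

% Then
\mathrm{supp}(I) \le \delta_J(I) \quad \text{if } |I \setminus J| \text{ is odd},
\qquad
\mathrm{supp}(I) \ge \delta_J(I) \quad \text{if } |I \setminus J| \text{ is even}.

% Check for I = abc, J = a (|I \setminus J| = 2, even, so a lower bound):
\delta_a(abc) = \mathrm{supp}(ab) + \mathrm{supp}(ac) - \mathrm{supp}(a)
```

    Taking J = I gives the empty sum, which is the trivial lower bound 0, and taking J = {} gives the long upper bound that ends in + supp({}).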

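  • Example (cont.)
    ab=>c
    Given: supp(ac) = 3, supp(bc) = 3, supp(a) = 7, supp(b) = 7, supp(c) = 5, supp({}) = 10
    Confidence interval of ab=>c is [1/5, 1/2]
    Bounds on supp(ab):
      supp(ab) ≥ supp(a) + supp(b) - supp({}) = 7 + 7 - 10 = 4 (lower)
      supp(ab) ≤ supp(a) = supp(b) = 7 (upper)
    For supp(ab) = 4:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({}) = 4 + 3 + 3 - 7 - 7 - 5 + 10 = 1
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a) = 4 + 3 - 7 = 0
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b) = 4 + 3 - 7 = 0
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c) = 3 + 3 - 5 = 1
      supp(abc) ≤ supp(ab) = 4
      supp(abc) ≤ supp(bc) = 3
      supp(abc) ≤ supp(ac) = 3
      supp(abc) ≥ 0
      Result: supp(abc) is in [1, 1], so the confidence is 1/4
    For supp(ab) = 5:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({}) = 5 + 3 + 3 - 7 - 7 - 5 + 10 = 2
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a) = 5 + 3 - 7 = 1
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b) = 5 + 3 - 7 = 1
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c) = 3 + 3 - 5 = 1
      supp(abc) ≤ supp(ab) = 5
      supp(abc) ≤ supp(bc) = 3
      supp(abc) ≤ supp(ac) = 3
      supp(abc) ≥ 0
      Result: supp(abc) is in [1, 2], so the confidence is in [1/5, 2/5]
    For supp(ab) = 6:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({}) = 6 + 3 + 3 - 7 - 7 - 5 + 10 = 3
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a) = 6 + 3 - 7 = 2
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b) = 6 + 3 - 7 = 2
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c) = 3 + 3 - 5 = 1
      supp(abc) ≤ supp(ab) = 6
      supp(abc) ≤ supp(bc) = 3
      supp(abc) ≤ supp(ac) = 3
      supp(abc) ≥ 0
      Result: supp(abc) is in [2, 3], so the confidence is in [1/3, 1/2]
    For supp(ab) = 7:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({}) = 7 + 3 + 3 - 7 - 7 - 5 + 10 = 4
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a) = 7 + 3 - 7 = 3
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b) = 7 + 3 - 7 = 3
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c) = 3 + 3 - 5 = 1
      supp(abc) ≤ supp(ab) = 7
      supp(abc) ≤ supp(bc) = 3
      supp(abc) ≤ supp(ac) = 3
      supp(abc) ≥ 0
      Result: supp(abc) is in [3, 3], so the confidence is 3/7
    Over all four cases, the smallest confidence is 1/5 and the largest is 1/2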

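  • Example (cont.)
    ab=>c
    Given: supp(ac) = 7, supp(bc) = 7, supp(a) = 7, supp(b) = 7, supp(c) = 10, supp({}) = 10
    supp(ab) is in [4, 7]: non-derivable
    Confidence interval of ab=>c is [1, 1]: derivable
    Bounds on supp(ab):
      supp(ab) ≥ supp(a) + supp(b) - supp({}) = 7 + 7 - 10 = 4
      supp(ab) ≤ supp(a) = supp(b) = 7
    For supp(ab) = 4:
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({}) = 4 + 7 + 7 - 7 - 7 - 10 + 10 = 4
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a) = 4 + 7 - 7 = 4
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b) = 4 + 7 - 7 = 4
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c) = 7 + 7 - 10 = 4
      supp(abc) ≤ supp(ab) = 4
      supp(abc) ≤ supp(bc) = 7
      supp(abc) ≤ supp(ac) = 7
      supp(abc) ≥ 0
      Result: supp(abc) is in [4, 4], so the confidence is 4/4 = 1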
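    The two worked examples above can be replayed mechanically. The Python sketch below is not from the slides: the names `abc_bounds` and `confidence_interval` are hypothetical, and it assumes supports are absolute counts, so that supp(ab) can be enumerated over its integer range.

```python
from fractions import Fraction


def abc_bounds(s_ab, s_ac, s_bc, s_a, s_b, s_c, s_empty):
    """Tightest lower and upper bound on supp(abc) from the eight rules."""
    lower = max(
        s_ab + s_ac - s_a,
        s_ab + s_bc - s_b,
        s_ac + s_bc - s_c,
        0,
    )
    upper = min(
        s_ab + s_ac + s_bc - s_a - s_b - s_c + s_empty,
        s_ab,
        s_ac,
        s_bc,
    )
    return lower, upper


def confidence_interval(s_ac, s_bc, s_a, s_b, s_c, s_empty):
    """Interval for conf(ab => c) when supp(ab) itself is unknown."""
    ab_low = max(s_a + s_b - s_empty, 0)
    ab_high = min(s_a, s_b)
    candidates = []
    for s_ab in range(ab_low, ab_high + 1):
        if s_ab == 0:
            continue  # confidence is undefined when the condition never occurs
        low, high = abc_bounds(s_ab, s_ac, s_bc, s_a, s_b, s_c, s_empty)
        candidates.append(Fraction(low, s_ab))
        candidates.append(Fraction(high, s_ab))
    return min(candidates), max(candidates)


low, high = confidence_interval(s_ac=3, s_bc=3, s_a=7, s_b=7, s_c=5, s_empty=10)
print(f"[{low}, {high}]")  # [1/5, 1/2]

low, high = confidence_interval(s_ac=7, s_bc=7, s_a=7, s_b=7, s_c=10, s_empty=10)
print(f"[{low}, {high}]")  # [1, 1]
```

    When the two ends of the returned interval coincide, as in the second call, the rule is derivable and could be pruned.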

  • Use subrules
    For any subset J ⊆ I such that |I\J| ≤ k-1
    k > 0: user-given parameter (depth)
    Ex. depth = 4
      supp(abc) ≤ supp(ab) + supp(bc) + supp(ac) - supp(a) - supp(b) - supp(c) + supp({})
      supp(abc) ≥ supp(ab) + supp(ac) - supp(a)
      supp(abc) ≥ supp(ab) + supp(bc) - supp(b)
      supp(abc) ≥ supp(bc) + supp(ac) - supp(c)
      supp(abc) ≤ supp(ab)
      supp(abc) ≤ supp(bc)
      supp(abc) ≤ supp(ac)
      supp(abc) ≥ 0
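    A depth-limited version of the bounds can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it reads the slide's condition as |I\J| ≤ k-1, the function name `bounds_with_depth` is hypothetical, and the support table reuses the earlier example values with supp(ab) = 5.

```python
from itertools import combinations


def bounds_with_depth(itemset, supp, k):
    """Bounds on supp(itemset) using only rules with |I \\ J| <= k - 1.

    `supp` maps frozensets to known supports. Returns (lower, upper);
    upper is None when k = 1, because no upper-bound rule is used then.
    """
    items = frozenset(itemset)
    lower, upper = 0, None
    for missing in range(min(k - 1, len(items)) + 1):  # missing = |I \ J|
        for dropped in combinations(sorted(items), missing):
            j = items - frozenset(dropped)
            delta = 0
            # Sum over every K with J ⊆ K ⊊ I: add back fewer than all dropped items.
            for extra_size in range(missing):
                for extra in combinations(dropped, extra_size):
                    mid = j | frozenset(extra)
                    sign = (-1) ** (len(items) - len(mid) + 1)
                    delta += sign * supp[mid]
            if missing % 2 == 1:   # odd |I \ J| gives an upper bound
                upper = delta if upper is None else min(upper, delta)
            else:                  # even |I \ J| gives a lower bound
                lower = max(lower, delta)
    return lower, upper


# Assumed support table: the earlier example with supp(ab) fixed at 5.
supp = {frozenset(key): value for key, value in {
    "": 10, "a": 7, "b": 7, "c": 5, "ab": 5, "ac": 3, "bc": 3,
}.items()}

print(bounds_with_depth("abc", supp, k=4))  # (1, 2)
print(bounds_with_depth("abc", supp, k=2))  # (0, 3)
```

    A smaller depth evaluates fewer rules and touches fewer subsets, at the price of a wider interval, as the k = 2 call shows.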

  • Experiments
    Dataset characteristics

    Number of rules after different pruning methods

  • Exp(1)
    non-derivable
    Minimal closed association rules

  • Exp(2)
    non-derivable
    basic association rules
    maximum entropy method

  • Exp(3): non-derivable with singular consequent

  • Exp(4): non-derivable with different support