Fundamentals of Data Mining (Fundamentos de Minería de Datos)
Association rules (Reglas de asociación). Fernando Berzal.
![Page 1: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/1.jpg)
Intelligent Databases and Information Systems research group, Department of Computer Science and Artificial Intelligence, E.T.S. Ingeniería Informática, Universidad de Granada (Spain)
Fundamentals of Data Mining (Fundamentos de Minería de Datos)
Association rules (Reglas de asociación)
Fernando Berzal
![Page 2: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/2.jpg)
Motivation
Association mining searches for interesting relationships among items in a given data set.
EXAMPLES
Diapers and six-packs are bought together, especially on Thursday evenings (a myth?)
A sequence such as first buying a digital camera and then a memory card is a frequent (sequential) pattern
…
Outline: Motivation, Definition, Discovery, Variations, Visualization, Extensions, Applications (ART, ATBAR)
![Page 3: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/3.jpg)
Motivation
MARKET BASKET ANALYSIS
The earliest form of association rule mining
Applications: catalog design, store layout, cross-marketing…
![Page 4: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/4.jpg)
Definition
Item
In transactional databases: any of the items included in a transaction.
In relational databases: an (attribute, value) pair.
k-itemset
Set of k items
Itemset support
support(I) = P(I), i.e. the fraction of transactions that contain every item in I
![Page 5: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/5.jpg)
Definition
Association rule
X ⇒ Y
Support
support(X ⇒ Y) = support(X ∪ Y) = P(X ∪ Y)
Confidence
confidence(X ⇒ Y) = support(X ∪ Y) / support(X) = P(Y|X)
NOTE: Both support and confidence are relative measures (fractions, not absolute counts)
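The two definitions above can be made concrete with a minimal Python sketch. The function names and the toy baskets are illustrative assumptions, not part of the original slides; transactions are modeled as sets of item names.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Relative support: fraction of transactions containing every item."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Transaction]) -> float:
    """confidence(X => Y) = support(X u Y) / support(X)."""
    support_x = support(x, transactions)
    if support_x == 0.0:
        return 0.0  # the rule never applies; 0 is a convention chosen here
    return support(frozenset(x) | frozenset(y), transactions) / support_x


# Hypothetical toy data, made up for illustration only.
baskets = [
    frozenset({"diapers", "beer", "milk"}),
    frozenset({"diapers", "beer"}),
    frozenset({"diapers", "bread"}),
    frozenset({"milk", "bread"}),
]

supp_rule = support({"diapers", "beer"}, baskets)       # 2 of 4 baskets
conf_rule = confidence({"diapers"}, {"beer"}, baskets)  # 2 of the 3 diaper baskets
```

Both values are fractions of the data set, which is what the slide means by "relative".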
![Page 6: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/6.jpg)
Discovery
Association rule mining
1. Find all frequent itemsets
2. Generate strong association rules from the frequent itemsets
Strong association rules are those that satisfy both a minimum support threshold and a minimum confidence threshold.
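Step 2 can be sketched as follows, assuming step 1 has already produced a dictionary with the support of every frequent itemset. The function name, the tuple layout of a rule, and the support values are hypothetical.

```python
from itertools import combinations


def generate_rules(supports, min_confidence):
    """Derive strong rules from frequent itemsets.

    `supports` maps each frequent itemset (a frozenset) to its relative
    support and is assumed to contain every subset of every itemset,
    which holds for the output of an Apriori-style search.
    """
    rules = []
    for itemset, supp in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                x = frozenset(antecedent)
                y = itemset - x
                conf = supp / supports[x]
                if conf >= min_confidence:
                    rules.append((x, y, supp, conf))
    return rules


# Hypothetical supports, chosen only to exercise the function.
supports = {
    frozenset({"A"}): 0.7,
    frozenset({"B"}): 0.9,
    frozenset({"A", "B"}): 0.6,
}
rules = generate_rules(supports, min_confidence=0.8)
```

With these numbers only A ⇒ B passes the confidence threshold (0.6 / 0.7), while B ⇒ A does not (0.6 / 0.9). The minimum support threshold is already enforced by restricting the input to frequent itemsets.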
![Page 7: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/7.jpg)
Discovery
Apriori
Observation:
All non-empty subsets of a frequent itemset must also be frequent
Algorithm:
Frequent k-itemsets are used to explore potentially frequent (k+1)-itemsets (i.e. candidates)
Agrawal & Srikant: "Fast Algorithms for Mining Association Rules", VLDB'94
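A compact, unoptimized sketch of the level-wise search described above. It illustrates the join and prune steps only; it does not reproduce the hash-tree counting of the original paper, and the transactions are made up for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: relative support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # One pass over the data to count every candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    level = frequent({frozenset([i]) for t in transactions for i in t})
    result = dict(level)
    k = 1
    while level:
        prev = set(level)
        # Join: unions of two frequent k-itemsets that have k+1 items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k))}
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


# Hypothetical transactions over items A to D.
data = [{"A", "B", "D"}, {"A", "B"}, {"B", "C"}, {"A", "B", "D"}, {"C", "D"}]
frequent_itemsets = apriori(data, min_support=0.4)
```

The prune step is where the Apriori observation pays off: a candidate such as {B, C, D} is never counted because {B, C} is not frequent.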
![Page 8: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/8.jpg)
Discovery
Apriori improvements (I)
Reducing the number of candidates
Park, Chen & Yu: "An Effective Hash-Based Algorithm for Mining Association Rules", SIGMOD'95
Sampling
Toivonen: "Sampling Large Databases for Association Rules", VLDB'96
Park, Yu & Chen: "Mining Association Rules with Adjustable Accuracy", CIKM'97
Partitioning
Savasere, Omiecinski & Navathe: "An Efficient Algorithm for Mining Association Rules in Large Databases", VLDB'95
![Page 9: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/9.jpg)
Discovery
Apriori improvements (II)
Transaction reduction
Agrawal & Srikant: "Fast Algorithms for Mining Association Rules", VLDB'94 (AprioriTID)
Dynamic itemset counting
Brin, Motwani, Ullman & Tsur: "Dynamic Itemset Counting and Implication Rules for Market Basket Data", SIGMOD'97 (DIC)
Hidber: "Online Association Rule Mining", SIGMOD'99 (CARMA)
![Page 10: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/10.jpg)
Discovery
Apriori-like algorithm: TBAR (Tree-based association rule mining)
Berzal, Cubero, Sánchez & Serrano: "TBAR: An efficient method for association rule mining in relational databases", Data & Knowledge Engineering, 2001
![Page 11: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/11.jpg)
Discovery: TBAR
[Figure: itemset tree; each node stores an item and the number of instances (#) containing the itemset on the path from the root]
L1: A #7, B #9, C #7, D #8
L2: under A: B #6, D #5; under B: C #6, D #7; under C: D #5
L3: under AB: D #5
Callouts: 7 instances with A; 6 instances with AB; 5 instances with AD; 6 instances with BC; 5 instances with ABD
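The tree in this figure can be approximated by a simple prefix tree keyed on sorted items. This is only a sketch of the data structure's shape, not TBAR's actual implementation; the class and method names are hypothetical, and the counts are the ones readable from the slide.

```python
class ItemsetNode:
    """Prefix-tree node: the path from the root spells out an itemset."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}

    def insert(self, itemset, count):
        # Items are visited in sorted order so each itemset has one path.
        node = self
        for item in sorted(itemset):
            if item not in node.children:
                node.children[item] = ItemsetNode(item)
            node = node.children[item]
        node.count = count

    def lookup(self, itemset):
        node = self
        for item in sorted(itemset):
            node = node.children.get(item)
            if node is None:
                return 0  # itemset is not stored, i.e. not frequent
        return node.count


tree = ItemsetNode()
for itemset, count in [("A", 7), ("B", 9), ("C", 7), ("D", 8),
                       ("AB", 6), ("AD", 5), ("BC", 6), ("BD", 7), ("CD", 5),
                       ("ABD", 5)]:
    tree.insert(itemset, count)

# Confidence of AB => D read directly from two tree lookups.
conf_ab_d = tree.lookup("ABD") / tree.lookup("AB")
```

Storing itemsets this way lets shared prefixes share nodes, so the support of both sides of a rule is a pair of short walks from the root.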
![Page 12: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/12.jpg)
Discovery
An alternative to Apriori: compress the database, representing frequent items in a frequent-pattern tree (FP-tree)…
Han, Pei & Yin: "Mining Frequent Patterns without Candidate Generation", SIGMOD'2000
![Page 13: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/13.jpg)
Discovery
A challenge: when an itemset is frequent, all its subsets are also frequent
Closed itemset C: there exists no proper super-itemset S such that support(S) = support(C)
Maximal (frequent) itemset M: M is frequent and there exists no super-itemset Y such that M ⊂ Y and Y is frequent
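Both definitions can be checked directly against a table of frequent itemsets. The sketch below assumes that table is complete; the function name is hypothetical and the counts are illustrative values for items A to D.

```python
def closed_and_maximal(supports):
    """Split frequent itemsets into closed and maximal ones.

    `supports` maps every frequent itemset (frozenset) to its support.
    A super-itemset with the same support as a frequent itemset is
    itself frequent, so scanning this table is enough to test closedness.
    """
    closed, maximal = set(), set()
    for itemset, supp in supports.items():
        supersets = [s for s in supports if itemset < s]  # proper supersets
        if not supersets:
            maximal.add(itemset)
        if all(supports[s] < supp for s in supersets):
            closed.add(itemset)
    return closed, maximal


# Illustrative absolute counts.
counts = {frozenset(k): v for k, v in {
    "A": 7, "B": 9, "C": 7, "D": 8,
    "AB": 6, "AD": 5, "BC": 6, "BD": 7, "CD": 5,
    "ABD": 5,
}.items()}

closed, maximal = closed_and_maximal(counts)
```

Here AD is the only itemset that is not closed, because ABD has the same count, and the maximal itemsets are BC, CD and ABD. Every maximal itemset is also closed, which is why maximal itemsets give the most compact (but lossy) summary.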
![Page 14: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/14.jpg)
Variations
Based on the kinds of patterns to be mined:
Frequent itemset mining (transactional and relational data)
Sequential pattern mining (sequence data sets, e.g. bioinformatics)
Structured pattern mining (structured data, e.g. graphs)
![Page 15: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/15.jpg)
Variations
Based on the types of values handled:
Boolean association rules
Quantitative association rules
Fuzzy association rules
Delgado, Marín, Sánchez & Vila: "Fuzzy association rules: General model and applications", IEEE Transactions on Fuzzy Systems, 2003
![Page 16: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/16.jpg)
Variations
More options:
Generalized association rules (a.k.a. multilevel association rules)
Constraint-based association rule mining
Incremental algorithms
Top-k algorithms
…
ICDM FIMI: Workshop on Frequent Itemset Mining Implementations, http://fimi.cs.helsinki.fi/
![Page 17: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/17.jpg)
Visualization
Integrated into data mining tools to help users understand data mining results:
Table-based approach, e.g. SAS Enterprise Miner, DBMiner…
2D matrix-based approach, e.g. SGI MineSet, DBMiner…
Graph-based techniques, e.g. DBMiner ball graphs
![Page 18: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/18.jpg)
Visualization: Tables
![Page 19: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/19.jpg)
Visualization: Visual aids
![Page 20: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/20.jpg)
Visualization: 2D Matrix
![Page 21: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/21.jpg)
Visualization: Graphs
![Page 22: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/22.jpg)
Visualization: VisAR
Based on parallel coordinates (Techapichetvanich & Datta, ADMA'2005)
![Page 23: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/23.jpg)
Extensions
Confidence is not the best possible interestingness measure for rules
e.g. A very frequent item will always appear in rule consequents, regardless of its true relationship with the rule antecedent:
X went to war ⇒ X did not serve in Vietnam (from the US Census)
![Page 24: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/24.jpg)
Extensions
Desirable properties for interestingness measures (Piatetsky-Shapiro, 1991)
P1: ACC(A⇒C) = 0 when supp(A⇒C) = supp(A)supp(C)
P2: ACC(A⇒C) monotonically increases with supp(A⇒C), other parameters being equal
P3: ACC(A⇒C) monotonically decreases with supp(A) (or supp(C)), other parameters being equal
![Page 25: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/25.jpg)
Extensions
Certainty factors…
… satisfy Piatetsky-Shapiro's properties
… are widely used in expert systems
… are not symmetric (unlike interest/lift)
… can substitute conviction when CF>0
Berzal, Blanco, Sánchez & Vila: "Measuring the accuracy and interest of association rules: A new framework", Intelligent Data Analysis, 2002
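The measures named on this slide can be compared numerically. The sketch below assumes the usual textbook definitions of lift and conviction, and a certainty factor defined piecewise from the rule confidence and the consequent support; the slides do not spell these formulas out, so treat them as assumptions. All support values are made up.

```python
def lift(supp_a, supp_c, supp_ac):
    """Interest/lift: symmetric in A and C."""
    return supp_ac / (supp_a * supp_c)


def conviction(supp_a, supp_c, supp_ac):
    conf = supp_ac / supp_a
    return float("inf") if conf == 1 else (1 - supp_c) / (1 - conf)


def certainty_factor(supp_a, supp_c, supp_ac):
    """Assumed definition: relative change of belief in C once A is known."""
    conf = supp_ac / supp_a
    if conf > supp_c:
        return (conf - supp_c) / (1 - supp_c)
    if conf < supp_c:
        return (conf - supp_c) / supp_c
    return 0.0  # independence, matching property P1


# A confident rule (0.8) whose consequent is even more frequent (0.9).
cf_misleading = certainty_factor(0.4, 0.9, 0.32)
lift_misleading = lift(0.4, 0.9, 0.32)

# A genuinely positive association.
cf_positive = certainty_factor(0.5, 0.5, 0.4)
conviction_positive = conviction(0.5, 0.5, 0.4)
```

In the first case the confidence looks high, yet the certainty factor is negative and the lift is below 1, which is the problem with confidence raised two slides earlier. In the second case the conviction equals 1 / (1 - CF), which is one way to read "can substitute conviction when CF>0".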
![Page 26: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/26.jpg)
Extensions
References:
Hilderman & Hamilton: "Evaluation of interestingness measures for ranking discovered knowledge". PAKDD, 2001
Tan, Kumar & Srivastava: "Selecting the right objective measure for association analysis". Information Systems, vol. 29, pp. 293-313, 2004
Berzal, Cubero, Marín, Sánchez, Serrano & Vila: "Association rule evaluation for classification purposes". TAMIDA'2005
![Page 27: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/27.jpg)
Applications
Two sample applications where association rules have been successful
Classification (ART)
Berzal, Cubero, Sánchez & Serrano: "ART: A hybrid classification model", Machine Learning Journal, 2004
Anomaly detection (ATBAR)
Balderas, Berzal, Cubero, Eisman & Marín: "Discovering Hidden Association Rules", KDD'2005, Chicago, Illinois, USA
![Page 28: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/28.jpg)
Classification
Classification models based on association rules
Partial classification models, e.g. Bayardo
"Associative" classification models, e.g. CBA (Liu et al.)
Bayesian classifiers, e.g. LB (Meretakis et al.)
Emerging patterns, e.g. CAEP (Dong et al.)
Rule trees, e.g. Wang et al.
Rules with exceptions, e.g. Liu et al.
![Page 29: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/29.jpg)
Classification
GOAL
Simple, intelligible, and robust classification models obtained in an efficient and scalable way
MEANS
Decision Tree Induction + Association Rule Mining = ART [Association Rule Trees]
![Page 30: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/30.jpg)
ART Classification Model
IDEA
Make use of efficient association rule mining algorithms to build a decision-tree-shaped classification model.
ART = Association Rule Tree
KEY
Association rules + "else" branches
Hybrid between decision trees and decision lists
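To illustrate the "rules plus else branch" shape described above, here is a hypothetical sketch of how such a model could be represented and applied to a record. It covers classification only, not how ART induces the tree, and every name and value in it is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union


@dataclass
class ARTNode:
    """One tree level: rules over the same attributes, plus an else branch."""
    attributes: Tuple[str, ...]
    # Maps a combination of attribute values to a class label or a subtree.
    branches: Dict[Tuple[int, ...], Union[str, "ARTNode"]] = field(default_factory=dict)
    # Followed when no rule at this level covers the record.
    else_branch: Union[str, "ARTNode", None] = None


def classify(node, record):
    """Walk the tree until a class label (or None) is reached."""
    while isinstance(node, ARTNode):
        key = tuple(record[a] for a in node.attributes)
        node = node.branches.get(key, node.else_branch)
    return node


# Hypothetical model: two rules on (X, Y); everything else is decided by Z.
model = ARTNode(
    attributes=("X", "Y"),
    branches={(0, 0): "neg", (1, 1): "pos"},
    else_branch=ARTNode(attributes=("Z",), branches={(0,): "neg"}, else_branch="pos"),
)

label_rule = classify(model, {"X": 0, "Y": 0, "Z": 1})
label_else = classify(model, {"X": 0, "Y": 1, "Z": 1})
```

A node may test several attributes at once, as a multi-item rule antecedent does, and the else branch chains to the next node in the way a decision list falls through to its next rule.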
![Page 31: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/31.jpg)
ART Classification Model
SPLICE
![Page 32: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/32.jpg)
ART classification model
Example: ART vs. TDIDT
[Figure: an ART tree (a node on X and Y, a node on Z, and an "else" branch) shown next to a TDIDT tree (nodes on Y, X and Z), with 0/1 branch values and leaf labels]
![Page 33: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/33.jpg)
ART classification model
Final comments
Classification models
Acceptable accuracy
Reduced complexity
Attribute interactions
Robustness (noise & primary keys)
Classifier building method
Efficient algorithm
Good scalability properties
Automatic parameter selection
![Page 34: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/34.jpg)
Anomaly detection
It is often more interesting to find surprising non-frequent events than frequent ones
EXAMPLES
Abnormal network activity patterns in intrusion detection systems.
Exceptions to "common" rules in Medicine (useful for diagnosis, drug evaluation, detection of conflicting therapies…)
…
![Page 35: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/35.jpg)
Anomaly detection
Anomalous association rule
Confident rule representing homogeneous deviations from common behavior.
![Page 36: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/36.jpg)
Anomaly detection
Anomalous association rule
X ⇒ Y is frequent and confident: X usually implies Y (dominant rule)
X ¬Y ⇒ A is confident: when X does not imply Y, then it usually implies A (the anomaly)
X Y ⇒ ¬A is confident
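The three conditions can be checked directly on a list of transactions. The sketch below is a brute-force test for one candidate triple (X, Y, A), not the ATBAR algorithm; the function names, thresholds and data are hypothetical.

```python
def ratio(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def is_anomalous_rule(transactions, x, y, a, min_support, min_confidence):
    """Check the three conditions for 'X => A when not Y'."""
    x, y, a = frozenset(x), frozenset(y), frozenset(a)
    data = [frozenset(t) for t in transactions]
    with_x = [t for t in data if x <= t]
    with_xy = [t for t in with_x if y <= t]
    with_x_not_y = [t for t in with_x if not y <= t]

    # 1. X => Y is frequent and confident (the dominant rule).
    dominant_support = ratio(len(with_xy), len(data))
    dominant_conf = ratio(len(with_xy), len(with_x))
    # 2. X and not Y => A is confident (the anomaly).
    anomaly_conf = ratio(sum(1 for t in with_x_not_y if a <= t), len(with_x_not_y))
    # 3. X and Y => not A is confident (A is specific to the exceptions).
    not_a_conf = ratio(sum(1 for t in with_xy if not a <= t), len(with_xy))

    return (dominant_support >= min_support
            and dominant_conf >= min_confidence
            and anomaly_conf >= min_confidence
            and not_a_conf >= min_confidence)


# Hypothetical data: 8 usual cases, 2 homogeneous exceptions.
data = [{"X", "Y", "Z"}] * 8 + [{"X", "A", "Z"}] * 2

found = is_anomalous_rule(data, {"X"}, {"Y"}, {"A"}, 0.5, 0.75)
rejected = is_anomalous_rule(data, {"X"}, {"Y"}, {"Z"}, 0.5, 0.75)
```

Z is rejected even though it appears in every exception, because it also appears in every usual case and therefore fails the third condition.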
![Page 37: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/37.jpg)
Anomaly detection
X Y A1 Z1 …
X Y A1 Z2 …
X Y A2 Z3 …
X Y A2 Z1 …
X Y A3 Z2 …
X Y A3 Z3 …
X Y A Z …
X Y3 A Z3
…
X Y3 A Z …
X Y4 A Z …
X ⇒ Y is the dominant rule
X ⇒ A when ¬Y is the anomalous rule
![Page 38: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/38.jpg)
Anomaly detection
Suzuki et al.'s "Exception Rules"
X ⇒ Y is an association rule
X I ⇒ ¬Y is the exception rule
X ⇒ I is the reference rule
I is the "interacting" itemset
Too many exceptions
The "cause" needs to be present
![Page 39: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/39.jpg)
Anomaly detection: ATBAR
Anomalous association rules
[Figure: ATBAR itemset tree]
First scan: A #7, B #9, C #7, D #8
Second scan: under A #7, the frequent children B #6 and D #5, plus the extended node A* (#7) holding AB #6, AC #4, AD #5, AE #3, AF #3, non-frequent itemsets included
![Page 40: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/40.jpg)
Anomaly detection: ATBAR
Anomalous association rules
[Figure: ATBAR itemset tree]
First scan: A #7, B #9, C #7, D #8
Second scan: A #7 (with A*): B #6, D #5; B #9 (with B*): C #6, D #7; C #7 (with C*): D #5; D #8 (with D*)
![Page 41: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/41.jpg)
Anomaly detection: ATBAR
Anomalous association rules
Rule generation is immediate from the frequent and extended itemsets obtained by ATBAR
![Page 42: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/42.jpg)
Anomaly detection: Results
Experiments on health-related datasets from the UCI Machine Learning Repository
Relatively small set of anomalous rules (typically, >90% reduction with respect to standard association rules)
Reasonable overhead needed to obtain anomalous association rules (about 20% in ATBAR w.r.t. TBAR)
![Page 43: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/43.jpg)
Anomaly detection: Results
An example from the Census dataset:
if WORKCLASS: Local-gov
then CAPGAIN: [99999.0 , 99999.0] (7 out of 7) ("Anomaly")
when not CAPGAIN: [0.0 , 20051.0] (usual consequent)
![Page 44: Fundamentos de Minería de Datos](https://reader035.vdocuments.net/reader035/viewer/2022081506/56814dd8550346895dbb3fc3/html5/thumbnails/44.jpg)
Anomaly detection: Results
Anomalous association rules (novel characterization of potentially interesting knowledge)
An efficient algorithm for discovering anomalous association rules: ATBAR
Some heuristics for filtering the discovered anomalous association rules