dcmeet second v2
TRANSCRIPT
-
8/4/2019 Dcmeet Second v2
1/34
Mining Association Rules using Optimal Genetic Algorithm & Quantum Swarm Intelligent PSO
Presented by K. Indira
Under the guidance of Dr. S. Kanmani, Professor, Department of Information Technology, Pondicherry Engineering College
-
Contents
Objective
Introduction
  Data Mining
  Association Analysis
  Limitations of the Existing System
  GA and PSO: An Introduction
Existing Work
  Based on GA
  Based on PSO
Work Done So Far
Proposed Work
Execution Plan
Papers Published
References
-
Objective
To propose an efficient methodology for mining association rules (ARs) using an Optimal Genetic Algorithm and Quantum Swarm Intelligent PSO.
-
Data Mining
Data mining is the extraction of interesting information or patterns from data in large databases.
-
Association Rules
Find all the rules X → Y with minimum support and confidence.
Support, s: probability that a transaction contains X ∪ Y.
Confidence, c: conditional probability that a transaction having X also contains Y.
Tid   Items bought
10    Milk, Nuts, Sugar
20    Milk, Coffee, Sugar
30    Milk, Sugar, Eggs
40    Nuts, Eggs, Bread
50    Nuts, Coffee, Sugar, Eggs, Bread
[Venn diagram: customer buys milk, customer buys sugar, customer buys both]
Let minsup = 50%, minconf = 50%.
Freq. Pat.: Milk: 3, Nuts: 3, Sugar: 4, Eggs: 3, {Milk, Sugar}: 3
Association rules: Milk → Sugar (60%, 100%), Sugar → Milk (60%, 75%)
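The support and confidence figures on this slide can be checked with a few lines of Python. This is a minimal sketch for illustration only: the helper names count, support and confidence are hypothetical and do not come from the presentation.

```python
# Transactions from the slide: Tid -> items bought.
transactions = {
    10: {"Milk", "Nuts", "Sugar"},
    20: {"Milk", "Coffee", "Sugar"},
    30: {"Milk", "Sugar", "Eggs"},
    40: {"Nuts", "Eggs", "Bread"},
    50: {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
}

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)

def support(antecedent, consequent):
    """Fraction of all transactions that contain both sides of the rule."""
    return count(antecedent | consequent) / len(transactions)

def confidence(antecedent, consequent):
    """Fraction of transactions with the antecedent that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)

milk, sugar = {"Milk"}, {"Sugar"}
print(support(milk, sugar), confidence(milk, sugar))  # Milk -> Sugar: prints 0.6 1.0
print(support(sugar, milk), confidence(sugar, milk))  # Sugar -> Milk: prints 0.6 0.75
```

Working from raw counts rather than dividing two support values keeps the confidence free of floating-point rounding.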
-
Limitations of the Existing System
Apriori, FP-growth and Eclat are some of the popular algorithms for mining ARs.
They traverse the database many times.
Their I/O overhead and computational complexity are high.
They cannot meet the requirements of large-scale database mining.
-
GA and PSO: An Introduction
Evolutionary algorithms provide a robust and efficient approach to exploring large search spaces.
A Genetic Algorithm (GA) is a procedure used to find approximate solutions to search problems through the application of the principles of evolutionary biology.
PSO's mechanism is inspired by the social and cooperative behavior displayed by various species such as birds and fish, including human beings.
-
Existing Work
Mining ARs Based on Genetic Algorithm
An efficient distributed genetic algorithm spatially partitions the population into several semi-isolated nodes, each evolving in parallel and possibly exploring different regions of the search space.
A genetic algorithm that does not take minimum support and confidence into account extracts the rules that have the best correlation between support and confidence.
The improved niched Pareto genetic algorithm (INPGA) selects accurate candidates and also saves selection time by combining BNPGA and SDNPGA.
The genetic relation algorithm (GRA) is introduced with a new operator called guided mutation. GRA considers the correlation coefficient between nodes in each individual.
-
Existing Work contd.
Mining ARs Based on Particle Swarm Optimization
A novel algorithm for association rule mining improves computational efficiency and automatically determines suitable threshold values.
An algorithm operating at three evolution levels presents an adaptive inertia weight. A safety distance is introduced to move the particle through its current position and the proximity index.
A self-adaptive method adjusts the inertia weight of the velocity update rule based on empirical values, and a negative feedback technique is introduced, which relieves the burden of specifying parameter values.
A hybrid method combines Particle Swarm Optimization (PSO) and Genetic Algorithms (GAs), using fuzzy logic to integrate the results of both methods and to tune parameters. It combines the advantages of PSO and GA to give an improved FPSO + FGA hybrid approach.
-
Work Done So Far
Association rule mining was carried out using a Genetic Algorithm in Matlab 2008a.
Association rule mining was carried out using a Self Adaptive Genetic Algorithm in Java.
The GA parameters were varied and the results were recorded for each case.
-
Mining ARs using GA in Matlab 2008a
Methodology
Selection : Tournament
Crossover Probability : Fixed (tested with 3 values)
Mutation Probability : No mutation
Fitness Function :
Dataset : Lenses, Iris, Haberman from the UCI (UC Irvine) repository
Population : Fixed (tested with 3 values)
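The combination listed here (tournament selection, a fixed crossover probability, no mutation) can be sketched roughly as follows. This is an illustration in Python, not the Matlab code used in the work. The bit-list chromosome, the tournament size k = 2, the single-point crossover and the toy fitness (sum of bits) are all assumptions, since the slide gives neither the rule encoding nor the fitness function.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Return the fittest of k individuals drawn at random without replacement."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)

def crossover(parent_a, parent_b, pc, rng):
    """Single-point crossover with probability pc; otherwise return copies of the parents."""
    if rng.random() >= pc or len(parent_a) < 2:
        return parent_a[:], parent_b[:]
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the chromosome
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def next_generation(population, fitness, pc, rng, k=2):
    """Build a same-sized generation by selection and crossover only (no mutation)."""
    children = []
    while len(children) < len(population):
        a = tournament_select(population, fitness, k, rng)
        b = tournament_select(population, fitness, k, rng)
        children.extend(crossover(a, b, pc, rng))
    return children[:len(population)]

rng = random.Random(0)
toy_fitness = sum  # placeholder only: the slide's fitness function is not given
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
offspring = next_generation(population, toy_fitness, pc=0.5, rng=rng)
```

Without mutation, crossover can only recombine genes already present in the initial population, which is one reason the population size matters in this setup.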
-
Flow chart of the GA
-
Results Analysis
Comparison based on variation in population size
Dataset     No. of Instances                No. of Instances * 1.25         No. of Instances * 1.5
            Accuracy %  No. of Generations  Accuracy %  No. of Generations  Accuracy %  No. of Generations
Lenses      75          7                   82          12                  95          17
Haberman    71          114                 68          88                  64          70
Iris        77          88                  87          53                  82          45
Comparison based on variation in minimum support and confidence
Dataset     Sup = 0.4 & Con = 0.4     Sup = 0.9 & Con = 0.9     Sup = 0.9 & Con = 0.2     Sup = 0.2 & Con = 0.9
            Accuracy %  No. of Gen.   Accuracy %  No. of Gen.   Accuracy %  No. of Gen.   Accuracy %  No. of Gen.
Lenses      22          20            49          11            70          21            95          18
Haberman    45          68            58          83            71          90            62          75
Iris        40          28            59          37            78          48            87          55
-
Comparison based on variation in crossover probability
Dataset     Pc = 0.25                       Pc = 0.5                        Pc = 0.75
            Accuracy %  No. of Generations  Accuracy %  No. of Generations  Accuracy %  No. of Generations
Lenses      95          8                   95          16                  95          13
Haberman    69          77                  71          83                  70          80
Iris        84          45                  86          51                  87          55
Comparison of the optimum values of parameters for maximum accuracy achieved
Dataset     No. of Instances  No. of Attributes  Population Size  Minimum Support  Minimum Confidence  Crossover Rate  Accuracy in %
Lenses      24                4                  36               0.2              0.9                 0.25            95
Haberman    306               3                  306              0.9              0.2                 0.5             71
Iris        150               5                  225              0.2              0.9                 0.75            87
-
Inferences
The values of minimum support, minimum confidence and population size determine the accuracy of the system more than the other GA parameters.
The crossover rate affects the convergence rate rather than the accuracy of the system.
The optimum values of the GA parameters vary from dataset to dataset, and the fitness function plays a major role in optimizing the results.
The size of the dataset and the relationships between attributes in the data contribute to the setting of the parameters.
-
Mining ARs using Self Adaptive GA in Java
Methodology
Selection : Roulette Wheel
Crossover Probability : Fixed (tested with 3 values)
Mutation Probability : Self Adaptive
Fitness Function :
Dataset : Lenses, Iris, Car from the UCI (UC Irvine) repository
Population : Fixed (tested with 3 values)
-
Procedure SAGA
Begin
  Initialize population p(k);
  Define the crossover and mutation rate;
  Do
  {
    Do
    {
      Calculate support of all k rules;
      Calculate confidence of all k rules;
      Obtain fitness;
      Select individuals for crossover / mutation;
      Calculate the average fitness of the nth and (n-1)th generation;
      Calculate the maximum fitness of the nth and (n-1)th generation;
      Based on the fitness of the selected item, calculate the new crossover and mutation rate;
      Choose the operation to be performed;
    } k times;
  }
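The procedure does not state the formula used to recompute the rates, so the following Python sketch is only an illustration under an assumed rule, not the rule used in SAGA. It uses the average and maximum fitness of the current generation only: individuals at or above the average get a rate that shrinks as they approach the generation maximum, and below-average individuals get the full ceiling rate. The function name adaptive_rate and the ceiling value are hypothetical.

```python
def adaptive_rate(fitness, f_avg, f_max, ceiling):
    """Assumed rule: return an operator rate in [0, ceiling] for one individual."""
    if fitness < f_avg or f_max == f_avg:
        # Below-average individuals (or a fully converged population) get the full rate.
        return ceiling
    # Scale the rate down linearly as fitness approaches the generation maximum.
    return ceiling * (f_max - fitness) / (f_max - f_avg)

# Hypothetical fitness values for one generation.
generation_fitness = [0.2, 0.5, 0.8, 0.9]
f_avg = sum(generation_fitness) / len(generation_fitness)
f_max = max(generation_fitness)
mutation_rates = [adaptive_rate(f, f_avg, f_max, ceiling=0.5)
                  for f in generation_fitness]
```

Under this assumed rule the best individual is never mutated, while weak individuals are disrupted most, which is one way an adaptive scheme can protect good rules while keeping diversity.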
-
Self Adaptive GA
[Figure labeled SELF ADAPTIVE; no further text recoverable]
-
Results Analysis
Accuracy comparison between GA and SAGA when parameters are ideal for traditional GA
Dataset         Traditional GA                  Self Adaptive GA
                Accuracy    No. of Generations  Accuracy    No. of Generations
Lenses          75          38                  87.5        35
Haberman        52          36                  68          28
Car Evaluation  85          29                  96          21
Accuracy comparison between GA and SAGA when parameters are according to termination of SAGA
Dataset         Traditional GA                  Self Adaptive GA
                Accuracy    No. of Generations  Accuracy    No. of Generations
Lenses          50          35                  87.5        35
Haberman        36          38                  68          28
Car Evaluation  74          36                  96          21
-
Inferences
Better accuracy.
Better convergence.
Self Adaptive GA gives better accuracy than
Traditional GA.
-
Proposed Work
1. To implement a distributed niched Pareto memetic algorithm for rule mining.
2. To propose an association rule mining algorithm based on chaotic PSO and swarm intelligence.
3. To propose a particle swarm optimization rule mining methodology combined with quantum computing and quantum differential evolution.
-
Niched Pareto Selection Algorithm
Obtain the comparison set S from clustering-based samples.
For any two candidates and the comparison set S, if one candidate is dominated and the other is not, the non-dominated candidate is selected. Exit.
If the two candidates (cd_1 and cd_2) are both dominated or both non-dominated, compute the number of samples in the two niches, count1 and count2. If count1 = 0, cd_1 is selected, and if count2 = 0, cd_2 is selected. Exit.
If count1 - count2 > delta, cd_2 is selected; if count2 - count1 > delta, cd_1 is selected. Exit.
If abs(count1-count2) [...] sd2, cd_1 is selected; otherwise, cd_2 is selected. Exit.
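The selection steps of this slide can be sketched in Python as below. This is a hedged illustration, not the authors' implementation. It assumes each candidate is a tuple of objectives to be maximized (for example support and confidence), that a niche count is the number of samples within a fixed radius of the candidate, and, because the last tie-break condition is not legible in the source, that the final step falls back to a random choice. All names and parameter values are hypothetical.

```python
import math
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def niche_count(candidate, samples, radius):
    """Number of samples within radius of the candidate in objective space."""
    return sum(1 for s in samples if math.dist(candidate, s) <= radius)

def niched_pareto_select(cd_1, cd_2, comparison_set, samples, radius, delta, rng=random):
    dominated_1 = any(dominates(s, cd_1) for s in comparison_set)
    dominated_2 = any(dominates(s, cd_2) for s in comparison_set)
    if dominated_1 != dominated_2:
        # Exactly one candidate is dominated: the other one wins.
        return cd_2 if dominated_1 else cd_1
    # Tie on dominance: prefer the candidate sitting in the less crowded niche.
    count1 = niche_count(cd_1, samples, radius)
    count2 = niche_count(cd_2, samples, radius)
    if count1 == 0:
        return cd_1
    if count2 == 0:
        return cd_2
    if count1 - count2 > delta:
        return cd_2
    if count2 - count1 > delta:
        return cd_1
    # Assumption: the slide's final tie-break is unreadable, so choose at random.
    return rng.choice([cd_1, cd_2])

# Hypothetical (support, confidence) values.
comparison_set = [(0.6, 0.6)]
samples = [(0.70, 0.79), (0.71, 0.80), (0.69, 0.81)]
winner_a = niched_pareto_select((0.7, 0.8), (0.5, 0.5), comparison_set, samples, radius=0.1, delta=1)
winner_b = niched_pareto_select((0.7, 0.8), (0.9, 0.3), comparison_set, samples, radius=0.1, delta=1)
```

In the first call the second candidate is dominated by the comparison set, so the first wins outright; in the second call neither is dominated and the candidate with the empty niche is preferred.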
-
Distributed Model
[Diagram labels: Full Dataset; GA1, GA2, GA3 and GA4 subpopulations; Rules Generated (four boxes); Concept Description]
-
Association Rule Mining Algorithm based on Chaotic PSO and Swarm Intelligence
[Diagram labels: Based on chaotic maps; Swarm Intelligence Concept]
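The slide says only that the algorithm is based on chaotic maps. One common way to bring a chaotic map into PSO, shown here purely as an assumption about what such a design might look like, is to draw the coefficients r1 and r2 of the standard velocity update from a logistic map instead of a uniform random generator. The function names, the inertia weight w and the acceleration constants c1 and c2 below are hypothetical values chosen for the sketch.

```python
def logistic_map(x):
    """One step of the logistic map with r = 4, which is chaotic on (0, 1)."""
    return 4.0 * x * (1.0 - x)

def chaotic_pso_step(position, velocity, personal_best, global_best,
                     chaos, w=0.7, c1=2.0, c2=2.0):
    """Update one particle; r1 and r2 come from the logistic map, not a uniform RNG."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        chaos = logistic_map(chaos)
        r1 = chaos
        chaos = logistic_map(chaos)
        r2 = chaos
        # Standard PSO velocity update: inertia + cognitive pull + social pull.
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity, chaos

# Seed the map away from 0, 0.25, 0.5, 0.75 and 1, which collapse to fixed points.
new_pos, new_vel, chaos_state = chaotic_pso_step(
    position=[0.0], velocity=[0.0],
    personal_best=[1.0], global_best=[1.0], chaos=0.3)
```

The chaos state is returned so that successive calls continue the same chaotic sequence rather than restarting it.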
-
Execution Plan
July : Niched Pareto sampling based selection. Implementing GA for local intensity search.
August : Distributed methodology implementation. Preparing the above work as a paper.
September & October : Particle Swarm Optimization based rule mining to be implemented.
November : Chaotic PSO and swarm intelligence based PSO for mining ARs to be implemented. Documenting the same in a paper.
December & January : Study of quantum computing and differential evolution concepts.
-
Papers Published
Paper titled "Framework for Comparison of Association Rule Mining Using Genetic Algorithm" was presented at the International Conference on Computers, Communication & Intelligence at VCET, 2010.
Paper titled "Mining Association Rules Using Genetic Algorithm: The Role of Estimation Parameters" has been selected for presentation at the International Conference on Advances in Computing and Communications, 2011. To be published in the Springer LNCS (CCIS) series.
Paper titled "Rule Acquisition in Data Mining Using a Self Adaptive Genetic Algorithm" has been selected for presentation at the First International Conference on Computer Science and Information Technology (CCSEIT-2011). To be published in the Springer LNCS (CCIS) series.
-
References
Jing Li, Han Rui-feng, "A Self-Adaptive Genetic Algorithm Based On Real-Coded", International Conference on Biomedical Engineering and Computer Science, Page(s): 1-4, 2010.
Chuan-Kang Ting, Wei-Ming Zeng, Tzu-Chieh Lin, "Linkage Discovery through Data Mining", IEEE Magazine on Computational Intelligence, Volume 5, February 2010.
Caises, Y., Leyva, E., Gonzalez, A., Perez, R., "An Extension of the Genetic Iterative Approach for Learning Rule Subsets", 4th International Workshop on Genetic and Evolutionary Fuzzy Systems, Page(s): 63-67, 2010.
Shangping Dai, Li Gao, Qiang Zhu, Changwu Zhu, "A Novel Genetic Algorithm Based on Image Databases for Mining Association Rules", 6th IEEE/ACIS International Conference on Computer and Information Science, Page(s): 977-980, 2007.
Peregrin, A., Rodriguez, M.A., "Efficient Distributed Genetic Algorithm for Rule Extraction", Eighth International Conference on Hybrid Intelligent Systems, HIS '08, Page(s): 531-536, 2008.
Mansoori, E.G., Zolghadri, M.J., Katebi, S.D., "SGERD: A Steady-State Genetic Algorithm for Extracting Fuzzy Classification Rules From Data", IEEE Transactions on Fuzzy Systems, Volume: 16, Issue: 4, Page(s): 1061-1071, 2008.
Xiaoyuan Zhu, Yongquan Yu, Xueyan Guo, "Genetic Algorithm Based on Evolution Strategy and the Application in Data Mining", First International Workshop on Education Technology and Computer Science, ETCS '09, Volume: 1, Page(s): 848-852, 2009.
Hong Guo, Ya Zhou, "An Algorithm for Mining Association Rules Based on Improved Genetic Algorithm and its Application", 3rd International Conference on Genetic and Evolutionary Computing, WGEC '09, Page(s): 117-120, 2009.
Genxiang Zhang, Haishan Chen, "Immune Optimization Based Genetic Algorithm for Incremental Association Rules Mining", International Conference on Artificial Intelligence and Computational Intelligence, AICI '09, Volume: 4, Page(s): 341-345, 2009.
Maria J. Del Jesus, Jose A. Gamez, Pedro Gonzalez, Jose M. Puerta, "On the Discovery of Association Rules by means of Evolutionary Algorithms", Advanced Review, John Wiley & Sons, Inc., 2011.
Junli Lu, Fan Yang, Momo Li, Lizhen Wang, "Multi-objective Rule Discovery Using the Improved Niched Pareto Genetic Algorithm", Third International Conference on Measuring Technology and Mechatronics Automation, 2011.
Hamid Reza Qodmanan, Mahdi Nasiri, Behrouz Minaei-Bidgoli, "Multi Objective Association Rule Mining with Genetic Algorithm without Specifying Minimum Support and Minimum Confidence", Expert Systems with Applications 38 (2011) 288-298.
Miguel Rodriguez, Diego M. Escalante, Antonio Peregrin, "Efficient Distributed Genetic Algorithm for Rule Extraction", Applied Soft Computing 11 (2011) 733-743.
J.H. Ang, K.C. Tan, A.A. Mamun, "An Evolutionary Memetic Algorithm for Rule Extraction", Expert Systems with Applications 37 (2010) 1302-1315.
R.J. Kuo, C.M. Chao, Y.T. Chiu, "Application of Particle Swarm Optimization to Association Rule Mining", Applied Soft Computing 11 (2011) 326-336.
Bilal Alatas, Erhan Akin, "Multi-objective Rule Mining Using a Chaotic Particle Swarm Optimization Algorithm", Knowledge-Based Systems 22 (2009) 455-460.
Mourad Ykhlef, "A Quantum Swarm Evolutionary Algorithm for Mining Association Rules in Large Databases", Journal of King Saud University - Computer and Information Sciences (2011) 23, 1-6.
Haijun Su, Yupu Yang, Liang Zhao, "Classification Rule Discovery with DE/QDE Algorithm", Expert Systems with Applications 37 (2010) 1216-1222.
Yan Chen, Shingo Mabu, Kotaro Hirasawa, "Genetic Relation Algorithm with Guided Mutation for the Large-scale Portfolio Optimization", Expert Systems with Applications 38 (2011) 3353-3363.
Yamina Mohamed Ben Ali, "Soft Adaptive Particle Swarm Algorithm for Large Scale Optimization", IEEE, 2010.
Feng Lu, Yanfeng Ge, LiQun Gao, "Self-adaptive Particle Swarm Optimization Algorithm for Global Optimization", Sixth International Conference on Natural Computation (ICNC 2010), 2010.
Fevrier Valdez, Patricia Melin, Oscar Castillo, "An Improved Evolutionary Method with Fuzzy Logic for Combining Particle Swarm Optimization and Genetic Algorithms", Applied Soft Computing 11 (2011) 2625-2632.
-
Thank You