Data Mining and Soft Computing
Francisco Herrera
Research Group on Soft Computing and Intelligent Information Systems (SCI2S)
Dept. of Computer Science and A.I., University of Granada, Spain
Email: [email protected]
http://sci2s.ugr.es
http://decsai.ugr.es/~herrera
In this course:
We will introduce the Data Mining and Knowledge Discovery area: steps, tasks, challenges ...
We will introduce Soft Computing techniques: Fuzzy Logic, Genetic Algorithms, ...
... and we will present the use of Soft Computing techniques in Data Mining
Material of this course at: http://sci2s.ugr.es/docencia/asignatura.php?id_asignatura=14
Data mining:
Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases.
[Figure: the knowledge discovery process. Steps: Selection, Preprocessing & cleaning, Data Mining, Interpretation/Evaluation. Intermediate products: data, target data, processed data, patterns, knowledge.]
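The steps named in the figure can be sketched as a chain of small functions. This is only an illustrative sketch, not part of the course material: the function names (select, preprocess, mine, evaluate), the toy records and the "frequent attribute-value pair" notion of a pattern are all assumptions chosen to keep the example short.

```python
from collections import Counter

def select(records, fields):
    """Selection: keep only the fields relevant to the analysis (target data)."""
    return [{f: r.get(f) for f in fields} for r in records]

def preprocess(records):
    """Preprocessing & cleaning: here, simply drop records with missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def mine(records, min_support=0.5):
    """Data mining: keep attribute-value pairs whose relative frequency
    (support) is at least min_support. A deliberately simple pattern type."""
    n = len(records)
    if n == 0:
        return {}
    counts = Counter((k, v) for r in records for k, v in r.items())
    return {item: c / n for item, c in counts.items() if c / n >= min_support}

def evaluate(patterns):
    """Interpretation/evaluation: rank the patterns by support, highest first."""
    return sorted(patterns.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data, invented for this example.
raw = [
    {"id": 1, "weather": "sunny", "play": "yes"},
    {"id": 2, "weather": "sunny", "play": "yes"},
    {"id": 3, "weather": "rainy", "play": None},
    {"id": 4, "weather": "rainy", "play": "no"},
]

target = select(raw, ["weather", "play"])    # data -> target data
processed = preprocess(target)               # target data -> processed data
patterns = mine(processed, min_support=0.5)  # processed data -> patterns
knowledge = evaluate(patterns)               # patterns -> knowledge

for (attribute, value), support in knowledge:
    print(f"{attribute}={value}: support {support:.2f}")
```

In a real study each step is far richer (for example, cleaning usually repairs or imputes values rather than only discarding records), but the flow from raw data to ranked patterns has the same shape.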
Data Mining
How can I analyze this data?
We have rich data, but poor information.
Data mining: searching for knowledge (interesting patterns) in your data.
J. Han, M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2006 (Second Edition)
Soft Computing
Soft computing refers to a collection of computational techniques in computer science, machine learning and some engineering disciplines, which study, model, and analyze very complex phenomena: those for which more conventional methods have not yielded low cost, analytic, and complete solutions.
Prof. Zadeh:
"...in contrast to traditional hard computing, soft computing exploits the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost, and better rapport with reality"
Lotfi A. Zadeh introduced "Fuzzy Logic" in 1965 and "Soft Computing" in 1992.
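The idea of "partial truth" in the quote above can be illustrated with a fuzzy membership function. The following is a minimal sketch under stated assumptions: the concept "warm", its temperature breakpoints (15, 25, 35) and the crisp interval (20 to 30) are invented for the example, and min, max and 1 - x are used as the connectives because they are the standard choice, not the only one.

```python
def triangular(x, a, b, c):
    """Triangular membership function with a < b < c:
    0 at or outside the ends a and c, rising linearly to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def crisp_warm(t):
    """Hard computing view: a temperature is either warm (1) or not (0)."""
    return 1.0 if 20.0 <= t <= 30.0 else 0.0

def fuzzy_warm(t):
    """Soft computing view: a temperature is warm to some degree in [0, 1].
    The breakpoints are assumptions made for this example."""
    return triangular(t, 15.0, 25.0, 35.0)

# Standard fuzzy connectives (minimum, maximum, complement).
def fuzzy_and(mu1, mu2):
    return min(mu1, mu2)

def fuzzy_or(mu1, mu2):
    return max(mu1, mu2)

def fuzzy_not(mu):
    return 1.0 - mu

for t in (15.0, 19.0, 20.0, 25.0, 32.0):
    print(f"t={t}: crisp={crisp_warm(t)}, fuzzy={fuzzy_warm(t):.2f}")
```

At 19 degrees the crisp set answers "not warm" while the fuzzy set answers "warm to degree 0.4", which is the kind of tolerance for imprecision the quote refers to.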
Computational Intelligence
The Field of Interest of the Society shall be the theory, design, application, and development of biologically and linguistically motivated computational paradigms emphasizing neural networks, connectionist systems, genetic algorithms, evolutionary programming, fuzzy systems, and hybrid intelligent systems in which these paradigms are contained.
Contents: Part I. Principles of Data Mining
Introduction to Data Mining and Knowledge Discovery
Data Preparation
Introduction to Prediction, Classification, Clustering and Association
Data Mining - From the Top 10 Algorithms to the New Challenges
Contents: Part II. Soft Computing Techniques in Data Mining
Introduction to Soft Computing. Focusing our attention in Fuzzy Logic and Evolutionary Computation
Soft Computing Techniques in Data Mining: Fuzzy Data Mining and Knowledge Extraction based on Evolutionary Learning
Genetic Fuzzy Systems: State of the Art and New Trends
Contents: Part III. Data Mining: Some Advanced Topics
Some Advanced Topics I: Classification with Imbalanced Data Sets
Some Advanced Topics II: Subgroup Discovery
Some Advanced Topics III: Data Complexity
Final talk: How must I Do my Experimental Study? Design of Experiments in Data Mining/Computational Intelligence. Using Non-parametric Tests. Some Cases of Study.
Bibliography
J. Han, M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2006 (Second Edition). http://www.cs.sfu.ca/~han/dmbook
I.H. Witten, E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Second Edition, Morgan Kaufmann, 2005. http://www.cs.waikato.ac.nz/~ml/weka/book.html
Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining (First Edition). Addison Wesley, May 2, 2005. http://www-users.cs.umn.edu/~kumar/dmbook/index.php
Dorian Pyle. Data Preparation for Data Mining. Morgan Kaufmann, Mar 15, 1999.
Mamdouh Refaat. Data Preparation for Data Mining Using SAS. Morgan Kaufmann, Sep. 29, 2006.
Summary
1. Introduction to Data Mining and Knowledge Discovery
2. Data Preparation
3. Introduction to Prediction, Classification, Clustering and Association
4. Data Mining - From the Top 10 Algorithms to the New Challenges
5. Introduction to Soft Computing. Focusing our attention in Fuzzy Logic and Evolutionary Computation
6. Soft Computing Techniques in Data Mining: Fuzzy Data Mining and Knowledge Extraction based on Evolutionary Learning
7. Genetic Fuzzy Systems: State of the Art and New Trends
8. Some Advanced Topics I: Classification with Imbalanced Data Sets
9. Some Advanced Topics II: Subgroup Discovery
10. Some Advanced Topics III: Data Complexity
11. Final talk: How must I Do my Experimental Study? Design of Experiments in Data Mining/Computational Intelligence. Using Non-parametric Tests. Some Cases of Study.