![Page 1](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/1.jpg)
Third Colloquium:
Application of Data Mining in Education
SITI KHADIJAH MOHAMAD
FACULTY OF EDUCATION
APRIL 10 & 11, 2018
![Page 2](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/2.jpg)
1. Introduction: Data Mining, Software, RQs
![Page 3](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/3.jpg)
Data Mining
Data mining is a technique used to discover patterns in data and gain knowledge from them.
Machine learning provides the algorithms used in data mining.
Types of DM techniques: decision trees, association rules, clustering, etc.
Supervised and Unsupervised Learning?
Cross validation?
![Page 4](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/4.jpg)
Software
Examples: WEKA, Microsoft SQL Server 2008, RapidMiner, Clementine, R
WEKA download: http://www.cs.waikato.ac.nz/ml/weka/
WEKA supported platforms: Linux, Windows, Mac OS
WEKA created by: researchers at the University of Waikato, New Zealand
![Page 5](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/5.jpg)
Research Question
Association rules, clustering, and decision trees are NOT cause-and-effect analyses.
They are relationship analyses.
Examples of RQs:
1. To develop a decision tree model that can predict students’ performance based on the mechanisms of metacognitive scaffolding prompted by the instructor in Facebook discussions.
2. To formulate learning performance pathways based on reflective thinking and types of feedback through educational blogging.
3. How do the provision of feedback and reflective thinking shape the reflection process through educational blogging?
4. To develop deaf students’ learning patterns when using an e-learning environment to study Nuclear Energy.
![Page 6](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/6.jpg)
Decision Tree
• This example relates lifestyle to heart disease.
• The attributes are Age, Smoker (y/n), and Diet (good/poor), with a label Risk (Less Risk/More Risk).
• The biggest influence on Risk turns out to be the Smoker attribute.
• Smoker therefore becomes the first branch in the tree.
• For smokers, the next most influential attribute is Age; for non-smokers, the data indicate that Diet has a bigger influence on risk.
• The tree keeps branching into two nodes at each split until a classification is reached.
• A decision tree can be a great way to visualize how a decision is derived from the attributes in your data.
Credit to: refactorthis.net
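The tree described on this slide can be sketched as a few nested conditions. This is only an illustration: the slide does not give the age cut-off or say which leaf maps to which risk label, so the `age_threshold` value and the leaf outcomes below are assumptions, and `classify_risk` is a hypothetical name.

```python
def classify_risk(age, smoker, diet, age_threshold=50):
    """Walk a hand-built tree: Smoker first, then Age (smokers) or Diet (non-smokers).

    age_threshold and the leaf labels are assumed values for illustration only.
    """
    if smoker:
        # Smokers: Age is the next split.
        return "More Risk" if age >= age_threshold else "Less Risk"
    # Non-smokers: Diet is the next split.
    return "More Risk" if diet == "poor" else "Less Risk"


print(classify_risk(age=62, smoker=True, diet="good"))
print(classify_risk(age=30, smoker=False, diet="good"))
```

In practice a tool such as WEKA learns the split order and thresholds from the data rather than having them written by hand.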
![Page 7](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/7.jpg)
Association Rules
Q1 Q2 T1 conf: (1)
Q7 T3 conf: (0.92)
T2 Q2 conf: (0.5)
Support (coverage) and Confidence (accuracy)
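As a rough sketch of how support and confidence are computed: support is the fraction of records that contain an itemset, and confidence is the support of the whole rule divided by the support of its left-hand side. The item names echo the slide, but the transactions and the rule directions below are made up for illustration, and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Hypothetical records: each set lists the items observed together.
transactions = [
    {"Q1", "Q2", "T1"},
    {"Q1", "Q2", "T1"},
    {"Q2", "T2"},
    {"T2"},
]

print(support(transactions, {"Q1", "Q2"}))             # 0.5
print(confidence(transactions, {"Q1", "Q2"}, {"T1"}))  # 1.0
print(confidence(transactions, {"T2"}, {"Q2"}))        # 0.5
```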
![Page 8](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/8.jpg)
Clustering
Credit to: Almodiel
![Page 9](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/9.jpg)
2. WEKA Workbench
![Page 10](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/10.jpg)
WEKA Workbench (1)
Performance Comparison
Graphical Interface
Classifiers
Command-line Interface
![Page 11](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/11.jpg)
WEKA Workbench (2)
Supply data here
Details of the data
• Attributes == Variables
• Instances == Number of samples
Preprocess Tab
![Page 12](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/12.jpg)
WEKA Workbench (3)
Classify Tab (also known as postprocessing tab)
4 options to classify the data
Results panel
Lists of algorithms
Right click here to view the tree
![Page 13](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/13.jpg)
![Page 14](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/14.jpg)
Application & Interpretation
What Do Precision and Recall Tell Us?
Precision: Of all the instances predicted as a given class X, how many were correctly predicted?
Recall: Of all the instances that should have the label X, how many were correctly captured?
Suppose a computer program for recognizing dogs in scenes from a video identifies 7 dogs in a scene containing 9 dogs and some cats. If 4 of the identifications are correct, but 3 are actually cats, the program's precision is 4/7 while its recall is 4/9.
True Positives and True Negatives: correct classifications
False Positives: the outcome is incorrectly predicted as yes when it is actually no
False Negatives: the outcome is incorrectly predicted as no when it is actually yes
Credit to: Wikipedia
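The dog example above can be checked with a few lines of code. This is a minimal sketch; the function name `precision_recall` is a hypothetical helper, and the counts come straight from the example: 4 correct identifications (true positives), 3 cats mistaken for dogs (false positives), and 9 - 4 = 5 dogs that were missed (false negatives).

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# 7 identifications: 4 correct, 3 are cats; 9 dogs in the scene, so 5 are missed.
p, r = precision_recall(tp=4, fp=3, fn=9 - 4)
print(f"precision = {p:.3f}, recall = {r:.3f}")  # precision = 0.571, recall = 0.444
```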
![Page 15](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/15.jpg)
Application & Interpretation

| Actual \ Predicted | a | b | c | Total |
|---|---|---|---|---|
| a | 10 | 1 | 1 | 12 |
| b | 2 | 0 | 1 | 3 |
| c | 1 | 0 | 0 | 1 |
| Total | 13 | 1 | 2 | 16 |

Calculate Recall for Class A:
= TP_A / (TP_A + FN_A)
= 10 / (10 + 2)
= 0.833

Calculate Precision for Class A:
= TP_A / (TP_A + FP_A)
= 10 / (10 + 3)
= 0.769
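The same calculation can be repeated for every class directly from the confusion matrix above. The sketch below assumes rows are actual classes and columns are predicted classes, as on the slide. The helper name `per_class_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
labels = ["a", "b", "c"]
# Rows are actual classes, columns are predicted classes (values from the slide).
matrix = [
    [10, 1, 1],
    [2, 0, 1],
    [1, 0, 0],
]


def per_class_metrics(matrix, k):
    """Return (precision, recall) for the class at index k."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of the actual-class row
    fp = sum(row[k] for row in matrix) - tp  # rest of the predicted-class column
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for k, label in enumerate(labels):
    p, r = per_class_metrics(matrix, k)
    print(f"class {label}: precision={p:.3f} recall={r:.3f}")
```

For class a this reproduces the values worked out by hand: precision 0.769 and recall 0.833.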
![Page 16](https://reader034.vdocuments.net/reader034/viewer/2022042110/5e8b2d23281c4022304c5c67/html5/thumbnails/16.jpg)
Thank You! Questions?