Application of KDD & Its Future Scope
Tanmay Sethi
IT 4th Year
08220803111
What is Data Mining?
Data mining [1] refers to the extraction of hidden information from large databases. Data mining and KDD (knowledge discovery in databases) are often used interchangeably, but data mining is just one step in the KDD process.
What is KDD?
KDD [1] refers to the broad process of finding knowledge in data, and it emphasizes the "high-level" application of particular data mining methods.
NEED OF KDD
Traditionally, data was turned into knowledge through manual analysis and interpretation. For example, health care centers may make resources available according to the disease that is currently spreading.
With the increased size of databases, however, the traditional method has become impractical.
For example, suppose a database contains N records, each with M attributes.
Manual analysis is a waste of time and energy when N and M are large, say 100 or more.
KDD PROCESS
KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.
The term process implies that KDD comprises many steps, which involve data preparation, search for patterns, knowledge evaluation, and refinement.
By nontrivial, we mean that some search or inference is involved; that is, it is not a straightforward computation of predefined quantities like computing the average value of a set of numbers.
Steps in the KDD Process [2]
STEPS
1. Understanding Goals
2. Data Selection
3. Data Preprocessing
4. Data Mining
5. Pattern Recognition
6. Interpretation / Evaluation
7. Knowledge Discovery
Figure 1-KDD Process
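The chain of steps can be illustrated with a toy sketch in Python. Everything in it is an assumption made for illustration: the student records, the field names, the function names, and the use of simple frequency counting as the mining step. It is not taken from the slides or from [2].

```python
from collections import Counter

def select(records, fields):
    """Data selection: keep only the attributes relevant to the goal."""
    return [{f: r.get(f) for f in fields} for r in records]

def preprocess(records):
    """Data preprocessing: drop records that have a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def mine(records, fields):
    """Data mining: measure how often each combination of values occurs."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return {combo: n / len(records) for combo, n in counts.items()}

def evaluate(patterns, min_support):
    """Interpretation / evaluation: keep only sufficiently frequent patterns."""
    return {combo: s for combo, s in patterns.items() if s >= min_support}

# Hypothetical student records (not taken from the slides).
raw = [
    {"name": "A", "branch": "IT", "result": "pass"},
    {"name": "B", "branch": "IT", "result": "pass"},
    {"name": "C", "branch": "CS", "result": None},   # incomplete record
    {"name": "D", "branch": "CS", "result": "fail"},
    {"name": "E", "branch": "IT", "result": "pass"},
]

fields = ["branch", "result"]
clean = preprocess(select(raw, fields))
patterns = mine(clean, fields)
knowledge = evaluate(patterns, min_support=0.5)
print(knowledge)
```

Each function stands for one stage, so a different mining method could be swapped in without touching selection or preprocessing.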
KNOWLEDGE DISCOVERY IN THE REAL WORLD[3]
There is a wide range of applications of KDD in the real world, including business, artificial intelligence, health care, science, marketing, finance, fraud detection, manufacturing, telecommunications, and Internet agents.
APPLICATIONS IN EDUCATIONAL INSTITUTES/SCHOOLS
GROUPING OF STUDENTS
Clustering is used to group similar students into a cluster.
Figure 2-Cluster of students
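As a rough illustration of grouping students, here is a minimal k-means sketch in plain Python. The slides do not name a clustering algorithm, so k-means is an assumed choice, and the (attendance, marks) pairs are hypothetical data invented for this example.

```python
def kmeans(points, k, iterations=10):
    """Very small k-means: returns (centroids, clusters)."""
    # Naive initialisation (an assumption): use the first k points.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: put each point in the cluster of its nearest centroid.
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (attendance %, marks %) pairs for six students.
students = [(90, 85), (40, 35), (88, 80), (45, 30), (92, 90), (38, 40)]

centroids, clusters = kmeans(students, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

With this data the students with high attendance and marks end up in one cluster and the remaining students in the other.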
PREDICTING THE REGISTRATION OF STUDENTS IN AN EDUCATIONAL PROGRAMME
Classification and prediction are used for better assessment, evaluation, planning, and decision making in universities and schools, so that they can allocate resources more effectively.
Figure 3-Prediction of the number of girls this year
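One simple way to produce such a prediction is to fit a straight-line trend to past registration counts. The sketch below uses ordinary least squares in plain Python. The technique is an assumption (the slides only say "classification and prediction"), and the yearly counts are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical registration counts for past years (not from Figure 3).
years = [2009, 2010, 2011, 2012]
registrations = [100, 110, 120, 130]

slope, intercept = fit_line(years, registrations)
predicted = slope * 2013 + intercept
print(f"Predicted registrations for 2013: {predicted:.0f}")
```

A linear trend is only reasonable when past counts change steadily; real enrolment data may need a richer model.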
PREDICTING STUDENT'S PERFORMANCE
Decision trees and classification help an instructor assess the quality of students by conducting an online discussion among a group of students and using indicators such as the time difference between posts, the frequency distribution of the postings, and the duration between postings and replies.
DETECTING CHEATING IN AN EXAMINATION
Examinations are useful for evaluating students' knowledge.
The models generated use data comprising different students' personalities and the common practices students use to cheat in order to obtain a better grade on these exams.
Figure 4-Assessing the quality of students
IDENTIFYING ABNORMAL/ERRONEOUS VALUES
The data stored in databases may contain abnormal, erroneous, incomplete, or exceptional values, which may confuse the analysis process. As a result, the accuracy of the discovered patterns can be poor.
Abnormalities in a student's marks may be due to a software fault, data entry operator negligence, or an extraordinary performance by the student in a particular subject.
Figure 5-Abnormal result of a student
The student with roll no 104 will be detected as an exception in subject 4.
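A hedged sketch of how such an exception might be flagged automatically: for each subject, compare every mark with the subject median and flag marks that deviate far more than is typical. The median-based score (a modified z-score with the conventional 3.5 cutoff) is an assumed technique, and the marks are hypothetical values, not the ones in Figure 5.

```python
from statistics import median

def find_exceptions(marks_by_subject, threshold=3.5):
    """Return (subject, roll_no) pairs whose mark is far from the subject median."""
    exceptions = []
    for subject, marks in marks_by_subject.items():
        centre = median(marks.values())
        # Median absolute deviation: the typical distance from the median.
        mad = median(abs(m - centre) for m in marks.values())
        if mad == 0:
            continue  # all marks are (almost) identical, nothing to compare
        for roll_no, mark in marks.items():
            score = 0.6745 * abs(mark - centre) / mad
            if score > threshold:
                exceptions.append((subject, roll_no))
    return exceptions

# Hypothetical marks keyed by roll number (not the values in Figure 5).
marks_by_subject = {
    "subject 1": {101: 70, 102: 65, 103: 72, 104: 68, 105: 66, 106: 71},
    "subject 4": {101: 62, 102: 58, 103: 65, 104: 5, 105: 60, 106: 63},
}

exceptions = find_exceptions(marks_by_subject)
print(exceptions)
```

The score uses absolute deviation, so an unusually high mark would be flagged in the same way as an unusually low one. A flagged value still needs human review to decide whether it is an error or a genuine result.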
FUTURE SCOPE OF KDD [4]
Although no human being can foretell the future, we believe that there are
plenty of interesting new challenges ahead of us, and quite a few of them
cannot be foreseen at the current point in time. Here we describe one area of
future scope for KDD: computer chess.
EDUCATIONAL CHESS PROGRAMS
There could be a program that analyzes a certain position or an entire game on an abstract strategic level, tries to understand your opponent's and your own plans, and provides suggestions on different ways to proceed.
Figure 6-Educational chess programs
TOURNAMENT PREPARATION
Another possible application of KDD in chess is to let players know their strengths and weaknesses by providing statistics of their wins and losses depending on the opening move or some other specific move.
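Statistics of this kind can be computed by grouping a player's games by opening move. The sketch below is illustrative only: the game list, the result labels, and the function name opening_statistics are assumptions, not part of [4].

```python
from collections import defaultdict

def opening_statistics(games):
    """Count games, wins and losses per opening move and add a win rate."""
    stats = defaultdict(lambda: {"games": 0, "wins": 0, "losses": 0})
    for opening, result in games:
        entry = stats[opening]
        entry["games"] += 1
        if result == "win":
            entry["wins"] += 1
        elif result == "loss":
            entry["losses"] += 1
        # Any other result (for example a draw) only counts as a game played.
    return {
        opening: dict(entry, win_rate=entry["wins"] / entry["games"])
        for opening, entry in stats.items()
    }

# Hypothetical game history: (opening move, result) pairs.
games = [
    ("e4", "win"), ("e4", "loss"), ("e4", "win"),
    ("d4", "loss"), ("d4", "loss"),
    ("c4", "win"),
]

stats = opening_statistics(games)
weakest = min(stats, key=lambda opening: stats[opening]["win_rate"])
for opening, entry in stats.items():
    print(opening, entry)
print("Weakest opening:", weakest)
```

With only a handful of games per opening, win rates are noisy, so a real tool would also need to consider the number of games behind each figure.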
INCREASING PLAYING STRENGTH
Incorporating additional knowledge into computer chess programs can lead to significant increases in playing strength.
In figure 7, in a particular situation, the FILTZ chess game algorithm gave another explanation which was much more complex than what a normal human being would do in such a situation.
Figure 7-Zugzwang situation
REFERENCES
[1] Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", Wiley, pp. 400–402, 2010.
[2] Oded Maimon, Lior Rokach, "Introduction to Knowledge Discovery in Databases", Department of Industrial Engineering, 2012.
[3] Manoj Bala, "Study of Applications of Data Mining Techniques in Education", Vol. No. 1, Issue No. IV, Jan-Mar, (IJRST), 2012.
[4] Johannes Fürnkranz, "Knowledge Discovery in Chess Databases", Austrian Research Institute for Artificial Intelligence, 2001.