Mining Minds: Knowledge Maintenance Engine
• Maqbool Ali
KHU Member
• Byeong Ho Kang
UTAS Member
/ 2 Agenda
• Introduction
• Motivation
• Related Work
• Limitations
• Proposed Architecture
• Tools and Technologies
• Development Timeline
• Current Status
/ 3 Introduction
• Knowledge Management
• Key factor: evolution of knowledge
• Major challenge: knowledge maintenance
• To handle dynamic knowledge generation and maintenance
• Intelligent and effective system to provide better quality of service
• Feedback to enhance knowledge maintenance capabilities
http://www.journal.forces.gc.ca/vo4/no1/images/McIntyre-4-fig3-eng.gif
/ 4 Motivation
Challenges
• High Quality of Contents
• Change Management
• Data Inconsistency (updating, maintenance)
• Evolution of Knowledge (expert, feedback, learning)
/ 5 Related Work
KME phases | System / Study / Prototype | Features and Limitations

Knowledge Creation: Dimitriadis [1], Kwiatkowska [2], Bachman [3]
Features:
• Handle noisy, highly variable data
• Extract new knowledge
• Create effective set of decision rules
• Worked on time and space domain
Limitations:
• Used fixed machine learning methods (ZeroR, NaiveBayes, J48, SVM)
• Human is involved

Knowledge Maintenance: Bachman [3], S. Auer [4], K. Kaljurand [5], M. Afzal [6], R. Regier [7], J. Dinerstein [8]
Features:
• Maintain rules
• Semi-automatic maintenance
• Evidence support
Limitations:
• Single-level maintenance
• Manual maintenance
• Manual tuning
/ 6 Limitations
• Fixed machine learning method
• Manual rules generation
• Single-level maintenance
• Manual maintenance
• Manual rules tuning
/ 7 Proposed Architecture
Knowledge Maintenance Engine (components as labeled in the architecture figure):
• Query Builder: Query Formulation, Query Validation
• Knowledge Data Broker: Schema Filtration, Schema Validation, Broker Interface
• Knowledge Builder: Learner, Selector
• Data and Algorithm Characterization: Meta-Features Computation, Algorithm Selection
• Machine Learning Methods
• Data-Algorithm Training Data
• ML Algorithm Performance Evaluation
• Meta Features Computation
• Algorithm Selection Model Creation
• Algorithm Selection Models
• Confidence Level Checker: Coverage Analysis, Satisfaction Analysis, Functional Evaluation
• Rule Validation
• Rule Tuning
• Expert Authoring Interface: Rules Extractor, Concept Extractor, Editing Interface, Editor Mapper, Concept Repository
• Evidence Support: Evidence Searching, Query Generation, Evidence Presentation
• Evolutionary Knowledge Maintenance: Change Management, Inconsistency Detection, Mapping and Logging
Knowledge Bases: ServiceKB, InformationKB, DataKB
Other elements in the figure: Intermediate Data, HDFS Data Access Interface, Feedback Analysis, Recommendation, UI / UX, Human Expert
/ 8
Filtered Data {John, Over Eaten, Walk}
Features {Entropy, Std. Deviation, Mean}
Learned Data {x=john && y=normal, Run}
Algorithm space {SVM, ANN, NB, …, J48}
Selected Algorithm {J48}
Provide Feedback {hConfidence, Rule}
Update Rules {x=john && y=normal, Walk}
Structured Data {1,John,Normal,Chlos,Lunch,Walk}
Query {Concept1 and Concept2, …, and Conceptn}
Evidence List {Evidence1, Evidence2, …, Evidencen}
Concepts {Jogging, Run, Normal}
Rules {x=john && y=normal, Run}
Knowledge Maintenance | Knowledge Creation
Model Creation | Case-1 | Case-2 | Case-3
Filtered Data {Data1, Data2, Data3}
/ 9 Knowledge Creation
Model Creation | Model Execution
/ 10
User Id | Name | Health Condition | … | Activity
1 | John | Normal | … | Walk
2 | Alice | Abnormal | … | Run
… | … | … | … | …
User ID | Person | Age
1 | John | 35
2 | Alice | 29
… | … | …

User | Disease Level | Location
1 | 2 | Suwon
2 | 1 | Yongin
… | … | …

User ID | Condition | Activity
1 | Normal | Run
2 | Abnormal | Walk
… | … | …

ML Algorithm Performance Evaluation (ML Methods: SVM, J48, NB, …)
Dataset | Algorithm | Performance
Data 1 | SVM | 0.74
Data 1 | J48 | 0.86
Data 1 | NB | 0.80
… | … | …

Dataset | Algorithm
Data 1 | J48
Data 2 | SVM
Data 3 | NB
… | …

Meta Features Computation
Dataset | Entropy | Std. Dev. | Mean | …
Data 1 | 0.43 | 0.55 | 35.6 | …
Data 2 | 0.15 | 0.22 | 17.7 | …
Data 3 | 0.25 | 0.36 | 48.2 | …
… | … | … | … | …

Data-Algorithm Training Data
Dataset | Entropy | Std. Dev. | Mean | … | Algorithm
Data 1 | 0.43 | 0.55 | 35.6 | … | J48
Data 2 | 0.15 | 0.22 | 17.7 | … | SVM
Data 3 | 0.25 | 0.36 | 48.2 | … | NB
… | … | … | … | … | …

Algorithm Selection Models
If (Entropy <= 0.35 & Mean < 50) then Algorithm = J48
If (Entropy >= 0.35 & Mean > 50) then Algorithm = SVM
…
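The model-creation tables on this slide pair each dataset's meta-features with its best-scoring algorithm. A minimal Python sketch of that step follows. It rests on several assumptions: entropy is taken to be the Shannon entropy of the class labels, standard deviation and mean are computed over one numeric column, and the function names (`class_entropy`, `meta_features`, `best_algorithm`) are hypothetical. The slides do not specify how the features are computed. The scores 0.74, 0.86 and 0.80 are the Data 1 values from the slide; the two-row sample is only illustrative.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def class_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def meta_features(values, labels):
    """Meta-features of one dataset: label entropy, std. deviation and mean of a numeric column."""
    return {
        "entropy": class_entropy(labels),
        "std_dev": pstdev(values),
        "mean": mean(values),
    }


def best_algorithm(performance):
    """Name of the algorithm with the highest evaluation score."""
    return max(performance, key=performance.get)


# Scores for Data 1 as listed on the slide.
performance = {"SVM": 0.74, "J48": 0.86, "NB": 0.80}

# Illustrative two-row sample (ages and activities of John and Alice).
ages = [35, 29]
activities = ["Walk", "Run"]

# One row of the data-algorithm training data: meta-features plus the winning algorithm.
training_row = meta_features(ages, activities)
training_row["algorithm"] = best_algorithm(performance)
print(training_row)
```

Rows built this way for many datasets would form the training data from which the algorithm selection rules are learned.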
/ 11 Knowledge Creation
Model Creation | Model Execution
/ 12
Intermediate Data (HDFS Data Access Interface)
User Id | Name | Health Condition | … | Activity
1 | John | Normal | … | Walk
2 | Alice | Abnormal | … | Run
… | … | … | … | …

Filtered data
User Id | Name | Activity
1 | John | Walk
2 | Alice | Run
… | … | …

Meta Features Computation
Entropy | Std. Dev. | Mean | …
0.30 | 0.65 | 25.6 | …

Algorithm Selection Models
If (Entropy >= 0.35 & Mean > 50) then Algorithm = SVM
If (Entropy < 0.35 & Mean <= 50) then Algorithm = J48
…

Selected algorithm: J48

Decision tree figure (labels only): nodes Health Condition, Person, Diet; branch labels = Normal, = Over Eaten, Age > 30, Age <= 30; leaves Run, Walk
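The model-execution flow on this slide filters the structured data, computes meta-features, and applies the algorithm selection rules. A minimal Python sketch of the filtering and rule-application steps follows. The function names `filter_schema` and `select_algorithm` and the dictionary keys are hypothetical. The two rules and the meta-feature values (entropy 0.30, mean 25.6) are the ones shown on the slide, and the meta-features are passed in rather than recomputed.

```python
def filter_schema(rows, keep):
    """Keep only the listed columns of each record (schema filtration)."""
    return [{column: row[column] for column in keep} for row in rows]


def select_algorithm(entropy, mean, default=None):
    """Apply the two selection rules from the slide; return `default` if neither rule fires."""
    if entropy >= 0.35 and mean > 50:
        return "SVM"
    if entropy < 0.35 and mean <= 50:
        return "J48"
    return default


structured = [
    {"user_id": 1, "name": "John", "health_condition": "Normal", "activity": "Walk"},
    {"user_id": 2, "name": "Alice", "health_condition": "Abnormal", "activity": "Run"},
]

filtered = filter_schema(structured, ["user_id", "name", "activity"])
selected = select_algorithm(entropy=0.30, mean=25.6)

print(filtered)
print(selected)
```

The two rules do not cover every combination of entropy and mean, so the sketch returns a caller-supplied default in the uncovered cases instead of guessing an algorithm.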
/ 13 Knowledge Maintenance: Case-1 | Case-2 | Case-3
/ 14
Knowledge Maintenance: Case-1 | Case-2 | Case-3
/ 15
Knowledge Maintenance: Case-1 | Case-2 | Case-3

ID | Rule | Satisfaction
Rule-1 | (Activity = Eating && Disease = Yes && Temp = low) => (Health_condition = Abnormal) | 70%
Rule-4 | (Activity = Jogging && Disease = No) => (Health_condition = Normal) | 90%
… | … | …
Rule-n | (Activity = Bus && Diet = Imbalance && Temp = high) => (Health_condition = Dizziness) | 85%

Modified rule
Rule-4 | (Activity = Jogging || Disease = No) => (Health_condition = Normal)

ID | Rule | Modified On
Rule-1 | (Activity = Eating && Disease = Yes && Temp = low) => (Health_condition = Abnormal) | 10-07-2014
Rule-4 | (Activity = Jogging && Disease = No) => (Health_condition = Normal) | 25-06-2014
Rule-4 | (Activity = Jogging || Disease = No) => (Health_condition = Normal) | 24-07-2014
… | … | …

Rule-4 | (Activity = Jogging && Disease = No) => (Health_condition = Normal) | 90%
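The "Modified On" table suggests that each rule edit is logged with a date, so that earlier versions remain available. A minimal Python sketch of such a change log follows. The class names `RuleVersion` and `ChangeLog` and their methods are hypothetical. The rule texts and dates come from the slide, with the dates assumed to be in DD-MM-YYYY form.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RuleVersion:
    rule_id: str
    text: str
    modified_on: date


@dataclass
class ChangeLog:
    versions: list = field(default_factory=list)

    def record(self, rule_id, text, modified_on):
        """Append a new version instead of overwriting the old one."""
        self.versions.append(RuleVersion(rule_id, text, modified_on))

    def history(self, rule_id):
        """All logged versions of a rule, oldest first."""
        matches = [v for v in self.versions if v.rule_id == rule_id]
        return sorted(matches, key=lambda v: v.modified_on)

    def current(self, rule_id):
        """Most recently modified version of a rule, or None if unknown."""
        versions = self.history(rule_id)
        return versions[-1] if versions else None


log = ChangeLog()
log.record("Rule-4", "(Activity = Jogging && Disease = No) => (Health_condition = Normal)", date(2014, 6, 25))
log.record("Rule-1", "(Activity = Eating && Disease = Yes && Temp = low) => (Health_condition = Abnormal)", date(2014, 7, 10))
log.record("Rule-4", "(Activity = Jogging || Disease = No) => (Health_condition = Normal)", date(2014, 7, 24))

print(log.current("Rule-4").text)
print(len(log.history("Rule-4")))
```

Keeping every version makes it possible to restore the earlier `&&` form of Rule-4 if later feedback on the edited rule is poor.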
/ 16
Knowledge Maintenance: Case-1 | Case-2 | Case-3

Rule-4 | (Activity = Jogging || Disease = No) => (Health_condition = Normal)

ID | Rule | Satisfaction
Rule-4 | (Activity = Jogging && Disease = No) => (Health_condition = Abnormal) | 20%
Rule-4 | (Activity = Jogging && Disease = No) => (Health_condition = Abnormal) | 15%
… | … | …
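The low satisfaction values listed here could feed a simple check that flags rules for expert review. A minimal Python sketch follows, under stated assumptions: satisfaction is recorded as a percentage per feedback entry, a rule is flagged when its average falls below a cut-off, and the 50% threshold and the function name `rules_needing_review` are hypothetical. The slides do not define how Satisfaction Analysis decides. The Rule-4 values (20, 15) are from this slide and the Rule-1 value (70) is from the earlier rule table.

```python
from statistics import mean


def rules_needing_review(feedback, threshold=50.0):
    """Return the IDs of rules whose average satisfaction is below the threshold."""
    by_rule = {}
    for rule_id, satisfaction in feedback:
        by_rule.setdefault(rule_id, []).append(satisfaction)
    return sorted(rule_id for rule_id, scores in by_rule.items() if mean(scores) < threshold)


feedback = [("Rule-4", 20), ("Rule-4", 15), ("Rule-1", 70)]
flagged = rules_needing_review(feedback)
print(flagged)
```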
/ 17
Knowledge Maintenance: Case-1 | Case-2 | Case-3

ID | Rule | Satisfaction
Rule-1 | (Activity = Jogging && Disease = No) => (Health_condition = Normal) | 55%

Concepts | Possible Values
Activity | Jogging, Eating, Bus Traveling
Disease | Yes, No, Diabetes, …
Diet | Balance, Imbalance

Concept | Values
Activity | Jogging, Eating, Bus Traveling

Editing Interface: If Activity = {Jogging, Eating, Travelling} and Disease = {Diabetes, No, Yes}

ID | Rule
Rule-2 | (Activity = Eating && Disease = Yes && Temp = low) => (Health_condition = Abnormal)
Rule-4 | (Activity = Jogging || Disease = No) => (Health_condition = Normal)
… | …
Rule-n | (Activity = Bus && Diet = Imbalance && Temp = high) => (Health_condition = Dizziness)

Query (Jogging and Diabetes … and condition)
Search(Online Sources, Query) Example: PubMed

Title | Date | Journal | Type
Dietary and lifestyle factors in relation to pla | 2010 | Med J | Exercise
Novel approaches to obesity prevention | 2012 | J Diabetes | Food

Rule-1 | (Activity = Eating || Disease = No) => (Health_condition = Normal)
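The evidence-support flow on this slide turns the concepts of a rule into a query of the form "Concept1 and Concept2 …" before searching online sources. A minimal Python sketch of the query-generation step follows. The function names `concepts_from_rule` and `build_query` and the sample rule are hypothetical, and no real search service is called. It assumes rules are written in the `(Concept = Value && ...) => (...)` notation used on the slides.

```python
import re


def concepts_from_rule(rule):
    """Extract the values on the condition side of a rule such as '(A = x && B = y) => (...)'."""
    condition = rule.split("=>")[0]
    return [value.strip() for value in re.findall(r"=\s*([^&|()=]+)", condition)]


def build_query(values):
    """Join concept values with 'and', mirroring 'Query (Jogging and Diabetes ...)'."""
    return " and ".join(values)


# Hypothetical rule, chosen so that the result matches the query shown on the slide.
rule = "(Activity = Jogging && Disease = Diabetes) => (Health_condition = Normal)"
query = build_query(concepts_from_rule(rule))
print(query)
```

The resulting string would then be passed to whatever search interface the Evidence Searching component uses, which the slides name only as online sources such as PubMed.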
/ 18 Tools and Technologies
• Java
• Weka
• IBM SPSS Statistics
• JSP
• JavaScript
/ 19 Development Timeline
1st Year
• Finding Service Issue
• KME Refinement (1st Service)
• Broker Interface GUI Development (1st Service)
• Model Creation for Knowledge Builder (1st Service)
• Model Execution for Knowledge Builder (1st Service)
• Implementation of Maintenance Cases (1st Service)
• KME Refinement based on Results
• Report: 1st Service
2nd Year
• Finding Service Issue
• KME Refinement (2nd Service)
• Broker Interface GUI Development (2nd Service)
• Model Creation for Knowledge Builder (2nd Service)
• Model Execution for Knowledge Builder (2nd Service)
• Implementation of Maintenance Cases (2nd Service)
• KME Refinement based on Results
• Report: 2nd Service
3rd Year
• Finding Service Issue
• KME Refinement (3rd Service)
• Broker Interface GUI Development (3rd Service)
• Model Creation for Knowledge Builder (3rd Service)
• Model Execution for Knowledge Builder (3rd Service)
• Implementation of Maintenance Cases (3rd Service)
• KME Refinement based on Results
• Report: 3rd Service
4th Year
• Satisfaction Survey based on Prototype for Service Refinement
• Checking whole service based on survey result
• KME Adjustments reflecting whole-service checking result
• Final checking and supplements of overall project
• Final Report
/ 20 Current Status
• Literature survey of existing systems for knowledge creation and maintenance
• Met with a consortium member on 27 June 2014
• Redesign of the architecture based on comments from the consortium member
• Redesign of the Selector module (Knowledge Generation)
• Redesign of the Confidence Level Checker (Knowledge Maintenance)
• Study of integration/interfacing with other MM modules
• Writing the initial draft of the SRS document
• Designing UML diagrams
/ 21 References
• [1] Dimitriadis, S.; Goumopoulos, C., "Applying Machine Learning to Extract New Knowledge in Precision Agriculture Applications," Informatics, PCI '08, pp. 100-104, 2008.
• [2] Kwiatkowska, E.J.; Fargion, G.S., "Application of machine-learning techniques toward the creation of a consistent and calibrated global chlorophyll concentration baseline dataset using remotely sensed ocean color data," Geoscience and Remote Sensing, IEEE Transactions on, vol. 41, no. 12, pp. 2844-2860, Dec. 2003.
• [3] Bachman, R. E.; Hoffman, R. D.; Johnson, V. M.; McDavid, D. W.; Mazina, D. I., "Search engine facility with automated knowledge retrieval, generation and maintenance," U.S. Patent No. 7,216,121, 8 May 2007.
• [4] Auer, S.; Lehmann, J., "Creating knowledge out of interlinked data," Semantic Web 1.1, 2010, pp. 97-104.
• [5] Kaljurand, K., "ACE View---an Ontology and Rule Editor based on Attempto Controlled English," OWLED, 2008.
• [6] Afzal, M.; Hussain, M.; Khan, W.A.; Ali, T.; Lee, S.; Kang, B.H., "KnowledgeButton: An Evidence Adaptive Tool for CDSS and Clinical Research," INISTA14, 2014.
• [7] Regier, R.; Gurjar, R.; Rocha, R. A., "A clinical rule editor in an electronic medical record setting: development, design, and implementation," AMIA Annual Symposium Proceedings, Vol. 2009.
• [8] Dinerstein, J.; Dinerstein, S.; Egbert, P.K.; Clyde, S.W., "Learning-Based Fusion for Data Deduplication," Machine Learning and Applications, 2008. ICMLA '08, pp. 66-71, Dec. 2008.
Questions
Thank You!