Post on 30-Aug-2018
An Automated Approach to Extract Domain Ontology for
E-learning is becoming an active area in both online and offline
education. E-learning deals with the interaction between the teacher and the learner on the
basis of the knowledge possessed by the learner. Knowing the learner's knowledge
level, the teacher can easily provide the required lessons to the student through an
online medium such as the Internet. Adaptive learning is an educational method
that uses computers as interactive teaching devices. It also tailors the learning materials
to the learner's knowledge level. In this chapter, an automated approach to
extract a domain ontology is designed with the objective of enhancing the
efficiency of adaptive e-learning.
4.1 ONTOLOGY AND E-LEARNING
Ontologies have become a key concept for providing more relevant lessons to
the learner than other means. Ontologies are established for information sharing and
are extensively used for conceptually structuring domains of interest.
Ontologies help us to describe, develop, annotate and relate educational resources,
which in turn helps in retrieving more relevant resources for the learners.
An ontology can be created by a domain expert and embedded into an e-learning system,
or its creation can be automated before it is embedded into the e-learning system. Automating
ontology creation reduces human intervention and the time required.
The chief advantage of the proposed approach is automated ontology
construction through concept map extraction, which is achieved through the
use of association rule mining and sequential pattern mining algorithms. The
constructed domain ontology is then applied to an e-learning system to demonstrate the
real-time application of the proposed approach.
Figure 4.1: Sample ontology for E-learning
Figure 4.1 shows a sample structure of an ontology constructed by domain
experts for the e-learning system. Though the structure is a basic graph-like structure,
we incorporate relations with each node present in the ontology. A node is a topic
related to the domain that is considered for the construction of the e-learning system.
4.2 ONTOLOGY CONSTRUCTION
The main objective of the proposed approach is to construct an ontology for an
e-learning system that fulfills the needs of its clients. A client here is
the student or other person who makes use of the e-learning system.
The ontology lists detailed associations between the nodes, or topics. The
ontology construction goes through a series of steps to ensure that the
e-learning system is effective. The ontology is constructed from a text corpus,
which contains a number of documents regarding a particular domain, so the ontology
is specific to that domain. The main steps in the
construction of an ontology are:
1. Processing the documents
2. Outlining the domain ontology
3. Concept processing (extraction of concepts from the domain)
4. Creating concept maps
These four steps serve as the main components of the proposed approach
and together produce an effective ontology for the learning
system. Based on these steps, an automatic ontology construction method is provided.
The proposed approach derives a specific algorithm to assign a weight to every node
and to provide associations between the nodes. The inter-relationships
between nodes are assigned through a mutual association function. The
document processing methods extract the key features from each document. The key
features are then associated with one another to form concepts, and from the concepts an
effective concept map is created for the e-learning system. Thus, a query from a user
is used to extract a concept map for that query.
Figure 4.2: Ontology construction
Figure 4.2 depicts the block diagram for the construction of an ontology for
the specified e-learning system in our proposed approach. In the succeeding sections,
the proposed approach is discussed in detail.
4.2.1 DOCUMENT PROCESSING
The initial part of the ontology construction is to process the documents to
extract their keywords. The text corpus is selected and the
documents in the corpus are processed by applying
two basic document processing steps. First, a stop word removal process removes
all the uninformative words from the documents. Once the stop word removal is
finished, a stemming algorithm (Willet, 2006) is applied to reduce the keywords to
their root form. The keywords from the documents are then stored in an array,
making sure that no word is repeated. The stored keywords are then
passed to the concept extraction phase.
For example, consider two statements from the text corpus:
Database is a collection of related information. Data in a database are stored in
the form of tables.
The stop words are: is, a, of, from, are, in, the.
Keywords extracted: Database, Collection, Related, Information, data, database,
Stemming: Collection - Collect
Tables - Table
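The two document processing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `naive_stem` and `extract_keywords` are hypothetical, the stop word list is only the one from the example, words are lowercased before comparison, and the toy suffix stripper merely stands in for the cited stemming algorithm.

```python
import re

# Stop words taken from the example above; a real system would use a fuller list.
STOP_WORDS = {"is", "a", "of", "from", "are", "in", "the"}

def naive_stem(word):
    """Toy suffix stripper (assumption): NOT the stemming algorithm cited in the text."""
    for suffix in ("ation", "ion", "s"):
        # Strip the first matching suffix, but keep a stem of at least 3 letters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def extract_keywords(text):
    """Lowercase, drop stop words, stem, and keep each keyword once, in first-seen order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    keywords = []
    for token in tokens:
        if token in STOP_WORDS:
            continue
        stem = naive_stem(token)
        if stem not in keywords:  # no repeated words in the stored array
            keywords.append(stem)
    return keywords

corpus = ("Database is a collection of related information. "
          "Data in a database are stored in the form of tables.")
keywords = extract_keywords(corpus)
print(keywords)
```

Keeping the keywords in a list preserves the order in which they first appear, which a plain set would not; for a large corpus, a set used alongside the list would make the duplicate check cheaper.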
4.2.2 OUTLINING DOMAIN ONTOLOGY
The procedure of ontology construction should be specific and transparent,
since we define the e-learning system as a user-friendly one. In this section, the
steps needed for the efficient construction of the ontology are defined. The
basic structure of the domain ontology can be presented as in Figure 4.3.
Figure 4.3: Outline of domain ontology construction
The outline of the ontology structure should be precise, because an
ontology is domain specific. Most attention is needed in the concept
extraction phase. Each concept should be associated with other concepts while
retaining an individual existence, so redundancy among the concepts should be
identified to ease the process of execution. The other major concern is the
dimension of the concept set. For a high-dimensional concept set, the dimension should
be reduced to make the associations more rigid and precise.
4.2.3 CONCEPT PROCESSING
A concept is defined as a keyword or set of keywords that refers to a common
topic. The purpose of the concept processing step is to identify such
concepts from the set of keywords that has already been extracted. Let K be the set of n
keywords defined by,
$$K = \{k_1, k_2, \ldots, k_n\}$$
The set K includes the keywords from all the documents. Each keyword is then
processed together with the other
keywords to find the associations between them. Initially, the set of keywords is
sorted by frequency. The most frequent keywords are
selected as top priority keywords and are processed first
for concept extraction. The frequency of each keyword k is calculated based on its
presence in a document of the text corpus.
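The frequency ranking can be illustrated with a short Python sketch. It assumes that frequency means the relative count of a keyword within one document; the function name `rank_keywords` and the sample keyword list are hypothetical and serve only to show the ordering step.

```python
from collections import Counter

def rank_keywords(doc_keywords):
    """Return (keyword, relative frequency) pairs, most frequent first."""
    counts = Counter(doc_keywords)
    total = sum(counts.values())
    ranked = [(k, c / total) for k, c in counts.items()]
    # Sort by descending frequency; ties are broken alphabetically.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked

# Hypothetical keyword occurrences from one processed document.
doc = ["database", "table", "database", "data", "table", "database"]
ranked = rank_keywords(doc)
for keyword, freq in ranked:
    print(keyword, round(freq, 3))
```

The keywords at the head of `ranked` correspond to the top priority keywords that are processed first.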
The frequency is calculated as the ratio of the number of occurrences of the keyword k in a
document to the total number of keywords (N(k)) in that document. The set K is then
reordered in descending order of
frequency. We adopt a sentence-level windowing process, in which the
window moves in a sliding manner. The text window is a four-term window
enclosed within a sentence. As the window slides, the words enclosed in the
window are selected for association calculation. The association is calculated as,
$$A(k_i, k_j) = P(k_i \mid k_j)$$
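The sliding-window association can be sketched as follows. This is only one plausible reading: it assumes the conditional probability of keyword a given keyword b is estimated as the number of windows containing both, divided by the number of windows containing b. The names `windows` and `association` and the sample sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations

WINDOW = 4  # four-term window, as described in the text

def windows(sentence_terms, size=WINDOW):
    """Yield sliding windows that never cross a sentence boundary."""
    if len(sentence_terms) <= size:
        yield sentence_terms  # a short sentence forms a single window
        return
    for start in range(len(sentence_terms) - size + 1):
        yield sentence_terms[start:start + size]

def association(sentences):
    """Map (a, b) to an estimate of P(a | b) based on window co-occurrence (assumption)."""
    single = Counter()  # number of windows containing a keyword
    pair = Counter()    # number of windows containing both keywords
    for terms in sentences:
        for win in windows(terms):
            present = set(win)
            single.update(present)
            for a, b in combinations(sorted(present), 2):
                pair[(a, b)] += 1
                pair[(b, a)] += 1
    return {(a, b): n / single[b] for (a, b), n in pair.items()}

# Keywords of two processed sentences (stop words removed, stemmed).
sentences = [
    ["database", "collect", "related", "inform"],
    ["data", "database", "stored", "form", "table"],
]
assoc = association(sentences)
print(assoc[("stored", "database")])
```

Pairs whose association value exceeds a chosen threshold would then be treated as concept candidates; the threshold itself is not fixed by this sketch.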
The association between two keywords is obtained from the probability of
occurrence of the keywords. A conditional probability is adopted for finding the
relation between the keywords. The value of the association between the keywords is
used to extract the concept. If the association value is high, the keyword pair is considered a
concept. The process is continued up to the last document in the text corpus. A
threshold value is set for making