A Survey on Designing Metrics suite to Asses the Quality of Ontology


    Abstract---With the persistent growth of the World Wide Web, retrieving information relevant to a user's query has become increasingly difficult. Present search engines offer the user many web pages, but with differing levels of relevance. To overcome this, various authors have proposed the Semantic Web to retrieve and utilize additional semantic information from the web. Because the Semantic Web places importance on sharing knowledge on the internet, it has led to the development and publishing of several ontologies in different domains. In database terminology, the web-ontology of a semantic web system is the schema of that system. Since the web ontology is an integral aspect of semantic web systems, the design quality of a semantic web system can be determined by measuring the quality of its web-ontology. This survey focuses on developing good ontologies. It draws upon semiotic theory to describe a suite of metrics that assess the syntactic, semantic, pragmatic, and social aspects of ontology quality, and it discusses the metrics that may contribute to developing a high-quality semantic web system.

    Keywords--- Quality Metrics, Web ontology, Semiotic Metrics,

    Semantic Quality, Domain modularity.

    I. INTRODUCTION

    The Semantic Web is an extension of the present web in which web resources are equipped with formal semantics that allow machines to interpret them. These web resources are combined in the form of web information systems, and their formal semantics are usually expressed as web-ontologies. In database terminology, the web-ontology of a semantic web system is a representation of that system [11]. The design quality of a semantic web system can be assessed by measuring the quality of its web-ontology, because the web ontology is an integral element of semantic web systems [25]. The appropriate time to assess the quality of a web-ontology is when its design is complete, so that a low-quality design can be improved before it is instantiated. This saves a considerable amount of cost and effort in developing high-quality semantic web systems. Metrics are considered appropriate tools for estimating quality. This survey focuses on several metrics for web ontology quality evaluation.

    II. LITERATURE SURVEY

    Ahluwalia et al. [1] presented a Semiotic Metrics Suite for Assessing the Quality of Ontologies. Table 1 shows some of the metrics for quality evaluation [1, 3].

    Overall quality (Q) is a weighted function of the syntactic (S), semantic (E), pragmatic (P), and social (O) qualities of an ontology [1] (i.e., Q = b1S + b2E + b3P + b4O). The weights sum to 1. In the absence of pre-specified weights, the weights are assumed to be equal.
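    The weighted sum above can be sketched in a few lines of Python. This is only an illustration, not code from [1]: the function name overall_quality, the argument order, and the tolerance used to check that the weights sum to 1 are all assumptions made here.

```python
def overall_quality(s, e, p, o, weights=None):
    """Return Q = b1*S + b2*E + b3*P + b4*O.

    weights is a 4-tuple (b1, b2, b3, b4); when it is omitted, equal
    weights of 0.25 are used, as the text prescribes.
    """
    scores = (s, e, p, o)
    if weights is None:
        weights = (0.25, 0.25, 0.25, 0.25)
    if len(weights) != len(scores):
        raise ValueError("exactly four weights are required")
    # The tolerance is an assumption of this sketch, not part of the metric.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(b * x for b, x in zip(weights, scores))


# Equal weights by default; explicit weights when the evaluator has preferences.
q_equal = overall_quality(0.8, 0.6, 0.4, 0.2)
q_custom = overall_quality(0.8, 0.6, 0.4, 0.2, weights=(0.4, 0.3, 0.2, 0.1))
```

    With explicit weights an evaluator can emphasise, for instance, syntactic quality over social quality without changing how the individual scores are computed.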

    Syntactic Quality (S) evaluates the quality of the ontology according to the way it is written. Lawfulness is the extent to which an ontology language's rules have been obeyed. Not every ontology editor has error-checking capabilities; however, without correct syntax, the ontology cannot be read and used. Richness is the proportion of features in the ontology language that have been used in an ontology (e.g., whether it includes terms and axioms, or only terms). Richer ontologies are more valuable to the user (e.g., agent).

    Semantic Quality (E) evaluates the meaning of terms in the ontology library. Three attributes are used: interpretability, consistency, and clarity. Interpretability deals with the meaning of terms (e.g., classes and properties) in the ontology. The knowledge provided by the ontology should map onto meaningful concepts in the real world. This is checked by verifying that the words used by the ontology are present in another independent semantic source, such as a domain-specific lexical database or a comprehensive, generic lexical database such as WordNet. Consistency is whether terms have a consistent meaning in the ontology. For example, if an ontology claims that X is a subclass of Y, and that Y is a property of X, then X and Y have incoherent meanings and are of no semantic value. Ontological terms such as IS-A are often used inconsistently. Clarity is whether the context of terms is clear. For example, if an ontology claims that class "Chair" has the property "Salary", an agent must know that this describes academics, not furniture.
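    The interpretability check can be illustrated with a small Python sketch that follows the definition EI = W/C given in Table 1. The dictionary TOY_SENSES is a hypothetical stand-in for a lexical database such as WordNet, and the function name and the case-insensitive lookup are assumptions of this sketch.

```python
# Hypothetical sense inventory standing in for a lexical database.
TOY_SENSES = {
    "chair": ["a seat for one person", "the head of a department"],
    "salary": ["fixed regular payment for work"],
    "professor": ["a senior academic"],
}


def interpretability(terms, sense_inventory):
    """Return EI = W / C.

    C is the number of terms naming classes and properties; W is the
    number of those terms with at least one sense in the inventory.
    """
    if not terms:
        raise ValueError("the ontology has no terms")
    with_sense = sum(1 for term in terms if sense_inventory.get(term.lower()))
    return with_sense / len(terms)


# Three of the four terms are listed, so EI is 3/4.
ei = interpretability(["Chair", "Salary", "Professor", "Xyzzy"], TOY_SENSES)
```

    The toy entry for "chair" has two senses, which is the kind of ambiguity the clarity attribute is meant to capture.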

    Pragmatic Quality (P) deals with the ontology's usefulness for users or their agents, irrespective of syntax or semantics. Three criteria are used for determining P. Accuracy is whether the claims an ontology makes are true. This is very difficult to determine automatically without a learning mechanism or truth maintenance system; currently, a domain expert evaluates accuracy. Comprehensiveness is a measure of the size of the ontology. Larger ontologies are more likely to be complete representations of their domains, and provide more knowledge to the agent. Relevance indicates whether the ontology satisfies the agent's specific requirements.

    K.R Uthayan, Department of Information Technology, SSN College of Engineering, Chennai, India, uthayankr@yahoo.com
    G.S.Anandha Mala, Professor & Head, Department of Computer Science & Engineering, St.Joseph's College of Engineering, Chennai, India, gs.anandhamala@gmail.com

    (IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010
    http://sites.google.com/site/ijcsis/ ISSN 1947-5500

    TABLE 1: DETERMINATION OF METRIC VALUES

    Attribute                  Determination
    -------------------------  --------------------------------------------------
    Overall Quality (Q)        Q = b1.S + b2.E + b3.P + b4.O
    Syntactic Quality (S)      S = bs1.SL + bs2.SR
    Lawfulness (SL)            Let X be the total syntactical rules. Let Xb be the
                               total breached rules. Let NS be the number of
                               statements in the ontology. Then SL = Xb / NS.
    Richness (SR)              Let Y be the total syntactical features available in
                               the ontology language. Let Z be the total syntactical
                               features used in this ontology. Then SR = Z / Y.
    Semantic Quality (E)       E = be1.EI + be2.EC + be3.EA
    Interpretability (EI)      Let C be the total number of terms used to define
                               classes and properties in the ontology. Let W be the
                               number of terms that have a sense listed in WordNet.
                               Then EI = W / C.
    Consistency (EC)           Let I = 0. Let C be the number of classes and
                               properties in the ontology. For each Ci, if its
                               meaning in the ontology is inconsistent, I = I + 1.
                               Therefore, I = number of terms with inconsistent
                               meaning. EC = I / C.
    Clarity (EA)               Let Ci = name of a class or property in the ontology.
                               For each Ci, count Ai (the number of word senses for
                               that term in WordNet). Then EA = A / C.
    Pragmatic Quality (P)      P = bp1.PO + bp2.PU + bp3.PR
    Comprehensiveness (PO)     Let C be the total number of classes and properties
                               in the ontology. Let V be the average value for C
                               across the entire library. Then PO = C / V.
    Accuracy (PU)              Let NS be the number of statements in the ontology.
                               Let F be the number of false statements.
                               PU = F / NS. Requires evaluation by a domain expert
                               and/or truth maintenance system.
    Relevance (PR)             Let NS be the number of statements in the ontology.
                               Let S be the type of syntax relevant to the agent.
                               Let R be the number of statements within NS that use
                               S. PR = R / NS.
    Social Quality (O)         O = bo1.OT + bo2.OH
    Authority (OT)             Let an ontology in the library be OA. Let the set of
                               other ontologies in the library be L. Let the total
                               number of links from ontologies in L to OA be K. Let
                               the average value for K across the ontology library
                               be V. Then OT = K / V.
    History (OH)               Let the total number of accesses to an ontology be A.
                               Let the average value for A across the ontology
                               library be H. Then OH = A / H.
    Cohesion (Coh)             Coh = |SCC|, where SCC is the set of separate
                               connected components.
    Fullness (F)
    Readability (Rd)

    Evaluating relevance requires some knowledge of the agent's requirements. This metric is coarse because it checks for the type of information the agent uses from the ontology (e.g., property, subclass, etc.), rather than the semantics needed for specific tasks (e.g., the particular subclasses needed to interpret a user's specific query).

    Social quality (O) reflects the fact that agents and ontologies exist in communities. The authority of an ontology is the number of other ontologies that link to it (that is, define their terms using its definitions). Greater authority suggests that the knowledge an ontology provides is accurate or useful. The history indicates the number of times the ontology is accessed. Ontologies with longer histories are more dependable.
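    Authority and history, as defined in Table 1 (OT = K/V and OH = A/H), can be sketched as follows. The input shapes are assumptions made for illustration: library_links maps each ontology name to the set of ontologies it links to, and access_counts maps each ontology name to its number of accesses. Self-links are ignored here, which is also an assumption.

```python
from statistics import mean


def authority(library_links, target):
    """Return OT = K / V for target.

    K is the number of other ontologies that link to target; V is the
    average of K over every ontology in the library.
    """
    inbound = {name: 0 for name in library_links}
    for source, linked in library_links.items():
        for name in linked:
            if name != source and name in inbound:
                inbound[name] += 1
    average = mean(inbound.values())
    if average == 0:
        raise ValueError("no ontology in the library is linked to")
    return inbound[target] / average


def history(access_counts, target):
    """Return OH = A / H, the accesses to target over the library average."""
    average = mean(access_counts.values())
    if average == 0:
        raise ValueError("no ontology in the library has been accessed")
    return access_counts[target] / average


links = {"A": {"B"}, "B": set(), "C": {"A", "B"}}
accesses = {"A": 30, "B": 60, "C": 0}
ot_b = authority(links, "B")   # B is linked from A and C
oh_b = history(accesses, "B")
```

    Both values are relative to the library average, so a score above 1 means the ontology is linked to, or accessed, more often than a typical ontology in the same library.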

    The cohesion (Coh) of a knowledge base (KB) is the number of separate connected components (SCC) of the graph representing the KB.
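    Counting connected components is a standard graph traversal. The sketch below assumes the KB has already been reduced to a list of node names and a list of undirected edges; that representation and the function name are assumptions of this example, not something the surveyed work specifies.

```python
def cohesion(nodes, edges):
    """Return Coh = |SCC|, the number of separate connected components.

    nodes is an iterable of node names; edges is an iterable of
    (a, b) pairs treated as undirected links.
    """
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        # Each unvisited node starts a new component; walk it depth-first.
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components


# "Room" is not connected to anything, so this toy KB has two components.
coh = cohesion(
    ["Person", "Student", "Course", "Room"],
    [("Person", "Student"), ("Student", "Course")],
)
```

    A value of 1 means every part of the KB is reachable from every other part; larger values indicate isolated islands of knowledge.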

    The fullness (F) of a class Ci is defined as the actual number of instances that belong to the subtree rooted at Ci (Ci(I)) compared to the expected number of instances that belong to the subtree rooted at Ci (Ci`(I)).

    The readability (Rd) of a class Ci is defined as the sum of the number of attributes that are comments and the number of attributes that are labels that the class has.
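    Both definitions reduce to simple counts. The sketch below reads "compared to" as a ratio of actual to expected instances, which is an interpretation made here, and it identifies comments and labels by the RDF Schema property names rdfs:comment and rdfs:label, which is likewise an assumption about how the class annotations are stored.

```python
def fullness(actual_instances, expected_instances):
    """Return actual / expected instance counts for the subtree of a class.

    Treating the comparison as a ratio is an assumption of this sketch.
    """
    if expected_instances <= 0:
        raise ValueError("expected number of instances must be positive")
    return actual_instances / expected_instances


def readability(annotations):
    """Return the number of comment and label attributes of a class.

    annotations is a list of (property, value) pairs attached to the class.
    """
    readable = ("rdfs:comment", "rdfs:label")
    return sum(1 for prop, _value in annotations if prop in readable)


f = fullness(30, 40)
rd = readability([
    ("rdfs:label", "Chair"),
    ("rdfs:comment", "Head of an academic department."),
    ("rdfs:comment", "Not to be confused with furniture."),
    ("rdfs:subClassOf", "Person"),
])
```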

    Amjad et al. [2] provided the Web-Ontology Design Quality Metrics. The authors propose design metrics for web-ontology [21] that follow certain recommended principles: a metric should reach its highest value for the best case (perfect quality) and its lowest value for the worst case; it should be monotonic, clear, and intuitive; it must correlate well with human judgment; and it should be automated if possible. The proposed metrics indicate how much knowledge can be derived from a given web-ontology, how relevant it is to a user's specific needs, and how easy it is to reuse, manage, trace and adapt. The metrics provided by the authors are Knowledge Enriched (KnE), Characteristics Relevancy (ChR) and Domain Modularity (DoM).

    Knowledge Enriched metric

    The reasoning capability of a web-ontology is determined by the Knowledge Enriched (KnE) metric, which is based on two sub-metrics: the Isolated Axiom Enriched (IAE) metric and the Overlapped Axiom Enriched (OAE) metric. An axiom has three parts, namely predicate, resource and object. If none of these parts is similar to a part of any other axiom in the same domain, the axiom is termed an isolated axiom. If two axioms have some similar parts, they are said to be overlapped. There may be several transitively overlapped axioms in any domain. This metric determines the percentages of IAE and OAE; if the former is greater than the latter, the web-ontology can be regarded as less knowledge enriched. IAE is formally defined as the ratio of the total number of isolated axioms (tIAs) to the total number of domain axioms (tDAs).

    (1)

    In the above equation, n is the total number of sub-domains of the web-ontology. Similarly, the OAE metric is formally defined as the ratio of the total number of overlapped axioms (tOAs) to the total number of domain axioms. It can be written as follows:

    (2)

    In the equation given above, n is the total number of sub-domains of the web-ontology. Lastly, the KnE metric is the difference between the total number of overlapped axioms and the total number of isolated axioms. It may be written as follows:

    (3)

    If the resulting KnE value is positive, the web-ontology is more knowledge enriched; if it is zero, the web-ontology is average knowledge enriched; and if it is negative, the web-ontology is less knowledge enriched.
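    The three quantities can be illustrated with a rough Python sketch built only on the verbal definitions above. Several choices here are assumptions rather than the formulation in [2]: axioms are (resource, predicate, object) triples of strings; two parts count as "similar" only when they are equal, in any position; the counts are pooled over all sub-domains before the ratios are taken; and KnE is computed as OAE minus IAE, which has the same sign as the difference of the raw counts because both ratios share a denominator.

```python
def knowledge_enrichment(domains):
    """Return (IAE, OAE, KnE) for a web-ontology.

    domains maps each sub-domain name to a list of
    (resource, predicate, object) axioms.
    """
    total = 0
    isolated = 0
    for axioms in domains.values():
        for i, axiom in enumerate(axioms):
            parts = set(axiom)
            # An axiom overlaps if it shares any part with another axiom
            # of the same sub-domain; otherwise it is isolated.
            overlaps = any(
                parts & set(other)
                for j, other in enumerate(axioms)
                if j != i
            )
            total += 1
            if not overlaps:
                isolated += 1
    if total == 0:
        raise ValueError("the web-ontology has no axioms")
    overlapped = total - isolated
    iae = isolated / total
    oae = overlapped / total
    return iae, oae, oae - iae


example = {
    "university": [
        ("Professor", "teaches", "Course"),
        ("Student", "enrolledIn", "Course"),
        ("Room", "hasCapacity", "Integer"),
    ],
    "library": [
        ("Book", "writtenBy", "Author"),
        ("Author", "memberOf", "Department"),
    ],
}
iae, oae, kne = knowledge_enrichment(example)
```

    In this toy ontology only the Room axiom is isolated, so OAE exceeds IAE and the KnE value is positive, which the text reads as "more knowledge enriched".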

    Characteristics Relevancy metric

    The Characteristics Relevancy (ChR) metric indicates how close a given web-ontology is to a user's specific needs and the degree of reusability of the web-ontology. Formally, it is defined as the ratio of the number of relevant attributes (nRAs) in a class to the total number of attributes (TnAs) of that class. It can be written as follows:

    (4)

    where n in the above equation represents the total number of classes in the given web-ontology. The ChR metric reveals the proportion of relevant attributes in the web-ontology, and this number indicates how relevant the web-ontology is.
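    A minimal Python sketch of this proportion follows. It assumes that relevant and total attribute counts are summed over all n classes before the ratio is taken, and that relevance is decided by membership in a user-supplied set of attribute names; both the aggregation and the data shapes are assumptions of this example rather than the exact formulation in [2].

```python
def characteristics_relevancy(classes, relevant_attributes):
    """Return the share of attributes in the ontology that are relevant.

    classes maps each class name to a list of its attribute names;
    relevant_attributes is the set of attribute names the user needs.
    """
    total = sum(len(attributes) for attributes in classes.values())
    if total == 0:
        raise ValueError("the web-ontology has no attributes")
    relevant = sum(
        1
        for attributes in classes.values()
        for attribute in attributes
        if attribute in relevant_attributes
    )
    return relevant / total


ontology_classes = {
    "Professor": ["name", "salary", "office"],
    "Course": ["title", "credits", "room", "syllabus", "code"],
}
needed = {"name", "salary", "title", "credits"}
chr_value = characteristics_relevancy(ontology_classes, needed)  # 4 of 8 attributes
```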

    Domain Modularity metric

    The Domain Modularity (DoM) metric denotes the component-orientation feature of a web-ontology. This metric specifies the grouping of knowledge in different components of the web-ontology. A web-ontology is most manageable, traceable, reusable and adaptable if it is designed in components (sub-domains). Formally, the DoM metric is given as the number of sub-domains (NSD) contained in a web-ontology. This metric also depends on the coupling and cohesion [25] levels of the sub-domains: it is directly proportional to their cohesion level and inversely proportional to their coupling level.


    (5)

    In the above equation, DCoh indicates the level of domain cohesion and DCoup represents the level of coupling among the sub-domains of the web-ontology domain. The DoM metric is a real number indicating the degree of partial reusability of a given web-ontology.

    Samir et al. [3] presented OntoQA: Metric-Based Ontology Quality Analysis. The metrics presented highlight key characteristics of an ontology schema as well as its population, and help users make an informed judgment easily. The metrics are not 'gold standard' measures of ontologies. Instead, they are intended to evaluate several aspects of ontologies and their potential for knowledge representation. Rather than describing an ontology as merely effective or ineffective, the metrics describe a certain aspect of the ontology because, in most cases, the way the ontology is built is largely dependent on the domain for which it is designed. The metrics are divided into Schema Metrics and Instance Metrics.

    The following are some of the Schema Metrics:

    Relationship Richness: This metric reflects the diversity of relations and the placement of relations in the ontology. An ontology that contains many relations other than class-subclass relations is richer than a taxonomy with only class-subclass relationships. The relationship richness

    (RR) is defined as the ratio of the number of relationships (P)

    defined in the schema to the sum of the number of subclasses

    (SC) plus the numbe...