empirical findings on ontology metrics

Download Empirical findings on ontology metrics

Post on 04-Sep-2016




0 download

Embed Size (px)


  • a,1 Alcy, W

    e paing ptolor inentslogs fo

    The results show that the existing proposed metrics evaluated in general do not seem to be indicators of

    1. Introduction

    Ontologies (Gruber, 1995) are explicits. Morproblembinedare nod as at of sof

    are capableof judgingqualityproperties and their interpretation. In-quiry on the relation of quality indicators and metrics is in conse-quence an important open challenge for a more systematicapproach todevelopingontology-based software systems. Theprob-lemwith ontologymetrics is that ontologies are very heterogeneousin their structure, objectives and level of formality. In consequence,

    Section 3 describes the statistical analyses performed togetherwith some ndings. Finally, Section 4 concludes the paper and out-lines future research work.

    2. Background

    Ontologies have been a resource in the eld of articial intelli-gence since the 70s. However since the inception of the Semantic

    Corresponding author.E-mail addresses: msicilia@uah.es (M.A. Sicilia), daniel.rodriguezg@uah.es (D.

    Rodrguez), elena.garciab@uah.es (E. Garca-Barriocanal), salvador.sanchez@uah.es1 http://www.w3.org/TR/owl-guide/.

    Expert Systems with Applications 39 (2012) 67066711

    Contents lists available at ciVerse ScienceDirect

    Expert Systems w

    .e lsevier .com/locate /eswa(S. Snchez-Alonso).In the software engineering domain, metrics play an importantrole in designing, developing and identifying software defects of fu-ture maintenance problems. Like it happens in the software engi-neering tasks, generally the construction of ontologies follows aniterative and incremental approach and a number of metrics hasbeen proposed tomeasure different aspects such as reliability, reus-ability and cohesion (see for example Uschold & Gruninger, 1996).However, unlike it happens in software engineering domain, thereis a lack of empirical studies that validate how the different metrics

    guage (OWL),1 that was already used in a previous preliminary study(Manouselis, Sicilia, & Rodrguez, 2010). OWL has been chosen as thelanguage is aW3C recommendation and has rapidly become the stan-dard language for expressing formal ontologies. The results of thestudy point out that current metrics are not directly interpretable ina general way, but there seems to be different categories of ontologiesthat could be discernible from the value of the metrics.

    The rest of the paper is structured as follows. Section 2 coversthe background on ontology metrics and some related work. Next,main concepts and their relationshipdenes the conceptualisations of aconstraints on how terms can be cousing a formal language. Ontologiesent purposes in different systems, anbecome artifacts in the developmen0957-4174/$ - see front matter 2011 Elsevier Ltd. Adoi:10.1016/j.eswa.2011.11.094that ranking, but they can be helpful to identify particular kinds of ontologies. 2011 Elsevier Ltd. All rights reserved.

    representations of do-e formally, an ontologym domain and a set ofto model the domain,

    wadays used for differ-consequence, they havetware.:

    there is a need to carry out exploratory studies that could eventuallyresult in insights about the metrics applicable to different kinds ofontologies and their interpretation for each of them.

    This paper reports on such an exploratory study, carried out bymeasuring a large set of OWL ontologies obtained from theSwoogle repository together with a measure of their popularityusing OntoRank (Ding et al., 2005). The study has been carried outby implementing and using an open source software frameworkfor computing ontologymetrics expressed in theOntologyWebLan-ing ontology metrics from a large set of ontologies extracted from the Swoogle search engine. The Onto-Rank value used inside the Swoogle to rank search results is used as a contrast for some existing metrics.Empirical ndings on ontology metrics

    M.A. Sicilia a,, D. Rodrguez a,b, E. Garca-BarriocanalaDepartment of Computer Science, University of Alcal, Ctra. Barcelona, Km. 31.6, 2887bDepartment of Computing and Communication Technologies, Oxford Brookes Universit

    a r t i c l e i n f o


    a b s t r a c t

    Ontologies are becoming thinformation in several domhas become an engineerinmethods. As part of any ontifying possible problems oassessment that complemthere are a number of ontostudies that provide a basi

    journal homepage: wwwll rights reserved.S. Snchez-Alonso a

    al de Henares, Madrid, Spainheatley Campus, Oxford OX33 1HX, UK

    referred way of representing, dealing and reasoning with large volumes ofs. In consequence, the creation, evaluation and maintenance of ontologiesrocess that needs to be managed and measured using sound and reliablegy engineering or revision process, metrics can play a role helping in iden-correct use of ontology elements, along with providing a kind of qualityreviews requiring expert inspection. However, in spite of the fact that

    y metric proposals described in the literature, there is a lack of empiricalr their interpretation. This paper reports a systematic exploration of exist-ith ApplicationsS

  • Motwani, & Winograd, 1999). That ranking metric is an externa

    ith Applications 39 (2012) 67066711 6707Web, ontologies have become the principal recourse to integrateand deal with online information, and a new set of standards toformally represent them has been developed and gained wide-spread adoption. The OntologyWeb Language (OWL) is one of suchstandards that belongs to a family of knowledge representationlanguages created for the Semantic Web (Berners-Lee, Hendler, &Lassila, 2001). The OWL has reached the status of W3C (WorldWide Web Consortium) recommendation. From a technical pointof view OWL extends the RDF (Resource Description Framework)and RDF-S (RDF Schema) allowing us to integrate a variety of appli-cations using XML as interchange syntax.

    The OWL has evolved since its inception in 2004 to a new spec-ication in 2009 (OWL2) with different sub-languages dependingon the expressiveness and reasoning capabilities provided. Thereare however common elements and OWL ontologies are composedof (i) classes that can be dened as sets of individuals, (ii) individ-uals as instances of classes, i.e., objects of the domain and (iii)properties as binary relations between individuals. It is also possi-ble to specify property domains, cardinality ranges and reasoningon ontologies. From these elements a number of authors have pro-posed metrics to measure the quality of ontologies.

    Several authors have proposed metrics related to ontologies oradapted from object-oriented software. For example Yao, Orme,and Etzkorn (2005) proposed a set on metrics to measure the cohe-sion in ontologies such as the number of root classes, number ofleaf classes and average depth of inheritance tree of all leaf nodes.Authors validated the metrics both theoretically and empirically.Their empirical validation consisted of comparing statistically theresults of the metrics and subjective evaluation of eighteenevaluators.

    Tartir, Budak Arpinar, Moore, Sheth, and Aleman-Meza (2005)propose three orthogonal dimensions to evaluate the quality ofan ontology. The rst dimension is quality of the ontology schemarepresenting the real world. The second dimension evaluates thecontent of the ontology, i.e, how well populated is the ontologyand if it reects the real world. Finally, the third dimension evalu-ates if the set of instances and relations agree with the schema.Schema metrics proposed by the authors include relationship,attribute and inheritance richness. Instance metrics are dividedinto (i) knowledge base metrics such as class richness, averagepopulation and cohesion and (ii) class metrics which includeimportance, fullness, inheritance richness, relationship richnessconnectivity and readability. Authors analysed two general pur-pose ontologies SWETO and TAP as well as the GlycO (GlycomicsOntology) ontology.

    Vrandecic and Sure (2007) propose a framework for designingontologies though a process of normalisation and considering sta-ble metrics, i.e., metrics considering the open world assumption.

    Metrics and metric frameworks can help with processes ormethodologies for the design of ontologies or ontological tools. Awell-known methodology for the evaluation of ontological ade-quacy of taxonomic relationships is OntoClean dened by Guarinoand Welty (2002). Another process is the OntoMetric process de-ned by Lozano-Tello and Gmez-Prez (2004). OntoMetric is aprocess to select ontologies using a broad range of metrics, notonly those that can be collected form the actual ontologies.

    A suite a of ontology metrics to measure design complexity hasbeen recently proposed (Zhang, Li, & Kuan Tan, 2010), reporting ananalysis of desirable properties for metrics described by Weyuker,and providing results of an empirical evaluation in which the met-rics proposed are shown to be discriminating ontologies that areconsidered by the authors to be of different complexity.

    In another direction Ding et al. (2004, 2005) have developed

    M.A. Sicilia et al. / Expert Systems wSwoogle, a semantic search engine in which the popularity of aSemantic Web document is dened using the OntoRank measureadapted from the well-known PageRank algorithm (Page, Brin,criteria that can be used to contrast ontology metrics. The problemwith the ranking is that its interpretation as a quality measure isnot evident and has not been empirically studied. However, asother quality measures are not available for ontologies, it repre-sents an interesting opportunity to contrast existing metrics.

    We also developed an open source framework to extract theOWL ontology metrics called OntoMetrics. The OntoMetrics is anopen source Java implementation that m


View more >