Part-whole representation and reasoning in formal biomedical ontologies

Stefan Schulz a,*, Udo Hahn b

a Department of Medical Informatics, Freiburg University Hospital, Stefan-Meier-Str. 26, D-79104 Freiburg, Germany
b Jena University Language and Information Engineering (JULIE) Lab, Fürstengraben 30, D-07743 Jena, Germany

Received 6 October 2003; received in revised form 21 October 2004; accepted 12 November 2004

Artificial Intelligence in Medicine (2005) 34, 179–200
http://www.intl.elsevierhealth.com/journals/aiim

KEYWORDS: Part-whole reasoning; Description logics; (Bio)medical ontologies

Summary

Objective: Biomedical ontologies are typically structured in a biaxial way, reflecting both a taxonomic (is-a) and a partonomic (part-of) hierarchy. Commonly used biomedical terminologies which incorporate such distinctions excel in terms of broad coverage but lack a rigid formal foundation. The latter, however, is a prerequisite for automated reasoning. For the biomedical domain, it is not only crucial to cope with ontological dependencies between wholes and their parts but also with specific reasoning patterns which underlie the propagation of roles across partonomic hierarchies.

Methods: We scale down part-whole reasoning to subsumption-based taxonomic reasoning within the formal framework of a parsimonious variant of description logics (viz. ALC).

Results: We provide a formal basis for ontological engineering in the domain of biomedicine, as far as part-whole relationships are concerned, by addressing typical reasoning patterns encountered in this domain.

© 2005 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +49 761 203 3252; fax: +49 761 203 3251. E-mail address: [email protected] (S. Schulz).

0933-3657/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.artmed.2004.11.005

1. Introduction

Biology and medicine both share a long-standing tradition for structuring their domain knowledge in terms of taxonomies, classifications, and thesauri. These terminological resources have been developed and put into practice mainly from the perspective of their utility for disease encoding, health care statistics, gene annotation, document retrieval, or accountancy practices, while ontological considerations never played a considerable role in organizing terms and their (shared) meaning. With increasing demands for more ‘intelligent’ support of research and routine work in terms of planning, diagnostic reasoning, decision support, and natural language processing, requirements for more sophisticated forms of computationally adequate domain representations have taken shape [1].



    Two different scientific communities share efforts to handle these new challenges. The biomedical community, on the one hand, strives for controlled vocabularies with a broad coverage for routine use. The artificial intelligence community, on the other hand, aims at expressive formal representation frameworks which allow for deep representation of shared knowledge of a given domain as the methodological backbone for intelligent systems. These efforts have predominantly been devoted to the design of knowledge representation languages and the support of special forms of automated reasoning services (e.g., taxonomic classification, probabilistic or fuzzy reasoning). Far less attention, however, has been given to large-scale, robust knowledge representation systems beyond the level of fragile experimental prototypes. Only recently, mainly propagated by the emergent Semantic Web, well-engineered and scalable reasoning systems, which are able to reliably and efficiently deal with large amounts of terminological structures, are under development. They are a prerequisite for any serious biomedical application which will be based on large-sized domain ontologies. Additionally, many authors argue that these domain ontologies should be constructed on the basis of a formal, domain-independent upper level ontology as a general theory of the world, grounded in philosophy and logics.

    After reviewing the current state of the art of terminologies in the biological and medical domain and pointing out their major shortcomings from a knowledge representation and reasoning perspective in Section 1.1, we develop several desiderata as far as biomedical reasoning is concerned in Section 1.2. In Section 1.3, we then turn to the formal framework on which our study is based, viz. that of description logics. In Section 2, we discuss fundamental aspects of biomedical ontologies with emphasis on mereological considerations. In Section 3, our approach to dealing with partonomic reasoning as taxonomic reasoning is described. In Section 4, we discuss our approach in the context of related work.

    1.1. The biomedical approach to the organization of shared domain knowledge

    In biology and medicine, numerous routine tasks have motivated the development of large though heterogeneous repositories of terms and concepts [2,3]. Classifications provide exhaustive sets of mutually exclusive, pre-coordinated classes such as the International Classification of Diseases (ICD) [4] or diverse classifications of medical procedures. In both cases, health care statistics and accounting practices constitute the rationale for the development and maintenance of these widely used controlled vocabularies. Multiaxial systems, e.g., SNOMED [5] and ICNP [6], provide additional descriptive flexibility through the compositionality of their constituent concept classes, polyhierarchies, and semantic links, often, however, at the price of introducing ambiguity and semantic vagueness [3]. Thesaurus-like systems, such as the Medical Subject Headings (MeSH) [7], are tailored to the needs of biomedical document indexing. This may justify, for instance, the extensive use of polyhierarchies which help retrieve documents indexed by so-called narrower terms of a given search term. For biology, the Gene Ontology [8] was devised as a resource for the manual annotation of genome sequences in order to keep track of the rapidly expanding terminology of genes, gene products, and biological functions as described in various research reports. Recently, open source ontologies of several biological species have emerged in the framework of the OBO project [9].

    These ontologies focus on what may be perceived as the very core of life sciences, viz. the physical structure of living organisms, whose tangible parts such as organs, tissues, cells, and molecules constitute the location of biological processes as well as the targets of experimental, diagnostic and therapeutic interventions. Due to the pivotal role physical structure plays in the biomedical domain, its symbolic representation is considered to be crucial for any ontology engineering effort [10–12], a claim which has only partly been accomplished. The Foundational Model of Anatomy (FMA) [12], for instance, provides a description of (ideal) human anatomy at a high level of granularity, but it is not referred to in any clinical vocabulary. On the other hand, clinical vocabularies such as the ICD [4] contain only implicit representations of anatomy in terms of free-text references to body parts, tissues, and cells.

    More than one hundred of these independently developed biomedical terminologies are assembled in the Unified Medical Language System (UMLS) [13,14], an umbrella system which provides a thesaurus-style framework and contains, in its 2004 edition, the impressive number of more than one million different concepts.¹ As inherited from most of its sources, the underlying concept representation format can, in principle, be expressed as simple ⟨Object, Attribute, Value⟩ triplets, all collected within large table structures. A total of more than ten million descriptive units guarantee an extensive coverage of the biomedical domain unmatched by any other terminology system. A closer look, however, reveals some of the shortcomings of the UMLS:

    • A high number of (approximately 7.6 million) utterly shallow relationships are provided, which do not exceed the thesaurus-style broader/narrower term relation. For example, the concept Blood is linked by a child relation to concepts such as Blood Plasma, Fetal Blood, and Blood Test, thus mixing up is-a, part-of, and other relations.

    • Relations such as broader/narrower term, part-of, and is-a are commonly considered antisymmetric, thus precluding cyclic definitions. It is well known, however, that there are several thousands of cyclic definitions in the UMLS, such as Atherosclerosis broader Arteriosclerosis broader Atherosclerosis [17,18].

    • A vague semantics concerning relations is found especially with regard to transitivity. There is no way of deciding, e.g., whether the deduction Abdomen has-part Lumen of Gallbladder is valid and how it should be interpreted given Abdomen has-part Gallbladder and Gallbladder has-part Lumen of Gallbladder.

    • There is no indication of the dependency status of an attribute: Abdomen has-part Gallbladder leaves open the interpretation as to whether there must be a gallbladder in each abdomen, or whether an abdomen may have a gallbladder as a part. Similarly, one could interpret the triplet Appendix location-of Appendectomy either in the sense that for every appendix there is an appendectomy or that an appendix may constitute the location of an appendectomy.

    • There is no specification of the mutual exclusion of concepts or spatial disjointness of their instances. For example, the triple Epithelial Cell sibling Neuroglial Cell or Liver sibling Biliary Tract leaves open the question whether a cell can be both an epithelial and a neuroglial cell and whether instances of Liver and Biliary Tract may overlap.

    Certainly, not all of these shortcomings affect all biomedical terminologies. Circular definitions arise mainly via the merging of various inconsistent source vocabularies into UMLS, hierarchical relations reveal a fairly unambiguous is-a meaning in the ICD, and there is a clear-cut distinction between is-a and part-of in the Foundational Model of Anatomy. How one should interpret attribute/value pairs, which stand in a kind of definitory relation to classes, is either left to the user or is subject to intentionally fuzzy or otherwise unstable definitions. For example, the Gene Ontology used to define part-of as follows: "part of means can be a part of, not is always a part of" [19]. Recently, a change of this definition has occurred to "The part-of relationship used in GO is usually (...) necessarily is-part."

    For the current routine uses of large-scale biomedical vocabularies, such as lexicon look-up, enforcement of a controlled language (terminology service), or term expansion for information retrieval, the underlying representation structures turn out to be useful in most cases. For intelligent applications requiring inferential computations on conceptual structures, the informal approach is doomed to failure, because semantically vague or even inconsistent assertions may lead to a broad range of inadequate, or even invalid, deductions. Hence, the problems just outlined have to be resolved in order to scale up these terminologies for more advanced usage scenarios.

    ¹ The entities of meaning in biomedical vocabularies are traditionally called concepts. We use this term when referring to these systems. For our formal considerations, however, we prefer the term class, in accordance with the OWL [15] parlance. For the ongoing discussion about classes versus concepts, cf. [16].

    1.2. Desiderata for reasoning in (bio)medical ontologies

    First of all, we need to clarify how to interpret expressions such as Abdomen has-part Gallbladder or Appendix location-of Appendectomy. Do we mean that for each instance of the class being defined (here Abdomen, Appendix), there exists an instance of the target class (here Gallbladder, Appendectomy) related by a relation (here has-part, location-of)? Or do we want to express that the associations are principally allowed for some, though not necessarily all, instances? Even if we tend to favor the latter interpretation and consider gallbladder as a possible part of an abdomen, we may still want to reject completely invalid assertions (e.g., an abdomen may have a thyroid gland as part). Hence, for a given relation $r$, each possible pair of classes $(C_i, C_j)$ should be assigned to exactly one of the following categories:

    1. Mandatory: each instance of $C_i$ has an instance of $C_j$ related by $r$.

    2. Optional: an instance of $C_i$ may (but need not) be related to an instance of $C_j$ by $r$.

    3. Invalid: no instance of $C_i$ may ever be related to an instance of $C_j$ by $r$.

    Secondly, the formal properties of relations in terms of transitivity, reflexivity, and symmetry have to be specified. (Not only) in the biomedical domain, a particular challenge arises from the transitivity property of a relation (e.g., part-of, has-location) [20].

    We will, thirdly, elaborate on the propagation of attributes of wholes from attributes of parts [21–24] such as illustrated by the following examples:

    • The disease Nephritis is considered an Inflammation of the Kidney. It subsumes Glomerulonephritis (Inflammation of the Glomerula) because Glomerula are parts of the Kidney.

    • Insulin Secretion is usually related to the Pancreas by function-of, because Pancreatic Beta Cells (Insulin producing cells) are considered part-of the Pancreas.

    • Muscular Contraction tends to be classified as a function of a Muscle since it is a function of the Actin-Myosin Complex which is a component of Muscle Cells, the latter being part-of Muscle.

    However, there are various counterexamples to these seemingly general propagation patterns:

    • An Amputation of a Toe cannot be classified as Amputation of a Foot although every Toe is part-of a Foot.

    • Mitosis is a Cell function but it would be counterintuitive to classify it as a Pancreas or Liver function although these organs have Cells as parts.

    • Appendicitis is an inflammation of the Appendix but it is not a Gastroenteritis (inflammation of the Gastrointestinal Tract) although every Appendix is part-of Gastrointestinal Tract.

    We conclude that such part-whole propagation patterns, in which attributes propagate from parts to wholes, from wholes to parts, or do not propagate at all, are frequent, but bear subtle intricacies, which still have not been sufficiently treated in biomedical ontology engineering.

    A suitable account of attribute propagation, including propagation anomalies, is indispensable wherever the subsumption of new terminological expressions by existing ones is required. This is the case when a vocabulary of atomic classes (such as SNOMED) is used for the fine-grained encoding of clinical conditions, which will then have to be automatically mapped to a set of pre-coordinated categories (such as ICD). One of the most important application contexts of biomedical ontologies is the (legally required) clinical coding for accountancy or health statistics purposes. The procedures are error-prone, complex, and expensive. An inadequate treatment of role propagation via part-whole hierarchies may then cause serious and often unintended consequences: The amputation of a toe would be mapped to the code for amputation of a leg, and appendicitis would be classified as gastroenteritis. Even for standard procedures we have, therefore, to turn towards a more principled approach in order to avoid these unwarranted effects.

    1.3. Formal approaches to knowledge representation

    Formal methods of knowledge representation² constitute the core of artificial intelligence (cf. Brachman and Levesque [25] for an overview). As with any application field, a representation language has to be specified for the biomedical domain which should be expressive enough to encode the relevant ontological structures and apt to support the reasoning services required by the intended application. Rather than some codified form of natural language which is the basis of the terminological systems from Section 1.1, formal, or at least formalized, languages are preferred as a description framework for advanced knowledge representation efforts. Two main paradigms can currently be distinguished, viz. logic-based (predicate logic, description logic, etc.) [26,27] on the one hand, and structured, graph-based (semantic network, conceptual graph) [28,29] languages on the other hand. In recent years, description logics have increasingly been adopted by the medical informatics community [30–35], thus superseding the conceptual graph paradigm [36].

    Logic-based knowledge representation languages are characterized by a rigid definition of their language foundations in terms of a formal syntax and formal semantics. Together with their axiomatic basis and sets of inference rules, truth-preserving transformations of expressions, i.e., (syntactically) correct derivations and (semantically) valid deductions, are guaranteed.

    An implementation of a formal knowledge representation language should provide reasoning services required by the domain and the particular application. There are universal mechanisms such as computing new expressions from existing ones (on the basis of truth-preserving inferences), matching equivalent expressions, or detecting logical inconsistencies. There are also more specific functions such as computing the least general common subsumer of two classes [37], or relating an instance (individual) to its most specific class, etc., all typical of description logic inference machines [38].
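To make one of these services concrete, the following minimal sketch computes the least general common subsumers of two classes over an explicitly asserted is-a hierarchy. It is an illustration under simplifying assumptions, not the algorithm of [37]: it only walks told is-a links and does not reason over class definitions. The function names and the toy taxonomy (in particular the class Hepatitis and the grouping classes) are hypothetical.

```python
def ancestors(cls, parents):
    """Return cls together with every class that subsumes it via told is-a links."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific, parents):
    """True if 'general' is reachable from 'specific' by is-a links (or is the same class)."""
    return general in ancestors(specific, parents)


def least_common_subsumers(a, b, parents):
    """Common subsumers of a and b that have no more specific common subsumer below them."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(d != c and subsumes(c, d, parents) for d in common)
    }


# Hypothetical toy taxonomy: each class maps to its direct is-a parents.
parents = {
    "Glomerulonephritis": ["Nephritis"],
    "Nephritis": ["InflammatoryDisease", "KidneyDisease"],
    "Hepatitis": ["InflammatoryDisease", "LiverDisease"],
    "InflammatoryDisease": ["Disease"],
    "KidneyDisease": ["Disease"],
    "LiverDisease": ["Disease"],
}

lcs = least_common_subsumers("Glomerulonephritis", "Hepatitis", parents)
print(lcs)
```

In a polyhierarchy the result can contain more than one class, which is why the sketch returns a set rather than a single class.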

    Within the family of description logics (DL), the ALC language [38] is often used as a reference language, mainly due to its balance between expressiveness and tractability. Taking the formal properties of ALC, and, quite recently, more expressive variants such as SHIQ [39] into account, many widely used implementations of DL reasoning engines, such as NIKL [40], CLASSIC [41], LOOM [42], FACT [43], and RACER [44], have been developed. In the emerging Semantic Web, the semantic markup language OWL [15], conceptually rooted in description logics, plays a foundational role for publishing and sharing ontologies on the World Wide Web and is increasingly accepted in the medical domain, too [45].

    Here, we refer to ALC as a common denominator. ALC's syntax and semantics are summarized in Table 1, where $C$ and $D$ denote classes, while $R$ denotes role terms. Within the set-theoretical semantics of ALC, classes are unary predicates, and roles are binary predicates over a domain $\Delta$, with individuals being the elements of $\Delta$. The interpretation $\mathcal{I}$ is a function that assigns to each class symbol $C$ contained in the set of class symbols $A$ a subset of the domain $\Delta$ ($\mathcal{I}: A \rightarrow 2^{\Delta}$, cf. (1) in Table 1), and to each role symbol $R$ contained in the set of role symbols $P$ it assigns a binary relation of $\Delta$ ($\mathcal{I}: P \rightarrow 2^{\Delta \times \Delta}$, cf. (5) in Table 1). ALC also provides class forming operators, viz. for intersection (2), union (3), and (full) negation (4). A role expression can either be a value restriction (6) or an existential condition (7).

    Table 1. Syntax and semantics for the description logic language ALC

    Syntax              Semantics
    $C$                 $\{d \in \Delta^{\mathcal{I}} \mid \mathcal{I}(C) = d\}$    (1)
    $C \sqcap D$        $C^{\mathcal{I}} \cap D^{\mathcal{I}}$    (2)
    $C \sqcup D$        $C^{\mathcal{I}} \cup D^{\mathcal{I}}$    (3)
    $\neg C$            $\Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$    (4)
    $R$                 $\{(d, e) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid \mathcal{I}(R) = (d, e)\}$    (5)
    $\forall R.C$       $\{d \in \Delta^{\mathcal{I}} \mid R^{\mathcal{I}}(d) \subseteq C^{\mathcal{I}}\}$    (6)
    $\exists R.C$       $\{d \in \Delta^{\mathcal{I}} \mid R^{\mathcal{I}}(d) \cap C^{\mathcal{I}} \neq \emptyset\}$    (7)

    By means of terminological axioms, a symbolic name can be assigned to a class forming expression using $\doteq$ in order to relate them by equality (necessary and sufficient constraints), or using $\sqsubseteq$ to express general inclusion (only necessary constraints), where the symbolic name appears on the left side of the terminological axiom and the class defining expression on its right side. A finite set of such axioms is called the terminology or TBox.

    In order to illustrate these formal devices in a commonsense domain, the intersection ($\sqcap$) of $\mathit{Woman} \sqcap \mathit{Parent}$ denotes all women who are parents (i.e., the set of mothers), the negation ($\neg$) $\neg\mathit{Woman}$ denotes all individuals who are not women, while the union ($\sqcup$) $\mathit{Man} \sqcup \mathit{Woman}$ denotes all individuals who are men or women. The value restriction $\forall \mathit{hasChild}.\mathit{Woman}$ denotes all individuals who have only daughters (if any), while the existential restriction $\exists \mathit{hasChild}.\mathit{Woman}$ denotes all individuals who have at least one daughter. The statement $\mathit{Mother} \doteq \mathit{Woman} \sqcap \mathit{Parent}$ indicates an equivalence between both sides of the expression, and the general inclusion enforces subsumption. $\mathit{Human} \sqsubseteq \mathit{Mammal}$ expresses that each human is a mammal, without, however, specifying sufficient criteria which distinguish a human from the more general class of a mammal.

    The set-theoretical semantics of description logics allows the inference of subsumption relations (is-a) between classes. Whenever the extension of a class $D$ is a subset of the extension of another class $C$, the relation $D \sqsubseteq C$ holds. At the reasoning level of terminological knowledge, this kind of inference is performed by a deductive component called the classifier, which runs at the TBox level of terminological knowledge representation.

    In contradistinction to the TBox, the ABox contains facts about instances (individuals), i.e., the elements of the domain $\Delta$. An ABox reasoner computes assertions about instances based on definitions and constraints provided by the TBox. Assume we define the class $\mathit{MoD} \doteq \mathit{Woman} \sqcap \forall \mathit{hasChild}.\mathit{Woman}$ and instantiate $\mathit{Mary} \in \mathit{MoD}$ and $(\mathit{Mary}, \mathit{Kim}) \in \mathit{hasChild}$. A deductive reasoning component which computes the least general class to which a particular instance belongs, i.e., the realizer, then runs at the ABox and infers that Kim is a woman. An important characteristic of description logics is its underlying open-world semantics [38]: An ABox represents all its models as valid interpretations. In the above example, the assertion $\mathit{Kim} \in \mathit{Man}$ would not lead to an inconsistent ABox unless the TBox explicitly states the mutual exclusiveness of the classes Man and Woman ($\mathit{Man} \sqsubseteq \neg\mathit{Woman}$).

    ² In order to avoid a misinterpretation of the widely used term knowledge representation some clarification is required. In our context, knowledge is not meant to denote individual beliefs but rather the shared notions of the meaning of terms in a given domain.

    2. Ontological considerations underlying biomedical knowledge representation

    Each ontology for a concrete domain, such as biology or medicine, has portions with lots of very detailed knowledge pieces (e.g., dealing with the constituents of cells, proteins, genes, or the heart, cancer, drugs). Moving upwards to more and more general classes then leads to the fundamental categories around which such a concrete domain is organized. These so-called upper level regions of an ontology [46–48], which we will deal with in Section 2.1, are far less evident and, moreover, less widely agreed upon than the more specific regions. In Section 2.2, we will focus on the relevance of part/whole relations, or, more generally, mereology, in biology and medicine. Turning to the main topic of this article, in Section 2.3, we will discuss special reasoning patterns closely tied to our domain. In essence, these patterns make use of the propagation of roles along taxonomic is-a and partonomic part-of hierarchies. In Section 2.4, we introduce a distinction of different types of role fillers that seem to be crucial for any adequate account of biomedical domain representation and reasoning. Finally, in Section 2.5, we discuss the implications of modeling countable entities, collections, and mass terms in a biomedical ontology.

    2.1. Upper ontology for the biomedical domain

    We start by informally introducing two upper-level classes, viz. Biological Structure, referred to as BStruct, and Living Organism, referred to as LOrg. We, deliberately, refrain from an extensive definition of both terms but rather regard BStruct as a common denominator of all those entities which

    • have an n-dimensional spatial extension with n ranging from zero to three (e.g., McBurney point has dimension zero, Border of Heart has one, Surface of Body has two, and Gastrointestinal Tract has three),

    • necessarily or possibly constitute Living Organisms (LOrg), e.g., Axon, Femur, Appendix, Glucose, Oxygen, Liver Tumor (in particular, we exclude here manufactured objects and any other artifacts such as Dental Implant, Heart Pacemaker, etc. from further consideration),

    • constitute solids (e.g., Femur, Epithelium) or hollow structures (e.g., Cavity of Heart); solids may have hollow structures as parts, but not vice versa [49],

    • feature as countable entities (e.g., Brain), collections (e.g., Teeth), or mass terms (e.g., Connective Tissue).

    In short, BStruct encompasses any component of the toolkit of life. This justifies our decision to include even neutral classes such as Water or Atom into the category of BStruct. With Top being the artificial root of the upper level taxonomy, we here define LOrg (living organism) as a subclass of Top:

    $\mathit{LOrg} \sqsubseteq \mathit{Top}$    (1)

    Biological structure BStruct subsumes any LOrg as well as any part LOrg can have:

    $\mathit{BStruct} \doteq \mathit{LOrg} \sqcup P$    (2)

    with $P$ being the value restriction of the role has-part at LOrg:

    $\mathit{LOrg} \sqsubseteq \forall \textit{has-part}.P$    (3)

    It is, therefore, sufficient for an instance of a given class, e.g., Oxygen, to be part of a living organism such that we may consider this class as a BStruct, as well.
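The set-theoretic reading of axioms (1)–(3) can be tried out on a small finite interpretation. The sketch below evaluates ALC class expressions following the rows of Table 1 and checks whether an inclusion holds in one given interpretation. This is only model checking over a single hand-made interpretation, not the subsumption reasoning of a DL classifier, which has to consider all models. The tuple encoding of expressions, the function names, and the four individuals are assumptions made for illustration.

```python
def extension(expr, domain, classes, roles):
    """Return the set of individuals that belong to a class expression.

    Expressions are atomic class names (str) or tuples:
    ("and", C, D), ("or", C, D), ("not", C), ("all", R, C), ("some", R, C).
    """
    if isinstance(expr, str):                       # atomic class symbol, row (1)
        return set(classes.get(expr, ()))
    op = expr[0]
    if op == "and":                                 # intersection, row (2)
        return (extension(expr[1], domain, classes, roles)
                & extension(expr[2], domain, classes, roles))
    if op == "or":                                  # union, row (3)
        return (extension(expr[1], domain, classes, roles)
                | extension(expr[2], domain, classes, roles))
    if op == "not":                                 # full negation, row (4)
        return set(domain) - extension(expr[1], domain, classes, roles)
    if op in ("all", "some"):                       # rows (6) and (7)
        filler = extension(expr[2], domain, classes, roles)
        members = set()
        for d in domain:
            successors = {e for (x, e) in roles.get(expr[1], ()) if x == d}
            if op == "all" and successors <= filler:
                members.add(d)                      # vacuously true without successors
            elif op == "some" and successors & filler:
                members.add(d)
        return members
    raise ValueError(f"unknown constructor: {op!r}")


def holds_inclusion(sub, sup, domain, classes, roles):
    """True if the extension of sub is contained in the extension of sup."""
    return (extension(sub, domain, classes, roles)
            <= extension(sup, domain, classes, roles))


# Hypothetical interpretation: one organism, two of its parts, one artifact.
domain = {"patient", "femur", "oxygen", "pacemaker"}
classes = {"Top": set(domain), "LOrg": {"patient"}, "P": {"femur", "oxygen"}}
roles = {"has-part": {("patient", "femur"), ("patient", "oxygen")}}

bstruct = extension(("or", "LOrg", "P"), domain, classes, roles)                        # definition (2)
axiom_1 = holds_inclusion("LOrg", "Top", domain, classes, roles)                        # axiom (1)
axiom_3 = holds_inclusion("LOrg", ("all", "has-part", "P"), domain, classes, roles)     # axiom (3)
print(sorted(bstruct), axiom_1, axiom_3)
```

In this interpretation the artifact falls outside BStruct, in line with the exclusion of manufactured objects stated above, while the oxygen individual is included solely because it is a part of the organism.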

    2.2. Mereology

    The taxonomic is-a relation (cf. Brachman [50]), which links classes in terms of generalization and specialization (such as with Cancer is-a Disease), is traditionally considered to be the primary ordering relation for class hierarchies. Likewise, the semantic foundations of such taxonomies are quite stable, though controversies still exist [51,52,16,53]. As far as the biomedical domain is concerned, the mereological (part-whole related) order, however, is equal in importance as a hierarchy-building principle. Due to diverging epistemological considerations dependent on different conceptual approaches to biology and medicine as well as distinct formal criteria, on which part-whole relations should be based, the semantics of mereological relations is far less obvious and theoretically less clear. The transitivity of the part-of relation, in particular, has been vigorously discussed in the literature (cf. the overview by Artale et al. [21]).

    Classical mereology [54,55], rooted in the work of Lesniewski [56], treats generic parthood as a (strict) partial order relation, i.e., one that is reflexive, antisymmetric, and transitive. This axiomatic stipulation of the transitivity of part-of has been challenged by empirical findings from linguists and cognitive scientists [57–60], who distinguish several part-whole relation types to which different relation properties (e.g., non-transitivity) can be assigned [61]. Other controversial issues regarding part-of are the principles of extensionality (any two objects with exactly the same parts are the same) and supplementation (no object has precisely one proper part) [54].
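The three order properties named above can be checked mechanically on a finite relation. The following sketch, with hypothetical function names, represents part-of as a set of (part, whole) pairs over individuals and shows that a list of direct parthood assertions is typically not transitive until its transitive closure is taken. The gallbladder individuals are reused from Section 1.1 purely for illustration.

```python
def is_reflexive(rel, domain):
    """Every element is related to itself."""
    return all((x, x) in rel for x in domain)


def is_antisymmetric(rel):
    """No two distinct elements are related in both directions."""
    return all(x == y or (y, x) not in rel for (x, y) in rel)


def is_transitive(rel):
    """Whenever (x, y) and (y, z) are present, (x, z) is present as well."""
    return all((x, z) in rel
               for (x, y) in rel
               for (y2, z) in rel if y == y2)


def transitive_closure(rel):
    """Smallest transitive relation containing rel."""
    closure = set(rel)
    while True:
        new = {(x, z)
               for (x, y) in closure
               for (y2, z) in closure if y == y2} - closure
        if not new:
            return closure
        closure |= new


domain = {"Lumen of Gallbladder", "Gallbladder", "Abdomen"}
direct = {("Lumen of Gallbladder", "Gallbladder"), ("Gallbladder", "Abdomen")}

closed = transitive_closure(direct)
generic = closed | {(x, x) for x in domain}   # add the reflexive pairs of classical mereology

print(is_transitive(direct), is_transitive(closed))
print(is_reflexive(generic, domain), is_antisymmetric(generic), is_transitive(generic))
```

Dropping the reflexive pairs again yields a proper part-of reading, which remains antisymmetric and transitive but is no longer reflexive.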

    Still, a coexistence of various part-whole relation types with alternative readings of transitivity and a general part-of relation, satisfying the strict axioms of classical mereology, seems reasonable [54,62].³ Because this view is also compatible with shared intuitions in biomedicine, we subscribe to an understanding of parthood in the broadest sense in which transitivity is axiomatically postulated:

    $\forall x, y, z:\ \textit{part-of}(x, y) \wedge \textit{part-of}(y, z) \Rightarrow \textit{part-of}(x, z)$    (4)

    Common conceptualizations in the biological domain, however, suggest the rejection of the assumption that part-of be reflexive. Otherwise, for instance, Partial Resection of Stomach would include Total Resection of Stomach. We, therefore, interpret part-of in the sense of proper part-of, thus excluding equality.

    Furthermore, we introduce the inverse relation $\textit{has-part} \doteq \textit{part-of}^{-1}$:

    $\forall x, y:\ \textit{part-of}(x, y) \Leftrightarrow \textit{has-part}(y, x)$    (5)

    Some description logics languages, e.g., SHIQ [39], allow the specification of algebraic properties of roles, especially regarding transitivity and inverse relations. Since our model is based on the parsimonious language ALC, which does not support transitive roles, we do not make any assumptions about inverse relations and role transitivity. However, we introduce role transitivity implicitly by means of the encoding patterns to be introduced in Sections 3.1 and 3.2.

    ³ But even the general part-of relation requires semantic explicitness, principally with regard to its commitment to space and time [63,64].

    2.3. Role propagation: general rules versus inference anomalies

    Inheritance-like propagation patterns of roles through partonomies have been described by several authors in the context of medical knowledge representation [30,65,66]. Rector et al. [33] discuss two taxonomic reasoning patterns, especially adapted to biomedical reasoning, which are crucially dependent on part-whole relations. The first one accounts for role propagation in partonomies. Consider, e.g., Fig. 1, where a class X (Fracture of Shaft of Femur) is related to a part class Y (Shaft of Femur) via some relation r (fracture-of):

    $X \sqsubseteq \exists r.Y$    (6)

    The part class Y is part-of a whole Z (Femur):

    $Y \sqsubseteq \exists \textit{part-of}.Z$    (7)

    Given that a class from the range of $\exists \textit{fracture-of}.Y$ is in the domain of a part-of relation, whose range is Z, the relation r (fracture-of) can also be propagated to Z:

    $X \sqsubseteq \exists r.Z$    (8)

    More generally, the following inference rule is stipulated:

    $X \sqsubseteq \exists r.Y$    (9)
    $Y \sqsubseteq \exists \textit{part-of}.Z$    (10)
    ----------------------------------------
    $X \sqsubseteq \exists r.Z$    (11)

    Figure 1. Taxonomic reasoning in partonomies.

    The above framework, furthermore, allows for so-called concept or class specialization in partonomies. In a given example (cf. Fig. 1), we assume the relation fracture-of to link the classes X (Fracture of Shaft of Femur) and Y (Shaft of Femur) and the classes W (Fracture of Femur) and Z (Femur), respectively. Given the part-of relation between Y and Z, we can conclude that X (Fracture of Shaft of Femur) specializes W (Fracture of Femur), hence X is-a W. The general reasoning pattern can be summarized as follows:

    $X \sqsubseteq \exists r.Y$    (12)
    $W \sqsubseteq \exists r.Z$    (13)
    $Y \sqsubseteq \exists \textit{part-of}.Z$    (14)
    ----------------------------------------
    $X \sqsubseteq W$    (15)

    Along the lines of these two reasoning patterns, dedicated knowledge representation languages for the biomedical domain, such as GRAIL [33], have been developed. Incorporating the preceding considerations, taxonomic reasoning in partonomies can be defined as a property of a relation by an axiom of the form r specializedBy s [24]. A more recent approach in description logics is the use of complex role inclusion axioms [67]. These solutions, however, imply that the relation r is always propagated along hierarchies based on s, i.e., the inheritance mechanism is invariably associated with the relation s, and that class specialization is deduced on the basis of part-of relations (hence, part-whole specialization). In this way, partonomic reasoning is dealt with at the axiomatic language [...]

  • iricanina

    1.

    186 S. Schulz, U. Hahn

    r vsthe class Perforation of Appendix implies 9perforation-of:Intestine, whereas Appendicitisdoes not imply 9 inflammation-of:Intestine,although Appendixv 9 part-of:Intestine holdsin both cases.

    2. Also, class specialization in partonomies does notabybyal evidence which seems to suggest that suchaxiomatic approach might be fundamentallydequate. We make the following claims instead:

    Role propagation in partonomies does not gener-lly hold. Consider Fig. 2 (left side), wheredefinition level. We have, however, collected emp-

    Figure 2 Regular and irregular reasoning patterns (uppe(left vs. right part).generally hold true for certain classes related to apartonomy by the same relation. For instance,given that Glomerulum and Kidney are related bypart-of just like Appendix and Intestine, werecognize another inference anomaly (cf.Fig. 2, right side). In contradistinction to thefact that Glomerulonephritis (defined as9 inflammation-of:Glomerulum) specializesNephritis (defined as 9 inflammation-of:Kidney), the class Appendicitis (defined as9 inflammation-of:Appendix) does not specializeEnteritis (defined as 9 inflammation-of:Intestine). Even worse both reasoning patternstend to interact. Class specialization requiresthe role propagation pattern to be true. Viceversa, if the role propagation pattern is false,class specialization is consequently also invalid(cf. Fig. 2, left and right side, lower example).

    This phenomenon has already been pointed outBernauer [66], and has recently been discussedRector [24]. Currently, neither established large-scale terminologies nor dedicated medical knowl-edge representation languages are able to properlyaccount for the above-mentioned, regular as well asirregular, phenomena of part-whole reasoning.Therefore, we conclude that the generality of thetwo reasoning patterns must be constrained. Ratherthan imposing these constraints on the axiomaticlevel of language definition or devising additionallanguage built-ins (e.g., complex role inclusions[67]), we propose a particular encoding patternwhich allows us to decide as to whether specializa-tion is actually valid or not when we specify theontology. In other words, wemove this decision from

    . lower part) for role propagation and class specializationthAsunemca

    2.

    IntiothstCir lmai

    e language designer to the ontology engineer.a consequence of leaving the language layerchanged for part-whole representation, we fullybed partonomic reasoning into standard classifi-tion-based taxonomic reasoning.
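The over-generation caused by applying both patterns unconditionally can be reproduced in a few lines. The following is only a toy sketch, not the authors' implementation: the dictionary-based TBox, the function names `propagate` and `specializations`, and the reading of Nephritis and Enteritis as fully defined classes are assumptions made for illustration.

```python
PART_OF = "part-of"

def propagate(tbox, role):
    """Apply the role propagation rule (9)-(11) until nothing changes.

    tbox maps a class name to a set of (role, filler) pairs, where
    X -> {(r, Y)} stands for the assertion X ⊑ ∃r.Y.
    """
    closed = {cls: set(pairs) for cls, pairs in tbox.items()}
    changed = True
    while changed:
        changed = False
        for pairs in closed.values():
            for r, filler in list(pairs):
                if r != role:
                    continue
                # The filler is itself part-of some whole: propagate the role.
                for r2, whole in list(closed.get(filler, ())):
                    if r2 == PART_OF and (role, whole) not in pairs:
                        pairs.add((role, whole))
                        changed = True
    return closed

def specializations(closed, definitions):
    """Class specialization: X is-a W if W is defined as ∃r.Z and X entails ∃r.Z."""
    return {(x, w)
            for w, restriction in definitions.items()
            for x, pairs in closed.items()
            if x != w and restriction in pairs}

tbox = {
    "Glomerulum": {(PART_OF, "Kidney")},
    "Appendix": {(PART_OF, "Intestine")},
    "Glomerulonephritis": {("inflammation-of", "Glomerulum")},
    "Appendicitis": {("inflammation-of", "Appendix")},
}
# Assumption: both disease classes are fully defined by one existential restriction.
definitions = {
    "Nephritis": ("inflammation-of", "Kidney"),
    "Enteritis": ("inflammation-of", "Intestine"),
}

inferred = specializations(propagate(tbox, "inflammation-of"), definitions)
print(sorted(inferred))
```

Under these assumptions the unconstrained rules derive the intended link (Glomerulonephritis is-a Nephritis) together with the anomalous one (Appendicitis is-a Enteritis), which is the kind of inference the encoding pattern of Section 3 is meant to make optional.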

    2.4. Types of role fillers

    In analyzing the status of role fillers in class definitions from an epistemological perspective, we make the following distinctions in accordance with the stipulations in Section 1.2: Given any two classes, $C_i$ and $C_j$, from a TBox T, we annotate each relation r linking $C_i$ and $C_j$ by one of the following tags, viz. mandatory, optional or invalid. Table 2 gives straightforward examples of the intended semantics.

    Mandatory. Mandatory role fillers are those which occur in existentially restricted roles in the definition of a class C for a relation r. This is equivalent to the notion of ontological dependence [68]. $\mathit{MF}_C^r$ constrains mandatory fillers of the role r at the class C:

    $$C \sqsubseteq \exists r.\mathit{MF}_C^r \tag{16}$$

    Axon, e.g., is a mandatory role filler of has-part when defining the class Neuron. This allows one to infer that each concrete Neuron has an Axon as one of its parts.

    Optional/Possible. $\mathit{PF}_C^r$ constrains possible fillers of the role r at the class C under the following condition:

    $$C \sqsubseteq \forall r.\mathit{PF}_C^r \tag{17}$$

    Possible role fillers are all the classes which are subsumed by the value restriction $\mathit{PF}_C^r$ at a role r. For example, Nucleus is a possible filler of has-part when the class Cell is defined because cells with and without nuclei exist (e.g., bacteria). Mandatory fillers are always possible fillers, since $\exists r.F \sqcap \forall r.\neg F \equiv \bot$. Possible fillers which are not mandatory fillers are optional fillers.

    Invalid. $\mathit{IF}_C^r$ constrains invalid fillers of the role r at the class C under the following condition:

    $$\exists r.\mathit{IF}_C^r \sqsubseteq \neg C \tag{18}$$

    An example is Axon as filler of the relation has-part in Basal Cell.

    Table 2 Examples for role status attributes: has-part(x, y), with x corresponding to the rows and y to the columns

    | x \ y         | Tissue  | Cell      | Cell membrane | Cell nucleus |
    |---------------|---------|-----------|---------------|--------------|
    | Tissue        | Invalid | Mandatory | Mandatory     | Mandatory    |
    | Cell          | Invalid | Invalid   | Mandatory     | Optional     |
    | Cell membrane | Invalid | Invalid   | Invalid       | Invalid      |
    | Cell nucleus  | Invalid | Invalid   | Invalid       | Invalid      |

    When we analyze the status of the roles part-of and has-part in biomedical terminology systems, it turns out that the implication that X is a mandatory role filler of part-of for Y when Y is a mandatory role filler of has-part for X (i.e., mutual dependence) does not hold in many cases. The examples in Table 3 reveal that the number of unilateral dependencies is influenced by the level of granularity of the class descriptions, by the consideration of (well-formed or ill-formed) structural variations, and by the inclusion of collections and mass terms [69].

    Table 3 Ontological dependencies between parts and wholes (assuming a theory of clinical anatomy)

    | Phenomena                | X               | Y             | X ⊑ ∃part-of.Y | Y ⊑ ∃has-part.X |
    |--------------------------|-----------------|---------------|----------------|-----------------|
    | Anatomical variations    | Azygos Lobe     | Lung          | +              | −               |
    | Congenital malformations | Mitral Valve    | Heart         | +              | −               |
    | Pathological structures  | Glioblastoma    | Brain         | +              | −               |
    | Iatrogenic structures    | Appendix        | Intestine     | +              | −               |
    | Mass terms               | Lipids          | Cell Membrane | −              | +               |
    | Unspecified collections  | Lymph Follicles | Spleen        | −              | +               |
    | Unspecified classes      | Cell Nucleus    | Cell          | +              | −               |
    | Necessary constituents   | Cell Membrane   | Cell          | +              | +               |
    | Necessary collections    | Glomeruli       | Kidney        | +              | +               |

    2.5. Countable entities, collections, and mass terms

    Analyzing the UMLS Metathesaurus [14], we made the observation that most anatomical concepts have a singular form as their preferred description (e.g., Heart, Liver, Head, Hand, Connective Tissue, Blood) whereas the remaining ones have plural forms assigned to them (e.g., Cells, Leukocytes, Microvilli). This subtle linguistic difference sheds light on the ontological distinction between countable entities and mass terms in the first group and collections of uniform objects in the second group (footnote 4). These findings coincide with the classification of mereologically relevant parts in terms of cardinality and compositionality by Gerstl and Pribbenow [60], who distinguish mass terms, collections of elements and components of complex structures (footnote 5).

    Footnote 4. Most natural languages use specific markers in order to specify whether a discourse object belongs to either category, e.g., a liver, some blood, some cells, in English; eine Leber, ... Blut, ... Zellen in German; un foie, du sang, des cellules in French.

    Footnote 5. See also Bittner et al. [70] for the distinction between collections and classes.

    We could refer to collections as sets of elements (cf. Uschold [71] who uses typed logics for representing biological populations). This, however, intertwines the set theoretic foundation of the formal semantics of our knowledge representation language (reasoning about sets of sets is not supported by the underlying language ALC) with empirical considerations about the domain. Rather than introducing sets as abstract elements, we prefer to consider a collection in mereological terms and assign it an ontological status of its own, as advocated by Smith [72]. Thus, we simplify the relation between a collection and its elements by reducing it to part-of. We do not assume an atomless world and, therefore, mass terms in a strict sense do not exist. Our treatment of mass terms is not different from the treatment of collections of elements because a mass is a collection of uniform particles (cells, molecules, atoms). Accordingly, the extension of a mass term such as Water corresponds to all possible mereological sums of water molecules in the domain. Whether one classifies something in terms of a mass or as a collection class is, thus, basically a matter of perspective. The main characteristic of instances of a mass or collection class is that portions of it may have the same structure, i.e., portioning gives rise to new instances of the same class.

    3. Partonomic reasoning as taxonomic reasoning

    We will now consider part-whole relations from two angles. In Section 3.1, we will introduce a particular encoding pattern through which part-of reasoning can be emulated as taxonomic reasoning. In a similar way, we will deal with the relation has-part in Section 3.2. We will then consider the merits of our approach with respect to the proper distinction of countable entities, collections, and mass terms (Section 3.3) as well as flexible ways to express value restrictions and role filler constraints (Section 3.4). Finally, in Section 3.5, we summarize practical experience with our encoding approach.

    3.1. Part-of hierarchies as taxonomies

    Building on earlier suggestions by Schmolze and Mark [40] as well as E. Schulz et al. [73], we originally proposed so-called SEP (= Structure/Entity/Part) triplets [74] (footnote 6) in order to define a characteristic pattern of taxonomic hierarchies which supports the emulation of inferences typical of transitive part-of relations. In this section, we present a slightly generalized variant of this approach referred to as EP (= Entity/Part) patterns.

    Footnote 6. See also [75] for a preliminary version of this modeling approach.

    In order to embed our approach into biomedical reality, we start at the top-level biological structure entity BStruct (cf. Section 2.1). Each BStruct subclass which denotes instances of a certain kind (e.g., Femur, Glomerulum, Cell, Tooth) is referred to as an E-node. In addition, we introduce for each E-node an (artificial) class, the so-called P-node. The P-node is the common subsumer of all specific parts of its associated E-node. Specific parts of a class C are all those classes whose instances are, by definition, parts of an instance of C. In other words, the P-node is instantiated by all those individuals which are part of an instance of the related E-node. As an example, Neck of Femur and Shaft of Femur are specific parts of Femur because their instances occur exclusively as parts of a Femur. In contradistinction, Bone Marrow or Periosteum are not specific parts of Femur, as instances of these occur as part of other bones as well. In order to represent Cell as an EP pair, we introduce the class of cells itself, the E-node Cell, together with the P-node $\mathit{Cell}_P$, the latter being equivalent to the expression $\exists\,\textit{part-of}.\mathit{Cell}$ (hence, the P-node reifies that particular role; for reification in description logics, cf. Badea [76]). Therefore, the class Cell Nucleus will be subsumed by $\mathit{Cell}_P$ because each Cell Nucleus is part of a Cell.

    For the formal reconstruction of the part-of relation in terms of taxonomic reasoning, we assume that C, D, and E denote E-nodes and that $C_P$, $D_P$, and $E_P$ denote their corresponding P-nodes, related to them via the role part-of (cf. Fig. 3). These conventions can be expressed by the following axioms:

    \begin{align}
    C_P &\equiv \exists\,\textit{part-of}.C \tag{19} \\
    D_P &\equiv \exists\,\textit{part-of}.D \tag{20} \\
    E_P &\equiv \exists\,\textit{part-of}.E \tag{21} \\
    C &\sqsubseteq D_P \tag{22} \\
    C_P &\sqsubseteq D_P \tag{23} \\
    D &\sqsubseteq E_P \tag{24} \\
    D_P &\sqsubseteq E_P \tag{25} \\
    E &\sqsubseteq \mathit{BStruct} \tag{26} \\
    E_P &\sqsubseteq \mathit{BStruct} \tag{27}
    \end{align}

    Figure 3 EP Pairs: a partitive hierarchy expressed as a taxonomy.

    Since C is subsumed by $D_P$ and $D_P$ is subsumed by $E_P$, we infer that the relation part-of holds between C and D as well as between C and E:

    $$C \sqsubseteq \exists\,\textit{part-of}.D \sqcap \exists\,\textit{part-of}.E \tag{28}$$

    In this way, partonomies are represented as taxonomies of P-nodes. Using this pattern across various subsumees of BStruct linked to each other via part-of, the deductions are the same as if part-of was really transitive at the TBox level since the classes inherit all part-of roles from their mereologic parents. Chains of classes, meaning that one has the other as filler of the role part-of, are modeled as parallel P-node/P-node and E-node/P-node links, as depicted in Fig. 3.
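The effect of axioms (19)-(25) can be illustrated without a description logic reasoner. The sketch below is a simplified stand-in, not the authors' system: it stores only told is-a links, the class `Taxonomy`, the helpers `add_part_of` and `wholes_of`, and the `_P` suffix for P-nodes are hypothetical names, and the example classes are borrowed from the middle part of Fig. 6.

```python
from collections import defaultdict

class Taxonomy:
    """Minimal is-a graph with reflexive-transitive subsumption."""

    def __init__(self):
        self.parents = defaultdict(set)

    def add_isa(self, child, parent):
        self.parents[child].add(parent)

    def subsumers(self, cls):
        """Return cls together with all of its direct and indirect subsumers."""
        seen, stack = {cls}, [cls]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

def p_node(entity):
    # Hypothetical naming convention: the P-node of E-node X is "X_P".
    return entity + "_P"

def add_part_of(tax, part, whole):
    """Encode 'part is part-of whole' as in (22)-(25):
    the E-node and the P-node of the part both go under the P-node of the whole."""
    tax.add_isa(part, p_node(whole))
    tax.add_isa(p_node(part), p_node(whole))

def wholes_of(tax, entity):
    """Read part-of fillers off the taxonomy: a subsumer X_P stands for ∃part-of.X."""
    return {s[:-2] for s in tax.subsumers(entity) if s.endswith("_P")}

tax = Taxonomy()
add_part_of(tax, "CellNucleus", "Cell")   # C ⊑ D_P, C_P ⊑ D_P
add_part_of(tax, "Cell", "Organism")      # D ⊑ E_P, D_P ⊑ E_P

print(wholes_of(tax, "CellNucleus"))
```

Only two part-of links are asserted, yet `wholes_of` returns both Cell and Organism for CellNucleus, because the P-node chain is followed by ordinary subsumption. This mirrors inference (28): transitivity is never stated for the role itself.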

    The encoding of part-whole hierarchies via EP pairs allows the ontology engineer to manage role propagation and, consequently, class specialization according to the needs of the domain to be modeled. This means that one can choose whether an E-node alone or the union of an E-node with the corresponding P-node (footnote 7) is addressed as a role filler in a class definition. In the first case the role propagation is disabled, in the second case it is enabled.

    Footnote 7. In our previous publications, this union class was represented as a separate class and was referred to as S-node.

    Figure 4 Enabling/disabling role propagation in an EP-encoded partonomy.

    For example (cf. Fig. 4), Gastroenteritis is defined as $\exists\,\textit{inflammation-of}.\mathit{GastrointestinalTract}$, i.e., the range of the relation inflammation-of is restricted to the E-node of Gastrointestinal Tract. This precludes the classification of Appendicitis as Gastroenteritis, although Appendix is related to Gastrointestinal Tract via part-of. If role propagation is enabled, however, $\mathit{Glomerulonephritis} \equiv \exists\,\textit{inflammation-of}.(\mathit{Glomerulum}_P \sqcup \mathit{Glomerulum})$ is classified as $\mathit{Nephritis} \equiv \exists\,\textit{inflammation-of}.(\mathit{Kidney}_P \sqcup \mathit{Kidney})$, with Glomerulum being part-of the Kidney; hence, both Glomerulum and $\mathit{Glomerulum}_P$ are subsumed by $\mathit{Kidney}_P$. In a similar way, Perforation of Appendix is classified as Gastrointestinal Perforation.

    In case we wanted to exclude an instance of a class C to be part of another instance of C (e.g., an organ can never be part of an organ or a nucleotide can never be part of another nucleotide), we would enforce disjointness between the corresponding E- and P-nodes:

    $$C_P \sqsubseteq \neg C \tag{29}$$

    Furthermore, the definition of disjoint partitions between P-nodes allows the reconstruction of topological disconnectedness if this requirement needs to be fulfilled:

    $$C_P \sqsubseteq \neg D_P \tag{30}$$

    Consequently, in this axiomatic setting, all subclasses of $C_P$ and $D_P$ occupy disjoint partitions. This is equivalent to a situation in which no instance of C shares any part with any instance of D, as illustrated in Fig. 5. For example, the P-nodes $\mathit{Kidney}_P$ and $\mathit{Liver}_P$ are disjoint, i.e., there is no instance in the anatomic world which is equally a part of a kidney and a part of a liver.

    Figure 5 EP representation of topological disconnectedness.

    3.2. Has-part hierarchies as taxonomies

    A severe restriction of the (S)EP model, up until now, is that has-part hierarchies as existential condition in class definitions such as $\mathit{Mucosa} \sqsubseteq \exists\,\textit{has-part}.\mathit{EpithelialCell}$ are not accounted for. Therefore, analogous inference patterns such as the propagation of has-part roles are not yet supported. While the EP pattern allows to address a class C and its specific parts (cf. Fig. 4), we would also like to address a class C and its specific includers (i.e., wholes). Such an extension seems reasonable as, e.g., the class Epithelial Cells and whatever includes Epithelial Cells is useful in constraining the role has-location as used in the description of disease classes, e.g., Adenocarcinoma.

    In order to overcome these restrictions, we here propose an extension of the (S)EP encoding pattern we will refer to as the EPI (Entity/Part/Includer) pattern (cf. Fig. 6). In addition to the P-nodes introduced in Section 3.1, we define for each BStruct subclass C a so-called I-node $C_I$ which is equivalent to the expression $\exists\,\textit{has-part}.C$.

    Figure 6 EPI architecture: emulation of transitivity of both part-of and has-part hierarchies. Left: example of mutual dependency (e.g., E = Cell, D = Cell Membrane, C = Cell Membrane Protein); middle: example of asymmetric dependency (e.g., E = Organism, D = Cell, C = Cell Nucleus); right: example of a mixed part-of/has-part/is-a encoding scheme where dotted arrows depict inferred taxonomic links (e.g., E = Cell, D = Cell Membrane, C = Blood Cell Membrane).

    I-nodes can be described as common subsumers of all specific includers (or wholes) of the associated E-nodes. Specific includers of an entity, denoted by the class C, are all those classes whose instances are, by definition, related to an instance of C by has-part.

    In addition to the P-nodes (expressions (19)-(21)), for each of the classes C, D, E we now define a reification class for its corresponding has-part relation, viz. $C_I$, $D_I$, $E_I$, respectively. These artificial classes are common subsumers for all those classes which must have, by definition, the role has-part filled by C, D, and E, respectively:

    \begin{align}
    C_I &\equiv \exists\,\textit{has-part}.C \tag{31} \\
    D_I &\equiv \exists\,\textit{has-part}.D \tag{32} \\
    E_I &\equiv \exists\,\textit{has-part}.E \tag{33}
    \end{align}

    As with their corresponding P-nodes, cascading subsumption of classes by I-nodes of their mandatory wholes emulates the transitivity of the has-part relation. We now complete the taxonomies according to Fig. 6 (left):

    \begin{align}
    D &\sqsubseteq C_I \tag{34} \\
    D_I &\sqsubseteq C_I \tag{35} \\
    E &\sqsubseteq D_I \tag{36} \\
    E_I &\sqsubseteq D_I \tag{37} \\
    E_I &\sqsubseteq \mathit{BStruct} \tag{38}
    \end{align}

    In this way, part-of and has-part roles are inherited by the subsumees (cf. assertions (19)-(25) and (31)-(37)):

    - C inherits the roles $\exists\,\textit{part-of}.D$ and $\exists\,\textit{part-of}.E$.
    - D inherits the roles $\exists\,\textit{part-of}.E$ and $\exists\,\textit{has-part}.C$.
    - E inherits the roles $\exists\,\textit{has-part}.D$ and $\exists\,\textit{has-part}.C$.

    Figure 7 EPI example for countable objects. The thin arrows depict is-a relations between extended EPI triplets, the inferred is-a links are dotted. Diagonal is-a links represent taxonomic subsumption of EPI triplets, other is-a links represent the emulation of partonomies.

    Using this pattern across various BStruct subclasses linked to each other via the relations part-of or has-part, we get the same deductions as if both relations were really transitive at the class level. Note, however, that the situation displayed in the left part of Fig. 6 (in both branches, viz. part-of and has-part are present throughout the hierarchy) is not generally valid in the biomedical domain if we allow pathological modifications. In many cases (cf. Table 3), $C \sqsubseteq D_P$ does not imply $D \sqsubseteq C_I$ (cf. also the center part of Fig. 6). Together with the taxonomic subsumption between E-nodes (cf. Fig. 6, right part), we subsequently get three kinds of taxonomic orderings, typically all mixed up: The is-a hierarchy itself, the taxonomy of P-nodes emulating the part-of hierarchy, and the taxonomy of I-nodes emulating the has-part hierarchy. The resulting complex graph is still acyclic with respect to the is-a relation, which emulates part-of and has-part hierarchies, given that there are no cycles in either subgraph. Inbound is-a links to E-nodes come exclusively from other E-nodes. Consequently, no is-a path can traverse both subgraphs. However, class definitions are cyclic as far as there exist mutual part-of and has-part dependencies (footnote 8).
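The two ideas of Sections 3.1 and 3.2, choosing the role filler to enable or disable propagation and adding I-nodes for has-part, can be tried out on a small told taxonomy. This is a toy sketch under stated assumptions: the `_P` and `_I` suffixes, all function names, and the structural test `filler_subsumed` (a sufficient check over disjunctions of named nodes, not full ALC classification) are illustrative inventions.

```python
from collections import defaultdict

parents = defaultdict(set)  # told is-a links of a toy taxonomy

def isa(child, parent):
    parents[child].add(parent)

def subsumers(cls):
    """Reflexive-transitive closure of the told is-a links."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def part_of(part, whole):
    """EP encoding: part ⊑ whole_P and part_P ⊑ whole_P."""
    isa(part, whole + "_P")
    isa(part + "_P", whole + "_P")

def has_part(whole, part):
    """I-node encoding: whole ⊑ part_I and whole_I ⊑ part_I."""
    isa(whole, part + "_I")
    isa(whole + "_I", part + "_I")

def inherited(cls, suffix):
    """Entities X such that cls is subsumed by the reification node X + suffix."""
    return {s[:-2] for s in subsumers(cls) if s.endswith(suffix)}

def filler_subsumed(sub, sup):
    """Each disjunct of filler `sub` must have a subsumer among the disjuncts of `sup`."""
    return all(subsumers(node) & sup for node in sub)

# Asymmetric dependency (Fig. 6, middle): no has-part link from Cell to CellNucleus.
part_of("CellNucleus", "Cell")
part_of("Cell", "Organism")
has_part("Organism", "Cell")

# Role fillers of inflammation-of, written as sets of disjuncts.
part_of("Glomerulum", "Kidney")
part_of("Appendix", "GastrointestinalTract")
nephritis = {"Kidney", "Kidney_P"}                # propagation enabled
glomerulonephritis = {"Glomerulum", "Glomerulum_P"}
gastroenteritis = {"GastrointestinalTract"}       # E-node only: propagation disabled
appendicitis = {"Appendix"}

print(filler_subsumed(glomerulonephritis, nephritis))
print(filler_subsumed(appendicitis, gastroenteritis))
print(inherited("Organism", "_I"), inherited("Cell", "_I"))
```

With these assumptions, the Glomerulonephritis filler is subsumed by the Nephritis filler while the Appendicitis filler is not subsumed by the Gastroenteritis filler, and Organism inherits has-part Cell although Cell does not inherit has-part CellNucleus. The decision rests entirely on which nodes the modeler names as fillers, which is the point of moving it from the language to the ontology engineer.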

    3.3. An EPI account of countables, collections, and mass terms

    We will now demonstrate how the proposed EPI encoding is suited to represent in a natural way three ontological primitives, viz. countable entities, collections, and mass terms. For the representation of countable entities, the circuit diagram in Fig. 7 illustrates a typical scenario from human anatomy, which also instantiates the schemata from Fig. 6. Let us consider the classes Hand (H) and Thumb (T), where $T_P$ and T are linked to $H_P$ by is-a (for each thumb, there exists a hand it is part of), but no is-a link exists between H and $T_I$ (not every hand needs to have a thumb). Now, we add the class Proximal Phalangeal Bone of the Thumb (Ph; footnote 9), its parent class Bone (B), and the class Periosteum (P; footnote 10). Each T mandatorily has-part Ph, and each Ph is mandatorily a part-of T. The same applies to Bone B and Periosteum P, respectively. Consequently, $T_P$ subsumes both Ph and $Ph_P$, and $Ph_I$ subsumes T and $T_I$. Because Ph is subsumed by B, each Ph will have a P as part (B is subsumed by $P_I$) but the contrary does not hold true. Our encoding pattern further allows the deduction that each instance of T (Thumb) has the role has-part filled by an instance of P (Periosteum) by means of the subsumption chain $T \sqsubseteq Ph_I \sqsubseteq B_I \sqsubseteq P_I$.

    Footnote 8. Whereas these kinds of cycles are incompatible with DL systems such as LOOM, they are supported by the most recent DL systems such as FACT and RACER, cf. Section 3.5.

    Footnote 9. The bone of the base of the thumb.

    Footnote 10. The tissue which covers and nourishes a bone.

    Figure 8 Representation of countable entities, collections, and mass terms in the EPI encoding pattern.

    Fig. 8 depicts the EPI encoding of masses and collections: LL denotes the collection class Leukocytes, GG denotes Granulocytes. LL subsumes GG since every collection of Granulocytes is-a collection of Leukocytes. L (Leukocyte) is the basic building block of the collection LL. $L_I$ subsumes all classes that have L as part, e.g., LL and GG. L (Leukocyte) is-a C (Cell). Each instance of C has an instance of P (Cytoplasm) as part; each instance of P is part of an instance of C. P has the mass term Water (W) as a mandatory part, which is mandatorily constituted by Water Molecules (H). Our approach allows us to infer that G must have its role has-part filled by L, and, moreover, L being subsumed by $P_I$, filled by P. Since P is subsumed by $W_I$ and $H_I$, G inherits the roles $\exists\,\textit{has-part}.W$ and $\exists\,\textit{has-part}.H$.

    3.4. Value restrictions and role filler constraints

    With regard to role propagation, the EPI encoding pattern allows for a precise specification of the value restrictions. For example, we specify the location of Insulin production as follows:

    \begin{align}
    \mathit{InsulinProduction} &\sqsubseteq \exists\,\textit{has-location}.\mathit{BetaCell} \tag{39} \\
    \mathit{InsulinProduction} &\sqsubseteq \forall\,\textit{has-location}.(\mathit{BetaCell} \sqcup \mathit{BetaCell}_I) \tag{40}
    \end{align}

    Thus, any instance of the process Insulin Production must take place in a BetaCell. Furthermore, it can be located in any place containing beta cells, e.g., Islets of Langerhans, Pancreas, etc. An analogous example would be the class Muscular Contraction. Any instance of it would necessarily be located in an Actin-Myosin Complex but may also be located within any anatomical structure which includes instances of Actin-Myosin Complex (referred to as $\mathit{ActinMyosinComplex}_I$) such as Muscle Cells, Muscles etc.

    So far, however, we have no means to specify optional role fillers. One might, for instance, want to express that a Cell may have a Cell Nucleus though this is not necessary. One solution to solve this problem would be the introduction of specialized relations of has-part and part-of, e.g.,

    $$\mathit{Cell} \sqsubseteq \forall\,\textit{has-nucleus}.\mathit{CellNucleus} \tag{41}$$

    Nevertheless, this would not prevent us from asserting a bizarre relation such as

    $$\{\mathit{Cell}_1, \mathit{Kidney}_1\} \in \textit{has-nucleus} \tag{42}$$

    unless Kidney and Cell Nucleus were defined as mutually exclusive. Therefore, in a description formalism assuming an open world, we have to provide means to prevent such unintended models.

    Rather than subscribing to a proliferation of specific relations such as has-nucleus, we use the negation operator ($\neg$) in class definitions in order to construct disjoint partitions. For example, an upper-level partition such as

    $$\mathit{Organ} \sqsubseteq \neg(\mathit{Cell} \sqcup \mathit{Cell}_P) \tag{43}$$

    would disallow the assertion (42), provided $\mathit{Kidney} \sqsubseteq \mathit{Organ}$, $\mathit{CellNucleus} \sqsubseteq \mathit{Cell}_P$, $\mathit{Cell}_1 \in \mathit{Cell}$, and $\mathit{Kidney}_1 \in \mathit{Kidney}$. More generally, we require the corresponding I- and P-nodes of a given class X to be disjoint:

    $$X_P \sqsubseteq \neg X_I \tag{44}$$

    Still, the engineering challenge remains to introduce the disjointness axioms at the proper specification level (as high as possible in the taxonomy) in order to avoid an extreme proliferation of expressions as the definition (18) of invalid role fillers in Section 2.4 may suggest.
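The way the partition (43) blocks the assertion (42) can be spelled out step by step. The derivation below is a sketch that reads (43) as $\mathit{Organ} \sqsubseteq \neg(\mathit{Cell} \sqcup \mathit{Cell}_P)$, which is an assumption about the intended bracketing, and uses only the axioms and side conditions stated with (41)-(43).

```latex
\begin{align*}
&\text{Assume } \{\mathit{Cell}_1, \mathit{Kidney}_1\} \in \textit{has-nucleus},\quad
  \mathit{Cell}_1 \in \mathit{Cell},\quad \mathit{Kidney}_1 \in \mathit{Kidney}. \\
&\text{By (41), } \mathit{Cell} \sqsubseteq \forall\,\textit{has-nucleus}.\mathit{CellNucleus},
  \text{ hence } \mathit{Kidney}_1 \in \mathit{CellNucleus}. \\
&\text{By } \mathit{CellNucleus} \sqsubseteq \mathit{Cell}_P:\quad
  \mathit{Kidney}_1 \in \mathit{Cell}_P. \\
&\text{By } \mathit{Kidney} \sqsubseteq \mathit{Organ} \text{ and (43)}:\quad
  \mathit{Kidney}_1 \in \neg(\mathit{Cell} \sqcup \mathit{Cell}_P),
  \text{ and thus } \mathit{Kidney}_1 \in \neg\mathit{Cell}_P. \\
&\text{Therefore } \mathit{Kidney}_1 \in \mathit{Cell}_P \sqcap \neg\mathit{Cell}_P \equiv \bot,
  \text{ so no model satisfies (42).}
\end{align*}
```

The contradiction arises at the level of the upper partition, without any axiom that mentions Kidney and Cell Nucleus together, which is why placing disjointness axioms high in the taxonomy keeps their number small.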

    3.5. Practical experience with the (S)EP encoding patterns

    (S)EP encodings have already been used in building experimental biomedical knowledge bases as well as large-scale commercial ontologies. In a first experiment, we tested the scalability of this architecture by setting up a huge T-Box from the UMLS [17]. We selected UMLS concepts from pre-defined semantic types, covering the fields of anatomy and pathology. Applying a set of transformation rules, we automatically generated a T-Box using the DL system LOOM [42]. The transformation routine mapped unspecified hierarchies between UMLS pathology concepts to is-a hierarchies and used part-of links for the construction of SEP triplets. Furthermore, the relations has-part and has-location were mapped to existentially quantified roles. The LOOM classifier was used to detect terminological cycles and parallel is-a/part-of links. The curated and, consequently, consistent T-Box contained over 240,000 terminological axioms. In a manual step, a random sample was examined for correctness and completeness. The results of this experiment are discussed in more detail by Schulz and Hahn [17].

    In another (on-going) project, we are transforming the Foundational Model of Anatomy (about 62,000 classes) into a description logics T-Box based on the EPI encoding pattern [77]. The mutual dependencies between parts and wholes and the resulting T-Box cycles turned out to be a major obstacle in all DL systems examined. Whereas LOOM does not support cyclic definitions at all, FACT [78] and RACER [79] reveal severe scaling problems when it comes to higher numbers of DL expressions (≳ 10,000 axioms). It is known that cyclic axioms make reasoning in ALC EXPTIME-hard [80]. Further experiments in collaboration with various DL developers might show whether this worst-case scenario can be averted by using still to be determined optimization techniques.

    Two large commercial ontologies, namely SNOMED CT® [81] and LinKBase® [82], already make extensive use of SEP triplets. SEP-based extended terminologies were also used for the implementation of a terminology system for intensive care [83].

    Dealing with large amounts of classes, ontology acquisition and maintenance becomes a more and more pressing issue. When we use complex encoding patterns, such as reificator classes, inclusion axioms, or partitions, it becomes entirely prohibitive to perform editing at the base level of the terminological language. A solution lies in the use of an intermediate representation, as already proven in the context of the GALEN-IN-USE project [84]. For the purpose of ontology acquisition and editing in the context of the FMA (Foundational Model of Anatomy), we currently use the PROTEGE graphical user interface [85] (cf. also Fig. 9) and adaptable export routines for the generation of output in different DL languages and encoding patterns.

    Figure 9 PROTEGE Ontology Editor with a FMA (Foundational Model of Anatomy) edit example.

    4. Discussion of related work

    The nature of part-whole relations (or, mereonymic relations) has been under study for a very long time. Burkhardt and DuFour [86] survey the field from its ancient roots in classical philosophy, while Simons [87] focuses on research from the past century. More recently, Artale et al. [21] and Lambrix [88] deal with this topic from the perspective of object-centered formal knowledge representation languages, description logics, in particular. Bittner et al. [89,64,70] formalize mereological primitives in the context of the Basic Formal Ontology (BFO) upper ontology.

    In our discussion, we shall concentrate on some controversial issues related to parts and wholes from a knowledge representation and reasoning perspective that are raised in these studies and relate our contributions to some of these open problems. Our interest is particularly in the needs of (bio)medical knowledge representation. Therefore, we will focus on two issues which have attracted the attention of researchers from a knowledge representation perspective, viz. whether or not part-whole relations are transitive and which dependency relationships exist between parts and wholes.

    4.1. Transitivity and part-whole related dependency

    Unlike taxonomies, our understanding of partonomies, general and biomedical ones, substantially lacks consensus. Perhaps the most controversial issue is centered around the transitivity property of the part-of relation. Early formal accounts within the framework of General Extensional Mereology [56] were committed to a strictly transitive axiomatization of part-of. This constructive view has been challenged primarily by linguists [57,59] as well as cognitive psychologists [58] who aimed at reconstructing part-of through empirical evidence from language use and cognitive conceptualizations. Their criticism is based on the observation that our conceptualization of the world is not just guided by one single part-of relation, but takes into account different kinds of part-of relations many of which can be organized in a generalization hierarchy (hence, the notion of partonomies). This hierarchy has tremendous implications for reasoning because transitivity is valid, e.g., only when a single sense of part-of is kept [58]. Otherwise, the transitivity property vanishes, as can be illustrated by the reasoning pattern "finger [anatomical-]part-of drummer and drummer [constituent-]part-of (member-of) band", whose conclusion, finger part-of band, is, to say the least, bizarre. Similar reasoning anomalies have already been recognized for the biomedical domain, as well [74].
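The observation that transitivity holds only while a single sense of part-of is kept can be made concrete with typed links. The snippet below is a hypothetical illustration, not a method from the cited literature: the triple format, the function `typed_closure`, and the extra fact about a fingernail are assumptions added to show one chain that does compose.

```python
# Typed part-of facts; the first two come from the finger/drummer/band example.
FACTS = [
    ("finger", "anatomical-part-of", "drummer"),
    ("drummer", "member-of", "band"),
    ("fingernail", "anatomical-part-of", "finger"),  # assumed extra fact
]

def typed_closure(facts):
    """Chain two links only when both carry the same relation type."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for a, r1, b in list(closed):
            for b2, r2, c in list(closed):
                if b == b2 and r1 == r2 and (a, r1, c) not in closed:
                    closed.add((a, r1, c))
                    changed = True
    return closed

closure = typed_closure(FACTS)
print(sorted(closure))
```

In this sketch the fingernail becomes an anatomical part of the drummer, whereas no link of any type is derived between finger and band, because the two premises use different senses of part-of.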

    With several part-of relations having to be con-sidered rather than a single generic one, the chal-lenge to properly organize these conceptualizationsarises. Various proposals have been made [5860]which, in essence, all rely on various domain-inde-pendent classificatory dimensions, e.g., whether apart is functional or not, separable or not, or (non-)homeomerous (parts are the same kind of things astheir wholes), etc. Depending on different para-meter settings on these dimensions, differentpart-whole relation types can be postulated (thecombination of) which are characterized by theirspecific (non-)transitivity property. Examples arefunctional component/integral object (e.g.,motor/car), member/collection (e.g., drummer/band), portion/mass (e.g., slice/pizza), phase/activity (e.g., braking/driving), place/area (e.g.,Munich/Bavaria), subset/set (e.g., inhabitants ofMunich/inhabitants of Bavaria).

    Adaptations of these relation types to anatomy have been proposed by Bernauer [66] and Rogers and Rector [23]. In the general medical domain, the gap between formal mereology and a cognitive account of part-of is even deeper. Schubert [90] admits so-called exclusive parts, i.e., anatomical structures (e.g., neurons) whose (minor) components (e.g., axons) lie outside the whole (e.g., a certain brain area). A technical rationale for this approach is to avoid the proliferation of artificial classes. By referring to its incompatibility with basic mereological assumptions, it might be easy to reject such an approach, but one still has to take into account that propositions like "the motor neurons in the cerebral cortex" or "pyramid cells constitute the cerebral cortex" are commonly found in textbooks of anatomy and make sense to medical experts and students.

    How properties of parts propagate to wholes (or vice versa) is an issue that has also been addressed by Winston et al. [58] and Simons [54]. It is variously referred to as indirect transitivity [30], upward/downward distributivity [65], or inheritance via part-of [66] (see also our extensive discussion of this phenomenon in Section 2). Rogers and Rector [23] discuss propagation patterns for subrelations of part-of such as component-of, surface-of, and stuff-of. They argue that roles such as function-of often propagate only up to a certain (arbitrary) level of aggregation and propose to control this phenomenon by the introduction of additional relations.

    Another kind of part-whole dependence deals with the question of existential dependence, which gives rise to the distinction between essential parts and dependent parts [65]. As important as this issue is in the medical domain (see our examples in Section 2), it has largely been ignored in biomedical knowledge representation.

    We have refrained from subscribing to a policy of a proliferating system of part-of relations and, rather, feature just a single relation that is assumed to be transitive. Our EP(I) encoding pattern, however, is flexible enough to account for inference anomalies in which the propagation of properties does not hold, simply by choosing either E-nodes or the disjunction of E- and P-nodes in order to disable or enable propagation, respectively.

    4.2. Part-whole relations as first-class citizens: (description) logics for composite objects

    Mostly inspired by the fundamental claims of formal mereologists [56], many researchers argue for a special ontological status of part-whole relations. This first-class citizen status (similar to that of is-a or instance-of relations, which express generalization/specialization) then licenses dedicated formal specifications as distinct from other relations [65,88]. As a result, part-whole-specific extensions of formal systems have to be developed.

    In the description logic paradigm, this has led to contributions such as new role constructors [91,92,88,67] or transitive roles (which account for the combination of various part-whole relation types [92]) and role hierarchies [39], etc. These language extensions (often accompanied by detrimental computational properties), however, have to be implemented in concrete terminological reasoning systems so that they can be used by their inference engines. At the beginning, this requirement was only rarely met by the vast majority of reasoning technologies (cf. different treatments of part-of relations in the CLASSIC system [93,88]). In the meanwhile, role transitivity (also part of the SHIQ language specification [39]) has been thoroughly studied in terms of computational behavior and has been integrated into the FACT system [78], which is considered one of the most mature DL implementations currently available.

    Indeed, FACT [43] and other more recent DL systems, such as RACER [79], have made significant progress in terms of the formal expressiveness of the DL language they implement. Among the additional features they provide, which are most relevant to our approach, are general inclusion axioms of the form $C \sqsubseteq D$, with C and D being complex expressions, and the transitive closure of roles. Using general inclusion axioms, the artificial classes CP, CI, etc. can be entirely substituted by their equivalent expressions. To give an example, the assertions (19)-(25) would boil down to the following set of axioms by simple substitution:

    $C \sqsubseteq \exists\,\textit{part-of}.D$ (45)
    $\exists\,\textit{part-of}.C \sqsubseteq \exists\,\textit{part-of}.D$ (46)
    $D \sqsubseteq \exists\,\textit{part-of}.E$ (47)
    $\exists\,\textit{part-of}.D \sqsubseteq \exists\,\textit{part-of}.E$ (48)

    A language that supports the transitive closure of roles, of course, yields inferences, such as those shown in Figs. 7 and 8, without introducing artificial nodes or the corresponding inclusion axioms. However, what is still lacking in these approaches is any additional support for expressing role propagation as depicted in Fig. 4, as well as a proper solution for the distinction between possible and invalid role fillers such as discussed in Section 2.

    For the medical domain, Haimowitz et al. [94] were the first to request a representation formalism for part-whole relations and corresponding reasoning capabilities as an extension to description logics. Not only role propagation but also the support of possible parts were put on the agenda of description logics by Padgham and Lambrix [95]. In response to these demands, two basic approaches developed.

    In the first one, part-whole reasoning is dealt with by extending a knowledge representation language by means of new operators dedicated to partonomic reasoning, as described above. In order to meet the requirements for biomedical ontologies, the GRAIL language provides special constructors for role propagation in the context of the GALEN project [33]. However, role propagation and class specialization are hard-wired to role definitions and, thus, fail to cover conflicting empirical data. In spite of this shortcoming, a complete and decidable algorithm for this sort of inference is still an open issue in description logics research [24]. In a similar vein, Yang and Patil [30] dealt with this problem within the NIKL framework.

    In the second approach, standard language definitions are preserved for reasons of simplicity and parsimony. Congruently, Schmolze and Mark [40] proposed a solution similar to ours, using subsumption to obtain inferences resembling those of transitive roles or transitive closure of roles. Artale et al. [21] criticize this proposal for the proliferation of (artificial) classes involved and fundamentally reject such a solution. Furthermore, their harsh criticism is based on stipulating a particularly prominent ontological status for part-whole relations, which should be kept entirely distinct from ordinary roles. We, on the contrary, argue

  • that many of these additional, though often falsely dubbed artificial, classes are necessary from an ontological point of view, as the distinct mechanisms for conditioned specialization modelling reveal,

  • that the proliferation of classes can be tamed using an intermediate representation as an interface to the user,

  • and that artificial classes can be eliminated using general inclusion axioms in certain representation languages.

    Although this second approach allows one to selectively switch off/on part-whole mediated role propagation, we do not advocate a softening of the semantics of part-of as proposed by Schubert [90] because this would give rise to an unforeseeable surge of unintended models.

    It remains to be seen whether conservative structural extensions of a stable language platform are able to carry over to the many varieties of partonomic reasoning and different mereotopological relations, while still taking into account the spatial versus functional reading of part-of [63] (as well as representing branching relations which are essential for describing anatomy [69]), or whether newly designed operators or other fundamental language extensions are really needed to meet the requirements of the stipulated first-class citizen status of parts and wholes.
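    As a small worked check on the general inclusion axioms (45)-(48) of this section, the following derivation sketches how inferences that would otherwise require a transitive part-of role can be obtained by subsumption alone. It assumes nothing beyond the four axioms as stated and the transitivity of $\sqsubseteq$; the layout and the step annotations are ours.

```latex
\begin{align*}
C &\sqsubseteq \exists\,\textit{part-of}.D && \text{(45)}\\
\exists\,\textit{part-of}.D &\sqsubseteq \exists\,\textit{part-of}.E && \text{(48)}\\
\therefore\quad C &\sqsubseteq \exists\,\textit{part-of}.E && \text{from (45) and (48)}\\[4pt]
\exists\,\textit{part-of}.C &\sqsubseteq \exists\,\textit{part-of}.D && \text{(46)}\\
\therefore\quad \exists\,\textit{part-of}.C &\sqsubseteq \exists\,\textit{part-of}.E && \text{from (46) and (48)}
\end{align*}
```

    The first conclusion states that every C is part of some E, the second that anything that is part of a C is also part of some E. Neither step appeals to a transitivity axiom for part-of itself.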

    5. Conclusions

    Part-whole relations, together with taxonomic relations (is-a, instance-of, kind-of, etc.), constitute the prevalent relational glue of ontologies for concrete physical domains such as biology or medicine. This becomes immediately evident when one looks at classical thesauri or terminologies, in which taxonomic and partonomic relation descriptions prevail. Unlike for taxonomic relations, however, a consensus about how to adequately represent the underlying partonomic knowledge within a formal framework is still lacking. Hence, a fundamental knowledge representation and reasoning problem arises.

    Given the outstanding importance of part-whole relations, many researchers argue in favor of a first-class citizen status for part-whole relations, similar to taxonomic relations [21]. From a conceptual modeling point of view, this necessitates both an in-depth ontological study of the specific characteristics which distinguish part-whole relations from ordinary relations such as has-color or drugs-prescribed, and dedicated mechanisms by which the results of this ontological investigation can be expressed at the level of the formal knowledge representation language. In the description logics community, where knowledge is primarily organized around taxonomies and their associated reasoning services (subsumption, recognition), a response to this challenge has been the extension of simple base languages such as ALC by additional parts-sensitive constructors, transitive roles, transitive closure operators, and part-whole specific reasoning services (e.g., constructing a whole from its parts by appropriate specifications, detecting missing parts, etc.).

    This may be a reasonable option when implementing experimental systems on a proof-of-concept basis. In medical or large-scale biological applications (e.g., functional genomics), however, efficient and robust operational knowledge representation systems are required that are capable of managing hundreds of thousands of classes and relations. This requirement is currently fulfilled by only a handful of systems, e.g., CLASSIC, GRAIL, (POWER)LOOM, FACT, or RACER, to name the most prominent ones. Although part-whole related extensions of some of these systems exist [93,88], none of them is capable of covering essential part-whole related inferences on a larger scale (such as discussed in Section 3). Also, interactions of these constructive extensions with the basic (taxonomic) system core are often unpredictable and largely unexplored.

    In this article, we drive the reuse of these baseline taxonomy-centered systems for part-whole reasoning to the extreme. We refrain from any constructive extension of the basic ALC language and embed part-whole specific reasoning entirely into taxonomic reasoning. This is achieved by the introduction of complex encodings, such as the basic EP model (cf. Section 3.1), which represents a class and all its specific parts, both subsumed by the disjunction of these two classes. By introducing corresponding includer nodes (the EPI model, cf. Section 3.2), both part-of and has-part inferences can be drawn. Furthermore, by way of appropriate encoding, inference anomalies are adequately dealt with, which have always been a source of trouble for overly general assumptions related to part-whole relations. Finally, we have provided evidence that even seemingly artificial classes implied by the complex EP(I) encoding patterns often have an ontologically plausible correlate and, hence, by no means constitute a notational plethora.

    Acknowledgments

    The research reported in this article has been partially supported by Deutsche Forschungsgemeinschaft (DFG) under contracts HA 2097/5-1 and HA 2097/5-2. We would also like to thank our colleagues Martin Romacker (who helped shape the idea of SEP triplets in the very beginning) and Kornel Marko (who helped to clarify our idea of EP(I) patterns by continuously challenging our views when Martin was no longer at our Lab).

    References

    [1] Clancey WJ, Shortliffe EH, editors. Readings in medical artificial intelligence. Reading, MA: Addison-Wesley; 1984.

    [2] Rossi Mori A, Consorti F, Galeazzi E. Standards to support development of terminological systems for healthcare telematics. Methods Inf Med 1998;37(4-5):551-63.

    [3] Ingenerf J, Giere W. Concept-oriented standardization and statistics-oriented classification: continuing the classification versus nomenclature controversy. Methods Inf Med 1998;37(4-5):527-39.

    [4] ICD-10. International statistical classification of diseases and health related problems. 10th revision. Geneva: World Health Organization; 1992.

    [5] Cote R, Rothwell DJ, Beckett RS, Palotay JL, Brochu L. The systemised nomenclature of medicine: SNOMED international. Northfield, IL: College of American Pathologists; 1993.

    [6] ICNP. International classification of nursing practice. International Council of Nurses; 2002.

    [7] MeSH. Medical subject headings. Bethesda, MD: National Library of Medicine; 2004.

    [8] Gene Ontology Consortium. Creating the gene ontology resource: design and implementation. Genome Res 2001;11(8):1425-33.

    [9] OBO. Open biological ontologies (OBO). http://obo.sourceforge.net. Last accessed: October 23, 2004.

    [10] Campbell KE, Das AK, Musen MA. A logical foundation for representation of clinical data. J Am Med Inf Assoc 1994;1(3):218-32.

    [11] Rector AL, Gangemi A, Galeazzi E, Glowinski AJ, Rossi Mori A. The GALEN model schemata for anatomy: towards a re-usable application-independent model of medical concepts. In: Barahona P, Veloso M, Bryant J, editors. MIE'94: Medical informatics Europe '94. Proceedings of the 12th conference of the European federation for medical informatics. Amsterdam: IOS Press; 1994. p. 229-33.

    [12] Rosse C, Mejino JLV, Modayur BR, Jakobovits R, Hinshaw KP, Brinkley JF. Motivation and organizational principles for anatomical knowledge representation: the digital anatomist symbolic knowledge base. J Am Med Inf Assoc 1998;5(1):17-40.

    [13] McCray AT, Nelson SJ. The representation of meaning in the UMLS. Methods Inf Med 1995;34(1-2):193-201.

    [14] UMLS. Unified medical language system. Bethesda, MD: National Library of Medicine; 2004.

    [15] Horrocks I, Patel-Schneider PF, van Harmelen F. From SHIQ and RDF to OWL: the making of a Web ontology language. J Web Semantics 2003;1(1):7-26.

    [16] Smith B. Beyond concepts: ontology as reality representation. In: Varzi AC, Vieu L, editors. Formal ontology in information systems. Proceedings of the 3rd international conference FOIS 2004, vol. 114 in Frontiers in Artificial Intelligence and Applications. Amsterdam: IOS Press; 2004. p. 73-84.

    [17] Schulz S, Hahn U. Medical knowledge reengineering: converting major portions of the UMLS into a terminological knowledge base. Int J Med Inf 2001;64(2-3):207-21.

    [18] Bodenreider O. Circular hierarchical relationships in the UMLS: etiology, diagnosis, treatment, complications, and prevention. In: Bakken S, editor. AMIA 2001: Proceedings of the annual symposium of the American medical informatics association. A medical informatics odyssey: visions of the future and lessons from the past. Philadelphia, PA: Hanley & Belfus; 2001. p. 57-61.

    [19] Gene Ontology Consortium. GO editorial style guide, 2004. http://www.geneontology.org/GO.usage.html. Last accessed: October 23, 2004.

    [20] Chaffin R, Herrmann DJ, Winston M. An empirical taxonomy of part-whole relations: effects of part-whole relation type on relation identification. Lang Cognit Processes 1988;3(1):17-48.

    [21] Artale A, Franconi E, Guarino N, Pazzi L. Part-whole relations in object-centered systems: an overview. Data Knowl Eng 1996;20(3):347-83.

    [22] Horrocks I, Rector AL, Goble CA. A description logic based schema for the classification of medical data. In: Baader F, Buchheit M, Jeusfeld MA, Nutt W, editors. KRDB'96: Proceedings of the 3rd workshop Knowledge Representation Meets Databases. 1996. p. 24-8. Published via http://CEUR-WS.org/Vol-4/.

    [23] Rogers J, Rector AL. GALEN's model of parts and wholes: experience and comparisons. In: Overhage MJ, editor. AMIA 2000: Proceedings of the annual symposium of the American medical informatics association. Converging information, technology, and health care. Philadelphia, PA: Hanley & Belfus; 2000. p. 714-8.

    [24] Rector AL. Analysis of propagation along transitive roles: formalisation of the GALEN experience with medical ontologies. In: Horrocks I, Tessaris S, editors. DL 2002: Proceedings of the 2002 international workshop on description logics. 2002. Published via http://CEUR-WS.org/Vol-53/.

    [25] Brachman RJ, Levesque HJ. Knowledge representation and reasoning. Amsterdam: Elsevier/Morgan Kaufmann; 2004.

    [26] Genesereth MR, Nilsson NJ. Logical foundations of artificial intelligence. Palo Alto, CA: Morgan Kaufmann; 1987.

    [27] Baader F, Calvanese D, McGuinness DL, Nardi D, Patel-Schneider PF, editors. The description logic handbook. Theory, implementation, and applications. Cambridge, UK: Cambridge University Press; 2003.

    [28] Sowa JF, editor. Principles of semantic networks. Explorations in the representation of knowledge. San Mateo, CA: Morgan Kaufmann; 1991.

    [29] Sowa JF. Knowledge representation: logical, philosophical, and computational foundations. Stamford, CT: Thomson Learning; 2000.

    [30] Yang Y, Patil R. KOLA: a knowledge organization language. In: Kingsland III LC, editor. SCAMC'89: Proceedings of the 13th annual symposium on computer applications in medical care. New York, NY: IEEE Computer Society Press; 1989. p. 71-5.

    [31] Ensing M, Paton R, Speel P-H, Rada R. An object-oriented approach to knowledge representation in a biomedical domain. Artif Intell Med 1994;6(6):459-82.

    [32] Mays E, Weida R, Dionne R, Laker M, White B, Liang C, et al. Scalable and expressive medical terminologies. In: Cimino JJ, editor. AMIA'96: Proceedings of the 1996 AMIA annual fall symposium (formerly SCAMC). Beyond the superhighway: exploiting the internet with medical informatics. Philadelphia, PA: Hanley & Belfus; 1996. p. 259-63.

    [33] Rector AL, Bechhofer S, Goble CA, Horrocks I, Nowlan WA, Solomon WD. The GRAIL concept modelling language for medical terminology. Artif Intell Med 1997;9(2):139-71.

    [34] Spackman KA. Normal forms for description logic expression of clinical concepts in SNOMED RT. In: Bakken S, editor. AMIA 2001: Proceedings of the annual symposium of the American medical informatics association. A medical informatics odyssey: visions of the future and lessons from the past. Philadelphia, PA: Hanley & Belfus; 2001. p. 627-31.

    [35] Schulz S, Hahn U. Parts, locations, and holes: formal reasoning about anatomical structures. In: Quaglini S, Barahona P, Andreassen S, editors. Artificial intelligence in medicine. Proceedings of the 8th conference on artificial intelligence in medicine in Europe AIME 2001, vol. 2101 of Lecture Notes in Artificial Intelligence. Berlin: Springer; 2001. p. 293-303.

    [36] Volot F, Joubert M, Fieschi M. Review of biomedical knowledge and data representation with conceptual graphs. Methods Inf Med 1998;37(1):86-96.

    [37] Kusters R. Non-standard inferences in description logics, vol. 2100 of Lecture Notes in Computer Science. Berlin: Springer; 2001.

    [38] Baader F, Nutt W. Basic description logics. In: Baader F, Calvanese D, McGuinness DL, Nardi D, Patel-Schneider PF, editors. The description logic handbook. Theory, implementation, and applications. Cambridge, UK: Cambridge University Press; 2003. p. 43-95.

    [39] Horrocks I, Sattler U. A description logic with transitive and inverse roles and role hierarchies. J Logic Comput 1999;9(3):385-410.

    [40] Schmolze JG, Mark WS. The NIKL experience. Comput Intell 1991;7(1):48-69.

    [41] Brachman RJ, McGuinness DL, Patel-Schneider PF, Resnick LA, Borgida A. Living with CLASSIC: when and how to use a KL-ONE-like language. In: Sowa JF, editor. Principles of semantic networks. Explorations in the representation of knowledge. San Mateo, CA: Morgan Kaufmann; 1991. p. 401-56.

    [42] MacGregor R, Bates R. The LOOM knowledge representation language. Technical report RS-87-188. Marina del Rey, CA: Information Sciences Institute, University of Southern California; 1987.

    [43] Horrocks IR. The FACT system. In: de Swart HCM, editor. Automated reasoning with analytic tableaux and related methods: TABLEAUX'98. Proceedings of the international conference Tableaux'98, vol. 1397 of Lecture Notes in Computer Science. Berlin: Springer; 1998. p. 307-12.

    [44] Haarslev V, Moller R. RACER: a core inference engine for the semantic web. In: EON 2003: Proceedings of the 2nd international workshop on evaluation of ontology-based tools, located at the 2nd international semantic web conference (ISWC 2003). Sanibel Island, FL, USA; 2003. p. 27-36.

    [45] Kashyap V, Borgida A. Representing the UMLS semantic network using OWL: (or "What's in a semantic web link?"). In: Fensel D, Sycara KP, Mylopoulos J, editors. The semantic web: ISWC 2003. Proceedings of the 2nd international semantic web conference, vol. 2870 in Lecture Notes in Computer Science. Berlin: Springer; 2003. p. 1-16.

    [46] Bateman JA, Magnini B, Fabris G. The generalized upper model knowledge base: organization and use. In: Mars NJI, editor. Towards very large knowledge bases: knowledge building and knowledge sharing. Amsterdam: IOS Press; 1995. p. 60-72.

    [47] Hafner CD, Noy NF. Ontological foundations for biology knowledge models. In: States DJ, Agarwal P, Gaasterland T, Hunter L, Smith RF, editors. ISMB'96: Proceedings of the 4th international conference on intelligent systems for molecular biology. Menlo Park, CA: AAAI Press; 1996. p. 78-87.

    [48] McCray AT. An upper level ontology for the biomedical domain. Comp Funct Genomics 2003;4(1):80-4.

    [49] Schulz S, Hahn U. Mereotopological reasoning about parts and (w)holes in bio-ontologies. In: Welty C, Smith B, editors. Formal ontology in information systems. Collected papers from the 2nd international FOIS conference. New York, NY: ACM Press; 2001. p. 210-21.

    [50] Brachman RJ. What is-a is and isn't: an analysis of taxonomic links in semantic networks. IEEE Comput 1983;16(10):30-6.

    [51] Welty C, Guarino N. Supporting ontological analysis of taxonomic relationships. Data Knowl Eng 2001;39(1):51-74.

    [52] Guarino N, Welty CA. An overview of OntoClean. In: Staab S, Studer R, editors. Handbook on ontologies. Berlin: Springer; 2004. p. 151-71.

    [53] Bittner T, Smith B. A theory of granular partitions. In: Duckham M, Goodchild MF, Worboys MF, editors. Foundations of geographic information science. London: Taylor & Francis Books; 2003. p. 117-51.

    [54] Simons P. Parts: a study in ontology. Oxford: Clarendon Press; 1987.

    [55] Casati R, Varzi AC. Parts and places. The structures of spatial representation. Cambridge, MA: MIT Press/Bradford; 1999.

    [56] Srzednicki JTJ, Rickey VF, Czelakowski J, editors. Lesniewski's systems: ontology and mereology, vol. 13 in Nijhoff International Philosophy Series. Dordrecht: Kluwer Academic Publishers; 1984.

    [57] Cruse AD. On the transitivity of the part-whole relation. J Linguistics 1979;15(1):29-38.

    [58] Winston ME, Chaffin R, Herrmann DJ. A taxonomy of part-whole relations. Cognit Sci 1987;11(4):417-44.

    [59] Iris MA, Litowitz BE, Evens MW. Problems of the part-whole relation. In: Evens MW, editor. Relational models of the lexicon. Representing knowledge in semantic networks. Cambridge, UK: Cambridge University Press; 1988. p. 261-88.

    [60] Gerstl P, Pribbenow S. Midwinters, end games and body parts: a classification of part-whole relations. Int J Hum Comput Stud 1995;43(5-6):865-89.

    [61] Pribbenow S. Meronymic relationships: from classical mereology to complex part-whole relationships. In: Green R, Bean CA, Myaeng SH, editors. The semantics of relationships: an interdisciplinary perspective, vol. 3 of Information Science and Knowledge Management. Dordrecht: Kluwer Academic Publishers; 2002. p. 35-50.

    [62] Varzi AC. Parts, wholes, and part-whole relations: the prospects of mereotopology. Data Knowl Eng 1996;20(3):259-68.

    [63] Schulz S, Hahn U. Parthood as spatial inclusion: evidence from biomedical conceptualizations. In: Dubois D, Welty C, Williams M-A, editors. Principles of knowledge representation and reasoning. Proceedings of the 9th international conference KR 2004. Menlo Park, CA: AAAI Press; 2004. p. 55-63.

    [64] Bittner T, Donnelly M. The mereology of stages and persistent entities. In: Lopez de Mantaras R, Saitta L, editors. ECAI 2004: Proceedings of the 16th European conference on artificial intelligence. Amsterdam: IOS Press; 2004. p. 283-7.

    [65] Artale A, Franconi E, Guarino N. Open problems with part-whole relations. In: Padgham L, Franconi E, Gehrke M, McGuinness DL, Patel-Schneider PF, editors. DL'96: Proceedings of the international workshop on description logics, vol. WS-96-05 of AAAI technical report. Menlo Park, CA: AAAI Press; 1996. p. 70-3.

    [66] Bernauer J. Analysis of part-whole relation and subsumption in the medical domain. Data Knowl Eng 1996;20(3):405-15.

    [67] Horrocks I, Sattler U. The effect of adding complex role inclusion axioms in description logics. In: IJCAI'03: Proceedings of the 18th international joint conference on artificial intelligence, vol. 1. San Francisco, CA: Morgan Kaufmann; 2003. p. 343-8.

    [68] Gangemi A, Guarino N, Masolo C, Oltramari A. Understanding top-level ontological distinctions. In: Gomez-Perez A, Gruninger M, Stuckenschmidt H, Uschold M, editors. Ontologies and information sharing. Proceedings of the IJCAI-01 workshop on ontologies and information sharing. 2001. p. 26-33.

    [69] Schulz S, Hahn U. Towards a computational paradigm for biomedical structure. In: Hahn U, Schulz S, Cornet R, editors. KR-MED 2004: Proceedings of the 1st international workshop on formal biomedical knowledge representation, collocated with the 9th international conference on the principles of knowledge representation and reasoning (KR 2004). Bethesda, MD: American Medical Informatics Association (AMIA); 2004. p. 63-71. Published via http://CEUR-WS.org/Vol-102/.

    [70] Bittner T, Donnelly M, Smith B. Individuals, universals, collections: on the foundational relations of ontology. In: Varzi AC, Vieu L, editors. Formal ontology in information systems. Proceedings of the 3rd international conference FOIS 2004, vol. 114 in Frontiers in Artificial Intelligence and Applications. Amsterdam: IOS Press; 2004. p. 37-48.

    [71] Uschold M. The use of the typed lambda calculus for guiding naive users in the representation and acquisition of part-whole knowledge. Data Knowl Eng 1996;20(3):385-404.

    [72] Smith B. The basic tools of formal ontology. In: Guarino N, editor. FOIS'98: Proceedings of the conference on formal ontology in information systems. Amsterdam: IOS Press; 1998. p. 19-28.

    [73] Schulz EB, Price C, Brown PJB. Symbolic anatomic knowledge representation in the Read Codes version 3: structure and application. J Am Med Inf Assoc 1997;4(1):38-48.

    [74] Hahn U, Schulz S, Romacker M. Partonomic reasoning as taxonomic reasoning in medicine. In: AAAI'99/IAAI'99: Proceedings of the 16th national conference on artificial intelligence and 11th innovative applications of artificial intelligence conference. Menlo Park, CA; Cambridge, MA: AAAI Press and MIT Press; 1999. p. 271-6.

    [75] Schulz S, Romacker M, Hahn U. Part-whole reasoning in medical ontologies revisited: introducing SEP triplets into classification-based description logics. In: Chute CG, editor. AMIA'98: Proceedings of the 1998 AMIA annual fall symposium.